Re: [HACKERS] Minmax indexes

Heikki Linnakangas Mon, 30 Sep 2013 04:09:09 -0700

On 27.09.2013 21:43, Greg Stark wrote:

On Fri, Sep 27, 2013 at 7:22 PM, Jim Nasby <j...@nasby.net> wrote:


Yeah, we obviously kept things simpler when adding forks in order to get the
feature out the door. There are improvements that need to be made. But IMHO
that's no reason to automatically avoid forks; we need to consider the cost of
improving them vs what we gain by using them.


I think this gives short shrift to the decision to introduce forks.
If you go back to the discussion at the time it was a topic of debate
and the argument which won the day is that interleaving different
streams of data in one storage system is exactly what the file system
is designed to do and we would just be reinventing the wheel if we
tried to do it ourselves. I think that makes a lot of sense for things
like the fsm or vm which grow indefinitely and are maintained by a
different piece of code from the main heap.

The tradeoff might be somewhat different for the pieces of a data
structure like a bitmap index or gin index where the code responsible
for maintaining it is all the same.

There are quite a few cases where we have several "streams" of data, all related to a single relation. We've solved them all in slightly different ways:

1. TOAST. A separate heap relation with accompanying b-tree index is created.

2. GIN. GIN contains a b-tree, and data pages (and some other kinds of pages too IIRC). It would be natural to use the regular B-tree code for the B-tree, but instead it contains a completely separate implementation. All the different kinds of streams are stored in the main fork.


3. Free space map. Stored as a separate fork.

4. Visibility map. Stored as a separate fork.

And upcoming:

5. Minmax indexes, with the linearly-addressed range reverse map and variable-length index tuples.

6. Bitmap indexes. Like in GIN, there's a B-tree and the data pages containing the bitmaps.

A nice property of the VM and FSM forks currently is that they are just auxiliary information to speed things up. You can safely remove them (when the server is shut down), and the system will recreate them on next vacuum. It's not carved in stone that it has to be that way for all extra forks, but it is today and I like it.
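One way to picture that property, as a minimal sketch rather than anything taken from the PostgreSQL source: each fork kind is classified by whether it holds only derived data that can be rebuilt. The names `SketchFork` and `sketch_fork_is_rebuildable` are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical fork kinds; illustrative names, not PostgreSQL's own. */
typedef enum SketchFork
{
    SKETCH_FORK_MAIN,   /* the relation's primary data */
    SKETCH_FORK_FSM,    /* free space map */
    SKETCH_FORK_VM      /* visibility map */
} SketchFork;

/*
 * Returns true if the fork holds only auxiliary information that can be
 * thrown away and rebuilt later (per the mail above, the FSM and VM are
 * recreated on the next vacuum). The main fork is never rebuildable.
 */
bool
sketch_fork_is_rebuildable(SketchFork fork)
{
    switch (fork)
    {
        case SKETCH_FORK_FSM:
        case SKETCH_FORK_VM:
            return true;
        case SKETCH_FORK_MAIN:
            return false;
    }
    return false;
}
```

A fork holding, say, the B-tree part of a bitmap index would presumably not fall into the rebuildable group, which is the sense in which it would be "more heavy-weight" than today's forks.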

I feel we need a new kind of a relation fork, something more heavy-weight than the current forks, but not as heavy-weight as the way TOAST does it. It would be nice if GIN and bitmap indexes could use the regular nbtree code. Or any other index type - imagine a bitmap index using a SP-GiST index instead of a B-tree! You could create a bitmap index for 2d points, and use it to speed up operations like overlap for example.

The nbtree code expects the data to be in the main fork and uses the FSM fork too. Maybe it could be abstracted, so that the regular b-tree could be used as part of another index type. Same with other index AMs.

Perhaps relation forks need to be made more flexible, allowing access methods to define what forks exist. IOW, let's not avoid using relation forks, let's make them better instead.
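To make the idea a bit more concrete, here is a rough sketch of what "access methods define what forks exist" could look like. Everything in it is hypothetical: `SketchAmForks`, `sketch_am_fork_number` and the fork names are invented for illustration, and no such API exists in PostgreSQL. The example assumes a bitmap index that keeps its bitmap pages in the main fork and gives the embedded B-tree its own data fork and FSM fork.

```c
#include <string.h>

#define SKETCH_MAX_FORKS 8

/* Hypothetical per-access-method fork declaration. */
typedef struct SketchAmForks
{
    const char *amname;                     /* access method name */
    int         nforks;                     /* number of entries in forks[] */
    const char *forks[SKETCH_MAX_FORKS];    /* fork names; index = fork number */
} SketchAmForks;

/* Assumed layout: bitmap pages in "main", embedded B-tree in its own forks. */
const SketchAmForks sketch_bitmap_am = {
    "bitmap", 3, {"main", "btree", "btree_fsm"}
};

/* Returns the fork number for forkname, or -1 if the AM declares no such fork. */
int
sketch_am_fork_number(const SketchAmForks *am, const char *forkname)
{
    for (int i = 0; i < am->nforks; i++)
    {
        if (strcmp(am->forks[i], forkname) == 0)
            return i;
    }
    return -1;
}
```

With something along these lines, shared B-tree code could be pointed at the "btree" and "btree_fsm" forks instead of assuming the main fork and the relation-wide FSM fork.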


- Heikki


