Tom Lane wrote:
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
I've started working on revamping Free Space Map, using the approach where we store a map of heap pages on every nth heap page. What we need now is discussion on the details of how exactly it should work.

> You're cavalierly waving away a whole boatload of problems that will
> arise as soon as you start trying to make the index AMs play along
> with this :-(.

It doesn't seem very hard. An index AM wanting to use the FSM needs a little bit of code at the point where the relation is extended, to let the FSM initialize its map pages. And then there's the B-tree metapage issue I mentioned. But that's all, AFAICS.
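To sketch what I mean (the names here are made up for illustration, not a settled API), the extension path of an index AM would just grow a call into the FSM module:

#include "postgres.h"
#include "storage/bufmgr.h"
#include "utils/rel.h"

extern void fsm_extend(Relation rel);   /* hypothetical FSM entry point */

/*
 * Hypothetical sketch only: an index AM that wants FSM support calls into
 * the FSM module whenever it extends the relation, so the FSM can slot its
 * own map pages in at the right block numbers.  The real interface is
 * exactly what we'd need to design.
 */
static Buffer
index_extend_with_fsm(Relation rel)
{
    Buffer      buf;

    /* let the FSM initialize any map pages that fall at this point */
    fsm_extend(rel);

    /* then extend as usual to get the new index page */
    buf = ReadBuffer(rel, P_NEW);

    /* ... caller initializes the new page as an index page ... */
    return buf;
}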

> Hash for instance has very narrow-minded ideas about
> page allocation within its indexes.

Hash doesn't use the FSM at all.

> Also, I don't think that "use the special space" will scale to handle
> other kinds of maps such as the proposed dead space map.  (This is
> exactly why I said the other day that we need a design roadmap for all
> these ideas.)

It works for anything that scales linearly with the relation itself. The proposed FSM and visibility map both fall into that category.
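For concreteness, the arithmetic under the every-nth-page scheme would be something like this (MAP_INTERVAL is a placeholder value, and this assumes map pages sit at block 0 and every MAP_INTERVAL'th block after it):

#include "postgres.h"
#include "storage/block.h"

#define MAP_INTERVAL    4096    /* placeholder; the real spacing is an open question */

/*
 * With a map page at block 0, MAP_INTERVAL, 2 * MAP_INTERVAL, and so on,
 * each map page covers the MAP_INTERVAL - 1 heap pages that follow it, so
 * the map grows in lockstep with the heap.  Given a (non-map) heap block,
 * this finds the map page covering it.
 */
static BlockNumber
map_block_for(BlockNumber heapBlk)
{
    return (heapBlk / MAP_INTERVAL) * MAP_INTERVAL;
}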

A separate file is certainly more flexible. I was leaning towards that option originally (http://archives.postgresql.org/pgsql-hackers/2007-11/msg00142.php) for that reason.

> The idea that's becoming attractive to me while contemplating the
> multiple-maps problem is that we should adopt something similar to
> the old Mac OS idea of multiple "forks" in a relation.  In addition
> to the main data fork which contains the same info as now, there could
> be one or more map forks which are separate files in the filesystem.
> They are named by relfilenode plus an extension, for instance a relation
> with relfilenode NNN would have a data fork in file NNN (plus perhaps
> NNN.1, NNN.2, etc) and a map fork named something like NNN.map (plus
> NNN.map.1 etc as needed).  We'd have to add one more field to buffer
> lookup keys (BufferTag) to disambiguate which fork the referenced page
> is in.  Having bitten that bullet, though, the idea trivially scales to
> any number of map forks with potentially different space requirements
> and different locking and WAL-logging requirements.

Hmm. You'd also need to teach at least xlog.c and xlogutils.c about the map forks, for full-page images and invalid-page tracking. I also wonder what the performance impact of extending BufferTag is.
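For what it's worth, here's roughly what I imagine the widened tag would look like; which forks exist, how they're numbered, and the file suffixes are all just illustrations of your naming scheme, nothing more:

#include "postgres.h"
#include "storage/block.h"
#include "storage/relfilenode.h"

/*
 * Sketch of the widened buffer lookup key.  Today's BufferTag is just
 * {RelFileNode, BlockNumber}; the fork idea adds one more discriminating
 * field, which also makes every buffer hashtable key four bytes bigger.
 */
typedef enum ForkNumber
{
    MAIN_FORKNUM = 0,       /* the data fork: file NNN, NNN.1, ... as today */
    FSM_FORKNUM,            /* free space map: e.g. NNN.fsm */
    VISIBILITY_FORKNUM      /* proposed visibility map: e.g. NNN.vm */
} ForkNumber;

typedef struct BufferTag
{
    RelFileNode rnode;      /* physical relation identifier */
    ForkNumber  forkNum;    /* which fork of the relation */
    BlockNumber blockNum;   /* block number within that fork */
} BufferTag;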

My original thought was to have a separate RelFileNode for each of the maps. That would require no smgr or xlog changes, and not very many changes in the buffer manager, though I guess you'd need more catalog changes. You had doubts about that on the previous thread (http://archives.postgresql.org/pgsql-hackers/2007-11/msg00204.php), but the "map forks" idea certainly seems much more invasive than that.

I like the "map forks" idea; it groups the maps nicely at the filesystem level, and I can see it being useful for all kinds of things in the future. The question is, is it really worth the extra code churn? If you think it is, I can try that approach.

> Another possible advantage is that a new map fork could be added to an
> existing table without much trouble.  Which is certainly something we'd
> need if we ever hope to get update-in-place working.

Yep.

> The main disadvantage I can see is that for very small tables, the
> percentage overhead from multiple map forks of one page apiece is
> annoyingly high.  However, most of the point of a map disappears if
> the table is small, so we might finesse that by not creating any maps
> until the table has reached some minimum size.

Yeah, the map fork idea is actually better than the "every nth heap page" approach from that point of view.
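The size threshold could be as simple as this (the cutoff value is a pure guess and would need benchmarking):

#include "postgres.h"
#include "storage/bufmgr.h"
#include "utils/rel.h"

#define MAP_CREATION_THRESHOLD  16      /* blocks; a pure guess */

/*
 * Sketch: only bother with a map fork once the main fork has grown past
 * some minimum size.  Below the threshold, callers just scan the few
 * pages directly instead of consulting a map.
 */
static bool
map_fork_wanted(Relation rel)
{
    return RelationGetNumberOfBlocks(rel) >= MAP_CREATION_THRESHOLD;
}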

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
