Tom Lane wrote:
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
I've started working on revamping Free Space Map, using the approach where we store a map of heap pages on every nth heap page. What we need now is discussion on the details of how exactly it should work.

> You're cavalierly waving away a whole boatload of problems that will
> arise as soon as you start trying to make the index AMs play along
> with this :-(.

It doesn't seem very hard. An index AM wanting to use the FSM needs a little bit of code at the point where the relation is extended, to let the FSM initialize its map pages. And then there's the B-tree metapage issue I mentioned. But that's all, AFAICS.
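To sketch what I mean (the names here are made up for illustration, not a settled API), the extension path of an index AM would just grow a call into the FSM module:

#include "postgres.h"
#include "storage/bufmgr.h"
#include "utils/rel.h"

extern void fsm_extend(Relation rel);   /* hypothetical FSM entry point */

/*
 * Hypothetical sketch only: an index AM that wants FSM support calls into
 * the FSM module whenever it extends the relation, so the FSM can slot its
 * own map pages in at the right block numbers.  The real interface is
 * exactly what we'd need to design.
 */
static Buffer
index_extend_with_fsm(Relation rel)
{
    Buffer      buf;

    /* let the FSM initialize any map pages that fall at this point */
    fsm_extend(rel);

    /* then extend as usual to get the new index page */
    buf = ReadBuffer(rel, P_NEW);

    /* ... caller initializes the new page as an index page ... */
    return buf;
}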

> Hash for instance has very narrow-minded ideas about
> page allocation within its indexes.

Hash doesn't use the FSM at all.

> Also, I don't think that "use the special space" will scale to handle
> other kinds of maps such as the proposed dead space map.  (This is
> exactly why I said the other day that we need a design roadmap for all
> these ideas.)

It works for anything that scales linearly with the relation itself. The proposed FSM and visibility map both fall into that category.
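For concreteness, the arithmetic under the every-nth-page scheme would be something like this (MAP_INTERVAL is a placeholder value, and this assumes map pages sit at block 0 and every MAP_INTERVAL'th block after it):

#include "postgres.h"
#include "storage/block.h"

#define MAP_INTERVAL    4096    /* placeholder; the real spacing is an open question */

/*
 * With a map page at block 0, MAP_INTERVAL, 2 * MAP_INTERVAL, and so on,
 * each map page covers the MAP_INTERVAL - 1 heap pages that follow it, so
 * the map grows in lockstep with the heap.  Given a (non-map) heap block,
 * this finds the map page covering it.
 */
static BlockNumber
map_block_for(BlockNumber heapBlk)
{
    return (heapBlk / MAP_INTERVAL) * MAP_INTERVAL;
}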

A separate file is certainly more flexible. I was leaning towards that option originally (http://archives.postgresql.org/pgsql-hackers/2007-11/msg00142.php) for that reason.

> The idea that's becoming attractive to me while contemplating the
> multiple-maps problem is that we should adopt something similar to
> the old Mac OS idea of multiple "forks" in a relation.  In addition
> to the main data fork which contains the same info as now, there could
> be one or more map forks which are separate files in the filesystem.
> They are named by relfilenode plus an extension, for instance a relation
> with relfilenode NNN would have a data fork in file NNN (plus perhaps
> NNN.1, NNN.2, etc) and a map fork named something like NNN.map (plus
> NNN.map.1 etc as needed).  We'd have to add one more field to buffer
> lookup keys (BufferTag) to disambiguate which fork the referenced page
> is in.  Having bitten that bullet, though, the idea trivially scales to
> any number of map forks with potentially different space requirements
> and different locking and WAL-logging requirements.

Hmm. You'd also need to teach at least xlog.c and xlogutils.c about the map forks, for full-page images and invalid-page tracking. I also wonder what the performance impact of extending BufferTag is.
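For what it's worth, here's roughly what I imagine the widened tag would look like; which forks exist, how they're numbered, and the file suffixes are all just illustrations of your naming scheme, nothing more:

#include "postgres.h"
#include "storage/block.h"
#include "storage/relfilenode.h"

/*
 * Sketch of the widened buffer lookup key.  Today's BufferTag is just
 * {RelFileNode, BlockNumber}; the fork idea adds one more discriminating
 * field, which also makes every buffer hashtable key four bytes bigger.
 */
typedef enum ForkNumber
{
    MAIN_FORKNUM = 0,       /* the data fork: file NNN, NNN.1, ... as today */
    FSM_FORKNUM,            /* free space map: e.g. NNN.fsm */
    VISIBILITY_FORKNUM      /* proposed visibility map: e.g. NNN.vm */
} ForkNumber;

typedef struct BufferTag
{
    RelFileNode rnode;      /* physical relation identifier */
    ForkNumber  forkNum;    /* which fork of the relation */
    BlockNumber blockNum;   /* block number within that fork */
} BufferTag;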

My original thought was to have a separate RelFileNode for each of the maps. That would require no smgr or xlog changes, and not very many changes in the buffer manager, though I guess you'd need more catalog changes. You had doubts about that on the previous thread (http://archives.postgresql.org/pgsql-hackers/2007-11/msg00204.php), but the "map forks" idea certainly seems much more invasive than that.

I like the "map forks" idea; it groups the maps nicely at the filesystem level, and I can see it being useful for all kinds of things in the future. The question is, is it really worth the extra code churn? If you think it is, I can try that approach.

> Another possible advantage is that a new map fork could be added to an
> existing table without much trouble.  Which is certainly something we'd
> need if we ever hope to get update-in-place working.

Yep.

> The main disadvantage I can see is that for very small tables, the
> percentage overhead from multiple map forks of one page apiece is
> annoyingly high.  However, most of the point of a map disappears if
> the table is small, so we might finesse that by not creating any maps
> until the table has reached some minimum size.

Yeah, the map fork idea is actually better than the "every nth heap page" approach from that point of view.
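The size threshold could be as simple as this (the cutoff value is a pure guess and would need benchmarking):

#include "postgres.h"
#include "storage/bufmgr.h"
#include "utils/rel.h"

#define MAP_CREATION_THRESHOLD  16      /* blocks; a pure guess */

/*
 * Sketch: only bother with a map fork once the main fork has grown past
 * some minimum size.  Below the threshold, callers just scan the few
 * pages directly instead of consulting a map.
 */
static bool
map_fork_wanted(Relation rel)
{
    return RelationGetNumberOfBlocks(rel) >= MAP_CREATION_THRESHOLD;
}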

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
