Re: [HACKERS] Rewriting Free Space Map

Heikki Linnakangas Mon, 17 Mar 2008 12:29:41 -0700

Tom Lane wrote:

"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:

Tom Lane wrote:

You're cavalierly waving away a whole boatload of problems that will
arise as soon as you start trying to make the index AMs play along
with this :-(.

It doesn't seem very hard.


The problem is that the index AMs are no longer in control of what goes
where within their indexes, which has always been their prerogative to
determine.  The fact that you think you can kluge btree to still work
doesn't mean that it will work for other AMs.

Well, it does work with all the existing AMs AFAICS. I do agree with the general point; it'd certainly be cleaner, more modular and more flexible if the AMs didn't need to know about the existence of the maps.

The idea that's becoming attractive to me while contemplating the
multiple-maps problem is that we should adopt something similar to
the old Mac OS idea of multiple "forks" in a relation.

Hmm. You also need to teach at least xlog.c and xlogutils.c about the map forks, for full page images and the invalid page tracking.


Well, you'd have to teach them something anyway, for any incarnation
of maps that they might need to update.

Umm, the WAL code doesn't care where the pages it operates on came from. Sure, we'll need rmgr-specific code that knows what to do with the maps, but the full page image code would work without changes with the multiple RelFileNode approach.

The essential change with the map fork idea is that a RelFileNode no longer uniquely identifies a file on disk (ignoring the segmentation which is handled in smgr for now). Anything that operates on RelFileNodes, without any higher level information of what it is, needs to be modified to use RelFileNode+forkid instead. That includes at least the buffer manager, smgr, and the full page image code in xlog.c.

It's probably a pretty mechanical change, even though it affects a lot of code. We'd probably want to have a new struct, let's call it PhysFileId for now, for RelFileNode+forkid, and basically replace all occurrences of RelFileNode with PhysFileId in smgr, bufmgr and xlog code.
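To make the RelFileNode+forkid idea concrete, here is a minimal sketch of what such a struct might look like. The RelFileNode fields follow the existing definition; the ForkId enum, its values, and both helper functions are hypothetical names made up for illustration, not existing PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;

/* Same three fields as the existing RelFileNode. */
typedef struct RelFileNode
{
    Oid spcNode;    /* tablespace */
    Oid dbNode;     /* database */
    Oid relNode;    /* relation */
} RelFileNode;

/* Hypothetical fork numbers, hard-wired into the code that needs them. */
typedef enum ForkId
{
    FORK_MAIN = 0,  /* main data portion */
    FORK_FSM = 1,   /* free space map */
    FORK_COUNT
} ForkId;

/* RelFileNode + forkid: identifies one physical file (before segmentation). */
typedef struct PhysFileId
{
    RelFileNode rnode;
    ForkId      forkid;
} PhysFileId;

/* Hypothetical constructor; rejects fork numbers outside the known set. */
static PhysFileId
phys_file_id_make(RelFileNode rnode, ForkId forkid)
{
    PhysFileId id;

    assert((int) forkid >= 0 && (int) forkid < FORK_COUNT);
    id.rnode = rnode;
    id.forkid = forkid;
    return id;
}

/* Two ids name the same file only if relation and fork both match. */
static bool
phys_file_id_equals(const PhysFileId *a, const PhysFileId *b)
{
    return a->rnode.spcNode == b->rnode.spcNode &&
           a->rnode.dbNode == b->rnode.dbNode &&
           a->rnode.relNode == b->rnode.relNode &&
           a->forkid == b->forkid;
}
```

With something like this, the main fork and the FSM fork of one relation share a RelFileNode but compare as different files, which is the property smgr, bufmgr and the full page image code would need to rely on.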

I also wonder what the performance impact of extending BufferTag is.


That's a fair objection, and obviously something we'd need to check.
But I don't recall seeing hash_any so high on any profile that I think
it'd be a big problem.

I do remember seeing hash_any in some oprofile runs. But that's fairly easy to test: we don't need to actually implement any of the stuff, other than add a field to BufferTag, and run pgbench.
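As a rough illustration of what "add a field to BufferTag" means for the hash key, the sketch below compares the current two-member tag with a hypothetical one that carries a fork number. The forkid member and both struct names are assumptions, and the FNV-1a function is only a stand-in so the example compiles on its own; it is not hash_any and says nothing about hash_any's actual cost. Only a pgbench run would answer the performance question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;
typedef uint32_t BlockNumber;

typedef struct RelFileNode
{
    Oid spcNode;
    Oid dbNode;
    Oid relNode;
} RelFileNode;

/* Layout of the tag as it is today: relation plus block number. */
typedef struct BufferTagOld
{
    RelFileNode rnode;
    BlockNumber blockNum;
} BufferTagOld;

/* Hypothetical extended tag; forkid is the added field. */
typedef struct BufferTagNew
{
    RelFileNode rnode;
    uint32_t    forkid;
    BlockNumber blockNum;
} BufferTagNew;

/* Stand-in byte hash (32-bit FNV-1a), NOT PostgreSQL's hash_any. */
static uint32_t
fnv1a_hash(const void *key, size_t len)
{
    const unsigned char *p = (const unsigned char *) key;
    uint32_t    h = 2166136261u;
    size_t      i;

    assert(key != NULL);
    for (i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}
```

On common platforms the key grows from 16 to 20 bytes, so every buffer lookup hashes and compares four more bytes. All members are 32 bits wide here, so there is no padding to worry about when hashing the raw bytes.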

My original thought was to have a separate RelFileNode for each of the maps. That would require no smgr or xlog changes, and not very many changes in the buffer manager, though I guess you'd need more catalog changes. You had doubts about that on the previous thread (http://archives.postgresql.org/pgsql-hackers/2007-11/msg00204.php), but the "map forks" idea certainly seems much more invasive than that.

The main problems with that are (a) the need to expose every type of map
in pg_class and (b) the need to pass all those relfilenode numbers down
to pretty low levels of the system.

(a) is certainly a valid point. Regarding (b), I don't think the low level stuff (I assume you mean smgr, bufmgr, bgwriter, xlog by that) would need to be passed any additional relfilenode numbers. Or rather, they already work with relfilenodes, and they don't need to know whether the relfilenode is for an index, a heap, or an FSM attached to something else. The relfilenodes would be in RelationData, and we already have that around whenever we do anything that needs to differentiate between those.
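A small sketch of that layering, under the separate-RelFileNode approach: the higher-level code picks which relfilenode it wants from the relation descriptor, and the low-level routine receives a plain RelFileNode exactly as it does today. RelationSketch, rd_fsm_node, WhichFile and both functions are hypothetical stand-ins, not the real RelationData or smgr interfaces.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t Oid;
typedef uint32_t BlockNumber;

typedef struct RelFileNode
{
    Oid spcNode;
    Oid dbNode;
    Oid relNode;
} RelFileNode;

/* Cut-down, hypothetical stand-in for RelationData. */
typedef struct RelationSketch
{
    RelFileNode rd_node;        /* main data file */
    RelFileNode rd_fsm_node;    /* hypothetical: separate file for the FSM */
} RelationSketch;

typedef enum WhichFile
{
    WANT_MAIN,
    WANT_FSM
} WhichFile;

/* Higher-level code: has the relation around, so it chooses the file. */
static RelFileNode
relation_pick_node(const RelationSketch *rel, WhichFile which)
{
    return (which == WANT_FSM) ? rel->rd_fsm_node : rel->rd_node;
}

/*
 * Low-level stand-in: sees only a RelFileNode and a block number, and
 * cannot tell (and need not care) whether it belongs to a heap or a map.
 */
static int
lowlevel_describe(RelFileNode rnode, BlockNumber blkno,
                  char *buf, size_t buflen)
{
    assert(buf != NULL);
    return snprintf(buf, buflen, "%u/%u/%u block %u",
                    (unsigned) rnode.spcNode, (unsigned) rnode.dbNode,
                    (unsigned) rnode.relNode, (unsigned) blkno);
}
```

The signature of the low-level function is the same whether it is handed rd_node or rd_fsm_node, which is why this approach would leave smgr and xlog alone.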

Another consideration is which approach is easiest to debug. The "map fork" approach seems better on that front, as you can immediately see from the PhysFileId if a page is coming from an auxiliary map or the main data portion. That might turn out to be handy in the buffer manager or bgwriter as well; they don't currently have any knowledge of what a page contains.

The nice thing about the fork idea
is that you don't need any added info to uniquely identify what relation
you're working on.  The fork numbers would be hard-wired into whatever
code needed to know about particular forks.  (Of course, these same
advantages apply to using special space in an existing file.  I'm
just suggesting that we can keep these advantages without buying into
the restrictions that special space would have.)

I don't see that advantage. All the higher-level code that cares which relation you're working on already has Relation around. All the lower-level stuff doesn't care.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
