On Thu, Nov 4, 2010 at 2:00 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> On Wed, Oct 20, 2010 at 8:11 PM, Robert Haas <robertmh...@gmail.com> wrote:
>>>> I'm imagining that the kernel of a
>>>> snapshot is just a WAL position, ie the end of WAL as of the time you
>>>> take the snapshot (easy to get in O(1) time).  Visibility tests then
>>>> reduce to "did this transaction commit with a WAL record located before
>>>> the specified position?".
>
>> I spent a bunch of time thinking about this, and I don't see any way
>> to get the memory usage requirements down to something reasonable.
>> The problem is that RecentGlobalXmin might be arbitrarily far back in
>> XID space, and you'll need to know the LSN of every commit from that
>> point forward; whereas the ProcArray requires only constant space.
>
> That's like arguing that clog is no good because it doesn't fit in
> constant space.  ISTM it would be entirely practical to remember the
> commit LSN positions of every transaction back to RecentGlobalXmin,
> using a data structure similar to pg_subtrans --- in fact, it'd require
> exactly twice as much working space as pg_subtrans, ie 64 bits per XID
> instead of 32.  Now, it might be that access contention problems would
> make this unworkable (I think pg_subtrans works largely because we don't
> have to access it often) but it's not something that can be dismissed
> on space grounds.
Maybe I didn't explain that very well.  The point is not so much how much
memory you're using in an absolute sense as how much of it you have to
look at to construct a snapshot.  If you store a giant array indexed by
XID whose value is an LSN, you have to read a potentially unbounded number
of entries from that array.  You can either make a single read through the
relevant portion of the array (snapshot xmin to snapshot xmax) or you can
check each XID as you see it and try to build up a local cache, but either
way there's no fixed limit on how many bytes must be read from the shared
data structure.  That compares unfavorably with the current design, where
you do a one-time read of a bounded amount of data and you're done.  I
suspect your theory about pg_subtrans is correct.

> [ thinks for a bit... ]  But actually this probably ends up being a
> wash or a loss as far as contention goes.  We're talking about a data
> structure that has to be updated during each commit, and read pretty
> frequently, and it's not obvious how that's any better than getting
> commit info from the ProcArray.  Although neither commit nor reading
> would require a *global* lock, so maybe there's a way ...

Well, if we're talking about the "giant XID array" design, or some variant
of that, I would expect that nearly all of the contention would be on the
last page or two, so I don't think it would be much better than a global
lock.  You might be able to get around that by not using an LWLock, and
instead using something like lock xchg or LL/SC to atomically update
entries, but I'm not sure there's a portable way to do such operations on
anything larger than a 4-byte word.  At any rate, I think the problem
described in the preceding paragraph is the more serious one.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers