On Wed, Jun 8, 2011 at 11:27 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Jun 8, 2011 at 11:20 PM, Merlin Moncure <mmonc...@gmail.com> wrote:
>> You're probably right. I think though there is enough hypothetical
>> upside to the private buffer case that it should be attempted just to
>> see what breaks. The major tricky bit is dealing with the new
>> pin/unpin mechanics. I'd like to give it the 'college try'. (being
>> typically vain and attention seeking, this is right up my alley) :-D.
>
> Well, I think it's fairly clear what will break:
>
> - If you make the data-file buffer completely private, then what will
> happen when some other backend needs to read or write that buffer?

The private wal buffer? The whole point (maybe impossible) is to try
and engineer it so that the other backends *never* have to read and
write it -- from their point of view, it hasn't happened yet (even
though it has been written into some heap buffers). Since data
activity for ongoing transactions can happen at any time, moving wal
inserts into the private buffer delays their entry into the log so
you can avoid taking locks for pre-commit heap activity. This lets
the backends doing it pretend they actually did write data out into
the log without breaking the 'wal before data' rule, which is upheld
by keeping the pin on pages with your magic LSN (which I'm starting
to wonder if it should be a flag like BM_DEFERRED_WAL). We
essentially are moving xlog activity as far ahead in time as possible
(although within a very limited time window) in order to combine
locks and hopefully gain efficiency.

It all comes down to which rules you can bend and which you can
break. The heap pages that have been marked this way may or may not
have to be off limits to backends other than the one that did the
marking, and if they have to be off limits logically, there may be no
realistic path to make them so. I just don't know...I'm learning as
I go. At the end of the day, it's all coming off as pretty fragile
if it even works, but it's fun to think about. Anyways, I'm inclined
to experiment.

> - If you make the XLOG spool private, you will not be able to checkpoint.

Correct -- but I don't think this problem is intractable, and it is
really a secondary issue vs making sure the wal/heap/mvcc/backend
interactions 'work'. The intent here is to spool only a relatively
small amount of uncommitted transaction data for a short period of
time, like 5-10 seconds. Maybe you bite the bullet and tell everyone
to flush private WAL at checkpoint time via signal or something.
Maybe you bend some rules on checkpoints.
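
To make the private spool idea a bit more concrete, here is a toy
model in Python. It is only a sketch of the bookkeeping, not
PostgreSQL code. Every name in it is hypothetical (Page, SharedWal,
Backend, can_write_out, checkpoint), and BM_DEFERRED_WAL is just the
flag name floated above, not an existing buffer flag. It assumes the
simplest possible answers to the open questions: a page stays pinned
until the owning backend flushes, and a checkpoint simply asks every
backend to flush.

```python
BM_DEFERRED_WAL = 0x1  # hypothetical flag bit, named after the idea above


class Page:
    """Toy heap page: just a flag word and a pin count."""

    def __init__(self):
        self.flags = 0
        self.pin_count = 0


class SharedWal:
    """Toy shared log; counts how often its (imaginary) lock is taken."""

    def __init__(self):
        self.records = []
        self.lock_acquisitions = 0

    def insert_batch(self, batch):
        self.lock_acquisitions += 1  # one acquisition for the whole batch
        self.records.extend(batch)


class Backend:
    def __init__(self, shared_wal):
        self.shared_wal = shared_wal
        self.private_spool = []   # WAL records no other backend can see
        self.deferred_pages = []  # pages this backend is holding pinned

    def modify(self, page, record):
        """Spool the record privately and keep the touched page pinned."""
        self.private_spool.append(record)
        if page not in self.deferred_pages:
            page.flags |= BM_DEFERRED_WAL
            page.pin_count += 1
            self.deferred_pages.append(page)

    def flush_private_wal(self):
        """Move the spool into the shared log, then release the pages."""
        if self.private_spool:
            self.shared_wal.insert_batch(self.private_spool)
            self.private_spool = []
        for page in self.deferred_pages:
            page.pin_count -= 1
            if page.pin_count == 0:
                page.flags &= ~BM_DEFERRED_WAL
        self.deferred_pages = []

    def commit(self):
        self.flush_private_wal()


def can_write_out(page):
    """'WAL before data': a page pinned this way must not reach disk yet."""
    return page.pin_count == 0


def checkpoint(backends):
    """Stand-in for 'tell everyone to flush private WAL via signal'."""
    for backend in backends:
        backend.flush_private_wal()


wal = SharedWal()
backend = Backend(wal)
page = Page()
backend.modify(page, "insert tuple 1")
backend.modify(page, "insert tuple 2")
# Before commit: nothing is in the shared log and the page cannot be written.
print(wal.records, can_write_out(page))
backend.commit()
# After commit: both records went in under a single lock acquisition.
print(wal.records, wal.lock_acquisitions, can_write_out(page))
```

The model deliberately dodges the hard part, which is what another
backend is allowed to do with a page that carries the flag. Here
nobody else ever touches it.
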

merlin