On Mon, Mar 17, 2014 at 10:54 AM, Heikki Linnakangas
<hlinnakan...@vmware.com> wrote:
> Heap and B-tree WAL records also rely on PageAddItem etc. to reconstruct the
> page, instead of making a physical copy of the modified parts. And
> _bt_restore_page even inserts the items physically in different order than
> the normal codepath does. So for good or bad, there is some precedence for
> this.

Yikes.

> The imminent danger I see is if we change the logic on how the items are
> divided into posting lists, and end up in a situation where a master server
> adds an item to a page, and it just fits, but with the compression logic the
> standby version has, it cannot make it fit. As an escape hatch for that, we
> could have the WAL replay code try the compression again, with a larger max.
> posting list size, if it doesn't fit at first. And/or always leave something
> like 10 bytes of free space on every data page to make up for small
> differences in the logic.

That scares the crap out of me. I don't see any intrinsic problem with
relying on the existing page contents to figure out how to roll
forward, as PageAddItem does; after all, we do FPIs precisely so that
the page is in a known good state when we start. However, I really
think we ought to try hard to make this deterministic in terms of what
the resulting state of the page is; anything else seems like it's
playing with fire, and I bet we'll get burned sooner rather than
later.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
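
To make the failure mode concrete, here is a toy model of it in Python. This is not PostgreSQL code: the page capacity, the per-list header cost, the item size, and the names packed_size, fits, and replay_insert are all made up for illustration. The only thing it is meant to show is how a master and a standby that split items into posting lists with different limits can disagree about whether the same item fits, and what the proposed "retry with a larger max" escape hatch would look like.

```python
import math

# All constants are assumptions for this toy model, not real GIN values.
PAGE_CAPACITY = 100  # usable bytes on the toy page
LIST_HEADER = 8      # fixed overhead per posting list, in bytes
ITEM_SIZE = 2        # bytes per compressed item


def packed_size(n_items, max_items_per_list):
    """Bytes needed to store n_items split into lists of at most the given size."""
    n_lists = math.ceil(n_items / max_items_per_list)
    return n_lists * LIST_HEADER + n_items * ITEM_SIZE


def fits(n_items, max_items_per_list):
    """True if the packed items fit on the toy page."""
    return packed_size(n_items, max_items_per_list) <= PAGE_CAPACITY


def replay_insert(n_items_after, max_items_per_list, retry_limits=()):
    """Hypothetical replay step: try the local limit, then any larger fallbacks.

    Returns the limit that made the items fit, or raises if none did.
    """
    for limit in (max_items_per_list, *retry_limits):
        if fits(n_items_after, limit):
            return limit
    raise RuntimeError("replay failed: item does not fit on page")


if __name__ == "__main__":
    n = 38  # item count on the page after the master's insert
    print("master (max 16 per list) fits:", fits(n, 16))
    print("standby (max 8 per list) fits:", fits(n, 8))
    print("standby with retry uses limit:", replay_insert(n, 8, (16, 32)))
```

In this model the master needs 3 lists (24 + 76 = 100 bytes) and just fits, while a standby that caps lists at 8 items needs 5 lists (40 + 76 = 116 bytes) and does not. The retry rescues replay only when one of the fallback limits happens to fit, and nothing in it guarantees that the standby's page ends up laid out the same way as the master's, which is the determinism concern above.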