Greg Stark <st...@mit.edu> writes:
> A single WAL record can be over 24kB.
<pedantic> Actually, WAL records can run to megabytes.  Consider for
example a commit record for a transaction that dropped thousands of
tables --- there'll be info about each such table in the commit record,
to cue replay to remove those files. </pedantic>

> If you replayed the following record but not this record you would
> have an inconsistent database. ...
> Or it could be an index insert for that tuple, which would result in a
> physically inconsistent database with index pointers that point to
> incorrect tuples.  Index scans would return tuples that didn't match
> the index or would miss tuples that should be returned.

Skipping actions such as index page splits would lead to even more fun.
Even in simple cases such as successive inserts and deletions in the
same heap page, failing to replay some of the actions is going to be
disastrous.  The *best case* scenario for that is that WAL replay PANICs
when it notices that the action it's trying to replay is inconsistent
with the current state of the page, eg it's trying to insert at a TID
that already exists.

IMO we can't proceed past a broken WAL record.  The actually useful
suggestion upthread was that we try to notice whether there seem to be
valid WAL records past the broken one, so that we could warn the DBA
that some commits might have been lost.  I don't think we can do much in
the way of automatic data recovery, but we could give the DBA a chance
to do forensics rather than blindly starting up (and promptly
overwriting all the evidence).

			regards, tom lane
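P.S. To put a rough number on that <pedantic> point: the commit record
carries a variable-length array of relation file identifiers telling
replay which files to unlink.  A back-of-the-envelope sketch --- the
structs below are simplified stand-ins for the real xl_xact_commit and
RelFileNode, not the actual definitions:

#include <stdio.h>
#include <stdint.h>

/* Simplified stand-ins --- the real definitions live in
 * storage/relfilenode.h and access/xact.h. */
typedef struct RelFileNodeSketch
{
	uint32_t	spcNode;	/* tablespace */
	uint32_t	dbNode;		/* database */
	uint32_t	relNode;	/* relation file */
} RelFileNodeSketch;

typedef struct CommitRecordSketch
{
	int64_t		xact_time;	/* commit timestamp */
	int32_t		nrels;		/* number of files to unlink at replay */
	/* ... followed by nrels RelFileNodes, subxact XIDs, inval msgs */
} CommitRecordSketch;

int
main(void)
{
	int32_t		nrels = 100000;	/* transaction that dropped 100k tables */
	size_t		size = sizeof(CommitRecordSketch)
		+ (size_t) nrels * sizeof(RelFileNodeSketch);

	/* prints about 1.1 MB --- well past 24kB for the drop list alone */
	printf("commit record payload: %zu bytes (%.2f MB)\n",
		   size, size / (1024.0 * 1024.0));
	return 0;
}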
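The PANIC cross-check is similarly simple in spirit: before redo applies
an insertion, it can verify that the target line pointer is actually
free, and give up hard if it isn't (the real checks are spread through
the heap and btree redo routines, eg heap_xlog_insert elog(PANIC)s if
the page won't accept the tuple).  A mock-typed sketch of the idea:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

#define MAX_OFFSETS 256

/* Mock page state: which line pointers (TID offsets) are in use. */
typedef struct PageSketch
{
	bool		lp_used[MAX_OFFSETS];
} PageSketch;

/*
 * Redo an insertion at the offset the WAL record dictates.  If an item
 * already exists there, the page does not match the state the record
 * was written against --- continuing would corrupt it silently, so the
 * only sane response is to PANIC.
 */
static void
redo_insert(PageSketch *page, int offnum)
{
	if (page->lp_used[offnum])
	{
		fprintf(stderr, "PANIC: redo insert at offset %d, but an item "
				"already exists there\n", offnum);
		abort();				/* real code: elog(PANIC, ...) */
	}
	page->lp_used[offnum] = true;
}

int
main(void)
{
	PageSketch	page = {{false}};

	redo_insert(&page, 3);		/* replay against the right state: fine */
	redo_insert(&page, 3);		/* replay against the wrong state: PANIC */
	return 0;
}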
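As for noticing valid-looking records past the breakage: that's
essentially running the same header and checksum validation the reader
already does, but in look-don't-trust mode after the first failure.  If
anything downstream still validates, we very likely stopped mid-stream
rather than at the genuine end of WAL, and should warn accordingly.  A
minimal sketch, with a made-up miniature header and a stand-in checksum
(the real thing would walk actual XLogRecord headers page by page):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Made-up miniature record header --- a stand-in for XLogRecord. */
typedef struct MiniRecord
{
	uint32_t	tot_len;	/* total record length, header included */
	uint64_t	prev_pos;	/* start position of the previous record */
	uint32_t	crc;		/* checksum of the payload */
} MiniRecord;

/* Stand-in checksum (FNV-1a); the real thing is a CRC32. */
static uint32_t
mini_crc(const uint8_t *data, size_t len)
{
	uint32_t	h = 2166136261u;

	for (size_t i = 0; i < len; i++)
		h = (h ^ data[i]) * 16777619u;
	return h;
}

/*
 * After a broken record at 'broken_pos', keep scanning for records
 * that still validate.  Finding any suggests we stopped mid-stream,
 * not at the genuine end of WAL.
 */
static int
scan_past_breakage(const uint8_t *wal, size_t wal_len, size_t broken_pos)
{
	int			nvalid = 0;
	size_t		pos = broken_pos;

	while (pos + sizeof(MiniRecord) <= wal_len)
	{
		MiniRecord	rec;

		memcpy(&rec, wal + pos, sizeof(rec));
		if (rec.tot_len >= sizeof(MiniRecord) &&
			pos + rec.tot_len <= wal_len &&
			rec.prev_pos < pos &&
			rec.crc == mini_crc(wal + pos + sizeof(MiniRecord),
								rec.tot_len - sizeof(MiniRecord)))
		{
			nvalid++;			/* plausibly a real record */
			pos += rec.tot_len;
		}
		else
			pos++;				/* resynchronize byte by byte */
	}

	if (nvalid > 0)
		fprintf(stderr, "WARNING: %d apparently valid WAL records past "
				"the damaged one; some commits may be lost\n", nvalid);
	return nvalid;
}

int
main(void)
{
	uint8_t		wal[1024] = {0};

	/* all-zero buffer: nothing validates, as at a genuine end of WAL */
	scan_past_breakage(wal, sizeof(wal), 128);
	return 0;
}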