I wrote:
> Greg Stark <st...@mit.edu> writes:
>> WAL-E actually didn't restore a whole 1GB file due to a transient S3
>> problem, in fact a bunch of them.
> Hah.  Okay, I think we can write this issue off as closed then.

Oh, wait a minute.  It's not just a matter of whether we find the right
block: we also have to consider whether XLogReadBufferExtended will
apply the right "mode" behavior.  Currently, it supposes that all pages
past the initially observed EOF should be assumed to be uninitialized;
but if we're working with an inconsistent database, that seems like an
unsafe assumption.  It might be that a page is there but we've not (yet)
fixed the length of some preceding segment.  If we want to not get bogus
"WAL contains references to invalid pages" failures in such scenarios,
it seems like we need a more invasive change than what I just committed.
I think your patch didn't cover this consideration either.

What I think we probably want to do is forcibly cause the target page to
exist, using a P_NEW loop like what I committed, and then decide on the
basis of whether it's all-zeroes whether to consider it invalid or not.
This seems sane on the grounds that it's just the extension to the page
level of the existing policy of creating the file whether it existed or
not.  It could only result in a large amount of wasted work if the
passed-in target block is insane --- but since we got it out of a
CRC-checked WAL record, I think it's safe to not worry too much about
that.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
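
For illustration only, here is a minimal sketch of the policy proposed
above, written as a toy Python model rather than PostgreSQL code.  The
names ToyRelation, extend_one, page_is_all_zeroes and read_for_redo are
hypothetical and do not exist in the backend; the only thing borrowed
from PostgreSQL is the default block size.  The idea it tries to show:
extend the relation one block at a time until the target block exists,
then classify the page by whether it is all zeroes, instead of by where
the EOF was first observed.

```python
BLCKSZ = 8192  # PostgreSQL's default block size; everything else is a toy model


class ToyRelation:
    """Hypothetical in-memory stand-in for a relation: a list of pages."""

    def __init__(self, pages=None):
        self.pages = list(pages or [])

    def extend_one(self):
        """Append one zero-filled page, loosely mimicking a P_NEW extension."""
        self.pages.append(bytes(BLCKSZ))
        return len(self.pages) - 1


def page_is_all_zeroes(page):
    """True if every byte of the page is zero."""
    return not any(page)


def read_for_redo(rel, target_blkno):
    """Force target_blkno to exist, then classify the page by its contents.

    Returns (page, is_valid).  A page that is all zeroes is treated as
    invalid (uninitialized); any other page is considered usable.
    """
    # Extend one block at a time until the target block exists.
    while len(rel.pages) <= target_blkno:
        rel.extend_one()
    page = rel.pages[target_blkno]
    return page, not page_is_all_zeroes(page)
```

With a one-page relation, asking for block 3 grows the toy relation to
four pages and reports the target as invalid because it is all zeroes;
asking for block 0, which holds data, reports it as valid and extends
nothing.  The cost of the loop is proportional to how far the target
lies past the current end, which is why an insane target block number
would be the only case of large wasted work.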