Greetings,

* Tomas Vondra (tomas.von...@2ndquadrant.com) wrote:
> On 3/2/19 12:03 AM, Robert Haas wrote:
> > On Tue, Sep 18, 2018 at 10:37 AM Michael Banck
> > <michael.ba...@credativ.de> wrote:
> >> I have added a retry for this as well now, without a pg_sleep() as well.
> >> This catches around 80% of the half-reads, but a few slip through. At
> >> that point we bail out with exit(1), and the user can try again, which I
> >> think is fine?
> >
> > Maybe I'm confused here, but catching 80% of torn pages doesn't sound
> > robust at all.
>
> FWIW I don't think this qualifies as a torn page - i.e. it's not a full
> read with a mix of old and new data. This is a partial read, most likely
> because we read the blocks one by one, and when we hit the last page
> while the table is being extended, we may only see the first 4kB. And if
> we retry very fast, we may still see only the first 4kB.
I really still am not following why this is such an issue: we do a read, get back 4KB, do another read, check if it's zero, and if so then we should be able to conclude that we're at the end of the file, no? If we're at the end of the file and we don't have a final complete block to run a checksum check on, then it seems clear to me that the file was being extended and it's OK to skip that block. We could also stat() the file and keep track of where we are, to detect such an extension of the file happening, if we wanted an additional cross-check, couldn't we?

If we do a read and get 4KB back and then do another and get 4KB back, then we just treat it like we would an 8KB block. Really, as long as a subsequent read is returning bytes, we keep going, and if it returns zero then it's EOF. I could maybe see a "one final read" option, but I don't think it makes sense to have some kind of time-based delay around this where we keep trying to read.

All of this about hacking up a way to connect to PG and lock pages in shared buffers, so that we can perform a checksum check, seems really rather ridiculous for either the extension case or the regular mid-file torn-page case. To be clear, I agree completely that we don't want to be reporting false positives or "this might mean corruption!" to users running the tool, but I haven't seen a good explanation of why this needs to involve the server to avoid that happening. If someone would like to point that out to me, I'd be happy to go read about it and try to understand.

Thanks!

Stephen