On 9/18/18 11:45 AM, Stephen Frost wrote:
> * Michael Banck (michael.ba...@credativ.de) wrote:
>> I have added a retry for this as well now, without a pg_sleep() as well.
>
>> This catches around 80% of the half-reads, but a few slip through. At
>> that point we bail out with exit(1), and the user can try again, which I
>> think is fine?
>
> No, this is perfectly normal behavior, as is having completely blank
> pages, now that I think about it. If we get a short read then I'd say
> we simply check that we got an EOF and, in that case, we just move on.
>
>> Alternatively, we could just skip to the next file then and don't make
>> it count as a checksum failure.
>
> No, I wouldn't count it as a checksum failure. We could possibly count
> it towards the skipped pages, though I'm even on the fence about that.

+1 for it not being a failure. Personally I'd count it as a skipped
page, since we know the page exists but it can't be verified.

The other option is to wait for the page to stabilize, which doesn't
seem like it would take very long in most cases -- unless you are doing
this test from another host with shared storage. Then I would expect to
see all kinds of interesting torn pages after the last checkpoint.

Regards,
-- 
-David
da...@pgmasters.net
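
For illustration, the handling discussed in this thread (retry the read, treat a short read at end of file as normal, and count an unverifiable page as skipped rather than failed) might be sketched as below. This is only a sketch and not the pg_verify_checksums code: `classify_block`, its return labels, and the `verify` callback are hypothetical names, and the 8192-byte block size is assumed to be the PostgreSQL default.

```python
BLCKSZ = 8192  # assumed: PostgreSQL's default block size


def classify_block(f, blockno, verify, retries=1):
    """Classify one block of a seekable binary file object.

    Returns one of the hypothetical labels 'ok', 'failed', 'skipped',
    'blank' or 'eof'. `verify` is a caller-supplied stand-in for the
    real checksum check: it takes the page bytes and returns a bool.
    """
    result = "skipped"
    for _ in range(retries + 1):
        f.seek(blockno * BLCKSZ)
        buf = f.read(BLCKSZ)
        if not buf:
            # Nothing left at this offset: plain end of file, move on.
            return "eof"
        if len(buf) < BLCKSZ:
            # Short read: the page exists but cannot be verified yet.
            # Retry; if it stays short, count it as skipped, not failed.
            result = "skipped"
            continue
        if not any(buf):
            # Completely blank (all-zero) page, which is normal.
            return "blank"
        if verify(buf):
            return "ok"
        # Checksum mismatch: possibly a torn read, so retry before
        # reporting a failure.
        result = "failed"
    return result
```

A caller would loop over block numbers until `'eof'` is returned and keep separate counters for failed and skipped pages. Waiting for the page to stabilize, the other option mentioned above, would amount to adding a delay before each retry.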