* Peter Geoghegan (p...@heroku.com) wrote:
> On Mon, Jan 23, 2017 at 5:26 PM, Stephen Frost <sfr...@snowman.net> wrote:
> > Not sure how this part of that sentence was missed:
> >
> > -----
> > ... even though they were enabled as soon as the feature became
> > available.
> > -----
> >
> > Which would seem to me to say "the code's been running for a long time
> > on a *lot* of systems without throwing a false positive or surfacing a
> > bug."
>
> I think you've both understood what I said correctly. Note that I
> remain neutral on the question of whether or not checksums should be
> enabled by default.
>
> Perhaps I've missed the point entirely, but, I have to ask: How could
> there ever be false positives? With checksums, false positives are
> simply not allowed. Therefore, there cannot be a false positive,
> unless we define checksums as a mechanism that should only find
> problems that originate somewhere at or below the filesystem. We
> clearly have not done that, so ISTM that checksums could legitimately
> find bugs in the checksum code. I am not being facetious.
I'm not sure I'm following your question here.  A false positive would be
a case where the checksum code throws an error on a page whose checksum is
correct, or where the checksum has failed but nothing is actually
wrong/different on the page.

As for the purpose of checksums, it's exactly to identify cases where the
page has been changed since we wrote it out, due to corruption in the
kernel, filesystem, storage system, etc.  As we only check them when we
read in a page and calculate them when we go to write the page out, they
aren't helpful for shared_buffers corruption, generally speaking.

It might be interesting to consider checking them in 'clean' pages in
shared_buffers in a background process, as that, presumably, *would*
detect shared buffers corruption.

Thanks!

Stephen
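For illustration of the write-time/read-time asymmetry described above, here
is a minimal, self-contained sketch.  The checksum algorithm, struct layout,
and names (write_page, read_page_ok) are invented for this example and are
not PostgreSQL's actual code; it only shows why a checksum stamped at
write-out and verified at read-in cannot catch a page corrupted while it
sits in shared_buffers.

    /*
     * Toy sketch: stamp a checksum when a page is written out, verify it
     * when the page is read back in.  Anything that changes the page in
     * between (kernel, filesystem, storage) shows up as a mismatch.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define PAGE_SIZE 8192

    typedef struct Page
    {
        uint16_t checksum;              /* stored in the page header */
        char     data[PAGE_SIZE - 2];   /* rest of the page */
    } Page;

    /* Toy checksum over the page contents, seeded with the block number. */
    static uint16_t
    page_checksum(const Page *page, uint32_t blkno)
    {
        uint32_t sum = blkno;

        for (size_t i = 0; i < sizeof(page->data); i++)
            sum = sum * 31 + (unsigned char) page->data[i];
        return (uint16_t) (sum ^ (sum >> 16));
    }

    /* On write-out: compute and stamp the checksum, then hand to storage. */
    static void
    write_page(Page *page, uint32_t blkno)
    {
        page->checksum = page_checksum(page, blkno);
        /* ... write the page to disk here ... */
    }

    /*
     * On read-in: recompute and compare.  A mismatch means the page changed
     * somewhere between write-out and read-in.
     */
    static int
    read_page_ok(const Page *page, uint32_t blkno)
    {
        return page->checksum == page_checksum(page, blkno);
    }

    int
    main(void)
    {
        Page     page;
        uint32_t blkno = 42;

        memset(&page, 0, sizeof(page));
        strcpy(page.data, "some tuple data");
        write_page(&page, blkno);

        /* Flip a bit "on disk" to simulate corruption below PostgreSQL. */
        page.data[3] ^= 0x01;

        printf("verify on read-in: %s\n",
               read_page_ok(&page, blkno) ? "ok" : "checksum failure");
        return 0;
    }

A background process that recomputed checksums on 'clean' buffers would, in
terms of this sketch, be calling the verify step on pages that are still in
memory rather than only on read-in.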