On Tue, Mar 06, 2012 at 09:25:17AM -0500, Robert Haas wrote:
> > 2. Turning checksums on/off/on/off in rapid succession can cause false
> > positive reports of checksum failure if crashes occur and are ignored.
> > That may lead to the feature and PostgreSQL being held in disrepute.
>
> This I do think is a problem, although not for precisely the reason
> stated here.  In my experience, in data corruption situations, the
> first thing customers do is blame PostgreSQL: they don't believe it's
> the hardware; they accuse us of having bugs in our code.  Having a
> checksum feature would be valuable, because, first, we'd perhaps
> detect problems sooner and, second, people understand what checksums
> are and that checksum failures really shouldn't happen unless the
> hardware is bad.  More generally, one of the purposes of checksums is
> to distinguish hardware failure from other possible causes of data
> corruption problems.  If there are code paths where checksum failures
> can happen despite the hardware being good, I think that the patch will
> fail to accomplish its goal of giving us confidence that the hardware
> is bad.

I think the "turning checksums on/off/on/off" issue is really a killer
problem, and obviously many of the actions needed to make it safe make
the checksum feature itself less useful.

One crazy idea would be to have a checksum _version_ number somewhere on
the page and in pg_controldata.  When you turn on checksums, you
increment that value, and all newly checksummed pages get that checksum
version; if you turn off checksums, we just don't check them anymore,
but the stored checksums might become incorrect due to a hint bit write
and a crash.  When you turn on checksums again, you increment the
checksum version again, and only check pages having the _new_ checksum
version.

Yes, this does add additional storage requirements for the checksum, but
I don't see another clean option.  If you can spare one byte, that lets
you turn on checksums 255 times; after that, you have to dump/reload to
use the checksum feature.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
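
To make the checksum-version idea above concrete, here is a rough sketch in
Python.  It is only a toy model, not PostgreSQL code: the Cluster and Page
classes, the function names, and the use of CRC32 as the checksum are all
assumptions made for illustration.  Version 0 is assumed to mean "never
checksummed", which leaves 255 usable values in one byte.

```python
import zlib
from dataclasses import dataclass

MAX_CHECKSUM_VERSION = 255  # one byte; 0 is assumed to mean "never checksummed"


@dataclass
class Cluster:
    """Hypothetical stand-in for the cluster-wide state kept in pg_control."""
    checksum_version: int = 0
    checksums_enabled: bool = False

    def enable_checksums(self) -> None:
        # Every off -> on transition starts a new checksum version.
        if self.checksum_version >= MAX_CHECKSUM_VERSION:
            raise RuntimeError("checksum versions exhausted; dump/reload needed")
        self.checksum_version += 1
        self.checksums_enabled = True

    def disable_checksums(self) -> None:
        self.checksums_enabled = False


@dataclass
class Page:
    """Hypothetical page: payload plus checksum and checksum-version fields."""
    data: bytes
    checksum: int = 0
    checksum_version: int = 0


def write_page(cluster: Cluster, page: Page, data: bytes) -> None:
    page.data = data
    if cluster.checksums_enabled:
        page.checksum = zlib.crc32(data)  # CRC32 is only a placeholder
        page.checksum_version = cluster.checksum_version
    # With checksums off, the stored checksum and version are left as-is,
    # so they may no longer match the data (e.g. after a hint bit write).


def verify_page(cluster: Cluster, page: Page) -> str:
    if not cluster.checksums_enabled:
        return "skipped"
    if page.checksum_version != cluster.checksum_version:
        # Written under an older version, or never checksummed: not trusted,
        # so it is not reported as a failure either.
        return "skipped"
    return "ok" if zlib.crc32(page.data) == page.checksum else "failed"
```

In this model, a page whose checksum went stale while checksums were off is
skipped after checksums are re-enabled, and is only verified again once it
has been rewritten under the new version.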