Greg Smith <g...@2ndquadrant.com> wrote:

> 2) Rework hint bits to make the torn page problem go away.
> Checksums go elsewhere? More WAL logging to eliminate the bad
> situations? Eliminate some types of hint bit writes? It seems
> every alternative has trade-offs that will require serious
> performance testing to really validate.

I'm wondering whether we're not making a mountain out of a
mole-hill here. In real life, on one single crash, how many torn
pages with hint-bit-only updates do we expect on average? What's
the maximum possible? In the event of a crash recovery, can we
force all tables to be seen as needing autovacuum? Would there be
a way to limit this to some subset which *might* have torn pages
somehow?

It seems to me that on a typical production system you would
probably have zero or one such page per OS crash, with zero being
far more likely than one. If we can get that one fixed (if it
exists) before enough time has elapsed for everyone to forget the
OS crash, the idea that we would be scaring the users and
negatively affecting the perception of reliability seems
far-fetched. The fact that they can *have* page checksums in
PostgreSQL should do a lot to *enhance* the PostgreSQL reputation
for reliability in some circles, especially those getting pounded
with FUD from competing products. If a site has so many OS or
hardware failures that they lose track -- well, they really should
be alarmed.

Of course, the fact that you may hit such a torn page in a
situation where all data is good means that it shouldn't be more
than a warning.

This seems as though it eliminates most of the work people have
been suggesting as necessary, and makes the submitted patch fairly
close to what we want.

-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers