On Wed, Jan 25, 2017 at 8:18 PM, Robert Haas <robertmh...@gmail.com> wrote:
> Also, it's not as if there are no other ways of checking whether your
> disks are failing. SMART, for example, is supposed to tell you about
> incipient hardware failures before PostgreSQL ever sees a bit flip.
> Surely an average user would love to get a heads-up that their
> hardware is failing even when that hardware is not being used to power
> PostgreSQL, yet many people don't bother to configure SMART (or
> similar proprietary systems provided by individual vendors).
You really can't rely on SMART to tell you about hardware failures. 1 in 4
drives fail completely with zero SMART indication [1]. And for the 1 in 1000
annual checksum failure rate, indicators other than system restarts had only
a weak correlation [2]. And this is without counting filesystem and other OS
bugs that SMART knows nothing about.

My view may be biased by mostly seeing the cases where things have already
gone wrong, but I recommend that support clients turn checksums on unless
it's known that write IO is going to be an issue. Especially because I know
that if it turns out to be a problem, I can go in and quickly hack together
a tool to help them turn it off.

I do agree that to change the PostgreSQL default, at least some tool to turn
it off online should be included.

[1] https://www.backblaze.com/blog/what-smart-stats-indicate-hard-drive-failures/
[2] https://www.usenix.org/legacy/event/fast08/tech/full_papers/bairavasundaram/bairavasundaram.pdf

Regards,
Ants Aasma