On Tue, Mar 20, 2018 at 6:57 AM, Michael Paquier <mich...@paquier.xyz> wrote:
> Now, why are people using pg_dump > /dev/null?  Mainly the lack of
> better tools, which would be actually able to detect pages in corrupted
> pages in one run, and not only heap pages.  I honestly think that
> amcheck is something that we should more focus on and has more potential
> on the matter, and that we are just complicating pg_dump to do something
> it is not designed for, and would do it badly anyway.

+1.

There are a wide variety of things that pg_dump will not check when
used as a smoke-test for heap corruption. Things that could be checked
without doing any more I/O will not be checked. For example, we won't
test for sane xmin/xmax fields across update chains, even though the
additional cost of doing that is pretty modest. Also, we won't make any
attempt to validate NOT NULL constraints. I've noticed that if you
corrupt an ItemId entry in a heap page, that will tend to result in the
row seeming to consist of all-NULLs to expression evaluation.
Typically, the pg_dump trick only detects abject corruption, such as a
totally corrupt page header.

amcheck isn't there yet, of course. The heapallindexed enhancement
that's in the current CF will help, though. And, it won't be very hard
to adapt heapallindexed verification to take place in parallel, now
that IndexBuildHeapScan() can handle parallel heap scans.

I do still think that we need an amcheck function that specifically
targets a heap relation (not an index), and performs verification
fairly quickly, so my heapallindexed patch isn't enough. That wouldn't
share much with the existing amcheck verification functions. I hope
that someone else can pick that up soon.

--
Peter Geoghegan
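
To make the xmin/xmax point above concrete, here is a minimal sketch in
Python of the kind of update-chain sanity check meant there. It is not
PostgreSQL or amcheck code. It assumes a simplified in-memory model of
heap tuples, and the names HeapTuple and check_update_chain are
hypothetical. It also ignores cases a real checker would have to handle
(multixacts, lock-only xmax, aborted updaters, pruned chain members).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Simplified tuple identifier: (block number, offset number).
Tid = Tuple[int, int]


@dataclass
class HeapTuple:
    """Hypothetical, simplified stand-in for a heap tuple header."""
    tid: Tid
    xmin: int   # transaction ID that inserted this version
    xmax: int   # transaction ID that updated/deleted it, 0 if none
    ctid: Tid   # successor version, or the tuple's own tid if latest


def check_update_chain(tuples: Dict[Tid, HeapTuple], start: Tid) -> List[str]:
    """Walk one update chain and report inconsistencies as strings.

    The invariant checked: when a version points at a successor, the
    successor's xmin should equal this version's xmax.
    """
    cur = tuples.get(start)
    if cur is None:
        return [f"start tuple {start} not found"]

    problems: List[str] = []
    seen = set()
    while True:
        seen.add(cur.tid)
        if cur.ctid == cur.tid:
            break  # latest version: end of chain
        nxt = tuples.get(cur.ctid)
        if nxt is None:
            problems.append(f"{cur.tid}: ctid {cur.ctid} points at a missing tuple")
            break
        if nxt.tid in seen:
            problems.append(f"{cur.tid}: ctid {cur.ctid} forms a cycle")
            break
        if cur.xmax == 0:
            problems.append(f"{cur.tid}: has a successor but xmax is unset")
        elif nxt.xmin != cur.xmax:
            problems.append(
                f"{cur.tid}: xmax {cur.xmax} does not match "
                f"successor {nxt.tid} xmin {nxt.xmin}"
            )
        cur = nxt
    return problems
```

The walk only touches header fields of tuples that a scan would already
have in memory, which is why the extra cost is modest compared with the
I/O of reading the pages in the first place.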