On Thu, Apr 19, 2018 at 11:59 AM, Teodor Sigaev <teo...@sigaev.ru> wrote: > Agree. Hope, nobody found a way how to use amcheck module in production to > serve user requests. But, it should be implemented before BETA stage, in > opposite case we will get a lot of objections.
It shouldn't take that long to get a patch together. I've already started. I think that any objections to pushing this amcheck enhancement after feature freeze are unlikely, for these reasons: * There is a precedent, since the integrity checking functions added to pg_visibility by commit e472ce96 were added well after feature freeze, and only 3 months before the release of 9.6. * Anyone paying attention to the ongoing "VM map freeze corruption" thread would have to agree that that was an extremely good idea. We're the ones that have to maintain this code, and I think that we're entitled to add a little extra C code to a contrib extension, in order to make testing easier. Including for beta testing. * The extra overhead is quite low, and will only affect bt_index_parent_check() (not bt_index_check()). * I deliberately created bt_index_parent_check() so that we'd have something that does very thorough testing, that's too slow and/or requires too heavy a relation-level lock to be useful to most users. That was the plan from day one. If you only care about hardware issues, then bt_index_check() is better in almost every way, and is really all you should be using in production unless you really don't care about the extra overhead, or about the ShareLock. The docs more or less say this. I have certainly heard of people using bt_index_parent_check() in production, but only when they already knew that their database was corrupt, and wanted to isolate the problem. I imagine that people that use bt_index_parent_check() in production do so because they want as much information as possible, and are willing to do whatever it takes to get more information. > Agree, but at least this place needs a comment - why it's safe. Good idea. >> I also think that we could have better conventional regression test >> coverage here. > > Will try to invent not so large test.oif it means they may get a little extra Your smaller test takes about 350ms to run on a debug build on my laptop, which seems worth it (honestly, this test should have been there already). Maybe we can add it to the amcheck regression tests instead, since those are run less often. This also allows us to add a test case specifically as part of the amcheck enhancement, which makes a bit more sense. -- Peter Geoghegan