On Thu, Apr 19, 2018 at 11:59 AM, Teodor Sigaev <teo...@sigaev.ru> wrote:
> Agree. Hope, nobody found a way how to use amcheck module in production to
> serve user requests. But, it should be implemented before BETA stage, in
> opposite case we will get a lot of objections.

It shouldn't take that long to get a patch together. I've already started.

I think that any objections to pushing this amcheck enhancement after
feature freeze are unlikely, for these reasons:

* There is a precedent, since the integrity checking functions added
to pg_visibility by commit e472ce96 were added well after feature
freeze, and only 3 months before the release of 9.6.

* Anyone paying attention to the ongoing "VM map freeze corruption"
thread would have to agree that that was an extremely good idea. We're
the ones that have to maintain this code, and I think that we're
entitled to add a little extra C code to a contrib extension, in order
to make testing easier. Including for beta testing.

* The extra overhead is quite low, and will only affect
bt_index_parent_check() (not bt_index_check()).

* I deliberately created bt_index_parent_check() so that we'd have
something that does very thorough testing, that's too slow and/or
requires too heavy a relation-level lock to be useful to most users.
That was the plan from day one.

If you only care about hardware issues, then bt_index_check() is
better in almost every way, and is really all you should be using in
production unless you really don't care about the extra overhead, or
about the ShareLock. The docs more or less say this. I have certainly
heard of people using bt_index_parent_check() in production, but only
when they already knew that their database was corrupt, and wanted to
isolate the problem. I imagine that people that use
bt_index_parent_check() in production do so because they want as much
information as possible, and are willing to do whatever it takes to
get more information.

> Agree, but at least this place needs a comment - why it's safe.

Good idea.

>> I also think that we could have better conventional regression test
>> coverage here.
>
> Will try to invent not so large test.oif it means they may get a little extra

Your smaller test takes about 350ms to run on a debug build on my
laptop, which seems worth it (honestly, this test should have been
there already). Maybe we can add it to the amcheck regression tests
instead, since those are run less often. This also allows us to add a
test case specifically as part of the amcheck enhancement, which makes
a bit more sense.

-- 
Peter Geoghegan

Reply via email to