On Wed, Apr 13, 2022 at 3:54 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> After a bit more navel-contemplation I see a way that the pgstats
> work could have changed timing in this area.  We used to have a
> rate limit on how often stats reports would be sent to the
> collector, which'd ensure half a second or so delay before a
> transaction's change counts became visible to the autovac daemon.
> I've not looked at the new code, but I'm betting that that's gone
> and the autovac launcher might start a worker nearly immediately
> after some foreground process finishes inserting some rows.
> So that could result in autovac activity occurring concurrently
> with test_setup where it didn't before.

But why should it matter? The test_setup.sql VACUUM of tenk1 should
leave relallvisible and relpages in the same state, either way (or
very close to it). The only way that it seems like it could matter is
if OldestXmin was held back during test_setup.sql's execution of the
VACUUM command.

> As to what to do about it ... maybe apply the FREEZE and
> DISABLE_PAGE_SKIPPING options in test_setup's vacuums?
> It seems like DISABLE_PAGE_SKIPPING is necessary but perhaps
> not sufficient.

BTW, the work on VACUUM for Postgres 15 probably makes VACUUM test
flappiness issues less of a problem -- unless they're issues involving
something holding back OldestXmin when it shouldn't (in which case it
won't have any effect on test stability). I would expect that to be
the case, at least, since VACUUM now does almost all of the same work
for any individual page that it cannot get a cleanup lock on. There is
surprisingly little difference between a page that gets processed by
lazy_scan_prune and a page that gets processed by lazy_scan_noprune.

--
Peter Geoghegan