Hi,

On 2024-03-20 17:41:45 -0700, Andres Freund wrote:
> On 2024-03-14 16:56:39 -0400, Tom Lane wrote:
> > Also, this is probably not
> > helping anything:
> >
> >   'extra_config' => {
> >     ...
> >     'fsync = on'
> At some point we had practically no test coverage of fsync, so I made my
> animals use fsync. I think we still have little coverage. I probably could
> reduce the number of animals using it though.

I think there must be some actual regression involved. The frequency of
failures on HEAD vs failures on 16 - both of which run the tests concurrently
via meson - is just vastly different. I'd expect the absolute number of
failures in 027_stream_regress.pl to differ between branches due to fewer runs
on 16, but there's no explanation for the difference in percentage of
failures. My menagerie had only a single recoveryCheck failure on !HEAD in the
last 30 days, but in the vicinity of 100 on HEAD
https://buildfarm.postgresql.org/cgi-bin/show_failures.pl?max_days=30&stage=recoveryCheck&filter=Submit

If anything the load when testing back branch changes is higher, because
commonly back-branch builds are happening on all branches, so I don't think
that can be the explanation either.

From what I can tell the pattern changed on 2024-02-16 19:39:02 - there was a
rash of recoveryCheck failures in the days before that too, but not
027_stream_regress.pl in that way.

It certainly seems suspicious that one commit before the first observed failure
is
2024-02-16 11:09:11 -0800 [73f0a132660] Pass correct count to WALRead().

Of course the failure rate is low enough that it could have been a day or two
before that, too.

Greetings,

Andres Freund