Hi,

On 2022-07-26 14:30:30 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > On 2022-07-26 13:57:53 -0400, Tom Lane wrote:
> >> So this is not a case of RecoveryConflictInterrupt doing the wrong thing:
> >> the startup process hasn't detected the buffer conflict in the first
> >> place.
>
> > I wonder if this, at least partially, could be due to the elog thing
> > I was complaining about nearby. I.e. we decide to FATAL as part of a
> > recovery conflict interrupt, and then during that ERROR out as part of
> > another recovery conflict interrupt (because nothing holds interrupts as
> > part of FATAL).
>
> There are all sorts of things one could imagine going wrong in the
> backend receiving the recovery conflict interrupt, but AFAICS in these
> failures, the startup process hasn't sent a recovery conflict interrupt.
> It certainly hasn't logged anything suggesting it noticed a conflict.

I don't think we reliably emit a log message before the recovery conflict is
resolved.

I've wondered a couple times now about making tap test timeouts somehow
trigger a core dump of all processes. Certainly would make it easier to debug
some of these kinds of issues.

Greetings,

Andres Freund