Hi,

On 2022-04-09 14:39:16 -0700, Andres Freund wrote:
> On 2022-04-09 17:00:41 -0400, Tom Lane wrote:
> > Thomas Munro <thomas.mu...@gmail.com> writes:
> > > Unlike most "procsignal" handler routines, RecoveryConflictInterrupt()
> > > doesn't just set a sig_atomic_t flag and poke the latch.  Is the extra
> > > stuff it does safe?  For example, is this call stack OK (to pick one
> > > that jumps out, but not the only one)?
> > >
> > > procsignal_sigusr1_handler
> > > -> RecoveryConflictInterrupt
> > >  -> HoldingBufferPinThatDelaysRecovery
> > >   -> GetPrivateRefCount
> > >    -> GetPrivateRefCountEntry
> > >     -> hash_search(...hash table that might be in the middle of an
> > >        update...)
> >
> > Ugh.  That one was safe before somebody decided we needed a hash table
> > for buffer refcounts, but it's surely not safe now.
>
> Mea culpa. This is 4b4b680c3d6d - from 2014.

Whoa. There's way worse: StandbyTimeoutHandler() calls
SendRecoveryConflictWithBufferPin(), which calls CancelDBBackends(), which
acquires lwlocks etc.

Which very plausibly is the cause for the issue I'm investigating in
https://www.postgresql.org/message-id/20220409220054.fqn5arvbeesmxdg5%40alap3.anarazel.de

Greetings,

Andres Freund
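
PS: For anyone following along, here is a minimal standalone sketch of the
"just set a sig_atomic_t flag, do the real work later" pattern that the
handlers discussed above deviate from. It is plain POSIX C, not PostgreSQL
code: every name in it (conflict_pending, conflict_signal_handler,
process_pending_conflict, ...) is made up for illustration, and the "poke the
latch" half of the pattern is left out entirely.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical names throughout; none of this is PostgreSQL code. */

/* The only state the signal handler is allowed to touch. */
static volatile sig_atomic_t conflict_pending = 0;

/* Counts deferred work; only ever modified outside the handler. */
static int conflicts_processed = 0;

/*
 * Signal handler: record that something happened and return.
 * No locks, no hash table lookups, no memory allocation.
 */
static void
conflict_signal_handler(int signo)
{
    (void) signo;
    conflict_pending = 1;
}

/* Install the handler for SIGUSR1. Returns 0 on success, -1 on failure. */
static int
install_conflict_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = conflict_signal_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/*
 * Called from ordinary code at a point where it is known that no shared
 * data structure is mid-update. Returns 1 if a pending request was
 * handled, 0 if there was nothing to do.
 */
static int
process_pending_conflict(void)
{
    if (!conflict_pending)
        return 0;

    conflict_pending = 0;

    /* Here it would be fine to take locks or search a hash table. */
    conflicts_processed++;
    return 1;
}

#ifdef SIGNAL_FLAG_DEMO_MAIN
/* Build with -DSIGNAL_FLAG_DEMO_MAIN to run the sketch standalone. */
int
main(void)
{
    assert(install_conflict_handler() == 0);
    raise(SIGUSR1);                 /* handler only sets the flag */
    assert(process_pending_conflict() == 1);
    return 0;
}
#endif
```

The point of the split is that the handler can interrupt the main code at an
arbitrary instruction, including halfway through a hash table update or while
a lock is held, so anything beyond setting the flag has to wait until the
interrupted code reaches a point where it checks the flag itself.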