"MauMau" <maumau...@gmail.com> writes:
> [ sinval catchup signal -> ProcessCatchupEvent -> WaitLatch -> deadlock ]

Ugh.

One line of thought is that it's pretty unsafe to be doing anything
as complicated as transaction start/commit in a signal handler, even one
that is sure it's not interrupting anything else.  The only excuse
ProcessCatchupEvent has for that is that it's "trying to ensure proper
cleanup from an error".  Perhaps with closer analysis we could convince
ourselves that errors thrown from AcceptInvalidationMessages() wouldn't
need transaction abort cleanup.

For that matter, it's not exactly clear that an error thrown out of
there would result in good things even with the transaction
infrastructure.  Presuming that we're waiting for client input, an
error longjmp out of ProcessCatchupEvent could mean taking control
away from OpenSSL, and I bet that won't end well.  Maybe we should
just be doing

        PG_TRY
                AcceptInvalidationMessages();
        PG_CATCH
                elog(FATAL, "curl up and die");

ProcessIncomingNotify is also using
StartTransactionCommand()/CommitTransactionCommand(), so that would
need some thought too.

Or we could try to get rid of the need to do anything beyond setting
a flag in the interrupt handler itself.  But I'm afraid that's probably
unworkable, especially now that we use SA_RESTART on all signals.

Another line of thought is that we've been way too uncritical about
shoving different kinds of events into the SIGUSR1 multiplexor.
It might be a good idea to separate "high level" interrupts from
"low level" ones, using say SIGUSR1 for the former and SIGUSR2
for the latter.  However, that doesn't sound very back-patchable,
even assuming that we can come up with a clean division.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to