[HACKERS] LISTEN/NOTIFY and notification timing guarantees

Tom Lane Sun, 14 Feb 2010 18:32:11 -0800

The proposed new implementation of listen/notify works by shoving all
of a transaction's outgoing notifies into the global queue during
pre-commit, then sending PROCSIG_NOTIFY_INTERRUPT to listening backends
post-commit.  When a listening backend scans the queue, if it hits a
message from a transaction that has neither committed nor aborted yet, it
abandons queue scanning, expecting to resume scanning when it gets
another PROCSIG_NOTIFY_INTERRUPT.  This means that a transaction that is
still hanging fire on commit can block receipt of notifies from
already-committed transactions, if they queued after it did.  While the
old implementation never made any hard guarantees about the time
interval between commit and receipt of notify, it still seems to me that
there are some potential surprises here, and I don't recall if they were
all analyzed in the previous discussions.  So bear with me a second:


1. In the previous code, a transaction "hanging fire on commit" (i.e.,
with pg_listener changes made, but not committed) would be holding
exclusive lock on pg_listener.  So it would block things even worse
than now, as we couldn't queue new items either.  An uncommitted queue
entry seems to behave about the same as that lock would, in that it
prevents all listeners from seeing any new messages.  AFAICS, therefore,
this isn't objectionable in itself.

2. Since the pre-commit code releases AsyncQueueLock between pages,
it is possible for the messages of different transactions to get
interleaved in the queue, which not only means that they'd be delivered
interleaved but also that it's possible for a listener to deliver some
notifications of a transaction, and only later (perhaps many
transactions later) deliver the rest.  The existing code can also
deliver notifications of different transactions interleaved, but AFAICS
it can never deliver some notifications of one transaction and then
deliver more of them in a different batch.  By the time any listener
gets to scan pg_listener, a sending transaction is either committed or
not; it cannot commit partway through a scan (because of the locking
done on pg_listener).

3. It is possible for a backend's own self-notifies to not be delivered
immediately after commit, if they are queued behind some other
uncommitted transaction's messages.  That wasn't possible before either.
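
Points 2 and 3 can be seen in a toy model.  The sketch below is Python,
not PostgreSQL code; the scan function, the tuple layout, and the
transaction names A and B are all assumptions made for illustration.  It
models only the rule described above: a listener reads forward and stops
at the first entry whose sender has neither committed nor aborted.

```python
# Hypothetical toy model of the queue; names are illustrative, not PostgreSQL's.
def scan(queue, committed, aborted, pos):
    """Return (batch, new_pos), stopping at the first unresolved sender."""
    batch = []
    while pos < len(queue):
        xid, payload = queue[pos]
        if xid in committed:
            batch.append((xid, payload))
        elif xid not in aborted:
            break                     # sender still in progress: stop here
        pos += 1                      # aborted entries are skipped silently
    return batch, pos

# Writers release the lock between pages, so entries can interleave:
# A writes its first page, B slips in, then A writes its second page.
queue = [("A", "a1"), ("B", "b1"), ("A", "a2")]

committed, aborted = {"A"}, set()     # A has committed; B is hanging fire
first_batch, pos = scan(queue, committed, aborted, 0)

committed.add("B")                    # B finally commits
second_batch, pos = scan(queue, committed, aborted, pos)
```

In this model the first scan delivers only a1; a2 arrives in a later
batch, after b1.  If the listener is backend A itself, that late a2 is
also the delayed self-notify of point 3.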

I'm not sure how probable it is that applications might be coded in a
way that relies on the properties lost according to point #2 or #3.
It seems rather scary though, particularly because if there were such a
dependency, it would be easy to never see the misbehavior during testing.

We could fix #2 by not releasing AsyncQueueLock between pages when
queuing messages.  This has no obvious downsides as far as I can see;
if anything it ought to save some cycles and contention.  We could fix
#3 by re-instituting the special code path that previously existed for
self-notifies, i.e., send them to the client directly from AtCommit_Notify
and ignore self-notifies coming back from the queue.  This would mean
that a backend might see its own self-notifies in a different order
relative to other backends' messages than other backends do --- but that
was the case in the old coding as well.  I think preserving the
property that self-notifies are delivered immediately upon commit might
be more important than that.
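
A toy model of the two proposed fixes, in Python and purely
hypothetical: enqueue_all and scan are illustrative names, not functions
from the PostgreSQL source, and aborts are left out.  It assumes that
holding the lock across pages keeps a transaction's entries contiguous,
and that self-notifies are handed to the client at commit and skipped
when they come back from the queue.

```python
# Hypothetical toy model; function names are illustrative, not PostgreSQL's.
def enqueue_all(queue, xid, payloads):
    """Sketch of the #2 fix: add every entry in one step, as if the queue
    lock were held across pages, so a transaction's entries stay contiguous."""
    queue.extend((xid, p) for p in payloads)

def scan(queue, committed, pos, own_xid=None):
    """Return (batch, new_pos); stop at the first uncommitted sender.
    Entries from own_xid are skipped, since they were delivered at commit.
    Aborted transactions are left out of this model for brevity."""
    batch = []
    while pos < len(queue):
        xid, payload = queue[pos]
        if xid not in committed:
            break
        if xid != own_xid:
            batch.append((xid, payload))
        pos += 1
    return batch, pos

queue, committed = [], set()
enqueue_all(queue, "B", ["b1"])            # B queues first, then stalls
enqueue_all(queue, "A", ["a1", "a2"])      # A's entries cannot be split

# Sketch of the #3 fix: A commits and gets its self-notifies directly.
committed.add("A")
seen_by_a = [("A", "a1"), ("A", "a2")]
batch, pos_a = scan(queue, committed, 0, own_xid="A")
seen_by_a += batch                         # nothing yet: B is uncommitted

committed.add("B")
batch, pos_a = scan(queue, committed, pos_a, own_xid="A")
seen_by_a += batch

seen_by_other, _ = scan(queue, committed, 0)   # a third backend's view
```

Here backend A sees a1, a2, b1 while a third backend sees b1, a1, a2,
which is the ordering difference between backends mentioned above.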

Comments?

                        regards, tom lane

