The proposed new implementation of listen/notify works by shoving all of a transaction's outgoing notifies into the global queue during pre-commit, then sending PROCSIG_NOTIFY_INTERRUPT to listening backends post-commit. When a listening backend scans the queue, if it hits a message from a transaction that has neither committed nor aborted yet, it abandons queue scanning, expecting to resume scanning when it gets another PROCSIG_NOTIFY_INTERRUPT. This means that a transaction that is still hanging fire on commit can block receipt of notifies from already-committed transactions, if they were queued after its own. While the old implementation never made any hard guarantees about the time interval between commit and receipt of notify, it still seems to me that there are some potential surprises here, and I don't recall if they were all analyzed in the previous discussions. So bear with me a second:

1. In the previous code, a transaction "hanging fire on commit" (ie, with pg_listener changes made, but not committed) would be holding exclusive lock on pg_listener. So it would block things even worse than now, as we couldn't queue new items either. An uncommitted queue entry seems to behave about the same as that lock would, in that it prevents all listeners from seeing any new messages. AFAICS, therefore, this isn't objectionable in itself.

2. Since the pre-commit code releases AsyncQueueLock between pages, it is possible for the messages of different transactions to get interleaved in the queue, which not only means that they'd be delivered interleaved but also that it's possible for a listener to deliver some notifications of a transaction, and only later (perhaps many transactions later) deliver the rest. The existing code can also deliver notifications of different transactions interleaved, but AFAICS it can never deliver some notifications of one transaction and then deliver more of them in a different batch. By the time any listener gets to scan pg_listener, a sending transaction is either committed or not; it cannot commit partway through a scan (because of the locking done on pg_listener).

3. It is possible for a backend's own self-notifies to not be delivered immediately after commit, if they are queued behind some other uncommitted transaction's messages. That wasn't possible before either.

I'm not sure how probable it is that applications might be coded in a way that relies on the properties lost according to point #2 or #3. It seems rather scary though, particularly because if there were such a dependency, it would be easy to never see the misbehavior during testing.

We could fix #2 by not releasing AsyncQueueLock between pages when queuing messages. This has no obvious downsides as far as I can see; if anything it ought to save some cycles and contention.

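To make point #2 concrete, here is a toy Python model of the queue. It is a sketch only, not PostgreSQL code: the function names, the (xid, message) tuples, and the status strings are all invented for illustration, and "releasing the lock between pages" is modeled as the worst case where writers alternate strictly page by page.

```python
from itertools import zip_longest

def queue_interleaved(*txns):
    """Lock released between pages: writers alternate page by page (worst case)."""
    queue = []
    for pages in zip_longest(*txns, fillvalue=()):
        for page in pages:
            queue.extend(page)
    return queue

def queue_contiguous(*txns):
    """Lock held for the whole transaction: its pages stay adjacent in the queue."""
    queue = []
    for txn in txns:
        for page in txn:
            queue.extend(page)
    return queue

def scan(queue, pos, status):
    """Deliver committed entries from pos onward; stop at the first in-progress one."""
    delivered = []
    while pos < len(queue):
        xid, msg = queue[pos]
        if status[xid] == "in_progress":
            break                      # abandon the scan, resume on the next signal
        if status[xid] == "committed":
            delivered.append((xid, msg))
        pos += 1                       # entries of aborted transactions are skipped
    return delivered, pos

# Hypothetical transactions: each is a list of pages, each page a list of entries.
t1 = [[(1, "a1"), (1, "a2")], [(1, "a3")]]
t2 = [[(2, "b1")]]

# Interleaved queuing: transaction 1 has committed, transaction 2 has not.
status = {1: "committed", 2: "in_progress"}
q = queue_interleaved(t1, t2)
first, pos = scan(q, 0, status)        # part of transaction 1, then blocked at b1
status[2] = "committed"
second, pos = scan(q, pos, status)     # b1, then the rest of transaction 1

# Contiguous queuing: all of transaction 1 arrives in a single batch.
whole, _ = scan(queue_contiguous(t1, t2), 0, {1: "committed", 2: "in_progress"})
```

In this model the interleaved case delivers a1 and a2 in the first batch and a3 only after transaction 2 commits, while the contiguous case delivers a1, a2, a3 together.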
We could fix #3 by re-instituting the special code path that previously existed for self-notifies, ie send them to the client directly from AtCommit_Notify and ignore self-notifies coming back from the queue. This would mean that a backend might see its own self-notifies in a different order relative to other backends' messages than other backends do --- but that was the case in the old coding as well. I think preserving the property that self-notifies are delivered immediately upon commit might be more important than that.

Comments?

			regards, tom lane