On 03/10/17 22:01, Thomas Munro wrote: > On Wed, Oct 4, 2017 at 7:32 AM, Petr Jelinek > <petr.jeli...@2ndquadrant.com> wrote: >> On 03/10/17 19:44, Tom Lane wrote: >>> I wrote: >>>>> So that's trouble waiting to happen, for sure. At the very least we >>>>> need to do a single fetch of WalRcv->latch, not two. I wonder whether >>>>> even that is sufficient, though: this coding requires an atomic fetch of >>>>> a pointer, which is something we generally don't assume to be safe. >>> >>> BTW, I had supposed that this bug was of long standing, but actually it's >>> new in v10, dating to 597a87ccc9a6fa8af7f3cf280b1e24e41807d555. Before >>> that walreceiver start/stop just changed the owner of a long-lived shared >>> latch, and there was no question of stale pointers. >>> >>> I considered reverting that decision, but the reason for it seems to have >>> been to let libpqwalreceiver.c manipulate MyProc->procLatch rather than >>> having to know about a custom latch. That's probably a sufficient reason >>> to justify some squishiness in the wakeup logic. Still, we might want to >>> revisit it if we find any other problems here. >> That's correct, and because the other users of that library don't have >> special latch it seemed feasible to standardize on the procLatch. If we >> indeed wanted to change the decision about this I think we can simply >> give latch as a parameter to libpqrcv_connect and store it in the >> WalReceiverConn struct. The connection does not need to live past the >> latch (although it currently does, but that's just a matter of >> reordering the code in WalRcvDie() a little AFAICS). > > I wonder if the new ConditionVariable mechanism would be worth > considering in future cases like this, where the signalling process > doesn't know the identity of the process to be woken up (or even how > many waiting processes there are), but instead any interested waiters > block on a particular ConditionVariable that models a specific > interesting condition. In the end it's just latches anyway, but it > may be a better abstraction. On the other hand I'm not sure how waits > on a ConditionVariable would be multiplexed with IO (a distinct wait > event, or leaky abstraction where the caller relies on the fact that > it's built on MyProc->latch, something else?). >
In this specific case the problem is that what we are waiting for is indeed the Latch because we want the libpqwalreceiver calls to be interruptible when waiting for I/O. Otherwise, yes ConditionVariable is useful for generic signaling, we use it in slot and origin for "lock" waits (they don't have normal catalog so normal locking does not work). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers