Re: Introduce timeout capability for ConditionVariableSleep
On Tue, Jul 16, 2019 at 1:11 AM Robert Haas wrote: > On Fri, Jul 12, 2019 at 11:03 PM Thomas Munro wrote: > > I pushed this too. It's a separate commit, because I think there is > > at least a theoretical argument that it should be back-patched. I'm > > not going to do that today though, because I doubt anyone is relying > > on ConditionVariableSignal() working that reliably yet, and it's > > really with timeouts that it becomes a likely problem. > > To make it work reliably, you'd need to be sure that a process can't > ERROR or FATAL after getting signaled and before doing whatever the > associated work is (or that if it does, it will first pass on the > signal). Since that seems impossible, I'm not sure I see the point of > trying to do anything at all. I agree that that on its own doesn't fix problems in , but that doesn't mean we shouldn't try to make this API as reliable as possible. Unlike typical CV implementations, our wait primitive is not atomic. When we invented two-step wait, we created a way for ConditionVariableSignal() to have no effect due to bad timing. Surely that's a bug. -- Thomas Munro https://enterprisedb.com
Re: Introduce timeout capability for ConditionVariableSleep
On Fri, Jul 12, 2019 at 11:03 PM Thomas Munro wrote: > I pushed this too. It's a separate commit, because I think there is > at least a theoretical argument that it should be back-patched. I'm > not going to do that today though, because I doubt anyone is relying > on ConditionVariableSignal() working that reliably yet, and it's > really with timeouts that it becomes a likely problem. To make it work reliably, you'd need to be sure that a process can't ERROR or FATAL after getting signaled and before doing whatever the associated work is (or that if it does, it will first pass on the signal). Since that seems impossible, I'm not sure I see the point of trying to do anything at all. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Introduce timeout capability for ConditionVariableSleep
On Fri, Jul 12, 2019 at 6:08 PM Shawn Debnath wrote: > On Tue, Jul 09, 2019 at 11:03:18PM +1200, Thomas Munro wrote: > > On Sun, Jul 7, 2019 at 3:09 PM Thomas Munro wrote: > > > + /* Timed out */ > > > + if (rc == 0) > > > + return true; > > > > Here's a version without that bit, because I don't think we need it. > > This works. Agree that letting it fall through covers the first gap. Pushed, like that (with the now unused rc variable also removed). Thanks for the patch! > > > That still leaves the danger that the CV can be signalled some time > > > after ConditionVariableTimedSleep() returns. So now I'm wondering if > > > ConditionVariableCancelSleep() should signal the CV if it discovers > > > that this process is not in the proclist, on the basis that that must > > > indicate that we've been signalled even though we're not interested in > > > the message anymore, and yet some other process else might be > > > interested, and that might have been the only signal that is ever > > > going to be delivered by ConditionVariableSignal(). > > > > And a separate patch to do that. Thoughts? > > I like it. This covers the gap all the way till cancel is invoked and it > manipulates the list to remove itself or realizes that it needs to > forward the signal to some other process. I pushed this too. It's a separate commit, because I think there is at least a theoretical argument that it should be back-patched. I'm not going to do that today though, because I doubt anyone is relying on ConditionVariableSignal() working that reliably yet, and it's really with timeouts that it becomes a likely problem. I thought about this edge case because I have long wanted to propose a pair of functions that provide a simplified payloadless blocking alternative to NOTIFY, that would allow for just the right number of waiting sessions to wake up to handle SKIP LOCKED-style job queues. Otherwise you sometimes get thundering herds of wakeups fighting over crumbs. That made me think about the case where a worker session decides to time out and shut down due to being idle for too long, but eats a wakeup on its way out. Another question that comes up in that use case is whether CV wakeup queues should be LIFO or FIFO. I think the answer is LIFO, to support class worker pool designs that stabilise at the right size using a simple idle timeout rule. They're currently FIFO (proclist_pop_head_node() to wake up, but proclist_push_tail() to sleep). I understand why Robert didn't care about that last time I mentioned it: all our uses of CVs today are "broadcast" wakeups. But a productised version of the "poke" hack I showed earlier that supports poking just one waiter would care about the thing this patch fixed, and also the wakeup queue order. -- Thomas Munro https://enterprisedb.com
Re: Introduce timeout capability for ConditionVariableSleep
On Sun, Jul 7, 2019 at 3:09 PM Thomas Munro wrote: > + /* Timed out */ > + if (rc == 0) > + return true; Here's a version without that bit, because I don't think we need it. > That still leaves the danger that the CV can be signalled some time > after ConditionVariableTimedSleep() returns. So now I'm wondering if > ConditionVariableCancelSleep() should signal the CV if it discovers > that this process is not in the proclist, on the basis that that must > indicate that we've been signalled even though we're not interested in > the message anymore, and yet some other process else might be > interested, and that might have been the only signal that is ever > going to be delivered by ConditionVariableSignal(). And a separate patch to do that. Thoughts? -- Thomas Munro https://enterprisedb.com 0001-Introduce-timed-waits-for-condition-variables-v6.patch Description: Binary data 0002-Forward-received-condition-variable-signals-on-ca-v6.patch Description: Binary data
Re: Introduce timeout capability for ConditionVariableSleep
On Fri, Jul 5, 2019 at 1:40 PM Thomas Munro wrote: > I think this is looking pretty good and I'm planning to commit it > soon. The convention for CHECK_FOR_INTERRUPTS() in latch wait loops > seems to be to put it after the ResetLatch(), so I've moved it in the > attached version (though I don't think it was wrong where it was). > Also pgindented and with my proposed commit message. I've also > attached a throw-away test module that gives you CALL poke() and > SELECT wait_for_poke(timeout) using a CV. I thought of one small problem with the current coding. Suppose there are two processes A and B waiting on a CV, and another process C calls ConditionVariableSignal() to signal one process. Around the same time, A times out and exits via this code path: + /* Timed out */ + if (rc == 0) + return true; Suppose ConditionVariableSignal() set A's latch immediately after WaitEventSetWait() returned 0 in A. Now A won't report the CV signal to the caller, and B is still waiting, so effectively nobody has received the message and yet C thinks it has signalled a waiter if there is one. My first thought is that we could simply remove the above-quoted hunk and fall through to the second timeout-detecting code. That'd mean that if we've been signalled AND timed out as of that point in the code, we'll prefer to report the signal, and it also reduces the complexity of the function to have only one "return true" path. That still leaves the danger that the CV can be signalled some time after ConditionVariableTimedSleep() returns. So now I'm wondering if ConditionVariableCancelSleep() should signal the CV if it discovers that this process is not in the proclist, on the basis that that must indicate that we've been signalled even though we're not interested in the message anymore, and yet some other process else might be interested, and that might have been the only signal that is ever going to be delivered by ConditionVariableSignal(). -- Thomas Munro https://enterprisedb.com
Re: Introduce timeout capability for ConditionVariableSleep
On Fri, Mar 22, 2019 at 7:21 AM Shawn Debnath wrote: > > On Sat, Mar 16, 2019 at 03:27:17PM -0700, Shawn Debnath wrote: > > > + * Track the current time so that we can calculate the > > > remaining timeout > > > + * if we are woken up spuriously. > > > > > > I think tha "track" means chasing a moving objects. So it might > > > be bettter that it is record or something? > > > > > > > * Wait for the given condition variable to be signaled or till > > > > timeout. > > > > * > > > > * Returns -1 when timeout expires, otherwise returns 0. > > > > * > > > > * See ConditionVariableSleep() for general usage. > > > > > > > > > +ConditionVariableTimedSleep(ConditionVariable *cv, long timeout, > > > > > > > > > > Counldn't the two-state return value be a boolean? > > > > I will change it to Record in the next iteration of the patch. > > Posting rebased and updated patch. Changed the word 'Track' to 'Record' > and also changed variable name rem_timeout to cur_timeout to match > naming in other use cases. Hi Shawn, I think this is looking pretty good and I'm planning to commit it soon. The convention for CHECK_FOR_INTERRUPTS() in latch wait loops seems to be to put it after the ResetLatch(), so I've moved it in the attached version (though I don't think it was wrong where it was). Also pgindented and with my proposed commit message. I've also attached a throw-away test module that gives you CALL poke() and SELECT wait_for_poke(timeout) using a CV. Observations that I'm not planning to do anything about: 1. I don't really like the data type "long", but it's already established that we use that for latches so maybe it's too late for me to complain about that. 2. I don't really like the fact that we have to do floating point stuff in INSTR_TIME_GET_MILLISEC(). That's not really your patch's fault and you've copied the timeout adjustment code from latch.c, which seems reasonable. -- Thomas Munro https://enterprisedb.com 0001-Introduce-timed-waits-for-condition-variables-v5.patch Description: Binary data 0002-Simple-module-for-waiting-for-other-sessions-v5.patch Description: Binary data
Re: Introduce timeout capability for ConditionVariableSleep
Hello. At Thu, 14 Mar 2019 17:26:11 -0700, Shawn Debnath wrote in <20190315002611.ga1...@f01898859afd.ant.amazon.com> > Thank you reviewing. Comments inline. > > On Wed, Mar 13, 2019 at 05:24:15PM +0900, Kyotaro HORIGUCHI wrote: > > > Agree. In my testing WaitEventSetWait did the work for calculating > > > the right timeout remaining. It's a bummer that we have to repeat > > > the same pattern at the ConditionVariableTimedSleep() but I guess > > > anyone who loops in such cases will have to adjust their values > > > accordingly. > > > > I think so, too. And actually the duplicate timeout calculation > > doesn't seem good. We could eliminate the duplicate by allowing > > WaitEventSetWait to exit by unwanted events, something like the > > attached. > > After thinking about this more, I see WaitEventSetWait()'s contract as > wait for an event or timeout if no events are received in that time Sure. > frame. Although ConditionVariableTimedSleep() is also using the same > word, I believe the semantics are different. The timeout period in > ConditionVariableTimedSleep() is intended to limit the time we wait > until removal from the wait queue. Whereas, in the case of > WaitEventSetWait, the timeout period is intended to limit the time the > caller waits till the first event. Mmm. The two look the same to me.. Timeout means for both that "Wait for one of these events or timeout expiration to occur". Removal from waiting queue is just a subtask of exiting from waiting state. The "don't exit until timeout expires unless any expected events occur" part is to be done in the uppermost layer so it is enough that the lower layer does just "exit when something happened". This is the behavior of WaitEventSetWaitBlock for WaitEventSetWait. My proposal is giving callers capabliy to tell WaitEventSetWait not to perform useless timeout contination. > Hence, I believe the code is correct as is and we shouldn't change the > contract for WaitEventSetWait. > > > > BTW, I am curious why Andres in 98a64d0bd71 didn't just create an > > > artificial event with WL_TIMEOUT and return that from > > > WaitEventSetWait(). Would have made it cleaner than checking checking > > > return values for -1. > > > > Maybe because it is not a kind of WaitEvent, so it naturally > > cannot be a part of occurred_events. > > Hmm, I don't agree with that completely. One could argue that the > backend is waiting for any event in order to be woken up, including a > timeout event. Right, I understand that. I didn't mean that it is the right design for everyone. Just meant that it is in that shape. (And I rather like it.) latch.h:127 #define WL_TIMEOUT (1 << 3)/* not for WaitEventSetWait() */ We can make it one of the events for WaitEventSetWait, but I don't see such a big point on that, and also that doesn't this patch better in any means. > > # By the way, you can obtain a short hash of a commit by git > > # rev-parse --short . > > Good to know! :-) Luckily git is smart enough to still match it to the > correct commit. And too complex so that infrequent usage easily get out from my brain:( > > > Updated v2 patch attached. Thank you . It looks fine execpt the above point. But still I have some questions on it. (the reason for they not being comments is that they are about wordings..). + * Track the current time so that we can calculate the remaining timeout + * if we are woken up spuriously. I think tha "track" means chasing a moving objects. So it might be bettter that it is record or something? > * Wait for the given condition variable to be signaled or till timeout. > * > * Returns -1 when timeout expires, otherwise returns 0. > * > * See ConditionVariableSleep() for general usage. > > > +ConditionVariableTimedSleep(ConditionVariable *cv, long timeout, > > > > Counldn't the two-state return value be a boolean? > > I wanted to leave the option open to use the positive integers for other > purposes but you are correct, bool suffices for now. If needed, we can > change it in the future. Yes, we can do that after we found it needed. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Introduce timeout capability for ConditionVariableSleep
Hello. At Tue, 12 Mar 2019 17:53:43 -0700, Shawn Debnath wrote in <20190313005342.ga8...@f01898859afd.ant.amazon.com> > Hi Thomas, > > Thanks for reviewing! > > On Wed, Mar 13, 2019 at 12:40:57PM +1300, Thomas Munro wrote: > > Can we just refer to the other function's documentation for this? I > > don't want two copies of this blurb (and this copy-paste already > > failed to include "Timed" in the example function name). > > Hah - point well taken. Fixed. > > > One difference compared to pthread_cond_timedwait() is that pthread > > uses an absolute time and here you use a relative time (as we do in > > WaitEventSet API). The first question is which makes a better API, > > and the second is what the semantics of a relative timeout should be: > > start again every time we get a spurious wake-up, or track time > > already waited? Let's see... > > > > -(void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1, > > +ret = WaitEventSetWait(cv_wait_event_set, timeout, &event, 1, > > wait_event_info); > > > > Here you're restarting the timeout clock every time through the loop > > without adjustment, and I think that's not a good choice: spurious > > wake-ups cause bonus waiting. > > Agree. In my testing WaitEventSetWait did the work for calculating the > right timeout remaining. It's a bummer that we have to repeat the same > pattern at the ConditionVariableTimedSleep() but I guess anyone who > loops in such cases will have to adjust their values accordingly. I think so, too. And actually the duplicate timeout calculation doesn't seem good. We could eliminate the duplicate by allowing WaitEventSetWait to exit by unwanted events, something like the attached. > BTW, I am curious why Andres in 98a64d0bd71 didn't just create an > artificial event with WL_TIMEOUT and return that from > WaitEventSetWait(). Would have made it cleaner than checking checking > return values for -1. Maybe because it is not a kind of WaitEvent, so it naturally cannot be a part of occurred_events. # By the way, you can obtain a short hash of a commit by git # rev-parse --short . > Updated v2 patch attached. I'd like to comment on the patch. + * In the event of a timeout, we simply return and the caller + * calls ConditionVariableCancelSleep to remove themselves from the + * wait queue. See ConditionVariableSleep() for notes on how to correctly check + * for the exit condition. + * + * Returns 0, or -1 if timed out. Maybe this could be more simpler, that like: * ConditionVariableTimedSleep - allows us to specify timeout * * If timeout = =1, block until the condition is satisfied. * * Returns -1 when timeout expires, otherwise returns 0. * * See ConditionVariableSleep() for general behavior and usage. +int +ConditionVariableTimedSleep(ConditionVariable *cv, long timeout, Counldn't the two-state return value be a boolean? + int ret = 0; As a general coding convention, we are not to give such a generic name for a variable with such a long life if avoidable. In the case the 'ret' could be 'timeout_fired' or something and it would be far verbose. + if (rc == 0 && timeout >= 0) WaitEventSetWait returns 0 only in the case of timeout expiration, so the second term is useless. Just setting ret to -1 and break seems to me almost the same with "goto". The reason why the existing ConditionVariableSleep uses do {} while(done) is that it is straightforwad. Timeout added everal exit point in the loop so it's make the loop rather complex by going around with the variable. Whole the loop could be in the following more flat shape. while (true) { CHECK_FOR_INTERRUPTS(); rc = WaitEventSetWait(); ResetLatch(); /* timeout expired, return */ if (rc == 0) return -1; SpinLockAcquire(); if (!proclist...) { done = true; } SpinLockRelease(); /* condition satisfied, return */ if (done) return 0; /* if we're here, we should wait for the remaining time */ INSTR_TIME_SET_CURRENT() ... } + Assert(ret == 0); I don't see a point in the assertion so much. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c index 59fa917ae0..abead3294e 100644 --- a/src/backend/storage/ipc/latch.c +++ b/src/backend/storage/ipc/latch.c @@ -950,7 +950,7 @@ WaitEventAdjustWin32(WaitEventSet *set, WaitEvent *event) int WaitEventSetWait(WaitEventSet *set, long timeout, WaitEvent *occurred_events, int nevents, - uint32 wait_event_info) + uint32 wait_event_info, bool wait_for_timeout) { int returned_events = 0; instr_time start_time; @@ -965,7 +965,8 @@ WaitEventSetWait(WaitEventSet *set, long timeout, */ if (timeout >= 0) { - INSTR_TIME_SET_CURRENT(start_time); + if (wait_for_timeout) + INSTR_TIME_SET_CURRENT(start_
Re: Introduce timeout capability for ConditionVariableSleep
Hi Shawn, On Wed, Mar 13, 2019 at 12:25 PM Shawn Debnath wrote: > Postgres today doesn't support waiting for a condition variable with a > timeout, although the framework it relies upon, does. This change wraps > the existing ConditionVariableSleep functionality and introduces a new > API, ConditionVariableTimedSleep, to allow callers to specify a timeout > value. Seems reasonable, I think, and should be familiar to anyone used to well known multithreading libraries. +/* + * Wait for the given condition variable to be signaled or till timeout. + * This should be called in a predicate loop that tests for a specific exit + * condition and otherwise sleeps, like so: + * + * ConditionVariablePrepareToSleep(cv); // optional + * while (condition for which we are waiting is not true) + * ConditionVariableSleep(cv, wait_event_info); + * ConditionVariableCancelSleep(); + * + * wait_event_info should be a value from one of the WaitEventXXX enums + * defined in pgstat.h. This controls the contents of pg_stat_activity's + * wait_event_type and wait_event columns while waiting. + * + * Returns 0 or -1 if timed out. + */ +int +ConditionVariableTimedSleep(ConditionVariable *cv, long timeout, +uint32 wait_event_info) Can we just refer to the other function's documentation for this? I don't want two copies of this blurb (and this copy-paste already failed to include "Timed" in the example function name). One difference compared to pthread_cond_timedwait() is that pthread uses an absolute time and here you use a relative time (as we do in WaitEventSet API). The first question is which makes a better API, and the second is what the semantics of a relative timeout should be: start again every time we get a spurious wake-up, or track time already waited? Let's see... -(void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1, +ret = WaitEventSetWait(cv_wait_event_set, timeout, &event, 1, wait_event_info); Here you're restarting the timeout clock every time through the loop without adjustment, and I think that's not a good choice: spurious wake-ups cause bonus waiting. -- Thomas Munro https://enterprisedb.com