Re: Introduce timeout capability for ConditionVariableSleep

2019-07-22 Thread Thomas Munro
On Tue, Jul 16, 2019 at 1:11 AM Robert Haas  wrote:
> On Fri, Jul 12, 2019 at 11:03 PM Thomas Munro  wrote:
> > I pushed this too.  It's a separate commit, because I think there is
> > at least a theoretical argument that it should be back-patched.  I'm
> > not going to do that today though, because I doubt anyone is relying
> > on ConditionVariableSignal() working that reliably yet, and it's
> > really with timeouts that it becomes a likely problem.
>
> To make it work reliably, you'd need to be sure that a process can't
> ERROR or FATAL after getting signaled and before doing whatever the
> associated work is (or that if it does, it will first pass on the
> signal). Since that seems impossible, I'm not sure I see the point of
> trying to do anything at all.

I agree that that on its own doesn't fix problems in , but that doesn't mean we
shouldn't try to make this API as reliable as possible.  Unlike
typical CV implementations, our wait primitive is not atomic.  When we
invented two-step wait, we created a way for ConditionVariableSignal()
to have no effect due to bad timing.  Surely that's a bug.

-- 
Thomas Munro
https://enterprisedb.com




Re: Introduce timeout capability for ConditionVariableSleep

2019-07-15 Thread Robert Haas
On Fri, Jul 12, 2019 at 11:03 PM Thomas Munro  wrote:
> I pushed this too.  It's a separate commit, because I think there is
> at least a theoretical argument that it should be back-patched.  I'm
> not going to do that today though, because I doubt anyone is relying
> on ConditionVariableSignal() working that reliably yet, and it's
> really with timeouts that it becomes a likely problem.

To make it work reliably, you'd need to be sure that a process can't
ERROR or FATAL after getting signaled and before doing whatever the
associated work is (or that if it does, it will first pass on the
signal). Since that seems impossible, I'm not sure I see the point of
trying to do anything at all.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Introduce timeout capability for ConditionVariableSleep

2019-07-12 Thread Thomas Munro
On Fri, Jul 12, 2019 at 6:08 PM Shawn Debnath  wrote:
> On Tue, Jul 09, 2019 at 11:03:18PM +1200, Thomas Munro wrote:
> > On Sun, Jul 7, 2019 at 3:09 PM Thomas Munro  wrote:
> > > +   /* Timed out */
> > > +   if (rc == 0)
> > > +   return true;
> >
> > Here's a version without that bit, because I don't think we need it.
>
> This works. Agree that letting it fall through covers the first gap.

Pushed, like that (with the now unused rc variable also removed).
Thanks for the patch!

> > > That still leaves the danger that the CV can be signalled some time
> > > after ConditionVariableTimedSleep() returns.  So now I'm wondering if
> > > ConditionVariableCancelSleep() should signal the CV if it discovers
> > > that this process is not in the proclist, on the basis that that must
> > > indicate that we've been signalled even though we're not interested in
> > > the message anymore, and yet some other process else might be
> > > interested, and that might have been the only signal that is ever
> > > going to be delivered by ConditionVariableSignal().
> >
> > And a separate patch to do that.  Thoughts?
>
> I like it. This covers the gap all the way till cancel is invoked and it
> manipulates the list to remove itself or realizes that it needs to
> forward the signal to some other process.

I pushed this too.  It's a separate commit, because I think there is
at least a theoretical argument that it should be back-patched.  I'm
not going to do that today though, because I doubt anyone is relying
on ConditionVariableSignal() working that reliably yet, and it's
really with timeouts that it becomes a likely problem.

I thought about this edge case because I have long wanted to propose a
pair of functions that provide a simplified payloadless blocking
alternative to NOTIFY, that would allow for just the right number of
waiting sessions to wake up to handle SKIP LOCKED-style job queues.
Otherwise you sometimes get thundering herds of wakeups fighting over
crumbs.  That made me think about the case where a worker session
decides to time out and shut down due to being idle for too long, but
eats a wakeup on its way out.  Another question that comes up in that
use case is whether CV wakeup queues should be LIFO or FIFO.  I think
the answer is LIFO, to support class worker pool designs that
stabilise at the right size using a simple idle timeout rule.  They're
currently FIFO (proclist_pop_head_node() to wake up, but
proclist_push_tail() to sleep).  I understand why Robert didn't care
about that last time I mentioned it: all our uses of CVs today are
"broadcast" wakeups.  But a productised version of the "poke" hack I
showed earlier that supports poking just one waiter would care about
the thing this patch fixed, and also the wakeup queue order.

-- 
Thomas Munro
https://enterprisedb.com




Re: Introduce timeout capability for ConditionVariableSleep

2019-07-09 Thread Thomas Munro
On Sun, Jul 7, 2019 at 3:09 PM Thomas Munro  wrote:
> +   /* Timed out */
> +   if (rc == 0)
> +   return true;

Here's a version without that bit, because I don't think we need it.

> That still leaves the danger that the CV can be signalled some time
> after ConditionVariableTimedSleep() returns.  So now I'm wondering if
> ConditionVariableCancelSleep() should signal the CV if it discovers
> that this process is not in the proclist, on the basis that that must
> indicate that we've been signalled even though we're not interested in
> the message anymore, and yet some other process else might be
> interested, and that might have been the only signal that is ever
> going to be delivered by ConditionVariableSignal().

And a separate patch to do that.  Thoughts?


--
Thomas Munro
https://enterprisedb.com


0001-Introduce-timed-waits-for-condition-variables-v6.patch
Description: Binary data


0002-Forward-received-condition-variable-signals-on-ca-v6.patch
Description: Binary data


Re: Introduce timeout capability for ConditionVariableSleep

2019-07-06 Thread Thomas Munro
On Fri, Jul 5, 2019 at 1:40 PM Thomas Munro  wrote:
> I think this is looking pretty good and I'm planning to commit it
> soon.  The convention for CHECK_FOR_INTERRUPTS() in latch wait loops
> seems to be to put it after the ResetLatch(), so I've moved it in the
> attached version (though I don't think it was wrong where it was).
> Also pgindented and with my proposed commit message.  I've also
> attached a throw-away test module that gives you CALL poke() and
> SELECT wait_for_poke(timeout) using a CV.

I thought of one small problem with the current coding.  Suppose there
are two processes A and B waiting on a CV, and another process C calls
ConditionVariableSignal() to signal one process.  Around the same
time, A times out and exits via this code path:

+   /* Timed out */
+   if (rc == 0)
+   return true;

Suppose ConditionVariableSignal() set A's latch immediately after
WaitEventSetWait() returned 0 in A.  Now A won't report the CV signal
to the caller, and B is still waiting, so effectively nobody has
received the message and yet C thinks it has signalled a waiter if
there is one.  My first thought is that we could simply remove the
above-quoted hunk and fall through to the second timeout-detecting
code.  That'd mean that if we've been signalled AND timed out as of
that point in the code, we'll prefer to report the signal, and it also
reduces the complexity of the function to have only one "return true"
path.

That still leaves the danger that the CV can be signalled some time
after ConditionVariableTimedSleep() returns.  So now I'm wondering if
ConditionVariableCancelSleep() should signal the CV if it discovers
that this process is not in the proclist, on the basis that that must
indicate that we've been signalled even though we're not interested in
the message anymore, and yet some other process else might be
interested, and that might have been the only signal that is ever
going to be delivered by ConditionVariableSignal().

-- 
Thomas Munro
https://enterprisedb.com




Re: Introduce timeout capability for ConditionVariableSleep

2019-07-04 Thread Thomas Munro
On Fri, Mar 22, 2019 at 7:21 AM Shawn Debnath  wrote:
>
> On Sat, Mar 16, 2019 at 03:27:17PM -0700, Shawn Debnath wrote:
> > > + * Track the current time so that we can calculate the
> > > remaining timeout
> > > + * if we are woken up spuriously.
> > >
> > > I think tha "track" means chasing a moving objects. So it might
> > > be bettter that it is record or something?
> > >
> > > >   * Wait for the given condition variable to be signaled or till 
> > > > timeout.
> > > >   *
> > > >   * Returns -1 when timeout expires, otherwise returns 0.
> > > >   *
> > > >   * See ConditionVariableSleep() for general usage.
> > > >
> > > > > +ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
> > > > >
> > > > > Counldn't the two-state return value be a boolean?
> >
> > I will change it to Record in the next iteration of the patch.
>
> Posting rebased and updated patch. Changed the word 'Track' to 'Record'
> and also changed variable name rem_timeout to cur_timeout to match
> naming in other use cases.

Hi Shawn,

I think this is looking pretty good and I'm planning to commit it
soon.  The convention for CHECK_FOR_INTERRUPTS() in latch wait loops
seems to be to put it after the ResetLatch(), so I've moved it in the
attached version (though I don't think it was wrong where it was).
Also pgindented and with my proposed commit message.  I've also
attached a throw-away test module that gives you CALL poke() and
SELECT wait_for_poke(timeout) using a CV.

Observations that I'm not planning to do anything about:
1.  I don't really like the data type "long", but it's already
established that we use that for latches so maybe it's too late for me
to complain about that.
2.  I don't really like the fact that we have to do floating point
stuff in INSTR_TIME_GET_MILLISEC().  That's not really your patch's
fault and you've copied the timeout adjustment code from latch.c,
which seems reasonable.

-- 
Thomas Munro
https://enterprisedb.com


0001-Introduce-timed-waits-for-condition-variables-v5.patch
Description: Binary data


0002-Simple-module-for-waiting-for-other-sessions-v5.patch
Description: Binary data


Re: Introduce timeout capability for ConditionVariableSleep

2019-03-14 Thread Kyotaro HORIGUCHI
Hello.

At Thu, 14 Mar 2019 17:26:11 -0700, Shawn Debnath  wrote in 
<20190315002611.ga1...@f01898859afd.ant.amazon.com>
> Thank you reviewing. Comments inline.
> 
> On Wed, Mar 13, 2019 at 05:24:15PM +0900, Kyotaro HORIGUCHI wrote:
> > > Agree. In my testing WaitEventSetWait did the work for calculating 
> > > the right timeout remaining. It's a bummer that we have to repeat 
> > > the same pattern at the ConditionVariableTimedSleep() but I guess 
> > > anyone who loops in such cases will have to adjust their values 
> > > accordingly.
> > 
> > I think so, too. And actually the duplicate timeout calculation
> > doesn't seem good. We could eliminate the duplicate by allowing
> > WaitEventSetWait to exit by unwanted events, something like the
> > attached.
> 
> After thinking about this more, I see WaitEventSetWait()'s contract as 
> wait for an event or timeout if no events are received in that time 

Sure.

> frame. Although ConditionVariableTimedSleep() is also using the same 
> word, I believe the semantics are different. The timeout period in 
> ConditionVariableTimedSleep() is intended to limit the time we wait 
> until removal from the wait queue. Whereas, in the case of 
> WaitEventSetWait, the timeout period is intended to limit the time the 
> caller waits till the first event.

Mmm. The two look the same to me.. Timeout means for both that
"Wait for one of these events or timeout expiration to
occur". Removal from waiting queue is just a subtask of exiting
from waiting state.

The "don't exit until timeout expires unless any expected events
occur" part is to be done in the uppermost layer so it is enough
that the lower layer does just "exit when something
happened". This is the behavior of WaitEventSetWaitBlock for
WaitEventSetWait. My proposal is giving callers capabliy to tell
WaitEventSetWait not to perform useless timeout contination.

> Hence, I believe the code is correct as is and we shouldn't change the 
> contract for WaitEventSetWait.
> 
> > > BTW, I am curious why Andres in 98a64d0bd71 didn't just create an 
> > > artificial event with WL_TIMEOUT and return that from 
> > > WaitEventSetWait(). Would have made it cleaner than checking checking 
> > > return values for -1.
> > 
> > Maybe because it is not a kind of WaitEvent, so it naturally
> > cannot be a part of occurred_events.
> 
> Hmm, I don't agree with that completely. One could argue that the 
> backend is waiting for any event in order to be woken up, including a 
> timeout event.

Right, I understand that. I didn't mean that it is the right
design for everyone. Just meant that it is in that shape. (And I
rather like it.)

latch.h:127
#define WL_TIMEOUT (1 << 3)/* not for WaitEventSetWait() */

We can make it one of the events for WaitEventSetWait, but I
don't see such a big point on that, and also that doesn't this
patch better in any means.


> > # By the way, you can obtain a short hash of a commit by git
> > #  rev-parse --short .
> 
> Good to know! :-) Luckily git is smart enough to still match it to the 
> correct commit.

And too complex so that infrequent usage easily get out from my
brain:(


> > > Updated v2 patch attached.

Thank you . It looks fine execpt the above point.  But still I
have some questions on it. (the reason for they not being
comments is that they are about wordings..).

+ * Track the current time so that we can calculate the remaining timeout
+ * if we are woken up spuriously.

I think tha "track" means chasing a moving objects. So it might
be bettter that it is record or something?

>   * Wait for the given condition variable to be signaled or till timeout.
>   *
>   * Returns -1 when timeout expires, otherwise returns 0.
>   *
>   * See ConditionVariableSleep() for general usage.
> 
> > +ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
> > 
> > Counldn't the two-state return value be a boolean?
> 
> I wanted to leave the option open to use the positive integers for other 
> purposes but you are correct, bool suffices for now. If needed, we can 
> change it in the future.

Yes, we can do that after we found it needed.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Introduce timeout capability for ConditionVariableSleep

2019-03-13 Thread Kyotaro HORIGUCHI
Hello.

At Tue, 12 Mar 2019 17:53:43 -0700, Shawn Debnath  wrote in 
<20190313005342.ga8...@f01898859afd.ant.amazon.com>
> Hi Thomas,
> 
> Thanks for reviewing!
> 
> On Wed, Mar 13, 2019 at 12:40:57PM +1300, Thomas Munro wrote:
> > Can we just refer to the other function's documentation for this?  I
> > don't want two copies of this blurb (and this copy-paste already
> > failed to include "Timed" in the example function name).
> 
> Hah - point well taken. Fixed.
> 
> > One difference compared to pthread_cond_timedwait() is that pthread
> > uses an absolute time and here you use a relative time (as we do in
> > WaitEventSet API).  The first question is which makes a better API,
> > and the second is what the semantics of a relative timeout should be:
> > start again every time we get a spurious wake-up, or track time
> > already waited?  Let's see...
> > 
> > -(void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
> > +ret = WaitEventSetWait(cv_wait_event_set, timeout, &event, 1,
> >  wait_event_info);
> > 
> > Here you're restarting the timeout clock every time through the loop
> > without adjustment, and I think that's not a good choice: spurious
> > wake-ups cause bonus waiting.
> 
> Agree. In my testing WaitEventSetWait did the work for calculating the 
> right timeout remaining. It's a bummer that we have to repeat the same 
> pattern at the ConditionVariableTimedSleep() but I guess anyone who 
> loops in such cases will have to adjust their values accordingly.

I think so, too. And actually the duplicate timeout calculation
doesn't seem good. We could eliminate the duplicate by allowing
WaitEventSetWait to exit by unwanted events, something like the
attached.

> BTW, I am curious why Andres in 98a64d0bd71 didn't just create an 
> artificial event with WL_TIMEOUT and return that from 
> WaitEventSetWait(). Would have made it cleaner than checking checking 
> return values for -1.

Maybe because it is not a kind of WaitEvent, so it naturally
cannot be a part of occurred_events.

# By the way, you can obtain a short hash of a commit by git
#  rev-parse --short .

> Updated v2 patch attached.

I'd like to comment on the patch.

+ * In the event of a timeout, we simply return and the caller
+ * calls ConditionVariableCancelSleep to remove themselves from the
+ * wait queue. See ConditionVariableSleep() for notes on how to correctly check
+ * for the exit condition.
+ *
+ * Returns 0, or -1 if timed out.

Maybe this could be more simpler, that like:

* ConditionVariableTimedSleep - allows us to specify timeout
* 
* If timeout = =1, block until the condition is satisfied.
* 
* Returns -1 when timeout expires, otherwise returns 0.
* 
* See ConditionVariableSleep() for general behavior and usage.


+int
+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,

Counldn't the two-state return value be a boolean?


+   int ret = 0;

As a general coding convention, we are not to give such a generic
name for a variable with such a long life if avoidable. In the
case the 'ret' could be 'timeout_fired' or something and it would
be far verbose.


+   if (rc == 0 && timeout >= 0)

WaitEventSetWait returns 0 only in the case of timeout
expiration, so the second term is useless.  Just setting ret to
-1 and break seems to me almost the same with "goto".  The reason
why the existing ConditionVariableSleep uses do {} while(done) is
that it is straightforwad. Timeout added everal exit point in the
loop so it's make the loop rather complex by going around with
the variable. Whole the loop could be in the following more flat
shape.

   while (true)
   {
  CHECK_FOR_INTERRUPTS();
  rc = WaitEventSetWait();
  ResetLatch();

  /* timeout expired, return */
  if (rc == 0) return -1;
  SpinLockAcquire();
  if (!proclist...)
  {
 done = true;
  }
  SpinLockRelease();

  /* condition satisfied, return */
  if (done) return 0;

  /* if we're here, we should wait for the remaining time */
  INSTR_TIME_SET_CURRENT()
  ...
  }


+   Assert(ret == 0);

I don't see a point in the assertion so much.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 59fa917ae0..abead3294e 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -950,7 +950,7 @@ WaitEventAdjustWin32(WaitEventSet *set, WaitEvent *event)
 int
 WaitEventSetWait(WaitEventSet *set, long timeout,
  WaitEvent *occurred_events, int nevents,
- uint32 wait_event_info)
+ uint32 wait_event_info, bool wait_for_timeout)
 {
 	int			returned_events = 0;
 	instr_time	start_time;
@@ -965,7 +965,8 @@ WaitEventSetWait(WaitEventSet *set, long timeout,
 	 */
 	if (timeout >= 0)
 	{
-		INSTR_TIME_SET_CURRENT(start_time);
+		if (wait_for_timeout)
+			INSTR_TIME_SET_CURRENT(start_

Re: Introduce timeout capability for ConditionVariableSleep

2019-03-12 Thread Thomas Munro
Hi Shawn,

On Wed, Mar 13, 2019 at 12:25 PM Shawn Debnath  wrote:
> Postgres today doesn't support waiting for a condition variable with a
> timeout, although the framework it relies upon, does. This change wraps
> the existing ConditionVariableSleep functionality and introduces a new
> API, ConditionVariableTimedSleep, to allow callers to specify a timeout
> value.

Seems reasonable, I think, and should be familiar to anyone used to
well known multithreading libraries.

+/*
+ * Wait for the given condition variable to be signaled or till timeout.
+ * This should be called in a predicate loop that tests for a specific exit
+ * condition and otherwise sleeps, like so:
+ *
+ * ConditionVariablePrepareToSleep(cv);  // optional
+ * while (condition for which we are waiting is not true)
+ * ConditionVariableSleep(cv, wait_event_info);
+ * ConditionVariableCancelSleep();
+ *
+ * wait_event_info should be a value from one of the WaitEventXXX enums
+ * defined in pgstat.h.  This controls the contents of pg_stat_activity's
+ * wait_event_type and wait_event columns while waiting.
+ *
+ * Returns 0 or -1 if timed out.
+ */
+int
+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+uint32 wait_event_info)


Can we just refer to the other function's documentation for this?  I
don't want two copies of this blurb (and this copy-paste already
failed to include "Timed" in the example function name).

One difference compared to pthread_cond_timedwait() is that pthread
uses an absolute time and here you use a relative time (as we do in
WaitEventSet API).  The first question is which makes a better API,
and the second is what the semantics of a relative timeout should be:
start again every time we get a spurious wake-up, or track time
already waited?  Let's see...

-(void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
+ret = WaitEventSetWait(cv_wait_event_set, timeout, &event, 1,
 wait_event_info);

Here you're restarting the timeout clock every time through the loop
without adjustment, and I think that's not a good choice: spurious
wake-ups cause bonus waiting.

-- 
Thomas Munro
https://enterprisedb.com