Hi,
On 2019-04-16 17:05:36 -0700, Andres Freund wrote:
> On 2019-04-16 18:59:37 -0400, Robert Haas wrote:
> > On Tue, Apr 16, 2019 at 6:45 PM Tom Lane <[email protected]> wrote:
> > > Do we need to think harder about establishing rules for multiplexed
> > > use of the process latch? I'm imagining some rule like "if you are
> > > not the outermost event loop of a process, you do not get to
> > > summarily clear MyLatch. Make sure to leave it set after waiting,
> > > if there was any possibility that it was set by something other than
> > > the specific event you're concerned with".
> >
> > Hmm, yeah. If the latch is left set, then the outer loop will just go
> > through an extra and unnecessary iteration, which seems fine. If the
> > latch is left clear, then the outer loop might miss a wakeup intended
> > for it and hang forever.
>
> Arguably that's a sign that the latch using code in the outer loop(s) isn't
> written correctly? If you do it as:
>
> while (true)
> {
> CHECK_FOR_INTERRUPTS();
>
> ResetLatch(MyLatch);
>
> if (work_needed)
> {
> Plenty();
> Code();
> Using(MyLatch);
> }
> else
> {
> WaitLatch(MyLatch);
> }
> }
>
> I think that's not a danger? I think the problem really is that we
> suggest doing that WaitLatch() unconditionally:
>
> * The correct pattern to wait for event(s) is:
> *
> * for (;;)
> * {
> * ResetLatch();
> * if (work to do)
> * Do Stuff();
> * WaitLatch();
> * }
> *
> * It's important to reset the latch *before* checking if there's work to
> * do. Otherwise, if someone sets the latch between the check and the
> * ResetLatch call, you will miss it and Wait will incorrectly block.
> *
> * Another valid coding pattern looks like:
> *
> * for (;;)
> * {
> * if (work to do)
> * Do Stuff(); // in particular, exit loop if some condition
> satisfied
> * WaitLatch();
> * ResetLatch();
> * }
>
> Obviously there's the issue that a lot of latch using code isn't written
> that way - but I also don't think there's that many latch using code
> that then also uses latch. Seems like we could fix that. While it has
> obviously dangers of not being followed, so does the
> 'always-set-latch-unless-outermost-loop' approach.
>
> I'm not sure I like the idea of incurring another unnecessary SetLatch()
> call for most latch using places.
>
> I guess there's a bit bigger danger of taking longer to notice
> postmaster-death. But I'm not sure I can quite see that being
> problematic - seems like all we should incur is another cycle through
> the loop, as the latch shouldn't be set anymore.
I think we should thus change our latch documentation to say:
something like:
diff --git a/src/include/storage/latch.h b/src/include/storage/latch.h
index fc995819d35..dc46dd94c5b 100644
--- a/src/include/storage/latch.h
+++ b/src/include/storage/latch.h
@@ -44,22 +44,31 @@
* {
* ResetLatch();
* if (work to do)
- * Do Stuff();
- * WaitLatch();
+ * DoStuff();
+ * else
+ * WaitLatch();
* }
*
* It's important to reset the latch *before* checking if there's work to
* do. Otherwise, if someone sets the latch between the check and the
* ResetLatch call, you will miss it and Wait will incorrectly block.
*
+ * The reason to only wait on the latch in case there is nothing to do is that
+ * code inside DoStuff() might use the same latch, and leave it reset, even
+ * though a SetLatch() aimed for the outer loop arrived. Which again could
+ * lead to incorrectly blocking in Wait.
+ *
* Another valid coding pattern looks like:
*
* for (;;)
* {
* if (work to do)
- * Do Stuff(); // in particular, exit loop if some condition satisfied
- * WaitLatch();
- * ResetLatch();
+ * DoStuff(); // in particular, exit loop if some condition satisfied
+ * else
+ * {
+ * WaitLatch();
+ * ResetLatch();
+ * }
* }
*
* This is useful to reduce latch traffic if it's expected that the loop's
and adapt code to match (at least in the outer loops).
Greetings,
Andres Freund