Re: [perl #130716] [CONC] unbounded supply {} + react {} = pseudo-hang

Geoffrey Broadwell via perl6-compiler Sun, 05 Feb 2017 16:14:42 -0800

Responses inline ...

On Sat, Feb 4, 2017 at 7:08 AM, jn...@jnthn.net via RT <
perl6-bugs-follo...@perl.org> wrote:

> On Fri, 03 Feb 2017 21:20:59 -0800, g...@google.com wrote:
> > See the following gist:
> >
> >     https://gist.github.com/japhb/40772099ed24e20ec2c37c06f434594b
> >
> > (If you run that at the command line, you'll probably want to pipe it to
> > `head -30` or so; it will output a lot of lines very quickly!)
> >
> > Essentially it appears that unlike the friendly one-at-a-time behavior of
> > .act, react/whenever will try to exhaust all the emits from an unbounded
> > supply *before* delivering any of them to the whenever code -- which makes
> > it awfully hard to have the whenever tell the supply when to stop.
>
> Firstly, the boring observations: there are two mistakes in the gist.
>
> 1) A role is not a closure, so:
>     } does role { method done { $done = True } }
> Will not behave as you want.
>

It took me a minute to realize this was true, because if you move the `my
$s2 = make-supply;` from the react section right up under the creation of
$s1, it's clear that things go terribly wrong -- it apparently only worked
for me because I was only creating one at a time and finishing with one
before creating the next.

That said, this raises two questions:

A. How did this work in the first place?  Was the role's reference to $done
pointing to a single static slot?

B. Why isn't a role declaration a closure?  I understand that the
attributes and methods need to be flattened into the composed class; is
this because the contents of the role and the class are inserted into one
combined block that can only have one outer lexical context?

> 2) In the react example, there is $s1.done, when I presume $s2.done was
> meant.
>

Yup, didn't notice that pasto because the gist was essentially a merge of
two different cases, and the behavior after merging was the same as before.

> Even with these corrected, the behavior under consideration still occurs.
>
> The deadlock we're seeing here is thanks to the intersection of two
> individually reasonable things.
>
> The first is the general principle that supplies are about taming, not
> introducing, concurrency. There are, of course, a number of Supply factory
> methods that will introduce concurrency (Supply.interval, for example),
> together with a number of supply operators that also will - typically,
> anything involving time, such as the delay method. Naturally, schedule-on
> also can. But these are all quite explicitly asking for the concurrency
> (and all are delegating to something else - a scheduler - to actually
> provide it).
>
> The second, which is in some ways a follow-on from the first, is the
> actor-like semantics of supply and react blocks. Only one thread may be
> inside of a given instance of a supply or react block at a time, including
> any of the whenever blocks inside of it. This has two important
> consequences:
>
> 1) You can be sure your setup logic inside of the supply or react block
> will complete before any messages are processed.
>
> 2) You can be sure that you'll never end up with data races on any of the
> variables declared inside of your supply or react block because only one
> message will be processed at a time.
>
> This all works out well if the supply being tapped truly *is* an
> asynchronous source of data - which is what supplies are primarily aimed
> at. In the case we're considering here, however, it is not. Thanks to the
> first principle, we don't introduce concurrency, so we tap the supply on
> the thread running the react block's body. It never hands back control due
> to the loop inside of it, running straight into the concurrency control
> mechanism.
>

OK, the above makes sense to me, but why does the .act version work
properly then?  When I first read that react {} was supplying actor-like
semantics, I assumed that meant it works just like .act -- but it doesn't.
Why not?  What am I missing here?
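
For reference, here is a minimal sketch of the .act case being contrasted
with react {} here. It is a reduction for illustration, not the gist itself:
the names $counter and @seen are made up, and it assumes that tapping an
on-demand supply runs its body synchronously on the tapping thread. Behavior
may differ between Rakudo versions.

```raku
# Hypothetical reduction of the .act case; all names are illustrative only.
my $done = False;

# On-demand supply whose body loops synchronously until told to stop.
my $counter = supply {
    my $n = 0;
    until $done { emit $n++ }
}

my @seen;

# .act taps the supply. The supply body then runs on this thread, and each
# emit calls the block below before the loop continues, so setting $done
# here is visible at the next loop test.
$counter.act: -> $n {
    @seen.push: $n;
    $done = True if $n >= 5;
}

say @seen;
```

The point of the sketch is only that the receiver gets a chance to run
between emits, which is what lets it tell the sender when to stop.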

> A one-word fix is to introduce a bit of concurrency explicitly:
>
> react {
>     start whenever $s2 -> $n {
>         say "Received $n";
>         $s2.done if $n >= 5;
>     }
> }
>
> With this, the react block's setup can complete, and then it starts
> processing the messages.
>

Well ... that kinda works.  As I tried this (with a `sleep 2` added at
program end) and a few other variants -- using `last` instead of `$s2.done`
as recommended in the irclog, using `loop` instead of `until $done`,
getting rid of the role application and instead putting `my $done = False;
CLOSE $done = True;` inside the supply {} block, etc. -- I found that every
variation I tried sometimes worked, and sometimes led to sadness.  For
example, it might emit a largish number of times, then stop emitting and
just hang (way past the length of the sleep).  The version using `last` and
`CLOSE` together would sometimes emit quite a few times before exiting,
with the last few emits interspersed with `===SORRY!===` and `last without
loop construct`.  I assume the pile of emits before stopping is just a
matter of which thread was getting scheduled -- standard concurrency
issues.  But the hang and the error make no sense.

With all the things I tried, at this point I'm not even sure which problems
were results of my ignorance and which were actual bugs of their own.  What
is the *correct and always working* version of this gist?  Note that in the
real input-processing code from which this toy example was extracted, it's
critical to do cleanup after the supply is stopped, because otherwise the
terminal will be stuck in raw input mode ... and that cleanup depends on
restoring state saved just before the supply is set up.

> Longer term, a back-pressure model for supplies is something that wants
> looking into, designing, and implementing. I put this off on the basis
> that Rx.Net is plenty useful without one, and RxJava introduced one after
> its initial release. Taken together, there was no incentive to rush one in.
> However, we might be able to find a solution in that space for this
> particular case.
>
> That said, back when I was teaching async programming, I always made a
> point to note that the places where synchrony and asynchrony meet are often
> sources of trouble. Here, a supply block whose body runs synchronously runs
> up against a construct (react) and data structure (Supply) whose designs
> are optimized for dealing with asynchronous data. Reduced to its essence,
> the code submitted here and the C# code I would show my students to
> illustrate the problem look strikingly similar: a blocking subscription
> prevents message processing, leading to a deadlock.
>
> It's worth noting that this general problem can *not* be solved through a
> back-pressure mechanism; it can only solve cases like the one in this
> ticket where emit can serve as a preemption point in the case of
> back-pressure being applied. The consequences of making emit have such
> semantics, however, will probably run deep once we get into non-toy
> examples. (For example, will it end up with us declaring `emit` as being
> like `await` in 6.d where you may be on a different OS thread afterwards if
> you do it inside of the thread pool?)
>

I can understand that problem -- though it does lead me to wonder what
exactly .act() used on the receiving side is doing now that makes it more
amenable to this use case.

> A perhaps simpler solution space to explore is providing an API that
> separates the obtaining of a Tap from the starting of processing. That
> would allow us to run the setup logic to completion. But...then what?
> Again, it's easy to make this toy example work because there's only one
> whenever block. But if there are more, then we're just moving the problem,
> and making it harder to diagnose, because instead of a "where are we
> deadlocked" backtrace showing the whenever line, it'd instead show...some
> other location in supply internals. So, a back-pressure model that allows
> us to round-robin is probably a bit better than this.
>
> tl;dr use "start whenever $supply { }" when $supply is going to work
> synchronously.

I thought the initial point of Supply was to address a few fundamental
limitations of Channel, one of which was to not force so much thread
switching just to send a stream of values through.  My understanding was
that (in the case of not explicitly starting a new thread for the
receiver), each value emitted would simply travel through a tree of taps
depth first before emitting the next value.  In other words, `emit` had
coroutine status similar to `take`.  And with .act() on the taps, that
seems to match my mental model.  So why doesn't that work with react {}?
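
The mental model described above can be sketched as follows. This is an
illustration under assumptions, not a statement about Supply internals: @log
and $source are made-up names, and the sketch uses a plain .tap directly on
an on-demand supply, with no react {} or whenever involved.

```raku
# Hypothetical sketch; @log and $source are illustrative names only.
my @log;

# Finite on-demand supply that records what happens around each emit.
my $source = supply {
    for 1..3 -> $n {
        @log.push: "emit $n";
        emit $n;
        @log.push: "resumed after $n";
    }
}

# Tapping runs the supply body on this thread; each emit is delivered to
# the tap block before the body carries on to the next iteration.
$source.tap: -> $n { @log.push: "tap saw $n" };

.say for @log;
```

If the model holds, each "tap saw" entry lands between the matching "emit"
and "resumed after" entries, rather than all emits being logged first.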

> We should also consider implementing a missing "tap-on" supply operator, so
> you can also write:
>
> whenever $supply.tap-on(ThreadPoolScheduler) { }
>
> Or do it at the source:
>
> supply {
>     ...sync code here...
> }.tap-on(ThreadPoolScheduler)
>
> A simple implementation would likely be:
>
> method tap-on(Scheduler:D $scheduler) {
>     supply {
>         $scheduler.cue: { whenever self { .emit } }
>     }
> }
>
> Making the code as originally submitted work is an interesting problem to
> ponder, but raises a bunch of non-trivial questions, and should be
> considered together with various other challenges.
>
> Hope this helps,
>
> /jnthn
>
