On Fri, 03 Feb 2017 21:20:59 -0800, [email protected] wrote:
> See the following gist:
>
> https://gist.github.com/japhb/40772099ed24e20ec2c37c06f434594b
>
> (If you run that at the command line, you'll probably want to pipe it to
> `head -30` or so; it will output a lot of lines very quickly!)
>
> Essentially it appears that unlike the friendly one-at-a-time behavior of
> .act, react/whenever will try to exhaust all the emits from an unbounded
> supply *before* delivering any of them to the whenever code -- which makes
> it awfully hard to have the whenever tell the supply when to stop.

Firstly, the boring observations: there are two mistakes in the gist.

1) A role is not a closure, so:

    } does role { method done { $done = True } }

   will not behave as you want.

2) In the react example, there is $s1.done, when I presume $s2.done was meant.

Even with these corrected, the behavior under consideration still occurs.
The deadlock we're seeing here is thanks to the intersection of two
individually reasonable things.
The first is the general principle that supplies are about taming, not
introducing, concurrency. There are, of course, a number of Supply factory
methods that will introduce concurrency (Supply.interval, for example),
together with a number of supply operators that also will - typically, anything
involving time, such as the delay method. Naturally, schedule-on also can. But
these are all quite explicitly asking for the concurrency (and all are
delegating to something else - a scheduler - to actually provide it).
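
To make that distinction concrete, here is a minimal sketch (assuming a reasonably recent Rakudo; the variable names are made up for illustration) that records which thread the tap code runs on. A supply block with synchronous code should run on the thread that taps it, while Supply.interval hands delivery over to the scheduler:

```raku
# Illustration only; names such as @sync-threads are invented for this sketch.
my $tapping-thread = $*THREAD.id;

# A supply block with synchronous code: tapping it runs the body, and the
# tap's code, on the thread that called .tap.
my @sync-threads;
my $sync = supply { emit $_ for 1..3 };
$sync.tap: { @sync-threads.push: $*THREAD.id };

# Supply.interval explicitly asks a scheduler for concurrency, so its values
# are emitted from a thread pool thread instead.
my $interval-thread = await Supply.interval(0.01).head(1).map({ $*THREAD.id });

say "sync values seen on the tapping thread: ", so all(@sync-threads) == $tapping-thread;
say "interval value seen on another thread:  ", $interval-thread != $tapping-thread;
```

The first supply never asked for concurrency, so none was introduced; the second asked for it by name.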
The second, which is in some ways a follow-on from the first, is the actor-like
semantics of supply and react blocks. Only one thread may be inside of a given
instance of a supply or react block at a time, including any of the whenever
blocks inside of it. This has two important consequences:
1) You can be sure your setup logic inside of the supply or react block will
complete before any messages are processed.
2) You can be sure that you'll never end up with data races on any of the
variables declared inside of your supply or react block because only one
message will be processed at a time.
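
As a small sketch of the second consequence (again assuming a recent Rakudo, with names chosen only for illustration), two timers can feed the same react block and update shared state without any explicit locking, because only one whenever block runs at a time:

```raku
# Illustration only: two timers feed one react block, and both whenever
# blocks update the same state with no explicit locking.
my @seen;
react {
    my $count = 0;    # declared inside the react block, so protected by it

    whenever Supply.interval(0.01) {
        @seen.push: 'a';
        done if ++$count >= 10;
    }
    whenever Supply.interval(0.01) {
        @seen.push: 'b';
        done if ++$count >= 10;
    }
}
say "processed {+@seen} messages, one at a time";
```

Both sources here are genuinely asynchronous, which is the case these semantics were designed around.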
This all works out well if the supply being tapped truly *is* an asynchronous
source of data - which is what supplies are primarily aimed at. In the case
we're considering here, however, it is not. Thanks to the first principle, we
don't introduce concurrency, so we tap the supply on the thread running the
react block's body. The supply never hands back control, due to the loop inside
of it, so the setup never completes and we run straight into the concurrency
control mechanism.
A one-word fix is to introduce a bit of concurrency explicitly:

    react {
        start whenever $s2 -> $n {
            say "Received $n";
            $s2.done if $n >= 5;
        }
    }

With this, the react block's setup can complete, and then it starts processing
the messages.
Longer term, a back-pressure model for supplies is something that wants looking
into, designing, and implementing. I put this off on the basis that Rx.Net is
plenty useful without one, and RxJava introduced one after its initial release.
Taken together, there was no incentive to rush one in. However, we might be
able to find a solution in that space for this particular case.
That said, back when I was teaching async programming, I always made a point to
note that the places where synchrony and asynchrony meet are often sources of
trouble. Here, a supply block whose body runs synchronously runs up against a
construct (react) and data structure (Supply) whose designs are optimized for
dealing with asynchronous data. Reduced to its essence, the code submitted here
and the C# code I would show my students to illustrate the problem look
strikingly similar: a blocking subscription prevents message processing,
leading to a deadlock.
It's worth noting that this general problem can *not* be solved through a
back-pressure mechanism; that can only solve cases like the one in this ticket,
where emit can serve as a preemption point when back-pressure is applied. The
consequences of making emit have such semantics, however,
will probably run deep once we get into non-toy examples. (For example, will it
end up with us declaring `emit` as being like `await` in 6.d where you may be
on a different OS thread afterwards if you do it inside of the thread pool?)
A perhaps simpler solution space to explore is providing an API that separates
the obtaining of a Tap from the starting of processing. That would allow us to
run the setup logic to completion. But...then what? Again, it's easy to make
this toy example work because there's only one whenever block. But if there are
more, then we're just moving the problem, and making it harder to diagnose,
because instead of a "where are we deadlocked" backtrace showing the whenever
line, it'd instead show...some other location in supply internals. So, a
back-pressure model that allows us to round-robin is probably a bit better than
this.
tl;dr use "start whenever $supply { }" when $supply is going to work
synchronously. We should also consider implementing a missing "tap-on" supply
operator, so you can also write:

    whenever $supply.tap-on(ThreadPoolScheduler) { }

Or do it at the source:

    supply {
        ...sync code here...
    }.tap-on(ThreadPoolScheduler)

A simple implementation would likely be:

    method tap-on(Scheduler:D $scheduler) {
        supply {
            $scheduler.cue: { whenever self { .emit } }
        }
    }

Making the code as originally submitted work is an interesting problem to
ponder, but raises a bunch of non-trivial questions, and should be considered
together with various other challenges.
Hope this helps,
/jnthn