[perl #130716] [CONC] unbounded supply {} + react {} = pseudo-hang

jn...@jnthn.net via RT Sat, 04 Feb 2017 07:08:33 -0800

On Fri, 03 Feb 2017 21:20:59 -0800, g...@google.com wrote:
> See the following gist:
> 
>     https://gist.github.com/japhb/40772099ed24e20ec2c37c06f434594b
> 
> (If you run that at the command line, you'll probably want to pipe it to
> `head -30` or so; it will output a lot of lines very quickly!)
> 
> Essentially it appears that unlike the friendly one-at-a-time behavior of
> .act, react/whenever will try to exhaust all the emits from an unbounded
> supply *before* delivering any of them to the whenever code -- which makes
> it awfully hard to have the whenever tell the supply when to stop.


Firstly, the boring observations: there are two mistakes in the gist.

1) A role is not a closure, so:
    } does role { method done { $done = True } }
will not behave as you want.

2) In the react example, there is $s1.done, when I presume $s2.done was meant.

Even with these corrected, the behavior under consideration still occurs.

The deadlock we're seeing here is thanks to the intersection of two 
individually reasonable things.

The first is the general principle that supplies are about taming, not 
introducing, concurrency. There are, of course, a number of Supply factory 
methods that will introduce concurrency (Supply.interval, for example), 
together with a number of supply operators that also will - typically, anything 
involving time, such as the delay method. Naturally, schedule-on also can. But 
these are all quite explicitly asking for the concurrency (and all are 
delegating to something else - a scheduler - to actually provide it).
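
To make that first principle concrete, here is a minimal sketch (not taken from 
the gist; the variable names are made up for illustration) that records which 
thread the values arrive on. A plain supply block is expected to run on the 
thread that taps it, while Supply.interval hands the work to the scheduler:

```raku
my $main-thread = $*THREAD.id;

# A plain supply block: its body runs synchronously on the tapping thread.
my $sync-thread;
my $sync = supply { emit $*THREAD.id };
$sync.tap: -> $id { $sync-thread = $id };

# Supply.interval explicitly asks a scheduler for concurrency, so its
# values arrive on a thread pool thread instead.
my $async-thread;
react {
    whenever Supply.interval(0.01).head(1) {
        $async-thread = $*THREAD.id;
    }
}

say "supply block ran on the tapping thread: ", $sync-thread == $main-thread;
say "interval ran on the tapping thread:     ", $async-thread == $main-thread;
```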

The second, which is in some ways a follow-on from the first, is the actor-like 
semantics of supply and react blocks. Only one thread may be inside of a given 
instance of a supply or react block at a time, including any of the whenever 
blocks inside of it. This has two important consequences:

1) You can be sure your setup logic inside of the supply or react block will 
complete before any messages are processed.

2) You can be sure that you'll never end up with data races on any of the 
variables declared inside of your supply or react block because only one 
message will be processed at a time.
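
As a small illustration of the second consequence, here is a sketch assuming 
two made-up timer sources that both update the same variable. The counter is 
declared outside the react block only so it can be printed afterwards; it is 
touched solely from the whenever blocks, which should never run at the same 
time as each other:

```raku
my $total = 0;

react {
    # Two independent asynchronous sources, both delivered by the scheduler.
    # Each emits five values and is then done, which lets the react finish.
    whenever Supply.interval(0.01).head(5) { $total += 1 }
    whenever Supply.interval(0.01).head(5) { $total += 10 }
}

# Only one whenever block runs at a time, so no update should be lost
# even though no explicit lock is taken here.
say $total;
```

If the one-message-at-a-time guarantee holds, this prints 55 on every run.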

This all works out well if the supply being tapped truly *is* an asynchronous 
source of data - which is what supplies are primarily aimed at. In the case 
we're considering here, however, it is not. Thanks to the first principle, we 
don't introduce concurrency, so we tap the supply on the thread running the 
react block's body. Because of the loop inside it, the supply never hands back 
control, and its emits run straight into the concurrency control mechanism.

A one-word fix is to introduce a bit of concurrency explicitly:

react {
    start whenever $s2 -> $n {
        say "Received $n";
        $s2.done if $n >= 5;
    }
}

With this, the react block's setup can complete, and then it starts processing 
the messages.

Longer term, a back-pressure model for supplies is something that wants looking 
into, designing, and implementing. I put this off on the basis that Rx.Net is 
plenty useful without one, and RxJava introduced one after its initial release. 
Taken together, there was no incentive to rush one in. However, we might be 
able to find a solution in that space for this particular case.

That said, back when I was teaching async programming, I always made a point to 
note that the places where synchrony and asynchrony meet are often sources of 
trouble. Here, a supply block whose body runs synchronously runs up against a 
construct (react) and data structure (Supply) whose designs are optimized for 
dealing with asynchronous data. Reduced to its essence, the code submitted here 
and the C# code I would show my students to illustrate the problem look 
strikingly similar: a blocking subscription prevents message processing, 
leading to a deadlock.

It's worth noting that this general problem can *not* be solved through a 
back-pressure mechanism; it can only solve cases like the one in this ticket, 
where emit can serve as a preemption point when back-pressure is being 
applied. The consequences of making emit have such semantics, however, 
will probably run deep once we get into non-toy examples. (For example, will it 
end up with us declaring `emit` as being like `await` in 6.d where you may be 
on a different OS thread afterwards if you do it inside of the thread pool?)

A perhaps simpler solution space to explore is providing an API that separates 
the obtaining of a Tap from the starting of processing. That would allow us to 
run the setup logic to completion. But...then what? Again, it's easy to make 
this toy example work because there's only one whenever block. But if there are 
more, then we're just moving the problem, and making it harder to diagnose, 
because instead of a "where are we deadlocked" backtrace showing the whenever 
line, it'd instead show...some other location in supply internals. So, a 
back-pressure model that allows us to round-robin is probably a bit better than 
this.

tl;dr use "start whenever $supply { }" when $supply is going to work 
synchronously. We should also consider implementing a missing "tap-on" supply 
operator, so you can also write:

whenever $supply.tap-on(ThreadPoolScheduler) { }

Or do it at the source:

supply {
    ...sync code here...
}.tap-on(ThreadPoolScheduler)

A simple implementation would likely be:

method tap-on(Scheduler:D $scheduler) {
    supply {
        $scheduler.cue: { whenever self { .emit } }
    }
}

Making the code as originally submitted work is an interesting problem to 
ponder, but raises a bunch of non-trivial questions, and should be considered 
together with various other challenges.

Hope this helps,

/jnthn
