[ 
https://issues.apache.org/jira/browse/ARROW-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weston Pace closed ARROW-13004.
-------------------------------
    Resolution: Won't Fix

This is handled better by the exec plan and we are not likely to pursue this in 
futures

> [C++] Allow the creation of future "chains" to better control parallelism
> -------------------------------------------------------------------------
>
>                 Key: ARROW-13004
>                 URL: https://issues.apache.org/jira/browse/ARROW-13004
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Weston Pace
>            Priority: Major
>
> This is a bit tricky to explain.  ShouldSchedule::Always works well for 
> AddCallback but falls short for Transfer and Then.  An example may explain 
> best.
> Consider three operators, Source, Transform, and Sink.  They are setup as...
> {code:java}
> source_fut = source(); // 1
> transform_fut = source_fut.Then(Transform(), ScheduleAlways); // 2
> sink_fut = transform_fut.Then(Consume()); // 3
> {code}
> The intent is to run Transform + Consume as a single thread task on each item 
> generated by source().  This is what happens if source() is slow.  If 
> source() is fast (let's pretend it's always finished) then this is not what 
> happens.
> Line 2 causes a new thread task to be launched (since source_fut is 
> finished).  It is possible that new thread task can mark transform_fut 
> finished before line 3 is executed by the original thread.  This causes 
> Consume() and Transform() to run on separate threads.
> The solution (at least as best I can come up with) is unfortunately a little 
> complex (though the complexity can be hidden in future/async_generator 
> internals).  Basically, it is worth waiting to schedule until the future 
> chain has had a chance to finish connecting the pressure.  This means a 
> future created with ScheduleAlways is created in an "unconsumed" mode.  Any 
> callbacks that would normally be launched will not be launched until the 
> future switches to "consumed".  Future.Wait(), VisitAsyncGenerator, 
> CollectAsyncGenerator, and some of the async_generator operators would cause 
> the future to be "consumed".  The "consume" signal will need to propagate 
> backwards up the chain so futures will need to keep a reference to their 
> antecedent future.
> This work meshes well with some other improvements I have been considering, 
> in particular, splitting future/promise and restricting futures to a single 
> callback.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to