Pau Garcia Rodriguez created ARROW-18431:
--------------------------------------------

             Summary: Acero's Execution Plan never finishes.
                 Key: ARROW-18431
                 URL: https://issues.apache.org/jira/browse/ARROW-18431
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
    Affects Versions: 10.0.0
            Reporter: Pau Garcia Rodriguez


We have observed that sometimes an execution plan with a small input never 
finishes (the future returned by the ExecPlan::finished() method is never 
marked as finished), even though the generator in the sink node is exhausted 
and has returned nullopt.

This issue seems to happen at random: the same plan with the same input 
sometimes works (the plan is marked finished) and sometimes doesn't. Since 
the ExecPlanImpl destructor forces the executing thread to wait for the plan to 
finish (when the plan has not yet finished), we end up in a deadlock, waiting 
for a plan that never finishes.
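
To tell a stuck plan apart from a slow one while debugging, a bounded wait can be used instead of an unbounded one. The following is only a minimal sketch: it uses std::future as a stand-in for the plan's finished future, and FinishedWithin is a hypothetical helper name, not part of Arrow.

```cpp
#include <chrono>
#include <future>

// Hypothetical helper (not an Arrow API): returns true if `done` becomes
// ready within `timeout`, and false if the wait timed out.
inline bool FinishedWithin(const std::future<void>& done,
                           std::chrono::milliseconds timeout) {
  return done.wait_for(timeout) == std::future_status::ready;
}
```

A false result after a generous timeout, on a plan whose sink generator is already exhausted, would point to the completion never being signalled rather than to slow execution.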

Since this has only happened with small inputs and not in a deterministic way, 
we believe the issue might be in the ExecPlan::StartProducing method.

Our hypothesis is that after the plan starts producing on each node, each node 
schedules its tasks, they finish immediately (due to the small input), and 
somehow the callback that marks the finished_ future as finished is never 
executed.

 
{code:java}
Status StartProducing() {
  ...
  Future<> scheduler_finished = util::AsyncTaskScheduler::Make(
      [this](util::AsyncTaskScheduler* async_scheduler) {
  ...
  scheduler_finished.AddCallback(
      [this](const Status& st) { finished_.MarkFinished(st); });
  ...
}
{code}
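
For the hypothesis above to be ruled out, a callback registered after the scheduler has already finished must still run. The sketch below illustrates that property with a hypothetical MiniFuture class. It is not Arrow's Future<> implementation, and whether Arrow 10.0.0 behaves this way on this code path is not verified here.

```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a future; NOT Arrow's Future<> implementation.
class MiniFuture {
 public:
  using Callback = std::function<void(bool ok)>;

  // Registers `cb`. If the future has already finished, `cb` runs right
  // away, so a late registration cannot miss the completion.
  void AddCallback(Callback cb) {
    bool ok = false;
    {
      std::lock_guard<std::mutex> lock(mu_);
      if (!finished_) {
        callbacks_.push_back(std::move(cb));
        return;
      }
      ok = ok_;
    }
    cb(ok);  // Invoked outside the lock to avoid re-entrancy deadlocks.
  }

  // Marks the future finished and runs every callback registered so far.
  void MarkFinished(bool ok) {
    std::vector<Callback> pending;
    {
      std::lock_guard<std::mutex> lock(mu_);
      finished_ = true;
      ok_ = ok;
      pending.swap(callbacks_);
    }
    for (auto& cb : pending) cb(ok);
  }

 private:
  std::mutex mu_;
  bool finished_ = false;
  bool ok_ = false;
  std::vector<Callback> callbacks_;
};
```

Both the "already finished" check and the state change happen under the same lock, so a callback is either queued before completion or run immediately after it, and is never dropped in between.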
 
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
