Andy Grove created ARROW-12255:
----------------------------------

             Summary: [Rust] [Ballista] Integrate scheduler with DataFusion
                 Key: ARROW-12255
                 URL: https://issues.apache.org/jira/browse/ARROW-12255
             Project: Apache Arrow
          Issue Type: New Feature
          Components: Rust - Ballista, Rust - DataFusion
            Reporter: Andy Grove
            Assignee: Andy Grove
             Fix For: 5.0.0


The Ballista scheduler breaks a query down into stages based on changes in 
partitioning in the plan, where each stage is broken down into tasks that can 
be executed concurrently.
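
As a rough illustration of the stage-splitting idea, here is a minimal sketch 
that walks a linear list of operators and starts a new stage whenever the 
output partitioning changes. The types and names (Partitioning, Operator, 
Stage, split_into_stages) are hypothetical and are not Ballista's actual API. 
Real plans are trees rather than flat lists, and which side of the boundary 
the repartition operator lands on is a simplification here.

```rust
/// Hypothetical, simplified partitioning schemes (not the DataFusion type).
#[derive(Debug, Clone, PartialEq)]
enum Partitioning {
    RoundRobin(usize),
    Hash(usize),
}

/// Hypothetical operator in a linear plan, listed from scan to final output.
#[derive(Debug, Clone)]
struct Operator {
    name: &'static str,
    output_partitioning: Partitioning,
}

/// A stage groups consecutive operators that share the same partitioning.
/// It yields one task per partition, and those tasks can run concurrently.
#[derive(Debug)]
struct Stage {
    id: usize,
    operators: Vec<&'static str>,
    task_count: usize,
}

fn partition_count(p: &Partitioning) -> usize {
    match p {
        Partitioning::RoundRobin(n) | Partitioning::Hash(n) => *n,
    }
}

/// Start a new stage each time the output partitioning changes.
fn split_into_stages(plan: &[Operator]) -> Vec<Stage> {
    let mut stages: Vec<Stage> = Vec::new();
    let mut current: Option<&Partitioning> = None;
    for op in plan {
        if current != Some(&op.output_partitioning) {
            let id = stages.len();
            stages.push(Stage {
                id,
                operators: Vec::new(),
                task_count: partition_count(&op.output_partitioning),
            });
            current = Some(&op.output_partitioning);
        }
        // A stage always exists here because the first operator creates one.
        stages.last_mut().unwrap().operators.push(op.name);
    }
    stages
}
```

Under these assumptions, a plan of scan and filter over four round-robin 
partitions followed by a repartition and aggregate over two hash partitions 
would produce two stages, with four tasks and two tasks respectively.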

Rather than trying to run all the partitions at once, Ballista executors 
process n concurrent tasks at a time and then request new tasks from the 
scheduler.
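
The pull model described above could be sketched roughly as follows, using 
only the Rust standard library. The names (Scheduler, next_task, 
run_executor) are hypothetical stand-ins, not the Ballista implementation: 
the "scheduler" is just a shared queue of task ids, and an "executor" is a 
set of n worker threads that each request the next task as soon as they 
finish the previous one.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical scheduler: a shared queue of pending task ids.
struct Scheduler {
    pending: Mutex<VecDeque<usize>>,
}

impl Scheduler {
    fn new(task_count: usize) -> Self {
        Scheduler {
            pending: Mutex::new((0..task_count).collect()),
        }
    }

    /// Hand out the next pending task, or None when the queue is empty.
    fn next_task(&self) -> Option<usize> {
        self.pending.lock().unwrap().pop_front()
    }
}

/// Hypothetical executor with `slots` concurrent task slots. Each slot pulls
/// a task, "runs" it, then asks the scheduler for another until none remain.
/// Returns the ids of all completed tasks in ascending order.
fn run_executor(scheduler: Arc<Scheduler>, slots: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..slots)
        .map(|_| {
            let scheduler = Arc::clone(&scheduler);
            thread::spawn(move || {
                let mut done = Vec::new();
                while let Some(task) = scheduler.next_task() {
                    // A real executor would execute one partition here.
                    done.push(task);
                }
                done
            })
        })
        .collect();

    let mut completed: Vec<usize> = handles
        .into_iter()
        .flat_map(|h| h.join().unwrap())
        .collect();
    completed.sort_unstable();
    completed
}
```

With this shape, at most `slots` tasks are in flight at any time regardless 
of how many partitions the stage has, which is the property the description 
relies on.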

This approach would also help DataFusion scale better. Ideally, the same 
scheduler would be used to scale across cores in DataFusion and across nodes 
in Ballista.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
