Andy Grove created ARROW-11058:
----------------------------------

             Summary: [Rust] [DataFusion] Implement "coalesce batches" operator
                 Key: ARROW-11058
                 URL: https://issues.apache.org/jira/browse/ARROW-11058
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Rust - DataFusion
            Reporter: Andy Grove
            Assignee: Andy Grove
             Fix For: 3.0.0

When we have a FilterExec in the plan, it can produce lots of small batches, and we therefore lose the efficiency of vectorized operations.

We should implement a new CoalesceBatchExec and wrap every FilterExec with one of these, so that small batches can be recombined into larger batches to improve the efficiency of the operators that consume the filter's output.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
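
The recombining step proposed in this issue could be sketched roughly as follows. This is only an illustration of the buffering idea, not the DataFusion implementation: `Batch` is a plain `Vec<i32>` standing in for an Arrow record batch, and the names `Coalescer`, `coalesce`, and `target_size` are hypothetical and do not come from the issue.

```rust
/// Stand-in for an Arrow record batch: a single column of values.
/// (Assumption for illustration; a real operator would work on record batches.)
type Batch = Vec<i32>;

/// Buffers small input batches and emits one combined batch once at least
/// `target_size` rows have accumulated. All names here are hypothetical.
struct Coalescer {
    target_size: usize,
    buffer: Vec<Batch>,
    buffered_rows: usize,
}

impl Coalescer {
    fn new(target_size: usize) -> Self {
        Coalescer { target_size, buffer: Vec::new(), buffered_rows: 0 }
    }

    /// Accept one input batch. Returns a combined batch when the buffered
    /// row count reaches the target, otherwise `None`.
    fn push(&mut self, batch: Batch) -> Option<Batch> {
        if batch.is_empty() {
            // A filter may remove every row; there is nothing to buffer.
            return None;
        }
        self.buffered_rows += batch.len();
        self.buffer.push(batch);
        if self.buffered_rows >= self.target_size {
            Some(self.flush())
        } else {
            None
        }
    }

    /// Concatenate and clear whatever is buffered (used at end of input).
    fn flush(&mut self) -> Batch {
        self.buffered_rows = 0;
        self.buffer.drain(..).flatten().collect()
    }
}

/// Run the coalescer over a whole input, e.g. the output of a filter.
fn coalesce(input: Vec<Batch>, target_size: usize) -> Vec<Batch> {
    let mut coalescer = Coalescer::new(target_size);
    let mut output = Vec::new();
    for batch in input {
        if let Some(combined) = coalescer.push(batch) {
            output.push(combined);
        }
    }
    // Emit any remaining rows, even if they fall short of the target.
    let rest = coalescer.flush();
    if !rest.is_empty() {
        output.push(rest);
    }
    output
}
```

In this sketch, a combined batch can exceed `target_size`, because buffered batches are never split. Whether a real operator should instead slice its output to an exact size is a design choice the issue leaves open.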