[ https://issues.apache.org/jira/browse/ARROW-11068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256477#comment-17256477 ]
Andrew Lamb commented on ARROW-11068:
-------------------------------------

I have some suggestions here: https://github.com/apache/arrow/pull/9043#discussion_r550163707

TLDR -- I think incorporating the coalescing logic into the operators themselves (so they don't produce small batches in the first place) might be better.

> [Rust] [DataFusion] Wrap more operators in CoalesceBatchExec
> ------------------------------------------------------------
>
>                 Key: ARROW-11068
>                 URL: https://issues.apache.org/jira/browse/ARROW-11068
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust - DataFusion
>            Reporter: Andy Grove
>            Assignee: Andy Grove
>            Priority: Major
>             Fix For: 3.0.0
>
>
> Once [https://github.com/apache/arrow/pull/9043] is merged, we should extend
> this to wrap HashJoinExec and HashAggregateExec as well since they can both
> produce small batches.
> Rather than hard-code a list of operators that need to be wrapped, we should
> find a more generic mechanism so that plans can declare if their input and/or
> output batches should be coalesced (similar to how we handle partitioning)
> and this would allow custom operators outside of DataFusion to benefit from
> this optimization.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)