houqp commented on a change in pull request #55:
URL: https://github.com/apache/arrow-datafusion/pull/55#discussion_r619871814
##########
File path: datafusion/src/physical_optimizer/coalesce_batches.rs
##########
@@ -58,7 +59,17 @@ impl PhysicalOptimizerRule for CoalesceBatches {
// See https://issues.apache.org/jira/browse/ARROW-11068
let wrap_in_coalesce = plan_any.downcast_ref::<FilterExec>().is_some()
|| plan_any.downcast_ref::<HashJoinExec>().is_some()
- || plan_any.downcast_ref::<RepartitionExec>().is_some();
+ || {
+ match plan_any.downcast_ref::<RepartitionExec>() {
+ Some(p) => match p.partitioning() {
+ // do not coalesce hash partitions since other plans
like partitioned hash
Review comment:
Good call, I started with handling this in `HashJoinStream`, but
couldn't get it to work in all cases due to a small bug. I took a second look
at my first attempt and noticed it could be fixed by changing 3 characters :P I
have now reverted my change in the coalesce optimizer and added back my initial
fix through
https://github.com/apache/arrow-datafusion/pull/55/commits/696f8e04178fef877a658d0ef08c3c4ceebfcfdf.
Let me know if this fix aligns with what you have in mind. Also I suck at
naming, `preserve_left` is the best I can come up with, suggestions welcome :)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]