Dandandan commented on code in PR #18630:
URL: https://github.com/apache/datafusion/pull/18630#discussion_r2526799488


##########
datafusion/physical-optimizer/src/coalesce_batches.rs:
##########
@@ -60,8 +60,7 @@ impl PhysicalOptimizerRule for CoalesceBatches {
             // wrap those ones with a CoalesceBatchesExec operator. An 
alternate approach here
             // would be to build the coalescing logic directly into the 
operators
             // See https://github.com/apache/datafusion/issues/139
-            let wrap_in_coalesce = 
plan_any.downcast_ref::<FilterExec>().is_some()
-                || plan_any.downcast_ref::<HashJoinExec>().is_some()
+            let wrap_in_coalesce = 
plan_any.downcast_ref::<HashJoinExec>().is_some()

Review Comment:
   Yes - I tried to apply this optimization to `RepartitionExec` but 
unfortunately got some regressions (my feeling is due to some decreased 
parallelism / work distribution).
   
   HashJoinExec would be a great candidate as it might be possible to avoid the 
`take` followed by `concat` which might speed up things.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to