maropu commented on pull request #29360:
URL: https://github.com/apache/spark/pull/29360#issuecomment-673753587


   > But a shuffle happens during the Aggregate here, right? By splitting, the 
total amount of shuffled data is not changed; it is just split into several parts. 
Does it really result in a significant improvement?
   
   As @viirya said above, I have the same question. Why would this reduce the 
amount of shuffle writes? In the `expand -> partial aggregates` case, the partial 
aggregates seem to produce the same **total** amount of output.
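
   For context, here is a minimal sketch of the kind of plan being discussed: a 
query with multiple `DISTINCT` aggregates, which Spark rewrites into an `Expand` 
followed by partial aggregation before the shuffle. The table, column names, and 
data are hypothetical and only for illustration; this may not match the exact 
plan shape targeted by this PR:

```scala
import org.apache.spark.sql.SparkSession

object ExpandPartialAggSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("expand-partial-agg-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input data, purely for illustration.
    val events = Seq(
      ("u1", "click", "2020-08-01"),
      ("u1", "view",  "2020-08-01"),
      ("u2", "click", "2020-08-02")
    ).toDF("user_id", "action", "dt")
    events.createOrReplaceTempView("events")

    // Multiple DISTINCT aggregates over different columns are planned as
    // Expand -> partial HashAggregate -> shuffle -> final HashAggregate.
    val q = spark.sql(
      """SELECT dt,
        |       COUNT(DISTINCT user_id) AS users,
        |       COUNT(DISTINCT action)  AS actions
        |FROM events
        |GROUP BY dt""".stripMargin)

    // Inspect the physical plan to see the Expand node feeding the partial
    // aggregates whose combined output size is the point in question above.
    q.explain()
    q.show()

    spark.stop()
  }
}
```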




