Rachelint opened a new issue, #12596:
URL: https://github.com/apache/datafusion/issues/12596

   ### Is your feature request related to a problem or challenge?
   
   I implemented a proof of concept in https://github.com/apache/datafusion/pull/12526 
and found that this idea can actually improve performance.
   
   However, for the reasons stated in 
https://github.com/apache/datafusion/issues/11680#issuecomment-2368735093, 
I don't think this improvement is suitable to push forward at the moment.
   
   I am filing this issue to track it.
   
   
   ### Describe the solution you'd like
   
   - Introduce a partitioned hash table in `partial aggregation`, and 
partition the input data before inserting it into the hash table.
   - Afterwards, push the results into `final aggregation` partition by 
partition, rather than splitting them again in `repartition` and merging them 
again in `coalesce`.
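
The two points above can be illustrated with a minimal, standalone sketch. It is not DataFusion code and does not reflect the implementation in the PoC: `PartitionedPartialSum`, `partition_of`, and `final_merge` are hypothetical names, the aggregate is a plain `SUM` over string keys, and the standard library's `DefaultHasher` stands in for whatever hash function the engine would use. The only assumption that matters is that every partial aggregator maps a given key to the same partition index.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical partial aggregation state: one hash table per output partition.
pub struct PartitionedPartialSum {
    tables: Vec<HashMap<String, i64>>,
}

impl PartitionedPartialSum {
    pub fn new(num_partitions: usize) -> Self {
        assert!(num_partitions > 0, "need at least one partition");
        Self {
            tables: (0..num_partitions).map(|_| HashMap::new()).collect(),
        }
    }

    /// Pick the partition for a key. `DefaultHasher::new()` uses fixed keys,
    /// so every partial aggregator agrees on the partition of a given key.
    fn partition_of(&self, key: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() % self.tables.len() as u64) as usize
    }

    /// Partition the row first, then insert it into that partition's table.
    pub fn update(&mut self, key: &str, value: i64) {
        let p = self.partition_of(key);
        *self.tables[p].entry(key.to_string()).or_insert(0) += value;
    }

    /// Emit the partial results already split by partition, so no later
    /// repartitioning step is needed.
    pub fn into_partitions(self) -> Vec<HashMap<String, i64>> {
        self.tables
    }
}

/// Hypothetical final aggregation for ONE partition: merge the tables that
/// the partial aggregators produced for the same partition index.
pub fn final_merge(partials: Vec<HashMap<String, i64>>) -> HashMap<String, i64> {
    let mut out = HashMap::new();
    for table in partials {
        for (key, sum) in table {
            *out.entry(key).or_insert(0) += sum;
        }
    }
    out
}
```

In this sketch, each final aggregation task would receive the `i`-th table from every partial aggregator and call `final_merge` on them. Because a key can only ever live in one partition, the per-partition results are disjoint and need no further merging.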
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

