Re: [I] Manage group values and states by blocks in aggregation [datafusion]

via GitHub Sat, 07 Mar 2026 00:26:03 -0800


Dandandan commented on issue #11931:
URL: https://github.com/apache/datafusion/issues/11931#issuecomment-4015952467


   Implementing this aggregation approach 
https://github.com/apache/datafusion/issues/20773 would also help keep the 
per-aggregation state small in the partial aggregate (and hopefully improve 
performance). The approach could be repeated in the final / final partitioned 
aggregation. I'm not sure why that is not done in the paper; I guess it is 
because in 90% of cases the reduction factor and partitioning reduce the number 
of distinct values enough to make the final aggregate small.
   
   While it is still a large feature, I think implementing it might be more 
local to aggregation (no changes needed to groupvalues / accumulators, etc.).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


