Dandandan commented on issue #11931: URL: https://github.com/apache/datafusion/issues/11931#issuecomment-4015952467
Implementing the aggregation approach from https://github.com/apache/datafusion/issues/20773 would also help keep the per-aggregation state small in the partial aggregate (and hopefully improve performance). The approach could be repeated in the final / final partitioned aggregation. I am not sure why the paper does not do this; I guess it is because in 90% of cases the reduction factor and partitioning reduce the number of distinct values enough to keep the final aggregate small. While it is still a large feature to implement, I think the implementation might be more local to aggregation (no changes needed to groupvalues / accumulators, etc.).

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
