Rachelint commented on issue #12596: URL: https://github.com/apache/datafusion/issues/12596#issuecomment-2372803765
> > Introduce the partitioned hashtable in partial aggregation, and we partition the data before inserting them into hashtable.
> >
> > And we push them into final aggregation partition by partition after, rather than split them again in repartition, and merge them again in coalesce.
> >
> > I'm not clear on how this proposal works. Could you please explain why it provides performance benefits compared to partial aggregation, exchange, and final aggregation? Is the proposal aimed explicitly at accelerating high cardinality aggregation, or is it intended to enhance aggregation performance?

I think it enhances aggregation performance in general.

Currently:

- We can think of `GroupValues` and `GroupAccumulator` as using a single `Vec` to manage intermediate states in `partial aggr`.
- After finishing work in `partial aggr`, we pass the `batch` to `exchange`, where we recompute the `hashes` of the `batch`. The `hashes` were already computed in `GroupValues`, so this recomputation is `the first avoidable cpu cost`.
- Then we split the `batch` into multiple `batches` according to the `partition numbers` computed from the `hashes`. The split has to create multiple new `batches` to hold the values from the source `batch` and copy the data into them, and that is `the second avoidable cpu cost`.
- Finally, before passing data to the `final aggr` of a partition, we first have to copy that partition's small split `batches` into `coalesce` until the buffer is large enough (usually the default batch size, 8192), and that is `the third avoidable cpu cost`.

After using the partitioned approach in `GroupValues` and `GroupAccumulator`:

- We can naturally reuse the `hashes` already computed in `GroupValues` when calculating the `partition numbers` of the `batches`.
- We store the intermediate states in `partial aggr` partition by partition. When we submit them to `final aggr`, we just submit them partition by partition, rather than splitting first and merging afterwards.
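
To make the idea concrete, here is a minimal sketch of partition-aware state for a `COUNT(*) ... GROUP BY key` partial aggregation. It is not DataFusion code: `PartitionedCounts`, `hash_key`, `partition_of`, `update` and `emit` are hypothetical names, a `std` `HashMap` stands in for `GroupValues`/`GroupAccumulator`, and `hash % num_partitions` is only an assumed routing rule.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical partial-aggregation state: one table per output partition
/// instead of a single table that is split again after the fact.
struct PartitionedCounts {
    partitions: Vec<HashMap<String, u64>>,
}

impl PartitionedCounts {
    fn new(num_partitions: usize) -> Self {
        assert!(num_partitions > 0, "need at least one partition");
        Self {
            partitions: (0..num_partitions).map(|_| HashMap::new()).collect(),
        }
    }

    /// Hash of the group key, computed once per input row by this code.
    fn hash_key(key: &str) -> u64 {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        hasher.finish()
    }

    /// Assumed routing rule: the same hash also selects the output partition.
    fn partition_of(&self, hash: u64) -> usize {
        (hash % self.partitions.len() as u64) as usize
    }

    /// Route the row to its partition, then update that partition's state.
    /// Note: the std HashMap hashes the key again internally; a real
    /// implementation would probe a raw table with the precomputed hash.
    fn update(&mut self, key: &str) {
        let hash = Self::hash_key(key);
        let p = self.partition_of(hash);
        *self.partitions[p].entry(key.to_string()).or_insert(0) += 1;
    }

    /// Emit intermediate states partition by partition; element `i` is
    /// already the input for final aggregation partition `i`, so nothing
    /// has to be split and merged again afterwards.
    fn emit(self) -> Vec<Vec<(String, u64)>> {
        self.partitions
            .into_iter()
            .map(|table| table.into_iter().collect())
            .collect()
    }
}
```

Calling `update` once per input row and then `emit` yields one list of `(key, count)` pairs per partition, and every key appears in exactly one of them. The sketch only illustrates the data layout; it does not measure or demonstrate the CPU savings discussed above.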