GitHub user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/20806

@viirya Yes, `treeAggregate` should only apply to global aggregation. But in this PR the API has to use `seqOp`/`combOp`. What I expect is that the DataFrame version of `treeAggregate` can exploit built-in aggregate functions (suppose that in the future we have built-in aggregate functions for vector types). If `dataset.groupBy()` is not given any key column, it groups the whole dataset, so it matches the `treeAggregate` case. Or do you have a better design?
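
To make the `seqOp`/`combOp` contract concrete, here is a rough pure-Python sketch of a global tree aggregation. It is only a toy model under my own assumptions, not Spark's implementation: `tree_aggregate`, `fan_in`, and the in-memory list-of-lists "partitions" are hypothetical names for illustration. `seq_op` folds the rows of one partition into an accumulator, and `comb_op` merges the partial accumulators level by level.

```python
from functools import reduce


def tree_aggregate(partitions, zero, seq_op, comb_op, fan_in=2):
    """Toy model of a global tree aggregation over in-memory partitions.

    Hypothetical helper for illustration only; not a Spark API.
    """
    # Step 1: fold each partition locally with seq_op, starting from zero.
    partials = [reduce(seq_op, part, zero) for part in partitions]
    # Step 2: merge the partial results level by level with comb_op,
    # combining at most fan_in partials per step.
    while len(partials) > 1:
        partials = [
            reduce(comb_op, partials[i:i + fan_in])
            for i in range(0, len(partials), fan_in)
        ]
    # An empty input has no partials, so fall back to the zero value.
    return partials[0] if partials else zero


def add_vectors(acc, v):
    # Element-wise sum; returns a new list so the zero value is never mutated.
    return [a + b for a, b in zip(acc, v)]


# Three "partitions" of 2-dimensional vectors (made-up sample data).
partitions = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0]],
    [[7.0, 8.0], [9.0, 10.0]],
]
zero = [0.0, 0.0]

# For a plain sum, the same function serves as both seq_op and comb_op.
total = tree_aggregate(partitions, zero, add_vectors, add_vectors)
print(total)  # [25.0, 30.0]
```

There is no grouping key anywhere in this sketch: the whole input collapses to a single result, which is the case an empty `groupBy()` would cover.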