GitHub user viirya commented on the issue: https://github.com/apache/spark/pull/20806

@WeichenXu123 I feel `groupBy` is a more SQL-like aggregation, in which we can specify a key to group by. `rdd.treeAggregate`, at least, does not support key-specified aggregation. For typed grouping, `groupByKey` constructs a `KeyValueGroupedDataset`, which relies on SQL `Aggregate` execution to group the data. Currently that execution path doesn't support tree-based aggregation. This work doesn't intend to overhaul SQL aggregation to support tree-based aggregation, so the API will look much as it does now.
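To make the distinction concrete, here is a minimal pure-Python sketch of the two aggregation shapes. It is only an illustration of the idea, not Spark's implementation: `tree_aggregate`, `grouped_aggregate`, `fan_in`, and the sample data are hypothetical names, and partitions are modeled as plain lists.

```python
from functools import reduce


def tree_aggregate(partitions, zero, seq_op, comb_op, fan_in=2):
    """Keyless aggregation: fold each partition, then merge partials in levels."""
    # Step 1: fold every partition locally into one partial result.
    # (Assumes `zero` is immutable or seq_op does not mutate it.)
    partials = [reduce(seq_op, part, zero) for part in partitions]
    if not partials:
        return zero
    # Step 2: combine partial results `fan_in` at a time until one remains,
    # so no single step has to merge all partitions at once.
    while len(partials) > 1:
        partials = [
            reduce(comb_op, partials[i:i + fan_in])
            for i in range(0, len(partials), fan_in)
        ]
    return partials[0]


def grouped_aggregate(partitions, key_fn, zero, seq_op):
    """Key-specified aggregation: one accumulator per key, no tree merge."""
    acc = {}
    for part in partitions:
        for row in part:
            k = key_fn(row)
            acc[k] = seq_op(acc.get(k, zero), row)
    return acc


partitions = [[1, 2], [3, 4], [5, 6], [7]]
total = tree_aggregate(partitions, 0, lambda a, x: a + x, lambda a, b: a + b)
by_parity = grouped_aggregate(partitions, lambda x: x % 2, 0, lambda a, x: a + x)
```

The tree version yields a single value for the whole dataset, while the grouped version yields one value per key. That difference is why a tree-based API does not map directly onto `groupBy`.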