Xiang Gao created SPARK-17185:
---------------------------------

             Summary: Unify naming of API for RDD and Dataset
                 Key: SPARK-17185
                 URL: https://issues.apache.org/jira/browse/SPARK-17185
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core, SQL
            Reporter: Xiang Gao


In RDD, groupByKey produces key-list pairs and aggregateByKey performs
aggregation.
In Dataset, aggregation is done through groupBy and groupByKey, and no API is
provided for producing key-list pairs.
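
For reference, the four calls mentioned above can be put side by side. This is
only a minimal sketch, assuming Spark 2.x with a local SparkSession; the object
name GroupingApis and the sample data in pairs are hypothetical and are not
part of the original report.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical demo object, for illustration only.
object GroupingApis {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("grouping-apis")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data.
    val pairs = Seq(("a", 1), ("a", 2), ("b", 3))

    // RDD: groupByKey yields key-list pairs, aggregateByKey aggregates.
    val rdd: RDD[(String, Int)] = spark.sparkContext.parallelize(pairs)
    val keyLists: RDD[(String, Iterable[Int])] = rdd.groupByKey()
    val sums: RDD[(String, Int)] = rdd.aggregateByKey(0)(_ + _, _ + _)

    // Dataset: both groupBy and groupByKey lead into aggregation.
    val ds: Dataset[(String, Int)] = pairs.toDS()
    val untypedCounts: DataFrame = ds.groupBy("_1").count()
    val typedCounts: Dataset[(String, Long)] = ds.groupByKey(_._1).count()

    keyLists.collect().foreach(println)
    sums.collect().foreach(println)
    untypedCounts.show()
    typedCounts.show()

    spark.stop()
  }
}
```

The type annotations are the point of the sketch: only the RDD groupByKey
returns the grouped values themselves.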

The same name "groupBy" does different things in the two APIs, which might be
confusing. In addition, it would be more convenient to provide an API that
generates key-list pairs for Dataset.
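
Until such an API exists, key-list pairs can be built by hand on a Dataset.
The following is a sketch of one possible workaround using groupByKey followed
by mapGroups, assuming Spark 2.x; it is not the API proposed here, and the
object name KeyListWorkaround and the sample data are hypothetical.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical demo object, for illustration only.
object KeyListWorkaround {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("key-list-workaround")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data.
    val ds: Dataset[(String, Int)] = Seq(("a", 1), ("a", 2), ("b", 3)).toDS()

    // Group by the first field, then collect each group's values into a Seq.
    // mapGroups hands over an Iterator, so it is materialized explicitly.
    val keyLists: Dataset[(String, Seq[Int])] =
      ds.groupByKey(_._1).mapGroups { (key, rows) =>
        (key, rows.map(_._2).toVector: Seq[Int])
      }

    keyLists.collect().foreach(println)

    spark.stop()
  }
}
```

This works, but it takes two calls and an explicit conversion where the RDD
API needs a single groupByKey, which is the inconvenience described above.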



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

