Shushant Arora created SPARK-12524:
--------------------------------------

             Summary: Group by key in a pairrdd without any shuffle
                 Key: SPARK-12524
                 URL: https://issues.apache.org/jira/browse/SPARK-12524
             Project: Spark
          Issue Type: Improvement
          Components: Build, Java API
    Affects Versions: 1.5.2
            Reporter: Shushant Arora

In a PairRDD<K,V>, when all values of the same key are already in the same partition, we may want to group by key locally, with no reduce/aggregation operation afterwards, only further transformations on the grouped RDD. There is no facility for that. We have to perform a shuffle, which is costly.
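
To make the request concrete, here is a minimal sketch of the per-partition grouping described above, written in plain Java without a Spark dependency. The class name LocalGroupByKey and the helper groupLocally are hypothetical, and java.util.Map.Entry stands in for Spark's scala.Tuple2. It assumes every value for a given key already sits in the partition being processed and that each partition's groups fit in memory.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LocalGroupByKey {

    // Hypothetical helper: groups the records of ONE partition by key, in memory.
    // Assumption: all values of a key are already co-located in this partition,
    // so no data has to move between partitions.
    public static <K, V> Iterator<Map.Entry<K, List<V>>> groupLocally(
            Iterator<Map.Entry<K, V>> records) {
        // LinkedHashMap keeps keys in first-seen order, which makes output predictable.
        Map<K, List<V>> groups = new LinkedHashMap<>();
        while (records.hasNext()) {
            Map.Entry<K, V> record = records.next();
            groups.computeIfAbsent(record.getKey(), k -> new ArrayList<>())
                  .add(record.getValue());
        }
        return groups.entrySet().iterator();
    }

    public static void main(String[] args) {
        // Stand-in for the contents of a single partition.
        List<Map.Entry<String, Integer>> partition = new ArrayList<>();
        partition.add(new SimpleImmutableEntry<>("a", 1));
        partition.add(new SimpleImmutableEntry<>("b", 2));
        partition.add(new SimpleImmutableEntry<>("a", 3));

        Iterator<Map.Entry<String, List<Integer>>> grouped =
                groupLocally(partition.iterator());
        while (grouped.hasNext()) {
            Map.Entry<String, List<Integer>> group = grouped.next();
            System.out.println(group.getKey() + " -> " + group.getValue());
        }
    }
}
```

In a Spark job, a function of this shape could presumably be applied once per partition, for example through a mapPartitions-style transformation that preserves partitioning. The exact function interface differs between Spark versions, so treat that wiring as an assumption rather than a tested recipe.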