Xiangrui Meng created SPARK-5954:
------------------------------------

             Summary: Add topByKey to pair RDDs
                 Key: SPARK-5954
                 URL: https://issues.apache.org/jira/browse/SPARK-5954
             Project: Spark
          Issue Type: New Feature
          Components: Spark Core
            Reporter: Xiangrui Meng

`topByKey(num: Int): RDD[(K, V)]` finds the top-k values for each key in a pair RDD, where k is given by `num`. This is useful, for example, when computing top recommendations. We can use the Guava implementation for finding the top-k elements from an iterator.

See also: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/collection/Utils.scala
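
To make the intended semantics concrete, here is a minimal sketch in plain Scala collections, with no Spark dependency. It is not the proposed implementation: the object name `TopByKeySketch`, the `Seq`/`Map` types, and the grouped per-key return shape are assumptions for illustration, and it uses `scala.collection.mutable.PriorityQueue` as a bounded min-heap in place of the Guava helper mentioned above.

```scala
import scala.collection.mutable

// Hypothetical illustration only: local collections stand in for a pair RDD.
object TopByKeySketch {

  // Returns, for each key, its `num` largest values in descending order.
  def topByKey[K, V](pairs: Seq[(K, V)], num: Int)(implicit ord: Ordering[V]): Map[K, Seq[V]] = {
    val heaps = mutable.Map.empty[K, mutable.PriorityQueue[V]]
    for ((k, v) <- pairs) {
      // Reversed ordering turns the queue into a min-heap:
      // `head` is the smallest value currently retained for this key.
      val heap = heaps.getOrElseUpdate(k, mutable.PriorityQueue.empty[V](ord.reverse))
      if (heap.size < num) {
        heap.enqueue(v)
      } else if (num > 0 && ord.gt(v, heap.head)) {
        // The heap is full and `v` beats the smallest retained value: swap them.
        heap.dequeue()
        heap.enqueue(v)
      }
    }
    heaps.map { case (k, h) => k -> h.toSeq.sorted(ord.reverse) }.toMap
  }

  def main(args: Array[String]): Unit = {
    // Made-up (user, score) pairs, as in a recommendation setting.
    val scores = Seq(("alice", 4), ("alice", 3), ("alice", 5), ("bob", 2))
    println(topByKey(scores, 2))
  }
}
```

Each key holds at most `num` values at any time, so memory per key stays bounded regardless of how many values the key has. On an actual pair RDD, the same insert step and a heap-merge step could plausibly be supplied as the `seqOp` and `combOp` of `aggregateByKey`, though that wiring is not shown here.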