[ https://issues.apache.org/jira/browse/SPARK-9746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-9746.
------------------------------
    Resolution: Not A Problem

This should start as a question on user@.
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

I am not sure what you're getting at since your suggestion assumes the values are collections. They are not in general, but, the method is in any event counting keys. Values are irrelevant.

> PairRDDFunctions.countByKey: values/counts always 1
> ---------------------------------------------------
>
>                 Key: SPARK-9746
>                 URL: https://issues.apache.org/jira/browse/SPARK-9746
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.4.0
>            Reporter: Andreas
>
> org.apache.spark.rdd.PairRDDFunctions.countByKey(): Map[K, Long] =
> self.withScope {
>   self.mapValues(_ => 1L).reduceByKey(_ + _).collect().toMap
> }
> obviously always returns count 1 for each key.
> If I understand the docs correctly I would expect this implementation:
> self.mapValues(_.size).reduceByKey(_ + _).collect().toMap
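
To illustrate the point made in the resolution, here is a minimal sketch of why the quoted implementation yields the number of occurrences of each key rather than always 1. It uses plain Scala collections as a stand-in for an RDD of pairs, so it runs without Spark; the object name `CountByKeySketch` and the helper `countByKey` are hypothetical and only mimic the `mapValues(_ => 1L).reduceByKey(_ + _)` pipeline from the report.

```scala
// Plain Scala collections standing in for an RDD[(K, V)]; this is not Spark code.
object CountByKeySketch {

  // Hypothetical helper that mirrors the shape of
  //   self.mapValues(_ => 1L).reduceByKey(_ + _).collect().toMap
  def countByKey[K, V](pairs: Seq[(K, V)]): Map[K, Long] =
    pairs
      .map { case (k, _) => (k, 1L) }   // like mapValues(_ => 1L): every value becomes 1
      .groupBy(_._1)                    // gather the 1s that belong to the same key
      .map { case (k, ones) => (k, ones.map(_._2).sum) } // like reduceByKey(_ + _)

  def main(args: Array[String]): Unit = {
    // The key "a" occurs three times, the key "b" once; the values are never inspected.
    val pairs = Seq(("a", "x"), ("b", "y"), ("a", "z"), ("a", "w"))
    val counts = countByKey(pairs)
    assert(counts == Map("a" -> 3L, "b" -> 1L))
    println(counts)
  }
}
```

Each pair contributes a single 1, and the per-key sum adds up those 1s, so a key that appears n times ends with count n. The alternative proposed in the report, `mapValues(_.size)`, would only compile when the value type has a `size` member, which matches the remark that values are not collections in general.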