[ 
https://issues.apache.org/jira/browse/SPARK-9746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-9746.
------------------------------
    Resolution: Not A Problem

This should start as a question on user@. 
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

I am not sure what you're getting at, since your suggestion assumes the values 
are collections. In general they are not, and in any event the method counts 
occurrences of each key, so the values are irrelevant.
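
As a rough illustration of why the counts are not always 1, here is a minimal 
sketch that mimics the quoted implementation with plain Scala collections 
instead of an RDD. The helper name countByKeyLocal and the sample data are 
made up for this example and are not part of Spark. The groupBy step stands in 
for the shuffle that reduceByKey performs.

```scala
// Plain Scala collections standing in for an RDD[(K, V)]; no Spark required.
// `countByKeyLocal` is a hypothetical helper, not a Spark API.
object CountByKeySketch {
  def countByKeyLocal[K, V](pairs: Seq[(K, V)]): Map[K, Long] =
    pairs
      .map { case (k, _) => (k, 1L) }   // analogous to mapValues(_ => 1L): discard the value
      .groupBy(_._1)                    // bring records with equal keys together
      .map { case (k, ones) =>          // analogous to reduceByKey(_ + _): sum the 1s per key
        (k, ones.map(_._2).reduce(_ + _))
      }

  def main(args: Array[String]): Unit = {
    // The key "a" appears three times and "b" once; the Int values are never inspected.
    val pairs = Seq(("a", 10), ("b", 20), ("a", 30), ("a", 40))
    println(countByKeyLocal(pairs))
  }
}
```

With this input, "a" maps to 3 and "b" maps to 1, because each record 
contributes one 1L and the reduction adds them up per key. Note also that the 
values here are Ints, which have no size method, so mapValues(_.size) would 
not compile for a pair collection like this one.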

> PairRDDFunctions.countByKey: values/counts always 1
> ---------------------------------------------------
>
>                 Key: SPARK-9746
>                 URL: https://issues.apache.org/jira/browse/SPARK-9746
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.4.0
>            Reporter: Andreas
>
> org.apache.spark.rdd.PairRDDFunctions.countByKey(): Map[K, Long] = 
> self.withScope {
>     self.mapValues(_ => 1L).reduceByKey(_ + _).collect().toMap
>   }
> obviously always returns count 1 for each key.
> If I understand the docs correctly I would expect this implementation:
> self.mapValues(_.size).reduceByKey(_ + _).collect().toMap



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

