Are you by any chance looking for reduceByKey? If you’re trying to collapse all the values in V into an aggregate, that’s what you should be looking at.
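As a rough illustration of the reduceByKey suggestion, here is a minimal sketch. It assumes the four-column rows are tuples of (id, key, x, y) and that the aggregate wanted is a sum of the third column per key; the `Record` alias, the `sumByKey` helper, and the sample data are hypothetical and do not come from the original question.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object ReduceByKeySketch {
  // Hypothetical row layout: (id, key, x, y). The thread only says "four columns".
  type Record = (Int, String, Double, Double)

  // Key each row by its second column, keep the third column as the value,
  // and let reduceByKey combine the values of each key pairwise.
  def sumByKey(rows: RDD[Record]): RDD[(String, Double)] =
    rows.map(r => (r._2, r._3)).reduceByKey(_ + _)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("reduceByKey-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Made-up sample data, for illustration only.
    val a = sc.parallelize(Seq[Record](
      (1, "k1", 2.0, 10.0),
      (2, "k1", 4.0, 20.0),
      (3, "k2", 6.0, 30.0)
    ))

    // Prints one (key, sum) pair per distinct key.
    sumByKey(a).collect().foreach(println)
    sc.stop()
  }
}
```

Unlike groupBy, reduceByKey combines values on each partition before shuffling, so the full set of rows for a key never has to be held in memory at once. It only fits when the statistic can be built up by merging two partial results.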
-adrian

From: Ted Yu
Date: Monday, October 19, 2015 at 9:16 PM
To: Shepherd
Cc: user
Subject: Re: How to calculate row by now and output retults in Spark

Under core/src/test/scala/org/apache/spark, you will find a lot of examples of the map function.

FYI

On Mon, Oct 19, 2015 at 10:35 AM, Shepherd <cheng...@huawei.com> wrote:

Hi all,

I am new to Spark and Scala, and I have a question about doing a calculation.

I am using "groupBy" to generate key-value pairs, where each value points to a subset of the original RDD. The RDD has four columns, and each subset may have a different number of rows.

For example, the original code looks like this:

    val b = a.groupBy(_._2)
    val res = b.map{case (k, v) => v.map(func)}

Here, I don't know how to write func. I have to go through each row in v and calculate a statistical result. How can I do that? And how can I write a function inside map?

Thanks a lot.
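One possible way to write the function asked about above is sketched here. Since `groupBy` yields pairs of (key, Iterable of rows), a per-group statistic is easier to express as a function over the whole Iterable than as a function applied to each row with `v.map(func)`. The tuple layout, the `Record` alias, the `Stats` case class, the `summarize` name, and the choice of count, mean, and max over the third column are all assumptions made for illustration; the original question does not say which statistic is needed.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object GroupStatsSketch {
  // Hypothetical row layout: (id, key, x, y). The thread only says "four columns".
  type Record = (Int, String, Double, Double)

  // Hypothetical summary of one group.
  case class Stats(count: Int, mean: Double, max: Double)

  // Receives every row of one group and returns a single summary.
  // groupBy never produces an empty group, so dividing by size is safe here.
  def summarize(rows: Iterable[Record]): Stats = {
    val xs = rows.map(_._3)
    Stats(xs.size, xs.sum / xs.size, xs.max)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("groupBy-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Made-up sample data, for illustration only.
    val a = sc.parallelize(Seq[Record](
      (1, "k1", 2.0, 10.0),
      (2, "k1", 4.0, 20.0),
      (3, "k2", 6.0, 30.0)
    ))

    val b = a.groupBy(_._2)                               // RDD[(String, Iterable[Record])]
    val res = b.map { case (k, v) => (k, summarize(v)) }  // one Stats per key

    res.collect().foreach(println)
    sc.stop()
  }
}
```

Because `summarize` is an ordinary Scala function over an Iterable, it can be checked on a plain `Seq` without starting Spark. If the groups are large and the statistic can be merged from partial results, reduceByKey or aggregateByKey avoids materializing each group.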