Hello, I have input lines like the ones below.

*Input*

    t1, file1, 1, 1, 1
    t1, file1, 1, 2, 3
    t1, file2, 2, 2, 2, 2
    t2, file1, 5, 5, 5
    t2, file2, 1, 1, 2, 2

I want to produce output like the rows below, which are the vertical (element-wise) sums of the corresponding numbers for each file.

*Output*

    “file1” : [ 1+1+5, 1+2+5, 1+3+5 ]
    “file2” : [ 2+1, 2+1, 2+2, 2+2 ]

I am in a Spark Streaming context and I am having a hard time figuring out how to group by file name. It seems I will need something like the line below, but I am not sure how to get to the correct syntax. Any input would be helpful.

    myDStream.foreachRDD(rdd => rdd.groupBy())

I know how to compute the vertical sum of a given collection of number lists, but I am not sure how to feed that function to the group-by.

    import scala.collection.mutable.ArrayBuffer

    def compute_counters(counts : ArrayBuffer[List[Int]]) = {
      counts.toList.transpose.map(_.sum)
    }

~Thanks,
Vinti
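
For reference, the grouping and vertical sum described above can be sketched on plain Scala collections, which stand in here for the contents of one RDD. The names `VerticalSum`, `parseLine`, `addLists`, and `sumByFile` are hypothetical, and the sketch assumes every line has the form `timestamp, file, n1, n2, ...` with equally long number lists per file.

```scala
object VerticalSum {
  // Parse "t1, file1, 1, 1, 1" into ("file1", List(1, 1, 1)).
  // The leading timestamp field is dropped (assumption: it is not needed for the sum).
  def parseLine(line: String): (String, List[Int]) = {
    val fields = line.split(",").map(_.trim).toList
    (fields(1), fields.drop(2).map(_.toInt))
  }

  // Element-wise sum of two lists. zip truncates to the shorter list,
  // so rows for the same file are assumed to have the same length.
  def addLists(a: List[Int], b: List[Int]): List[Int] =
    a.zip(b).map { case (x, y) => x + y }

  // Group the parsed rows by file name, then fold each group with addLists.
  def sumByFile(lines: Seq[String]): Map[String, List[Int]] =
    lines
      .map(parseLine)
      .groupBy(_._1)
      .map { case (file, rows) => file -> rows.map(_._2).reduce(addLists) }

  def main(args: Array[String]): Unit = {
    val input = Seq(
      "t1, file1, 1, 1, 1",
      "t1, file1, 1, 2, 3",
      "t1, file2, 2, 2, 2, 2",
      "t2, file1, 5, 5, 5",
      "t2, file2, 1, 1, 2, 2"
    )
    // Prints:
    // file1 : List(7, 8, 9)
    // file2 : List(3, 3, 4, 4)
    sumByFile(input).toSeq.sortBy(_._1).foreach {
      case (file, sums) => println(s"$file : $sums")
    }
  }
}
```

If the stream is first mapped to `(file, List[Int])` pairs with something like `parseLine`, Spark's `reduceByKey` on the resulting pair RDD or pair DStream could take `addLists` directly in place of the `groupBy` plus `reduce` used here. Note that on a DStream this would sum within each batch only; whether that matches the intended result is an assumption.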