Re: how to specify columns in groupby

2014-08-29 Thread MEETHU MATHEW
Thank you, Yanbo, for the reply. I have another query, related to cogroup: I want to iterate over the results of a cogroup operation. My code is:

    grp = RDD1.cogroup(RDD2)
    map((lambda (x,y): (x,list(y[0]),list(y[1]))), list(grp))

My result looks like: [((u'764', u'20140826'),

how to specify columns in groupby

2014-08-28 Thread MEETHU MATHEW
Hi all, I have an RDD which has values in the format id,date,cost. I want to group the elements based on the id and date columns and get the sum of the cost for each group. Can somebody tell me how to do this? Thanks & Regards, Meethu M

Re: how to specify columns in groupby

2014-08-28 Thread Yanbo Liang
For your reference:

    val d1 = textFile.map(line => {
      val fields = line.split(",")
      ((fields(0), fields(1)), fields(2).toDouble)
    })
    val d2 = d1.reduceByKey(_ + _)
    d2.foreach(println)

2014-08-28 20:04 GMT+08:00 MEETHU MATHEW meethu2...@yahoo.co.in:
Hi all, I have an RDD
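
For readers following along in Python, the keying idea in the Scala snippet above can be sketched without Spark: build a composite (id, date) key from each line and sum the cost per key. This is only an illustration of the logic, not Spark code; the function name and the sample records are hypothetical. In PySpark the same shape would presumably be a map to ((id, date), cost) pairs followed by reduceByKey with an addition lambda.

```python
from collections import defaultdict


def sum_cost_by_id_and_date(lines):
    """Group "id,date,cost" lines by (id, date) and sum the cost per group."""
    totals = defaultdict(float)
    for line in lines:
        fields = line.split(",")
        key = (fields[0], fields[1])      # composite key: (id, date)
        totals[key] += float(fields[2])   # accumulate cost for that key
    return dict(totals)


# Hypothetical sample records in the id,date,cost format from the question.
sample = ["764,20140826,10.5", "764,20140826,4.5", "765,20140826,3.0"]
result = sum_cost_by_id_and_date(sample)
print(result)
```

The composite key is an ordinary tuple, so grouping on more columns only means adding more fields to it.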