Thank you, Yanbo, for the reply. I have another query, related to cogroup: I want to iterate over the results of a cogroup operation.
My code is:

    grp = RDD1.cogroup(RDD2)
    map((lambda (x,y): (x,list(y[0]),list(y[1]))), list(grp))

My result looks like:

    [((u'764', u'20140826'), [0.70146274566650391], []),
     ((u'863', u'20140826'), [0.368011474609375], []),
     ((u'9571520', u'20140826'), [0.0046129226684570312], [0.60000000000000009])]

When I do one more cogroup operation, like

    grp1 = grp.cogroup(RDD3)

I am not able to see the results. All my RDDs are of the form ((x,y),z). Can somebody help me solve this?

Thanks & Regards,
Meethu M

On Thursday, 28 August 2014 5:59 PM, Yanbo Liang <yanboha...@gmail.com> wrote:

For your reference:

    val d1 = textFile.map(line => {
      val fields = line.split(",")
      ((fields(0), fields(1)), fields(2).toDouble)
    })
    val d2 = d1.reduceByKey(_+_)
    d2.foreach(println)

2014-08-28 20:04 GMT+08:00 MEETHU MATHEW <meethu2...@yahoo.co.in>:

> Hi all,
>
> I have an RDD which has values in the format "id,date,cost".
>
> I want to group the elements based on the id and date columns and get the sum
> of the cost for each group.
>
> Can somebody tell me how to do this?
>
> Thanks & Regards,
> Meethu M
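
For anyone following the thread in Python rather than Scala, here is a minimal sketch of the same group-and-sum on "id,date,cost" records. It runs on a plain in-memory list, not on an RDD, so it only illustrates the keying and summing logic. The sample records and the helper names parse and sum_by_key are made up for illustration and are not part of Spark.

```python
from collections import defaultdict

# Hypothetical sample input in the "id,date,cost" format from the question.
lines = [
    "764,20140826,0.5",
    "764,20140826,0.25",
    "863,20140826,1.5",
]

def parse(line):
    """Split one "id,date,cost" record into ((id, date), cost)."""
    rec_id, date, cost = line.split(",")
    return (rec_id, date), float(cost)

def sum_by_key(pairs):
    """Sum the values for each key, mirroring reduceByKey with addition."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

totals = sum_by_key(parse(line) for line in lines)
for key, total in sorted(totals.items()):
    print(key, total)

# On an actual RDD of lines (assumed here to be called text_file), the same
# shape would likely be:
#   text_file.map(parse).reduceByKey(lambda a, b: a + b).collect()
```

The composite (id, date) tuple is used as the key, which is the same form, ((x,y),z), as the RDDs described at the top of this message.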