Here is some Python code; I am sure you will get the drift. Basically, you need to implement two functions, seq and comb, which perform the partial (within-partition) and final (cross-partition) aggregation, respectively.

    def addtup(t1, t2):
        # Element-wise sum of two equal-length tuples
        return tuple(a + b for a, b in zip(t1, t2))

    def seq(tIntrm, tNext):
        # Fold the next value into the running total for a key
        return addtup(tIntrm, tNext)

    def comb(tP, tF):
        # Merge two partial totals for the same key
        return addtup(tP, tF)

    lst = [(2553, (0,0,0,1,0,0,0,0)),
           (46551, (0,1,0,0,0,0,0,0)),
           (266, (0,1,0,0,0,0,0,0)),
           (2553, (0,0,0,0,0,1,0,0)),
           (225546, (0,0,0,0,0,1,0,0)),
           (225546, (0,0,0,0,0,1,0,0))]

    # sc is the SparkContext, e.g. the one provided by the pyspark shell
    base = sc.parallelize(lst)
    res = base.aggregateByKey((0,0,0,0,0,0,0,0), seq, comb)
    for i in res.collect():
        print(i)

Result:

    (266, (0, 1, 0, 0, 0, 0, 0, 0))
    (225546, (0, 0, 0, 0, 0, 2, 0, 0))
    (2553, (0, 0, 0, 1, 0, 1, 0, 0))
    (46551, (0, 1, 0, 0, 0, 0, 0, 0))

On Thu, May 14, 2015 at 11:40 PM, Yasemin Kaya <godo...@gmail.com> wrote:

> Hi,
>
> I have JavaPairRDD<String, String> and I want to implement reduceByKey
> method.
>
> My pairRDD :
> *2553: 0,0,0,1,0,0,0,0*
> 46551: 0,1,0,0,0,0,0,0
> 266: 0,1,0,0,0,0,0,0
> *2553: 0,0,0,0,0,1,0,0*
>
> *225546: 0,0,0,0,0,1,0,0*
> *225546: 0,0,0,0,0,1,0,0*
>
> I want to get :
> *2553: 0,0,0,1,0,1,0,0*
> 46551: 0,1,0,0,0,0,0,0
> 266: 0,1,0,0,0,0,0,0
> *225546: 0,0,0,0,0,2,0,0*
>
> Anyone can help me getting that?
> Thank you.
>
> Have a nice day.
> yasemin
>
> --
> hiç ender hiç
>

--
Best Regards,
Ayan Guha
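
P.S. In case it helps to see what seq and comb each do without a cluster, here is a rough local sketch in plain Python. Note that aggregate_by_key below is a hypothetical stand-in I wrote for illustration, not Spark's implementation, and the split of the list into two "partitions" is an arbitrary assumption; Spark decides the real partitioning itself.

```python
def addtup(t1, t2):
    # Element-wise sum of two equal-length tuples
    return tuple(a + b for a, b in zip(t1, t2))

def aggregate_by_key(partitions, zero, seq, comb):
    # Hypothetical local imitation of aggregateByKey semantics (not Spark code).
    # Step 1: within each partition, fold every value into a per-key
    # accumulator that starts from the zero value, using seq.
    partials = []
    for part in partitions:
        acc = {}
        for key, value in part:
            acc[key] = seq(acc.get(key, zero), value)
        partials.append(acc)
    # Step 2: merge the per-partition accumulators key by key, using comb.
    merged = {}
    for acc in partials:
        for key, value in acc.items():
            merged[key] = comb(merged[key], value) if key in merged else value
    return merged

lst = [(2553, (0, 0, 0, 1, 0, 0, 0, 0)),
       (46551, (0, 1, 0, 0, 0, 0, 0, 0)),
       (266, (0, 1, 0, 0, 0, 0, 0, 0)),
       (2553, (0, 0, 0, 0, 0, 1, 0, 0)),
       (225546, (0, 0, 0, 0, 0, 1, 0, 0)),
       (225546, (0, 0, 0, 0, 0, 1, 0, 0))]

# Assumed split: the first three records in one partition, the rest in another
partitions = [lst[:3], lst[3:]]
result = aggregate_by_key(partitions, (0,) * 8, addtup, addtup)

for item in sorted(result.items()):
    print(item)
```

With this split, key 2553 is seen once in each partition, so its total only comes together in the comb step, while both 225546 records sit in the same partition and are summed by seq alone. The per-key totals should match the Spark result above, though the print order will differ.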