RE: newbie question for reduce

2022-01-27 Thread Christopher Robson
Subject: newbie question for reduce

Hello,

Could you please take a look at why this simple reduce doesn't work?

>>> rdd = sc.parallelize([("a",1),("b",2),("c",3)])
>>>
>>> rdd.reduce(lambda x,y: x[1]+y[1])
Traceback (most recent ca

Re: newbie question for reduce

2022-01-18 Thread Sean Owen
The problem is that you are reducing a list of tuples, but you are producing an int. The resulting int can't be combined with other tuples with your function. reduce() has to produce the same type as its arguments.

    rdd.map(lambda x: x[1]).reduce(lambda x,y: x+y)

... would work

On Tue, Jan 18,
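
The point about types can be checked without a cluster. What follows is a minimal sketch, not PySpark itself: it assumes Python's functools.reduce as a local stand-in for rdd.reduce, and the names pairs, broken_sum, and fixed_sum are made up for illustration. Spark reduces within each partition and then combines the partial results, so the exact step at which it fails may differ, but the type mismatch is the same.

```python
from functools import reduce

# Same data as in the question, held in a plain list instead of an RDD.
pairs = [("a", 1), ("b", 2), ("c", 3)]


def broken_sum(items):
    # Same lambda as in the question: takes two tuples, returns an int.
    return reduce(lambda x, y: x[1] + y[1], items)


def fixed_sum(items):
    # Local analogue of rdd.map(lambda x: x[1]).reduce(lambda x, y: x + y):
    # extract the ints first, so the reduce function maps (int, int) -> int.
    values = map(lambda x: x[1], items)
    return reduce(lambda x, y: x + y, values)


try:
    broken_sum(pairs)
    broken_error = None
except TypeError as exc:
    # The first step yields 1 + 2 = 3 (an int). The second step then tries
    # to index that int with [1], which raises TypeError.
    broken_error = exc

total = fixed_sum(pairs)
print("broken version raised:", type(broken_error).__name__)
print("fixed version returned:", total)
```

The design choice is to change the element type before reducing rather than to make the lambda handle both tuples and ints. Once every element is an int, the function's output can always be fed back in as an input, which is what reduce requires.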

newbie question for reduce

2022-01-18 Thread capitnfrakass
Hello,

Could you please take a look at why this simple reduce doesn't work?

    rdd = sc.parallelize([("a",1),("b",2),("c",3)])
    rdd.reduce(lambda x,y: x[1]+y[1])

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/spark/python/pyspark/rdd.py", line 1001, in reduce
    return