Hi,
I am trying to test the mapPartitions function in Spark's Python API (PySpark), but I
got a wrong result.
More specifically, in the pyspark shell:
>>> rdd = sc.parallelize([1, 2, 3, 4], 2)
>>> def f(iterator): yield sum(iterator)
...
>>> rdd.mapPartitions(f).collect()
The result is [0, 10], not the expected [3, 7] (the sums of the two
partitions [1, 2] and [3, 4]).
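For reference, here is a plain-Python sketch (no Spark) of the behavior I expected, assuming parallelize with 2 slices splits [1, 2, 3, 4] evenly into [1, 2] and [3, 4]:

```python
# Sketch of the expected mapPartitions behavior, without Spark.
# Assumption: sc.parallelize([1, 2, 3, 4], 2) yields partitions [1, 2] and [3, 4].
def f(iterator):
    yield sum(iterator)

partitions = [[1, 2], [3, 4]]
# mapPartitions applies f to each partition's iterator and concatenates the outputs
result = [x for part in partitions for x in f(iter(part))]
print(result)  # [3, 7]
```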
Is there anything wrong with my code?
Thanks!


-- 

Shangyu, Luo
Department of Computer Science
Rice University
