I'm trying to define a class that has some of Spark's objects as attributes, and I'm running into a problem that I think would be solved if I could find Python's equivalent of Scala's `extends Serializable`.
Here's a simple class that has a Spark RDD as one of its attributes:

    class Foo:
        def __init__(self):
            self.rdd = sc.parallelize([1, 2, 3, 4, 5])

        def combine(self, first, second):
            return first + second

        def f1(self):
            return self.rdd.reduce(lambda a, b: self.combine(a, b))

When I try

    b = Foo()
    b.f1()

I get the error:

    PicklingError: Can't pickle builtin <type 'method_descriptor'>

My guess is that this has to do with serialization of the class I created, and an error somewhere in that process. So how can I use Spark's RDD methods (such as reduce()) in conjunction with the methods of the class I've created (combine() here)?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/pyspark-equivalent-to-Extends-Serializable-tp23933.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.