Thanks Davies and Eric. I followed Davies' instructions and it works wonderfully.
I would add that you can also load these scripts into the pyspark shell:

pyspark --py-files support.py

where support.py is the script containing your class, as Davies described.

Best,

Guillaume Guy
+1 919 - 972 - 8750

On Wed, Feb 18, 2015 at 11:48 PM, Davies Liu <dav...@databricks.com> wrote:

> Currently, PySpark cannot pickle a class object defined in the current
> script ('__main__'). The workaround is to put the implementation of the
> class into a separate module, then use "bin/spark-submit --py-files xxx.py"
> to deploy it.
>
> in xxx.py:
>
> class test(object):
>     def __init__(self, a, b):
>         self.total = a + b
>
> in job.py:
>
> from xxx import test
> a = sc.parallelize([(True,False),(False,False)])
> a.map(lambda (x,y): test(x,y))
>
> run it by:
>
> bin/spark-submit --py-files xxx.py job.py
>
>
> On Wed, Feb 18, 2015 at 1:48 PM, Guillaume Guy
> <guillaume.c....@gmail.com> wrote:
> > Hi,
> >
> > This is a duplicate of the stack-overflow question here. I hope to
> > generate more interest on this mailing list.
> >
> > The problem:
> >
> > I am running into some attribute lookup problems when trying to
> > instantiate a class within my RDD.
> >
> > My workflow is quite standard:
> >
> > 1- Start with an RDD
> >
> > 2- Take each element of the RDD and instantiate an object for each
> >
> > 3- Reduce (I will write a method that defines the reduce operation
> > later on)
> >
> > Here is #2:
> >
> > class test(object):
> >     def __init__(self, a,b):
> >         self.total = a + b
> >
> > a = sc.parallelize([(True,False),(False,False)])
> > a.map(lambda (x,y): test(x,y))
> >
> > Here is the error I get:
> >
> > PicklingError: Can't pickle <class '__main__.test'>: attribute lookup
> > __main__.test failed
> >
> > I'd like to know if there is any way around it. Please answer with a
> > working example that achieves the intended result (i.e. creating an RDD
> > of objects of class "test").
> >
> > Thanks in advance!
> >
> > Related question:
> >
> > https://groups.google.com/forum/#!topic/edx-code/9xzRJFyQwn
> >
> > GG
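
For anyone finding this thread later, here is a minimal sketch of why moving the class into its own module helps. It does not use Spark at all. It only uses the standard pickle module, on the assumption that the failure in the original post comes from the same lookup-by-name rule that plain pickle applies to classes. The module name support.py and the class test come from the thread. The helpers load_support_module, round_trip, make_hidden_class and the class hidden are hypothetical names made up for this illustration. The list comprehension is a Python 3 stand-in for the `lambda (x,y): ...` form, which is Python 2 only.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Stand-in for the support.py / xxx.py module described in the thread.
MODULE_SOURCE = '''
class test(object):
    def __init__(self, a, b):
        self.total = a + b
'''


def load_support_module(directory):
    """Write support.py into `directory` and import it by name (hypothetical helper)."""
    path = os.path.join(directory, "support.py")
    with open(path, "w") as handle:
        handle.write(MODULE_SOURCE)
    sys.path.insert(0, directory)
    importlib.invalidate_caches()
    return importlib.import_module("support")


def round_trip(obj):
    """Pickle and unpickle obj, roughly what happens when data moves between processes."""
    return pickle.loads(pickle.dumps(obj))


def make_hidden_class():
    """Return a class that cannot be found again by module-level name lookup."""
    class hidden(object):
        pass
    return hidden


support = load_support_module(tempfile.mkdtemp())

pairs = [(True, False), (False, False)]
# The class lives in an importable module, so pickle can refer to it by name.
objects = [support.test(x, y) for (x, y) in pairs]
totals = [round_trip(o).total for o in objects]
print(totals)  # prints [1, 0]

# A class that cannot be looked up by name fails to pickle. The exact
# exception type differs between Python versions, so both are caught.
try:
    pickle.dumps(make_hidden_class()())
    lookup_failed = False
except (pickle.PicklingError, AttributeError):
    lookup_failed = True
print(lookup_failed)  # prints True
```

With Spark, --py-files plays the role that sys.path.insert plays in this sketch: it makes the module importable wherever the objects are unpickled.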