Hi Lee, it's actually not related to threading at all - you would still have 
the same problem even if you were using a single thread. See the "Passing 
Functions to Spark" section of the Spark docs: 
https://spark.apache.org/docs/latest/programming-guide.html#passing-functions-to-spark
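
To make the single-thread point concrete, here is a minimal sketch in plain Python with no Spark involved. The Rollup class and all names in it are hypothetical, and pickle only stands in for whatever serializer Spark applies to closures, so treat this as an analogy rather than Spark's actual mechanism. A function that reaches through an object drags the whole object along, while a function that holds only a copied value does not:

```python
import functools
import operator
import pickle
import threading


class Rollup:
    """Hypothetical class that owns a resource which cannot be serialized."""

    def __init__(self, factor):
        self.factor = factor
        # Stand-in for a context or connection; locks cannot be pickled.
        self.lock = threading.Lock()

    def scale(self, x):
        # Uses self.factor, so shipping this method means shipping all of self.
        return x * self.factor


def can_serialize(func):
    """Return True if pickle can serialize func, False otherwise."""
    try:
        pickle.dumps(func)
        return True
    except Exception:
        return False


rollup = Rollup(3)

# Bound method: serializing it pulls in rollup, including the lock.
bound_ok = can_serialize(rollup.scale)

# Copy only the value that is needed; nothing refers back to rollup.
scale_by = functools.partial(operator.mul, rollup.factor)
copied_ok = can_serialize(scale_by)

print(bound_ok, copied_ok)
```

Everything above runs on one thread, and the bound method still fails to serialize. The usual fix is the same idea as in the docs section: copy the fields you need into local values so the function no longer refers to the enclosing object.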

On June 5, 2015, at 5:12 PM, Lee McFadden <splee...@gmail.com> wrote:

On Fri, Jun 5, 2015 at 2:05 PM Will Briggs <wrbri...@gmail.com> wrote:

Your lambda expressions on the RDDs in the SecondRollup class are closing 
over the context, and Spark has special logic to ensure that all variables in 
a closure used on an RDD are Serializable. I hate linking to Quora, but 
there's a good explanation here: 
http://www.quora.com/What-does-Closure-cleaner-func-mean-in-Spark


Ah, I see!  So if I broke out the lambda expressions into a method on an object 
it would prevent this issue.  Essentially, "don't use lambda expressions when 
using threads".


Thanks again, I appreciate the help. 
