Hi Will,

That doesn't seem to be the case, and it was part of the source of my
confusion. The code currently in the run method of the Runnable works
perfectly fine with the lambda expressions when it is invoked from the main
method. The lambdas also work when they are invoked from within a separate
method on the Transforms object.

It was only when putting that same code into another thread that the
serialization exception occurred.

Examples throughout the Spark docs also use lambda expressions heavily -
surely those examples would not work either if this were always an issue
with lambdas?
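For what it's worth, the capture behavior being discussed can be reproduced
without Spark or threads at all: in Java, a lambda that reads an instance
field implicitly captures `this`, so serializing the lambda requires the
whole enclosing object to be serializable. A minimal sketch (all names here
are hypothetical stand-ins, not from the actual code):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ClosureDemo {

    // Stand-in for Spark's serializable function interfaces (hypothetical).
    interface SerFn extends Serializable {
        String apply(int n);
    }

    // Stand-in for non-serializable state, e.g. a context object (hypothetical).
    static class Context {
        final String tag = "ctx";
    }

    static class Job {
        private final Context ctx = new Context();

        // This lambda reads the field `ctx`, so it implicitly captures `this`;
        // serializing it tries to serialize the whole non-serializable Job.
        SerFn badFn() {
            return n -> ctx.tag + "-" + n;
        }

        // Copying the needed value into a local variable first keeps the
        // capture small: the lambda holds only a String, which serializes fine.
        SerFn goodFn() {
            final String tag = ctx.tag;
            return n -> tag + "-" + n;
        }
    }

    static boolean serializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException
            return false;
        }
    }

    public static void main(String[] args) {
        Job job = new Job();
        System.out.println(serializable(job.badFn()));  // false: captured `this`
        System.out.println(serializable(job.goodFn())); // true: captured only a String
    }
}
```

This matches the advice in the quoted reply below: moving the lambda's body
into a method on a static object, or copying the needed fields into locals
first, keeps the enclosing object out of the closure Spark has to ship.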

On Sat, Jun 6, 2015, 12:21 AM Will Briggs <wrbri...@gmail.com> wrote:

> Hi Lee, it's actually not related to threading at all - you would still
> have the same problem even if you were using a single thread. See this
> section (
> https://spark.apache.org/docs/latest/programming-guide.html#passing-functions-to-spark)
> of the Spark docs.
>
>
> On June 5, 2015, at 5:12 PM, Lee McFadden <splee...@gmail.com> wrote:
>
>
> On Fri, Jun 5, 2015 at 2:05 PM Will Briggs <wrbri...@gmail.com> wrote:
>
>> Your lambda expressions on the RDDs in the SecondRollup class are closing
>> around the context, and Spark has special logic to ensure that all
>> variables in a closure used on an RDD are Serializable - I hate linking to
>> Quora, but there's a good explanation here:
>> http://www.quora.com/What-does-Closure-cleaner-func-mean-in-Spark
>>
>
> Ah, I see!  So if I broke out the lambda expressions into a method on an
> object it would prevent this issue.  Essentially, "don't use lambda
> expressions when using threads".
>
> Thanks again, I appreciate the help.
>