Aaron,

On Thu, Jan 15, 2015 at 5:05 PM, Aaron Davidson <ilike...@gmail.com> wrote:

> Scala for-loops are implemented as closures using anonymous inner classes
> which are instantiated once and invoked many times. This means, though,
> that the code inside the loop is actually sitting inside a class, which
> confuses Spark's Closure Cleaner, whose job is to remove unused references
> from closures to make otherwise-unserializable objects serializable.
>
> My understanding is, in particular, that the closure cleaner will null out
> unused fields in the closure, but cannot go past the first level of depth
> (i.e., it will not follow field references and null out *their* unused,
> and possibly unserializable, references), because this could end up
> mutating state outside of the closure itself. Thus, the extra level of
> depth of the closure that was introduced by the anonymous class (where
> presumably the "outer this" pointer is considered "used" by the closure
> cleaner) is sufficient to make it unserializable.
>

Now, two weeks later, let me add that this is one of the most helpful
comments I have received on this mailing list! This insight has saved me
90% of the time I used to spend debugging NotSerializableExceptions.
Thank you very much!
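
For anyone finding this thread later, here is a minimal sketch of the
underlying capture problem as I understand it. It uses plain Java
serialization rather than Spark's closure cleaner, and the names Job,
offset, ClosureDemo and isSerializable are made up for illustration. The
point is only that a closure reading a field drags in the enclosing
instance, while a closure reading a local copy does not:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical driver-side class; deliberately NOT Serializable.
class Job {
  val offset: Int = 10

  // Reads the field `offset`, i.e. `this.offset`, so the closure captures `this`.
  def capturesOuter: Int => Int = (x: Int) => x + offset

  // Copies the field into a local val first; the closure captures only an Int.
  def capturesLocal: Int => Int = {
    val localOffset = offset
    (x: Int) => x + localOffset
  }
}

object ClosureDemo {
  // Returns true if plain Java serialization accepts the object.
  def isSerializable(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(obj); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    val job = new Job
    println(isSerializable(job.capturesOuter)) // closure holds a reference to Job
    println(isSerializable(job.capturesLocal)) // closure holds only an Int
  }
}
```

I would expect the first check to fail and the second to succeed. A
for-loop body adds one more enclosing class around code like this, which,
if I follow Aaron's explanation correctly, is the extra level the closure
cleaner does not look through.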

Tobias
