In general, you can find out exactly what is not serializable by adding -Dsun.io.serialization.extendedDebugInfo=true to SPARK_JAVA_OPTS. Since a "this" reference to the enclosing class is often what causes the problem, a general workaround is to move the mapPartitions call to a static method (in Scala, a method on the companion object), where there is no "this" reference.

This transforms this:

    class A {
      def f() = rdd.mapPartitions(iter => ...)
    }

into this:

    class A {
      def f() = A.helper(rdd)
    }

    object A {
      def helper(rdd: RDD[...]) = rdd.mapPartitions(iter => ...)
    }
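To see why moving the closure helps, here is a minimal sketch that needs only the JDK and the Scala standard library, not Spark. The names Scaler, helper, and SerializationCheck are hypothetical and exist only for this illustration. It assumes plain Java serialization is a fair stand-in for what Spark does when it ships a closure to the executors, and it uses functions over Iterator[Int] in place of the function passed to mapPartitions.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical class that holds state and is NOT Serializable.
class Scaler(val factor: Int) {
  // Reads the field `factor`, so the closure captures `this` (a Scaler).
  def capturing: Iterator[Int] => Iterator[Int] =
    iter => iter.map(_ * factor)

  // Delegates to the companion object and passes only the Int it needs.
  def detached: Iterator[Int] => Iterator[Int] =
    Scaler.helper(factor)
}

object Scaler {
  // No enclosing instance here: the closure captures only the Int parameter.
  def helper(factor: Int): Iterator[Int] => Iterator[Int] =
    iter => iter.map(_ * factor)
}

object SerializationCheck {
  // Returns true if Java serialization accepts the object, false if it
  // rejects it with NotSerializableException.
  def isSerializable(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

With these definitions, SerializationCheck.isSerializable(new Scaler(3).capturing) should return false, because the closure drags the whole Scaler instance along, while the same check on new Scaler(3).detached should return true. Passing the needed value as an argument, rather than reading it from a field inside the closure, is what keeps the "this" reference out.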
-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Variables-outside-of-mapPartitions-scope-tp5517p5527.html Sent from the Apache Spark User List mailing list archive at Nabble.com.