Andrew Or created SPARK-7121:
--------------------------------
Summary: ClosureCleaner does not handle nesting properly
Key: SPARK-7121
URL: https://issues.apache.org/jira/browse/SPARK-7121
Project: Spark
Issue Type: Bug
Reporter: Andrew Or
Assignee: Andrew Or
For instance, in SparkContext, I tried to do the following:
{code}
def scope[T](body: => T): T = body // no-op wrapper
def myCoolMethod(path: String): RDD[String] = scope {
  parallelize(1 to 10).map { _ => path } // inner closure needs only `path`
}
{code}
and I got an exception complaining that SparkContext is not serializable. The
issue is that the inner closure gets its `path` from the outer closure (the
`scope` body), but the outer closure references the SparkContext instance
itself in order to call `parallelize`.
Note, however, that the inner closure doesn't actually need the SparkContext;
it only needs a field of the outer closure. If we modify ClosureCleaner to
clean the outer closure recursively, keeping only the fields that the inner
closure actually accesses, then the inner closure becomes serializable.
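The recursive cleaning idea can be sketched roughly as follows. This is a minimal sketch, not Spark's actual ClosureCleaner implementation: the helper name {{cleanRecursively}}, the {{accessedFields}} map, and the plain-reflection field nulling are all hypothetical simplifications (Spark does the field analysis with ASM bytecode inspection). It walks the {{$outer}} chain of a closure and nulls out every outer field that the inner closure never reads, so non-serializable references like the SparkContext get dropped:

{code}
// Hedged sketch of recursive outer-closure cleaning (not Spark's real code).
// For each level of the $outer chain, null out the fields that the inner
// closure does not access, then recurse into the next outer level.
def cleanRecursively(
    closure: AnyRef,
    accessedFields: Map[Class[_], Set[String]]): Unit = {
  for (field <- closure.getClass.getDeclaredFields
       if field.getName == "$outer") {
    field.setAccessible(true)
    val outer = field.get(closure)
    if (outer != null) {
      val keep = accessedFields.getOrElse(outer.getClass, Set.empty[String])
      for (f <- outer.getClass.getDeclaredFields
           if f.getName != "$outer" && !keep.contains(f.getName)) {
        f.setAccessible(true)
        f.set(outer, null) // drop unused references, e.g. the SparkContext
      }
      cleanRecursively(outer, accessedFields)
    }
  }
}
{code}

In this sketch, applying {{cleanRecursively}} to the inner closure above would null the outer closure's SparkContext reference while preserving {{path}}, which is exactly what makes the inner closure serializable.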
This is blocking my effort on a visualization task.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)