[ https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811664#comment-15811664 ]

Lubomir Nerad commented on SPARK-12414:
---------------------------------------

I am using a custom spark.closure.serializer to allow dynamic loading of 
classes in the application running on the Spark driver. The custom serializer 
ensures that the closures defined by these dynamically loaded classes are 
serialized together with additional information. This information is then used 
during deserialization on the executors to select or create the correct 
classloader for the closures. This makes it possible to have multiple 
classloaders per executor, which is needed to update the dynamic code and also 
allows the dynamic classes to be garbage collected when they are no longer 
needed. The support is currently at a prototype stage, but it seems to work 
correctly.
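
To make this concrete, here is a minimal sketch (Scala, against the 
org.apache.spark.serializer.Serializer API as it exists in Spark 1.x) of the 
general shape of such a tagging closure serializer. The LoaderRegistry object, 
the version-tag framing, and all class and method names here are illustrative 
assumptions, not Spark APIs:

{code:scala}
import java.io.{InputStream, OutputStream}
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8
import java.util.concurrent.ConcurrentHashMap

import scala.reflect.ClassTag

import org.apache.spark.SparkConf
import org.apache.spark.serializer._

// Hypothetical per-executor registry mapping a code-version tag to a
// classloader. Dropping an entry later is what allows the loader and its
// classes to be garbage collected.
object LoaderRegistry {
  private val loaders = new ConcurrentHashMap[String, ClassLoader]()

  def loaderFor(tag: String): ClassLoader = {
    val existing = loaders.get(tag)
    if (existing != null) existing
    else {
      val created = createLoader(tag) // e.g. a URLClassLoader over the jars for `tag`
      val prev = loaders.putIfAbsent(tag, created)
      if (prev != null) prev else created
    }
  }

  // Placeholder: a real implementation would fetch the code for `tag`.
  private def createLoader(tag: String): ClassLoader = getClass.getClassLoader

  // On the driver: derive the tag from the loader that defined the closure's class.
  def tagFor(cl: ClassLoader): String = String.valueOf(cl) // placeholder
}

class TaggingClosureSerializer(conf: SparkConf) extends Serializer {
  private val delegate = new JavaSerializer(conf)

  override def newInstance(): SerializerInstance = new SerializerInstance {
    private val inner = delegate.newInstance()

    // Frame layout: [tag length][tag bytes][Java-serialized payload].
    override def serialize[T: ClassTag](t: T): ByteBuffer = {
      val tag = LoaderRegistry.tagFor(t.getClass.getClassLoader).getBytes(UTF_8)
      val payload = inner.serialize(t)
      val framed = ByteBuffer.allocate(4 + tag.length + payload.remaining())
      framed.putInt(tag.length).put(tag).put(payload)
      framed.flip()
      framed
    }

    override def deserialize[T: ClassTag](bytes: ByteBuffer): T = {
      val tag = new Array[Byte](bytes.getInt())
      bytes.get(tag)
      // Resolve classes against the loader recorded at serialization time.
      inner.deserialize[T](bytes.slice(), LoaderRegistry.loaderFor(new String(tag, UTF_8)))
    }

    override def deserialize[T: ClassTag](bytes: ByteBuffer, loader: ClassLoader): T =
      deserialize[T](bytes) // the embedded tag takes precedence over the caller's loader

    // The stream paths simply delegate; a complete version would frame them as well.
    override def serializeStream(s: OutputStream): SerializationStream = inner.serializeStream(s)
    override def deserializeStream(s: InputStream): DeserializationStream = inner.deserializeStream(s)
  }
}
{code}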

Is there an alternative approach once the possibility of defining 
spark.closure.serializer is no longer available? I investigated using 
org.apache.spark.repl.ExecutorClassLoader for this, but it seems insufficient, 
because there is only one instance of it per executor and that instance never 
goes away (no updates to existing classes, no garbage collection when they are 
no longer needed).

Can't the decision to remove "spark.closure.serializer" be reconsidered? There 
is still "spark.serializer", and having the option to define both the data 
serializer and the closure serializer seems more consistent to me than having 
one of them fixed and the other configurable.
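
For example, configuring the two symmetrically in Spark 1.x looks like this 
(the second class name is just a stand-in for whatever custom serializer an 
application provides, such as the sketch above):

{code:scala}
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")  // data
  .set("spark.closure.serializer", "com.example.TaggingClosureSerializer") // closures
{code}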


> Remove closure serializer
> -------------------------
>
>                 Key: SPARK-12414
>                 URL: https://issues.apache.org/jira/browse/SPARK-12414
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Andrew Or
>            Assignee: Sean Owen
>             Fix For: 2.0.0
>
>
> There is a config `spark.closure.serializer` that accepts exactly one value: 
> the Java serializer. This is because there are currently bugs in the Kryo 
> serializer that make it not a viable candidate. This was uncovered by an 
> unsuccessful attempt to make it work: SPARK-7708.
> My high-level point is that the Java serializer has worked well for at least 
> 6 Spark versions now, and it is an incredibly complicated task to get other 
> serializers (not just Kryo) to work with Spark's closures. IMO the effort is 
> not worth it and we should just remove this documentation and all the code 
> associated with it.


