[jira] [Commented] (SPARK-12414) Remove closure serializer

2017-06-19 Thread Ritesh Tijoriwala (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054529#comment-16054529
 ] 

Ritesh Tijoriwala commented on SPARK-12414:
---

I have a similar situation. I have several classes that I would like to 
instantiate and use in Executors for e.g. DB connections, Elasticsearch 
clients, etc. I don't want to write instantiation code in Spark functions and 
use "statics".  There was a nit trick suggested here - 
https://issues.apache.org/jira/browse/SPARK-650 but seems like this will not 
work starting 2.0.0 as a consequence of this ticket. 

Could anybody from Spark community recommend how to do some initialization on 
each Spark executor for the job before any task execution begins?

> Remove closure serializer
> -
>
> Key: SPARK-12414
> URL: https://issues.apache.org/jira/browse/SPARK-12414
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 1.0.0
>Reporter: Andrew Or
>Assignee: Sean Owen
> Fix For: 2.0.0
>
>
> There is a config `spark.closure.serializer` that accepts exactly one value: 
> the java serializer. This is because there are currently bugs in the Kryo 
> serializer that make it not a viable candidate. This was uncovered by an 
> unsuccessful attempt to make it work: SPARK-7708.
> My high level point is that the Java serializer has worked well for at least 
> 6 Spark versions now, and it is an incredibly complicated task to get other 
> serializers (not just Kryo) to work with Spark's closures. IMO the effort is 
> not worth it and we should just remove this documentation and all the code 
> associated with it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-12414) Remove closure serializer

2017-01-09 Thread Lubomir Nerad (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811664#comment-15811664
 ] 

Lubomir Nerad commented on SPARK-12414:
---

I am using a custom spark.closure.serializer to allow dynamic loading of 
classes in the application running on the Spark driver. The custom serializer 
makes sure that the closures defined by these dynamically loaded classes can be 
serialized with additional information. This information is then used during 
deserialization on executors to select or create the correct classloader for 
the closures. This enables to have multiple classloaders per executor which is 
needed for update of the dynamic code and also allows garbage collection of the 
dynamic classes when they are no longer necessary. Currently the support is in 
a prototype stage, but seems to be working correctly.

Is there an alternative approach for this when the possibility of defining 
spark.closure.serializer is no longer available? I investigated usage of the 
org.apache.spark.repl.ExecutorClassLoader for this, but that seems 
insufficient, because there is only one instance of it on the executor and that 
instance never goes away (no update to existing classes, no garbage collection 
if they are no longer needed).

Can't the decision to remove "spark.closure.serializer" be reconsidered? There 
is still the "spark.serializer" and having option to define both - the data and 
closure serializer seems more consistent to me than having one of them fixed 
and the other one configurable.


> Remove closure serializer
> -
>
> Key: SPARK-12414
> URL: https://issues.apache.org/jira/browse/SPARK-12414
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 1.0.0
>Reporter: Andrew Or
>Assignee: Sean Owen
> Fix For: 2.0.0
>
>
> There is a config `spark.closure.serializer` that accepts exactly one value: 
> the java serializer. This is because there are currently bugs in the Kryo 
> serializer that make it not a viable candidate. This was uncovered by an 
> unsuccessful attempt to make it work: SPARK-7708.
> My high level point is that the Java serializer has worked well for at least 
> 6 Spark versions now, and it is an incredibly complicated task to get other 
> serializers (not just Kryo) to work with Spark's closures. IMO the effort is 
> not worth it and we should just remove this documentation and all the code 
> associated with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-12414) Remove closure serializer

2016-04-13 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238787#comment-15238787
 ] 

Sean Owen commented on SPARK-12414:
---

Hm, I am not sure this was intended to allow third party serialized but to 
choose among the ones in Spark. I think there would be class loader 
complications and trouble with serializers having to know about internal Spark 
classes. If you want to try anyway can you have your classes implement custom 
serialization via Java's mechanism?

> Remove closure serializer
> -
>
> Key: SPARK-12414
> URL: https://issues.apache.org/jira/browse/SPARK-12414
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 1.0.0
>Reporter: Andrew Or
>Assignee: Sean Owen
> Fix For: 2.0.0
>
>
> There is a config `spark.closure.serializer` that accepts exactly one value: 
> the java serializer. This is because there are currently bugs in the Kryo 
> serializer that make it not a viable candidate. This was uncovered by an 
> unsuccessful attempt to make it work: SPARK-7708.
> My high level point is that the Java serializer has worked well for at least 
> 6 Spark versions now, and it is an incredibly complicated task to get other 
> serializers (not just Kryo) to work with Spark's closures. IMO the effort is 
> not worth it and we should just remove this documentation and all the code 
> associated with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-12414) Remove closure serializer

2016-04-12 Thread Dubkov Mikhail (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238171#comment-15238171
 ] 

Dubkov Mikhail commented on SPARK-12414:


[~srowen], [~andrewor14],

As I see, you just hard coded spark.closure.serializer use JavaSerializer 
implementation, that's why I have a question:

On our project we use custom spark.serializer that has own requirements for 
objects serialization which can differs from JavaSerializer requirements. For 
example, our serializer  not requires "implements Serializable", unless 
JavaSerializer does.

The question is will application fails once we upgrade to Spark 2.0.0 ?
Because now, if we don't define serializer as "spark.closure.serializer" 
application fails with 'Caused by: java.io.NotSerializableException'

Could you please explain how it will work after changes in scope of this task?

Thanks!

> Remove closure serializer
> -
>
> Key: SPARK-12414
> URL: https://issues.apache.org/jira/browse/SPARK-12414
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 1.0.0
>Reporter: Andrew Or
>Assignee: Sean Owen
> Fix For: 2.0.0
>
>
> There is a config `spark.closure.serializer` that accepts exactly one value: 
> the java serializer. This is because there are currently bugs in the Kryo 
> serializer that make it not a viable candidate. This was uncovered by an 
> unsuccessful attempt to make it work: SPARK-7708.
> My high level point is that the Java serializer has worked well for at least 
> 6 Spark versions now, and it is an incredibly complicated task to get other 
> serializers (not just Kryo) to work with Spark's closures. IMO the effort is 
> not worth it and we should just remove this documentation and all the code 
> associated with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-12414) Remove closure serializer

2016-02-10 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140696#comment-15140696
 ] 

Apache Spark commented on SPARK-12414:
--

User 'srowen' has created a pull request for this issue:
https://github.com/apache/spark/pull/11150

> Remove closure serializer
> -
>
> Key: SPARK-12414
> URL: https://issues.apache.org/jira/browse/SPARK-12414
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 1.0.0
>Reporter: Andrew Or
>Assignee: Andrew Or
>
> There is a config `spark.closure.serializer` that accepts exactly one value: 
> the java serializer. This is because there are currently bugs in the Kryo 
> serializer that make it not a viable candidate. This was uncovered by an 
> unsuccessful attempt to make it work: SPARK-7708.
> My high level point is that the Java serializer has worked well for at least 
> 6 Spark versions now, and it is an incredibly complicated task to get other 
> serializers (not just Kryo) to work with Spark's closures. IMO the effort is 
> not worth it and we should just remove this documentation and all the code 
> associated with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-12414) Remove closure serializer

2015-12-30 Thread Andrew Or (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075814#comment-15075814
 ] 

Andrew Or commented on SPARK-12414:
---

It's also for code cleanup. Right now SparkEnv has a "closure serializer" and a 
"serializer", which is kind of confusing. We should just use Java serializer 
since it's worked for such a long time. Not sure about Kryo 3.0 but I'm not 
sure if upgrading would be sufficient.

> Remove closure serializer
> -
>
> Key: SPARK-12414
> URL: https://issues.apache.org/jira/browse/SPARK-12414
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 1.0.0
>Reporter: Andrew Or
>Assignee: Andrew Or
>
> There is a config `spark.closure.serializer` that accepts exactly one value: 
> the java serializer. This is because there are currently bugs in the Kryo 
> serializer that make it not a viable candidate. This was uncovered by an 
> unsuccessful attempt to make it work: SPARK-7708.
> My high level point is that the Java serializer has worked well for at least 
> 6 Spark versions now, and it is an incredibly complicated task to get other 
> serializers (not just Kryo) to work with Spark's closures. IMO the effort is 
> not worth it and we should just remove this documentation and all the code 
> associated with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-12414) Remove closure serializer

2015-12-30 Thread Reynold Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075638#comment-15075638
 ] 

Reynold Xin commented on SPARK-12414:
-

Is it possible to use Kryo 3.0 to do this? I'm not sure wether it gains us much 
to actually remove the option. We can certainly remove the documentation of it.


> Remove closure serializer
> -
>
> Key: SPARK-12414
> URL: https://issues.apache.org/jira/browse/SPARK-12414
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 1.0.0
>Reporter: Andrew Or
>Assignee: Andrew Or
>
> There is a config `spark.closure.serializer` that accepts exactly one value: 
> the java serializer. This is because there are currently bugs in the Kryo 
> serializer that make it not a viable candidate. This was uncovered by an 
> unsuccessful attempt to make it work: SPARK-7708.
> My high level point is that the Java serializer has worked well for at least 
> 6 Spark versions now, and it is an incredibly complicated task to get other 
> serializers (not just Kryo) to work with Spark's closures. IMO the effort is 
> not worth it and we should just remove this documentation and all the code 
> associated with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org