[jira] [Commented] (SPARK-22594) Handling spark-submit and master version mismatch

2017-11-24 Thread Jiri Kremser (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-22594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265242#comment-16265242 ]

Jiri Kremser commented on SPARK-22594:
--

To be clear, this doesn't try to solve or somehow provide a compatibility layer 
between old and new versions of Spark; all it does is slightly improve the 
UX, because people are hitting the issue all the time (including me).

A couple of instances of this issue:

https://stackoverflow.com/questions/32241485/apache-spark-error-local-class-incompatible-when-initiating-a-sparkcontext-clas
https://stackoverflow.com/questions/38559597/failed-to-connect-to-spark-masterinvalidclassexception-org-apache-spark-rpc-rp
https://issues.apache.org/jira/browse/SPARK-13956
https://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Spark-Standalone-error-local-class-incompatible-stream-classdesc/td-p/25909
https://github.com/USCDataScience/sparkler/issues/56
https://groups.google.com/a/lists.datastax.com/forum/#!topic/spark-connector-user/Z-4qSGbqhYc
https://sparkr.atlassian.net/browse/SPARKR-72

More here:
https://www.google.com/search?ei=BxMYWt3xLJDdwQKLwr9w&q=java.io.InvalidClassException%3A+org.apache.spark.rpc.RpcEndpointRef%3B+local+class+incompatible&oq=java.io.InvalidClassException%3A+org.apache.spark.rpc.RpcEndpointRef%3B+local+class+incompatible
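
To quickly confirm that two installations disagree, one can print the UID each side derives for {{RpcEndpointRef}} (a suggestion, runnable in spark-shell on both sides; compare the printed value against the two UIDs in the log):

{code}
// Prints the serialVersionUID the local Spark build uses for
// RpcEndpointRef; if the two sides print different numbers, you are
// hitting exactly this mismatch.
import java.io.ObjectStreamClass
println(ObjectStreamClass.lookup(
  Class.forName("org.apache.spark.rpc.RpcEndpointRef")).getSerialVersionUID)
{code}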


[jira] [Commented] (SPARK-22594) Handling spark-submit and master version mismatch

2017-11-23 Thread Jiri Kremser (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-22594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16264596#comment-16264596 ]

Jiri Kremser commented on SPARK-22594:
--

PR here: https://github.com/apache/spark/pull/19802

It's still WIP; if this is a wanted change, I am prepared to also add tests.





[jira] [Created] (SPARK-22594) Handling spark-submit and master version mismatch

2017-11-23 Thread Jiri Kremser (JIRA)
Jiri Kremser created SPARK-22594:


 Summary: Handling spark-submit and master version mismatch
 Key: SPARK-22594
 URL: https://issues.apache.org/jira/browse/SPARK-22594
 Project: Spark
  Issue Type: Bug
  Components: Spark Core, Spark Shell, Spark Submit
Affects Versions: 2.2.0, 2.1.0
Reporter: Jiri Kremser
Priority: Minor


When using spark-submit in a different version than the remote Spark master, the 
execution fails during message deserialization with this log entry / exception:

{code}
Error while invoking RpcHandler#receive() for one-way message.
java.io.InvalidClassException: org.apache.spark.rpc.RpcEndpointRef; local class incompatible: stream classdesc serialVersionUID = 1835832137613908542, local class serialVersionUID = -1329125091869941550
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:616)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1843)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1713)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1843)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1713)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2000)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:108)
    at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1$$anonfun$apply$1.apply(NettyRpcEnv.scala:271)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:320)
    at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1.apply(NettyRpcEnv.scala:270)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:269)
    at org.apache.spark.rpc.netty.RequestMessage$.apply(NettyRpcEnv.scala:604)
    at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:655)
    at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:647)
    at org.apache.spark.network.server.TransportRequestHandler.processOneWayMessage(TransportRequestHandler.java:209)
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:114)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
...
{code}


This is quite OK and can be read as a version mismatch between the client and the 
server; however, there is no such message on the client (spark-submit) side, 
so if the submitter doesn't have access to the Spark master or the Spark UI, 
there is no way to figure out what is wrong.
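
For context on the mechanism: the master deserializes the request with plain Java serialization, and judging by the two different UIDs in the log, no explicit serialVersionUID is pinned on {{RpcEndpointRef}}, so each build derives its own UID from the class shape. A minimal, self-contained Scala sketch of that failure mode ({{Endpoint}} is a hypothetical stand-in for {{RpcEndpointRef}}):

{code}
import java.io._

// Hypothetical stand-in for org.apache.spark.rpc.RpcEndpointRef: no
// explicit serialVersionUID, so the JVM derives one from the class
// shape, and any change between builds yields a different UID.
class Endpoint(val name: String) extends Serializable

object SerialUidDemo {
  def main(args: Array[String]): Unit = {
    // The UID this build writes into the stream for Endpoint.
    val uid = ObjectStreamClass.lookup(classOf[Endpoint]).getSerialVersionUID
    println(s"stream classdesc serialVersionUID = $uid")

    // Round-tripping works only while writer and reader share the same
    // class bytes; a reader holding a recompiled Endpoint (different
    // derived UID) fails readObject() with exactly the
    // "local class incompatible" InvalidClassException from the log.
    val bos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    oos.writeObject(new Endpoint("master"))
    oos.close()
    val ois = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray))
    println(ois.readObject().asInstanceOf[Endpoint].name)
  }
}
{code}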

I propose sending an {{RpcFailure}} message back from the server to the client with a 
more informative error. I'd rather use {{OneWayMessage}} than {{RpcFailure}}, 
because there was no counterpart {{RpcRequest}}, but I had no luck sending it 
using {{reverseClient.send()}}. I think some internal protocol is assumed when 
sending messages from server to client.
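
To illustrate the shape of the idea, here is a rough sketch only, not the actual patch: it is modeled on the 2.x {{NettyRpcHandler.internalReceive}} seen in the stack trace, and {{encodeErrorForClient}} is a made-up helper, since the reply has to be framed in Spark's internal wire protocol, which is exactly where plain {{reverseClient.send()}} gave me no luck:

{code}
// Sketch (inside NettyRpcEnv.scala): wrap the request deserialization
// and push a human-readable hint back over the client's channel
// before rethrowing, so spark-submit has something to show.
private def internalReceive(
    client: TransportClient, message: ByteBuffer): RequestMessage = {
  try {
    RequestMessage(nettyEnv, client, message)
  } catch {
    case e: java.io.InvalidClassException =>
      val hint = s"Spark master ${org.apache.spark.SPARK_VERSION} failed to " +
        s"deserialize the request (${e.getMessage}); the client is most " +
        "likely running a different Spark version."
      // encodeErrorForClient is hypothetical: the payload must be framed
      // in Spark's internal RPC protocol for the client to decode it.
      client.send(encodeErrorForClient(hint))
      throw e
  }
}
{code}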

I have a patch prepared.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org