Dear List,

I'm writing an application where I have RDDs of protobuf messages.
When I run the app via bin/spark-submit with --master local
--driver-class-path path/to/my/uber.jar, Spark is able to
serialize and deserialize the messages correctly.
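
For reference, the working invocation looks roughly like this (the
main class is a placeholder; the app jar is my uber jar):

    bin/spark-submit \
      --master local \
      --driver-class-path path/to/my/uber.jar \
      --class com.example.MyApp \
      path/to/my/uber.jar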

However, if I run WITHOUT --driver-class-path path/to/my/uber.jar, or
if I try --master spark://my.master:7077, then I run into errors that
make it look like my protobuf message classes are not on the
classpath:

Exception in thread "main" org.apache.spark.SparkException: Job
aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most
recent failure: Lost task 0.0 in stage 1.0 (TID 0, localhost):
java.lang.RuntimeException: Unable to find proto buffer class
        com.google.protobuf.GeneratedMessageLite$SerializedForm.readResolve(GeneratedMessageLite.java:775)
        sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:606)
        java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1104)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1807)
        ...
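
For completeness, the failing invocations look roughly like this
(same placeholders as above):

    # Fails: local master, but no --driver-class-path
    bin/spark-submit \
      --master local \
      --class com.example.MyApp \
      path/to/my/uber.jar

    # Also fails: standalone master
    bin/spark-submit \
      --master spark://my.master:7077 \
      --class com.example.MyApp \
      path/to/my/uber.jar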

Why do I need --driver-class-path in the local scenario?  And how can
I ensure my classes are on the classpath no matter how my app is
submitted via bin/spark-submit (e.g. --master spark://my.master:7077)?
I've tried poking through the shell scripts and SparkSubmit.scala, but
unfortunately I haven't been able to grok exactly what Spark is doing
with the remote/local JVMs.
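
Is --jars the intended mechanism here?  This is an untested guess, but
my reading of the docs is that something like the following should ship
the uber jar to both the driver and the executors:

    bin/spark-submit \
      --master spark://my.master:7077 \
      --jars path/to/my/uber.jar \
      --class com.example.MyApp \
      path/to/my/uber.jar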

Cheers,
-Paul
