Dear List,

I'm writing an application where I have RDDs of protobuf messages. When I run the app via bin/spark-submit with --master local --driver-class-path path/to/my/uber.jar, Spark is able to serialize and deserialize the messages correctly.
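For concreteness, the RDDs look roughly like this (just a sketch: MyMessage stands in for one of my generated protobuf classes, and rawBytes for the raw payloads):

    import org.apache.spark.rdd.RDD

    // Assumes sc: SparkContext and rawBytes: Seq[Array[Byte]] already exist.
    val messages: RDD[MyMessage] =
      sc.parallelize(rawBytes).map(bytes => MyMessage.parseFrom(bytes))

    // A shuffle (or a collect) forces Spark to serialize and
    // deserialize the message objects:
    messages.groupBy(_.getSerializedSize).count()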
However, if I run WITHOUT --driver-class-path path/to/my/uber.jar, or if I try --master spark://my.master:7077, then I run into errors that make it look like my protobuf message classes are not on the classpath:

    Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 0, localhost): java.lang.RuntimeException: Unable to find proto buffer class
        com.google.protobuf.GeneratedMessageLite$SerializedForm.readResolve(GeneratedMessageLite.java:775)
        sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:606)
        java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1104)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1807)
        ...

Why do I need --driver-class-path in the local scenario? And how can I ensure my classes are on the classpath no matter how my app is submitted via bin/spark-submit (e.g. --master spark://my.master:7077)? I've tried poking through the shell scripts and SparkSubmit.scala, but unfortunately I haven't been able to grok exactly what Spark is doing with the remote/local JVMs.

Cheers,
-Paul
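P.S. Would the right fix be to ship the uber jar to the executors explicitly, either via spark-submit's --jars flag or programmatically from the driver? Here's a minimal, untested sketch of the programmatic route (the app name is a placeholder; the master URL and jar path are just the values from above):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("proto-app")              // placeholder name
      .setMaster("spark://my.master:7077")
      // Distribute the uber jar so the generated protobuf
      // classes end up on every executor's classpath.
      .setJars(Seq("path/to/my/uber.jar"))
    val sc = new SparkContext(conf)

    // Equivalently, after the context exists:
    // sc.addJar("path/to/my/uber.jar")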