Hi, I am trying to use spark-submit with cluster deploy mode on a single node, but I keep getting the ClassNotFoundException shown below (in this case, snakeyaml.jar is not found on the Spark cluster).
===
16/03/12 14:19:12 INFO Remoting: Starting remoting
16/03/12 14:19:12 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://Driver@192.168.1.2:52993]
16/03/12 14:19:12 INFO util.Utils: Successfully started service 'Driver' on port 52993.
16/03/12 14:19:12 INFO worker.WorkerWatcher: Connecting to worker akka.tcp://sparkWorker@192.168.1.2:52985/user/Worker
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
        at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.NoClassDefFoundError: org/yaml/snakeyaml/Yaml
        at com.analytics.config.YamlConfigLoader.loadConfig(YamlConfigLoader.java:30)
        at com.analytics.api.DeclarativeAnalyticsFactory.create(DeclarativeAnalyticsFactory.java:21)
        at com.analytics.program.QueryExecutor.main(QueryExecutor.java:12)
        ... 6 more
Caused by: java.lang.ClassNotFoundException: org.yaml.snakeyaml.Yaml
        at java.lang.ClassLoader.findClass(ClassLoader.java:530)
        at org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.scala:26)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:34)
        at org.apache.spark.util.ChildFirstURLClassLoader.liftedTree1$1(MutableURLClassLoader.scala:75)
        at org.apache.spark.util.ChildFirstURLClassLoader.loadClass(MutableURLClassLoader.scala:71)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 9 more
16/03/12 14:19:12 INFO util.Utils: Shutdown hook called
====

I can submit the job successfully in client mode, but not in cluster mode, so it seems the jars (including snakeyaml.jar) are not being passed to the cluster properly.

The actual command I tried is:

$ spark-submit --master spark://192.168.1.2:6066 --deploy-mode cluster --jars all-the-jars(comma separated) --class com.analytics.program.QueryExecutor analytics.jar

(of course, snakeyaml.jar is included after --jars)

I also tried setting spark.executor.extraClassPath and spark.driver.extraClassPath in spark-defaults.conf to point at snakeyaml.jar, but neither worked.

I found a couple of similar issues posted on the mailing list and other sites, but they were either not answered properly or the suggestions did not work for me:

< https://mail-archives.apache.org/mod_mbox/spark-user/201505.mbox/%3CCAGSyEuApEkfO_2-iiiuyS2eeg=w_jkf83vcceguns4douod...@mail.gmail.com%3E >
< http://stackoverflow.com/questions/34272426/how-to-give-dependent-jars-to-spark-submit-in-cluster-mode >
< https://support.datastax.com/hc/en-us/articles/207442243-Spark-submit-fails-with-class-not-found-when-deploying-in-cluster-mode >

Could anyone help me?

Best regards,
Hiro
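To make this concrete, here is roughly what I ran and configured. The /path/to/... jar paths are placeholders for my actual comma-separated jar list; everything else is exactly the command and the two properties described above:

```
# Launch command: cluster deploy mode against the standalone REST port (6066).
# The list after --jars is comma separated and includes snakeyaml.jar.
$ spark-submit \
    --master spark://192.168.1.2:6066 \
    --deploy-mode cluster \
    --jars /path/to/snakeyaml.jar,/path/to/other-deps.jar \
    --class com.analytics.program.QueryExecutor \
    analytics.jar

# spark-defaults.conf entries I also tried (local paths on the single node):
spark.driver.extraClassPath    /path/to/snakeyaml.jar
spark.executor.extraClassPath  /path/to/snakeyaml.jar
```

With this setup, client mode works and cluster mode fails with the NoClassDefFoundError above.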