Re: Cluster mode dependent jars not working
--driver-class-path needs to include the required jars as well, but this is not mentioned in the Spark documentation.

On Tue, Dec 15, 2015 at 9:13 PM, Ted Yu wrote:
> Please use --conf spark.executor.extraClassPath=XXX to specify dependent
> jars.
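The driver classpath fix above can be sketched as follows. This is a minimal illustration, not the poster's exact setup: `JARS_HOME` and the jar names are placeholders taken from the original command. Note that `--jars` takes a comma-separated list, while `--driver-class-path` takes a colon-separated (Java classpath style) list, so the same list can be converted rather than maintained twice:

```shell
# Hypothetical jar locations; adjust JARS_HOME to your environment.
JARS_HOME=/opt/jars
# --jars expects a comma-separated list.
JARS="$JARS_HOME/amqp-client-3.5.3.jar,$JARS_HOME/zkclient-0.3.jar"
# --driver-class-path expects a colon-separated classpath, so convert the commas.
DRIVER_CP=$(printf '%s' "$JARS" | tr ',' ':')
echo "$DRIVER_CP"   # /opt/jars/amqp-client-3.5.3.jar:/opt/jars/zkclient-0.3.jar
# The submit command would then gain one extra flag, e.g.:
#   dse spark-submit --deploy-mode cluster \
#     --jars "$JARS" \
#     --driver-class-path "$DRIVER_CP" \
#     --class "com.testclass" $APP_JAR input.json
```

Since the poster has the jars on an NFS mount visible to all nodes, pointing the driver's classpath directly at those paths avoids relying on Spark copying them into the work directory.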
Cluster mode dependent jars not working
I am running Spark using cluster mode for deployment. Below is the command:

JARS=$JARS_HOME/amqp-client-3.5.3.jar,$JARS_HOME/nscala-time_2.10-2.0.0.jar,\
$JARS_HOME/kafka_2.10-0.8.2.1.jar,$JARS_HOME/kafka-clients-0.8.2.1.jar,\
$JARS_HOME/spark-streaming-kafka_2.10-1.4.1.jar,\
$JARS_HOME/zkclient-0.3.jar,$JARS_HOME/protobuf-java-2.4.0a.jar

dse spark-submit -v --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
  --executor-memory 512M \
  --total-executor-cores 3 \
  --deploy-mode "cluster" \
  --master spark://$MASTER:7077 \
  --jars=$JARS \
  --supervise \
  --class "com.testclass" $APP_JAR input.json \
  --files "/home/test/input.json"

The above command works fine in client mode, but when I use it in cluster mode I get a class-not-found exception:

Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
    at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.NoClassDefFoundError: org/apache/spark/streaming/kafka/KafkaUtils$

In client mode the dependent jars are copied to the /var/lib/spark/work directory, whereas in cluster mode they are not. I am using NFS and have mounted the same directory on all the Spark nodes under the same name, but I still get the error.

From the verbose logs of dse spark-submit, I can see the classpath elements are missing:

--
spark.hadoop.cassandra.input.native.ssl.trust.store.password -> cassandra
spark.cassandra.connection.ssl.trustStore.password -> cassandra
spark.ssl.keyStorePassword -> cassandra
spark.cassandra.auth.username -> cassandra
spark.hadoop.fs.har.impl -> org.apache.hadoop.fs.HarFileSystem
Classpath elements:

WARN 2015-12-15 17:08:48 org.apache.spark.util.Utils: Your hostname, demeter-dev-node2 resolves to a loopback address: 127.0.1.1; using 10.29.23.170 instead (on interface eth0)
WARN 2015-12-15 17:08:48 org.apache.spark.util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address

How is it able to pick up the application jar, which is under the same directory, but not the dependent jars? Please help me get this solved.
Re: Cluster mode dependent jars not working
Please use --conf spark.executor.extraClassPath=XXX to specify dependent jars.

On Tue, Dec 15, 2015 at 3:57 AM, vimal dinakaran wrote:
> I am running Spark using cluster mode for deployment. Below is the command
> [...]
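The spark.executor.extraClassPath suggestion above could look like the sketch below. The paths are placeholders, and one caveat is an assumption worth stating: with the extraClassPath settings Spark does not ship the jars for you, so the listed paths must already exist on every node (the poster's NFS mount would satisfy that). Entries are colon-separated, like an ordinary Java classpath:

```shell
# Placeholder paths; these must be valid on every worker node.
JARS_HOME=/opt/jars
EXTRA_CP="$JARS_HOME/spark-streaming-kafka_2.10-1.4.1.jar:$JARS_HOME/kafka_2.10-0.8.2.1.jar"
echo "$EXTRA_CP"
# The submit command would then pass the classpath to both executors and driver:
#   dse spark-submit --deploy-mode cluster \
#     --conf "spark.executor.extraClassPath=$EXTRA_CP" \
#     --conf "spark.driver.extraClassPath=$EXTRA_CP" \
#     --class "com.testclass" $APP_JAR input.json
```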
Re: --jars not working?
Are you using a build for Scala 2.11? I've encountered the same behaviour trying to run on YARN with Scala 2.11 and Spark 1.3.0, 1.3.1 and 1.4.0.RC3, and raised a JIRA issue here: https://issues.apache.org/jira/browse/SPARK-7944. It would be good to know if this is identical to what you're seeing on Mesos.

Thanks
Alex

On Fri, Jun 12, 2015 at 8:45 PM, Akhil Das ak...@sigmoidanalytics.com wrote:
> You can verify if the jars are shipped properly by looking at the driver UI
> (running on 4040), Environment tab.
--jars not working?
Spark version is 1.3.0 (we will upgrade as soon as we move past Mesos 0.19.0). Regardless, I'm running into a really weird situation: when I pass --jars to bin/spark-shell, I can't reference those classes in the REPL. Is this expected? The logs even tell me that my jars have been added, and yet the classes inside them are not available. Am I missing something obvious?
Re: --jars not working?
You can verify if the jars are shipped properly by looking at the driver UI (running on 4040), Environment tab.

Thanks
Best Regards

On Sat, Jun 13, 2015 at 12:43 AM, Jonathan Coveney jcove...@gmail.com wrote:
> Spark version is 1.3.0 (will upgrade as soon as we upgrade past mesos
> 0.19.0)... [...]