Re: Spark-submit not working when application jar is in hdfs
Client mode would not support HDFS jar extraction. I tried this: sudo -u hdfs spark-submit --class org.apache.spark.examples.SparkPi --deploy-mode cluster --master yarn hdfs:///user/spark/spark-examples-1.2.0-cdh5.3.2-hadoop2.5.0-cdh5.3.2.jar 10 And it worked. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-submit-not-working-when-application-jar-is-in-hdfs-tp21840p22302.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: Spark-submit not working when application jar is in hdfs
Made it work by using yarn-cluster as master instead of local. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-submit-not-working-when-application-jar-is-in-hdfs-tp21840p22281.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: Spark-submit not working when application jar is in hdfs
Hi, did you resolve this issue or just work around it be keeping your application jar local? Running into the same issue with 1.3. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-submit-not-working-when-application-jar-is-in-hdfs-tp21840p22272.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: Spark-submit not working when application jar is in hdfs
Looking at SparkSubmit#addJarToClasspath(): uri.getScheme match { case file | local = ... case _ = printWarning(sSkip remote jar $uri.) It seems hdfs scheme is not recognized. FYI On Thu, Feb 26, 2015 at 6:09 PM, dilm dmend...@exist.com wrote: I'm trying to run a spark application using bin/spark-submit. When I reference my application jar inside my local filesystem, it works. However, when I copied my application jar to a directory in hdfs, i get the following exception: Warning: Skip remote jar hdfs://localhost:9000/user/hdfs/jars/simple-project-1.0-SNAPSHOT.jar. java.lang.ClassNotFoundException: com.example.SimpleApp Here's the comand: $ ./bin/spark-submit --class com.example.SimpleApp --master local hdfs://localhost:9000/user/hdfs/jars/simple-project-1.0-SNAPSHOT.jar I'm using hadoop version 2.6.0, spark version 1.2.1 In the official documentation, it stated there that: application-jar: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an *hdfs:// path* or a file:// path that is present on all nodes. I'm thinking maybe this is a valid bug? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-submit-not-working-when-application-jar-is-in-hdfs-tp21840.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Spark-submit not working when application jar is in hdfs
I'm trying to run a spark application using bin/spark-submit. When I reference my application jar inside my local filesystem, it works. However, when I copied my application jar to a directory in hdfs, i get the following exception: Warning: Skip remote jar hdfs://localhost:9000/user/hdfs/jars/simple-project-1.0-SNAPSHOT.jar. java.lang.ClassNotFoundException: com.example.SimpleApp Here's the comand: $ ./bin/spark-submit --class com.example.SimpleApp --master local hdfs://localhost:9000/user/hdfs/jars/simple-project-1.0-SNAPSHOT.jar I'm using hadoop version 2.6.0, spark version 1.2.1 In the official documentation, it stated there that: application-jar: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an *hdfs:// path* or a file:// path that is present on all nodes. I'm thinking maybe this is a valid bug? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-submit-not-working-when-application-jar-is-in-hdfs-tp21840.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org