I think you should be able to drop "yarn-standalone" altogether. We recently updated SparkPi to take one argument (num slices, which you set to 10); previously it took two arguments, the master and num slices.
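For reference, here is a rough sketch of what the same run looks like through bin/spark-submit, which is the recommended entry point in 1.0 instead of yarn.Client. The flag names below are from the 1.0 spark-submit --help and the jar path is just the one from your build, so treat this as a sketch to adapt rather than something verified against your cluster:

# same resources as the yarn.Client invocation further down the thread
./bin/spark-submit \
  --master yarn-cluster \
  --class org.apache.spark.examples.SparkPi \
  --num-executors 3 \
  --driver-memory 4g \
  --executor-memory 2g \
  --executor-cores 1 \
  examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar \
  10

The trailing 10 is the single num-slices argument; there is no longer a master argument to pass.
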
Glad you got it figured out.


2014-05-22 13:41 GMT-07:00 Jon Bender <jonathan.ben...@gmail.com>:

> Andrew,
>
> Brilliant! I built on Java 7 but was still running our cluster on Java 6.
> Upgraded the cluster and it worked (with slight tweaks to the args; it
> seems the app args come first and yarn-standalone comes last):
>
> SPARK_JAR=./assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar \
> ./bin/spark-class org.apache.spark.deploy.yarn.Client \
>   --jar examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar \
>   --class org.apache.spark.examples.SparkPi \
>   --args 10 \
>   --args yarn-standalone \
>   --num-workers 3 \
>   --master-memory 4g \
>   --worker-memory 2g \
>   --worker-cores 1
>
> I'll make sure to use spark-submit from here on out.
>
> Thanks very much!
> Jon
>
>
> On Thu, May 22, 2014 at 12:40 PM, Andrew Or <and...@databricks.com> wrote:
>
>> Hi Jon,
>>
>> Your configuration looks largely correct. I have very recently confirmed
>> that the way you launch SparkPi also works for me.
>>
>> I have run into the same problem a bunch of times. My best guess is that
>> this is a Java version issue. If the Spark assembly jar is built with
>> Java 7, it cannot be opened by Java 6, because the two versions use
>> different packaging schemes. This is a known issue:
>> https://issues.apache.org/jira/browse/SPARK-1520.
>>
>> The workaround is either to make sure that all your executor nodes are
>> running Java 7 (and, very importantly, that JAVA_HOME points to that
>> version, which you can set through
>>
>>   export SPARK_YARN_USER_ENV="JAVA_HOME=/path/to/java7/home"
>>
>> in spark-env.sh), or simply to build the jar with Java 6. An additional
>> debugging step is to review the launch environment of all the containers,
>> which is detailed in the last paragraph of this section:
>> http://people.apache.org/~pwendell/spark-1.0.0-rc7-docs/running-on-yarn.html#debugging-your-application.
>> This may not be necessary, but I have personally found it immensely
>> useful.
>>
>> One last thing: launching Spark applications through
>> org.apache.spark.deploy.yarn.Client is deprecated in Spark 1.0. You
>> should use bin/spark-submit instead. You can find information about its
>> usage in the docs linked above, or simply through the --help option.
>>
>> Cheers,
>> Andrew
>>
>>
>> 2014-05-22 11:38 GMT-07:00 Jon Bender <jonathan.ben...@gmail.com>:
>>
>>> Hey all,
>>>
>>> I'm working through the basic SparkPi example on a YARN cluster, and I'm
>>> wondering why my containers don't pick up the spark assembly classes.
>>>
>>> I built the latest spark code against CDH5.0.0 and then ran the
>>> following:
>>>
>>> SPARK_JAR=./assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar \
>>> ./bin/spark-class org.apache.spark.deploy.yarn.Client \
>>>   --jar examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar \
>>>   --class org.apache.spark.examples.SparkPi \
>>>   --args yarn-standalone \
>>>   --num-workers 3 \
>>>   --master-memory 4g \
>>>   --worker-memory 2g \
>>>   --worker-cores 1
>>>
>>> The job dies, and in the stderr from the containers I see:
>>>
>>> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/deploy/yarn/ApplicationMaster
>>> Caused by: java.lang.ClassNotFoundException: org.apache.spark.deploy.yarn.ApplicationMaster
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>>
>>> My yarn-site.xml contains the following classpath:
>>>
>>> <property>
>>>   <name>yarn.application.classpath</name>
>>>   <value>
>>>     /etc/hadoop/conf/,
>>>     /usr/lib/hadoop/*,/usr/lib/hadoop/lib/*,
>>>     /usr/lib/hadoop-hdfs/*,/usr/lib/hadoop-hdfs/lib/*,
>>>     /usr/lib/hadoop-mapreduce/*,/usr/lib/hadoop-mapreduce/lib/*,
>>>     /usr/lib/hadoop-yarn/*,/usr/lib/hadoop-yarn/lib/*,
>>>     /usr/lib/avro/*
>>>   </value>
>>> </property>
>>>
>>> I've confirmed that the spark-assembly JAR has this class. Does it
>>> actually need to be defined in yarn.application.classpath, or should the
>>> Spark client take care of ensuring the necessary JARs are added during
>>> job submission?
>>>
>>> Any tips would be greatly appreciated!
>>>
>>> Cheers,
>>> Jon
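
(For anyone who lands on this thread with the same NoClassDefFoundError: a
quick way to check for the Java 6 vs Java 7 packaging mismatch described
above is to list the assembly jar with the same JVM the NodeManagers run.
This is only a sketch based on the behavior described in SPARK-1520 and
reuses the jar path from the build above:

# run on a cluster node, using its default (Java 6) JVM
jar tf assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar \
  | grep ApplicationMaster

If that fails or finds nothing under Java 6, while the same command under
Java 7 lists org/apache/spark/deploy/yarn/ApplicationMaster.class, the
assembly was packaged in a form Java 6 cannot read, and the containers will
fail exactly as above even though the class really is in the jar.)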