I am launching a job on a Spark cluster. To get the application to work I have to run it like this:
HADOOP_CONF_DIR=$CONF \
SPARK_JAR=$SPARK_ASSEMBLY \
$SPARK/spark-class org.apache.spark.deploy.yarn.Client \
  --jar <path to my jar> \
  --class <my main class> \
  --args <my command line arguments> \
  --num-workers 7 \
  --master-memory 164g \
  --worker-memory 164g \
  --worker-cores 20 \
  --files file:///usr/lib/hbase/lib/hbase-common-0.95.2-cdh5.0.0-beta-1.jar,file:///usr/lib/hbase/lib/hbase-client-0.95.2-cdh5.0.0-beta-1.jar,file:///usr/lib/hbase/lib/hbase-protocol-0.95.2-cdh5.0.0-beta-1.jar,file:///usr/lib/hbase/lib/htrace-core-2.01.jar

What this ends up doing is taking all of the HBase jars (added with --files), copying them into the distributed cache, and distributing them to each container (at least that's how I understand it). What I would like to do instead is simply place all of these jars on the SPARK_CLASSPATH, since they are already available on every node of the cluster. However, even when I put them on the Spark classpath, I get class not found errors.

On Mon, Jan 13, 2014 at 2:19 PM, Izhar ul Hassan <ezh...@gmail.com> wrote:

> Erik, could you please provide a little more detail? Log excerpts and/or
> commands you are running would be helpful.
>
> /Izhar
>
>
> On Monday, January 13, 2014, Eric Kimbrel wrote:
>
>> Is there any extra trick required to use jars on the SPARK_CLASSPATH when
>> running Spark on YARN?
>>
>> I have several jars added to the SPARK_CLASSPATH in spark-env.sh. When my
>> job runs I print the SPARK_CLASSPATH, so I can see that the jars were added
>> to the environment that the app master is running in. However, even though
>> the jars are on the classpath, I continue to get class not found errors.
>>
>> I have also tried setting SPARK_CLASSPATH via SPARK_YARN_USER_ENV.
>
> --
> /Izhar
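
For reference, the spark-env.sh settings I have been experimenting with look roughly like this (a sketch only; the jar paths are the same ones passed to --files above, and I am not certain that exporting SPARK_CLASSPATH through SPARK_YARN_USER_ENV is the right way to get it onto the container classpath):

# conf/spark-env.sh (sketch)
# HBase jars that already exist locally on every node of the cluster.
HBASE_JARS=/usr/lib/hbase/lib/hbase-common-0.95.2-cdh5.0.0-beta-1.jar:/usr/lib/hbase/lib/hbase-client-0.95.2-cdh5.0.0-beta-1.jar:/usr/lib/hbase/lib/hbase-protocol-0.95.2-cdh5.0.0-beta-1.jar:/usr/lib/hbase/lib/htrace-core-2.01.jar

# Classpath for the client / app master environment.
export SPARK_CLASSPATH=$HBASE_JARS:$SPARK_CLASSPATH

# Attempt to propagate the same classpath into the YARN containers.
export SPARK_YARN_USER_ENV="SPARK_CLASSPATH=$HBASE_JARS"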