I am launching a job on a Spark cluster. To get the application to work I
have to run it like this:

HADOOP_CONF_DIR=$CONF \
SPARK_JAR=$SPARK_ASSEMBLY \
$SPARK/spark-class org.apache.spark.deploy.yarn.Client \
  --jar <path to my jar> \
  --class <my main class> \
  --args <my command line arguments> \
  --num-workers 7 \
  --master-memory 164g \
  --worker-memory 164g \
  --worker-cores 20 \
  --files file:///usr/lib/hbase/lib/hbase-common-0.95.2-cdh5.0.0-beta-1.jar,file:///usr/lib/hbase/lib/hbase-client-0.95.2-cdh5.0.0-beta-1.jar,file:///usr/lib/hbase/lib/hbase-protocol-0.95.2-cdh5.0.0-beta-1.jar,file:///usr/lib/hbase/lib/htrace-core-2.01.jar



What this ends up doing is taking all of the HBase jars (added with --files),
copying them into the distributed cache, and distributing them to each
container (at least that's how I understand it). What I would like to do
instead is simply place all of these jars onto the SPARK_CLASSPATH, since
they are already available on every node of the cluster. However, even
when I place them on the Spark classpath, I still get class-not-found errors.
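
For reference, this is roughly what I have been trying in spark-env.sh (a sketch of my attempt; the jar paths are the same ones from the --files list above, and the exact single-entry syntax I used for SPARK_YARN_USER_ENV is my own guess):

# conf/spark-env.sh -- the HBase jars live at the same path on every node,
# so put them directly on the classpath instead of shipping them with --files
export SPARK_CLASSPATH=/usr/lib/hbase/lib/hbase-common-0.95.2-cdh5.0.0-beta-1.jar:\
/usr/lib/hbase/lib/hbase-client-0.95.2-cdh5.0.0-beta-1.jar:\
/usr/lib/hbase/lib/hbase-protocol-0.95.2-cdh5.0.0-beta-1.jar:\
/usr/lib/hbase/lib/htrace-core-2.01.jar

# also tried forwarding the same value to the YARN containers
export SPARK_YARN_USER_ENV="SPARK_CLASSPATH=$SPARK_CLASSPATH"

and then launching with the same spark-class command as above, minus the --files argument.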



On Mon, Jan 13, 2014 at 2:19 PM, Izhar ul Hassan <ezh...@gmail.com> wrote:

> Eric, could you please provide a little more detail? Log excerpts and/or
> the commands you are running would be helpful.
>
> /Izhar
>
>
> On Monday, January 13, 2014, Eric Kimbrel wrote:
>
>> Is there any extra trick required to use jars on the SPARK_CLASSPATH when
>> running Spark on YARN?
>>
>> I have several jars added to the SPARK_CLASSPATH in spark-env.sh. When
>> my job runs I print the SPARK_CLASSPATH, so I can see that the jars were
>> added to the environment that the application master is running in;
>> however, even though the jars are on the classpath, I continue to get
>> class-not-found errors.
>>
>> I have also tried setting SPARK_CLASSPATH via SPARK_YARN_USER_ENV.
>
>
>
> --
> --
> /Izhar
>
>
