Thanks for that extra insight about yarn.  I am new to the whole yarn
ecosystem, so I've been having trouble figuring out the right way to do
some things.  It sounds like, even though the jars are already installed on
all the nodes of our cluster, I should just go ahead and add them with the
--files option to simplify things and avoid having them added for all
applications.
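
For the archive, here is a minimal sketch of how that launch command might
be assembled.  It assumes the yarn-standalone client is started through
./spark-class org.apache.spark.deploy.yarn.Client with --jar, --class and a
comma-separated --files list.  The jar paths, the class name and the helper
build_yarn_client_command are made up for illustration, and SPARK_JAR would
still need to be set in the environment before running the result.

```python
import shlex


def build_yarn_client_command(app_jar, main_class, extra_jars):
    """Build a yarn-standalone launch command that ships extra jars via --files.

    All argument values are hypothetical placeholders; only the command
    line is assembled here, nothing is executed.
    """
    cmd = [
        "./spark-class",
        "org.apache.spark.deploy.yarn.Client",
        "--jar", app_jar,
        "--class", main_class,
    ]
    if extra_jars:
        # Assumption: --files accepts a comma-separated list of paths.
        cmd += ["--files", ",".join(extra_jars)]
    return cmd


# Hypothetical jar locations on the submitting machine.
cmd = build_yarn_client_command(
    "my-app.jar",
    "com.example.MyApp",
    ["/opt/libs/dep-a.jar", "/opt/libs/dep-b.jar"],
)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

Building the command as a list keeps each path a single argument, so it can
be handed to subprocess.run unchanged once the values are real.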

Thanks



On Mon, Jan 13, 2014 at 3:01 PM, Tom Graves <tgraves...@yahoo.com> wrote:

> I'm assuming you actually installed the jars on all the nodes of the yarn
> cluster then?
>
> In general this isn't a good idea on yarn, as most users don't have
> permissions to install things on the nodes themselves.  The idea is that
> yarn provides a certain set of jars, which really should be just the
> yarn/hadoop framework, and adds those to your classpath.  The user provides
> everything else that is application specific when they submit their
> application; those files get distributed with the app and added to the
> classpath.  If you are worried about it being downloaded every time, you
> can use the public distributed cache on yarn as a way to distribute it and
> share it.  It will only be removed from that node's distributed cache if
> other applications need that space.
>
> That said, what yarn adds to the classpath is configurable via the hadoop
> configuration file yarn-site.xml, config name: yarn.application.classpath.
> So you can change the config to add it, but it will then be added for all
> types of applications.
>
> You can use the --files and --archives options in yarn-standalone mode to
> use the distributed cache.  To make it public, make sure permissions on the
> file are set appropriately.
>
> Tom
>
>
>   On Monday, January 13, 2014 3:49 PM, Eric Kimbrel <lekimb...@gmail.com>
> wrote:
>  Is there any extra trick required to use jars on the SPARK_CLASSPATH
> when running spark on yarn?
>
> I have several jars added to the SPARK_CLASSPATH in spark_env.sh.  When my
> job runs I print the SPARK_CLASSPATH, so I can see that the jars were added
> to the environment that the app master is running in.  However, even though
> the jars are on the classpath, I continue to get class-not-found errors.
>
> I have also tried setting SPARK_CLASSPATH via SPARK_YARN_USER_ENV.
>
>
