Re: running spark job with fat jar file

ayan guha Mon, 17 Jul 2017 11:35:22 -0700

Hi

Here is my understanding:


1. For each container, a local folder is created and the application jar
is copied into it.
2. Jars listed with the --jars switch are also copied to each container and
added to the class path of the application.

So, to your question: you do not need to copy the --jars files to all nodes
yourself before submission. YARN will take care of it.
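
As a rough sketch of what that means in practice (the paths, class name and
variable names below are made up for illustration and are not from your
setup), both the fat jar and the --jars entries only need to exist on the
edge node you submit from. With DRY_RUN left at its default the script just
prints the arguments instead of calling spark-submit:

```shell
#!/bin/sh
# Sketch only: APP_JAR, EXTRA_JARS and MAIN_CLASS are hypothetical placeholders.
set -eu

# Fat jar and extra jars live on the edge node only; nothing is pre-copied
# to the worker nodes.
APP_JAR="${APP_JAR:-/home/hduser/jars/myapp-assembly.jar}"
EXTRA_JARS="${EXTRA_JARS:-/home/hduser/jars/a.jar,/home/hduser/jars/b.jar}"  # comma-separated
MAIN_CLASS="${MAIN_CLASS:-com.example.MyApp}"

run_submit() {
  # Build the argument list for spark-submit in YARN client mode.
  set -- "${SPARK_HOME:-/opt/spark}/bin/spark-submit" \
    --master yarn \
    --deploy-mode client \
    --jars "$EXTRA_JARS" \
    --class "$MAIN_CLASS" \
    "$APP_JAR"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$@"   # dry run: print one argument per line
  else
    "$@"                 # real run: Spark uploads the jars, YARN localizes them
  fi
}

run_submit
```

Run it with DRY_RUN=0 to actually submit. As far as I understand, Spark then
uploads the jars to a staging directory on HDFS and YARN pulls them into each
container's local folder.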

Best
Ayan

On Tue, Jul 18, 2017 at 4:10 AM, Marcelo Vanzin <van...@cloudera.com> wrote:

> Yes.
>
> On Mon, Jul 17, 2017 at 10:47 AM, Mich Talebzadeh
> <mich.talebza...@gmail.com> wrote:
> > thanks Marcelo.
> >
> > are these files distributed through hdfs?
> >
> > Dr Mich Talebzadeh
> >
> >
> >
> > LinkedIn
> > https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> >
> >
> >
> > http://talebzadehmich.wordpress.com
> >
> >
> > Disclaimer: Use it at your own risk. Any and all responsibility for any
> > loss, damage or destruction of data or any other property which may arise
> > from relying on this email's technical content is explicitly disclaimed.
> > The author will in no case be liable for any monetary damages arising from
> > such loss, damage or destruction.
> >
> >
> >
> >
> > On 17 July 2017 at 18:46, Marcelo Vanzin <van...@cloudera.com> wrote:
> >>
> >> The YARN backend distributes all files and jars you submit with your
> >> application.
> >>
> >> On Mon, Jul 17, 2017 at 10:45 AM, Mich Talebzadeh
> >> <mich.talebza...@gmail.com> wrote:
> >> > thanks guys.
> >> >
> >> > Just to clarify, let us assume I am doing spark-submit as below:
> >> >
> >> > ${SPARK_HOME}/bin/spark-submit \
> >> >                 --packages ${PACKAGES} \
> >> >                 --driver-memory 2G \
> >> >                 --num-executors 2 \
> >> >                 --executor-memory 2G \
> >> >                 --executor-cores 2 \
> >> >                 --master yarn \
> >> >                 --deploy-mode client \
> >> >                 --conf "${SCHEDULER}" \
> >> >                 --conf "${EXTRAJAVAOPTIONS}" \
> >> >                 --jars ${JARS} \
> >> >                 --class "${FILE_NAME}" \
> >> >                 --conf "${SPARKUIPORT}" \
> >> >                 --conf "${SPARKDRIVERPORT}" \
> >> >                 --conf "${SPARKFILESERVERPORT}" \
> >> >                 --conf "${SPARKBLOCKMANAGERPORT}" \
> >> >                 --conf "${SPARKKRYOSERIALIZERBUFFERMAX}" \
> >> >                 ${JAR_FILE}
> >> >
> >> > The ${JAR_FILE} is the fat jar. As I understand it, Spark should
> >> > distribute that ${JAR_FILE} to each container?
> >> >
> >> > Also, is --jars ${JARS} the list of ordinary jar files that need to
> >> > exist in the same directory on each executor node?
> >> >
> >> > cheers,
> >> >
> >> >
> >> >
> >> > Dr Mich Talebzadeh
> >> >
> >> > On 17 July 2017 at 18:18, ayan guha <guha.a...@gmail.com> wrote:
> >> >>
> >> >> Hi Mich
> >> >>
> >> >> your jar file can be anywhere in the file system, including hdfs.
> >> >>
> >> >> If using yarn, preferably use cluster mode in terms of deployment.
> >> >>
> >> >> Yarn will distribute the jar to each container.
> >> >>
> >> >> Best
> >> >> Ayan
> >> >>
> >> >> On Tue, 18 Jul 2017 at 2:17 am, Marcelo Vanzin <van...@cloudera.com>
> >> >> wrote:
> >> >>>
> >> >>> Spark distributes your application jar for you.
> >> >>>
> >> >>> On Mon, Jul 17, 2017 at 8:41 AM, Mich Talebzadeh
> >> >>> <mich.talebza...@gmail.com> wrote:
> >> >>> > hi guys,
> >> >>> >
> >> >>> >
> >> >>> > An uber/fat jar file has been created to run with Spark in CDH YARN
> >> >>> > client mode.
> >> >>> >
> >> >>> > As usual, the job is submitted to the edge node.
> >> >>> >
> >> >>> > Does the jar file have to be placed in the same directory where
> >> >>> > Spark is running in the cluster to make it work?
> >> >>> >
> >> >>> > Also, what will happen if, say, out of 9 nodes running Spark, 3 have
> >> >>> > not got the jar file? Will the job fail, or will it carry on with
> >> >>> > the remaining 6 nodes that have the jar file?
> >> >>> >
> >> >>> > thanks
> >> >>> >
> >> >>> > Dr Mich Talebzadeh
> >> >>> >
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Marcelo
> >> >>>
> >> >>>
> >> >> --
> >> >> Best Regards,
> >> >> Ayan Guha
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Marcelo
> >
> >
>
>
>
> --
> Marcelo
>



-- 
Best Regards,
Ayan Guha
