Hi,

Here is my understanding:

1. For each container, a local folder is created and the application jar is copied into it.
2. Jars listed with the --jars switch are copied to the container and added to the application's classpath.

So, to answer your question: the jars passed with --jars do not need to be copied to all nodes before submission. YARN will take care of it.

Best
Ayan

On Tue, Jul 18, 2017 at 4:10 AM, Marcelo Vanzin <van...@cloudera.com> wrote:

> Yes.
>
> On Mon, Jul 17, 2017 at 10:47 AM, Mich Talebzadeh
> <mich.talebza...@gmail.com> wrote:
> > thanks Marcelo.
> >
> > are these files distributed through hdfs?
> >
> > Dr Mich Talebzadeh
> >
> > LinkedIn
> > https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> >
> > http://talebzadehmich.wordpress.com
> >
> > Disclaimer: Use it at your own risk. Any and all responsibility for any
> > loss, damage or destruction of data or any other property which may arise
> > from relying on this email's technical content is explicitly disclaimed.
> > The author will in no case be liable for any monetary damages arising from
> > such loss, damage or destruction.
> >
> > On 17 July 2017 at 18:46, Marcelo Vanzin <van...@cloudera.com> wrote:
> >>
> >> The YARN backend distributes all files and jars you submit with your
> >> application.
> >>
> >> On Mon, Jul 17, 2017 at 10:45 AM, Mich Talebzadeh
> >> <mich.talebza...@gmail.com> wrote:
> >> > thanks guys.
> >> >
> >> > just to clarify, let us assume I am doing spark-submit as below:
> >> >
> >> > ${SPARK_HOME}/bin/spark-submit \
> >> >     --packages ${PACKAGES} \
> >> >     --driver-memory 2G \
> >> >     --num-executors 2 \
> >> >     --executor-memory 2G \
> >> >     --executor-cores 2 \
> >> >     --master yarn \
> >> >     --deploy-mode client \
> >> >     --conf "${SCHEDULER}" \
> >> >     --conf "${EXTRAJAVAOPTIONS}" \
> >> >     --jars ${JARS} \
> >> >     --class "${FILE_NAME}" \
> >> >     --conf "${SPARKUIPORT}" \
> >> >     --conf "${SPARKDRIVERPORT}" \
> >> >     --conf "${SPARKFILESERVERPORT}" \
> >> >     --conf "${SPARKBLOCKMANAGERPORT}" \
> >> >     --conf "${SPARKKRYOSERIALIZERBUFFERMAX}" \
> >> >     ${JAR_FILE}
> >> >
> >> > The ${JAR_FILE} is the one. As I understand it, Spark should distribute
> >> > that ${JAR_FILE} to each container?
> >> >
> >> > Also, is --jars ${JARS} the list of normal jar files that need to exist
> >> > in the same directory on each executor node?
> >> >
> >> > cheers,
> >> >
> >> > Dr Mich Talebzadeh
> >> >
> >> > On 17 July 2017 at 18:18, ayan guha <guha.a...@gmail.com> wrote:
> >> >>
> >> >> Hi Mitch
> >> >>
> >> >> your jar file can be anywhere in the file system, including hdfs.
> >> >>
> >> >> If using yarn, preferably use cluster mode in terms of deployment.
> >> >>
> >> >> Yarn will distribute the jar to each container.
> >> >>
> >> >> Best
> >> >> Ayan
> >> >>
> >> >> On Tue, 18 Jul 2017 at 2:17 am, Marcelo Vanzin <van...@cloudera.com>
> >> >> wrote:
> >> >>>
> >> >>> Spark distributes your application jar for you.
> >> >>>
> >> >>> On Mon, Jul 17, 2017 at 8:41 AM, Mich Talebzadeh
> >> >>> <mich.talebza...@gmail.com> wrote:
> >> >>> > hi guys,
> >> >>> >
> >> >>> > an uber/fat jar file has been created to run with spark in CDH yarn
> >> >>> > client mode.
> >> >>> >
> >> >>> > As usual, the job is submitted to the edge node.
> >> >>> >
> >> >>> > does the jar file have to be placed in the same directory where spark
> >> >>> > is running in the cluster to make it work?
> >> >>> >
> >> >>> > Also, what will happen if, say, out of 9 nodes running spark, 3 have
> >> >>> > not got the jar file? will that job fail, or will it carry on on the
> >> >>> > remaining 6 nodes that have that jar file?
> >> >>> >
> >> >>> > thanks
> >> >>> >
> >> >>> > Dr Mich Talebzadeh
> >> >>>
> >> >>> --
> >> >>> Marcelo
> >> >>>
> >> >>> ---------------------------------------------------------------------
> >> >>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org

--
Best Regards,
Ayan Guha
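
P.S. To make the point about --jars concrete, here is a minimal sketch, not a tested recipe. It assumes the dependency jars exist only on the edge node you submit from. The paths, jar names and class name are hypothetical. The `join_jars` helper is my own illustration, not part of Spark. It builds the comma-separated list that --jars expects and prints the resulting command instead of running it, so it can be tried without a cluster.

```shell
#!/usr/bin/env bash
# Sketch only: all paths, jar names and the class name are hypothetical.

# Join the given jar paths into the comma-separated list that --jars expects.
join_jars() {
  local IFS=,
  echo "$*"
}

# Dependency jars that live only on the submitting (edge) node.
JARS=$(join_jars /home/hduser/jars/dep1.jar /home/hduser/jars/dep2.jar)

# Print the command rather than executing it (a dry run).
# On YARN, the application jar and the --jars entries are shipped to the
# containers by the YARN backend; nothing is pre-installed on worker nodes.
echo spark-submit \
  --master yarn \
  --deploy-mode client \
  --jars "${JARS}" \
  --class com.example.MyApp \
  /home/hduser/jars/my-app-assembly.jar
```

Dropping the `echo` would run the real submission, provided spark-submit is on the PATH and the files exist on the edge node.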