Thanks, are you referring to
https://github.com/spark-jobserver/spark-jobserver or the undocumented REST
job server included in Spark?
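If you mean the built-in one, my (possibly wrong) understanding is that it's
the standalone master's REST gateway, and that a submission looks roughly
like the sketch below. It's undocumented, so this is guesswork; the host,
jar path, and class name are made up:

    import java.net.{HttpURLConnection, URL}
    import java.nio.charset.StandardCharsets

    // Sketch: POST a CreateSubmissionRequest to the standalone master's
    // REST gateway (by default port 6066, not the 7077 RPC port).
    object RestSubmitSketch {
      def main(args: Array[String]): Unit = {
        val payload =
          """{
            |  "action": "CreateSubmissionRequest",
            |  "clientSparkVersion": "2.4.0",
            |  "appResource": "hdfs:///jobs/my-app.jar",
            |  "mainClass": "com.example.MyJob",
            |  "appArgs": [],
            |  "environmentVariables": {},
            |  "sparkProperties": {
            |    "spark.app.name": "my-job",
            |    "spark.master": "spark://master-host:6066",
            |    "spark.submit.deployMode": "cluster",
            |    "spark.jars": "hdfs:///jobs/my-app.jar"
            |  }
            |}""".stripMargin
        val conn = new URL("http://master-host:6066/v1/submissions/create")
          .openConnection().asInstanceOf[HttpURLConnection]
        conn.setRequestMethod("POST")
        conn.setRequestProperty("Content-Type", "application/json")
        conn.setDoOutput(true)
        conn.getOutputStream.write(payload.getBytes(StandardCharsets.UTF_8))
        println(scala.io.Source.fromInputStream(conn.getInputStream).mkString)
      }
    }

If that's right, it would let us keep the driver on the cluster without
shelling out to spark-submit.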


From: Jason Nerothin <jasonnerot...@gmail.com>
Reply: Jason Nerothin <jasonnerot...@gmail.com>
Date: March 28, 2019 at 2:53:05 PM
To: Pat Ferrel <p...@occamsmachete.com>
Cc: Felix Cheung <felixcheun...@hotmail.com>, Marcelo Vanzin
<van...@cloudera.com>, user <user@spark.apache.org>
Subject: Re: spark.submit.deployMode: cluster

Check out the Spark Jobs API... it sits behind a REST service...


On Thu, Mar 28, 2019 at 12:29 Pat Ferrel <p...@occamsmachete.com> wrote:

> ;-)
>
> Great idea. Can you suggest a project?
>
> Apache PredictionIO uses spark-submit (very ugly), and Apache Mahout only
> launches Spark trivially in test apps, since most uses are as a lib.
>
>
> From: Felix Cheung <felixcheun...@hotmail.com>
> Reply: Felix Cheung <felixcheun...@hotmail.com>
> Date: March 28, 2019 at 9:42:31 AM
> To: Pat Ferrel <p...@occamsmachete.com>, Marcelo Vanzin <van...@cloudera.com>
> Cc: user <user@spark.apache.org>
> Subject: Re: spark.submit.deployMode: cluster
>
> If anyone wants to improve docs please create a PR.
>
> lol
>
>
> But seriously, you might want to explore other projects that manage job
> submission on top of Spark instead of rolling your own with spark-submit.
>
>
> ------------------------------
> From: Pat Ferrel <p...@occamsmachete.com>
> Sent: Tuesday, March 26, 2019 2:38 PM
> To: Marcelo Vanzin
> Cc: user
> Subject: Re: spark.submit.deployMode: cluster
>
> Ahh, thank you indeed!
>
> It would have saved us a lot of time if this had been documented. I know,
> it's OSS, so contributions are welcome… I can also imagine your next comment:
> “If anyone wants to improve the docs, see the Apache contribution rules and
> create a PR,” or something like that.
>
> BTW the code where the context is known and can be used is what I’d call a
> Driver, and since all code is copied to nodes and is known in jars, it was
> not obvious to us that this rule existed, but it does make sense.
>
> It appears we will need to refactor our code to use spark-submit.
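>
> If it helps anyone else on the list, I assume the launch will end up
> looking roughly like this (jar path and class name are placeholders):
>
>     spark-submit --master spark://master-host:7077 --deploy-mode cluster \
>       --class com.example.MyJob hdfs:///jobs/my-app.jar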
>
> Thanks again.
>
>
> From: Marcelo Vanzin <van...@cloudera.com>
> Reply: Marcelo Vanzin <van...@cloudera.com>
> Date: March 26, 2019 at 1:59:36 PM
> To: Pat Ferrel <p...@occamsmachete.com>
> Cc: user <user@spark.apache.org>
> Subject: Re: spark.submit.deployMode: cluster
>
> If you're not using spark-submit, then that option does nothing.
>
> If by "context creation API" you mean "new SparkContext()" or an
> equivalent, then you're explicitly creating the driver inside your
> application.
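>
> In other words, something like this (illustrative only, not your actual
> code; the master URL and app name are made up):
>
>     import org.apache.spark.{SparkConf, SparkContext}
>
>     val conf = new SparkConf()
>       .setAppName("my-job")
>       .setMaster("spark://master-host:7077")
>       // Only spark-submit reads this; set here it is a no-op.
>       .set("spark.submit.deployMode", "cluster")
>
>     // Creating the context in-process makes this JVM the driver,
>     // i.e. client-mode semantics, regardless of the setting above.
>     val sc = new SparkContext(conf)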
>
> On Tue, Mar 26, 2019 at 1:56 PM Pat Ferrel <p...@occamsmachete.com> wrote:
> >
> > I have a server that starts a Spark job using the context creation API.
> It DOES NOT use spark-submit.
> >
> > I set spark.submit.deployMode = “cluster”
> >
> > In the GUI I see 2 workers with 2 executors. The link for running
> application “name” goes back to my server, the machine that launched the
> job.
> >
> > This is spark.submit.deployMode = “client” behavior according to the docs.
> I set the Driver to run on the cluster, but it runs on the client, ignoring
> spark.submit.deployMode.
> >
> > Is this expected? It is documented nowhere that I can find.
> >
>
>
> --
> Marcelo
>
> --
Thanks,
Jason
