Re: spark.submit.deployMode: cluster
A little late, but have you looked at https://livy.incubator.apache.org/? It works well for us.

-Todd
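For anyone evaluating Livy for this use case: its batch API takes a POST to /batches with a JSON body naming the application jar and main class. A minimal sketch of building such a request follows; the Livy host, jar path, class name, and conf values are illustrative assumptions, not anything from this thread.

```python
import json
from urllib.request import Request

def livy_batch_request(livy_url, jar, class_name, args=None, conf=None):
    """Build (but do not send) a POST /batches request for Livy's REST API."""
    payload = {"file": jar, "className": class_name}
    if args:
        payload["args"] = list(args)
    if conf:
        payload["conf"] = dict(conf)
    return Request(
        url=f"{livy_url.rstrip('/')}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Illustrative values only -- host, jar path, and class are assumptions:
req = livy_batch_request(
    "http://livy-host:8998",
    "hdfs:///jobs/my-app.jar",
    "com.example.MyApp",
    conf={"spark.submit.deployMode": "cluster"},
)
```

Sending the request (e.g. with urllib.request.urlopen) returns a batch id that can then be polled at /batches/{id}, so driver placement is handled by the Livy server rather than by the calling process.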
Re: spark.submit.deployMode: cluster
Meant this one: https://docs.databricks.com/api/latest/jobs.html

--
Thanks,
Jason
Re: spark.submit.deployMode: cluster
Thanks, are you referring to https://github.com/spark-jobserver/spark-jobserver or the undocumented REST job server included in Spark?
Re: spark.submit.deployMode: cluster
Check out the Spark Jobs API... it sits behind a REST service...

--
Thanks,
Jason
Re: spark.submit.deployMode: cluster
;-)

Great idea. Can you suggest a project?

Apache PredictionIO uses spark-submit (very ugly) and Apache Mahout only launches trivially in test apps since most uses are as a lib.
Re: spark.submit.deployMode: cluster
If anyone wants to improve docs please create a PR.

lol

But seriously you might want to explore other projects that manage job submission on top of spark instead of rolling your own with spark-submit.
Re: spark.submit.deployMode: cluster
Ahh, thank you indeed!

It would have saved us a lot of time if this had been documented. I know, OSS, so contributions are welcome… I can also imagine your next comment: “If anyone wants to improve docs see the Apache contribution rules and create a PR,” or something like that.

BTW, the code where the context is known and can be used is what I’d call a Driver, and since all code is copied to nodes and is known in jars, it was not obvious to us that this rule existed, but it does make sense.

It appears we will need to refactor our code to use spark-submit.

Thanks again.
Re: spark.submit.deployMode: cluster
If you're not using spark-submit, then that option does nothing.

If by "context creation API" you mean "new SparkContext()" or an equivalent, then you're explicitly creating the driver inside your application.

--
Marcelo
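The practical consequence of this answer is that requesting cluster mode means going through the spark-submit launch path, where --deploy-mode asks the cluster manager to start the driver on a worker rather than in the calling process. A minimal sketch of assembling such an invocation from a server process (the master URL, main class, and jar path are placeholder assumptions):

```python
# Sketch: build a spark-submit command line requesting cluster deploy mode.
def spark_submit_cmd(master, main_class, app_jar, deploy_mode="cluster", conf=None):
    cmd = ["spark-submit", "--master", master,
           "--deploy-mode", deploy_mode,
           "--class", main_class]
    # Extra Spark settings go through repeated --conf key=value flags.
    for key, value in (conf or {}).items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app_jar)  # the application jar comes last
    return cmd

# Placeholder master URL, class, and jar path:
cmd = spark_submit_cmd("spark://master:7077", "com.example.MyApp",
                       "/jobs/my-app.jar")
```

On a machine with Spark installed, the resulting list can be handed to subprocess.run(cmd), which is roughly what "rolling your own" job submission around spark-submit amounts to.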
spark.submit.deployMode: cluster
I have a server that starts a Spark job using the context creation API. It DOES NOT use spark-submit.

I set spark.submit.deployMode = “cluster”

In the GUI I see 2 workers with 2 executors. The link for the running application “name” goes back to my server, the machine that launched the job.

According to the docs, this is spark.submit.deployMode = “client” behavior. I set the Driver to run on the cluster but it runs on the client, *ignoring the spark.submit.deployMode*.

Is this as expected? It is documented nowhere I can find.