Thanks Liang, Vadim and everyone for your inputs!!
With this clarity, I've tried client mode for both the main and sub Spark
jobs. Every main Spark job and its corresponding threaded Spark jobs are
coming up in the YARN applications list, and the jobs are getting executed
properly. I now need to test
If you run the main driver and the other Spark jobs in client mode, you can make
sure they (I mean all the drivers) are running on the same node. Of course,
all the drivers then consume the resources of that same node.
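For illustration (this is a sketch of my own, not something from the thread), a client-mode launch of one child job via SparkLauncher might look roughly like this; the jar path, main class and memory value are placeholders:

```scala
// Illustrative sketch only: launching one child job in client mode via
// SparkLauncher, so its driver stays on the launching node.
// Jar path, main class and memory size are placeholders.
import org.apache.spark.launcher.SparkLauncher

object ClientModeChild {
  def main(args: Array[String]): Unit = {
    val handle = new SparkLauncher()
      .setAppResource("/path/to/child-job.jar")   // placeholder jar
      .setMainClass("com.example.ChildJob")       // placeholder main class
      .setMaster("yarn")
      .setDeployMode("client")                    // child driver runs on this node
      .setConf(SparkLauncher.DRIVER_MEMORY, "2g") // memory taken from this node
      .startApplication()

    println(s"launched, current state: ${handle.getState}")
  }
}
```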
If you run the main driver in client mode, but run other Spark jobs in
cluster mode, the
I am not aware of any problem with that.
Anyway, if you run a Spark application you would have multiple jobs, so it makes
sense that this is not a problem.
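As a side note, here is a minimal illustration of the jobs-vs-application distinction (my own example, not from the thread): one application, i.e. one SparkContext, runs one job per action:

```scala
// Minimal illustration: a single application (one SparkContext)
// runs several jobs, one per action.
import org.apache.spark.{SparkConf, SparkContext}

object MultiJobApp {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("multi-job-app"))
    val rdd = sc.parallelize(1 to 1000)

    val total = rdd.count()                              // job 1
    rdd.filter(_ % 2 == 0).saveAsTextFile("/tmp/evens")  // job 2 (placeholder path)

    println(s"count = $total")
    sc.stop()
  }
}
```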
Thanks David.
From: Naveen [mailto:hadoopst...@gmail.com]
Sent: Wednesday, December 21, 2016 9:18 AM
To: dev@spark.apache.org; u...@spark.apache.org
Is there any reason you need a context on the application launching the
jobs?
You can use SparkLauncher in a normal app and just listen for state
transitions.
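A minimal sketch of that approach, assuming a plain JVM app with no SparkContext of its own; the jar path and main class are placeholders:

```scala
// Launch from a plain JVM app and just listen for state transitions.
// Jar path and main class are placeholders.
import java.util.concurrent.CountDownLatch
import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

object LauncherWithListener {
  def main(args: Array[String]): Unit = {
    val done = new CountDownLatch(1)

    val listener = new SparkAppHandle.Listener {
      override def stateChanged(handle: SparkAppHandle): Unit = {
        println(s"state -> ${handle.getState}")
        if (handle.getState.isFinal) done.countDown()   // FINISHED, FAILED or KILLED
      }
      override def infoChanged(handle: SparkAppHandle): Unit = ()
    }

    new SparkLauncher()
      .setAppResource("/path/to/job.jar")   // placeholder
      .setMainClass("com.example.Job")      // placeholder
      .setMaster("yarn")
      .setDeployMode("cluster")
      .startApplication(listener)

    done.await()  // block until the launched application reaches a final state
  }
}
```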
On Wed, 21 Dec 2016, 11:44 Naveen, wrote:
> Hi Team,
>
> Thanks for your responses.
> Let me give more details in a picture of how I am trying to launch jobs.
Thanks Liang!
I get your point. It would mean that when launching Spark jobs, the mode needs
to be specified as client for all Spark jobs.
However, my concern is to know whether the driver's memory (the driver which is
launching the Spark jobs) will be used completely by the Futures (SparkContexts),
or whether these spawned SparkContexts
Hi Sebastian,
Yes, for fetching the details from Hive and HBase, I would want to use
Spark's HiveContext etc.
However, based on your point, I might have to check whether a JDBC-based driver
connection could be used to do the same.
The main reason for this is to avoid a client-server architecture design.
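For reference, the HiveContext route might look roughly like the sketch below (my assumption of the shape, using the Spark 1.x-style API; the database and table names are placeholders):

```scala
// Rough sketch of the HiveContext route (Spark 1.x-style API).
// Database/table name is a placeholder; this only shows the shape of the call.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object HiveFetchSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("hive-fetch"))
    val hiveContext = new HiveContext(sc)

    // Goes through the cluster's Hive metastore directly, with no separate
    // client-server JDBC layer inside the application.
    val df = hiveContext.sql("SELECT * FROM some_db.some_table LIMIT 10")
    df.show()

    sc.stop()
  }
}
```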
If
OK.
I think it is a somewhat unusual usage pattern, but it should work.
As I said before, if you want those Spark applications to share cluster
resources, proper configuration is needed for Spark.
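For example (this is only a sketch with placeholder values), the "proper configs" could cap each launched application's executors, or hand arbitration to dynamic allocation:

```scala
// Illustrative only: one shape the resource configs could take when launching
// each child application, so a single app does not grab the whole cluster.
// All values and paths are placeholders.
import org.apache.spark.launcher.SparkLauncher

object CappedChildLauncher {
  def main(args: Array[String]): Unit = {
    new SparkLauncher()
      .setAppResource("/path/to/child-job.jar")        // placeholder
      .setMainClass("com.example.ChildJob")            // placeholder
      .setMaster("yarn")
      .setDeployMode("client")
      .setConf(SparkLauncher.EXECUTOR_MEMORY, "2g")
      .setConf(SparkLauncher.EXECUTOR_CORES, "2")
      .setConf("spark.executor.instances", "4")        // hard cap per application
      // or, alternatively, let YARN arbitrate via dynamic allocation:
      // .setConf("spark.dynamicAllocation.enabled", "true")
      // .setConf("spark.shuffle.service.enabled", "true")
      .startApplication()
  }
}
```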
If you submit the main driver and all the other Spark applications in client
mode under YARN, you should make sure
Hi Team,
Thanks for your responses.
Let me give more details in a picture of how I am trying to launch jobs.
The main Spark job will launch other Spark jobs, similar to calling multiple
spark-submit commands from within a Spark driver program (see the sketch below).
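A rough sketch of that picture, assuming the child jobs are wrapped in Scala Futures around SparkLauncher (class names and the jar path are placeholders, not my actual components):

```scala
// The main driver fires off several child applications in parallel,
// much like issuing several spark-submit commands.
// Class names and the jar path are placeholders.
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global
import org.apache.spark.launcher.SparkLauncher

object ParallelChildJobs {
  def launchChild(mainClass: String): Future[Int] = Future {
    new SparkLauncher()
      .setAppResource("/path/to/child-jobs.jar")  // placeholder
      .setMainClass(mainClass)
      .setMaster("yarn")
      .setDeployMode("client")
      .launch()     // spawns a spark-submit child process
      .waitFor()    // exit code of that process
  }

  def main(args: Array[String]): Unit = {
    val children = Seq("com.example.JobA", "com.example.JobB").map(launchChild)
    val exitCodes = Await.result(Future.sequence(children), Duration.Inf)
    println(s"child exit codes: $exitCodes")
  }
}
```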
These spawned threads for new jobs will be totally different components,
Hi,
As you launch multiple Spark jobs through `SparkLauncher`, I think it
actually works as if you were running multiple Spark applications with `spark-submit`.
By default each application will try to use all available nodes. If your
purpose is to share cluster resources across those Spark jobs/applications