Re: Launching multiple spark jobs within a main spark job.

2016-12-24 Thread Naveen
Thanks Liang, Vadim, and everyone for your inputs! With this clarity, I've tried client mode for both the main and sub Spark jobs. Every main Spark job and its corresponding threaded Spark jobs show up in the YARN applications list, and the jobs execute properly. I now need to test

Re: Launching multiple spark jobs within a main spark job.

2016-12-21 Thread Liang-Chi Hsieh
If you run the main driver and the other Spark jobs in client mode, you can make sure they (all the drivers, that is) run on the same node. Of course, all the drivers then consume resources on that one node. If you run the main driver in client mode but the other Spark jobs in cluster mode, the
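
For concreteness, a minimal sketch of the two submit modes being contrasted here (the jar path and class name are placeholders):

    # client mode: the driver JVM runs on the node you submit from,
    # so several such drivers stack up on that one machine
    spark-submit --master yarn --deploy-mode client \
      --class com.example.SubJob /path/to/sub-job.jar

    # cluster mode: YARN places the driver in a container somewhere on the cluster
    spark-submit --master yarn --deploy-mode cluster \
      --class com.example.SubJob /path/to/sub-job.jar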

RE: Launching multiple spark jobs within a main spark job.

2016-12-21 Thread David Hodeffi
I am not familiar with any problem with that. Anyway, if you run a Spark application you already have multiple jobs, so it makes sense that this is not a problem. Thanks, David. From: Naveen [mailto:hadoopst...@gmail.com] Sent: Wednesday, December 21, 2016 9:18 AM To: dev@spark.apache.org; u...@spark.ap

Re: Launching multiple spark jobs within a main spark job.

2016-12-21 Thread Sebastian Piu
Is there any reason you need a context in the application launching the jobs? You can use SparkLauncher from a normal app and just listen for state transitions. On Wed, 21 Dec 2016, 11:44 Naveen wrote: > Hi Team, > > Thanks for your responses. > Let me give more details in a picture of how I am tr
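
A minimal sketch of what Sebastian describes, assuming Spark 1.6+ where SparkLauncher.startApplication returns a SparkAppHandle; the jar path and class name are placeholders:

    import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

    // Launch a Spark application from a plain JVM app; no SparkContext is needed here.
    val handle = new SparkLauncher()
      .setAppResource("/path/to/sub-job.jar")   // placeholder jar
      .setMainClass("com.example.SubJob")       // placeholder main class
      .setMaster("yarn")
      .setDeployMode("cluster")
      .startApplication(new SparkAppHandle.Listener {
        // Called on every state transition (CONNECTED, RUNNING, FINISHED, FAILED, ...)
        override def stateChanged(h: SparkAppHandle): Unit =
          println(s"app ${h.getAppId} -> ${h.getState}")
        override def infoChanged(h: SparkAppHandle): Unit = ()
      })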

Re: Launching multiple spark jobs within a main spark job.

2016-12-21 Thread Naveen
Thanks Liang! I get your point. It would mean that when launching Spark jobs, the mode needs to be specified as client for all of them. However, my concern is whether the memory of the driver (the one launching the Spark jobs) will be used up entirely by the Futures (the SparkContexts), or whether these spawned SparkConte
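
One point worth noting on the memory concern (a sketch, not from the thread): each application started through SparkLauncher runs as a separate child JVM, so its driver heap is sized independently of the parent's; the paths and class name below are placeholders:

    import org.apache.spark.launcher.SparkLauncher

    // The parent driver's heap only pays for the Future and the lightweight
    // handle; the child SparkContext lives in its own process with its own heap.
    val launcher = new SparkLauncher()
      .setAppResource("/path/to/sub-job.jar")       // placeholder
      .setMainClass("com.example.SubJob")           // placeholder
      .setMaster("yarn")
      .setDeployMode("client")
      .setConf(SparkLauncher.DRIVER_MEMORY, "1g")   // per-child driver heap
    // then .startApplication(...) as in the earlier sketch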

Re: Launching multiple spark jobs within a main spark job.

2016-12-21 Thread Naveen
Hi Sebastian, Yes, for fetching the details from Hive and HBase, I would want to use Spark's HiveContext, etc. However, based on your point, I might have to check whether a JDBC-based driver connection could be used to do the same. The main reason for this is to avoid a client-server architecture design. If
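
For the Hive side, a minimal HiveContext sketch as of the Spark 1.x/early 2.x APIs current in this thread (the database and table names are made up):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("sub-job"))
    val hiveCtx = new HiveContext(sc)
    // Query a Hive table through Spark SQL; names are placeholders.
    val df = hiveCtx.sql("SELECT key, value FROM some_db.some_table")
    df.show()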

Re: Launching multiple spark jobs within a main spark job.

2016-12-21 Thread Liang-Chi Hsieh
OK. I think it is a slightly unusual usage pattern, but it should work. As I said before, if you want those Spark applications to share cluster resources, proper configuration is needed for Spark. If you submit the main driver and all other Spark applications in client mode under YARN, you should make sure
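
As an example of the kind of configuration Liang-Chi means, capping each application's footprint (or pinning it to a YARN queue) keeps the concurrent jobs from starving each other; the queue name and sizes below are illustrative:

    # Cap each sub-job's footprint so the applications can coexist on the cluster.
    spark-submit --master yarn --deploy-mode client \
      --queue subjobs \
      --num-executors 4 --executor-cores 2 --executor-memory 2g \
      --class com.example.SubJob /path/to/sub-job.jar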

Re: Launching multiple spark jobs within a main spark job.

2016-12-21 Thread Naveen
Hi Team, Thanks for your responses. Let me give more details to paint a picture of how I am trying to launch the jobs. The main Spark job will launch the other Spark jobs, similar to invoking multiple spark-submit calls from within a Spark driver program. These spawned threads for the new jobs will be totally different components,
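
A sketch of that launch pattern, wrapping SparkLauncher calls in Futures (class names and the jar path are placeholders; assumes Spark 1.6+ for startApplication):

    import java.util.concurrent.CountDownLatch
    import scala.concurrent.Future
    import scala.concurrent.ExecutionContext.Implicits.global
    import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

    val subJobs = Seq("com.example.JobA", "com.example.JobB")  // placeholders

    val results: Seq[Future[SparkAppHandle.State]] = subJobs.map { mainClass =>
      Future {
        val done = new CountDownLatch(1)
        val handle = new SparkLauncher()
          .setAppResource("/path/to/jobs.jar")   // placeholder
          .setMainClass(mainClass)
          .setMaster("yarn")
          .setDeployMode("client")
          .startApplication(new SparkAppHandle.Listener {
            // Release the latch once the child application reaches a final state.
            override def stateChanged(h: SparkAppHandle): Unit =
              if (h.getState.isFinal) done.countDown()
            override def infoChanged(h: SparkAppHandle): Unit = ()
          })
        done.await()   // block this Future's thread until the child app ends
        handle.getState
      }
    }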

Re: Launching multiple spark jobs within a main spark job.

2016-12-21 Thread Liang-Chi Hsieh
Hi, As you launch multiple Spark jobs through `SparkLauncher`, I think it actually works as if you ran multiple Spark applications with `spark-submit`. By default, each application will try to use all available nodes. If your purpose is to share cluster resources across those Spark jobs/application
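
If the goal is sharing rather than static caps, dynamic allocation is the usual knob; a sketch (requires the external shuffle service on YARN, and the executor limit here is illustrative):

    spark-submit --master yarn --deploy-mode client \
      --conf spark.dynamicAllocation.enabled=true \
      --conf spark.shuffle.service.enabled=true \
      --conf spark.dynamicAllocation.maxExecutors=8 \
      --class com.example.SubJob /path/to/sub-job.jar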