Hi,

Michael's answer will solve the problem if you are using only a SQL-based solution.
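
(For reference, a minimal sketch of what the SQL route looks like from a
client: queries go over JDBC to one long-running Thrift Server application, so
every query shares its Spark context instead of starting a new application.
The host, port, user and table name below are placeholders, and the Hive JDBC
driver is assumed to be on the classpath.)

  import java.sql.DriverManager

  // Placeholders: Thrift Server host/port and an existing table.
  val conn = DriverManager.getConnection(
    "jdbc:hive2://thrift-server-host:10000/default", "hadoop", "")
  val stmt = conn.createStatement()
  // Each query runs as a job inside the single Spark application that backs
  // the Thrift Server; no new executors are requested per query.
  val rs = stmt.executeQuery("SELECT count(*) FROM my_table")
  while (rs.next()) println(rs.getLong(1))
  rs.close(); stmt.close(); conn.close()
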
Otherwise, please refer to the details here:
https://spark.apache.org/docs/latest/job-scheduling.html. With the release of
EMR 5.3.0, Spark 2.1.0 is available in AWS. (Note that there is an issue with
using Zeppelin on it; I have raised it with AWS and they are looking into it.)

Regards,
Gourav Sengupta

On Tue, Feb 7, 2017 at 10:37 PM, Michael Segel <msegel_had...@hotmail.com> wrote:

> Why couldn't you use the Spark Thrift Server?
>
> On Feb 7, 2017, at 1:28 PM, Cosmin Posteuca <cosmin.poste...@gmail.com> wrote:
>
> Answer for Gourav Sengupta:
>
> I want to use the same Spark application because I want it to work as a
> FIFO scheduler. My problem is that I have many (not very big) jobs, and if
> I run a separate application for every job the cluster splits resources
> like a FAIR scheduler (that's what I observe, maybe I'm wrong), which can
> create a bottleneck. The start time isn't a problem for me, because it
> isn't a real-time application.
>
> I need a business solution; that's the reason why I can't use code from
> GitHub.
>
> Thanks!
>
> 2017-02-07 19:55 GMT+02:00 Gourav Sengupta <gourav.sengu...@gmail.com>:
>
>> Hi,
>>
>> May I ask the reason for using the same Spark application? Is it because
>> of the time it takes to start a Spark context?
>>
>> On another note, you may want to look at the number of contributors in a
>> GitHub repo before choosing a solution.
>>
>> Regards,
>> Gourav
>>
>> On Tue, Feb 7, 2017 at 5:26 PM, vincent gromakowski <
>> vincent.gromakow...@gmail.com> wrote:
>>
>>> Spark Jobserver or Livy server are the best options for a purely
>>> technical API. If you want to publish a business API you will probably
>>> have to build your own app, like the one I wrote a year ago:
>>> https://github.com/elppc/akka-spark-experiments
>>> It combines Akka actors and a shared Spark context to serve concurrent
>>> sub-second jobs.
>>>
>>> 2017-02-07 15:28 GMT+01:00 ayan guha <guha.a...@gmail.com>:
>>>
>>>> I think you are looking for Livy or Spark Jobserver.
>>>>
>>>> On Wed, 8 Feb 2017 at 12:37 am, Cosmin Posteuca <
>>>> cosmin.poste...@gmail.com> wrote:
>>>>
>>>>> I want to run different jobs on demand with the same Spark context,
>>>>> but I don't know exactly how I can do this.
>>>>>
>>>>> I try to get the current context, but it seems to create a new Spark
>>>>> context (with new executors).
>>>>>
>>>>> I call spark-submit to add new jobs.
>>>>>
>>>>> I run the code on Amazon EMR (3 instances, 4 cores & 16 GB RAM per
>>>>> instance), with YARN as the resource manager.
>>>>>
>>>>> My code:
>>>>>
>>>>> import org.apache.spark.SparkContext
>>>>>
>>>>> val sparkContext = SparkContext.getOrCreate()
>>>>> val content = 1 to 40000
>>>>> val result = sparkContext.parallelize(content, 5)
>>>>> result.map(value => value.toString).foreach(loop)
>>>>>
>>>>> // Busy loop that only burns CPU, to simulate some work per record.
>>>>> def loop(x: String): Unit = {
>>>>>   for (a <- 1 to 30000000) {
>>>>>   }
>>>>> }
>>>>>
>>>>> spark-submit:
>>>>>
>>>>> spark-submit --executor-cores 1 \
>>>>>   --executor-memory 1g \
>>>>>   --driver-memory 1g \
>>>>>   --master yarn \
>>>>>   --deploy-mode cluster \
>>>>>   --conf spark.dynamicAllocation.enabled=true \
>>>>>   --conf spark.shuffle.service.enabled=true \
>>>>>   --conf spark.dynamicAllocation.minExecutors=1 \
>>>>>   --conf spark.dynamicAllocation.maxExecutors=3 \
>>>>>   --conf spark.dynamicAllocation.initialExecutors=3 \
>>>>>   --conf spark.executor.instances=3
>>>>>
>>>>> If I run spark-submit twice it creates 6 executors, but I want to run
>>>>> all these jobs in the same Spark application.
>>>>>
>>>>> How can I add jobs to an existing Spark application?
>>>>>
>>>>> I don't understand why SparkContext.getOrCreate() doesn't return the
>>>>> existing Spark context.
>>>>>
>>>>> Thanks,
>>>>> Cosmin P.
>>>>
>>>> --
>>>> Best Regards,
>>>> Ayan Guha
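
P.S. To make the non-SQL route above concrete: every spark-submit starts a new
YARN application, and SparkContext.getOrCreate() only reuses a context inside
the same driver JVM, which is why two submits give you six executors. The usual
pattern is a single long-running driver that owns one SparkContext and submits
incoming work as jobs on it; with the default FIFO scheduler the jobs queue in
order, or with spark.scheduler.mode=FAIR and pools they share resources. A
minimal sketch of that pattern (the thread-pool size, job bodies and trigger
mechanism are only illustrative; in a real service the jobs would arrive via an
HTTP endpoint, a queue, Akka actors, etc.):

  import java.util.concurrent.Executors
  import scala.concurrent.duration.Duration
  import scala.concurrent.{Await, ExecutionContext, Future}
  import org.apache.spark.{SparkConf, SparkContext}

  object SharedContextJobs {
    def main(args: Array[String]): Unit = {
      // One long-lived context for the whole application.
      val sc = new SparkContext(new SparkConf().setAppName("shared-context-jobs"))

      // Each thread in this pool can submit Spark jobs concurrently; the
      // pool size (4) is arbitrary. Jobs are scheduled FIFO by default.
      implicit val ec: ExecutionContext =
        ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(4))

      // Stand-ins for "jobs on demand".
      val jobs = (1 to 10).map { i =>
        Future {
          sc.parallelize(1 to 40000, 5).map(_.toString).count()
        }
      }

      Await.result(Future.sequence(jobs), Duration.Inf)
      sc.stop()
    }
  }

Livy interactive sessions and Spark Jobserver, mentioned earlier in the thread,
give you the same long-lived shared-context behaviour over a REST API without
writing the service yourself.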