I think you are looking for Livy or the Spark Job Server; a rough sketch of the Livy route is at the end of this message.

On Wed, 8 Feb 2017 at 12:37 am, Cosmin Posteuca <cosmin.poste...@gmail.com> wrote:
> I want to run different jobs on demand with the same Spark context, but I
> don't know exactly how I can do this.
>
> I try to get the current context, but it seems to create a new Spark
> context (with new executors).
>
> I call spark-submit to add new jobs.
>
> I run the code on Amazon EMR (3 instances, 4 cores & 16 GB RAM per
> instance), with YARN as the resource manager.
>
> My code:
>
> val sparkContext = SparkContext.getOrCreate()
> val content = 1 to 40000
> val result = sparkContext.parallelize(content, 5)
> result.map(value => value.toString).foreach(loop)
>
> def loop(x: String): Unit = {
>   for (a <- 1 to 30000000) {
>   }
> }
>
> spark-submit:
>
> spark-submit --executor-cores 1 \
>   --executor-memory 1g \
>   --driver-memory 1g \
>   --master yarn \
>   --deploy-mode cluster \
>   --conf spark.dynamicAllocation.enabled=true \
>   --conf spark.shuffle.service.enabled=true \
>   --conf spark.dynamicAllocation.minExecutors=1 \
>   --conf spark.dynamicAllocation.maxExecutors=3 \
>   --conf spark.dynamicAllocation.initialExecutors=3 \
>   --conf spark.executor.instances=3
>
> If I run spark-submit twice it creates 6 executors, but I want to run all
> these jobs in the same Spark application.
>
> How can I achieve adding jobs to an existing Spark application?
>
> I don't understand why SparkContext.getOrCreate() doesn't get the existing
> Spark context.
>
> Thanks,
>
> Cosmin P.

--
Best Regards,
Ayan Guha
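To make the Livy suggestion concrete, here is a minimal, untested sketch of driving one shared SparkContext through Livy's REST API. It assumes a Livy server is already running (default port 8998); the host name, session settings, and the hard-coded session id 0 are placeholders, not part of the original thread.

import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets

object LivySubmit {
  // Placeholder host: point this at the machine running the Livy server.
  val livyUrl = "http://localhost:8998"

  // Small helper that POSTs a JSON body and returns the raw JSON response.
  def post(path: String, json: String): String = {
    val conn = new URL(livyUrl + path).openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("POST")
    conn.setRequestProperty("Content-Type", "application/json")
    conn.setDoOutput(true)
    conn.getOutputStream.write(json.getBytes(StandardCharsets.UTF_8))
    scala.io.Source.fromInputStream(conn.getInputStream).mkString
  }

  def main(args: Array[String]): Unit = {
    // 1) Create ONE interactive session: Livy starts a single Spark
    //    application on YARN and keeps its SparkContext alive.
    println(post("/sessions",
      """{"kind": "spark", "executorCores": 1, "executorMemory": "1g"}"""))

    // 2) Submit as many jobs as needed to that same session/SparkContext.
    //    The session id (0 here) comes from the JSON returned above.
    println(post("/sessions/0/statements",
      """{"code": "sc.parallelize(1 to 40000, 5).map(_.toString).count()"}"""))
  }
}

Each POST to /sessions/{id}/statements runs as another job inside the same YARN application, so no extra executors beyond the session's own allocation are requested; results can be polled with GET /sessions/{id}/statements.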