I want to run different jobs on demand using the same Spark context, but I don't know exactly how to do this.
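To make concrete what I mean by "on demand with the same Spark context", here is a rough sketch of the kind of thing I would like to end up with: a single long-running application whose SparkContext is shared by several jobs submitted as they arrive. The names and structure here are only illustrative, not code I actually have working:

    import org.apache.spark.{SparkConf, SparkContext}
    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration.Duration

    object SharedContextApp {
      def main(args: Array[String]): Unit = {
        // One context for the whole application; every job below reuses it.
        val sc = SparkContext.getOrCreate(new SparkConf().setAppName("shared-context"))

        // Each "job" is just an action submitted against the same SparkContext.
        // Spark's scheduler is thread-safe, so jobs can be submitted from separate threads.
        def runJob(jobId: Int): Future[Long] = Future {
          sc.parallelize(1 to 40000, 5).map(v => s"$jobId-$v").count()
        }

        val jobs = (1 to 3).map(runJob)
        jobs.foreach(f => Await.ready(f, Duration.Inf))

        sc.stop()
      }
    }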
I tried to get the current context, but it seems that a new Spark context is created instead (with new executors). I call spark-submit to add new jobs. I run the code on Amazon EMR (3 instances, 4 cores and 16 GB RAM per instance), with YARN as the resource manager.

My code:

    val sparkContext = SparkContext.getOrCreate()
    val content = 1 to 40000
    val result = sparkContext.parallelize(content, 5)
    result.map(value => value.toString).foreach(loop)

    def loop(x: String): Unit = {
      for (a <- 1 to 30000000) {
      }
    }

spark-submit:

    spark-submit --executor-cores 1 \
      --executor-memory 1g \
      --driver-memory 1g \
      --master yarn \
      --deploy-mode cluster \
      --conf spark.dynamicAllocation.enabled=true \
      --conf spark.shuffle.service.enabled=true \
      --conf spark.dynamicAllocation.minExecutors=1 \
      --conf spark.dynamicAllocation.maxExecutors=3 \
      --conf spark.dynamicAllocation.initialExecutors=3 \
      --conf spark.executor.instances=3 \

If I run spark-submit twice, it creates 6 executors, but I want all of these jobs to run in the same Spark application.

How can I add jobs to an existing Spark application?

I don't understand why SparkContext.getOrCreate() doesn't return the existing Spark context.

Thanks,
Cosmin P.