Another option is to launch a single SparkContext and then run your jobs inside of it.
You can take a look at these projects:
- https://github.com/spark-jobserver/spark-jobserver#persistent-context-mode---faster--required-for-related-jobs
- http://livy.incubator.apache.org

Problems with this approach:
- You can't update the code of your job without restarting the shared context.
- A single misbehaving job can break the shared SparkContext for everyone.

We evaluated this approach and decided to go with dynamic allocation instead, but we also had to rethink the way we write our jobs:
- We can't use caching, since it pins executors; we use checkpointing instead, which adds to computation time.
- We use some unconventional methods, like reusing the same DataFrame to write out multiple separate outputs in one pass.
- We sometimes release executors from within a job, e.g. once we know how many we actually need, so they can join other jobs.

Rough sketches of these patterns are appended below the quoted message.

On Tue, Feb 6, 2018 at 3:00 PM, Nirav Patel <npa...@xactlycorp.com> wrote:
> Currently the SparkContext and its executor pool are not shareable. Each
> SparkContext gets its own executor pool for the entire life of an application.
> So what are the best ways to share cluster resources across multiple
> long-running Spark applications?
>
> The only one I see is Spark dynamic allocation, but it has high latency when
> it comes to real-time applications.
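For reference, dynamic allocation is driven by a handful of standard configuration keys. The values below are only illustrative (the app name is a placeholder), and the external shuffle service must be running on the workers so shuffle data survives executor removal:

    import org.apache.spark.sql.SparkSession

    // Illustrative values only; tune min/max executors and the idle timeout for your cluster.
    val spark = SparkSession.builder()
      .appName("shared-cluster-job")                                  // placeholder name
      .config("spark.dynamicAllocation.enabled", "true")
      .config("spark.shuffle.service.enabled", "true")                // shuffle files must outlive executors
      .config("spark.dynamicAllocation.minExecutors", "2")
      .config("spark.dynamicAllocation.maxExecutors", "50")
      .config("spark.dynamicAllocation.executorIdleTimeout", "60s")   // give idle executors back quickly
      .getOrCreate()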
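A minimal sketch of the "checkpoint instead of cache" point, continuing with the `spark` session from the sketch above. Paths and column names are made up; Dataset.checkpoint() materializes the intermediate result to the checkpoint directory (eagerly by default), so downstream actions re-read it from disk instead of pinning blocks on executors:

    import org.apache.spark.sql.functions.col

    // Checkpoint directory must live on a reliable filesystem (path is an example).
    spark.sparkContext.setCheckpointDir("hdfs:///tmp/spark-checkpoints")

    val events = spark.read.parquet("hdfs:///data/events")            // hypothetical input

    // Instead of events.cache(), which keeps blocks pinned on executors,
    // checkpoint the cleaned data; the executors that produced it can then be released.
    val cleaned = events.filter(col("status") === "ok").checkpoint()

    cleaned.groupBy("country").count().write.mode("overwrite").parquet("hdfs:///out/by_country")
    cleaned.groupBy("device").count().write.mode("overwrite").parquet("hdfs:///out/by_device")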
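The "write out multiple separate things in one go" trick is not spelled out in the message; one common pattern it could refer to is a single partitioned write instead of several filtered writes over the same DataFrame. A sketch with a hypothetical column and output path:

    // One pass over the data produces a separate directory per event_type value,
    // instead of N filtered writes that would each rescan (or require caching) the DataFrame.
    cleaned.write
      .partitionBy("event_type")                     // hypothetical column
      .mode("overwrite")
      .parquet("hdfs:///out/events_by_type")         // hypothetical output path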
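Releasing executors from inside a job can be done through SparkContext.killExecutors, a developer API whose request the cluster manager may or may not honor. The executor IDs below are placeholders; in practice you would look them up, for example in the Spark UI:

    // Best-effort request to remove specific executors so other applications
    // (or other jobs sharing the cluster) can pick up the freed slots.
    val honored = spark.sparkContext.killExecutors(Seq("3", "4"))    // placeholder executor IDs
    if (!honored) {
      println("Cluster manager declined the executor removal request")
    }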