Yes, that is one of the basic reasons to use a jobserver/shared-SparkContext. Otherwise, in order to share the data in an RDD, you have to use an external storage system, such as a distributed filesystem or Tachyon.
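For concreteness, here is a minimal Scala sketch (not from the thread; all names are illustrative) of the pattern in question: one long-lived SparkContext whose cached RDD is reused by several logically separate jobs, with no external storage in between.

import org.apache.spark.{SparkConf, SparkContext}

object SharedContextSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("shared-context-sketch").setMaster("local[*]"))

    // Cache once; every later job run through this same SparkContext
    // reuses the cached partitions instead of rereading external storage.
    val shared = sc.parallelize(1 to 1000000).cache()

    // "Job 1" and "Job 2" are separate actions -- the kind a jobserver
    // would accept at different times, possibly from different clients.
    val sum   = shared.sum()                       // first action materializes the cache
    val evens = shared.filter(_ % 2 == 0).count()  // second action hits the cache

    println(s"sum=$sum evens=$evens")
    sc.stop()
  }
}

Once sc.stop() runs (i.e., the Application ends), the cached RDD is gone -- which is exactly why a jobserver keeps the context open instead.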
On Sun, Jan 17, 2016 at 1:52 PM, Jia <jacqueline...@gmail.com> wrote:

> Thanks, Mark. Then, I guess JobServer can fundamentally solve my problem,
> so that jobs can be submitted at different times and still share RDDs.
>
> Best Regards,
> Jia
>
> On Jan 17, 2016, at 3:44 PM, Mark Hamstra <m...@clearstorydata.com> wrote:
>
> There is a 1-to-1 relationship between Spark Applications and
> SparkContexts -- fundamentally, a Spark Application is a program that
> creates and uses a SparkContext, and that SparkContext is destroyed when
> the Application ends. A jobserver generically, and the Spark JobServer
> specifically, is an Application that keeps a SparkContext open for a long
> time and allows many Jobs to be submitted and run using that shared
> SparkContext.
>
> More than one Application/SparkContext unavoidably implies more than one
> JVM process per Worker -- Applications/SparkContexts cannot share JVM
> processes.
>
> On Sun, Jan 17, 2016 at 1:15 PM, Jia <jacqueline...@gmail.com> wrote:
>
>> Hi, Mark, sorry for the confusion.
>>
>> Let me clarify: when an application is submitted, the master will tell
>> each Spark worker to spawn an executor JVM process. All the task sets of
>> the application will be executed by that executor. After the application
>> runs to completion, the executor process will be killed.
>> But I hope that all submitted applications can run in the same executor.
>> Can JobServer do that? If so, it’s really good news!
>>
>> Best Regards,
>> Jia
>>
>> On Jan 17, 2016, at 3:09 PM, Mark Hamstra <m...@clearstorydata.com>
>> wrote:
>>
>> You've still got me confused. The SparkContext exists at the Driver, not
>> on an Executor.
>>
>> Many Jobs can be run by a SparkContext -- it is a common pattern to use
>> something like the Spark Jobserver, where all Jobs are run through a
>> shared SparkContext.
>>
>> On Sun, Jan 17, 2016 at 12:57 PM, Jia Zou <jacqueline...@gmail.com>
>> wrote:
>>
>>> Hi, Mark, sorry, I meant SparkContext.
>>> I mean to change Spark so that all submitted jobs (SparkContexts) run
>>> in one executor JVM.
>>>
>>> Best Regards,
>>> Jia
>>>
>>> On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra <m...@clearstorydata.com>
>>> wrote:
>>>
>>>> -dev
>>>>
>>>> What do you mean by JobContext? That is a Hadoop MapReduce concept,
>>>> not Spark.
>>>>
>>>> On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <jacqueline...@gmail.com>
>>>> wrote:
>>>>
>>>>> Dear all,
>>>>>
>>>>> Is there a way to reuse an executor JVM across different JobContexts?
>>>>> Thanks.
>>>>>
>>>>> Best Regards,
>>>>> Jia
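For readers who want to see what "many Jobs through one shared SparkContext" looks like in practice, here is a rough sketch of a job written against the Spark JobServer's SparkJob trait and its NamedRddSupport mixin. Signatures are assumed from the project's README of that era -- verify against the spark-jobserver version you actually deploy.

import com.typesafe.config.Config
import org.apache.spark.SparkContext
import spark.jobserver.{NamedRddSupport, SparkJob, SparkJobValid, SparkJobValidation}

object SharedRddJob extends SparkJob with NamedRddSupport {
  // Called before runJob; returning SparkJobValid lets the job proceed.
  override def validate(sc: SparkContext, config: Config): SparkJobValidation =
    SparkJobValid

  // Runs inside the long-lived SparkContext the jobserver keeps open, so an
  // RDD registered under a name by one job is visible to jobs submitted later.
  // "shared.numbers" is an illustrative name, not a jobserver convention.
  override def runJob(sc: SparkContext, config: Config): Any = {
    val shared = namedRdds.get[Int]("shared.numbers").getOrElse {
      val rdd = sc.parallelize(1 to 1000000)
      namedRdds.update("shared.numbers", rdd)  // cache and register for later jobs
      rdd
    }
    shared.filter(_ % 2 == 0).count()
  }
}

Jobs like this are then uploaded as a jar and triggered through the jobserver's REST API at different times, all against the same named context, which is what lets them share RDDs without an external storage system.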