Yes, that is one of the basic reasons to use a jobserver/shared-SparkContext. Otherwise, in order to share the data in an RDD, you have to use an external storage system, such as a distributed filesystem or Tachyon.
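For concreteness, here is a minimal Scala sketch (not from the thread; all names are illustrative) of the pattern in question: one long-lived SparkContext whose cached RDD is reused by several logically separate jobs, with no external storage in between.

import org.apache.spark.{SparkConf, SparkContext}

object SharedContextSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("shared-context-sketch").setMaster("local[*]"))

    // Cache once; every later job run through this same SparkContext
    // reuses the cached partitions instead of rereading external storage.
    val shared = sc.parallelize(1 to 1000000).cache()

    // "Job 1" and "Job 2" are separate actions -- the kind a jobserver
    // would accept at different times, possibly from different clients.
    val sum   = shared.sum()                       // first action materializes the cache
    val evens = shared.filter(_ % 2 == 0).count()  // second action hits the cache

    println(s"sum=$sum evens=$evens")
    sc.stop()
  }
}

Once sc.stop() runs (i.e., the Application ends), the cached RDD is gone -- which is exactly why a jobserver keeps the context open instead.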
On Sun, Jan 17, 2016 at 1:52 PM, Jia <jacqueline...@gmail.com> wrote:

> Thanks, Mark. Then, I guess JobServer can fundamentally solve my problem,
> so that jobs can be submitted at different times and still share RDDs.
>
> Best Regards,
> Jia
>
> On Jan 17, 2016, at 3:44 PM, Mark Hamstra <m...@clearstorydata.com> wrote:
>
> There is a 1-to-1 relationship between Spark Applications and
> SparkContexts -- fundamentally, a Spark Application is a program that
> creates and uses a SparkContext, and that SparkContext is destroyed when
> the Application ends. A jobserver generically, and the Spark JobServer
> specifically, is an Application that keeps a SparkContext open for a long
> time and allows many Jobs to be submitted and run using that shared
> SparkContext.
>
> More than one Application/SparkContext unavoidably implies more than one
> JVM process per Worker -- Applications/SparkContexts cannot share JVM
> processes.
>
> On Sun, Jan 17, 2016 at 1:15 PM, Jia <jacqueline...@gmail.com> wrote:
>
>> Hi, Mark, sorry for the confusion.
>>
>> Let me clarify: when an application is submitted, the master will tell
>> each Spark worker to spawn an executor JVM process. All the task sets of
>> the application will be executed by that executor. After the application
>> runs to completion, the executor process will be killed.
>> But I hope that all submitted applications can run in the same executor.
>> Can JobServer do that? If so, it’s really good news!
>>
>> Best Regards,
>> Jia
>>
>> On Jan 17, 2016, at 3:09 PM, Mark Hamstra <m...@clearstorydata.com>
>> wrote:
>>
>> You've still got me confused. The SparkContext exists at the Driver, not
>> on an Executor.
>>
>> Many Jobs can be run by a SparkContext -- it is a common pattern to use
>> something like the Spark Jobserver, where all Jobs are run through a
>> shared SparkContext.
>>
>> On Sun, Jan 17, 2016 at 12:57 PM, Jia Zou <jacqueline...@gmail.com>
>> wrote:
>>
>>> Hi, Mark, sorry, I meant SparkContext.
>>> I mean to change Spark so that all submitted jobs (SparkContexts) run
>>> in one executor JVM.
>>>
>>> Best Regards,
>>> Jia
>>>
>>> On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra <m...@clearstorydata.com>
>>> wrote:
>>>
>>>> -dev
>>>>
>>>> What do you mean by JobContext? That is a Hadoop MapReduce concept,
>>>> not Spark.
>>>>
>>>> On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <jacqueline...@gmail.com>
>>>> wrote:
>>>>
>>>>> Dear all,
>>>>>
>>>>> Is there a way to reuse an executor JVM across different JobContexts?
>>>>> Thanks.
>>>>>
>>>>> Best Regards,
>>>>> Jia
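For readers who want to see what "many Jobs through one shared SparkContext" looks like in practice, here is a rough sketch of a job written against the Spark JobServer's SparkJob trait and its NamedRddSupport mixin. Signatures are assumed from the project's README of that era -- verify against the spark-jobserver version you actually deploy.

import com.typesafe.config.Config
import org.apache.spark.SparkContext
import spark.jobserver.{NamedRddSupport, SparkJob, SparkJobValid, SparkJobValidation}

object SharedRddJob extends SparkJob with NamedRddSupport {
  // Called before runJob; returning SparkJobValid lets the job proceed.
  override def validate(sc: SparkContext, config: Config): SparkJobValidation =
    SparkJobValid

  // Runs inside the long-lived SparkContext the jobserver keeps open, so an
  // RDD registered under a name by one job is visible to jobs submitted later.
  // "shared.numbers" is an illustrative name, not a jobserver convention.
  override def runJob(sc: SparkContext, config: Config): Any = {
    val shared = namedRdds.get[Int]("shared.numbers").getOrElse {
      val rdd = sc.parallelize(1 to 1000000)
      namedRdds.update("shared.numbers", rdd)  // cache and register for later jobs
      rdd
    }
    shared.filter(_ % 2 == 0).count()
  }
}

Jobs like this are then uploaded as a jar and triggered through the jobserver's REST API at different times, all against the same named context, which is what lets them share RDDs without an external storage system.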