Got it, thanks.

On Thu, Jan 8, 2015 at 3:30 PM, Tim Chen <t...@mesosphere.io> wrote:

> In coarse grain mode, the spark executors are launched and kept running
> for as long as the scheduler is running. So if you launch a spark shell
> and keep it open, the executors keep running and won't exit until the
> shell is exited.
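>
> For example, a coarse-grain shell would be launched with something like
> this (spark.mesos.coarse is the relevant property; the master host/port
> and the core cap below are placeholders):
>
>     ./bin/spark-shell \
>       --master mesos://mesos-master:5050 \
>       --conf spark.mesos.coarse=true \
>       --conf spark.cores.max=8
>
> Everything claimed there stays allocated for the whole session.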
>
> In fine grain mode, the overhead mostly comes from downloading the
> spark tar (if it's not already deployed on the slaves) and launching the
> spark executor. I suggest you try it out and look at the latency to see
> whether it fits your use case.
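>
> For a sense of where that time goes: fine grain is the default
> (spark.mesos.coarse=false), and the tar download can be avoided by
> pre-deploying Spark on the slaves or pointing spark.executor.uri at a
> location they can fetch quickly (the URI below is a placeholder):
>
>     ./bin/spark-shell \
>       --master mesos://mesos-master:5050 \
>       --conf spark.executor.uri=hdfs://namenode/dist/spark-1.2.0.tgz
>
> With the tar in place, the remaining per-task cost is mostly executor
> startup.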
>
> Tim
>
> On Wed, Jan 7, 2015 at 11:19 PM, Xuelin Cao <xuelincao2...@gmail.com>
> wrote:
>
>>
>> Hi,
>>
>>      Thanks for the information.
>>
>>      One more thing I want to clarify: when does Mesos or Yarn allocate
>> and release resources? That is, what is the resource lifetime?
>>
>>      For example, in standalone mode, the resource is allocated when
>> the application is launched and released when the application
>> finishes.
>>
>>      Then it looks like, in the Mesos fine-grain mode, the resource is
>> allocated when a task is about to run and released when the task
>> finishes.
>>
>>      How about Mesos coarse-grain mode and Yarn mode? Is the resource
>> managed at the job level, i.e., does the resource lifetime equal the
>> job lifetime? Or at the stage level?
>>
>>      One more question about the Mesos fine-grain mode: what is the
>> overhead of resource allocation and release? In MapReduce, noticeable
>> time is spent waiting for resource allocation. How does Mesos
>> fine-grain mode compare?
>>
>>
>>
>> On Thu, Jan 8, 2015 at 3:07 PM, Tim Chen <t...@mesosphere.io> wrote:
>>
>>> Hi Xuelin,
>>>
>>> I can only speak to Mesos mode. Spark's Mesos scheduler has two modes
>>> of management: fine-grain mode and coarse-grain mode.
>>>
>>> In fine grain mode, each spark task launches one or more spark executors
>>> that only live through the lifetime of the task, so it's comparable to
>>> what you described.
>>>
>>> In coarse grain mode, it's going to support dynamic allocation of
>>> executors, but that operates at a higher level than tasks.
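>>>
>>> As a rough sketch, dynamic allocation is driven by properties like
>>> these (names are from the Spark configuration docs; the bounds are
>>> placeholders):
>>>
>>>     --conf spark.dynamicAllocation.enabled=true \
>>>     --conf spark.dynamicAllocation.minExecutors=2 \
>>>     --conf spark.dynamicAllocation.maxExecutors=20 \
>>>     --conf spark.shuffle.service.enabled=true
>>>
>>> Executors then grow and shrink between those bounds based on load, not
>>> per task.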
>>>
>>> As for a resource management recommendation, I think it's important to
>>> consider what other applications you want to run besides Spark in the
>>> same cluster, as well as your use cases, to see which resource manager
>>> fits your needs.
>>>
>>> Tim
>>>
>>>
>>> On Wed, Jan 7, 2015 at 10:55 PM, Xuelin Cao <xuelincao2...@gmail.com>
>>> wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>>      Currently, we are building up a mid-scale Spark cluster (100
>>>> nodes) at our company. One thing bothering us is how Spark manages
>>>> resources (CPU, memory).
>>>>
>>>>      I know there are 3 resource management modes: standalone, Mesos,
>>>> Yarn.
>>>>
>>>>      In standalone mode, the cluster master simply allocates the
>>>> resources when the application is launched. Suppose an engineer
>>>> launches a spark-shell, claiming 100 CPU cores and 100G memory, but
>>>> does nothing with them; the cluster master still allocates all of
>>>> those resources to the app. This is definitely not what we want.
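>>>>
>>>>      For instance, that shell would be launched with something like the
>>>> following (spark.cores.max and spark.executor.memory are the standard
>>>> properties; the master URL and per-executor memory are placeholders):
>>>>
>>>>     ./bin/spark-shell \
>>>>       --master spark://cluster-master:7077 \
>>>>       --conf spark.cores.max=100 \
>>>>       --conf spark.executor.memory=10g
>>>>
>>>> and the master holds all of it for the shell, idle or not.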
>>>>
>>>>      What we want is for the resource to be allocated when the actual
>>>> task is about to run. For example, in the map stage the app may need
>>>> 100 cores because the RDD has 100 partitions, while in the reduce
>>>> stage only 20 cores are needed because the RDD is shuffled into 20
>>>> partitions.
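>>>>
>>>>      As a concrete sketch of those two stages (the input path and the
>>>> partition counts are illustrative):
>>>>
>>>>     val pairs = sc.textFile("hdfs://namenode/input", 100) // map stage: ~100 partitions
>>>>       .flatMap(_.split(" "))
>>>>       .map(word => (word, 1))
>>>>     val counts = pairs.reduceByKey(_ + _, 20) // reduce stage: 20 partitions
>>>>
>>>> Ideally, the first stage would need up to 100 cores and the second only
>>>> 20.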
>>>>
>>>>      I'm not very clear about the granularity of Spark's resource
>>>> management. In standalone mode, the resource is allocated when the app
>>>> is launched. What about Mesos and Yarn? Can they support task-level
>>>> resource management?
>>>>
>>>>      And, what is the recommended mode for resource management? (Mesos?
>>>> Yarn?)
>>>>
>>>>      Thanks
>>>>
>>>>
>>>>
>>>
>>
>
