Got it, thanks.
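For anyone else following this thread: the switch between the two Mesos modes is the spark.mesos.coarse property on the SparkConf (fine-grain is the default in Spark 1.x). Below is a minimal sketch of how I understand the setup; the master URL and the core/memory figures are made-up placeholders, not values from this thread:

    // Sketch: selecting the Mesos scheduling mode in Spark 1.x (Scala).
    // spark.mesos.coarse = true  -> coarse-grain mode (long-lived executors)
    // spark.mesos.coarse = false -> fine-grain mode (executors live per task)
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("mesos://mesos-master.example.com:5050") // placeholder Mesos master URL
      .setAppName("MesosModeSketch")
      .set("spark.mesos.coarse", "true")  // flip to "false" for fine-grain mode
      .set("spark.cores.max", "100")      // cap on total cores in coarse-grain mode (placeholder)
      .set("spark.executor.memory", "4g") // per-executor memory (placeholder)
    val sc = new SparkContext(conf)

In coarse-grain mode the executors acquired this way stay up for the life of the SparkContext, which matches Tim's description below.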
On Thu, Jan 8, 2015 at 3:30 PM, Tim Chen <t...@mesosphere.io> wrote:

> In coarse-grain mode, the Spark executors are launched and kept running
> while the scheduler is running. So if you launch a spark-shell and leave
> it open, the executors keep running and won't finish until the shell is
> exited.
>
> In fine-grain mode, the overhead mostly comes from downloading the Spark
> tarball (if it's not already deployed on the slaves) and launching the
> Spark executor. I suggest you try it out and measure the latency to see
> whether it fits your use case.
>
> Tim
>
> On Wed, Jan 7, 2015 at 11:19 PM, Xuelin Cao <xuelincao2...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Thanks for the information.
>>
>> One more thing I want to clarify: when does Mesos or YARN allocate and
>> release resources? In other words, what is the resource lifetime?
>>
>> For example, in standalone mode, resources are allocated when the
>> application is launched and released when the application finishes.
>>
>> It looks like, in Mesos fine-grain mode, resources are allocated when a
>> task is about to run, and released when the task finishes.
>>
>> How about Mesos coarse-grain mode and YARN mode? Are resources managed
>> at the job level, i.e., does the resource lifetime equal the job
>> lifetime? Or at the stage level?
>>
>> One more question about Mesos fine-grain mode: what is the overhead of
>> allocating and releasing resources? In MapReduce, a noticeable amount
>> of time is spent waiting for resource allocation. How does Mesos
>> fine-grain mode compare?
>>
>> On Thu, Jan 8, 2015 at 3:07 PM, Tim Chen <t...@mesosphere.io> wrote:
>>
>>> Hi Xuelin,
>>>
>>> I can only speak about Mesos mode. There are two modes of management
>>> in Spark's Mesos scheduler: fine-grain mode and coarse-grain mode.
>>>
>>> In fine-grain mode, each Spark task launches one or more Spark
>>> executors that live only through the lifetime of the task. So it's
>>> comparable to what you described.
>>>
>>> Coarse-grain mode is going to support dynamic allocation of executors,
>>> but that operates at a higher level than tasks.
>>>
>>> As for a resource-management recommendation, I think it's important to
>>> consider what other applications you want to run alongside Spark in
>>> the same cluster, as well as your use cases, to see which resource
>>> manager fits your needs.
>>>
>>> Tim
>>>
>>> On Wed, Jan 7, 2015 at 10:55 PM, Xuelin Cao <xuelincao2...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> We are currently building a mid-scale Spark cluster (100 nodes) at
>>>> our company. One thing bothering us is how Spark manages resources
>>>> (CPU, memory).
>>>>
>>>> I know there are three resource-management modes: standalone, Mesos,
>>>> and YARN.
>>>>
>>>> In standalone mode, the cluster master simply allocates the resources
>>>> when the application is launched. Suppose an engineer launches a
>>>> spark-shell claiming 100 CPU cores and 100 GB of memory, but then
>>>> does nothing. The cluster master still allocates the resources to
>>>> this app even though the spark-shell is idle. This is definitely not
>>>> what we want.
>>>>
>>>> What we want is for resources to be allocated when an actual task is
>>>> about to run. For example, in the map stage the app may need 100
>>>> cores because the RDD has 100 partitions, while in the reduce stage
>>>> only 20 cores are needed because the RDD is shuffled into 20
>>>> partitions.
>>>>
>>>> I'm not very clear about the granularity of Spark's resource
>>>> management. In standalone mode, resources are allocated when the app
>>>> is launched. What about Mesos and YARN? Can they support task-level
>>>> resource management?
>>>>
>>>> And what is the recommended mode for resource management? (Mesos?
>>>> YARN?)
>>>>
>>>> Thanks
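A follow-up note on the dynamic allocation Tim mentions above: Spark 1.2 ships an opt-in mechanism (initially for YARN) that grows and shrinks the executor set with the task backlog, which is one answer to the idle spark-shell problem in the original question. A rough sketch of the relevant properties; the min/max/timeout numbers are made-up placeholders, and the external shuffle service must be running on each node:

    // Sketch: opting in to dynamic executor allocation (Spark 1.2+, YARN).
    // Idle executors are released after a timeout, so an idle spark-shell
    // stops holding 100 cores it isn't using.
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("DynamicAllocationSketch")
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.shuffle.service.enabled", "true")             // external shuffle service required
      .set("spark.dynamicAllocation.minExecutors", "2")         // placeholder floor
      .set("spark.dynamicAllocation.maxExecutors", "100")       // placeholder ceiling
      .set("spark.dynamicAllocation.executorIdleTimeout", "60") // seconds idle before release (placeholder)
    val sc = new SparkContext(conf)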