Can Spark support task-level resource management?
Hi,

We are currently building a medium-scale Spark cluster (100 nodes) in our company. One thing bothering us is how Spark manages resources (CPU, memory).

I know there are three resource management modes: standalone, Mesos, and YARN.

In standalone mode, the cluster master simply allocates the resources when the application is launched. Suppose an engineer launches a spark-shell claiming 100 CPU cores and 100 GB of memory, and then does nothing. The cluster master still allocates those resources to the app, even though the spark-shell is idle. This is definitely not what we want.

What we want is for resources to be allocated when the actual tasks are about to run. For example, in the map stage the app may need 100 cores because the RDD has 100 partitions, while in the reduce stage only 20 cores are needed because the RDD is shuffled into 20 partitions.

I'm not very clear about the granularity of Spark's resource management. In standalone mode, resources are allocated when the app is launched. What about Mesos and YARN? Can they support task-level resource management?

Also, what is the recommended mode for resource management (Mesos or YARN)?

Thanks
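To put rough numbers on the concern above, here is a minimal back-of-the-envelope sketch in plain Python (no Spark required). It compares the core-seconds held when 100 cores are reserved for the whole application lifetime with the core-seconds held when cores follow the partition count of each stage. The stage durations and the idle time are made-up values for illustration, and the `Stage` class and both helper functions are hypothetical names, not part of any Spark API.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    partitions: int   # assumes one task per partition and one core per task
    seconds: float    # hypothetical wall-clock duration of the stage


def core_seconds_reserved_up_front(stages, reserved_cores, idle_seconds=0.0):
    """Cores are held for the whole application lifetime, busy or not."""
    lifetime = idle_seconds + sum(s.seconds for s in stages)
    return reserved_cores * lifetime


def core_seconds_per_stage(stages):
    """Cores are held only while the tasks of a stage are running."""
    return sum(s.partitions * s.seconds for s in stages)


# The map/reduce example from the question, with assumed durations.
stages = [Stage("map", 100, 60.0), Stage("reduce", 20, 60.0)]

# Assume the shell sits idle for 10 minutes in addition to the two stages.
up_front = core_seconds_reserved_up_front(stages, reserved_cores=100, idle_seconds=600.0)
on_demand = core_seconds_per_stage(stages)

print(f"reserved up front:   {up_front:.0f} core-seconds")
print(f"allocated per stage: {on_demand:.0f} core-seconds")
```

Under these assumed numbers the up-front reservation holds ten times as many core-seconds as the per-stage allocation, which is the gap the question is asking the cluster manager to close.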
Re: Can Spark support task-level resource management?
Hi Xuelin,

Spark 1.2 includes a "dynamic allocation" feature that allows Spark on YARN to modulate its YARN resource consumption as the demands of the application grow and shrink. This is somewhat coarser than what you call task-level resource management. Elasticity comes through allocating and releasing executors, not through requesting resources from YARN for individual tasks. It would be good to add finer-grained task-level elasticity as well, but this will rely on some YARN work (YARN-1197) for changing the resource allocation of a running container.

Mesos has a "fine-grained" mode similar to what you're wondering about. It's documented here: https://spark.apache.org/docs/latest/running-on-mesos.html#mesos-run-modes

-Sandy

On Wed, Jan 7, 2015 at 10:55 PM, Xuelin Cao wrote:
> What about Mesos and YARN? Can they support task-level resource management?
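As a rough illustration of how the dynamic allocation feature mentioned above is switched on, the sketch below assembles the relevant Spark properties and renders them as `--conf` arguments for `spark-shell`. The property names are the ones documented for Spark's dynamic allocation. The helper functions, the executor bounds, and the `yarn-client` master value are assumptions for illustration, and the exact set of required properties may differ between Spark versions. The external shuffle service also has to be set up on the YARN NodeManagers, which this sketch does not cover.

```python
def dynamic_allocation_conf(min_executors, max_executors):
    """Return Spark properties that turn on dynamic allocation (hypothetical helper)."""
    if not 0 <= min_executors <= max_executors:
        raise ValueError("need 0 <= min_executors <= max_executors")
    return {
        "spark.dynamicAllocation.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
        # The external shuffle service keeps shuffle files available
        # after the executor that wrote them has been released.
        "spark.shuffle.service.enabled": "true",
    }


def spark_shell_command(conf, master="yarn-client"):
    """Render the properties as a spark-shell argument list."""
    cmd = ["spark-shell", "--master", master]
    for key, value in sorted(conf.items()):
        cmd += ["--conf", f"{key}={value}"]
    return cmd


# Example bounds (assumed): keep at least 1 executor, grow to at most 20.
print(" ".join(spark_shell_command(dynamic_allocation_conf(1, 20))))
```

With settings along these lines, an idle spark-shell would be expected to shrink toward the minimum number of executors rather than hold its peak allocation, which is the executor-level elasticity described above.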
Re: Can Spark support task-level resource management?
Hi Xuelin,

I can only speak about Mesos mode. Spark's Mesos scheduler has two modes of management: fine-grained mode and coarse-grained mode.

In fine-grained mode, each Spark task launches one or more Spark executors that only live through the lifetime of the task. So it's comparable to what you described.

Coarse-grained mode is going to support dynamic allocation of executors, but that operates at a higher level than tasks.

As for a resource management recommendation, I think it's important to look at what other applications you want to run alongside Spark in the same cluster, and at your use cases, to see which resource manager fits your needs.

Tim

On Wed, Jan 7, 2015 at 10:55 PM, Xuelin Cao wrote:
> What about Mesos and YARN? Can they support task-level resource management?
> And, what is the recommended mode for resource management?
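For reference, the choice between the two Mesos modes is made with the `spark.mesos.coarse` property, and `spark.cores.max` caps the cores a coarse-grained application acquires. The Spark documentation describes fine-grained mode as running each Spark task as a separate Mesos task. The sketch below only renders these two properties in the whitespace-separated `spark-defaults.conf` layout. The helper names and the core cap of 20 are assumptions for illustration, and defaults differ between Spark versions.

```python
def mesos_mode_conf(coarse, max_cores=None):
    """Return the Spark properties that select a Mesos run mode (hypothetical helper)."""
    conf = {"spark.mesos.coarse": "true" if coarse else "false"}
    if coarse and max_cores is not None:
        # Without a cap, a coarse-grained app may take every core Mesos offers it.
        conf["spark.cores.max"] = str(max_cores)
    return conf


def to_properties(conf):
    """Render properties one per line, key and value separated by a space."""
    return "\n".join(f"{key} {value}" for key, value in sorted(conf.items()))


print("# fine-grained")
print(to_properties(mesos_mode_conf(coarse=False)))
print("# coarse-grained, capped at 20 cores (assumed value)")
print(to_properties(mesos_mode_conf(coarse=True, max_cores=20)))
```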
Re: Can Spark support task-level resource management?
Hi,

Thanks for the information.

One more thing I want to clarify: when do Mesos and YARN allocate and release resources? In other words, what is the resource lifetime?

For example, in standalone mode, resources are allocated when the application is launched and released when the application finishes.

It then looks like, in Mesos fine-grained mode, resources are allocated when a task is about to run and released when the task finishes.

How about Mesos coarse-grained mode and YARN mode? Are resources managed at the job level, so that the resource lifetime equals the job lifetime? Or at the stage level?

One more question about Mesos fine-grained mode: how large is the overhead of resource allocation and release? In MapReduce, a noticeable amount of time is spent waiting for resource allocation. How does Mesos fine-grained mode compare?

On Thu, Jan 8, 2015 at 3:07 PM, Tim Chen wrote:
> In fine-grained mode, each Spark task launches one or more Spark executors
> that only live through the lifetime of the task.
Re: Can Spark support task-level resource management?
In coarse-grained mode, the Spark executors are launched and kept running for as long as the scheduler is running. So if you launch a Spark shell and leave it open, the executors keep running and won't finish until the shell exits.

In fine-grained mode, the overhead mostly comes from downloading the Spark tarball (if it's not already deployed on the slaves) and launching the Spark executor. I suggest you try it out and look at the latency to see whether it fits your use case.

Tim

On Wed, Jan 7, 2015 at 11:19 PM, Xuelin Cao wrote:
> How about Mesos coarse-grained mode and YARN mode? Are resources managed at
> the job level?
> One more question about Mesos fine-grained mode: how large is the overhead
> of resource allocation and release?
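When looking at the latency as suggested above, one simple way to read the measurements is as the share of each task's wall-clock time that goes to launch overhead. The sketch below is plain arithmetic, not a Spark API. The function name and the 0.5-second overhead are assumptions for illustration, and real numbers have to come from measurements on your own cluster.

```python
def overhead_fraction(task_seconds, launch_overhead_seconds):
    """Share of total time spent on launch overhead rather than useful work."""
    if task_seconds <= 0 or launch_overhead_seconds < 0:
        raise ValueError("task_seconds must be > 0 and overhead must be >= 0")
    return launch_overhead_seconds / (task_seconds + launch_overhead_seconds)


# Assumed overhead of 0.5 s per task launch, for tasks of different lengths.
for task_seconds in (0.5, 5.0, 60.0):
    fraction = overhead_fraction(task_seconds, launch_overhead_seconds=0.5)
    print(f"{task_seconds:5.1f} s task: {fraction:.1%} of the time is launch overhead")
```

The same fixed overhead matters far more for sub-second tasks than for minute-long ones, so the typical task length of a workload is worth checking before choosing between the two Mesos modes.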
Re: Can Spark support task-level resource management?
Got it, thanks.

On Thu, Jan 8, 2015 at 3:30 PM, Tim Chen wrote:
> I suggest you try it out and look at the latency to see whether it fits
> your use case.