Can Spark support task-level resource management?

2015-01-07 Thread Xuelin Cao
Hi, we are currently building a mid-sized Spark cluster (100 nodes) at our company. One thing bothering us is how Spark manages resources (CPU, memory). I know there are three resource management modes: standalone, Mesos, and YARN. In the standalone mode, the cluster

Re: Can Spark support task-level resource management?

2015-01-07 Thread Tim Chen
Hi Xuelin, I can only speak to the Mesos mode. There are two modes of management in Spark's Mesos scheduler: fine-grained mode and coarse-grained mode. In fine-grained mode, each Spark task launches one or more Spark executors that live only through the lifetime of the task. So it's
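
[Editor's note: a minimal sketch of how an application selected between the two Mesos modes in Spark of this era. The master URL is a hypothetical placeholder; fine-grained was the default, and spark.mesos.coarse switches to coarse-grained mode.]

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical Mesos master address; replace with your own.
    // Fine-grained mode was the default on Mesos at the time; setting
    // spark.mesos.coarse to true switches to coarse-grained mode, where
    // long-lived executors are launched up front instead of per-task.
    val conf = new SparkConf()
      .setMaster("mesos://mesos-master:5050")
      .setAppName("MesosModeExample")
      .set("spark.mesos.coarse", "true")   // coarse-grained: long-lived executors
      .set("spark.cores.max", "8")         // cap the total cores the app may claim

    val sc = new SparkContext(conf)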

Re: Can Spark support task-level resource management?

2015-01-07 Thread Xuelin Cao
Hi, thanks for the information. One more thing I want to clarify: when does Mesos or YARN allocate and release resources? That is, what is the resource lifetime? For example, in the standalone mode, resources are allocated when the application is launched, resources released

Re: Can Spark support task-level resource management?

2015-01-07 Thread Tim Chen
In coarse-grained mode, the Spark executors are launched and kept running while the scheduler is running. So if you launch a Spark shell and keep it open, the executors are running and won't finish until the shell exits. In fine-grained mode, the overhead time mostly comes from
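
[Editor's note: to make the lifetime concrete, a sketch under the same assumptions as above (hypothetical master URL): in coarse-grained mode the executors come up with the SparkContext and hold their CPU and memory until it is stopped, whether or not a job is running.]

    import org.apache.spark.{SparkConf, SparkContext}

    // Executors are acquired when the SparkContext starts...
    val sc = new SparkContext(new SparkConf()
      .setMaster("mesos://mesos-master:5050")   // hypothetical master URL
      .setAppName("ExecutorLifetimeExample")
      .set("spark.mesos.coarse", "true"))

    println(sc.parallelize(1 to 1000000).sum()) // runs on the long-lived executors
    // ... executors stay allocated here, idle or not, while the session is open ...
    sc.stop()  // only now are the executors torn down and their resources released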

Re: Can Spark support task-level resource management?

2015-01-07 Thread Sandy Ryza
Hi Xuelin, Spark 1.2 includes a dynamic allocation feature that allows Spark on YARN to modulate its YARN resource consumption as the demands of the application grow and shrink. This is somewhat coarser than what you call task-level resource management. Elasticity comes through allocating and
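
[Editor's note: a sketch of the Spark 1.2 dynamic allocation settings Sandy refers to; YARN-only at that point, and the executor counts here are illustrative.]

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("DynamicAllocationExample")
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.dynamicAllocation.minExecutors", "2")   // floor kept while idle
      .set("spark.dynamicAllocation.maxExecutors", "50")  // ceiling under load
      // Requires the external shuffle service on each NodeManager so shuffle
      // data outlives executors that are released.
      .set("spark.shuffle.service.enabled", "true")

    val sc = new SparkContext(conf)  // submitted to YARN, e.g. with --master yarn-client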

Re: Can Spark support task-level resource management?

2015-01-07 Thread Xuelin Cao
Got it, thanks. On Thu, Jan 8, 2015 at 3:30 PM, Tim Chen t...@mesosphere.io wrote: In coarse-grained mode, the Spark executors are launched and kept running while the scheduler is running. So if you launch a Spark shell and keep it open, the executors are running and won't finish until