Re: Can spark supports task level resource management?

2015-01-07 Thread Tim Chen
Hi Xuelin,

I can only speak to Mesos mode. Spark's Mesos scheduler has two modes of
resource management: fine-grained mode and coarse-grained mode.

In fine-grained mode, each Spark task launches one or more Spark executors
that live only for the lifetime of the task, so it's comparable to what
you described.

In coarse-grained mode, support for dynamic allocation of executors is
coming, but that operates at a higher level than individual tasks.
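
As a rough sketch of how the two Mesos modes are selected: the
`spark.mesos.coarse` property switches between them. The helper name, the
master URL, and the application file below are hypothetical and only there
for illustration.

```python
def mesos_mode_conf(coarse):
    """Return spark-submit arguments that pick the Mesos run mode."""
    return ["--conf", "spark.mesos.coarse=" + ("true" if coarse else "false")]

# Hypothetical master URL and application file, for illustration only.
command = ["spark-submit", "--master", "mesos://master.example.com:5050"]
command += mesos_mode_conf(coarse=False)  # False selects fine-grained mode
command.append("my_app.py")

# Print the assembled command line instead of running it.
print(" ".join(command))
```

Passing `coarse=True` instead would request coarse-grained mode for the
same application.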

As for a resource management recommendation, I think it's important to look
at what other applications you want to run alongside Spark in the same
cluster, and at your use cases, to see which resource manager fits your
needs.

Tim


On Wed, Jan 7, 2015 at 10:55 PM, Xuelin Cao xuelincao2...@gmail.com wrote:


 Hi,

  Currently, we are building a medium-scale Spark cluster (100 nodes) at
 our company. One thing that bothers us is how Spark manages resources
 (CPU, memory).

  I know there are three resource management modes: standalone, Mesos, and
 YARN.

  In standalone mode, the cluster master simply allocates resources when
 the application is launched. Suppose an engineer launches a spark-shell
 claiming 100 CPU cores and 100 GB of memory, and then does nothing. The
 cluster master still allocates those resources to the app even though the
 spark-shell is idle. This is definitely not what we want.

  What we want is for resources to be allocated when the actual task is
 about to run. For example, in the map stage the app may need 100 cores
 because the RDD has 100 partitions, while in the reduce stage only 20
 cores are needed because the RDD is shuffled into 20 partitions.
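
 To put numbers on the map/reduce example, here is a small back-of-the-
 envelope sketch. The partition counts are the ones given above; the 60
 seconds per stage is an assumed figure, and one core per partition is a
 simplification.

```python
# Partition counts come from the example above; durations are assumed.
stages = [
    {"name": "map", "partitions": 100, "seconds": 60},
    {"name": "reduce", "partitions": 20, "seconds": 60},
]

def static_core_seconds(stages):
    """Cores are claimed once, sized for the widest stage, and held throughout."""
    peak = max(s["partitions"] for s in stages)
    return peak * sum(s["seconds"] for s in stages)

def per_stage_core_seconds(stages):
    """Each stage holds one core per partition only while it runs."""
    return sum(s["partitions"] * s["seconds"] for s in stages)

static = static_core_seconds(stages)
elastic = per_stage_core_seconds(stages)
print("static allocation:", static, "core-seconds")
print("per-stage allocation:", elastic, "core-seconds")
print("held but idle:", static - elastic, "core-seconds")
```

 Under these assumptions, the static allocation holds 12000 core-seconds
 against 7200 actually needed, so 4800 core-seconds sit idle during the
 reduce stage.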

  I'm not clear about the granularity of Spark's resource management. In
 standalone mode, resources are allocated when the app is launched. What
 about Mesos and YARN? Can they support task-level resource management?

  And what is the recommended mode for resource management (Mesos or
 YARN)?

  Thanks





Re: Can spark supports task level resource management?

2015-01-07 Thread Xuelin Cao
Hi,

 Thanks for the information.

 One more thing I want to clarify: when do Mesos and YARN allocate and
release resources? In other words, what is the resource lifetime?

 For example, in standalone mode, resources are allocated when the
application is launched and released when the application finishes.

 So it looks like, in Mesos fine-grained mode, resources are allocated
when a task is about to run and released when the task finishes.

 How about Mesos coarse-grained mode and YARN mode? Are resources managed
at the job level, i.e., does the resource lifetime equal the job lifetime?
Or at the stage level?

 One more question about Mesos fine-grained mode: what is the overhead of
resource allocation and release? In MapReduce, a noticeable amount of time
is spent waiting for resource allocation. What about in Mesos fine-grained
mode?









Re: Can spark supports task level resource management?

2015-01-07 Thread Tim Chen
In coarse-grained mode, the Spark executors are launched and kept running
for as long as the scheduler is running. So if you launch a Spark shell and
leave it open, the executors keep running and won't finish until the shell
exits.

In fine-grained mode, the overhead mostly comes from downloading the Spark
tarball (if it's not already deployed on the slaves) and launching the
Spark executor. I suggest you try it out and measure the latency to see
whether it fits your use case.

Tim








Re: Can spark supports task level resource management?

2015-01-07 Thread Sandy Ryza
Hi Xuelin,

Spark 1.2 includes a dynamic allocation feature that allows Spark on YARN
to modulate its YARN resource consumption as the demands of the application
grow and shrink.  This is somewhat coarser than what you call task-level
resource management.  Elasticity comes through allocating and releasing
executors, not through requesting resources from YARN for individual
tasks.  It would be good to add finer-grained task-level elasticity as
well, but this will rely on some YARN work (YARN-1197) for changing the
resource allocation of a running container.
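
To illustrate why executor-level elasticity is coarser than task-level
allocation, here is a deliberately simplified sketch. It is not Spark's
actual allocation policy; the function name, the 8 cores per executor, and
the 1 to 20 executor bounds are all assumptions made for the example.

```python
import math

def executors_needed(pending_tasks, cores_per_executor, min_executors, max_executors):
    """Round demand up to whole executors, then clamp to the configured bounds."""
    wanted = math.ceil(pending_tasks / cores_per_executor)
    return max(min_executors, min(max_executors, wanted))

# Assumed sizing: 8 cores per executor, between 1 and 20 executors.
for stage, tasks in [("map", 100), ("reduce", 20), ("idle shell", 0)]:
    n = executors_needed(tasks, cores_per_executor=8,
                         min_executors=1, max_executors=20)
    print(f"{stage}: {n} executors, {n * 8} cores held for {tasks} tasks")
```

Because resources move in whole executors, the cores held are rounded up
from the number of runnable tasks, and an idle application still keeps the
configured minimum.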

Mesos has a fine-grained mode similar to what you're wondering about.
It's documented here:
https://spark.apache.org/docs/latest/running-on-mesos.html#mesos-run-modes.

-Sandy






Re: Can spark supports task level resource management?

2015-01-07 Thread Xuelin Cao
Got it, thanks.

