Re: Spark job resource allocation best practices

2014-11-04 Thread Romi Kuntsman
How can I configure the Mesos allocation policy to share resources between
all currently running Spark applications? I can't seem to find it in the
architecture docs.

*Romi Kuntsman*, *Big Data Engineer*
 http://www.totango.com


Re: Spark job resource allocation best practices

2014-11-04 Thread Akhil Das
You can look at the different run modes over here
http://docs.sigmoidanalytics.com/index.php/Spark_On_Mesos#Mesos_Run_Modes

These people have a very good tutorial to get you started:
http://mesosphere.com/docs/tutorials/run-spark-on-mesos/#overview

Thanks
Best Regards


Re: Spark job resource allocation best practices

2014-11-04 Thread Romi Kuntsman
I have a single Spark cluster, not multiple frameworks and not multiple
versions. Is Mesos relevant for my use case?
Where can I find information about exactly how to make Mesos tell Spark how
much of the cluster's resources to use (instead of the default take-all)?

*Romi Kuntsman*, *Big Data Engineer*
 http://www.totango.com


Re: Spark job resource allocation best practices

2014-11-04 Thread Akhil Das
You need to install Mesos on your cluster. Then you run your Spark
applications by specifying the Mesos master URL (mesos://...) instead of the
standalone master URL (spark://...).

Spark can run over Mesos in two modes: “*fine-grained*” (default) and “
*coarse-grained*”.

In “*fine-grained*” mode (default), each Spark task runs as a separate
Mesos task. This allows multiple instances of Spark (and other frameworks)
to share machines at a very fine granularity, where each application gets
more or fewer machines as it ramps up and down, but it comes with an
additional overhead in launching each task. This mode may be inappropriate
for low-latency requirements like interactive queries or serving web
requests.

The “*coarse-grained*” mode will instead launch only one long-running Spark
task on each Mesos machine, and dynamically schedule its own “mini-tasks”
within it. The benefit is much lower startup overhead, but at the cost of
reserving the Mesos resources for the complete duration of the application.

To run in coarse-grained mode, set the spark.mesos.coarse property in your
SparkConf:
 conf.set("spark.mesos.coarse", "true")


In addition, for coarse-grained mode, you can control the maximum number of
resources Spark will acquire. By default, it will acquire all cores in the
cluster (that get offered by Mesos), which only makes sense if you run just
one application at a time. You can cap the maximum number of cores using
conf.set("spark.cores.max", "10") (for example).
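
Putting that together, a minimal sketch of a coarse-grained setup (the app
name, Mesos master URL, and core count below are placeholders, not values
from this thread) might look like:

  import org.apache.spark.{SparkConf, SparkContext}

  // Coarse-grained Mesos mode with a core cap, so that other applications
  // can still get offers while this one is running.
  val conf = new SparkConf()
    .setAppName("my-app")                    // placeholder app name
    .setMaster("mesos://mesos-master:5050")  // placeholder Mesos master URL
    .set("spark.mesos.coarse", "true")       // run in coarse-grained mode
    .set("spark.cores.max", "10")            // cap the cores this app acquires
  val sc = new SparkContext(conf)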


If you run your application in fine-grained mode, Mesos will take care of the
resource allocation for you. You don't need to set anything special in your
application for this, since fine-grained is the default mode.
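
A rough fine-grained sketch (again with a placeholder app name and Mesos
master URL) would just be:

  import org.apache.spark.{SparkConf, SparkContext}

  // Fine-grained is the default on Mesos: point the application at the
  // Mesos master and let Mesos hand out cores per task across applications.
  val conf = new SparkConf()
    .setAppName("my-app")                    // placeholder app name
    .setMaster("mesos://mesos-master:5050")  // placeholder Mesos master URL
  val sc = new SparkContext(conf)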

Thanks
Best Regards


Re: Spark job resource allocation best practices

2014-11-04 Thread Romi Kuntsman
Let's say I run Spark on Mesos in fine-grained mode, and I have 12 cores and
64GB of memory.
I run application A on Spark, and some time after that (but before A
finishes) application B.
How many CPUs will each of them get?

*Romi Kuntsman*, *Big Data Engineer*
 http://www.totango.com


Spark job resource allocation best practices

2014-11-03 Thread Romi Kuntsman
Hello,

I have a Spark 1.1.0 standalone cluster with several nodes and several
jobs (applications) being scheduled at the same time.
By default, each Spark job takes up all available CPUs, so when more than one
job is scheduled, all but the first are stuck in WAITING.
On the other hand, if I tell each job to initially limit itself to a fixed
number of CPUs and that job then runs by itself, the cluster is
under-utilized and the job runs longer than it would have if it had taken all
the available resources.

- How can I give the jobs a fairer resource division, one that lets many jobs
run together and, together, use all the available resources?
- How do you divide resources between applications in your use case?

P.S. I started reading about Mesos but couldn't figure out if/how it could
solve the described issue.

Thanks!

*Romi Kuntsman*, *Big Data Engineer*
 http://www.totango.com


Re: Spark job resource allocation best practices

2014-11-03 Thread Akhil Das
Have a look at scheduling pools
https://spark.apache.org/docs/latest/job-scheduling.html. If you want more
sophisticated resource allocation, then you are better off using a cluster
manager like Mesos or YARN.
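
As a rough illustration of the static-partitioning alternative in standalone
mode (the app name and core count below are placeholders; the right cap
depends on your cluster), capping each application would look something like:

  import org.apache.spark.{SparkConf, SparkContext}

  // Standalone mode: cap the cores this application grabs so that other
  // concurrently submitted applications are not stuck in WAITING.
  // The trade-off is under-utilization when this app runs alone.
  val conf = new SparkConf()
    .setAppName("my-app")         // placeholder app name
    .set("spark.cores.max", "4")  // leave the remaining cores for other apps
  val sc = new SparkContext(conf)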

Thanks
Best Regards


Re: Spark job resource allocation best practices

2014-11-03 Thread Romi Kuntsman
So, as it says there, static partitioning is used in Spark’s standalone and
YARN modes, as well as the coarse-grained Mesos mode.
That leaves us only with Mesos, where there is *dynamic sharing* of CPU
cores.

It says that when an application is not running tasks on a machine, other
applications may run tasks on those cores.
But my applications are short-lived (seconds to minutes): they read a large
dataset, process it, and write the results. They are also IO-bound, meaning
most of the time is spent reading input data (from S3) and writing the
results back.

Is it possible to divide the resources between them according to how many
are trying to run at the same time?
For example, with 12 cores: if one job is scheduled, it gets all 12 cores,
but if 3 are scheduled, each one gets 4 cores and they all start.

Thanks!

*Romi Kuntsman*, *Big Data Engineer*
 http://www.totango.com



Re: Spark job resource allocation best practices

2014-11-03 Thread Akhil Das
Yes, I believe Mesos is the right choice for you.
http://mesos.apache.org/documentation/latest/mesos-architecture/

Thanks
Best Regards
