Chris Bannister created SPARK-10293:
---------------------------------------

             Summary: Add support for oversubscription in Mesos
                 Key: SPARK-10293
                 URL: https://issues.apache.org/jira/browse/SPARK-10293
             Project: Spark
          Issue Type: Story
          Components: Mesos
            Reporter: Chris Bannister


Currently when running Spark on Mesos each executor will use all the CPU 
resources offered to it. This can lead to cases where a Spark executor holds 
all the CPU resources on a single slave but underutilises the CPU 
allocated to it.

Mesos added support in 0.23 for oversubscription, where frameworks can be 
offered slack resources for CPU, so that if a task is allocated 10 cpus but is 
only using 1, the 9 idle cpus will be offered to other frameworks as revocable 
resources. If the original task starts using its allocated CPU then Mesos will 
preempt the revocable task, killing it.
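
As a rough illustration of the arithmetic above, the sketch below computes how much CPU on a single task could be treated as slack. The helper name `revocable_cpus` and the rule "slack = allocated - used" are assumptions made for illustration only; how Mesos actually estimates slack may differ.

```python
def revocable_cpus(allocated: float, used: float) -> float:
    """Hypothetical helper: CPUs that could be offered as revocable slack."""
    if allocated < 0 or used < 0:
        raise ValueError("cpu counts must be non-negative")
    # A task using its full allocation (or more) leaves no slack.
    return max(float(allocated) - float(used), 0.0)


# The example from the description: 10 cpus allocated, 1 in use.
print(revocable_cpus(10, 1))
```

Once the original task ramps up to its full allocation the slack drops to zero, which is the point at which any task running on it would be preempted.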

From a cluster usage perspective it would be very useful to be able to specify 
that some jobs are revocable and can be run in slack resources, and that a 
revoked task should be rescheduled without affecting the job status (i.e. not 
counted towards job failure).
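
The requested behaviour could look roughly like the sketch below. It is not Spark's scheduler code: `TaskTracker`, `on_task_lost` and the `revoked` flag are hypothetical names, and the failure limit is assumed to play the role of `spark.task.maxFailures`.

```python
from collections import deque
from dataclasses import dataclass, field

# Assumed limit, playing the role of spark.task.maxFailures.
MAX_TASK_FAILURES = 4


@dataclass
class TaskTracker:
    """Hypothetical bookkeeping for lost tasks."""
    failures: dict = field(default_factory=dict)
    pending: deque = field(default_factory=deque)

    def on_task_lost(self, task_id: str, revoked: bool) -> bool:
        """Return True if the job continues, False if it should fail."""
        if not revoked:
            # Only genuine failures count towards the limit.
            self.failures[task_id] = self.failures.get(task_id, 0) + 1
            if self.failures[task_id] >= MAX_TASK_FAILURES:
                return False
        # Revoked tasks, and failures under the limit, are rescheduled.
        self.pending.append(task_id)
        return True
```

With this accounting a task can be revoked any number of times and the job keeps running, while a task that genuinely fails still aborts the job once it reaches the limit.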



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

