[ 
https://issues.apache.org/jira/browse/YARN-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540021#comment-13540021
 ] 

Arun C Murthy commented on YARN-2:
----------------------------------

[~bikassaha] the original critique of using 'integral' cores was that it would 
lead to under-utilization of CPUs if certain workloads were very CPU-light, 
hence the need for a floating-point spec. The one problem with that spec is 
that it gets very hard to deal with heterogeneous clusters, i.e. you need to be 
able to say "I need 0.25 CPU at 2.0GHz" or "I need 1.5 CPU at 2.5GHz." 
Furthermore, similar to memory, you need a minimum CPU spec, e.g. 0.25 CPUs, to 
ensure we don't fragment resources too finely.

So, rather than make a more complicated spec (#cpus and cpu-frequency etc. and 
a minimum #cpus/cpu-freq), I propose we normalize to an integral number of 
'virtual cores'. This way we get the required 'minimum', i.e. 1 virtual-core, 
and built-in support for heterogeneous systems and over-subscription, i.e. we 
can control #virtual-cores on each node depending on its individual 
characteristics.
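
To make the normalization concrete, here is a rough sketch in Python. It is 
only an illustration of the idea, not YARN code: the baseline (one virtual core 
taken as a quarter of a 2.0GHz physical core), the Node fields, and both 
function names are assumptions chosen to match the numbers in the examples 
above.

```python
import math
from dataclasses import dataclass

# Assumed baseline, for illustration only: 1 virtual core is a quarter
# of one 2.0 GHz physical core. A real cluster would pick its own values.
BASELINE_GHZ = 2.0
VCORES_PER_BASELINE_CORE = 4


@dataclass
class Node:
    """Hypothetical description of a node's CPU characteristics."""
    name: str
    physical_cores: int
    ghz: float
    oversubscription: float = 1.0  # > 1.0 advertises more vcores than the hardware ratio


def node_vcores(node: Node) -> int:
    """Integral number of virtual cores a node advertises to the scheduler."""
    scaled = (node.physical_cores
              * (node.ghz / BASELINE_GHZ)
              * VCORES_PER_BASELINE_CORE
              * node.oversubscription)
    return math.floor(scaled)


def request_vcores(cpus: float, ghz: float) -> int:
    """Convert a fractional '#cpus at some frequency' ask into whole vcores.

    Rounds up and never returns less than 1, so the minimum allocation
    is a single virtual core.
    """
    needed = cpus * (ghz / BASELINE_GHZ) * VCORES_PER_BASELINE_CORE
    return max(1, math.ceil(needed))


if __name__ == "__main__":
    slow = Node("slow-node", physical_cores=8, ghz=2.0)
    fast = Node("fast-node", physical_cores=16, ghz=2.5)
    print(node_vcores(slow), node_vcores(fast))
    print(request_vcores(0.25, 2.0), request_vcores(1.5, 2.5))
```

With these assumed constants, "0.25 CPU at 2.0GHz" becomes a request for 1 
virtual core and "1.5 CPU at 2.5GHz" rounds up to 8, while the scheduler itself 
only ever compares integers. Heterogeneity and over-subscription are handled 
entirely by how many virtual cores each node is configured to advertise.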
                
> Enhance CS to schedule accounting for both memory and cpu cores
> ---------------------------------------------------------------
>
>                 Key: YARN-2
>                 URL: https://issues.apache.org/jira/browse/YARN-2
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: capacityscheduler, scheduler
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 2.0.3-alpha
>
>         Attachments: MAPREDUCE-4327.patch, MAPREDUCE-4327.patch, 
> MAPREDUCE-4327.patch, MAPREDUCE-4327-v2.patch, MAPREDUCE-4327-v3.patch, 
> MAPREDUCE-4327-v4.patch, MAPREDUCE-4327-v5.patch, YARN-2-help.patch, 
> YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch
>
>
> With YARN being a general purpose system, it would be useful for several 
> applications (MPI et al) to specify not just memory but also CPU (cores) for 
> their resource requirements. Thus, it would be useful to the 
> CapacityScheduler to account for both.

