Alejandro Abdelnur created YARN-789:
---------------------------------------

             Summary: Add flag to scheduler to allow zero capabilities in 
resources
                 Key: YARN-789
                 URL: https://issues.apache.org/jira/browse/YARN-789
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: scheduler
    Affects Versions: 2.0.4-alpha
            Reporter: Alejandro Abdelnur
            Assignee: Alejandro Abdelnur


Per discussion in YARN-689, reposting updated use case:

1. I have a set of services co-existing with a Yarn cluster.

2. These services run out of band from Yarn. They are not started as yarn 
containers and they don't use Yarn containers for processing.

3. These services use, dynamically, different amounts of CPU and memory based 
on their load. They manage their CPU and memory requirements independently. In 
other words, depending on their load, they may require more CPU but not memory 
or vice-versa.
By using YARN as the RM for these services I'm able to share and utilize the
resources of the cluster appropriately and in a dynamic way. Yarn keeps tabs on
all the resources.

These services run an AM that reserves resources on their behalf. When this AM
gets the requested resources, the services bump up their CPU/memory utilization
out of band from Yarn. If the Yarn allocations are released/preempted, the
services back off on their resource utilization. By doing this, Yarn and these
services correctly share the cluster resources, with the Yarn RM being the only
one that does the overall resource bookkeeping.

So as not to break the lifecycle of containers, the services' AM starts
containers in the corresponding NMs. These container processes basically sleep
forever (i.e. sleep 10000d). They use almost no CPU or memory (less than 1MB).
Thus it is reasonable to assume their required CPU and memory utilization is
NIL (more on hard enforcement later). Because of this almost NIL utilization of
CPU and memory, it is possible to specify zero for one of the dimensions (CPU
or memory) when making a request.

The current limitation is that the increment is also the minimum. 

If we set the memory increment to 1MB, then a pure CPU request would have to
specify 1MB of memory. That would work. However, it would also allow arbitrary
memory requests without the desired normalization (increments of 256, 512,
etc.).

If we set the CPU increment to 1 CPU, then a pure memory request would have to
specify 1 CPU. CPU amounts are much smaller than memory amounts, and because we
don't have fractional CPUs, all my pure memory requests would waste 1 CPU each,
reducing the overall utilization of the cluster.
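
To make the limitation concrete, here is a minimal sketch of the normalization
described above. It is not the scheduler's actual code: the function name
normalize and the allow_zero parameter are hypothetical stand-ins for the
proposed flag, and the only behavior assumed is that requests are rounded up to
a multiple of the increment, which today also acts as the minimum.

```python
def normalize(requested, increment, allow_zero=False):
    """Round `requested` up to a multiple of `increment`.

    Hypothetical sketch: `allow_zero` stands in for the proposed scheduler
    flag. Without it, the increment is also the minimum, so a request of 0
    is bumped up to one full increment.
    """
    if requested < 0 or increment <= 0:
        raise ValueError("requested must be >= 0 and increment must be > 0")
    if requested == 0 and allow_zero:
        # Proposed behavior: zero is accepted for this dimension.
        return 0
    # Integer ceiling division, then back to a multiple of the increment.
    multiples = -(-requested // increment)
    return max(increment, multiples * increment)


# Pure CPU request with a 512MB memory increment: memory is forced to 512.
memory_today = normalize(0, 512)
# Same request with the proposed flag: memory stays at zero.
memory_with_flag = normalize(0, 512, allow_zero=True)
# Pure memory request with a 1 CPU increment: one CPU is reserved regardless.
cpu_today = normalize(0, 1)
```

With a 1MB increment, normalize(300, 1) returns 300 unchanged, which is the
loss of normalization mentioned above, whereas normalize(300, 256) rounds up
to 512.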

Finally, on hard enforcement. 

* For CPU. Hard enforcement can be done via a cgroup cpu controller. Using an
absolute minimum of a few CPU shares (e.g. 10) in the LinuxContainerExecutor, we
ensure there are enough CPU cycles to run the sleep process. This absolute
minimum would only kick in if zero is allowed; otherwise it would never kick in,
as the shares for 1 CPU are 1024.

* For Memory. Hard enforcement is currently done by
ProcfsBasedProcessTree.java. Using an absolute minimum of 1 or 2 MB would take
care of zero memory resources. Again, this absolute minimum would only kick in
if zero is allowed; otherwise it would never kick in, as the memory increment is
several MBs if not 1GB.
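
The two floors above can be sketched as follows. This is only an illustration
of the arithmetic, not code from LinuxContainerExecutor or
ProcfsBasedProcessTree; the function and constant names are hypothetical, and
the values (1024 shares per CPU, a floor of 10 shares, a floor of 2MB) are the
examples given in the bullets.

```python
SHARES_PER_CPU = 1024  # cgroup cpu shares granted per CPU, as stated above
MIN_CPU_SHARES = 10    # example absolute minimum from the CPU bullet
MIN_MEMORY_MB = 2      # the memory bullet suggests 1 or 2 MB; 2 is assumed


def cpu_shares(cpus):
    """cgroup cpu shares for a container, never below the absolute floor."""
    return max(MIN_CPU_SHARES, cpus * SHARES_PER_CPU)


def memory_limit_mb(memory_mb):
    """Enforced memory limit in MB, never below the absolute floor."""
    return max(MIN_MEMORY_MB, memory_mb)


# A zero-CPU container still gets enough shares to run the sleep process.
zero_cpu_shares = cpu_shares(0)
# A zero-memory container still gets a small limit for the sleep process.
zero_memory_limit = memory_limit_mb(0)
```

For any non-zero request the floors are inert: cpu_shares(1) is 1024 and
memory_limit_mb(256) is 256, so existing allocations are unaffected.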

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira