Wangda Tan created YARN-3243:
--------------------------------

             Summary: CapacityScheduler should pass headroom from parent to 
children to make sure ParentQueue obey its capacity limits.
                 Key: YARN-3243
                 URL: https://issues.apache.org/jira/browse/YARN-3243
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacityscheduler, resourcemanager
            Reporter: Wangda Tan
            Assignee: Wangda Tan


Currently, CapacityScheduler does not always ensure that a ParentQueue obeys its 
capacity limits, for example:
1) When allocating a container under a parent queue, it only checks 
parentQueue.usage < parentQueue.max. If a leaf queue allocates a container with 
size > (parentQueue.max - parentQueue.usage), the parent queue can exceed its 
maximum resource limit, as in the following example:
{code}
                A (usage=54, max=55)
               /                    \
A1 (usage=53, max=53)      A2 (usage=1, max=55)
{code}
Queue-A2 is able to allocate a container since its usage < max, but if we do 
that, A's usage can exceed A.max.
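
To make the numbers concrete, here is a minimal sketch of the example above. It 
is not the actual CapacityScheduler code: resources are plain ints, the class 
and method names are hypothetical, and the container size of 2 is an assumed 
value chosen so that it is larger than A.max - A.usage = 1.

```java
final class ParentLimitExample {
    // Stand-in for the check described above: it only compares usage with max
    // and ignores the size of the container about to be allocated.
    static boolean passesCurrentCheck(int usage, int max) {
        return usage < max;
    }

    // Size-aware check: the allocation must still fit under max afterwards.
    static boolean fitsWithin(int usage, int max, int containerSize) {
        return usage + containerSize <= max;
    }

    public static void main(String[] args) {
        int aUsage = 54, aMax = 55;     // parent queue A
        int a2Usage = 1, a2Max = 55;    // leaf queue A2
        int containerSize = 2;          // assumed request size

        // Both usage-only checks pass, so the container would be allocated.
        System.out.println("A passes:  " + passesCurrentCheck(aUsage, aMax));
        System.out.println("A2 passes: " + passesCurrentCheck(a2Usage, a2Max));

        // After the allocation, A is over its limit: 54 + 2 = 56 > 55.
        System.out.println("A usage after allocation: " + (aUsage + containerSize));
        System.out.println("A fits with size-aware check: "
            + fitsWithin(aUsage, aMax, containerSize));
    }
}
```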

2) When doing the continuous reservation check, the parent queue only tells its 
children "you need to unreserve *some* resource so that I am below my maximum 
resource", but it does not tell them how much resource needs to be unreserved. 
This may also lead to the parent queue exceeding its configured maximum capacity.

With YARN-3099/YARN-3124, we now have a {{ResourceUsage}} class in each queue, 
*here is my proposal*:
- ParentQueue will set its children's ResourceUsage.headroom, which is the 
*maximum resource its children can allocate*.
- ParentQueue will set its children's headroom to (say the parent's name is 
"qA"): min(qA.headroom, qA.max - qA.used). This makes sure the capacity of qA's 
ancestors is enforced as well (qA.headroom is set by qA's parent).
- {{needToUnReserve}} is no longer necessary; instead, children can work out how 
much resource needs to be unreserved to stay within the parent's resource limit.
- Moreover, with this, YARN-3026 can draw a clear boundary between LeafQueue 
and FiCaSchedulerApp; headroom will take user-limit, etc. into account.
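
The proposal above can be sketched roughly as follows. This is only an 
illustration under simplifying assumptions, not the real scheduler classes: 
{{Queue}}, {{childHeadroom}} and {{amountToUnreserve}} are hypothetical names, 
resources are plain ints, the root queue numbers are made up, and clamping the 
results at 0 is my own addition.

```java
final class HeadroomSketch {
    /** Hypothetical, simplified queue; not the YARN ParentQueue/LeafQueue. */
    static final class Queue {
        final String name;
        final int max;
        final int used;
        int headroom; // set by the parent: the most this queue may still allocate

        Queue(String name, int max, int used) {
            this.name = name;
            this.max = max;
            this.used = used;
        }
    }

    /** Headroom qA gives its children: min(qA.headroom, qA.max - qA.used). */
    static int childHeadroom(Queue qA) {
        // Clamped at 0 (assumption) so an over-limit parent never yields a negative value.
        return Math.max(0, Math.min(qA.headroom, qA.max - qA.used));
    }

    /** How much a child must unreserve before a request fits in its headroom. */
    static int amountToUnreserve(int requested, int headroom) {
        return Math.max(0, requested - headroom);
    }

    public static void main(String[] args) {
        Queue root = new Queue("root", 100, 60);   // made-up root numbers
        root.headroom = root.max - root.used;      // assumption: root is bounded only by its max

        Queue a = new Queue("A", 55, 54);
        a.headroom = childHeadroom(root);          // min(40, 100 - 60) = 40

        Queue a2 = new Queue("A2", 55, 1);
        a2.headroom = childHeadroom(a);            // min(40, 55 - 54) = 1

        int requested = 2;                         // assumed container size
        System.out.println("A2 headroom: " + a2.headroom);
        System.out.println("Need to unreserve: "
            + amountToUnreserve(requested, a2.headroom));
    }
}
```

With these numbers A2 sees a headroom of 1 even though its own max is 55, so a 
request of size 2 is no longer admitted just because A2.usage < A2.max, and 
the shortfall of 1 is the exact amount that would have to be unreserved.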



