[ https://issues.apache.org/jira/browse/YARN-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14876871#comment-14876871 ]
Xianyin Xin commented on YARN-4189:
-----------------------------------

Hi [~leftnoteasy], I just went through the doc. Please correct me if I am wrong. Can a container marked as ALLOCATING_WAITING be occupied by other requests? I'm afraid ALLOCATING_WAITING would reduce cluster utilization. In a cluster with many nodes and many jobs, it is hard to satisfy most jobs with their preferred allocations, especially under the app-oriented allocation mechanism (which maps newly available resources to appropriate apps). A customer once asked us, "why could we get 100% locality in MR1 but only up to 60%~70% after making various optimizations?" So we can infer that a sizeable fraction of resource requests in a cluster are not satisfied with their allocations; many containers would therefore go through the ALLOCATING_WAITING phase, which leaves a lot of resources idle for a period of time.

> Capacity Scheduler : Improve location preference waiting mechanism
> ------------------------------------------------------------------
>
>                 Key: YARN-4189
>                 URL: https://issues.apache.org/jira/browse/YARN-4189
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacity scheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-4189 design v1.pdf
>
>
> There are some issues with the current Capacity Scheduler implementation of delay scheduling:
>
> *1) Waiting time to allocate each container highly depends on cluster availability*
>
> Currently, an app can only increase its missed-opportunity count when a node has available resource AND it gets traversed by the scheduler. There are lots of ways an app may not get traversed by the scheduler, for example:
>
> A cluster has 2 racks (rack1/2), each rack has 40 nodes. Node-locality-delay=40. An application prefers rack1. Node-heartbeat-interval=1s.
>
> Assume there are 2 nodes available on rack1; the delay to allocate one container = 40 sec.
> If there are 20 nodes available on rack1, the delay to allocate one container = 2 sec.
>
> *2) It could violate scheduling policies (Fifo/Priority/Fair)*
>
> Assume a cluster is highly utilized. An app (app1) has higher priority and wants locality, and another app (app2) has lower priority but doesn't care about locality. When a node heartbeats with available resource, app1 decides to wait, so app2 gets the available slot. This should be considered a bug that we need to fix.
>
> The same problem can happen when we use FIFO/Fair queue policies.
>
> Another, similar problem is related to preemption: the preemption policy preempts some resources from queue-A for queue-B (queue-A is over-satisfied and queue-B is under-satisfied), but queue-B is waiting out the node-locality-delay, so queue-A gets the resources back. In the next round, the preemption policy could preempt these resources from queue-A again.
>
> This JIRA targets solving these problems.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
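The arithmetic behind problem 1 can be sketched with a toy model (this is an illustration, not actual YARN code): assume an app must accumulate `node-locality-delay` missed scheduling opportunities before relaxing its locality constraint, and that each node with available resource contributes one opportunity per heartbeat. The function name and the one-opportunity-per-heartbeat assumption are mine; the real scheduler's traversal order makes the constant differ (the JIRA quotes 40 sec for the 2-node case, while this simplified model gives 20), but the inverse dependence on available-node count is the point.

```python
# Toy model of missed-opportunity delay scheduling (NOT actual YARN code).
# Assumption: the app needs `node_locality_delay` missed opportunities
# before falling back, and each available node yields one scheduling
# opportunity per heartbeat interval.

def expected_fallback_delay(node_locality_delay: int,
                            available_nodes: int,
                            heartbeat_interval_s: float = 1.0) -> float:
    """Seconds until the app has missed enough opportunities to relax
    its locality preference, under the simplified model above."""
    if available_nodes <= 0:
        # No heartbeats carry available resource, so the counter never moves.
        return float("inf")
    opportunities_per_second = available_nodes / heartbeat_interval_s
    return node_locality_delay / opportunities_per_second

# With node-locality-delay=40 and a 1s heartbeat interval, as in the JIRA:
print(expected_fallback_delay(40, 20))  # 2.0 s, matching the 20-node case
print(expected_fallback_delay(40, 2))   # tens of seconds with only 2 nodes free
```

The model shows why the wait is coupled to cluster availability: the fewer nodes heartbeating with free resource, the slower the missed-opportunity counter advances, so a lightly-loaded rack makes locality fallback nearly instantaneous while a busy one stalls it.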