[ https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661185#comment-13661185 ]
Hadoop QA commented on YARN-569:
--------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12583710/YARN-569.2.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/953//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/953//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/953//console

This message is automatically generated.

> CapacityScheduler: support for preemption (using a capacity monitor)
> --------------------------------------------------------------------
>
>                 Key: YARN-569
>                 URL: https://issues.apache.org/jira/browse/YARN-569
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>            Reporter: Carlo Curino
>            Assignee: Carlo Curino
>         Attachments: 3queues.pdf, CapScheduler_with_preemption.pdf, preemption.2.patch, YARN-569.1.patch, YARN-569.2.patch, YARN-569.patch, YARN-569.patch
>
> There is a tension between the fast-paced, reactive role of the CapacityScheduler, which needs to respond quickly to application resource requests and node updates, and the more introspective, time-based considerations needed to observe and correct for capacity balance. For this purpose, instead of hacking the delicate mechanisms of the CapacityScheduler directly, we opted to add support for preemption by means of a "Capacity Monitor", which can be run optionally as a separate service (much like the NMLivelinessMonitor).
> The capacity monitor (similar to the equivalent functionality in the fair scheduler) runs at intervals (e.g., every 3 seconds): it observes the state of the assignment of resources to queues in the capacity scheduler, performs an off-line computation to determine whether preemption is needed and how best to "edit" the current schedule to improve capacity balance, and generates events that produce four possible actions:
> # Container de-reservations
> # Resource-based preemptions
> # Container-based preemptions
> # Container killing
> The actions listed above are progressively more costly, and it is up to the policy to use them as desired to achieve the rebalancing goals.
> Note that due to the "lag" in the effect of these actions, the policy should operate at the macroscopic level (e.g., preempt tens of containers from a queue) and not try to tightly and consistently micromanage container allocations.
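> To make the division of labor concrete, here is a minimal sketch of how such an optional monitor might be wired: a daemon thread that periodically invokes a pluggable "schedule edit" policy, which in turn inspects queue state and emits the events listed above. The interface and class names in this sketch are illustrative assumptions, not the API introduced by the YARN-569 patch.
> {code:java}
> // Illustrative sketch only: a periodic capacity monitor driving a pluggable
> // schedule-edit policy. Names are assumptions, not the patch's actual API.
> public class CapacityMonitorSketch {
>
>   /** A pluggable policy that inspects scheduler state and emits edit events. */
>   interface SchedulingEditPolicy {
>     void editSchedule();              // observe queues, decide, emit events
>     long getMonitoringIntervalMs();   // e.g., 3000 ms
>   }
>
>   private final SchedulingEditPolicy policy;
>   private volatile boolean running = true;
>
>   public CapacityMonitorSketch(SchedulingEditPolicy policy) {
>     this.policy = policy;
>   }
>
>   /** Invoke the policy at its configured interval until stopped. */
>   public void start() {
>     Thread t = new Thread(() -> {
>       while (running) {
>         policy.editSchedule();
>         try {
>           Thread.sleep(policy.getMonitoringIntervalMs());
>         } catch (InterruptedException e) {
>           Thread.currentThread().interrupt();
>           return;
>         }
>       }
>     }, "capacity-monitor");
>     t.setDaemon(true);
>     t.start();
>   }
>
>   public void stop() {
>     running = false;
>   }
> }
> {code}
> Running the policy out-of-band in this way is what keeps the fast allocation path of the CapacityScheduler untouched, per the design goal stated above.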
> ------------- Preemption policy (ProportionalCapacityPreemptionPolicy): -------------
> Preemption policies are by design pluggable; in the following we present an initial policy (ProportionalCapacityPreemptionPolicy) we have been experimenting with. The ProportionalCapacityPreemptionPolicy behaves as follows:
> # it gathers from the scheduler the state of the queues, in particular their current capacity, guaranteed capacity, and pending requests (*)
> # if there are pending requests from queues that are under capacity, it computes a new ideal balanced state (**)
> # it computes the set of preemptions needed to repair the current schedule and achieve capacity balance (accounting for natural completion rates, and respecting bounds on the amount of preemption we allow for each round)
> # it selects which applications to preempt from each over-capacity queue (the last one in the FIFO order)
> # it removes reservations from the most recently assigned application until the amount of resources to reclaim is obtained, or until no more reservations exist
> # (if not enough) it issues preemptions for containers from the same application (in reverse chronological order, last assigned container first), again until the target is met or until no containers except the AM container are left
> # (if not enough) it moves on to unreserve and preempt from the next application
> # containers that have been asked to preempt are tracked across executions; if a container is among the ones to be preempted for more than a certain time, it is moved into the list of containers to be forcibly killed.
> Notes:
> (*) at the moment, in order to avoid double-counting of requests, we only look at the "ANY" part of pending resource requests, which means we might not preempt on behalf of AMs that ask only for specific locations but not ANY.
> (**) The ideal balanced state is one in which each queue has at least its guaranteed capacity, and the spare capacity is distributed among the queues (that want some) as a weighted fair share, where the weighting is based on the guaranteed capacity of a queue; the computation runs to a fixed point (a rough sketch of this computation is given at the end of this section).
> Tunables of the ProportionalCapacityPreemptionPolicy:
> # observe-only mode (i.e., log the actions it would take, but behave as read-only)
> # how frequently to run the policy
> # how long to wait between preemption and kill of a container
> # what fraction of the containers I would like to obtain should I actually preempt (this accounts for the natural rate at which containers are returned)
> # deadzone size, i.e., what % of over-capacity should I ignore (if we are off perfect balance by some small % we ignore it)
> # overall amount of preemption we can afford for each run of the policy (in terms of total cluster capacity)
> In our current experiments this set of tunables seems to be a good start to shape the preemption actions properly. More sophisticated preemption policies could take into account the different types of applications running, job priorities, cost of preemption, the integral of capacity imbalance, etc. This is very much a control-theory kind of problem, and some of the lessons on designing and tuning controllers are likely to apply.
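> As referenced in note (**), here is a rough, self-contained sketch of the ideal-balance computation: each queue first receives the minimum of its guarantee and its demand, and the remaining capacity is then repeatedly redistributed among still-unsatisfied queues in proportion to their guarantees, until a fixed point is reached. The class and field names are made up for illustration and are not those used in the patch.
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
>
> class QueueState {
>   final String name;
>   final double guaranteed;  // guaranteed capacity (absolute resources)
>   final double wanted;      // current usage + pending demand
>   double ideal;             // output: ideal assigned capacity
>
>   QueueState(String name, double guaranteed, double wanted) {
>     this.name = name;
>     this.guaranteed = guaranteed;
>     this.wanted = wanted;
>   }
> }
>
> class IdealCapacityBalancer {
>
>   /** Compute an ideal assignment of {@code total} capacity to the queues. */
>   static void computeIdealAssignment(List<QueueState> queues, double total) {
>     double unassigned = total;
>     // Step 1: satisfy each queue up to min(guarantee, demand).
>     for (QueueState q : queues) {
>       q.ideal = Math.min(q.guaranteed, q.wanted);
>       unassigned -= q.ideal;
>     }
>     // Step 2: hand out the spare capacity as a guarantee-weighted fair share,
>     // iterating until nothing changes (fixed point) or nothing is left.
>     boolean changed = true;
>     while (changed && unassigned > 1e-6) {
>       changed = false;
>       List<QueueState> hungry = new ArrayList<>();
>       double weightSum = 0;
>       for (QueueState q : queues) {
>         if (q.wanted - q.ideal > 1e-6) {
>           hungry.add(q);
>           weightSum += q.guaranteed;
>         }
>       }
>       if (hungry.isEmpty() || weightSum == 0) {
>         break;
>       }
>       double pool = unassigned;
>       for (QueueState q : hungry) {
>         double share = pool * (q.guaranteed / weightSum);
>         double grant = Math.min(share, q.wanted - q.ideal);
>         if (grant > 0) {
>           q.ideal += grant;
>           unassigned -= grant;
>           changed = true;
>         }
>       }
>     }
>   }
> }
> {code}
> An actual implementation would operate on the scheduler's Resource objects and respect the per-round preemption bounds and deadzone listed among the tunables above; plain doubles are used here only to keep the fixed-point idea visible.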
> Generality:
> The monitor-based schedule editing and the preemption mechanisms we introduced here are designed to be more general than enforcing capacity/fairness; in fact, we are considering other monitors that leverage the same idea of "schedule edits" to target different global properties (e.g., allocating enough resources to guarantee deadlines for important jobs, data-locality optimizations, IO-balancing among nodes, etc.).
> Note that the preemption policy we describe is disabled by default in the patch.
> Depends on YARN-45 and YARN-567; related to YARN-568.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira