[ https://issues.apache.org/jira/browse/YARN-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492467#comment-14492467 ]
Peng Zhang commented on YARN-3405:
----------------------------------

Other preemption issues I found during development, which need confirmation:
# Jobs in the same queue will not trigger preemption, because resToPreemption() only considers unfairness between queues.
# A MapReduce map task causes unneeded preemption requests, because FSAppAttempt.updateDemand() counts all of the ANY, rack and host requests, so the preemption demand for one map task is tripled. I want to change it to count only the ANY request, but I do not know whether that will affect non-MapReduce frameworks.
# The notion of "MinResources" is confusing and easy to misconfigure. Because the fair share calculation considers min, max and weight, when the min of one queue is above the cluster resources or those of its parent queue, the other queues' fair share is 0. I also found that the sum of the children's fair shares can sometimes be larger than the parent queue's fair share.

I have some suggestions for these notions:
* max resources means the maximum resources that one queue can get
* min resources means the threshold under which the queue cannot be preempted
* replace the weight notion with an "expected fair share", e.g. <10240mb 10cores> (I see the weight implementation has memory and cpu, but we use only memory now), and make the "expected fair share" the only element considered when calculating fair share.


> FairScheduler's preemption cannot happen between sibling in some case
> ---------------------------------------------------------------------
>
>                 Key: YARN-3405
>                 URL: https://issues.apache.org/jira/browse/YARN-3405
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.7.0
>            Reporter: Peng Zhang
>            Assignee: Peng Zhang
>            Priority: Critical
>         Attachments: YARN-3405.01.patch
>
>
> Queue hierarchy described as below:
> {noformat}
>             root
>            /    \
>       queue-1   queue-2
>       /     \
> queue-1-1   queue-1-2
> {noformat}
> Assume cluster resource is 100
> # queue-1-1 and queue-2 each have an app. Each gets 50 usage and 50 fairshare.
> # When queue-1-2 becomes active, it causes a new preemption request for
> its fairshare of 25.
> # When preemption starts from root, the preemption candidate it finds may be
> queue-2. If so, preemptContainerPreCheck for queue-2 returns false because
> its usage is equal to its fairshare.
> # In the end, queue-1-2 is left waiting for queue-1-1 to release resources
> by itself.
> What I expect here is that queue-1-2 preempts from queue-1-1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
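
The numbers in the quoted scenario can be checked with a small model. This is only a minimal sketch, not FairScheduler code: it assumes equal weights, a single scalar resource and no min/max resources, and the names Queue, assign_fair_shares and over_fair_share are hypothetical helpers made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Queue:
    name: str
    usage: int = 0        # resources currently held (leaf queues only)
    active: bool = False  # leaf queue has a running or pending app
    children: List["Queue"] = field(default_factory=list)
    fair_share: float = 0.0

    def is_active(self) -> bool:
        # A parent queue is active if any of its descendants is active.
        if self.children:
            return any(c.is_active() for c in self.children)
        return self.active


def assign_fair_shares(queue: Queue, share: float) -> None:
    """Split `share` equally among active children (equal weights assumed)."""
    queue.fair_share = share
    n_active = sum(1 for c in queue.children if c.is_active())
    for child in queue.children:
        child_share = share / n_active if child.is_active() else 0.0
        assign_fair_shares(child, child_share)


def over_fair_share(queue: Queue) -> List[Queue]:
    """Return the leaf queues whose usage exceeds their fair share."""
    if not queue.children:
        return [queue] if queue.usage > queue.fair_share else []
    found: List[Queue] = []
    for child in queue.children:
        found.extend(over_fair_share(child))
    return found


# Scenario from the issue: cluster resource is 100, queue-1-2 just became active.
q11 = Queue("queue-1-1", usage=50, active=True)
q12 = Queue("queue-1-2", usage=0, active=True)
q2 = Queue("queue-2", usage=50, active=True)
q1 = Queue("queue-1", children=[q11, q12])
root = Queue("root", children=[q1, q2])

assign_fair_shares(root, 100)
candidates = [q.name for q in over_fair_share(root)]
print(candidates)
```

Under these assumptions queue-2 sits exactly at its fairshare of 50, while queue-1-1 holds 50 against a fairshare of 25, so queue-1-1 is the only leaf that a preemption pass could take resources from.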