[ 
https://issues.apache.org/jira/browse/YARN-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385154#comment-14385154
 ] 

Peng Zhang commented on YARN-3405:
----------------------------------

Yes, changing the comparator may solve this specific case, but what if queue-2 
has the same sub-queue hierarchy as queue-1, and during the same period the 
second sub-queue of each becomes active? The recursive comparison still returns 
equal, and the two latter sub-queues will be left waiting.

As for this issue and YARN-3414, IMO we should combine the calculation of 
preemption requests with the preemption itself. For each preemption request 
from a leaf queue, start preempting upward. If the parent queue is under its 
fairshare, find the queue most over its fairshare among the siblings; otherwise 
go up again. Finally, when it gets to the root, it ends, because root is 
definitely under its fairshare.

This idea can also solve YARN-3414. When it finds that the parent has got its 
fairshare (limited by max), it will preempt from its sibling.
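
A rough sketch of the walk-up idea, using the numbers from the issue 
description. It is only one possible reading of the proposal: the Queue class 
and find_victim are hypothetical names, not FairScheduler APIs; "under 
fairshare" is assumed to mean usage <= fairshare; and queue-1-1's fairshare is 
assumed to drop to 25 once queue-1-2 becomes active.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Queue:
    """Hypothetical stand-in for a scheduler queue (not the FairScheduler API)."""
    name: str
    usage: int = 0
    fair_share: int = 0
    parent: Optional["Queue"] = None
    children: List["Queue"] = field(default_factory=list)

    def add(self, child: "Queue") -> "Queue":
        child.parent = self
        self.children.append(child)
        return child


def find_victim(starved: Queue) -> Optional[Queue]:
    """Walk up from the starved leaf and pick a queue to preempt from.

    Assumption: "under fairshare" means usage <= fair_share.
    """
    node = starved
    while node.parent is not None:
        parent = node.parent
        if parent.usage <= parent.fair_share:
            # Parent is not over its share: choose among node's siblings
            # the one that exceeds its own fairshare by the most.
            over = [q for q in parent.children
                    if q is not node and q.usage > q.fair_share]
            if not over:
                return None
            return max(over, key=lambda q: q.usage - q.fair_share)
        # Otherwise go up one level and repeat.
        node = parent
    return None


# Scenario from the issue description (cluster resource is 100).
root = Queue("root", usage=100, fair_share=100)
q1 = root.add(Queue("queue-1", usage=50, fair_share=50))
q2 = root.add(Queue("queue-2", usage=50, fair_share=50))
# Assumed: queue-1's share of 50 is split evenly once queue-1-2 is active.
q11 = q1.add(Queue("queue-1-1", usage=50, fair_share=25))
q12 = q1.add(Queue("queue-1-2", usage=0, fair_share=25))

victim = find_victim(q12)
print(victim.name if victim else None)
```

Under these assumptions the search stops at queue-1, which is not over its 
fairshare, and selects queue-1-1 without ever comparing queue-1 against 
queue-2 at the root.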

> FairScheduler's preemption cannot happen between sibling in some case
> ---------------------------------------------------------------------
>
>                 Key: YARN-3405
>                 URL: https://issues.apache.org/jira/browse/YARN-3405
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.7.0
>            Reporter: Peng Zhang
>            Priority: Critical
>
> Queue hierarchy described as below:
> {noformat}
>                   root
>                /         \
>        queue-1          queue-2       
>           /      \
> queue-1-1     queue-1-2
> {noformat}
> Assume cluster resource is 100
> # queue-1-1 and queue-2 each have an app. Each gets 50 usage and 50 fairshare. 
> # When queue-1-2 becomes active, it causes a new preemption request for 
> fairshare 25.
> # When preempting from root, it is possible that the preemption candidate 
> found is queue-2. If so, preemptContainerPreCheck for queue-2 returns false 
> because its usage is equal to its fairshare.
> # Finally queue-1-2 will be waiting for resources to be released from 
> queue-1-1 itself.
> What I expect here is that queue-1-2 preempts from queue-1-1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
