[ 
https://issues.apache.org/jira/browse/YARN-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16776502#comment-16776502
 ] 

Wilfred Spiegelenburg commented on YARN-9278:
---------------------------------------------

[~uranus] I can understand that you want to limit the number of nodes to look 
at for preemption in large clusters. It could speed things up in certain cases. 
However, when I look at the way we identify containers, we already break out of 
the loop when we get to a node that gives back a container list without AMs. In 
{{identifyContainersToPreemptForOneContainer}} we break out of the node-checking 
loop when {{numAMContainers}} is 0, so we do already stop looking once we find 
a suitable node.
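The early-break behaviour described above can be sketched roughly like this (a simplified, hypothetical stand-in, not the actual {{FSPreemptionThread}} code; the method name and the list-of-counts representation are illustration only):

```java
import java.util.List;

/**
 * Simplified sketch of the existing behaviour: scan nodes in order and
 * stop as soon as one yields a preemptable container list that contains
 * no AM containers.
 */
class NodeScanSketch {
  /**
   * Each entry stands in for the number of AM containers a node would
   * lose if preempted from. Returns how many nodes were checked before
   * an AM-free node ended the scan.
   */
  static int scanUntilAmFree(List<Integer> amContainersPerNode) {
    int nodesChecked = 0;
    for (int numAMContainers : amContainersPerNode) {
      nodesChecked++;
      if (numAMContainers == 0) {
        // Found a node whose container list has no AMs: stop scanning.
        break;
      }
    }
    return nodesChecked;
  }
}
```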

Based on your comment, this change will introduce a trade-off between AMs and 
nodes. You propose to stop checking nodes even if we still have AMs in the 
list. In other words, you are willing to accept some AMs in the list even 
though that has side effects on those applications. I don't think that is a 
good idea.

I do agree with you that for the ANY resource we probably want to do something 
else and not just grab the first nodes out of the list every time. The list 
that comes back from the node tracker is unsorted and just an unfiltered copy 
of what is known. We should introduce some logic so that we do not simply run 
a for loop over the list from the start. If we use a seeded start point 
somewhere in the list, one that moves around between runs, we spread our 
preemption better.
We could base the starting point on the current time (the second) and the size 
of the returned list. I don't think we need that if the list is smaller than a 
hard-coded number (maybe 50 or 100), but it would really help in large clusters.
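A minimal sketch of that seeded-start idea (hypothetical code, not anything that exists in YARN; the class name, threshold value, and use of the current second as the seed are all assumptions for illustration):

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch: below a hard-coded threshold, iterate the node list from the
 * start as today; above it, start at an offset derived from the current
 * second and the list size and wrap around, so successive preemption
 * rounds begin at different parts of the list.
 */
class SeededStart {
  // Assumed threshold; the comment above suggests maybe 50 or 100.
  static final int THRESHOLD = 50;

  static <T> List<T> rotatedView(List<T> nodes, long nowSeconds) {
    if (nodes.size() <= THRESHOLD) {
      return nodes; // small cluster: no offset needed
    }
    int start = (int) (nowSeconds % nodes.size());
    List<T> rotated = new ArrayList<>(nodes.size());
    rotated.addAll(nodes.subList(start, nodes.size()));
    rotated.addAll(nodes.subList(0, start));
    return rotated;
  }
}
```

Compared with a full shuffle, this keeps the scan cheap (one copy, no random swaps) while still moving the preemption pressure around the cluster over time.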


> Shuffle nodes when selecting to be preempted nodes
> --------------------------------------------------
>
>                 Key: YARN-9278
>                 URL: https://issues.apache.org/jira/browse/YARN-9278
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>            Reporter: Zhaohui Xin
>            Assignee: Zhaohui Xin
>            Priority: Major
>
> We should *shuffle* the nodes to avoid some nodes being preempted frequently. 
> Also, we should *limit* the number of nodes to make preemption more efficient.
> Just like this,
> {code:java}
> // we should not iterate all nodes, that will be very slow
> long maxTryNodeNum =
>     context.getPreemptionConfig().getToBePreemptedNodeMaxNumOnce();
> if (potentialNodes.size() > maxTryNodeNum) {
>   Collections.shuffle(potentialNodes);
>   List<FSSchedulerNode> newPotentialNodes = new ArrayList<FSSchedulerNode>();
>   for (int i = 0; i < maxTryNodeNum; i++) {
>     newPotentialNodes.add(potentialNodes.get(i));
>   }
>   potentialNodes = newPotentialNodes;
> }
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
