[jira] [Commented] (YARN-10738) When multi thread scheduling with multi node, we should shuffle with a gap to prevent hot accessing nodes.

Bibin Chundatt (Jira) Tue, 04 May 2021 03:59:05 -0700


    [ 
https://issues.apache.org/jira/browse/YARN-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338910#comment-17338910
 ]


Bibin Chundatt commented on YARN-10738:
---------------------------------------

[~zhuqi] 

These are the likely problems I see with using 
ResourceUsageMultiNodeLookupPolicy on a large cluster, any of which could 
cause hot spots.

Nodes are sorted by available resources: memory first, then CPU (vcores), 
then node ID. 
# If a node still has memory available but its vcores are fully used, it is 
still ranked early, so we keep attempting allocations on a node that is 
effectively full.
# If the cluster has nodes of different resource sizes, the hot spots become 
more serious. The larger machines are always preferred, which leaves the 
lower-profile machines underutilized.
# If all the nodes are the same size and unused, the ordering falls back to 
node ID, so allocation attempts hit the machines in canonical order.
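
The three cases above can be reproduced with a small comparator. This is only 
a sketch to illustrate the ordering problem, not the actual 
ResourceUsageMultiNodeLookupPolicy code: the Node class, its field names and 
the "most available first" direction are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ResourceOrderSketch {

    /** Hypothetical stand-in for a scheduler node; not a YARN class. */
    static final class Node {
        final String id;
        final long availableMemMb;
        final int availableVcores;

        Node(String id, long availableMemMb, int availableVcores) {
            this.id = id;
            this.availableMemMb = availableMemMb;
            this.availableVcores = availableVcores;
        }
    }

    // Assumed ordering: most available memory first, then most available
    // vcores, then node ID as the final tie-breaker.
    static final Comparator<Node> BY_AVAILABLE =
        Comparator.comparingLong((Node n) -> n.availableMemMb).reversed()
            .thenComparing(
                Comparator.comparingInt((Node n) -> n.availableVcores).reversed())
            .thenComparing(Comparator.comparing((Node n) -> n.id));

    /** Returns the node IDs in the order allocation would be attempted. */
    static List<String> order(List<Node> nodes) {
        List<Node> sorted = new ArrayList<>(nodes);
        sorted.sort(BY_AVAILABLE);
        List<String> ids = new ArrayList<>();
        for (Node n : sorted) {
            ids.add(n.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Cases 1 and 2: a large node with no free vcores still sorts first.
        System.out.println(order(Arrays.asList(
            new Node("small", 16384, 8),
            new Node("big", 65536, 0))));
        // Case 3: identical idle nodes are always tried in node ID order.
        System.out.println(order(Arrays.asList(
            new Node("n3", 8192, 4),
            new Node("n1", 8192, 4),
            new Node("n2", 8192, 4))));
    }
}
```

Because the order is fully deterministic, every scheduling thread that reads 
the same sorted list starts from the same head node.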

> When multi thread scheduling with multi node, we should shuffle with a gap to 
> prevent hot accessing nodes.
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-10738
>                 URL: https://issues.apache.org/jira/browse/YARN-10738
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Qi Zhu
>            Assignee: Qi Zhu
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> The current multi-threaded scheduling with multi-node placement is not 
> reasonable.
> In large clusters it causes hot-accessed nodes, which leads to abnormally 
> overloaded nodes.
> Solution:
> I think we should shuffle the sorted nodes (for example, those sorted by the 
> available-resource policy) with an interval. 
> This will solve the above problem and avoid hot-accessed nodes.
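
One possible reading of "shuffle with an interval" is to split the sorted 
list into consecutive windows of a fixed size and shuffle only within each 
window. The sketch below assumes that reading; the class name, method name 
and gap parameter are hypothetical and are not taken from the YARN-10738 
patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class GapShuffleSketch {

    /**
     * Returns a copy of the sorted list in which each consecutive window of
     * `gap` elements is shuffled independently. Elements never leave their
     * window, so the coarse ordering of the input is preserved.
     */
    static <T> List<T> shuffleWithGap(List<T> sorted, int gap, Random rnd) {
        if (gap <= 0) {
            throw new IllegalArgumentException("gap must be positive");
        }
        List<T> result = new ArrayList<>(sorted);
        for (int start = 0; start < result.size(); start += gap) {
            int end = Math.min(start + gap, result.size());
            // subList is a view, so shuffling it reorders `result` in place.
            Collections.shuffle(result.subList(start, end), rnd);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sorted = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            sorted.add("node" + i);
        }
        // With gap = 4, node1..node4 stay in the first four positions but
        // in random order, so threads are less likely to pick the same head.
        System.out.println(shuffleWithGap(sorted, 4, new Random()));
    }
}
```

A larger gap spreads allocation attempts over more nodes at the cost of 
following the sort policy less strictly; a gap of 1 leaves the sorted order 
unchanged.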



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
