[ https://issues.apache.org/jira/browse/YARN-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338910#comment-17338910 ]
Bibin Chundatt commented on YARN-10738:
---------------------------------------

[~zhuqi]

These are the probable issues I see with using ResourceUsageMultiNodeLookupPolicy on a large cluster, any of which could cause hot spots. The policy sorts nodes by available resources, comparing memory first, then CPU (vcores), then node ID.

# If memory is available on a node but its vcores are fully used, the node still sorts near the front, so we keep using these full nodes for allocation attempts.
# If the cluster has nodes of different resource sizes, the hot spot cases become more serious. The larger machines are always preferred, which leaves the lower-profile machines underutilized.
# If all the nodes are the same size and unused, the ordering falls back to node ID, so allocation attempts hit the machines in canonical order.

> When multi thread scheduling with multi node, we should shuffle with a gap to
> prevent hot accessing nodes.
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-10738
>                 URL: https://issues.apache.org/jira/browse/YARN-10738
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Qi Zhu
>            Assignee: Qi Zhu
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Now the multi-threaded scheduling with multi node is not reasonable.
> In large clusters, it will cause hot accessing nodes, which will lead to
> abnormal boom nodes.
> Solution:
> I think we should shuffle the sorted nodes (such as with the available resource sort
> policy) with an interval.
> I will solve the above problem, and avoid the hot accessing node.
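
The sort order described in the comment (available memory, then vcores, then node ID) can be illustrated with a small standalone sketch. This is not the ResourceUsageMultiNodeLookupPolicy source. The class NodeOrderSketch, the Node type, and its fields are hypothetical names made up for illustration, and the ordering is assumed to be "most available first" as the comment describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NodeOrderSketch {
    // Hypothetical stand-in for a scheduler node; not a YARN class.
    public static final class Node {
        public final String id;
        public final long availMemMb;
        public final int availVcores;

        public Node(String id, long availMemMb, int availVcores) {
            this.id = id;
            this.availMemMb = availMemMb;
            this.availVcores = availVcores;
        }
    }

    // Assumed ordering: more available memory first, then more available
    // vcores, then node ID ascending as the final tie-breaker.
    public static int compare(Node a, Node b) {
        int c = Long.compare(b.availMemMb, a.availMemMb);
        if (c != 0) {
            return c;
        }
        c = Integer.compare(b.availVcores, a.availVcores);
        if (c != 0) {
            return c;
        }
        return a.id.compareTo(b.id);
    }

    // Returns the node IDs in the order a scheduler thread would try them.
    public static List<String> sortedIds(List<Node> nodes) {
        List<Node> copy = new ArrayList<>(nodes);
        copy.sort(NodeOrderSketch::compare);
        List<String> ids = new ArrayList<>();
        for (Node n : copy) {
            ids.add(n.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
            new Node("n3", 8192, 4),
            new Node("n1", 65536, 0),  // large node, memory free but no vcores left
            new Node("n2", 8192, 4));
        // n1 sorts first despite having no free vcores (issues 1 and 2);
        // n2 and n3 tie on resources and fall back to node ID (issue 3).
        System.out.println(sortedIds(nodes));  // prints [n1, n2, n3]
    }
}
```

Under these assumptions every scheduling thread that sorts the same snapshot tries n1 first, which is the hot spot pattern the three points describe.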