[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources
[ https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266302#comment-16266302 ]

Wilfred Spiegelenburg commented on YARN-7534:

Based on the current analysis, I do not think we have a problem. [~daemon], if you have logs that show this is not working, please attach them; otherwise I will close this as not a problem.

> Fair scheduler assign resources may exceed maxResources
> ---
>
> Key: YARN-7534
> URL: https://issues.apache.org/jira/browse/YARN-7534
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Reporter: YunFan Zhou
> Assignee: Wilfred Spiegelenburg
>
> The current scheduling logic checks whether the resources used by the
> queue have exceeded *maxResources* before assigning the container. As a
> result, after this container is assigned the queue can use more
> resources than *maxResources*.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources
[ https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259530#comment-16259530 ]

Yufei Gu commented on YARN-7534:

Thanks [~wilfreds] for pointing that out. {{FSQueue#fitsInMaxShare}} does check the max resources by recursively traversing the schedulable tree bottom-up. I goofed.
[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources
[ https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259011#comment-16259011 ]

YunFan Zhou commented on YARN-7534:

[~wilfreds] You can take it for free!
[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources
[ https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258977#comment-16258977 ]

Wilfred Spiegelenburg commented on YARN-7534:

I would like to work on this one, if you don't mind.

I think two things are getting mixed up. The queue's used resources are not linked to the node: they are the sum of the resources of all containers from applications that run in the queue. A node heartbeat with changed usage does not mean that an application in this queue caused the change. It could have changed because a different queue or application added a container.

We are also not allocating anything at this point, so we have not gone over the limit. This is only a preliminary check to see whether we can offer this node to the queue. The actual check happens later, when the application is updated.

Another point to take into account: we are not checking what the application asked for here. That is the next step, just below, where we iterate over all the applications that have demand:

{code}
for (FSAppAttempt sched : fetchAppsWithDemand(true)) {
  if (SchedulerAppUtils.isPlaceBlacklisted(sched, node, LOG)) {
    continue;
  }
  assigned = sched.assignContainer(node);
{code}

This is the earliest point at which we can find out what the ask is. If more applications in the queue have demand, we walk over the list. We call [assignContainer|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java#L830], and that is where the checks happen.

One of the checks we perform is in hasContainerForNode for the FSAppAttempt:

{code}
} else if (!getQueue().fitsInMaxShare(resource)) {
  // The requested container must fit in queue maximum share
  updateAMDiagnosticMsg(resource,
      " exceeds current queue or its parents maximum resource allowed).");
  ret = false;
{code}

This makes the allocation fail, so we drop out and check the next request for the application. If all of those fail, we check the next application in the list of apps with demand.

Do you have any logs that show that this is not working as it should?
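The two-stage flow described in this comment can be sketched roughly as follows. This is a simplified model under stated assumptions, not the Hadoop implementation: `SketchQueue`, `TwoStageCheckSketch`, `preCheck`, and `offerNode` are hypothetical names, memory is the only resource tracked, and node capacity is ignored. Only the idea of a `fitsInMaxShare` check that walks up to the parent queues is taken from the thread.

```java
// Simplified queue node for illustration; not the Hadoop FSQueue class.
final class SketchQueue {
    final String name;
    final SketchQueue parent; // null for the root queue
    final long maxMb;         // hypothetical max memory for this queue
    long usedMb;              // memory currently used by this queue

    SketchQueue(String name, SketchQueue parent, long maxMb) {
        this.name = name;
        this.parent = parent;
        this.maxMb = maxMb;
    }

    // Preliminary check: is the queue still within its limit right now?
    boolean preCheck() {
        return usedMb <= maxMb;
    }

    // Would usage plus the request fit in this queue and every ancestor?
    boolean fitsInMaxShare(long requestMb) {
        if (usedMb + requestMb > maxMb) {
            return false;
        }
        return parent == null || parent.fitsInMaxShare(requestMb);
    }

    // Record an allocation in this queue and in every ancestor.
    void allocate(long requestMb) {
        for (SketchQueue q = this; q != null; q = q.parent) {
            q.usedMb += requestMb;
        }
    }
}

final class TwoStageCheckSketch {
    // Offers a node to the queue. Returns the size of the first pending
    // request that fits, or 0 if nothing could be assigned.
    static long offerNode(SketchQueue queue, long[] pendingRequestsMb) {
        if (!queue.preCheck()) {
            return 0; // queue is already over its limit; do not offer
        }
        for (long request : pendingRequestsMb) {
            if (!queue.fitsInMaxShare(request)) {
                continue; // this request would exceed a limit; try the next
            }
            queue.allocate(request);
            return request;
        }
        return 0;
    }

    public static void main(String[] args) {
        SketchQueue root = new SketchQueue("root", null, 10240);
        SketchQueue leaf = new SketchQueue("root.a", root, 4096);
        leaf.allocate(3072);

        // 2048 would push the leaf to 5120 > 4096, so it is skipped;
        // 1024 fits exactly and is granted.
        System.out.println(offerNode(leaf, new long[] {2048, 1024, 512})); // 1024
        System.out.println(leaf.usedMb);                                   // 4096
    }
}
```

In this model the pre-check passes while the queue is below its limit, but no request is granted unless usage plus the request still fits, which is why the pre-check alone does not let the queue exceed its maximum.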
[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources
[ https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258892#comment-16258892 ]

Yufei Gu commented on YARN-7534:

That's valid. The fix would be to add the resource request to the current usage and compare the resulting usage with maxResources.
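A minimal sketch of the comparison suggested here, set against a check that looks only at current usage. It assumes a simplified two-dimensional resource (memory and vcores). The `Res` class and the `usageWithinMax` and `fitsAfterAdding` helpers are hypothetical names for illustration, not Hadoop APIs, and all numbers are made up.

```java
// Simplified stand-in for a YARN resource; not the Hadoop Resource class.
final class Res {
    final long memoryMb;
    final int vcores;

    Res(long memoryMb, int vcores) {
        this.memoryMb = memoryMb;
        this.vcores = vcores;
    }

    // Component-wise sum of two resources.
    Res plus(Res other) {
        return new Res(memoryMb + other.memoryMb, vcores + other.vcores);
    }

    // True when every component of this resource is <= the limit's.
    boolean fitsIn(Res limit) {
        return memoryMb <= limit.memoryMb && vcores <= limit.vcores;
    }
}

final class MaxShareCheckSketch {
    // Usage-only check: looks at what the queue already uses.
    static boolean usageWithinMax(Res usage, Res max) {
        return usage.fitsIn(max);
    }

    // Suggested check: add the request first, then compare with the limit.
    static boolean fitsAfterAdding(Res usage, Res request, Res max) {
        return usage.plus(request).fitsIn(max);
    }

    public static void main(String[] args) {
        Res max = new Res(8192, 8);     // hypothetical queue limit
        Res usage = new Res(7168, 7);   // hypothetical current usage
        Res request = new Res(2048, 1); // hypothetical container request

        System.out.println(usageWithinMax(usage, max));           // true
        System.out.println(fitsAfterAdding(usage, request, max)); // false
    }
}
```

With these numbers the usage-only check passes, while adding the request first shows that the memory component would exceed the limit.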
[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources
[ https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258849#comment-16258849 ]

YunFan Zhou commented on YARN-7534:

[~templedf] For example, take a snapshot of a queue's resource usage at one point in time:

Max Resources: **
Current used resources: **
Pending resource request: **

At this point a node manager reports its heartbeat, and it has ** available resources. Before assigning containers, the scheduler performs the following check:

{code:java}
@Override
public Resource assignContainer(FSSchedulerNode node) {
  Resource assigned = Resources.none();
  if (LOG.isDebugEnabled()) {
    LOG.debug("Node " + node.getNodeName() + " offered to queue: " +
        getName() + " fairShare: " + getFairShare());
  }

  if (!assignContainerPreCheck(node)) {
    return assigned;
  }
{code}

Because the queue's used resources are less than maxResources, the pre-check passes. The scheduler then assigns ** to this queue, and at that point the queue's used resources are over the limit.
[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources
[ https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258808#comment-16258808 ]

Daniel Templeton commented on YARN-7534:

Any other details?