[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources

2017-11-26 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266302#comment-16266302
 ] 

Wilfred Spiegelenburg commented on YARN-7534:
-

Based on the current analysis, I do not think we have a problem.
[~daemon], if you have logs that show this is not working, please attach them; 
otherwise I will close this as not a problem.

> Fair scheduler assign resources may exceed maxResources
> ---
>
> Key: YARN-7534
> URL: https://issues.apache.org/jira/browse/YARN-7534
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: YunFan Zhou
>Assignee: Wilfred Spiegelenburg
>
> The current scheduling logic checks whether the resources used by the queue 
> have exceeded *maxResources* before assigning the container. As a result, 
> after this container is assigned, the queue may use more resources than 
> *maxResources*.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources

2017-11-20 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259530#comment-16259530
 ] 

Yufei Gu commented on YARN-7534:


Thanks [~wilfreds] for pointing that out. {{FSQueue#fitsInMaxShare}} does check 
the max resource by recursively traversing the schedulable tree bottom-up. I 
goofed.
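
To make the bottom-up traversal concrete, here is a minimal sketch. It is not the actual {{FSQueue}} code: the class {{SketchQueue}}, its fields, and the memory-only accounting are assumptions made purely for illustration. The point it shows is that a request is accepted only if usage plus the request stays within the maximum of the queue itself and of every parent up to the root.

```java
// Hypothetical, simplified stand-in for a queue in the hierarchy. This is
// not the real FSQueue class, and it tracks memory only.
final class SketchQueue {
  private final SketchQueue parent; // null for the root queue
  private final long maxMb;         // assumed per-queue maximum
  private long usedMb;

  SketchQueue(SketchQueue parent, long maxMb) {
    this.parent = parent;
    this.maxMb = maxMb;
  }

  // Charge an allocation to this queue and to every ancestor.
  void addUsage(long mb) {
    for (SketchQueue q = this; q != null; q = q.parent) {
      q.usedMb += mb;
    }
  }

  // True only if usage plus the request stays within the maximum of this
  // queue and, recursively, of each parent up to the root.
  boolean fitsInMaxShare(long requestMb) {
    if (usedMb + requestMb > maxMb) {
      return false;
    }
    return parent == null || parent.fitsInMaxShare(requestMb);
  }

  public static void main(String[] args) {
    SketchQueue root = new SketchQueue(null, 100);
    SketchQueue a = new SketchQueue(root, 80);
    SketchQueue b = new SketchQueue(root, 80);
    a.addUsage(50);
    b.addUsage(40); // root usage is now 90

    System.out.println(a.fitsInMaxShare(10)); // true
    System.out.println(a.fitsInMaxShare(20)); // false: the root limit is hit
  }
}
```

In the second call the leaf queue still has room, but the request is rejected because the parent would go over its maximum.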







[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources

2017-11-20 Thread YunFan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259011#comment-16259011
 ] 

YunFan Zhou commented on YARN-7534:
---

[~wilfreds] Sure, feel free to take it!







[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources

2017-11-20 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258977#comment-16258977
 ] 

Wilfred Spiegelenburg commented on YARN-7534:
-

I would like to work on this one, if you don't mind.

I think two things are getting mixed up: the queue's used resources are not 
linked to the node. They are the sum of the resources of all containers from 
applications that run in the queue. A node heartbeat with a changed usage does 
not mean that an application in this queue caused the change. It could have 
changed because a different queue or application added a container.

We're also not allocating anything just yet, and have thus not gone over the 
limit. That check is done later, at the point where the application is 
updated. Here we only have a preliminary check to see if we can offer this 
node to the queue. Another point to take into account: we are not checking 
what the application asked for here. That is the next step, which follows just 
below, when we run over all the applications that have a demand:

{code}
for (FSAppAttempt sched : fetchAppsWithDemand(true)) {
  if (SchedulerAppUtils.isPlaceBlacklisted(sched, node, LOG)) {
    continue;
  }
  assigned = sched.assignContainer(node);
{code}

This is the earliest point at which we can find out what the ask is. If more 
than one application in the queue has a demand, we walk over the list. We call 
[assignContainer|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java#L830]
and that is where the checks happen.
One of the checks we perform is in hasContainerForNode for the FSAppAttempt:
{code}
} else if (!getQueue().fitsInMaxShare(resource)) {
  // The requested container must fit in queue maximum share
  updateAMDiagnosticMsg(resource,
  " exceeds current queue or its parents maximum resource allowed).");

  ret = false;
{code}

This makes the allocation fail, so we drop out and check the next request for 
the application. If all of its requests fail, we check the next application in 
the list of apps with demand.
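
The flow described above can be sketched roughly as follows. This is a simplified illustration, not the scheduler's code: the class {{FlowSketch}}, the method {{assign}}, and the use of plain memory values in MB are all hypothetical, and the preliminary check is assumed to be a plain comparison of current usage against the maximum.

```java
import java.util.List;

// Hypothetical, memory-only model of the two-stage check. None of these
// names come from the YARN code base.
final class FlowSketch {

  // Returns the size of the request that gets assigned, or 0 if none does.
  // Each inner list holds one application's pending requests in MB.
  static long assign(long usedMb, long maxMb, long nodeAvailableMb,
      List<List<Long>> appsWithDemand) {
    // Preliminary check (assumed): only looks at current usage, not at what
    // any application is asking for.
    if (usedMb > maxMb) {
      return 0L;
    }
    // Walk the applications with demand, then each application's requests.
    for (List<Long> requests : appsWithDemand) {
      for (long requestMb : requests) {
        boolean fitsQueue = usedMb + requestMb <= maxMb; // per-request check
        boolean fitsNode = requestMb <= nodeAvailableMb;
        if (fitsQueue && fitsNode) {
          return requestMb;
        }
        // Otherwise fall through to the next request or application.
      }
    }
    return 0L;
  }

  public static void main(String[] args) {
    // Queue at 90 of 100 MB: the 20 MB request is skipped, 10 MB is assigned.
    System.out.println(assign(90, 100, 50, List.of(List.of(20L, 10L)))); // 10
    // Queue exactly at its maximum: the pre-check passes, but nothing fits.
    System.out.println(assign(100, 100, 50, List.of(List.of(1L)))); // 0
  }
}
```

The second call is the case the issue is concerned with: passing the preliminary check does not by itself lead to an allocation, because each request is still compared against the remaining headroom.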

Do you have any logs that show that this is not working as it should?








[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources

2017-11-19 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258892#comment-16258892
 ] 

Yufei Gu commented on YARN-7534:


That's valid. The solution would be to add the resource request to the current 
usage and compare the resulting usage with maxResources.
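
A minimal sketch of that comparison, assuming a resource with two dimensions (memory and vcores). The class {{SketchResource}} and its helpers are hypothetical stand-ins written for this example, not YARN's own resource classes.

```java
// Hypothetical two-dimensional resource used only for illustration.
final class SketchResource {
  final long memoryMb;
  final int vcores;

  SketchResource(long memoryMb, int vcores) {
    this.memoryMb = memoryMb;
    this.vcores = vcores;
  }

  static SketchResource add(SketchResource a, SketchResource b) {
    return new SketchResource(a.memoryMb + b.memoryMb, a.vcores + b.vcores);
  }

  // True if every dimension of 'smaller' is within 'bigger'.
  static boolean fitsIn(SketchResource smaller, SketchResource bigger) {
    return smaller.memoryMb <= bigger.memoryMb
        && smaller.vcores <= bigger.vcores;
  }

  // The proposed check: compare usage after the allocation with the maximum.
  static boolean staysWithinMax(SketchResource used, SketchResource request,
      SketchResource max) {
    return fitsIn(add(used, request), max);
  }

  public static void main(String[] args) {
    SketchResource max = new SketchResource(8192, 8);
    SketchResource used = new SketchResource(6144, 7);
    // 8192 MB and 8 vcores after allocation: exactly at the maximum.
    System.out.println(
        staysWithinMax(used, new SketchResource(2048, 1), max)); // true
    // Memory would fit, but vcores would reach 9 of 8.
    System.out.println(
        staysWithinMax(used, new SketchResource(1024, 2), max)); // false
  }
}
```

Comparing the sum rather than the current usage is what prevents a request from pushing the queue past its limit in any single dimension.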







[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources

2017-11-19 Thread YunFan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258849#comment-16258849
 ] 

YunFan Zhou commented on YARN-7534:
---

[~templedf]
For example, suppose a queue's resource usage at a given moment is as follows:
Max Resources: **
Current used resources: **
Pending resource request: **

At this point a node manager sends a heartbeat, and it has ** 
available resources.
Before assigning containers, the scheduler performs the following check:
{code:java}
@Override
public Resource assignContainer(FSSchedulerNode node) {
  Resource assigned = Resources.none();
  if (LOG.isDebugEnabled()) {
    LOG.debug("Node " + node.getNodeName() + " offered to queue: " +
        getName() + " fairShare: " + getFairShare());
  }

  if (!assignContainerPreCheck(node)) {
    return assigned;
  }
{code}

Because the queue's used resources are less than maxResources, the pre-check 
passes. The scheduler will then assign ** to this queue, and at that point the 
queue's used resources are over the limit.









[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources

2017-11-19 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258808#comment-16258808
 ] 

Daniel Templeton commented on YARN-7534:


Any other details?




