[ 
https://issues.apache.org/jira/browse/YARN-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062390#comment-17062390
 ] 

Anuj commented on YARN-9088:
----------------------------

We are facing a similar issue in our setup, in which the global view of 
pending and available resources gets messed up.

> Non-exclusive labels break QueueMetrics
> ---------------------------------------
>
>                 Key: YARN-9088
>                 URL: https://issues.apache.org/jira/browse/YARN-9088
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, resourcemanager
>    Affects Versions: 2.8.5
>            Reporter: Brandon Scheller
>            Priority: Major
>              Labels: metrics, nodelabel
>
> QueueMetrics are broken (random/negative values) when non-exclusive labels 
> are being used and unlabeled containers run on labeled nodes.
> This is caused by the change in the patch here:
> https://issues.apache.org/jira/browse/YARN-6467
> It assumes that a container's label will be the same as the node's label that 
> it is running on.
> If you look within the patch, metrics are sometimes updated using 
> request.getNodeLabelExpression() and sometimes using node.getPartition().
> This means that in the case where the node is labeled while the container 
> request isn't, these metrics only get updated when referring to the default 
> queue. This stops metrics from balancing out and results in incorrect and 
> negative values in QueueMetrics. 
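
The imbalance described above can be illustrated with a minimal sketch. This is 
a toy model, not the actual YARN QueueMetrics code: the class 
PartitionMetricsSketch, its methods, and the "gpu" label are hypothetical, and 
which operation uses which key is an assumption made only for illustration. It 
shows how a counter incremented under the request's label and decremented under 
the node's partition never balances out.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of per-partition counters; NOT the real YARN QueueMetrics class. */
final class PartitionMetricsSketch {
    // Allocated container count keyed by partition name ("" = default partition).
    private final Map<String, Integer> allocated = new HashMap<>();

    void increment(String partition) {
        allocated.merge(partition, 1, Integer::sum);
    }

    void decrement(String partition) {
        allocated.merge(partition, -1, Integer::sum);
    }

    int get(String partition) {
        return allocated.getOrDefault(partition, 0);
    }

    /** One allocate/release cycle where the two updates use different keys. */
    static PartitionMetricsSketch mismatchedCycle() {
        PartitionMetricsSketch metrics = new PartitionMetricsSketch();
        String requestLabel = "";      // unlabeled container request
        String nodePartition = "gpu";  // hypothetical non-exclusive node label

        // Assumption: allocation is recorded under the request's label...
        metrics.increment(requestLabel);
        // ...while release is recorded under the node's partition.
        metrics.decrement(nodePartition);
        return metrics;
    }

    public static void main(String[] args) {
        PartitionMetricsSketch metrics = mismatchedCycle();
        // The container is gone, yet neither counter is back at zero.
        System.out.println("default=" + metrics.get("") + " gpu=" + metrics.get("gpu"));
    }
}
```

After one full cycle the default partition still reports one allocated 
container and the labeled partition reports a negative count, which matches the 
kind of drift the description attributes to mixing the two keys.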



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
