[ https://issues.apache.org/jira/browse/YARN-9596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887198#comment-16887198 ]
Muhammad Samir Khan commented on YARN-9596:
-------------------------------------------

Updated with changes.

> QueueMetrics has incorrect metrics when labelled partitions are involved
> ------------------------------------------------------------------------
>
>                 Key: YARN-9596
>                 URL: https://issues.apache.org/jira/browse/YARN-9596
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 2.8.0, 3.3.0
>            Reporter: Muhammad Samir Khan
>            Assignee: Muhammad Samir Khan
>            Priority: Major
>         Attachments: Screen Shot 2019-06-03 at 4.41.45 PM.png, Screen Shot
> 2019-06-03 at 4.44.15 PM.png, YARN-9596.001.patch, YARN-9596.002.patch,
> YARN-9596.003.patch
>
>
> After YARN-6467, QueueMetrics should only be tracking metrics for the default
> partition. However, the metrics are incorrect when labelled partitions are
> involved.
>
> Steps to reproduce
> ==============
> # Configure capacity-scheduler.xml with label configuration.
> # Add label "test" to the cluster and replace the label on node1 with "test".
> # Note down "totalMB" at
> <resourcemanager.webapp.address:port>/ws/v1/cluster/metrics
> # Start the first job on the test queue.
> # Start the second job on the default queue (the bug does not reproduce if the
> order of the two jobs is swapped).
> # While the two applications are running, the "totalMB" at
> <resourcemanager.webapp.address:port>/ws/v1/cluster/metrics will go down by
> the amount of MB used by the first job (screenshots attached).
>
> Alternatively:
> In
> TestNodeLabelContainerAllocation.testQueueMetricsWithLabelsOnDefaultLabelNode(),
> add the following lines at the end of the test before rm1.close():
>
> CSQueue rootQueue = cs.getRootQueue();
> assertEquals(10*GB,
>     rootQueue.getMetrics().getAvailableMB() +
>     rootQueue.getMetrics().getAllocatedMB());
>
> There are two nodes of 10GB each and only one of them has a non-default
> label, so the default partition holds 10GB and the assertion should pass. It
> currently fails, and it also fails against a 20*GB check.
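
The invariant that the proposed assertion checks can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not YARN code: the class DefaultPartitionMetricsSketch and its addNode/allocate methods are hypothetical, and the empty string is assumed to stand for the default partition. It shows the expected behaviour, where allocations on a labelled partition leave the default-partition sum of available and allocated memory unchanged.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Toy model of per-partition memory accounting (hypothetical, not YARN code).
 * Only the default partition is exposed through the getters, mirroring the
 * expectation that QueueMetrics tracks the default partition only.
 */
public class DefaultPartitionMetricsSketch {
    // Assumption: the default (unlabelled) partition is keyed by "".
    static final String DEFAULT_PARTITION = "";
    // Memory is tracked in MB, so one GB is 1024 units.
    static final long GB = 1024L;

    private final Map<String, Long> totalMB = new HashMap<>();
    private final Map<String, Long> allocatedMB = new HashMap<>();

    /** Registers a node's memory under the partition it belongs to. */
    void addNode(String partition, long mb) {
        totalMB.merge(partition, mb, Long::sum);
    }

    /** Records an allocation against the partition it was made on. */
    void allocate(String partition, long mb) {
        allocatedMB.merge(partition, mb, Long::sum);
    }

    /** Allocated memory on the default partition only. */
    long getAllocatedMB() {
        return allocatedMB.getOrDefault(DEFAULT_PARTITION, 0L);
    }

    /** Available memory on the default partition only. */
    long getAvailableMB() {
        return totalMB.getOrDefault(DEFAULT_PARTITION, 0L) - getAllocatedMB();
    }

    public static void main(String[] args) {
        DefaultPartitionMetricsSketch metrics = new DefaultPartitionMetricsSketch();
        // Two nodes of 10GB each; only one carries the non-default label "test".
        metrics.addNode(DEFAULT_PARTITION, 10 * GB);
        metrics.addNode("test", 10 * GB);

        // First job runs on the labelled partition, second on the default one.
        metrics.allocate("test", 2 * GB);
        metrics.allocate(DEFAULT_PARTITION, 3 * GB);

        long sum = metrics.getAvailableMB() + metrics.getAllocatedMB();
        System.out.println("available + allocated (MB) = " + sum);
        if (sum != 10 * GB) {
            throw new AssertionError("default-partition total drifted: " + sum);
        }
    }
}
```

In this model the labelled allocation never touches the default-partition counters, so the sum stays at 10*GB. The bug described above corresponds to the real metrics letting the labelled job's usage leak into that sum.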
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org