[ https://issues.apache.org/jira/browse/YARN-11490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722158#comment-17722158 ]
ASF GitHub Bot commented on YARN-11490:
---------------------------------------

hadoop-yetus commented on PR #5644:
URL: https://github.com/apache/hadoop/pull/5644#issuecomment-1545687204

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 34s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 15m 45s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 19m 26s | | trunk passed |
| +1 :green_heart: | compile | 6m 57s | | trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 |
| +1 :green_heart: | compile | 6m 25s | | trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
| +1 :green_heart: | checkstyle | 1m 48s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 6s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 3s | | trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 |
| +1 :green_heart: | javadoc | 1m 50s | | trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
| +1 :green_heart: | spotbugs | 4m 19s | | trunk passed |
| +1 :green_heart: | shadedclient | 21m 32s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 29s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 19s | | the patch passed |
| +1 :green_heart: | compile | 6m 19s | | the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 |
| +1 :green_heart: | javac | 6m 19s | | the patch passed |
| +1 :green_heart: | compile | 6m 19s | | the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
| +1 :green_heart: | javac | 6m 19s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 1m 39s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5644/6/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt) | hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 281 unchanged - 0 fixed = 282 total (was 281) |
| +1 :green_heart: | mvnsite | 1m 55s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 45s | | the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 |
| +1 :green_heart: | javadoc | 1m 40s | | the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
| +1 :green_heart: | spotbugs | 4m 18s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 0s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 10s | | hadoop-yarn-api in the patch passed. |
| +1 :green_heart: | unit | 98m 29s | | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 :green_heart: | asflicense | 0m 58s | | The patch does not generate ASF License warnings. |
| | | 233m 33s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5644/6/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5644 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 9ef1c6772920 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / be6ac1f87533caabef7ae5cc5ca82b79986abe94 |
| Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5644/6/testReport/ |
| Max. process+thread count | 949 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5644/6/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

> JMX QueueMetrics breaks after mutable config validation in CS
> -------------------------------------------------------------
>
>                 Key: YARN-11490
>                 URL: https://issues.apache.org/jira/browse/YARN-11490
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.4.0
>            Reporter: Tamas Domok
>            Assignee: Tamas Domok
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: addqueue.xml, defaultqueue.json, hadoop-tdomok-resourcemanager-tdomok-MBP16.log, removequeue.xml, stopqueue.json
>
>
> Reproduction steps:
> 1. Submit a long-running job
> {code}
> hadoop-3.4.0-SNAPSHOT/bin/yarn jar hadoop-3.4.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.4.0-SNAPSHOT-tests.jar sleep -m 1 -r 1 -rt 1200000 -mt 20
> {code}
> 2. Verify that there is one running app
> {code}
> $ curl http://localhost:8088/ws/v1/cluster/metrics | jq
> {code}
> 3. Verify that the JMX endpoint reports 1 running app as well
> {code}
> $ curl http://localhost:8088/jmx | jq
> {code}
> 4. Validate the configuration (x2)
> {code}
> $ curl -X POST -H 'Content-Type: application/json' -d @defaultqueue.json localhost:8088/ws/v1/cluster/scheduler-conf/validate
> $ cat defaultqueue.json
> {"update-queue":{"queue-name":"root.default","params":{"entry":{"key":"maximum-applications","value":"100"}}},"subClusterId":"","global":null,"global-updates":null}
> {code}
> 5. Repeat steps 2 and 3. The cluster metrics should still be correct, but the JMX endpoint will show 0 running apps. That is the bug.
>
> It is caused by YARN-11211; reverting that patch (or removing only the _QueueMetrics.clearQueueMetrics();_ line) fixes the issue, but I think that would reintroduce the memory leak.
>
> It looks like the QUEUE_METRICS hash map is "add-only": prior to YARN-11211, clearQueueMetrics() was called only from the ResourceManager.reinitialize() method (transitionToActive/transitionToStandby).
>
> Constantly adding and removing queues with unique names would cause a leak as well, because nothing is ever removed from QUEUE_METRICS, so it is not just the validation API that has this problem.
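
The two failure modes described in the issue (a shared map that is cleared while live metrics still exist, and a shared map that only ever grows) can be illustrated with a small toy model. This is only a sketch and not the Hadoop code: `QueueMetricsRegistrySketch`, `Metrics`, `REGISTRY`, `forQueue`, `reportedAppsRunning` and `validateWithClear` are hypothetical names, and the assumption that the JMX-style view reads through the shared map is a simplification made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; all names here are hypothetical and not taken from Hadoop.
class QueueMetricsRegistrySketch {

    /** Hypothetical stand-in for a per-queue metrics object. */
    static final class Metrics {
        final String queueName;
        int appsRunning;

        Metrics(String queueName) {
            this.queueName = queueName;
        }
    }

    // Process-wide cache, loosely analogous to the QUEUE_METRICS map in the issue.
    static final Map<String, Metrics> REGISTRY = new HashMap<>();

    /** Returns the cached metrics for a queue, creating an entry if none exists. */
    static Metrics forQueue(String queueName) {
        return REGISTRY.computeIfAbsent(queueName, Metrics::new);
    }

    /** What a reader that goes through the shared map would report (assumed model). */
    static int reportedAppsRunning(String queueName) {
        Metrics m = REGISTRY.get(queueName);
        return m == null ? 0 : m.appsRunning;
    }

    /** Models a validation pass that clears the shared map and rebuilds the queue. */
    static void validateWithClear(String queueName) {
        REGISTRY.clear();     // the live scheduler's metrics are dropped from the map
        forQueue(queueName);  // the throwaway scheduler registers a fresh, empty object
    }

    public static void main(String[] args) {
        Metrics live = forQueue("root.default");
        live.appsRunning = 1;
        System.out.println(reportedAppsRunning("root.default")); // prints 1

        validateWithClear("root.default");
        System.out.println(live.appsRunning);                    // prints 1 (live object is intact)
        System.out.println(reportedAppsRunning("root.default")); // prints 0 (reader sees the fresh object)

        // Without any clear or remove, unique queue names accumulate in the map.
        for (int i = 0; i < 1000; i++) {
            forQueue("root.tmp" + i);
        }
        System.out.println(REGISTRY.size());                     // prints 1001
    }
}
```

In this model the trade-off is visible directly: clearing the map keeps it bounded but detaches the live metrics from any reader that goes through the map, while never clearing keeps readers consistent but lets entries pile up. Removing individual entries when a queue is deleted would avoid both, though whether that fits the actual QueueMetrics design is not established by this sketch.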