[ https://issues.apache.org/jira/browse/YARN-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17829262#comment-17829262 ]
ASF GitHub Bot commented on YARN-11626:
---------------------------------------

hadoop-yetus commented on PR #6616:
URL: https://github.com/apache/hadoop/pull/6616#issuecomment-2010221243

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 6m 33s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 32m 32s | | trunk passed |
| +1 :green_heart: | compile | 0m 33s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 30s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 29s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 33s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 35s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 29s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 7s | | trunk passed |
| +1 :green_heart: | shadedclient | 20m 1s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 27s | | the patch passed |
| +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 27s | | the patch passed |
| +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 27s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 21s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6616/4/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 3 new + 5 unchanged - 0 fixed = 8 total (was 5) |
| +1 :green_heart: | mvnsite | 0m 26s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 23s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 6s | | the patch passed |
| +1 :green_heart: | shadedclient | 20m 8s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 89m 34s | | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 :green_heart: | asflicense | 0m 24s | | The patch does not generate ASF License warnings. |
| | | 179m 40s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6616/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6616 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle |
| uname | Linux 00b3366602f7 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 725bb7fd54d8c2d821e7b38df2a3358678c71b9c |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6616/4/testReport/ |
| Max. process+thread count | 950 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6616/4/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

> Optimization of the safeDelete operation in ZKRMStateStore
> ----------------------------------------------------------
>
>                 Key: YARN-11626
>                 URL: https://issues.apache.org/jira/browse/YARN-11626
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 3.0.0-alpha4, 3.1.1, 3.3.0
>            Reporter: wangzhihui
>            Priority: Minor
>              Labels: pull-request-available
>
> h1. Description
> * The logs show that removal of the app info started at 06:17:20, but the
> NoNodeException was received at 06:17:35.
> * During the 15s interval, Curator was retrying the metadata operation.
> Because a ZooKeeper delete is not idempotent, one of the attempts succeeded
> on the server but its response never reached the client. The next retry then
> failed with a NoNodeException, which triggered the STATE_STORE_FENCED event
> and ultimately caused the current ResourceManager to switch to standby.
> {code:java}
> 2023-10-28 06:17:20,359 INFO  recovery.RMStateStore (RMStateStore.java:transition(333)) - Removing info for app: application_1697410508608_140368
> 2023-10-28 06:17:20,359 INFO  resourcemanager.RMAppManager (RMAppManager.java:checkAppNumCompletedLimit(303)) - Application should be expired, max number of completed apps kept in memory met: maxCompletedAppsInMemory = 1000, removing app application_1697410508608_140368 from memory:
> 2023-10-28 06:17:35,665 ERROR recovery.RMStateStore (RMStateStore.java:transition(337)) - Error removing app: application_1697410508608_140368
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> 2023-10-28 06:17:35,666 INFO  recovery.RMStateStore (RMStateStore.java:handleStoreEvent(1147)) - RMStateStore state change from ACTIVE to FENCED
> 2023-10-28 06:17:35,666 ERROR resourcemanager.ResourceManager (ResourceManager.java:handle(898)) - Received RMFatalEvent of type STATE_STORE_FENCED, caused by org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
> 2023-10-28 06:17:35,666 INFO  resourcemanager.ResourceManager (ResourceManager.java:transitionToStandby(1309)) - Transitioning to standby state
> {code}
> h1. Solution
> The NoNodeException clearly indicates that the znode no longer exists, so we
> can safely ignore this exception and avoid the larger cluster impact of a
> ResourceManager failover.
> h1. Other
> The same issue in safeCreate also needs to be discussed and optimized.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
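
The failure mode described in the quoted issue (a delete that is applied but whose response is lost, followed by a retry that sees NoNode) can be reproduced with a small self-contained sketch. This is not the code from PR #6616 or from ZKRMStateStore: `FakeZk`, `deleteWithRetry`, `safeDelete`, and both exception classes are hypothetical stand-ins, and the exceptions are unchecked here only to keep the sketch short (the real `KeeperException` is checked).

```java
import java.util.HashSet;
import java.util.Set;

class SafeDeleteSketch {
    /** Hypothetical stand-in for KeeperException.NoNodeException. */
    static class NoNodeException extends RuntimeException {
        NoNodeException(String path) { super("NoNode for " + path); }
    }

    /** Hypothetical stand-in for a delete whose response never arrives. */
    static class ConnectionLossException extends RuntimeException {
        ConnectionLossException(String path) { super("ConnectionLoss for " + path); }
    }

    /** Toy in-memory znode store that drops the response of its first successful delete. */
    static class FakeZk {
        private final Set<String> nodes = new HashSet<>();
        private boolean dropNextResponse = true;

        void create(String path) { nodes.add(path); }

        boolean exists(String path) { return nodes.contains(path); }

        void delete(String path) {
            if (!nodes.remove(path)) {
                throw new NoNodeException(path);
            }
            if (dropNextResponse) {
                dropNextResponse = false;
                // The delete was applied, but the caller is told the call failed.
                throw new ConnectionLossException(path);
            }
        }
    }

    /** Retries on connection loss and lets NoNode propagate (the behavior in the logs). */
    static void deleteWithRetry(FakeZk zk, String path, int maxRetries) {
        ConnectionLossException last = new ConnectionLossException(path);
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                zk.delete(path);
                return;
            } catch (ConnectionLossException e) {
                last = e; // retry: the delete may or may not have been applied
            }
        }
        throw last;
    }

    /** Proposed behavior: NoNode means the znode is already gone, so treat it as success. */
    static boolean safeDelete(FakeZk zk, String path, int maxRetries) {
        try {
            deleteWithRetry(zk, path, maxRetries);
        } catch (NoNodeException e) {
            // The goal of the delete is already met; nothing to do.
        }
        return !zk.exists(path);
    }

    public static void main(String[] args) {
        FakeZk zk = new FakeZk();
        zk.create("/app");
        boolean sawNoNode = false;
        try {
            deleteWithRetry(zk, "/app", 3);
        } catch (NoNodeException e) {
            sawNoNode = true; // in the issue, this is what fences the state store
        }
        System.out.println("plain retry surfaced NoNode: " + sawNoNode);

        FakeZk zk2 = new FakeZk();
        zk2.create("/app");
        System.out.println("safeDelete succeeded: " + safeDelete(zk2, "/app", 3));
    }
}
```

The design choice is that a delete is judged by its postcondition (the znode is absent) rather than by which attempt removed it, which is what makes the retried operation effectively idempotent.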