[ https://issues.apache.org/jira/browse/HDFS-16455?focusedWorklogId=726844&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-726844 ]
ASF GitHub Bot logged work on HDFS-16455:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Feb/22 05:29
            Start Date: 15/Feb/22 05:29
    Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on pull request #3983:
URL: https://github.com/apache/hadoop/pull/3983#issuecomment-1039875071

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 36s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 32m 1s | | trunk passed |
| +1 :green_heart: | compile | 22m 9s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | compile | 19m 39s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 1m 7s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 43s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 15s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 42s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 2m 26s | | trunk passed |
| +1 :green_heart: | shadedclient | 21m 53s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 59s | | the patch passed |
| +1 :green_heart: | compile | 21m 31s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javac | 21m 31s | | the patch passed |
| +1 :green_heart: | compile | 19m 39s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 19m 39s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 1m 5s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 39s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 9s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 :green_heart: | javadoc | 1m 44s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 2m 36s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 14s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 17m 37s | | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 59s | | The patch does not generate ASF License warnings. |
| | | | 196m 3s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3983/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/3983 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux 5991516568fc 4.15.0-161-generic #169-Ubuntu SMP Fri Oct 15 13:41:54 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / e898cb2a0668e54069b1ba26401e570ad33f3302 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3983/3/testReport/ |
| Max. process+thread count | 1289 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3983/3/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 726844)
    Time Spent: 1h 10m  (was: 1h)

> RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop
> configuration files like core-site.xml
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-16455
>                 URL: https://issues.apache.org/jira/browse/HDFS-16455
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: rbf
>    Affects Versions: 3.3.0, 3.4.0
>            Reporter: Max Xie
>            Assignee: Max Xie
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In the current design for delegation tokens in a secure Router, all tokens are stored and updated in ZooKeeper through ZKDelegationTokenSecretManager. But the default value of the system property `jute.maxbuffer` is only 4 MB; if the Routers store too many tokens in ZooKeeper, the ZooKeeper client throws IOException `Packet lenxx is out of range` and all Routers crash.
>
> In our cluster, the Routers crashed because of this. The crash logs are below:
> {code:java}
> 2022-02-09 02:15:51,607 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Token renewal for identifier: (token for xxx: HDFS_DELEGATION_TOKEN owner=xxx/scheduler, renewer=hadoop, realUser=, issueDate=1644344146305, maxDate=1644948946305, sequenceNumber=27136070, masterKeyId=1107); total currentTokens 279548
> 2022-02-09 02:16:07,632 WARN org.apache.zookeeper.ClientCnxn: Session 0x1000172775a0012 for server zkurl:2181, unexpected error, closing socket connection and attempting reconnect
> java.io.IOException: Packet len4194553 is out of range!
>     at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:113)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:79)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1145)
> 2022-02-09 02:16:07,733 WARN org.apache.hadoop.ipc.Server: IPC Server handler 1254 on default port 9001, call Call#144 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getDelegationToken from ip:46534
> java.lang.RuntimeException: Could not increment shared counter !!
>     at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:582)
> {code}
> When we restarted a Router, it crashed again:
> {code:java}
> 2022-02-09 03:14:17,308 INFO org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager: Starting to load key cache.
> 2022-02-09 03:14:17,310 INFO org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager: Loaded key cache.
> 2022-02-09 03:14:32,930 WARN org.apache.zookeeper.ClientCnxn: Session 0x205584be35b0001 for server zkurl:2181, unexpected error, closing socket connection and attempting reconnect
> java.io.IOException: Packet len4194478 is out of range!
>     at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:113)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:79)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1145)
> 2022-02-09 03:14:33,030 ERROR org.apache.hadoop.hdfs.server.federation.router.security.token.ZKDelegationTokenSecretManagerImpl: Error starting threads for zkDelegationTokens
> java.io.IOException: Could not start PathChildrenCache for tokens {code}
> Finally, we configured `-Djute.maxbuffer=10000000` in hadoop-env.sh to fix this issue.
> After digging into it, we found that the znode `/ZKDTSMRoot/ZKDTSMTokensRoot` had more than 250,000 children, whose total data size was over 4 MB.
>
> Maybe we should explicitly specify the value of `jute.maxbuffer` in hadoop configuration files like core-site.xml or hdfs-rbf-site.xml to configure a larger value.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
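A back-of-envelope sketch of why ~250,000 token znodes can push a single ZooKeeper `getChildren` response past the 4 MB `jute.maxbuffer` default, as described in the report. The znode name format `DT_<sequence number>` and the 4-byte per-string length prefix are assumptions about the ZKDelegationTokenSecretManager layout and jute encoding, not measured values from the cluster:

```python
# Hypothetical sizing sketch for the children list of
# /ZKDTSMRoot/ZKDTSMTokensRoot (names only, ignoring framing overhead).
NUM_TOKENS = 250_000
NAME_BYTES = len("DT_27136070")   # 11 chars; seq numbers in the log are 8 digits
LEN_PREFIX = 4                    # assumed per-string int length in jute encoding

response_bytes = NUM_TOKENS * (NAME_BYTES + LEN_PREFIX)
DEFAULT_MAX = 4 * 1024 * 1024     # jute.maxbuffer default: 4 MB = 4194304 bytes

print(response_bytes)             # 3750000 bytes for the names alone
print(DEFAULT_MAX)                # 4194304 -- the observed "Packet len4194553"
                                  # in the log is just past this limit
```

Under these assumptions, the names alone consume roughly 90% of the default buffer, so any additional framing or growth in token count tips the response over the limit, which is consistent with the `Packet len4194553 is out of range!` failures above.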
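The workaround described above can be sketched as a hadoop-env.sh fragment. This is an illustration, not the patch under review; `HDFS_DFSROUTER_OPTS` is assumed to be the Hadoop 3.x hook for DFSRouter JVM options (verify against your hadoop-env.sh), and the value 10000000 mirrors the one used in the report:

```shell
# hadoop-env.sh on Router hosts -- raise the ZooKeeper client packet
# limit so large getChildren responses for the token root are accepted.
# 10000000 bytes (~9.5 MB) is the value from the report above.
export HDFS_DFSROUTER_OPTS="${HDFS_DFSROUTER_OPTS} -Djute.maxbuffer=10000000"
```

Note that `jute.maxbuffer` is also enforced on the ZooKeeper server side, so the same system property generally needs to be raised in the ZooKeeper server JVM as well, or the server will reject the oversized packets the client now allows.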