[jira] [Work logged] (HDFS-16455) RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop configuration files like core-site.xml

2022-02-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16455?focusedWorklogId=727472&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-727472
 ]

ASF GitHub Bot logged work on HDFS-16455:
-

Author: ASF GitHub Bot
Created on: 15/Feb/22 18:54
Start Date: 15/Feb/22 18:54
Worklog Time Spent: 10m 
  Work Description: Neilxzn commented on a change in pull request #3983:
URL: https://github.com/apache/hadoop/pull/3983#discussion_r806389802



##
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java
##
@@ -199,6 +202,10 @@ public ZKDelegationTokenSecretManager(Configuration conf) {
 ZK_DTSM_ZK_SESSION_TIMEOUT_DEFAULT);
 int numRetries =
 conf.getInt(ZK_DTSM_ZK_NUM_RETRIES, ZK_DTSM_ZK_NUM_RETRIES_DEFAULT);
+String juteMaxBuffer =
+conf.get(ZK_DTSM_ZK_JUTE_MAXBUFFER, ZK_DTSM_ZK_JUTE_MAXBUFFER_DEFAULT);

Review comment:
   Thank you for your review. fix it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 727472)
Time Spent: 1h 50m  (was: 1h 40m)

> RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop 
> configuration files like core-site.xml
> 
>
> Key: HDFS-16455
> URL: https://issues.apache.org/jira/browse/HDFS-16455
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.3.0, 3.4.0
>Reporter: Max  Xie
>Assignee: Max  Xie
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Based on the current design for delegation tokens in a secure Router, all 
> tokens are stored and updated in ZooKeeper using 
> ZKDelegationTokenSecretManager.  
> But the default value of the system property `jute.maxbuffer` is just 4MB. If 
> the Routers store too many tokens in ZooKeeper, the client throws an IOException 
> `Packet lenxx is out of range` and all Routers crash. 
>  
> In our cluster, Routers crashed because of this. The crash logs are below:
> {code:java}
> 2022-02-09 02:15:51,607 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Token renewal for identifier: (token for xxx: HDFS_DELEGATION_TOKEN owner=xxx/scheduler, renewer=hadoop, realUser=, issueDate=1644344146305, maxDate=1644948946305, sequenceNumber=27136070, masterKeyId=1107); total currentTokens 279548
> 2022-02-09 02:16:07,632 WARN org.apache.zookeeper.ClientCnxn: Session 0x1000172775a0012 for server zkurl:2181, unexpected error, closing socket connection and attempting reconnect
> java.io.IOException: Packet len4194553 is out of range!
> at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:113)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:79)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1145)
> 2022-02-09 02:16:07,733 WARN org.apache.hadoop.ipc.Server: IPC Server handler 1254 on default port 9001, call Call#144 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getDelegationToken from ip:46534
> java.lang.RuntimeException: Could not increment shared counter !!
> at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:582)
>  {code}
> When we restarted a Router, it crashed again:
> {code:java}
> 2022-02-09 03:14:17,308 INFO org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager: Starting to load key cache.
> 2022-02-09 03:14:17,310 INFO org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager: Loaded key cache.
> 2022-02-09 03:14:32,930 WARN org.apache.zookeeper.ClientCnxn: Session 0x205584be35b0001 for server zkurl:2181, unexpected error, closing socket connection and attempting reconnect
> java.io.IOException: Packet len4194478 is out of range!
> at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:113)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:79)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
> at 
> 
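
As a rough cross-check of the numbers in the log above (279548 current tokens, a 4194553-byte packet, a 4MB limit), the following sketch estimates how many child names fit in one response. It assumes that the oversized packet is a child listing of the token znode, that each child is named `DT_` plus the sequence number, and that each name costs a 4-byte length prefix plus its bytes on the wire. None of this is stated in the report, so treat the result as an estimate only.

```java
import java.nio.charset.StandardCharsets;

final class JuteBufferEstimate {
    // 4MB, the default quoted in the issue description.
    static final int LIMIT_BYTES = 4 * 1024 * 1024;

    // Assumed wire cost of one child name: 4-byte length prefix + UTF-8 bytes.
    static int bytesPerChild(String childName) {
        return 4 + childName.getBytes(StandardCharsets.UTF_8).length;
    }

    // Largest number of equally sized child names that fit under the limit,
    // ignoring any response headers.
    static int maxChildren(int limitBytes, String sampleName) {
        return limitBytes / bytesPerChild(sampleName);
    }

    public static void main(String[] args) {
        // Hypothetical child name built from the sequenceNumber in the log.
        String sample = "DT_27136070";
        System.out.println("bytes per child: " + bytesPerChild(sample));
        System.out.println("max children:    " + maxChildren(LIMIT_BYTES, sample));
        System.out.println("overshoot bytes: " + (4194553 - LIMIT_BYTES));
    }
}
```

Under these assumptions the limit is reached at roughly 279620 children, which is close to the 279548 tokens reported in the log, and the logged packet exceeds the 4MB limit by 249 bytes.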

[jira] [Work logged] (HDFS-16455) RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop configuration files like core-site.xml

2022-02-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16455?focusedWorklogId=727325&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-727325
 ]

ASF GitHub Bot logged work on HDFS-16455:
-

Author: ASF GitHub Bot
Created on: 15/Feb/22 18:40
Start Date: 15/Feb/22 18:40
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #3983:
URL: https://github.com/apache/hadoop/pull/3983#discussion_r806077863



##
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java
##
@@ -98,6 +98,8 @@
   + "kerberos.keytab";
   public static final String ZK_DTSM_ZK_KERBEROS_PRINCIPAL = ZK_CONF_PREFIX
   + "kerberos.principal";
+  public static final String ZK_DTSM_ZK_JUTE_MAXBUFFER = ZK_CONF_PREFIX
+  + "jute.maxbuffer";

Review comment:
   The indentation is not correct. Check the checkstyle.

##
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java
##
@@ -199,6 +202,10 @@ public ZKDelegationTokenSecretManager(Configuration conf) {
 ZK_DTSM_ZK_SESSION_TIMEOUT_DEFAULT);
 int numRetries =
 conf.getInt(ZK_DTSM_ZK_NUM_RETRIES, ZK_DTSM_ZK_NUM_RETRIES_DEFAULT);
+String juteMaxBuffer =
+conf.get(ZK_DTSM_ZK_JUTE_MAXBUFFER, ZK_DTSM_ZK_JUTE_MAXBUFFER_DEFAULT);

Review comment:
   Indentation fix.

##
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java
##
@@ -199,6 +202,10 @@ public ZKDelegationTokenSecretManager(Configuration conf) {
 ZK_DTSM_ZK_SESSION_TIMEOUT_DEFAULT);
 int numRetries =
 conf.getInt(ZK_DTSM_ZK_NUM_RETRIES, ZK_DTSM_ZK_NUM_RETRIES_DEFAULT);
+String juteMaxBuffer =
+conf.get(ZK_DTSM_ZK_JUTE_MAXBUFFER, ZK_DTSM_ZK_JUTE_MAXBUFFER_DEFAULT);
+System.setProperty(ZKClientConfig.JUTE_MAXBUFFER,
+ juteMaxBuffer);

Review comment:
   This could go to the previous line.
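
The hunk under review reads a configured value and publishes it as the JVM system property that the ZooKeeper client consults. The sketch below illustrates that pattern under stated assumptions: `java.util.Properties` stands in for Hadoop's `Configuration`, and the key `example.zk.jute.maxbuffer` and its default are hypothetical placeholders, not the values of `ZK_DTSM_ZK_JUTE_MAXBUFFER` or `ZK_DTSM_ZK_JUTE_MAXBUFFER_DEFAULT` from the patch.

```java
import java.util.Properties;

final class JuteMaxBufferSetup {
    // Hypothetical key and default, for illustration only; the patch's real
    // constants are not reproduced here.
    static final String CONF_KEY = "example.zk.jute.maxbuffer";
    static final String CONF_DEFAULT = String.valueOf(4 * 1024 * 1024);

    // Name of the system property, as quoted in the issue title.
    static final String JUTE_MAXBUFFER = "jute.maxbuffer";

    // Reads the configured value (or the default), checks that it is an
    // integer, and publishes it as a JVM-wide system property.
    static String apply(Properties conf) {
        String value = conf.getProperty(CONF_KEY, CONF_DEFAULT);
        Integer.parseInt(value); // fail fast on a malformed setting
        System.setProperty(JUTE_MAXBUFFER, value);
        return value;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(CONF_KEY, "8388608"); // example override: 8MB
        apply(conf);
        System.out.println(System.getProperty(JUTE_MAXBUFFER));
    }
}
```

Because a system property is process-wide, a call like this would affect every ZooKeeper client in the same JVM, and it presumably has to run before the client is created to take effect.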






Issue Time Tracking
---

Worklog Id: (was: 727325)
Time Spent: 1h 40m  (was: 1.5h)


[jira] [Work logged] (HDFS-16455) RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop configuration files like core-site.xml

2022-02-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16455?focusedWorklogId=727261&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-727261
 ]

ASF GitHub Bot logged work on HDFS-16455:
-

Author: ASF GitHub Bot
Created on: 15/Feb/22 18:35
Start Date: 15/Feb/22 18:35
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3983:
URL: https://github.com/apache/hadoop/pull/3983#issuecomment-1039875071








Issue Time Tracking
---

Worklog Id: (was: 727261)
Time Spent: 1.5h  (was: 1h 20m)

> RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop 
> configuration files like core-site.xml
> 
>
> Key: HDFS-16455
> URL: https://issues.apache.org/jira/browse/HDFS-16455
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.3.0, 3.4.0
>Reporter: Max  Xie
>Assignee: Max  Xie
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Based on the current design for delegation tokens in a secure Router, all 
> tokens are stored and updated in ZooKeeper using 
> ZKDelegationTokenSecretManager.  
> But the default value of the system property `jute.maxbuffer` is just 4MB. If 
> the Routers store too many tokens in ZooKeeper, the client throws an IOException 
> `Packet lenxx is out of range` and all Routers crash. 
>  
> In our cluster, Routers crashed because of this. The crash logs are below:
> {code:java}
> 2022-02-09 02:15:51,607 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Token renewal for identifier: (token for xxx: HDFS_DELEGATION_TOKEN owner=xxx/scheduler, renewer=hadoop, realUser=, issueDate=1644344146305, maxDate=1644948946305, sequenceNumber=27136070, masterKeyId=1107); total currentTokens 279548
> 2022-02-09 02:16:07,632 WARN org.apache.zookeeper.ClientCnxn: Session 0x1000172775a0012 for server zkurl:2181, unexpected error, closing socket connection and attempting reconnect
> java.io.IOException: Packet len4194553 is out of range!
> at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:113)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:79)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1145)
> 2022-02-09 02:16:07,733 WARN org.apache.hadoop.ipc.Server: IPC Server handler 1254 on default port 9001, call Call#144 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getDelegationToken from ip:46534
> java.lang.RuntimeException: Could not increment shared counter !!
> at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:582)
>  {code}
> When we restarted a Router, it crashed again:
> {code:java}
> 2022-02-09 03:14:17,308 INFO org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager: Starting to load key cache.
> 2022-02-09 03:14:17,310 INFO org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager: Loaded key cache.
> 2022-02-09 03:14:32,930 WARN org.apache.zookeeper.ClientCnxn: Session 0x205584be35b0001 for server zkurl:2181, unexpected error, closing socket connection and attempting reconnect
> java.io.IOException: Packet len4194478 is out of range!
> at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:113)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:79)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1145)
> 2022-02-09 03:14:33,030 ERROR org.apache.hadoop.hdfs.server.federation.router.security.token.ZKDelegationTokenSecretManagerImpl: Error starting threads for zkDelegationTokens
> java.io.IOException: Could not start PathChildrenCache for tokens
> {code}
> Finally, we configured `-Djute.maxbuffer=1000` in hadoop-env.sh to fix this 
> issue.
> After digging into it, we found that the znode 
> `/ZKDTSMRoot/ZKDTSMTokensRoot` had more than 25 child nodes, whose 
> data size was over 4MB.
>  
> Maybe we should  
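
The `Packet len... is out of range!` failures quoted above come from a length check on each incoming packet. The following is a simplified stand-in for that check, not ZooKeeper's actual source: the class and method names are hypothetical, the 4MB fallback is the default quoted in the description, and the exact comparison may differ between ZooKeeper versions.

```java
import java.io.IOException;

final class PacketLengthCheck {
    // Fallback used when the system property is unset (4MB, per the issue text).
    static final int ASSUMED_DEFAULT = 4 * 1024 * 1024;

    // Reads the limit from the `jute.maxbuffer` system property.
    static int configuredLimit() {
        return Integer.getInteger("jute.maxbuffer", ASSUMED_DEFAULT);
    }

    // Rejects a packet whose declared length is negative or not below the limit.
    static void checkLength(int len, int limit) throws IOException {
        if (len < 0 || len >= limit) {
            throw new IOException("Packet len" + len + " is out of range!");
        }
    }

    public static void main(String[] args) {
        int limit = configuredLimit();
        try {
            checkLength(4194478, limit); // length taken from the restart log
            System.out.println("accepted under limit " + limit);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the property unset, the length from the restart log is rejected. Starting the JVM with a larger `-Djute.maxbuffer` value, as the reporter did in hadoop-env.sh, raises the limit so that the same packet is accepted.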

[jira] [Work logged] (HDFS-16455) RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop configuration files like core-site.xml

2022-02-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16455?focusedWorklogId=726849&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-726849
 ]

ASF GitHub Bot logged work on HDFS-16455:
-

Author: ASF GitHub Bot
Created on: 15/Feb/22 05:51
Start Date: 15/Feb/22 05:51
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3983:
URL: https://github.com/apache/hadoop/pull/3983#issuecomment-1039885118


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 14s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  38m 22s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 38s |  |  trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |  21m 14s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 43s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 34s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 54s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 13s |  |  the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |  24m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 21s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m 21s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 59s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 36s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m  5s |  |  the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 41s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 35s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m  4s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 219m 11s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3983/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3983 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 378f1572afc2 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / e898cb2a0668e54069b1ba26401e570ad33f3302 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3983/2/testReport/ |
   | Max. process+thread count | 3059 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3983/2/console |
   | vers

[jira] [Work logged] (HDFS-16455) RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop configuration files like core-site.xml

2022-02-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16455?focusedWorklogId=726844&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-726844
 ]

ASF GitHub Bot logged work on HDFS-16455:
-

Author: ASF GitHub Bot
Created on: 15/Feb/22 05:29
Start Date: 15/Feb/22 05:29
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3983:
URL: https://github.com/apache/hadoop/pull/3983#issuecomment-1039875071


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 36s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m  1s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m  9s |  |  trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |  19m 39s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 43s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 15s |  |  trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 42s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 26s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 53s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 59s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 31s |  |  the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |  21m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 39s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  19m 39s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 39s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  |  the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 44s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 36s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 14s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 37s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 59s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 196m  3s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3983/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3983 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 5991516568fc 4.15.0-161-generic #169-Ubuntu SMP Fri Oct 15 13:41:54 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / e898cb2a0668e54069b1ba26401e570ad33f3302 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3983/3/testReport/ |
   | Max. process+thread count | 1289 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3983/3/console |
   | vers

[jira] [Work logged] (HDFS-16455) RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop configuration files like core-site.xml

2022-02-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16455?focusedWorklogId=726787&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-726787
 ]

ASF GitHub Bot logged work on HDFS-16455:
-

Author: ASF GitHub Bot
Created on: 15/Feb/22 02:11
Start Date: 15/Feb/22 02:11
Worklog Time Spent: 10m 
  Work Description: Neilxzn commented on a change in pull request #3983:
URL: https://github.com/apache/hadoop/pull/3983#discussion_r806389802



##
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java
##
@@ -199,6 +202,10 @@ public ZKDelegationTokenSecretManager(Configuration conf) {
 ZK_DTSM_ZK_SESSION_TIMEOUT_DEFAULT);
 int numRetries =
 conf.getInt(ZK_DTSM_ZK_NUM_RETRIES, ZK_DTSM_ZK_NUM_RETRIES_DEFAULT);
+String juteMaxBuffer =
+conf.get(ZK_DTSM_ZK_JUTE_MAXBUFFER, ZK_DTSM_ZK_JUTE_MAXBUFFER_DEFAULT);

Review comment:
   Thank you for your review. fix it.






Issue Time Tracking
---

Worklog Id: (was: 726787)
Time Spent: 1h  (was: 50m)


[jira] [Work logged] (HDFS-16455) RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop configuration files like core-site.xml

2022-02-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16455?focusedWorklogId=726430&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-726430
 ]

ASF GitHub Bot logged work on HDFS-16455:
-

Author: ASF GitHub Bot
Created on: 14/Feb/22 17:27
Start Date: 14/Feb/22 17:27
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #3983:
URL: https://github.com/apache/hadoop/pull/3983#discussion_r806077863



##
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java
##
@@ -98,6 +98,8 @@
   + "kerberos.keytab";
   public static final String ZK_DTSM_ZK_KERBEROS_PRINCIPAL = ZK_CONF_PREFIX
   + "kerberos.principal";
+  public static final String ZK_DTSM_ZK_JUTE_MAXBUFFER = ZK_CONF_PREFIX
+  + "jute.maxbuffer";

Review comment:
   The indentation is not correct. Check the checkstyle.

##
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java
##
@@ -199,6 +202,10 @@ public ZKDelegationTokenSecretManager(Configuration conf) {
 ZK_DTSM_ZK_SESSION_TIMEOUT_DEFAULT);
 int numRetries =
 conf.getInt(ZK_DTSM_ZK_NUM_RETRIES, ZK_DTSM_ZK_NUM_RETRIES_DEFAULT);
+String juteMaxBuffer =
+conf.get(ZK_DTSM_ZK_JUTE_MAXBUFFER, ZK_DTSM_ZK_JUTE_MAXBUFFER_DEFAULT);

Review comment:
   Indentation fix.

##
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java
##
@@ -199,6 +202,10 @@ public ZKDelegationTokenSecretManager(Configuration conf) {
 ZK_DTSM_ZK_SESSION_TIMEOUT_DEFAULT);
 int numRetries =
 conf.getInt(ZK_DTSM_ZK_NUM_RETRIES, ZK_DTSM_ZK_NUM_RETRIES_DEFAULT);
+String juteMaxBuffer =
+conf.get(ZK_DTSM_ZK_JUTE_MAXBUFFER, ZK_DTSM_ZK_JUTE_MAXBUFFER_DEFAULT);
+System.setProperty(ZKClientConfig.JUTE_MAXBUFFER,
+ juteMaxBuffer);

Review comment:
   This could go to the previous line.






Issue Time Tracking
---

Worklog Id: (was: 726430)
Time Spent: 50m  (was: 40m)


[jira] [Work logged] (HDFS-16455) RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop configuration files like core-site.xml

2022-02-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16455?focusedWorklogId=726213&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-726213
 ]

ASF GitHub Bot logged work on HDFS-16455:
-

Author: ASF GitHub Bot
Created on: 14/Feb/22 10:16
Start Date: 14/Feb/22 10:16
Worklog Time Spent: 10m 
  Work Description: Neilxzn commented on pull request #3983:
URL: https://github.com/apache/hadoop/pull/3983#issuecomment-1038902823


   > The patch itself looks good. I think one of the problems today is that there's no documentation around the zkdt usage unless you dig into the code. We should add better documentation as a follow-up Jira.
   
   @jojochuang Thank you for your review. I agree that we should add better documentation around the zkdt usage. +1
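
As a starting point for such documentation, the failure mode reported in this issue can be illustrated with a small sketch. It assumes the 4MB limit cited in the issue description (4 * 1024 * 1024 = 4194304 bytes) and uses the packet length from the crash log (4194553). The guard below is a simplified illustration with hypothetical class and method names, not ZooKeeper's actual code.

```java
import java.io.IOException;

public class PacketLengthSketch {
  // 4MB limit as stated in the issue description (an assumption here).
  static final int MAX_BUFFER = 4 * 1024 * 1024;

  /** Simplified length guard; rejects packets at or above the limit. */
  static void checkLength(int len, int maxBuffer) throws IOException {
    if (len < 0 || len >= maxBuffer) {
      throw new IOException("Packet len" + len + " is out of range!");
    }
  }

  /** Returns true if a packet of the given length passes the guard. */
  static boolean fits(int len, int maxBuffer) {
    try {
      checkLength(len, maxBuffer);
      return true;
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    int observed = 4194553; // packet length taken from the crash log
    System.out.println("bytes over limit: " + (observed - MAX_BUFFER));
    System.out.println("fits at 4MB: " + fits(observed, MAX_BUFFER));
    System.out.println("fits at 8MB: " + fits(observed, 8 * 1024 * 1024));
  }
}
```

Under these assumptions the logged packet is only 249 bytes over the limit, which is consistent with the failure appearing once the token count crossed a threshold rather than from a single oversized token.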




Issue Time Tracking
---

Worklog Id: (was: 726213)
Time Spent: 40m  (was: 0.5h)

> RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop 
> configuration files like core-site.xml
> 
>
> Key: HDFS-16455
> URL: https://issues.apache.org/jira/browse/HDFS-16455
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.3.0, 3.4.0
>Reporter: Max  Xie
>Assignee: Max  Xie
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h

[jira] [Work logged] (HDFS-16455) RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop configuration files like core-site.xml

2022-02-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16455?focusedWorklogId=725596&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-725596
 ]

ASF GitHub Bot logged work on HDFS-16455:
-

Author: ASF GitHub Bot
Created on: 12/Feb/22 08:47
Start Date: 12/Feb/22 08:47
Worklog Time Spent: 10m 
  Work Description: Neilxzn removed a comment on pull request #3983:
URL: https://github.com/apache/hadoop/pull/3983#issuecomment-1036292478


   This is an automatic reply. Your message has been received, thank you.




Issue Time Tracking
---

Worklog Id: (was: 725596)
Time Spent: 0.5h  (was: 20m)

> RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop 
> configuration files like core-site.xml
> 
>
> Key: HDFS-16455
> URL: https://issues.apache.org/jira/browse/HDFS-16455
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.3.0, 3.4.0
>Reporter: Max  Xie
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h

[jira] [Work logged] (HDFS-16455) RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop configuration files like core-site.xml

2022-02-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16455?focusedWorklogId=725205&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-725205
 ]

ASF GitHub Bot logged work on HDFS-16455:
-

Author: ASF GitHub Bot
Created on: 11/Feb/22 15:09
Start Date: 11/Feb/22 15:09
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3983:
URL: https://github.com/apache/hadoop/pull/3983#issuecomment-1036314001


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 37s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m  6s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m 22s |  |  trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |  19m 32s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 41s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 13s |  |  trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 43s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 28s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  2s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 58s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 37s |  |  the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |  21m 37s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 29s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  19m 29s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 38s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 44s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   2m 37s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m  4s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 50s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 54s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 196m 51s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3983/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3983 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 8e25a43a97f7 4.15.0-161-generic #169-Ubuntu SMP Fri Oct 15 13:41:54 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 535e276b55dc36ea220708a71d4094c18669aa35 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3983/1/testReport/ |
   | Max. process+thread count | 3158 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3983/1/console |
   | vers

[jira] [Work logged] (HDFS-16455) RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop configuration files like core-site.xml

2022-02-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16455?focusedWorklogId=725186&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-725186
 ]

ASF GitHub Bot logged work on HDFS-16455:
-

Author: ASF GitHub Bot
Created on: 11/Feb/22 14:52
Start Date: 11/Feb/22 14:52
Worklog Time Spent: 10m 
  Work Description: Neilxzn commented on pull request #3983:
URL: https://github.com/apache/hadoop/pull/3983#issuecomment-1036292478


   This is an automatic reply. Your message has been received, thank you.




Issue Time Tracking
---

Worklog Id: (was: 725186)
Remaining Estimate: 0h
Time Spent: 10m

> RBF: Router should explicitly specify the value of `jute.maxbuffer` in hadoop 
> configuration files like core-site.xml
> 
>
> Key: HDFS-16455
> URL: https://issues.apache.org/jira/browse/HDFS-16455
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.3.0, 3.4.0
>Reporter: Max  Xie
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the current design for delegation tokens in a secure Router, all tokens 
> are stored and updated in ZooKeeper using ZKDelegationTokenSecretManager.
> However, the default value of the system property `jute.maxbuffer` is only 
> 4MB. If the Routers store too many tokens in ZooKeeper, the ZooKeeper client 
> throws an IOException, `Packet lenxx is out of range`, and all Routers crash.
>  
> In our cluster, the Routers crashed because of this. The crash logs are below:
> {code:java}
> 2022-02-09 02:15:51,607 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Token renewal for identifier: (token for xxx: HDFS_DELEGATION_TOKEN owner=xxx/scheduler, renewer=hadoop, realUser=, issueDate=1644344146305, maxDate=1644948946305, sequenceNumber=27136070, masterKeyId=1107); total currentTokens 279548
> 2022-02-09 02:16:07,632 WARN org.apache.zookeeper.ClientCnxn: Session 0x1000172775a0012 for server zkurl:2181, unexpected error, closing socket connection and attempting reconnect
> java.io.IOException: Packet len4194553 is out of range!
> at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:113)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:79)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1145)
> 2022-02-09 02:16:07,733 WARN org.apache.hadoop.ipc.Server: IPC Server handler 1254 on default port 9001, call Call#144 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getDelegationToken from ip:46534
> java.lang.RuntimeException: Could not increment shared counter !!
> at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:582)
>  {code}
> When we restarted a Router, it crashed again:
> {code:java}
> 2022-02-09 03:14:17,308 INFO org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager: Starting to load key cache.
> 2022-02-09 03:14:17,310 INFO org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager: Loaded key cache.
> 2022-02-09 03:14:32,930 WARN org.apache.zookeeper.ClientCnxn: Session 0x205584be35b0001 for server zkurl:2181, unexpected error, closing socket connection and attempting reconnect
> java.io.IOException: Packet len4194478 is out of range!
> at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:113)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:79)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1145)
> 2022-02-09 03:14:33,030 ERROR org.apache.hadoop.hdfs.server.federation.router.security.token.ZKDelegationTokenSecretManagerImpl: Error starting threads for zkDelegationTokens
> java.io.IOException: Could not start PathChildrenCache for tokens
> {code}
> Finally, we configured `-Djute.maxbuffer=1000` in hadoop-env.sh to fix this 
> issue.
> After digging into it, we found that the znode 
> `/ZKDTSMRoot/ZKDTSMTokensRoot` had more than 25 child nodes, with a 
> data size of over 4MB.
>  
> Maybe we should  explicitly specify the value of `jute