[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=795969&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795969
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 28/Jul/22 07:44
Start Date: 28/Jul/22 07:44
Worklog Time Spent: 10m 
  Work Description: ferhui commented on PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#issuecomment-1197785576

   Merged! @ZanderXu Thanks for your contribution. @goiri @ayushtkn @slfan1989 
Thanks for your reviews!




Issue Time Tracking
---

Worklog Id: (was: 795969)
Time Spent: 3h  (was: 2h 50m)

> RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout
> ---
>
>                 Key: HDFS-16671
>                 URL: https://issues.apache.org/jira/browse/HDFS-16671
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.3.0
>
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> RouterRpcFairnessPolicyController should support a configurable permit acquire
> timeout. The hardcoded 1s timeout is very long, and it has caused an incident
> in our prod environment when one nameservice was busy.
> The optimal timeout should probably be less than p50(avgTime).
> During the incident, all handlers in RBF were waiting to acquire a permit for
> the busy ns:
> {code:java}
> "IPC Server handler 12 on default port " #2370 daemon prio=5 os_prio=0 tid=? nid=?  waiting on condition [?]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for   (a java.util.concurrent.Semaphore$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>   at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:409)
>   at org.apache.hadoop.hdfs.server.federation.fairness.AbstractRouterRpcFairnessPolicyController.acquirePermit(AbstractRouterRpcFairnessPolicyController.java:56)
>   at org.apache.hadoop.hdfs.server.federation.fairness.DynamicRouterRpcFairnessPolicyController.acquirePermit(DynamicRouterRpcFairnessPolicyController.java:123)
>   at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.acquirePermit(RouterRpcClient.java:1500)
> {code}
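
The stack trace shows handlers parked in a timed `Semaphore.tryAcquire`. As a rough illustration of what the issue asks for, here is a minimal sketch of a permit pool whose wait time is passed in rather than fixed at one second. `PermitGate` and its method names are hypothetical and are not the Hadoop classes; only `java.util.concurrent.Semaphore` is a real API here.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a per-nameservice permit pool; not the Hadoop class. */
public class PermitGate {
  private final Semaphore permits;
  private final long timeoutMs;

  public PermitGate(int permitCount, long timeoutMs) {
    this.permits = new Semaphore(permitCount);
    this.timeoutMs = timeoutMs;
  }

  /** Waits at most timeoutMs for a permit; returns false on timeout or interrupt. */
  public boolean acquirePermit() {
    try {
      return permits.tryAcquire(1, timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can still observe it.
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public void releasePermit() {
    permits.release();
  }

  public static void main(String[] args) {
    PermitGate gate = new PermitGate(1, 100);
    System.out.println(gate.acquirePermit()); // the single permit is free
    long start = System.nanoTime();
    System.out.println(gate.acquirePermit()); // pool exhausted, blocks until the timeout
    long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    System.out.println("waited about " + waitedMs + " ms");
  }
}
```

With a shorter timeout, a handler that cannot get a permit for a busy nameservice gives up sooner and becomes available for requests to other nameservices.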



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=795967&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795967
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 28/Jul/22 07:42
Start Date: 28/Jul/22 07:42
Worklog Time Spent: 10m 
  Work Description: ferhui merged PR #4597:
URL: https://github.com/apache/hadoop/pull/4597




Issue Time Tracking
---

Worklog Id: (was: 795967)
Time Spent: 2h 50m  (was: 2h 40m)




[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=795932&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795932
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 28/Jul/22 05:27
Start Date: 28/Jul/22 05:27
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on code in PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#discussion_r931794170


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/resources/hdfs-rbf-default.xml:
##
@@ -723,6 +723,14 @@
 
   
 
+  <property>
+    <name>dfs.federation.router.fairness.acquire.timeout</name>
+    <value>1s</value>

Review Comment:
   I see, your configuration is accurate.
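
A value such as `1s` is acceptable for a key that is read as a time duration, because Hadoop's `Configuration.getTimeDuration` understands unit suffixes. The sketch below is not Hadoop's code; it is a small, self-contained illustration of suffix parsing. `DurationSketch` is a hypothetical name, and treating a bare number as milliseconds is an assumption made only for this example.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative parser for suffixed durations; a sketch, not Hadoop's Configuration code. */
public class DurationSketch {

  /**
   * Parses values such as "100ms", "1s" or "2m" into milliseconds.
   * Assumption for this sketch: a bare number is taken as milliseconds.
   */
  public static long toMillis(String raw) {
    String v = raw.trim().toLowerCase();
    // Check "ms" before "s" and "m" so that "100ms" is not misread.
    if (v.endsWith("ms")) {
      return Long.parseLong(v.substring(0, v.length() - 2).trim());
    }
    if (v.endsWith("s")) {
      return TimeUnit.SECONDS.toMillis(Long.parseLong(v.substring(0, v.length() - 1).trim()));
    }
    if (v.endsWith("m")) {
      return TimeUnit.MINUTES.toMillis(Long.parseLong(v.substring(0, v.length() - 1).trim()));
    }
    return Long.parseLong(v);
  }

  public static void main(String[] args) {
    System.out.println(toMillis("1s"));
    System.out.println(toMillis("100ms"));
    System.out.println(toMillis("1"));
  }
}
```

Under that assumption, `1s` and a bare `1` are different durations, which is why writing the unit explicitly in the default file avoids ambiguity.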





Issue Time Tracking
---

Worklog Id: (was: 795932)
Time Spent: 2h 40m  (was: 2.5h)




[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=795931&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795931
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 28/Jul/22 05:23
Start Date: 28/Jul/22 05:23
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on code in PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#discussion_r931792540


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/resources/hdfs-rbf-default.xml:
##
@@ -723,6 +723,14 @@
 
   
 
+  <property>
+    <name>dfs.federation.router.fairness.acquire.timeout</name>
+    <value>1s</value>

Review Comment:
   Has this configuration value been confirmed?
   
   I think the following is correct:
   ```
   <value>1</value>
   ```





Issue Time Tracking
---

Worklog Id: (was: 795931)
Time Spent: 2.5h  (was: 2h 20m)




[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=795765&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795765
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 17:11
Start Date: 27/Jul/22 17:11
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#issuecomment-1197060602

   :confetti_ball: **+1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 1m 16s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 1s | | codespell was not available. |
   | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
   | +0 :ok: | xmllint | 0m 1s | | xmllint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: | mvninstall | 41m 55s | | trunk passed |
   | +1 :green_heart: | compile | 1m 3s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
   | +1 :green_heart: | compile | 0m 57s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | checkstyle | 0m 47s | | trunk passed |
   | +1 :green_heart: | mvnsite | 0m 57s | | trunk passed |
   | +1 :green_heart: | javadoc | 1m 7s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
   | +1 :green_heart: | javadoc | 1m 13s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | spotbugs | 2m 3s | | trunk passed |
   | +1 :green_heart: | shadedclient | 22m 20s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: | mvninstall | 0m 41s | | the patch passed |
   | +1 :green_heart: | compile | 0m 44s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
   | +1 :green_heart: | javac | 0m 44s | | the patch passed |
   | +1 :green_heart: | compile | 0m 40s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | javac | 0m 40s | | the patch passed |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | +1 :green_heart: | checkstyle | 0m 26s | | the patch passed |
   | +1 :green_heart: | mvnsite | 0m 43s | | the patch passed |
   | +1 :green_heart: | javadoc | 0m 41s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
   | +1 :green_heart: | javadoc | 0m 58s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | spotbugs | 1m 31s | | the patch passed |
   | +1 :green_heart: | shadedclient | 22m 18s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 22m 11s | | hadoop-hdfs-rbf in the patch passed. |
   | +1 :green_heart: | asflicense | 0m 47s | | The patch does not generate ASF License warnings. |
   | | | 126m 54s | | |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4597/6/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4597 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 58ab2853a7b1 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / e5436d5a832fb785127e8ab33519820efdb8b564 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4597/6/testReport/ |
   | Max. process+thread count | 2242 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 

[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=795702&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795702
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 15:03
Start Date: 27/Jul/22 15:03
Worklog Time Spent: 10m 
  Work Description: ZanderXu commented on PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#issuecomment-1196878432

   @ayushtkn I just deleted `assertTrue(acquireTimeMs < 100 + 50);` from 
`TestRouterRpcFairnessPolicyController#testAcquireTimeout()`. Please review it 
again and help move it forward. Thanks.




Issue Time Tracking
---

Worklog Id: (was: 795702)
Time Spent: 2h 10m  (was: 2h)




[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=794544&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794544
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 23/Jul/22 18:04
Start Date: 23/Jul/22 18:04
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on code in PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#discussion_r928150015


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/fairness/TestRouterRpcFairnessPolicyController.java:
##
@@ -83,6 +87,29 @@ public void testHandlerAllocationPreconfigured() {
     assertFalse(routerRpcFairnessPolicyController.acquirePermit(CONCURRENT_NS));
   }
 
+  @Test
+  public void testAcquireTimeout() {
+    Configuration conf = createConf(40);
+    conf.setInt(DFS_ROUTER_FAIR_HANDLER_COUNT_KEY_PREFIX + "ns1", 30);
+    conf.setTimeDuration(DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT, 100, TimeUnit.MILLISECONDS);
+    RouterRpcFairnessPolicyController routerRpcFairnessPolicyController =
+        FederationUtil.newFairnessPolicyController(conf);
+
+    // ns1 should have 30 permits allocated
+    for (int i = 0; i < 30; i++) {
+      assertTrue(routerRpcFairnessPolicyController.acquirePermit("ns1"));
+    }
+    long acquireBeginTimeMs = Time.monotonicNow();
+    assertFalse(routerRpcFairnessPolicyController.acquirePermit("ns1"));
+    long acquireEndTimeMs = Time.monotonicNow();
+
+    long acquireTimeMs = acquireEndTimeMs - acquireBeginTimeMs;
+
+    // There are some other operations, so acquireTimeMs >= 100ms.
+    assertTrue(acquireTimeMs >= 100);
+    assertTrue(acquireTimeMs < 100 + 50);

Review Comment:
   We can either raise this safety margin well above the expected wait or 
remove the assertion. @goiri do you have any suggestions?
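
To illustrate the concern with the upper bound: the lower bound of a timed acquire follows from the timeout itself, while the upper bound depends on scheduler and GC pauses and can fail on a loaded CI host. The following is a minimal, self-contained sketch using a plain `java.util.concurrent.Semaphore`; `AcquireTimeoutCheck` is a hypothetical name and this is not the Hadoop test.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper showing a timing check that asserts only the lower bound. */
public class AcquireTimeoutCheck {

  /** Returns how long a failed timed acquire blocked, in milliseconds. */
  public static long measureFailedAcquireMs(Semaphore semaphore, long timeoutMs) {
    long begin = System.nanoTime();
    boolean acquired;
    try {
      acquired = semaphore.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for a permit", e);
    }
    long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - begin);
    if (acquired) {
      throw new IllegalStateException("expected the acquire to time out");
    }
    return elapsedMs;
  }

  public static void main(String[] args) {
    Semaphore exhausted = new Semaphore(0); // no permits, so every acquire times out
    long elapsedMs = measureFailedAcquireMs(exhausted, 100);
    // The wait should not end before the timeout has elapsed.
    if (elapsedMs < 100) {
      throw new AssertionError("returned early after " + elapsedMs + " ms");
    }
    // No tight upper bound is asserted: a descheduled thread may return much later.
    System.out.println("timed out after " + elapsedMs + " ms");
  }
}
```

If an upper bound is wanted at all, a margin of several seconds is less likely to produce flaky failures than a margin of 50 ms.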





Issue Time Tracking
---

Worklog Id: (was: 794544)
Time Spent: 2h  (was: 1h 50m)




[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=794543&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794543
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 23/Jul/22 17:57
Start Date: 23/Jul/22 17:57
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#issuecomment-1193164201

   :confetti_ball: **+1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 0m 42s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 0s | | codespell was not available. |
   | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
   | +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: | mvninstall | 40m 40s | | trunk passed |
   | +1 :green_heart: | compile | 0m 57s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
   | +1 :green_heart: | compile | 0m 51s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | checkstyle | 0m 47s | | trunk passed |
   | +1 :green_heart: | mvnsite | 0m 58s | | trunk passed |
   | +1 :green_heart: | javadoc | 1m 2s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
   | +1 :green_heart: | javadoc | 1m 18s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | spotbugs | 1m 47s | | trunk passed |
   | +1 :green_heart: | shadedclient | 22m 4s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: | mvninstall | 0m 41s | | the patch passed |
   | +1 :green_heart: | compile | 0m 41s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
   | +1 :green_heart: | javac | 0m 41s | | the patch passed |
   | +1 :green_heart: | compile | 0m 43s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | javac | 0m 43s | | the patch passed |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | +1 :green_heart: | checkstyle | 0m 32s | | the patch passed |
   | +1 :green_heart: | mvnsite | 0m 59s | | the patch passed |
   | +1 :green_heart: | javadoc | 0m 44s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
   | +1 :green_heart: | javadoc | 0m 58s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | +1 :green_heart: | spotbugs | 1m 31s | | the patch passed |
   | +1 :green_heart: | shadedclient | 20m 58s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 22m 7s | | hadoop-hdfs-rbf in the patch passed. |
   | +1 :green_heart: | asflicense | 0m 53s | | The patch does not generate ASF License warnings. |
   | | | 123m 50s | | |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4597/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4597 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux b9138e5210e5 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / fcbdd2fbf85cffccef4ad55223d4f1236564450a |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4597/5/testReport/ |
   | Max. process+thread count | 2674 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 

[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=794507&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794507
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 23/Jul/22 14:56
Start Date: 23/Jul/22 14:56
Worklog Time Spent: 10m 
  Work Description: ZanderXu commented on code in PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#discussion_r928132207


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/fairness/TestRouterRpcFairnessPolicyController.java:
##
@@ -83,6 +87,29 @@ public void testHandlerAllocationPreconfigured() {
     assertFalse(routerRpcFairnessPolicyController.acquirePermit(CONCURRENT_NS));
   }
 
+  @Test
+  public void testAcquireTimeout() {
+    Configuration conf = createConf(40);
+    conf.setInt(DFS_ROUTER_FAIR_HANDLER_COUNT_KEY_PREFIX + "ns1", 30);
+    conf.setTimeDuration(DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT, 100, TimeUnit.MILLISECONDS);
+    RouterRpcFairnessPolicyController routerRpcFairnessPolicyController =
+        FederationUtil.newFairnessPolicyController(conf);
+
+    // ns1 should have 30 permits allocated
+    for (int i = 0; i < 30; i++) {
+      assertTrue(routerRpcFairnessPolicyController.acquirePermit("ns1"));
+    }
+    long acquireBeginTimeMs = Time.monotonicNow();
+    assertFalse(routerRpcFairnessPolicyController.acquirePermit("ns1"));
+    long acquireEndTimeMs = Time.monotonicNow();
+
+    long acquireTimeMs = acquireEndTimeMs - acquireBeginTimeMs;
+
+    // There are some other operations, so acquireTimeMs >= 100ms.
+    assertTrue(acquireTimeMs >= 100);
+    assertTrue(acquireTimeMs < 100 + 50);

Review Comment:
   @ayushtkn Thanks for your review. Do you have any suggestions for this assertion?





Issue Time Tracking
---

Worklog Id: (was: 794507)
Time Spent: 1h 40m  (was: 1.5h)




[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=794399&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794399
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 22/Jul/22 22:17
Start Date: 22/Jul/22 22:17
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on code in PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#discussion_r928012699


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RBFConfigKeys.java:
##
@@ -354,6 +354,10 @@ public class RBFConfigKeys extends CommonConfigurationKeysPublic {
       NoRouterRpcFairnessPolicyController.class;
   public static final String DFS_ROUTER_FAIR_HANDLER_COUNT_KEY_PREFIX =
       FEDERATION_ROUTER_FAIRNESS_PREFIX + "handler.count.";
+  public static final String DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT =
+      FEDERATION_ROUTER_FAIRNESS_PREFIX + "acquire.timeout";
+  public static final long   DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT_MS_DEFAULT =

Review Comment:
   The constant name should be the config key constant's name plus 
``_DEFAULT``; remove the ``_MS_`` in the middle.
   It should be ``DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT_DEFAULT``.
   



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/fairness/AbstractRouterRpcFairnessPolicyController.java:
##
@@ -42,15 +45,22 @@
   /** Hash table to hold semaphore for each configured name service. */
   private Map<String, Semaphore> permits;
 
+  private long acquireTimeoutMs = DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT_MS_DEFAULT;
+
   public void init(Configuration conf) {
     this.permits = new HashMap<>();
+    long timeoutMs = conf.getTimeDuration(DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT,
+        DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT_MS_DEFAULT, TimeUnit.MILLISECONDS);
+    if (timeoutMs >= 0) {
+      acquireTimeoutMs = timeoutMs;

Review Comment:
   If an invalid value is configured and we fall back to the default value, 
we should at least log a warning, something like: ``Invalid value -1 
configured for the key; it should be greater than or equal to 0. Using the 
default value of 1s instead.``
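
A minimal sketch of the fallback-with-warning behaviour this comment asks for. It uses `java.util.logging` rather than the logger the Router code uses so that it stays self-contained; the class `AcquireTimeoutValidator`, the method `resolveTimeoutMs`, and the message wording are assumptions made for illustration, not the code in the PR.

```java
import java.util.logging.Logger;

/** Hypothetical helper for illustration; not part of the Hadoop code base. */
final class AcquireTimeoutValidator {
  private static final Logger LOG =
      Logger.getLogger(AcquireTimeoutValidator.class.getName());

  /** Assumed default of 1s, mirroring the previously hardcoded value. */
  static final long DEFAULT_TIMEOUT_MS = 1000L;

  private AcquireTimeoutValidator() {
  }

  /**
   * Returns the configured timeout when it is valid (>= 0). Otherwise logs a
   * warning naming the key and the rejected value, then returns the default.
   */
  static long resolveTimeoutMs(String key, long configuredMs) {
    if (configuredMs >= 0) {
      return configuredMs;
    }
    LOG.warning(String.format(
        "Invalid value %d ms configured for %s; it should be >= 0. "
            + "Using the default value of %d ms instead.",
        configuredMs, key, DEFAULT_TIMEOUT_MS));
    return DEFAULT_TIMEOUT_MS;
  }
}
```

Logging the key together with the rejected value lets an operator find the misconfiguration from the log line alone.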



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/fairness/TestRouterRpcFairnessPolicyController.java:
##
@@ -83,6 +87,29 @@ public void testHandlerAllocationPreconfigured() {
     assertFalse(routerRpcFairnessPolicyController.acquirePermit(CONCURRENT_NS));
   }
 
+  @Test
+  public void testAcquireTimeout() {
+    Configuration conf = createConf(40);
+    conf.setInt(DFS_ROUTER_FAIR_HANDLER_COUNT_KEY_PREFIX + "ns1", 30);
+    conf.setTimeDuration(DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT, 100, TimeUnit.MILLISECONDS);
+    RouterRpcFairnessPolicyController routerRpcFairnessPolicyController =
+        FederationUtil.newFairnessPolicyController(conf);
+
+    // ns1 should have 30 permits allocated
+    for (int i = 0; i < 30; i++) {
+      assertTrue(routerRpcFairnessPolicyController.acquirePermit("ns1"));
+    }
+    long acquireBeginTimeMs = Time.monotonicNow();
+    assertFalse(routerRpcFairnessPolicyController.acquirePermit("ns1"));
+    long acquireEndTimeMs = Time.monotonicNow();
+
+    long acquireTimeMs = acquireEndTimeMs - acquireBeginTimeMs;
+
+    // There are some other operations, so acquireTimeMs >= 100ms.
+    assertTrue(acquireTimeMs >= 100);
+    assertTrue(acquireTimeMs < 100 + 50);

Review Comment:
   Is this 50 added as a safety margin, on the assumption that the other 
operations won't take more than 50ms? If so, I suspect this test will become 
flaky in the future. Please double-check that it can never exceed the 50ms margin.
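
One way to lower the flakiness risk raised here is to keep the lower bound strict and make the upper bound loose, for example "clearly below the old 1s default" instead of "within 50ms". The sketch below shows that idea with a plain `java.util.concurrent.Semaphore`; the class `AcquireTimeoutCheck` and its methods are hypothetical and are not the test that was merged.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-alone check; names are illustrative only. */
final class AcquireTimeoutCheck {
  private AcquireTimeoutCheck() {
  }

  /** Measures, in ms, how long a timed acquire blocks when no permit exists. */
  static long measureFailedAcquireMs(long timeoutMs) {
    Semaphore exhausted = new Semaphore(0);
    long beginNs = System.nanoTime();
    boolean acquired;
    try {
      acquired = exhausted.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      // Restore the interrupt flag and surface the problem to the caller.
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting", e);
    }
    long elapsedNs = System.nanoTime() - beginNs;
    if (acquired) {
      throw new IllegalStateException("No permit should be available");
    }
    return TimeUnit.NANOSECONDS.toMillis(elapsedNs);
  }

  /**
   * True when the wait honoured the configured timeout and stayed below the
   * old hardcoded default. The upper bound is deliberately generous so that
   * scheduling delays on a busy CI host do not fail the check.
   */
  static boolean withinBounds(long elapsedMs, long timeoutMs, long oldDefaultMs) {
    return elapsedMs >= timeoutMs && elapsedMs < oldDefaultMs;
  }
}
```

With a 100ms timeout and a 1000ms old default, the check still proves that the configured value is in effect, while tolerating several hundred milliseconds of jitter.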





Issue Time Tracking
---

Worklog Id: (was: 794399)
Time Spent: 1.5h  (was: 1h 20m)


[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=794265&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794265
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 22/Jul/22 15:13
Start Date: 22/Jul/22 15:13
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#issuecomment-1192675906

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 42s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  39m 27s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 50s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 52s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 59s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 15s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 43s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 53s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 44s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 44s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 40s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 27s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 43s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 40s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  23m  4s |  |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 122m 10s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4597/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4597 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux e1bcb4eb6966 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d985449246a38b8399ac9c716874924ef51da4bb |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4597/4/testReport/ |
   | Max. process+thread count | 2736 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 

[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=794228&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794228
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 22/Jul/22 13:11
Start Date: 22/Jul/22 13:11
Worklog Time Spent: 10m 
  Work Description: ZanderXu commented on PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#issuecomment-1192558613

   @goiri Thanks for your review, I learned a lot. Please help me review this 
patch again, thanks.




Issue Time Tracking
---

Worklog Id: (was: 794228)
Time Spent: 1h 10m  (was: 1h)




[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=793969&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793969
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 21/Jul/22 23:21
Start Date: 21/Jul/22 23:21
Worklog Time Spent: 10m 
  Work Description: goiri commented on code in PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#discussion_r927170619


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/fairness/AbstractRouterRpcFairnessPolicyController.java:
##
@@ -42,15 +45,22 @@
   /** Hash table to hold semaphore for each configured name service. */
   private Map<String, Semaphore> permits;
 
+  private long acquireTimeoutMs = DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT_DEFAULT;
+
   public void init(Configuration conf) {
     this.permits = new HashMap<>();
+    long timeoutMs = conf.getTimeDuration(DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT_MS,

Review Comment:
   If we use getTimeDuration(), we don't need the unit suffix (``_MS``) in 
the key DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT_MS.
   We should set the default in the XML to "1s".
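
The reasoning behind this request is that a duration value can carry its own unit ("1s", "100ms"), so the key name no longer has to encode one. The sketch below illustrates that kind of unit-suffix parsing in plain Java. It is not Hadoop's implementation: the class `DurationValue` and the method `parseToMillis` are hypothetical, only the `ms`, `s`, `m` and `h` suffixes are handled, and a bare number is assumed to already be in milliseconds.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

/** Hypothetical parser for illustration; not Hadoop's implementation. */
final class DurationValue {
  private DurationValue() {
  }

  /**
   * Parses values such as "1s" or "100ms" into milliseconds. A bare number
   * is assumed to already be in milliseconds.
   */
  static long parseToMillis(String raw) {
    String v = raw.trim().toLowerCase(Locale.ROOT);
    // Check "ms" first: it also ends with "s", so the order matters.
    if (v.endsWith("ms")) {
      return Long.parseLong(v.substring(0, v.length() - 2).trim());
    }
    if (v.endsWith("s")) {
      return TimeUnit.SECONDS.toMillis(stripUnit(v));
    }
    if (v.endsWith("m")) {
      return TimeUnit.MINUTES.toMillis(stripUnit(v));
    }
    if (v.endsWith("h")) {
      return TimeUnit.HOURS.toMillis(stripUnit(v));
    }
    return Long.parseLong(v);
  }

  /** Drops the single-character unit suffix and parses the remaining number. */
  private static long stripUnit(String v) {
    return Long.parseLong(v.substring(0, v.length() - 1).trim());
  }
}
```

Under these assumptions "1s" and "1000ms" resolve to the same timeout, which is why a default written as "1s" stays readable without a `.ms` suffix on the key.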



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/fairness/TestRouterRpcFairnessPolicyController.java:
##
@@ -83,6 +85,29 @@ public void testHandlerAllocationPreconfigured() {
     assertFalse(routerRpcFairnessPolicyController.acquirePermit(CONCURRENT_NS));
   }
 
+  @Test
+  public void testAcquireTimeout() {
+    Configuration conf = createConf(40);
+    conf.setInt(DFS_ROUTER_FAIR_HANDLER_COUNT_KEY_PREFIX + "ns1", 30);
+    conf.setLong(DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT_MS, 100);

Review Comment:
   setTimeDuration(DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT, 1, TimeUnit.SECONDS)



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/resources/hdfs-rbf-default.xml:
##
@@ -723,6 +723,14 @@
 
   
 
+  <property>
+    <name>dfs.federation.router.fairness.acquire.timeout.ms</name>
+    <value>1000</value>

Review Comment:
   Use 1s as the value and remove the ms from the key name.





Issue Time Tracking
---

Worklog Id: (was: 793969)
Time Spent: 1h  (was: 50m)




[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=793535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793535
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 21/Jul/22 05:03
Start Date: 21/Jul/22 05:03
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#issuecomment-1191041960

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 43s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  38m 56s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 58s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 52s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m  3s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 16s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 48s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 15s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 43s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 46s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 40s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 26s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 44s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 22s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  21m 49s |  |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 53s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 120m 42s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4597/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4597 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux a5492a0cb95f 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / bfb922926e46f644cad83389f6ca9c404085315d |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4597/3/testReport/ |
   | Max. process+thread count | 2761 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 

[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=793434&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793434
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 20/Jul/22 21:11
Start Date: 20/Jul/22 21:11
Worklog Time Spent: 10m 
  Work Description: goiri commented on code in PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#discussion_r926056965


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/fairness/StaticRouterRpcFairnessPolicyController.java:
##
@@ -109,16 +109,15 @@ public void init(Configuration conf)
   }
 
   private static void logAssignment(String nsId, int count) {
-LOG.info("Assigned {} handlers to nsId {} ",
-count, nsId);
+LOG.info("Assigned {} handlers to nsId {} ", count, nsId);
   }
 
-  private void validateHandlersCount(Configuration conf, int handlerCount,
- Set allConfiguredNS) {
+  private void validateHandlersCount(Configuration conf,
+  int handlerCount, Set allConfiguredNS) {
 int totalDedicatedHandlers = 0;
 for (String nsId : allConfiguredNS) {
   int dedicatedHandlers =
-  conf.getInt(DFS_ROUTER_FAIR_HANDLER_COUNT_KEY_PREFIX + nsId, 0);
+  conf.getInt(DFS_ROUTER_FAIR_HANDLER_COUNT_KEY_PREFIX + nsId, 0);

Review Comment:
   Make it one line as we are touching this.



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/fairness/AbstractRouterRpcFairnessPolicyController.java:
##
@@ -42,15 +45,22 @@
   /** Hash table to hold semaphore for each configured name service. */
   private Map<String, Semaphore> permits;
 
+  private long acquireTimeout = DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT_DEFAULT;
+
   public void init(Configuration conf) {
     this.permits = new HashMap<>();
+    long timeout = conf.getLong(DFS_ROUTER_FAIRNESS_ACQUIRE_TIMEOUT_KEY,

Review Comment:
   What's the unit? We should do getTimeDuration()
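
To make concrete what this timeout governs, here is a simplified sketch of a per-nameservice permit table with a timed acquire, built on `java.util.concurrent.Semaphore` (the class that appears in the stack trace of the issue description). `PermitTable` and its methods are hypothetical stand-ins, not the Router's fairness controller, and the timeout is stored in explicit milliseconds so the unit is never ambiguous.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical, simplified permit table; not the Router's controller. */
final class PermitTable {
  /** One semaphore per name service; filled before any handler runs. */
  private final Map<String, Semaphore> permits = new HashMap<>();
  /** How long a handler may wait for a permit, in milliseconds. */
  private final long acquireTimeoutMs;

  PermitTable(long acquireTimeoutMs) {
    this.acquireTimeoutMs = acquireTimeoutMs;
  }

  /** Assigns a fixed number of handler permits to a name service. */
  void register(String nsId, int handlerCount) {
    permits.put(nsId, new Semaphore(handlerCount));
  }

  /** Waits at most acquireTimeoutMs for a permit of the given name service. */
  boolean acquirePermit(String nsId) {
    Semaphore semaphore = permits.get(nsId);
    if (semaphore == null) {
      return false;
    }
    try {
      return semaphore.tryAcquire(acquireTimeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  /** Returns a permit so that a waiting handler can proceed. */
  void releasePermit(String nsId) {
    Semaphore semaphore = permits.get(nsId);
    if (semaphore != null) {
      semaphore.release();
    }
  }
}
```

When a name service is busy and its permits are exhausted, every further handler blocks in `acquirePermit` for the full timeout before giving up, so a shorter timeout frees handlers sooner for requests to other name services.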





Issue Time Tracking
---

Worklog Id: (was: 793434)
Time Spent: 40m  (was: 0.5h)




[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=793367&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793367
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 20/Jul/22 17:56
Start Date: 20/Jul/22 17:56
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#issuecomment-1190583416

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 43s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  38m 30s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 58s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 54s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m  3s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 17s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 48s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 25s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 46s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 39s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 24s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 44s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 40s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 59s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 29s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 35s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |  21m 56s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4597/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 54s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 120m 31s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.federation.fairness.TestRouterRpcFairnessPolicyController |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4597/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4597 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux b5ac6535d8d3 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 300d8ee4cdc6e947826d469442a41c5cfe0534b7 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private 

[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=793280&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793280
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 20/Jul/22 14:39
Start Date: 20/Jul/22 14:39
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4597:
URL: https://github.com/apache/hadoop/pull/4597#issuecomment-1190372307

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:-:|--:|:-|:-:|:---:|
   | +0 :ok: |  reexec  |   0m 44s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  40m  9s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 59s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 52s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 47s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m  1s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 11s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 48s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  0s |  |  branch has no errors when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 44s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 44s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 39s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 26s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 43s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 40s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 46s |  |  patch has no errors when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  |  21m 49s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4597/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 52s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 122m 43s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.federation.fairness.TestRouterRpcFairnessPolicyController |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4597/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4597 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 14bc3953b314 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / f965c90296fe2623d577c332105ef3454dc2667e |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 

[jira] [Work logged] (HDFS-16671) RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout

2022-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=793216=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793216
 ]

ASF GitHub Bot logged work on HDFS-16671:
-

Author: ASF GitHub Bot
Created on: 20/Jul/22 12:35
Start Date: 20/Jul/22 12:35
Worklog Time Spent: 10m 
  Work Description: ZanderXu opened a new pull request, #4597:
URL: https://github.com/apache/hadoop/pull/4597

   RouterRpcFairnessPolicyController supports a configurable permit acquire
   timeout. The hardcoded 1s timeout is very long, and it caused an incident
   in our prod environment when one nameservice was busy.
   
   The optimal timeout should probably be less than p50(avgTime).
   
   During the incident, all handlers in RBF were waiting to acquire the
   permit of the busy ns:
   ```
   "IPC Server handler 12 on default port " #2370 daemon prio=5 os_prio=0 tid=? nid=?  waiting on condition [?]
      java.lang.Thread.State: TIMED_WAITING (parking)
       at sun.misc.Unsafe.park(Native Method)
       - parking to wait for   (a java.util.concurrent.Semaphore$NonfairSync)
       at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
       at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
       at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
       at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:409)
       at org.apache.hadoop.hdfs.server.federation.fairness.AbstractRouterRpcFairnessPolicyController.acquirePermit(AbstractRouterRpcFairnessPolicyController.java:56)
       at org.apache.hadoop.hdfs.server.federation.fairness.DynamicRouterRpcFairnessPolicyController.acquirePermit(DynamicRouterRpcFairnessPolicyController.java:123)
       at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.acquirePermit(RouterRpcClient.java:1500)
   ```
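
The stack trace above shows handlers parked in the timed `Semaphore.tryAcquire` call. As a rough illustration of the idea only (not the code in this PR), a minimal sketch of a per-nameservice permit pool whose acquire timeout is passed in rather than hardcoded could look like the following. The class name `PermitTimeoutSketch`, its constructor parameters, and the 50 ms value are hypothetical and chosen just for this example; only `java.util.concurrent.Semaphore` and `TimeUnit` are real APIs.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch: a permit pool with a configurable acquire timeout.
 * This is NOT the Hadoop RouterRpcFairnessPolicyController implementation.
 */
public class PermitTimeoutSketch {
  private final Semaphore permits;
  // Assumed to come from configuration instead of a hardcoded 1s.
  private final long acquireTimeoutMs;

  public PermitTimeoutSketch(int permitCount, long acquireTimeoutMs) {
    this.permits = new Semaphore(permitCount);
    this.acquireTimeoutMs = acquireTimeoutMs;
  }

  /** Returns true if a permit was obtained within the configured timeout. */
  public boolean acquirePermit() {
    try {
      return permits.tryAcquire(acquireTimeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      // Restore the interrupt flag and report failure to the caller.
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public void releasePermit() {
    permits.release();
  }

  public static void main(String[] args) {
    // One permit and a short timeout to imitate a busy nameservice.
    PermitTimeoutSketch busyNs = new PermitTimeoutSketch(1, 50);
    System.out.println("first acquire: " + busyNs.acquirePermit());

    long start = System.nanoTime();
    boolean second = busyNs.acquirePermit();
    long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    System.out.println("second acquire: " + second + " after ~" + waitedMs + " ms");

    busyNs.releasePermit();
  }
}
```

With a shorter timeout, a handler that cannot get a permit for the busy ns gives up sooner and becomes free for requests to other nameservices, which is the motivation stated above.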
   




Issue Time Tracking
---

Worklog Id: (was: 793216)
Remaining Estimate: 0h
Time Spent: 10m

> RBF: RouterRpcFairnessPolicyController supports configurable permit acquire 
> timeout
> ---
>
> Key: HDFS-16671
> URL: https://issues.apache.org/jira/browse/HDFS-16671
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> RouterRpcFairnessPolicyController supports a configurable permit acquire
> timeout. The hardcoded 1s timeout is very long, and it caused an incident
> in our prod environment when one nameservice was busy.
> The optimal timeout should probably be less than p50(avgTime).
> During the incident, all handlers in RBF were waiting to acquire the
> permit of the busy ns:
> {code:java}
> "IPC Server handler 12 on default port " #2370 daemon prio=5 os_prio=0 tid=? nid=?  waiting on condition [?]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for   (a java.util.concurrent.Semaphore$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>     at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:409)
>     at org.apache.hadoop.hdfs.server.federation.fairness.AbstractRouterRpcFairnessPolicyController.acquirePermit(AbstractRouterRpcFairnessPolicyController.java:56)
>     at org.apache.hadoop.hdfs.server.federation.fairness.DynamicRouterRpcFairnessPolicyController.acquirePermit(DynamicRouterRpcFairnessPolicyController.java:123)
>     at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.acquirePermit(RouterRpcClient.java:1500)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org