[jira] [Commented] (HDFS-17300) [SBN READ] A rpc call in Observer should throw ObserverRetryOnActiveException if its stateid is always lower than client stateid for a configured time.

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805412#comment-17805412
 ] 

ASF GitHub Bot commented on HDFS-17300:
---

LiuGuH commented on PR #6414:
URL: https://github.com/apache/hadoop/pull/6414#issuecomment-1886497158

   @chliang71  Do you have time to review this? Thanks




> [SBN READ]  A rpc call in Observer should throw 
> ObserverRetryOnActiveException if its stateid is always lower than client 
> stateid for a configured time.
> 
>
> Key: HDFS-17300
> URL: https://issues.apache.org/jira/browse/HDFS-17300
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: liuguanghua
>Assignee: liuguanghua
>Priority: Major
>  Labels: pull-request-available
>
>   
> When Observer reads are enabled, the Observer updates its state ID by having 
> EditLogTailer tail the edit log from the Active NameNode in near real time. If, 
> for an RPC call, the Observer's state ID is lower than the client's state ID 
> (which msync may have advanced from the Active NameNode), the call is requeued 
> into the call queue.
> This PR proposes that if the state ID stays lower than the client's 
> state ID for a configured time, the call should throw 
> ObserverRetryOnActiveException to the client, and the client will go to the 
> Active NameNode for processing.
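
The proposed behavior can be sketched roughly as follows. This is an illustration only, not the code in PR #6414: the class StaleCallPolicy, its decide method, and the maxWaitMillis parameter are hypothetical names standing in for the "configured time" mentioned above.

```java
/** Hypothetical helper for illustration; not part of Hadoop. */
final class StaleCallPolicy {
  enum Decision { PROCESS, REQUEUE, RETRY_ON_ACTIVE }

  // Assumed stand-in for the "configured time" in the issue description.
  private final long maxWaitMillis;

  StaleCallPolicy(long maxWaitMillis) {
    this.maxWaitMillis = maxWaitMillis;
  }

  /**
   * Decide what to do with a call, given the Observer's current state ID,
   * the client's state ID, and how long the call has been waiting.
   */
  Decision decide(long observerStateId, long clientStateId,
                  long enqueuedAtMillis, long nowMillis) {
    if (observerStateId >= clientStateId) {
      return Decision.PROCESS;          // Observer has caught up
    }
    if (nowMillis - enqueuedAtMillis >= maxWaitMillis) {
      return Decision.RETRY_ON_ACTIVE;  // waited too long: send client to Active
    }
    return Decision.REQUEUE;            // still behind, keep waiting
  }
}
```

In this sketch, RETRY_ON_ACTIVE corresponds to the point where the server would throw ObserverRetryOnActiveException; how the wait time is tracked per call in the real RPC server is not shown here.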



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17311) RBF: ConnectionManager creatorQueue should offer a pool that is not already in creatorQueue.

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805410#comment-17805410
 ] 

ASF GitHub Bot commented on HDFS-17311:
---

LiuGuH commented on PR #6392:
URL: https://github.com/apache/hadoop/pull/6392#issuecomment-1886482486

   > There is still a check style issue though.
   
   Fixed.  Thanks




> RBF: ConnectionManager creatorQueue should offer a pool that is not already 
> in creatorQueue.
> 
>
> Key: HDFS-17311
> URL: https://issues.apache.org/jira/browse/HDFS-17311
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: liuguanghua
>Assignee: liuguanghua
>Priority: Major
>  Labels: pull-request-available
>
> In the Router, the following log was found:
>  
> 2023-12-29 15:18:54,799 ERROR 
> org.apache.hadoop.hdfs.server.federation.router.ConnectionManager: Cannot add 
> more than 2048 connections at the same time
>  
> The log indicates that ConnectionManager.creatorQueue was full at a certain 
> point. But my cluster does not have enough users to reach 2048 pairs of 
> .
> This may be due to the following reasons:
>  # ConnectionManager.creatorQueue is a queue to which a ConnectionPool is 
> offered when it does not have enough ConnectionContexts.
>  # The ConnectionCreator thread consumes from creatorQueue and creates more 
> ConnectionContexts for a ConnectionPool.
>  # Clients may concurrently invoke ConnectionManager.getConnection() for the 
> same user. This may add the same ConnectionPool to 
> ConnectionManager.creatorQueue many times.
>  # When creatorQueue is full, a new ConnectionPool cannot be added and this 
> error is logged. As a result, a genuinely new ConnectionPool may be unable to 
> get more ConnectionContexts for a new user.
> So this PR makes sure creatorQueue does not hold the same ConnectionPool more 
> than once at a time.
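
One way to keep duplicates out of such a queue is sketched below. This is a minimal sketch under assumptions, not the change made in PR #6392: DedupCreatorQueue and its methods are hypothetical, and a plain set of pending entries stands in for whatever bookkeeping the Router actually uses.

```java
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical bounded queue that holds each pool at most once. */
final class DedupCreatorQueue<P> {
  private final BlockingQueue<P> queue;
  // Pools currently waiting in the queue.
  private final Set<P> pending = ConcurrentHashMap.newKeySet();

  DedupCreatorQueue(int capacity) {
    this.queue = new ArrayBlockingQueue<>(capacity);
  }

  /**
   * Returns true if the pool is queued after this call (newly added or
   * already waiting), false only if the queue is genuinely full.
   */
  boolean offer(P pool) {
    if (!pending.add(pool)) {
      return true;             // already queued, do not add a duplicate
    }
    if (!queue.offer(pool)) {
      pending.remove(pool);    // queue full, roll back the marker
      return false;
    }
    return true;
  }

  /** Takes the next pool, or null if none is waiting. */
  P poll() {
    P pool = queue.poll();
    if (pool != null) {
      pending.remove(pool);
    }
    return pool;
  }

  int size() {
    return queue.size();
  }
}
```

With this shape, many concurrent requests for the same user occupy one slot instead of many, so the 2048 limit is reached only when that many distinct pools are waiting. The brief window between poll() and the marker removal is tolerated in this sketch.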






[jira] [Commented] (HDFS-17336) Provide an option to enable/disable considering space used by .Trash folder for user quota computation

2024-01-10 Thread Srinivasu Majeti (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805405#comment-17805405
 ] 

Srinivasu Majeti commented on HDFS-17336:
-

Added description [~ayushtkn] 

> Provide an option to enable/disable considering space used by .Trash folder 
> for user quota computation
> -
>
> Key: HDFS-17336
> URL: https://issues.apache.org/jira/browse/HDFS-17336
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.4
>Reporter: Srinivasu Majeti
>Priority: Major
>
> We have a use case for a large account where /user/user1 has a space quota 
> configured. By default, Trash goes into /user/user1/.Trash. As long as 
> deleted files remain in Trash, the user cannot reclaim the space 
> quota. The customer is looking for a feature that skips computing space 
> quota for the files in the Trash folder. The proposal is to introduce a new 
> configuration parameter to skip computing quota for Trash files.
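
The requested behavior could look roughly like the sketch below. Everything here is hypothetical: TrashAwareQuota, spaceConsumed, and the skipTrash flag are illustrative names, a flat map of path to size stands in for the namespace, and no actual HDFS configuration key is implied.

```java
import java.util.Map;

/** Hypothetical illustration of quota accounting that can ignore .Trash. */
final class TrashAwareQuota {
  private TrashAwareQuota() {}

  /**
   * Sums file sizes under userDir. When skipTrash is true, files under
   * userDir/.Trash are not counted against the quota.
   */
  static long spaceConsumed(Map<String, Long> fileSizes, String userDir,
                            boolean skipTrash) {
    String userPrefix = userDir + "/";
    String trashPrefix = userDir + "/.Trash/";
    long total = 0L;
    for (Map.Entry<String, Long> entry : fileSizes.entrySet()) {
      String path = entry.getKey();
      if (!path.startsWith(userPrefix)) {
        continue;  // belongs to another directory
      }
      if (skipTrash && path.startsWith(trashPrefix)) {
        continue;  // deleted file still sitting in Trash
      }
      total += entry.getValue();
    }
    return total;
  }
}
```

With skipTrash enabled, deleting a file would free quota immediately in this model, even though the bytes remain on disk until Trash is emptied.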






[jira] [Updated] (HDFS-17336) Provide an option to enable/disable considering space used by .Trash folder for user quota computation

2024-01-10 Thread Srinivasu Majeti (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Srinivasu Majeti updated HDFS-17336:

Description: We have a use case for a large account where /user/user1 has a 
space quota configured. By default, Trash goes into /user/user1/.Trash. As 
long as deleted files remain in Trash, the user cannot reclaim the 
space quota. The customer is looking for a feature that skips computing 
space quota for the files in the Trash folder. The proposal is to introduce a new 
configuration parameter to skip computing quota for Trash files.  (was: We have 
a use case for a large account where /user/user1 has a space quota 
configured. By default, Trash goes into /user/user1/.Trash. As long as deleted 
files remain in Trash, the user cannot reclaim the space quota. 
The customer is looking for a feature that skips computing space quota for 
the files in the Trash folder.)







[jira] [Updated] (HDFS-17336) Provide an option to enable/disable considering space used by .Trash folder for user quota computation

2024-01-10 Thread Srinivasu Majeti (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Srinivasu Majeti updated HDFS-17336:

Description: We have a use case for a large account where /user/user1 has a 
space quota configured. By default, Trash goes into /user/user1/.Trash. As 
long as deleted files remain in Trash, the user cannot reclaim the 
space quota. The customer is looking for a feature that skips computing 
space quota for the files in the Trash folder.







[jira] [Commented] (HDFS-17335) Add metrics for syncWaitQ in FSEditLogAsync

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805390#comment-17805390
 ] 

ASF GitHub Bot commented on HDFS-17335:
---

hfutatzhanghb commented on PR #6431:
URL: https://github.com/apache/hadoop/pull/6431#issuecomment-1886385171

   @Hexiaoqiao @tomscut Sir, could you please help review this simple 
modification when you are free? Thanks a lot.




> Add metrics for syncWaitQ in FSEditLogAsync
> ---
>
> Key: HDFS-17335
> URL: https://issues.apache.org/jira/browse/HDFS-17335
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.6
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>  Labels: pull-request-available
>
> To monitor syncWaitQ in FSEditLogAsync, we add a metric, syncPendingCount.
> The reason for adding this metric is that when dequeueEdit() returns null, the 
> boolean variable doSync is set to !syncWaitQ.isEmpty().
> With this metric we can better monitor sync performance and code.






[jira] [Created] (HDFS-17336) Provide an option to enable/disable considering space used by .Trash folder for user quota computation

2024-01-10 Thread Srinivasu Majeti (Jira)
Srinivasu Majeti created HDFS-17336:
---

 Summary: Provide an option to enable/disable considering space 
used by .Trash folder for user quota computation
 Key: HDFS-17336
 URL: https://issues.apache.org/jira/browse/HDFS-17336
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.1.4
Reporter: Srinivasu Majeti









[jira] [Commented] (HDFS-17311) RBF: ConnectionManager creatorQueue should offer a pool that is not already in creatorQueue.

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805358#comment-17805358
 ] 

ASF GitHub Bot commented on HDFS-17311:
---

hadoop-yetus commented on PR #6392:
URL: https://github.com/apache/hadoop/pull/6392#issuecomment-1886164841

   :confetti_ball: **+1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 21s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  31m 50s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 22s |  |  trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   0m 20s |  |  trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   0m 18s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 29s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   0m 18s |  |  trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   0m 50s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  19m 38s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 17s |  |  the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   0m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 17s |  |  the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   0m 17s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 11s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 21s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 17s |  |  the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   0m 15s |  |  the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   0m 49s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  19m 27s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  19m  4s |  |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 20s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  99m  0s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6392/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6392 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux d4051e4357b0 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 2dd5f2f01fcf172d705982c15c165af75135ad5b |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6392/3/testReport/ |
   | Max. process+thread count | 2614 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6392/3/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> RBF: ConnectionManager creatorQueue should offer a pool that is not already 
> in 

[jira] [Commented] (HDFS-17334) FSEditLogAsync#enqueueEdit does not synchronized this before invoke wait method

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805340#comment-17805340
 ] 

ASF GitHub Bot commented on HDFS-17334:
---

hfutatzhanghb commented on PR #6434:
URL: https://github.com/apache/hadoop/pull/6434#issuecomment-1886067772

   @Hexiaoqiao @zhangshuyan0 @tomscut Sir, could you please take a look at this 
PR when you have free time? Thanks.
   




> FSEditLogAsync#enqueueEdit does not synchronized this before invoke wait 
> method
> ---
>
> Key: HDFS-17334
> URL: https://issues.apache.org/jira/browse/HDFS-17334
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.6
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> In the method FSEditLogAsync#enqueueEdit, there is the following code:
> {code:java}
> if (Thread.holdsLock(this)) {
>   // if queue is full, synchronized caller must immediately relinquish
>   // the monitor before re-offering to avoid deadlock with sync thread
>   // which needs the monitor to write transactions.
>   int permits = overflowMutex.drainPermits();
>   try {
>     do {
>       this.wait(1000); // will be notified by next logSync.
>     } while (!editPendingQ.offer(edit));
>   } finally {
>     overflowMutex.release(permits);
>   }
> }
> {code}
> It may invoke this.wait(1000) without holding this object's monitor.
>  
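
For background, the Java rule at stake is that Object.wait() may only be called by a thread that owns the object's monitor; otherwise it throws IllegalMonitorStateException. The sketch below only demonstrates that rule. MonitorWaitDemo is a hypothetical class, and it makes no claim about whether FSEditLogAsync actually reaches wait() without the monitor.

```java
/** Hypothetical demo of the monitor-ownership rule for Object.wait(). */
final class MonitorWaitDemo {
  private final Object lock = new Object();

  /**
   * Returns true if wait() was permitted, false if it was rejected with
   * IllegalMonitorStateException (or the thread was interrupted).
   */
  boolean waitBriefly(boolean holdMonitor) {
    try {
      if (holdMonitor) {
        synchronized (lock) {
          lock.wait(1);  // legal: this thread owns the monitor
        }
      } else {
        lock.wait(1);    // illegal: monitor not held by this thread
      }
      return true;
    } catch (IllegalMonitorStateException e) {
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

Thread.holdsLock(obj), as used in the quoted snippet, reports whether the current thread owns obj's monitor, so a wait() guarded by that check runs only when the monitor is held.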






[jira] [Commented] (HDFS-17311) RBF: ConnectionManager creatorQueue should offer a pool that is not already in creatorQueue.

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805328#comment-17805328
 ] 

ASF GitHub Bot commented on HDFS-17311:
---

goiri commented on PR #6392:
URL: https://github.com/apache/hadoop/pull/6392#issuecomment-1885974080

   There is still a check style issue though.










[jira] [Commented] (HDFS-16064) Determine when to invalidate corrupt replicas based on number of usable replicas

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805324#comment-17805324
 ] 

ASF GitHub Bot commented on HDFS-16064:
---

zz12341 commented on code in PR #6437:
URL: https://github.com/apache/hadoop/pull/6437#discussion_r1448119281


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:
##
@@ -791,15 +791,33 @@ public short getMinReplication() {
 return minReplication;
   }
 
+  public short getMinStorageNum(BlockInfo block) {

Review Comment:
   I was originally trying to keep it in sync with what the trunk branch is doing: 
https://github.com/apache/hadoop/pull/4410/files#diff-305ecf45a0f0708849b5e3c0d21a56c681db3a1497e52a19ef24939278dc99feL1922-R1926
   
   Let me revert this change.






[jira] [Commented] (HDFS-16064) Determine when to invalidate corrupt replicas based on number of usable replicas

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805317#comment-17805317
 ] 

ASF GitHub Bot commented on HDFS-16064:
---

shahrs87 commented on code in PR #6437:
URL: https://github.com/apache/hadoop/pull/6437#discussion_r1448086058


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:
##
@@ -791,15 +791,33 @@ public short getMinReplication() {
 return minReplication;
   }
 
+  public short getMinStorageNum(BlockInfo block) {

Review Comment:
   @zz12341  Why do we want these changes? I don't see them in the original patch 
[here](https://github.com/apache/hadoop/pull/4410/files). 





> Determine when to invalidate corrupt replicas based on number of usable 
> replicas
> 
>
> Key: HDFS-16064
> URL: https://issues.apache.org/jira/browse/HDFS-16064
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 3.2.1
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.5
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Seems that https://issues.apache.org/jira/browse/HDFS-721 was resolved as a 
> non-issue under the assumption that if the namenode & a datanode get into an 
> inconsistent state for a given block pipeline, there should be another 
> datanode available to replicate the block to
> While testing datanode decommissioning using "dfs.exclude.hosts", I have 
> encountered a scenario where the decommissioning gets stuck indefinitely
> Below is the progression of events:
>  * there are initially 4 datanodes DN1, DN2, DN3, DN4
>  * scale-down is started by adding DN1 & DN2 to "dfs.exclude.hosts"
>  * HDFS block pipelines on DN1 & DN2 must now be replicated to DN3 & DN4 in 
> order to satisfy their minimum replication factor of 2
>  * during this replication process 
> https://issues.apache.org/jira/browse/HDFS-721 is encountered which causes 
> the following inconsistent state:
>  ** DN3 thinks it has the block pipeline in FINALIZED state
>  ** the namenode does not think DN3 has the block pipeline
> {code:java}
> 2021-06-06 10:38:23,604 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (DataXceiver for client  at /DN2:45654 [Receiving block BP-YYY:blk_XXX]): 
> DN3:9866:DataXceiver error processing WRITE_BLOCK operation  src: /DN2:45654 
> dst: /DN3:9866; 
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
> BP-YYY:blk_XXX already exists in state FINALIZED and thus cannot be created.
> {code}
>  * the replication is attempted again, but:
>  ** DN4 has the block
>  ** DN1 and/or DN2 have the block, but don't count towards the minimum 
> replication factor because they are being decommissioned
>  ** DN3 does not have the block & cannot have the block replicated to it 
> because of HDFS-721
>  * the namenode repeatedly tries to replicate the block to DN3 & repeatedly 
> fails, this continues indefinitely
>  * therefore DN4 is the only live datanode with the block & the minimum 
> replication factor of 2 cannot be satisfied
>  * because the minimum replication factor cannot be satisfied for the 
> block(s) being moved off DN1 & DN2, the datanode decommissioning can never be 
> completed 
> {code:java}
> 2021-06-06 10:39:10,106 INFO BlockStateChange (DatanodeAdminMonitor-0): 
> Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
> decommissioned replicas: 0, decommissioning replicas: 2, maintenance 
> replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is 
> Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , 
> Current Datanode: DN1:9866, Is current datanode decommissioning: true, Is 
> current datanode entering maintenance: false
> ...
> 2021-06-06 10:57:10,105 INFO BlockStateChange (DatanodeAdminMonitor-0): 
> Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
> decommissioned replicas: 0, decommissioning replicas: 2, maintenance 
> replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is 
> Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , 
> Current Datanode: DN2:9866, Is current datanode decommissioning: true, Is 
> current datanode entering maintenance: false
> {code}
> Being stuck in decommissioning state forever is not an intended behavior of 
> DataNode decommissioning
> A few potential solutions:
>  * Address the root cause of the problem which is an inconsistent state 
> between namenode & datanode: https://issues.apache.org/jira/browse/HDFS-721
>  * Detect when datanode decommissioning is stuck due to lack of 

[jira] [Commented] (HDFS-16064) Determine when to invalidate corrupt replicas based on number of usable replicas

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805315#comment-17805315
 ] 

ASF GitHub Bot commented on HDFS-16064:
---

zz12341 opened a new pull request, #6437:
URL: https://github.com/apache/hadoop/pull/6437

   …
   
   
   
   ### Description of PR
   
   [HDFS-16064](https://github.com/apache/hadoop/pull/4410) fixed an issue 
where decommissioning replicas were not counted as usable replicas, which 
caused decommissioning to get stuck forever on small clusters. We are seeing the 
same issue on 2.10, and are therefore backporting the changes. 
   
   ### How was this patch tested?
   
   
   ### For code changes:
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
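
The idea of counting decommissioning replicas as usable can be illustrated with the sketch below. It rests on an assumption not confirmed in this message, namely that "usable" means live plus decommissioning plus live-entering-maintenance replicas. ReplicaCounts and its members are hypothetical names, not BlockManager's API.

```java
/** Hypothetical illustration; names are not Hadoop's API. */
final class ReplicaCounts {
  final int live;
  final int decommissioning;
  final int liveEnteringMaintenance;

  ReplicaCounts(int live, int decommissioning, int liveEnteringMaintenance) {
    this.live = live;
    this.decommissioning = decommissioning;
    this.liveEnteringMaintenance = liveEnteringMaintenance;
  }

  /** Assumed definition of "usable": replicas that can still serve the block. */
  int usable() {
    return live + decommissioning + liveEnteringMaintenance;
  }

  /** True if enough usable replicas exist to meet the minimum replication. */
  boolean hasMinUsable(int minReplication) {
    return usable() >= minReplication;
  }
}
```

With the counts reported in the Jira description's log (1 live replica, 2 decommissioning, expected replicas 2), counting only live replicas falls short of the minimum, while the usable count of 3 satisfies it.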
   
   





[jira] [Commented] (HDFS-17333) DFSClient support lazy resolve host->ip.

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805308#comment-17805308
 ] 

ASF GitHub Bot commented on HDFS-17333:
---

hadoop-yetus commented on PR #6430:
URL: https://github.com/apache/hadoop/pull/6430#issuecomment-1885834973

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 21s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 23s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  21m 18s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   9m 16s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   8m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   2m 15s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 38s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m  1s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   2m 22s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m 43s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 45s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 21s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m  1s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   8m  1s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   7m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   7m 29s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 57s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 17s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 46s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   2m 14s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m 56s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 47s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  16m 34s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m  0s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | +1 :green_heart: |  unit  | 184m 36s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 345m 43s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6430/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6430 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux d05670ad082e 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 0bd0e84010f6489ccc17351d793014ce0b444c1b |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6430/3/testReport/ |
   | Max. process+thread count | 4653 (vs. ulimit of 5500) |
   | modules | C: 

[jira] [Commented] (HDFS-16064) Determine when to invalidate corrupt replicas based on number of usable replicas

2024-01-10 Thread Rushabh Shah (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805302#comment-17805302
 ] 

Rushabh Shah commented on HDFS-16064:
-

[~KevinWikant]  [~aajisaka]  Any reason why we haven't backported this fix to 
branch-2.10? 

> Determine when to invalidate corrupt replicas based on number of usable 
> replicas
> 
>
> Key: HDFS-16064
> URL: https://issues.apache.org/jira/browse/HDFS-16064
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 3.2.1
>Reporter: Kevin Wikant
>Assignee: Kevin Wikant
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.5
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Seems that https://issues.apache.org/jira/browse/HDFS-721 was resolved as a 
> non-issue under the assumption that if the namenode & a datanode get into an 
> inconsistent state for a given block pipeline, there should be another 
> datanode available to replicate the block to
> While testing datanode decommissioning using "dfs.exclude.hosts", I have 
> encountered a scenario where the decommissioning gets stuck indefinitely
> Below is the progression of events:
>  * there are initially 4 datanodes DN1, DN2, DN3, DN4
>  * scale-down is started by adding DN1 & DN2 to "dfs.exclude.hosts"
>  * HDFS block pipelines on DN1 & DN2 must now be replicated to DN3 & DN4 in 
> order to satisfy their minimum replication factor of 2
>  * during this replication process 
> https://issues.apache.org/jira/browse/HDFS-721 is encountered which causes 
> the following inconsistent state:
>  ** DN3 thinks it has the block pipeline in FINALIZED state
>  ** the namenode does not think DN3 has the block pipeline
> {code:java}
> 2021-06-06 10:38:23,604 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (DataXceiver for client  at /DN2:45654 [Receiving block BP-YYY:blk_XXX]): 
> DN3:9866:DataXceiver error processing WRITE_BLOCK operation  src: /DN2:45654 
> dst: /DN3:9866; 
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
> BP-YYY:blk_XXX already exists in state FINALIZED and thus cannot be created.
> {code}
>  * the replication is attempted again, but:
>  ** DN4 has the block
>  ** DN1 and/or DN2 have the block, but don't count towards the minimum 
> replication factor because they are being decommissioned
>  ** DN3 does not have the block & cannot have the block replicated to it 
> because of HDFS-721
>  * the namenode repeatedly tries to replicate the block to DN3 & repeatedly 
> fails, this continues indefinitely
>  * therefore DN4 is the only live datanode with the block & the minimum 
> replication factor of 2 cannot be satisfied
>  * because the minimum replication factor cannot be satisfied for the 
> block(s) being moved off DN1 & DN2, the datanode decommissioning can never be 
> completed 
> {code:java}
> 2021-06-06 10:39:10,106 INFO BlockStateChange (DatanodeAdminMonitor-0): 
> Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
> decommissioned replicas: 0, decommissioning replicas: 2, maintenance 
> replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is 
> Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , 
> Current Datanode: DN1:9866, Is current datanode decommissioning: true, Is 
> current datanode entering maintenance: false
> ...
> 2021-06-06 10:57:10,105 INFO BlockStateChange (DatanodeAdminMonitor-0): 
> Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
> decommissioned replicas: 0, decommissioning replicas: 2, maintenance 
> replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is 
> Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , 
> Current Datanode: DN2:9866, Is current datanode decommissioning: true, Is 
> current datanode entering maintenance: false
> {code}
> Being stuck in decommissioning state forever is not an intended behavior of 
> DataNode decommissioning
> A few potential solutions:
>  * Address the root cause of the problem which is an inconsistent state 
> between namenode & datanode: https://issues.apache.org/jira/browse/HDFS-721
>  * Detect when datanode decommissioning is stuck due to lack of available 
> datanodes for satisfying the minimum replication factor, then recover by 
> re-enabling the datanodes being decommissioned
>  
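
As a rough illustration of the counting problem described above (not the actual BlockManager code), the sketch below plugs in the numbers from the BlockStateChange log: 2 expected replicas, 1 live, 2 decommissioning. The names `ReplicaCounts`, `usable` and `hasEnoughReplicas` are hypothetical, and treating usable replicas as live plus decommissioning is an assumption based on the PR description for the backport.

```java
/** Hypothetical holder for per-block replica counts; not a Hadoop class. */
final class ReplicaCounts {
    final int live;
    final int decommissioning;

    ReplicaCounts(int live, int decommissioning) {
        this.live = live;
        this.decommissioning = decommissioning;
    }

    /** Assumption: a usable replica sits on a live or a decommissioning datanode. */
    int usable() {
        return live + decommissioning;
    }

    /** Compares either the live-only or the usable count with the expected replication. */
    boolean hasEnoughReplicas(int expected, boolean countDecommissioning) {
        int counted = countDecommissioning ? usable() : live;
        return counted >= expected;
    }
}

final class ReplicaCountDemo {
    public static void main(String[] args) {
        // Values taken from the log above: expected 2, live 1, decommissioning 2.
        ReplicaCounts block = new ReplicaCounts(1, 2);
        System.out.println("live only: " + block.hasEnoughReplicas(2, false));
        System.out.println("usable:    " + block.hasEnoughReplicas(2, true));
    }
}
```

With live replicas alone the check fails (1 < 2), which matches the stuck state in the log. Counting the two decommissioning replicas as usable makes it pass (3 >= 2).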






[jira] [Commented] (HDFS-17333) DFSClient support lazy resolve host->ip.

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805299#comment-17805299
 ] 

ASF GitHub Bot commented on HDFS-17333:
---

hadoop-yetus commented on PR #6430:
URL: https://github.com/apache/hadoop/pull/6430#issuecomment-1885771188

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 33s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m  6s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  34m 23s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  17m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  16m 29s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   4m 32s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   4m 24s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m 18s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   3m 31s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   8m 53s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  38m  1s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 58s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  16m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  16m 48s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  15m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  15m 27s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 28s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   4m 24s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 12s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   3m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |  10m  9s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  39m 22s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  19m 25s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m 44s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | +1 :green_heart: |  unit  | 215m  4s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m  7s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 486m 33s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6430/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6430 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 6f24688a3742 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 95d9fbe4a7f83a66e29507536c1b2befd8c8fe46 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6430/2/testReport/ |
   | Max. process+thread count | 3940 (vs. ulimit of 5500) |
   | modules | C: 

[jira] [Commented] (HDFS-17334) FSEditLogAsync#enqueueEdit does not synchronized this before invoke wait method

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805243#comment-17805243
 ] 

ASF GitHub Bot commented on HDFS-17334:
---

hadoop-yetus commented on PR #6434:
URL: https://github.com/apache/hadoop/pull/6434#issuecomment-1885348791

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  11m 25s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  41m 18s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  9s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 26s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  6s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 20s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  38m 32s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m  0s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 40s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m  7s |  |  the patch passed  |
   | -1 :x: |  shadedclient  |  44m 54s |  |  patch has errors when building 
and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  |   0m 26s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6434/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch failed.  |
   | +0 :ok: |  asflicense  |   0m 26s |  |  ASF License check generated no 
output?  |
   |  |   | 160m  5s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6434/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6434 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 71e56375345d 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 33a14bba95491c9282718b5053ff75b63140b472 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6434/1/testReport/ |
   | Max. process+thread count | 566 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6434/1/console |
   | versions | 

[jira] [Commented] (HDFS-17335) Add metrics for syncWaitQ in FSEditLogAsync

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805228#comment-17805228
 ] 

ASF GitHub Bot commented on HDFS-17335:
---

hadoop-yetus commented on PR #6431:
URL: https://github.com/apache/hadoop/pull/6431#issuecomment-1885260781

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  16m 48s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  41m 29s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  8s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  4s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 13s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  34m 24s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  4s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m  4s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 56s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6431/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 55 unchanged - 
0 fixed = 56 total (was 55)  |
   | +1 :green_heart: |  mvnsite  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 12s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  34m 14s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 261m 48s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6431/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 42s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 411m 40s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6431/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6431 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux ee54044c9ff2 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 914ce1689643cc77286761f8670d155762acd7f2 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 

[jira] [Updated] (HDFS-17333) DFSClient support lazy resolve host->ip.

2024-01-10 Thread Jian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Zhang updated HDFS-17333:
--
Summary: DFSClient support lazy resolve host->ip.  (was: Support lazy 
resolve host->ip.)

> DFSClient support lazy resolve host->ip.
> 
>
> Key: HDFS-17333
> URL: https://issues.apache.org/jira/browse/HDFS-17333
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-17333.001.patch
>
>
> Currently, when a DFSClient is started, it resolves all hosts of all 
> nameservices: 
>   at DFSUtilClient#getAddresses(conf, null, addressKey)
>   at AbstractNNFailoverProxyProvider#getProxyAddresses(URI uri, 
> String addressKey)
> If host->ip resolution is very slow in the environment where the DFSClient 
> runs, the existing logic takes a long time when many nameservices are 
> configured.
> Each DFSClient needs the IPs of, at most, all namenodes of a single 
> nameservice. In the best case, the first namenode the DFSClient selects can 
> serve its requests, and the client only needs to know the IP of that 
> namenode. Therefore, it is not necessary to resolve all namenodes of all 
> nameservices in the configuration file when the DFSClient is started.
> This patch supports lazy resolution of host->ip: a host is only resolved 
> when it needs to be accessed.
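
A minimal sketch of the lazy-resolution idea described above, assuming nothing about how the actual patch is implemented. `LazyAddress` is a hypothetical class; only `InetSocketAddress.createUnresolved`, the resolving `InetSocketAddress(String, int)` constructor and `isUnresolved` are real JDK APIs. The demo uses an IP literal so it does not depend on DNS.

```java
import java.net.InetSocketAddress;

/** Hypothetical wrapper that defers host->ip resolution until first use. */
final class LazyAddress {
    private final String host;
    private final int port;
    private InetSocketAddress resolved; // null until get() is first called
    private int resolveCalls;           // counts real resolution attempts

    LazyAddress(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Cheap placeholder: createUnresolved performs no name lookup. */
    InetSocketAddress unresolved() {
        return InetSocketAddress.createUnresolved(host, port);
    }

    /** Resolves on first access only; later calls reuse the cached address. */
    synchronized InetSocketAddress get() {
        if (resolved == null) {
            resolveCalls++;
            // This constructor attempts to resolve the hostname.
            resolved = new InetSocketAddress(host, port);
        }
        return resolved;
    }

    synchronized int resolveCalls() {
        return resolveCalls;
    }
}

final class LazyAddressDemo {
    public static void main(String[] args) {
        LazyAddress nn = new LazyAddress("127.0.0.1", 8020);
        System.out.println("lookups before access: " + nn.resolveCalls());
        InetSocketAddress addr = nn.get();
        nn.get(); // second access reuses the cached result
        System.out.println("unresolved after access: " + addr.isUnresolved());
        System.out.println("lookups after two accesses: " + nn.resolveCalls());
    }
}
```

With one such wrapper per configured namenode, a client that only ever talks to the first namenode of one nameservice would pay for a single lookup instead of one per host in the configuration.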






[jira] [Updated] (HDFS-17333) Support lazy resolve host->ip.

2024-01-10 Thread Jian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Zhang updated HDFS-17333:
--
Description: 
Currently, when a DFSClient is started, it resolves all hosts of all 
nameservices: 
  at DFSUtilClient#getAddresses(conf, null, addressKey)
  at AbstractNNFailoverProxyProvider#getProxyAddresses(URI uri, String 
addressKey)
If host->ip resolution is very slow in the environment where the DFSClient 
runs, the existing logic takes a long time when many nameservices are 
configured.

Each DFSClient needs the IPs of, at most, all namenodes of a single 
nameservice. In the best case, the first namenode the DFSClient selects can 
serve its requests, and the client only needs to know the IP of that 
namenode. Therefore, it is not necessary to resolve all namenodes of all 
nameservices in the configuration file when the DFSClient is started.

This patch supports lazy resolution of host->ip: a host is only resolved 
when it needs to be accessed.

  was:
Currently, when dfsclient is started, it will parse all hosts of all ns: 
  at DFSUtilClient#getAddresses(conf, null, addressKey)
  at AbstractNNFailoverProxyProvider#getProxyAddresses(URI uri, String 
addressKey)
If the current environment where the client is located causes resolution of 
host->ip to be very slow, the existing logic will undoubtedly take a long time 
when there are too many NSs.

This patch supports lazy resolution of host->ip, which will only be resolved 
when the host needs to be accessed.


> Support lazy resolve host->ip.
> --
>
> Key: HDFS-17333
> URL: https://issues.apache.org/jira/browse/HDFS-17333
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-17333.001.patch
>
>
> Currently, when a DFSClient is started, it resolves all hosts of all 
> nameservices: 
>   at DFSUtilClient#getAddresses(conf, null, addressKey)
>   at AbstractNNFailoverProxyProvider#getProxyAddresses(URI uri, 
> String addressKey)
> If host->ip resolution is very slow in the environment where the DFSClient 
> runs, the existing logic takes a long time when many nameservices are 
> configured.
> Each DFSClient needs the IPs of, at most, all namenodes of a single 
> nameservice. In the best case, the first namenode the DFSClient selects can 
> serve its requests, and the client only needs to know the IP of that 
> namenode. Therefore, it is not necessary to resolve all namenodes of all 
> nameservices in the configuration file when the DFSClient is started.
> This patch supports lazy resolution of host->ip: a host is only resolved 
> when it needs to be accessed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17335) Add metrics for syncWaitQ in FSEditLogAsync

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805204#comment-17805204
 ] 

ASF GitHub Bot commented on HDFS-17335:
---

hadoop-yetus commented on PR #6431:
URL: https://github.com/apache/hadoop/pull/6431#issuecomment-1885136023

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 33s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  42m 36s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  5s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 11s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  34m 41s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 56s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6431/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 56 unchanged - 
0 fixed = 57 total (was 56)  |
   | +1 :green_heart: |  mvnsite  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 14s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  35m  1s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 213m 37s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6431/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 40s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 349m 39s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6431/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6431 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux ea69413d7d5f 5.15.0-86-generic #96-Ubuntu SMP Wed Sep 20 
08:23:49 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 1575544ab17b7bfab4d7da12ed6b1967ac71967e |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 

[jira] [Commented] (HDFS-17311) RBF: ConnectionManager creatorQueue should offer a pool that is not already in creatorQueue.

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805197#comment-17805197
 ] 

ASF GitHub Bot commented on HDFS-17311:
---

hadoop-yetus commented on PR #6392:
URL: https://github.com/apache/hadoop/pull/6392#issuecomment-1885098919

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  17m 33s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  47m 46s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   0m 30s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 42s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  38m  5s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 34s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 18s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6392/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0)  |
   | +1 :green_heart: |  mvnsite  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 31s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  38m  5s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  24m 23s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 37s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 180m 32s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6392/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6392 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 1611a8d874ac 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / e6a3bf8bd55d14aa6a33f7ae89a6418d2728e9d3 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6392/2/testReport/ |
   | Max. process+thread count | 2343 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 

[jira] [Commented] (HDFS-17334) FSEditLogAsync#enqueueEdit does not synchronized this before invoke wait method

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805187#comment-17805187
 ] 

ASF GitHub Bot commented on HDFS-17334:
---

hfutatzhanghb opened a new pull request, #6434:
URL: https://github.com/apache/hadoop/pull/6434

   ### Description of PR
   
   The method FSEditLogAsync#enqueueEdit contains the following code:
   ```java
   if (Thread.holdsLock(this)) {
     // if queue is full, synchronized caller must immediately relinquish
     // the monitor before re-offering to avoid deadlock with sync thread
     // which needs the monitor to write transactions.
     int permits = overflowMutex.drainPermits();
     try {
       do {
         this.wait(1000); // will be notified by next logSync.
       } while (!editPendingQ.offer(edit));
     } finally {
       overflowMutex.release(permits);
     }
   }
   ```
   It may invoke this.wait(1000) without holding the monitor of this.
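
As background for the report above, here is a minimal sketch of the Java rule involved: `Object#wait` throws `IllegalMonitorStateException` when the calling thread does not own the object's monitor, and `Thread.holdsLock` reports whether it does. The class `MonitorDemo` and its method names are hypothetical and are not taken from FSEditLogAsync.

```java
/** Hypothetical demo; only Object#wait and Thread.holdsLock are real JDK APIs. */
final class MonitorDemo {

  /** Returns true if wait() was rejected because the caller does not own the monitor. */
  static boolean waitWithoutMonitorFails(Object lock) {
    try {
      lock.wait(1); // no synchronized block around this call
      return false;
    } catch (IllegalMonitorStateException e) {
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  /** Returns true if wait() ran normally while the caller owned the monitor. */
  static boolean waitWithMonitorSucceeds(Object lock) {
    synchronized (lock) {
      if (!Thread.holdsLock(lock)) {
        return false;
      }
      try {
        lock.wait(1); // releases the monitor, then re-acquires it after the timeout
        return true;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }
}
```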
   





> FSEditLogAsync#enqueueEdit does not synchronized this before invoke wait 
> method
> ---
>
> Key: HDFS-17334
> URL: https://issues.apache.org/jira/browse/HDFS-17334
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.6
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
> Fix For: 3.5.0
>
>
> The method FSEditLogAsync#enqueueEdit contains the following code:
> {code:java}
> if (Thread.holdsLock(this)) {
>           // if queue is full, synchronized caller must immediately relinquish
>           // the monitor before re-offering to avoid deadlock with sync thread
>           // which needs the monitor to write transactions.
>           int permits = overflowMutex.drainPermits();
>           try {
>             do {
>               this.wait(1000); // will be notified by next logSync.
>             } while (!editPendingQ.offer(edit));
>           } finally {
>             overflowMutex.release(permits);
>           }
>         }  {code}
> It may invoke this.wait(1000) without holding the monitor of this.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17334) FSEditLogAsync#enqueueEdit does not synchronized this before invoke wait method

2024-01-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17334:
--
Labels: pull-request-available  (was: )

> FSEditLogAsync#enqueueEdit does not synchronized this before invoke wait 
> method
> ---
>
> Key: HDFS-17334
> URL: https://issues.apache.org/jira/browse/HDFS-17334
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.6
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> The method FSEditLogAsync#enqueueEdit contains the following code:
> {code:java}
> if (Thread.holdsLock(this)) {
>           // if queue is full, synchronized caller must immediately relinquish
>           // the monitor before re-offering to avoid deadlock with sync thread
>           // which needs the monitor to write transactions.
>           int permits = overflowMutex.drainPermits();
>           try {
>             do {
>               this.wait(1000); // will be notified by next logSync.
>             } while (!editPendingQ.offer(edit));
>           } finally {
>             overflowMutex.release(permits);
>           }
>         }  {code}
> It may invoke this.wait(1000) without holding the monitor of this.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17300) [SBN READ] A rpc call in Observer should throw ObserverRetryOnActiveException if its stateid is always lower than client stateid for a configured time.

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805186#comment-17805186
 ] 

ASF GitHub Bot commented on HDFS-17300:
---

hadoop-yetus commented on PR #6414:
URL: https://github.com/apache/hadoop/pull/6414#issuecomment-1885028838

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 21s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 59s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  19m 48s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   8m 32s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   7m 55s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   2m  5s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 40s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 20s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 12s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 37s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 21s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   7m 58s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   7m 58s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   7m 39s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   7m 39s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 57s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 41s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 16s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 37s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  15m 45s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  | 187m 18s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 44s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 335m 12s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6414/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6414 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 944f1f3dd85f 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 3253b3ba230da6b7d9ffadefcd83dfc3506275b6 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6414/3/testReport/ |
   | Max. process+thread count | 3836 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common 
hadoop-hdfs-project/hadoop-hdfs U: . |
   | Console output | 

[jira] [Updated] (HDFS-17311) RBF: ConnectionManager creatorQueue should offer a pool that is not already in creatorQueue.

2024-01-10 Thread liuguanghua (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuguanghua updated HDFS-17311:
---
Description: 
In the Router, the following log was found:

 
2023-12-29 15:18:54,799 ERROR 
org.apache.hadoop.hdfs.server.federation.router.ConnectionManager: Cannot add 
more than 2048 connections at the same time

 
The log indicates that ConnectionManager.creatorQueue is full at a certain 
point. But my cluster does not have enough users to reach 2048 pairs of 
.

This may be due to the following reasons:
 # ConnectionManager.creatorQueue is a queue to which a ConnectionPool is 
offered when it does not have enough ConnectionContexts.
 # The ConnectionCreator thread consumes from creatorQueue and creates more 
ConnectionContexts for a ConnectionPool.
 # Clients may concurrently invoke ConnectionManager.getConnection() for the 
same user, which may add the same ConnectionPool to 
ConnectionManager.creatorQueue many times.
 # When creatorQueue is full, a new ConnectionPool cannot be added and this 
error is logged. As a result, a genuinely new ConnectionPool may be unable to 
produce more ConnectionContexts for a new user.

So this PR tries to ensure that creatorQueue never holds the same 
ConnectionPool more than once.
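
The "offer only if not already queued" check can be sketched like this. It is a simplified stand-in, not the Router code: `DedupOffer` and `offerIfAbsent` are hypothetical names, and the queue holds strings instead of ConnectionPool objects. Note that `contains` followed by `offer` is not atomic, so under concurrency a duplicate can still slip in occasionally; the check reduces duplicates rather than strictly forbidding them.

```java
import java.util.concurrent.BlockingQueue;

/** Hypothetical helper for illustration; not the ConnectionManager implementation. */
final class DedupOffer {

  /**
   * Returns true when the item is in the queue afterwards, either because it
   * was already there or because it was added now. Returns false only when
   * the item is absent and the queue is full, which is the case that would
   * be logged as an error.
   */
  static <T> boolean offerIfAbsent(BlockingQueue<T> queue, T item) {
    return queue.contains(item) || queue.offer(item);
  }
}
```

With a queue of capacity 2, offering the same pool repeatedly occupies one slot, so a second, different pool still fits.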

  was:
2023-12-29 15:18:54,799 ERROR 
org.apache.hadoop.hdfs.server.federation.router.ConnectionManager: Cannot add 
more than 2048 connections at the same time

In my environment, ConnectionManager creatorQueue is full, but the cluster does 
not have enough users to reach 2048 pairs of  in the router.

Under high concurrency, creatorQueue adds the same pool more than once.

 


> RBF: ConnectionManager creatorQueue should offer a pool that is not already 
> in creatorQueue.
> 
>
> Key: HDFS-17311
> URL: https://issues.apache.org/jira/browse/HDFS-17311
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: liuguanghua
>Assignee: liuguanghua
>Priority: Major
>  Labels: pull-request-available
>
> In the Router, the following log was found:
>  
> 2023-12-29 15:18:54,799 ERROR 
> org.apache.hadoop.hdfs.server.federation.router.ConnectionManager: Cannot add 
> more than 2048 connections at the same time
>  
> The log indicates that ConnectionManager.creatorQueue is full at a certain 
> point. But my cluster does not have enough users to reach 2048 pairs of 
> .
> This may be due to the following reasons:
>  # ConnectionManager.creatorQueue is a queue to which a ConnectionPool is 
> offered when it does not have enough ConnectionContexts.
>  # The ConnectionCreator thread consumes from creatorQueue and creates more 
> ConnectionContexts for a ConnectionPool.
>  # Clients may concurrently invoke ConnectionManager.getConnection() for the 
> same user, which may add the same ConnectionPool to 
> ConnectionManager.creatorQueue many times.
>  # When creatorQueue is full, a new ConnectionPool cannot be added and this 
> error is logged. As a result, a genuinely new ConnectionPool may be unable 
> to produce more ConnectionContexts for a new user.
> So this PR tries to ensure that creatorQueue never holds the same 
> ConnectionPool more than once.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17311) RBF: ConnectionManager creatorQueue should offer a pool that is not already in creatorQueue.

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805128#comment-17805128
 ] 

ASF GitHub Bot commented on HDFS-17311:
---

LiuGuH commented on PR #6392:
URL: https://github.com/apache/hadoop/pull/6392#issuecomment-1884817179

   @goiri  Do you have time to review this? Thanks.




> RBF: ConnectionManager creatorQueue should offer a pool that is not already 
> in creatorQueue.
> 
>
> Key: HDFS-17311
> URL: https://issues.apache.org/jira/browse/HDFS-17311
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: liuguanghua
>Priority: Major
>  Labels: pull-request-available
>
> 2023-12-29 15:18:54,799 ERROR 
> org.apache.hadoop.hdfs.server.federation.router.ConnectionManager: Cannot add 
> more than 2048 connections at the same time
> In my environment, ConnectionManager creatorQueue is full, but the cluster 
> does not have enough users to reach 2048 pairs of  in the router.
> Under high concurrency, creatorQueue adds the same pool more than once.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17311) RBF: ConnectionManager creatorQueue should offer a pool that is not already in creatorQueue.

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805125#comment-17805125
 ] 

ASF GitHub Bot commented on HDFS-17311:
---

LiuGuH commented on code in PR #6392:
URL: https://github.com/apache/hadoop/pull/6392#discussion_r1440027525


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/ConnectionManager.java:
##
@@ -229,7 +229,7 @@ public ConnectionContext getConnection(UserGroupInformation ugi,
 
 // Add a new connection to the pool if it wasn't usable
 if (conn == null || !conn.isUsable()) {
-  if (!this.creatorQueue.offer(pool)) {
+  if (!this.creatorQueue.contains(pool) && !this.creatorQueue.offer(pool)) {

Review Comment:
   Thanks for the review.
   
   This prevents the same pool from being added to the creatorQueue if the pool 
is already in the creatorQueue.  
   I added a test case for this and updated the description.  @slfan1989 





> RBF: ConnectionManager creatorQueue should offer a pool that is not already 
> in creatorQueue.
> 
>
> Key: HDFS-17311
> URL: https://issues.apache.org/jira/browse/HDFS-17311
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: liuguanghua
>Priority: Major
>  Labels: pull-request-available
>
> 2023-12-29 15:18:54,799 ERROR 
> org.apache.hadoop.hdfs.server.federation.router.ConnectionManager: Cannot add 
> more than 2048 connections at the same time
> In my environment, ConnectionManager creatorQueue is full, but the cluster 
> does not have enough users to reach 2048 pairs of  in the router.
> Under high concurrency, creatorQueue adds the same pool more than once.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17333) Support lazy resolve host->ip.

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805079#comment-17805079
 ] 

ASF GitHub Bot commented on HDFS-17333:
---

hadoop-yetus commented on PR #6430:
URL: https://github.com/apache/hadoop/pull/6430#issuecomment-1884664069

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 32s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 38s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  35m 15s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  18m 11s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  16m 56s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   4m 39s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   4m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m 10s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   3m 26s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   9m 12s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  38m 35s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 46s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  17m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  15m 10s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  15m 10s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   4m  7s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m  8s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   3m 19s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   9m 44s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  40m 47s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  19m 33s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m 43s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 219m 43s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6430/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  4s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 495m 21s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.tools.TestHdfsConfigFields |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6430/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6430 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 9552b09511ff 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 334347014a1b311ca40c92fe7c8c29f7904f9a0f |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test 

[jira] [Commented] (HDFS-17335) Add metrics for syncWaitQ in FSEditLogAsync

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805049#comment-17805049
 ] 

ASF GitHub Bot commented on HDFS-17335:
---

hfutatzhanghb opened a new pull request, #6431:
URL: https://github.com/apache/hadoop/pull/6431

   
   ### Description of PR
   See HDFS-17335.
   
   To monitor syncWaitQ in FSEditLogAsync, we add a metric syncPendingCount.
   
   The reason we add this metric is that when dequeueEdit() returns null, the 
boolean variable doSync is set to !syncWaitQ.isEmpty(). After adding this 
metric, we can better monitor sync performance and code.
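
The idea described above can be illustrated with a minimal, self-contained sketch. It is not the code from the PR: the class `SyncPendingCountSketch`, the `String` edits, the `submit` and `step` methods, and the `AtomicInteger` counter are all hypothetical stand-ins, and only the names `syncWaitQ`, `doSync`, and `syncPendingCount` come from the description. The sketch assumes a single consumer thread drains the queues while the counter may be read from elsewhere.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the async edit-log loop; not Hadoop code.
class SyncPendingCountSketch {
  // Edits submitted by callers and not yet picked up by the loop.
  private final Deque<String> editPendingQ = new ArrayDeque<>();
  // Edits that were logged and are waiting for the next sync.
  private final Deque<String> syncWaitQ = new ArrayDeque<>();
  // The metric: number of edits currently waiting for a sync.
  private final AtomicInteger syncPendingCount = new AtomicInteger();

  void submit(String edit) {
    editPendingQ.add(edit);
  }

  // One iteration of the loop. Returns true when a sync was performed.
  boolean step() {
    String edit = editPendingQ.poll();
    boolean doSync;
    if (edit != null) {
      syncWaitQ.add(edit);
      syncPendingCount.incrementAndGet();
      doSync = false; // assumption: this edit does not request a sync itself
    } else {
      // The pending queue ran dry: sync only if edits are waiting for one.
      doSync = !syncWaitQ.isEmpty();
    }
    if (doSync) {
      // Stand-in for the real sync: drain the waiters and update the metric.
      while (syncWaitQ.poll() != null) {
        syncPendingCount.decrementAndGet();
      }
    }
    return doSync;
  }

  int getSyncPendingCount() {
    return syncPendingCount.get();
  }

  public static void main(String[] args) {
    SyncPendingCountSketch log = new SyncPendingCountSketch();
    log.submit("edit-1");
    log.submit("edit-2");
    log.step();
    log.step();
    System.out.println("pending before sync: " + log.getSyncPendingCount());
    log.step();
    System.out.println("pending after sync: " + log.getSyncPendingCount());
  }
}
```

A counter is kept next to the queue here so that a metrics reader never touches the non-thread-safe `ArrayDeque`; whether the PR does it this way is not stated in this thread.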
   
   




> Add metrics for syncWaitQ in FSEditLogAsync
> ---
>
> Key: HDFS-17335
> URL: https://issues.apache.org/jira/browse/HDFS-17335
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.6
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>
> To monitor syncWaitQ in FSEditLogAsync, we add a metric syncPendingCount.
> The reason we add this metric is that when dequeueEdit() returns null, the 
> boolean variable doSync is set to !syncWaitQ.isEmpty(). After adding this 
> metric, we can better monitor sync performance and code.






[jira] [Assigned] (HDFS-17335) Add metrics for syncWaitQ in FSEditLogAsync

2024-01-10 Thread farmmamba (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

farmmamba reassigned HDFS-17335:


Assignee: farmmamba

> Add metrics for syncWaitQ in FSEditLogAsync
> ---
>
> Key: HDFS-17335
> URL: https://issues.apache.org/jira/browse/HDFS-17335
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.6
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>
> To monitor syncWaitQ in FSEditLogAsync, we add a metric syncPendingCount.
> The reason we add this metric is that when dequeueEdit() returns null, the 
> boolean variable doSync is set to !syncWaitQ.isEmpty(). After adding this 
> metric, we can better monitor sync performance and code.






[jira] [Updated] (HDFS-17335) Add metrics for syncWaitQ in FSEditLogAsync

2024-01-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17335:
--
Labels: pull-request-available  (was: )

> Add metrics for syncWaitQ in FSEditLogAsync
> ---
>
> Key: HDFS-17335
> URL: https://issues.apache.org/jira/browse/HDFS-17335
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.6
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>  Labels: pull-request-available
>
> To monitor syncWaitQ in FSEditLogAsync, we add a metric syncPendingCount.
> The reason we add this metric is that when dequeueEdit() returns null, the 
> boolean variable doSync is set to !syncWaitQ.isEmpty(). After adding this 
> metric, we can better monitor sync performance and code.






[jira] [Created] (HDFS-17335) Add metrics for syncWaitQ in FSEditLogAsync

2024-01-10 Thread farmmamba (Jira)
farmmamba created HDFS-17335:


 Summary: Add metrics for syncWaitQ in FSEditLogAsync
 Key: HDFS-17335
 URL: https://issues.apache.org/jira/browse/HDFS-17335
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 3.3.6
Reporter: farmmamba


To monitor syncWaitQ in FSEditLogAsync, we add a metric syncPendingCount.

The reason we add this metric is that when dequeueEdit() returns null, the 
boolean variable doSync is set to !syncWaitQ.isEmpty(). After adding this 
metric, we can better monitor sync performance and code.






[jira] [Commented] (HDFS-17300) [SBN READ] A rpc call in Observer should throw ObserverRetryOnActiveException if its stateid is always lower than client stateid for a configured time.

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805035#comment-17805035
 ] 

ASF GitHub Bot commented on HDFS-17300:
---

hadoop-yetus commented on PR #6414:
URL: https://github.com/apache/hadoop/pull/6414#issuecomment-1884472273

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   6m 54s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 11s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  20m 29s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   8m 34s |  |  trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   7m 45s |  |  trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   2m  1s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 38s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 18s |  |  trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m  8s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 42s |  |  branch has no errors when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 22s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 15s |  |  the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   8m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   7m 50s |  |  the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   7m 50s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 58s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 37s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 11s |  |  the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 42s |  |  patch has no errors when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  15m 40s |  |  hadoop-common in the patch passed.  |
   | -1 :x: |  unit  | 191m 22s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6414/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 347m  4s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.TestErasureCodingPolicies |
   |   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
   |   | hadoop.hdfs.TestDistributedFileSystemWithECFileWithRandomECPolicy |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6414/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6414 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 5eea4a21e82e 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 56101db610b19013c3a1fa57a61afb7845cd3971 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 

[jira] [Commented] (HDFS-17334) FSEditLogAsync#enqueueEdit does not synchronized this before invoke wait method

2024-01-10 Thread farmmamba (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805007#comment-17805007
 ] 

farmmamba commented on HDFS-17334:
--

[~hexiaoqiao] [~tomscut] [~zhangshuyan] [~haiyang Hu] Could you please help me 
check this potential problem when you have time? Thanks in advance.

> FSEditLogAsync#enqueueEdit does not synchronized this before invoke wait 
> method
> ---
>
> Key: HDFS-17334
> URL: https://issues.apache.org/jira/browse/HDFS-17334
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.6
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
> Fix For: 3.5.0
>
>
> In the method FSEditLogAsync#enqueueEdit, there is the following code:
> {code:java}
> if (Thread.holdsLock(this)) {
>   // if queue is full, synchronized caller must immediately relinquish
>   // the monitor before re-offering to avoid deadlock with sync thread
>   // which needs the monitor to write transactions.
>   int permits = overflowMutex.drainPermits();
>   try {
>     do {
>       this.wait(1000); // will be notified by next logSync.
>     } while (!editPendingQ.offer(edit));
>   } finally {
>     overflowMutex.release(permits);
>   }
> }
> {code}
> It may invoke this.wait(1000) without holding this object's monitor.
>  
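
For readers checking the concern above, the following self-contained sketch shows the two standard Java behaviours involved: `Object.wait` throws `IllegalMonitorStateException` when the calling thread does not own the monitor, and `Thread.holdsLock` reports whether it does. The class `WaitMonitorSketch` and its methods are hypothetical and are not part of FSEditLogAsync. In the snippet as quoted, the `wait(1000)` call sits inside the `Thread.holdsLock(this)` branch, so this sketch only illustrates the semantics and does not settle whether the reported problem exists.

```java
// Hypothetical demo class; not Hadoop code.
class WaitMonitorSketch {

  // Calls wait() with no ownership check. Returns true if the JVM rejected
  // the call because the current thread does not own the monitor.
  static boolean unguardedWaitThrows(Object lock) {
    try {
      lock.wait(1);
      return false;
    } catch (IllegalMonitorStateException e) {
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  // Mirrors the shape of the quoted snippet: wait only when the current
  // thread already holds the monitor. Returns true if it waited.
  static boolean guardedWait(Object lock) {
    if (!Thread.holdsLock(lock)) {
      return false;
    }
    try {
      lock.wait(1);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return true;
  }

  public static void main(String[] args) {
    Object lock = new Object();
    System.out.println("unguarded wait rejected: " + unguardedWaitThrows(lock));
    System.out.println("guarded wait ran without monitor: " + guardedWait(lock));
    synchronized (lock) {
      System.out.println("guarded wait ran with monitor: " + guardedWait(lock));
    }
  }
}
```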






[jira] [Commented] (HDFS-17276) The nn fetch editlog forbidden in kerberos environment

2024-01-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805006#comment-17805006
 ] 

ASF GitHub Bot commented on HDFS-17276:
---

hadoop-yetus commented on PR #6326:
URL: https://github.com/apache/hadoop/pull/6326#issuecomment-1884355739

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 21s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  31m 43s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   0m 38s |  |  trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   0m 36s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 40s |  |  trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m  1s |  |  trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   1m 46s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 35s |  |  branch has no errors when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m  0s |  |  the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   1m 45s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 34s |  |  patch has no errors when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 185m  1s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6326/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 25s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 271m 19s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6326/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6326 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 3cf90d63b40e 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 1eab4744774cb8325c1622734d27b247de27fbc6 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6326/5/testReport/ |
   | Max. process+thread count | 4396 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output |