[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689582#comment-17689582
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

hadoop-yetus commented on PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1432641907

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  16m  0s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  31m  9s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 14s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |  20m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   3m 45s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 27s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 34s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m 18s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 57s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 55s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |  22m 55s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 41s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  20m 41s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   3m 37s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/6/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 2 new + 130 unchanged - 0 fixed = 132 total (was 
130)  |
   | +1 :green_heart: |  mvnsite  |   3m 29s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 21s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 42s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m 22s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m 29s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 15s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 205m 59s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 14s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 452m 40s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger |
   |   | hadoop.hdfs.server.namenode.TestFsck |
   |   | hadoop.hdfs.server.namenode.TestAuditLogs |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5397 |
   | Optional Tests | dupname asflicense mvnsite codespell detsecrets 
markdownlint compile javac javadoc mvni

[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689577#comment-17689577
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

hadoop-yetus commented on PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1432623582

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 39s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 31s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  31m  5s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m  4s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |  20m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   3m 55s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 29s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 28s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 43s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m 11s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m 30s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 34s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |  22m 34s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  20m 34s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   3m 36s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/5/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 2 new + 131 unchanged - 0 fixed = 133 total (was 
131)  |
   | +1 :green_heart: |  mvnsite  |   3m 24s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 21s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 42s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m 38s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 22s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 204m 26s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 10s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 450m 54s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogs |
   |   | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
   |   | hadoop.hdfs.server.namenode.TestFsck |
   |   | hadoop.hdfs.server.namenode.TestAuditLogger |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5397 |
   | Optional Tests | dupname asflicense mvnsite codespell detsecrets 
markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs 
checkstyle |
   | uname |

[jira] [Commented] (HDFS-16922) The logic of IncrementalBlockReportManager#addRDBI method may cause missing blocks when cluster is busy.

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689554#comment-17689554
 ] 

ASF GitHub Bot commented on HDFS-16922:
---

hfutatzhanghb commented on PR #5398:
URL: https://github.com/apache/hadoop/pull/5398#issuecomment-1432581037

   > Thanks for involving me here. It is an interesting issue. I am confused 
about some points of the description.
   > 
   > > dn3 is writing blk_12345_002, but dn2 is blocked by the recoverClose 
method and does not send an ack to the client.
   > 
   > Is this another fault injection, or is it related to this write flow?
   > 
   > > dn3 writes blk_12345_003 successfully.
   > > dn3 writes blk_12345_002 successfully and notifyNamenodeReceivedBlock.
   > 
   > Here dn3 writes the same block replica twice; is it expected?
   > 
   > Sorry, I didn't dig deeply into this logic; I will trace it for a while. 
@hfutatzhanghb Thanks again for your report and for offering the solution.
   
   Hi @Hexiaoqiao, thanks for your reply. 
   For question 1: dn2 is blocked by recoverClose() because it acquires the 
datasetWriteLock in branch-3.3.2.
   For question 2: yes, dn3 writes the same block replica twice, but the two 
replicas have different generation stamps. When blk_12345_003 and 
blk_12345_002 are written in the same IBR interval, 
IncrementalBlockReportManager#addRDBI will remove the report of blk_12345_003, 
as sketched below.
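
   A minimal sketch of that eviction (hypothetical types and names, not the 
actual IncrementalBlockReportManager code): pending incremental block reports 
are keyed by the block ID alone, so the later report for the lower generation 
stamp silently replaces the pending report for the higher one.

   ```java
   import java.util.HashMap;
   import java.util.Map;

   public class AddRdbiEvictionSketch {
     public static void main(String[] args) {
       // Hypothetical simplification: pending IBR entries are keyed by block
       // ID only, so the generation stamp plays no part in the lookup.
       Map<Long, String> pendingIBR = new HashMap<>();
       pendingIBR.put(12345L, "blk_12345_003"); // recovered replica reported first
       pendingIBR.put(12345L, "blk_12345_002"); // stale replica's report evicts it
       // Only blk_12345_002 reaches the NameNode in this IBR interval; the
       // report for the replica with generation stamp 003 is lost.
       System.out.println(pendingIBR.get(12345L)); // prints blk_12345_002
     }
   }
   ```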




> The logic of IncrementalBlockReportManager#addRDBI method may cause missing 
> blocks when cluster is busy.
> 
>
> Key: HDFS-16922
> URL: https://issues.apache.org/jira/browse/HDFS-16922
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: ZhangHB
>Priority: Major
>  Labels: pull-request-available
>
> The current logic of the IncrementalBlockReportManager#addRDBI method could 
> lead to missing blocks when datanodes in the pipeline are I/O busy.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16898) Remove write lock for processCommandFromActor of DataNode to reduce impact on heartbeat

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689507#comment-17689507
 ] 

ASF GitHub Bot commented on HDFS-16898:
---

Hexiaoqiao commented on PR #5408:
URL: https://github.com/apache/hadoop/pull/5408#issuecomment-1432502138

   Updated the title; let's wait and see what Yetus says.




> Remove write lock for processCommandFromActor of DataNode to reduce impact on 
> heartbeat
> ---
>
> Key: HDFS-16898
> URL: https://issues.apache.org/jira/browse/HDFS-16898
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.3.4
>Reporter: ZhangHB
>Assignee: ZhangHB
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Now, in the method processCommandFromActor, we have code like the following:
>  
> {code:java}
> writeLock();
> try {
>   if (actor == bpServiceToActive) {
> return processCommandFromActive(cmd, actor);
>   } else {
> return processCommandFromStandby(cmd, actor);
>   }
> } finally {
>   writeUnlock();
> } {code}
> If processCommandFromActive takes a long time, the write lock is not 
> released.
>  
> This may block the updateActorStatesFromHeartbeat method in offerService; 
> furthermore, it can drive the datanode's lastContact very high, and the 
> datanode may even be marked dead when lastContact exceeds 600s.
> {code:java}
> bpos.updateActorStatesFromHeartbeat(
> this, resp.getNameNodeHaState());{code}
> Here we can make the write lock fine-grained in the processCommandFromActor 
> method to address this problem.
>  
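
For illustration, a minimal sketch of the fine-grained locking idea (a sketch 
only, not the actual HDFS-16898 patch; names follow the snippet above): hold 
the lock just long enough to read bpServiceToActive, then dispatch outside it 
so a slow command handler cannot pin the write lock for the whole call.

{code:java}
// Sketch only: decide under the lock, process outside it.
boolean fromActive;
writeLock();
try {
  fromActive = (actor == bpServiceToActive);
} finally {
  writeUnlock();
}
// The handlers can take finer-grained locks internally as needed.
return fromActive
    ? processCommandFromActive(cmd, actor)
    : processCommandFromStandby(cmd, actor);
{code}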






[jira] [Commented] (HDFS-16922) The logic of IncrementalBlockReportManager#addRDBI method may cause missing blocks when cluster is busy.

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689505#comment-17689505
 ] 

ASF GitHub Bot commented on HDFS-16922:
---

Hexiaoqiao commented on PR #5398:
URL: https://github.com/apache/hadoop/pull/5398#issuecomment-1432500654

   addendum:
   
   > Requires a UT which can reproduce the said issue.
   
   What Ayushtkn means here is that we should add new unit tests (test source 
code, such as TestClientProtocolForPipelineRecovery from HDFS-16146 mentioned 
above). Thanks.




> The logic of IncrementalBlockReportManager#addRDBI method may cause missing 
> blocks when cluster is busy.
> 
>
> Key: HDFS-16922
> URL: https://issues.apache.org/jira/browse/HDFS-16922
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: ZhangHB
>Priority: Major
>  Labels: pull-request-available
>
> The current logic of the IncrementalBlockReportManager#addRDBI method could 
> lead to missing blocks when datanodes in the pipeline are I/O busy.






[jira] [Commented] (HDFS-16922) The logic of IncrementalBlockReportManager#addRDBI method may cause missing blocks when cluster is busy.

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689503#comment-17689503
 ] 

ASF GitHub Bot commented on HDFS-16922:
---

Hexiaoqiao commented on PR #5398:
URL: https://github.com/apache/hadoop/pull/5398#issuecomment-1432498891

   Thanks for involving me here. It is an interesting issue. I am confused 
about some points of the description.
   
   > dn3 is writing blk_12345_002, but dn2 is blocked by the recoverClose 
method and does not send an ack to the client.
   
   Is this another fault injection, or is it related to this write flow?
   
   > dn3 writes blk_12345_003 successfully.
   > dn3 writes blk_12345_002 successfully and notifyNamenodeReceivedBlock.
   
   Here dn3 writes the same block replica twice; is it expected?
   
   Sorry, I didn't dig deeply into this logic; I will trace it for a while.
   @hfutatzhanghb Thanks again for your report and for offering the solution. 




> The logic of IncrementalBlockReportManager#addRDBI method may cause missing 
> blocks when cluster is busy.
> 
>
> Key: HDFS-16922
> URL: https://issues.apache.org/jira/browse/HDFS-16922
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: ZhangHB
>Priority: Major
>  Labels: pull-request-available
>
> The current logic of the IncrementalBlockReportManager#addRDBI method could 
> lead to missing blocks when datanodes in the pipeline are I/O busy.






[jira] [Commented] (HDFS-16898) Remove write lock for processCommandFromActor of DataNode to reduce impact on heartbeat

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689486#comment-17689486
 ] 

ASF GitHub Bot commented on HDFS-16898:
---

hfutatzhanghb opened a new pull request, #5408:
URL: https://github.com/apache/hadoop/pull/5408

   https://github.com/apache/hadoop/pull/5330




> Remove write lock for processCommandFromActor of DataNode to reduce impact on 
> heartbeat
> ---
>
> Key: HDFS-16898
> URL: https://issues.apache.org/jira/browse/HDFS-16898
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.3.4
>Reporter: ZhangHB
>Assignee: ZhangHB
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Now, in the method processCommandFromActor, we have code like the following:
>  
> {code:java}
> writeLock();
> try {
>   if (actor == bpServiceToActive) {
> return processCommandFromActive(cmd, actor);
>   } else {
> return processCommandFromStandby(cmd, actor);
>   }
> } finally {
>   writeUnlock();
> } {code}
> If processCommandFromActive takes a long time, the write lock is not 
> released.
>  
> This may block the updateActorStatesFromHeartbeat method in offerService; 
> furthermore, it can drive the datanode's lastContact very high, and the 
> datanode may even be marked dead when lastContact exceeds 600s.
> {code:java}
> bpos.updateActorStatesFromHeartbeat(
> this, resp.getNameNodeHaState());{code}
> Here we can make the write lock fine-grained in the processCommandFromActor 
> method to address this problem.
>  






[jira] [Commented] (HDFS-16898) Remove write lock for processCommandFromActor of DataNode to reduce impact on heartbeat

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689485#comment-17689485
 ] 

ASF GitHub Bot commented on HDFS-16898:
---

hfutatzhanghb commented on PR #5330:
URL: https://github.com/apache/hadoop/pull/5330#issuecomment-1432454503

   > @hfutatzhanghb This PR could not be cherry-picked to branch-3.3 smoothly. 
Would you mind submitting another PR for branch-3.3?
   
   @Hexiaoqiao, done; please have a look. Thanks.




> Remove write lock for processCommandFromActor of DataNode to reduce impact on 
> heartbeat
> ---
>
> Key: HDFS-16898
> URL: https://issues.apache.org/jira/browse/HDFS-16898
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.3.4
>Reporter: ZhangHB
>Assignee: ZhangHB
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Now, in the method processCommandFromActor, we have code like the following:
>  
> {code:java}
> writeLock();
> try {
>   if (actor == bpServiceToActive) {
> return processCommandFromActive(cmd, actor);
>   } else {
> return processCommandFromStandby(cmd, actor);
>   }
> } finally {
>   writeUnlock();
> } {code}
> If processCommandFromActive takes a long time, the write lock is not 
> released.
>  
> This may block the updateActorStatesFromHeartbeat method in offerService; 
> furthermore, it can drive the datanode's lastContact very high, and the 
> datanode may even be marked dead when lastContact exceeds 600s.
> {code:java}
> bpos.updateActorStatesFromHeartbeat(
> this, resp.getNameNodeHaState());{code}
> Here we can make the write lock fine-grained in the processCommandFromActor 
> method to address this problem.
>  






[jira] [Commented] (HDFS-16918) Optionally shut down datanode if it does not stay connected to active namenode

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689481#comment-17689481
 ] 

ASF GitHub Bot commented on HDFS-16918:
---

hadoop-yetus commented on PR #5396:
URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1432445119

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 22s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  50m 49s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 28s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  8s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 29s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 35s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  29m 26s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 54s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 325 unchanged 
- 0 fixed = 326 total (was 325)  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 53s | 
[/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/2/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt)
 |  hadoop-hdfs in the patch failed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 29s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  29m 11s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 251m 57s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 43s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 385m  0s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger |
   |   | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
   |   | hadoop.hdfs.server.namenode.TestAuditLogs |
   |   | hadoop.hdfs.server.namenode.TestFsck |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5396 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 5d0f90e11c93 4.15.0-20

[jira] [Commented] (HDFS-16922) The logic of IncrementalBlockReportManager#addRDBI method may cause missing blocks when cluster is busy.

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689478#comment-17689478
 ] 

ASF GitHub Bot commented on HDFS-16922:
---

hfutatzhanghb commented on PR #5398:
URL: https://github.com/apache/hadoop/pull/5398#issuecomment-1432442616

   Hi @jojochuang @Hexiaoqiao @zhangshuyan0, this PR seems to be another 
supplement to [HDFS-16146](https://issues.apache.org/jira/browse/HDFS-16146); 
could you please take a look? Thanks, all.




> The logic of IncrementalBlockReportManager#addRDBI method may cause missing 
> blocks when cluster is busy.
> 
>
> Key: HDFS-16922
> URL: https://issues.apache.org/jira/browse/HDFS-16922
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: ZhangHB
>Priority: Major
>  Labels: pull-request-available
>
> The current logic of the IncrementalBlockReportManager#addRDBI method could 
> lead to missing blocks when datanodes in the pipeline are I/O busy.






[jira] [Commented] (HDFS-16925) Fix regex pattern for namenode audit log tests

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689449#comment-17689449
 ] 

ASF GitHub Bot commented on HDFS-16925:
---

hadoop-yetus commented on PR #5407:
URL: https://github.com/apache/hadoop/pull/5407#issuecomment-1432379608

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 51s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 4 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  49m 10s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 29s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   1m 21s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  5s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 29s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 40s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  28m 42s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 25s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   1m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 59 unchanged - 1 
fixed = 59 total (was 60)  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 30s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  28m 44s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 218m 14s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5407/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 45s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 347m 43s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5407/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5407 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux cbb1315f256b 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 3bf75df6d997563e9aaea7af30c58dd9ae4729a8 |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5407/1/testReport/ |
   | Max. process+thread count | 2440 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-

[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689443#comment-17689443
 ] 

ASF GitHub Bot commented on HDFS-16761:
---

tomscut commented on PR #5390:
URL: https://github.com/apache/hadoop/pull/5390#issuecomment-1432358750

   Sorry for introducing this problem. Thank you all.
   




> Namenode UI for Datanodes page not loading if any data node is down
> ---
>
> Key: HDFS-16761
> URL: https://issues.apache.org/jira/browse/HDFS-16761
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.2
>Reporter: Krishna Reddy
>Assignee: Zita Dombi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Steps to reproduce:
> - Install the hadoop components and add 3 datanodes
> - Enable namenode HA 
> - Open the Namenode UI and check the Datanodes page 
> - Check that all datanodes are displayed
> - Now bring one datanode down
> - Wait 10 minutes for the heartbeat to expire
> - Refresh the namenode page and check
>  
> Actual result: it shows the error message "NameNode is still loading. 
> Redirecting to the Startup Progress page."






[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689440#comment-17689440
 ] 

ASF GitHub Bot commented on HDFS-16761:
---

tasanuma commented on PR #5390:
URL: https://github.com/apache/hadoop/pull/5390#issuecomment-1432353003

   Thanks for merging it. The issue doesn't reproduce in branch-3.3. It seems 
to be caused by HDFS-16203, which is only in trunk.




> Namenode UI for Datanodes page not loading if any data node is down
> ---
>
> Key: HDFS-16761
> URL: https://issues.apache.org/jira/browse/HDFS-16761
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.2
>Reporter: Krishna Reddy
>Assignee: Zita Dombi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Steps to reproduce:
> - Install the hadoop components and add 3 datanodes
> - Enable namenode HA 
> - Open the Namenode UI and check the Datanodes page 
> - Check that all datanodes are displayed
> - Now bring one datanode down
> - Wait 10 minutes for the heartbeat to expire
> - Refresh the namenode page and check
>  
> Actual result: it shows the error message "NameNode is still loading. 
> Redirecting to the Startup Progress page."






[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689439#comment-17689439
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

hadoop-yetus commented on PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1432344941

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 49s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  27m  9s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  32m 29s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 46s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |  22m 11s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   4m  1s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 18s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 20s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m  5s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  27m  2s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  25m 19s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |  25m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 14s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  22m 14s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m  4s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/4/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 2 new + 105 unchanged - 0 fixed = 107 total (was 
105)  |
   | +1 :green_heart: |  mvnsite  |   3m 25s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 12s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 32s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m 25s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  27m 12s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  19m  8s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 211m 18s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 15s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 481m  9s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger |
   |   | hadoop.hdfs.server.namenode.TestFsck |
   |   | hadoop.hdfs.server.namenode.TestAuditLogs |
   |   | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5397 |
   | Optional Tests | dupname asflicense mvnsite codespell detsecrets 
markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs 
checkstyle |
   | uname |

[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689437#comment-17689437
 ] 

ASF GitHub Bot commented on HDFS-16896:
---

mccormickt12 commented on code in PR #5322:
URL: https://github.com/apache/hadoop/pull/5322#discussion_r1107919545


##
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java:
##
@@ -1337,7 +1352,11 @@ private void hedgedFetchBlockByteRange(LocatedBlock 
block, long start,
 } catch (InterruptedException ie) {
   // Ignore and retry
 }
-if (refetch) {
+// if refetch is true then all nodes are in deadlist or ignorelist
+// we should loop through all futures and remove them so we do not

Review Comment:
   Fixed the comments. The "deadlist" is actually deadNodes (I fixed that 
comment as well).
   When connections fail (in both the hedged and non-hedged code paths), nodes 
are added to the deadNodes collection so that other nodes are tried. Once 
`chooseDataNode` (or, more accurately, `getBestNodeDNAddrPair`) returns 
`null`, it calls `refetchLocations`, which clears deadNodes via 
`clearLocalDeadNodes()` and, now with my change, also clears the ignore list. 
   
   Note that we have added an assumption to `refetchLocations`. The comment I 
added to `refetchLocations`:
   ``` 
/**
  * RefetchLocations should only be called when there are no active requests
  * to datanodes. In the hedged read case this means futures should be empty
  */
  ```
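
   A minimal sketch of what that cleanup implies in the hedged path 
(hypothetical simplification of the loop in `hedgedFetchBlockByteRange`, not 
the actual patch): before locations are refetched, outstanding hedged futures 
are cancelled and the bookkeeping lists are cleared so the retry starts from a 
clean slate.

   ```java
   // Hypothetical simplification: once every node is in deadNodes or the
   // ignore list, abandon the in-flight hedged reads before refetching.
   if (refetch) {
     for (Future<ByteBuffer> future : futures) {
       future.cancel(true);  // stop reads against exhausted nodes
     }
     futures.clear();        // refetchLocations assumes no active requests
     ignored.clear();        // previously ignored nodes become candidates again
     // deadNodes itself is cleared inside refetchLocations via
     // clearLocalDeadNodes().
   }
   ```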





> HDFS Client hedged read has increased failure rate than without hedged read
> ---
>
> Key: HDFS-16896
> URL: https://issues.apache.org/jira/browse/HDFS-16896
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Tom McCormick
>Assignee: Tom McCormick
>Priority: Major
>  Labels: pull-request-available
>
> When hedged read is enabled by HDFS client, we see an increased failure rate 
> on reads.
> *stacktrace*
>  
> {code:java}
> Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain 
> block: BP-1183972111-10.197.192.88-1590025572374:blk_17114848218_16043459722 
> file=/data/tracking/streaming/AdImpressionEvent/daily/2022/07/18/compaction_1/part-r-1914862.1658217125623.1362294472.orc
> at 
> org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1077)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1060)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1039)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1365)
> at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1572)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1535)
> at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
> at 
> org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> at 
> org.apache.hadoop.fs.RetryingInputStream.lambda$readFully$3(RetryingInputStream.java:172)
> at org.apache.hadoop.fs.RetryPolicy.lambda$run$0(RetryPolicy.java:137)
> at org.apache.hadoop.fs.NoOpRetryPolicy.run(NoOpRetryPolicy.java:36)
> at org.apache.hadoop.fs.RetryPolicy.run(RetryPolicy.java:136)
> at 
> org.apache.hadoop.fs.RetryingInputStream.readFully(RetryingInputStream.java:168)
> at 
> org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> at 
> org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> at 
> io.trino.plugin.hive.orc.HdfsOrcDataSource.readInternal(HdfsOrcDataSource.java:76)
> ... 46 more
> {code}
>  






[jira] [Commented] (HDFS-13224) RBF: Resolvers to support mount points across multiple subclusters

2023-02-15 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-13224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689436#comment-17689436
 ] 

Íñigo Goiri commented on HDFS-13224:


[~Daniel Ma], it's been almost five years so I'm having a hard time finding 
design docs.
We added some documentation explaining the idea here: 
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs-rbf/HDFSRouterFederation.html
The `Multiple subclusters` section explains some of the use cases.

> RBF: Resolvers to support mount points across multiple subclusters
> --
>
> Key: HDFS-13224
> URL: https://issues.apache.org/jira/browse/HDFS-13224
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.3
>
> Attachments: HDFS-13224-branch-2.000.patch, HDFS-13224.000.patch, 
> HDFS-13224.001.patch, HDFS-13224.002.patch, HDFS-13224.003.patch, 
> HDFS-13224.004.patch, HDFS-13224.005.patch, HDFS-13224.006.patch, 
> HDFS-13224.007.patch, HDFS-13224.008.patch, HDFS-13224.009.patch, 
> HDFS-13224.010.patch
>
>
> Currently, a mount point points to a single subcluster. We should be able to 
> spread files in a mount point across subclusters.






[jira] [Commented] (HDFS-13224) RBF: Resolvers to support mount points across multiple subclusters

2023-02-15 Thread Daniel Ma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689432#comment-17689432
 ] 

Daniel Ma commented on HDFS-13224:
--

[~elgoiri], could you please share the design doc for this feature? I have no 
idea what kind of scenario needs a mount point to span subclusters.
Thanks

> RBF: Resolvers to support mount points across multiple subclusters
> --
>
> Key: HDFS-13224
> URL: https://issues.apache.org/jira/browse/HDFS-13224
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.3
>
> Attachments: HDFS-13224-branch-2.000.patch, HDFS-13224.000.patch, 
> HDFS-13224.001.patch, HDFS-13224.002.patch, HDFS-13224.003.patch, 
> HDFS-13224.004.patch, HDFS-13224.005.patch, HDFS-13224.006.patch, 
> HDFS-13224.007.patch, HDFS-13224.008.patch, HDFS-13224.009.patch, 
> HDFS-13224.010.patch
>
>
> Currently, a mount point points to a single subcluster. We should be able to 
> spread files in a mount point across subclusters.






[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689433#comment-17689433
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

hadoop-yetus commented on PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1432335476

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  19m 31s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  32m  0s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m  2s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |  22m 38s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   4m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 26s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 24s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m 13s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m 45s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 54s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |  24m 54s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 36s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  22m 36s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   3m 47s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/3/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 2 new + 105 unchanged - 0 fixed = 107 total (was 
105)  |
   | +1 :green_heart: |  mvnsite  |   3m 23s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 14s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m 42s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  27m  8s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 31s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 213m  8s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 11s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 473m 52s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger |
   |   | hadoop.hdfs.server.namenode.TestFsck |
   |   | hadoop.hdfs.server.namenode.TestAuditLogs |
   |   | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5397 |
   | Optional Tests | dupname asflicense mvnsite codespell detsecrets 
markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs 
checkstyle |
   | uname |

[jira] [Resolved] (HDFS-16914) Add some logs for updateBlockForPipeline RPC.

2023-02-15 Thread Tao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li resolved HDFS-16914.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

> Add some logs for updateBlockForPipeline RPC.
> -
>
> Key: HDFS-16914
> URL: https://issues.apache.org/jira/browse/HDFS-16914
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.4
>Reporter: ZhangHB
>Assignee: ZhangHB
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Recently, we received a phone alarm about missing blocks. We found logs like 
> the ones below in one datanode where the block was placed:
>  
> {code:java}
> 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231832415 src: 
> /clientAddress:44638 dest: /localAddress:50010 of size 45733720
> 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231826462 src: 
> /upStreamDatanode:60316 dest: /localAddress:50010 of size 45733720 {code}
> The datanode received the same block with different generation stamps 
> because of a socket timeout exception. blk_1305044966_231826462 was received 
> from the upstream datanode in a pipeline with two datanodes; 
> blk_1305044966_231832415 was received directly from the client.
>  
> We have searched all the log info about blk_1305044966 in the namenode and 
> the three datanodes in the original pipeline, but we could not find any 
> helpful message about generation stamp 231826462. After diving into the 
> source code, we found that it was assigned in 
> NameNodeRpcServer#updateBlockForPipeline, which is invoked from 
> DataStreamer#setupPipelineInternal. The updateBlockForPipeline RPC does not 
> log any info, so I think we should add some logs to this RPC.
>  
>  
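
For illustration, a hedged sketch of the kind of logging the issue asks for 
(hypothetical code, not the merged patch; it assumes the RPC handler delegates 
to the namesystem to bump the generation stamp, as described above):

{code:java}
// Sketch only: log the new generation stamp handed out during pipeline
// recovery so it can later be correlated with datanode logs.
public LocatedBlock updateBlockForPipeline(ExtendedBlock block, String clientName)
    throws IOException {
  LocatedBlock locatedBlock = namesystem.bumpBlockGenerationStamp(block, clientName);
  LOG.info("updateBlockForPipeline: block={}, client={}, newGenerationStamp={}",
      block, clientName, locatedBlock.getBlock().getGenerationStamp());
  return locatedBlock;
}
{code}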






[jira] [Commented] (HDFS-16914) Add some logs for updateBlockForPipeline RPC.

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689429#comment-17689429
 ] 

ASF GitHub Bot commented on HDFS-16914:
---

tomscut commented on PR #5381:
URL: https://github.com/apache/hadoop/pull/5381#issuecomment-1432330074

   Thanks @hfutatzhanghb for your contribution! And thanks @slfan1989 for your 
review!




> Add some logs for updateBlockForPipeline RPC.
> -
>
> Key: HDFS-16914
> URL: https://issues.apache.org/jira/browse/HDFS-16914
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.4
>Reporter: ZhangHB
>Assignee: ZhangHB
>Priority: Minor
>  Labels: pull-request-available
>
> Recently, we received a phone alarm about missing blocks. We found logs like 
> the ones below in one datanode where the block was placed:
>  
> {code:java}
> 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231832415 src: 
> /clientAddress:44638 dest: /localAddress:50010 of size 45733720
> 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231826462 src: 
> /upStreamDatanode:60316 dest: /localAddress:50010 of size 45733720 {code}
> The datanode received the same block with different generation stamps 
> because of a socket timeout exception. blk_1305044966_231826462 was received 
> from the upstream datanode in a pipeline with two datanodes; 
> blk_1305044966_231832415 was received directly from the client.
>  
> We have searched all the log info about blk_1305044966 in the namenode and 
> the three datanodes in the original pipeline, but we could not find any 
> helpful message about generation stamp 231826462. After diving into the 
> source code, we found that it was assigned in 
> NameNodeRpcServer#updateBlockForPipeline, which is invoked from 
> DataStreamer#setupPipelineInternal. The updateBlockForPipeline RPC does not 
> log any info, so I think we should add some logs to this RPC.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16914) Add some logs for updateBlockForPipeline RPC.

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689428#comment-17689428
 ] 

ASF GitHub Bot commented on HDFS-16914:
---

tomscut merged PR #5381:
URL: https://github.com/apache/hadoop/pull/5381




> Add some logs for updateBlockForPipeline RPC.
> -
>
> Key: HDFS-16914
> URL: https://issues.apache.org/jira/browse/HDFS-16914
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.4
>Reporter: ZhangHB
>Assignee: ZhangHB
>Priority: Minor
>  Labels: pull-request-available
>
> Recently, we received a phone alert about missing blocks. We found logs like 
> the ones below in one of the datanodes where the block was placed:
>  
> {code:java}
> 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231832415 src: 
> /clientAddress:44638 dest: /localAddress:50010 of size 45733720
> 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231826462 src: 
> /upStreamDatanode:60316 dest: /localAddress:50010 of size 45733720 {code}
> The datanode received the same block with two different generation stamps 
> because of a socket timeout exception. blk_1305044966_231826462 was received 
> from the upstream datanode in a pipeline that has two datanodes. 
> blk_1305044966_231832415 was received directly from the client.
>  
> We searched all log info about blk_1305044966 in the namenode and the three 
> datanodes in the original pipeline, but we could not find any helpful message 
> about generation stamp 231826462. After diving into the source code, we found 
> it is assigned in NameNodeRpcServer#updateBlockForPipeline, which is invoked 
> by DataStreamer#setupPipelineInternal. The updateBlockForPipeline RPC does not 
> log any information, so I think we should add some logs to this RPC.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689423#comment-17689423
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

hadoop-yetus commented on PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1432323560

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 45s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  23m 36s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  31m  4s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m  4s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |  20m 29s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   3m 47s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 32s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 27s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m  7s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m 25s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 21s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |  22m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 33s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  20m 33s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   3m 35s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/2/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 6 new + 105 unchanged - 0 fixed = 111 total (was 
105)  |
   | +1 :green_heart: |  mvnsite  |   3m 26s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 18s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 39s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m 21s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m 23s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 19s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 208m 23s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   1m 10s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 462m  8s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogs |
   |   | hadoop.hdfs.server.namenode.TestAuditLogger |
   |   | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
   |   | hadoop.hdfs.server.namenode.TestFsck |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5397 |
   | Optional Tests | dupname asflicense mvnsite codespell detsecrets 
markdownlint compile javac javadoc mvni

[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689419#comment-17689419
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

xinglin commented on code in PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#discussion_r1107896933


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java:
##
@@ -1936,4 +1936,17 @@ public static boolean isParentEntry(final String path, 
final String parent) {
 return path.charAt(parent.length()) == Path.SEPARATOR_CHAR
 || parent.equals(Path.SEPARATOR);
   }
+
+  /**
+   * Calculate the transfer rate in megabytes/second.
+   * @param bytes bytes
+   * @param durationMS duration in milliseconds
+   * @return the number of megabytes/second of the transfer rate
+  */
+  public static long transferRateMBs(long bytes, long durationMS) {
+if (durationMS == 0) {

Review Comment:
   Can we specify our function as: "we expect both inputs to be positive; 
otherwise, this function will return -1"?
   
   Then returning -1 is a clear signal we don't know how to handle such inputs. 
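   For context, a minimal sketch of the contract being proposed here 
(illustrative only; the exact code that lands in DFSUtil may differ):
   
   ```java
   // Sketch, assuming the "-1 on bad input" contract suggested above.
   public static long transferRateMBs(long bytes, long durationMS) {
     // Both inputs are expected to be positive; -1 signals inputs we do not
     // know how to interpret (including a zero duration, which would
     // otherwise divide by zero).
     if (bytes < 0 || durationMS <= 0) {
       return -1;
     }
     // Integer division: scale bytes to megabytes and milliseconds to seconds.
     return (bytes * 1000) / (durationMS * 1024L * 1024L);
   }
   ```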





> Add transfer rate quantile metrics for DataNode reads
> -
>
> Key: HDFS-16917
> URL: https://issues.apache.org/jira/browse/HDFS-16917
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode
>Reporter: Ravindra Dingankar
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, we have the following metrics for datanode reads:
> |BytesRead|Total number of bytes read from DataNode|
> |BlocksRead|Total number of blocks read from DataNode|
> |TotalReadTime|Total number of milliseconds spent on read operations|
> We would like to add a new quantile metric calculating the distribution of 
> data transfer rate for datanode reads.
>  
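A hypothetical sketch of how such a quantile could be wired up with the Hadoop 
metrics2 library (the metric name, 60-second window, and helper class are 
illustrative, not necessarily what the patch adds):

{code:java}
import org.apache.hadoop.metrics2.lib.MetricsRegistry;
import org.apache.hadoop.metrics2.lib.MutableQuantiles;

// Hypothetical sketch; names and the rollover interval are illustrative.
class ReadTransferRateSketch {
  private final MetricsRegistry registry = new MetricsRegistry("datanode");
  // One MutableQuantiles per rollover interval; 60 seconds here.
  private final MutableQuantiles readTransferRate =
      registry.newQuantiles("readTransferRate60s",
          "Read transfer rate in MB/s", "ops", "rate", 60);

  // Record one rate sample per completed read.
  void recordRead(long bytes, long durationMS) {
    if (bytes >= 0 && durationMS > 0) {
      readTransferRate.add((bytes * 1000) / (durationMS * 1024L * 1024L));
    }
  }
}
{code}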



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689417#comment-17689417
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

rdingankar commented on code in PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#discussion_r1107892591


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java:
##
@@ -1936,4 +1936,17 @@ public static boolean isParentEntry(final String path, 
final String parent) {
 return path.charAt(parent.length()) == Path.SEPARATOR_CHAR
 || parent.equals(Path.SEPARATOR);
   }
+
+  /**
+   * Calculate the transfer rate in megabytes/second.
+   * @param bytes bytes
+   * @param durationMS duration in milliseconds
+   * @return the number of megabytes/second of the transfer rate
+  */
+  public static long transferRateMBs(long bytes, long durationMS) {
+if (durationMS == 0) {

Review Comment:
   I don't feel we should handle other cases. This is a utils method, and any 
unexpected data should be left for the client to interpret. For some clients 
the negative values might even make sense.
   The idea behind handling durationMS = 0 is to take care of divide-by-zero 
for cases where the data transfer did not happen.





> Add transfer rate quantile metrics for DataNode reads
> -
>
> Key: HDFS-16917
> URL: https://issues.apache.org/jira/browse/HDFS-16917
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode
>Reporter: Ravindra Dingankar
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, we have the following metrics for datanode reads:
> |BytesRead|Total number of bytes read from DataNode|
> |BlocksRead|Total number of blocks read from DataNode|
> |TotalReadTime|Total number of milliseconds spent on read operations|
> We would like to add a new quantile metric calculating the distribution of 
> data transfer rate for datanode reads.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689416#comment-17689416
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

rdingankar commented on code in PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#discussion_r1107889758


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetrics.java:
##
@@ -61,6 +61,8 @@ public class DataNodeMetrics {
   @Metric MutableCounterLong bytesRead;
   @Metric("Milliseconds spent reading")
   MutableCounterLong totalReadTime;
+  @Metric MutableRate bytesReadTransferRate;

Review Comment:
   updated



##
hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md:
##
@@ -370,6 +370,7 @@ Each metrics record contains tags such as SessionId and 
Hostname as additional i
 |:--

> Add transfer rate quantile metrics for DataNode reads
> -
>
> Key: HDFS-16917
> URL: https://issues.apache.org/jira/browse/HDFS-16917
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode
>Reporter: Ravindra Dingankar
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, we have the following metrics for datanode reads:
> |BytesRead|Total number of bytes read from DataNode|
> |BlocksRead|Total number of blocks read from DataNode|
> |TotalReadTime|Total number of milliseconds spent on read operations|
> We would like to add a new quantile metric calculating the distribution of 
> data transfer rate for datanode reads.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689409#comment-17689409
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

xinglin commented on code in PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#discussion_r1107877662


##
hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md:
##
@@ -370,6 +370,7 @@ Each metrics record contains tags such as SessionId and 
Hostname as additional i
 |:--

> Add transfer rate quantile metrics for DataNode reads
> -
>
> Key: HDFS-16917
> URL: https://issues.apache.org/jira/browse/HDFS-16917
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode
>Reporter: Ravindra Dingankar
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, we have the following metrics for datanode reads:
> |BytesRead|Total number of bytes read from DataNode|
> |BlocksRead|Total number of blocks read from DataNode|
> |TotalReadTime|Total number of milliseconds spent on read operations|
> We would like to add a new quantile metric calculating the distribution of 
> data transfer rate for datanode reads.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689408#comment-17689408
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

xinglin commented on code in PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#discussion_r1107877375


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetrics.java:
##
@@ -61,6 +61,8 @@ public class DataNodeMetrics {
   @Metric MutableCounterLong bytesRead;
   @Metric("Milliseconds spent reading")
   MutableCounterLong totalReadTime;
+  @Metric MutableRate bytesReadTransferRate;

Review Comment:
   nit: rename to readTransferRateMBs?





> Add transfer rate quantile metrics for DataNode reads
> -
>
> Key: HDFS-16917
> URL: https://issues.apache.org/jira/browse/HDFS-16917
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode
>Reporter: Ravindra Dingankar
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, we have the following metrics for datanode reads:
> |BytesRead|Total number of bytes read from DataNode|
> |BlocksRead|Total number of blocks read from DataNode|
> |TotalReadTime|Total number of milliseconds spent on read operations|
> We would like to add a new quantile metric calculating the distribution of 
> data transfer rate for datanode reads.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689406#comment-17689406
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

xinglin commented on code in PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#discussion_r1107876562


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java:
##
@@ -1936,4 +1936,17 @@ public static boolean isParentEntry(final String path, 
final String parent) {
 return path.charAt(parent.length()) == Path.SEPARATOR_CHAR
 || parent.equals(Path.SEPARATOR);
   }
+
+  /**
+   * Calculate the transfer rate in megabytes/second.
+   * @param bytes bytes
+   * @param durationMS duration in milliseconds
+   * @return the number of megabytes/second of the transfer rate
+  */
+  public static long transferRateMBs(long bytes, long durationMS) {
+if (durationMS == 0) {

Review Comment:
   if it is <= 0, just return -1? Let's add a check for bytes as well.





> Add transfer rate quantile metrics for DataNode reads
> -
>
> Key: HDFS-16917
> URL: https://issues.apache.org/jira/browse/HDFS-16917
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode
>Reporter: Ravindra Dingankar
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, we have the following metrics for datanode reads:
> |BytesRead|Total number of bytes read from DataNode|
> |BlocksRead|Total number of blocks read from DataNode|
> |TotalReadTime|Total number of milliseconds spent on read operations|
> We would like to add a new quantile metric calculating the distribution of 
> data transfer rate for datanode reads.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16914) Add some logs for updateBlockForPipeline RPC.

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689403#comment-17689403
 ] 

ASF GitHub Bot commented on HDFS-16914:
---

slfan1989 commented on code in PR #5381:
URL: https://github.com/apache/hadoop/pull/5381#discussion_r1107866913


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:
##
@@ -5943,6 +5943,8 @@ LocatedBlock bumpBlockGenerationStamp(ExtendedBlock block,
 }
 // Ensure we record the new generation stamp
 getEditLog().logSync();
+LOG.info("bumpBlockGenerationStamp({}, client={}) success",
+locatedBlock.getBlock(), clientName);

Review Comment:
   @hfutatzhanghb @tomscut Thanks for the information!





> Add some logs for updateBlockForPipeline RPC.
> -
>
> Key: HDFS-16914
> URL: https://issues.apache.org/jira/browse/HDFS-16914
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.4
>Reporter: ZhangHB
>Assignee: ZhangHB
>Priority: Minor
>  Labels: pull-request-available
>
> Recently, we received a phone alert about missing blocks. We found logs like 
> the ones below in one of the datanodes where the block was placed:
>  
> {code:java}
> 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231832415 src: 
> /clientAddress:44638 dest: /localAddress:50010 of size 45733720
> 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231826462 src: 
> /upStreamDatanode:60316 dest: /localAddress:50010 of size 45733720 {code}
> The datanode received the same block with two different generation stamps 
> because of a socket timeout exception. blk_1305044966_231826462 was received 
> from the upstream datanode in a pipeline that has two datanodes. 
> blk_1305044966_231832415 was received directly from the client.
>  
> We searched all log info about blk_1305044966 in the namenode and the three 
> datanodes in the original pipeline, but we could not find any helpful message 
> about generation stamp 231826462. After diving into the source code, we found 
> it is assigned in NameNodeRpcServer#updateBlockForPipeline, which is invoked 
> by DataStreamer#setupPipelineInternal. The updateBlockForPipeline RPC does not 
> log any information, so I think we should add some logs to this RPC.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16918) Optionally shut down datanode if it does not stay connected to active namenode

2023-02-15 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HDFS-16918:

Description: 
While deploying HDFS on an Envoy proxy setup, depending on the socket timeout 
configured at Envoy, network connection issues or packet loss can be observed. 
The envoys basically form a transparent communication mesh in which each app 
sends and receives packets to and from localhost, unaware of the network 
topology.

The primary purpose of Envoy is to make the network transparent to 
applications in order to identify network issues reliably. However, such a 
proxy-based setup can sometimes result in socket connection issues between the 
datanode and namenode.

Many deployment frameworks provide auto-start functionality when any of the 
Hadoop daemons are stopped. If a given datanode does not stay connected to the 
active namenode in the cluster, i.e. does not receive a heartbeat response in 
time from the active namenode (even though the active namenode is not 
terminated), it is not of much use. We should be able to provide configurable 
behavior such that if a given datanode cannot receive a heartbeat response from 
the active namenode within a configurable time duration, it terminates itself 
to avoid impacting the availability SLA. This is specifically helpful when the 
underlying deployment or observability framework (e.g. K8s) can start up the 
datanode automatically upon its shutdown (unless it is being restarted as part 
of a rolling upgrade) and help the newly brought up datanode (in case of K8s, a 
new pod with dynamically changing nodes) establish new socket connections to 
the active and standby namenodes. This should be an opt-in behavior, not the 
default one.

 

In a distributed system, it is essential to have robust fail-fast mechanisms in 
place to prevent issues related to network partitioning. The system must be 
designed to prevent further degradation of availability and consistency in the 
event of a network partition. Several distributed systems offer fail-safe 
approaches, and for some, partition tolerance is critical to the extent that 
even a few seconds of heartbeat loss can trigger the removal of an application 
server instance from the cluster. For instance, a majority of ZooKeeper clients 
utilize ephemeral nodes for this purpose to make the system reliable, 
fault-tolerant, and strongly consistent in the event of a network partition.

From the HDFS architecture viewpoint, it is crucial to understand the critical 
role that the active and observer namenodes play in file system operations. In 
a large-scale cluster, if the datanodes holding the same block (primary and 
replicas) lose connection to both the active and observer namenodes for a 
significant amount of time, delaying the process of shutting down such 
datanodes and restarting them to re-establish the connection with the namenodes 
(assuming the active namenode is alive; the assumption is important in the 
event of a network partition, to re-establish the connection) will further 
deteriorate the availability of the service. This scenario underscores the 
importance of resolving network partitioning.

This is a real use case for HDFS, and it is not prudent to assume that every 
deployment or cluster management application must be able to restart datanodes 
based on JMX metrics, as this would introduce another application to resolve 
the network partition impact on HDFS. Besides, popular cluster management 
applications are not typically used in all cloud-native environments. Even if 
these cluster management applications are deployed, certain security 
constraints may restrict their access to JMX metrics and prevent them from 
interfering with HDFS operations. Applications that can only trigger alerts for 
users based on set parameters (for instance, missing blocks > 0) are the ones 
allowed to access JMX metrics.

  was:
While deploying HDFS on an Envoy proxy setup, depending on the socket timeout 
configured at Envoy, network connection issues or packet loss can be observed. 
The envoys basically form a transparent communication mesh in which each app 
sends and receives packets to and from localhost, unaware of the network 
topology.

The primary purpose of Envoy is to make the network transparent to 
applications in order to identify network issues reliably. However, such a 
proxy-based setup can sometimes result in socket connection issues between the 
datanode and namenode.

Many deployment frameworks provide auto-start functionality when any of the 
Hadoop daemons are stopped. If a given datanode does not stay connected to the 
active namenode in the cluster, i.e. does not receive a heartbeat response in 
time from the active namenode (even though the active namenode is not 
terminated), it is not of much use. We should be able to provide configurable 
behavior such that if a given datanode cannot receive heartbeat response from 
active namen

[jira] [Commented] (HDFS-16918) Optionally shut down datanode if it does not stay connected to active namenode

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689378#comment-17689378
 ] 

ASF GitHub Bot commented on HDFS-16918:
---

virajjasani commented on PR #5396:
URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1432167209

   Created HDFS-16925 to fix regex expressions for namenode audit log tests




> Optionally shut down datanode if it does not stay connected to active namenode
> --
>
> Key: HDFS-16918
> URL: https://issues.apache.org/jira/browse/HDFS-16918
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> While deploying HDFS on an Envoy proxy setup, depending on the socket timeout 
> configured at Envoy, network connection issues or packet loss can be observed. 
> The envoys basically form a transparent communication mesh in which each app 
> sends and receives packets to and from localhost, unaware of the network 
> topology.
> The primary purpose of Envoy is to make the network transparent to 
> applications in order to identify network issues reliably. However, such a 
> proxy-based setup can sometimes result in socket connection issues between 
> the datanode and namenode.
> Many deployment frameworks provide auto-start functionality when any of the 
> Hadoop daemons are stopped. If a given datanode does not stay connected to the 
> active namenode in the cluster, i.e. does not receive a heartbeat response in 
> time from the active namenode (even though the active namenode is not 
> terminated), it is not of much use. We should be able to provide configurable 
> behavior such that if a given datanode cannot receive a heartbeat response 
> from the active namenode within a configurable time duration, it terminates 
> itself to avoid impacting the availability SLA. This is specifically helpful 
> when the underlying deployment or observability framework (e.g. K8s) can start 
> up the datanode automatically upon its shutdown (unless it is being restarted 
> as part of a rolling upgrade) and help the newly brought up datanode (in case 
> of K8s, a new pod with dynamically changing nodes) establish new socket 
> connections to the active and standby namenodes. This should be an opt-in 
> behavior, not the default one.
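A minimal sketch of what such an opt-in knob might look like in hdfs-site.xml 
(the property name below is hypothetical, not the one introduced by the patch):

{code:xml}
<!-- Hypothetical property name, for illustration only. -->
<property>
  <name>dfs.datanode.active.namenode.heartbeat.timeout.ms</name>
  <!-- 0 (the assumed default) keeps today's behavior; a positive value
       makes the datanode terminate itself once it has not received a
       heartbeat response from the active namenode for this long. -->
  <value>0</value>
</property>
{code}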



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16918) Optionally shut down datanode if it does not stay connected to active namenode

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689375#comment-17689375
 ] 

ASF GitHub Bot commented on HDFS-16918:
---

virajjasani commented on PR #5396:
URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1432160563

   In a distributed system, it is essential to have robust fail-fast mechanisms 
in place to prevent issues related to network partitioning. The system must be 
designed to prevent further degradation of availability and consistency in the 
event of a network partition. Several distributed systems offer fail-safe 
approaches, and for some, partition tolerance is critical to the extent that 
even a few seconds of heartbeat loss can trigger the removal of an application 
server instance from the cluster. For instance, a majority of ZooKeeper clients 
utilize ephemeral nodes for this purpose to make the system reliable, 
fault-tolerant, and strongly consistent in the event of a network partition.
   
   From the HDFS architecture viewpoint, it is crucial to understand the 
critical role that the active and observer namenodes play in file system 
operations. In a large-scale cluster, if the datanodes holding the same block 
(primary and replicas) lose connection to both the active and observer 
namenodes for a significant amount of time, delaying the process of shutting 
down such datanodes and restarting them to re-establish the connection with the 
namenodes (assuming the active namenode is alive; the assumption is important 
in the event of a network partition, to re-establish the connection) will 
further deteriorate the availability of the service. This scenario underscores 
the importance of resolving network partitioning.
   
   This is a real use case for HDFS, and it is not prudent to assume that every 
deployment or cluster management application must be able to restart datanodes 
based on JMX metrics, as this would introduce another application to resolve 
the network partition impact on HDFS. Besides, popular cluster management 
applications are not typically used in all cloud-native environments. Even if 
these cluster management applications are deployed, certain security 
constraints may restrict their access to JMX metrics and prevent them from 
interfering with HDFS operations. Applications that can only trigger alerts for 
users based on set parameters (for instance, missing blocks > 0) are the ones 
allowed to access JMX metrics.




> Optionally shut down datanode if it does not stay connected to active namenode
> --
>
> Key: HDFS-16918
> URL: https://issues.apache.org/jira/browse/HDFS-16918
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> While deploying HDFS on an Envoy proxy setup, depending on the socket timeout 
> configured at Envoy, network connection issues or packet loss can be observed. 
> The envoys basically form a transparent communication mesh in which each app 
> sends and receives packets to and from localhost, unaware of the network 
> topology.
> The primary purpose of Envoy is to make the network transparent to 
> applications in order to identify network issues reliably. However, such a 
> proxy-based setup can sometimes result in socket connection issues between 
> the datanode and namenode.
> Many deployment frameworks provide auto-start functionality when any of the 
> Hadoop daemons are stopped. If a given datanode does not stay connected to the 
> active namenode in the cluster, i.e. does not receive a heartbeat response in 
> time from the active namenode (even though the active namenode is not 
> terminated), it is not of much use. We should be able to provide configurable 
> behavior such that if a given datanode cannot receive a heartbeat response 
> from the active namenode within a configurable time duration, it terminates 
> itself to avoid impacting the availability SLA. This is specifically helpful 
> when the underlying deployment or observability framework (e.g. K8s) can start 
> up the datanode automatically upon its shutdown (unless it is being restarted 
> as part of a rolling upgrade) and help the newly brought up datanode (in case 
> of K8s, a new pod with dynamically changing nodes) establish new socket 
> connections to the active and standby namenodes. This should be an opt-in 
> behavior, not the default one.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689348#comment-17689348
 ] 

ASF GitHub Bot commented on HDFS-16896:
---

simbadzina commented on code in PR #5322:
URL: https://github.com/apache/hadoop/pull/5322#discussion_r1107745069


##
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java:
##
@@ -1337,7 +1352,11 @@ private void hedgedFetchBlockByteRange(LocatedBlock 
block, long start,
 } catch (InterruptedException ie) {
   // Ignore and retry
 }
-if (refetch) {
+// if refetch is true then all nodes are in deadlist or ignorelist
+// we should loop through all futures and remove them so we do not

Review Comment:
   Could you add punctuation here and start new sentences with caps? That will 
make the comment easier to follow.
   
   What is the deadlist?





> HDFS Client hedged read has increased failure rate than without hedged read
> ---
>
> Key: HDFS-16896
> URL: https://issues.apache.org/jira/browse/HDFS-16896
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Tom McCormick
>Assignee: Tom McCormick
>Priority: Major
>  Labels: pull-request-available
>
> When hedged reads are enabled by the HDFS client, we see an increased failure 
> rate on reads.
> *stacktrace*
>  
> {code:java}
> Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain 
> block: BP-1183972111-10.197.192.88-1590025572374:blk_17114848218_16043459722 
> file=/data/tracking/streaming/AdImpressionEvent/daily/2022/07/18/compaction_1/part-r-1914862.1658217125623.1362294472.orc
> at 
> org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1077)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1060)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1039)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1365)
> at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1572)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1535)
> at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
> at 
> org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> at 
> org.apache.hadoop.fs.RetryingInputStream.lambda$readFully$3(RetryingInputStream.java:172)
> at org.apache.hadoop.fs.RetryPolicy.lambda$run$0(RetryPolicy.java:137)
> at org.apache.hadoop.fs.NoOpRetryPolicy.run(NoOpRetryPolicy.java:36)
> at org.apache.hadoop.fs.RetryPolicy.run(RetryPolicy.java:136)
> at 
> org.apache.hadoop.fs.RetryingInputStream.readFully(RetryingInputStream.java:168)
> at 
> org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> at 
> org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> at 
> io.trino.plugin.hive.orc.HdfsOrcDataSource.readInternal(HdfsOrcDataSource.java:76)
> ... 46 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689347#comment-17689347
 ] 

ASF GitHub Bot commented on HDFS-16896:
---

simbadzina commented on code in PR #5322:
URL: https://github.com/apache/hadoop/pull/5322#discussion_r1107745069


##
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java:
##
@@ -1337,7 +1352,11 @@ private void hedgedFetchBlockByteRange(LocatedBlock 
block, long start,
 } catch (InterruptedException ie) {
   // Ignore and retry
 }
-if (refetch) {
+// if refetch is true then all nodes are in deadlist or ignorelist
+// we should loop through all futures and remove them so we do not

Review Comment:
   Could you add punctuation here and start new sentences with caps. It was 
hard to read the flow of the comment.
   
   What is the deadlist?





> HDFS Client hedged read has increased failure rate than without hedged read
> ---
>
> Key: HDFS-16896
> URL: https://issues.apache.org/jira/browse/HDFS-16896
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Tom McCormick
>Assignee: Tom McCormick
>Priority: Major
>  Labels: pull-request-available
>
> When hedged reads are enabled by the HDFS client, we see an increased failure 
> rate on reads.
> *stacktrace*
>  
> {code:java}
> Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain 
> block: BP-1183972111-10.197.192.88-1590025572374:blk_17114848218_16043459722 
> file=/data/tracking/streaming/AdImpressionEvent/daily/2022/07/18/compaction_1/part-r-1914862.1658217125623.1362294472.orc
> at 
> org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1077)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1060)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1039)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1365)
> at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1572)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1535)
> at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
> at 
> org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> at 
> org.apache.hadoop.fs.RetryingInputStream.lambda$readFully$3(RetryingInputStream.java:172)
> at org.apache.hadoop.fs.RetryPolicy.lambda$run$0(RetryPolicy.java:137)
> at org.apache.hadoop.fs.NoOpRetryPolicy.run(NoOpRetryPolicy.java:36)
> at org.apache.hadoop.fs.RetryPolicy.run(RetryPolicy.java:136)
> at 
> org.apache.hadoop.fs.RetryingInputStream.readFully(RetryingInputStream.java:168)
> at 
> org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> at 
> org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> at 
> io.trino.plugin.hive.orc.HdfsOrcDataSource.readInternal(HdfsOrcDataSource.java:76)
> ... 46 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16925) Fix regex pattern for namenode audit log tests

2023-02-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16925:
--
Labels: pull-request-available  (was: )

> Fix regex pattern for namenode audit log tests
> --
>
> Key: HDFS-16925
> URL: https://issues.apache.org/jira/browse/HDFS-16925
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> With HADOOP-18628 in place, we perform InetAddress#getHostName in addition to 
> InetAddress#getHostAddress, to save the host name with the IPC Connection 
> object. When we perform InetAddress#getHostName, toString() of InetAddress 
> automatically prints \{hostName}/\{hostIPAddress} if the hostname is already 
> resolved:
> {code:java}
> /**
>  * Converts this IP address to a {@code String}. The
>  * string returned is of the form: hostname / literal IP
>  * address.
>  *
>  * If the host name is unresolved, no reverse name service lookup
>  * is performed. The hostname part will be represented by an empty string.
>  *
>  * @return  a string representation of this IP address.
>  */
> public String toString() {
> String hostName = holder().getHostName();
> return ((hostName != null) ? hostName : "")
> + "/" + getHostAddress();
> }{code}
>  
> For namenode audit logs, this means that when the DFS client makes filesystem 
> updates, the audit logs also print the host name in addition to the IP 
> address. We have some tests that perform regex pattern matching to identify 
> the audit log pattern; we will have to change them to reflect the change in 
> the host address.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16925) Fix regex pattern for namenode audit log tests

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689332#comment-17689332
 ] 

ASF GitHub Bot commented on HDFS-16925:
---

virajjasani opened a new pull request, #5407:
URL: https://github.com/apache/hadoop/pull/5407

   With [HADOOP-18628](https://issues.apache.org/jira/browse/HADOOP-18628) in 
place, we perform InetAddress#getHostName in addition to 
InetAddress#getHostAddress, to save the host name with the IPC Connection 
object. When we perform InetAddress#getHostName, toString() of InetAddress 
automatically prints {hostName}/{hostIPAddress} if the hostname is already 
resolved:
   
   ```
   /**
* Converts this IP address to a {@code String}. The
* string returned is of the form: hostname / literal IP
* address.
*
* If the host name is unresolved, no reverse name service lookup
* is performed. The hostname part will be represented by an empty string.
*
* @return  a string representation of this IP address.
*/
   public String toString() {
   String hostName = holder().getHostName();
   return ((hostName != null) ? hostName : "")
   + "/" + getHostAddress();
   }
   ```
   
   For namenode audit logs, this means that when the DFS client makes 
filesystem updates, the audit logs also print the host name in addition to the 
IP address. We have some tests that perform regex pattern matching to identify 
the audit log pattern; we will have to change them to reflect the change in the 
host address.
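   
   A hypothetical illustration of the kind of regex adjustment involved (the 
actual test patterns in the patch may differ):
   
   ```java
   import java.util.regex.Pattern;
   
   class AuditLogPatternSketch {
     // Before: the ip= field carried only the literal IP address.
     static final Pattern OLD =
         Pattern.compile("ip=/\\d+\\.\\d+\\.\\d+\\.\\d+");
     // After HADOOP-18628, InetAddress#toString renders "hostname/ip" once
     // the name is resolved, so an optional hostname may precede the slash.
     static final Pattern NEW =
         Pattern.compile("ip=[-.\\w]*/\\d+\\.\\d+\\.\\d+\\.\\d+");
   }
   ```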




> Fix regex pattern for namenode audit log tests
> --
>
> Key: HDFS-16925
> URL: https://issues.apache.org/jira/browse/HDFS-16925
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> With HADOOP-18628 in place, we perform InetAddress#getHostName in addition to 
> InetAddress#getHostAddress, to save the host name with the IPC Connection 
> object. When we perform InetAddress#getHostName, toString() of InetAddress 
> automatically prints \{hostName}/\{hostIPAddress} if the hostname is already 
> resolved:
> {code:java}
> /**
>  * Converts this IP address to a {@code String}. The
>  * string returned is of the form: hostname / literal IP
>  * address.
>  *
>  * If the host name is unresolved, no reverse name service lookup
>  * is performed. The hostname part will be represented by an empty string.
>  *
>  * @return  a string representation of this IP address.
>  */
> public String toString() {
> String hostName = holder().getHostName();
> return ((hostName != null) ? hostName : "")
> + "/" + getHostAddress();
> }{code}
>  
> For namenode audit logs, this means that when the DFS client makes filesystem 
> updates, the audit logs also print the host name in addition to the IP 
> address. We have some tests that perform regex pattern matching to identify 
> the audit log pattern; we will have to change them to reflect the change in 
> the host address.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16925) Fix regex pattern for namenode audit log tests

2023-02-15 Thread Viraj Jasani (Jira)
Viraj Jasani created HDFS-16925:
---

 Summary: Fix regex pattern for namenode audit log tests
 Key: HDFS-16925
 URL: https://issues.apache.org/jira/browse/HDFS-16925
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


With HADOOP-18628 in place, we perform InetAddress#getHostName in addition to 
InetAddress#getHostAddress, to save the host name with the IPC Connection 
object. When we perform InetAddress#getHostName, toString() of InetAddress 
automatically prints \{hostName}/\{hostIPAddress} if the hostname is already 
resolved:
{code:java}
/**
 * Converts this IP address to a {@code String}. The
 * string returned is of the form: hostname / literal IP
 * address.
 *
 * If the host name is unresolved, no reverse name service lookup
 * is performed. The hostname part will be represented by an empty string.
 *
 * @return  a string representation of this IP address.
 */
public String toString() {
String hostName = holder().getHostName();
return ((hostName != null) ? hostName : "")
+ "/" + getHostAddress();
}{code}
 

For namenode audit logs, this means that when the DFS client makes filesystem 
updates, the audit logs also print the host name in addition to the IP address. 
We have some tests that perform regex pattern matching to identify the audit 
log pattern; we will have to change them to reflect the change in the host 
address.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16918) Optionally shut down datanode if it does not stay connected to active namenode

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689305#comment-17689305
 ] 

ASF GitHub Bot commented on HDFS-16918:
---

virajjasani commented on PR #5396:
URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1431843597

   Please ignore the test failures; they are not relevant. I will track them 
separately.
   
   As for the change, this is a real use case for us. I have tested it on two 
clusters and deployed it in prod as well. Only limited infrastructure needs 
this kind of coverage, but it is indeed required.
   
   My only concern is that not every deployment framework has the ability to 
perform actions based on JMX metrics; cloud-native ones with tighter security 
specifically fall into this category.
   On the other hand, a datanode is not of much use for a long time if the 
active namenode is not available while a client is writing new blocks, etc.
   
   @jojochuang @tasanuma @tomscut could you please also take a look and provide 
your feedback?




> Optionally shut down datanode if it does not stay connected to active namenode
> --
>
> Key: HDFS-16918
> URL: https://issues.apache.org/jira/browse/HDFS-16918
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> While deploying HDFS on an Envoy proxy setup, depending on the socket timeout 
> configured at Envoy, network connection issues or packet loss can be observed. 
> The envoys basically form a transparent communication mesh in which each app 
> sends and receives packets to and from localhost, unaware of the network 
> topology.
> The primary purpose of Envoy is to make the network transparent to 
> applications in order to identify network issues reliably. However, such a 
> proxy-based setup can sometimes result in socket connection issues between 
> the datanode and namenode.
> Many deployment frameworks provide auto-start functionality when any of the 
> Hadoop daemons are stopped. If a given datanode does not stay connected to the 
> active namenode in the cluster, i.e. does not receive a heartbeat response in 
> time from the active namenode (even though the active namenode is not 
> terminated), it is not of much use. We should be able to provide configurable 
> behavior such that if a given datanode cannot receive a heartbeat response 
> from the active namenode within a configurable time duration, it terminates 
> itself to avoid impacting the availability SLA. This is specifically helpful 
> when the underlying deployment or observability framework (e.g. K8s) can start 
> up the datanode automatically upon its shutdown (unless it is being restarted 
> as part of a rolling upgrade) and help the newly brought up datanode (in case 
> of K8s, a new pod with dynamically changing nodes) establish new socket 
> connections to the active and standby namenodes. This should be an opt-in 
> behavior, not the default one.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689252#comment-17689252
 ] 

ASF GitHub Bot commented on HDFS-16923:
---

hadoop-yetus commented on PR #5400:
URL: https://github.com/apache/hadoop/pull/5400#issuecomment-1431697168

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 37s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 11s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  31m 55s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 18s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   6m 12s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m 15s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 12s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 46s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 25s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m 44s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 58s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  5s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   6m 26s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   6m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 49s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   5m 49s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  4s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 57s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m 47s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 27s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 205m 53s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5400/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch failed.  |
   | +1 :green_heart: |  unit  |  20m 34s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 48s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 376m 52s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger |
   |   | hadoop.hdfs.server.namenode.TestFsck |
   |   | hadoop.hdfs.server.namenode.TestAuditLogs |
   |   | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5400/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5400 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 0504b05c1ae8 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / deaaebdf677274d3c990e76d6904d93ef3fbdfa9 |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1u

[jira] [Commented] (HDFS-16916) Improve the use of JUnit Test in DFSClient

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689233#comment-17689233
 ] 

ASF GitHub Bot commented on HDFS-16916:
---

hadoop-yetus commented on PR #5404:
URL: https://github.com/apache/hadoop/pull/5404#issuecomment-1431655146

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 41s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  47m 20s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   0m 54s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   0m 36s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 59s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   0m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   2m 41s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 59s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 57s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 52s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   0m 52s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 45s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   0m 45s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 50s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 33s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   0m 33s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   2m 29s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 41s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 26s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 37s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 115m 21s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5404/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5404 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux f0cb8873113b 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / c6bed9534b9e55bd78f03e7c9aca8a02d1ebeb5c |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5404/1/testReport/ |
   | Max. process+thread count | 564 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: 
hadoop-hdfs-project/hadoop-hdfs-client |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5404/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> Improve the use of JUnit Test in DFSClient
> --

[jira] [Updated] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down

2023-02-15 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-16761:
-
Fix Version/s: 3.4.0

> Namenode UI for Datanodes page not loading if any data node is down
> ---
>
> Key: HDFS-16761
> URL: https://issues.apache.org/jira/browse/HDFS-16761
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.2
>Reporter: Krishna Reddy
>Assignee: Zita Dombi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Steps to reproduce:
> - Install the Hadoop components and add 3 datanodes
> - Enable NameNode HA
> - Open the NameNode UI and check the Datanodes page
> - Verify that all datanodes are displayed
> - Now take one datanode down
> - Wait 10 minutes for the heartbeat to expire
> - Refresh the NameNode page and check
>  
> Actual result: the UI shows the error message "NameNode is still loading. 
> Redirecting to the Startup Progress page."



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689216#comment-17689216
 ] 

ASF GitHub Bot commented on HDFS-16761:
---

sodonnel merged PR #5390:
URL: https://github.com/apache/hadoop/pull/5390




> Namenode UI for Datanodes page not loading if any data node is down
> ---
>
> Key: HDFS-16761
> URL: https://issues.apache.org/jira/browse/HDFS-16761
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.2
>Reporter: Krishna Reddy
>Assignee: Zita Dombi
>Priority: Major
>  Labels: pull-request-available
>
> Steps to reproduce:
> - Install the Hadoop components and add 3 datanodes
> - Enable NameNode HA
> - Open the NameNode UI and check the Datanodes page
> - Verify that all datanodes are displayed
> - Now take one datanode down
> - Wait 10 minutes for the heartbeat to expire
> - Refresh the NameNode page and check
>  
> Actual result: the UI shows the error message "NameNode is still loading. 
> Redirecting to the Startup Progress page."



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689217#comment-17689217
 ] 

ASF GitHub Bot commented on HDFS-16761:
---

sodonnel commented on PR #5390:
URL: https://github.com/apache/hadoop/pull/5390#issuecomment-1431626020

   Merged this into trunk. I wonder if a similar change is needed on branches 
3.3 and 3.2?




> Namenode UI for Datanodes page not loading if any data node is down
> ---
>
> Key: HDFS-16761
> URL: https://issues.apache.org/jira/browse/HDFS-16761
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.2
>Reporter: Krishna Reddy
>Assignee: Zita Dombi
>Priority: Major
>  Labels: pull-request-available
>
> Steps to reproduce:
> - Install the Hadoop components and add 3 datanodes
> - Enable NameNode HA
> - Open the NameNode UI and check the Datanodes page
> - Verify that all datanodes are displayed
> - Now take one datanode down
> - Wait 10 minutes for the heartbeat to expire
> - Refresh the NameNode page and check
>  
> Actual result: the UI shows the error message "NameNode is still loading. 
> Redirecting to the Startup Progress page."



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689209#comment-17689209
 ] 

ASF GitHub Bot commented on HDFS-16923:
---

xkrogen commented on code in PR #5400:
URL: https://github.com/apache/hadoop/pull/5400#discussion_r1107318715


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestObserverWithRouter.java:
##
@@ -146,6 +148,19 @@ public void testObserverRead() throws Exception {
 internalTestObserverRead();
   }
 
+  @Test
+  public void testGetListingWithObserverRead() throws Exception {

Review Comment:
   Feels to me that this test probably belongs in `TestObserverNode`? It's not 
really related to Router/federation.





> The getListing RPC will throw NPE if the path does not exist
> 
>
> Key: HDFS-16923
> URL: https://issues.apache.org/jira/browse/HDFS-16923
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> The getListing RPC will throw NPE if the path does not exist. And the stack 
> as below:
> {code:java}
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): 
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195)
>     at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16922) The logic of IncrementalBlockReportManager#addRDBI method may cause missing blocks when cluster is busy.

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689196#comment-17689196
 ] 

ASF GitHub Bot commented on HDFS-16922:
---

hfutatzhanghb commented on PR #5398:
URL: https://github.com/apache/hadoop/pull/5398#issuecomment-1431546384

   > Requires a UT which can reproduce the said issue
   
   Hi, @ayushtkn. I can reproduce the issue with a UT on version 3.3.x based on 
our production situation, but I cannot reproduce it on trunk because of 
[HDFS-16146](https://issues.apache.org/jira/browse/HDFS-16146). I still think 
the patch in this PR is useful for solving this problem. @Hexiaoqiao, could 
you please also take a look? Thanks.




> The logic of IncrementalBlockReportManager#addRDBI method may cause missing 
> blocks when cluster is busy.
> 
>
> Key: HDFS-16922
> URL: https://issues.apache.org/jira/browse/HDFS-16922
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: ZhangHB
>Priority: Major
>  Labels: pull-request-available
>
> The current logic of the IncrementalBlockReportManager#addRDBI method could 
> lead to missing blocks when the datanodes in a pipeline are I/O busy.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16916) Improve the use of JUnit Test in DFSClient

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689174#comment-17689174
 ] 

ASF GitHub Bot commented on HDFS-16916:
---

slfan1989 commented on PR #5404:
URL: https://github.com/apache/hadoop/pull/5404#issuecomment-1431486725

   LGTM.




> Improve the use of JUnit Test in DFSClient
> --
>
> Key: HDFS-16916
> URL: https://issues.apache.org/jira/browse/HDFS-16916
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient
>Affects Versions: 3.4.0
>Reporter: Hualong Zhang
>Priority: Minor
>  Labels: pull-request-available
>
> Improve the use of JUnit Test in DFSClient



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16916) Improve the use of JUnit Test in DFSClient

2023-02-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16916:
--
Labels: pull-request-available  (was: )

> Improve the use of JUnit Test in DFSClient
> --
>
> Key: HDFS-16916
> URL: https://issues.apache.org/jira/browse/HDFS-16916
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient
>Affects Versions: 3.4.0
>Reporter: Hualong Zhang
>Priority: Minor
>  Labels: pull-request-available
>
> Improve the use of JUnit Test in DFSClient



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16916) Improve the use of JUnit Test in DFSClient

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689166#comment-17689166
 ] 

ASF GitHub Bot commented on HDFS-16916:
---

zhtttylz opened a new pull request, #5404:
URL: https://github.com/apache/hadoop/pull/5404

   JIRA: HDFS-16916. Improve the use of JUnit Test in DFSClient




> Improve the use of JUnit Test in DFSClient
> --
>
> Key: HDFS-16916
> URL: https://issues.apache.org/jira/browse/HDFS-16916
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient
>Affects Versions: 3.4.0
>Reporter: Hualong Zhang
>Priority: Minor
>
> Improve the use of JUnit Test in DFSClient



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689140#comment-17689140
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

hadoop-yetus commented on PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1431427424

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m  2s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  30m 51s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 26s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |  20m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   3m 46s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 29s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 27s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 42s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m 10s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m 48s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  23m  9s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |  23m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 25s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  20m 25s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/1/artifact/out/blanks-eol.txt)
 |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   3m 37s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/1/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 8 new + 106 unchanged - 0 fixed = 114 total (was 
106)  |
   | +1 :green_heart: |  mvnsite  |   3m 25s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 18s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 38s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   6m 25s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m 47s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 14s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 208m  9s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   1m 15s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 455m 14s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   |   | hadoop.hdfs.server.namenode.TestAuditLogs |
   |   | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
   |   | hadoop.hdfs.server.namenode.TestFsck |
   |   | hadoop.hdfs.server.namenode.TestAuditLogger |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/1

[jira] [Commented] (HDFS-16922) The logic of IncrementalBlockReportManager#addRDBI method may cause missing blocks when cluster is busy.

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689104#comment-17689104
 ] 

ASF GitHub Bot commented on HDFS-16922:
---

hadoop-yetus commented on PR #5398:
URL: https://github.com/apache/hadoop/pull/5398#issuecomment-1431321727

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m 23s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  46m 25s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 31s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   1m 25s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  9s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 30s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 36s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  32m 38s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 54s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5398/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 2 unchanged - 
0 fixed = 3 total (was 2)  |
   | +1 :green_heart: |  mvnsite  |   2m 18s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 40s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  30m  2s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 234m 21s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5398/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 368m 37s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger |
   |   | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.server.namenode.TestAuditLogs |
   |   | hadoop.hdfs.server.namenode.TestFsck |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5398/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5398 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 7a1f03d8173f 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision |

[jira] [Commented] (HDFS-16924) Add libhdfs APIs for createFile

2023-02-15 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689085#comment-17689085
 ] 

Zoltán Borók-Nagy commented on HDFS-16924:
--

It seems libhdfs only has hdfsOpenFile(), which can be used to create new 
files. hdfsOpenFile() has a builder-based API, so it can probably already be 
used for this purpose.
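
For reference, a minimal Java-side sketch of the builder-based createFile() 
API from HADOOP-14365, which a libhdfs equivalent would presumably wrap. This 
assumes a Hadoop 3.x client on the classpath; the path and option values below 
are illustrative only:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateFileBuilderExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Builder-based create (HADOOP-14365): options are set fluently and
    // applied at build() time instead of via a long positional signature.
    try (FSDataOutputStream out = fs.createFile(new Path("/tmp/demo.txt"))
        .overwrite(true)
        .replication((short) 1)
        .blockSize(128L * 1024 * 1024)
        .build()) {
      out.writeBytes("hello");
    }
  }
}
{code}

A C-side createFile builder could mirror this shape the same way the 
HDFS-14478 openFile builder mirrors the Java openFile() API.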

> Add libhdfs APIs for createFile
> ---
>
> Key: HDFS-16924
> URL: https://issues.apache.org/jira/browse/HDFS-16924
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Reporter: Zoltán Borók-Nagy
>Priority: Major
>
> HDFS-14478 introduces builder-based APIs for openFile() based on HADOOP-15229.
> We should also add builder-based APIs for createFile() based on HADOOP-14365.
> This would be especially useful for object stores to tune performance of file 
> writes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16924) Add libhdfs APIs for createFile

2023-02-15 Thread Jira
Zoltán Borók-Nagy created HDFS-16924:


 Summary: Add libhdfs APIs for createFile
 Key: HDFS-16924
 URL: https://issues.apache.org/jira/browse/HDFS-16924
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fs
Reporter: Zoltán Borók-Nagy


HDFS-14478 introduces builder-based APIs for openFile() based on HADOOP-15229.

We should also add builder-based APIs for createFile() based on HADOOP-14365.

This would be especially useful for object stores to tune performance of file 
writes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689077#comment-17689077
 ] 

ASF GitHub Bot commented on HDFS-16761:
---

ayushtkn commented on PR #5390:
URL: https://github.com/apache/hadoop/pull/5390#issuecomment-1431272820

   Not sure; it could be something related to the OS or browser. Go ahead, folks.




> Namenode UI for Datanodes page not loading if any data node is down
> ---
>
> Key: HDFS-16761
> URL: https://issues.apache.org/jira/browse/HDFS-16761
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.2
>Reporter: Krishna Reddy
>Assignee: Zita Dombi
>Priority: Major
>  Labels: pull-request-available
>
> Steps to reproduce:
> - Install the Hadoop components and add 3 datanodes
> - Enable NameNode HA
> - Open the NameNode UI and check the Datanodes page
> - Verify that all datanodes are displayed
> - Now take one datanode down
> - Wait 10 minutes for the heartbeat to expire
> - Refresh the NameNode page and check
>  
> Actual result: the UI shows the error message "NameNode is still loading. 
> Redirecting to the Startup Progress page."



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689029#comment-17689029
 ] 

ASF GitHub Bot commented on HDFS-16923:
---

ZanderXu commented on PR #5400:
URL: https://github.com/apache/hadoop/pull/5400#issuecomment-1431139094

   @zhengchenyu @xkrogen Master, please help me review this NPE introduced by 
HDFS-16732.




> The getListing RPC will throw NPE if the path does not exist
> 
>
> Key: HDFS-16923
> URL: https://issues.apache.org/jira/browse/HDFS-16923
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> The getListing RPC will throw NPE if the path does not exist. And the stack 
> as below:
> {code:java}
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): 
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195)
>     at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689026#comment-17689026
 ] 

ASF GitHub Bot commented on HDFS-16923:
---

ZanderXu opened a new pull request, #5400:
URL: https://github.com/apache/hadoop/pull/5400

   ### Description of PR
   Jira: [HDFS-16923](https://issues.apache.org/jira/browse/HDFS-16923)
   
   The getListing RPC will throw an NPE if the path does not exist. 
   
   




> The getListing RPC will throw NPE if the path does not exist
> 
>
> Key: HDFS-16923
> URL: https://issues.apache.org/jira/browse/HDFS-16923
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>
> The getListing RPC will throw NPE if the path does not exist. And the stack 
> as below:
> {code:java}
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): 
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195)
>     at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist

2023-02-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16923:
--
Labels: pull-request-available  (was: )

> The getListing RPC will throw NPE if the path does not exist
> 
>
> Key: HDFS-16923
> URL: https://issues.apache.org/jira/browse/HDFS-16923
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> The getListing RPC will throw NPE if the path does not exist. And the stack 
> as below:
> {code:java}
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): 
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195)
>     at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist

2023-02-15 Thread ZanderXu (Jira)
ZanderXu created HDFS-16923:
---

 Summary: The getListing RPC will throw NPE if the path does not 
exist
 Key: HDFS-16923
 URL: https://issues.apache.org/jira/browse/HDFS-16923
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: ZanderXu
Assignee: ZanderXu


The getListing RPC will throw NPE if the path does not exist. And the stack as 
below:
{code:java}
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): 
org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
java.lang.NullPointerException
    at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421)
    at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783)
    at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622)
    at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590)
    at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574)
 {code}
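
As a self-contained illustration (hypothetical names, not the actual 
HDFS-16923 patch), here is a minimal sketch of the kind of null guard such a 
lookup needs: when the internal resolution returns null for a missing path, 
fail with a clear error instead of letting a later dereference surface as an 
NPE to the client:

{code:java}
import java.util.Arrays;
import java.util.List;

public class NullSafeListing {
  // Stand-in for an internal lookup that returns null when the path does
  // not exist (hypothetical; FSNamesystem's real lookup is more involved).
  static List<String> lookup(String path) {
    return "/exists".equals(path) ? Arrays.asList("a", "b") : null;
  }

  static List<String> getListing(String path) {
    List<String> listing = lookup(path);
    if (listing == null) {
      // Report a clean "not found" rather than dereferencing null later.
      throw new IllegalArgumentException("Path does not exist: " + path);
    }
    return listing;
  }

  public static void main(String[] args) {
    System.out.println(getListing("/exists")); // prints [a, b]
    try {
      getListing("/missing");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage()); // clean error instead of an NPE
    }
  }
}
{code}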



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689013#comment-17689013
 ] 

ASF GitHub Bot commented on HDFS-16761:
---

tasanuma commented on PR #5390:
URL: https://github.com/apache/hadoop/pull/5390#issuecomment-1431114419

   I also did the test. The issue was reproduced in both `dfshealth.html` and 
`federationhealth.html`, and I confirmed this PR fixed them.




> Namenode UI for Datanodes page not loading if any data node is down
> ---
>
> Key: HDFS-16761
> URL: https://issues.apache.org/jira/browse/HDFS-16761
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.2
>Reporter: Krishna Reddy
>Assignee: Zita Dombi
>Priority: Major
>  Labels: pull-request-available
>
> Steps to reproduce:
> - Install the Hadoop components and add 3 datanodes
> - Enable NameNode HA
> - Open the NameNode UI and check the Datanodes page
> - Verify that all datanodes are displayed
> - Now take one datanode down
> - Wait 10 minutes for the heartbeat to expire
> - Refresh the NameNode page and check
>  
> Actual result: the UI shows the error message "NameNode is still loading. 
> Redirecting to the Startup Progress page."



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16918) Optionally shut down datanode if it does not stay connected to active namenode

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688994#comment-17688994
 ] 

ASF GitHub Bot commented on HDFS-16918:
---

hadoop-yetus commented on PR #5396:
URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1431048438

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 53s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  46m 29s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 29s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 30s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 31s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  28m 26s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 54s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 124 unchanged 
- 0 fixed = 125 total (was 124)  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 53s | 
[/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/1/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt)
 |  hadoop-hdfs in the patch failed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 26s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  28m 27s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 229m 42s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 42s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 355m 51s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
   |   | hadoop.hdfs.server.namenode.TestAuditLogs |
   |   | hadoop.hdfs.server.namenode.TestAuditLogger |
   |   | hadoop.tools.TestHdfsConfigFields |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.server.namenode.ha.TestObserverNode |
   |   | hadoop.hdfs.server.namenode.TestFsck |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5396 |
   | Optional Tests | dupnam

[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down

2023-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688972#comment-17688972
 ] 

ASF GitHub Bot commented on HDFS-16761:
---

sodonnel commented on PR #5390:
URL: https://github.com/apache/hadoop/pull/5390#issuecomment-1431014524

   For me the issue reproduces. I built trunk and started a local Docker 
cluster with 3 nodes. As soon as one of those nodes goes dead, the Datanodes tab 
just redirects to the "startup in progress" page. Tried on Firefox and Safari, 
and both behave the same.




> Namenode UI for Datanodes page not loading if any data node is down
> ---
>
> Key: HDFS-16761
> URL: https://issues.apache.org/jira/browse/HDFS-16761
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.2
>Reporter: Krishna Reddy
>Assignee: Zita Dombi
>Priority: Major
>  Labels: pull-request-available
>
> Steps to reproduce:
> - Install the Hadoop components and add 3 datanodes
> - Enable NameNode HA
> - Open the NameNode UI and check the Datanodes page
> - Verify that all datanodes are displayed
> - Now take one datanode down
> - Wait 10 minutes for the heartbeat to expire
> - Refresh the NameNode page and check
>  
> Actual result: the UI shows the error message "NameNode is still loading. 
> Redirecting to the Startup Progress page."



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] (HDFS-16799) The dn space size is not consistent, and Balancer can not work, resulting in a very unbalanced space

2023-02-15 Thread ruiliang (Jira)


[ https://issues.apache.org/jira/browse/HDFS-16799 ]


ruiliang deleted comment on HDFS-16799:
-

was (Author: ruilaing):
ok

> The dn space size is not consistent, and Balancer can not work, resulting in 
> a very unbalanced space
> 
>
> Key: HDFS-16799
> URL: https://issues.apache.org/jira/browse/HDFS-16799
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.0
>Reporter: ruiliang
>Priority: Blocker
>
>  
> {code:java}
> echo 'A DFS Used 99.8% to ip' > sorucehost  
> hdfs --debug  balancer  -fs hdfs://xxcluster06  -threshold 10 -source -f 
> sorucehost  
> 
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.243:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.247:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-15-10/10.12.65.214:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-02-08/10.12.14.8:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-13/10.12.15.154:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-04/10.12.65.218:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.143:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-05/10.12.12.200:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.217:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.142:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.246:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.219:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.147:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-15-10/10.12.65.186:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-13/10.12.15.153:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-03-07/10.12.19.23:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-04-14/10.12.65.119:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.131:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-04/10.12.12.210:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-11/10.12.14.168:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.245:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-03-02/10.12.17.26:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.241:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-13/10.12.15.152:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.249:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-07-14/10.12.64.71:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-03-03/10.12.17.35:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.195:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.242:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.248:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.240:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-15-12/10.12.65.196:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-13/10.12.15.150:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.222:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.145:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.244:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-03-07/10.12.19.22:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.221:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.136:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.129:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-15/10.12.15.163:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-07-14/10.12.64.72:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a 

[jira] [Commented] (HDFS-16799) The dn space size is not consistent, and Balancer can not work, resulting in a very unbalanced space

2023-02-15 Thread ruiliang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688914#comment-17688914
 ] 

ruiliang commented on HDFS-16799:
-

ok

> The dn space size is not consistent, and Balancer can not work, resulting in 
> a very unbalanced space
> 
>
> Key: HDFS-16799
> URL: https://issues.apache.org/jira/browse/HDFS-16799
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.0
>Reporter: ruiliang
>Priority: Blocker
>
>  
> {code:java}
> echo 'A DFS Used 99.8% to ip' > sorucehost  
> hdfs --debug  balancer  -fs hdfs://xxcluster06  -threshold 10 -source -f 
> sorucehost  
> 
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.243:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.247:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-15-10/10.12.65.214:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-02-08/10.12.14.8:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-13/10.12.15.154:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-04/10.12.65.218:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.143:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-05/10.12.12.200:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.217:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.142:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.246:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.219:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.147:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-15-10/10.12.65.186:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-13/10.12.15.153:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-03-07/10.12.19.23:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-04-14/10.12.65.119:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.131:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-04/10.12.12.210:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-11/10.12.14.168:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.245:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-03-02/10.12.17.26:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.241:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-13/10.12.15.152:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.249:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-07-14/10.12.64.71:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-03-03/10.12.17.35:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.195:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.242:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.248:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.240:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-15-12/10.12.65.196:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-13/10.12.15.150:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.222:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.145:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-01-08/10.12.65.244:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-03-07/10.12.19.22:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.221:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.136:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-12-03/10.12.65.129:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-15/10.12.15.163:1019
> 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-0