[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689582#comment-17689582 ]

ASF GitHub Bot commented on HDFS-16917:
---
hadoop-yetus commented on PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1432641907

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 38s | | Docker mode activated. |
| | _ Prechecks _ | | | |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | _ trunk Compile Tests _ | | | |
| +0 :ok: | mvndep | 16m 0s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 31m 9s | | trunk passed |
| +1 :green_heart: | compile | 23m 14s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | compile | 20m 30s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 3m 45s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 27s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 22s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 2m 34s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 6m 18s | | trunk passed |
| +1 :green_heart: | shadedclient | 25m 57s | | branch has no errors when building and testing our client artifacts. |
| | _ Patch Compile Tests _ | | | |
| +0 :ok: | mvndep | 0m 28s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 2m 32s | | the patch passed |
| +1 :green_heart: | compile | 22m 55s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javac | 22m 55s | | the patch passed |
| +1 :green_heart: | compile | 20m 41s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 20m 41s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 3m 37s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/6/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 130 unchanged - 0 fixed = 132 total (was 130) |
| +1 :green_heart: | mvnsite | 3m 29s | | the patch passed |
| +1 :green_heart: | javadoc | 2m 21s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 2m 42s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 6m 22s | | the patch passed |
| +1 :green_heart: | shadedclient | 26m 29s | | patch has no errors when building and testing our client artifacts. |
| | _ Other Tests _ | | | |
| +1 :green_heart: | unit | 18m 15s | | hadoop-common in the patch passed. |
| -1 :x: | unit | 205m 59s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch failed. |
| +1 :green_heart: | asflicense | 1m 14s | | The patch does not generate ASF License warnings. |
| | | | 452m 40s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger |
| | hadoop.hdfs.server.namenode.TestFsck |
| | hadoop.hdfs.server.namenode.TestAuditLogs |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/6/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5397 |
| Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvni
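As context for the metric under review: a transfer-rate quantile tracks bytes per second per read rather than raw byte counts. The following is a hypothetical sketch, not the actual patch; the class and method names are illustrative and show only how one read could be turned into a rate sample suitable for a quantile estimator.

```java
// Illustrative sketch (hypothetical names, not the HDFS-16917 patch):
// derive a transfer-rate sample in bytes/second from one DataNode read.
class TransferRate {
    /** Rate in bytes/second, guarding against a zero-millisecond duration. */
    static long rateBytesPerSec(long bytes, long durationMillis) {
        if (durationMillis == 0) {
            durationMillis = 1;  // very fast reads: avoid division by zero
        }
        return bytes * 1000 / durationMillis;
    }
}
```

Each such sample would then be fed to a quantile estimator so that p50/p75/p90-style read-throughput percentiles can be reported per rollover interval.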
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689577#comment-17689577 ]

ASF GitHub Bot commented on HDFS-16917:
---
hadoop-yetus commented on PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1432623582

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 39s | | Docker mode activated. |
| | _ Prechecks _ | | | |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | _ trunk Compile Tests _ | | | |
| +0 :ok: | mvndep | 15m 31s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 31m 5s | | trunk passed |
| +1 :green_heart: | compile | 23m 4s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | compile | 20m 30s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 3m 55s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 29s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 28s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 2m 43s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 6m 11s | | trunk passed |
| +1 :green_heart: | shadedclient | 26m 30s | | branch has no errors when building and testing our client artifacts. |
| | _ Patch Compile Tests _ | | | |
| +0 :ok: | mvndep | 0m 28s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 2m 32s | | the patch passed |
| +1 :green_heart: | compile | 22m 34s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javac | 22m 34s | | the patch passed |
| +1 :green_heart: | compile | 20m 34s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 20m 34s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 3m 36s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/5/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 131 unchanged - 0 fixed = 133 total (was 131) |
| +1 :green_heart: | mvnsite | 3m 24s | | the patch passed |
| +1 :green_heart: | javadoc | 2m 21s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 2m 42s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 6m 20s | | the patch passed |
| +1 :green_heart: | shadedclient | 26m 38s | | patch has no errors when building and testing our client artifacts. |
| | _ Other Tests _ | | | |
| +1 :green_heart: | unit | 18m 22s | | hadoop-common in the patch passed. |
| -1 :x: | unit | 204m 26s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch failed. |
| +1 :green_heart: | asflicense | 1m 10s | | The patch does not generate ASF License warnings. |
| | | | 450m 54s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogs |
| | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
| | hadoop.hdfs.server.namenode.TestFsck |
| | hadoop.hdfs.server.namenode.TestAuditLogger |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/5/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5397 |
| Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle |
| uname |
[jira] [Commented] (HDFS-16922) The logic of IncrementalBlockReportManager#addRDBI method may cause missing blocks when cluster is busy.
[ https://issues.apache.org/jira/browse/HDFS-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689554#comment-17689554 ]

ASF GitHub Bot commented on HDFS-16922:
---
hfutatzhanghb commented on PR #5398:
URL: https://github.com/apache/hadoop/pull/5398#issuecomment-1432581037

> > Thanks for involving me here. It is an interesting issue. I am confused about some points of the description.
> >
> > > dn3 is writing blk_12345_002, but dn2 is blocked by the recoverClose method and does not send an ack to the client.
> >
> > Is this another fault injection, or is it part of this write flow?
> >
> > > dn3 writes blk_12345_003 successfully.
> > > dn3 writes blk_12345_002 successfully and calls notifyNamenodeReceivedBlock.
> >
> > Here dn3 writes the same block replica twice; is that expected? Sorry, I haven't dug deeply into this logic yet; I will trace it for a while. @hfutatzhanghb Thanks again for your report and for offering the solution.

Hi @Hexiaoqiao, thanks for your reply.

For question 1: dn2 is blocked by recoverClose() because of the datasetWriteLock acquisition in branch-3.3.2.

For question 2: yes, dn3 writes the same block replica twice, but the two replicas have different generation stamps. When blk_12345_003 and blk_12345_002 are written in the same IBR interval, IncrementalBlockReportManager#addRDBI removes the report of blk_12345_003.

> The logic of the IncrementalBlockReportManager#addRDBI method may cause
> missing blocks when the cluster is busy.
> -----------------------------------------------------------------------
>
> Key: HDFS-16922
> URL: https://issues.apache.org/jira/browse/HDFS-16922
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: ZhangHB
> Priority: Major
> Labels: pull-request-available
>
> The current logic of the IncrementalBlockReportManager#addRDBI method can
> lead to missing blocks when the datanodes in the pipeline are I/O busy.
--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
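The exchange above describes two replicas of the same block, with different generation stamps, landing in the same incremental-block-report interval, after which the report for the newer one is dropped. A minimal sketch of that failure mode, with hypothetical names rather than the real addRDBI implementation, looks like this: if pending reports are keyed by block ID alone, enqueueing the older-genstamp report evicts the newer one.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch (hypothetical names, not the real
// IncrementalBlockReportManager): pending IBRs keyed by block ID alone.
class PendingIbrQueue {
    // blockId -> generation stamp of the report waiting to be sent
    private final Map<Long, Long> pending = new HashMap<>();

    void addReport(long blockId, long genStamp) {
        // put() keyed only on blockId silently replaces any report already
        // queued for the same block, regardless of which genstamp is newer
        pending.put(blockId, genStamp);
    }

    Long queuedGenStamp(long blockId) {
        return pending.get(blockId);
    }
}
```

With this sketch, queueing a report for generation stamp 3 and then one for generation stamp 2 within one interval leaves only the genstamp-2 entry, so the NameNode never hears about the newer replica in that interval.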
[jira] [Commented] (HDFS-16898) Remove write lock for processCommandFromActor of DataNode to reduce impact on heartbeat
[ https://issues.apache.org/jira/browse/HDFS-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689507#comment-17689507 ]

ASF GitHub Bot commented on HDFS-16898:
---
Hexiaoqiao commented on PR #5408:
URL: https://github.com/apache/hadoop/pull/5408#issuecomment-1432502138

Updated the title; let's wait and see what Yetus says.

> Remove write lock for processCommandFromActor of DataNode to reduce impact on
> heartbeat
> -----------------------------------------------------------------------------
>
> Key: HDFS-16898
> URL: https://issues.apache.org/jira/browse/HDFS-16898
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.3.4
> Reporter: ZhangHB
> Assignee: ZhangHB
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Now in the method processCommandFromActor, we have code like below:
>
> {code:java}
> writeLock();
> try {
>   if (actor == bpServiceToActive) {
>     return processCommandFromActive(cmd, actor);
>   } else {
>     return processCommandFromStandby(cmd, actor);
>   }
> } finally {
>   writeUnlock();
> } {code}
> If processCommandFromActive takes a long time, the write lock is not
> released.
>
> This can block the updateActorStatesFromHeartbeat method in offerService;
> furthermore, it can drive the datanode's lastContact very high, even marking
> the node dead once lastContact exceeds 600s.
> {code:java}
> bpos.updateActorStatesFromHeartbeat(
>     this, resp.getNameNodeHaState());{code}
> Here we can make the write lock fine-grained in the processCommandFromActor
> method to address this problem.
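One way to make the lock fine-grained, as the description above proposes, is to hold the lock only long enough to read the shared actor state and to run the slow command processing outside the critical section. The following is a minimal self-contained sketch under that assumption; class and method bodies are illustrative stand-ins, not the actual Hadoop patch.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch (names illustrative, not the HDFS-16898 patch):
// inspect shared state under a short read lock, then do the potentially
// slow command processing with no lock held, so heartbeat handling that
// needs the lock is not blocked.
class ActorDispatcher {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private volatile Object bpServiceToActive;  // shared state guarded by the lock

    ActorDispatcher(Object active) {
        this.bpServiceToActive = active;
    }

    boolean processCommandFromActor(Object cmd, Object actor) {
        final boolean fromActive;
        lock.readLock().lock();  // a read lock suffices to inspect the state
        try {
            fromActive = (actor == bpServiceToActive);
        } finally {
            lock.readLock().unlock();
        }
        // The long-running work happens outside the lock.
        return fromActive ? processCommandFromActive(cmd)
                          : processCommandFromStandby(cmd);
    }

    // Stand-ins for the real command handlers.
    private boolean processCommandFromActive(Object cmd)  { return true;  }
    private boolean processCommandFromStandby(Object cmd) { return false; }
}
```

The design trade-off: the actor comparison may race with a failover that happens after the lock is released, so this refactor is only safe if the command handlers tolerate a stale active/standby decision.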
[jira] [Commented] (HDFS-16922) The logic of IncrementalBlockReportManager#addRDBI method may cause missing blocks when cluster is busy.
[ https://issues.apache.org/jira/browse/HDFS-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689505#comment-17689505 ]

ASF GitHub Bot commented on HDFS-16922:
---
Hexiaoqiao commented on PR #5398:
URL: https://github.com/apache/hadoop/pull/5398#issuecomment-1432500654

Addendum:

> Requires a UT which can reproduce the said issue.

What Ayushtkn means here is that we should add new unit tests (test source code, such as TestClientProtocolForPipelineRecovery from HDFS-16146 mentioned above). Thanks.

> The logic of the IncrementalBlockReportManager#addRDBI method may cause
> missing blocks when the cluster is busy.
> -----------------------------------------------------------------------
>
> Key: HDFS-16922
> URL: https://issues.apache.org/jira/browse/HDFS-16922
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: ZhangHB
> Priority: Major
> Labels: pull-request-available
>
> The current logic of the IncrementalBlockReportManager#addRDBI method can
> lead to missing blocks when the datanodes in the pipeline are I/O busy.
[jira] [Commented] (HDFS-16922) The logic of IncrementalBlockReportManager#addRDBI method may cause missing blocks when cluster is busy.
[ https://issues.apache.org/jira/browse/HDFS-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689503#comment-17689503 ]

ASF GitHub Bot commented on HDFS-16922:
---
Hexiaoqiao commented on PR #5398:
URL: https://github.com/apache/hadoop/pull/5398#issuecomment-1432498891

Thanks for involving me here. It is an interesting issue. I am confused about some points of the description.

> dn3 is writing blk_12345_002, but dn2 is blocked by the recoverClose method and does not send an ack to the client.

Is this another fault injection, or is it part of this write flow?

> dn3 writes blk_12345_003 successfully.
> dn3 writes blk_12345_002 successfully and calls notifyNamenodeReceivedBlock.

Here dn3 writes the same block replica twice; is that expected? Sorry, I haven't dug deeply into this logic yet; I will trace it for a while. @hfutatzhanghb Thanks again for your report and for offering the solution.

> The logic of the IncrementalBlockReportManager#addRDBI method may cause
> missing blocks when the cluster is busy.
> -----------------------------------------------------------------------
>
> Key: HDFS-16922
> URL: https://issues.apache.org/jira/browse/HDFS-16922
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: ZhangHB
> Priority: Major
> Labels: pull-request-available
>
> The current logic of the IncrementalBlockReportManager#addRDBI method can
> lead to missing blocks when the datanodes in the pipeline are I/O busy.
[jira] [Commented] (HDFS-16898) Remove write lock for processCommandFromActor of DataNode to reduce impact on heartbeat
[ https://issues.apache.org/jira/browse/HDFS-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689486#comment-17689486 ]

ASF GitHub Bot commented on HDFS-16898:
---
hfutatzhanghb opened a new pull request, #5408:
URL: https://github.com/apache/hadoop/pull/5408

https://github.com/apache/hadoop/pull/5330

> Remove write lock for processCommandFromActor of DataNode to reduce impact on
> heartbeat
> -----------------------------------------------------------------------------
>
> Key: HDFS-16898
> URL: https://issues.apache.org/jira/browse/HDFS-16898
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.3.4
> Reporter: ZhangHB
> Assignee: ZhangHB
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Now in the method processCommandFromActor, we have code like below:
>
> {code:java}
> writeLock();
> try {
>   if (actor == bpServiceToActive) {
>     return processCommandFromActive(cmd, actor);
>   } else {
>     return processCommandFromStandby(cmd, actor);
>   }
> } finally {
>   writeUnlock();
> } {code}
> If processCommandFromActive takes a long time, the write lock is not
> released.
>
> This can block the updateActorStatesFromHeartbeat method in offerService;
> furthermore, it can drive the datanode's lastContact very high, even marking
> the node dead once lastContact exceeds 600s.
> {code:java}
> bpos.updateActorStatesFromHeartbeat(
>     this, resp.getNameNodeHaState());{code}
> Here we can make the write lock fine-grained in the processCommandFromActor
> method to address this problem.
[jira] [Commented] (HDFS-16898) Remove write lock for processCommandFromActor of DataNode to reduce impact on heartbeat
[ https://issues.apache.org/jira/browse/HDFS-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689485#comment-17689485 ]

ASF GitHub Bot commented on HDFS-16898:
---
hfutatzhanghb commented on PR #5330:
URL: https://github.com/apache/hadoop/pull/5330#issuecomment-1432454503

> @hfutatzhanghb This PR could not be cherry-picked to branch-3.3 smoothly. Would you mind submitting another PR for branch-3.3?

@Hexiaoqiao, done. Please have a look, thanks.

> Remove write lock for processCommandFromActor of DataNode to reduce impact on
> heartbeat
> -----------------------------------------------------------------------------
>
> Key: HDFS-16898
> URL: https://issues.apache.org/jira/browse/HDFS-16898
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.3.4
> Reporter: ZhangHB
> Assignee: ZhangHB
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Now in the method processCommandFromActor, we have code like below:
>
> {code:java}
> writeLock();
> try {
>   if (actor == bpServiceToActive) {
>     return processCommandFromActive(cmd, actor);
>   } else {
>     return processCommandFromStandby(cmd, actor);
>   }
> } finally {
>   writeUnlock();
> } {code}
> If processCommandFromActive takes a long time, the write lock is not
> released.
>
> This can block the updateActorStatesFromHeartbeat method in offerService;
> furthermore, it can drive the datanode's lastContact very high, even marking
> the node dead once lastContact exceeds 600s.
> {code:java}
> bpos.updateActorStatesFromHeartbeat(
>     this, resp.getNameNodeHaState());{code}
> Here we can make the write lock fine-grained in the processCommandFromActor
> method to address this problem.
[jira] [Commented] (HDFS-16918) Optionally shut down datanode if it does not stay connected to active namenode
[ https://issues.apache.org/jira/browse/HDFS-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689481#comment-17689481 ]

ASF GitHub Bot commented on HDFS-16918:
---
hadoop-yetus commented on PR #5396:
URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1432445119

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 1m 22s | | Docker mode activated. |
| | _ Prechecks _ | | | |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
| | _ trunk Compile Tests _ | | | |
| +1 :green_heart: | mvninstall | 50m 49s | | trunk passed |
| +1 :green_heart: | compile | 1m 28s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | compile | 1m 24s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 1m 8s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 29s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 8s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 1m 32s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 35s | | trunk passed |
| +1 :green_heart: | shadedclient | 29m 26s | | branch has no errors when building and testing our client artifacts. |
| | _ Patch Compile Tests _ | | | |
| +1 :green_heart: | mvninstall | 1m 30s | | the patch passed |
| +1 :green_heart: | compile | 1m 23s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javac | 1m 23s | | the patch passed |
| +1 :green_heart: | compile | 1m 13s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 13s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 54s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 325 unchanged - 0 fixed = 326 total (was 325) |
| +1 :green_heart: | mvnsite | 1m 23s | | the patch passed |
| -1 :x: | javadoc | 0m 53s | [/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/2/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. |
| +1 :green_heart: | javadoc | 1m 26s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 29s | | the patch passed |
| +1 :green_heart: | shadedclient | 29m 11s | | patch has no errors when building and testing our client artifacts. |
| | _ Other Tests _ | | | |
| -1 :x: | unit | 251m 57s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch failed. |
| +1 :green_heart: | asflicense | 0m 43s | | The patch does not generate ASF License warnings. |
| | | | 385m 0s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger |
| | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
| | hadoop.hdfs.server.namenode.TestAuditLogs |
| | hadoop.hdfs.server.namenode.TestFsck |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5396 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
| uname | Linux 5d0f90e11c93 4.15.0-20
[jira] [Commented] (HDFS-16922) The logic of IncrementalBlockReportManager#addRDBI method may cause missing blocks when cluster is busy.
[ https://issues.apache.org/jira/browse/HDFS-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689478#comment-17689478 ]

ASF GitHub Bot commented on HDFS-16922:
---
hfutatzhanghb commented on PR #5398:
URL: https://github.com/apache/hadoop/pull/5398#issuecomment-1432442616

Hi @jojochuang, @Hexiaoqiao, @zhangshuyan0, this PR seems to be another supplement to [HDFS-16146](https://issues.apache.org/jira/browse/HDFS-16146); could you please take a look? Thanks all.

> The logic of the IncrementalBlockReportManager#addRDBI method may cause
> missing blocks when the cluster is busy.
> -----------------------------------------------------------------------
>
> Key: HDFS-16922
> URL: https://issues.apache.org/jira/browse/HDFS-16922
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: ZhangHB
> Priority: Major
> Labels: pull-request-available
>
> The current logic of the IncrementalBlockReportManager#addRDBI method can
> lead to missing blocks when the datanodes in the pipeline are I/O busy.
[jira] [Commented] (HDFS-16925) Fix regex pattern for namenode audit log tests
[ https://issues.apache.org/jira/browse/HDFS-16925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689449#comment-17689449 ]

ASF GitHub Bot commented on HDFS-16925:
---
hadoop-yetus commented on PR #5407:
URL: https://github.com/apache/hadoop/pull/5407#issuecomment-1432379608

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 51s | | Docker mode activated. |
| | _ Prechecks _ | | | |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. |
| | _ trunk Compile Tests _ | | | |
| +1 :green_heart: | mvninstall | 49m 10s | | trunk passed |
| +1 :green_heart: | compile | 1m 29s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | compile | 1m 21s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 1m 5s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 29s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 8s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 1m 28s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 40s | | trunk passed |
| +1 :green_heart: | shadedclient | 28m 42s | | branch has no errors when building and testing our client artifacts. |
| | _ Patch Compile Tests _ | | | |
| +1 :green_heart: | mvninstall | 1m 30s | | the patch passed |
| +1 :green_heart: | compile | 1m 25s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javac | 1m 25s | | the patch passed |
| +1 :green_heart: | compile | 1m 13s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 13s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 51s | | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 59 unchanged - 1 fixed = 59 total (was 60) |
| +1 :green_heart: | mvnsite | 1m 22s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 52s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 1m 23s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 30s | | the patch passed |
| +1 :green_heart: | shadedclient | 28m 44s | | patch has no errors when building and testing our client artifacts. |
| | _ Other Tests _ | | | |
| -1 :x: | unit | 218m 14s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5407/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch failed. |
| +1 :green_heart: | asflicense | 0m 45s | | The patch does not generate ASF License warnings. |
| | | | 347m 43s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5407/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5407 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux cbb1315f256b 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 3bf75df6d997563e9aaea7af30c58dd9ae4729a8 |
| Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5407/1/testReport/ |
| Max. process+thread count | 2440 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-
[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down
[ https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689443#comment-17689443 ]

ASF GitHub Bot commented on HDFS-16761:
---
tomscut commented on PR #5390:
URL: https://github.com/apache/hadoop/pull/5390#issuecomment-1432358750

Sorry for introducing this problem. Thank you all.

> Namenode UI for Datanodes page not loading if any data node is down
> -------------------------------------------------------------------
>
> Key: HDFS-16761
> URL: https://issues.apache.org/jira/browse/HDFS-16761
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.2.2
> Reporter: Krishna Reddy
> Assignee: Zita Dombi
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Steps to reproduce:
> - Install the hadoop components and add 3 datanodes
> - Enable namenode HA
> - Open the Namenode UI and check the Datanodes page
> - Check that all datanodes are displayed
> - Now take one datanode down
> - Wait 10 minutes so the heartbeat expires
> - Refresh the namenode page and check
>
> Actual result: it shows the error message "NameNode is still loading.
> Redirecting to the Startup Progress page."
[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down
[ https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689440#comment-17689440 ]

ASF GitHub Bot commented on HDFS-16761:
---
tasanuma commented on PR #5390:
URL: https://github.com/apache/hadoop/pull/5390#issuecomment-1432353003

Thanks for merging it. The issue doesn't reproduce in branch-3.3. It seems to be caused by HDFS-16203, which is only in trunk.

> Namenode UI for Datanodes page not loading if any data node is down
> -------------------------------------------------------------------
>
> Key: HDFS-16761
> URL: https://issues.apache.org/jira/browse/HDFS-16761
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.2.2
> Reporter: Krishna Reddy
> Assignee: Zita Dombi
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Steps to reproduce:
> - Install the hadoop components and add 3 datanodes
> - Enable namenode HA
> - Open the Namenode UI and check the Datanodes page
> - Check that all datanodes are displayed
> - Now take one datanode down
> - Wait 10 minutes so the heartbeat expires
> - Refresh the namenode page and check
>
> Actual result: it shows the error message "NameNode is still loading.
> Redirecting to the Startup Progress page."
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689439#comment-17689439 ] ASF GitHub Bot commented on HDFS-16917: --- hadoop-yetus commented on PR #5397: URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1432344941 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 49s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 27m 9s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 32m 29s | | trunk passed | | +1 :green_heart: | compile | 25m 46s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | compile | 22m 11s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | checkstyle | 4m 1s | | trunk passed | | +1 :green_heart: | mvnsite | 3m 18s | | trunk passed | | +1 :green_heart: | javadoc | 2m 20s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 32s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 6m 5s | | trunk passed | | +1 :green_heart: | shadedclient | 27m 2s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 30s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 32s | | the patch passed | | +1 :green_heart: | compile | 25m 19s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javac | 25m 19s | | the patch passed | | +1 :green_heart: | compile | 22m 14s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | javac | 22m 14s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 4m 4s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/4/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 105 unchanged - 0 fixed = 107 total (was 105) | | +1 :green_heart: | mvnsite | 3m 25s | | the patch passed | | +1 :green_heart: | javadoc | 2m 12s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 32s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 6m 25s | | the patch passed | | +1 :green_heart: | shadedclient | 27m 12s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 19m 8s | | hadoop-common in the patch passed. | | -1 :x: | unit | 211m 18s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 1m 15s | | The patch does not generate ASF License warnings. 
| | | | 481m 9s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger | | | hadoop.hdfs.server.namenode.TestFsck | | | hadoop.hdfs.server.namenode.TestAuditLogs | | | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/5397 | | Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle | | uname |
[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read
[ https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689437#comment-17689437 ] ASF GitHub Bot commented on HDFS-16896: --- mccormickt12 commented on code in PR #5322: URL: https://github.com/apache/hadoop/pull/5322#discussion_r1107919545 ## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java: ## @@ -1337,7 +1352,11 @@ private void hedgedFetchBlockByteRange(LocatedBlock block, long start, } catch (InterruptedException ie) { // Ignore and retry } -if (refetch) { +// if refetch is true then all nodes are in deadlist or ignorelist +// we should loop through all futures and remove them so we do not Review Comment: fixed comments. deadlist is actually deadNodes (I fixed that comment as well.) When connections fail (in both hedged and non hedged code path) nodes are added to the deadNodes collection to try other nodes. Once `chooseDataNode` returns `null` (or more accurately `getBestNodeDNAddrPair`) it calls `refetchLocations` which clears the deadNodes `clearLocalDeadNodes()` and now with my change, also clears the ignore list. Note we have added an assumption to this method `refetchLocations`. The comment I added to `refetchLocations` ``` /** * RefetchLocations should only be called when there are no active requests * to datanodes. In the hedged read case this means futures should be empty */ ``` > HDFS Client hedged read has increased failure rate than without hedged read > --- > > Key: HDFS-16896 > URL: https://issues.apache.org/jira/browse/HDFS-16896 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Tom McCormick >Assignee: Tom McCormick >Priority: Major > Labels: pull-request-available > > When hedged read is enabled by HDFS client, we see an increased failure rate > on reads. 
> *stacktrace* > > {code:java} > Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain > block: BP-1183972111-10.197.192.88-1590025572374:blk_17114848218_16043459722 > file=/data/tracking/streaming/AdImpressionEvent/daily/2022/07/18/compaction_1/part-r-1914862.1658217125623.1362294472.orc > at > org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1077) > at > org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1060) > at > org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1039) > at > org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1365) > at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1572) > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1535) > at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > org.apache.hadoop.fs.RetryingInputStream.lambda$readFully$3(RetryingInputStream.java:172) > at org.apache.hadoop.fs.RetryPolicy.lambda$run$0(RetryPolicy.java:137) > at org.apache.hadoop.fs.NoOpRetryPolicy.run(NoOpRetryPolicy.java:36) > at org.apache.hadoop.fs.RetryPolicy.run(RetryPolicy.java:136) > at > org.apache.hadoop.fs.RetryingInputStream.readFully(RetryingInputStream.java:168) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > io.trino.plugin.hive.orc.HdfsOrcDataSource.readInternal(HdfsOrcDataSource.java:76) > ... 46 more > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
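[Editorial note] The invariant described in the review comment above (clear both deadNodes and the hedged-read ignore list, and only when no requests are in flight) can be sketched in a few lines. All names below (HedgedReadSketch, markDead, markIgnored) are hypothetical; this is not the actual DFSInputStream code, only an illustration of the pattern under discussion.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; not the real DFSInputStream implementation.
public class HedgedReadSketch {
  private final Set<String> deadNodes = new HashSet<>();    // nodes whose connections failed
  private final Set<String> ignoredNodes = new HashSet<>(); // nodes already tried by a hedged read
  private final List<Object> futures = new ArrayList<>();   // in-flight hedged requests

  void markDead(String dn) { deadNodes.add(dn); }
  void markIgnored(String dn) { ignoredNodes.add(dn); }

  /** Return the first usable node, or null when every node is dead or ignored. */
  String chooseDataNode(List<String> located) {
    for (String dn : located) {
      if (!deadNodes.contains(dn) && !ignoredNodes.contains(dn)) {
        return dn;
      }
    }
    return null;
  }

  /**
   * Reset before re-asking for block locations. Per the comment quoted in
   * the review, this should only be called when there are no active
   * requests to datanodes (futures is empty), and it must clear the
   * ignore list as well as deadNodes; otherwise hedged reads can exhaust
   * all replicas and fail spuriously.
   */
  void refetchLocations() {
    if (!futures.isEmpty()) {
      throw new IllegalStateException("refetch while hedged reads are in flight");
    }
    deadNodes.clear();
    ignoredNodes.clear();
  }
}
```

With this reset in place, a read that has pushed every replica into deadNodes or ignoredNodes gets a fresh chance after refetchLocations() instead of surfacing a BlockMissingException.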
[jira] [Commented] (HDFS-13224) RBF: Resolvers to support mount points across multiple subclusters
[ https://issues.apache.org/jira/browse/HDFS-13224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689436#comment-17689436 ] Íñigo Goiri commented on HDFS-13224: [~Daniel Ma], it's been almost five years so I'm having a hard time finding design docs. We added some documentation explaining the idea here: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs-rbf/HDFSRouterFederation.html In the `Multiple subclusters` section it explains some of the use cases. > RBF: Resolvers to support mount points across multiple subclusters > -- > > Key: HDFS-13224 > URL: https://issues.apache.org/jira/browse/HDFS-13224 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.3 > > Attachments: HDFS-13224-branch-2.000.patch, HDFS-13224.000.patch, > HDFS-13224.001.patch, HDFS-13224.002.patch, HDFS-13224.003.patch, > HDFS-13224.004.patch, HDFS-13224.005.patch, HDFS-13224.006.patch, > HDFS-13224.007.patch, HDFS-13224.008.patch, HDFS-13224.009.patch, > HDFS-13224.010.patch > > > Currently, a mount point points to a single subcluster. We should be able to > spread files in a mount point across subclusters. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13224) RBF: Resolvers to support mount points across multiple subclusters
[ https://issues.apache.org/jira/browse/HDFS-13224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689432#comment-17689432 ] Daniel Ma commented on HDFS-13224: -- [~elgoiri] Could you pls share the design doc for this feature. No idea what kind of scenario need to cross subclusters. Thanks > RBF: Resolvers to support mount points across multiple subclusters > -- > > Key: HDFS-13224 > URL: https://issues.apache.org/jira/browse/HDFS-13224 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.3 > > Attachments: HDFS-13224-branch-2.000.patch, HDFS-13224.000.patch, > HDFS-13224.001.patch, HDFS-13224.002.patch, HDFS-13224.003.patch, > HDFS-13224.004.patch, HDFS-13224.005.patch, HDFS-13224.006.patch, > HDFS-13224.007.patch, HDFS-13224.008.patch, HDFS-13224.009.patch, > HDFS-13224.010.patch > > > Currently, a mount point points to a single subcluster. We should be able to > spread files in a mount point across subclusters. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689433#comment-17689433 ] ASF GitHub Bot commented on HDFS-16917: --- hadoop-yetus commented on PR #5397: URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1432335476 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 48s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 19m 31s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 32m 0s | | trunk passed | | +1 :green_heart: | compile | 25m 2s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | compile | 22m 38s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | checkstyle | 4m 7s | | trunk passed | | +1 :green_heart: | mvnsite | 3m 26s | | trunk passed | | +1 :green_heart: | javadoc | 2m 24s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 37s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 6m 13s | | trunk passed | | +1 :green_heart: | shadedclient | 26m 45s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 30s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 30s | | the patch passed | | +1 :green_heart: | compile | 24m 54s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javac | 24m 54s | | the patch passed | | +1 :green_heart: | compile | 22m 36s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | javac | 22m 36s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 3m 47s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/3/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 105 unchanged - 0 fixed = 107 total (was 105) | | +1 :green_heart: | mvnsite | 3m 23s | | the patch passed | | +1 :green_heart: | javadoc | 2m 14s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 30s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 6m 42s | | the patch passed | | +1 :green_heart: | shadedclient | 27m 8s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 18m 31s | | hadoop-common in the patch passed. | | -1 :x: | unit | 213m 8s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 1m 11s | | The patch does not generate ASF License warnings. 
| | | | 473m 52s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger | | | hadoop.hdfs.server.namenode.TestFsck | | | hadoop.hdfs.server.namenode.TestAuditLogs | | | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/5397 | | Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle | | uname |
[jira] [Resolved] (HDFS-16914) Add some logs for updateBlockForPipeline RPC.
[ https://issues.apache.org/jira/browse/HDFS-16914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li resolved HDFS-16914. --- Fix Version/s: 3.4.0 Resolution: Fixed > Add some logs for updateBlockForPipeline RPC. > - > > Key: HDFS-16914 > URL: https://issues.apache.org/jira/browse/HDFS-16914 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.4 >Reporter: ZhangHB >Assignee: ZhangHB >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > Recently, we received a phone alarm about missing blocks. We found logs in > one datanode where the block was placed, like below: > > {code:java} > 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231832415 src: > /clientAddress:44638 dest: /localAddress:50010 of size 45733720 > 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231826462 src: > /upStreamDatanode:60316 dest: /localAddress:50010 of size 45733720 {code} > The datanode received the same block with different generation stamps because > of a socket timeout exception. blk_1305044966_231826462 was received from the > upstream datanode in a pipeline which has two datanodes. > blk_1305044966_231832415 was received from the client directly. > > We have searched all log info about blk_1305044966 in the namenode and the three > datanodes in the original pipeline, but we could not obtain any helpful message > about the generation stamp 231826462. After diving into the source code, it > was assigned in NameNodeRpcServer#updateBlockForPipeline, which is invoked in > DataStreamer#setupPipelineInternal. The updateBlockForPipeline RPC does not > have any log info. So I think we should add some logs in this RPC. 
> > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
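[Editorial note] Since the fix adds logging to the updateBlockForPipeline RPC, here is a minimal sketch of the kind of log line that would have answered the investigation above. The class and method names are hypothetical and this is not the merged patch; it only shows a message carrying the fields the reporter could not find (block, old and newly assigned generation stamps, client).

```java
public class PipelineLogSketch {
  /** Format a log message with the block, both generation stamps, and the client. */
  static String updateBlockForPipelineLog(String blockPoolId, long blockId,
      long oldGenStamp, long newGenStamp, String clientName) {
    return String.format(
        "updateBlockForPipeline: %s:blk_%d_%d assigned new generation stamp %d for client %s",
        blockPoolId, blockId, oldGenStamp, newGenStamp, clientName);
  }

  public static void main(String[] args) {
    // Generation stamps from the incident described in the issue.
    System.out.println(updateBlockForPipelineLog(
        "BP-578784987-x.x.x.x-1667291826362", 1305044966L,
        231826462L, 231832415L, "DFSClient_example"));
  }
}
```

A line like this at INFO level would let an operator grep the NameNode log for a generation stamp such as 231826462 and immediately see where it was assigned.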
[jira] [Commented] (HDFS-16914) Add some logs for updateBlockForPipeline RPC.
[ https://issues.apache.org/jira/browse/HDFS-16914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689429#comment-17689429 ] ASF GitHub Bot commented on HDFS-16914: --- tomscut commented on PR #5381: URL: https://github.com/apache/hadoop/pull/5381#issuecomment-1432330074 Thanks @hfutatzhanghb for your contribution! And thanks @slfan1989 for your review! > Add some logs for updateBlockForPipeline RPC. > - > > Key: HDFS-16914 > URL: https://issues.apache.org/jira/browse/HDFS-16914 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.4 >Reporter: ZhangHB >Assignee: ZhangHB >Priority: Minor > Labels: pull-request-available > > Recently, we received a phone alarm about missing blocks. We found logs in > one datanode where the block was placed, like below: > > {code:java} > 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231832415 src: > /clientAddress:44638 dest: /localAddress:50010 of size 45733720 > 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231826462 src: > /upStreamDatanode:60316 dest: /localAddress:50010 of size 45733720 {code} > The datanode received the same block with different generation stamps because > of a socket timeout exception. blk_1305044966_231826462 was received from the > upstream datanode in a pipeline which has two datanodes. > blk_1305044966_231832415 was received from the client directly. > > We have searched all log info about blk_1305044966 in the namenode and the three > datanodes in the original pipeline, but we could not obtain any helpful message > about the generation stamp 231826462. After diving into the source code, it > was assigned in NameNodeRpcServer#updateBlockForPipeline, which is invoked in > DataStreamer#setupPipelineInternal. The updateBlockForPipeline RPC does not > have any log info. 
So I think we should add some logs in this RPC. > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16914) Add some logs for updateBlockForPipeline RPC.
[ https://issues.apache.org/jira/browse/HDFS-16914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689428#comment-17689428 ] ASF GitHub Bot commented on HDFS-16914: --- tomscut merged PR #5381: URL: https://github.com/apache/hadoop/pull/5381 > Add some logs for updateBlockForPipeline RPC. > - > > Key: HDFS-16914 > URL: https://issues.apache.org/jira/browse/HDFS-16914 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.4 >Reporter: ZhangHB >Assignee: ZhangHB >Priority: Minor > Labels: pull-request-available > > Recently, we received a phone alarm about missing blocks. We found logs in > one datanode where the block was placed, like below: > > {code:java} > 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231832415 src: > /clientAddress:44638 dest: /localAddress:50010 of size 45733720 > 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231826462 src: > /upStreamDatanode:60316 dest: /localAddress:50010 of size 45733720 {code} > The datanode received the same block with different generation stamps because > of a socket timeout exception. blk_1305044966_231826462 was received from the > upstream datanode in a pipeline which has two datanodes. > blk_1305044966_231832415 was received from the client directly. > > We have searched all log info about blk_1305044966 in the namenode and the three > datanodes in the original pipeline, but we could not obtain any helpful message > about the generation stamp 231826462. After diving into the source code, it > was assigned in NameNodeRpcServer#updateBlockForPipeline, which is invoked in > DataStreamer#setupPipelineInternal. The updateBlockForPipeline RPC does not > have any log info. So I think we should add some logs in this RPC. 
> > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689423#comment-17689423 ] ASF GitHub Bot commented on HDFS-16917: --- hadoop-yetus commented on PR #5397: URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1432323560 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 45s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 23m 36s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 31m 4s | | trunk passed | | +1 :green_heart: | compile | 23m 4s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | compile | 20m 29s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | checkstyle | 3m 47s | | trunk passed | | +1 :green_heart: | mvnsite | 3m 32s | | trunk passed | | +1 :green_heart: | javadoc | 2m 27s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 36s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 6m 7s | | trunk passed | | +1 :green_heart: | shadedclient | 26m 25s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 29s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 32s | | the patch passed | | +1 :green_heart: | compile | 22m 21s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javac | 22m 21s | | the patch passed | | +1 :green_heart: | compile | 20m 33s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | javac | 20m 33s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 3m 35s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/2/artifact/out/results-checkstyle-root.txt) | root: The patch generated 6 new + 105 unchanged - 0 fixed = 111 total (was 105) | | +1 :green_heart: | mvnsite | 3m 26s | | the patch passed | | +1 :green_heart: | javadoc | 2m 18s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 39s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 6m 21s | | the patch passed | | +1 :green_heart: | shadedclient | 26m 23s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 18m 19s | | hadoop-common in the patch passed. | | -1 :x: | unit | 208m 23s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 1m 10s | | The patch does not generate ASF License warnings. 
| | | | 462m 8s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogs | | | hadoop.hdfs.server.namenode.TestAuditLogger | | | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport | | | hadoop.hdfs.server.namenode.TestFsck | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/2/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/5397 | | Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvni
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689419#comment-17689419 ] ASF GitHub Bot commented on HDFS-16917: --- xinglin commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1107896933 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java: ## @@ -1936,4 +1936,17 @@ public static boolean isParentEntry(final String path, final String parent) { return path.charAt(parent.length()) == Path.SEPARATOR_CHAR || parent.equals(Path.SEPARATOR); } + + /** + * Calculate the transfer rate in megabytes/second. + * @param bytes bytes + * @param durationMS duration in milliseconds + * @return the number of megabytes/second of the transfer rate + */ + public static long transferRateMBs(long bytes, long durationMS) { +if (durationMS == 0) { Review Comment: can we specify our function as: "we expect both inputs to be positive. Otherwise, this function will return -1". Then returning -1 is a clear signal we don't know how to handle such inputs. > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the distribution of > data transfer rate for datanode reads. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689417#comment-17689417 ] ASF GitHub Bot commented on HDFS-16917: --- rdingankar commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1107892591 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java: ## @@ -1936,4 +1936,17 @@ public static boolean isParentEntry(final String path, final String parent) { return path.charAt(parent.length()) == Path.SEPARATOR_CHAR || parent.equals(Path.SEPARATOR); } + + /** + * Calculate the transfer rate in megabytes/second. + * @param bytes bytes + * @param durationMS duration in milliseconds + * @return the number of megabytes/second of the transfer rate + */ + public static long transferRateMBs(long bytes, long durationMS) { +if (durationMS == 0) { Review Comment: I dont feel we should handle other cases. This is a Utils method and any unexpected data should be left for the client to interpret. For some clients the negative values might even make sense. The idea behind handling for durationMS = 0 is to take care of DivideByZero for cases when data transfer did not happen. > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the distribution of > data transfer rate for datanode reads. 
> -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
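[Editorial note] The semantics being negotiated in the review thread above fit in a few lines. This is a sketch of the utility under discussion, not the merged code; it adopts the reviewer's proposed convention that non-positive inputs return -1, which also covers the divide-by-zero case when no transfer time was recorded.

```java
public class TransferRateSketch {
  /**
   * Transfer rate in megabytes/second. Returns -1 when the inputs are not
   * both positive (the convention suggested in the review), which also
   * avoids dividing by zero for zero-duration transfers.
   * Note: bytes * 1000 can overflow long for transfers above ~9 PB; a
   * production version might compute in double or check for overflow.
   */
  public static long transferRateMBs(long bytes, long durationMS) {
    if (bytes < 0 || durationMS <= 0) {
      return -1;
    }
    // bytes -> MB is /(1024*1024); ms -> s is /1000, so multiply by 1000.
    return (bytes * 1000L) / (durationMS * 1024L * 1024L);
  }

  public static void main(String[] args) {
    // 10 MB transferred in one second -> 10 MB/s.
    System.out.println(transferRateMBs(10L * 1024 * 1024, 1000L));
  }
}
```

The quantile metric then records this value per read, so the distribution (rather than just totals like BytesRead and TotalReadTime) becomes visible.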
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689416#comment-17689416 ] ASF GitHub Bot commented on HDFS-16917: --- rdingankar commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1107889758 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetrics.java: ## @@ -61,6 +61,8 @@ public class DataNodeMetrics { @Metric MutableCounterLong bytesRead; @Metric("Milliseconds spent reading") MutableCounterLong totalReadTime; + @Metric MutableRate bytesReadTransferRate; Review Comment: updated ## hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md: ## @@ -370,6 +370,7 @@ Each metrics record contains tags such as SessionId and Hostname as additional i |:-- > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the distribution of > data transfer rate for datanode reads. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689409#comment-17689409 ] ASF GitHub Bot commented on HDFS-16917: --- xinglin commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1107877662 ## hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md: ## @@ -370,6 +370,7 @@ Each metrics record contains tags such as SessionId and Hostname as additional i |:-- > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the distribution of > data transfer rate for datanode reads. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689408#comment-17689408 ] ASF GitHub Bot commented on HDFS-16917: --- xinglin commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1107877375 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetrics.java: ## @@ -61,6 +61,8 @@ public class DataNodeMetrics { @Metric MutableCounterLong bytesRead; @Metric("Milliseconds spent reading") MutableCounterLong totalReadTime; + @Metric MutableRate bytesReadTransferRate; Review Comment: nit: rename to readTransferRateMBs? > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the distribution of > data transfer rate for datanode reads. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689406#comment-17689406 ] ASF GitHub Bot commented on HDFS-16917: --- xinglin commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1107876562 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java: ## @@ -1936,4 +1936,17 @@ public static boolean isParentEntry(final String path, final String parent) { return path.charAt(parent.length()) == Path.SEPARATOR_CHAR || parent.equals(Path.SEPARATOR); } + + /** + * Calculate the transfer rate in megabytes/second. + * @param bytes bytes + * @param durationMS duration in milliseconds + * @return the number of megabytes/second of the transfer rate + */ + public static long transferRateMBs(long bytes, long durationMS) { +if (durationMS == 0) { Review Comment: if it is <= 0, just return -1? Let's add a check for bytes as well. > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the distribution of > data transfer rate for datanode reads. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
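The guard discussed in this review can be illustrated with a small, self-contained sketch. This is not the committed DFSUtil code: the `<= 0` check returning `-1` follows the reviewer's suggestion above, and the class name is hypothetical.

```java
/**
 * Standalone sketch of the transfer-rate helper under review.
 * Returning -1 for non-positive inputs is the suggestion from the
 * review thread, not necessarily the final committed behavior.
 */
public class TransferRateSketch {
  private static final long BYTES_PER_MB = 1024L * 1024L;

  /** Returns the transfer rate in MB/s, or -1 for non-positive inputs. */
  public static long transferRateMBs(long bytes, long durationMS) {
    if (bytes < 0 || durationMS <= 0) {
      return -1; // avoid divide-by-zero; leave bad input for the caller
    }
    // bytes -> megabytes, milliseconds -> seconds
    return (bytes * 1000) / (BYTES_PER_MB * durationMS);
  }
}
```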
[jira] [Commented] (HDFS-16914) Add some logs for updateBlockForPipeline RPC.
[ https://issues.apache.org/jira/browse/HDFS-16914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689403#comment-17689403 ] ASF GitHub Bot commented on HDFS-16914: --- slfan1989 commented on code in PR #5381: URL: https://github.com/apache/hadoop/pull/5381#discussion_r1107866913 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java: ## @@ -5943,6 +5943,8 @@ LocatedBlock bumpBlockGenerationStamp(ExtendedBlock block, } // Ensure we record the new generation stamp getEditLog().logSync(); + LOG.info("bumpBlockGenerationStamp({}, client={}) success", + locatedBlock.getBlock(), clientName); Review Comment: @hfutatzhanghb @tomscut Thanks for the information! > Add some logs for updateBlockForPipeline RPC. > - > > Key: HDFS-16914 > URL: https://issues.apache.org/jira/browse/HDFS-16914 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode > Affects Versions: 3.3.4 > Reporter: ZhangHB > Assignee: ZhangHB > Priority: Minor > Labels: pull-request-available > > Recently, we received a phone alarm about missing blocks. We found logs like the below in > one datanode where the block was placed: > > {code:java} > 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231832415 src: > /clientAddress:44638 dest: /localAddress:50010 of size 45733720 > 2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231826462 src: > /upStreamDatanode:60316 dest: /localAddress:50010 of size 45733720 {code} > The datanode received the same block with different generation stamps because > of a socket timeout exception. blk_1305044966_231826462 was received from the > upstream datanode in a pipeline which has two datanodes, and > blk_1305044966_231832415 was received from the client directly.
> > We have searched all log info about blk_1305044966 in the namenode and the three > datanodes in the original pipeline, but we could not obtain any helpful message > about the generation stamp 231826462. After diving into the source code, it > was assigned in NameNodeRpcServer#updateBlockForPipeline, which is invoked in > DataStreamer#setupPipelineInternal. The updateBlockForPipeline RPC does not > log any info, so I think we should add some logs to this RPC. > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16918) Optionally shut down datanode if it does not stay connected to active namenode
[ https://issues.apache.org/jira/browse/HDFS-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HDFS-16918: Description: While deploying Hdfs on an Envoy proxy setup, depending on the socket timeout configured at Envoy, network connection issues or packet loss could be observed. All of the Envoys basically form a transparent communication mesh in which each app can send and receive packets to and from localhost and is unaware of the network topology. The primary purpose of Envoy is to make the network transparent to applications, in order to identify network issues reliably. However, sometimes such a proxy-based setup could result in socket connection issues between datanode and namenode. Many deployment frameworks provide auto-start functionality when any of the hadoop daemons are stopped. If a given datanode does not stay connected to the active namenode in the cluster, i.e. does not receive a heartbeat response in time from the active namenode (even though the active namenode is not terminated), it would not be of much use. We should be able to provide configurable behavior such that if a given datanode cannot receive a heartbeat response from the active namenode within a configurable time duration, it should terminate itself to avoid impacting the availability SLA. This is specifically helpful when the underlying deployment or observability framework (e.g. K8S) can start up the datanode automatically upon its shutdown (unless it is being restarted as part of a rolling upgrade) and help the newly brought up datanode (in case of k8s, a new pod with dynamically changing nodes) establish new socket connections to the active and standby namenodes. This should be an opt-in behavior and not the default one. In a distributed system, it is essential to have robust fail-fast mechanisms in place to prevent issues related to network partitioning. The system must be designed to prevent further degradation of availability and consistency in the event of a network partition. 
Several distributed systems offer fail-safe approaches, and for some, partition tolerance is critical to the extent that even a few seconds of heartbeat loss can trigger the removal of an application server instance from the cluster. For instance, a majority of ZooKeeper clients utilize ephemeral nodes for this purpose to make the system reliable, fault-tolerant and strongly consistent in the event of a network partition. From the hdfs architecture viewpoint, it is crucial to understand the critical role that the active and observer namenodes play in file system operations. In a large-scale cluster, if the datanodes holding the same block (primary and replicas) lose connection to both active and observer namenodes for a significant amount of time, delaying the process of shutting down such datanodes and restarting them to re-establish the connection with the namenodes (assuming the active namenode is alive; this assumption is important in the event of a network partition to re-establish the connection) will further deteriorate the availability of the service. This scenario underscores the importance of resolving network partitioning. This is a real use case for hdfs, and it is not prudent to assume that every deployment or cluster management application must be able to restart datanodes based on JMX metrics, as this would introduce another application to resolve the network partition impact of hdfs. Besides, popular cluster management applications are not typically used in all cloud-native environments. Even if these cluster management applications are deployed, certain security constraints may restrict their access to JMX metrics and prevent them from interfering with hdfs operations. Only the applications that trigger alerts for users based on set parameters (for instance, missing blocks > 0) are allowed to access JMX metrics. 
was: While deploying Hdfs on Envoy proxy setup, depending on the socket timeout configured at envoy, the network connection issues or packet loss could be observed. All of envoys basically form a transparent communication mesh in which each app can send and receive packets to and from localhost and is unaware of the network topology. The primary purpose of Envoy is to make the network transparent to applications, in order to identify network issues reliably. However, sometimes such proxy based setup could result into socket connection issues b/ datanode and namenode. Many deployment frameworks provide auto-start functionality when any of the hadoop daemons are stopped. If a given datanode does not stay connected to active namenode in the cluster i.e. does not receive heartbeat response in time from active namenode (even though active namenode is not terminated), it would not be much useful. We should be able to provide configurable behavior such that if a given datanode cannot receive heartbeat response from active namen
[jira] [Commented] (HDFS-16918) Optionally shut down datanode if it does not stay connected to active namenode
[ https://issues.apache.org/jira/browse/HDFS-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689378#comment-17689378 ] ASF GitHub Bot commented on HDFS-16918: --- virajjasani commented on PR #5396: URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1432167209 Created HDFS-16925 to fix regex expressions for namenode audit log tests > Optionally shut down datanode if it does not stay connected to active namenode > -- > > Key: HDFS-16918 > URL: https://issues.apache.org/jira/browse/HDFS-16918 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > While deploying Hdfs on Envoy proxy setup, depending on the socket timeout > configured at envoy, the network connection issues or packet loss could be > observed. All of envoys basically form a transparent communication mesh in > which each app can send and receive packets to and from localhost and is > unaware of the network topology. > The primary purpose of Envoy is to make the network transparent to > applications, in order to identify network issues reliably. However, > sometimes such proxy based setup could result into socket connection issues > b/ datanode and namenode. > Many deployment frameworks provide auto-start functionality when any of the > hadoop daemons are stopped. If a given datanode does not stay connected to > active namenode in the cluster i.e. does not receive heartbeat response in > time from active namenode (even though active namenode is not terminated), it > would not be much useful. We should be able to provide configurable behavior > such that if a given datanode cannot receive heartbeat response from active > namenode in configurable time duration, it should terminate itself to avoid > impacting the availability SLA. This is specifically helpful when the > underlying deployment or observability framework (e.g. 
K8S) can start up the > datanode automatically upon it's shutdown (unless it is being restarted as > part of rolling upgrade) and help the newly brought up datanode (in case of > k8s, a new pod with dynamically changing nodes) establish new socket > connection to active and standby namenodes. This should be an opt-in behavior > and not default one. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16918) Optionally shut down datanode if it does not stay connected to active namenode
[ https://issues.apache.org/jira/browse/HDFS-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689375#comment-17689375 ] ASF GitHub Bot commented on HDFS-16918: --- virajjasani commented on PR #5396: URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1432160563 In a distributed system, it is essential to have robust fail-fast mechanisms in place to prevent issues related to network partitioning. The system must be designed to prevent further degradation of availability and consistency in the event of a network partition. Several distributed systems offer fail-safe approaches, and for some, partition tolerance is critical to the extent that even a few seconds of heartbeat loss can trigger the removal of an application server instance from the cluster. For instance, a majority of zooKeeper clients utilize the ephemeral nodes for this purpose to make system reliable, fault-tolerant and strongly consistent in the event of network partition. From the hdfs architecture viewpoint, it is crucial to understand the critical role that active and observer namenode play in file system operations. In a large-scale cluster, if the datanodes holding the same block (primary and replicas) lose connection to both active and observer namenodes for a significant amount of time, delaying the process of shutting down such datanodes and restarting it to re-establish the connection with the namenodes (assuming the active namenode is alive, assumption is important in the even of network partition to reestablish the connection) will further deteriorate the availability of the service. This scenario underscores the importance of resolving network partitioning. This is a real use case for hdfs and it is not prudent to assume that every deployment or cluster management application must be able to restart datanodes based on JMX metrics, as this would introduce another application to resolve the network partition impact of hdfs. 
Besides, popular cluster management applications are not typically used in all cloud-native env. Even if these cluster management applications are deployed, certain security constraints may restrict their access to JMX metrics and prevent them from interfering with hdfs operations. The applications that can only trigger alerts for users based on set parameters (for instance, missing blocks > 0) are allowed to access JMX metrics. > Optionally shut down datanode if it does not stay connected to active namenode > -- > > Key: HDFS-16918 > URL: https://issues.apache.org/jira/browse/HDFS-16918 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > While deploying Hdfs on Envoy proxy setup, depending on the socket timeout > configured at envoy, the network connection issues or packet loss could be > observed. All of envoys basically form a transparent communication mesh in > which each app can send and receive packets to and from localhost and is > unaware of the network topology. > The primary purpose of Envoy is to make the network transparent to > applications, in order to identify network issues reliably. However, > sometimes such proxy based setup could result into socket connection issues > b/ datanode and namenode. > Many deployment frameworks provide auto-start functionality when any of the > hadoop daemons are stopped. If a given datanode does not stay connected to > active namenode in the cluster i.e. does not receive heartbeat response in > time from active namenode (even though active namenode is not terminated), it > would not be much useful. We should be able to provide configurable behavior > such that if a given datanode cannot receive heartbeat response from active > namenode in configurable time duration, it should terminate itself to avoid > impacting the availability SLA. 
This is specifically helpful when the > underlying deployment or observability framework (e.g. K8S) can start up the > datanode automatically upon it's shutdown (unless it is being restarted as > part of rolling upgrade) and help the newly brought up datanode (in case of > k8s, a new pod with dynamically changing nodes) establish new socket > connection to active and standby namenodes. This should be an opt-in behavior > and not default one. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
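The opt-in semantics described above can be sketched as a pure check, assuming a hypothetical configuration value where a non-positive threshold disables the feature (all names here are illustrative, not the actual HDFS-16918 patch):

```java
/**
 * Sketch of an opt-in "shut down if disconnected from the active namenode
 * for too long" check. The threshold semantics (<= 0 disables) and names
 * are assumptions for illustration, not the committed implementation.
 */
public class ActiveNnConnectivityCheck {
  private final long maxMissedHeartbeatMs; // <= 0 disables the feature

  public ActiveNnConnectivityCheck(long maxMissedHeartbeatMs) {
    this.maxMissedHeartbeatMs = maxMissedHeartbeatMs;
  }

  /**
   * @param lastActiveHeartbeatMs time of the last heartbeat response from
   *        the active namenode, on the same clock as nowMs
   * @return true if the datanode should terminate itself
   */
  public boolean shouldShutdown(long lastActiveHeartbeatMs, long nowMs) {
    if (maxMissedHeartbeatMs <= 0) {
      return false; // opt-in behavior: disabled by default
    }
    return nowMs - lastActiveHeartbeatMs > maxMissedHeartbeatMs;
  }
}
```

A deployment framework such as K8S would then restart the terminated pod, letting the new datanode establish fresh connections to the namenodes.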
[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read
[ https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689348#comment-17689348 ] ASF GitHub Bot commented on HDFS-16896: --- simbadzina commented on code in PR #5322: URL: https://github.com/apache/hadoop/pull/5322#discussion_r1107745069 ## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java: ## @@ -1337,7 +1352,11 @@ private void hedgedFetchBlockByteRange(LocatedBlock block, long start, } catch (InterruptedException ie) { // Ignore and retry } -if (refetch) { +// if refetch is true then all nodes are in deadlist or ignorelist +// we should loop through all futures and remove them so we do not Review Comment: Could you add punctuation here and start new sentences with caps. That will make the comment easier to follow. What is the deadlist? > HDFS Client hedged read has increased failure rate than without hedged read > --- > > Key: HDFS-16896 > URL: https://issues.apache.org/jira/browse/HDFS-16896 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Tom McCormick >Assignee: Tom McCormick >Priority: Major > Labels: pull-request-available > > When hedged read is enabled by HDFS client, we see an increased failure rate > on reads. 
> *stacktrace* > > {code:java} > Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain > block: BP-1183972111-10.197.192.88-1590025572374:blk_17114848218_16043459722 > file=/data/tracking/streaming/AdImpressionEvent/daily/2022/07/18/compaction_1/part-r-1914862.1658217125623.1362294472.orc > at > org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1077) > at > org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1060) > at > org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1039) > at > org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1365) > at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1572) > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1535) > at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > org.apache.hadoop.fs.RetryingInputStream.lambda$readFully$3(RetryingInputStream.java:172) > at org.apache.hadoop.fs.RetryPolicy.lambda$run$0(RetryPolicy.java:137) > at org.apache.hadoop.fs.NoOpRetryPolicy.run(NoOpRetryPolicy.java:36) > at org.apache.hadoop.fs.RetryPolicy.run(RetryPolicy.java:136) > at > org.apache.hadoop.fs.RetryingInputStream.readFully(RetryingInputStream.java:168) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > io.trino.plugin.hive.orc.HdfsOrcDataSource.readInternal(HdfsOrcDataSource.java:76) > ... 46 more > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
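The review above concerns draining the hedged-read futures list once every node is in the dead or ignore list; a generic, self-contained sketch of that cleanup step, independent of `DFSInputStream` (names are illustrative):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/**
 * Generic sketch: cancel and remove futures that can no longer complete
 * usefully, so a retry loop starts from a clean slate.
 */
public class HedgedFutureCleanup {
  public static int cancelAll(List<Future<?>> futures) {
    int cancelled = 0;
    Iterator<Future<?>> it = futures.iterator();
    while (it.hasNext()) {
      Future<?> f = it.next();
      f.cancel(true); // interrupt the hedged request if still running
      it.remove();    // drop it from the tracking list
      cancelled++;
    }
    return cancelled;
  }
}
```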
[jira] [Updated] (HDFS-16925) Fix regex pattern for namenode audit log tests
[ https://issues.apache.org/jira/browse/HDFS-16925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-16925: -- Labels: pull-request-available (was: ) > Fix regex pattern for namenode audit log tests > -- > > Key: HDFS-16925 > URL: https://issues.apache.org/jira/browse/HDFS-16925 > Project: Hadoop HDFS > Issue Type: Task > Reporter: Viraj Jasani > Assignee: Viraj Jasani > Priority: Major > Labels: pull-request-available > > With HADOOP-18628 in place, we perform InetAddress#getHostName in addition to > InetAddress#getHostAddress, to save the host name with the IPC Connection object. > When we perform InetAddress#getHostName, toString() of InetAddress would > automatically print \{hostName}/\{hostIPAddress} if the hostname is already > resolved: > {code:java} > /** > * Converts this IP address to a {@code String}. The > * string returned is of the form: hostname / literal IP > * address. > * > * If the host name is unresolved, no reverse name service lookup > * is performed. The hostname part will be represented by an empty string. > * > * @return a string representation of this IP address. > */ > public String toString() { > String hostName = holder().getHostName(); > return ((hostName != null) ? hostName : "") > + "/" + getHostAddress(); > }{code} > > For namenode audit logs, this means that when the dfs client makes filesystem > updates, the audit logs would also print the host name in addition to the ip > address. We have some tests that perform regex pattern > matching to identify the log pattern of audit logs; we will have to change > them to reflect the change in host address. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16925) Fix regex pattern for namenode audit log tests
[ https://issues.apache.org/jira/browse/HDFS-16925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689332#comment-17689332 ] ASF GitHub Bot commented on HDFS-16925: --- virajjasani opened a new pull request, #5407: URL: https://github.com/apache/hadoop/pull/5407 With [HADOOP-18628](https://issues.apache.org/jira/browse/HADOOP-18628) in place, we perform InetAddress#getHostName in addition to InetAddress#getHostAddress, to save host name with IPC Connection object. When we perform InetAddress#getHostName, toString() of InetAddress would automatically print {hostName}/{hostIPAddress} if hostname is already resolved: ``` /** * Converts this IP address to a {@code String}. The * string returned is of the form: hostname / literal IP * address. * * If the host name is unresolved, no reverse name service lookup * is performed. The hostname part will be represented by an empty string. * * @return a string representation of this IP address. */ public String toString() { String hostName = holder().getHostName(); return ((hostName != null) ? hostName : "") + "/" + getHostAddress(); } ``` For namenode audit logs, this means that when dfs client makes filesystem updates, the audit logs would also print host name in the audit logs in addition to ip address. We have some tests that performs regex pattern matching to identify the log pattern of audit logs, we will have to change them to reflect the change in host address. > Fix regex pattern for namenode audit log tests > -- > > Key: HDFS-16925 > URL: https://issues.apache.org/jira/browse/HDFS-16925 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > With HADOOP-18628 in place, we perform InetAddress#getHostName in addition to > InetAddress#getHostAddress, to save host name with IPC Connection object. 
> When we perform InetAddress#getHostName, toString() of InetAddress would > automatically print \{hostName}/\{hostIPAddress} if hostname is already > resolved: > {code:java} > /** > * Converts this IP address to a {@code String}. The > * string returned is of the form: hostname / literal IP > * address. > * > * If the host name is unresolved, no reverse name service lookup > * is performed. The hostname part will be represented by an empty string. > * > * @return a string representation of this IP address. > */ > public String toString() { > String hostName = holder().getHostName(); > return ((hostName != null) ? hostName : "") > + "/" + getHostAddress(); > }{code} > > For namenode audit logs, this means that when dfs client makes filesystem > updates, the audit logs would also print host name in the audit logs in > addition to ip address. We have some tests that performs regex pattern > matching to identify the log pattern of audit logs, we will have to change > them to reflect the change in host address. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16925) Fix regex pattern for namenode audit log tests
Viraj Jasani created HDFS-16925: --- Summary: Fix regex pattern for namenode audit log tests Key: HDFS-16925 URL: https://issues.apache.org/jira/browse/HDFS-16925 Project: Hadoop HDFS Issue Type: Task Reporter: Viraj Jasani Assignee: Viraj Jasani With HADOOP-18628 in place, we perform InetAddress#getHostName in addition to InetAddress#getHostAddress, to save host name with IPC Connection object. When we perform InetAddress#getHostName, toString() of InetAddress would automatically print \{hostName}/\{hostIPAddress} if hostname is already resolved: {code:java} /** * Converts this IP address to a {@code String}. The * string returned is of the form: hostname / literal IP * address. * * If the host name is unresolved, no reverse name service lookup * is performed. The hostname part will be represented by an empty string. * * @return a string representation of this IP address. */ public String toString() { String hostName = holder().getHostName(); return ((hostName != null) ? hostName : "") + "/" + getHostAddress(); }{code} For namenode audit logs, this means that when dfs client makes filesystem updates, the audit logs would also print host name in the audit logs in addition to ip address. We have some tests that performs regex pattern matching to identify the log pattern of audit logs, we will have to change them to reflect the change in host address. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
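Given that `InetAddress.toString()` returns `hostname/ip` when the name is resolved and `/ip` when it is not, the updated test regexes need to allow an optional hostname before the slash. An illustrative pattern (not the actual test code; the field format is assumed):

```java
import java.util.regex.Pattern;

/**
 * Illustrative matcher for an audit-log ip field of the form
 * "ip=hostname/a.b.c.d" or "ip=/a.b.c.d" (unresolved hostname).
 */
public class AuditLogIpPattern {
  // Optional hostname (letters, digits, dots, hyphens), then "/", then IPv4.
  static final Pattern IP_FIELD =
      Pattern.compile("ip=(?:[-\\w.]*)/\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}");

  public static boolean matches(String field) {
    return IP_FIELD.matcher(field).matches();
  }
}
```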
[jira] [Commented] (HDFS-16918) Optionally shut down datanode if it does not stay connected to active namenode
[ https://issues.apache.org/jira/browse/HDFS-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689305#comment-17689305 ] ASF GitHub Bot commented on HDFS-16918: --- virajjasani commented on PR #5396: URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1431843597 Please ignore the test failures; they are not relevant, and I will track them separately. As for the change, this is a real use case for us. I have tested it on two clusters and deployed it in prod as well. Only limited infrastructure needs this kind of coverage, but it is indeed required. My only concern is that not every deployment framework has the ability to act on JMX metrics; cloud-native ones with stricter security constraints in particular fall into this category. On the other hand, a datanode is not of much use for long if the active namenode is unavailable while clients are writing new blocks, etc. @jojochuang @tasanuma @tomscut could you please also take a look and provide your feedback? > Optionally shut down datanode if it does not stay connected to active namenode > -- > > Key: HDFS-16918 > URL: https://issues.apache.org/jira/browse/HDFS-16918 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > While deploying HDFS on an Envoy proxy setup, depending on the socket timeout > configured at Envoy, network connection issues or packet loss can be > observed. All of the Envoys basically form a transparent communication mesh in > which each app can send and receive packets to and from localhost and is > unaware of the network topology. > The primary purpose of Envoy is to make the network transparent to > applications, in order to identify network issues reliably. However, > sometimes such a proxy-based setup can result in socket connection issues > b/w datanode and namenode. > Many deployment frameworks provide auto-start functionality when any of the > hadoop daemons are stopped. 
If a given datanode does not stay connected to > the active namenode in the cluster, i.e. does not receive a heartbeat response in > time from the active namenode (even though the active namenode is not terminated), it > is not of much use. We should be able to provide configurable behavior > such that if a given datanode cannot receive a heartbeat response from the active > namenode within a configurable time duration, it should terminate itself to avoid > impacting the availability SLA. This is specifically helpful when the > underlying deployment or observability framework (e.g. K8s) can start up the > datanode automatically upon its shutdown (unless it is being restarted as > part of a rolling upgrade) and help the newly brought-up datanode (in the case of > K8s, a new pod with dynamically changing nodes) establish new socket > connections to the active and standby namenodes. This should be an opt-in behavior > and not the default one. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
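The proposal above boils down to a timeout check against the last successful heartbeat response. A minimal sketch of that decision, assuming the datanode tracks the response timestamp; the method name and the disabled-by-default semantics are illustrative, not the actual patch or Hadoop configuration property:

```java
public class HeartbeatWatchdog {
    // Opt-in check: a non-positive timeout keeps the feature disabled
    // (the default), matching the "opt-in behavior" described above.
    static boolean shouldShutdown(long lastHeartbeatResponseMs, long nowMs,
                                  long timeoutMs) {
        if (timeoutMs <= 0) {
            return false; // feature disabled unless explicitly configured
        }
        return nowMs - lastHeartbeatResponseMs > timeoutMs;
    }

    public static void main(String[] args) {
        System.out.println(shouldShutdown(0L, 5_000L, 10_000L));  // false: still within window
        System.out.println(shouldShutdown(0L, 20_000L, 10_000L)); // true: would self-terminate
        System.out.println(shouldShutdown(0L, 20_000L, 0L));      // false: feature disabled
    }
}
```

When the check returns true, the real datanode would shut itself down and rely on the deployment framework (e.g. K8s) to restart it with fresh namenode connections.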
[jira] [Commented] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist
[ https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689252#comment-17689252 ] ASF GitHub Bot commented on HDFS-16923: --- hadoop-yetus commented on PR #5400: URL: https://github.com/apache/hadoop/pull/5400#issuecomment-1431697168 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 37s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 15m 11s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 31m 55s | | trunk passed | | +1 :green_heart: | compile | 6m 18s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | compile | 6m 12s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | checkstyle | 1m 15s | | trunk passed | | +1 :green_heart: | mvnsite | 2m 12s | | trunk passed | | +1 :green_heart: | javadoc | 1m 46s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 25s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 4m 44s | | trunk passed | | +1 :green_heart: | shadedclient | 23m 58s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 27s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 5s | | the patch passed | | +1 :green_heart: | compile | 6m 26s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javac | 6m 26s | | the patch passed | | +1 :green_heart: | compile | 5m 49s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | javac | 5m 49s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 1m 4s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 57s | | the patch passed | | +1 :green_heart: | javadoc | 1m 25s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 18s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 4m 47s | | the patch passed | | +1 :green_heart: | shadedclient | 24m 27s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 205m 53s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5400/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | unit | 20m 34s | | hadoop-hdfs-rbf in the patch passed. | | +1 :green_heart: | asflicense | 0m 48s | | The patch does not generate ASF License warnings. 
| | | | 376m 52s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger | | | hadoop.hdfs.server.namenode.TestFsck | | | hadoop.hdfs.server.namenode.TestAuditLogs | | | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5400/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/5400 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux 0504b05c1ae8 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / deaaebdf677274d3c990e76d6904d93ef3fbdfa9 | | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1u
[jira] [Commented] (HDFS-16916) Improve the use of JUnit Test in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-16916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689233#comment-17689233 ] ASF GitHub Bot commented on HDFS-16916: --- hadoop-yetus commented on PR #5404: URL: https://github.com/apache/hadoop/pull/5404#issuecomment-1431655146 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 41s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 47m 20s | | trunk passed | | +1 :green_heart: | compile | 1m 2s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | compile | 0m 54s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | checkstyle | 0m 36s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 59s | | trunk passed | | +1 :green_heart: | javadoc | 0m 52s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 0m 40s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 2m 41s | | trunk passed | | +1 :green_heart: | shadedclient | 24m 59s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 57s | | the patch passed | | +1 :green_heart: | compile | 0m 52s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javac | 0m 52s | | the patch passed | | +1 :green_heart: | compile | 0m 45s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | javac | 0m 45s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 20s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 50s | | the patch passed | | +1 :green_heart: | javadoc | 0m 33s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 0m 33s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 2m 29s | | the patch passed | | +1 :green_heart: | shadedclient | 24m 41s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 2m 26s | | hadoop-hdfs-client in the patch passed. | | +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. 
| | | | 115m 21s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5404/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/5404 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux f0cb8873113b 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / c6bed9534b9e55bd78f03e7c9aca8a02d1ebeb5c | | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5404/1/testReport/ | | Max. process+thread count | 564 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client | | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5404/1/console | | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 | | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org | This message was automatically generated. > Improve the use of JUnit Test in DFSClient > --
[jira] [Updated] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down
[ https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen O'Donnell updated HDFS-16761: - Fix Version/s: 3.4.0 > Namenode UI for Datanodes page not loading if any data node is down > --- > > Key: HDFS-16761 > URL: https://issues.apache.org/jira/browse/HDFS-16761 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.2.2 >Reporter: Krishna Reddy >Assignee: Zita Dombi >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > Steps to reproduce: > - Install the hadoop components and add 3 datanodes > - Enable namenode HA > - Open Namenode UI and check datanode page > - check all datanodes will display > - Now make one datanode down > - wait for 10 minutes time as heartbeat expires > - Refresh namenode page and check > > Actual Result: It is showing error message "NameNode is still loading. > Redirecting to the Startup Progress page." -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down
[ https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689216#comment-17689216 ] ASF GitHub Bot commented on HDFS-16761: --- sodonnel merged PR #5390: URL: https://github.com/apache/hadoop/pull/5390 > Namenode UI for Datanodes page not loading if any data node is down > --- > > Key: HDFS-16761 > URL: https://issues.apache.org/jira/browse/HDFS-16761 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.2.2 >Reporter: Krishna Reddy >Assignee: Zita Dombi >Priority: Major > Labels: pull-request-available > > Steps to reproduce: > - Install the hadoop components and add 3 datanodes > - Enable namenode HA > - Open Namenode UI and check datanode page > - check all datanodes will display > - Now make one datanode down > - wait for 10 minutes time as heartbeat expires > - Refresh namenode page and check > > Actual Result: It is showing error message "NameNode is still loading. > Redirecting to the Startup Progress page." -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down
[ https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689217#comment-17689217 ] ASF GitHub Bot commented on HDFS-16761: --- sodonnel commented on PR #5390: URL: https://github.com/apache/hadoop/pull/5390#issuecomment-1431626020 Merged this into trunk. I wonder if some similar change is needed on branch 3.3 and 3.2? > Namenode UI for Datanodes page not loading if any data node is down > --- > > Key: HDFS-16761 > URL: https://issues.apache.org/jira/browse/HDFS-16761 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.2.2 >Reporter: Krishna Reddy >Assignee: Zita Dombi >Priority: Major > Labels: pull-request-available > > Steps to reproduce: > - Install the hadoop components and add 3 datanodes > - Enable namenode HA > - Open Namenode UI and check datanode page > - check all datanodes will display > - Now make one datanode down > - wait for 10 minutes time as heartbeat expires > - Refresh namenode page and check > > Actual Result: It is showing error message "NameNode is still loading. > Redirecting to the Startup Progress page." -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist
[ https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689209#comment-17689209 ] ASF GitHub Bot commented on HDFS-16923: --- xkrogen commented on code in PR #5400: URL: https://github.com/apache/hadoop/pull/5400#discussion_r1107318715 ## hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestObserverWithRouter.java: ## @@ -146,6 +148,19 @@ public void testObserverRead() throws Exception { internalTestObserverRead(); } + @Test + public void testGetListingWithObserverRead() throws Exception { Review Comment: Feels to me that this test probably belongs in `TestObserverNode`? It's not really related to Router/federation. > The getListing RPC will throw NPE if the path does not exist > > > Key: HDFS-16923 > URL: https://issues.apache.org/jira/browse/HDFS-16923 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Major > Labels: pull-request-available > > The getListing RPC will throw NPE if the path does not exist. 
And the stack trace > is as below:
> {code:java}
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException):
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException):
> java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574)
> {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
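An NPE like the one in the stack trace typically means the listing result for a nonexistent path is dereferenced without a null check. A toy illustration of the guard such a fix needs, with a map standing in for the INode tree; the names here are hypothetical, not the FSNamesystem code:

```java
import java.util.HashMap;
import java.util.Map;

public class GetListingGuard {
    // Toy stand-in for the namespace; the real code resolves an INode.
    static final Map<String, String[]> NAMESPACE = new HashMap<>();
    static { NAMESPACE.put("/data", new String[] {"a", "b"}); }

    // Return null for a missing path instead of dereferencing it, so the
    // caller can translate that into FileNotFoundException rather than an NPE.
    static String[] getListing(String path) {
        String[] entries = NAMESPACE.get(path);
        return entries == null ? null : entries.clone();
    }

    public static void main(String[] args) {
        System.out.println(getListing("/data").length);     // 2
        System.out.println(getListing("/missing") == null); // true, not an NPE
    }
}
```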
[jira] [Commented] (HDFS-16922) The logic of IncrementalBlockReportManager#addRDBI method may cause missing blocks when cluster is busy.
[ https://issues.apache.org/jira/browse/HDFS-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689196#comment-17689196 ] ASF GitHub Bot commented on HDFS-16922: --- hfutatzhanghb commented on PR #5398: URL: https://github.com/apache/hadoop/pull/5398#issuecomment-1431546384 > Requires a UT which can reproduce the said issue Hi, @ayushtkn. I can reproduce the issue with a UT on version 3.3.x according to our production situation, but cannot reproduce it on trunk because of [HDFS-16146](https://issues.apache.org/jira/browse/HDFS-16146). But I think the patch in this PR can still be useful to solve this problem. @Hexiaoqiao, could you please also take a look at this~ thanks. > The logic of IncrementalBlockReportManager#addRDBI method may cause missing > blocks when cluster is busy. > > > Key: HDFS-16922 > URL: https://issues.apache.org/jira/browse/HDFS-16922 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: ZhangHB >Priority: Major > Labels: pull-request-available > > The current logic of the IncrementalBlockReportManager#addRDBI method could lead > to missing blocks when datanodes in the pipeline are I/O busy. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16916) Improve the use of JUnit Test in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-16916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689174#comment-17689174 ] ASF GitHub Bot commented on HDFS-16916: --- slfan1989 commented on PR #5404: URL: https://github.com/apache/hadoop/pull/5404#issuecomment-1431486725 LGTM. > Improve the use of JUnit Test in DFSClient > -- > > Key: HDFS-16916 > URL: https://issues.apache.org/jira/browse/HDFS-16916 > Project: Hadoop HDFS > Issue Type: Improvement > Components: dfsclient >Affects Versions: 3.4.0 >Reporter: Hualong Zhang >Priority: Minor > Labels: pull-request-available > > Improve the use of JUnit Test in DFSClient -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16916) Improve the use of JUnit Test in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-16916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-16916: -- Labels: pull-request-available (was: ) > Improve the use of JUnit Test in DFSClient > -- > > Key: HDFS-16916 > URL: https://issues.apache.org/jira/browse/HDFS-16916 > Project: Hadoop HDFS > Issue Type: Improvement > Components: dfsclient >Affects Versions: 3.4.0 >Reporter: Hualong Zhang >Priority: Minor > Labels: pull-request-available > > Improve the use of JUnit Test in DFSClient -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16916) Improve the use of JUnit Test in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-16916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689166#comment-17689166 ] ASF GitHub Bot commented on HDFS-16916: --- zhtttylz opened a new pull request, #5404: URL: https://github.com/apache/hadoop/pull/5404 JIRA:HDFS-16916. Improve the use of JUnit Test in DFSClient > Improve the use of JUnit Test in DFSClient > -- > > Key: HDFS-16916 > URL: https://issues.apache.org/jira/browse/HDFS-16916 > Project: Hadoop HDFS > Issue Type: Improvement > Components: dfsclient >Affects Versions: 3.4.0 >Reporter: Hualong Zhang >Priority: Minor > > Improve the use of JUnit Test in DFSClient -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689140#comment-17689140 ] ASF GitHub Bot commented on HDFS-16917: --- hadoop-yetus commented on PR #5397: URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1431427424 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 38s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 15m 2s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 30m 51s | | trunk passed | | +1 :green_heart: | compile | 23m 26s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | compile | 20m 33s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | checkstyle | 3m 46s | | trunk passed | | +1 :green_heart: | mvnsite | 3m 29s | | trunk passed | | +1 :green_heart: | javadoc | 2m 27s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 42s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 6m 10s | | trunk passed | | +1 :green_heart: | shadedclient | 26m 48s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 28s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 38s | | the patch passed | | +1 :green_heart: | compile | 23m 9s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javac | 23m 9s | | the patch passed | | +1 :green_heart: | compile | 20m 25s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | javac | 20m 25s | | the patch passed | | -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/1/artifact/out/blanks-eol.txt) | The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | -0 :warning: | checkstyle | 3m 37s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/1/artifact/out/results-checkstyle-root.txt) | root: The patch generated 8 new + 106 unchanged - 0 fixed = 114 total (was 106) | | +1 :green_heart: | mvnsite | 3m 25s | | the patch passed | | +1 :green_heart: | javadoc | 2m 18s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 38s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 6m 25s | | the patch passed | | +1 :green_heart: | shadedclient | 26m 47s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 18m 14s | | hadoop-common in the patch passed. | | -1 :x: | unit | 208m 9s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 1m 15s | | The patch does not generate ASF License warnings. 
| | | | 455m 14s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes | | | hadoop.hdfs.server.namenode.TestAuditLogs | | | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport | | | hadoop.hdfs.server.namenode.TestFsck | | | hadoop.hdfs.server.namenode.TestAuditLogger | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/1
[jira] [Commented] (HDFS-16922) The logic of IncrementalBlockReportManager#addRDBI method may cause missing blocks when cluster is busy.
[ https://issues.apache.org/jira/browse/HDFS-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689104#comment-17689104 ] ASF GitHub Bot commented on HDFS-16922: --- hadoop-yetus commented on PR #5398: URL: https://github.com/apache/hadoop/pull/5398#issuecomment-1431321727 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 23s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 46m 25s | | trunk passed | | +1 :green_heart: | compile | 1m 31s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | compile | 1m 25s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | checkstyle | 1m 9s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 30s | | trunk passed | | +1 :green_heart: | javadoc | 1m 7s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 1m 28s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 3m 36s | | trunk passed | | +1 :green_heart: | shadedclient | 32m 38s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 31s | | the patch passed | | +1 :green_heart: | compile | 1m 23s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javac | 1m 23s | | the patch passed | | +1 :green_heart: | compile | 1m 17s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | javac | 1m 17s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 54s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5398/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) | | +1 :green_heart: | mvnsite | 2m 18s | | the patch passed | | +1 :green_heart: | javadoc | 0m 57s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 1m 30s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 3m 40s | | the patch passed | | +1 :green_heart: | shadedclient | 30m 2s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 234m 21s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5398/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 47s | | The patch does not generate ASF License warnings. 
| | | | 368m 37s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger | | | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.server.namenode.TestAuditLogs | | | hadoop.hdfs.server.namenode.TestFsck | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5398/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/5398 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux 7a1f03d8173f 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision |
[jira] [Commented] (HDFS-16924) Add libhdfs APIs for createFile
[ https://issues.apache.org/jira/browse/HDFS-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689085#comment-17689085 ] Zoltán Borók-Nagy commented on HDFS-16924: -- It seems libhdfs only has hdfsOpenFile(), which can be used for creating new files. hdfsOpenFile() has a builder-based API, so it can probably already be used for this purpose. > Add libhdfs APIs for createFile > --- > > Key: HDFS-16924 > URL: https://issues.apache.org/jira/browse/HDFS-16924 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs >Reporter: Zoltán Borók-Nagy >Priority: Major > > HDFS-14478 introduces builder-based APIs for openFile() based on HADOOP-15229. > We should also add builder-based APIs for createFile() based on HADOOP-14365. > This would be especially useful for object stores to tune performance of file > writes. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
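For context, the builder-based createFile() added by HADOOP-14365 on the Java side follows a fluent shape that a libhdfs equivalent would presumably mirror. A self-contained toy of that shape; the class and option names below are illustrative stand-ins, not the actual FSDataOutputStreamBuilder or any proposed C API:

```java
public class CreateFileBuilderSketch {
    static final class CreateFileBuilder {
        private String path;
        private short replication = 3;        // assumed defaults, for illustration only
        private long blockSize = 128L << 20;  // 128 MiB
        private boolean overwrite = false;

        CreateFileBuilder path(String p) { this.path = p; return this; }
        CreateFileBuilder replication(short r) { this.replication = r; return this; }
        CreateFileBuilder blockSize(long b) { this.blockSize = b; return this; }
        CreateFileBuilder overwrite(boolean o) { this.overwrite = o; return this; }

        // A real build() would open an output stream; this toy just
        // summarizes the collected options.
        String build() {
            return path + " r=" + replication + " bs=" + blockSize + " ow=" + overwrite;
        }
    }

    public static void main(String[] args) {
        System.out.println(new CreateFileBuilder()
            .path("/tmp/f").replication((short) 2).overwrite(true).build());
    }
}
```

The appeal of this shape for object stores is that each backend can consume only the options it understands while ignoring the rest, which is what makes per-store write tuning feasible.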
[jira] [Created] (HDFS-16924) Add libhdfs APIs for createFile
Zoltán Borók-Nagy created HDFS-16924: Summary: Add libhdfs APIs for createFile Key: HDFS-16924 URL: https://issues.apache.org/jira/browse/HDFS-16924 Project: Hadoop HDFS Issue Type: Bug Components: fs Reporter: Zoltán Borók-Nagy HDFS-14478 introduces builder-based APIs for openFile() based on HADOOP-15229. We should also add builder-based APIs for createFile() based on HADOOP-14365. This would be especially useful for object stores to tune performance of file writes. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down
[ https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689077#comment-17689077 ] ASF GitHub Bot commented on HDFS-16761: --- ayushtkn commented on PR #5390: URL: https://github.com/apache/hadoop/pull/5390#issuecomment-1431272820 Not sure; it may be something related to the OS or browser. Go ahead, folks. > Namenode UI for Datanodes page not loading if any data node is down > --- > > Key: HDFS-16761 > URL: https://issues.apache.org/jira/browse/HDFS-16761 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.2.2 >Reporter: Krishna Reddy >Assignee: Zita Dombi >Priority: Major > Labels: pull-request-available > > Steps to reproduce: > - Install the hadoop components and add 3 datanodes > - Enable namenode HA > - Open Namenode UI and check datanode page > - check all datanodes will display > - Now make one datanode down > - wait for 10 minutes time as heartbeat expires > - Refresh namenode page and check > > Actual Result: It is showing error message "NameNode is still loading. > Redirecting to the Startup Progress page." -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist
[ https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689029#comment-17689029 ] ASF GitHub Bot commented on HDFS-16923: --- ZanderXu commented on PR #5400: URL: https://github.com/apache/hadoop/pull/5400#issuecomment-1431139094 @zhengchenyu @xkrogen Could you please help review this NPE introduced by HDFS-16732? > The getListing RPC will throw NPE if the path does not exist > > > Key: HDFS-16923 > URL: https://issues.apache.org/jira/browse/HDFS-16923 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Major > Labels: pull-request-available > > The getListing RPC will throw an NPE if the path does not exist. The stack > is as below: > {code:java} > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): > org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist
[ https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689026#comment-17689026 ] ASF GitHub Bot commented on HDFS-16923: --- ZanderXu opened a new pull request, #5400: URL: https://github.com/apache/hadoop/pull/5400 ### Description of PR Jira: [HDFS-16923](https://issues.apache.org/jira/browse/HDFS-16923) The getListing RPC will throw an NPE if the path does not exist. > The getListing RPC will throw NPE if the path does not exist > > > Key: HDFS-16923 > URL: https://issues.apache.org/jira/browse/HDFS-16923 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Major > > The getListing RPC will throw an NPE if the path does not exist. The stack > is as below: > {code:java} > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): > org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist
[ https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-16923: -- Labels: pull-request-available (was: ) > The getListing RPC will throw NPE if the path does not exist > > > Key: HDFS-16923 > URL: https://issues.apache.org/jira/browse/HDFS-16923 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Major > Labels: pull-request-available > > The getListing RPC will throw an NPE if the path does not exist. The stack > is as below: > {code:java} > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): > org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist
ZanderXu created HDFS-16923: --- Summary: The getListing RPC will throw NPE if the path does not exist Key: HDFS-16923 URL: https://issues.apache.org/jira/browse/HDFS-16923 Project: Hadoop HDFS Issue Type: Bug Reporter: ZanderXu Assignee: ZanderXu The getListing RPC will throw an NPE if the path does not exist. The stack is as below: {code:java} org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783) at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622) at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590) at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
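The stack trace above points at a null dereference when the requested path resolves to nothing. The usual fix pattern is to null-check the lookup result and surface a FileNotFoundException to the client instead of letting the NPE escape. The snippet below is a self-contained illustration of that pattern using a plain-JDK map as a stand-in namespace; it is not the FSNamesystem code, and the actual fix in PR #5400 may differ.

```java
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy namespace showing the null-guard pattern; not actual HDFS code. */
public class GetListingGuard {
    // Stand-in for the directory tree the NameNode resolves a path against.
    private final Map<String, List<String>> namespace = new HashMap<>();

    public GetListingGuard() {
        namespace.put("/user", Arrays.asList("alice", "bob"));
    }

    /** Buggy shape: dereferences the lookup result, so a missing path NPEs. */
    public int buggyListingSize(String path) {
        List<String> listing = namespace.get(path); // null when path is absent
        return listing.size();                      // NullPointerException here
    }

    /** Guarded shape: translate the null into a meaningful client-facing error. */
    public int guardedListingSize(String path) throws FileNotFoundException {
        List<String> listing = namespace.get(path);
        if (listing == null) {
            throw new FileNotFoundException("Path does not exist: " + path);
        }
        return listing.size();
    }

    public static void main(String[] args) throws FileNotFoundException {
        GetListingGuard ns = new GetListingGuard();
        System.out.println("/user has " + ns.guardedListingSize("/user") + " children");
        try {
            ns.guardedListingSize("/no/such/path");
        } catch (FileNotFoundException e) {
            System.out.println("guarded: " + e.getMessage());
        }
    }
}
```

The point of throwing FileNotFoundException rather than returning null is that the RPC layer can relay it to the client as a meaningful RemoteException instead of the opaque NullPointerException shown in the stack trace.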
[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down
[ https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689013#comment-17689013 ] ASF GitHub Bot commented on HDFS-16761: --- tasanuma commented on PR #5390: URL: https://github.com/apache/hadoop/pull/5390#issuecomment-1431114419 I also did the test. The issue was reproduced in both `dfshealth.html` and `federationhealth.html`, and I confirmed this PR fixed them. > Namenode UI for Datanodes page not loading if any data node is down > --- > > Key: HDFS-16761 > URL: https://issues.apache.org/jira/browse/HDFS-16761 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.2.2 >Reporter: Krishna Reddy >Assignee: Zita Dombi >Priority: Major > Labels: pull-request-available > > Steps to reproduce: > - Install the hadoop components and add 3 datanodes > - Enable namenode HA > - Open Namenode UI and check datanode page > - check all datanodes will display > - Now make one datanode down > - wait for 10 minutes time as heartbeat expires > - Refresh namenode page and check > > Actual Result: It is showing error message "NameNode is still loading. > Redirecting to the Startup Progress page." -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16918) Optionally shut down datanode if it does not stay connected to active namenode
[ https://issues.apache.org/jira/browse/HDFS-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688994#comment-17688994 ] ASF GitHub Bot commented on HDFS-16918: --- hadoop-yetus commented on PR #5396: URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1431048438 :broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|::|--:|:|::|:---:|
| +0 :ok: | reexec | 0m 53s | | Docker mode activated. |
| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 :ok: | xmllint | 0m 1s | | xmllint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 46m 29s | | trunk passed |
| +1 :green_heart: | compile | 1m 29s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | compile | 1m 20s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 1m 7s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 30s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 7s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 1m 26s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 31s | | trunk passed |
| +1 :green_heart: | shadedclient | 28m 26s | | branch has no errors when building and testing our client artifacts. |
| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 27s | | the patch passed |
| +1 :green_heart: | compile | 1m 23s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javac | 1m 23s | | the patch passed |
| +1 :green_heart: | compile | 1m 15s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 15s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 54s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 124 unchanged - 0 fixed = 125 total (was 124) |
| +1 :green_heart: | mvnsite | 1m 21s | | the patch passed |
| -1 :x: | javadoc | 0m 53s | [/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/1/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. |
| +1 :green_heart: | javadoc | 1m 23s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 26s | | the patch passed |
| +1 :green_heart: | shadedclient | 28m 27s | | patch has no errors when building and testing our client artifacts. |
| _ Other Tests _ |
| -1 :x: | unit | 229m 42s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 42s | | The patch does not generate ASF License warnings. |
| | | | 355m 51s | | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.hdfs.server.namenode.TestFSNamesystemLockReport |
| | hadoop.hdfs.server.namenode.TestAuditLogs |
| | hadoop.hdfs.server.namenode.TestAuditLogger |
| | hadoop.tools.TestHdfsConfigFields |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.server.namenode.ha.TestObserverNode |
| | hadoop.hdfs.server.namenode.TestFsck |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5396/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5396 |
| Optional Tests | dupnam
[jira] [Commented] (HDFS-16761) Namenode UI for Datanodes page not loading if any data node is down
[ https://issues.apache.org/jira/browse/HDFS-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688972#comment-17688972 ] ASF GitHub Bot commented on HDFS-16761: --- sodonnel commented on PR #5390: URL: https://github.com/apache/hadoop/pull/5390#issuecomment-1431014524 For me, the issue reproduces. I built trunk and started a local docker cluster with 3 nodes. As soon as one of those nodes goes dead, the datanodes tab just redirects to the "startup in progress" page. Tried on Firefox and Safari; both are the same. > Namenode UI for Datanodes page not loading if any data node is down > --- > > Key: HDFS-16761 > URL: https://issues.apache.org/jira/browse/HDFS-16761 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.2.2 >Reporter: Krishna Reddy >Assignee: Zita Dombi >Priority: Major > Labels: pull-request-available > > Steps to reproduce: > - Install the hadoop components and add 3 datanodes > - Enable namenode HA > - Open Namenode UI and check datanode page > - check all datanodes will display > - Now make one datanode down > - wait for 10 minutes time as heartbeat expires > - Refresh namenode page and check > > Actual Result: It is showing error message "NameNode is still loading. > Redirecting to the Startup Progress page." -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] (HDFS-16799) The dn space size is not consistent, and Balancer can not work, resulting in a very unbalanced space
[ https://issues.apache.org/jira/browse/HDFS-16799 ] ruiliang deleted comment on HDFS-16799: - was (Author: ruilaing): ok > The dn space size is not consistent, and Balancer can not work, resulting in > a very unbalanced space > > > Key: HDFS-16799 > URL: https://issues.apache.org/jira/browse/HDFS-16799 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.0 >Reporter: ruiliang >Priority: Blocker > > > {code:java} > echo 'A DFS Used 99.8% to ip' > sorucehost > hdfs --debug balancer -fs hdfs://xxcluster06 -threshold 10 -source -f > sorucehost > > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.243:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.247:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-15-10/10.12.65.214:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-02-08/10.12.14.8:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-13/10.12.15.154:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-04/10.12.65.218:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.143:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-05/10.12.12.200:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.217:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.142:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.246:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.219:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.147:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-15-10/10.12.65.186:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: 
> /4F08-05-13/10.12.15.153:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-03-07/10.12.19.23:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-04-14/10.12.65.119:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.131:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-04/10.12.12.210:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-11/10.12.14.168:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.245:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-03-02/10.12.17.26:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.241:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-13/10.12.15.152:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.249:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-07-14/10.12.64.71:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-03-03/10.12.17.35:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.195:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.242:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.248:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.240:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-15-12/10.12.65.196:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-13/10.12.15.150:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.222:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.145:1019 > 22/10/09 16:43:52 INFO 
net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.244:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-03-07/10.12.19.22:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.221:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.136:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.129:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-15/10.12.15.163:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-07-14/10.12.64.72:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a
[jira] [Commented] (HDFS-16799) The dn space size is not consistent, and Balancer can not work, resulting in a very unbalanced space
[ https://issues.apache.org/jira/browse/HDFS-16799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688914#comment-17688914 ] ruiliang commented on HDFS-16799: - ok > The dn space size is not consistent, and Balancer can not work, resulting in > a very unbalanced space > > > Key: HDFS-16799 > URL: https://issues.apache.org/jira/browse/HDFS-16799 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.0 >Reporter: ruiliang >Priority: Blocker > > > {code:java} > echo 'A DFS Used 99.8% to ip' > sorucehost > hdfs --debug balancer -fs hdfs://xxcluster06 -threshold 10 -source -f > sorucehost > > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.243:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.247:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-15-10/10.12.65.214:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-02-08/10.12.14.8:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-13/10.12.15.154:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-04/10.12.65.218:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.143:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-05/10.12.12.200:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.217:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.142:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.246:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.219:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.147:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > 
/4F08-15-10/10.12.65.186:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-13/10.12.15.153:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-03-07/10.12.19.23:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-04-14/10.12.65.119:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.131:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-04/10.12.12.210:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-11/10.12.14.168:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.245:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-03-02/10.12.17.26:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.241:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-13/10.12.15.152:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.249:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-07-14/10.12.64.71:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-03-03/10.12.17.35:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.195:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.242:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.248:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.240:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-15-12/10.12.65.196:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-13/10.12.15.150:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.222:1019 > 22/10/09 16:43:52 INFO 
net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.145:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-01-08/10.12.65.244:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-03-07/10.12.19.22:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.221:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.136:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-12-03/10.12.65.129:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-05-15/10.12.15.163:1019 > 22/10/09 16:43:52 INFO net.NetworkTopology: Adding a new node: > /4F08-0