[jira] [Resolved] (HDFS-17405) [FGL] Using different metric name to trace FGL and Global lock
[ https://issues.apache.org/jira/browse/HDFS-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hui Fei resolved HDFS-17405.
----------------------------
    Resolution: Fixed

> [FGL] Using different metric name to trace FGL and Global lock
> --------------------------------------------------------------
>
>                 Key: HDFS-17405
>                 URL: https://issues.apache.org/jira/browse/HDFS-17405
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, the fine-grained locks and the global lock use the same metric name
> to trace their performance, so we need to differentiate them.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
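
The idea in the HDFS-17405 description, giving the fine-grained locks and the global lock distinct metric names, can be illustrated with a small standalone sketch. Everything in it is hypothetical: the class `LockMetricNames`, the enum `LockKind` and its values, and the name layout are assumptions made for illustration only, and none of it is taken from the actual patch in PR #6600.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch only; this is not the HDFS-17405 patch. */
final class LockMetricNames {
  /** Illustrative lock kinds (assumed names, not confirmed by the source). */
  enum LockKind { GLOBAL, FS, BM }

  /** Total held time in nanoseconds, keyed by metric name. */
  private final Map<String, LongAdder> heldNanos = new ConcurrentHashMap<>();

  /** Builds a metric name that starts with the lock kind, so kinds never collide. */
  static String metricName(LockKind kind, boolean write, String operation) {
    return kind.name() + (write ? "WriteLock" : "ReadLock") + operation;
  }

  /** Adds one observation under the name derived from the lock kind. */
  void record(LockKind kind, boolean write, String operation, long nanos) {
    heldNanos.computeIfAbsent(metricName(kind, write, operation), k -> new LongAdder())
        .add(nanos);
  }

  /** Returns the accumulated nanoseconds for a metric name, or 0 if unseen. */
  long total(String name) {
    LongAdder adder = heldNanos.get(name);
    return adder == null ? 0L : adder.sum();
  }

  public static void main(String[] args) {
    LockMetricNames metrics = new LockMetricNames();
    metrics.record(LockKind.GLOBAL, true, "Mkdirs", 500L);
    metrics.record(LockKind.FS, true, "Mkdirs", 120L);
    // The same operation is now tracked under two separate names.
    System.out.println(metrics.total("GLOBALWriteLockMkdirs"));
    System.out.println(metrics.total("FSWriteLockMkdirs"));
  }
}
```

With a shared name, the two `record` calls above would be summed into one counter, and the cost of the global lock could not be separated from the cost of a fine-grained lock.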
[jira] [Commented] (HDFS-17405) [FGL] Using different metric name to trace FGL and Global lock
[ https://issues.apache.org/jira/browse/HDFS-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823883#comment-17823883 ]

ASF GitHub Bot commented on HDFS-17405:
---------------------------------------

ferhui commented on PR #6600:
URL: https://github.com/apache/hadoop/pull/6600#issuecomment-1980230785

   Failed cases are unrelated.

> [FGL] Using different metric name to trace FGL and Global lock
> --------------------------------------------------------------
>
>                 Key: HDFS-17405
>                 URL: https://issues.apache.org/jira/browse/HDFS-17405
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, the fine-grained locks and the global lock use the same metric name
> to trace their performance, so we need to differentiate them.
[jira] [Commented] (HDFS-17405) [FGL] Using different metric name to trace FGL and Global lock
[ https://issues.apache.org/jira/browse/HDFS-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823880#comment-17823880 ]

ASF GitHub Bot commented on HDFS-17405:
---------------------------------------

hadoop-yetus commented on PR #6600:
URL: https://github.com/apache/hadoop/pull/6600#issuecomment-1980222696

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 7m 16s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ HDFS-17384 Compile Tests _ |
| +1 :green_heart: | mvninstall | 33m 4s | | HDFS-17384 passed |
| +1 :green_heart: | compile | 0m 44s | | HDFS-17384 passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 43s | | HDFS-17384 passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 0m 38s | | HDFS-17384 passed |
| +1 :green_heart: | mvnsite | 0m 44s | | HDFS-17384 passed |
| +1 :green_heart: | javadoc | 0m 43s | | HDFS-17384 passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 8s | | HDFS-17384 passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 1m 48s | | HDFS-17384 passed |
| +1 :green_heart: | shadedclient | 20m 31s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 37s | | the patch passed |
| +1 :green_heart: | compile | 0m 36s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 36s | | the patch passed |
| +1 :green_heart: | compile | 0m 32s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 0m 32s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 30s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 38s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 31s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 2s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 1m 42s | | the patch passed |
| +1 :green_heart: | shadedclient | 20m 22s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 199m 35s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6600/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 31s | | The patch does not generate ASF License warnings. |
| | | 295m 15s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.protocol.TestBlockListAsLongs |
| | hadoop.hdfs.tools.TestDFSAdmin |
| | hadoop.hdfs.server.datanode.TestLargeBlockReport |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6600/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6600 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux fda160309c15 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | HDFS-17384 / 1c72a0e6530880d2ac5871e4e1c33f0987830f42 |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6600/2/testReport/ |
| Max. process+thread count | 4448 (vs. ulimit of 5500) |
[jira] [Commented] (HDFS-17397) Choose another DN as soon as possible, when encountering network issues
[ https://issues.apache.org/jira/browse/HDFS-17397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823840#comment-17823840 ]

ASF GitHub Bot commented on HDFS-17397:
---------------------------------------

hadoop-yetus commented on PR #6591:
URL: https://github.com/apache/hadoop/pull/6591#issuecomment-1980006462

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 32s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 47m 1s | | trunk passed |
| +1 :green_heart: | compile | 1m 0s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 55s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 0m 34s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 58s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 49s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 44s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| -1 :x: | spotbugs | 2m 38s | [/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6591/9/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html) | hadoop-hdfs-project/hadoop-hdfs-client in trunk has 1 extant spotbugs warnings. |
| +1 :green_heart: | shadedclient | 34m 12s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 49s | | the patch passed |
| +1 :green_heart: | compile | 0m 51s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 51s | | the patch passed |
| +1 :green_heart: | compile | 0m 45s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 0m 45s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 20s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 47s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 34s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 34s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 2m 33s | | the patch passed |
| +1 :green_heart: | shadedclient | 34m 30s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 27s | | hadoop-hdfs-client in the patch passed. |
| +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. |
| | | 136m 36s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6591/9/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6591 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 48f8ed5137d7 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 6dfaaef8c282276ac4e4f88468644f11cd67a11d |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6591/9/testReport/ |
| Max. process+thread count | 705 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-
[jira] [Commented] (HDFS-17380) FsImageValidation: remove inaccessible nodes
[ https://issues.apache.org/jira/browse/HDFS-17380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823827#comment-17823827 ]

ASF GitHub Bot commented on HDFS-17380:
---------------------------------------

hadoop-yetus commented on PR #6549:
URL: https://github.com/apache/hadoop/pull/6549#issuecomment-1979930650

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 13m 26s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 44m 8s | | trunk passed |
| +1 :green_heart: | compile | 1m 19s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 1m 15s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 1m 9s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 26s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 6s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 33s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 11s | | trunk passed |
| +1 :green_heart: | shadedclient | 34m 44s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 7s | | the patch passed |
| +1 :green_heart: | compile | 1m 15s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 1m 15s | | the patch passed |
| +1 :green_heart: | compile | 1m 6s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 6s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 57s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 11s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 52s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 27s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 16s | | the patch passed |
| +1 :green_heart: | shadedclient | 35m 8s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 226m 9s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6549/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 47s | | The patch does not generate ASF License warnings. |
| | | 377m 52s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.protocol.TestBlockListAsLongs |
| | hadoop.hdfs.server.datanode.TestLargeBlockReport |
| | hadoop.hdfs.tools.TestDFSAdmin |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6549/6/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6549 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 72343f7ce41f 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 44dc54115731700891f03157ce0ac672f5dff6a7 |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6549/6/
[jira] [Commented] (HDFS-17299) HDFS is not rack failure tolerant while creating a new file.
[ https://issues.apache.org/jira/browse/HDFS-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823823#comment-17823823 ]

ASF GitHub Bot commented on HDFS-17299:
---------------------------------------

ritegarg commented on PR #6566:
URL: https://github.com/apache/hadoop/pull/6566#issuecomment-1979910458

   > There are some new checkstyle issues from the result of CI. Could you fix them?

   Should be fixed now

> HDFS is not rack failure tolerant while creating a new file.
> ------------------------------------------------------------
>
>                 Key: HDFS-17299
>                 URL: https://issues.apache.org/jira/browse/HDFS-17299
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.10.1
>            Reporter: Rushabh Shah
>            Assignee: Ritesh
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: repro.patch
>
>
> Recently we saw an HBase cluster outage when we mistakenly brought down 1 AZ.
> Our configuration:
> 1. We use 3 Availability Zones (AZs) for fault tolerance.
> 2. We use BlockPlacementPolicyRackFaultTolerant as the block placement policy.
> 3. We use the following configuration parameters:
> dfs.namenode.heartbeat.recheck-interval: 600000
> dfs.heartbeat.interval: 3
> So it will take 1230000 ms (20.5 mins) to detect that a datanode is dead.
>
> Steps to reproduce:
> # Bring down 1 AZ.
> # HBase (HDFS client) tries to create a file (WAL file) and then calls
> hflush on the newly created file.
> # DataStreamer is not able to find block locations that satisfy the rack
> placement policy (one copy in each rack, which essentially means one copy in
> each AZ).
> # Since all the datanodes in that AZ are down but still alive to the namenode,
> the client gets different datanodes, but all of them are still in the same AZ.
> See logs below.
> # HBase is not able to create a WAL file and it aborts the region server.
>
> Relevant logs from hdfs client and namenode
>
> {noformat}
> 2023-12-16 17:17:43,818 INFO [on default port 9000] FSNamesystem.audit - allowed=true ugi=hbase/ (auth:KERBEROS) ip= cmd=create src=/hbase/WALs/ dst=null
> 2023-12-16 17:17:43,978 INFO [on default port 9000] hdfs.StateChange - BLOCK* allocate blk_1214652565_140946716, replicas=:50010, :50010, :50010 for /hbase/WALs/
> 2023-12-16 17:17:44,061 INFO [Thread-39087] hdfs.DataStreamer - Exception in createBlockOutputStream
> java.io.IOException: Got error, status=ERROR, status message , ack with firstBadLink as :50010
>     at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:113)
>     at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1747)
>     at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1651)
>     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:715)
> 2023-12-16 17:17:44,061 WARN [Thread-39087] hdfs.DataStreamer - Abandoning BP-179318874--1594838129323:blk_1214652565_140946716
> 2023-12-16 17:17:44,179 WARN [Thread-39087] hdfs.DataStreamer - Excluding datanode DatanodeInfoWithStorage[:50010,DS-a493abdb-3ac3-49b1-9bfb-848baf5c1c2c,DISK]
> 2023-12-16 17:17:44,339 INFO [on default port 9000] hdfs.StateChange - BLOCK* allocate blk_1214652580_140946764, replicas=:50010, :50010, :50010 for /hbase/WALs/
> 2023-12-16 17:17:44,369 INFO [Thread-39087] hdfs.DataStreamer - Exception in createBlockOutputStream
> java.io.IOException: Got error, status=ERROR, status message , ack with firstBadLink as :50010
>     at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:113)
>     at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1747)
>     at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1651)
>     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:715)
> 2023-12-16 17:17:44,369 WARN [Thread-39087] hdfs.DataStreamer - Abandoning BP-179318874-NN-IP-1594838129323:blk_1214652580_140946764
> 2023-12-16 17:17:44,454 WARN [Thread-39087] hdfs.DataStreamer - Excluding datanode DatanodeInfoWithStorage[AZ-2-dn-2:50010,DS-46bb45cc-af89-46f3-9f9d-24e4fdc35b6d,DISK]
> 2023-12-16 17:17:44,522 INFO [on default port 9000] hdfs.StateChange - BLOCK* allocate blk_1214652594_140946796, replicas=:50010, :50010, :50010 for /hbase/WALs/
> 2023-12-16 17:17:44,712 INFO [Thread-39087] hdfs.DataStreamer - Exception in createBlockOutputStream
> java.io.IOException: Got error, status=ERROR, status message , ack with firstBadLink as :50010
>     at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:113)
>     at org
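
The 20.5-minute dead-node detection window quoted in the HDFS-17299 description can be reproduced with simple arithmetic. The sketch below assumes the NameNode's usual expiry rule, twice the recheck interval plus ten heartbeat intervals, and assumes a recheck interval of 600000 ms, which is the value consistent with the stated 20.5 minutes. The class and method names are hypothetical helpers, not Hadoop APIs.

```java
/** Hypothetical helper that reproduces the dead-datanode detection arithmetic. */
final class HeartbeatExpiry {
  /**
   * Assumed rule: expiry = 2 * recheck interval (ms) + 10 * heartbeat interval (s, converted to ms).
   */
  static long expireIntervalMs(long recheckIntervalMs, long heartbeatIntervalSec) {
    return 2 * recheckIntervalMs + 10 * 1000 * heartbeatIntervalSec;
  }

  public static void main(String[] args) {
    // dfs.namenode.heartbeat.recheck-interval = 600000 ms (assumed), dfs.heartbeat.interval = 3 s
    long ms = expireIntervalMs(600_000L, 3L);
    System.out.println(ms + " ms = " + ms / 60_000.0 + " min"); // prints: 1230000 ms = 20.5 min
  }
}
```

During that window the NameNode may still hand out datanodes from the failed AZ, which matches the repeated "Excluding datanode" retries in the client log.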
[jira] [Commented] (HDFS-17299) HDFS is not rack failure tolerant while creating a new file.
[ https://issues.apache.org/jira/browse/HDFS-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823820#comment-17823820 ]

ASF GitHub Bot commented on HDFS-17299:
---------------------------------------

tasanuma commented on PR #6566:
URL: https://github.com/apache/hadoop/pull/6566#issuecomment-1979904394

   There are some new checkstyle issues from the result of CI. Could you fix them?

> HDFS is not rack failure tolerant while creating a new file.
> ------------------------------------------------------------
>
>                 Key: HDFS-17299
>                 URL: https://issues.apache.org/jira/browse/HDFS-17299
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.10.1
>            Reporter: Rushabh Shah
>            Assignee: Ritesh
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: repro.patch
[jira] [Commented] (HDFS-17146) Use the dfsadmin -reconfig command to initiate reconfiguration on all decommissioning datanodes.
[ https://issues.apache.org/jira/browse/HDFS-17146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823791#comment-17823791 ]

ASF GitHub Bot commented on HDFS-17146:
---------------------------------------

slfan1989 commented on code in PR #6595:
URL: https://github.com/apache/hadoop/pull/6595#discussion_r1513586786

## hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSAdmin.java: ##

@@ -1377,7 +1377,8 @@ public Boolean get() {
         }
         scanIntoList(out, outs);
         scanIntoList(err, errs);
-        return !outs.isEmpty() && outs.get(0).contains("finished");
+        return !outs.isEmpty() && outs.stream().filter(line ->
+            line.contains("finished")).count() == 2;

Review Comment:
   Can we explain this change?

> Use the dfsadmin -reconfig command to initiate reconfiguration on all
> decommissioning datanodes.
> ----------------------------------------------------------------------
>
>                 Key: HDFS-17146
>                 URL: https://issues.apache.org/jira/browse/HDFS-17146
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: dfsadmin, hdfs
>    Affects Versions: 3.4.0
>            Reporter: Hualong Zhang
>            Assignee: Hualong Zhang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.1, 3.5.0
>
>
> If the *DFSAdmin* command could have the ability to perform bulk operations
> across all decommissioned datanodes, that would be highly advantageous.
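
The behavioural difference between the two predicates in the TestDFSAdmin hunk under review can be shown in isolation. This is only a sketch: the class name and the sample output lines are invented for illustration, and it does not claim to state the PR author's reason for expecting exactly two matching lines. One plausible reading, labeled here as an assumption, is that the command prints one "finished" line per datanode and the test waits on two datanodes.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical comparison of the old and new wait conditions from the diff. */
final class FinishedLineCheck {
  /** Old condition: only the first output line is inspected. */
  static boolean firstLineFinished(List<String> outs) {
    return !outs.isEmpty() && outs.get(0).contains("finished");
  }

  /** New condition: exactly two output lines must contain "finished". */
  static boolean exactlyTwoFinished(List<String> outs) {
    return !outs.isEmpty()
        && outs.stream().filter(line -> line.contains("finished")).count() == 2;
  }

  public static void main(String[] args) {
    // Invented sample output; the real dfsadmin output format is not shown in the source.
    List<String> oneDone = Arrays.asList("node-a: finished", "node-b: started");
    List<String> bothDone = Arrays.asList("node-a: finished", "node-b: finished");

    System.out.println(firstLineFinished(oneDone));   // old check passes early
    System.out.println(exactlyTwoFinished(oneDone));  // new check keeps waiting
    System.out.println(exactlyTwoFinished(bothDone)); // new check passes
  }
}
```

Under that assumption, the old condition would return as soon as the first node reports completion, while the new one waits until both have reported.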
[jira] [Commented] (HDFS-17404) Add Namenode info to log message when setting block keys from active nn
[ https://issues.apache.org/jira/browse/HDFS-17404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823783#comment-17823783 ]

ASF GitHub Bot commented on HDFS-17404:
---------------------------------------

ctrezzo commented on PR #6609:
URL: https://github.com/apache/hadoop/pull/6609#issuecomment-1979689511

   Cherry picked the original commit from trunk since it was trivial.

> Add Namenode info to log message when setting block keys from active nn
> -----------------------------------------------------------------------
>
>                 Key: HDFS-17404
>                 URL: https://issues.apache.org/jira/browse/HDFS-17404
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Joseph Dell'Aringa
>            Priority: Trivial
>              Labels: pull-request-available
>             Fix For: 3.3.7, 3.4.1, 3.5.0
>
>
> Add Namenode info to log message when setting block keys from active nn
[jira] [Commented] (HDFS-17404) Add Namenode info to log message when setting block keys from active nn
[ https://issues.apache.org/jira/browse/HDFS-17404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823782#comment-17823782 ]

ASF GitHub Bot commented on HDFS-17404:
---------------------------------------

ctrezzo closed pull request #6609: HDFS-17404. Add NN Socket Address to log when processing command from active NN
URL: https://github.com/apache/hadoop/pull/6609

> Add Namenode info to log message when setting block keys from active nn
> -----------------------------------------------------------------------
>
>                 Key: HDFS-17404
>                 URL: https://issues.apache.org/jira/browse/HDFS-17404
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Joseph Dell'Aringa
>            Priority: Trivial
>              Labels: pull-request-available
>             Fix For: 3.3.7, 3.4.1, 3.5.0
>
>
> Add Namenode info to log message when setting block keys from active nn
[jira] [Commented] (HDFS-17146) Use the dfsadmin -reconfig command to initiate reconfiguration on all decommissioning datanodes.
[ https://issues.apache.org/jira/browse/HDFS-17146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823764#comment-17823764 ]

ASF GitHub Bot commented on HDFS-17146:
---------------------------------------

hadoop-yetus commented on PR #6595:
URL: https://github.com/apache/hadoop/pull/6595#issuecomment-1979594510

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 17m 32s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 49m 14s | | trunk passed |
| +1 :green_heart: | compile | 1m 21s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 1m 13s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 1m 13s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 23s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 8s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 38s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 20s | | trunk passed |
| +1 :green_heart: | shadedclient | 40m 12s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 13s | | the patch passed |
| +1 :green_heart: | compile | 1m 14s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 1m 14s | | the patch passed |
| +1 :green_heart: | compile | 1m 5s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 5s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 1m 1s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 14s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 56s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 31s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 17s | | the patch passed |
| +1 :green_heart: | shadedclient | 41m 1s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 258m 21s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6595/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 46s | | The patch does not generate ASF License warnings. |
| | | 430m 35s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand |
| | hadoop.hdfs.TestRollingUpgrade |
| | hadoop.hdfs.server.datanode.TestLargeBlockReport |
| | hadoop.hdfs.protocol.TestBlockListAsLongs |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6595/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6595 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 94272d3e905b 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 1319692bb8a97ab12d3198506396969b16bae8c3 |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6595/3/testReport/ |
| Max. process+thread count | 3187
[jira] [Updated] (HDFS-17404) Add Namenode info to log message when setting block keys from active nn
[ https://issues.apache.org/jira/browse/HDFS-17404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Trezzo updated HDFS-17404:
    Fix Version/s: 3.3.7
                   3.4.1

> Add Namenode info to log message when setting block keys from active nn
> ---
>
> Key: HDFS-17404
> URL: https://issues.apache.org/jira/browse/HDFS-17404
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Joseph Dell'Aringa
> Priority: Trivial
> Labels: pull-request-available
> Fix For: 3.3.7, 3.4.1, 3.5.0
>
> Add Namenode info to log message when setting block keys from active nn
[jira] [Resolved] (HDFS-17404) Add Namenode info to log message when setting block keys from active nn
[ https://issues.apache.org/jira/browse/HDFS-17404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Trezzo resolved HDFS-17404.
    Resolution: Fixed

> Add Namenode info to log message when setting block keys from active nn
> ---
>
> Key: HDFS-17404
> URL: https://issues.apache.org/jira/browse/HDFS-17404
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Joseph Dell'Aringa
> Priority: Trivial
> Labels: pull-request-available
> Fix For: 3.5.0
>
> Add Namenode info to log message when setting block keys from active nn
[jira] [Updated] (HDFS-17404) Add Namenode info to log message when setting block keys from active nn
[ https://issues.apache.org/jira/browse/HDFS-17404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Trezzo updated HDFS-17404:
    Fix Version/s: 3.5.0

> Add Namenode info to log message when setting block keys from active nn
> ---
>
> Key: HDFS-17404
> URL: https://issues.apache.org/jira/browse/HDFS-17404
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Joseph Dell'Aringa
> Priority: Trivial
> Labels: pull-request-available
> Fix For: 3.5.0
>
> Add Namenode info to log message when setting block keys from active nn
[jira] [Commented] (HDFS-17404) Add Namenode info to log message when setting block keys from active nn
[ https://issues.apache.org/jira/browse/HDFS-17404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823741#comment-17823741 ]

ASF GitHub Bot commented on HDFS-17404:
---

ctrezzo merged PR #6598:
URL: https://github.com/apache/hadoop/pull/6598

> Add Namenode info to log message when setting block keys from active nn
> ---
>
> Key: HDFS-17404
> URL: https://issues.apache.org/jira/browse/HDFS-17404
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Joseph Dell'Aringa
> Priority: Trivial
> Labels: pull-request-available
>
> Add Namenode info to log message when setting block keys from active nn
[jira] [Commented] (HDFS-17299) HDFS is not rack failure tolerant while creating a new file.
[ https://issues.apache.org/jira/browse/HDFS-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823736#comment-17823736 ]

ASF GitHub Bot commented on HDFS-17299:
---

shahrs87 commented on PR #6566:
URL: https://github.com/apache/hadoop/pull/6566#issuecomment-1979352026

   1. The [spotbugs](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6566/20/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html) warning is not related to the patch, so ignoring it for now.
   2. There are 3 test failures. 2 of them ([TestBlockListAsLongs.testFuzz](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6566/20/testReport/junit/org.apache.hadoop.hdfs.protocol/TestBlockListAsLongs/testFuzz/) and [TestLargeBlockReport.testBlockReportSucceedsWithLargerLengthLimit](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6566/20/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestLargeBlockReport/testBlockReportSucceedsWithLargerLengthLimit/)) are consistently failing in nightly builds.
   3. The third test failure ([TestDFSAdmin.testDecommissionDataNodesReconfig](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6566/20/testReport/junit/org.apache.hadoop.hdfs.tools/TestDFSAdmin/testDecommissionDataNodesReconfig/)) is flaky. Created [HDFS-17409](https://issues.apache.org/jira/browse/HDFS-17409) for further investigation.

   This PR is ready for review again. All the comments have been addressed by @ritegarg.
   If there are no more comments by EOD tomorrow, I will merge this PR.
   Cc @ayushtkn @tasanuma

> HDFS is not rack failure tolerant while creating a new file.
>
> Key: HDFS-17299
> URL: https://issues.apache.org/jira/browse/HDFS-17299
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.10.1
> Reporter: Rushabh Shah
> Assignee: Ritesh
> Priority: Critical
> Labels: pull-request-available
> Attachments: repro.patch
>
> Recently we saw an HBase cluster outage when we mistakenly brought down 1 AZ.
> Our configuration:
> 1. We use 3 Availability Zones (AZs) for fault tolerance.
> 2. We use BlockPlacementPolicyRackFaultTolerant as the block placement policy.
> 3. We use the following configuration parameters:
> dfs.namenode.heartbeat.recheck-interval: 600000
> dfs.heartbeat.interval: 3
> So it will take 1230000 ms (20.5 mins) to detect that a datanode is dead.
>
> Steps to reproduce:
> # Bring down 1 AZ.
> # HBase (HDFS client) tries to create a file (WAL file) and then calls hflush on the newly created file.
> # DataStreamer is not able to find block locations that satisfy the rack placement policy (one copy in each rack, which essentially means one copy in each AZ).
> # Since all the datanodes in that AZ are down but are still considered alive by the namenode, the client gets different datanodes, but all of them are still in the same AZ. See logs below.
> # HBase is not able to create a WAL file and it aborts the region server.
>
> Relevant logs from hdfs client and namenode
>
> {noformat}
> 2023-12-16 17:17:43,818 INFO [on default port 9000] FSNamesystem.audit - allowed=true ugi=hbase/ (auth:KERBEROS) ip= cmd=create src=/hbase/WALs/ dst=null
> 2023-12-16 17:17:43,978 INFO [on default port 9000] hdfs.StateChange - BLOCK* allocate blk_1214652565_140946716, replicas=:50010, :50010, :50010 for /hbase/WALs/
> 2023-12-16 17:17:44,061 INFO [Thread-39087] hdfs.DataStreamer - Exception in createBlockOutputStream
> java.io.IOException: Got error, status=ERROR, status message , ack with firstBadLink as :50010
>     at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:113)
>     at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1747)
>     at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1651)
>     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:715)
> 2023-12-16 17:17:44,061 WARN [Thread-39087] hdfs.DataStreamer - Abandoning BP-179318874--1594838129323:blk_1214652565_140946716
> 2023-12-16 17:17:44,179 WARN [Thread-39087] hdfs.DataStreamer - Excluding datanode DatanodeInfoWithStorage[:50010,DS-a493abdb-3ac3-49b1-9bfb-848baf5c1c2c,DISK]
> 2023-12-16 17:17:44,339 INFO [on default port 9000] hdfs.StateChange - BLOCK* allocate blk_1214652580_140946764, replicas=:50010, :50010, :50010 for /hbase/WALs/
> 2023-12-16 17:17:44,369 INFO [Thread-39087] hdfs.DataStreamer - Exception in createBlockOutputStream
> java.io.IOException: Got error, status=ERROR, status message , ack with firstBadLink as :50010
>     at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransf
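
The 20.5-minute detection delay quoted in the HDFS-17299 description follows from the two heartbeat settings it lists. Below is a minimal sketch of that arithmetic, assuming the usual NameNode rule (2 x recheck interval + 10 x heartbeat interval) and a recheck interval of 600000 ms, which is the value that reproduces the 20.5 minutes stated in the description. The class and method names are illustrative only and are not Hadoop API.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative helper (not part of Hadoop) for the dead-datanode detection delay. */
final class DeadNodeInterval {
  private DeadNodeInterval() {}

  /**
   * @param recheckIntervalMs    assumed value of dfs.namenode.heartbeat.recheck-interval, in ms
   * @param heartbeatIntervalSec assumed value of dfs.heartbeat.interval, in seconds
   * @return time in ms after which a silent datanode is treated as dead
   */
  static long expiryMillis(long recheckIntervalMs, long heartbeatIntervalSec) {
    return 2 * recheckIntervalMs + 10 * TimeUnit.SECONDS.toMillis(heartbeatIntervalSec);
  }

  public static void main(String[] args) {
    long ms = expiryMillis(600_000L, 3L);
    // 2 * 600000 + 10 * 3000 = 1230000 ms, i.e. 20.5 minutes
    System.out.println(ms + " ms = " + (ms / 60_000.0) + " min");
  }
}
```

During that window the NameNode can keep handing out datanodes from the failed AZ, which matches the repeated "Excluding datanode" entries in the quoted logs.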
[jira] [Updated] (HDFS-17404) Add Namenode info to log message when setting block keys from active nn
[ https://issues.apache.org/jira/browse/HDFS-17404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Trezzo updated HDFS-17404:
    Description: Add Namenode info to log message when setting block keys from active nn
    (was: When BPServiceActor communicates with an active nn we should record the nn address. This information is already available for standby nn.)

> Add Namenode info to log message when setting block keys from active nn
> ---
>
> Key: HDFS-17404
> URL: https://issues.apache.org/jira/browse/HDFS-17404
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Joseph Dell'Aringa
> Priority: Trivial
> Labels: pull-request-available
>
> Add Namenode info to log message when setting block keys from active nn
[jira] [Updated] (HDFS-17404) Add Namenode info to log message when setting block keys from active nn
[ https://issues.apache.org/jira/browse/HDFS-17404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Trezzo updated HDFS-17404:
    Summary: Add Namenode info to log message when setting block keys from active nn
    (was: Add NN Socket Address to log when processing command from active NN)

> Add Namenode info to log message when setting block keys from active nn
> ---
>
> Key: HDFS-17404
> URL: https://issues.apache.org/jira/browse/HDFS-17404
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Joseph Dell'Aringa
> Priority: Trivial
> Labels: pull-request-available
>
> When BPServiceActor communicates with an active nn we should record the nn address.
> This information is already available for standby nn.
[jira] [Updated] (HDFS-17404) Add NN Socket Address to log when processing command from active NN
[ https://issues.apache.org/jira/browse/HDFS-17404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Trezzo updated HDFS-17404:
    Summary: Add NN Socket Address to log when processing command from active NN
    (was: Record the BPServiceActor information that communicates with Active)

> Add NN Socket Address to log when processing command from active NN
> ---
>
> Key: HDFS-17404
> URL: https://issues.apache.org/jira/browse/HDFS-17404
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Joseph Dell'Aringa
> Priority: Trivial
> Labels: pull-request-available
>
> When BPServiceActor communicates with an active nn we should record the nn address.
> This information is already available for standby nn.
[jira] [Created] (HDFS-17409) TestDFSAdmin has couple of flaky tests.
Rushabh Shah created HDFS-17409:
---

Summary: TestDFSAdmin has couple of flaky tests.
Key: HDFS-17409
URL: https://issues.apache.org/jira/browse/HDFS-17409
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 3.5.0
Reporter: Rushabh Shah

TestDFSAdmin has a couple of flaky tests:
TestDFSAdmin#testAllDatanodesReconfig
TestDFSAdmin#testDecommissionDataNodesReconfig

Seeing failures in a couple of separate PRs:
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6583/7/#showFailuresLink
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6566/20/

testDecommissionDataNodesReconfig is failing locally with the following error:
{noformat}
java.lang.AssertionError
    at org.junit.Assert.fail(Assert.java:87)
    at org.junit.Assert.assertTrue(Assert.java:42)
    at org.junit.Assert.assertTrue(Assert.java:53)
    at org.apache.hadoop.hdfs.tools.TestDFSAdmin.testDecommissionDataNodesReconfig(TestDFSAdmin.java:1356)
{noformat}
[jira] [Commented] (HDFS-17299) HDFS is not rack failure tolerant while creating a new file.
[ https://issues.apache.org/jira/browse/HDFS-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823715#comment-17823715 ]

ASF GitHub Bot commented on HDFS-17299:
---

ritegarg commented on code in PR #6566:
URL: https://github.com/apache/hadoop/pull/6566#discussion_r1513209226

## hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientExcludedNodes.java:
##
@@ -89,6 +89,10 @@ public void testExcludedNodesForgiveness() throws IOException {
     conf.setLong(
         HdfsClientConfigKeys.Write.EXCLUDE_NODES_CACHE_EXPIRY_INTERVAL_KEY,
         2500);
+    // Set min replication for blocks to be written as 1.
+    conf.setInt(
+        HdfsClientConfigKeys.BlockWrite.ReplaceDatanodeOnFailure.MIN_REPLICATION,
+        1);

Review Comment:
   Fixed this behavior by adding try/catch to DataStreamer.setupPipelineForCreate

> HDFS is not rack failure tolerant while creating a new file.
>
> Key: HDFS-17299
> URL: https://issues.apache.org/jira/browse/HDFS-17299
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.10.1
> Reporter: Rushabh Shah
> Assignee: Ritesh
> Priority: Critical
> Labels: pull-request-available
> Attachments: repro.patch
>
> Recently we saw an HBase cluster outage when we mistakenly brought down 1 AZ.
> Our configuration:
> 1. We use 3 Availability Zones (AZs) for fault tolerance.
> 2. We use BlockPlacementPolicyRackFaultTolerant as the block placement policy.
> 3. We use the following configuration parameters:
> dfs.namenode.heartbeat.recheck-interval: 600000
> dfs.heartbeat.interval: 3
> So it will take 1230000 ms (20.5 mins) to detect that a datanode is dead.
>
> Steps to reproduce:
> # Bring down 1 AZ.
> # HBase (HDFS client) tries to create a file (WAL file) and then calls hflush on the newly created file.
> # DataStreamer is not able to find block locations that satisfy the rack placement policy (one copy in each rack, which essentially means one copy in each AZ).
> # Since all the datanodes in that AZ are down but are still considered alive by the namenode, the client gets different datanodes, but all of them are still in the same AZ. See logs below.
> # HBase is not able to create a WAL file and it aborts the region server.
>
> Relevant logs from hdfs client and namenode
>
> {noformat}
> 2023-12-16 17:17:43,818 INFO [on default port 9000] FSNamesystem.audit - allowed=true ugi=hbase/ (auth:KERBEROS) ip= cmd=create src=/hbase/WALs/ dst=null
> 2023-12-16 17:17:43,978 INFO [on default port 9000] hdfs.StateChange - BLOCK* allocate blk_1214652565_140946716, replicas=:50010, :50010, :50010 for /hbase/WALs/
> 2023-12-16 17:17:44,061 INFO [Thread-39087] hdfs.DataStreamer - Exception in createBlockOutputStream
> java.io.IOException: Got error, status=ERROR, status message , ack with firstBadLink as :50010
>     at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:113)
>     at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1747)
>     at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1651)
>     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:715)
> 2023-12-16 17:17:44,061 WARN [Thread-39087] hdfs.DataStreamer - Abandoning BP-179318874--1594838129323:blk_1214652565_140946716
> 2023-12-16 17:17:44,179 WARN [Thread-39087] hdfs.DataStreamer - Excluding datanode DatanodeInfoWithStorage[:50010,DS-a493abdb-3ac3-49b1-9bfb-848baf5c1c2c,DISK]
> 2023-12-16 17:17:44,339 INFO [on default port 9000] hdfs.StateChange - BLOCK* allocate blk_1214652580_140946764, replicas=:50010, :50010, :50010 for /hbase/WALs/
> 2023-12-16 17:17:44,369 INFO [Thread-39087] hdfs.DataStreamer - Exception in createBlockOutputStream
> java.io.IOException: Got error, status=ERROR, status message , ack with firstBadLink as :50010
>     at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:113)
>     at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1747)
>     at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1651)
>     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:715)
> 2023-12-16 17:17:44,369 WARN [Thread-39087] hdfs.DataStreamer - Abandoning BP-179318874-NN-IP-1594838129323:blk_1214652580_140946764
> 2023-12-16 17:17:44,454 WARN [Thread-39087] hdfs.DataStreamer - Excluding datanode DatanodeInfoWithStorage[AZ-2-dn-2:50010,DS-46bb45cc-af89-46f3-9f9d-24e4fdc35b6d,DISK]
> 2023-12-16 17:17:44,522 INFO [on defau
[jira] [Reopened] (HDFS-17384) [FGL] Replace the global lock with global FS Lock and global BM lock
[ https://issues.apache.org/jira/browse/HDFS-17384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hui Fei reopened HDFS-17384:

> [FGL] Replace the global lock with global FS Lock and global BM lock
>
> Key: HDFS-17384
> URL: https://issues.apache.org/jira/browse/HDFS-17384
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: ZanderXu
> Assignee: ZanderXu
> Priority: Major
> Labels: FGL
>
> First, we can replace the current global lock with two locks: a global FS lock and a global BM lock.
> The global FS lock is used to make directory tree-related operations thread-safe.
> The global BM lock is used to make block-related operations and DN-related operations thread-safe.
>
> For operations involving both the directory tree and blocks or DNs, both the global FS lock and the global BM lock are acquired.
>
> The lock order should be:
> * The global FS lock
> * The global BM lock
>
> There are some special requirements for this ticket.
> * End users can choose between the global lock and the fine-grained lock through configuration.
> * Modify the current implementation logic as little as possible.
[jira] [Resolved] (HDFS-17384) [FGL] Replace the global lock with global FS Lock and global BM lock
[ https://issues.apache.org/jira/browse/HDFS-17384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hui Fei resolved HDFS-17384.
    Resolution: Fixed

> [FGL] Replace the global lock with global FS Lock and global BM lock
>
> Key: HDFS-17384
> URL: https://issues.apache.org/jira/browse/HDFS-17384
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: ZanderXu
> Assignee: ZanderXu
> Priority: Major
> Labels: FGL
>
> First, we can replace the current global lock with two locks: a global FS lock and a global BM lock.
> The global FS lock is used to make directory tree-related operations thread-safe.
> The global BM lock is used to make block-related operations and DN-related operations thread-safe.
>
> For operations involving both the directory tree and blocks or DNs, both the global FS lock and the global BM lock are acquired.
>
> The lock order should be:
> * The global FS lock
> * The global BM lock
>
> There are some special requirements for this ticket.
> * End users can choose between the global lock and the fine-grained lock through configuration.
> * Modify the current implementation logic as little as possible.
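
The lock order described in HDFS-17384 (global FS lock first, then global BM lock) can be sketched with two plain read-write locks. This is only an illustration of the ordering rule under that assumption; the class and method names are hypothetical and do not correspond to the FSNLockManager code in the feature branch.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of the FS-then-BM lock order; not the Hadoop implementation. */
final class TwoLockOrderSketch {
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);
  private final ReentrantReadWriteLock bmLock = new ReentrantReadWriteLock(true);

  /** For operations touching both the directory tree and blocks/DNs. */
  void writeFsAndBm(Runnable action) {
    fsLock.writeLock().lock();        // 1. global FS lock is always taken first
    try {
      bmLock.writeLock().lock();      // 2. global BM lock is taken second
      try {
        action.run();
      } finally {
        bmLock.writeLock().unlock();  // release in reverse order
      }
    } finally {
      fsLock.writeLock().unlock();
    }
  }

  /** For operations that only touch the directory tree. */
  void writeFsOnly(Runnable action) {
    fsLock.writeLock().lock();
    try {
      action.run();
    } finally {
      fsLock.writeLock().unlock();
    }
  }

  boolean holdsFsWriteLock() {
    return fsLock.isWriteLockedByCurrentThread();
  }

  boolean holdsBmWriteLock() {
    return bmLock.isWriteLockedByCurrentThread();
  }

  public static void main(String[] args) {
    TwoLockOrderSketch locks = new TwoLockOrderSketch();
    locks.writeFsAndBm(() -> System.out.println(
        "fs held: " + locks.holdsFsWriteLock() + ", bm held: " + locks.holdsBmWriteLock()));
  }
}
```

Because every caller that needs both locks acquires them in the same order, two threads cannot each hold one lock while waiting for the other, which is the usual reason for fixing a lock order.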
[jira] [Commented] (HDFS-17299) HDFS is not rack failure tolerant while creating a new file.
[ https://issues.apache.org/jira/browse/HDFS-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823631#comment-17823631 ]

ASF GitHub Bot commented on HDFS-17299:
---

tasanuma commented on code in PR #6566:
URL: https://github.com/apache/hadoop/pull/6566#discussion_r1512884161

## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java:
##
@@ -1817,10 +1839,10 @@ protected LocatedBlock nextBlockOutputStream() throws IOException {
       nodes = lb.getLocations();
       nextStorageTypes = lb.getStorageTypes();
       nextStorageIDs = lb.getStorageIDs();
-
+      setPipeline(lb);
       // Connect to first DataNode in the list.
       success = createBlockOutputStream(nodes, nextStorageTypes, nextStorageIDs,
-          0L, false);
+          0L, false) || setupPipelineForAppendOrRecovery();

Review Comment:
   I haven't looked into the PR in detail, but it makes sense to me that PIPELINE_SETUP_CREATE should also consider the `dtpReplaceDatanodeOnFailureReplication`. If I understand correctly, this change won't affect users who have set `dfs.client.block.write.replace-datanode-on-failure.min-replication(=dtpReplaceDatanodeOnFailureReplication)=0`, which is the default setting, so I think it's fairly safe.

> HDFS is not rack failure tolerant while creating a new file.
>
> Key: HDFS-17299
> URL: https://issues.apache.org/jira/browse/HDFS-17299
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.10.1
> Reporter: Rushabh Shah
> Assignee: Ritesh
> Priority: Critical
> Labels: pull-request-available
> Attachments: repro.patch
>
> Recently we saw an HBase cluster outage when we mistakenly brought down 1 AZ.
> Our configuration:
> 1. We use 3 Availability Zones (AZs) for fault tolerance.
> 2. We use BlockPlacementPolicyRackFaultTolerant as the block placement policy.
> 3. We use the following configuration parameters:
> dfs.namenode.heartbeat.recheck-interval: 600000
> dfs.heartbeat.interval: 3
> So it will take 1230000 ms (20.5 mins) to detect that a datanode is dead.
>
> Steps to reproduce:
> # Bring down 1 AZ.
> # HBase (HDFS client) tries to create a file (WAL file) and then calls hflush on the newly created file.
> # DataStreamer is not able to find block locations that satisfy the rack placement policy (one copy in each rack, which essentially means one copy in each AZ).
> # Since all the datanodes in that AZ are down but are still considered alive by the namenode, the client gets different datanodes, but all of them are still in the same AZ. See logs below.
> # HBase is not able to create a WAL file and it aborts the region server.
>
> Relevant logs from hdfs client and namenode
>
> {noformat}
> 2023-12-16 17:17:43,818 INFO [on default port 9000] FSNamesystem.audit - allowed=true ugi=hbase/ (auth:KERBEROS) ip= cmd=create src=/hbase/WALs/ dst=null
> 2023-12-16 17:17:43,978 INFO [on default port 9000] hdfs.StateChange - BLOCK* allocate blk_1214652565_140946716, replicas=:50010, :50010, :50010 for /hbase/WALs/
> 2023-12-16 17:17:44,061 INFO [Thread-39087] hdfs.DataStreamer - Exception in createBlockOutputStream
> java.io.IOException: Got error, status=ERROR, status message , ack with firstBadLink as :50010
>     at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:113)
>     at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1747)
>     at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1651)
>     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:715)
> 2023-12-16 17:17:44,061 WARN [Thread-39087] hdfs.DataStreamer - Abandoning BP-179318874--1594838129323:blk_1214652565_140946716
> 2023-12-16 17:17:44,179 WARN [Thread-39087] hdfs.DataStreamer - Excluding datanode DatanodeInfoWithStorage[:50010,DS-a493abdb-3ac3-49b1-9bfb-848baf5c1c2c,DISK]
> 2023-12-16 17:17:44,339 INFO [on default port 9000] hdfs.StateChange - BLOCK* allocate blk_1214652580_140946764, replicas=:50010, :50010, :50010 for /hbase/WALs/
> 2023-12-16 17:17:44,369 INFO [Thread-39087] hdfs.DataStreamer - Exception in createBlockOutputStream
> java.io.IOException: Got error, status=ERROR, status message , ack with firstBadLink as :50010
>     at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:113)
>     at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1747)
>     at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1651)
>     at org.apache
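
The review comment on PR #6566 turns on how `dfs.client.block.write.replace-datanode-on-failure.min-replication` is interpreted. The following is a minimal sketch of that decision, assuming the meaning documented for the property in hdfs-default.xml: with the default of 0, a write fails when no replacement datanode can be found, and with a positive value the write continues as long as at least that many datanodes remain in the pipeline. The class and method names are hypothetical, and this is not the code changed in the PR.

```java
/** Illustrative only: models the documented meaning of min-replication, not DataStreamer. */
final class MinReplicationSketch {
  private MinReplicationSketch() {}

  /**
   * @param remainingDatanodes healthy datanodes still in the write pipeline
   * @param minReplication     assumed value of
   *     dfs.client.block.write.replace-datanode-on-failure.min-replication (default 0)
   * @return true if the client may keep writing without finding a replacement datanode
   */
  static boolean canContinueWithoutReplacement(int remainingDatanodes, int minReplication) {
    if (minReplication <= 0) {
      return false; // default setting: a missing replacement fails the write
    }
    return remainingDatanodes >= minReplication;
  }

  public static void main(String[] args) {
    System.out.println(canContinueWithoutReplacement(2, 0)); // default: cannot continue
    System.out.println(canContinueWithoutReplacement(2, 1)); // enough datanodes remain
    System.out.println(canContinueWithoutReplacement(0, 1)); // too few datanodes remain
  }
}
```

Under this reading, users who keep the default of 0 see the same behaviour as before, which is the point the reviewer makes about the change being fairly safe.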
[jira] [Commented] (HDFS-17398) [FGL] Implement the FGL lock for FSNLockManager
[ https://issues.apache.org/jira/browse/HDFS-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823632#comment-17823632 ]

ASF GitHub Bot commented on HDFS-17398:
---

hadoop-yetus commented on PR #6599:
URL: https://github.com/apache/hadoop/pull/6599#issuecomment-1978853986

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 17m 47s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ HDFS-17384 Compile Tests _ |
| +1 :green_heart: | mvninstall | 49m 12s | | HDFS-17384 passed |
| +1 :green_heart: | compile | 1m 25s | | HDFS-17384 passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 1m 16s | | HDFS-17384 passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 1m 13s | | HDFS-17384 passed |
| +1 :green_heart: | mvnsite | 1m 24s | | HDFS-17384 passed |
| +1 :green_heart: | javadoc | 1m 9s | | HDFS-17384 passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 37s | | HDFS-17384 passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 20s | | HDFS-17384 passed |
| +1 :green_heart: | shadedclient | 40m 34s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 11s | | the patch passed |
| +1 :green_heart: | compile | 1m 16s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 1m 16s | | the patch passed |
| +1 :green_heart: | compile | 1m 6s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 6s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 1m 3s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 13s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 58s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 33s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 17s | | the patch passed |
| +1 :green_heart: | shadedclient | 40m 37s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 265m 6s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6599/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 50s | | The patch does not generate ASF License warnings. |
| | | 438m 24s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.protocol.TestBlockListAsLongs |
| | hadoop.hdfs.server.datanode.TestLargeBlockReport |
| | hadoop.hdfs.tools.TestDFSAdmin |
| | hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6599/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6599 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux c133c37a03b3 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | HDFS-17384 / a39fa755ad9781053d80c99bb059ae340fab7de8 |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6599/4/te
[jira] [Commented] (HDFS-17299) HDFS is not rack failure tolerant while creating a new file.
[ https://issues.apache.org/jira/browse/HDFS-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823591#comment-17823591 ]

ASF GitHub Bot commented on HDFS-17299:
---

hadoop-yetus commented on PR #6566:
URL: https://github.com/apache/hadoop/pull/6566#issuecomment-1978665926

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 21s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 3 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 0s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 20m 20s | | trunk passed |
| +1 :green_heart: | compile | 2m 56s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 2m 48s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 0m 46s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 20s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 6s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 37s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| -1 :x: | spotbugs | 1m 26s | [/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6566/20/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html) | hadoop-hdfs-project/hadoop-hdfs-client in trunk has 1 extant spotbugs warnings. |
| +1 :green_heart: | shadedclient | 21m 59s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 22m 13s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 22s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 7s | | the patch passed |
| +1 :green_heart: | compile | 2m 49s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 2m 49s | | the patch passed |
| +1 :green_heart: | compile | 2m 42s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 2m 42s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 36s | [/results-checkstyle-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6566/20/artifact/out/results-checkstyle-hadoop-hdfs-project.txt) | hadoop-hdfs-project: The patch generated 5 new + 243 unchanged - 2 fixed = 248 total (was 245) |
| +1 :green_heart: | mvnsite | 1m 6s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 52s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 27s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 5s | | the patch passed |
| +1 :green_heart: | shadedclient | 20m 26s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 49s | | hadoop-hdfs-client in the patch passed. |
| -1 :x: | unit | 199m 4s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6566/20/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 29s | | The patch does not generate ASF License warnings. |
| | | 307m 29s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.protocol.TestBlockListAsLongs |
| | hadoop.hdfs.tools.TestDFSAdmin |
| | hadoop.hdfs.server.datanode.TestLargeBlockReport |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6566/20/artifact
[jira] [Commented] (HDFS-17397) Choose another DN as soon as possible, when encountering network issues
[ https://issues.apache.org/jira/browse/HDFS-17397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823556#comment-17823556 ] ASF GitHub Bot commented on HDFS-17397: --- hadoop-yetus commented on PR #6591: URL: https://github.com/apache/hadoop/pull/6591#issuecomment-1978498670

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 33s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 45m 6s | | trunk passed |
| +1 :green_heart: | compile | 1m 0s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 55s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 0m 31s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 1s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 50s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 42s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| -1 :x: | spotbugs | 2m 46s | [/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6591/8/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html) | hadoop-hdfs-project/hadoop-hdfs-client in trunk has 1 extant spotbugs warnings. |
| +1 :green_heart: | shadedclient | 36m 24s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 52s | | the patch passed |
| +1 :green_heart: | compile | 0m 55s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 55s | | the patch passed |
| +1 :green_heart: | compile | 0m 46s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 0m 46s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 21s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 49s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 36s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 34s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 2m 43s | | the patch passed |
| +1 :green_heart: | shadedclient | 38m 23s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 28s | | hadoop-hdfs-client in the patch passed. |
| +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. |
| | | 141m 31s | | |

| Subsystem | Report/Notes |
|---:|:---|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6591/8/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6591 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux e369432213b5 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / cfef0ddde37ce9fb94e2ad356615c7df20f90cc3 |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6591/8/testReport/ |
| Max. process+thread count | 553 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-
[jira] [Commented] (HDFS-17401) Erasure Coding: Excess internal block may not be able to be deleted correctly when it's stored in fallback storage
[ https://issues.apache.org/jira/browse/HDFS-17401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823508#comment-17823508 ] ASF GitHub Bot commented on HDFS-17401: --- RuinanGu commented on PR #6597: URL: https://github.com/apache/hadoop/pull/6597#issuecomment-1978301877

@zhangshuyan0 @haiyang1987 Could you please take a look?

> Erasure Coding: Excess internal block may not be able to be deleted correctly when it's stored in fallback storage
> --
>
> Key: HDFS-17401
> URL: https://issues.apache.org/jira/browse/HDFS-17401
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.3.6
> Reporter: Ruinan Gu
> Assignee: Ruinan Gu
> Priority: Major
> Labels: pull-request-available
>
> An excess internal block can't be deleted correctly when it's stored in fallback storage.
> Simple case:
> An EC-RS-6-3-1024k file is stored using the ALL_SSD storage policy (SSD is the default storage type and DISK is the fallback storage type). Suppose the block group is as follows:
> [0(SSD), 0(SSD), 1(SSD), 2(SSD), 3(SSD), 4(SSD), 5(SSD), 6(SSD), 7(SSD), 8(DISK)]
> There are two index 0 internal blocks, and one of them should be chosen for deletion. But the current implementation chooses the index 0 internal blocks as candidates and DISK as the excess storage type. As a result, the excess storage type (DISK) does not match the storage type of the excess internal blocks (SSD), and the excess internal block cannot be deleted correctly.
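The mismatch described in HDFS-17401 can be reproduced with a small model. The sketch below is not the NameNode code: the function names (`excess_storage_types`, `duplicate_candidates`, `deletable`) and the way excess types are computed are assumptions made purely for illustration. It only models the block group from the issue, derives the excess storage type from the ALL_SSD policy, and checks whether any duplicate-index candidate sits on that type.

```python
from collections import Counter

# Block group from the issue description: (internal block index, storage type).
BLOCK_GROUP = [(0, "SSD"), (0, "SSD"), (1, "SSD"), (2, "SSD"), (3, "SSD"),
               (4, "SSD"), (5, "SSD"), (6, "SSD"), (7, "SSD"), (8, "DISK")]


def excess_storage_types(group, preferred, expected_count):
    """Hypothetical helper: storage types left over once `expected_count`
    blocks on the preferred type have been accounted for."""
    remaining = expected_count
    excess = []
    for _, storage in group:
        if storage == preferred and remaining > 0:
            remaining -= 1          # this block satisfies the policy
        else:
            excess.append(storage)  # anything else counts as excess
    return excess


def duplicate_candidates(group):
    """Internal blocks whose index appears more than once in the group."""
    counts = Counter(index for index, _ in group)
    return [(index, storage) for index, storage in group if counts[index] > 1]


def deletable(group, preferred, expected_count):
    """Candidates that also sit on an excess storage type."""
    excess = set(excess_storage_types(group, preferred, expected_count))
    return [block for block in duplicate_candidates(group) if block[1] in excess]


# RS-6-3 has 9 internal blocks, so ALL_SSD expects 9 blocks on SSD.
print(excess_storage_types(BLOCK_GROUP, "SSD", 9))
print(duplicate_candidates(BLOCK_GROUP))
print(deletable(BLOCK_GROUP, "SSD", 9))
```

Under these assumptions the excess storage type is DISK while both duplicate candidates are on SSD, so the intersection is empty and nothing is selected for deletion, which matches the behaviour reported in the issue.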