[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834472#comment-17834472
 ] 

ASF GitHub Bot commented on HDFS-17453:
---

goiri commented on code in PR #6708:
URL: https://github.com/apache/hadoop/pull/6708#discussion_r1554465107


##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestIncrementalBlockReports.java:
##
@@ -215,4 +229,95 @@ public void testReplaceReceivedBlock() throws InterruptedException, IOException
       cluster = null;
     }
   }
+
+  @Test
+  public void testIBRRaceCondition() throws Exception {
+    cluster.shutdown();
+    Configuration conf = new Configuration();
+    HAUtil.setAllowStandbyReads(conf, true);
+    conf.setInt(DFSConfigKeys.DFS_HA_TAILEDITS_PERIOD_KEY, 1);
+    cluster = new MiniDFSCluster.Builder(conf)
+        .nnTopology(MiniDFSNNTopology.simpleHATopology())
+        .numDataNodes(3)
+        .build();
+    try {
+      cluster.waitActive();
+      cluster.transitionToActive(0);
+
+      NameNode nn1 = cluster.getNameNode(0);
+      NameNode nn2 = cluster.getNameNode(1);
+      FileSystem fs = HATestUtil.configureFailoverFs(cluster, conf);
+      List<InvocationOnMock> ibrsToStandby = new ArrayList<>();
+      List<DatanodeProtocolClientSideTranslatorPB> spies = new ArrayList<>();
+      Phaser ibrPhaser = new Phaser(1);
+      for (DataNode dn : cluster.getDataNodes()) {
+        DatanodeProtocolClientSideTranslatorPB nnSpy =
+            InternalDataNodeTestUtils.spyOnBposToNN(dn, nn2);
+        doAnswer((inv) -> {
+          for (StorageReceivedDeletedBlocks srdb :
+              inv.getArgument(2, StorageReceivedDeletedBlocks[].class)) {
+            for (ReceivedDeletedBlockInfo block : srdb.getBlocks()) {
+              if (block.getStatus().equals(BlockStatus.RECEIVED_BLOCK)) {
+                ibrPhaser.arriveAndDeregister();
+              }
+            }
+          }
+          ibrsToStandby.add(inv);
+          return null;
+        }).when(nnSpy).blockReceivedAndDeleted(
+            any(DatanodeRegistration.class),
+            anyString(),
+            any(StorageReceivedDeletedBlocks[].class));
+        spies.add(nnSpy);
+      }
+
+      Thread.sleep(1000);

Review Comment:
   Can we do better than sleep?
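For illustration, a fixed sleep can usually be replaced by polling a condition with a deadline. The sketch below is a plain-Java stand-in for that pattern (Hadoop's test code has a similar helper, `GenericTestUtils.waitFor`); it is not the PR's actual change, and the names here are hypothetical:

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

// Re-check a condition every `intervalMs` until it holds or `timeoutMs`
// elapses, instead of sleeping for a fixed, hope-it-is-enough duration.
public class WaitFor {
  public static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!check.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline) {
        throw new TimeoutException("Condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  public static void main(String[] args) throws Exception {
    long start = System.currentTimeMillis();
    // Condition that becomes true after roughly 200 ms.
    waitFor(() -> System.currentTimeMillis() - start >= 200, 10, 5000);
    System.out.println("condition met");
  }
}
```

Unlike a bare `Thread.sleep`, this returns as soon as the condition holds and fails loudly if it never does.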



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingDataNodeMessages.java:
##
@@ -95,16 +95,27 @@ void removeAllMessagesForDatanode(DatanodeDescriptor dn) {
 
   void enqueueReportedBlock(DatanodeStorageInfo storageInfo, Block block,
       ReplicaState reportedState) {
+    long genStamp = block.getGenerationStamp();
+    Queue<ReportedBlockInfo> queue = null;
     if (BlockIdManager.isStripedBlockID(block.getBlockId())) {
       Block blkId = new Block(BlockIdManager.convertToStripedID(block
           .getBlockId()));
-      getBlockQueue(blkId).add(
-          new ReportedBlockInfo(storageInfo, new Block(block), reportedState));
+      queue = getBlockQueue(blkId);
     } else {
       block = new Block(block);
-      getBlockQueue(block).add(
-          new ReportedBlockInfo(storageInfo, block, reportedState));
+      queue = getBlockQueue(block);
     }
+    // We only want the latest non-future reported block to be queued for each
+    // DataNode. Otherwise, there can be a race condition that causes an old
+    // reported block to be kept in the queue until the SNN switches to ANN,
+    // and the old reported block will be processed and marked as corrupt by
+    // the ANN. See HDFS-17453.
+    int size = queue.size();
+    if (queue.removeIf(rbi -> rbi.storageInfo.equals(storageInfo) &&

Review Comment:
   We could make this more robust to nulls with:
   ```
   void enqueueReportedBlock(DatanodeStorageInfo storageInfo, Block block,
       ReplicaState reportedState) {
     if (storageInfo == null || block == null || reportedState == null) {
       return;
     }
     ...
     if (queue.removeIf(rbi -> storageInfo.equals(rbi.storageInfo) &&
     ...
   ```
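As a standalone illustration of the dedup idea in the patch, the sketch below keeps only the latest queued report per storage before enqueueing a new one. `Report` is a hypothetical stand-in for HDFS's `ReportedBlockInfo`, and the whole class uses plain collections rather than the actual Hadoop types:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Sketch: before enqueueing a newly reported (storage, genstamp) pair, drop
// any older queued report from the same storage, so only the latest report
// per storage survives in the queue.
public class DedupQueue {
  static final class Report {
    final String storageId;  // stand-in for DatanodeStorageInfo
    final long genStamp;
    Report(String storageId, long genStamp) {
      this.storageId = storageId;
      this.genStamp = genStamp;
    }
  }

  // Remove stale reports from the same storage, then enqueue the new one.
  static void enqueue(Queue<Report> queue, Report incoming) {
    queue.removeIf(r -> r.storageId.equals(incoming.storageId)
        && r.genStamp <= incoming.genStamp);
    queue.add(incoming);
  }

  public static void main(String[] args) {
    Queue<Report> q = new ArrayDeque<>();
    enqueue(q, new Report("dn1-storage", 1));  // old genstamp gs1
    enqueue(q, new Report("dn1-storage", 3));  // newer gs3 replaces gs1
    enqueue(q, new Report("dn2-storage", 1));  // different storage, kept
    System.out.println(q.size());              // 2: one report per storage
  }
}
```

The `removeIf` predicate is the same shape as the patch's, which is why null-robustness of the `equals` direction matters in the review comment above.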
   




[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834471#comment-17834471
 ] 

ASF GitHub Bot commented on HDFS-17453:
---

dannytbecker commented on PR #6708:
URL: https://github.com/apache/hadoop/pull/6708#issuecomment-2040837894

   @kihwal Could you take a look at my PR? I think it addresses the issue you 
mentioned here 
https://issues.apache.org/jira/browse/HDFS-14941?focusedCommentId=17140156&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17140156




> IncrementalBlockReport can have race condition with Edit Log Tailer
> ---
>
> Key: HDFS-17453
> URL: https://issues.apache.org/jira/browse/HDFS-17453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover, ha, hdfs, namenode
>Affects Versions: 3.3.0, 3.3.1, 2.10.2, 3.3.2, 3.3.5, 3.3.4, 3.3.6
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Major
>  Labels: pull-request-available
>
> h2. Summary
> There is a race condition between IncrementalBlockReports (IBR) and 
> EditLogTailer in Standby NameNode (SNN) which can lead to leaked IBRs and 
> false corrupt blocks after HA Failover. The race condition occurs when the 
> SNN loads the edit logs before it receives the block reports from DataNode 
> (DN).
> h2. Example
> In the following example there is a block (b1) with 3 generation stamps (gs1, 
> gs2, gs3).
>  # SNN1 loads edit logs for b1gs1 and b1gs2.
>  # DN1 sends the IBR for b1gs1 to SNN1.
>  # SNN1 will determine that the reported block b1gs1 from DN1 is corrupt and 
> it will be queued for later. 
> [BlockManager.java|https://github.com/apache/hadoop/blob/6ed73896f6e8b4b7c720eff64193cb30b3e77fb2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L3447C1-L3464C6]
> {code:java}
>     BlockToMarkCorrupt c = checkReplicaCorrupt(
>         block, reportedState, storedBlock, ucState, dn);
>     if (c != null) {
>       if (shouldPostponeBlocksFromFuture) {
>         // If the block is an out-of-date generation stamp or state,
>         // but we're the standby, we shouldn't treat it as corrupt,
>         // but instead just queue it for later processing.
>         // Storing the reported block for later processing, as that is what
>         // comes from the IBR / FBR and hence what we should use to compare
>         // against the memory state.
>         // See HDFS-6289 and HDFS-15422 for more context.
>         queueReportedBlock(storageInfo, block, reportedState,
>             QUEUE_REASON_CORRUPT_STATE);
>       } else {
>         toCorrupt.add(c);
>       }
>       return storedBlock;
>     } {code}
>  # DN1 sends IBR for b1gs2 and b1gs3 to SNN1.
>  # SNN1 processes b1gs2 and updates the blocks map.
>  # SNN1 queues b1gs3 for later because it determines that b1gs3 is a future 
> genstamp.
>  # SNN1 loads b1gs3 edit logs and processes the queued reports for b1.
>  # SNN1 processes b1gs1 first and puts it back in the queue.
>  # SNN1 processes b1gs3 next and updates the blocks map.
>  # Later, SNN1 becomes the Active NameNode (ANN) during an HA Failover.
>  # SNN1 will catch up to the latest edit logs, then process all queued block 
> reports to become the ANN.
>  # ANN1 will process b1gs1 and mark it as corrupt.
> If the example above happens for every DN which stores b1, then when the HA 
> failover happens, b1 will be incorrectly marked as corrupt. This will be 
> fixed when the first DN sends a FullBlockReport or an IBR for b1.
> h2. Logs from Active Cluster
> I added the following logs to confirm this issue in an active cluster:
> {code:java}
> BlockToMarkCorrupt c = checkReplicaCorrupt(
> block, reportedState, storedBlock, ucState, dn);
> if (c != null) {
>   DatanodeStorageInfo storedStorageInfo = storedBlock.findStorageInfo(dn);
>   LOG.info("Found corrupt block {} [{}, {}] from DN {}. Stored block {} from 
> DN {}",
>   block, reportedState.name(), ucState.name(), storageInfo, storedBlock, 
> storedStorageInfo);
>   if (storageInfo.equals(storedStorageInfo) &&
> storedBlock.getGenerationStamp() > block.getGenerationStamp()) {
> LOG.info("Stored Block {} from the same DN {} has a newer GenStamp.",
> storedBlock, storedStorageInfo);
>   }
>   if (shouldPostponeBlocksFromFuture) {
> // If the block is an out-of-date generation stamp or state,
> // but we're the standby, we shouldn't treat it as corrupt,
> // but instead just queue it for later processing.
> // Storing the reported block for later processing, as that is what
> // comes from the IBR / FBR and hence what we should use to compare
> // against the memory state.
> // See HDFS-6289 and HDFS-15422 for more context.
> 
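The standby's three-way choice in the scenario above can be sketched as a toy classification by generation stamp. All names here are hypothetical; the real decision lives in `BlockManager.checkReplicaCorrupt` and is considerably more involved:

```java
// Toy model of how the standby treats a reported genstamp relative to the
// stored one: future genstamps are postponed, stale ones are queued and
// (pre-fix) eventually marked corrupt, and a match updates the blocks map.
public class GenStampCheck {
  enum Action { UPDATE_BLOCKS_MAP, POSTPONE_FUTURE, QUEUE_AS_CORRUPT }

  static Action classify(long storedGenStamp, long reportedGenStamp) {
    if (reportedGenStamp > storedGenStamp) {
      return Action.POSTPONE_FUTURE;    // e.g. b1gs3 while edits only reach gs2
    } else if (reportedGenStamp < storedGenStamp) {
      return Action.QUEUE_AS_CORRUPT;   // e.g. b1gs1 after gs2 was applied
    }
    return Action.UPDATE_BLOCKS_MAP;    // matching genstamp, e.g. b1gs2
  }

  public static void main(String[] args) {
    System.out.println(classify(2, 1)); // QUEUE_AS_CORRUPT
    System.out.println(classify(2, 3)); // POSTPONE_FUTURE
    System.out.println(classify(2, 2)); // UPDATE_BLOCKS_MAP
  }
}
```

In the bug, the `QUEUE_AS_CORRUPT` branch keeps re-queueing the stale b1gs1 report until failover, at which point it is finally processed and the replica is wrongly marked corrupt.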

[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834470#comment-17834470
 ] 

ASF GitHub Bot commented on HDFS-17453:
---

dannytbecker commented on PR #6708:
URL: https://github.com/apache/hadoop/pull/6708#issuecomment-2040837517

   @sodonnel Could you take a look at my PR? I think it addresses the issue you 
mentioned here 
https://issues.apache.org/jira/browse/HDFS-15422?focusedCommentId=17287194&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17287194





[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834468#comment-17834468
 ] 

ASF GitHub Bot commented on HDFS-17453:
---

dannytbecker commented on PR #6708:
URL: https://github.com/apache/hadoop/pull/6708#issuecomment-2040836033

   @goiri I have added the unit test which reproduces the exact race condition. 
I confirmed this by running the unit test against a branch without the fix; it 
caught the false corrupt replicas.





[jira] [Commented] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834427#comment-17834427
 ] 

ASF GitHub Bot commented on HDFS-17455:
---

hadoop-yetus commented on PR #6710:
URL: https://github.com/apache/hadoop/pull/6710#issuecomment-2040538624

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 32s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 39s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  32m 28s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 31s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   5m 22s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 26s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 27s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 35s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | -1 :x: |  spotbugs  |   2m 59s | 
[/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6710/1/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html)
 |  hadoop-hdfs-project/hadoop-hdfs-client in trunk has 1 extant spotbugs 
warnings.  |
   | +1 :green_heart: |  shadedclient  |  37m 40s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  2s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 19s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   5m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 13s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   5m 13s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m 14s | 
[/results-checkstyle-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6710/1/artifact/out/results-checkstyle-hadoop-hdfs-project.txt)
 |  hadoop-hdfs-project: The patch generated 1 new + 33 unchanged - 0 fixed = 
34 total (was 33)  |
   | +1 :green_heart: |  mvnsite  |   2m  3s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  7s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   5m 55s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  35m 55s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 28s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | +1 :green_heart: |  unit  | 226m  4s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 403m 33s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6710/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6710 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 2bcd94340db8 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 52f7ba8eccaf03420ed317659c19193ed895ddd4 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
  

[jira] [Commented] (HDFS-17449) Ill-formed decommission host name and port pair would trigger IndexOutOfBound error

2024-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834361#comment-17834361
 ] 

ASF GitHub Bot commented on HDFS-17449:
---

hadoop-yetus commented on PR #6691:
URL: https://github.com/apache/hadoop/pull/6691#issuecomment-2040042875

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  50m  4s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 12s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 19s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  41m  8s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 16s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  41m  8s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 259m 43s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 417m 25s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6691/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6691 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux ca4eabab33a4 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / e841403d0a1bc44ad1fd0820f5c7949001d8511c |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6691/2/testReport/ |
   | Max. process+thread count | 3046 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6691/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> Ill-formed decommission host name and 

[jira] [Commented] (HDFS-17454) Fix namenode fsck swallows the exception stacktrace, this can help us to troubleshooting log.

2024-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834353#comment-17834353
 ] 

ASF GitHub Bot commented on HDFS-17454:
---

hadoop-yetus commented on PR #6709:
URL: https://github.com/apache/hadoop/pull/6709#issuecomment-2040032072

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 51s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  51m  5s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 31s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 27s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 11s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 44s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 33s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  42m 46s |  |  branch has no errors when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 29s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  42m 17s |  |  patch has no errors when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 268m 55s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 54s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 432m 20s |  |  |


   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6709/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6709 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 5ea6cfbdebcf 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 41fa5cb805156f5df8e0c60106c982d713d5c040 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6709/1/testReport/ |
   | Max. process+thread count | 3016 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6709/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 |

[jira] [Commented] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834304#comment-17834304
 ] 

ASF GitHub Bot commented on HDFS-17455:
---

haiyang1987 opened a new pull request, #6710:
URL: https://github.com/apache/hadoop/pull/6710

   ### Description of PR
   https://issues.apache.org/jira/browse/HDFS-17455
   
   When the client reads data and connects to the datanode, an 
InvalidBlockTokenException is thrown because the datanode access token is 
invalid at that time. The subsequent call to the fetchBlockAt method then 
throws java.lang.IndexOutOfBoundsException, causing the read to fail.
   
   **Root cause:**
   
   - The HDFS file contains only one RBW block, with a block data size of 
2048KB.
   - The client opens this file and seeks to the offset of 1024KB to read data.
   - The DFSInputStream#getBlockReader call that connects to the datanode 
throws InvalidBlockTokenException because the access token is invalid at that 
time, and the subsequent DFSInputStream#fetchBlockAt call throws 
java.lang.IndexOutOfBoundsException.
   
   ```
private synchronized DatanodeInfo blockSeekTo(long target)
    throws IOException {
  if (target >= getFileLength()) {
    // the target size is smaller than fileLength (completeBlockSize +
    // lastBlockBeingWrittenLength); here at this time target is 1024
    // and getFileLength is 2048
    throw new IOException("Attempted to read past end of file");
  }
  ...
  while (true) {
    ...
    try {
      blockReader = getBlockReader(targetBlock, offsetIntoBlock,
          targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
          storageType, chosenNode);
      if (connectFailedOnce) {
        DFSClient.LOG.info("Successfully connected to " + targetAddr +
            " for " + targetBlock.getBlock());
      }
      return chosenNode;
    } catch (IOException ex) {
      ...
      } else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
        refetchToken--;
        // Here will catch InvalidBlockTokenException.
        fetchBlockAt(target);
      } else {
        ...
      }
    }
  }
}

private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
    throws IOException {
  maybeRegisterBlockRefresh();
  synchronized (infoLock) {
    // Here the locatedBlocks only contains one locatedBlock; at this time
    // the offset is 1024 and fileLength is 0, so the targetBlockIdx is -2
    int targetBlockIdx = locatedBlocks.findBlock(offset);
    if (targetBlockIdx < 0) { // block is not cached
      targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
      // Here the targetBlockIdx is 1;
      useCache = false;
    }
    if (!useCache) { // fetch blocks
      final LocatedBlocks newBlocks = (length == 0)
          ? dfsClient.getLocatedBlocks(src, offset)
          : dfsClient.getLocatedBlocks(src, offset, length);
      if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
        throw new EOFException("Could not find target position " + offset);
      }
      // Update the LastLocatedBlock, if offset is for last block.
      if (offset >= locatedBlocks.getFileLength()) {
        setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
      } else {
        locatedBlocks.insertRange(targetBlockIdx,
            newBlocks.getLocatedBlocks());
      }
    }
    // Here the locatedBlocks only contains one locatedBlock, so will throw
    // java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
    return locatedBlocks.get(targetBlockIdx);
  }
}
   ```
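
   The insertion-index arithmetic in the comments above can be reproduced in 
isolation. The following is a minimal sketch, not the real LocatedBlocks (whose 
findBlock compares against offset ranges): it assumes one cached block whose 
search key does not match, so the binary search returns -2 and the derived 
insert index 1 is out of bounds for a one-element list.

   ```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class InsertIndexDemo {
  // Simplified stand-in for LocatedBlocks.findBlock: binary search over
  // block start offsets (the real code uses an offset-range comparator).
  static int findBlock(List<Long> blockStartOffsets, long offset) {
    return Collections.binarySearch(blockStartOffsets, offset);
  }

  // Same formula as LocatedBlocks.getInsertIndex: recover the insertion
  // point from the negative binary-search result.
  static int getInsertIndex(int binSearchResult) {
    return -(binSearchResult) - 1;
  }

  public static void main(String[] args) {
    // One cached block starting at offset 0 (the lone RBW block).
    List<Long> offsets = Arrays.asList(0L);
    int r = findBlock(offsets, 1024L); // no exact match -> -2
    int idx = getInsertIndex(r);       // insertion point -> 1
    // offsets.get(idx) would throw IndexOutOfBoundsException,
    // mirroring "Index 1 out of bounds for length 1" in the stack trace.
    System.out.println(r + " " + idx);
  }
}
   ```

   If insertRange never adds a new entry, calling get(1) on the still 
one-element list fails exactly as shown in the client exception below.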
   
   The client exception:
   ```
   java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
   at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
   at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
   at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:266)
   at java.base/java.util.Objects.checkIndex(Objects.java:359)
   at java.base/java.util.ArrayList.get(ArrayList.java:427)
   at org.apache.hadoop.hdfs.protocol.LocatedBlocks.get(LocatedBlocks.java:87)
   at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockAt(DFSInputStream.java:569)
   at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockAt(DFSInputStream.java:540)
   at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:704)
   at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:884)
   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:957)
   at ...
   ```
[jira] [Updated] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17455:
--
Labels: pull-request-available  (was: )

> Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt
> -
>
> Key: HDFS-17455
> URL: https://issues.apache.org/jira/browse/HDFS-17455
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>
> When the client reads data and connects to the datanode, an 
> InvalidBlockTokenException is thrown because the datanode access token is 
> invalid at that time. The subsequent call to the fetchBlockAt method then 
> throws java.lang.IndexOutOfBoundsException, causing the read to fail.
> *Root cause:*
> * The HDFS file contains only one RBW block, with a block data size of 2048KB.
> * The client opens this file and seeks to the offset of 1024KB to read data.
> * The DFSInputStream#getBlockReader call that connects to the datanode 
> throws InvalidBlockTokenException because the access token is invalid at 
> that time, and the subsequent DFSInputStream#fetchBlockAt call throws 
> java.lang.IndexOutOfBoundsException.
> {code:java}
> private synchronized DatanodeInfo blockSeekTo(long target)
>  throws IOException {
>if (target >= getFileLength()) {
>// the target size is smaller than fileLength (completeBlockSize + 
> lastBlockBeingWrittenLength),
>// here at this time target is 1024 and getFileLength is 2048
>  throw new IOException("Attempted to read past end of file");
>}
>...
>while (true) {
>  ...
>  try {
>blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>storageType, chosenNode);
>if(connectFailedOnce) {
>  DFSClient.LOG.info("Successfully connected to " + targetAddr +
> " for " + targetBlock.getBlock());
>}
>return chosenNode;
>  } catch (IOException ex) {
>...
>} else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
>  refetchToken--;
>  // Here will catch InvalidBlockTokenException.
>  fetchBlockAt(target);
>} else {
>  ...
>}
>  }
>}
>  }
> private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
>   throws IOException {
> maybeRegisterBlockRefresh();
> synchronized(infoLock) {
>   // Here the locatedBlocks only contains one locatedBlock, at this time 
> the offset is 1024 and fileLength is 0,
>   // so the targetBlockIdx is -2
>   int targetBlockIdx = locatedBlocks.findBlock(offset);
>   if (targetBlockIdx < 0) { // block is not cached
> targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
> // Here the targetBlockIdx is 1;
> useCache = false;
>   }
>   if (!useCache) { // fetch blocks
> final LocatedBlocks newBlocks = (length == 0)
> ? dfsClient.getLocatedBlocks(src, offset)
> : dfsClient.getLocatedBlocks(src, offset, length);
> if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
>   throw new EOFException("Could not find target position " + offset);
> }
> // Update the LastLocatedBlock, if offset is for last block.
> if (offset >= locatedBlocks.getFileLength()) {
>   setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
> } else {
>   locatedBlocks.insertRange(targetBlockIdx,
>   newBlocks.getLocatedBlocks());
> }
>   }
>   // Here the locatedBlocks only contains one locatedBlock, so will throw 
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
>   return locatedBlocks.get(targetBlockIdx);
> }
>   }
> {code}
> The client exception:
> {code:java}
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
> at 
> java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
> at 
> java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
> at 
> java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:266)
> at java.base/java.util.Objects.checkIndex(Objects.java:359)
> at java.base/java.util.ArrayList.get(ArrayList.java:427)
> at 
> org.apache.hadoop.hdfs.protocol.LocatedBlocks.get(LocatedBlocks.java:87)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchBlockAt(DFSInputStream.java:569)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchBlockAt(DFSInputStream.java:540)
> at 
> 

[jira] [Updated] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-05 Thread Haiyang Hu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haiyang Hu updated HDFS-17455:
--
Description: 
When the client reads data and connects to the datanode, an 
InvalidBlockTokenException is thrown because the datanode access token is 
invalid at that time. The subsequent call to the fetchBlockAt method then 
throws java.lang.IndexOutOfBoundsException, causing the read to fail.

*Root cause:*
* The HDFS file contains only one RBW block, with a block data size of 2048KB.
* The client opens this file and seeks to the offset of 1024KB to read data.
* The DFSInputStream#getBlockReader call that connects to the datanode throws 
InvalidBlockTokenException because the access token is invalid at that time, 
and the subsequent DFSInputStream#fetchBlockAt call throws 
java.lang.IndexOutOfBoundsException.

{code:java}
private synchronized DatanodeInfo blockSeekTo(long target)
 throws IOException {
   if (target >= getFileLength()) {
   // the target size is smaller than fileLength (completeBlockSize + 
lastBlockBeingWrittenLength),
   // here at this time target is 1024 and getFileLength is 2048
 throw new IOException("Attempted to read past end of file");
   }
   ...
   while (true) {
 ...
 try {
   blockReader = getBlockReader(targetBlock, offsetIntoBlock,
   targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
   storageType, chosenNode);
   if(connectFailedOnce) {
 DFSClient.LOG.info("Successfully connected to " + targetAddr +
" for " + targetBlock.getBlock());
   }
   return chosenNode;
 } catch (IOException ex) {
   ...
   } else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
 refetchToken--;
 // Here will catch InvalidBlockTokenException.
 fetchBlockAt(target);
   } else {
 ...
   }
 }
   }
 }

private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
  throws IOException {
maybeRegisterBlockRefresh();
synchronized(infoLock) {
  // Here the locatedBlocks only contains one locatedBlock, at this time 
the offset is 1024 and fileLength is 0,
  // so the targetBlockIdx is -2
  int targetBlockIdx = locatedBlocks.findBlock(offset);
  if (targetBlockIdx < 0) { // block is not cached
targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
// Here the targetBlockIdx is 1;
useCache = false;
  }
  if (!useCache) { // fetch blocks
final LocatedBlocks newBlocks = (length == 0)
? dfsClient.getLocatedBlocks(src, offset)
: dfsClient.getLocatedBlocks(src, offset, length);
if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
  throw new EOFException("Could not find target position " + offset);
}
// Update the LastLocatedBlock, if offset is for last block.
if (offset >= locatedBlocks.getFileLength()) {
  setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
} else {
  locatedBlocks.insertRange(targetBlockIdx,
  newBlocks.getLocatedBlocks());
}
  }
  // Here the locatedBlocks only contains one locatedBlock, so will throw 
java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
  return locatedBlocks.get(targetBlockIdx);
}
  }
{code}

The client exception:

{code:java}
java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
at 
java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
at 
java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
at 
java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:266)
at java.base/java.util.Objects.checkIndex(Objects.java:359)
at java.base/java.util.ArrayList.get(ArrayList.java:427)
at 
org.apache.hadoop.hdfs.protocol.LocatedBlocks.get(LocatedBlocks.java:87)
at 
org.apache.hadoop.hdfs.DFSInputStream.fetchBlockAt(DFSInputStream.java:569)
at 
org.apache.hadoop.hdfs.DFSInputStream.fetchBlockAt(DFSInputStream.java:540)
at 
org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:704)
at 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:884)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:957)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:804)
{code}

The datanode exception:

{code:java}
2024-03-27 15:56:35,477 WARN  datanode.DataNode 
(DataXceiver.java:checkAccess(1487)) [DataXceiver for client 
DFSClient_NONMAPREDUCE_475786505_1 at /xxx [Sending block 
BP-xxx:blk_1138933918_65194340]] - Block token verification failed: 
op=READ_BLOCK, remoteAddress=/XXX, message=Can't re-compute password for 
block_token_identifier 

[jira] [Updated] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-05 Thread Haiyang Hu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haiyang Hu updated HDFS-17455:
--
Description: 
When the client reads data and connects to the datanode, an 
InvalidBlockTokenException is thrown because the datanode access token is 
invalid at that time. The subsequent call to the fetchBlockAt method then 
throws java.lang.IndexOutOfBoundsException, causing the read to fail.

*Root cause:*
* The HDFS file contains only one RBW block, with a block data size of 2048KB.
* The client opens this file and seeks to the offset of 1024KB to read data.
* The DFSInputStream#getBlockReader call that connects to the datanode throws 
InvalidBlockTokenException because the access token is invalid at that time, 
and the subsequent DFSInputStream#fetchBlockAt call throws 
java.lang.IndexOutOfBoundsException.

{code:java}
private synchronized DatanodeInfo blockSeekTo(long target)
 throws IOException {
   if (target >= getFileLength()) {
   // the target size is smaller than fileLength (completeBlockSize + 
lastBlockBeingWrittenLength),
   // here at this time target is 1024 and getFileLength is 2048
 throw new IOException("Attempted to read past end of file");
   }
   ...
   while (true) {
 ...
 try {
   blockReader = getBlockReader(targetBlock, offsetIntoBlock,
   targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
   storageType, chosenNode);
   if(connectFailedOnce) {
 DFSClient.LOG.info("Successfully connected to " + targetAddr +
" for " + targetBlock.getBlock());
   }
   return chosenNode;
 } catch (IOException ex) {
   ...
   } else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
 refetchToken--;
 // Here will catch InvalidBlockTokenException.
 fetchBlockAt(target);
   } else {
 ...
   }
 }
   }
 }

private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
  throws IOException {
maybeRegisterBlockRefresh();
synchronized(infoLock) {
  // Here the locatedBlocks only contains one locatedBlock, at this time 
the offset is 1024 and fileLength is 0,
  // so the targetBlockIdx is -2
  int targetBlockIdx = locatedBlocks.findBlock(offset);
  if (targetBlockIdx < 0) { // block is not cached
targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
// Here the targetBlockIdx is 1;
useCache = false;
  }
  if (!useCache) { // fetch blocks
final LocatedBlocks newBlocks = (length == 0)
? dfsClient.getLocatedBlocks(src, offset)
: dfsClient.getLocatedBlocks(src, offset, length);
if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
  throw new EOFException("Could not find target position " + offset);
}
// Update the LastLocatedBlock, if offset is for last block.
if (offset >= locatedBlocks.getFileLength()) {
  setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
} else {
  locatedBlocks.insertRange(targetBlockIdx,
  newBlocks.getLocatedBlocks());
}
  }
  // Here the locatedBlocks only contains one locatedBlock, so will throw 
java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
  return locatedBlocks.get(targetBlockIdx);
}
  }
{code}

The client exception:

{code:java}
java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
at 
java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
at 
java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
at 
java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:266)
at java.base/java.util.Objects.checkIndex(Objects.java:359)
at java.base/java.util.ArrayList.get(ArrayList.java:427)
at 
org.apache.hadoop.hdfs.protocol.LocatedBlocks.get(LocatedBlocks.java:87)
at 
org.apache.hadoop.hdfs.DFSInputStream.fetchBlockAt(DFSInputStream.java:569)
at 
org.apache.hadoop.hdfs.DFSInputStream.fetchBlockAt(DFSInputStream.java:540)
at 
org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:704)
at 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:884)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:957)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:804)
{code}

The datanode exception:

{code:java}
2024-03-27 15:56:35,477 WARN  datanode.DataNode 
(DataXceiver.java:checkAccess(1487)) [DataXceiver for client 
DFSClient_NONMAPREDUCE_475786505_1 at /xxx [Sending block 
BP-xxx:blk_1138933918_65194340]] - Block token verification failed: 
op=READ_BLOCK, remoteAddress=/XXX, message=Can't re-compute password for 
block_token_identifier 

[jira] [Created] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-05 Thread Haiyang Hu (Jira)
Haiyang Hu created HDFS-17455:
-

 Summary: Fix Client throw IndexOutOfBoundsException in 
DFSInputStream#fetchBlockAt
 Key: HDFS-17455
 URL: https://issues.apache.org/jira/browse/HDFS-17455
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Haiyang Hu
Assignee: Haiyang Hu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17449) Ill-formed decommission host name and port pair would trigger IndexOutOfBound error

2024-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834189#comment-17834189
 ] 

ASF GitHub Bot commented on HDFS-17449:
---

teamconfx commented on code in PR #6691:
URL: https://github.com/apache/hadoop/pull/6691#discussion_r1553156626


##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/HostsFileWriter.java:
##
@@ -106,9 +106,14 @@ public void initOutOfServiceHosts(List<String> decommissionHostNameAndPorts,
 for (String hostNameAndPort : decommissionHostNameAndPorts) {
   DatanodeAdminProperties dn = new DatanodeAdminProperties();
   String[] hostAndPort = hostNameAndPort.split(":");
-  dn.setHostName(hostAndPort[0]);
-  dn.setPort(Integer.parseInt(hostAndPort[1]));
-  dn.setAdminState(AdminStates.DECOMMISSIONED);
+  try {
+dn.setHostName(hostAndPort[0]);
+dn.setPort(Integer.parseInt(hostAndPort[1]));
+dn.setAdminState(AdminStates.DECOMMISSIONED);
+  } catch (Exception e) {
+throw new IllegalArgumentException("The decommission host name and port format is "
++ "invalid. The format should be in <host>:<port>, not " + hostNameAndPort, e);
+  }

Review Comment:
   I've made the change accordingly. Thanks for the advice!





> Ill-formed decommission host name and port pair would trigger IndexOutOfBound 
> error
> ---
>
> Key: HDFS-17449
> URL: https://issues.apache.org/jira/browse/HDFS-17449
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ConfX
>Priority: Major
>  Labels: pull-request-available
>
> h2. What happened:
> Got IndexOutOfBound when trying to run 
> org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor#testDecommissionStatusAfterDNRestart
>  with namenode host provider set to 
> org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager.
> h2. Buggy code:
> In HostsFileWriter.java:
> {code:java}
> String[] hostAndPort = hostNameAndPort.split(":"); // hostNameAndPort might 
> be invalid
> dn.setHostName(hostAndPort[0]);
> dn.setPort(Integer.parseInt(hostAndPort[1])); // here IndexOutOfBound might 
> be thrown
> dn.setAdminState(AdminStates.DECOMMISSIONED);{code}
> h2. StackTrace:
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
>     at 
> org.apache.hadoop.hdfs.util.HostsFileWriter.initOutOfServiceHosts(HostsFileWriter.java:110){code}
> h2. How to reproduce:
> (1) Set {{dfs.namenode.hosts.provider.classname}} to 
> {{org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager}}
> (2) Run test: 
> {{org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor#testDecommissionStatusAfterDNRestart}}
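
A defensive parser along the lines discussed in the review could look like the 
sketch below. It is not the patched HostsFileWriter; the class and method names 
are hypothetical, and it simply validates the host:port pair before any field 
is set, turning both the missing-port and bad-number cases into a clear 
IllegalArgumentException instead of an IndexOutOfBound error.

```java
public class HostPortParser {
  /** Split a "host:port" pair, rejecting ill-formed input with a clear message. */
  static String[] parse(String hostNameAndPort) {
    String[] parts = hostNameAndPort.split(":");
    if (parts.length != 2 || parts[0].isEmpty()) {
      // Covers inputs like "dn1" or ":9866" that previously caused
      // ArrayIndexOutOfBoundsException downstream.
      throw new IllegalArgumentException(
          "Invalid host:port pair: " + hostNameAndPort);
    }
    try {
      int port = Integer.parseInt(parts[1]);
      if (port < 0 || port > 65535) {
        throw new IllegalArgumentException(
            "Port out of range in: " + hostNameAndPort);
      }
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException(
          "Invalid port in: " + hostNameAndPort, e);
    }
    return parts;
  }

  public static void main(String[] args) {
    String[] ok = parse("dn1.example.com:9866");
    System.out.println(ok[0] + " / " + ok[1]);
  }
}
```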






[jira] [Commented] (HDFS-17454) Fix namenode fsck swallows the exception stacktrace, this can help us to troubleshooting log.

2024-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834180#comment-17834180
 ] 

ASF GitHub Bot commented on HDFS-17454:
---

xiaojunxiang2023 opened a new pull request, #6709:
URL: https://github.com/apache/hadoop/pull/6709

   When I used `hdfs fsck /xxx.txt -move`, the command failed with an error, 
but I couldn't determine the reason, because the exception stacktrace isn't 
appended to the LOG. Original code:
   
   
![image](https://github.com/apache/hadoop/assets/65019264/3fb94da0-5a9e-4363-a941-67772b9420c1)
   After my fix, we can see the exception stacktrace:
   
   
![image](https://github.com/apache/hadoop/assets/65019264/1a6cfad7-b78c-456e-a8f4-df41a215bf20)
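
The underlying fix pattern is to pass the exception object to the logger 
instead of concatenating it, so the stack trace is attached to the log record. 
Below is a minimal sketch using java.util.logging (Hadoop itself uses SLF4J, 
where `LOG.warn("msg", e)` behaves the same way); the method name is 
hypothetical, standing in for the failing fsck -move copy step.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class FsckLogDemo {
  static final Logger LOG = Logger.getLogger("fsck.demo");

  // Hypothetical stand-in for the fsck -move copy step that fails.
  static void copyBlockOrLog() {
    try {
      throw new IOException("simulated copy failure");
    } catch (IOException e) {
      // Before: only e.toString() reaches the log; the stack trace is lost.
      LOG.warning("Fsck: could not copy block: " + e);
      // After: passing the exception attaches the full stack trace.
      LOG.log(Level.WARNING, "Fsck: could not copy block", e);
    }
  }

  public static void main(String[] args) {
    copyBlockOrLog();
  }
}
```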
   




> Fix namenode fsck swallows the exception stacktrace, this can help us to 
> troubleshooting log.
> -
>
> Key: HDFS-17454
> URL: https://issues.apache.org/jira/browse/HDFS-17454
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.3.6
>Reporter: xiaojunxiang
>Priority: Minor
> Attachments: image-2024-04-05-15-40-37-147.png, 
> image-2024-04-05-15-41-38-420.png
>
>
> When I used `hdfs fsck /xxx.txt -move`, the command failed with an error, 
> but I couldn't determine the reason, because the exception stacktrace isn't 
> appended to the LOG. Original code:
> !image-2024-04-05-15-40-37-147.png!
>  
> After my fix, we can see the exception stacktrace:
> !image-2024-04-05-15-41-38-420.png!






[jira] [Updated] (HDFS-17454) Fix namenode fsck swallows the exception stacktrace, this can help us to troubleshooting log.

2024-04-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17454:
--
Labels: pull-request-available  (was: )

> Fix namenode fsck swallows the exception stacktrace, this can help us to 
> troubleshooting log.
> -
>
> Key: HDFS-17454
> URL: https://issues.apache.org/jira/browse/HDFS-17454
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.3.6
>Reporter: xiaojunxiang
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2024-04-05-15-40-37-147.png, 
> image-2024-04-05-15-41-38-420.png
>
>
> When I used `hdfs fsck /xxx.txt -move`, the command failed with an error, 
> but I couldn't determine the reason, because the exception stacktrace isn't 
> appended to the LOG. Original code:
> !image-2024-04-05-15-40-37-147.png!
>  
> After my fix, we can see the exception stacktrace:
> !image-2024-04-05-15-41-38-420.png!






[jira] [Updated] (HDFS-17454) Fix namenode fsck swallows the exception stacktrace, this can help us to troubleshooting log.

2024-04-05 Thread xiaojunxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojunxiang updated HDFS-17454:

Description: 
When I used `hdfs fsck /xxx.txt -move`, the command failed with an error, but 
I couldn't determine the reason, because the exception stacktrace isn't 
appended to the LOG. Original code:

!image-2024-04-05-15-40-37-147.png!

 

After my fix, we can see the exception stacktrace:

!image-2024-04-05-15-41-38-420.png!

  was:
When I used `hdfs fsck /xxx.txt -move`, missing error, but I can;t kown the 
reason, because the exception stacktrace doesn't append to LOG, original code:

!image-2024-04-05-15-40-37-147.png!

 

When I fix it, look, we can see the exception stacktrace:

!image-2024-04-05-15-41-38-420.png!


> Fix namenode fsck swallows the exception stacktrace, this can help us to 
> troubleshooting log.
> -
>
> Key: HDFS-17454
> URL: https://issues.apache.org/jira/browse/HDFS-17454
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.3.6
>Reporter: xiaojunxiang
>Priority: Minor
> Attachments: image-2024-04-05-15-40-37-147.png, 
> image-2024-04-05-15-41-38-420.png
>
>
> When I used `hdfs fsck /xxx.txt -move`, the command failed with an error, 
> but I couldn't determine the reason, because the exception stacktrace isn't 
> appended to the LOG. Original code:
> !image-2024-04-05-15-40-37-147.png!
>  
> After my fix, we can see the exception stacktrace:
> !image-2024-04-05-15-41-38-420.png!






[jira] [Updated] (HDFS-17454) Fix namenode fsck swallows the exception stacktrace, this can help us to troubleshooting log.

2024-04-05 Thread xiaojunxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojunxiang updated HDFS-17454:

Affects Version/s: 3.3.6

> Fix namenode fsck swallows the exception stacktrace, this can help us to 
> troubleshooting log.
> -
>
> Key: HDFS-17454
> URL: https://issues.apache.org/jira/browse/HDFS-17454
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.3.6
>Reporter: xiaojunxiang
>Priority: Minor
> Attachments: image-2024-04-05-15-40-37-147.png, 
> image-2024-04-05-15-41-38-420.png
>
>
> When I used `hdfs fsck /xxx.txt -move`, the command failed with an error, 
> but I couldn't determine the reason, because the exception stacktrace isn't 
> appended to the LOG. Original code:
> !image-2024-04-05-15-40-37-147.png!
>  
> After my fix, we can see the exception stacktrace:
> !image-2024-04-05-15-41-38-420.png!






[jira] [Created] (HDFS-17454) Fix namenode fsck swallows the exception stacktrace, this can help us to troubleshooting log.

2024-04-05 Thread xiaojunxiang (Jira)
xiaojunxiang created HDFS-17454:
---

 Summary: Fix namenode fsck swallows the exception stacktrace, this 
can help us to troubleshooting log.
 Key: HDFS-17454
 URL: https://issues.apache.org/jira/browse/HDFS-17454
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: xiaojunxiang
 Attachments: image-2024-04-05-15-40-37-147.png, 
image-2024-04-05-15-41-38-420.png

When I used `hdfs fsck /xxx.txt -move`, the command failed with an error, but 
I couldn't determine the reason, because the exception stacktrace isn't 
appended to the LOG. Original code:

!image-2024-04-05-15-40-37-147.png!

 

After my fix, we can see the exception stacktrace:

!image-2024-04-05-15-41-38-420.png!






[jira] [Commented] (HDFS-17397) Choose another DN as soon as possible, when encountering network issues

2024-04-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834168#comment-17834168
 ] 

ASF GitHub Bot commented on HDFS-17397:
---

xleoken commented on PR #6591:
URL: https://github.com/apache/hadoop/pull/6591#issuecomment-2039122938

   cc @Hexiaoqiao @ZanderXu




> Choose another DN as soon as possible, when encountering network issues
> ---
>
> Key: HDFS-17397
> URL: https://issues.apache.org/jira/browse/HDFS-17397
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: xleoken
>Priority: Minor
>  Labels: pull-request-available
> Attachments: hadoop.png
>
>
> Choose another DN as soon as possible, when encountering network issues.


