haiyang1987 commented on code in PR #5829:
URL: https://github.com/apache/hadoop/pull/5829#discussion_r1531803838
##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/StripeReader.java:
##########

```diff
@@ -233,41 +235,62 @@ private ByteBufferStrategy[] getReadStrategies(StripingChunk chunk) {
   private int readToBuffer(BlockReader blockReader,
       DatanodeInfo currentNode, ByteBufferStrategy strategy,
-      ExtendedBlock currentBlock) throws IOException {
+      LocatedBlock currentBlock, int chunkIndex) throws IOException {
     final int targetLength = strategy.getTargetLength();
-    int length = 0;
-    try {
-      while (length < targetLength) {
-        int ret = strategy.readFromBlock(blockReader);
-        if (ret < 0) {
-          throw new IOException("Unexpected EOS from the reader");
+    int curAttempts = 0;
+    while (curAttempts < readDNMaxAttempts) {
+      curAttempts++;
+      int length = 0;
+      try {
+        while (length < targetLength) {
+          int ret = strategy.readFromBlock(blockReader);
+          if (ret < 0) {
+            throw new IOException("Unexpected EOS from the reader");
+          }
+          length += ret;
+        }
+        return length;
+      } catch (ChecksumException ce) {
+        DFSClient.LOG.warn("Found Checksum error for "
+            + currentBlock + " from " + currentNode
+            + " at " + ce.getPos());
+        //Clear buffer to make next decode success
+        strategy.getReadBuffer().clear();
+        // we want to remember which block replicas we have tried
+        corruptedBlocks.addCorruptedBlock(currentBlock.getBlock(), currentNode);
+        throw ce;
+      } catch (IOException e) {
+        //Clear buffer to make next decode success
+        strategy.getReadBuffer().clear();
+        if (curAttempts < readDNMaxAttempts) {
+          if (readerInfos[chunkIndex].reader != null) {
+            readerInfos[chunkIndex].reader.close();
+          }
+          if (dfsStripedInputStream.createBlockReader(currentBlock,
+              alignedStripe.getOffsetInBlock(), targetBlocks,
```

Review Comment:
When pread is used and the read buffer is sized to a full block, a single block on one datanode may serve data for multiple cell units. In that case the ByteBufferStrategy array in the StripingChunk corresponding to the AlignedStripe is computed to have multiple entries (the ChunkByteBuffer holds multiple ByteBuffer slices).

<img width="1319" alt="image" src="https://github.com/apache/hadoop/assets/3760130/40f7a944-ea57-4891-9719-86a1b009244d">

So when readToBuffer retries createBlockReader, we may need to take the current actual offsetInBlock into account to avoid reading duplicate data from the datanode.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
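To illustrate the reviewer's point, here is a minimal, self-contained sketch (not the actual Hadoop code; the class `StripeRetryOffsetSketch` and methods `bytesAlreadyRead`/`resumeOffset` are invented names for this example) of how a retried block reader could resume from the stripe's offsetInBlock plus the bytes already filled into the chunk's cell buffers, rather than re-reading from the original offset:

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical illustration: when a retry re-creates the block reader,
// the new reader could start from the stripe's offsetInBlock PLUS the
// bytes already read into the chunk's buffer slices, instead of the
// original offset. All names here are invented for the sketch.
public class StripeRetryOffsetSketch {

  // Bytes already consumed across the chunk's buffer slices.
  // (A ChunkByteBuffer holds multiple ByteBuffer slices when one
  // datanode block serves several cell units.)
  static long bytesAlreadyRead(List<ByteBuffer> slices) {
    long total = 0;
    for (ByteBuffer slice : slices) {
      total += slice.position(); // position() == bytes filled so far
    }
    return total;
  }

  // Offset a retried block reader would start from under this scheme.
  static long resumeOffset(long offsetInBlock, List<ByteBuffer> slices) {
    return offsetInBlock + bytesAlreadyRead(slices);
  }

  public static void main(String[] args) {
    // Two cell buffers: the first fully read (64 bytes), the second
    // half read (32 bytes) when the IOException occurred.
    ByteBuffer cell1 = ByteBuffer.allocate(64);
    cell1.position(64);
    ByteBuffer cell2 = ByteBuffer.allocate(64);
    cell2.position(32);
    // Resuming from 1024 + 64 + 32 avoids re-reading the first 96 bytes.
    System.out.println(resumeOffset(1024, List.of(cell1, cell2))); // 1120
  }
}
```

Note this is only a sketch of the bookkeeping; in the real patch the retry path also clears the current strategy's buffer before re-reading, so only fully or partially completed earlier slices would contribute to the resume offset.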