haiyang1987 commented on code in PR #5829:
URL: https://github.com/apache/hadoop/pull/5829#discussion_r1531803838
##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/StripeReader.java:
##########

```diff
@@ -233,41 +235,62 @@ private ByteBufferStrategy[] getReadStrategies(StripingChunk chunk) {
   private int readToBuffer(BlockReader blockReader,
       DatanodeInfo currentNode, ByteBufferStrategy strategy,
-      ExtendedBlock currentBlock) throws IOException {
+      LocatedBlock currentBlock, int chunkIndex) throws IOException {
     final int targetLength = strategy.getTargetLength();
-    int length = 0;
-    try {
-      while (length < targetLength) {
-        int ret = strategy.readFromBlock(blockReader);
-        if (ret < 0) {
-          throw new IOException("Unexpected EOS from the reader");
+    int curAttempts = 0;
+    while (curAttempts < readDNMaxAttempts) {
+      curAttempts++;
+      int length = 0;
+      try {
+        while (length < targetLength) {
+          int ret = strategy.readFromBlock(blockReader);
+          if (ret < 0) {
+            throw new IOException("Unexpected EOS from the reader");
+          }
+          length += ret;
+        }
+        return length;
+      } catch (ChecksumException ce) {
+        DFSClient.LOG.warn("Found Checksum error for "
+            + currentBlock + " from " + currentNode
+            + " at " + ce.getPos());
+        //Clear buffer to make next decode success
+        strategy.getReadBuffer().clear();
+        // we want to remember which block replicas we have tried
+        corruptedBlocks.addCorruptedBlock(currentBlock.getBlock(), currentNode);
+        throw ce;
+      } catch (IOException e) {
+        //Clear buffer to make next decode success
+        strategy.getReadBuffer().clear();
+        if (curAttempts < readDNMaxAttempts) {
+          if (readerInfos[chunkIndex].reader != null) {
+            readerInfos[chunkIndex].reader.close();
+          }
+          if (dfsStripedInputStream.createBlockReader(currentBlock,
+              alignedStripe.getOffsetInBlock(), targetBlocks,
```

Review Comment:
When pread is used and the read buffer is sized to a full block, a single block on one datanode may serve data for multiple cell units. In that case the ByteBufferStrategy array in the StripingChunk corresponding to the AlignedStripe is computed to have multiple entries (the ChunkByteBuffer holds multiple ByteBuffer slices).

<img width="1319" alt="image" src="https://github.com/apache/hadoop/assets/3760130/40f7a944-ea57-4891-9719-86a1b009244d">

So when readToBuffer retries createBlockReader, we may need to take the current actual offsetInBlock into account to avoid reading duplicate data from the datanode.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
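To illustrate the reviewer's point, here is a minimal, self-contained sketch (not the actual Hadoop code; the class `StripeRetryOffsetSketch` and methods `bytesAlreadyRead`/`resumeOffset` are invented names for this example) of how a retried block reader could resume from the stripe's offsetInBlock plus the bytes already filled into the chunk's cell buffers, rather than re-reading from the original offset:

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical illustration: when a retry re-creates the block reader,
// the new reader could start from the stripe's offsetInBlock PLUS the
// bytes already read into the chunk's buffer slices, instead of the
// original offset. All names here are invented for the sketch.
public class StripeRetryOffsetSketch {

  // Bytes already consumed across the chunk's buffer slices.
  // (A ChunkByteBuffer holds multiple ByteBuffer slices when one
  // datanode block serves several cell units.)
  static long bytesAlreadyRead(List<ByteBuffer> slices) {
    long total = 0;
    for (ByteBuffer slice : slices) {
      total += slice.position(); // position() == bytes filled so far
    }
    return total;
  }

  // Offset a retried block reader would start from under this scheme.
  static long resumeOffset(long offsetInBlock, List<ByteBuffer> slices) {
    return offsetInBlock + bytesAlreadyRead(slices);
  }

  public static void main(String[] args) {
    // Two cell buffers: the first fully read (64 bytes), the second
    // half read (32 bytes) when the IOException occurred.
    ByteBuffer cell1 = ByteBuffer.allocate(64);
    cell1.position(64);
    ByteBuffer cell2 = ByteBuffer.allocate(64);
    cell2.position(32);
    // Resuming from 1024 + 64 + 32 avoids re-reading the first 96 bytes.
    System.out.println(resumeOffset(1024, List.of(cell1, cell2))); // 1120
  }
}
```

Note this is only a sketch of the bookkeeping; in the real patch the retry path also clears the current strategy's buffer before re-reading, so only fully or partially completed earlier slices would contribute to the resume offset.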