[ https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835625#comment-17835625 ]
ASF GitHub Bot commented on HDFS-17455:
---------------------------------------

haiyang1987 commented on PR #6710:
URL: https://github.com/apache/hadoop/pull/6710#issuecomment-2046720852

   Thanks @ZanderXu @Hexiaoqiao for your detailed comments. I have updated the PR; please help review it again when you are free, thanks ~

> Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt
> --------------------------------------------------------------------------
>
>                 Key: HDFS-17455
>                 URL: https://issues.apache.org/jira/browse/HDFS-17455
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Haiyang Hu
>            Assignee: Haiyang Hu
>            Priority: Major
>              Labels: pull-request-available
>
> When the client reads data and connects to the datanode, the datanode access token may be invalid at that moment, so an InvalidBlockTokenException is thrown. The subsequent call to the fetchBlockAt method then throws java.lang.IndexOutOfBoundsException, causing the read to fail.
>
> *Root cause:*
> * The HDFS file contains only one RBW block, with a block data size of 2048KB.
> * The client opens this file and seeks to the offset of 1024KB to read data.
> * The DFSInputStream#getBlockReader call to the datanode fails with InvalidBlockTokenException because the datanode access token is invalid at that time, and the subsequent DFSInputStream#fetchBlockAt call then throws java.lang.IndexOutOfBoundsException.
> {code:java}
> private synchronized DatanodeInfo blockSeekTo(long target)
>     throws IOException {
>   if (target >= getFileLength()) {
>     // the target is smaller than fileLength (completeBlockSize +
>     // lastBlockBeingWrittenLength); here target is 1024 and
>     // getFileLength() is 2048, so this exception is not thrown
>     throw new IOException("Attempted to read past end of file");
>   }
>   ...
>   while (true) {
>     ...
>     try {
>       blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>           targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>           storageType, chosenNode);
>       if (connectFailedOnce) {
>         DFSClient.LOG.info("Successfully connected to " + targetAddr +
>             " for " + targetBlock.getBlock());
>       }
>       return chosenNode;
>     } catch (IOException ex) {
>       ...
>       } else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
>         refetchToken--;
>         // The InvalidBlockTokenException is caught here and
>         // fetchBlockAt(target) is invoked.
>         fetchBlockAt(target);
>       } else {
>       ...
>       }
>     }
>   }
> }
>
> private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
>     throws IOException {
>   maybeRegisterBlockRefresh();
>   synchronized (infoLock) {
>     // Here locatedBlocks contains only one locatedBlock; at this point
>     // the offset is 1024 and fileLength is 0,
>     // so the targetBlockIdx is -2
>     int targetBlockIdx = locatedBlocks.findBlock(offset);
>     if (targetBlockIdx < 0) { // block is not cached
>       targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
>       // Here the targetBlockIdx becomes 1
>       useCache = false;
>     }
>     if (!useCache) { // fetch blocks
>       final LocatedBlocks newBlocks = (length == 0)
>           ? dfsClient.getLocatedBlocks(src, offset)
>           : dfsClient.getLocatedBlocks(src, offset, length);
>       if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
>         throw new EOFException("Could not find target position " + offset);
>       }
>       // Update the LastLocatedBlock, if offset is for last block.
>       if (offset >= locatedBlocks.getFileLength()) {
>         setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
>       } else {
>         locatedBlocks.insertRange(targetBlockIdx,
>             newBlocks.getLocatedBlocks());
>       }
>     }
>     // Here locatedBlocks still contains only one locatedBlock, so this
>     // throws java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
>     return locatedBlocks.get(targetBlockIdx);
>   }
> }
> {code}
> The client exception:
> {code:java}
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
>     at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
>     at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
>     at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:266)
>     at java.base/java.util.Objects.checkIndex(Objects.java:359)
>     at java.base/java.util.ArrayList.get(ArrayList.java:427)
>     at org.apache.hadoop.hdfs.protocol.LocatedBlocks.get(LocatedBlocks.java:87)
>     at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockAt(DFSInputStream.java:569)
>     at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockAt(DFSInputStream.java:540)
>     at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:704)
>     at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:884)
>     at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:957)
>     at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:804)
> {code}
> The datanode exception:
> {code:java}
> 2024-03-27 15:56:35,477 WARN datanode.DataNode (DataXceiver.java:checkAccess(1487))
> [DataXceiver for client DFSClient_NONMAPREDUCE_475786505_1 at /xxx
> [Sending block BP-xxx:blk_1138933918_65194340]] - Block token verification failed:
> op=READ_BLOCK, remoteAddress=/XXX, message=Can't re-compute password for
> block_token_identifier (expiryDate=1711562193469, keyId=1775816931, userId=test,
> blockPoolId=BP-xxx-xxx-xxx, blockId=1138933918, access modes=[READ],
> storageTypes=[SSD, SSD, SSD], storageIds=[DS-xxx1, DS-xxx2, DS-xxx3]),
> since the required block key (keyID=1775816931) doesn't exist
> {code}
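To make the failing index arithmetic concrete, here is a small standalone Java sketch (illustration only, not HDFS code: the class name and the plain lists are stand-ins for LocatedBlocks, and the conversion simply mirrors the -2 -> 1 step shown in the snippet above). It reproduces the same "Index 1 out of bounds for length 1" error: a binary-search miss over the single cached block offset yields -2, the insertion-point conversion turns that into 1, and index 1 is out of bounds because the refreshed block list still holds only one entry.

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Standalone illustration of the index arithmetic described in this issue.
// Not HDFS code: the class name and plain lists are stand-ins for LocatedBlocks.
public class FetchBlockAtIndexSketch {
  public static void main(String[] args) {
    // The client's cached located blocks hold a single (RBW) block whose
    // recorded start offset is 0 and whose known file length is 0.
    List<Long> cachedBlockStartOffsets = new ArrayList<>();
    cachedBlockStartOffsets.add(0L);

    long offset = 1024L; // the position the client seeks to

    // Like LocatedBlocks#findBlock, a binary-search miss returns a negative
    // value that encodes the insertion point: -(1) - 1 = -2.
    int targetBlockIdx = Collections.binarySearch(cachedBlockStartOffsets, offset);
    System.out.println("findBlock-style result: " + targetBlockIdx); // -2

    // Like LocatedBlocks.getInsertIndex, convert the miss into an insertion
    // index: -(-2 + 1) = 1.
    if (targetBlockIdx < 0) {
      targetBlockIdx = -(targetBlockIdx + 1);
    }
    System.out.println("insert index: " + targetBlockIdx); // 1

    // fetchBlockAt then refreshes the block list from the NameNode, but the
    // refreshed list still contains only the one block, so index 1 is out
    // of bounds, just as in the client stack trace above.
    List<String> refreshedBlocks = new ArrayList<>();
    refreshedBlocks.add("blk_1138933918_65194340");
    try {
      refreshedBlocks.get(targetBlockIdx);
    } catch (IndexOutOfBoundsException e) {
      System.out.println("reproduced: " + e.getMessage());
    }
  }
}
{code}

The point the sketch mirrors is that the insert index computed before the refresh is reused unchanged after setLocatedBlocksFields replaces the block list, so the final locatedBlocks.get(targetBlockIdx) is evaluated against a list that still has only one element.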