[ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17834587#comment-17834587
 ] 

ASF GitHub Bot commented on HDFS-17455:
---------------------------------------

haiyang1987 commented on PR #6710:
URL: https://github.com/apache/hadoop/pull/6710#issuecomment-2041284092

   Hi @ZanderXu @Hexiaoqiao @ayushtkn @zhangshuyan0 please help me review this 
PR when you are free, thanks ~




> Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt
> -------------------------------------------------------------------------
>
>                 Key: HDFS-17455
>                 URL: https://issues.apache.org/jira/browse/HDFS-17455
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Haiyang Hu
>            Assignee: Haiyang Hu
>            Priority: Major
>              Labels: pull-request-available
>
> When the client read data, connect to the datanode, because at this time the 
> datanode access token is invalid will throw InvalidBlockTokenException. At 
> this time, when call fetchBlockAt method will  throw 
> java.lang.IndexOutOfBoundsException causing  read data failed.
> *Root case:*
> * The HDFS file contains only one RBW block, with a block data size of 2048KB.
> * The client open this file and seeks to the offset of 1024KB to read data.
> * Call DFSInputStream#getBlockReader method connect to the datanode,  because 
> at this time the datanode access token is invalid will throw 
> InvalidBlockTokenException., and call DFSInputStream#fetchBlockAt will throw 
> java.lang.IndexOutOfBoundsException.
> {code:java}
> private synchronized DatanodeInfo blockSeekTo(long target)
>      throws IOException {
>    if (target >= getFileLength()) {
>    // the target size is smaller than fileLength (completeBlockSize + 
> lastBlockBeingWrittenLength),
>    // here at this time target is 1024 and getFileLength is 2048
>      throw new IOException("Attempted to read past end of file");
>    }
>    ...
>    while (true) {
>      ...
>      try {
>        blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>            targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>            storageType, chosenNode);
>        if(connectFailedOnce) {
>          DFSClient.LOG.info("Successfully connected to " + targetAddr +
>                             " for " + targetBlock.getBlock());
>        }
>        return chosenNode;
>      } catch (IOException ex) {
>        ...
>        } else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
>          refetchToken--;
>          // Here will catch InvalidBlockTokenException.
>          fetchBlockAt(target);
>        } else {
>          ...
>        }
>      }
>    }
>  }
> private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
>       throws IOException {
>     maybeRegisterBlockRefresh();
>     synchronized(infoLock) {
>       // Here the locatedBlocks only contains one locatedBlock, at this time 
> the offset is 1024 and fileLength is 0,
>       // so the targetBlockIdx is -2
>       int targetBlockIdx = locatedBlocks.findBlock(offset);
>       if (targetBlockIdx < 0) { // block is not cached
>         targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
>         // Here the targetBlockIdx is 1;
>         useCache = false;
>       }
>       if (!useCache) { // fetch blocks
>         final LocatedBlocks newBlocks = (length == 0)
>             ? dfsClient.getLocatedBlocks(src, offset)
>             : dfsClient.getLocatedBlocks(src, offset, length);
>         if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
>           throw new EOFException("Could not find target position " + offset);
>         }
>         // Update the LastLocatedBlock, if offset is for last block.
>         if (offset >= locatedBlocks.getFileLength()) {
>           setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
>         } else {
>           locatedBlocks.insertRange(targetBlockIdx,
>               newBlocks.getLocatedBlocks());
>         }
>       }
>       // Here the locatedBlocks only contains one locatedBlock, so will throw 
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
>       return locatedBlocks.get(targetBlockIdx);
>     }
>   }
> {code}
> The client exception:
> {code:java}
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
>         at 
> java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
>         at 
> java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
>         at 
> java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:266)
>         at java.base/java.util.Objects.checkIndex(Objects.java:359)
>         at java.base/java.util.ArrayList.get(ArrayList.java:427)
>         at 
> org.apache.hadoop.hdfs.protocol.LocatedBlocks.get(LocatedBlocks.java:87)
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchBlockAt(DFSInputStream.java:569)
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchBlockAt(DFSInputStream.java:540)
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:704)
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:884)
>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:957)
>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:804)
> {code}
> The datanode exception:
> {code:java}
> 2024-03-27 15:56:35,477 WARN  datanode.DataNode 
> (DataXceiver.java:checkAccess(1487)) [DataXceiver for client 
> DFSClient_NONMAPREDUCE_475786505_1 at /xxx [Sending block 
> BP-xxx:blk_1138933918_65194340]] - Block token verification failed: 
> op=READ_BLOCK, remoteAddress=/XXX, message=Can't re-compute password for 
> block_token_identifier (expiryDate=1711562193469, keyId=1775816931, 
> userId=test, blockPoolId=BP-xxx-xxx-xxx, blockId=1138933918, access 
> modes=[READ], storageTypes= [SSD, SSD, SSD], storageIds= [DS-xxx1, 
> DS-xxx2,DS-xxx3]), since the required block key (keyID=1775816931) doesn't 
> exist
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to