[ https://issues.apache.org/jira/browse/HDFS-16520?focusedWorklogId=761571&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761571 ]
ASF GitHub Bot logged work on HDFS-16520:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Apr/22 03:26
            Start Date: 25/Apr/22 03:26
    Worklog Time Spent: 10m

Work Description: cndaimin commented on code in PR #4104:
URL: https://github.com/apache/hadoop/pull/4104#discussion_r857234796


##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSStripedInputStream.java:
##########

@@ -250,9 +255,16 @@ boolean createBlockReader(LocatedBlock block, long offsetInBlock,
       if (dnInfo == null) {
         break;
       }
+      if (readTo < 0 || readTo > block.getBlockSize()) {
+        readTo = block.getBlockSize();
+      }
       reader = getBlockReader(block, offsetInBlock,
-          block.getBlockSize() - offsetInBlock,
+          readTo - offsetInBlock,
           dnInfo.addr, dnInfo.storageType, dnInfo.info);
+      if (blockReaderListener != null) {

Review Comment:
   > It is for test here, right? Can we use fault injector here? Refer to DFSClientFaultInjector

   Yes, `DFSClientFaultInjector` is better here. Thanks, updated.


Issue Time Tracking
-------------------

    Worklog Id:     (was: 761571)
    Time Spent: 2.5h  (was: 2h 20m)

> Improve EC pread: avoid potential reading whole block
> -----------------------------------------------------
>
>                 Key: HDFS-16520
>                 URL: https://issues.apache.org/jira/browse/HDFS-16520
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: dfsclient, ec
>    Affects Versions: 3.3.1, 3.3.2
>            Reporter: daimin
>            Assignee: daimin
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> HDFS client 'pread' stands for 'positional read': this kind of read needs
> only a range of data rather than the whole file/block. By calling
> BlockReaderFactory#setLength, the client tells the datanode the length of
> data to read from disk and send back.
> For EC files, however, the read length is not set properly: by default
> 'block.getBlockSize() - offsetInBlock' is used for both pread and sread.
> As a result, the datanode reads and sends much more data than the client
> needs, and stops only when the client closes the connection. This wastes a
> significant amount of datanode and network resources.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
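[Editor's note] The clamping logic in the patch above can be sketched in isolation. This is a minimal illustration, not the actual Hadoop code: `clampReadTo` is a hypothetical helper name (the real patch inlines the check in DFSStripedInputStream#createBlockReader before calling getBlockReader), and the block/offset sizes below are made-up example values.

```java
// Sketch of the readTo clamping added by HDFS-16520. A negative or
// over-sized readTo means "no valid end hint", so we fall back to the
// full block size, which reproduces the pre-patch behaviour.
public class ReadToClamp {

    // Hypothetical helper: bound the requested read-end position to the block.
    static long clampReadTo(long readTo, long blockSize) {
        if (readTo < 0 || readTo > blockSize) {
            return blockSize;
        }
        return readTo;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;      // example: 128 MiB EC block
        long offsetInBlock = 4L * 1024 * 1024;    // example: read starts at 4 MiB

        // Pread with an end hint: only (readTo - offsetInBlock) bytes are
        // requested from the datanode instead of the rest of the block.
        long readTo = 8L * 1024 * 1024;
        System.out.println(clampReadTo(readTo, blockSize) - offsetInBlock);

        // No hint (readTo = -1): request up to the end of the block, as before.
        System.out.println(clampReadTo(-1, blockSize) - offsetInBlock);
    }
}
```

With the hint, the datanode is asked for 4 MiB instead of ~124 MiB, which is the waste the issue description refers to.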