[ https://issues.apache.org/jira/browse/HDFS-16520?focusedWorklogId=761571&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761571 ]
ASF GitHub Bot logged work on HDFS-16520:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Apr/22 03:26
            Start Date: 25/Apr/22 03:26
    Worklog Time Spent: 10m

Work Description: cndaimin commented on code in PR #4104:
URL: https://github.com/apache/hadoop/pull/4104#discussion_r857234796


##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSStripedInputStream.java:
##########

@@ -250,9 +255,16 @@ boolean createBlockReader(LocatedBlock block, long offsetInBlock,
       if (dnInfo == null) {
         break;
       }
+      if (readTo < 0 || readTo > block.getBlockSize()) {
+        readTo = block.getBlockSize();
+      }
       reader = getBlockReader(block, offsetInBlock,
-          block.getBlockSize() - offsetInBlock,
+          readTo - offsetInBlock,
           dnInfo.addr, dnInfo.storageType, dnInfo.info);
+      if (blockReaderListener != null) {

Review Comment:
   > It is for test here, right? Can we use fault injector here? Refer to DFSClientFaultInjector

   Yes, `DFSClientFaultInjector` is better here. Thanks, updated.


Issue Time Tracking
-------------------

    Worklog Id:     (was: 761571)
    Time Spent: 2.5h  (was: 2h 20m)

> Improve EC pread: avoid potential reading whole block
> -----------------------------------------------------
>
>                 Key: HDFS-16520
>                 URL: https://issues.apache.org/jira/browse/HDFS-16520
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: dfsclient, ec
>    Affects Versions: 3.3.1, 3.3.2
>            Reporter: daimin
>            Assignee: daimin
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> HDFS client 'pread' stands for 'positional read': this kind of read needs
> only a range of data rather than the whole file/block. By calling
> BlockReaderFactory#setLength, the client tells the datanode the length of
> data to read from disk and send back.
> For EC files, however, the read length is not set properly: by default
> 'block.getBlockSize() - offsetInBlock' is used for both pread and sread.
> As a result, the datanode reads and sends much more data than the client
> needs, and stops only when the client closes the connection. This wastes a
> significant amount of datanode and network resources.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
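[Editor's note] The clamping logic in the patch above can be sketched in isolation. This is a minimal illustration, not the actual Hadoop code: `clampReadTo` is a hypothetical helper name (the real patch inlines the check in DFSStripedInputStream#createBlockReader before calling getBlockReader), and the block/offset sizes below are made-up example values.

```java
// Sketch of the readTo clamping added by HDFS-16520. A negative or
// over-sized readTo means "no valid end hint", so we fall back to the
// full block size, which reproduces the pre-patch behaviour.
public class ReadToClamp {

    // Hypothetical helper: bound the requested read-end position to the block.
    static long clampReadTo(long readTo, long blockSize) {
        if (readTo < 0 || readTo > blockSize) {
            return blockSize;
        }
        return readTo;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;      // example: 128 MiB EC block
        long offsetInBlock = 4L * 1024 * 1024;    // example: read starts at 4 MiB

        // Pread with an end hint: only (readTo - offsetInBlock) bytes are
        // requested from the datanode instead of the rest of the block.
        long readTo = 8L * 1024 * 1024;
        System.out.println(clampReadTo(readTo, blockSize) - offsetInBlock);

        // No hint (readTo = -1): request up to the end of the block, as before.
        System.out.println(clampReadTo(-1, blockSize) - offsetInBlock);
    }
}
```

With the hint, the datanode is asked for 4 MiB instead of ~124 MiB, which is the waste the issue description refers to.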