[
https://issues.apache.org/jira/browse/HADOOP-17250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18044298#comment-18044298
]
ASF GitHub Bot commented on HADOOP-17250:
-----------------------------------------
github-actions[bot] closed pull request #2307: HADOOP-17250 ABFS short reads
can be merged with readahead.
URL: https://github.com/apache/hadoop/pull/2307
> ABFS: Random read perf improvement
> ----------------------------------
>
> Key: HADOOP-17250
> URL: https://issues.apache.org/jira/browse/HADOOP-17250
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.3.0
> Reporter: Sneha Vijayarajan
> Assignee: Mukund Thakur
> Priority: Major
> Labels: abfsactive, pull-request-available
> Fix For: 3.3.2
>
> Time Spent: 5.5h
> Remaining Estimate: 0h
>
> Reading ahead by a small margin on random reads was seen to improve
> performance for a TPC-H query.
>
> This patch introduces the fs.azure.readahead.range parameter, which can be
> set by the user. Data will be populated in the buffer for random reads as
> well, which leads to fewer remote calls.
> This patch also changes the seek implementation to perform a lazy seek. The
> actual seek is done only when a read is initiated and the data is not present
> in the buffer; otherwise the data is returned from the buffer, thus reducing
> the number of remote calls.
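
The behaviour described above can be sketched in a few lines of Java. This is
only an illustration under stated assumptions, not the ABFS code from the pull
request: RemoteStore, LazySeekStream and readAheadRange are hypothetical names,
the "remote" side is an in-memory byte array that counts calls, and the
read-ahead range is assumed to be extra bytes fetched past the requested
length.

```java
import java.util.Arrays;

public class LazySeekSketch {

    /** Hypothetical stand-in for a remote store; it only counts fetch calls. */
    static final class RemoteStore {
        final byte[] data;
        int remoteCalls = 0;

        RemoteStore(byte[] data) {
            this.data = data;
        }

        /** Returns up to {@code length} bytes starting at {@code offset}. */
        byte[] fetch(long offset, int length) {
            remoteCalls++;
            int from = (int) Math.min(offset, data.length);
            int to = (int) Math.min(offset + length, data.length);
            return Arrays.copyOfRange(data, from, to);
        }
    }

    /** Hypothetical stream: seek() is lazy, read() fetches only on a buffer miss. */
    static final class LazySeekStream {
        final RemoteStore store;
        final int readAheadRange;      // assumed: extra bytes fetched on a miss
        long nextReadPos = 0;          // logical position recorded by seek()
        long bufferStart = 0;          // file offset of buffer[0]
        byte[] buffer = new byte[0];

        LazySeekStream(RemoteStore store, int readAheadRange) {
            this.store = store;
            this.readAheadRange = readAheadRange;
        }

        /** Lazy seek: records the position and makes no remote call. */
        void seek(long pos) {
            if (pos < 0 || pos > store.data.length) {
                throw new IllegalArgumentException("seek out of range: " + pos);
            }
            nextReadPos = pos;
        }

        /** Serves from the buffer when possible; otherwise fetches len + readAheadRange. */
        int read(byte[] dst, int off, int len) {
            if (len == 0) {
                return 0;
            }
            if (nextReadPos >= store.data.length) {
                return -1; // end of data
            }
            boolean inBuffer = nextReadPos >= bufferStart
                    && nextReadPos < bufferStart + buffer.length;
            if (!inBuffer) {
                // The actual (deferred) seek happens here, as part of the fetch.
                buffer = store.fetch(nextReadPos, len + readAheadRange);
                bufferStart = nextReadPos;
            }
            int bufOffset = (int) (nextReadPos - bufferStart);
            int n = Math.min(len, buffer.length - bufOffset); // may be a short read
            System.arraycopy(buffer, bufOffset, dst, off, n);
            nextReadPos += n;
            return n;
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        RemoteStore store = new RemoteStore(data);
        LazySeekStream in = new LazySeekStream(store, 64);
        byte[] dst = new byte[16];

        in.seek(200);
        in.read(dst, 0, dst.length);   // buffer miss: one remote fetch
        in.seek(230);
        in.read(dst, 0, dst.length);   // inside the read-ahead window: no fetch
        System.out.println("remote calls: " + store.remoteCalls);
    }
}
```

In this sketch a seek followed by a nearby read is served from the bytes
fetched by the previous read, so only reads that land outside the buffered
window cost a remote call. A larger readAheadRange trades more bytes per fetch
for fewer fetches.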