[
https://issues.apache.org/jira/browse/HADOOP-19795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18055127#comment-18055127
]
ASF GitHub Bot commented on HADOOP-19795:
-----------------------------------------
anmolanmol1234 commented on code in PR #8212:
URL: https://github.com/apache/hadoop/pull/8212#discussion_r2741082638
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsPrefetchInputStream.java:
##########
@@ -80,10 +81,22 @@ protected int readOneBlock(final byte[] b, final int off, final int len) throws
}
/*
- * Always start with Prefetch even from first read.
- * Even if out of order seek comes, prefetches will be triggered for next set of blocks.
+ Skips prefetch for the first read if restrictGpsOnOpenFile config is enabled.
+ This is required since contentLength is not available yet to determine prefetch block size.
*/
- bytesRead = readInternal(getFCursor(), getBuffer(), 0, getBufferSize(), false);
+ if(shouldRestrictGpsOnOpenFile() && isFirstRead()) {
+ getTracingContext().setReadType(ReadType.NORMAL_READ);
+ LOG.debug("RestrictGpsOnOpenFile is enabled. Skip readahead for first read even for sequential input policy.");
+ bytesRead = readInternal(getFCursor(), getBuffer(), 0, getBufferSize(), true);
Review Comment:
readRemote should be called here; readInternal will trigger prefetch.
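To illustrate the branching being suggested, here is a minimal toy sketch. It does not use the actual Hadoop classes: `ToyStream`, `readWithPrefetch`, and the string call log are hypothetical stand-ins, and it simply assumes, as the comment states, that one read path queues a prefetch while a direct remote read does not.

```java
import java.util.ArrayList;
import java.util.List;

public class FirstReadSketch {

    // Hypothetical stand-in for the stream; NOT the real AbfsPrefetchInputStream.
    static class ToyStream {
        final boolean restrictGpsOnOpenFile;
        boolean firstRead = true;
        // Records which read paths were taken, in order.
        final List<String> calls = new ArrayList<>();

        ToyStream(boolean restrictGpsOnOpenFile) {
            this.restrictGpsOnOpenFile = restrictGpsOnOpenFile;
        }

        // Direct read from the store; never queues a prefetch.
        int readRemote(long position, int length) {
            calls.add("remote@" + position);
            return length;
        }

        // Assumed behaviour of the prefetching path: queue the next block, then read.
        int readWithPrefetch(long position, int length) {
            calls.add("prefetch@" + (position + length));
            calls.add("remote@" + position);
            return length;
        }

        int readOneBlock(long position, int length) {
            int bytesRead;
            if (restrictGpsOnOpenFile && firstRead) {
                // Content length is not known yet, so go straight to the remote read.
                bytesRead = readRemote(position, length);
            } else {
                bytesRead = readWithPrefetch(position, length);
            }
            firstRead = false;
            return bytesRead;
        }
    }

    public static void main(String[] args) {
        ToyStream stream = new ToyStream(true);
        stream.readOneBlock(0, 4);
        stream.readOneBlock(4, 4);
        System.out.println(stream.calls);
    }
}
```

In this model only the first read bypasses the prefetching path; every later read goes through it as before.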
> ABFS: GetPathStatus Optimization on OpenFileForRead
> ---------------------------------------------------
>
> Key: HADOOP-19795
> URL: https://issues.apache.org/jira/browse/HADOOP-19795
> Project: Hadoop Common
> Issue Type: Task
> Components: fs/azure
> Affects Versions: 3.4.1, 3.4.2
> Reporter: Manika Joshi
> Assignee: Manika Joshi
> Priority: Major
> Labels: pull-request-available
>
> We do a getPathStatus call during file open for read. This call is primarily
> used to fetch the file’s metadata properties before the actual read begins.
> We are now introducing an optional, config-driven read flow that avoids the
> getPathStatus call during open and instead derives required metadata from the
> read response itself.