[ 
https://issues.apache.org/jira/browse/HADOOP-16852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18044560#comment-18044560
 ] 

ASF GitHub Bot commented on HADOOP-16852:
-----------------------------------------

github-actions[bot] closed pull request #2147: HADOOP-16852: Report read-ahead 
error back
URL: https://github.com/apache/hadoop/pull/2147




> ABFS: Send error back to client for Read Ahead request failure
> --------------------------------------------------------------
>
>                 Key: HADOOP-16852
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16852
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.3.1
>            Reporter: Sneha Vijayarajan
>            Assignee: Sneha Vijayarajan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.3.1
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Issue seen by a customer:
> The failed requests we were seeing in the AbfsClient logging actually never 
> made it out over the wire. We have found that there’s an issue with ADLS 
> passthrough and the 8 read ahead threads that ADLSv2 spawns in 
> ReadBufferManager.java. We depend on thread local storage in order to get the 
> right JWT token and those threads do not have the right information in their 
> thread local storage. Thus, when they pick up a task from the read ahead 
> queue, they fail by throwing an AzureCredentialNotFoundException in 
> AbfsRestOperation.executeHttpOperation() where it calls 
> client.getAccessToken(). This exception is silently swallowed by the read 
> ahead threads in ReadBufferWorker.run(). As a result, every read ahead 
> attempt results in a failed executeHttpOperation(), but still calls 
> AbfsClientThrottlingIntercept.updateMetrics() and contributes to throttling 
> (despite not making it out over the wire). After the read aheads fail, the 
> main task thread performs the read with the right thread local storage 
> information and succeeds, but first sleeps for up to 10 seconds due to the 
> throttling.
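
The failure mode described above (a pool thread that cannot see the caller's thread-local credential, and a worker that swallows the resulting exception) can be illustrated with a minimal Java sketch. This is not the ABFS driver code and not the change in the pull request: ReadAheadErrorSketch, Buffer, TOKEN, getAccessToken(), readAhead() and read() are hypothetical stand-ins, and a plain IOException stands in for AzureCredentialNotFoundException. The sketch only shows one possible way to keep the worker's failure with the buffer so that the client read can report it.

```java
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in names throughout; none of this is the ABFS driver code.
class ReadAheadErrorSketch {
    // Per-thread credential, as in the passthrough setup described above.
    static final ThreadLocal<String> TOKEN = new ThreadLocal<>();

    // Stand-in for a read-ahead buffer that remembers why its fill failed.
    static final class Buffer {
        volatile byte[] data;
        volatile IOException error;
    }

    // Stand-in for the token lookup: fails when the calling thread has no token.
    static String getAccessToken() throws IOException {
        String token = TOKEN.get();
        if (token == null) {
            throw new IOException("no credential in thread-local storage");
        }
        return token;
    }

    // Runs one read-ahead task on a pool thread and waits for it to finish.
    static Buffer readAhead(ExecutorService pool) {
        Buffer buffer = new Buffer();
        try {
            pool.submit(() -> {
                try {
                    getAccessToken();
                    buffer.data = new byte[] {1, 2, 3};
                } catch (IOException e) {
                    buffer.error = e; // keep the failure instead of swallowing it
                }
            }).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("read-ahead task did not complete", e);
        }
        return buffer;
    }

    // Client-side read: surfaces the worker's failure to the caller.
    static byte[] read(Buffer buffer) throws IOException {
        if (buffer.error != null) {
            throw new IOException("read-ahead failed", buffer.error);
        }
        return buffer.data;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        TOKEN.set("caller-token"); // only the calling thread has a token
        try {
            read(readAhead(pool));
            System.out.println("read-ahead succeeded");
        } catch (IOException e) {
            System.out.println("client sees: " + e.getCause().getMessage());
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch the token set on the calling thread is not visible on the pool thread, so the read-ahead fails there, and the client learns about it only because the exception is stored on the buffer and rethrown in read().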



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

