[jira] [Commented] (HADOOP-18521) ABFS ReadBufferManager buffer sharing across concurrent HTTP requests

ASF GitHub Bot (Jira) Mon, 14 Nov 2022 05:46:04 -0800


    [ 
https://issues.apache.org/jira/browse/HADOOP-18521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633807#comment-17633807
 ]


ASF GitHub Bot commented on HADOOP-18521:
-----------------------------------------

snvijaya commented on PR #5117:
URL: https://github.com/apache/hadoop/pull/5117#issuecomment-1313734822

   Hi @steveloughran, I wanted to get your opinion on the change below as a 
possible replacement for this PR:
   https://github.com/apache/hadoop/pull/5133
   
   A ReadBuffer with a valid buffer assigned to it can be in one of the 
following states when its stream is closed. With the above change, I am 
trying to handle each state as follows:
   1. In the QueueReadAheadList: no change, the earlier purge takes care of it.
   2. In the CompletedList: no change again, the earlier purge takes care of it.
   3. In the InProgressList but yet to make the network call: if the stream is 
closed, skip the network call and move the ReadBuffer into the completed list 
as a failure.
   4. In the InProgressList and just finished the network call: if the stream 
is closed, move the ReadBuffer into the completed list as a failure, whether 
or not the network call succeeded.
   
   For a ReadBuffer in state 3 or 4, the purge method might miss it, because 
the purge may already have run before the ReadBuffer reached the completed 
list. To prioritize such ReadBuffers for eviction in that case, I have also 
added a stream-closed check to the eviction code.
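
   To make the handling of states 3 and 4 and the eviction check concrete, 
here is a minimal, self-contained sketch. It is not the code in the PR: the 
class ReadBufferSketch, its nested Stream and Buffer types, and the methods 
runPrefetch and pickEvictionCandidate are hypothetical names used only for 
illustration, and the "oldest buffer" fallback in eviction is an assumption.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical, simplified model; not the real ReadBufferManager. */
final class ReadBufferSketch {

  enum Status { IN_PROGRESS, AVAILABLE, READ_FAILED }

  /** Stand-in for the input stream; only tracks whether it was closed. */
  static final class Stream {
    volatile boolean closed;
  }

  /** Stand-in for a ReadBuffer that has been moved to the in-progress list. */
  static final class Buffer {
    final Stream stream;
    Status status = Status.IN_PROGRESS;
    long completedAtMillis;

    Buffer(Stream stream) {
      this.stream = stream;
    }
  }

  /**
   * Worker-side handling of states 3 and 4.
   * networkCall returns true when the (simulated) remote read succeeded.
   */
  static void runPrefetch(Buffer b, BooleanSupplier networkCall,
      List<Buffer> completed) {
    boolean ok = false;
    if (!b.stream.closed) {
      // State 3: the stream is still open, so the network call goes ahead.
      // Had it been closed already, the call would be skipped entirely.
      boolean readSucceeded = networkCall.getAsBoolean();
      // State 4: a close during the call turns even a successful read
      // into a failure.
      ok = readSucceeded && !b.stream.closed;
    }
    b.status = ok ? Status.AVAILABLE : Status.READ_FAILED;
    b.completedAtMillis = System.currentTimeMillis();
    completed.add(b);
  }

  /**
   * Eviction: prefer a buffer whose stream is closed, since a purge that
   * ran earlier could not have seen it. Falling back to the oldest
   * completed buffer is an assumption of this sketch.
   */
  static Buffer pickEvictionCandidate(List<Buffer> completed) {
    Buffer oldest = null;
    for (Buffer b : completed) {
      if (b.stream.closed) {
        return b;
      }
      if (oldest == null || b.completedAtMillis < oldest.completedAtMillis) {
        oldest = b;
      }
    }
    return oldest;
  }
}
```

   The point of the second check is that a buffer completing after the purge 
is still marked as failed and is the first one handed back by eviction, so it 
does not linger in the completed list with data nobody will read.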
   
   Please let me know if you see value in this fix. If so, I can pursue 
further changes to add validation code at queuing time and when getBlock finds 
a hit in the completed list, and I will also add related test code.




> ABFS ReadBufferManager buffer sharing across concurrent HTTP requests
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-18521
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18521
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>    Affects Versions: 3.3.2, 3.3.3, 3.3.4
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Critical
>              Labels: pull-request-available
>
> AbfsInputStream.close() can trigger the return of buffers used for active 
> prefetch GET requests into the ReadBufferManager free buffer pool.
> A subsequent prefetch by a different stream in the same process may acquire 
> this same buffer. This can lead to risk of corruption of its own prefetched 
> data, data which may then be returned to that other thread.
> On releases without the fix for this (3.3.2 to 3.3.4), the bug can be avoided 
> by disabling all prefetching 
> {code}
> fs.azure.readaheadqueue.depth = 0
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
