[ 
https://issues.apache.org/jira/browse/HADOOP-18521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638300#comment-17638300
 ] 

ASF GitHub Bot commented on HADOOP-18521:
-----------------------------------------

steveloughran commented on code in PR #5133:
URL: https://github.com/apache/hadoop/pull/5133#discussion_r1031522202


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/ReadBufferManager.java:
##########
@@ -247,7 +247,7 @@ private synchronized boolean tryEvict() {
 
     // first, try buffers where all bytes have been consumed (approximated as first and last bytes consumed)
     for (ReadBuffer buf : completedReadList) {
-      if (buf.isFirstByteConsumed() && buf.isLastByteConsumed()) {
+      if (buf.getStream().isClosed() || (buf.isFirstByteConsumed() && buf.isLastByteConsumed())) {

Review Comment:
   This doesn't quite do the right thing: evict() is looking for a completed 
read that still has an allocated buffer, but this condition will also match any 
completed read whose buffer was released prematurely.
   
   I'm picking up this change in my PR, but adding a check that the buffer is 
still held, so it will *only* pick up those records whose stream was closed 
after a successful read completed.
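
To make the distinction concrete, here is a minimal, self-contained sketch of the guarded check described above. It is not the code from either PR: `Stream`, `CompletedRead`, `hasBuffer()`, `isEvictable()` and `findEvictable()` are hypothetical stand-ins invented for illustration, and the real `ReadBuffer` in hadoop-azure has a different shape.

```java
import java.util.ArrayList;
import java.util.List;

public class EvictionSketch {

  /** Hypothetical stand-in for the input stream that owns a prefetch. */
  static final class Stream {
    private boolean closed;
    void close() { closed = true; }
    boolean isClosed() { return closed; }
  }

  /** Hypothetical stand-in for an entry in the completed read list. */
  static final class CompletedRead {
    final Stream stream;
    byte[] buffer;               // null once the buffer has been released
    boolean firstByteConsumed;
    boolean lastByteConsumed;

    CompletedRead(Stream stream, byte[] buffer) {
      this.stream = stream;
      this.buffer = buffer;
    }

    boolean hasBuffer() { return buffer != null; }
  }

  /**
   * Assumed eviction rule: the record must still hold a buffer, and either
   * its stream is closed or its data looks fully consumed.
   */
  static boolean isEvictable(CompletedRead read) {
    if (!read.hasBuffer()) {
      return false;              // nothing to reclaim; skip this record
    }
    return read.stream.isClosed()
        || (read.firstByteConsumed && read.lastByteConsumed);
  }

  /** Returns the first evictable record, or null if there is none. */
  static CompletedRead findEvictable(List<CompletedRead> completed) {
    for (CompletedRead read : completed) {
      if (isEvictable(read)) {
        return read;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    Stream open = new Stream();
    Stream closed = new Stream();
    closed.close();

    List<CompletedRead> completed = new ArrayList<>();
    completed.add(new CompletedRead(open, new byte[8]));   // unread, stream open
    completed.add(new CompletedRead(closed, null));        // buffer already released
    completed.add(new CompletedRead(closed, new byte[8])); // closed stream, buffer held

    CompletedRead victim = findEvictable(completed);
    System.out.println(victim == completed.get(2));        // prints "true"
  }
}
```

Without the `hasBuffer()` guard, the second record would be selected first even though it has no buffer left to return to the pool.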





> ABFS ReadBufferManager buffer sharing across concurrent HTTP requests
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-18521
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18521
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>    Affects Versions: 3.3.2, 3.3.3, 3.3.4
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Critical
>              Labels: pull-request-available
>
> AbfsInputStream.close() can trigger the return of buffers used for active 
> prefetch GET requests into the ReadBufferManager free buffer pool.
> A subsequent prefetch by a different stream in the same process may acquire 
> this same buffer. This risks corruption of that stream's own prefetched 
> data, which may then be returned to that other thread.
> On releases without the fix for this (3.3.2+), the bug can be avoided by 
> disabling all prefetching 
> {code}
> fs.azure.readaheadqueue.depth = 0
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

