[ 
https://issues.apache.org/jira/browse/FLINK-9761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16658543#comment-16658543
 ] 

zhijiang commented on FLINK-9761:
---------------------------------

I just quickly reviewed the related code, and I think this is still a problem, 
one that exists only in non-credit-based mode.

When {{PartitionRequestClientHandler.BufferListenerTask#notifyBufferDestroyed}} 
is called by the canceler thread, {{stagedBufferResponse}} may currently be 
non-null. But we directly set {{stagedBufferResponse = null}}, so there is no 
longer any chance to consume and release this netty message, which results in 
a leak.
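
To make this concrete, here is a minimal, single-threaded sketch of the pattern. It is not Flink code: {{CountedMessage}}, {{ListenerSketch}} and both method variants are hypothetical stand-ins, and only the field name {{stagedBufferResponse}} is taken from the handler. The "releasing" variant is just one possible remedy that I assume would work, not necessarily the actual fix.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Entry point first, so the file can also be launched in single-file source mode. */
final class StagedResponseLeakSketch {

    /** Stages one message, runs the chosen variant, and reports whether it was released. */
    static boolean releasedAfter(boolean leaky) {
        CountedMessage msg = new CountedMessage();
        ListenerSketch listener = new ListenerSketch();
        listener.stage(msg);
        if (leaky) {
            listener.notifyBufferDestroyedLeaky();
        } else {
            listener.notifyBufferDestroyedReleasing();
        }
        return msg.isReleased();
    }

    public static void main(String[] args) {
        System.out.println("leaky variant released message: " + releasedAfter(true));
        System.out.println("releasing variant released message: " + releasedAfter(false));
    }
}

/** Hypothetical stand-in for a reference-counted netty message (not a Flink class). */
final class CountedMessage {
    private final AtomicInteger refCnt = new AtomicInteger(1);

    void release() {
        if (refCnt.decrementAndGet() < 0) {
            throw new IllegalStateException("message released twice");
        }
    }

    boolean isReleased() {
        return refCnt.get() == 0;
    }
}

/** Simplified model of the listener state; thread-safety is deliberately ignored here. */
final class ListenerSketch {
    private CountedMessage stagedBufferResponse;

    void stage(CountedMessage msg) {
        stagedBufferResponse = msg;
    }

    /** Mirrors the behaviour described above: the reference is dropped without a release. */
    void notifyBufferDestroyedLeaky() {
        stagedBufferResponse = null;
    }

    /** Assumed remedy: release the pending message before clearing the field. */
    void notifyBufferDestroyedReleasing() {
        CountedMessage msg = stagedBufferResponse;
        stagedBufferResponse = null;
        if (msg != null) {
            msg.release();
        }
    }
}
```

In the leaky variant nothing holds a reference to the message after the call, so no later code path can release it. The real handler additionally has to coordinate the canceler thread with Netty's IO thread, which this sketch does not attempt to model.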

 

Even if {{stagedMessages}} is not empty, the {{stagedMessagesHandler}} 
would only consume and release the messages in the {{stagedMessages}} list; 
it would not first consume and release {{stagedBufferResponse}}. So I think 
the logic is still flawed.
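
The ordering concern can be sketched as follows. Again this is only an illustration under my own assumptions: {{Msg}}, {{drainQueueOnly}} and {{drainAll}} are hypothetical names, and the real handler's draining logic is more involved. The sketch only contrasts a drain that looks at the queue alone with one that also releases a pending {{stagedBufferResponse}} first.

```java
import java.util.ArrayDeque;

final class StagedDrainSketch {

    /** Hypothetical stand-in for a netty message that must be released exactly once. */
    static final class Msg {
        final String name;
        boolean released;

        Msg(String name) {
            this.name = name;
        }

        void release() {
            released = true;
        }
    }

    /** Releases only the queued messages; a pending staged response is never touched. */
    static int drainQueueOnly(ArrayDeque<Msg> stagedMessages) {
        int count = 0;
        Msg msg;
        while ((msg = stagedMessages.poll()) != null) {
            msg.release();
            count++;
        }
        return count;
    }

    /** Assumed alternative: release the pending staged response first, then the queue. */
    static int drainAll(Msg stagedBufferResponse, ArrayDeque<Msg> stagedMessages) {
        int count = 0;
        if (stagedBufferResponse != null) {
            stagedBufferResponse.release();
            count++;
        }
        return count + drainQueueOnly(stagedMessages);
    }

    public static void main(String[] args) {
        Msg pending = new Msg("stagedBufferResponse");
        ArrayDeque<Msg> queue = new ArrayDeque<>();
        queue.add(new Msg("queued-1"));
        queue.add(new Msg("queued-2"));

        int released = drainQueueOnly(queue);
        System.out.println("queue-only drain released " + released
                + " message(s); pending released: " + pending.released);

        released = drainAll(pending, queue);
        System.out.println("full drain released " + released
                + " message(s); pending released: " + pending.released);
    }
}
```

With the queue-only drain, the pending message stays unreleased regardless of how many queued messages are handled, which is the situation I described above.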

 

[~NicoK], could you double-check whether my guess about the above issue is correct?

> Potential buffer leak in PartitionRequestClientHandler during job failures
> --------------------------------------------------------------------------
>
>                 Key: FLINK-9761
>                 URL: https://issues.apache.org/jira/browse/FLINK-9761
>             Project: Flink
>          Issue Type: Bug
>          Components: Network
>    Affects Versions: 1.5.0
>            Reporter: Nico Kruber
>            Assignee: Nico Kruber
>            Priority: Critical
>             Fix For: 1.5.6, 1.6.3, 1.7.0
>
>
> {{PartitionRequestClientHandler#stagedMessages}} may be accessed from 
> multiple threads:
> 1) Netty's IO thread
> 2) During cancellation, 
> {{PartitionRequestClientHandler.BufferListenerTask#notifyBufferDestroyed}} is 
> called
> If {{PartitionRequestClientHandler.BufferListenerTask#notifyBufferDestroyed}} 
> thinks {{stagedMessages}} is empty, however, it will not install the 
> {{stagedMessagesHandler}} that consumes and releases buffers from received 
> messages.
> Unless some unexpected combination of code calls prevents this from 
> happening, this would leak the non-recycled buffers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
