[ 
https://issues.apache.org/jira/browse/FLINK-14472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956654#comment-16956654
 ] 

zhijiang commented on FLINK-14472:
----------------------------------

Thanks for your interest in this issue. You are right that the existing back 
pressure monitor does not handle some known scenarios. Although this ticket 
was not motivated by that limitation, I think we can address it at the same 
time while implementing the new monitoring approach.

The current monitoring approach is heavy-weight and fragile, and it also needs 
to understand the implementation of `LocalBufferPool`, which is bad design. I 
tried to provide a transparent method in `BufferProvider` that indicates 
whether it is back pressured, so that the monitor caller can rely on this 
method to get the back pressure result. The monitor tracker then no longer 
needs to analyze specific thread stacks to understand the implementation of 
`BufferProvider`. Another benefit is that the REST call only has to carry 
light-weight info.
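
To make the idea concrete, here is a minimal sketch of what such a method 
could look like. All names in it (`SketchBufferProvider`, `SketchBufferPool`, 
`SketchBackPressureMonitor`, `isBackPressured`, `sampleRatio`) are hypothetical 
and chosen for illustration only; they are not the actual Flink interfaces, and 
the toy pool ignores the global pool and quota handling entirely.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a buffer provider; NOT Flink's actual interface. */
interface SketchBufferProvider {
    /** Returns true when no buffer can be handed out right now. */
    boolean isBackPressured();
}

/** Toy pool with a fixed number of buffers, for illustration only. */
final class SketchBufferPool implements SketchBufferProvider {
    private final Deque<byte[]> available = new ArrayDeque<>();

    SketchBufferPool(int numBuffers, int bufferSize) {
        for (int i = 0; i < numBuffers; i++) {
            available.add(new byte[bufferSize]);
        }
    }

    /** Non-blocking request: returns null instead of waiting for a buffer. */
    synchronized byte[] requestBuffer() {
        return available.poll();
    }

    /** Returns a buffer to the pool, which may end the back pressure. */
    synchronized void recycle(byte[] buffer) {
        available.add(buffer);
    }

    @Override
    public synchronized boolean isBackPressured() {
        return available.isEmpty();
    }
}

/** Monitor that only polls the flag; it never inspects thread stacks. */
final class SketchBackPressureMonitor {
    /** Fraction of samples in which the provider reported back pressure. */
    static double sampleRatio(SketchBufferProvider provider, int numSamples) {
        if (numSamples <= 0) {
            throw new IllegalArgumentException("numSamples must be positive");
        }
        int hits = 0;
        for (int i = 0; i < numSamples; i++) {
            if (provider.isBackPressured()) {
                hits++;
            }
        }
        return (double) hits / numSamples;
    }
}
```

In this sketch the monitor depends only on the boolean exposed by the 
provider, so the pool internals can change without touching the monitor, and 
the result passed back to the caller is a single number rather than stack 
traces.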

> Implement back-pressure monitor with non-blocking outputs
> ---------------------------------------------------------
>
>                 Key: FLINK-14472
>                 URL: https://issues.apache.org/jira/browse/FLINK-14472
>             Project: Flink
>          Issue Type: Task
>          Components: Runtime / Network
>            Reporter: zhijiang
>            Assignee: Yingjie Cao
>            Priority: Minor
>             Fix For: 1.10.0
>
>
> Currently the back-pressure monitor relies on detecting task threads that 
> are stuck in `requestBufferBuilderBlocking`. There are actually two cases 
> that cause back-pressure at the moment:
>  * There are no available buffers in `LocalBufferPool` and the quota granted 
> from the global pool is also exhausted. Then we need to wait for buffers to 
> be recycled to `LocalBufferPool`.
>  * There are no available buffers in `LocalBufferPool`, but the quota has not 
> been used up. The request to the global pool then blocks because the global 
> pool has no available buffers either. Then we need to wait for buffers to be 
> recycled to the global pool.
> We already implemented the non-blocking output for the first case in 
> [FLINK-14396|https://issues.apache.org/jira/browse/FLINK-14396], and we 
> expect the second case to be done together with adjusting the back-pressure 
> monitor, which could check `RecordWriter#isAvailable` instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
