wsry opened a new pull request #10083: [FLINK-14472][runtime]Implement 
back-pressure monitor with non-blocking outputs.
URL: https://github.com/apache/flink/pull/10083
 
 
   ## What is the purpose of the change
   Currently, the back-pressure monitor works by detecting task threads that 
are stuck in `requestBufferBuilderBlocking`. At the moment, back-pressure can 
arise in two cases:
   
    - The `LocalBufferPool` has no available buffers and its quota from the 
global pool is exhausted. The task must then wait for buffers to be recycled 
to the `LocalBufferPool`.
    - The `LocalBufferPool` has no available buffers, but its quota has not 
been used up. The request to the global pool blocks because the global pool 
has no available buffers either. The task must then wait for buffers to be 
recycled to the global pool.
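
   The two cases can be told apart in a small model. The sketch below is only 
an illustration under simplifying assumptions: `BufferRequestSketch`, 
`GlobalPool`, `LocalPool` and `Outcome` are hypothetical names, buffers are 
plain counters, and a request reports why it would wait instead of blocking. 
It is not the code of Flink's `LocalBufferPool`.

```java
/** Hypothetical, simplified model of the two cases; not Flink's actual buffer pool code. */
class BufferRequestSketch {

    enum Outcome { GRANTED, WAIT_FOR_LOCAL_RECYCLE, WAIT_FOR_GLOBAL_RECYCLE }

    /** Stand-in for the global pool: only tracks how many buffers are free. */
    static final class GlobalPool {
        private int free;

        GlobalPool(int free) {
            this.free = free;
        }

        boolean tryTake() {
            if (free == 0) {
                return false;
            }
            free--;
            return true;
        }
    }

    /** Stand-in for a local pool with a quota of buffers it may take from the global pool. */
    static final class LocalPool {
        private final GlobalPool global;
        private final int quota;
        private int takenFromGlobal;
        private int availableLocally;

        LocalPool(GlobalPool global, int quota) {
            this.global = global;
            this.quota = quota;
        }

        /** Returns GRANTED, or the reason a blocking request would have to wait. */
        Outcome request() {
            if (availableLocally > 0) {
                availableLocally--;
                return Outcome.GRANTED;
            }
            if (takenFromGlobal >= quota) {
                // Case 1: quota exhausted, wait for a recycle to the local pool.
                return Outcome.WAIT_FOR_LOCAL_RECYCLE;
            }
            if (!global.tryTake()) {
                // Case 2: quota left, but the global pool is empty.
                return Outcome.WAIT_FOR_GLOBAL_RECYCLE;
            }
            takenFromGlobal++;
            return Outcome.GRANTED;
        }

        /** A buffer handed out earlier comes back to this local pool. */
        void recycle() {
            availableLocally++;
        }
    }

    public static void main(String[] args) {
        GlobalPool global = new GlobalPool(1);
        LocalPool a = new LocalPool(global, 1);
        LocalPool b = new LocalPool(global, 2);
        System.out.println(a.request()); // GRANTED
        System.out.println(a.request()); // WAIT_FOR_LOCAL_RECYCLE
        System.out.println(b.request()); // WAIT_FOR_GLOBAL_RECYCLE
        a.recycle();
        System.out.println(a.request()); // GRANTED
    }
}
```

   In this model, pool `a` hits the first case once its quota of one buffer is 
used, while pool `b` hits the second case because the shared global pool is 
already empty.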
   
   FLINK-14396 aims to make the network output non-blocking, so the back 
pressure monitor has to be adjusted once the non-blocking output is used in 
practice. This PR implements a new back pressure monitor that detects task 
back pressure by checking the availability of the `ResultPartitionWriter`, 
i.e. whether there are free buffers available in the `BufferPool` of the 
ResultPartitions for output.
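
   As a rough sketch of the idea, back pressure can be estimated by sampling 
output availability several times and reporting the fraction of samples in 
which an output was unavailable. Everything below is an assumption made for 
illustration: `BackPressureSamplingSketch`, `isBackPressured` and 
`sampleRatio` are hypothetical names, each output is modelled as a 
`BooleanSupplier` that reports availability, and the rule "back pressured if 
any output is unavailable" is one plausible reading of the description, not 
the PR's confirmed logic. A real sampler would also space its samples out in 
time, which is omitted here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of availability-based sampling; not the PR's actual classes. */
class BackPressureSamplingSketch {

    /** Assumed rule: a task is back pressured if any of its outputs is unavailable. */
    static boolean isBackPressured(List<BooleanSupplier> outputs) {
        for (BooleanSupplier output : outputs) {
            if (!output.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }

    /** Takes numSamples samples and returns the fraction that were back pressured. */
    static double sampleRatio(List<BooleanSupplier> outputs, int numSamples) {
        if (numSamples <= 0) {
            throw new IllegalArgumentException("numSamples must be positive");
        }
        int backPressured = 0;
        for (int i = 0; i < numSamples; i++) {
            if (isBackPressured(outputs)) {
                backPressured++;
            }
        }
        return (double) backPressured / numSamples;
    }

    public static void main(String[] args) {
        BooleanSupplier alwaysAvailable = () -> true;
        BooleanSupplier neverAvailable = () -> false;
        System.out.println(sampleRatio(Arrays.asList(alwaysAvailable), 10));                 // 0.0
        System.out.println(sampleRatio(Arrays.asList(alwaysAvailable, neverAvailable), 10)); // 1.0
    }
}
```

   Unlike stack sampling, this check does not depend on a task thread being 
blocked inside a particular method, so it still works when requests for 
output buffers no longer block.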
   
   ## Brief change log
     - A new back pressure tracker was implemented. It monitors task back 
pressure by checking the availability of the `ResultPartitionWriter`, i.e. 
whether there are free buffers available in the `BufferPool` of the 
ResultPartitions for output.
     - The old back pressure tracker based on stack sampling was removed, 
together with the relevant code.
     - New test cases were added to verify the changes.
   
   
   ## Verifying this change
   Several new test cases were added to verify the changes, including 
`BackPressureStatsTrackerImplTest`, 
`BackPressureSampleCoordinatorTest`, 
`TaskBackPressureSampleServiceTest`, 
`TaskTest#testNoBackPressureIfTaskNotStarted`, and 
`TaskExecutorSubmissionTest#testSampleTaskBackPressure`.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (yes / **no**)
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
     - The serializers: (yes / **no** / don't know)
     - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
     - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes / **no**)
     - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
