[ 
https://issues.apache.org/jira/browse/FLINK-26762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523505#comment-17523505
 ] 

fanrui commented on FLINK-26762:
--------------------------------

Hi [~akalashnikov], I have submitted the PR for the overdraft buffer, but there 
are a few questions I would like to confirm with you.
h2. About the configuration
You mentioned earlier that it is better to configure overdraft-memory-size 
rather than overdraft-buffers. I don't know the reason; could you give some 
details? 
Currently, I think overdraft-buffers is better than overdraft-memory-size, 
because:
 * LocalBufferPool counts in segments (numberOfRequiredMemorySegments), so if 
Flink users configure overdraft-memory-size, the code would still need to 
convert it to overdraft-buffers. 
 * Also, I think overdraft-buffers is clearer for the user. If the Flink user 
knows that the flatmap operator in the job has:
 ** one input and 5 outputs, the user can configure overdraft-buffers = 5.
 ** one input and 10 outputs, the user can configure overdraft-buffers = 10.
 * If memory is configured instead, do users still need to do this conversion 
themselves?
 * When overdraft-buffers = numberOfSubpartitions, the buffers required for a 
Watermark broadcast are also covered. So in the future we may add an option 
that sets overdraft-buffers = numberOfSubpartitions.
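
To make the conversion in the first bullet concrete, here is a minimal sketch of how a memory-based setting could be mapped to a buffer count. The class and method names are hypothetical and are not part of Flink or of the PR; the only fact assumed from Flink is that the default segment size (taskmanager.memory.segment-size) is 32 KiB.

```java
/** Hypothetical helper for illustration only; not part of Flink or the PR. */
final class OverdraftSizing {

    /** Flink's default taskmanager.memory.segment-size is 32 KiB. */
    static final int DEFAULT_SEGMENT_SIZE_BYTES = 32 * 1024;

    private OverdraftSizing() {}

    /**
     * Converts a memory budget into a number of whole buffers.
     * Rounds down, because a partial segment cannot be handed out.
     */
    static int toOverdraftBuffers(long overdraftMemoryBytes, int segmentSizeBytes) {
        if (overdraftMemoryBytes < 0 || segmentSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        long buffers = overdraftMemoryBytes / segmentSizeBytes;
        return (int) Math.min(Integer.MAX_VALUE, buffers);
    }

    /** The reverse direction: how much memory a given buffer count can use at most. */
    static long toOverdraftMemoryBytes(int overdraftBuffers, int segmentSizeBytes) {
        if (overdraftBuffers < 0 || segmentSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        return (long) overdraftBuffers * segmentSizeBytes;
    }
}
```

With the default segment size, a 160 KiB budget corresponds to 5 buffers, and 100 KiB rounds down to 3. A memory-based option therefore also depends on the segment size, while a buffer-based option maps directly to the number of outputs per input.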

h2. About the benchmark

I have tested this using a Flink job with a flatmap.

code link: 
[https://github.com/1996fanrui/fanrui-learning/commit/a1dbd850f878b64bfeb162b05f5c4750f9d629cc]

 
{code:java}
job parallelism = 100
sink sleep = 10ms
flatmap : one input 5 outputs
{code}
 

 

Without the overdraft buffer, the checkpoint duration is between 3 and 7 
minutes, mostly around 5 minutes. If the flatmap outputs more data, the 
backpressure is more severe, or the job parallelism is higher, the UC duration 
of jobs without overdraft may exceed 10 minutes.

With the overdraft buffer, the checkpoint duration is between 0.3 and 2.5 s, 
so the benefit is obvious.

!image-2022-04-18-11-45-14-700.png|width=1642,height=622!

!image-2022-04-18-11-46-03-895.png|width=1834,height=679!

I am currently comparing two Flink jobs by hand. How should I write a standard 
benchmark for this comparison, and how can the UC duration be measured in a 
reasonable way? 

 

Also, could you help review the PR when you have time? Thanks a lot.

> Add the overdraft buffer in BufferPool to reduce unaligned checkpoint being 
> blocked
> -----------------------------------------------------------------------------------
>
>                 Key: FLINK-26762
>                 URL: https://issues.apache.org/jira/browse/FLINK-26762
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing, Runtime / Network
>    Affects Versions: 1.13.0, 1.14.0, 1.15.0
>            Reporter: fanrui
>            Assignee: fanrui
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.16.0
>
>         Attachments: image-2022-04-18-11-45-14-700.png, 
> image-2022-04-18-11-46-03-895.png
>
>
> In some past JIRAs on Unaligned Checkpoint, the community added 
> recordWriter.isAvailable() to reduce blocking when writing a single record. 
> But large records, flatmap, or broadcast watermarks may need more than one 
> buffer.
> Can we add an overdraft buffer to BufferPool to reduce how often unaligned 
> checkpoints are blocked? 
> h2. Overdraft Buffer mechanism
> Add the configuration 
> 'taskmanager.network.memory.overdraft-buffers-per-gate=5'. 
> When requestMemory is called and the bufferPool has no free buffers, the 
> bufferPool allows the Task to overdraw up to 5 MemorySegments. The 
> bufferPool is then unavailable until all overdrawn buffers have been consumed 
> by downstream tasks, and the Task waits for the bufferPool to become available.
> This gives us the following benefits:
>  * For scenarios that require multiple buffers, the Task releases the 
> Checkpoint lock, so the Unaligned Checkpoint can complete quickly.
>  * We can bound the memory usage to prevent memory leaks.
>  * It needs only a little memory, and can improve the stability of the Task 
> under back pressure.
>  * Users can increase overdraft-buffers to adapt to scenarios that 
> require more buffers.
>  
> Masters, please correct me if I'm wrong, thanks a lot.
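
The overdraft mechanism quoted above can be sketched as a simple counter-based pool. This is only an illustration of the described behaviour under stated assumptions, not the code from the PR and not Flink's LocalBufferPool: the class name, method names, and the rule that availability returns only once usage drops below the regular pool size are all hypothetical.

```java
/** Illustrative sketch only; hypothetical names, not Flink's LocalBufferPool. */
final class OverdraftPoolSketch {

    private final int poolSize;      // regular segments owned by the pool
    private final int maxOverdraft;  // e.g. overdraft-buffers-per-gate = 5
    private int inUse;               // regular + overdrawn segments handed out

    OverdraftPoolSketch(int poolSize, int maxOverdraft) {
        if (poolSize < 0 || maxOverdraft < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        this.poolSize = poolSize;
        this.maxOverdraft = maxOverdraft;
    }

    /** Grants a segment if a regular or an overdraft segment is still left. */
    synchronized boolean requestSegment() {
        if (inUse < poolSize + maxOverdraft) {
            inUse++;
            return true;
        }
        return false; // overdraft exhausted: the caller would have to block
    }

    /** Called when a downstream task has consumed a buffer. */
    synchronized void recycle() {
        if (inUse == 0) {
            throw new IllegalStateException("nothing to recycle");
        }
        inUse--;
    }

    /**
     * Assumption: the pool reports itself available only when no overdraft is
     * outstanding and at least one regular segment is free.
     */
    synchronized boolean isAvailable() {
        return inUse < poolSize;
    }

    /** Number of segments currently taken beyond the regular pool size. */
    synchronized int overdrawn() {
        return Math.max(0, inUse - poolSize);
    }
}
```

In this sketch a task that checks isAvailable() before processing a record can still finish emitting a multi-buffer output (for example 5 flatmap results) through requestSegment() without blocking, and then stops processing new input until the overdraft has been paid back.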



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
