[ 
https://issues.apache.org/jira/browse/FLINK-26762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17521184#comment-17521184
 ] 

Anton Kalashnikov commented on FLINK-26762:
-------------------------------------------

We definitely can ignore buffer debloating for the first version. but it makes 
sense to think about configuration, perhaps it is better to configure 
overdraft-memory-size rather than overdraft-buffers since it seems more 
flexible for the future but I don't really sure about that now.
As I mentioned before, Flink already switches to unavailable when one more 
buffer is left yet.  Somehow we can consider it as one buffer overdraft. So, 
please, take into account this fact during your implementation because it will 
be more convenient to have all of this logic in one place.

> Add the overdraft buffer in BufferPool to reduce unaligned checkpoint being 
> blocked
> -----------------------------------------------------------------------------------
>
>                 Key: FLINK-26762
>                 URL: https://issues.apache.org/jira/browse/FLINK-26762
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing, Runtime / Network
>    Affects Versions: 1.13.0, 1.14.0, 1.15.0
>            Reporter: fanrui
>            Assignee: fanrui
>            Priority: Major
>             Fix For: 1.16.0
>
>
> In some past JIRAs of Unaligned Checkpoint, the community has added the  
> recordWriter.isAvaliable() to reduce block for single record write. But for 
> large record, flatmap or broadcast watermark, they may need more buffer.
> Can we add the overdraft buffer in BufferPool to reduce unaligned checkpoint 
> being blocked? 
> h2. Overdraft Buffer mechanism
> Add the configuration of 
> 'taskmanager.network.memory.overdraft-buffers-per-gate=5'. 
> When requestMemory is called and the bufferPool is insufficient, the 
> bufferPool will allow the Task to overdraw up to 5 MemorySegments. And 
> bufferPool will be unavailable until all overdrawn buffers are consumed by 
> downstream tasks. Then the task will wait for bufferPool being available.
> From the above, we have the following benefits:
>  * For scenarios that require multiple buffers, the Task releases the 
> Checkpoint lock, so the Unaligned Checkpoint can be completed quickly.
>  * We can control the memory usage to prevent memory leak.
>  * It just needs a litter memory, and can improve the stability of the Task 
> under back pressure.
>  * Users can increase the overdraft-buffers to adapt the scenarios that 
> require more buffers.
>  
> Masters, please correct me if I'm wrong, thanks a lot.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to