[ 
https://issues.apache.org/jira/browse/FLINK-25688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-25688:
-----------------------------------
    Description: 
As documented in FLINK-25646, buffer debloating in Flink, at least in the 
default configuration, currently causes a noticeable performance degradation 
at larger scale. For example, throughput can drop by a factor of 4, and 
checkpointing times can even increase. It is currently not clear why this is 
happening. Increasing the number of buffers per channel from the default ~2 
to above 3 (for example by raising the number of floating buffers to a value 
equal to or higher than the parallelism) seems to solve this problem, at 
least on one cluster where buffer debloating has been tested at large scale.
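
To make the "~2 versus above 3" arithmetic concrete, here is a rough sketch 
rather than a measurement. It assumes an all-to-all exchange, so that one 
input gate has as many channels as the upstream parallelism, and it assumes 
the defaults of 2 exclusive buffers per channel 
(taskmanager.network.memory.buffers-per-channel) and 8 floating buffers per 
gate (taskmanager.network.memory.floating-buffers-per-gate). The function 
name is hypothetical and is not part of Flink.

```python
def avg_buffers_per_channel(parallelism, exclusive_per_channel=2, floating_per_gate=8):
    """Average number of buffers available to each channel of one input gate.

    Assumption: the gate has `parallelism` channels (all-to-all exchange) and
    the floating buffers are shared evenly across those channels.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    return exclusive_per_channel + floating_per_gate / parallelism


if __name__ == "__main__":
    for p in (10, 100, 1000):
        default = avg_buffers_per_channel(p)
        # Floating buffers raised to the parallelism, as suggested in the ticket.
        bumped = avg_buffers_per_channel(p, floating_per_gate=p)
        print(f"parallelism={p}: default={default:.3f}, bumped={bumped:.3f}")
```

Under these assumptions the default average approaches 2 as parallelism 
grows, while setting the floating buffers equal to the parallelism gives 
exactly 3 per channel, and any higher value gives more than 3.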

One possible solution is to change Flink's default configuration by 
increasing the number of exclusive or floating buffers, at least when buffer 
debloating is enabled. However, further investigation is required.

CC [~akalashnikov]

  was:
As documented in FLINK-25646, buffer debloating in Flink, at least in the 
default configuration, currently causes a noticeable performance degradation 
at larger scale. For example, throughput can drop by a factor of 4, and 
checkpointing times can even increase. It is currently not clear why this is 
happening. Increasing the number of buffers per channel from the default ~2 
to above 3 (for example by raising the number of floating buffers to a value 
equal to or higher than the parallelism) seems to solve this problem, at 
least on one cluster where buffer debloating has been tested at large scale.

Further investigation is required.

CC [~akalashnikov]


> Resolve performance degradation with high parallelism when using buffer 
> debloating
> ----------------------------------------------------------------------------------
>
>                 Key: FLINK-25688
>                 URL: https://issues.apache.org/jira/browse/FLINK-25688
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>    Affects Versions: 1.15.0, 1.14.3
>            Reporter: Piotr Nowojski
>            Priority: Not a Priority
>
> As documented in FLINK-25646, buffer debloating in Flink, at least in the 
> default configuration, currently causes a noticeable performance 
> degradation at larger scale. For example, throughput can drop by a factor 
> of 4, and checkpointing times can even increase. It is currently not clear 
> why this is happening. Increasing the number of buffers per channel from 
> the default ~2 to above 3 (for example by raising the number of floating 
> buffers to a value equal to or higher than the parallelism) seems to solve 
> this problem, at least on one cluster where buffer debloating has been 
> tested at large scale.
> One possible solution is to change Flink's default configuration by 
> increasing the number of exclusive or floating buffers, at least when 
> buffer debloating is enabled. However, further investigation is required.
> CC [~akalashnikov]



