[ 
https://issues.apache.org/jira/browse/FLINK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17521540#comment-17521540
 ] 

Piotr Nowojski commented on FLINK-24578:
----------------------------------------

As a next step in this ticket, it might be a good idea to double-check whether 
the same performance regression that appears when debloating is enabled is also 
visible after manually decreasing the buffer size to a value similar to the 
debloated one for the given job. 
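
A minimal sketch of how such a control run could be prepared, assuming the
debloated buffer sizes have been read off the job's metrics beforehand. It
picks the median observed size, rounds it down to a power of two, and emits
configuration lines that switch debloating off and fix the segment size
instead. The sample values and the helper name `control_run_config` are
hypothetical. The power-of-two requirement and the 4 KiB lower bound are
assumptions about Flink's segment-size validation that should be verified
against the version under test.

```python
from statistics import median

MIN_SEGMENT_BYTES = 4 * 1024   # assumed lower bound for the segment size
MAX_SEGMENT_BYTES = 32 * 1024  # Flink's default segment size (32 KiB)


def floor_power_of_two(n: int) -> int:
    """Return the largest power of two that is <= n (n must be >= 1)."""
    return 1 << (n.bit_length() - 1)


def control_run_config(debloated_sizes_bytes):
    """Build flink-conf.yaml lines for a run with a fixed, small buffer size."""
    typical = int(median(debloated_sizes_bytes))
    size = floor_power_of_two(max(typical, 1))
    # Clamp into the range the segment size is assumed to accept.
    size = min(max(size, MIN_SEGMENT_BYTES), MAX_SEGMENT_BYTES)
    return [
        "taskmanager.network.memory.buffer-debloat.enabled: false",
        f"taskmanager.memory.segment-size: {size}b",
    ]


if __name__ == "__main__":
    # Hypothetical buffer sizes (bytes) sampled while debloating was enabled.
    samples = [9_800, 11_200, 10_500, 12_000, 9_900]
    print("\n".join(control_run_config(samples)))
```

If the throughput with this fixed size matches the debloated run, the ~10%
loss is likely a consequence of small buffers as such rather than of the
debloating mechanism.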

> Unexpected erratic load shape for channel skew load profile and ~10% 
> performance loss with enabled debloating
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-24578
>                 URL: https://issues.apache.org/jira/browse/FLINK-24578
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.14.0
>            Reporter: Anton Kalashnikov
>            Priority: Major
>         Attachments: antiphaseBufferSize.png, erraticBufferSize1.png, 
> erraticBufferSize2.png
>
>
> given:
> A job with 5 maps (each with keyBy).
> All channels are remote. Parallelism is 80.
> The first task produces only two keys: `indexOfThisSubtask` and 
> `indexOfThisSubtask + 1`. So every subtask has a constant number of active 
> channels (which depends on the hash rebalance).
> Every record has the same size and takes the same time to process.
>  
> when: 
> The buffer debloat is enabled with the default configuration.
>  
> then:
> The buffer size synchronizes across all subtasks of the first map for some 
> reason. The synchronization can be strong, as shown in the 
> erraticBufferSize1 picture, but usually it is less explicit, as in 
> erraticBufferSize2.
> !erraticBufferSize1.png!
> !erraticBufferSize2.png!  
>  
> Expected:
> After the stabilization period, the buffer size should be mostly constant with 
> small fluctuations, or the different tasks should be in antiphase to each 
> other (when one subtask has a small buffer size, another should have a big 
> buffer size), for example as in the picture antiphaseBufferSize:
> !antiphaseBufferSize.png!
>  
> Unfortunately, it is not reproduced every time, which means that this problem 
> may be connected to the environment. But at least it makes sense to try to 
> understand why we get such a strange load shape when only a few input 
> channels are active.
>  
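
The channel skew in the "given" section can be illustrated with a small
simulation. This is only a sketch: `target_subtask` uses an MD5-based stand-in
hash, not Flink's actual key-group assignment, so the concrete numbers will
differ from a real job. It shows only the structure of the load: each upstream
subtask emits two keys, so each downstream subtask receives data on a small,
fixed set of its 80 input channels.

```python
import hashlib
from collections import defaultdict

PARALLELISM = 80  # from the issue description


def target_subtask(key: int, parallelism: int = PARALLELISM) -> int:
    """Stand-in for key-to-subtask routing (NOT Flink's real hash)."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:4], "big") % parallelism


def active_channels(parallelism: int = PARALLELISM):
    """Map each downstream subtask to the upstream subtasks that send to it."""
    active = defaultdict(set)
    for upstream in range(parallelism):
        # Each upstream subtask produces exactly two keys.
        for key in (upstream, upstream + 1):
            active[target_subtask(key, parallelism)].add(upstream)
    return active


if __name__ == "__main__":
    channels = active_channels()
    counts = [len(channels.get(d, ())) for d in range(PARALLELISM)]
    print("idle downstream subtasks:", counts.count(0))
    print("max active channels on one subtask:", max(counts))
```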



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
