Anton Kalashnikov created FLINK-24578:
-----------------------------------------

             Summary: Unexpected erratic load shape for channel skew load 
profile
                 Key: FLINK-24578
                 URL: https://issues.apache.org/jira/browse/FLINK-24578
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Checkpointing
    Affects Versions: 1.14.0
            Reporter: Anton Kalashnikov
         Attachments: antiphaseBufferSize.png, erraticBufferSize1.png, 
erraticBufferSize2.png

given:

A job with 5 maps (with keyBy).

All channels are remote. Parallelism is 80.

The first task produces only two keys: `indexOfThisSubtask` and 
`indexOfThisSubtask + 1`. So every subtask has a constant number of active 
channels (how many depends on how the keys are hashed).

Every record has the same size and takes the same time to process.
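
To illustrate why only a few input channels are active per subtask, here is a minimal sketch of the key distribution described above. It is an illustration under stated assumptions, not the job used for the report: the class `ChannelSkewSketch` and the method `targetSubtask` are hypothetical, and the hash is a simple stand-in rather than Flink's real key-group assignment.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ChannelSkewSketch {

    // Hypothetical stand-in for the key-to-subtask assignment done by keyBy.
    // This is NOT Flink's key-group hashing, only a deterministic scramble.
    static int targetSubtask(int key, int parallelism) {
        int h = key * 0x9E3779B9; // multiplicative scramble (assumption)
        h ^= h >>> 16;
        return Math.floorMod(h, parallelism);
    }

    // For each downstream subtask, count the upstream subtasks that send it records.
    static int[] activeChannels(int parallelism) {
        List<Set<Integer>> senders = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            senders.add(new HashSet<>());
        }
        for (int upstream = 0; upstream < parallelism; upstream++) {
            // Each upstream subtask emits only two keys: its index and index + 1.
            senders.get(targetSubtask(upstream, parallelism)).add(upstream);
            senders.get(targetSubtask(upstream + 1, parallelism)).add(upstream);
        }
        int[] counts = new int[parallelism];
        for (int i = 0; i < parallelism; i++) {
            counts[i] = senders.get(i).size();
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = activeChannels(80);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("subtask " + i + ": " + counts[i] + " active channel(s)");
        }
    }
}
```

Each upstream subtask reaches at most two downstream subtasks, so out of 80 input channels per subtask only a handful carry data, and that set does not change over time.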

 

when: 

Buffer debloating is enabled with the default configuration.

 

then:

The buffer size synchronizes across all subtasks of the first map for some 
reason. The synchronization can be strong, as shown in the erraticBufferSize1 
picture, but usually it is less explicit, as in erraticBufferSize2.

!erraticBufferSize1.png!

 

Expected:

After the stabilization period, the buffer size should be mostly constant with 
small fluctuations, or the different subtasks should be in antiphase to each 
other (when one subtask has a small buffer size, another should have a large 
one), for example as in the antiphaseBufferSize picture.

!antiphaseBufferSize.png!

 

Unfortunately, it does not reproduce every time, which means the problem 
may be related to the environment. But at least it makes sense to try to 
understand why the load shape is so strange when only several input channels 
are active.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
