[ 
https://issues.apache.org/jira/browse/FLINK-24189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz updated FLINK-24189:
-------------------------------------
    Description: 
Right now, the buffer debloat assumes that it works with {{SingleInputGates}}, 
each of which has a similar load. The goal is to improve the {{UnionInputGate}} 
to behave well in case of data skew.
A possible implementation can be to calculate the throughput separately for 
each gate. Another option is to calculate the total throughput but choose the 
buffer size independently for each gate based on their buffers in use. 

The base idea is to keep the same number of buffers in use for each gate. It 
means if a gate has a small number of buffers in use we decrease the buffer 
size to force this gate to use more buffers, and we move the withdrawn size to 
another gate with a greater number of buffers in use to increase its buffer 
size. In the corner case when all gates(or some gates) use all their buffers, 
the buffer size should be equal in each gate.

It is highly important to fairly share the throughput among all gates. In other 
words, we need to avoid the situation:
* gate1 has a low load while gate2 has a high load
* the small buffer size was set for gate1 and the big buffer size for gate2
* the load for gate1 increased up to the load of gate2
* it is impossible to increase the buffer size for gate1 because there is no 
reason to decrease the buffer size for gate2 since the load for it doesn't 
change

  was:
Right now, the buffer debloat assumes that it works with SingleInputGates each 
of which has a similar load. It needs to support UnionInputGate in case of data 
skew.
The possible implementation can be the calculation of the throughput separately 
for each gate or calculation of the total throughput but choosing the buffer 
size independently for each gate based on their buffers in use. 
So it looks like the base idea is to keep the same number of buffers in use for 
each gate which means if a gate has a small number of buffers in use we 
decrease the buffer size to force this gate to use more buffers, and we move 
the withdrawn size to another gate with a big number of buffer in use to 
increase its buffer size. In the corner case when all gates(or some gates) use 
all their buffers, the buffer size should be equal in each gate.

It is highly important to fairly share the throughput among all gates. In other 
words, avoid the situation:
* gate1 has a low load while gate2 has a high load
* the small buffer size was set for gate1 and the big buffer size for gate2
* the load for gate1 increased up to the load of gate2
* it is impossible to increase the buffer size for gate1 because it is no 
reason to decrease the buffer size for gate2 since the load for it doesn't 
change


> Buffer debloating for multiply gates
> ------------------------------------
>
>                 Key: FLINK-24189
>                 URL: https://issues.apache.org/jira/browse/FLINK-24189
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Network
>    Affects Versions: 1.14.0
>            Reporter: Anton Kalashnikov
>            Priority: Major
>             Fix For: 1.15.0
>
>
> Right now, the buffer debloat assumes that it works with 
> {{SingleInputGates}}, each of which has a similar load. The goal is to 
> improve the {{UnionInputGate}} to behave well in case of data skew.
> A possible implementation can be to calculate the throughput separately for 
> each gate. Another option is to calculate the total throughput but choose the 
> buffer size independently for each gate based on their buffers in use. 
> The base idea is to keep the same number of buffers in use for each gate. It 
> means if a gate has a small number of buffers in use we decrease the buffer 
> size to force this gate to use more buffers, and we move the withdrawn size 
> to another gate with a greater number of buffers in use to increase its 
> buffer size. In the corner case when all gates(or some gates) use all their 
> buffers, the buffer size should be equal in each gate.
> It is highly important to fairly share the throughput among all gates. In 
> other words, we need to avoid the situation:
> * gate1 has a low load while gate2 has a high load
> * the small buffer size was set for gate1 and the big buffer size for gate2
> * the load for gate1 increased up to the load of gate2
> * it is impossible to increase the buffer size for gate1 because there is no 
> reason to decrease the buffer size for gate2 since the load for it doesn't 
> change



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to