[ 
https://issues.apache.org/jira/browse/FLINK-23453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385360#comment-17385360
 ] 

Piotr Nowojski commented on FLINK-23453:
----------------------------------------

After some offline discussion with [~akalashnikov], we concluded that by
{quote}
the actual number of buffers in use
{quote}
we mean:
# the buffers currently enqueued in {{RemoteInputChannel#receivedBuffers}} 
plus the backlog announced by the sender side
# we don't care about the fill ratio of those buffers

Point 1 is because we want to account for the case when only a couple of 
input channels are in use, so it seems better to take the number of buffers 
already in use instead of the size of a {{LocalBufferPool}}.
Point 2 is because we are calculating the maximum amount of buffered data 
based on the configuration ({{timeInInputBuffer}}; nit: should it be 
{{timeInInflightBuffer}}, not just input?) and the estimated throughput. For 
this calculation we don't care whether the currently used buffers are 100%, 
90% or 60% full. We just want to cap the maximum amount of buffered data.
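
To make the two points concrete, here is a rough sketch of how I imagine the calculation. It is only an illustration under assumptions, not the actual implementation: {{ChannelSnapshot}}, the method names, the min/max clamp and the EMA {{alpha}} are all hypothetical, and dividing the capped amount of buffered data evenly by the number of buffers in use is my reading of the ticket description.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names exist in Flink.
class BufferSizeSketch {

    /** Assumed per-channel snapshot, standing in for a RemoteInputChannel. */
    static final class ChannelSnapshot {
        final int enqueuedBuffers;   // buffers currently queued in receivedBuffers
        final int announcedBacklog;  // backlog announced by the sender side

        ChannelSnapshot(int enqueuedBuffers, int announcedBacklog) {
            this.enqueuedBuffers = enqueuedBuffers;
            this.announcedBacklog = announcedBacklog;
        }
    }

    /** Point 1: count enqueued buffers plus backlog; fill ratio is ignored. */
    static int buffersInUse(List<ChannelSnapshot> channels) {
        int total = 0;
        for (ChannelSnapshot channel : channels) {
            total += channel.enqueuedBuffers + channel.announcedBacklog;
        }
        return total;
    }

    /**
     * Point 2: cap the buffered data at throughput * timeInInputBuffer and
     * split it over the buffers in use. The clamp to [minSize, maxSize] is
     * an assumption added for this sketch.
     */
    static long desiredBufferSize(
            long throughputBytesPerSec,
            long timeInInputBufferMs,
            int buffersInUse,
            long minSize,
            long maxSize) {
        long maxBufferedBytes = throughputBytesPerSec * timeInInputBufferMs / 1000;
        long perBuffer = maxBufferedBytes / Math.max(1, buffersInUse);
        return Math.max(minSize, Math.min(maxSize, perBuffer));
    }

    /** Exponential moving average to smooth out intermittent spikes. */
    static double ema(double previous, double sample, double alpha) {
        return alpha * sample + (1 - alpha) * previous;
    }

    public static void main(String[] args) {
        // Data skew: only two of the three channels have any data.
        List<ChannelSnapshot> channels = Arrays.asList(
                new ChannelSnapshot(3, 2),
                new ChannelSnapshot(0, 0),
                new ChannelSnapshot(4, 1));
        int inUse = buffersInUse(channels);
        long size = desiredBufferSize(1_000_000L, 100L, inUse, 1024L, 32 * 1024L);
        System.out.println("buffers in use: " + inUse);
        System.out.println("desired buffer size: " + size);
        System.out.println("smoothed size: " + ema(20_000.0, size, 0.5));
    }
}
```

With the skewed input above, the idle channel contributes nothing to the count, so the capped amount of data is divided only among the buffers that are actually in use rather than the whole pool.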





> Dynamic calculation of the buffer size
> --------------------------------------
>
>                 Key: FLINK-23453
>                 URL: https://issues.apache.org/jira/browse/FLINK-23453
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Anton Kalashnikov
>            Priority: Major
>
> To calculate the desired buffer size we need to take into account the 
> throughput, configuration (timeInInputBuffer), and the actual number of 
> buffers in use. It makes sense to use EMA for this calculation to smooth 
> out intermittent spikes.
> The calculation based on the actual number of buffers in use helps to avoid 
> problems with the data skew (when only a couple of channels out of thousands 
> have any data). So the solution needs to reliably and efficiently calculate 
> either the estimated or an average number of buffers in use. 
> Buffer size can be erratic if it’s not trivial to make it stable in the MVP.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
