[ 
https://issues.apache.org/jira/browse/FLINK-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16718422#comment-16718422
 ] 

zhijiang commented on FLINK-11037:
----------------------------------

[~StephanEwen], thanks for laying out the proposals above point by point. :)
 * The network stack is very sensitive to performance. As you mentioned, we 
should consider overall fairness in order to avoid backpressure to some extent.

 * I think your proposal is a proper way to move this feature forward, just as 
we currently provide a flag to keep both the credit and non-credit modes.

 * The adaptive algorithm sounds ideal but may behave unexpectedly in practice, 
so I agree with your point of starting with a simple, deterministic approach 
and then proceeding step by step based on further experiments. Actually, I 
implemented the extreme greedy algorithm as the first version of the 
credit-based mode at Alibaba; it withstood the challenges of Double 11 and ran 
in production for half a year. While contributing the whole feature to the 
community, I switched to the fair algorithm and refactored our private branch 
to stay consistent with the community version. In fact, we have not compared 
the performance of these two algorithms in different scenarios, so it is hard 
to say which one is better. But I can think of one scenario in which the greedy 
algorithm should be better in theory. If there are enough floating buffers for 
all the input channels, then with either the fair or the greedy algorithm every 
channel can always get all of its required floating buffers from the pool. In 
this case the greedy approach should perform better, because a channel needs 
only one request to the pool to fetch all of its floating buffers instead of 
looping many times to fetch one buffer each time, and every request enters the 
synchronized lock on the pool side, which is contended among different channels.
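
To make the scenario above concrete, here is a minimal sketch that counts how 
often the pool lock is entered under each style. FloatingPoolSketch, 
requestFair and requestGreedy are hypothetical names for illustration only; 
this is not the LocalBufferPool API, and it models buffers as a plain counter.

```java
/** Hypothetical stand-in for the pool; NOT Flink's LocalBufferPool API. */
final class FloatingPoolSketch {
    private int available;        // floating buffers still in the pool
    private int lockAcquisitions; // times request() entered the pool monitor

    FloatingPoolSketch(int floatingBuffers) {
        this.available = floatingBuffers;
    }

    /** Grants up to 'wanted' buffers under a single lock acquisition. */
    synchronized int request(int wanted) {
        lockAcquisitions++;
        int granted = Math.min(wanted, available);
        available -= granted;
        return granted;
    }

    synchronized int lockAcquisitions() {
        return lockAcquisitions;
    }

    /** Fair style: one buffer per request, loop until satisfied or exhausted. */
    static int requestFair(FloatingPoolSketch pool, int required) {
        int got = 0;
        while (got < required && pool.request(1) == 1) {
            got++;
        }
        return got;
    }

    /** Greedy style: ask for everything still required in one request. */
    static int requestGreedy(FloatingPoolSketch pool, int required) {
        return pool.request(required);
    }
}
```

With 100 floating buffers and 4 channels that each need 8, the fair loop enters 
the lock 32 times while the greedy request enters it 4 times, and both hand out 
the same 32 buffers. The sketch says nothing about fairness when the pool is 
too small, which is the case that still needs experiments.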

> Introduce another greedy mechanism for distributing floating buffers
> --------------------------------------------------------------------
>
>                 Key: FLINK-11037
>                 URL: https://issues.apache.org/jira/browse/FLINK-11037
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Network
>    Affects Versions: 1.8.0
>            Reporter: zhijiang
>            Assignee: zhijiang
>            Priority: Minor
>
> The current mechanism for distributing floating buffers is fair to all the 
> listeners. In detail, each input channel can request only one floating buffer 
> at a time, even though the channel may actually need more. The channel then 
> has to loop, requesting floating buffers until all its needs are satisfied 
> or the pool is exhausted.
> Generally speaking, this approach seems fair to all the concurrent channels 
> invoked by the netty nio thread. But every request to LocalBufferPool needs to 
> acquire a synchronized lock, and it is hard to say which way of distributing 
> the available floating buffers behaves better in real scenarios.
> Therefore we propose another, greedy mechanism that requests more floating 
> buffers each time. In the extreme case, we can even request all the required 
> buffers at once, or only part of them, via configured parameters. On the other 
> hand, LocalBufferPool can also decide how many floating buffers should be 
> assigned based on some factors, such as the total number of channels and the 
> total number of floating buffers.
> The motivation is to make better use of floating buffer resources, and it may 
> need extra metrics for adjusting the mechanism dynamically.
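
One possible reading of the pool-side decision in the description, as a hedged 
sketch: cap each request at the channel's even share of the floating buffers. 
BatchSizeSketch and buffersPerRequest are hypothetical names, and the 
even-share rule is only an assumption for illustration, not something the issue 
specifies.

```java
/** Hypothetical policy sketch; not part of Flink. */
final class BatchSizeSketch {
    /**
     * Returns how many floating buffers to hand out for one request:
     * the channel's even share of the pool (at least 1), but never more
     * than the channel still requires. Returns 0 for non-positive inputs.
     */
    static int buffersPerRequest(int required, int totalFloatingBuffers, int totalChannels) {
        if (required <= 0 || totalFloatingBuffers <= 0 || totalChannels <= 0) {
            return 0;
        }
        int evenShare = Math.max(1, totalFloatingBuffers / totalChannels);
        return Math.min(required, evenShare);
    }
}
```

For example, with 8 floating buffers and 4 channels, a channel that requires 10 
buffers would get 2 per request. When there are fewer floating buffers than 
channels, the share falls back to 1, which matches the current fair behavior.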



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
