Hi, Ben
     I submitted a patch just now and CC'd you.
     The change adjusts the algorithm so that flow-limit increases faster, 
in line with how quickly it decreases.  I also found that the problem I 
mentioned in earlier mails applies to the non-offload case as well.
     With the patch, the "Requests per second" result of Apache benchmark 
can reach 12k, up from 5k.  The test command is as follows:


     ab -n 200000 -c 500 -r http://1.1.1.2/
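
For reference, here is a rough sketch of the kind of duration-based adjustment discussed in the quoted thread below, with the increment made a parameter so that the effect of a larger step can be compared. This is not the submitted patch. Only the `flow_limit += 1000;` step, the initial value of 10000 and the proposed 20,000 step come from this thread; the 2000 ms and 1300 ms thresholds and the lower bound of 1000 are recalled from upstream udpif_revalidator and should be treated as assumptions. The names `adjust_flow_limit`, `rounds_to_recover`, `step` and `max_limit` are hypothetical.

```c
#include <assert.h>

/* Assumed lower bound for the limit; not stated in this thread. */
#define FLOW_LIMIT_MIN 1000u

/* One adjustment round.  'duration_ms' is how long the last revalidation
 * pass took, 'n_flows' is how many flows it covered, and 'step' is the
 * increment applied when there is headroom (hypothetical parameter). */
unsigned int
adjust_flow_limit(unsigned int flow_limit, unsigned int n_flows,
                  unsigned int duration_ms, unsigned int step,
                  unsigned int max_limit)
{
    assert(duration_ms > 0);
    assert(max_limit >= FLOW_LIMIT_MIN);

    if (duration_ms > 2000) {
        /* Much too slow: divide by the number of whole seconds taken. */
        flow_limit /= duration_ms / 1000;
    } else if (duration_ms > 1300) {
        /* Somewhat too slow: shrink by a quarter. */
        flow_limit = flow_limit * 3 / 4;
    } else if (duration_ms < 1000
               && flow_limit < (unsigned long long) n_flows * 1000
                               / duration_ms) {
        /* Fast enough and the measured rate suggests headroom: grow. */
        flow_limit += step;
    }

    /* Keep the result within [FLOW_LIMIT_MIN, max_limit]. */
    if (flow_limit < FLOW_LIMIT_MIN) {
        flow_limit = FLOW_LIMIT_MIN;
    }
    if (flow_limit > max_limit) {
        flow_limit = max_limit;
    }
    return flow_limit;
}

/* Counts how many consecutive fast rounds (100 ms each, 'target' flows)
 * are needed to grow the limit from 'from' to 'target'. */
unsigned int
rounds_to_recover(unsigned int from, unsigned int target, unsigned int step)
{
    unsigned int rounds = 0;

    assert(step > 0);
    assert(target >= FLOW_LIMIT_MIN);
    while (from < target) {
        from = adjust_flow_limit(from, target, 100, step, target);
        rounds++;
    }
    return rounds;
}
```

Under these assumptions, growing from 10000 to 200000 takes 190 fast rounds with a step of 1000 and 10 rounds with a step of 20000, while a single slow round can remove a quarter or more of the limit at once.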




Thanks, 

Yun

At 2021-05-07 00:50:06, "Ben Pfaff" <b...@ovn.org> wrote:
>On Thu, May 06, 2021 at 09:58:57AM +0800, taoyunupt wrote:
>> At 2021-05-06 03:26:46, "Ben Pfaff" <b...@ovn.org> wrote:
>> 
>> >On Fri, Apr 30, 2021 at 06:10:43PM +0800, taoyunupt wrote:
>> >> 
>> >> 
>> >> 
>> >> At 2021-04-29 06:39:11, "Ben Pfaff" <b...@ovn.org> wrote:
>> >> >On Wed, Apr 28, 2021 at 08:12:06PM +0800, taoyunupt wrote:
>> >> >> Hi,
>> >> >>      Recently I encountered a TCP connection performance problem; the 
>> >> >> test tool is Apache benchmark.
>> >> >>      OVS in my environment is set up for hardware offload.  The 
>> >> >> "Requests per second" is about 6000/s, which is close to the 
>> >> >> non-offload result.
>> >> >> 
>> >> >> 
>> >> >>      "flow-limit" is balanced dynamically in udpif_revalidator; it is 
>> >> >> modified according to the OVS condition (which is tied to 
>> >> >> "duration").  In the revalidate function, when the number of flows 
>> >> >> is greater than twice the "flow-limit", a delete operation is 
>> >> >> triggered to delete all flows; when the number of flows is greater 
>> >> >> than the "flow-limit", the aging time is adjusted to 0.1s so that 
>> >> >> flows are deleted slowly.
>> >> >> 
>> >> >> 
>> >> >>      
>> >> >>      I found that the reason for the poor performance is that when the 
>> >> >> number of flows in the datapath increases and the processing power of 
>> >> >> OVS decreases, a large number of flow deletions are generated. 
>> >> >>      As we know, in the hardware offloading scenario, although there 
>> >> >> are a lot of flows, apart from the first packet there is no need to 
>> >> >> process subsequent packets. 
>> >> >>      In my opinion, the dynamic balance mechanism is very necessary, 
>> >> >> but we need to increase the value of "duration", or provide some new 
>> >> >> switches for some high-performance scenarios, such as hardware 
>> >> >> offloading.
>> >> >>      Do we still need to restrict the number of flows so strictly? By 
>> >> >> the way, do you have another solution to resolve this?   
>> >> >
>> >> >It's been a long time since I worked on this, but I recall two reasons
>> >> >for the flow limit.  First, each flow takes up memory.  Second, each
>> >> >flow must be revalidated periodically, meaning that it uses CPU as
>> >> >well.
>> >> >
>> >> >I don't, off-hand, remember the real reasons why the logic for deleting
>> >> >flows works as it does.  It might be in the comments or the commit
>> >> >messages.  But, I suspect, it is because above the flow-limit we want to
>> >> >try to reduce the amount of memory and CPU time dedicated to the cache
>> >> >and, if we arrive at twice the flow limit, we conclude that that try
>> >> >failed and that we must have a large number of very short flows so that
>> >> >caching is not very valuable anyhow.
>> >> >
>> >> >In a hardware offload scenario, we get rid of some costs (the cost of
>> >> >processing and forwarding packets and perhaps the memory cost in the
>> >> >datapath) but we still have the cost of revalidating them.  When there
>> >> >are many flows, we add the extra cost of balancing flows between
>> >> >software and the offload hardware.
>> >> >
>> >> >Because of the remaining cost and the added ones when there is hardware
>> >> >offload, it's not obvious to me that we can stop limiting the number of
>> >> >flows.  I think that experimentation and measurements would be needed.
>> >> >Perhaps this would be an adjustment to the dynamic algorithm, rather
>> >> >than a removal of it.
>> >> 
>> >> 
>> >> I think we can increase the initial `flow_limit` in udpif_create; 10000 
>> >> is a small number for current servers and OSes, and if 'duration' is 
>> >> small, we should increase faster, by a larger number than 
>> >> `flow_limit += 1000;`.
>> >> I do not have a better idea for this situation. Do you have any 
>> >> suggestions? I would be very glad to make this change.
>> >
>> >What kind of number are you thinking about?  I'd like to come up with a
>> >rationale for choosing it.  It might be even better to come up with an
>> >algorithm or a heuristic for choosing it.
>> 
>> 
>> I think we could set the initial value to 200,000, and adjust the
>> increase to 20,000 each time.  Can you describe in detail the
>> rationale or algorithm you mentioned?
>
>I'd expect that whoever is changing it would propose the rationale.
>
>I believe that part of the current rationale is to keep the limit at a
>level such that revalidation takes no more than 1 second.  That's an
>important aspect too.
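
The one-second target above suggests a simple way to sanity-check a proposed limit against a measurement: extrapolate from one observed revalidation pass to the number of flows that would fit in the time budget. The sketch below assumes revalidation time scales linearly with the number of flows, which is an assumption and not something established in this thread. The function names and the sample figures in the note after the code are hypothetical.

```c
#include <assert.h>

/* Estimates how many flows could be revalidated within 'budget_ms',
 * given that 'n_flows' flows took 'duration_ms' in one measured pass.
 * Assumes linear scaling of revalidation time with flow count. */
unsigned long long
flows_within_budget(unsigned int n_flows, unsigned int duration_ms,
                    unsigned int budget_ms)
{
    assert(duration_ms > 0);
    return (unsigned long long) n_flows * budget_ms / duration_ms;
}

/* Returns 1 if 'proposed_limit' flows would fit in the budget according
 * to the estimate above, 0 otherwise. */
int
limit_fits_budget(unsigned int proposed_limit, unsigned int n_flows,
                  unsigned int duration_ms, unsigned int budget_ms)
{
    return proposed_limit
           <= flows_within_budget(n_flows, duration_ms, budget_ms);
}
```

As an illustration with made-up measurements: if 50,000 flows were revalidated in 250 ms, the estimate for a 1000 ms budget is 200,000 flows, so a limit of 200,000 would just fit; if the same pass took 500 ms, the estimate drops to 100,000 and the same limit would not.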







_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss