Re: [ovs-discuss] Error ovs recursion limit reached on datapath
I have out-of-band management configured with:

    ovs-vsctl set controller connection-mode=out-of-band
    ovs-vsctl set bridge other-config:disable-in-band=true

The out-of-band NIC is not added to an OVS bridge. When I connect it to my OpenDaylight controller (v0.8.4), the recursion errors show up. With OVS 2.5.5, it works perfectly.

    f0fc01b4-904d-407b-a3bb-03fd3e004926
        Manager "tcp:10.246.49.188:6640"
            is_connected: true
        Bridge data
            Controller "tcp:10.246.49.188:6633"
                is_connected: true
            fail_mode: secure
            Port "vlan1"
                tag: 1
                Interface "vlan1"
                    type: internal
            Port "ens256"
                tag: 10
                Interface "ens256"
            Port "ens224"
                tag: 1
                Interface "ens224"
            Port "vxlan3"
                Interface "vxlan3"
                    type: vxlan
                    options: {key=flow, remote_ip="10.246.48.149"}
            Port data
                Interface data
                    type: inte

Dennis Heim | Domain Architect (Collaboration Labs)
World Wide Technology, Inc.

-----Original Message-----
From: Ben Pfaff
Sent: Thursday, June 27, 2019 10:01 AM
To: Heim, Dennis
Cc: ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] Error ovs recursion limit reached on datapath

On Thu, Jun 27, 2019 at 04:48:45AM +, Heim, Dennis wrote:
> Any idea what causes the error message "ovs recursion limit reached on
> datapath"? I have the configuration working on 2.5.5, but if I run 2.9
> or 2.11, I get that error message.

It could be a bug in OVS or it could be an OpenFlow flow table that does something odd, for example, recursively executing connection tracking or tunneling from a system to itself.

You might be able to track it down by looking at the kernel flows with "ovs-dpctl dump-flows" or by tracing microflows or packets with "ovs-appctl ofproto/trace".

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
Re: [ovs-discuss] [HELP]Question about fdb entry size
This is a pretty extreme situation. OVS isn't optimized for it. You might need to adjust the code in ofproto-dpif-upcall.c by hand to tune for it. If you come up with some changes that improve performance and are unlikely to substantially hurt performance in more common situations, then we'd be grateful to have the patches.

On Thu, Jun 27, 2019 at 10:38:20AM +0800, txfh2...@aliyun.com wrote:
> Dear Ben:
>
> Sorry for my mistake; yes, the maximum number of fdb entries is 1000k.
>
> In my test I have found that when the number of test packet flows goes
> beyond 200k, throughput declines, as the kernel flow limit is 200k. The
> revalidator threads delete kernel flow entries to keep the flow count
> below 200k, am I right?
>
> But even after I set the flow-limit to 500k, the number of kernel flows
> still declines to around 200k after a few minutes. I don't know the
> reason. I have read the "revalidatorwhat" slides (2014 OVS conference)
> but still cannot find a clue.
>
> Thanks for your reply.
>
> TIMO
>
> ---Original---
> From: "Ben Pfaff"
> Date: Wed, Jun 26, 2019 23:11 PM
> To: "txfh2007";
> Cc: "ovs-discuss";
> Subject: Re: [ovs-discuss] [HELP]Question about fdb entry size
>
> On Wed, Jun 26, 2019 at 09:18:12PM +0800, txfh2007 via discuss wrote:
> > I have a question about the OVS fdb entry size and aging time. I have
> > found that the maximum fdb entry size is hard-coded in mac_learning.c:
> > max_entries is 100k and the longest aging time is 3600s.
> >
> > But in my test environment, packet forwarding is based on the OVS
> > normal action, and my test center can generate about 200k flows
> > simultaneously. So performance is affected by the maximum entry size
> > (fdb entries are presumably evicted by new packets). Can we enlarge
> > the max_entries limit, and what is the side effect?
>
> It looks to me like the maximum is 1 million:
>
>     /* Sets the maximum number of entries in 'ml' to 'max_entries', adjusting it
>      * to be within a reasonable range. */
>     void
>     mac_learning_set_max_entries(struct mac_learning *ml, size_t max_entries)
>     {
>         ml->max_entries = (max_entries < 10 ? 10
>                            : max_entries > 1000 * 1000 ? 1000 * 1000
>                            : max_entries);
>     }
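To make the clamping in the quoted snippet concrete, here is a minimal standalone sketch. The one-field struct and the helper clamped_max_entries() are hypothetical stand-ins added for illustration; only the bounds (10 and 1000 * 1000) and the setter's logic come from the code quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real struct: only the field used here. */
struct mac_learning {
    size_t max_entries;
};

#define MAC_MIN_ENTRIES 10              /* lower bound in the quoted code */
#define MAC_MAX_ENTRIES (1000 * 1000)   /* upper bound in the quoted code */

/* Same clamping logic as the quoted function, with named bounds. */
static void
mac_learning_set_max_entries(struct mac_learning *ml, size_t max_entries)
{
    assert(ml != NULL);
    ml->max_entries = (max_entries < MAC_MIN_ENTRIES ? MAC_MIN_ENTRIES
                       : max_entries > MAC_MAX_ENTRIES ? MAC_MAX_ENTRIES
                       : max_entries);
}

/* Hypothetical helper: returns the value that would actually be stored
 * for a given request. */
static size_t
clamped_max_entries(size_t requested)
{
    struct mac_learning ml = { 0 };

    mac_learning_set_max_entries(&ml, requested);
    return ml.max_entries;
}
```

Under these bounds, a request for 200000 entries is stored unchanged, a request for 5 is raised to 10, and a request for 2000000 is lowered to 1000000.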
Re: [ovs-discuss] Error ovs recursion limit reached on datapath
On Thu, Jun 27, 2019 at 04:48:45AM +, Heim, Dennis wrote:
> Any idea what causes the error message "ovs recursion limit reached on
> datapath"? I have the configuration working on 2.5.5, but if I run 2.9
> or 2.11, I get that error message.

It could be a bug in OVS or it could be an OpenFlow flow table that does something odd, for example, recursively executing connection tracking or tunneling from a system to itself.

You might be able to track it down by looking at the kernel flows with "ovs-dpctl dump-flows" or by tracing microflows or packets with "ovs-appctl ofproto/trace".
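As a rough illustration of why a flow that sends a packet back into the same pipeline trips a depth check, here is a toy model. It is not OVS code: the function toy_execute(), the limit value, and the "hops" parameter are all assumptions made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_RECURSION_LIMIT 4   /* hypothetical value, not taken from OVS */

/* Toy packet-processing step.  'hops_remaining' is how many more times the
 * flow table sends the packet back into the pipeline; a negative value
 * models a flow that loops back to itself forever.  Returns true if
 * processing finished, false if the depth limit was reached. */
static bool
toy_execute(int depth, int hops_remaining)
{
    assert(depth >= 0);
    if (depth >= TOY_RECURSION_LIMIT) {
        return false;           /* where a real datapath would log and drop */
    }
    if (hops_remaining == 0) {
        return true;            /* packet left the pipeline normally */
    }
    return toy_execute(depth + 1, hops_remaining - 1);
}
```

In this model toy_execute(0, 2) finishes, while toy_execute(0, -1), the self-referencing case, always ends at the limit. A flow table that legitimately needs more nesting than the limit allows fails the same way, which is one reason tracing the flow is more informative than the error message alone.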
[ovs-discuss] Re: [HELP]Question about fdb entry size
Dear Ben,

Sorry for my mistake; yes, the maximum number of fdb entries is 1000k.

I have found that when the number of test packet flows goes beyond 200k, throughput declines. I guess the reason is that the kernel flow limit is 200k: the revalidator threads delete kernel flow entries to keep the flow count below 200k. Am I right?

But even after I set the flow-limit to 500k, the number of kernel flows still declines to around 200k after a few minutes. I don't know the reason. I have read the "revalidatorwhat" slides (2014 OVS conference) but still cannot find a clue.

Thanks for your reply.

TIMO

On Wed, Jun 26, 2019 at 09:18:12PM +0800, txfh2007 via discuss wrote:
> I have a question about the OVS fdb entry size and aging time. I have
> found that the maximum fdb entry size is hard-coded in mac_learning.c:
> max_entries is 100k and the longest aging time is 3600s.
>
> But in my test environment, packet forwarding is based on the OVS
> normal action, and my test center can generate about 200k flows
> simultaneously. So performance is affected by the maximum entry size
> (fdb entries are presumably evicted by new packets). Can we enlarge
> the max_entries limit, and what is the side effect?

It looks to me like the maximum is 1 million:

    /* Sets the maximum number of entries in 'ml' to 'max_entries', adjusting it
     * to be within a reasonable range. */
    void
    mac_learning_set_max_entries(struct mac_learning *ml, size_t max_entries)
    {
        ml->max_entries = (max_entries < 10 ? 10
                           : max_entries > 1000 * 1000 ? 1000 * 1000
                           : max_entries);
    }
Re: [ovs-discuss] [OVN] Aging mechanism for MAC_Binding table
On Tue, Jun 25, 2019 at 01:05:21PM +0200, Daniel Alvarez Sanchez wrote:
> Lately we've been trying to solve certain issues related to stale
> entries in the MAC_Binding table (e.g. [0]). On the other hand, for
> the OpenStack + Octavia (Load Balancing service) use case, we see that
> a reused VIP can be as well affected by stale entries in this table
> due to the fact that it's never bound to a VIF so ovn-controller won't
> claim it and send the GARPs to update the neighbors.
>
> I'm not sure if other scenarios may suffer from this issue but seems
> reasonable to have an aging mechanism (as we discussed at some point
> in the past) that makes unused/old entries to expire. After talking to
> Numan on IRC, since a new pinctrl thread has been introduced recently
> [1], it'd be nice to implement this aging mechanism there.
> At the same time we'd be also reducing the amount of entries for long
> lived systems as it'd grow indefinitely.
>
> Any thoughts?
>
> Thanks!
> Daniel
>
> PS. With regards to the 'unused' vs 'old' entries I think it has to be
> 'old' rather than 'unused' as I don't see a way to reset the TTL of a
> MAC_Binding entry when we see packets coming. The implication is that
> we'll be seeing ARPs sent out more often when perhaps they're not
> needed. This also leads to the discussion of making the cache timeout
> configurable.

I've always considered the MAC_Binding implementation incomplete because of this issue and others. ovn/TODO.rst says:

    * Dynamic IP to MAC binding enhancements.

      OVN has basic support for establishing IP to MAC bindings
      dynamically, using ARP.

      * Ratelimiting.

        From casual observation, Linux appears to generate at most one
        ARP per second per destination.

        This might be supported by adding a new OVN logical action for
        rate-limiting.

      * Tracking queries

        It's probably best to only record in the database responses to
        queries actually issued by an L3 logical router, so somehow they
        have to be tracked, probably by putting a tentative binding
        without a MAC address into the database.

      * Renewal and expiration.

        Something needs to make sure that bindings remain valid and
        expire those that become stale.

        One way to do this might be to add some support for time to the
        database server itself.

      * Table size limiting.

        The table of MAC bindings must not be allowed to grow
        unreasonably large.

    * MTU handling (fragmentation on output)

So, what do we do about it?

First, I think that adding support for time to the database server is a terrible idea (even though I think I wrote the above originally). Let's not do that.

The following is some "thinking out loud" on the subject.

I think there's a challenge around which ovn-controller should take care of a given MAC_Binding. We don't want every ovn-controller expiring every binding. Ideally, we want exactly one ovn-controller expiring a binding. One way would be to add an owner column (but it would be better if we don't need it).

If we want to keep track of "unused" bindings, I can imagine a statistical mechanism to do that. Any user of a binding occasionally and probabilistically changes a serial number column that we'd introduce into the MAC_Binding table (this could be optimized to not bother if it has changed recently). The owner checks the serial number every so often and if it hasn't changed then it deletes the row. The owner could also occasionally revalidate the binding.

Any thoughts?

Thanks,

Ben.
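The statistical mechanism described above can be sketched in a few lines of C. This is only a toy in-memory model under stated assumptions: the struct, its field names, and both function names are hypothetical, the random draw is passed in by the caller so the behavior is reproducible, and nothing here touches the real OVN southbound database.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory model of one MAC_Binding row. */
struct binding {
    bool present;          /* false once the owner has deleted the row */
    unsigned int serial;   /* the proposed serial number column */
    unsigned int seen;     /* serial value at the owner's previous check */
};

/* Called by any user of the binding.  'draw' is a uniform random value in
 * [0, 1) supplied by the caller; the serial number is changed with
 * probability 'p'.  Returns true if the serial number was changed. */
static bool
binding_note_use(struct binding *b, double draw, double p)
{
    assert(b != NULL);
    if (b->present && draw < p) {
        b->serial++;
        return true;
    }
    return false;
}

/* Called by the owner every so often.  Deletes the row if the serial
 * number has not changed since the previous check, otherwise remembers
 * the new value.  Returns true if the row was deleted. */
static bool
binding_owner_check(struct binding *b)
{
    assert(b != NULL);
    if (!b->present) {
        return false;
    }
    if (b->serial == b->seen) {
        b->present = false;
        return true;
    }
    b->seen = b->serial;
    return false;
}
```

The trade-off in this sketch is between write load and false expiry: a small 'p' keeps database updates rare, but a binding that is in use can still be deleted if no user happened to change the serial number between two owner checks. That is presumably where occasional revalidation by the owner, mentioned above, would come in.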