On Wed, Aug 21, 2019 at 3:11 AM Han Zhou <zhou...@gmail.com> wrote:
>
> On Tue, Aug 20, 2019 at 4:57 PM Ben Pfaff <b...@ovn.org> wrote:
> >
> > Let me see if I'm following this correctly. This is what currently
> > happens:
> >
> > - HV1 needs a MAC address for an IP so it broadcasts an ARP request.
> >
> > - The port with the IP address, on HV2, causes the MAC_Binding to be
> >   inserted.
> >
> > - Every ovn-controller inserts an OF flow for the binding. HV1 and
> >   perhaps other ovn-controllers use this flow to populate the MAC
> >   address for subsequent packets destined to the IP address in
> >   question.
> >
> > This proposal augments that with:
> >
> > - After a while, the binding goes idle and isn't used. The
> >   ovn-controllers gradually notice this and delete their OF flows for
> >   it.
> >
> > - HV3 eventually needs the binding again. It broadcasts an ARP
> >   request.
> >
> > - The port with the IP address causes the MAC_Binding to be inserted.
> >   This might still be on HV2 if the port hasn't moved, or it might be
> >   on HV4 if it has.
> >
> > Is that what you mean? It might work OK.

Yes, that's it. At some point we can look into enhancing this using the
SB DB and if all ovn-controllers decided to ignore a particular
MAC_Binding entry, then we can remove it from the DB from ovn-northd (or
some other mechanism).
> >
> > Please do update the lifetime description in ovn-sb(5) under the
> > MAC_Binding table regardless of what you implement.
> >
> > Thanks,
> >
> > Ben.
> >
> > On Tue, Aug 20, 2019 at 09:03:57AM +0200, Daniel Alvarez Sanchez wrote:
> > > Hi folks,
> > >
> > > Reviving this thread as we're seeing this become more and more
> > > problematic. Combining the ideas mentioned up thread, Dumitru, Numan,
> > > Lucas and I had some internal discussion where we came up with a
> > > possible approach and we'd love to get feedback from you:
> > >
> > > - Local ovn-controller will always insert an OF rule per MAC_Binding
> > >   entry to match on src_ip + src_mac that will be sampled with a meter
> > >   to ovn-controller.
> > > - When ovn-controller sees that one entry has not been hit "for a
> > >   while", it'll delete the OpenFlow rule in table 65 that fills the
> > >   eth.dst field with the MAC_Binding info.
>
> I assume the rules in table 65 can be "extended" for this purpose, instead
> of adding extra rules for this.
>
> > > - This will result in further ARP requests from the instance(s) that
> > >   will refresh the MAC_Binding entries in the database.
> > >
> > > This could make troubleshooting a bit harder so at some point it'll be
> > > great to have a mechanism in OVS where we could disable a flow instead
> > > of deleting it. This way, one can tell that the flows in table 65 have
> > > been disabled due to the aging mechanism in the local node.
>
> Sorry that I didn't understand this. Why do you want the flow to be
> disabled instead of deleted? I think if we want to avoid stale entries, we
> do want to delete them, so that the stale data doesn't occupy space in the
> flow table or in the SB DB. It may be ok to add a debug log for deleting an
> aged entry in ovn-controller, for troubleshooting purposes?

We can use traces as well, yes :)

> > >
> > > Thoughts? Is there any performance consideration regarding the extra
> > > flows and meters?
>
> Are you proposing shared meters or one meter per mac-binding? If it is per
> mac-binding, I would be worried about the scalability considering that we
> may have >10k of mac-bindings. Or should I be worried? Maybe Justin and Ben
> can comment on the meter scalability. If it is a concern, I would suggest
> the feature be configurable (i.e. enable/disable), so that it can be
> enabled in environments where aging is required but the number of
> mac-bindings is not very high.

I was talking about one meter per mac-binding but I'll defer the answer
to others, as I don't know much about meters. I'm not a big fan of
configuration options but unless we have a clear view on this, it makes
sense to me to have a knob for the 'aging'.

> > >
> > > Thanks a lot!
> > > Daniel
> > >
> > >
> > > On Tue, Jul 9, 2019 at 7:19 AM Ben Pfaff <b...@ovn.org> wrote:
> > > >
> > > > On Mon, Jul 08, 2019 at 06:19:23PM -0700, Han Zhou wrote:
> > > > > On Thu, Jun 27, 2019 at 6:44 AM Ben Pfaff <b...@ovn.org> wrote:
> > > > > >
> > > > > > On Tue, Jun 25, 2019 at 01:05:21PM +0200, Daniel Alvarez Sanchez wrote:
> > > > > > > Lately we've been trying to solve certain issues related to stale
> > > > > > > entries in the MAC_Binding table (e.g. [0]). On the other hand, for
> > > > > > > the OpenStack + Octavia (Load Balancing service) use case, we see
> > > > > > > that a reused VIP can also be affected by stale entries in this
> > > > > > > table due to the fact that it's never bound to a VIF so
> > > > > > > ovn-controller won't claim it and send the GARPs to update the
> > > > > > > neighbors.
> > > > > > >
> > > > > > > I'm not sure if other scenarios may suffer from this issue but it
> > > > > > > seems reasonable to have an aging mechanism (as we discussed at some
> > > > > > > point in the past) that makes unused/old entries expire.
> > > > > > > After talking to Numan on IRC, since a new pinctrl thread has been
> > > > > > > introduced recently [1], it'd be nice to implement this aging
> > > > > > > mechanism there.
> > > > > > > At the same time we'd be also reducing the amount of entries for
> > > > > > > long lived systems as it'd grow indefinitely.
> > > > > > >
> > > > > > > Any thoughts?
> > > > > > >
> > > > > > > Thanks!
> > > > > > > Daniel
> > > > > > >
> > > > > > > PS. With regards to the 'unused' vs 'old' entries I think it has to
> > > > > > > be 'old' rather than 'unused' as I don't see a way to reset the TTL
> > > > > > > of a MAC_Binding entry when we see packets coming. The implication
> > > > > > > is that we'll be seeing ARPs sent out more often when perhaps
> > > > > > > they're not needed. This also leads to the discussion of making the
> > > > > > > cache timeout configurable.
> > > > > >
> > > > > > I've always considered the MAC_Binding implementation incomplete
> > > > > > because of this issue and others. ovn/TODO.rst says:
> > > > > >
> > > > > > * Dynamic IP to MAC binding enhancements.
> > > > > >
> > > > > >   OVN has basic support for establishing IP to MAC bindings
> > > > > >   dynamically, using ARP.
> > > > > >
> > > > > >   * Ratelimiting.
> > > > > >
> > > > > >     From casual observation, Linux appears to generate at most one
> > > > > >     ARP per second per destination.
> > > > > >
> > > > > >     This might be supported by adding a new OVN logical action for
> > > > > >     rate-limiting.
> > > > > >
> > > > > >   * Tracking queries
> > > > > >
> > > > > >     It's probably best to only record in the database responses to
> > > > > >     queries actually issued by an L3 logical router, so somehow they
> > > > > >     have to be tracked, probably by putting a tentative binding
> > > > > >     without a MAC address into the database.
> > > > > >
> > > > > >   * Renewal and expiration.
> > > > > >
> > > > > >     Something needs to make sure that bindings remain valid and
> > > > > >     expire those that become stale.
> > > > > >
> > > > > >     One way to do this might be to add some support for time to the
> > > > > >     database server itself.
> > > > > >
> > > > > >   * Table size limiting.
> > > > > >
> > > > > >     The table of MAC bindings must not be allowed to grow
> > > > > >     unreasonably large.
> > > > > >
> > > > > > * MTU handling (fragmentation on output)
> > > > > >
> > > > > > So, what do we do about it? First, I think that adding support for
> > > > > > time to the database server is a terrible idea (even though I think I
> > > > > > wrote the above originally). Let's not do that. The following is some
> > > > > > "thinking out loud" on the subject.
> > > > > >
> > > > > > I think there's a challenge around which ovn-controller should take
> > > > > > care of a given MAC_Binding. We don't want every ovn-controller
> > > > > > expiring every binding. Ideally, we want exactly one ovn-controller
> > > > > > expiring a binding. One way would be to add an owner column (but it
> > > > > > would be better if we don't need it).
> > > > > >
> > > > > > If we want to keep track of "unused" bindings, I can imagine a
> > > > > > statistical mechanism to do that.
> > > > > > Any user of a binding occasionally and probabilistically changes a
> > > > > > serial number column that we'd introduce into the MAC_Binding table
> > > > > > (this could be optimized to not bother if it has changed recently).
> > > > > > The owner checks the serial number every so often and if it hasn't
> > > > > > changed then it deletes the row.
> > > > >
> > > > > Thanks Ben for the advice. Since the user of a binding is simply an
> > > > > OpenFlow rule match, I guess we will need a "controller" action to
> > > > > trigger the serial number column update in ovn-controller, combined
> > > > > with a meter action so that only a small number of packets trigger the
> > > > > update. Is this what you are suggesting?
> > > >
> > > > I had not thought that far ahead! That approach would work, although
> > > > the trigger percentage would be difficult to figure out--it seems like
> > > > really we'd want "every Nth second", not "every Nth packet". Another
> > > > approach that might work would be for ovn-controller to notice the
> > > > statistics on appropriate OpenFlow flows changing, or to use "learn"
> > > > actions as a way to make a controller action trigger only every so
> > > > often.
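
For reference, the serial-number scheme Ben describes in the quoted
discussion (users occasionally bump a serial number, skipping the bump
if it changed recently, and a single owner deletes rows whose serial has
not moved between checks) could be modelled roughly as below. This is
only a toy in-memory sketch in Python, not OVN or OVSDB code: the class
MacBindingTable, the serial and last_bump fields, and the
bump_probability and min_bump_interval parameters are all hypothetical
names chosen for illustration.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (logical_port, ip); hypothetical row key


@dataclass
class Binding:
    mac: str
    serial: int = 0         # hypothetical "serial number" column
    last_bump: float = 0.0  # when the serial last changed


class MacBindingTable:
    """Toy stand-in for the MAC_Binding table (illustration only)."""

    def __init__(self,
                 clock: Callable[[], float] = time.monotonic,
                 rng: Callable[[], float] = random.random) -> None:
        self.rows: Dict[Key, Binding] = {}
        self._clock = clock
        self._rng = rng
        # Serial value the owner saw for each row at its previous sweep.
        self._seen: Dict[Key, int] = {}

    def learn(self, key: Key, mac: str) -> None:
        """Insert or refresh a binding, e.g. after an ARP reply."""
        self.rows[key] = Binding(mac=mac, last_bump=self._clock())
        self._seen.pop(key, None)  # a refreshed row starts a new cycle

    def use(self, key: Key, bump_probability: float = 0.1,
            min_bump_interval: float = 30.0) -> Optional[str]:
        """Look up a MAC; occasionally and probabilistically bump the serial."""
        row = self.rows.get(key)
        if row is None:
            return None  # the caller would have to ARP again
        now = self._clock()
        # Do not bother if the serial changed recently.
        if (now - row.last_bump >= min_bump_interval
                and self._rng() < bump_probability):
            row.serial += 1
            row.last_bump = now
        return row.mac

    def owner_sweep(self) -> List[Key]:
        """Run by the single owner: delete rows whose serial did not change."""
        expired: List[Key] = []
        for key, row in list(self.rows.items()):
            if self._seen.get(key) == row.serial:
                del self.rows[key]
                del self._seen[key]
                expired.append(key)
            else:
                self._seen[key] = row.serial
        return expired
```

In this sketch a row survives its first sweep, because the owner has no
previous serial to compare against, and is deleted at the second sweep
unless some user bumped the serial in between. The time-based guard in
use() is one way to approximate "every Nth second" rather than "every
Nth packet"; how the bump would actually be triggered from the datapath
(meter, flow statistics, or "learn" actions) is left open, as in the
thread.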