On Thu, Nov 26, 2020 at 4:11 PM Daniel Alvarez Sanchez <dalva...@redhat.com> wrote:
>
> On Wed, Nov 25, 2020 at 7:59 PM Dumitru Ceara <dce...@redhat.com> wrote:
>
> > On 11/25/20 7:06 PM, Numan Siddique wrote:
> > > On Wed, Nov 25, 2020 at 10:24 PM Renat Nurgaliyev <imple...@gmail.com> wrote:
> > >>
> > >> On 25.11.20 16:14, Dumitru Ceara wrote:
> > >>> On 11/25/20 3:30 PM, Renat Nurgaliyev wrote:
> > >>>> Hello folks,
> > >>>>
> > >>> Hi Renat,
> > >>>
> > >>>> we run a lab where we try to evaluate the scalability potential of
> > >>>> OVN with OpenStack as the CMS.
> > >>>> The current lab setup is as follows:
> > >>>>
> > >>>> 500 networks
> > >>>> 500 routers
> > >>>> 1500 VM ports (3 per network/router)
> > >>>> 1500 Floating IPs (one per VM port)
> > >>>>
> > >>>> There is an external network, which is bridged to br-provider on the
> > >>>> gateway nodes. There are 2000 ports connected to this external
> > >>>> network (1500 Floating IPs + 500 SNAT router ports). So the setup is
> > >>>> not very big, we'd say, but after applying this configuration via
> > >>>> the ML2/OVN plugin, northd kicks in and does its job, and after it's
> > >>>> done, the Logical_Flow table gets 645877 entries, which is way too
> > >>>> much. But OK, we move on and start one controller on the gateway
> > >>>> chassis, and here things get really messy. The MAC_Binding table
> > >>>> grows from 0 to 999088 entries in one moment, and after it's done,
> > >>>> the sizes of the biggest SB tables look like this:
> > >>>>
> > >>>> 999088 MAC_Binding
> > >>>> 645877 Logical_Flow
> > >>>>   4726 Port_Binding
> > >>>>   1117 Multicast_Group
> > >>>>   1068 Datapath_Binding
> > >>>>   1046 Port_Group
> > >>>>    551 IP_Multicast
> > >>>>    519 DNS
> > >>>>    517 HA_Chassis_Group
> > >>>>    517 HA_Chassis
> > >>>> ...
> > >>>>
> > >>>> The MAC_Binding table gets huge: it now has one entry for every port
> > >>>> connected to the external network multiplied by the number of
> > >>>> datapaths, which makes roughly one million entries. This table by
> > >>>> itself increases the size of the SB by 200 megabytes. The
> > >>>> Logical_Flow table also gets very heavy. We have already played a
> > >>>> bit with the logical datapath patches that Ilya Maximets submitted,
> > >>>> and it looks much better, but the size of the MAC_Binding table
> > >>>> still feels inadequate.
> > >>>>
> > >>>> We would like to start working at least on MAC_Binding table
> > >>>> optimisation, but it is a bit difficult to start from scratch. Can
> > >>>> someone help us with ideas on how this could be optimised?
> > >>>>
> > >>>> Maybe it would also make sense to group entries in the MAC_Binding
> > >>>> table in the same way as is proposed for logical flows in Ilya's
> > >>>> patch?
> > >>>>
> > >>> Maybe it would work, but I'm not really sure how, right now. However,
> > >>> what if we change the way MAC_Bindings are created?
> > >>>
> > >>> Right now a MAC_Binding is created for each logical router port, but
> > >>> in your case there are a lot of logical router ports connected to the
> > >>> single provider logical switch, and they all learn the same ARPs.
> > >>>
> > >>> What if we instead store MAC_Bindings per logical switch? Basically,
> > >>> sharing all these MAC_Bindings between all router ports connected to
> > >>> the same LS.
> > >>>
> > >>> Do you see any problem with this approach?
> > >>>
> > >>> Thanks,
> > >>> Dumitru
> > >>>
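For illustration, a minimal sketch in C (a numbers-only model with made-up
variable names, not OVN code or schema) of why this keying change collapses
the table in the lab topology above:

    /* key_math.c - sketch of per-LRP vs. per-LS keying of learned MAC
     * bindings, using the numbers from the lab setup described above. */
    #include <stdio.h>

    int main(void)
    {
        long routers = 500;       /* LRPs attached to the provider LS. */
        long external_ips = 2000; /* 1500 FIPs + 500 SNAT addresses.   */

        /* Per-LRP keying: every router port learns its own copy of
         * every neighbor on the shared provider switch. */
        long per_lrp_rows = routers * external_ips;

        /* Per-LS keying: one shared entry per (LS, IP) pair. */
        long per_ls_rows = external_ips;

        printf("per-LRP keying: %ld rows\n", per_lrp_rows); /* 1000000 */
        printf("per-LS keying:  %ld rows\n", per_ls_rows);  /* 2000    */
        return 0;
    }

This matches the ~999088 MAC_Binding rows observed in the lab.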
> > >> I believe that this approach is the way to go; nothing comes to my
> > >> mind that could go wrong here. We will try to make a patch for that.
> > >> However, if someone who is familiar with the code knows how to do it
> > >> quickly, that would also be very nice.
> > >
> > > This approach should work.
> > >
> > > I've another idea (I won't call it a solution yet). What if we drop
> > > the usage of MAC_Binding altogether?
> >
> > This would be great!
> >
> > > - When ovn-controller learns a mac_binding, it will not create a row
> > >   in the SB MAC_Binding table.
> > > - Instead, it will maintain the learnt mac binding in its memory.
> > > - ovn-controller will still program table 66 with the flow to set
> > >   eth.dst (for the get_arp() action).
> > >
> > > This has a couple of advantages:
> > > - Right now we never flush old/stale mac_binding entries.
> > >   - Suppose the MAC of an external IP has changed, but OVN still has
> > >     an entry for that IP with the old MAC in the mac_binding table;
> > >     we will keep using the old MAC, sending packets to the wrong
> > >     destination, where they may get lost.
> > >   - We would get rid of this problem.
> > > - We will also save SB DB space.
> > >
> > > There are a few disadvantages:
> > > - Other ovn-controllers will not add the flows in table 66. I guess
> > >   this should be fine, as each ovn-controller can generate the ARP
> > >   request and learn the MAC itself.
> > > - When ovn-controller restarts, we lose the learnt MACs and need to
> > >   learn them again.
> > >
> > > Any thoughts on this?
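As a rough illustration of the in-memory approach above (all names are
hypothetical; the real ovn-controller data structures differ), such a local
cache, keyed on (datapath, IP) and never written to the SB, might look like:

    /* local_mac_cache.c - sketch of keeping learnt MAC bindings only in
     * ovn-controller memory instead of in the SB MAC_Binding table. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    #define CACHE_SLOTS 4096

    struct learnt_mac {
        bool in_use;
        uint32_t dp_key;   /* Logical datapath the binding belongs to. */
        uint32_t ip;       /* IPv4 neighbor address.                   */
        uint8_t mac[6];    /* Learnt Ethernet address.                 */
        time_t last_seen;  /* Refreshed on learn; usable for aging.    */
    };

    static struct learnt_mac cache[CACHE_SLOTS];

    static unsigned slot(uint32_t dp_key, uint32_t ip)
    {
        return (dp_key * 2654435761u ^ ip) % CACHE_SLOTS;
    }

    /* Learn (or refresh) a binding in memory only; no SB row is written.
     * The caller would also (re)install the table 66 flow that sets
     * eth.dst for get_arp(). */
    static bool learn(uint32_t dp_key, uint32_t ip, const uint8_t mac[6])
    {
        unsigned i = slot(dp_key, ip);
        for (unsigned n = 0; n < CACHE_SLOTS;
             n++, i = (i + 1) % CACHE_SLOTS) {
            if (!cache[i].in_use
                || (cache[i].dp_key == dp_key && cache[i].ip == ip)) {
                cache[i] = (struct learnt_mac) { true, dp_key, ip,
                                                 {0}, time(NULL) };
                memcpy(cache[i].mac, mac, 6);
                return true;
            }
        }
        return false; /* Cache full. */
    }

    int main(void)
    {
        uint8_t gw_mac[6] = { 0x52, 0x54, 0x00, 0xaa, 0xbb, 0xcc };

        learn(42, 0x0a000001, gw_mac); /* Learn 10.0.0.1 on datapath 42. */

        /* A restart loses everything learnt here, which is the stated
         * disadvantage: the MACs must be re-learnt via ARP. */
        printf("entries survive restart: no\n");
        return 0;
    }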
> It'd be great to have some sort of local ARP cache, but I'm concerned
> about the performance implications.
>
> - How are you going to determine when an entry is stale? If you slow
>   path the packets to reset the timeout every time a packet with the
>   source MAC is received, it doesn't look good. Maybe you have
>   something else in mind.

Right now we don't expire any mac_binding entry. If I understand you
correctly, your concern is the scenario where a floating IP is updated
with a different MAC: how does the local cache get updated? Right now
networking-ovn (in the case of OpenStack) updates the mac_binding entry
in the Southbound DB for such cases, right?

Thanks
Numan

> > There's another scenario that we need to take care of, and it doesn't
> > seem too obvious to address without MAC_Bindings.
> >
> > GARPs were being injected in the L2 broadcast domain of a LS for NAT
> > addresses, in case FIPs are reused by the CMS, introduced by:
> >
> > https://github.com/ovn-org/ovn/commit/069a32cbf443c937feff44078e8828d7a2702da8
>
> Dumitru and I have been discussing the possibility of reverting this
> patch and relying on CMSs to maintain the MAC_Binding entries
> associated with the FIPs [0].
> I'm against reverting this patch in OVN [1] for multiple reasons, the
> most important one being that if we rely on workarounds on the CMS
> side, we'll be creating a control plane dependency for something that
> is purely dataplane (i.e., if the Neutron server is down - outage,
> upgrades, etc. - traffic is going to be disrupted). On the other hand,
> one could argue that the same dependency now exists on ovn-controller
> being up & running, but I believe that this is better than a) relying
> on workarounds in CMSs and b) relying on CMS availability.
>
> In the short term, I think that moving the MAC_Binding entries to the
> LS instead of the LRP, as suggested up-thread, would be a good idea.
> In the long haul, the *local* ARP cache seems to be the right
> solution. Brainstorming with Dumitru, he suggested inspecting the
> flows regularly to see whether the packet count on the flows that
> check if src_mac == X has stopped increasing for a while, and then
> removing the ARP responder flows locally.
>
> [0] https://github.com/openstack/networking-ovn/commit/5181f1106ff839d08152623c25c9a5f6797aa2d7
>
> [1] https://github.com/ovn-org/ovn/commit/069a32cbf443c937feff44078e8828d7a2702da8
>
> > Recently, due to the dataplane scaling issue (the 4K resubmit limit
> > being hit), we don't flood these packets on non-router ports and
> > instead create the MAC_Bindings directly from ovn-controller:
> >
> > https://github.com/ovn-org/ovn/commit/a2b88dc5136507e727e4bcdc4bf6fde559f519a9
> >
> > Without the MAC_Binding table we'd need to find a way to update or
> > flush stale bindings when an IP is used for a VIF or FIP.
> >
> > Thanks,
> > Dumitru
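To make the flow-statistics aging idea above concrete: a rough sketch in C
with hypothetical names (the packet-count query here is a stand-in for a
real OpenFlow flow-stats request, e.g. OFPST_FLOW) of expiring a binding
whose src_mac-check flow has stopped counting packets:

    /* stale_probe.c - sketch of aging local MAC bindings by watching
     * OpenFlow packet counters: if the flow matching "eth.src == X" has
     * not counted a new packet for IDLE_SECS, drop the binding and its
     * flows. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    #define IDLE_SECS 300

    struct binding_probe {
        uint64_t last_pkt_count; /* Counter value at the last poll.  */
        time_t last_increase;    /* When the counter last went up.   */
    };

    /* Stand-in for querying the per-flow packet count from the switch;
     * a real implementation would issue a flow-stats request. */
    static uint64_t query_pkt_count_stub(void)
    {
        return 1234;
    }

    /* Returns true if the binding should be expired. */
    static bool probe(struct binding_probe *p, uint64_t pkt_count,
                      time_t now)
    {
        if (pkt_count > p->last_pkt_count) {
            /* Traffic from this MAC still arrives: entry is fresh. */
            p->last_pkt_count = pkt_count;
            p->last_increase = now;
            return false;
        }
        /* No packet matched the src_mac check in a while: stale. */
        return now - p->last_increase > IDLE_SECS;
    }

    int main(void)
    {
        struct binding_probe p = { 0, time(NULL) };

        if (probe(&p, query_pkt_count_stub(), time(NULL))) {
            printf("expire binding and remove its flows\n");
        } else {
            printf("binding still in use\n");
        }
        return 0;
    }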