On Wed, Jun 3, 2020 at 9:39 PM Han Zhou <zhou...@gmail.com> wrote:

> On Wed, Jun 3, 2020 at 7:16 PM Girish Moodalbail <gmoodalb...@gmail.com>
> wrote:
>
>> Hello all,
>>
>> While working on an extension, see the diagram below, to the existing
>> OVN logical topology for the ovn-kubernetes project, I am seeing an
>> explosion of the "Reply to ARP requests" logical flows in the
>> `lr_in_ip_input` table for the distributed router (ovn_cluster_router)
>> configured with gateway port (rtol-LS)
>>
>>                     internet
>>               ---------+-------------->
>>                        |
>>                        |
>>       +----------localnet-port---------+
>>       |LS                              |
>>       +-----------------ltor-LS--------+
>>                            |
>>                            |
>>   +---------------------rtol-LS------------+
>>   |           ovn_cluster_router           |
>>   |          (Distributed Router)          |
>>   +-rtos-ls0------rtos-ls1--------rtos-ls2-+
>>        |             |               |
>>        |             |               |
>>  +-----+-+      +----+--+      +-----+-+
>>  |  LS0  |      |  LS1  |      |  LS2  |
>>  +-+-----+      +-+-----+      +-+-----+
>>    |              |              |
>>    p0             p1             p2
>>   IA0            IA1            IA2
>>   EA0            EA1            EA2
>> (Node0)        (Node1)        (Node2)
>>
>> In the topology above, each of the three logical switch ports has an
>> internal address of IAx and an external address of EAx (dnat_and_snat IP).
>> They are all bound to their respective nodes (Nodex). A packet from `p0`
>> heading towards the internet will be SNAT'ed to EA0 on the local hypervisor
>> and then sent out through the LS's localnet-port on that hypervisor.
>> Basically, they are configured for distributed NATing.
>>
>> I am seeing interesting "Reply to ARP requests" flows for arp.tpa set to
>> "EAX".
>> Flows are like this:
>>
>> For EA0
>> priority=90, match=(inport == "rtos-ls0" && arp.tpa == EA0 && arp.op == 1), action=(/* ARP reply */)
>> priority=90, match=(inport == "rtos-ls1" && arp.tpa == EA0 && arp.op == 1), action=(/* ARP reply */)
>> priority=90, match=(inport == "rtos-ls2" && arp.tpa == EA0 && arp.op == 1), action=(/* ARP reply */)
>>
>> For EA1
>> priority=90, match=(inport == "rtos-ls0" && arp.tpa == EA1 && arp.op == 1), action=(/* ARP reply */)
>> priority=90, match=(inport == "rtos-ls1" && arp.tpa == EA1 && arp.op == 1), action=(/* ARP reply */)
>> priority=90, match=(inport == "rtos-ls2" && arp.tpa == EA1 && arp.op == 1), action=(/* ARP reply */)
>>
>> Similarly, for EA2.
>>
>> So, we have N * N "Reply to ARP requests" flows for N nodes each with 1
>> dnat_and_snat IP.
>> This is causing scale issues.
>>
>> If you look at the flows for `EA0`, I am confused as to why they are
>> needed.
>>
>>    1. When will one see an ARP request for EA0 from any of the
>>    LS{0,1,2}'s logical switch ports?
>>    2. If it is needed at all, can't we just remove the `inport` match
>>    altogether, since the flow is configured for every logical router
>>    port except for the distributed gateway port rtol-LS? For this port, we
>>    could add a higher-priority rule with action set to `next`.
>>    3. Say we don't need east-west NAT connectivity. Is there a way to
>>    make these ARPs be learnt dynamically, like we are doing for the join
>>    and external logical switches (the other thread [1])?
>>
>> Regards,
>> ~Girish
>>
>> [1]
>> https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049994.html
>>
>
> In general, these flows should be per router instead of per router port,
> since the NAT addresses are not attached to any router port. For
> distributed gateway ports, there will need to be per-port flows to match
> is_chassis_resident(gateway-chassis).
> I think this can be handled by:
> - priority X + 20 flows for each distributed gateway port with
>   is_chassis_resident(), reply ARP
> - priority X + 10 flows for each distributed gateway port without
>   is_chassis_resident(), drop
> - priority X flows for each router (no need to match inport), reply ARP
>
> This way, there are N * (2D + 1) flows per router. N = number of NAT IPs,
> D = number of distributed gateway ports. This would optimize the above
> scenario where there is only 1 distributed gateway port but many regular
> router ports. Thoughts?
>

Han,

I think this will work. Again, thanks for the quick reply.

Regards,
~Girish
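
To make the flow counts discussed in this thread concrete, here is a minimal Python sketch that enumerates both layouts and compares their sizes. It is only an illustration: the function names (`per_port_flows`, `per_router_flows`) and the match strings are assumptions made for this example, not ovn-northd output, and the argument of `is_chassis_resident()` is left elided because the thread does not spell it out. Priorities in the proposed layout are stored as offsets from the unspecified base priority X.

```python
from typing import List, Tuple

# A flow is modelled as (priority, match, action). This is a toy model,
# not the real OVN logical flow schema.
Flow = Tuple[int, str, str]


def per_port_flows(nat_ips: List[str], router_ports: List[str]) -> List[Flow]:
    """Layout reported in the thread: one ARP-reply flow per (NAT IP, router port)."""
    return [
        (90, f'inport == "{port}" && arp.tpa == {ip} && arp.op == 1', "reply")
        for ip in nat_ips
        for port in router_ports
    ]


def per_router_flows(nat_ips: List[str], gateway_ports: List[str]) -> List[Flow]:
    """Proposed layout: N * (2D + 1) flows; priorities are offsets from X."""
    flows: List[Flow] = []
    for ip in nat_ips:
        for port in gateway_ports:
            # X + 20: reply only where the gateway port is chassis-resident.
            flows.append(
                (20, f'inport == "{port}" && arp.tpa == {ip} && is_chassis_resident(...)', "reply")
            )
            # X + 10: otherwise drop on the distributed gateway port.
            flows.append((10, f'inport == "{port}" && arp.tpa == {ip}', "drop"))
        # X: one per-router flow with no inport match.
        flows.append((0, f"arp.tpa == {ip}", "reply"))
    return flows


if __name__ == "__main__":
    for n in (3, 100, 1000):
        ips = [f"EA{i}" for i in range(n)]
        ports = [f"rtos-ls{i}" for i in range(n)]
        current = len(per_port_flows(ips, ports))
        proposed = len(per_router_flows(ips, ["rtol-LS"]))
        print(f"{n:>5} nodes: per-port={current:>8}  per-router={proposed:>5}")
```

With a single distributed gateway port (D = 1), the proposed layout grows as 3N rather than N * N, so the two coincide at N = 3 and diverge quickly as nodes are added.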