On Thu, Nov 7, 2019 at 6:02 PM Han Zhou <hz...@ovn.org> wrote:
>
>
>
> On Thu, Nov 7, 2019 at 7:43 AM Dumitru Ceara <dce...@redhat.com> wrote:
> >
> > On Thu, Nov 7, 2019 at 3:51 AM Han Zhou <hz...@ovn.org> wrote:
> > >
> > >
> > >
> > > On Mon, Nov 4, 2019 at 11:32 AM Numan Siddique <num...@ovn.org> wrote:
> > > >
> > > > On Mon, Nov 4, 2019 at 8:37 PM Dumitru Ceara <dce...@redhat.com> wrote:
> > > > >
> > > > > ARP request and ND NS packets for router owned IPs were being
> > > > > flooded in the complete L2 domain (using the MC_FLOOD multicast 
> > > > > group).
> > > > > However this creates a scaling issue in scenarios where aggregation
> > > > > logical switches are connected to more logical routers (~350). The
> > > > > logical pipelines of all routers would have to be executed before the
> > > > > packet is finally replied to by a single router, the owner of the IP
> > > > > address.
> > > > >
> > > > > This commit limits the broadcast domain by bypassing the L2 Lookup 
> > > > > stage
> > > > > for ARP requests that will be replied by a single router. The packets
> > > > > are still flooded in the L2 domain but not on any of the other patch
> > > > > ports towards other routers connected to the switch. This restricted
> > > > > flooding is done by using a new multicast group (MC_ARP_ND_FLOOD).
> > > > >
> > > > > IPs that are owned by the routers and for which this fix applies are:
> > > > > - IP addresses configured on the router ports.
> > > > > - VIPs.
> > > > > - NAT IPs.
> > > > >
> > > > > This commit also fixes function get_router_load_balancer_ips() which
> > > > > was incorrectly returning a single address_family even though the
> > > > > IP set could contain a mix of IPv4 and IPv6 addresses.
> > > > >
> > > > > Reported-at: https://bugzilla.redhat.com/1756945
> > > > > Reported-by: Anil Venkata <vkomm...@redhat.com>
> > > > > Signed-off-by: Dumitru Ceara <dce...@redhat.com>
> > > >
> > > > Thanks Dumitru, for addressing the review comments.
> > > >
> > > > Acked-by: Numan Siddique <num...@ovn.org>
> > > >
> > > > Han, if you can take a look in this patch and provide your comments,
> > > > that would be great.
> > > >
> >
> > Hi Han,
> >
> > >
> > > Sorry for delayed response :(
> >
> > No worries and thanks for reviewing this change.
> >
> > >
> > > The patch looks good to me except:
> > >
> > > 1. It changes the behavior that when an ARP request for a LRP's IP is 
> > > coming to the logical switch, other routers will no longer learn the 
> > > MAC-IP binding of the ARP sender. This has been discussed and I tend to 
> > > agree with Dumitru that it should not cause real issue. I think it just 
> > > worth to be documented clearly in the ovn-architecture, probably in the 
> > > section: Logical Routers and Logical Patch Ports, because this is 
> > > something different from a traditional switch's behavior, and network 
> > > administrators may get suprised.
> >
> > Ack, I'll add this in v6.
> >
> > >
> > > 2. Since we are anyway changing the behavior of ARP request handling and 
> > > bypassing logical router ports, and we think it should not cause real 
> > > problem, then I wonder why adding the MC_ARP_ND group to still flood to 
> > > the non-router ports. Is it really useful? Maybe it is not a big deal, 
> > > but I think the extra MC group would add some performance cost and it is 
> > > almost redundant with the regular MC_FLOOD except that there is no router 
> > > ports in it - it is a lot of redundancy considering that most LSPs are 
> > > regular VIFs in typical large scale environments. I would suggest to 
> > > simplify this unless there is clear concerns of not flooding to other 
> > > ports.
> >
> > If I skip flooding to non-router ports the following test fails
> > because one of the things checked by the test is that we flood
> > broadcasts (including ARPs) in the broadcast domain:
> >  36: ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR           FAILED 
> > (ovs-macros.at:220)
> >
> > So my concern was that people might expect the ARP requests to be
> > flooded within the L2 broadcast domain.
> > However, we could add configuration knob to change the behavior and
> > only flood them on localnet ports. Like this we could maintain the
> > current L2 flooding but in deployments where this is not necessary the
> > users could disable the flooding to VIF ports and this would remove
> > the MC_ARP_ND group unless there's a localnet port in the datapath.
> >
> > What do you think?
> >
>
> Adding configuration knob is a valid option, but I hoped we could simplify 
> the change instead of making it even more complex :)
>
> My point is that since we changed the behavior, it might be better to change 
> it with a more straightforward philosophy: for ARP request for an IP that is 
> owned by a LRP, it is not flooded. This is in fact comply with our ARP 
> responder behavior in LS: for a known IP-MAC binding, the ARP request is not 
> flooded but directly responded to the sender. Now with your change we are 
> just extending this behavior for IPs owned by LRPs behind the LSPs with type 
> "router".

Sounds good to me.

>
> I don't think we need to worry about the test case "36: ovn -- 3 HVs, 3 LS, 3 
> lports/LS, 1 LR". In fact the test already considered a case where flooding 
> is not expected:
>             # 192.168.33.254 is configured to the switch patch port for lrp33,
>             # so no ARP flooding expected for it.
> Now we can just change the test case a little more to exclude the IPs owned 
> by LRPs, since this is new expected behavior. What do you think?

Sure, this can be done, I was just raising the point that I'm not sure
whether this would bother users or not.

>
> Also I am confused why you pointed out localnet port as an exception. I think 
> we should just treat it with the same philosophy. If we are not flooding, 
> then just don't flood to any port, no matter if it is VIF or localnet.

My reason was that behind localnet ports we might have larger physical
networks and flooding the ARP requests there might help external
devices already update their FDBs/FIBs and avoid more broadcasts
later. I'm not completely sure if the benefit is worth the effort
though.

>
> (If we do believe there is a need to flood with a new group MC_ARP_ND, I'd 
> say let's don't bother about the configuration knob now, until we see 
> performance impact from scale test)
>

Sure, I'm inclining though towards removing the MC_ARP_ND. The only
concern I have is regarding the localnet ports but if you think it's
ok to skip flooding ARPs there then I can send a follow-up v6 soon.

Thanks,
Dumitru

>
> > >
> > > 3. Not a big thing, but the name of the functions 
> > > build_lswitch_rport_arp_responders() and build_lswitch_rport_arp_flow() 
> > > are a little bit confusing. It is not for arp-responder, but for 
> > > bypassing router ports. Would it be better to be something like: 
> > > build_lswitch_rport_arp_req_flows() and 
> > > build_lswitch_rport_arp_request_flow_for_ip()?
> >
> > Sure, I'll change them.
> > >
> > > In addition, not for the patch, but for the problem itself, I think we'd 
> > > better quantify the number of router ports we support with current 
> > > resubmit limit (as suggested by Ben), and document somewhere the limit 
> > > and the impact if user exceeds the limit. If the limit is easily exceed 
> > > in very common deployments, we probably should consider increasing the 
> > > resubmit limit.
> >
> > Agreed, that should be done too.
> >
> > Thanks,
> > Dumitru
> >
> > >
> > > Thanks,
> > > Han
> >

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to