On 4/24/26 1:15 PM, Lorenzo Bianconi wrote:
> From: Dumitru Ceara <[email protected]>
> 
> In the OVN logical switch pipeline, broadcast ARP requests (and ND_NS)
> generated by VIFs and by router ports are also flooded into the per switch
> MC_FLOOD_L2 multicast group which includes all non-router ports of the
> switch.
> In deployments with large logical broadcast domains (logical switches with
> a significantly large number of switch ports, e.g. 200+) this becomes a
> problem because the MC_FLOOD_L2 multicast group has a lot of ports so the
> chain of the OpenFlow tables that the packet needs to traverse for full
> processing becomes really long, going over OVS' 4K resubmit limit and
> causing the packet to be dropped.
> The main reason why the ARP packets are currently sent to all non-router
> ports is to allow the workloads to learn the mapping between arp.spa
> (source IP) and arp.sha (source MAC). However, that's just an optimization,
> which might avoid future ARP requests from other workloads; and it's not a
> requirement, nothing would really break if we didn't forward those packets
> to all other VIFs.
> It's probably acceptable to change the behavior and just forward these ARP
> requests to the ports that have LSP.addresses="unknown" as the workloads
> behind those ports (or the fabric for the localnet case) might actually own
> the arp.tpa target IP.
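
For readers following along, here is a rough Python model of the change
described above.  It is only a sketch, not OVN code: the Port class, the
function names, and the per-port resubmit cost are hypothetical, and the
linear cost model is an assumption.  Only the 4K limit and the two target
sets (all non-router ports vs. ports with "unknown" addresses) come from
the commit message.

```python
# Illustrative model only; this is not how ovn-northd or OVS is implemented.
from dataclasses import dataclass, field

OVS_RESUBMIT_LIMIT = 4096  # "OVS' 4K resubmit limit" from the commit message


@dataclass
class Port:
    name: str
    addresses: list = field(default_factory=list)  # models LSP.addresses
    is_router_port: bool = False


def flood_l2_targets(ports):
    """Current behavior: every non-router port of the switch (MC_FLOOD_L2)."""
    return [p for p in ports if not p.is_router_port]


def unknown_targets(ports):
    """Proposed behavior: only ports whose addresses include "unknown"."""
    return [p for p in ports if "unknown" in p.addresses]


def exceeds_resubmit_limit(n_targets, resubmits_per_target):
    """Return True if delivery to n_targets would go over the limit.

    resubmits_per_target is a hypothetical per-port cost; the real value
    depends on the pipeline and is not stated in the commit message.
    """
    return n_targets * resubmits_per_target > OVS_RESUBMIT_LIMIT


if __name__ == "__main__":
    # 250 VIFs with known addresses, one localnet-style "unknown" port,
    # and one router port (all made-up values).
    ports = [Port(f"vif{i}", ["00:00:00:00:00:01 10.0.0.1"]) for i in range(250)]
    ports.append(Port("ln0", ["unknown"]))
    ports.append(Port("ls-to-lr", ["router"], is_router_port=True))

    cost = 20  # assumed resubmits per output port
    print(exceeds_resubmit_limit(len(flood_l2_targets(ports)), cost))
    print(exceeds_resubmit_limit(len(unknown_targets(ports)), cost))
```

With these made-up numbers the full flood set goes over the limit while
the "unknown"-only set does not, which is the effect the patch is after.
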
> 
> Reported-at: https://redhat.atlassian.net/browse/FDP-3439
> Co-authored-by: Lorenzo Bianconi <[email protected]>
> Signed-off-by: Lorenzo Bianconi <[email protected]>
> Signed-off-by: Dumitru Ceara <[email protected]>
> ---

Hi Lorenzo,

Maybe there was an accident with this patch.  When I reported this bug
internally, I mentioned that this _might_ be a potential fix approach,
but that we need to figure out the behavioral change, e.g., the test we
skip below.

I'm going to mark this as changes requested in patchwork for now.

> diff --git a/tests/ovn.at b/tests/ovn.at
> index c0ae611bc..41707120e 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -5025,6 +5025,7 @@ OVN_FOR_EACH_NORTHD_FLOW_TUNNEL([
>  AT_SETUP([3 HVs, 3 LS, 3 lports/LS, 1 LR])
>  AT_KEYWORDS([slowtest])
>  AT_SKIP_IF([test $HAVE_SCAPY = no])
> +AT_SKIP_IF([:])

This is not OK.  I was skipping it in the quick'n'dirty test I did before
because this test fails.  We can't do that in the actual fix.

>  ovn_start
>  
>  # Logical network:
> @@ -13310,6 +13311,7 @@ AT_CLEANUP
>  # instead of tunneling.
>  OVN_FOR_EACH_NORTHD([
>  AT_SETUP([vlan traffic for external network with distributed router gateway port])
> +AT_SKIP_IF([:])
>  ovn_start
>  
>  # Logical network:

Regards,
Dumitru

