On 9/9/25 9:51 AM, Xavier Simonart wrote:
> Hi Dumitru
> 

Hi Xavier,

> Thanks for looking into this.
> 

No worries.

> I might send a v2 as I saw a few different flakes.
> 1) The CI flake : i.e. 102 mac_binding seen instead of 101. This is I
> think due to additional mac binding for the router port. 
> It is an OVN issue, but the flake itself is introduced by the
> modification of the test done by this patch, so I'll fix the test for
> now and I should open a low priority JIRA issue.
> 2) The flake you saw on your tmux session (i.e. missing garp): it might
> be that your system was just quite slower at that time. The flake is
> introduced by the test modification done by this patch (as the test is
> now much longer).
> 3) Yet another flake happened during the night tests I run: if ovs is
> slower than test script, reset_pcap might happen before 1st packet is
> received, and we see the packet twice afterwards. This flake before the
> patch.
> I'll let the test run a few more hours, then will hopefully send a v2.
> 

Thanks for looking at this in-depth!  I'll be on the lookout for a v2.

Regards,
Dumitru

> Thanks
> Xavier
> 
> 
> On Mon, Sep 8, 2025 at 5:22 PM Dumitru Ceara <[email protected]
> <mailto:[email protected]>> wrote:
> 
>     On 9/5/25 3:27 PM, Xavier Simonart via dev wrote:
>     > When using Distributed Gateway Port and HA, IPv6 ND_NA packet
>     > received on the router port causes all HA chassis to try updating the
>     > Mac_Binding table, often causing transaction errors.
>     > This patch prevents this by avoiding Mac_Binding updates due to
>     ND_NS or NA
>     > packets received on non residend DGP.
>     >
>     > Signed-off-by: Xavier Simonart <[email protected]
>     <mailto:[email protected]>>
>     > Reported-at: https://issues.redhat.com/browse/FDP-1567 <https://
>     issues.redhat.com/browse/FDP-1567>
>     > ---
> 
>     Hi Xavier,
> 
>     Thanks for the fix!  The code change looks correct to me.
> 
>     >  northd/northd.c     | 13 +++++++++++++
>     >  tests/ovn-northd.at <http://ovn-northd.at> |  3 +++
>     >  tests/ovn.at <http://ovn.at>        | 43 ++++++++++++++++++++++++
>     ++++++++++++-------
>     >  3 files changed, 52 insertions(+), 7 deletions(-)
>     >
> 
>     [...]
> 
>     > diff --git a/tests/ovn.at <http://ovn.at> b/tests/ovn.at <http://
>     ovn.at>
>     > index 65fbfdcda..57c305c3c 100644
>     > --- a/tests/ovn.at <http://ovn.at>
>     > +++ b/tests/ovn.at <http://ovn.at>
>     > @@ -12477,6 +12477,11 @@ as gw2
>     >  check ovs-vsctl add-br br-phys
>     >  ovn_attach n1 br-phys 192.168.0.4
>     > 
>     > +sim_add gw3
>     > +as gw3
>     > +check ovs-vsctl add-br br-phys
>     > +ovn_attach n1 br-phys 192.168.0.6
>     > +
>     >  sim_add ext1
>     >  as ext1
>     >  check ovs-vsctl add-br br-phys
>     > @@ -12526,7 +12531,7 @@ check ovn-nbctl lr-route-add R0 0.0.0.0/0
>     <http://0.0.0.0/0> 100.60.1.2
>     >  check ovn-nbctl lr-route-add R1 192.168.0.0/16
>     <http://192.168.0.0/16> 100.60.1.1
>     > 
>     >  # Connect alice to R1 as distributed router gateway port on gw1
>     > -check ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24
>     <http://172.16.1.1/24>
>     > +check ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24
>     <http://172.16.1.1/24> fd10::1/64
>     > 
>     >  AT_CHECK([ovn-nbctl \
>     >      --id=@gc0 create Gateway_Chassis name=alice_gw1 \
>     > @@ -12535,8 +12540,12 @@ AT_CHECK([ovn-nbctl \
>     >      --id=@gc1 create Gateway_Chassis name=alice_gw2 \
>     >                                       chassis_name=gw2 \
>     >                                       priority=10 -- \
>     > -    set Logical_Router_Port alice 'gateway_chassis=[@gc0,@gc1]' |
>     uuidfilt], [0], [<0>
>     > +    --id=@gc2 create Gateway_Chassis name=alice_gw3 \
>     > +                                     chassis_name=gw3 \
>     > +                                     priority=5 -- \
>     > +    set Logical_Router_Port alice
>     'gateway_chassis=[@gc0,@gc1,@gc2]' | uuidfilt], [0], [<0>
>     >  <1>
>     > +<2>
>     >  ])
>     > 
>     >  check ovn-nbctl lsp-add alice rp-alice -- set Logical_Switch_Port
>     rp-alice \
>     > @@ -12549,7 +12558,7 @@ check ovn-nbctl lsp-add foo foo1 \
>     > 
>     >  # Create logical port outside1 in outside
>     >  check ovn-nbctl lsp-add outside outside1 \
>     > --- lsp-set-addresses outside1 "f0:00:00:01:02:04 172.16.1.3"
>     > +-- lsp-set-addresses outside1 unknown
>     > 
>     >  # Create localnet port in alice
>     >  check ovn-nbctl lsp-add alice ln-alice
>     > @@ -12567,11 +12576,12 @@ check ovn-nbctl lsp-set-options ln-
>     outside network_name=phys
>     >  # mapping to the external network, is the one generating packets
>     >  check as gw1 ovs-vsctl set open . external-ids:ovn-bridge-
>     mappings=phys:br-phys
>     >  check as gw2 ovs-vsctl set open . external-ids:ovn-bridge-
>     mappings=phys:br-phys
>     > +check as gw3 ovs-vsctl set open . external-ids:ovn-bridge-
>     mappings=phys:br-phys
>     >  check as ext1 ovs-vsctl set open . external-ids:ovn-bridge-
>     mappings=phys:br-phys
>     > 
>     >  wait_for_ports_up
>     >  check ovn-nbctl --wait=sb sync
>     > -OVN_WAIT_PATCH_PORT_FLOWS(["ln-alice"], ["gw1"] ["gw2"])
>     > +OVN_WAIT_PATCH_PORT_FLOWS(["ln-alice"], ["gw1"] ["gw2"],["gw3"])
>     >  OVN_WAIT_PATCH_PORT_FLOWS(["ln-outside"], ["ext1"])
>     > 
>     >  ovn-sbctl dump-flows > sbflows
>     > @@ -12588,16 +12598,24 @@ check ovn-nbctl --wait=hv sync
>     > 
>     >  hv1_gw1_ofport=$(as hv1 ovs-vsctl --bare --columns ofport find
>     Interface name=ovn-gw1-0)
>     >  hv1_gw2_ofport=$(as hv1 ovs-vsctl --bare --columns ofport find
>     Interface name=ovn-gw2-0)
>     > +hv1_gw3_ofport=$(as hv1 ovs-vsctl --bare --columns ofport find
>     Interface name=ovn-gw3-0)
>     > 
>     >  OVS_WAIT_UNTIL([
>     > -    test 1 = $(as hv1 ovs-ofctl dump-flows br-int
>     table=OFTABLE_REMOTE_OUTPUT | grep -c "active_backup,ofport,members:
>     $hv1_gw1_ofport,$hv1_gw2_ofport")
>     > +    test 1 = $(as hv1 ovs-ofctl dump-flows br-int
>     table=OFTABLE_REMOTE_OUTPUT | grep -c "active_backup,ofport,members:
>     $hv1_gw1_ofport,$hv1_gw2_ofport,$hv1_gw3_ofport")
>     >  ])
>     > 
>     > +n_external_ip=50
>     >  test_ip_packet()
>     >  {
>     >      local active_gw=$1
>     >      local backup_gw=$2
>     > 
>     > +    # Send many GARP/ND_NA from outside1 to the router.
>     > +    for i in $(seq 11 $((n_external_ip + 10))); do
>     > +        send_garp ext1 ext1-vif1 1 "00:00:00:00:20:$i"
>     "ff:ff:ff:ff:ff:ff" 172.16.1.$i 172.16.1.$i
>     > +        send_na ext1 ext1-vif1 "00:00:00:00:30:$i"
>     "33:33:00:00:00:01" "fd10::$i" "ff02::1"
>     > +    done
>     > +
>     >      # Send ip packet between foo1 and outside1
>     >      src_mac="f00000010203" # foo1 mac
>     >      dst_mac="000001010203" # foo-R0 mac (internal router leg)
>     > @@ -12665,16 +12683,27 @@ AT_CHECK([ovn-nbctl --wait=hv \
>     >      --id=@gc1 create Gateway_Chassis name=alice_gw2 \
>     >                                       chassis_name=gw2 \
>     >                                       priority=20 -- \
>     > -    set Logical_Router_Port alice 'gateway_chassis=[@gc0,@gc1]' |
>     uuidfilt], [0], [<0>
>     > +    --id=@gc2 create Gateway_Chassis name=alice_gw3 \
>     > +                                     chassis_name=gw3 \
>     > +                                     priority=5 -- \
>     > +    set Logical_Router_Port alice
>     'gateway_chassis=[@gc0,@gc1,@gc2]' | uuidfilt], [0], [<0>
>     >  <1>
>     > +<2>
>     >  ])
>     > 
>     >  OVS_WAIT_UNTIL([
>     > -    test 1 = $(as hv1 ovs-ofctl dump-flows br-int
>     table=OFTABLE_REMOTE_OUTPUT | grep -c "active_backup,ofport,members:
>     $hv1_gw2_ofport,$hv1_gw1_ofport")
>     > +    test 1 = $(as hv1 ovs-ofctl dump-flows br-int
>     table=OFTABLE_REMOTE_OUTPUT | grep -c "active_backup,ofport,members:
>     $hv1_gw2_ofport,$hv1_gw1_ofport,$hv1_gw3_ofport")
>     >  ])
>     > 
>     >  test_ip_packet gw2 gw1
>     > 
>     > +wait_row_count MAC_Binding $((n_external_ip * 2 + 1))
>     > +
>     > +for i in gw1 gw2 gw3; do
>     > +    AT_CHECK([grep -c "Transaction causes multiple rows" $i/ovn-
>     controller.log], [1], [0
>     > +])
>     > +done
>     > +
>     >  OVN_CLEANUP([hv1],[gw1],[gw2],[ext1])
>     >  AT_CLEANUP
>     >  ])
> 
>     However, this test seems to fail every now and then in CI, please see
>     this run in my fork:
> 
>     https://github.com/dceara/ovn/actions/runs/17553176009/
>     job/49850727367#step:12:5045 <https://github.com/dceara/ovn/actions/
>     runs/17553176009/job/49850727367#step:12:5045>
> 
>     Could you please have a look?  If it's a small adjustment maybe we can
>     squash it in without the need for a v2.
> 
>     Regards,
>     Dumitru
> 
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to