Hi Dumitru Thanks for looking into this.
I might send a v2 as I saw a few different flakes. 1) The CI flake : i.e. 102 mac_binding seen instead of 101. This is I think due to additional mac binding for the router port. It is an OVN issue, but the flake itself is introduced by the modification of the test done by this patch, so I'll fix the test for now and I should open a low priority JIRA issue. 2) The flake you saw on your tmux session (i.e. missing garp): it might be that your system was just quite slower at that time. The flake is introduced by the test modification done by this patch (as the test is now much longer). 3) Yet another flake happened during the night tests I run: if ovs is slower than test script, reset_pcap might happen before 1st packet is received, and we see the packet twice afterwards. This flake before the patch. I'll let the test run a few more hours, then will hopefully send a v2. Thanks Xavier On Mon, Sep 8, 2025 at 5:22 PM Dumitru Ceara <[email protected]> wrote: > On 9/5/25 3:27 PM, Xavier Simonart via dev wrote: > > When using Distributed Gateway Port and HA, IPv6 ND_NA packet > > received on the router port causes all HA chassis to try updating the > > Mac_Binding table, often causing transaction errors. > > This patch prevents this by avoiding Mac_Binding updates due to ND_NS or > NA > > packets received on non residend DGP. > > > > Signed-off-by: Xavier Simonart <[email protected]> > > Reported-at: https://issues.redhat.com/browse/FDP-1567 > > --- > > Hi Xavier, > > Thanks for the fix! The code change looks correct to me. > > > northd/northd.c | 13 +++++++++++++ > > tests/ovn-northd.at | 3 +++ > > tests/ovn.at | 43 ++++++++++++++++++++++++++++++++++++------- > > 3 files changed, 52 insertions(+), 7 deletions(-) > > > > [...] > > > diff --git a/tests/ovn.at b/tests/ovn.at > > index 65fbfdcda..57c305c3c 100644 > > --- a/tests/ovn.at > > +++ b/tests/ovn.at > > @@ -12477,6 +12477,11 @@ as gw2 > > check ovs-vsctl add-br br-phys > > ovn_attach n1 br-phys 192.168.0.4 > > > > +sim_add gw3 > > +as gw3 > > +check ovs-vsctl add-br br-phys > > +ovn_attach n1 br-phys 192.168.0.6 > > + > > sim_add ext1 > > as ext1 > > check ovs-vsctl add-br br-phys > > @@ -12526,7 +12531,7 @@ check ovn-nbctl lr-route-add R0 0.0.0.0/0 > 100.60.1.2 > > check ovn-nbctl lr-route-add R1 192.168.0.0/16 100.60.1.1 > > > > # Connect alice to R1 as distributed router gateway port on gw1 > > -check ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 > > +check ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 > fd10::1/64 > > > > AT_CHECK([ovn-nbctl \ > > --id=@gc0 create Gateway_Chassis name=alice_gw1 \ > > @@ -12535,8 +12540,12 @@ AT_CHECK([ovn-nbctl \ > > --id=@gc1 create Gateway_Chassis name=alice_gw2 \ > > chassis_name=gw2 \ > > priority=10 -- \ > > - set Logical_Router_Port alice 'gateway_chassis=[@gc0,@gc1]' | > uuidfilt], [0], [<0> > > + --id=@gc2 create Gateway_Chassis name=alice_gw3 \ > > + chassis_name=gw3 \ > > + priority=5 -- \ > > + set Logical_Router_Port alice 'gateway_chassis=[@gc0,@gc1,@gc2]' | > uuidfilt], [0], [<0> > > <1> > > +<2> > > ]) > > > > check ovn-nbctl lsp-add alice rp-alice -- set Logical_Switch_Port > rp-alice \ > > @@ -12549,7 +12558,7 @@ check ovn-nbctl lsp-add foo foo1 \ > > > > # Create logical port outside1 in outside > > check ovn-nbctl lsp-add outside outside1 \ > > --- lsp-set-addresses outside1 "f0:00:00:01:02:04 172.16.1.3" > > +-- lsp-set-addresses outside1 unknown > > > > # Create localnet port in alice > > check ovn-nbctl lsp-add alice ln-alice > > @@ -12567,11 +12576,12 @@ check ovn-nbctl lsp-set-options ln-outside > network_name=phys > > # mapping to the external network, is the one generating packets > > check as gw1 ovs-vsctl set open . > external-ids:ovn-bridge-mappings=phys:br-phys > > check as gw2 ovs-vsctl set open . > external-ids:ovn-bridge-mappings=phys:br-phys > > +check as gw3 ovs-vsctl set open . > external-ids:ovn-bridge-mappings=phys:br-phys > > check as ext1 ovs-vsctl set open . > external-ids:ovn-bridge-mappings=phys:br-phys > > > > wait_for_ports_up > > check ovn-nbctl --wait=sb sync > > -OVN_WAIT_PATCH_PORT_FLOWS(["ln-alice"], ["gw1"] ["gw2"]) > > +OVN_WAIT_PATCH_PORT_FLOWS(["ln-alice"], ["gw1"] ["gw2"],["gw3"]) > > OVN_WAIT_PATCH_PORT_FLOWS(["ln-outside"], ["ext1"]) > > > > ovn-sbctl dump-flows > sbflows > > @@ -12588,16 +12598,24 @@ check ovn-nbctl --wait=hv sync > > > > hv1_gw1_ofport=$(as hv1 ovs-vsctl --bare --columns ofport find > Interface name=ovn-gw1-0) > > hv1_gw2_ofport=$(as hv1 ovs-vsctl --bare --columns ofport find > Interface name=ovn-gw2-0) > > +hv1_gw3_ofport=$(as hv1 ovs-vsctl --bare --columns ofport find > Interface name=ovn-gw3-0) > > > > OVS_WAIT_UNTIL([ > > - test 1 = $(as hv1 ovs-ofctl dump-flows br-int > table=OFTABLE_REMOTE_OUTPUT | grep -c > "active_backup,ofport,members:$hv1_gw1_ofport,$hv1_gw2_ofport") > > + test 1 = $(as hv1 ovs-ofctl dump-flows br-int > table=OFTABLE_REMOTE_OUTPUT | grep -c > "active_backup,ofport,members:$hv1_gw1_ofport,$hv1_gw2_ofport,$hv1_gw3_ofport") > > ]) > > > > +n_external_ip=50 > > test_ip_packet() > > { > > local active_gw=$1 > > local backup_gw=$2 > > > > + # Send many GARP/ND_NA from outside1 to the router. > > + for i in $(seq 11 $((n_external_ip + 10))); do > > + send_garp ext1 ext1-vif1 1 "00:00:00:00:20:$i" > "ff:ff:ff:ff:ff:ff" 172.16.1.$i 172.16.1.$i > > + send_na ext1 ext1-vif1 "00:00:00:00:30:$i" "33:33:00:00:00:01" > "fd10::$i" "ff02::1" > > + done > > + > > # Send ip packet between foo1 and outside1 > > src_mac="f00000010203" # foo1 mac > > dst_mac="000001010203" # foo-R0 mac (internal router leg) > > @@ -12665,16 +12683,27 @@ AT_CHECK([ovn-nbctl --wait=hv \ > > --id=@gc1 create Gateway_Chassis name=alice_gw2 \ > > chassis_name=gw2 \ > > priority=20 -- \ > > - set Logical_Router_Port alice 'gateway_chassis=[@gc0,@gc1]' | > uuidfilt], [0], [<0> > > + --id=@gc2 create Gateway_Chassis name=alice_gw3 \ > > + chassis_name=gw3 \ > > + priority=5 -- \ > > + set Logical_Router_Port alice 'gateway_chassis=[@gc0,@gc1,@gc2]' | > uuidfilt], [0], [<0> > > <1> > > +<2> > > ]) > > > > OVS_WAIT_UNTIL([ > > - test 1 = $(as hv1 ovs-ofctl dump-flows br-int > table=OFTABLE_REMOTE_OUTPUT | grep -c > "active_backup,ofport,members:$hv1_gw2_ofport,$hv1_gw1_ofport") > > + test 1 = $(as hv1 ovs-ofctl dump-flows br-int > table=OFTABLE_REMOTE_OUTPUT | grep -c > "active_backup,ofport,members:$hv1_gw2_ofport,$hv1_gw1_ofport,$hv1_gw3_ofport") > > ]) > > > > test_ip_packet gw2 gw1 > > > > +wait_row_count MAC_Binding $((n_external_ip * 2 + 1)) > > + > > +for i in gw1 gw2 gw3; do > > + AT_CHECK([grep -c "Transaction causes multiple rows" > $i/ovn-controller.log], [1], [0 > > +]) > > +done > > + > > OVN_CLEANUP([hv1],[gw1],[gw2],[ext1]) > > AT_CLEANUP > > ]) > > However, this test seems to fail every now and then in CI, please see > this run in my fork: > > > https://github.com/dceara/ovn/actions/runs/17553176009/job/49850727367#step:12:5045 > > Could you please have a look? If it's a small adjustment maybe we can > squash it in without the need for a v2. > > Regards, > Dumitru > > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
