On Wed, Apr 20, 2022 at 8:14 AM Ales Musil <amu...@redhat.com> wrote:
> Thank you for the review, see my reply below. > > On Tue, Apr 19, 2022 at 9:26 PM Ihar Hrachyshka <ihrac...@redhat.com> > wrote: > >> Thanks for the patch. Two comments for the test case below. >> > >> On Tue, Apr 19, 2022 at 1:12 AM Ales Musil <amu...@redhat.com> wrote: >> > >> > The GARP was sent even on chassis that were not serving >> > as l3gw for the specified router. This could lead to race >> > on the physical network when multiple chassis send the same >> > ARP but the physical interfaces have different port numbers >> > on the external bridge. >> > >> > Add check to filter out port_bindings for l3gw that >> > are sitting on different chassis. >> > >> > Reported-at: https://bugzilla.redhat.com/2062580 >> > Signed-off-by: Ales Musil <amu...@redhat.com> >> > --- >> > controller/pinctrl.c | 2 +- >> > tests/ovn.at | 106 +++++++++++++++++++++++++++++++++++++++++++ >> > 2 files changed, 107 insertions(+), 1 deletion(-) >> > >> > diff --git a/controller/pinctrl.c b/controller/pinctrl.c >> > index 25b37ee88..42d1c2a46 100644 >> > --- a/controller/pinctrl.c >> > +++ b/controller/pinctrl.c >> > @@ -5554,7 +5554,7 @@ get_localnet_vifs_l3gwports( >> > sbrec_port_binding_index_set_datapath(target, ld->datapath); >> > SBREC_PORT_BINDING_FOR_EACH_EQUAL (pb, target, >> > >> sbrec_port_binding_by_datapath) { >> > - if (!strcmp(pb->type, "l3gateway") >> > + if ((!strcmp(pb->type, "l3gateway") && pb->chassis == >> chassis) >> > || !strcmp(pb->type, "patch")) { >> > sset_add(local_l3gw_ports, pb->logical_port); >> > } >> > diff --git a/tests/ovn.at b/tests/ovn.at >> > index f9551b843..a02a9dab2 100644 >> > --- a/tests/ovn.at >> > +++ b/tests/ovn.at >> > @@ -8707,6 +8707,112 @@ OVN_CLEANUP([hv1]) >> > AT_CLEANUP >> > ]) >> > >> > +OVN_FOR_EACH_NORTHD([ >> > +AT_SETUP([send gratuitous arp for l3gateway only on selected chassis]) >> > +ovn_start >> > + >> > +# Create logical switch >> > +ovn-nbctl ls-add ls0 >> > +# Create gateway router >> > +ovn-nbctl lr-add lr0 >> > +# Add router port to gateway router >> > +ovn-nbctl lrp-add lr0 lr0-ls0 f0:00:00:00:00:01 192.168.0.1/24 >> > +ovn-nbctl lsp-add ls0 ls0-lr0 -- set Logical_Switch_Port ls0-lr0 \ >> > + type=router options:router-port=lr0-ls0 >> addresses='"f0:00:00:00:00:01"' >> > + >> > +# Create a localnet port. >> > +ovn-nbctl lsp-add ls0 ln_port >> > +ovn-nbctl lsp-set-addresses ln_port unknown >> > +ovn-nbctl lsp-set-type ln_port localnet >> > +ovn-nbctl --wait=hv lsp-set-options ln_port network_name=physnet1 >> > + >> > +# Prepare packets >> > +touch empty_expected >> > +echo >> "fffffffffffff0000000000108060001080006040001f00000000001c0a80001000000000000c0a80001" >> > arp_expected >> > + >> > +net_add n1 >> > +sim_add hv1 >> > +as hv1 >> > +ovs-vsctl \ >> > + -- add-br br-phys \ >> > + -- add-br br-eth0 >> > + >> > +ovn_attach n1 br-phys 192.168.0.10 >> > + >> > +AT_CHECK([ovs-vsctl set Open_vSwitch . >> external-ids:ovn-bridge-mappings=physnet1:br-eth0]) >> > +AT_CHECK([ovs-vsctl add-port br-eth0 snoopvif -- set Interface >> snoopvif options:tx_pcap=hv1/snoopvif-tx.pcap >> options:rxq_pcap=hv1/snoopvif-rx.pcap]) >> > + >> > +sim_add hv2 >> > +as hv2 >> > +ovs-vsctl \ >> > + -- add-br br-phys \ >> > + -- add-br br-eth0 >> > + >> > +ovn_attach n1 br-phys 192.168.0.20 >> > + >> > +AT_CHECK([ovs-vsctl set Open_vSwitch . >> external-ids:ovn-bridge-mappings=physnet1:br-eth0]) >> > +AT_CHECK([ovs-vsctl add-port br-eth0 snoopvif -- set Interface >> snoopvif options:tx_pcap=hv2/snoopvif-tx.pcap >> options:rxq_pcap=hv2/snoopvif-rx.pcap]) >> > + >> > +ovn-sbctl dump-flows > sbflows >> > +AT_CAPTURE_FILE([sbflows]) >> > + >> > +# Wait until the patch ports are created in hv1 and hv2 to connect >> br-int to br-eth0 >> > +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv1]) >> > +OVS_WAIT_UNTIL([test 1 = `as hv1 ovs-vsctl show | \ >> > +grep "Port patch-br-int-to-ln_port" | wc -l`]) >> > +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv2]) >> > +OVS_WAIT_UNTIL([test 1 = `as hv2 ovs-vsctl show | \ >> > +grep "Port patch-br-int-to-ln_port" | wc -l`]) >> > + >> > +# Temporarily remove lr0 chassis >> > +AT_CHECK([ovn-nbctl remove logical_router lr0 options chassis]) >> > + >> > +reset_pcap_file() { >> > + local hv=$1 >> > + local iface=$2 >> > + local pcap_file=$3 >> > + as $hv >> > + ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \ >> > +options:rxq_pcap=dummy-rx.pcap >> > + rm -f ${pcap_file}*.pcap >> > + ovs-vsctl -- set Interface $iface >> options:tx_pcap=${pcap_file}-tx.pcap \ >> > +options:rxq_pcap=${pcap_file}-rx.pcap >> > +} >> > + >> > +reset_pcap_file hv1 snoopvif hv1/snoopvif >> > +reset_pcap_file hv2 snoopvif hv2/snoopvif >> > + >> > +hv1_uuid=$(ovn-sbctl --bare --columns _uuid list chassis hv1) >> > +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv1]) >> > +OVS_WAIT_UNTIL([ >> > + ls0_lr0=$(ovn-sbctl --bare --columns chassis list port_binding >> ls0-lr0) >> > + test "$ls0_lr0" = $hv1_uuid >> > +]) >> >> Since GARPs are periodic, you should give controllers some time (with >> a sleep?) to produce the GARPs. >> > > It shouldn't be needed as the OVN_CHECK_PACKETS is waiting for it. > On the other hand in the spirit of potential flakiness it is probably a > good idea. > > >> >> > + >> > +OVN_CHECK_PACKETS([hv1/snoopvif-tx.pcap], [arp_expected]) >> > +OVN_CHECK_PACKETS([hv2/snoopvif-tx.pcap], [empty_expected]) >> > + >> > +# Temporarily remove lr0 chassis >> > +AT_CHECK([ovn-nbctl remove logical_router lr0 options chassis]) >> > + >> > +reset_pcap_file hv1 snoopvif hv1/snoopvif >> > +reset_pcap_file hv2 snoopvif hv2/snoopvif >> > + >> > +hv2_uuid=$(ovn-sbctl --bare --columns _uuid list chassis hv2) >> > +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv2]) >> > +OVS_WAIT_UNTIL([ >> > + ls0_lr0=$(ovn-sbctl --bare --columns chassis list port_binding >> ls0-lr0) >> > + test "$ls0_lr0" = $hv2_uuid >> > +]) >> >> This test case is flaky. If you put "sleep 3" here, you'll see that >> GAPR is issued on both interfaces. This is because the code for l3gw >> is affected by the same issue as fixed for regular ports in: >> >> https://patchwork.ozlabs.org/project/ovn/patch/20220407025822.3861491-1-ihrac...@redhat.com/ >> >> Namely, when the l3gw chassis is changed from hv1 to hv2, hv1 never >> gets the port unregistered from the queue of periodic GARPs. Strictly >> speaking, it is not the bug that you try to fix, but since you add a >> test case that switches chassis for a port, I think the fix should be >> included here. (Alternatively, the test should be split in two parts - >> one just setting a chassis and checking if only one of them observes >> GARPs; and the other part updating chassis and checking that the old >> chassis stops issuing GARPs.) >> > > Didn't know that, however when I was testing it I did not come across the > GARP being still sent > over the old interface. Maybe I have not waited long enough. > > Anyway if I understand that right, you suggest also fix from your patch > and leaving the test as is? > Problematic part is that we always need set both chassis to setup the > patch ports, otherwise we would > not see the garp on the second one. That being said I think the best > approach would be to squash the > fixes and leave this test since it tests both cases WDYT? > Actually looking through the code this scenario cannot hit it, even if I add sleep. In my test case the send_garp_rarp_delete code is called in the for each earlier, pinctrl.c:5771. So those cases are different, I'll add the sleep to avoid any flakiness but IMO those patches can be separated. > > >> >> > + >> > +OVN_CHECK_PACKETS([hv2/snoopvif-tx.pcap], [arp_expected]) >> > +OVN_CHECK_PACKETS([hv1/snoopvif-tx.pcap], [empty_expected]) >> > + >> > +OVN_CLEANUP([hv1],[hv2]) >> > + >> > +AT_CLEANUP >> > +]) >> > + >> > OVN_FOR_EACH_NORTHD([ >> > AT_SETUP([send gratuitous arp with nat-addresses router in localnet]) >> > ovn_start >> > -- >> > 2.35.1 >> > >> > _______________________________________________ >> > dev mailing list >> > d...@openvswitch.org >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> > >> >> > > -- > > Ales Musil > > Senior Software Engineer - RHV Network > > Red Hat EMEA <https://www.redhat.com> > > amu...@redhat.com IM: amusil > <https://red.ht/sig> > -- Ales Musil Senior Software Engineer - RHV Network Red Hat EMEA <https://www.redhat.com> amu...@redhat.com IM: amusil <https://red.ht/sig> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev