Thanks for the patch. Two comments for the test case below.

On Tue, Apr 19, 2022 at 1:12 AM Ales Musil <amu...@redhat.com> wrote:
>
> The GARP was sent even on chassis that were not serving
> as l3gw for the specified router. This could lead to race
> on the physical network when multiple chassis send the same
> ARP but the physical interfaces have different port numbers
> on the external bridge.
>
> Add check to filter out port_bindings for l3gw that
> are sitting on different chassis.
>
> Reported-at: https://bugzilla.redhat.com/2062580
> Signed-off-by: Ales Musil <amu...@redhat.com>
> ---
>  controller/pinctrl.c |   2 +-
>  tests/ovn.at         | 106 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 107 insertions(+), 1 deletion(-)
>
> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> index 25b37ee88..42d1c2a46 100644
> --- a/controller/pinctrl.c
> +++ b/controller/pinctrl.c
> @@ -5554,7 +5554,7 @@ get_localnet_vifs_l3gwports(
>          sbrec_port_binding_index_set_datapath(target, ld->datapath);
>          SBREC_PORT_BINDING_FOR_EACH_EQUAL (pb, target,
>                                             sbrec_port_binding_by_datapath) {
> -            if (!strcmp(pb->type, "l3gateway")
> +            if ((!strcmp(pb->type, "l3gateway") && pb->chassis == chassis)
>                  || !strcmp(pb->type, "patch")) {
>                  sset_add(local_l3gw_ports, pb->logical_port);
>              }
> diff --git a/tests/ovn.at b/tests/ovn.at
> index f9551b843..a02a9dab2 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -8707,6 +8707,112 @@ OVN_CLEANUP([hv1])
>  AT_CLEANUP
>  ])
>
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([send gratuitous arp for l3gateway only on selected chassis])
> +ovn_start
> +
> +# Create logical switch
> +ovn-nbctl ls-add ls0
> +# Create gateway router
> +ovn-nbctl lr-add lr0
> +# Add router port to gateway router
> +ovn-nbctl lrp-add lr0 lr0-ls0 f0:00:00:00:00:01 192.168.0.1/24
> +ovn-nbctl lsp-add ls0 ls0-lr0 -- set Logical_Switch_Port ls0-lr0 \
> +    type=router options:router-port=lr0-ls0 addresses='"f0:00:00:00:00:01"'
> +
> +# Create a localnet port.
> +ovn-nbctl lsp-add ls0 ln_port
> +ovn-nbctl lsp-set-addresses ln_port unknown
> +ovn-nbctl lsp-set-type ln_port localnet
> +ovn-nbctl --wait=hv lsp-set-options ln_port network_name=physnet1
> +
> +# Prepare packets
> +touch empty_expected
> +echo 
> "fffffffffffff0000000000108060001080006040001f00000000001c0a80001000000000000c0a80001"
>  > arp_expected
> +
> +net_add n1
> +sim_add hv1
> +as hv1
> +ovs-vsctl \
> +    -- add-br br-phys \
> +    -- add-br br-eth0
> +
> +ovn_attach n1 br-phys 192.168.0.10
> +
> +AT_CHECK([ovs-vsctl set Open_vSwitch . 
> external-ids:ovn-bridge-mappings=physnet1:br-eth0])
> +AT_CHECK([ovs-vsctl add-port br-eth0 snoopvif -- set Interface snoopvif 
> options:tx_pcap=hv1/snoopvif-tx.pcap options:rxq_pcap=hv1/snoopvif-rx.pcap])
> +
> +sim_add hv2
> +as hv2
> +ovs-vsctl \
> +    -- add-br br-phys \
> +    -- add-br br-eth0
> +
> +ovn_attach n1 br-phys 192.168.0.20
> +
> +AT_CHECK([ovs-vsctl set Open_vSwitch . 
> external-ids:ovn-bridge-mappings=physnet1:br-eth0])
> +AT_CHECK([ovs-vsctl add-port br-eth0 snoopvif -- set Interface snoopvif 
> options:tx_pcap=hv2/snoopvif-tx.pcap options:rxq_pcap=hv2/snoopvif-rx.pcap])
> +
> +ovn-sbctl dump-flows > sbflows
> +AT_CAPTURE_FILE([sbflows])
> +
> +# Wait until the patch ports are created in hv1 and hv2 to connect br-int to 
> br-eth0
> +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv1])
> +OVS_WAIT_UNTIL([test 1 = `as hv1 ovs-vsctl show | \
> +grep "Port patch-br-int-to-ln_port" | wc -l`])
> +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv2])
> +OVS_WAIT_UNTIL([test 1 = `as hv2 ovs-vsctl show | \
> +grep "Port patch-br-int-to-ln_port" | wc -l`])
> +
> +# Temporarily remove lr0 chassis
> +AT_CHECK([ovn-nbctl remove logical_router lr0 options chassis])
> +
> +reset_pcap_file() {
> +    local hv=$1
> +    local iface=$2
> +    local pcap_file=$3
> +    as $hv
> +    ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
> +options:rxq_pcap=dummy-rx.pcap
> +    rm -f ${pcap_file}*.pcap
> +    ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \
> +options:rxq_pcap=${pcap_file}-rx.pcap
> +}
> +
> +reset_pcap_file hv1 snoopvif hv1/snoopvif
> +reset_pcap_file hv2 snoopvif hv2/snoopvif
> +
> +hv1_uuid=$(ovn-sbctl --bare --columns _uuid list chassis hv1)
> +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv1])
> +OVS_WAIT_UNTIL([
> +    ls0_lr0=$(ovn-sbctl --bare --columns chassis list port_binding ls0-lr0)
> +    test "$ls0_lr0" = $hv1_uuid
> +])

Since GARPs are periodic, you should give controllers some time (with
a sleep?) to produce the GARPs.

> +
> +OVN_CHECK_PACKETS([hv1/snoopvif-tx.pcap], [arp_expected])
> +OVN_CHECK_PACKETS([hv2/snoopvif-tx.pcap], [empty_expected])
> +
> +# Temporarily remove lr0 chassis
> +AT_CHECK([ovn-nbctl remove logical_router lr0 options chassis])
> +
> +reset_pcap_file hv1 snoopvif hv1/snoopvif
> +reset_pcap_file hv2 snoopvif hv2/snoopvif
> +
> +hv2_uuid=$(ovn-sbctl --bare --columns _uuid list chassis hv2)
> +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv2])
> +OVS_WAIT_UNTIL([
> +    ls0_lr0=$(ovn-sbctl --bare --columns chassis list port_binding ls0-lr0)
> +    test "$ls0_lr0" = $hv2_uuid
> +])

This test case is flaky. If you put "sleep 3" here, you'll see that
GAPR is issued on both interfaces. This is because the code for l3gw
is affected by the same issue as fixed for regular ports in:
https://patchwork.ozlabs.org/project/ovn/patch/20220407025822.3861491-1-ihrac...@redhat.com/

Namely, when the l3gw chassis is changed from hv1 to hv2, hv1 never
gets the port unregistered from the queue of periodic GARPs. Strictly
speaking, it is not the bug that you try to fix, but since you add a
test case that switches chassis for a port, I think the fix should be
included here. (Alternatively, the test should be split in two parts -
one just setting a chassis and checking if only one of them observes
GARPs; and the other part updating chassis and checking that the old
chassis stops issuing GARPs.)

> +
> +OVN_CHECK_PACKETS([hv2/snoopvif-tx.pcap], [arp_expected])
> +OVN_CHECK_PACKETS([hv1/snoopvif-tx.pcap], [empty_expected])
> +
> +OVN_CLEANUP([hv1],[hv2])
> +
> +AT_CLEANUP
> +])
> +
>  OVN_FOR_EACH_NORTHD([
>  AT_SETUP([send gratuitous arp with nat-addresses router in localnet])
>  ovn_start
> --
> 2.35.1
>
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to