On Wed, Apr 20, 2022 at 2:46 AM Ales Musil <amu...@redhat.com> wrote:

>
>
> On Wed, Apr 20, 2022 at 8:14 AM Ales Musil <amu...@redhat.com> wrote:
>
>> Thank you for the review, see my reply below.
>>
>> On Tue, Apr 19, 2022 at 9:26 PM Ihar Hrachyshka <ihrac...@redhat.com>
>> wrote:
>>
>>> Thanks for the patch. Two comments for the test case below.
>>>
>>
>>> On Tue, Apr 19, 2022 at 1:12 AM Ales Musil <amu...@redhat.com> wrote:
>>> >
>>> > The GARP was sent even on chassis that were not serving
>>> > as l3gw for the specified router. This could lead to race
>>> > on the physical network when multiple chassis send the same
>>> > ARP but the physical interfaces have different port numbers
>>> > on the external bridge.
>>> >
>>> > Add check to filter out port_bindings for l3gw that
>>> > are sitting on different chassis.
>>> >
>>> > Reported-at: https://bugzilla.redhat.com/2062580
>>> > Signed-off-by: Ales Musil <amu...@redhat.com>
>>> > ---
>>> >  controller/pinctrl.c |   2 +-
>>> >  tests/ovn.at         | 106
>>> +++++++++++++++++++++++++++++++++++++++++++
>>> >  2 files changed, 107 insertions(+), 1 deletion(-)
>>> >
>>> > diff --git a/controller/pinctrl.c b/controller/pinctrl.c
>>> > index 25b37ee88..42d1c2a46 100644
>>> > --- a/controller/pinctrl.c
>>> > +++ b/controller/pinctrl.c
>>> > @@ -5554,7 +5554,7 @@ get_localnet_vifs_l3gwports(
>>> >          sbrec_port_binding_index_set_datapath(target, ld->datapath);
>>> >          SBREC_PORT_BINDING_FOR_EACH_EQUAL (pb, target,
>>> >
>>>  sbrec_port_binding_by_datapath) {
>>> > -            if (!strcmp(pb->type, "l3gateway")
>>> > +            if ((!strcmp(pb->type, "l3gateway") && pb->chassis ==
>>> chassis)
>>> >                  || !strcmp(pb->type, "patch")) {
>>> >                  sset_add(local_l3gw_ports, pb->logical_port);
>>> >              }
>>> > diff --git a/tests/ovn.at b/tests/ovn.at
>>> > index f9551b843..a02a9dab2 100644
>>> > --- a/tests/ovn.at
>>> > +++ b/tests/ovn.at
>>> > @@ -8707,6 +8707,112 @@ OVN_CLEANUP([hv1])
>>> >  AT_CLEANUP
>>> >  ])
>>> >
>>> > +OVN_FOR_EACH_NORTHD([
>>> > +AT_SETUP([send gratuitous arp for l3gateway only on selected chassis])
>>> > +ovn_start
>>> > +
>>> > +# Create logical switch
>>> > +ovn-nbctl ls-add ls0
>>> > +# Create gateway router
>>> > +ovn-nbctl lr-add lr0
>>> > +# Add router port to gateway router
>>> > +ovn-nbctl lrp-add lr0 lr0-ls0 f0:00:00:00:00:01 192.168.0.1/24
>>> > +ovn-nbctl lsp-add ls0 ls0-lr0 -- set Logical_Switch_Port ls0-lr0 \
>>> > +    type=router options:router-port=lr0-ls0
>>> addresses='"f0:00:00:00:00:01"'
>>> > +
>>> > +# Create a localnet port.
>>> > +ovn-nbctl lsp-add ls0 ln_port
>>> > +ovn-nbctl lsp-set-addresses ln_port unknown
>>> > +ovn-nbctl lsp-set-type ln_port localnet
>>> > +ovn-nbctl --wait=hv lsp-set-options ln_port network_name=physnet1
>>> > +
>>> > +# Prepare packets
>>> > +touch empty_expected
>>> > +echo
>>> "fffffffffffff0000000000108060001080006040001f00000000001c0a80001000000000000c0a80001"
>>> > arp_expected
>>> > +
>>> > +net_add n1
>>> > +sim_add hv1
>>> > +as hv1
>>> > +ovs-vsctl \
>>> > +    -- add-br br-phys \
>>> > +    -- add-br br-eth0
>>> > +
>>> > +ovn_attach n1 br-phys 192.168.0.10
>>> > +
>>> > +AT_CHECK([ovs-vsctl set Open_vSwitch .
>>> external-ids:ovn-bridge-mappings=physnet1:br-eth0])
>>> > +AT_CHECK([ovs-vsctl add-port br-eth0 snoopvif -- set Interface
>>> snoopvif options:tx_pcap=hv1/snoopvif-tx.pcap
>>> options:rxq_pcap=hv1/snoopvif-rx.pcap])
>>> > +
>>> > +sim_add hv2
>>> > +as hv2
>>> > +ovs-vsctl \
>>> > +    -- add-br br-phys \
>>> > +    -- add-br br-eth0
>>> > +
>>> > +ovn_attach n1 br-phys 192.168.0.20
>>> > +
>>> > +AT_CHECK([ovs-vsctl set Open_vSwitch .
>>> external-ids:ovn-bridge-mappings=physnet1:br-eth0])
>>> > +AT_CHECK([ovs-vsctl add-port br-eth0 snoopvif -- set Interface
>>> snoopvif options:tx_pcap=hv2/snoopvif-tx.pcap
>>> options:rxq_pcap=hv2/snoopvif-rx.pcap])
>>> > +
>>> > +ovn-sbctl dump-flows > sbflows
>>> > +AT_CAPTURE_FILE([sbflows])
>>> > +
>>> > +# Wait until the patch ports are created in hv1 and hv2 to connect
>>> br-int to br-eth0
>>> > +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv1])
>>> > +OVS_WAIT_UNTIL([test 1 = `as hv1 ovs-vsctl show | \
>>> > +grep "Port patch-br-int-to-ln_port" | wc -l`])
>>> > +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv2])
>>> > +OVS_WAIT_UNTIL([test 1 = `as hv2 ovs-vsctl show | \
>>> > +grep "Port patch-br-int-to-ln_port" | wc -l`])
>>> > +
>>> > +# Temporarily remove lr0 chassis
>>> > +AT_CHECK([ovn-nbctl remove logical_router lr0 options chassis])
>>> > +
>>> > +reset_pcap_file() {
>>> > +    local hv=$1
>>> > +    local iface=$2
>>> > +    local pcap_file=$3
>>> > +    as $hv
>>> > +    ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
>>> > +options:rxq_pcap=dummy-rx.pcap
>>> > +    rm -f ${pcap_file}*.pcap
>>> > +    ovs-vsctl -- set Interface $iface
>>> options:tx_pcap=${pcap_file}-tx.pcap \
>>> > +options:rxq_pcap=${pcap_file}-rx.pcap
>>> > +}
>>> > +
>>> > +reset_pcap_file hv1 snoopvif hv1/snoopvif
>>> > +reset_pcap_file hv2 snoopvif hv2/snoopvif
>>> > +
>>> > +hv1_uuid=$(ovn-sbctl --bare --columns _uuid list chassis hv1)
>>> > +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv1])
>>> > +OVS_WAIT_UNTIL([
>>> > +    ls0_lr0=$(ovn-sbctl --bare --columns chassis list port_binding
>>> ls0-lr0)
>>> > +    test "$ls0_lr0" = $hv1_uuid
>>> > +])
>>>
>>> Since GARPs are periodic, you should give controllers some time (with
>>> a sleep?) to produce the GARPs.
>>>
>>
>> It shouldn't be needed as the OVN_CHECK_PACKETS is waiting for it.
>> On the other hand in the spirit of potential flakiness it is probably a
>> good idea.
>>
>>
>>>
>>> > +
>>> > +OVN_CHECK_PACKETS([hv1/snoopvif-tx.pcap], [arp_expected])
>>> > +OVN_CHECK_PACKETS([hv2/snoopvif-tx.pcap], [empty_expected])
>>> > +
>>> > +# Temporarily remove lr0 chassis
>>> > +AT_CHECK([ovn-nbctl remove logical_router lr0 options chassis])
>>> > +
>>> > +reset_pcap_file hv1 snoopvif hv1/snoopvif
>>> > +reset_pcap_file hv2 snoopvif hv2/snoopvif
>>> > +
>>> > +hv2_uuid=$(ovn-sbctl --bare --columns _uuid list chassis hv2)
>>> > +AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv2])
>>> > +OVS_WAIT_UNTIL([
>>> > +    ls0_lr0=$(ovn-sbctl --bare --columns chassis list port_binding
>>> ls0-lr0)
>>> > +    test "$ls0_lr0" = $hv2_uuid
>>> > +])
>>>
>>> This test case is flaky. If you put "sleep 3" here, you'll see that
>>> GAPR is issued on both interfaces. This is because the code for l3gw
>>> is affected by the same issue as fixed for regular ports in:
>>>
>>> https://patchwork.ozlabs.org/project/ovn/patch/20220407025822.3861491-1-ihrac...@redhat.com/
>>>
>>> Namely, when the l3gw chassis is changed from hv1 to hv2, hv1 never
>>> gets the port unregistered from the queue of periodic GARPs. Strictly
>>> speaking, it is not the bug that you try to fix, but since you add a
>>> test case that switches chassis for a port, I think the fix should be
>>> included here. (Alternatively, the test should be split in two parts -
>>> one just setting a chassis and checking if only one of them observes
>>> GARPs; and the other part updating chassis and checking that the old
>>> chassis stops issuing GARPs.)
>>>
>>
>> Didn't know that, however when I was testing it I did not come across the
>> GARP being still sent
>> over the old interface. Maybe I have not waited long enough.
>>
>> Anyway if I understand that right, you suggest also fix from your patch
>> and leaving the test as is?
>> Problematic part is that we always need set both chassis to setup the
>> patch ports, otherwise we would
>> not see the garp on the second one. That being said I think the best
>> approach would be to squash the
>> fixes and leave this test since it tests both cases WDYT?
>>
>
> Actually looking through the code this scenario cannot hit it, even if I
> add sleep. In my test case
> the send_garp_rarp_delete code is called in the for each earlier,
> pinctrl.c:5771. So those cases are different,
> I'll add the sleep to avoid any flakiness but IMO those patches can be
> separated.
>
>

You're right, sorry, my fault. Indeed since you filter out ports that don't
belong to the chassis before sset_add, you don't need to explicitly delete
it later. (I should do the same in my patch.)

The test changes - sleep to give chassis time to produce a GARP in case of
"empty" check + _PACKETS_CONTAIN to accommodate for multiple GARPs
generated - stand. Thank you.


>
>>
>>>
>>> > +
>>> > +OVN_CHECK_PACKETS([hv2/snoopvif-tx.pcap], [arp_expected])
>>> > +OVN_CHECK_PACKETS([hv1/snoopvif-tx.pcap], [empty_expected])
>>> > +
>>> > +OVN_CLEANUP([hv1],[hv2])
>>> > +
>>> > +AT_CLEANUP
>>> > +])
>>> > +
>>> >  OVN_FOR_EACH_NORTHD([
>>> >  AT_SETUP([send gratuitous arp with nat-addresses router in localnet])
>>> >  ovn_start
>>> > --
>>> > 2.35.1
>>> >
>>> > _______________________________________________
>>> > dev mailing list
>>> > d...@openvswitch.org
>>> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>> >
>>>
>>>
>>
>> --
>>
>> Ales Musil
>>
>> Senior Software Engineer - RHV Network
>>
>> Red Hat EMEA <https://www.redhat.com>
>>
>> amu...@redhat.com    IM: amusil
>> <https://red.ht/sig>
>>
>
>
> --
>
> Ales Musil
>
> Senior Software Engineer - RHV Network
>
> Red Hat EMEA <https://www.redhat.com>
>
> amu...@redhat.com    IM: amusil
> <https://red.ht/sig>
>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to