[ovs-dev] [PATCH ovn 00/15] Fixed another set of flaky Unit Tests

2023-09-18 Thread Xavier Simonart
Xavier Simonart (15):
  tests: fixed multiple ovn-ic tests
  tests: fixed "Logical router policy packet marking"
  tests: fixed "options:requested-chassis for logical port"
  tests: tests fixed "Encaps tunnel cleanup ..." and "ovn-controller
exit"
  tests: wait for all flows to be installed before sending packets
  tests: fixed multiple tests not   properly waiting for packets to
be received
  tests: fixed multiple tests missing ovn-nbctl "wait"
  tests: fixed "L2 Drop and Allow ACL w/ Stateful ACL"
  tests: skip test "MAC binding aging" if scapy not available.
  tests: move trim_zeros() to ovn-macros
  tests: fixed "send gratuitous ARP for NAT rules on HA distributed
router"
  tests: fixed "SCTP Load balancer health checks"a
  tests: fixed "LSP incremental processing"
  tests: fixed "ipsec -- basic configuration"
  tests: do not start northd-backup for northd tests querying northd

 tests/ovn-ic.at |  14 +-
 tests/ovn-ipsec.at  |   4 +-
 tests/ovn-macros.at |   4 +
 tests/ovn-northd.at |  27 ++-
 tests/ovn.at| 484 
 5 files changed, 207 insertions(+), 326 deletions(-)

-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 00/15] Fixed another set of flaky Unit Tests

2023-09-19 Thread Mark Michelson

Thanks for the patch set Xavier.

With the exception of patch 13:

Acked-by: Mark Michelson 

I'm acking the patch set, but the nature of these changes worries me. I 
could easily see another patch series like this needed in the future 
because it's too easy to write flaky tests. Here are some high-level 
ideas I have based on changes I see in these patches.


1) From patches 12 and 15: Should we change ovn_start() so that by 
default it does not start a backup northd process? I suspect the number 
of tests that rely on a backup northd process are very small.


2) From patch 5 and possibly patch 14: Should we add a new --wait option 
to ovn-nbctl (e.g. "--wait=ovs") that waits for OVS to provide a 
confirmation that flows from this ovn-nbctl call have been installed?


3) From patch 4: Would it be possible to provide a "blocking" version of 
`ovn-appctl -t ovn-controller exit` ? In other words, could we make a 
version that will not return control to the shell until the process has 
exited? Barring that, we could write a macro for ovn-macros.at that will 
call `ovn-appctl -t ovn-controller exit` and then block until the 
process has exited. We could then always call that macro when we want to 
shut ovn-controller down.


On 9/18/23 12:46, Xavier Simonart wrote:

Xavier Simonart (15):
   tests: fixed multiple ovn-ic tests
   tests: fixed "Logical router policy packet marking"
   tests: fixed "options:requested-chassis for logical port"
   tests: tests fixed "Encaps tunnel cleanup ..." and "ovn-controller
 exit"
   tests: wait for all flows to be installed before sending packets
   tests: fixed multiple tests not  properly waiting for packets to
 be received
   tests: fixed multiple tests missing ovn-nbctl "wait"
   tests: fixed "L2 Drop and Allow ACL w/ Stateful ACL"
   tests: skip test "MAC binding aging" if scapy not available.
   tests: move trim_zeros() to ovn-macros
   tests: fixed "send gratuitous ARP for NAT rules on HA distributed
 router"
   tests: fixed "SCTP Load balancer health checks"a
   tests: fixed "LSP incremental processing"
   tests: fixed "ipsec -- basic configuration"
   tests: do not start northd-backup for northd tests querying northd

  tests/ovn-ic.at |  14 +-
  tests/ovn-ipsec.at  |   4 +-
  tests/ovn-macros.at |   4 +
  tests/ovn-northd.at |  27 ++-
  tests/ovn.at| 484 
  5 files changed, 207 insertions(+), 326 deletions(-)



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 00/15] Fixed another set of flaky Unit Tests

2023-09-21 Thread Xavier Simonart
Hi Mark

Thanks for the review. See comments below.
Thanks
Xavier


On Tue, Sep 19, 2023 at 5:29 PM Mark Michelson  wrote:

> Thanks for the patch set Xavier.
>
> With the exception of patch 13:
>
> Acked-by: Mark Michelson 
>
> I'm acking the patch set, but the nature of these changes worries me. I
> could easily see another patch series like this needed in the future
> because it's too easy to write flaky tests. Here are some high-level
> ideas I have based on changes I see in these patches.
>
> +1

> 1) From patches 12 and 15: Should we change ovn_start() so that by
> default it does not start a backup northd process? I suspect the number
> of tests that rely on a backup northd process are very small.
>
Changing this would impact all tests, including all which do not query
ovn-northd through appctl, and hence do not suffer from this issue.
However, I am not sure that starting northd-backup makes so much sense,
even for those tests.
So I agree. If nobody disagrees, I will change the default in ovn-start
(i.e. default = w/o northd), and add an option --with-northd-backup for the
tests which require the backup to be running)

>
> 2) From patch 5 and possibly patch 14: Should we add a new --wait option
> to ovn-nbctl (e.g. "--wait=ovs") that waits for OVS to provide a
> confirmation that flows from this ovn-nbctl call have been installed?
>
This one is more complex. Waiting for some flows to be present was an easy
way to fix test cases for which wait_for_port_up was not sufficient (either
because flows are added later, such as remote ports related ones, or
because port_up itself is reported too early by ovn-controller).The proper
fix might require changing ovn-controller so that it reports ports up
correctly.

>
> 3) From patch 4: Would it be possible to provide a "blocking" version of
> `ovn-appctl -t ovn-controller exit` ? In other words, could we make a
> version that will not return control to the shell until the process has
> exited?

I guess we could do something like "if command is exit (or a new one such
as exit_wait - so we can keep current behaviour), then get the pid through
unixctl, send the exit, and check for pid termination with a timeout"

Barring that, we could write a macro for ovn-macros.at that will
> call `ovn-appctl -t ovn-controller exit` and then block until the
> process has exited. We could then always call that macro when we want to
> shut ovn-controller down.
>
I'll add the macro in a v2 - it's a small change. Nothing prevents up to
add the blocking ovn-appctl exit afterwards (and change the macro)

>
>
There is at least one more similar issue I noted: some tests fail because
pinctrl is slow to start (and first packets arrive before it's started),
and packets are lost.
I earlier added in some tests WAIT_UNTIL checking ovn-controller.log that
pinctl is connected, but this should be done for more tests.
We could change the tests for this, or update ovn-controller so that it
only starts looping when pinctrl is connected

Thanks
Xavier

On 9/18/23 12:46, Xavier Simonart wrote:
> > Xavier Simonart (15):
> >tests: fixed multiple ovn-ic tests
> >tests: fixed "Logical router policy packet marking"
> >tests: fixed "options:requested-chassis for logical port"
> >tests: tests fixed "Encaps tunnel cleanup ..." and "ovn-controller
> >  exit"
> >tests: wait for all flows to be installed before sending packets
> >tests: fixed multiple tests notproperly waiting for packets to
> >  be received
> >tests: fixed multiple tests missing ovn-nbctl "wait"
> >tests: fixed "L2 Drop and Allow ACL w/ Stateful ACL"
> >tests: skip test "MAC binding aging" if scapy not available.
> >tests: move trim_zeros() to ovn-macros
> >tests: fixed "send gratuitous ARP for NAT rules on HA distributed
> >  router"
> >tests: fixed "SCTP Load balancer health checks"a
> >tests: fixed "LSP incremental processing"
> >tests: fixed "ipsec -- basic configuration"
> >tests: do not start northd-backup for northd tests querying northd
> >
> >   tests/ovn-ic.at |  14 +-
> >   tests/ovn-ipsec.at  |   4 +-
> >   tests/ovn-macros.at |   4 +
> >   tests/ovn-northd.at |  27 ++-
> >   tests/ovn.at| 484 
> >   5 files changed, 207 insertions(+), 326 deletions(-)
> >
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 00/15] Fixed another set of flaky Unit Tests

2023-10-06 Thread Dumitru Ceara
Hi Xavier, Mark,

On 9/21/23 15:27, Xavier Simonart wrote:
> Hi Mark
> 
> Thanks for the review. See comments below.
> Thanks
> Xavier
> 
> 
> On Tue, Sep 19, 2023 at 5:29 PM Mark Michelson  wrote:
> 
>> Thanks for the patch set Xavier.
>>
>> With the exception of patch 13:
>>
>> Acked-by: Mark Michelson 
>>
>> I'm acking the patch set, but the nature of these changes worries me. I
>> could easily see another patch series like this needed in the future
>> because it's too easy to write flaky tests. Here are some high-level
>> ideas I have based on changes I see in these patches.
>>
>> +1

I'm planning to apply the patches that have no ongoing discussion, that
is: 1-3, 6-11, 13.

For the rest I'll wait for the discussion to settle or for a v2.  Some
more remarks inline.

Thanks a lot for improving the tests, that's a very important aspect!

> 
>> 1) From patches 12 and 15: Should we change ovn_start() so that by
>> default it does not start a backup northd process? I suspect the number
>> of tests that rely on a backup northd process are very small.
>>
> Changing this would impact all tests, including all which do not query
> ovn-northd through appctl, and hence do not suffer from this issue.
> However, I am not sure that starting northd-backup makes so much sense,
> even for those tests.
> So I agree. If nobody disagrees, I will change the default in ovn-start
> (i.e. default = w/o northd), and add an option --with-northd-backup for the
> tests which require the backup to be running)
> 

+1 for this.  I guess it will also positively impact run times in
constrained environments like GitHub CI.

>>
>> 2) From patch 5 and possibly patch 14: Should we add a new --wait option
>> to ovn-nbctl (e.g. "--wait=ovs") that waits for OVS to provide a
>> confirmation that flows from this ovn-nbctl call have been installed?
>>
> This one is more complex. Waiting for some flows to be present was an easy
> way to fix test cases for which wait_for_port_up was not sufficient (either
> because flows are added later, such as remote ports related ones, or
> because port_up itself is reported too early by ovn-controller).The proper
> fix might require changing ovn-controller so that it reports ports up
> correctly.
> 

This is more of a test issue IMO; I think people expect eventual
consistency from OVN so the current behavior should be fine, just harder
to test.  I'd vote for a change in the test instead of complex changes
to ovn-controller.

>>
>> 3) From patch 4: Would it be possible to provide a "blocking" version of
>> `ovn-appctl -t ovn-controller exit` ? In other words, could we make a
>> version that will not return control to the shell until the process has
>> exited?
> 
> I guess we could do something like "if command is exit (or a new one such
> as exit_wait - so we can keep current behaviour), then get the pid through
> unixctl, send the exit, and check for pid termination with a timeout"
> 
> Barring that, we could write a macro for ovn-macros.at that will
>> call `ovn-appctl -t ovn-controller exit` and then block until the
>> process has exited. We could then always call that macro when we want to
>> shut ovn-controller down.
>>
> I'll add the macro in a v2 - it's a small change. Nothing prevents up to
> add the blocking ovn-appctl exit afterwards (and change the macro)
> 

Ales just posted this patch, would it work?

https://patchwork.ozlabs.org/project/ovn/patch/20231006070218.308957-1-amu...@redhat.com/

>>
>>
> There is at least one more similar issue I noted: some tests fail because
> pinctrl is slow to start (and first packets arrive before it's started),
> and packets are lost.
> I earlier added in some tests WAIT_UNTIL checking ovn-controller.log that
> pinctl is connected, but this should be done for more tests.
> We could change the tests for this, or update ovn-controller so that it
> only starts looping when pinctrl is connected
> 

+1 for delaying ovn-controller main processing loop start.

> Thanks
> Xavier
> 

Regards,
Dumitru

> On 9/18/23 12:46, Xavier Simonart wrote:
>>> Xavier Simonart (15):
>>>tests: fixed multiple ovn-ic tests
>>>tests: fixed "Logical router policy packet marking"
>>>tests: fixed "options:requested-chassis for logical port"
>>>tests: tests fixed "Encaps tunnel cleanup ..." and "ovn-controller
>>>  exit"
>>>tests: wait for all flows to be installed before sending packets
>>>tests: fixed multiple tests notproperly waiting for packets to
>>>  be received
>>>tests: fixed multiple tests missing ovn-nbctl "wait"
>>>tests: fixed "L2 Drop and Allow ACL w/ Stateful ACL"
>>>tests: skip test "MAC binding aging" if scapy not available.
>>>tests: move trim_zeros() to ovn-macros
>>>tests: fixed "send gratuitous ARP for NAT rules on HA distributed
>>>  router"
>>>tests: fixed "SCTP Load balancer health checks"a
>>>tests: fixed "LSP incremental processing"
>>>tests: fixed "ipsec -- basic configuration"
>>>tests: do not

Re: [ovs-dev] [PATCH ovn 00/15] Fixed another set of flaky Unit Tests

2023-10-06 Thread Dumitru Ceara
On 10/6/23 09:08, Dumitru Ceara wrote:
> Hi Xavier, Mark,
> 
> On 9/21/23 15:27, Xavier Simonart wrote:
>> Hi Mark
>>
>> Thanks for the review. See comments below.
>> Thanks
>> Xavier
>>
>>
>> On Tue, Sep 19, 2023 at 5:29 PM Mark Michelson  wrote:
>>
>>> Thanks for the patch set Xavier.
>>>
>>> With the exception of patch 13:
>>>
>>> Acked-by: Mark Michelson 
>>>
>>> I'm acking the patch set, but the nature of these changes worries me. I
>>> could easily see another patch series like this needed in the future
>>> because it's too easy to write flaky tests. Here are some high-level
>>> ideas I have based on changes I see in these patches.
>>>
>>> +1
> 
> I'm planning to apply the patches that have no ongoing discussion, that
> is: 1-3, 6-11, 13.
> 

In the end I also skipped patch 6 because of merge conflicts and 13
because there's an ongoing discussion:
https://patchwork.ozlabs.org/project/ovn/patch/20230918164714.3144984-14-xsimo...@redhat.com/

I applied the other 8 patches to main and backported them where applicable.

> For the rest I'll wait for the discussion to settle or for a v2.  Some
> more remarks inline.
> 
> Thanks a lot for improving the tests, that's a very important aspect!
> 

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 00/15] Fixed another set of flaky Unit Tests

2023-10-23 Thread Xavier Simonart
Hi Mark, Dumitru

Thanks for the feedback

On Fri, Oct 6, 2023 at 12:57 PM Dumitru Ceara  wrote:
>
> On 10/6/23 09:08, Dumitru Ceara wrote:
> > Hi Xavier, Mark,
> >
> > On 9/21/23 15:27, Xavier Simonart wrote:
> >> Hi Mark
> >>
> >> Thanks for the review. See comments below.
> >> Thanks
> >> Xavier
> >>
> >>
> >> On Tue, Sep 19, 2023 at 5:29 PM Mark Michelson  wrote:
> >>
> >>> Thanks for the patch set Xavier.
> >>>
> >>> With the exception of patch 13:
> >>>
> >>> Acked-by: Mark Michelson 
> >>>
> >>> I'm acking the patch set, but the nature of these changes worries me. I
> >>> could easily see another patch series like this needed in the future
> >>> because it's too easy to write flaky tests. Here are some high-level
> >>> ideas I have based on changes I see in these patches.
> >>>
> >>> +1
> >
> > I'm planning to apply the patches that have no ongoing discussion, that
> > is: 1-3, 6-11, 13.
> >
>
> In the end I also skipped patch 6 because of merge conflicts and 13
> because there's an ongoing discussion:
> https://patchwork.ozlabs.org/project/ovn/patch/20230918164714.3144984-14-xsimo...@redhat.com/
>
> I applied the other 8 patches to main and backported them where applicable.
>
I'll send a v2 for the remaining patches 5, 6, 12, 13, 14 and 15. As
Dumitru pointed out patch 4 is not needed anymore as already fixed by
Ales.
Patch 5 (wait for all flows to be installed before sending packets) &
14 (fixed "ipsec -- basic configuration"): v2 is just a rebase, based
on Dumitru's feedback (and Mark's initial ack). If we (later) find a
proper fix to make test cases easier to write, we should definitely do
it.
Patch 6 (fixed multiple tests not properly waiting for packets to be
received):  rebased.
Patch 12 ( fixed "SCTP Load balancer health checks") & 15 (do not
start northd-backup for northd tests querying northd ): v2 merges the
two patches and does not start northd-backup anymore by default, based
on Mark's feedback.
Patch 13 (fixed "LSP incremental processing") v2 uses a min/max based
on Mark's feedback.

Thanks
Xavier

Xavier

> > For the rest I'll wait for the discussion to settle or for a v2.  Some
> > more remarks inline.
> >
> > Thanks a lot for improving the tests, that's a very important aspect!
> >
>
> Regards,
> Dumitru
>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev