date:20230331

Re: [ovs-dev] [PATCH] net: openvswitch: fix race on port output

2023-03-31 Thread Jakub Kicinski

On Fri, 31 Mar 2023 08:19:09 + Felix Huettner wrote:
> 1. An openvswitch instance with one bridge and default flows
> 2. two network namespaces "server" and "client"
> 3. two ovs interfaces "server" and "client" on the bridge
> 4. for each ovs interface a veth pair with a matching name and 32 rx and
>tx queues
> 5. move the ends of the veth pairs to the respective network namespaces
> 6. assign ip addresses to each of the veth ends in the namespaces (needs
>to be the same subnet)
> 7. start some http server on the server network namespace
> 8. test if a client in the client namespace can reach the http server

grunt.. please read:
https://www.kernel.org/doc/html/next/process/maintainer-netdev.html
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] net: openvswitch: fix race on port output

2023-03-31 Thread Jakub Kicinski

On Fri, 31 Mar 2023 06:25:13 + Felix Hüttner wrote:
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 253584777101..6628323b7bea 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3199,6 +3199,7 @@ static u16 skb_tx_hash(const struct net_device *dev,
> }
> 
> if (skb_rx_queue_recorded(skb)) {
> +   BUG_ON(unlikely(qcount == 0));

DEBUG_NET_WARN_ON()

> hash = skb_get_rx_queue(skb);
> if (hash >= qoffset)
> hash -= qoffset;
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index ca3ebfdb3023..33b317e5f9a5 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -913,7 +913,7 @@ static void do_output(struct datapath *dp, struct sk_buff 
> *skb, int out_port,
>  {
> struct vport *vport = ovs_vport_rcu(dp, out_port);
> 
> -   if (likely(vport)) {
> +   if (likely(vport && vport->dev->reg_state == NETREG_REGISTERED)) {

Without looking too closely netif_carrier_ok() seems like a more
appropriate check for liveness on the datapath?

> u16 mru = OVS_CB(skb)->mru;
> u32 cutlen = OVS_CB(skb)->cutlen;
> 
> --
> 2.40.0
> 
> Diese E Mail enthält möglicherweise vertrauliche Inhalte und ist nur für die 
> Verwertung durch den vorgesehenen Empfänger bestimmt. Sollten Sie nicht der 
> vorgesehene Empfänger sein, setzen Sie den Absender bitte unverzüglich in 
> Kenntnis und löschen diese E Mail. Hinweise zum Datenschutz finden Sie 
> hier.

You gotta get rid of this to work upstream.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v2] ovs-tcpdump: Stdout is shutdown before ovs-tcpdump exit

2023-03-31 Thread Songtao Zhan

To: d...@openvswitch.org,
i.maxim...@ovn.org

If there is a pipe behind ovs-tcpdump(such as ovs-tcpdump -i eth0
| grep "192.168.1.1"), the child process (grep "192.168.1.1") may
exit first and close the pipe when received SIGTERM. When farther
process(ovs-tcpdump) exit, stdout is flushed into broken pipe, and
then received a exception IOError. To avoid such problems, ovs-tcp
dump first close stdout before exit.

Signed-off-by: Songtao Zhan 
---
 utilities/ovs-tcpdump.in | 12 
 1 file changed, 12 insertions(+)

diff --git a/utilities/ovs-tcpdump.in b/utilities/ovs-tcpdump.in
index a49ec9f94..b491647a5 100755
--- a/utilities/ovs-tcpdump.in
+++ b/utilities/ovs-tcpdump.in
@@ -22,6 +22,7 @@ import sys
 import time
 import struct
 import fcntl
+import signal

 try:
 from netifaces import interfaces
@@ -538,6 +539,17 @@ def main():
 print(data.decode('utf-8'))
 raise KeyboardInterrupt
 except KeyboardInterrupt:
+# If there is a pipe behind ovs-tcpdump (such as ovs-tcpdump
+# -i eth0 | grep "192.168.1.1"), the pipe is no longer available
+# after received ctrl+c.
+# If we write data to an unavailable pipe, a pipe error will be
+# reported, so we turn off stdout to avoid subsequence flushing
+# of data into the pipe.
+try:
+sys.stdout.close()
+except IOError:
+pass
+
 if pipes.poll() is None:
 pipes.terminate()

-- 
2.39.1




zhan...@chinatelecom.cn
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2] ovs-dpctl: Add new command dpctl/ct-sweep-next-run.

2023-03-31 Thread Paolo Valerio

Ilya Maximets  writes:

> On 2/27/23 13:30, Paolo Valerio wrote:
>> Since 3d9c1b855a5f ("conntrack: Replace timeout based expiration lists
>> with rculists.") the sweep interval changed as well as the constraints
>> related to the sweeper.
>> Being able to change the default reschedule time may be convenient in
>> some conditions, like debugging.
>> This patch introduces new commands allowing to get and set the sweep
>> next run in ms.
>> 
>> Signed-off-by: Paolo Valerio 
>> ---
>> v2:
>> - resolved conflict in NEWS
>> - added missing comment
>> - added missing '\' in dpctl.man
>> ---
>>  NEWS|4 +++
>>  lib/conntrack-private.h |1 +
>>  lib/conntrack.c |   18 +-
>>  lib/conntrack.h |2 ++
>>  lib/ct-dpif.c   |   14 +++
>>  lib/ct-dpif.h   |1 +
>>  lib/dpctl.c |   61 
>> +++
>>  lib/dpctl.man   |8 ++
>>  lib/dpif-netdev.c   |   17 +
>>  lib/dpif-netlink.c  |2 ++
>>  lib/dpif-provider.h |4 +++
>>  11 files changed, 131 insertions(+), 1 deletion(-)
>> 
>> diff --git a/NEWS b/NEWS
>> index 85b349621..4c4ef4b2b 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -10,6 +10,10 @@ Post-v3.1.0
>> in order to create OVSDB sockets with access mode of 0770.
>> - QoS:
>>   * Added new configuration option 'jitter' for a linux-netem QoS type.
>> +   - ovs-appctl:
>> + * New commands "dpctl/{ct-get-sweep-next-run,ct-set-sweep-next-run}" 
>> that
>> +   allow to get and set, for the userspace datapath, the next run 
>> interval
>> +   for the conntrack garbage collector.
>
> Hi, Paolo.  Thanks for the patch!
>
> It looks good to me in general, but the command name seems a bit
> strange.  It sounds like it is a one-shot configuration that only
> applies to the next run and will be dropped to default afterwards.
> But that doesn't seem to be the case in the code.  It's a permanent
> configuration for the sweep interval.  So, maybe we should call it
> ct-[gs]et-sweep-interval, or something like that ?
>
> What do you think?
>

Agreed, ct-[gs]et-sweep-interval seems a better name.

> Also, some small unit test, even a basic set+get check, would be
> nice to have.  We have some similar tests in tests/ofproto-dpif.at.
>

sure, I'll add it.

Thank you.

> Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn v3 1/2] expr: Remove supersets from OR expressions.

2023-03-31 Thread Han Zhou

On Fri, Mar 31, 2023 at 5:34 AM Ilya Maximets  wrote:
>
> On 3/31/23 06:39, Han Zhou wrote:
> >
> >
> > On Wed, Mar 29, 2023 at 9:05 AM Ilya Maximets mailto:i.maxim...@ovn.org>> wrote:
> >>
> >> While crushing OR expressions, OVN removes exact replicas of sub
> >> expressions.  However, there could be many CMP expressions that are
> >> supersets of each other.  These are most likely to be created as a
> >> result of cross-product while expanding brackets in the AND expression
> >> in crush_and_numeric(), i.e. while converting
> >> "x && (a0 || a1) && (b0 || b1)" into "xa0b0 || xa0b1 || xa1b0 ||
xa1b1".
> >>
> >> In addition to removal of exact duplicates introducing scan and removal
> >> of supersets of other existing sub-expressions to reduce the amount of
> >> generated flows.  This adds extra computations, but should save time
> >> later, since less flows will be generated.
> >>
> >> Example:
> >>
> >>   "ip4.src == 172.168.0.0/16  && ip4.src!={
172.168.13.0/24 , 172.168.15.0/24 <
http://172.168.15.0/24>}"
> >>
> >>   Processing of this expression yields 42 flows:
> >>
> >>   $ ./tests/ovstest test-ovn expr-to-flows <<< "$expr"
> >>
> >>   ip,nw_src=172.168.0.0/255.255.1.0 
> >>   ip,nw_src=172.168.0.0/255.255.10.0 
> >>   ip,nw_src=172.168.0.0/255.255.12.0 
> >>   ip,nw_src=172.168.0.0/255.255.3.0 
> >>   ip,nw_src=172.168.0.0/255.255.4.0 
> >>   ip,nw_src=172.168.0.0/255.255.5.0 
> >>   ip,nw_src=172.168.0.0/255.255.6.0 
> >>   ip,nw_src=172.168.0.0/255.255.8.0 
> >>   ip,nw_src=172.168.0.0/255.255.9.0 
> >>   ip,nw_src=172.168.128.0/17 
> >>   <... 32 more flows ...>
> >>
> >>   We can see that many flows above do overlap, e.g. 255.255.3.0
> >>   mask is a superset of 255.255.1.0.  Everything that matches
> >>   255.255.3.0, will match 255.255.1.0 as well (the value is the same).
> >>
> >>   By removing all the unnecessary supersets, the set of flows can
> >>   be reduced from 42 down to 7:
> >>
> >>   ip,nw_src=172.168.0.0/255.255.1.0 
> >>   ip,nw_src=172.168.0.0/255.255.4.0 
> >>   ip,nw_src=172.168.0.0/255.255.8.0 
> >>   ip,nw_src=172.168.128.0/17 
> >>   ip,nw_src=172.168.16.0/255.255.16.0 
> >>   ip,nw_src=172.168.32.0/255.255.32.0 
> >>   ip,nw_src=172.168.64.0/255.255.64.0 
> >>
> >> This change should be particularly useful for expressions with
> >> inequality checks, like the one above.  Such expressions are
> >> frequent among ACL rules.
> >>
> >>   "ip4.src != {172.168.13.0/24 ,
172.168.14.0/24 , 172.168.15.0/24 <
http://172.168.15.0/24>}"
> >>
> >>   Brefore:
> >>   $ ./tests/ovstest test-ovn expr-to-flows <<< "$expr" | wc -l
> >>   2894
> >>
> >>   After:
> >>   $ ./tests/ovstest test-ovn expr-to-flows <<< "$expr" | wc -l
> >>   23
> >>
> >> Superset lookups are performed only if there are expressions with
> >> more bits in the mask than in the current one.  So, there is no extra
> >> cost for equality checks on normal address sets, like port group sets,
> >> where all the IPs are exect matches or have the same prefix length
> >> otherwise.
> >>
> >> Use of bitmaps instead of subvalue functions significantly speeds up
> >> processing since most of the subvalue space is an all-zero empty space.
> >>
> >> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2177197 <
https://bugzilla.redhat.com/show_bug.cgi?id=2177197>
> >> Reported-by: Nadia Pinaeva >
> >> Signed-off-by: Ilya Maximets >
> >> ---
> >>  include/ovn/expr.h |   1 +
> >>  lib/expr.c | 238 ++---
> >>  tests/ovn.at    |  12 +++
> >>  3 files changed, 196 insertions(+), 55 deletions(-)
> >>
> >> diff --git a/include/ovn/expr.h b/include/ovn/expr.h
> >> index 80d95ff67..c48f82398 100644
> >> --- a/include/ovn/expr.h
> >> +++ b/include/ovn/expr.h
> >> @@ -384,6 +384,7 @@ struct expr {
> >>  struct {
> >>  union mf_subvalue value;
> >>  union mf_subvalue mask;
> >> +size_t mask_n_bits;
> >>  };
> >>  };
> >>  } cmp;
> >> diff --git a/lib/expr.c b/lib/expr.c
> >> index 0a6f7e574..8fd07b2d5 100644
> >> --- a/lib/expr.c
> >> +++ b/lib/expr.c
> >> @@ -15,21 +15,23 @@
> >>   */
> >>
> >>  #include 
> >> +#include "bitmap.h"
> >>  #include "byte-order.h"
> >> -#include "openvswitch/json.h"
> >> +#include "hmapx.h"
> >>  #include "nx-

Re: [ovs-dev] [PATCH] ofp-flow: TLVs should be ordered

2023-03-31 Thread Adrian Moreno


Hi Wan,

Thanks for submitting this patch.

On 11/18/22 09:13, Wan Junjie wrote:

According to the ovs-fields manpage, 'TLVs must be ordered so that
a field’s prerequisites are satisfied before if is parsed'.



I believe this sentence is talking about the binary Openflow (>1.2) format where 
match fields are encoded in TLV.


The way we represent flow matches in human-readable strings (for ovs-ofctl) is 
more flexible, allowing the user to express the matches in any arbitrary order. 
For example, we could add a flow such as:

"tcp_src=255,ip_src=1.1.1.1,ip,eth_src=ca:fe:ca:fe:ca:fe,tcp,actions=drop"

which is clearly in a very weird order, but if we looked at the wire we'd see 
the TLV sections of the OFP packet are in the right order.



However, a match pattern like 'tcp,ipv6' can be parsed as 'tcp6',
meanwhile 'ipv6,tcp' is parsed as 'tcp'. So a limitation here make
sure that the order should be respected.



I understand this might be a bit confusing.
Firstly, according to the ovs-fields(7) these are the meanings of the 
shorthands:

ip:   eth_type=0x0800,
ipv6: eth_type=0x86dd
tcp:  eth_type=0x0800,ip_proto=6
tcp6: eth_type=0x86dd,ip_proto=6

If we expand the shorthands:

"tcp,ipv6" == "eth_type=0x0800,ip_proto=6,eth_type=0x86dd"
"ipv6,tcp" == "eth_type=0x86dd,eth_type=0x0800,ip_proto=6"

So the last "eth_type" match prevails.


Signed-off-by: Wan Junjie 
---
  lib/ofp-flow.c|  3 +++
  tests/ofproto-dpif.at |  8 
  tests/ovs-ofctl.at| 15 +++
  3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/lib/ofp-flow.c b/lib/ofp-flow.c
index 3bc744f78..db9b75c7b 100644
--- a/lib/ofp-flow.c
+++ b/lib/ofp-flow.c
@@ -1580,6 +1580,9 @@ parse_ofp_str__(struct ofputil_flow_mod *fm, int command, 
char *string,
  char *error = NULL;
  
  if (ofp_parse_protocol(name, &p)) {

+if (match.flow.nw_proto) {
+return xstrdup("TLVs must be ordered");
+}
  match_set_dl_type(&match, htons(p->dl_type));
  if (p->nw_proto) {
  match_set_nw_proto(&match, p->nw_proto);
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index 728243183..a4a7ab5fb 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -11629,14 +11629,14 @@ table=1,priority=1,action=drop
  dnl
  dnl Table 2
  dnl
-table=2,priority=10,in_port=1,tcp,ct_zone=1,ct_state=+trk+new,tcp,action=ct(commit,zone=1),ct(table=3,zone=2)
-table=2,priority=10,in_port=1,tcp,ct_zone=1,ct_state=+trk+est,tcp,action=ct(table=3,zone=2)
+table=2,priority=10,in_port=1,tcp,ct_zone=1,ct_state=+trk+new,action=ct(commit,zone=1),ct(table=3,zone=2)
+table=2,priority=10,in_port=1,tcp,ct_zone=1,ct_state=+trk+est,action=ct(table=3,zone=2)
  table=2,priority=1,action=drop
  dnl
  dnl Table 3
  dnl
-table=3,priority=10,in_port=1,tcp,ct_zone=2,ct_state=+trk+new,tcp,action=ct(commit,zone=2),4
-table=3,priority=10,in_port=1,tcp,ct_zone=2,ct_state=+trk+est,tcp,action=3
+table=3,priority=10,in_port=1,tcp,ct_zone=2,ct_state=+trk+new,action=ct(commit,zone=2),4
+table=3,priority=10,in_port=1,tcp,ct_zone=2,ct_state=+trk+est,action=3
  table=2,priority=1,action=drop
  ])
  
diff --git a/tests/ovs-ofctl.at b/tests/ovs-ofctl.at

index a8934051e..b933924b8 100644
--- a/tests/ovs-ofctl.at
+++ b/tests/ovs-ofctl.at
@@ -381,6 +381,21 @@ do
  done
  AT_CLEANUP
  
+AT_SETUP([ovs-ofctl parse-flow with protocol order])

+for test_case in \
+'tcp,ip' \
+'tcp,ipv6' \
+'udp,ip' \
+'udp,ip6'
+do
+set $test_case
+field=$1
+AT_CHECK_UNQUOTED([ovs-ofctl parse-flow "$field,actions=drop"], [1], [],
+[ovs-ofctl: TLVs must be ordered
+])
+done
+AT_CLEANUP
+
  AT_SETUP([ovs-ofctl action inconsistency (OpenFlow 1.1)])
  OVS_VSWITCHD_START
  add_of_ports br0 1 2 3


--
Adrián Moreno

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] Reliability of system-offload test #50 [Was: Re: [PATCH v3 2/2] ci: Run tc offload tests in GitHub] Actions.

2023-03-31 Thread Ilya Maximets

On 3/29/23 17:34, Simon Horman wrote:
> On Tue, Mar 28, 2023 at 01:45:22PM +0200, Eelco Chaudron wrote:
>>
>>
>> On 10 Mar 2023, at 17:20, Simon Horman wrote:
>>
>>> On Fri, Mar 10, 2023 at 10:15:44AM +0100, Simon Horman wrote:
 On Thu, Mar 09, 2023 at 05:22:43PM +0100, Eelco Chaudron wrote:
>
>
> On 9 Mar 2023, at 15:42, Simon Horman wrote:
>
>> On Wed, Mar 08, 2023 at 04:18:47PM +0100, Eelco Chaudron wrote:
>>> Run "make check-offloads" as part of the GitHub actions tests.
>>>
>>> This test was run 25 times using GitHub actions, and the
>>> failing rerun test cases where excluded. There are quite some
>>> first-run failures, but unfortunately, there is no other
>>> more stable kernel available as a GitHub-hosted runner.
>>>
>>> Did not yet include sanitizers in the run, as it's causing
>>> the test to run too long >30min and there seems to be (timing)
>>> issues with some of the tests.
>>>
>>> Signed-off-by: Eelco Chaudron 
>>
>> Hi Eelco,
>>
>> I like this patch a lot.
>> But I am observing reliability problems when executing the new job.
>>
>> For 5 runs, on each occasion some tests failed the first time.
>> And on 3 of those runs at least one test failed on the recheck,
>> so the job failed.
>
> Damn :)

 Yes, it pained me to report this.

> I did 25 runs (I did not check for re-runs), and they were fine. I also 
> cleaned up my jobs recently, so I no longer have them.
>
> I can do this again and figure out wich tests are failing. Then analyze 
> the failures to see if we need to exclude them or can fine-tune them.

 I will see if I can spend some cycles on reproducing this (outside of GHA).
 I'll likely start with the tests that show up in the summary below.
>>>
>>> I started off by looking at check-offloads test:
>>>
>>> 50. system-traffic.at:1524: testing datapath - basic truncate action ...
>>>
>>> I haven't dug into the code to debug the problem yet.
>>> But I have collected some results that might be interesting.
>>>
>>>
>>> My testing was on a low-end VM with Ubuntu 18.04, with no HW offload:
>>> $ uname -psv
>>> Linux #56-Ubuntu SMP Tue Sep 20 13:23:26 UTC 2022 x86_64
>>
>>
>> So I took the same approach, I have local vagrant VM with ubuntu 22.11 (like 
>> on GitHub) and ran the tests.
>>
>> I thought I fixed it by this old commit:
>>
>>   
>> https://github.com/openvswitch/ovs/commit/22968988b820aa17a9b050c901208b7d4bed9dac
>>
>> However, as you can see even after excluding the remaining failures I could 
>> not figure out, it still fails randomly:
>>
>>   https://github.com/chaudron/ovs/actions
>>
>> Note that the above 25x runs I did before and none of the above tests failed…
>>
>> I was not able to make any of these tests fail on my local Ubuntu, and also 
>> analysing the results did not lead to a specific thing to fix.
>>
>> As this is working fine on my Fedora (VM) setup for multiple runs without 
>> any problem I’ll abandon this patch now :( I’ll try to get a buy-in form the 
>> Robot to run the datapath tests as part of its sanity check for now…
> 
> Thanks Eelco,
> 
> it's a shame this proved to be elusive.
> 
> Perhaps with a newer, as yet unreleased, Ubuntu version things will
> improve - perhaps it is a kernel issue.
> 
> Or perhaps we have some deep problem related to running in
> resource constrained environments, that we may uncover some day.
> 
> In any case, thanks for looking at this.
> I agree that it makes sense to abandon this patchset for now.
> And that using the Robot may be a promising alternative, for now.

FWIW, we can spin up a Fedora VM in Cirrus CI, if that helps.
We may even ask for more CPUs if needed.

See how we do CI in ovn-heater for example:
  https://github.com/ovn-org/ovn-heater/blob/main/.cirrus.yml

If VM is not necessary, we may also use containers similarly
to GitHub Actions.

Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] seq: Make read of the current value atomic

2023-03-31 Thread Mike Pattrick

On Fri, Mar 31, 2023 at 10:45 AM Eelco Chaudron  wrote:
>
>
>
> On 30 Mar 2023, at 17:57, Jon Kohler wrote:
>
> >> On Mar 28, 2023, at 1:32 PM, Mike Pattrick  wrote:
> >>
> >> On Mon, Mar 27, 2023 at 7:25 AM Eelco Chaudron  wrote:
> >>>
> >>> Make the read of the current seq->value atomic, i.e., not needing to
> >>> acquire the global mutex when reading it. On 64-bit systems, this
> >>> incurs no overhead, and it will avoid the mutex and potentially
> >>> a system call.
> >>>
> >>> For incrementing the value followed by waking up the threads, we are
> >>> still taking the mutex, so the current behavior is not changing. The
> >>> seq_read() behavior is already defined as, "Returns seq's current
> >>> sequence number (which could change immediately)". So the change
> >>> should not impact the current behavior.
> >>>
> >>> Signed-off-by: Eelco Chaudron 

I think after much discussion we've identified some additional areas
that this work could be extended to, but there's nothing wrong with
this patch as is.

Acked-by: Mike Pattrick 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] utilities: disable OVSDB inactivity probes for non-daemon ovn-nbctl

2023-03-31 Thread Dumitru Ceara

On 3/31/23 16:51, Vladislav Odintsov wrote:
> As I understood from Ilya, in case of one-command run of ovn-sbctl 
> (non-daemon mode), it doesn’t make sense to have client -> server inactivity 
> probes. If cable is unplugged, will just hit tcp session timeout, IIUC.
> Please correct me if I’m wrong.
> 

You're right but the default timeouts are quite large for TCP sessions:

With keepalives default:
tcp_keepalive_time - INTEGER
How often TCP sends out keepalive messages when keepalive is enabled.
Default: 2hours.

tcp_keepalive_probes - INTEGER
How many keepalive probes TCP sends out, until it decides that the
connection is broken. Default value: 9.

tcp_keepalive_intvl - INTEGER
How frequently the probes are send out. Multiplied by
tcp_keepalive_probes it is time to kill not responding connection,
after probes started. Default value: 75sec i.e. connection
will be aborted after ~11 minutes of retries.

DB clients don't even enable tcp keepalives IIRC.

So then we depend retransmits timing out:
tcp_retries2 - INTEGER
This value influences the timeout of an alive TCP connection,
when RTO retransmissions remain unacknowledged.
Given a value of N, a hypothetical TCP connection following
exponential backoff with an initial RTO of TCP_RTO_MIN would
retransmit N times before killing the connection at the (N+1)th RTO.

The default value of 15 yields a hypothetical timeout of 924.6
seconds and is a lower bound for the effective timeout.
TCP will effectively time out at the first RTO which exceeds the
hypothetical timeout.

RFC 1122 recommends at least 100 seconds for the timeout,
which corresponds to a value of at least 8.

Isn't it possible that the connection suddenly goes down in between a
transaction request and its reply but with all TCP segments being
ACKed?

>> On 31 Mar 2023, at 15:55, Dumitru Ceara  wrote:
>>
>> On 3/31/23 14:46, Vladislav Odintsov wrote:
>>> Hi Dumitru,
>>>
 On 31 Mar 2023, at 15:01, Dumitru Ceara  wrote:

 On 3/31/23 11:43, Ales Musil wrote:
> On Thu, Mar 23, 2023 at 8:25 PM Vladislav Odintsov 
> wrote:
>
>> For large OVN_Southbound DBs defatult interval of 5000 ms could be not
>> sufficient.  This patch disables OVSDB inactivity probes for ovn-*ctl
>> running
>> in non-daemon mode.
>>
>> Signed-off-by: Vladislav Odintsov 
>> ---
>> utilities/ovn-dbctl.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c
>> index 369a6a663..4307a5cae 100644
>> --- a/utilities/ovn-dbctl.c
>> +++ b/utilities/ovn-dbctl.c
>> @@ -208,6 +208,9 @@ ovn_dbctl_main(int argc, char *argv[],
>>if (daemon_mode) {
>>server_loop(dbctl_options, idl, argc, argv_);
>>} else {
>> +/* Disable OVSDB probe interval for non-daemon mode. */
>> +ovsdb_idl_set_probe_interval(idl, 0);

 I think I'd avoid using the idl function directly and call instead:

 set_idl_probe_interval(idl, 0);

 Just to keep it aligned with all other uses in OVN.  I can patch that at
 apply time if it looks OK to you.
>>>
>>> I’ve got no objections here.
>>> Small nit: set_idl_probe_interval function needs also a remote. Like this:
>>>
>>> set_idl_probe_interval(idl, db, 0);
>>>
>>> Also, please correct typo in commit message: defatult -> default.
>>>
>>
>> In light of the ovs-discuss thread [0] is it maybe better to just set
>> this probe interval to a very high value instead?  That's for the case
>> when ovn-nbctl/sbctl daemon <-> ovsdb-server connection dies because of
>> for example cable being unplugged somewhere on the way between the two.
>>
>> [0]
>> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-March/052324.html
>>

>> +
>>struct ctl_command *commands;
>>size_t n_commands;
>>char *error;
>> --
>> 2.36.1
>>
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>>
> Looks good to me, thanks.
>
> Reviewed-by: Ales Musil 
>

 Vladislav, Ales, I was thinking of backporting this to stable branches
 too, what do you think?

 Thanks,
 Dumitru
>>>
>>>
>>> Regards,
>>> Vladislav Odintsov
>>>
>>>
>>
>> ___
>> dev mailing list
>> d...@openvswitch.org 
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
> 
> Regards,
> Vladislav Odintsov
> 
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] utilities: disable OVSDB inactivity probes for non-daemon ovn-nbctl

2023-03-31 Thread Vladislav Odintsov

As I understood from Ilya, in case of one-command run of ovn-sbctl (non-daemon 
mode), it doesn’t make sense to have client -> server inactivity probes. If 
cable is unplugged, will just hit tcp session timeout, IIUC.
Please correct me if I’m wrong.

> On 31 Mar 2023, at 15:55, Dumitru Ceara  wrote:
> 
> On 3/31/23 14:46, Vladislav Odintsov wrote:
>> Hi Dumitru,
>> 
>>> On 31 Mar 2023, at 15:01, Dumitru Ceara  wrote:
>>> 
>>> On 3/31/23 11:43, Ales Musil wrote:
 On Thu, Mar 23, 2023 at 8:25 PM Vladislav Odintsov 
 wrote:
 
> For large OVN_Southbound DBs defatult interval of 5000 ms could be not
> sufficient.  This patch disables OVSDB inactivity probes for ovn-*ctl
> running
> in non-daemon mode.
> 
> Signed-off-by: Vladislav Odintsov 
> ---
> utilities/ovn-dbctl.c | 3 +++
> 1 file changed, 3 insertions(+)
> 
> diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c
> index 369a6a663..4307a5cae 100644
> --- a/utilities/ovn-dbctl.c
> +++ b/utilities/ovn-dbctl.c
> @@ -208,6 +208,9 @@ ovn_dbctl_main(int argc, char *argv[],
>if (daemon_mode) {
>server_loop(dbctl_options, idl, argc, argv_);
>} else {
> +/* Disable OVSDB probe interval for non-daemon mode. */
> +ovsdb_idl_set_probe_interval(idl, 0);
>>> 
>>> I think I'd avoid using the idl function directly and call instead:
>>> 
>>> set_idl_probe_interval(idl, 0);
>>> 
>>> Just to keep it aligned with all other uses in OVN.  I can patch that at
>>> apply time if it looks OK to you.
>> 
>> I’ve got no objections here.
>> Small nit: set_idl_probe_interval function needs also a remote. Like this:
>> 
>> set_idl_probe_interval(idl, db, 0);
>> 
>> Also, please correct typo in commit message: defatult -> default.
>> 
> 
> In light of the ovs-discuss thread [0] is it maybe better to just set
> this probe interval to a very high value instead?  That's for the case
> when ovn-nbctl/sbctl daemon <-> ovsdb-server connection dies because of
> for example cable being unplugged somewhere on the way between the two.
> 
> [0]
> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-March/052324.html
> 
>>> 
> +
>struct ctl_command *commands;
>size_t n_commands;
>char *error;
> --
> 2.36.1
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
> 
 Looks good to me, thanks.
 
 Reviewed-by: Ales Musil 
 
>>> 
>>> Vladislav, Ales, I was thinking of backporting this to stable branches
>>> too, what do you think?
>>> 
>>> Thanks,
>>> Dumitru
>> 
>> 
>> Regards,
>> Vladislav Odintsov
>> 
>> 
> 
> ___
> dev mailing list
> d...@openvswitch.org 
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Regards,
Vladislav Odintsov

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] seq: Make read of the current value atomic

2023-03-31 Thread Eelco Chaudron



On 30 Mar 2023, at 17:57, Jon Kohler wrote:

>> On Mar 28, 2023, at 1:32 PM, Mike Pattrick  wrote:
>>
>> On Mon, Mar 27, 2023 at 7:25 AM Eelco Chaudron  wrote:
>>>
>>> Make the read of the current seq->value atomic, i.e., not needing to
>>> acquire the global mutex when reading it. On 64-bit systems, this
>>> incurs no overhead, and it will avoid the mutex and potentially
>>> a system call.
>>>
>>> For incrementing the value followed by waking up the threads, we are
>>> still taking the mutex, so the current behavior is not changing. The
>>> seq_read() behavior is already defined as, "Returns seq's current
>>> sequence number (which could change immediately)". So the change
>>> should not impact the current behavior.
>>>
>>> Signed-off-by: Eelco Chaudron 
>>> ---
>>> lib/ovs-rcu.c |2 +-
>>> lib/seq.c |   37 -
>>> lib/seq.h |1 -
>>> 3 files changed, 17 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c
>>> index 946aa04d1..9e07d9bab 100644
>>> --- a/lib/ovs-rcu.c
>>> +++ b/lib/ovs-rcu.c
>>> @@ -170,7 +170,7 @@ ovsrcu_try_quiesce(void)
>>> ovs_assert(!single_threaded());
>>> perthread = ovsrcu_perthread_get();
>>> if (!seq_try_lock()) {
>>> -perthread->seqno = seq_read_protected(global_seqno);
>>> +perthread->seqno = seq_read(global_seqno);
>>> if (perthread->cbset) {
>>> ovsrcu_flush_cbset__(perthread, true);
>>> }
>>> diff --git a/lib/seq.c b/lib/seq.c
>>> index 99e5bf8bd..22646f3f8 100644
>>> --- a/lib/seq.c
>>> +++ b/lib/seq.c
>>> @@ -32,7 +32,7 @@ COVERAGE_DEFINE(seq_change);
>>>
>>> /* A sequence number object. */
>>> struct seq {
>>> -uint64_t value OVS_GUARDED;
>>> +atomic_uint64_t value;
>>> struct hmap waiters OVS_GUARDED; /* Contains 'struct seq_waiter's. */
>>> };
>>>
>>> @@ -72,6 +72,7 @@ static void seq_wake_waiters(struct seq *) 
>>> OVS_REQUIRES(seq_mutex);
>>> struct seq * OVS_EXCLUDED(seq_mutex)
>>> seq_create(void)
>>> {
>>> +uint64_t seq_value;
>>> struct seq *seq;
>>>
>>> seq_init();
>>> @@ -81,7 +82,8 @@ seq_create(void)
>>> COVERAGE_INC(seq_change);
>>>
>>> ovs_mutex_lock(&seq_mutex);
>>
>> I think it's also possible to get rid of the mutex here. Operations
>> like modifying a bridge or interface can result in calling seq_create
>> dozens of times, which could potentially lead to contention in cases
>> when a lot of interfaces are constantly added or removed. I'm mainly
>> thinking about kubernetes here.
>>
>> Other than that, I found this patch to reduce locking on seq_mutex
>> pretty significantly with a userspace workload:
>>
>> Before: 468727.2/sec
>> After: 229265.8/sec
>>
>> The rest of this patch looks good to me, but what do you think about:
>>
>> diff --git a/lib/seq.c b/lib/seq.c
>> index 22646f3f8..09fdccce5 100644
>> --- a/lib/seq.c
>> +++ b/lib/seq.c
>> @@ -57,7 +57,7 @@ struct seq_thread {
>>
>> static struct ovs_mutex seq_mutex = OVS_MUTEX_INITIALIZER;
>>
>> -static uint64_t seq_next OVS_GUARDED_BY(seq_mutex) = 1;
>> +static atomic_uint64_t seq_next = 1;
>>
>> static pthread_key_t seq_thread_key;
>>
>> @@ -81,11 +81,9 @@ seq_create(void)
>>
>> COVERAGE_INC(seq_change);
>>
>> -ovs_mutex_lock(&seq_mutex);
>> -seq_value = seq_next++;
>> +seq_value = atomic_count_inc64(&seq_next);
>> atomic_store_relaxed(&seq->value, seq_value);
>> hmap_init(&seq->waiters);
>> -ovs_mutex_unlock(&seq_mutex);
>>
>> return seq;
>> }
>> @@ -128,7 +126,7 @@ void
>> seq_change_protected(struct seq *seq)
>> OVS_REQUIRES(seq_mutex)
>> {
>> -uint64_t seq_value = seq_next++;
>> +uint64_t seq_value = atomic_count_inc64(&seq_next);
>>
>> COVERAGE_INC(seq_change);
>>
>
> Mike - Seems like a good idea at first blush?
>
> Eelco,
> This patch has perfect timing, as we’re currently chasing down some
> overhead on large systems, e.g. quad socket or newer chips with many
> dozens/hundreds of processors per system. In these systems, we see
> a lot of overhead from the revalidator threads coming online and
> constantly competing for the mutex in seq_read(), seq_woke(), and
> seq_change(). This leads to ovs-vswitchd being on-CPU constantly at
> somewhere between 20-80% of one core as per top. This time comes from
> spending a ton of time contending for the lock, then just futex'ing in
> and out of sleep as a result.
>
> Along these same lines of thinking, would it be possible to get rid of
> this mutex in seq_change() too? Perhaps we could have some atomic val
> that indicated if there was a waiter, and if so, then take the mutex
> to deal with the wakeup, but if not, don't take the mutex at all?

I’m not an expert on the seq code, but quickly looking at the code, the mutex 
is to protect the seq->waiters. Just having a waiter count outside of the mutex 
will probably not work, as there is still a raise condition when adding a new 
waiter.

Maybe you could work on a patch, and see there a now cor

Re: [ovs-dev] [PATCH] ci: Separate DPDK from OVS build.

2023-03-31 Thread David Marchand

On Thu, Mar 30, 2023 at 11:24 AM David Marchand
 wrote:
>
> Let's separate DPDK compilation from the rest of OVS build:
> - this avoids multiple jobs building DPDK in parallel, which especially
>   affects builds in the dpdk-latest branch,

Err, this patch actually breaks dpdk-latest since the hash does not
depend on DPDK_VER and GHA does not allow overwritting an existing
cache entry.
I'll wait a bit before trying to solve this and see if others have an
opinion on the approach at least.


> - we separate concerns about DPDK build requirements from OVS build
>   requirements, like python dependencies,
> - building DPDK does not depend on how we will link OVS against it, so we
>   can use a single cache entry regardless of DPDK_SHARED option,
>
> Signed-off-by: David Marchand 

-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] ci: Add arping package to run floating IP tests.

2023-03-31 Thread Dumitru Ceara

On 3/31/23 15:23, Dumitru Ceara wrote:
> On 3/31/23 14:53, Eelco Chaudron wrote:
>>
>>
>> On 31 Mar 2023, at 14:47, Dumitru Ceara wrote:
>>
>>> On 3/31/23 14:16, Eelco Chaudron wrote:
 Add arping package so the "virtual port with floating IP -- ovn-northd"
 tests are run.

 Signed-off-by: Eelco Chaudron 
 ---
>>>
>>> Hi Eelco,
>>>
>>> Thanks for the patch, it's good to enable this test!
>>>
  .github/workflows/test.yml |3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
 index 90dc8a6f1..bf958db9b 100644
 --- a/.github/workflows/test.yml
 +++ b/.github/workflows/test.yml
 @@ -17,7 +17,8 @@ jobs:
dependencies: |
  automake libtool gcc bc libjemalloc2 libjemalloc-dev\
  libssl-dev llvm-dev libelf-dev libnuma-dev libpcap-dev  \
 -selinux-policy-dev ncat python3-scapy isc-dhcp-server
 +selinux-policy-dev ncat python3-scapy isc-dhcp-server \
 +iputils-arping
>>>
>>> I see now the test running..
>>>
>>> 277: virtual port with floating IP -- ovn-northd -- parallelization=yes
>>> -- ovn_monitor_all=yes FAILED (system-ovn.at:10191)
>>>
>>> .. and failing.
>>>
>>> Not your fault, of course, but I'd really prefer to fix the test (if
>>> it's a test issue) or OVN before enabling this in CI.  Otherwise we'll
>>> have permanently "red" runs in CI which is also confusing.
>>>
>>> I'm not sure if you want and have time to look into this failure.  If
>>> not, let me know, I can try to have a look next week.
>>
>> It ran fine when I tried this :( 
>> https://github.com/chaudron/ovn/actions/runs/4574680800
>>
> 
> I see, thanks for sharing this!  It seems this is a recent regression in
> OVN.  I bisected it and it was introduced by e3bc68c3be69 ("northd: drop
> ct.inv packets in post snat and lb_aff_learn stages").  Lorenzo, Numan,
> do you have time to look into this please?
> 

+Lorenzo, Numan.

> The offending patch has also been backported to branch-23.03 so we need
> a fix there too.
> 
> Thanks,
> Dumitru
> 
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] seq: Make read of the current value atomic

2023-03-31 Thread Eelco Chaudron



On 28 Mar 2023, at 19:32, Mike Pattrick wrote:

> On Mon, Mar 27, 2023 at 7:25 AM Eelco Chaudron  wrote:
>>
>> Make the read of the current seq->value atomic, i.e., not needing to
>> acquire the global mutex when reading it. On 64-bit systems, this
>> incurs no overhead, and it will avoid the mutex and potentially
>> a system call.
>>
>> For incrementing the value followed by waking up the threads, we are
>> still taking the mutex, so the current behavior is not changing. The
>> seq_read() behavior is already defined as, "Returns seq's current
>> sequence number (which could change immediately)". So the change
>> should not impact the current behavior.
>>
>> Signed-off-by: Eelco Chaudron 
>> ---
>>  lib/ovs-rcu.c |2 +-
>>  lib/seq.c |   37 -
>>  lib/seq.h |1 -
>>  3 files changed, 17 insertions(+), 23 deletions(-)
>>
>> diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c
>> index 946aa04d1..9e07d9bab 100644
>> --- a/lib/ovs-rcu.c
>> +++ b/lib/ovs-rcu.c
>> @@ -170,7 +170,7 @@ ovsrcu_try_quiesce(void)
>>  ovs_assert(!single_threaded());
>>  perthread = ovsrcu_perthread_get();
>>  if (!seq_try_lock()) {
>> -perthread->seqno = seq_read_protected(global_seqno);
>> +perthread->seqno = seq_read(global_seqno);
>>  if (perthread->cbset) {
>>  ovsrcu_flush_cbset__(perthread, true);
>>  }
>> diff --git a/lib/seq.c b/lib/seq.c
>> index 99e5bf8bd..22646f3f8 100644
>> --- a/lib/seq.c
>> +++ b/lib/seq.c
>> @@ -32,7 +32,7 @@ COVERAGE_DEFINE(seq_change);
>>
>>  /* A sequence number object. */
>>  struct seq {
>> -uint64_t value OVS_GUARDED;
>> +atomic_uint64_t value;
>>  struct hmap waiters OVS_GUARDED; /* Contains 'struct seq_waiter's. */
>>  };
>>
>> @@ -72,6 +72,7 @@ static void seq_wake_waiters(struct seq *) 
>> OVS_REQUIRES(seq_mutex);
>>  struct seq * OVS_EXCLUDED(seq_mutex)
>>  seq_create(void)
>>  {
>> +uint64_t seq_value;
>>  struct seq *seq;
>>
>>  seq_init();
>> @@ -81,7 +82,8 @@ seq_create(void)
>>  COVERAGE_INC(seq_change);
>>
>>  ovs_mutex_lock(&seq_mutex);
>
> I think it's also possible to get rid of the mutex here. Operations
> like modifying a bridge or interface can result in calling seq_create
> dozens of times, which could potentially lead to contention in cases
> when a lot of interfaces are constantly added or removed. I'm mainly
> thinking about kubernetes here.
>
> Other than that, I found this patch to reduce locking on seq_mutex
> pretty significantly with a userspace workload:
>
> Before: 468727.2/sec
> After: 229265.8/sec

Yes I just counted syscalls, and I noticed a Hughe drop in futex calls 
especially if you have more PMD threads.

> The rest of this patch looks good to me, but what do you think about:

At first glance, it looks good to me however not sure if seq_create() is used 
often enough to see any gains.
But removing a mutex in a function sounds good to me, so I would suggest 
submitting a patch.

> diff --git a/lib/seq.c b/lib/seq.c
> index 22646f3f8..09fdccce5 100644
> --- a/lib/seq.c
> +++ b/lib/seq.c
> @@ -57,7 +57,7 @@ struct seq_thread {
>
>  static struct ovs_mutex seq_mutex = OVS_MUTEX_INITIALIZER;
>
> -static uint64_t seq_next OVS_GUARDED_BY(seq_mutex) = 1;
> +static atomic_uint64_t seq_next = 1;

I think we can keep this and use the  OVS_EXCLUDED(seq_mutex) macro in the 
function below.

>  static pthread_key_t seq_thread_key;
>
> @@ -81,11 +81,9 @@ seq_create(void)
>
>  COVERAGE_INC(seq_change);
>
> -ovs_mutex_lock(&seq_mutex);
> -seq_value = seq_next++;
> +seq_value = atomic_count_inc64(&seq_next);
>  atomic_store_relaxed(&seq->value, seq_value);
>  hmap_init(&seq->waiters);
> -ovs_mutex_unlock(&seq_mutex);
>
>  return seq;
>  }
> @@ -128,7 +126,7 @@ void
>  seq_change_protected(struct seq *seq)
>  OVS_REQUIRES(seq_mutex)
>  {
> -uint64_t seq_value = seq_next++;
> +uint64_t seq_value = atomic_count_inc64(&seq_next);
>
>  COVERAGE_INC(seq_change);

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] ci: Add arping package to run floating IP tests.

2023-03-31 Thread Dumitru Ceara

On 3/31/23 14:53, Eelco Chaudron wrote:
> 
> 
> On 31 Mar 2023, at 14:47, Dumitru Ceara wrote:
> 
>> On 3/31/23 14:16, Eelco Chaudron wrote:
>>> Add arping package so the "virtual port with floating IP -- ovn-northd"
>>> tests are run.
>>>
>>> Signed-off-by: Eelco Chaudron 
>>> ---
>>
>> Hi Eelco,
>>
>> Thanks for the patch, it's good to enable this test!
>>
>>>  .github/workflows/test.yml |3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
>>> index 90dc8a6f1..bf958db9b 100644
>>> --- a/.github/workflows/test.yml
>>> +++ b/.github/workflows/test.yml
>>> @@ -17,7 +17,8 @@ jobs:
>>>dependencies: |
>>>  automake libtool gcc bc libjemalloc2 libjemalloc-dev\
>>>  libssl-dev llvm-dev libelf-dev libnuma-dev libpcap-dev  \
>>> -selinux-policy-dev ncat python3-scapy isc-dhcp-server
>>> +selinux-policy-dev ncat python3-scapy isc-dhcp-server \
>>> +iputils-arping
>>
>> I see now the test running..
>>
>> 277: virtual port with floating IP -- ovn-northd -- parallelization=yes
>> -- ovn_monitor_all=yes FAILED (system-ovn.at:10191)
>>
>> .. and failing.
>>
>> Not your fault, of course, but I'd really prefer to fix the test (if
>> it's a test issue) or OVN before enabling this in CI.  Otherwise we'll
>> have permanently "red" runs in CI which is also confusing.
>>
>> I'm not sure if you want and have time to look into this failure.  If
>> not, let me know, I can try to have a look next week.
> 
> It ran fine when I tried this :( 
> https://github.com/chaudron/ovn/actions/runs/4574680800
> 

I see, thanks for sharing this!  It seems this is a recent regression in
OVN.  I bisected it and it was introduced by e3bc68c3be69 ("northd: drop
ct.inv packets in post snat and lb_aff_learn stages").  Lorenzo, Numan,
do you have time to look into this please?

The offending patch has also been backported to branch-23.03 so we need
a fix there too.

Thanks,
Dumitru


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v10] netdev-offload-tc: del ufid mapping if device not exist.

2023-03-31 Thread Eelco Chaudron



On 30 Mar 2023, at 11:27, Faicker Mo wrote:

> The device may be deleted and added with ifindex changed.
> The tc rules on the device will be deleted if the device is deleted.
> The func tc_del_filter will fail when flow del. The mapping of
> ufid to tc will not be deleted.
> The traffic will trigger the same flow(with same ufid) to put to tc
> on the new device. Duplicated ufid mapping will be added.
> If the hashmap is expanded, the old mapping entry will be the first entry,
> and now the dp flow can't be deleted.
>
> Signed-off-by: Faicker Mo 

Changes look good to me, so if Simon’s tests pass over the weekend:

Acked-by: Eelco Chaudron 




> +])
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 2 -i 0.2 10.1.1.3 | FORMAT_PING], [0], 
> [dnl
> +2 packets transmitted, 2 received, 0% packet loss, time 0ms
> +])
> +
> +AT_CHECK([ovs-appctl revalidator/purge], [0])
> +dnl Fix purge fail occasionally

FYI, I noticed when debugging some other TC-related tests that the reason could 
be that the purge is called before the TC flows get installed in the kernel.

> +AT_CHECK([ovs-appctl revalidator/purge], [0])



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v5] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-03-31 Thread Eelco Chaudron



On 31 Mar 2023, at 15:15, Ilya Maximets wrote:

> On 3/31/23 15:06, Eelco Chaudron wrote:
>>
>>
>> On 31 Mar 2023, at 12:38, Simon Horman wrote:
>>
>>> On Fri, Mar 31, 2023 at 12:05:09PM +0200, Ilya Maximets wrote:
 On 3/31/23 11:07, Simon Horman wrote:
> On Thu, Mar 30, 2023 at 09:04:02PM +0200, Ilya Maximets wrote:
>> On 3/30/23 11:45, Simon Horman wrote:
>>> On Fri, Mar 17, 2023 at 09:47:36PM +0100, Eelco Chaudron wrote:
> Op 17 mrt. 2023 om 21:11 heeft Marcelo Ricardo Leitner 
>  het volgende geschreven:
> Thu, Mar 16, 2023 at 09:51:34AM +0100, Eelco Chaudron wrote:
>>> On 22 Dec 2022, at 13:32, Eelco Chaudron wrote:
 On 22 Dec 2022, at 10:26, Balazs Nemeth wrote:
>>>
 The only way that stats->{n_packets,n_bytes} would decrease is due 
 to an
>>>
>>> nit: s/way/ways/
>>>
 overflow, or if there are bugs in how statistics are handled. In 
 the
 past, there were multiple bugs that caused a jump backward. A
 workaround was in place to set the statistics to 0 in that case. 
 When
 this happened while the revalidator was under heavy load, the 
 workaround
 had an unintended side effect where should_revalidate returned 
 false
 causing the flow to be removed because the metric it calculated was
 based on a bogus value. Since many of those bugs have now been
 identified and resolved, there is no need to set the statistics to 
 0. In
 addition, the (unlikely) overflow still needs to be handled
 appropriately.
>>>
>>> 1. Perhaps it would be prudent to log/count if/when this occurs
>>
>> +1
>> We do have a coverage counter that will indicate the case where stats
>> jump back.  However, if we're certain that this should never happen,
>> we should, probably, emit a warning or even an error log as well, so
>> users are aware that something went wrong.
>
> I was thinking more of a counter, which seems to already be covered.
> But I have no objection to your reasoning about having a warning (too).
>
>>
>>> 2. I take it that the overflow handling would be follow-up work,
>>>is that correct?
>>
>> The unsigned arithmetic should take case of overflowed counters,
>> because the result of subtraction will still give a correct difference
>> between the old and a new value, even if it overflowed and the new
>> value is smaller.  Unless, of course, it overflowed more than once.
>
> More than once between samples?
> If so, I'm assuming that is not a case we can hit unless there is a bug.

 Right.  It's actually should be practically not possible to overflow
 even once with a current hardware.  Assuming we have a fancy 400 Gbps
 NIC, then it should take 11.7 years to overflow a byte counter.

 So, this patch is mostly removing a workaround for some bug that we
 hope we fixed.  But it's not clear what the original bug was as the
 commit message for this workaround didn't specify a root cause.  So,
 it's hard to say if it's fixed or not.  And that's why I'm thinking
 that the error message is needed.
>>>
>>> Yes, I agree that is prudent.
>>
>> If we do add a log message, we should be careful as it could still be a wrap 
>> (for bytes). For packets, it’s not very likely with the current speeds 400G, 
>> will be around 980 years… For bytes, we can wrap easily.
>>
>> So I would suggest it only for packet count…
>>
>
> We already only use a packet counter for the 'ukey_invalid_stat_reset' 
> coverage.
> So, I suppose, we can just add the log under the same condition.
>
> OTOH, Don't we check that the difference is within 3/4 of 64-bit range?
> It should still take many years to overflow the byte counter.

Yes, I was looking at this on my review directory which did not have my patches 
(or latest master).

So just adding it to the ‘ukey_invalid_stat_reset’ if() case will work :)

//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v5] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-03-31 Thread Ilya Maximets

On 3/31/23 15:06, Eelco Chaudron wrote:
> 
> 
> On 31 Mar 2023, at 12:38, Simon Horman wrote:
> 
>> On Fri, Mar 31, 2023 at 12:05:09PM +0200, Ilya Maximets wrote:
>>> On 3/31/23 11:07, Simon Horman wrote:
 On Thu, Mar 30, 2023 at 09:04:02PM +0200, Ilya Maximets wrote:
> On 3/30/23 11:45, Simon Horman wrote:
>> On Fri, Mar 17, 2023 at 09:47:36PM +0100, Eelco Chaudron wrote:
 Op 17 mrt. 2023 om 21:11 heeft Marcelo Ricardo Leitner 
  het volgende geschreven:
 Thu, Mar 16, 2023 at 09:51:34AM +0100, Eelco Chaudron wrote:
>> On 22 Dec 2022, at 13:32, Eelco Chaudron wrote:
>>> On 22 Dec 2022, at 10:26, Balazs Nemeth wrote:
>>
>>> The only way that stats->{n_packets,n_bytes} would decrease is due 
>>> to an
>>
>> nit: s/way/ways/
>>
>>> overflow, or if there are bugs in how statistics are handled. In the
>>> past, there were multiple bugs that caused a jump backward. A
>>> workaround was in place to set the statistics to 0 in that case. 
>>> When
>>> this happened while the revalidator was under heavy load, the 
>>> workaround
>>> had an unintended side effect where should_revalidate returned false
>>> causing the flow to be removed because the metric it calculated was
>>> based on a bogus value. Since many of those bugs have now been
>>> identified and resolved, there is no need to set the statistics to 
>>> 0. In
>>> addition, the (unlikely) overflow still needs to be handled
>>> appropriately.
>>
>> 1. Perhaps it would be prudent to log/count if/when this occurs
>
> +1
> We do have a coverage counter that will indicate the case where stats
> jump back.  However, if we're certain that this should never happen,
> we should, probably, emit a warning or even an error log as well, so
> users are aware that something went wrong.

 I was thinking more of a counter, which seems to already be covered.
 But I have no objection to your reasoning about having a warning (too).

>
>> 2. I take it that the overflow handling would be follow-up work,
>>is that correct?
>
> The unsigned arithmetic should take case of overflowed counters,
> because the result of subtraction will still give a correct difference
> between the old and a new value, even if it overflowed and the new
> value is smaller.  Unless, of course, it overflowed more than once.

 More than once between samples?
 If so, I'm assuming that is not a case we can hit unless there is a bug.
>>>
>>> Right.  It's actually should be practically not possible to overflow
>>> even once with a current hardware.  Assuming we have a fancy 400 Gbps
>>> NIC, then it should take 11.7 years to overflow a byte counter.
>>>
>>> So, this patch is mostly removing a workaround for some bug that we
>>> hope we fixed.  But it's not clear what the original bug was as the
>>> commit message for this workaround didn't specify a root cause.  So,
>>> it's hard to say if it's fixed or not.  And that's why I'm thinking
>>> that the error message is needed.
>>
>> Yes, I agree that is prudent.
> 
> If we do add a log message, we should be careful as it could still be a wrap 
> (for bytes). For packets, it’s not very likely with the current speeds 400G, 
> will be around 980 years… For bytes, we can wrap easily.
> 
> So I would suggest it only for packet count…
> 

We already only use a packet counter for the 'ukey_invalid_stat_reset' coverage.
So, I suppose, we can just add the log under the same condition.

OTOH, Don't we check that the difference is within 3/4 of 64-bit range?
It should still take many years to overflow the byte counter.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn 1/4] northd: Break ACLs into two stages.

2023-03-31 Thread Simon Horman

On Fri, Mar 31, 2023 at 02:10:35PM +0200, Dumitru Ceara wrote:
> On 3/31/23 13:52, Simon Horman wrote:
> > On Tue, Mar 21, 2023 at 01:59:06PM -0400, Mark Michelson wrote:
> >> Prior to this commit, ACLs were evaluated and acted on in a single
> >> stage. With this commit, evaluation of ACLs and acting on an ACL's
> >> decision are separated into two stages.
> >>
> >> The acl_eval stage checks the ACL match and will set a bit to indicate
> >> the verdict of the ACL. The acl_action stage then checks the relevant
> >> bits to determine how to proceed. If no ACLs are matched, then the
> >> default ACL action is taken.
> >>
> >> A couple of notes about updated tests:
> >> - For test cases where I just had to increment a table number, I changed
> >>   the check so the table numbers are masked. This should prevent similar
> >>   changes from being needed later.
> >> - The port security test changes may seem odd. The issue here is that
> >>   the ls_out_apply_port_sec table number changed from 9 to 10. This
> >>   means that this table's flows now sort to a lower position than
> >>   before. This is why the check had to change for this test.
> >>
> >> Signed-off-by: Mark Michelson 
> > 
> > Hi Mark,
> > 
> > this does not appear to apply on top of main any more.
> 
> Hi Simon,
> 
> I didn't review this series yet but something that helps when dealing
> with patches that don't apply on top of main anymore is checking the
> branch 0-day bot created from the patchwork series:
> 
> https://github.com/ovsrobot/ovn/tree/series_347327
> 
> It's: https://github.com/ovsrobot/ovn/tree/series_

Thanks, I did not know that :)
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v5] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-03-31 Thread Eelco Chaudron



On 31 Mar 2023, at 12:38, Simon Horman wrote:

> On Fri, Mar 31, 2023 at 12:05:09PM +0200, Ilya Maximets wrote:
>> On 3/31/23 11:07, Simon Horman wrote:
>>> On Thu, Mar 30, 2023 at 09:04:02PM +0200, Ilya Maximets wrote:
 On 3/30/23 11:45, Simon Horman wrote:
> On Fri, Mar 17, 2023 at 09:47:36PM +0100, Eelco Chaudron wrote:
>>> Op 17 mrt. 2023 om 21:11 heeft Marcelo Ricardo Leitner 
>>>  het volgende geschreven:
>>> Thu, Mar 16, 2023 at 09:51:34AM +0100, Eelco Chaudron wrote:
> On 22 Dec 2022, at 13:32, Eelco Chaudron wrote:
>> On 22 Dec 2022, at 10:26, Balazs Nemeth wrote:
>
>> The only way that stats->{n_packets,n_bytes} would decrease is due 
>> to an
>
> nit: s/way/ways/
>
>> overflow, or if there are bugs in how statistics are handled. In the
>> past, there were multiple bugs that caused a jump backward. A
>> workaround was in place to set the statistics to 0 in that case. When
>> this happened while the revalidator was under heavy load, the 
>> workaround
>> had an unintended side effect where should_revalidate returned false
>> causing the flow to be removed because the metric it calculated was
>> based on a bogus value. Since many of those bugs have now been
>> identified and resolved, there is no need to set the statistics to 
>> 0. In
>> addition, the (unlikely) overflow still needs to be handled
>> appropriately.
>
> 1. Perhaps it would be prudent to log/count if/when this occurs

 +1
 We do have a coverage counter that will indicate the case where stats
 jump back.  However, if we're certain that this should never happen,
 we should, probably, emit a warning or even an error log as well, so
 users are aware that something went wrong.
>>>
>>> I was thinking more of a counter, which seems to already be covered.
>>> But I have no objection to your reasoning about having a warning (too).
>>>

> 2. I take it that the overflow handling would be follow-up work,
>is that correct?

 The unsigned arithmetic should take case of overflowed counters,
 because the result of subtraction will still give a correct difference
 between the old and a new value, even if it overflowed and the new
 value is smaller.  Unless, of course, it overflowed more than once.
>>>
>>> More than once between samples?
>>> If so, I'm assuming that is not a case we can hit unless there is a bug.
>>
>> Right.  It's actually should be practically not possible to overflow
>> even once with a current hardware.  Assuming we have a fancy 400 Gbps
>> NIC, then it should take 11.7 years to overflow a byte counter.
>>
>> So, this patch is mostly removing a workaround for some bug that we
>> hope we fixed.  But it's not clear what the original bug was as the
>> commit message for this workaround didn't specify a root cause.  So,
>> it's hard to say if it's fixed or not.  And that's why I'm thinking
>> that the error message is needed.
>
> Yes, I agree that is prudent.

If we do add a log message, we should be careful as it could still be a wrap 
(for bytes). For packets, it’s not very likely with the current speeds 400G, 
will be around 980 years… For bytes, we can wrap easily.

So I would suggest it only for packet count…

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] utilities: disable OVSDB inactivity probes for non-daemon ovn-nbctl

2023-03-31 Thread Dumitru Ceara

On 3/31/23 14:46, Vladislav Odintsov wrote:
> Hi Dumitru,
> 
>> On 31 Mar 2023, at 15:01, Dumitru Ceara  wrote:
>>
>> On 3/31/23 11:43, Ales Musil wrote:
>>> On Thu, Mar 23, 2023 at 8:25 PM Vladislav Odintsov 
>>> wrote:
>>>
 For large OVN_Southbound DBs defatult interval of 5000 ms could be not
 sufficient.  This patch disables OVSDB inactivity probes for ovn-*ctl
 running
 in non-daemon mode.

 Signed-off-by: Vladislav Odintsov 
 ---
 utilities/ovn-dbctl.c | 3 +++
 1 file changed, 3 insertions(+)

 diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c
 index 369a6a663..4307a5cae 100644
 --- a/utilities/ovn-dbctl.c
 +++ b/utilities/ovn-dbctl.c
 @@ -208,6 +208,9 @@ ovn_dbctl_main(int argc, char *argv[],
 if (daemon_mode) {
 server_loop(dbctl_options, idl, argc, argv_);
 } else {
 +/* Disable OVSDB probe interval for non-daemon mode. */
 +ovsdb_idl_set_probe_interval(idl, 0);
>>
>> I think I'd avoid using the idl function directly and call instead:
>>
>> set_idl_probe_interval(idl, 0);
>>
>> Just to keep it aligned with all other uses in OVN.  I can patch that at
>> apply time if it looks OK to you.
> 
> I’ve got no objections here.
> Small nit: set_idl_probe_interval function needs also a remote. Like this:
> 
> set_idl_probe_interval(idl, db, 0);
> 
> Also, please correct typo in commit message: defatult -> default.
> 

In light of the ovs-discuss thread [0] is it maybe better to just set
this probe interval to a very high value instead?  That's for the case
when ovn-nbctl/sbctl daemon <-> ovsdb-server connection dies because of
for example cable being unplugged somewhere on the way between the two.

[0]
https://mail.openvswitch.org/pipermail/ovs-discuss/2023-March/052324.html

>>
 +
 struct ctl_command *commands;
 size_t n_commands;
 char *error;
 --
 2.36.1

 ___
 dev mailing list
 d...@openvswitch.org
 https://mail.openvswitch.org/mailman/listinfo/ovs-dev


>>> Looks good to me, thanks.
>>>
>>> Reviewed-by: Ales Musil 
>>>
>>
>> Vladislav, Ales, I was thinking of backporting this to stable branches
>> too, what do you think?
>>
>> Thanks,
>> Dumitru
> 
> 
> Regards,
> Vladislav Odintsov
> 
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] ci: Add arping package to run floating IP tests.

2023-03-31 Thread Eelco Chaudron




On 31 Mar 2023, at 14:47, Dumitru Ceara wrote:

> On 3/31/23 14:16, Eelco Chaudron wrote:
>> Add arping package so the "virtual port with floating IP -- ovn-northd"
>> tests are run.
>>
>> Signed-off-by: Eelco Chaudron 
>> ---
>
> Hi Eelco,
>
> Thanks for the patch, it's good to enable this test!
>
>>  .github/workflows/test.yml |3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
>> index 90dc8a6f1..bf958db9b 100644
>> --- a/.github/workflows/test.yml
>> +++ b/.github/workflows/test.yml
>> @@ -17,7 +17,8 @@ jobs:
>>dependencies: |
>>  automake libtool gcc bc libjemalloc2 libjemalloc-dev\
>>  libssl-dev llvm-dev libelf-dev libnuma-dev libpcap-dev  \
>> -selinux-policy-dev ncat python3-scapy isc-dhcp-server
>> +selinux-policy-dev ncat python3-scapy isc-dhcp-server \
>> +iputils-arping
>
> I see now the test running..
>
> 277: virtual port with floating IP -- ovn-northd -- parallelization=yes
> -- ovn_monitor_all=yes FAILED (system-ovn.at:10191)
>
> .. and failing.
>
> Not your fault, of course, but I'd really prefer to fix the test (if
> it's a test issue) or OVN before enabling this in CI.  Otherwise we'll
> have permanently "red" runs in CI which is also confusing.
>
> I'm not sure if you want and have time to look into this failure.  If
> not, let me know, I can try to have a look next week.

It ran fine when I tried this :( 
https://github.com/chaudron/ovn/actions/runs/4574680800

It might be good for an OVN expert to take a quick look, as to me all these 
tests are still black magic ;)

//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] ci: Add arping package to run floating IP tests.

2023-03-31 Thread Dumitru Ceara

On 3/31/23 14:16, Eelco Chaudron wrote:
> Add arping package so the "virtual port with floating IP -- ovn-northd"
> tests are run.
> 
> Signed-off-by: Eelco Chaudron 
> ---

Hi Eelco,

Thanks for the patch, it's good to enable this test!

>  .github/workflows/test.yml |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
> index 90dc8a6f1..bf958db9b 100644
> --- a/.github/workflows/test.yml
> +++ b/.github/workflows/test.yml
> @@ -17,7 +17,8 @@ jobs:
>dependencies: |
>  automake libtool gcc bc libjemalloc2 libjemalloc-dev\
>  libssl-dev llvm-dev libelf-dev libnuma-dev libpcap-dev  \
> -selinux-policy-dev ncat python3-scapy isc-dhcp-server
> +selinux-policy-dev ncat python3-scapy isc-dhcp-server \
> +iputils-arping

I see now the test running..

277: virtual port with floating IP -- ovn-northd -- parallelization=yes
-- ovn_monitor_all=yes FAILED (system-ovn.at:10191)

.. and failing.

Not your fault, of course, but I'd really prefer to fix the test (if
it's a test issue) or OVN before enabling this in CI.  Otherwise we'll
have permanently "red" runs in CI which is also confusing.

I'm not sure if you want and have time to look into this failure.  If
not, let me know, I can try to have a look next week.

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] utilities: disable OVSDB inactivity probes for non-daemon ovn-nbctl

2023-03-31 Thread Vladislav Odintsov

Hi Dumitru,

> On 31 Mar 2023, at 15:01, Dumitru Ceara  wrote:
> 
> On 3/31/23 11:43, Ales Musil wrote:
>> On Thu, Mar 23, 2023 at 8:25 PM Vladislav Odintsov 
>> wrote:
>> 
>>> For large OVN_Southbound DBs defatult interval of 5000 ms could be not
>>> sufficient.  This patch disables OVSDB inactivity probes for ovn-*ctl
>>> running
>>> in non-daemon mode.
>>> 
>>> Signed-off-by: Vladislav Odintsov 
>>> ---
>>> utilities/ovn-dbctl.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>> 
>>> diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c
>>> index 369a6a663..4307a5cae 100644
>>> --- a/utilities/ovn-dbctl.c
>>> +++ b/utilities/ovn-dbctl.c
>>> @@ -208,6 +208,9 @@ ovn_dbctl_main(int argc, char *argv[],
>>> if (daemon_mode) {
>>> server_loop(dbctl_options, idl, argc, argv_);
>>> } else {
>>> +/* Disable OVSDB probe interval for non-daemon mode. */
>>> +ovsdb_idl_set_probe_interval(idl, 0);
> 
> I think I'd avoid using the idl function directly and call instead:
> 
> set_idl_probe_interval(idl, 0);
> 
> Just to keep it aligned with all other uses in OVN.  I can patch that at
> apply time if it looks OK to you.

I’ve got no objections here.
Small nit: set_idl_probe_interval function needs also a remote. Like this:

set_idl_probe_interval(idl, db, 0);

Also, please correct typo in commit message: defatult -> default.

> 
>>> +
>>> struct ctl_command *commands;
>>> size_t n_commands;
>>> char *error;
>>> --
>>> 2.36.1
>>> 
>>> ___
>>> dev mailing list
>>> d...@openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>> 
>>> 
>> Looks good to me, thanks.
>> 
>> Reviewed-by: Ales Musil 
>> 
> 
> Vladislav, Ales, I was thinking of backporting this to stable branches
> too, what do you think?
> 
> Thanks,
> Dumitru


Regards,
Vladislav Odintsov

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn v3 1/2] expr: Remove supersets from OR expressions.

2023-03-31 Thread Ilya Maximets

On 3/31/23 06:39, Han Zhou wrote:
> 
> 
> On Wed, Mar 29, 2023 at 9:05 AM Ilya Maximets  > wrote:
>>
>> While crushing OR expressions, OVN removes exact replicas of sub
>> expressions.  However, there could be many CMP expressions that are
>> supersets of each other.  These are most likely to be created as a
>> result of cross-product while expanding brackets in the AND expression
>> in crush_and_numeric(), i.e. while converting
>> "x && (a0 || a1) && (b0 || b1)" into "xa0b0 || xa0b1 || xa1b0 || xa1b1".
>>
>> In addition to removal of exact duplicates introducing scan and removal
>> of supersets of other existing sub-expressions to reduce the amount of
>> generated flows.  This adds extra computations, but should save time
>> later, since less flows will be generated.
>>
>> Example:
>>
>>   "ip4.src == 172.168.0.0/16  && 
>> ip4.src!={172.168.13.0/24 , 172.168.15.0/24 
>> }"
>>
>>   Processing of this expression yields 42 flows:
>>
>>   $ ./tests/ovstest test-ovn expr-to-flows <<< "$expr"
>>
>>   ip,nw_src=172.168.0.0/255.255.1.0 
>>   ip,nw_src=172.168.0.0/255.255.10.0 
>>   ip,nw_src=172.168.0.0/255.255.12.0 
>>   ip,nw_src=172.168.0.0/255.255.3.0 
>>   ip,nw_src=172.168.0.0/255.255.4.0 
>>   ip,nw_src=172.168.0.0/255.255.5.0 
>>   ip,nw_src=172.168.0.0/255.255.6.0 
>>   ip,nw_src=172.168.0.0/255.255.8.0 
>>   ip,nw_src=172.168.0.0/255.255.9.0 
>>   ip,nw_src=172.168.128.0/17 
>>   <... 32 more flows ...>
>>
>>   We can see that many flows above do overlap, e.g. 255.255.3.0
>>   mask is a superset of 255.255.1.0.  Everything that matches
>>   255.255.3.0, will match 255.255.1.0 as well (the value is the same).
>>
>>   By removing all the unnecessary supersets, the set of flows can
>>   be reduced from 42 down to 7:
>>
>>   ip,nw_src=172.168.0.0/255.255.1.0 
>>   ip,nw_src=172.168.0.0/255.255.4.0 
>>   ip,nw_src=172.168.0.0/255.255.8.0 
>>   ip,nw_src=172.168.128.0/17 
>>   ip,nw_src=172.168.16.0/255.255.16.0 
>>   ip,nw_src=172.168.32.0/255.255.32.0 
>>   ip,nw_src=172.168.64.0/255.255.64.0 
>>
>> This change should be particularly useful for expressions with
>> inequality checks, like the one above.  Such expressions are
>> frequent among ACL rules.
>>
>>   "ip4.src != {172.168.13.0/24 , 172.168.14.0/24 
>> , 172.168.15.0/24 }"
>>
>>   Brefore:
>>   $ ./tests/ovstest test-ovn expr-to-flows <<< "$expr" | wc -l
>>   2894
>>
>>   After:
>>   $ ./tests/ovstest test-ovn expr-to-flows <<< "$expr" | wc -l
>>   23
>>
>> Superset lookups are performed only if there are expressions with
>> more bits in the mask than in the current one.  So, there is no extra
>> cost for equality checks on normal address sets, like port group sets,
>> where all the IPs are exect matches or have the same prefix length
>> otherwise.
>>
>> Use of bitmaps instead of subvalue functions significantly speeds up
>> processing since most of the subvalue space is an all-zero empty space.
>>
>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2177197 
>> 
>> Reported-by: Nadia Pinaeva mailto:npina...@redhat.com>>
>> Signed-off-by: Ilya Maximets mailto:i.maxim...@ovn.org>>
>> ---
>>  include/ovn/expr.h |   1 +
>>  lib/expr.c         | 238 ++---
>>  tests/ovn.at        |  12 +++
>>  3 files changed, 196 insertions(+), 55 deletions(-)
>>
>> diff --git a/include/ovn/expr.h b/include/ovn/expr.h
>> index 80d95ff67..c48f82398 100644
>> --- a/include/ovn/expr.h
>> +++ b/include/ovn/expr.h
>> @@ -384,6 +384,7 @@ struct expr {
>>                  struct {
>>                      union mf_subvalue value;
>>                      union mf_subvalue mask;
>> +                    size_t mask_n_bits;
>>                  };
>>              };
>>          } cmp;
>> diff --git a/lib/expr.c b/lib/expr.c
>> index 0a6f7e574..8fd07b2d5 100644
>> --- a/lib/expr.c
>> +++ b/lib/expr.c
>> @@ -15,21 +15,23 @@
>>   */
>>
>>  #include 
>> +#include "bitmap.h"
>>  #include "byte-order.h"
>> -#include "openvswitch/json.h"
>> +#include "hmapx.h"
>>  #include "nx-match.h"
>>  #include "openvswitch/dynamic-string.h"
>> +#include "openvswitch/json.h"
>>  #include "openvswitch/match.h"
>>  #include "openvswitch/ofp-actions.h"
>> -#include "openvswitch/vlog.h"
>>  #includ

[ovs-dev] [PATCH ovn] ci: Add arping package to run floating IP tests.

2023-03-31 Thread Eelco Chaudron

Add arping package so the "virtual port with floating IP -- ovn-northd"
tests are run.

Signed-off-by: Eelco Chaudron 
---
 .github/workflows/test.yml |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
index 90dc8a6f1..bf958db9b 100644
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -17,7 +17,8 @@ jobs:
   dependencies: |
 automake libtool gcc bc libjemalloc2 libjemalloc-dev\
 libssl-dev llvm-dev libelf-dev libnuma-dev libpcap-dev  \
-selinux-policy-dev ncat python3-scapy isc-dhcp-server
+selinux-policy-dev ncat python3-scapy isc-dhcp-server \
+iputils-arping
   m32_dependecies: gcc-multilib
   ARCH:${{ matrix.cfg.arch }}
   CC:  ${{ matrix.cfg.compiler }}

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn 1/4] northd: Break ACLs into two stages.

2023-03-31 Thread Dumitru Ceara

On 3/31/23 13:52, Simon Horman wrote:
> On Tue, Mar 21, 2023 at 01:59:06PM -0400, Mark Michelson wrote:
>> Prior to this commit, ACLs were evaluated and acted on in a single
>> stage. With this commit, evaluation of ACLs and acting on an ACL's
>> decision are separated into two stages.
>>
>> The acl_eval stage checks the ACL match and will set a bit to indicate
>> the verdict of the ACL. The acl_action stage then checks the relevant
>> bits to determine how to proceed. If no ACLs are matched, then the
>> default ACL action is taken.
>>
>> A couple of notes about updated tests:
>> - For test cases where I just had to increment a table number, I changed
>>   the check so the table numbers are masked. This should prevent similar
>>   changes from being needed later.
>> - The port security test changes may seem odd. The issue here is that
>>   the ls_out_apply_port_sec table number changed from 9 to 10. This
>>   means that this table's flows now sort to a lower position than
>>   before. This is why the check had to change for this test.
>>
>> Signed-off-by: Mark Michelson 
> 
> Hi Mark,
> 
> this does not appear to apply on top of main any more.

Hi Simon,

I didn't review this series yet but something that helps when dealing
with patches that don't apply on top of main anymore is checking the
branch 0-day bot created from the patchwork series:

https://github.com/ovsrobot/ovn/tree/series_347327

It's: https://github.com/ovsrobot/ovn/tree/series_

Back to this specific series, that made me check the CI results and I
see CI reports a memory leak:


=
==34989==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 3230 byte(s) in 102 object(s) allocated from:
#0 0x499f1d in malloc 
(/home/runner/work/ovn/ovn/ovn-23.03.90/_build/sub/northd/ovn-northd+0x499f1d)
#1 0x83fc64 in xmalloc__ /home/runner/work/ovn/ovn/ovs/lib/util.c:140:15
#2 0x83fd25 in xmalloc /home/runner/work/ovn/ovn/ovs/lib/util.c:175:12
#3 0x83fe9c in xvasprintf /home/runner/work/ovn/ovn/ovs/lib/util.c:230:9
#4 0x840081 in xasprintf /home/runner/work/ovn/ovn/ovs/lib/util.c:371:9
#5 0x4f777b in build_acl_action_lflows 
/home/runner/work/ovn/ovn/ovn-23.03.90/_build/sub/../../northd/northd.c:6695:13
#6 0x4f0719 in build_acls 
/home/runner/work/ovn/ovn/ovn-23.03.90/_build/sub/../../northd/northd.c:7091:5
#7 0x4e031f in build_lswitch_lflows_pre_acl_and_acl 
/home/runner/work/ovn/ovn/ovn-23.03.90/_build/sub/../../northd/northd.c:8491:9
#8 0x4db1f4 in build_lswitch_and_lrouter_iterate_by_od 
/home/runner/work/ovn/ovn/ovn-23.03.90/_build/sub/../../northd/northd.c:14655:5
#9 0x4ce847 in build_lflows_thread 
/home/runner/work/ovn/ovn/ovn-23.03.90/_build/sub/../../northd/northd.c:14772:21
#10 0x822363 in ovsthread_wrapper 
/home/runner/work/ovn/ovn/ovs/lib/ovs-thread.c:423:12
#11 0x7f042e8d6608 in start_thread 
(/lib/x86_64-linux-gnu/libpthread.so.0+0x8608)

SUMMARY: AddressSanitizer: 3230 byte(s) leaked in 102 allocation(s).

Regards,
Dumitru




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] utilities: disable OVSDB inactivity probes for non-daemon ovn-nbctl

2023-03-31 Thread Dumitru Ceara

On 3/31/23 11:43, Ales Musil wrote:
> On Thu, Mar 23, 2023 at 8:25 PM Vladislav Odintsov 
> wrote:
> 
>> For large OVN_Southbound DBs defatult interval of 5000 ms could be not
>> sufficient.  This patch disables OVSDB inactivity probes for ovn-*ctl
>> running
>> in non-daemon mode.
>>
>> Signed-off-by: Vladislav Odintsov 
>> ---
>>  utilities/ovn-dbctl.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c
>> index 369a6a663..4307a5cae 100644
>> --- a/utilities/ovn-dbctl.c
>> +++ b/utilities/ovn-dbctl.c
>> @@ -208,6 +208,9 @@ ovn_dbctl_main(int argc, char *argv[],
>>  if (daemon_mode) {
>>  server_loop(dbctl_options, idl, argc, argv_);
>>  } else {
>> +/* Disable OVSDB probe interval for non-daemon mode. */
>> +ovsdb_idl_set_probe_interval(idl, 0);

I think I'd avoid using the idl function directly and call instead:

set_idl_probe_interval(idl, 0);

Just to keep it aligned with all other uses in OVN.  I can patch that at
apply time if it looks OK to you.

>> +
>>  struct ctl_command *commands;
>>  size_t n_commands;
>>  char *error;
>> --
>> 2.36.1
>>
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>>
> Looks good to me, thanks.
> 
> Reviewed-by: Ales Musil 
> 

Vladislav, Ales, I was thinking of backporting this to stable branches
too, what do you think?

Thanks,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 5/6] ovsdb: Avoid converting database twice on an initiator.

2023-03-31 Thread Simon Horman

On Mon, Mar 27, 2023 at 09:43:00PM +0200, Ilya Maximets wrote:
> Cluster member, that initiates the schema conversion, converts the
> database twice.  First time while verifying the possibility of the
> conversion, and the second time after reading conversion request
> back from the storage.
> 
> Keep the converted database from the first time around and use it
> after reading the request back from the storage.  This cuts in half
> the conversion CPU cost.
> 
> Signed-off-by: Ilya Maximets 

Reviewed-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 6/6] ovsdb: monitor: Keep and maintain the initial change set.

2023-03-31 Thread Simon Horman

On Mon, Mar 27, 2023 at 09:43:01PM +0200, Ilya Maximets wrote:
> Change sets in OVSDB monitor are storing all the changes that happened
> between a particular transaction ID and now.  Initial change set
> basically contains all the data.
> 
> On each monitor request a new initial change set is created by creating
> an empty change set and adding all the database rows.  Then it is
> converted into JSON reply and immediately untracked and destroyed.
> 
> This is causing significant performance issues if many clients are
> requesting new monitors at the same time.  For example, that is
> happening after database schema conversion, because conversion triggers
> cancellation of all monitors.  After cancellation, every client sends
> a new monitor request.  The server then creates a new initial change
> set, sends a reply, destroys initial change set and repeats that for
> each client.  On a system with 200 MB database and 500 clients,
> cluster of 3 servers spends 20 minutes replying to all the clients
> (200 MB x 500 = 100 GB):
> 
>   timeval|WARN|Unreasonably long 1201525ms poll interval
> 
> Of course, all the clients are already disconnected due to inactivity
> at this point.  When they are re-connecting back, server accepts new
> connections one at a time, so inactivity probes will not be triggered
> anymore, but it still takes another 20 minutes to handle all the
> incoming connections.
> 
> Let's keep the initial change set around for as long as the monitor
> itself exists.  This will allow us to not construct a new change set
> on each new monitor request and even utilize the JSON cache in some
> cases.  All that at a relatively small maintenance cost, since we'll
> need to commit changes to one extra change set on every transaction.
> Measured memory usage increase due to keeping around a shallow copy
> of a database is about 10%.  Measured CPU usage difference during
> normal operation is negligible.
> 
> With this change it takes only 30 seconds to send out all the monitor
> replies in the example above.  So, it's a 40x performance improvement.
> On a more reasonable setup with 250 nodes, the process takes up to
> 8-10 seconds instead of 4-5 minutes.
> 
> Conditional monitoring will benefit from this change as well, however
> results might be less impressive due to lack of JSON cache.
> 
> Signed-off-by: Ilya Maximets 

Reviewed-by: Simon Horman 

> ---
>  ovsdb/monitor.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)

A very pleasing LOC to impact ratio :)
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 4/6] ovsdb: Perform conversion with no data for clustered databases.

2023-03-31 Thread Simon Horman

+ Frode

On Mon, Mar 27, 2023 at 09:42:59PM +0200, Ilya Maximets wrote:
> Currently, database schema conversion in case of clustered database
> produces a transaction record with both new schema and converted
> database data.  So, the sequence of events is following:
> 
>   1. Get the new schema.
>   2. Convert the database to a new schema.
>   3. Translate the newly converted database into JSON.
>   4. Write the schema + data JSON to the storage.
>   5. Destroy converted version of a database.
>   6. Read schema + data JSON from the storage and parse.
>   7. Create a new database from a parsed database data.
>   8. Replace current database with the new one.
> 
> Most of these steps are very computationally expensive.  Also,
> conversion to/from JSON is much more expensive than direct database
> conversion with ovsdb_convert() that can make use of shallow data
> copies.
> 
> Instead of doing all that, let's make use of previously introduced
> ability to not write the converted data into the storage.  The process
> will look like this then:
> 
>   1. Get the new schema.
>   2. Convert the database to a new schema
>  (to verify that it is possible).
>   3. Write the schema to the storage.
>   4. Destroy converted version of a database.
>   5. Read the new schema from the storage and parse.
>   6. Convert the database to a new schema.
>   7. Replace current database with the new one.
> 
> Most of the operations here are performed on the small schema object,
> instead of the actual database data.  Two remaining data operations
> (actual conversion) are noticeably faster than conversion to/from
> JSON due to reference counting and shallow data copies.
> 
> Steps 4-6 can be optimized later to not convert twice on the
> process that initiates the conversion.
> 
> The change results in following performance improvements in conversion
> of OVN_Southbound database schema from version 20.23.0 to 20.27.0
> (measured on a single-server RAFT cluster with no clients):
> 
>   |   Before| After
>   +-+---+-+--
>   DB size |  Total  | Max poll interval |  Total  | Max poll interval
>   +-+---+-+--
>542 MB | 47 sec. | 26 sec.   | 15 sec. | 10 sec.
>225 MB | 19 sec. | 10 sec.   |  6 sec. |4.5 sec.
> 
> 542 MB database had 19.5 M atoms, 225 MB database had 7.5 M atoms.
> 
> Overall performance improvement is about 3x.
> 
> Also, note that before this change database conversion basically
> doubles the database file on disk.  Now it only writes a small
> schema JSON.
> 
> Since the change requires backward-incompatible database file format
> changes, documentation is updated on how to perform an upgrade.
> Handled the same way as we did for the previous incompatible format
> change in 2.15 (column diffs).
> 
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
> Signed-off-by: Ilya Maximets 

Reviewed-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 3/6] ovsdb: Allow conversion records with no data in a clustered storage.

2023-03-31 Thread Simon Horman

On Mon, Mar 27, 2023 at 09:42:58PM +0200, Ilya Maximets wrote:
> If the schema with no data was read from the clustered storage, it
> should mean a database conversion request.  In general, we can get:
> 
> 1. Just data --> Transaction record.
> 2. Schema + Data --> Database conversion or raft snapshot install.
> 3. Just schema --> New.  Database conversion request.
> 
> We cannot distinguish between conversion and snapshot installation
> request in the current implementation, so we will keep handling
> conversion with data in the same way as before, i.e. if data is
> provided, we should use it.
> 
> ovsdb-tool is updated to handle this record type as well while
> converting cluster to standalone.
> 
> This change doesn't introduce a way for such records to appear in
> the database.  That will be added in the future commits targeting
> conversion speed increase.
> 
> Signed-off-by: Ilya Maximets 

Reviewed-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 2/6] ovsdb: Check for ephemeral columns before writing a new schema.

2023-03-31 Thread Simon Horman

On Mon, Mar 27, 2023 at 09:42:57PM +0200, Ilya Maximets wrote:
> Clustered databases do not support ephemeral columns, but ovsdb-server
> checks for them after the conversion result is read from the storage.
> It's much easier to recover if this constraint is checked before writing
> to the storage instead.
> 
> It's not a big problem, because the check is always performed by the
> native ovsdb clients before sending a conversion request.  But the
> server, in general, should not trust clients to do the right thing.
> 
> Check in the update_schema() remains, because we shouldn't blindly
> trust the storage.
> 
> Signed-off-by: Ilya Maximets 

Reviewed-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 1/6] ovsdb-tool: Fix cluster-to-standalone for DB conversion records.

2023-03-31 Thread Simon Horman

On Mon, Mar 27, 2023 at 09:42:56PM +0200, Ilya Maximets wrote:
> If database conversion happens, both schema and the new data are
> present in the database record.  However, the schema is just silently
> ignored by ovsdb-tool cluster-to-standalone.  This creates data
> inconsistency if the new data contains new columns, for example, so
> the resulting database file will not be readable, or data will be lost.
> 
> Fix that by re-setting the database whenever a conversion record is
> found and actually writing a new schema that will match the actual
> data.  The database file will not be that similar to the original,
> but there is no way to represent conversion in a standalone database
> file format otherwise.
> 
> Fixes: 00de46f9ee42 ("ovsdb-tool: Convert clustered db to standalone db.")
> Signed-off-by: Ilya Maximets 

Reviewed-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v11 2/4] dpif-netdev: Show netdev offloading flags.

2023-03-31 Thread Simon Horman

On Mon, Mar 27, 2023 at 06:50:11AM -0400, Mike Pattrick wrote:
> From: Flavio Leitner 
> 
> This patch modifies netdev_get_status to include information about
> checksum offload status by port, allowing the user to gain insight into
> where checksum offloading is active.
> 
> Signed-off-by: Flavio Leitner 
> Co-authored-by: Mike Pattrick 
> Signed-off-by: Mike Pattrick 
> Reviewed-by: David Marchand 

Reviewed-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v11 1/4] Documentation: Document netdev offload.

2023-03-31 Thread Simon Horman

On Mon, Mar 27, 2023 at 06:50:10AM -0400, Mike Pattrick wrote:
> From: Flavio Leitner 
> 
> Document the implementation of netdev hardware offloading
> in userspace datapath.
> 
> Signed-off-by: Flavio Leitner 
> Co-authored-by: Mike Pattrick 
> Signed-off-by: Mike Pattrick 

Reviewed-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn 1/4] northd: Break ACLs into two stages.

2023-03-31 Thread Simon Horman

On Tue, Mar 21, 2023 at 01:59:06PM -0400, Mark Michelson wrote:
> Prior to this commit, ACLs were evaluated and acted on in a single
> stage. With this commit, evaluation of ACLs and acting on an ACL's
> decision are separated into two stages.
> 
> The acl_eval stage checks the ACL match and will set a bit to indicate
> the verdict of the ACL. The acl_action stage then checks the relevant
> bits to determine how to proceed. If no ACLs are matched, then the
> default ACL action is taken.
> 
> A couple of notes about updated tests:
> - For test cases where I just had to increment a table number, I changed
>   the check so the table numbers are masked. This should prevent similar
>   changes from being needed later.
> - The port security test changes may seem odd. The issue here is that
>   the ls_out_apply_port_sec table number changed from 9 to 10. This
>   means that this table's flows now sort to a lower position than
>   before. This is why the check had to change for this test.
> 
> Signed-off-by: Mark Michelson 

Hi Mark,

this does not appear to apply on top of main any more.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn v2 branch-22.12 2/2] northd: Drop packets for LBs with no backends

2023-03-31 Thread Dumitru Ceara

On 3/30/23 15:32, Ales Musil wrote:
> When the LB is configured without any backend
> and doesn't report event or reject the packet
> just simply drop the packet.
> 
> Reported-at: https://bugzilla.redhat.com/2177173
> Signed-off-by: Ales Musil 
> Signed-off-by: Dumitru Ceara 
> (cherry picked from commit 75b0bcbb019a8caa2cfebffeff1ecf155fcd1fbc)
> ---
> This version applies cleanly down to 21.12.
> ---

Thanks for the fix, Ales, applied to 22.12 and backported down to 21.12.

I had to tweak some table numbers here and there and had to use "drop;"
instead of debug_drop_action() on some branches.

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn v2 branch-22.12 1/2] northd: Make the enclose handling less error prone

2023-03-31 Thread Dumitru Ceara

On 3/30/23 15:32, Ales Musil wrote:
> The enclose handling was working for all cases
> except the drop. Make sure that the enclose is
> properly defined for that case and make sure
> that it is used properly.
> 
> Fixes: cf205ca0e52c ("northd: Fix missig "); " from LB flows")
> Signed-off-by: Ales Musil 
> ---

Thanks for the fix, Ales, applied to 22.12 and backported down to 21.12.

Just for future-me: this is only applies to branches <= 22.12 because in
23.03 the corresponding LB code has been changed and the bug indirectly
got fixed via 7b94d212f694 ("northd: Refactor
build_lrouter_nat_flows_for_lb function").

>  northd/northd.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/northd/northd.c b/northd/northd.c
> index 3165f4bc2..89bd9cbb3 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -10525,11 +10525,13 @@ build_lrouter_nat_flows_for_lb(struct ovn_lb_vip 
> *lb_vip,
> lb->selection_fields, false,
> features->ct_no_masked_label);
>  bool drop = !!strncmp(ds_cstr(action), "ct_lb", strlen("ct_lb"));
> +const char *enclose = drop ? "" : ");";
>  if (!drop) {
>  /* Remove the trailing ");". */
>  ds_truncate(action, action->length - 2);
>  }
>  
> +

I removed this extra newline.

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [RFC v2 ovn 4/4] northd: take into account qos_min_rate in port_has_qos_params

2023-03-31 Thread Lorenzo Bianconi

Similar to qos_max_rate and qos_burst, take into account qos_min_rate in
port_has_qos_params routine.

Fixes: 8dda482b80 ("Check and allocate free qdisc queue id for ports with qos 
parameters")
Signed-off-by: Lorenzo Bianconi 
---
 northd/northd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/northd/northd.c b/northd/northd.c
index 7a10e4dcf..57ec10956 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -559,7 +559,7 @@ free_chassis_queueid(struct hmap *set, const struct uuid 
*uuid,
 static inline bool
 port_has_qos_params(const struct smap *opts)
 {
-return (smap_get(opts, "qos_max_rate") ||
+return (smap_get(opts, "qos_max_rate") || smap_get(opts, "qos_min_rate") ||
 smap_get(opts, "qos_burst"));
 }
 
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [RFC v2 ovn 3/4] controller: add QoS support for logical switch port interfaces

2023-03-31 Thread Lorenzo Bianconi

Similar to localnet port, introduce QoS (HTB) support for logical switch
port interfaces.

Signed-off-by: Lorenzo Bianconi 
---
 controller/binding.c | 17 ++---
 tests/system-ovn.at  | 34 ++
 2 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 55156834e..37953c973 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -255,6 +255,7 @@ static void
 remove_stale_ovs_qos_entry(const struct ovsrec_qos_table *qos_table,
const struct sbrec_port_binding_table *pb_table,
struct ovsdb_idl_index *ovsrec_port_by_qos,
+   struct shash *local_bindings,
struct smap *egress_ifaces)
 {
 const struct ovsrec_qos *qos, *qos_next;
@@ -269,7 +270,8 @@ remove_stale_ovs_qos_entry(const struct ovsrec_qos_table 
*qos_table,
 bool stale = true;
 const struct sbrec_port_binding *pb;
 SBREC_PORT_BINDING_TABLE_FOR_EACH (pb, pb_table) {
-if (get_lport_type(pb) != LP_LOCALNET) {
+if (get_lport_type(pb) != LP_LOCALNET &&
+!local_binding_find(local_bindings, pb->logical_port)) {
 continue;
 }
 
@@ -316,7 +318,7 @@ configure_ovs_qos(struct hmap *queue_map,
   const struct sbrec_port_binding_table *pb_table,
   struct ovsdb_idl_index *ovsrec_port_by_name,
   struct ovsdb_idl_index *ovsrec_port_by_qos,
-  struct smap *egress_ifaces)
+  struct shash *local_bindings, struct smap *egress_ifaces)
 
 {
 struct qos_queue *sb_info;
@@ -329,7 +331,8 @@ configure_ovs_qos(struct hmap *queue_map,
 if (ovs_idl_txn) {
 /* Remove stale QoS entries. */
 remove_stale_ovs_qos_entry(qos_table, pb_table,
-   ovsrec_port_by_qos, egress_ifaces);
+   ovsrec_port_by_qos, local_bindings,
+   egress_ifaces);
 }
 }
 
@@ -1858,6 +1861,8 @@ build_local_bindings(struct binding_ctx_in *b_ctx_in,
 if (!lbinding) {
 lbinding = local_binding_create(iface_id, iface_rec);
 local_binding_add(local_bindings, lbinding);
+smap_replace(b_ctx_out->egress_ifaces, iface_id,
+ iface_rec->name);
 } else {
 lbinding->multiple_bindings = true;
 static struct vlog_rate_limit rl =
@@ -2030,6 +2035,7 @@ binding_run(struct binding_ctx_in *b_ctx_in, struct 
binding_ctx_out *b_ctx_out)
   b_ctx_in->port_binding_table,
   b_ctx_in->ovsrec_port_by_name,
   b_ctx_in->ovsrec_port_by_qos,
+  &b_ctx_out->lbinding_data->bindings,
   b_ctx_out->egress_ifaces);
 destroy_qos_map(&qos_map);
 
@@ -2152,6 +2158,8 @@ consider_iface_claim(const struct ovsrec_interface 
*iface_rec,
 pb = b_lport->pb;
 }
 
+smap_replace(b_ctx_out->egress_ifaces, iface_id, iface_rec->name);
+
 if (!pb) {
 /* There is no port_binding row for this local binding. */
 return true;
@@ -2278,6 +2286,7 @@ consider_iface_release(const struct ovsrec_interface 
*iface_rec,
  b_ctx_out->if_mgr);
 }
 
+smap_remove(b_ctx_out->egress_ifaces, iface_id);
 remove_local_lports(iface_id, b_ctx_out);
 smap_remove(b_ctx_out->local_iface_ids, iface_rec->name);
 
@@ -2506,6 +2515,7 @@ binding_handle_ovs_interface_changes(struct 
binding_ctx_in *b_ctx_in,
   b_ctx_in->port_binding_table,
   b_ctx_in->ovsrec_port_by_name,
   b_ctx_in->ovsrec_port_by_qos,
+  &b_ctx_out->lbinding_data->bindings,
   b_ctx_out->egress_ifaces);
 }
 destroy_qos_map(&qos_map);
@@ -3032,6 +3042,7 @@ delete_done:
   b_ctx_in->port_binding_table,
   b_ctx_in->ovsrec_port_by_name,
   b_ctx_in->ovsrec_port_by_qos,
+  &b_ctx_out->lbinding_data->bindings,
   b_ctx_out->egress_ifaces);
 }
 
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 010c70118..6116cfc35 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -6635,6 +6635,40 @@ AT_CHECK([ovn-nbctl remove Logical_Switch_Port public 
options qos_burst=600]
 
 OVS_WAIT_UNTIL([test "$(tc qdisc show | grep 'htb 1: dev ovs-public')" = ""])
 
+AT_CHECK([ovn-nbctl set Logical_Switch_Port sw01 options:qos_min_rate=30])
+AT_CHECK([ovn-nbctl set Logical_Switch_Port sw01 options:qos_max_rate=40])
+AT_CHECK([ovn-nbctl set Logical_Switch_Port sw01 options:qos_burst=300])
+
+OVS_WAIT_UNTIL([tc qdisc show | g

[ovs-dev] [RFC v2 ovn 2/4] controller: improve ovs port lookup by name and qos

2023-03-31 Thread Lorenzo Bianconi

Introduce ovsport_lookup_by_name and ovsport_lookup_by_qos routines in
order to speed-up ovs port lookup based on port name or qos.

Signed-off-by: Lorenzo Bianconi 
---
 controller/binding.c| 49 +
 controller/binding.h|  3 ++-
 controller/ovn-controller.c | 19 +++---
 controller/ovsport.c| 32 
 controller/ovsport.h|  5 
 5 files changed, 77 insertions(+), 31 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 5beb2d639..55156834e 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -36,6 +36,7 @@
 #include "lport.h"
 #include "ovn-controller.h"
 #include "patch.h"
+#include "ovsport.h"
 
 VLOG_DEFINE_THIS_MODULE(binding);
 
@@ -175,17 +176,11 @@ get_qos_params(const struct sbrec_port_binding *pb, 
struct hmap *queue_map,
 static void
 add_ovs_qos_table_entry(struct ovsdb_idl_txn *ovs_idl_txn,
 const struct ovsrec_qos_table *qos_table,
-const struct ovsrec_port_table *port_table,
+struct ovsdb_idl_index *ovsrec_port_by_name,
 struct qos_queue *sb_info)
 {
-const struct ovsrec_port *port = NULL, *iter;
-OVSREC_PORT_TABLE_FOR_EACH (iter, port_table) {
-if (!strcmp(iter->name, sb_info->port_name)) {
-port = iter;
-break;
-}
-}
-
+const struct ovsrec_port *port =
+ovsport_lookup_by_name(ovsrec_port_by_name, sb_info->port_name);
 if (!port) {
 return;
 }
@@ -257,9 +252,9 @@ add_ovs_qos_table_entry(struct ovsdb_idl_txn *ovs_idl_txn,
 }
 
 static void
-remove_stale_ovs_qos_entry(const struct ovsrec_port_table *port_table,
-   const struct ovsrec_qos_table *qos_table,
+remove_stale_ovs_qos_entry(const struct ovsrec_qos_table *qos_table,
const struct sbrec_port_binding_table *pb_table,
+   struct ovsdb_idl_index *ovsrec_port_by_qos,
struct smap *egress_ifaces)
 {
 const struct ovsrec_qos *qos, *qos_next;
@@ -268,14 +263,8 @@ remove_stale_ovs_qos_entry(const struct ovsrec_port_table 
*port_table,
 continue;
 }
 
-const struct ovsrec_port *port = NULL, *iter;
-OVSREC_PORT_TABLE_FOR_EACH (iter, port_table) {
-if (iter->qos == qos) {
-port = iter;
-break;
-}
-}
-
+const struct ovsrec_port *port =
+ovsport_lookup_by_qos(ovsrec_port_by_qos, qos);
 struct ovsrec_queue *queue = qos->value_queues[0];
 bool stale = true;
 const struct sbrec_port_binding *pb;
@@ -323,22 +312,24 @@ remove_stale_ovs_qos_entry(const struct ovsrec_port_table 
*port_table,
 static void
 configure_ovs_qos(struct hmap *queue_map,
   struct ovsdb_idl_txn *ovs_idl_txn,
-  const struct ovsrec_port_table *port_table,
   const struct ovsrec_qos_table *qos_table,
   const struct sbrec_port_binding_table *pb_table,
+  struct ovsdb_idl_index *ovsrec_port_by_name,
+  struct ovsdb_idl_index *ovsrec_port_by_qos,
   struct smap *egress_ifaces)
 
 {
 struct qos_queue *sb_info;
 HMAP_FOR_EACH (sb_info, node, queue_map) {
 /* Add new QoS entries. */
-add_ovs_qos_table_entry(ovs_idl_txn, qos_table, port_table, sb_info);
+add_ovs_qos_table_entry(ovs_idl_txn, qos_table,
+ovsrec_port_by_name, sb_info);
 }
 
 if (ovs_idl_txn) {
 /* Remove stale QoS entries. */
-remove_stale_ovs_qos_entry(port_table, qos_table, pb_table,
-   egress_ifaces);
+remove_stale_ovs_qos_entry(qos_table, pb_table,
+   ovsrec_port_by_qos, egress_ifaces);
 }
 }
 
@@ -2035,8 +2026,10 @@ binding_run(struct binding_ctx_in *b_ctx_in, struct 
binding_ctx_out *b_ctx_out)
 shash_destroy(&bridge_mappings);
 
 configure_ovs_qos(&qos_map, b_ctx_in->ovs_idl_txn,
-  b_ctx_in->port_table, b_ctx_in->qos_table,
+  b_ctx_in->qos_table,
   b_ctx_in->port_binding_table,
+  b_ctx_in->ovsrec_port_by_name,
+  b_ctx_in->ovsrec_port_by_qos,
   b_ctx_out->egress_ifaces);
 destroy_qos_map(&qos_map);
 
@@ -2509,8 +2502,10 @@ binding_handle_ovs_interface_changes(struct 
binding_ctx_in *b_ctx_in,
 
 if (handled) {
 configure_ovs_qos(&qos_map, b_ctx_in->ovs_idl_txn,
-  b_ctx_in->port_table, b_ctx_in->qos_table,
+  b_ctx_in->qos_table,
   b_ctx_in->port_binding_table,
+  b_ctx_in->ovsrec_port_by_name,
+  b_ctx_in->ovsrec_port_by_qos,

[ovs-dev] [RFC v2 ovn 1/4] controller: configure qos through ovs qos table and do not run tc directly

2023-03-31 Thread Lorenzo Bianconi

Rework OVN QoS implementation in order to configure it through OVS QoS
table instead of running tc command directly bypassing OVS.
Moreover, in the current codebase it is not possible to specify two
different QoS rules for two different localnet ports, even if they
are running on two different datapaths. Both ports will be configured
with the latest QoS rule in the hashmap since it is not possible to
link the QoS rule to the corresponding OVS port.
Fix the issue introducing iface-id in external_ids column for OVS
interface table for the physical interface used by localnet port similar
to what we already do for logical switch port.
Keep ovn-egress-iface for backward compatibility but assume we can have
at most one OVS physical interface per datapath used by localnet port.
Convert egress_ifaces to smap instead of sset.

Signed-off-by: Lorenzo Bianconi 
---
 controller/binding.c| 419 +---
 controller/binding.h|   2 +-
 controller/ovn-controller.c |  16 +-
 tests/system-ovn.at |  67 +-
 4 files changed, 269 insertions(+), 235 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 5df62baef..5beb2d639 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -111,6 +111,7 @@ binding_wait(void)
 
 struct qos_queue {
 struct hmap_node node;
+char *port_name;
 uint32_t queue_id;
 uint32_t min_rate;
 uint32_t max_rate;
@@ -148,210 +149,197 @@ static void update_lport_tracking(const struct 
sbrec_port_binding *pb,
   bool claimed);
 
 static void
-get_qos_params(const struct sbrec_port_binding *pb, struct hmap *queue_map)
+get_qos_params(const struct sbrec_port_binding *pb, struct hmap *queue_map,
+   struct smap *egress_ifaces)
 {
 uint32_t min_rate = smap_get_int(&pb->options, "qos_min_rate", 0);
 uint32_t max_rate = smap_get_int(&pb->options, "qos_max_rate", 0);
 uint32_t burst = smap_get_int(&pb->options, "qos_burst", 0);
 uint32_t queue_id = smap_get_int(&pb->options, "qdisc_queue_id", 0);
+const char *ovs_port = smap_get(egress_ifaces, pb->logical_port);
 
-if ((!min_rate && !max_rate && !burst) || !queue_id) {
+if ((!min_rate && !max_rate && !burst) || !queue_id || !ovs_port) {
 /* Qos is not configured for this port. */
 return;
 }
 
 struct qos_queue *node = xzalloc(sizeof *node);
 hmap_insert(queue_map, &node->node, hash_int(queue_id, 0));
+node->port_name = xstrdup(ovs_port);
 node->min_rate = min_rate;
 node->max_rate = max_rate;
 node->burst = burst;
 node->queue_id = queue_id;
 }
 
-static const struct ovsrec_qos *
-get_noop_qos(struct ovsdb_idl_txn *ovs_idl_txn,
- const struct ovsrec_qos_table *qos_table)
-{
-const struct ovsrec_qos *qos;
-OVSREC_QOS_TABLE_FOR_EACH (qos, qos_table) {
-if (!strcmp(qos->type, "linux-noop")) {
-return qos;
+static void
+add_ovs_qos_table_entry(struct ovsdb_idl_txn *ovs_idl_txn,
+const struct ovsrec_qos_table *qos_table,
+const struct ovsrec_port_table *port_table,
+struct qos_queue *sb_info)
+{
+const struct ovsrec_port *port = NULL, *iter;
+OVSREC_PORT_TABLE_FOR_EACH (iter, port_table) {
+if (!strcmp(iter->name, sb_info->port_name)) {
+port = iter;
+break;
 }
 }
 
-if (!ovs_idl_txn) {
-return NULL;
+if (!port) {
+return;
 }
-qos = ovsrec_qos_insert(ovs_idl_txn);
-ovsrec_qos_set_type(qos, "linux-noop");
-return qos;
-}
 
-static bool
-set_noop_qos(struct ovsdb_idl_txn *ovs_idl_txn,
- const struct ovsrec_port_table *port_table,
- const struct ovsrec_qos_table *qos_table,
- struct sset *egress_ifaces)
-{
-if (!ovs_idl_txn) {
-return false;
-}
+const struct ovsrec_qos *qos;
+struct ovsrec_queue *queue;
+OVSREC_QOS_TABLE_FOR_EACH (qos, qos_table) {
+if (strcmp(qos->type, OVN_QOS_TYPE)) {
+continue;
+}
 
-const struct ovsrec_qos *noop_qos = get_noop_qos(ovs_idl_txn, qos_table);
-if (!noop_qos) {
-return false;
-}
+if (!qos->n_queues) {
+continue;
+}
 
-const struct ovsrec_port *port;
-size_t count = 0;
+if (port->qos != qos) {
+continue;
+}
 
-OVSREC_PORT_TABLE_FOR_EACH (port, port_table) {
-if (sset_contains(egress_ifaces, port->name)) {
-ovsrec_port_set_qos(port, noop_qos);
-count++;
+if (qos->key_queues[0] != sb_info->queue_id) {
+continue;
 }
-if (sset_count(egress_ifaces) == count) {
-break;
+
+queue = qos->value_queues[0];
+uint32_t max_rate = smap_get_int(&queue->other_config, "max-rate", 0);
+if (max_rate != sb_info->max_rate) {
+continue;
 }
-

[ovs-dev] [RFC v2 ovn 0/4] rework OVN QoS implementation

2023-03-31 Thread Lorenzo Bianconi

Rework OVN QoS implementation in order to configure it through OVS QoS
table instead of running tc command directly bypassing OVS.

Changes since v1:
- get rid of qos_ovs_port from logical_switch_port option column and let ovn to
  compute it
- add VIF QoS support
- take into account qos_min_rate in port_has_qos_params

Lorenzo Bianconi (4):
  controller: configure qos through ovs qos table and do not run tc
directly
  controller: improve ovs port lookup by name and qos
  controller: add QoS support for logical switch port interfaces
  northd: take into account qos_min_rate in port_has_qos_params

 controller/binding.c| 429 +---
 controller/binding.h|   5 +-
 controller/ovn-controller.c |  35 ++-
 controller/ovsport.c|  32 +++
 controller/ovsport.h|   5 +
 northd/northd.c |   2 +-
 tests/system-ovn.at | 101 -
 7 files changed, 367 insertions(+), 242 deletions(-)

-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v5] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-03-31 Thread Simon Horman

On Fri, Mar 31, 2023 at 12:05:09PM +0200, Ilya Maximets wrote:
> On 3/31/23 11:07, Simon Horman wrote:
> > On Thu, Mar 30, 2023 at 09:04:02PM +0200, Ilya Maximets wrote:
> >> On 3/30/23 11:45, Simon Horman wrote:
> >>> On Fri, Mar 17, 2023 at 09:47:36PM +0100, Eelco Chaudron wrote:
> > Op 17 mrt. 2023 om 21:11 heeft Marcelo Ricardo Leitner 
> >  het volgende geschreven:
> > Thu, Mar 16, 2023 at 09:51:34AM +0100, Eelco Chaudron wrote:
> >>> On 22 Dec 2022, at 13:32, Eelco Chaudron wrote:
>  On 22 Dec 2022, at 10:26, Balazs Nemeth wrote:
> >>>
>  The only way that stats->{n_packets,n_bytes} would decrease is due 
>  to an
> >>>
> >>> nit: s/way/ways/

>  overflow, or if there are bugs in how statistics are handled. In the
>  past, there were multiple bugs that caused a jump backward. A
>  workaround was in place to set the statistics to 0 in that case. When
>  this happened while the revalidator was under heavy load, the 
>  workaround
>  had an unintended side effect where should_revalidate returned false
>  causing the flow to be removed because the metric it calculated was
>  based on a bogus value. Since many of those bugs have now been
>  identified and resolved, there is no need to set the statistics to 
>  0. In
>  addition, the (unlikely) overflow still needs to be handled
>  appropriately.
> >>>
> >>> 1. Perhaps it would be prudent to log/count if/when this occurs
> >>
> >> +1
> >> We do have a coverage counter that will indicate the case where stats
> >> jump back.  However, if we're certain that this should never happen,
> >> we should, probably, emit a warning or even an error log as well, so
> >> users are aware that something went wrong.
> > 
> > I was thinking more of a counter, which seems to already be covered.
> > But I have no objection to your reasoning about having a warning (too).
> > 
> >>
> >>> 2. I take it that the overflow handling would be follow-up work,
> >>>is that correct?
> >>
> >> The unsigned arithmetic should take case of overflowed counters,
> >> because the result of subtraction will still give a correct difference
> >> between the old and a new value, even if it overflowed and the new
> >> value is smaller.  Unless, of course, it overflowed more than once.
> > 
> > More than once between samples?
> > If so, I'm assuming that is not a case we can hit unless there is a bug.
> 
> Right.  It's actually should be practically not possible to overflow
> even once with a current hardware.  Assuming we have a fancy 400 Gbps
> NIC, then it should take 11.7 years to overflow a byte counter.
> 
> So, this patch is mostly removing a workaround for some bug that we
> hope we fixed.  But it's not clear what the original bug was as the
> commit message for this workaround didn't specify a root cause.  So,
> it's hard to say if it's fixed or not.  And that's why I'm thinking
> that the error message is needed.

Yes, I agree that is prudent.

...
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v5] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-03-31 Thread Ilya Maximets

On 3/31/23 11:07, Simon Horman wrote:
> On Thu, Mar 30, 2023 at 09:04:02PM +0200, Ilya Maximets wrote:
>> On 3/30/23 11:45, Simon Horman wrote:
>>> On Fri, Mar 17, 2023 at 09:47:36PM +0100, Eelco Chaudron wrote:


 Send from my phone

> Op 17 mrt. 2023 om 21:11 heeft Marcelo Ricardo Leitner 
>  het volgende geschreven:
>
> On Thu, Mar 16, 2023 at 09:51:34AM +0100, Eelco Chaudron wrote:
>>
>>
>>> On 22 Dec 2022, at 13:32, Eelco Chaudron wrote:
>>>
 On 22 Dec 2022, at 10:26, Balazs Nemeth wrote:
>>>
 The only way that stats->{n_packets,n_bytes} would decrease is due to 
 an
>>>
>>> nit: s/way/ways/
>>>
 overflow, or if there are bugs in how statistics are handled. In the
 past, there were multiple bugs that caused a jump backward. A
 workaround was in place to set the statistics to 0 in that case. When
 this happened while the revalidator was under heavy load, the 
 workaround
 had an unintended side effect where should_revalidate returned false
 causing the flow to be removed because the metric it calculated was
 based on a bogus value. Since many of those bugs have now been
 identified and resolved, there is no need to set the statistics to 0. 
 In
 addition, the (unlikely) overflow still needs to be handled
 appropriately.
>>>
>>> 1. Perhaps it would be prudent to log/count if/when this occurs
>>
>> +1
>> We do have a coverage counter that will indicate the case where stats
>> jump back.  However, if we're certain that this should never happen,
>> we should, probably, emit a warning or even an error log as well, so
>> users are aware that something went wrong.
> 
> I was thinking more of a counter, which seems to already be covered.
> But I have no objection to your reasoning about having a warning (too).
> 
>>
>>> 2. I take it that the overflow handling would be follow-up work,
>>>is that correct?
>>
>> The unsigned arithmetic should take case of overflowed counters,
>> because the result of subtraction will still give a correct difference
>> between the old and a new value, even if it overflowed and the new
>> value is smaller.  Unless, of course, it overflowed more than once.
> 
> More than once between samples?
> If so, I'm assuming that is not a case we can hit unless there is a bug.

Right.  It's actually should be practically not possible to overflow
even once with a current hardware.  Assuming we have a fancy 400 Gbps
NIC, then it should take 11.7 years to overflow a byte counter.

So, this patch is mostly removing a workaround for some bug that we
hope we fixed.  But it's not clear what the original bug was as the
commit message for this workaround didn't specify a root cause.  So,
it's hard to say if it's fixed or not.  And that's why I'm thinking
that the error message is needed.

> 
 Signed-off-by: Balazs Nemeth 
 ---

 Depends on:
 - netdev-offload-tc: Preserve tc statistics when flow gets modified.
 - tc: Add support for TCA_STATS_PKT64
 - ofproto-dpif-upcall: Wait for valid hw flow stats before applying 
 min-revalidate-pps
 - ofproto-dpif-upcall: Use last known stats ukey stats on revalidate 
 missed dp flows.
>>>
>>> Did a lot of tests with this patch applied, but as mentioned it does 
>>> need all four patches mentioned above applied first.
>>>
>>> Acked-by: Eelco Chaudron 
>>
>> Now that all the dependent patches have been applied and backported to 
>> 2.17, can we have another look at this patch?
>
> Not sure what you mean here, Eelco. You mean, just a patch review or
> you mean something else, like evaluating if the patch is still needed
> at all?

 Guess I was not clear enough :) what I meant was for the maintainers to
 have another look at this patch and if no further objections apply it.
>>>
>>> There seems to now be minor fuzz when applying this patch (locally).
>>> Perhaps a rebase is in order?
>>
>> Yeah, the coverage counter got introduced in a close proximity.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] ci: Change scheduled jobs to use the latest OVS stable branch.

2023-03-31 Thread Ales Musil

On Mon, Mar 27, 2023 at 5:39 PM Dumitru Ceara  wrote:

> Until now weekly OVN jobs would try to compile against OVS master
> branch.  But that potentially contains changes that break API.  For
> example a recent OVS commit [0] changed the signature of the
> daemonize_start() function.  In order to avoid build failures due
> to such changes, adapt the weekly OVN CI job to compile against the most
> recent OVS stable branch commit.  Most likely that won't contain changes
> that break APIs used by OVN.
>
> [0]
> https://github.com/openvswitch/ovs/commit/07cf5810de8da12c700324bc421bde92376abe06
>
> Signed-off-by: Dumitru Ceara 
>

Hi Dumitru,
this unfortunately breaks the container CI. As it is kinda unexpected to
do something with the git repos inside the build-x.sh. Would it be possible
to have it in the workflow instead?. That would also make it way more
obvious what
is the intention.

Thanks,
Ales


> ---
>  .ci/linux-build.sh |  6 ++
>  .ci/osx-build.sh   |  7 +++
>  .ci/util.sh|  9 +
>  .github/workflows/test.yml | 36 +++-
>  Makefile.am|  1 +
>  5 files changed, 42 insertions(+), 17 deletions(-)
>  create mode 100644 .ci/util.sh
>
> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
> index 7dfb5c3171..c843dda80c 100755
> --- a/.ci/linux-build.sh
> +++ b/.ci/linux-build.sh
> @@ -3,15 +3,21 @@
>  set -o errexit
>  set -x
>
> +. ./.ci/util.sh
> +
>  ARCH=${ARCH:-"x86_64"}
>  COMMON_CFLAGS=""
>  OVN_CFLAGS=""
>  OPTS="$OPTS --enable-Werror"
>  JOBS=${JOBS:-"-j4"}
> +OVS_USE_STABLE=${OVS_USE_STABLE:false}
>
>  function configure_ovs()
>  {
>  pushd ovs
> +if [ "$OVS_USE_STABLE" = "true" ]; then
> +checkout_latest_stable_branch
> +fi
>  ./boot.sh && ./configure CFLAGS="${COMMON_CFLAGS}" $* || \
>  { cat config.log; exit 1; }
>  make $JOBS || { cat config.log; exit 1; }
> diff --git a/.ci/osx-build.sh b/.ci/osx-build.sh
> index 4b78b66dd1..03ffb9287b 100755
> --- a/.ci/osx-build.sh
> +++ b/.ci/osx-build.sh
> @@ -1,13 +1,20 @@
>  #!/bin/bash
>
>  set -o errexit
> +set -x
> +
> +. ./.ci/util.sh
>
>  CFLAGS="-Werror $CFLAGS"
>  EXTRA_OPTS=""
> +OVS_USE_STABLE=${OVS_USE_STABLE:false}
>
>  function configure_ovs()
>  {
>  pushd ovs
> +if [ "$OVS_USE_STABLE" = "true" ]; then
> +checkout_latest_stable_branch
> +fi
>  ./boot.sh && ./configure $*
>  make -j4 || { cat config.log; exit 1; }
>  popd
> diff --git a/.ci/util.sh b/.ci/util.sh
> new file mode 100644
> index 00..952371dd68
> --- /dev/null
> +++ b/.ci/util.sh
> @@ -0,0 +1,9 @@
> +# Tries to guess the latest stable branch in a git repo and checks it out.
> +# Assumes the CWD is inside a clone of the repo.  It also assumes stable
> +# branch names follow the "branch-x.y" convention.
> +function checkout_latest_stable_branch()
> +{
> +local branch=$(git branch -a -l '*branch-*' | \
> +sed 's/remotes\/origin\///' | sort -V | tail -1)
> +git checkout $branch
> +}
> diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
> index 90dc8a6f19..82f916a997 100644
> --- a/.github/workflows/test.yml
> +++ b/.github/workflows/test.yml
> @@ -19,13 +19,14 @@ jobs:
>  libssl-dev llvm-dev libelf-dev libnuma-dev libpcap-dev  \
>  selinux-policy-dev ncat python3-scapy isc-dhcp-server
>m32_dependecies: gcc-multilib
> -  ARCH:${{ matrix.cfg.arch }}
> -  CC:  ${{ matrix.cfg.compiler }}
> -  LIBS:${{ matrix.cfg.libs }}
> -  OPTS:${{ matrix.cfg.opts }}
> -  TESTSUITE:   ${{ matrix.cfg.testsuite }}
> -  TEST_RANGE:  ${{ matrix.cfg.test_range }}
> -  SANITIZERS:  ${{ matrix.cfg.sanitizers }}
> +  ARCH:   ${{ matrix.cfg.arch }}
> +  CC: ${{ matrix.cfg.compiler }}
> +  LIBS:   ${{ matrix.cfg.libs }}
> +  OPTS:   ${{ matrix.cfg.opts }}
> +  TESTSUITE:  ${{ matrix.cfg.testsuite }}
> +  TEST_RANGE: ${{ matrix.cfg.test_range }}
> +  SANITIZERS: ${{ matrix.cfg.sanitizers }}
> +  OVS_USE_STABLE: ${{ github.event_name == 'schedule'}}
>
>  name: linux ${{ join(matrix.cfg.*, ' ') }}
>  runs-on: ubuntu-20.04
> @@ -70,15 +71,15 @@ jobs:
>if: github.event_name == 'schedule'
>uses: actions/checkout@v3
>
> -# Weekly runs test using OVS master instead of the
> -# submodule.
> -- name: checkout OVS master
> +# Weekly runs test using the tip of the most recent stable OVS branch
> +# instead of the submodule (OVS_USE_STABLE gets set to "true").
> +- name: checkout OVS
>if: github.event_name == 'schedule'
>uses: actions/checkout@v3
>with:
>  repository: 'openvswitch/ovs'
> +fetch-depth: 0
>  path: 'ovs'
> -ref: 'master'
>
>  - name: update APT cache
>run:  sudo apt update
> @@ -137,8 +138,9 @@ jobs:
>
>build-osx:
>  env:
> -

Re: [ovs-dev] [PATCH v10] netdev-offload-tc: del ufid mapping if device not exist.

2023-03-31 Thread Simon Horman

On Thu, Mar 30, 2023 at 05:27:23PM +0800, Faicker Mo wrote:
> The device may be deleted and added with ifindex changed.
> The tc rules on the device will be deleted if the device is deleted.
> The func tc_del_filter will fail when flow del. The mapping of
> ufid to tc will not be deleted.
> The traffic will trigger the same flow(with same ufid) to put to tc
> on the new device. Duplicated ufid mapping will be added.
> If the hashmap is expanded, the old mapping entry will be the first entry,
> and now the dp flow can't be deleted.
> 
> Signed-off-by: Faicker Mo 

Thanks, I am running this through the test I described in [1] -
a tight loop on a low-end VM.

So far things look good, but the loop has not run many times yet.
I plan to let it run over the weekend.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] utilities: disable OVSDB inactivity probes for non-daemon ovn-nbctl

2023-03-31 Thread Ales Musil

On Thu, Mar 23, 2023 at 8:25 PM Vladislav Odintsov 
wrote:

> For large OVN_Southbound DBs defatult interval of 5000 ms could be not
> sufficient.  This patch disables OVSDB inactivity probes for ovn-*ctl
> running
> in non-daemon mode.
>
> Signed-off-by: Vladislav Odintsov 
> ---
>  utilities/ovn-dbctl.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c
> index 369a6a663..4307a5cae 100644
> --- a/utilities/ovn-dbctl.c
> +++ b/utilities/ovn-dbctl.c
> @@ -208,6 +208,9 @@ ovn_dbctl_main(int argc, char *argv[],
>  if (daemon_mode) {
>  server_loop(dbctl_options, idl, argc, argv_);
>  } else {
> +/* Disable OVSDB probe interval for non-daemon mode. */
> +ovsdb_idl_set_probe_interval(idl, 0);
> +
>  struct ctl_command *commands;
>  size_t n_commands;
>  char *error;
> --
> 2.36.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Looks good to me, thanks.

Reviewed-by: Ales Musil 

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.comIM: amusil

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v5] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-03-31 Thread Simon Horman

On Thu, Mar 30, 2023 at 09:04:02PM +0200, Ilya Maximets wrote:
> On 3/30/23 11:45, Simon Horman wrote:
> > On Fri, Mar 17, 2023 at 09:47:36PM +0100, Eelco Chaudron wrote:
> >>
> >>
> >> Send from my phone
> >>
> >>> Op 17 mrt. 2023 om 21:11 heeft Marcelo Ricardo Leitner 
> >>>  het volgende geschreven:
> >>>
> >>> On Thu, Mar 16, 2023 at 09:51:34AM +0100, Eelco Chaudron wrote:
> 
> 
> > On 22 Dec 2022, at 13:32, Eelco Chaudron wrote:
> >
> >> On 22 Dec 2022, at 10:26, Balazs Nemeth wrote:
> >
> >> The only way that stats->{n_packets,n_bytes} would decrease is due to 
> >> an
> > 
> > nit: s/way/ways/
> > 
> >> overflow, or if there are bugs in how statistics are handled. In the
> >> past, there were multiple bugs that caused a jump backward. A
> >> workaround was in place to set the statistics to 0 in that case. When
> >> this happened while the revalidator was under heavy load, the 
> >> workaround
> >> had an unintended side effect where should_revalidate returned false
> >> causing the flow to be removed because the metric it calculated was
> >> based on a bogus value. Since many of those bugs have now been
> >> identified and resolved, there is no need to set the statistics to 0. 
> >> In
> >> addition, the (unlikely) overflow still needs to be handled
> >> appropriately.
> > 
> > 1. Perhaps it would be prudent to log/count if/when this occurs
> 
> +1
> We do have a coverage counter that will indicate the case where stats
> jump back.  However, if we're certain that this should never happen,
> we should, probably, emit a warning or even an error log as well, so
> users are aware that something went wrong.

I was thinking more of a counter, which seems to already be covered.
But I have no objection to your reasoning about having a warning (too).

> 
> > 2. I take it that the overflow handling would be follow-up work,
> >is that correct?
> 
> The unsigned arithmetic should take case of overflowed counters,
> because the result of subtraction will still give a correct difference
> between the old and a new value, even if it overflowed and the new
> value is smaller.  Unless, of course, it overflowed more than once.

More than once between samples?
If so, I'm assuming that is not a case we can hit unless there is a bug.

> >> Signed-off-by: Balazs Nemeth 
> >> ---
> >>
> >> Depends on:
> >> - netdev-offload-tc: Preserve tc statistics when flow gets modified.
> >> - tc: Add support for TCA_STATS_PKT64
> >> - ofproto-dpif-upcall: Wait for valid hw flow stats before applying 
> >> min-revalidate-pps
> >> - ofproto-dpif-upcall: Use last known stats ukey stats on revalidate 
> >> missed dp flows.
> >
> > Did a lot of tests with this patch applied, but as mentioned it does 
> > need all four patches mentioned above applied first.
> >
> > Acked-by: Eelco Chaudron 
> 
>  Now that all the dependent patches have been applied and backported to 
>  2.17, can we have another look at this patch?
> >>>
> >>> Not sure what you mean here, Eelco. You mean, just a patch review or
> >>> you mean something else, like evaluating if the patch is still needed
> >>> at all?
> >>
> >> Guess I was not clear enough :) what I meant was for the maintainers to
> >> have another look at this patch and if no further objections apply it.
> > 
> > There seems to now be minor fuzz when applying this patch (locally).
> > Perhaps a rebase is in order?
> 
> Yeah, the coverage counter got introduced in a close proximity.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] util: fix an issue that thread name cannot be set

2023-03-31 Thread Eelco Chaudron



On 30 Mar 2023, at 4:17, Songtao Zhan wrote:

> To: d...@openvswitch.org,
> i.maxim...@ovn.org
>
> The name of the current thread consists of a name with a maximum
> length of 16 bytes and a thread ID. The final name may be longer
> than 16 bytes. If the name is longer than 16 bytes, the thread
> name will fail to be set

Thanks for the patch, do you have examples of when this happens? Maybe we 
should also change the thread naming to avoid this?

See one additional item below.

> Signed-off-by: Songtao Zhan 
> ---
>  lib/util.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/lib/util.c b/lib/util.c
> index 96a71550d..023829694 100644
> --- a/lib/util.c
> +++ b/lib/util.c
> @@ -645,6 +645,10 @@ set_subprogram_name(const char *subprogram_name)
>  free(subprogram_name_set(pname));
>
>  #if HAVE_GLIBC_PTHREAD_SETNAME_NP
> +/* The maximum thead name including '\0' supported by the system is 16 */
> +if (strlen(pname) > 15) {
> +pname[15] = '\0' ;
> +}

Not sure what is better, but would it make sense to use the last upper 
characters? This way if we do truncate we know the internal thread id, as 
naming in OVS, in general, is “xasprintf("%s%u", aux.name, id)”.
If we do this we could add some indication is was truncated, like 
“LONGTHREAD123456” would become “>NGTHREAD123456”.

>  pthread_setname_np(pthread_self(), pname);
>  #elif HAVE_NETBSD_PTHREAD_SETNAME_NP
>  pthread_setname_np(pthread_self(), "%s", pname);
> -- 
> 2.31.1
>
>
>
> zhan...@chinatelecom.cn
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH] net: openvswitch: fix race on port output

2023-03-31 Thread Felix Huettner via dev

assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

when following the actions below the host has a chance of getting a cpu
stuck in a infinite loop:
1. send a large amount of parallel requests to the http server (around
   3000 curls should work)
2. in parallel delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

there is a low chance that this will cause the below kernel cpu stuck
message. If this does not happen just retry.
Below there is also the output of bpftrace for the functions mentioned
in the output.

The series of events happening here is:
1. the network namespace is deleted calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packages to be sent to the device
6. `dp_device_event` then queues the vport deletion to work in
   background as a ovs_lock is needed that we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport

If after 7. but before 9. a packet is send to the ovs vport (which is
not deleted at this point in time) which forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit`) path there is
a while loop (if the packet has a rx_queue recorded) that is infinite if
`dev->real_num_tx_queues` is zero.

To prevent this from happening we update `do_output` to handle not
registered (so e.g. unregistering) devices the same as if the device is
not found (which would be the code path after 9. is done).

Additionally we introduce a `BUG_ON` in `skb_tx_hash` to rather crash
then produce this infinite loop that can not be exited anyway.

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, 
skb_addr: 0x9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0x9f0a46d4a000 real_num_tx_queues: 1, 
cpu: 2, pid: 28024, tid: 28024, skb_addr: 0x9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 
21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, 
tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 
21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, 
skb_addr: 0x9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, 
skb_addr: 0x9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0x9f0a46d4a000 real_num_tx_queues: 0, 
cpu: 2, pid: 28024, tid: 28024, skb_addr: 0x9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, 
reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid:

[ovs-dev] [PATCH v2] util: fix an issue that thread name cannot be set

2023-03-31 Thread Songtao Zhan

To: d...@openvswitch.org,
i.maxim...@ovn.org

The name of the current thread consists of a name with a maximum
length of 16 bytes and a thread ID. The final name may be longer
than 16 bytes. If the name is longer than 16 bytes, the thread
name will fail to be set

Signed-off-by: Songtao Zhan 
---

Notes:
v2:
 - modify code formatting and comments

 lib/util.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/lib/util.c b/lib/util.c
index 96a71550d..b0eb9f343 100644
--- a/lib/util.c
+++ b/lib/util.c
@@ -645,6 +645,10 @@ set_subprogram_name(const char *subprogram_name)
 free(subprogram_name_set(pname));

 #if HAVE_GLIBC_PTHREAD_SETNAME_NP
+/* The maximum thead name including '\0' supported is 16. */
+if (strlen(pname) > 15) {
+pname[15] = '\0';
+}
 pthread_setname_np(pthread_self(), pname);
 #elif HAVE_NETBSD_PTHREAD_SETNAME_NP
 pthread_setname_np(pthread_self(), "%s", pname);
-- 
2.31.1




zhan...@chinatelecom.cn
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] controller: Clear tunnels from old integration bridge

2023-03-31 Thread Ales Musil

On Fri, Mar 31, 2023 at 10:00 AM Dumitru Ceara  wrote:

> On 3/31/23 09:32, Ales Musil wrote:
> >> When two ovn-controllers are running on the same host, won't this
> >> remove tunnels that belong to the other ovn-controller (which are part
> >> of the other controller' bridge)? I think you should only remove
> >> tunnels that match the names as constructed by tunnel_create_name()
> >> [the names should include the so-called unique "chassis index"].
> >>
> > Yeah you are right, I've missed that, even though this would happen only
> > on a fresh start of the controller, but it still is wrong. I'll fix that
> in
> > v2.
> >
>
> Would it be possible to add a test to ensure we don't break this in the
> future?
>

Of course.


>
> Thanks,
> Dumitru
>
>

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.comIM: amusil

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] controller: Clear tunnels from old integration bridge

2023-03-31 Thread Dumitru Ceara

On 3/31/23 09:32, Ales Musil wrote:
>> When two ovn-controllers are running on the same host, won't this
>> remove tunnels that belong to the other ovn-controller (which are part
>> of the other controller' bridge)? I think you should only remove
>> tunnels that match the names as constructed by tunnel_create_name()
>> [the names should include the so-called unique "chassis index"].
>>
> Yeah you are right, I've missed that, even though this would happen only
> on a fresh start of the controller, but it still is wrong. I'll fix that in
> v2.
> 

Would it be possible to add a test to ensure we don't break this in the
future?

Thanks,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] controller: Clear tunnels from old integration bridge

2023-03-31 Thread Ales Musil

On Thu, Mar 30, 2023 at 10:47 PM Ihar Hrachyshka 
wrote:

> On Thu, Mar 30, 2023 at 3:53 AM Ales Musil  wrote:
> >
> > After integration bridge change the tunnels would
> > stay on the old bridge preventing new tunnels creation
> > and disrupting traffic. Detect the bridge change
> > and clear the tunnels from the old integration bridge.
> >
> > Reported-at: https://bugzilla.redhat.com/2173635
> > Signed-off-by: Ales Musil 
> > ---
> >  controller/encaps.c | 36 +++-
> >  controller/encaps.h |  4 +-
> >  controller/ovn-controller.c | 26 +++-
> >  tests/ovn.at| 83 +
> >  4 files changed, 145 insertions(+), 4 deletions(-)
> >
> > diff --git a/controller/encaps.c b/controller/encaps.c
> > index 2662eaf98..c66743d3b 100644
> > --- a/controller/encaps.c
> > +++ b/controller/encaps.c
> > @@ -386,6 +386,20 @@ chassis_tzones_overlap(const struct sset
> *transport_zones,
> >  return false;
> >  }
> >
> > +static void
> > +clear_old_tunnels(const struct ovsrec_bridge *old_br_int)
> > +{
> > +for (size_t i = 0; i < old_br_int->n_ports; i++) {
> > +const struct ovsrec_port *port = old_br_int->ports[i];
> > +const char *id = smap_get(&port->external_ids,
> "ovn-chassis-id");
> > +if (id) {
> > +VLOG_DBG("Clearing old tunnel port \"%s\" (%s) from bridge "
> > + "\"%s\".", port->name, id, old_br_int->name);
> > +ovsrec_bridge_update_ports_delvalue(old_br_int, port);
> > +}
> > +}
> > +}
> > +
> >  void
> >  encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
> > const struct ovsrec_bridge *br_int,
> > @@ -393,12 +407,32 @@ encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
> > const struct sbrec_chassis *this_chassis,
> > const struct sbrec_sb_global *sbg,
> > const struct ovsrec_open_vswitch_table *ovs_table,
> > -   const struct sset *transport_zones)
> > +   const struct sset *transport_zones,
> > +   const struct ovsrec_bridge_table *bridge_table,
> > +   const char *br_int_name)
> >  {
> >  if (!ovs_idl_txn || !br_int) {
> >  return;
> >  }
> >
> > +if (!br_int_name) {
> > +/* The controller has just started, we need to look through all
> > + * bridges for old tunnel ports. */
> > +const struct ovsrec_bridge *br;
> > +OVSREC_BRIDGE_TABLE_FOR_EACH (br, bridge_table) {
> > +if (!strcmp(br->name, br_int->name)) {
> > +continue;
> > +}
> > +clear_old_tunnels(br);
> > +}
> > +} else if (strcmp(br_int_name, br_int->name)) {
> > +/* The integration bridge was changed, clear tunnel ports from
> > + * the old one. */
> > +const struct ovsrec_bridge *old_br_int =
> > +get_bridge(bridge_table, br_int_name);
> > +clear_old_tunnels(old_br_int);
>
> When two ovn-controllers are running on the same host, won't this
> remove tunnels that belong to the other ovn-controller (which are part
> of the other controller' bridge)? I think you should only remove
> tunnels that match the names as constructed by tunnel_create_name()
> [the names should include the so-called unique "chassis index"].
>

Yeah you are right, I've missed that, even though this would happen only
on a fresh start of the controller, but it still is wrong. I'll fix that in
v2.

Thanks,
Ales

>
>
> > +}
> > +
> >  const struct sbrec_chassis *chassis_rec;
> >
> >  struct tunnel_ctx tc = {
> > diff --git a/controller/encaps.h b/controller/encaps.h
> > index 867c6f28c..cf38dac1a 100644
> > --- a/controller/encaps.h
> > +++ b/controller/encaps.h
> > @@ -35,7 +35,9 @@ void encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
> >  const struct sbrec_chassis *,
> >  const struct sbrec_sb_global *,
> >  const struct ovsrec_open_vswitch_table *,
> > -const struct sset *transport_zones);
> > +const struct sset *transport_zones,
> > +const struct ovsrec_bridge_table *bridge_table,
> > +const char *br_int_name);
> >
> >  bool encaps_cleanup(struct ovsdb_idl_txn *ovs_idl_txn,
> >  const struct ovsrec_bridge *br_int);
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index 76be2426e..242d93823 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -536,6 +536,21 @@ process_br_int(struct ovsdb_idl_txn *ovs_idl_txn,
> >  *br_int_ = br_int;
> >  }
> >
> > +static void
> > +consider_br_int_change(const struct ovsrec_bridge *br_int, char
> **current_name)
> > +{
> > +ovs_assert(current_name);
> > +
> > +if (!*current_name) {
> > +*current_name = xstrdup(br_int->name);
> > +}
> > +
> > +if (strcmp(*current_name, br_int->name)) {
> > +free(*current_name);
> > +

56 matches

Mail list logo