[ovs-dev] [PATCH v3] hash, jhash: Fix unaligned access to the hash remainder.

2024-05-02 Thread Ales Musil
Partially revert db5a101931c5, this was to avoid warning, however we
shouldn't use pointer to "uint32_t" when the data are potentially
unaligned [0]. Use pointer to "uint8_t" right from the start, this
requires us to use ALIGNED_CAST for the get_unaligned_u32, which is
fine in that case, because the function uses
" __attribute__((__packed__))" struct to access the underlying "uint32_t".

lib/hash.c:46:22: runtime error: load of misaligned address
0x50700065 for type 'const uint32_t *' (aka 'const unsigned int *'),
which requires 4 byte alignment
0x50700065: note: pointer points here
 73 62 2e 73 6f 63 6b  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 ^
  00 00 00 00 00 00 00 00  00
0 0x6191cb in hash_bytes ovs/lib/hash.c:46:9
1 0x69d064 in hash_string ovs/lib/hash.h:404:12
2 0x69d064 in hash_name ovs/lib/shash.c:29:12
3 0x69d064 in shash_find ovs/lib/shash.c:237:49
4 0x69dada in shash_find_data ovs/lib/shash.c:251:31
5 0x507987 in add_remote ovs/ovsdb/ovsdb-server.c:1382:15
6 0x507987 in parse_options ovs/ovsdb/ovsdb-server.c:2659:13
7 0x507987 in main ovs/ovsdb/ovsdb-server.c:751:5

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/hash.c:46:22

[0] https://github.com/llvm/llvm-project/issues/90848
Fixes: db5a101931c5 ("clang: Fix the alignment warning.")
Signed-off-by: Ales Musil 
---
v3: Do partial revert of db5a101931c5 instead of simple cast.
---
 lib/hash.c  |  7 ---
 lib/jhash.c | 10 +-
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/lib/hash.c b/lib/hash.c
index c722f3c3c..3d574de9b 100644
--- a/lib/hash.c
+++ b/lib/hash.c
@@ -29,15 +29,16 @@ hash_3words(uint32_t a, uint32_t b, uint32_t c)
 uint32_t
 hash_bytes(const void *p_, size_t n, uint32_t basis)
 {
-const uint32_t *p = p_;
+const uint8_t *p = p_;
 size_t orig_n = n;
 uint32_t hash;
 
 hash = basis;
 while (n >= 4) {
-hash = hash_add(hash, get_unaligned_u32(p));
+hash = hash_add(hash,
+get_unaligned_u32(ALIGNED_CAST(const uint32_t *, p)));
 n -= 4;
-p += 1;
+p += 4;
 }
 
 if (n) {
diff --git a/lib/jhash.c b/lib/jhash.c
index c59b51b61..a8e3f457b 100644
--- a/lib/jhash.c
+++ b/lib/jhash.c
@@ -96,18 +96,18 @@ jhash_words(const uint32_t *p, size_t n, uint32_t basis)
 uint32_t
 jhash_bytes(const void *p_, size_t n, uint32_t basis)
 {
-const uint32_t *p = p_;
+const uint8_t *p = p_;
 uint32_t a, b, c;
 
 a = b = c = 0xdeadbeef + n + basis;
 
 while (n >= 12) {
-a += get_unaligned_u32(p);
-b += get_unaligned_u32(p + 1);
-c += get_unaligned_u32(p + 2);
+a += get_unaligned_u32(ALIGNED_CAST(const uint32_t *, p));
+b += get_unaligned_u32(ALIGNED_CAST(const uint32_t *, p + 4));
+c += get_unaligned_u32(ALIGNED_CAST(const uint32_t *, p + 8));
 jhash_mix(, , );
 n -= 12;
-p += 3;
+p += 12;
 }
 
 if (n) {
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] docs: List supported rolling upgrade paths.

2024-05-02 Thread Ales Musil
On Fri, Apr 26, 2024 at 10:49 PM Ihar Hrachyshka 
wrote:

> The wording above is not completely clear without these scenarios
> listed. A confused reader may incorrectly read it as:
>
> ```
> Only LTS-to-LTS is supported for rolling upgrades.
> ```
>
> which is wrong.
>
> Signed-off-by: Ihar Hrachyshka 
> ---
>  Documentation/intro/install/ovn-upgrades.rst | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/Documentation/intro/install/ovn-upgrades.rst
> b/Documentation/intro/install/ovn-upgrades.rst
> index 1f99a86ec..f3dea07dc 100644
> --- a/Documentation/intro/install/ovn-upgrades.rst
> +++ b/Documentation/intro/install/ovn-upgrades.rst
> @@ -74,6 +74,11 @@ To avoid buildup of complexity and technical debt we
> limit the span of versions
>  supported for a rolling upgrade on :ref:`long-term-support` (LTS), and it
>  should always be possible to upgrade from the previous LTS version to the
> next.
>
> +The following rolling upgrade paths are supported:
> +
> +1. LTS to the very next LTS release, or to any non-LTS in between the two.
> +2. Any non-LTS to the very next LTS release.
> +
>  The first LTS version of OVN was 22.03.  If you want to upgrade between
> other
>  versions, you can use the `Fail-safe upgrade`_ procedure.
>
> --
> 2.41.0
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Looks good to me, thanks.

Acked-by: Ales Musil 

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.com

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] docs: Explain nature of ovs dependency.

2024-05-02 Thread Ales Musil
On Fri, Apr 26, 2024 at 10:35 PM Ihar Hrachyshka 
wrote:

> The dependency is during build time, not runtime.
>
> Signed-off-by: Ihar Hrachyshka 
> ---
>  Documentation/intro/install/ovn-upgrades.rst | 18 --
>  1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/intro/install/ovn-upgrades.rst
> b/Documentation/intro/install/ovn-upgrades.rst
> index bb387e2f8..1f99a86ec 100644
> --- a/Documentation/intro/install/ovn-upgrades.rst
> +++ b/Documentation/intro/install/ovn-upgrades.rst
> @@ -40,13 +40,19 @@ Release Notes
>  You should always check the OVS and OVN release notes (NEWS file) for any
>  release specific notes on upgrades.
>
> -OVS
> 
> +Open vSwitch
> +
>
> -OVN depends on and is included with OVS.  It's expected that OVS and OVN
> are
> -upgraded together, partly for convenience.  OVN is included in OVS
> releases
> -so it's easiest to upgrade them together.  OVN may also make use of new
> -features of OVS only available in that release.
> +OVN compiles with a particular version of Open vSwitch.  This is a
> build-time
> +dependency.
> +
> +In runtime, OVN should be able to work with any reasonably fresh version
> of
> +Open vSwitch (not necessarily the version that it was compiled against.)
> +
> +OVN may make use of new runtime features of Open vSwitch that are only
> +available in a particular release. OVN is expected to test for an Open
> vSwitch
> +feature presence before using it, and gracefully handle scenarios where
> Open
> +vSwitch doesn't support a particular optional feature, yet.
>
>  Upgrade procedures
>  --
> --
> 2.41.0
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Looks good to me, thanks.

Acked-by: Ales Musil 

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.com

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] controller: Track individual address set constants.

2024-05-02 Thread Ales Musil
On Thu, May 2, 2024 at 6:23 PM Han Zhou  wrote:
>
>
>
> On Thu, May 2, 2024 at 6:29 AM Ales Musil  wrote:
> >
> > Instead of tracking address set per struct expr_constant_set track it
> > per individual struct expr_constant. This allows more fine grained
> > control for I-P processing of address sets in controller. It helps with
> > scenarios like matching on two address sets in one expression e.g.
> > "ip4.src == {$as1, $as2}". This allows any addition or removal of
> > individual adress from the set to be incrementally processed instead
> > of reprocessing all the flows.
> >
> > This unfortunately doesn't help with the following flows:
> > "ip4.src == $as1 && ip4.dst == $as2"
> > "ip4.src == $as1 || ip4.dst == $as2"
> >
> > The memory impact should be minimal as there is only increase of 8 bytes
> > per the struct expr_constant.
> >
> > Reported-at: https://issues.redhat.com/browse/FDP-509
> > Signed-off-by: Ales Musil 
> > ---
> > v4: Rebase on top of current main.
> > Update the "lflow_handle_addr_set_update" comment according to 
> > suggestion from Han.
>
> Thanks Ales. I updated the commit message for the same, and applied to main 
> branch.
>
> Regards,
> Han
>
> > v3: Rebase on top of current main.
> > Address comments from Han:
> > - Adjust the comment for "lflow_handle_addr_set_update" to include 
> > remaning corner cases.
> > - Make sure that the flows are consistent between I-P and recompute.
> > v2: Rebase on top of current main.
> > Adjust the comment for I-P optimization.
> > ---
> >  controller/lflow.c  | 11 ++---
> >  include/ovn/actions.h   |  2 +-
> >  include/ovn/expr.h  | 46 ++-
> >  lib/actions.c   | 20 -
> >  lib/expr.c  | 99 +
> >  tests/ovn-controller.at | 79 +---
> >  6 files changed, 154 insertions(+), 103 deletions(-)
> >
> > diff --git a/controller/lflow.c b/controller/lflow.c
> > index 760ec0b41..1e05665a1 100644
> > --- a/controller/lflow.c
> > +++ b/controller/lflow.c
> > @@ -278,7 +278,7 @@ lflow_handle_changed_flows(struct lflow_ctx_in 
> > *l_ctx_in,
> >  }
> >
> >  static bool
> > -as_info_from_expr_const(const char *as_name, const union expr_constant *c,
> > +as_info_from_expr_const(const char *as_name, const struct expr_constant *c,
> >  struct addrset_info *as_info)
> >  {
> >  as_info->name = as_name;
> > @@ -644,14 +644,11 @@ as_update_can_be_handled(const char *as_name, struct 
> > addr_set_diff *as_diff,
> >   *generated.
> >   *
> >   *  - The sub expression of the address set is combined with other sub-
> > - *expressions/constants, usually because of disjunctions between
> > - *sub-expressions/constants, e.g.:
> > + *expressions/constants on different fields, e.g.:
> >   *
> >   *  ip.src == $as1 || ip.dst == $as2
> > - *  ip.src == {$as1, $as2}
> > - *  ip.src == {$as1, ip1}
> >   *
> > - *All these could have been split into separate lflows.
> > + *This could have been split into separate lflows.
> >   *
> >   *  - Conjunctions overlapping between lflows, which can be caused by
> >   *overlapping address sets or same address set used by multiple 
> > lflows
> > @@ -714,7 +711,7 @@ lflow_handle_addr_set_update(const char *as_name,
> >  if (as_diff->deleted) {
> >  struct addrset_info as_info;
> >  for (size_t i = 0; i < as_diff->deleted->n_values; i++) {
> > -union expr_constant *c = _diff->deleted->values[i];
> > +struct expr_constant *c = _diff->deleted->values[i];
> >  if (!as_info_from_expr_const(as_name, c, _info)) {
> >  continue;
> >  }
> > diff --git a/include/ovn/actions.h b/include/ovn/actions.h
> > index ae0864fdd..88cf4de79 100644
> > --- a/include/ovn/actions.h
> > +++ b/include/ovn/actions.h
> > @@ -241,7 +241,7 @@ struct ovnact_next {
> >  struct ovnact_load {
> >  struct ovnact ovnact;
> >  struct expr_field dst;
> > -union expr_constant imm;
> > +struct expr_constant imm;
> >  };
> >
> >  /* OVNACT_MOVE, OVNACT_EXCHANGE. */
> > diff --git a/include/ovn/expr.h b/include/ovn/expr.h
> > index c48f82398..e54edb5bf 100644
> > --- a/include/ovn/expr.h
> > +++ b/include/ovn/expr.h
> > @@ -368,7 +368,7 @@ bool expr_relop_from_token(enum lex_type type, enum 
> > expr_relop *relop);
> >  struct expr {
> >  struct ovs_list node;   /* In parent EXPR_T_AND or EXPR_T_OR if 
> > any. */
> >  enum expr_type type;/* Expression type. */
> > -char *as_name;  /* Address set name. Null if it is not an
> > +const char *as_name;/* Address set name. Null if it is not an
> > address set. */
> >
> >  union {
> > @@ -505,40 +505,42 @@ enum expr_constant_type {
> >  };
> >
> >  /* A string or integer 

Re: [ovs-dev] [PATCH v2] hash, jhash: Fix unaligned access to the hash remainder.

2024-05-02 Thread Ales Musil
On Thu, May 2, 2024 at 8:03 PM Ilya Maximets  wrote:

> On 5/2/24 15:28, Ales Musil wrote:
> > The pointer was passed to memcpy as uin32_t *, however the hash bytes
> > might be unaligned at that point. Case it to uint8_t * instead
>
> 'Case' ?
>
> > which has only single byte alignment requirement. This seems to be
> > a false positive reported by clang [0].
>
> After thinking some more, it's not actually a false positive per se.
> According to the C spec we're not actually allowed to have misaligned
> pointers even if we're not reading/writing through them.
>
> So, technically, the initial cast to uint32_t pointer is no correct.
> I don't think we can fully avoid such casts without loosing type checking,
> but I think we need to revert changes to hash functions made in
> commit db5a101931c5 ("clang: Fix the alignment warning.").
> i.e. we should go back to using uint8_t pointer and cast it on the
> get_unaligned_u32() call with ALIGNED_CAST.  We will still have a
> misaligned pointer, but it will be immediately cast back, so should
> cause less issues.
>
> Note: all arithmetic should be done on the uint8_t pointer, not a
> misaligned uin32_t one to avoid potential other UB conditions.
>
> Best regards, Ilya Maximets.
>

Makes sense, done in v3.


>
> >
> > lib/hash.c:46:22: runtime error: load of misaligned address
> > 0x50700065 for type 'const uint32_t *' (aka 'const unsigned int *'),
> > which requires 4 byte alignment
> > 0x50700065: note: pointer points here
> >  73 62 2e 73 6f 63 6b  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
> >  ^
> >   00 00 00 00 00 00 00 00  00
> > 0 0x6191cb in hash_bytes ovs/lib/hash.c:46:9
> > 1 0x69d064 in hash_string ovs/lib/hash.h:404:12
> > 2 0x69d064 in hash_name ovs/lib/shash.c:29:12
> > 3 0x69d064 in shash_find ovs/lib/shash.c:237:49
> > 4 0x69dada in shash_find_data ovs/lib/shash.c:251:31
> > 5 0x507987 in add_remote ovs/ovsdb/ovsdb-server.c:1382:15
> > 6 0x507987 in parse_options ovs/ovsdb/ovsdb-server.c:2659:13
> > 7 0x507987 in main ovs/ovsdb/ovsdb-server.c:751:5
> >
> > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/hash.c:46:22
> >
> > [0] https://github.com/llvm/llvm-project/issues/90848
> > Signed-off-by: Ales Musil 
> > ---
> >  lib/hash.c  | 2 +-
> >  lib/jhash.c | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/hash.c b/lib/hash.c
> > index c722f3c3c..986fa6643 100644
> > --- a/lib/hash.c
> > +++ b/lib/hash.c
> > @@ -43,7 +43,7 @@ hash_bytes(const void *p_, size_t n, uint32_t basis)
> >  if (n) {
> >  uint32_t tmp = 0;
> >
> > -memcpy(, p, n);
> > +memcpy(, (const uint8_t *) p, n);
> >  hash = hash_add(hash, tmp);
> >  }
> >
> > diff --git a/lib/jhash.c b/lib/jhash.c
> > index c59b51b61..0a0628589 100644
> > --- a/lib/jhash.c
> > +++ b/lib/jhash.c
> > @@ -114,7 +114,7 @@ jhash_bytes(const void *p_, size_t n, uint32_t basis)
> >  uint32_t tmp[3];
> >
> >  tmp[0] = tmp[1] = tmp[2] = 0;
> > -memcpy(tmp, p, n);
> > +memcpy(tmp, (const uint8_t *) p, n);
> >  a += tmp[0];
> >  b += tmp[1];
> >  c += tmp[2];
>
>
Thanks,
Ales
-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.com

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] socket: Don't fail when AF_UNIX connect() returns EAGAIN.

2024-05-02 Thread Ilya Maximets
On 4/27/24 00:42, Ihar Hrachyshka wrote:
> POSIX defines EINPROGRESS as the return value for non-blocking connect()
> [1]. But in Linux, AF_UNIX connect() returns EAGAIN instead of
> EINPROGRESS. (but only for AF_UNIX sockets!) [2]
> 
> Both cases should be handled the same way - by returning the `fd` and
> letting the caller to complete connection asynchronously.
> 
> [1]: 
> https://pubs.opengroup.org/onlinepubs/9699919799.2016edition/functions/connect.html
> [2]: see `connect(2)` on Linux for details
> 
> Signed-off-by: Ihar Hrachyshka 
> ---

Hi, Ihar.  Thanks for the patch!

I was reading the original commit that introduced the error mangling:

commit 21a60ea97b492ae642d5b35430b24d137cd67f35
Author: Ben Pfaff 
Date:   Fri Dec 11 16:59:44 2009 -0800

socket-util: Clarify EAGAIN error code for make_unix_socket().

make_unix_socket() can return EAGAIN in rare circumstances, e.g. when the
server's socket listen queue is full.  A lot of OVS callers interpret
EAGAIN as a "try again" error code, but in this case it means that the
attempt to create the socket failed.  So munge EAGAIN into another error
code to prevent that misinterpretation.

IIUC, at the time there were more direct users for this function
that would not re-try the connect() and will remain in the broken
state.  Most of them were migrated to the stream- and reconnect-
wrappers, so they will actually re-connect normally now.  That's
should be fine.

There is still one direct user in syslog-direct.c, but it is using
a blocking connection, so I guess it can't fall into this condition.

So, the change is fine, but I think more information addressing the
reasoning from the original commit should be added to the commit
message.


And this patch is missing changes for the equivalent python code in
python/ovs/socket_util.py.  We should keep them in sync.

>  lib/socket-util-unix.c  |  8 +--
>  lib/socket-util.c   |  2 +-
>  lib/socket-util.h   |  2 +
>  tests/.gitignore|  1 +
>  tests/automake.mk   |  3 +-
>  tests/library.at|  5 ++
>  tests/test-unix-socket-listen-backlog.c | 73 +
>  7 files changed, 88 insertions(+), 6 deletions(-)
>  create mode 100644 tests/test-unix-socket-listen-backlog.c
> 
> diff --git a/lib/socket-util-unix.c b/lib/socket-util-unix.c
> index 59f63fcce..1543aa2e2 100644
> --- a/lib/socket-util-unix.c
> +++ b/lib/socket-util-unix.c
> @@ -363,7 +363,10 @@ make_unix_socket(int style, bool nonblock,
>  error = make_sockaddr_un(connect_path, , _len, , 
> linkname);
>  if (!error
>  && connect(fd, (struct sockaddr*) , un_len)
> -&& errno != EINPROGRESS) {
> +/* POSIX connect() returns EINPROGRESS for all non-blocking
> + * sockets. Linux connect() returns either EAGAIN - for AF_UNIX
> + * sockets - or EINPROGRESS - for other families. Handle both. */

2 spaces between sentences.

> +&& errno != EINPROGRESS && errno != EAGAIN) {
>  error = errno;
>  }
>  free_sockaddr_un(dirfd, linkname);
> @@ -376,9 +379,6 @@ make_unix_socket(int style, bool nonblock,
>  return fd;
>  
>  error:
> -if (error == EAGAIN) {
> -error = EPROTO;
> -}
>  if (bind_path) {
>  fatal_signal_unlink_file_now(bind_path);
>  }
> diff --git a/lib/socket-util.c b/lib/socket-util.c
> index 2d89fce85..bc9628b38 100644
> --- a/lib/socket-util.c
> +++ b/lib/socket-util.c
> @@ -760,7 +760,7 @@ inet_open_passive(int style, const char *target, int 
> default_port,
>  }
>  
>  /* Listen. */
> -if (style == SOCK_STREAM && listen(fd, 64) < 0) {
> +if (style == SOCK_STREAM && listen(fd, LISTEN_BACKLOG) < 0) {
>  error = sock_errno();
>  VLOG_ERR("%s: listen: %s", target, sock_strerror(error));
>  goto error;
> diff --git a/lib/socket-util.h b/lib/socket-util.h
> index 4eec627e3..03219633d 100644
> --- a/lib/socket-util.h
> +++ b/lib/socket-util.h
> @@ -28,6 +28,8 @@
>  #include 
>  #include 
>  
> +#define LISTEN_BACKLOG 64

Should this also be used in stream-unix.c ?

might also be better to define the same constant for python counterparts.

> +
>  struct ds;
>  
>  int set_nonblocking(int fd);
> diff --git a/tests/.gitignore b/tests/.gitignore
> index 3a8c45975..c32b2c7cc 100644
> --- a/tests/.gitignore
> +++ b/tests/.gitignore
> @@ -66,6 +66,7 @@
>  /test-timeval
>  /test-type-props
>  /test-unix-socket
> +/test-unix-socket-listen-backlog
>  /test-util
>  /test-uuid
>  /test-uuidset
> diff --git a/tests/automake.mk b/tests/automake.mk
> index 04f48f2d8..fc74cfa7d 100644
> --- a/tests/automake.mk
> +++ b/tests/automake.mk
> @@ -493,7 +493,8 @@ tests_ovstest_SOURCES = \
>  
>  if !WIN32
>  tests_ovstest_SOURCES += \
> - tests/test-unix-socket.c
> + tests/test-unix-socket.c \
> + 

[ovs-dev] [PATCH] ofproto-dpif-trace: Fix access to an out-of-scope stack memory.

2024-05-02 Thread Ilya Maximets
While tracing NAT actions, pointer to the action may be stored in the
recirculation node for future reference.  However, while translating
actions for the group bucket in xlate_group_bucket, the action list is
allocated temporarily on stack.  So, in case the group translation
leads to NAT, the stack pointer can be stored in the recirculation node
and accessed later by the tracing mechanism when this stack memory is
long gone:

 ==396230==ERROR: AddressSanitizer: stack-use-after-return on address
 0x191844 at pc 0x64222a bp 0xa5da10 sp 0xa5da08
 READ of size 1 at 0x191844 thread T0
  0 0x642229 in ofproto_trace_recirc_node ofproto/ofproto-dpif-trace.c:704:49
  1 0x642229 in ofproto_trace ofproto/ofproto-dpif-trace.c:867:9
  2 0x6434c1 in ofproto_unixctl_trace ofproto/ofproto-dpif-trace.c:489:9
  3 0xc1e491 in process_command lib/unixctl.c:310:13
  4 0xc1e491 in run_connection lib/unixctl.c:344:17
  5 0xc1e491 in unixctl_server_run lib/unixctl.c:395:21
  6 0x53eedf in main ovs/vswitchd/ovs-vswitchd.c:131:9
  7 0x2be087 in __libc_start_call_main
  8 0x2be14a in __libc_start_main@GLIBC_2.2.5
  9 0x42dee4 in _start (vswitchd/ovs-vswitchd+0x42dee4)

 Address 0x191844 is located in stack of thread T0 at offset 68 in frame
  0 0x6d391f in xlate_group_bucket ofproto/ofproto-dpif-xlate.c:4751

  This frame has 3 object(s):
[32, 1056) 'action_list_stub' (line 4760) <== Memory access at
  offset 68 is inside
  this variable
[1184, 1248) 'action_list' (line 4761)
[1280, 1344) 'action_set' (line 4762)

 SUMMARY: AddressSanitizer: stack-use-after-return
   ofproto/ofproto-dpif-trace.c:704:49 in ofproto_trace_recirc_node

Fix that by copying the action.

Fixes: d072d2de011b ("ofproto-dpif-trace: Improve NAT tracing.")
Reported-by: Ales Musil 
Signed-off-by: Ilya Maximets 
---
 ofproto/ofproto-dpif-trace.c |  3 ++-
 ofproto/ofproto-dpif-trace.h |  2 +-
 tests/ofproto-dpif.at| 22 ++
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/ofproto/ofproto-dpif-trace.c b/ofproto/ofproto-dpif-trace.c
index 87506aa78..e43d9f88c 100644
--- a/ofproto/ofproto-dpif-trace.c
+++ b/ofproto/ofproto-dpif-trace.c
@@ -102,7 +102,7 @@ oftrace_add_recirc_node(struct ovs_list *recirc_queue,
 node->flow = *flow;
 node->flow.recirc_id = recirc_id;
 node->flow.ct_zone = zone;
-node->nat_act = ofn;
+node->nat_act = ofn ? xmemdup(ofn, sizeof *ofn) : NULL;
 node->packet = packet ? dp_packet_clone(packet) : NULL;
 
 return true;
@@ -113,6 +113,7 @@ oftrace_recirc_node_destroy(struct oftrace_recirc_node 
*node)
 {
 if (node) {
 recirc_free_id(node->recirc_id);
+free(node->nat_act);
 dp_packet_delete(node->packet);
 free(node);
 }
diff --git a/ofproto/ofproto-dpif-trace.h b/ofproto/ofproto-dpif-trace.h
index f579a5ca4..f023b10cd 100644
--- a/ofproto/ofproto-dpif-trace.h
+++ b/ofproto/ofproto-dpif-trace.h
@@ -73,7 +73,7 @@ struct oftrace_recirc_node {
 uint32_t recirc_id;
 struct flow flow;
 struct dp_packet *packet;
-const struct ofpact_nat *nat_act;
+struct ofpact_nat *nat_act;
 };
 
 /* A node within a next_ct_states list. */
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index 3eaccb13a..0b23fd6c5 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -947,6 +947,28 @@ AT_CHECK([tail -1 stdout], [0],
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([ofproto-dpif - group with ct and dnat recirculation in action list])
+OVS_VSWITCHD_START
+add_of_ports br0 1 10
+AT_CHECK([ovs-ofctl -O OpenFlow12 add-group br0 \
+'group_id=1234,type=all,bucket=ct(nat(dst=10.10.10.7:80),commit,table=2)'])
+AT_DATA([flows.txt], [dnl
+table=0 ip,ct_state=-trk actions=group:1234
+table=2 ip,ct_state=+trk actions=output:10
+])
+AT_CHECK([ovs-ofctl -O OpenFlow12 add-flows br0 flows.txt])
+AT_CHECK([ovs-appctl ofproto/trace br0 '
+  in_port=1,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,dl_type=0x0800,
+  
nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_proto=1,nw_tos=0,nw_ttl=128,nw_frag=no,
+  icmp_type=8,icmp_code=0
+'], [0], [stdout])
+AT_CHECK([grep 'Datapath actions' stdout], [0], [dnl
+Datapath actions: ct(commit,nat(dst=10.10.10.7:80)),recirc(0x1)
+Datapath actions: 10
+])
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
 AT_SETUP([ofproto-dpif - group actions have no effect afterwards])
 OVS_VSWITCHD_START
 add_of_ports br0 1 10
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] docs: Document manual cluster recovery procedure.

2024-05-02 Thread Ihar Hrachyshka
On Thu, May 2, 2024 at 5:52 PM Ilya Maximets  wrote:

> On 4/26/24 18:54, Ihar Hrachyshka wrote:
> > Remove the notion of cluster/leave --force since it was never
> > implemented. Instead of these instructions, document how a broken
> > cluster can be re-initialized with the old database contents.
> >
> > Signed-off-by: Ihar Hrachyshka 
> >
> > ---
> >
> > v1: initial version.
> > v2: remove --force mentioned in ovsdb-server(1).
> > v3: multiple language and markup changes suggested by Ilya.
>
> Thanks, Ihar!  This version looks good to me in general.
> I have a couple of minor nits below.  If you agree, I can
> fold those in while applying the change.
>

Feel free to. And thanks for your patience.


>
> Let me know what you think.
>
> Best regards, Ilya Maximets.
>
> >
> > ---
> >  Documentation/ref/ovsdb.7.rst | 44 ---
> >  ovsdb/ovsdb-server.1.in   |  3 +--
> >  2 files changed, 37 insertions(+), 10 deletions(-)
> >
> > diff --git a/Documentation/ref/ovsdb.7.rst
> b/Documentation/ref/ovsdb.7.rst
> > index 46ed13e61..5766e64b9 100644
> > --- a/Documentation/ref/ovsdb.7.rst
> > +++ b/Documentation/ref/ovsdb.7.rst
> > @@ -315,16 +315,11 @@ The above methods for adding and removing servers
> only work for healthy
> >  clusters, that is, for clusters with no more failures than their maximum
> >  tolerance.  For example, in a 3-server cluster, the failure of 2 servers
> >  prevents servers joining or leaving the cluster (as well as database
> access).
> > +
> >  To prevent data loss or inconsistency, the preferred solution to this
> problem
> >  is to bring up enough of the failed servers to make the cluster healthy
> again,
> > -then if necessary remove any remaining failed servers and add new
> ones.  If
> > -this cannot be done, though, use ``ovs-appctl`` to invoke
> ``cluster/leave
> > ---force`` on a running server.  This command forces the server to which
> it is
> > -directed to leave its cluster and form a new single-node cluster that
> contains
> > -only itself.  The data in the new cluster may be inconsistent with the
> former
> > -cluster: transactions not yet replicated to the server will be lost, and
> > -transactions not yet applied to the cluster may be committed.
> Afterward, any
> > -servers in its former cluster will regard the server to have failed.
> > +then if necessary remove any remaining failed servers and add new ones.
> If this
>
> Nit:  2 spaces between sentences.
>
> > +is not an option, see the next section for `Manual cluster recovery`_.
> >
> >  Once a server leaves a cluster, it may never rejoin it.  Instead,
> create a new
> >  server and join it to the cluster.
> > @@ -362,6 +357,39 @@ Clustered OVSDB does not support the OVSDB
> "ephemeral columns" feature.
> >  ones when they work with schemas for clustered databases.  Future
> versions of
> >  OVSDB might add support for this feature.
> >
> > +Manual cluster recovery
> > +~~~
> > +
> > +.. important::
>
> Nit: An empty line here would be nice to be consistent at least
>  within this document.
>
> > +   The procedure below will result in ``cid`` and ``sid`` change. A
> *new*
>
> Nit:  2 spaces between sentences.
>
> > +   cluster will be initialized.
> > +
> > +To recover a clustered database after a failure:
> > +
> > +1. Stop *all* old cluster ``ovsdb-server`` instances before proceeding.
> > +
> > +2. Pick one of the old members which will serve as a bootstrap member
> of the
> > +   to-be-recovered cluster.
> > +
> > +3. Convert its database file to the standalone format using ``ovsdb-tool
> > +   cluster-to-standalone``.
> > +
> > +4. Backup the standalone database file.
> > +
> > +5. Create a new single-node cluster with ``ovsdb-tool create-cluster``
> > +   using the previously saved standalone database file, then start
> > +   ``ovsdb-server``.
> > +
> > +Once the single-node cluster is up and running and serves the restored
> data,
> > +new members should be created and join the new cluster, as usual
> (``ovsdb-tool
> > +join-cluster``).
>
> I'm having hard time reading 'new members should be created and join' as
> my brain wants to relate 'should be' to both 'created' and 'join' and
> 'should be join' is not a correct construct.
>
> How about: "new members should be created and added to the cluster, as
> usual,
> with ``ovsdb-tool join-cluster``."  ?
>

Though it doesn't confuse me, I am not a native speaker, and I find your
version at least as good as mine, so feel free to change.


>
> Also, should it be a step 6 ?
>
>
It won't hurt to fold it into the list.


> > +
> > +.. note::
> > +
> > +   The data in the new cluster may be inconsistent with the former
> cluster:
> > +   transactions not yet replicated to the server chosen in step 2 will
> be lost,
> > +   and transactions not yet applied to the cluster may be committed.
> > +
> >  Upgrading from version 2.14 and earlier to 2.15 and later
> >  

Re: [ovs-dev] [PATCH v3] docs: Document manual cluster recovery procedure.

2024-05-02 Thread Ilya Maximets
On 4/26/24 18:54, Ihar Hrachyshka wrote:
> Remove the notion of cluster/leave --force since it was never
> implemented. Instead of these instructions, document how a broken
> cluster can be re-initialized with the old database contents.
> 
> Signed-off-by: Ihar Hrachyshka 
> 
> ---
> 
> v1: initial version.
> v2: remove --force mentioned in ovsdb-server(1).
> v3: multiple language and markup changes suggested by Ilya.

Thanks, Ihar!  This version looks good to me in general.
I have a couple of minor nits below.  If you agree, I can
fold those in while applying the change.

Let me know what you think.

Best regards, Ilya Maximets.

> 
> ---
>  Documentation/ref/ovsdb.7.rst | 44 ---
>  ovsdb/ovsdb-server.1.in   |  3 +--
>  2 files changed, 37 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/ref/ovsdb.7.rst b/Documentation/ref/ovsdb.7.rst
> index 46ed13e61..5766e64b9 100644
> --- a/Documentation/ref/ovsdb.7.rst
> +++ b/Documentation/ref/ovsdb.7.rst
> @@ -315,16 +315,11 @@ The above methods for adding and removing servers only 
> work for healthy
>  clusters, that is, for clusters with no more failures than their maximum
>  tolerance.  For example, in a 3-server cluster, the failure of 2 servers
>  prevents servers joining or leaving the cluster (as well as database access).
> +
>  To prevent data loss or inconsistency, the preferred solution to this problem
>  is to bring up enough of the failed servers to make the cluster healthy 
> again,
> -then if necessary remove any remaining failed servers and add new ones.  If
> -this cannot be done, though, use ``ovs-appctl`` to invoke ``cluster/leave
> ---force`` on a running server.  This command forces the server to which it is
> -directed to leave its cluster and form a new single-node cluster that 
> contains
> -only itself.  The data in the new cluster may be inconsistent with the former
> -cluster: transactions not yet replicated to the server will be lost, and
> -transactions not yet applied to the cluster may be committed.  Afterward, any
> -servers in its former cluster will regard the server to have failed.
> +then if necessary remove any remaining failed servers and add new ones. If 
> this

Nit:  2 spaces between sentences.

> +is not an option, see the next section for `Manual cluster recovery`_.
>  
>  Once a server leaves a cluster, it may never rejoin it.  Instead, create a 
> new
>  server and join it to the cluster.
> @@ -362,6 +357,39 @@ Clustered OVSDB does not support the OVSDB "ephemeral 
> columns" feature.
>  ones when they work with schemas for clustered databases.  Future versions of
>  OVSDB might add support for this feature.
>  
> +Manual cluster recovery
> +~~~
> +
> +.. important::

Nit: An empty line here would be nice to be consistent at least
 within this document.

> +   The procedure below will result in ``cid`` and ``sid`` change. A *new*

Nit:  2 spaces between sentences.

> +   cluster will be initialized.
> +
> +To recover a clustered database after a failure:
> +
> +1. Stop *all* old cluster ``ovsdb-server`` instances before proceeding.
> +
> +2. Pick one of the old members which will serve as a bootstrap member of the
> +   to-be-recovered cluster.
> +
> +3. Convert its database file to the standalone format using ``ovsdb-tool
> +   cluster-to-standalone``.
> +
> +4. Backup the standalone database file.
> +
> +5. Create a new single-node cluster with ``ovsdb-tool create-cluster``
> +   using the previously saved standalone database file, then start
> +   ``ovsdb-server``.
> +
> +Once the single-node cluster is up and running and serves the restored data,
> +new members should be created and join the new cluster, as usual 
> (``ovsdb-tool
> +join-cluster``).

I'm having hard time reading 'new members should be created and join' as
my brain wants to relate 'should be' to both 'created' and 'join' and
'should be join' is not a correct construct.

How about: "new members should be created and added to the cluster, as usual,
with ``ovsdb-tool join-cluster``."  ?

Also, should it be a step 6 ?

> +
> +.. note::
> +
> +   The data in the new cluster may be inconsistent with the former cluster:
> +   transactions not yet replicated to the server chosen in step 2 will be 
> lost,
> +   and transactions not yet applied to the cluster may be committed.
> +
>  Upgrading from version 2.14 and earlier to 2.15 and later
>  ~
>  
> diff --git a/ovsdb/ovsdb-server.1.in b/ovsdb/ovsdb-server.1.in
> index 9fabf2d67..23b8e6e9c 100644
> --- a/ovsdb/ovsdb-server.1.in
> +++ b/ovsdb/ovsdb-server.1.in
> @@ -461,8 +461,7 @@ This does not result in a three server cluster that lacks 
> quorum.
>  .
>  .IP "\fBcluster/kick \fIdb server\fR"
>  Start graceful removal of \fIserver\fR from \fIdb\fR's cluster, like
> -\fBcluster/leave\fR (without \fB\-\-force\fR) except that it can
> -remove any server, not just this one.
> 

Re: [ovs-dev] [PATCH v9 5/6] vswitchd: Add JSON output for 'list-commands' command.

2024-05-02 Thread Ilya Maximets
On 4/12/24 09:26, jm...@redhat.com wrote:
> From: Jakob Meng 
> 
> The 'list-commands' command now supports machine-readable JSON output
> in addition to the plain-text output for humans.
> 
> Reported-at: https://bugzilla.redhat.com/1824861
> Signed-off-by: Jakob Meng 
> ---
>  NEWS  |  1 +
>  lib/unixctl.c | 46 ++-
>  tests/ovs-vswitchd.at | 11 +++
>  3 files changed, 44 insertions(+), 14 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 957351acb..af83623b3 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -13,6 +13,7 @@ Post-v3.3.0
>   * Added new option [--pretty] to print JSON output in a readable 
> fashion.
> E.g. members of objects and elements of arrays are printed one per 
> line,
> with indentation.
> + * 'list-commands' now supports output in JSON format.
> - Python:
>   * Added support for choosing the output format, e.g. 'json' or 'text'.
>   * Added new option [--pretty] to print JSON output in a readable 
> fashion.
> diff --git a/lib/unixctl.c b/lib/unixctl.c
> index 67cf26418..ca54884d0 100644
> --- a/lib/unixctl.c
> +++ b/lib/unixctl.c
> @@ -68,24 +68,42 @@ static void
>  unixctl_list_commands(struct unixctl_conn *conn, int argc OVS_UNUSED,
>const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
>  {
> -struct ds ds = DS_EMPTY_INITIALIZER;
> -const struct shash_node **nodes = shash_sort();
> -size_t i;
> +if (unixctl_command_get_output_format(conn) == OVS_OUTPUT_FMT_JSON) {
> +struct json *json_commands = json_array_create_empty();
> +const struct shash_node *node;
>  
> -ds_put_cstr(, "The available commands are:\n");
> +SHASH_FOR_EACH (node, ) {
> +const struct unixctl_command *command = node->data;
>  
> -for (i = 0; i < shash_count(); i++) {
> -const struct shash_node *node = nodes[i];
> -const struct unixctl_command *command = node->data;
> -
> -if (command->usage) {
> -ds_put_format(, "  %-23s %s\n", node->name, command->usage);
> +if (command->usage) {
> +struct json *json_command = json_object_create();

An empty line.

> +json_object_put_string(json_command, "name", node->name);
> +json_object_put_string(json_command, "usage", 
> command->usage);
> +json_array_add(json_commands, json_command);

Can it be just a { "name": "usage", "name1": "usage1", ... } map
instead of array of objects?

> +}
>  }
> -}
> -free(nodes);
> +unixctl_command_reply_json(conn, json_commands);
> +} else {
> +struct ds ds = DS_EMPTY_INITIALIZER;
> +const struct shash_node **nodes = shash_sort();
> +size_t i;
>  
> -unixctl_command_reply(conn, ds_cstr());
> -ds_destroy();
> +ds_put_cstr(, "The available commands are:\n");
> +
> +for (i = 0; i < shash_count(); ++i) {
> +const struct shash_node *node = nodes[i];
> +const struct unixctl_command *command = node->data;
> +
> +if (command->usage) {
> +ds_put_format(, "  %-23s %s\n", node->name,
> +  command->usage);
> +}
> +}
> +free(nodes);
> +
> +unixctl_command_reply(conn, ds_cstr());
> +ds_destroy();
> +}
>  }
>  
>  static void
> diff --git a/tests/ovs-vswitchd.at b/tests/ovs-vswitchd.at
> index 5683896ef..dcda71d04 100644
> --- a/tests/ovs-vswitchd.at
> +++ b/tests/ovs-vswitchd.at
> @@ -281,3 +281,14 @@ AT_CHECK_UNQUOTED([ovs-appctl --format json --pretty 
> version], [0], [dnl
>"reply-format": "plain"}])
>  
>  AT_CLEANUP
> +
> +AT_SETUP([ovs-vswitchd list-commands])
> +OVS_VSWITCHD_START
> +
> +AT_CHECK([ovs-appctl list-commands], [0], [ignore])
> +AT_CHECK([ovs-appctl --format json list-commands], [0], [stdout])
> +AT_CHECK([wc -l stdout], [0], [0 stdout
> +])

This line looks strange.  What does it do?

> +AT_CHECK([grep -E '^\[[\{"name":.*\}\]]$' stdout], [0], [ignore])
> +
> +AT_CLEANUP

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v9 4/6] python: Add option '--pretty' for pretty-printing JSON output.

2024-05-02 Thread Ilya Maximets
On 4/12/24 09:26, jm...@redhat.com wrote:
> From: Jakob Meng 
> 
> With the '--pretty' option, appctl.py will now print JSON output in a
> more readable fashion, i.e. with additional line breaks, spaces and
> sorted dictionary keys. The pretty-printed output from appctl.py is not
> strictly the same as with ovs-appctl because of both use different
> pretty-printing implementations.
> 
> Signed-off-by: Jakob Meng 
> ---
>  NEWS |  3 +++
>  python/ovs/unixctl/client.py |  4 ++--
>  tests/appctl.py  | 16 ++--
>  tests/unixctl-py.at  |  5 +
>  4 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 52cadbe0d..957351acb 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -15,6 +15,9 @@ Post-v3.3.0
> with indentation.
> - Python:
>   * Added support for choosing the output format, e.g. 'json' or 'text'.
> + * Added new option [--pretty] to print JSON output in a readable 
> fashion.
> +   E.g. members of objects and elements of arrays are printed one per 
> line,
> +   with indentation.

The option was added to the test-only code.  This shouldn't be
in the news.  If we move the formatting responsibility out
of the UnixctlClient to the user, this patch may not be needed
at all.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v9 3/6] appctl: Add option '--pretty' for pretty-printing JSON output.

2024-05-02 Thread Ilya Maximets
On 4/12/24 09:26, jm...@redhat.com wrote:
> From: Jakob Meng 
> 
> With the '--pretty' option, ovs-appctl will now print JSON output in a
> more readable fashion, i.e. with additional line breaks, spaces and
> sorted dictionary keys.
> 
> Signed-off-by: Jakob Meng 
> ---
>  NEWS   |  3 +++
>  lib/unixctl.c  |  6 +++---
>  lib/unixctl.h  |  2 +-
>  tests/ovs-vswitchd.at  |  5 +
>  utilities/ovs-appctl.c | 32 
>  5 files changed, 40 insertions(+), 8 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index f18238159..52cadbe0d 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -10,6 +10,9 @@ Post-v3.3.0
> - ovs-appctl:
>   * Added new option [-f|--format] to choose the output format, e.g. 
> 'json'
> or 'text' (by default).
> + * Added new option [--pretty] to print JSON output in a readable 
> fashion.
> +   E.g. members of objects and elements of arrays are printed one per 
> line,
> +   with indentation.
> - Python:
>   * Added support for choosing the output format, e.g. 'json' or 'text'.
>  
> diff --git a/lib/unixctl.c b/lib/unixctl.c
> index 18638d86a..67cf26418 100644
> --- a/lib/unixctl.c
> +++ b/lib/unixctl.c
> @@ -572,8 +572,8 @@ unixctl_client_create(const char *path, struct jsonrpc 
> **client)
>   * '*err' if not NULL. */
>  int
>  unixctl_client_transact(struct jsonrpc *client, const char *command, int 
> argc,
> -char *argv[], enum ovs_output_fmt fmt, char **result,
> -char **err)
> +char *argv[], enum ovs_output_fmt fmt,
> +unsigned int fmt_flags, char **result, char **err)

As I mentioned in the review for patch #1, it may be better to handle
for matting in the ovs-appctl instead and keep unixctl module unaware
of it.

>  {
>  struct jsonrpc_msg *request, *reply;
>  struct json **json_args, *params, *output_src;
> @@ -604,7 +604,7 @@ unixctl_client_transact(struct jsonrpc *client, const 
> char *command, int argc,
>  if (fmt == OVS_OUTPUT_FMT_TEXT && output_src->type == JSON_STRING) {
>  *output_dst = xstrdup(json_string(output_src));
>  } else if (fmt == OVS_OUTPUT_FMT_JSON) {
> -*output_dst = json_to_string(output_src, 0);
> +*output_dst = json_to_string(output_src, fmt_flags);
>  } else {
>  VLOG_WARN("%s: unexpected %s type in JSON rpc reply: %s",
>jsonrpc_get_name(client),
> diff --git a/lib/unixctl.h b/lib/unixctl.h
> index b30bcf092..5a08a1bd1 100644
> --- a/lib/unixctl.h
> +++ b/lib/unixctl.h
> @@ -39,7 +39,7 @@ int unixctl_client_create(const char *path, struct jsonrpc 
> **client);
>  int unixctl_client_transact(struct jsonrpc *client,
>  const char *command,
>  int argc, char *argv[],
> -enum ovs_output_fmt fmt,
> +enum ovs_output_fmt fmt, unsigned int fmt_flags,
>  char **result, char **error);
>  
>  /* Command registration. */
> diff --git a/tests/ovs-vswitchd.at b/tests/ovs-vswitchd.at
> index 32ca2c6e7..5683896ef 100644
> --- a/tests/ovs-vswitchd.at
> +++ b/tests/ovs-vswitchd.at
> @@ -275,4 +275,9 @@ ovs_version=$(ovs-appctl version)
>  AT_CHECK_UNQUOTED([ovs-appctl --format json version], [0], [dnl
>  {"reply-format":"plain","reply":"$ovs_version\n"}])
>  
> +AT_CHECK_UNQUOTED([ovs-appctl --format json --pretty version], [0], [dnl
> +{
> +  "reply": "$ovs_version\n",
> +  "reply-format": "plain"}])
> +
>  AT_CLEANUP
> diff --git a/utilities/ovs-appctl.c b/utilities/ovs-appctl.c
> index 4d4503597..29a0de2ee 100644
> --- a/utilities/ovs-appctl.c
> +++ b/utilities/ovs-appctl.c
> @@ -26,6 +26,7 @@
>  #include "daemon.h"
>  #include "dirs.h"
>  #include "openvswitch/dynamic-string.h"
> +#include "openvswitch/json.h"
>  #include "jsonrpc.h"
>  #include "process.h"
>  #include "timeval.h"
> @@ -39,6 +40,7 @@ static void usage(void);
>  /* Parsed command line args. */
>  struct cmdl_args {
>  enum ovs_output_fmt format;
> +unsigned int format_flags;
>  char *target;
>  };
>  
> @@ -74,8 +76,8 @@ main(int argc, char *argv[])
>  if (!svec_is_empty(_argv)) {
>  error = unixctl_client_transact(client, "set-options",
>  opt_argv.n, opt_argv.names,
> -args->format, _result,
> -_error);
> +args->format, 0,
> +_result, _error);
>  
>  if (error) {
>  ovs_fatal(error, "%s: transaction error", args->target);
> @@ -98,7 +100,9 @@ main(int argc, char *argv[])
>  cmd_argc = argc - optind;
>  cmd_argv = cmd_argc ? argv + optind : NULL;
>  error = unixctl_client_transact(client, cmd, cmd_argc, cmd_argv,
> -   

Re: [ovs-dev] [PATCH v9 2/6] python: Add global option for JSON output to Python tools.

2024-05-02 Thread Ilya Maximets
On 4/12/24 09:26, jm...@redhat.com wrote:
> From: Jakob Meng 
> 
> This patch introduces support for different output formats to the
> Python code, as did the previous commit for ovs-xxx tools like
> 'ovs-appctl --format json dpif/show'.
> In particular, tests/appctl.py gains a global option '-f,--format'
> which allows users to request JSON instead of plain-text for humans.
> 
> Reported-at: https://bugzilla.redhat.com/1824861
> Signed-off-by: Jakob Meng 
> ---
>  NEWS |  2 ++
>  python/ovs/unixctl/client.py | 14 ---
>  python/ovs/unixctl/server.py | 45 +---
>  python/ovs/util.py   |  8 +++
>  tests/appctl.py  | 19 ++-
>  tests/unixctl-py.at  |  3 +++
>  6 files changed, 84 insertions(+), 7 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 03ef6486b..f18238159 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -10,6 +10,8 @@ Post-v3.3.0
> - ovs-appctl:
>   * Added new option [-f|--format] to choose the output format, e.g. 
> 'json'
> or 'text' (by default).
> +   - Python:
> + * Added support for choosing the output format, e.g. 'json' or 'text'.

This is misleading.  Unixctl should be mentioned.

The rest of the patch has similar issues to the C implementation,
will not repeat my comments here.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v9 1/6] Add global option for JSON output to ovs-appctl.

2024-05-02 Thread Ilya Maximets
On 4/12/24 09:26, jm...@redhat.com wrote:
> From: Jakob Meng 
> 
> For monitoring systems such as Prometheus it would be beneficial if
> OVS would expose statistics in a machine-readable format.
> 
> This patch introduces support for different output formats to ovs-xxx
> tools, in particular ovs-appctl.

What other tools you plan to add support to?
Database tools already have the machine-readable output from the database.
appctl should generelly be used instead of ovs-dpctl.  What else?

I'd suggest to not try to make this generic.

> The latter gains a global option
> '-f,--format' which changes it to print a JSON document instead of
> plain-text for humans. For example, a later patch implements support
> for 'ovs-appctl --format json dpif/show'. By default, the output format
> is plain-text as before.
> 
> A new 'set-options' command has been added to lib/unixctl.c which
> allows to change the output format of the commands executed afterwards
> on the same socket connection. It is supposed to be run by ovs-xxx

It's specific for appctl unility, nothing else is using unixctl.

> tools transparently for the user when a specific output format has been
> requested.
> For example, when a user calls 'ovs-appctl --format json dpif/show',
> then ovs-appctl will call 'set-options' to set the output format as
> requested by the user and afterwards it will call the actual command
> 'dpif/show'.
> This ovs-appctl behaviour has been implemented in a backward compatible
> way. One can use an updated client (ovs-appctl) with an old server
> (ovs-vswitchd) and vice versa. Of course, JSON output only works when
> both sides have been updated.
> 
> Two access functions unixctl_command_{get,set}_output_format() and
> two unixctl_command_reply_{,error_}json functions have been added to
> lib/unixctl.h:
> unixctl_command_get_output_format() is supposed to be used in commands
> like 'dpif/show' to query the requested output format. When JSON output
> has been selected, both unixctl_command_reply_{,error_}json() functions
> can be used to return JSON objects to the client (ovs-appctl) instead
> of plain-text with the unixctl_command_reply{,_error}() functions.
> 
> When JSON has been requested but a command has not implemented JSON
> output the plain-text output will be wrapped in a provisional JSON
> document with the following structure:
> 
>   {"reply-format":"plain","reply":"$PLAIN_TEXT_HERE"}
> 
> Thus commands which have been executed successfully will not fail when
> they try to render the output at a later stage.
> 
> In popular tools like kubectl the option for output control is usually
> called '-o|--output' instead of '-f,--format'. But ovs-appctl already
> has an short option '-o' which prints the available ovs-appctl options
> ('--option'). The now choosen name also better alignes with
> ovsdb-client where '-f,--format' controls output formatting.
> 
> Reported-at: https://bugzilla.redhat.com/1824861
> Signed-off-by: Jakob Meng 
> ---
>  NEWS   |   3 +
>  lib/command-line.c |  36 ++
>  lib/command-line.h |  10 +++
>  lib/unixctl.c  | 158 -
>  lib/unixctl.h  |  11 +++
>  tests/ovs-vswitchd.at  |  11 +++
>  utilities/ovs-appctl.c |  98 +
>  7 files changed, 277 insertions(+), 50 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index b92cec532..03ef6486b 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -7,6 +7,9 @@ Post-v3.3.0
> - The primary development branch has been renamed from 'master' to 'main'.
>   The OVS tree remains hosted on GitHub.
>   https://github.com/openvswitch/ovs.git
> +   - ovs-appctl:

'o' goes before 't' or 'u', so I'd put this entry to the top.

> + * Added new option [-f|--format] to choose the output format, e.g. 
> 'json'
> +   or 'text' (by default).

A man page update is missing:
  Documentation/ref/ovs-appctl.8.rst

>  
>  
>  v3.3.0 - 16 Feb 2024
> diff --git a/lib/command-line.c b/lib/command-line.c
> index 967f4f5d5..775e0430a 100644
> --- a/lib/command-line.c
> +++ b/lib/command-line.c

These changes should be moved to unixctl.[ch].

> @@ -24,6 +24,7 @@
>  #include "ovs-thread.h"
>  #include "util.h"
>  #include "openvswitch/vlog.h"
> +#include "openvswitch/json.h"
>  
>  VLOG_DEFINE_THIS_MODULE(command_line);
>  
> @@ -53,6 +54,41 @@ ovs_cmdl_long_options_to_short_options(const struct option 
> options[])
>  return xstrdup(short_options);
>  }
>  
> +const char *
> +ovs_output_fmt_to_string(enum ovs_output_fmt fmt)
> +{
> +switch (fmt) {
> +case OVS_OUTPUT_FMT_TEXT:
> +return "text";
> +
> +case OVS_OUTPUT_FMT_JSON:
> +return "json";
> +
> +default:
> +return NULL;
> +}
> +}
> +
> +struct json *
> +ovs_output_fmt_to_json(enum ovs_output_fmt fmt)
> +{
> +const char *string = ovs_output_fmt_to_string(fmt);
> +return string ? json_string_create(string) : NULL;
> +}
> +
> +bool
> +ovs_output_fmt_from_string(const 

Re: [ovs-dev] [PATCH v2] hash, jhash: Fix unaligned access to the hash remainder.

2024-05-02 Thread Ilya Maximets
On 5/2/24 15:28, Ales Musil wrote:
> The pointer was passed to memcpy as uin32_t *, however the hash bytes
> might be unaligned at that point. Case it to uint8_t * instead

'Case' ?

> which has only single byte alignment requirement. This seems to be
> a false positive reported by clang [0].

After thinking some more, it's not actually a false positive per se.
According to the C spec we're not actually allowed to have misaligned
pointers even if we're not reading/writing through them.

So, technically, the initial cast to uint32_t pointer is no correct.
I don't think we can fully avoid such casts without loosing type checking,
but I think we need to revert changes to hash functions made in
commit db5a101931c5 ("clang: Fix the alignment warning.").
i.e. we should go back to using uint8_t pointer and cast it on the
get_unaligned_u32() call with ALIGNED_CAST.  We will still have a
misaligned pointer, but it will be immediately cast back, so should
cause less issues.

Note: all arithmetic should be done on the uint8_t pointer, not a
misaligned uin32_t one to avoid potential other UB conditions.

Best regards, Ilya Maximets.

> 
> lib/hash.c:46:22: runtime error: load of misaligned address
> 0x50700065 for type 'const uint32_t *' (aka 'const unsigned int *'),
> which requires 4 byte alignment
> 0x50700065: note: pointer points here
>  73 62 2e 73 6f 63 6b  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>  ^
>   00 00 00 00 00 00 00 00  00
> 0 0x6191cb in hash_bytes ovs/lib/hash.c:46:9
> 1 0x69d064 in hash_string ovs/lib/hash.h:404:12
> 2 0x69d064 in hash_name ovs/lib/shash.c:29:12
> 3 0x69d064 in shash_find ovs/lib/shash.c:237:49
> 4 0x69dada in shash_find_data ovs/lib/shash.c:251:31
> 5 0x507987 in add_remote ovs/ovsdb/ovsdb-server.c:1382:15
> 6 0x507987 in parse_options ovs/ovsdb/ovsdb-server.c:2659:13
> 7 0x507987 in main ovs/ovsdb/ovsdb-server.c:751:5
> 
> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/hash.c:46:22
> 
> [0] https://github.com/llvm/llvm-project/issues/90848
> Signed-off-by: Ales Musil 
> ---
>  lib/hash.c  | 2 +-
>  lib/jhash.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/hash.c b/lib/hash.c
> index c722f3c3c..986fa6643 100644
> --- a/lib/hash.c
> +++ b/lib/hash.c
> @@ -43,7 +43,7 @@ hash_bytes(const void *p_, size_t n, uint32_t basis)
>  if (n) {
>  uint32_t tmp = 0;
>  
> -memcpy(, p, n);
> +memcpy(, (const uint8_t *) p, n);
>  hash = hash_add(hash, tmp);
>  }
>  
> diff --git a/lib/jhash.c b/lib/jhash.c
> index c59b51b61..0a0628589 100644
> --- a/lib/jhash.c
> +++ b/lib/jhash.c
> @@ -114,7 +114,7 @@ jhash_bytes(const void *p_, size_t n, uint32_t basis)
>  uint32_t tmp[3];
>  
>  tmp[0] = tmp[1] = tmp[2] = 0;
> -memcpy(tmp, p, n);
> +memcpy(tmp, (const uint8_t *) p, n);
>  a += tmp[0];
>  b += tmp[1];
>  c += tmp[2];

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v5] controller: Allow br-int connection via other methods.

2024-05-02 Thread Ales Musil
The br-int connection is hardcoded to use unix socket, which requires
for the socket to be visible for ovn-controller. This is achievable in
container by mounting the socket, but in turn the container requires
additional privileges.

Add option to vswitchd external-ids that allows to specify remote
target for management bridge. This gives the user possibility to
connect to management bridge in different manner than unix socket,
defaulting to the unix socket when not specified. In addition, there
is an option to specify inactivity probe for this connection, disabled
by default.

Reported-at: https://issues.redhat.com/browse/FDP-243
Signed-off-by: Ales Musil 
---
v4: Rebase on top of current main.
v3: Rebase on top of current main.
Fix the copy-paste error in ovn-controller documentation.
v2: Rebase on top of current main.
Make the probe interval accept milliseconds to be aligned with other probe 
intervals.
Use external-ids instead of options for the ovn-controller.
---
 NEWS|  6 +++
 controller/ofctrl.c | 10 +
 controller/ofctrl.h |  5 ++-
 controller/ovn-controller.8.xml | 15 
 controller/ovn-controller.c | 59 +++--
 controller/pinctrl.c| 56 ++--
 controller/pinctrl.h|  6 ++-
 controller/statctrl.c   | 66 ++---
 controller/statctrl.h   |  3 +-
 include/ovn/features.h  |  2 +-
 lib/features.c  | 35 +
 lib/ovn-util.c  | 26 +
 lib/ovn-util.h  |  4 ++
 lib/test-ovn-features.c |  6 +--
 tests/ovn-controller.at | 45 ++
 15 files changed, 193 insertions(+), 151 deletions(-)

diff --git a/NEWS b/NEWS
index 3b5e93dc9..4e15f31c8 100644
--- a/NEWS
+++ b/NEWS
@@ -17,6 +17,12 @@ Post v24.03.0
 external-ids, the option is no longer needed as it became effectively
 "true" for all scenarios.
   - Added DHCPv4 relay support.
+  - Add "ovn-bridge-remote" config option to vswitchd external-ids,
+that allows to specify connection method to management bridge for
+ovn-controller, defaulting to the unix socket.
+  - Add "ovn-bridge-remote-probe-interval" config option to vswitchd
+external-ids, that sets probe interval for integration bridge connection,
+disabled by default.
 
 OVN v24.03.0 - 01 Mar 2024
 --
diff --git a/controller/ofctrl.c b/controller/ofctrl.c
index 6a2564604..9d181a782 100644
--- a/controller/ofctrl.c
+++ b/controller/ofctrl.c
@@ -771,19 +771,13 @@ ofctrl_get_mf_field_id(void)
  * Returns 'true' if an OpenFlow reconnect happened; 'false' otherwise.
  */
 bool
-ofctrl_run(const struct ovsrec_bridge *br_int,
+ofctrl_run(const char *conn_target, int probe_interval,
const struct ovsrec_open_vswitch_table *ovs_table,
struct shash *pending_ct_zones)
 {
-char *target = xasprintf("unix:%s/%s.mgmt", ovs_rundir(), br_int->name);
 bool reconnected = false;
 
-if (strcmp(target, rconn_get_target(swconn))) {
-VLOG_INFO("%s: connecting to switch", target);
-rconn_connect(swconn, target, target);
-}
-free(target);
-
+ovn_update_swconn_at(swconn, conn_target, probe_interval, "ofctrl");
 rconn_run(swconn);
 
 if (!rconn_is_connected(swconn) || !pending_ct_zones) {
diff --git a/controller/ofctrl.h b/controller/ofctrl.h
index 502c73da6..7df0a24ea 100644
--- a/controller/ofctrl.h
+++ b/controller/ofctrl.h
@@ -50,8 +50,9 @@ struct ovn_desired_flow_table {
 /* Interface for OVN main loop. */
 void ofctrl_init(struct ovn_extend_table *group_table,
  struct ovn_extend_table *meter_table);
-bool ofctrl_run(const struct ovsrec_bridge *br_int,
-const struct ovsrec_open_vswitch_table *,
+
+bool ofctrl_run(const char *conn_target, int probe_interval,
+const struct ovsrec_open_vswitch_table *ovs_table,
 struct shash *pending_ct_zones);
 enum mf_field_id ofctrl_get_mf_field_id(void);
 void ofctrl_put(struct ovn_desired_flow_table *lflow_table,
diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml
index 85e7966d7..b6404a19d 100644
--- a/controller/ovn-controller.8.xml
+++ b/controller/ovn-controller.8.xml
@@ -378,6 +378,21 @@
 cap for the exponential backoff used by ovn-controller
 to send GARPs packets.
   
+  external_ids:ovn-bridge-remote
+  
+
+  Connection to the OVN management bridge in OvS. It defaults to
+  unix:br-int.mgmt when not specified.
+
+  
+  external_ids:ovn-bridge-remote-probe-interval
+  
+
+  The inactivity probe interval of the connection to the OVN management
+  bridge, in milliseconds.
+  If the value is zero, it disables the inactivity probe.
+
+  
 
 
 
diff --git 

Re: [ovs-dev] [PATCH ovn] northd: Fix the comment about route priorities.

2024-05-02 Thread Han Zhou
On Thu, May 2, 2024 at 9:01 AM Numan Siddique  wrote:
>
> On Mon, Apr 22, 2024 at 2:41 AM Han Zhou  wrote:
> >
> > The current comments are obviously conflicting.  Fixing it according the
> > current implementation - static route overrides src-ip route.
> >
> > Signed-off-by: Han Zhou 
>
> Acked-by: Numan Siddique 
>
> Numan
>

Thanks Numan. Applied to main.

Han

> > ---
> >  northd/northd.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/northd/northd.c b/northd/northd.c
> > index 331d9c2677b8..dec1eb3679f5 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -271,7 +271,7 @@ static bool default_acl_drop;
> >   * Route offsets implement logic to prioritize traffic for routes with
> >   * same ip_prefix values:
> >   *  -  connected route overrides static one;
> > - *  -  static route overrides connected route. */
> > + *  -  static route overrides src-ip route. */
> >  #define ROUTE_PRIO_OFFSET_MULTIPLIER 3
> >  #define ROUTE_PRIO_OFFSET_STATIC 1
> >  #define ROUTE_PRIO_OFFSET_CONNECTED 2
> > --
> > 2.38.1
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] controller: Track individual address set constants.

2024-05-02 Thread Han Zhou
On Thu, May 2, 2024 at 6:29 AM Ales Musil  wrote:
>
> Instead of tracking address set per struct expr_constant_set track it
> per individual struct expr_constant. This allows more fine grained
> control for I-P processing of address sets in controller. It helps with
> scenarios like matching on two address sets in one expression e.g.
> "ip4.src == {$as1, $as2}". This allows any addition or removal of
> individual adress from the set to be incrementally processed instead
> of reprocessing all the flows.
>
> This unfortunately doesn't help with the following flows:
> "ip4.src == $as1 && ip4.dst == $as2"
> "ip4.src == $as1 || ip4.dst == $as2"
>
> The memory impact should be minimal as there is only increase of 8 bytes
> per the struct expr_constant.
>
> Reported-at: https://issues.redhat.com/browse/FDP-509
> Signed-off-by: Ales Musil 
> ---
> v4: Rebase on top of current main.
> Update the "lflow_handle_addr_set_update" comment according to
suggestion from Han.

Thanks Ales. I updated the commit message for the same, and applied to main
branch.

Regards,
Han

> v3: Rebase on top of current main.
> Address comments from Han:
> - Adjust the comment for "lflow_handle_addr_set_update" to include
remaning corner cases.
> - Make sure that the flows are consistent between I-P and recompute.
> v2: Rebase on top of current main.
> Adjust the comment for I-P optimization.
> ---
>  controller/lflow.c  | 11 ++---
>  include/ovn/actions.h   |  2 +-
>  include/ovn/expr.h  | 46 ++-
>  lib/actions.c   | 20 -
>  lib/expr.c  | 99 +
>  tests/ovn-controller.at | 79 +---
>  6 files changed, 154 insertions(+), 103 deletions(-)
>
> diff --git a/controller/lflow.c b/controller/lflow.c
> index 760ec0b41..1e05665a1 100644
> --- a/controller/lflow.c
> +++ b/controller/lflow.c
> @@ -278,7 +278,7 @@ lflow_handle_changed_flows(struct lflow_ctx_in
*l_ctx_in,
>  }
>
>  static bool
> -as_info_from_expr_const(const char *as_name, const union expr_constant
*c,
> +as_info_from_expr_const(const char *as_name, const struct expr_constant
*c,
>  struct addrset_info *as_info)
>  {
>  as_info->name = as_name;
> @@ -644,14 +644,11 @@ as_update_can_be_handled(const char *as_name,
struct addr_set_diff *as_diff,
>   *generated.
>   *
>   *  - The sub expression of the address set is combined with other
sub-
> - *expressions/constants, usually because of disjunctions between
> - *sub-expressions/constants, e.g.:
> + *expressions/constants on different fields, e.g.:
>   *
>   *  ip.src == $as1 || ip.dst == $as2
> - *  ip.src == {$as1, $as2}
> - *  ip.src == {$as1, ip1}
>   *
> - *All these could have been split into separate lflows.
> + *This could have been split into separate lflows.
>   *
>   *  - Conjunctions overlapping between lflows, which can be caused by
>   *overlapping address sets or same address set used by multiple
lflows
> @@ -714,7 +711,7 @@ lflow_handle_addr_set_update(const char *as_name,
>  if (as_diff->deleted) {
>  struct addrset_info as_info;
>  for (size_t i = 0; i < as_diff->deleted->n_values; i++) {
> -union expr_constant *c = _diff->deleted->values[i];
> +struct expr_constant *c = _diff->deleted->values[i];
>  if (!as_info_from_expr_const(as_name, c, _info)) {
>  continue;
>  }
> diff --git a/include/ovn/actions.h b/include/ovn/actions.h
> index ae0864fdd..88cf4de79 100644
> --- a/include/ovn/actions.h
> +++ b/include/ovn/actions.h
> @@ -241,7 +241,7 @@ struct ovnact_next {
>  struct ovnact_load {
>  struct ovnact ovnact;
>  struct expr_field dst;
> -union expr_constant imm;
> +struct expr_constant imm;
>  };
>
>  /* OVNACT_MOVE, OVNACT_EXCHANGE. */
> diff --git a/include/ovn/expr.h b/include/ovn/expr.h
> index c48f82398..e54edb5bf 100644
> --- a/include/ovn/expr.h
> +++ b/include/ovn/expr.h
> @@ -368,7 +368,7 @@ bool expr_relop_from_token(enum lex_type type, enum
expr_relop *relop);
>  struct expr {
>  struct ovs_list node;   /* In parent EXPR_T_AND or EXPR_T_OR if
any. */
>  enum expr_type type;/* Expression type. */
> -char *as_name;  /* Address set name. Null if it is not an
> +const char *as_name;/* Address set name. Null if it is not an
> address set. */
>
>  union {
> @@ -505,40 +505,42 @@ enum expr_constant_type {
>  };
>
>  /* A string or integer constant (one must know which from context). */
> -union expr_constant {
> -/* Integer constant.
> - *
> - * The width of a constant isn't always clear, e.g. if you write "1",
> - * there's no way to tell whether you mean for that to be a 1-bit
constant
> - * or a 128-bit constant or 

Re: [ovs-dev] [PATCH ovn v4 1/2] northd: Make `vxlan_mode` a global variable.

2024-05-02 Thread Vladislav Odintsov
Hi Ihar,

thanks for your review!

> On 2 May 2024, at 18:11, Ihar Hrachyshka  wrote:
> 
> On Thu, May 2, 2024 at 5:51 AM Vladislav Odintsov  > wrote:
> 
>> This simplifies code and subsequent commit to explicitely disable vxlan
>> 
> 
> I personally find it debatable that moving from explicit dependency through
> a function argument to implicit dependency through a global variable is a
> simplification. But I will leave others to chime in.
> 

Here I wanted to mention that in many pieces of code argument which
was passed just to find VXLAN encaps was removed and with less
code/arguments it looks more simple.

> 
>> mode is based on these changes.
>> 
>> Also `vxlan mode` term is introduced in ovn-architecture man page.
>> 
> 
> Should the mode name keep VXLAN capitalized?
> 

Dunno.
This was inspired by writing in initial commit [1].
I’m fine with both writings.


> 
>> 
>> Signed-off-by: Vladislav Odintsov 
>> ---
>> northd/en-global-config.c |  4 +-
>> northd/northd.c   | 94 ---
>> northd/northd.h   |  5 ++-
>> ovn-architecture.7.xml| 11 +++--
>> 4 files changed, 50 insertions(+), 64 deletions(-)
>> 
>> diff --git a/northd/en-global-config.c b/northd/en-global-config.c
>> index 28c78a12c..873649a89 100644
>> --- a/northd/en-global-config.c
>> +++ b/northd/en-global-config.c
>> @@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void
>> *data)
>>  config_data->svc_monitor_mac);
>> }
>> 
>> -char *max_tunid = xasprintf("%d",
>> -get_ovn_max_dp_key_local(sbrec_chassis_table));
>> +init_vxlan_mode(sbrec_chassis_table);
>> +char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
>> smap_replace(options, "max_tunid", max_tunid);
>> free(max_tunid);
>> 
>> diff --git a/northd/northd.c b/northd/northd.c
>> index 5e12fd1e8..b54219a85 100644
>> --- a/northd/northd.c
>> +++ b/northd/northd.c
>> @@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
>>  */
>> static bool default_acl_drop;
>> 
>> +/* If this option is 'true' northd will use limited 24-bit space for
>> datapath
>> + * and ports tunnel key allocation (12 bits for each instead of default
>> 16). */
>> +static bool vxlan_mode;
>> +
>> #define MAX_OVN_TAGS 4096
>> 
>> 
>> @@ -881,24 +885,25 @@ join_datapaths(const struct
>> nbrec_logical_switch_table *nbrec_ls_table,
>> }
>> }
>> 
>> -static bool
>> -is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
>> +void
>> +init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
>> {
>> const struct sbrec_chassis *chassis;
>> SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
>> for (int i = 0; i < chassis->n_encaps; i++) {
>> if (!strcmp(chassis->encaps[i]->type, "vxlan")) {
>> -return true;
>> +vxlan_mode = true;
>> +return;
>> }
>> }
>> }
>> -return false;
>> +vxlan_mode = false;
>> }
>> 
>> uint32_t
>> -get_ovn_max_dp_key_local(const struct sbrec_chassis_table
>> *sbrec_chassis_table)
>> +get_ovn_max_dp_key_local(void)
>> {
>> -if (is_vxlan_mode(sbrec_chassis_table)) {
>> +if (vxlan_mode) {
>> /* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
>> return OVN_MAX_DP_VXLAN_KEY;
>> }
>> @@ -906,15 +911,14 @@ get_ovn_max_dp_key_local(const struct
>> sbrec_chassis_table *sbrec_chassis_table)
>> }
>> 
>> static void
>> -ovn_datapath_allocate_key(const struct sbrec_chassis_table
>> *sbrec_ch_table,
>> -  struct hmap *datapaths, struct hmap *dp_tnlids,
>> +ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
>>   struct ovn_datapath *od, uint32_t *hint)
>> {
>> if (!od->tunnel_key) {
>> od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "datapath",
>> -OVN_MIN_DP_KEY_LOCAL,
>> -
>> get_ovn_max_dp_key_local(sbrec_ch_table),
>> -hint);
>> +OVN_MIN_DP_KEY_LOCAL,
>> +get_ovn_max_dp_key_local(),
>> +hint);
>> if (!od->tunnel_key) {
>> if (od->sb) {
>> sbrec_datapath_binding_delete(od->sb);
>> @@ -927,7 +931,6 @@ ovn_datapath_allocate_key(const struct
>> sbrec_chassis_table *sbrec_ch_table,
>> 
>> static void
>> ovn_datapath_assign_requested_tnl_id(
>> -const struct sbrec_chassis_table *sbrec_chassis_table,
>> struct hmap *dp_tnlids, struct ovn_datapath *od)
>> {
>> const struct smap *other_config = (od->nbs
>> @@ -936,8 +939,7 @@ ovn_datapath_assign_requested_tnl_id(
>> uint32_t tunnel_key = smap_get_int(other_config, "requested-tnl-key",
>> 0);
>> if (tunnel_key) {
>> const char *interconn_ts = smap_get(other_config, "interconn-ts");
>> - 

Re: [ovs-dev] [PATCH ovn] northd: Fix logical router load-balancer nat rules when using DGP.

2024-05-02 Thread Numan Siddique
On Fri, Mar 22, 2024 at 7:51 AM Roberto Bartzen Acosta
 wrote:
>
> Hi Mark,
>
> Thanks for your feedback.
>
> Em seg., 18 de mar. de 2024 às 11:53, Mark Michelson 
> escreveu:
>
> > Hi Roberto,
> >
> > I have some concerns about this patch. Let's use the test case you added
> > as an example network. Let's bind the vms and DGPs to hypervisors:
> >
> > * vm1 and lr1-ts1 are bound to hypervisor hv1
> > * vm2 and lr1-ts2 are bound to hypervisor hv2
> >
> > Now imagine a packet arrives on lr1-ts1. The packet gets load balanced
> > and sent to vm2. In this case, since lr1-ts1 is on hv1, the ct_lb_mark()
> > action runs there, creating a conntrack entry on hv1. However, the
> > packet eventually is tunneled to hv2 so that it can be output to vm2.
> >
> > Now vm2 replies to the packet. Is there anything that ensures that the
> > reply from vm2 gets sent to hv1 for the egress pipeline of lr1? If so,
> > then this should work fine since the packet will be unDNATted as
> > expected on hv1. However, if the packet runs all of lr1's pipelines on
> > hv2, then there is no conntrack entry present, and the attempt to unDNAT
> > will not succeed. The packet will either be dropped because of invalid
> > CT, or the packet will have an incorrect source IP and port. Either way,
> > this isn't what is desired.
> >
>
> yep, you're right! that makes sense.
> If the packet comes back with a chassis that does not have the related LB
> contrack entry corresponding to the initial request, this will trigger a
> miss in the pipeline.
>
> I tried to disable ct tracking for lb by configuring the parameter on the
> router, but I still wasn't successful. E.g.:
> options:lb_force_snat_ip="172.16.200.201"
>
> Even changing the lb_force snat ip on the router, or creating a SNAT
> "stateless" rule, I still see this action in the output pipeline in the
> highest priority table (table=1).
>
>   table=1 (lr_out_undnat  ), priority=120  , match=(ip4 && ((ip4.src ==
> 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src ==
> 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src ==
> 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src ==
> 8000)) && outport == "incus-net40-lr-lrp-01" &&
> is_chassis_resident("cr-incus-net40-lr-lrp-01")), action=(ct_dnat_in_czone;)
>   table=1 (lr_out_undnat  ), priority=120  , match=(ip4 && ((ip4.src ==
> 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src ==
> 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src ==
> 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src ==
> 8000)) && outport == "incus-net40-lr-lrp-02" &&
> is_chassis_resident("cr-incus-net40-lr-lrp-02")), action=(ct_dnat_in_czone;)
>   table=1 (lr_out_undnat  ), priority=120  , match=(ip4 && ((ip4.src ==
> 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src ==
> 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src ==
> 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src ==
> 8000)) && outport == "incus-net40-lr-lrp-03" &&
> is_chassis_resident("cr-incus-net40-lr-lrp-03")), action=(ct_dnat_in_czone;)
>   table=1 (lr_out_undnat  ), priority=120  , match=(ip4 && ((ip4.src ==
> 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src ==
> 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src ==
> 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src ==
> 8000)) && outport == "incus-net40-lr-lrp-04" &&
> is_chassis_resident("cr-incus-net40-lr-lrp-04")), action=(ct_dnat_in_czone;)
>   table=1 (lr_out_undnat  ), priority=120  , match=(ip4 && ((ip4.src ==
> 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src ==
> 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src ==
> 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src ==
> 8000)) && outport == "incus-net40-lr-lrp-ext" &&
> is_chassis_resident("cr-incus-net40-lr-lrp-ext")),
> action=(ct_dnat_in_czone;)
>
>
> Considering that the return when using multiple DGPs is probably associated
> with ECMP, any idea on how to change the rules so that the LB output rules
> are 'stateless' (not ct dependent) in this case with multiple DGPs? I
> imagine this solves the problem and guarantees a return for any chassis.

I think the only way to solve your problem is to add NAT support for a
router having multiple DGPs.
I'm not sure how easy that is or if it is even possible to support.
We need to explore this.

Numan

>
> Thanks,
> Roberto
>
> PS: I have a complex load balancer scenario for the use case with multiple
> DPGs, and 'internal' VIP addresses + OVN interconnect. I can explain the
> general context if you are interested :)
>
>
> > On 2/19/24 16:33, Roberto Bartzen Acosta wrote:
> > > This commit fixes the build_distr_lrouter_nat_flows_for_lb function to
> > > include one NAT flow entry for each DGP in use. Since we have added
> > support
> > 

Re: [ovs-dev] [PATCH] netdev-tc-offloads: Don't offload header modification on ip fragments.

2024-05-02 Thread Eelco Chaudron


On 2 May 2024, at 16:48, Finn, Emma wrote:

>> -Original Message-
>> From: Eelco Chaudron 
>> Sent: Thursday, May 2, 2024 8:14 AM
>> To: Finn, Emma 
>> Cc: Simon Horman ; Stokes, Ian ;
>> Van Haaren, Harry ; d...@openvswitch.org;
>> Flavio Leitner ; Ilya Maximets 
>> Subject: Re: [ovs-dev] [PATCH] netdev-tc-offloads: Don't offload header
>> modification on ip fragments.
>>
>>
>>
>> On 1 May 2024, at 15:36, Finn, Emma wrote:
>>
 -Original Message-
 From: Eelco Chaudron 
 Sent: Wednesday, May 1, 2024 1:52 PM
 To: Simon Horman 
 Cc: Finn, Emma ; Stokes, Ian
 ; sunil.pa...@intel.com; Van Haaren, Harry
 ; d...@openvswitch.org; Flavio Leitner
 ; Ilya Maximets 
 Subject: Re: [ovs-dev] [PATCH] netdev-tc-offloads: Don't offload
 header modification on ip fragments.



 On 1 May 2024, at 14:39, Simon Horman wrote:

> On Tue, Apr 30, 2024 at 02:42:45PM +0200, Eelco Chaudron wrote:
>> Greetings, Intel team!
>>
>> The self-test conducted as part of this patch has revealed an issue
>> with the
 AVX512 checksumming code. Since it was agreed upon that your team
 would maintain this code upon its inclusion, could you please review
 the problem and provide a patch?
>>
>> Details on the problem can be found in this mail link:
>>
>> https://mail.openvswitch.org/pipermail/ovs-build/2024-
 April/038590.html
>
> Thanks Eelco,
>
> In light of the above, could you clarify your plans for this patch?

 I hope Intel keeps their promise and will have a patch out soon, so
 we can apply this patch.

 If not, I guess I can send out a patch to disable pedit acceleration
 as it’s broken, and then apply this patch.

 Intel are you looking into this?

 Cheers,

 Eelco
>>>
>>> Hi Folks,
>>>
>>> I'll look into this and try reproduce locally myself.
>>> Will reach out when I have an update.
>>
>> Thanks Emma, you can simply replicate this by building OVS with the --enable-
>> actions-default-autovalidator configuration flag, then including the test in 
>> this
>> patch, and run make check-dpdk.
>>
>> Cheers,
>>
>> Eelco
>
> Hi Eelco,
>
> I have identified the issue. Seems your unit test identified an issue in the 
> AVX implementation for the IPv4
> actions code where we are losing carry-over bits during the checksum 
> calculation.
> I have a fix but I need to do some more testing before I push. Due to some 
> PTO this will likely be Tuesday.

Thanks for working on this! I noticed the one-off in the log results, however, 
it did not come to mind at the time this would be a carry problem, though it 
seems obvious thinking about it now.

Don’t rush, some time next week will be fine.

Cheers and enjoy your PTO!

//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: Fix the comment about route priorities.

2024-05-02 Thread Numan Siddique
On Mon, Apr 22, 2024 at 2:41 AM Han Zhou  wrote:
>
> The current comments are obviously conflicting.  Fixing it according the
> current implementation - static route overrides src-ip route.
>
> Signed-off-by: Han Zhou 

Acked-by: Numan Siddique 

Numan

> ---
>  northd/northd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/northd/northd.c b/northd/northd.c
> index 331d9c2677b8..dec1eb3679f5 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -271,7 +271,7 @@ static bool default_acl_drop;
>   * Route offsets implement logic to prioritize traffic for routes with
>   * same ip_prefix values:
>   *  -  connected route overrides static one;
> - *  -  static route overrides connected route. */
> + *  -  static route overrides src-ip route. */
>  #define ROUTE_PRIO_OFFSET_MULTIPLIER 3
>  #define ROUTE_PRIO_OFFSET_STATIC 1
>  #define ROUTE_PRIO_OFFSET_CONNECTED 2
> --
> 2.38.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] northd: Add bfd and bfd_consumer nodes to I-P engine.

2024-05-02 Thread Lorenzo Bianconi
Introduce bfd and bfd_consumer nodes to northd I-P engine to track bfd
connections and northd static_route/policy_route changes.

Reported-at: https://issues.redhat.com/browse/FDP-600
Signed-off-by: Lorenzo Bianconi 
---
 northd/en-lflow.c|  19 +--
 northd/en-northd.c   |  92 +
 northd/en-northd.h   |   8 ++
 northd/inc-proc-northd.c |  21 ++-
 northd/northd.c  | 274 ---
 northd/northd.h  |  48 ++-
 6 files changed, 344 insertions(+), 118 deletions(-)

diff --git a/northd/en-lflow.c b/northd/en-lflow.c
index c4b927fb8..9efbd5117 100644
--- a/northd/en-lflow.c
+++ b/northd/en-lflow.c
@@ -41,6 +41,9 @@ lflow_get_input_data(struct engine_node *node,
  struct lflow_input *lflow_input)
 {
 struct northd_data *northd_data = engine_get_input_data("northd", node);
+struct bfd_data *bfd_data = engine_get_input_data("bfd", node);
+struct bfd_consumer_data *bfd_consumer_data =
+engine_get_input_data("bfd_consumer", node);
 struct port_group_data *pg_data =
 engine_get_input_data("port_group", node);
 struct sync_meters_data *sync_meters_data =
@@ -78,7 +81,10 @@ lflow_get_input_data(struct engine_node *node,
 lflow_input->meter_groups = _meters_data->meter_groups;
 lflow_input->lb_datapaths_map = _data->lb_datapaths_map;
 lflow_input->svc_monitor_map = _data->svc_monitor_map;
-lflow_input->bfd_connections = NULL;
+lflow_input->bfd_connections = _data->bfd_connections;
+lflow_input->parsed_routes = _consumer_data->parsed_routes;
+lflow_input->route_tables = _consumer_data->route_tables;
+lflow_input->route_policies = _consumer_data->route_policies;
 
 struct ed_type_global_config *global_config =
 engine_get_input_data("global_config", node);
@@ -95,25 +101,14 @@ void en_lflow_run(struct engine_node *node, void *data)
 struct lflow_input lflow_input;
 lflow_get_input_data(node, _input);
 
-struct hmap bfd_connections = HMAP_INITIALIZER(_connections);
-lflow_input.bfd_connections = _connections;
-
 stopwatch_start(BUILD_LFLOWS_STOPWATCH_NAME, time_msec());
 
 struct lflow_data *lflow_data = data;
 lflow_table_clear(lflow_data->lflow_table);
 lflow_reset_northd_refs(_input);
 
-build_bfd_table(eng_ctx->ovnsb_idl_txn,
-lflow_input.nbrec_bfd_table,
-lflow_input.sbrec_bfd_table,
-lflow_input.lr_ports,
-_connections);
 build_lflows(eng_ctx->ovnsb_idl_txn, _input,
  lflow_data->lflow_table);
-bfd_cleanup_connections(lflow_input.nbrec_bfd_table,
-_connections);
-hmap_destroy(_connections);
 stopwatch_stop(BUILD_LFLOWS_STOPWATCH_NAME, time_msec());
 
 engine_set_node_state(node, EN_UPDATED);
diff --git a/northd/en-northd.c b/northd/en-northd.c
index 4479b4aff..2d8c05608 100644
--- a/northd/en-northd.c
+++ b/northd/en-northd.c
@@ -236,6 +236,66 @@ northd_global_config_handler(struct engine_node *node, 
void *data OVS_UNUSED)
 return true;
 }
 
+void
+en_bfd_consumer_run(struct engine_node *node, void *data)
+{
+
+struct northd_data *northd_data = engine_get_input_data("northd", node);
+struct bfd_data *bfd_data = engine_get_input_data("bfd", node);
+const struct nbrec_bfd_table *nbrec_bfd_table =
+EN_OVSDB_GET(engine_get_input("NB_bfd", node));
+struct bfd_consumer_data *bfd_consumer_data = data;
+
+bfd_consumer_destroy(data);
+bfd_consumer_init(data);
+
+struct ovn_datapath *od;
+HMAP_FOR_EACH (od, key_node, _data->lr_datapaths.datapaths) {
+for (int i = 0; i < od->nbr->n_ports; i++) {
+const char *route_table_name =
+smap_get(>nbr->ports[i]->options, "route_table");
+get_route_table_id(_consumer_data->route_tables,
+   route_table_name);
+}
+for (int i = 0; i < od->nbr->n_static_routes; i++) {
+parsed_routes_add(od, _data->lr_ports,
+  _consumer_data->parsed_routes,
+  _consumer_data->route_tables,
+  od->nbr->static_routes[i],
+  _data->bfd_connections);
+}
+build_route_policies(od, _data->lr_ports,
+ _data->bfd_connections,
+ _consumer_data->route_policies);
+}
+bfd_cleanup_connections(nbrec_bfd_table, _data->bfd_connections);
+
+engine_set_node_state(node, EN_UPDATED);
+}
+
+void
+en_bfd_run(struct engine_node *node, void *data)
+{
+struct northd_data *northd_data = engine_get_input_data("northd", node);
+const struct engine_context *eng_ctx = engine_get_context();
+struct bfd_data *bfd_data = data;
+const struct nbrec_bfd_table *nbrec_bfd_table =
+

Re: [ovs-dev] [PATCH ovn v4 2/2] northd: Add support for disabling vxlan mode.

2024-05-02 Thread Ihar Hrachyshka
This patch is the meat of the feature, and I think it could be as well
implemented without patch 1/2 in the series. I agree with this particular
patch, (assuming we adjust documentation to talk about `VXLAN mode`,
capitalized). But I also have some reservations about patch 1/2 in the
series (or at least its stated goal - to simplify the code). I commented
there, and I would like others to vote on patch 1/2.

For this patch (2/2), with capitalized VXLAN mode in docs:

Acked-By: Ihar Hrachyshka 

On Thu, May 2, 2024 at 5:51 AM Vladislav Odintsov  wrote:

> Commit [1] introduced a "vxlan mode" concept.  It brought a limitation
> for available tunnel IDs because of lack of space in VXLAN VNI.
> In vxlan mode OVN is limited by 4095 datapaths (LRs or non-transit LSs)
> and 2047 logical switch ports per datapath.
>
> Prior to this patch vxlan mode was enabled automatically if at least one
> chassis had encap of vxlan type.  In scenarios where one want to use VXLAN
> only for HW VTEP (RAMP) switch, such limitation makes no sence.
>
> This patch adds support for explicit disabling of vxlan mode via
> Northbound database.
>
> 1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068
>
> CC: Ihar Hrachyshka 
> Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
> Signed-off-by: Vladislav Odintsov 
> ---
>  NEWS  |  3 +++
>  northd/en-global-config.c |  7 ++-
>  northd/northd.c   | 10 --
>  northd/northd.h   |  3 ++-
>  ovn-architecture.7.xml|  6 ++
>  ovn-nb.xml| 10 ++
>  tests/ovn-northd.at   | 29 +
>  7 files changed, 64 insertions(+), 4 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 3b5e93dc9..43ab05a68 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -17,6 +17,9 @@ Post v24.03.0
>  external-ids, the option is no longer needed as it became effectively
>  "true" for all scenarios.
>- Added DHCPv4 relay support.
> +  - Added new global config option NB_Global:options:disable_vxlan_mode to
> +extend available tunnel IDs space for datapaths from 4095 to 16711680.
> +For more details see man ovn-nb(5) for mentioned option.
>
>  OVN v24.03.0 - 01 Mar 2024
>  --
> diff --git a/northd/en-global-config.c b/northd/en-global-config.c
> index 873649a89..f5e2a8154 100644
> --- a/northd/en-global-config.c
> +++ b/northd/en-global-config.c
> @@ -115,7 +115,7 @@ en_global_config_run(struct engine_node *node , void
> *data)
>   config_data->svc_monitor_mac);
>  }
>
> -init_vxlan_mode(sbrec_chassis_table);
> +init_vxlan_mode(>options, sbrec_chassis_table);
>  char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
>  smap_replace(options, "max_tunid", max_tunid);
>  free(max_tunid);
> @@ -533,6 +533,11 @@ check_nb_options_out_of_sync(const struct
> nbrec_nb_global *nb,
>  return true;
>  }
>
> +if (config_out_of_sync(>options, _data->nb_options,
> +   "disable_vxlan_mode", false)) {
> +return true;
> +}
> +
>  return false;
>  }
>
> diff --git a/northd/northd.c b/northd/northd.c
> index b54219a85..d1535172e 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -886,8 +886,14 @@ join_datapaths(const struct
> nbrec_logical_switch_table *nbrec_ls_table,
>  }
>
>  void
> -init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
> +init_vxlan_mode(const struct smap *nb_options,
> +const struct sbrec_chassis_table *sbrec_chassis_table)
>  {
> +if (smap_get_bool(nb_options, "disable_vxlan_mode", false)) {
> +vxlan_mode = false;
> +return;
> +}
> +
>  const struct sbrec_chassis *chassis;
>  SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
>  for (int i = 0; i < chassis->n_encaps; i++) {
> @@ -17593,7 +17599,7 @@ ovnnb_db_run(struct northd_input *input_data,
>  use_common_zone = smap_get_bool(input_data->nb_options,
> "use_common_zone",
>  false);
>
> -init_vxlan_mode(input_data->sbrec_chassis_table);
> +init_vxlan_mode(input_data->nb_options,
> input_data->sbrec_chassis_table);
>
>  build_datapaths(ovnsb_txn,
>  input_data->nbrec_logical_switch_table,
> diff --git a/northd/northd.h b/northd/northd.h
> index be480003e..d0322e621 100644
> --- a/northd/northd.h
> +++ b/northd/northd.h
> @@ -792,7 +792,8 @@ lr_has_multiple_gw_ports(const struct ovn_datapath *od)
>  }
>
>  void
> -init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table);
> +init_vxlan_mode(const struct smap *nb_options,
> +const struct sbrec_chassis_table *sbrec_chassis_table);
>
>  uint32_t get_ovn_max_dp_key_local(void);
>
> diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml
> index 7abb1fa83..251c9c514 100644
> --- a/ovn-architecture.7.xml
> +++ b/ovn-architecture.7.xml
> @@ -2919,4 +2919,10 @@
>  the 

Re: [ovs-dev] [PATCH ovn v4 1/2] northd: Make `vxlan_mode` a global variable.

2024-05-02 Thread Ihar Hrachyshka
On Thu, May 2, 2024 at 5:51 AM Vladislav Odintsov  wrote:

> This simplifies code and subsequent commit to explicitely disable vxlan
>

I personally find it debatable that moving from explicit dependency through
a function argument to implicit dependency through a global variable is a
simplification. But I will leave others to chime in.


> mode is based on these changes.
>
> Also `vxlan mode` term is introduced in ovn-architecture man page.
>

Should the mode name keep VXLAN capitalized?


>
> Signed-off-by: Vladislav Odintsov 
> ---
>  northd/en-global-config.c |  4 +-
>  northd/northd.c   | 94 ---
>  northd/northd.h   |  5 ++-
>  ovn-architecture.7.xml| 11 +++--
>  4 files changed, 50 insertions(+), 64 deletions(-)
>
> diff --git a/northd/en-global-config.c b/northd/en-global-config.c
> index 28c78a12c..873649a89 100644
> --- a/northd/en-global-config.c
> +++ b/northd/en-global-config.c
> @@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void
> *data)
>   config_data->svc_monitor_mac);
>  }
>
> -char *max_tunid = xasprintf("%d",
> -get_ovn_max_dp_key_local(sbrec_chassis_table));
> +init_vxlan_mode(sbrec_chassis_table);
> +char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
>  smap_replace(options, "max_tunid", max_tunid);
>  free(max_tunid);
>
> diff --git a/northd/northd.c b/northd/northd.c
> index 5e12fd1e8..b54219a85 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
>   */
>  static bool default_acl_drop;
>
> +/* If this option is 'true' northd will use limited 24-bit space for
> datapath
> + * and ports tunnel key allocation (12 bits for each instead of default
> 16). */
> +static bool vxlan_mode;
> +
>  #define MAX_OVN_TAGS 4096
>
>
> @@ -881,24 +885,25 @@ join_datapaths(const struct
> nbrec_logical_switch_table *nbrec_ls_table,
>  }
>  }
>
> -static bool
> -is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
> +void
> +init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
>  {
>  const struct sbrec_chassis *chassis;
>  SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
>  for (int i = 0; i < chassis->n_encaps; i++) {
>  if (!strcmp(chassis->encaps[i]->type, "vxlan")) {
> -return true;
> +vxlan_mode = true;
> +return;
>  }
>  }
>  }
> -return false;
> +vxlan_mode = false;
>  }
>
>  uint32_t
> -get_ovn_max_dp_key_local(const struct sbrec_chassis_table
> *sbrec_chassis_table)
> +get_ovn_max_dp_key_local(void)
>  {
> -if (is_vxlan_mode(sbrec_chassis_table)) {
> +if (vxlan_mode) {
>  /* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
>  return OVN_MAX_DP_VXLAN_KEY;
>  }
> @@ -906,15 +911,14 @@ get_ovn_max_dp_key_local(const struct
> sbrec_chassis_table *sbrec_chassis_table)
>  }
>
>  static void
> -ovn_datapath_allocate_key(const struct sbrec_chassis_table
> *sbrec_ch_table,
> -  struct hmap *datapaths, struct hmap *dp_tnlids,
> +ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
>struct ovn_datapath *od, uint32_t *hint)
>  {
>  if (!od->tunnel_key) {
>  od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "datapath",
> -OVN_MIN_DP_KEY_LOCAL,
> -
> get_ovn_max_dp_key_local(sbrec_ch_table),
> -hint);
> +OVN_MIN_DP_KEY_LOCAL,
> +get_ovn_max_dp_key_local(),
> +hint);
>  if (!od->tunnel_key) {
>  if (od->sb) {
>  sbrec_datapath_binding_delete(od->sb);
> @@ -927,7 +931,6 @@ ovn_datapath_allocate_key(const struct
> sbrec_chassis_table *sbrec_ch_table,
>
>  static void
>  ovn_datapath_assign_requested_tnl_id(
> -const struct sbrec_chassis_table *sbrec_chassis_table,
>  struct hmap *dp_tnlids, struct ovn_datapath *od)
>  {
>  const struct smap *other_config = (od->nbs
> @@ -936,8 +939,7 @@ ovn_datapath_assign_requested_tnl_id(
>  uint32_t tunnel_key = smap_get_int(other_config, "requested-tnl-key",
> 0);
>  if (tunnel_key) {
>  const char *interconn_ts = smap_get(other_config, "interconn-ts");
> -if (!interconn_ts && is_vxlan_mode(sbrec_chassis_table) &&
> -tunnel_key >= 1 << 12) {
> +if (!interconn_ts && vxlan_mode && tunnel_key >= 1 << 12) {
>  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
>  VLOG_WARN_RL(, "Tunnel key %"PRIu32" for datapath %s is "
>   "incompatible with VXLAN", tunnel_key,
> @@ -985,7 +987,6 @@ build_datapaths(struct ovsdb_idl_txn *ovnsb_txn,
>  const 

Re: [ovs-dev] [PATCH] netdev-tc-offloads: Don't offload header modification on ip fragments.

2024-05-02 Thread Finn, Emma
> -Original Message-
> From: Eelco Chaudron 
> Sent: Thursday, May 2, 2024 8:14 AM
> To: Finn, Emma 
> Cc: Simon Horman ; Stokes, Ian ;
> Van Haaren, Harry ; d...@openvswitch.org;
> Flavio Leitner ; Ilya Maximets 
> Subject: Re: [ovs-dev] [PATCH] netdev-tc-offloads: Don't offload header
> modification on ip fragments.
> 
> 
> 
> On 1 May 2024, at 15:36, Finn, Emma wrote:
> 
> >> -Original Message-
> >> From: Eelco Chaudron 
> >> Sent: Wednesday, May 1, 2024 1:52 PM
> >> To: Simon Horman 
> >> Cc: Finn, Emma ; Stokes, Ian
> >> ; sunil.pa...@intel.com; Van Haaren, Harry
> >> ; d...@openvswitch.org; Flavio Leitner
> >> ; Ilya Maximets 
> >> Subject: Re: [ovs-dev] [PATCH] netdev-tc-offloads: Don't offload
> >> header modification on ip fragments.
> >>
> >>
> >>
> >> On 1 May 2024, at 14:39, Simon Horman wrote:
> >>
> >>> On Tue, Apr 30, 2024 at 02:42:45PM +0200, Eelco Chaudron wrote:
>  Greetings, Intel team!
> 
>  The self-test conducted as part of this patch has revealed an issue
>  with the
> >> AVX512 checksumming code. Since it was agreed upon that your team
> >> would maintain this code upon its inclusion, could you please review
> >> the problem and provide a patch?
> 
>  Details on the problem can be found in this mail link:
> 
>  https://mail.openvswitch.org/pipermail/ovs-build/2024-
> >> April/038590.html
> >>>
> >>> Thanks Eelco,
> >>>
> >>> In light of the above, could you clarify your plans for this patch?
> >>
> >> I hope Intel keeps their promise and will have a patch out soon, so
> >> we can apply this patch.
> >>
> >> If not, I guess I can send out a patch to disable pedit acceleration
> >> as it’s broken, and then apply this patch.
> >>
> >> Intel are you looking into this?
> >>
> >> Cheers,
> >>
> >> Eelco
> >
> > Hi Folks,
> >
> > I'll look into this and try reproduce locally myself.
> > Will reach out when I have an update.
> 
> Thanks Emma, you can simply replicate this by building OVS with the --enable-
> actions-default-autovalidator configuration flag, then including the test in 
> this
> patch, and run make check-dpdk.
> 
> Cheers,
> 
> Eelco

Hi Eelco, 

I have identified the issue. Seems your unit test identified an issue in the 
AVX implementation for the IPv4
actions code where we are losing carry-over bits during the checksum 
calculation.
I have a fix but I need to do some more testing before I push. Due to some PTO 
this will likely be Tuesday. 

Thanks, 
Emma 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v4] controller: Track individual address set constants.

2024-05-02 Thread Ales Musil
Instead of tracking address set per struct expr_constant_set track it
per individual struct expr_constant. This allows more fine grained
control for I-P processing of address sets in controller. It helps with
scenarios like matching on two address sets in one expression e.g.
"ip4.src == {$as1, $as2}". This allows any addition or removal of
individual adress from the set to be incrementally processed instead
of reprocessing all the flows.

This unfortunately doesn't help with the following flows:
"ip4.src == $as1 && ip4.dst == $as2"
"ip4.src == $as1 || ip4.dst == $as2"

The memory impact should be minimal as there is only increase of 8 bytes
per the struct expr_constant.

Reported-at: https://issues.redhat.com/browse/FDP-509
Signed-off-by: Ales Musil 
---
v4: Rebase on top of current main.
Update the "lflow_handle_addr_set_update" comment according to suggestion 
from Han.
v3: Rebase on top of current main.
Address comments from Han:
- Adjust the comment for "lflow_handle_addr_set_update" to include remaning 
corner cases.
- Make sure that the flows are consistent between I-P and recompute.
v2: Rebase on top of current main.
Adjust the comment for I-P optimization.
---
 controller/lflow.c  | 11 ++---
 include/ovn/actions.h   |  2 +-
 include/ovn/expr.h  | 46 ++-
 lib/actions.c   | 20 -
 lib/expr.c  | 99 +
 tests/ovn-controller.at | 79 +---
 6 files changed, 154 insertions(+), 103 deletions(-)

diff --git a/controller/lflow.c b/controller/lflow.c
index 760ec0b41..1e05665a1 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -278,7 +278,7 @@ lflow_handle_changed_flows(struct lflow_ctx_in *l_ctx_in,
 }
 
 static bool
-as_info_from_expr_const(const char *as_name, const union expr_constant *c,
+as_info_from_expr_const(const char *as_name, const struct expr_constant *c,
 struct addrset_info *as_info)
 {
 as_info->name = as_name;
@@ -644,14 +644,11 @@ as_update_can_be_handled(const char *as_name, struct 
addr_set_diff *as_diff,
  *generated.
  *
  *  - The sub expression of the address set is combined with other sub-
- *expressions/constants, usually because of disjunctions between
- *sub-expressions/constants, e.g.:
+ *expressions/constants on different fields, e.g.:
  *
  *  ip.src == $as1 || ip.dst == $as2
- *  ip.src == {$as1, $as2}
- *  ip.src == {$as1, ip1}
  *
- *All these could have been split into separate lflows.
+ *This could have been split into separate lflows.
  *
  *  - Conjunctions overlapping between lflows, which can be caused by
  *overlapping address sets or same address set used by multiple lflows
@@ -714,7 +711,7 @@ lflow_handle_addr_set_update(const char *as_name,
 if (as_diff->deleted) {
 struct addrset_info as_info;
 for (size_t i = 0; i < as_diff->deleted->n_values; i++) {
-union expr_constant *c = _diff->deleted->values[i];
+struct expr_constant *c = _diff->deleted->values[i];
 if (!as_info_from_expr_const(as_name, c, _info)) {
 continue;
 }
diff --git a/include/ovn/actions.h b/include/ovn/actions.h
index ae0864fdd..88cf4de79 100644
--- a/include/ovn/actions.h
+++ b/include/ovn/actions.h
@@ -241,7 +241,7 @@ struct ovnact_next {
 struct ovnact_load {
 struct ovnact ovnact;
 struct expr_field dst;
-union expr_constant imm;
+struct expr_constant imm;
 };
 
 /* OVNACT_MOVE, OVNACT_EXCHANGE. */
diff --git a/include/ovn/expr.h b/include/ovn/expr.h
index c48f82398..e54edb5bf 100644
--- a/include/ovn/expr.h
+++ b/include/ovn/expr.h
@@ -368,7 +368,7 @@ bool expr_relop_from_token(enum lex_type type, enum 
expr_relop *relop);
 struct expr {
 struct ovs_list node;   /* In parent EXPR_T_AND or EXPR_T_OR if any. */
 enum expr_type type;/* Expression type. */
-char *as_name;  /* Address set name. Null if it is not an
+const char *as_name;/* Address set name. Null if it is not an
address set. */
 
 union {
@@ -505,40 +505,42 @@ enum expr_constant_type {
 };
 
 /* A string or integer constant (one must know which from context). */
-union expr_constant {
-/* Integer constant.
- *
- * The width of a constant isn't always clear, e.g. if you write "1",
- * there's no way to tell whether you mean for that to be a 1-bit constant
- * or a 128-bit constant or somewhere in between. */
-struct {
-union mf_subvalue value;
-union mf_subvalue mask; /* Only initialized if 'masked'. */
-bool masked;
-
-enum lex_format format; /* From the constant's lex_token. */
-};
+struct expr_constant {
+const char *as_name;
 
-/* Null-terminated string constant. */
-char *string;
+

[ovs-dev] [PATCH v2] hash, jhash: Fix unaligned access to the hash remainder.

2024-05-02 Thread Ales Musil
The pointer was passed to memcpy as uin32_t *, however the hash bytes
might be unaligned at that point. Case it to uint8_t * instead
which has only single byte alignment requirement. This seems to be
a false positive reported by clang [0].

lib/hash.c:46:22: runtime error: load of misaligned address
0x50700065 for type 'const uint32_t *' (aka 'const unsigned int *'),
which requires 4 byte alignment
0x50700065: note: pointer points here
 73 62 2e 73 6f 63 6b  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 ^
  00 00 00 00 00 00 00 00  00
0 0x6191cb in hash_bytes ovs/lib/hash.c:46:9
1 0x69d064 in hash_string ovs/lib/hash.h:404:12
2 0x69d064 in hash_name ovs/lib/shash.c:29:12
3 0x69d064 in shash_find ovs/lib/shash.c:237:49
4 0x69dada in shash_find_data ovs/lib/shash.c:251:31
5 0x507987 in add_remote ovs/ovsdb/ovsdb-server.c:1382:15
6 0x507987 in parse_options ovs/ovsdb/ovsdb-server.c:2659:13
7 0x507987 in main ovs/ovsdb/ovsdb-server.c:751:5

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/hash.c:46:22

[0] https://github.com/llvm/llvm-project/issues/90848
Signed-off-by: Ales Musil 
---
 lib/hash.c  | 2 +-
 lib/jhash.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/hash.c b/lib/hash.c
index c722f3c3c..986fa6643 100644
--- a/lib/hash.c
+++ b/lib/hash.c
@@ -43,7 +43,7 @@ hash_bytes(const void *p_, size_t n, uint32_t basis)
 if (n) {
 uint32_t tmp = 0;
 
-memcpy(, p, n);
+memcpy(, (const uint8_t *) p, n);
 hash = hash_add(hash, tmp);
 }
 
diff --git a/lib/jhash.c b/lib/jhash.c
index c59b51b61..0a0628589 100644
--- a/lib/jhash.c
+++ b/lib/jhash.c
@@ -114,7 +114,7 @@ jhash_bytes(const void *p_, size_t n, uint32_t basis)
 uint32_t tmp[3];
 
 tmp[0] = tmp[1] = tmp[2] = 0;
-memcpy(tmp, p, n);
+memcpy(tmp, (const uint8_t *) p, n);
 a += tmp[0];
 b += tmp[1];
 c += tmp[2];
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3] controller: Track individual address set constants.

2024-05-02 Thread Ales Musil
On Wed, May 1, 2024 at 6:38 PM Han Zhou  wrote:

>
>
> On Tue, Apr 30, 2024 at 9:56 AM Ales Musil  wrote:
> >
> > Instead of tracking address set per struct expr_constant_set track it
> > per individual struct expr_constant. This allows more fine grained
> > control for I-P processing of address sets in controller. It helps with
> > scenarios like matching on two address sets in one expression e.g.
> > "ip4.src == {$as1, $as2}". This allows any addition or removal of
> > individual adress from the set to be incrementally processed instead
> > of reprocessing all the flows.
> >
> > This unfortunately doesn't help with the following flows:
> > "ip4.src == $as1 && ip4.dst == $as2"
> > "ip4.src == $as1 || ip4.dst == $as2"
> >
> > The memory impact should be minimal as there is only increase of 8 bytes
> > per the struct expr_constant.
> >
> > Reported-at: https://issues.redhat.com/browse/FDP-509
> > Signed-off-by: Ales Musil 
> > ---
> > v3: Rebase on top of current main.
> > Address comments from Han:
> > - Adjust the comment for "lflow_handle_addr_set_update" to include
> remaning corner cases.
> > - Make sure that the flows are consistent between I-P and recompute.
> > v2: Rebase on top of current main.
> > Adjust the comment for I-P optimization.
> > ---
> >  controller/lflow.c  |  7 ++-
> >  include/ovn/actions.h   |  2 +-
> >  include/ovn/expr.h  | 46 ++-
> >  lib/actions.c   | 20 -
> >  lib/expr.c  | 99 +
> >  tests/ovn-controller.at | 79 +---
> >  6 files changed, 153 insertions(+), 100 deletions(-)
> >
> > diff --git a/controller/lflow.c b/controller/lflow.c
> > index 760ec0b41..06e839cbe 100644
> > --- a/controller/lflow.c
> > +++ b/controller/lflow.c
> > @@ -278,7 +278,7 @@ lflow_handle_changed_flows(struct lflow_ctx_in
> *l_ctx_in,
> >  }
> >
> >  static bool
> > -as_info_from_expr_const(const char *as_name, const union expr_constant
> *c,
> > +as_info_from_expr_const(const char *as_name, const struct expr_constant
> *c,
> >  struct addrset_info *as_info)
> >  {
> >  as_info->name = as_name;
> > @@ -647,9 +647,8 @@ as_update_can_be_handled(const char *as_name, struct
> addr_set_diff *as_diff,
> >   *expressions/constants, usually because of disjunctions between
> >   *sub-expressions/constants, e.g.:
> >   *
> > + *  ip.src == $as1 && ip.dst == $as2
> >   *  ip.src == $as1 || ip.dst == $as2
> > - *  ip.src == {$as1, $as2}
> > - *  ip.src == {$as1, ip1}
> >   *
> >   *All these could have been split into separate lflows.
>
> Hi Ales, thanks for v3.
>

Hi Han,


> I checked again and wondered why you mentioned that "ip.src == $as1 &&
> ip.dst == $as2" is not supported. This expression would generate
> conjunctions, which works with I-P before your change and still works. Did
> I miss anything?
>

yeah my bad, I was focused on this patch rather than what is supported
overall.


>
> In addition, since the constraints are relaxed after your change, I'd also
> update the above comments a little more, something like:
>
>*  - The sub expression of the address set is combined with other
> sub-
>*expressions/constants on different fields, e.g.:
>
>
>*
>
>
>
>
>*  ip.src == $as1 || ip.dst == $as2
>
>*
>
>*This could have been split into separate lflows.
>
>
> What do you think?
>

Sounds good, I'll post v4 with this update.


>
> Thanks,
> Han
>
> >   *
> > @@ -714,7 +713,7 @@ lflow_handle_addr_set_update(const char *as_name,
> >  if (as_diff->deleted) {
> >  struct addrset_info as_info;
> >  for (size_t i = 0; i < as_diff->deleted->n_values; i++) {
> > -union expr_constant *c = _diff->deleted->values[i];
> > +struct expr_constant *c = _diff->deleted->values[i];
> >  if (!as_info_from_expr_const(as_name, c, _info)) {
> >  continue;
> >  }
> > diff --git a/include/ovn/actions.h b/include/ovn/actions.h
> > index ae0864fdd..88cf4de79 100644
> > --- a/include/ovn/actions.h
> > +++ b/include/ovn/actions.h
> > @@ -241,7 +241,7 @@ struct ovnact_next {
> >  struct ovnact_load {
> >  struct ovnact ovnact;
> >  struct expr_field dst;
> > -union expr_constant imm;
> > +struct expr_constant imm;
> >  };
> >
> >  /* OVNACT_MOVE, OVNACT_EXCHANGE. */
> > diff --git a/include/ovn/expr.h b/include/ovn/expr.h
> > index c48f82398..e54edb5bf 100644
> > --- a/include/ovn/expr.h
> > +++ b/include/ovn/expr.h
> > @@ -368,7 +368,7 @@ bool expr_relop_from_token(enum lex_type type, enum
> expr_relop *relop);
> >  struct expr {
> >  struct ovs_list node;   /* In parent EXPR_T_AND or EXPR_T_OR if
> any. */
> >  enum expr_type type;/* Expression type. */
> > -char *as_name;  /* Address set name. Null if it 

Re: [ovs-dev] [PATCH] netdev-tc-offloads: Don't offload header modification on ip fragments.

2024-05-02 Thread Ilya Maximets
On 4/29/24 16:48, Eelco Chaudron wrote:
> While offloading header modifications to TC, OVS is using {TCA_PEDIT} +
> {TCA_CSUM} combination as that it the only way to represent header
> rewrite.  However, {TCA_CSUM} is unable to calculate L4 checksums for
> IP fragments.
> 
> Since TC already applies fragmentation bit masking, this patch simply
> needs to prevent these packets from being processed through TC.
> 
> Signed-off-by: Eelco Chaudron 
> ---

Thanks, Eelco.  Beside the avx512 failure, see a few comments below.

Should this also have a Fixes tag?

>  lib/netdev-offload-tc.c | 32 +
>  tests/system-traffic.at | 62 +
>  2 files changed, 94 insertions(+)
> 
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 921d52317..7e915d419 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -1488,6 +1488,30 @@ parse_put_flow_ct_action(struct tc_flower *flower,
>  return 0;
>  }
>  
> +static bool
> +will_tc_add_l4_checksum(struct tc_flower *flower, int type)
> +{
> +/* This function returns true if the tc layer will add a l4 checksum 
> action
> + * for this set action. Refer to the csum_update_flag() function for
> + * detailed logic. Note that even the kernel only supports updating TCP,
> + * UDP and ICMPv6. */

Nit: double spaces between sentences.

We should add a similar comment to the csum_update_flag() prompting
anyone changing this function to update this one as well.

Ideally we would just fail in csum_update_flag() itself to avoid the
logic duplication in different places.  Can it be done or is it a
performance concern?

> +switch (type) {
> +case OVS_KEY_ATTR_IPV4:
> +case OVS_KEY_ATTR_IPV6:
> +case OVS_KEY_ATTR_TCP:
> +case OVS_KEY_ATTR_UDP:
> +switch (flower->key.ip_proto) {
> +case IPPROTO_TCP:
> +case IPPROTO_UDP:
> +case IPPROTO_ICMPV6:
> +case IPPROTO_UDPLITE:
> +return true;
> +}
> +break;
> +}
> +return false;
> +}
> +
>  static int
>  parse_put_flow_set_masked_action(struct tc_flower *flower,
>   struct tc_action *action,
> @@ -1520,6 +1544,14 @@ parse_put_flow_set_masked_action(struct tc_flower 
> *flower,
>  return EOPNOTSUPP;
>  }
>  
> +if (flower->key.flags & TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT
> +&& will_tc_add_l4_checksum(flower, type)) {
> +VLOG_DBG_RL(, "set action type %d not supported on fragments "
> +"due to checksum limitation", type);
> +ofpbuf_uninit(_buf);
> +return EOPNOTSUPP;
> +}
> +
>  for (i = 0; i < ARRAY_SIZE(set_flower_map[type]); i++) {
>  struct netlink_field *f = _flower_map[type][i];
>  
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> index bd7647cbe..d06d2f66a 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -8977,3 +8977,65 @@ OVS_WAIT_UNTIL([cat p2.pcap | grep -E "0x0050: * 
> * *5002 *2000 *b85e *00
>  
>  OVS_TRAFFIC_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +

One line is enough.

> +AT_SETUP([datapath - IP mod_nw_src/set_field on fragments])

Should 'IP' be glued to 'fragments' instead?

> +AT_SKIP_IF([test $HAVE_TCPDUMP = no])
> +OVS_TRAFFIC_VSWITCHD_START()
> +
> +ADD_NAMESPACES(at_ns0, at_ns1)
> +
> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", 36:b1:ee:7c:01:03)
> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24", 36:b1:ee:7c:01:02)
> +
> +AT_DATA([flows.txt], [dnl
> +  in_port=ovs-p0,ip,nw_src=10.1.1.1 actions=mod_nw_src=11.1.1.1,ovs-p1
> +  in_port=ovs-p0,ipv6,ipv6_src=fc00::1 
> actions=set_field:fc00::100->ipv6_src,ovs-p1
> +])
> +
> +AT_CHECK([ovs-ofctl del-flows br0])
> +AT_CHECK([ovs-ofctl -Oopenflow13 add-flows br0 flows.txt])
> +
> +NETNS_DAEMONIZE([at_ns1], [tcpdump -l -nn -xx -U -i p1 > p1.out],
> +  [tcpdump.pid])
> +sleep 1

Instead of sleeping we should wait for tcpdump to start listening.
Look for the following check in some other tests:

OVS_WAIT_UNTIL([grep "listening" tcpdump0_err])

> +
> +dnl We send each packet multiple times, ones for learning which will flow
> +dnl through the used datapath for learning, and the others will go through 
> the
> +dnl actuall datapath.

typo: actual

> +for i in 1 2 3 4 5; do
> +  NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 \
> +36 b1 ee 7c 01 02 36 b1 ee 7c 01 03 08 00 45 00 00 26 00 01 20 00 40 11 \
> +44 c2 0a 01 01 01 0a 01 01 02 0b c4 08 84 00 26 e9 64 01 02 03 04 05 06 \
> +07 08 09 0a > /dev/null])

Ideally we would use compose-packet for this, but it can't generate such
packets today, so it's fine to just plain-code the packet.  But, please,
add a description on what this packet is.  It's very inconvenient to
copy-paste this into some packet decoder.

> +done
> +
> +OVS_WAIT_UNTIL([test $(grep -c -E \
> +  "0x:  *36b1 *ee7c *0102 *36b1 *ee7c *0103 *0800 *4500" p1.out) -eq 5])
> +OVS_WAIT_UNTIL([test 

Re: [ovs-dev] [PATCH] hash, jhash: Fix unaligned access to the hash remainder.

2024-05-02 Thread Ales Musil
On Thu, May 2, 2024 at 1:22 PM Ilya Maximets  wrote:

> On 5/2/24 12:22, Ales Musil wrote:
> > The has was passed to memcpy as uin32_t *, however the hash bytes
>
> 'The has was passed' ? :)
>

Oops :)


>
> > might be unaligned at that point. Case it to uint8_t * instead
> > which has only single byte alignment requirement.
> >
> > lib/hash.c:46:22: runtime error: load of misaligned address
> 0x50700065 for type 'const uint32_t *' (aka 'const unsigned int *'),
> which requires 4 byte alignment
> > 0x50700065: note: pointer points here
> >  73 62 2e 73 6f 63 6b  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00  00
> >  ^
>
> Please, wrap these lines.
>

Ack


>
> > #0 0x6191cb in hash_bytes /workspace/ovn/ovs/lib/hash.c:46:9
> > #1 0x69d064 in hash_string /workspace/ovn/ovs/lib/hash.h:404:12
> > #2 0x69d064 in hash_name /workspace/ovn/ovs/lib/shash.c:29:12
> > #3 0x69d064 in shash_find /workspace/ovn/ovs/lib/shash.c:237:49
> > #4 0x69dada in shash_find_data /workspace/ovn/ovs/lib/shash.c:251:31
> > #5 0x507987 in add_remote
> /workspace/ovn/ovs/ovsdb/ovsdb-server.c:1382:15
> > #6 0x507987 in parse_options
> /workspace/ovn/ovs/ovsdb/ovsdb-server.c:2659:13
> > #7 0x507987 in main /workspace/ovn/ovs/ovsdb/ovsdb-server.c:751:5
> > #8 0x7f47e3997087 in __libc_start_call_main
> (/lib64/libc.so.6+0x2a087) (BuildId:
> b098f1c75a76548bb230d8f551eae07a2aeccf06)
> > #9 0x7f47e399714a in __libc_start_main@GLIBC_2.2.5
> (/lib64/libc.so.6+0x2a14a) (BuildId:
> b098f1c75a76548bb230d8f551eae07a2aeccf06)
> > #10 0x42de64 in _start
> (/workspace/ovn/ovs/ovsdb/ovsdb-server+0x42de64) (BuildId:
> 6c3f4e311556b29f84c9c4a5d6df5114dc08a12e)
> >
>
> Please, remove the '#' signs as github misinterprets them as PR/issue
> reference.  And, please, remove the unnecessary info from the trace,
> e.g. BuildId, '/workspace/ovn/' part of the paths and maybe some other
> parts of the base libc frames.
>

Ack


>
> > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/hash.c:46:22
> >
> > Signed-off-by: Ales Musil 
> > ---
> >  lib/hash.c  | 2 +-
> >  lib/jhash.c | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/hash.c b/lib/hash.c
> > index c722f3c3c..986fa6643 100644
> > --- a/lib/hash.c
> > +++ b/lib/hash.c
> > @@ -43,7 +43,7 @@ hash_bytes(const void *p_, size_t n, uint32_t basis)
> >  if (n) {
> >  uint32_t tmp = 0;
> >
> > -memcpy(, p, n);
> > +memcpy(, (const uint8_t *) p, n);
>
> We may accept the change, however, this looks more like a compiler
> bug to me.  memcpy() accepts void pointers, so there is already an
> implicit cast.  I didn't look into assembly, but I'd guess clang
> inlines the call and while doing that assumes the type.  I'm not
> sure it is allowed to do that.  Also, the 'n' here is always less
> than 4, so alignment should not be a problem because we can't copy
> the whole thing in a single aligned instruction (maybe there are
> instructions that can copy just 3 bytes without touching the 4th,
> but idk).
>
> Did you have a look at the asm by any chance?
>
>
As discussed offline, it doesn't inline and does the function call instead.


>
> Best regards, Ilya Maximets.
>
>
Thanks,
Ales

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.com

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] hash, jhash: Fix unaligned access to the hash remainder.

2024-05-02 Thread Ilya Maximets
On 5/2/24 12:22, Ales Musil wrote:
> The has was passed to memcpy as uin32_t *, however the hash bytes

'The has was passed' ? :)

> might be unaligned at that point. Case it to uint8_t * instead
> which has only single byte alignment requirement.
> 
> lib/hash.c:46:22: runtime error: load of misaligned address 0x50700065 
> for type 'const uint32_t *' (aka 'const unsigned int *'), which requires 4 
> byte alignment
> 0x50700065: note: pointer points here
>  73 62 2e 73 6f 63 6b  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 
> 00 00 00 00 00 00 00  00
>  ^

Please, wrap these lines.

> #0 0x6191cb in hash_bytes /workspace/ovn/ovs/lib/hash.c:46:9
> #1 0x69d064 in hash_string /workspace/ovn/ovs/lib/hash.h:404:12
> #2 0x69d064 in hash_name /workspace/ovn/ovs/lib/shash.c:29:12
> #3 0x69d064 in shash_find /workspace/ovn/ovs/lib/shash.c:237:49
> #4 0x69dada in shash_find_data /workspace/ovn/ovs/lib/shash.c:251:31
> #5 0x507987 in add_remote /workspace/ovn/ovs/ovsdb/ovsdb-server.c:1382:15
> #6 0x507987 in parse_options 
> /workspace/ovn/ovs/ovsdb/ovsdb-server.c:2659:13
> #7 0x507987 in main /workspace/ovn/ovs/ovsdb/ovsdb-server.c:751:5
> #8 0x7f47e3997087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) 
> (BuildId: b098f1c75a76548bb230d8f551eae07a2aeccf06)
> #9 0x7f47e399714a in __libc_start_main@GLIBC_2.2.5 
> (/lib64/libc.so.6+0x2a14a) (BuildId: b098f1c75a76548bb230d8f551eae07a2aeccf06)
> #10 0x42de64 in _start (/workspace/ovn/ovs/ovsdb/ovsdb-server+0x42de64) 
> (BuildId: 6c3f4e311556b29f84c9c4a5d6df5114dc08a12e)
> 

Please, remove the '#' signs as github misinterprets them as PR/issue
reference.  And, please, remove the unnecessary info from the trace,
e.g. BuildId, '/workspace/ovn/' part of the paths and maybe some other
parts of the base libc frames.

> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/hash.c:46:22
> 
> Signed-off-by: Ales Musil 
> ---
>  lib/hash.c  | 2 +-
>  lib/jhash.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/hash.c b/lib/hash.c
> index c722f3c3c..986fa6643 100644
> --- a/lib/hash.c
> +++ b/lib/hash.c
> @@ -43,7 +43,7 @@ hash_bytes(const void *p_, size_t n, uint32_t basis)
>  if (n) {
>  uint32_t tmp = 0;
>  
> -memcpy(, p, n);
> +memcpy(, (const uint8_t *) p, n);

We may accept the change, however, this looks more like a compiler
bug to me.  memcpy() accepts void pointers, so there is already an
implicit cast.  I didn't look into assembly, but I'd guess clang
inlines the call and while doing that assumes the type.  I'm not
sure it is allowed to do that.  Also, the 'n' here is always less
than 4, so alignment should not be a problem because we can't copy
the whole thing in a single aligned instruction (maybe there are
instructions that can copy just 3 bytes without touching the 4th,
but idk).

Did you have a look at the asm by any chance?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] sparse: Add additional define for sparse on GCC >= 14.

2024-05-02 Thread Ales Musil
GCC 14 renamed one of the AVX512 defines to have only single
underscore instead of two [0]. Add the single underscore define to
keep compatibility with multiple GCC versions.

[0] 
https://github.com/gcc-mirror/gcc/commit/aea8e4105553cd16799f2134d15420ccf182d732
Tested-by: Dumitru Ceara 
Signed-off-by: Ales Musil 
---
 include/sparse/immintrin.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/sparse/immintrin.h b/include/sparse/immintrin.h
index dd742be9f..36b41d352 100644
--- a/include/sparse/immintrin.h
+++ b/include/sparse/immintrin.h
@@ -26,5 +26,9 @@
 #define _KEYLOCKERINTRIN_H_INCLUDED
 #define __AVX512FP16INTRIN_H_INCLUDED
 #define __AVX512FP16VLINTRIN_H_INCLUDED
+/* GCC >=14 changed the "__AVX512FP16INTRIN_H_INCLUDED" to have only single
+ * underscore. We need both to keep compatibility between various GCC
+ * versions. */
+#define _AVX512FP16INTRIN_H_INCLUDED
 
 #include_next 
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] hash, jhash: Fix unaligned access to the hash remainder.

2024-05-02 Thread Ales Musil
The has was passed to memcpy as uin32_t *, however the hash bytes
might be unaligned at that point. Case it to uint8_t * instead
which has only single byte alignment requirement.

lib/hash.c:46:22: runtime error: load of misaligned address 0x50700065 for 
type 'const uint32_t *' (aka 'const unsigned int *'), which requires 4 byte 
alignment
0x50700065: note: pointer points here
 73 62 2e 73 6f 63 6b  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 
00 00 00 00 00 00  00
 ^
#0 0x6191cb in hash_bytes /workspace/ovn/ovs/lib/hash.c:46:9
#1 0x69d064 in hash_string /workspace/ovn/ovs/lib/hash.h:404:12
#2 0x69d064 in hash_name /workspace/ovn/ovs/lib/shash.c:29:12
#3 0x69d064 in shash_find /workspace/ovn/ovs/lib/shash.c:237:49
#4 0x69dada in shash_find_data /workspace/ovn/ovs/lib/shash.c:251:31
#5 0x507987 in add_remote /workspace/ovn/ovs/ovsdb/ovsdb-server.c:1382:15
#6 0x507987 in parse_options /workspace/ovn/ovs/ovsdb/ovsdb-server.c:2659:13
#7 0x507987 in main /workspace/ovn/ovs/ovsdb/ovsdb-server.c:751:5
#8 0x7f47e3997087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) 
(BuildId: b098f1c75a76548bb230d8f551eae07a2aeccf06)
#9 0x7f47e399714a in __libc_start_main@GLIBC_2.2.5 
(/lib64/libc.so.6+0x2a14a) (BuildId: b098f1c75a76548bb230d8f551eae07a2aeccf06)
#10 0x42de64 in _start (/workspace/ovn/ovs/ovsdb/ovsdb-server+0x42de64) 
(BuildId: 6c3f4e311556b29f84c9c4a5d6df5114dc08a12e)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/hash.c:46:22

Signed-off-by: Ales Musil 
---
 lib/hash.c  | 2 +-
 lib/jhash.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/hash.c b/lib/hash.c
index c722f3c3c..986fa6643 100644
--- a/lib/hash.c
+++ b/lib/hash.c
@@ -43,7 +43,7 @@ hash_bytes(const void *p_, size_t n, uint32_t basis)
 if (n) {
 uint32_t tmp = 0;
 
-memcpy(, p, n);
+memcpy(, (const uint8_t *) p, n);
 hash = hash_add(hash, tmp);
 }
 
diff --git a/lib/jhash.c b/lib/jhash.c
index c59b51b61..0a0628589 100644
--- a/lib/jhash.c
+++ b/lib/jhash.c
@@ -114,7 +114,7 @@ jhash_bytes(const void *p_, size_t n, uint32_t basis)
 uint32_t tmp[3];
 
 tmp[0] = tmp[1] = tmp[2] = 0;
-memcpy(tmp, p, n);
+memcpy(tmp, (const uint8_t *) p, n);
 a += tmp[0];
 b += tmp[1];
 c += tmp[2];
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] utilities: Correct deletion reason in flow_reval_monitor.py.

2024-05-02 Thread Eelco Chaudron
The flow_reval_monitor.py script incorrectly reported the reasons for
FDR_PURGE and FDR_TOO_EXPENSIVE, as their descriptions were swapped.
This patch rectifies the order.

Fixes: 86b9e653ef22 ("revalidator: Add a USDT probe during flow deletion with 
purge reason.")
Signed-off-by: Eelco Chaudron 
---
 utilities/usdt-scripts/flow_reval_monitor.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/utilities/usdt-scripts/flow_reval_monitor.py 
b/utilities/usdt-scripts/flow_reval_monitor.py
index 534ba8fa2..117f5bc27 100755
--- a/utilities/usdt-scripts/flow_reval_monitor.py
+++ b/utilities/usdt-scripts/flow_reval_monitor.py
@@ -262,8 +262,8 @@ FdrReasonStrings = [
 "Kill all flows condition detected",
 "Mask too wide - need narrower match",
 "No matching ofproto rules",
-"Too expensive to revalidate",
 "Purged with user action",
+"Too expensive to revalidate",
 "Flow state inconsistent after updates",
 "Flow translation error",
 ]
-- 
2.43.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v4 1/2] northd: Make `vxlan_mode` a global variable.

2024-05-02 Thread Vladislav Odintsov
This simplifies code and subsequent commit to explicitely disable vxlan
mode is based on these changes.

Also `vxlan mode` term is introduced in ovn-architecture man page.

Signed-off-by: Vladislav Odintsov 
---
 northd/en-global-config.c |  4 +-
 northd/northd.c   | 94 ---
 northd/northd.h   |  5 ++-
 ovn-architecture.7.xml| 11 +++--
 4 files changed, 50 insertions(+), 64 deletions(-)

diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 28c78a12c..873649a89 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void *data)
  config_data->svc_monitor_mac);
 }
 
-char *max_tunid = xasprintf("%d",
-get_ovn_max_dp_key_local(sbrec_chassis_table));
+init_vxlan_mode(sbrec_chassis_table);
+char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
 smap_replace(options, "max_tunid", max_tunid);
 free(max_tunid);
 
diff --git a/northd/northd.c b/northd/northd.c
index 5e12fd1e8..b54219a85 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
  */
 static bool default_acl_drop;
 
+/* If this option is 'true' northd will use limited 24-bit space for datapath
+ * and ports tunnel key allocation (12 bits for each instead of default 16). */
+static bool vxlan_mode;
+
 #define MAX_OVN_TAGS 4096
 
 
@@ -881,24 +885,25 @@ join_datapaths(const struct nbrec_logical_switch_table 
*nbrec_ls_table,
 }
 }
 
-static bool
-is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+void
+init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
 {
 const struct sbrec_chassis *chassis;
 SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
 for (int i = 0; i < chassis->n_encaps; i++) {
 if (!strcmp(chassis->encaps[i]->type, "vxlan")) {
-return true;
+vxlan_mode = true;
+return;
 }
 }
 }
-return false;
+vxlan_mode = false;
 }
 
 uint32_t
-get_ovn_max_dp_key_local(const struct sbrec_chassis_table *sbrec_chassis_table)
+get_ovn_max_dp_key_local(void)
 {
-if (is_vxlan_mode(sbrec_chassis_table)) {
+if (vxlan_mode) {
 /* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
 return OVN_MAX_DP_VXLAN_KEY;
 }
@@ -906,15 +911,14 @@ get_ovn_max_dp_key_local(const struct sbrec_chassis_table 
*sbrec_chassis_table)
 }
 
 static void
-ovn_datapath_allocate_key(const struct sbrec_chassis_table *sbrec_ch_table,
-  struct hmap *datapaths, struct hmap *dp_tnlids,
+ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
   struct ovn_datapath *od, uint32_t *hint)
 {
 if (!od->tunnel_key) {
 od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "datapath",
-OVN_MIN_DP_KEY_LOCAL,
-get_ovn_max_dp_key_local(sbrec_ch_table),
-hint);
+OVN_MIN_DP_KEY_LOCAL,
+get_ovn_max_dp_key_local(),
+hint);
 if (!od->tunnel_key) {
 if (od->sb) {
 sbrec_datapath_binding_delete(od->sb);
@@ -927,7 +931,6 @@ ovn_datapath_allocate_key(const struct sbrec_chassis_table 
*sbrec_ch_table,
 
 static void
 ovn_datapath_assign_requested_tnl_id(
-const struct sbrec_chassis_table *sbrec_chassis_table,
 struct hmap *dp_tnlids, struct ovn_datapath *od)
 {
 const struct smap *other_config = (od->nbs
@@ -936,8 +939,7 @@ ovn_datapath_assign_requested_tnl_id(
 uint32_t tunnel_key = smap_get_int(other_config, "requested-tnl-key", 0);
 if (tunnel_key) {
 const char *interconn_ts = smap_get(other_config, "interconn-ts");
-if (!interconn_ts && is_vxlan_mode(sbrec_chassis_table) &&
-tunnel_key >= 1 << 12) {
+if (!interconn_ts && vxlan_mode && tunnel_key >= 1 << 12) {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
 VLOG_WARN_RL(, "Tunnel key %"PRIu32" for datapath %s is "
  "incompatible with VXLAN", tunnel_key,
@@ -985,7 +987,6 @@ build_datapaths(struct ovsdb_idl_txn *ovnsb_txn,
 const struct nbrec_logical_switch_table *nbrec_ls_table,
 const struct nbrec_logical_router_table *nbrec_lr_table,
 const struct sbrec_datapath_binding_table *sbrec_dp_table,
-const struct sbrec_chassis_table *sbrec_chassis_table,
 struct ovn_datapaths *ls_datapaths,
 struct ovn_datapaths *lr_datapaths,
 struct ovs_list *lr_list)
@@ -1000,12 +1001,11 @@ build_datapaths(struct ovsdb_idl_txn *ovnsb_txn,
 struct 

[ovs-dev] [PATCH ovn v4 2/2] northd: Add support for disabling vxlan mode.

2024-05-02 Thread Vladislav Odintsov
Commit [1] introduced a "vxlan mode" concept.  It brought a limitation
for available tunnel IDs because of lack of space in VXLAN VNI.
In vxlan mode OVN is limited by 4095 datapaths (LRs or non-transit LSs)
and 2047 logical switch ports per datapath.

Prior to this patch vxlan mode was enabled automatically if at least one
chassis had encap of vxlan type.  In scenarios where one want to use VXLAN
only for HW VTEP (RAMP) switch, such limitation makes no sence.

This patch adds support for explicit disabling of vxlan mode via
Northbound database.

1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068

CC: Ihar Hrachyshka 
Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Vladislav Odintsov 
---
 NEWS  |  3 +++
 northd/en-global-config.c |  7 ++-
 northd/northd.c   | 10 --
 northd/northd.h   |  3 ++-
 ovn-architecture.7.xml|  6 ++
 ovn-nb.xml| 10 ++
 tests/ovn-northd.at   | 29 +
 7 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 3b5e93dc9..43ab05a68 100644
--- a/NEWS
+++ b/NEWS
@@ -17,6 +17,9 @@ Post v24.03.0
 external-ids, the option is no longer needed as it became effectively
 "true" for all scenarios.
   - Added DHCPv4 relay support.
+  - Added new global config option NB_Global:options:disable_vxlan_mode to
+extend available tunnel IDs space for datapaths from 4095 to 16711680.
+For more details see man ovn-nb(5) for mentioned option.
 
 OVN v24.03.0 - 01 Mar 2024
 --
diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 873649a89..f5e2a8154 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -115,7 +115,7 @@ en_global_config_run(struct engine_node *node , void *data)
  config_data->svc_monitor_mac);
 }
 
-init_vxlan_mode(sbrec_chassis_table);
+init_vxlan_mode(>options, sbrec_chassis_table);
 char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
 smap_replace(options, "max_tunid", max_tunid);
 free(max_tunid);
@@ -533,6 +533,11 @@ check_nb_options_out_of_sync(const struct nbrec_nb_global 
*nb,
 return true;
 }
 
+if (config_out_of_sync(>options, _data->nb_options,
+   "disable_vxlan_mode", false)) {
+return true;
+}
+
 return false;
 }
 
diff --git a/northd/northd.c b/northd/northd.c
index b54219a85..d1535172e 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -886,8 +886,14 @@ join_datapaths(const struct nbrec_logical_switch_table 
*nbrec_ls_table,
 }
 
 void
-init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+init_vxlan_mode(const struct smap *nb_options,
+const struct sbrec_chassis_table *sbrec_chassis_table)
 {
+if (smap_get_bool(nb_options, "disable_vxlan_mode", false)) {
+vxlan_mode = false;
+return;
+}
+
 const struct sbrec_chassis *chassis;
 SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
 for (int i = 0; i < chassis->n_encaps; i++) {
@@ -17593,7 +17599,7 @@ ovnnb_db_run(struct northd_input *input_data,
 use_common_zone = smap_get_bool(input_data->nb_options, "use_common_zone",
 false);
 
-init_vxlan_mode(input_data->sbrec_chassis_table);
+init_vxlan_mode(input_data->nb_options, input_data->sbrec_chassis_table);
 
 build_datapaths(ovnsb_txn,
 input_data->nbrec_logical_switch_table,
diff --git a/northd/northd.h b/northd/northd.h
index be480003e..d0322e621 100644
--- a/northd/northd.h
+++ b/northd/northd.h
@@ -792,7 +792,8 @@ lr_has_multiple_gw_ports(const struct ovn_datapath *od)
 }
 
 void
-init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table);
+init_vxlan_mode(const struct smap *nb_options,
+const struct sbrec_chassis_table *sbrec_chassis_table);
 
 uint32_t get_ovn_max_dp_key_local(void);
 
diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml
index 7abb1fa83..251c9c514 100644
--- a/ovn-architecture.7.xml
+++ b/ovn-architecture.7.xml
@@ -2919,4 +2919,10 @@
 the future, gateways that do not support encapsulations with large amounts
 of metadata may continue to have a reduced feature set.
   
+  
+vxlan mode is recommended to be disabled if VXLAN encap at
+hypervisors is needed only to support HW VTEP L2 Gateway functionality.
+See man ovn-nb(5) for table NB_Global column
+options key disable_vxlan_mode for more details.
+  
 
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 5cb6ba640..a99e663e5 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -381,6 +381,16 @@
 of SB changes would be very noticeable.
   
 
+  
+By default if at least one chassis in OVN cluster has VXLAN encap,
+northd will run in a vxlan mode. See man
+ovn-architecture(7) Tunnel Encapsulations paragraph for
+ 

[ovs-dev] [PATCH ovn v4 0/2] Add support to disable vxlan mode.

2024-05-02 Thread Vladislav Odintsov
v4:
  - Addressed Dumitru's and Ihar's review comments;
  - single patch was split into two:
1. function call replaced with a global variable `vxlan_mode`;
2. introduced `disable_vxlan_mode` configuration knob;
  - rebased onto latest main branch.
v3:
  - Removed accidental ovs submodule change.
v2:
  - Added NEWS item.

Vladislav Odintsov (2):
  northd: Make `vxlan_mode` a global variable.
  northd: Add support for disabling vxlan mode.

 NEWS  |   3 ++
 northd/en-global-config.c |   9 +++-
 northd/northd.c   | 100 +-
 northd/northd.h   |   6 ++-
 ovn-architecture.7.xml|  17 ---
 ovn-nb.xml|  10 
 tests/ovn-northd.at   |  29 +++
 7 files changed, 110 insertions(+), 64 deletions(-)

-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-tc-offloads: Don't offload header modification on ip fragments.

2024-05-02 Thread Eelco Chaudron


On 1 May 2024, at 15:36, Finn, Emma wrote:

>> -Original Message-
>> From: Eelco Chaudron 
>> Sent: Wednesday, May 1, 2024 1:52 PM
>> To: Simon Horman 
>> Cc: Finn, Emma ; Stokes, Ian ;
>> sunil.pa...@intel.com; Van Haaren, Harry ;
>> d...@openvswitch.org; Flavio Leitner ; Ilya Maximets
>> 
>> Subject: Re: [ovs-dev] [PATCH] netdev-tc-offloads: Don't offload header
>> modification on ip fragments.
>>
>>
>>
>> On 1 May 2024, at 14:39, Simon Horman wrote:
>>
>>> On Tue, Apr 30, 2024 at 02:42:45PM +0200, Eelco Chaudron wrote:
 Greetings, Intel team!

 The self-test conducted as part of this patch has revealed an issue with 
 the
>> AVX512 checksumming code. Since it was agreed upon that your team would
>> maintain this code upon its inclusion, could you please review the problem
>> and provide a patch?

 Details on the problem can be found in this mail link:

 https://mail.openvswitch.org/pipermail/ovs-build/2024-
>> April/038590.html
>>>
>>> Thanks Eelco,
>>>
>>> In light of the above, could you clarify your plans for this patch?
>>
>> I hope Intel keeps their promise and will have a patch out soon, so we can
>> apply this patch.
>>
>> If not, I guess I can send out a patch to disable pedit acceleration as it’s
>> broken, and then apply this patch.
>>
>> Intel are you looking into this?
>>
>> Cheers,
>>
>> Eelco
>
> Hi Folks,
>
> I'll look into this and try reproduce locally myself.
> Will reach out when I have an update.

Thanks Emma, you can simply replicate this by building OVS with the 
--enable-actions-default-autovalidator configuration flag, then including the 
test in this patch, and run make check-dpdk.

Cheers,

Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev