Re: [ovs-dev] [PATCH] ofproto-dpif-upcall: Fix ukey installation failure logs and counters.

2024-04-05 Thread Ilya Maximets
On 4/5/24 15:46, Eelco Chaudron wrote:
> 
> 
> On 5 Apr 2024, at 15:04, Ilya Maximets wrote:
> 
>> On 4/5/24 14:33, Eelco Chaudron wrote:
>>>
>>>
>>> On 4 Apr 2024, at 14:09, Ilya Maximets wrote:
>>>
 ukey_install() returns boolean signaling if the ukey was installed
 or not.  Installation may fail for a few reasons:

  1. Conflicting ukey.
  2. Mutex contention while trying to replace existing ukey.
  3. The same ukey already exists and active.

 Only the first case here signals an actual problem.  Third one is
 a little odd for userspace datapath, but harmless.  Second is the
 most common one that can easily happen during normal operation
 since other threads like revalidators may be currently working on
 this ukey preventing an immediate access.

 Since only the first case is actually worth logging and it already
 has its own log message, removing the 'upcall installation fails'
 warning from the upcall_cb().  This should fix most of the random
 failures of userspace system tests in CI.

 While at it, also fixing coverage counters.  Mutex contention was
 mistakenly counted as a duplicate upcall.  ukey contention for
 revalidators was counted only in one of two places.

 New counter added for the ukey contention on replace.  We should
 not re-use existing upcall_ukey_contention counter for this, since
 it may lead to double counting.

 Fixes: 67f08985d769 ("upcall: Replace ukeys for deleted flows.")
 Fixes: 9cec8274ed9a ("ofproto-dpif-upcall: Add VLOG_WARN_RL logs for 
 upcall_cb() error.")
 Signed-off-by: Ilya Maximets 
 ---
>>>
>>> Thanks for looking into this, and the patch looks good to me.
>>>
>>> Maybe we should have another patch fixing some of the namings?
>>>
>>>   upcall_ukey_replace -> ukey_replace
>>>   handler_duplicate_upcall -> duplicate_upcall
>>>   upcall_ukey_contention -> revalidate_ukey_contention -> ukey_contention
>>
>> I had something similar in mind, but I didn't make any radical
>> renaming since I plan to backport this change.
>>
>> One more thing is that we would likely want to distinguish
>> contention in handlers/pmds from contention in revalidators,
>> so these will need separate counters, I think.
> 
> Ack but all can call the update functions ;) We can follow this up after the 
> backport…

Sure.

> 
>>>
>>> Cheers,
>>>
>>> Eelco
>>>
>>> Acked-by: Eelco Chaudron 
>>>
> 

Thanks!  Applied and backported down to 2.17.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] Use listen backlog = 64 for all connections.

2024-04-05 Thread Ilya Maximets
On 4/5/24 11:15, Eelco Chaudron wrote:
> 
> 
> On 4 Apr 2024, at 21:17, Ihar Hrachyshka wrote:
> 
>> On Thu, Apr 4, 2024 at 2:36 AM Eelco Chaudron  wrote:
>>
>>>
>>>
>>> On 3 Apr 2024, at 23:18, Ihar Hrachyshka wrote:
>>>
 Before the patch, the size of the backlog depended on the type of socket
 (UNIX vs INET) as well as on the language (C vs Python), specifically:

 - python used backlog size = 10 for all sockets;
 - C used 64 for UNIX sockets but 10 for INET sockets.

 This consolidates the values across the board. It effectively bumps the
 number of simultaneous connections to python unixctl servers to 64. Also
 for INET C servers too.

 Signed-off-by: Ihar Hrachyshka 
>>>
>>> Hi Ihar,
>>>
>>> Thanks for submitting the patch. The patch looks fine to me, however, can
>>> you elaborate on why you want to increase the size to 64? Is 10 giving
>>> problems in specific scenarios?
>>>
>>>
>> I guess I should've included it in the commit message, (and I am happy to
>> send v2 with it updated), but...
>>
>> 1. Originally* the patch was implemented to allow more parallel fmt_pkt
>> calls in OVN test suite (that rely on a python AF_UNIX unixctl server to
>> transform scapy string format strings into byte strings). The problem with
>> parallel handling of more than 10 unixctl AF_UNIX requests to python
>> servers was noticed here:
>> https://github.com/ovn-org/ovn/commit/0baca3e519756cbe98a32526ccc637bb73468743
>> 2. Now, Brian also reports listen backlog issues in OpenStack environments
>> for INET sockets, see:
>> https://mail.openvswitch.org/pipermail/ovs-discuss/2024-April/053049.html
>>
>> * the patch was part of a series of patches -
>> https://patchwork.ozlabs.org/project/openvswitch/list/?series=382739=%2A=both
>> - that generally improve AF_UNIX unixctl socket handling, which I plan to
>> revive later, but nothing stops us from merging this before I get to it.
>>
> 
> Yes, I noticed that conversation right after I replied ;) Let’s wait for 
> Ilya’s feedback before sending a v2.

I think, this patch is fine.  Expanding the commit message with more
context will definitely be useful.

Also, this patch spans multiple files, but having 'stream: ' or 'socket: '
subject prefix should be appropriate, I think.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] vlog: Log stack trace on vlog_abort.

2024-04-05 Thread Ilya Maximets
Currently, calls like ovs_assert() just print out a condition that
caused assertion to fail.  But it may be not enough to understand what
exactly has happened, especially if assertion failed in some generic
function like dp_packet_resize() or similar.

Print the stack trace along with the abort message to give more context
for the later debugging.

This should be especially useful in case the issue happens in an
environment with core dumps disabled.

Adding the log to vlog_abort() to cover a little more cases where
VLOG_ABORT is called and not only assertion failures.

It would be nice to also have stack traces in case of reaching the
OVS_NOT_REACHED().  But this macro is used in some places as a last
resort and should not actually do more than just stopping the process
immediately.  And it also can be used in contexts without logging
initialized.  Such a change will need to be done more carefully.
Better solution might be to use VLOG_ABORT() where appropriate instead.

Signed-off-by: Ilya Maximets 
---
 lib/vlog.c   | 10 --
 tests/library.at |  4 +++-
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/lib/vlog.c b/lib/vlog.c
index b2653142f..e78c785f7 100644
--- a/lib/vlog.c
+++ b/lib/vlog.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include "async-append.h"
+#include "backtrace.h"
 #include "coverage.h"
 #include "dirs.h"
 #include "openvswitch/dynamic-string.h"
@@ -1274,8 +1275,9 @@ vlog_fatal(const struct vlog_module *module, const char 
*message, ...)
 va_end(args);
 }
 
-/* Logs 'message' to 'module' at maximum verbosity, then calls abort().  Always
- * writes the message to stderr, even if the console destination is disabled.
+/* Attempts to log a stack trace, logs 'message' to 'module' at maximum
+ * verbosity, then calls abort().  Always writes the message to stderr, even
+ * if the console destination is disabled.
  *
  * Choose this function instead of vlog_fatal_valist() if the daemon monitoring
  * facility should automatically restart the current daemon.  */
@@ -1289,6 +1291,10 @@ vlog_abort_valist(const struct vlog_module *module_,
  * message written by the later ovs_abort_valist(). */
 module->levels[VLF_CONSOLE] = VLL_OFF;
 
+/* Printing the stack trace before the 'message', because the 'message'
+ * will flush the async log queue (VLL_EMER).  With a different order we
+ * would need to flush the queue manually again. */
+log_backtrace();
 vlog_valist(module, VLL_EMER, message, args);
 ovs_abort_valist(0, message, args);
 }
diff --git a/tests/library.at b/tests/library.at
index 7b4acebb8..d962e1b3f 100644
--- a/tests/library.at
+++ b/tests/library.at
@@ -230,7 +230,9 @@ AT_CHECK([ovstest test-util -voff -vfile:info 
'-vPATTERN:file:%c|%p|%m' --log-fi
   [$exit_status], [], [stderr])
 
 AT_CHECK([sed 's/\(opened log file\) .*/\1/
-s/|[[^|]]*: /|/' test-util.log], [0], [dnl
+s/|[[^|]]*: /|/
+/backtrace/d
+/|.*|/!d' test-util.log], [0], [dnl
 vlog|INFO|opened log file
 util|EMER|assertion false failed in test_assert()
 ])
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v14 0/6] dpif: probe support for OVS_ACTION_ATTR_DROP

2024-04-05 Thread Ilya Maximets
On 4/3/24 16:35, Eric Garver wrote:
> v14:
>   - remove atomic_init()
>   - add bogus dl_type to feature probe
>   - ovs_assert() that check_drop_action() returns false when expected
>   - add and use missing enum ovs_dec_ttl_attr
> v13:
>   - use macros strip_ptype and strip_eth in test
> v12:
>   - new patch to verify re-probe
>   - don't mention hw offload in the probe
>   - changed log message to not mention HW offload
> v11:
>   - move netdev and flow API check to lib
>   - tweaked log message
>   - use atomic store instead of cmpx
>   - copy odp in copy_support()
> v10:
>   - fix a sparse error in show_dp_feature_atomic_bool()
> v9:
>   - new patch make get_datapath_cap() access support by pointer
>   - fix a clang warning in copy_support()
> v8:
>   - new patch to support atomic_bool dpif field types
>   - re-add re-probe of support
>   - use atomic_bool type for explicit_drop_action
> v7:
>   - remove re-probe of support as Ilya is working on a generic solution
> v6:
>   - improve log if hw-offload enabled
>   - re-probe support if hw-offload set true at runtime
> v5:
>   - combine patches 3 and 4
>   - add description to combined patch 3
> v4:
>   - avoid passing action to datapath if TC offload in use
> v3:
>   - alter test such that it's reliable, different xlate_error
>   - better commit message in patch 2
>   - reordered _DEC_TTL value in switch statements
>   - add format_dec_ttl_action()
> v2:
>   - new patch (1) to fix build (switch cases)
>   - fixed check-system-userspace
> 
> Eric Garver (6):
>   dpif: Stub out unimplemented action OVS_ACTION_ATTR_DEC_TTL.
>   dpif: Make get_datapath_cap() access support by pointer.
>   dpif: Support atomic_bool field type.
>   dpif: Probe support for OVS_ACTION_ATTR_DROP.
>   tests: system-traffic: Add coverage for drop action.
>   tests: system-offload-traffic: Verify re-probe of drop action.
> 
>  include/linux/openvswitch.h  |  11 +-
>  lib/dpif-netdev.c|   1 +
>  lib/dpif.c   |   7 +-
>  lib/dpif.h   |   2 +-
>  lib/odp-execute.c|   2 +
>  lib/odp-util.c   |  23 
>  ofproto/ofproto-dpif-ipfix.c |   1 +
>  ofproto/ofproto-dpif-sflow.c |   1 +
>  ofproto/ofproto-dpif.c   | 186 ---
>  ofproto/ofproto-dpif.h   |   4 +-
>  tests/system-common-macros.at|   4 +
>  tests/system-offloads-traffic.at |  12 ++
>  tests/system-traffic.at  |  31 ++
>  13 files changed, 240 insertions(+), 45 deletions(-)
> 

Thanks, Eric and Eelco!  Applied.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 2/2] netlink-conntrack: Optimize flushing ct zone.

2024-04-05 Thread Ilya Maximets
On 4/4/24 15:27, Aaron Conole wrote:
> Felix Huettner via dev  writes:
> 
>> Previously the kernel did not provide a netlink interface to flush/list
>> only conntrack entries matching a specific zone. With [1] and [2] it is now
>> possible to flush and list conntrack entries filtered by zone. Older
>> kernels not yet supporting this feature will ignore the filter.
>> For the list request that means just returning all entries (which we can
>> then filter in userspace as before).
>> For the flush request that means deleting all conntrack entries.
>>
>> The implementation is now identical to the windows one, so we combine
>> them.
>>
>> These significantly improves the performance of flushing conntrack zones
>> when the conntrack table is large. Since flushing a conntrack zone is
>> normally triggered via an openflow command it blocks the main ovs thread
>> and thereby also blocks new flows from being applied. Using this new
>> feature we can reduce the flushing time for zones by around 93%.
>>
>> In combination with OVN the creation of a Logical_Router (which causes
>> the flushing of a ct zone) could block other operations, e.g. the
>> failover of Logical_Routers (as they cause new flows to be created).
>> This is visible from a user perspective as a ovn-controller that is idle
>> (as it waits for vswitchd) and vswitchd reporting:
>> "blocked 1000 ms waiting for main to quiesce" (potentially with ever
>> increasing times).
>>
>> The following performance tests where run in a qemu vm with 500.000
>> conntrack entries distributed evenly over 500 ct zones using `ovstest
>> test-netlink-conntrack flush zone=`.
>>
>>   |  flush zone with 1000 entries  |   flush zone with no entry |
>>   +-+--+-+--|
>>   |   with the patch| without  |   with the patch| without  |
>>   +--+--+--+--+--+--|
>>   | v6.8-rc4 |  v6.7.1  | v6.8-rc4 | v6.8-rc4 |  v6.7.1  | v6.8-rc4 |
>> +-+--+--+--+--+--+--|
>> | Min |  0.260   |  3.946   |  3.497   |  0.228   |  3.462   |  3.212   |
>> | Median  |  0.319   |  4.237   |  4.349   |  0.298   |  4.460   |  4.010   |
>> | 90%ile  |  0.335   |  4.367   |  4.522   |  0.325   |  4.662   |  4.572   |
>> | 99%ile  |  0.348   |  4.495   |  4.773   |  0.340   |  4.931   |  6.003   |
>> | Max |  0.362   |  4.543   |  5.054   |  0.348   |  5.390   |  6.396   |
>> | Mean|  0.320   |  4.236   |  4.331   |  0.296   |  4.430   |  4.071   |
>> | Total   |  80.02   |  1058|  1082|  73.93   |  1107|  1017|
>>
>> [1]: 
>> https://github.com/torvalds/linux/commit/eff3c558bb7e61c41b53e4c8130e514a5a4df9ba
>> [2]: 
>> https://github.com/torvalds/linux/commit/fa173a1b4e3fd1ab5451cbc57de6fc624c824b0a
>>
>> Acked-by: Mike Pattrick 
>> Co-Authored-By: Luca Czesla 
>> Signed-off-by: Luca Czesla 
>> Co-Authored-By: Max Lamprecht 
>> Signed-off-by: Max Lamprecht 
>> Signed-off-by: Felix Huettner 
>> ---
> 
> Acked-by: Aaron Conole 
> 
> Thanks!

Thanks, everyone!  Applied.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] northd, controller: Use paused controller action for packet buffering.

2024-04-05 Thread Numan Siddique
On Tue, Apr 2, 2024 at 2:28 AM Ales Musil  wrote:

> The current packet injection loses ct_state in the process. When
> the ct_state is lost we might commit to DNAT zone and perform
> zero SNAT after the packet injection. This causes the first session
> to be wrong as the reply packets are not unDNATted.
>
> Instead of re-injecting the packet back into the pipeline when
> we get the MAC binding, use paused controller action. The paused
> controller action stores ct_state, which is important for the behavior
> of the resumed packet.
>
> At the same time bump the OvS submodule latest branch-3.3. This is
> mainly for [0], which fixes metering for paused controller actions.
>
> [0] c560f6ca3257 ("ofproto-dpif-xlate: Fix continuations with associated
> metering.")
>
> Reported-at: https://issues.redhat.com/browse/FDP-439
> Signed-off-by: Ales Musil 
> Acked-by: Mark Michelson 
> ---
> v2: Fix the Jira link and add ack from Mark.
>

Hi Ales,

I see one problem during upgrades.  The packet buffering functionality
would be broken when ovn-controllers are updated to the version
which  has your patch  but ovn-northd is still yet to.

In this case, the below logical flow will be present  (without the outer
"output" action)

---
  table=21(lr_in_arp_request  ), priority=100  , match=(eth.dst ==
00:00:00:00:00:00 && ip4), action=(arp { eth.dst = ff:ff:ff:ff:ff:ff;
arp.spa = reg1; arp.tpa = reg0; arp.op = 1; output; };)
---

The updated ovn-controller will now resume the packet after learning the
nexthop.  But this resumed packet will be dropped because there is no
"next" or "output" action following arp {}.

Is this behavior acceptable until ovn-northd is also updated/upgraded ?

Thanks
Numan


---
>  controller/mac-learn.c | 30 +
>  controller/mac-learn.h |  9 ++--
>  controller/pinctrl.c   | 64 ++--
>  lib/actions.c  | 41 +-
>  northd/northd.c|  6 +--
>  ovs|  2 +-
>  tests/ovn-northd.at|  3 ++
>  tests/ovn.at   |  8 ++--
>  tests/system-ovn.at| 95 ++
>  9 files changed, 195 insertions(+), 63 deletions(-)
>
> diff --git a/controller/mac-learn.c b/controller/mac-learn.c
> index 071f01b4f..0c3b60c23 100644
> --- a/controller/mac-learn.c
> +++ b/controller/mac-learn.c
> @@ -199,15 +199,24 @@ ovn_fdb_add(struct hmap *fdbs, uint32_t dp_key,
> struct eth_addr mac,
>  /* packet buffering functions */
>
>  struct packet_data *
> -ovn_packet_data_create(struct ofpbuf ofpacts,
> -   const struct dp_packet *original_packet)
> +ovn_packet_data_create(const struct ofputil_packet_in *pin,
> +   const struct ofpbuf *continuation)
>  {
>  struct packet_data *pd = xmalloc(sizeof *pd);
>
> -pd->ofpacts = ofpacts;
> -/* clone the packet to send it later with correct L2 address */
> -pd->p = dp_packet_clone_data(dp_packet_data(original_packet),
> - dp_packet_size(original_packet));
> +pd->pin = (struct ofputil_packet_in) {
> +.packet = xmemdup(pin->packet, pin->packet_len),
> +.packet_len = pin->packet_len,
> +.flow_metadata = pin->flow_metadata,
> +.reason = pin->reason,
> +.table_id = pin->table_id,
> +.cookie = pin->cookie,
> +/* Userdata are empty on purpose,
> + * it is not needed for the continuation. */
> +.userdata = NULL,
> +.userdata_len = 0,
> +};
> +pd->continuation = ofpbuf_clone(continuation);
>
>  return pd;
>  }
> @@ -216,8 +225,8 @@ ovn_packet_data_create(struct ofpbuf ofpacts,
>  void
>  ovn_packet_data_destroy(struct packet_data *pd)
>  {
> -dp_packet_delete(pd->p);
> -ofpbuf_uninit(>ofpacts);
> +free(pd->pin.packet);
> +ofpbuf_delete(pd->continuation);
>  free(pd);
>  }
>
> @@ -307,7 +316,10 @@ ovn_buffered_packets_ctx_run(struct
> buffered_packets_ctx *ctx,
>
>  struct packet_data *pd;
>  LIST_FOR_EACH_POP (pd, node, >queue) {
> -struct eth_header *eth = dp_packet_data(pd->p);
> +struct dp_packet packet;
> +dp_packet_use_const(, pd->pin.packet,
> pd->pin.packet_len);
> +
> +struct eth_header *eth = dp_packet_data();
>  eth->eth_dst = mac;
>
>  ovs_list_push_back(>ready_packets_data, >node);
> diff --git a/controller/mac-learn.h b/controller/mac-learn.h
> index e0fd6a8d1..20a015e1a 100644
> --- a/controller/mac-learn.h
> +++ b/controller/mac-learn.h
> @@ -24,6 +24,7 @@
>  #include "openvswitch/hmap.h"
>  #include "openvswitch/list.h"
>  #include "openvswitch/ofpbuf.h"
> +#include "openvswitch/ofp-packet.h"
>
>  struct ovsdb_idl_index;
>
> @@ -91,8 +92,8 @@ struct fdb_entry *ovn_fdb_add(struct hmap *fdbs,
>  struct packet_data {
>  struct ovs_list node;
>
> -struct ofpbuf ofpacts;
> -struct dp_packet *p;
> +struct ofpbuf *continuation;
> +struct ofputil_packet_in pin;
> 

Re: [ovs-dev] [PATCH ovn] ovn-ctl: Use the current user for default file permissions.

2024-04-05 Thread Numan Siddique
On Thu, Apr 4, 2024 at 4:01 PM Mark Michelson  wrote:

> Thanks Ales,
>
> Acked-by: Mark Michelson 
>

Thanks.  I applied this patch to the main branch.  Does this require a
backport ?

Numan


> On 3/25/24 06:40, Ales Musil wrote:
> > The ovn-ctl utility was assuming that the user/group is always root,
> > when not specified otherwise by the --ovn-user/--ovn-group options.
> > This has the consequence of trying to change permissions of OVN
> > directories to root:root even though the script might be run as
> > completely different user.
> >
> > Take the current user and group instead of the hardcoded root.
> > At the same time remove the ovs-user option as it was not used for
> > anything and might be confusing.
> >
> > Reported-at: https://issues.redhat.com/browse/FDP-245
> > Signed-off-by: Ales Musil 
> > ---
> >   utilities/ovn-ctl   | 5 ++---
> >   utilities/ovn-ctl.8.xml | 1 -
> >   utilities/ovn-lib.in| 4 ++--
> >   3 files changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
> > index 700efe35a..dae5e22f4 100755
> > --- a/utilities/ovn-ctl
> > +++ b/utilities/ovn-ctl
> > @@ -269,8 +269,8 @@ $cluster_remote_port
> >   # Set the owner of the ovn_dbdir (with -R option) to OVN_USER if
> set.
> >   # This is required because the ovndbs are created with root
> permission
> >   # if not present when create_cluster/upgrade_db is called.
> > -INSTALL_USER="root"
> > -INSTALL_GROUP="root"
> > +INSTALL_USER="$(id -un)"
> > +INSTALL_GROUP="$(id -gn)"
> >   [ "$OVN_USER" != "" ] && INSTALL_USER="${OVN_USER%:*}"
> >   [ "${OVN_USER##*:}" != "" ] && INSTALL_GROUP="${OVN_USER##*:}"
> >
> > @@ -1088,7 +1088,6 @@ Options:
> > --ovn-ic-sb-db-ssl-protocols=PROTOCOLS OVN IC Southbound DB SSL
> protocols
> > --ovn-ic-sb-db-ssl-ciphers=CIPHERS OVN IC Southbound DB SSL cipher
> list
> > --ovn-user="user[:group]"  pass the --user flag to the ovn
> daemons
> > -  --ovs-user="user[:group]"  pass the --user flag to ovs daemons
> > --ovsdb-nb-wrapper=WRAPPER run with a wrapper like valgrind for
> debugging
> > --ovsdb-sb-wrapper=WRAPPER run with a wrapper like valgrind for
> debugging
> > --ovsdb-disable-file-column-diff=no|yes
> > diff --git a/utilities/ovn-ctl.8.xml b/utilities/ovn-ctl.8.xml
> > index 57712bfdc..c0fbb0792 100644
> > --- a/utilities/ovn-ctl.8.xml
> > +++ b/utilities/ovn-ctl.8.xml
> > @@ -70,7 +70,6 @@
> >   --ovsdb-nb-wrapper=WRAPPER
> >   --ovsdb-sb-wrapper=WRAPPER
> >   --ovn-user=USER:GROUP
> > ---ovs-user=USER:GROUP
> >   -h | --help
> >
> >   File location options
> > diff --git a/utilities/ovn-lib.in b/utilities/ovn-lib.in
> > index 1e48ef28c..65cbfbcdc 100644
> > --- a/utilities/ovn-lib.in
> > +++ b/utilities/ovn-lib.in
> > @@ -48,8 +48,8 @@ LC_ALL=C; export LC_ALL
> >   ovn_install_dir () {
> >   DIR="$1"
> >   INSTALL_MODE="${2:-755}"
> > -INSTALL_USER="root"
> > -INSTALL_GROUP="root"
> > +INSTALL_USER="$(id -un)"
> > +INSTALL_GROUP="$(id -gn)"
> >   [ "$OVN_USER" != "" ] && INSTALL_USER="${OVN_USER%:*}"
> >   [ "${OVN_USER##*:}" != "" ] && INSTALL_GROUP="${OVN_USER##*:}"
> >
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v4] Make tunnel ids exhaustion test trigger the problem.

2024-04-05 Thread Ihar Hrachyshka
The original version of the scenario passed with or without the fix.
This is because all LSs were processed in one go, so the allocate
function was never entered with *hint==0.

Also, added another scenario that will check behavior when *hint is out
of [min;max] bounds but > max (this happens in an obscure scenario where
a vxlan chassis is added to the cluster mid-light, forcing northd to
reduce its effective max value for tunnel ids; which may become lower
than the current *hint for ports.)

Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
Co-Authored-By: Vladislav Odintsov 
Signed-off-by: Vladislav Odintsov 
Signed-off-by: Ihar Hrachyshka 
---
v1: initial version.
v2: cover both cases of hint = 0 and hint > max.
v3: reduce the number of ports to create in the hint > max scenario needed to 
trigger the problem.
v4: remove spurious lib/ovn-util.c change.
---
 tests/ovn-northd.at | 43 ---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index be006fb32..1a4e7274d 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -2823,7 +2823,7 @@ AT_CLEANUP
 ])
 
 OVN_FOR_EACH_NORTHD_NO_HV([
-AT_SETUP([check tunnel ids exhaustion])
+AT_SETUP([check datapath tunnel ids exhaustion])
 ovn_start
 
 # Create a fake chassis with vxlan encap to lower MAX DP tunnel key to 2^12
@@ -2833,13 +2833,18 @@ ovn-sbctl \
 
 cmd="ovn-nbctl --wait=sb"
 
-for i in {1..4097}; do
+for i in {1..4095}; do
 cmd="${cmd} -- ls-add lsw-${i}"
 done
 
 eval $cmd
 
-check_row_count nb:Logical_Switch 4097
+check_row_count nb:Logical_Switch 4095
+wait_row_count sb:Datapath_Binding 4095
+
+ovn-nbctl ls-add lsw-exhausted
+
+check_row_count nb:Logical_Switch 4096
 wait_row_count sb:Datapath_Binding 4095
 
 OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
northd/ovn-northd.log])
@@ -2847,6 +2852,38 @@ OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
northd/ovn-northd.log])
 AT_CLEANUP
 ])
 
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([check port tunnel ids exhaustion; vxlan chassis pops up midflight])
+ovn_start
+
+cmd="ovn-nbctl --wait=sb"
+
+cmd="${cmd} -- ls-add lsw"
+for i in {1..2048}; do
+cmd="${cmd} -- lsp-add lsw lsp-${i}"
+done
+
+eval $cmd
+
+check_row_count nb:Logical_Switch_Port 2048
+wait_row_count sb:Port_Binding 2048
+
+# Now create a fake chassis with vxlan encap to lower MAX port tunnel key to 
2^11
+ovn-sbctl \
+--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
+-- --id=@c create chassis name=hv1 encaps=@e
+
+ovn-nbctl lsp-add lsw lsp-exhausted
+
+check_row_count nb:Logical_Switch_Port 2049
+wait_row_count sb:Port_Binding 2048
+
+OVS_WAIT_UNTIL([grep "all port tunnel ids exhausted" northd/ovn-northd.log])
+
+AT_CLEANUP
+])
+
+
 
 OVN_FOR_EACH_NORTHD_NO_HV([
 AT_SETUP([Logical Flow Datapath Groups])
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3] Make tunnel ids exhaustion test trigger the problem.

2024-04-05 Thread Ihar Hrachyshka
Sorry for forgetting to include changelogs in these submissions.

v1: initial version.
v2: cover both cases of hint = 0 and hint > max.
v3: reduce the number of ports to create in the hint > max scenario needed
to trigger the problem.

On Fri, Apr 5, 2024 at 12:10 PM Ihar Hrachyshka  wrote:

> The original version of the scenario passed with or without the fix.
> This is because all LSs were processed in one go, so the allocate
> function was never entered with *hint==0.
>
> Also, added another scenario that will check behavior when *hint is out
> of [min;max] bounds but > max (this happens in an obscure scenario where
> a vxlan chassis is added to the cluster mid-light, forcing northd to
> reduce its effective max value for tunnel ids; which may become lower
> than the current *hint for ports.)
>
> Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
> Co-Authored-By: Vladislav Odintsov 
> Signed-off-by: Vladislav Odintsov 
> Signed-off-by: Ihar Hrachyshka 
> ---
>  lib/ovn-util.c  | 11 ---
>  tests/ovn-northd.at | 43 ---
>  2 files changed, 44 insertions(+), 10 deletions(-)
>
> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
> index 9f97ae2ca..248f00b9e 100644
> --- a/lib/ovn-util.c
> +++ b/lib/ovn-util.c
> @@ -693,17 +693,14 @@ uint32_t
>  ovn_allocate_tnlid(struct hmap *set, const char *name, uint32_t min,
> uint32_t max, uint32_t *hint)
>  {
> -/* Normalize hint, because it can be outside of [min, max]. */
> -*hint = next_tnlid(*hint, min, max);
> -
> -uint32_t tnlid = *hint;
> -do {
> +VLOG_ERR("IHAR starting hint = %u", hint);
> +for (uint32_t tnlid = next_tnlid(*hint, min, max); tnlid != *hint;
> + tnlid = next_tnlid(tnlid, min, max)) {
>  if (ovn_add_tnlid(set, tnlid)) {
>  *hint = tnlid;
>  return tnlid;
>  }
> -tnlid = next_tnlid(tnlid, min, max);
> -} while (tnlid != *hint);
> +}
>
>  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
>  VLOG_WARN_RL(, "all %s tunnel ids exhausted", name);
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index be006fb32..1a4e7274d 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -2823,7 +2823,7 @@ AT_CLEANUP
>  ])
>
>  OVN_FOR_EACH_NORTHD_NO_HV([
> -AT_SETUP([check tunnel ids exhaustion])
> +AT_SETUP([check datapath tunnel ids exhaustion])
>  ovn_start
>
>  # Create a fake chassis with vxlan encap to lower MAX DP tunnel key to
> 2^12
> @@ -2833,13 +2833,18 @@ ovn-sbctl \
>
>  cmd="ovn-nbctl --wait=sb"
>
> -for i in {1..4097}; do
> +for i in {1..4095}; do
>  cmd="${cmd} -- ls-add lsw-${i}"
>  done
>
>  eval $cmd
>
> -check_row_count nb:Logical_Switch 4097
> +check_row_count nb:Logical_Switch 4095
> +wait_row_count sb:Datapath_Binding 4095
> +
> +ovn-nbctl ls-add lsw-exhausted
> +
> +check_row_count nb:Logical_Switch 4096
>  wait_row_count sb:Datapath_Binding 4095
>
>  OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
> northd/ovn-northd.log])
> @@ -2847,6 +2852,38 @@ OVS_WAIT_UNTIL([grep "all datapath tunnel ids
> exhausted" northd/ovn-northd.log])
>  AT_CLEANUP
>  ])
>
> +OVN_FOR_EACH_NORTHD_NO_HV([
> +AT_SETUP([check port tunnel ids exhaustion; vxlan chassis pops up
> midflight])
> +ovn_start
> +
> +cmd="ovn-nbctl --wait=sb"
> +
> +cmd="${cmd} -- ls-add lsw"
> +for i in {1..2048}; do
> +cmd="${cmd} -- lsp-add lsw lsp-${i}"
> +done
> +
> +eval $cmd
> +
> +check_row_count nb:Logical_Switch_Port 2048
> +wait_row_count sb:Port_Binding 2048
> +
> +# Now create a fake chassis with vxlan encap to lower MAX port tunnel key
> to 2^11
> +ovn-sbctl \
> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
> +-- --id=@c create chassis name=hv1 encaps=@e
> +
> +ovn-nbctl lsp-add lsw lsp-exhausted
> +
> +check_row_count nb:Logical_Switch_Port 2049
> +wait_row_count sb:Port_Binding 2048
> +
> +OVS_WAIT_UNTIL([grep "all port tunnel ids exhausted"
> northd/ovn-northd.log])
> +
> +AT_CLEANUP
> +])
> +
> +
>
>  OVN_FOR_EACH_NORTHD_NO_HV([
>  AT_SETUP([Logical Flow Datapath Groups])
> --
> 2.41.0
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v3] Make tunnel ids exhaustion test trigger the problem.

2024-04-05 Thread Ihar Hrachyshka
The original version of the scenario passed with or without the fix.
This is because all LSs were processed in one go, so the allocate
function was never entered with *hint==0.

Also, added another scenario that will check behavior when *hint is out
of [min;max] bounds but > max (this happens in an obscure scenario where
a vxlan chassis is added to the cluster mid-light, forcing northd to
reduce its effective max value for tunnel ids; which may become lower
than the current *hint for ports.)

Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
Co-Authored-By: Vladislav Odintsov 
Signed-off-by: Vladislav Odintsov 
Signed-off-by: Ihar Hrachyshka 
---
 lib/ovn-util.c  | 11 ---
 tests/ovn-northd.at | 43 ---
 2 files changed, 44 insertions(+), 10 deletions(-)

diff --git a/lib/ovn-util.c b/lib/ovn-util.c
index 9f97ae2ca..248f00b9e 100644
--- a/lib/ovn-util.c
+++ b/lib/ovn-util.c
@@ -693,17 +693,14 @@ uint32_t
 ovn_allocate_tnlid(struct hmap *set, const char *name, uint32_t min,
uint32_t max, uint32_t *hint)
 {
-/* Normalize hint, because it can be outside of [min, max]. */
-*hint = next_tnlid(*hint, min, max);
-
-uint32_t tnlid = *hint;
-do {
+VLOG_ERR("IHAR starting hint = %u", hint);
+for (uint32_t tnlid = next_tnlid(*hint, min, max); tnlid != *hint;
+ tnlid = next_tnlid(tnlid, min, max)) {
 if (ovn_add_tnlid(set, tnlid)) {
 *hint = tnlid;
 return tnlid;
 }
-tnlid = next_tnlid(tnlid, min, max);
-} while (tnlid != *hint);
+}
 
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
 VLOG_WARN_RL(, "all %s tunnel ids exhausted", name);
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index be006fb32..1a4e7274d 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -2823,7 +2823,7 @@ AT_CLEANUP
 ])
 
 OVN_FOR_EACH_NORTHD_NO_HV([
-AT_SETUP([check tunnel ids exhaustion])
+AT_SETUP([check datapath tunnel ids exhaustion])
 ovn_start
 
 # Create a fake chassis with vxlan encap to lower MAX DP tunnel key to 2^12
@@ -2833,13 +2833,18 @@ ovn-sbctl \
 
 cmd="ovn-nbctl --wait=sb"
 
-for i in {1..4097}; do
+for i in {1..4095}; do
 cmd="${cmd} -- ls-add lsw-${i}"
 done
 
 eval $cmd
 
-check_row_count nb:Logical_Switch 4097
+check_row_count nb:Logical_Switch 4095
+wait_row_count sb:Datapath_Binding 4095
+
+ovn-nbctl ls-add lsw-exhausted
+
+check_row_count nb:Logical_Switch 4096
 wait_row_count sb:Datapath_Binding 4095
 
 OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
northd/ovn-northd.log])
@@ -2847,6 +2852,38 @@ OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
northd/ovn-northd.log])
 AT_CLEANUP
 ])
 
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([check port tunnel ids exhaustion; vxlan chassis pops up midflight])
+ovn_start
+
+cmd="ovn-nbctl --wait=sb"
+
+cmd="${cmd} -- ls-add lsw"
+for i in {1..2048}; do
+cmd="${cmd} -- lsp-add lsw lsp-${i}"
+done
+
+eval $cmd
+
+check_row_count nb:Logical_Switch_Port 2048
+wait_row_count sb:Port_Binding 2048
+
+# Now create a fake chassis with vxlan encap to lower MAX port tunnel key to 
2^11
+ovn-sbctl \
+--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
+-- --id=@c create chassis name=hv1 encaps=@e
+
+ovn-nbctl lsp-add lsw lsp-exhausted
+
+check_row_count nb:Logical_Switch_Port 2049
+wait_row_count sb:Port_Binding 2048
+
+OVS_WAIT_UNTIL([grep "all port tunnel ids exhausted" northd/ovn-northd.log])
+
+AT_CLEANUP
+])
+
+
 
 OVN_FOR_EACH_NORTHD_NO_HV([
 AT_SETUP([Logical Flow Datapath Groups])
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: fix infinite loop in ovn_allocate_tnlid()

2024-04-05 Thread Ihar Hrachyshka
On Thu, Apr 4, 2024 at 4:02 PM Vladislav Odintsov  wrote:

>
>
> > On 4 Apr 2024, at 22:51, Mark Michelson  wrote:
> >
> > On 4/4/24 12:46, Dumitru Ceara wrote:
> >> On 4/4/24 17:52, Vladislav Odintsov wrote:
> >>> Thanks Dumitru!
> >>> I’m totally fine with your change.
> >>> Should I send backport patches with resolved conflicts for remaining
> branches at least till 22.03, which is an LTS?
> >>>
> >> Well, 24.03 is the most recent LTS.  We don't really backport patches to
> >> 22.03 unless they fix critical issues.  I'm not completely convinced
> >> that this is such a critical issue though.  You need 4K logical
> >> datapaths with vxlan enabled before this gets hit.  In any case, Mark,
> >> what do you think?
> >
> > I don't think this needs backporting down to 22.03.
>
> I just wanted to mention that to reproduce this bug it is only enough to
> have at least one chassis with vxlan encap and create 4096 LSs/LRs.
>

Or: 4096... no, 2048! LSPs in a single network. (Because only 11 bits are
allowed for unicast tunnel keys.)


> If the problem is triggered, ovn-northd starts consuming 100% CPU and
> hangs (doesn’t process any change) until excess LS/LR is removed and northd
> is restarted.
> I can submit backport patches for old branches if needed (already rebased).
>

I also think this should be fixed in all supported branches.


>
> >
> >> Regards,
> >> Dumitru
>  On 4 Apr 2024, at 18:26, Dumitru Ceara  wrote:
> 
>  On 4/1/24 16:27, Mark Michelson wrote:
> > Thanks Vladislav,
> >
> > Acked-by: Mark Michelson  mmich...@redhat.com>>
> >
> 
>  Thanks, Vladislav and Mark!  Applied to main and backported down to
>  23.06 with a minor test change, please see below.
> 
> > On 4/1/24 08:15, Vladislav Odintsov wrote:
> >> In case if all tunnel ids are exhausted, ovn_allocate_tnlid()
> function
> >> iterates over tnlids indefinitely when *hint is outside of [min,
> max].
> >> This is because when tnlid reaches max, next tnlid is min and
> for-loop
> >> never reaches exit condition for tnlid != *hint.
> >>
> >> This patch fixes mentioned issue and adds a testcase.
> >>
> >> Signed-off-by: Vladislav Odintsov 
> >> ---
> >>   lib/ovn-util.c  | 10 +++---
> >>   tests/ovn-northd.at | 26 ++
> >>   2 files changed, 33 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
> >> index ee5cbcdc3..9f97ae2ca 100644
> >> --- a/lib/ovn-util.c
> >> +++ b/lib/ovn-util.c
> >> @@ -693,13 +693,17 @@ uint32_t
> >>   ovn_allocate_tnlid(struct hmap *set, const char *name, uint32_t
> min,
> >>  uint32_t max, uint32_t *hint)
> >>   {
> >> -for (uint32_t tnlid = next_tnlid(*hint, min, max); tnlid !=
> *hint;
> >> - tnlid = next_tnlid(tnlid, min, max)) {
> >> +/* Normalize hint, because it can be outside of [min, max]. */
> >> +*hint = next_tnlid(*hint, min, max);
> >> +
> >> +uint32_t tnlid = *hint;
> >> +do {
> >>   if (ovn_add_tnlid(set, tnlid)) {
> >>   *hint = tnlid;
> >>   return tnlid;
> >>   }
> >> -}
> >> +tnlid = next_tnlid(tnlid, min, max);
> >> +} while (tnlid != *hint);
> >> static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 1);
> >>   VLOG_WARN_RL(, "all %s tunnel ids exhausted", name);
> >> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> >> index cd53755b2..174dbacda 100644
> >> --- a/tests/ovn-northd.at
> >> +++ b/tests/ovn-northd.at
> >> @@ -2822,6 +2822,32 @@ AT_CHECK([test $lsp02 = 3 && test $ls1 =
> 123])
> >>   AT_CLEANUP
> >>   ])
> >>   +OVN_FOR_EACH_NORTHD_NO_HV([
> >> +AT_SETUP([check tunnel ids exhaustion])
> >> +ovn_start
> >> +
> >> +# Create a fake chassis with vxlan encap to lower MAX DP tunnel key
> >> to 2^12
> >> +ovn-sbctl \
> >> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1"
> >> type="vxlan" \
> >> +-- --id=@c create chassis name=hv1 encaps=@e
> >> +
> >> +cmd="ovn-nbctl --wait=sb"
> >> +
> >> +for i in {1..4097..1}; do
> 
>  This can be changed to:
> 
>  for i in {1..4097}; do
> 
> >> +cmd="${cmd} -- ls-add lsw-${i}"
> >> +done
> >> +
> >> +eval $cmd
> >> +
> >> +check_row_count nb:Logical_Switch 4097
> >> +wait_row_count sb:Datapath_Binding 4095
> >> +
> >> +OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
> >> northd/ovn-northd.log])
> >> +
> >> +AT_CLEANUP
> >> +])
> >> +
> >> +
> >>   OVN_FOR_EACH_NORTHD_NO_HV([
> >>   AT_SETUP([Logical Flow Datapath Groups])
> >>   ovn_start
> 
>  Regards,
>  Dumitru
> 
>  ___
>  dev mailing list
>  

Re: [ovs-dev] [PATCH ovn] Make tunnel ids exhaustion test case trigger the problem.

2024-04-05 Thread Ihar Hrachyshka
On Fri, Apr 5, 2024 at 11:38 AM Vladislav Odintsov 
wrote:

>
>
> On 5 Apr 2024, at 18:35, Ihar Hrachyshka  wrote:
>
> On Thu, Apr 4, 2024 at 3:56 PM Vladislav Odintsov 
> wrote:
>
>> Thanks Ihar for the patch.
>>
>> It definitely triggers the bug mentioned in Fixes commit, but how do you
>> like next diff as an alternative?
>> It seems a little easier to me, because it shows the real limit and the
>> situation where the problem was (separate ls-add):
>>
>>
> Ah, I think we are talking about two separate scenarios, both resulting in
> *hint out of [min; max] bounds!
>
> - You are talking about hint=0 with min:max = [1; 4096] - which indeed can
> be triggered by creating a new DP *after* tunnel ids are exhausted;
> - I am talking about a more obscure case where hint=4097 (because
> originally there were no vxlan chassis in the cluster); then a vxlan
> chassis is created (reducing max to 4096); then the allocation function is
> entered with hint=4097.
>
> Both scenarios are fixed by your patch. It may be worth having both test
> cases, one per scenario, in the test suite then. What do you think?
>
>
> I agree, it’s worth adding both.
> Thanks for clarification!
>
>
v2 with both cases here:
https://patchwork.ozlabs.org/project/ovn/patch/20240405155915.624103-1-ihrac...@redhat.com/


>
> (Side Note: I now find the runtime flip of max cap as a vxlan chassis is
> added - that I myself implemented - unfortunate.)
>
> Ihar
>
>
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index 6edb1129e..cef144f10 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -2862,13 +2862,18 @@ ovn-sbctl \
>>
>>  cmd="ovn-nbctl --wait=sb"
>>
>> -for i in {1..4097}; do
>> +for i in {1..4095}; do
>>  cmd="${cmd} -- ls-add lsw-${i}"
>>  done
>>
>>  eval $cmd
>>
>> -check_row_count nb:Logical_Switch 4097
>> +check_row_count nb:Logical_Switch 4095
>> +wait_row_count sb:Datapath_Binding 4095
>> +
>> +ovn-nbctl ls-add lsw-exhausted
>> +
>> +check_row_count nb:Logical_Switch 4096
>>  wait_row_count sb:Datapath_Binding 4095
>>
>>  OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
>> northd/ovn-northd.log])
>>
>>
>> On 4 Apr 2024, at 20:13, Ihar Hrachyshka  wrote:
>>
>> The original version of the scenario passed with or without the fix.
>>
>> Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
>> Signed-off-by: Ihar Hrachyshka 
>> ---
>> tests/ovn-northd.at | 17 +++--
>> 1 file changed, 11 insertions(+), 6 deletions(-)
>>
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index fc2c972a4..e8ea8b050 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -2826,11 +2826,6 @@ OVN_FOR_EACH_NORTHD_NO_HV([
>> AT_SETUP([check tunnel ids exhaustion])
>> ovn_start
>>
>> -# Create a fake chassis with vxlan encap to lower MAX DP tunnel key to
>> 2^12
>> -ovn-sbctl \
>> ---id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
>> --- --id=@c create chassis name=hv1 encaps=@e
>> -
>> cmd="ovn-nbctl --wait=sb"
>>
>> for i in {1..4097}; do
>> @@ -2840,7 +2835,17 @@ done
>> eval $cmd
>>
>> check_row_count nb:Logical_Switch 4097
>> -wait_row_count sb:Datapath_Binding 4095
>> +wait_row_count sb:Datapath_Binding 4097
>> +
>> +# Now create a fake chassis with vxlan encap to lower MAX DP tunnel key
>> to 2^12
>> +ovn-sbctl \
>> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
>> +-- --id=@c create chassis name=hv1 encaps=@e
>> +
>> +ovn-nbctl --wait=sb ls-add lsw-exhausted
>> +
>> +check_row_count nb:Logical_Switch 4098
>> +wait_row_count sb:Datapath_Binding 4097
>>
>> OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
>> northd/ovn-northd.log])
>>
>> --
>> 2.41.0
>>
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>>
>>
>>
>>
>> Regards,
>> Vladislav Odintsov
>>
>>
>
> Regards,
> Vladislav Odintsov
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] Make tunnel ids exhaustion test trigger the problem.

2024-04-05 Thread Ihar Hrachyshka
The original version of the scenario passed with or without the fix.
This is because all LSs were processed in one go, so the allocate
function was never entered with *hint==0.

Also, added another scenario that will check behavior when *hint is out
of [min;max] bounds but > max (this happens in an obscure scenario where
a vxlan chassis is added to the cluster mid-light, forcing northd to
reduce its effective max value for tunnel ids; which may become lower
than the current *hint for ports.)

Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
Co-Authored-By: Vladislav Odintsov 
Signed-off-by: Vladislav Odintsov 
Signed-off-by: Ihar Hrachyshka 
---
 tests/ovn-northd.at | 43 ---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index be006fb32..29c4a93a0 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -2823,7 +2823,7 @@ AT_CLEANUP
 ])
 
 OVN_FOR_EACH_NORTHD_NO_HV([
-AT_SETUP([check tunnel ids exhaustion])
+AT_SETUP([check datapath tunnel ids exhaustion])
 ovn_start
 
 # Create a fake chassis with vxlan encap to lower MAX DP tunnel key to 2^12
@@ -2833,13 +2833,18 @@ ovn-sbctl \
 
 cmd="ovn-nbctl --wait=sb"
 
-for i in {1..4097}; do
+for i in {1..4095}; do
 cmd="${cmd} -- ls-add lsw-${i}"
 done
 
 eval $cmd
 
-check_row_count nb:Logical_Switch 4097
+check_row_count nb:Logical_Switch 4095
+wait_row_count sb:Datapath_Binding 4095
+
+ovn-nbctl ls-add lsw-exhausted
+
+check_row_count nb:Logical_Switch 4096
 wait_row_count sb:Datapath_Binding 4095
 
 OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
northd/ovn-northd.log])
@@ -2847,6 +2852,38 @@ OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
northd/ovn-northd.log])
 AT_CLEANUP
 ])
 
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([check port tunnel ids exhaustion; vxlan chassis pops up midflight])
+ovn_start
+
+cmd="ovn-nbctl --wait=sb"
+
+cmd="${cmd} -- ls-add lsw"
+for i in {1..4096}; do
+cmd="${cmd} -- lsp-add lsw lsp-${i}"
+done
+
+eval $cmd
+
+check_row_count nb:Logical_Switch_Port 4096
+wait_row_count sb:Port_Binding 4096
+
+# Now create a fake chassis with vxlan encap to lower MAX port tunnel key to 
2^12
+ovn-sbctl \
+--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
+-- --id=@c create chassis name=hv1 encaps=@e
+
+ovn-nbctl lsp-add lsw lsp-exhausted
+
+check_row_count nb:Logical_Switch_Port 4097
+wait_row_count sb:Port_Binding 4096
+
+OVS_WAIT_UNTIL([grep "all port tunnel ids exhausted" northd/ovn-northd.log])
+
+AT_CLEANUP
+])
+
+
 
 OVN_FOR_EACH_NORTHD_NO_HV([
 AT_SETUP([Logical Flow Datapath Groups])
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] controller: Change dns resolution to async.

2024-04-05 Thread Numan Siddique
On Fri, Apr 5, 2024 at 1:38 AM Naveen Yerramneni <
naveen.yerramn...@nutanix.com> wrote:

> Currently DNS resolution is a blocking call in OVN controller.
> If DNS server is not reachable for any reason then, ovn-controller
> thread blocks for longer time and other events are not processed.
>
> Ex: If we try to run ovn-appctl commands during this then, ovn-controller
> will not respond for a longer time.
>
> Signed-off-by: Naveen Yerramneni 
> Acked-by: Mark Michelson 
>

Thanks.  I applied this patch to the main.

Numan

---
> v2: Fix subject line
> ---
>  controller/ovn-controller.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index c9ff5967a..b84f6dfd4 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -85,6 +85,7 @@
>  #include "mirror.h"
>  #include "mac_cache.h"
>  #include "statctrl.h"
> +#include "lib/dns-resolve.h"
>
>  VLOG_DEFINE_THIS_MODULE(main);
>
> @@ -5090,6 +5091,7 @@ main(int argc, char *argv[])
>  mirror_init();
>  vif_plug_provider_initialize();
>  statctrl_init();
> +dns_resolve_init(true);
>
>  /* Connect to OVS OVSDB instance. */
>  struct ovsdb_idl_loop ovs_idl_loop = OVSDB_IDL_LOOP_INITIALIZER(
> @@ -6176,6 +6178,7 @@ loop_done:
>  unixctl_server_destroy(unixctl);
>  service_stop();
>  ovsrcu_exit();
> +dns_resolve_destroy();
>
>  exit(retval);
>  }
> --
> 2.36.6
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH OVN v5 0/4] DHCP Relay Agent support for overlay subnets.

2024-04-05 Thread Numan Siddique
On Wed, Mar 20, 2024 at 10:40 AM Naveen Yerramneni <
naveen.yerramn...@nutanix.com> wrote:
>
> This patch contains changes to enable DHCP Relay Agent support for
overlay subnets.
>
> USE CASE:
> --
>   - Enable IP address assignment for overlay subnets from the
centralized DHCP server present in the underlay network.
>
> PREREQUISITES
> --
>   - Logical Router Port IP should be assigned (statically) from the
same overlay subnet which is managed by DHCP server.
>   - LRP IP is used for GIADRR field when relaying the DHCP packets
and also same IP needs to be configured as default gateway for the overlay
subnet.
>   - Overlay subnets managed by external DHCP server are expected to
be directly reachable from the underlay network.
>
> EXPECTED PACKET FLOW:
> --
> Following is the expected packet flow inorder to support DHCP rleay
functionality in OVN.
>   1. DHCP client originates DHCP discovery (broadcast).
>   2. DHCP relay (running on the OVN) receives the broadcast and
forwards the packet to the DHCP server by converting it to unicast.
>  While forwarding the packet, it updates the GIADDR in DHCP
header to its interface IP on which DHCP packet is received and increments
hop count.
>   3. DHCP server uses GIADDR field to decide the IP address pool from
which IP has to be assigned and DHCP offer is sent to the same IP (GIADDR).
>   4. DHCP relay agent forwards the offer to the client.
>   5. DHCP client sends DHCP request (broadcast) packet.
>   6. DHCP relay (running on the OVN) receives the broadcast and
forwards the packet to the DHCP server by converting it to unicast.
>  While forwarding the packet, it updates the GIADDR in DHCP
header to its interface IP on which DHCP packet is received.
>   7. DHCP Server sends the ACK packet.
>   8. DHCP relay agent forwards the ACK packet to the client.
>   9. All the future renew/release packets are directly exchanged
between DHCP client and DHCP server.
>
> OVN DHCP RELAY PACKET FLOW:
> 
> To add DHCP Relay support on OVN, we need to replicate all the
behavior described above using distributed logical switch and logical
router.
> At, highlevel packet flow is distributed among Logical Switch and
Logical Router on source node (where VM is deployed) and redirect
chassis(RC) node.
>   1. Request packet gets processed on the source node where VM is
deployed and relays the packet to DHCP server.
>   2. Response packet is first processed on RC node (which first
recieves the packet from underlay network). RC node forwards the packet to
the right node by filling in the dest MAC and IP.
>
> OVN Packet flow with DHCP relay is explained below.
>   1. DHCP client (VM) sends the DHCP discover packet (broadcast).
>   2. Logical switch converts the packet to L2 unicast by setting the
destination MAC to LRP's MAC
>   3. Logical Router receives the packet and redirects it to the OVN
controller.
>   4. OVN controller updates the required information(GIADDR, HOP
count) in the DHCP payload after doing the required checks. If any check
fails, packet is dropped.
>   5. Logical Router converts the packet to L3 unicast and forwards it
to the server. This packets gets routed like any other packet (via RC node).
>   6. Server replies with DHCP offer.
>   7. RC node processes the DHCP offer and forwards it to the OVN
controller.
>   8. OVN controller does sanity checks and  updates the destination
MAC (available in DHCP header), destination IP (available in DHCP header)
and reinjects the packet to datapath.
>  If any check fails, packet is dropped.
>   9. Logical router updates the source IP and port and forwards the
packet to logical switch.
>   10. Logical switch delivers the packet to the DHCP client.
>   11. Similar steps are performed for Request and Ack packets.
>   12. All the future renew/release packets are directly exchanged
between DHCP client and DHCP server
>
> NEW OVN ACTIONS
> ---
>   1. dhcp_relay_req_chk(, )
>   - This action executes on the source node on which the DHCP
request originated.
>   - This action relays the DHCP request coming from client to the
server. Relay-ip is used to update GIADDR in the DHCP header.
>   2. dhcp_relay_resp_chk(, )
>   - This action executes on the first node (RC node) which
processes the DHCP response from the server.
>   - This action updates  the destination MAC and destination IP
so that the response can be forwarded to the appropriate node from which
request was originated.
>   - Relay-ip, server-ip are used to validate GIADDR and SERVER ID
in the DHCP payload.
>
> FLOWS
> -
> Following are the flows added when DHCP Relay is configured on one
overlay subnet, one additonal flow is added in ls_in_l2_lkup 

Re: [ovs-dev] [PATCH ovn] Make tunnel ids exhaustion test case trigger the problem.

2024-04-05 Thread Vladislav Odintsov


> On 5 Apr 2024, at 18:35, Ihar Hrachyshka  wrote:
> 
> On Thu, Apr 4, 2024 at 3:56 PM Vladislav Odintsov  > wrote:
>> Thanks Ihar for the patch.
>> 
>> It definitely triggers the bug mentioned in Fixes commit, but how do you 
>> like next diff as an alternative?
>> It seems a little easier to me, because it shows the real limit and the 
>> situation where the problem was (separate ls-add):
>> 
> 
> Ah, I think we are talking about two separate scenarios, both resulting in 
> *hint out of [min; max] bounds!
> 
> - You are talking about hint=0 with min:max = [1; 4096] - which indeed can be 
> triggered by creating a new DP *after* tunnel ids are exhausted;
> - I am talking about a more obscure case where hint=4097 (because originally 
> there were no vxlan chassis in the cluster); then a vxlan chassis is created 
> (reducing max to 4096); then the allocation function is entered with 
> hint=4097.
> 
> Both scenarios are fixed by your patch. It may be worth having both test 
> cases, one per scenario, in the test suite then. What do you think?

I agree, it’s worth adding both.
Thanks for clarification!

> 
> (Side Note: I now find the runtime flip of max cap as a vxlan chassis is 
> added - that I myself implemented - unfortunate.)
> 
> Ihar
>  
>> diff --git a/tests/ovn-northd.at  
>> b/tests/ovn-northd.at 
>> index 6edb1129e..cef144f10 100644
>> --- a/tests/ovn-northd.at 
>> +++ b/tests/ovn-northd.at 
>> @@ -2862,13 +2862,18 @@ ovn-sbctl \
>>  
>>  cmd="ovn-nbctl --wait=sb"
>>  
>> -for i in {1..4097}; do
>> +for i in {1..4095}; do
>>  cmd="${cmd} -- ls-add lsw-${i}"
>>  done
>>  
>>  eval $cmd
>>  
>> -check_row_count nb:Logical_Switch 4097
>> +check_row_count nb:Logical_Switch 4095
>> +wait_row_count sb:Datapath_Binding 4095
>> +
>> +ovn-nbctl ls-add lsw-exhausted
>> +
>> +check_row_count nb:Logical_Switch 4096
>>  wait_row_count sb:Datapath_Binding 4095
>>  
>>  OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
>> northd/ovn-northd.log])
>> 
>> 
>>> On 4 Apr 2024, at 20:13, Ihar Hrachyshka >> > wrote:
>>> 
>>> The original version of the scenario passed with or without the fix.
>>> 
>>> Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
>>> Signed-off-by: Ihar Hrachyshka >> >
>>> ---
>>> tests/ovn-northd.at  | 17 +++--
>>> 1 file changed, 11 insertions(+), 6 deletions(-)
>>> 
>>> diff --git a/tests/ovn-northd.at  
>>> b/tests/ovn-northd.at 
>>> index fc2c972a4..e8ea8b050 100644
>>> --- a/tests/ovn-northd.at 
>>> +++ b/tests/ovn-northd.at 
>>> @@ -2826,11 +2826,6 @@ OVN_FOR_EACH_NORTHD_NO_HV([
>>> AT_SETUP([check tunnel ids exhaustion])
>>> ovn_start
>>> 
>>> -# Create a fake chassis with vxlan encap to lower MAX DP tunnel key to 2^12
>>> -ovn-sbctl \
>>> ---id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
>>> --- --id=@c create chassis name=hv1 encaps=@e
>>> -
>>> cmd="ovn-nbctl --wait=sb"
>>> 
>>> for i in {1..4097}; do
>>> @@ -2840,7 +2835,17 @@ done
>>> eval $cmd
>>> 
>>> check_row_count nb:Logical_Switch 4097
>>> -wait_row_count sb:Datapath_Binding 4095
>>> +wait_row_count sb:Datapath_Binding 4097
>>> +
>>> +# Now create a fake chassis with vxlan encap to lower MAX DP tunnel key to 
>>> 2^12
>>> +ovn-sbctl \
>>> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
>>> +-- --id=@c create chassis name=hv1 encaps=@e
>>> +
>>> +ovn-nbctl --wait=sb ls-add lsw-exhausted
>>> +
>>> +check_row_count nb:Logical_Switch 4098
>>> +wait_row_count sb:Datapath_Binding 4097
>>> 
>>> OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
>>> northd/ovn-northd.log])
>>> 
>>> -- 
>>> 2.41.0
>>> 
>>> ___
>>> dev mailing list
>>> d...@openvswitch.org 
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>> 
>> 
>> 
>> 
>> Regards,
>> Vladislav Odintsov
>> 


Regards,
Vladislav Odintsov

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] Make tunnel ids exhaustion test case trigger the problem.

2024-04-05 Thread Ihar Hrachyshka
On Thu, Apr 4, 2024 at 3:56 PM Vladislav Odintsov  wrote:

> Thanks Ihar for the patch.
>
> It definitely triggers the bug mentioned in Fixes commit, but how do you
> like next diff as an alternative?
> It seems a little easier to me, because it shows the real limit and the
> situation where the problem was (separate ls-add):
>
>
Ah, I think we are talking about two separate scenarios, both resulting in
*hint out of [min; max] bounds!

- You are talking about hint=0 with min:max = [1; 4096] - which indeed can
be triggered by creating a new DP *after* tunnel ids are exhausted;
- I am talking about a more obscure case where hint=4097 (because
originally there were no vxlan chassis in the cluster); then a vxlan
chassis is created (reducing max to 4096); then the allocation function is
entered with hint=4097.

Both scenarios are fixed by your patch. It may be worth having both test
cases, one per scenario, in the test suite then. What do you think?

(Side Note: I now find the runtime flip of max cap as a vxlan chassis is
added - that I myself implemented - unfortunate.)

Ihar


> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 6edb1129e..cef144f10 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -2862,13 +2862,18 @@ ovn-sbctl \
>
>  cmd="ovn-nbctl --wait=sb"
>
> -for i in {1..4097}; do
> +for i in {1..4095}; do
>  cmd="${cmd} -- ls-add lsw-${i}"
>  done
>
>  eval $cmd
>
> -check_row_count nb:Logical_Switch 4097
> +check_row_count nb:Logical_Switch 4095
> +wait_row_count sb:Datapath_Binding 4095
> +
> +ovn-nbctl ls-add lsw-exhausted
> +
> +check_row_count nb:Logical_Switch 4096
>  wait_row_count sb:Datapath_Binding 4095
>
>  OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
> northd/ovn-northd.log])
>
>
> On 4 Apr 2024, at 20:13, Ihar Hrachyshka  wrote:
>
> The original version of the scenario passed with or without the fix.
>
> Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
> Signed-off-by: Ihar Hrachyshka 
> ---
> tests/ovn-northd.at | 17 +++--
> 1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index fc2c972a4..e8ea8b050 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -2826,11 +2826,6 @@ OVN_FOR_EACH_NORTHD_NO_HV([
> AT_SETUP([check tunnel ids exhaustion])
> ovn_start
>
> -# Create a fake chassis with vxlan encap to lower MAX DP tunnel key to
> 2^12
> -ovn-sbctl \
> ---id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
> --- --id=@c create chassis name=hv1 encaps=@e
> -
> cmd="ovn-nbctl --wait=sb"
>
> for i in {1..4097}; do
> @@ -2840,7 +2835,17 @@ done
> eval $cmd
>
> check_row_count nb:Logical_Switch 4097
> -wait_row_count sb:Datapath_Binding 4095
> +wait_row_count sb:Datapath_Binding 4097
> +
> +# Now create a fake chassis with vxlan encap to lower MAX DP tunnel key
> to 2^12
> +ovn-sbctl \
> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
> +-- --id=@c create chassis name=hv1 encaps=@e
> +
> +ovn-nbctl --wait=sb ls-add lsw-exhausted
> +
> +check_row_count nb:Logical_Switch 4098
> +wait_row_count sb:Datapath_Binding 4097
>
> OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
> northd/ovn-northd.log])
>
> --
> 2.41.0
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
>
>
>
> Regards,
> Vladislav Odintsov
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] Documentation: Updates for rename of primary development branch as main.

2024-04-05 Thread Eelco Chaudron



On 5 Apr 2024, at 15:57, Simon Horman wrote:

> Recently OVS adopted a policy of using the inclusive naming word list v1
> [1, 2].
>
> In keeping with this policy rename the primary development branch from
> 'master' to 'main'. This patch does not actually make that change, but
> rather updates references to the branch in documentation in the source
> tree.  It is intended to be applied at (approximately) the same time
> that the change is made.
>
> OVS is currently hosted on GitHub. We can expect the following behaviour
> after the rename:
>
> 1. GitHub pull requests against are renamed branch are automatically
>re-homed on new branch
> 2. GitHub Issues do not seem to be affected - at least the test issue I
>created had no association with a branch
> 3. URLs accessed via the GitHub web UI are automatically renamed
>(so long as a new branch called master is not created).
> 4. Using the git cli command, fetch will fetch the new branch (main),
>and fetch -p will remove (prune) the old branch (master)
>
> [1] df5e5cf4318a ("Documentation: Add section on inclusive language.")
> [2] https://inclusivenaming.org/word-lists/
>
> Signed-off-by: Simon Horman 

Thanks this revision looks good to me.

Acked-by: Eelco Chaudron 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] Documentation: Updates for rename of primary development branch as main.

2024-04-05 Thread 0-day Robot
Bleep bloop.  Greetings Simon Horman, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: The subject, ': ', is over 70 characters, i.e., 72.
Subject: Documentation: Updates for rename of primary development branch as 
main.
WARNING: Line is 110 characters long (recommended limit is 79)
#374 FILE: README.rst:11:
.. image:: 
https://ci.appveyor.com/api/projects/status/github/openvswitch/ovs?branch=main=true=true

Lines checked: 381, Warnings: 2, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3] Documentation: Updates for rename of primary development branch as main.

2024-04-05 Thread Simon Horman
Recently OVS adopted a policy of using the inclusive naming word list v1
[1, 2].

In keeping with this policy rename the primary development branch from
'master' to 'main'. This patch does not actually make that change, but
rather updates references to the branch in documentation in the source
tree.  It is intended to be applied at (approximately) the same time
that the change is made.

OVS is currently hosted on GitHub. We can expect the following behaviour
after the rename:

1. GitHub pull requests against are renamed branch are automatically
   re-homed on new branch
2. GitHub Issues do not seem to be affected - at least the test issue I
   created had no association with a branch
3. URLs accessed via the GitHub web UI are automatically renamed
   (so long as a new branch called master is not created).
4. Using the git cli command, fetch will fetch the new branch (main),
   and fetch -p will remove (prune) the old branch (master)

[1] df5e5cf4318a ("Documentation: Add section on inclusive language.")
[2] https://inclusivenaming.org/word-lists/

Signed-off-by: Simon Horman 
---
Changes in v3:
- This patch only updates documentation. Update the patch prefix and
  description accordingly.
- Drop documentation of 'tested on "master" branch' rather than updating
  it, as the updated text seems somewhat nonsensical.
- Correct indentation of NEWS entry.

Changes in v2:
- Keep two blank lines between versions.
- Drop bogus update to OpenSSL hashes URL in appveyor.yml.
- Drop other appveyor.yml changes, they are now present upstream.
  + appveyor: Prepare for rename of primary development branch.
https://github.com/openvswitch/ovs/commit/95ff912edef8
- Add note about updates to git configuration.
---
Notes:

* Now is the time to raise any concerns regarding this patch.
  It is planned to implement this change next week.

* If you have an automation that fetches the master branch then
  the suggested action is:
  1. Before the branch rename occurs: update the automation to pull main an
 fall back to pulling master if that fails
  2. After the rename occurs: Update the automation to only fetch main

* After the change it may be necessary to update your local
  git configuration for checked out branches.

  For example:
  # Fetch origin: new remote main branch; remote master branch is deleted
  git fetch -tp origin
  # Rename local branch
  git branch -m master main
  # Update local main branch to use remote main branch as it's upstream
  git branch --set-upstream-to=origin/main main

* As a follow-up, after the rename, I plan to post a patch which removes
  references to master in CI jobs
---
 .../internals/committer-responsibilities.rst   | 12 +++---
 .../internals/contributing/backporting-patches.rst | 12 +++---
 Documentation/internals/release-process.rst| 50 +++---
 Documentation/intro/install/dpdk.rst   |  2 +-
 Documentation/intro/install/fedora.rst |  2 +-
 Documentation/intro/install/general.rst|  2 +-
 Documentation/intro/install/rhel.rst   |  2 +-
 Documentation/topics/language-bindings.rst |  2 +-
 Documentation/tutorials/faucet.rst |  6 +--
 Documentation/tutorials/ovs-conntrack.rst  |  1 -
 NEWS   |  3 ++
 README.rst |  2 +-
 12 files changed, 49 insertions(+), 47 deletions(-)

diff --git a/Documentation/internals/committer-responsibilities.rst 
b/Documentation/internals/committer-responsibilities.rst
index c35fd708913b..eed2e017678a 100644
--- a/Documentation/internals/committer-responsibilities.rst
+++ b/Documentation/internals/committer-responsibilities.rst
@@ -73,14 +73,14 @@ If it is someone else's change, then you can ask the 
original submitter to
 address it. Regardless, you need to ensure that the problem is fixed in a
 timely way. The definition of "timely" depends on the severity of the problem.
 
-If a bug is present on master and other branches, fix it on master first, then
+If a bug is present on main and other branches, fix it on main first, then
 backport the fix to other branches. Straightforward backports do not require
-additional review (beyond that for the fix on master).
+additional review (beyond that for the fix on main).
 
-Feature development should be done only on master. Occasionally it makes sense
+Feature development should be done only on main. Occasionally it makes sense
 to add a feature to the most recent release branch, before the first actual
 release of that branch. These should be handled in the same way as bug fixes,
-that is, first implemented on master and then backported.
+that is, first implemented on main and then backported.
 
 Keep the authorship of a commit clear by maintaining a correct list of
 "Signed-off-by:"s. If a confusing situation comes up, as it occasionally does,
@@ -99,7 +99,7 @@ Pre-Push Hook
 -
 
 The following script can be helpful 

Re: [ovs-dev] [PATCH v14 4/6] dpif: Probe support for OVS_ACTION_ATTR_DROP.

2024-04-05 Thread Eelco Chaudron


On 5 Apr 2024, at 15:43, Ilya Maximets wrote:

> On 4/5/24 15:08, Ilya Maximets wrote:
>> On 4/5/24 15:01, Eric Garver wrote:
>>> On Fri, Apr 05, 2024 at 11:39:58AM +0200, Eelco Chaudron wrote:


 On 3 Apr 2024, at 16:35, Eric Garver wrote:

> Kernel support has been added for this action. As such, we need to probe
> the datapath for support.
>
> Signed-off-by: Eric Garver 
> ---
>  include/linux/openvswitch.h |  2 +-
>  lib/dpif.c  |  6 ++-
>  lib/dpif.h  |  2 +-
>  ofproto/ofproto-dpif.c  | 77 ++---
>  ofproto/ofproto-dpif.h  |  4 +-
>  5 files changed, 81 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
> index d18754c84f62..d9fb991ef234 100644
> --- a/include/linux/openvswitch.h
> +++ b/include/linux/openvswitch.h
> @@ -1086,11 +1086,11 @@ enum ovs_action_attr {
>   OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
>   OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
>   OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
> + OVS_ACTION_ATTR_DROP, /* u32 xlate_error. */
>
>  #ifndef __KERNEL__
>   OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
>   OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
> - OVS_ACTION_ATTR_DROP,  /* u32 xlate_error. */
>   OVS_ACTION_ATTR_LB_OUTPUT, /* u32 bond-id. */
>  #endif
>   __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
> diff --git a/lib/dpif.c b/lib/dpif.c
> index 0f480bec48d0..23eb18495a66 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -28,6 +28,7 @@
>  #include "dpctl.h"
>  #include "dpif-netdev.h"
>  #include "flow.h"
> +#include "netdev-offload.h"
>  #include "netdev-provider.h"
>  #include "netdev.h"
>  #include "netlink.h"
> @@ -1935,9 +1936,10 @@ dpif_supports_tnl_push_pop(const struct dpif *dpif)
>  }
>
>  bool
> -dpif_supports_explicit_drop_action(const struct dpif *dpif)
> +dpif_may_support_explicit_drop_action(const struct dpif *dpif)
>  {
> -return dpif_is_netdev(dpif);
> +/* TC does not support offloading this action. */
> +return dpif_is_netdev(dpif) || !netdev_is_flow_api_enabled();
>  }
>
>  bool
> diff --git a/lib/dpif.h b/lib/dpif.h
> index 0f2dc2ef3c55..a764e8a592bd 100644
> --- a/lib/dpif.h
> +++ b/lib/dpif.h
> @@ -940,7 +940,7 @@ int dpif_get_pmds_for_port(const struct dpif * dpif, 
> odp_port_t port_no,
>
>  char *dpif_get_dp_version(const struct dpif *);
>  bool dpif_supports_tnl_push_pop(const struct dpif *);
> -bool dpif_supports_explicit_drop_action(const struct dpif *);
> +bool dpif_may_support_explicit_drop_action(const struct dpif *);
>  bool dpif_synced_dp_layers(struct dpif *);
>
>  /* Log functions. */
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index c4e2e867ecdc..32d037be607c 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -221,6 +221,7 @@ static void ct_zone_config_init(struct dpif_backer 
> *backer);
>  static void ct_zone_config_uninit(struct dpif_backer *backer);
>  static void ct_zone_timeout_policy_sweep(struct dpif_backer *backer);
>  static void ct_zone_limits_commit(struct dpif_backer *backer);
> +static bool recheck_support_explicit_drop_action(struct dpif_backer 
> *backer);
>
>  static inline struct ofproto_dpif *
>  ofproto_dpif_cast(const struct ofproto *ofproto)
> @@ -391,6 +392,10 @@ type_run(const char *type)
>  udpif_set_threads(backer->udpif, n_handlers, n_revalidators);
>  }
>
> +if (recheck_support_explicit_drop_action(backer)) {
> +backer->need_revalidate = REV_RECONFIGURE;
> +}
> +
>  if (backer->need_revalidate) {
>  struct ofproto_dpif *ofproto;
>  struct simap_node *node;
> @@ -855,7 +860,11 @@ ovs_native_tunneling_is_on(struct ofproto_dpif 
> *ofproto)
>  bool
>  ovs_explicit_drop_action_supported(struct ofproto_dpif *ofproto)
>  {
> -return ofproto->backer->rt_support.explicit_drop_action;
> +bool value;
> +
> +
> atomic_read_relaxed(>backer->rt_support.explicit_drop_action,
> +);
> +return value;
>  }
>
>  bool
> @@ -1379,6 +1388,40 @@ check_ct_timeout_policy(struct dpif_backer *backer)
>  return !error;
>  }
>
> +/* Tests whether backer's datapath supports the OVS_ACTION_ATTR_DROP 
> action. */
> +static bool
> +check_drop_action(struct dpif_backer *backer)
> +{
> +struct odputil_keybuf keybuf;
> +uint8_t actbuf[NL_A_U32_SIZE];

Re: [ovs-dev] [PATCH] ofproto-dpif-upcall: Fix ukey installation failure logs and counters.

2024-04-05 Thread Eelco Chaudron


On 5 Apr 2024, at 15:04, Ilya Maximets wrote:

> On 4/5/24 14:33, Eelco Chaudron wrote:
>>
>>
>> On 4 Apr 2024, at 14:09, Ilya Maximets wrote:
>>
>>> ukey_install() returns boolean signaling if the ukey was installed
>>> or not.  Installation may fail for a few reasons:
>>>
>>>  1. Conflicting ukey.
>>>  2. Mutex contention while trying to replace existing ukey.
>>>  3. The same ukey already exists and active.
>>>
>>> Only the first case here signals an actual problem.  Third one is
>>> a little odd for userspace datapath, but harmless.  Second is the
>>> most common one that can easily happen during normal operation
>>> since other threads like revalidators may be currently working on
>>> this ukey preventing an immediate access.
>>>
>>> Since only the first case is actually worth logging and it already
>>> has its own log message, removing the 'upcall installation fails'
>>> warning from the upcall_cb().  This should fix most of the random
>>> failures of userspace system tests in CI.
>>>
>>> While at it, also fixing coverage counters.  Mutex contention was
>>> mistakenly counted as a duplicate upcall.  ukey contention for
>>> revalidators was counted only in one of two places.
>>>
>>> New counter added for the ukey contention on replace.  We should
>>> not re-use existing upcall_ukey_contention counter for this, since
>>> it may lead to double counting.
>>>
>>> Fixes: 67f08985d769 ("upcall: Replace ukeys for deleted flows.")
>>> Fixes: 9cec8274ed9a ("ofproto-dpif-upcall: Add VLOG_WARN_RL logs for 
>>> upcall_cb() error.")
>>> Signed-off-by: Ilya Maximets 
>>> ---
>>
>> Thanks for looking into this, and the patch looks good to me.
>>
>> Maybe we should have another patch fixing some of the namings?
>>
>>   upcall_ukey_replace -> ukey_replace
>>   handler_duplicate_upcall -> duplicate_upcall
>>   upcall_ukey_contention -> revalidate_ukey_contention -> ukey_contention
>
> I had something similar in mind, but I didn't make any radical
> renaming since I plan to backport this change.
>
> One more thing is that we would likely want to distinguish
> contention in handlers/pmds from contention in revalidators,
> so these will need separate counters, I think.

Ack but all can call the update functions ;) We can follow this up after the 
backport…

>>
>> Cheers,
>>
>> Eelco
>>
>> Acked-by: Eelco Chaudron 
>>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v14 4/6] dpif: Probe support for OVS_ACTION_ATTR_DROP.

2024-04-05 Thread Ilya Maximets
On 4/5/24 15:08, Ilya Maximets wrote:
> On 4/5/24 15:01, Eric Garver wrote:
>> On Fri, Apr 05, 2024 at 11:39:58AM +0200, Eelco Chaudron wrote:
>>>
>>>
>>> On 3 Apr 2024, at 16:35, Eric Garver wrote:
>>>
 Kernel support has been added for this action. As such, we need to probe
 the datapath for support.

 Signed-off-by: Eric Garver 
 ---
  include/linux/openvswitch.h |  2 +-
  lib/dpif.c  |  6 ++-
  lib/dpif.h  |  2 +-
  ofproto/ofproto-dpif.c  | 77 ++---
  ofproto/ofproto-dpif.h  |  4 +-
  5 files changed, 81 insertions(+), 10 deletions(-)

 diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
 index d18754c84f62..d9fb991ef234 100644
 --- a/include/linux/openvswitch.h
 +++ b/include/linux/openvswitch.h
 @@ -1086,11 +1086,11 @@ enum ovs_action_attr {
OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
 +  OVS_ACTION_ATTR_DROP, /* u32 xlate_error. */

  #ifndef __KERNEL__
OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
 -  OVS_ACTION_ATTR_DROP,  /* u32 xlate_error. */
OVS_ACTION_ATTR_LB_OUTPUT, /* u32 bond-id. */
  #endif
__OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
 diff --git a/lib/dpif.c b/lib/dpif.c
 index 0f480bec48d0..23eb18495a66 100644
 --- a/lib/dpif.c
 +++ b/lib/dpif.c
 @@ -28,6 +28,7 @@
  #include "dpctl.h"
  #include "dpif-netdev.h"
  #include "flow.h"
 +#include "netdev-offload.h"
  #include "netdev-provider.h"
  #include "netdev.h"
  #include "netlink.h"
 @@ -1935,9 +1936,10 @@ dpif_supports_tnl_push_pop(const struct dpif *dpif)
  }

  bool
 -dpif_supports_explicit_drop_action(const struct dpif *dpif)
 +dpif_may_support_explicit_drop_action(const struct dpif *dpif)
  {
 -return dpif_is_netdev(dpif);
 +/* TC does not support offloading this action. */
 +return dpif_is_netdev(dpif) || !netdev_is_flow_api_enabled();
  }

  bool
 diff --git a/lib/dpif.h b/lib/dpif.h
 index 0f2dc2ef3c55..a764e8a592bd 100644
 --- a/lib/dpif.h
 +++ b/lib/dpif.h
 @@ -940,7 +940,7 @@ int dpif_get_pmds_for_port(const struct dpif * dpif, 
 odp_port_t port_no,

  char *dpif_get_dp_version(const struct dpif *);
  bool dpif_supports_tnl_push_pop(const struct dpif *);
 -bool dpif_supports_explicit_drop_action(const struct dpif *);
 +bool dpif_may_support_explicit_drop_action(const struct dpif *);
  bool dpif_synced_dp_layers(struct dpif *);

  /* Log functions. */
 diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
 index c4e2e867ecdc..32d037be607c 100644
 --- a/ofproto/ofproto-dpif.c
 +++ b/ofproto/ofproto-dpif.c
 @@ -221,6 +221,7 @@ static void ct_zone_config_init(struct dpif_backer 
 *backer);
  static void ct_zone_config_uninit(struct dpif_backer *backer);
  static void ct_zone_timeout_policy_sweep(struct dpif_backer *backer);
  static void ct_zone_limits_commit(struct dpif_backer *backer);
 +static bool recheck_support_explicit_drop_action(struct dpif_backer 
 *backer);

  static inline struct ofproto_dpif *
  ofproto_dpif_cast(const struct ofproto *ofproto)
 @@ -391,6 +392,10 @@ type_run(const char *type)
  udpif_set_threads(backer->udpif, n_handlers, n_revalidators);
  }

 +if (recheck_support_explicit_drop_action(backer)) {
 +backer->need_revalidate = REV_RECONFIGURE;
 +}
 +
  if (backer->need_revalidate) {
  struct ofproto_dpif *ofproto;
  struct simap_node *node;
 @@ -855,7 +860,11 @@ ovs_native_tunneling_is_on(struct ofproto_dpif 
 *ofproto)
  bool
  ovs_explicit_drop_action_supported(struct ofproto_dpif *ofproto)
  {
 -return ofproto->backer->rt_support.explicit_drop_action;
 +bool value;
 +
 +atomic_read_relaxed(>backer->rt_support.explicit_drop_action,
 +);
 +return value;
  }

  bool
 @@ -1379,6 +1388,40 @@ check_ct_timeout_policy(struct dpif_backer *backer)
  return !error;
  }

 +/* Tests whether backer's datapath supports the OVS_ACTION_ATTR_DROP 
 action. */
 +static bool
 +check_drop_action(struct dpif_backer *backer)
 +{
 +struct odputil_keybuf keybuf;
 +uint8_t actbuf[NL_A_U32_SIZE];
 +struct ofpbuf actions;
 +struct ofpbuf key;
 +bool supported;
 +
 +struct flow flow = {
 +.dl_type = 

Re: [ovs-dev] [PATCH v14 4/6] dpif: Probe support for OVS_ACTION_ATTR_DROP.

2024-04-05 Thread Ilya Maximets
On 4/5/24 15:01, Eric Garver wrote:
> On Fri, Apr 05, 2024 at 11:39:58AM +0200, Eelco Chaudron wrote:
>>
>>
>> On 3 Apr 2024, at 16:35, Eric Garver wrote:
>>
>>> Kernel support has been added for this action. As such, we need to probe
>>> the datapath for support.
>>>
>>> Signed-off-by: Eric Garver 
>>> ---
>>>  include/linux/openvswitch.h |  2 +-
>>>  lib/dpif.c  |  6 ++-
>>>  lib/dpif.h  |  2 +-
>>>  ofproto/ofproto-dpif.c  | 77 ++---
>>>  ofproto/ofproto-dpif.h  |  4 +-
>>>  5 files changed, 81 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
>>> index d18754c84f62..d9fb991ef234 100644
>>> --- a/include/linux/openvswitch.h
>>> +++ b/include/linux/openvswitch.h
>>> @@ -1086,11 +1086,11 @@ enum ovs_action_attr {
>>> OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
>>> OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
>>> OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
>>> +   OVS_ACTION_ATTR_DROP, /* u32 xlate_error. */
>>>
>>>  #ifndef __KERNEL__
>>> OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
>>> OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
>>> -   OVS_ACTION_ATTR_DROP,  /* u32 xlate_error. */
>>> OVS_ACTION_ATTR_LB_OUTPUT, /* u32 bond-id. */
>>>  #endif
>>> __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
>>> diff --git a/lib/dpif.c b/lib/dpif.c
>>> index 0f480bec48d0..23eb18495a66 100644
>>> --- a/lib/dpif.c
>>> +++ b/lib/dpif.c
>>> @@ -28,6 +28,7 @@
>>>  #include "dpctl.h"
>>>  #include "dpif-netdev.h"
>>>  #include "flow.h"
>>> +#include "netdev-offload.h"
>>>  #include "netdev-provider.h"
>>>  #include "netdev.h"
>>>  #include "netlink.h"
>>> @@ -1935,9 +1936,10 @@ dpif_supports_tnl_push_pop(const struct dpif *dpif)
>>>  }
>>>
>>>  bool
>>> -dpif_supports_explicit_drop_action(const struct dpif *dpif)
>>> +dpif_may_support_explicit_drop_action(const struct dpif *dpif)
>>>  {
>>> -return dpif_is_netdev(dpif);
>>> +/* TC does not support offloading this action. */
>>> +return dpif_is_netdev(dpif) || !netdev_is_flow_api_enabled();
>>>  }
>>>
>>>  bool
>>> diff --git a/lib/dpif.h b/lib/dpif.h
>>> index 0f2dc2ef3c55..a764e8a592bd 100644
>>> --- a/lib/dpif.h
>>> +++ b/lib/dpif.h
>>> @@ -940,7 +940,7 @@ int dpif_get_pmds_for_port(const struct dpif * dpif, 
>>> odp_port_t port_no,
>>>
>>>  char *dpif_get_dp_version(const struct dpif *);
>>>  bool dpif_supports_tnl_push_pop(const struct dpif *);
>>> -bool dpif_supports_explicit_drop_action(const struct dpif *);
>>> +bool dpif_may_support_explicit_drop_action(const struct dpif *);
>>>  bool dpif_synced_dp_layers(struct dpif *);
>>>
>>>  /* Log functions. */
>>> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
>>> index c4e2e867ecdc..32d037be607c 100644
>>> --- a/ofproto/ofproto-dpif.c
>>> +++ b/ofproto/ofproto-dpif.c
>>> @@ -221,6 +221,7 @@ static void ct_zone_config_init(struct dpif_backer 
>>> *backer);
>>>  static void ct_zone_config_uninit(struct dpif_backer *backer);
>>>  static void ct_zone_timeout_policy_sweep(struct dpif_backer *backer);
>>>  static void ct_zone_limits_commit(struct dpif_backer *backer);
>>> +static bool recheck_support_explicit_drop_action(struct dpif_backer 
>>> *backer);
>>>
>>>  static inline struct ofproto_dpif *
>>>  ofproto_dpif_cast(const struct ofproto *ofproto)
>>> @@ -391,6 +392,10 @@ type_run(const char *type)
>>>  udpif_set_threads(backer->udpif, n_handlers, n_revalidators);
>>>  }
>>>
>>> +if (recheck_support_explicit_drop_action(backer)) {
>>> +backer->need_revalidate = REV_RECONFIGURE;
>>> +}
>>> +
>>>  if (backer->need_revalidate) {
>>>  struct ofproto_dpif *ofproto;
>>>  struct simap_node *node;
>>> @@ -855,7 +860,11 @@ ovs_native_tunneling_is_on(struct ofproto_dpif 
>>> *ofproto)
>>>  bool
>>>  ovs_explicit_drop_action_supported(struct ofproto_dpif *ofproto)
>>>  {
>>> -return ofproto->backer->rt_support.explicit_drop_action;
>>> +bool value;
>>> +
>>> +atomic_read_relaxed(>backer->rt_support.explicit_drop_action,
>>> +);
>>> +return value;
>>>  }
>>>
>>>  bool
>>> @@ -1379,6 +1388,40 @@ check_ct_timeout_policy(struct dpif_backer *backer)
>>>  return !error;
>>>  }
>>>
>>> +/* Tests whether backer's datapath supports the OVS_ACTION_ATTR_DROP 
>>> action. */
>>> +static bool
>>> +check_drop_action(struct dpif_backer *backer)
>>> +{
>>> +struct odputil_keybuf keybuf;
>>> +uint8_t actbuf[NL_A_U32_SIZE];
>>> +struct ofpbuf actions;
>>> +struct ofpbuf key;
>>> +bool supported;
>>> +
>>> +struct flow flow = {
>>> +.dl_type = CONSTANT_HTONS(0x1234), /* bogus */
>>> +};
>>> +struct odp_flow_key_parms odp_parms = {
>>> +.flow = ,
>>> +.probe = true,
>>> +};
>>> +
>>> +

Re: [ovs-dev] [PATCH] ofproto-dpif-upcall: Fix ukey installation failure logs and counters.

2024-04-05 Thread Ilya Maximets
On 4/5/24 14:33, Eelco Chaudron wrote:
> 
> 
> On 4 Apr 2024, at 14:09, Ilya Maximets wrote:
> 
>> ukey_install() returns boolean signaling if the ukey was installed
>> or not.  Installation may fail for a few reasons:
>>
>>  1. Conflicting ukey.
>>  2. Mutex contention while trying to replace existing ukey.
>>  3. The same ukey already exists and active.
>>
>> Only the first case here signals an actual problem.  Third one is
>> a little odd for userspace datapath, but harmless.  Second is the
>> most common one that can easily happen during normal operation
>> since other threads like revalidators may be currently working on
>> this ukey preventing an immediate access.
>>
>> Since only the first case is actually worth logging and it already
>> has its own log message, removing the 'upcall installation fails'
>> warning from the upcall_cb().  This should fix most of the random
>> failures of userspace system tests in CI.
>>
>> While at it, also fixing coverage counters.  Mutex contention was
>> mistakenly counted as a duplicate upcall.  ukey contention for
>> revalidators was counted only in one of two places.
>>
>> New counter added for the ukey contention on replace.  We should
>> not re-use existing upcall_ukey_contention counter for this, since
>> it may lead to double counting.
>>
>> Fixes: 67f08985d769 ("upcall: Replace ukeys for deleted flows.")
>> Fixes: 9cec8274ed9a ("ofproto-dpif-upcall: Add VLOG_WARN_RL logs for 
>> upcall_cb() error.")
>> Signed-off-by: Ilya Maximets 
>> ---
> 
> Thanks for looking into this, and the patch looks good to me.
> 
> Maybe we should have another patch fixing some of the namings?
> 
>   upcall_ukey_replace -> ukey_replace
>   handler_duplicate_upcall -> duplicate_upcall
>   upcall_ukey_contention -> revalidate_ukey_contention -> ukey_contention

I had something similar in mind, but I didn't make any radical
renaming since I plan to backport this change.

One more thing is that we would likely want to distinguish
contention in handlers/pmds from contention in revalidators,
so these will need separate counters, I think.

> 
> Cheers,
> 
> Eelco
> 
> Acked-by: Eelco Chaudron 
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v14 4/6] dpif: Probe support for OVS_ACTION_ATTR_DROP.

2024-04-05 Thread Eric Garver
On Fri, Apr 05, 2024 at 11:39:58AM +0200, Eelco Chaudron wrote:
> 
> 
> On 3 Apr 2024, at 16:35, Eric Garver wrote:
> 
> > Kernel support has been added for this action. As such, we need to probe
> > the datapath for support.
> >
> > Signed-off-by: Eric Garver 
> > ---
> >  include/linux/openvswitch.h |  2 +-
> >  lib/dpif.c  |  6 ++-
> >  lib/dpif.h  |  2 +-
> >  ofproto/ofproto-dpif.c  | 77 ++---
> >  ofproto/ofproto-dpif.h  |  4 +-
> >  5 files changed, 81 insertions(+), 10 deletions(-)
> >
> > diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
> > index d18754c84f62..d9fb991ef234 100644
> > --- a/include/linux/openvswitch.h
> > +++ b/include/linux/openvswitch.h
> > @@ -1086,11 +1086,11 @@ enum ovs_action_attr {
> > OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
> > OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
> > OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
> > +   OVS_ACTION_ATTR_DROP, /* u32 xlate_error. */
> >
> >  #ifndef __KERNEL__
> > OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
> > OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
> > -   OVS_ACTION_ATTR_DROP,  /* u32 xlate_error. */
> > OVS_ACTION_ATTR_LB_OUTPUT, /* u32 bond-id. */
> >  #endif
> > __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
> > diff --git a/lib/dpif.c b/lib/dpif.c
> > index 0f480bec48d0..23eb18495a66 100644
> > --- a/lib/dpif.c
> > +++ b/lib/dpif.c
> > @@ -28,6 +28,7 @@
> >  #include "dpctl.h"
> >  #include "dpif-netdev.h"
> >  #include "flow.h"
> > +#include "netdev-offload.h"
> >  #include "netdev-provider.h"
> >  #include "netdev.h"
> >  #include "netlink.h"
> > @@ -1935,9 +1936,10 @@ dpif_supports_tnl_push_pop(const struct dpif *dpif)
> >  }
> >
> >  bool
> > -dpif_supports_explicit_drop_action(const struct dpif *dpif)
> > +dpif_may_support_explicit_drop_action(const struct dpif *dpif)
> >  {
> > -return dpif_is_netdev(dpif);
> > +/* TC does not support offloading this action. */
> > +return dpif_is_netdev(dpif) || !netdev_is_flow_api_enabled();
> >  }
> >
> >  bool
> > diff --git a/lib/dpif.h b/lib/dpif.h
> > index 0f2dc2ef3c55..a764e8a592bd 100644
> > --- a/lib/dpif.h
> > +++ b/lib/dpif.h
> > @@ -940,7 +940,7 @@ int dpif_get_pmds_for_port(const struct dpif * dpif, 
> > odp_port_t port_no,
> >
> >  char *dpif_get_dp_version(const struct dpif *);
> >  bool dpif_supports_tnl_push_pop(const struct dpif *);
> > -bool dpif_supports_explicit_drop_action(const struct dpif *);
> > +bool dpif_may_support_explicit_drop_action(const struct dpif *);
> >  bool dpif_synced_dp_layers(struct dpif *);
> >
> >  /* Log functions. */
> > diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> > index c4e2e867ecdc..32d037be607c 100644
> > --- a/ofproto/ofproto-dpif.c
> > +++ b/ofproto/ofproto-dpif.c
> > @@ -221,6 +221,7 @@ static void ct_zone_config_init(struct dpif_backer 
> > *backer);
> >  static void ct_zone_config_uninit(struct dpif_backer *backer);
> >  static void ct_zone_timeout_policy_sweep(struct dpif_backer *backer);
> >  static void ct_zone_limits_commit(struct dpif_backer *backer);
> > +static bool recheck_support_explicit_drop_action(struct dpif_backer 
> > *backer);
> >
> >  static inline struct ofproto_dpif *
> >  ofproto_dpif_cast(const struct ofproto *ofproto)
> > @@ -391,6 +392,10 @@ type_run(const char *type)
> >  udpif_set_threads(backer->udpif, n_handlers, n_revalidators);
> >  }
> >
> > +if (recheck_support_explicit_drop_action(backer)) {
> > +backer->need_revalidate = REV_RECONFIGURE;
> > +}
> > +
> >  if (backer->need_revalidate) {
> >  struct ofproto_dpif *ofproto;
> >  struct simap_node *node;
> > @@ -855,7 +860,11 @@ ovs_native_tunneling_is_on(struct ofproto_dpif 
> > *ofproto)
> >  bool
> >  ovs_explicit_drop_action_supported(struct ofproto_dpif *ofproto)
> >  {
> > -return ofproto->backer->rt_support.explicit_drop_action;
> > +bool value;
> > +
> > +atomic_read_relaxed(>backer->rt_support.explicit_drop_action,
> > +);
> > +return value;
> >  }
> >
> >  bool
> > @@ -1379,6 +1388,40 @@ check_ct_timeout_policy(struct dpif_backer *backer)
> >  return !error;
> >  }
> >
> > +/* Tests whether backer's datapath supports the OVS_ACTION_ATTR_DROP 
> > action. */
> > +static bool
> > +check_drop_action(struct dpif_backer *backer)
> > +{
> > +struct odputil_keybuf keybuf;
> > +uint8_t actbuf[NL_A_U32_SIZE];
> > +struct ofpbuf actions;
> > +struct ofpbuf key;
> > +bool supported;
> > +
> > +struct flow flow = {
> > +.dl_type = CONSTANT_HTONS(0x1234), /* bogus */
> > +};
> > +struct odp_flow_key_parms odp_parms = {
> > +.flow = ,
> > +.probe = true,
> > +};
> > +
> > +ofpbuf_use_stack(, , sizeof keybuf);
> > +

Re: [ovs-dev] [PATCH v2] netdev-dpdk: Fallback to non tunnel offloading API.

2024-04-05 Thread Ilya Maximets
On 4/4/24 14:46, David Marchand wrote:
> On Wed, Apr 3, 2024 at 8:13 PM Ilya Maximets  wrote:
>>> - This patch fixes some misusage of the DPDK API.
>>
>> Hmm, I understand that the driver does something funny when it gets
>> outer flags set without any inner flags, but how is that a misuse
>> of the DPDK API?  Could you point me to the API docs that say that
>> inner flags must always be set in this case or that setting only
>> outer offloads is not allowed?
> 
> Setting the tunnel type (which is set along outer checksum in OVS) is
> described as:
> 
> /**
>  * Bits 45:48 used for the tunnel type.
>  * The tunnel type must be specified for TSO or checksum on the inner part
>  * of tunnel packets.
>  * These flags can be used with RTE_MBUF_F_TX_TCP_SEG for TSO, or
>  * RTE_MBUF_F_TX_xxx_CKSUM.
>  * The mbuf fields for inner and outer header lengths are required:
>  * outer_l2_len, outer_l3_len, l2_len, l3_len, l4_len and tso_segsz for TSO.
>  */
> #define RTE_MBUF_F_TX_TUNNEL_VXLAN   (0x1ULL << 45)
> #define RTE_MBUF_F_TX_TUNNEL_GRE (0x2ULL << 45)
> #define RTE_MBUF_F_TX_TUNNEL_IPIP(0x3ULL << 45)
> #define RTE_MBUF_F_TX_TUNNEL_GENEVE  (0x4ULL << 45)
> /** TX packet with MPLS-in-UDP RFC 7510 header. */
> #define RTE_MBUF_F_TX_TUNNEL_MPLSINUDP (0x5ULL << 45)
> #define RTE_MBUF_F_TX_TUNNEL_VXLAN_GPE (0x6ULL << 45)
> #define RTE_MBUF_F_TX_TUNNEL_GTP   (0x7ULL << 45)
> #define RTE_MBUF_F_TX_TUNNEL_ESP   (0x8ULL << 45)
> 
> It is not specified what to expect it neither TSO nor inner checksum
> is requested.

I can agree on that part that specifying the tunnel type while not
requesting any inner offloads is kind of undefined.

However, this is counter-intuitive that providing more of correct
information about a packet makes the driver freak out.

While we should work around this, I would still not call it a misuse.

DPDK needs to define the behavior for this case.  Two options are:

a. Document that tunnel type must not be set if inner offloads
   are not requested.
b. Fix the affected drivers to not break the packets and document
   that any valid informational flags can be set in ol_flags.

Option 'a' sounds like a poor API design though.

BTW, why the outer checksums are documented as inner:
 https://doc.dpdk.org/guides/nics/features.html#inner-l3-checksum
 https://doc.dpdk.org/guides/nics/features.html#inner-l4-checksum
?

> 
> In a same way, it is not described what to expect if outer API is
> called with no inner offload.

I disagree on that one.

/**
 * Outer UDP checksum offload flag. This flag is used for enabling
 * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
 * 1) Enable the following in mbuf,
 * a) Fill outer_l2_len and outer_l3_len in mbuf.
 * b) Set the RTE_MBUF_F_TX_OUTER_UDP_CKSUM flag.
 * c) Set the RTE_MBUF_F_TX_OUTER_IPV4 or RTE_MBUF_F_TX_OUTER_IPV6 flag.
 * 2) Configure RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM offload flag.
 */
#define RTE_MBUF_F_TX_OUTER_UDP_CKSUM (1ULL << 41)

Reading this and other documentation in the file I would expect the
following two sets of flags to be equivalent on a card that supports
tunnel offloads:

 - RTE_MBUF_F_TX_OUTER_IPV4|RTE_MBUF_F_TX_OUTER_UDP_CKSUM with the
   outer_l2_len and outer_l3_len filled in.

 - RTE_MBUF_F_TX_IPV4|RTE_MBUF_F_TX_UDP_CKSUM with
   l2_len and l3_len filled in.

If these are not equivalent, that sounds like a bug in DPDK.

BTW, the l2_len field description is kind of vague:

uint64_t l2_len:RTE_MBUF_L2_LEN_BITS;
/**< L2 (MAC) Header Length for non-tunneling pkt.
 * Outer_L4_len + ... + Inner_L2_len for tunneling pkt.
 */

Reading other comments, it seems that setting any "outer" bits
in ol_flags constitutes a 'tunneling pkt', but it's not said explicitly
in this particular comment.

> Adding Ferruh and Thomas who may have one opinion.
> 
> 
>>
>> I agree that it seems safer to just downgrade all outer flags to
>> inner ones on OVS side, when no inner offloads are requested, I'm
>> just not sure I agree that it's an API misuse.  Especially since
>> non-Intel cards seem to work fine.
> 
> I suppose you mean mlx5.

Yes, I think it was tested with mlx5.

> Has it been tested on other nics?

I didn't test.  The original reporter doesn't have other cards, IIRC.

>>
>>> Basically, resolving a neighbor with ARP inside tunnels is broken on
>>> Intel nics (even without re-enabling outer udp checksum).
>>> This can be observed with the following debug patch at the end of
>>> netdev_dpdk_prep_hwol_packet():
>>>
>>> +char buf[256];
>>> +if (rte_get_tx_ol_flag_list(mbuf->ol_flags, buf, sizeof(buf)) != 0)
>>> +buf[0] = '\0';
>>> +VLOG_WARN("len=%u, ol_flags=%s, outer_l2_len=%u, outer_l3_len=%u,
>>> l2_len=%u, l3_len=%u, l4_len=%u, tso_segsz=%u", mbuf->pkt_len, buf,
>>> mbuf->outer_l2_len, mbuf->outer_l3_len, mbuf->l2_len, mbuf->l3_len,
>>> mbuf->l4_len, mbuf->tso_segsz);
>>>
>>> Then doing a "arping" inside the tunnel triggers:
>>> 

Re: [ovs-dev] [PATCH v2] Rename primary development branch as main.

2024-04-05 Thread Simon Horman
On Fri, Apr 05, 2024 at 11:57:17AM +0200, Ilya Maximets wrote:
> On 4/2/24 16:44, Simon Horman wrote:
> > Recently OVS adopted a policy of using the inclusive naming word list v1
> > [1, 2].
> > 
> > In keeping with this policy rename the primary development branch from
> > 'master' to 'main'. This patch does not actually make that change,
> > but rather updates references to the branch in the source tree.
> > It is intended to be applied at (approximately) the same time that the
> > change is made.
> > 
> > OVS is currently hosted on GitHub. We can expect the following behaviour
> > after the rename:
> > 
> > 1. GitHub pull requests against are renamed branch are automatically
> >re-homed on new branch
> > 2. GitHub Issues do not seem to be affected - at least the test issue I
> >created had no association with a branch
> > 3. URLs accessed via the GitHub web UI are automatically renamed
> >(so long as a new branch called master is not created).
> > 4. Using the git cli command, fetch will fetch the new branch (main),
> >and fetch -p will remove (prune) the old branch (master)
> > 
> > [1] df5e5cf4318a ("Documentation: Add section on inclusive language.")
> > [2] https://inclusivenaming.org/word-lists/
> > 
> > Signed-off-by: Simon Horman 
> > ---
> > Changes in v2:
> > - Keep two blank lines between versions.
> > - Drop bogus update to OpenSSL hashes URL in appveyor.yml.
> > - Drop other appveyor.yml changes, they are now present upstream.
> >   + appveyor: Prepare for rename of primary development branch.
> > https://github.com/openvswitch/ovs/commit/95ff912edef8
> 
> We still need to remove the 'master' branch from the appveyor.yml
> at some point and we also need to remove the 'master' branch
> lookup from build-clang-analyze GHA job.
> 
> Or do you plan to send a separate patch for those?

For some reason I had it in mind to send a separate patch
for that after the rename. But I don't feel strongly about it.

> If we're not touching CI configuration in this patch it may also
> be appropriate to have 'Documentation: ' prefix in the subject.

Good point, will do.

> > - Add note about updates to git configuration.
> > ---
> > Notes:
> > 
> > * Now is the time to raise any concerns regarding this patch.
> >   It is planned to implement this change next week.
> > 
> > * If you have an automation that fetches the master branch then
> >   the suggested action is:
> >   1. Before the branch rename occurs: update the automation to pull main an
> >  fall back to pulling master if that fails
> >   2. After the rename occurs: Update the automation to only fetch main
> > 
> > * After the change it may be necessary to update your local
> >   git configuration for checked out branches.
> > 
> >   For example:
> >   # Fetch origin: new remote main branch; remote master branch is deleted
> >   git fetch -tp origin
> >   # Rename local branch
> >   git branch -m master main
> >   # Update local main branch to use remote main branch as it's upstream
> >   git branch --set-upstream-to=origin/main main
> > ---
> >  .../internals/committer-responsibilities.rst   | 12 +++---
> >  .../internals/contributing/backporting-patches.rst | 12 +++---
> >  Documentation/internals/release-process.rst| 50 
> > +++---
> >  Documentation/intro/install/dpdk.rst   |  2 +-
> >  Documentation/intro/install/fedora.rst |  2 +-
> >  Documentation/intro/install/general.rst|  2 +-
> >  Documentation/intro/install/rhel.rst   |  2 +-
> >  Documentation/topics/language-bindings.rst |  2 +-
> >  Documentation/tutorials/faucet.rst |  6 +--
> >  Documentation/tutorials/ovs-conntrack.rst  |  2 +-
> >  NEWS   |  3 ++
> >  README.rst |  2 +-
> >  12 files changed, 50 insertions(+), 47 deletions(-)
> > 
> 
> 
> 
> > diff --git a/Documentation/tutorials/ovs-conntrack.rst 
> > b/Documentation/tutorials/ovs-conntrack.rst
> > index e8a58c4eb298..6b0b73cd1173 100644
> > --- a/Documentation/tutorials/ovs-conntrack.rst
> > +++ b/Documentation/tutorials/ovs-conntrack.rst
> > @@ -35,7 +35,7 @@ to match on the TCP segments from connection setup to 
> > connection tear down.
> >  It will use OVS with the Linux kernel module as the datapath for this
> >  tutorial. (The datapath that utilizes the openvswitch kernel module to do
> >  the packet processing in the Linux kernel)
> > -It was tested with the "master" branch of Open vSwitch.
> > +It was tested with the "main" branch of Open vSwitch.
> 
> Nit: This sentence doesn't make a lot of sense to me.  It may have been
> meaningful at the point of introduction of that doc, but not today.
> Maybe we can just remove it?

Sure, I will remove it.

> >  Definitions
> >  ---
> > diff --git a/NEWS b/NEWS
> > index c9e4064e67a7..5c9fff54595c 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -4,6 +4,9 @@ 

Re: [ovs-dev] [PATCH] ofproto-dpif-upcall: Fix ukey installation failure logs and counters.

2024-04-05 Thread Eelco Chaudron



On 4 Apr 2024, at 14:09, Ilya Maximets wrote:

> ukey_install() returns boolean signaling if the ukey was installed
> or not.  Installation may fail for a few reasons:
>
>  1. Conflicting ukey.
>  2. Mutex contention while trying to replace existing ukey.
>  3. The same ukey already exists and active.
>
> Only the first case here signals an actual problem.  Third one is
> a little odd for userspace datapath, but harmless.  Second is the
> most common one that can easily happen during normal operation
> since other threads like revalidators may be currently working on
> this ukey preventing an immediate access.
>
> Since only the first case is actually worth logging and it already
> has its own log message, removing the 'upcall installation fails'
> warning from the upcall_cb().  This should fix most of the random
> failures of userspace system tests in CI.
>
> While at it, also fixing coverage counters.  Mutex contention was
> mistakenly counted as a duplicate upcall.  ukey contention for
> revalidators was counted only in one of two places.
>
> New counter added for the ukey contention on replace.  We should
> not re-use existing upcall_ukey_contention counter for this, since
> it may lead to double counting.
>
> Fixes: 67f08985d769 ("upcall: Replace ukeys for deleted flows.")
> Fixes: 9cec8274ed9a ("ofproto-dpif-upcall: Add VLOG_WARN_RL logs for 
> upcall_cb() error.")
> Signed-off-by: Ilya Maximets 
> ---

Thanks for looking into this, and the patch looks good to me.

Maybe we should have another patch fixing some of the namings?

  upcall_ukey_replace -> ukey_replace
  handler_duplicate_upcall -> duplicate_upcall
  upcall_ukey_contention -> revalidate_ukey_contention -> ukey_contention

Cheers,

Eelco

Acked-by: Eelco Chaudron 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v5] ci: Add clang-analyze to GitHub actions.

2024-04-05 Thread Ilya Maximets
On 1/11/24 00:08, Eelco Chaudron wrote:
> This patch identifies new static analysis issues during a GitHub action
> run and reports them. The process involves analyzing the changes introduced
> in the current commit and comparing them to those in the preceding commit.
> 
> However, there are two cases when the GitHub push action runner does not
> provide enough details to determine the preceding commit. These cases are
> a new branch or a forced push. The strategy for these exceptions is to
> find the first common commit on any upstream branch, and use that.
> 
> An example error output might look like this:
> 
>   error level: +0 -0 no changes
>   warning level: +2 +0
> New issue "deadcode.DeadStores Value stored to 'remote' is never read" (1 
> occurrence)
>   file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
> New issue "unix.Malloc Potential leak of memory pointed to by 'remote'" 
> (1 occurrence)
>   file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
>   note level: +0 -0 no changes
>   all levels: +2 +0
> 
> Signed-off-by: Eelco Chaudron 
> ---
> 
> changes in v2:
>   - When it's a new branch, it compares it to the HEAD of the default branch.
> 
> changes in v3:
>   - Include the clang version as part of the cache
>   - Change the way it looks for the 'default' branch so it will work
> for patch branches.
>   - Also compare to the base branch for forced commits.
> 
> changes in v4:
>   - No longer look for a default branch, but consume all patches
> from the current author.
> 
> changes in v5:
>   - Addressed Ilya's comments.
>   - Checkout upstream branch and find common point to base delta on.
> 
>  .ci/linux-build.sh   |  30 +++
>  .ci/linux-prepare.sh |   2 +-
>  .github/workflows/build-and-test.yml | 113 +++
>  3 files changed, 144 insertions(+), 1 deletion(-)
> 
> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
> index 90581c10b..4589a8ba2 100755
> --- a/.ci/linux-build.sh
> +++ b/.ci/linux-build.sh
> @@ -50,6 +50,31 @@ function build_ovs()
>  make ${JOBS}
>  }
>  
> +function clang_analyze()
> +{
> +[ -d "./base-clang-analyzer-results" ] && cache_build=false \
> +   || cache_build=true
> +if [ "$cache_build" = true ]; then
> +# If this is a cache build, proceed to the base branch's directory.
> +pushd base_ovs_main
> +fi;
> +
> +configure_ovs $OPTS
> +
> +make clean
> +scan-build -o ./clang-analyzer-results -sarif --use-cc=${CC} make ${JOBS}
> +
> +if [ "$cache_build" = true ]; then
> +# Move results, so it will be picked up by the cache.
> +mv ./clang-analyzer-results ../base-clang-analyzer-results
> +popd
> +else
> +# Only do the compare on the none cache builds.
> +sarif --check note diff ./base-clang-analyzer-results \
> +./clang-analyzer-results
> +fi;
> +}
> +
>  if [ "$DEB_PACKAGE" ]; then
>  ./boot.sh && ./configure --with-dpdk=$DPDK && make debian
>  mk-build-deps --install --root-cmd sudo --remove debian/control
> @@ -117,6 +142,11 @@ fi
>  
>  OPTS="${EXTRA_OPTS} ${OPTS} $*"
>  
> +if [ "$CLANG_ANALYZE" ]; then
> +clang_analyze
> +exit 0
> +fi
> +
>  if [ "$TESTSUITE" = 'test' ]; then
>  # 'distcheck' will reconfigure with required options.
>  # Now we only need to prepare the Makefile without sparse-wrapped CC.
> diff --git a/.ci/linux-prepare.sh b/.ci/linux-prepare.sh
> index c28b6819a..5028bdc44 100755
> --- a/.ci/linux-prepare.sh
> +++ b/.ci/linux-prepare.sh
> @@ -23,7 +23,7 @@ cd ..
>  # https://github.com/pypa/pip/issues/10655
>  pip3 install --disable-pip-version-check --user wheel
>  pip3 install --disable-pip-version-check --user \
> -flake8 'hacking>=3.0' netaddr pyparsing sphinx setuptools
> +flake8 'hacking>=3.0' netaddr pyparsing sarif-tools sphinx setuptools
>  
>  # Install python test dependencies
>  pip3 install -r python/test_requirements.txt
> diff --git a/.github/workflows/build-and-test.yml 
> b/.github/workflows/build-and-test.yml
> index 710757693..f5858fdbe 100644
> --- a/.github/workflows/build-and-test.yml
> +++ b/.github/workflows/build-and-test.yml
> @@ -254,6 +254,119 @@ jobs:
>  name: logs-linux-${{ join(matrix.*, '-') }}
>  path: logs.tgz
>  
> +  build-clang-analyze:
> +needs: build-dpdk
> +env:
> +  dependencies: |
> +automake bc clang-tools libbpf-dev libnuma-dev libpcap-dev \
> +libunbound-dev libunwind-dev libssl-dev libtool llvm-dev
> +  CC:   clang
> +  DPDK: dpdk
> +  CLANG_ANALYZE: true
> +name: clang-analyze
> +runs-on: ubuntu-22.04
> +timeout-minutes: 30
> +
> +steps:
> +- name: checkout
> +  uses: actions/checkout@v3
> +  with:
> +fetch-depth: 0
> +
> +- name: get base branch sha
> +  id: base_branch
> +  env:
> +BASE_SHA: 

Re: [ovs-dev] [PATCH v2] Rename primary development branch as main.

2024-04-05 Thread Eelco Chaudron



On 2 Apr 2024, at 16:44, Simon Horman wrote:

> Recently OVS adopted a policy of using the inclusive naming word list v1
> [1, 2].
>
> In keeping with this policy rename the primary development branch from
> 'master' to 'main'. This patch does not actually make that change,
> but rather updates references to the branch in the source tree.
> It is intended to be applied at (approximately) the same time that the
> change is made.
>
> OVS is currently hosted on GitHub. We can expect the following behaviour
> after the rename:
>
> 1. GitHub pull requests against are renamed branch are automatically
>re-homed on new branch
> 2. GitHub Issues do not seem to be affected - at least the test issue I
>created had no association with a branch
> 3. URLs accessed via the GitHub web UI are automatically renamed
>(so long as a new branch called master is not created).
> 4. Using the git cli command, fetch will fetch the new branch (main),
>and fetch -p will remove (prune) the old branch (master)
>
> [1] df5e5cf4318a ("Documentation: Add section on inclusive language.")
> [2] https://inclusivenaming.org/word-lists/
>
> Signed-off-by: Simon Horman 

Thanks for doing this Simon!  I don't have any additional comments beyond those 
provided by Ilya.

//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] utilities: Add missing bfd option in ovn-nbctl manpage.

2024-04-05 Thread Lorenzo Bianconi
Add missing bfd option to lr-route-add and lr-policy-add commands in
ovn-nbctl manpage.

Reported-at: https://issues.redhat.com/browse/FDP-550
Signed-off-by: Lorenzo Bianconi 
---
 utilities/ovn-nbctl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c
index 25eb86f7f..618f3a18b 100644
--- a/utilities/ovn-nbctl.c
+++ b/utilities/ovn-nbctl.c
@@ -392,6 +392,7 @@ Route commands:\n\
   [--ecmp]\n\
   [--ecmp-symmetric-reply]\n\
   [--route-table=ROUTE_TABLE]\n\
+  [--bfd]\n\
   lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\
 add a route to ROUTER\n\
   [--policy=POLICY]\n\
@@ -402,6 +403,7 @@ Route commands:\n\
   lr-route-list ROUTER  print routes for ROUTER\n\
 \n\
 Policy commands:\n\
+  [--bfd]\n\
   lr-policy-add ROUTER PRIORITY MATCH ACTION [NEXTHOP,[NEXTHOP,...]] \
 [OPTIONS KEY=VALUE ...] \n\
 add a policy to router\n\
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] Rename primary development branch as main.

2024-04-05 Thread Ilya Maximets
On 4/2/24 16:44, Simon Horman wrote:
> Recently OVS adopted a policy of using the inclusive naming word list v1
> [1, 2].
> 
> In keeping with this policy rename the primary development branch from
> 'master' to 'main'. This patch does not actually make that change,
> but rather updates references to the branch in the source tree.
> It is intended to be applied at (approximately) the same time that the
> change is made.
> 
> OVS is currently hosted on GitHub. We can expect the following behaviour
> after the rename:
> 
> 1. GitHub pull requests against are renamed branch are automatically
>re-homed on new branch
> 2. GitHub Issues do not seem to be affected - at least the test issue I
>created had no association with a branch
> 3. URLs accessed via the GitHub web UI are automatically renamed
>(so long as a new branch called master is not created).
> 4. Using the git cli command, fetch will fetch the new branch (main),
>and fetch -p will remove (prune) the old branch (master)
> 
> [1] df5e5cf4318a ("Documentation: Add section on inclusive language.")
> [2] https://inclusivenaming.org/word-lists/
> 
> Signed-off-by: Simon Horman 
> ---
> Changes in v2:
> - Keep two blank lines between versions.
> - Drop bogus update to OpenSSL hashes URL in appveyor.yml.
> - Drop other appveyor.yml changes, they are now present upstream.
>   + appveyor: Prepare for rename of primary development branch.
> https://github.com/openvswitch/ovs/commit/95ff912edef8

We still need to remove the 'master' branch from the appveyor.yml
at some point and we also need to remove the 'master' branch
lookup from build-clang-analyze GHA job.

Or do you plan to send a separate patch for those?

If we're not touching CI configuration in this patch it may also
be appropriate to have 'Documentation: ' prefix in the subject.

> - Add note about updates to git configuration.
> ---
> Notes:
> 
> * Now is the time to raise any concerns regarding this patch.
>   It is planned to implement this change next week.
> 
> * If you have an automation that fetches the master branch then
>   the suggested action is:
>   1. Before the branch rename occurs: update the automation to pull main an
>  fall back to pulling master if that fails
>   2. After the rename occurs: Update the automation to only fetch main
> 
> * After the change it may be necessary to update your local
>   git configuration for checked out branches.
> 
>   For example:
>   # Fetch origin: new remote main branch; remote master branch is deleted
>   git fetch -tp origin
>   # Rename local branch
>   git branch -m master main
>   # Update local main branch to use remote main branch as it's upstream
>   git branch --set-upstream-to=origin/main main
> ---
>  .../internals/committer-responsibilities.rst   | 12 +++---
>  .../internals/contributing/backporting-patches.rst | 12 +++---
>  Documentation/internals/release-process.rst| 50 
> +++---
>  Documentation/intro/install/dpdk.rst   |  2 +-
>  Documentation/intro/install/fedora.rst |  2 +-
>  Documentation/intro/install/general.rst|  2 +-
>  Documentation/intro/install/rhel.rst   |  2 +-
>  Documentation/topics/language-bindings.rst |  2 +-
>  Documentation/tutorials/faucet.rst |  6 +--
>  Documentation/tutorials/ovs-conntrack.rst  |  2 +-
>  NEWS   |  3 ++
>  README.rst |  2 +-
>  12 files changed, 50 insertions(+), 47 deletions(-)
> 



> diff --git a/Documentation/tutorials/ovs-conntrack.rst 
> b/Documentation/tutorials/ovs-conntrack.rst
> index e8a58c4eb298..6b0b73cd1173 100644
> --- a/Documentation/tutorials/ovs-conntrack.rst
> +++ b/Documentation/tutorials/ovs-conntrack.rst
> @@ -35,7 +35,7 @@ to match on the TCP segments from connection setup to 
> connection tear down.
>  It will use OVS with the Linux kernel module as the datapath for this
>  tutorial. (The datapath that utilizes the openvswitch kernel module to do
>  the packet processing in the Linux kernel)
> -It was tested with the "master" branch of Open vSwitch.
> +It was tested with the "main" branch of Open vSwitch.

Nit: This sentence doesn't make a lot of sense to me.  It may have been
meaningful at the point of introduction of that doc, but not today.
Maybe we can just remove it?

>  
>  Definitions
>  ---
> diff --git a/NEWS b/NEWS
> index c9e4064e67a7..5c9fff54595c 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -4,6 +4,9 @@ Post-v3.3.0
>   * Conntrack now supports 'random' flag for selecting ports in a range
> while natting and 'persistent' flag for selection of the IP address
> from a range.
> +  - The primary development branch has been renamed from 'master' to 'main'.
> +The OVS tree remains hosted on GitHub.
> +https://github.com/openvswitch/ovs.git

We use 3 spaces for the first level of indentation in the NEWS file.
I don't know 

Re: [ovs-dev] [PATCH ovn] tests: Add macro for checking flows after recompute.

2024-04-05 Thread Xavier Simonart
Hi Mark

Thanks for the review.

On Thu, Apr 4, 2024 at 10:25 PM Mark Michelson  wrote:

> Hi Xavier, the patch looks good, but I have one question down below.
>
> On 3/26/24 07:56, Xavier Simonart wrote:
> > The macro CHECK_FLOWS_AFTER_RECOMPUTE dumps the Openflows, then
> > recomputes, then dumps again the Openflows, and finally compares
> > both sets of flows. The test fails if flows are different.
> > As of now, the macro cannot be used in all tests: many tests would fail
> > as I+P does not properly remove flows when the last logical port of
> > a datapath is deleted.
> >
> > Signed-off-by: Xavier Simonart 
> > ---
> >   tests/ovn-macros.at | 44 
> >   1 file changed, 44 insertions(+)
> >
> > diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at
> > index ed93764d3..11377f616 100644
> > --- a/tests/ovn-macros.at
> > +++ b/tests/ovn-macros.at
> > @@ -10,6 +10,50 @@ m4_define([OVN_CLEANUP_VSWITCH],[
> >   OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> >   ])
> >
> > +# DUMP_FLOWS(sbox, output_file)
> > +# Dump openflows to output_file for sbox
> > +m4_define([DUMP_FLOWS], [
> > +sbox=$1
> > +output_file=$2
> > +as $sbox
> > +ovs-ofctl dump-flows br-int |
> > +  sed 's/cookie=0x[[^,]]*/cookie=xx/g' |
> > +  sed 's/duration=[[^,]]*/duration=xx/g' |
> > +  sed 's/idle_age=[[^,]]*/idle_age=xx/g' |
> > +  sed 's/, hard_age=[[^,]]*//g' |
> > +  sed 's/n_bytes=[[^,]]*/n_bytes=xx/g' |
> > +  sed 's/n_packets=[[^,]]*/n_packets=xx/g' |
> > +  sed 's/conjunction([[^,]]*/conjunction(xx/g' |
> > +  sort > $output_file
>
> Should we mask group IDs here? Can group IDs change because of a recompute?
>
> Good question. I do not think so. When assigning the group_id a check is
done to see if a group with the same content already exists, so the same
group_id is reused in recompute.

> > +])
> > +
> > +m4_define([CHECK_FLOWS_AFTER_RECOMPUTE], [
> > +hv=$1
> > +sbox=$2
> > +# Make sure I+P has finalized his job before getting flows and
> comparing them after recompte.
> > +# Some tests have northd and ovn-nb ovsdb stopped, so avoid
> ovn-nbctl for those.
> > +if [[ -e ovn-nb/ovn-nb.sock ]] && [[ -e northd/ovn-northd.pid ]];
> then
> > +# Do wait twice to handle some potential race conditions
> > +check ovn-nbctl --wait=hv sync
> > +check ovn-nbctl --wait=hv sync
> > +fi
> > +
> > +as $sbox
> > +if test "$hv" != "vtep"; then
> > +  # Get flows before and after recompute
> > +  DUMP_FLOWS([$sbox], [flows-$hv-1])
> > +
> > +  check ovn-appctl -t ovn-controller recompute
> > +  # The recompute might cause some sb changes. Let controller catch
> up.
> > +  if [[ -e ovn-nb/ovn-nb.sock ]] && [[ -e northd/ovn-northd.pid ]];
> then
> > +  check ovn-nbctl --wait=hv sync
> > +  fi
> > +  DUMP_FLOWS([$sbox], [flows-$hv-2])
> > +  diff flows-$hv-1 flows-$hv-2 > flow-diff
> > +  AT_CHECK([test $(diff flows-$hv-1 flows-$hv-2 | wc -l) == 0])
> > +fi
> > +])
> > +
> >   # OVN_CLEANUP_CONTROLLER(sbox)
> >   #
> >   # Gracefully terminate ovn-controller in the specified
>
Thanks
Xavier
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v14 4/6] dpif: Probe support for OVS_ACTION_ATTR_DROP.

2024-04-05 Thread Eelco Chaudron



On 3 Apr 2024, at 16:35, Eric Garver wrote:

> Kernel support has been added for this action. As such, we need to probe
> the datapath for support.
>
> Signed-off-by: Eric Garver 
> ---
>  include/linux/openvswitch.h |  2 +-
>  lib/dpif.c  |  6 ++-
>  lib/dpif.h  |  2 +-
>  ofproto/ofproto-dpif.c  | 77 ++---
>  ofproto/ofproto-dpif.h  |  4 +-
>  5 files changed, 81 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
> index d18754c84f62..d9fb991ef234 100644
> --- a/include/linux/openvswitch.h
> +++ b/include/linux/openvswitch.h
> @@ -1086,11 +1086,11 @@ enum ovs_action_attr {
>   OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
>   OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
>   OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
> + OVS_ACTION_ATTR_DROP, /* u32 xlate_error. */
>
>  #ifndef __KERNEL__
>   OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
>   OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
> - OVS_ACTION_ATTR_DROP,  /* u32 xlate_error. */
>   OVS_ACTION_ATTR_LB_OUTPUT, /* u32 bond-id. */
>  #endif
>   __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
> diff --git a/lib/dpif.c b/lib/dpif.c
> index 0f480bec48d0..23eb18495a66 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -28,6 +28,7 @@
>  #include "dpctl.h"
>  #include "dpif-netdev.h"
>  #include "flow.h"
> +#include "netdev-offload.h"
>  #include "netdev-provider.h"
>  #include "netdev.h"
>  #include "netlink.h"
> @@ -1935,9 +1936,10 @@ dpif_supports_tnl_push_pop(const struct dpif *dpif)
>  }
>
>  bool
> -dpif_supports_explicit_drop_action(const struct dpif *dpif)
> +dpif_may_support_explicit_drop_action(const struct dpif *dpif)
>  {
> -return dpif_is_netdev(dpif);
> +/* TC does not support offloading this action. */
> +return dpif_is_netdev(dpif) || !netdev_is_flow_api_enabled();
>  }
>
>  bool
> diff --git a/lib/dpif.h b/lib/dpif.h
> index 0f2dc2ef3c55..a764e8a592bd 100644
> --- a/lib/dpif.h
> +++ b/lib/dpif.h
> @@ -940,7 +940,7 @@ int dpif_get_pmds_for_port(const struct dpif * dpif, 
> odp_port_t port_no,
>
>  char *dpif_get_dp_version(const struct dpif *);
>  bool dpif_supports_tnl_push_pop(const struct dpif *);
> -bool dpif_supports_explicit_drop_action(const struct dpif *);
> +bool dpif_may_support_explicit_drop_action(const struct dpif *);
>  bool dpif_synced_dp_layers(struct dpif *);
>
>  /* Log functions. */
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index c4e2e867ecdc..32d037be607c 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -221,6 +221,7 @@ static void ct_zone_config_init(struct dpif_backer 
> *backer);
>  static void ct_zone_config_uninit(struct dpif_backer *backer);
>  static void ct_zone_timeout_policy_sweep(struct dpif_backer *backer);
>  static void ct_zone_limits_commit(struct dpif_backer *backer);
> +static bool recheck_support_explicit_drop_action(struct dpif_backer *backer);
>
>  static inline struct ofproto_dpif *
>  ofproto_dpif_cast(const struct ofproto *ofproto)
> @@ -391,6 +392,10 @@ type_run(const char *type)
>  udpif_set_threads(backer->udpif, n_handlers, n_revalidators);
>  }
>
> +if (recheck_support_explicit_drop_action(backer)) {
> +backer->need_revalidate = REV_RECONFIGURE;
> +}
> +
>  if (backer->need_revalidate) {
>  struct ofproto_dpif *ofproto;
>  struct simap_node *node;
> @@ -855,7 +860,11 @@ ovs_native_tunneling_is_on(struct ofproto_dpif *ofproto)
>  bool
>  ovs_explicit_drop_action_supported(struct ofproto_dpif *ofproto)
>  {
> -return ofproto->backer->rt_support.explicit_drop_action;
> +bool value;
> +
> +atomic_read_relaxed(>backer->rt_support.explicit_drop_action,
> +);
> +return value;
>  }
>
>  bool
> @@ -1379,6 +1388,40 @@ check_ct_timeout_policy(struct dpif_backer *backer)
>  return !error;
>  }
>
> +/* Tests whether backer's datapath supports the OVS_ACTION_ATTR_DROP action. 
> */
> +static bool
> +check_drop_action(struct dpif_backer *backer)
> +{
> +struct odputil_keybuf keybuf;
> +uint8_t actbuf[NL_A_U32_SIZE];
> +struct ofpbuf actions;
> +struct ofpbuf key;
> +bool supported;
> +
> +struct flow flow = {
> +.dl_type = CONSTANT_HTONS(0x1234), /* bogus */
> +};
> +struct odp_flow_key_parms odp_parms = {
> +.flow = ,
> +.probe = true,
> +};
> +
> +ofpbuf_use_stack(, , sizeof keybuf);
> +odp_flow_key_from_flow(_parms, );
> +
> +ofpbuf_use_stack(, , sizeof actbuf);
> +nl_msg_put_u32(, OVS_ACTION_ATTR_DROP, XLATE_OK);
> +
> +supported = dpif_may_support_explicit_drop_action(backer->dpif) &&
> +dpif_probe_feature(backer->dpif, "drop", , , 
> NULL);
> +
> +VLOG_INFO("%s: 

Re: [ovs-dev] [PATCH v14 1/6] dpif: Stub out unimplemented action OVS_ACTION_ATTR_DEC_TTL.

2024-04-05 Thread Eelco Chaudron



On 3 Apr 2024, at 16:35, Eric Garver wrote:

> This is prep for adding a different OVS_ACTION_ATTR_ enum value. This
> action, OVS_ACTION_ATTR_DEC_TTL, is not actually implemented. However,
> to make -Werror happy we must add a case to all existing switches.
>
> Signed-off-by: Eric Garver 

Thanks for fixing up the format_odp_check_pkt_len_action() function!

Acked-by: Eelco Chaudron 

//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] Use listen backlog = 64 for all connections.

2024-04-05 Thread Eelco Chaudron


On 4 Apr 2024, at 21:17, Ihar Hrachyshka wrote:

> On Thu, Apr 4, 2024 at 2:36 AM Eelco Chaudron  wrote:
>
>>
>>
>> On 3 Apr 2024, at 23:18, Ihar Hrachyshka wrote:
>>
>>> Before the patch, the size of the backlog depended on the type of socket
>>> (UNIX vs INET) as well as on the language (C vs Python), specifically:
>>>
>>> - python used backlog size = 10 for all sockets;
>>> - C used 64 for UNIX sockets but 10 for INET sockets.
>>>
>>> This consolidates the values across the board. It effectively bumps the
>>> number of simultaneous connections to python unixctl servers to 64. Also
>>> for INET C servers too.
>>>
>>> Signed-off-by: Ihar Hrachyshka 
>>
>> Hi Ihar,
>>
>> Thanks for submitting the patch. The patch looks fine to me, however, can
>> you elaborate on why you want to increase the size to 64? Is 10 giving
>> problems in specific scenarios?
>>
>>
> I guess I should've included it in the commit message, (and I am happy to
> send v2 with it updated), but...
>
> 1. Originally* the patch was implemented to allow more parallel fmt_pkt
> calls in OVN test suite (that rely on a python AF_UNIX unixctl server to
> transform scapy string format strings into byte strings). The problem with
> parallel handling of more than 10 unixctl AF_UNIX requests to python
> servers was noticed here:
> https://github.com/ovn-org/ovn/commit/0baca3e519756cbe98a32526ccc637bb73468743
> 2. Now, Brian also reports listen backlog issues in OpenStack environments
> for INET sockets, see:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2024-April/053049.html
>
> * the patch was part of a series of patches -
> https://patchwork.ozlabs.org/project/openvswitch/list/?series=382739=%2A=both
> - that generally improve AF_UNIX unixctl socket handling, which I plan to
> revive later, but nothing stops us from merging this before I get to it.
>

Yes, I noticed that conversation right after I replied ;) Let’s wait for Ilya’s 
feedback before sending a v2.

//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] Make tunnel ids exhaustion test case trigger the problem.

2024-04-05 Thread Vladislav Odintsov
Yes, this diff is from main.
To trigger an initial bug it is enough to create a new ls/lr while all 
available tunnel ids are used for datapaths (4095).
This is because we need to enter ovn_allocate_tnlid() with *hint=0 to trigger 
infinite loop.
That is why I suggest just to create 4095 LSs and then create another one.
I’ve tested this diff and see that northd goes to 100% CPU and doesn’t print 
warn log about ids exhaustion.

> On 4 Apr 2024, at 23:34, Ihar Hrachyshka  wrote:
> 
> On Thu, Apr 4, 2024 at 3:56 PM Vladislav Odintsov  wrote:
> 
>> Thanks Ihar for the patch.
>> 
>> It definitely triggers the bug mentioned in Fixes commit, but how do you
>> like next diff as an alternative?
>> It seems a little easier to me, because it shows the real limit and the
>> situation where the problem was (separate ls-add):
>> 
> 
> Is it a diff from main? I don't think it will trigger the issue. The key is
> to trigger northd to change its max cap for tunnel ids AFTER it bumped hint
> beyond the "vxlan mode max tun_id" (which is why I have to create vxlan
> chassis AFTER I create enough LSs to get into unsafe territory.)
> 
> Note: I haven't tried your version yet; I may check your version some time
> later. So it's the initial thought only.
> 
> 
>> 
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index 6edb1129e..cef144f10 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -2862,13 +2862,18 @@ ovn-sbctl \
>> 
>> cmd="ovn-nbctl --wait=sb"
>> 
>> -for i in {1..4097}; do
>> +for i in {1..4095}; do
>> cmd="${cmd} -- ls-add lsw-${i}"
>> done
>> 
>> eval $cmd
>> 
>> -check_row_count nb:Logical_Switch 4097
>> +check_row_count nb:Logical_Switch 4095
>> +wait_row_count sb:Datapath_Binding 4095
>> +
>> +ovn-nbctl ls-add lsw-exhausted
>> +
>> +check_row_count nb:Logical_Switch 4096
>> wait_row_count sb:Datapath_Binding 4095
>> 
>> OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
>> northd/ovn-northd.log])
>> 
>> 
>> On 4 Apr 2024, at 20:13, Ihar Hrachyshka  wrote:
>> 
>> The original version of the scenario passed with or without the fix.
>> 
>> Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
>> Signed-off-by: Ihar Hrachyshka 
>> ---
>> tests/ovn-northd.at | 17 +++--
>> 1 file changed, 11 insertions(+), 6 deletions(-)
>> 
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index fc2c972a4..e8ea8b050 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -2826,11 +2826,6 @@ OVN_FOR_EACH_NORTHD_NO_HV([
>> AT_SETUP([check tunnel ids exhaustion])
>> ovn_start
>> 
>> -# Create a fake chassis with vxlan encap to lower MAX DP tunnel key to
>> 2^12
>> -ovn-sbctl \
>> ---id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
>> --- --id=@c create chassis name=hv1 encaps=@e
>> -
>> cmd="ovn-nbctl --wait=sb"
>> 
>> for i in {1..4097}; do
>> @@ -2840,7 +2835,17 @@ done
>> eval $cmd
>> 
>> check_row_count nb:Logical_Switch 4097
>> -wait_row_count sb:Datapath_Binding 4095
>> +wait_row_count sb:Datapath_Binding 4097
>> +
>> +# Now create a fake chassis with vxlan encap to lower MAX DP tunnel key
>> to 2^12
>> +ovn-sbctl \
>> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
>> +-- --id=@c create chassis name=hv1 encaps=@e
>> +
>> +ovn-nbctl --wait=sb ls-add lsw-exhausted
>> +
>> +check_row_count nb:Logical_Switch 4098
>> +wait_row_count sb:Datapath_Binding 4097
>> 
>> OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
>> northd/ovn-northd.log])
>> 
>> --
>> 2.41.0
>> 
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>> 
>> 
>> 
>> 
>> 
>> Regards,
>> Vladislav Odintsov
>> 
>> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Regards,
Vladislav Odintsov

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev