[ovs-dev] [PATCH] socket: Increase listen backlog to 64 everywhere.

2024-04-11 Thread Ihar Hrachyshka
Before the patch, the size of the backlog depended on the type of socket
(UNIX vs INET) as well as on the language (C vs Python), specifically:

- python used backlog size = 10 for all sockets;
- C used 64 for UNIX sockets but 10 for INET sockets.

This consolidates the values across the board. It effectively bumps the
number of simultaneous connections to python unixctl servers to 64. Also
for INET C servers too.

The rationale to do it, on top of consistency, is as follows:

- fmt_pkt in ovn testsuite is limited by python server listen backlog,
  and as was found out when adopting the tool, it is sometimes useful to
  run lots of parallel calls to fmt_pkt unixctl server in some tests.
  (See [1] for example.)

- there is a recent report [2] on discuss@ ML where the reporter noticed
  significant listen queue overflows in some scenarios (large openstack
  deployments; happens during leader transition when hundreds of neutron
  nodes - with dozens of neutron api workers each - simultaneously
  reconnect to the same northbound leader.) Note: While there is no
  clear indication that this backlog size bump would resolve the
  reported issues, it would probably help somewhat.

[1] 
https://github.com/ovn-org/ovn/commit/0baca3e519756cbe98a32526ccc637bb73468743
[2] https://mail.openvswitch.org/pipermail/ovs-discuss/2024-April/053049.html

Signed-off-by: Ihar Hrachyshka 
Acked-by: Eelco Chaudron 
---
 lib/socket-util.c| 2 +-
 python/ovs/stream.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/socket-util.c b/lib/socket-util.c
index 3eb3a3816..2d89fce85 100644
--- a/lib/socket-util.c
+++ b/lib/socket-util.c
@@ -760,7 +760,7 @@ inet_open_passive(int style, const char *target, int 
default_port,
 }
 
 /* Listen. */
-if (style == SOCK_STREAM && listen(fd, 10) < 0) {
+if (style == SOCK_STREAM && listen(fd, 64) < 0) {
 error = sock_errno();
 VLOG_ERR("%s: listen: %s", target, sock_strerror(error));
 goto error;
diff --git a/python/ovs/stream.py b/python/ovs/stream.py
index 82fbb0d68..dbb6b2e1f 100644
--- a/python/ovs/stream.py
+++ b/python/ovs/stream.py
@@ -620,7 +620,7 @@ class PassiveStream(object):
 raise Exception('Unknown connection string')
 
 try:
-sock.listen(10)
+sock.listen(64)
 except socket.error as e:
 vlog.err("%s: listen: %s" % (name, os.strerror(e.error)))
 sock.close()
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] docs: Document manual cluster recovery procedure.

2024-04-11 Thread Ihar Hrachyshka
Remove the notion of cluster/leave --force since it was never
implemented. Instead of these instructions, document how a broken
cluster can be re-initialized with the old database contents.

Signed-off-by: Ihar Hrachyshka 
---
 Documentation/ref/ovsdb.7.rst | 50 +--
 1 file changed, 42 insertions(+), 8 deletions(-)

diff --git a/Documentation/ref/ovsdb.7.rst b/Documentation/ref/ovsdb.7.rst
index 46ed13e61..5882643a0 100644
--- a/Documentation/ref/ovsdb.7.rst
+++ b/Documentation/ref/ovsdb.7.rst
@@ -315,16 +315,11 @@ The above methods for adding and removing servers only 
work for healthy
 clusters, that is, for clusters with no more failures than their maximum
 tolerance.  For example, in a 3-server cluster, the failure of 2 servers
 prevents servers joining or leaving the cluster (as well as database access).
+
 To prevent data loss or inconsistency, the preferred solution to this problem
 is to bring up enough of the failed servers to make the cluster healthy again,
-then if necessary remove any remaining failed servers and add new ones.  If
-this cannot be done, though, use ``ovs-appctl`` to invoke ``cluster/leave
---force`` on a running server.  This command forces the server to which it is
-directed to leave its cluster and form a new single-node cluster that contains
-only itself.  The data in the new cluster may be inconsistent with the former
-cluster: transactions not yet replicated to the server will be lost, and
-transactions not yet applied to the cluster may be committed.  Afterward, any
-servers in its former cluster will regard the server to have failed.
+then if necessary remove any remaining failed servers and add new ones. If this
+is not an option, see the next section for manual recovery procedure.
 
 Once a server leaves a cluster, it may never rejoin it.  Instead, create a new
 server and join it to the cluster.
@@ -362,6 +357,45 @@ Clustered OVSDB does not support the OVSDB "ephemeral 
columns" feature.
 ones when they work with schemas for clustered databases.  Future versions of
 OVSDB might add support for this feature.
 
+Manual cluster recovery
+~~~
+
+If kicking and rejoining failed members to the existing cluster is not
+available in your environment, you may consider to recover the cluster
+manually, as follows.
+
+*Important*: The procedure below will result in `cid` and `sid` change.
+Afterward, any servers in the former cluster will regard the recovered server
+failed.
+
+If you understand the risks and are still willing to proceed, then:
+
+1. Stop the old cluster ``ovsdb-server`` instances before proceeding.
+
+2. Pick one of the old members which will serve as the bootstrap member of the
+   to-be-recovered cluster.
+
+3. Convert its database file to standalone format using ``ovsdb-tool
+   cluster-to-standalone``.
+
+4. Backup the standalone database file. You will use the file in the next step.
+
+5. Re-initialize the new cluster with the bootstrap member (``ovsdb-tool
+   create-cluster``) using the previously saved database file.
+
+6. Start the bootstrapped cluster with this new member.
+
+Once you confirmed that the single member cluster is up and running and serves
+the restored data, you may proceed with joining the rest of the members to the
+newly formed cluster, as usual (``ovsdb-tool join-cluster``).
+
+Once the cluster is restored, any active clients will have to reconnect to the
+new cluster.
+
+Note: The data in the new cluster may be inconsistent with the former cluster:
+transactions not yet replicated to the server will be lost, and transactions
+not yet applied to the cluster may be committed.
+
 Upgrading from version 2.14 and earlier to 2.15 and later
 ~
 
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v6 5/6] northd: Remove unused `sb` arg in ls_port_create.

2024-04-11 Thread Ihar Hrachyshka
It's always NULL.

Signed-off-by: Ihar Hrachyshka 
---
 northd/northd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 6a8ace52f..78028131f 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4336,7 +4336,7 @@ ls_port_init(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 static struct ovn_port *
 ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct hmap *ls_ports,
const char *key, const struct nbrec_logical_switch_port *nbsp,
-   struct ovn_datapath *od, const struct sbrec_port_binding *sb,
+   struct ovn_datapath *od,
const struct sbrec_mirror_table *sbrec_mirror_table,
const struct sbrec_chassis_table *sbrec_chassis_table,
struct ovsdb_idl_index *sbrec_chassis_by_name,
@@ -4345,7 +4345,7 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct 
hmap *ls_ports,
 struct ovn_port *op = ovn_port_create(ls_ports, key, nbsp, NULL,
   NULL);
 hmap_insert(>ports, >dp_node, hmap_node_hash(>key_node));
-if (!ls_port_init(op, ovnsb_txn, od, sb,
+if (!ls_port_init(op, ovnsb_txn, od, NULL,
   sbrec_mirror_table, sbrec_chassis_table,
   sbrec_chassis_by_name, sbrec_chassis_by_hostname)) {
 ovn_port_destroy(ls_ports, op);
@@ -4514,7 +4514,7 @@ ls_handle_lsp_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
 goto fail;
 }
 op = ls_port_create(ovnsb_idl_txn, >ls_ports,
-new_nbsp->name, new_nbsp, od, NULL,
+new_nbsp->name, new_nbsp, od,
 ni->sbrec_mirror_table,
 ni->sbrec_chassis_table,
 ni->sbrec_chassis_by_name,
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v6 6/6] northd: Remove unused nbrp arg in ls_port_reinit.

2024-04-11 Thread Ihar Hrachyshka
It's always NULL.

Signed-off-by: Ihar Hrachyshka 
---
 northd/northd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 78028131f..6a51e5bad 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4358,7 +4358,6 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct 
hmap *ls_ports,
 static bool
 ls_port_reinit(struct ovn_port *op, struct ovsdb_idl_txn *ovnsb_txn,
 const struct nbrec_logical_switch_port *nbsp,
-const struct nbrec_logical_router_port *nbrp,
 struct ovn_datapath *od,
 const struct sbrec_port_binding *sb,
 const struct sbrec_mirror_table *sbrec_mirror_table,
@@ -4368,7 +4367,7 @@ ls_port_reinit(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 {
 ovn_port_cleanup(op);
 op->sb = sb;
-ovn_port_set_nb(op, nbsp, nbrp);
+ovn_port_set_nb(op, nbsp, NULL);
 op->l3dgw_port = op->cr_port = NULL;
 return ls_port_init(op, ovnsb_txn, od, sb,
 sbrec_mirror_table, sbrec_chassis_table,
@@ -4546,7 +4545,7 @@ ls_handle_lsp_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
 continue;
 }
 if (!ls_port_reinit(op, ovnsb_idl_txn,
-new_nbsp, NULL,
+new_nbsp,
 od, sb, ni->sbrec_mirror_table,
 ni->sbrec_chassis_table,
 ni->sbrec_chassis_by_name,
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v6 3/6] tests: Correct tunnel ids exhaustion scenario.

2024-04-11 Thread Ihar Hrachyshka
The original version of the scenario passed with or without the fix.
This is because all LSs were processed in one go, so the allocate
function was never entered with *hint==0.

Also, added another scenario that will check behavior when *hint is out
of [min;max] bounds but > max (this happens in an obscure scenario where
a vxlan chassis is added to the cluster mid-light, forcing northd to
reduce its effective max value for tunnel ids; which may become lower
than the current *hint for ports.)

Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
Co-Authored-By: Vladislav Odintsov 
Signed-off-by: Vladislav Odintsov 
Signed-off-by: Ihar Hrachyshka 
---
 tests/ovn-northd.at | 43 ---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index be006fb32..1a4e7274d 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -2823,7 +2823,7 @@ AT_CLEANUP
 ])
 
 OVN_FOR_EACH_NORTHD_NO_HV([
-AT_SETUP([check tunnel ids exhaustion])
+AT_SETUP([check datapath tunnel ids exhaustion])
 ovn_start
 
 # Create a fake chassis with vxlan encap to lower MAX DP tunnel key to 2^12
@@ -2833,13 +2833,18 @@ ovn-sbctl \
 
 cmd="ovn-nbctl --wait=sb"
 
-for i in {1..4097}; do
+for i in {1..4095}; do
 cmd="${cmd} -- ls-add lsw-${i}"
 done
 
 eval $cmd
 
-check_row_count nb:Logical_Switch 4097
+check_row_count nb:Logical_Switch 4095
+wait_row_count sb:Datapath_Binding 4095
+
+ovn-nbctl ls-add lsw-exhausted
+
+check_row_count nb:Logical_Switch 4096
 wait_row_count sb:Datapath_Binding 4095
 
 OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
northd/ovn-northd.log])
@@ -2847,6 +2852,38 @@ OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
northd/ovn-northd.log])
 AT_CLEANUP
 ])
 
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([check port tunnel ids exhaustion; vxlan chassis pops up midflight])
+ovn_start
+
+cmd="ovn-nbctl --wait=sb"
+
+cmd="${cmd} -- ls-add lsw"
+for i in {1..2048}; do
+cmd="${cmd} -- lsp-add lsw lsp-${i}"
+done
+
+eval $cmd
+
+check_row_count nb:Logical_Switch_Port 2048
+wait_row_count sb:Port_Binding 2048
+
+# Now create a fake chassis with vxlan encap to lower MAX port tunnel key to 
2^11
+ovn-sbctl \
+--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
+-- --id=@c create chassis name=hv1 encaps=@e
+
+ovn-nbctl lsp-add lsw lsp-exhausted
+
+check_row_count nb:Logical_Switch_Port 2049
+wait_row_count sb:Port_Binding 2048
+
+OVS_WAIT_UNTIL([grep "all port tunnel ids exhausted" northd/ovn-northd.log])
+
+AT_CLEANUP
+])
+
+
 
 OVN_FOR_EACH_NORTHD_NO_HV([
 AT_SETUP([Logical Flow Datapath Groups])
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v6 4/6] northd: Delete pb if tunnel is not allocated.

2024-04-11 Thread Ihar Hrachyshka
This allows callers to avoid cleanup of the record in case the function
fails.

Signed-off-by: Ihar Hrachyshka 
---
 northd/northd.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 4cea669cf..6a8ace52f 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4322,6 +4322,9 @@ ls_port_init(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 }
 /* Assign new tunnel ids where needed. */
 if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
+if (!sb) {
+sbrec_port_binding_delete(op->sb);
+}
 return false;
 }
 ovn_port_update_sbrec(ovnsb_txn, sbrec_chassis_by_name,
@@ -4345,9 +4348,6 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct 
hmap *ls_ports,
 if (!ls_port_init(op, ovnsb_txn, od, sb,
   sbrec_mirror_table, sbrec_chassis_table,
   sbrec_chassis_by_name, sbrec_chassis_by_hostname)) {
-if (op->sb) {
-sbrec_port_binding_delete(op->sb);
-}
 ovn_port_destroy(ls_ports, op);
 return NULL;
 }
@@ -4551,8 +4551,8 @@ ls_handle_lsp_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
 ni->sbrec_chassis_table,
 ni->sbrec_chassis_by_name,
 ni->sbrec_chassis_by_hostname)) {
-if (op->sb) {
-sbrec_port_binding_delete(op->sb);
+if (sb) {
+sbrec_port_binding_delete(sb);
 }
 ovn_port_destroy(>ls_ports, op);
 goto fail;
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v6 2/6] northd: Don't detach op->list when it wasn't used.

2024-04-11 Thread Ihar Hrachyshka
In some scenarios, op->list is not attached anywhere, which makes
attempts to detach it trigger ubsan failure.

ovn/ovs/include/openvswitch/list.h:252:17: runtime error: member access
within null pointer of type 'struct ovs_list'

 #0 0x?? in ovs_list_remove ovn/ovs/include/openvswitch/list.h:252:17
 #1 0x?? in ovn_port_allocate_key ovn/northd/northd.c:4021:13
 #2 0x?? in ls_port_init ovn/northd/northd.c:4321:10
 #3 0x?? in ls_port_create ovn/northd/northd.c:4342:10
 #4 0x?? in ls_handle_lsp_changes ovn/northd/northd.c:4511:18
 #5 0x?? in northd_handle_ls_changes ovn/northd/northd.c:4655:14
 #6 0x?? in northd_nb_logical_switch_handler ovn/northd/en-northd.c:150:

This patch makes northd use op->list only as a temporary means for
build_ports logic to track ports that are persisted in both, nb, or sb
only. Now build_ports will always detach ops once done.

Now that op->list is never left attached to a list, we can remove
ovs_list_remove calls for it elsewhere, including where op was never
attached in the first place.

Signed-off-by: Ihar Hrachyshka 
---
 northd/northd.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index f2406890c..4cea669cf 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4111,6 +4111,7 @@ build_ports(struct ovsdb_idl_txn *ovnsb_txn,
   sbrec_mirror_table,
   op, queue_id_bitmap,
   _ha_chassis_grps);
+ovs_list_remove(>list);
 }
 
 /* Add southbound record for each unmatched northbound record. */
@@ -4123,6 +4124,7 @@ build_ports(struct ovsdb_idl_txn *ovnsb_txn,
   op, queue_id_bitmap,
   _ha_chassis_grps);
 sbrec_port_binding_set_logical_port(op->sb, op->key);
+ovs_list_remove(>list);
 }
 
 /* Delete southbound records without northbound matches. */
@@ -4292,7 +4294,7 @@ ovn_port_find_in_datapath(struct ovn_datapath *od,
 
 static bool
 ls_port_init(struct ovn_port *op, struct ovsdb_idl_txn *ovnsb_txn,
- struct hmap *ls_ports, struct ovn_datapath *od,
+ struct ovn_datapath *od,
  const struct sbrec_port_binding *sb,
  const struct sbrec_mirror_table *sbrec_mirror_table,
  const struct sbrec_chassis_table *sbrec_chassis_table,
@@ -4320,11 +4322,6 @@ ls_port_init(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 }
 /* Assign new tunnel ids where needed. */
 if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
-if (op->sb) {
-sbrec_port_binding_delete(op->sb);
-}
-ovs_list_remove(>list);
-ovn_port_destroy(ls_ports, op);
 return false;
 }
 ovn_port_update_sbrec(ovnsb_txn, sbrec_chassis_by_name,
@@ -4345,9 +4342,12 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct 
hmap *ls_ports,
 struct ovn_port *op = ovn_port_create(ls_ports, key, nbsp, NULL,
   NULL);
 hmap_insert(>ports, >dp_node, hmap_node_hash(>key_node));
-if (!ls_port_init(op, ovnsb_txn, ls_ports, od, sb,
+if (!ls_port_init(op, ovnsb_txn, od, sb,
   sbrec_mirror_table, sbrec_chassis_table,
   sbrec_chassis_by_name, sbrec_chassis_by_hostname)) {
+if (op->sb) {
+sbrec_port_binding_delete(op->sb);
+}
 ovn_port_destroy(ls_ports, op);
 return NULL;
 }
@@ -4357,7 +4357,6 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct 
hmap *ls_ports,
 
 static bool
 ls_port_reinit(struct ovn_port *op, struct ovsdb_idl_txn *ovnsb_txn,
-struct hmap *ls_ports,
 const struct nbrec_logical_switch_port *nbsp,
 const struct nbrec_logical_router_port *nbrp,
 struct ovn_datapath *od,
@@ -4371,7 +4370,7 @@ ls_port_reinit(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 op->sb = sb;
 ovn_port_set_nb(op, nbsp, nbrp);
 op->l3dgw_port = op->cr_port = NULL;
-return ls_port_init(op, ovnsb_txn, ls_ports, od, sb,
+return ls_port_init(op, ovnsb_txn, od, sb,
 sbrec_mirror_table, sbrec_chassis_table,
 sbrec_chassis_by_name, sbrec_chassis_by_hostname);
 }
@@ -4546,12 +4545,16 @@ ls_handle_lsp_changes(struct ovsdb_idl_txn 
*ovnsb_idl_txn,
 op->visited = true;
 continue;
 }
-if (!ls_port_reinit(op, ovnsb_idl_txn, >ls_ports,
+if (!ls_port_reinit(op, ovnsb_idl_txn,
 new_nbsp, NULL,
 od, sb, ni->sbrec_mirror_table,
 ni->sbrec_chassis_table,
 ni->sbrec_chassis_by_name,
 ni->sbrec_chassis_by_hostname)) {
+if (op->sb) {
+

[ovs-dev] [PATCH ovn v6 1/6] northd: Don't cleanup op in ovn_port_allocate_key.

2024-04-11 Thread Ihar Hrachyshka
Let the callers do the cleanup as needed.

Signed-off-by: Ihar Hrachyshka 
---
 northd/northd.c | 24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 02cf5b234..f2406890c 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4006,7 +4006,6 @@ ovn_port_assign_requested_tnl_id(
 
 static bool
 ovn_port_allocate_key(const struct sbrec_chassis_table *sbrec_chassis_table,
-  struct hmap *ports,
   struct ovn_port *op)
 {
 if (!op->tunnel_key) {
@@ -4015,11 +4014,6 @@ ovn_port_allocate_key(const struct sbrec_chassis_table 
*sbrec_chassis_table,
 1, (1u << (key_bits - 1)) - 1,
 >od->port_key_hint);
 if (!op->tunnel_key) {
-if (op->sb) {
-sbrec_port_binding_delete(op->sb);
-}
-ovs_list_remove(>list);
-ovn_port_destroy(ports, op);
 return false;
 }
 }
@@ -4084,10 +4078,17 @@ build_ports(struct ovsdb_idl_txn *ovnsb_txn,
 
 /* Assign new tunnel ids where needed. */
 LIST_FOR_EACH_SAFE (op, list, ) {
-ovn_port_allocate_key(sbrec_chassis_table, ports, op);
+if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
+sbrec_port_binding_delete(op->sb);
+ovs_list_remove(>list);
+ovn_port_destroy(ports, op);
+}
 }
 LIST_FOR_EACH_SAFE (op, list, _only) {
-ovn_port_allocate_key(sbrec_chassis_table, ports, op);
+if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
+ovs_list_remove(>list);
+ovn_port_destroy(ports, op);
+}
 }
 
 /* For logical ports that are in both databases, update the southbound
@@ -4318,7 +4319,12 @@ ls_port_init(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 sbrec_port_binding_set_logical_port(op->sb, op->key);
 }
 /* Assign new tunnel ids where needed. */
-if (!ovn_port_allocate_key(sbrec_chassis_table, ls_ports, op)) {
+if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
+if (op->sb) {
+sbrec_port_binding_delete(op->sb);
+}
+ovs_list_remove(>list);
+ovn_port_destroy(ls_ports, op);
 return false;
 }
 ovn_port_update_sbrec(ovnsb_txn, sbrec_chassis_by_name,
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v6 0/6] Correct tunnel ids exhaustion scenario.

2024-04-11 Thread Ihar Hrachyshka
v2+ of the original test patch exposed a ubsan failure in port tunnel id
allocation code when tunnel id space is exhausted.

This series fixes the ubsan failure (patches 1-2); then adjusts the
invalid scenario to trigger the originally intended failure mode - id
space exhausted (patch 3). Finally, it includes a number of smaller
cleanup patches in the area that simplify the allocation code somewhat.
(patches 4-6)

I attempted to make each patch as simple as possible, to simplify
review. If you think it's too granular, let me know and I can squash
some.

v1: initial version.
v2: cover both cases of hint = 0 and hint > max.
v3: reduce the number of ports to create in the hint > max scenario needed to 
trigger the problem.
v4: remove spurious lib/ovn-util.c change.
v5: ubsan fixes included.
v6: modify patch 5 to honor previously allocated tunnel ids.
always detach op->list in build_ports (and never elsewhere.)

Ihar Hrachyshka (6):
  northd: Don't cleanup op in ovn_port_allocate_key.
  northd: Don't detach op->list when it wasn't used.
  tests: Correct tunnel ids exhaustion scenario.
  northd: Delete pb if tunnel is not allocated.
  northd: Remove unused `sb` arg in ls_port_create.
  northd: Remove unused nbrp arg in ls_port_reinit.

 northd/northd.c | 46 ++---
 tests/ovn-northd.at | 43 +++---
 2 files changed, 67 insertions(+), 22 deletions(-)

-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/2] ovsdb: raft: Fix inability to join a cluster with a large database.

2024-04-11 Thread Ilya Maximets
Inactivity probe interval on RAFT connections depend on a value of the
election timer.  However, the actual value is not known until the
database snapshot with the RAFT information is received by a joining
server.  New joining server is using a default 1 second until then.

In case a new joining server is trying to join an existing cluster
with a large database, it may take more than a second to generate and
send an initial database snapshot.  This is causing an inability to
actually join this cluster.  Joining server sends ADD_SERVER request,
waits 1 second, sends a probe, doesn't get a reply within another
second, because the leader is busy preparing and sending an initial
snapshot to it, disconnects, repeat.

This is not an issue for the servers that did already join, since
their probe intervals are larger than election timeout.
Cooperative multitasking also doesn't fully solve this issue, since
it depends on election timer, which is likely higher in the existing
cluster with a very big database.

Fix that by using the maximum election timer value for inactivity
probes until the actual value is known.  We still shouldn't completely
disable the probes, because in the rare event the connection is
established but the other side silently goes away, we still want to
disconnect and try to re-establish the connection eventually.

Since probe intervals also depend on the joining state now, update
them when the server joins the cluster.

Fixes: 14b2b0aad7ae ("raft: Reintroduce jsonrpc inactivity probes.")
Reported-by: Terry Wilson 
Reported-at: https://issues.redhat.com/browse/FDP-144
Signed-off-by: Ilya Maximets 
---
 ovsdb/raft.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index d81a1758a..083ebf66a 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -1018,8 +1018,13 @@ raft_conn_update_probe_interval(struct raft *raft, 
struct raft_conn *r_conn)
  * inactivity probe follower will just try to initiate election
  * indefinitely staying in 'candidate' role.  And the leader will continue
  * to send heartbeats to the dead connection thinking that remote server
- * is still part of the cluster. */
-int probe_interval = raft->election_timer + ELECTION_RANGE_MSEC;
+ * is still part of the cluster.
+ *
+ * While joining, the real value of the election timeout is not known to
+ * this server, so using the maximum. */
+int probe_interval = (raft->joining ? ELECTION_MAX_MSEC
+: raft->election_timer)
+ + ELECTION_RANGE_MSEC;
 
 jsonrpc_session_set_probe_interval(r_conn->js, probe_interval);
 }
@@ -2820,6 +2825,13 @@ raft_send_heartbeats(struct raft *raft)
 raft_reset_ping_timer(raft);
 }
 
+static void
+raft_join_complete(struct raft *raft)
+{
+raft->joining = false;
+raft_update_probe_intervals(raft);
+}
+
 /* Initializes the fields in 's' that represent the leader's view of the
  * server. */
 static void
@@ -2866,7 +2878,7 @@ raft_become_leader(struct raft *raft)
  * we're becoming a cluster leader without receiving reply for a
  * join request and will commit addition of this server ourselves. */
 VLOG_INFO_RL(, "elected as leader while joining");
-raft->joining = false;
+raft_join_complete(raft);
 }
 
 struct raft_server *s;
@@ -3101,7 +3113,7 @@ raft_update_commit_index(struct raft *raft, uint64_t 
new_commit_index)
 "added to configuration without reply "
 "(eid: "UUID_FMT", commit index: %"PRIu64")",
 UUID_ARGS(>eid), index);
-raft->joining = false;
+raft_join_complete(raft);
 }
 }
 raft_servers_destroy();
@@ -4049,7 +4061,7 @@ raft_handle_add_server_reply(struct raft *raft,
 }
 
 if (rpy->success) {
-raft->joining = false;
+raft_join_complete(raft);
 
 /* It is tempting, at this point, to check that this server is part of
  * the current configuration.  However, this is not necessarily the
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 2/2] ovsdb: raft: Fix probe intervals after install snapshot request.

2024-04-11 Thread Ilya Maximets
If the new snapshot received with INSTALL_SNAPSHOT request contains
a different election timer value, the timer is updated, but the
probe intervals for RAFT connections are not.

Fix that by updating probe intervals whenever we get election timer
from the log.

Fixes: 14b2b0aad7ae ("raft: Reintroduce jsonrpc inactivity probes.")
Signed-off-by: Ilya Maximets 
---
 ovsdb/raft.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index 083ebf66a..ac3d37ac4 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -5035,6 +5035,7 @@ raft_get_election_timer_from_log(struct raft *raft)
 break;
 }
 }
+raft_update_probe_intervals(raft);
 }
 
 static void
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 0/2] ovsdb: raft: Fixes for probe interval updates.

2024-04-11 Thread Ilya Maximets
A couple of fixes related to corner cases with probe intervals
on RAFT connections between servers.

Ilya Maximets (2):
  ovsdb: raft: Fix inability to join a cluster with a large database.
  ovsdb: raft: Fix probe intervals after install snapshot request.

 ovsdb/raft.c | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] appveyor: Fix too wide OpenSSL version regexp.

2024-04-11 Thread Ilya Maximets
Current regexp is not good enough.  OpenSSL 3.3.0 is now available
and unfortunately the regexp is matching both 3.3.0 and 3.0.13.

All the AppVeyor runs are currently failing because of this.

Making it more restrictive by matching on the start of the string,
explicit dots and numbers after the last dot.  Hopefully, this is
good enough.

In addition, taking only the first result just in case it mismatches
again.

Fixes: 9d8208484a35 ("appveyor: Build with OpenSSL 3.0.")
Signed-off-by: Ilya Maximets 
---
 appveyor.yml | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/appveyor.yml b/appveyor.yml
index baa844753..d11e46399 100644
--- a/appveyor.yml
+++ b/appveyor.yml
@@ -26,12 +26,12 @@ install:
 $URL = 
"https://raw.githubusercontent.com/slproweb/opensslhashes/master/win32_openssl_hashes.json;
 $webData = (Invoke-WebRequest -Uri $URL).content | ConvertFrom-Json
 $source = ($webData.files.PSObject.Properties | Where-Object {
-$_.Value.basever   -match "3.0.*" -and
-$_.Value.bits  -eq"64"-and
-$_.Value.arch  -eq"INTEL" -and
-$_.Value.installer -eq"exe"   -and
+$_.Value.basever   -match "^3\.0\.[0-9]+" -and
+$_.Value.bits  -eq"64"-and
+$_.Value.arch  -eq"INTEL" -and
+$_.Value.installer -eq"exe"   -and
 -not $_.Value.light
-} | Select-Object Value).PSObject.Properties.Value
+} | Select-Object Value | Select -First 1).PSObject.Properties.Value
 
 Write-Host "Latest OpenSSL 3.0:" ($source | Format-List | Out-String)
 
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v5 5/7] northd: Create pb after tunnel id is allocated.

2024-04-11 Thread Ihar Hrachyshka
Sorry, this is incorrect, the order of steps here is important. Instead, we
should clean up the newly created record on failure. (Other patches in the
series are ok; will send an updated series.)

On Thu, Apr 11, 2024 at 6:59 PM Ihar Hrachyshka  wrote:

> This allows to avoid cleanup of the record in case tunnel id fails to
> allocate.
>
> Signed-off-by: Ihar Hrachyshka 
> ---
>  northd/northd.c | 15 ++-
>  1 file changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/northd/northd.c b/northd/northd.c
> index 3d2715911..5a0225189 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -4305,6 +4305,10 @@ ls_port_init(struct ovn_port *op, struct
> ovsdb_idl_txn *ovnsb_txn,
>  if (!ovn_port_assign_requested_tnl_id(sbrec_chassis_table, op)) {
>  return false;
>  }
> +/* Assign new tunnel ids where needed. */
> +if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
> +return false;
> +}
>  if (sb) {
>  op->sb = sb;
>  /* Keep nonconflicting tunnel IDs that are already assigned. */
> @@ -4318,10 +4322,6 @@ ls_port_init(struct ovn_port *op, struct
> ovsdb_idl_txn *ovnsb_txn,
>  op->sb = sbrec_port_binding_insert(ovnsb_txn);
>  sbrec_port_binding_set_logical_port(op->sb, op->key);
>  }
> -/* Assign new tunnel ids where needed. */
> -if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
> -return false;
> -}
>  ovn_port_update_sbrec(ovnsb_txn, sbrec_chassis_by_name,
>sbrec_chassis_by_hostname, NULL,
> sbrec_mirror_table,
>op, NULL, NULL);
> @@ -4343,9 +4343,6 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn,
> struct hmap *ls_ports,
>  if (!ls_port_init(op, ovnsb_txn, od, sb,
>sbrec_mirror_table, sbrec_chassis_table,
>sbrec_chassis_by_name, sbrec_chassis_by_hostname)) {
> -if (op->sb) {
> -sbrec_port_binding_delete(op->sb);
> -}
>  ovn_port_destroy(ls_ports, op);
>  return NULL;
>  }
> @@ -4549,8 +4546,8 @@ ls_handle_lsp_changes(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
>  ni->sbrec_chassis_table,
>  ni->sbrec_chassis_by_name,
>  ni->sbrec_chassis_by_hostname)) {
> -if (op->sb) {
> -sbrec_port_binding_delete(op->sb);
> +if (sb) {
> +sbrec_port_binding_delete(sb);
>  }
>  ovs_list_remove(>list);
>  ovn_port_destroy(>ls_ports, op);
> --
> 2.41.0
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] Make tunnel ids exhaustion test trigger the problem.

2024-04-11 Thread Ihar Hrachyshka
Series with this patch that also fixes ubsan failure here:
https://patchwork.ozlabs.org/project/ovn/list/?series=402694

On Mon, Apr 8, 2024 at 9:49 AM Ihar Hrachyshka  wrote:

> Note to reviewers: looks like the port tunnel id test case revealed a
> number of undefined behaviors and leaks (?) in northd; I am working on
> addressing these. Before that, we should not merge this patch as-is. (Or at
> least we should not merge the part with the port case; I believe the
> network case is fine.)
>
> On Sat, Apr 6, 2024 at 2:30 AM Vladislav Odintsov 
> wrote:
>
>> Hi Ihar,
>>
>> Thanks for cooperation and enhancements in the testcases!
>> The patch looks good to me.
>>
>> On 5 Apr 2024, at 19:14, Ihar Hrachyshka  wrote:
>>
>> The original version of the scenario passed with or without the fix.
>> This is because all LSs were processed in one go, so the allocate
>> function was never entered with *hint==0.
>>
>> Also, added another scenario that will check behavior when *hint is out
>> of [min;max] bounds but > max (this happens in an obscure scenario where
>> a vxlan chassis is added to the cluster mid-light, forcing northd to
>> reduce its effective max value for tunnel ids; which may become lower
>> than the current *hint for ports.)
>>
>> Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
>> Co-Authored-By: Vladislav Odintsov 
>> Signed-off-by: Vladislav Odintsov 
>> Signed-off-by: Ihar Hrachyshka 
>> ---
>> v1: initial version.
>> v2: cover both cases of hint = 0 and hint > max.
>> v3: reduce the number of ports to create in the hint > max scenario
>> needed to trigger the problem.
>> v4: remove spurious lib/ovn-util.c change.
>> ---
>> tests/ovn-northd.at | 43 ---
>> 1 file changed, 40 insertions(+), 3 deletions(-)
>>
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index be006fb32..1a4e7274d 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -2823,7 +2823,7 @@ AT_CLEANUP
>> ])
>>
>> OVN_FOR_EACH_NORTHD_NO_HV([
>> -AT_SETUP([check tunnel ids exhaustion])
>> +AT_SETUP([check datapath tunnel ids exhaustion])
>> ovn_start
>>
>> # Create a fake chassis with vxlan encap to lower MAX DP tunnel key to
>> 2^12
>> @@ -2833,13 +2833,18 @@ ovn-sbctl \
>>
>> cmd="ovn-nbctl --wait=sb"
>>
>> -for i in {1..4097}; do
>> +for i in {1..4095}; do
>> cmd="${cmd} -- ls-add lsw-${i}"
>> done
>>
>> eval $cmd
>>
>> -check_row_count nb:Logical_Switch 4097
>> +check_row_count nb:Logical_Switch 4095
>> +wait_row_count sb:Datapath_Binding 4095
>> +
>> +ovn-nbctl ls-add lsw-exhausted
>> +
>> +check_row_count nb:Logical_Switch 4096
>> wait_row_count sb:Datapath_Binding 4095
>>
>> OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
>> northd/ovn-northd.log])
>> @@ -2847,6 +2852,38 @@ OVS_WAIT_UNTIL([grep "all datapath tunnel ids
>> exhausted" northd/ovn-northd.log])
>> AT_CLEANUP
>> ])
>>
>> +OVN_FOR_EACH_NORTHD_NO_HV([
>> +AT_SETUP([check port tunnel ids exhaustion; vxlan chassis pops up
>> midflight])
>> +ovn_start
>> +
>> +cmd="ovn-nbctl --wait=sb"
>> +
>> +cmd="${cmd} -- ls-add lsw"
>> +for i in {1..2048}; do
>> +cmd="${cmd} -- lsp-add lsw lsp-${i}"
>> +done
>> +
>> +eval $cmd
>> +
>> +check_row_count nb:Logical_Switch_Port 2048
>> +wait_row_count sb:Port_Binding 2048
>> +
>> +# Now create a fake chassis with vxlan encap to lower MAX port tunnel
>> key to 2^11
>> +ovn-sbctl \
>> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
>> +-- --id=@c create chassis name=hv1 encaps=@e
>> +
>> +ovn-nbctl lsp-add lsw lsp-exhausted
>> +
>> +check_row_count nb:Logical_Switch_Port 2049
>> +wait_row_count sb:Port_Binding 2048
>> +
>> +OVS_WAIT_UNTIL([grep "all port tunnel ids exhausted"
>> northd/ovn-northd.log])
>> +
>> +AT_CLEANUP
>> +])
>> +
>> +
>>
>> OVN_FOR_EACH_NORTHD_NO_HV([
>> AT_SETUP([Logical Flow Datapath Groups])
>> --
>> 2.41.0
>>
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>>
>>
>> Regards,
>> Vladislav Odintsov
>>
>>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v5 7/7] northd: Remove unused nbrp arg in ls_port_reinit.

2024-04-11 Thread Ihar Hrachyshka
It's always NULL.

Signed-off-by: Ihar Hrachyshka 
---
 northd/northd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index adc8930c8..044a69264 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4353,7 +4353,6 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct 
hmap *ls_ports,
 static bool
 ls_port_reinit(struct ovn_port *op, struct ovsdb_idl_txn *ovnsb_txn,
 const struct nbrec_logical_switch_port *nbsp,
-const struct nbrec_logical_router_port *nbrp,
 struct ovn_datapath *od,
 const struct sbrec_port_binding *sb,
 const struct sbrec_mirror_table *sbrec_mirror_table,
@@ -4363,7 +4362,7 @@ ls_port_reinit(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 {
 ovn_port_cleanup(op);
 op->sb = sb;
-ovn_port_set_nb(op, nbsp, nbrp);
+ovn_port_set_nb(op, nbsp, NULL);
 op->l3dgw_port = op->cr_port = NULL;
 return ls_port_init(op, ovnsb_txn, od, sb,
 sbrec_mirror_table, sbrec_chassis_table,
@@ -4541,7 +4540,7 @@ ls_handle_lsp_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
 continue;
 }
 if (!ls_port_reinit(op, ovnsb_idl_txn,
-new_nbsp, NULL,
+new_nbsp,
 od, sb, ni->sbrec_mirror_table,
 ni->sbrec_chassis_table,
 ni->sbrec_chassis_by_name,
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v5 2/7] northd: Don't cleanup op inside ls_port_init.

2024-04-11 Thread Ihar Hrachyshka
Let the caller do the cleanup, as needed.

Signed-off-by: Ihar Hrachyshka 
---
 northd/northd.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index f2406890c..ace663d54 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4292,7 +4292,7 @@ ovn_port_find_in_datapath(struct ovn_datapath *od,
 
 static bool
 ls_port_init(struct ovn_port *op, struct ovsdb_idl_txn *ovnsb_txn,
- struct hmap *ls_ports, struct ovn_datapath *od,
+ struct ovn_datapath *od,
  const struct sbrec_port_binding *sb,
  const struct sbrec_mirror_table *sbrec_mirror_table,
  const struct sbrec_chassis_table *sbrec_chassis_table,
@@ -4320,11 +4320,6 @@ ls_port_init(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 }
 /* Assign new tunnel ids where needed. */
 if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
-if (op->sb) {
-sbrec_port_binding_delete(op->sb);
-}
-ovs_list_remove(>list);
-ovn_port_destroy(ls_ports, op);
 return false;
 }
 ovn_port_update_sbrec(ovnsb_txn, sbrec_chassis_by_name,
@@ -4345,9 +4340,13 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct 
hmap *ls_ports,
 struct ovn_port *op = ovn_port_create(ls_ports, key, nbsp, NULL,
   NULL);
 hmap_insert(>ports, >dp_node, hmap_node_hash(>key_node));
-if (!ls_port_init(op, ovnsb_txn, ls_ports, od, sb,
+if (!ls_port_init(op, ovnsb_txn, od, sb,
   sbrec_mirror_table, sbrec_chassis_table,
   sbrec_chassis_by_name, sbrec_chassis_by_hostname)) {
+if (op->sb) {
+sbrec_port_binding_delete(op->sb);
+}
+ovs_list_remove(>list);
 ovn_port_destroy(ls_ports, op);
 return NULL;
 }
@@ -4357,7 +4356,6 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct 
hmap *ls_ports,
 
 static bool
 ls_port_reinit(struct ovn_port *op, struct ovsdb_idl_txn *ovnsb_txn,
-struct hmap *ls_ports,
 const struct nbrec_logical_switch_port *nbsp,
 const struct nbrec_logical_router_port *nbrp,
 struct ovn_datapath *od,
@@ -4371,7 +4369,7 @@ ls_port_reinit(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 op->sb = sb;
 ovn_port_set_nb(op, nbsp, nbrp);
 op->l3dgw_port = op->cr_port = NULL;
-return ls_port_init(op, ovnsb_txn, ls_ports, od, sb,
+return ls_port_init(op, ovnsb_txn, od, sb,
 sbrec_mirror_table, sbrec_chassis_table,
 sbrec_chassis_by_name, sbrec_chassis_by_hostname);
 }
@@ -4546,12 +4544,17 @@ ls_handle_lsp_changes(struct ovsdb_idl_txn 
*ovnsb_idl_txn,
 op->visited = true;
 continue;
 }
-if (!ls_port_reinit(op, ovnsb_idl_txn, >ls_ports,
+if (!ls_port_reinit(op, ovnsb_idl_txn,
 new_nbsp, NULL,
 od, sb, ni->sbrec_mirror_table,
 ni->sbrec_chassis_table,
 ni->sbrec_chassis_by_name,
 ni->sbrec_chassis_by_hostname)) {
+if (op->sb) {
+sbrec_port_binding_delete(op->sb);
+}
+ovs_list_remove(>list);
+ovn_port_destroy(>ls_ports, op);
 goto fail;
 }
 add_op_to_northd_tracked_ports(_lsps->updated, op);
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v5 3/7] northd: Don't remove detached op from list.

2024-04-11 Thread Ihar Hrachyshka
The `op` here is a new port structure that is not, yet, attached
anywhere, and the attempt to detach it will result in undefined
behavior.

Signed-off-by: Ihar Hrachyshka 
---
 northd/northd.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/northd/northd.c b/northd/northd.c
index ace663d54..3d2715911 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4346,7 +4346,6 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct 
hmap *ls_ports,
 if (op->sb) {
 sbrec_port_binding_delete(op->sb);
 }
-ovs_list_remove(>list);
 ovn_port_destroy(ls_ports, op);
 return NULL;
 }
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v5 4/7] tests: Correct tunnel ids exhaustion scenario.

2024-04-11 Thread Ihar Hrachyshka
The original version of the scenario passed with or without the fix.
This is because all LSs were processed in one go, so the allocate
function was never entered with *hint==0.

Also, added another scenario that will check behavior when *hint is out
of [min;max] bounds but > max (this happens in an obscure scenario where
a vxlan chassis is added to the cluster mid-light, forcing northd to
reduce its effective max value for tunnel ids; which may become lower
than the current *hint for ports.)

Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
Co-Authored-By: Vladislav Odintsov 
Signed-off-by: Vladislav Odintsov 
Signed-off-by: Ihar Hrachyshka 
---
 tests/ovn-northd.at | 43 ---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index be006fb32..1a4e7274d 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -2823,7 +2823,7 @@ AT_CLEANUP
 ])
 
 OVN_FOR_EACH_NORTHD_NO_HV([
-AT_SETUP([check tunnel ids exhaustion])
+AT_SETUP([check datapath tunnel ids exhaustion])
 ovn_start
 
 # Create a fake chassis with vxlan encap to lower MAX DP tunnel key to 2^12
@@ -2833,13 +2833,18 @@ ovn-sbctl \
 
 cmd="ovn-nbctl --wait=sb"
 
-for i in {1..4097}; do
+for i in {1..4095}; do
 cmd="${cmd} -- ls-add lsw-${i}"
 done
 
 eval $cmd
 
-check_row_count nb:Logical_Switch 4097
+check_row_count nb:Logical_Switch 4095
+wait_row_count sb:Datapath_Binding 4095
+
+ovn-nbctl ls-add lsw-exhausted
+
+check_row_count nb:Logical_Switch 4096
 wait_row_count sb:Datapath_Binding 4095
 
 OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
northd/ovn-northd.log])
@@ -2847,6 +2852,38 @@ OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
northd/ovn-northd.log])
 AT_CLEANUP
 ])
 
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([check port tunnel ids exhaustion; vxlan chassis pops up midflight])
+ovn_start
+
+cmd="ovn-nbctl --wait=sb"
+
+cmd="${cmd} -- ls-add lsw"
+for i in {1..2048}; do
+cmd="${cmd} -- lsp-add lsw lsp-${i}"
+done
+
+eval $cmd
+
+check_row_count nb:Logical_Switch_Port 2048
+wait_row_count sb:Port_Binding 2048
+
+# Now create a fake chassis with vxlan encap to lower MAX port tunnel key to 
2^11
+ovn-sbctl \
+--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
+-- --id=@c create chassis name=hv1 encaps=@e
+
+ovn-nbctl lsp-add lsw lsp-exhausted
+
+check_row_count nb:Logical_Switch_Port 2049
+wait_row_count sb:Port_Binding 2048
+
+OVS_WAIT_UNTIL([grep "all port tunnel ids exhausted" northd/ovn-northd.log])
+
+AT_CLEANUP
+])
+
+
 
 OVN_FOR_EACH_NORTHD_NO_HV([
 AT_SETUP([Logical Flow Datapath Groups])
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v5 6/7] northd: Remove unused `sb` arg in ls_port_create.

2024-04-11 Thread Ihar Hrachyshka
It's always NULL.

Signed-off-by: Ihar Hrachyshka 
---
 northd/northd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 5a0225189..adc8930c8 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4331,7 +4331,7 @@ ls_port_init(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 static struct ovn_port *
 ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct hmap *ls_ports,
const char *key, const struct nbrec_logical_switch_port *nbsp,
-   struct ovn_datapath *od, const struct sbrec_port_binding *sb,
+   struct ovn_datapath *od,
const struct sbrec_mirror_table *sbrec_mirror_table,
const struct sbrec_chassis_table *sbrec_chassis_table,
struct ovsdb_idl_index *sbrec_chassis_by_name,
@@ -4340,7 +4340,7 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct 
hmap *ls_ports,
 struct ovn_port *op = ovn_port_create(ls_ports, key, nbsp, NULL,
   NULL);
 hmap_insert(>ports, >dp_node, hmap_node_hash(>key_node));
-if (!ls_port_init(op, ovnsb_txn, od, sb,
+if (!ls_port_init(op, ovnsb_txn, od, NULL,
   sbrec_mirror_table, sbrec_chassis_table,
   sbrec_chassis_by_name, sbrec_chassis_by_hostname)) {
 ovn_port_destroy(ls_ports, op);
@@ -4509,7 +4509,7 @@ ls_handle_lsp_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
 goto fail;
 }
 op = ls_port_create(ovnsb_idl_txn, >ls_ports,
-new_nbsp->name, new_nbsp, od, NULL,
+new_nbsp->name, new_nbsp, od,
 ni->sbrec_mirror_table,
 ni->sbrec_chassis_table,
 ni->sbrec_chassis_by_name,
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v5 5/7] northd: Create pb after tunnel id is allocated.

2024-04-11 Thread Ihar Hrachyshka
This allows to avoid cleanup of the record in case tunnel id fails to
allocate.

Signed-off-by: Ihar Hrachyshka 
---
 northd/northd.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 3d2715911..5a0225189 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4305,6 +4305,10 @@ ls_port_init(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 if (!ovn_port_assign_requested_tnl_id(sbrec_chassis_table, op)) {
 return false;
 }
+/* Assign new tunnel ids where needed. */
+if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
+return false;
+}
 if (sb) {
 op->sb = sb;
 /* Keep nonconflicting tunnel IDs that are already assigned. */
@@ -4318,10 +4322,6 @@ ls_port_init(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 op->sb = sbrec_port_binding_insert(ovnsb_txn);
 sbrec_port_binding_set_logical_port(op->sb, op->key);
 }
-/* Assign new tunnel ids where needed. */
-if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
-return false;
-}
 ovn_port_update_sbrec(ovnsb_txn, sbrec_chassis_by_name,
   sbrec_chassis_by_hostname, NULL, sbrec_mirror_table,
   op, NULL, NULL);
@@ -4343,9 +4343,6 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct 
hmap *ls_ports,
 if (!ls_port_init(op, ovnsb_txn, od, sb,
   sbrec_mirror_table, sbrec_chassis_table,
   sbrec_chassis_by_name, sbrec_chassis_by_hostname)) {
-if (op->sb) {
-sbrec_port_binding_delete(op->sb);
-}
 ovn_port_destroy(ls_ports, op);
 return NULL;
 }
@@ -4549,8 +4546,8 @@ ls_handle_lsp_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
 ni->sbrec_chassis_table,
 ni->sbrec_chassis_by_name,
 ni->sbrec_chassis_by_hostname)) {
-if (op->sb) {
-sbrec_port_binding_delete(op->sb);
+if (sb) {
+sbrec_port_binding_delete(sb);
 }
 ovs_list_remove(>list);
 ovn_port_destroy(>ls_ports, op);
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v5 0/7] Correct tunnel ids exhaustion scenario.

2024-04-11 Thread Ihar Hrachyshka
v2+ of the original test patch exposed a ubsan failure in port tunnel id
allocation code when tunnel id space is exhausted.

This series fixes the ubsan failure (patches 1-3); then adjusts the
invalid scenario to trigger the originally intended failure mode - id
space exhausted (patch 4). Finally, it includes a number of smaller
cleanup patches in the area that simplify the allocation code somewhat.
(patches 5-7)

I attempted to make each patch as simple as possible, to simplify
review. If you think it's too granular, let me know and I can squash
some.

v1: initial version.
v2: cover both cases of hint = 0 and hint > max.
v3: reduce the number of ports to create in the hint > max scenario needed to 
trigger the problem.
v4: remove spurious lib/ovn-util.c change.

Ihar Hrachyshka (7):
  northd: Don't cleanup op in ovn_port_allocate_key.
  northd: Don't cleanup op inside ls_port_init.
  northd: Don't remove detached op from list.
  tests: Correct tunnel ids exhaustion scenario.
  northd: Create pb after tunnel id is allocated.
  northd: Remove unused `sb` arg in ls_port_create.
  northd: Remove unused nbrp arg in ls_port_reinit.

 northd/northd.c | 48 -
 tests/ovn-northd.at | 43 +---
 2 files changed, 66 insertions(+), 25 deletions(-)

-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v5 1/7] northd: Don't cleanup op in ovn_port_allocate_key.

2024-04-11 Thread Ihar Hrachyshka
Let the callers do the cleanup as needed.

Signed-off-by: Ihar Hrachyshka 
---
 northd/northd.c | 24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 02cf5b234..f2406890c 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4006,7 +4006,6 @@ ovn_port_assign_requested_tnl_id(
 
 static bool
 ovn_port_allocate_key(const struct sbrec_chassis_table *sbrec_chassis_table,
-  struct hmap *ports,
   struct ovn_port *op)
 {
 if (!op->tunnel_key) {
@@ -4015,11 +4014,6 @@ ovn_port_allocate_key(const struct sbrec_chassis_table 
*sbrec_chassis_table,
 1, (1u << (key_bits - 1)) - 1,
 >od->port_key_hint);
 if (!op->tunnel_key) {
-if (op->sb) {
-sbrec_port_binding_delete(op->sb);
-}
-ovs_list_remove(>list);
-ovn_port_destroy(ports, op);
 return false;
 }
 }
@@ -4084,10 +4078,17 @@ build_ports(struct ovsdb_idl_txn *ovnsb_txn,
 
 /* Assign new tunnel ids where needed. */
 LIST_FOR_EACH_SAFE (op, list, ) {
-ovn_port_allocate_key(sbrec_chassis_table, ports, op);
+if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
+sbrec_port_binding_delete(op->sb);
+ovs_list_remove(>list);
+ovn_port_destroy(ports, op);
+}
 }
 LIST_FOR_EACH_SAFE (op, list, _only) {
-ovn_port_allocate_key(sbrec_chassis_table, ports, op);
+if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
+ovs_list_remove(>list);
+ovn_port_destroy(ports, op);
+}
 }
 
 /* For logical ports that are in both databases, update the southbound
@@ -4318,7 +4319,12 @@ ls_port_init(struct ovn_port *op, struct ovsdb_idl_txn 
*ovnsb_txn,
 sbrec_port_binding_set_logical_port(op->sb, op->key);
 }
 /* Assign new tunnel ids where needed. */
-if (!ovn_port_allocate_key(sbrec_chassis_table, ls_ports, op)) {
+if (!ovn_port_allocate_key(sbrec_chassis_table, op)) {
+if (op->sb) {
+sbrec_port_binding_delete(op->sb);
+}
+ovs_list_remove(>list);
+ovn_port_destroy(ls_ports, op);
 return false;
 }
 ovn_port_update_sbrec(ovnsb_txn, sbrec_chassis_by_name,
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/3] Fix more issues with Python and update to 3.12.

2024-04-11 Thread Ilya Maximets
On 4/11/24 00:43, Ilya Maximets wrote:
> Fixing a few more issues with Python 3.12 and Python files in general.
> Switch CI testing to Python 3.12, more explanation in commit messages.
> 
> Plan s to also backport these changes to fix issues on older branches
> and also to have uniform CI on all branches if possible.
> 
> Ilya Maximets (3):
>   ovsdb-doc: Fix syntax warning with Python 3.12 and flake8 issues.
>   ovsdb-dot: Fix flake8 issues.
>   github: Update python to 3.12.
> 
>  .ci/dpdk-prepare.sh  |  2 +-
>  .github/workflows/build-and-test.yml | 11 +++---
>  ovsdb/automake.mk|  2 ++
>  ovsdb/dot2pic|  6 ++--
>  ovsdb/ovsdb-doc  | 50 ++--
>  ovsdb/ovsdb-dot.in   | 41 ---
>  6 files changed, 60 insertions(+), 52 deletions(-)
> 

Thanks, Simon for review!

I applied the set to all branches down to 2.17.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] rhel/systemd: Set ovsdb-server timeout to 5 minutes

2024-04-11 Thread Chris Riches

On 11/04/2024 17:10, Dumitru Ceara wrote:

On 4/11/24 15:43, Chris Riches wrote:

On 11/04/2024 14:24, Ilya Maximets wrote:

On 4/11/24 10:59, Chris Riches wrote:

Hi Chris, Ilya,


  From what we know so far, the DB was full of stale connection-tracking
information such as the following:

[...]

Once the host was recovered by putting in the timeout increase,
ovsdb-server successfully started and GCed the database down from 2.4
*GB* to 29 *KB*. Had this happened before the host restart, we would
have never seen this problem. But since it seems possible to end up
booting with such a large DB, we figured a timeout increase was a
sensible measure to take.

Uff.  Sounds like ovn-controller went off the rails.

Normally, ovsdb-server compacts the database once in 10-20 minutes,
if the database doubles the size since the previous check.  If all
the transactions are that small, it would mean ovn-controller made
about 10K transactions per second in the 10-20 minutes before the
restart.  That's huge.

I wonder if this can be addressed with a better compaction strategy.
Something like forcing compaction if "the database is more than 10 MB
and increased 10x" regardless of the time.

I'm not sure exactly what the test was doing when this was observed, so
I don't know whether that transaction volume is within the realm of
possibility or if we're looking at a failure to perform compaction on
time. It would be nice to have an enhanced safety-net for DB size, as we
were only a few hundred MB away from hitting filesystem space issues as
well.


To rule out any known issues, what OVN version is running on that setup?
This was during an upgrade test. We started with OVN 20.9, and this 
produced the massive DB. We then upgraded to 21.9 and rebooted, which 
failed to come up as described due to the massive DB.


Our networking team doing the RCA think that the system was rapidly 
flapping external ports between two configurations, hence the excessive 
DB transactions. The root cause of flapping is yet to be determined but 
these transactions were being done from OVN itself. They raised the 
theory that the flapping was so intense that ovsdb didn't actually get a 
chance to compact at all - is this a possibility?


I've CCed Priyankar who is in charge of the RCA.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] rhel/systemd: Set ovsdb-server timeout to 5 minutes

2024-04-11 Thread Dumitru Ceara
On 4/11/24 15:43, Chris Riches wrote:
> On 11/04/2024 14:24, Ilya Maximets wrote:
>> On 4/11/24 10:59, Chris Riches wrote:

Hi Chris, Ilya,

>>>  From what we know so far, the DB was full of stale connection-tracking
>>> information such as the following:
>>>
>>> [...]
>>>
>>> Once the host was recovered by putting in the timeout increase,
>>> ovsdb-server successfully started and GCed the database down from 2.4
>>> *GB* to 29 *KB*. Had this happened before the host restart, we would
>>> have never seen this problem. But since it seems possible to end up
>>> booting with such a large DB, we figured a timeout increase was a
>>> sensible measure to take.
>> Uff.  Sounds like ovn-controller went off the rails.
>>
>> Normally, ovsdb-server compacts the database once in 10-20 minutes,
>> if the database doubles the size since the previous check.  If all
>> the transactions are that small, it would mean ovn-controller made
>> about 10K transactions per second in the 10-20 minutes before the
>> restart.  That's huge.
>>
>> I wonder if this can be addressed with a better compaction strategy.
>> Something like forcing compaction if "the database is more than 10 MB
>> and increased 10x" regardless of the time.
> 
> I'm not sure exactly what the test was doing when this was observed, so
> I don't know whether that transaction volume is within the realm of
> possibility or if we're looking at a failure to perform compaction on
> time. It would be nice to have an enhanced safety-net for DB size, as we
> were only a few hundred MB away from hitting filesystem space issues as
> well.
> 

To rule out any known issues, what OVN version is running on that setup?

>> Normally, ovsdb-server compacts the database once in 10-20 minutes, if
>> the database doubles the size since the previous check.
> 
> I presume you mean if it doubled in size since the previous
> *compaction*? If we only compact when it doubles since the last *check*,
> then it would be easy for it to slightly-less-than-double every 10-20
> minutes and never trigger the compaction while still growing exponentially.
> 
> I'm happy to discuss compaction approaches (though my expertise is very
> much in host service management and not OVS itself), but do you think
> there's merit in having this extended timeout as a backstop too?
> 

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] Add dh-python to debian/control.

2024-04-11 Thread Igor Zhukov
I tried to build OVN in a fresh Ubuntu 24.04 Docker container.

I only installed the Build-Depends.

I ran:

$ dpkg-buildpackage -us -uc -ui -b

And I received the following output:

dpkg-buildpackage: info: source package ovn
dpkg-buildpackage: info: source version 24.03.90-1
dpkg-buildpackage: info: source distribution unstable
dpkg-buildpackage: info: source changed by OVN team 
dpkg-source --before-build .
dpkg-buildpackage: info: host architecture amd64
debian/rules clean
dh clean --with autoreconf,python3 --parallel
dh: warning: Compatibility levels before 10 are deprecated (level 9 in use)
dh: error: unable to load addon python3: Can't locate 
Debian/Debhelper/Sequence/python3.pm in @INC (you may need to install the 
Debian::Debhelper::Sequence::python3 module) (@INC entries checked: /etc/perl 
/usr/local/lib/x86_64-linux-gnu/perl/5.38.2 /usr/local/share/perl/5.38.2 
/usr/lib/x86_64-linux-gnu/perl5/5.38 /usr/share/perl5 
/usr/lib/x86_64-linux-gnu/perl-base /usr/lib/x86_64-linux-gnu/perl/5.38 
/usr/share/perl/5.38 /usr/local/lib/site_perl) at (eval 12) line 1.
BEGIN failed--compilation aborted at (eval 12) line 1.

make: *** [debian/rules:25: clean] Error 255
dpkg-buildpackage: error: debian/rules clean subprocess returned exit status 2
debuild: fatal error at line 1184:
dpkg-buildpackage -us -uc -ui -b failed


After researching a solution, I discovered that we need to add dh-python to the 
Build-Depends in debian/control.

Signed-off-by: Igor Zhukov 
---
 debian/control | 1 +
 1 file changed, 1 insertion(+)

diff --git a/debian/control b/debian/control
index 6ee2192..d99b483 100644
--- a/debian/control
+++ b/debian/control
@@ -9,6 +9,7 @@ Build-Depends: graphviz,
bzip2,
debhelper (>= 8),
dh-autoreconf,
+   dh-python,
libssl-dev,
libtool,
openssl,
--
2.43.0
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v8 1/6] Add global option for JSON output to ovs-appctl.

2024-04-11 Thread 0-day Robot
Bleep bloop.  Greetings Jakob Meng, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#583 FILE: utilities/ovs-appctl.c:145:
  -f, --format=FMT   Output format. One of: 'json', or 'text'\n\

Lines checked: 675, Warnings: 2, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v8 5/6] vswitchd: Add JSON output for 'list-commands' command.

2024-04-11 Thread jmeng
From: Jakob Meng 

The 'list-commands' command now supports machine-readable JSON output
in addition to the plain-text output for humans.

Reported-at: https://bugzilla.redhat.com/1824861
Signed-off-by: Jakob Meng 
---
 NEWS  |  1 +
 lib/unixctl.c | 46 ++-
 tests/ovs-vswitchd.at | 11 +++
 3 files changed, 44 insertions(+), 14 deletions(-)

diff --git a/NEWS b/NEWS
index 957351acb..af83623b3 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,7 @@ Post-v3.3.0
  * Added new option [--pretty] to print JSON output in a readable fashion.
E.g. members of objects and elements of arrays are printed one per line,
with indentation.
+ * 'list-commands' now supports output in JSON format.
- Python:
  * Added support for choosing the output format, e.g. 'json' or 'text'.
  * Added new option [--pretty] to print JSON output in a readable fashion.
diff --git a/lib/unixctl.c b/lib/unixctl.c
index 4bb6bbe7f..3c611a21d 100644
--- a/lib/unixctl.c
+++ b/lib/unixctl.c
@@ -68,24 +68,42 @@ static void
 unixctl_list_commands(struct unixctl_conn *conn, int argc OVS_UNUSED,
   const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
 {
-struct ds ds = DS_EMPTY_INITIALIZER;
-const struct shash_node **nodes = shash_sort();
-size_t i;
+if (unixctl_command_get_output_format(conn) == OVS_OUTPUT_FMT_JSON) {
+struct json *json_commands = json_array_create_empty();
+const struct shash_node *node;
 
-ds_put_cstr(, "The available commands are:\n");
+SHASH_FOR_EACH (node, ) {
+const struct unixctl_command *command = node->data;
 
-for (i = 0; i < shash_count(); i++) {
-const struct shash_node *node = nodes[i];
-const struct unixctl_command *command = node->data;
-
-if (command->usage) {
-ds_put_format(, "  %-23s %s\n", node->name, command->usage);
+if (command->usage) {
+struct json *json_command = json_object_create();
+json_object_put_string(json_command, "name", node->name);
+json_object_put_string(json_command, "usage", command->usage);
+json_array_add(json_commands, json_command);
+}
 }
-}
-free(nodes);
+unixctl_command_reply_json(conn, json_commands);
+} else {
+struct ds ds = DS_EMPTY_INITIALIZER;
+const struct shash_node **nodes = shash_sort();
+size_t i;
 
-unixctl_command_reply(conn, ds_cstr());
-ds_destroy();
+ds_put_cstr(, "The available commands are:\n");
+
+for (i = 0; i < shash_count(); ++i) {
+const struct shash_node *node = nodes[i];
+const struct unixctl_command *command = node->data;
+
+if (command->usage) {
+ds_put_format(, "  %-23s %s\n", node->name,
+  command->usage);
+}
+}
+free(nodes);
+
+unixctl_command_reply(conn, ds_cstr());
+ds_destroy();
+}
 }
 
 static void
diff --git a/tests/ovs-vswitchd.at b/tests/ovs-vswitchd.at
index 5683896ef..dcda71d04 100644
--- a/tests/ovs-vswitchd.at
+++ b/tests/ovs-vswitchd.at
@@ -281,3 +281,14 @@ AT_CHECK_UNQUOTED([ovs-appctl --format json --pretty 
version], [0], [dnl
   "reply-format": "plain"}])
 
 AT_CLEANUP
+
+AT_SETUP([ovs-vswitchd list-commands])
+OVS_VSWITCHD_START
+
+AT_CHECK([ovs-appctl list-commands], [0], [ignore])
+AT_CHECK([ovs-appctl --format json list-commands], [0], [stdout])
+AT_CHECK([wc -l stdout], [0], [0 stdout
+])
+AT_CHECK([grep -E '^\[[\{"name":.*\}\]]$' stdout], [0], [ignore])
+
+AT_CLEANUP
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v8 6/6] ofproto: Add JSON output for 'dpif/show' command.

2024-04-11 Thread jmeng
From: Jakob Meng 

The 'dpif/show' command now supports machine-readable JSON output in
addition to the plain-text output for humans. An example would be:

  ovs-appctl --format json dpif/show

Reported-at: https://bugzilla.redhat.com/1824861
Signed-off-by: Jakob Meng 
---
 NEWS   |   1 +
 ofproto/ofproto-dpif.c | 124 +
 tests/pmd.at   |  28 ++
 3 files changed, 142 insertions(+), 11 deletions(-)

diff --git a/NEWS b/NEWS
index af83623b3..97b30233c 100644
--- a/NEWS
+++ b/NEWS
@@ -14,6 +14,7 @@ Post-v3.3.0
E.g. members of objects and elements of arrays are printed one per line,
with indentation.
  * 'list-commands' now supports output in JSON format.
+ * 'dpif/show' now supports output in JSON format.
- Python:
  * Added support for choosing the output format, e.g. 'json' or 'text'.
  * Added new option [--pretty] to print JSON output in a readable fashion.
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 32d037be6..67e1df550 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -6519,8 +6519,104 @@ done:
 return changed;
 }
 
+static struct json *
+dpif_show_backer_json(const struct dpif_backer *backer)
+{
+struct json *json_backer = json_object_create();
+
+/* Add dpif name to JSON. */
+json_object_put_string(json_backer, "name", dpif_name(backer->dpif));
+
+/* Add dpif stats to JSON. */
+struct dpif_dp_stats dp_stats;
+struct json *json_dp_stats = json_object_create();
+
+dpif_get_dp_stats(backer->dpif, _stats);
+json_object_put_format(json_dp_stats, "n_hit", "%"PRIu64, dp_stats.n_hit);
+json_object_put_format(json_dp_stats, "n_missed", "%"PRIu64,
+   dp_stats.n_missed);
+json_object_put(json_backer, "stats", json_dp_stats);
+
+/* Add ofprotos to JSON. */
+struct json *json_ofprotos = json_array_create_empty();
+struct shash ofproto_shash;
+shash_init(_shash);
+const struct shash_node **ofprotos = get_ofprotos(_shash);
+for (size_t i = 0; i < shash_count(_shash); i++) {
+struct ofproto_dpif *ofproto = ofprotos[i]->data;
+
+if (ofproto->backer != backer) {
+continue;
+}
+
+struct json *json_ofproto = json_object_create();
+
+/* Add ofproto name to JSON array. */
+json_object_put_string(json_ofproto, "name", ofproto->up.name);
+
+/* Add ofproto ports to JSON array. */
+struct json *json_ofproto_ports = json_array_create_empty();
+const struct shash_node **ports;
+ports = shash_sort(>up.port_by_name);
+for (size_t j = 0; j < shash_count(>up.port_by_name); j++) {
+const struct shash_node *port = ports[j];
+struct ofport *ofport = port->data;
+
+struct json * json_ofproto_port = json_object_create();
+/* Add ofproto port netdev name to JSON sub array. */
+json_object_put_string(json_ofproto_port, "netdev_name",
+   netdev_get_name(ofport->netdev));
+/* Add ofproto port ofp port to JSON sub array. */
+json_object_put_format(json_ofproto_port, "ofp_port", "%u",
+   ofport->ofp_port);
+
+/* Add ofproto port odp port to JSON sub array. */
+odp_port_t odp_port = ofp_port_to_odp_port(ofproto,
+   ofport->ofp_port);
+if (odp_port != ODPP_NONE) {
+json_object_put_format(json_ofproto_port, "odp_port",
+   "%"PRIu32, odp_port);
+} else {
+json_object_put_string(json_ofproto_port, "odp_port", "none");
+}
+
+/* Add ofproto port netdev type to JSON sub array. */
+json_object_put_string(json_ofproto_port, "netdev_type",
+   netdev_get_type(ofport->netdev));
+
+/* Add ofproto port config to JSON sub array. */
+struct json *json_port_config = json_object_create();
+struct smap config;
+smap_init();
+if (!netdev_get_config(ofport->netdev, )) {
+struct smap_node *node;
+
+SMAP_FOR_EACH (node, ) {
+json_object_put_string(json_port_config, node->key,
+   node->value);
+}
+}
+smap_destroy();
+json_object_put(json_ofproto_port, "netdev_config",
+json_port_config);
+
+json_array_add(json_ofproto_ports, json_ofproto_port);
+} /* End of ofproto port(s). */
+
+free(ports);
+json_object_put(json_ofproto, "ports", json_ofproto_ports);
+
+json_array_add(json_ofprotos, json_ofproto);
+} /* End of ofproto(s). */
+shash_destroy(_shash);
+free(ofprotos);

[ovs-dev] [PATCH v8 4/6] python: Add option '--pretty' for pretty-printing JSON output.

2024-04-11 Thread jmeng
From: Jakob Meng 

With the '--pretty' option, appctl.py will now print JSON output in a
more readable fashion, i.e. with additional line breaks, spaces and
sorted dictionary keys. The pretty-printed output from appctl.py is not
strictly the same as with ovs-appctl because of both use different
pretty-printing implementations.

Signed-off-by: Jakob Meng 
---
 NEWS |  3 +++
 python/ovs/unixctl/client.py |  4 ++--
 tests/appctl.py  | 16 ++--
 tests/unixctl-py.at  |  5 +
 4 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 52cadbe0d..957351acb 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,9 @@ Post-v3.3.0
with indentation.
- Python:
  * Added support for choosing the output format, e.g. 'json' or 'text'.
+ * Added new option [--pretty] to print JSON output in a readable fashion.
+   E.g. members of objects and elements of arrays are printed one per line,
+   with indentation.
 
 
 v3.3.0 - 16 Feb 2024
diff --git a/python/ovs/unixctl/client.py b/python/ovs/unixctl/client.py
index 000d261c0..460c79e88 100644
--- a/python/ovs/unixctl/client.py
+++ b/python/ovs/unixctl/client.py
@@ -27,7 +27,7 @@ class UnixctlClient(object):
 assert isinstance(conn, ovs.jsonrpc.Connection)
 self._conn = conn
 
-def transact(self, command, argv, fmt):
+def transact(self, command, argv, fmt, fmt_flags):
 assert isinstance(command, str)
 assert isinstance(argv, list)
 for arg in argv:
@@ -46,7 +46,7 @@ class UnixctlClient(object):
 if fmt == ovs.util.OutputFormat.TEXT:
 return str(body)
 else:
-return ovs.json.to_string(body)
+return ovs.json.to_string(body, **fmt_flags)
 
 if reply.error is not None:
 return 0, to_string(reply.error), None
diff --git a/tests/appctl.py b/tests/appctl.py
index 7ccf34cc3..bd9953e75 100644
--- a/tests/appctl.py
+++ b/tests/appctl.py
@@ -53,18 +53,29 @@ def main():
 help="Output format.", default="text",
 choices=[fmt.name.lower()
  for fmt in ovs.util.OutputFormat])
+parser.add_argument("--pretty", action="store_true",
+help="By default, JSON in output is printed as"
+ " compactly as possible. This option causes JSON"
+ " in output to be printed in a more readable"
+ " fashion. Members of objects and elements of"
+ " arrays are printed one per line, with"
+ " indentation.")
 args = parser.parse_args()
 
+if args.format != ovs.util.OutputFormat.JSON and args.pretty:
+ovs.util.ovs_fatal(0, "--pretty is supported with --format json only")
+
 signal_alarm(int(args.timeout) if args.timeout else None)
 
 ovs.vlog.Vlog.init()
 target = args.target
 format = ovs.util.OutputFormat[args.format.upper()]
+format_flags = dict(pretty=True, sort_keys=True) if args.pretty else {}
 client = connect_to_target(target)
 
 if format != ovs.util.OutputFormat.TEXT:
 err_no, error, _ = client.transact(
-"set-options", ["--format", args.format], format)
+"set-options", ["--format", args.format], format, format_flags)
 
 if err_no:
 ovs.util.ovs_fatal(err_no, "%s: transaction error" % target)
@@ -73,7 +84,8 @@ def main():
 ovs.util.ovs_error(0, "%s: server returned an error" % target)
 sys.exit(2)
 
-err_no, error, result = client.transact(args.command, args.argv, format)
+err_no, error, result = client.transact(
+args.command, args.argv, format, format_flags)
 client.close()
 
 if err_no:
diff --git a/tests/unixctl-py.at b/tests/unixctl-py.at
index 1440b30b7..bcea19abc 100644
--- a/tests/unixctl-py.at
+++ b/tests/unixctl-py.at
@@ -115,6 +115,11 @@ AT_CHECK([APPCTL -t test-unixctl.py version], [0], 
[expout])
 AT_CHECK([PYAPPCTL_PY -t test-unixctl.py version], [0], [expout])
 AT_CHECK_UNQUOTED([PYAPPCTL_PY -t test-unixctl.py --format json version], [0], 
[dnl
 {"reply":"$(cat expout)\n","reply-format":"plain"}])
+AT_CHECK_UNQUOTED([PYAPPCTL_PY -t test-unixctl.py --format json --pretty 
version], [0], [dnl
+{
+  "reply":"$(cat expout)\n",
+  "reply-format":"plain"
+}])
 
 AT_CHECK([APPCTL -t test-unixctl.py echo robot ninja], [0], [stdout])
 AT_CHECK([cat stdout | sed -e "s/u'/'/g"], [0], [dnl
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v8 1/6] Add global option for JSON output to ovs-appctl.

2024-04-11 Thread jmeng
From: Jakob Meng 

For monitoring systems such as Prometheus it would be beneficial if
OVS would expose statistics in a machine-readable format.

This patch introduces support for different output formats to ovs-xxx
tools, in particular ovs-appctl. The latter gains a global option
'-f,--format' which changes it to print a JSON document instead of
plain-text for humans. For example, a later patch implements support
for 'ovs-appctl --format json dpif/show'. By default, the output format
is plain-text as before.

A new 'set-options' command has been added to lib/unixctl.c which
allows to change the output format of the commands executed afterwards
on the same socket connection. It is supposed to be run by ovs-xxx
tools transparently for the user when a specific output format has been
requested.
For example, when a user calls 'ovs-appctl --format json dpif/show',
then ovs-appctl will call 'set-options' to set the output format as
requested by the user and afterwards it will call the actual command
'dpif/show'.
This ovs-appctl behaviour has been implemented in a backward compatible
way. One can use an updated client (ovs-appctl) with an old server
(ovs-vswitchd) and vice versa. Of course, JSON output only works when
both sides have been updated.

Two access functions unixctl_command_{get,set}_output_format() and
two unixctl_command_reply_{,error_}json functions have been added to
lib/unixctl.h:
unixctl_command_get_output_format() is supposed to be used in commands
like 'dpif/show' to query the requested output format. When JSON output
has been selected, both unixctl_command_reply_{,error_}json() functions
can be used to return JSON objects to the client (ovs-appctl) instead
of plain-text with the unixctl_command_reply{,_error}() functions.

When JSON has been requested but a command has not implemented JSON
output the plain-text output will be wrapped in a provisional JSON
document with the following structure:

  {"reply-format":"plain","reply":"$PLAIN_TEXT_HERE"}

Thus commands which have been executed successfully will not fail when
they try to render the output at a later stage.

In popular tools like kubectl the option for output control is usually
called '-o|--output' instead of '-f,--format'. But ovs-appctl already
has an short option '-o' which prints the available ovs-appctl options
('--option'). The now choosen name also better alignes with
ovsdb-client where '-f,--format' controls output formatting.

Reported-at: https://bugzilla.redhat.com/1824861
Signed-off-by: Jakob Meng 
---
 NEWS   |   3 +
 lib/command-line.c |  36 ++
 lib/command-line.h |  10 +++
 lib/unixctl.c  | 158 -
 lib/unixctl.h  |  11 +++
 tests/ovs-vswitchd.at  |  11 +++
 utilities/ovs-appctl.c |  98 +
 7 files changed, 277 insertions(+), 50 deletions(-)

diff --git a/NEWS b/NEWS
index b92cec532..03ef6486b 100644
--- a/NEWS
+++ b/NEWS
@@ -7,6 +7,9 @@ Post-v3.3.0
- The primary development branch has been renamed from 'master' to 'main'.
  The OVS tree remains hosted on GitHub.
  https://github.com/openvswitch/ovs.git
+   - ovs-appctl:
+ * Added new option [-f|--format] to choose the output format, e.g. 'json'
+   or 'text' (by default).
 
 
 v3.3.0 - 16 Feb 2024
diff --git a/lib/command-line.c b/lib/command-line.c
index 967f4f5d5..775e0430a 100644
--- a/lib/command-line.c
+++ b/lib/command-line.c
@@ -24,6 +24,7 @@
 #include "ovs-thread.h"
 #include "util.h"
 #include "openvswitch/vlog.h"
+#include "openvswitch/json.h"
 
 VLOG_DEFINE_THIS_MODULE(command_line);
 
@@ -53,6 +54,41 @@ ovs_cmdl_long_options_to_short_options(const struct option 
options[])
 return xstrdup(short_options);
 }
 
+const char *
+ovs_output_fmt_to_string(enum ovs_output_fmt fmt)
+{
+switch (fmt) {
+case OVS_OUTPUT_FMT_TEXT:
+return "text";
+
+case OVS_OUTPUT_FMT_JSON:
+return "json";
+
+default:
+return NULL;
+}
+}
+
+struct json *
+ovs_output_fmt_to_json(enum ovs_output_fmt fmt)
+{
+const char *string = ovs_output_fmt_to_string(fmt);
+return string ? json_string_create(string) : NULL;
+}
+
+bool
+ovs_output_fmt_from_string(const char *string, enum ovs_output_fmt *fmt)
+{
+if (!strcmp(string, "text")) {
+*fmt = OVS_OUTPUT_FMT_TEXT;
+} else if (!strcmp(string, "json")) {
+*fmt = OVS_OUTPUT_FMT_JSON;
+} else {
+return false;
+}
+return true;
+}
+
 static char * OVS_WARN_UNUSED_RESULT
 build_short_options(const struct option *long_options)
 {
diff --git a/lib/command-line.h b/lib/command-line.h
index fc2a2690f..494274cec 100644
--- a/lib/command-line.h
+++ b/lib/command-line.h
@@ -20,6 +20,7 @@
 /* Utilities for command-line parsing. */
 
 #include "compiler.h"
+#include 
 
 struct option;
 
@@ -46,6 +47,15 @@ struct ovs_cmdl_command {
 
 char *ovs_cmdl_long_options_to_short_options(const struct option *options);
 
+enum ovs_output_fmt {
+   

[ovs-dev] [PATCH v8 3/6] appctl: Add option '--pretty' for pretty-printing JSON output.

2024-04-11 Thread jmeng
From: Jakob Meng 

With the '--pretty' option, ovs-appctl will now print JSON output in a
more readable fashion, i.e. with additional line breaks, spaces and
sorted dictionary keys.

Signed-off-by: Jakob Meng 
---
 NEWS   |  3 +++
 lib/unixctl.c  |  6 +++---
 lib/unixctl.h  |  2 +-
 tests/ovs-vswitchd.at  |  5 +
 utilities/ovs-appctl.c | 32 
 5 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/NEWS b/NEWS
index f18238159..52cadbe0d 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,9 @@ Post-v3.3.0
- ovs-appctl:
  * Added new option [-f|--format] to choose the output format, e.g. 'json'
or 'text' (by default).
+ * Added new option [--pretty] to print JSON output in a readable fashion.
+   E.g. members of objects and elements of arrays are printed one per line,
+   with indentation.
- Python:
  * Added support for choosing the output format, e.g. 'json' or 'text'.
 
diff --git a/lib/unixctl.c b/lib/unixctl.c
index 18638d86a..4bb6bbe7f 100644
--- a/lib/unixctl.c
+++ b/lib/unixctl.c
@@ -572,8 +572,8 @@ unixctl_client_create(const char *path, struct jsonrpc 
**client)
  * '*err' if not NULL. */
 int
 unixctl_client_transact(struct jsonrpc *client, const char *command, int argc,
-char *argv[], enum ovs_output_fmt fmt, char **result,
-char **err)
+char *argv[], enum ovs_output_fmt fmt, int fmt_flags,
+char **result, char **err)
 {
 struct jsonrpc_msg *request, *reply;
 struct json **json_args, *params, *output_src;
@@ -604,7 +604,7 @@ unixctl_client_transact(struct jsonrpc *client, const char 
*command, int argc,
 if (fmt == OVS_OUTPUT_FMT_TEXT && output_src->type == JSON_STRING) {
 *output_dst = xstrdup(json_string(output_src));
 } else if (fmt == OVS_OUTPUT_FMT_JSON) {
-*output_dst = json_to_string(output_src, 0);
+*output_dst = json_to_string(output_src, fmt_flags);
 } else {
 VLOG_WARN("%s: unexpected %s type in JSON rpc reply: %s",
   jsonrpc_get_name(client),
diff --git a/lib/unixctl.h b/lib/unixctl.h
index b30bcf092..c49f0c658 100644
--- a/lib/unixctl.h
+++ b/lib/unixctl.h
@@ -39,7 +39,7 @@ int unixctl_client_create(const char *path, struct jsonrpc 
**client);
 int unixctl_client_transact(struct jsonrpc *client,
 const char *command,
 int argc, char *argv[],
-enum ovs_output_fmt fmt,
+enum ovs_output_fmt fmt, int fmt_flags,
 char **result, char **error);
 
 /* Command registration. */
diff --git a/tests/ovs-vswitchd.at b/tests/ovs-vswitchd.at
index 32ca2c6e7..5683896ef 100644
--- a/tests/ovs-vswitchd.at
+++ b/tests/ovs-vswitchd.at
@@ -275,4 +275,9 @@ ovs_version=$(ovs-appctl version)
 AT_CHECK_UNQUOTED([ovs-appctl --format json version], [0], [dnl
 {"reply-format":"plain","reply":"$ovs_version\n"}])
 
+AT_CHECK_UNQUOTED([ovs-appctl --format json --pretty version], [0], [dnl
+{
+  "reply": "$ovs_version\n",
+  "reply-format": "plain"}])
+
 AT_CLEANUP
diff --git a/utilities/ovs-appctl.c b/utilities/ovs-appctl.c
index 4d4503597..9872d5bde 100644
--- a/utilities/ovs-appctl.c
+++ b/utilities/ovs-appctl.c
@@ -26,6 +26,7 @@
 #include "daemon.h"
 #include "dirs.h"
 #include "openvswitch/dynamic-string.h"
+#include "openvswitch/json.h"
 #include "jsonrpc.h"
 #include "process.h"
 #include "timeval.h"
@@ -39,6 +40,7 @@ static void usage(void);
 /* Parsed command line args. */
 struct cmdl_args {
 enum ovs_output_fmt format;
+int format_flags;
 char *target;
 };
 
@@ -74,8 +76,8 @@ main(int argc, char *argv[])
 if (!svec_is_empty(_argv)) {
 error = unixctl_client_transact(client, "set-options",
 opt_argv.n, opt_argv.names,
-args->format, _result,
-_error);
+args->format, 0,
+_result, _error);
 
 if (error) {
 ovs_fatal(error, "%s: transaction error", args->target);
@@ -98,7 +100,9 @@ main(int argc, char *argv[])
 cmd_argc = argc - optind;
 cmd_argv = cmd_argc ? argv + optind : NULL;
 error = unixctl_client_transact(client, cmd, cmd_argc, cmd_argv,
-args->format, _result, _error);
+args->format, args->format_flags,
+_result, _error);
+
 if (error) {
 ovs_fatal(error, "%s: transaction error", args->target);
 }
@@ -144,6 +148,11 @@ Other options:\n\
   --timeout=SECS wait at most SECS seconds for a response\n\
   -f, --format=FMT   Output format. One of: 'json', or 'text'\n\
  ('text', 

[ovs-dev] [PATCH v8 2/6] python: Add global option for JSON output to Python tools.

2024-04-11 Thread jmeng
From: Jakob Meng 

This patch introduces support for different output formats to the
Python code, as did the previous commit for ovs-xxx tools like
'ovs-appctl --format json dpif/show'.
In particular, tests/appctl.py gains a global option '-f,--format'
which allows users to request JSON instead of plain-text for humans.

Reported-at: https://bugzilla.redhat.com/1824861
Signed-off-by: Jakob Meng 
---
 NEWS |  2 ++
 python/ovs/unixctl/client.py | 14 ---
 python/ovs/unixctl/server.py | 45 +---
 python/ovs/util.py   |  8 +++
 tests/appctl.py  | 19 ++-
 tests/unixctl-py.at  |  3 +++
 6 files changed, 84 insertions(+), 7 deletions(-)

diff --git a/NEWS b/NEWS
index 03ef6486b..f18238159 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,8 @@ Post-v3.3.0
- ovs-appctl:
  * Added new option [-f|--format] to choose the output format, e.g. 'json'
or 'text' (by default).
+   - Python:
+ * Added support for choosing the output format, e.g. 'json' or 'text'.
 
 
 v3.3.0 - 16 Feb 2024
diff --git a/python/ovs/unixctl/client.py b/python/ovs/unixctl/client.py
index 8283f99bb..000d261c0 100644
--- a/python/ovs/unixctl/client.py
+++ b/python/ovs/unixctl/client.py
@@ -14,6 +14,7 @@
 
 import os
 
+import ovs.json
 import ovs.jsonrpc
 import ovs.stream
 import ovs.util
@@ -26,11 +27,12 @@ class UnixctlClient(object):
 assert isinstance(conn, ovs.jsonrpc.Connection)
 self._conn = conn
 
-def transact(self, command, argv):
+def transact(self, command, argv, fmt):
 assert isinstance(command, str)
 assert isinstance(argv, list)
 for arg in argv:
 assert isinstance(arg, str)
+assert isinstance(fmt, ovs.util.OutputFormat)
 
 request = ovs.jsonrpc.Message.create_request(command, argv)
 error, reply = self._conn.transact_block(request)
@@ -40,11 +42,17 @@ class UnixctlClient(object):
   % (self._conn.name, os.strerror(error)))
 return error, None, None
 
+def to_string(body):
+if fmt == ovs.util.OutputFormat.TEXT:
+return str(body)
+else:
+return ovs.json.to_string(body)
+
 if reply.error is not None:
-return 0, str(reply.error), None
+return 0, to_string(reply.error), None
 else:
 assert reply.result is not None
-return 0, None, str(reply.result)
+return 0, None, to_string(reply.result)
 
 def close(self):
 self._conn.close()
diff --git a/python/ovs/unixctl/server.py b/python/ovs/unixctl/server.py
index b9cb52fad..f95317c4a 100644
--- a/python/ovs/unixctl/server.py
+++ b/python/ovs/unixctl/server.py
@@ -12,6 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+import argparse
 import copy
 import errno
 import os
@@ -35,6 +36,7 @@ class UnixctlConnection(object):
 assert isinstance(rpc, ovs.jsonrpc.Connection)
 self._rpc = rpc
 self._request_id = None
+self._fmt = ovs.util.OutputFormat.TEXT
 
 def run(self):
 self._rpc.run()
@@ -65,9 +67,15 @@ class UnixctlConnection(object):
 def reply(self, body):
 self._reply_impl(True, body)
 
+def reply_json(self, body):
+self._reply_impl_json(True, body)
+
 def reply_error(self, body):
 self._reply_impl(False, body)
 
+def reply_error_json(self, body):
+self._reply_impl_json(False, body)
+
 # Called only by unixctl classes.
 def _close(self):
 self._rpc.close()
@@ -79,17 +87,27 @@ class UnixctlConnection(object):
 self._rpc.recv_wait(poller)
 
 def _reply_impl(self, success, body):
-assert isinstance(success, bool)
 assert body is None or isinstance(body, str)
 
-assert self._request_id is not None
-
 if body is None:
 body = ""
 
 if body and not body.endswith("\n"):
 body += "\n"
 
+if self._fmt == ovs.util.OutputFormat.JSON:
+body = {
+"reply-format": "plain",
+"reply": body
+}
+
+return self._reply_impl_json(success, body)
+
+def _reply_impl_json(self, success, body):
+assert isinstance(success, bool)
+
+assert self._request_id is not None
+
 if success:
 reply = Message.create_reply(body, self._request_id)
 else:
@@ -136,6 +154,24 @@ def _unixctl_version(conn, unused_argv, version):
 conn.reply(version)
 
 
+def _unixctl_set_options(conn, argv, unused_aux):
+assert isinstance(conn, UnixctlConnection)
+
+parser = argparse.ArgumentParser()
+parser.add_argument("--format", default="text",
+choices=[fmt.name.lower()
+ for fmt in ovs.util.OutputFormat])
+
+try:
+args = 

[ovs-dev] [PATCH v8 0/6] Add global option to output JSON from ovs-appctl cmds.

2024-04-11 Thread jmeng
From: Jakob Meng 

This patch series is based on v7 [0] but applies three fundamental changes:

* When JSON output is requested but a command has not implemented JSON printing 
yet, the plain-text output will be wrapped in a JSON object anyway. The main 
reason is that commands, which have already been executed, should not fail 
later at the printing/reply stage due to an unsupported output format.
* The output format is stored in struct unixctl_conn and can be queried within 
commands with unixctl_command_get_output_format().
* Commands no longer need to register what output format they support.

Thus, a huge amount of code changes from the previous patch series could be 
eliminated.

Other changes from v7 [0]:
* Additional tests for (wrapped) JSON.
* '--pretty' option now also implemented in appctl.py (previously for 
ovs-appctl only).
* 'list-commands' gained JSON output (because smaller/easier change than 
'dpif/show' command).
* 'dpif/show' test uses pretty-printing to avoid test failures due to unsorted 
keys.

What has not been changed is the code for JSON output in 'dpif/show'. Mostly, 
because rearranging it does not make it more readable?!

Thanks again, in particular to Ilya Maximets and Eelco Chaudron, for your good 
suggestions and helpful feedback! ☺️

[0] 
https://patchwork.ozlabs.org/project/openvswitch/list/?series=391066=%2A=both

Jakob Meng (6):
  Add global option for JSON output to ovs-appctl.
  python: Add global option for JSON output to Python tools.
  appctl: Add option '--pretty' for pretty-printing JSON output.
  python: Add option '--pretty' for pretty-printing JSON output.
  vswitchd: Add JSON output for 'list-commands' command.
  ofproto: Add JSON output for 'dpif/show' command.

 NEWS |  13 +++
 lib/command-line.c   |  36 +++
 lib/command-line.h   |  10 ++
 lib/unixctl.c| 204 ++-
 lib/unixctl.h|  11 ++
 ofproto/ofproto-dpif.c   | 124 +++--
 python/ovs/unixctl/client.py |  14 ++-
 python/ovs/unixctl/server.py |  45 +++-
 python/ovs/util.py   |   8 ++
 tests/appctl.py  |  31 +-
 tests/ovs-vswitchd.at|  27 +
 tests/pmd.at |  28 +
 tests/unixctl-py.at  |   8 ++
 utilities/ovs-appctl.c   | 122 ++---
 14 files changed, 599 insertions(+), 82 deletions(-)

-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] rhel/systemd: Set ovsdb-server timeout to 5 minutes

2024-04-11 Thread Chris Riches

On 11/04/2024 14:24, Ilya Maximets wrote:

On 4/11/24 10:59, Chris Riches wrote:

 From what we know so far, the DB was full of stale connection-tracking
information such as the following:

[...]

Once the host was recovered by putting in the timeout increase,
ovsdb-server successfully started and GCed the database down from 2.4
*GB* to 29 *KB*. Had this happened before the host restart, we would
have never seen this problem. But since it seems possible to end up
booting with such a large DB, we figured a timeout increase was a
sensible measure to take.

Uff.  Sounds like ovn-controller went off the rails.

Normally, ovsdb-server compacts the database once in 10-20 minutes,
if the database doubles the size since the previous check.  If all
the transactions are that small, it would mean ovn-controller made
about 10K transactions per second in the 10-20 minutes before the
restart.  That's huge.

I wonder if this can be addressed with a better compaction strategy.
Something like forcing compaction if "the database is more than 10 MB
and increased 10x" regardless of the time.


I'm not sure exactly what the test was doing when this was observed, so 
I don't know whether that transaction volume is within the realm of 
possibility or if we're looking at a failure to perform compaction on 
time. It would be nice to have an enhanced safety-net for DB size, as we 
were only a few hundred MB away from hitting filesystem space issues as 
well.


Normally, ovsdb-server compacts the database once in 10-20 minutes, if 
the database doubles the size since the previous check.


I presume you mean if it doubled in size since the previous 
*compaction*? If we only compact when it doubles since the last *check*, 
then it would be easy for it to slightly-less-than-double every 10-20 
minutes and never trigger the compaction while still growing exponentially.


I'm happy to discuss compaction approaches (though my expertise is very 
much in host service management and not OVS itself), but do you think 
there's merit in having this extended timeout as a backstop too?

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/3] python: ovsdb-idl: Make IndexedRows mirror hmap.

2024-04-11 Thread Terry Wilson
I tried to get this to thread under the "ovsdb-idl: potential issues
with the persistent UUID implementation" thread, but failed. This is
one potential solution to that which mirrors the current C IDL
implementation which seems to work for the most part, but relying on
the fact that hmaps allow duplicate keys wasn't necessarily intended
there. Reading that thread is definitely a prerequisite for
understanding why this patch exists. And it may be better to solve
this some other way, but it seems difficult to do for the case of
"inserting a UUID that already exists" w/o creating a race condition
with multiple IDL clients.

Terry

On Wed, Apr 10, 2024 at 4:39 PM Terry Wilson  wrote:
>
> The Python IDL code very closely mirrors the C IDL code, which uses
> an hmap to store table rows. hmap code allows duplicate keys, while
> IndexedRows, which is derived from DictBase does not.
>
> The persistent UUID code can attempt to temporarily add a Row with
> a duplicate UUID to table.rows, so IndexedRows is modified to
> behave similarly to the C IDL's hmap implementation.
>
> Fixes: 55b9507e6824 ("ovsdb-idl: Add the support to specify the uuid for row 
> insert.")
> Signed-off-by: Terry Wilson 
> ---
>  python/ovs/db/custom_index.py | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/python/ovs/db/custom_index.py b/python/ovs/db/custom_index.py
> index 587caf5e3..3fa03d3c9 100644
> --- a/python/ovs/db/custom_index.py
> +++ b/python/ovs/db/custom_index.py
> @@ -90,14 +90,21 @@ class IndexedRows(DictBase, object):
>  index = self.indexes[name] = MultiColumnIndex(name)
>  return index
>
> +def __getitem__(self, key):
> +return self.data[key][-1]
> +
>  def __setitem__(self, key, item):
> -self.data[key] = item
> +try:
> +self.data[key].append(item)
> +except KeyError:
> +self.data[key] = [item]
>  for index in self.indexes.values():
>  index.add(item)
>
>  def __delitem__(self, key):
> -val = self.data[key]
> -del self.data[key]
> +val = self.data[key].pop()
> +if len(self.data[key]) == 0:
> +del self.data[key]
>  for index in self.indexes.values():
>  index.remove(val)
>
> --
> 2.34.3
>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] rhel/systemd: Set ovsdb-server timeout to 5 minutes

2024-04-11 Thread Ilya Maximets
On 4/11/24 10:59, Chris Riches wrote:
> On 10/04/2024 23:31, Ilya Maximets wrote:
>> On 4/10/24 17:48, Chris Riches wrote:
>>> If the database is particularly large (multi-GB), ovsdb-server can take
>> Hi, Chris.  May I ask how did you end up with multi-GB database?
>> I would understand if it was an OVN Southbound DB, for example,
>> but why the local database that only stores ports/bridges and
>> some other not that large things ends up with so much data?
>>
>> Sounds a little strange.
>>
>> Best regards, Ilya Maximets.
> 
> I'd like to understand that too, and it's a separate RCA we're working 
> on but haven't reached a conclusion yet.
> 
>  From what we know so far, the DB was full of stale connection-tracking 
> information such as the following:
> 
> {
>    "_date": 1710858766431,
>    "Bridge": {
>      "49cb85cd-b085-4af8-98a2-56030dd614b9": {
>    "external_ids": [
>      "map",
>      [
>    [
> "ct-zone-lrp-ext_gw_port_48a89ae3-6528-4851-a277-e21db02518ad",
>      "4"
>    ],
>    [
>      "external",
>      "true"
>    ]
>      ]
>    ]
>      }
>    },
>    "_comment": "ovn-controller: modifying OVS tunnels 
> '5995b338-3080-44b1-9251-58080cc878f7'"
> }
> 
> Once the host was recovered by putting in the timeout increase, 
> ovsdb-server successfully started and GCed the database down from 2.4 
> *GB* to 29 *KB*. Had this happened before the host restart, we would 
> have never seen this problem. But since it seems possible to end up 
> booting with such a large DB, we figured a timeout increase was a 
> sensible measure to take.

Uff.  Sounds like ovn-controller went off the rails.

Normally, ovsdb-server compacts the database once in 10-20 minutes,
if the database doubles the size since the previous check.  If all
the transactions are that small, it would mean ovn-controller made
about 10K transactions per second in the 10-20 minutes before the
restart.  That's huge.

I wonder if this can be addressed with a better compaction strategy.
Something like forcing compaction if "the database is more than 10 MB
and increased 10x" regardless of the time.

Bets regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/2] ci: Remove reference to master branch.

2024-04-11 Thread Simon Horman
On Wed, Apr 10, 2024 at 03:09:19PM +0100, Simon Horman wrote:
> The OvS primary development branch has been renamed main
> so there is no longer any need for CI configuration to
> refer to master.
> 
> Signed-off-by: Simon Horman 
> 
> ---
> Simon Horman (2):
>   appveyor: Remove reference to master branch.
>   github: Remove reference to master branch.
> 
>  .github/workflows/build-and-test.yml | 2 +-
>  appveyor.yml | 1 -
>  2 files changed, 1 insertion(+), 2 deletions(-)

Thanks Alin, Eelco, and Ilya for your review.

I have applied this main.

- github: Remove reference to master branch.
  https://github.com/openvswitch/ovs/commit/acf6537124b6
- appveyor: Remove reference to master branch.
  https://github.com/openvswitch/ovs/commit/b34dac4c68eb

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [ovn] ovn-controller segmentation fault in svc_monitor_send_tcp_health_check__()

2024-04-11 Thread Ales Musil
On Thu, Apr 11, 2024 at 12:20 PM Vladislav Odintsov 
wrote:

> Hi all,
>
> I’m running ovn 22.09 and sometimes see that ovn-controllers crash with
> segmentation fault.  The backtrace is next:
>
> (gdb) bt
> #0  0x7f0742707de1 in __strlen_sse2 () from /lib64/libc.so.6
> #1  0x7f0742788c5d in inet_pton () from /lib64/libc.so.6
> #2  0x564f45a1c784 in ip_parse (s=, 
> ip=ip@entry=0x7f074040f90c)
> at lib/packets.c:698
> #3  0x564f4594cbfb in svc_monitor_send_tcp_health_check__
> (swconn=swconn@entry=0x7f0738000940,
> svc_mon=svc_mon@entry=0x564f4c2960c0, ctl_flags=ctl_flags@entry=2,
> tcp_seq=3858078915, tcp_ack=tcp_ack@entry=0,
> tcp_src=) at controller/pinctrl.c:7513
> #4  0x564f4594d47c in svc_monitor_send_tcp_health_check__
> (tcp_src=, tcp_ack=0, tcp_seq=,
> ctl_flags=2, svc_mon=0x564f4c2960c0, swconn=0x7f0738000940) at
> controller/pinctrl.c:7502
> #5  svc_monitor_send_health_check (swconn=swconn@entry=0x7f0738000940,
> svc_mon=svc_mon@entry=0x564f4c2960c0)
> at controller/pinctrl.c:7621
> #6  0x564f4595869b in svc_monitors_run
> (svc_monitors_next_run_time=0x564f45dd3970
> ,
> swconn=0x7f0738000940) at controller/pinctrl.c:7693
> #7  pinctrl_handler (arg_=0x564f45e11240 ) at
> controller/pinctrl.c:3499
> #8  0x564f45a0ad6f in ovsthread_wrapper (aux_=) at
> lib/ovs-thread.c:422
> #9  0x7f074325bea5 in start_thread () from /lib64/libpthread.so.0
> #10 0x7f07427798dd in clone () from /lib64/libc.so.6
>
> After moving to frame #3, I can get actual data from svc_mon structure
> (port/protocol/dp_key/port_key) - I’ve looked them up in SB DB and found
> port_binding, which belongs to a logical port, which resides on this
> chassis.
> It has configured LB with HC. Here everything seems good.  But if to check
> svc_mon->sb_svc_mon structure, it seems to me that it contains garbage -
> Address 0x564f out of bounds; logical_port == 0, etc (but I can be
> wrong):
>
> $1 = (const struct sbrec_service_monitor *) 0x564f54db2b40
> (gdb) print *svc_mon->sb_svc_mon
> $2 = {header_ = {hmap_node = {hash = 94898726054728, next = 0x0}, uuid =
> {parts = {0, 0, 0, 0}}, src_arcs = {prev = 0x564f54aae0d0, next = 0x0},
> dst_arcs = {prev = 0x564f7f8bd470, next = 0x564f7f8bd540}, table = 0x64,
> old_datum = 0xf,
> parsed = 152, reparse_node = {prev = 0x0, next = 0x0}, new_datum =
> 0x0, prereqs = 0x52eb8916, written = 0x171, txn_node = {hash = 1, next =
> 0x564f54db2db0}, map_op_written = 0x0, map_op_lists = 0x0, set_op_written =
> 0x0,
> set_op_lists = 0x0, change_seqno = {0, 0, 0}, track_node = {prev =
> 0x564f, next = 0x0}, updated = 0x0, tracked_old_datum = 0x0},
> external_ids = {map = {buckets = 0x1, one = 0x564f54db2d90, mask = 0, n =
> 0}},
>   ip = 0x564f , logical_port
> = 0x0, options = {map = {buckets = 0x0, one = 0x0, mask = 1, n =
> 94898780242768}}, port = 0, protocol = 0x0, src_ip = 0x1  of bounds>,
>   src_mac = 0x564f54db2d70 "`Ջ\177OV", status = 0x0}
> …
> (gdb) print svc_mon->state
> $8 = SVC_MON_S_ONLINE
> (gdb) print svc_mon->status
> $9 = SVC_MON_ST_ONLINE
> (gdb) print svc_mon->protocol
> $10 = SVC_MON_PROTO_TCP
> (gdb) print svc_mon->sb_svc_mon
>
> This crash occurred right after ovsdb SB connection loss due to inactivity
> probe failure.  So, ovn-controller was re-connecting to SB, and I guess,
> this
> could somehow re-initialize SB IDL objects.
>
> I’m not sure I can try to reproduce this behaviour on latest main branch,
> so my
> question, if this theoretically can be connected with re-initialization of
> IDL?
> If yes, what should be done to avoid such behavior?
> Should ovn-controller process changes if its IDL is in inconsistent state?
>
> Any help is appreciated.
>
> Regards,
> Vladislav Odintsov
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>

Hi Vladislav,

I didn't look closely what could cause the issue, however I have an idea
how to reproduce it on main. You could try a test that uses `sleep_sb` [0],
combining it with `ovn-remote-probe-interval`. That should eventually lead
to the same state.

Regards,
Ales
-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.com

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/3] ovsdb-idl: Add python keyword to persistent UUID test.

2024-04-11 Thread Simon Horman
On Wed, Apr 10, 2024 at 04:38:24PM -0500, Terry Wilson wrote:
> The Python persistent UUID tests should have the keyword "python"
> added so that TESTSUITEFLAGS="-k python" will not miss testing
> them.
> 
> Fixes: 55b9507e6824 ("ovsdb-idl: Add the support to specify the uuid for row 
> insert.")
> Signed-off-by: Terry Wilson 

Acked-by: Simon Horman 
Tested-by: Simon Horman 

...
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/3] github: Update python to 3.12.

2024-04-11 Thread Simon Horman
On Thu, Apr 11, 2024 at 12:43:30AM +0200, Ilya Maximets wrote:
> We pinned the python version to 3.9 because we had issues building
> older meson 0.47.1 with python 3.10.  Since then meson was updated to
> 0.53.2 in our CI, but we didn't reconsider the python version.
> 
> Newer versions of python uncover more issues with our python files.
> And newer major distributions are using newer versions of python.  But
> we do not really want to use bleeding edge of python releases either to
> avoid unexpected CI failures that need immediate fixes.
> 
> Pin python version to 3.12 as it is the latest released version and we
> should not have any issues with this version.
> 
> While at it, updating meson to a newer version that plays nicely with
> python 3.12.  We do not really care much about the version we use here
> as long as it is able to build the version of DPDK we're using.  Meson
> has no LTS releases, as far as I can tell, so just choosing the latest
> stable 1.4.x series.  It should be fine to use for a next few years.
> Major distributions are using 1.0+ versions.  Upcoming F40 and Ubuntu
> 24.03 have meson 1.3.
> 
> It would also be nice to test the minimal supported version of python,
> but 3.6 is not available in setup-python for 22.04.  The oldest is 3.7.
> And 3.7 is EoL, so pip fails to install some of our dependencies.  The
> oldest version we can use today is 3.8.  But, in the end, this becomes
> a race against older python versions reaching end of their life and
> packages dropping support of these versions.  This may cause unexpected
> CI failures.  So, not doing that for now.
> 
> Signed-off-by: Ilya Maximets 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/3] ovsdb-dot: Fix flake8 issues.

2024-04-11 Thread Simon Horman
On Thu, Apr 11, 2024 at 12:43:29AM +0200, Ilya Maximets wrote:
> Missing and extra spaces, missing empty lines, unused imports and
> variables, long lines.
> 
> Decided to just comment out the unused 'tail' and 'head' as they
> seem useful in documenting the meaning of the words.
> 
> Files added to flake8-check to avoid future issues.
> 
> Signed-off-by: Ilya Maximets 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/3] ovsdb-doc: Fix syntax warning with Python 3.12 and flake8 issues.

2024-04-11 Thread Simon Horman
On Thu, Apr 11, 2024 at 12:43:28AM +0200, Ilya Maximets wrote:
> ovsdb-doc script generates the following syntax warning while running
> with Python 3.12:
> 
>   /ovsdb/ovsdb-doc:240: SyntaxWarning: invalid escape sequence '\{'
>   s += """
> 
> This doesn't cause a build failure because so far it's only a warning,
> but it will become a syntax error in the future.
> 
> Fix that by converting to a raw string and removing unnecessary
> escape sequences.
> 
> Adding ovsdb-doc to flake8-check to avoid re-introducing issues in
> the future.  This means also fixing all the other issues with the
> script like unused imports and variables, long lines, missing empty
> lines, wildcarded imports.  Also cleaning up one place that handles
> compatibility with Python 2 types, since we do not support Python 2
> for a long time now.
> 
> Signed-off-by: Ilya Maximets 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [ovn] ovn-controller segmentation fault in svc_monitor_send_tcp_health_check__()

2024-04-11 Thread Vladislav Odintsov
Hi all,

I’m running ovn 22.09 and sometimes see that ovn-controllers crash with
segmentation fault.  The backtrace is next:

(gdb) bt
#0  0x7f0742707de1 in __strlen_sse2 () from /lib64/libc.so.6
#1  0x7f0742788c5d in inet_pton () from /lib64/libc.so.6
#2  0x564f45a1c784 in ip_parse (s=, 
ip=ip@entry=0x7f074040f90c) at lib/packets.c:698
#3  0x564f4594cbfb in svc_monitor_send_tcp_health_check__ 
(swconn=swconn@entry=0x7f0738000940,
svc_mon=svc_mon@entry=0x564f4c2960c0, ctl_flags=ctl_flags@entry=2, 
tcp_seq=3858078915, tcp_ack=tcp_ack@entry=0,
tcp_src=) at controller/pinctrl.c:7513
#4  0x564f4594d47c in svc_monitor_send_tcp_health_check__ 
(tcp_src=, tcp_ack=0, tcp_seq=,
ctl_flags=2, svc_mon=0x564f4c2960c0, swconn=0x7f0738000940) at 
controller/pinctrl.c:7502
#5  svc_monitor_send_health_check (swconn=swconn@entry=0x7f0738000940, 
svc_mon=svc_mon@entry=0x564f4c2960c0)
at controller/pinctrl.c:7621
#6  0x564f4595869b in svc_monitors_run 
(svc_monitors_next_run_time=0x564f45dd3970 ,
swconn=0x7f0738000940) at controller/pinctrl.c:7693
#7  pinctrl_handler (arg_=0x564f45e11240 ) at controller/pinctrl.c:3499
#8  0x564f45a0ad6f in ovsthread_wrapper (aux_=) at 
lib/ovs-thread.c:422
#9  0x7f074325bea5 in start_thread () from /lib64/libpthread.so.0
#10 0x7f07427798dd in clone () from /lib64/libc.so.6

After moving to frame #3, I can get actual data from svc_mon structure
(port/protocol/dp_key/port_key) - I’ve looked them up in SB DB and found
port_binding, which belongs to a logical port, which resides on this chassis.
It has configured LB with HC. Here everything seems good.  But if to check
svc_mon->sb_svc_mon structure, it seems to me that it contains garbage -
Address 0x564f out of bounds; logical_port == 0, etc (but I can be
wrong):

$1 = (const struct sbrec_service_monitor *) 0x564f54db2b40
(gdb) print *svc_mon->sb_svc_mon
$2 = {header_ = {hmap_node = {hash = 94898726054728, next = 0x0}, uuid = {parts 
= {0, 0, 0, 0}}, src_arcs = {prev = 0x564f54aae0d0, next = 0x0}, dst_arcs = 
{prev = 0x564f7f8bd470, next = 0x564f7f8bd540}, table = 0x64, old_datum = 0xf,
parsed = 152, reparse_node = {prev = 0x0, next = 0x0}, new_datum = 0x0, 
prereqs = 0x52eb8916, written = 0x171, txn_node = {hash = 1, next = 
0x564f54db2db0}, map_op_written = 0x0, map_op_lists = 0x0, set_op_written = 0x0,
set_op_lists = 0x0, change_seqno = {0, 0, 0}, track_node = {prev = 
0x564f, next = 0x0}, updated = 0x0, tracked_old_datum = 0x0}, 
external_ids = {map = {buckets = 0x1, one = 0x564f54db2d90, mask = 0, n = 0}},
  ip = 0x564f , logical_port = 
0x0, options = {map = {buckets = 0x0, one = 0x0, mask = 1, n = 
94898780242768}}, port = 0, protocol = 0x0, src_ip = 0x1 ,
  src_mac = 0x564f54db2d70 "`Ջ\177OV", status = 0x0}
…
(gdb) print svc_mon->state
$8 = SVC_MON_S_ONLINE
(gdb) print svc_mon->status
$9 = SVC_MON_ST_ONLINE
(gdb) print svc_mon->protocol
$10 = SVC_MON_PROTO_TCP
(gdb) print svc_mon->sb_svc_mon

This crash occurred right after ovsdb SB connection loss due to inactivity
probe failure.  So, ovn-controller was re-connecting to SB, and I guess, this
could somehow re-initialize SB IDL objects.

I’m not sure I can try to reproduce this behaviour on latest main branch, so my
question, if this theoretically can be connected with re-initialization of IDL?
If yes, what should be done to avoid such behavior?
Should ovn-controller process changes if its IDL is in inconsistent state?

Any help is appreciated.

Regards,
Vladislav Odintsov

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] rhel/systemd: Set ovsdb-server timeout to 5 minutes

2024-04-11 Thread Chris Riches

On 10/04/2024 23:31, Ilya Maximets wrote:

On 4/10/24 17:48, Chris Riches wrote:

If the database is particularly large (multi-GB), ovsdb-server can take

Hi, Chris.  May I ask how did you end up with multi-GB database?
I would understand if it was an OVN Southbound DB, for example,
but why the local database that only stores ports/bridges and
some other not that large things ends up with so much data?

Sounds a little strange.

Best regards, Ilya Maximets.


I'd like to understand that too, and it's a separate RCA we're working 
on but haven't reached a conclusion yet.


From what we know so far, the DB was full of stale connection-tracking 
information such as the following:


{
  "_date": 1710858766431,
  "Bridge": {
    "49cb85cd-b085-4af8-98a2-56030dd614b9": {
  "external_ids": [
    "map",
    [
  [
"ct-zone-lrp-ext_gw_port_48a89ae3-6528-4851-a277-e21db02518ad",
    "4"
  ],
  [
    "external",
    "true"
  ]
    ]
  ]
    }
  },
  "_comment": "ovn-controller: modifying OVS tunnels 
'5995b338-3080-44b1-9251-58080cc878f7'"

}

Once the host was recovered by putting in the timeout increase, 
ovsdb-server successfully started and GCed the database down from 2.4 
*GB* to 29 *KB*. Had this happened before the host restart, we would 
have never seen this problem. But since it seems possible to end up 
booting with such a large DB, we figured a timeout increase was a 
sensible measure to take.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] Start mcast id allocations from OVN_MIN_IP_MULTICAST.

2024-04-11 Thread Ales Musil
On Thu, Apr 4, 2024 at 11:44 PM Ihar Hrachyshka  wrote:

> Strictly speaking, this is not *essential* to start from MIN and not
> MIN+1 (once the hint reaches max, it will wrap back to MIN anyway), but
> this is inconsistent with how we handle datapath and port keys (we start
> with hint = 0 there).
>
> Signed-off-by: Ihar Hrachyshka 
> ---
>  northd/northd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/northd/northd.c b/northd/northd.c
> index c568f6360..4baff408d 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -641,7 +641,8 @@ init_mcast_info_for_datapath(struct ovn_datapath *od)
>  }
>
>  hmap_init(>mcast_info.group_tnlids);
> -od->mcast_info.group_tnlid_hint = OVN_MIN_IP_MULTICAST;
> +/* allocations start from hint + 1 */
> +od->mcast_info.group_tnlid_hint = OVN_MIN_IP_MULTICAST - 1;
>  ovs_list_init(>mcast_info.groups);
>
>  if (od->nbs) {
> --
> 2.41.0
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Looks good to me, thanks.

Acked-by: Ales Musil 

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.com

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] treewide: Remove remaining XenServer references.

2024-04-11 Thread Ales Musil
On Mon, Apr 8, 2024 at 1:06 PM Ilya Maximets  wrote:

> Remaining bits from the OVS/OVN split.
>
> Fixes: 1af37d11be73 ("Remove XenServer Code")
> Signed-off-by: Ilya Maximets 
> ---
>  Documentation/tutorials/index.rst |  6 --
>  README.rst|  3 ---
>  acinclude.m4  | 26 ---
>  build-aux/initial-tab-whitelist   |  1 -
>  debian/copyright.in   | 34 ---
>  5 files changed, 70 deletions(-)
>
> diff --git a/Documentation/tutorials/index.rst
> b/Documentation/tutorials/index.rst
> index 0b7f52d0b..865a64018 100644
> --- a/Documentation/tutorials/index.rst
> +++ b/Documentation/tutorials/index.rst
> @@ -30,12 +30,6 @@ Tutorials
>  Getting started with Open vSwitch (OVS) and Open Virtual Network (OVN)
> for Open
>  vSwitch.
>
> -.. TODO(stephenfin): We could really do with a few basic tutorials, along
> with
> -   some more specialized ones (DPDK, XenServer, Windows). The latter could
> -   probably be formed from the install guides, but the former will need
> to be
> -   produced from scratch or reproduced from blogs (with permission of the
> -   author)
> -
>  .. toctree::
> :maxdepth: 2
>
> diff --git a/README.rst b/README.rst
> index 428cd8ee8..dce8c5bc8 100644
> --- a/README.rst
> +++ b/README.rst
> @@ -91,9 +91,6 @@ license applies to the file in question.
>
>  File build-aux/cccl is licensed under the GNU General Public License,
> version 2.
>
> -Files under the xenserver directory are licensed on a file-by-file basis.
> -Refer to each file for details.
> -
>  Contact
>  ---
>
> diff --git a/acinclude.m4 b/acinclude.m4
> index 1a80dfaa7..73a2a51be 100644
> --- a/acinclude.m4
> +++ b/acinclude.m4
> @@ -177,32 +177,6 @@ AC_DEFUN([OVS_CONDITIONAL_CC_OPTION],
> AM_CONDITIONAL([$2], [test $ovs_have_cc_option = yes])])
>  dnl --
>
> -dnl Check for too-old XenServer.
> -AC_DEFUN([OVS_CHECK_XENSERVER_VERSION],
> -  [AC_CACHE_CHECK([XenServer release], [ovs_cv_xsversion],
> -[if test -e /etc/redhat-release; then
> -   ovs_cv_xsversion=`sed -n 's/^XenServer DDK release
> \([[^-]]*\)-.*/\1/p' /etc/redhat-release`
> - fi
> - if test -z "$ovs_cv_xsversion"; then
> -   ovs_cv_xsversion=none
> - fi])
> -  case $ovs_cv_xsversion in
> -none)
> -  ;;
> -
> -[[1-9]][[0-9]]* |dnl XenServer 10 or later
> -[[6-9]]* |   dnl XenServer 6 or later
> -5.[[7-9]]* | dnl XenServer 5.7 or later
> -5.6.[[1-9]][[0-9]][[0-9]][[0-9]]* |  dnl XenServer 5.6.1000 or later
> -5.6.[[2-9]][[0-9]][[0-9]]* | dnl XenServer 5.6.200 or later
> -5.6.1[[0-9]][[0-9]]) dnl Xenserver 5.6.100 or later
> -  ;;
> -
> -*)
> -  AC_MSG_ERROR([This appears to be XenServer $ovs_cv_xsversion, but
> only XenServer 5.6.100 or later is supported.  (If you are really using a
> supported version of XenServer, you may override this error message by
> specifying 'ovs_cv_xsversion=5.6.100' on the "configure" command line.)])
> -  ;;
> -  esac])
> -
>  dnl OVS_CHECK_SPARSE_TARGET
>  dnl
>  dnl The "cgcc" script from "sparse" isn't very good at detecting the
> diff --git a/build-aux/initial-tab-whitelist
> b/build-aux/initial-tab-whitelist
> index 8ad43d616..eb5c1a23a 100644
> --- a/build-aux/initial-tab-whitelist
> +++ b/build-aux/initial-tab-whitelist
> @@ -5,7 +5,6 @@
>  \.sln$
>  ^ovs/
>  ^third-party/
> -^xenserver/
>  ^debian/rules.modules$
>  ^debian/rules$
>  ^\.gitmodules$
> diff --git a/debian/copyright.in b/debian/copyright.in
> index 9ad00340f..335fd6af8 100644
> --- a/debian/copyright.in
> +++ b/debian/copyright.in
> @@ -27,40 +27,6 @@ Upstream Copyright Holders:
>
>  License:
>
> -* The following components are licensed under the
> -  GNU Lesser General Public License version 2.1 only
> -  with the exception clause below as a pre-amble.
> -
> -xenserver/etc_xensource_scripts_vif
> -xenserver/opt_xensource_libexec_InterfaceReconfigure.py
> -xenserver/opt_xensource_libexec_InterfaceReconfigureBridge.py
> -xenserver/opt_xensource_libexec_InterfaceReconfigureVswitch.py
> -xenserver/opt_xensource_libexec_interface-reconfigure
> -xenserver/usr_lib_xsconsole_plugins-base_XSFeatureVSwitch.py
> -
> -* These components are only distributed in the source package.
> -  They do not appear in any binary packages.
> -
> -  On Debian systems, the complete text of the
> -  GNU Lesser General Public License version 2.1 can be found in
> -  `/usr/share/common-licenses/LGPL-2.1'
> -
> -  The exception clause pre-amble reads:
> -
> -  As a special exception to the GNU Lesser General Public License, you
> -  may link, statically or dynamically, a "work that uses the Library"
> -  with a publicly distributed version of the Library to produce an
> -  executable file