Re: [ovs-dev] [PATCH net-next] net: openvswitch: Use struct_size()

2023-05-17 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski :

On Sat,  6 May 2023 18:04:16 +0200 you wrote:
> Use struct_size() instead of hand writing it.
> This is less verbose and more informative.
> 
> Signed-off-by: Christophe JAILLET 
> ---
> It will also help scripts when __counted_by macro will be added.
> See [1].
> 
> [...]

Here is the summary with links:
  - [net-next] net: openvswitch: Use struct_size()
https://git.kernel.org/netdev/net-next/c/b50a8b0d57ab

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] python: Add aync DNS support

2023-05-17 Thread Terry Wilson
This adds a Python version of the async DNS support added in:

771680d96 DNS: Add basic support for asynchronous DNS resolving

The above version uses the unbound C library, and this
implimentation uses the SWIG-wrapped Python version of that.

In the event that the Python unbound library is not available,
a warning will be logged and the resolve() method will just
return None. For the case where inet_parse_active() is passed
an IP address, it will not try to resolve it, so existing
behavior should be preserved in the case that the unbound
library is unavailable.

Intentional differences from the C version are as follows:

  OVS_HOSTS_FILE environment variable can bet set to override
  the system 'hosts' file. This is primarily to allow testing to
  be done without requiring network connectivity.

  Since resolution can still be done via hosts file lookup, DNS
  lookups are not disabled when resolv.conf cannot be loaded.

  The Python socket_util module has fallen behind its C equivalent.
  The bare minimum change was done to inet_parse_active() to support
  sync/async dns, as there is no equivalent to
  parse_sockaddr_components(), inet_parse_passive(), etc. A TODO
  was added to bring socket_util.py up to equivalency to the C
  version.

Signed-off-by: Terry Wilson 
---
 .github/workflows/build-and-test.yml |   4 +-
 debian/control.in|   1 +
 python/TODO.rst  |   7 +
 python/automake.mk   |   2 +
 python/ovs/dns_resolve.py| 257 
 python/ovs/socket_util.py|  21 +-
 python/ovs/stream.py |   2 +-
 python/ovs/tests/test_dns_resolve.py | 280 +++
 python/setup.py  |   6 +-
 rhel/openvswitch-fedora.spec.in  |   2 +-
 tests/vlog.at|   2 +
 11 files changed, 575 insertions(+), 9 deletions(-)
 create mode 100644 python/ovs/dns_resolve.py
 create mode 100644 python/ovs/tests/test_dns_resolve.py

diff --git a/.github/workflows/build-and-test.yml 
b/.github/workflows/build-and-test.yml
index f66ab43b0..47d239f10 100644
--- a/.github/workflows/build-and-test.yml
+++ b/.github/workflows/build-and-test.yml
@@ -183,10 +183,10 @@ jobs:
   run:  sudo apt update || true
 - name: install common dependencies
   run:  sudo apt install -y ${{ env.dependencies }}
-- name: install libunbound libunwind
+- name: install libunbound libunwind python3-unbound
   # GitHub Actions doesn't have 32-bit versions of these libraries.
   if:   matrix.m32 == ''
-  run:  sudo apt install -y libunbound-dev libunwind-dev
+  run:  sudo apt install -y libunbound-dev libunwind-dev python3-unbound
 - name: install 32-bit libraries
   if:   matrix.m32 != ''
   run:  sudo apt install -y gcc-multilib
diff --git a/debian/control.in b/debian/control.in
index 19f590d06..64b0a4ce0 100644
--- a/debian/control.in
+++ b/debian/control.in
@@ -287,6 +287,7 @@ Depends:
 Suggests:
  python3-netaddr,
  python3-pyparsing,
+ python3-unbound,
 Description: Python 3 bindings for Open vSwitch
  Open vSwitch is a production quality, multilayer, software-based,
  Ethernet virtual switch. It is designed to enable massive network
diff --git a/python/TODO.rst b/python/TODO.rst
index 3a53489f1..acc5461e2 100644
--- a/python/TODO.rst
+++ b/python/TODO.rst
@@ -32,3 +32,10 @@ Python Bindings To-do List
 
   * Support write-only-changed monitor mode (equivalent of
 OVSDB_IDL_WRITE_CHANGED_ONLY).
+
+* socket_util:
+
+  * Add equivalent fuctions to inet_parse_passive, parse_sockaddr_components,
+et al. to better support using async dns. The reconnect code will
+currently log a warning when inet_parse_active() returns w/o yet having
+resolved an address, but will continue to connect and eventually succeed.
diff --git a/python/automake.mk b/python/automake.mk
index d00911828..82a508787 100644
--- a/python/automake.mk
+++ b/python/automake.mk
@@ -16,6 +16,7 @@ ovs_pyfiles = \
python/ovs/compat/sortedcontainers/sorteddict.py \
python/ovs/compat/sortedcontainers/sortedset.py \
python/ovs/daemon.py \
+   python/ovs/dns_resolve.py \
python/ovs/db/__init__.py \
python/ovs/db/custom_index.py \
python/ovs/db/data.py \
@@ -55,6 +56,7 @@ ovs_pyfiles = \
 
 ovs_pytests = \
python/ovs/tests/test_decoders.py \
+   python/ovs/tests/test_dns_resolve.py \
python/ovs/tests/test_filter.py \
python/ovs/tests/test_kv.py \
python/ovs/tests/test_list.py \
diff --git a/python/ovs/dns_resolve.py b/python/ovs/dns_resolve.py
new file mode 100644
index 0..463a39277
--- /dev/null
+++ b/python/ovs/dns_resolve.py
@@ -0,0 +1,257 @@
+import collections
+import functools
+import ipaddress
+import os
+import threading
+import time
+import typing
+
+try:
+import unbound  # type: ignore
+except ImportError:
+pass
+
+import ovs.vlog
+
+vlog = 

Re: [ovs-dev] [PATCH v8 ovn 00/10] Configure OVN QoS thorugh OvS db

2023-05-17 Thread Numan Siddique
On Wed, May 17, 2023 at 5:02 AM Lorenzo Bianconi
 wrote:
>
> Rework OVN QoS implementation in order to configure it through OVS QoS
> table instead of running tc command directly bypassing OVS.
> This series allows to apply QoS rules on the localnet port related to
> logical switch ports running on the same datapath. Considering the
> following netowrk configuration:
>
> LSP{0,1} -- LogicalSwitch -- Localnet0
>
> It is possible to apply the following QoS rules on Localnet0 on egress traffic
> entering the cluster from LSP{0,1}:
> - LSP0: min-rate r0, max_rate R0
> - LSP1: min-rate r1, max_rate R1
>
> https://bugzilla.redhat.com/show_bug.cgi?id=2129742
>
> Changes since v7:
> - make queue_id allocation in norhd global
> - update QoS configuration in binding_handle_ovs_interface_changes()
> - fix "ovn-controller incremental processing" test
> - fix possible NULL pointer dereference
> - add new ovn-northd unit test
> - get rid of ovs port lookup by name
> - rebase on top of ovn main branch
> Changes since v6:
> - run add_ovs_qos_table_entry() and remove_stale_qos_entry() in setup_qos()
> - rename setup_qos() in configure_qos()
> - add some new unit-tests in ovn.at
> - erase QoS configuration if related port_binding is removed
> Changes since v5:
> - add IP for qos_map map
> - add some new unit-tests in ovn.at
> Changes since v4:
> - do not remove ovn-egress-iface parameter
> - rebase on top of ovn main branch
>
> Lorenzo Bianconi (10):
>   controller: remove tunnel interfaces from egress_ifaces sset
>   controller: add incremental processing for ovn-controller qos_map
>   northd: add qos_physical_network in port_binding config column
>   controller: configure qos through ovs qos table and do not run tc
> directly
>   controller: improve ovs port lookup by qos
>   controller: use unsigned long long int for
> qos_max_rate/qos_min_rate/qos_burst
>   northd: make queue_id allocation global for the ovn cluster
>   northd: apply QoS rules on the localnet port related to LSP ports
>   controller: get rid of egress_ifaces sset
>   update NEWS with new QoS info
>
>  NEWS|   6 +
>  controller/binding.c| 622 +---
>  controller/binding.h|   6 +-
>  controller/ovn-controller.c |  34 +-
>  controller/ovsport.c|  16 +
>  controller/ovsport.h|   3 +
>  northd/northd.c | 163 +++---
>  northd/ovn-northd.8.xml |  12 +
>  ovn-sb.xml  |   5 +
>  tests/ovn-northd.at |  24 ++
>  tests/ovn-performance.at|   5 -
>  tests/ovn.at| 185 +++
>  tests/system-ovn.at | 151 -
>  13 files changed, 772 insertions(+), 460 deletions(-)


Hi Lorenzo,

Thanks for addressing the review comments.

There is one issue with patch 3.  Please see that.

With patch 3  and Mark's comments in patch 4 addressed, for the series
: Acked-by: Numan Siddique 

Numan

>
> --
> 2.40.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v8 ovn 03/10] northd: add qos_physical_network in port_binding config column

2023-05-17 Thread Numan Siddique
On Wed, May 17, 2023 at 5:02 AM Lorenzo Bianconi
 wrote:
>
> Introduce qos_physical_network in port_binding config column in order to
> indicate the name of the egress network name where traffic shaping will
> be applied.
> This is a preliminary patch to rework OVN QoS implementation in order to
> configure it through OVS QoS table instead of running tc command
> directly bypassing OVS.
>
> Signed-off-by: Lorenzo Bianconi 
> ---
>  controller/binding.c |  8 
>  northd/northd.c  | 10 ++
>  ovn-sb.xml   |  5 +
>  tests/ovn-northd.at  | 22 ++
>  4 files changed, 45 insertions(+)
>
> diff --git a/controller/binding.c b/controller/binding.c
> index ad19a4092..91437dbb8 100644
> --- a/controller/binding.c
> +++ b/controller/binding.c
> @@ -142,6 +142,7 @@ static void update_lport_tracking(const struct 
> sbrec_port_binding *pb,
>  struct qos_queue {
>  struct hmap_node node;
>
> +char *network;
>  char *port;
>
>  uint32_t queue_id;
> @@ -165,6 +166,7 @@ find_qos_queue(struct hmap *queue_map, uint32_t hash, 
> const char *port)
>  static void
>  qos_queue_erase_entry(struct qos_queue *q)
>  {
> +free(q->network);
>  free(q->port);
>  free(q);
>  }
> @@ -186,6 +188,7 @@ get_qos_queue(const struct sbrec_port_binding *pb, struct 
> hmap *queue_map)
>  uint32_t max_rate = smap_get_int(>options, "qos_max_rate", 0);
>  uint32_t burst = smap_get_int(>options, "qos_burst", 0);
>  uint32_t queue_id = smap_get_int(>options, "qdisc_queue_id", 0);
> +const char *network = smap_get(>options, "qos_physical_network");
>
>  if ((!min_rate && !max_rate && !burst) || !queue_id) {
>  /* Qos is not configured for this port. */
> @@ -200,6 +203,11 @@ get_qos_queue(const struct sbrec_port_binding *pb, 
> struct hmap *queue_map)
>  q->port = xstrdup(pb->logical_port);
>  q->queue_id = queue_id;
>  }
> +
> +free(q->network);

There is a potential double free error for the q->network.

I tested it locally and could reproduce the crash. After freeing
please set q->network to NULL.

Numan

> +if (network) {
> +q->network = xstrdup(network);
> +}
>  q->min_rate = min_rate;
>  q->max_rate = max_rate;
>  q->burst = burst;
> diff --git a/northd/northd.c b/northd/northd.c
> index 7190cd18f..470f76809 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -3505,7 +3505,17 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
>  }
>
>  smap_clone(, >nbsp->options);
> +
>  if (queue_id) {
> +if (op->od->n_localnet_ports) {
> +struct ovn_port *port = op->od->localnet_ports[0];
> +const char *physical_network = smap_get(
> +>nbsp->options, "network_name");
> +if (physical_network) {
> +smap_add(, "qos_physical_network",
> + physical_network);
> +}
> +}
>  smap_add_format(,
>  "qdisc_queue_id", "%d", queue_id);
>  }
> diff --git a/ovn-sb.xml b/ovn-sb.xml
> index ead9efcab..0988cb1f8 100644
> --- a/ovn-sb.xml
> +++ b/ovn-sb.xml
> @@ -3691,6 +3691,11 @@ tcp.flags = RST;
>  interface, in bits.
>
>
> +  
> +If set, indicates the name of the egress network name where traffic
> +shaping will be applied.
> +  
> +
>type='{"type": "integer", "minInteger": 1, "maxInteger": 
> 61440}'>
>  Indicates the queue number on the physical device. This is same as 
> the
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 1f6169b77..a9af0f76a 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -9064,3 +9064,25 @@ mac_binding_timestamp: true
>
>  AT_CLEANUP
>  ])
> +
> +OVN_FOR_EACH_NORTHD_NO_HV([
> +AT_SETUP([check OVN QoS])
> +AT_KEYWORDS([OVN-QoS])
> +ovn_start
> +
> +check ovn-nbctl ls-add ls
> +check ovn-nbctl lsp-add ls public
> +check ovn-nbctl lsp-set-type public localnet
> +check ovn-nbctl lsp-set-addresses public unknown
> +
> +check_column "" sb:Port_Binding options logical_port=public
> +
> +check ovn-nbctl --wait=sb set Logical_Switch_Port public 
> options:qos_min_rate=20
> +AT_CHECK([fetch_column sb:Port_Binding options logical_port=public |grep -q 
> 'qos_min_rate=20'])
> +AT_CHECK([fetch_column sb:Port_Binding options logical_port=public |grep -q 
> 'qos_physical_network'],[1])
> +
> +check ovn-nbctl --wait=sb set Logical_Switch_Port public 
> options:qos_min_rate=20 options:network_name=phys
> +AT_CHECK([fetch_column sb:Port_Binding options logical_port=public |grep -q 
> 'qos_physical_network=phys'])
> +
> +AT_CLEANUP
> +])
> --
> 2.40.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>

[ovs-dev] [PATCH ovn v3 3/5] if-status: track interfaces for additional chassis

2023-05-17 Thread Ihar Hrachyshka
This will allow all chassis hosting a port to extract interface MTU from
if-status-mgr. This will be used in a later patch to calculate the
effective path MTU for each port.

In addition, it's the right thing to do to claim and mark an interface
on all chassis as ovn-installed, even if the chassis is "additional".

Fixes: fa8c591fa2a7 ("Support LSP:options:requested-chassis as a list")
Signed-off-by: Ihar Hrachyshka 
Acked-by: Dumitru Ceara 
---
 controller/binding.c   | 46 ++
 controller/binding.h   |  4 
 controller/if-status.c |  8 ++--
 controller/if-status.h |  3 ++-
 tests/ovn.at   | 10 +
 5 files changed, 47 insertions(+), 24 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index a627618b7..19622ec0a 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -57,6 +57,10 @@ struct claimed_port {
 static struct shash _claimed_ports = SHASH_INITIALIZER(&_claimed_ports);
 static struct sset _postponed_ports = SSET_INITIALIZER(&_postponed_ports);
 
+static void
+remove_additional_chassis(const struct sbrec_port_binding *pb,
+  const struct sbrec_chassis *chassis_rec);
+
 struct sset *
 get_postponed_ports(void)
 {
@@ -1073,6 +1077,26 @@ set_pb_chassis_in_sbrec(const struct sbrec_port_binding 
*pb,
 }
 }
 
+void
+set_pb_additional_chassis_in_sbrec(const struct sbrec_port_binding *pb,
+   const struct sbrec_chassis *chassis_rec,
+   bool is_set)
+{
+if (!is_additional_chassis(pb, chassis_rec)) {
+VLOG_INFO("Claiming lport %s for this additional chassis.",
+  pb->logical_port);
+for (size_t i = 0; i < pb->n_mac; i++) {
+VLOG_INFO("%s: Claiming %s", pb->logical_port, pb->mac[i]);
+}
+sbrec_port_binding_update_additional_chassis_addvalue(pb, chassis_rec);
+if (pb->chassis == chassis_rec) {
+sbrec_port_binding_set_chassis(pb, NULL);
+}
+} else if (!is_set) {
+remove_additional_chassis(pb, chassis_rec);
+}
+}
+
 bool
 local_bindings_pb_chassis_is_set(struct shash *local_bindings,
  const char *pb_name,
@@ -1274,7 +1298,7 @@ claim_lport(const struct sbrec_port_binding *pb,
 set_pb_chassis_in_sbrec(pb, chassis_rec, true);
 } else {
 if_status_mgr_claim_iface(if_mgr, pb, chassis_rec, iface_rec,
-  sb_readonly);
+  sb_readonly, can_bind);
 }
 register_claim_timestamp(pb->logical_port, now);
 sset_find_and_delete(postponed_ports, pb->logical_port);
@@ -1288,27 +1312,15 @@ claim_lport(const struct sbrec_port_binding *pb,
 !smap_get_bool(_rec->external_ids,
OVN_INSTALLED_EXT_ID, false)) {
 if_status_mgr_claim_iface(if_mgr, pb, chassis_rec,
-  iface_rec, sb_readonly);
+  iface_rec, sb_readonly,
+  can_bind);
 }
 }
 }
 } else if (can_bind == CAN_BIND_AS_ADDITIONAL) {
 if (!is_additional_chassis(pb, chassis_rec)) {
-if (sb_readonly) {
-return false;
-}
-
-VLOG_INFO("Claiming lport %s for this additional chassis.",
-  pb->logical_port);
-for (size_t i = 0; i < pb->n_mac; i++) {
-VLOG_INFO("%s: Claiming %s", pb->logical_port, pb->mac[i]);
-}
-
-sbrec_port_binding_update_additional_chassis_addvalue(pb,
-  chassis_rec);
-if (pb->chassis == chassis_rec) {
-sbrec_port_binding_set_chassis(pb, NULL);
-}
+if_status_mgr_claim_iface(if_mgr, pb, chassis_rec, iface_rec,
+  sb_readonly, can_bind);
 update_tracked = true;
 }
 }
diff --git a/controller/binding.h b/controller/binding.h
index 5b73c6a4b..46e618b97 100644
--- a/controller/binding.h
+++ b/controller/binding.h
@@ -202,6 +202,10 @@ bool is_additional_chassis(const struct sbrec_port_binding 
*pb,
 void set_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb,
  const struct sbrec_chassis *chassis_rec,
  bool is_set);
+void
+set_pb_additional_chassis_in_sbrec(const struct sbrec_port_binding *pb,
+   const struct sbrec_chassis *chassis_rec,
+   bool is_set);
 
 void remove_ovn_installed_for_uuid(const struct ovsrec_interface_table *,
const struct uuid *);
diff --git a/controller/if-status.c 

[ovs-dev] [PATCH ovn v3 4/5] Add new egress tables to accommodate for too-big packets handling

2023-05-17 Thread Ihar Hrachyshka
The new tables will be used in a later patch as follows:

table=37, OFTABLE_OUTPUT_INIT: becomes an initial entry point into the
egress pipeline that serves a semantic goal. (Not doing any actual
processing at the moment.)

table=38, OFTABLE_OUTPUT_LARGE_PKT_DETECT: detect "too-big" IP packets
and mark them for later processing in table=39.

table=39, OFTABLE_OUTPUT_LARGE_PKT_PROCESS: process "too-big" IP packets
detected in table=38 by sending ICMPv4 Fragmentation Needed / ICMPv6 Too
Big errors back to the originating port.

All previous table indices shifted by 3 (old table=37 becomes table=40).
Otherwise, no changes to existing tables and flows introduced.

Acked-by: Dumitru Ceara 
Signed-off-by: Ihar Hrachyshka 
---
 controller/lflow.c   |   4 +-
 controller/lflow.h   |  49 ---
 controller/physical.c|  77 +++
 controller/pinctrl.c |   6 +-
 ovn-architecture.7.xml   |  71 +-
 tests/ovn-controller.at  | 174 
 tests/ovn.at | 280 +++
 tests/system-ovn-kmod.at |   2 +-
 tests/system-ovn.at  |   8 +-
 9 files changed, 350 insertions(+), 321 deletions(-)

diff --git a/controller/lflow.c b/controller/lflow.c
index 0b071138d..22faaf013 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -397,7 +397,7 @@ consider_lflow_for_added_as_ips__(
 : OFTABLE_LOG_EGRESS_PIPELINE);
 uint8_t ptable = first_ptable + lflow->table_id;
 uint8_t output_ptable = (ingress
- ? OFTABLE_REMOTE_OUTPUT
+ ? OFTABLE_OUTPUT_INIT
  : OFTABLE_SAVE_INPORT);
 
 uint64_t ovnacts_stub[1024 / 8];
@@ -1067,7 +1067,7 @@ consider_logical_flow__(const struct sbrec_logical_flow 
*lflow,
 : OFTABLE_LOG_EGRESS_PIPELINE);
 uint8_t ptable = first_ptable + lflow->table_id;
 uint8_t output_ptable = (ingress
- ? OFTABLE_REMOTE_OUTPUT
+ ? OFTABLE_OUTPUT_INIT
  : OFTABLE_SAVE_INPORT);
 
 /* Parse OVN logical actions.
diff --git a/controller/lflow.h b/controller/lflow.h
index dd742257b..b804e61e5 100644
--- a/controller/lflow.h
+++ b/controller/lflow.h
@@ -63,27 +63,34 @@ struct uuid;
  *
  * These are heavily documented in ovn-architecture(7), please update it if
  * you make any changes. */
-#define OFTABLE_PHY_TO_LOG0
-#define OFTABLE_LOG_INGRESS_PIPELINE  8 /* First of LOG_PIPELINE_LEN tables. */
-#define OFTABLE_REMOTE_OUTPUT37
-#define OFTABLE_LOCAL_OUTPUT 38
-#define OFTABLE_CHECK_LOOPBACK   39
-#define OFTABLE_LOG_EGRESS_PIPELINE  40 /* First of LOG_PIPELINE_LEN tables. */
-#define OFTABLE_SAVE_INPORT  64
-#define OFTABLE_LOG_TO_PHY   65
-#define OFTABLE_MAC_BINDING  66
-#define OFTABLE_MAC_LOOKUP   67
-#define OFTABLE_CHK_LB_HAIRPIN   68
-#define OFTABLE_CHK_LB_HAIRPIN_REPLY 69
-#define OFTABLE_CT_SNAT_HAIRPIN  70
-#define OFTABLE_GET_FDB  71
-#define OFTABLE_LOOKUP_FDB   72
-#define OFTABLE_CHK_IN_PORT_SEC  73
-#define OFTABLE_CHK_IN_PORT_SEC_ND   74
-#define OFTABLE_CHK_OUT_PORT_SEC 75
-#define OFTABLE_ECMP_NH_MAC  76
-#define OFTABLE_ECMP_NH  77
-#define OFTABLE_CHK_LB_AFFINITY  78
+#define OFTABLE_PHY_TO_LOG0
+
+/* Start of LOG_PIPELINE_LEN tables. */
+#define OFTABLE_LOG_INGRESS_PIPELINE  8
+#define OFTABLE_OUTPUT_INIT  37
+#define OFTABLE_OUTPUT_LARGE_PKT_DETECT  38
+#define OFTABLE_OUTPUT_LARGE_PKT_PROCESS 39
+#define OFTABLE_REMOTE_OUTPUT40
+#define OFTABLE_LOCAL_OUTPUT 41
+#define OFTABLE_CHECK_LOOPBACK   42
+
+/* Start of LOG_PIPELINE_LEN tables. */
+#define OFTABLE_LOG_EGRESS_PIPELINE  43
+#define OFTABLE_SAVE_INPORT  64
+#define OFTABLE_LOG_TO_PHY   65
+#define OFTABLE_MAC_BINDING  66
+#define OFTABLE_MAC_LOOKUP   67
+#define OFTABLE_CHK_LB_HAIRPIN   68
+#define OFTABLE_CHK_LB_HAIRPIN_REPLY 69
+#define OFTABLE_CT_SNAT_HAIRPIN  70
+#define OFTABLE_GET_FDB  71
+#define OFTABLE_LOOKUP_FDB   72
+#define OFTABLE_CHK_IN_PORT_SEC  73
+#define OFTABLE_CHK_IN_PORT_SEC_ND   74
+#define OFTABLE_CHK_OUT_PORT_SEC 75
+#define OFTABLE_ECMP_NH_MAC  76
+#define OFTABLE_ECMP_NH  77
+#define OFTABLE_CHK_LB_AFFINITY  78
 
 struct lflow_ctx_in {
 struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
diff --git a/controller/physical.c b/controller/physical.c
index ec861f49c..1b0482e3b 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -876,12 +876,12 @@ put_local_common_flows(uint32_t dp_key,
 
 uint32_t port_key = pb->tunnel_key;
 
-/* Table 38, priority 100.
+/* Table 41, priority 100.
  * ===
   

[ovs-dev] [PATCH ovn v3 5/5] Implement MTU Path Discovery for multichassis ports

2023-05-17 Thread Ihar Hrachyshka
When a multichassis port belongs to a switch with a localnet port,
packets originating or directed to the multichassis port are NOT sent
thorugh the localnet port. Instead, tunneling is enforced in-cluster to
guarantee delivery of all packets to all chassis of the port.

This behavior has an unfortunate side effect, where - because of
additional tunnel header added to each packet - the effective MTU of the
path for multichassis ports changes from what's set as mtu_request. This
effectively makes OVN to black hole all packets for the port that use
full capacity of the interface MTU. This breaks usual TCP / UDP
services, among other things (SSH, iperf sessions etc.)

This patch adds flows so that
- (in table 38) detect too-big packets (table 38), and then
- (in table 39) icmp fragmentation needed / too big errors are sent
  back to offending port.

Once the error is received, the sender is expected to adjust the route
MTU accordingly, sending the next packets with the new path MTU. After a
multichassis port is re-assigned to a single chassis, the effective path
MTU is restored to "usual". Peers will eventually see their "learned"
path MTU cache expire, which will make them switch back to the "usual"
MTU.

Among other scenarios, this patch helps to maintain existing services
working during live migration of a VM, if multichassis ports are used.
(E.g. in OpenStack Nueutron.)

Fixes: 7084cf437421 ("Always funnel multichassis port traffic through tunnels")

Signed-off-by: Ihar Hrachyshka 
---
 NEWS|   6 +
 controller/ovn-controller.c |   3 +
 controller/physical.c   | 260 -
 controller/physical.h   |   1 +
 include/ovn/actions.h   |   3 +
 lib/actions.c   |   4 +-
 lib/ovn-util.h  |   7 +
 northd/northd.c |   2 +
 ovn-architecture.7.xml  |   9 +-
 tests/ovn.at| 321 
 10 files changed, 607 insertions(+), 9 deletions(-)

diff --git a/NEWS b/NEWS
index 0f1c5f985..560e51f21 100644
--- a/NEWS
+++ b/NEWS
@@ -25,6 +25,12 @@ Post v23.03.0
 databases for ovn-nbctl and ovn-sbctl respectively.  See man ovn-nb and
 man ovn-sb for 'nbctl_probe_interval' and 'sbctl_probe_interval'
 options for more details.
+  - Send ICMP Fragmentation Needed packets back to offending ports when
+communicating with multichassis ports using frames that don't fit through a
+tunnel. This is done only for logical switches that are attached to a
+physical network via a localnet port, in which case multichassis ports may
+have an effective MTU different from regular ports and hence may need this
+mechanism to maintain connectivity with other peers in the network.
 
 OVN v23.03.0 - 03 Mar 2023
 --
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 28785b3f0..132831dde 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -4089,6 +4089,9 @@ static void init_physical_ctx(struct engine_node *node,
 p_ctx->patch_ofports = _vif_data->patch_ofports;
 p_ctx->chassis_tunnels = _vif_data->chassis_tunnels;
 
+struct controller_engine_ctx *ctrl_ctx = engine_get_context()->client_ctx;
+p_ctx->if_mgr = ctrl_ctx->if_mgr;
+
 pflow_output_get_debug(node, _ctx->debug);
 }
 
diff --git a/controller/physical.c b/controller/physical.c
index 1b0482e3b..a3ea54284 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -41,6 +41,7 @@
 #include "lib/ovn-sb-idl.h"
 #include "lib/ovn-util.h"
 #include "ovn/actions.h"
+#include "if-status.h"
 #include "physical.h"
 #include "pinctrl.h"
 #include "openvswitch/shash.h"
@@ -91,6 +92,7 @@ physical_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 
 ovsdb_idl_add_table(ovs_idl, _table_interface);
 ovsdb_idl_track_add_column(ovs_idl, _interface_col_name);
+ovsdb_idl_track_add_column(ovs_idl, _interface_col_mtu);
 ovsdb_idl_track_add_column(ovs_idl, _interface_col_ofport);
 ovsdb_idl_track_add_column(ovs_idl, _interface_col_external_ids);
 }
@@ -1104,6 +1106,240 @@ setup_activation_strategy(const struct 
sbrec_port_binding *binding,
 }
 }
 
+/*
+ * Insert a flow to determine if an IP packet is too big for the corresponding
+ * egress interface.
+ */
+static void
+determine_if_pkt_too_big(struct ovn_desired_flow_table *flow_table,
+ const struct sbrec_port_binding *binding,
+ const struct sbrec_port_binding *mcp,
+ uint16_t mtu, bool is_ipv6, int direction)
+{
+struct ofpbuf ofpacts;
+ofpbuf_init(, 0);
+
+/* Store packet too large flag in reg9[1]. */
+struct match match;
+match_init_catchall();
+match_set_dl_type(, htons(is_ipv6 ? ETH_TYPE_IPV6 : ETH_TYPE_IP));
+match_set_metadata(, htonll(binding->datapath->tunnel_key));
+match_set_reg(, direction - MFF_REG0, mcp->tunnel_key);
+
+/* reg9[1] is REGBIT_PKT_LARGER as 

[ovs-dev] [PATCH ovn v3 2/5] Track interface MTU in if-status-mgr

2023-05-17 Thread Ihar Hrachyshka
This will be used in a later patch to calculate the effective interface
MTU after considering tunneling overhead.

NOTE: ideally, OVN would support Logical_Port MTU, in which case we
wouldn't have to track OVSDB for interfaces.

Signed-off-by: Ihar Hrachyshka 
---
 controller/binding.c|  4 +-
 controller/if-status.c  | 36 --
 controller/if-status.h  |  6 +++
 controller/ovn-controller.c | 73 +
 controller/ovsport.c|  9 +
 controller/ovsport.h|  2 +
 controller/physical.h   |  1 +
 7 files changed, 125 insertions(+), 6 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index bd810f669..a627618b7 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -1273,7 +1273,7 @@ claim_lport(const struct sbrec_port_binding *pb,
 }
 set_pb_chassis_in_sbrec(pb, chassis_rec, true);
 } else {
-if_status_mgr_claim_iface(if_mgr, pb, chassis_rec,
+if_status_mgr_claim_iface(if_mgr, pb, chassis_rec, iface_rec,
   sb_readonly);
 }
 register_claim_timestamp(pb->logical_port, now);
@@ -1288,7 +1288,7 @@ claim_lport(const struct sbrec_port_binding *pb,
 !smap_get_bool(_rec->external_ids,
OVN_INSTALLED_EXT_ID, false)) {
 if_status_mgr_claim_iface(if_mgr, pb, chassis_rec,
-  sb_readonly);
+  iface_rec, sb_readonly);
 }
 }
 }
diff --git a/controller/if-status.c b/controller/if-status.c
index 8503e5daa..dc4062559 100644
--- a/controller/if-status.c
+++ b/controller/if-status.c
@@ -18,12 +18,14 @@
 #include "binding.h"
 #include "if-status.h"
 #include "ofctrl-seqno.h"
+#include "ovsport.h"
 #include "simap.h"
 
 #include "lib/hmapx.h"
 #include "lib/util.h"
 #include "timeval.h"
 #include "openvswitch/vlog.h"
+#include "lib/vswitch-idl.h"
 #include "lib/ovn-sb-idl.h"
 
 VLOG_DEFINE_THIS_MODULE(if_status);
@@ -181,6 +183,7 @@ struct ovs_iface {
  * be fully programmed in OVS.  Only used in state
  * OIF_INSTALL_FLOWS.
  */
+uint16_t mtu;   /* Extracted from OVS interface.mtu field. */
 };
 
 static uint64_t ifaces_usage;
@@ -205,9 +208,10 @@ struct if_status_mgr {
 uint32_t iface_seqno;
 };
 
-static struct ovs_iface *ovs_iface_create(struct if_status_mgr *,
-  const char *iface_id,
-  enum if_state );
+static struct ovs_iface *
+ovs_iface_create(struct if_status_mgr *, const char *iface_id,
+ const struct ovsrec_interface *iface_rec,
+ enum if_state);
 static void add_to_ovn_uninstall_hash(struct if_status_mgr *, const char *,
   const struct uuid *);
 static void ovs_iface_destroy(struct if_status_mgr *, struct ovs_iface *);
@@ -272,13 +276,14 @@ void
 if_status_mgr_claim_iface(struct if_status_mgr *mgr,
   const struct sbrec_port_binding *pb,
   const struct sbrec_chassis *chassis_rec,
+  const struct ovsrec_interface *iface_rec,
   bool sb_readonly)
 {
 const char *iface_id = pb->logical_port;
 struct ovs_iface *iface = shash_find_data(>ifaces, iface_id);
 
 if (!iface) {
-iface = ovs_iface_create(mgr, iface_id, OIF_CLAIMED);
+iface = ovs_iface_create(mgr, iface_id, iface_rec, OIF_CLAIMED);
 }
 
 memcpy(>pb_uuid, >header_.uuid, sizeof(iface->pb_uuid));
@@ -639,14 +644,37 @@ ovn_uninstall_hash_account_mem(const char *name, bool 
erase)
 }
 }
 
+uint16_t
+if_status_mgr_iface_get_mtu(const struct if_status_mgr *mgr,
+const char *iface_id)
+{
+const struct ovs_iface *iface = shash_find_data(>ifaces, iface_id);
+return iface ? iface->mtu : 0;
+}
+
+bool
+if_status_mgr_iface_set_mtu(const struct if_status_mgr *mgr,
+const char *iface_id,
+uint16_t mtu)
+{
+struct ovs_iface *iface = shash_find_data(>ifaces, iface_id);
+if (iface && iface->mtu != mtu) {
+iface->mtu = mtu;
+return true;
+}
+return false;
+}
+
 static struct ovs_iface *
 ovs_iface_create(struct if_status_mgr *mgr, const char *iface_id,
+ const struct ovsrec_interface *iface_rec,
  enum if_state state)
 {
 struct ovs_iface *iface = xzalloc(sizeof *iface);
 
 VLOG_DBG("Interface %s create.", iface_id);
 iface->id = xstrdup(iface_id);
+iface->mtu = get_iface_mtu(iface_rec);
 shash_add_nocopy(>ifaces, iface->id, iface);
 ovs_iface_set_state(mgr, iface, state);
 

[ovs-dev] [PATCH ovn v3 0/5] Implement MTU Path Discovery for multichassis ports

2023-05-17 Thread Ihar Hrachyshka
This series fixes a non-optimal behavior with some multichassis ports.

Specifically,

- when a multichassis port belongs to a switch that also has a localnet
  port,
- because ingress and egress traffic for the port is funnelled through
  tunnels to guarantee delivery of packets to all chassis that host the
  port,
- because tunnels add overhead to each funnelled packet,
- leaving less space for actual packets,

... the effective MTU of the path for multichassis ports is reduced by
the size that takes the tunnel to wrap a packet around. (This depends on
the type of tunnel, also on IP version of the tunnel.)

When this happens, the port and its peers are not notified about the
reduced MTU for the path to the port, effectively blackholing some of
the larger packets.

This patch series makes OVN detect these scenarios and handle them as
follows:

- for all multichassis ports, 4 flows are added - for inport and
  outport matches - to detect "too-big" IP packets;
- for all "too-big" packets, 2 flows are added to produce a ICMP
  Fragmentation Needed (for IPv4) or Too Big (for IPv6) packet and
  return it to the offending sender.

The proposed implementation is isolated in ovn-controller. An
alternative would be to implement flows producing ICMP errors via the
existing Logical_Flow action icmp4_error, as is done for router ports.
In this case, the 4 flows that detect "too-big" packets would still have
to reside in ovn-controller because of limitations of the current OVN
port model, namely lack of `mtu` as an attribute of OVN logical port.

Caveats: this works for IP traffic only. The party receiving the ICMP
error should handle it by temporarily lowering the MTU used to reach the
port. Behavior may depend on protocol implementation of the offending
peer as well as its networking stack configuration.

This patch series was successfully tested with Linux kernel 6.0.8.

This patch series is designed to simplify backporting process. It is
split in logical pieces to simplify cherry-picking. I consider the
original behavior described at the start of the cover letter a bug and
worth a backport.

---
v1: - initial version.
v2: - more system test cases adjusted to new table numbers for egress
  pipeline;
- switch from xxd to coreutils (tr, head) tools to avoid a new
  dependency;
- adjusted a comment in the new test case to avoid "migrator"
  terminology that makes no sense outside live migration context -
  which is not the only potential use case for multichassis ports;
- guard the new test case with HAVE_SCAPY;
- nit: rewrapped some lines in patch 6/7 to avoid unnecessary line
  changes;
- added a NEWS entry for patch 6/7 that implements the core of the
  feature;
- rebased to latest main.
v3: - expose and reuse lib/actions.c:encode_[start|finish]_controller_op
  functions;
- reuse ETH_HEADER_LEN, IP_HEADER_LEN, and IPV6_HEADER_LEN from
  ovs:lib/packets.h;
- add a comment for REGBIT_PKT_LARGER that mentions that the
  register is also used in OFTABLE_OUTPUT_LARGE_PKT_DETECT table;
- squash the patch that makes if-status-mgr track changes to
  interface mtu into the patch that introduces the mtu field in the
  manager;
- doc: don't describe path discovery flows in the patch that
  introduces new tables (only in the patch that actually implements
  the flows);
- nit: use MAX to shave off several lines of code;
- nit: remove redundant if_status_mgr_iface_get_mtu declaration from
  a header file;
- rebased to latest main;
- updated acks based on latest review comments.
---

Ihar Hrachyshka (5):
  Track ip version of tunnel in chassis_tunnel struct
  Track interface MTU in if-status-mgr
  if-status: track interfaces for additional chassis
  Add new egress tables to accommodate for too-big packets handling
  Implement MTU Path Discovery for multichassis ports

 NEWS|   6 +
 controller/binding.c|  48 +--
 controller/binding.h|   4 +
 controller/if-status.c  |  44 ++-
 controller/if-status.h  |   9 +-
 controller/lflow.c  |   4 +-
 controller/lflow.h  |  49 +--
 controller/local_data.c |   2 +
 controller/local_data.h |   1 +
 controller/ovn-controller.c |  76 +
 controller/ovsport.c|   9 +
 controller/ovsport.h|   2 +
 controller/physical.c   | 337 ++--
 controller/physical.h   |   2 +
 controller/pinctrl.c|   6 +-
 include/ovn/actions.h   |   3 +
 lib/actions.c   |   4 +-
 lib/ovn-util.h  |   7 +
 northd/northd.c |   2 +
 ovn-architecture.7.xml  |  76 ++---
 tests/ovn-controller.at | 174 +-
 tests/ovn.at| 611 +++-
 tests/system-ovn-kmod.at|   2 +-
 tests/system-ovn.at |   8 +-
 24 files changed, 1129 insertions(+), 357 deletions(-)

-- 
2.38.1


[ovs-dev] [PATCH ovn v3 1/5] Track ip version of tunnel in chassis_tunnel struct

2023-05-17 Thread Ihar Hrachyshka
This will be used in a later patch to calculate tunneling overhead for
effective path MTU.

Acked-by: Dumitru Ceara 
Signed-off-by: Ihar Hrachyshka 
---
 controller/local_data.c | 2 ++
 controller/local_data.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/controller/local_data.c b/controller/local_data.c
index acaf1de6d..cf0b21bb1 100644
--- a/controller/local_data.c
+++ b/controller/local_data.c
@@ -22,6 +22,7 @@
 #include "lib/util.h"
 #include "lib/vswitch-idl.h"
 #include "openvswitch/vlog.h"
+#include "socket-util.h"
 
 /* OVN includes. */
 #include "encaps.h"
@@ -447,6 +448,7 @@ local_nonvif_data_run(const struct ovsrec_bridge *br_int,
 tun->chassis_id = xstrdup(tunnel_id);
 tun->ofport = u16_to_ofp(ofport);
 tun->type = tunnel_type;
+tun->is_ipv6 = ip ? addr_is_ipv6(ip) : false;
 
 free(hash_id);
 free(ip);
diff --git a/controller/local_data.h b/controller/local_data.h
index 748f009aa..ad0fa7f94 100644
--- a/controller/local_data.h
+++ b/controller/local_data.h
@@ -133,6 +133,7 @@ struct chassis_tunnel {
 char *chassis_id;
 ofp_port_t ofport;
 enum chassis_tunnel_type type;
+bool is_ipv6;
 };
 
 void local_nonvif_data_run(const struct ovsrec_bridge *br_int,
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v8 ovn 04/10] controller: configure qos through ovs qos table and do not run tc directly

2023-05-17 Thread Lorenzo Bianconi
> Hi Lorenzo,
> 
> I'm only replying to this patch since as far as I'm concerned, the rest of
> the patches look good enough to me. I'll withhold ACKs until I can ACK the
> whole series.
> 
> See below for comments.
> 
> On 5/17/23 05:01, Lorenzo Bianconi wrote:
> > Rework OVN QoS implementation in order to configure it through OVS QoS
> > table instead of running tc command directly bypassing OVS.
> > 
> > Acked-By: Ihar Hrachyshka 
> > Reviewed-by: Simon Horman 
> > Tested-by: Rodolfo Alonso 
> > Signed-off-by: Lorenzo Bianconi 
> > ---
> >   controller/binding.c| 432 +++-
> >   controller/ovn-controller.c |   7 +
> >   tests/ovn-performance.at|   5 -
> >   tests/system-ovn.at |  67 +-
> >   4 files changed, 298 insertions(+), 213 deletions(-)
> > 
> > diff --git a/controller/binding.c b/controller/binding.c
> > index 91437dbb8..e18488c7e 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -181,222 +181,268 @@ destroy_qos_map(struct hmap *qos_map)
> >   hmap_destroy(qos_map);
> >   }
> > -static void
> > -get_qos_queue(const struct sbrec_port_binding *pb, struct hmap *queue_map)
> > +static const struct ovsrec_interface *
> > +get_qos_egress_port_interface(struct shash *bridge_mappings,
> > +  const struct ovsrec_port **pport,
> > +  const char *network)
> >   {
> > -uint32_t min_rate = smap_get_int(>options, "qos_min_rate", 0);
> > -uint32_t max_rate = smap_get_int(>options, "qos_max_rate", 0);
> > -uint32_t burst = smap_get_int(>options, "qos_burst", 0);
> > -uint32_t queue_id = smap_get_int(>options, "qdisc_queue_id", 0);
> > -const char *network = smap_get(>options, "qos_physical_network");
> > +struct ovsrec_bridge *br_ln = shash_find_data(bridge_mappings, 
> > network);
> > +if (!br_ln) {
> > +return NULL;
> > +}
> > -if ((!min_rate && !max_rate && !burst) || !queue_id) {
> > -/* Qos is not configured for this port. */
> > -return;
> > +/* Add egress-ifaces from the connected bridge */
> > +for (size_t i = 0; i < br_ln->n_ports; i++) {
> > +const struct ovsrec_port *port = br_ln->ports[i];
> > +for (size_t j = 0; j < port->n_interfaces; j++) {
> > +const struct ovsrec_interface *iface = port->interfaces[j];
> > +
> > +if (smap_get(>external_ids, "iface-id")) {
> > +continue;
> > +}
> 
> Why do we skip interfaces with an iface-id in external_ids?

Because we do not want to use an interface binded to a vm to shape egress
traffic

> 
> > +
> > +bool is_egress_iface = smap_get_bool(>external_ids,
> > + "ovn-egress-iface", 
> > false);
> > +if (is_egress_iface) {
> > +*pport = port;
> > +return iface;
> > +}
> > +}
> >   }
> > -uint32_t hash = hash_string(pb->logical_port, 0);
> > -struct qos_queue *q = find_qos_queue(queue_map, hash, 
> > pb->logical_port);
> > -if (!q) {
> > -q = xzalloc(sizeof *q);
> > -hmap_insert(queue_map, >node, hash);
> > -q->port = xstrdup(pb->logical_port);
> > -q->queue_id = queue_id;
> > +return NULL;
> > +}
> > +
> > +/* 34359738360 == (2^32 - 1) * 8.  netdev_set_qos() doesn't support
> > + * 64-bit rate netlink attributes, so the maximum value is 2^32 - 1
> > + * bytes. The 'max-rate' config option is in bits, so multiplying by 8.
> > + * Without setting max-rate the reported link speed will be used, which
> > + * can be unrecognized for certain NICs or reported too low for virtual
> > + * interfaces. */
> > +#define OVN_QOS_MAX_RATE34359738360ULL
> > +static void
> > +add_ovs_qos_table_entry(struct ovsdb_idl_txn *ovs_idl_txn,
> > +const struct ovsrec_port *port,
> > +uint32_t min_rate, uint32_t max_rate,
> > +uint32_t burst, uint32_t queue_id,
> > +const char *ovn_port)
> > +{
> > +struct smap external_ids = SMAP_INITIALIZER(_ids);
> > +struct smap other_config = SMAP_INITIALIZER(_config);
> > +
> > +const struct ovsrec_qos *qos = port->qos;
> > +if (qos && smap_get_bool(>external_ids, "ovn_qos", false)) {
> > +/* External configured QoS, do not overwrite it. */
> 
> The code and comment do not match. I think the comment is correct, and the
> if should be:
> 
> if (qos && !smap_get_bool(>external_ids, "ovn_qos", false))

ack, right. I will fix it

> 
> > +return;
> >   }
> > -free(q->network);
> > -if (network) {
> > -q->network = xstrdup(network);
> > +if (!qos) {
> > +qos = ovsrec_qos_insert(ovs_idl_txn);
> > +ovsrec_qos_set_type(qos, OVN_QOS_TYPE);
> > +ovsrec_port_set_qos(port, qos);
> > +smap_add_format(_config, "max-rate", "%lld", 
> > 

[ovs-dev] [Notice] 0day-bot host scheduled for a system update & backup on Monday, May 22nd

2023-05-17 Thread Michael Santana
Hi all,

We are scheduling a system update and backup for the 0day-bot host on Monday, 
May 22nd.
This means 0daybot, Github Actions and other CI tools will be down for the 
duration of the update & backup.

It is expected to be back online at the end of business day.

Best

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] tc: fix crash on EAGAIN return from recvmsg on netlink socket.

2023-05-17 Thread Simon Horman
On Mon, May 15, 2023 at 10:04:48AM +0200, Frode Nordahl wrote:
> The tc module combines the use of the `tc_transact` helper
> function for communication with the in-kernel tc infrastructure
> with assertions on the reply data by `ofpbuf_at_assert` on the
> received data prior to further processing.
> 
> `tc_transact` in turn calls `nl_transact`, which via
> `nl_transact_multiple__` ultimately calls and handles return
> value from `recvmsg`.  On error a check for EAGAIN is performed
> and a consequence of this condition is effectively to provide a
> non-error (0) result and an empty reply buffer.
> 
> Before this change, the `tc_transact` and, consumers of it, were
> unaware of this condition.  The use of assertions on the reply
> buffer can as such lead to a fatal crash of OVS.
> 
> To be fair, the behavior of `nl_transact` when handling an EAGAIN
> return is currently not documented, so this change also adds that
> documentation.
> 
> While fixing the problem, it led me to find potential problems
> with the one-time initialization functions in the netdev-offload-tc
> module.  Make use of the information now available about an EAGAIN
> condition to retry one-time initialization, and resort to logging
> a warning if probing of tc features fails due to temporary
> situations such as resource depletion.
> 
> For the record, the symptom of the crash is this in the log:
> EMER|../include/openvswitch/ofpbuf.h:194: assertion offset + size <= b->size 
> failed in ofpbuf_at_assert()
> 
> And an excerpt of the backtrace looks like this:
> 0x561dac1396d1 in ofpbuf_at_assert (b=0x7fb650005d20, b=0x7fb650005d20, 
> offset=16, size=20) at ../include/openvswitch/ofpbuf.h:194
> tc_replace_flower (id=, flower=) at 
> ../lib/tc.c:3223
> 0x561dac128155 in netdev_tc_flow_put (netdev=0x561dacf91840, 
> match=, actions=, actions_len=,
> ufid=, info=, stats=) at 
> ../lib/netdev-offload-tc.c:2096
> 0x561dac117541 in netdev_flow_put (stats=, 
> info=0x7fb65b7ba780, ufid=, act_len=, 
> actions=,
> match=0x7fb65b7ba980, netdev=0x561dacf91840) at ../lib/netdev-offload.c:257
> parse_flow_put (put=0x7fb65b7bcc50, dpif=0x561dad0ad550) at 
> ../lib/dpif-netlink.c:2297
> try_send_to_netdev (op=0x7fb65b7bcc48, dpif=0x561dad0ad550) at 
> ../lib/dpif-netlink.c:2384
> 
> Reported-At: https://launchpad.net/bugs/2018500
> Fixes f98e418fbdb6 (tc: Add tc flower functions)
> Fixes 407556ac6c90 (netlink-socket: Avoid forcing a reply for final message 
> in a transaction.)
> Signed-off-by: Frode Nordahl 

Reviewed-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 6/6] Update multichassis physical flows on interface MTU update

2023-05-17 Thread Ihar Hrachyshka
On Tue, May 16, 2023 at 9:41 AM Dumitru Ceara  wrote:
>
> On 5/3/23 22:13, Ihar Hrachyshka wrote:
> > Make ICMP Path MTU Discovery flows in table=38 react to underlying
> > interface MTU update.
> >
> > NOTE: ideally, OVN would support Logical_Port MTU, in which case we
> > wouldn't have to track OVSDB for interfaces, and we would also be able
> > to react to MTU changes regardless of interface location. This patch is
> > best effort and doesn't handle all scenarios. (E.g. a scenario where
> > MTUs are not changed consistently for all switch interfaces across all
> > chassis.)
>
> This matches the alternative solution I had hinted at in patch 5/6.
>

(I understand that you are not blocking the series for Logical_Flow
reimplementation, but I want to address this regardless, for posterity
- read: ourselves - when later we return to this topic and wonder why
this wasn't implemented in northd. Feel free to ignore below.)

OK, now that I see this comment here too, I think I understand what
you meant in 5/6 in re: CMS. But mtu implementation in OVN is not my
goal here, and without it, the Logical_Flow solution will be
incomplete (still offloading the detection of too-big frames to
ovn-controller direct programming of flows - but yes, we would get rid
of ICMP reply generation function). Whether the split of the
implementation between northd (to create Logical_Flow for ICMP reply
generation) and ovn-controller (to insert flows to detect "too-big"
frames) is a better way than what's proposed is a matter of taste. (I
prefer keeping the whole thing in a single component.) But I can see
why offloading as much of flow programming logic into northd may seem
like a better way, even if this offloading is not complete.

(I think it may be obvious, and if so, I'm sorry, but I will leave
this for completeness.) There are two reasons why I don't want to
implement port MTU as northbound OVN API here.

1. This is a new feature and is not backportable. While I would like
to see this mis-behavior fixed in released branches.
2. This is a significant piece of work that, I am afraid, is
significantly more intrusive than the isolated implementation I have
here.

> Also, I think this patch should be squashed into patch 2/6.

I will do that. Alternatively, we could just rearrange patches so that
6/6 becomes 3/6. In this way, this "dynamic mtu" patch is still
separated logically, but on the other hand if-status is not utilized
until 5/6 is merged. (Before 5/6, nothing in the tree actually uses
the collected ->mtu from if-status-manager, so it's not bad per se
that it's not updated in the unusual situation where interface gets
its mtu updated during lifetime.)

Anyway, it's as easy to squash. So I'll do that.

>
> Regards,
> Dumitru
>
> >
> > Signed-off-by: Ihar Hrachyshka 
> > ---
> >  controller/if-status.c  | 23 +++-
> >  controller/if-status.h  |  3 ++
> >  controller/ovn-controller.c | 73 +
> >  controller/ovsport.c|  9 +
> >  controller/ovsport.h|  2 +
> >  controller/physical.c   |  1 -
> >  controller/physical.h   |  1 +
> >  tests/ovn.at| 66 +++--
> >  8 files changed, 164 insertions(+), 14 deletions(-)
> >
> > diff --git a/controller/if-status.c b/controller/if-status.c
> > index e60156c4a..1cdd893ab 100644
> > --- a/controller/if-status.c
> > +++ b/controller/if-status.c
> > @@ -18,6 +18,7 @@
> >  #include "binding.h"
> >  #include "if-status.h"
> >  #include "ofctrl-seqno.h"
> > +#include "ovsport.h"
> >  #include "simap.h"
> >
> >  #include "lib/hmapx.h"
> > @@ -500,15 +501,6 @@ ovs_iface_account_mem(const char *iface_id, bool erase)
> >  }
> >  }
> >
> > -static uint16_t
> > -get_iface_mtu(const struct ovsrec_interface *iface)
> > -{
> > -if (!iface || !iface->n_mtu || iface->mtu[0] <= 0) {
> > -return 0;
> > -}
> > -return (uint16_t) iface->mtu[0];
> > -}
> > -
> >  uint16_t
> >  if_status_mgr_iface_get_mtu(const struct if_status_mgr *mgr,
> >  const char *iface_id)
> > @@ -517,6 +509,19 @@ if_status_mgr_iface_get_mtu(const struct if_status_mgr 
> > *mgr,
> >  return iface ? iface->mtu : 0;
> >  }
> >
> > +bool
> > +if_status_mgr_iface_set_mtu(const struct if_status_mgr *mgr,
> > +const char *iface_id,
> > +uint16_t mtu)
> > +{
> > +struct ovs_iface *iface = shash_find_data(>ifaces, iface_id);
> > +if (iface && iface->mtu != mtu) {
> > +iface->mtu = mtu;
> > +return true;
> > +}
> > +return false;
> > +}
> > +
> >  static struct ovs_iface *
> >  ovs_iface_create(struct if_status_mgr *mgr, const char *iface_id,
> >   const struct ovsrec_interface *iface_rec,
> > diff --git a/controller/if-status.h b/controller/if-status.h
> > index 8186bdf08..b11d4cd61 100644
> > --- a/controller/if-status.h
> > +++ b/controller/if-status.h
> > @@ -47,6 +47,9 

Re: [ovs-dev] [PATCH ovn v2 2/6] Track interface MTU in if-status-mgr

2023-05-17 Thread Ihar Hrachyshka
On Tue, May 16, 2023 at 9:40 AM Dumitru Ceara  wrote:
>
> On 5/3/23 22:13, Ihar Hrachyshka wrote:
> > This will be used in a later patch to calculate the effective interface
> > MTU after considering tunneling overhead.
> >
> > Signed-off-by: Ihar Hrachyshka 
> > ---
> >  controller/binding.c   |  4 ++--
> >  controller/if-status.c | 31 +++
> >  controller/if-status.h |  3 +++
> >  3 files changed, 32 insertions(+), 6 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index 5df62baef..561b857fa 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -1228,7 +1228,7 @@ claim_lport(const struct sbrec_port_binding *pb,
> >  }
> >  set_pb_chassis_in_sbrec(pb, chassis_rec, true);
> >  } else {
> > -if_status_mgr_claim_iface(if_mgr, pb, chassis_rec,
> > +if_status_mgr_claim_iface(if_mgr, pb, chassis_rec, 
> > iface_rec,
> >sb_readonly);
> >  }
> >  register_claim_timestamp(pb->logical_port, now);
> > @@ -1241,7 +1241,7 @@ claim_lport(const struct sbrec_port_binding *pb,
> >  } else {
> >  if (pb->n_up && !pb->up[0]) {
> >  if_status_mgr_claim_iface(if_mgr, pb, chassis_rec,
> > -  sb_readonly);
> > +  iface_rec, sb_readonly);
> >  }
> >  }
> >  }
> > diff --git a/controller/if-status.c b/controller/if-status.c
> > index d1c14ac30..f2ea21635 100644
> > --- a/controller/if-status.c
> > +++ b/controller/if-status.c
> > @@ -24,6 +24,7 @@
> >  #include "lib/util.h"
> >  #include "timeval.h"
> >  #include "openvswitch/vlog.h"
> > +#include "lib/vswitch-idl.h"
> >  #include "lib/ovn-sb-idl.h"
> >
> >  VLOG_DEFINE_THIS_MODULE(if_status);
> > @@ -146,6 +147,7 @@ struct ovs_iface {
> >   * be fully programmed in OVS.  Only used in 
> > state
> >   * OIF_INSTALL_FLOWS.
> >   */
> > +uint16_t mtu;   /* Extracted from OVS interface.mtu field. */
> >  };
> >
> >  static uint64_t ifaces_usage;
> > @@ -167,9 +169,10 @@ struct if_status_mgr {
> >  uint32_t iface_seqno;
> >  };
> >
> > -static struct ovs_iface *ovs_iface_create(struct if_status_mgr *,
> > -  const char *iface_id,
> > -  enum if_state );
> > +static struct ovs_iface *
> > +ovs_iface_create(struct if_status_mgr *, const char *iface_id,
> > + const struct ovsrec_interface *iface_rec,
> > + enum if_state);
> >  static void ovs_iface_destroy(struct if_status_mgr *, struct ovs_iface *);
> >  static void ovs_iface_set_state(struct if_status_mgr *, struct ovs_iface *,
> >  enum if_state);
> > @@ -222,13 +225,14 @@ void
> >  if_status_mgr_claim_iface(struct if_status_mgr *mgr,
> >const struct sbrec_port_binding *pb,
> >const struct sbrec_chassis *chassis_rec,
> > +  const struct ovsrec_interface *iface_rec,
> >bool sb_readonly)
> >  {
> >  const char *iface_id = pb->logical_port;
> >  struct ovs_iface *iface = shash_find_data(>ifaces, iface_id);
> >
> >  if (!iface) {
> > -iface = ovs_iface_create(mgr, iface_id, OIF_CLAIMED);
> > +iface = ovs_iface_create(mgr, iface_id, iface_rec, OIF_CLAIMED);
> >  }
> >
> >  if (!sb_readonly) {
> > @@ -492,14 +496,33 @@ ovs_iface_account_mem(const char *iface_id, bool 
> > erase)
> >  }
> >  }
> >
> > +static uint16_t
> > +get_iface_mtu(const struct ovsrec_interface *iface)
> > +{
> > +if (!iface || !iface->n_mtu || iface->mtu[0] <= 0) {
> > +return 0;
> > +}
> > +return (uint16_t) iface->mtu[0];
> > +}
>
> We try to not use SB/OVS record fields inside the if-status module.  We
> only use those records as "opaque" pointers.  Can we move the mtu
> extraction outside if-status and just pass the uint16_t mtu as an
> argument to if_status_mgr_claim_iface()?

This is definitely not hard to do. But the rationale that I had to
pass a pointer and allow the module to extract the value it needs is
that in the future the module may need to update some more data
relevant to the interface on claim attempt, then we'd have to pass
another field. It seems to me that the if-status module (the module
that is supposed to track relevant data about interfaces) can be
reasonably expected to interpret interface records.

I am happy to oblige if you feel otherwise, I just wanted to share my
rationale first, because it wasn't accidental that I pass iface and
not mtu (I actually had mtu extraction outside the module in one of my
draft versions, then I changed to the 

Re: [ovs-dev] [PATCH v8 ovn 04/10] controller: configure qos through ovs qos table and do not run tc directly

2023-05-17 Thread Mark Michelson

Hi Lorenzo,

I'm only replying to this patch since as far as I'm concerned, the rest 
of the patches look good enough to me. I'll withhold ACKs until I can 
ACK the whole series.


See below for comments.

On 5/17/23 05:01, Lorenzo Bianconi wrote:

Rework OVN QoS implementation in order to configure it through OVS QoS
table instead of running tc command directly bypassing OVS.

Acked-By: Ihar Hrachyshka 
Reviewed-by: Simon Horman 
Tested-by: Rodolfo Alonso 
Signed-off-by: Lorenzo Bianconi 
---
  controller/binding.c| 432 +++-
  controller/ovn-controller.c |   7 +
  tests/ovn-performance.at|   5 -
  tests/system-ovn.at |  67 +-
  4 files changed, 298 insertions(+), 213 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 91437dbb8..e18488c7e 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -181,222 +181,268 @@ destroy_qos_map(struct hmap *qos_map)
  hmap_destroy(qos_map);
  }
  
-static void

-get_qos_queue(const struct sbrec_port_binding *pb, struct hmap *queue_map)
+static const struct ovsrec_interface *
+get_qos_egress_port_interface(struct shash *bridge_mappings,
+  const struct ovsrec_port **pport,
+  const char *network)
  {
-uint32_t min_rate = smap_get_int(>options, "qos_min_rate", 0);
-uint32_t max_rate = smap_get_int(>options, "qos_max_rate", 0);
-uint32_t burst = smap_get_int(>options, "qos_burst", 0);
-uint32_t queue_id = smap_get_int(>options, "qdisc_queue_id", 0);
-const char *network = smap_get(>options, "qos_physical_network");
+struct ovsrec_bridge *br_ln = shash_find_data(bridge_mappings, network);
+if (!br_ln) {
+return NULL;
+}
  
-if ((!min_rate && !max_rate && !burst) || !queue_id) {

-/* Qos is not configured for this port. */
-return;
+/* Add egress-ifaces from the connected bridge */
+for (size_t i = 0; i < br_ln->n_ports; i++) {
+const struct ovsrec_port *port = br_ln->ports[i];
+for (size_t j = 0; j < port->n_interfaces; j++) {
+const struct ovsrec_interface *iface = port->interfaces[j];
+
+if (smap_get(>external_ids, "iface-id")) {
+continue;
+}


Why do we skip interfaces with an iface-id in external_ids?


+
+bool is_egress_iface = smap_get_bool(>external_ids,
+ "ovn-egress-iface", false);
+if (is_egress_iface) {
+*pport = port;
+return iface;
+}
+}
  }
  
-uint32_t hash = hash_string(pb->logical_port, 0);

-struct qos_queue *q = find_qos_queue(queue_map, hash, pb->logical_port);
-if (!q) {
-q = xzalloc(sizeof *q);
-hmap_insert(queue_map, >node, hash);
-q->port = xstrdup(pb->logical_port);
-q->queue_id = queue_id;
+return NULL;
+}
+
+/* 34359738360 == (2^32 - 1) * 8.  netdev_set_qos() doesn't support
+ * 64-bit rate netlink attributes, so the maximum value is 2^32 - 1
+ * bytes. The 'max-rate' config option is in bits, so multiplying by 8.
+ * Without setting max-rate the reported link speed will be used, which
+ * can be unrecognized for certain NICs or reported too low for virtual
+ * interfaces. */
+#define OVN_QOS_MAX_RATE34359738360ULL
+static void
+add_ovs_qos_table_entry(struct ovsdb_idl_txn *ovs_idl_txn,
+const struct ovsrec_port *port,
+uint32_t min_rate, uint32_t max_rate,
+uint32_t burst, uint32_t queue_id,
+const char *ovn_port)
+{
+struct smap external_ids = SMAP_INITIALIZER(_ids);
+struct smap other_config = SMAP_INITIALIZER(_config);
+
+const struct ovsrec_qos *qos = port->qos;
+if (qos && smap_get_bool(>external_ids, "ovn_qos", false)) {
+/* External configured QoS, do not overwrite it. */


The code and comment do not match. I think the comment is correct, and 
the if should be:


if (qos && !smap_get_bool(>external_ids, "ovn_qos", false))


+return;
  }
  
-free(q->network);

-if (network) {
-q->network = xstrdup(network);
+if (!qos) {
+qos = ovsrec_qos_insert(ovs_idl_txn);
+ovsrec_qos_set_type(qos, OVN_QOS_TYPE);
+ovsrec_port_set_qos(port, qos);
+smap_add_format(_config, "max-rate", "%lld", OVN_QOS_MAX_RATE);
+ovsrec_qos_set_other_config(qos, _config);
+smap_clear(_config);
+
+smap_add(_ids, "ovn_qos", "true");
+ovsrec_qos_set_external_ids(qos, _ids);
+smap_clear(_ids);
  }
-q->min_rate = min_rate;
-q->max_rate = max_rate;
-q->burst = burst;
-}
  
-static const struct ovsrec_qos *

-get_noop_qos(struct ovsdb_idl_txn *ovs_idl_txn,
- const struct ovsrec_qos_table *qos_table)
-{
-const struct ovsrec_qos *qos;
-OVSREC_QOS_TABLE_FOR_EACH (qos, 

[ovs-dev] [PATCH 1/2] stream-ssl: Disable alerts on unexpected EOF.

2023-05-17 Thread Ilya Maximets
OpenSSL 3.0 enabled alerts for unexpected EOF by default.  It supposed
to alert the application whenever the connection terminated without
a proper close_notify.  And that should allow applications to take
actions to protect themselves from potential TLS truncation attack.
This is how it looks like in the log:

 |stream_ssl|WARN|SSL_read: error:0A000126:SSL routines::unexpected eof while 
reading
 |jsonrpc|WARN|ssl:127.0.0.1:34288: receive error: Input/output error
 |reconnect|WARN|ssl:127.0.0.1:34288: connection dropped (Input/output error)

The problem is that clients based on OVS libraries do not wait for
the proper termination if it didn't happen right away.  It means that
chances to have alerts on the server side for every single disconnection
are very high.

None of the high level protocols supported by OVS daemons can carry
state between re-connections, e.g., there are no session cookies or
anything like that.  So, the TLS truncation attack is no applicable.

Disable the alert to avoid unnecessary warnings in the log.

Signed-off-by: Ilya Maximets 
---
 lib/stream-ssl.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/lib/stream-ssl.c b/lib/stream-ssl.c
index 62da9febb..86747e58b 100644
--- a/lib/stream-ssl.c
+++ b/lib/stream-ssl.c
@@ -1075,7 +1075,13 @@ do_ssl_init(void)
 VLOG_ERR("SSL_CTX_new: %s", ERR_error_string(ERR_get_error(), NULL));
 return ENOPROTOOPT;
 }
-SSL_CTX_set_options(ctx, SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3);
+
+long options = SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3;
+#ifdef SSL_OP_IGNORE_UNEXPECTED_EOF
+options |= SSL_OP_IGNORE_UNEXPECTED_EOF;
+#endif
+SSL_CTX_set_options(ctx, options);
+
 #if OPENSSL_VERSION_NUMBER < 0x300fL
 SSL_CTX_set_tmp_dh_callback(ctx, tmp_dh_callback);
 #else
-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 2/2] tests: Check ovsdb-server logs in OVSDB tests.

2023-05-17 Thread Ilya Maximets
Many OVSDB tests are not checking the server log for warnings or
errors.  Some are not even using the log file.  It's mostly OK as we're
usually checking the user-visible behavior.  But it would also be nice
to detect some internal warnings if there are some.

Moving the OVSDB_SERVER_SHUTDOWN macro to the common place, adding
the call to check_logs into it and making OVSDB tests use this macro.

Signed-off-by: Ilya Maximets 
---
 tests/ovsdb-client.at  | 12 +++---
 tests/ovsdb-idl.at | 10 -
 tests/ovsdb-lock.at| 10 ++---
 tests/ovsdb-macros.at  | 12 ++
 tests/ovsdb-monitor.at | 88 
 tests/ovsdb-server.at  | 91 +-
 tests/ovsdb-tool.at| 44 ++--
 tests/vtep-ctl.at  |  2 +-
 8 files changed, 152 insertions(+), 117 deletions(-)

diff --git a/tests/ovsdb-client.at b/tests/ovsdb-client.at
index 2d14f1ac2..68fb962bd 100644
--- a/tests/ovsdb-client.at
+++ b/tests/ovsdb-client.at
@@ -5,7 +5,7 @@ AT_KEYWORDS([ovsdb client positive])
 ordinal_schema > schema
 on_exit 'kill `cat *.pid`'
 AT_CHECK([ovsdb-tool create db schema], [0], [ignore], [ignore])
-AT_CHECK([ovsdb-server --detach --no-chdir --pidfile --remote=punix:socket 
db], [0], [ignore], [ignore])
+AT_CHECK([ovsdb-server --detach --no-chdir --log-file --pidfile 
--remote=punix:socket db], [0], [ignore], [ignore])
 AT_CHECK([ovsdb-client get-schema-version unix:socket ordinals], [0], [5.1.3
 ])
 AT_CHECK([ovsdb-client get-schema-cksum unix:socket ordinals], [0], [12345678 9
@@ -19,7 +19,7 @@ on_exit 'kill `cat *.pid`'
 ordinal_schema > schema
 touch .db.~lock~
 AT_CHECK([ovsdb-tool create db schema], [0], [], [ignore])
-AT_CHECK([ovsdb-server --detach --no-chdir --pidfile --remote=punix:socket 
db], [0], [ignore], [ignore])
+AT_CHECK([ovsdb-server --detach --no-chdir --log-file --pidfile 
--remote=punix:socket db], [0], [ignore], [ignore])
 AT_CHECK([ovsdb-client needs-conversion unix:socket schema], [0], [no
 ])
 OVSDB_SERVER_SHUTDOWN
@@ -31,7 +31,7 @@ ordinal_schema > schema
 touch .db.~lock~
 on_exit 'kill `cat *.pid`'
 AT_CHECK([ovsdb-tool create db schema], [0], [], [ignore])
-AT_CHECK([ovsdb-server --detach --no-chdir --pidfile --remote=punix:socket 
db], [0], [ignore], [ignore])
+AT_CHECK([ovsdb-server --detach --no-chdir --log-file --pidfile 
--remote=punix:socket db], [0], [ignore], [ignore])
 sed 's/5\.1\.3/5.1.4/' < schema > schema2
 AT_CHECK([diff schema schema2], [1], [ignore])
 AT_CHECK([ovsdb-client needs-conversion unix:socket schema2], [0], [yes
@@ -134,7 +134,7 @@ _uuid name number
 ])
 
 dnl Stop the database server, then re-start it based on the backup.
-OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+OVSDB_SERVER_SHUTDOWN
 AT_CHECK([ovsdb-server -vfile -vvlog:off --detach --no-chdir --pidfile 
--log-file --remote=punix:db.sock backup], [0])
 
 dnl Dump a new copy of the data.
@@ -195,7 +195,7 @@ ordinals table
 _uuid,name,number
 ])
 
-OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+OVSDB_SERVER_SHUTDOWN
 
 AT_CLEANUP
 
@@ -254,7 +254,7 @@ _uuid,name,number
 ])
 
 dnl Stopping the server.
-OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+OVSDB_SERVER_SHUTDOWN
 dnl ovsdb-client should exit by itself after disconnection form the server.
 OVS_WAIT_WHILE([test -e ovsdb-client.pid])
 
diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
index 978a6677b..df5a9d2fd 100644
--- a/tests/ovsdb-idl.at
+++ b/tests/ovsdb-idl.at
@@ -278,7 +278,10 @@ m4_define([OVSDB_CHECK_IDL_SSL_PY],
 [0], [stdout], [ignore])
AT_CHECK([sort stdout | uuidfilt]m4_if([$6],,, [[| $6]]),
 [0], [$4])
-   OVSDB_SERVER_SHUTDOWN
+   OVSDB_SERVER_SHUTDOWN(["
+ /unexpected SSL connection close/d
+ /Protocol error/d
+   "])
AT_CLEANUP])
 
 m4_define([OVSDB_CHECK_IDL],
@@ -2309,7 +2312,10 @@ m4_define([CHECK_STREAM_OPEN_BLOCK],
AT_CHECK([$2 PROTOCOL:$4:$TCP_PORT $SSL_KEY_ARGS], [0], [ignore])
AT_CHECK([$2 PROTOCOL:$4:$WRONG_PORT $SSL_KEY_ARGS], [1], [ignore],
 [ignore])
-   OVSDB_SERVER_SHUTDOWN
+   OVSDB_SERVER_SHUTDOWN(["
+ /unexpected SSL connection close/d
+ /Protocol error/d
+   "])
AT_CHECK([$2 PROTOCOL:$4:$TCP_PORT $SSL_KEY_ARGS], [1], [ignore], [ignore])
AT_CLEANUP])
 
diff --git a/tests/ovsdb-lock.at b/tests/ovsdb-lock.at
index a3acd2f27..6bc247302 100644
--- a/tests/ovsdb-lock.at
+++ b/tests/ovsdb-lock.at
@@ -12,8 +12,8 @@ m4_define([OVSDB_CHECK_LOCK_SETUP],
 AT_KEYWORDS([ovsdb lock $2])
 ordinal_schema > schema
 AT_CHECK([ovsdb-tool create db schema], [0], [stdout], [ignore])
-AT_CAPTURE_FILE([ovsdb-server-log])
-AT_CHECK([ovsdb-server --detach --no-chdir --pidfile --remote=punix:socket 
--log-file="`pwd`"/ovsdb-server-log db >/dev/null 2>&1], [0], [], [])])
+AT_CAPTURE_FILE([ovsdb-server.log])
+AT_CHECK([ovsdb-server --detach --no-chdir --pidfile --remote=punix:socket 
--log-file db], [0], [ignore], [ignore])])
 
 #
 # Two sessions create two locks. Both sessions should be able 

[ovs-dev] [PATCH 0/2] ssl: Ignore unexpected EOF + ovsdb log checking in tests.

2023-05-17 Thread Ilya Maximets
This patch set removes annoying EOF alerts from ovsdb-server logs
in case of OpenSSL 3.0+ and enables checking of server logs in many
existing tests.

Ilya Maximets (2):
  stream-ssl: Disable alerts on unexpected EOF.
  tests: Check ovsdb-server logs in OVSDB tests.

 lib/stream-ssl.c   |  8 +++-
 tests/ovsdb-client.at  | 12 +++---
 tests/ovsdb-idl.at | 10 -
 tests/ovsdb-lock.at| 10 ++---
 tests/ovsdb-macros.at  | 12 ++
 tests/ovsdb-monitor.at | 88 
 tests/ovsdb-server.at  | 91 +-
 tests/ovsdb-tool.at| 44 ++--
 tests/vtep-ctl.at  |  2 +-
 9 files changed, 159 insertions(+), 118 deletions(-)

-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 5/6] Implement MTU Path Discovery for multichassis ports

2023-05-17 Thread Ihar Hrachyshka
Thank you Dumitru! See below.

On Tue, May 16, 2023 at 9:41 AM Dumitru Ceara  wrote:
> I'm not necessarily rejecting this change.  I just wanted to bring up an
> alternative approach (I'm not sure if it's possible to implement it though):
>
> The CMS (e.g., Neutron) probably knows before hand the reduced MTU value
> it should impose on the localnet port.  That means it might be possible
> to just implement all this through logical flows that use
> 'check_pkt_larger()'.
>

I think I mentioned this alternative in my cover letter for the
series. (At least one of the variations.)

I don't understand the specifics of what you are describing here
though. Yes, CMS may do calculations for the "desired path MTU". But
what's the mechanism for CMS to pass this information down to the OVN
controller? ACL? I don't think the space of possible actions that
would allow CMS itself to define the behavior is available to CMS.

Perhaps you mean (as I think we discussed before in private) the
approach where OVN northd creates Logical_Flow objects that would
implement the behavior. If so, yes that could be possible. But note
that northd doesn't know the desired path MTU (because it's not CMS
and OVN northbound database API doesn't support MTU primitive for
ports.) So the best we could do in the alternative way is implement
half of the solution via Logical_Flow (the one that sends ICMP replies
for "too-big" frames), leaving the other half (tagging the "too-big"
frames based on interface MTU for processing by the Logical_Flows) to
OVN controller. I think I listed this variation of the approach in my
cover letter, but please let me know if it's not sufficient.

(The split could be avoided if OVN adds a MTU attribute to its ports.
But I of course am not trying to tackle this in this series, for
obvious reasons. But MTU for ports is - in my mind - a worthy feature
to have in OVN, if you ask me.)

> In any case, the code in this patch looks OK to me overall.  I have a
> few minor comments below.
>
> >
> > Fixes: 7084cf437421 ("Always funnel multichassis port traffic through 
> > tunnels")
> >
> > Signed-off-by: Ihar Hrachyshka 
> > ---
> >  NEWS|   6 +
> >  controller/ovn-controller.c |   3 +
> >  controller/physical.c   | 293 +++-
> >  controller/physical.h   |   1 +
> >  lib/ovn-util.h  |  11 ++
> >  tests/ovn.at| 263 
> >  6 files changed, 572 insertions(+), 5 deletions(-)
> >
> > diff --git a/NEWS b/NEWS
> > index 60467581a..9d5eef268 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -14,6 +14,12 @@ Post v23.03.0
> >  existing behaviour of flooding these arp requests to all attached 
> > Ports.
> >- Always allow IPv6 Router Discovery, Neighbor Discovery, and Multicast
> >  Listener Discovery protocols, regardless of ACLs defined.
> > +  - Send ICMP Fragmentation Needed packets back to offending ports when
> > +communicating with multichassis ports using frames that don't fit 
> > through a
> > +tunnel. This is done only for logical switches that are attached to a
> > +physical network via a localnet port, in which case multichassis ports 
> > may
> > +have an effective MTU different from regular ports and hence may need 
> > this
> > +mechanism to maintain connectivity with other peers in the network.
> >
> >  OVN v23.03.0 - 03 Mar 2023
> >  --
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index c094cb74d..9359925fa 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -4083,6 +4083,9 @@ static void init_physical_ctx(struct engine_node 
> > *node,
> >  p_ctx->patch_ofports = _vif_data->patch_ofports;
> >  p_ctx->chassis_tunnels = _vif_data->chassis_tunnels;
> >
> > +struct controller_engine_ctx *ctrl_ctx = 
> > engine_get_context()->client_ctx;
> > +p_ctx->if_mgr = ctrl_ctx->if_mgr;
> > +
> >  pflow_output_get_debug(node, _ctx->debug);
> >  }
> >
> > diff --git a/controller/physical.c b/controller/physical.c
> > index 1b0482e3b..a2a25d067 100644
> > --- a/controller/physical.c
> > +++ b/controller/physical.c
> > @@ -41,6 +41,7 @@
> >  #include "lib/ovn-sb-idl.h"
> >  #include "lib/ovn-util.h"
> >  #include "ovn/actions.h"
> > +#include "if-status.h"
> >  #include "physical.h"
> >  #include "pinctrl.h"
> >  #include "openvswitch/shash.h"
> > @@ -91,6 +92,7 @@ physical_register_ovs_idl(struct ovsdb_idl *ovs_idl)
> >
> >  ovsdb_idl_add_table(ovs_idl, _table_interface);
> >  ovsdb_idl_track_add_column(ovs_idl, _interface_col_name);
> > +ovsdb_idl_track_add_column(ovs_idl, _interface_col_mtu);
> >  ovsdb_idl_track_add_column(ovs_idl, _interface_col_ofport);
> >  ovsdb_idl_track_add_column(ovs_idl, 
> > _interface_col_external_ids);
> >  }
> > @@ -1104,6 +1106,273 @@ setup_activation_strategy(const struct 
> > sbrec_port_binding *binding,
> >  }
> >  }

Re: [ovs-dev] [ovn] help wanted: how MC_FLOOD actions works for eth.mcast traffic coming from localnet lports?

2023-05-17 Thread Vladislav Odintsov
Thanks Dumitru for the response.

I’ve looked through this code and even tried to implement similar logic for 
vtep lport, but realized, that vtep mcast traffic is a bit different from 
localnet port.
mcast traffic originated in normal vif port in localnet-attached lswitch has 
two cases:
- if uplink bridge mapping is locally available, the mcast packet is sent to 
this interface assuming classic network switch outside of ovn will flood it in 
l2 domain
- if uplink bridge mapping is locally unawailable, such mcast packet is flooded 
in overlay. (honestly I didn’t find code related to this logic, I’ve found it 
just performing tests)

With same logic implemented in consider_mc_group() the BUM traffic originated 
in vif lport doesn’t reach remote chassis at all…

With vtep lport mcast traffic must be replicated always on source node, but 
it’s prohibited to re-send received from vtep lport mcast packets to other 
remote chassis.
I guess I need to reject mcast l2 flood to remote ports if REGBIT_FROM_RAMP == 
1.

Do you have an idea for the correct place to put such logic in?

> On 16 May 2023, at 23:02, Dumitru Ceara  wrote:
> 
> On 5/16/23 21:48, Vladislav Odintsov wrote:
>> Hi Numan, Dumitru, Ilya, Mark,
>> 
> 
> Hi Vladislav,
> 
>> if someone can help, I’ll be very grateful.
>> 
>> Thanks in advance.
>> 
>>> On 15 May 2023, at 15:06, Vladislav Odintsov  wrote:
>>> 
>>> Hi, I’m implementing neighbour learning on the chassis-redirect ports
>>> for traffic coming from lport of vtep type and have some
>>> misunderstanding of how MC_FLOOD action works with traffic coming from
>>> localnet lports. Imagine, we’ve got lswitch with 4 LSPs: - vtep lport
>>> (type vtep) - lrp (chassis-redirect configured router port on hv1) -
>>> lsp1 on hv1 - lsp2 on hv2 My problem is that mcast/bcast traffic
>>> coming from vtep lport got flooded to all logical ports of the
>>> lswitch, even for those which reside on another hypervisor (hv2 in
>>> this example), resulting in duplicated packets on lsp2 vif port. vtep
>>> switch performs source node replication of BUM traffic, so bcast
>>> packets reach normal vif lports + MC_FLOOD in ls_in_l2_lkup. I’m
>>> comparing behaviour with localnet port in same setup (lswitch with 2
>>> LSPs on different hypervisors) and see there is no such behaviour: packets
>>> don’t get flooded to vif lports on other chassis.
>>> I see that eth.mcast traffic is matched in ls_in_l2_lkup lflow table with
>>> the action 'outport = "_MC_flood"; output;', then it got flooded to only
>>> LRPs of LS and *local* vif lports. The question is - where is the
>>> logic, which prevents flooding from localnet to vif located on other
>>> chassis? Am I missing something? Let me know if any additional
>>> information is needed. Thanks!
>>> 
> 
> For localnet ports the key is in how ovn-controllers expand (L2)
> multicast_groups.  In consider_mc_group() [0]:
> 
>/* Go through all of the ports in the multicast group:
> *
> *- For remote ports, add the chassis to 'remote_chassis' or
> *  'vtep_chassis'.
> *
> *- For local ports (other than logical patch ports), add actions
> *  to 'ofpacts' to set the output port and resubmit.
> *
> *- For logical patch ports, add actions to 'remote_ofpacts'
> *  instead.  (If we put them in 'ofpacts', then the output
> *  would happen on every hypervisor in the multicast group,
> *  effectively duplicating the packet.)  The only exception
> *  is in case of transit switches (used by OVN-IC).  We need
> *  patch ports to be processed on the remote side, after
> *  crossing the AZ boundary.
> *
> *- For chassisredirect ports, add actions to 'ofpacts' to
> *  set the output port to be the router patch port for which
> *  the redirect port was added.
> */
> 
> Later on, we have a special check for switches with localnet
> ports and we don't forward traffic to remote ports in that
> case because multicast traffic reaches remote ports through
> the localnet directly:
> 
>} else if (!get_localnet_port(local_datapaths,
>  mc->datapath->tunnel_key)) {
>/* Add remote chassis only when localnet port not exist,
> * otherwise multicast will reach remote ports through localnet
> * port. */
>if (port->chassis) {
>if (chassis_is_vtep(port->chassis)) {
>sset_add(_chassis, port->chassis->name);
>} else {
>sset_add(_chassis, port->chassis->name);
>}
>}
>for (size_t j = 0; j < port->n_additional_chassis; j++) {
>if (chassis_is_vtep(port->additional_chassis[j])) {
>sset_add(_chassis,
> port->additional_chassis[j]->name);
>} else {
>sset_add(_chassis,
>

Re: [ovs-dev] [ovn] ha-chassis-group false positive failover

2023-05-17 Thread Numan Siddique
On Wed, May 17, 2023 at 4:44 AM Vladislav Odintsov  wrote:
>
>
>
> > On 16 May 2023, at 23:54, Numan Siddique  wrote:
> >
> > Hi Vladislav,
> >
> > Sorry for the late reply.
> >
> > PSB for few comments.
>
>
> Thanks for your reply!
> My answers are inline.
>
> >
> >
> >
> > On Tue, May 16, 2023 at 3:42 PM Vladislav Odintsov  > > wrote:
> >>
> >> Hi Numan, Dumitru, Ilya,
> >>
> >> I guess I’ve managed to deal with some of these problems and have some 
> >> conclusions.
> >> Please see inline.
> >>
> >>> On 14 Apr 2023, at 16:26, Vladislav Odintsov  wrote:
> >>>
> >>> And as a follow-up, I see sometimes next error message in ovn-controller 
> >>> log, though there is a configured much higher limit in seconds for 
> >>> inactivity probe value for the chassis:
> >>>
> >>
> >> ovn-controller establishes 3 openflow connections to local ovs-vswitchd 
> >> for next purposes:
> >> 1. openflow rules management (openflow probe interval is configured 
> >> through ovn-openflow-probe-interval external_ids setting [1]; default — 
> >> disabled [2]);
> >> 2. openflow features collection (hardcoded probe interval 5s [3]);
> >> 3. pinctrl thread (hardcoded probe interval 5s [4]).
> >>
> >>> ovs|04426|rconn|ERR|unix:/run/openvswitch/br-int.mgmt: no response to 
> >>> inactivity probe after 8 seconds, disconnecting
> >>>
> >>> ovs|26560|rconn|ERR|unix:/run/openvswitch/br-int.mgmt: no response to 
> >>> inactivity probe after 10 seconds, disconnecting
> >>>
> >>
> >> These messages most likely related to the connection for features 
> >> collection or pinctrl thread. So ovn-openflow-probe-interval has no matter 
> >> here.
> >>
> >>>
> >>> server with first log message:
> >>> # ovs-vsctl get open . external_ids:ovn-openflow-probe-interval
> >>> "30"
> >>>
> >>> server with second log message:
> >>> # ovs-vsctl get open . external_ids:ovn-openflow-probe-interval
> >>> "60"
> >>>
> >>>
> >>>
>  On 30 Mar 2023, at 23:19, Vladislav Odintsov  wrote:
> 
>  Hi all,
> 
>  I’m facing a false positive failover in the event of openflow connection 
>  got dropped after inactivity probe timeout due to high load of 
>  ovn-controller handling huge amount of ovsdb updates.
>  I wonder wether its a bug or I have to tune any specifics. See 
>  ovn-controller logs:
> 
>  2023-03-30T07:57:24.469Z|58440|inc_proc_eng|INFO|node: 
>  logical_flow_output, recompute (failed handler for input port_groups) 
>  took 35774ms
>  2023-03-30T07:57:28.521Z|58441|lflow_cache|INFO|Detected cache 
>  inactivity (last active 40006 ms ago): trimming cache
>  2023-03-30T07:57:28.528Z|58442|timeval|WARN|Unreasonably long 40084ms 
>  poll interval (32947ms user, 6968ms system)
>  2023-03-30T07:57:28.528Z|58443|timeval|WARN|faults: 43468 minor, 0 major
>  …..
>  2023-03-30T07:57:48.784Z|58496|reconnect|ERR|ssl::6642: no response 
>  to inactivity probe after 60.2 seconds, disconnecting
>  2023-03-30T07:57:48.784Z|58497|reconnect|INFO|ssl::6642: connection 
>  dropped
>  2023-03-30T07:57:48.905Z|58498|main|INFO|OVNSB commit failed, force 
>  recompute next time.
>  2023-03-30T07:57:49.786Z|58499|reconnect|INFO|ssl::6642: 
>  connecting...
>  2023-03-30T07:57:49.924Z|58500|reconnect|INFO|ssl::6642: connected
>  2023-03-30T07:57:29.781Z|58501|poll_loop|INFO|wakeup due to 0-ms timeout 
>  at lib/stream-ssl.c:838 (74% CPU usage)
>  2023-03-30T07:57:29.836Z|58502|poll_loop|INFO|wakeup due to 0-ms timeout 
>  at lib/stream-ssl.c:838 (74% CPU usage)
>  ….
>  2023-03-30T07:57:05.976Z|58545|poll_loop|INFO|wakeup due to 0-ms timeout 
>  at lib/stream-ssl.c:838 (101% CPU usage)
>  2023-03-30T07:57:07.002Z|58546|memory|INFO|peak resident set size grew 
>  54% in last 682257.2 seconds, from 4435696 kB to 6824820 kB
>  2023-03-30T07:57:07.002Z|58547|memory|INFO|idl-cells-OVN_Southbound:11491359
>   idl-cells-Open_vSwitch:14416 lflow-cache-entries-cache-expr:530298 
>  lflow-cache-entries-cache-matches:31770 lflow-cache-size-KB:769553 
>  local_datapath_usage-KB:630
>  ofctrl_desired_flow_usage-KB:577052 
>  ofctrl_installed_flow_usage-KB:409538 ofctrl_sb_flow_ref_usage-KB:273707
>  …
>  2023-03-30T07:57:31.667Z|58552|rconn|WARN|unix:/run/openvswitch/br-int.mgmt:
>   connection dropped (Broken pipe)
>  2023-03-30T07:57:31.739Z|58553|binding|INFO|Releasing lport cr-xxx from 
>  this chassis (sb_readonly=0)
>  2023-03-30T07:57:31.739Z|58554|if_status|WARN|Dropped 1 log messages in 
>  last 1254725 seconds (most recently, 1254725 seconds ago) due to 
>  excessive rate
>  2023-03-30T07:57:31.739Z|58555|if_status|WARN|Trying to release unknown 
>  interface cr-xxx
>  2023-03-30T07:57:31.743Z|58556|binding|INFO|Releasing lport cr-yyy from 
>  this chassis (sb_readonly=0)
>  

Re: [ovs-dev] [PATCH ovn] actions: Remove unused ovnfield_act_header structure.

2023-05-17 Thread Dumitru Ceara
On 5/16/23 18:25, Simon Horman wrote:
> On Mon, May 08, 2023 at 01:14:55PM +0200, Dumitru Ceara wrote:
>> It's not referenced anywhere and by the look of it it has never been.
>>
>> Signed-off-by: Dumitru Ceara 
> 
> Reviewed-by: Simon Horman 
> 

Thanks, Simon, for the review.  I applied the patch to the main branch.

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] tests: Fix flakiness of policy based routing on slower systems

2023-05-17 Thread Dumitru Ceara
On 5/10/23 13:41, Ales Musil wrote:
> The test expected that the packet statistics will be
> immediately reflected after packet inject, however that
> might not be true on slower systems. Use OVS_WAIT_UNTIL
> instead to ensure that the packet really went through.
> Also align the IPv4 and IPv6 variant and fix some checks
> that were suggesting flow statistics check, but checked
> logical flows instead.
> 
> Signed-off-by: Ales Musil 
> ---
> v2: Address comments from Xavier:
> Remove the additional sleep.
> Make sure that all statistics checks are done through
> OVS_WAIT_UNTIL.
> ---

Thanks, Ales and Xavier!

I applied this patch to the main branch and backported it to all stable
branches down to 22.03 LTS.

I folded in the incremental change posted below.  It replaces "wc -l"
with "-c" as argument to "grep".

Regards,
Dumitru

diff --git a/tests/ovn.at b/tests/ovn.at
index efd37658f2..2ef370c26d 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -7823,7 +7823,7 @@ OVS_WAIT_UNTIL([as pbr-hv ovs-appctl -t ovn-controller 
inject-pkt "$packet"])
 # Check if packet hit the drop policy
 OVS_WAIT_UNTIL([test "1" = "$(ovs-ofctl dump-flows br-int | \
 grep "nw_src=192.168.1.0/24,nw_dst=172.16.1.0/24 actions=drop" | \
-grep "priority=10" | grep "n_packets=1" | wc -l)"])
+grep "priority=10" | grep "n_packets=1" -c)"])
 
 # Expected to drop the packet.
 $PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" pbr-hv/vif2-tx.pcap > vif2.packets
@@ -7834,7 +7834,7 @@ AT_FAIL_IF([test "$rcvd_packet" != ""])
 check ovn-nbctl --wait=hv lr-policy-add R1 20 "ip4.src==192.168.1.0/24 && 
ip4.dst==172.16.1.0/24" allow
 
 # Check logical flow
-AT_CHECK([ovn-sbctl dump-flows | grep lr_in_policy | grep "192.168.1.0" | wc 
-l], [0], [dnl
+AT_CHECK([ovn-sbctl dump-flows | grep lr_in_policy | grep "192.168.1.0" -c], 
[0], [dnl
 2
 ])
 
@@ -7847,7 +7847,7 @@ OVS_WAIT_UNTIL([as pbr-hv ovs-appctl -t ovn-controller 
inject-pkt "$packet"])
 # Check if packet hit the allow policy
 OVS_WAIT_UNTIL([test "1" = "$(ovs-ofctl dump-flows br-int | \
 grep "nw_src=192.168.1.0/24,nw_dst=172.16.1.0/24" | \
-grep "priority=20" | grep "n_packets=1" | wc -l)"])
+grep "priority=20" | grep "n_packets=1" -c)"])
 
 # Expected packet has TTL decreased by 1
 expected="eth.src==$ls2_ro_mac && eth.dst==$ls2_p1_mac &&
@@ -7863,7 +7863,7 @@ check ovn-nbctl --wait=hv lr-policy-add R1 30 
"ip4.src==192.168.1.0/24 && ip4.ds
 # Check logical flow
 AT_CHECK([ovn-sbctl dump-flows | grep lr_in_policy | \
 grep "192.168.1.0" | \
-grep "priority=30" | wc -l], [0], [dnl
+grep "priority=30" -c], [0], [dnl
 1
 ])
 
@@ -7876,7 +7876,7 @@ OVS_WAIT_UNTIL([as pbr-hv ovs-appctl -t ovn-controller 
inject-pkt "$packet"])
 # Check if packet hit the allow policy
 OVS_WAIT_UNTIL([test "1" = "$(ovs-ofctl dump-flows br-int | \
 grep "nw_src=192.168.1.0/24,nw_dst=172.16.1.0/24" | \
-grep "priority=30" | grep "n_packets=1" | wc -l)"])
+grep "priority=30" | grep "n_packets=1" -c)"])
 echo "packet hit reroute policy"
 
 # Expected packet has TTL decreased by 1
@@ -7979,7 +7979,7 @@ ls3_p1_mac=00:00:00:01:02:05
 check ovn-nbctl --wait=sb lr-policy-add R1 10 "ip6.src==2001::/64 && 
ip6.dst==2002::/64" drop
 
 # Check logical flow
-AT_CHECK([ovn-sbctl dump-flows | grep lr_in_policy | grep "2001" | wc -l], 
[0], [dnl
+AT_CHECK([ovn-sbctl dump-flows | grep lr_in_policy | grep "2001" -c], [0], [dnl
 1
 ])
 
@@ -7993,7 +7993,7 @@ OVS_WAIT_UNTIL([as pbr-hv ovs-appctl -t ovn-controller 
inject-pkt "$packet"])
 # Check if packet hit the drop policy
 OVS_WAIT_UNTIL([test "1" = "$(ovs-ofctl dump-flows br-int | \
 grep "ipv6_src=2001::/64,ipv6_dst=2002::/64 actions=drop" | \
-grep "priority=10" | grep "n_packets=1" | wc -l)"])
+grep "priority=10" | grep "n_packets=1" -c)"])
 
 # Expected to drop the packet.
 $PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" pbr-hv/vif2-tx.pcap > vif2.packets
@@ -8003,7 +8003,7 @@ AT_FAIL_IF([test -s vif2.packets])
 check ovn-nbctl --wait=sb lr-policy-add R1 20 "ip6.src==2001::/64 && 
ip6.dst==2002::/64" allow
 
 # Check logical flow
-AT_CHECK([ovn-sbctl dump-flows | grep lr_in_policy | grep "2001" | wc -l], 
[0], [dnl
+AT_CHECK([ovn-sbctl dump-flows | grep lr_in_policy | grep "2001" -c], [0], [dnl
 2
 ])
 
@@ -8016,7 +8016,7 @@ OVS_WAIT_UNTIL([as pbr-hv ovs-appctl -t ovn-controller 
inject-pkt "$packet"])
 # Check if packet hit the allow policy
 OVS_WAIT_UNTIL([test "1" = "$(ovs-ofctl dump-flows br-int | \
 grep "ipv6_src=2001::/64,ipv6_dst=2002::/64"  | \
-grep "priority=20" | grep "n_packets=1" | wc -l)"])
+grep "priority=20" | grep "n_packets=1" -c)"])
 
 # Expected packet has TTL decreased by 1
 expected="eth.src==$ls2_ro_mac && eth.dst==$ls2_p1_mac &&
@@ -8032,7 +8032,7 @@ check ovn-nbctl --wait=sb lr-policy-add R1 30 
"ip6.src==2001::/64 && ip6.dst==20
 # Check logical flow
 AT_CHECK([ovn-sbctl dump-flows | grep lr_in_policy | \
 grep "2001" | \
-grep "priority=30" | wc -l], [0], [dnl
+

Re: [ovs-dev] [PATCH ovn] ovn-controller docs: fix typo in ovn-monitor-all description

2023-05-17 Thread Dumitru Ceara
On 5/16/23 17:46, Simon Horman wrote:
> On Fri, May 05, 2023 at 11:40:03AM +0300, Vladislav Odintsov wrote:
>> Make it more clear that ovn-monitor-all option has effect on OVN Southbound
>> database rather than local OVS.
>>
>> Reported-at: 
>> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-May/052421.html
>> Reported-by: Ilya Maximets 
>> Signed-off-by: Vladislav Odintsov 
> 
> Reviewed-by: Simon Horman 
> 

Thanks, Vladislav, Simon and Ilya!  I applied this to the main branch
and backported it to all stable branches down to 22.03 LTS.

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 0/4] remove unnecessary OVS_PACKED attribute from dhcpv6 structures

2023-05-17 Thread Dumitru Ceara
On 5/15/23 21:48, Mark Michelson wrote:
> Thanks Lorenzo,
> 
> For the series,
> 
> Acked-by: Mark Michelson 
> 

Thanks, Lorenzo and Mark!

I applied patches 2, 3 and 4 to the main branch.  I skipped the first
patch because I had already applied my (identical) version of it. [0]

Regards,
Dumitru

[0]
https://patchwork.ozlabs.org/project/ovn/patch/20230508111455.2745657-1-dce...@redhat.com/

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v9] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-05-17 Thread Balazs Nemeth
The only way that stats->{n_packets,n_bytes} would decrease is due to an
overflow, or if there are bugs in how statistics are handled. In the
past, there were multiple issues that caused a jump backward. A
workaround was in place to set the statistics to 0 in that case. When
this happened while the revalidator was under heavy load, the workaround
had an unintended side effect where should_revalidate returned false
causing the flow to be removed because the metric it calculated was
based on a bogus value. Since many of those bugs have now been
identified and resolved, there is no need to set the statistics to 0. In
addition, the (unlikely) overflow still needs to be handled
appropriately. If an unexpected jump does happen, just log it as a
warning.

Signed-off-by: Balazs Nemeth 
---
 ofproto/ofproto-dpif-upcall.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index cd57fdbd9..139eb6c77 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -2368,24 +2368,32 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
*ukey,
 enum reval_result result = UKEY_DELETE;
 struct dpif_flow_stats push;
 
-ofpbuf_clear(odp_actions);
-
 push.used = stats->used;
 push.tcp_flags = stats->tcp_flags;
-push.n_packets = (stats->n_packets > ukey->stats.n_packets
-  ? stats->n_packets - ukey->stats.n_packets
-  : 0);
-push.n_bytes = (stats->n_bytes > ukey->stats.n_bytes
-? stats->n_bytes - ukey->stats.n_bytes
-: 0);
+push.n_packets = stats->n_packets - ukey->stats.n_packets;
+push.n_bytes = stats->n_bytes - ukey->stats.n_bytes;
 
 if (stats->n_packets < ukey->stats.n_packets &&
 ukey->stats.n_packets < UINT64_THREE_QUARTERS) {
+static struct vlog_rate_limit rll = VLOG_RATE_LIMIT_INIT(1, 5);
+struct ds ds = DS_EMPTY_INITIALIZER;
+
 /* Report cases where the packet counter is lower than the previous
  * instance, but exclude the potential wrapping of an uint64_t. */
 COVERAGE_INC(ukey_invalid_stat_reset);
+
+odp_flow_key_format(ukey->key, ukey->key_len, );
+ds_put_cstr(, ", actions:");
+format_odp_actions(, odp_actions->data, odp_actions->size, NULL);
+
+VLOG_WARN_RL(, "Unexpected jump in packet stats from %"PRIu64
+" to %"PRIu64" when handling ukey %s",
+stats->n_packets, ukey->stats.n_packets, ds_cstr());
+ds_destroy();
 }
 
+ofpbuf_clear(odp_actions);
+
 if (need_revalidate) {
 if (should_revalidate(udpif, ukey, push.n_packets)) {
 if (!ukey->xcache) {
-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v8] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-05-17 Thread Balazs Nemeth
The only way that stats->{n_packets,n_bytes} would decrease is due to an
overflow, or if there are bugs in how statistics are handled. In the
past, there were multiple issues that caused a jump backward. A
workaround was in place to set the statistics to 0 in that case. When
this happened while the revalidator was under heavy load, the workaround
had an unintended side effect where should_revalidate returned false
causing the flow to be removed because the metric it calculated was
based on a bogus value. Since many of those bugs have now been
identified and resolved, there is no need to set the statistics to 0. In
addition, the (unlikely) overflow still needs to be handled
appropriately. If an unexpected jump does happen, just log it as a
warning.

Signed-off-by: Balazs Nemeth 
---
 ofproto/ofproto-dpif-upcall.c | 22 ++
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index cd57fdbd9..c3d207dc6 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -2368,24 +2368,30 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
*ukey,
 enum reval_result result = UKEY_DELETE;
 struct dpif_flow_stats push;
 
-ofpbuf_clear(odp_actions);
-
 push.used = stats->used;
 push.tcp_flags = stats->tcp_flags;
-push.n_packets = (stats->n_packets > ukey->stats.n_packets
-  ? stats->n_packets - ukey->stats.n_packets
-  : 0);
-push.n_bytes = (stats->n_bytes > ukey->stats.n_bytes
-? stats->n_bytes - ukey->stats.n_bytes
-: 0);
+push.n_packets = stats->n_packets - ukey->stats.n_packets;
+push.n_bytes = stats->n_bytes - ukey->stats.n_bytes;
 
 if (stats->n_packets < ukey->stats.n_packets &&
 ukey->stats.n_packets < UINT64_THREE_QUARTERS) {
+static struct vlog_rate_limit rll = VLOG_RATE_LIMIT_INIT(1, 5);
+struct ds ds = DS_EMPTY_INITIALIZER;
+
 /* Report cases where the packet counter is lower than the previous
  * instance, but exclude the potential wrapping of an uint64_t. */
 COVERAGE_INC(ukey_invalid_stat_reset);
+
+ds_put_cstr(, ", actions:");
+format_odp_actions(, odp_actions->data, odp_actions->size, NULL);
+
+VLOG_WARN_RL(, "Unexpected jump in packet stats from %"PRIu64
+" to %"PRIu64" when handling ukey %s",
+stats->n_packets, ukey->stats.n_packets, ds_cstr());
 }
 
+ofpbuf_clear(odp_actions);
+
 if (need_revalidate) {
 if (should_revalidate(udpif, ukey, push.n_packets)) {
 if (!ukey->xcache) {
-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v7] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-05-17 Thread Balazs Nemeth
The only way that stats->{n_packets,n_bytes} would decrease is due to an
overflow, or if there are bugs in how statistics are handled. In the
past, there were multiple issues that caused a jump backward. A
workaround was in place to set the statistics to 0 in that case. When
this happened while the revalidator was under heavy load, the workaround
had an unintended side effect where should_revalidate returned false
causing the flow to be removed because the metric it calculated was
based on a bogus value. Since many of those bugs have now been
identified and resolved, there is no need to set the statistics to 0. In
addition, the (unlikely) overflow still needs to be handled
appropriately. If an unexpected jump does happen, just log it as a
warning.

Signed-off-by: Balazs Nemeth 
---
 ofproto/ofproto-dpif-upcall.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index cd57fdbd9..3cf080c75 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -2372,18 +2372,17 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
*ukey,
 
 push.used = stats->used;
 push.tcp_flags = stats->tcp_flags;
-push.n_packets = (stats->n_packets > ukey->stats.n_packets
-  ? stats->n_packets - ukey->stats.n_packets
-  : 0);
-push.n_bytes = (stats->n_bytes > ukey->stats.n_bytes
-? stats->n_bytes - ukey->stats.n_bytes
-: 0);
+push.n_packets = stats->n_packets - ukey->stats.n_packets;
+push.n_bytes = stats->n_bytes - ukey->stats.n_bytes;
 
 if (stats->n_packets < ukey->stats.n_packets &&
 ukey->stats.n_packets < UINT64_THREE_QUARTERS) {
 /* Report cases where the packet counter is lower than the previous
  * instance, but exclude the potential wrapping of an uint64_t. */
 COVERAGE_INC(ukey_invalid_stat_reset);
+
+VLOG_WARN("Unexpected jump in packet stats from %"PRIu64" to %"PRIu64,
+stats->n_packets, ukey->stats.n_packets);
 }
 
 if (need_revalidate) {
-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3] Pass localnet traffic through CT when a LB is configured.

2023-05-17 Thread Numan Siddique
On Wed, May 10, 2023 at 4:05 AM Ales Musil  wrote:
>
> On Mon, May 8, 2023 at 5:19 PM Mark Michelson  wrote:
>
> > Current code always skips conntrack for traffic that ingresses or
> > egresses on a localnet port. However, this makes it impossible for
> > traffic to be load-balanced when it arrives on a localnet port.
> >
> > This patch allows for traffic to be load balanced on localnet ports by
> > making two changes:
> > * Localnet ports now have a conntrack zone assigned.
> > * When a load balancer is configured on a logical switch containing a
> >   localnet port, then conntrack is no longer skipped for traffic
> >   involving the localnet port.
> >
> > Co-authored by: Dumitru Ceara 
> > Signed-off-by: Dumitru Ceara 
> > Signed-off-by: Mark Michelson 
> > ---
> > v2 -> v3:
> >  * We now unconditionally skip localnet ports for ACL-related conntrack.
> >  * There is a CT dump check in the system test.
> > ---
> >  controller/ovn-controller.c | 13 +++---
> >  northd/northd.c | 18 +---
> >  tests/ovn-northd.at | 60 +
> >  tests/system-ovn.at | 90 +
> >  4 files changed, 170 insertions(+), 11 deletions(-)
> >
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index de90025f0..e22cd1eb9 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -708,7 +708,7 @@ get_snat_ct_zone(const struct sbrec_datapath_binding
> > *dp)
> >  }
> >
> >  static void
> > -update_ct_zones(const struct shash *binding_lports,
> > +update_ct_zones(const struct sset *local_lports,
> >  const struct hmap *local_datapaths,
> >  struct simap *ct_zones, unsigned long *ct_zone_bitmap,
> >  struct shash *pending_ct_zones)
> > @@ -721,9 +721,9 @@ update_ct_zones(const struct shash *binding_lports,
> >  unsigned long unreq_snat_zones_map[BITMAP_N_LONGS(MAX_CT_ZONES)];
> >  struct simap unreq_snat_zones = SIMAP_INITIALIZER(_snat_zones);
> >
> > -struct shash_node *shash_node;
> > -SHASH_FOR_EACH (shash_node, binding_lports) {
> > -sset_add(_users, shash_node->name);
> > +const char *local_lport;
> > +SSET_FOR_EACH (local_lport, local_lports) {
> > +sset_add(_users, local_lport);
> >  }
> >
> >  /* Local patched datapath (gateway routers) need zones assigned. */
> > @@ -2377,7 +2377,7 @@ en_ct_zones_run(struct engine_node *node, void *data)
> >  EN_OVSDB_GET(engine_get_input("OVS_bridge", node));
> >
> >  restore_ct_zones(bridge_table, ovs_table, ct_zones_data);
> > -update_ct_zones(_data->lbinding_data.lports,
> > _data->local_datapaths,
> > +update_ct_zones(_data->local_lports, _data->local_datapaths,
> >  _zones_data->current, ct_zones_data->bitmap,
> >  _zones_data->pending);
> >
> > @@ -2467,7 +2467,8 @@ ct_zones_runtime_data_handler(struct engine_node
> > *node, void *data)
> >  SHASH_FOR_EACH (shash_node, >lports) {
> >  struct tracked_lport *t_lport = shash_node->data;
> >  if (strcmp(t_lport->pb->type, "")
> > -&& strcmp(t_lport->pb->type, "localport")) {
> > +&& strcmp(t_lport->pb->type, "localport")
> > +&& strcmp(t_lport->pb->type, "localnet")) {
> >  /* We allocate zone-id's only to VIF and localport
> > lports. */

I suppose you can update the comment mentioning localnet ports too.


> >  continue;
> >  }
> > diff --git a/northd/northd.c b/northd/northd.c
> > index b69fcf321..41d0f5994 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -5968,7 +5968,8 @@ build_pre_acls(struct ovn_datapath *od, const struct
> > hmap *port_groups,
> >  }
> >  for (size_t i = 0; i < od->n_localnet_ports; i++) {
> >  skip_port_from_conntrack(od, od->localnet_ports[i],
> > - S_SWITCH_IN_PRE_ACL,
> > S_SWITCH_OUT_PRE_ACL,
> > + S_SWITCH_IN_PRE_ACL,
> > + S_SWITCH_OUT_PRE_ACL,
> >   110, lflows);
> >  }
> >
> > @@ -6137,10 +6138,17 @@ build_pre_lb(struct ovn_datapath *od, const struct
> > shash *meter_groups,
> >   S_SWITCH_IN_PRE_LB, S_SWITCH_OUT_PRE_LB,
> >   110, lflows);
> >  }
> > -for (size_t i = 0; i < od->n_localnet_ports; i++) {
> > -skip_port_from_conntrack(od, od->localnet_ports[i],
> > - S_SWITCH_IN_PRE_LB, S_SWITCH_OUT_PRE_LB,
> > - 110, lflows);
> > +/* Localnet ports have no need for going through conntrack, unless
> > + * the logical switch has a load balancer. Then, conntrack is
> > necessary
> > + * so that traffic arriving via the localnet port can be load
> > +   

Re: [ovs-dev] Official Way For Rebuilding OVS

2023-05-17 Thread Lazuardi Nasution
Hi Frode,

Any comparison between using dpkg-buildpackage and debhelper?

What is the official way for including custom patches? Just edit the source
code files? What about custom patches naming?

Best regards.


On Wed, May 17, 2023, 2:28 AM Frode Nordahl 
wrote:

>
>
> tir. 16. mai 2023, 19:03 skrev Ilya Maximets :
>
>> On 5/16/23 17:39, Lazuardi Nasution wrote:
>> > Hi,
>> >
>> > I'm trying to rebuild OVS package by using "dpkg-buildpackage -b"
>> command.
>> > But the built binary size is always much bigger than repo binary. Is
>> there
>> > any official way to do that so the rebuilt binary will be near the repo
>> > binary?
>>
>> The official way is documented here:
>>   https://docs.openvswitch.org/en/latest/intro/install/debian/
>>
>> >
>> > Rebuilt Binary:
>> > ubuntu@jammy:~$ ls -lah
>> ./openvswitch-3.0.3/_debian/vswitchd/ovs-vswitchd
>> > -rwxr-xr-x 1 ubuntu ubuntu *17M* May 16 22:25
>> > ./openvswitch-3.0.3/_debian/vswitchd/ovs-vswitchd
>> >
>> > Repo Binary:
>> > ubuntu@jammy:~$ ls -lah /usr/lib/openvswitch-switch/ovs-vswitchd
>> > -rwxr-xr-x 1 root root *2.7M* Feb 27 20:18
>> > /usr/lib/openvswitch-switch/ovs-vswitchd
>>
>> I'm not sure what exactly dpkg-buildpackage does and why you end up
>> with a different size binary.  Most likely your binary doesn't have
>> debuginfo stripped out while the official one does.  It might also
>> be possible that your package is built with DPDK, while the official
>> openvswitch-switch package is not.
>>
>
> The upstream OVS packages make use of debhelper [0], just like Debian and
> Ubuntu. Recent versions of debhelper will automatically strip out debug
> information and put into separate packages at package build time.
>
> I see you are comparing file sizes using the artifact in the build tree
> above, and I think you'll find that the binary size is smaller if you look
> inside the built deb in the ../ directory instead.
>
> 0: https://manpages.ubuntu.com/manpages/kinetic/man7/debhelper.7.html
>
> --
> Frode Nordahl
>
> Best regards, Ilya Maximets.
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-05-17 Thread Ilya Maximets
On 5/17/23 14:50, Balazs Nemeth wrote:
> The only way that stats->{n_packets,n_bytes} would decrease is due to an
> overflow, or if there are bugs in how statistics are handled. In the
> past, there were multiple issues that caused a jump backward. A
> workaround was in place to set the statistics to 0 in that case. When
> this happened while the revalidator was under heavy load, the workaround
> had an unintended side effect where should_revalidate returned false
> causing the flow to be removed because the metric it calculated was
> based on a bogus value. Since many of those bugs have now been
> identified and resolved, there is no need to set the statistics to 0. In
> addition, the (unlikely) overflow still needs to be handled
> appropriately. If an unexpected jump does happen, just log it as a
> warning.
> 
> Signed-off-by: Balazs Nemeth 
> ---
>  ofproto/ofproto-dpif-upcall.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index cd57fdbd9..3cf080c75 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -2372,18 +2372,17 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
> *ukey,
>  
>  push.used = stats->used;
>  push.tcp_flags = stats->tcp_flags;
> -push.n_packets = (stats->n_packets > ukey->stats.n_packets
> -  ? stats->n_packets - ukey->stats.n_packets
> -  : 0);
> -push.n_bytes = (stats->n_bytes > ukey->stats.n_bytes
> -? stats->n_bytes - ukey->stats.n_bytes
> -: 0);
> +push.n_packets = stats->n_packets - ukey->stats.n_packets;
> +push.n_bytes = stats->n_bytes - ukey->stats.n_bytes;
>  
>  if (stats->n_packets < ukey->stats.n_packets &&
>  ukey->stats.n_packets < UINT64_THREE_QUARTERS) {
>  /* Report cases where the packet counter is lower than the previous
>   * instance, but exclude the potential wrapping of an uint64_t. */
>  COVERAGE_INC(ukey_invalid_stat_reset);
> +
> +VLOG_WARN("Unexpected jump in packet stats from %"PRIu64" to 
> %"PRIu64,
> +stats->n_packets, ukey->stats.n_packets);

Hi, Balazs.  We should, probbaly, print out the key and actions,
so we can better understand which flow caused the issue.
We should be able to use odp_flow_key_format() and format_odp_actions()
for this.  ukey contains the key.

What do you think?

We should also rate-limit the log just in case.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] Official Way For Rebuilding OVS

2023-05-17 Thread Lazuardi Nasution
Hi Ilya,

You are right, dpkg-buildpackage produce two sets of binaries, each for
standard and DPDK versions. I have found the debug symbols stripped ones on
different folders

Best regards.


On Wed, May 17, 2023, 12:03 AM Ilya Maximets  wrote:

> On 5/16/23 17:39, Lazuardi Nasution wrote:
> > Hi,
> >
> > I'm trying to rebuild OVS package by using "dpkg-buildpackage -b"
> command.
> > But the built binary size is always much bigger than repo binary. Is
> there
> > any official way to do that so the rebuilt binary will be near the repo
> > binary?
>
> The official way is documented here:
>   https://docs.openvswitch.org/en/latest/intro/install/debian/
>
> >
> > Rebuilt Binary:
> > ubuntu@jammy:~$ ls -lah
> ./openvswitch-3.0.3/_debian/vswitchd/ovs-vswitchd
> > -rwxr-xr-x 1 ubuntu ubuntu *17M* May 16 22:25
> > ./openvswitch-3.0.3/_debian/vswitchd/ovs-vswitchd
> >
> > Repo Binary:
> > ubuntu@jammy:~$ ls -lah /usr/lib/openvswitch-switch/ovs-vswitchd
> > -rwxr-xr-x 1 root root *2.7M* Feb 27 20:18
> > /usr/lib/openvswitch-switch/ovs-vswitchd
>
> I'm not sure what exactly dpkg-buildpackage does and why you end up
> with a different size binary.  Most likely your binary doesn't have
> debuginfo stripped out while the official one does.  It might also
> be possible that your package is built with DPDK, while the official
> openvswitch-switch package is not.
>
> Best regards, Ilya Maximets.
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] conntrack: Release nat_conn in case both keys have the same hash.

2023-05-17 Thread Aaron Conole
Paolo Valerio  writes:

> Ilya Maximets  writes:
>
>> On 5/4/23 19:21, Paolo Valerio wrote:
>>> Ilya Maximets  writes:
>>> 
 On 4/19/23 20:40, Paolo Valerio wrote:
> During the creation of a new connection, there's a chance both key and
> rev_key end up having the same hash. This is more common in the case
> of all-zero snat with no collisions. In that case, once the
> connection is expired, but not cleaned up, if a new packet with the
> same 5-tuple is received, an assertion failure gets triggered in
> conn_update_state() because of a previous failure of retrieving a
> CT_CONN_TYPE_DEFAULT connection.
>
> Fix it by releasing the nat_conn during the connection creation in the
> case of same hash for both key and rev_key.

 This sounds a bit odd.  Shouldn't we treat hash collision as a normal case?

 Looking at the code, I'm assuming that the issue comes from the following
 part in process_one():

 if (OVS_LIKELY(conn)) {
 if (conn->conn_type == CT_CONN_TYPE_UN_NAT) {
 ...
 conn_key_lookup(ct, >key, hash, now, , >reply);

 And here we get the same connection again, because the default one is 
 already
 expired.  Is that correct?

 If so, maybe we should add an extra condition to conn_key_lookup() to
 only look for DEFAULT connections instead, just for this case?  Since
 we really don't want to get the UN_NAT one here.

>>> 
>>> Hello Ilya,
>>> 
>>> It's a fair point.
>>> I initially thought about the approach you're suggesting, but I had some
>>> concerns about it that I'll try to summarize below.
>>> 
>>> For sure it would fix the issue (it could require the first patch to be
>>> applied as well for the branches with rcu exp lists).
>>> 
>>> Based on the current logic, new packets matching that expired connection
>>> but not evicted will be marked as +inv and further packets will be
>>> marked so for the whole sweep interval unless an exception like this get
>>> added:
>>> 
>>> uint32_t hash = conn_key_hash(>rev_key, ct->hash_basis);
>>> /* the last flag indicates CT_CONN_TYPE_DEFAULT only */
>>> conn_key_lookup_(ct, >key, hash, now, , >reply, true);
>>> /* special case where there's hash collision */
>>> if (!conn && ctx->hash != hash) {
>>> pkt->md.ct_state |= CS_INVALID;
>>> write_ct_md(pkt, zone, NULL, NULL, NULL);
>>> ...
>>> return;
>>> }
>>> 
>>> This would further require that subsequent lookup in the create_new_conn
>>> path are restricted to CT_CONN_TYPE_DEFAULT, e.g.:
>>> 
>>> uint32_t hash = conn_key_hash(>key, ct->hash_basis);
>>> /* Only check for CT_CONN_TYPE_DEFAULT */
>>> if (!conn_key_lookup_(ct, >key, hash, now, NULL, NULL, true)) {
>>> conn = conn_not_found(ct, pkt, ctx, commit, now, nat_action_info,
>>>   helper, alg_exp, ct_alg_ctl, tp_id);
>>> }
>>> 
>>> otherwise we could incur in a false positive which prevent to create a
>>> new connection.
>>
>> I'm not really sure if what described above is more correct way of doing
>> things or not...  Aaron, do you have opinion on this?
>>
>> Another thought: Can we expire the CT_CONN_TYPE_UN_NAT connection the
>> moment DEFAULT counterpart of it expires?  Or that will that be against
>> some logic / not possible to do?
>>
>
> As far as I can tell, this could not be straightforward as simply
> marking it as expired should not be reliable (e.g. doing it from the
> sweeper), and I guess that managing the expiration time field for the
> nat_conn as well would require updating the nat_conn every time the
> default one gets updated, probably making it a bit unpractical.
>
> Another approach would be removing the nat_conn [1] altogether.
> The problem in this case is backporting. Some adjustments that would add
> to the patch might be needed for older branches.
>
> [1]
> https://patchwork.ozlabs.org/project/openvswitch/patch/20201129033255.64647-2-hepeng.0...@bytedance.com/

I think that work was interesting, and maybe the best way to go
forward.  Backports would become difficult, though - agreed.

>>
>>> 
 Best regards, Ilya Maximets.

>
> Reported-by: Michael Plato 
> Fixes: 61e48c2d1db2 ("conntrack: Handle SNAT with all-zero IP address.")
> Signed-off-by: Paolo Valerio 
> ---
> In this thread [0] there are some more details. A similar
> approach here could be to avoid to add the nat_conn to the cmap and
> letting the sweeper release the memory for nat_conn once the whole
> connection gets freed.
> That approach could still be ok, but the drawback is that it could
> require a different patch for older branches that don't include
> 3d9c1b855a5f ("conntrack: Replace timeout based expiration lists with
> rculists."). It still worth to be considered.
>
> [0] 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-April/052339.html
> ---
>  lib/conntrack.c |   21 

Re: [ovs-dev] [PATCH ovn] northd: centralized reply lb traffic even if FIP is defined

2023-05-17 Thread Dumitru Ceara
On 4/28/23 15:57, Lorenzo Bianconi wrote:
> In the current codebase for distributed gw router port use-case,
> it is not possible to add a load balancer that redirects the traffic
> to a backed if it is even the internal IP of a FIP NAT rule since
> the reply traffic is never centralized. Fix the issue centralizing the
> traffic if it is the reply packet for the load balancer.
> 
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2023609
> Signed-off-by: Lorenzo Bianconi 
> ---

Hi Lorenzo,

>  northd/northd.c | 15 +++
>  northd/ovn-northd.8.xml | 14 ++

Would it be possible to add a unit test (ovn.at/ovn-northd.at) that
depicts the logical flows that need to be installed and a system test
(system-ovn.at) for this change?

It would really help understand what's changing here.

>  2 files changed, 29 insertions(+)
> 
> diff --git a/northd/northd.c b/northd/northd.c
> index d59a54b32..e33289b18 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -10680,6 +10680,8 @@ build_distr_lrouter_nat_flows_for_lb(struct 
> lrouter_nat_lb_flows_ctx *ctx,
>   struct ovn_datapath *od)
>  {
>  struct ovn_port *dgp = od->l3dgw_ports[0];
> +struct ds gw_redir_match = DS_EMPTY_INITIALIZER;
> +struct ds gw_redir_action = DS_EMPTY_INITIALIZER;
>  
>  const char *undnat_action;
>  
> @@ -10724,6 +10726,17 @@ build_distr_lrouter_nat_flows_for_lb(struct 
> lrouter_nat_lb_flows_ctx *ctx,
>  return;
>  }
>  
> +/* We need to centralize the LB traffic to properly perform
> + * the undnat stage.
> + */
> +ds_put_format(_redir_match, "%s) && outport == %s",
> +  ds_cstr(ctx->undnat_match), dgp->json_key);
> +ds_put_format(_redir_action, "outport = %s; next;",
> +  dgp->cr_port->json_key);
> +ovn_lflow_add_with_hint(ctx->lflows, od, S_ROUTER_IN_GW_REDIRECT,
> +200, ds_cstr(_redir_match),
> +ds_cstr(_redir_action), 
> >lb->nlb->header_);
> +
>  ds_put_format(ctx->undnat_match, ") && (inport == %s || outport == %s)"
>" && is_chassis_resident(%s)", dgp->json_key, 
> dgp->json_key,
>dgp->cr_port->json_key);
> @@ -10731,6 +10744,8 @@ build_distr_lrouter_nat_flows_for_lb(struct 
> lrouter_nat_lb_flows_ctx *ctx,
>  ds_cstr(ctx->undnat_match), undnat_action,
>  >lb->nlb->header_);
>  ds_truncate(ctx->undnat_match, undnat_match_len);
> +ds_destroy(_redir_match);
> +ds_destroy(_redir_action);
>  }
>  
>  static void
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 70153dc9e..46afaeb82 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -4440,6 +4440,20 @@ icmp6 {
>  
>  
>  
> +  
> +For all the configured load balancing rules that includes an IPv4
> +address VIP, for every backend IPv4 address B
> +defined for the VIP a priority-200 flow that matches

I really thought you meant a flow for each backend but I think it's
actually a single lflow for all backends, right?

> +ip  ip4.src == B with an action
> +outport = CR; next; where CR is 
> the
> +chassisredirect port representing the instance of the
> +logical router distributed gateway port on the gateway chassis.
> +If the backend IPv4 address B is also configured with
> +L4 port PORT of protocol P, then the match
> +also includes P.src == PORT.
> +Similar flows are addeded for IPv6 counterpart.
> +  
> +
>
>  For each NAT rule in the OVN Northbound database that can
>  be handled in a distributed manner, a priority-100 logical

Thanks,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v6] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-05-17 Thread Balazs Nemeth
The only way that stats->{n_packets,n_bytes} would decrease is due to an
overflow, or if there are bugs in how statistics are handled. In the
past, there were multiple issues that caused a jump backward. A
workaround was in place to set the statistics to 0 in that case. When
this happened while the revalidator was under heavy load, the workaround
had an unintended side effect where should_revalidate returned false
causing the flow to be removed because the metric it calculated was
based on a bogus value. Since many of those bugs have now been
identified and resolved, there is no need to set the statistics to 0. In
addition, the (unlikely) overflow still needs to be handled
appropriately. If an unexpected jump does happen, just log it as a
warning.

Signed-off-by: Balazs Nemeth 
---
 ofproto/ofproto-dpif-upcall.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index cd57fdbd9..3cf080c75 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -2372,18 +2372,17 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
*ukey,
 
 push.used = stats->used;
 push.tcp_flags = stats->tcp_flags;
-push.n_packets = (stats->n_packets > ukey->stats.n_packets
-  ? stats->n_packets - ukey->stats.n_packets
-  : 0);
-push.n_bytes = (stats->n_bytes > ukey->stats.n_bytes
-? stats->n_bytes - ukey->stats.n_bytes
-: 0);
+push.n_packets = stats->n_packets - ukey->stats.n_packets;
+push.n_bytes = stats->n_bytes - ukey->stats.n_bytes;
 
 if (stats->n_packets < ukey->stats.n_packets &&
 ukey->stats.n_packets < UINT64_THREE_QUARTERS) {
 /* Report cases where the packet counter is lower than the previous
  * instance, but exclude the potential wrapping of an uint64_t. */
 COVERAGE_INC(ukey_invalid_stat_reset);
+
+VLOG_WARN("Unexpected jump in packet stats from %"PRIu64" to %"PRIu64,
+stats->n_packets, ukey->stats.n_packets);
 }
 
 if (need_revalidate) {
-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v5] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-05-17 Thread Ilya Maximets
On 5/17/23 13:09, Eelco Chaudron wrote:
> 
> 
> On 31 Mar 2023, at 15:19, Eelco Chaudron wrote:
> 
>> On 31 Mar 2023, at 15:15, Ilya Maximets wrote:
>>
>>> On 3/31/23 15:06, Eelco Chaudron wrote:


 On 31 Mar 2023, at 12:38, Simon Horman wrote:

> On Fri, Mar 31, 2023 at 12:05:09PM +0200, Ilya Maximets wrote:
>> On 3/31/23 11:07, Simon Horman wrote:
>>> On Thu, Mar 30, 2023 at 09:04:02PM +0200, Ilya Maximets wrote:
 On 3/30/23 11:45, Simon Horman wrote:
> On Fri, Mar 17, 2023 at 09:47:36PM +0100, Eelco Chaudron wrote:
>>> Op 17 mrt. 2023 om 21:11 heeft Marcelo Ricardo Leitner 
>>>  het volgende geschreven:
>>> Thu, Mar 16, 2023 at 09:51:34AM +0100, Eelco Chaudron wrote:
> On 22 Dec 2022, at 13:32, Eelco Chaudron wrote:
>> On 22 Dec 2022, at 10:26, Balazs Nemeth wrote:
>
>> The only way that stats->{n_packets,n_bytes} would decrease is 
>> due to an
>
> nit: s/way/ways/
>
>> overflow, or if there are bugs in how statistics are handled. In 
>> the
>> past, there were multiple bugs that caused a jump backward. A
>> workaround was in place to set the statistics to 0 in that case. 
>> When
>> this happened while the revalidator was under heavy load, the 
>> workaround
>> had an unintended side effect where should_revalidate returned 
>> false
>> causing the flow to be removed because the metric it calculated 
>> was
>> based on a bogus value. Since many of those bugs have now been
>> identified and resolved, there is no need to set the statistics 
>> to 0. In
>> addition, the (unlikely) overflow still needs to be handled
>> appropriately.
>
> 1. Perhaps it would be prudent to log/count if/when this occurs

 +1
 We do have a coverage counter that will indicate the case where stats
 jump back.  However, if we're certain that this should never happen,
 we should, probably, emit a warning or even an error log as well, so
 users are aware that something went wrong.
>>>
>>> I was thinking more of a counter, which seems to already be covered.
>>> But I have no objection to your reasoning about having a warning (too).
>>>

> 2. I take it that the overflow handling would be follow-up work,
>is that correct?

 The unsigned arithmetic should take case of overflowed counters,
 because the result of subtraction will still give a correct difference
 between the old and a new value, even if it overflowed and the new
 value is smaller.  Unless, of course, it overflowed more than once.
>>>
>>> More than once between samples?
>>> If so, I'm assuming that is not a case we can hit unless there is a bug.
>>
>> Right.  It's actually should be practically not possible to overflow
>> even once with a current hardware.  Assuming we have a fancy 400 Gbps
>> NIC, then it should take 11.7 years to overflow a byte counter.
>>
>> So, this patch is mostly removing a workaround for some bug that we
>> hope we fixed.  But it's not clear what the original bug was as the
>> commit message for this workaround didn't specify a root cause.  So,
>> it's hard to say if it's fixed or not.  And that's why I'm thinking
>> that the error message is needed.
>
> Yes, I agree that is prudent.

 If we do add a log message, we should be careful as it could still be a 
 wrap (for bytes). For packets, it’s not very likely with the current 
 speeds 400G, will be around 980 years… For bytes, we can wrap easily.

 So I would suggest it only for packet count…

>>>
>>> We already only use a packet counter for the 'ukey_invalid_stat_reset' 
>>> coverage.
>>> So, I suppose, we can just add the log under the same condition.
>>>
>>> OTOH, Don't we check that the difference is within 3/4 of 64-bit range?
>>> It should still take many years to overflow the byte counter.
>>
>> Yes, I was looking at this on my review directory which did not have my 
>> patches (or latest master).
>>
>> So just adding it to the ‘ukey_invalid_stat_reset’ if() case will work :)
> 
> 
> Balasz are you planning to send out a v6 for this patch? If I understand Ilya 
> correctly all that needs to change is adding a log to the section below, Ilya?

Yep.  Some rate-limited warning or error should be fine.

> 
> if (stats->n_packets < ukey->stats.n_packets &&
> ukey->stats.n_packets < UINT64_THREE_QUARTERS) {
> /* Report cases where the packet counter is lower than the previous
>  * instance, but exclude the potential wrapping of an uint64_t. */
> COVERAGE_INC(ukey_invalid_stat_reset);
> 
> ++ ADD LOG HERE…
> }
> 


Re: [ovs-dev] [PATCH v5] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-05-17 Thread Eelco Chaudron


On 31 Mar 2023, at 15:19, Eelco Chaudron wrote:

> On 31 Mar 2023, at 15:15, Ilya Maximets wrote:
>
>> On 3/31/23 15:06, Eelco Chaudron wrote:
>>>
>>>
>>> On 31 Mar 2023, at 12:38, Simon Horman wrote:
>>>
 On Fri, Mar 31, 2023 at 12:05:09PM +0200, Ilya Maximets wrote:
> On 3/31/23 11:07, Simon Horman wrote:
>> On Thu, Mar 30, 2023 at 09:04:02PM +0200, Ilya Maximets wrote:
>>> On 3/30/23 11:45, Simon Horman wrote:
 On Fri, Mar 17, 2023 at 09:47:36PM +0100, Eelco Chaudron wrote:
>> Op 17 mrt. 2023 om 21:11 heeft Marcelo Ricardo Leitner 
>>  het volgende geschreven:
>> Thu, Mar 16, 2023 at 09:51:34AM +0100, Eelco Chaudron wrote:
 On 22 Dec 2022, at 13:32, Eelco Chaudron wrote:
> On 22 Dec 2022, at 10:26, Balazs Nemeth wrote:

> The only way that stats->{n_packets,n_bytes} would decrease is 
> due to an

 nit: s/way/ways/

> overflow, or if there are bugs in how statistics are handled. In 
> the
> past, there were multiple bugs that caused a jump backward. A
> workaround was in place to set the statistics to 0 in that case. 
> When
> this happened while the revalidator was under heavy load, the 
> workaround
> had an unintended side effect where should_revalidate returned 
> false
> causing the flow to be removed because the metric it calculated 
> was
> based on a bogus value. Since many of those bugs have now been
> identified and resolved, there is no need to set the statistics 
> to 0. In
> addition, the (unlikely) overflow still needs to be handled
> appropriately.

 1. Perhaps it would be prudent to log/count if/when this occurs
>>>
>>> +1
>>> We do have a coverage counter that will indicate the case where stats
>>> jump back.  However, if we're certain that this should never happen,
>>> we should, probably, emit a warning or even an error log as well, so
>>> users are aware that something went wrong.
>>
>> I was thinking more of a counter, which seems to already be covered.
>> But I have no objection to your reasoning about having a warning (too).
>>
>>>
 2. I take it that the overflow handling would be follow-up work,
is that correct?
>>>
>>> The unsigned arithmetic should take case of overflowed counters,
>>> because the result of subtraction will still give a correct difference
>>> between the old and a new value, even if it overflowed and the new
>>> value is smaller.  Unless, of course, it overflowed more than once.
>>
>> More than once between samples?
>> If so, I'm assuming that is not a case we can hit unless there is a bug.
>
> Right.  It's actually should be practically not possible to overflow
> even once with a current hardware.  Assuming we have a fancy 400 Gbps
> NIC, then it should take 11.7 years to overflow a byte counter.
>
> So, this patch is mostly removing a workaround for some bug that we
> hope we fixed.  But it's not clear what the original bug was as the
> commit message for this workaround didn't specify a root cause.  So,
> it's hard to say if it's fixed or not.  And that's why I'm thinking
> that the error message is needed.

 Yes, I agree that is prudent.
>>>
>>> If we do add a log message, we should be careful as it could still be a 
>>> wrap (for bytes). For packets, it’s not very likely with the current speeds 
>>> 400G, will be around 980 years… For bytes, we can wrap easily.
>>>
>>> So I would suggest it only for packet count…
>>>
>>
>> We already only use a packet counter for the 'ukey_invalid_stat_reset' 
>> coverage.
>> So, I suppose, we can just add the log under the same condition.
>>
>> OTOH, Don't we check that the difference is within 3/4 of 64-bit range?
>> It should still take many years to overflow the byte counter.
>
> Yes, I was looking at this on my review directory which did not have my 
> patches (or latest master).
>
> So just adding it to the ‘ukey_invalid_stat_reset’ if() case will work :)


Balasz are you planning to send out a v6 for this patch? If I understand Ilya 
correctly all that needs to change is adding a log to the section below, Ilya?

if (stats->n_packets < ukey->stats.n_packets &&
ukey->stats.n_packets < UINT64_THREE_QUARTERS) {
/* Report cases where the packet counter is lower than the previous
 * instance, but exclude the potential wrapping of an uint64_t. */
COVERAGE_INC(ukey_invalid_stat_reset);

++ ADD LOG HERE…
}

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] seq: Make read of the current value atomic

2023-05-17 Thread Eelco Chaudron


On 16 May 2023, at 21:48, Ilya Maximets wrote:

> On 5/16/23 10:20, Eelco Chaudron wrote:
>>
>>
>> On 15 May 2023, at 17:47, Ilya Maximets wrote:
>>
>>> On 5/15/23 16:24, Eelco Chaudron wrote:


 On 4 May 2023, at 0:55, Ilya Maximets wrote:

> On 3/27/23 13:25, Eelco Chaudron wrote:
>> Make the read of the current seq->value atomic, i.e., not needing to
>> acquire the global mutex when reading it. On 64-bit systems, this
>> incurs no overhead, and it will avoid the mutex and potentially
>> a system call.
>>
>> For incrementing the value followed by waking up the threads, we are
>> still taking the mutex, so the current behavior is not changing. The
>> seq_read() behavior is already defined as, "Returns seq's current
>> sequence number (which could change immediately)". So the change
>> should not impact the current behavior.
>>
>> Signed-off-by: Eelco Chaudron 
>> ---
>>  lib/ovs-rcu.c |2 +-
>>  lib/seq.c |   37 -
>>  lib/seq.h |1 -
>>  3 files changed, 17 insertions(+), 23 deletions(-)
>>
>> diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c
>> index 946aa04d1..9e07d9bab 100644
>> --- a/lib/ovs-rcu.c
>> +++ b/lib/ovs-rcu.c
>> @@ -170,7 +170,7 @@ ovsrcu_try_quiesce(void)
>>  ovs_assert(!single_threaded());
>>  perthread = ovsrcu_perthread_get();
>>  if (!seq_try_lock()) {
>> -perthread->seqno = seq_read_protected(global_seqno);
>> +perthread->seqno = seq_read(global_seqno);
>>  if (perthread->cbset) {
>>  ovsrcu_flush_cbset__(perthread, true);
>>  }
>> diff --git a/lib/seq.c b/lib/seq.c
>> index 99e5bf8bd..22646f3f8 100644
>> --- a/lib/seq.c
>> +++ b/lib/seq.c
>> @@ -32,7 +32,7 @@ COVERAGE_DEFINE(seq_change);
>>
>>  /* A sequence number object. */
>>  struct seq {
>> -uint64_t value OVS_GUARDED;
>> +atomic_uint64_t value;
>>  struct hmap waiters OVS_GUARDED; /* Contains 'struct seq_waiter's. 
>> */
>>  };
>>
>> @@ -72,6 +72,7 @@ static void seq_wake_waiters(struct seq *) 
>> OVS_REQUIRES(seq_mutex);
>>  struct seq * OVS_EXCLUDED(seq_mutex)
>>  seq_create(void)
>>  {
>> +uint64_t seq_value;
>>  struct seq *seq;
>>
>>  seq_init();
>> @@ -81,7 +82,8 @@ seq_create(void)
>>  COVERAGE_INC(seq_change);
>>
>>  ovs_mutex_lock(_mutex);
>> -seq->value = seq_next++;
>> +seq_value = seq_next++;
>> +atomic_store_relaxed(>value, seq_value);
>>  hmap_init(>waiters);
>>  ovs_mutex_unlock(_mutex);
>>
>> @@ -126,9 +128,11 @@ void
>>  seq_change_protected(struct seq *seq)
>>  OVS_REQUIRES(seq_mutex)
>>  {
>> +uint64_t seq_value = seq_next++;
>> +
>>  COVERAGE_INC(seq_change);
>>
>> -seq->value = seq_next++;
>> +atomic_store_relaxed(>value, seq_value);
>>  seq_wake_waiters(seq);
>>  }
>>
>> @@ -143,18 +147,6 @@ seq_change(struct seq *seq)
>>  ovs_mutex_unlock(_mutex);
>>  }
>>
>> -/* Returns 'seq''s current sequence number (which could change 
>> immediately).
>> - *
>> - * seq_read() and seq_wait() can be used together to yield a race-free 
>> wakeup
>> - * when an object changes, even without an ability to lock the object.  
>> See
>> - * Usage in seq.h for details. */
>> -uint64_t
>> -seq_read_protected(const struct seq *seq)
>> -OVS_REQUIRES(seq_mutex)
>> -{
>> -return seq->value;
>> -}
>> -
>>  /* Returns 'seq''s current sequence number (which could change 
>> immediately).
>>   *
>>   * seq_read() and seq_wait() can be used together to yield a race-free 
>> wakeup
>> @@ -162,14 +154,15 @@ seq_read_protected(const struct seq *seq)
>>   * Usage in seq.h for details. */
>>  uint64_t
>>  seq_read(const struct seq *seq)
>> -OVS_EXCLUDED(seq_mutex)
>>  {
>>  uint64_t value;
>>
>> -ovs_mutex_lock(_mutex);
>> -value = seq_read_protected(seq);
>> -ovs_mutex_unlock(_mutex);
>> -
>> +/* We use atomic_read(), memory_order_seq_cst, to simulate the full
>> + * memory barrier you would get with locking and unlocking the 
>> global
>> + * mutex.
>
> Hi, Eelco.   I'm not sure this is actually correct.
>
> We're performing memory_order_seq_cst operation on the value.
>
> Sequential consistency means just acquire-release semantics in our case, 
> since
> we're pairing with non-seq-cst operations.  And acquire-release semantics 
> works
> between acquire loads and release stores on the same atomic.  But all the 
> loads
> on the value we have with this change are relaxed.  And the reader 
> 

[ovs-dev] [PATCH v8 ovn 09/10] controller: get rid of egress_ifaces sset

2023-05-17 Thread Lorenzo Bianconi
egress_ifaces sset is no longer used by ovn-controller, so get rid of it

Signed-off-by: Lorenzo Bianconi 
---
 controller/binding.c| 40 -
 controller/binding.h|  1 -
 controller/ovn-controller.c | 10 ++
 3 files changed, 2 insertions(+), 49 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 2fa044b52..673b5a1d8 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -474,39 +474,6 @@ sbrec_get_port_encap(const struct sbrec_chassis 
*chassis_rec,
 return best_encap;
 }
 
-static void
-add_localnet_egress_interface_mappings(
-const struct sbrec_port_binding *port_binding,
-struct shash *bridge_mappings, struct sset *egress_ifaces)
-{
-const char *network = smap_get(_binding->options, "network_name");
-if (!network) {
-return;
-}
-
-struct ovsrec_bridge *br_ln = shash_find_data(bridge_mappings, network);
-if (!br_ln) {
-return;
-}
-
-/* Add egress-ifaces from the connected bridge */
-for (size_t i = 0; i < br_ln->n_ports; i++) {
-const struct ovsrec_port *port_rec = br_ln->ports[i];
-
-for (size_t j = 0; j < port_rec->n_interfaces; j++) {
-const struct ovsrec_interface *iface_rec;
-
-iface_rec = port_rec->interfaces[j];
-bool is_egress_iface = smap_get_bool(_rec->external_ids,
- "ovn-egress-iface", false);
-if (!is_egress_iface) {
-continue;
-}
-sset_add(egress_ifaces, iface_rec->name);
-}
-}
-}
-
 static bool
 is_network_plugged(const struct sbrec_port_binding *binding_rec,
struct shash *bridge_mappings)
@@ -547,7 +514,6 @@ update_ld_multichassis_ports(const struct 
sbrec_port_binding *binding_rec,
 static void
 update_ld_localnet_port(const struct sbrec_port_binding *binding_rec,
 struct shash *bridge_mappings,
-struct sset *egress_ifaces,
 struct hmap *local_datapaths)
 {
 /* Ignore localnet ports for unplugged networks. */
@@ -555,9 +521,6 @@ update_ld_localnet_port(const struct sbrec_port_binding 
*binding_rec,
 return;
 }
 
-add_localnet_egress_interface_mappings(binding_rec,
-bridge_mappings, egress_ifaces);
-
 struct local_datapath *ld
 = get_local_datapath(local_datapaths,
  binding_rec->datapath->tunnel_key);
@@ -2127,7 +2090,6 @@ binding_run(struct binding_ctx_in *b_ctx_in, struct 
binding_ctx_out *b_ctx_out)
 struct lport *lnet_lport;
 LIST_FOR_EACH_POP (lnet_lport, list_node, _lports) {
 update_ld_localnet_port(lnet_lport->pb, _mappings,
-b_ctx_out->egress_ifaces,
 b_ctx_out->local_datapaths);
 free(lnet_lport);
 }
@@ -2932,7 +2894,6 @@ handle_updated_port(struct binding_ctx_in *b_ctx_in,
 b_ctx_in->bridge_table,
 _mappings);
 update_ld_localnet_port(pb, _mappings,
-b_ctx_out->egress_ifaces,
 b_ctx_out->local_datapaths);
 shash_destroy(_mappings);
 break;
@@ -3128,7 +3089,6 @@ delete_done:
 enum en_lport_type lport_type = get_lport_type(pb);
 if (lport_type == LP_LOCALNET) {
 update_ld_localnet_port(pb, _mappings,
-b_ctx_out->egress_ifaces,
 b_ctx_out->local_datapaths);
 } else if (lport_type == LP_EXTERNAL) {
 update_ld_external_ports(pb, b_ctx_out->local_datapaths);
diff --git a/controller/binding.h b/controller/binding.h
index 6b97bcff0..84b4cb591 100644
--- a/controller/binding.h
+++ b/controller/binding.h
@@ -91,7 +91,6 @@ struct binding_ctx_out {
  */
 bool non_vif_ports_changed;
 
-struct sset *egress_ifaces;
 struct hmap *qos_map;
 /* smap of OVS interface name as key and
  * OVS interface external_ids:iface-id as value. */
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 1dd1099a4..04c4af68e 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -1347,7 +1347,6 @@ struct ed_type_runtime_data {
 struct sset active_tunnels;
 
 /* runtime data engine private data. */
-struct sset egress_ifaces;
 struct hmap qos_map;
 struct smap local_iface_ids;
 
@@ -1407,8 +1406,8 @@ struct ed_type_runtime_data {
  * | local_lport_ids  | is not tracked explicitly.   |
  *  -
  * | local_iface_ids  | This is used internally within the runtime data  |
- * | egress_ifaces| engine (used only in binding.c) and hence there  |
- * | 

[ovs-dev] [PATCH v8 ovn 08/10] northd: apply QoS rules on the localnet port related to LSP ports

2023-05-17 Thread Lorenzo Bianconi
This patch allows to apply QoS rules on the localnet port related to
logical switch ports running on the same datapath. Considering the
following netowrk configuration:

LSP{0,1} -- LogicalSwitch -- Localnet0

It is possible to apply the following QoS rules on Localnet0 on egress traffic
entering the cluster from LSP{0,1}:
- LSP0: min-rate r0, max_rate R0
- LSP1: min-rate r1, max_rate R1

Acked-By: Ihar Hrachyshka 
Tested-by: Rodolfo Alonso 
Signed-off-by: Lorenzo Bianconi 
---
 controller/binding.c|   6 +-
 northd/northd.c |  26 --
 northd/ovn-northd.8.xml |  12 +++
 tests/ovn-northd.at |   2 +
 tests/ovn.at| 185 
 tests/system-ovn.at |  86 ++-
 6 files changed, 306 insertions(+), 11 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index bf0ec9834..2fa044b52 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -202,9 +202,9 @@ get_qos_egress_port_interface(struct shash *bridge_mappings,
 continue;
 }
 
-bool is_egress_iface = smap_get_bool(>external_ids,
- "ovn-egress-iface", false);
-if (is_egress_iface) {
+if (smap_get_bool(>external_ids,
+  "ovn-egress-iface", false) ||
+!strcmp(iface->type, "")) {
 *pport = port;
 return iface;
 }
diff --git a/northd/northd.c b/northd/northd.c
index c107fe33a..2f96d15a4 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -5737,15 +5737,29 @@ build_lswitch_port_sec_op(struct ovn_port *op, struct 
hmap *lflows,
   ds_cstr(match), ds_cstr(actions),
   op->key, >nbsp->header_);
 
+if (!lsp_is_localnet(op->nbsp) && !op->od->n_localnet_ports) {
+return;
+}
+
+ds_clear(actions);
+ds_put_format(actions, "set_queue(%s); output;", queue_id);
+
+ds_clear(match);
 if (lsp_is_localnet(op->nbsp)) {
-ds_clear(match);
-ds_clear(actions);
 ds_put_format(match, "outport == %s", op->json_key);
-ds_put_format(actions, "set_queue(%s); output;", queue_id);
 ovn_lflow_add_with_lport_and_hint(lflows, op->od,
-S_SWITCH_OUT_APPLY_PORT_SEC, 100,
-ds_cstr(match), ds_cstr(actions),
-op->key, >nbsp->header_);
+  S_SWITCH_OUT_APPLY_PORT_SEC, 100,
+  ds_cstr(match), ds_cstr(actions),
+  op->key, >nbsp->header_);
+} else if (op->od->n_localnet_ports) {
+ds_put_format(match, "outport == %s && inport == %s",
+  op->od->localnet_ports[0]->json_key,
+  op->json_key);
+ovn_lflow_add_with_lport_and_hint(lflows, op->od,
+S_SWITCH_OUT_APPLY_PORT_SEC, 110,
+ds_cstr(match), ds_cstr(actions),
+op->od->localnet_ports[0]->key,
+>od->localnet_ports[0]->nbsp->header_);
 }
 }
 }
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 70153dc9e..7da912da3 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -2223,6 +2223,18 @@ output;
 
 
 
+  
+
+For each port configured with egress qos in the
+ column of , running a localnet port on the same logical
+switch, a priority 110 flow is added which matches on the localnet
+outport and on the port inport and
+applies the action set_queue(id); output;".
+
+  
+
   
 
 For each localnet port configured with egress qos in the
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index a9af0f76a..c52f86490 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -8154,6 +8154,7 @@ sort | sed 's/table=../table=??/' ], [0], [dnl
   table=??(ls_out_check_port_sec), priority=0, match=(1), 
action=(reg0[[15]] = check_out_port_sec(); next;)
   table=??(ls_out_check_port_sec), priority=100  , match=(eth.mcast), 
action=(reg0[[15]] = 0; next;)
   table=??(ls_out_apply_port_sec), priority=0, match=(1), action=(output;)
+  table=??(ls_out_apply_port_sec), priority=110  , match=(outport == 
"localnetport" && inport == "sw0p2"), action=(set_queue(10); output;)
   table=??(ls_out_apply_port_sec), priority=50   , match=(reg0[[15]] == 1), 
action=(drop;)
 ])
 
@@ -8184,6 +8185,7 @@ sort | sed 's/table=../table=??/' ], [0], [dnl
   table=??(ls_out_check_port_sec), priority=100  , match=(eth.mcast), 
action=(reg0[[15]] = 0; next;)
   table=??(ls_out_apply_port_sec), priority=0, match=(1), action=(output;)
   

[ovs-dev] [PATCH v8 ovn 10/10] update NEWS with new QoS info

2023-05-17 Thread Lorenzo Bianconi
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2129742
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2139100
Acked-By: Ihar Hrachyshka 
Signed-off-by: Lorenzo Bianconi 
---
 NEWS | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/NEWS b/NEWS
index 0f1c5f985..a7a11061f 100644
--- a/NEWS
+++ b/NEWS
@@ -25,6 +25,12 @@ Post v23.03.0
 databases for ovn-nbctl and ovn-sbctl respectively.  See man ovn-nb and
 man ovn-sb for 'nbctl_probe_interval' and 'sbctl_probe_interval'
 options for more details.
+  - Rework OVN egress QoS implementation in order to rely on OvS interface
+instead of directly running tc from OVN. Get rid of traffic shaping on the
+tunnel interfaces. Now for LSPs running on a LogicalSwitch with a localnet
+port is possible to define QoS rules to apply to the local egress localnet
+port. Please note now the QoS will be applied just to the local localnet
+port and not to all localnet port marked with ovn-egress iface.
 
 OVN v23.03.0 - 03 Mar 2023
 --
-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v8 ovn 07/10] northd: make queue_id allocation global for the ovn cluster

2023-05-17 Thread Lorenzo Bianconi
In order to avoid possible queue_id clash when we have a localnet and a
LSP ports on the same hv, make QoS queue_id allocation global for the
ovn cluster.

Signed-off-by: Lorenzo Bianconi 
---
 northd/northd.c | 127 
 1 file changed, 20 insertions(+), 107 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 470f76809..c107fe33a 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -469,94 +469,20 @@ build_chassis_features(const struct sbrec_chassis_table 
*sbrec_chassis_table,
 }
 }
 }
-
-struct ovn_chassis_qdisc_queues {
-struct hmap_node key_node;
-uint32_t queue_id;
-struct uuid chassis_uuid;
-};
 
 static uint32_t
-hash_chassis_queue(const struct uuid *chassis_uuid, uint32_t queue_id)
-{
-return hash_2words(uuid_hash(chassis_uuid), queue_id);
-}
-
-static void
-destroy_chassis_queues(struct hmap *set)
-{
-struct ovn_chassis_qdisc_queues *node;
-HMAP_FOR_EACH_POP (node, key_node, set) {
-free(node);
-}
-hmap_destroy(set);
-}
-
-static void
-add_chassis_queue(struct hmap *set, const struct uuid *chassis_uuid,
-  uint32_t queue_id)
+allocate_queueid(unsigned long *queue_id_bitmap)
 {
-struct ovn_chassis_qdisc_queues *node = xmalloc(sizeof *node);
-node->queue_id = queue_id;
-node->chassis_uuid = *chassis_uuid;
-hmap_insert(set, >key_node,
-hash_chassis_queue(chassis_uuid, queue_id));
-}
-
-static bool
-chassis_queueid_in_use(const struct hmap *set, const struct uuid *chassis_uuid,
-   uint32_t queue_id)
-{
-const struct ovn_chassis_qdisc_queues *node;
-HMAP_FOR_EACH_WITH_HASH (node, key_node,
- hash_chassis_queue(chassis_uuid, queue_id), set) {
-if (uuid_equals(chassis_uuid, >chassis_uuid)
-&& node->queue_id == queue_id) {
-return true;
-}
-}
-return false;
-}
-
-static uint32_t
-allocate_chassis_queueid(struct hmap *set, const struct uuid *uuid, char *name)
-{
-if (!uuid) {
+uint32_t queue_id = bitmap_scan(queue_id_bitmap, 0, 1,
+QDISC_MAX_QUEUE_ID + 1);
+if (queue_id == QDISC_MAX_QUEUE_ID + 1) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_WARN_RL(, "all queue ids exhausted");
 return 0;
 }
+bitmap_set1(queue_id_bitmap, queue_id);
 
-for (uint32_t queue_id = QDISC_MIN_QUEUE_ID + 1;
- queue_id <= QDISC_MAX_QUEUE_ID;
- queue_id++) {
-if (!chassis_queueid_in_use(set, uuid, queue_id)) {
-add_chassis_queue(set, uuid, queue_id);
-return queue_id;
-}
-}
-
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
-VLOG_WARN_RL(, "all %s queue ids exhausted", name);
-return 0;
-}
-
-static void
-free_chassis_queueid(struct hmap *set, const struct uuid *uuid,
- uint32_t queue_id)
-{
-if (!uuid) {
-return;
-}
-
-struct ovn_chassis_qdisc_queues *node;
-HMAP_FOR_EACH_WITH_HASH (node, key_node,
- hash_chassis_queue(uuid, queue_id), set) {
-if (uuid_equals(uuid, >chassis_uuid)
-&& node->queue_id == queue_id) {
-hmap_remove(set, >key_node);
-free(node);
-break;
-}
-}
+return queue_id;
 }
 
 static inline bool
@@ -2472,7 +2398,7 @@ tag_alloc_create_new_tag(struct hmap *tag_alloc_table,
 static void
 join_logical_ports(const struct sbrec_port_binding_table *sbrec_pb_table,
struct hmap *ls_datapaths, struct hmap *lr_datapaths,
-   struct hmap *ports, struct hmap *chassis_qdisc_queues,
+   struct hmap *ports, unsigned long *queue_id_bitmap,
struct hmap *tag_alloc_table, struct ovs_list *sb_only,
struct ovs_list *nb_only, struct ovs_list *both)
 {
@@ -2537,11 +2463,8 @@ join_logical_ports(const struct sbrec_port_binding_table 
*sbrec_pb_table,
 
 uint32_t queue_id = smap_get_int(>sb->options,
  "qdisc_queue_id", 0);
-if (queue_id && op->sb->chassis) {
-add_chassis_queue(
- chassis_qdisc_queues,
- >sb->chassis->header_.uuid,
- queue_id);
+if (queue_id) {
+bitmap_set1(queue_id_bitmap, queue_id);
 }
 
 ovs_list_push_back(both, >list);
@@ -3359,7 +3282,7 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
   struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name,
   const struct sbrec_mirror_table *sbrec_mirror_table,
   const struct ovn_port *op,
-  struct hmap *chassis_qdisc_queues,
+

[ovs-dev] [PATCH v8 ovn 05/10] controller: improve ovs port lookup by qos

2023-05-17 Thread Lorenzo Bianconi
Introduce ovsport_lookup_by_qos routine in order to speed-up
ovs port lookup based on port qos.

Acked-By: Ihar Hrachyshka 
Tested-by: Rodolfo Alonso 
Signed-off-by: Lorenzo Bianconi 
---
 controller/binding.c| 37 ++---
 controller/binding.h|  2 +-
 controller/ovn-controller.c | 13 +
 controller/ovsport.c| 16 
 controller/ovsport.h|  3 +++
 5 files changed, 43 insertions(+), 28 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index e18488c7e..6cd3c6c1d 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -36,6 +36,7 @@
 #include "lport.h"
 #include "ovn-controller.h"
 #include "patch.h"
+#include "ovsport.h"
 
 VLOG_DEFINE_THIS_MODULE(binding);
 
@@ -281,7 +282,7 @@ add_ovs_qos_table_entry(struct ovsdb_idl_txn *ovs_idl_txn,
 static void
 remove_stale_qos_entry(struct ovsdb_idl_txn *ovs_idl_txn,
const struct sbrec_port_binding *pb,
-   const struct ovsrec_port_table *port_table,
+   struct ovsdb_idl_index *ovsrec_port_by_qos,
const struct ovsrec_qos_table *qos_table,
struct hmap *queue_map)
 {
@@ -314,13 +315,8 @@ remove_stale_qos_entry(struct ovsdb_idl_txn *ovs_idl_txn,
 ovsrec_queue_delete(queue);
 
 if (qos->n_queues == 1) {
-const struct ovsrec_port *port = NULL, *iter;
-OVSREC_PORT_TABLE_FOR_EACH (iter, port_table) {
-if (iter->qos == qos) {
-port = iter;
-break;
-}
-}
+const struct ovsrec_port *port =
+ovsport_lookup_by_qos(ovsrec_port_by_qos, qos);
 if (port) {
 ovsrec_port_set_qos(port, NULL);
 }
@@ -349,9 +345,8 @@ configure_qos(const struct sbrec_port_binding *pb,
 if ((!min_rate && !max_rate && !burst) || !queue_id) {
 /* Qos is not configured for this port. */
 remove_stale_qos_entry(b_ctx_in->ovs_idl_txn, pb,
-   b_ctx_in->port_table,
-   b_ctx_in->qos_table,
-   b_ctx_out->qos_map);
+   b_ctx_in->ovsrec_port_by_qos,
+   b_ctx_in->qos_table, b_ctx_out->qos_map);
 return;
 }
 
@@ -398,7 +393,7 @@ configure_qos(const struct sbrec_port_binding *pb,
 
 static void
 ovs_qos_entries_gc(struct ovsdb_idl_txn *ovs_idl_txn,
-   const struct ovsrec_port_table *port_table,
+   struct ovsdb_idl_index *ovsrec_port_by_qos,
const struct ovsrec_qos_table *qos_table,
struct hmap *queue_map)
 {
@@ -430,13 +425,8 @@ ovs_qos_entries_gc(struct ovsdb_idl_txn *ovs_idl_txn,
 }
 
 if (n_queue_deleted && n_queue_deleted == n_queues) {
-const struct ovsrec_port *port = NULL, *iter;
-OVSREC_PORT_TABLE_FOR_EACH (iter, port_table) {
-if (iter->qos == qos) {
-port = iter;
-break;
-}
-}
+const struct ovsrec_port *port =
+ovsport_lookup_by_qos(ovsrec_port_by_qos, qos);
 if (port) {
 ovsrec_port_set_qos(port, NULL);
 }
@@ -2160,7 +2150,7 @@ binding_run(struct binding_ctx_in *b_ctx_in, struct 
binding_ctx_out *b_ctx_out)
 
 shash_destroy(_mappings);
 /* Remove stale QoS entries. */
-ovs_qos_entries_gc(b_ctx_in->ovs_idl_txn, b_ctx_in->port_table,
+ovs_qos_entries_gc(b_ctx_in->ovs_idl_txn, b_ctx_in->ovsrec_port_by_qos,
b_ctx_in->qos_table, b_ctx_out->qos_map);
 
 cleanup_claimed_port_timestamps();
@@ -2380,8 +2370,8 @@ consider_iface_release(const struct ovsrec_interface 
*iface_rec,
 
 if (b_ctx_in->ovs_idl_txn) {
 remove_stale_qos_entry(b_ctx_in->ovs_idl_txn, b_lport->pb,
-   b_ctx_in->port_table, b_ctx_in->qos_table,
-   b_ctx_out->qos_map);
+   b_ctx_in->ovsrec_port_by_qos,
+   b_ctx_in->qos_table, b_ctx_out->qos_map);
 }
 
 /* Release the primary binding lport and other children lports if
@@ -3020,7 +3010,8 @@ binding_handle_port_binding_changes(struct binding_ctx_in 
*b_ctx_in,
 shash_add(_other_pbs, pb->logical_port, pb);
 }
 
-remove_stale_qos_entry(b_ctx_in->ovs_idl_txn, pb, b_ctx_in->port_table,
+remove_stale_qos_entry(b_ctx_in->ovs_idl_txn, pb,
+   b_ctx_in->ovsrec_port_by_qos,
b_ctx_in->qos_table, b_ctx_out->qos_map);
 }
 
diff --git a/controller/binding.h b/controller/binding.h
index 87ee7b540..6b97bcff0 100644

[ovs-dev] [PATCH v8 ovn 04/10] controller: configure qos through ovs qos table and do not run tc directly

2023-05-17 Thread Lorenzo Bianconi
Rework OVN QoS implementation in order to configure it through OVS QoS
table instead of running tc command directly bypassing OVS.

Acked-By: Ihar Hrachyshka 
Reviewed-by: Simon Horman 
Tested-by: Rodolfo Alonso 
Signed-off-by: Lorenzo Bianconi 
---
 controller/binding.c| 432 +++-
 controller/ovn-controller.c |   7 +
 tests/ovn-performance.at|   5 -
 tests/system-ovn.at |  67 +-
 4 files changed, 298 insertions(+), 213 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 91437dbb8..e18488c7e 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -181,222 +181,268 @@ destroy_qos_map(struct hmap *qos_map)
 hmap_destroy(qos_map);
 }
 
-static void
-get_qos_queue(const struct sbrec_port_binding *pb, struct hmap *queue_map)
+static const struct ovsrec_interface *
+get_qos_egress_port_interface(struct shash *bridge_mappings,
+  const struct ovsrec_port **pport,
+  const char *network)
 {
-uint32_t min_rate = smap_get_int(>options, "qos_min_rate", 0);
-uint32_t max_rate = smap_get_int(>options, "qos_max_rate", 0);
-uint32_t burst = smap_get_int(>options, "qos_burst", 0);
-uint32_t queue_id = smap_get_int(>options, "qdisc_queue_id", 0);
-const char *network = smap_get(>options, "qos_physical_network");
+struct ovsrec_bridge *br_ln = shash_find_data(bridge_mappings, network);
+if (!br_ln) {
+return NULL;
+}
 
-if ((!min_rate && !max_rate && !burst) || !queue_id) {
-/* Qos is not configured for this port. */
-return;
+/* Add egress-ifaces from the connected bridge */
+for (size_t i = 0; i < br_ln->n_ports; i++) {
+const struct ovsrec_port *port = br_ln->ports[i];
+for (size_t j = 0; j < port->n_interfaces; j++) {
+const struct ovsrec_interface *iface = port->interfaces[j];
+
+if (smap_get(>external_ids, "iface-id")) {
+continue;
+}
+
+bool is_egress_iface = smap_get_bool(>external_ids,
+ "ovn-egress-iface", false);
+if (is_egress_iface) {
+*pport = port;
+return iface;
+}
+}
 }
 
-uint32_t hash = hash_string(pb->logical_port, 0);
-struct qos_queue *q = find_qos_queue(queue_map, hash, pb->logical_port);
-if (!q) {
-q = xzalloc(sizeof *q);
-hmap_insert(queue_map, >node, hash);
-q->port = xstrdup(pb->logical_port);
-q->queue_id = queue_id;
+return NULL;
+}
+
+/* 34359738360 == (2^32 - 1) * 8.  netdev_set_qos() doesn't support
+ * 64-bit rate netlink attributes, so the maximum value is 2^32 - 1
+ * bytes. The 'max-rate' config option is in bits, so multiplying by 8.
+ * Without setting max-rate the reported link speed will be used, which
+ * can be unrecognized for certain NICs or reported too low for virtual
+ * interfaces. */
+#define OVN_QOS_MAX_RATE34359738360ULL
+static void
+add_ovs_qos_table_entry(struct ovsdb_idl_txn *ovs_idl_txn,
+const struct ovsrec_port *port,
+uint32_t min_rate, uint32_t max_rate,
+uint32_t burst, uint32_t queue_id,
+const char *ovn_port)
+{
+struct smap external_ids = SMAP_INITIALIZER(_ids);
+struct smap other_config = SMAP_INITIALIZER(_config);
+
+const struct ovsrec_qos *qos = port->qos;
+if (qos && smap_get_bool(>external_ids, "ovn_qos", false)) {
+/* External configured QoS, do not overwrite it. */
+return;
 }
 
-free(q->network);
-if (network) {
-q->network = xstrdup(network);
+if (!qos) {
+qos = ovsrec_qos_insert(ovs_idl_txn);
+ovsrec_qos_set_type(qos, OVN_QOS_TYPE);
+ovsrec_port_set_qos(port, qos);
+smap_add_format(_config, "max-rate", "%lld", OVN_QOS_MAX_RATE);
+ovsrec_qos_set_other_config(qos, _config);
+smap_clear(_config);
+
+smap_add(_ids, "ovn_qos", "true");
+ovsrec_qos_set_external_ids(qos, _ids);
+smap_clear(_ids);
 }
-q->min_rate = min_rate;
-q->max_rate = max_rate;
-q->burst = burst;
-}
 
-static const struct ovsrec_qos *
-get_noop_qos(struct ovsdb_idl_txn *ovs_idl_txn,
- const struct ovsrec_qos_table *qos_table)
-{
-const struct ovsrec_qos *qos;
-OVSREC_QOS_TABLE_FOR_EACH (qos, qos_table) {
-if (!strcmp(qos->type, "linux-noop")) {
-return qos;
+struct ovsrec_queue *queue;
+size_t i;
+for (i = 0; i < qos->n_queues; i++) {
+queue = qos->value_queues[i];
+
+const char *p = smap_get(>external_ids, "ovn_port");
+if (p && !strcmp(p, ovn_port)) {
+break;
 }
 }
 
-if (!ovs_idl_txn) {
-return NULL;
+if (i == qos->n_queues) {
+queue = ovsrec_queue_insert(ovs_idl_txn);

[ovs-dev] [PATCH v8 ovn 06/10] controller: use unsigned long long int for qos_max_rate/qos_min_rate/qos_burst

2023-05-17 Thread Lorenzo Bianconi
This patch allow to configure max/min rate greater than 4Gbps

Acked-By: Ihar Hrachyshka 
Tested-by: Rodolfo Alonso 
Signed-off-by: Lorenzo Bianconi 
---
 controller/binding.c | 29 -
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 6cd3c6c1d..bf0ec9834 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -147,9 +147,9 @@ struct qos_queue {
 char *port;
 
 uint32_t queue_id;
-uint32_t min_rate;
-uint32_t max_rate;
-uint32_t burst;
+unsigned long long min_rate;
+unsigned long long max_rate;
+unsigned long long burst;
 };
 
 static struct qos_queue *
@@ -224,9 +224,10 @@ get_qos_egress_port_interface(struct shash 
*bridge_mappings,
 static void
 add_ovs_qos_table_entry(struct ovsdb_idl_txn *ovs_idl_txn,
 const struct ovsrec_port *port,
-uint32_t min_rate, uint32_t max_rate,
-uint32_t burst, uint32_t queue_id,
-const char *ovn_port)
+unsigned long long min_rate,
+unsigned long long max_rate,
+unsigned long long burst,
+uint32_t queue_id, const char *ovn_port)
 {
 struct smap external_ids = SMAP_INITIALIZER(_ids);
 struct smap other_config = SMAP_INITIALIZER(_config);
@@ -266,9 +267,9 @@ add_ovs_qos_table_entry(struct ovsdb_idl_txn *ovs_idl_txn,
 ovsrec_qos_update_queues_setkey(qos, queue_id, queue);
 }
 
-smap_add_format(_config, "max-rate", "%d", max_rate);
-smap_add_format(_config, "min-rate", "%d", min_rate);
-smap_add_format(_config, "burst", "%d", burst);
+smap_add_format(_config, "max-rate", "%llu", max_rate);
+smap_add_format(_config, "min-rate", "%llu", min_rate);
+smap_add_format(_config, "burst", "%llu", burst);
 ovsrec_queue_verify_other_config(queue);
 ovsrec_queue_set_other_config(queue, _config);
 smap_destroy(_config);
@@ -336,10 +337,12 @@ configure_qos(const struct sbrec_port_binding *pb,
   struct binding_ctx_in *b_ctx_in,
   struct binding_ctx_out *b_ctx_out)
 {
-
-uint32_t min_rate = smap_get_int(>options, "qos_min_rate", 0);
-uint32_t max_rate = smap_get_int(>options, "qos_max_rate", 0);
-uint32_t burst = smap_get_int(>options, "qos_burst", 0);
+unsigned long long min_rate = smap_get_ullong(
+>options, "qos_min_rate", 0);
+unsigned long long max_rate = smap_get_ullong(
+>options, "qos_max_rate", 0);
+unsigned long long burst = smap_get_ullong(
+>options, "qos_burst", 0);
 uint32_t queue_id = smap_get_int(>options, "qdisc_queue_id", 0);
 
 if ((!min_rate && !max_rate && !burst) || !queue_id) {
-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v8 ovn 02/10] controller: add incremental processing for ovn-controller qos_map

2023-05-17 Thread Lorenzo Bianconi
Introduce support to process incrementally ovn-controller QoS
configuration received from ovn-northd adding qos_map hash map.
This is a preliminary patch to rework OVN QoS implementation in order to
configure it through OVS QoS table instead of running tc command
directly bypassing OVS.

Signed-off-by: Lorenzo Bianconi 
---
 controller/binding.c| 201 ++--
 controller/binding.h|   3 +
 controller/ovn-controller.c |   8 +-
 3 files changed, 109 insertions(+), 103 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index a0fbacc97..ad19a4092 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -109,14 +109,6 @@ binding_wait(void)
 }
 }
 
-struct qos_queue {
-struct hmap_node node;
-uint32_t queue_id;
-uint32_t min_rate;
-uint32_t max_rate;
-uint32_t burst;
-};
-
 void
 binding_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 {
@@ -147,8 +139,48 @@ static void update_lport_tracking(const struct 
sbrec_port_binding *pb,
   struct hmap *tracked_dp_bindings,
   bool claimed);
 
+struct qos_queue {
+struct hmap_node node;
+
+char *port;
+
+uint32_t queue_id;
+uint32_t min_rate;
+uint32_t max_rate;
+uint32_t burst;
+};
+
+static struct qos_queue *
+find_qos_queue(struct hmap *queue_map, uint32_t hash, const char *port)
+{
+struct qos_queue *q;
+HMAP_FOR_EACH_WITH_HASH (q, node, hash, queue_map) {
+if (!strcmp(q->port, port)) {
+return q;
+}
+}
+return NULL;
+}
+
 static void
-get_qos_params(const struct sbrec_port_binding *pb, struct hmap *queue_map)
+qos_queue_erase_entry(struct qos_queue *q)
+{
+free(q->port);
+free(q);
+}
+
+void
+destroy_qos_map(struct hmap *qos_map)
+{
+struct qos_queue *q;
+HMAP_FOR_EACH_POP (q, node, qos_map) {
+qos_queue_erase_entry(q);
+}
+hmap_destroy(qos_map);
+}
+
+static void
+get_qos_queue(const struct sbrec_port_binding *pb, struct hmap *queue_map)
 {
 uint32_t min_rate = smap_get_int(>options, "qos_min_rate", 0);
 uint32_t max_rate = smap_get_int(>options, "qos_max_rate", 0);
@@ -160,12 +192,17 @@ get_qos_params(const struct sbrec_port_binding *pb, 
struct hmap *queue_map)
 return;
 }
 
-struct qos_queue *node = xzalloc(sizeof *node);
-hmap_insert(queue_map, >node, hash_int(queue_id, 0));
-node->min_rate = min_rate;
-node->max_rate = max_rate;
-node->burst = burst;
-node->queue_id = queue_id;
+uint32_t hash = hash_string(pb->logical_port, 0);
+struct qos_queue *q = find_qos_queue(queue_map, hash, pb->logical_port);
+if (!q) {
+q = xzalloc(sizeof *q);
+hmap_insert(queue_map, >node, hash);
+q->port = xstrdup(pb->logical_port);
+q->queue_id = queue_id;
+}
+q->min_rate = min_rate;
+q->max_rate = max_rate;
+q->burst = burst;
 }
 
 static const struct ovsrec_qos *
@@ -354,17 +391,6 @@ setup_qos(const char *egress_iface, struct hmap *queue_map)
 netdev_close(netdev_phy);
 }
 
-static void
-destroy_qos_map(struct hmap *qos_map)
-{
-struct qos_queue *qos_queue;
-HMAP_FOR_EACH_POP (qos_queue, node, qos_map) {
-free(qos_queue);
-}
-
-hmap_destroy(qos_map);
-}
-
 /*
  * Get the encap from the chassis for this port. The interface
  * may have an external_ids:encap-ip= set; if so we
@@ -651,7 +677,7 @@ static struct binding_lport 
*local_binding_get_primary_or_localport_lport(
 
 static bool local_binding_handle_stale_binding_lports(
 struct local_binding *lbinding, struct binding_ctx_in *b_ctx_in,
-struct binding_ctx_out *b_ctx_out, struct hmap *qos_map);
+struct binding_ctx_out *b_ctx_out);
 
 static struct binding_lport *binding_lport_create(
 const struct sbrec_port_binding *,
@@ -1469,8 +1495,7 @@ consider_vif_lport_(const struct sbrec_port_binding *pb,
 bool can_bind,
 struct binding_ctx_in *b_ctx_in,
 struct binding_ctx_out *b_ctx_out,
-struct binding_lport *b_lport,
-struct hmap *qos_map)
+struct binding_lport *b_lport)
 {
 bool lbinding_set = b_lport && is_lbinding_set(b_lport->lbinding);
 
@@ -1502,8 +1527,8 @@ consider_vif_lport_(const struct sbrec_port_binding *pb,
 tracked_datapath_lport_add(pb, TRACKED_RESOURCE_UPDATED,
b_ctx_out->tracked_dp_bindings);
 }
-if (b_lport->lbinding->iface && qos_map && b_ctx_in->ovs_idl_txn) {
-get_qos_params(pb, qos_map);
+if (b_lport->lbinding->iface && b_ctx_in->ovs_idl_txn) {
+get_qos_queue(pb, b_ctx_out->qos_map);
 }
 } else {
 /* We could, but can't claim the lport. */
@@ -1537,8 +1562,7 @@ static bool
 consider_vif_lport(const struct sbrec_port_binding *pb,

[ovs-dev] [PATCH v8 ovn 03/10] northd: add qos_physical_network in port_binding config column

2023-05-17 Thread Lorenzo Bianconi
Introduce qos_physical_network in port_binding config column in order to
indicate the name of the egress network name where traffic shaping will
be applied.
This is a preliminary patch to rework OVN QoS implementation in order to
configure it through OVS QoS table instead of running tc command
directly bypassing OVS.

Signed-off-by: Lorenzo Bianconi 
---
 controller/binding.c |  8 
 northd/northd.c  | 10 ++
 ovn-sb.xml   |  5 +
 tests/ovn-northd.at  | 22 ++
 4 files changed, 45 insertions(+)

diff --git a/controller/binding.c b/controller/binding.c
index ad19a4092..91437dbb8 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -142,6 +142,7 @@ static void update_lport_tracking(const struct 
sbrec_port_binding *pb,
 struct qos_queue {
 struct hmap_node node;
 
+char *network;
 char *port;
 
 uint32_t queue_id;
@@ -165,6 +166,7 @@ find_qos_queue(struct hmap *queue_map, uint32_t hash, const 
char *port)
 static void
 qos_queue_erase_entry(struct qos_queue *q)
 {
+free(q->network);
 free(q->port);
 free(q);
 }
@@ -186,6 +188,7 @@ get_qos_queue(const struct sbrec_port_binding *pb, struct 
hmap *queue_map)
 uint32_t max_rate = smap_get_int(>options, "qos_max_rate", 0);
 uint32_t burst = smap_get_int(>options, "qos_burst", 0);
 uint32_t queue_id = smap_get_int(>options, "qdisc_queue_id", 0);
+const char *network = smap_get(>options, "qos_physical_network");
 
 if ((!min_rate && !max_rate && !burst) || !queue_id) {
 /* Qos is not configured for this port. */
@@ -200,6 +203,11 @@ get_qos_queue(const struct sbrec_port_binding *pb, struct 
hmap *queue_map)
 q->port = xstrdup(pb->logical_port);
 q->queue_id = queue_id;
 }
+
+free(q->network);
+if (network) {
+q->network = xstrdup(network);
+}
 q->min_rate = min_rate;
 q->max_rate = max_rate;
 q->burst = burst;
diff --git a/northd/northd.c b/northd/northd.c
index 7190cd18f..470f76809 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -3505,7 +3505,17 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
 }
 
 smap_clone(, >nbsp->options);
+
 if (queue_id) {
+if (op->od->n_localnet_ports) {
+struct ovn_port *port = op->od->localnet_ports[0];
+const char *physical_network = smap_get(
+>nbsp->options, "network_name");
+if (physical_network) {
+smap_add(, "qos_physical_network",
+ physical_network);
+}
+}
 smap_add_format(,
 "qdisc_queue_id", "%d", queue_id);
 }
diff --git a/ovn-sb.xml b/ovn-sb.xml
index ead9efcab..0988cb1f8 100644
--- a/ovn-sb.xml
+++ b/ovn-sb.xml
@@ -3691,6 +3691,11 @@ tcp.flags = RST;
 interface, in bits.
   
 
+  
+If set, indicates the name of the egress network name where traffic
+shaping will be applied.
+  
+
   
 Indicates the queue number on the physical device. This is same as the
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 1f6169b77..a9af0f76a 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -9064,3 +9064,25 @@ mac_binding_timestamp: true
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([check OVN QoS])
+AT_KEYWORDS([OVN-QoS])
+ovn_start
+
+check ovn-nbctl ls-add ls
+check ovn-nbctl lsp-add ls public
+check ovn-nbctl lsp-set-type public localnet
+check ovn-nbctl lsp-set-addresses public unknown
+
+check_column "" sb:Port_Binding options logical_port=public
+
+check ovn-nbctl --wait=sb set Logical_Switch_Port public 
options:qos_min_rate=20
+AT_CHECK([fetch_column sb:Port_Binding options logical_port=public |grep -q 
'qos_min_rate=20'])
+AT_CHECK([fetch_column sb:Port_Binding options logical_port=public |grep -q 
'qos_physical_network'],[1])
+
+check ovn-nbctl --wait=sb set Logical_Switch_Port public 
options:qos_min_rate=20 options:network_name=phys
+AT_CHECK([fetch_column sb:Port_Binding options logical_port=public |grep -q 
'qos_physical_network=phys'])
+
+AT_CLEANUP
+])
-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v8 ovn 01/10] controller: remove tunnel interfaces from egress_ifaces sset

2023-05-17 Thread Lorenzo Bianconi
Remove tunnel interfaces from egress list in order to not shape them.

Acked-By: Ihar Hrachyshka 
Tested-by: Rodolfo Alonso 
Signed-off-by: Lorenzo Bianconi 
---
 controller/binding.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index bd810f669..a0fbacc97 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -1938,15 +1938,6 @@ build_local_bindings(struct binding_ctx_in *b_ctx_in,
 smap_replace(b_ctx_out->local_iface_ids, iface_rec->name,
  iface_id);
 }
-
-/* Check if this is a tunnel interface. */
-if (smap_get(_rec->options, "remote_ip")) {
-const char *tunnel_iface
-= smap_get(_rec->status, "tunnel_egress_iface");
-if (tunnel_iface) {
-sset_add(b_ctx_out->egress_ifaces, tunnel_iface);
-}
-}
 }
 }
 }
-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v8 ovn 00/10] Configure OVN QoS thorugh OvS db

2023-05-17 Thread Lorenzo Bianconi
Rework OVN QoS implementation in order to configure it through OVS QoS
table instead of running tc command directly bypassing OVS.
This series allows to apply QoS rules on the localnet port related to
logical switch ports running on the same datapath. Considering the
following netowrk configuration:

LSP{0,1} -- LogicalSwitch -- Localnet0

It is possible to apply the following QoS rules on Localnet0 on egress traffic
entering the cluster from LSP{0,1}:
- LSP0: min-rate r0, max_rate R0
- LSP1: min-rate r1, max_rate R1

https://bugzilla.redhat.com/show_bug.cgi?id=2129742

Changes since v7:
- make queue_id allocation in norhd global
- update QoS configuration in binding_handle_ovs_interface_changes()
- fix "ovn-controller incremental processing" test
- fix possible NULL pointer dereference
- add new ovn-northd unit test
- get rid of ovs port lookup by name
- rebase on top of ovn main branch
Changes since v6:
- run add_ovs_qos_table_entry() and remove_stale_qos_entry() in setup_qos()
- rename setup_qos() in configure_qos()
- add some new unit-tests in ovn.at
- erase QoS configuration if related port_binding is removed
Changes since v5:
- add IP for qos_map map
- add some new unit-tests in ovn.at
Changes since v4:
- do not remove ovn-egress-iface parameter
- rebase on top of ovn main branch

Lorenzo Bianconi (10):
  controller: remove tunnel interfaces from egress_ifaces sset
  controller: add incremental processing for ovn-controller qos_map
  northd: add qos_physical_network in port_binding config column
  controller: configure qos through ovs qos table and do not run tc
directly
  controller: improve ovs port lookup by qos
  controller: use unsigned long long int for
qos_max_rate/qos_min_rate/qos_burst
  northd: make queue_id allocation global for the ovn cluster
  northd: apply QoS rules on the localnet port related to LSP ports
  controller: get rid of egress_ifaces sset
  update NEWS with new QoS info

 NEWS|   6 +
 controller/binding.c| 622 +---
 controller/binding.h|   6 +-
 controller/ovn-controller.c |  34 +-
 controller/ovsport.c|  16 +
 controller/ovsport.h|   3 +
 northd/northd.c | 163 +++---
 northd/ovn-northd.8.xml |  12 +
 ovn-sb.xml  |   5 +
 tests/ovn-northd.at |  24 ++
 tests/ovn-performance.at|   5 -
 tests/ovn.at| 185 +++
 tests/system-ovn.at | 151 -
 13 files changed, 772 insertions(+), 460 deletions(-)

-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [ovn] ha-chassis-group false positive failover

2023-05-17 Thread Vladislav Odintsov


> On 16 May 2023, at 23:54, Numan Siddique  wrote:
> 
> Hi Vladislav,
> 
> Sorry for the late reply.
> 
> PSB for few comments.


Thanks for your reply!
My answers are inline.

> 
> 
> 
> On Tue, May 16, 2023 at 3:42 PM Vladislav Odintsov  > wrote:
>> 
>> Hi Numan, Dumitru, Ilya,
>> 
>> I guess I’ve managed to deal with some of these problems and have some 
>> conclusions.
>> Please see inline.
>> 
>>> On 14 Apr 2023, at 16:26, Vladislav Odintsov  wrote:
>>> 
>>> And as a follow-up, I see sometimes next error message in ovn-controller 
>>> log, though there is a configured much higher limit in seconds for 
>>> inactivity probe value for the chassis:
>>> 
>> 
>> ovn-controller establishes 3 openflow connections to local ovs-vswitchd for 
>> next purposes:
>> 1. openflow rules management (openflow probe interval is configured through 
>> ovn-openflow-probe-interval external_ids setting [1]; default — disabled 
>> [2]);
>> 2. openflow features collection (hardcoded probe interval 5s [3]);
>> 3. pinctrl thread (hardcoded probe interval 5s [4]).
>> 
>>> ovs|04426|rconn|ERR|unix:/run/openvswitch/br-int.mgmt: no response to 
>>> inactivity probe after 8 seconds, disconnecting
>>> 
>>> ovs|26560|rconn|ERR|unix:/run/openvswitch/br-int.mgmt: no response to 
>>> inactivity probe after 10 seconds, disconnecting
>>> 
>> 
>> These messages most likely related to the connection for features collection 
>> or pinctrl thread. So ovn-openflow-probe-interval has no matter here.
>> 
>>> 
>>> server with first log message:
>>> # ovs-vsctl get open . external_ids:ovn-openflow-probe-interval
>>> "30"
>>> 
>>> server with second log message:
>>> # ovs-vsctl get open . external_ids:ovn-openflow-probe-interval
>>> "60"
>>> 
>>> 
>>> 
 On 30 Mar 2023, at 23:19, Vladislav Odintsov  wrote:
 
 Hi all,
 
 I’m facing a false positive failover in the event of openflow connection 
 got dropped after inactivity probe timeout due to high load of 
 ovn-controller handling huge amount of ovsdb updates.
 I wonder wether its a bug or I have to tune any specifics. See 
 ovn-controller logs:
 
 2023-03-30T07:57:24.469Z|58440|inc_proc_eng|INFO|node: 
 logical_flow_output, recompute (failed handler for input port_groups) took 
 35774ms
 2023-03-30T07:57:28.521Z|58441|lflow_cache|INFO|Detected cache inactivity 
 (last active 40006 ms ago): trimming cache
 2023-03-30T07:57:28.528Z|58442|timeval|WARN|Unreasonably long 40084ms poll 
 interval (32947ms user, 6968ms system)
 2023-03-30T07:57:28.528Z|58443|timeval|WARN|faults: 43468 minor, 0 major
 …..
 2023-03-30T07:57:48.784Z|58496|reconnect|ERR|ssl::6642: no response to 
 inactivity probe after 60.2 seconds, disconnecting
 2023-03-30T07:57:48.784Z|58497|reconnect|INFO|ssl::6642: connection 
 dropped
 2023-03-30T07:57:48.905Z|58498|main|INFO|OVNSB commit failed, force 
 recompute next time.
 2023-03-30T07:57:49.786Z|58499|reconnect|INFO|ssl::6642: connecting...
 2023-03-30T07:57:49.924Z|58500|reconnect|INFO|ssl::6642: connected
 2023-03-30T07:57:29.781Z|58501|poll_loop|INFO|wakeup due to 0-ms timeout 
 at lib/stream-ssl.c:838 (74% CPU usage)
 2023-03-30T07:57:29.836Z|58502|poll_loop|INFO|wakeup due to 0-ms timeout 
 at lib/stream-ssl.c:838 (74% CPU usage)
 ….
 2023-03-30T07:57:05.976Z|58545|poll_loop|INFO|wakeup due to 0-ms timeout 
 at lib/stream-ssl.c:838 (101% CPU usage)
 2023-03-30T07:57:07.002Z|58546|memory|INFO|peak resident set size grew 54% 
 in last 682257.2 seconds, from 4435696 kB to 6824820 kB
 2023-03-30T07:57:07.002Z|58547|memory|INFO|idl-cells-OVN_Southbound:11491359
  idl-cells-Open_vSwitch:14416 lflow-cache-entries-cache-expr:530298 
 lflow-cache-entries-cache-matches:31770 lflow-cache-size-KB:769553 
 local_datapath_usage-KB:630
 ofctrl_desired_flow_usage-KB:577052 ofctrl_installed_flow_usage-KB:409538 
 ofctrl_sb_flow_ref_usage-KB:273707
 …
 2023-03-30T07:57:31.667Z|58552|rconn|WARN|unix:/run/openvswitch/br-int.mgmt:
  connection dropped (Broken pipe)
 2023-03-30T07:57:31.739Z|58553|binding|INFO|Releasing lport cr-xxx from 
 this chassis (sb_readonly=0)
 2023-03-30T07:57:31.739Z|58554|if_status|WARN|Dropped 1 log messages in 
 last 1254725 seconds (most recently, 1254725 seconds ago) due to excessive 
 rate
 2023-03-30T07:57:31.739Z|58555|if_status|WARN|Trying to release unknown 
 interface cr-xxx
 2023-03-30T07:57:31.743Z|58556|binding|INFO|Releasing lport cr-yyy from 
 this chassis (sb_readonly=0)
 2023-03-30T07:57:31.743Z|58557|binding|INFO|Releasing lport cr-zzz from 
 this chassis (sb_readonly=0)
 …
 
 After some time these ports are claimed back:
 
 2023-03-30T07:57:49.320Z|59137|rconn|INFO|unix:/run/openvswitch/br-int.mgmt:
  connecting...