[ovs-dev] [PATCH ovn v4 1/1] Add support for overlay provider networks.

2024-06-06 Thread numans
From: Numan Siddique 

It is expected that a provider network logical switch has a localnet logical
switch port in order to bridge the overlay traffic to the underlay traffic.
There can be some usecases where a overlay logical switch (without
a localnet port) can act as a provider network and presently NAT doesn't
work as expected.  This patch adds this support.  A new config option
"overlay_provider_network" is added to support this feature.
This feature gets enabled for a logical switch 'P' if:
  - The above option is set to true in the Logical_Switch.other_config
column.
  - The logical switch 'P' doesn't have any localnet ports.
  - The logical router port of a router 'R' connecting to 'P'
is a gateway router port.
  - And the logical router 'R' has only one gateway router port.

If all the above conditions are met, ovn-northd creates a chassisredirect
port for the logical switch port (of type router) connecting to the
router 'R'.  For example, if the logical port is named as "P-R" and its
peer router port is "R-P", then chassisredirect port cr-P-R is created
along with cr-R-P.  Gateway chassis binding the cr-R-P also binds cr-P-R.
This ensures that the routing is centralized on this gateway chassis for
the traffic coming from switch "P" towards the router or vice versa.
This centralization is required in order to support NAT (both SNAT and
DNAT).  Distributed NAT (i.e if external_mac and router_port is set) is
not supported and instead the router port mac is used for such traffic.

Reported-at: https://issues.redhat.com/browse/FDP-364
Signed-off-by: Numan Siddique 
---
 NEWS  |   2 +
 controller/physical.c |   4 +
 northd/northd.c   | 239 ++
 northd/northd.h   |   1 +
 ovn-nb.xml|  27 ++
 tests/multinode-macros.at |   2 +-
 tests/multinode.at| 177 +
 tests/ovn-northd.at   | 520 +-
 tests/ovn.at  |   8 +-
 9 files changed, 928 insertions(+), 52 deletions(-)

diff --git a/NEWS b/NEWS
index 3bdc551728..51e6eeabc9 100644
--- a/NEWS
+++ b/NEWS
@@ -31,6 +31,8 @@ Post v24.03.0
 has been renamed to "options:ic-route-denylist" in order to comply with
 inclusive language guidelines. The previous name is still recognized to
 aid with backwards compatibility.
+  - Added Overlay provider network support to a logical switch if
+the config "overlay_provider_network" is set to true.
 
 OVN v24.03.0 - 01 Mar 2024
 --
diff --git a/controller/physical.c b/controller/physical.c
index 25da789f0b..69a617f341 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -1607,6 +1607,10 @@ consider_port_binding(struct ovsdb_idl_index 
*sbrec_port_binding_by_name,
 ct_zones);
 put_zones_ofpacts(_ids, ofpacts_p);
 
+/* Clear the MFF_INPORT.  Its possible that the same packet may
+ * go out from the same tunnel inport. */
+put_load(ofp_to_u16(OFPP_NONE), MFF_IN_PORT, 0, 16, ofpacts_p);
+
 /* Resubmit to table 41. */
 put_resubmit(OFTABLE_CHECK_LOOPBACK, ofpacts_p);
 }
diff --git a/northd/northd.c b/northd/northd.c
index 9f81afccbb..2af9295a50 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -2098,6 +2098,53 @@ parse_lsp_addrs(struct ovn_port *op)
 }
 }
 
+static struct ovn_port *
+create_cr_port(struct ovn_port *op, struct hmap *ports,
+   struct ovs_list *both_dbs, struct ovs_list *nb_only)
+{
+char *redirect_name = ovn_chassis_redirect_name(
+op->nbsp ? op->nbsp->name : op->nbrp->name);
+
+struct ovn_port *crp = ovn_port_find(ports, redirect_name);
+if (crp && crp->sb && crp->sb->datapath == op->od->sb) {
+ovn_port_set_nb(crp, NULL, op->nbrp);
+ovs_list_remove(>list);
+ovs_list_push_back(both_dbs, >list);
+} else {
+crp = ovn_port_create(ports, redirect_name,
+  op->nbsp, op->nbrp, NULL);
+ovs_list_push_back(nb_only, >list);
+}
+
+crp->primary_port = op;
+op->cr_port = crp;
+crp->od = op->od;
+free(redirect_name);
+
+return crp;
+}
+
+/* Returns true if chassis resident port needs to be created for
+ * op's peer logical switch.  False otherwise.
+ *
+ * Chassis resident port needs to be created if the op's logical router
+ *   - Has only one gateway router port and
+ *   - op's peer logical switch has no localnet ports and
+ *   - op's logical switch has the option 'overlay_provider_network'
+ * set to true.
+ */
+static bool
+peer_needs_cr_port_creation(struct ovn_port *op)
+{
+if (op->od->n_l3dgw_ports == 1 && op->peer && op->peer->nbsp
+&& !op->peer->od->n_localnet_ports) {
+return smap_get_bool(>peer->od->nbs->other_config,
+ "overlay_provider_network", false);
+}
+
+return false;
+}
+
 static void
 

[ovs-dev] [PATCH ovn v4 0/1] Overlay provider network support.

2024-06-06 Thread numans
From: Numan Siddique 

This patch adds support of overlay provider networks so that
NAT can be supported on the logical switch subnet even if there
are no localnet ports.

v3 -> v4
--
  * Patches 1 and 2 of v3 are merged.
  * Renamed the function name in the patch 3 of v3 as suggested by Mark.

v2 -> v3
--
  * Rebased and resolved conflicts.

v1 -> v2
--
  * Added a new patch, patch 2 to the series to refactor the cr port
code.  This patch was earlier submitted separately here:

https://patchwork.ozlabs.org/project/ovn/patch/20240507215713.902148-1-num...@ovn.org/
  * Patch 2 of v1 is now patch 3.
  * Addressed review comments from Mark.

Numan Siddique (1):
  Add support for overlay provider networks.

 NEWS  |   2 +
 controller/physical.c |   4 +
 northd/northd.c   | 239 ++
 northd/northd.h   |   1 +
 ovn-nb.xml|  27 ++
 tests/multinode-macros.at |   2 +-
 tests/multinode.at| 177 +
 tests/ovn-northd.at   | 520 +-
 tests/ovn.at  |   8 +-
 9 files changed, 928 insertions(+), 52 deletions(-)

-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3 3/3] Add support for overlay provider networks.

2024-06-06 Thread Numan Siddique
On Thu, Jun 6, 2024 at 4:11 PM Mark Michelson  wrote:
>
> Hi Numan,
>
> I have only one small comment below.
>
> On 5/21/24 16:17, num...@ovn.org wrote:
> > From: Numan Siddique 
> >
> > It is expected that a provider network logical switch has a localnet logical
> > switch port in order to bridge the overlay traffic to the underlay traffic.
> > There can be some usecases where a overlay logical switch (without
> > a localnet port) can act as a provider network and presently NAT doesn't
> > work as expected.  This patch adds this support.  A new config option
> > "overlay_provider_network" is added to support this feature.
> > This feature gets enabled for a logical switch 'P' if:
> >- The above option is set to true in the Logical_Switch.other_config
> >  column.
> >- The logical switch 'P' doesn't have any localnet ports.
> >- The logical router port of a router 'R' connecting to 'P'
> >  is a gateway router port.
> >- And the logical router 'R' has only one gateway router port.
> >
> > If all the above conditions are met, ovn-northd creates a chassisredirect
> > port for the logical switch port (of type router) connecting to the
> > router 'R'.  For example, if the logical port is named as "P-R" and its
> > peer router port is "R-P", then chassisredirect port cr-P-R is created
> > along with cr-R-P.  Gateway chassis binding the cr-R-P also binds cr-P-R.
> > This ensures that the routing is centralized on this gateway chassis for
> > the traffic coming from switch "P" towards the router or vice versa.
> > This centralization is required in order to support NAT (both SNAT and
> > DNAT).  Distributed NAT (i.e if external_mac and router_port is set) is
> > not supported and instead the router port mac is used for such traffic.
> >
> > Reported-at: https://issues.redhat.com/browse/FDP-364
> > Signed-off-by: Numan Siddique 
> > ---
> >   NEWS  |   2 +
> >   controller/physical.c |   4 +
> >   northd/northd.c   | 239 ++
> >   northd/northd.h   |   1 +
> >   ovn-nb.xml|  27 ++
> >   tests/multinode-macros.at |   2 +-
> >   tests/multinode.at| 177 +
> >   tests/ovn-northd.at   | 520 +-
> >   tests/ovn.at  |   8 +-
> >   9 files changed, 928 insertions(+), 52 deletions(-)
> >
> > diff --git a/NEWS b/NEWS
> > index 81c958f9a0..b501d064f0 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -21,6 +21,8 @@ Post v24.03.0
> >   MAC addresses configured on the LSP with "unknown", are learnt via the
> >   OVN native FDB.
> > - Add support for ovsdb-server `--config-file` option in ovn-ctl.
> > +  - Added Overlay provider network support to a logical switch if
> > +the config "overlay_provider_network" is set to true.
> >
> >   OVN v24.03.0 - 01 Mar 2024
> >   --
> > diff --git a/controller/physical.c b/controller/physical.c
> > index 7ee3086940..625e37e8a7 100644
> > --- a/controller/physical.c
> > +++ b/controller/physical.c
> > @@ -1587,6 +1587,10 @@ consider_port_binding(struct ovsdb_idl_index 
> > *sbrec_port_binding_by_name,
> >   ct_zones);
> >   put_zones_ofpacts(_ids, ofpacts_p);
> >
> > +/* Clear the MFF_INPORT.  Its possible that the same packet may
> > + * go out from the same tunnel inport. */
> > +put_load(ofp_to_u16(OFPP_NONE), MFF_IN_PORT, 0, 16, ofpacts_p);
> > +
> >   /* Resubmit to table 41. */
> >   put_resubmit(OFTABLE_CHECK_LOOPBACK, ofpacts_p);
> >   }
> > diff --git a/northd/northd.c b/northd/northd.c
> > index 6393d688f6..65999d82c4 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -2098,6 +2098,53 @@ parse_lsp_addrs(struct ovn_port *op)
> >   }
> >   }
> >
> > +static struct ovn_port *
> > +create_cr_port(struct ovn_port *op, struct hmap *ports,
> > +   struct ovs_list *both_dbs, struct ovs_list *nb_only)
> > +{
> > +char *redirect_name = ovn_chassis_redirect_name(
> > +op->nbsp ? op->nbsp->name : op->nbrp->name);
> > +
> > +struct ovn_port *crp = ovn_port_find(ports, redirect_name);
> > +if (crp && crp->sb && crp->sb->datapath == op->od->sb) {
> > +ovn_port_set_nb(crp, NULL, op->nbrp);
> > +ovs_list_remove(>list);
> > +ovs_list_push_back(both_dbs, >list);
> > +} else {
> > +crp = ovn_port_create(ports, redirect_name,
> > +  op->nbsp, op->nbrp, NULL);
> > +ovs_list_push_back(nb_only, >list);
> > +}
> > +
> > +crp->primary_port = op;
> > +op->cr_port = crp;
> > +crp->od = op->od;
> > +free(redirect_name);
> > +
> > +return crp;
> > +}
> > +
> > +/* Returns true if chassis resident port needs to be created for
> > + * op's peer logical switch.  False otherwise.
> > + *
> > + * Chassis resident port needs to be created if the 

Re: [ovs-dev] [PATCH ovn 6/8] tests: Wait for controller exit before restart.

2024-06-06 Thread Mark Michelson

Hi Xavier,

This is the only patch in the series for which I have a recommendation. 
The rest all look good to me.


On 6/4/24 09:10, Xavier Simonart wrote:

In some rare cases, and despite "recent" changes to wait for cleanup
before replying to exit, ovn-controller was still running when trying
to restart it.

Signed-off-by: Xavier Simonart 
---
  tests/ovn-macros.at |  9 +
  tests/ovn.at| 17 ++---
  2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at
index 47ada5c70..71a46db8b 100644
--- a/tests/ovn-macros.at
+++ b/tests/ovn-macros.at
@@ -1107,6 +1107,15 @@ m4_define([OVN_CHECK_SCAPY_EDNS_CLIENT_SUBNET_SUPPORT],
  AT_SKIP_IF([! echo "from scapy.layers.dns import EDNS0ClientSubnet" | python 
2>&1 > /dev/null])
  ])
  
+m4_define([OVN_CONTROLLER_EXIT_RESTART],


I think this should just be called "OVN_CONTROLLER_RESTART". The goal is 
to restart ovn-controller, so I think the simpler name makes sense. It's 
true that we are using an "exit" command for ovn-appctl, but that is 
more of an implementation detail than anything else.



+  [TMPPID=$(cat $1/ovn-controller.pid)
+   AT_CHECK([as $1 ovn-appctl -t ovn-controller exit --restart])
+   # Make sure ovn-controller stopped so that a future restart will not fail.
+   # Checking debug/status is running is not enough as there might be a small 
race condition.
+   echo "Waiting for pid $TMPPID"
+   OVS_WAIT_WHILE([kill -0 $TMPPID 2>/dev/null])
+])
+
  m4_define([OFTABLE_PHY_TO_LOG], [0])
  m4_define([OFTABLE_LOG_INGRESS_PIPELINE], [8])
  m4_define([OFTABLE_OUTPUT_LARGE_PKT_DETECT], [37])
diff --git a/tests/ovn.at b/tests/ovn.at
index 2dd0dfd2e..8fa26c192 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -20615,7 +20615,7 @@ echo $expected | ovstest test-ovn expr-to-packets > 
expected
  OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
  
  # Stop ovn-controller on hv2 with --restart flag

-as hv2 ovs-appctl -t ovn-controller exit --restart
+OVN_CONTROLLER_EXIT_RESTART([hv2])
  
  # Now send the packet again. This time, it should still arrive

  OVS_WAIT_UNTIL([as hv1 ovs-appctl -t ovn-controller inject-pkt "$packet"])
@@ -29316,7 +29316,7 @@ check test "$hvt2" -gt 0
  # Kill ovn-controller on chassis hv3, so that it won't update nb_cfg.
  # Then wait for 9 out of 10
  sleep 1
-check as hv3 ovn-appctl -t ovn-controller exit --restart
+OVN_CONTROLLER_EXIT_RESTART([hv3])
  wait_for_ports_up
  ovn-nbctl --wait=sb sync
  wait_row_count Chassis_Private 9 name!=hv3 nb_cfg=2
@@ -36252,7 +36252,7 @@ check_tunnel_port hv1 br-int hv2@192.168.0.2%192.168.0.1
  check_tunnel_port hv2 br-int hv1@192.168.0.1%192.168.0.2
  
  # Stop ovn-controller on hv1

-check as hv1 ovn-appctl -t ovn-controller exit --restart
+OVN_CONTROLLER_EXIT_RESTART([hv1])
  
  # The tunnel should remain intact

  check_tunnel_port hv1 br-int hv2@192.168.0.2%192.168.0.1
@@ -36281,7 +36281,7 @@ check_tunnel_port hv2 br-int1 
hv1@192.168.0.1%192.168.0.2
  check grep -q "Clearing old tunnel port \"ovn-hv1-0\" (hv1@192.168.0.1%192.168.0.2) from 
bridge \"br-int\"" hv2/ovn-controller.log
  
  # Stop ovn-controller on hv1

-check as hv1 ovn-appctl -t ovn-controller exit --restart
+OVN_CONTROLLER_EXIT_RESTART([hv1])
  
  # The tunnel should remain intact

  check_tunnel_port hv1 br-int1 hv2@192.168.0.2%192.168.0.1
@@ -36371,10 +36371,7 @@ prev_id2=$(ovs-vsctl --bare --columns _uuid find port 
external_ids:ovn-chassis-i
  # The hv2 is running we can remove the override file
  rm -f ${OVN_SYSCONFDIR}/system-id-override
  
-check ovn-appctl -t ovn-controller exit --restart

-
-# Make sure ovn-controller stopped before restarting it
-OVS_WAIT_UNTIL([test x$(ovn-appctl -t ovn-controller debug/status) != 
"xrunning"])
+OVN_CONTROLLER_EXIT_RESTART([hv1])
  
  # for some reason SSL ovsdb configuration overrides CLI, so

  # delete ssl config from ovsdb to give CLI arguments priority
@@ -37086,9 +37083,7 @@ AT_CHECK([grep -c "NXT_CT_FLUSH_ZONE" 
hv1/ovs-vswitchd.log], [0], [dnl
  ])
  
  AS_BOX([Check conversion from UUID - restart])

-ovn-appctl -t ovn-controller exit --restart
-# Make sure ovn-controller stopped before restarting it
-OVS_WAIT_UNTIL([test "$(ovn-appctl -t ovn-controller debug/status)" != 
"running"])
+OVN_CONTROLLER_EXIT_RESTART([hv1])
  
  replace_with_uuid lr0

  replace_with_uuid sw0


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] controller: Send RARP/GARP for VIF post link state is up.

2024-06-06 Thread Mark Michelson

Thank you for the patch Shibir. It looks good to me, so

Acked-by: Mark Michelson 

On 5/27/24 14:24, Shibir Basak wrote:

Currently, GARP/RARP broadcast is sent for VIFs (part of logical
switch with localnet port) after iface-id is set.
This fix is to avoid packet loss during migration if iface-id
is set even before the VM migration is completed.

Signed-off-by: Shibir Basak 
Acked-by: Naveen Yerramneni 
---
  controller/ovn-controller.c | 1 +
  controller/pinctrl.c| 4 
  2 files changed, 5 insertions(+)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 6b38f113d..982378a50 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -1128,6 +1128,7 @@ ctrl_register_ovs_idl(struct ovsdb_idl *ovs_idl)
  ovsdb_idl_add_table(ovs_idl, _table_queue);
  ovsdb_idl_add_column(ovs_idl, _queue_col_other_config);
  ovsdb_idl_add_column(ovs_idl, _queue_col_external_ids);
+ovsdb_idl_add_column(ovs_idl, _interface_col_link_state);
  
  chassis_register_ovs_idl(ovs_idl);

  encaps_register_ovs_idl(ovs_idl);
diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 6a2c3dc68..b5d3162b8 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -6375,6 +6375,10 @@ get_localnet_vifs_l3gwports(
  if (!pb || pb->chassis != chassis) {
  continue;
  }
+if (!iface_rec->link_state ||
+strcmp(iface_rec->link_state, "up")) {
+continue;
+}
  struct local_datapath *ld
  = get_local_datapath(local_datapaths,
   pb->datapath->tunnel_key);


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3 3/3] Add support for overlay provider networks.

2024-06-06 Thread Mark Michelson

Hi Numan,

I have only one small comment below.

On 5/21/24 16:17, num...@ovn.org wrote:

From: Numan Siddique 

It is expected that a provider network logical switch has a localnet logical
switch port in order to bridge the overlay traffic to the underlay traffic.
There can be some usecases where a overlay logical switch (without
a localnet port) can act as a provider network and presently NAT doesn't
work as expected.  This patch adds this support.  A new config option
"overlay_provider_network" is added to support this feature.
This feature gets enabled for a logical switch 'P' if:
   - The above option is set to true in the Logical_Switch.other_config
 column.
   - The logical switch 'P' doesn't have any localnet ports.
   - The logical router port of a router 'R' connecting to 'P'
 is a gateway router port.
   - And the logical router 'R' has only one gateway router port.

If all the above conditions are met, ovn-northd creates a chassisredirect
port for the logical switch port (of type router) connecting to the
router 'R'.  For example, if the logical port is named as "P-R" and its
peer router port is "R-P", then chassisredirect port cr-P-R is created
along with cr-R-P.  Gateway chassis binding the cr-R-P also binds cr-P-R.
This ensures that the routing is centralized on this gateway chassis for
the traffic coming from switch "P" towards the router or vice versa.
This centralization is required in order to support NAT (both SNAT and
DNAT).  Distributed NAT (i.e if external_mac and router_port is set) is
not supported and instead the router port mac is used for such traffic.

Reported-at: https://issues.redhat.com/browse/FDP-364
Signed-off-by: Numan Siddique 
---
  NEWS  |   2 +
  controller/physical.c |   4 +
  northd/northd.c   | 239 ++
  northd/northd.h   |   1 +
  ovn-nb.xml|  27 ++
  tests/multinode-macros.at |   2 +-
  tests/multinode.at| 177 +
  tests/ovn-northd.at   | 520 +-
  tests/ovn.at  |   8 +-
  9 files changed, 928 insertions(+), 52 deletions(-)

diff --git a/NEWS b/NEWS
index 81c958f9a0..b501d064f0 100644
--- a/NEWS
+++ b/NEWS
@@ -21,6 +21,8 @@ Post v24.03.0
  MAC addresses configured on the LSP with "unknown", are learnt via the
  OVN native FDB.
- Add support for ovsdb-server `--config-file` option in ovn-ctl.
+  - Added Overlay provider network support to a logical switch if
+the config "overlay_provider_network" is set to true.
  
  OVN v24.03.0 - 01 Mar 2024

  --
diff --git a/controller/physical.c b/controller/physical.c
index 7ee3086940..625e37e8a7 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -1587,6 +1587,10 @@ consider_port_binding(struct ovsdb_idl_index 
*sbrec_port_binding_by_name,
  ct_zones);
  put_zones_ofpacts(_ids, ofpacts_p);
  
+/* Clear the MFF_INPORT.  Its possible that the same packet may

+ * go out from the same tunnel inport. */
+put_load(ofp_to_u16(OFPP_NONE), MFF_IN_PORT, 0, 16, ofpacts_p);
+
  /* Resubmit to table 41. */
  put_resubmit(OFTABLE_CHECK_LOOPBACK, ofpacts_p);
  }
diff --git a/northd/northd.c b/northd/northd.c
index 6393d688f6..65999d82c4 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -2098,6 +2098,53 @@ parse_lsp_addrs(struct ovn_port *op)
  }
  }
  
+static struct ovn_port *

+create_cr_port(struct ovn_port *op, struct hmap *ports,
+   struct ovs_list *both_dbs, struct ovs_list *nb_only)
+{
+char *redirect_name = ovn_chassis_redirect_name(
+op->nbsp ? op->nbsp->name : op->nbrp->name);
+
+struct ovn_port *crp = ovn_port_find(ports, redirect_name);
+if (crp && crp->sb && crp->sb->datapath == op->od->sb) {
+ovn_port_set_nb(crp, NULL, op->nbrp);
+ovs_list_remove(>list);
+ovs_list_push_back(both_dbs, >list);
+} else {
+crp = ovn_port_create(ports, redirect_name,
+  op->nbsp, op->nbrp, NULL);
+ovs_list_push_back(nb_only, >list);
+}
+
+crp->primary_port = op;
+op->cr_port = crp;
+crp->od = op->od;
+free(redirect_name);
+
+return crp;
+}
+
+/* Returns true if chassis resident port needs to be created for
+ * op's peer logical switch.  False otherwise.
+ *
+ * Chassis resident port needs to be created if the op's logical router
+ *   - Has only one gateway router port and
+ *   - op's peer logical switch has no localnet ports and
+ *   - op's logical switch has the option 'overlay_provider_network'
+ * set to true.
+ */
+static bool
+needs_cr_port_creation(struct ovn_port *op)


Since this function checks if op->peer needs to have a chassis-resident 
port created, could this be renamed to something like 
"peer_needs_cr_port_creation()" ?



+{
+if (op->od->n_l3dgw_ports 

Re: [ovs-dev] [PATCH ovn v3 2/3] northd: Refactor chassisresident port checking.

2024-06-06 Thread Mark Michelson
Thank you very much for clarifying things in the code. This is *much* 
needed.


Acked-by: Mark Michelson 

On 5/21/24 16:17, num...@ovn.org wrote:

From: Numan Siddique 

The implementation of util functions "is_cr_port()" and "is_l3dgw_port()"
are confusing and not very intuitive.  This patch adds some
documentation.  It also renames the struct ovn_port member 'l3dgw_port'
to 'primary_port'.  If struct ovn_port->primary_port is set, then it
means 'ovn_port' instance is a chassis resident port and it is derived
from the primary port. An upcoming patch creates a chassisresident port
even for a logical switch port of type "patch" and hence renamed to
avoid confusion.

Signed-off-by: Numan Siddique 
---
  northd/northd.c | 48 +++-
  northd/northd.h | 10 --
  2 files changed, 39 insertions(+), 19 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 2d0946218a..6393d688f6 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -1077,16 +1077,36 @@ struct ovn_port_routable_addresses {
  
  static bool lsp_can_be_inc_processed(const struct nbrec_logical_switch_port *);
  
+/* This function returns true if 'op' is a gateway router port.

+ * False otherwise.
+ * For 'op' to be a gateway router port.
+ *  1. op->nbrp->gateway_chassis or op->nbrp->ha_chassis_group should
+ * be configured.
+ *  2. op->cr_port should not be NULL.  If op->nbrp->gateway_chassis or
+ * op->nbrp->ha_chassis_group is set by the user, northd WILL create
+ * a chassis resident port in the SB port binding.
+ * See join_logical_ports().
+ */
  static bool
  is_l3dgw_port(const struct ovn_port *op)
  {
-return op->cr_port;
+return op->cr_port && op->nbrp &&
+   (op->nbrp->n_gateway_chassis || op->nbrp->ha_chassis_group);
  }
  
+/* This function returns true if 'op' is a chassis resident

+ * derived port. False otherwise.
+ * There are 2 ways to check if 'op' is chassis resident port.
+ *  1. op->sb->type is "chassisresident"
+ *  2. op->primary_port is not NULL.  If op->primary_port is set,
+ * it means 'op' is derived from the ovn_port op->primary_port.
+ *
+ * This function uses the (2) method as it doesn't involve strcmp().
+ */
  static bool
  is_cr_port(const struct ovn_port *op)
  {
-return op->l3dgw_port;
+return op->primary_port;
  }
  
  static void

@@ -1171,7 +1191,7 @@ ovn_port_create(struct hmap *ports, const char *key,
  op->key = xstrdup(key);
  op->sb = sb;
  ovn_port_set_nb(op, nbsp, nbrp);
-op->l3dgw_port = op->cr_port = NULL;
+op->primary_port = op->cr_port = NULL;
  hmap_insert(ports, >key_node, hash_string(op->key, 0));
  
  op->lflow_ref = lflow_ref_create();

@@ -1228,7 +1248,7 @@ ovn_port_destroy(struct hmap *ports, struct ovn_port 
*port)
  /* Don't remove port->list. The node should be removed from such lists
   * before calling this function. */
  hmap_remove(ports, >key_node);
-if (port->od && !port->l3dgw_port) {
+if (port->od && !port->primary_port) {
  hmap_remove(>od->ports, >dp_node);
  }
  ovn_port_destroy_orphan(port);
@@ -1398,7 +1418,7 @@ lsp_force_fdb_lookup(const struct ovn_port *op)
  static struct ovn_port *
  ovn_port_get_peer(const struct hmap *lr_ports, struct ovn_port *op)
  {
-if (!op->nbsp || !lsp_is_router(op->nbsp) || op->l3dgw_port) {
+if (!op->nbsp || !lsp_is_router(op->nbsp) || is_cr_port(op)) {
  return NULL;
  }
  
@@ -2278,7 +2298,7 @@ join_logical_ports(const struct sbrec_port_binding_table *sbrec_pb_table,

NULL, nbrp, NULL);
  ovs_list_push_back(nb_only, >list);
  }
-crp->l3dgw_port = op;
+crp->primary_port = op;
  op->cr_port = crp;
  crp->od = od;
  free(redirect_name);
@@ -2299,7 +2319,7 @@ join_logical_ports(const struct sbrec_port_binding_table 
*sbrec_pb_table,
   * to their peers. */
  struct ovn_port *op;
  HMAP_FOR_EACH (op, key_node, ports) {
-if (op->nbsp && lsp_is_router(op->nbsp) && !op->l3dgw_port) {
+if (op->nbsp && lsp_is_router(op->nbsp) && !op->primary_port) {
  struct ovn_port *peer = ovn_port_get_peer(ports, op);
  if (!peer || !peer->nbrp) {
  continue;
@@ -2353,7 +2373,7 @@ join_logical_ports(const struct sbrec_port_binding_table 
*sbrec_pb_table,
  op->enable_router_port_acl = smap_get_bool(
  >nbsp->options, "enable_router_port_acl", false);
  }
-} else if (op->nbrp && op->nbrp->peer && !op->l3dgw_port) {
+} else if (op->nbrp && op->nbrp->peer && !is_cr_port(op)) {
  struct ovn_port *peer = ovn_port_find(ports, op->nbrp->peer);
  if (peer) {
  if (peer->nbrp) {
@@ -3889,7 +3909,7 @@ sync_pb_for_lrp(struct ovn_port 

Re: [ovs-dev] [PATCH ovn v3 1/3] northd: Don't reparse lport's addresses while adding L2_LKUP flows.

2024-06-06 Thread Mark Michelson

Thanks Numan,

Acked-by: Mark Michelson 

On 5/21/24 16:15, num...@ovn.org wrote:

From: Numan Siddique 

The addresses are already parsed and stored in the "struct ovn_port"'s
lsp_addrs field.

Signed-off-by: Numan Siddique 
---
  northd/northd.c | 173 ++--
  1 file changed, 63 insertions(+), 110 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index c8a5efa012..2d0946218a 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -9578,132 +9578,85 @@ build_lswitch_ip_unicast_lookup(struct ovn_port *op,
  return;
  }
  
-/* For ports connected to logical routers add flows to bypass the

- * broadcast flooding of ARP/ND requests in table 19. We direct the
- * requests only to the router port that owns the IP address.
- */
-if (lsp_is_router(op->nbsp)) {
-build_lswitch_rport_arp_req_flows(op->peer, op->od, op, lflows,
-  >nbsp->header_);
+/* Skip adding the unicast lookup flows if force FDB Lookup is enabled
+ * on the lsp. */
+if (lsp_force_fdb_lookup(op)) {
+return;
  }
  
  bool lsp_clone_to_unknown = lsp_is_clone_to_unknown(op->nbsp);

+bool lsp_enabled = lsp_is_enabled(op->nbsp);
+const char *action = lsp_enabled
+ ? ((lsp_clone_to_unknown && op->od->has_unknown)
+ ? "clone {outport = %s; output; };"
+   "outport = \""MC_UNKNOWN "\"; output;"
+ : "outport = %s; output;")
+ : debug_drop_action();
+ds_clear(actions);
+ds_put_format(actions, action, op->json_key);
  
-for (size_t i = 0; i < op->nbsp->n_addresses; i++) {

-/* Addresses are owned by the logical port.
- * Ethernet address followed by zero or more IPv4
- * or IPv6 addresses (or both). */
-struct eth_addr mac;
-bool lsp_enabled = lsp_is_enabled(op->nbsp);
-const char *action = lsp_enabled
- ? ((lsp_clone_to_unknown && op->od->has_unknown)
-? "clone {outport = %s; output; };"
-  "outport = \""MC_UNKNOWN "\"; output;"
-: "outport = %s; output;")
- : debug_drop_action();
-
-if (ovs_scan(op->nbsp->addresses[i],
-ETH_ADDR_SCAN_FMT, ETH_ADDR_SCAN_ARGS(mac))) {
-/* Skip adding flows related to the MAC address
- * as force FDB Lookup is enabled on the lsp.
- */
-if (lsp_force_fdb_lookup(op)) {
-continue;
-}
-ds_clear(match);
-ds_put_format(match, "eth.dst == "ETH_ADDR_FMT,
-  ETH_ADDR_ARGS(mac));
+if (lsp_is_router(op->nbsp) && op->peer && op->peer->nbrp) {
+/* For ports connected to logical routers add flows to bypass the
+ * broadcast flooding of ARP/ND requests in table 19. We direct the
+ * requests only to the router port that owns the IP address.
+ */
+build_lswitch_rport_arp_req_flows(op->peer, op->od, op, lflows,
+  >nbsp->header_);
  
-ds_clear(actions);

-ds_put_format(actions, action, op->json_key);
-ovn_lflow_add_with_hint(lflows, op->od,
-S_SWITCH_IN_L2_LKUP,
-50, ds_cstr(match),
-ds_cstr(actions),
->nbsp->header_,
-op->lflow_ref);
-} else if (!strcmp(op->nbsp->addresses[i], "unknown")) {
-continue;
-} else if (is_dynamic_lsp_address(op->nbsp->addresses[i])) {
-if (!op->nbsp->dynamic_addresses
-|| !ovs_scan(op->nbsp->dynamic_addresses,
-ETH_ADDR_SCAN_FMT, ETH_ADDR_SCAN_ARGS(mac))) {
-continue;
-}
-ds_clear(match);
-ds_put_format(match, "eth.dst == "ETH_ADDR_FMT,
-  ETH_ADDR_ARGS(mac));
+ds_clear(match);
+if (!eth_addr_is_zero(op->proxy_arp_addrs.ea)) {
+ds_put_format(match, "eth.dst == { %s, %s }",
+  op->proxy_arp_addrs.ea_s,
+  op->peer->lrp_networks.ea_s);
+} else {
+ds_put_format(match, "eth.dst == %s", op->peer->lrp_networks.ea_s);
+}
  
-ds_clear(actions);

-ds_put_format(actions, action, op->json_key);
-ovn_lflow_add_with_hint(lflows, op->od,
-S_SWITCH_IN_L2_LKUP,
-50, ds_cstr(match),
-ds_cstr(actions),
->nbsp->header_,
-

Re: [ovs-dev] [PATCH ovn v5 1/2] northd: Make `vxlan_mode` a global variable.

2024-06-06 Thread Ihar Hrachyshka
On Thu, Jun 6, 2024 at 1:27 AM Vladislav Odintsov  wrote:

> Thanks Mark for such a detailed answer!
>
> I agree with your points, and also was thinking about them, but could not
> value their importance in terms of I-P logic. You helped with that.
>
> I’d prefer to apply my proposal to revert back to “bool is_vxlan_mode()”
> to make the “global” a true global. Will submit v6.
>
>
Happy we are doing it. Mark is more eloquent than me. :)


> regards,
> Vladislav Odintsov
>
> > On 5 Jun 2024, at 23:13, Mark Michelson  wrote:
> >
> > On 6/5/24 08:51, Vladislav Odintsov wrote:
> >> Hi Mark,
> >> Thanks for the review!
> >> Please, see below.
> >> regards,
> >>  Vladislav Odintsov
> >> -Original Message-
> >> From: Mark Michelson 
> >> Date: Tuesday, 4 June 2024 at 03:45
> >> To: Vladislav Odintsov , 
> >> Subject: Re: [ovs-dev] [PATCH ovn v5 1/2] northd: Make `vxlan_mode` a
> global variable.
> >> Hi Vladislav,
> >> Generally speaking, I agree with this change. However, I think that
> >> setting a global variable from an incremental processing engine node
> >> runner feels wrong.
> >> The init_vxlan_mode() is called inside the en_global_config_run() only
> to
> >> initialize global value, which is then read by
> get_ovn_max_dp_key_local() to
> >> fill the "max_tunid" variable inside incremental processing engine node.
> >> Which drawbacks do you see of such variable initialization?
> >
> > The biggest drawbacks are:
> > * Reasoning about "ownership" of the vxlan_mode global variable
> > * Maintenance of the en_global_config I-P engine node.
> >
> > On the first point, since vxlan_mode is a global variable in northd.c,
> it's not obvious that the owner of this data is the en_global_config engine
> node. It's an easy mistake for someone to reference the variable before it
> has been initialized, for instance. However, if the boolean is on the
> ed_type_global_config struct, then it's clear to see that this data is
> scoped to the en_global_config engine node.
> >
> > On the second point, if someone were to overhaul the en_global_config
> engine node, it would be an easy mistake to make to not notice that
> vxlan_mode is being set by the engine node. I could see a developer
> splitting the node into separate nodes, for instance. In doing so, the
> developer could easily miss that the global vxlan_mode is being set by the
> engine node, since it's hidden behind an init_ function call. However,
> placing vxlan_mode on the ed_type_global_config makes it more clear that
> the en_global_config engine node is responsible for setting the value.
>
> >
> >> I think that instead, the "vxlan_mode" variable you have introduced
> >> should be a field on struct ed_type_global_config. This way, the
> engine
> >> node is only modifying data local to itself.
> >> I guess, that moving this to the struct ed_type_global_config will make
> the code
> >> a bit more complex: we have to pass this variable through all function
> calls to
> >> be able to read vxlan_mode value inside
> ovn_datapath_assign_requested_tnl_id(),
> >> ovn_port_assign_requested_tnl_id() and ovn_port_allocate_key().
> >
> > I think dependency injection makes the code easier to read, understand,
> and maintain rather than making it more complex. It's clearer that the data
> from the en_global_config engine node is needed in all of the functions you
> listed if those functions require an ed_type_global_config argument.
> >
> >> Apart of this, the "vxlan_mode" variable has the same "global" meaning
> as
> >> "use_ct_inv_match", "check_lsp_is_up", "use_common_zone" and other
> global
> >> variables, which configure the global OVN behaviour. The difference is
> that it
> >> is required to read its value inside the en_global_config_run() to
> reflect the
> >> max_tunid back to NB_Global.
> >
> > Personally, I don't like those global variables either :)
> >
> > But those globals are also set within northd.c, and are initialized at
> the begining of a DB run. From the perspective of northd processing, they
> are truly "global" in their scope. The engine nodes form a dependency tree,
> and so it's important that data that is scoped to a particular node is
> housed in that engine node's data. This way, when nodes are being created,
> it's clear to know which other engine nodes they depend on. If engine nodes
> are setting global variables, then it becomes harder to understand the
> dependencies.
> >> If the global variable setting is totally not acceptable from engine
> node, I
> >> can propose another approach here. Revert init_vxlan_mode() back to
> >> `bool is_vxlan_mode()` and assign global variable outside of global
> engine node:
> >> diff --git a/northd/en-global-config.c b/northd/en-global-config.c
> >> index 873649a89..df0f8e58c 100644
> >> --- a/northd/en-global-config.c
> >> +++ b/northd/en-global-config.c
> >> @@ -115,8 +115,9 @@ en_global_config_run(struct engine_node *node ,
> void *data)
> >>   

Re: [ovs-dev] [PATCH] python: idl: Fix index not being updated on row modification.

2024-06-06 Thread Terry Wilson
On Thu, Jun 6, 2024 at 10:41 AM Dumitru Ceara  wrote:
>
> On 5/27/24 23:39, Ilya Maximets wrote:
> > When a row is modified, python IDL doesn't perform any operations on
> > existing client-side indexes.  This means that if the column on which
> > index is created changes, the old value will remain in the index and
> > the new one will not be added to the index.  Beside lookup failures
> > this is also causing inability to remove modified rows, because the
> > new column value doesn't exist in the index causing an exception on
> > attempt to remove it:
> >
> >  Traceback (most recent call last):
> >File "ovsdbapp/backend/ovs_idl/connection.py", line 110, in run
> >  self.idl.run()
> >File "ovs/db/idl.py", line 465, in run
> >  self.__parse_update(msg.params[2], OVSDB_UPDATE3)
> >File "ovs/db/idl.py", line 924, in __parse_update
> >  self.__do_parse_update(update, version, self.tables)
> >File "ovs/db/idl.py", line 964, in __do_parse_update
> >  changes = self.__process_update2(table, uuid, row_update)
> >File "ovs/db/idl.py", line 991, in __process_update2
> >  del table.rows[uuid]
> >File "ovs/db/custom_index.py", line 102, in __delitem__
> >  index.remove(val)
> >File "ovs/db/custom_index.py", line 66, in remove
> >  self.values.remove(self.index_entry_from_row(row))
> >File "sortedcontainers/sortedlist.py", line 2015, in remove
> >  raise ValueError('{0!r} not in list'.format(value))
> >  ValueError: Datapath_Binding(
> >uuid=UUID('498e66a2-70bc-4587-a66f-0433baf82f60'),
> >tunnel_key=16711683, load_balancers=[], external_ids={}) not in list
> >
> > Fix that by always removing an existing row from indexes before
> > modification and adding back afterwards.  This ensures that old
> > values are removed from the index and new ones are added.
> >
> > This behavior is consistent with the C implementation.
> >
> > The new test that reproduces the removal issue is added.  Some extra
> > testing infrastructure added to be able to handle and print out the
> > 'indexed' table from the idltest schema.
> >
> > Fixes: 13973bc41524 ("Add multi-column index support for the Python IDL")
> > Reported-at: 
> > https://mail.openvswitch.org/pipermail/ovs-discuss/2024-May/053159.html
> > Reported-by: Roberto Bartzen Acosta 
> > Signed-off-by: Ilya Maximets 
> > ---
>
> Looks good to me:
>
> Acked-by: Dumitru Ceara 
>
> Regards,
> Dumitru
>

Looks good to me. I don't like that my code for IndexedRows strongly
implies that it behaves exactly like a dict, and in this case it
doesn't. Maybe some comments explaining why a delete has to be done
for posterity would be helpful.

Acked-by: Terry Wilson 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 ovn 1/3] northd: Introduce ECMP_Nexthop table in SB db.

2024-06-06 Thread 0-day Robot
Bleep bloop.  Greetings Lorenzo Bianconi, I am a robot and I have tried out 
your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


git-am:
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 northd: Introduce ECMP_Nexthop table in SB db.
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 ovn 2/3] northd: Add nexhop id in ct_label.label.

2024-06-06 Thread Lorenzo Bianconi
Introduce the nexthop identifier in the ct_label.label field for
ecmp-symmetric replies connections. This field will be used by
ovn-controller to track ct entries and to flush them if requested by the
CMS (e.g. removing the related static routes).

Signed-off-by: Lorenzo Bianconi 
---
 northd/en-lflow.c|  3 +++
 northd/inc-proc-northd.c |  1 +
 northd/northd.c  | 41 +---
 northd/northd.h  |  1 +
 tests/ovn.at |  4 +--
 tests/system-ovn.at  | 58 +++-
 6 files changed, 72 insertions(+), 36 deletions(-)

diff --git a/northd/en-lflow.c b/northd/en-lflow.c
index 3dba5034b..b4df49076 100644
--- a/northd/en-lflow.c
+++ b/northd/en-lflow.c
@@ -54,6 +54,8 @@ lflow_get_input_data(struct engine_node *node,
 engine_get_input_data("lr_stateful", node);
 struct ed_type_ls_stateful *ls_stateful_data =
 engine_get_input_data("ls_stateful", node);
+struct ecmp_nexthop_data *nexthop_data =
+engine_get_input_data("ecmp_nexthop", node);
 
 lflow_input->sbrec_logical_flow_table =
 EN_OVSDB_GET(engine_get_input("SB_logical_flow", node));
@@ -83,6 +85,7 @@ lflow_get_input_data(struct engine_node *node,
 lflow_input->parsed_routes = _routes_data->parsed_routes;
 lflow_input->route_tables = _routes_data->route_tables;
 lflow_input->route_policies = _policies_data->route_policies;
+lflow_input->nexthops_table = _data->nexthops;
 
 struct ed_type_global_config *global_config =
 engine_get_input_data("global_config", node);
diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
index c4e5b9bf6..3d4bfa175 100644
--- a/northd/inc-proc-northd.c
+++ b/northd/inc-proc-northd.c
@@ -281,6 +281,7 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
 engine_add_input(_lflow, _route_policies, NULL);
 engine_add_input(_lflow, _static_routes, NULL);
 engine_add_input(_lflow, _bfd, NULL);
+engine_add_input(_lflow, _ecmp_nexthop, NULL);
 engine_add_input(_lflow, _northd, lflow_northd_handler);
 engine_add_input(_lflow, _port_group, lflow_port_group_handler);
 engine_add_input(_lflow, _lr_stateful, lflow_lr_stateful_handler);
diff --git a/northd/northd.c b/northd/northd.c
index efe1e3f46..0e7ff0df1 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -10903,7 +10903,8 @@ add_ecmp_symmetric_reply_flows(struct lflow_table 
*lflows,
struct ovn_port *out_port,
const struct parsed_route *route,
struct ds *route_match,
-   struct lflow_ref *lflow_ref)
+   struct lflow_ref *lflow_ref,
+   struct hmap *nexthops_table)
 {
 const struct nbrec_logical_router_static_route *st_route = route->route;
 struct ds match = DS_EMPTY_INITIALIZER;
@@ -10939,15 +10940,28 @@ add_ecmp_symmetric_reply_flows(struct lflow_table 
*lflows,
  * ds_put_cstr() call. The previous contents are needed.
  */
 ds_put_cstr(, " && !ct.rpl && (ct.new || ct.est)");
+struct ds nexthop_label = DS_EMPTY_INITIALIZER;
+
+struct ecmp_nexthop_entry *e;
+HMAP_FOR_EACH_WITH_HASH (e, hmap_node, hash_string(st_route->nexthop, 0),
+ nexthops_table) {
+if (!strcmp(st_route->nexthop, e->nexthop)) {
+ds_put_format(_label, "ct_label.label = %d;", e->id);
+break;
+}
+}
+
 ds_put_format(,
 "ct_commit { ct_label.ecmp_reply_eth = eth.src; "
-" %s = %" PRId64 ";}; "
+" %s = %" PRId64 "; %s }; "
 "next;",
-ct_ecmp_reply_port_match, out_port->sb->tunnel_key);
+ct_ecmp_reply_port_match, out_port->sb->tunnel_key,
+ds_cstr(_label));
 ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_ECMP_STATEFUL, 100,
 ds_cstr(), ds_cstr(),
 _route->header_,
 lflow_ref);
+ds_destroy(_label);
 
 /* Bypass ECMP selection if we already have ct_label information
  * for where to route the packet.
@@ -11001,7 +11015,8 @@ static void
 build_ecmp_route_flow(struct lflow_table *lflows, struct ovn_datapath *od,
   bool ct_masked_mark, const struct hmap *lr_ports,
   struct ecmp_groups_node *eg,
-  struct lflow_ref *lflow_ref)
+  struct lflow_ref *lflow_ref,
+  struct hmap *nexthops_table)
 
 {
 bool is_ipv4 = IN6_IS_ADDR_V4MAPPED(>prefix);
@@ -11059,7 +11074,7 @@ build_ecmp_route_flow(struct lflow_table *lflows, 
struct ovn_datapath *od,
 add_ecmp_symmetric_reply_flows(lflows, od, ct_masked_mark,
lrp_addr_s, out_port,
route_, _match,
-  

[ovs-dev] [PATCH v3 ovn 3/3] ofctrl: Introduce ecmp_nexthop_monitor.

2024-06-06 Thread Lorenzo Bianconi
Introduce ecmp_nexthop_monitor in ovn-controller in order to track and
flush ecmp-symmetric reply ct entires when requested by the CMS (e.g
removing the related static routes).

Signed-off-by: Lorenzo Bianconi 
---
 controller/ofctrl.c | 101 ++
 controller/ofctrl.h |   2 +
 controller/ovn-controller.c |   2 +
 tests/system-ovn-kmod.at| 266 
 tests/system-ovn.at |   4 +
 5 files changed, 375 insertions(+)

diff --git a/controller/ofctrl.c b/controller/ofctrl.c
index 9d181a782..826f78a85 100644
--- a/controller/ofctrl.c
+++ b/controller/ofctrl.c
@@ -388,9 +388,24 @@ struct meter_band_entry {
 
 static struct shash meter_bands;
 
+static struct hmap ecmp_nexthop_map;
+struct ecmp_nexthop_entry {
+struct hmap_node node;
+bool erase;
+
+char *nexthop;
+int id;
+};
+
 static void ofctrl_meter_bands_destroy(void);
 static void ofctrl_meter_bands_clear(void);
 
+static void ecmp_nexthop_monitor_destroy(void);
+static void ecmp_nexthop_monitor_run(
+const struct sbrec_ecmp_nexthop_table *enh_table,
+struct ovs_list *msgs);
+
+
 /* MFF_* field ID for our Geneve option.  In S_TLV_TABLE_MOD_SENT, this is
  * the option we requested (we don't know whether we obtained it yet).  In
  * S_CLEAR_FLOWS or S_UPDATE_FLOWS, this is really the option we have. */
@@ -429,6 +444,7 @@ ofctrl_init(struct ovn_extend_table *group_table,
 groups = group_table;
 meters = meter_table;
 shash_init(_bands);
+hmap_init(_nexthop_map);
 }
 
 /* S_NEW, for a new connection.
@@ -876,6 +892,7 @@ ofctrl_destroy(void)
 expr_symtab_destroy();
 shash_destroy();
 ofctrl_meter_bands_destroy();
+ecmp_nexthop_monitor_destroy();
 }
 
 uint64_t
@@ -2305,6 +2322,87 @@ add_meter(struct ovn_extend_table_info *m_desired,
 ofctrl_meter_bands_alloc(sb_meter, m_desired, msgs);
 }
 
+static void
+ecmp_nexthop_monitor_free_entry(struct ecmp_nexthop_entry *e,
+struct ovs_list *msgs)
+{
+if (msgs) {
+ovs_u128 mask = {
+/* ct_labels.label BITS[96-127] */
+.u64.hi = 0x,
+};
+uint64_t id = e->id;
+ovs_u128 nexthop = {
+.u64.hi = id << 32,
+};
+struct ofp_ct_match match = {
+.labels = nexthop,
+.labels_mask = mask,
+};
+struct ofpbuf *msg = ofp_ct_match_encode(, NULL,
+ rconn_get_version(swconn));
+ovs_list_push_back(msgs, >list_node);
+}
+free(e->nexthop);
+free(e);
+}
+
+static void
+ecmp_nexthop_monitor_destroy(void)
+{
+struct ecmp_nexthop_entry *e;
+HMAP_FOR_EACH_POP (e, node, _nexthop_map) {
+ecmp_nexthop_monitor_free_entry(e, NULL);
+}
+hmap_destroy(_nexthop_map);
+}
+
+static struct ecmp_nexthop_entry *
+ecmp_nexthop_monitor_lookup(char *nexthop)
+{
+uint32_t hash = hash_string(nexthop, 0);
+struct ecmp_nexthop_entry *e;
+
+HMAP_FOR_EACH_WITH_HASH (e, node, hash, _nexthop_map) {
+if (!strcmp(e->nexthop, nexthop)) {
+return e;
+}
+}
+return NULL;
+}
+
+static void
+ecmp_nexthop_monitor_run(const struct sbrec_ecmp_nexthop_table *enh_table,
+ struct ovs_list *msgs)
+{
+struct ecmp_nexthop_entry *e;
+HMAP_FOR_EACH (e, node, _nexthop_map) {
+e->erase = true;
+}
+
+const struct sbrec_ecmp_nexthop *sbrec_ecmp_nexthop;
+SBREC_ECMP_NEXTHOP_TABLE_FOR_EACH (sbrec_ecmp_nexthop, enh_table) {
+e = ecmp_nexthop_monitor_lookup(sbrec_ecmp_nexthop->nexthop);
+if (!e) {
+e = xzalloc(sizeof *e);
+e->nexthop = xstrdup(sbrec_ecmp_nexthop->nexthop);
+e->id = sbrec_ecmp_nexthop->id;
+uint32_t hash = hash_string(e->nexthop, 0);
+hmap_insert(_nexthop_map, >node, hash);
+} else {
+e->erase = false;
+}
+}
+
+HMAP_FOR_EACH_SAFE (e, node, _nexthop_map) {
+if (e->erase) {
+hmap_remove(_nexthop_map, >node);
+ecmp_nexthop_monitor_free_entry(e, msgs);
+}
+}
+
+}
+
 static void
 installed_flow_add(struct ovn_flow *d,
struct ofputil_bundle_ctrl_msg *bc,
@@ -2663,6 +2761,7 @@ ofctrl_put(struct ovn_desired_flow_table *lflow_table,
struct shash *pending_ct_zones,
struct hmap *pending_lb_tuples,
struct ovsdb_idl_index *sbrec_meter_by_name,
+   const struct sbrec_ecmp_nexthop_table *enh_table,
uint64_t req_cfg,
bool lflows_changed,
bool pflows_changed)
@@ -2703,6 +2802,8 @@ ofctrl_put(struct ovn_desired_flow_table *lflow_table,
 /* OpenFlow messages to send to the switch to bring it up-to-date. */
 struct ovs_list msgs = OVS_LIST_INITIALIZER();
 
+ecmp_nexthop_monitor_run(enh_table, );
+
 /* Iterate through ct zones that need to 

[ovs-dev] [PATCH v3 ovn 0/3] Introduce ECMP_nexthop monitor in ovn-controller

2024-06-06 Thread Lorenzo Bianconi
Reported-at: https://issues.redhat.com/browse/FDP-56

Changes since v2:
- rebase on top of https://patchwork.ozlabs.org/project/ovn/list/?series=409660
  in order to use new bfd and static_routes maps
- add IP northd node for ecmp_nexthop processing
Changes since v1:
- add ID column in ECMP_Nexthop table in SB db
- remove nexthop-id in logical_router_static_route option column

Lorenzo Bianconi (3):
  northd: Introduce ECMP_Nexthop table in SB db.
  northd: Add nexhop id in ct_label.label.
  ofctrl: Introduce ecmp_nexthop_monitor.

 controller/ofctrl.c | 101 ++
 controller/ofctrl.h |   2 +
 controller/ovn-controller.c |   2 +
 northd/en-lflow.c   |   3 +
 northd/en-northd.c  |  33 +
 northd/en-northd.h  |   4 +
 northd/inc-proc-northd.c|   8 +-
 northd/northd.c | 158 +++--
 northd/northd.h |  11 ++
 ovn-sb.ovsschema|  16 ++-
 ovn-sb.xml  |  31 +
 tests/ovn-northd.at |   4 +
 tests/ovn.at|   4 +-
 tests/system-ovn-kmod.at| 266 
 tests/system-ovn.at |  62 +
 15 files changed, 667 insertions(+), 38 deletions(-)

-- 
2.45.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 ovn 1/3] northd: Introduce ECMP_Nexthop table in SB db.

2024-06-06 Thread Lorenzo Bianconi
Introduce ECMP_Nexthop table in the SB db in order to track active
ecmp-symmetric-reply connections and flush stale ones.

Signed-off-by: Lorenzo Bianconi 
---
 northd/en-northd.c   |  33 +++
 northd/en-northd.h   |   4 ++
 northd/inc-proc-northd.c |   7 ++-
 northd/northd.c  | 117 +++
 northd/northd.h  |  10 
 ovn-sb.ovsschema |  16 +-
 ovn-sb.xml   |  31 +++
 tests/ovn-northd.at  |   4 ++
 8 files changed, 220 insertions(+), 2 deletions(-)

diff --git a/northd/en-northd.c b/northd/en-northd.c
index a4de71ba1..a2823ab2b 100644
--- a/northd/en-northd.c
+++ b/northd/en-northd.c
@@ -380,6 +380,23 @@ en_bfd_run(struct engine_node *node, void *data)
 engine_set_node_state(node, EN_UPDATED);
 }
 
+void
+en_ecmp_nexthop_run(struct engine_node *node, void *data)
+{
+const struct engine_context *eng_ctx = engine_get_context();
+struct static_routes_data *static_routes_data =
+engine_get_input_data("static_routes", node);
+struct ecmp_nexthop_data *enh_data = data;
+const struct sbrec_ecmp_nexthop_table *sbrec_ecmp_nexthop_table =
+EN_OVSDB_GET(engine_get_input("SB_ecmp_nexthop", node));
+
+build_ecmp_nexthop_table(eng_ctx->ovnsb_idl_txn,
+ _routes_data->parsed_routes,
+ _data->nexthops,
+ sbrec_ecmp_nexthop_table);
+engine_set_node_state(node, EN_UPDATED);
+}
+
 void
 *en_northd_init(struct engine_node *node OVS_UNUSED,
 struct engine_arg *arg OVS_UNUSED)
@@ -421,6 +438,16 @@ void
 return data;
 }
 
+void
+*en_ecmp_nexthop_init(struct engine_node *node OVS_UNUSED,
+  struct engine_arg *arg OVS_UNUSED)
+{
+struct ecmp_nexthop_data *data = xzalloc(sizeof *data);
+
+ecmp_nexthop_init(data);
+return data;
+}
+
 void
 en_northd_cleanup(void *data)
 {
@@ -451,3 +478,9 @@ en_bfd_cleanup(void *data)
 {
 bfd_destroy(data);
 }
+
+void
+en_ecmp_nexthop_cleanup(void *data)
+{
+ecmp_nexthop_destroy(data);
+}
diff --git a/northd/en-northd.h b/northd/en-northd.h
index 424209c2f..c6d520f71 100644
--- a/northd/en-northd.h
+++ b/northd/en-northd.h
@@ -34,5 +34,9 @@ void *en_bfd_init(struct engine_node *node OVS_UNUSED,
 void en_bfd_cleanup(void *data);
 bool bfd_change_handler(struct engine_node *node, void *data);
 void en_bfd_run(struct engine_node *node, void *data);
+void en_ecmp_nexthop_run(struct engine_node *node, void *data);
+void *en_ecmp_nexthop_init(struct engine_node *node OVS_UNUSED,
+   struct engine_arg *arg OVS_UNUSED);
+void en_ecmp_nexthop_cleanup(void *data);
 
 #endif /* EN_NORTHD_H */
diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
index d907da14d..c4e5b9bf6 100644
--- a/northd/inc-proc-northd.c
+++ b/northd/inc-proc-northd.c
@@ -103,7 +103,8 @@ static unixctl_cb_func chassis_features_list;
 SB_NODE(fdb, "fdb") \
 SB_NODE(static_mac_binding, "static_mac_binding") \
 SB_NODE(chassis_template_var, "chassis_template_var") \
-SB_NODE(logical_dp_group, "logical_dp_group")
+SB_NODE(logical_dp_group, "logical_dp_group") \
+SB_NODE(ecmp_nexthop, "ecmp_nexthop")
 
 enum sb_engine_node {
 #define SB_NODE(NAME, NAME_STR) SB_##NAME,
@@ -160,6 +161,7 @@ static ENGINE_NODE_WITH_CLEAR_TRACK_DATA(ls_stateful, 
"ls_stateful");
 static ENGINE_NODE(route_policies, "route_policies");
 static ENGINE_NODE(static_routes, "static_routes");
 static ENGINE_NODE(bfd, "bfd");
+static ENGINE_NODE(ecmp_nexthop, "ecmp_nexthop");
 
 void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
   struct ovsdb_idl_loop *sb)
@@ -261,6 +263,9 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
 engine_add_input(_static_routes,
  _nb_logical_router_static_route, NULL);
 
+engine_add_input(_ecmp_nexthop, _sb_ecmp_nexthop, NULL);
+engine_add_input(_ecmp_nexthop, _static_routes, NULL);
+
 engine_add_input(_sync_meters, _nb_acl, NULL);
 engine_add_input(_sync_meters, _nb_meter, NULL);
 engine_add_input(_sync_meters, _sb_meter, NULL);
diff --git a/northd/northd.c b/northd/northd.c
index 2eb5f2be8..efe1e3f46 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -10039,6 +10039,105 @@ build_bfd_table(
 return ret;
 }
 
+struct ecmp_nexthop_entry {
+struct hmap_node hmap_node;
+
+char *nexthop;
+int id;
+bool stale;
+};
+
+static struct ecmp_nexthop_entry *
+ecmp_nexthop_lookup(const struct hmap *map, const char *nexthop, size_t hash)
+{
+struct ecmp_nexthop_entry *e;
+
+HMAP_FOR_EACH_WITH_HASH (e, hmap_node, hash, map) {
+if (!strcmp(e->nexthop, nexthop)) {
+return e;
+}
+}
+return NULL;
+}
+
+#define NEXTHOP_IDS_LEN65535
+bool
+build_ecmp_nexthop_table(
+struct ovsdb_idl_txn *ovnsb_txn,
+struct hmap *routes,
+struct hmap *nexthops,
+

Re: [ovs-dev] [PATCH] python: ovs: flow: Fix nested check_pkt_len acts.

2024-06-06 Thread Adrián Moreno
On Thu, Jun 06, 2024 at 06:00:26PM GMT, Ilya Maximets wrote:
> On 6/6/24 17:15, Adrian Moreno wrote:
> > Add check_pkt_len action to the decoder list that it, itself, uses.
> >
> > This makes nested check_pkt_len (i.e:a check_pkt_len inside another)
> > work.
> >
> > Signed-off-by: Adrian Moreno 
> > ---
> >  python/ovs/flow/odp.py   | 43 ++--
> >  python/ovs/tests/test_odp.py | 29 
> >  2 files changed, 51 insertions(+), 21 deletions(-)
>
> Hi, Adrian.
>
> Could you, please, provide a Fixes tag for this?
> No need to send v2 just for this, just reply with it to this thread.
> (Tags should start from the beginning of the line for patchwork to
> recognize them.)
>

Sure, how about this:

Reported-by: Ilya Maximets 
Fixes: 076663b31edc ("python: Add ovs datapath flow parsing.")

Thanks
Adrián

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [External] : Re: [PATCH ovn] northd: Skip arp-proxy flows if the lsp is a router port.

2024-06-06 Thread Dumitru Ceara
On 6/5/24 16:52, Brendan Doyle via dev wrote:
> 
> 
> So I'm wondering will this break the LSP option:
>>     *o**p**t**i**o**n**s*  *:*  *a**r**p**_**p**r**o**x**y*:
>> optional string
>>    Optional.  A  list  of  MAC  and  addresses/cidrs  or 
>> just  ad‐
>>    dresses/cidrs that this logical
>> switch*r**o**u**t**e**r*  port will reply to
>>    ARP/NDP  requests. 
>> Examples:*1**6**9**.**2**5**4**.**2**3**9**.**2**5**4*   
>> *1**6**9**.**2**5**4**.**2**3**9**.**2*,
>>   
>> *0**a**:**5**8**:**a**9**:**f**e**:**0**1**:**0**1*  
>> *1**6**9**.**2**5**4**.**2**3**9**.**2**5**4* 
>> *1**6**9**.**2**5**4**.**2**3**9**.**2*
>>   
>> *1**6**9**.**2**5**4**.**2**3**8**.**0**/**2**4*,*f**d**7**b**:**6**b**4**d**:**7**b**2**5**:**d**2**2**f**:**:**1*
>>   *f**d**7**b**:**6**b**4**d**:**7**b**2**5**:**d**2**2**f**:**:**2*,
>>    *0**a**:**5**8**:**a**9**:**f**e**:**0**1**:**0**1* 
>> *f**d**7**b**:**6**b**4**d**:**7**b**2**5**:**d**2**2**f**:**:**0**/**6**4*. 
>>  The*o**p**t**i**o**n**s**:**r**o**u**t**e**r**-*
>>    *p**o**r**t*’s  logical  router  should  have a route
>> to forward packets
>>    sent to configured proxy ARP MAC/IPs to an appropriate 
>> destina‐
>>    tion.
> 

Hi Brendan,

I'm not sure I understand what breaks.  Do you have a specific scenario
in mind?  The patch is for the arp-proxy feature.  I doubt that people
rely on ARP requests originated by logical routers being handled by the
arp proxy on the connected logical switch port.  But I might be wrong.

Thanks,
Dumitru

> 
> On 05/06/2024 13:03, Enrique Llorente Pastora wrote:
>> On Wed, Jun 5, 2024 at 12:12 PM Lorenzo Bianconi <
>> lorenzo.bianc...@redhat.com> wrote:
>>
 On 5/14/24 18:44, Lorenzo Bianconi wrote:
> Skip proxy-arp logical flows for traffic that is entering the logical
> switch pipeline from a lsp of type router. This patch will avoid
> recirculating back the traffic entering  by the logical router
> pipeline if proxy-arp hasn been configured by the CMS.
>
> Reported-at:https://urldefense.com/v3/__https://issues.redhat.com/browse/FDP-96__;!!ACWV5N9M2RV99hQ!LZCOnrD03VQXa0QBfUgAD0IntqpkCU4P2Mz0jhlfAcJ0Joe1ehsAIloa28JszFGBjchui8U4OL2MfnKJQEhn$
>   Signed-off-by: Lorenzo Bianconi
> ---
 Hi Lorenzo,

 This change looks OK but I'd like to double check that it doesn't break
 CNV/ovn-kubernetes use cases.

 Hi Enrique,

 Would you mind double checking that on one of your setups?

 If I'm not wrong our upstream ovn-kubernetes (in ovn-org/ovn) tests
 don't enable anything CNV specific:
 https://urldefense.com/v3/__https://github.com/ovsrobot/ovn/actions/runs/9083258143__;!!ACWV5N9M2RV99hQ!LZCOnrD03VQXa0QBfUgAD0IntqpkCU4P2Mz0jhlfAcJ0Joe1ehsAIloa28JszFGBjchui8U4OL2MflAqga4Q$
  
 Thanks,
 Dumitru
>>> Hi Dumitru,
>>>
>>> Thx, for the review. That's a good idea :)
>>>
>> Live migration tests look fine with this change, both IC and non IC,
>> let's
>> also activate the
>> kubevirt lanes so we gate with them too.
>>
>> Tested-by:
>>
>>
>>> Regards,
>>> Lorenzo
>>>
>   northd/northd.c | 15 +--
>   tests/ovn.at    |  8 
>   tests/system-ovn.at | 22 +-
>   3 files changed, 38 insertions(+), 7 deletions(-)
>
> diff --git a/northd/northd.c b/northd/northd.c
> index 0cabda7ea..29dc58ef4 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -118,6 +118,7 @@ static bool default_acl_drop;
>   #define REGBIT_PORT_SEC_DROP  "reg0[15]"
>   #define REGBIT_ACL_STATELESS  "reg0[16]"
>   #define REGBIT_ACL_HINT_ALLOW_REL "reg0[17]"
> +#define REGBIT_FROM_ROUTER_PORT   "reg0[18]"
>
>   #define REG_ORIG_DIP_IPV4 "reg1"
>   #define REG_ORIG_DIP_IPV6 "xxreg1"
> @@ -5785,6 +5786,13 @@ build_lswitch_port_sec_op(struct ovn_port *op,
>>> struct lflow_table *lflows,
>   >od->localnet_ports[0]->nbsp->header_,
>   op->lflow_ref);
>   }
> +    } else if (lsp_is_router(op->nbsp)) {
> +    ds_put_format(actions, REGBIT_FROM_ROUTER_PORT" = 1; next;");
> +    ovn_lflow_add_with_lport_and_hint(lflows, op->od,
> +  S_SWITCH_IN_CHECK_PORT_SEC,
>>> 70,
> +  ds_cstr(match),
>>> ds_cstr(actions),
> +  op->key,
> >nbsp->header_,
> +  op->lflow_ref);
>   }
>   }
>
> @@ -9051,7 +9059,9 @@ build_lswitch_arp_nd_responder_known_ips(struct
>>> ovn_port *op,
>   if (op->proxy_arp_addrs.n_ipv4_addrs) {
>   /* Match rule on all proxy ARP IPs. */
>   ds_clear(match);
> -    

Re: [ovs-dev] [PATCH v4 6/6] netdev-dpdk: Refactor tunnel checksum offloading.

2024-06-06 Thread Kevin Traynor
On 05/06/2024 19:16, Ilya Maximets wrote:
> On 5/31/24 16:10, Kevin Traynor wrote:
>> On 30/05/2024 14:10, David Marchand wrote:
>>> All informations required for checksum offloading can be deducted by
>>
>> nit:  "All information required for checksum offloading can be deduced
>> by" can update on applying, assuming no more revs are needed.
>>
>>> already tracked dp_packet l3_ofs, l4_ofs, inner_l3_ofs and inner_l4_ofs
>>> fields.
>>> Remove DPDK specific l[2-4]_len from generic OVS code.
>>>
>>> netdev-dpdk code then fills mbuf specifics step by step:
>>> - outer_l2_len and outer_l3_len are needed for tunneling (and below
>>>   features),
>>> - l2_len and l3_len are needed for IP and L4 checksum (and below features),
>>> - l4_len and tso_segsz are needed when doing TSO,
>>>
>>> Signed-off-by: David Marchand 
>>> ---
>>>  lib/dp-packet.h | 37 --
>>>  lib/netdev-dpdk.c   | 35 ++---
>>>  lib/netdev-native-tnl.c | 50 +
>>>  3 files changed, 27 insertions(+), 95 deletions(-)
>>
>> Acked-by: Kevin Traynor 
> 
> Thanks, David and Kevin!
> 
> I generally like the direction of this patch set, especially the
> cleanup of the generic tunnel code.
> 
> I didn't test it with a real hardware nor I re-checked the math,
> so will not Ack it, but it looks good to me otherwise, and I think
> we should backport the whole thing to at least branch-3.3 as well.
> 
> Best regards, Ilya Maximets.
> 

Thanks David, Jun and Ilya. Series applied and backported to branch-3.3.

Kevin.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] socket: Increase listen backlog to 128 everywhere.

2024-06-06 Thread Ihar Hrachyshka
On Thu, Jun 6, 2024 at 11:12 AM Ihar Hrachyshka  wrote:

> On Wed, Jun 5, 2024 at 11:17 AM Brian Haley  wrote:
>
>> Hi Ilya,
>>
>> On 6/4/24 6:27 AM, Ilya Maximets wrote:
>> > On 6/4/24 11:54, Alin Serdean wrote:
>> >> Does it make sense to make this configurable via automake in order to
>> avoid
>> >> future patches if we need to bump the value further?
>> >
>> > I'm still not convinced this is an issue worth fixing.
>> >
>> > The original discussion is here:
>> >
>> https://mail.openvswitch.org/pipermail/ovs-discuss/2024-April/053058.html
>> >
>> > It's not clear if these reconnections actually cause any harm or are
>> > even helpful in getting to connect to a less busy server in a general
>> > case.
>>
>> To me, not having to reconnect is a win, especially given the number of
>> client threads here. And 128 is still not that much for a server.
>>
>> > The original number 10 was clearly way too low for any reasonable
>> workload.
>> > But even with that we lived for a very long time with very large
>> clusters
>> > without any issues.
>>
>> It could have also been that noone ever noticed, I know I didn't until
>> digging deeper into this.
>>
>> > The main problem in the original thread was that a lot of neutron
>> clients
>> > are having leader-only connections to the database for seemingly no
>> reason.
>>
>
> They write, so AFAIU they need a leader (even if they connect to
> secondary, writes will be proxied), correct?
>
> There may be place for some OpenStack optimizations (e.g. consolidating
> all neutron agents into one that would maintain a single connection per
> chassis), but ultimately, it's still O(n) where n = number of chassis in
> the cluster; and we have to have at least 2 connections per chassis (1
> ovn-controller and at least 1 neutron agent, for metadata). But I would not
> overestimate the amount of settings where the number of connections from
> neutron agents can be reduced. This is only an option where multiple
> neutron agents are running on the same chassis. Usually, we run
> ovn-controller and ovn-metadata-agent. We may also run ovn-agent (for qos)
> or bgp-agent (for BGP) but these are not deployed in all environments / on
> all nodes.
>
> Apart from it, deploying a proxy per node will allow to coalesce the
> number of connections to 1 per chassis. One can go further and build a tree
> of proxies to keep the number of connections to the leader at bay (is it
> O(log n)?) This is not a trivial change to architecture.
>

OK disregard this (somewhat): Brian experiences issues with NB, not SB. NB
connections come from neutron-server workers (not agents). In this case,
OpenStack would have to run its own "proxy", per node - for all workers to
share, or an ovsdb proxy.

Regarding the switch from leader-only to any member for NB connections by
workers: is the argument to do it about the fact that the number of
secondaries is a lot lower than the number of neutron-server workers, so
the load on a leader is also lower? (2 or 4 connections from secondaries
vs. N*W connections, where N is number of API controllers and W is the
number of workers per neutron-server)

What would be the drawbacks we should consider? I can imagine the load on
secondary members will increase (but the load on the leader will decrease).
Plus probably some minor lag in request processing because of proxying
through secondaries (for most - but not all - neutron-server workers).
Anything else?


>
>
>> > That results in unnecessary mass re-connection on leadership change.
>> > So, I'd prefer this fixed in OpenStack instead.
>>
>> But isn't having many leader-only connections a perfectly valid config
>> when most if not all are writers? And if a small increase in the backlog
>> helps I don't see any reason not to do it. Fixing in Openstack would
>> involve one of two options from what I know - 1) Use a proxy; 2) Just
>> connect irregardless of leadership status. I'm not sure #2 won't cause
>> some other problems as the secondaries would have to forward things to
>> the leader, correct?
>>
>> The best solution would be to have this configurable if you see that as
>> a viable option.
>>
>> -Brian
>>
>> > Best regards, Ilya Maximets.
>> >
>> >>
>> >> On Tue, Jun 4, 2024 at 11:05 AM Simon Horman  wrote:
>> >>
>> >>> + Ihar
>> >>>
>> >>> On Fri, May 31, 2024 at 03:40:08PM -0400, Brian Haley wrote:
>>  An earlier patch [1] increased the size of the listen
>>  backlog to 64. While that was a huge improvement over
>>  10, further testing in large deployments showed 128
>>  was even better.
>> >>>
>>
>
> What I lack in this commit message is the definition of "better". I see
> stats of drops using ss and netstat, but it's not clear what the user's
> visible effect of this is. Do you have a bug report to link to, with some
> symptoms described?
>
>
>> >>> nit: I would slightly prefer if a commit was referenced like this:
>> >>>
>> >>>commit 2b7efee031c3 ("socket: Increase listen backlog to 64
>> 

Re: [ovs-dev] [PATCH] python: ovs: flow: Fix nested check_pkt_len acts.

2024-06-06 Thread Ilya Maximets
On 6/6/24 17:15, Adrian Moreno wrote:
> Add check_pkt_len action to the decoder list that it, itself, uses.
> 
> This makes nested check_pkt_len (i.e:a check_pkt_len inside another)
> work.
> 
> Signed-off-by: Adrian Moreno 
> ---
>  python/ovs/flow/odp.py   | 43 ++--
>  python/ovs/tests/test_odp.py | 29 
>  2 files changed, 51 insertions(+), 21 deletions(-)

Hi, Adrian.

Could you, please, provide a Fixes tag for this?
No need to send v2 just for this, just reply with it to this thread.
(Tags should start from the beginning of the line for patchwork to
recognize them.)

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] python: idl: Fix index not being updated on row modification.

2024-06-06 Thread Dumitru Ceara
On 5/27/24 23:39, Ilya Maximets wrote:
> When a row is modified, python IDL doesn't perform any operations on
> existing client-side indexes.  This means that if the column on which
> index is created changes, the old value will remain in the index and
> the new one will not be added to the index.  Beside lookup failures
> this is also causing inability to remove modified rows, because the
> new column value doesn't exist in the index causing an exception on
> attempt to remove it:
> 
>  Traceback (most recent call last):
>File "ovsdbapp/backend/ovs_idl/connection.py", line 110, in run
>  self.idl.run()
>File "ovs/db/idl.py", line 465, in run
>  self.__parse_update(msg.params[2], OVSDB_UPDATE3)
>File "ovs/db/idl.py", line 924, in __parse_update
>  self.__do_parse_update(update, version, self.tables)
>File "ovs/db/idl.py", line 964, in __do_parse_update
>  changes = self.__process_update2(table, uuid, row_update)
>File "ovs/db/idl.py", line 991, in __process_update2
>  del table.rows[uuid]
>File "ovs/db/custom_index.py", line 102, in __delitem__
>  index.remove(val)
>File "ovs/db/custom_index.py", line 66, in remove
>  self.values.remove(self.index_entry_from_row(row))
>File "sortedcontainers/sortedlist.py", line 2015, in remove
>  raise ValueError('{0!r} not in list'.format(value))
>  ValueError: Datapath_Binding(
>uuid=UUID('498e66a2-70bc-4587-a66f-0433baf82f60'),
>tunnel_key=16711683, load_balancers=[], external_ids={}) not in list
> 
> Fix that by always removing an existing row from indexes before
> modification and adding back afterwards.  This ensures that old
> values are removed from the index and new ones are added.
> 
> This behavior is consistent with the C implementation.
> 
> The new test that reproduces the removal issue is added.  Some extra
> testing infrastructure added to be able to handle and print out the
> 'indexed' table from the idltest schema.
> 
> Fixes: 13973bc41524 ("Add multi-column index support for the Python IDL")
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2024-May/053159.html
> Reported-by: Roberto Bartzen Acosta 
> Signed-off-by: Ilya Maximets 
> ---

Looks good to me:

Acked-by: Dumitru Ceara 

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] python: ovs: flow: Fix nested check_pkt_len acts.

2024-06-06 Thread Adrian Moreno
Add check_pkt_len action to the decoder list that it, itself, uses.

This makes nested check_pkt_len (i.e:a check_pkt_len inside another)
work.

Signed-off-by: Adrian Moreno 
---
 python/ovs/flow/odp.py   | 43 ++--
 python/ovs/tests/test_odp.py | 29 
 2 files changed, 51 insertions(+), 21 deletions(-)

diff --git a/python/ovs/flow/odp.py b/python/ovs/flow/odp.py
index 7d9b165d4..a8f8c067a 100644
--- a/python/ovs/flow/odp.py
+++ b/python/ovs/flow/odp.py
@@ -365,29 +365,30 @@ class ODPFlow(Flow):
 is_list=True,
 )
 
-return {
-**_decoders,
-"check_pkt_len": nested_kv_decoder(
-KVDecoders(
-{
-"size": decode_int,
-"gt": nested_kv_decoder(
-KVDecoders(
-decoders=_decoders,
-default_free=decode_free_output,
-),
-is_list=True,
+_decoders["check_pkt_len"] = nested_kv_decoder(
+KVDecoders(
+{
+"size": decode_int,
+"gt": nested_kv_decoder(
+KVDecoders(
+decoders=_decoders,
+default_free=decode_free_output,
 ),
-"le": nested_kv_decoder(
-KVDecoders(
-decoders=_decoders,
-default_free=decode_free_output,
-),
-is_list=True,
+is_list=True,
+),
+"le": nested_kv_decoder(
+KVDecoders(
+decoders=_decoders,
+default_free=decode_free_output,
 ),
-}
-)
-),
+is_list=True,
+),
+}
+)
+)
+
+return {
+**_decoders,
 }
 
 @staticmethod
diff --git a/python/ovs/tests/test_odp.py b/python/ovs/tests/test_odp.py
index f19ec386e..d514e9be3 100644
--- a/python/ovs/tests/test_odp.py
+++ b/python/ovs/tests/test_odp.py
@@ -541,6 +541,35 @@ def test_odp_fields(input_string, expected):
 ),
 ],
 ),
+(
+
"actions:check_pkt_len(size=200,gt(check_pkt_len(size=400,gt(4),le(2))),le(check_pkt_len(size=100,gt(1),le(drop",
  # noqa: E501
+[
+KeyValue(
+"check_pkt_len",
+{
+"size": 200,
+"gt": [
+{
+"check_pkt_len": {
+"size": 400,
+"gt": [{"output": {"port": 4}}],
+"le": [{"output": {"port": 2}}],
+}
+}
+],
+"le": [
+{
+"check_pkt_len": {
+"size": 100,
+"gt": [{"output": {"port": 1}}],
+"le": [{"drop": True}],
+}
+}
+],
+},
+)
+],
+),
 (
 "actions:meter(1),hash(l4(0))",
 [
-- 
2.45.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] socket: Increase listen backlog to 128 everywhere.

2024-06-06 Thread Ihar Hrachyshka
On Wed, Jun 5, 2024 at 11:17 AM Brian Haley  wrote:

> Hi Ilya,
>
> On 6/4/24 6:27 AM, Ilya Maximets wrote:
> > On 6/4/24 11:54, Alin Serdean wrote:
> >> Does it make sense to make this configurable via automake in order to
> avoid
> >> future patches if we need to bump the value further?
> >
> > I'm still not convinced this is an issue worth fixing.
> >
> > The original discussion is here:
> >
> https://mail.openvswitch.org/pipermail/ovs-discuss/2024-April/053058.html
> >
> > It's not clear if these reconnections actually cause any harm or are
> > even helpful in getting to connect to a less busy server in a general
> > case.
>
> To me, not having to reconnect is a win, especially given the number of
> client threads here. And 128 is still not that much for a server.
>
> > The original number 10 was clearly way too low for any reasonable
> workload.
> > But even with that we lived for a very long time with very large clusters
> > without any issues.
>
> It could have also been that noone ever noticed, I know I didn't until
> digging deeper into this.
>
> > The main problem in the original thread was that a lot of neutron clients
> > are having leader-only connections to the database for seemingly no
> reason.
>

They write, so AFAIU they need a leader (even if they connect to secondary,
writes will be proxied), correct?

There may be place for some OpenStack optimizations (e.g. consolidating all
neutron agents into one that would maintain a single connection per
chassis), but ultimately, it's still O(n) where n = number of chassis in
the cluster; and we have to have at least 2 connections per chassis (1
ovn-controller and at least 1 neutron agent, for metadata). But I would not
overestimate the amount of settings where the number of connections from
neutron agents can be reduced. This is only an option where multiple
neutron agents are running on the same chassis. Usually, we run
ovn-controller and ovn-metadata-agent. We may also run ovn-agent (for qos)
or bgp-agent (for BGP) but these are not deployed in all environments / on
all nodes.

Apart from it, deploying a proxy per node will allow to coalesce the number
of connections to 1 per chassis. One can go further and build a tree of
proxies to keep the number of connections to the leader at bay (is it O(log
n)?) This is not a trivial change to architecture.


> > That results in unnecessary mass re-connection on leadership change.
> > So, I'd prefer this fixed in OpenStack instead.
>
> But isn't having many leader-only connections a perfectly valid config
> when most if not all are writers? And if a small increase in the backlog
> helps I don't see any reason not to do it. Fixing in Openstack would
> involve one of two options from what I know - 1) Use a proxy; 2) Just
> connect irregardless of leadership status. I'm not sure #2 won't cause
> some other problems as the secondaries would have to forward things to
> the leader, correct?
>
> The best solution would be to have this configurable if you see that as
> a viable option.
>
> -Brian
>
> > Best regards, Ilya Maximets.
> >
> >>
> >> On Tue, Jun 4, 2024 at 11:05 AM Simon Horman  wrote:
> >>
> >>> + Ihar
> >>>
> >>> On Fri, May 31, 2024 at 03:40:08PM -0400, Brian Haley wrote:
>  An earlier patch [1] increased the size of the listen
>  backlog to 64. While that was a huge improvement over
>  10, further testing in large deployments showed 128
>  was even better.
> >>>
>

What I lack in this commit message is the definition of "better". I see
stats of drops using ss and netstat, but it's not clear what the user's
visible effect of this is. Do you have a bug report to link to, with some
symptoms described?


> >>> nit: I would slightly prefer if a commit was referenced like this:
> >>>
> >>>commit 2b7efee031c3 ("socket: Increase listen backlog to 64
> everywhere.")
> >>>
>  Looking at 'ss -lmt' output over more than one week for
>  port 6641, captured across three different controllers,
>  the average was:
> 
>   listen(s, 10) : 1213 drops/day
>   listen(s, 64) : 791 drops/day
>   listen(s, 128): 657 drops/day
> 
>  Looking at 'netstat -s | egrep -i 'listen|drop|SYN_RECV''
>  output over one week for port 6641, again captured across
>  three different controllers, the average was:
> 
>   listen(s, 10) : 741 drops/day
>   listen(s, 64) : 200 drops/day
>   listen(s, 128): 22 drops/day
>

So why not 256? 512? 1024? It's always an issue when a single value is
picked for all environments, isn't it.

The suggestion to have it configurable seems to me to be the way to go.


> 
>  While having this value configurable would be the
>  best solution, changing to 128 is a quick fix that
>  should be good for all deployments. A link to the
>  original discussion is at [2].
> 
>  [1]
> >>>
> https://github.com/openvswitch/ovs/commit/2b7efee031c3a2205ad2ee999275893edd083c1c
>  [2]
> 

Re: [ovs-dev] [PATCH v2 1/1] datapath-windows : Avoid a deadlock when processing TFTP conntrack.

2024-06-06 Thread Wilson Peng via dev
Hi, Simon,

Below list why it would go into deadlock,

1). 1st place OvsCtExecute_()->OvsProcessConntrackEntry().   entry->parent
= parentEntry;(here parentEntry == entry)
2).  After it return back from OvsProcessConntrackEntry(), it goes to the
lines below,(Still on function OvsCtExecute_())
 1263 if (entry == NULL) {
1264 return status;
1265 }
1266
1267 /*
1268  * Note that natInfo is not the same as entry->natInfo here.
natInfo
1269  * is decided by action in the openflow rule, entry->natInfo is
decided
1270  * when the entry is created. In the reverse NAT case, natInfo is
1271  * NAT_ACTION_REVERSE, yet entry->natInfo is NAT_ACTION_SRC or
1272  * NAT_ACTION_DST without NAT_ACTION_REVERSE
1273  */
1274 KIRQL irql = KeGetCurrentIrql();
1275 OVS_ACQUIRE_SPIN_LOCK(&(entry->lock), irql);
-->SPIN_LOCK 1st time
1276 if (natInfo->natAction != NAT_ACTION_NONE) {
1277 OvsNatPacket(fwdCtx, entry, entry->natInfo.natAction,
1278  key, ctx.reply);
1279 }

3, Then it will goto the next same SPIN_LOCK for then lines below(Still on
function OvsCtExecute_())
1340 /* Add original tuple information to flow Key */
1341 if (entry->key.dl_type == ntohs(ETH_TYPE_IPV4)) {
1342 if (parent != NULL) {
1343 OVS_ACQUIRE_SPIN_LOCK(&(parent->lock), irql);
 --->>SPIN_LOCK 2nd time causing deadlock
1344 OvsCtUpdateTuple(key, >key);
1345 OVS_RELEASE_SPIN_LOCK(&(parent->lock), irql);
1346 } else {
1347 OvsCtUpdateTuple(key, >key);
1348 }
...

1365 OVS_RELEASE_SPIN_LOCK(&(entry->lock), irql);  --->this is the
SPIN_LOCK release place.

Regards
Wilson

On Thu, Jun 6, 2024 at 8:56 PM Simon Horman  wrote:

> On Wed, Jun 05, 2024 at 09:26:02PM +0800, Wilson Peng wrote:
> > Hi, Simon,
> > In this case, for tftp packet processing, it does have a child-parent
> > processing logic just like ftp in tcp.
> > Tftp packet1 from port1 to 69 and it will create one new conntrack entry
> > and create one related conntrack
> > entry also with src-port 0 and dst-port port1.   Original idea is if the
> > tftp reply packet is from port2 to port1 and
> > then it will create one new conntrack entry then it could set this
> > conntrack entry(port2->port1) parent be equal
> > to  conntrack_entry(port1-->69).
> >
> > In the processing part, if the tftp reply packet is from 69 to port1
> > then it would hit the original conntrack entry
> > (port1-->69) and lead to conntrack_entry parent be equal to itself.
> >
> > Only ftp/tftp traffic processing will have parent-child logic.
>
> Hi Wilson,
>
> Thanks for the explanation.
>
> I understand that this patch will prevent OVS from crashing (good!).  But
> I'm unclear how flows that would have caused a crash will be handled by
> conntrack with this patch present.  Could you shed some light on that too?
>

-- 
This electronic communication and the information and any files transmitted 
with it, or attached to it, are confidential and are intended solely for 
the use of the individual or entity to whom it is addressed and may contain 
information that is confidential, legally privileged, protected by privacy 
laws, or otherwise restricted from disclosure to anyone else. If you are 
not the intended recipient or the person responsible for delivering the 
e-mail to the intended recipient, you are hereby notified that any use, 
copying, distributing, dissemination, forwarding, printing, or copying of 
this e-mail is strictly prohibited. If you received this e-mail in error, 
please return the e-mail to the sender, delete it from your computer, and 
destroy any printed copy of it.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 2/2] ci: Restore vhost-user unit tests in check-dpdk.

2024-06-06 Thread David Marchand
Following a rework in the DPDK cache, the PATH variable is incorrectly
set, resulting in dpdk-testpmd not being available.
Because of this, vhost-user unit tests were skipped in GHA runs.

Fixes: 8893e24d9d09 ("dpdk: Update to use v23.11.")
Signed-off-by: David Marchand 
---
 .ci/linux-build.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index bf9d6241d5..702feeb3bb 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -25,7 +25,7 @@ function install_dpdk()
 export PKG_CONFIG_PATH=$DPDK_LIB/pkgconfig/:$PKG_CONFIG_PATH
 
 # Expose dpdk binaries.
-export PATH=$(pwd)/dpdk-dir/build/bin:$PATH
+export PATH=$(pwd)/dpdk-dir/bin:$PATH
 
 if [ ! -f "${VERSION_FILE}" ]; then
 echo "Could not find DPDK in $DPDK_INSTALL_DIR"
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/2] system-dpdk: Fix socket conflict when starting testpmd.

2024-06-06 Thread David Marchand
The DPDK telemetry library tries to connect to existing socket files so
that it knows whether it can take over them.

As was reported by Christian, following a fix in DPDK that got backported
in v23.11.1, vhost-user unit tests that have both OVS and testpmd running
at the same time reveal a conflict over the telemetry socket.
This conflict shows up as an error message in OVS logs which makes those
tests fail in the CI:

2024-06-06T13:03:38.351Z|1|dpdk|ERR|TELEMETRY: Socket write base info
to client failed

The EAL file-prefix option affects both the directory where DPDK stores
running files (like the telemetry socket) and how files backing hugepages
are named (when in non --in-memory mode).
Configure (again) this prefix so that testpmd runs in a dedicated directory.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-June/414545.html
Fixes: c488f28a0eaf ("system-dpdk: Don't require hugetlbfs.")
Signed-off-by: David Marchand 
---
 tests/system-dpdk-macros.at | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 7cf9bac170..f8ba766739 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -102,7 +102,7 @@ m4_define([OVS_DPDK_CHECK_TESTPMD],
 m4_define([OVS_DPDK_START_TESTPMD],
   [AT_CHECK([lscpu], [], [stdout])
AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' > NUMA_NODE])
-   eal_options="$DPDK_EAL_OPTIONS --in-memory --socket-mem="$(cat NUMA_NODE)" 
--single-file-segments --no-pci"
+   eal_options="$DPDK_EAL_OPTIONS --in-memory --socket-mem="$(cat NUMA_NODE)" 
--single-file-segments --no-pci --file-prefix testpmd"
options="$1"
test "$options" != "${options%% -- *}" || options="$options -- "
eal_options="$eal_options ${options%% -- *}"
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] system-dpdk: Tolerate new warnings 23.11.1/24.03.

2024-06-06 Thread David Marchand
On Thu, Jun 6, 2024 at 2:41 PM Christian Ehrhardt
 wrote:
> On Thu, Jun 6, 2024 at 2:25 PM David Marchand  
> wrote:
> > On Thu, Jun 6, 2024 at 9:33 AM  wrote:
> > >
> > > From: Christian Ehrhardt 
> > >
> > > DPDK fixed counting of telemetry clients in 24.03 [1] which was also
> > > backported to 23.11.1 [2]. Due to that the dpdk related openvswitch
> > > tests now fail in the following cases:
> > >   4: OVS-DPDK - ping vhost-user ports FAILED (system-dpdk.at:148)
> > >   5: OVS-DPDK - ping vhost-user-client ports FAILED (system-dpdk.at:224)
> > >   16: OVS-DPDK - MTU increase vport port FAILED (system-dpdk.at:609)
> > >   17: OVS-DPDK - MTU decrease vport port FAILED (system-dpdk.at:651)
> > >   20: OVS-DPDK - MTU upper bound vport port FAILED (system-dpdk.at:767)
> > >   21: OVS-DPDK - MTU lower bound vport port FAILED (system-dpdk.at:809)
> > >
> > > This is due to a new error being logged under these conditions:
> > >   dpdk|ERR|TELEMETRY: Socket write base info to client failed
> >
> > I had some trouble understanding what was happening here because this
> > error log should be triggered on connection to the telemetry socket.
> >
> > So I expected this error log affect any dpdk unit tests, but only the
> > tests with vhost-user were affected...
> > Attaching gdb, I could confirm something was indeed connecting to the
> > telemetry socket and I realised it was due to telemetry running in the
> > testpmd dpdk instance that tries to be smart and connect/quit on
> > existing sockets.
> >
> > So we can waive this error log but I think a better fix would be to
> > avoid this conflict between testpmd and ovs (regardless of telemetry).
> > Something like:
> >
> > diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
> > index 7cf9bac170..f8ba766739 100644
> > --- a/tests/system-dpdk-macros.at
> > +++ b/tests/system-dpdk-macros.at
> > @@ -102,7 +102,7 @@ m4_define([OVS_DPDK_CHECK_TESTPMD],
> >  m4_define([OVS_DPDK_START_TESTPMD],
> >[AT_CHECK([lscpu], [], [stdout])
> > AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while
> > (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE])
> > -   eal_options="$DPDK_EAL_OPTIONS --in-memory --socket-mem="$(cat
> > NUMA_NODE)" --single-file-segments --no-pci"
> > +   eal_options="$DPDK_EAL_OPTIONS --in-memory --socket-mem="$(cat
> > NUMA_NODE)" --single-file-segments --no-pci --file-prefix testpmd"
> > options="$1"
> > test "$options" != "${options%% -- *}" || options="$options -- "
> > eal_options="$eal_options ${options%% -- *}"
>
> I'm happy with that approach as well,
> feel free to submit for real and scrap mine.

Ok, I'll post a series as we also have a problem with GHA (not)
running DPDK tests.

Btw, on the DPDK side, I sent a patch to lower this telemetry log
message to debug level.
https://patchwork.dpdk.org/project/dpdk/patch/20240606122654.2889214-1-david.march...@redhat.com/


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 1/1] datapath-windows : Avoid a deadlock when processing TFTP conntrack.

2024-06-06 Thread Simon Horman
On Wed, Jun 05, 2024 at 09:26:02PM +0800, Wilson Peng wrote:
> Hi, Simon,
> In this case, for tftp packet processing, it does have a child-parent
> processing logic just like ftp in tcp.
> Tftp packet1 from port1 to 69 and it will create one new conntrack entry
> and create one related conntrack
> entry also with src-port 0 and dst-port port1.   Original idea is if the
> tftp reply packet is from port2 to port1 and
> then it will create one new conntrack entry then it could set this
> conntrack entry(port2->port1) parent be equal
> to  conntrack_entry(port1-->69).
> 
> In the processing part, if the tftp reply packet is from 69 to port1
> then it would hit the original conntrack entry
> (port1-->69) and lead to conntrack_entry parent be equal to itself.
> 
> Only ftp/tftp traffic processing will have parent-child logic.

Hi Wilson,

Thanks for the explanation.

I understand that this patch will prevent OVS from crashing (good!).  But
I'm unclear how flows that would have caused a crash will be handled by
conntrack with this patch present.  Could you shed some light on that too?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/1] netdev-offload-tc: Reserve lower tc prio for vlan ethertype.

2024-06-06 Thread Simon Horman
On Thu, Jun 06, 2024 at 10:54:45AM +, Roi Dayan wrote:
> Hi Simon,
> 
> I appreciate the review, Yes we will look in some of the other ether
> types and see if it's something we think is needed to prioritize as well.

Thanks Roi,

Much appreciated.

I've gone ahead and applied this patch to main.

- netdev-offload-tc: Reserve lower tc prio for vlan ethertype.
  https://github.com/openvswitch/ovs/commit/6280f5d04a8d

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] system-dpdk: Tolerate new warnings 23.11.1/24.03.

2024-06-06 Thread Christian Ehrhardt
On Thu, Jun 6, 2024 at 2:25 PM David Marchand  wrote:
>
> On Thu, Jun 6, 2024 at 9:33 AM  wrote:
> >
> > From: Christian Ehrhardt 
> >
> > DPDK fixed counting of telemetry clients in 24.03 [1] which was also
> > backported to 23.11.1 [2]. Due to that the dpdk related openvswitch
> > tests now fail in the following cases:
> >   4: OVS-DPDK - ping vhost-user ports FAILED (system-dpdk.at:148)
> >   5: OVS-DPDK - ping vhost-user-client ports FAILED (system-dpdk.at:224)
> >   16: OVS-DPDK - MTU increase vport port FAILED (system-dpdk.at:609)
> >   17: OVS-DPDK - MTU decrease vport port FAILED (system-dpdk.at:651)
> >   20: OVS-DPDK - MTU upper bound vport port FAILED (system-dpdk.at:767)
> >   21: OVS-DPDK - MTU lower bound vport port FAILED (system-dpdk.at:809)
> >
> > This is due to a new error being logged under these conditions:
> >   dpdk|ERR|TELEMETRY: Socket write base info to client failed
>
> I had some trouble understanding what was happening here because this
> error log should be triggered on connection to the telemetry socket.
>
> So I expected this error log affect any dpdk unit tests, but only the
> tests with vhost-user were affected...
> Attaching gdb, I could confirm something was indeed connecting to the
> telemetry socket and I realised it was due to telemetry running in the
> testpmd dpdk instance that tries to be smart and connect/quit on
> existing sockets.
>
> So we can waive this error log but I think a better fix would be to
> avoid this conflict between testpmd and ovs (regardless of telemetry).
> Something like:
>
> diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
> index 7cf9bac170..f8ba766739 100644
> --- a/tests/system-dpdk-macros.at
> +++ b/tests/system-dpdk-macros.at
> @@ -102,7 +102,7 @@ m4_define([OVS_DPDK_CHECK_TESTPMD],
>  m4_define([OVS_DPDK_START_TESTPMD],
>[AT_CHECK([lscpu], [], [stdout])
> AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while
> (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE])
> -   eal_options="$DPDK_EAL_OPTIONS --in-memory --socket-mem="$(cat
> NUMA_NODE)" --single-file-segments --no-pci"
> +   eal_options="$DPDK_EAL_OPTIONS --in-memory --socket-mem="$(cat
> NUMA_NODE)" --single-file-segments --no-pci --file-prefix testpmd"
> options="$1"
> test "$options" != "${options%% -- *}" || options="$options -- "
> eal_options="$eal_options ${options%% -- *}"

I'm happy with that approach as well,
feel free to submit for real and scrap mine.

> Opinions?
>
> --
> David Marchand
>


-- 
Christian Ehrhardt
Director of Engineering, Ubuntu Server
Canonical Ltd
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] checkpatch: Extend and move extra_keywords list to file.

2024-06-06 Thread Simon Horman
On Wed, Jun 05, 2024 at 09:07:42PM -0400, Mike Pattrick wrote:
> This patch extends the extra_keywords list from 324 to 747 keywords and
> moves this list to a separate file. The methodology used to create this
> list was running the spell checker on a large volume of historical
> patches and selecting any words that appeared multiple times.
> 
> The rational for using a separate file is to make management of this
> list simpler by decoupling the code from the keywords.
> 
> Signed-off-by: Mike Pattrick 
> ---
> v2: Included new file in distfiles

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] checkpatch: Extend and move extra_keywords list to file.

2024-06-06 Thread Simon Horman
On Wed, Jun 05, 2024 at 03:24:46PM -0400, Mike Pattrick wrote:
> This patch extends the extra_keywords list from 324 to 747 keywords and
> moves this list to a separate file. The methodology used to create this
> list was running the spell checker on a large volume of historical
> patches and selecting any words that appeared multiple times.
> 
> The rational for using a separate file is to make management of this
> list simpler by decoupling the code from the keywords.
> 
> Signed-off-by: Mike Pattrick 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] system-dpdk: Tolerate new warnings 23.11.1/24.03.

2024-06-06 Thread David Marchand
On Thu, Jun 6, 2024 at 9:33 AM  wrote:
>
> From: Christian Ehrhardt 
>
> DPDK fixed counting of telemetry clients in 24.03 [1] which was also
> backported to 23.11.1 [2]. Due to that the dpdk related openvswitch
> tests now fail in the following cases:
>   4: OVS-DPDK - ping vhost-user ports FAILED (system-dpdk.at:148)
>   5: OVS-DPDK - ping vhost-user-client ports FAILED (system-dpdk.at:224)
>   16: OVS-DPDK - MTU increase vport port FAILED (system-dpdk.at:609)
>   17: OVS-DPDK - MTU decrease vport port FAILED (system-dpdk.at:651)
>   20: OVS-DPDK - MTU upper bound vport port FAILED (system-dpdk.at:767)
>   21: OVS-DPDK - MTU lower bound vport port FAILED (system-dpdk.at:809)
>
> This is due to a new error being logged under these conditions:
>   dpdk|ERR|TELEMETRY: Socket write base info to client failed

I had some trouble understanding what was happening here because this
error log should be triggered on connection to the telemetry socket.

So I expected this error log affect any dpdk unit tests, but only the
tests with vhost-user were affected...
Attaching gdb, I could confirm something was indeed connecting to the
telemetry socket and I realised it was due to telemetry running in the
testpmd dpdk instance that tries to be smart and connect/quit on
existing sockets.

So we can waive this error log but I think a better fix would be to
avoid this conflict between testpmd and ovs (regardless of telemetry).
Something like:

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 7cf9bac170..f8ba766739 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -102,7 +102,7 @@ m4_define([OVS_DPDK_CHECK_TESTPMD],
 m4_define([OVS_DPDK_START_TESTPMD],
   [AT_CHECK([lscpu], [], [stdout])
AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while
(c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE])
-   eal_options="$DPDK_EAL_OPTIONS --in-memory --socket-mem="$(cat
NUMA_NODE)" --single-file-segments --no-pci"
+   eal_options="$DPDK_EAL_OPTIONS --in-memory --socket-mem="$(cat
NUMA_NODE)" --single-file-segments --no-pci --file-prefix testpmd"
options="$1"
test "$options" != "${options%% -- *}" || options="$options -- "
eal_options="$eal_options ${options%% -- *}"

Opinions?

-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] datapath-windows: Fix parsing of split buffers in OvsGetTcpHeader.

2024-06-06 Thread Alin Serdean
Hi Ilya,

Thanks for the patch. Will try to get a setup and test it.

In theory if we have a new analyze target of a newer visual studio it should 
flag those types of issues.

Will try to send a patch for the aforementioned target.

Wenying can you help in testing the patch?

Thank you,
Alin.

From: Ilya Maximets 
Sent: Thursday, June 6, 2024 12:52 PM
To: ovs-dev@openvswitch.org 
Cc: Alin Serdean ; Wenying Dong ; Ilya 
Maximets 
Subject: [PATCH] datapath-windows: Fix parsing of split buffers in 
OvsGetTcpHeader.

NdisGetDataBuffer() is called without providing a buffer to copy packet
data in case it is not contiguous.  So, it fails in some scenarios
where the packet is handled by the general network stack before OVS
and headers become split in multiple buffers.

Use existing helpers to retrieve the headers instead, they are using
OvsGetPacketBytes() which should be able to handle split data.

It might be a bit slower than getting direct pointers that may be
provided by NdisGetDataBuffer(), but it's better to optimize commonly
used OvsGetPacketBytes() helper in the future instead of optimizing
every single caller separately.  And we're still copying the TCP
header anyway.

Fixes: 9726a016d9d6 ("datapath-windows: Implement locking in conntrack NAT.")
Reported-at: https://github.com/openvswitch/ovs-issues/issues/323
Signed-off-by: Ilya Maximets 
---

WARNING: I beleive this code is correct, but I did not test it with real
 traffic, I only verified that it compiles.  Should not be applied
 unless someone tests it in an actual Windows setup.

 datapath-windows/ovsext/Conntrack.c | 45 ++---
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/datapath-windows/ovsext/Conntrack.c 
b/datapath-windows/ovsext/Conntrack.c
index 39ba5cc10..4649805dd 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -678,46 +678,43 @@ OvsGetTcpHeader(PNET_BUFFER_LIST nbl,
 VOID *storage,
 UINT32 *tcpPayloadLen)
 {
-IPHdr *ipHdr;
-IPv6Hdr *ipv6Hdr;
-TCPHdr *tcp;
+IPv6Hdr ipv6HdrStorage;
+IPHdr ipHdrStorage;
+const IPv6Hdr *ipv6Hdr;
+const IPHdr *ipHdr;
+const TCPHdr *tcp;
 VOID *dest = storage;
 uint16_t ipv6ExtLength = 0;

 if (layers->isIPv6) {
-ipv6Hdr = NdisGetDataBuffer(NET_BUFFER_LIST_FIRST_NB(nbl),
-layers->l4Offset + sizeof(TCPHdr),
-NULL, 1, 0);
+ipv6Hdr = OvsGetPacketBytes(nbl, sizeof *ipv6Hdr,
+layers->l3Offset, );
 if (ipv6Hdr == NULL) {
 return NULL;
 }

-tcp = (TCPHdr *)((PCHAR)ipv6Hdr + layers->l4Offset);
-ipv6Hdr = (IPv6Hdr *)((PCHAR)ipv6Hdr + layers->l3Offset);
-if (tcp->doff * 4 >= sizeof *tcp) {
-NdisMoveMemory(dest, tcp, sizeof(TCPHdr));
-ipv6ExtLength = layers->l4Offset - layers->l3Offset - 
sizeof(IPv6Hdr);
-*tcpPayloadLen = (ntohs(ipv6Hdr->payload_len) - ipv6ExtLength - 
TCP_HDR_LEN(tcp));
-return storage;
+tcp = OvsGetTcp(nbl, layers->l4Offset, dest);
+if (tcp == NULL) {
+return NULL;
 }
+
+ipv6ExtLength = layers->l4Offset - layers->l3Offset - sizeof(IPv6Hdr);
+*tcpPayloadLen = (ntohs(ipv6Hdr->payload_len) - ipv6ExtLength - 
TCP_HDR_LEN(tcp));
 } else {
-ipHdr = NdisGetDataBuffer(NET_BUFFER_LIST_FIRST_NB(nbl),
-  layers->l4Offset + sizeof(TCPHdr),
-  NULL, 1 /*no align*/, 0);
+ipHdr = OvsGetIp(nbl, layers->l3Offset, );
 if (ipHdr == NULL) {
 return NULL;
 }

-ipHdr = (IPHdr *)((PCHAR)ipHdr + layers->l3Offset);
-tcp = (TCPHdr *)((PCHAR)ipHdr + ipHdr->ihl * 4);
-
-if (tcp->doff * 4 >= sizeof *tcp) {
-NdisMoveMemory(dest, tcp, sizeof(TCPHdr));
-*tcpPayloadLen = TCP_DATA_LENGTH(ipHdr, tcp);
-return storage;
+tcp = OvsGetTcp(nbl, layers->l4Offset, dest);
+if (tcp == NULL) {
+return NULL;
 }
+
+*tcpPayloadLen = TCP_DATA_LENGTH(ipHdr, tcp);
 }
-return NULL;
+
+return storage;
 }

 static UINT8
--
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] Text respresntations for drop sampling.

2024-06-06 Thread Dumitru Ceara
Nit: typo in subject: "respresntations".  And maybe it should be
something like:

Add text representations for drop flows.

On 6/6/24 11:17, Adrián Moreno wrote:
> On Mon, Jun 03, 2024 at 03:47:25PM GMT, Jacob Tanenbaum wrote:
>> Created a new column in the southbound database to hardcode a human readable
>> description for flows. This first use is describing why the flow is dropping 
>> packets.
>> The new column is called flow_desc and will create southbound database 
>> entries like this
>>
>> _uuid   : 20f1897b-477e-47ae-a32c-c546d83ec097
>> actions : 
>> "sample(probability=65535,collector_set=123,obs_domain=1,obs_point=$cookie); 
>> /* drop */"
>> controller_meter: []
>> external_ids: {source="northd.c:8721", stage-name=ls_in_l2_unknown}
>> flow_desc   : "No L2 destination"
>> logical_datapath: []
>> logical_dp_group: ee3c3db5-98a2-4f34-8a84-409deae140a7
>> match   : "outport == \"none\""
>> pipeline: ingress
>> priority: 50
>> table_id: 27
>> tags: {}
>> hash: 0
>>
>> future work includes entering more flow_desc for more flows and adding
>> flow_desc to the actions as a comment.
> 
> Is this future work planned for the next OVN version?
> 

Given that OVN release cadence is every 6 months I'd suggest we do this
now so we have a usable feature in v24.09.0.  The patch will be a bit
larger but it seems rather straightforward.

Wdyt Jacob?

>>
>> Signed-off-by: Jacob Tanenbaum 
>> Suggested-by: Dumitru Ceara 
>> Reported-at: https://issues.redhat.com/browse/FDP-307
>>
>> ---
>>
>> v1: initial version
>> v2: correct commit message
>> make the flow_desc a char*
>> correct a typo in the ovn-sb.xml
>> correct the test
>> v3: rebase issue with NEWS file
>> v4: remove options:debug_drop_domain_id="1" from testing as we
>> do not depend on it
>>
>> merge
>>
>> diff --git a/NEWS b/NEWS
>> index 81c958f9a..04c441ada 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -21,6 +21,8 @@ Post v24.03.0
>>  MAC addresses configured on the LSP with "unknown", are learnt via the
>>  OVN native FDB.
>>- Add support for ovsdb-server `--config-file` option in ovn-ctl.
>> +  - Added a new column in the southbound database "flow_desc" to provide
>> +human readable context to flows.
>>
>>  OVN v24.03.0 - 01 Mar 2024
>>  --
>> diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c
>> index b2c60b5de..e27558a32 100644
>> --- a/northd/lflow-mgr.c
>> +++ b/northd/lflow-mgr.c
>> @@ -36,7 +36,7 @@ static void ovn_lflow_init(struct ovn_lflow *, struct 
>> ovn_datapath *od,
>> uint16_t priority, char *match,
>> char *actions, char *io_port,
>> char *ctrl_meter, char *stage_hint,
>> -   const char *where);
>> +   const char *where, char *flow_desc);
>>  static struct ovn_lflow *ovn_lflow_find(const struct hmap *lflows,
>>  enum ovn_stage stage,
>>  uint16_t priority, const char 
>> *match,
>> @@ -52,7 +52,7 @@ static struct ovn_lflow *do_ovn_lflow_add(
>>  const char *actions, const char *io_port,
>>  const char *ctrl_meter,
>>  const struct ovsdb_idl_row *stage_hint,
>> -const char *where);
>> +const char *where, const char *flow_desc);
>>
>>
>>  static struct ovs_mutex *lflow_hash_lock(const struct hmap *lflow_table,
>> @@ -173,6 +173,7 @@ struct ovn_lflow {
>>* 'dpg_bitmap'. */
>>  struct ovn_dp_group *dpg;/* Link to unique Sb datapath group. */
>>  const char *where;
>> +char *flow_desc;
>>
>>  struct uuid sb_uuid; /* SB DB row uuid, specified by northd. */
>>  struct ovs_list referenced_by;  /* List of struct lflow_ref_node. */
>> @@ -659,7 +660,7 @@ lflow_table_add_lflow(struct lflow_table *lflow_table,
>>const char *match, const char *actions,
>>const char *io_port, const char *ctrl_meter,
>>const struct ovsdb_idl_row *stage_hint,
>> -  const char *where,
>> +  const char *where, const char *flow_desc,
>>struct lflow_ref *lflow_ref)
>>  OVS_EXCLUDED(fake_hash_mutex)
>>  {
>> @@ -679,7 +680,7 @@ lflow_table_add_lflow(struct lflow_table *lflow_table,
>>  do_ovn_lflow_add(lflow_table,
>>   od ? ods_size(od->datapaths) : dp_bitmap_len,
>>   hash, stage, priority, match, actions,
>> - io_port, ctrl_meter, stage_hint, where);
>> + io_port, ctrl_meter, stage_hint, where, flow_desc);
>>
>>  if (lflow_ref) {
>>  struct lflow_ref_node *lrn =
>> @@ -733,7 +734,7 @@ lflow_table_add_lflow_default_drop(struct lflow_table 
>> 

Re: [ovs-dev] [PATCH 1/1] netdev-offload-tc: Reserve lower tc prio for vlan ethertype.

2024-06-06 Thread Roi Dayan via dev
Hi Simon,

I appreciate the review, Yes we will look in some of the other ether types and 
see if it's something we think is needed to prioritize as well.

Thanks,
Roi

Sent from Nine


From: Simon Horman 
Sent: Tuesday, June 4, 2024 11:56
To: Roi Dayan
Cc: Ilya Maximets; d...@openvswitch.org; Maor Dickman
Subject: Re: [ovs-dev] [PATCH 1/1] netdev-offload-tc: Reserve lower tc prio for 
vlan ethertype.


On Thu, May 30, 2024 at 09:31:06AM +0300, Roi Dayan via dev wrote:
>
>
> On 28/05/2024 20:12, Ilya Maximets wrote:
> > On 5/26/24 10:31, Roi Dayan via dev wrote:
> >> From: Maor Dickman 
> >>
> >> The cited commit reserved lower tc priorities for IP ethertypes in order
> >> to give IP traffic higher priority than other management traffic.
> >> In case of of vlan encap traffic, IP traffic will still get lower
> >> priority.
> >>
> >> Fix it by also reserving low priority tc prio for vlan.
> >>
> >> Fixes: c230c7579c14 ("netdev-offload-tc: Reserve lower tc prios for ip 
> >> ethertypes")
> >> Signed-off-by: Maor Dickman 
> >> Acked-by: Roi Dayan 
> >> ---
> >>  lib/netdev-offload-tc.c | 2 ++
> >>  lib/tc.h| 1 +
> >>  2 files changed, 3 insertions(+)
> >>
> >> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> >> index 921d5231777e..3be1c08d24f6 100644
> >> --- a/lib/netdev-offload-tc.c
> >> +++ b/lib/netdev-offload-tc.c
> >> @@ -400,6 +400,8 @@ get_next_available_prio(ovs_be16 protocol)
> >>  return TC_RESERVED_PRIORITY_IPV4;
> >>  } else if (protocol == htons(ETH_P_IPV6)) {
> >>  return TC_RESERVED_PRIORITY_IPV6;
> >> +} else if (protocol == htons(ETH_P_8021Q)) {
> >
> > Should 802.1ad traffic also get the priority?
> > What about MPLS?
> >
> > Best regards, Ilya Maximets.
>
> Hi Ilya,
>
> It is correct there could be more types that could benefit from a reserved
> prio but from what we saw in the field we didn't notice a real usage
> of 802.1ad or MPLS while 8021q was being used actively and we noticed the
> performance degradation and improvement after using the reserved prio.
> We think this patch can help many active openvswitch users and in the future
> if other types would be more common we could add those.

Hi Roi and Ilya,

FWIIW, I think it is reasonable to update 8021q up-front,
i.e. accept this patch.

But I also think that, as a follow-up, we should look into other Ether Types
and see if they also need this treatment, rather than waiting for problems
to materialise in the field.

Roi, is this something your team can take on?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] datapath-windows: Fix parsing of split buffers in OvsGetTcpHeader.

2024-06-06 Thread Ilya Maximets
NdisGetDataBuffer() is called without providing a buffer to copy packet
data in case it is not contiguous.  So, it fails in some scenarios
where the packet is handled by the general network stack before OVS
and headers become split in multiple buffers.

Use existing helpers to retrieve the headers instead, they are using
OvsGetPacketBytes() which should be able to handle split data.

It might be a bit slower than getting direct pointers that may be
provided by NdisGetDataBuffer(), but it's better to optimize commonly
used OvsGetPacketBytes() helper in the future instead of optimizing
every single caller separately.  And we're still copying the TCP
header anyway.

Fixes: 9726a016d9d6 ("datapath-windows: Implement locking in conntrack NAT.")
Reported-at: https://github.com/openvswitch/ovs-issues/issues/323
Signed-off-by: Ilya Maximets 
---

WARNING: I beleive this code is correct, but I did not test it with real
 traffic, I only verified that it compiles.  Should not be applied
 unless someone tests it in an actual Windows setup.

 datapath-windows/ovsext/Conntrack.c | 45 ++---
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/datapath-windows/ovsext/Conntrack.c 
b/datapath-windows/ovsext/Conntrack.c
index 39ba5cc10..4649805dd 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -678,46 +678,43 @@ OvsGetTcpHeader(PNET_BUFFER_LIST nbl,
 VOID *storage,
 UINT32 *tcpPayloadLen)
 {
-IPHdr *ipHdr;
-IPv6Hdr *ipv6Hdr;
-TCPHdr *tcp;
+IPv6Hdr ipv6HdrStorage;
+IPHdr ipHdrStorage;
+const IPv6Hdr *ipv6Hdr;
+const IPHdr *ipHdr;
+const TCPHdr *tcp;
 VOID *dest = storage;
 uint16_t ipv6ExtLength = 0;
 
 if (layers->isIPv6) {
-ipv6Hdr = NdisGetDataBuffer(NET_BUFFER_LIST_FIRST_NB(nbl),
-layers->l4Offset + sizeof(TCPHdr),
-NULL, 1, 0);
+ipv6Hdr = OvsGetPacketBytes(nbl, sizeof *ipv6Hdr,
+layers->l3Offset, );
 if (ipv6Hdr == NULL) {
 return NULL;
 }
 
-tcp = (TCPHdr *)((PCHAR)ipv6Hdr + layers->l4Offset);
-ipv6Hdr = (IPv6Hdr *)((PCHAR)ipv6Hdr + layers->l3Offset);
-if (tcp->doff * 4 >= sizeof *tcp) {
-NdisMoveMemory(dest, tcp, sizeof(TCPHdr));
-ipv6ExtLength = layers->l4Offset - layers->l3Offset - 
sizeof(IPv6Hdr);
-*tcpPayloadLen = (ntohs(ipv6Hdr->payload_len) - ipv6ExtLength - 
TCP_HDR_LEN(tcp));
-return storage;
+tcp = OvsGetTcp(nbl, layers->l4Offset, dest);
+if (tcp == NULL) {
+return NULL;
 }
+
+ipv6ExtLength = layers->l4Offset - layers->l3Offset - sizeof(IPv6Hdr);
+*tcpPayloadLen = (ntohs(ipv6Hdr->payload_len) - ipv6ExtLength - 
TCP_HDR_LEN(tcp));
 } else {
-ipHdr = NdisGetDataBuffer(NET_BUFFER_LIST_FIRST_NB(nbl),
-  layers->l4Offset + sizeof(TCPHdr),
-  NULL, 1 /*no align*/, 0);
+ipHdr = OvsGetIp(nbl, layers->l3Offset, );
 if (ipHdr == NULL) {
 return NULL;
 }
 
-ipHdr = (IPHdr *)((PCHAR)ipHdr + layers->l3Offset);
-tcp = (TCPHdr *)((PCHAR)ipHdr + ipHdr->ihl * 4);
-
-if (tcp->doff * 4 >= sizeof *tcp) {
-NdisMoveMemory(dest, tcp, sizeof(TCPHdr));
-*tcpPayloadLen = TCP_DATA_LENGTH(ipHdr, tcp);
-return storage;
+tcp = OvsGetTcp(nbl, layers->l4Offset, dest);
+if (tcp == NULL) {
+return NULL;
 }
+
+*tcpPayloadLen = TCP_DATA_LENGTH(ipHdr, tcp);
 }
-return NULL;
+
+return storage;
 }
 
 static UINT8
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] system-dpdk: Tolerate new warnings 23.11.1/24.03.

2024-06-06 Thread Simon Horman
On Thu, Jun 06, 2024 at 09:32:17AM +0200, christian.ehrha...@canonical.com 
wrote:
> From: Christian Ehrhardt 
> 
> DPDK fixed counting of telemetry clients in 24.03 [1] which was also
> backported to 23.11.1 [2]. Due to that the dpdk related openvswitch
> tests now fail in the following cases:
>   4: OVS-DPDK - ping vhost-user ports FAILED (system-dpdk.at:148)
>   5: OVS-DPDK - ping vhost-user-client ports FAILED (system-dpdk.at:224)
>   16: OVS-DPDK - MTU increase vport port FAILED (system-dpdk.at:609)
>   17: OVS-DPDK - MTU decrease vport port FAILED (system-dpdk.at:651)
>   20: OVS-DPDK - MTU upper bound vport port FAILED (system-dpdk.at:767)
>   21: OVS-DPDK - MTU lower bound vport port FAILED (system-dpdk.at:809)
> 
> This is due to a new error being logged under these conditions:
>   dpdk|ERR|TELEMETRY: Socket write base info to client failed
> 
> This is okay to happen in the cases we set up in the DPDK related tests
> for OVS and can be ignored, therefore it is added to the tolerated
> messages in the m4 definition of OVS_DPDK_STOP_VSWITCHD.
> 
> [1]: https://github.com/DPDK/dpdk/commit/e14bb5f1
> [2]: https://github.com/DPDK/dpdk-stable/commit/cbd1c165
> 
> Signed-off-by: Christian Ehrhardt 
> ---
> Changes since v1:
> - adapt to grammar rules for the subject

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] checkpatch: Don't warn on pointer to pointer.

2024-06-06 Thread Simon Horman
On Wed, Jun 05, 2024 at 03:51:38PM +0200, Adrian Moreno wrote:
> Current regexp used to check whitespaces around operators does not
> consider that there can be more than one "*" together to express pointer
> to pointer.
> 
> As a result, false positive warnings are raised when the
> patch contains a simple list of pointers, e.g: "char **errrp").
> 
> Fix the regexp to allow more than one consecutive "+" characters.
> 
> Signed-off-by: Adrian Moreno 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/1] debian: Fix tabs vs spaces.

2024-06-06 Thread Simon Horman
On Tue, May 28, 2024 at 11:34:17AM +0300, Roi Dayan via dev wrote:
> Getting the following message while trying to build a debian package.
> debian/openvswitch-switch.init
> debian/openvswitch-switch.postinst
> See above for files that use tabs for indentation.
> Please use spaces instead.
> Fix it.
> 
> Signed-off-by: Roi Dayan 

Thanks, applied.

- debian: Fix tabs vs spaces.
  https://github.com/openvswitch/ovs/commit/792e8ee869aa

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-offload-dpdk: Support offload of set dscp action.

2024-06-06 Thread Simon Horman
On Wed, May 29, 2024 at 09:14:03AM +0800, Sunyang Wu via dev wrote:

Hi Sunyang,

It would be nice to include a patch description here.

> Signed-off-by: Sunyang Wu 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] Text respresntations for drop sampling.

2024-06-06 Thread Adrián Moreno
On Mon, Jun 03, 2024 at 03:47:25PM GMT, Jacob Tanenbaum wrote:
> Created a new column in the southbound database to hardcode a human readable
> description for flows. This first use is describing why the flow is dropping 
> packets.
> The new column is called flow_desc and will create southbound database 
> entries like this
>
> _uuid   : 20f1897b-477e-47ae-a32c-c546d83ec097
> actions : 
> "sample(probability=65535,collector_set=123,obs_domain=1,obs_point=$cookie); 
> /* drop */"
> controller_meter: []
> external_ids: {source="northd.c:8721", stage-name=ls_in_l2_unknown}
> flow_desc   : "No L2 destination"
> logical_datapath: []
> logical_dp_group: ee3c3db5-98a2-4f34-8a84-409deae140a7
> match   : "outport == \"none\""
> pipeline: ingress
> priority: 50
> table_id: 27
> tags: {}
> hash: 0
>
> future work includes entering more flow_desc for more flows and adding
> flow_desc to the actions as a comment.

Is this future work planned for the next OVN version?

>
> Signed-off-by: Jacob Tanenbaum 
> Suggested-by: Dumitru Ceara 
> Reported-at: https://issues.redhat.com/browse/FDP-307
>
> ---
>
> v1: initial version
> v2: correct commit message
> make the flow_desc a char*
> correct a typo in the ovn-sb.xml
> correct the test
> v3: rebase issue with NEWS file
> v4: remove options:debug_drop_domain_id="1" from testing as we
> do not depend on it
>
> merge
>
> diff --git a/NEWS b/NEWS
> index 81c958f9a..04c441ada 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -21,6 +21,8 @@ Post v24.03.0
>  MAC addresses configured on the LSP with "unknown", are learnt via the
>  OVN native FDB.
>- Add support for ovsdb-server `--config-file` option in ovn-ctl.
> +  - Added a new column in the southbound database "flow_desc" to provide
> +human readable context to flows.
>
>  OVN v24.03.0 - 01 Mar 2024
>  --
> diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c
> index b2c60b5de..e27558a32 100644
> --- a/northd/lflow-mgr.c
> +++ b/northd/lflow-mgr.c
> @@ -36,7 +36,7 @@ static void ovn_lflow_init(struct ovn_lflow *, struct 
> ovn_datapath *od,
> uint16_t priority, char *match,
> char *actions, char *io_port,
> char *ctrl_meter, char *stage_hint,
> -   const char *where);
> +   const char *where, char *flow_desc);
>  static struct ovn_lflow *ovn_lflow_find(const struct hmap *lflows,
>  enum ovn_stage stage,
>  uint16_t priority, const char *match,
> @@ -52,7 +52,7 @@ static struct ovn_lflow *do_ovn_lflow_add(
>  const char *actions, const char *io_port,
>  const char *ctrl_meter,
>  const struct ovsdb_idl_row *stage_hint,
> -const char *where);
> +const char *where, const char *flow_desc);
>
>
>  static struct ovs_mutex *lflow_hash_lock(const struct hmap *lflow_table,
> @@ -173,6 +173,7 @@ struct ovn_lflow {
>* 'dpg_bitmap'. */
>  struct ovn_dp_group *dpg;/* Link to unique Sb datapath group. */
>  const char *where;
> +char *flow_desc;
>
>  struct uuid sb_uuid; /* SB DB row uuid, specified by northd. */
>  struct ovs_list referenced_by;  /* List of struct lflow_ref_node. */
> @@ -659,7 +660,7 @@ lflow_table_add_lflow(struct lflow_table *lflow_table,
>const char *match, const char *actions,
>const char *io_port, const char *ctrl_meter,
>const struct ovsdb_idl_row *stage_hint,
> -  const char *where,
> +  const char *where, const char *flow_desc,
>struct lflow_ref *lflow_ref)
>  OVS_EXCLUDED(fake_hash_mutex)
>  {
> @@ -679,7 +680,7 @@ lflow_table_add_lflow(struct lflow_table *lflow_table,
>  do_ovn_lflow_add(lflow_table,
>   od ? ods_size(od->datapaths) : dp_bitmap_len,
>   hash, stage, priority, match, actions,
> - io_port, ctrl_meter, stage_hint, where);
> + io_port, ctrl_meter, stage_hint, where, flow_desc);
>
>  if (lflow_ref) {
>  struct lflow_ref_node *lrn =
> @@ -733,7 +734,7 @@ lflow_table_add_lflow_default_drop(struct lflow_table 
> *lflow_table,
>  {
>  lflow_table_add_lflow(lflow_table, od, NULL, 0, stage, 0, "1",
>debug_drop_action(), NULL, NULL, NULL,
> -  where, lflow_ref);
> +  where, NULL, lflow_ref);
>  }
>
>  struct ovn_dp_group *
> @@ -856,7 +857,7 @@ static void
>  ovn_lflow_init(struct ovn_lflow *lflow, struct ovn_datapath *od,
> size_t dp_bitmap_len, enum ovn_stage stage, uint16_t priority,
>   

Re: [ovs-dev] [PATCH net-next 1/2] selftests: openvswitch: fix action formatting

2024-06-06 Thread Adrián Moreno
On Mon, Jun 03, 2024 at 03:00:03PM GMT, Aaron Conole wrote:
> Adrian Moreno  writes:
>
> > In the action formatting function ("dpstr"), the iteration is made over
> > the nla_map, so if there are more than one attribute from the same type
> > we only print the first one.
> >
> > Fix this by iterating over the actual attributes.
> >
> > Signed-off-by: Adrian Moreno 
> > ---
> >  .../selftests/net/openvswitch/ovs-dpctl.py| 48 +++
> >  1 file changed, 27 insertions(+), 21 deletions(-)
> >
> > diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py 
> > b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> > index 1dd057afd3fb..b76907ac0092 100644
> > --- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> > +++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> > @@ -437,40 +437,46 @@ class ovsactions(nla):
> >  def dpstr(self, more=False):
> >  print_str = ""
> >
> > -for field in self.nla_map:
> > -if field[1] == "none" or self.get_attr(field[0]) is None:
> > +for attr_name, value in self["attrs"]:
> > +attr_desc = next(filter(lambda x: x[0] == attr_name, 
> > self.nla_map),
> > + None)
> > +if not attr_desc:
> > +raise ValueError("Unknown attribute: %s" % attr)
> > +
> > +attr_type = attr_desc[1]
> > +
> > +if attr_type == "none":
>
> I agree, this is an issue.  BUT I think it might be better to just
> filter by field type up front.  See:
>
> https://github.com/apconole/linux-next-work/commit/7262107de7170d44b6dbf6c5ea6f7e6c0bb71d36#diff-3e72e7405c6bb4e9842bed5f63883ca930387086bb40d4034e92ed83a5decb4bR441
>
> That version I think ends up being much easier to follow.  If you want
> to take it for your series, feel free.  If you disagree, maybe there's
> something I'm not considering about it.
>

I agree. It's better to check field attribute names first. I found this
during manual testing of the "emit_sample" series but I ended up not
needing it for the automated one, so I'm OK waiting for your cleanup
series.

In fact, I also have some patches that try some rework of this part. In
particular, I tried to unify all attributes under a common base class
that would handle printing and parsing. That way, most cases would fall
into "print_str += datum.dpstr(more)" and the "if/elif" block would
shrink significantly.

> NOTE that version is just a bunch of independent changes that are
> squashed together.  I have a cleaner version.
>
> I can also bundle up the series I have so far and submit, but I didn't
> want to do that until I got all the pmtu.sh support working.  Maybe it
> makes sense to send it now though.  Simon, Jakub - wdyt?
>
> >  continue
> >  if print_str != "":
> >  print_str += ","
> >
> > -if field[1] == "uint32":
> > -if field[0] == "OVS_ACTION_ATTR_OUTPUT":
> > -print_str += "%d" % int(self.get_attr(field[0]))
> > -elif field[0] == "OVS_ACTION_ATTR_RECIRC":
> > -print_str += "recirc(0x%x)" % 
> > int(self.get_attr(field[0]))
> > -elif field[0] == "OVS_ACTION_ATTR_TRUNC":
> > -print_str += "trunc(%d)" % int(self.get_attr(field[0]))
> > -elif field[0] == "OVS_ACTION_ATTR_DROP":
> > -print_str += "drop(%d)" % int(self.get_attr(field[0]))
> > -elif field[1] == "flag":
> > -if field[0] == "OVS_ACTION_ATTR_CT_CLEAR":
> > +if attr_type == "uint32":
> > +if attr_name == "OVS_ACTION_ATTR_OUTPUT":
> > +print_str += "%d" % int(value)
> > +elif attr_name == "OVS_ACTION_ATTR_RECIRC":
> > +print_str += "recirc(0x%x)" % int(value)
> > +elif attr_name == "OVS_ACTION_ATTR_TRUNC":
> > +print_str += "trunc(%d)" % int(value)
> > +elif attr_name == "OVS_ACTION_ATTR_DROP":
> > +print_str += "drop(%d)" % int(value)
> > +elif attr_type == "flag":
> > +if attr_name == "OVS_ACTION_ATTR_CT_CLEAR":
> >  print_str += "ct_clear"
> > -elif field[0] == "OVS_ACTION_ATTR_POP_VLAN":
> > +elif attr_name == "OVS_ACTION_ATTR_POP_VLAN":
> >  print_str += "pop_vlan"
> > -elif field[0] == "OVS_ACTION_ATTR_POP_ETH":
> > +elif attr_name == "OVS_ACTION_ATTR_POP_ETH":
> >  print_str += "pop_eth"
> > -elif field[0] == "OVS_ACTION_ATTR_POP_NSH":
> > +elif attr_name == "OVS_ACTION_ATTR_POP_NSH":
> >  print_str += "pop_nsh"
> > -elif field[0] == "OVS_ACTION_ATTR_POP_MPLS":
> > +elif attr_name == "OVS_ACTION_ATTR_POP_MPLS":
> >  print_str += 

Re: [ovs-dev] [RFC v2 4/9] vswitchd: Add local sampling to vswitchd schema.

2024-06-06 Thread Adrián Moreno
On Wed, Jun 05, 2024 at 10:23:32PM GMT, Adrian Moreno wrote:
> Add as new column in the Flow_Sample_Collector_Set table named
> "local_sample_group" which enables this feature.
>
> Signed-off-by: Adrian Moreno 

Ugh...It seems a merge conflict with the NEWS file is breaking robot
builds.

> ---
>  NEWS   |  4 ++
>  vswitchd/bridge.c  | 78 +++---
>  vswitchd/vswitch.ovsschema |  9 -
>  vswitchd/vswitch.xml   | 39 +--
>  4 files changed, 119 insertions(+), 11 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index b92cec532..1c05a7120 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -7,6 +7,10 @@ Post-v3.3.0
> - The primary development branch has been renamed from 'master' to 'main'.
>   The OVS tree remains hosted on GitHub.
>   https://github.com/openvswitch/ovs.git
> +   - Local sampling is introduced. It reuses the OpenFlow sample action and
> + allows samples to be emitted locally (instead of via IPFIX) in a
> + datapath-specific manner via the new datapath action called 
> "emit_sample".
> + Linux kernel datapath is the first to support this feature.
>
>
>  v3.3.0 - 16 Feb 2024
> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> index 95a65fcdc..cd7dc6646 100644
> --- a/vswitchd/bridge.c
> +++ b/vswitchd/bridge.c
> @@ -288,6 +288,7 @@ static void bridge_configure_mac_table(struct bridge *);
>  static void bridge_configure_mcast_snooping(struct bridge *);
>  static void bridge_configure_sflow(struct bridge *, int 
> *sflow_bridge_number);
>  static void bridge_configure_ipfix(struct bridge *);
> +static void bridge_configure_lsample(struct bridge *);
>  static void bridge_configure_spanning_tree(struct bridge *);
>  static void bridge_configure_tables(struct bridge *);
>  static void bridge_configure_dp_desc(struct bridge *);
> @@ -989,6 +990,7 @@ bridge_reconfigure(const struct ovsrec_open_vswitch 
> *ovs_cfg)
>  bridge_configure_netflow(br);
>  bridge_configure_sflow(br, _bridge_number);
>  bridge_configure_ipfix(br);
> +bridge_configure_lsample(br);
>  bridge_configure_spanning_tree(br);
>  bridge_configure_tables(br);
>  bridge_configure_dp_desc(br);
> @@ -1537,10 +1539,11 @@ ovsrec_ipfix_is_valid(const struct ovsrec_ipfix 
> *ipfix)
>  return ipfix && ipfix->n_targets > 0;
>  }
>
> -/* Returns whether a Flow_Sample_Collector_Set row is valid. */
> +/* Returns whether a Flow_Sample_Collector_Set row constains valid IPFIX
> + * configuration. */
>  static bool
> -ovsrec_fscs_is_valid(const struct ovsrec_flow_sample_collector_set *fscs,
> - const struct bridge *br)
> +ovsrec_fscs_is_valid_ipfix(const struct ovsrec_flow_sample_collector_set 
> *fscs,
> +   const struct bridge *br)
>  {
>  return ovsrec_ipfix_is_valid(fscs->ipfix) && fscs->bridge == br->cfg;
>  }
> @@ -1558,7 +1561,7 @@ bridge_configure_ipfix(struct bridge *br)
>  const char *virtual_obs_id;
>
>  OVSREC_FLOW_SAMPLE_COLLECTOR_SET_FOR_EACH(fe_cfg, idl) {
> -if (ovsrec_fscs_is_valid(fe_cfg, br)) {
> +if (ovsrec_fscs_is_valid_ipfix(fe_cfg, br)) {
>  n_fe_opts++;
>  }
>  }
> @@ -1621,7 +1624,7 @@ bridge_configure_ipfix(struct bridge *br)
>  fe_opts = xcalloc(n_fe_opts, sizeof *fe_opts);
>  opts = fe_opts;
>  OVSREC_FLOW_SAMPLE_COLLECTOR_SET_FOR_EACH(fe_cfg, idl) {
> -if (ovsrec_fscs_is_valid(fe_cfg, br)) {
> +if (ovsrec_fscs_is_valid_ipfix(fe_cfg, br)) {
>  opts->collector_set_id = fe_cfg->id;
>  sset_init(>targets);
>  sset_add_array(>targets, fe_cfg->ipfix->targets,
> @@ -1667,6 +1670,71 @@ bridge_configure_ipfix(struct bridge *br)
>  }
>  }
>
> +/* Returns whether a Flow_Sample_Collector_Set row contains valid local
> + * sampling configuraiton. */
> +static bool
> +ovsrec_fscs_is_valid_local(const struct ovsrec_flow_sample_collector_set 
> *fscs,
> +   const struct bridge *br)
> +{
> +return fscs->local_sample_group && fscs->n_local_sample_group == 1 &&
> +   fscs->bridge == br->cfg;
> +}
> +
> +/* Set local sample configuration on 'br'. */
> +static void
> +bridge_configure_lsample(struct bridge *br)
> +{
> +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +const struct ovsrec_flow_sample_collector_set *fscs;
> +struct ofproto_lsample_options *opts_array, *opts;
> +size_t n_opts = 0;
> +int ret;
> +
> +/* Iterate the Flow_Sample_Collector_Set table twice.
> + * First to get the number of valid configuration entries, then to 
> process
> + * each of them and build an array of options. */
> +OVSREC_FLOW_SAMPLE_COLLECTOR_SET_FOR_EACH (fscs, idl) {
> +if (ovsrec_fscs_is_valid_local(fscs, br)) {
> +n_opts ++;
> +}
> +}
> +
> +if (n_opts == 0) {
> +

Re: [ovs-dev] [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action

2024-06-06 Thread Adrián Moreno
On Wed, Jun 05, 2024 at 08:51:17PM GMT, Simon Horman wrote:
> On Mon, Jun 03, 2024 at 08:56:39PM +0200, Adrian Moreno wrote:
> > Add support for a new action: emit_sample.
> >
> > This action accepts a u32 group id and a variable-length cookie and uses
> > the psample multicast group to make the packet available for
> > observability.
> >
> > The maximum length of the user-defined cookie is set to 16, same as
> > tc_cookie, to discourage using cookies that will not be offloadable.
> >
> > Signed-off-by: Adrian Moreno 
>
> Hi Adrian,
>
> Some minor nits from my side.
>
> ...
>
> > diff --git a/include/uapi/linux/openvswitch.h 
> > b/include/uapi/linux/openvswitch.h
> > index efc82c318fa2..a0e9dde0584a 100644
> > --- a/include/uapi/linux/openvswitch.h
> > +++ b/include/uapi/linux/openvswitch.h
> > @@ -914,6 +914,30 @@ struct check_pkt_len_arg {
> >  };
> >  #endif
> >
> > +#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
> > +/**
> > + * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
> > + * action.
> > + *
> > + * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
> > + * sample.
> > + * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that 
> > contains
> > + * user-defined metadata. The maximum length is 16 bytes.
> > + *
> > + * Sends the packet to the psample multicast group with the specified 
> > group and
> > + * cookie. It is possible to combine this action with the
> > + * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being 
> > emitted.
> > + */
> > +enum ovs_emit_sample_attr {
> > +   OVS_EMIT_SAMPLE_ATTR_UNPSEC,
> > +   OVS_EMIT_SAMPLE_ATTR_GROUP, /* u32 number. */
> > +   OVS_EMIT_SAMPLE_ATTR_COOKIE,/* Optional, user specified cookie. */
> > +   __OVS_EMIT_SAMPLE_ATTR_MAX
> > +};
> > +
> > +#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
> > +
> > +
>
> nit: One blank line is enough.
>

Ack.

>  Flagged by checkpatch.pl
>
> >  /**
> >   * enum ovs_action_attr - Action types.
> >   *
> > @@ -1004,6 +1028,7 @@ enum ovs_action_attr {
> > OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
> > OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
> > OVS_ACTION_ATTR_DROP, /* u32 error code. */
> > +   OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */
>
> nit: Please add OVS_ACTION_ATTR_EMIT_SAMPLE to the Kenrel doc
>  for this structure.
>

Thanks for spotting this. Will do.


> >
> > __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
> >* from userspace. */
>
> ...
>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] system-dpdk: Tolerate new warnings 23.11.1/24.03.

2024-06-06 Thread christian . ehrhardt
From: Christian Ehrhardt 

DPDK fixed counting of telemetry clients in 24.03 [1] which was also
backported to 23.11.1 [2]. Due to that the dpdk related openvswitch
tests now fail in the following cases:
  4: OVS-DPDK - ping vhost-user ports FAILED (system-dpdk.at:148)
  5: OVS-DPDK - ping vhost-user-client ports FAILED (system-dpdk.at:224)
  16: OVS-DPDK - MTU increase vport port FAILED (system-dpdk.at:609)
  17: OVS-DPDK - MTU decrease vport port FAILED (system-dpdk.at:651)
  20: OVS-DPDK - MTU upper bound vport port FAILED (system-dpdk.at:767)
  21: OVS-DPDK - MTU lower bound vport port FAILED (system-dpdk.at:809)

This is due to a new error being logged under these conditions:
  dpdk|ERR|TELEMETRY: Socket write base info to client failed

This is okay to happen in the cases we set up in the DPDK related tests
for OVS and can be ignored, therefore it is added to the tolerated
messages in the m4 definition of OVS_DPDK_STOP_VSWITCHD.

[1]: https://github.com/DPDK/dpdk/commit/e14bb5f1
[2]: https://github.com/DPDK/dpdk-stable/commit/cbd1c165

Signed-off-by: Christian Ehrhardt 
---
Changes since v1:
- adapt to grammar rules for the subject
---
 tests/system-dpdk-macros.at | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 7cf9bac17..64e5ad522 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -82,6 +82,7 @@ $1";/does not exist. The Open vSwitch kernel module is 
probably not loaded./d
 /Failed to enable flow control/d
 /ice_vsi_config_outer_vlan_stripping(): Single VLAN mode (SVM) does not 
support qinq/d
 /Rx checksum offload is not supported on/d
+/TELEMETRY: Socket write base info to client failed/d
 /TELEMETRY: No legacy callbacks, legacy socket not created/d"])
 ])
 
-- 
2.43.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev