Re: [ovs-dev] [PATCH] ofproto: Add upcall/dump-ufid-rules command to map UFIDs to OpenFlow.

2024-06-03 Thread Ales Musil
On Tue, Jun 4, 2024 at 12:05 AM Dumitru Ceara  wrote:

> It improves the debugging experience if we can easily get a list of
> OpenFlow rules and groups that contribute to the creation of a datapath
> flow.
>
> The suggested workflow is:
> a. dump datapath flows (along with UUIDs), this also prints the core IDs
> (PMD IDs) when applicable.
>
>   $ ovs-appctl dpctl/dump-flows -m
>   flow-dump from pmd on cpu core: 7
>   ufid:7460db8f..., recirc_id(0), 
>
> b. dump related OpenFlow rules and groups:
>   $ ovs-appctl upcall/dump-ufid-rules ufid:7460db8f... pmd=7
>   cookie=0x12345678, table=0
> priority=100,ip,in_port=2,nw_dst=10.0.0.2,actions=resubmit(,1)
>   cookie=0x0, table=1 priority=200,actions=group:1
>
> group_id=1,bucket=bucket_id:0,actions=ct(commit,table=2,nat(dst=20.0.0.2))
>   cookie=0x0, table=2 actions=output:1
>
> The new command only shows rules and groups attached to ukeys that are
> in states UKEY_VISIBLE or UKEY_OPERATIONAL.  That should be fine as all
> other ukeys should not be relevant for the use case presented above.
>
> For ukeys that don't have an xcache populated yet, the command goes
> ahead and populates one.  In theory this is creates a slight overhead as
> those ukeys might not need an xcache until they see traffic (and get
> revalidated) but in practice the overhead should be minimal.
>
> This commit tries to mimic the output format of the ovs-ofctl
> dump-flows/dump-groups commands.  For groups it actually uses
> ofputil_group/_bucket functions for formatting.  For rules it's not that
> straightforward or clean as 'struct rule' uses its own versions of
> fields to describe the match, actions, and other relevant fields and
> 'ovs-ofctl dump-flows' operates on the equivalent 'struct
> ofputil_flow_stats'.
>
> Signed-off-by: Dumitru Ceara 
> ---
>  NEWS|   3 +
>  include/openvswitch/ofp-group.h |   7 ++
>  lib/ofp-group.c | 110 +++-
>  ofproto/ofproto-dpif-upcall.c   |  87 +
>  ofproto/ofproto-dpif.c  |  33 ++
>  ofproto/ofproto-dpif.h  |   4 ++
>  tests/ofproto-dpif.at   |  39 +++
>  tests/ofproto-macros.at |   4 ++
>  8 files changed, 243 insertions(+), 44 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 5ae0108d552b..1bc085f97045 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -9,6 +9,9 @@ Post-v3.3.0
>   https://github.com/openvswitch/ovs.git
> - DPDK:
>   * OVS validated with DPDK 23.11.1.
> +   - ovs-appctl:
> + * Added 'upcall/dump-ufid-rules' to output the set of OpenFlow rules
> and
> +   groups that contributed to the creation of a specific datapath
> flow.
>
>
>  v3.3.0 - 16 Feb 2024
> diff --git a/include/openvswitch/ofp-group.h
> b/include/openvswitch/ofp-group.h
> index cd7af0ebff9c..79fcb3a4c0d1 100644
> --- a/include/openvswitch/ofp-group.h
> +++ b/include/openvswitch/ofp-group.h
> @@ -70,6 +70,11 @@ struct ofputil_bucket *ofputil_bucket_find(const struct
> ovs_list *,
>  bool ofputil_bucket_check_duplicate_id(const struct ovs_list *);
>  struct ofputil_bucket *ofputil_bucket_list_front(const struct ovs_list *);
>  struct ofputil_bucket *ofputil_bucket_list_back(const struct ovs_list *);
> +void ofputil_bucket_format(const struct ofputil_bucket *,
> +   enum ofp11_group_type, enum ofp_version,
> +   const struct ofputil_port_map *,
> +   const struct ofputil_table_map *,
> +   struct ds *);
>
>  static inline bool
>  ofputil_bucket_has_liveness(const struct ofputil_bucket *bucket)
> @@ -88,6 +93,8 @@ struct ofputil_group_props {
>  void ofputil_group_properties_destroy(struct ofputil_group_props *);
>  void ofputil_group_properties_copy(struct ofputil_group_props *to,
> const struct ofputil_group_props
> *from);
> +void ofputil_group_properties_format(const struct ofputil_group_props *,
> + struct ds *);
>  /* Protocol-independent group_mod. */
>  struct ofputil_group_mod {
>  uint16_t command; /* One of OFPGC15_*. */
> diff --git a/lib/ofp-group.c b/lib/ofp-group.c
> index 737f48047b10..8f7614c22f0e 100644
> --- a/lib/ofp-group.c
> +++ b/lib/ofp-group.c
> @@ -58,14 +58,16 @@ ofputil_group_from_string(const char *s, uint32_t
> *group_idp)
>  return true;
>  }
>
> -/* Appends to 's' a string representation of the OpenFlow group ID
> 'group_id'.
> - * Most groups' string representation is just the number, but for special
> - * groups, e.g. OFPG_ALL, it is the name, e.g. "ALL". */
> +/* Appends to 's' a string representation of the OpenFlow group.
> 'group_id'.
> + * Most groups' string representation is just 'group_id=' followed by the
> ID,
> + * but for special groups, e.g. OFPG_ALL, the ID is replaced by the name,
> + * e.g. "ALL". */
>  void
>  ofputil_format_group(uint32_t group_id, struct ds *s)
>  {
>  char name[M

Re: [ovs-dev] [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.

2024-06-03 Thread kernel test robot
Hi Adrian,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:
https://github.com/intel-lab-lkp/linux/commits/Adrian-Moreno/net-psample-add-user-cookie/20240604-030055
base:   net-next/main
patch link:
https://lore.kernel.org/r/20240603185647.2310748-7-amorenoz%40redhat.com
patch subject: [PATCH net-next v2 6/9] net: openvswitch: store sampling 
probability in cb.
config: m68k-allmodconfig 
(https://download.01.org/0day-ci/archive/20240604/202406041339.ytdrh41v-...@intel.com/config)
compiler: m68k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20240604/202406041339.ytdrh41v-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202406041339.ytdrh41v-...@intel.com/

All errors (new ones prefixed by >>, old ones prefixed by <<):

WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-zydacron.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-viewsonic.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-waltop.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-winwing.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/of/of_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/fbtft/fbtft.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-bootrom.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-spilib.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-hid.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-light.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-log.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-loopback.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-power-supply.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-raw.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-vibrator.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-audio-manager.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-gbphy.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-gpio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-i2c.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-pwm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-sdio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-spi.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-uart.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/staging/greybus/gb-usb.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/platform/goldfish/goldfish_pipe.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/platform/chrome/cros_kunit_proto_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/mailbox/mtk-cmdq-mailbox.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/devfreq/governor_simpleondemand.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/devfreq/governor_performance.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/devfreq/governor_powersave.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/devfreq/governor_userspace.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/hwtracing/intel_th/intel_th_msu_sink.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/nvmem/nvmem-apple-efuses.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/nvmem/nvmem_brcm_nvram.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/nvmem/nvmem_u-boot-env.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/interconnect/imx/imx-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/interconnect/imx/imx8mm-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/interconnect/imx/imx8mq-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/interconnect/imx/imx8mn-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/interconnect/imx/imx8mp-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/hte/hte-tegra194-test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/vdpa/vdpa.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/parport/parport.o
WARNING: modpost: drivers/parport/parport_amiga: section mismatch in reference: 
amiga_parallel_driver+0x8 (section: .data) -> amiga_parallel_remove (section: 
.exit.text)
WARNING: modpost: missing MODULE_DESCRIPTION() in 
drivers/mtd/parsers/brcm_u-

Re: [ovs-dev] [PATCH ovn v4] Text respresntations for drop sampling.

2024-06-03 Thread Ales Musil
On Mon, Jun 3, 2024 at 9:47 PM Jacob Tanenbaum  wrote:

> Created a new column in the southbound database to hardcode a human
> readable
> description for flows. This first use is describing why the flow is
> dropping packets.
> The new column is called flow_desc and will create southbound database
> entries like this
>
> _uuid   : 20f1897b-477e-47ae-a32c-c546d83ec097
> actions :
> "sample(probability=65535,collector_set=123,obs_domain=1,obs_point=$cookie);
> /* drop */"
> controller_meter: []
> external_ids: {source="northd.c:8721", stage-name=ls_in_l2_unknown}
> flow_desc   : "No L2 destination"
> logical_datapath: []
> logical_dp_group: ee3c3db5-98a2-4f34-8a84-409deae140a7
> match   : "outport == \"none\""
> pipeline: ingress
> priority: 50
> table_id: 27
> tags: {}
> hash: 0
>
> future work includes entering more flow_desc for more flows and adding
> flow_desc to the actions as a comment.
>
> Signed-off-by: Jacob Tanenbaum 
> Suggested-by: Dumitru Ceara 
> Reported-at: https://issues.redhat.com/browse/FDP-307
>
> ---
>
> v1: initial version
> v2: correct commit message
> make the flow_desc a char*
> correct a typo in the ovn-sb.xml
> correct the test
> v3: rebase issue with NEWS file
> v4: remove options:debug_drop_domain_id="1" from testing as we
> do not depend on it
>
> merge
>
> diff --git a/NEWS b/NEWS
> index 81c958f9a..04c441ada 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -21,6 +21,8 @@ Post v24.03.0
>  MAC addresses configured on the LSP with "unknown", are learnt via the
>  OVN native FDB.
>- Add support for ovsdb-server `--config-file` option in ovn-ctl.
> +  - Added a new column in the southbound database "flow_desc" to provide
> +human readable context to flows.
>
>  OVN v24.03.0 - 01 Mar 2024
>  --
> diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c
> index b2c60b5de..e27558a32 100644
> --- a/northd/lflow-mgr.c
> +++ b/northd/lflow-mgr.c
> @@ -36,7 +36,7 @@ static void ovn_lflow_init(struct ovn_lflow *, struct
> ovn_datapath *od,
> uint16_t priority, char *match,
> char *actions, char *io_port,
> char *ctrl_meter, char *stage_hint,
> -   const char *where);
> +   const char *where, char *flow_desc);
>  static struct ovn_lflow *ovn_lflow_find(const struct hmap *lflows,
>  enum ovn_stage stage,
>  uint16_t priority, const char
> *match,
> @@ -52,7 +52,7 @@ static struct ovn_lflow *do_ovn_lflow_add(
>  const char *actions, const char *io_port,
>  const char *ctrl_meter,
>  const struct ovsdb_idl_row *stage_hint,
> -const char *where);
> +const char *where, const char *flow_desc);
>
>
>  static struct ovs_mutex *lflow_hash_lock(const struct hmap *lflow_table,
> @@ -173,6 +173,7 @@ struct ovn_lflow {
>* 'dpg_bitmap'. */
>  struct ovn_dp_group *dpg;/* Link to unique Sb datapath group. */
>  const char *where;
> +char *flow_desc;
>
>  struct uuid sb_uuid; /* SB DB row uuid, specified by northd.
> */
>  struct ovs_list referenced_by;  /* List of struct lflow_ref_node. */
> @@ -659,7 +660,7 @@ lflow_table_add_lflow(struct lflow_table *lflow_table,
>const char *match, const char *actions,
>const char *io_port, const char *ctrl_meter,
>const struct ovsdb_idl_row *stage_hint,
> -  const char *where,
> +  const char *where, const char *flow_desc,
>struct lflow_ref *lflow_ref)
>  OVS_EXCLUDED(fake_hash_mutex)
>  {
> @@ -679,7 +680,7 @@ lflow_table_add_lflow(struct lflow_table *lflow_table,
>  do_ovn_lflow_add(lflow_table,
>   od ? ods_size(od->datapaths) : dp_bitmap_len,
>   hash, stage, priority, match, actions,
> - io_port, ctrl_meter, stage_hint, where);
> + io_port, ctrl_meter, stage_hint, where,
> flow_desc);
>
>  if (lflow_ref) {
>  struct lflow_ref_node *lrn =
> @@ -733,7 +734,7 @@ lflow_table_add_lflow_default_drop(struct lflow_table
> *lflow_table,
>  {
>  lflow_table_add_lflow(lflow_table, od, NULL, 0, stage, 0, "1",
>debug_drop_action(), NULL, NULL, NULL,
> -  where, lflow_ref);
> +  where, NULL, lflow_ref);
>  }
>
>  struct ovn_dp_group *
> @@ -856,7 +857,7 @@ static void
>  ovn_lflow_init(struct ovn_lflow *lflow, struct ovn_datapath *od,
> size_t dp_bitmap_len, enum ovn_stage stage, uint16_t
> priority,
> char *match, char *actions, char *io_port, char
>

Re: [ovs-dev] [PATCH net-next 1/2] selftests: openvswitch: fix action formatting

2024-06-03 Thread Jakub Kicinski
On Mon, 03 Jun 2024 15:00:03 -0400 Aaron Conole wrote:
> I agree, this is an issue.  BUT I think it might be better to just
> filter by field type up front.  See:
> 
> https://github.com/apconole/linux-next-work/commit/7262107de7170d44b6dbf6c5ea6f7e6c0bb71d36#diff-3e72e7405c6bb4e9842bed5f63883ca930387086bb40d4034e92ed83a5decb4bR441
> 
> That version I think ends up being much easier to follow.  If you want
> to take it for your series, feel free.  If you disagree, maybe there's
> something I'm not considering about it.
> 
> NOTE that version is just a bunch of independent changes that are
> squashed together.  I have a cleaner version.
> 
> I can also bundle up the series I have so far and submit, but I didn't
> want to do that until I got all the pmtu.sh support working.  Maybe it
> makes sense to send it now though.  Simon, Jakub - wdyt?

I'd say - hold onto the changes until pmtu.sh works, unless there's
*any* reason for a particular patch to go in early, eg:
 - patch fixes existing bug
 - someone else needs a patch
 - ...
 - a patch which falls under some of the criteria above depends 
   on the patch
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 2/2] python: ovsdb-idl: Use monitor_cond for _Server DB.

2024-06-03 Thread Ilya Maximets
On 5/6/24 18:58, Terry Wilson wrote:
> Unlike the C IDL code, the Python version still monitors the
> _Server DB with "monitor" instead of "monitor_cond". This results
> in receiving an entire Database row every time the "index" value
> is updated which includes the 'schema' column. Using "monitor_cond"
> will result in "update2" notifications which just include the
> changed "index" value.
> 
> Unlike the C IDL, the Python IDL requires a SchemaHelper object
> to instanitate the IDL, leaving it to the user of the library to
> call "get_schema" themselves. Since the Python IDL did not have
> support for retrieving the schema automatically and did not have
> a state for doing so, instead of transitioning on an error response
> from retrieving the _Server schema to requesting the "data" schema,
> this moves directly to monitoring the "data" DB.
> 
> Signed-off-by: Terry Wilson 
> ---
>  python/ovs/db/idl.py | 28 ++--
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 

Thanks, Terry!  Sorry for a long delay.

I'm considering this as a bug fix, because the use of a plain monitor
causes a significant traffic amplification with every small transactions
and can potentially create serious issues in large scale clusters.

Applied to main.  Backporting to 2.17 is probably too much, so only ported
to 3.3 for now to avoid this issue on a future LTS branch.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] ofproto: Add upcall/dump-ufid-rules command to map UFIDs to OpenFlow.

2024-06-03 Thread Dumitru Ceara
It improves the debugging experience if we can easily get a list of
OpenFlow rules and groups that contribute to the creation of a datapath
flow.

The suggested workflow is:
a. dump datapath flows (along with UUIDs), this also prints the core IDs
(PMD IDs) when applicable.

  $ ovs-appctl dpctl/dump-flows -m
  flow-dump from pmd on cpu core: 7
  ufid:7460db8f..., recirc_id(0), 

b. dump related OpenFlow rules and groups:
  $ ovs-appctl upcall/dump-ufid-rules ufid:7460db8f... pmd=7
  cookie=0x12345678, table=0 
priority=100,ip,in_port=2,nw_dst=10.0.0.2,actions=resubmit(,1)
  cookie=0x0, table=1 priority=200,actions=group:1
  group_id=1,bucket=bucket_id:0,actions=ct(commit,table=2,nat(dst=20.0.0.2))
  cookie=0x0, table=2 actions=output:1

The new command only shows rules and groups attached to ukeys that are
in states UKEY_VISIBLE or UKEY_OPERATIONAL.  That should be fine as all
other ukeys should not be relevant for the use case presented above.

For ukeys that don't have an xcache populated yet, the command goes
ahead and populates one.  In theory this is creates a slight overhead as
those ukeys might not need an xcache until they see traffic (and get
revalidated) but in practice the overhead should be minimal.

This commit tries to mimic the output format of the ovs-ofctl
dump-flows/dump-groups commands.  For groups it actually uses
ofputil_group/_bucket functions for formatting.  For rules it's not that
straightforward or clean as 'struct rule' uses its own versions of
fields to describe the match, actions, and other relevant fields and
'ovs-ofctl dump-flows' operates on the equivalent 'struct
ofputil_flow_stats'.

Signed-off-by: Dumitru Ceara 
---
 NEWS|   3 +
 include/openvswitch/ofp-group.h |   7 ++
 lib/ofp-group.c | 110 +++-
 ofproto/ofproto-dpif-upcall.c   |  87 +
 ofproto/ofproto-dpif.c  |  33 ++
 ofproto/ofproto-dpif.h  |   4 ++
 tests/ofproto-dpif.at   |  39 +++
 tests/ofproto-macros.at |   4 ++
 8 files changed, 243 insertions(+), 44 deletions(-)

diff --git a/NEWS b/NEWS
index 5ae0108d552b..1bc085f97045 100644
--- a/NEWS
+++ b/NEWS
@@ -9,6 +9,9 @@ Post-v3.3.0
  https://github.com/openvswitch/ovs.git
- DPDK:
  * OVS validated with DPDK 23.11.1.
+   - ovs-appctl:
+ * Added 'upcall/dump-ufid-rules' to output the set of OpenFlow rules and
+   groups that contributed to the creation of a specific datapath flow.
 
 
 v3.3.0 - 16 Feb 2024
diff --git a/include/openvswitch/ofp-group.h b/include/openvswitch/ofp-group.h
index cd7af0ebff9c..79fcb3a4c0d1 100644
--- a/include/openvswitch/ofp-group.h
+++ b/include/openvswitch/ofp-group.h
@@ -70,6 +70,11 @@ struct ofputil_bucket *ofputil_bucket_find(const struct 
ovs_list *,
 bool ofputil_bucket_check_duplicate_id(const struct ovs_list *);
 struct ofputil_bucket *ofputil_bucket_list_front(const struct ovs_list *);
 struct ofputil_bucket *ofputil_bucket_list_back(const struct ovs_list *);
+void ofputil_bucket_format(const struct ofputil_bucket *,
+   enum ofp11_group_type, enum ofp_version,
+   const struct ofputil_port_map *,
+   const struct ofputil_table_map *,
+   struct ds *);
 
 static inline bool
 ofputil_bucket_has_liveness(const struct ofputil_bucket *bucket)
@@ -88,6 +93,8 @@ struct ofputil_group_props {
 void ofputil_group_properties_destroy(struct ofputil_group_props *);
 void ofputil_group_properties_copy(struct ofputil_group_props *to,
const struct ofputil_group_props *from);
+void ofputil_group_properties_format(const struct ofputil_group_props *,
+ struct ds *);
 /* Protocol-independent group_mod. */
 struct ofputil_group_mod {
 uint16_t command; /* One of OFPGC15_*. */
diff --git a/lib/ofp-group.c b/lib/ofp-group.c
index 737f48047b10..8f7614c22f0e 100644
--- a/lib/ofp-group.c
+++ b/lib/ofp-group.c
@@ -58,14 +58,16 @@ ofputil_group_from_string(const char *s, uint32_t 
*group_idp)
 return true;
 }
 
-/* Appends to 's' a string representation of the OpenFlow group ID 'group_id'.
- * Most groups' string representation is just the number, but for special
- * groups, e.g. OFPG_ALL, it is the name, e.g. "ALL". */
+/* Appends to 's' a string representation of the OpenFlow group. 'group_id'.
+ * Most groups' string representation is just 'group_id=' followed by the ID,
+ * but for special groups, e.g. OFPG_ALL, the ID is replaced by the name,
+ * e.g. "ALL". */
 void
 ofputil_format_group(uint32_t group_id, struct ds *s)
 {
 char name[MAX_GROUP_NAME_LEN + 1];
 
+ds_put_cstr(s, "group_id=");
 ofputil_group_to_string(group_id, name, sizeof name);
 ds_put_cstr(s, name);
 }
@@ -297,7 +299,7 @@ ofputil_group_desc_request_format(struct ds *string,
const stru

Re: [ovs-dev] [PATCH ovn v2] ovn-controller: Initialize bitmap to zero.

2024-06-03 Thread Mark Michelson

Like Ilya, I'm surprised that ASAN did not catch this.

Having said that,
Acked-by: Mark Michelson 

On 5/10/24 06:57, Ales Musil wrote:

The bitmap used in the update_ct_zones was uninitialized, and it could
contain any value besides 0. Use the bitmap_allocate() function instead,
to allocate the bitmap in heap rather than stack, the allocate makes sure
that the memory is properly zeroed.
This was caught by valgrind:

Conditional jump or move depends on uninitialised value(s)
at 0x44074B: update_ct_zones (ovn-controller.c:812)
by 0x440DC9: en_ct_zones_run (ovn-controller.c:2579)
by 0x468BB7: engine_recompute (inc-proc-eng.c:415)
by 0x46954C: engine_compute (inc-proc-eng.c:454)
by 0x46954C: engine_run_node (inc-proc-eng.c:503)
by 0x46954C: engine_run (inc-proc-eng.c:528)
by 0x40AE9D: main (ovn-controller.c:5776)
Uninitialised value was created by a stack allocation
at 0x440313: update_ct_zones (ovn-controller.c:747)

Fixes: f9cab11d5fab ("Allow explicit setting of the SNAT zone on a gateway 
router.")
Signed-off-by: Ales Musil 
---
v2: Use bitmap_allocate() instead of array on stack.
---
  controller/ovn-controller.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 453dc62fd..8ee2da2fd 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -732,7 +732,7 @@ update_ct_zones(const struct sset *local_lports,
  const char *user;
  struct sset all_users = SSET_INITIALIZER(&all_users);
  struct simap req_snat_zones = SIMAP_INITIALIZER(&req_snat_zones);
-unsigned long unreq_snat_zones_map[BITMAP_N_LONGS(MAX_CT_ZONES)];
+unsigned long *unreq_snat_zones_map = bitmap_allocate(MAX_CT_ZONES);
  struct simap unreq_snat_zones = SIMAP_INITIALIZER(&unreq_snat_zones);
  
  const char *local_lport;

@@ -843,6 +843,7 @@ update_ct_zones(const struct sset *local_lports,
  simap_destroy(&req_snat_zones);
  simap_destroy(&unreq_snat_zones);
  sset_destroy(&all_users);
+free(unreq_snat_zones_map);
  }
  
  static void


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v5 2/2] northd: Add support for disabling vxlan mode.

2024-06-03 Thread Mark Michelson

Hi Vladislav,

I have just one comment below.

On 5/3/24 04:13, Vladislav Odintsov wrote:

Commit [1] introduced a "VXLAN mode" concept.  It brought a limitation
for available tunnel IDs because of lack of space in VXLAN VNI.
In VXLAN mode OVN is limited by 4095 datapaths (LRs or non-transit LSs)
and 2047 logical ports per datapath.

Prior to this patch VXLAN mode was enabled automatically if at least one
chassis had encap of VXLAN type.  In scenarios where one want to use
VXLAN only for HW VTEP (RAMP) switch, such limitation makes no sence.

This patch adds support for explicit disabling of VXLAN mode via
Northbound database.

1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068

Acked-By: Ihar Hrachyshka 
Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Vladislav Odintsov 
---
  NEWS  |  4 
  northd/en-global-config.c |  7 ++-
  northd/northd.c   | 10 --
  northd/northd.h   |  3 ++-
  ovn-architecture.7.xml|  6 ++
  ovn-nb.xml| 10 ++
  tests/ovn-northd.at   | 29 +
  7 files changed, 65 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 3b5e93dc9..007b27f3d 100644
--- a/NEWS
+++ b/NEWS
@@ -17,6 +17,10 @@ Post v24.03.0
  external-ids, the option is no longer needed as it became effectively
  "true" for all scenarios.
- Added DHCPv4 relay support.
+  - Added new global config option NB_Global:options:disable_vxlan_mode to
+extend available tunnel IDs space for datapaths from 4095 to 16711680
+when running in "VXLAN mode".  For more details see man ovn-nb(5) for
+mentioned option.
  
  OVN v24.03.0 - 01 Mar 2024

  --
diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 873649a89..f5e2a8154 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -115,7 +115,7 @@ en_global_config_run(struct engine_node *node , void *data)
   config_data->svc_monitor_mac);
  }
  
-init_vxlan_mode(sbrec_chassis_table);

+init_vxlan_mode(&nb->options, sbrec_chassis_table);
  char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
  smap_replace(options, "max_tunid", max_tunid);
  free(max_tunid);
@@ -533,6 +533,11 @@ check_nb_options_out_of_sync(const struct nbrec_nb_global 
*nb,
  return true;
  }
  
+if (config_out_of_sync(&nb->options, &config_data->nb_options,

+   "disable_vxlan_mode", false)) {
+return true;
+}
+
  return false;
  }
  
diff --git a/northd/northd.c b/northd/northd.c

index 0e0ae24db..7bdffe531 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -886,8 +886,14 @@ join_datapaths(const struct nbrec_logical_switch_table 
*nbrec_ls_table,
  }
  
  void

-init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+init_vxlan_mode(const struct smap *nb_options,
+const struct sbrec_chassis_table *sbrec_chassis_table)
  {
+if (smap_get_bool(nb_options, "disable_vxlan_mode", false)) {
+vxlan_mode = false;
+return;
+}
+
  const struct sbrec_chassis *chassis;
  SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
  for (int i = 0; i < chassis->n_encaps; i++) {
@@ -17596,7 +17602,7 @@ ovnnb_db_run(struct northd_input *input_data,
  use_common_zone = smap_get_bool(input_data->nb_options, "use_common_zone",
  false);
  
-init_vxlan_mode(input_data->sbrec_chassis_table);

+init_vxlan_mode(input_data->nb_options, input_data->sbrec_chassis_table);
  
  build_datapaths(ovnsb_txn,

  input_data->nbrec_logical_switch_table,
diff --git a/northd/northd.h b/northd/northd.h
index be480003e..d0322e621 100644
--- a/northd/northd.h
+++ b/northd/northd.h
@@ -792,7 +792,8 @@ lr_has_multiple_gw_ports(const struct ovn_datapath *od)
  }
  
  void

-init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table);
+init_vxlan_mode(const struct smap *nb_options,
+const struct sbrec_chassis_table *sbrec_chassis_table);
  
  uint32_t get_ovn_max_dp_key_local(void);
  
diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml

index 3ecb58933..f4eae340c 100644
--- a/ovn-architecture.7.xml
+++ b/ovn-architecture.7.xml
@@ -2920,4 +2920,10 @@
  the future, gateways that do not support encapsulations with large amounts
  of metadata may continue to have a reduced feature set.

+  
+VXLAN mode is recommended to be disabled if VXLAN encap at
+hypervisors is needed only to support HW VTEP L2 Gateway functionality.
+See man ovn-nb(5) for table NB_Global column
+options key disable_vxlan_mode for more details.
+  
  
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 5cb6ba640..84f1e07b6 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -381,6 +381,16 @@
  of SB changes would be very noticeable.


Re: [ovs-dev] [PATCH ovn v5 1/2] northd: Make `vxlan_mode` a global variable.

2024-06-03 Thread Mark Michelson

Hi Vladislav,

Generally speaking, I agree with this change. However, I think that 
setting a global variable from an incremental processing engine node 
runner feels wrong.


I think that instead, the "vxlan_mode" variable you have introduced 
should be a field on struct ed_type_global_config. This way, the engine 
node is only modifying data local to itself.


On 5/3/24 04:13, Vladislav Odintsov wrote:

This simplifies code and subsequent commit to explicitely disable VXLAN
mode is based on these changes.

Also "VXLAN mode" term is introduced in ovn-architecture man page.

Signed-off-by: Vladislav Odintsov 
---
  northd/en-global-config.c |  4 +-
  northd/northd.c   | 85 +--
  northd/northd.h   |  5 ++-
  ovn-architecture.7.xml| 10 ++---
  4 files changed, 47 insertions(+), 57 deletions(-)

diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 28c78a12c..873649a89 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void *data)
   config_data->svc_monitor_mac);
  }
  
-char *max_tunid = xasprintf("%d",

-get_ovn_max_dp_key_local(sbrec_chassis_table));
+init_vxlan_mode(sbrec_chassis_table);
+char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
  smap_replace(options, "max_tunid", max_tunid);
  free(max_tunid);
  
diff --git a/northd/northd.c b/northd/northd.c

index 133cddb69..0e0ae24db 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
   */
  static bool default_acl_drop;
  
+/* If this option is 'true' northd will use limited 24-bit space for datapath

+ * and ports tunnel key allocation (12 bits for each instead of default 16). */
+static bool vxlan_mode;
+
  #define MAX_OVN_TAGS 4096
  
  
@@ -881,40 +885,40 @@ join_datapaths(const struct nbrec_logical_switch_table *nbrec_ls_table,

  }
  }
  
-static bool

-is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+void
+init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
  {
  const struct sbrec_chassis *chassis;
  SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
  for (int i = 0; i < chassis->n_encaps; i++) {
  if (!strcmp(chassis->encaps[i]->type, "vxlan")) {
-return true;
+vxlan_mode = true;
+return;
  }
  }
  }
-return false;
+vxlan_mode = false;
  }
  
  uint32_t

-get_ovn_max_dp_key_local(const struct sbrec_chassis_table *sbrec_chassis_table)
+get_ovn_max_dp_key_local(void)
  {
-if (is_vxlan_mode(sbrec_chassis_table)) {
-/* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
+if (vxlan_mode) {
+/* OVN_MAX_DP_GLOBAL_NUM doesn't apply for VXLAN mode. */
  return OVN_MAX_DP_VXLAN_KEY;
  }
  return OVN_MAX_DP_KEY - OVN_MAX_DP_GLOBAL_NUM;
  }
  
  static void

-ovn_datapath_allocate_key(const struct sbrec_chassis_table *sbrec_ch_table,
-  struct hmap *datapaths, struct hmap *dp_tnlids,
+ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
struct ovn_datapath *od, uint32_t *hint)
  {
  if (!od->tunnel_key) {
  od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "datapath",
-OVN_MIN_DP_KEY_LOCAL,
-get_ovn_max_dp_key_local(sbrec_ch_table),
-hint);
+OVN_MIN_DP_KEY_LOCAL,
+get_ovn_max_dp_key_local(),
+hint);
  if (!od->tunnel_key) {
  if (od->sb) {
  sbrec_datapath_binding_delete(od->sb);
@@ -927,7 +931,6 @@ ovn_datapath_allocate_key(const struct sbrec_chassis_table 
*sbrec_ch_table,
  
  static void

  ovn_datapath_assign_requested_tnl_id(
-const struct sbrec_chassis_table *sbrec_chassis_table,
  struct hmap *dp_tnlids, struct ovn_datapath *od)
  {
  const struct smap *other_config = (od->nbs
@@ -936,8 +939,7 @@ ovn_datapath_assign_requested_tnl_id(
  uint32_t tunnel_key = smap_get_int(other_config, "requested-tnl-key", 0);
  if (tunnel_key) {
  const char *interconn_ts = smap_get(other_config, "interconn-ts");
-if (!interconn_ts && is_vxlan_mode(sbrec_chassis_table) &&
-tunnel_key >= 1 << 12) {
+if (!interconn_ts && vxlan_mode && tunnel_key >= 1 << 12) {
  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
  VLOG_WARN_RL(&rl, "Tunnel key %"PRIu32" for datapath %s is "
   "incompatible with VXLAN", tunnel_key,
@@ -985,7 +987,6 @@ build_datapaths(struct ovsdb_idl_txn *ovnsb_txn,
  const struct nbrec_logi

Re: [ovs-dev] [PATCH ovn v5] controller: Allow br-int connection via other methods.

2024-06-03 Thread Mark Michelson

Thanks for keeping this fresh, Ales.

Acked-by: Mark Michelson 

On 5/2/24 12:44, Ales Musil wrote:

The br-int connection is hardcoded to use unix socket, which requires
for the socket to be visible for ovn-controller. This is achievable in
container by mounting the socket, but in turn the container requires
additional privileges.

Add option to vswitchd external-ids that allows to specify remote
target for management bridge. This gives the user possibility to
connect to management bridge in different manner than unix socket,
defaulting to the unix socket when not specified. In addition, there
is an option to specify inactivity probe for this connection, disabled
by default.

Reported-at: https://issues.redhat.com/browse/FDP-243
Signed-off-by: Ales Musil 
---
v4: Rebase on top of current main.
v3: Rebase on top of current main.
 Fix the copy-paste error in ovn-controller documentation.
v2: Rebase on top of current main.
 Make the probe interval accept milliseconds to be aligned with other probe 
intervals.
 Use external-ids instead of options for the ovn-controller.
---
  NEWS|  6 +++
  controller/ofctrl.c | 10 +
  controller/ofctrl.h |  5 ++-
  controller/ovn-controller.8.xml | 15 
  controller/ovn-controller.c | 59 +++--
  controller/pinctrl.c| 56 ++--
  controller/pinctrl.h|  6 ++-
  controller/statctrl.c   | 66 ++---
  controller/statctrl.h   |  3 +-
  include/ovn/features.h  |  2 +-
  lib/features.c  | 35 +
  lib/ovn-util.c  | 26 +
  lib/ovn-util.h  |  4 ++
  lib/test-ovn-features.c |  6 +--
  tests/ovn-controller.at | 45 ++
  15 files changed, 193 insertions(+), 151 deletions(-)

diff --git a/NEWS b/NEWS
index 3b5e93dc9..4e15f31c8 100644
--- a/NEWS
+++ b/NEWS
@@ -17,6 +17,12 @@ Post v24.03.0
  external-ids, the option is no longer needed as it became effectively
  "true" for all scenarios.
- Added DHCPv4 relay support.
+  - Add "ovn-bridge-remote" config option to vswitchd external-ids,
+that allows to specify connection method to management bridge for
+ovn-controller, defaulting to the unix socket.
+  - Add "ovn-bridge-remote-probe-interval" config option to vswitchd
+external-ids, that sets probe interval for integration bridge connection,
+disabled by default.
  
  OVN v24.03.0 - 01 Mar 2024

  --
diff --git a/controller/ofctrl.c b/controller/ofctrl.c
index 6a2564604..9d181a782 100644
--- a/controller/ofctrl.c
+++ b/controller/ofctrl.c
@@ -771,19 +771,13 @@ ofctrl_get_mf_field_id(void)
   * Returns 'true' if an OpenFlow reconnect happened; 'false' otherwise.
   */
  bool
-ofctrl_run(const struct ovsrec_bridge *br_int,
+ofctrl_run(const char *conn_target, int probe_interval,
 const struct ovsrec_open_vswitch_table *ovs_table,
 struct shash *pending_ct_zones)
  {
-char *target = xasprintf("unix:%s/%s.mgmt", ovs_rundir(), br_int->name);
  bool reconnected = false;
  
-if (strcmp(target, rconn_get_target(swconn))) {

-VLOG_INFO("%s: connecting to switch", target);
-rconn_connect(swconn, target, target);
-}
-free(target);
-
+ovn_update_swconn_at(swconn, conn_target, probe_interval, "ofctrl");
  rconn_run(swconn);
  
  if (!rconn_is_connected(swconn) || !pending_ct_zones) {

diff --git a/controller/ofctrl.h b/controller/ofctrl.h
index 502c73da6..7df0a24ea 100644
--- a/controller/ofctrl.h
+++ b/controller/ofctrl.h
@@ -50,8 +50,9 @@ struct ovn_desired_flow_table {
  /* Interface for OVN main loop. */
  void ofctrl_init(struct ovn_extend_table *group_table,
   struct ovn_extend_table *meter_table);
-bool ofctrl_run(const struct ovsrec_bridge *br_int,
-const struct ovsrec_open_vswitch_table *,
+
+bool ofctrl_run(const char *conn_target, int probe_interval,
+const struct ovsrec_open_vswitch_table *ovs_table,
  struct shash *pending_ct_zones);
  enum mf_field_id ofctrl_get_mf_field_id(void);
  void ofctrl_put(struct ovn_desired_flow_table *lflow_table,
diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml
index 85e7966d7..b6404a19d 100644
--- a/controller/ovn-controller.8.xml
+++ b/controller/ovn-controller.8.xml
@@ -378,6 +378,21 @@
  cap for the exponential backoff used by ovn-controller
  to send GARPs packets.

+  external_ids:ovn-bridge-remote
+  
+
+  Connection to the OVN management bridge in OvS. It defaults to
+  unix:br-int.mgmt when not specified.
+
+  
+  external_ids:ovn-bridge-remote-probe-interval
+  
+
+  The inactivity probe interval of the connection to the OVN management
+   

[ovs-dev] [PATCH ovn v4] Text respresntations for drop sampling.

2024-06-03 Thread Jacob Tanenbaum
Created a new column in the southbound database to hardcode a human readable
description for flows. This first use is describing why the flow is dropping 
packets.
The new column is called flow_desc and will create southbound database entries 
like this

_uuid   : 20f1897b-477e-47ae-a32c-c546d83ec097
actions : 
"sample(probability=65535,collector_set=123,obs_domain=1,obs_point=$cookie); /* 
drop */"
controller_meter: []
external_ids: {source="northd.c:8721", stage-name=ls_in_l2_unknown}
flow_desc   : "No L2 destination"
logical_datapath: []
logical_dp_group: ee3c3db5-98a2-4f34-8a84-409deae140a7
match   : "outport == \"none\""
pipeline: ingress
priority: 50
table_id: 27
tags: {}
hash: 0

future work includes entering more flow_desc for more flows and adding
flow_desc to the actions as a comment.

Signed-off-by: Jacob Tanenbaum 
Suggested-by: Dumitru Ceara 
Reported-at: https://issues.redhat.com/browse/FDP-307

---

v1: initial version
v2: correct commit message
make the flow_desc a char*
correct a typo in the ovn-sb.xml
correct the test
v3: rebase issue with NEWS file
v4: remove options:debug_drop_domain_id="1" from testing as we
do not depend on it

merge

diff --git a/NEWS b/NEWS
index 81c958f9a..04c441ada 100644
--- a/NEWS
+++ b/NEWS
@@ -21,6 +21,8 @@ Post v24.03.0
 MAC addresses configured on the LSP with "unknown", are learnt via the
 OVN native FDB.
   - Add support for ovsdb-server `--config-file` option in ovn-ctl.
+  - Added a new column in the southbound database "flow_desc" to provide
+human readable context to flows.
 
 OVN v24.03.0 - 01 Mar 2024
 --
diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c
index b2c60b5de..e27558a32 100644
--- a/northd/lflow-mgr.c
+++ b/northd/lflow-mgr.c
@@ -36,7 +36,7 @@ static void ovn_lflow_init(struct ovn_lflow *, struct 
ovn_datapath *od,
uint16_t priority, char *match,
char *actions, char *io_port,
char *ctrl_meter, char *stage_hint,
-   const char *where);
+   const char *where, char *flow_desc);
 static struct ovn_lflow *ovn_lflow_find(const struct hmap *lflows,
 enum ovn_stage stage,
 uint16_t priority, const char *match,
@@ -52,7 +52,7 @@ static struct ovn_lflow *do_ovn_lflow_add(
 const char *actions, const char *io_port,
 const char *ctrl_meter,
 const struct ovsdb_idl_row *stage_hint,
-const char *where);
+const char *where, const char *flow_desc);
 
 
 static struct ovs_mutex *lflow_hash_lock(const struct hmap *lflow_table,
@@ -173,6 +173,7 @@ struct ovn_lflow {
   * 'dpg_bitmap'. */
 struct ovn_dp_group *dpg;/* Link to unique Sb datapath group. */
 const char *where;
+char *flow_desc;
 
 struct uuid sb_uuid; /* SB DB row uuid, specified by northd. */
 struct ovs_list referenced_by;  /* List of struct lflow_ref_node. */
@@ -659,7 +660,7 @@ lflow_table_add_lflow(struct lflow_table *lflow_table,
   const char *match, const char *actions,
   const char *io_port, const char *ctrl_meter,
   const struct ovsdb_idl_row *stage_hint,
-  const char *where,
+  const char *where, const char *flow_desc,
   struct lflow_ref *lflow_ref)
 OVS_EXCLUDED(fake_hash_mutex)
 {
@@ -679,7 +680,7 @@ lflow_table_add_lflow(struct lflow_table *lflow_table,
 do_ovn_lflow_add(lflow_table,
  od ? ods_size(od->datapaths) : dp_bitmap_len,
  hash, stage, priority, match, actions,
- io_port, ctrl_meter, stage_hint, where);
+ io_port, ctrl_meter, stage_hint, where, flow_desc);
 
 if (lflow_ref) {
 struct lflow_ref_node *lrn =
@@ -733,7 +734,7 @@ lflow_table_add_lflow_default_drop(struct lflow_table 
*lflow_table,
 {
 lflow_table_add_lflow(lflow_table, od, NULL, 0, stage, 0, "1",
   debug_drop_action(), NULL, NULL, NULL,
-  where, lflow_ref);
+  where, NULL, lflow_ref);
 }
 
 struct ovn_dp_group *
@@ -856,7 +857,7 @@ static void
 ovn_lflow_init(struct ovn_lflow *lflow, struct ovn_datapath *od,
size_t dp_bitmap_len, enum ovn_stage stage, uint16_t priority,
char *match, char *actions, char *io_port, char *ctrl_meter,
-   char *stage_hint, const char *where)
+   char *stage_hint, const char *where, char *flow_desc)
 {
 lflow->dpg_bitmap = bitmap_allocate(dp_bitmap_len);
 lflow->od = od;
@@ -867,6 +868,7 @@ ovn_lflow_init(struct ovn_lflow *lflow, s

Re: [ovs-dev] [PATCH ovn v3] text respresntations for drop sampling.

2024-06-03 Thread Jacob Tanenbaum
On Mon, Jun 3, 2024 at 2:52 AM Ales Musil  wrote:

>
>
> On Wed, May 29, 2024 at 2:25 AM Jacob Tanenbaum 
> wrote:
>
>> created a new column in the southbound database to hardcode a human
>> readable
>> description for flows. This first use is describing why the flow is
>> dropping packets.
>> The new column is called flow_desc and will create southbound database
>> entries like this
>>
>> _uuid   : 20f1897b-477e-47ae-a32c-c546d83ec097
>> actions :
>> "sample(probability=65535,collector_set=123,obs_domain=1,obs_point=$cookie);
>> /* drop */"
>> controller_meter: []
>> external_ids: {source="northd.c:8721",
>> stage-name=ls_in_l2_unknown}
>> flow_desc   : "No L2 destination"
>> logical_datapath: []
>> logical_dp_group: ee3c3db5-98a2-4f34-8a84-409deae140a7
>> match   : "outport == \"none\""
>> pipeline: ingress
>> priority: 50
>> table_id: 27
>> tags: {}
>> hash: 0
>>
>> future work includes entering more flow_desc for more flows and adding
>> flow_desc to the actions as a comment.
>>
>> Signed-off-by: Jacob Tanenbaum 
>> Suggested-by: Dumitru Ceara 
>> Reported-at: https://issues.redhat.com/browse/FDP-307
>>
>> ---
>>
>> v1: initial version
>> v2: correct commit message
>> make the flow_desc a char*
>> correct a typo in the ovn-sb.xml
>> correct the test
>> v3: rebase issue with NEWS file
>>
>>
> Hi Jacob,
>
> thank you for the follow up. I have two minor comments, see below.
>
>
>> diff --git a/NEWS b/NEWS
>> index 81c958f9a..04c441ada 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -21,6 +21,8 @@ Post v24.03.0
>>  MAC addresses configured on the LSP with "unknown", are learnt via
>> the
>>  OVN native FDB.
>>- Add support for ovsdb-server `--config-file` option in ovn-ctl.
>> +  - Added a new column in the southbound database "flow_desc" to provide
>> +human readable context to flows.
>>
>>  OVN v24.03.0 - 01 Mar 2024
>>  --
>> diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c
>> index b2c60b5de..e27558a32 100644
>> --- a/northd/lflow-mgr.c
>> +++ b/northd/lflow-mgr.c
>> @@ -36,7 +36,7 @@ static void ovn_lflow_init(struct ovn_lflow *, struct
>> ovn_datapath *od,
>> uint16_t priority, char *match,
>> char *actions, char *io_port,
>> char *ctrl_meter, char *stage_hint,
>> -   const char *where);
>> +   const char *where, char *flow_desc);
>>  static struct ovn_lflow *ovn_lflow_find(const struct hmap *lflows,
>>  enum ovn_stage stage,
>>  uint16_t priority, const char
>> *match,
>> @@ -52,7 +52,7 @@ static struct ovn_lflow *do_ovn_lflow_add(
>>  const char *actions, const char *io_port,
>>  const char *ctrl_meter,
>>  const struct ovsdb_idl_row *stage_hint,
>> -const char *where);
>> +const char *where, const char *flow_desc);
>>
>>
>>  static struct ovs_mutex *lflow_hash_lock(const struct hmap *lflow_table,
>> @@ -173,6 +173,7 @@ struct ovn_lflow {
>>* 'dpg_bitmap'. */
>>  struct ovn_dp_group *dpg;/* Link to unique Sb datapath group. */
>>  const char *where;
>> +char *flow_desc;
>>
>>  struct uuid sb_uuid; /* SB DB row uuid, specified by northd.
>> */
>>  struct ovs_list referenced_by;  /* List of struct lflow_ref_node. */
>> @@ -659,7 +660,7 @@ lflow_table_add_lflow(struct lflow_table *lflow_table,
>>const char *match, const char *actions,
>>const char *io_port, const char *ctrl_meter,
>>const struct ovsdb_idl_row *stage_hint,
>> -  const char *where,
>> +  const char *where, const char *flow_desc,
>>struct lflow_ref *lflow_ref)
>>  OVS_EXCLUDED(fake_hash_mutex)
>>  {
>> @@ -679,7 +680,7 @@ lflow_table_add_lflow(struct lflow_table *lflow_table,
>>  do_ovn_lflow_add(lflow_table,
>>   od ? ods_size(od->datapaths) : dp_bitmap_len,
>>   hash, stage, priority, match, actions,
>> - io_port, ctrl_meter, stage_hint, where);
>> + io_port, ctrl_meter, stage_hint, where,
>> flow_desc);
>>
>>  if (lflow_ref) {
>>  struct lflow_ref_node *lrn =
>> @@ -733,7 +734,7 @@ lflow_table_add_lflow_default_drop(struct lflow_table
>> *lflow_table,
>>  {
>>  lflow_table_add_lflow(lflow_table, od, NULL, 0, stage, 0, "1",
>>debug_drop_action(), NULL, NULL, NULL,
>> -  where, lflow_ref);
>> +  where, NULL, lflow_ref);
>>
>
> The lflow_table_add_lflow_default_drop() is probably the best candidate to
> receive the description as an arg

Re: [ovs-dev] [PATCH net-next 2/2] selftests: openvswitch: set value to nla flags

2024-06-03 Thread Aaron Conole
Adrian Moreno  writes:

> Netlink flags, although they don't have payload at the netlink level,
> are represented as having a "True" value in pyroute2.
>
> Without it, trying to add a flow with a flag-type action (e.g: pop_vlan)
> fails with the following traceback:
>
> Traceback (most recent call last):
>   File "[...]/ovs-dpctl.py", line 2498, in 
> sys.exit(main(sys.argv))
>  ^^
>   File "[...]/ovs-dpctl.py", line 2487, in main
> ovsflow.add_flow(rep["dpifindex"], flow)
>   File "[...]/ovs-dpctl.py", line 2136, in add_flow
> reply = self.nlm_request(
> ^
>   File "[...]/pyroute2/netlink/nlsocket.py", line 822, in nlm_request
> return tuple(self._genlm_request(*argv, **kwarg))
>  ^^^
>   File "[...]/pyroute2/netlink/generic/__init__.py", line 126, in
> nlm_request
> return tuple(super().nlm_request(*argv, **kwarg))
>^^
>   File "[...]/pyroute2/netlink/nlsocket.py", line 1124, in nlm_request
> self.put(msg, msg_type, msg_flags, msg_seq=msg_seq)
>   File "[...]/pyroute2/netlink/nlsocket.py", line 389, in put
> self.sendto_gate(msg, addr)
>   File "[...]/pyroute2/netlink/nlsocket.py", line 1056, in sendto_gate
> msg.encode()
>   File "[...]/pyroute2/netlink/__init__.py", line 1245, in encode
> offset = self.encode_nlas(offset)
>  
>   File "[...]/pyroute2/netlink/__init__.py", line 1560, in encode_nlas
> nla_instance.setvalue(cell[1])
>   File "[...]/pyroute2/netlink/__init__.py", line 1265, in setvalue
> nlv.setvalue(nla_tuple[1])
>  ~^^^
> IndexError: list index out of range
>
> Signed-off-by: Adrian Moreno 
> ---

Acked-by: Aaron Conole 

I don't know which pyroute2 version I had used when I tested this
previously, but even on my current system I get this error now.  Thanks
for the fix.

>  tools/testing/selftests/net/openvswitch/ovs-dpctl.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py 
> b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> index b76907ac0092..a2395c3f37a1 100644
> --- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> +++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> @@ -537,7 +537,7 @@ class ovsactions(nla):
>  for flat_act in parse_flat_map:
>  if parse_starts_block(actstr, flat_act[0], False):
>  actstr = actstr[len(flat_act[0]):]
> -self["attrs"].append([flat_act[1]])
> +self["attrs"].append([flat_act[1], True])
>  actstr = actstr[strspn(actstr, ", ") :]
>  parsed = True

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next 1/2] selftests: openvswitch: fix action formatting

2024-06-03 Thread Aaron Conole
Adrian Moreno  writes:

> In the action formatting function ("dpstr"), the iteration is made over
> the nla_map, so if there are more than one attribute from the same type
> we only print the first one.
>
> Fix this by iterating over the actual attributes.
>
> Signed-off-by: Adrian Moreno 
> ---
>  .../selftests/net/openvswitch/ovs-dpctl.py| 48 +++
>  1 file changed, 27 insertions(+), 21 deletions(-)
>
> diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py 
> b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> index 1dd057afd3fb..b76907ac0092 100644
> --- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> +++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> @@ -437,40 +437,46 @@ class ovsactions(nla):
>  def dpstr(self, more=False):
>  print_str = ""
>  
> -for field in self.nla_map:
> -if field[1] == "none" or self.get_attr(field[0]) is None:
> +for attr_name, value in self["attrs"]:
> +attr_desc = next(filter(lambda x: x[0] == attr_name, 
> self.nla_map),
> + None)
> +if not attr_desc:
> +raise ValueError("Unknown attribute: %s" % attr)
> +
> +attr_type = attr_desc[1]
> +
> +if attr_type == "none":

I agree, this is an issue.  BUT I think it might be better to just
filter by field type up front.  See:

https://github.com/apconole/linux-next-work/commit/7262107de7170d44b6dbf6c5ea6f7e6c0bb71d36#diff-3e72e7405c6bb4e9842bed5f63883ca930387086bb40d4034e92ed83a5decb4bR441

That version I think ends up being much easier to follow.  If you want
to take it for your series, feel free.  If you disagree, maybe there's
something I'm not considering about it.

NOTE that version is just a bunch of independent changes that are
squashed together.  I have a cleaner version.

I can also bundle up the series I have so far and submit, but I didn't
want to do that until I got all the pmtu.sh support working.  Maybe it
makes sense to send it now though.  Simon, Jakub - wdyt?

>  continue
>  if print_str != "":
>  print_str += ","
>  
> -if field[1] == "uint32":
> -if field[0] == "OVS_ACTION_ATTR_OUTPUT":
> -print_str += "%d" % int(self.get_attr(field[0]))
> -elif field[0] == "OVS_ACTION_ATTR_RECIRC":
> -print_str += "recirc(0x%x)" % 
> int(self.get_attr(field[0]))
> -elif field[0] == "OVS_ACTION_ATTR_TRUNC":
> -print_str += "trunc(%d)" % int(self.get_attr(field[0]))
> -elif field[0] == "OVS_ACTION_ATTR_DROP":
> -print_str += "drop(%d)" % int(self.get_attr(field[0]))
> -elif field[1] == "flag":
> -if field[0] == "OVS_ACTION_ATTR_CT_CLEAR":
> +if attr_type == "uint32":
> +if attr_name == "OVS_ACTION_ATTR_OUTPUT":
> +print_str += "%d" % int(value)
> +elif attr_name == "OVS_ACTION_ATTR_RECIRC":
> +print_str += "recirc(0x%x)" % int(value)
> +elif attr_name == "OVS_ACTION_ATTR_TRUNC":
> +print_str += "trunc(%d)" % int(value)
> +elif attr_name == "OVS_ACTION_ATTR_DROP":
> +print_str += "drop(%d)" % int(value)
> +elif attr_type == "flag":
> +if attr_name == "OVS_ACTION_ATTR_CT_CLEAR":
>  print_str += "ct_clear"
> -elif field[0] == "OVS_ACTION_ATTR_POP_VLAN":
> +elif attr_name == "OVS_ACTION_ATTR_POP_VLAN":
>  print_str += "pop_vlan"
> -elif field[0] == "OVS_ACTION_ATTR_POP_ETH":
> +elif attr_name == "OVS_ACTION_ATTR_POP_ETH":
>  print_str += "pop_eth"
> -elif field[0] == "OVS_ACTION_ATTR_POP_NSH":
> +elif attr_name == "OVS_ACTION_ATTR_POP_NSH":
>  print_str += "pop_nsh"
> -elif field[0] == "OVS_ACTION_ATTR_POP_MPLS":
> +elif attr_name == "OVS_ACTION_ATTR_POP_MPLS":
>  print_str += "pop_mpls"
>  else:
> -datum = self.get_attr(field[0])
> -if field[0] == "OVS_ACTION_ATTR_CLONE":
> +if attr_name == "OVS_ACTION_ATTR_CLONE":
>  print_str += "clone("
> -print_str += datum.dpstr(more)
> +print_str += value.dpstr(more)
>  print_str += ")"
>  else:
> -print_str += datum.dpstr(more)
> +print_str += value.dpstr(more)
>  
>  return print_str

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample

2024-06-03 Thread Adrian Moreno
The OVS_ACTION_ATTR_SAMPLE action is, in essence,
observability-oriented.

Apart from some corner case in which it's used a replacement of clone()
for old kernels, it's really only used for sFlow, IPFIX and now,
local emit_sample.

With this in mind, it doesn't make much sense to report
OVS_DROP_LAST_ACTION inside sample actions.

For instance, if the flow:

  actions:sample(..,emit_sample(..)),2

triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
confusing for users since the packet did reach its destination.

This patch makes internal action execution silently consume the skb
instead of notifying a drop for this case.

Unfortunately, this patch does not remove all potential sources of
confusion since, if the sample action itself is the last action, e.g:

actions:sample(..,emit_sample(..))

we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.

Sadly, this case is difficult to solve without breaking the
optimization by which the skb is not cloned on last sample actions.
But, given explicit drop actions are now supported, OVS can just add one
after the last sample() and rewrite the flow as:

actions:sample(..,emit_sample(..)),drop

Signed-off-by: Adrian Moreno 
---
 net/openvswitch/actions.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 33f6d93ba5e4..54fc1abcff95 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
 static struct action_flow_keys __percpu *flow_keys;
 static DEFINE_PER_CPU(int, exec_actions_level);
 
+static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
+{
+   /* Do not emit packet drops inside sample(). */
+   if (OVS_CB(skb)->probability)
+   consume_skb(skb);
+   else
+   ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
+}
+
 /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
  * space. Return NULL if out of key spaces.
  */
@@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff 
*skb,
if ((arg->probability != U32_MAX) &&
(!arg->probability || get_random_u32() > arg->probability)) {
if (last)
-   ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
+   ovs_drop_skb_last_action(skb);
return 0;
}
 
@@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct 
sk_buff *skb,
}
}
 
-   ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
+   ovs_drop_skb_last_action(skb);
return 0;
 }
 
-- 
2.45.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test

2024-06-03 Thread Adrian Moreno
Add a test to verify sampling packets via psample works.

In order to do that, create a subcommand in ovs-dpctl.py to listen to
on the psample multicast group and print samples.

In order to also test simultaneous sFlow and psample actions and
packet truncation, add missing parsing support for "userspace" and
"trunc" actions.

Signed-off-by: Adrian Moreno 
---
 .../selftests/net/openvswitch/openvswitch.sh  |  99 +++-
 .../selftests/net/openvswitch/ovs-dpctl.py| 112 +-
 2 files changed, 204 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh 
b/tools/testing/selftests/net/openvswitch/openvswitch.sh
index 5cae53543849..f6e0ae3f6424 100755
--- a/tools/testing/selftests/net/openvswitch/openvswitch.sh
+++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh
@@ -20,7 +20,8 @@ tests="
nat_related_v4  ip4-nat-related: ICMP related 
matches work with SNAT
netlink_checks  ovsnl: validate netlink attrs 
and settings
upcall_interfaces   ovs: test the upcall interfaces
-   drop_reason drop: test drop reasons are 
emitted"
+   drop_reason drop: test drop reasons are 
emitted
+   emit_sample emit_sample: Sampling packets 
with psample"
 
 info() {
 [ $VERBOSE = 0 ] || echo $*
@@ -170,6 +171,19 @@ ovs_drop_reason_count()
return `echo "$perf_output" | grep "$pattern" | wc -l`
 }
 
+ovs_test_flow_fails () {
+   ERR_MSG="Flow actions may not be safe on all matching packets"
+
+   PRE_TEST=$(dmesg | grep -c "${ERR_MSG}")
+   ovs_add_flow $@ &> /dev/null $@ && return 1
+   POST_TEST=$(dmesg | grep -c "${ERR_MSG}")
+
+   if [ "$PRE_TEST" == "$POST_TEST" ]; then
+   return 1
+   fi
+   return 0
+}
+
 usage() {
echo
echo "$0 [OPTIONS] [TEST]..."
@@ -184,6 +198,89 @@ usage() {
exit 1
 }
 
+
+# emit_sample test
+# - use emit_sample to observe packets
+test_emit_sample() {
+   sbx_add "test_emit_sample" || return $?
+
+   # Add a datapath with per-vport dispatching.
+   ovs_add_dp "test_emit_sample" emit_sample -V 2:1 || return 1
+
+   info "create namespaces"
+   ovs_add_netns_and_veths "test_emit_sample" "emit_sample" \
+   client c0 c1 172.31.110.10/24 -u || return 1
+   ovs_add_netns_and_veths "test_emit_sample" "emit_sample" \
+   server s0 s1 172.31.110.20/24 -u || return 1
+
+   # Check if emit_sample actions can be configured.
+   ovs_add_flow "test_emit_sample" emit_sample \
+   'in_port(1),eth(),eth_type(0x0806),arp()' 'emit_sample(group=1)'
+   if [ $? == 1 ]; then
+   info "no support for emit_sample - skipping"
+   ovs_exit_sig
+   return $ksft_skip
+   fi
+
+   ovs_del_flows "test_emit_sample" emit_sample
+
+   # Allow ARP
+   ovs_add_flow "test_emit_sample" emit_sample \
+   'in_port(1),eth(),eth_type(0x0806),arp()' '2' || return 1
+   ovs_add_flow "test_emit_sample" emit_sample \
+   'in_port(2),eth(),eth_type(0x0806),arp()' '1' || return 1
+
+   # Test action verification.
+   OLDIFS=$IFS
+   IFS='*'
+   min_key='in_port(1),eth(),eth_type(0x0800),ipv4()'
+   for testcase in \
+   "cookie to 
large"*"emit_sample(group=1,cookie=1615141312111009080706050403020100)" \
+   "no group with cookie"*"emit_sample(cookie=abcd)" \
+   "no group"*"sample()";
+   do
+   set -- $testcase;
+   ovs_test_flow_fails "test_emit_sample" emit_sample $min_key $2
+   if [ $? == 1 ]; then
+   info "failed - $1"
+   return 1
+   fi
+   done
+   IFS=$OLDIFS
+
+   # Sample first 14 bytes of all traffic.
+   ovs_add_flow "test_emit_sample" emit_sample \
+   
"in_port(1),eth(),eth_type(0x0800),ipv4(src=172.31.110.10,proto=1),icmp()" 
"trunc(14),emit_sample(group=1,cookie=c0ffee),2"
+
+   # Sample all traffic. In this case, use a sample() action with both
+   # emit_sample and an upcall emulating simultaneous local sampling and
+   # sFlow / IPFIX.
+   nlpid=$(grep -E "listening on upcall packet handler" $ovs_dir/s0.out | 
cut -d ":" -f 2 | tr -d ' ')
+   ovs_add_flow "test_emit_sample" emit_sample \
+   
"in_port(2),eth(),eth_type(0x0800),ipv4(src=172.31.110.20,proto=1),icmp()" 
"sample(sample=100%,actions(emit_sample(group=2,cookie=eeff0c),userspace(pid=${nlpid},userdata=eeff0c))),1"
+
+   # Record emit_sample data.
+   python3 $ovs_base/ovs-dpctl.py psample >$ovs_dir/psample.out 
2>$ovs_dir/psample.err &
+   pid=$!
+   on_exit "ovs_sbx test_emit_sample kill -TERM $pid 2>/dev/null"
+
+   # Send a single ping.
+   sleep 1
+   ovs_sbx "test_emit_sam

[ovs-dev] [PATCH net-next v2 8/9] selftests: openvswitch: add emit_sample action

2024-06-03 Thread Adrian Moreno
Add sample and emit_sample action support to ovs-dpctl.py.

Refactor common attribute parsing logic into an external function.

Signed-off-by: Adrian Moreno 
---
 .../selftests/net/openvswitch/ovs-dpctl.py| 162 +-
 1 file changed, 161 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py 
b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
index a2395c3f37a1..f8b5362aac8c 100644
--- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
+++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
@@ -8,6 +8,7 @@ import argparse
 import errno
 import ipaddress
 import logging
+import math
 import multiprocessing
 import re
 import struct
@@ -58,6 +59,7 @@ OVS_FLOW_CMD_DEL = 2
 OVS_FLOW_CMD_GET = 3
 OVS_FLOW_CMD_SET = 4
 
+UINT32_MAX = 0x
 
 def macstr(mac):
 outstr = ":".join(["%02X" % i for i in mac])
@@ -267,6 +269,75 @@ def parse_extract_field(
 return str_skipped, data
 
 
+def parse_attrs(actstr, attr_desc):
+"""Parses the given action string and returns a list of netlink
+attributes based on a list of attribute descriptions.
+
+Each element in the attribute description list is a tuple such as:
+(name, attr_name, parse_func)
+where:
+name: is the string representing the attribute
+attr_name: is the name of the attribute as defined in the uAPI.
+parse_func: is a callable accepting a string and returning either
+a single object (the parsed attribute value) or a tuple of
+two values (the parsed attribute value and the remaining string)
+
+Returns a list of attributes and the remaining string.
+"""
+def parse_attr(actstr, key, func):
+actstr = actstr[len(key) :]
+
+if not func:
+return None, actstr
+
+delim = actstr[0]
+actstr = actstr[1:]
+
+if delim == "=":
+pos = strcspn(actstr, ",)")
+ret = func(actstr[:pos])
+else:
+ret = func(actstr)
+
+if isinstance(ret, tuple):
+(datum, actstr) = ret
+else:
+datum = ret
+actstr = actstr[strcspn(actstr, ",)"):]
+
+if delim == "(":
+if not actstr or actstr[0] != ")":
+raise ValueError("Action contains unbalanced parentheses")
+
+actstr = actstr[1:]
+
+actstr = actstr[strspn(actstr, ", ") :]
+
+return datum, actstr
+
+attrs = []
+attr_desc = list(attr_desc)
+while actstr and actstr[0] != ")" and attr_desc:
+found = False
+for i, (key, attr, func) in enumerate(attr_desc):
+if actstr.startswith(key):
+datum, actstr = parse_attr(actstr, key, func)
+attrs.append([attr, datum])
+found = True
+del attr_desc[i]
+
+if not found:
+raise ValueError("Unknown attribute: '%s'" % actstr)
+
+actstr = actstr[strspn(actstr, ", ") :]
+
+if actstr[0] != ")":
+raise ValueError("Action string contains extra garbage or has "
+ "unbalanced parenthesis: '%s'" % actstr)
+
+return attrs, actstr[1:]
+
+
 class ovs_dp_msg(genlmsg):
 # include the OVS version
 # We need a custom header rather than just being able to rely on
@@ -285,7 +356,7 @@ class ovsactions(nla):
 ("OVS_ACTION_ATTR_SET", "none"),
 ("OVS_ACTION_ATTR_PUSH_VLAN", "none"),
 ("OVS_ACTION_ATTR_POP_VLAN", "flag"),
-("OVS_ACTION_ATTR_SAMPLE", "none"),
+("OVS_ACTION_ATTR_SAMPLE", "sample"),
 ("OVS_ACTION_ATTR_RECIRC", "uint32"),
 ("OVS_ACTION_ATTR_HASH", "none"),
 ("OVS_ACTION_ATTR_PUSH_MPLS", "none"),
@@ -304,8 +375,85 @@ class ovsactions(nla):
 ("OVS_ACTION_ATTR_ADD_MPLS", "none"),
 ("OVS_ACTION_ATTR_DEC_TTL", "none"),
 ("OVS_ACTION_ATTR_DROP", "uint32"),
+("OVS_ACTION_ATTR_EMIT_SAMPLE", "emit_sample"),
 )
 
+class emit_sample(nla):
+nla_flags = NLA_F_NESTED
+
+nla_map = (
+("OVS_EMIT_SAMPLE_ATTR_UNSPEC", "none"),
+("OVS_EMIT_SAMPLE_ATTR_GROUP", "uint32"),
+("OVS_EMIT_SAMPLE_ATTR_COOKIE", "array(uint8)"),
+)
+
+def dpstr(self, more=False):
+args = "group=%d" % self.get_attr("OVS_EMIT_SAMPLE_ATTR_GROUP")
+
+cookie = self.get_attr("OVS_EMIT_SAMPLE_ATTR_COOKIE")
+if cookie:
+args += ",cookie(%s)" % \
+"".join(format(x, "02x") for x in cookie)
+
+return "emit_sample(%s)" % args
+
+def parse(self, actstr):
+desc = (
+("group", "OVS_EMIT_SAMPLE_ATTR_GROUP", int),
+("cookie", "OVS_EMIT_SAMPLE_ATTR_COOKIE",
+lambda x: list(bytearray.fromhex(x)))
+)
+
+attrs, actstr = parse_attrs(actstr, desc)
+
+for attr in attrs:
+self[

[ovs-dev] [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.

2024-06-03 Thread Adrian Moreno
The behavior of actions might not be the exact same if they are being
executed inside a nested sample action. Store the probability of the
parent sample action in the skb's cb area.

Use the probability in emit_sample to pass it down to psample.

Signed-off-by: Adrian Moreno 
---
 include/uapi/linux/openvswitch.h |  3 ++-
 net/openvswitch/actions.c| 25 ++---
 net/openvswitch/datapath.h   |  3 +++
 net/openvswitch/vport.c  |  1 +
 4 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index a0e9dde0584a..9d675725fa2b 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -649,7 +649,8 @@ enum ovs_flow_attr {
  * Actions are passed as nested attributes.
  *
  * Executes the specified actions with the given probability on a per-packet
- * basis.
+ * basis. Nested actions will be able to access the probability value of the
+ * parent @OVS_ACTION_ATTR_SAMPLE.
  */
 enum ovs_sample_attr {
OVS_SAMPLE_ATTR_UNSPEC,
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 3b4dba0ded59..33f6d93ba5e4 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -1048,12 +1048,15 @@ static int sample(struct datapath *dp, struct sk_buff 
*skb,
struct nlattr *sample_arg;
int rem = nla_len(attr);
const struct sample_arg *arg;
+   u32 init_probability;
bool clone_flow_key;
+   int err;
 
/* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
sample_arg = nla_data(attr);
arg = nla_data(sample_arg);
actions = nla_next(sample_arg, &rem);
+   init_probability = OVS_CB(skb)->probability;
 
if ((arg->probability != U32_MAX) &&
(!arg->probability || get_random_u32() > arg->probability)) {
@@ -1062,9 +1065,21 @@ static int sample(struct datapath *dp, struct sk_buff 
*skb,
return 0;
}
 
+   if (init_probability) {
+   OVS_CB(skb)->probability = ((u64)OVS_CB(skb)->probability *
+   arg->probability / U32_MAX);
+   } else {
+   OVS_CB(skb)->probability = arg->probability;
+   }
+
clone_flow_key = !arg->exec;
-   return clone_execute(dp, skb, key, 0, actions, rem, last,
-clone_flow_key);
+   err = clone_execute(dp, skb, key, 0, actions, rem, last,
+   clone_flow_key);
+
+   if (!last)
+   OVS_CB(skb)->probability = init_probability;
+
+   return err;
 }
 
 /* When 'last' is true, clone() should always consume the 'skb'.
@@ -1313,6 +1328,7 @@ static int execute_emit_sample(struct datapath *dp, 
struct sk_buff *skb,
struct psample_metadata md = {};
struct vport *input_vport;
const struct nlattr *a;
+   u32 rate;
int rem;
 
for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
@@ -1337,8 +1353,11 @@ static int execute_emit_sample(struct datapath *dp, 
struct sk_buff *skb,
 
md.in_ifindex = input_vport->dev->ifindex;
md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
+   md.rate_as_probability = 1;
+
+   rate = OVS_CB(skb)->probability ? OVS_CB(skb)->probability : U32_MAX;
 
-   psample_sample_packet(&psample_group, skb, 0, &md);
+   psample_sample_packet(&psample_group, skb, rate, &md);
 #endif
 
return 0;
diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
index 0cd29971a907..9ca6231ea647 100644
--- a/net/openvswitch/datapath.h
+++ b/net/openvswitch/datapath.h
@@ -115,12 +115,15 @@ struct datapath {
  * fragmented.
  * @acts_origlen: The netlink size of the flow actions applied to this skb.
  * @cutlen: The number of bytes from the packet end to be removed.
+ * @probability: The sampling probability that was applied to this skb; 0 means
+ * no sampling has occurred; U32_MAX means 100% probability.
  */
 struct ovs_skb_cb {
struct vport*input_vport;
u16 mru;
u16 acts_origlen;
u32 cutlen;
+   u32 probability;
 };
 #define OVS_CB(skb) ((struct ovs_skb_cb *)(skb)->cb)
 
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 972ae01a70f7..8732f6e51ae5 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -500,6 +500,7 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff 
*skb,
OVS_CB(skb)->input_vport = vport;
OVS_CB(skb)->mru = 0;
OVS_CB(skb)->cutlen = 0;
+   OVS_CB(skb)->probability = 0;
if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
u32 mark;
 
-- 
2.45.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v2 4/9] net: psample: allow using rate as probability

2024-06-03 Thread Adrian Moreno
Although not explicitly documented in the psample module itself, the
definition of PSAMPLE_ATTR_SAMPLE_RATE seems inherited from act_sample.

Quoting tc-sample(8):
"RATE of 100 will lead to an average of one sampled packet out of every
100 observed."

With this semantics, the rates that we can express with an unsigned
32-bits number are very unevenly distributed and concentrated towards
"sampling few packets".
For example, we can express a probability of 2.32E-8% but we
cannot express anything between 100% and 50%.

For sampling applications that are capable of sampling a decent
amount of packets, this sampling rate semantics is not very useful.

Add a new flag to the uAPI that indicates that the sampling rate is
expressed in scaled probability, this is:
- 0 is 0% probability, no packets get sampled.
- U32_MAX is 100% probability, all packets get sampled.

Signed-off-by: Adrian Moreno 
---
 include/net/psample.h | 3 ++-
 include/uapi/linux/psample.h  | 4 
 include/uapi/linux/tc_act/tc_sample.h | 1 +
 net/psample/psample.c | 3 +++
 4 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/net/psample.h b/include/net/psample.h
index 2ac71260a546..c52e9ebd88dd 100644
--- a/include/net/psample.h
+++ b/include/net/psample.h
@@ -24,7 +24,8 @@ struct psample_metadata {
u8 out_tc_valid:1,
   out_tc_occ_valid:1,
   latency_valid:1,
-  unused:5;
+  rate_as_probability:1,
+  unused:4;
const u8 *user_cookie;
u32 user_cookie_len;
 };
diff --git a/include/uapi/linux/psample.h b/include/uapi/linux/psample.h
index e80637e1d97b..8b069e75beab 100644
--- a/include/uapi/linux/psample.h
+++ b/include/uapi/linux/psample.h
@@ -20,6 +20,10 @@ enum {
PSAMPLE_ATTR_TIMESTAMP, /* u64, nanoseconds */
PSAMPLE_ATTR_PROTO, /* u16 */
PSAMPLE_ATTR_USER_COOKIE,   /* binary, user provided data */
+   PSAMPLE_ATTR_SAMPLE_PROBABILITY,/* no argument, interpret rate in
+* PSAMPLE_ATTR_SAMPLE_RATE as a
+* probability scaled 0 - U32_MAX.
+*/
 
__PSAMPLE_ATTR_MAX
 };
diff --git a/include/uapi/linux/tc_act/tc_sample.h 
b/include/uapi/linux/tc_act/tc_sample.h
index fee1bcc20793..7ee0735e7b38 100644
--- a/include/uapi/linux/tc_act/tc_sample.h
+++ b/include/uapi/linux/tc_act/tc_sample.h
@@ -18,6 +18,7 @@ enum {
TCA_SAMPLE_TRUNC_SIZE,
TCA_SAMPLE_PSAMPLE_GROUP,
TCA_SAMPLE_PAD,
+   TCA_SAMPLE_PROBABILITY,
__TCA_SAMPLE_MAX
 };
 #define TCA_SAMPLE_MAX (__TCA_SAMPLE_MAX - 1)
diff --git a/net/psample/psample.c b/net/psample/psample.c
index 1c76f3e48dcd..f48b5b9cd409 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -497,6 +497,9 @@ void psample_sample_packet(struct psample_group *group, 
struct sk_buff *skb,
md->user_cookie))
goto error;
 
+   if (md->rate_as_probability)
+   nla_put_flag(skb, PSAMPLE_ATTR_SAMPLE_PROBABILITY);
+
genlmsg_end(nl_skb, data);
genlmsg_multicast_netns(&psample_nl_family, group->net, nl_skb, 0,
PSAMPLE_NL_MCGRP_SAMPLE, GFP_ATOMIC);
-- 
2.45.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action

2024-06-03 Thread Adrian Moreno
Add support for a new action: emit_sample.

This action accepts a u32 group id and a variable-length cookie and uses
the psample multicast group to make the packet available for
observability.

The maximum length of the user-defined cookie is set to 16, same as
tc_cookie, to discourage using cookies that will not be offloadable.

Signed-off-by: Adrian Moreno 
---
 Documentation/netlink/specs/ovs_flow.yaml | 17 
 include/uapi/linux/openvswitch.h  | 25 
 net/openvswitch/actions.c | 50 +++
 net/openvswitch/flow_netlink.c| 33 ++-
 4 files changed, 124 insertions(+), 1 deletion(-)

diff --git a/Documentation/netlink/specs/ovs_flow.yaml 
b/Documentation/netlink/specs/ovs_flow.yaml
index 4fdfc6b5cae9..a7ab5593a24f 100644
--- a/Documentation/netlink/specs/ovs_flow.yaml
+++ b/Documentation/netlink/specs/ovs_flow.yaml
@@ -727,6 +727,12 @@ attribute-sets:
 name: dec-ttl
 type: nest
 nested-attributes: dec-ttl-attrs
+  -
+name: emit-sample
+type: nest
+nested-attributes: emit-sample-attrs
+doc: |
+  Sends a packet sample to psample for external observation.
   -
 name: tunnel-key-attrs
 enum-name: ovs-tunnel-key-attr
@@ -938,6 +944,17 @@ attribute-sets:
   -
 name: gbp
 type: u32
+  -
+name: emit-sample-attrs
+enum-name: ovs-emit-sample-attr
+name-prefix: ovs-emit-sample-attr-
+attributes:
+  -
+name: group
+type: u32
+  -
+name: cookie
+type: binary
 
 operations:
   name-prefix: ovs-flow-cmd-
diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index efc82c318fa2..a0e9dde0584a 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -914,6 +914,30 @@ struct check_pkt_len_arg {
 };
 #endif
 
+#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
+/**
+ * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
+ * action.
+ *
+ * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
+ * sample.
+ * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that contains
+ * user-defined metadata. The maximum length is 16 bytes.
+ *
+ * Sends the packet to the psample multicast group with the specified group and
+ * cookie. It is possible to combine this action with the
+ * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being emitted.
+ */
+enum ovs_emit_sample_attr {
+   OVS_EMIT_SAMPLE_ATTR_UNPSEC,
+   OVS_EMIT_SAMPLE_ATTR_GROUP, /* u32 number. */
+   OVS_EMIT_SAMPLE_ATTR_COOKIE,/* Optional, user specified cookie. */
+   __OVS_EMIT_SAMPLE_ATTR_MAX
+};
+
+#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
+
+
 /**
  * enum ovs_action_attr - Action types.
  *
@@ -1004,6 +1028,7 @@ enum ovs_action_attr {
OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
OVS_ACTION_ATTR_DROP, /* u32 error code. */
+   OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */
 
__OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
   * from userspace. */
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 964225580824..3b4dba0ded59 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -24,6 +24,11 @@
 #include 
 #include 
 #include 
+
+#if IS_ENABLED(CONFIG_PSAMPLE)
+#include 
+#endif
+
 #include 
 
 #include "datapath.h"
@@ -1299,6 +1304,46 @@ static int execute_dec_ttl(struct sk_buff *skb, struct 
sw_flow_key *key)
return 0;
 }
 
+static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
+  const struct sw_flow_key *key,
+  const struct nlattr *attr)
+{
+#if IS_ENABLED(CONFIG_PSAMPLE)
+   struct psample_group psample_group = {};
+   struct psample_metadata md = {};
+   struct vport *input_vport;
+   const struct nlattr *a;
+   int rem;
+
+   for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
+a = nla_next(a, &rem)) {
+   switch (nla_type(a)) {
+   case OVS_EMIT_SAMPLE_ATTR_GROUP:
+   psample_group.group_num = nla_get_u32(a);
+   break;
+
+   case OVS_EMIT_SAMPLE_ATTR_COOKIE:
+   md.user_cookie = nla_data(a);
+   md.user_cookie_len = nla_len(a);
+   break;
+   }
+   }
+
+   psample_group.net = ovs_dp_get_net(dp);
+
+   input_vport = ovs_vport_rcu(dp, key->phy.in_port);
+   if (!input_vport)
+   input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);
+
+   md.in_ifindex = input_vport->dev->ifindex;
+   md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
+
+   psa

[ovs-dev] [PATCH net-next v2 3/9] net: psample: skip packet copy if no listeners

2024-06-03 Thread Adrian Moreno
If nobody is listening on the multicast group, generating the sample,
which involves copying packet data, seems completely unnecessary.

Return fast in this case.

Signed-off-by: Adrian Moreno 
---
 net/psample/psample.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/net/psample/psample.c b/net/psample/psample.c
index b37488f426bc..1c76f3e48dcd 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -376,6 +376,10 @@ void psample_sample_packet(struct psample_group *group, 
struct sk_buff *skb,
void *data;
int ret;
 
+   if (!genl_has_listeners(&psample_nl_family, group->net,
+   PSAMPLE_NL_MCGRP_SAMPLE))
+   return;
+
meta_len = (in_ifindex ? nla_total_size(sizeof(u16)) : 0) +
   (out_ifindex ? nla_total_size(sizeof(u16)) : 0) +
   (md->out_tc_valid ? nla_total_size(sizeof(u16)) : 0) +
-- 
2.45.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample

2024-06-03 Thread Adrian Moreno
If the action has a user_cookie, pass it along to the sample so it can
be easily identified.

Signed-off-by: Adrian Moreno 
---
 net/sched/act_sample.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
index a69b53d54039..5c3f86ec964a 100644
--- a/net/sched/act_sample.c
+++ b/net/sched/act_sample.c
@@ -165,9 +165,11 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
 const struct tc_action *a,
 struct tcf_result *res)
 {
+   u8 cookie_data[TC_COOKIE_MAX_SIZE] = {};
struct tcf_sample *s = to_sample(a);
struct psample_group *psample_group;
struct psample_metadata md = {};
+   struct tc_cookie *user_cookie;
int retval;
 
tcf_lastuse_update(&s->tcf_tm);
@@ -189,6 +191,16 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
if (skb_at_tc_ingress(skb) && tcf_sample_dev_ok_push(skb->dev))
skb_push(skb, skb->mac_len);
 
+   rcu_read_lock();
+   user_cookie = rcu_dereference(a->user_cookie);
+   if (user_cookie) {
+   memcpy(cookie_data, user_cookie->data,
+  user_cookie->len);
+   md.user_cookie = cookie_data;
+   md.user_cookie_len = user_cookie->len;
+   }
+   rcu_read_unlock();
+
md.trunc_size = s->truncate ? s->trunc_size : skb->len;
psample_sample_packet(psample_group, skb, s->rate, &md);
 
-- 
2.45.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v2 1/9] net: psample: add user cookie

2024-06-03 Thread Adrian Moreno
Add a user cookie to the sample metadata so that sample emitters can
provide more contextual information to samples.

If present, send the user cookie in a new attribute:
PSAMPLE_ATTR_USER_COOKIE.

Signed-off-by: Adrian Moreno 
---
 include/net/psample.h| 2 ++
 include/uapi/linux/psample.h | 1 +
 net/psample/psample.c| 9 -
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/net/psample.h b/include/net/psample.h
index 0509d2d6be67..2ac71260a546 100644
--- a/include/net/psample.h
+++ b/include/net/psample.h
@@ -25,6 +25,8 @@ struct psample_metadata {
   out_tc_occ_valid:1,
   latency_valid:1,
   unused:5;
+   const u8 *user_cookie;
+   u32 user_cookie_len;
 };
 
 struct psample_group *psample_group_get(struct net *net, u32 group_num);
diff --git a/include/uapi/linux/psample.h b/include/uapi/linux/psample.h
index e585db5bf2d2..e80637e1d97b 100644
--- a/include/uapi/linux/psample.h
+++ b/include/uapi/linux/psample.h
@@ -19,6 +19,7 @@ enum {
PSAMPLE_ATTR_LATENCY,   /* u64, nanoseconds */
PSAMPLE_ATTR_TIMESTAMP, /* u64, nanoseconds */
PSAMPLE_ATTR_PROTO, /* u16 */
+   PSAMPLE_ATTR_USER_COOKIE,   /* binary, user provided data */
 
__PSAMPLE_ATTR_MAX
 };
diff --git a/net/psample/psample.c b/net/psample/psample.c
index a5d9b8446f77..b37488f426bc 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -386,7 +386,9 @@ void psample_sample_packet(struct psample_group *group, 
struct sk_buff *skb,
   nla_total_size(sizeof(u32)) +/* group_num */
   nla_total_size(sizeof(u32)) +/* seq */
   nla_total_size_64bit(sizeof(u64)) +  /* timestamp */
-  nla_total_size(sizeof(u16)); /* protocol */
+  nla_total_size(sizeof(u16)) +/* protocol */
+  (md->user_cookie_len ?
+   nla_total_size(md->user_cookie_len) : 0); /* user cookie */
 
 #ifdef CONFIG_INET
tun_info = skb_tunnel_info(skb);
@@ -486,6 +488,11 @@ void psample_sample_packet(struct psample_group *group, 
struct sk_buff *skb,
}
 #endif
 
+   if (md->user_cookie && md->user_cookie_len &&
+   nla_put(nl_skb, PSAMPLE_ATTR_USER_COOKIE, md->user_cookie_len,
+   md->user_cookie))
+   goto error;
+
genlmsg_end(nl_skb, data);
genlmsg_multicast_netns(&psample_nl_family, group->net, nl_skb, 0,
PSAMPLE_NL_MCGRP_SAMPLE, GFP_ATOMIC);
-- 
2.45.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v2 0/9] net: openvswitch: Add sample multicasting.

2024-06-03 Thread Adrian Moreno
** Background **
Currently, OVS supports several packet sampling mechanisms (sFlow,
per-bridge IPFIX, per-flow IPFIX). These end up being translated into a
userspace action that needs to be handled by ovs-vswitchd's handler
threads only to be forwarded to some third party application that
will somehow process the sample and provide observability on the
datapath.

A particularly interesting use-case is controller-driven
per-flow IPFIX sampling where the OpenFlow controller can add metadata
to samples (via two 32bit integers) and this metadata is then available
to the sample-collecting system for correlation.

** Problem **
The fact that sampled traffic share netlink sockets and handler thread
time with upcalls, apart from being a performance bottleneck in the
sample extraction itself, can severely compromise the datapath,
yielding this solution unfit for highly loaded production systems.

Users are left with little options other than guessing what sampling
rate will be OK for their traffic pattern and system load and dealing
with the lost accuracy.

Looking at available infrastructure, an obvious candidated would be
to use psample. However, it's current state does not help with the
use-case at stake because sampled packets do not contain user-defined
metadata.

** Proposal **
This series is an attempt to fix this situation by extending the
existing psample infrastructure to carry a variable length
user-defined cookie.

The main existing user of psample is tc's act_sample. It is also
extended to forward the action's cookie to psample.

Finally, a new OVS action (OVS_SAMPLE_ATTR_EMIT_SAMPLE) is created.
It accepts a group and an optional cookie and uses psample to
multicast the packet and the metadata.

--
v1 -> v2:
- Create a new action ("emit_sample") rather than reuse existing
  "sample" one.
- Add probability semantics to psample's sampling rate.
- Store sampling probability in skb's cb area and use it in emit_sample.
- Test combining "emit_sample" with "trunc"
- Drop group_id filtering and tracepoint in psample.

rfc_v2 -> v1:
- Accomodate Ilya's comments.
- Split OVS's attribute in two attributes and simplify internal
handling of psample arguments.
- Extend psample and tc with a user-defined cookie.
- Add a tracepoint to psample to facilitate troubleshooting.

rfc_v1 -> rfc_v2:
- Use psample instead of a new OVS-only multicast group.
- Extend psample and tc with a user-defined cookie.

Adrian Moreno (9):
  net: psample: add user cookie
  net: sched: act_sample: add action cookie to sample
  net: psample: skip packet copy if no listeners
  net: psample: allow using rate as probability
  net: openvswitch: add emit_sample action
  net: openvswitch: store sampling probability in cb.
  net: openvswitch: do not notify drops inside sample
  selftests: openvswitch: add emit_sample action
  selftests: openvswitch: add emit_sample test

 Documentation/netlink/specs/ovs_flow.yaml |  17 ++
 include/net/psample.h |   5 +-
 include/uapi/linux/openvswitch.h  |  28 +-
 include/uapi/linux/psample.h  |   5 +
 include/uapi/linux/tc_act/tc_sample.h |   1 +
 net/openvswitch/actions.c |  86 +-
 net/openvswitch/datapath.h|   3 +
 net/openvswitch/flow_netlink.c|  33 ++-
 net/openvswitch/vport.c   |   1 +
 net/psample/psample.c |  16 +-
 net/sched/act_sample.c|  12 +
 .../selftests/net/openvswitch/openvswitch.sh  |  99 ++-
 .../selftests/net/openvswitch/ovs-dpctl.py| 274 +-
 13 files changed, 564 insertions(+), 16 deletions(-)

-- 
2.45.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] Do not reply on unicast arps for targets.

2024-06-03 Thread Vasyl Saienko
I've pushed a new revision, please review. The thread is `[PATCH v2 ovn] Do
not reply on unicast arps for IPv4 targets`

On Thu, May 23, 2024 at 1:07 PM Han Zhou  wrote:

>
>
> On Thu, May 23, 2024 at 1:51 PM Vasyl Saienko 
> wrote:
> >
> > Thanks for feedback,
> >
> > I'm not an ipv6 expert, I did some research. There is no ARP in ipv6,
> it's replaced with Neighbor Discovery. The ARP resolution is done via
> Neighbor Solicitation and Neighbour Advertisement messages.
> > Neighbor Solicitation is an analog of ARP request and is sent to either
> >
> > solicited-node multicast address 33:33:ff:XX:XX:XX where XX:XX:XX is the
> last 3 bytes of the target ipv6 address
> > or target node address
> >
> > So in our case we can update the flows to match only the solicited-node
> multicast address 33:33:ff:XX:XX:XX as follows. From
> >
> > table=??(ls_in_arp_rsp  ), priority=100  , match=(nd_ns && ip6.dst
> == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10 && inport ==
> "S1-vm1"), action=(next;)
> > table=??(ls_in_arp_rsp  ), priority=50   , match=(nd_ns  && ip6.dst
> == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10), action=(nd_na {
> eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10;
> nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output;
> };)
> >
> > to
> >
> > table=??(ls_in_arp_rsp  ), priority=100  , match=(nd_ns && eth.dst
> == {33:33:ff:00:00:10, 33:33:ff:ff:00:10} && ip6.dst == {fd00::10,
> ff02::1:ff00:10} && nd.target == fd00::10 && inport == "S1-vm1"),
> action=(next;)
> > table=??(ls_in_arp_rsp  ), priority=50   , match=(nd_ns  && eth.dst
> == {33:33:ff:00:00:10, 33:33:ff:ff:00:10} && ip6.dst == {fd00::10,
> ff02::1:ff00:10} && nd.target == fd00::10), action=(nd_na { eth.src =
> 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; nd.tll =
> 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; };)
> >
> > Will this be okay? Maybe we can do this in a separate patch?
> >
> I am not an IPv6 expert either. I even forgot that IPv6 NS uses multicast
> instead of broadcast MAC address, and was assuming we could simply append
> the eth.dst == ff:ff:ff:ff:ff:ff to the IPv6 NS responder flows, which is
> obviously wrong. Please ignore my feedback for the current patch and we can
> address IPv6 in a separate patch if it is required.
>
> Thanks,
> Han
>
> > On Wed, May 22, 2024 at 12:16 AM Han Zhou  wrote:
> >>
> >>
> >>
> >> On Tue, May 21, 2024 at 9:56 PM Numan Siddique  wrote:
> >> >
> >> > On Mon, May 20, 2024 at 1:47 AM Vasyl Saienko 
> wrote:
> >> > >
> >> > > Thanks for feedback Numan.
> >> > >
> >> > > On Wed, May 15, 2024 at 12:04 AM Numan Siddique 
> wrote:
> >> > >
> >> > > > On Mon, May 13, 2024 at 9:00 AM Dumitru Ceara 
> wrote:
> >> > > > >
> >> > > > > On 5/4/24 11:38, Vasyl Saienko wrote:
> >> > > > > > Hello Numan
> >> > > > > >
> >> > > > > > Thanks for review and feedback.
> >> > > > > >
> >> > > > > > On Fri, May 3, 2024 at 7:14 PM Numan Siddique 
> wrote:
> >> > > > > >
> >> > > > > >> On Mon, Apr 22, 2024 at 2:54 AM Vasyl Saienko <
> vsaie...@mirantis.com>
> >> > > > > >> wrote:
> >> > > > > >>>
> >> > > > > >>> Reply only if target ethernet address is broadcast, if
> >> > > > > >>> address is specified explicitly do noting to let target
> >> > > > > >>> reply by itself. This technique allows to monitor target
> >> > > > > >>> aliveness with arping.
> >> > > > > >>
> >> > > > > >> Thanks for the patch.
> >> > > > > >>
> >> > > > > >> This patch does change the behavior of OVN.  Having ARP
> responder
> >> > > > > >> flows avoids tunnelling the request packet if the
> destination is on a
> >> > > > > >> different compute node.
> >> > > > > >> But I don't think its a big deal.
> >> > > > > >>
> >> > > > > >> You are totally correct that the ARP responder allows
> limiting ARP
> >> > > > > > broadcast traffic between nodes. Normally ARP requests are
> sent to
> >> > > > > > broadcast ff:ff:ff:ff:ff:ff ethernet, as the protocol is
> designed to
> >> > > > > > identify ethernet addresses for known IP addresses. In this
> case since
> >> > > > > > traffic is broadcast it will be flooded among all nodes if ARP
> >> > > > responder is
> >> > > > > > not applied. At the same time the client may specify the
> target
> >> > > > > > ethernet address as unicast (in this case a broadcast storm
> will not
> >> > > > > > happen, as the destination ethernet address is unicast).
> >> > > > > >
> >> > > > > > In OpenStack we use unicast ARP requests as a way to monitor
> VM
> >> > > > aliveness
> >> > > > > > from remote locations to make sure there are no issues with
> >> > > > > > flows/underlying networking infra between nodes. We use ARPs
> due to
> >> > > > several
> >> > > > > > reasons:
> >> > > > > > 1. This protocol is mandatory and can't be disabled on the
> target
> >> > > > machine
> >> > > > > > (which guarantees that the target VM will always reply to
> ARPs if it is
> >> > > > > > alive)
> >> > > > > > 2. T

[ovs-dev] [PATCH net-next 2/2] selftests: openvswitch: set value to nla flags

2024-06-03 Thread Adrian Moreno
Netlink flags, although they don't have payload at the netlink level,
are represented as having a "True" value in pyroute2.

Without it, trying to add a flow with a flag-type action (e.g: pop_vlan)
fails with the following traceback:

Traceback (most recent call last):
  File "[...]/ovs-dpctl.py", line 2498, in 
sys.exit(main(sys.argv))
 ^^
  File "[...]/ovs-dpctl.py", line 2487, in main
ovsflow.add_flow(rep["dpifindex"], flow)
  File "[...]/ovs-dpctl.py", line 2136, in add_flow
reply = self.nlm_request(
^
  File "[...]/pyroute2/netlink/nlsocket.py", line 822, in nlm_request
return tuple(self._genlm_request(*argv, **kwarg))
 ^^^
  File "[...]/pyroute2/netlink/generic/__init__.py", line 126, in
nlm_request
return tuple(super().nlm_request(*argv, **kwarg))
   ^^
  File "[...]/pyroute2/netlink/nlsocket.py", line 1124, in nlm_request
self.put(msg, msg_type, msg_flags, msg_seq=msg_seq)
  File "[...]/pyroute2/netlink/nlsocket.py", line 389, in put
self.sendto_gate(msg, addr)
  File "[...]/pyroute2/netlink/nlsocket.py", line 1056, in sendto_gate
msg.encode()
  File "[...]/pyroute2/netlink/__init__.py", line 1245, in encode
offset = self.encode_nlas(offset)
 
  File "[...]/pyroute2/netlink/__init__.py", line 1560, in encode_nlas
nla_instance.setvalue(cell[1])
  File "[...]/pyroute2/netlink/__init__.py", line 1265, in setvalue
nlv.setvalue(nla_tuple[1])
 ~^^^
IndexError: list index out of range

Signed-off-by: Adrian Moreno 
---
 tools/testing/selftests/net/openvswitch/ovs-dpctl.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py 
b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
index b76907ac0092..a2395c3f37a1 100644
--- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
+++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
@@ -537,7 +537,7 @@ class ovsactions(nla):
 for flat_act in parse_flat_map:
 if parse_starts_block(actstr, flat_act[0], False):
 actstr = actstr[len(flat_act[0]):]
-self["attrs"].append([flat_act[1]])
+self["attrs"].append([flat_act[1], True])
 actstr = actstr[strspn(actstr, ", ") :]
 parsed = True
 
-- 
2.45.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next 1/2] selftests: openvswitch: fix action formatting

2024-06-03 Thread Adrian Moreno
In the action formatting function ("dpstr"), the iteration is made over
the nla_map, so if there are more than one attribute from the same type
we only print the first one.

Fix this by iterating over the actual attributes.

Signed-off-by: Adrian Moreno 
---
 .../selftests/net/openvswitch/ovs-dpctl.py| 48 +++
 1 file changed, 27 insertions(+), 21 deletions(-)

diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py 
b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
index 1dd057afd3fb..b76907ac0092 100644
--- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
+++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
@@ -437,40 +437,46 @@ class ovsactions(nla):
 def dpstr(self, more=False):
 print_str = ""
 
-for field in self.nla_map:
-if field[1] == "none" or self.get_attr(field[0]) is None:
+for attr_name, value in self["attrs"]:
+attr_desc = next(filter(lambda x: x[0] == attr_name, self.nla_map),
+ None)
+if not attr_desc:
+raise ValueError("Unknown attribute: %s" % attr)
+
+attr_type = attr_desc[1]
+
+if attr_type == "none":
 continue
 if print_str != "":
 print_str += ","
 
-if field[1] == "uint32":
-if field[0] == "OVS_ACTION_ATTR_OUTPUT":
-print_str += "%d" % int(self.get_attr(field[0]))
-elif field[0] == "OVS_ACTION_ATTR_RECIRC":
-print_str += "recirc(0x%x)" % int(self.get_attr(field[0]))
-elif field[0] == "OVS_ACTION_ATTR_TRUNC":
-print_str += "trunc(%d)" % int(self.get_attr(field[0]))
-elif field[0] == "OVS_ACTION_ATTR_DROP":
-print_str += "drop(%d)" % int(self.get_attr(field[0]))
-elif field[1] == "flag":
-if field[0] == "OVS_ACTION_ATTR_CT_CLEAR":
+if attr_type == "uint32":
+if attr_name == "OVS_ACTION_ATTR_OUTPUT":
+print_str += "%d" % int(value)
+elif attr_name == "OVS_ACTION_ATTR_RECIRC":
+print_str += "recirc(0x%x)" % int(value)
+elif attr_name == "OVS_ACTION_ATTR_TRUNC":
+print_str += "trunc(%d)" % int(value)
+elif attr_name == "OVS_ACTION_ATTR_DROP":
+print_str += "drop(%d)" % int(value)
+elif attr_type == "flag":
+if attr_name == "OVS_ACTION_ATTR_CT_CLEAR":
 print_str += "ct_clear"
-elif field[0] == "OVS_ACTION_ATTR_POP_VLAN":
+elif attr_name == "OVS_ACTION_ATTR_POP_VLAN":
 print_str += "pop_vlan"
-elif field[0] == "OVS_ACTION_ATTR_POP_ETH":
+elif attr_name == "OVS_ACTION_ATTR_POP_ETH":
 print_str += "pop_eth"
-elif field[0] == "OVS_ACTION_ATTR_POP_NSH":
+elif attr_name == "OVS_ACTION_ATTR_POP_NSH":
 print_str += "pop_nsh"
-elif field[0] == "OVS_ACTION_ATTR_POP_MPLS":
+elif attr_name == "OVS_ACTION_ATTR_POP_MPLS":
 print_str += "pop_mpls"
 else:
-datum = self.get_attr(field[0])
-if field[0] == "OVS_ACTION_ATTR_CLONE":
+if attr_name == "OVS_ACTION_ATTR_CLONE":
 print_str += "clone("
-print_str += datum.dpstr(more)
+print_str += value.dpstr(more)
 print_str += ")"
 else:
-print_str += datum.dpstr(more)
+print_str += value.dpstr(more)
 
 return print_str
 
-- 
2.45.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ovsdb: Use table indexes if available for ovsdb_query().

2024-06-03 Thread Ilya Maximets
On 6/3/24 06:20, Mike Pattrick wrote:
> Currently all OVSDB database queries except for UUID lookups all result
> in linear lookups over the entire table, even if an index is present.
> 
> This patch modifies ovsdb_query() to attempt an index lookup first, if
> possible. If no matching indexes are present then a linear index is
> still conducted.
> 
> Reported-at: https://issues.redhat.com/browse/FDP-590
> Signed-off-by: Mike Pattrick 
> ---
>  NEWS |   3 ++
>  ovsdb/query.c| 102 +++
>  ovsdb/row.h  |  28 +++
>  ovsdb/transaction.c  |  27 ---
>  tests/ovsdb-execution.at |  34 -
>  tests/ovsdb-server.at|   2 +-
>  tests/ovsdb-tool.at  |   2 +-
>  7 files changed, 159 insertions(+), 39 deletions(-)

Hi, Mike.  Thanks for the patch.

Besides what Simon asked, the patch has a few other issues:

1. Lookup is performed only on the committed index and it doesn't include
   rows that are in-flight in the current transaction.

   Unlike rows in a hash table, indexes are updated only after the whole
   transaction is committed.  With this change we'll not be able to find
   newly added rows.

   Another thing related to this is that it is allowed to have duplicates
   within a transaction as long as they are removed before the transaction
   ends.  So it is possible that multiple rows will satisfy the condition
   on indexed columns while the transaction is in-flight.

   Consider the following commands executed in a sandbox:

   # ovs-vsctl set-manager "tcp:my-first-target"
   # ovsdb-client transact unix:$(pwd)/sandbox/db.sock '
   ["Open_vSwitch",
{"op": "select",
 "table": "Manager",
 "columns": ["_uuid", "target"],
 "where": [["target", "==", "tcp:my-first-target"]]},
{"op": "insert", 
 "table": "Manager",
 "uuid-name": "duplicate",
 "row": {"target": "tcp:my-first-target"}},
{"op": "select",
 "table": "Manager",
 "columns": ["_uuid", "target"],
 "where": [["target", "==", "tcp:my-first-target"]]},
{"op": "delete",
 "table": "Manager",
 "where":[["_uuid","==",["named-uuid","duplicate"]]]},
{"op": "select",
 "table": "Manager",
 "columns": ["_uuid", "target"],
 "where": [["target", "==", "tcp:my-first-target"]]}]'

   Transaction must succeed.  The first selection should return 1 row,
   the second should return both duplicates and the third should again
   return one row.

   Ideally, implementation should not leak the transaction details to
   the query module, though I'm not sure if that is 100% achievable.

2. Taking above case into account, this change needs way more unit tests
   with different order of operations and complex data updates.

3. Since this is a performance-oriented change, please, include some
   performance numbers in the commit message as well, including impact
   on non-indexed lookups, if any.

4. There seems to be a lot of logic overlap with existing functions like
   ovsdb_condition_match_every_clause(), ovsdb_index_search() and
   ovsdb_row_hash_columns().  Can we re-use those instead?  For example,
   by creating a row from the conditions before the lookup?  What a
   performance impact will look like?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ovsdb: Use table indexes if available for ovsdb_query().

2024-06-03 Thread Mike Pattrick
On Mon, Jun 3, 2024 at 9:56 AM Simon Horman  wrote:
>
> On Mon, Jun 03, 2024 at 12:20:36AM -0400, Mike Pattrick wrote:
> > Currently all OVSDB database queries except for UUID lookups all result
> > in linear lookups over the entire table, even if an index is present.
> >
> > This patch modifies ovsdb_query() to attempt an index lookup first, if
> > possible. If no matching indexes are present then a linear index is
> > still conducted.
> >
> > Reported-at: https://issues.redhat.com/browse/FDP-590
> > Signed-off-by: Mike Pattrick 
>
> ...
>
> > diff --git a/ovsdb/query.c b/ovsdb/query.c
> > index eebe56412..e3e50a034 100644
> > --- a/ovsdb/query.c
> > +++ b/ovsdb/query.c
> > @@ -21,32 +21,116 @@
> >  #include "condition.h"
> >  #include "row.h"
> >  #include "table.h"
> > +#include "transaction.h"
> > +
> > +static bool
> > +ovsdb_query_index(struct ovsdb_table *table,
> > +  const struct ovsdb_condition *cnd,
> > +  const struct ovsdb_row **out)
> > +{
> > +for (size_t idx = 0; idx < table->schema->n_indexes; idx++) {
> > +const struct ovsdb_column_set *index = 
> > &table->schema->indexes[idx];
> > +struct hmap_node *node;
> > +size_t matches = 0;
> > +uint32_t hash = 0;
> > +
> > +if (index->n_columns != cnd->n_clauses) {
> > +continue;
> > +}
> > +
> > +/* The conditions may not be in the same order as the index. */
> > +for (size_t c = 0; c < cnd->n_clauses; c++) {
> > +const struct ovsdb_clause *cnd_cls = &cnd->clauses[c];
> > +
> > +if (cnd_cls->function != OVSDB_F_EQ) {
> > +return false;
> > +}
> > +
> > +for (size_t i = 0; i < index->n_columns; i++) {
> > +const struct ovsdb_column *idx_col = index->columns[i];
> > +
> > +if (cnd_cls->index == idx_col->index) {
> > +hash = ovsdb_datum_hash(&cnd_cls->arg, &idx_col->type,
> > +hash);
> > +matches++;
> > +break;
> > +}
> > +}
> > +
> > +/* If none of the indexed columns match, continue to the next
> > + * index. */
> > +if (matches == c) {
> > +break;
> > +}
> > +}
> > +
> > +if (matches != cnd->n_clauses) {
> > +continue;
> > +}
> > +
> > +for (node = hmap_first_with_hash(&table->indexes[idx], hash); node;
> > + node = hmap_next_with_hash(node)) {
> > +struct ovsdb_row *irow = ovsdb_row_from_index_node(node, table,
> > +   idx);
> > +
> > +for (size_t c = 0; c < cnd->n_clauses; c++) {
> > +const struct ovsdb_clause *cnd_cls = &cnd->clauses[c];
> > +
> > +if (!ovsdb_datum_equals(&cnd_cls->arg,
> > +&irow->fields[cnd_cls->index],
> > +&cnd_cls->column->type)) {
> > +irow = NULL;
> > +break;
> > +}
> > +}
> > +
> > +if (irow) {
> > +*out = irow;
> > +return true;
> > +}
> > +}
> > +
> > +/* In the case that there was a matching index but no matching 
> > row, the
> > + * index check is still considered to be a success. */
> > +return true;
>
> Hi Mike,
>
> Maybe I misread it, but it seems that the code above implements:
> 1. If a row is found, return true
> 2. Otherwise, returns true
>
> If so, then is there a need 1?

That's true, I could just break out of the loop. Github-ci also
pointed out another issue with the patch that I'm working on. When I
resubmit I'll correct this too.

-M

>
> > +}
> > +return false;
> > +}
>
> ...
>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/3] nsh: Add support to compose-packet and use it in system tests.

2024-06-03 Thread Simon Horman
On Fri, May 31, 2024 at 11:45:12PM +0200, Ilya Maximets wrote:
> OVS can parse NSH, but can't compose.  Fix that and get rid of plain
> hex NSH packets in system tests as they are hard to read or modify.
> 
> Tcpdump calls modified to write actual pcaps instead of text output,
> so ovs-pcap can be used while checking the results.
> 
> While at it, replacing sleeps with more robust waiting for tcpdump
> to start listening.

I might have separated a) adding NSH compose support
b) more robust tcpdump waiting, and c) using text rather than hex,
into 3 patches. But I don't feel strongly about it.

> M4 macros are better than shell variables, because we can see the
> substitution result in the test log.  So, using m4_define and m4_join
> extensively.
> 
> Signed-off-by: Ilya Maximets 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/3] tests: Convert ND, MPLS and CT sendpkt tests to compose-packet.

2024-06-03 Thread Simon Horman
On Fri, May 31, 2024 at 11:45:11PM +0200, Ilya Maximets wrote:
> These tests contain plain hex dumps that are hard to read and modify.
> Replace with equivalent calls to ovs-ofctl compose-packet --bare and
> ovs-pcap.
> 
> Tcpdump calls modified to write actual pcaps instead of text output,
> so ovs-pcap can be used while checking the results.
> 
> While at it, replacing sleeps with more robust waiting for tcpdump
> to start listening.

I might have put that part in a separate patch,
but I don't feel strongly about it.

> 
> M4 macros are better than shell variables, because we can see the
> substitution result in the test log.  So, using m4_define and m4_join
> extensively.
> 
> Signed-off-by: Ilya Maximets 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/3] tests: sendpkt: Allow different input formats.

2024-06-03 Thread Simon Horman
On Fri, May 31, 2024 at 11:45:10PM +0200, Ilya Maximets wrote:
> We require python 3, so instead of manually parsing bytes on input we
> can use built-in bytes.fromhex().  This function ignores whitespaces,
> so we can use different input formats - the old style space-separated
> bytes as well as pure hex strings provided by ovs-ofctl compose-packet
> and ovs-pcap.
> 
> Signed-off-by: Ilya Maximets 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 1/3] northd: Introduce ECMP_Nexthop table in SB db.

2024-06-03 Thread Lorenzo Bianconi
[...]
> > --- a/ovn-sb.xml
> > +++ b/ovn-sb.xml
> > @@ -5095,4 +5095,30 @@ tcp.flags = RST;
> >The set of variable values for a given chassis.
> >  
> >
> > +
> > +  
> > +
> > +  
> > +Nexthop IP address for this route.  Nexthop IP address should be
> the IP
> > +address of a connected router port or the IP address of a
> logical port
> > +or can be set to discard for dropping packets which
> match
> > +the given route.
> > +  
> > +
> > +
> > +
> > +  
> > +Nexthop unique indetifier. Nexthop ID is used to track active
> > +ecmp-symmetric-reply connections and flush stale ones.
> > +  
> > +
> > +
> > +
> > +  Reserved for future use.
> > +
> > +
> > +
> > +  See External IDs at the beginning of this document.
> > +
> > +  
> >  
> 
> Hi Lorenzo, I haven't reviewed all the patches of this series yet, but it
> is really hard for me to understand this new SB table just by looking at
> the documentation above. Each table should have some basic description for
> the table itself before describing columns, i.e. the purpose of the table
> and what does each row represent. Also in the description of the columns,
> it is confusing what "this route" and "the given route" mean. It may be
> clear to you since you are implementing the feature, but I think the SB
> documentation needs to be clear for someone new to the feature.

Hi Han,

thx, I will fix it posting v2. I will rebase this patch ontop of FDP-600
https://patchwork.ozlabs.org/project/ovn/list/?series=405268

Regards,
Lorenzo

> 
> Thanks,
> Han
> 
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index 89aed5adc..2160e8de7 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -6542,6 +6542,7 @@ check ovn-nbctl lsp-set-addresses public-lr0 router
> >  check ovn-nbctl lsp-set-options public-lr0 router-port=lr0-public
> >
> >  check ovn-nbctl --wait=sb --ecmp-symmetric-reply lr-route-add lr0
> 1.0.0.1 192.168.0.10
> > +check_row_count ECMP_Nexthop 1
> >
> >  ovn-sbctl dump-flows lr0 > lr0flows
> >
> > @@ -6553,6 +6554,7 @@ AT_CHECK([grep -e "lr_in_ip_routing_ecmp" lr0flows
> | ovn_strip_lflows], [0], [dn
> >  ])
> >
> >  check ovn-nbctl --wait=sb --ecmp-symmetric-reply lr-route-add lr0
> 1.0.0.1 192.168.0.20
> > +check_row_count ECMP_Nexthop 2
> >
> >  ovn-sbctl dump-flows lr0 > lr0flows
> >  AT_CHECK([grep -e "lr_in_ip_routing.*select" lr0flows |
> ovn_strip_lflows], [0], [dnl
> > @@ -6589,6 +6591,7 @@ AT_CHECK([grep -e "lr_in_arp_resolve.*ecmp"
> lr0flows | ovn_strip_lflows], [0], [
> >
> >  # add ecmp route with wrong nexthop
> >  check ovn-nbctl --wait=sb --ecmp-symmetric-reply lr-route-add lr0
> 1.0.0.1 192.168.1.20
> > +check_row_count ECMP_Nexthop 3
> >
> >  ovn-sbctl dump-flows lr0 > lr0flows
> >  AT_CHECK([grep -e "lr_in_ip_routing.*select" lr0flows |
> ovn_strip_lflows], [0], [dnl
> > @@ -6603,6 +6606,7 @@ AT_CHECK([grep -e "lr_in_ip_routing_ecmp" lr0flows
> | sed 's/192\.168\.0\..0/192.
> >
> >  check ovn-nbctl lr-route-del lr0
> >  wait_row_count nb:Logical_Router_Static_Route 0
> > +check_row_count ECMP_Nexthop 0
> >
> >  check ovn-nbctl --wait=sb lr-route-add lr0 1.0.0.0/24 192.168.0.10
> >  ovn-sbctl dump-flows lr0 > lr0flows
> > --
> > 2.44.0
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] controller: Send RARP/GARP for VIF post link state is up.

2024-06-03 Thread Shibir Basak
Hi team,

Can someone please review these changes?

Thanks,
Shibir Basak

From: Shibir Basak 
Date: Monday, 27 May 2024 at 11:55 PM
To: d...@openvswitch.org 
Cc: Shibir Basak , Naveen Yerramneni 

Subject: [PATCH ovn] controller: Send RARP/GARP for VIF post link state is up.
Currently, GARP/RARP broadcast is sent for VIFs (part of logical
switch with localnet port) after iface-id is set.
This fix is to avoid packet loss during migration if iface-id
is set even before the VM migration is completed.

Signed-off-by: Shibir Basak 
Acked-by: Naveen Yerramneni 
---
 controller/ovn-controller.c | 1 +
 controller/pinctrl.c| 4 
 2 files changed, 5 insertions(+)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 6b38f113d..982378a50 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -1128,6 +1128,7 @@ ctrl_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 ovsdb_idl_add_table(ovs_idl, &ovsrec_table_queue);
 ovsdb_idl_add_column(ovs_idl, &ovsrec_queue_col_other_config);
 ovsdb_idl_add_column(ovs_idl, &ovsrec_queue_col_external_ids);
+ovsdb_idl_add_column(ovs_idl, &ovsrec_interface_col_link_state);

 chassis_register_ovs_idl(ovs_idl);
 encaps_register_ovs_idl(ovs_idl);
diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 6a2c3dc68..b5d3162b8 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -6375,6 +6375,10 @@ get_localnet_vifs_l3gwports(
 if (!pb || pb->chassis != chassis) {
 continue;
 }
+if (!iface_rec->link_state ||
+strcmp(iface_rec->link_state, "up")) {
+continue;
+}
 struct local_datapath *ld
 = get_local_datapath(local_datapaths,
  pb->datapath->tunnel_key);
--
2.22.3
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 2/2] ipf: Handle common case of ipf defragmentation.

2024-06-03 Thread Simon Horman
On Thu, May 16, 2024 at 11:38:32AM -0400, Mike Pattrick wrote:
> When conntrack is reassembling packet fragments, the same reassembly
> context can be shared across multiple threads handling different packets
> simultaneously. Once a full packet is assembled, it is added to a packet
> batch for processing, in the case where there are multiple different pmd
> threads accessing conntrack simultaneously, there is a race condition
> where the reassembled packet may be added to an arbitrary batch even if
> the current batch is available.
> 
> When this happens, the packet may be handled incorrectly as it is
> inserted into a random openflow execution pipeline, instead of the
> pipeline for that packets flow.
> 
> This change makes a best effort attempt to try to add the defragmented
> packet to the current batch. directly. This should succeed most of the
> time.
> 
> Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
> Reported-at: https://issues.redhat.com/browse/FDP-560
> Signed-off-by: Mike Pattrick 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 1/2] ipf: Only add fragments to batch of same dl_type.

2024-06-03 Thread Simon Horman
On Thu, May 16, 2024 at 11:38:31AM -0400, Mike Pattrick wrote:
> When conntrack is reassembling packet fragments, the same reassembly
> context can be shared across multiple threads handling different packets
> simultaneously. Once a full packet is assembled, it is added to a packet
> batch for processing, this is most likely the batch that added it in the
> first place, but that isn't a guarantee.
> 
> The packets in these batches should be segregated by network protocol
> version (ipv4 vs ipv6) for conntrack defragmentation to function
> appropriately. However, there are conditions where we would add a
> reassembled packet of one type to a batch of another.
> 
> This change introduces checks to make sure that reassembled or expired
> fragments are only added to packet batches of the same type.
> 
> Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
> Reported-at: https://issues.redhat.com/browse/FDP-560
> Signed-off-by: Mike Pattrick 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: Fix non-portable plus match in python vlog test.

2024-06-03 Thread Ilya Maximets
On 6/3/24 13:14, Eelco Chaudron wrote:
> 
> 
> On 3 Jun 2024, at 13:12, Ilya Maximets wrote:
> 
>> '\+' as a one-or-more match is a GNU extension and it doesn't work
>> in BSD sed.  This makes the python vlog test to fail on FreeBSD 14
>> that recently got python 3.11 in CirrusCI:
>>
>>  |  --- -   2024-06-03 10:42:26.363566000 +
>>  |  +++ /tmp/cirrus-ci-build/tests/testsuite.dir/at-groups/2541/stdout
>>  |  @@ -7,31 +7,37 @@
>>  |   Traceback (most recent call last):
>>  | File , line , in main
>>  |   assert fail
>>  |  +   
>>
>> Remove the '\+' match to make the line removal work.  It doesn't do
>> much for us as we would remove the same lines either way.
>>
>> This change makes CirruCI green again.
>>
>> Fixes: 9185793e7543 ("tests: Fix compatibility issue with Python 3.13 in 
>> vlog.at.")
>> Signed-off-by: Ilya Maximets 
> 
> The change looks good to me (I did not test it).
> 
> Acked-by: Eelco Chaudron 

Thanks!  I retested this and applied to all branches down to 2.17.

Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ovsdb: Use table indexes if available for ovsdb_query().

2024-06-03 Thread Simon Horman
On Mon, Jun 03, 2024 at 12:20:36AM -0400, Mike Pattrick wrote:
> Currently all OVSDB database queries except for UUID lookups all result
> in linear lookups over the entire table, even if an index is present.
> 
> This patch modifies ovsdb_query() to attempt an index lookup first, if
> possible. If no matching indexes are present then a linear index is
> still conducted.
> 
> Reported-at: https://issues.redhat.com/browse/FDP-590
> Signed-off-by: Mike Pattrick 

...

> diff --git a/ovsdb/query.c b/ovsdb/query.c
> index eebe56412..e3e50a034 100644
> --- a/ovsdb/query.c
> +++ b/ovsdb/query.c
> @@ -21,32 +21,116 @@
>  #include "condition.h"
>  #include "row.h"
>  #include "table.h"
> +#include "transaction.h"
> +
> +static bool
> +ovsdb_query_index(struct ovsdb_table *table,
> +  const struct ovsdb_condition *cnd,
> +  const struct ovsdb_row **out)
> +{
> +for (size_t idx = 0; idx < table->schema->n_indexes; idx++) {
> +const struct ovsdb_column_set *index = &table->schema->indexes[idx];
> +struct hmap_node *node;
> +size_t matches = 0;
> +uint32_t hash = 0;
> +
> +if (index->n_columns != cnd->n_clauses) {
> +continue;
> +}
> +
> +/* The conditions may not be in the same order as the index. */
> +for (size_t c = 0; c < cnd->n_clauses; c++) {
> +const struct ovsdb_clause *cnd_cls = &cnd->clauses[c];
> +
> +if (cnd_cls->function != OVSDB_F_EQ) {
> +return false;
> +}
> +
> +for (size_t i = 0; i < index->n_columns; i++) {
> +const struct ovsdb_column *idx_col = index->columns[i];
> +
> +if (cnd_cls->index == idx_col->index) {
> +hash = ovsdb_datum_hash(&cnd_cls->arg, &idx_col->type,
> +hash);
> +matches++;
> +break;
> +}
> +}
> +
> +/* If none of the indexed columns match, continue to the next
> + * index. */
> +if (matches == c) {
> +break;
> +}
> +}
> +
> +if (matches != cnd->n_clauses) {
> +continue;
> +}
> +
> +for (node = hmap_first_with_hash(&table->indexes[idx], hash); node;
> + node = hmap_next_with_hash(node)) {
> +struct ovsdb_row *irow = ovsdb_row_from_index_node(node, table,
> +   idx);
> +
> +for (size_t c = 0; c < cnd->n_clauses; c++) {
> +const struct ovsdb_clause *cnd_cls = &cnd->clauses[c];
> +
> +if (!ovsdb_datum_equals(&cnd_cls->arg,
> +&irow->fields[cnd_cls->index],
> +&cnd_cls->column->type)) {
> +irow = NULL;
> +break;
> +}
> +}
> +
> +if (irow) {
> +*out = irow;
> +return true;
> +}
> +}
> +
> +/* In the case that there was a matching index but no matching row, 
> the
> + * index check is still considered to be a success. */
> +return true;

Hi Mike,

Maybe I misread it, but it seems that the code above implements:
1. If a row is found, return true
2. Otherwise, returns true

If so, then is there a need 1?

> +}
> +return false;
> +}

...
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/1] ofproto-dpif-upcall: Avoid stale ukeys leaks.

2024-06-03 Thread Eelco Chaudron


On 3 Jun 2024, at 10:07, Roi Dayan wrote:

> On 03/06/2024 10:18, Roi Dayan wrote:
>>
>>
>> On 30/05/2024 18:48, Eelco Chaudron wrote:
>>>
>>>
>>> On 23 May 2024, at 12:46, Roi Dayan via dev wrote:
>>>
 It is observed in some environments that there are much more ukeys than
 actual DP flows. For example:

 $ ovs-appctl upcall/show
 system@ovs-system:
 flows : (current 7) (avg 6) (max 117) (limit 2125)
 offloaded flows : 525
 dump duration : 1063ms
 ufid enabled : true

 23: (keys 3612)
 24: (keys 3625)
 25: (keys 3485)

 The revalidator threads are busy revalidating the stale ukeys leading to
 high CPU and long dump duration.

 There are some possible situations that may result in stale ukeys that
 have no corresponding DP flows.

 In revalidator, push_dp_ops() doesn't check error if the op type is not
 DEL. It is possible that a PUT(MODIFY) fails, especially for tc offload
 case, where the old flow is deleted first and then the new one is
 created. If the creation fails, the ukey will be stale (no corresponding
 DP flow). This patch adds a warning in such case.
>>>
>>> Not sure I understand, this behavior should be captured by the 
>>> UKEY_INCONSISTENT state.
>>
>> Hi Eelco,
>>
>> thanks for reviewing.
>> We started the patch on older branch as we didn't rebase for some time
>> and got a little later on sending it.
>> I see the addition now for UKEY_INCOSISTENT in the following patch:
>>
>> cf11766cbcf1 ofproto-dpif-upcall: Fix push_dp_ops to handle all errors.
>>
>> Looks like it handles the same situation we tried to handle in this patch.
>>
>>>
 Another possible scenario is in handle_upcalls() if a PUT operation did
 not succeed and op->error attribute was not set correctly it can lead to
 stale ukey in operational state.
>>>
>>>
>>> Guess we need to figure out these cases, as these are the root cause of 
>>> your problem.
>>
>> right. maybe. but this could help keep the system alive/clean while logging 
>> the
>> bad situation instead of having increase in those stale/inconsistent ukeys.
>> I understand if it's not nice to handle it like this.
>>
>
> Hi Eelco,
>
> I remember now one of the reproduction scenarios we did is do some traffic
> to get rules added using TC and then delete those from tc command line
> and it got to stale ukeys.
> The revalidator dump thread not seeing the rules so not updating anything
> and sweep over the ukeys skipping them as well.
> This is why we checked against the timing stats of the ukey.
>
> I remember I tested on the upstream code and not our development branch
> and it reproduces. I didn't notice if the commit adding UKEY_INCONSISTENT
> existed but it handles errors from adding flows so I dont think it matters.
>
> I'll try to check and verify again but I think it's still here.
> So while cases getting dop.error handled with UKEY_INCONSISTENT,
> this case I think is not.

I think you are right this case is not covered by the UKEY_INCONSISTENT test 
below. See my suggestion on fixing this in revalidate_ukey().

Maybe you could also add a test case for this in the offload test suite.

//Eelco

 This patch adds checks in the sweep phase for such ukeys and move them
 to DELETE so that they can be cleared eventually.

 Co-authored-by: Han Zhou 
 Signed-off-by: Han Zhou 
 Signed-off-by: Roi Dayan 
 ---
  ofproto/ofproto-dpif-upcall.c | 15 +++
  1 file changed, 15 insertions(+)

 diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
 index 83609ec62b63..e9520ebdf910 100644
 --- a/ofproto/ofproto-dpif-upcall.c
 +++ b/ofproto/ofproto-dpif-upcall.c
 @@ -57,6 +57,7 @@ COVERAGE_DEFINE(dumped_inconsistent_flow);
  COVERAGE_DEFINE(dumped_new_flow);
  COVERAGE_DEFINE(handler_duplicate_upcall);
  COVERAGE_DEFINE(revalidate_missed_dp_flow);
 +COVERAGE_DEFINE(revalidate_missed_dp_flow_del);
  COVERAGE_DEFINE(ukey_dp_change);
  COVERAGE_DEFINE(ukey_invalid_stat_reset);
  COVERAGE_DEFINE(ukey_replace_contention);
 @@ -278,6 +279,7 @@ enum flow_del_reason {
  FDR_BAD_ODP_FIT,/* Bad ODP flow fit. */
  FDR_FLOW_IDLE,  /* Flow idle timeout. */
  FDR_FLOW_LIMIT, /* Kill all flows condition reached. */
 +FDR_FLOW_STALE, /* Flow stale detected. */
>>>
>>> Please also update the associated script, see above comment.
>>>
  FDR_FLOW_WILDCARDED,/* Flow needs a narrower wildcard mask. */
  FDR_NO_OFPROTO, /* Bridge not found. */
  FDR_PURGE,  /* User requested flow deletion. */
 @@ -2557,6 +2559,10 @@ push_dp_ops(struct udpif *udpif, struct ukey_op 
 *ops, size_t n_ops)
>>>
>>>
>>> The (op->dop.error) condition below will never be reached, as it’s 
>>> explicitly checked above. So the error message will never happen. The 
>>> condition you ex

Re: [ovs-dev] [PATCH v3 0/8] Coverity fixes.

2024-06-03 Thread Eelco Chaudron



On 29 May 2024, at 12:53, Eelco Chaudron wrote:

> This series addresses several high-priority Coverity issues.
>
> Delta from v2 -> v3:
> - Cleaned up error message on patch 8.
>
> Delta from v1 -> v2:
> - Add cover letter.
> - Split first patch and update commit subject.
> - Added ds_destroy() to context_done().
> - Changed invalid gso_type handling by returning an error + log message.
> - Changed sFlowRcvrTimeout to be using uint32_t to avoid TIME_T_MAKE.
>
> Eelco Chaudron (8):
>   netdev-linux: Fix possible int overflow in tc_add_matchall_policer().
>   cfm: Fix possible integer overflow in tc_add_matchall_policer().
>   sflow: Replace libc's random() function with the OVS's random_range().
>   sflow: Use uint32_t instead of time_t for tick handling in the poller.
>   sflow: Fix check for disabled receive time.
>   ofproto-dpif: Define age as time_t in ofproto_unixctl_fdb_add().
>   db-ctl-base: Initialize the output variable in the ctx structure.
>   netdev-linux: Fix uninitialized gso_type case.
>
>  lib/cfm.c|  2 +-
>  lib/db-ctl-base.c|  2 ++
>  lib/netdev-linux.c   |  8 ++--
>  lib/sflow_api.h  | 10 +-
>  lib/sflow_poller.c   |  3 ++-
>  lib/sflow_receiver.c |  7 ---
>  ofproto/ofproto-dpif-sflow.c |  2 +-
>  ofproto/ofproto-dpif.c   |  2 +-
>  8 files changed, 22 insertions(+), 14 deletions(-)

Series applied to main. Thank Ilya, Paolo, and Mike for the reviews.

Cheers,

Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: Fix non-portable plus match in python vlog test.

2024-06-03 Thread Eelco Chaudron



On 3 Jun 2024, at 13:12, Ilya Maximets wrote:

> '\+' as a one-or-more match is a GNU extension and it doesn't work
> in BSD sed.  This makes the python vlog test to fail on FreeBSD 14
> that recently got python 3.11 in CirrusCI:
>
>  |  --- -   2024-06-03 10:42:26.363566000 +
>  |  +++ /tmp/cirrus-ci-build/tests/testsuite.dir/at-groups/2541/stdout
>  |  @@ -7,31 +7,37 @@
>  |   Traceback (most recent call last):
>  | File , line , in main
>  |   assert fail
>  |  +   
>
> Remove the '\+' match to make the line removal work.  It doesn't do
> much for us as we would remove the same lines either way.
>
> This change makes CirruCI green again.
>
> Fixes: 9185793e7543 ("tests: Fix compatibility issue with Python 3.13 in 
> vlog.at.")
> Signed-off-by: Ilya Maximets 

The change looks good to me (I did not test it).

Acked-by: Eelco Chaudron 

> ---
>  tests/vlog.at | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tests/vlog.at b/tests/vlog.at
> index efe91479a..2768c0740 100644
> --- a/tests/vlog.at
> +++ b/tests/vlog.at
> @@ -8,7 +8,7 @@ AT_CHECK([$PYTHON3 $srcdir/test-vlog.py --log-file log_file \
>
>  AT_CHECK([sed -e 's/.*-.*-.*T..:..:..Z |//' \
>  -e 's/File ".*", line [[0-9]][[0-9]]*,/File , line ,/' \
> --e '/\^\+/d' \
> +-e '/\^/d' \
>  stderr_log], [0], [dnl
>0  | module_0 | EMER | emergency
>1  | module_0 | ERR | error
> -- 
> 2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 8/8] netdev-linux: Fix uninitialized gso_type case.

2024-06-03 Thread Eelco Chaudron


On 31 May 2024, at 13:37, Ilya Maximets wrote:

> On 5/29/24 12:53, Eelco Chaudron wrote:
>> This patch fixes an uninitialized gso_type case in
>> netdev_linux_prepend_vnet_hdr() by returning an error.
>>
>> Fixes: 3337e6d91c5b ("userspace: Enable L4 checksum offloading by default.")
>> Signed-off-by: Eelco Chaudron 
>> ---
>>  lib/netdev-linux.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
>> index eb0c5c624..dc67e1268 100644
>> --- a/lib/netdev-linux.c
>> +++ b/lib/netdev-linux.c
>> @@ -7167,6 +7167,10 @@ netdev_linux_prepend_vnet_hdr(struct dp_packet *b, 
>> int mtu)
>>  vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
>>  } else if (dp_packet_hwol_tx_ipv6(b)) {
>>  vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
>> +} else {
>> +VLOG_ERR_RL(&rl, "Unknown gso_type for TSO packet. Flags: 
>> %"PRIx64,
>
> I'd suggest adding the # qualifier to the format string, i.e. %#"PRIx64.
> Can be fixed on commit, I suppose.

Thanks for the review! I’ll fix on commit, as I totally missed the # sign :(.

> Acked-by: Ilya Maximets 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] tests: Fix non-portable plus match in python vlog test.

2024-06-03 Thread Ilya Maximets
'\+' as a one-or-more match is a GNU extension and it doesn't work
in BSD sed.  This makes the python vlog test to fail on FreeBSD 14
that recently got python 3.11 in CirrusCI:

 |  --- -   2024-06-03 10:42:26.363566000 +
 |  +++ /tmp/cirrus-ci-build/tests/testsuite.dir/at-groups/2541/stdout
 |  @@ -7,31 +7,37 @@
 |   Traceback (most recent call last):
 | File , line , in main
 |   assert fail
 |  +   

Remove the '\+' match to make the line removal work.  It doesn't do
much for us as we would remove the same lines either way.

This change makes CirruCI green again.

Fixes: 9185793e7543 ("tests: Fix compatibility issue with Python 3.13 in 
vlog.at.")
Signed-off-by: Ilya Maximets 
---
 tests/vlog.at | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/vlog.at b/tests/vlog.at
index efe91479a..2768c0740 100644
--- a/tests/vlog.at
+++ b/tests/vlog.at
@@ -8,7 +8,7 @@ AT_CHECK([$PYTHON3 $srcdir/test-vlog.py --log-file log_file \
 
 AT_CHECK([sed -e 's/.*-.*-.*T..:..:..Z |//' \
 -e 's/File ".*", line [[0-9]][[0-9]]*,/File , line ,/' \
--e '/\^\+/d' \
+-e '/\^/d' \
 stderr_log], [0], [dnl
   0  | module_0 | EMER | emergency
   1  | module_0 | ERR | error
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] utilties: Allow ovn-detrace to run on ovs-ofctl dump-flows output.

2024-06-03 Thread 0-day Robot
Bleep bloop.  Greetings Ales Musil, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line lacks whitespace around operator
#189 FILE: utilities/ovn-debug.c:105:
uuid-to-cookie UUID\n\

Lines checked: 229, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] utilties: Allow ovn-detrace to run on ovs-ofctl dump-flows output.

2024-06-03 Thread Ales Musil
The ovs-ofctl dump-flows output is slightly different from oproto/trace
cookie 0xXXX vs. cookie=0xXXX. Update the regex that it also matches
on the equals case. This allows us to run ovn-detrace against the
ovs-ofctl dump-flows output.

Also provide simple, partially hardcoded test case for ovn-detrace.

Signed-off-by: Ales Musil 
---
 tests/automake.mk   |   3 +-
 tests/ovn-util.at   | 108 
 tests/testsuite.at  |   1 +
 utilities/ovn-debug.c   |  18 ++
 utilities/ovn_detrace.py.in |   5 +-
 5 files changed, 132 insertions(+), 3 deletions(-)
 create mode 100644 tests/ovn-util.at

diff --git a/tests/automake.mk b/tests/automake.mk
index 1fdc89835..3899c9e80 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -45,7 +45,8 @@ TESTSUITE_AT = \
tests/ovn-lflow-cache.at \
tests/ovn-lflow-conj-ids.at \
tests/ovn-ipsec.at \
-   tests/ovn-vif-plug.at
+   tests/ovn-vif-plug.at \
+   tests/ovn-util.at
 
 SYSTEM_DPDK_TESTSUITE_AT = \
tests/system-dpdk-testsuite.at \
diff --git a/tests/ovn-util.at b/tests/ovn-util.at
new file mode 100644
index 0..fd3282548
--- /dev/null
+++ b/tests/ovn-util.at
@@ -0,0 +1,108 @@
+AT_SETUP([ovn-detrace - simple scenario])
+AT_SKIP_IF([test $HAVE_SCAPY = no])
+ovn_start
+net_add n1
+
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+ovs-vsctl -- add-port br-int vm0 -- \
+set interface vm0 external-ids:iface-id=vm0
+
+ovs-vsctl -- add-port br-int vm1 -- \
+set interface vm1 external-ids:iface-id=vm1
+
+ovn-nbctl ls-add ls \
+-- set logical_switch ls other-config:requested-tnl-key=1
+
+ovn-nbctl lsp-add ls vm0 \
+-- lsp-set-addresses vm0 "f0:00:00:01:01:00 192.168.1.10" \
+-- set logical_switch_port vm0 options:requested-tnl-key=10
+ovn-nbctl lsp-add ls vm1 \
+-- lsp-set-addresses vm1 "f0:00:00:01:01:01 192.168.1.11" \
+-- set logical_switch_port vm1 options:requested-tnl-key=11
+
+# Allow some time for ovn-northd and ovn-controller to catch up.
+wait_for_ports_up
+check ovn-nbctl --wait=hv sync
+
+ingress_table=$(ovn-debug lflow-stage-to-ltable ls_in_check_port_sec)
+egress_table=$(ovn-debug lflow-stage-to-ltable ls_out_apply_port_sec)
+dp_uuid=$(fetch_column datapath _uuid external_ids:name=ls)
+pb_vm0=$(ovn-debug uuid-to-cookie $(fetch_column port_binding _uuid \
+logical_port=vm0))
+pb_vm1=$(ovn-debug uuid-to-cookie $(fetch_column port_binding _uuid \
+logical_port=vm1))
+ingress=$(ovn-debug uuid-to-cookie $(fetch_column logical_flow _uuid \
+table_id=$ingress_table pipeline=ingress match="1"))
+egress=$(ovn-debug uuid-to-cookie $(fetch_column logical_flow _uuid \
+table_id=$egress_table pipeline=egress match="1"))
+
+cat << EOF > trace
+0. in_port=1, priority 100, cookie $pb_vm0
+set_field:0x4/0x->reg13
+set_field:0x1->reg11
+set_field:0x1->reg12
+set_field:0x1->metadata
+set_field:0x1->reg14
+set_field:0/0x->reg13
+resubmit(,??)
+8. metadata=0x1, priority 50, cookie $ingress
+set_field:0/0x1000->reg10
+resubmit(,??)
+51. metadata=0x1, priority 0, cookie $egress
+resubmit(,??)
+65. reg15=0x2,metadata=0x1, priority 100, cookie $pb_vm1
+output:2
+EOF
+
+AT_CHECK_UNQUOTED([cat trace | $PYTHON 
$top_srcdir/utilities/ovn_detrace.py.in], [0], [dnl
+0. in_port=1, priority 100, cookie $pb_vm0
+set_field:0x4/0x->reg13
+set_field:0x1->reg11
+set_field:0x1->reg12
+set_field:0x1->metadata
+set_field:0x1->reg14
+set_field:0/0x->reg13
+resubmit(,??)
+  * Logical datapath: "ls" ($dp_uuid)
+  * Port Binding: logical_port "vm0", tunnel_key 10, chassis-name "hv1", 
chassis-str "hv1"
+8. metadata=0x1, priority 50, cookie $ingress
+set_field:0/0x1000->reg10
+resubmit(,??)
+  * Logical datapaths:
+  * "ls" ($dp_uuid) [[ingress]]
+  * Logical flow: table=$ingress_table (ls_in_check_port_sec), priority=50, 
match=(1), actions=(reg0[[15]] = check_in_port_sec(); next;)
+51. metadata=0x1, priority 0, cookie $egress
+resubmit(,??)
+  * Logical datapaths:
+  * "ls" ($dp_uuid) [[egress]]
+  * Logical flow: table=$egress_table (ls_out_apply_port_sec), priority=0, 
match=(1), actions=(output;)
+65. reg15=0x2,metadata=0x1, priority 100, cookie $pb_vm1
+output:2
+  * Logical datapath: "ls" ($dp_uuid)
+  * Port Binding: logical_port "vm1", tunnel_key 11, chassis-name "hv1", 
chassis-str "hv1"
+
+])
+
+ovs-ofctl dump-flows br-int table=$(ovn-debug lflow-stage-to-oftable 
ls_in_check_port_sec),cookie=$ingress/0x >> flows
+ovs-ofctl dump-flows br-int table=$(ovn-debug lflow-stage-to-oftable 
ls_out_apply_port_sec),cookie=$egress/0x >> flows
+
+AT_CHECK_UNQUOTED([cat flows | awk '{print $1, $7, $8}' | grep -v "NXST_FLOW" 
| \
+   sed -e "s/resubmit(,[[0-9]]\+)/resubmit(,??)/g" | \
+   $PYTHON $top_srcdir/utilities/ovn_detrace.py.in], [0], [dnl
+cookie=$ingress, priority=50,metadata=0x1 
actions=load:0->NXM_NX_REG10[[12]],resubmit(,??),move:NXM_NX

Re: [ovs-dev] [PATCH 1/1] ofproto-dpif-upcall: Avoid stale ukeys leaks.

2024-06-03 Thread Roi Dayan via dev


On 03/06/2024 10:18, Roi Dayan wrote:
> 
> 
> On 30/05/2024 18:48, Eelco Chaudron wrote:
>>
>>
>> On 23 May 2024, at 12:46, Roi Dayan via dev wrote:
>>
>>> It is observed in some environments that there are much more ukeys than
>>> actual DP flows. For example:
>>>
>>> $ ovs-appctl upcall/show
>>> system@ovs-system:
>>> flows : (current 7) (avg 6) (max 117) (limit 2125)
>>> offloaded flows : 525
>>> dump duration : 1063ms
>>> ufid enabled : true
>>>
>>> 23: (keys 3612)
>>> 24: (keys 3625)
>>> 25: (keys 3485)
>>>
>>> The revalidator threads are busy revalidating the stale ukeys leading to
>>> high CPU and long dump duration.
>>>
>>> There are some possible situations that may result in stale ukeys that
>>> have no corresponding DP flows.
>>>
>>> In revalidator, push_dp_ops() doesn't check error if the op type is not
>>> DEL. It is possible that a PUT(MODIFY) fails, especially for tc offload
>>> case, where the old flow is deleted first and then the new one is
>>> created. If the creation fails, the ukey will be stale (no corresponding
>>> DP flow). This patch adds a warning in such case.
>>
>> Not sure I understand, this behavior should be captured by the 
>> UKEY_INCONSISTENT state.
> 
> Hi Eelco,
> 
> thanks for reviewing.
> We started the patch on older branch as we didn't rebase for some time
> and got a little later on sending it.
> I see the addition now for UKEY_INCOSISTENT in the following patch:
> 
> cf11766cbcf1 ofproto-dpif-upcall: Fix push_dp_ops to handle all errors.
> 
> Looks like it handles the same situation we tried to handle in this patch.
> 
>>
>>> Another possible scenario is in handle_upcalls() if a PUT operation did
>>> not succeed and op->error attribute was not set correctly it can lead to
>>> stale ukey in operational state.
>>
>>
>> Guess we need to figure out these cases, as these are the root cause of your 
>> problem.
> 
> right. maybe. but this could help keep the system alive/clean while logging 
> the
> bad situation instead of having increase in those stale/inconsistent ukeys.
> I understand if it's not nice to handle it like this.
> 

Hi Eelco,

I remember now one of the reproduction scenarios we did is do some traffic
to get rules added using TC and then delete those from tc command line
and it got to stale ukeys.
The revalidator dump thread not seeing the rules so not updating anything
and sweep over the ukeys skipping them as well.
This is why we checked against the timing stats of the ukey.

I remember I tested on the upstream code and not our development branch
and it reproduces. I didn't notice if the commit adding UKEY_INCONSISTENT
existed but it handles errors from adding flows so I dont think it matters.

I'll try to check and verify again but I think it's still here.
So while cases getting dop.error handled with UKEY_INCONSISTENT,
this case I think is not.

Thanks,
Roi

>>
>>>
>>> This patch adds checks in the sweep phase for such ukeys and move them
>>> to DELETE so that they can be cleared eventually.
>>>
>>> Co-authored-by: Han Zhou 
>>> Signed-off-by: Han Zhou 
>>> Signed-off-by: Roi Dayan 
>>> ---
>>>  ofproto/ofproto-dpif-upcall.c | 15 +++
>>>  1 file changed, 15 insertions(+)
>>>
>>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>>> index 83609ec62b63..e9520ebdf910 100644
>>> --- a/ofproto/ofproto-dpif-upcall.c
>>> +++ b/ofproto/ofproto-dpif-upcall.c
>>> @@ -57,6 +57,7 @@ COVERAGE_DEFINE(dumped_inconsistent_flow);
>>>  COVERAGE_DEFINE(dumped_new_flow);
>>>  COVERAGE_DEFINE(handler_duplicate_upcall);
>>>  COVERAGE_DEFINE(revalidate_missed_dp_flow);
>>> +COVERAGE_DEFINE(revalidate_missed_dp_flow_del);
>>>  COVERAGE_DEFINE(ukey_dp_change);
>>>  COVERAGE_DEFINE(ukey_invalid_stat_reset);
>>>  COVERAGE_DEFINE(ukey_replace_contention);
>>> @@ -278,6 +279,7 @@ enum flow_del_reason {
>>>  FDR_BAD_ODP_FIT,/* Bad ODP flow fit. */
>>>  FDR_FLOW_IDLE,  /* Flow idle timeout. */
>>>  FDR_FLOW_LIMIT, /* Kill all flows condition reached. */
>>> +FDR_FLOW_STALE, /* Flow stale detected. */
>>
>> Please also update the associated script, see above comment.
>>
>>>  FDR_FLOW_WILDCARDED,/* Flow needs a narrower wildcard mask. */
>>>  FDR_NO_OFPROTO, /* Bridge not found. */
>>>  FDR_PURGE,  /* User requested flow deletion. */
>>> @@ -2557,6 +2559,10 @@ push_dp_ops(struct udpif *udpif, struct ukey_op 
>>> *ops, size_t n_ops)
>>
>>
>> The (op->dop.error) condition below will never be reached, as it’s 
>> explicitly checked above. So the error message will never happen. The 
>> condition you explain is already covered by UKEY_INCONSISTENT. Not sure if 
>> we need a log message for this either at warn level, maybe debug, especially 
>> as you say it can happen often.
>>
> 
> right. as replied above. I didn't notice this when I did the cherry-pick as 
> git didn't
> show me a conflict in this area at all.
> This commit was done before the change so we did hit 

Re: [ovs-dev] [PATCH 1/1] ofproto-dpif-upcall: Avoid stale ukeys leaks.

2024-06-03 Thread Roi Dayan via dev


On 30/05/2024 18:48, Eelco Chaudron wrote:
> 
> 
> On 23 May 2024, at 12:46, Roi Dayan via dev wrote:
> 
>> It is observed in some environments that there are much more ukeys than
>> actual DP flows. For example:
>>
>> $ ovs-appctl upcall/show
>> system@ovs-system:
>> flows : (current 7) (avg 6) (max 117) (limit 2125)
>> offloaded flows : 525
>> dump duration : 1063ms
>> ufid enabled : true
>>
>> 23: (keys 3612)
>> 24: (keys 3625)
>> 25: (keys 3485)
>>
>> The revalidator threads are busy revalidating the stale ukeys leading to
>> high CPU and long dump duration.
>>
>> There are some possible situations that may result in stale ukeys that
>> have no corresponding DP flows.
>>
>> In revalidator, push_dp_ops() doesn't check error if the op type is not
>> DEL. It is possible that a PUT(MODIFY) fails, especially for tc offload
>> case, where the old flow is deleted first and then the new one is
>> created. If the creation fails, the ukey will be stale (no corresponding
>> DP flow). This patch adds a warning in such case.
> 
> Not sure I understand, this behavior should be captured by the 
> UKEY_INCONSISTENT state.

Hi Eelco,

thanks for reviewing.
We started the patch on older branch as we didn't rebase for some time
and got a little later on sending it.
I see the addition now for UKEY_INCOSISTENT in the following patch:

cf11766cbcf1 ofproto-dpif-upcall: Fix push_dp_ops to handle all errors.

Looks like it handles the same situation we tried to handle in this patch.

> 
>> Another possible scenario is in handle_upcalls() if a PUT operation did
>> not succeed and op->error attribute was not set correctly it can lead to
>> stale ukey in operational state.
> 
> 
> Guess we need to figure out these cases, as these are the root cause of your 
> problem.

right. maybe. but this could help keep the system alive/clean while logging the
bad situation instead of having increase in those stale/inconsistent ukeys.
I understand if it's not nice to handle it like this.

> 
>>
>> This patch adds checks in the sweep phase for such ukeys and move them
>> to DELETE so that they can be cleared eventually.
>>
>> Co-authored-by: Han Zhou 
>> Signed-off-by: Han Zhou 
>> Signed-off-by: Roi Dayan 
>> ---
>>  ofproto/ofproto-dpif-upcall.c | 15 +++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>> index 83609ec62b63..e9520ebdf910 100644
>> --- a/ofproto/ofproto-dpif-upcall.c
>> +++ b/ofproto/ofproto-dpif-upcall.c
>> @@ -57,6 +57,7 @@ COVERAGE_DEFINE(dumped_inconsistent_flow);
>>  COVERAGE_DEFINE(dumped_new_flow);
>>  COVERAGE_DEFINE(handler_duplicate_upcall);
>>  COVERAGE_DEFINE(revalidate_missed_dp_flow);
>> +COVERAGE_DEFINE(revalidate_missed_dp_flow_del);
>>  COVERAGE_DEFINE(ukey_dp_change);
>>  COVERAGE_DEFINE(ukey_invalid_stat_reset);
>>  COVERAGE_DEFINE(ukey_replace_contention);
>> @@ -278,6 +279,7 @@ enum flow_del_reason {
>>  FDR_BAD_ODP_FIT,/* Bad ODP flow fit. */
>>  FDR_FLOW_IDLE,  /* Flow idle timeout. */
>>  FDR_FLOW_LIMIT, /* Kill all flows condition reached. */
>> +FDR_FLOW_STALE, /* Flow stale detected. */
> 
> Please also update the associated script, see above comment.
> 
>>  FDR_FLOW_WILDCARDED,/* Flow needs a narrower wildcard mask. */
>>  FDR_NO_OFPROTO, /* Bridge not found. */
>>  FDR_PURGE,  /* User requested flow deletion. */
>> @@ -2557,6 +2559,10 @@ push_dp_ops(struct udpif *udpif, struct ukey_op *ops, 
>> size_t n_ops)
> 
> 
> The (op->dop.error) condition below will never be reached, as it’s explicitly 
> checked above. So the error message will never happen. The condition you 
> explain is already covered by UKEY_INCONSISTENT. Not sure if we need a log 
> message for this either at warn level, maybe debug, especially as you say it 
> can happen often.
> 

right. as replied above. I didn't notice this when I did the cherry-pick as git 
didn't
show me a conflict in this area at all.
This commit was done before the change so we did hit this spot. we got little 
off
checking the commit and sending it on time for review. sorry about that.
cf11766cbcf1 ofproto-dpif-upcall: Fix push_dp_ops to handle all errors.

I think we should redo our checks with the commit adding the UKEY_INCONSISTENT
state and see our result again.

thanks for the review.


>>  if (op->dop.type != DPIF_OP_FLOW_DEL) {
>>  /* Only deleted flows need their stats pushed. */
>> +if (op->dop.error) {
>> +VLOG_WARN_RL(&rl, "push_dp_ops: error %d in op type %d, 
>> ukey "
>> + "%p", op->dop.error, op->dop.type, op->ukey);
>> +}
>>  continue;
>>  }
>>
>> @@ -3027,6 +3033,15 @@ revalidator_sweep__(struct revalidator *revalidator, 
>> bool purge)
>>  del_reason = purge ? FDR_PURGE : FDR_UPDATE_FAIL;
>>  } else if (!seq_mismatch) {
>>