[ovs-dev] [PATCH ovn] patch.c: Avoid patch interface deletion & recreation during restart.

2022-06-27 Thread Han Zhou
When ovn-controller is restarted, it may need multiple iterations of
main loop before completely download all related data from SB DB,
especially when ovn-monitor-all=false, so after restart, before it sees
the related localnet ports from SB DB, it treats the related patch
ports on the chassis as not needed, and deleted them. Later when it
downloads thoses port-bindings it recreates them.  For a graceful
upgrade, we don't this to happen, because it would break the traffic.

This is especially problematic at ovn-k8s setups because the external
side of the patch port is used to program some flows for external
network communication. When the patch ports are recreated, the OF port
number changes and those flows are broken. The CMS would detect and fix
the flows but it would result in even longer downtime.

This patch postpone the deletion of the patch ports, with the assumption
that leaving the unused ports on the chassis for little longer is not
harmful.

Signed-off-by: Han Zhou 
---
 controller/patch.c  | 15 -
 tests/ovn-controller.at | 71 +
 2 files changed, 85 insertions(+), 1 deletion(-)

diff --git a/controller/patch.c b/controller/patch.c
index ed831302c..9faae5879 100644
--- a/controller/patch.c
+++ b/controller/patch.c
@@ -307,11 +307,24 @@ patch_run(struct ovsdb_idl_txn *ovs_idl_txn,
 
 /* Now 'existing_ports' only still contains patch ports that exist in the
  * database but shouldn't.  Delete them from the database. */
+
+/* Wait for some iterations before really deleting any patch ports, because
+ * with conditional monitoring it is possible that related SB data is not
+ * completely downloaded yet after last restart of ovn-controller.
+ * Otherwise it may cause unncessary dataplane interruption during
+ * restart/upgrade. */
+static int delete_patch_port_delay = 10;
+if (delete_patch_port_delay > 0) {
+delete_patch_port_delay--;
+}
+
 struct shash_node *port_node;
 SHASH_FOR_EACH_SAFE (port_node, _ports) {
 port = port_node->data;
 shash_delete(_ports, port_node);
-remove_port(bridge_table, port);
+if (!delete_patch_port_delay) {
+remove_port(bridge_table, port);
+}
 }
 shash_destroy(_ports);
 }
diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
index b8db342b9..a3a9122ab 100644
--- a/tests/ovn-controller.at
+++ b/tests/ovn-controller.at
@@ -2242,3 +2242,74 @@ OVS_WAIT_UNTIL([test 6 = $(as hv1 ovs-ofctl dump-flows 
br-int | grep -c $(port_b
 
 OVN_CLEANUP([hv1])
 AT_CLEANUP
+
+AT_SETUP([ovn-controller - restart should not delete patch ports])
+
+ovn_start
+
+net_add n1
+sim_add hv1
+as hv1
+check ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+check ovs-vsctl add-port br-int hv1-vif1 -- \
+set interface hv1-vif1 external-ids:iface-id=lsp1
+check ovs-vsctl add-br br-ext -- \
+set open . external-ids:ovn-bridge-mappings=extnet:br-ext
+
+# Create logical topology so that lsp1 has multiple hops to a localnet port,
+# which would require multiple iterations to download the related datapaths and
+# port_bindings from SB DB during startup.
+#
+# lsp1@hv1 -- ls1 -- lr1 -- ls2 -- lr2 -- ls-ext -- lsp-ext (localnet)
+
+check ovn-appctl -t ovn-controller vlog/set file:dbg
+check ovn-nbctl ls-add ls1 \
+-- lsp-add ls1 lsp1 \
+-- lsp-set-addresses lsp1 "f0:00:00:00:00:01 10.0.1.2"
+check ovn-nbctl lr-add lr1 \
+-- lrp-add lr1 lrp-lr1-ls1 f0:00:aa:00:01:01 10.0.1.1/24 \
+-- lsp-add ls1 lsp-ls1-lr1 \
+-- lsp-set-type lsp-ls1-lr1 router \
+-- lsp-set-options lsp-ls1-lr1 router-port=lrp-lr1-ls1 \
+-- lsp-set-addresses lsp-ls1-lr1 router
+check ovn-nbctl ls-add ls2 \
+-- lrp-add lr1 lrp-lr1-ls2 f0:00:aa:00:01:02 10.0.2.1/24 \
+-- lsp-add ls2 lsp-ls2-lr1 \
+-- lsp-set-type lsp-ls2-lr1 router \
+-- lsp-set-options lsp-ls2-lr1 router-port=lrp-lr1-ls2 \
+-- lsp-set-addresses lsp-ls2-lr1 router
+check ovn-nbctl lr-add lr2 \
+-- lrp-add lr2 lrp-lr2-ls2 f0:00:aa:00:02:02 10.0.2.2/24 \
+-- lsp-add ls2 lsp-ls2-lr2 \
+-- lsp-set-type lsp-ls2-lr2 router \
+-- lsp-set-options lsp-ls2-lr2 router-port=lrp-lr2-ls2 \
+-- lsp-set-addresses lsp-ls2-lr2 router
+check ovn-nbctl ls-add ls-ext \
+-- lrp-add lr2 lrp-lr2-ext f0:00:aa:00:02:03 10.0.3.1/24 \
+-- lsp-add ls-ext lsp-ext-lr2 \
+-- lsp-set-type lsp-ext-lr2 router \
+-- lsp-set-options lsp-ext-lr2 router-port=lrp-lr2-ext \
+-- lsp-set-addresses lsp-ext-lr2 router
+check ovn-nbctl lsp-add ls-ext lsp-ext \
+-- lsp-set-type lsp-ext localnet \
+-- lsp-set-options lsp-ext network_name=extnet \
+-- lsp-set-addresses lsp-ext unknown
+check ovn-nbctl --wait=hv sync
+
+AT_CHECK([ovs-vsctl list-ports br-int | grep patch-br-int-to-lsp-ext], [0], 
[ignore])
+
+# Stop ovn-controller
+OVS_APP_EXIT_AND_WAIT([ovn-controller])
+
+# Start ovn-controller, which shouldn't cause any patch interface
+# 

Re: [ovs-dev] [PATCH ovn] nb: Remove possibility of disabling logical datapath groups.

2022-06-27 Thread Mark Michelson

Hi Dumitru,

I have a small note below, and with it fixed,

Acked-by: Mark Michelson

On 6/23/22 15:58, Dumitru Ceara wrote:

In large scale scenarios this option hugely reduces the size of the
Southbound database positively affecting end to end performance.  In
such scenarios there's no real reason to ever disable datapath groups.

In lower scale scenarios any potential overhead due to logical datapath
groups is, very likely, negligible.

Aside from potential scalability concerns, the
NB.NB_Global.options:use_logical_dp_group knob was kept until now to
ensure that in case of a bug in the logical datapath groups code a CMS
may turn it off and fall back to the mode in which logical flows are not
grouped together.  As far as I know, this has never happened until now.

Moreover, datpath_groups are enabled by default since v21.09.0 (4 stable
releases ago), via 90daa7ce18dc ("northd: Enable logical dp groups by
default.").

 From a testing perspective removing this knob will halve the CI matrix.
This is desirable, especially in the context of more tests being added,
e.g.:
https://patchwork.ozlabs.org/project/ovn/patch/20220623110239.2973854-1-mh...@redhat.com/

This commit also adds OVN_FOR_EACH_NORTHD to the "ovn-northd -- lr
multiple gw ports NAT" test case.  That was previously skipped because
the duplicate logical flow would have been merged when running with
datapath groups enabled.  Instead, change the expected output to not
include the duplicate.

Signed-off-by: Dumitru Ceara 
---
  NEWS|  1 +
  TODO.rst|  6 +++
  northd/northd.c | 77 ++--
  ovn-nb.xml  | 19 +++--
  tests/ovn-macros.at | 18 +
  tests/ovn-northd.at | 95 +
  tests/ovs-macros.at |  4 +-
  7 files changed, 47 insertions(+), 173 deletions(-)

diff --git a/NEWS b/NEWS
index 3737907ca..20cea579e 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,7 @@ Post v22.06.0
  and ipsec_forceencaps=true (strongswan) to unconditionally enforce
  NAT-T UDP encapsulation. Requires OVS support for IPsec custom tunnel
  options (which will be available in OVS 2.18).
+  - Removed possibility of disabling logical datapath groups.
  
  OVN v22.06.0 - 03 Jun 2022

  --
diff --git a/TODO.rst b/TODO.rst
index 618ea4844..acbcacc4e 100644
--- a/TODO.rst
+++ b/TODO.rst
@@ -170,3 +170,9 @@ OVN To-do List
* physical.c has a global simap -localvif_to_ofport which stores the
  local OVS interfaces and the ofport numbers. Move this to the engine data
  of the engine data node - ed_type_pflow_output.
+
+* ovn-northd parallel logical flow processing
+
+  * Multi-threaded logical flow computation was optimized for the case
+when datapath groups are disabled.  Datpath groups are always enabled
+now so northd parallel processing should be revisited.
diff --git a/northd/northd.c b/northd/northd.c
index 6997c280c..d568b3952 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4896,10 +4896,6 @@ ovn_lflow_equal(const struct ovn_lflow *a, const struct 
ovn_datapath *od,
  && nullable_string_is_equal(a->ctrl_meter, ctrl_meter));
  }
  
-/* If this option is 'true' northd will combine logical flows that differ by

- * logical datapath only by creating a datapath group. */
-static bool use_logical_dp_groups = false;
-
  enum {
  STATE_NULL,   /* parallelization is off */
  STATE_INIT_HASH_SIZES,/* parallelization is on; hashes sizing needed 
*/
@@ -4924,8 +4920,7 @@ ovn_lflow_init(struct ovn_lflow *lflow, struct 
ovn_datapath *od,
  lflow->ctrl_meter = ctrl_meter;
  lflow->dpg = NULL;
  lflow->where = where;
-if ((parallelization_state != STATE_NULL)
-&& use_logical_dp_groups) {
+if (parallelization_state != STATE_NULL) {
  ovs_mutex_init(>odg_lock);
  }
  }
@@ -4935,7 +4930,7 @@ ovn_dp_group_add_with_reference(struct ovn_lflow 
*lflow_ref,
  struct ovn_datapath *od)
  OVS_NO_THREAD_SAFETY_ANALYSIS
  {
-if (!use_logical_dp_groups || !lflow_ref) {
+if (!lflow_ref) {
  return false;
  }
  
@@ -5014,13 +5009,11 @@ do_ovn_lflow_add(struct hmap *lflow_map, struct ovn_datapath *od,

  struct ovn_lflow *old_lflow;
  struct ovn_lflow *lflow;
  
-if (use_logical_dp_groups) {

-old_lflow = ovn_lflow_find(lflow_map, NULL, stage, priority, match,
-   actions, ctrl_meter, hash);
-if (old_lflow) {
-ovn_dp_group_add_with_reference(old_lflow, od);
-return old_lflow;
-}
+old_lflow = ovn_lflow_find(lflow_map, NULL, stage, priority, match,
+   actions, ctrl_meter, hash);
+if (old_lflow) {
+ovn_dp_group_add_with_reference(old_lflow, od);
+return old_lflow;
  }
  
  lflow = xmalloc(sizeof *lflow);

@@ -5076,8 +5069,7 @@ 

Re: [ovs-dev] [PATCH v2 ovn] northd: add condition for stateless nat drop flow in S_ROUTER_IN_GW_REDIRECT pipeline

2022-06-27 Thread Mark Michelson

Thanks for the update, Lorenzo.

Acked-by: Mark Michelson 

On 6/24/22 08:38, Lorenzo Bianconi wrote:

Match the drop flow for stateless dnat_and_snat flow in S_ROUTER_IN_GW_REDIRECT
stage just if allowed_ext_ips or exempted_ext_ips conditions do not
match since it breaks the hairpin scenario with stateless nat.
Fix the northd documentation.

Fixes: c0224a14f ("northd: fix stateless nat with allowed_ext_ips")
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2094980
Signed-off-by: Lorenzo Bianconi 
---
Change since v1:
- fix lflow for exempted_ext_ips in S_ROUTER_IN_GW_REDIRECT stage
---
  northd/northd.c | 47 +
  northd/ovn-northd.8.xml | 28 +++-
  tests/system-ovn.at | 15 +
  3 files changed, 72 insertions(+), 18 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 6997c280c..a9d3b5285 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -12072,6 +12072,7 @@ build_gateway_redirect_flows_for_lrouter(
  }
  for (size_t i = 0; i < od->n_l3dgw_ports; i++) {
  const struct ovsdb_idl_row *stage_hint = NULL;
+bool add_def_flow = true;
  
  if (od->l3dgw_ports[i]->nbrp) {

  stage_hint = >l3dgw_ports[i]->nbrp->header_;
@@ -12090,22 +12091,42 @@ build_gateway_redirect_flows_for_lrouter(
  ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
  ds_cstr(match), ds_cstr(actions),
  stage_hint);
-}
+for (int j = 0; j < od->n_nat_entries; j++) {
+const struct ovn_nat *nat = >nat_entries[j];
  
-for (int i = 0; i < od->n_nat_entries; i++) {

-const struct ovn_nat *nat = >nat_entries[i];
+if (!lrouter_nat_is_stateless(nat->nb) ||
+strcmp(nat->nb->type, "dnat_and_snat") ||
+(!nat->nb->allowed_ext_ips && !nat->nb->exempted_ext_ips)) {
+continue;
+}
  
-if (!lrouter_nat_is_stateless(nat->nb) ||

-strcmp(nat->nb->type, "dnat_and_snat")) {
-   continue;
-}
+struct ds match_ext = DS_EMPTY_INITIALIZER;
+struct nbrec_address_set  *as = nat->nb->allowed_ext_ips
+? nat->nb->allowed_ext_ips : nat->nb->exempted_ext_ips;
+ds_put_format(_ext, "%s && ip%s.src == $%s",
+  ds_cstr(match), nat_entry_is_v6(nat) ? "6" : "4",
+  as->name);
  
-ds_clear(match);

-ds_put_format(match, "ip && ip%s.dst == %s",
-  nat_entry_is_v6(nat) ? "6" : "4",
-  nat->nb->external_ip);
-ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 100,
-  ds_cstr(match), "drop;");
+if (nat->nb->allowed_ext_ips) {
+ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_GW_REDIRECT,
+75, ds_cstr(_ext),
+ds_cstr(actions), stage_hint);
+if (add_def_flow) {
+ds_clear(_ext);
+ds_put_format(_ext, "ip && ip%s.dst == %s",
+  nat_entry_is_v6(nat) ? "6" : "4",
+  nat->nb->external_ip);
+ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 70,
+  ds_cstr(_ext), "drop;");
+add_def_flow = false;
+}
+} else if (nat->nb->exempted_ext_ips) {
+ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_GW_REDIRECT,
+75, ds_cstr(_ext), "drop;",
+stage_hint);
+}
+ds_destroy(_ext);
+}
  }
  
  /* Packets are allowed by default. */

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 59c584710..06773eee4 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -3028,8 +3028,7 @@ icmp6 {
ip  ip6.dst == B
with an action ct_snat; . If the NAT rule is of type
dnat_and_snat and has stateless=true in the
-  options, then the action would be ip4/6.dst=
-  (B).
+  options, then the action would be next;.
  
  
  

@@ -3069,7 +3068,7 @@ icmp6 {
action ct_snat_in_czone; to unSNAT in the common
zone.  If the NAT rule is of type dnat_and_snat and has
stateless=true in the options, then the action
-  would be ip4/6.dst=(B).
+  would be next;.
  
  
  

@@ -4227,6 +4226,26 @@ icmp6 {
  external ip and D is NAT external mac.

  
+  

+For each dnat_and_snat NAT rule with
+stateless=true and allowed_ext_ips
+configured, a priority-75 flow is programmed with match
+

Re: [ovs-dev] [PATCH] dpif-netdev: Refactor AVX512 runtime checks.

2022-06-27 Thread Ilya Maximets
On 6/24/22 09:29, David Marchand wrote:
> As described in the bugzilla below, cpu_has_isa code may be compiled
> with some AVX512 instructions in it, because cpu.c is built as part of
> the libopenvswitchavx512.
> This is a problem when this function (supposed to probe for AVX512
> instructions availability) is invoked from generic OVS code, on older
> CPUs that don't support them.
> 
> For the same reason, dpcls_subtable_avx512_gather_probe,
> dp_netdev_input_outer_avx512_probe, mfex_avx512_probe and
> mfex_avx512_vbmi_probe are potential runtime bombs and can't either be
> built as part of libopenvswitchavx512.
> 
> Move cpu.c to be part of the "normal" libopenvswitch.
> And move other helpers in generic OVS code.
> 
> Note:
> - dpcls_subtable_avx512_gather_probe is split in two, because it also
>   needs to do its own magic,
> - while moving those helpers, prefer direct calls to cpu_has_isa and
>   avoid cast to intermediate integer variables when a simple boolean
>   is enough,
> 
> Fixes: b366fa2f4947 ("dpif-netdev: Call cpuid for x86 isa availability.")
> Reported-at: https://bugzilla.redhat.com/2100393
> Reported-by: Ales Musil 
> Signed-off-by: David Marchand 
> ---

Thanks, David and Ales.  The change looks good to me in general.

Cian, Sunil, could you, please, take a look?

The issue is causing failures in upstream OpenStack CI.
We'll also need to backport this change to stable branches.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn branch-21.12] northd: avoid snat on reply packets

2022-06-27 Thread Mark Michelson

Thank you for the backport, Xavier. I pushed the change to branch-21.12.

On 6/27/22 07:28, Xavier Simonart wrote:

On gateway routers egress packets might be both:
- unDNATted
- SNATted
Reply packets should not be SNATted (they must of course be UnDNATted
if DNAT was applied).

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2061593

Signed-off-by: Xavier Simonart 
Acked-by: Mark Michelson 
Signed-off-by: Numan Siddique 
(cherry picked from commit 8b3e1afc30f3cf0ef9857fdc68f619b6fbed10dc)
---
  northd/northd.c |   1 +
  northd/ovn-northd.8.xml |   3 +-
  tests/ovn-northd.at |  33 +--
  tests/system-ovn.at | 119 
  4 files changed, 139 insertions(+), 17 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index ce78f03de..f51962c4b 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -12991,6 +12991,7 @@ build_lrouter_out_snat_flow(struct hmap *lflows, struct 
ovn_datapath *od,
  ds_put_format(actions, "ip%s.src=%s; next;",
is_v6 ? "6" : "4", nat->external_ip);
  } else {
+ds_put_format(match, " && (!ct.trk || !ct.rpl)");
  ds_put_format(actions, "ct_snat(%s", nat->external_ip);
  
  if (nat->external_port_range[0]) {

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 2b307cef3..2594c6d3b 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -4417,7 +4417,8 @@ nd_ns {
to change the source IP address of a packet from an IP address of
A or to change the source IP address of a packet that
belongs to network A to B, a flow matches
-  ip  ip4.src == A with an action
+  ip  ip4.src == A 
+  (!ct.trk || !ct.rpl) with an action
ct_snat(B);.  The priority of the flow
is calculated based on the mask of A, with matches
having larger masks getting higher priorities. If the NAT rule is
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index d5701a72b..a8689dfb0 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -1018,7 +1018,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows | sed 
's/table=../table=??/' | sort], [0
  AT_CHECK([grep -e "lr_out_snat" crflows | sed 's/table=../table=??/' | sort], 
[0], [dnl
table=??(lr_out_snat), priority=0, match=(1), action=(next;)
table=??(lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 50.0.0.11 
&& ip4.dst == $allowed_range), action=(ct_snat(172.16.1.1);)
+  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 50.0.0.11 && 
ip4.dst == $allowed_range && (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.1);)
  ])
  
  
@@ -1050,7 +1050,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows2 | sed 's/table=../table=??/' | sort], [

  AT_CHECK([grep -e "lr_out_snat" crflows2 | sed 's/table=../table=??/' | 
sort], [0], [dnl
table=??(lr_out_snat), priority=0, match=(1), action=(next;)
table=??(lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 
50.0.0.11), action=(ct_snat(172.16.1.1);)
+  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 50.0.0.11 
&& (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.1);)
table=??(lr_out_snat), priority=35   , match=(ip && ip4.src == 50.0.0.11 
&& ip4.dst == $disallowed_range), action=(next;)
  ])
  
@@ -1079,7 +1079,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows3 | sed 's/table=../table=??/' | sort], [

  AT_CHECK([grep -e "lr_out_snat" crflows3 | sed 's/table=../table=??/' | 
sort], [0], [dnl
table=??(lr_out_snat), priority=0, match=(1), action=(next;)
table=??(lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 50.0.0.11 
&& ip4.dst == $allowed_range), action=(ct_snat(172.16.1.2);)
+  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 50.0.0.11 && 
ip4.dst == $allowed_range && (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.2);)
  ])
  
  # Stateful FIP with DISALLOWED_IPs

@@ -1108,7 +1108,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows4 | sed 
's/table=../table=??/' | sort], [
  AT_CHECK([grep -e "lr_out_snat" crflows4 | sed 's/table=../table=??/' | 
sort], [0], [dnl
table=??(lr_out_snat), priority=0, match=(1), action=(next;)
table=??(lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 
50.0.0.11), action=(ct_snat(172.16.1.2);)
+  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 50.0.0.11 
&& (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.2);)
table=??(lr_out_snat), priority=35   , match=(ip && ip4.src == 50.0.0.11 
&& 

Re: [ovs-dev] [PATCH ovn 1/2] Fix memleak in ovn-nbctl when args can't be parsed

2022-06-27 Thread Mark Michelson

For patches 1 and 2:

Acked-by: Mark Michelson 

Given their simplicity and how obviously correct they are, I went ahead 
and merged these to main, branch-22.06, branch-22.03, and branch-21.12.


On 6/24/22 18:41, Ihar Hrachyshka wrote:

The leak is reported for asan runs.

Signed-off-by: Ihar Hrachyshka 
---
  utilities/ovn-dbctl.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c
index a292e589d..c4cc8c9b2 100644
--- a/utilities/ovn-dbctl.c
+++ b/utilities/ovn-dbctl.c
@@ -202,6 +202,13 @@ ovn_dbctl_main(int argc, char *argv[],
  error = ctl_parse_commands(argc - optind, argv_ + optind,
 _options, , _commands);
  if (error) {
+ovsdb_idl_destroy(idl);
+idl = the_idl = NULL;
+
+for (int i = 0; i < argc; i++) {
+free(argv_[i]);
+}
+free(argv_);
  ctl_fatal("%s", error);
  }
  



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] python: Add Python bindings TODO file.

2022-06-27 Thread Numan Siddique
On Mon, Jun 27, 2022 at 11:27 AM Terry Wilson  wrote:
>
> On Mon, Jun 27, 2022 at 5:44 AM Dumitru Ceara  wrote:
> >
> > For now include the IDL related TODO items as discussed at:
> > https://mail.openvswitch.org/pipermail/ovs-dev/2022-April/393516.html
> >
> > Signed-off-by: Dumitru Ceara 
> > ---
> >  python/TODO.rst| 34 ++
> >  python/automake.mk |  2 ++
> >  2 files changed, 36 insertions(+)
> >  create mode 100644 python/TODO.rst
> >
> > diff --git a/python/TODO.rst b/python/TODO.rst
> > new file mode 100644
> > index ..3a53489f1280
> > --- /dev/null
> > +++ b/python/TODO.rst
> > @@ -0,0 +1,34 @@
> > +..
> > +  Licensed under the Apache License, Version 2.0 (the "License"); you 
> > may
> > +  not use this file except in compliance with the License. You may 
> > obtain
> > +  a copy of the License at
> > +
> > +  http://www.apache.org/licenses/LICENSE-2.0
> > +
> > +  Unless required by applicable law or agreed to in writing, software
> > +  distributed under the License is distributed on an "AS IS" BASIS, 
> > WITHOUT
> > +  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See 
> > the
> > +  License for the specific language governing permissions and 
> > limitations
> > +  under the License.
> > +
> > +  Convention for heading levels in Open vSwitch documentation:
> > +
> > +  ===  Heading 0 (reserved for the title in a document)
> > +  ---  Heading 1
> > +  ~~~  Heading 2
> > +  +++  Heading 3
> > +  '''  Heading 4
> > +
> > +  Avoid deeper levels because they do not render well.
> > +
> > +==
> > +Python Bindings To-do List
> > +==
> > +
> > +* IDL:
> > +
> > +  * Support incremental change tracking monitor mode (equivalent of
> > +OVSDB_IDL_TRACK).
> > +
> > +  * Support write-only-changed monitor mode (equivalent of
> > +OVSDB_IDL_WRITE_CHANGED_ONLY).
> > diff --git a/python/automake.mk b/python/automake.mk
> > index 767512f1757f..89c2ccbb06ae 100644
> > --- a/python/automake.mk
> > +++ b/python/automake.mk
> > @@ -117,3 +117,5 @@ $(srcdir)/python/ovs/dirs.py: 
> > python/ovs/dirs.py.template
> > mv $@.tmp $@
> >  EXTRA_DIST += python/ovs/dirs.py.template
> >  CLEANFILES += python/ovs/dirs.py
> > +
> > +EXTRA_DIST += python/TODO.rst
> > --
> > 2.31.1
> >
>
> Numan's patch for allowing one to specify the uuid via the IDL can go
> in here when it's committed as well. :)
>
> Acked-by: Terry Wilson 

I can respin the uuid patch adding the TODO if this patch gets committed first.

Numan

>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v3 1/1] Allow arbitrary args to be passed to called binary

2022-06-27 Thread Terry Wilson
It is common to pass ovn-ctl options via an /etc/sysconfig file.
It would be useful to be able to pass custom --remote options or
additional DBs to listen to via this file. This would give end
users the ability to have more fine-grained control without
having to modify ovn-ctl.

Signed-off-by: Terry Wilson 
---
 utilities/ovn-ctl   | 22 ++
 utilities/ovn-ctl.8.xml | 14 +-
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
index 23d4d7f8c..93be9b84b 100755
--- a/utilities/ovn-ctl
+++ b/utilities/ovn-ctl
@@ -311,6 +311,10 @@ $cluster_remote_port
 set "$@" --sync-from=`cat $active_conf_file`
 fi
 
+if test X"$extra_args" != X; then
+set "$@" $extra_args
+fi
+
 local run_ovsdb_in_bg="no"
 local process_id=
 if test X$detach = Xno && test $mode = cluster && test -z 
"$cluster_remote_addr" ; then
@@ -541,6 +545,10 @@ start_ic () {
 
 set "$@" $OVN_IC_LOG $ovn_ic_params
 
+if test X"$extra_args" != X; then
+set "$@" $extra_args
+fi
+
 OVS_RUNDIR=${OVS_RUNDIR} start_ovn_daemon "$OVN_IC_PRIORITY" 
"$OVN_IC_WRAPPER" "$@"
 fi
 }
@@ -563,6 +571,10 @@ start_controller () {
 
 [ "$OVN_USER" != "" ] && set "$@" --user "$OVN_USER"
 
+if test X"$extra_args" != X; then
+set "$@" $extra_args
+fi
+
 OVS_RUNDIR=${OVS_RUNDIR} start_ovn_daemon "$OVN_CONTROLLER_PRIORITY" 
"$OVN_CONTROLLER_WRAPPER" "$@"
 }
 
@@ -590,6 +602,10 @@ start_controller_vtep () {
 
 [ "$OVN_USER" != "" ] && set "$@" --user "$OVN_USER"
 
+if test X"$extra_args" != X; then
+set "$@" $extra_args
+fi
+
 OVS_RUNDIR=${OVS_RUNDIR} start_ovn_daemon "$OVN_CONTROLLER_PRIORITY" 
"$OVN_CONTROLLER_WRAPPER" "$@"
 }
 
@@ -1106,8 +1122,10 @@ EOF
 
 set_defaults
 command=
+extra_args=
 for arg
 do
+shift
 case $arg in
 -h | --help)
 usage
@@ -1130,6 +1148,10 @@ do
 type=bool
 set_option
 ;;
+--)
+extra_args=$@
+break
+;;
 -*)
 echo >&2 "$0: unknown option \"$arg\" (use --help for help)"
 exit 1
diff --git a/utilities/ovn-ctl.8.xml b/utilities/ovn-ctl.8.xml
index a1d39b22b..42d16fabc 100644
--- a/utilities/ovn-ctl.8.xml
+++ b/utilities/ovn-ctl.8.xml
@@ -4,7 +4,10 @@
 ovn-ctl -- Open Virtual Network northbound daemon lifecycle utility
 
 Synopsis
-ovn-ctl [options] command
+
+  ovn-ctl [options] command
+  [--- extra_args]
+
 
 Description
 This program is intended to be invoked internally by Open Virtual 
Network
@@ -156,6 +159,15 @@
 --db-nb-probe-interval-to-active=Time in 
milliseconds
 --db-sb-probe-interval-to-active=Time in 
milliseconds
 
+ Extra Options 
+
+  Any options after '--' will be passed on to the binary run by
+  command with the exception of start_northd, which can have
+  options specified in ovn-northd-db-params.conf. Any extra_args
+  passed to start_northd will be passed to the ovsdb-servers if
+  --ovn-manage-ovsdb=yes
+
+
 Configuration files
 Following are the optional configuration files. If present, it should 
be located in the etc dir
 
-- 
2.34.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 0/6] MAC binding aging mechanism

2022-06-27 Thread Dumitru Ceara
On 6/24/22 22:56, Han Zhou wrote:
> On Fri, Jun 24, 2022 at 12:41 PM Numan Siddique  wrote:
>>
>> On Fri, Jun 24, 2022 at 11:49 AM Han Zhou  wrote:
>>>
>>> On Fri, Jun 24, 2022 at 1:11 AM Ales Musil  wrote:

 Hi Han,

 after our discussion I did he suggested test and the throughput does
> not
>>> seem to be affected,
 I did the test with aging set to 2 sec, and during the test period
> (360
>>> sec) the MAC binding was removed multiple times.
 There were some dropped packets, but the traffic was maintained with
>>> minimal disturbance.
>>>
>>> Thanks for sharing the result! I think different applications may react
> to
>>> this kind of disturbance differently. Some may be sensitive to packet
> loss.
>>> In addition, I believe this would also incur megaflow cache miss and
>>> trigger OVS userspace processing in the middle of a flow.
>>> May I know the traffic pattern of your test? Did you measure with iperf
>>> during the test? Could share the numbers with v.s. without the drops?
>>>
>>> On the other hand, if such random disturbance is not considered harmful
> for
>>> some deployment, then I would also question the value of doing all those
>>> OVS flow idle_age checkings on the *owner* chassis. There can be lots of
>>> chassis consuming the same mac-binding entry but we are now checking "at
>>> least one of them is not using the entry recently", which doesn't sound
> too
>>> different from just blindly expiring the entries without checking
> anything,
>>> and let it recreate if someone still needs it - if the minimal
> disturbance
>>> is acceptable in such environment. ovn-northd can do this periodical
> check
>>> easily and clean the expired entries, correct?
>>>
>>> Thanks,
>>> Han
>>>


 Thanks,
 Ales

 On Wed, Jun 22, 2022 at 9:51 AM Ales Musil  wrote:
>
>
>
> On Wed, Jun 22, 2022 at 9:21 AM Han Zhou  wrote:
>>
>>
>>
>> On Fri, Jun 17, 2022 at 2:08 AM Ales Musil 
> wrote:
>>>
>>> Add MAC binding aging mechanism, that
>>> should take care of stale MAC bindings.
>>>
>>> The mechanism works on "ownership" of the
>>> MAC binding row. The chassis that creates
>>> the row is then checking if the "idle_age"
>>> of the flow is over the aging threshold.
>>> In that case the MAC binding is removed
>>> from database. The "owner" might change
>>> when another chassis saw an update of the
>>> MAC address.
>>>
>>> This approach has downside, the chassis
>>> that "owns" the MAC binding might not actually be
>>> the one that is using it actively. This
>>> might lead some delays in packet flow when
>>> the row is removed.
>>>
>>
>
> Hi Han, thank you for your input.
>
>>
>> Thanks Ales for working on this! The stale entries in MAC_Binding
> table
>>> was a big TODO of OVN and a difficult problem. It is great to see a
>>> solution finally, and I think utilizing the "idle_age" is brilliant.
> Before
>>> reviewing it in more detail, I'd like to discuss the "downside" first.
>>
>> I think the "downside" here is indeed a problem with this approach.
> The
>>> MAC binding in OVN is in fact the ARP cache (or neighbour table) of the
>>> router, but OVN logical router is distributed (except for gateway-router
>>> and DGP), so in most cases by nature of OVN LR the user of MAC binding
>>> wouldn't be the one "owns" it. It would be a big dataplane performance
>>> impact, thinking about a chassis that has a flow with high throughput of
>>> packets suddenly needs to pause and wait for ovn-controller (and SB DB)
> to
>>> complete the ARP resolution process. I saw this being pointed out and
>>> discussed in the first version, but I'd raise more attention to it,
> because
>>> the problem introduced would be much bigger than the stale entries in
> the
>>> MAC binding table.
>>
>>
>> I think the proposal from Daniel that transfers owner with "expire
>>> timestamp" set would help, but I am also thinking that since the logical
>>> router is distributed, it may be unreasonable to have an owner at all.
> My
>>> suggestion is, instead of assigning "owner" for each entry, a central
>>> controller can just be responsible for checking if any chassis still
> uses
>>> the entry and removing it when no one uses it anymore. Naturally the
>>> central controller can be hosted in ovn-northd. Here is the detailed
>>> algorithm I am thinking:
>>
>> * when an entry is created (by any ovn-controller), an
> expire_timestamp
>>> is set (e.g. 10 min from now - can be configurable)
>> * Each ovn-controller: check the entries it uses and if the
>>> expire_timestamp of the entry is past, but its own "idle_age" indicates
> the
>>> entry is still needed, it will update the SB DB entry with a new
>>> expire_timestamp. Note: before updating the SB DB, ovn-controller needs
> a
>>> random delay, to avoid update storm to SB unnecessarily - in most cases
>>> only 

Re: [ovs-dev] [PATCH] python: Add Python bindings TODO file.

2022-06-27 Thread Terry Wilson
On Mon, Jun 27, 2022 at 5:44 AM Dumitru Ceara  wrote:
>
> For now include the IDL related TODO items as discussed at:
> https://mail.openvswitch.org/pipermail/ovs-dev/2022-April/393516.html
>
> Signed-off-by: Dumitru Ceara 
> ---
>  python/TODO.rst| 34 ++
>  python/automake.mk |  2 ++
>  2 files changed, 36 insertions(+)
>  create mode 100644 python/TODO.rst
>
> diff --git a/python/TODO.rst b/python/TODO.rst
> new file mode 100644
> index ..3a53489f1280
> --- /dev/null
> +++ b/python/TODO.rst
> @@ -0,0 +1,34 @@
> +..
> +  Licensed under the Apache License, Version 2.0 (the "License"); you may
> +  not use this file except in compliance with the License. You may obtain
> +  a copy of the License at
> +
> +  http://www.apache.org/licenses/LICENSE-2.0
> +
> +  Unless required by applicable law or agreed to in writing, software
> +  distributed under the License is distributed on an "AS IS" BASIS, 
> WITHOUT
> +  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See 
> the
> +  License for the specific language governing permissions and limitations
> +  under the License.
> +
> +  Convention for heading levels in Open vSwitch documentation:
> +
> +  ===  Heading 0 (reserved for the title in a document)
> +  ---  Heading 1
> +  ~~~  Heading 2
> +  +++  Heading 3
> +  '''  Heading 4
> +
> +  Avoid deeper levels because they do not render well.
> +
> +==
> +Python Bindings To-do List
> +==
> +
> +* IDL:
> +
> +  * Support incremental change tracking monitor mode (equivalent of
> +OVSDB_IDL_TRACK).
> +
> +  * Support write-only-changed monitor mode (equivalent of
> +OVSDB_IDL_WRITE_CHANGED_ONLY).
> diff --git a/python/automake.mk b/python/automake.mk
> index 767512f1757f..89c2ccbb06ae 100644
> --- a/python/automake.mk
> +++ b/python/automake.mk
> @@ -117,3 +117,5 @@ $(srcdir)/python/ovs/dirs.py: python/ovs/dirs.py.template
> mv $@.tmp $@
>  EXTRA_DIST += python/ovs/dirs.py.template
>  CLEANFILES += python/ovs/dirs.py
> +
> +EXTRA_DIST += python/TODO.rst
> --
> 2.31.1
>

Numan's patch for allowing one to specify the uuid via the IDL can go
in here when it's committed as well. :)

Acked-by: Terry Wilson 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovsdb v2 1/1] Add Local_Config schema

2022-06-27 Thread Terry Wilson
The only way to configure settings on a remote (e.g. inactivity_probe)
is via --remote=db:DB,table,row. There is no way to do this via the
existing CLI options.

For a clustered DB with multiple servers listening on unique addresses
there is no way to store these entries in the DB as the DB is shared.
For example, three servers listening on  1.1.1.1, 1.1.1.2, and 1.1.1.3
respectively would require a Manager/Connection row each, but then
all three servers would try to listen on all three addresses.

It is possible for ovsdb-server to serve multiple databases. This
means that we can have a local "config" database in addition to
the main database we are servering (Open_vSwitch, OVN_Southbound, etc.)
and this patch adds a Local_Config schema that currently just mirrors
the Connection table and a Config table with a 'connections' row that
stores each Connection.

Signed-off-by: Terry Wilson 
---
 NEWS   |   5 +
 debian/openvswitch-common.manpages |   1 +
 ovsdb/.gitignore   |   2 +
 ovsdb/automake.mk  |  20 ++
 ovsdb/local_config.ovsschema   |  43 +
 ovsdb/local_config.xml | 281 +
 rhel/openvswitch-fedora.spec.in|   1 +
 tests/ovsdb-cluster.at |  42 -
 8 files changed, 388 insertions(+), 7 deletions(-)
 create mode 100644 ovsdb/local_config.ovsschema
 create mode 100644 ovsdb/local_config.xml

diff --git a/NEWS b/NEWS
index 9fe3f44f4..b9e6f0216 100644
--- a/NEWS
+++ b/NEWS
@@ -32,6 +32,11 @@ Post-v2.17.0
- DPDK:
  * OVS validated with DPDK 21.11.1.  It is recommended to use this version
until further releases.
+   - Local_Config schema added:
+ * Clustered ovsdb-servers listening on unique addresses can pass a second
+   DB created with the new Local_Config schema in order to configure their
+   Connections independent from each other. See the ovsdb.local-config.5
+   manpage for schema details.
 
 
 v2.17.0 - 17 Feb 2022
diff --git a/debian/openvswitch-common.manpages 
b/debian/openvswitch-common.manpages
index 95004122c..7c46a2acf 100644
--- a/debian/openvswitch-common.manpages
+++ b/debian/openvswitch-common.manpages
@@ -5,3 +5,4 @@ debian/tmp/usr/share/man/man8/ovs-appctl.8
 utilities/ovs-ofctl.8
 debian/tmp/usr/share/man/man8/ovs-parse-backtrace.8
 debian/tmp/usr/share/man/man8/ovs-pki.8
+ovsdb/ovsdb.local-config.5
diff --git a/ovsdb/.gitignore b/ovsdb/.gitignore
index fbcefafc6..747f119a9 100644
--- a/ovsdb/.gitignore
+++ b/ovsdb/.gitignore
@@ -1,5 +1,7 @@
 /_server.ovsschema.inc
 /_server.ovsschema.stamp
+/local_config.ovsschema.stamp
+/ovsdb.local-config.5
 /ovsdb-client
 /ovsdb-client.1
 /ovsdb-doc
diff --git a/ovsdb/automake.mk b/ovsdb/automake.mk
index 62cc02686..c53675a87 100644
--- a/ovsdb/automake.mk
+++ b/ovsdb/automake.mk
@@ -148,4 +148,24 @@ ovsdb/ovsdb-server.5: \
$(srcdir)/ovsdb/_server.xml > $@.tmp && \
mv $@.tmp $@
 
+EXTRA_DIST += ovsdb/local_config.ovsschema
+
+# Version checking for local_config.ovsschema.
+ALL_LOCAL += ovsdb/local_config.ovsschema.stamp
+ovsdb/local_config.ovsschema.stamp: ovsdb/local_config.ovsschema
+   $(srcdir)/build-aux/cksum-schema-check $? $@
+CLEANFILES += ovsdb/local_config.ovsschema.stamp
+
+# Local_Config schema documentation
+EXTRA_DIST += ovsdb/local_config.xml
+CLEANFILES += ovsdb/ovsdb.local-config.5
+man_MANS += ovsdb/ovsdb.local-config.5
+ovsdb/ovsdb.local-config.5: \
+   ovsdb/ovsdb-doc ovsdb/ ovsdb/local_config.xml 
ovsdb/local_config.ovsschema
+   $(AM_V_GEN)$(OVSDB_DOC) \
+   --version=$(VERSION) \
+   $(srcdir)/ovsdb/local_config.ovsschema \
+   $(srcdir)/ovsdb/local_config.xml > $@.tmp && \
+   mv $@.tmp $@
+
 EXTRA_DIST += ovsdb/TODO.rst
diff --git a/ovsdb/local_config.ovsschema b/ovsdb/local_config.ovsschema
new file mode 100644
index 0..bd86d0f4f
--- /dev/null
+++ b/ovsdb/local_config.ovsschema
@@ -0,0 +1,43 @@
+{
+"name": "Local_Config",
+"version": "1.0.0",
+"cksum": "2048726482 1858",
+"tables": {
+"Config": {
+"columns": {
+"connections": {
+"type": {"key": {"type": "uuid",
+ "refTable": "Connection"},
+ "min": 0,
+ "max": "unlimited"}}},
+"maxRows": 1,
+"isRoot": true},
+"Connection": {
+"columns": {
+"target": {"type": "string"},
+"max_backoff": {"type": {"key": {"type": "integer",
+ "minInteger": 1000},
+ "min": 0,
+ "max": 1}},
+"inactivity_probe": {"type": {"key": "integer",
+  "min": 0,
+  "max": 1}},
+

Re: [ovs-dev] [v5 8/8] dpif-netlink: Offloading meter to tc police action

2022-06-27 Thread Jianbo Liu via dev
On Mon, 2022-06-27 at 11:32 +0200, Eelco Chaudron wrote:
> 
> 
> On 21 Jun 2022, at 10:22, Jianbo Liu wrote:
> 
> > On Mon, 2022-06-20 at 12:18 +0200, Eelco Chaudron wrote:
> > > On 27 May 2022, at 11:00, Jianbo Liu wrote:
> > > 
> > > > OVS meters are created in advance and openflow rules refer to
> > > > them
> > > > by
> > > > their unique ID. New tc_police API is used to offload them. By
> > > > calling
> > > > the API, police actions are created and meters are mapped to
> > > > them.
> > > > These actions then can be used in tc filter rules by the index.
> > > > 
> > > > Signed-off-by: Jianbo Liu 
> > > > ---
> > > >  NEWS |  2 ++
> > > >  lib/dpif-netlink.c   | 31 +
> > > >  tests/system-offloads-traffic.at | 48
> > > > 
> > > >  3 files changed, 76 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/NEWS b/NEWS
> > > > index eece0d0b2..dfd659d4e 100644
> > > > --- a/NEWS
> > > > +++ b/NEWS
> > > > @@ -29,6 +29,8 @@ Post-v2.17.0
> > > >     - Windows:
> > > >   * Conntrack support for TCPv6, UDPv6, ICMPv6, FTPv6.
> > > >   * IPv6 Geneve tunnel support.
> > > > +   - Linux datapath:
> > > > + * Add offloading meter tc police.
> > > > 
> > > > 
> > > >  v2.17.0 - 17 Feb 2022
> > > > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> > > > index 06e1e8ca0..0af9ee77e 100644
> > > > --- a/lib/dpif-netlink.c
> > > > +++ b/lib/dpif-netlink.c
> > > > @@ -4163,11 +4163,18 @@ static int
> > > >  dpif_netlink_meter_set(struct dpif *dpif_, ofproto_meter_id
> > > > meter_id,
> > > >     struct ofputil_meter_config *config)
> > > >  {
> > > > +    int err;
> > > > +
> > > >  if (probe_broken_meters(dpif_)) {
> > > >  return ENOMEM;
> > > >  }
> > > > 
> > > > -    return dpif_netlink_meter_set__(dpif_, meter_id, config);
> > > > +    err = dpif_netlink_meter_set__(dpif_, meter_id, config);
> > > > +    if (!err && netdev_is_flow_api_enabled()) {
> > > > +    meter_offload_set(meter_id, config);
> > > 
> > > Although currently we always return 0, we should still account
> > > for it
> > > to change in the future, so we should set err to the return
> > > value.
> > > 
> > 
> > If there is err from meter_offload_set, it will be passed to the
> > caller
> > of dpif_netlink_meter_set(). I don't agree with that, because the
> > caller thinks meter_set operation fail, but actually not. Besides,
> > we
> > allow the case that dp meter_set success, but offloading fails, so
> > the
> > return the error of meter_offload_set seems unnecessary.
> 
> If this is the case, we should change the dpif_netlink_meter_set()
> API to return void.
> And add a comment to the function why it would not return an error:
> 

OK.

> --- a/lib/netdev-offload.c
> +++ b/lib/netdev-offload.c
> @@ -196,7 +196,7 @@ netdev_assign_flow_api(struct netdev *netdev)
>  return -1;
>  }
> 
> -int
> +void
>  meter_offload_set(ofproto_meter_id meter_id,
>    struct ofputil_meter_config *config)
>  {
> @@ -212,8 +212,8 @@ meter_offload_set(ofproto_meter_id meter_id,
>     }
>  }
>  }
> -
> -    return 0;
> +    /* Offload APIs could fail, for example, because the offload is
> not
> + * supported. This is fine, as the offload API should take care
> of this. */
>  }
> 
> 
> 
> > >  +    err = meter_offload_set(meter_id, config);
> > > 
> > > > +    }
> > > > +
> 
> 
> 
> > > > diff --git a/tests/system-offloads-traffic.at b/tests/system-
> > > > offloads-traffic.at
> > > > index 80bc1dd5c..7ec75340f 100644
> > > > --- a/tests/system-offloads-traffic.at
> > > > +++ b/tests/system-offloads-traffic.at
> > > > @@ -168,3 +168,51 @@ matchall
> > > >  ])
> > > >  OVS_TRAFFIC_VSWITCHD_STOP
> > > >  AT_CLEANUP
> > > > +
> > > > +AT_SETUP([offloads - check if meter offloading ])
> > > 
> > > Can we replace if with interface, as I keep on reading it as
> > > "if".
> > > 
> > 
> > Sure.
> > 
> > > > +AT_KEYWORDS([meter])
> > > > +AT_SKIP_IF([test $SUPPORT_TC_INGRESS_PPS = "no"])
> > > > +OVS_TRAFFIC_VSWITCHD_START()
> > > > +
> > > > +AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:hw-
> > > > offload=true])
> > > > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=1 pktps
> > > > bands=type=drop rate=1'])
> > > > +
> > > > +ADD_NAMESPACES(at_ns0, at_ns1)
> > > > +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "f0:00:00:01:01:01")
> > > > +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24", "f0:00:00:01:01:02")
> > > > +
> > > > +NS_CHECK_EXEC([at_ns0], [ip neigh add 10.1.1.2 lladdr
> > > > f0:00:00:01:01:02 dev p0])
> > > > +NS_CHECK_EXEC([at_ns1], [ip neigh add 10.1.1.1 lladdr
> > > > f0:00:00:01:01:01 dev p1])
> > > > +
> > > > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
> > > > "actions=normal"])
> > > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2 |
> > > > FORMAT_PING], [0], [dnl
> > > > +10 packets transmitted, 10 received, 0% packet loss, time 0ms
> > 

Re: [ovs-dev] [PATCH v2] netdev-linux: do not touch LAG slaves if master is not attached to ovs.

2022-06-27 Thread Roi Dayan via dev



On 2022-06-27 1:13 PM, Tao Liu wrote:

On Mon, Jun 27, 2022 at 11:51:48AM +0300, Roi Dayan wrote:



On 2022-06-27 10:49 AM, Tao Liu wrote:

On Mon, Jun 27, 2022 at 09:40:38AM +0300, Roi Dayan wrote:



On 2022-06-23 12:42 PM, Roi Dayan wrote:



On 2022-06-22 11:56 PM, Marcelo Ricardo Leitner wrote:

On Wed, Jun 22, 2022 at 11:10:17AM +0800, Tao Liu wrote:

On Tue, Jun 21, 2022 at 06:49:42AM -0700, Marcelo Ricardo Leitner wrote:

Hi,

On Tue, Jun 21, 2022 at 02:10:39PM +0800, Tao Liu wrote:

Bond master netdev may be created without a classification type, due
to routing or tunneling code.


Can you please elaborate on why is this an issue?


Hi, thanks for your reply.

We are using BlueField2 in Bare Metal Server.
p0 and p1 enslave to master bond0. A SF is created for RDMA.
ovs manages pf0hpf and gre.

Traffics between bond0 and SF are controlled by tc rules.
```
master=bond0
slave1=p0
slave2=p1
sf=enp3s0f0s0
sf_rep=en3f0pf0sf0
$tc qdisc add dev $master ingress_block 22 ingress
$tc qdisc add dev $slave1 ingress_block 22 ingress
$tc qdisc add dev $slave2 ingress_block 22 ingress
$tc qdisc add dev $sf_rep ingress

# some customized tc rules
$tc filter add dev $sf_rep pref 1 ingress ... action mirred
egress redirect dev $master

$tc filter add block 22 pref 1 ... action mirred egress redirect
dev $sf_rep
```

Unfortunately ingress_block on p0 or p1 may be deleted and recreated by
ovs when they enslaves to bond0.

Also we have a similar architectur on cx5/cx6.



If bond master is not attached to ovs, the ingress block
on slaves shoud
not be updated.


Why? (in short, but more below)


If we do not use bond master and slaves in ovs, they shoud not
be interfered.


Makes sense, ...





Simple reproducer:
     ip a add 10.1.1.1/30 dev bond0
     ip l set net3 master bond0
     tc q ls dev net3

Fixes: d22f8927c3c9 ("netdev-linux: monitor and offload
LAG slaves to TC")
Signed-off-by: Tao Liu 
---
    lib/netdev-linux.c | 5 -
    1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 9d12502..b9e0c99 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -682,7 +682,10 @@ netdev_linux_update_lag(struct
rtnetlink_change *change)
    return;
    }

-    if (is_netdev_linux_class(master_netdev->netdev_class)) {
+    /* If bond master is not attached to ovs,
ingress block on slaves
+ * shoud not be updated. */


I think this will break a core use case. As in your reproducer, that's
pretty much how it is expected to work today with tunnels, for
example:

     ip a add 10.1.1.1/30 dev bond0
     ip l set net3 master bond0
     ip l s bond0 up
     ovs-vsctl add-port ovsbr0 vxlan0 -- \
  set interface vxlan0
  type=vxlan \
  options:local_ip=10.1.1.1
  options:remote_ip=10.1.1.2
  options:key=0

If you patch like this, then who would be adding the ingress qdiscs on
the slaves?

Forcing the bond to be added is probably not optimal, because it
doesn't really need to be. Unless your considering that as some sort
of authorization for ovs to mangle with it?

     Marcelo


This patch does not break the tunnel use case. The ingress attaches on
vxlan_sys, but not need to attach on bond master or slaves.


... I was under the impression that the driver needed it somehow,
despite that. But apparently that's not right, as the driver will have
one indr callback for each uplink, and should be able to have its way
from there.

Thanks for the explanations.

Reviewed-by: Marcelo Ricardo Leitner 



there is an example without bond about ingress. We can
use vxlan port in ovs with tunnel ip assigned on the pf.
like Tao mentioned, the ingress and tc rules are created on
the vxlan netdev created by ovs but ovs doesnt recreate ingress
on the pf and the rules are offloaded by the driver.

also looks good to me. thanks.

Reviewed-by: Roi Dayan 



Hmm, sorry but looks like I did review too quick.
I did some manual testing now and ovs doesn't add
ingress to the slave devices for me.
master_netdev->auto_classified==true in my case.
can you describe the steps you did to test? or you just tested
the flow where bond is not attached to ovs?


# ovs-vsctl show
866b5d8e-bf4d-4600-9dd3-b724036c7e99
  ovs_version: "2.17.90"

# echo "+bond0" > /sys/class/net/bonding_masters

# ip l show dev bond0
21: bond0:  mtu 1500 qdisc noop state DOWN mode 
DEFAULT group default qlen 1000
  link/ether 94:29:2f:70:22:9c brd ff:ff:ff:ff:ff:ff
# ip a add 10.1.1.1/30 dev bond0

# ip l set net3 down
# tc q add dev net3 ingress_block 111 ingress
# tc q ls dev net3 ingress
  qdisc ingress : parent :fff1 ingress_block 111 
# ip l set net3 master bond0
# tc q ls dev net3 ingress
  qdisc ingress : parent :fff1 ingress_block 22 

Also add some print in netdev_linux_update_lag():
2022-06-27T07:43:45.324Z|00058|netdev_linux|INFO|LAG: master_name=bond0, 
master_ifindex=22, 

Re: [ovs-dev] [PATCH] ofpbuf: Fix offsetting a NULL pointer in ofpbuf_reserve.

2022-06-27 Thread Ilya Maximets
On 6/24/22 16:00, Dumitru Ceara wrote:
> On 6/24/22 14:54, Ilya Maximets wrote:
>> ofpbuf_reserve() can be called with a zero size for a buffer with
>> an unallocated data.  It's a valid case, but we should not allow
>> evaluation of 'NULL + 0'.
>>
>>  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/ofpbuf.c:469:30 
>> in
>>  lib/ofpbuf.c:469:30: runtime error: applying zero offset to null pointer
>> 0 0xb2f890 in ofpbuf_reserve lib/ofpbuf.c:469:30
>> 1 0xb2f9bc in ofpbuf_new_with_headroom lib/ofpbuf.c:179:5
>> 2 0xb2f9bc in ofpbuf_clone_data_with_headroom lib/ofpbuf.c:228:24
>> 3 0xb2f9bc in ofpbuf_clone_with_headroom lib/ofpbuf.c:199:18
>> 4 0xb2f8ea in ofpbuf_clone lib/ofpbuf.c:189:12
>> 5 0x6b3c57 in ukey_set_actions ofproto/ofproto-dpif-upcall.c:1712:5
>> 6 0x6c4315 in ukey_create__ ofproto/ofproto-dpif-upcall.c:1738:5
>> 7 0x6beed6 in ukey_create_from_upcall 
>> ofproto/ofproto-dpif-upcall.c:1793:12
>> 8 0x6beed6 in upcall_xlate ofproto/ofproto-dpif-upcall.c:1284:24
>> 9 0x6beed6 in process_upcall ofproto/ofproto-dpif-upcall.c:1456:9
>> 10 0x6bafb6 in recv_upcalls ofproto/ofproto-dpif-upcall.c:875:17
>> 11 0x6b70fa in udpif_upcall_handler ofproto/ofproto-dpif-upcall.c:792:13
>> 12 0xb4d5fa in ovsthread_wrapper lib/ovs-thread.c:422:12
>> 13 0x7fe6922081ce in start_thread (/lib64/libpthread.so.0+0x81ce)
>> 14 0x7fe690e39dd2 in clone (/lib64/libc.so.6+0x39dd2)
>>
>> Signed-off-by: Ilya Maximets 
>> ---
>>  lib/ofpbuf.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/lib/ofpbuf.c b/lib/ofpbuf.c
>> index 79ced46d7..679f3ba3e 100644
>> --- a/lib/ofpbuf.c
>> +++ b/lib/ofpbuf.c
>> @@ -464,6 +464,10 @@ ofpbuf_put_hex(struct ofpbuf *b, const char *s, size_t 
>> *n)
>>  void
>>  ofpbuf_reserve(struct ofpbuf *b, size_t size)
>>  {
>> +if (!size) {
>> +return;
>> +}
>> +
>>  ovs_assert(!b->size);
> 
> Just to be extra sure, I think I'd move the ovs_assert(!b->size) before
> your new check.
> 
> Anyhow, it's ok without or it can be done at apply time:
> 
> Acked-by: Dumitru Ceara 

Thanks, Aaron and Dumitru!  I moved the assert and applied the patch.
Also backported down to 2.13.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] odp-util: Fix unaligned access to tunnel id.

2022-06-27 Thread Ilya Maximets
On 6/24/22 20:49, Aaron Conole wrote:
> Ilya Maximets  writes:
> 
>>   SUMMARY: UndefinedBehaviorSanitizer:
>>
>>   lib/odp-util.c:3436:32: runtime error:
>> load of misaligned address 0x624000489424 for type 'const ovs_be64'
>> (aka 'const unsigned long'), which requires 8 byte alignment 
>> 0x624000489424:
>> note: pointer points here
>>   0c 00 00 00 ff ff ff ff  ff ff ff ff 08 00 01 00  ...
>>   ^
>>0 0x9b13a2 in format_be64 lib/odp-util.c:3436:32
>>1 0x9b13a2 in format_odp_tun_attr lib/odp-util.c:3942:13
>>2 0x9b13a2 in format_odp_key_attr__ lib/odp-util.c:4221:9
>>3 0x9ae7a2 in odp_flow_format lib/odp-util.c:4606:17
>>4 0xee5037 in format_dpif_flow lib/dpctl.c:862:5
>>5 0xed69ed in dpctl_dump_flows lib/dpctl.c:1142:13
>>6 0xed32b3 in dpctl_unixctl_handler lib/dpctl.c:3035:17
>>7 0xc7c80b in process_command lib/unixctl.c:310:13
>>8 0xc7c80b in run_connection lib/unixctl.c:344:17
>>9 0xc7c80b in unixctl_server_run lib/unixctl.c:395:21
>>10 0x59a9a4 in main vswitchd/ovs-vswitchd.c:130:9
>>11 0x7fee2803acf2 in __libc_start_main (/lib64/libc.so.6+0x3acf2)
>>12 0x47e60d in _start (vswitchd/ovs-vswitchd+0x47e60d)
>>
>> Tunnel id mask in the flow key is only 4 bytes aligned, so has to be
>> accessed with appropriate unaligned read function.
>>
>> Signed-off-by: Ilya Maximets 
>> ---
> 
> Acked-by: Aaron Conole 
> 

Thanks, Aaron and Dumitru!  Applied and backported down to 2.13.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] conntrack: Fix incorrect bit shift while hashing nat range.

2022-06-27 Thread Ilya Maximets
On 6/24/22 20:48, Aaron Conole wrote:
> Ilya Maximets  writes:
> 
>> 'max_port' is 16bit field, shift expands it to 'int', not unsigned int.
>>
>>  lib/conntrack.c:2245:41: runtime error:
>>left shift of 34568 by 16 places cannot be represented in type 'int'.
>>
>>  0 0xec45f4 in nat_range_hash lib/conntrack.c:2245:41
>>  1 0xec45f4 in nat_get_unique_tuple lib/conntrack.c:2422:21
>>  2 0xec45f4 in conn_not_found lib/conntrack.c:1035:32
>>  3 0xeaf0a5 in process_one lib/conntrack.c:1407:20
>>  4 0xea9390 in conntrack_execute lib/conntrack.c:1465:13
>>  5 0x839530 in dp_execute_cb lib/dpif-netdev.c:9060:9
>>  6 0x9909cc in odp_execute_actions lib/odp-execute.c:868:17
>>  7 0x830946 in dp_netdev_execute_actions lib/dpif-netdev.c:9106:5
>>  8 0x830946 in handle_packet_upcall lib/dpif-netdev.c:8294:5
>>  9 0x82ea5e in fast_path_processing lib/dpif-netdev.c:8390:25
>>  10 0x7ed87f in dp_netdev_input__ lib/dpif-netdev.c:8479:9
>>  11 0x7eb5fc in dp_netdev_input lib/dpif-netdev.c:8517:5
>>  12 0x81dada in dp_netdev_process_rxq_port lib/dpif-netdev.c:5329:19
>>  13 0x7f0063 in dpif_netdev_run lib/dpif-netdev.c:6664:25
>>  14 0x85f036 in dpif_run lib/dpif.c:467:16
>>  15 0x61833a in type_run ofproto/ofproto-dpif.c:366:9
>>  16 0x5c210e in ofproto_type_run ofproto/ofproto.c:1822:31
>>  17 0x565db2 in bridge_run__ vswitchd/bridge.c:3245:9
>>  18 0x562f82 in bridge_run vswitchd/bridge.c:3310:5
>>  19 0x59a98c in main vswitchd/ovs-vswitchd.c:129:9
>>  20 0x7f8864c3acf2 in __libc_start_main (/lib64/libc.so.6+0x3acf2)
>>  21 0x47e60d in _start (vswitchd/ovs-vswitchd+0x47e60d)
>>
>> Fixes: 92edd073ce6c ("conntrack: Hash entire NAT data structure in 
>> nat_range_hash().")
>> Signed-off-by: Ilya Maximets 
>> ---
> 
> Acked-by: Aaron Conole 
> 

Thanks, Aaron and Paolo!  Applied and backported down to 2.13.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] packets: Fix misaligned write to MPLS lse.

2022-06-27 Thread Ilya Maximets
On 6/24/22 16:36, Eelco Chaudron wrote:
> 
> 
> On 24 Jun 2022, at 16:18, Ilya Maximets wrote:
> 
>> MPLS header is only 2 byte aligned, so the value has to be written
>> in parts.  Also, even though the 'struct mpls_hdr' has only one
>> field, it's cleaner to not access that field directly.
>>
>>  lib/packets.c:432:9: runtime error:
>>store to misaligned address 0x61b000756382 for type 'ovs_be32'
>>(aka 'unsigned int'), which requires 4 byte alignment
>>0x61b000756382: note: pointer points here
>>00 00  be be be be be be ff ff  ff ff ff ff a6 36 77 20  ...
>> ^
>>  0 0xbb30ae in add_mpls lib/packets.c:432:17
>>  1 0x9934d2 in odp_execute_actions lib/odp-execute.c:1072:17
>>  2 0x830946 in dp_netdev_execute_actions lib/dpif-netdev.c:9106:5
>>  3 0x830946 in handle_packet_upcall lib/dpif-netdev.c:8294:5
>>  4 0x82ea5e in fast_path_processing lib/dpif-netdev.c:8390:25
>>  5 0x7ed87f in dp_netdev_input__ lib/dpif-netdev.c:8479:9
>>  6 0x7eb5fc in dp_netdev_input lib/dpif-netdev.c:8517:5
>>  7 0x81dada in dp_netdev_process_rxq_port lib/dpif-netdev.c:5329:19
>>  8 0x7f0063 in dpif_netdev_run lib/dpif-netdev.c:6664:25
>>  9 0x85f036 in dpif_run lib/dpif.c:467:16
>>  10 0x61833a in type_run ofproto/ofproto-dpif.c:366:9
>>  11 0x5c210e in ofproto_type_run ofproto/ofproto.c:1822:31
>>  12 0x565db2 in bridge_run__ vswitchd/bridge.c:3245:9
>>  13 0x562f82 in bridge_run vswitchd/bridge.c:3310:5
>>  14 0x59a98c in main vswitchd/ovs-vswitchd.c:129:9
>>  15 0x7ff895c3acf2 in __libc_start_main (/lib64/libc.so.6+0x3acf2)
>>  16 0x47e60d in _start (vswitchd/ovs-vswitchd+0x47e60d)
>>
>> Fixes: 1917ace89364 ("Encap & Decap actions for MPLS packet type.")
>> Signed-off-by: Ilya Maximets 
> 
> 
> Changes look good to me, and tests are passing.
> 
> Acked-by: Eelco Chaudron 
> 


Thanks!  Applied to master and branch-2.17.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tc: Fix misaligned access to stats and time values.

2022-06-27 Thread Ilya Maximets
On 6/24/22 16:34, Eelco Chaudron wrote:
> 
> 
> On 24 Jun 2022, at 15:18, Ilya Maximets wrote:
> 
>> Pointers to gnet_stats_basic and tcf_t are not correctly aligned,
>> so we need to copy the data before accessing.  Found by running
>> check-offloads testsuite with UBsan:
>>
>>  lib/tc.c:1791:50: runtime error:
>>member access within misaligned address 0x6195ce1c for type
>>'const struct gnet_stats_basic', which requires 8 byte alignment
>>0x6195ce1c: note: pointer points here
>>   14 00 07 00 00 00 00 00  00 00 00 00 00 00 00 00  ...
>>   ^
>>0 0xd69044 in nl_parse_single_action lib/tc.c:1791:50
>>1 0xd69044 in nl_parse_flower_actions lib/tc.c:1841:19
>>2 0xd57612 in nl_parse_flower_options lib/tc.c:1893:12
>>3 0xd5468d in parse_netlink_to_tc_flower lib/tc.c:1941:12
>>4 0xd5a7ec in tc_replace_flower lib/tc.c:3199:19
>>5 0xd2baea in probe_tc_block_support lib/netdev-offload-tc.c:2226:13
>>6 0xd2baea in netdev_tc_init_flow_api lib/netdev-offload-tc.c:2279:9
>>7 0x94d726 in netdev_assign_flow_api lib/netdev-offload.c:183:14
>>8 0x94d726 in netdev_init_flow_api lib/netdev-offload.c:323:9
>>9 0x9515c7 in netdev_ports_flow_init lib/netdev-offload.c:775:8
>>10 0x9515c7 in netdev_set_flow_api_enabled lib/netdev-offload.c:815:13
>>11 0x562ec8 in bridge_run vswitchd/bridge.c:3292:9
>>12 0x59a98c in main vswitchd/ovs-vswitchd.c:129:9
>>13 0x7fb5c583acf2 in __libc_start_main (/lib64/libc.so.6+0x3acf2)
>>14 0x47e60d in _start (vswitchd/ovs-vswitchd+0x47e60d)
>>
>>  lib/tc.c:1306:43: runtime error:
>>member access within misaligned address 0x619000140324 for type
>>'const struct tcf_t', which requires 8 byte alignment
>>0x619000140324: note: pointer points here
>>   24 00 01 00 00 00 00 00  00 00 00 00 00 00 00 00  ...
>>   ^
>>0 0xd6ee6f in nl_parse_tcf lib/tc.c:1306:43
>>1 0xd6423f in nl_parse_act_mirred lib/tc.c:1389:5
>>2 0xd6423f in nl_parse_single_action lib/tc.c:1747:15
>>3 0xd6423f in nl_parse_flower_actions lib/tc.c:1843:19
>>4 0xd57612 in nl_parse_flower_options lib/tc.c:1895:12
>>5 0xd5468d in parse_netlink_to_tc_flower lib/tc.c:1943:12
>>6 0xd5a7ec in tc_replace_flower lib/tc.c:3201:19
>>7 0xd28ae8 in netdev_tc_flow_put lib/netdev-offload-tc.c:1969:11
>>8 0x94cc97 in netdev_flow_put lib/netdev-offload.c:257:14
>>9 0xcba2be in parse_flow_put lib/dpif-netlink.c:2289:11
>>10 0xcba2be in try_send_to_netdev lib/dpif-netlink.c:2376:15
>>11 0xcba2be in dpif_netlink_operate lib/dpif-netlink.c:2447:23
>>12 0x8647ce in dpif_operate lib/dpif.c:1372:13
>>13 0x6b50a6 in push_dp_ops ofproto/ofproto-dpif-upcall.c:2406:5
>>14 0x6c9bcd in revalidate ofproto/ofproto-dpif-upcall.c:2792:13
>>15 0x6b79b5 in udpif_revalidator ofproto/ofproto-dpif-upcall.c:980:9
>>16 0xb4d5ea in ovsthread_wrapper lib/ovs-thread.c:422:12
>>17 0x7ff1090081ce in start_thread (/lib64/libpthread.so.0+0x81ce)
>>18 0x7ff107c39dd2 in clone (/lib64/libc.so.6+0x39dd2)
>>
>> This patch fixes only problems reported by UBsan in current tests,
>> there could be more issues like this not currently covered by the
>> testsuite.
>>
>> Signed-off-by: Ilya Maximets 
> 
> Changes look good to me, and tests are passing.
> 
> Copied in Jianbo Liu, as his “netdev-linux: Add functions to manipulate tc 
> police action” patch will also be affected.
> 
> Acked-by: Eelco Chaudron 


Thanks, Eelco and Dumitru.  Applied to master and branch-2.17.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn branch-21.12] northd: avoid snat on reply packets

2022-06-27 Thread 0-day Robot
Bleep bloop.  Greetings Xavier Simonart, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Numan Siddique 
Lines checked: 275, Warnings: 1, Errors: 0


build:
PYTHONPATH=/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/python":"$PYTHONPATH
 PYTHONDONTWRITEBYTECODE=yes /usr/bin/python3 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/ovsdb/ovsdb-idlc.in
 annotate ./ovn-ic-sb.ovsschema ./lib/ovn-ic-sb-idl.ann > 
lib/ovn-ic-sb-idl.ovsidl.tmp && \
mv lib/ovn-ic-sb-idl.ovsidl.tmp lib/ovn-ic-sb-idl.ovsidl
PYTHONPATH=/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/python":"$PYTHONPATH
 PYTHONDONTWRITEBYTECODE=yes /usr/bin/python3 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/ovsdb/ovsdb-idlc.in
 c-idl-source lib/ovn-ic-sb-idl.ovsidl > lib/ovn-ic-sb-idl.c.tmp && mv 
lib/ovn-ic-sb-idl.c.tmp lib/ovn-ic-sb-idl.c
PYTHONPATH=/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/python":"$PYTHONPATH
 PYTHONDONTWRITEBYTECODE=yes /usr/bin/python3 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/ovsdb/ovsdb-idlc.in
 c-idl-header lib/ovn-ic-sb-idl.ovsidl > lib/ovn-ic-sb-idl.h.tmp && mv 
lib/ovn-ic-sb-idl.h.tmp lib/ovn-ic-sb-idl.h
make  all-am
make[1]: Entering directory 
`/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace'
depbase=`echo lib/acl-log.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`;\
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -std=gnu99 -DHAVE_CONFIG_H -I. 
  -I ./include  -I ./include -I ./ovn -I ./include -I ./lib -I ./lib -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib 
-I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib 
-I /var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs
-Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat 
-Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast 
-Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes 
-Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror  -g 
-O2 -MT lib/acl-log.lo -MD -MP -MF $depbase.Tpo -c -o li
 b/acl-log.lo lib/acl-log.c &&\
mv -f $depbase.Tpo $depbase.Plo
libtool: compile:  gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I ./include -I ./include 
-I ./ovn -I ./include -I ./lib -I ./lib -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib 
-I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib 
-I /var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs 
-Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat 
-Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast 
-Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes 
-Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror -g 
-O2 -MT lib/acl-log.lo -MD -MP -MF lib/.deps/acl-log.Tpo -c lib/acl-log.c -o 
lib/acl-log.
 o
depbase=`echo lib/actions.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`;\
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -std=gnu99 -DHAVE_CONFIG_H -I. 
  -I ./include  -I ./include -I ./ovn -I ./include -I ./lib -I ./lib -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib 
-I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib 
-I /var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs
-Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat 
-Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast 
-Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes 
-Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror  -g 
-O2 -MT lib/actions.lo -MD -MP -MF $depbase.Tpo -c -o li
 b/actions.lo lib/actions.c &&\
mv -f $depbase.Tpo $depbase.Plo
libtool: compile:  gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I ./include -I ./include 

Re: [ovs-dev] [PATCH v4 02/17] python: add mask, ip and eth decoders

2022-06-27 Thread Ilya Maximets
On 6/27/22 13:25, Ilya Maximets wrote:
> On 6/16/22 08:32, Adrian Moreno wrote:
>> Add more decoders that can be used by KVParser.
>>
>> For IPv4 and IPv6 addresses, create a new class that wraps
>> netaddr.IPAddress.
>> For Ethernet addresses, create a new class that wraps netaddr.EUI.
>> For Integers, create a new class that performs basic bitwise mask
>> comparisons
>>
>> Acked-by: Eelco Chaudron 
>> Signed-off-by: Adrian Moreno 
>> ---
>>  python/ovs/flow/decoders.py | 398 
>>  python/setup.py |   2 +-
>>  2 files changed, 399 insertions(+), 1 deletion(-)
>>
>> diff --git a/python/ovs/flow/decoders.py b/python/ovs/flow/decoders.py
>> index 0c2259c76..883e61acf 100644
>> --- a/python/ovs/flow/decoders.py
>> +++ b/python/ovs/flow/decoders.py
>> @@ -5,6 +5,15 @@ A decoder is generally a callable that accepts a string and 
>> returns the value
>>  object.
>>  """
>>  
>> +import netaddr
> 
> 
> 
>> diff --git a/python/setup.py b/python/setup.py
>> index 7ac3c3662..350ac6056 100644
>> --- a/python/setup.py
>> +++ b/python/setup.py
>> @@ -87,7 +87,7 @@ setup_args = dict(
>>  ext_modules=[setuptools.Extension("ovs._json", sources=["ovs/_json.c"],
>>libraries=['openvswitch'])],
>>  cmdclass={'build_ext': try_build_ext},
>> -install_requires=['sortedcontainers'],
>> +install_requires=['sortedcontainers', 'netaddr'],
>>  extras_require={':sys_platform == "win32"': ['pywin32 >= 1.0']},
>>  )
>>  
> 
> So, the 'netaddr' is not a test dependency, but a new global dependency.
> The tests are failing on patches #11 and #12 for that reason (unable to
> import netaddr), it will be installed as a test dependency in patch #13.
> 
> We have 2 options here:
> 
> 1.
>   - Document the new global dependency in installation docs.
>   - Install python3-netaddr explicitly in CI scripts.
>   - Update fedora spec and debian/control with a build dependency and
> runtime dependency for a python package.
> 
> 2.
>   - Make it an optional dependency, by adding to extras_require and
> providing a meaningful error on attempt to import ovs.flow if
> netaddr is not available instead of throwing an import error.
>   - Skip python part of tests in patches #11 and #12 if ovs.flow
> import fails. (try/catch + exit(0) in test scripts ?)
>   - Document optional dependency for a flow library and update packaging
> with 'Suggests' section for pytohn3-netaddr.

- Install python3-netaddr explicitly in CI scripts (GHA and CirrusCI).

Needed for that option too.

> 
> The second option seems better to me as it doesn't force users to install
> python3-netaddr if they don't want to parse OVS flows in python.
> What do you think?
> 
> Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 15/17] python: add unit tests for openflow parsing

2022-06-27 Thread Ilya Maximets
On 6/16/22 08:32, Adrian Moreno wrote:
> Add unit tests for OFPFlow class and ip-port range decoder
> 
> Acked-by: Eelco Chaudron 
> Signed-off-by: Adrian Moreno 
> ---
>  python/automake.mk|   4 +-
>  python/ovs/tests/test_decoders.py | 130 
>  python/ovs/tests/test_list.py |   4 +-
>  python/ovs/tests/test_ofp.py  | 534 ++
>  4 files changed, 669 insertions(+), 3 deletions(-)
>  create mode 100644 python/ovs/tests/test_decoders.py
>  create mode 100644 python/ovs/tests/test_ofp.py
> 



> diff --git a/python/ovs/tests/test_list.py b/python/ovs/tests/test_list.py
> index e5139869f..e08ee1e85 100644
> --- a/python/ovs/tests/test_list.py
> +++ b/python/ovs/tests/test_list.py
> @@ -1,7 +1,7 @@
>  import pytest
>  
> -from ovs.flows.list import ListParser, ListDecoders
> -from ovs.flows.kv import KeyValue
> +from ovs.flow.list import ListParser, ListDecoders
> +from ovs.flow.kv import KeyValue

This belongs to a previous patch, otherwise tests are failing.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 11/17] tests: verify flows in ofp-actions are parseable

2022-06-27 Thread Ilya Maximets
On 6/16/22 08:32, Adrian Moreno wrote:
> Create a small helper script and check that flows used in ofp-actions.at
> are parseable.
> 
> Signed-off-by: Adrian Moreno 
> ---
>  tests/automake.mk |  2 ++
>  tests/ofp-actions.at  | 18 +
>  tests/ovs-test-ofparse.py | 42 +++
>  3 files changed, 62 insertions(+)
>  create mode 100755 tests/ovs-test-ofparse.py
> 
> diff --git a/tests/automake.mk b/tests/automake.mk
> index 261fbb942..ab23f04ca 100644
> --- a/tests/automake.mk
> +++ b/tests/automake.mk
> @@ -19,6 +19,7 @@ EXTRA_DIST += \
>   $(OVSDB_CLUSTER_TESTSUITE) \
>   tests/atlocal.in \
>   $(srcdir)/package.m4 \
> + $(srcdir)/tests/ovs-test-ofparse.py \
>   $(srcdir)/tests/testsuite \
>   $(srcdir)/tests/testsuite.patch
>  
> @@ -525,6 +526,7 @@ CHECK_PYFILES = \
>   tests/flowgen.py \
>   tests/mfex_fuzzy.py \
>   tests/ovsdb-monitor-sort.py \
> + tests/ovs-test-ofparse.py \

Should this be called just test-ofparse.py for consistency?
Same for the other script.

>   tests/test-daemon.py \
>   tests/test-json.py \
>   tests/test-jsonrpc.py \
> diff --git a/tests/ofp-actions.at b/tests/ofp-actions.at
> index 9d820eba6..c75f3ccb5 100644
> --- a/tests/ofp-actions.at
> +++ b/tests/ofp-actions.at
> @@ -329,6 +329,7 @@ AT_CAPTURE_FILE([experr])
>  AT_CHECK(
>[ovs-ofctl '-vPATTERN:console:%c|%p|%m' parse-actions OpenFlow10 < 
> input.txt],
>[0], [expout], [experr])
> +AT_CHECK([cat expout | grep 'actions=' | ovs-test-ofparse.py])

This should be '$PYTHON3 $srcdir/ovs-test-ofparse.py'.
The same for other places and the other patch.

>  AT_CLEANUP
>  
>  AT_SETUP([OpenFlow 1.0 "instruction" translations])
> @@ -359,6 +360,7 @@ AT_CAPTURE_FILE([experr])
>  AT_CHECK(
>[ovs-ofctl '-vPATTERN:console:%c|%p|%m' parse-instructions OpenFlow10 < 
> input.txt],
>[0], [expout], [experr])
> +AT_CHECK([cat expout | grep 'actions=' | ovs-test-ofparse.py])
>  AT_CLEANUP
>  
>  AT_SETUP([OpenFlow 1.1 action translation])
> @@ -502,6 +504,7 @@ AT_CAPTURE_FILE([experr])
>  AT_CHECK(
>[ovs-ofctl '-vPATTERN:console:%c|%p|%m' parse-actions OpenFlow11 < 
> input.txt],
>[0], [expout], [experr])
> +AT_CHECK([cat expout | grep 'actions=' | ovs-test-ofparse.py])
>  AT_CLEANUP
>  
>  AT_SETUP([OpenFlow 1.1 instruction translation])
> @@ -737,6 +740,7 @@ AT_CAPTURE_FILE([experr])
>  AT_CHECK(
>[ovs-ofctl '-vPATTERN:console:%c|%p|%m' parse-actions OpenFlow12 < 
> input.txt],
>[0], [expout], [experr])
> +AT_CHECK([cat expout | grep 'actions=' | ovs-test-ofparse.py])
>  AT_CLEANUP
>  
>  dnl Our primary goal here is to verify OpenFlow 1.3-specific changes,
> @@ -798,6 +802,7 @@ AT_CAPTURE_FILE([experr])
>  AT_CHECK(
>[ovs-ofctl '-vPATTERN:console:%c|%p|%m' parse-actions OpenFlow13 < 
> input.txt],
>[0], [expout], [experr])
> +AT_CHECK([cat expout | grep 'actions=' | ovs-test-ofparse.py])
>  AT_CLEANUP
>  
>  dnl Our primary goal here is to verify that OpenFlow 1.5-specific changes,
> @@ -827,17 +832,20 @@ AT_CAPTURE_FILE([experr])
>  AT_CHECK(
>[ovs-ofctl '-vPATTERN:console:%c|%p|%m' parse-actions OpenFlow15 < 
> input.txt],
>[0], [expout], [experr])
> +AT_CHECK([cat expout | grep 'actions=' | ovs-test-ofparse.py])
>  AT_CLEANUP
>  
>  AT_SETUP([ofp-actions - inconsistent MPLS actions])
>  OVS_VSWITCHD_START
>  dnl OK: Use fin_timeout action on TCP flow
>  AT_CHECK([ovs-ofctl -O OpenFlow11 -vwarn add-flow br0 'tcp 
> actions=fin_timeout(idle_timeout=1)'])
> +AT_CHECK([echo 'tcp actions=fin_timeout(idle_timeout=1)' | 
> ovs-test-ofparse.py])
>  dnl Bad: Use fin_timeout action on TCP flow that has been converted to MPLS
>  AT_CHECK([ovs-ofctl -O OpenFlow11 -vwarn add-flow br0 'tcp 
> actions=push_mpls:0x8847,fin_timeout(idle_timeout=1)'],
>   [1], [], [dnl
>  ovs-ofctl: none of the usable flow formats (OpenFlow10,NXM) is among the 
> allowed flow formats (OpenFlow11)
>  ])
> +AT_CHECK([echo 'tcp actions=push_mpls:0x8847,fin_timeout(idle_timeout=1)' | 
> ovs-test-ofparse.py])
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
>  
> @@ -853,6 +861,8 @@ AT_CHECK([ovs-ofctl -O OpenFlow10 dump-flows br0 | 
> ofctl_strip], [0], [dnl
>  NXST_FLOW reply:
>   mpls actions=load:0xa->OXM_OF_MPLS_LABEL[[]]
>  ])
> +AT_CHECK([echo 'mpls actions=set_field:10->mpls_label' | 
> ovs-test-ofparse.py])
> +AT_CHECK([echo 'mpls actions=load:0xa->OXM_OF_MPLS_LABEL[[]]'| 
> ovs-test-ofparse.py])
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
>  
> @@ -862,14 +872,17 @@ OVS_VSWITCHD_START
>  dnl OpenFlow 1.0 has an "enqueue" action.  For OpenFlow 1.1+, we translate
>  dnl it to a series of actions that accomplish the same thing.
>  AT_CHECK([ovs-ofctl -O OpenFlow10 add-flow br0 'actions=enqueue(123,456)'])
> +AT_CHECK([echo 'actions=enqueue(123,456)' | ovs-test-ofparse.py])
>  AT_CHECK([ovs-ofctl -O OpenFlow10 dump-flows br0 | ofctl_strip], [0], [dnl
>  NXST_FLOW reply:
>   actions=enqueue:123:456
>  ])
> +AT_CHECK([echo 

[ovs-dev] [PATCH ovn branch-21.12] northd: avoid snat on reply packets

2022-06-27 Thread Xavier Simonart
On gateway routers egress packets might be both:
- unDNATted
- SNATted
Reply packets should not be SNATted (they must of course be UnDNATted
if DNAT was applied).

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2061593

Signed-off-by: Xavier Simonart 
Acked-by: Mark Michelson 
Signed-off-by: Numan Siddique 
(cherry picked from commit 8b3e1afc30f3cf0ef9857fdc68f619b6fbed10dc)
---
 northd/northd.c |   1 +
 northd/ovn-northd.8.xml |   3 +-
 tests/ovn-northd.at |  33 +--
 tests/system-ovn.at | 119 
 4 files changed, 139 insertions(+), 17 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index ce78f03de..f51962c4b 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -12991,6 +12991,7 @@ build_lrouter_out_snat_flow(struct hmap *lflows, struct 
ovn_datapath *od,
 ds_put_format(actions, "ip%s.src=%s; next;",
   is_v6 ? "6" : "4", nat->external_ip);
 } else {
+ds_put_format(match, " && (!ct.trk || !ct.rpl)");
 ds_put_format(actions, "ct_snat(%s", nat->external_ip);
 
 if (nat->external_port_range[0]) {
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 2b307cef3..2594c6d3b 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -4417,7 +4417,8 @@ nd_ns {
   to change the source IP address of a packet from an IP address of
   A or to change the source IP address of a packet that
   belongs to network A to B, a flow matches
-  ip  ip4.src == A with an action
+  ip  ip4.src == A 
+  (!ct.trk || !ct.rpl) with an action
   ct_snat(B);.  The priority of the flow
   is calculated based on the mask of A, with matches
   having larger masks getting higher priorities. If the NAT rule is
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index d5701a72b..a8689dfb0 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -1018,7 +1018,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows | sed 
's/table=../table=??/' | sort], [0
 AT_CHECK([grep -e "lr_out_snat" crflows | sed 's/table=../table=??/' | sort], 
[0], [dnl
   table=??(lr_out_snat), priority=0, match=(1), action=(next;)
   table=??(lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 
50.0.0.11 && ip4.dst == $allowed_range), action=(ct_snat(172.16.1.1);)
+  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 
50.0.0.11 && ip4.dst == $allowed_range && (!ct.trk || !ct.rpl)), 
action=(ct_snat(172.16.1.1);)
 ])
 
 
@@ -1050,7 +1050,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows2 | sed 
's/table=../table=??/' | sort], [
 AT_CHECK([grep -e "lr_out_snat" crflows2 | sed 's/table=../table=??/' | sort], 
[0], [dnl
   table=??(lr_out_snat), priority=0, match=(1), action=(next;)
   table=??(lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 
50.0.0.11), action=(ct_snat(172.16.1.1);)
+  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 
50.0.0.11 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.1);)
   table=??(lr_out_snat), priority=35   , match=(ip && ip4.src == 
50.0.0.11 && ip4.dst == $disallowed_range), action=(next;)
 ])
 
@@ -1079,7 +1079,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows3 | sed 
's/table=../table=??/' | sort], [
 AT_CHECK([grep -e "lr_out_snat" crflows3 | sed 's/table=../table=??/' | sort], 
[0], [dnl
   table=??(lr_out_snat), priority=0, match=(1), action=(next;)
   table=??(lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 
50.0.0.11 && ip4.dst == $allowed_range), action=(ct_snat(172.16.1.2);)
+  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 
50.0.0.11 && ip4.dst == $allowed_range && (!ct.trk || !ct.rpl)), 
action=(ct_snat(172.16.1.2);)
 ])
 
 # Stateful FIP with DISALLOWED_IPs
@@ -1108,7 +1108,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows4 | sed 
's/table=../table=??/' | sort], [
 AT_CHECK([grep -e "lr_out_snat" crflows4 | sed 's/table=../table=??/' | sort], 
[0], [dnl
   table=??(lr_out_snat), priority=0, match=(1), action=(next;)
   table=??(lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 
50.0.0.11), action=(ct_snat(172.16.1.2);)
+  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 
50.0.0.11 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.2);)
   table=??(lr_out_snat), priority=35   , match=(ip && ip4.src == 
50.0.0.11 && ip4.dst == $disallowed_range), action=(next;)
 ])
 
@@ -5082,11 +5082,12 @@ AT_CHECK([grep "lr_out_post_undnat" lr0flows | sed 
's/table=./table=?/' | sort],
 

Re: [ovs-dev] [PATCH v4 02/17] python: add mask, ip and eth decoders

2022-06-27 Thread Ilya Maximets
On 6/16/22 08:32, Adrian Moreno wrote:
> Add more decoders that can be used by KVParser.
> 
> For IPv4 and IPv6 addresses, create a new class that wraps
> netaddr.IPAddress.
> For Ethernet addresses, create a new class that wraps netaddr.EUI.
> For Integers, create a new class that performs basic bitwise mask
> comparisons
> 
> Acked-by: Eelco Chaudron 
> Signed-off-by: Adrian Moreno 
> ---
>  python/ovs/flow/decoders.py | 398 
>  python/setup.py |   2 +-
>  2 files changed, 399 insertions(+), 1 deletion(-)
> 
> diff --git a/python/ovs/flow/decoders.py b/python/ovs/flow/decoders.py
> index 0c2259c76..883e61acf 100644
> --- a/python/ovs/flow/decoders.py
> +++ b/python/ovs/flow/decoders.py
> @@ -5,6 +5,15 @@ A decoder is generally a callable that accepts a string and 
> returns the value
>  object.
>  """
>  
> +import netaddr



> diff --git a/python/setup.py b/python/setup.py
> index 7ac3c3662..350ac6056 100644
> --- a/python/setup.py
> +++ b/python/setup.py
> @@ -87,7 +87,7 @@ setup_args = dict(
>  ext_modules=[setuptools.Extension("ovs._json", sources=["ovs/_json.c"],
>libraries=['openvswitch'])],
>  cmdclass={'build_ext': try_build_ext},
> -install_requires=['sortedcontainers'],
> +install_requires=['sortedcontainers', 'netaddr'],
>  extras_require={':sys_platform == "win32"': ['pywin32 >= 1.0']},
>  )
>  

So, the 'netaddr' is not a test dependency, but a new global dependency.
The tests are failing on patches #11 and #12 for that reason (unable to
import netaddr), it will be installed as a test dependency in patch #13.

We have 2 options here:

1.
  - Document the new global dependency in installation docs.
  - Install python3-netaddr explicitly in CI scripts.
  - Update fedora spec and debian/control with a build dependency and
runtime dependency for a python package.

2.
  - Make it an optional dependency, by adding to extras_require and
providing a meaningful error on attempt to import ovs.flow if
netaddr is not available instead of throwing an import error.
  - Skip python part of tests in patches #11 and #12 if ovs.flow
import fails. (try/catch + exit(0) in test scripts ?)
  - Document optional dependency for a flow library and update packaging
with 'Suggests' section for pytohn3-netaddr.

The second option seems better to me as it doesn't force users to install
python3-netaddr if they don't want to parse OVS flows in python.
What do you think?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] python: Add Python bindings TODO file.

2022-06-27 Thread Dumitru Ceara
For now include the IDL related TODO items as discussed at:
https://mail.openvswitch.org/pipermail/ovs-dev/2022-April/393516.html

Signed-off-by: Dumitru Ceara 
---
 python/TODO.rst| 34 ++
 python/automake.mk |  2 ++
 2 files changed, 36 insertions(+)
 create mode 100644 python/TODO.rst

diff --git a/python/TODO.rst b/python/TODO.rst
new file mode 100644
index ..3a53489f1280
--- /dev/null
+++ b/python/TODO.rst
@@ -0,0 +1,34 @@
+..
+  Licensed under the Apache License, Version 2.0 (the "License"); you may
+  not use this file except in compliance with the License. You may obtain
+  a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+  License for the specific language governing permissions and limitations
+  under the License.
+
+  Convention for heading levels in Open vSwitch documentation:
+
+  ===  Heading 0 (reserved for the title in a document)
+  ---  Heading 1
+  ~~~  Heading 2
+  +++  Heading 3
+  '''  Heading 4
+
+  Avoid deeper levels because they do not render well.
+
+==
+Python Bindings To-do List
+==
+
+* IDL:
+
+  * Support incremental change tracking monitor mode (equivalent of
+OVSDB_IDL_TRACK).
+
+  * Support write-only-changed monitor mode (equivalent of
+OVSDB_IDL_WRITE_CHANGED_ONLY).
diff --git a/python/automake.mk b/python/automake.mk
index 767512f1757f..89c2ccbb06ae 100644
--- a/python/automake.mk
+++ b/python/automake.mk
@@ -117,3 +117,5 @@ $(srcdir)/python/ovs/dirs.py: python/ovs/dirs.py.template
mv $@.tmp $@
 EXTRA_DIST += python/ovs/dirs.py.template
 CLEANFILES += python/ovs/dirs.py
+
+EXTRA_DIST += python/TODO.rst
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] tests: fixed multiple flaky tests (not waiting for patch flows)

2022-06-27 Thread Xavier Simonart
The following test cases were sometimes failing, (mainly) for the same reason
i.e. packet lost as being sent before patch ports were installed.
- 2 HVs, 2 LS, switching between multiple localnet ports with same tags
- VLAN transparency, passthru=true, ARP responder disabled
- send gratuitous arp for l3gateway only on selected chassis
- 4 HV, 3 LS, 2 LR, packet test with HA distributed router gateway port
- 1 LR with distributed router gateway port
- send gratuitous arp for NAT rules on distributed router
- send gratuitous ARP for NAT rules on HA distributed router
- localnet connectivity with multiple requested-chassis
- 2 HVs, 2 lports/HV, localnet ports, DVR chassis mac
- 2 HVs, 4 lports/HV, localnet ports
- 2 HVs, 2 LS, routing works for multiple collocated segments attached to 
different switches
- 2 HVs, 1 LS, no switching between multiple localnet ports with different tags

Signed-off-by: Xavier Simonart 

---
v2:  - handled Mark's comments.
   rebased on origin/main
---
 tests/ovn.at | 81 ++--
 1 file changed, 54 insertions(+), 27 deletions(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index bfaa41962..b855653ac 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -76,19 +76,22 @@ m4_divert_text([PREPARE_TESTS],
}
 
ovn_wait_patch_port_flows () {
- patch_port=$1
- hv=$2
- echo "$3: waiting for flows for $patch_port on $hv"
- # Patch port might be created after ports are reported up
- OVS_WAIT_UNTIL([
- test 1 = $(as $hv ovs-vsctl show | grep "Port $patch_port" | wc -l)
- ])
- # Wait for a flow outputing to patch port
- OVS_WAIT_UNTIL([
- hv_patch_ofport=$(as $hv ovs-vsctl --bare --columns ofport find 
Interface name=$patch_port)
- echo "$patch_port=$hv_patch_ofport"
- test 1 = $(as $hv ovs-ofctl dump-flows br-int | grep -c 
"output:$hv_patch_ofport")
- ])
+ for localnet in $1; do
+   patch_port="patch-br-int-to-$localnet"
+   for hv in $2; do
+ echo "$3: waiting for flows for $patch_port on $hv"
+ # Patch port might be created after ports are reported up
+ OVS_WAIT_UNTIL([
+ test 1 = $(as $hv ovs-vsctl show | grep "Port \b$patch_port\b" | 
wc -l)
+ ])
+ # Wait for a flow outputing to patch port
+ OVS_WAIT_UNTIL([
+ hv_patch_ofport=$(as $hv ovs-vsctl --bare --columns ofport find 
Interface name=$patch_port)
+ echo "$patch_port=$hv_patch_ofport"
+ test 1 -le $(as $hv ovs-ofctl dump-flows br-int | grep -c 
"output:\b$hv_patch_ofport\b")
+ ])
+   done
+ done
}
 
ovn_wait_remote_output_flows () {
@@ -2952,6 +2955,8 @@ for i in 1 2; do
 done
 done
 wait_for_ports_up
+OVN_WAIT_PATCH_PORT_FLOWS(["ln1" "ln2" "ln3"], ["hv1"] ["hv2"])
+
 ovn-nbctl --wait=sb sync
 ovn-sbctl dump-flows > sbflows
 AT_CAPTURE_FILE([sbflows])
@@ -3122,6 +3127,7 @@ done
 
 wait_for_ports_up
 ovn-nbctl --wait=sb sync
+OVN_WAIT_PATCH_PORT_FLOWS(["ln-11" "ln-21"], ["hv-1" "hv-2"])
 
 ovn-sbctl dump-flows > sbflows
 AT_CAPTURE_FILE([sbflows])
@@ -3394,6 +3400,7 @@ for i in 1 2; do
 ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:0$i
 
 OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+OVN_WAIT_PATCH_PORT_FLOWS(["ln-$i-20"], ["hv-$i"])
 done
 
 wait_for_ports_up
@@ -3501,7 +3508,7 @@ done
 
 wait_for_ports_up
 
-# Remote output flows are setup whe pb of remote is received
+# Remote output flows are setup when pb of remote is received
 # Hence they can be setup after both ports have been reported up.
 OVN_WAIT_REMOTE_OUTPUT_FLOWS(["hv-1"],["hv-2"])
 OVN_WAIT_REMOTE_OUTPUT_FLOWS(["hv-2"],["hv-1"])
@@ -3641,7 +3648,7 @@ for i in 1 2; do
 
 # Patch port might be created after ports are reported up
 # Wait for a flow outputing to patch port
-OVN_WAIT_PATCH_PORT_FLOWS(["patch-br-int-to-ln-100"], ["hv-$i"])
+OVN_WAIT_PATCH_PORT_FLOWS(["ln-100"], ["hv-$i"])
 done
 
 test_packet() {
@@ -3710,7 +3717,7 @@ for i in 1 2; do
 
 # Patch port might be created after ports are reported up
 # Wait for a flow outputing to patch port
-OVN_WAIT_PATCH_PORT_FLOWS(["patch-br-int-to-ln-100"], ["hv-$i"])
+OVN_WAIT_PATCH_PORT_FLOWS(["ln-100"], ["hv-$i"])
 done
 # create taps on fabric to check vlan encapsulation there
 for i in 1 2; do
@@ -3796,7 +3803,7 @@ for i in 1 2; do
 
 # Patch port might be created after ports are reported up
 # Wait for a flow outputing to patch port
-OVN_WAIT_PATCH_PORT_FLOWS(["patch-br-int-to-ln"], ["hv-$i"])
+OVN_WAIT_PATCH_PORT_FLOWS(["ln"], ["hv-$i"])
 done
 
 for i in 1 2; do
@@ -4021,6 +4028,7 @@ for tag in 10 20; do
 ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:0$i:$tag
 
 OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+OVN_WAIT_PATCH_PORT_FLOWS(["ln-$tag"], ["hv-$tag-$i"])
 done
 done
 wait_for_ports_up
@@ -8856,14 +8864,14 @@ 

Re: [ovs-dev] [PATCH v2] netdev-linux: do not touch LAG slaves if master is not attached to ovs.

2022-06-27 Thread Tao Liu
On Mon, Jun 27, 2022 at 11:51:48AM +0300, Roi Dayan wrote:
> 
> 
> On 2022-06-27 10:49 AM, Tao Liu wrote:
> > On Mon, Jun 27, 2022 at 09:40:38AM +0300, Roi Dayan wrote:
> > > 
> > > 
> > > On 2022-06-23 12:42 PM, Roi Dayan wrote:
> > > > 
> > > > 
> > > > On 2022-06-22 11:56 PM, Marcelo Ricardo Leitner wrote:
> > > > > On Wed, Jun 22, 2022 at 11:10:17AM +0800, Tao Liu wrote:
> > > > > > On Tue, Jun 21, 2022 at 06:49:42AM -0700, Marcelo Ricardo Leitner 
> > > > > > wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > On Tue, Jun 21, 2022 at 02:10:39PM +0800, Tao Liu wrote:
> > > > > > > > Bond master netdev may be created without a classification 
> > > > > > > > type, due
> > > > > > > > to routing or tunneling code.
> > > > > > > 
> > > > > > > Can you please elaborate on why is this an issue?
> > > > > > > 
> > > > > > Hi, thanks for your reply.
> > > > > > 
> > > > > > We are using BlueField2 in Bare Metal Server.
> > > > > > p0 and p1 enslave to master bond0. A SF is created for RDMA.
> > > > > > ovs manages pf0hpf and gre.
> > > > > > 
> > > > > > Traffics between bond0 and SF are controlled by tc rules.
> > > > > > ```
> > > > > > master=bond0
> > > > > > slave1=p0
> > > > > > slave2=p1
> > > > > > sf=enp3s0f0s0
> > > > > > sf_rep=en3f0pf0sf0
> > > > > > $tc qdisc add dev $master ingress_block 22 ingress
> > > > > > $tc qdisc add dev $slave1 ingress_block 22 ingress
> > > > > > $tc qdisc add dev $slave2 ingress_block 22 ingress
> > > > > > $tc qdisc add dev $sf_rep ingress
> > > > > > 
> > > > > > # some customized tc rules
> > > > > > $tc filter add dev $sf_rep pref 1 ingress ... action mirred
> > > > > > egress redirect dev $master
> > > > > > 
> > > > > > $tc filter add block 22 pref 1 ... action mirred egress redirect
> > > > > > dev $sf_rep
> > > > > > ```
> > > > > > 
> > > > > > Unfortunately ingress_block on p0 or p1 may be deleted and 
> > > > > > recreated by
> > > > > > ovs when they enslaves to bond0.
> > > > > > 
> > > > > > Also we have a similar architectur on cx5/cx6.
> > > > > > 
> > > > > > > > 
> > > > > > > > If bond master is not attached to ovs, the ingress block
> > > > > > > > on slaves shoud
> > > > > > > > not be updated.
> > > > > > > 
> > > > > > > Why? (in short, but more below)
> > > > > > > 
> > > > > > If we do not use bond master and slaves in ovs, they shoud not
> > > > > > be interfered.
> > > > > 
> > > > > Makes sense, ...
> > > > > 
> > > > > > 
> > > > > > > > 
> > > > > > > > Simple reproducer:
> > > > > > > >     ip a add 10.1.1.1/30 dev bond0
> > > > > > > >     ip l set net3 master bond0
> > > > > > > >     tc q ls dev net3
> > > > > > > > 
> > > > > > > > Fixes: d22f8927c3c9 ("netdev-linux: monitor and offload
> > > > > > > > LAG slaves to TC")
> > > > > > > > Signed-off-by: Tao Liu 
> > > > > > > > ---
> > > > > > > >    lib/netdev-linux.c | 5 -
> > > > > > > >    1 file changed, 4 insertions(+), 1 deletion(-)
> > > > > > > > 
> > > > > > > > diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> > > > > > > > index 9d12502..b9e0c99 100644
> > > > > > > > --- a/lib/netdev-linux.c
> > > > > > > > +++ b/lib/netdev-linux.c
> > > > > > > > @@ -682,7 +682,10 @@ netdev_linux_update_lag(struct
> > > > > > > > rtnetlink_change *change)
> > > > > > > >    return;
> > > > > > > >    }
> > > > > > > > 
> > > > > > > > -    if 
> > > > > > > > (is_netdev_linux_class(master_netdev->netdev_class)) {
> > > > > > > > +    /* If bond master is not attached to ovs,
> > > > > > > > ingress block on slaves
> > > > > > > > + * shoud not be updated. */
> > > > > > > 
> > > > > > > I think this will break a core use case. As in your reproducer, 
> > > > > > > that's
> > > > > > > pretty much how it is expected to work today with tunnels, for
> > > > > > > example:
> > > > > > > 
> > > > > > >     ip a add 10.1.1.1/30 dev bond0
> > > > > > >     ip l set net3 master bond0
> > > > > > >     ip l s bond0 up
> > > > > > >     ovs-vsctl add-port ovsbr0 vxlan0 -- \
> > > > > > >  set interface vxlan0
> > > > > > >  type=vxlan \
> > > > > > >  options:local_ip=10.1.1.1
> > > > > > >  options:remote_ip=10.1.1.2
> > > > > > >  options:key=0
> > > > > > > 
> > > > > > > If you patch like this, then who would be adding the ingress 
> > > > > > > qdiscs on
> > > > > > > the slaves?
> > > > > > > 
> > > > > > > Forcing the bond to be added is probably not optimal, because it
> > > > > > > doesn't really need to be. Unless your considering that as some 
> > > > > > > sort
> > > > > > > of authorization for ovs to mangle with it?
> > > > > > > 
> > > > > > >     Marcelo
> > > > > > > 
> > > > > > This patch does not break the tunnel use case. The ingress attaches 
> > > > > > on
> > > > > > vxlan_sys, but not need to attach on bond master or slaves.
> > > > > 
> > > > > ... I was under the impression that the driver needed it somehow,
> > > > > despite that. But apparently that's not right, as the driver will 

Re: [ovs-dev] [v5 5/8] netdev-offload-tc: Implement meter offload API for tc

2022-06-27 Thread Eelco Chaudron


On 27 Jun 2022, at 12:00, Jianbo Liu wrote:

> On Mon, 2022-06-27 at 11:33 +0200, Eelco Chaudron wrote:
>>
>>
>> On 27 Jun 2022, at 11:32, Jianbo Liu wrote:
>>
>>> On Mon, 2022-06-27 at 11:03 +0200, Eelco Chaudron wrote:


 On 21 Jun 2022, at 5:29, Jianbo Liu wrote:

> On Mon, 2022-06-20 at 12:15 +0200, Eelco Chaudron wrote:
>> On 27 May 2022, at 11:00, Jianbo Liu wrote:
>>
>>> For dpif-netlink, meters are mapped by tc to police actions
>>> with
>>> one-to-one relationship. Implement meter offload API to
>>> set/get/del
>>> the police action, and a hmap is used to save the mappings.
>>> An id-pool is used to manage all the available police
>>> indexes,
>>> which
>>> are 0x1000-0x1fff, reserved only for OVS.
>>>
>>> Signed-off-by: Jianbo Liu 
>>> ---

 

>>> +
>>> +static int
>>> +meter_tc_set_policer(ofproto_meter_id meter_id,
>>> + struct ofputil_meter_config *config)
>>> +{
>>> +    uint32_t police_index;
>>> +    uint32_t rate, burst;
>>> +    bool add_policer;
>>> +    int err;
>>> +
>>> +    ovs_assert(config->bands != NULL);
>>
>> I would remove the assert and just return and error if bands
>> ==
>> NULL
>> or n_bands is < 1.
>> So:
>>
>>     if (!config->bands || config->n_bands < 1) {
>>     return EINVAL;
>>     }
>>
>
> I think maybe not necessary, as dpif_netlink_meter_set__() is
> called
> before, and it doesn't check if bands is not NULL and n_bands
>> = 1.

 You are right it’s assuming it’s non-null and walks through the
 n_bands.

 So maybe you should just ignore if the bands is not 1, so
 something
 like:

   if (!config->bands || config->n_bands < 1) {
     return 0;
   }

 The dpif_netlink_meter_set__() also check is the band type is
 OVS_METER_BAND_TYPE_DROP. Not sure if you need to check this also
 when setting the stats, as now your code assumes this.
>>>
>>> How about adding one more check?
>>>     if (!config->bands || config->n_bands < 1
>>>     || config->bands[0].type != OFPMBT13_DROP) {
>>>     return 0;
>>>     }
>>
>> That will work, make sure you also update your get stats function
>> with a similar check.
>>
>
> No need for get stats and del functions, as they will try to find the
> police index before calling TC funcs. So if meter_set return directly
> (or fail), there is not police action mapped for this meter, and thus
> no TC stats/del funcs will be called.

Good point, I should think better before sending a quick reply ;)



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [v5 5/8] netdev-offload-tc: Implement meter offload API for tc

2022-06-27 Thread Jianbo Liu via dev
On Mon, 2022-06-27 at 11:33 +0200, Eelco Chaudron wrote:
> 
> 
> On 27 Jun 2022, at 11:32, Jianbo Liu wrote:
> 
> > On Mon, 2022-06-27 at 11:03 +0200, Eelco Chaudron wrote:
> > > 
> > > 
> > > On 21 Jun 2022, at 5:29, Jianbo Liu wrote:
> > > 
> > > > On Mon, 2022-06-20 at 12:15 +0200, Eelco Chaudron wrote:
> > > > > On 27 May 2022, at 11:00, Jianbo Liu wrote:
> > > > > 
> > > > > > For dpif-netlink, meters are mapped by tc to police actions
> > > > > > with
> > > > > > one-to-one relationship. Implement meter offload API to
> > > > > > set/get/del
> > > > > > the police action, and a hmap is used to save the mappings.
> > > > > > An id-pool is used to manage all the available police
> > > > > > indexes,
> > > > > > which
> > > > > > are 0x1000-0x1fff, reserved only for OVS.
> > > > > > 
> > > > > > Signed-off-by: Jianbo Liu 
> > > > > > ---
> > > 
> > > 
> > > 
> > > > > > +
> > > > > > +static int
> > > > > > +meter_tc_set_policer(ofproto_meter_id meter_id,
> > > > > > + struct ofputil_meter_config *config)
> > > > > > +{
> > > > > > +    uint32_t police_index;
> > > > > > +    uint32_t rate, burst;
> > > > > > +    bool add_policer;
> > > > > > +    int err;
> > > > > > +
> > > > > > +    ovs_assert(config->bands != NULL);
> > > > > 
> > > > > I would remove the assert and just return and error if bands
> > > > > ==
> > > > > NULL
> > > > > or n_bands is < 1.
> > > > > So:
> > > > > 
> > > > >     if (!config->bands || config->n_bands < 1) {
> > > > >     return EINVAL;
> > > > >     }
> > > > > 
> > > > 
> > > > I think maybe not necessary, as dpif_netlink_meter_set__() is
> > > > called
> > > > before, and it doesn't check if bands is not NULL and n_bands
> > > > >= 1.
> > > 
> > > You are right it’s assuming it’s non-null and walks through the
> > > n_bands.
> > > 
> > > So maybe you should just ignore if the bands is not 1, so
> > > something
> > > like:
> > > 
> > >   if (!config->bands || config->n_bands < 1) {
> > >     return 0;
> > >   }
> > > 
> > > The dpif_netlink_meter_set__() also check is the band type is
> > > OVS_METER_BAND_TYPE_DROP. Not sure if you need to check this also
> > > when setting the stats, as now your code assumes this.
> > 
> > How about adding one more check?
> >     if (!config->bands || config->n_bands < 1
> >     || config->bands[0].type != OFPMBT13_DROP) {
> >     return 0;
> >     }
> 
> That will work, make sure you also update your get stats function
> with a similar check.
> 

No need for get stats and del functions, as they will try to find the
police index before calling TC funcs. So if meter_set return directly
(or fail), there is not police action mapped for this meter, and thus
no TC stats/del funcs will be called.   

> > > 
> > > > > > +    rate = config->bands[0].rate;
> > > > > > +    if (config->flags & OFPMF13_BURST) {
> > > > > > +    burst = config->bands[0].burst_size;
> > > > > > +    } else {
> > > > > > +    burst = config->bands[0].rate;
> > > > > > +    }
> > > 
> > > 
> > > 
> > > > > > +static int
> > > > > > +meter_tc_del_policer(ofproto_meter_id meter_id,
> > > > > > + struct ofputil_meter_stats *stats,
> > > > > > + uint16_t max_bands OVS_UNUSED)
> > > > > > +{
> > > > > > +    uint32_t police_index;
> > > > > > +    int err = ENOENT;
> > > > > > +
> > > > > > +    if (!meter_id_lookup(meter_id.uint32, _index))
> > > > > > {
> > > > > > +    err = tc_del_policer_action(police_index, stats);
> > > > > > +    if (err && err != ENOENT) {
> > > > > > +    VLOG_WARN_RL(_rl,
> > > > > > + "Failed to del police %u for
> > > > > > meter
> > > > > > %u:
> > > > > > %s",
> > > > > > + police_index, meter_id.uint32,
> > > > > > ovs_strerror(err));
> > > > > 
> > > > > Why do we not cleanup the meter id from the pool on TC
> > > > > failure?
> > > > > Do we
> > > > > assume the entry exists and it was not cleaned up correctly?
> > > > > 
> > > > 
> > > > Just in case, I don't think it's possible becausue the reserved
> > > > police
> > > > index is big enough, and no other application will use them.
> > > 
> > > Yes, we should have enough IDs, was just curious why. Did you
> > > encounter this when testing?
> > 
> > No, I didn't.
> > 
> > > 
> > > > > If this is the case, this should be an error log, not a
> > > > > warning,
> > > > > as
> > > > > something is wrong in this environment.
> > > > 
> > > > OK, I will add an error log.
> > > 
> > > Thanks, this will at least trigger some attention to a potential
> > > problem.
> > > 
> > > > > 
> > > > > > +    } else {
> > > > > > +    meter_free_police_index(police_index);
> > > > > > +    }
> > > > > > +    meter_id_remove(meter_id.uint32);
> > > > > > +    }
> > > > > > +
> > > > > > +    return err;
> > > > > > +}
> > > > > > +
> > > > > >  const struct netdev_flow_api netdev_offload_tc = {
> > > > > >     .type = "linux_tc",
> 

Re: [ovs-dev] [v5 6/8] netdev-offload-tc: Cleanup police actions with reserved indexes on startup

2022-06-27 Thread Jianbo Liu via dev
On Mon, 2022-06-27 at 11:17 +0200, Eelco Chaudron wrote:
> 
> 
> On 21 Jun 2022, at 5:51, Jianbo Liu wrote:
> 
> > On Mon, 2022-06-20 at 12:16 +0200, Eelco Chaudron wrote:
> > > On 27 May 2022, at 11:00, Jianbo Liu wrote:
> > > 
> > > > As the police actions with indexes of 0x1000-0x1fff are
> > > > reserved for meter offload, to provide a clean environment for
> > > > OVS,
> > > > these reserved police actions should be deleted on startup. So
> > > > dump
> > > > all the police actions, delete those actions with indexes in
> > > > this
> > > > range.
> > > > 
> > > > Signed-off-by: Jianbo Liu 
> > > > ---
> 
> 
> 
> > > > +int
> > > > +parse_netlink_to_tc_policer(struct ofpbuf *reply, uint32_t
> > > > police_idx[])
> > > > +{
> > > > +    static struct nl_policy
> > > > actions_orders_policy[TCA_ACT_MAX_PRIO] = {};
> > > > +    struct nlattr
> > > > *actions_orders[ARRAY_SIZE(actions_orders_policy)];
> > > > +    const int max_size = ARRAY_SIZE(actions_orders_policy);
> > > > +    const struct nlattr *actions;
> > > > +    struct tc_flower flower;
> > > > +    struct tcamsg *tca;
> > > > +    int i, cnt = 0;
> > > > +    int err;
> > > > +
> > > > +    for (i = 0; i < max_size; i++) {
> > > > +    actions_orders_policy[i].type = NL_A_NESTED;
> > > > +    actions_orders_policy[i].optional = true;
> > > > +    }
> > > > +
> > > > +    tca = ofpbuf_at_assert(reply, NLMSG_HDRLEN, sizeof *tca);
> > > > +    actions = nl_attr_find(reply, NLMSG_HDRLEN + sizeof *tca,
> > > > TCA_ACT_TAB);
> > > > +    if (!actions || !nl_parse_nested(actions,
> > > > actions_orders_policy,
> > > > + actions_orders,
> > > > max_size)) {
> > > > +    VLOG_ERR_RL(_rl, "Failed to parse police
> > > > actions");
> > > 
> > > These could be all kinds of actions, not only police ones, i.e.
> > > whatever is programmed.
> > > Maybe change this to "Failed to parse flower actions".
> > > 
> > 
> > But this function is intended for police only, and its name is
> > ended
> > with "_policer". Do you want to make it generel func for all
> > actions?
> 
> As you could receive any set of actions from netlink (all that exists
> in the system), the error might not be specific to a police action,
> that’s why I think it should be removed.
> 
> > > > +    return EPROTO;
> > > > +    }
> > > > +
> > > > +    for (i = 0; i < TCA_ACT_MAX_PRIO; i++) {
> > > > +    if (actions_orders[i]) {
> > > > +    memset(, 0, sizeof(struct tc_flower));
> > > > +    err = nl_parse_single_action(actions_orders[i],
> > > > , false);
> > > > +    if (err) {
> > > > +    continue;
> > > > +    }
> > > > +    if (flower.actions[0].police.index) {
> > > 
> > > You need to make sure this is the Police action (see the previous
> > > review), i.e.,
> > > 
> > 
> > I remember your comment in previous version. Sorry I didn't change
> > here
> > becasue the same reason above.
> 
> Let me try to be more clear. Whatever you receive from netlink could
> be any kind of action, i.e. whatever is programmed in the system.
> So if someone manually added add list of actions, in the ID range,
> you think it’s Police but it could be something else.
> 
> What about the following for the two comments above:

I don't know if someone will use these functions. But currently they
are only for police, as I dump police actions by RTM_GETACTION request.
Anyway, I will update as you suggested.

> 
> --- a/lib/tc.c
> +++ b/lib/tc.c
> @@ -2127,7 +2127,8 @@ parse_netlink_to_tc_policer(struct ofpbuf
> *reply, uint32_t police_idx[])
>  actions = nl_attr_find(reply, NLMSG_HDRLEN + sizeof *tca,
> TCA_ACT_TAB);
>  if (!actions || !nl_parse_nested(actions, actions_orders_policy,
>   actions_orders, max_size)) {
> -    VLOG_ERR_RL(_rl, "Failed to parse police actions");
> +    VLOG_ERR_RL(_rl,
> +    "Failed to parse actions in netlink to
> policer");
>  return EPROTO;
>  }
> 
> @@ -2135,7 +2136,7 @@ parse_netlink_to_tc_policer(struct ofpbuf
> *reply, uint32_t police_idx[])
>  if (actions_orders[i]) {
>  memset(, 0, sizeof(struct tc_flower));
>  err = nl_parse_single_action(actions_orders[i], ,
> false);
> -    if (err) {
> +    if (err || flower.actions[0].type != TC_ACT_POLICE) {
>  continue;
>  }
>  if (flower.actions[0].police.index) {
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] northd.c: Add flow to skip put_nd action if ip6.src or nd.sll is 0

2022-06-27 Thread Ales Musil
The ip6.src or nd.sll does not have to be always set.
According to rfc4861:

Source Address
 Either an address assigned to the interface from
 which this message is sent or (if Duplicate Address
 Detection is in progress [ADDRCONF]) the
 unspecified address.

Source link-layer address
 The link-layer address for the sender.  MUST NOT be
 included when the source IP address is the
 unspecified address.  Otherwise, on link layers
 that have addresses this option MUST be included in
 multicast solicitations and SHOULD be included in
 unicast solicitations.

Add rule that avoids adding MAC binding is either of those
is 0. This is continuation after discussion during review on
80187a803 (ovn-northd: Add flow to use eth.src if nd.tll is 0
in put_nd() action.)

Signed-off-by: Ales Musil 
---
 northd/northd.c | 3 +++
 northd/ovn-northd.8.xml | 6 ++
 tests/ovn-northd.at | 3 +++
 3 files changed, 12 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index 6997c280c..6634edb0f 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -11022,6 +11022,9 @@ build_neigh_learning_flows_for_lrouter(
 ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 100,
   ds_cstr(match), "next;");
 
+ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 95,
+  "nd_ns && (ip6.src == 0 || nd.sll == 0)", "next;");
+
 ovn_lflow_metered(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 90,
   "arp", "put_arp(inport, arp.spa, arp.sha); next;",
   copp_meter_get(COPP_ARP, od->nbr->copp,
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 59c584710..5df74a410 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -2332,6 +2332,12 @@ next;
 to learn the neighbor.
   
 
+  
+A priority-95 flow with the match nd_ns 
+  (ip6.src == 0 || nd.sll == 0) and applies the action
+next;
+  
+
   
 A priority-90 flow with the match arp and
 applies the action
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 033b58b8c..fe97bedad 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -6707,6 +6707,7 @@ AT_CHECK([ovn-sbctl dump-flows DR | grep -e lr_in_unsnat 
-e lr_out_snat -e lr_in
 
 AT_CLEANUP
 
+OVN_FOR_EACH_NORTHD([
 AT_SETUP([LR NB Static_MAC_Binding table])
 ovn_start
 
@@ -6730,6 +6731,7 @@ ovn-nbctl --may-exist static-mac-binding-add lr0-p0 
192.168.10.100 00:00:22:33:5
 wait_row_count Static_MAC_Binding 1 logical_port=lr0-p0 ip=192.168.10.100 
mac="00\:00\:22\:33\:55\:66"
 
 AT_CLEANUP
+])
 
 OVN_FOR_EACH_NORTHD([
 AT_SETUP([LR neighbor lookup and learning flows])
@@ -6751,6 +6753,7 @@ AT_CHECK([cat lrflows | grep -e lr_in_lookup_neighbor -e 
lr_in_learn_neighbor |
   table=2 (lr_in_learn_neighbor), priority=90   , match=(nd_na), 
action=(put_nd(inport, nd.target, nd.tll); next;)
   table=2 (lr_in_learn_neighbor), priority=90   , match=(nd_ns), 
action=(put_nd(inport, ip6.src, nd.sll); next;)
   table=2 (lr_in_learn_neighbor), priority=95   , match=(nd_na && nd.tll == 
0), action=(put_nd(inport, nd.target, eth.src); next;)
+  table=2 (lr_in_learn_neighbor), priority=95   , match=(nd_ns && (ip6.src == 
0 || nd.sll == 0)), action=(next;)
 ])
 
 AT_CLEANUP
-- 
2.35.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 12/17] tests: verify flows in odp.at are parseable

2022-06-27 Thread Eelco Chaudron



On 16 Jun 2022, at 8:32, Adrian Moreno wrote:

> Create a small helper script and check that flows tested in odp.at are
> parseable.
>
> Signed-off-by: Adrian Moreno 

Changes look good.

Acked-by: Eelco Chaudron 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 13/17] python: introduce unit tests

2022-06-27 Thread Eelco Chaudron



On 16 Jun 2022, at 8:32, Adrian Moreno wrote:

> Use pytest to run unit tests as part of the standard testsuite.
>
> Signed-off-by: Adrian Moreno 

Changes look good to me.

Acked-by: Eelco Chaudron 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 11/17] tests: verify flows in ofp-actions are parseable

2022-06-27 Thread Eelco Chaudron



On 16 Jun 2022, at 8:32, Adrian Moreno wrote:

> Create a small helper script and check that flows used in ofp-actions.at
> are parseable.
>
> Signed-off-by: Adrian Moreno 
> ---

Changes look good to me.

Acked-by: Eelco Chaudron 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [v5 5/8] netdev-offload-tc: Implement meter offload API for tc

2022-06-27 Thread Eelco Chaudron


On 27 Jun 2022, at 11:32, Jianbo Liu wrote:

> On Mon, 2022-06-27 at 11:03 +0200, Eelco Chaudron wrote:
>>
>>
>> On 21 Jun 2022, at 5:29, Jianbo Liu wrote:
>>
>>> On Mon, 2022-06-20 at 12:15 +0200, Eelco Chaudron wrote:
 On 27 May 2022, at 11:00, Jianbo Liu wrote:

> For dpif-netlink, meters are mapped by tc to police actions
> with
> one-to-one relationship. Implement meter offload API to
> set/get/del
> the police action, and a hmap is used to save the mappings.
> An id-pool is used to manage all the available police indexes,
> which
> are 0x1000-0x1fff, reserved only for OVS.
>
> Signed-off-by: Jianbo Liu 
> ---
>>
>> 
>>
> +
> +static int
> +meter_tc_set_policer(ofproto_meter_id meter_id,
> + struct ofputil_meter_config *config)
> +{
> +    uint32_t police_index;
> +    uint32_t rate, burst;
> +    bool add_policer;
> +    int err;
> +
> +    ovs_assert(config->bands != NULL);

 I would remove the assert and just return and error if bands ==
 NULL
 or n_bands is < 1.
 So:

     if (!config->bands || config->n_bands < 1) {
     return EINVAL;
     }

>>>
>>> I think maybe not necessary, as dpif_netlink_meter_set__() is
>>> called
>>> before, and it doesn't check if bands is not NULL and n_bands >= 1.
>>
>> You are right it’s assuming it’s non-null and walks through the
>> n_bands.
>>
>> So maybe you should just ignore if the bands is not 1, so something
>> like:
>>
>>   if (!config->bands || config->n_bands < 1) {
>>     return 0;
>>   }
>>
>> The dpif_netlink_meter_set__() also check is the band type is
>> OVS_METER_BAND_TYPE_DROP. Not sure if you need to check this also
>> when setting the stats, as now your code assumes this.
>
> How about adding one more check?
> if (!config->bands || config->n_bands < 1
> || config->bands[0].type != OFPMBT13_DROP) {
> return 0;
> }

That will work, make sure you also update your get stats function with a 
similar check.

>>
> +    rate = config->bands[0].rate;
> +    if (config->flags & OFPMF13_BURST) {
> +    burst = config->bands[0].burst_size;
> +    } else {
> +    burst = config->bands[0].rate;
> +    }
>>
>> 
>>
> +static int
> +meter_tc_del_policer(ofproto_meter_id meter_id,
> + struct ofputil_meter_stats *stats,
> + uint16_t max_bands OVS_UNUSED)
> +{
> +    uint32_t police_index;
> +    int err = ENOENT;
> +
> +    if (!meter_id_lookup(meter_id.uint32, _index)) {
> +    err = tc_del_policer_action(police_index, stats);
> +    if (err && err != ENOENT) {
> +    VLOG_WARN_RL(_rl,
> + "Failed to del police %u for meter
> %u:
> %s",
> + police_index, meter_id.uint32,
> ovs_strerror(err));

 Why do we not cleanup the meter id from the pool on TC failure?
 Do we
 assume the entry exists and it was not cleaned up correctly?

>>>
>>> Just in case, I don't think it's possible becausue the reserved
>>> police
>>> index is big enough, and no other application will use them.
>>
>> Yes, we should have enough IDs, was just curious why. Did you
>> encounter this when testing?
>
> No, I didn't.
>
>>
 If this is the case, this should be an error log, not a warning,
 as
 something is wrong in this environment.
>>>
>>> OK, I will add an error log.
>>
>> Thanks, this will at least trigger some attention to a potential
>> problem.
>>

> +    } else {
> +    meter_free_police_index(police_index);
> +    }
> +    meter_id_remove(meter_id.uint32);
> +    }
> +
> +    return err;
> +}
> +
>  const struct netdev_flow_api netdev_offload_tc = {
>     .type = "linux_tc",
>     .flow_flush = netdev_tc_flow_flush,
> @@ -2312,5 +2512,8 @@ const struct netdev_flow_api
> netdev_offload_tc = {
>     .flow_get = netdev_tc_flow_get,
>     .flow_del = netdev_tc_flow_del,
>     .flow_get_n_flows = netdev_tc_get_n_flows,
> +   .meter_set = meter_tc_set_policer,
> +   .meter_get = meter_tc_get_policer,
> +   .meter_del = meter_tc_del_policer,
>     .init_flow_api = netdev_tc_init_flow_api,
>  };
> -- 
> 2.26.2

>>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [v5 8/8] dpif-netlink: Offloading meter to tc police action

2022-06-27 Thread Eelco Chaudron


On 21 Jun 2022, at 10:22, Jianbo Liu wrote:

> On Mon, 2022-06-20 at 12:18 +0200, Eelco Chaudron wrote:
>> On 27 May 2022, at 11:00, Jianbo Liu wrote:
>>
>>> OVS meters are created in advance and openflow rules refer to them
>>> by
>>> their unique ID. New tc_police API is used to offload them. By
>>> calling
>>> the API, police actions are created and meters are mapped to them.
>>> These actions then can be used in tc filter rules by the index.
>>>
>>> Signed-off-by: Jianbo Liu 
>>> ---
>>>  NEWS |  2 ++
>>>  lib/dpif-netlink.c   | 31 +
>>>  tests/system-offloads-traffic.at | 48
>>> 
>>>  3 files changed, 76 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/NEWS b/NEWS
>>> index eece0d0b2..dfd659d4e 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -29,6 +29,8 @@ Post-v2.17.0
>>>     - Windows:
>>>   * Conntrack support for TCPv6, UDPv6, ICMPv6, FTPv6.
>>>   * IPv6 Geneve tunnel support.
>>> +   - Linux datapath:
>>> + * Add offloading meter tc police.
>>>
>>>
>>>  v2.17.0 - 17 Feb 2022
>>> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
>>> index 06e1e8ca0..0af9ee77e 100644
>>> --- a/lib/dpif-netlink.c
>>> +++ b/lib/dpif-netlink.c
>>> @@ -4163,11 +4163,18 @@ static int
>>>  dpif_netlink_meter_set(struct dpif *dpif_, ofproto_meter_id
>>> meter_id,
>>>     struct ofputil_meter_config *config)
>>>  {
>>> +    int err;
>>> +
>>>  if (probe_broken_meters(dpif_)) {
>>>  return ENOMEM;
>>>  }
>>>
>>> -    return dpif_netlink_meter_set__(dpif_, meter_id, config);
>>> +    err = dpif_netlink_meter_set__(dpif_, meter_id, config);
>>> +    if (!err && netdev_is_flow_api_enabled()) {
>>> +    meter_offload_set(meter_id, config);
>>
>> Although currently we always return 0, we should still account for it
>> to change in the future, so we should set err to the return value.
>>
>
> If there is err from meter_offload_set, it will be passed to the caller
> of dpif_netlink_meter_set(). I don't agree with that, because the
> caller thinks meter_set operation fail, but actually not. Besides, we
> allow the case that dp meter_set success, but offloading fails, so the
> return the error of meter_offload_set seems unnecessary.

If this is the case, we should change the dpif_netlink_meter_set() API to 
return void.
And add a comment to the function why it would not return an error:

--- a/lib/netdev-offload.c
+++ b/lib/netdev-offload.c
@@ -196,7 +196,7 @@ netdev_assign_flow_api(struct netdev *netdev)
 return -1;
 }

-int
+void
 meter_offload_set(ofproto_meter_id meter_id,
   struct ofputil_meter_config *config)
 {
@@ -212,8 +212,8 @@ meter_offload_set(ofproto_meter_id meter_id,
}
 }
 }
-
-return 0;
+/* Offload APIs could fail, for example, because the offload is not
+ * supported. This is fine, as the offload API should take care of this. */
 }



>>  +    err = meter_offload_set(meter_id, config);
>>
>>> +    }
>>> +



>>> diff --git a/tests/system-offloads-traffic.at b/tests/system-
>>> offloads-traffic.at
>>> index 80bc1dd5c..7ec75340f 100644
>>> --- a/tests/system-offloads-traffic.at
>>> +++ b/tests/system-offloads-traffic.at
>>> @@ -168,3 +168,51 @@ matchall
>>>  ])
>>>  OVS_TRAFFIC_VSWITCHD_STOP
>>>  AT_CLEANUP
>>> +
>>> +AT_SETUP([offloads - check if meter offloading ])
>>
>> Can we replace if with interface, as I keep on reading it as "if".
>>
>
> Sure.
>
>>> +AT_KEYWORDS([meter])
>>> +AT_SKIP_IF([test $SUPPORT_TC_INGRESS_PPS = "no"])
>>> +OVS_TRAFFIC_VSWITCHD_START()
>>> +
>>> +AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:hw-
>>> offload=true])
>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=1 pktps
>>> bands=type=drop rate=1'])
>>> +
>>> +ADD_NAMESPACES(at_ns0, at_ns1)
>>> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "f0:00:00:01:01:01")
>>> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24", "f0:00:00:01:01:02")
>>> +
>>> +NS_CHECK_EXEC([at_ns0], [ip neigh add 10.1.1.2 lladdr
>>> f0:00:00:01:01:02 dev p0])
>>> +NS_CHECK_EXEC([at_ns1], [ip neigh add 10.1.1.1 lladdr
>>> f0:00:00:01:01:01 dev p1])
>>> +
>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 "actions=normal"])
>>> +NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2 |
>>> FORMAT_PING], [0], [dnl
>>> +10 packets transmitted, 10 received, 0% packet loss, time 0ms
>>> +])
>>> +
>>> +NETNS_DAEMONIZE([at_ns1], [nc -u -l 5678 > /dev/null ], [nc0.pid])
>>> +
>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 del-flows br0])
>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
>>> "priority=10,in_port=ovs-p0,udp actions=meter:1,normal"])
>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 "priority=1
>>> actions=normal"])
>>> +
>>> +NS_CHECK_EXEC([at_ns0], [sleep 0.5; echo "mark" | nc -u 10.1.1.2
>>> 5678 -p 6789])
>>
>> Any specific reason why you need the sleep 0.5 here? Is it to be sure
>> the flow is programmed?
>
> I remember I added this because there 

Re: [ovs-dev] [v5 5/8] netdev-offload-tc: Implement meter offload API for tc

2022-06-27 Thread Jianbo Liu via dev
On Mon, 2022-06-27 at 11:03 +0200, Eelco Chaudron wrote:
> 
> 
> On 21 Jun 2022, at 5:29, Jianbo Liu wrote:
> 
> > On Mon, 2022-06-20 at 12:15 +0200, Eelco Chaudron wrote:
> > > On 27 May 2022, at 11:00, Jianbo Liu wrote:
> > > 
> > > > For dpif-netlink, meters are mapped by tc to police actions
> > > > with
> > > > one-to-one relationship. Implement meter offload API to
> > > > set/get/del
> > > > the police action, and a hmap is used to save the mappings.
> > > > An id-pool is used to manage all the available police indexes,
> > > > which
> > > > are 0x1000-0x1fff, reserved only for OVS.
> > > > 
> > > > Signed-off-by: Jianbo Liu 
> > > > ---
> 
> 
> 
> > > > +
> > > > +static int
> > > > +meter_tc_set_policer(ofproto_meter_id meter_id,
> > > > + struct ofputil_meter_config *config)
> > > > +{
> > > > +    uint32_t police_index;
> > > > +    uint32_t rate, burst;
> > > > +    bool add_policer;
> > > > +    int err;
> > > > +
> > > > +    ovs_assert(config->bands != NULL);
> > > 
> > > I would remove the assert and just return and error if bands ==
> > > NULL
> > > or n_bands is < 1.
> > > So:
> > > 
> > >     if (!config->bands || config->n_bands < 1) {
> > >     return EINVAL;
> > >     }
> > > 
> > 
> > I think maybe not necessary, as dpif_netlink_meter_set__() is
> > called
> > before, and it doesn't check if bands is not NULL and n_bands >= 1.
> 
> You are right it’s assuming it’s non-null and walks through the
> n_bands.
> 
> So maybe you should just ignore if the bands is not 1, so something
> like:
> 
>   if (!config->bands || config->n_bands < 1) {
>     return 0;
>   }
> 
> The dpif_netlink_meter_set__() also check is the band type is
> OVS_METER_BAND_TYPE_DROP. Not sure if you need to check this also
> when setting the stats, as now your code assumes this.

How about adding one more check?
if (!config->bands || config->n_bands < 1
|| config->bands[0].type != OFPMBT13_DROP) {
return 0;
} 

> 
> > > > +    rate = config->bands[0].rate;
> > > > +    if (config->flags & OFPMF13_BURST) {
> > > > +    burst = config->bands[0].burst_size;
> > > > +    } else {
> > > > +    burst = config->bands[0].rate;
> > > > +    }
> 
> 
> 
> > > > +static int
> > > > +meter_tc_del_policer(ofproto_meter_id meter_id,
> > > > + struct ofputil_meter_stats *stats,
> > > > + uint16_t max_bands OVS_UNUSED)
> > > > +{
> > > > +    uint32_t police_index;
> > > > +    int err = ENOENT;
> > > > +
> > > > +    if (!meter_id_lookup(meter_id.uint32, _index)) {
> > > > +    err = tc_del_policer_action(police_index, stats);
> > > > +    if (err && err != ENOENT) {
> > > > +    VLOG_WARN_RL(_rl,
> > > > + "Failed to del police %u for meter
> > > > %u:
> > > > %s",
> > > > + police_index, meter_id.uint32,
> > > > ovs_strerror(err));
> > > 
> > > Why do we not cleanup the meter id from the pool on TC failure?
> > > Do we
> > > assume the entry exists and it was not cleaned up correctly?
> > > 
> > 
> > Just in case, I don't think it's possible becausue the reserved
> > police
> > index is big enough, and no other application will use them.
> 
> Yes, we should have enough IDs, was just curious why. Did you
> encounter this when testing?

No, I didn't.

> 
> > > If this is the case, this should be an error log, not a warning,
> > > as
> > > something is wrong in this environment.
> > 
> > OK, I will add an error log.
> 
> Thanks, this will at least trigger some attention to a potential
> problem.
> 
> > > 
> > > > +    } else {
> > > > +    meter_free_police_index(police_index);
> > > > +    }
> > > > +    meter_id_remove(meter_id.uint32);
> > > > +    }
> > > > +
> > > > +    return err;
> > > > +}
> > > > +
> > > >  const struct netdev_flow_api netdev_offload_tc = {
> > > >     .type = "linux_tc",
> > > >     .flow_flush = netdev_tc_flow_flush,
> > > > @@ -2312,5 +2512,8 @@ const struct netdev_flow_api
> > > > netdev_offload_tc = {
> > > >     .flow_get = netdev_tc_flow_get,
> > > >     .flow_del = netdev_tc_flow_del,
> > > >     .flow_get_n_flows = netdev_tc_get_n_flows,
> > > > +   .meter_set = meter_tc_set_policer,
> > > > +   .meter_get = meter_tc_get_policer,
> > > > +   .meter_del = meter_tc_del_policer,
> > > >     .init_flow_api = netdev_tc_init_flow_api,
> > > >  };
> > > > -- 
> > > > 2.26.2
> > > 
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [v5 6/8] netdev-offload-tc: Cleanup police actions with reserved indexes on startup

2022-06-27 Thread Eelco Chaudron


On 21 Jun 2022, at 5:51, Jianbo Liu wrote:

> On Mon, 2022-06-20 at 12:16 +0200, Eelco Chaudron wrote:
>> On 27 May 2022, at 11:00, Jianbo Liu wrote:
>>
>>> As the police actions with indexes of 0x1000-0x1fff are
>>> reserved for meter offload, to provide a clean environment for OVS,
>>> these reserved police actions should be deleted on startup. So dump
>>> all the police actions, delete those actions with indexes in this
>>> range.
>>>
>>> Signed-off-by: Jianbo Liu 
>>> ---



>>> +int
>>> +parse_netlink_to_tc_policer(struct ofpbuf *reply, uint32_t
>>> police_idx[])
>>> +{
>>> +    static struct nl_policy
>>> actions_orders_policy[TCA_ACT_MAX_PRIO] = {};
>>> +    struct nlattr
>>> *actions_orders[ARRAY_SIZE(actions_orders_policy)];
>>> +    const int max_size = ARRAY_SIZE(actions_orders_policy);
>>> +    const struct nlattr *actions;
>>> +    struct tc_flower flower;
>>> +    struct tcamsg *tca;
>>> +    int i, cnt = 0;
>>> +    int err;
>>> +
>>> +    for (i = 0; i < max_size; i++) {
>>> +    actions_orders_policy[i].type = NL_A_NESTED;
>>> +    actions_orders_policy[i].optional = true;
>>> +    }
>>> +
>>> +    tca = ofpbuf_at_assert(reply, NLMSG_HDRLEN, sizeof *tca);
>>> +    actions = nl_attr_find(reply, NLMSG_HDRLEN + sizeof *tca,
>>> TCA_ACT_TAB);
>>> +    if (!actions || !nl_parse_nested(actions,
>>> actions_orders_policy,
>>> + actions_orders, max_size)) {
>>> +    VLOG_ERR_RL(_rl, "Failed to parse police actions");
>>
>> These could be all kinds of actions, not only police ones, i.e.
>> whatever is programmed.
>> Maybe change this to "Failed to parse flower actions".
>>
>
> But this function is intended for police only, and its name is ended
> with "_policer". Do you want to make it generel func for all actions?

As you could receive any set of actions from netlink (all that exists in the 
system), the error might not be specific to a police action, that’s why I think 
it should be removed.

>>> +    return EPROTO;
>>> +    }
>>> +
>>> +    for (i = 0; i < TCA_ACT_MAX_PRIO; i++) {
>>> +    if (actions_orders[i]) {
>>> +    memset(, 0, sizeof(struct tc_flower));
>>> +    err = nl_parse_single_action(actions_orders[i],
>>> , false);
>>> +    if (err) {
>>> +    continue;
>>> +    }
>>> +    if (flower.actions[0].police.index) {
>>
>> You need to make sure this is the Police action (see the previous
>> review), i.e.,
>>
>
> I remember your comment in previous version. Sorry I didn't change here
> becasue the same reason above.

Let me try to be more clear. Whatever you receive from netlink could be any 
kind of action, i.e. whatever is programmed in the system.
So if someone manually added add list of actions, in the ID range, you think 
it’s Police but it could be something else.

What about the following for the two comments above:

--- a/lib/tc.c
+++ b/lib/tc.c
@@ -2127,7 +2127,8 @@ parse_netlink_to_tc_policer(struct ofpbuf *reply, 
uint32_t police_idx[])
 actions = nl_attr_find(reply, NLMSG_HDRLEN + sizeof *tca, TCA_ACT_TAB);
 if (!actions || !nl_parse_nested(actions, actions_orders_policy,
  actions_orders, max_size)) {
-VLOG_ERR_RL(_rl, "Failed to parse police actions");
+VLOG_ERR_RL(_rl,
+"Failed to parse actions in netlink to policer");
 return EPROTO;
 }

@@ -2135,7 +2136,7 @@ parse_netlink_to_tc_policer(struct ofpbuf *reply, 
uint32_t police_idx[])
 if (actions_orders[i]) {
 memset(, 0, sizeof(struct tc_flower));
 err = nl_parse_single_action(actions_orders[i], , false);
-if (err) {
+if (err || flower.actions[0].type != TC_ACT_POLICE) {
 continue;
 }
 if (flower.actions[0].police.index) {

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [v5 5/8] netdev-offload-tc: Implement meter offload API for tc

2022-06-27 Thread Eelco Chaudron


On 21 Jun 2022, at 5:29, Jianbo Liu wrote:

> On Mon, 2022-06-20 at 12:15 +0200, Eelco Chaudron wrote:
>> On 27 May 2022, at 11:00, Jianbo Liu wrote:
>>
>>> For dpif-netlink, meters are mapped by tc to police actions with
>>> one-to-one relationship. Implement meter offload API to set/get/del
>>> the police action, and a hmap is used to save the mappings.
>>> An id-pool is used to manage all the available police indexes,
>>> which
>>> are 0x1000-0x1fff, reserved only for OVS.
>>>
>>> Signed-off-by: Jianbo Liu 
>>> ---



>>> +
>>> +static int
>>> +meter_tc_set_policer(ofproto_meter_id meter_id,
>>> + struct ofputil_meter_config *config)
>>> +{
>>> +    uint32_t police_index;
>>> +    uint32_t rate, burst;
>>> +    bool add_policer;
>>> +    int err;
>>> +
>>> +    ovs_assert(config->bands != NULL);
>>
>> I would remove the assert and just return and error if bands == NULL
>> or n_bands is < 1.
>> So:
>>
>>     if (!config->bands || config->n_bands < 1) {
>>     return EINVAL;
>>     }
>>
>
> I think maybe not necessary, as dpif_netlink_meter_set__() is called
> before, and it doesn't check if bands is not NULL and n_bands >= 1.

You are right it’s assuming it’s non-null and walks through the n_bands.

So maybe you should just ignore if the bands is not 1, so something like:

  if (!config->bands || config->n_bands < 1) {
return 0;
  }

The dpif_netlink_meter_set__() also check is the band type is 
OVS_METER_BAND_TYPE_DROP. Not sure if you need to check this also when setting 
the stats, as now your code assumes this.

>>> +    rate = config->bands[0].rate;
>>> +    if (config->flags & OFPMF13_BURST) {
>>> +    burst = config->bands[0].burst_size;
>>> +    } else {
>>> +    burst = config->bands[0].rate;
>>> +    }



>>> +static int
>>> +meter_tc_del_policer(ofproto_meter_id meter_id,
>>> + struct ofputil_meter_stats *stats,
>>> + uint16_t max_bands OVS_UNUSED)
>>> +{
>>> +    uint32_t police_index;
>>> +    int err = ENOENT;
>>> +
>>> +    if (!meter_id_lookup(meter_id.uint32, _index)) {
>>> +    err = tc_del_policer_action(police_index, stats);
>>> +    if (err && err != ENOENT) {
>>> +    VLOG_WARN_RL(_rl,
>>> + "Failed to del police %u for meter %u:
>>> %s",
>>> + police_index, meter_id.uint32,
>>> ovs_strerror(err));
>>
>> Why do we not cleanup the meter id from the pool on TC failure? Do we
>> assume the entry exists and it was not cleaned up correctly?
>>
>
> Just in case, I don't think it's possible becausue the reserved police
> index is big enough, and no other application will use them.

Yes, we should have enough IDs, was just curious why. Did you encounter this 
when testing?

>> If this is the case, this should be an error log, not a warning, as
>> something is wrong in this environment.
>
> OK, I will add an error log.

Thanks, this will at least trigger some attention to a potential problem.

>>
>>> +    } else {
>>> +    meter_free_police_index(police_index);
>>> +    }
>>> +    meter_id_remove(meter_id.uint32);
>>> +    }
>>> +
>>> +    return err;
>>> +}
>>> +
>>>  const struct netdev_flow_api netdev_offload_tc = {
>>>     .type = "linux_tc",
>>>     .flow_flush = netdev_tc_flow_flush,
>>> @@ -2312,5 +2512,8 @@ const struct netdev_flow_api
>>> netdev_offload_tc = {
>>>     .flow_get = netdev_tc_flow_get,
>>>     .flow_del = netdev_tc_flow_del,
>>>     .flow_get_n_flows = netdev_tc_get_n_flows,
>>> +   .meter_set = meter_tc_set_policer,
>>> +   .meter_get = meter_tc_get_policer,
>>> +   .meter_del = meter_tc_del_policer,
>>>     .init_flow_api = netdev_tc_init_flow_api,
>>>  };
>>> -- 
>>> 2.26.2
>>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-linux: do not touch LAG slaves if master is not attached to ovs.

2022-06-27 Thread Roi Dayan via dev



On 2022-06-27 10:49 AM, Tao Liu wrote:

On Mon, Jun 27, 2022 at 09:40:38AM +0300, Roi Dayan wrote:



On 2022-06-23 12:42 PM, Roi Dayan wrote:



On 2022-06-22 11:56 PM, Marcelo Ricardo Leitner wrote:

On Wed, Jun 22, 2022 at 11:10:17AM +0800, Tao Liu wrote:

On Tue, Jun 21, 2022 at 06:49:42AM -0700, Marcelo Ricardo Leitner wrote:

Hi,

On Tue, Jun 21, 2022 at 02:10:39PM +0800, Tao Liu wrote:

Bond master netdev may be created without a classification type, due
to routing or tunneling code.


Can you please elaborate on why is this an issue?


Hi, thanks for your reply.

We are using BlueField2 in Bare Metal Server.
p0 and p1 enslave to master bond0. A SF is created for RDMA.
ovs manages pf0hpf and gre.

Traffics between bond0 and SF are controlled by tc rules.
```
master=bond0
slave1=p0
slave2=p1
sf=enp3s0f0s0
sf_rep=en3f0pf0sf0
$tc qdisc add dev $master ingress_block 22 ingress
$tc qdisc add dev $slave1 ingress_block 22 ingress
$tc qdisc add dev $slave2 ingress_block 22 ingress
$tc qdisc add dev $sf_rep ingress

# some customized tc rules
$tc filter add dev $sf_rep pref 1 ingress ... action mirred
egress redirect dev $master

$tc filter add block 22 pref 1 ... action mirred egress redirect
dev $sf_rep
```

Unfortunately ingress_block on p0 or p1 may be deleted and recreated by
ovs when they enslaves to bond0.

Also we have a similar architectur on cx5/cx6.



If bond master is not attached to ovs, the ingress block
on slaves shoud
not be updated.


Why? (in short, but more below)


If we do not use bond master and slaves in ovs, they shoud not
be interfered.


Makes sense, ...





Simple reproducer:
    ip a add 10.1.1.1/30 dev bond0
    ip l set net3 master bond0
    tc q ls dev net3

Fixes: d22f8927c3c9 ("netdev-linux: monitor and offload
LAG slaves to TC")
Signed-off-by: Tao Liu 
---
   lib/netdev-linux.c | 5 -
   1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 9d12502..b9e0c99 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -682,7 +682,10 @@ netdev_linux_update_lag(struct
rtnetlink_change *change)
   return;
   }

-    if (is_netdev_linux_class(master_netdev->netdev_class)) {
+    /* If bond master is not attached to ovs,
ingress block on slaves
+ * shoud not be updated. */


I think this will break a core use case. As in your reproducer, that's
pretty much how it is expected to work today with tunnels, for
example:

    ip a add 10.1.1.1/30 dev bond0
    ip l set net3 master bond0
    ip l s bond0 up
    ovs-vsctl add-port ovsbr0 vxlan0 -- \
 set interface vxlan0
 type=vxlan \
 options:local_ip=10.1.1.1
 options:remote_ip=10.1.1.2
 options:key=0

If you patch like this, then who would be adding the ingress qdiscs on
the slaves?

Forcing the bond to be added is probably not optimal, because it
doesn't really need to be. Unless your considering that as some sort
of authorization for ovs to mangle with it?

    Marcelo


This patch does not break the tunnel use case. The ingress attaches on
vxlan_sys, but not need to attach on bond master or slaves.


... I was under the impression that the driver needed it somehow,
despite that. But apparently that's not right, as the driver will have
one indr callback for each uplink, and should be able to have its way
from there.

Thanks for the explanations.

Reviewed-by: Marcelo Ricardo Leitner 



there is an example without bond about ingress. We can
use vxlan port in ovs with tunnel ip assigned on the pf.
like Tao mentioned, the ingress and tc rules are created on
the vxlan netdev created by ovs but ovs doesnt recreate ingress
on the pf and the rules are offloaded by the driver.

also looks good to me. thanks.

Reviewed-by: Roi Dayan 



Hmm, sorry but looks like I did review too quick.
I did some manual testing now and ovs doesn't add
ingress to the slave devices for me.
master_netdev->auto_classified==true in my case.
can you describe the steps you did to test? or you just tested
the flow where bond is not attached to ovs?


# ovs-vsctl show
866b5d8e-bf4d-4600-9dd3-b724036c7e99
 ovs_version: "2.17.90"

# echo "+bond0" > /sys/class/net/bonding_masters

# ip l show dev bond0
21: bond0:  mtu 1500 qdisc noop state DOWN mode 
DEFAULT group default qlen 1000
 link/ether 94:29:2f:70:22:9c brd ff:ff:ff:ff:ff:ff
# ip a add 10.1.1.1/30 dev bond0

# ip l set net3 down
# tc q add dev net3 ingress_block 111 ingress
# tc q ls dev net3 ingress
 qdisc ingress : parent :fff1 ingress_block 111 
# ip l set net3 master bond0
# tc q ls dev net3 ingress
 qdisc ingress : parent :fff1 ingress_block 22 

Also add some print in netdev_linux_update_lag():
2022-06-27T07:43:45.324Z|00058|netdev_linux|INFO|LAG: master_name=bond0, 
master_ifindex=22, auto_classified=1


you show a test for ovs adding ingress to slaves when bond0
is not part of ovs.
now with your patch also 

Re: [ovs-dev] [v5 4/8] netdev-linux: Add functions to manipulate tc police action

2022-06-27 Thread Eelco Chaudron


On 21 Jun 2022, at 5:09, Jianbo Liu wrote:

> On Mon, 2022-06-20 at 12:14 +0200, Eelco Chaudron wrote:
>> On 27 May 2022, at 11:00, Jianbo Liu wrote:
>>
>>> Add helpers to add, delete and get stats of police action with
>>> the specified index.
>>>
>>> Signed-off-by: Jianbo Liu 
>>> ---



>>> +static int
>>> +nl_parse_action_stats(struct nlattr *act_stats,
>>> +  struct ovs_flow_stats *stats_sw,
>>> +  struct ovs_flow_stats *stats_hw,
>>> +  struct ovs_flow_stats *stats_dropped)
>>> +{
>>> +    struct gnet_stats_basic bs_sw = { .packets = 0, .bytes = 0, };
>>> +    struct nlattr *stats_attrs[ARRAY_SIZE(stats_policy)];
>>> +    const struct gnet_stats_basic *bs_all = NULL;
>>> +    const struct gnet_stats_basic *bs_hw = NULL;
>>> +    const struct gnet_stats_queue *qs;
>>
>> We need a null check for stats here.
>>
>> if (!stats) {
>>    return EINVAL;
>> }
>
> Are you talking about act_stats?
> It should not be NULL, because [TCA_ACT_STATS] is not optional in
> act_policy.

You are right, it’s act_stats and the attribute is not an optional option. For 
some reason, I was thinking that nl_parse_action_stats() was a none-static 
function.

Please ignore the comment.


>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] ovn-controller: fixed ovn-installed not always properly added.

2022-06-27 Thread Xavier Simonart
OVN checks whether ovn-installed is already present (in OVS) before updating it.
This might cause ovn-installed related issues in the following case:
- (1) ovn-installed is present
- (2) we claim the interface
- (3) we update ovs, removing ovn-installed and start installing flows
- (4) (next loop), after flows installed, we check if ovn-installed is absent,
  and if yes, we update OVS with ovn-installed.
However, in step4, if OVS is still busy from step 3, ovn-installed is read as
present; hence OVN controller does not update it and move to INSTALLED state.

Note that this does not happen with writing port up in SBDB because Port status
changes will hit I-P.

This patch will require updating the state machine description in the
"controller: avoid recomputes triggered by SBDB Port_Binding updates." patch
when it's merged, as it changes the if-status state-machine.

Signed-off-by: Xavier Simonart 
---
 controller/binding.c   | 35 --
 controller/binding.h   |  6 ++
 controller/if-status.c | 43 --
 3 files changed, 80 insertions(+), 4 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 2279570f9..157c381cf 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -643,6 +643,19 @@ local_binding_get_lport_ofport(const struct shash 
*local_bindings,
 u16_to_ofp(lbinding->iface->ofport[0]) : 0;
 }
 
+bool
+local_binding_is_ovn_installed(struct shash *local_bindings,
+   const char *pb_name)
+{
+struct local_binding *lbinding =
+local_binding_find(local_bindings, pb_name);
+if (lbinding && lbinding->iface) {
+return smap_get_bool(>iface->external_ids,
+ OVN_INSTALLED_EXT_ID, false);
+}
+return false;
+}
+
 bool
 local_binding_is_up(struct shash *local_bindings, const char *pb_name)
 {
@@ -715,6 +728,22 @@ local_binding_set_up(struct shash *local_bindings, const 
char *pb_name,
 }
 }
 
+void
+local_binding_remove_ovn_installed(struct shash *local_bindings,
+   const char *pb_name, bool ovs_readonly)
+{
+struct local_binding *lbinding =
+local_binding_find(local_bindings, pb_name);
+
+if (!ovs_readonly && lbinding && lbinding->iface
+&& smap_get_bool(>iface->external_ids,
+ OVN_INSTALLED_EXT_ID, false)) {
+VLOG_INFO("Removing lport %s ovn-installed in OVS", pb_name);
+ovsrec_interface_update_external_ids_delkey(lbinding->iface,
+OVN_INSTALLED_EXT_ID);
+}
+}
+
 void
 local_binding_set_down(struct shash *local_bindings, const char *pb_name,
const struct sbrec_chassis *chassis_rec,
@@ -1297,9 +1326,11 @@ consider_vif_lport_(const struct sbrec_port_binding *pb,
 const char *requested_chassis_option = smap_get(
 >options, "requested-chassis");
 VLOG_INFO_RL(,
-"Not claiming lport %s, chassis %s requested-chassis %s",
+"Not claiming lport %s, chassis %s requested-chassis %s "
+"pb->chassis %s",
 pb->logical_port, b_ctx_in->chassis_rec->name,
-requested_chassis_option ? requested_chassis_option : "[]");
+requested_chassis_option ? requested_chassis_option : "[]",
+pb->chassis ? pb->chassis->name: "");
 }
 }
 
diff --git a/controller/binding.h b/controller/binding.h
index 1fed06674..445bdd9f2 100644
--- a/controller/binding.h
+++ b/controller/binding.h
@@ -152,6 +152,12 @@ ofp_port_t local_binding_get_lport_ofport(const struct 
shash *local_bindings,
   const char *pb_name);
 
 bool local_binding_is_up(struct shash *local_bindings, const char *pb_name);
+bool local_binding_is_ovn_installed(struct shash *local_bindings,
+const char *pb_name);
+void local_binding_remove_ovn_installed(struct shash *local_bindings,
+const char *pb_name,
+bool ovs_readonly);
+
 bool local_binding_is_down(struct shash *local_bindings, const char *pb_name);
 void local_binding_set_up(struct shash *local_bindings, const char *pb_name,
   const struct sbrec_chassis *chassis_rec,
diff --git a/controller/if-status.c b/controller/if-status.c
index ad61844d8..fe544c6ec 100644
--- a/controller/if-status.c
+++ b/controller/if-status.c
@@ -57,6 +57,9 @@ enum if_state {
 OIF_INSTALL_FLOWS, /* Already claimed interface for which flows are still
 * being installed.
 */
+OIF_REMOVE_OVN_INST,/* Interface with flows successfully installed in OVS,
+ * but with ovn-installed still in OVSDB.
+ */
 OIF_MARK_UP,   /* Interface with 

Re: [ovs-dev] [v5 1/8] netdev-offload: Add meter offload API

2022-06-27 Thread Eelco Chaudron


On 21 Jun 2022, at 4:40, Jianbo Liu wrote:

> On Mon, 2022-06-20 at 12:12 +0200, Eelco Chaudron wrote:
>>
>> On 27 May 2022, at 11:00, Jianbo Liu wrote:
>>
>>> Add API to offload meter to HW, and the corresponding functions to
>>> call
>>> the meter callbacks from all the registered flow API providers.
>>> The interfaces are like those related to meter in dpif_class, in
>>> order
>>> to pass necessary info to HW.
>>>
>>> Signed-off-by: Jianbo Liu 
>>> ---
>>>  lib/netdev-offload-provider.h | 19 +
>>>  lib/netdev-offload.c  | 51
>>> +++
>>>  lib/netdev-offload.h  |  8 ++
>>>  3 files changed, 78 insertions(+)
>>>
>>> diff --git a/lib/netdev-offload-provider.h b/lib/netdev-offload-
>>> provider.h
>>> index 8ff2de983..c3028606b 100644
>>> --- a/lib/netdev-offload-provider.h
>>> +++ b/lib/netdev-offload-provider.h
>>> @@ -94,6 +94,25 @@ struct netdev_flow_api {
>>>   * takes ownership of a packet if errno != EOPNOTSUPP. */
>>>  int (*hw_miss_packet_recover)(struct netdev *, struct dp_packet
>>> *);
>>
>> Your code seems to assume that the stats structure is already
>> initialized, this includes setting up all the possible bands, the
>> stats->n_bands is also set up and should be used in your code (see also
>> my comments on patch  6). If this is true, you can remove the max_band
>> variable from all your callback.
>>
>
> I think yes, because you can see my last patch. I check the return
> value of dpif_netlink_meter_set__() before calling meter_offload_set().
>
> static int
> dpif_netlink_meter_set(struct dpif *dpif_, ofproto_meter_id meter_id,
> struct ofputil_meter_config *config)
> {
> 
>
> err = dpif_netlink_meter_set__(dpif_, meter_id, config);
> if (!err && netdev_is_flow_api_enabled()) {
> meter_offload_set(meter_id, config);
> }
>
> return err;
> }
>
> In dpif_netlink_meter_set__(), we will check the band type.
> So, if no err is returned from dpif_netlink_meter_set__(), we can
> asssume that there is a drop band in the meter.
>
> And you are right about the max_band. I will remove it.

Thanks for making these (and other) changes in the next rev.

>> If this is not the case, you should change your patch 6, to use the
>> max_bands value and increase/update the n_bands if it's currently not
>> set to 1.
>>
>>> +    /* Offloads or modifies the offloaded meter in HW with the
>>> given 'meter_id'
>>> + * and the configuration in 'config'.
>>> + *
>>> + * The meter id specified through 'config->meter_id' is
>>> ignored. */
>>> +    int (*meter_set)(ofproto_meter_id meter_id,
>>> + struct ofputil_meter_config *config);
>>> +
>>> +    /* Queries HW for meter stats with the given 'meter_id'.
>>> + * Store the stats of dropped packets to band 0. */
>>> +    int (*meter_get)(ofproto_meter_id meter_id,
>>> + struct ofputil_meter_stats *stats,
>>> + uint16_t max_bands);
>>> +
>>> +    /* Removes meter 'meter_id' from HW. Stores meter statistics
>>> if successful.
>>> + * 'stats' may be passed in as NULL if no stats are needed. */
>>> +    int (*meter_del)(ofproto_meter_id meter_id,
>>> + struct ofputil_meter_stats *stats,
>>> + uint16_t max_bands);
>>> +
>>
>> I was hoping for some more details on the API, stats and return
>> values.
>> What about the following change?
>
> Yes, I will change as you suggested. Thanks.
>
>>
>> diff --git a/lib/netdev-offload-provider.h b/lib/netdev-offload-
>> provider.h
>> index c3028606b..9d78fd953 100644
>> --- a/lib/netdev-offload-provider.h
>> +++ b/lib/netdev-offload-provider.h
>> @@ -95,20 +95,31 @@ struct netdev_flow_api {
>>  int (*hw_miss_packet_recover)(struct netdev *, struct dp_packet
>> *);
>>
>>  /* Offloads or modifies the offloaded meter in HW with the given
>> 'meter_id'
>> - * and the configuration in 'config'.
>> + * and the configuration in 'config'. On failure, a non-zero
>> error code is
>> + * returned.
>>   *
>>   * The meter id specified through 'config->meter_id' is ignored.
>> */
>>  int (*meter_set)(ofproto_meter_id meter_id,
>>   struct ofputil_meter_config *config);
>>
>> -    /* Queries HW for meter stats with the given 'meter_id'.
>> - * Store the stats of dropped packets to band 0. */
>> +    /* Queries HW for meter stats with the given 'meter_id'. Store
>> the stats
>> + * of dropped packets to band 0. On failure, a non-zero error
>> code is
>> + * returned.
>> + *
>> + * Note that the 'stats' structure is already initialized, and
>> only the
>> + * available statistics should be incremented, not replaced.
>> Those fields
>> + * are packet_in_count, byte_in_count and band[]->byte_count and
>> + * band[]->packet_count. */
>>  int (*meter_get)(ofproto_meter_id meter_id,
>>   struct 

Re: [ovs-dev] [RFC] revalidator: add a USDT probe after evaluation when flows are deleted.

2022-06-27 Thread Eelco Chaudron


On 24 Jun 2022, at 21:18, Kevin Sprague wrote:

> During normal operations, it is useful to understand when a particular flow
> gets removed from the system. This can be useful when debugging performance
> issues tied to ofproto flow changes, trying to determine deployed traffic
> patterns, or while debugging dynamic systems where ports come and go.
>
> Prior to this change, there was a lack of visibility around flow expiration.
> The existing debugging infrastructure could tell us when a flow was added to
> the datapath, but not when it was removed or why.
>
> This change introduces a USDT probe at the point where the revalidator
> determines that the flow should be removed.  Additionally, we track the
> reason for the flow eviction and provide that information as well.  With
> this change, we can track the complete flow lifecycle for the netlink datapath
> by hooking the upcall tracepoint in kernel, the flow put USDT, and the
> revaldiator USDT, letting us watch as flows are added and removed from the
> kernel datapath.
>
> This change only enables this information via USDT probe, so it won't be
> possible to access this information any other way (see:
> Documentation/topics/usdt-probes.rst).
>
> Also included are two scripts (utilities/usdt-scripts/filter_probe.py and
> utilities/usdt-scripts/watch_flows.bt) that serve as demonstrations of how
> the new USDT probes might be used going forward.

Hi Kevin,

I’ll add this to my (long) review list, and will get you some real feedback.
But below some quick remarks just glancing at PROBE only...

//Eelco


> Signed-off-by: Kevin Sprague 
> ---
> We are planning to add filter_probe.py to the flake8 check in autotools.
> In addition, we are planning to investigate different ways of filtering
> flows through eBPF in order to gauge their performance impacts.
>  ofproto/ofproto-dpif-upcall.c  |  46 -
>  utilities/usdt-scripts/filter_probe.py | 263 +
>  utilities/usdt-scripts/watch_flows.bt  | 125 
>  3 files changed, 427 insertions(+), 7 deletions(-)
>  create mode 100755 utilities/usdt-scripts/filter_probe.py
>  create mode 100755 utilities/usdt-scripts/watch_flows.bt
>
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index 57f94df54..8971edb2d 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -31,6 +31,7 @@
>  #include "openvswitch/list.h"
>  #include "netlink.h"
>  #include "openvswitch/ofpbuf.h"
> +#include "openvswitch/usdt-probes.h"
>  #include "ofproto-dpif-ipfix.h"
>  #include "ofproto-dpif-sflow.h"
>  #include "ofproto-dpif-xlate.h"
> @@ -260,6 +261,17 @@ enum ukey_state {
>  };
>  #define N_UKEY_STATES (UKEY_DELETED + 1)
>
> +enum flow_del_reason {
> +FLOW_LIVE = 0,
> +FLOW_TIME_OUT,  /* the flow went unused and was deleted. */
> +TOO_EXPENSIVE,
> +FLOW_WILDCARDED,
> +BAD_ODP_FIT,
> +ASSOCIATED_OFPROTO,
> +XLATION_ERROR,
> +AVOID_CACHING,
> +};
> +
>  /* 'udpif_key's are responsible for tracking the little bit of state udpif
>   * needs to do flow expiration which can't be pulled directly from the
>   * datapath.  They may be created by any handler or revalidator thread at any
> @@ -2202,7 +2214,8 @@ populate_xcache(struct udpif *udpif, struct udpif_key 
> *ukey,
>  static enum reval_result
>  revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
>uint16_t tcp_flags, struct ofpbuf *odp_actions,
> -  struct recirc_refs *recircs, struct xlate_cache *xcache)
> +  struct recirc_refs *recircs, struct xlate_cache *xcache,
> +  enum flow_del_reason *reason)
>  {
>  struct xlate_out *xoutp;
>  struct netflow *netflow;
> @@ -2215,16 +2228,20 @@ revalidate_ukey__(struct udpif *udpif, const struct 
> udpif_key *ukey,
>  .wc = ,
>  };
>
> +int error;
>  result = UKEY_DELETE;
>  xoutp = NULL;
>  netflow = NULL;
>
> -if (xlate_ukey(udpif, ukey, tcp_flags, )) {
> +error = xlate_ukey(udpif, ukey, tcp_flags, );
> +if (error) {
> +*reason = XLATION_ERROR;
>  goto exit;
>  }
>  xoutp = 
>
>  if (xoutp->avoid_caching) {
> +*reason = AVOID_CACHING;
>  goto exit;
>  }
>
> @@ -2238,6 +2255,7 @@ revalidate_ukey__(struct udpif *udpif, const struct 
> udpif_key *ukey,
>  ofpbuf_clear(odp_actions);
>
>  if (!ofproto) {
> +*reason = ASSOCIATED_OFPROTO;
>  goto exit;
>  }
>
> @@ -2249,6 +2267,7 @@ revalidate_ukey__(struct udpif *udpif, const struct 
> udpif_key *ukey,
>  if (odp_flow_key_to_mask(ukey->mask, ukey->mask_len, _mask, ,
>   NULL)
>  == ODP_FIT_ERROR) {
> +*reason = BAD_ODP_FIT;
>  goto exit;
>  }
>
> @@ -2258,6 +2277,7 @@ revalidate_ukey__(struct udpif *udpif, const struct 
> udpif_key *ukey,
>   * down.  Note that we do not know if 

Re: [ovs-dev] [PATCH v2] netdev-linux: do not touch LAG slaves if master is not attached to ovs.

2022-06-27 Thread Tao Liu
On Mon, Jun 27, 2022 at 09:40:38AM +0300, Roi Dayan wrote:
> 
> 
> On 2022-06-23 12:42 PM, Roi Dayan wrote:
> > 
> > 
> > On 2022-06-22 11:56 PM, Marcelo Ricardo Leitner wrote:
> > > On Wed, Jun 22, 2022 at 11:10:17AM +0800, Tao Liu wrote:
> > > > On Tue, Jun 21, 2022 at 06:49:42AM -0700, Marcelo Ricardo Leitner wrote:
> > > > > Hi,
> > > > > 
> > > > > On Tue, Jun 21, 2022 at 02:10:39PM +0800, Tao Liu wrote:
> > > > > > Bond master netdev may be created without a classification type, due
> > > > > > to routing or tunneling code.
> > > > > 
> > > > > Can you please elaborate on why is this an issue?
> > > > > 
> > > > Hi, thanks for your reply.
> > > > 
> > > > We are using BlueField2 in Bare Metal Server.
> > > > p0 and p1 enslave to master bond0. A SF is created for RDMA.
> > > > ovs manages pf0hpf and gre.
> > > > 
> > > > Traffics between bond0 and SF are controlled by tc rules.
> > > > ```
> > > > master=bond0
> > > > slave1=p0
> > > > slave2=p1
> > > > sf=enp3s0f0s0
> > > > sf_rep=en3f0pf0sf0
> > > > $tc qdisc add dev $master ingress_block 22 ingress
> > > > $tc qdisc add dev $slave1 ingress_block 22 ingress
> > > > $tc qdisc add dev $slave2 ingress_block 22 ingress
> > > > $tc qdisc add dev $sf_rep ingress
> > > > 
> > > > # some customized tc rules
> > > > $tc filter add dev $sf_rep pref 1 ingress ... action mirred
> > > > egress redirect dev $master
> > > > 
> > > > $tc filter add block 22 pref 1 ... action mirred egress redirect
> > > > dev $sf_rep
> > > > ```
> > > > 
> > > > Unfortunately ingress_block on p0 or p1 may be deleted and recreated by
> > > > ovs when they enslaves to bond0.
> > > > 
> > > > Also we have a similar architectur on cx5/cx6.
> > > > 
> > > > > > 
> > > > > > If bond master is not attached to ovs, the ingress block
> > > > > > on slaves shoud
> > > > > > not be updated.
> > > > > 
> > > > > Why? (in short, but more below)
> > > > > 
> > > > If we do not use bond master and slaves in ovs, they shoud not
> > > > be interfered.
> > > 
> > > Makes sense, ...
> > > 
> > > > 
> > > > > > 
> > > > > > Simple reproducer:
> > > > > >    ip a add 10.1.1.1/30 dev bond0
> > > > > >    ip l set net3 master bond0
> > > > > >    tc q ls dev net3
> > > > > > 
> > > > > > Fixes: d22f8927c3c9 ("netdev-linux: monitor and offload
> > > > > > LAG slaves to TC")
> > > > > > Signed-off-by: Tao Liu 
> > > > > > ---
> > > > > >   lib/netdev-linux.c | 5 -
> > > > > >   1 file changed, 4 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> > > > > > index 9d12502..b9e0c99 100644
> > > > > > --- a/lib/netdev-linux.c
> > > > > > +++ b/lib/netdev-linux.c
> > > > > > @@ -682,7 +682,10 @@ netdev_linux_update_lag(struct
> > > > > > rtnetlink_change *change)
> > > > > >   return;
> > > > > >   }
> > > > > > 
> > > > > > -    if 
> > > > > > (is_netdev_linux_class(master_netdev->netdev_class)) {
> > > > > > +    /* If bond master is not attached to ovs,
> > > > > > ingress block on slaves
> > > > > > + * shoud not be updated. */
> > > > > 
> > > > > I think this will break a core use case. As in your reproducer, that's
> > > > > pretty much how it is expected to work today with tunnels, for
> > > > > example:
> > > > > 
> > > > >    ip a add 10.1.1.1/30 dev bond0
> > > > >    ip l set net3 master bond0
> > > > >    ip l s bond0 up
> > > > >    ovs-vsctl add-port ovsbr0 vxlan0 -- \
> > > > > set interface vxlan0
> > > > > type=vxlan \
> > > > > options:local_ip=10.1.1.1
> > > > > options:remote_ip=10.1.1.2
> > > > > options:key=0
> > > > > 
> > > > > If you patch like this, then who would be adding the ingress qdiscs on
> > > > > the slaves?
> > > > > 
> > > > > Forcing the bond to be added is probably not optimal, because it
> > > > > doesn't really need to be. Unless your considering that as some sort
> > > > > of authorization for ovs to mangle with it?
> > > > > 
> > > > >    Marcelo
> > > > > 
> > > > This patch does not break the tunnel use case. The ingress attaches on
> > > > vxlan_sys, but not need to attach on bond master or slaves.
> > > 
> > > ... I was under the impression that the driver needed it somehow,
> > > despite that. But apparently that's not right, as the driver will have
> > > one indr callback for each uplink, and should be able to have its way
> > > from there.
> > > 
> > > Thanks for the explanations.
> > > 
> > > Reviewed-by: Marcelo Ricardo Leitner 
> > > 
> > 
> > there is an example without bond about ingress. We can
> > use vxlan port in ovs with tunnel ip assigned on the pf.
> > like Tao mentioned, the ingress and tc rules are created on
> > the vxlan netdev created by ovs but ovs doesnt recreate ingress
> > on the pf and the rules are offloaded by the driver.
> > 
> > also looks good to me. thanks.
> > 
> > Reviewed-by: Roi Dayan 
> > 
> 
> Hmm, sorry but looks like I did review too quick.
> I did some manual testing now and ovs doesn't 

Re: [ovs-dev] [PATCH v2] netdev-linux: do not touch LAG slaves if master is not attached to ovs.

2022-06-27 Thread Roi Dayan via dev



On 2022-06-23 12:42 PM, Roi Dayan wrote:



On 2022-06-22 11:56 PM, Marcelo Ricardo Leitner wrote:

On Wed, Jun 22, 2022 at 11:10:17AM +0800, Tao Liu wrote:

On Tue, Jun 21, 2022 at 06:49:42AM -0700, Marcelo Ricardo Leitner wrote:

Hi,

On Tue, Jun 21, 2022 at 02:10:39PM +0800, Tao Liu wrote:

Bond master netdev may be created without a classification type, due
to routing or tunneling code.


Can you please elaborate on why is this an issue?


Hi, thanks for your reply.

We are using BlueField2 in Bare Metal Server.
p0 and p1 enslave to master bond0. A SF is created for RDMA.
ovs manages pf0hpf and gre.

Traffics between bond0 and SF are controlled by tc rules.
```
master=bond0
slave1=p0
slave2=p1
sf=enp3s0f0s0
sf_rep=en3f0pf0sf0
$tc qdisc add dev $master ingress_block 22 ingress
$tc qdisc add dev $slave1 ingress_block 22 ingress
$tc qdisc add dev $slave2 ingress_block 22 ingress
$tc qdisc add dev $sf_rep ingress

# some customized tc rules
$tc filter add dev $sf_rep pref 1 ingress ... action mirred egress 
redirect dev $master


$tc filter add block 22 pref 1 ... action mirred egress redirect dev 
$sf_rep

```

Unfortunately ingress_block on p0 or p1 may be deleted and recreated by
ovs when they enslaves to bond0.

Also we have a similar architectur on cx5/cx6.



If bond master is not attached to ovs, the ingress block on slaves 
shoud

not be updated.


Why? (in short, but more below)

If we do not use bond master and slaves in ovs, they shoud not be 
interfered.


Makes sense, ...





Simple reproducer:
   ip a add 10.1.1.1/30 dev bond0
   ip l set net3 master bond0
   tc q ls dev net3

Fixes: d22f8927c3c9 ("netdev-linux: monitor and offload LAG slaves 
to TC")

Signed-off-by: Tao Liu 
---
  lib/netdev-linux.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 9d12502..b9e0c99 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -682,7 +682,10 @@ netdev_linux_update_lag(struct 
rtnetlink_change *change)

  return;
  }

-    if (is_netdev_linux_class(master_netdev->netdev_class)) {
+    /* If bond master is not attached to ovs, ingress 
block on slaves

+ * shoud not be updated. */


I think this will break a core use case. As in your reproducer, that's
pretty much how it is expected to work today with tunnels, for
example:

   ip a add 10.1.1.1/30 dev bond0
   ip l set net3 master bond0
   ip l s bond0 up
   ovs-vsctl add-port ovsbr0 vxlan0 -- \
set interface vxlan0
type=vxlan \
options:local_ip=10.1.1.1
options:remote_ip=10.1.1.2
options:key=0

If you patch like this, then who would be adding the ingress qdiscs on
the slaves?

Forcing the bond to be added is probably not optimal, because it
doesn't really need to be. Unless your considering that as some sort
of authorization for ovs to mangle with it?

   Marcelo


This patch does not break the tunnel use case. The ingress attaches on
vxlan_sys, but not need to attach on bond master or slaves.


... I was under the impression that the driver needed it somehow,
despite that. But apparently that's not right, as the driver will have
one indr callback for each uplink, and should be able to have its way
from there.

Thanks for the explanations.

Reviewed-by: Marcelo Ricardo Leitner 



there is an example without bond about ingress. We can
use vxlan port in ovs with tunnel ip assigned on the pf.
like Tao mentioned, the ingress and tc rules are created on
the vxlan netdev created by ovs but ovs doesnt recreate ingress
on the pf and the rules are offloaded by the driver.

also looks good to me. thanks.

Reviewed-by: Roi Dayan 



Hmm, sorry but looks like I did review too quick.
I did some manual testing now and ovs doesn't add
ingress to the slave devices for me.
master_netdev->auto_classified==true in my case.
can you describe the steps you did to test? or you just tested
the flow where bond is not attached to ovs?







+    if (!master_netdev->auto_classified &&
+    is_netdev_linux_class(master_netdev->netdev_class)) {
  block_id = netdev_get_block_id(master_netdev);
  if (!block_id) {
  netdev_close(master_netdev);
--
1.8.3.1









___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev