Re: [ovs-dev] [PATCH v4] tc: fix crash on EAGAIN return from recvmsg on netlink socket.

2023-05-24 Thread Frode Nordahl
ons. 24. mai 2023, 21:31 skrev Ilya Maximets :

> On 5/23/23 12:39, Frode Nordahl wrote:
>
> 
>
> >  int
> >  tc_transact(struct ofpbuf *request, struct ofpbuf **replyp)
> >  {
> >  int error = nl_transact(NETLINK_ROUTE, request, replyp);
> >  ofpbuf_uninit(request);
> > +
> > +if (!error && replyp && !(*replyp)->size) {
>
> Not a full review, but shouldn't we also check that *replyp != NULL
> here?  This can happen, for example, if nl_pool_alloc() fails.
>

In the event of `nl_pool_alloc()` failing, would not error also have a
value?

In any case, it probably would not hurt with a extra check before
dereferencing the pointer, thank you for pointing it out.

--
Frode Nordahl

> +COVERAGE_INC(tc_transact_eagain);
> > +/* We replicate the behavior of `nl_transact` for error
> conditions and
> > + * free any allocations before setting the 'replyp' buffer to
> NULL. */
> > +ofpbuf_delete(*replyp);
> > +*replyp = NULL;
> > +return EAGAIN;
> > +}
> > +
> >  return error;
> >  }
> >
>
> Bets regards, Ilya Maximets.
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: Fix address set incremental processing

2023-05-24 Thread Numan Siddique
On Wed, May 24, 2023 at 7:11 AM Ilya Maximets  wrote:
>
> On 5/23/23 10:09, Ales Musil wrote:
> > The incremental processing is broken for addresses
> > that have mask which could "erase" portion of the address
> > itself e.g. 10.16.0.47/4, after applying the mask with token
> > parser the address becomes 0.0.0.0/4, which is fine for
> > logical flows. However, for the deletion/addition to database
> > we need the original string representation which cannot be
> > built from the parsed token.
> >
> > Use svec instead for the comparison which allows us to keep the
> > original strings.
> >
> > The change to svec shouldn't have any significant performance
> > impact, see the numbers below show. The setup was created by
> > the scale script from Han [0].
> >
> > Numbers with expr:
> > Time spent on processing nb_cfg 1:
> > ovn-northd delay before processing: 21ms
> > ovn-northd completion:  544ms
> > ovn-controller(s) completion:   544ms
> > Time spent on processing nb_cfg 2:
> > ovn-northd delay before processing: 17ms
> > ovn-northd completion:  537ms
> > ovn-controller(s) completion:   535ms
> > Time spent on processing nb_cfg 3:
> > ovn-northd delay before processing: 20ms
> > ovn-northd completion:  552ms
> > ovn-controller(s) completion:   550ms
> > Time spent on processing nb_cfg 4:
> > ovn-northd delay before processing: 16ms
> > ovn-northd completion:  529ms
> > ovn-controller(s) completion:   528ms
> >
> > Numbers with svec:
> > Time spent on processing nb_cfg 1:
> > ovn-northd delay before processing: 19ms
> > ovn-northd completion:  548ms
> > ovn-controller(s) completion:   548ms
> > Time spent on processing nb_cfg 2:
> > ovn-northd delay before processing: 19ms
> > ovn-northd completion:  537ms
> > ovn-controller(s) completion:   537ms
> > Time spent on processing nb_cfg 3:
> > ovn-northd delay before processing: 15ms
> > ovn-northd completion:  522ms
> > ovn-controller(s) completion:   521ms
> > Time spent on processing nb_cfg 4:
> > ovn-northd delay before processing: 17ms
> > ovn-northd completion:  522ms
> > ovn-controller(s) completion:   520ms
> >
> > [0] 
> > https://github.com/hzhou8/ovn-test-script/blob/e1149318201309db6d96ae8d5a33fcfbbe1872a3/test_ovn_controller_scale.sh
> >
> > Fixes: 0d5e0a6ced49 ("northd: Add I-P for syncing SB address sets.")
> > Reported-at: https://bugzilla.redhat.com/2196885
>
> The issue was originally reported on the mailing list.  Please,
> provide the link as well.  And it'd also be nice to credit the
> original reporter.
>
> > Signed-off-by: Ales Musil 
> > ---
> >  northd/en-sync-sb.c | 66 +
> >  1 file changed, 30 insertions(+), 36 deletions(-)
> >
> > diff --git a/northd/en-sync-sb.c b/northd/en-sync-sb.c
> > index 758f00bfd..4bd77b168 100644
> > --- a/northd/en-sync-sb.c
> > +++ b/northd/en-sync-sb.c
> > @@ -22,7 +22,6 @@
> >  #include "openvswitch/util.h"
> >
> >  #include "en-sync-sb.h"
> > -#include "include/ovn/expr.h"
> >  #include "lib/inc-proc-eng.h"
> >  #include "lib/lb.h"
> >  #include "lib/ovn-nb-idl.h"
> > @@ -325,45 +324,40 @@ static void
> >  update_sb_addr_set(const char **nb_addresses, size_t n_addresses,
> > const struct sbrec_address_set *sb_as)
> >  {
> > -struct expr_constant_set *cs_nb_as =
> > -expr_constant_set_create_integers(
> > -(const char *const *) nb_addresses, n_addresses);
> > -struct expr_constant_set *cs_sb_as =
> > -expr_constant_set_create_integers(
> > -(const char *const *) sb_as->addresses, sb_as->n_addresses);
> > -
> > -struct expr_constant_set *addr_added = NULL;
> > -struct expr_constant_set *addr_deleted = NULL;
> > -expr_constant_set_integers_diff(cs_sb_as, cs_nb_as, _added,
> > -_deleted);
> > -
> > -struct ds ds = DS_EMPTY_INITIALIZER;
> > -if (addr_added && addr_added->n_values) {
> > -for (size_t i = 0; i < addr_added->n_values; i++) {
> > -ds_clear();
> > -expr_constant_format(_added->values[i], EXPR_C_INTEGER, 
> > );
> > -sbrec_address_set_update_addresses_addvalue(sb_as, 
> > ds_cstr());
> > -}
> > +size_t i;
> > +char *ip;
> > +
> > +struct svec svec_nb_as = SVEC_EMPTY_INITIALIZER;
> > +struct svec svec_sb_as = SVEC_EMPTY_INITIALIZER;
> > +
> > +for (i = 0; i < n_addresses; i++) {
> > +svec_add(_nb_as, nb_addresses[i]);
> >  }
> >
> > -if (addr_deleted && addr_deleted->n_values) {
> > -for (size_t i = 0; i < addr_deleted->n_values; 

Re: [ovs-dev] [PATCH v3 ovn] northd: centralized reply lb traffic even if FIP is defined

2023-05-24 Thread Mark Michelson

Hi Lorenzo,

I tried running the system test from this patch against ovn main and it 
passes even without the changes in northd. I think this is because the 
alice1 and foo1 ports need to be bound on different hvs to properly test 
the changes in northd.


See below for other comments.

On 5/24/23 13:56, Lorenzo Bianconi wrote:

In the current codebase for distributed gw router port use-case,
it is not possible to add a load balancer that redirects the traffic
to a back-end if it is used as internal IP of a FIP NAT rule since
the reply traffic is never centralized. Fix the issue centralizing the
traffic if it is the reply packet for the load balancer.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2023609
Signed-off-by: Lorenzo Bianconi 
---
Changes since v2:
- rebase on top of ovn main branch
- fix spelling
Changes since v1:
- add new system-test and unit-test
---
  northd/northd.c |  15 ++
  northd/ovn-northd.8.xml |  16 +++
  tests/ovn-northd.at |  48 +++
  tests/system-ovn.at | 100 
  4 files changed, 179 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index a6eca916b..6317c36fd 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -10740,6 +10740,8 @@ build_distr_lrouter_nat_flows_for_lb(struct 
lrouter_nat_lb_flows_ctx *ctx,
   struct ovn_datapath *od)
  {
  struct ovn_port *dgp = od->l3dgw_ports[0];
+struct ds gw_redir_match = DS_EMPTY_INITIALIZER;
+struct ds gw_redir_action = DS_EMPTY_INITIALIZER;
  
  const char *undnat_action;
  
@@ -10784,6 +10786,17 @@ build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,

  return;
  }
  
+/* We need to centralize the LB traffic to properly perform

+ * the undnat stage.
+ */
+ds_put_format(_redir_match, "%s) && outport == %s",
+  ds_cstr(ctx->undnat_match), dgp->json_key);
+ds_put_format(_redir_action, "outport = %s; next;",
+  dgp->cr_port->json_key);
+ovn_lflow_add_with_hint(ctx->lflows, od, S_ROUTER_IN_GW_REDIRECT,
+200, ds_cstr(_redir_match),
+ds_cstr(_redir_action), >lb->nlb->header_);
+
  ds_put_format(ctx->undnat_match, ") && (inport == %s || outport == %s)"
" && is_chassis_resident(%s)", dgp->json_key, dgp->json_key,
dgp->cr_port->json_key);
@@ -10791,6 +10804,8 @@ build_distr_lrouter_nat_flows_for_lb(struct 
lrouter_nat_lb_flows_ctx *ctx,
  ds_cstr(ctx->undnat_match), undnat_action,
  >lb->nlb->header_);
  ds_truncate(ctx->undnat_match, undnat_match_len);
+ds_destroy(_redir_match);
+ds_destroy(_redir_action);
  }
  
  static void

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 540fe03bd..295f52b11 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -4578,6 +4578,22 @@ icmp6 {
  
  
  

+  
+For all the configured load balancing rules that include an IPv4
+address VIP, and a list of IPv4 backend addresses
+B0, B1 .. Bn defined for the
+VIP a priority-200 flow is added that matches 
+ip4  (ip4.src == B0 || ip4.src == B1
+|| ... || ip4.src == Bn) with an action 
+outport = CR; next; where CR is the
+chassisredirect port representing the instance of the
+logical router distributed gateway port on the gateway chassis.
+If the backend IPv4 address Bx is also configured with
+L4 port PORT of protocol P, then the match
+also includes P.src == PORT.
+Similar flows are addeded for IPv6.


s/addeded/added/


+  
+

  For each NAT rule in the OVN Northbound database that can
  be handled in a distributed manner, a priority-100 logical
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index e3669bdf5..59f932c5f 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -9484,3 +9484,51 @@ AT_CHECK([ovn-sbctl lflow-list sw | grep ls_out_pre_lb | 
grep priority=110 | gre
  
  AT_CLEANUP

  ])
+
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([check fip and lb flows])
+AT_KEYWORDS([fip-lb-flows])
+ovn_start
+
+ovn-nbctl lr-add R1
+ovn-nbctl lrp-add R1 R1-S0 02:ac:10:01:00:01 10.0.0.1/24 1000::a/64
+ovn-nbctl lrp-add R1 R1-PUB 02:ac:20:01:01:01 172.16.0.1/24 3000::a/64
+ovn-nbctl lrp-set-gateway-chassis R1-PUB hv1 20
+
+ovn-nbctl ls-add S0
+ovn-nbctl lsp-add S0 S0-R1
+ovn-nbctl lsp-set-type S0-R1 router
+ovn-nbctl lsp-set-addresses S0-R1 02:ac:10:01:00:01
+ovn-nbctl lsp-set-options S0-R1 router-port=R1-S0
+ovn-nbctl lsp-add S0 S0-P0
+ovn-nbctl lsp-set-addresses S0-P0 "50:54:00:00:00:03 10.0.0.3 1000::3"
+
+ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.0.110 10.0.0.3 S0-P0 
30:54:00:00:00:03
+ovn-nbctl lr-nat-add R1 dnat_and_snat 3000::c 1000::3 S0-P0 40:54:00:00:00:03


Nit: All 

Re: [ovs-dev] [PATCH v4] tc: fix crash on EAGAIN return from recvmsg on netlink socket.

2023-05-24 Thread Ilya Maximets
On 5/23/23 12:39, Frode Nordahl wrote:



>  int
>  tc_transact(struct ofpbuf *request, struct ofpbuf **replyp)
>  {
>  int error = nl_transact(NETLINK_ROUTE, request, replyp);
>  ofpbuf_uninit(request);
> +
> +if (!error && replyp && !(*replyp)->size) {

Not a full review, but shouldn't we also check that *replyp != NULL
here?  This can happen, for example, if nl_pool_alloc() fails.

> +COVERAGE_INC(tc_transact_eagain);
> +/* We replicate the behavior of `nl_transact` for error conditions 
> and
> + * free any allocations before setting the 'replyp' buffer to NULL. 
> */
> +ofpbuf_delete(*replyp);
> +*replyp = NULL;
> +return EAGAIN;
> +}
> +
>  return error;
>  }
>  

Bets regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: layer3-tunnels: Skip bareudp tests if not supported by kernel.

2023-05-24 Thread Ilya Maximets
On 5/24/23 15:39, Frode Nordahl wrote:
> The bareudp tests depend on specific kernel configuration to
> succeed.  Skip the test if the feature is not enabled in the
> running kernel.
> 
> Signed-off-by: Frode Nordahl 
> ---
>  tests/system-kmod-macros.at  | 10 ++
>  tests/system-layer3-tunnels.at   |  2 ++
>  tests/system-userspace-macros.at |  8 
>  3 files changed, 20 insertions(+)
> 
> diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
> index fb15a5a7c..55e7821ce 100644
> --- a/tests/system-kmod-macros.at
> +++ b/tests/system-kmod-macros.at
> @@ -237,3 +237,13 @@ m4_define([CHECK_L3L4_CONNTRACK_REASM])
>  #
>  # The kernel module tests do not use TC offload.
>  m4_define([CHECK_NO_TC_OFFLOAD])
> +
> +# OVS_CHECK_BAREUDP()
> +#
> +# The feature needs to be enabled in the kernel configuration 
> (CONFIG_BAREUDP)
> +# to work.
> +m4_define([OVS_CHECK_BAREUDP],
> +[
> +AT_SKIP_IF([! ip link add dev bareudp0 type bareudp dstport 6635 
> ethertype mpls_uc 2>&1 >/dev/null])
> +AT_CHECK([ip link del dev bareudp0])

Maybe call it ovs_bareudp0 or something like that?  Just to decrease chances
for random collisions.

> +])
> diff --git a/tests/system-layer3-tunnels.at b/tests/system-layer3-tunnels.at
> index c37852b21..5546bc879 100644
> --- a/tests/system-layer3-tunnels.at
> +++ b/tests/system-layer3-tunnels.at
> @@ -155,6 +155,7 @@ AT_CLEANUP
>  
>  AT_SETUP([layer3 - ping over MPLS Bareudp])
>  OVS_CHECK_MIN_KERNEL(5, 7)
> +OVS_CHECK_BAREUDP()

This new check supersedes the kernel version check.  Should we remove it?
The original idea was that we can create bareudp tunnels even if the ip utility
didn't support them at a time.  But, I suppose, enough time passed since then
and the ip utility available in distributions caught up, so we can just check
it instead.


>  OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])])
>  ADD_NAMESPACES(at_ns0, at_ns1)
>  
> @@ -203,6 +204,7 @@ AT_CLEANUP
>  
>  AT_SETUP([layer3 - ping over Bareudp])
>  OVS_CHECK_MIN_KERNEL(5, 7)

Same here.

> +OVS_CHECK_BAREUDP()
>  OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])])
>  ADD_NAMESPACES(at_ns0, at_ns1)
>  
> diff --git a/tests/system-userspace-macros.at 
> b/tests/system-userspace-macros.at
> index 482079386..1cb67d6f6 100644
> --- a/tests/system-userspace-macros.at
> +++ b/tests/system-userspace-macros.at
> @@ -336,3 +336,11 @@ m4_define([CHECK_L3L4_CONNTRACK_REASM],
>  #
>  # Userspace tests do not use TC offload.
>  m4_define([CHECK_NO_TC_OFFLOAD])
> +
> +# OVS_CHECK_BAREUDP()
> +#
> +# The userspace skips all tests that check kernel configuration.

This should probably say that userspace datapath doesn't support bareudp tunnels
instead.

> +m4_define([OVS_CHECK_BAREUDP],
> +[
> +AT_SKIP_IF([:])
> +])

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-05-24 Thread Eelco Chaudron



On 24 May 2023, at 21:05, Ilya Maximets wrote:

> On 5/23/23 15:11, Balazs Nemeth wrote:
>> The only way that stats->{n_packets,n_bytes} would decrease is due to an
>> overflow, or if there are bugs in how statistics are handled. In the
>> past, there were multiple issues that caused a jump backward. A
>> workaround was in place to set the statistics to 0 in that case. When
>> this happened while the revalidator was under heavy load, the workaround
>> had an unintended side effect where should_revalidate returned false
>> causing the flow to be removed because the metric it calculated was
>> based on a bogus value. Since many of those bugs have now been
>> identified and resolved, there is no need to set the statistics to 0. In
>> addition, the (unlikely) overflow still needs to be handled
>> appropriately. If an unexpected jump does happen, just log it as a
>> warning.
>>
>> Signed-off-by: Balazs Nemeth 
>> ---
>>  ofproto/ofproto-dpif-upcall.c | 26 ++
>>  1 file changed, 18 insertions(+), 8 deletions(-)
>>
>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>> index cd57fdbd9..b06315d22 100644
>> --- a/ofproto/ofproto-dpif-upcall.c
>> +++ b/ofproto/ofproto-dpif-upcall.c
>> @@ -2368,24 +2368,34 @@ revalidate_ukey(struct udpif *udpif, struct 
>> udpif_key *ukey,
>>  enum reval_result result = UKEY_DELETE;
>>  struct dpif_flow_stats push;
>>
>> -ofpbuf_clear(odp_actions);
>> -
>>  push.used = stats->used;
>>  push.tcp_flags = stats->tcp_flags;
>> -push.n_packets = (stats->n_packets > ukey->stats.n_packets
>> -  ? stats->n_packets - ukey->stats.n_packets
>> -  : 0);
>> -push.n_bytes = (stats->n_bytes > ukey->stats.n_bytes
>> -? stats->n_bytes - ukey->stats.n_bytes
>> -: 0);
>> +push.n_packets = stats->n_packets - ukey->stats.n_packets;
>> +push.n_bytes = stats->n_bytes - ukey->stats.n_bytes;
>>
>>  if (stats->n_packets < ukey->stats.n_packets &&
>>  ukey->stats.n_packets < UINT64_THREE_QUARTERS) {
>> +static struct vlog_rate_limit rll = VLOG_RATE_LIMIT_INIT(1, 5);
>> +struct ds ds = DS_EMPTY_INITIALIZER;
>> +
>>  /* Report cases where the packet counter is lower than the previous
>>   * instance, but exclude the potential wrapping of an uint64_t. */
>>  COVERAGE_INC(ukey_invalid_stat_reset);
>> +
>> +odp_flow_key_format(ukey->key, ukey->key_len, );
>> +ds_put_cstr(, ", actions:");
>> +format_odp_actions(, odp_actions->data, odp_actions->size, NULL);
>
> The 'odp_actions' here is a scratch pad for the processing below,
> it doesn't contain any actual actions for this ukey.
> We should get real actions from the ukey->actions with rcu_get()
> and print them out instead, e.g.:
>
> actions = ovsrcu_get(struct ofpbuf *, >actions);
> format_odp_actions(, actions->data, actions->size, NULL);
>
>> +odp_format_ufid(>ufid, );
>> +ds_put_cstr(, " ");
>
> Above two lines should be in the opposite order, otherwise ufid is
> glued to the actions.

I guess this should be before the odp_flow_key_format() function, missed it 
when reviewing :( The same as other logs do.

>> +
>> +VLOG_WARN_RL(, "Unexpected jump in packet stats from %"PRIu64
>> +" to %"PRIu64" when handling ukey %s",
>> +stats->n_packets, ukey->stats.n_packets, ds_cstr());
>> +ds_destroy();
>>  }
>>
>> +ofpbuf_clear(odp_actions);
>
> No need to move this.
>
> Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-05-24 Thread Ilya Maximets
On 5/23/23 15:11, Balazs Nemeth wrote:
> The only way that stats->{n_packets,n_bytes} would decrease is due to an
> overflow, or if there are bugs in how statistics are handled. In the
> past, there were multiple issues that caused a jump backward. A
> workaround was in place to set the statistics to 0 in that case. When
> this happened while the revalidator was under heavy load, the workaround
> had an unintended side effect where should_revalidate returned false
> causing the flow to be removed because the metric it calculated was
> based on a bogus value. Since many of those bugs have now been
> identified and resolved, there is no need to set the statistics to 0. In
> addition, the (unlikely) overflow still needs to be handled
> appropriately. If an unexpected jump does happen, just log it as a
> warning.
> 
> Signed-off-by: Balazs Nemeth 
> ---
>  ofproto/ofproto-dpif-upcall.c | 26 ++
>  1 file changed, 18 insertions(+), 8 deletions(-)
> 
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index cd57fdbd9..b06315d22 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -2368,24 +2368,34 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
> *ukey,
>  enum reval_result result = UKEY_DELETE;
>  struct dpif_flow_stats push;
>  
> -ofpbuf_clear(odp_actions);
> -
>  push.used = stats->used;
>  push.tcp_flags = stats->tcp_flags;
> -push.n_packets = (stats->n_packets > ukey->stats.n_packets
> -  ? stats->n_packets - ukey->stats.n_packets
> -  : 0);
> -push.n_bytes = (stats->n_bytes > ukey->stats.n_bytes
> -? stats->n_bytes - ukey->stats.n_bytes
> -: 0);
> +push.n_packets = stats->n_packets - ukey->stats.n_packets;
> +push.n_bytes = stats->n_bytes - ukey->stats.n_bytes;
>  
>  if (stats->n_packets < ukey->stats.n_packets &&
>  ukey->stats.n_packets < UINT64_THREE_QUARTERS) {
> +static struct vlog_rate_limit rll = VLOG_RATE_LIMIT_INIT(1, 5);
> +struct ds ds = DS_EMPTY_INITIALIZER;
> +
>  /* Report cases where the packet counter is lower than the previous
>   * instance, but exclude the potential wrapping of an uint64_t. */
>  COVERAGE_INC(ukey_invalid_stat_reset);
> +
> +odp_flow_key_format(ukey->key, ukey->key_len, );
> +ds_put_cstr(, ", actions:");
> +format_odp_actions(, odp_actions->data, odp_actions->size, NULL);

The 'odp_actions' here is a scratch pad for the processing below,
it doesn't contain any actual actions for this ukey.
We should get real actions from the ukey->actions with rcu_get()
and print them out instead, e.g.:

actions = ovsrcu_get(struct ofpbuf *, >actions);
format_odp_actions(, actions->data, actions->size, NULL);

> +odp_format_ufid(>ufid, );
> +ds_put_cstr(, " ");

Above two lines should be in the opposite order, otherwise ufid is
glued to the actions.

> +
> +VLOG_WARN_RL(, "Unexpected jump in packet stats from %"PRIu64
> +" to %"PRIu64" when handling ukey %s",
> +stats->n_packets, ukey->stats.n_packets, ds_cstr());
> +ds_destroy();
>  }
>  
> +ofpbuf_clear(odp_actions);

No need to move this.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4] backtrace: Extend the backtrace functionality

2023-05-24 Thread Ilya Maximets
On 5/19/23 09:36, Ales Musil wrote:
> Use the backtrace functions that is provided by libc,
> this allows us to get backtrace that is independent of
> the current memory map of the process. Which in turn can
> be used for debugging/tracing purpose. The backtrace
> is not 100% accurate due to various optimizations, most
> notably "-fomit-frame-pointer" and LTO. This might result
> that the line in source file doesn't correspond to the
> real line. However, it should be able to pinpoint at least
> the function where the backtrace was called.
> 
> The usage for SIGSEGV is determined during compilation
> based on available libraries. Libunwind has higher priority
> if both methods are available to keep the compatibility with
> current behavior.
> 
> The backtrace is not marked as signal safe however the backtrace
> manual page gives more detailed explanation why it might be the
> case [0]. Load the "libgcc" or equivalent in advance within the
> "fatal_signal_init" which should ensure that subsequent calls
> to backtrace* do not call malloc and are signal safe.
> 
> The typical backtrace will look similar to the one below:
> /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe]
> /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7]
> /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b]
> /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d]
> ovsdb-server(+0xa661) [0x562cfce2e661]
> ovsdb-server(+0x7e39) [0x562cfce2be39]
> /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a]
> /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b]
> ovsdb-server(+0x8c35) [0x562cfce2cc35]
> 
> backtrace.h elaborates on how to effectively get the line
> information associated with the addressed presented in the
> backtrace.
> 
> [0]
> backtrace() and backtrace_symbols_fd() don't call malloc()
> explicitly, but they are part of libgcc, which gets loaded
> dynamically when first used.  Dynamic loading usually triggers
> a call to malloc(3).  If you need certain calls to these two
> functions to not allocate memory (in signal handlers, for
> example), you need to make sure libgcc is loaded beforehand
> 
> Reported-at: https://bugzilla.redhat.com/2177760
> Signed-off-by: Ales Musil 
> ---
> v2: Extend the current use of libunwind rather than replacing it.
> v3: Allow vlog_fd to be also 0.
> Return the backtrace log from monitor in updated form.
> Return use of the "vlog_direct_write_to_log_file_unsafe".
> v4: Rebase on top of current master.
> Address comments from Ilya:
> Address the coding style issues.
> Make sure that the backrace received by monitor doesn't
> have more than maximum allowed frames.
> Change the backtrace_format to accept delimiter.
> Remove unnecessary checks in the tests.
> ---

Hi, Ales.  See a few comments below.

>  include/openvswitch/vlog.h |   3 +
>  lib/backtrace.c| 116 +
>  lib/backtrace.h|  57 +++---
>  lib/fatal-signal.c |  52 +++--
>  lib/ovsdb-error.c  |  15 ++---
>  lib/vlog.c |   7 +++
>  m4/openvswitch.m4  |   9 ++-
>  tests/atlocal.in   |   2 +
>  tests/daemon.at|  45 ++
>  9 files changed, 228 insertions(+), 78 deletions(-)
> 
> diff --git a/include/openvswitch/vlog.h b/include/openvswitch/vlog.h
> index e53ce6d81..d3c369bf2 100644
> --- a/include/openvswitch/vlog.h
> +++ b/include/openvswitch/vlog.h
> @@ -148,6 +148,9 @@ void vlog_set_syslog_target(const char *target);
>  /* Write directly to log file. */
>  void vlog_direct_write_to_log_file_unsafe(const char *s);
>  
> +/* Return the current vlog file descriptor. */
> +int vlog_fd(void);

I'll repeat my comment from v3 here:
"""
It's unfortunate we need to export this kind of function.  We should
give it a better name at least.  Maybe vlog_get_log_file_fd_unsafe() ?
vlog module doesn't have a file descriptor, file does.  And fd is
supposed to be protected by a mutex that we're clearly not holding
while using this function, so 'unsafe'.
"""

> +
>  /* Initialization. */
>  void vlog_init(void);
>  void vlog_enable_async(void);
> diff --git a/lib/backtrace.c b/lib/backtrace.c
> index 2853d5ff1..4cac5e14c 100644
> --- a/lib/backtrace.c
> +++ b/lib/backtrace.c
> @@ -32,12 +32,27 @@ VLOG_DEFINE_THIS_MODULE(backtrace);
>  void
>  backtrace_capture(struct backtrace *b)
>  {
> -void *frames[BACKTRACE_MAX_FRAMES];
> -int i;
> +b->n_frames = backtrace(b->frames, BACKTRACE_MAX_FRAMES);
> +}
> +
> +void
> +backtrace_format(const struct backtrace *bt, struct ds *ds,
> + const char *delimiter)
> +{
> +if (bt->n_frames) {
> +char **symbols = backtrace_symbols(bt->frames, bt->n_frames);
> +
> +if (!symbols) {
> +return;
> +}
>  
> -b->n_frames = backtrace(frames, BACKTRACE_MAX_FRAMES);
> -for (i = 0; i < b->n_frames; i++) {
> -

Re: [ovs-dev] [PATCH v4 2/2] netdev-dpdk: Check rx/tx descriptor sizes for device.

2023-05-24 Thread Ilya Maximets
On 5/16/23 13:59, Kevin Traynor wrote:
> By default OVS configures 2048 descriptors for tx and rx queues
> on DPDK devices. It also allows the user to configure those values.
> 
> If the values used are not acceptable to the device then queue setup
> would fail.
> 
> The device exposes it's max/min/alignment requirements and OVS
> applies some limits also. Use these to ensure an acceptable value
> is used for the number of descriptors on a device tx/rx.
> 
> If the default or user value is not acceptable, adjust to a suitable
> value and log.
> 
> Reported-at: https://bugzilla.redhat.com/2119876
> Signed-off-by: Kevin Traynor 
> ---
>  lib/netdev-dpdk.c | 61 ++-
>  1 file changed, 50 insertions(+), 11 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 2d9afc493..e5fc1aa30 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1911,8 +1911,30 @@ dpdk_set_rxq_config(struct netdev_dpdk *dev, const 
> struct smap *args)
>  static void
>  dpdk_process_queue_size(struct netdev *netdev, const struct smap *args,
> -const char *flag, int default_size, int *new_size)
> +struct rte_eth_dev_info *info, bool is_rx)
>  {
> -int queue_size = smap_get_int(args, flag, default_size);
> +struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +struct rte_eth_desc_lim *lim;
> +int default_size, queue_size, cur_size, new_requested_size;
> +int *cur_requested_size;
> +bool reconfig = false;
>  
> +if (is_rx) {
> +default_size = NIC_PORT_DEFAULT_RXQ_SIZE;
> +new_requested_size = smap_get_int(args, "n_rxq_desc", default_size);
> +cur_size = dev->rxq_size;
> +cur_requested_size = >requested_rxq_size;
> +lim = info ? >rx_desc_lim : NULL;
> +

The empty line here seems unnecessary.

> +} else {
> +default_size = NIC_PORT_DEFAULT_TXQ_SIZE;
> +new_requested_size = smap_get_int(args, "n_txq_desc", default_size);
> +cur_size = dev->txq_size;
> +cur_requested_size = >requested_txq_size;
> +lim = info ? >tx_desc_lim : NULL;
> +}
> +
> +queue_size = new_requested_size;
> +
> +/* Check for OVS limits. */
>  if (queue_size <= 0 || queue_size > NIC_PORT_MAX_Q_SIZE
>  || !is_pow2(queue_size)) {
> @@ -1920,7 +1942,24 @@ dpdk_process_queue_size(struct netdev *netdev, const 
> struct smap *args,
>  }
>  
> -if (queue_size != *new_size) {
> -*new_size = queue_size;
> +if (lim) {
> +/* Check for device limits. */
> +if (lim->nb_align) {
> +queue_size = ROUND_UP(queue_size, lim->nb_align);
> +}
> +queue_size = MIN(queue_size, lim->nb_max);
> +queue_size = MAX(queue_size, lim->nb_min);
> +}
> +
> +*cur_requested_size = queue_size;
> +
> +if (cur_size != queue_size) {
>  netdev_request_reconfigure(netdev);
> +reconfig = true;
> +}
> +if (new_requested_size != queue_size) {
> +VLOG(reconfig ? VLL_INFO : VLL_DBG,
> + "Interface %s cannot set %s descriptor size to %d. "
> + "Adjusted to %d.", dev->up.name, is_rx ? "rx": "tx" ,
> + new_requested_size, queue_size);

This message is a bit misleading.  It says that it can't set the
descriptor size.  But it's not the descriptor size, it's a number
of descriptors.  Or a descriptor ring size, but that's too much
of the unnecessary terminology.
And we should use netdev_get_name instead of accessing the name
directly.

How about this:

VLOG(reconfig ? VLL_INFO : VLL_DBG,
 "%s: Unable to set the number of %s descriptors to %d. "
 "Adjusted to %d.", netdev_get_name(netdev),
 is_rx ? "rx": "tx", new_requested_size, queue_size);

?

>  }
>  }
> @@ -1938,7 +1977,9 @@ netdev_dpdk_set_config(struct netdev *netdev, const 
> struct smap *args,
>  {RTE_ETH_FC_RX_PAUSE, RTE_ETH_FC_FULL}
>  };
> +struct rte_eth_dev_info info;
>  const char *new_devargs;
>  const char *vf_mac;
>  int err = 0;
> +int ret;
>  
>  ovs_mutex_lock(_mutex);
> @@ -1947,11 +1988,4 @@ netdev_dpdk_set_config(struct netdev *netdev, const 
> struct smap *args,
>  dpdk_set_rxq_config(dev, args);
>  
> -dpdk_process_queue_size(netdev, args, "n_rxq_desc",
> -NIC_PORT_DEFAULT_RXQ_SIZE,
> ->requested_rxq_size);
> -dpdk_process_queue_size(netdev, args, "n_txq_desc",
> -NIC_PORT_DEFAULT_TXQ_SIZE,
> ->requested_txq_size);
> -
>  new_devargs = smap_get(args, "dpdk-devargs");
>  
> @@ -2073,4 +2107,9 @@ netdev_dpdk_set_config(struct netdev *netdev, const 
> struct smap *args,
>  }
>  
> +ret = rte_eth_dev_info_get(dev->port_id, );
> +
> +dpdk_process_queue_size(netdev, args, !ret ?  : NULL, true);
> +

[ovs-dev] [PATCH v3 ovn] northd: centralized reply lb traffic even if FIP is defined

2023-05-24 Thread Lorenzo Bianconi
In the current codebase for distributed gw router port use-case,
it is not possible to add a load balancer that redirects the traffic
to a back-end if it is used as internal IP of a FIP NAT rule since
the reply traffic is never centralized. Fix the issue centralizing the
traffic if it is the reply packet for the load balancer.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2023609
Signed-off-by: Lorenzo Bianconi 
---
Changes since v2:
- rebase on top of ovn main branch
- fix spelling
Changes since v1:
- add new system-test and unit-test
---
 northd/northd.c |  15 ++
 northd/ovn-northd.8.xml |  16 +++
 tests/ovn-northd.at |  48 +++
 tests/system-ovn.at | 100 
 4 files changed, 179 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index a6eca916b..6317c36fd 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -10740,6 +10740,8 @@ build_distr_lrouter_nat_flows_for_lb(struct 
lrouter_nat_lb_flows_ctx *ctx,
  struct ovn_datapath *od)
 {
 struct ovn_port *dgp = od->l3dgw_ports[0];
+struct ds gw_redir_match = DS_EMPTY_INITIALIZER;
+struct ds gw_redir_action = DS_EMPTY_INITIALIZER;
 
 const char *undnat_action;
 
@@ -10784,6 +10786,17 @@ build_distr_lrouter_nat_flows_for_lb(struct 
lrouter_nat_lb_flows_ctx *ctx,
 return;
 }
 
+/* We need to centralize the LB traffic to properly perform
+ * the undnat stage.
+ */
+ds_put_format(_redir_match, "%s) && outport == %s",
+  ds_cstr(ctx->undnat_match), dgp->json_key);
+ds_put_format(_redir_action, "outport = %s; next;",
+  dgp->cr_port->json_key);
+ovn_lflow_add_with_hint(ctx->lflows, od, S_ROUTER_IN_GW_REDIRECT,
+200, ds_cstr(_redir_match),
+ds_cstr(_redir_action), >lb->nlb->header_);
+
 ds_put_format(ctx->undnat_match, ") && (inport == %s || outport == %s)"
   " && is_chassis_resident(%s)", dgp->json_key, dgp->json_key,
   dgp->cr_port->json_key);
@@ -10791,6 +10804,8 @@ build_distr_lrouter_nat_flows_for_lb(struct 
lrouter_nat_lb_flows_ctx *ctx,
 ds_cstr(ctx->undnat_match), undnat_action,
 >lb->nlb->header_);
 ds_truncate(ctx->undnat_match, undnat_match_len);
+ds_destroy(_redir_match);
+ds_destroy(_redir_action);
 }
 
 static void
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 540fe03bd..295f52b11 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -4578,6 +4578,22 @@ icmp6 {
 
 
 
+  
+For all the configured load balancing rules that include an IPv4
+address VIP, and a list of IPv4 backend addresses
+B0, B1 .. Bn defined for the
+VIP a priority-200 flow is added that matches 
+ip4  (ip4.src == B0 || ip4.src == B1
+|| ... || ip4.src == Bn) with an action 
+outport = CR; next; where CR is the
+chassisredirect port representing the instance of the
+logical router distributed gateway port on the gateway chassis.
+If the backend IPv4 address Bx is also configured with
+L4 port PORT of protocol P, then the match
+also includes P.src == PORT.
+Similar flows are addeded for IPv6.
+  
+
   
 For each NAT rule in the OVN Northbound database that can
 be handled in a distributed manner, a priority-100 logical
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index e3669bdf5..59f932c5f 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -9484,3 +9484,51 @@ AT_CHECK([ovn-sbctl lflow-list sw | grep ls_out_pre_lb | 
grep priority=110 | gre
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([check fip and lb flows])
+AT_KEYWORDS([fip-lb-flows])
+ovn_start
+
+ovn-nbctl lr-add R1
+ovn-nbctl lrp-add R1 R1-S0 02:ac:10:01:00:01 10.0.0.1/24 1000::a/64
+ovn-nbctl lrp-add R1 R1-PUB 02:ac:20:01:01:01 172.16.0.1/24 3000::a/64
+ovn-nbctl lrp-set-gateway-chassis R1-PUB hv1 20
+
+ovn-nbctl ls-add S0
+ovn-nbctl lsp-add S0 S0-R1
+ovn-nbctl lsp-set-type S0-R1 router
+ovn-nbctl lsp-set-addresses S0-R1 02:ac:10:01:00:01
+ovn-nbctl lsp-set-options S0-R1 router-port=R1-S0
+ovn-nbctl lsp-add S0 S0-P0
+ovn-nbctl lsp-set-addresses S0-P0 "50:54:00:00:00:03 10.0.0.3 1000::3"
+
+ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.0.110 10.0.0.3 S0-P0 
30:54:00:00:00:03
+ovn-nbctl lr-nat-add R1 dnat_and_snat 3000::c 1000::3 S0-P0 40:54:00:00:00:03
+
+ovn-sbctl dump-flows R1 > R1flows
+AT_CAPTURE_FILE([R1flows])
+AT_CHECK([grep "lr_in_gw_redirect" R1flows | sort], [0], [dnl
+  table=20(lr_in_gw_redirect  ), priority=0, match=(1), action=(next;)
+  table=20(lr_in_gw_redirect  ), priority=100  , match=(ip4.src == 10.0.0.3 && 
outport == "R1-PUB" && is_chassis_resident("S0-P0")), action=(eth.src = 
30:54:00:00:00:03; reg1 = 172.16.0.110; 

Re: [ovs-dev] [PATCH] relay: allow setting probe interval

2023-05-24 Thread Simon Horman
On Fri, May 19, 2023 at 05:13:23PM +0200, Felix Huettner via dev wrote:
> previously it was not possible to set the probe interval for the
> connection from a relay to the backing ovsdb-server. With this change it
> is now possible using the
> `ovsdb-server/set-active-ovsdb-server-probe-interval` command.
> 
> The command `ovsdb-server/set-active-ovsdb-server-probe-interval` is
> already used to set the probe interval for active-backup replication.
> However this is mutally exclusive with being a relay and the case for
> using the command is the same. Therefor we decided to reuse it instead
> of adding a new one.
> 
> Signed-off-by: Felix Huettner 

Reviewed-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 ovn] northd: centralized reply lb traffic even if FIP is defined

2023-05-24 Thread Simon Horman
Thanks Lorenzo,

a few suggestions from my side.

On Thu, May 18, 2023 at 10:20:34PM +0200, Lorenzo Bianconi wrote:
> In the current codebase for distributed gw router port use-case,
> it is not possible to add a load balancer that redirects the traffic
> to a backed if it is even the internal IP of a FIP NAT rule since

s/backed/back-end/
s/even// ?

> the reply traffic is never centralized. Fix the issue centralizing the
> traffic if it is the reply packet for the load balancer.
> 
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2023609
> Signed-off-by: Lorenzo Bianconi 

...

> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 7da912da3..1703d8bae 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -4452,6 +4452,22 @@ icmp6 {
>  
>  
>  
> +  
> +For all the configured load balancing rules that includes an IPv4

maybe: s/includes/include/

> +address VIP, and a list of IPv4 backend addresses
> +B0, B1 .. Bn defined for the
> +VIP a priority-200 flow that matches ip4 

maybe: s/flow/flow is added/ ?

> +(ip4.src == B0 || ip4.src == B1 || ...
> +|| ip4.src == Bn) with an action 
> +outport = CR; next; where CR is the
> +chassisredirect port representing the instance of the
> +logical router distributed gateway port on the gateway chassis.
> +If the backend IPv4 address Bx is also configured with
> +L4 port PORT of protocol P, then the match
> +also includes P.src == PORT.
> +Similar flows are addeded for IPv6 counterpart.

maybe: s/counterpart//

> +  
> +
>
>  For each NAT rule in the OVN Northbound database that can
>  be handled in a distributed manner, a priority-100 logical

...
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10] netdev-dpdk: add control plane protection support

2023-05-24 Thread Robin Jarry
Kevin Traynor, May 24, 2023 at 16:54:
> > The "rss" item should be mandatory anyway. There should be no way to
> > disable it. I assume that it is what Ilya meant with these "+" symbols.
> > That would leave us with these examples:
> > 
> > # lacp+bfd on separate queue, rss on other queues
> > 
> >  options:rx-steering=rss,lacp,bfd
> > 
> > # same
> > 
> >  options:rx-steering=lacp,bfd,rss
>
> ^ It looks a little odd to me that some values are for single queues and 
> some are for the rest of the queues, with no way to distinguish.
>
> So maybe Ilya had placed significance in the order with his suggestion ? 
> i.e. [default]+[single queue proto]+[other single queue proto]+...
>
> I don't want to bike shed too much on it, i guess with good docs, it can 
> be clear anyway.
>
> > # only regular rss, same as no config at all
> > 
> >  options:rx-steering=rss
> > 
> > # invalid configurations
> > 
> >  options:rx-steering=lacp
>
> >  options:rx-steering=
> (^ This will be ok and use the default rss)

In that case, maybe we should exclude "rss" from the available items to
avoid confusion and only require users to specify the protocols that
need to be put in a separate queue. The remaining queues being rss by
default.

Would that be ok?

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10] netdev-dpdk: add control plane protection support

2023-05-24 Thread Ilya Maximets
On 5/24/23 16:54, Kevin Traynor wrote:
> On 24/05/2023 15:32, Robin Jarry wrote:
>> Kevin Traynor, May 24, 2023 at 16:06:
>>> Hmm, not sure on this one. I have a feeling that having a 'hash' mode
>>> for tx-steering that only applies to vhost devices, and 'hash' mode
>>> for rx-steering that only applies to NICs means people will miss the
>>> subtlety and try to enable the wrong hash mode on the wrong device :-)
>>> So 'rss' might make it more distinct.
>>
>> "rss" is fine then.
>>
>>> Any reason for '+' ? I think commas are used more frequently. If it
>>> needed to be extended in future for some extra config, then adding
>>> 'key:value' pairs would be seamlessly. e.g. =rss:,
>>> lacp:
>>
>> The "rss" item should be mandatory anyway. There should be no way to
>> disable it. I assume that it is what Ilya meant with these "+" symbols.
>> That would leave us with these examples:
>>
>> # lacp+bfd on separate queue, rss on other queues
>>
>>  options:rx-steering=rss,lacp,bfd
>>
>> # same
>>
>>  options:rx-steering=lacp,bfd,rss
>>
> 
> ^ It looks a little odd to me that some values are for single queues and 
> some are for the rest of the queues, with no way to distinguish.
> 
> So maybe Ilya had placed significance in the order with his suggestion ? 
> i.e. [default]+[single queue proto]+[other single queue proto]+...

I had a '+' because rss and lacp are two different entities and I looked
at it as a mode of operation. i.e. RSS plus special handling for LACP.
RSS looks strange in a comma-separated list, IMO.

> 
> I don't want to bike shed too much on it, i guess with good docs, it can 
> be clear anyway.
> 
>> # only regular rss, same as no config at all
>>
>>  options:rx-steering=rss
>>
>> # invalid configurations
>>
>>  options:rx-steering=lacp
> 
>>  options:rx-steering=
> (^ This will be ok and use the default rss)
> 
>>  options:rx-steering=bfd,lacp
>>
>> What do you think?
>>
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ofproto-dpif-xlate: Reduce stack usage in recursive xlate functions

2023-05-24 Thread Simon Horman
On Thu, May 18, 2023 at 04:08:34PM -0400, Mike Pattrick wrote:
> Several xlate actions used in recursive translation currently store a
> large amount of information on the stack. This can result in handler
> threads quickly running out of stack space despite before
> xlate_resubmit_resource_check() is able to terminate translation. This
> patch reduces stack usage by over 3kb from several translation actions.
> 
> This patch also moves some trace function from do_xlate_actions into its
> own function to reduce stack usage.
> 
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2104779
> Signed-off-by: Mike Pattrick 
> 
> ---
> 
> Since v1:
>  - Refactored code into specialized functions and renamed variables for
>  improved readability.
> 
> ---
>  ofproto/ofproto-dpif-xlate.c | 223 ++-
>  1 file changed, 141 insertions(+), 82 deletions(-)
> 
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 29f4daa63..abfa717e2 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -501,6 +501,90 @@ ctx_cancel_freeze(struct xlate_ctx *ctx)
>  
>  static void finish_freezing(struct xlate_ctx *ctx);
>  
> +/* These functions and structure are used to save stack space in actions that
> + * need to retain a large amount of xlate_ctx state. */
> +struct xretained_state {
> +union mf_subvalue new_stack[1024 / sizeof(union mf_subvalue)];
> +uint64_t actset_stub[1024 / 8];
> +struct ofpbuf old_stack;
> +struct ofpbuf old_action_set;
> +struct flow old_flow;
> +struct flow old_base;
> +struct flow_tnl flow_tnl;
> +};
> +
> +/* The return of this function must be freed by xretain_state_restore(). */
> +static inline struct xretained_state *
> +xretain_state_save(struct xlate_ctx *ctx)

Hi Mike,

nit: I don't think the inline keyword is needed here, or elsewhere in this
 patch. Because the compiler should be able to figure this out.
 So unless there is a strong reason I'd prefer to drop 'inline'.

Otherwise the patch looks good to me.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10] netdev-dpdk: add control plane protection support

2023-05-24 Thread Robin Jarry
Hey Kevin,

Kevin Traynor, May 24, 2023 at 16:13:
> Hi Robin,
>
> I tested combinations of enabling/disabling cp-proto and enabling
> hwol. It is working as expected and hwol feature always has a clear
> priority, regardless of the order they are enabled in.
>
> I didn't test lacp traffic, but I was able to see that the rxq was
> added/removed for it and that rss was working on the other rxqs.
>
> I also tested combinations of different numbers of rxqs,
> adding/deleting rxqs, enabling/disabling cp-proto etc and working
> fine. I also tested enabling cp-proto and changing rxqs in the same
> command, working fine.

Thanks for testing!

> > +/* User input for n_rxq + an optional control plane protection 
> > queue
> > + * (see netdev_dpdk_reconfigure). This field is different from the
> > + * other requested_* fields as it may contain a different value 
> > than
> > + * the user input. */
>
> ^ I don't think this comment is really needed. requested_rxq_size can
> also be adjusted and they don't always come from user input, sometimes
> just the OVS default.
...
> > +/* User input for n_rxq (see dpdk_set_rxq_config). */
> > +int user_n_rxq;
>
> I also put in a similar 3rd stage status in recent patches for
> rx/tx-descriptors but managed to get rid of it. You might be able to
> remove this and just check check cp-proto flags do a +1/-1 where
> needed?

I had already tried this and it was not possible. I will have another
fresh look. Maybe I was thinking about this the wrong way.

> > +/*
> > + * Some drivers set reta_size equal to the total number of rxqs 
> > that
> > + * are configured when it is a power of two. Since we are actually
> > + * reconfiguring the redirection table to exclude the last rxq, we 
> > may
> > + * end up with an imbalanced redirection table. For example, such
> > + * configuration:
> > + *
> > + *   options:n_rxq=3 options:cp-protection=lacp
> > + *
> > + * Will actually configure 4 rxqs on the NIC, and the default reta 
> > to:
> > + *
> > + *   [0, 1, 2, 3]
> > + *
> > + * And dpdk_cp_prot_rss_configure() will reconfigure reta to:
> > + *
> > + *   [0, 1, 2, 0]
> > + *
> > + * Causing queue 0 to receive twice as much traffic as queues 1 
> > and 2.
> > + *
> > + * Work around that corner case by forcing a bigger redirection 
> > table
> > + * size to 128 entries when reta_size is not a multiple of 
> > rss_n_rxq
> > + * and when reta_size is less than 128. This value seems to be
> > + * supported by most of the drivers that also support rte flow.
> > + */
> > +info.reta_size = RTE_ETH_RSS_RETA_SIZE_128;
>
> Thanks for following up on previous comments and adding this
> workaround. I think you didn't get much response about it, so might be
> worth adding a DPDK bugzilla to formally report/track it.

I don't think it is a bug. If the default redirection table is not
modified, there will be proper balancing on all Rx queues. The issue we
have here is that we are reconfiguring the table and removing the last
queue from it which can lead to imbalances in some cases. Do you think
this is worth reporting anyway?

> > +try_cp_prot = dev->requested_cp_prot_flags != 0;
> > +dev->requested_n_rxq = dev->user_n_rxq;
> > +if (try_cp_prot) {
> > +dev->requested_n_rxq += 1;
>
> Minor, but adjusting the requested_n_rxq's seems more suited to
> dpdk_cp_prot_set_config() (and move it after dpdk_set_rxq_config() in
> netdev_dpdk_set_config()) as this is the function that applies the
> configuration. I admit, it's partly aesthetic, so that this function
> still just checks for changes that require a reconfig and returns if
> there are none :-)

This is related to the introduction of user_n_rxq. I had tried to make
it work and this was the only way. I'll have another fresh look.

Cheers,
Robin

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10] netdev-dpdk: add control plane protection support

2023-05-24 Thread Kevin Traynor

On 24/05/2023 15:32, Robin Jarry wrote:

Kevin Traynor, May 24, 2023 at 16:06:

Hmm, not sure on this one. I have a feeling that having a 'hash' mode
for tx-steering that only applies to vhost devices, and 'hash' mode
for rx-steering that only applies to NICs means people will miss the
subtlety and try to enable the wrong hash mode on the wrong device :-)
So 'rss' might make it more distinct.


"rss" is fine then.


Any reason for '+' ? I think commas are used more frequently. If it
needed to be extended in future for some extra config, then adding
'key:value' pairs would be seamlessly. e.g. =rss:,
lacp:


The "rss" item should be mandatory anyway. There should be no way to
disable it. I assume that it is what Ilya meant with these "+" symbols.
That would leave us with these examples:

# lacp+bfd on separate queue, rss on other queues

 options:rx-steering=rss,lacp,bfd

# same

 options:rx-steering=lacp,bfd,rss



^ It looks a little odd to me that some values are for single queues and 
some are for the rest of the queues, with no way to distinguish.


So maybe Ilya had placed significance in the order with his suggestion ? 
i.e. [default]+[single queue proto]+[other single queue proto]+...


I don't want to bike shed too much on it, i guess with good docs, it can 
be clear anyway.



# only regular rss, same as no config at all

 options:rx-steering=rss

# invalid configurations

 options:rx-steering=lacp



 options:rx-steering=

(^ This will be ok and use the default rss)


 options:rx-steering=bfd,lacp

What do you think?



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: layer3-tunnels: Skip bareudp tests if not supported by kernel.

2023-05-24 Thread Simon Horman
On Wed, May 24, 2023 at 03:39:53PM +0200, Frode Nordahl wrote:
> The bareudp tests depend on specific kernel configuration to
> succeed.  Skip the test if the feature is not enabled in the
> running kernel.
> 
> Signed-off-by: Frode Nordahl 

Tested-by: Simon Horman 
Reviewed-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10] netdev-dpdk: add control plane protection support

2023-05-24 Thread Robin Jarry
Kevin Traynor, May 24, 2023 at 16:06:
> Hmm, not sure on this one. I have a feeling that having a 'hash' mode
> for tx-steering that only applies to vhost devices, and 'hash' mode
> for rx-steering that only applies to NICs means people will miss the
> subtlety and try to enable the wrong hash mode on the wrong device :-)
> So 'rss' might make it more distinct.

"rss" is fine then.

> Any reason for '+' ? I think commas are used more frequently. If it
> needed to be extended in future for some extra config, then adding
> 'key:value' pairs would be seamlessly. e.g. =rss:,
> lacp:

The "rss" item should be mandatory anyway. There should be no way to
disable it. I assume that it is what Ilya meant with these "+" symbols.
That would leave us with these examples:

# lacp+bfd on separate queue, rss on other queues

options:rx-steering=rss,lacp,bfd

# same

options:rx-steering=lacp,bfd,rss

# only regular rss, same as no config at all

options:rx-steering=rss

# invalid configurations

options:rx-steering=lacp
options:rx-steering=
options:rx-steering=bfd,lacp

What do you think?

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10] netdev-dpdk: add control plane protection support

2023-05-24 Thread Kevin Traynor

On 17/04/2023 13:37, Robin Jarry wrote:

Some control protocols are used to maintain link status between
forwarding engines (e.g. LACP). When the system is not sized properly,
the PMD threads may not be able to process all incoming traffic from the
configured Rx queues. When a signaling packet of such protocols is
dropped, it can cause link flapping, worsening the situation.

Use the RTE flow API to redirect these protocols into a dedicated Rx
queue. The assumption is made that the ratio between control protocol
traffic and user data traffic is very low and thus this dedicated Rx
queue will never get full. The RSS redirection table is re-programmed to
only use the other Rx queues. The RSS table size is stored in the
netdev_dpdk structure at port initialization to avoid requesting the
information again when changing the port configuration.

The additional Rx queue will be assigned a PMD core like any other Rx
queue. Polling that extra queue may introduce increased latency and
a slight performance penalty at the benefit of preventing link flapping.

This feature must be enabled per port on specific protocols via the
cp-protection option. This option takes a coma-separated list of
protocol names. It is only supported on ethernet ports. This feature is
experimental.

If the user has already configured multiple Rx queues on the port, an
additional one will be allocated for control plane packets. If the
hardware cannot satisfy the requested number of requested Rx queues, the
last Rx queue will be assigned for control plane. If only one Rx queue
is available, the cp-protection feature will be disabled. If the
hardware does not support the RTE flow matchers/actions, the feature
will be disabled.

It cannot be enabled when other_config:hw-offload=true as it may
conflict with the offloaded RTE flows. Similarly, if hw-offload is
enabled, cp-protection will be forcibly disabled on all ports.

Example use:

  ovs-vsctl add-bond br-phy bond0 phy0 phy1 -- \
set interface phy0 type=dpdk options:dpdk-devargs=:ca:00.0 -- \
set interface phy0 options:cp-protection=lacp -- \
set interface phy1 type=dpdk options:dpdk-devargs=:ca:00.1 -- \
set interface phy1 options:cp-protection=lacp

As a starting point, only one protocol is supported: LACP. Other
protocols can be added in the future. NIC compatibility should be
checked.

To validate that this works as intended, I used a traffic generator to
generate random traffic slightly above the machine capacity at line rate
on a two ports bond interface. OVS is configured to receive traffic on
two VLANs and pop/push them in a br-int bridge based on tags set on
patch ports.

+--+
| DUT  |
|++|
||   br-int   || 
in_port=patch10,actions=mod_dl_src:$patch11,mod_dl_dst:$tgen1,output:patch11
|||| 
in_port=patch11,actions=mod_dl_src:$patch10,mod_dl_dst:$tgen0,output:patch10
|| patch10patch11 ||
|+---|---|+|
||   | |
|+---|---|+|
|| patch00patch01 ||
||  tag:10tag:20  ||
||||
||   br-phy   || default flow, action=NORMAL
||||
||   bond0|| balance-slb, lacp=passive, lacp-time=fast
||phy0   phy1 ||
|+--|-|---+|
+---|-|+
| |
+---|-|+
| port0  port1 | balance L3/L4, lacp=active, lacp-time=fast
| lag  | mode trunk VLANs 10, 20
|  |
|switch|
|  |
|  vlan 10vlan 20  |  mode access
|   port2  port3   |
+-|--|-+
  |  |
+-|--|-+
|   tgen0  tgen1   |  Random traffic that is properly balanced
|  |  across the bond ports in both directions.
|  traffic generator   |
+--+

Without cp-protection, the bond0 links are randomly switching to
"defaulted" when one of the LACP packets sent by the switch is dropped
because the RX queues are full and the PMD threads did not process them
fast enough. When that happens, all traffic must go through a single
link which causes above line rate traffic to be dropped.

  ~# ovs-appctl lacp/show-stats bond0
   bond0 statistics 
  member: phy0:
TX PDUs: 347246
RX PDUs: 14865
RX Bad PDUs: 0
RX Marker Request PDUs: 0
Link Expired: 168
Link Defaulted: 0
Carrier Status Changed: 0
  member: phy1:
TX PDUs: 347245
RX PDUs: 14919
RX Bad PDUs: 0
RX Marker Request PDUs: 0
Link Expired: 147
Link Defaulted: 1
Carrier Status Changed: 0

When cp-protection is enabled, no LACP packet is dropped and the bond
links remain enabled at all times, maximizing the throughput. Neither
the "Link Expired" nor the "Link Defaulted" counters are incremented

Re: [ovs-dev] [PATCH v10] netdev-dpdk: add control plane protection support

2023-05-24 Thread Kevin Traynor

On 24/05/2023 11:41, Ilya Maximets wrote:

On 5/24/23 09:18, Robin Jarry wrote:

Ilya Maximets, May 23, 2023 at 22:04:

'rxq-isolate' will be confused with 'other_config:pmd-rxq-isolate'.
Same likely goes for the 'isolated-rxq'.

'rxq-steernig' may be confused with 'other_config:tx-steering'.
But this can be argued that it's essentially similar functionality,
so should be named similarly.  Maybe we can double down on that and
use options:rx-steernig=rss|rss+lacp|...  with 'rss' being a default
configuration?


I like that idea. Instead of "rss", how about "hash" to match what
tx-steering is using?


I'm not sure if it's better to use 'rss' or 'hash' in this context.
Aaron, Kevin, what do you think?



Hmm, not sure on this one. I have a feeling that having a 'hash' mode 
for tx-steering that only applies to vhost devices, and 'hash' mode for 
rx-steering that only applies to NICs means people will miss the 
subtlety and try to enable the wrong hash mode on the wrong device :-) 
So 'rss' might make it more distinct.




So the value of options:rx-steering would be "hash" followed by an
arbitrary list of preset protocols (for now, only "lacp") all separated
by "+". It may also open the door for other base steering methods than
"hash" ("rr" for round-robin some day?).

If that is OK, I can prepare a v11 with everything renamed.


Any reason for '+' ? I think commas are used more frequently. If it 
needed to be extended in future for some extra config, then adding 
'key:value' pairs would be seamlessly. e.g.

=rss:, lacp:



Might make sense for the code review.  Kevin was looking at the
patch, IIUC.

Best regards, Ilya Maximets.



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] tests: layer3-tunnels: Skip bareudp tests if not supported by kernel.

2023-05-24 Thread Frode Nordahl
The bareudp tests depend on specific kernel configuration to
succeed.  Skip the test if the feature is not enabled in the
running kernel.

Signed-off-by: Frode Nordahl 
---
 tests/system-kmod-macros.at  | 10 ++
 tests/system-layer3-tunnels.at   |  2 ++
 tests/system-userspace-macros.at |  8 
 3 files changed, 20 insertions(+)

diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
index fb15a5a7c..55e7821ce 100644
--- a/tests/system-kmod-macros.at
+++ b/tests/system-kmod-macros.at
@@ -237,3 +237,13 @@ m4_define([CHECK_L3L4_CONNTRACK_REASM])
 #
 # The kernel module tests do not use TC offload.
 m4_define([CHECK_NO_TC_OFFLOAD])
+
+# OVS_CHECK_BAREUDP()
+#
+# The feature needs to be enabled in the kernel configuration (CONFIG_BAREUDP)
+# to work.
+m4_define([OVS_CHECK_BAREUDP],
+[
+AT_SKIP_IF([! ip link add dev bareudp0 type bareudp dstport 6635 ethertype 
mpls_uc 2>&1 >/dev/null])
+AT_CHECK([ip link del dev bareudp0])
+])
diff --git a/tests/system-layer3-tunnels.at b/tests/system-layer3-tunnels.at
index c37852b21..5546bc879 100644
--- a/tests/system-layer3-tunnels.at
+++ b/tests/system-layer3-tunnels.at
@@ -155,6 +155,7 @@ AT_CLEANUP
 
 AT_SETUP([layer3 - ping over MPLS Bareudp])
 OVS_CHECK_MIN_KERNEL(5, 7)
+OVS_CHECK_BAREUDP()
 OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])])
 ADD_NAMESPACES(at_ns0, at_ns1)
 
@@ -203,6 +204,7 @@ AT_CLEANUP
 
 AT_SETUP([layer3 - ping over Bareudp])
 OVS_CHECK_MIN_KERNEL(5, 7)
+OVS_CHECK_BAREUDP()
 OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])])
 ADD_NAMESPACES(at_ns0, at_ns1)
 
diff --git a/tests/system-userspace-macros.at b/tests/system-userspace-macros.at
index 482079386..1cb67d6f6 100644
--- a/tests/system-userspace-macros.at
+++ b/tests/system-userspace-macros.at
@@ -336,3 +336,11 @@ m4_define([CHECK_L3L4_CONNTRACK_REASM],
 #
 # Userspace tests do not use TC offload.
 m4_define([CHECK_NO_TC_OFFLOAD])
+
+# OVS_CHECK_BAREUDP()
+#
+# The userspace skips all tests that check kernel configuration.
+m4_define([OVS_CHECK_BAREUDP],
+[
+AT_SKIP_IF([:])
+])
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] ovs-vsctl Bridge IPFIX enable_input_sampling, enable_ouput_sampling and enable_tunnel_sampling returning unexpected values

2023-05-24 Thread Adrian Moreno



On 5/23/23 22:26, Ilya Maximets wrote:

On 5/23/23 20:51, Sayali Naval (sanaval) via dev wrote:

Does anyone have any insights on the below?

From: Sayali Naval (sanaval)
Sent: Friday, May 12, 2023 11:10 AM
To: d...@openvswitch.org 
Subject: [ovs-dev] ovs-vsctl Bridge IPFIX enable_input_sampling, 
enable_ouput_sampling and enable_tunnel_sampling returning unexpected values


As per the Open vSwitch Manual 
(http://www.openvswitch.org/support/dist-docs/ovs-vsctl.8.txt) the Bridge IPFIX 
parameters can be passed as follows:

ovs-vsctl -- set Bridge br0 ipfix=@i \
   --  --id=@i  create  IPFIX targets=\"192.168.0.34:4739\" obs_do‐
   main_id=123   obs_point_id=456   cache_active_timeout=60
   cache_max_flows=13 \
   other_config:enable-input-sampling=false \
   other_config:enable-output-sampling=false \
   other_config:enable-tunnel-sampling=true

where the default values are:

enable_input_sampling: true
enable_output_sampling: true
enable_tunnel_sampling: false



I think wording is very confusing. In ovs-vsctl(8), before the example you 
provide, it says:

"
Configure bridge br0 to send one IPFIX flow record per packet
sample to UDP port 4739 on host 192.168.0.34, with Observation Domain
ID 123 and Observation Point ID 456, a flow cache active timeout of 1
minute (60 seconds), maximum flow cache size of 13 flows, and flows
sampled on output port with tunnel info(sampling on input and output
port is enabled by default if not disabled) :
"

It doesn't explicitly say what's the default value for 
"other_config:enable-tunnel-sampling". On the other hand, the 
ovs-vswitchd.conf.db(5) says:


"
other_config : enable-input-sampling: optional string, either true or false
By default, Open vSwitch samples and reports flows at bridge port input in 
IPFIX flow records. Set this column to false to disable input sampling.


other_config : enable-output-sampling: optional string, either true or false
By default, Open vSwitch samples and reports flows at bridge port output in 
IPFIX flow records. Set this column to false to disable output sampling.

"

One might interpret this as 'you can only set this to "false", since "true" is 
the default value'. If it was possible to forbid the user to put "true" in the 
field, maybe the current implementation would be reasonable. But it's not.


Now for the tunnel-sampling:
"
other_config : enable-tunnel-sampling: optional string, either true or false
Set to true to enable sampling and reporting tunnel header 7-tuples in 
IPFIX flow records. Tunnel sampling is enabled by default.

"

So the default value is true after all!



But in the existing code 
https://github.com/openvswitch/ovs/blob/master/vswitchd/bridge.c#L1560-L1567, 
these 3 parameters take up unexpected values in some scenarios.

Consider the following case for enable_input_sampling and 
enable_output_sampling:

 be_opts.enable_input_sampling = !smap_get_bool(_cfg->other_config,
   "enable-input-sampling", false);

 be_opts.enable_output_sampling = !smap_get_bool(_cfg->other_config,
   "enable-output-sampling", false);

Here, the function smap_get_bool is being used with a negation.

smap_get_bool is defined as below:
(https://github.com/openvswitch/ovs/blob/master/lib/smap.c#L220-L232)

/* Gets the value associated with 'key' in 'smap' and converts it to a boolean.
  * If 'key' is not in 'smap', or its value is neither "true" nor "false",
  * returns 'def'. */
bool
smap_get_bool(const struct smap *smap, const char *key, bool def)
{
 const char *value = smap_get_def(smap, key, "");

 if (def) {
 return strcasecmp("false", value) != 0;
 } else {
 return !strcasecmp("true", value);
 }
}

This returns expected values for the default case (since the above code will 
negate “false” we get from smap_get bool function and return the value “true”) 
but unexpected values for the case where the sampling value is passed through 
the CLI.

For example, if we pass "true" for other_config:enable-input-sampling in the 
CLI, the above code will negate the “true” value we get from the smap_bool function and 
return the value “false”. Same would be the case for enable_output_sampling.

My proposed solution is as below:

 be_opts.enable_input_sampling = smap_get_bool(_cfg->other_config,
   "enable-input-sampling", true);

 be_opts.enable_output_sampling = smap_get_bool(_cfg->other_config,
   "enable-output-sampling", true);

Here, by removing the negation on the smap_get bool function, we make sure the 
CLI value is returned as expected and by changing the 3rd parameter to “true”, 
we return the default value “true” as expected.


Yeah, the negation is very strange.  I'd 

Re: [ovs-dev] [PATCH v13 3/4] userspace: Enable IP checksum offloading by default.

2023-05-24 Thread Ilya Maximets
On 5/17/23 05:11, Mike Pattrick wrote:
> From: Flavio Leitner 
> 
> The netdev receiving packets is supposed to provide the flags
> indicating if the IP checksum was verified and it is GOOD or BAD,
> otherwise the stack will check when appropriate by software.
> 
> If the packet comes with good checksum, then postpone the
> checksum calculation to the egress device if needed.
> 
> When encapsulate a packet with that flag, set the checksum
> of the inner IP header since that is not yet supported.
> 
> Calculate the IP checksum when the packet is going to be sent over
> a device that doesn't support the feature.
> 
> Linux devices don't support IP checksum offload alone, so the
> support is not enabled.
> 
> Signed-off-by: Flavio Leitner 
> Co-authored-by: Mike Pattrick 
> Signed-off-by: Mike Pattrick 
> --
>  Since v9:
>   - Removed duplicative field tx_ip_csum_offload from netdev-dpdk.c
>   - Left rx_csum_offload field as it is not duplicative
>   - Moved system-userspace-offload.at tests to dpif-netdev.at
>   - Various visual changes
>   - Extended miniflow_extract changes into avx512 code
>  Since v10:
>   - avx512 checksum length corrected
>  Since v11:
>   - If hw-offload and userspace-tso is enabled, don't allow dpdk to
>   offload RAW_ENCAP
> 
> Signed-off-by: Mike Pattrick 
> ---
>  lib/conntrack.c  | 19 
>  lib/dp-packet.c  | 15 ++
>  lib/dp-packet.h  | 62 +++--
>  lib/dpif-netdev-extract-avx512.c |  5 ++
>  lib/dpif-netdev.c|  2 +
>  lib/flow.c   | 15 --
>  lib/ipf.c| 11 +++--
>  lib/netdev-dpdk.c| 71 +++--
>  lib/netdev-dummy.c   | 23 ++
>  lib/netdev-native-tnl.c  | 21 ++---
>  lib/netdev-offload-dpdk.c| 18 ++--
>  lib/netdev.c | 16 +++
>  lib/odp-execute-avx512.c | 19 +---
>  lib/odp-execute.c| 21 +++--
>  lib/packets.c| 34 +++---
>  tests/dpif-netdev.at | 78 
>  16 files changed, 359 insertions(+), 71 deletions(-)
> 



> @@ -1174,6 +1190,13 @@ netdev_dummy_send(struct netdev *netdev, int qid,
>  }
>  
>  ovs_mutex_lock(>mutex);
> +if (dp_packet_hwol_tx_ip_csum(packet)) {
> +if (!dp_packet_ip_checksum_good(packet)) {

There is unnecessary level of nesting.

> +dp_packet_ip_set_header_csum(packet);
> +dp_packet_ol_set_ip_csum_good(packet);
> +}
> +}
> +
>  dev->stats.tx_packets++;
>  dev->txq_stats[qid].packets++;
>  dev->stats.tx_bytes += size;
> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> index 9abdf5107..bd64a05c9 100644
> --- a/lib/netdev-native-tnl.c
> +++ b/lib/netdev-native-tnl.c
> @@ -88,7 +88,10 @@ netdev_tnl_ip_extract_tnl_md(struct dp_packet *packet, 
> struct flow_tnl *tnl,
>  
>  ovs_be32 ip_src, ip_dst;
>  
> -if (OVS_UNLIKELY(!dp_packet_ip_checksum_valid(packet))) {
> +/* A packet coming from a network device might have the
> + * csum already checked. In this case, skip the check. */
> +if (OVS_UNLIKELY(!dp_packet_ip_checksum_good(packet))
> +&& !dp_packet_hwol_tx_ip_csum(packet)) {
>  if (csum(ip, IP_IHL(ip->ip_ihl_ver) * 4)) {
>  VLOG_WARN_RL(_rl, "ip packet has invalid checksum");
>  return NULL;
> @@ -142,7 +145,8 @@ netdev_tnl_ip_extract_tnl_md(struct dp_packet *packet, 
> struct flow_tnl *tnl,
>   *
>   * This function sets the IP header's ip_tot_len field (which should be 
> zeroed
>   * as part of 'header') and puts its value into '*ip_tot_size' as well.  Also
> - * updates IP header checksum, as well as the l3 and l4 offsets in 'packet'.
> + * updates IP header checksum if not offloaded, as well as the l3 and l4
> + * offsets in the 'packet'.
>   *
>   * Return pointer to the L4 header added to 'packet'. */
>  void *
> @@ -167,11 +171,16 @@ netdev_tnl_push_ip_header(struct dp_packet *packet,
>  *ip_tot_size -= IPV6_HEADER_LEN;
>  ip6->ip6_plen = htons(*ip_tot_size);
>  packet->l4_ofs = dp_packet_size(packet) - *ip_tot_size;
> +dp_packet_hwol_set_tx_ipv6(packet);
> +dp_packet_ol_reset_ip_csum_good(packet);
>  return ip6 + 1;
>  } else {
>  ip = netdev_tnl_ip_hdr(eth);
>  ip->ip_tot_len = htons(*ip_tot_size);
> -ip->ip_csum = recalc_csum16(ip->ip_csum, 0, ip->ip_tot_len);
> +/* Postpone checksum to when the packet is pushed to the port. */
> +dp_packet_hwol_set_tx_ipv4(packet);
> +dp_packet_hwol_set_tx_ip_csum(packet);
> +dp_packet_ol_reset_ip_csum_good(packet);
>  *ip_tot_size -= IP_HEADER_LEN;
>  packet->l4_ofs = dp_packet_size(packet) - *ip_tot_size;
>  return ip + 

Re: [ovs-dev] [PATCH v13 4/4] userspace: Enable L4 checksum offloading by default.

2023-05-24 Thread Ilya Maximets
On 5/17/23 05:11, Mike Pattrick wrote:
> From: Flavio Leitner 
> 
> The netdev receiving packets is supposed to provide the flags
> indicating if the L4 checksum was verified and it is OK or BAD,
> otherwise the stack will check when appropriate by software.
> 
> If the packet comes with good checksum, then postpone the
> checksum calculation to the egress device if needed.
> 
> When encapsulate a packet with that flag, set the checksum
> of the inner L4 header since that is not yet supported.
> 
> Calculate the L4 checksum when the packet is going to be sent
> over a device that doesn't support the feature.
> 
> Linux tap devices allows enabling L3 and L4 offload, so this
> patch enables the feature. However, Linux socket interface
> remains disabled because the API doesn't allow enabling
> those two features without enabling TSO too.
> 
> Signed-off-by: Flavio Leitner 
> Co-authored-by: Mike Pattrick 
> Signed-off-by: Mike Pattrick 
> 
> ---
>  Since v9:
>   - Extended miniflow_extract changes into avx512 code
>   - Formatting changes
>   - Note that we cannot currently enable checksum offloading in
> CONFIGURE_VETH_OFFLOADS for check-system-userspace as
> netdev-linux.c currently only parses the vnet header if TSO
> is enabled.
>  Since v10:
>   - No change
>  Since v11:
>   - Added AVX512 IPv6 checksum offload support.
>   - Improved error messages and logging.
>  Since v12:
>   - Added missing mutex annotations
> 
> Signed-off-by: Mike Pattrick 
> ---
>  lib/conntrack.c  |  15 +-
>  lib/dp-packet.c  |  25 
>  lib/dp-packet.h  |  78 +-
>  lib/dpif-netdev-extract-avx512.c |  62 +++-
>  lib/flow.c   |  23 +++
>  lib/netdev-dpdk.c| 176 +++---
>  lib/netdev-linux.c   | 243 +--
>  lib/netdev-native-tnl.c  |  32 +---
>  lib/netdev.c |  46 ++
>  lib/odp-execute-avx512.c |  88 ++-
>  lib/packets.c| 175 +-
>  lib/packets.h|   3 +
>  12 files changed, 688 insertions(+), 278 deletions(-)
> 



> @@ -1403,7 +1409,6 @@ static int
>  netdev_linux_batch_rxq_recv_tap(struct netdev_rxq_linux *rx, int mtu,
>  struct dp_packet_batch *batch)
>  {
> -int virtio_net_hdr_size;
>  ssize_t retval;
>  size_t std_len;
>  int iovlen;
> @@ -1413,16 +1418,14 @@ netdev_linux_batch_rxq_recv_tap(struct 
> netdev_rxq_linux *rx, int mtu,
>  /* Use the buffer from the allocated packet below to receive MTU
>   * sized packets and an aux_buf for extra TSO data. */
>  iovlen = IOV_TSO_SIZE;
> -virtio_net_hdr_size = sizeof(struct virtio_net_hdr);
>  } else {
>  /* Use only the buffer from the allocated packet. */
>  iovlen = IOV_STD_SIZE;
> -virtio_net_hdr_size = 0;
>  }
>  
>  /* The length here needs to be accounted in the same way when the
>   * aux_buf is allocated so that it can be prepended to TSO buffer. */
> -std_len = virtio_net_hdr_size + VLAN_ETH_HEADER_LEN + mtu;
> +std_len = sizeof(struct virtio_net_hdr) + VLAN_ETH_HEADER_LEN + mtu;
>  for (i = 0; i < NETDEV_MAX_BURST; i++) {
>  struct dp_packet *buffer;
>  struct dp_packet *pkt;
> @@ -1462,7 +1465,7 @@ netdev_linux_batch_rxq_recv_tap(struct netdev_rxq_linux 
> *rx, int mtu,
>  pkt = buffer;
>  }
>  
> -if (virtio_net_hdr_size && netdev_linux_parse_vnet_hdr(pkt)) {
> +if (netdev_linux_parse_vnet_hdr(pkt)) {

If TUNSETOFFLOAD failed, we will not have a vnet header, right?

>  struct netdev *netdev_ = netdev_rxq_get_netdev(>up);
>  struct netdev_linux *netdev = netdev_linux_cast(netdev_);
>  
> @@ -1611,7 +1614,7 @@ netdev_linux_sock_batch_send(int sock, int ifindex, 
> bool tso, int mtu,
>   * on other interface types because we attach a socket filter to the rx
>   * socket. */
>  static int
> -netdev_linux_tap_batch_send(struct netdev *netdev_, bool tso, int mtu,
> +netdev_linux_tap_batch_send(struct netdev *netdev_, int mtu,
>  struct dp_packet_batch *batch)
>  {
>  struct netdev_linux *netdev = netdev_linux_cast(netdev_);
> @@ -1632,9 +1635,7 @@ netdev_linux_tap_batch_send(struct netdev *netdev_, 
> bool tso, int mtu,
>  ssize_t retval;
>  int error;
>  
> -if (tso) {
> -netdev_linux_prepend_vnet_hdr(packet, mtu);
> -}
> +netdev_linux_prepend_vnet_hdr(packet, mtu);

Is it allowed to add vnet header if TUNSETOFFLOAD failed?

>  
>  size = dp_packet_size(packet);
>  do {
> @@ -1765,7 +1766,7 @@ netdev_linux_send(struct netdev *netdev_, int qid 
> OVS_UNUSED,
>  
>  error = netdev_linux_sock_batch_send(sock, ifindex, tso, mtu, batch);
>  } else {
> -error = 

Re: [ovs-dev] [PATCH v13 2/4] dpif-netdev: Show netdev offloading flags.

2023-05-24 Thread Ilya Maximets
On 5/17/23 05:11, Mike Pattrick wrote:
> From: Flavio Leitner 
> 
> This patch modifies netdev_get_status to include information about
> checksum offload status by port, allowing the user to gain insight into
> where checksum offloading is active.
> 
> Signed-off-by: Flavio Leitner 
> Co-authored-by: Mike Pattrick 
> Signed-off-by: Mike Pattrick 
> Reviewed-by: David Marchand 
> ---
>  Since v9:
>   - Removed entire ovs-appctl dpif-netdev/offload-show command, replaced
> with a field in the netdev status.
>   - Removed duplicative field tx_tso_offload from netdev-dpdk.c
>  Since v10:
>   - No change
>  Since v11:
>   - Removed depricated documentation
>   - Clarified offload direction in status field
> ---
>  lib/netdev-dpdk.c |  5 -
>  lib/netdev-provider.h |  1 +
>  lib/netdev.c  | 30 +++---
>  tests/dpif-netdev.at  | 18 ++
>  4 files changed, 46 insertions(+), 8 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index fb0dd43f7..560694dbc 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1753,11 +1753,6 @@ netdev_dpdk_get_config(const struct netdev *netdev, 
> struct smap *args)
>  } else {
>  smap_add(args, "rx_csum_offload", "false");
>  }
> -if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
> -smap_add(args, "tx_tso_offload", "true");
> -} else {
> -smap_add(args, "tx_tso_offload", "false");
> -}
>  smap_add(args, "lsc_interrupt_mode",
>   dev->lsc_interrupt_mode ? "true" : "false");
>  
> diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
> index b5420947d..fcf52bdd9 100644
> --- a/lib/netdev-provider.h
> +++ b/lib/netdev-provider.h
> @@ -37,6 +37,7 @@ extern "C" {
>  struct netdev_tnl_build_header_params;
>  #define NETDEV_NUMA_UNSPEC OVS_NUMA_UNSPEC
>  
> +/* Keep this enum updated with translation to string below. */

What is this comment referring to?

>  enum netdev_ol_flags {
>  NETDEV_TX_OFFLOAD_IPV4_CKSUM = 1 << 0,
>  NETDEV_TX_OFFLOAD_TCP_CKSUM = 1 << 1,
> diff --git a/lib/netdev.c b/lib/netdev.c
> index c79778378..cc03308fb 100644
> --- a/lib/netdev.c
> +++ b/lib/netdev.c
> @@ -43,6 +43,7 @@
>  #include "netdev-provider.h"
>  #include "netdev-vport.h"
>  #include "odp-netlink.h"
> +#include "openvswitch/json.h"
>  #include "openflow/openflow.h"
>  #include "packets.h"
>  #include "openvswitch/ofp-print.h"
> @@ -1373,9 +1374,32 @@ netdev_get_next_hop(const struct netdev *netdev,
>  int
>  netdev_get_status(const struct netdev *netdev, struct smap *smap)
>  {
> -return (netdev->netdev_class->get_status
> -? netdev->netdev_class->get_status(netdev, smap)
> -: EOPNOTSUPP);
> +int err = EOPNOTSUPP;
> +
> +/* Set offload status only if relevant. */
> +if (netdev_get_dpif_type(netdev) &&
> +strcmp(netdev_get_dpif_type(netdev), "system")) {
> +
> +#define OL_ADD_STAT(name, bit) \
> +smap_add(smap, "tx_" name "_csum_offload", \
> + netdev->ol_flags & bit ? "true" : "false");
> +
> +OL_ADD_STAT("ip", NETDEV_TX_OFFLOAD_IPV4_CKSUM);
> +OL_ADD_STAT("tcp", NETDEV_TX_OFFLOAD_TCP_CKSUM);
> +OL_ADD_STAT("udp", NETDEV_TX_OFFLOAD_UDP_CKSUM);
> +OL_ADD_STAT("sctp", NETDEV_TX_OFFLOAD_SCTP_CKSUM);
> +#undef OL_ADD_STAT
> +smap_add(smap, "tx_tcp_seg_offload", netdev->ol_flags & \
> +NETDEV_TX_OFFLOAD_TCP_TSO ? "true" : "false");

Don't use line continuation is a normal C code.  And it's likely
better to write this way:

smap_add(smap, "tx_tcp_seg_offload",
 (netdev->ol_flags & NETDEV_TX_OFFLOAD_TCP_TSO)
 ? "true" : "false");

Or the macro can be re-used by moving the '_csum' part to the 'name'.

> +
> +err = 0;
> +}
> +
> +if (!netdev->netdev_class->get_status) {
> +return err;
> +}
> +
> +return netdev->netdev_class->get_status(netdev, smap);
>  }
>  
>  /* Returns all assigned IP address to  'netdev' and returns 0.
> diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at
> index baab60a22..60d789beb 100644
> --- a/tests/dpif-netdev.at
> +++ b/tests/dpif-netdev.at
> @@ -650,6 +650,24 @@ AT_CHECK([ovs-appctl revalidator/resume])
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
>  
> +AT_SETUP([dpif-netdev - check tx packet checksum offloading])
> +OVS_VSWITCHD_START(
> +  [add-port br0 p1 \
> +   -- set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p0.sock \
> +   -- set bridge br0 datapath-type=dummy \
> + other-config:datapath-id=1234 fail-mode=secure])
> +
> +AT_CHECK([ovs-vsctl get interface p1 status | sed -n 's/^{\(.*\).*}$/\1/p'], 
> [0], [dnl
> +tx_ip_csum_offload="false", tx_sctp_csum_offload="false", 
> tx_tcp_csum_offload="false", tx_tcp_seg_offload="false", 
> tx_udp_csum_offload="false"
> +], [])
> +
> +AT_CHECK([ovs-vsctl get interface br0 status | sed -n 
> 

Re: [ovs-dev] [PATCH v13 1/4] Documentation: Document netdev offload.

2023-05-24 Thread Ilya Maximets
On 5/17/23 05:11, Mike Pattrick wrote:
> From: Flavio Leitner 
> 
> Document the implementation of netdev hardware offloading
> in userspace datapath.
> 
> Signed-off-by: Flavio Leitner 
> Co-authored-by: Mike Pattrick 
> Signed-off-by: Mike Pattrick 
> ---
>  Since v9:
>   - Renamed documentation to reflect the userspace checksum nature of
> this feature
>   - Edited for formatting and clarity issues.
>  Since v10:
>   - No change
> Since v11:
>   - Modified document in accordance with Ilya's feedback
> ---
>  Documentation/automake.mk |  1 +
>  Documentation/topics/index.rst|  1 +
>  .../topics/userspace-checksum-offloading.rst  | 97 +++
>  3 files changed, 99 insertions(+)
>  create mode 100644 Documentation/topics/userspace-checksum-offloading.rst
> 
> diff --git a/Documentation/automake.mk b/Documentation/automake.mk
> index cdf3c9926..8bd3dbb2b 100644
> --- a/Documentation/automake.mk
> +++ b/Documentation/automake.mk
> @@ -57,6 +57,7 @@ DOC_SOURCE = \
>   Documentation/topics/record-replay.rst \
>   Documentation/topics/tracing.rst \
>   Documentation/topics/usdt-probes.rst \
> + Documentation/topics/userspace-checksum-offloading.rst \
>   Documentation/topics/userspace-tso.rst \
>   Documentation/topics/userspace-tx-steering.rst \
>   Documentation/topics/windows.rst \
> diff --git a/Documentation/topics/index.rst b/Documentation/topics/index.rst
> index 90d4c66e6..f239fcf83 100644
> --- a/Documentation/topics/index.rst
> +++ b/Documentation/topics/index.rst
> @@ -55,5 +55,6 @@ OVS
> userspace-tso
> idl-compound-indexes
> ovs-extensions
> +   userspace-checksum-offloading
> userspace-tx-steering
> usdt-probes
> diff --git a/Documentation/topics/userspace-checksum-offloading.rst 
> b/Documentation/topics/userspace-checksum-offloading.rst
> new file mode 100644
> index 0..8157c25ea
> --- /dev/null
> +++ b/Documentation/topics/userspace-checksum-offloading.rst
> @@ -0,0 +1,97 @@
> +..
> +  Licensed under the Apache License, Version 2.0 (the "License"); you may
> +  not use this file except in compliance with the License. You may obtain
> +  a copy of the License at
> +
> +  http://www.apache.org/licenses/LICENSE-2.0
> +
> +  Unless required by applicable law or agreed to in writing, software
> +  distributed under the License is distributed on an "AS IS" BASIS, 
> WITHOUT
> +  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See 
> the
> +  License for the specific language governing permissions and limitations
> +  under the License.
> +
> +  Convention for heading levels in Open vSwitch documentation:
> +
> +  ===  Heading 0 (reserved for the title in a document)
> +  ---  Heading 1
> +  ~~~  Heading 2
> +  +++  Heading 3
> +  '''  Heading 4
> +
> +  Avoid deeper levels because they do not render well.
> +
> +
> +Userspace Datapath - Checksum Offloading
> +
> +
> +This document explains the internals of Open vSwitch support for checksum
> +offloading in the userspace datapath.
> +
> +Design
> +--
> +
> +Open vSwitch strives to forward packets as they arrive regardless of whether
> +the checksum is correct or not. OVS is not responsible for fixing external
> +checksum issues.
> +
> +The checksum calculation can be offloaded to the NIC when the packet's 
> checksum
> +is verified, known to be good, or known to be destined for an interface that
> +will recalculate the checksum anyways.
> +
> +In other cases, OVS will update the checksum if packet contents is modified 
> in
> +a way that would also invalidate the checksum and the checksum status is not
> +known.

I can't make sense from this sentence.  In a way that the status is not known?
Sounds strange.  Maybe break it down into 2-3 simpler ones?

> +
> +The interface (internally referred to as a netdev) can set flags indicating
> +whether each packet's checksum is good or bad upon recipt. The checksum is

s/recipt/receipt/

> +considered unverified if no flag is set.
> +
> +When packets ingress into the datapath with good checksum, OVS should 
> postpone
> +checksum updates for these packets until they egress.
> +
> +When a packet egress the datapath, the packet flags and the egress interface
> +flags are verified to make sure all required offload features to send out the
> +packet are available on the egress interface. If not, the data path will fall
> +back to equivalent software implementation.
> +
> +
> +Interface (a.k.a. Netdev)
> +-
> +
> +When the interface initiates, it should set the flags to tell the datapath
> +which offload features are supported. For example, if the driver supports IP
> +checksum offloading, then ``netdev->ol_flags`` should set the flag
> +``NETDEV_TX_OFFLOAD_IPV4_CKSUM``.
> +
> +
> +Rules
> +-
> 

Re: [ovs-dev] [PATCH v4 2/2] netdev-dpdk: Check rx/tx descriptor sizes for device.

2023-05-24 Thread Simon Horman
On Tue, May 16, 2023 at 12:59:58PM +0100, Kevin Traynor wrote:
> By default OVS configures 2048 descriptors for tx and rx queues
> on DPDK devices. It also allows the user to configure those values.
> 
> If the values used are not acceptable to the device then queue setup
> would fail.
> 
> The device exposes it's max/min/alignment requirements and OVS
> applies some limits also. Use these to ensure an acceptable value
> is used for the number of descriptors on a device tx/rx.
> 
> If the default or user value is not acceptable, adjust to a suitable
> value and log.
> 
> Reported-at: https://bugzilla.redhat.com/2119876
> Signed-off-by: Kevin Traynor 

Reviewed-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 1/2] netdev-dpdk: Remove requested descriptors from get_config.

2023-05-24 Thread Simon Horman
On Tue, May 16, 2023 at 12:59:57PM +0100, Kevin Traynor wrote:
> There is no need to display 'requested_rx/tx_descriptors' and
> 'configured_rx/tx_descriptors' as they will be the same.
> 
> It is simpler to just have a single 'n_rxq/txq_desc' value.
> 
> Suggested-by: Ilya Maximets 
> Reviewed-by: David Marchand 
> Signed-off-by: Kevin Traynor 

Reviewed-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 2/2] netdev-dpdk: Check rx/tx descriptor sizes for device.

2023-05-24 Thread David Marchand
On Tue, May 16, 2023 at 2:00 PM Kevin Traynor  wrote:
>
> By default OVS configures 2048 descriptors for tx and rx queues
> on DPDK devices. It also allows the user to configure those values.
>
> If the values used are not acceptable to the device then queue setup
> would fail.
>
> The device exposes it's max/min/alignment requirements and OVS
> applies some limits also. Use these to ensure an acceptable value
> is used for the number of descriptors on a device tx/rx.
>
> If the default or user value is not acceptable, adjust to a suitable
> value and log.
>
> Reported-at: https://bugzilla.redhat.com/2119876
> Signed-off-by: Kevin Traynor 

Lgtm.
Reviewed-by: David Marchand 


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: Fix address set incremental processing

2023-05-24 Thread Ilya Maximets
On 5/23/23 10:09, Ales Musil wrote:
> The incremental processing is broken for addresses
> that have mask which could "erase" portion of the address
> itself e.g. 10.16.0.47/4, after applying the mask with token
> parser the address becomes 0.0.0.0/4, which is fine for
> logical flows. However, for the deletion/addition to database
> we need the original string representation which cannot be
> built from the parsed token.
> 
> Use svec instead for the comparison which allows us to keep the
> original strings.
> 
> The change to svec shouldn't have any significant performance
> impact, see the numbers below show. The setup was created by
> the scale script from Han [0].
> 
> Numbers with expr:
> Time spent on processing nb_cfg 1:
> ovn-northd delay before processing: 21ms
> ovn-northd completion:  544ms
> ovn-controller(s) completion:   544ms
> Time spent on processing nb_cfg 2:
> ovn-northd delay before processing: 17ms
> ovn-northd completion:  537ms
> ovn-controller(s) completion:   535ms
> Time spent on processing nb_cfg 3:
> ovn-northd delay before processing: 20ms
> ovn-northd completion:  552ms
> ovn-controller(s) completion:   550ms
> Time spent on processing nb_cfg 4:
> ovn-northd delay before processing: 16ms
> ovn-northd completion:  529ms
> ovn-controller(s) completion:   528ms
> 
> Numbers with svec:
> Time spent on processing nb_cfg 1:
> ovn-northd delay before processing: 19ms
> ovn-northd completion:  548ms
> ovn-controller(s) completion:   548ms
> Time spent on processing nb_cfg 2:
> ovn-northd delay before processing: 19ms
> ovn-northd completion:  537ms
> ovn-controller(s) completion:   537ms
> Time spent on processing nb_cfg 3:
> ovn-northd delay before processing: 15ms
> ovn-northd completion:  522ms
> ovn-controller(s) completion:   521ms
> Time spent on processing nb_cfg 4:
> ovn-northd delay before processing: 17ms
> ovn-northd completion:  522ms
> ovn-controller(s) completion:   520ms
> 
> [0] 
> https://github.com/hzhou8/ovn-test-script/blob/e1149318201309db6d96ae8d5a33fcfbbe1872a3/test_ovn_controller_scale.sh
> 
> Fixes: 0d5e0a6ced49 ("northd: Add I-P for syncing SB address sets.")
> Reported-at: https://bugzilla.redhat.com/2196885

The issue was originally reported on the mailing list.  Please,
provide the link as well.  And it'd also be nice to credit the
original reporter. 

> Signed-off-by: Ales Musil 
> ---
>  northd/en-sync-sb.c | 66 +
>  1 file changed, 30 insertions(+), 36 deletions(-)
> 
> diff --git a/northd/en-sync-sb.c b/northd/en-sync-sb.c
> index 758f00bfd..4bd77b168 100644
> --- a/northd/en-sync-sb.c
> +++ b/northd/en-sync-sb.c
> @@ -22,7 +22,6 @@
>  #include "openvswitch/util.h"
>  
>  #include "en-sync-sb.h"
> -#include "include/ovn/expr.h"
>  #include "lib/inc-proc-eng.h"
>  #include "lib/lb.h"
>  #include "lib/ovn-nb-idl.h"
> @@ -325,45 +324,40 @@ static void
>  update_sb_addr_set(const char **nb_addresses, size_t n_addresses,
> const struct sbrec_address_set *sb_as)
>  {
> -struct expr_constant_set *cs_nb_as =
> -expr_constant_set_create_integers(
> -(const char *const *) nb_addresses, n_addresses);
> -struct expr_constant_set *cs_sb_as =
> -expr_constant_set_create_integers(
> -(const char *const *) sb_as->addresses, sb_as->n_addresses);
> -
> -struct expr_constant_set *addr_added = NULL;
> -struct expr_constant_set *addr_deleted = NULL;
> -expr_constant_set_integers_diff(cs_sb_as, cs_nb_as, _added,
> -_deleted);
> -
> -struct ds ds = DS_EMPTY_INITIALIZER;
> -if (addr_added && addr_added->n_values) {
> -for (size_t i = 0; i < addr_added->n_values; i++) {
> -ds_clear();
> -expr_constant_format(_added->values[i], EXPR_C_INTEGER, 
> );
> -sbrec_address_set_update_addresses_addvalue(sb_as, ds_cstr());
> -}
> +size_t i;
> +char *ip;
> +
> +struct svec svec_nb_as = SVEC_EMPTY_INITIALIZER;
> +struct svec svec_sb_as = SVEC_EMPTY_INITIALIZER;
> +
> +for (i = 0; i < n_addresses; i++) {
> +svec_add(_nb_as, nb_addresses[i]);
>  }
>  
> -if (addr_deleted && addr_deleted->n_values) {
> -for (size_t i = 0; i < addr_deleted->n_values; i++) {
> -ds_clear();
> -expr_constant_format(_deleted->values[i],
> - EXPR_C_INTEGER, );
> -sbrec_address_set_update_addresses_delvalue(sb_as, ds_cstr());
> -}
> +
> +for (i = 0; i < 

Re: [ovs-dev] [PATCH ovn] controller: don't report UDP as unsupported proto in svc_check

2023-05-24 Thread Ales Musil
On Mon, May 22, 2023 at 11:18 PM Vladislav Odintsov 
wrote:

> Depending on the udp service, it can reply with some udp data.
> In that case ovn-controller will warn with next message:
>
> pinctrl(ovn_pinctrl0)|WARN|handle service check: Unsupported protocol -
> [11]
>
> With this patch ovn-controller ignores UDP packets, which came to
> pinctrl_handle_svc_check().  This is not something abnormal, so don't
> write warnings.
>
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1913162
> Signed-off-by: Vladislav Odintsov 
> ---
>  controller/pinctrl.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> index b4be22020..f2e753cdb 100644
> --- a/controller/pinctrl.c
> +++ b/controller/pinctrl.c
> @@ -,6 +,13 @@ pinctrl_handle_svc_check(struct rconn *swconn,
> const struct flow *ip_flow,
>  ip_proto = in_ip->ip6_nxt;
>  }
>
> +if (ip_proto == IPPROTO_UDP) {
> +/* We should do nothing if we got UDP packet.
> + * If service is running, it can respond with any UDP data,
> + * so just ingore it. */
> + return;
> +}
> +
>  if (ip_proto != IPPROTO_TCP && ip_proto != IPPROTO_ICMP &&
>  ip_proto != IPPROTO_ICMPV6) {
>  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> --
> 2.36.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Looks good to me, thanks.

Acked-by: Ales Musil 

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.comIM: amusil

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10] netdev-dpdk: add control plane protection support

2023-05-24 Thread Ilya Maximets
On 5/24/23 09:18, Robin Jarry wrote:
> Ilya Maximets, May 23, 2023 at 22:04:
>> 'rxq-isolate' will be confused with 'other_config:pmd-rxq-isolate'.
>> Same likely goes for the 'isolated-rxq'.
>>
>> 'rxq-steernig' may be confused with 'other_config:tx-steering'.
>> But this can be argued that it's essentially similar functionality,
>> so should be named similarly.  Maybe we can double down on that and
>> use options:rx-steernig=rss|rss+lacp|...  with 'rss' being a default
>> configuration?
> 
> I like that idea. Instead of "rss", how about "hash" to match what
> tx-steering is using?

I'm not sure if it's better to use 'rss' or 'hash' in this context.
Aaron, Kevin, what do you think?

> 
> So the value of options:rx-steering would be "hash" followed by an
> arbitrary list of preset protocols (for now, only "lacp") all separated
> by "+". It may also open the door for other base steering methods than
> "hash" ("rr" for round-robin some day?).
> 
> If that is OK, I can prepare a v11 with everything renamed.

Might make sense for the code review.  Kevin was looking at the
patch, IIUC.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-05-24 Thread Eelco Chaudron



On 23 May 2023, at 15:11, Balazs Nemeth wrote:

> The only way that stats->{n_packets,n_bytes} would decrease is due to an
> overflow, or if there are bugs in how statistics are handled. In the
> past, there were multiple issues that caused a jump backward. A
> workaround was in place to set the statistics to 0 in that case. When
> this happened while the revalidator was under heavy load, the workaround
> had an unintended side effect where should_revalidate returned false
> causing the flow to be removed because the metric it calculated was
> based on a bogus value. Since many of those bugs have now been
> identified and resolved, there is no need to set the statistics to 0. In
> addition, the (unlikely) overflow still needs to be handled
> appropriately. If an unexpected jump does happen, just log it as a
> warning.
>
> Signed-off-by: Balazs Nemeth 

Thanks for all the re-work, this revision looks good to me.

Acked-by: Eelco Chaudron 

Cheers,

Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] lflow-cache: Introduce cache for lflow actions

2023-05-24 Thread Ihtisham ul Haq via dev
Using cache improves performance of recomputation of lflows(by
about 30%)

Exising lflow cache for `matches` and `expressions` is adopted
to include `actions` as well.

Co-authored-by: Felix Huettner 
Signed-off-by: Felix Huettner 
Signed-off-by: Ihtisham ul Haq 
---
 controller/lflow-cache.c  | 12 --
 controller/lflow-cache.h  |  8 +--
 controller/lflow.c| 41 +-
 controller/test-lflow-cache.c | 42 +++
 tests/ovn-lflow-cache.at  | 41 ++
 5 files changed, 115 insertions(+), 29 deletions(-)

diff --git a/controller/lflow-cache.c b/controller/lflow-cache.c
index f723eab8a..a648d77af 100644
--- a/controller/lflow-cache.c
+++ b/controller/lflow-cache.c
@@ -25,7 +25,9 @@
 #include "lflow-cache.h"
 #include "lib/uuid.h"
 #include "memory-trim.h"
+#include "openvswitch/ofpbuf.h"
 #include "openvswitch/vlog.h"
+#include "ovn/actions.h"
 #include "ovn/expr.h"

 VLOG_DEFINE_THIS_MODULE(lflow_cache);
@@ -209,7 +211,8 @@ lflow_cache_get_stats(const struct lflow_cache *lc, struct 
ds *output)

 void
 lflow_cache_add_expr(struct lflow_cache *lc, const struct uuid *lflow_uuid,
- struct expr *expr, size_t expr_sz)
+ struct expr *expr, size_t expr_sz,
+ struct ofpbuf *actions)
 {
 struct lflow_cache_value *lcv =
 lflow_cache_add__(lc, lflow_uuid, LCACHE_T_EXPR, expr_sz);
@@ -220,12 +223,14 @@ lflow_cache_add_expr(struct lflow_cache *lc, const struct 
uuid *lflow_uuid,
 }
 COVERAGE_INC(lflow_cache_add_expr);
 lcv->expr = expr;
+lcv->actions = actions;
 }

 void
 lflow_cache_add_matches(struct lflow_cache *lc, const struct uuid *lflow_uuid,
 uint32_t conj_id_ofs, uint32_t n_conjs,
-struct hmap *matches, size_t matches_sz)
+struct hmap *matches, size_t matches_sz,
+struct ofpbuf *actions)
 {
 struct lflow_cache_value *lcv =
 lflow_cache_add__(lc, lflow_uuid, LCACHE_T_MATCHES, matches_sz);
@@ -239,6 +244,7 @@ lflow_cache_add_matches(struct lflow_cache *lc, const 
struct uuid *lflow_uuid,
 lcv->expr_matches = matches;
 lcv->n_conjs = n_conjs;
 lcv->conj_id_ofs = conj_id_ofs;
+lcv->actions = actions;
 }

 struct lflow_cache_value *
@@ -380,11 +386,13 @@ lflow_cache_delete__(struct lflow_cache *lc, struct 
lflow_cache_entry *lce)
 case LCACHE_T_EXPR:
 COVERAGE_INC(lflow_cache_free_expr);
 expr_destroy(lce->value.expr);
+ovnacts_free((*lce->value.actions).data, (*lce->value.actions).size);
 break;
 case LCACHE_T_MATCHES:
 COVERAGE_INC(lflow_cache_free_matches);
 expr_matches_destroy(lce->value.expr_matches);
 free(lce->value.expr_matches);
+ovnacts_free((*lce->value.actions).data, (*lce->value.actions).size);
 break;
 }

diff --git a/controller/lflow-cache.h b/controller/lflow-cache.h
index 300a56077..9d542beec 100644
--- a/controller/lflow-cache.h
+++ b/controller/lflow-cache.h
@@ -50,6 +50,8 @@ struct lflow_cache_value {
 uint32_t n_conjs;
 uint32_t conj_id_ofs;

+struct ofpbuf *actions;
+
 union {
 struct hmap *expr_matches;
 struct expr *expr;
@@ -66,11 +68,13 @@ bool lflow_cache_is_enabled(const struct lflow_cache *);
 void lflow_cache_get_stats(const struct lflow_cache *, struct ds *output);

 void lflow_cache_add_expr(struct lflow_cache *, const struct uuid *lflow_uuid,
-  struct expr *expr, size_t expr_sz);
+  struct expr *expr, size_t expr_sz,
+  struct ofpbuf *actions);
 void lflow_cache_add_matches(struct lflow_cache *,
  const struct uuid *lflow_uuid,
  uint32_t conj_id_ofs, uint32_t n_conjs,
- struct hmap *matches, size_t matches_sz);
+ struct hmap *matches, size_t matches_sz,
+ struct ofpbuf *actions);

 struct lflow_cache_value *lflow_cache_get(struct lflow_cache *,
   const struct uuid *lflow_uuid);
diff --git a/controller/lflow.c b/controller/lflow.c
index 0b071138d..5ca676ee8 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -1078,14 +1078,23 @@ consider_logical_flow__(const struct sbrec_logical_flow 
*lflow,
 struct sset template_vars_ref = SSET_INITIALIZER(_vars_ref);
 struct expr *prereqs = NULL;

-if (!lflow_parse_actions(lflow, l_ctx_in, _vars_ref,
- , )) {
-ovnacts_free(ovnacts.data, ovnacts.size);
-ofpbuf_uninit();
-store_lflow_template_refs(l_ctx_out->lflow_deps_mgr,
-  _vars_ref, lflow);
-sset_destroy(_vars_ref);
-return;
+struct lflow_cache_value *lcv =
+

Re: [ovs-dev] [PATCH ovn] lflow-cache: Introduce cache for lflow actions

2023-05-24 Thread 0-day Robot
Bleep bloop.  Greetings Ihtisham ul Haq, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Co-author Felix Huettner  needs to sign off.
ERROR: Improper whitespace around control block
#134 FILE: controller/lflow.c:1086:
if(lcv_type != LCACHE_T_NONE){

Lines checked: 578, Warnings: 0, Errors: 2


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] lflow-cache: Introduce cache for lflow actions

2023-05-24 Thread Ihtisham ul Haq via dev
Using cache improves performance of recomputation of lflows. We see
about 30% improvments during our tests

Exising lflow cache for `matches` and `expressions` is adopted
to include `actions` as well.

Co-authored-by: Felix Huettner 
Signed-off-by: Ihtisham ul Haq 
---
 controller/lflow-cache.c  | 12 --
 controller/lflow-cache.h  |  8 +--
 controller/lflow.c| 41 +-
 controller/test-lflow-cache.c | 42 +++
 tests/ovn-lflow-cache.at  | 41 ++
 5 files changed, 115 insertions(+), 29 deletions(-)

diff --git a/controller/lflow-cache.c b/controller/lflow-cache.c
index f723eab8a..a648d77af 100644
--- a/controller/lflow-cache.c
+++ b/controller/lflow-cache.c
@@ -25,7 +25,9 @@
 #include "lflow-cache.h"
 #include "lib/uuid.h"
 #include "memory-trim.h"
+#include "openvswitch/ofpbuf.h"
 #include "openvswitch/vlog.h"
+#include "ovn/actions.h"
 #include "ovn/expr.h"

 VLOG_DEFINE_THIS_MODULE(lflow_cache);
@@ -209,7 +211,8 @@ lflow_cache_get_stats(const struct lflow_cache *lc, struct 
ds *output)

 void
 lflow_cache_add_expr(struct lflow_cache *lc, const struct uuid *lflow_uuid,
- struct expr *expr, size_t expr_sz)
+ struct expr *expr, size_t expr_sz,
+ struct ofpbuf *actions)
 {
 struct lflow_cache_value *lcv =
 lflow_cache_add__(lc, lflow_uuid, LCACHE_T_EXPR, expr_sz);
@@ -220,12 +223,14 @@ lflow_cache_add_expr(struct lflow_cache *lc, const struct 
uuid *lflow_uuid,
 }
 COVERAGE_INC(lflow_cache_add_expr);
 lcv->expr = expr;
+lcv->actions = actions;
 }

 void
 lflow_cache_add_matches(struct lflow_cache *lc, const struct uuid *lflow_uuid,
 uint32_t conj_id_ofs, uint32_t n_conjs,
-struct hmap *matches, size_t matches_sz)
+struct hmap *matches, size_t matches_sz,
+struct ofpbuf *actions)
 {
 struct lflow_cache_value *lcv =
 lflow_cache_add__(lc, lflow_uuid, LCACHE_T_MATCHES, matches_sz);
@@ -239,6 +244,7 @@ lflow_cache_add_matches(struct lflow_cache *lc, const 
struct uuid *lflow_uuid,
 lcv->expr_matches = matches;
 lcv->n_conjs = n_conjs;
 lcv->conj_id_ofs = conj_id_ofs;
+lcv->actions = actions;
 }

 struct lflow_cache_value *
@@ -380,11 +386,13 @@ lflow_cache_delete__(struct lflow_cache *lc, struct 
lflow_cache_entry *lce)
 case LCACHE_T_EXPR:
 COVERAGE_INC(lflow_cache_free_expr);
 expr_destroy(lce->value.expr);
+ovnacts_free((*lce->value.actions).data, (*lce->value.actions).size);
 break;
 case LCACHE_T_MATCHES:
 COVERAGE_INC(lflow_cache_free_matches);
 expr_matches_destroy(lce->value.expr_matches);
 free(lce->value.expr_matches);
+ovnacts_free((*lce->value.actions).data, (*lce->value.actions).size);
 break;
 }

diff --git a/controller/lflow-cache.h b/controller/lflow-cache.h
index 300a56077..9d542beec 100644
--- a/controller/lflow-cache.h
+++ b/controller/lflow-cache.h
@@ -50,6 +50,8 @@ struct lflow_cache_value {
 uint32_t n_conjs;
 uint32_t conj_id_ofs;

+struct ofpbuf *actions;
+
 union {
 struct hmap *expr_matches;
 struct expr *expr;
@@ -66,11 +68,13 @@ bool lflow_cache_is_enabled(const struct lflow_cache *);
 void lflow_cache_get_stats(const struct lflow_cache *, struct ds *output);

 void lflow_cache_add_expr(struct lflow_cache *, const struct uuid *lflow_uuid,
-  struct expr *expr, size_t expr_sz);
+  struct expr *expr, size_t expr_sz,
+  struct ofpbuf *actions);
 void lflow_cache_add_matches(struct lflow_cache *,
  const struct uuid *lflow_uuid,
  uint32_t conj_id_ofs, uint32_t n_conjs,
- struct hmap *matches, size_t matches_sz);
+ struct hmap *matches, size_t matches_sz,
+ struct ofpbuf *actions);

 struct lflow_cache_value *lflow_cache_get(struct lflow_cache *,
   const struct uuid *lflow_uuid);
diff --git a/controller/lflow.c b/controller/lflow.c
index 0b071138d..ad5284870 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -1078,14 +1078,23 @@ consider_logical_flow__(const struct sbrec_logical_flow 
*lflow,
 struct sset template_vars_ref = SSET_INITIALIZER(_vars_ref);
 struct expr *prereqs = NULL;

-if (!lflow_parse_actions(lflow, l_ctx_in, _vars_ref,
- , )) {
-ovnacts_free(ovnacts.data, ovnacts.size);
-ofpbuf_uninit();
-store_lflow_template_refs(l_ctx_out->lflow_deps_mgr,
-  _vars_ref, lflow);
-sset_destroy(_vars_ref);
-return;
+struct lflow_cache_value *lcv =
+

Re: [ovs-dev] [PATCH v3 7/7] tc: Add vxlan encap action with gbp option offload

2023-05-24 Thread Roi Dayan via dev



On 23/05/2023 19:06, Simon Horman wrote:
> On Mon, May 15, 2023 at 11:23:56AM +0300, Roi Dayan via dev wrote:
>> From: Gavin Li 
>>
>> Add TC offload support for vxlan encap with gbp option
>>
>> Signed-off-by: Gavin Li 
>> Reviewed-by: Gavi Teitz 
>> Reviewed-by: Roi Dayan 
> 
> Reviewed-by: Simon Horman 
> 


thanks for review Simon!
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] Scale testing OVN with ovn-heater for OpenStack use cases

2023-05-24 Thread Felix Huettner via dev
Hi everyone,

Ilya mentioned to me that you will want to bring openstack examples to
ovn-heater.

I wanted to ask how to best join this effort. It would be great for us
to have a tool to scale test ovn. Also we have an ovn deployment with
currently 450 compute nodes where we can try to extract the typically
used northbound entries from.

Thanks
Felix

On Tue, May 16, 2023 at 05:38:25PM +0200, Daniel Alvarez Sanchez wrote:
> Hey Frode, Dumitru, thanks a lot for initiating this
>
> On Thu, May 11, 2023 at 6:05 PM Frode Nordahl 
> wrote:
>
> > Hello, Dumitru, Daniel and team,
> >
> > On Thu, May 11, 2023 at 5:23 PM Dumitru Ceara  wrote:
> > >
> > > Hi Frode,
> > >
> > > During an internal discussion with RedHat OpenStack folks (Daniel in cc)
> > > about scalability of OVN in OpenStack deployments I remembered that you
> > > mentioned that you're thinking of enhancing ovn-heater in order to test
> > > your types of deployments [0].  I assumed that those are also OpenStack
> > > based deployments so I was wondering if there's an opportunity for
> > > collaboration here.
> >
> > Thank you for reaching out to me on this topic. We do indeed have scale
> > testing in the context of OpenStack on our roadmap and welcome the
> > invitation to collaborate on this.
> >
> > As you know we did some preparation work together already as you refer
> > to in [0], and we thank you for your help in upstreaming ovn-heater.
> >
> > We have yet to schedule exactly when in the cycle to work on this, I
> > guess a next step could be to set up a meet to share some ideas and
> > see how it aligns?
>
>
> > If that resonates with you, I'd be happy to organize.
> >
>
> Sounds good, please do :)
>
>
> >
> > > Having OpenStack-like scenarios in ovn-heater would really allow OVN
> > > developers to profile/troubleshoot/optimize OVN for those use cases too.
> > >
> > > Well, in general, the above is true for any other CMS too.
> >
> > Yes, this is what we have been thinking as well. We are supporting other
> > products in their journey with OVN as well, so similar projects may come
> > from those in the future.
> >
>
> Yes, we can try to model OpenStack-like scenarios for ovn-heater where the
> resources are generally more distributed across the hypervisors than in k8s
> for example, perhaps larger L2 domain sizes, etcetera.
> Happy to discuss further!
>
> daniel
>
>
> > --
> > Frode Nordahl
> >
> > > Thanks,
> > > Dumitru
> > >
> > > [0]
> > https://mail.openvswitch.org/pipermail/ovs-dev/2023-March/402680.html
> > >
> >
> >
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Diese E Mail enthält möglicherweise vertrauliche Inhalte und ist nur für die 
Verwertung durch den vorgesehenen Empfänger bestimmt.
Sollten Sie nicht der vorgesehene Empfänger sein, setzen Sie den Absender bitte 
unverzüglich in Kenntnis und löschen diese E Mail.

Hinweise zum Datenschutz finden Sie hier.


This e-mail may contain confidential content and is intended only for the 
specified recipient/s.
If you are not the intended recipient, please inform the sender immediately and 
delete this e-mail.

Information on data protection can be found 
here.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10] netdev-dpdk: add control plane protection support

2023-05-24 Thread Robin Jarry
Ilya Maximets, May 23, 2023 at 22:04:
> 'rxq-isolate' will be confused with 'other_config:pmd-rxq-isolate'.
> Same likely goes for the 'isolated-rxq'.
>
> 'rxq-steernig' may be confused with 'other_config:tx-steering'.
> But this can be argued that it's essentially similar functionality,
> so should be named similarly.  Maybe we can double down on that and
> use options:rx-steernig=rss|rss+lacp|...  with 'rss' being a default
> configuration?

I like that idea. Instead of "rss", how about "hash" to match what
tx-steering is using?

So the value of options:rx-steering would be "hash" followed by an
arbitrary list of preset protocols (for now, only "lacp") all separated
by "+". It may also open the door for other base steering methods than
"hash" ("rr" for round-robin some day?).

If that is OK, I can prepare a v11 with everything renamed.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev