Re: [ovs-dev] [PATCH net] openvswitch: fix send of uninitialized stack memory in ct limit reply

2021-04-04 Thread Tonghao Zhang
On Mon, Apr 5, 2021 at 2:01 AM Ilya Maximets  wrote:
>
> CC: ovs-dev
>
> On 4/4/21 7:50 PM, Ilya Maximets wrote:
> > 'struct ovs_zone_limit' has more members than initialized in
> > ovs_ct_limit_get_default_limit().  The rest of the memory is a random
> > kernel stack content that ends up being sent to userspace.
> >
> > Fix that by using designated initializer that will clear all
> > non-specified fields.
> >
> > Fixes: 11efd5cb04a1 ("openvswitch: Support conntrack zone limit")
> > Signed-off-by: Ilya Maximets 
> > ---
> >  net/openvswitch/conntrack.c | 8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> > index c29b0ef1fc27..cadb6a29b285 100644
> > --- a/net/openvswitch/conntrack.c
> > +++ b/net/openvswitch/conntrack.c
> > @@ -2032,10 +2032,10 @@ static int ovs_ct_limit_del_zone_limit(struct 
> > nlattr *nla_zone_limit,
> >  static int ovs_ct_limit_get_default_limit(struct ovs_ct_limit_info *info,
> > struct sk_buff *reply)
> >  {
> > - struct ovs_zone_limit zone_limit;
> > -
> > - zone_limit.zone_id = OVS_ZONE_LIMIT_DEFAULT_ZONE;
> > - zone_limit.limit = info->default_limit;
> > + struct ovs_zone_limit zone_limit = {
> > + .zone_id = OVS_ZONE_LIMIT_DEFAULT_ZONE,
> > + .limit   = info->default_limit,
> > + };
I review the code, userspace don't use the count of ovs_zone_lime
struct, but this patch looks to to me.
Thanks Ilya.
Acked-by: Tonghao Zhang 
> >   return nla_put_nohdr(reply, sizeof(zone_limit), _limit);
> >  }
> >
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev



-- 
Best regards, Tonghao
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/5] dpif: Fix use of uninitialized execute hash.

2021-04-04 Thread Tonghao Zhang
On Mon, Apr 5, 2021 at 1:32 AM Ilya Maximets  wrote:
>
> 'dpif_execute_helper_cb' doesn't initilalize the 'hash' field that
> may be passed down to datapath and might cause execution of a different
> set of actions, e.g. on recirculation.
>
>  Thread 6 handler27:
>  Conditional jump or move depends on uninitialised value(s)
> at 0x53A2C2: dpif_netlink_encode_execute (dpif-netlink.c:1841)
> by 0x53A2C2: dpif_netlink_operate__ (dpif-netlink.c:1919)
> by 0x53A82D: dpif_netlink_operate_chunks (dpif-netlink.c:2238)
> by 0x53A82D: dpif_netlink_operate (dpif-netlink.c:2297)
> by 0x48135F: dpif_operate (dpif.c:1366)
> by 0x481923: dpif_execute.part.24 (dpif.c:1320)
> by 0x481C46: dpif_execute (dpif.c:1312)
> by 0x481C46: dpif_execute_helper_cb (dpif.c:1243)
> by 0x4AE943: odp_execute_actions (odp-execute.c:865)
> by 0x47F272: dpif_execute_with_help (dpif.c:1296)
> by 0x4812FF: dpif_operate (dpif.c:1422)
> by 0x442226: handle_upcalls (ofproto-dpif-upcall.c:1617)
> by 0x442226: recv_upcalls.isra.36 (ofproto-dpif-upcall.c:855)
> by 0x442351: udpif_upcall_handler (ofproto-dpif-upcall.c:755)
> by 0x4FDE2C: ovsthread_wrapper (ovs-thread.c:383)
> by 0x5E19159: start_thread (in /usr/lib64/libpthread-2.28.so)
> by 0x69ECF72: clone (in /usr/lib64/libc-2.28.so)
>   Uninitialised value was created by a stack allocation
> at 0x481966: dpif_execute_helper_cb (dpif.c:1159)
>
> Additionally added a missing comment to the 'struct dpif_execute'.
Thanks Ilya

Acked-by: Tonghao Zhang 
> CC: Tonghao Zhang 
> Fixes: 0442bfb11d6c ("ofproto-dpif-upcall: Echo HASH attribute back to 
> datapath.")
> Signed-off-by: Ilya Maximets 
> ---
>  lib/dpif.c | 1 +
>  lib/dpif.h | 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/lib/dpif.c b/lib/dpif.c
> index 56d0b4a65..26e8bfb7d 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -1240,6 +1240,7 @@ dpif_execute_helper_cb(void *aux_, struct 
> dp_packet_batch *packets_,
>  execute.needs_help = false;
>  execute.probe = false;
>  execute.mtu = 0;
> +execute.hash = 0;
>  aux->error = dpif_execute(aux->dpif, );
>  log_execute_message(aux->dpif, _module, ,
>  true, aux->error);
> diff --git a/lib/dpif.h b/lib/dpif.h
> index ecda896c7..f9728e673 100644
> --- a/lib/dpif.h
> +++ b/lib/dpif.h
> @@ -727,7 +727,7 @@ struct dpif_execute {
>  bool probe; /* Suppress error messages. */
>  unsigned int mtu;   /* Maximum transmission unit to fragment.
> 0 if not a fragmented packet */
> -uint64_t hash;
> +uint64_t hash;  /* Packet flow hash. 0 if not specified. 
> */
>  const struct flow *flow; /* Flow extracted from 'packet'. */
>
>  /* Input, but possibly modified as a side effect of execution. */
> --
> 2.26.2
>


-- 
Best regards, Tonghao
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net] openvswitch: fix send of uninitialized stack memory in ct limit reply

2021-04-04 Thread Ilya Maximets
CC: ovs-dev

On 4/4/21 7:50 PM, Ilya Maximets wrote:
> 'struct ovs_zone_limit' has more members than initialized in
> ovs_ct_limit_get_default_limit().  The rest of the memory is a random
> kernel stack content that ends up being sent to userspace.
> 
> Fix that by using designated initializer that will clear all
> non-specified fields.
> 
> Fixes: 11efd5cb04a1 ("openvswitch: Support conntrack zone limit")
> Signed-off-by: Ilya Maximets 
> ---
>  net/openvswitch/conntrack.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index c29b0ef1fc27..cadb6a29b285 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -2032,10 +2032,10 @@ static int ovs_ct_limit_del_zone_limit(struct nlattr 
> *nla_zone_limit,
>  static int ovs_ct_limit_get_default_limit(struct ovs_ct_limit_info *info,
> struct sk_buff *reply)
>  {
> - struct ovs_zone_limit zone_limit;
> -
> - zone_limit.zone_id = OVS_ZONE_LIMIT_DEFAULT_ZONE;
> - zone_limit.limit = info->default_limit;
> + struct ovs_zone_limit zone_limit = {
> + .zone_id = OVS_ZONE_LIMIT_DEFAULT_ZONE,
> + .limit   = info->default_limit,
> + };
>  
>   return nla_put_nohdr(reply, sizeof(zone_limit), _limit);
>  }
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 5/5] dpif-netlink: Fix send of uninitialized memory in ct limit requests.

2021-04-04 Thread Ilya Maximets
ct limit requests never initializes the whole 'struct ovs_zone_limit'
sending uninitialized stack memory to kernel:

 Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
at 0x5E23867: sendmsg (in /usr/lib64/libpthread-2.28.so)
by 0x54F761: nl_sock_transact_multiple__ (netlink-socket.c:858)
by 0x54FB6E: nl_sock_transact_multiple.part.9 (netlink-socket.c:1079)
by 0x54FCC0: nl_sock_transact_multiple (netlink-socket.c:1044)
by 0x54FCC0: nl_sock_transact (netlink-socket.c:1108)
by 0x550B6F: nl_transact (netlink-socket.c:1804)
by 0x53BEA2: dpif_netlink_ct_get_limits (dpif-netlink.c:3052)
by 0x588B57: dpctl_ct_get_limits (dpctl.c:2178)
by 0x586FF2: dpctl_unixctl_handler (dpctl.c:2870)
by 0x52C241: process_command (unixctl.c:310)
by 0x52C241: run_connection (unixctl.c:344)
by 0x52C241: unixctl_server_run (unixctl.c:395)
by 0x407526: main (ovs-vswitchd.c:128)
  Address 0x10b87480 is 32 bytes inside a block of size 4,096 alloc'd
at 0x4C30F0B: malloc (vg_replace_malloc.c:307)
by 0x52CDE4: xmalloc (util.c:138)
by 0x4F7E07: ofpbuf_init (ofpbuf.c:123)
by 0x4F7E07: ofpbuf_new (ofpbuf.c:151)
by 0x53BDE3: dpif_netlink_ct_get_limits (dpif-netlink.c:3025)
by 0x588B57: dpctl_ct_get_limits (dpctl.c:2178)
by 0x586FF2: dpctl_unixctl_handler (dpctl.c:2870)
by 0x52C241: process_command (unixctl.c:310)
by 0x52C241: run_connection (unixctl.c:344)
by 0x52C241: unixctl_server_run (unixctl.c:395)
by 0x407526: main (ovs-vswitchd.c:128)
  Uninitialised value was created by a stack allocation
at 0x46AAA0: ct_dpif_get_limits (ct-dpif.c:197)

Fix that by using designated initializers that will clear all the
non-specified fields.

CC: Yi-Hung Wei 
Fixes: 906ff9d229ee ("dpif-netlink: Implement conntrack zone limit")
Signed-off-by: Ilya Maximets 
---
 lib/dpif-netlink.c | 24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index ceb56c685..ee96f4011 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -2923,8 +2923,6 @@ dpif_netlink_ct_set_limits(struct dpif *dpif OVS_UNUSED,
const uint32_t *default_limits,
const struct ovs_list *zone_limits)
 {
-struct ovs_zone_limit req_zone_limit;
-
 if (ovs_ct_limit_family < 0) {
 return EOPNOTSUPP;
 }
@@ -2941,8 +2939,10 @@ dpif_netlink_ct_set_limits(struct dpif *dpif OVS_UNUSED,
 size_t opt_offset;
 opt_offset = nl_msg_start_nested(request, OVS_CT_LIMIT_ATTR_ZONE_LIMIT);
 if (default_limits) {
-req_zone_limit.zone_id = OVS_ZONE_LIMIT_DEFAULT_ZONE;
-req_zone_limit.limit = *default_limits;
+struct ovs_zone_limit req_zone_limit = {
+.zone_id = OVS_ZONE_LIMIT_DEFAULT_ZONE,
+.limit   = *default_limits,
+};
 nl_msg_put(request, _zone_limit, sizeof req_zone_limit);
 }
 
@@ -2950,8 +2950,10 @@ dpif_netlink_ct_set_limits(struct dpif *dpif OVS_UNUSED,
 struct ct_dpif_zone_limit *zone_limit;
 
 LIST_FOR_EACH (zone_limit, node, zone_limits) {
-req_zone_limit.zone_id = zone_limit->zone;
-req_zone_limit.limit = zone_limit->limit;
+struct ovs_zone_limit req_zone_limit = {
+.zone_id = zone_limit->zone,
+.limit   = zone_limit->limit,
+};
 nl_msg_put(request, _zone_limit, sizeof req_zone_limit);
 }
 }
@@ -3035,8 +3037,9 @@ dpif_netlink_ct_get_limits(struct dpif *dpif OVS_UNUSED,
 size_t opt_offset = nl_msg_start_nested(request,
 OVS_CT_LIMIT_ATTR_ZONE_LIMIT);
 
-struct ovs_zone_limit req_zone_limit;
-req_zone_limit.zone_id = OVS_ZONE_LIMIT_DEFAULT_ZONE;
+struct ovs_zone_limit req_zone_limit = {
+.zone_id = OVS_ZONE_LIMIT_DEFAULT_ZONE,
+};
 nl_msg_put(request, _zone_limit, sizeof req_zone_limit);
 
 struct ct_dpif_zone_limit *zone_limit;
@@ -3086,8 +3089,9 @@ dpif_netlink_ct_del_limits(struct dpif *dpif OVS_UNUSED,
 
 struct ct_dpif_zone_limit *zone_limit;
 LIST_FOR_EACH (zone_limit, node, zone_limits) {
-struct ovs_zone_limit req_zone_limit;
-req_zone_limit.zone_id = zone_limit->zone;
+struct ovs_zone_limit req_zone_limit = {
+.zone_id = zone_limit->zone,
+};
 nl_msg_put(request, _zone_limit, sizeof req_zone_limit);
 }
 nl_msg_end_nested(request, opt_offset);
-- 
2.26.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 4/5] ofproto-dpif: Fix use of uninitialized attributes of timeout policy.

2021-04-04 Thread Ilya Maximets
'cdtp' is allocated on a stack and it has uninitialized 'present'
field that specifies which attributes are actually set.  This
causes use of uninitialized attributes.

 Conditional jump or move depends on uninitialised value(s)
at 0x539651: dpif_netlink_get_nl_tp_udp_attrs (dpif-netlink.c:3206)
by 0x539651: dpif_netlink_get_nl_tp_attrs (dpif-netlink.c:3234)
by 0x539651: dpif_netlink_ct_set_timeout_policy (dpif-netlink.c:3370)
by 0x42B615: ct_add_timeout_policy_to_dpif (ofproto-dpif.c:5421)
by 0x42B615: ct_set_zone_timeout_policy (ofproto-dpif.c:5525)
by 0x40BD98: ct_zones_reconfigure (bridge.c:756)
by 0x40BD98: datapath_reconfigure (bridge.c:804)
by 0x40D737: bridge_reconfigure (bridge.c:903)
by 0x411720: bridge_run (bridge.c:3331)
by 0x40751C: main (ovs-vswitchd.c:127)

Clearing the whole structure to avoid this kind of problems.

CC: Yi-Hung Wei 
Fixes: 993cae678bca ("ofproto-dpif: Consume CT_Zone, and CT_Timeout_Policy 
tables")
Signed-off-by: Ilya Maximets 
---
 ofproto/ofproto-dpif.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index fd0b2fdea..47db9bb57 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -5413,6 +5413,8 @@ ct_add_timeout_policy_to_dpif(struct dpif *dpif,
 struct ct_dpif_timeout_policy cdtp;
 struct simap_node *node;
 
+memset(, 0, sizeof cdtp);
+
 cdtp.id = ct_tp->tp_id;
 SIMAP_FOR_EACH (node, _tp->tp) {
 ct_dpif_set_timeout_policy_attr_by_name(, node->name, node->data);
-- 
2.26.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 3/5] dpif: Fix use of uninitialized execute hash.

2021-04-04 Thread Ilya Maximets
'dpif_execute_helper_cb' doesn't initilalize the 'hash' field that
may be passed down to datapath and might cause execution of a different
set of actions, e.g. on recirculation.

 Thread 6 handler27:
 Conditional jump or move depends on uninitialised value(s)
at 0x53A2C2: dpif_netlink_encode_execute (dpif-netlink.c:1841)
by 0x53A2C2: dpif_netlink_operate__ (dpif-netlink.c:1919)
by 0x53A82D: dpif_netlink_operate_chunks (dpif-netlink.c:2238)
by 0x53A82D: dpif_netlink_operate (dpif-netlink.c:2297)
by 0x48135F: dpif_operate (dpif.c:1366)
by 0x481923: dpif_execute.part.24 (dpif.c:1320)
by 0x481C46: dpif_execute (dpif.c:1312)
by 0x481C46: dpif_execute_helper_cb (dpif.c:1243)
by 0x4AE943: odp_execute_actions (odp-execute.c:865)
by 0x47F272: dpif_execute_with_help (dpif.c:1296)
by 0x4812FF: dpif_operate (dpif.c:1422)
by 0x442226: handle_upcalls (ofproto-dpif-upcall.c:1617)
by 0x442226: recv_upcalls.isra.36 (ofproto-dpif-upcall.c:855)
by 0x442351: udpif_upcall_handler (ofproto-dpif-upcall.c:755)
by 0x4FDE2C: ovsthread_wrapper (ovs-thread.c:383)
by 0x5E19159: start_thread (in /usr/lib64/libpthread-2.28.so)
by 0x69ECF72: clone (in /usr/lib64/libc-2.28.so)
  Uninitialised value was created by a stack allocation
at 0x481966: dpif_execute_helper_cb (dpif.c:1159)

Additionally added a missing comment to the 'struct dpif_execute'.

CC: Tonghao Zhang 
Fixes: 0442bfb11d6c ("ofproto-dpif-upcall: Echo HASH attribute back to 
datapath.")
Signed-off-by: Ilya Maximets 
---
 lib/dpif.c | 1 +
 lib/dpif.h | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/dpif.c b/lib/dpif.c
index 56d0b4a65..26e8bfb7d 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -1240,6 +1240,7 @@ dpif_execute_helper_cb(void *aux_, struct dp_packet_batch 
*packets_,
 execute.needs_help = false;
 execute.probe = false;
 execute.mtu = 0;
+execute.hash = 0;
 aux->error = dpif_execute(aux->dpif, );
 log_execute_message(aux->dpif, _module, ,
 true, aux->error);
diff --git a/lib/dpif.h b/lib/dpif.h
index ecda896c7..f9728e673 100644
--- a/lib/dpif.h
+++ b/lib/dpif.h
@@ -727,7 +727,7 @@ struct dpif_execute {
 bool probe; /* Suppress error messages. */
 unsigned int mtu;   /* Maximum transmission unit to fragment.
0 if not a fragmented packet */
-uint64_t hash;
+uint64_t hash;  /* Packet flow hash. 0 if not specified. */
 const struct flow *flow; /* Flow extracted from 'packet'. */
 
 /* Input, but possibly modified as a side effect of execution. */
-- 
2.26.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 2/5] odp-util: Fix use of uninitialized erspan metadata.

2021-04-04 Thread Ilya Maximets
'struct erspan_metadata' contains union with fields of different
sizes, hence not all the memory initiliazed.  This memory goes
to syscalls and also used to compare ukeys with memcmp which may
cause unexpected behavior.

 Thread 15 revalidator13:
 Conditional jump or move depends on uninitialised value(s)
at 0x4C377B6: bcmp (vg_replace_strmem.c:)
by 0x43F844: ofpbuf_equal (ofpbuf.h:273)
by 0x43F844: revalidate_ukey__ (ofproto-dpif-upcall.c:2227)
by 0x43F9C9: revalidate_ukey (ofproto-dpif-upcall.c:2294)
by 0x4425C2: revalidate.isra.33 (ofproto-dpif-upcall.c:2734)
by 0x4434B8: udpif_revalidator (ofproto-dpif-upcall.c:943)
by 0x4FDE2C: ovsthread_wrapper (ovs-thread.c:383)
by 0x5E19159: start_thread (in /usr/lib64/libpthread-2.28.so)
by 0x69ECF72: clone (in /usr/lib64/libc-2.28.so)
  Uninitialised value was created by a stack allocation
at 0x4B1CE0: tun_key_to_attr (odp-util.c:3129)

CC: William Tu 
Fixes: 98514eea21f4 ("erspan: add kernel datapath support")
Signed-off-by: Ilya Maximets 
---
 lib/odp-util.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/lib/odp-util.c b/lib/odp-util.c
index a8598d52a..e1199d1da 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -3189,17 +3189,17 @@ tun_key_to_attr(struct ofpbuf *a, const struct flow_tnl 
*tun_key,
 if ((!tnl_type || !strcmp(tnl_type, "erspan") ||
 !strcmp(tnl_type, "ip6erspan")) &&
 (tun_key->erspan_ver == 1 || tun_key->erspan_ver == 2)) {
-struct erspan_metadata opts;
+struct erspan_metadata *opts;
 
-opts.version = tun_key->erspan_ver;
-if (opts.version == 1) {
-opts.u.index = htonl(tun_key->erspan_idx);
+opts = nl_msg_put_unspec_zero(a, OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS,
+  sizeof *opts);
+opts->version = tun_key->erspan_ver;
+if (opts->version == 1) {
+opts->u.index = htonl(tun_key->erspan_idx);
 } else {
-opts.u.md2.dir = tun_key->erspan_dir;
-set_hwid(, tun_key->erspan_hwid);
+opts->u.md2.dir = tun_key->erspan_dir;
+set_hwid(>u.md2, tun_key->erspan_hwid);
 }
-nl_msg_put_unspec(a, OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS,
-  , sizeof(opts));
 }
 
 if ((!tnl_type || !strcmp(tnl_type, "gtpu")) &&
-- 
2.26.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/5] netdev-linux: Fix use of uninitialized LAG master name.

2021-04-04 Thread Ilya Maximets
'if_indextoname' may fail leaving the 'master_name' uninitialized:

 Conditional jump or move depends on uninitialised value(s)
at 0x4C34329: strlen (vg_replace_strmem.c:459)
by 0x51C638: hash_string (hash.h:342)
by 0x51C638: hash_name (shash.c:28)
by 0x51CC51: shash_find (shash.c:231)
by 0x51CD38: shash_find_data (shash.c:245)
by 0x4A797F: netdev_from_name (netdev.c:2013)
by 0x544148: netdev_linux_update_lag (netdev-linux.c:676)
by 0x544148: netdev_linux_run (netdev-linux.c:769)
by 0x4A5997: netdev_run (netdev.c:186)
by 0x40752B: main (ovs-vswitchd.c:129)
  Uninitialised value was created by a stack allocation
at 0x543AFA: netdev_linux_run (netdev-linux.c:722)

CC: John Hurley 
Fixes: d22f8927c3c9 ("netdev-linux: monitor and offload LAG slaves to TC")
Signed-off-by: Ilya Maximets 
---
 lib/netdev-linux.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 15b25084b..2b8283e94 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -672,7 +672,9 @@ netdev_linux_update_lag(struct rtnetlink_change *change)
 uint32_t block_id;
 int error = 0;
 
-if_indextoname(change->master_ifindex, master_name);
+if (!if_indextoname(change->master_ifindex, master_name)) {
+return;
+}
 master_netdev = netdev_from_name(master_name);
 if (!master_netdev) {
 return;
-- 
2.26.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 0/5] Fix use of uninitialized memory in various places.

2021-04-04 Thread Ilya Maximets
Found by 'make check-kernel-valgrind'.

Ilya Maximets (5):
  netdev-linux: Fix use of uninitialized LAG master name.
  odp-util: Fix use of uninitialized erspan metadata.
  dpif: Fix use of uninitialized execute hash.
  ofproto-dpif: Fix use of uninitialized attributes of timeout policy.
  dpif-netlink: Fix send of uninitialized memory in ct limit requests.

 lib/dpif-netlink.c | 24 ++--
 lib/dpif.c |  1 +
 lib/dpif.h |  2 +-
 lib/netdev-linux.c |  4 +++-
 lib/odp-util.c | 16 
 ofproto/ofproto-dpif.c |  2 ++
 6 files changed, 29 insertions(+), 20 deletions(-)

-- 
2.26.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] net: openvswitch: Use 'skb_push_rcsum()' instead of hand coding it

2021-04-04 Thread Christophe JAILLET
'skb_push()'/'skb_postpush_rcsum()' can be replaced by an equivalent
'skb_push_rcsum()' which is less verbose.

Signed-off-by: Christophe JAILLET 
---
 net/openvswitch/conntrack.c| 6 ++
 net/openvswitch/vport-netdev.c | 7 +++
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 71cec03e8612..c29b0ef1fc27 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -809,8 +809,7 @@ static int ovs_ct_nat_execute(struct sk_buff *skb, struct 
nf_conn *ct,
 
err = nf_nat_packet(ct, ctinfo, hooknum, skb);
 push:
-   skb_push(skb, nh_off);
-   skb_postpush_rcsum(skb, skb->data, nh_off);
+   skb_push_rcsum(skb, nh_off);
 
return err;
 }
@@ -1322,8 +1321,7 @@ int ovs_ct_execute(struct net *net, struct sk_buff *skb,
else
err = ovs_ct_lookup(net, key, info, skb);
 
-   skb_push(skb, nh_ofs);
-   skb_postpush_rcsum(skb, skb->data, nh_ofs);
+   skb_push_rcsum(skb, nh_ofs);
if (err)
kfree_skb(skb);
return err;
diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 57d6436e6f6a..8e1a88f13622 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -44,10 +44,9 @@ static void netdev_port_receive(struct sk_buff *skb)
if (unlikely(!skb))
return;
 
-   if (skb->dev->type == ARPHRD_ETHER) {
-   skb_push(skb, ETH_HLEN);
-   skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
-   }
+   if (skb->dev->type == ARPHRD_ETHER)
+   skb_push_rcsum(skb, ETH_HLEN);
+
ovs_vport_receive(vport, skb, skb_tunnel_info(skb));
return;
 error:
-- 
2.27.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V6 11/13] dpif-netdev: Provide orig_in_port in metadata for tunneled packets

2021-04-04 Thread 0-day Robot
Bleep bloop.  Greetings Eli Britstein, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Eli Britstein 
Lines checked: 174, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V6 08/13] netdev-offload: Disallow offloading to unrelated tunneling vports.

2021-04-04 Thread 0-day Robot
Bleep bloop.  Greetings Eli Britstein, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Eli Britstein 
Lines checked: 90, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V6 07/13] netdev-offload: Allow offloading to netdev without ifindex.

2021-04-04 Thread 0-day Robot
Bleep bloop.  Greetings Eli Britstein, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Eli Britstein 
Lines checked: 68, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V6 13/13] netdev-dpdk-offload: Add vxlan pattern matching function

2021-04-04 Thread Eli Britstein
For VXLAN offload, matches should be done on outer header for tunnel
properties as well as inner packet matches. Add a function for parsing
VXLAN tunnel matches.

Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 NEWS  |   1 +
 lib/netdev-offload-dpdk.c | 155 +-
 2 files changed, 155 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 74eed0a6c..2fee43b9a 100644
--- a/NEWS
+++ b/NEWS
@@ -34,6 +34,7 @@ v2.15.0 - 15 Feb 2021
  * Removed support for vhost-user dequeue zero-copy.
  * Add support for DPDK 20.11.
  * Add hardware offload support for tunnel pop action (experimental).
+ * Add hardware offload support for VXLAN flows (experimental).
- Userspace datapath:
  * Add the 'pmd' option to "ovs-appctl dpctl/dump-flows", which
restricts a flow dump to a single PMD thread if set.
diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 197deb96b..52a74a707 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -372,6 +372,20 @@ dump_flow_pattern(struct ds *s,
   ipv6_mask->hdr.hop_limits);
 }
 ds_put_cstr(s, "/ ");
+} else if (item->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
+const struct rte_flow_item_vxlan *vxlan_spec = item->spec;
+const struct rte_flow_item_vxlan *vxlan_mask = item->mask;
+
+ds_put_cstr(s, "vxlan ");
+if (vxlan_spec) {
+if (!vxlan_mask) {
+vxlan_mask = _flow_item_vxlan_mask;
+}
+DUMP_PATTERN_ITEM(vxlan_mask->vni, "vni", "%"PRIu32,
+  ntohl(*(ovs_be32 *) vxlan_spec->vni) >> 8,
+  ntohl(*(ovs_be32 *) vxlan_mask->vni) >> 8);
+}
+ds_put_cstr(s, "/ ");
 } else {
 ds_put_format(s, "unknown rte flow pattern (%d)\n", item->type);
 }
@@ -863,15 +877,154 @@ out:
 return ret;
 }
 
+static int
+parse_tnl_ip_match(struct flow_patterns *patterns,
+   struct match *match,
+   uint8_t proto)
+{
+struct flow *consumed_masks;
+
+consumed_masks = >wc.masks;
+/* IP v4 */
+if (match->wc.masks.tunnel.ip_src || match->wc.masks.tunnel.ip_dst) {
+struct rte_flow_item_ipv4 *spec, *mask;
+
+spec = xzalloc(sizeof *spec);
+mask = xzalloc(sizeof *mask);
+
+spec->hdr.type_of_service = match->flow.tunnel.ip_tos;
+spec->hdr.time_to_live= match->flow.tunnel.ip_ttl;
+spec->hdr.next_proto_id   = proto;
+spec->hdr.src_addr= match->flow.tunnel.ip_src;
+spec->hdr.dst_addr= match->flow.tunnel.ip_dst;
+
+mask->hdr.type_of_service = match->wc.masks.tunnel.ip_tos;
+mask->hdr.time_to_live= match->wc.masks.tunnel.ip_ttl;
+mask->hdr.next_proto_id   = UINT8_MAX;
+mask->hdr.src_addr= match->wc.masks.tunnel.ip_src;
+mask->hdr.dst_addr= match->wc.masks.tunnel.ip_dst;
+
+consumed_masks->tunnel.ip_tos = 0;
+consumed_masks->tunnel.ip_ttl = 0;
+consumed_masks->tunnel.ip_src = 0;
+consumed_masks->tunnel.ip_dst = 0;
+
+add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_IPV4, spec, mask);
+} else if (!is_all_zeros(>wc.masks.tunnel.ipv6_src,
+ sizeof(struct in6_addr)) ||
+   !is_all_zeros(>wc.masks.tunnel.ipv6_dst,
+ sizeof(struct in6_addr))) {
+/* IP v6 */
+struct rte_flow_item_ipv6 *spec, *mask;
+
+spec = xzalloc(sizeof *spec);
+mask = xzalloc(sizeof *mask);
+
+spec->hdr.proto = proto;
+spec->hdr.hop_limits = match->flow.tunnel.ip_ttl;
+spec->hdr.vtc_flow = htonl((uint32_t) match->flow.tunnel.ip_tos <<
+   RTE_IPV6_HDR_TC_SHIFT);
+memcpy(spec->hdr.src_addr, >flow.tunnel.ipv6_src,
+   sizeof spec->hdr.src_addr);
+memcpy(spec->hdr.dst_addr, >flow.tunnel.ipv6_dst,
+   sizeof spec->hdr.dst_addr);
+
+mask->hdr.proto = UINT8_MAX;
+mask->hdr.hop_limits = match->wc.masks.tunnel.ip_ttl;
+mask->hdr.vtc_flow = htonl((uint32_t) match->wc.masks.tunnel.ip_tos <<
+   RTE_IPV6_HDR_TC_SHIFT);
+memcpy(mask->hdr.src_addr, >wc.masks.tunnel.ipv6_src,
+   sizeof mask->hdr.src_addr);
+memcpy(mask->hdr.dst_addr, >wc.masks.tunnel.ipv6_dst,
+   sizeof mask->hdr.dst_addr);
+
+consumed_masks->tunnel.ip_tos = 0;
+consumed_masks->tunnel.ip_ttl = 0;
+memset(_masks->tunnel.ipv6_src, 0,
+   sizeof consumed_masks->tunnel.ipv6_src);
+memset(_masks->tunnel.ipv6_dst, 0,
+   sizeof consumed_masks->tunnel.ipv6_dst);
+
+add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_IPV6, spec, mask);
+} else {
+VLOG_ERR_RL(, "Tunnel L3 protocol is neither 

[ovs-dev] [PATCH V6 12/13] netdev-offload-dpdk: Support vports flows offload

2021-04-04 Thread Eli Britstein
Vports are virtual, OVS only logical devices, so rte_flows cannot be
applied as is on them. Instead, apply the rules the physical port from
which the packet has arrived, provided by orig_in_port field.

Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 lib/netdev-offload-dpdk.c | 216 --
 1 file changed, 183 insertions(+), 33 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 315575c38..197deb96b 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -25,6 +25,7 @@
 #include "netdev-offload-provider.h"
 #include "netdev-provider.h"
 #include "netdev-vport.h"
+#include "odp-util.h"
 #include "openvswitch/match.h"
 #include "openvswitch/vlog.h"
 #include "packets.h"
@@ -62,6 +63,7 @@ struct ufid_to_rte_flow_data {
 struct rte_flow *rte_flow;
 bool actions_offloaded;
 struct dpif_flow_stats stats;
+struct netdev *physdev;
 };
 
 /* Find rte_flow with @ufid. */
@@ -87,7 +89,8 @@ ufid_to_rte_flow_data_find(const ovs_u128 *ufid, bool warn)
 
 static inline struct ufid_to_rte_flow_data *
 ufid_to_rte_flow_associate(const ovs_u128 *ufid, struct netdev *netdev,
-   struct rte_flow *rte_flow, bool actions_offloaded)
+   struct netdev *physdev, struct rte_flow *rte_flow,
+   bool actions_offloaded)
 {
 size_t hash = hash_bytes(ufid, sizeof *ufid, 0);
 struct ufid_to_rte_flow_data *data = xzalloc(sizeof *data);
@@ -106,6 +109,7 @@ ufid_to_rte_flow_associate(const ovs_u128 *ufid, struct 
netdev *netdev,
 
 data->ufid = *ufid;
 data->netdev = netdev_ref(netdev);
+data->physdev = netdev != physdev ? netdev_ref(physdev) : physdev;
 data->rte_flow = rte_flow;
 data->actions_offloaded = actions_offloaded;
 
@@ -121,7 +125,10 @@ ufid_to_rte_flow_disassociate(struct ufid_to_rte_flow_data 
*data)
 
 cmap_remove(_to_rte_flow,
 CONST_CAST(struct cmap_node *, >node), hash);
-netdev_close(data->netdev);
+if (data->netdev != data->physdev) {
+netdev_close(data->netdev);
+}
+netdev_close(data->physdev);
 ovsrcu_postpone(free, data);
 }
 
@@ -134,6 +141,11 @@ struct flow_patterns {
 struct rte_flow_item *items;
 int cnt;
 int current_max;
+struct netdev *physdev;
+/* tnl_pmd_items is the opaque array of items returned by the PMD. */
+struct rte_flow_item *tnl_pmd_items;
+uint32_t tnl_pmd_items_cnt;
+struct ds s_tnl;
 };
 
 struct flow_actions {
@@ -154,16 +166,20 @@ struct flow_actions {
 static void
 dump_flow_attr(struct ds *s, struct ds *s_extra,
const struct rte_flow_attr *attr,
+   struct flow_patterns *flow_patterns,
struct flow_actions *flow_actions)
 {
 if (flow_actions->tnl_pmd_actions_cnt) {
 ds_clone(s_extra, _actions->s_tnl);
+} else if (flow_patterns->tnl_pmd_items_cnt) {
+ds_clone(s_extra, _patterns->s_tnl);
 }
-ds_put_format(s, "%s%spriority %"PRIu32" group %"PRIu32" %s%s",
+ds_put_format(s, "%s%spriority %"PRIu32" group %"PRIu32" %s%s%s",
   attr->ingress  ? "ingress " : "",
   attr->egress   ? "egress " : "", attr->priority, attr->group,
   attr->transfer ? "transfer " : "",
-  flow_actions->tnl_pmd_actions_cnt ? "tunnel_set 1 " : "");
+  flow_actions->tnl_pmd_actions_cnt ? "tunnel_set 1 " : "",
+  flow_patterns->tnl_pmd_items_cnt ? "tunnel_match 1 " : "");
 }
 
 /* Adds one pattern item 'field' with the 'mask' to dynamic string 's' using
@@ -177,9 +193,18 @@ dump_flow_attr(struct ds *s, struct ds *s_extra,
 }
 
 static void
-dump_flow_pattern(struct ds *s, const struct rte_flow_item *item)
+dump_flow_pattern(struct ds *s,
+  struct flow_patterns *flow_patterns,
+  int pattern_index)
 {
-if (item->type == RTE_FLOW_ITEM_TYPE_ETH) {
+const struct rte_flow_item *item = _patterns->items[pattern_index];
+
+if (item->type == RTE_FLOW_ITEM_TYPE_END) {
+ds_put_cstr(s, "end ");
+} else if (flow_patterns->tnl_pmd_items_cnt &&
+   pattern_index < flow_patterns->tnl_pmd_items_cnt) {
+return;
+} else if (item->type == RTE_FLOW_ITEM_TYPE_ETH) {
 const struct rte_flow_item_eth *eth_spec = item->spec;
 const struct rte_flow_item_eth *eth_mask = item->mask;
 
@@ -569,19 +594,19 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 static struct ds *
 dump_flow(struct ds *s, struct ds *s_extra,
   const struct rte_flow_attr *attr,
-  const struct rte_flow_item *items,
+  struct flow_patterns *flow_patterns,
   struct flow_actions *flow_actions)
 {
 int i;
 
 if (attr) {
-dump_flow_attr(s, s_extra, attr, flow_actions);
+dump_flow_attr(s, s_extra, attr, flow_patterns, flow_actions);
 }
 ds_put_cstr(s, 

[ovs-dev] [PATCH V6 09/13] netdev-offload-dpdk: Change log rate limits

2021-04-04 Thread Eli Britstein
In order to allow showing more debug messages, increase the rate limits.

Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 lib/netdev-offload-dpdk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 04cc7d0fe..3b86bc5f9 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -31,7 +31,7 @@
 #include "uuid.h"
 
 VLOG_DEFINE_THIS_MODULE(netdev_offload_dpdk);
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(100, 5);
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(600, 600);
 
 /* Thread-safety
  * =
-- 
2.28.0.2311.g225365fb51

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V6 08/13] netdev-offload: Disallow offloading to unrelated tunneling vports.

2021-04-04 Thread Eli Britstein
From: Ilya Maximets 

'linux_tc' flow API suitable only for tunneling vports with backing
linux interfaces. DPDK flow API is not suitable for such ports.

With this change we could drop vport restriction from dpif-netdev.

This is a prerequisite for enabling vport offloading in DPDK.

Signed-off-by: Ilya Maximets 
Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 lib/dpif-netdev.c | 3 +--
 lib/netdev-offload-dpdk.c | 8 
 lib/netdev-offload-tc.c   | 8 
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 2345df500..c93a95a9f 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2698,8 +2698,7 @@ dp_netdev_flow_offload_put(struct dp_flow_offload_item 
*offload)
 info.flow_mark = mark;
 
 port = netdev_ports_get(in_port, dpif_type_str);
-if (!port || netdev_vport_is_vport_class(port->netdev_class)) {
-netdev_close(port);
+if (!port) {
 goto err_free;
 }
 /* Taking a global 'port_mutex' to fulfill thread safety restrictions for
diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index d40b56207..04cc7d0fe 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -24,6 +24,7 @@
 #include "dpif-netdev.h"
 #include "netdev-offload-provider.h"
 #include "netdev-provider.h"
+#include "netdev-vport.h"
 #include "openvswitch/match.h"
 #include "openvswitch/vlog.h"
 #include "packets.h"
@@ -1523,6 +1524,13 @@ netdev_offload_dpdk_flow_del(struct netdev *netdev 
OVS_UNUSED,
 static int
 netdev_offload_dpdk_init_flow_api(struct netdev *netdev)
 {
+if (netdev_vport_is_vport_class(netdev->netdev_class)
+&& !strcmp(netdev_get_dpif_type(netdev), "system")) {
+VLOG_DBG("%s: vport belongs to the system datapath. Skipping.",
+ netdev_get_name(netdev));
+return EOPNOTSUPP;
+}
+
 return netdev_dpdk_flow_api_supported(netdev) ? 0 : EOPNOTSUPP;
 }
 
diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 448747eae..00b02411c 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -31,6 +31,7 @@
 #include "netdev-linux.h"
 #include "netdev-offload-provider.h"
 #include "netdev-provider.h"
+#include "netdev-vport.h"
 #include "netlink.h"
 #include "netlink-socket.h"
 #include "odp-netlink.h"
@@ -2211,6 +2212,13 @@ netdev_tc_init_flow_api(struct netdev *netdev)
 int ifindex;
 int error;
 
+if (netdev_vport_is_vport_class(netdev->netdev_class)
+&& strcmp(netdev_get_dpif_type(netdev), "system")) {
+VLOG_DBG("%s: vport doesn't belong to the system datapath. Skipping.",
+ netdev_get_name(netdev));
+return EOPNOTSUPP;
+}
+
 ifindex = netdev_get_ifindex(netdev);
 if (ifindex < 0) {
 VLOG_INFO("init: failed to get ifindex for %s: %s",
-- 
2.28.0.2311.g225365fb51

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V6 00/13] Netdev vxlan-decap offload

2021-04-04 Thread Eli Britstein
VXLAN decap in OVS-DPDK configuration consists of two flows:
F1: in_port(ens1f0),eth(),ipv4(),udp(), actions:tnl_pop(vxlan_sys_4789)
F2: tunnel(),in_port(vxlan_sys_4789),eth(),ipv4(), actions:ens1f0_0

F1 is a classification flow. It has outer headers matches and it
classifies the packet as a VXLAN packet, and using tnl_pop action the
packet continues processing in F2.
F2 is a flow that has matches on tunnel metadata as well as on the inner
packet headers (as any other flow).

In order to fully offload VXLAN decap path, both F1 and F2 should be
offloaded. As there are more than one flow in HW, it is possible that
F1 is done by HW but F2 is not. Packet is received by SW, and should be
processed starting from F2 as F1 was already done by HW.
Rte_flows are applicable only on physical port IDs. Keeping the original
physical in port on which the packet is received on enables applying
vport flows (e.g. F2) on that physical port.

This patch-set makes use of [1] introduced in DPDK 20.11, that adds API
for tunnel offloads.

Note that MLX5 PMD has a bug that the tnl_pop private actions must be
first. In OVS it is not.
Fixing this issue is scheduled for 21.05 (and stable 20.11.2).
Meanwhile, tests were done with a workaround for it [2].

v2-v1:
- Tracking original in_port, and applying vport on that physical port instead 
of all PFs.
v3-v2:
- Traversing ports using a new API instead of flow_dump.
- Refactor packet state recover logic, with bug fix for error pop_header.
- One ref count for netdev in non-tunnel case.
- Rename variables, comments, rebase.
v4-v3:
- Extract orig_in_port from physdev for flow modify.
- Miss handling fixes.
v5-v4:
- Drop refactor offload rule creation commit.
- Comment about setting in_port in restore.
- Refactor vports flow offload commit.
v6-v5:
- Fixed duplicate netdev ref bug.

Travis:
v1: https://travis-ci.org/github/elibritstein/OVS/builds/756418552
v2: https://travis-ci.org/github/elibritstein/OVS/builds/758382963
v3: https://travis-ci.org/github/elibritstein/OVS/builds/761089087
v4: https://travis-ci.org/github/elibritstein/OVS/builds/763146966
v5: https://travis-ci.org/github/elibritstein/OVS/builds/765271879
v6: https://travis-ci.org/github/elibritstein/OVS/builds/765816800

GitHub Actions:
v1: https://github.com/elibritstein/OVS/actions/runs/515334647
v2: https://github.com/elibritstein/OVS/actions/runs/554986007
v3: https://github.com/elibritstein/OVS/actions/runs/613226225
v4: https://github.com/elibritstein/OVS/actions/runs/658415274
v5: https://github.com/elibritstein/OVS/actions/runs/704357369
v6: https://github.com/elibritstein/OVS/actions/runs/716304028

[1] https://mails.dpdk.org/archives/dev/2020-October/187314.html
[2] https://github.com/elibritstein/dpdk-stable/pull/1


Eli Britstein (10):
  netdev-offload: Add HW miss packet state recover API
  netdev-dpdk: Introduce DPDK tunnel APIs
  netdev-offload: Introduce an API to traverse ports
  netdev-dpdk: Add flow_api support for netdev vxlan vports
  netdev-offload-dpdk: Implement HW miss packet recover for vport
  dpif-netdev: Add HW miss packet state recover logic
  netdev-offload-dpdk: Change log rate limits
  netdev-offload-dpdk: Support tunnel pop action
  netdev-offload-dpdk: Support vports flows offload
  netdev-dpdk-offload: Add vxlan pattern matching function

Ilya Maximets (2):
  netdev-offload: Allow offloading to netdev without ifindex.
  netdev-offload: Disallow offloading to unrelated tunneling vports.

Sriharsha Basavapatna (1):
  dpif-netdev: Provide orig_in_port in metadata for tunneled packets

 Documentation/howto/dpdk.rst  |   1 +
 NEWS  |   2 +
 lib/dpif-netdev.c |  97 +++--
 lib/netdev-dpdk.c | 118 ++
 lib/netdev-dpdk.h | 106 -
 lib/netdev-offload-dpdk.c | 704 +++---
 lib/netdev-offload-provider.h |   5 +
 lib/netdev-offload-tc.c   |   8 +
 lib/netdev-offload.c  |  47 ++-
 lib/netdev-offload.h  |  10 +
 lib/packets.h |   8 +-
 11 files changed, 1022 insertions(+), 84 deletions(-)

-- 
2.28.0.2311.g225365fb51

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V6 10/13] netdev-offload-dpdk: Support tunnel pop action

2021-04-04 Thread Eli Britstein
Support tunnel pop action.

Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 Documentation/howto/dpdk.rst |   1 +
 NEWS |   1 +
 lib/netdev-offload-dpdk.c| 183 ---
 3 files changed, 170 insertions(+), 15 deletions(-)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index f0d45e47b..4918d80f3 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -398,6 +398,7 @@ Supported actions for hardware offload are:
 - VLAN Push/Pop (push_vlan/pop_vlan).
 - Modification of IPv6 (set_field:->ipv6_src/ipv6_dst/mod_nw_ttl).
 - Clone/output (tnl_push and output) for encapsulating over a tunnel.
+- Tunnel pop, for changing from a physical port to a vport.
 
 Further Reading
 ---
diff --git a/NEWS b/NEWS
index 95cf922aa..74eed0a6c 100644
--- a/NEWS
+++ b/NEWS
@@ -33,6 +33,7 @@ v2.15.0 - 15 Feb 2021
- DPDK:
  * Removed support for vhost-user dequeue zero-copy.
  * Add support for DPDK 20.11.
+ * Add hardware offload support for tunnel pop action (experimental).
- Userspace datapath:
  * Add the 'pmd' option to "ovs-appctl dpctl/dump-flows", which
restricts a flow dump to a single PMD thread if set.
diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 3b86bc5f9..315575c38 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -140,15 +140,30 @@ struct flow_actions {
 struct rte_flow_action *actions;
 int cnt;
 int current_max;
+struct netdev *tnl_netdev;
+/* tnl_pmd_actions is the opaque array of actions returned by the PMD. */
+struct rte_flow_action *tnl_pmd_actions;
+uint32_t tnl_pmd_actions_cnt;
+/* tnl_pmd_actions_pos is where the tunnel actions starts within the
+ * 'actions' field.
+ */
+int tnl_pmd_actions_pos;
+struct ds s_tnl;
 };
 
 static void
-dump_flow_attr(struct ds *s, const struct rte_flow_attr *attr)
+dump_flow_attr(struct ds *s, struct ds *s_extra,
+   const struct rte_flow_attr *attr,
+   struct flow_actions *flow_actions)
 {
-ds_put_format(s, "%s%spriority %"PRIu32" group %"PRIu32" %s",
+if (flow_actions->tnl_pmd_actions_cnt) {
+ds_clone(s_extra, _actions->s_tnl);
+}
+ds_put_format(s, "%s%spriority %"PRIu32" group %"PRIu32" %s%s",
   attr->ingress  ? "ingress " : "",
   attr->egress   ? "egress " : "", attr->priority, attr->group,
-  attr->transfer ? "transfer " : "");
+  attr->transfer ? "transfer " : "",
+  flow_actions->tnl_pmd_actions_cnt ? "tunnel_set 1 " : "");
 }
 
 /* Adds one pattern item 'field' with the 'mask' to dynamic string 's' using
@@ -395,9 +410,19 @@ dump_vxlan_encap(struct ds *s, const struct rte_flow_item 
*items)
 
 static void
 dump_flow_action(struct ds *s, struct ds *s_extra,
- const struct rte_flow_action *actions)
+ struct flow_actions *flow_actions, int act_index)
 {
-if (actions->type == RTE_FLOW_ACTION_TYPE_MARK) {
+const struct rte_flow_action *actions = _actions->actions[act_index];
+
+if (actions->type == RTE_FLOW_ACTION_TYPE_END) {
+ds_put_cstr(s, "end");
+} else if (flow_actions->tnl_pmd_actions_cnt &&
+   act_index >= flow_actions->tnl_pmd_actions_pos &&
+   act_index < flow_actions->tnl_pmd_actions_pos +
+   flow_actions->tnl_pmd_actions_cnt) {
+/* Opaque PMD tunnel actions is skipped. */
+return;
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_MARK) {
 const struct rte_flow_action_mark *mark = actions->conf;
 
 ds_put_cstr(s, "mark ");
@@ -528,6 +553,14 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 ds_put_cstr(s, "vxlan_encap / ");
 dump_vxlan_encap(s_extra, items);
 ds_put_cstr(s_extra, ";");
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_JUMP) {
+const struct rte_flow_action_jump *jump = actions->conf;
+
+ds_put_cstr(s, "jump ");
+if (jump) {
+ds_put_format(s, "group %"PRIu32" ", jump->group);
+}
+ds_put_cstr(s, "/ ");
 } else {
 ds_put_format(s, "unknown rte flow action (%d)\n", actions->type);
 }
@@ -537,20 +570,21 @@ static struct ds *
 dump_flow(struct ds *s, struct ds *s_extra,
   const struct rte_flow_attr *attr,
   const struct rte_flow_item *items,
-  const struct rte_flow_action *actions)
+  struct flow_actions *flow_actions)
 {
+int i;
+
 if (attr) {
-dump_flow_attr(s, attr);
+dump_flow_attr(s, s_extra, attr, flow_actions);
 }
 ds_put_cstr(s, "pattern ");
 while (items && items->type != RTE_FLOW_ITEM_TYPE_END) {
 dump_flow_pattern(s, items++);
 }
 ds_put_cstr(s, "end actions ");
-while (actions && actions->type != RTE_FLOW_ACTION_TYPE_END) {
-  

[ovs-dev] [PATCH V6 04/13] netdev-dpdk: Add flow_api support for netdev vxlan vports

2021-04-04 Thread Eli Britstein
Add the acceptance of vxlan devices to netdev_dpdk_flow_api_supported()
API, to allow offloading of DPDK vxlan devices.

Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 lib/netdev-dpdk.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index aa8716fb3..6e35d0574 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -5216,6 +5216,12 @@ netdev_dpdk_flow_api_supported(struct netdev *netdev)
 struct netdev_dpdk *dev;
 bool ret = false;
 
+if (!strcmp(netdev_get_type(netdev), "vxlan") &&
+!strcmp(netdev_get_dpif_type(netdev), "netdev")) {
+ret = true;
+goto out;
+}
+
 if (!is_dpdk_class(netdev->netdev_class)) {
 goto out;
 }
-- 
2.28.0.2311.g225365fb51

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V6 11/13] dpif-netdev: Provide orig_in_port in metadata for tunneled packets

2021-04-04 Thread Eli Britstein
From: Sriharsha Basavapatna 

When an encapsulated packet is recirculated through a TUNNEL_POP
action, the metadata gets reinitialized and the originating physical
port information is lost. When this flow gets processed by the vport
and it needs to be offloaded, we can't figure out the physical port
through which the tunneled packet was received.

Add a new member to the metadata: 'orig_in_port'. This is passed to
the next stage during recirculation and the offload layer can use it
to offload the flow to this physical port.

Signed-off-by: Sriharsha Basavapatna 
Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 lib/dpif-netdev.c| 20 ++--
 lib/netdev-offload.h |  1 +
 lib/packets.h|  8 +++-
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index c93a95a9f..b6b4125ed 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -431,6 +431,7 @@ struct dp_flow_offload_item {
 struct match match;
 struct nlattr *actions;
 size_t actions_len;
+odp_port_t orig_in_port; /* Originating in_port for tnl flows. */
 
 struct ovs_list node;
 };
@@ -2696,11 +2697,13 @@ dp_netdev_flow_offload_put(struct dp_flow_offload_item 
*offload)
 }
 }
 info.flow_mark = mark;
+info.orig_in_port = offload->orig_in_port;
 
 port = netdev_ports_get(in_port, dpif_type_str);
 if (!port) {
 goto err_free;
 }
+
 /* Taking a global 'port_mutex' to fulfill thread safety restrictions for
  * the netdev-offload-dpdk module. */
 ovs_mutex_lock(>dp->port_mutex);
@@ -2798,7 +2801,8 @@ queue_netdev_flow_del(struct dp_netdev_pmd_thread *pmd,
 static void
 queue_netdev_flow_put(struct dp_netdev_pmd_thread *pmd,
   struct dp_netdev_flow *flow, struct match *match,
-  const struct nlattr *actions, size_t actions_len)
+  const struct nlattr *actions, size_t actions_len,
+  odp_port_t orig_in_port)
 {
 struct dp_flow_offload_item *offload;
 int op;
@@ -2824,6 +2828,7 @@ queue_netdev_flow_put(struct dp_netdev_pmd_thread *pmd,
 offload->actions = xmalloc(actions_len);
 memcpy(offload->actions, actions, actions_len);
 offload->actions_len = actions_len;
+offload->orig_in_port = orig_in_port;
 
 dp_netdev_append_flow_offload(offload);
 }
@@ -3625,7 +3630,8 @@ dp_netdev_get_mega_ufid(const struct match *match, 
ovs_u128 *mega_ufid)
 static struct dp_netdev_flow *
 dp_netdev_flow_add(struct dp_netdev_pmd_thread *pmd,
struct match *match, const ovs_u128 *ufid,
-   const struct nlattr *actions, size_t actions_len)
+   const struct nlattr *actions, size_t actions_len,
+   odp_port_t orig_in_port)
 OVS_REQUIRES(pmd->flow_mutex)
 {
 struct ds extra_info = DS_EMPTY_INITIALIZER;
@@ -3691,7 +3697,8 @@ dp_netdev_flow_add(struct dp_netdev_pmd_thread *pmd,
 cmap_insert(>flow_table, CONST_CAST(struct cmap_node *, >node),
 dp_netdev_flow_hash(>ufid));
 
-queue_netdev_flow_put(pmd, flow, match, actions, actions_len);
+queue_netdev_flow_put(pmd, flow, match, actions, actions_len,
+  orig_in_port);
 
 if (OVS_UNLIKELY(!VLOG_DROP_DBG((_rl {
 struct ds ds = DS_EMPTY_INITIALIZER;
@@ -3762,7 +3769,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
 if (!netdev_flow) {
 if (put->flags & DPIF_FP_CREATE) {
 dp_netdev_flow_add(pmd, match, ufid, put->actions,
-   put->actions_len);
+   put->actions_len, ODPP_NONE);
 } else {
 error = ENOENT;
 }
@@ -3778,7 +3785,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
 ovsrcu_set(_flow->actions, new_actions);
 
 queue_netdev_flow_put(pmd, netdev_flow, match,
-  put->actions, put->actions_len);
+  put->actions, put->actions_len, ODPP_NONE);
 
 if (stats) {
 get_dpif_flow_status(pmd->dp, netdev_flow, stats, NULL);
@@ -7256,6 +7263,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
 ovs_u128 ufid;
 int error;
 uint64_t cycles = cycles_counter_update(>perf_stats);
+odp_port_t orig_in_port = packet->md.orig_in_port;
 
 match.tun_md.valid = false;
 miniflow_expand(>mf, );
@@ -7305,7 +7313,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
 if (OVS_LIKELY(!netdev_flow)) {
 netdev_flow = dp_netdev_flow_add(pmd, , ,
  add_actions->data,
- add_actions->size);
+ add_actions->size, orig_in_port);
 }
 ovs_mutex_unlock(>flow_mutex);
 uint32_t hash = dp_netdev_flow_hash(_flow->ufid);
diff --git 

[ovs-dev] [PATCH V6 02/13] netdev-dpdk: Introduce DPDK tunnel APIs

2021-04-04 Thread Eli Britstein
As a pre-step towards tunnel offloads, introduce DPDK APIs.

Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 lib/netdev-dpdk.c | 112 ++
 lib/netdev-dpdk.h | 106 ---
 2 files changed, 212 insertions(+), 6 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 9d8096668..aa8716fb3 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -5291,6 +5291,118 @@ netdev_dpdk_rte_flow_query_count(struct netdev *netdev,
 return ret;
 }
 
+#ifdef ALLOW_EXPERIMENTAL_API
+
+int
+netdev_dpdk_rte_flow_tunnel_decap_set(struct netdev *netdev,
+  struct rte_flow_tunnel *tunnel,
+  struct rte_flow_action **actions,
+  uint32_t *num_of_actions,
+  struct rte_flow_error *error)
+{
+struct netdev_dpdk *dev;
+int ret;
+
+if (!is_dpdk_class(netdev->netdev_class)) {
+return -1;
+}
+
+dev = netdev_dpdk_cast(netdev);
+ovs_mutex_lock(>mutex);
+ret = rte_flow_tunnel_decap_set(dev->port_id, tunnel, actions,
+num_of_actions, error);
+ovs_mutex_unlock(>mutex);
+return ret;
+}
+
+int
+netdev_dpdk_rte_flow_tunnel_match(struct netdev *netdev,
+  struct rte_flow_tunnel *tunnel,
+  struct rte_flow_item **items,
+  uint32_t *num_of_items,
+  struct rte_flow_error *error)
+{
+struct netdev_dpdk *dev;
+int ret;
+
+if (!is_dpdk_class(netdev->netdev_class)) {
+return -1;
+}
+
+dev = netdev_dpdk_cast(netdev);
+ovs_mutex_lock(>mutex);
+ret = rte_flow_tunnel_match(dev->port_id, tunnel, items, num_of_items,
+error);
+ovs_mutex_unlock(>mutex);
+return ret;
+}
+
+int
+netdev_dpdk_rte_flow_get_restore_info(struct netdev *netdev,
+  struct dp_packet *p,
+  struct rte_flow_restore_info *info,
+  struct rte_flow_error *error)
+{
+struct rte_mbuf *m = (struct rte_mbuf *) p;
+struct netdev_dpdk *dev;
+int ret;
+
+if (!is_dpdk_class(netdev->netdev_class)) {
+return -1;
+}
+
+dev = netdev_dpdk_cast(netdev);
+ovs_mutex_lock(>mutex);
+ret = rte_flow_get_restore_info(dev->port_id, m, info, error);
+ovs_mutex_unlock(>mutex);
+return ret;
+}
+
+int
+netdev_dpdk_rte_flow_tunnel_action_decap_release
+(struct netdev *netdev,
+ struct rte_flow_action *actions,
+ uint32_t num_of_actions,
+ struct rte_flow_error *error)
+{
+struct netdev_dpdk *dev;
+int ret;
+
+if (!is_dpdk_class(netdev->netdev_class)) {
+return -1;
+}
+
+dev = netdev_dpdk_cast(netdev);
+ovs_mutex_lock(>mutex);
+ret = rte_flow_tunnel_action_decap_release(dev->port_id, actions,
+   num_of_actions, error);
+ovs_mutex_unlock(>mutex);
+return ret;
+}
+
+int
+netdev_dpdk_rte_flow_tunnel_item_release(struct netdev *netdev,
+ struct rte_flow_item *items,
+ uint32_t num_of_items,
+ struct rte_flow_error *error)
+{
+struct netdev_dpdk *dev;
+int ret;
+
+if (!is_dpdk_class(netdev->netdev_class)) {
+return -1;
+}
+
+dev = netdev_dpdk_cast(netdev);
+ovs_mutex_lock(>mutex);
+ret = rte_flow_tunnel_item_release(dev->port_id, items, num_of_items,
+   error);
+ovs_mutex_unlock(>mutex);
+return ret;
+}
+
+#endif /* ALLOW_EXPERIMENTAL_API */
+
 #define NETDEV_DPDK_CLASS_COMMON\
 .is_pmd = true, \
 .alloc = netdev_dpdk_alloc, \
diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h
index 848346cb4..3b9bf8681 100644
--- a/lib/netdev-dpdk.h
+++ b/lib/netdev-dpdk.h
@@ -26,12 +26,7 @@ struct netdev;
 
 #ifdef DPDK_NETDEV
 
-struct rte_flow;
-struct rte_flow_error;
-struct rte_flow_attr;
-struct rte_flow_item;
-struct rte_flow_action;
-struct rte_flow_query_count;
+#include 
 
 void netdev_dpdk_register(void);
 void free_dpdk_buf(struct dp_packet *);
@@ -56,6 +51,105 @@ netdev_dpdk_rte_flow_query_count(struct netdev *netdev,
 int
 netdev_dpdk_get_port_id(struct netdev *netdev);
 
+#ifdef ALLOW_EXPERIMENTAL_API
+
+int
+netdev_dpdk_rte_flow_tunnel_decap_set(struct netdev *,
+  struct rte_flow_tunnel *,
+  struct rte_flow_action **,
+  uint32_t *,
+  struct rte_flow_error *);
+int

[ovs-dev] [PATCH V6 05/13] netdev-offload-dpdk: Implement HW miss packet recover for vport

2021-04-04 Thread Eli Britstein
A miss in virtual port offloads means the flow with tnl_pop was
offloaded, but not the following one. Recover the state and continue
with SW processing.

Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 lib/netdev-offload-dpdk.c | 152 ++
 1 file changed, 152 insertions(+)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index f2413f5be..d40b56207 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -1588,6 +1588,157 @@ netdev_offload_dpdk_flow_flush(struct netdev *netdev)
 return 0;
 }
 
+struct get_vport_netdev_aux {
+struct rte_flow_tunnel *tunnel;
+odp_port_t *odp_port;
+struct netdev *vport;
+};
+
+static bool
+get_vxlan_netdev_cb(struct netdev *netdev,
+odp_port_t odp_port,
+void *aux_)
+{
+const struct netdev_tunnel_config *tnl_cfg;
+struct get_vport_netdev_aux *aux = aux_;
+
+if (strcmp(netdev_get_type(netdev), "vxlan")) {
+   return false;
+}
+
+tnl_cfg = netdev_get_tunnel_config(netdev);
+if (!tnl_cfg) {
+VLOG_ERR_RL(, "Cannot get a tunnel config for netdev %s",
+netdev_get_name(netdev));
+return false;
+}
+
+if (tnl_cfg->dst_port == aux->tunnel->tp_dst) {
+/* Found the netdev. Store the results and stop the traversing. */
+aux->vport = netdev_ref(netdev);
+*aux->odp_port = odp_port;
+return true;
+}
+
+return false;
+}
+
+static struct netdev *
+get_vxlan_netdev(const char *dpif_type,
+ struct rte_flow_tunnel *tunnel,
+ odp_port_t *odp_port)
+{
+struct get_vport_netdev_aux aux = {
+.tunnel = tunnel,
+.odp_port = odp_port,
+.vport = NULL,
+};
+
+netdev_ports_traverse(dpif_type, get_vxlan_netdev_cb, );
+return aux.vport;
+}
+
+static struct netdev *
+get_vport_netdev(const char *dpif_type,
+ struct rte_flow_tunnel *tunnel,
+ odp_port_t *odp_port)
+{
+if (tunnel->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
+return get_vxlan_netdev(dpif_type, tunnel, odp_port);
+}
+
+OVS_NOT_REACHED();
+}
+
+static int
+netdev_offload_dpdk_hw_miss_packet_recover(struct netdev *netdev,
+   struct dp_packet *packet)
+{
+struct rte_flow_restore_info rte_restore_info;
+struct rte_flow_tunnel *rte_tnl;
+struct netdev *vport_netdev;
+struct rte_flow_error error;
+struct pkt_metadata *md;
+struct flow_tnl *md_tnl;
+odp_port_t vport_odp;
+int ret = 0;
+
+if (netdev_dpdk_rte_flow_get_restore_info(netdev, packet,
+  _restore_info, )) {
+/* This function is called for every packet, and in most cases there
+ * will be no restore info from the HW, thus error is expected.
+ */
+(void) error;
+return 0;
+}
+
+if (!(rte_restore_info.flags & RTE_FLOW_RESTORE_INFO_TUNNEL)) {
+return EOPNOTSUPP;
+}
+
+rte_tnl = _restore_info.tunnel;
+vport_netdev = get_vport_netdev(netdev->dpif_type, rte_tnl,
+_odp);
+if (!vport_netdev) {
+VLOG_WARN("Could not find vport netdev");
+return EOPNOTSUPP;
+}
+
+md = >md;
+/* For tunnel recovery (RTE_FLOW_RESTORE_INFO_TUNNEL), it is possible
+ * to have the packet to still be encapsulated, or not
+ * (RTE_FLOW_RESTORE_INFO_ENCAPSULATED).
+ * In the case it is on, the packet is still encapsulated, and we do
+ * the pop in SW.
+ * In the case it is off, the packet is already decapsulated by HW, and
+ * the tunnel info is provided in the tunnel struct. For this case we
+ * take it to OVS metadata.
+ */
+if (rte_restore_info.flags & RTE_FLOW_RESTORE_INFO_ENCAPSULATED) {
+if (!vport_netdev->netdev_class ||
+!vport_netdev->netdev_class->pop_header) {
+VLOG_ERR("vport nedtdev=%s with no pop_header method",
+ netdev_get_name(vport_netdev));
+ret = EOPNOTSUPP;
+goto close_vport_netdev;
+}
+parse_tcp_flags(packet);
+if (vport_netdev->netdev_class->pop_header(packet) == NULL) {
+/* If there is an error with popping the header, the packet is
+ * freed. In this case it should not continue SW processing.
+ */
+ret = -1;
+goto close_vport_netdev;
+}
+} else {
+md_tnl = >tunnel;
+if (rte_tnl->is_ipv6) {
+memcpy(_tnl->ipv6_src, _tnl->ipv6.src_addr,
+   sizeof md_tnl->ipv6_src);
+memcpy(_tnl->ipv6_dst, _tnl->ipv6.dst_addr,
+   sizeof md_tnl->ipv6_dst);
+} else {
+md_tnl->ip_src = rte_tnl->ipv4.src_addr;
+md_tnl->ip_dst = rte_tnl->ipv4.dst_addr;
+}
+md_tnl->tun_id = 

[ovs-dev] [PATCH V6 01/13] netdev-offload: Add HW miss packet state recover API

2021-04-04 Thread Eli Britstein
When the HW offload involves multiple flows, like in tunnel decap path,
it is possible that not all flows in the path are offloaded, resulting
in partial processing in HW. In order to proceed with rest of the
processing in SW, the packet state has to be recovered as if it was
processed in SW from the beginning. In the case of tunnel decap,
potential state to recover could be the outer tunneling layer to
metadata.
Add an API for that.

Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 lib/netdev-offload-provider.h |  5 +
 lib/netdev-offload.c  | 12 
 lib/netdev-offload.h  |  1 +
 3 files changed, 18 insertions(+)

diff --git a/lib/netdev-offload-provider.h b/lib/netdev-offload-provider.h
index cf859d1b4..f24c7dd19 100644
--- a/lib/netdev-offload-provider.h
+++ b/lib/netdev-offload-provider.h
@@ -87,6 +87,11 @@ struct netdev_flow_api {
  * Return 0 if successful, otherwise returns a positive errno value. */
 int (*flow_get_n_flows)(struct netdev *, uint64_t *n_flows);
 
+/* Recover the packet state (contents and data) for continued processing
+ * in software.
+ * Return 0 if successful, otherwise returns a positive errno value. */
+int (*hw_miss_packet_recover)(struct netdev *, struct dp_packet *);
+
 /* Initializies the netdev flow api.
  * Return 0 if successful, otherwise returns a positive errno value. */
 int (*init_flow_api)(struct netdev *);
diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
index 6237667c3..e5d24651f 100644
--- a/lib/netdev-offload.c
+++ b/lib/netdev-offload.c
@@ -253,6 +253,18 @@ netdev_flow_put(struct netdev *netdev, struct match *match,
: EOPNOTSUPP;
 }
 
+int
+netdev_hw_miss_packet_recover(struct netdev *netdev,
+  struct dp_packet *packet)
+{
+const struct netdev_flow_api *flow_api =
+ovsrcu_get(const struct netdev_flow_api *, >flow_api);
+
+return (flow_api && flow_api->hw_miss_packet_recover)
+? flow_api->hw_miss_packet_recover(netdev, packet)
+: EOPNOTSUPP;
+}
+
 int
 netdev_flow_get(struct netdev *netdev, struct match *match,
 struct nlattr **actions, const ovs_u128 *ufid,
diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h
index 18b48790f..b063c43a3 100644
--- a/lib/netdev-offload.h
+++ b/lib/netdev-offload.h
@@ -89,6 +89,7 @@ bool netdev_flow_dump_next(struct netdev_flow_dump *, struct 
match *,
 int netdev_flow_put(struct netdev *, struct match *, struct nlattr *actions,
 size_t actions_len, const ovs_u128 *,
 struct offload_info *, struct dpif_flow_stats *);
+int netdev_hw_miss_packet_recover(struct netdev *, struct dp_packet *);
 int netdev_flow_get(struct netdev *, struct match *, struct nlattr **actions,
 const ovs_u128 *, struct dpif_flow_stats *,
 struct dpif_flow_attrs *, struct ofpbuf *wbuffer);
-- 
2.28.0.2311.g225365fb51

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V6 06/13] dpif-netdev: Add HW miss packet state recover logic

2021-04-04 Thread Eli Britstein
Recover the packet if it was partially processed by the HW. Fallback to
lookup flow by mark association.

Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 lib/dpif-netdev.c | 74 +++
 1 file changed, 56 insertions(+), 18 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 251788b04..2345df500 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -7064,6 +7064,44 @@ smc_lookup_batch(struct dp_netdev_pmd_thread *pmd,
 pmd_perf_update_counter(>perf_stats, PMD_STAT_SMC_HIT, n_smc_hit);
 }
 
+static struct tx_port *
+pmd_send_port_cache_lookup(const struct dp_netdev_pmd_thread *pmd,
+   odp_port_t port_no);
+
+static inline int
+dp_netdev_hw_flow(const struct dp_netdev_pmd_thread *pmd,
+  odp_port_t port_no,
+  struct dp_packet *packet,
+  struct dp_netdev_flow **flow)
+{
+struct tx_port *p;
+uint32_t mark;
+
+if (!netdev_is_flow_api_enabled() || *recirc_depth_get() != 0) {
+*flow = NULL;
+return 0;
+}
+
+/* Restore the packet if HW processing was terminated before completion. */
+p = pmd_send_port_cache_lookup(pmd, port_no);
+if (OVS_LIKELY(p)) {
+int err = netdev_hw_miss_packet_recover(p->port->netdev, packet);
+
+if (err != 0 && err != EOPNOTSUPP) {
+return -1;
+}
+}
+
+/* If no mark, no flow to find. */
+if (!dp_packet_has_flow_mark(packet, )) {
+*flow = NULL;
+return 0;
+}
+
+*flow = mark_to_flow_find(pmd, mark);
+return 0;
+}
+
 /* Try to process all ('cnt') the 'packets' using only the datapath flow cache
  * 'pmd->flow_cache'. If a flow is not found for a packet 'packets[i]', the
  * miniflow is copied into 'keys' and the packet pointer is moved at the
@@ -7108,7 +7146,6 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
 
 DP_PACKET_BATCH_REFILL_FOR_EACH (i, cnt, packet, packets_) {
 struct dp_netdev_flow *flow;
-uint32_t mark;
 
 if (OVS_UNLIKELY(dp_packet_size(packet) < ETH_HEADER_LEN)) {
 dp_packet_delete(packet);
@@ -7127,24 +7164,25 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
 pkt_metadata_init(>md, port_no);
 }
 
-if ((*recirc_depth_get() == 0) &&
-dp_packet_has_flow_mark(packet, )) {
-flow = mark_to_flow_find(pmd, mark);
-if (OVS_LIKELY(flow)) {
-tcp_flags = parse_tcp_flags(packet);
-if (OVS_LIKELY(batch_enable)) {
-dp_netdev_queue_batches(packet, flow, tcp_flags, batches,
-n_batches);
-} else {
-/* Flow batching should be performed only after fast-path
- * processing is also completed for packets with emc miss
- * or else it will result in reordering of packets with
- * same datapath flows. */
-packet_enqueue_to_flow_map(packet, flow, tcp_flags,
-   flow_map, map_cnt++);
-}
-continue;
+if (OVS_UNLIKELY(dp_netdev_hw_flow(pmd, port_no, packet, ))) {
+/* Packet restoration failed. Its mbuf was freed, do not continue
+ * processing.
+ */
+continue;
+} else if (OVS_LIKELY(flow)) {
+tcp_flags = parse_tcp_flags(packet);
+if (OVS_LIKELY(batch_enable)) {
+dp_netdev_queue_batches(packet, flow, tcp_flags, batches,
+n_batches);
+} else {
+/* Flow batching should be performed only after fast-path
+ * processing is also completed for packets with emc miss
+ * or else it will result in reordering of packets with
+ * same datapath flows. */
+packet_enqueue_to_flow_map(packet, flow, tcp_flags, flow_map,
+   map_cnt++);
 }
+continue;
 }
 
 miniflow_extract(packet, >mf);
-- 
2.28.0.2311.g225365fb51

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V6 07/13] netdev-offload: Allow offloading to netdev without ifindex.

2021-04-04 Thread Eli Britstein
From: Ilya Maximets 

Virtual interfaces like vports or dpdk vhost-user ports have no
proper ifindex, while still supporting some offloads.

This is a prerequisite for tunneling vport offloading with DPDK
flow API.

Signed-off-by: Ilya Maximets 
Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 lib/netdev-offload.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
index 10f543018..8075cfbd8 100644
--- a/lib/netdev-offload.c
+++ b/lib/netdev-offload.c
@@ -579,10 +579,6 @@ netdev_ports_insert(struct netdev *netdev, const char 
*dpif_type,
 struct port_to_netdev_data *data;
 int ifindex = netdev_get_ifindex(netdev);
 
-if (ifindex < 0) {
-return ENODEV;
-}
-
 ovs_rwlock_wrlock(_hmap_rwlock);
 if (netdev_ports_lookup(dpif_port->port_no, dpif_type)) {
 ovs_rwlock_unlock(_hmap_rwlock);
@@ -592,13 +588,18 @@ netdev_ports_insert(struct netdev *netdev, const char 
*dpif_type,
 data = xzalloc(sizeof *data);
 data->netdev = netdev_ref(netdev);
 dpif_port_clone(>dpif_port, dpif_port);
-data->ifindex = ifindex;
+
+if (ifindex >= 0) {
+data->ifindex = ifindex;
+hmap_insert(_to_port, >ifindex_node, ifindex);
+} else {
+data->ifindex = -1;
+}
 
 netdev_set_dpif_type(netdev, dpif_type);
 
 hmap_insert(_to_netdev, >portno_node,
 netdev_ports_hash(dpif_port->port_no, dpif_type));
-hmap_insert(_to_port, >ifindex_node, ifindex);
 ovs_rwlock_unlock(_hmap_rwlock);
 
 netdev_init_flow_api(netdev);
@@ -634,7 +635,9 @@ netdev_ports_remove(odp_port_t port_no, const char 
*dpif_type)
 dpif_port_destroy(>dpif_port);
 netdev_close(data->netdev); /* unref and possibly close */
 hmap_remove(_to_netdev, >portno_node);
-hmap_remove(_to_port, >ifindex_node);
+if (data->ifindex >= 0) {
+hmap_remove(_to_port, >ifindex_node);
+}
 free(data);
 ret = 0;
 }
-- 
2.28.0.2311.g225365fb51

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V6 03/13] netdev-offload: Introduce an API to traverse ports

2021-04-04 Thread Eli Britstein
Introduce an API to traverse the ports added to the offload ports map,
with a generic callback for each one.

Signed-off-by: Eli Britstein 
Reviewed-by: Gaetan Rivet 
---
 lib/netdev-offload.c | 18 ++
 lib/netdev-offload.h |  8 
 2 files changed, 26 insertions(+)

diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
index e5d24651f..10f543018 100644
--- a/lib/netdev-offload.c
+++ b/lib/netdev-offload.c
@@ -458,6 +458,24 @@ netdev_ports_flow_flush(const char *dpif_type)
 ovs_rwlock_unlock(_hmap_rwlock);
 }
 
+void
+netdev_ports_traverse(const char *dpif_type,
+  bool (*cb)(struct netdev *, odp_port_t, void *),
+  void *aux)
+{
+struct port_to_netdev_data *data;
+
+ovs_rwlock_rdlock(_hmap_rwlock);
+HMAP_FOR_EACH (data, portno_node, _to_netdev) {
+if (netdev_get_dpif_type(data->netdev) == dpif_type) {
+if (cb(data->netdev, data->dpif_port.port_no, aux)) {
+break;
+}
+}
+}
+ovs_rwlock_unlock(_hmap_rwlock);
+}
+
 struct netdev_flow_dump **
 netdev_ports_flow_dump_create(const char *dpif_type, int *ports, bool terse)
 {
diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h
index b063c43a3..5bf89f891 100644
--- a/lib/netdev-offload.h
+++ b/lib/netdev-offload.h
@@ -113,6 +113,14 @@ struct netdev *netdev_ports_get(odp_port_t port, const 
char *dpif_type);
 int netdev_ports_remove(odp_port_t port, const char *dpif_type);
 odp_port_t netdev_ifindex_to_odp_port(int ifindex);
 
+/* For each of the ports with dpif_type, call cb with the netdev and port
+ * number of the port, and an opaque user argument.
+ * The returned value is used to continue traversing upon false or stop if
+ * true.
+ */
+void netdev_ports_traverse(const char *dpif_type,
+   bool (*cb)(struct netdev *, odp_port_t, void *),
+   void *aux);
 struct netdev_flow_dump **netdev_ports_flow_dump_create(
 const char *dpif_type,
 int *ports,
-- 
2.28.0.2311.g225365fb51

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] net: openvswitch: Use 'skb_push_rcsum()' instead of hand coding it

2021-04-04 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net-next.git (refs/heads/master):

On Sun,  4 Apr 2021 09:11:03 +0200 you wrote:
> 'skb_push()'/'skb_postpush_rcsum()' can be replaced by an equivalent
> 'skb_push_rcsum()' which is less verbose.
> 
> Signed-off-by: Christophe JAILLET 
> ---
>  net/openvswitch/conntrack.c| 6 ++
>  net/openvswitch/vport-netdev.c | 7 +++
>  2 files changed, 5 insertions(+), 8 deletions(-)

Here is the summary with links:
  - net: openvswitch: Use 'skb_push_rcsum()' instead of hand coding it
https://git.kernel.org/netdev/net-next/c/7d42e84eb99d

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev