Hi Ilya,
Please find comments inline and at the end of this mail.

Regards,
Ophir

> -----Original Message-----
> From: Ilya Maximets <i.maxim...@samsung.com>
> Sent: Tuesday, April 23, 2019 7:19 PM
> To: ovs-dev@openvswitch.org
> Cc: Ian Stokes <ian.sto...@intel.com>; Flavio Leitner <f...@sysclose.org>;
> Ophir Munk <ophi...@mellanox.com>; Kevin Traynor
> <ktray...@redhat.com>; Roni Bar Yanai <ron...@mellanox.com>; Finn
> Christensen <f...@napatech.com>; Ben Pfaff <b...@ovn.org>; Roi Dayan
> <r...@mellanox.com>; Simon Horman <simon.hor...@netronome.com>;
> Ilya Maximets <i.maxim...@samsung.com>
> Subject: [PATCH] netdev: Dynamic per-port Flow API.
> 
> Current issues with Flow API:
> 
> * OVS calls offloading functions regardless of successful
>   flow API initialization. (ex. on init_flow_api failure)
> * Static initilaization of Flow API for a netdev_class forbids
>   having different offloading types for different instances
>   of netdev with the same netdev_class. (ex. different vports in
>   'system' and 'netdev' datapaths at the same time)
> 
> Solution:
> 
> * Move Flow API from the netdev_class to netdev instance.
> * Make Flow API dynamic, i.e. probe the APIs and choose the
>   suitable one.
> 

I suggest distinguishing between initialization and probing.
The probing you mention is implemented by testing device initialization: 
init_flow_api(). 
I suggest adding a new probe() API just for the sake of probing. It will check 
the netdev struct. 
I suggest  leaving the init_flow_api() API for HW/drivers initialization.
The HW/driver may fail at initialization, but it does not mean we need to probe 
for a new API in that case.

> Side effects:
> 
> * Flow API providers localized as possible in their modules.
> * Now we have an ability to make runtime checks. For example,
>   we could check if particular device supports features we
>   need, like if dpdk device supports RSS+MARK action.
> 
> Signed-off-by: Ilya Maximets <i.maxim...@samsung.com>
> ---
> 
> Since RFC:
>   * Fixed sparce build.
>   * Some logs turned from INFO to DBG.
>   * Enabled HW offloading on non-Linux systems
>     (For testing with dummy provider).
> 
>  lib/automake.mk               |   2 +-
>  lib/dpdk.c                    |   2 +
>  lib/netdev-dpdk.c             |  24 +++-
>  lib/netdev-dpdk.h             |   3 +
>  lib/netdev-dummy.c            |  24 +++-
>  lib/netdev-linux.c            |   3 -
>  lib/netdev-linux.h            |  10 --
>  lib/netdev-offload-provider.h |  99 +++++++++++++++
>  lib/netdev-provider.h         |  67 +----------
>  lib/netdev-rte-offloads.c     |  40 +++++-
>  lib/netdev-rte-offloads.h     |  41 +------
>  lib/netdev-tc-offloads.c      |  39 ++++--
>  lib/netdev-tc-offloads.h      |  44 -------
>  lib/netdev-vport.c            |   6 +-
>  lib/netdev.c                  | 221 +++++++++++++++++++++++++++-------
>  tests/dpif-netdev.at          |   4 +-
>  tests/ofproto-macros.at       |   1 -
>  17 files changed, 398 insertions(+), 232 deletions(-)  create mode 100644
> lib/netdev-offload-provider.h  delete mode 100644 lib/netdev-tc-offloads.h
> 
> diff --git a/lib/automake.mk b/lib/automake.mk index cc5dccf39..b9f3b69e7
> 100644
> --- a/lib/automake.mk
> +++ b/lib/automake.mk
> @@ -138,6 +138,7 @@ lib_libopenvswitch_la_SOURCES = \
>       lib/netdev-dpdk.h \
>       lib/netdev-dummy.c \
>       lib/netdev-provider.h \
> +     lib/netdev-offload-provider.h \

Please keep alphabetical order (switch between the last 2 lines)

>       lib/netdev-rte-offloads.h \
>       lib/netdev-vport.c \
>       lib/netdev-vport.h \
> @@ -393,7 +394,6 @@ lib_libopenvswitch_la_SOURCES += \
>       lib/netdev-linux.c \
>       lib/netdev-linux.h \
>       lib/netdev-tc-offloads.c \
> -     lib/netdev-tc-offloads.h \
>       lib/netlink-conntrack.c \
>       lib/netlink-conntrack.h \
>       lib/netlink-notifier.c \
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index dc6171546..6c6298635 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -34,6 +34,7 @@
>  #include "dirs.h"
>  #include "fatal-signal.h"
>  #include "netdev-dpdk.h"
> +#include "netdev-rte-offloads.h"
>  #include "openvswitch/dynamic-string.h"
>  #include "openvswitch/vlog.h"
>  #include "ovs-numa.h"
> @@ -442,6 +443,7 @@ dpdk_init__(const struct smap *ovs_other_config)
> 
>      /* Finally, register the dpdk classes */
>      netdev_dpdk_register();
> +    netdev_dpdk_flow_api_register();
>      return true;
>  }
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
> 47153dc60..c06c9ef81 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -4204,6 +4204,27 @@ unlock:
>      return err;
>  }
> 
> +bool
> +netdev_dpdk_flow_api_supported(struct netdev *netdev) {
> +    struct netdev_dpdk *dev;
> +    bool ret = false;
> +
> +    if (!is_dpdk_class(netdev->netdev_class)) {
> +        goto out;
> +    }
> +
> +    dev = netdev_dpdk_cast(netdev);
> +    ovs_mutex_lock(&dev->mutex);
> +    if (dev->type == DPDK_DEV_ETH) {
> +        /* TODO: Check if we able to offload some minimal flow. */
> +        ret = true;
> +    }
> +    ovs_mutex_unlock(&dev->mutex);
> +out:
> +    return ret;
> +}
> +
>  int
>  netdev_dpdk_rte_flow_destroy(struct netdev *netdev,
>                               struct rte_flow *rte_flow, @@ -4268,8 +4289,7 @@
> netdev_dpdk_rte_flow_create(struct netdev *netdev,
>      .get_features = netdev_dpdk_get_features,           \
>      .get_status = netdev_dpdk_get_status,               \
>      .reconfigure = netdev_dpdk_reconfigure,             \
> -    .rxq_recv = netdev_dpdk_rxq_recv,                   \
> -    DPDK_FLOW_OFFLOAD_API
> +    .rxq_recv = netdev_dpdk_rxq_recv
> 
>  static const struct netdev_class dpdk_class = {
>      .type = "dpdk",
> diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h index
> 9bbb8d8d6..60631c4f0 100644
> --- a/lib/netdev-dpdk.h
> +++ b/lib/netdev-dpdk.h
> @@ -34,6 +34,9 @@ struct rte_flow_action;
> 
>  void netdev_dpdk_register(void);
>  void free_dpdk_buf(struct dp_packet *);
> +
> +bool netdev_dpdk_flow_api_supported(struct netdev *);
> +
>  int
>  netdev_dpdk_rte_flow_destroy(struct netdev *netdev,
>                               struct rte_flow *rte_flow, diff --git 
> a/lib/netdev-dummy.c
> b/lib/netdev-dummy.c index 3f90ffa09..2e2e0c2ab 100644
> --- a/lib/netdev-dummy.c
> +++ b/lib/netdev-dummy.c
> @@ -24,6 +24,7 @@
>  #include "dp-packet.h"
>  #include "dpif-netdev.h"
>  #include "flow.h"
> +#include "netdev-offload-provider.h"
>  #include "netdev-provider.h"
>  #include "netdev-vport.h"
>  #include "odp-util.h"
> @@ -1523,10 +1524,6 @@ exit:
>      return error ? -1 : 0;
>  }
> 
> -#define DUMMY_FLOW_OFFLOAD_API                          \
> -    .flow_put = netdev_dummy_flow_put,                  \
> -    .flow_del = netdev_dummy_flow_del
> -
>  #define NETDEV_DUMMY_CLASS_COMMON                       \
>      .run = netdev_dummy_run,                            \
>      .wait = netdev_dummy_wait,                          \
> @@ -1559,8 +1556,7 @@ exit:
>      .rxq_dealloc = netdev_dummy_rxq_dealloc,            \
>      .rxq_recv = netdev_dummy_rxq_recv,                  \
>      .rxq_wait = netdev_dummy_rxq_wait,                  \
> -    .rxq_drain = netdev_dummy_rxq_drain,                \
> -    DUMMY_FLOW_OFFLOAD_API
> +    .rxq_drain = netdev_dummy_rxq_drain
> 
>  static const struct netdev_class dummy_class = {
>      NETDEV_DUMMY_CLASS_COMMON,
> @@ -1578,6 +1574,20 @@ static const struct netdev_class
> dummy_pmd_class = {
>      .is_pmd = true,
>      .reconfigure = netdev_dummy_reconfigure  };
> +
> +static int
> +netdev_dummy_offloads_init_flow_api(struct netdev *netdev) {
> +    return is_dummy_class(netdev->netdev_class) ? 0 : -1; }
> +
> +static const struct netdev_flow_api netdev_dummy_offloads = {
> +    .type = "dummy",
> +    .flow_put = netdev_dummy_flow_put,
> +    .flow_del = netdev_dummy_flow_del,
> +    .init_flow_api = netdev_dummy_offloads_init_flow_api,
> +};
> +
>  

>  /* Helper functions. */
> 
> @@ -2024,5 +2034,7 @@ netdev_dummy_register(enum dummy_level
> level)
>      netdev_register_provider(&dummy_internal_class);
>      netdev_register_provider(&dummy_pmd_class);
> 
> +    netdev_register_flow_api_provider(&netdev_dummy_offloads);
> +
>      netdev_vport_tunnel_register();
>  }
> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c index 
> f75d73fd3..e4ea94cf9
> 100644
> --- a/lib/netdev-linux.c
> +++ b/lib/netdev-linux.c
> @@ -55,7 +55,6 @@
>  #include "hash.h"
>  #include "openvswitch/hmap.h"
>  #include "netdev-provider.h"
> -#include "netdev-tc-offloads.h"
>  #include "netdev-vport.h"
>  #include "netlink-notifier.h"
>  #include "netlink-socket.h"
> @@ -3321,7 +3320,6 @@ exit:
> 
>  const struct netdev_class netdev_linux_class = {
>      NETDEV_LINUX_CLASS_COMMON,
> -    LINUX_FLOW_OFFLOAD_API,
>      .type = "system",

Please note that before this patch type="system" and type="internal" had 
LINUX_FLOW_OFFLOAD_API, but type="tap" did not.
With this patch type="tap" is going to be tested with init_flow_api(). If the 
test passes - type="tap" is going to have flow_api which it did not have before.
Was is verified that type="tap" will fail with init_flow_api()?

>      .construct = netdev_linux_construct,
>      .get_stats = netdev_linux_get_stats, @@ -3341,7 +3339,6 @@ const struct
> netdev_class netdev_tap_class = {
> 
>  const struct netdev_class netdev_internal_class = {
>      NETDEV_LINUX_CLASS_COMMON,
> -    LINUX_FLOW_OFFLOAD_API,
>      .type = "internal",
>      .construct = netdev_linux_construct,
>      .get_stats = netdev_internal_get_stats, diff --git a/lib/netdev-linux.h
... 
> +static int
> +netdev_assign_flow_api(struct netdev *netdev) {
> +    struct netdev_registered_flow_api *rfa;
> +
> +    CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
> +        if (!rfa->flow_api->init_flow_api(netdev)) {
> +            ovs_refcount_ref(&rfa->refcnt);
> +            ovsrcu_set(&netdev->flow_api, rfa->flow_api);
> +            VLOG_INFO("%s: Assigned flow API '%s'.",
> +                      netdev_get_name(netdev), rfa->flow_api->type);
> +            return 0;
> +        }

By this logic the first API which successfully passes init_flow_api() becomes 
the flow_api for this netdev.
Assuming there are 2 vport APIs: 1. system datapath vport API 2. netdev 
datapath vport API (planned to be added soon). 
Assuming both vport APIs successfully pass init_flow_api(), then the first one 
to be tested will always be chosen and the second one will never be used.
Are we sure that the system datapath vport will not always be chosen with the 
current netdev_tc_init_flow_api() implementation. 
How can we distinguish between a vport created under different datapaths?
This issue must be resolved for this patch.

> +        VLOG_DBG("%s: flow API '%s' is not suitable.",
> +                 netdev_get_name(netdev), rfa->flow_api->type);
> +    }
> +    VLOG_INFO("%s: No suitable flow API found.",
> + netdev_get_name(netdev));
> +
> +    return -1;

Please return an enumerated positive error.

> +}
> +
>  int
>  netdev_flow_flush(struct netdev *netdev)  {
> -    const struct netdev_class *class = netdev->netdev_class;

A few more comments.
1. This patch changes code related to OVS offload implementation. OVS offload 
must be confirmed with this patch before it is accepted.

2. We may suggest to offload "dpdkvhostclient" port type and "vxlan" (under 
netdev datapath)  port type. 
Please confirm the following understanding of the required steps. 
2.1 Need to register two new flow_apis for the two ports types.
2.2 Need to write the corresponding init_flow_api() to test the relevant netdev 
instances. 
2.3 The question how to distinguish between a vport under system datapath 
versus vport under netdev datapath is still an open question (would be happy to 
learn that it is already resolved).

3. Could you please add inline documentation for the newly added code? It seems 
the logic is spread is several places so any comments will be helpful.

Regards,
Ophir
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to