Hi, Please see my comments below.
On Thu, Jul 01, 2021 at 04:06:13PM +0100, Cian Ferriter wrote: > From: Harry van Haaren <harry.van.haa...@intel.com> > > This commit adds a new command to allow the user to switch > the active DPIF implementation at runtime. A probe function > is executed before switching the DPIF implementation, to ensure > the CPU is capable of running the ISA required. For example, the > below code will switch to the AVX512 enabled DPIF assuming > that the runtime CPU is capable of running AVX512 instructions: > > $ ovs-appctl dpif-netdev/dpif-impl-set dpif_avx512 > > A new configuration flag is added to allow selection of the > default DPIF. This is useful for running the unit-tests against > the available DPIF implementations, without modifying each unit test. > > The design of the testing & validation for ISA optimized DPIF > implementations is based around the work already upstream for DPCLS. > Note however that a DPCLS lookup has no state or side-effects, allowing > the auto-validator implementation to perform multiple lookups and > provide consistent statistic counters. > > The DPIF component does have state, so running two implementations in > parallel and comparing output is not a valid testing method, as there > are changes in DPIF statistic counters (side effects). As a result, the > DPIF is tested directly against the unit-tests. > > Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com> > Co-authored-by: Cian Ferriter <cian.ferri...@intel.com> > Signed-off-by: Cian Ferriter <cian.ferri...@intel.com> > > --- > > v14: > - Change command name to dpif-impl-set > - Fix the order of includes to what is layed out in the coding-style.rst > - Use bool not int to capture return value of dpdk_get_cpu_has_isa() > - Use an enum to index DPIF impls array. > - Hide more of the dpif impl details from lib/dpif-netdev.c. > - Fix comment on *dp_netdev_input_func() typedef. > - Rename dp_netdev_input_func func to input_func. > - Remove the datapath or dp argument from the dpif-impl-set CMD. > - Set the DPIF function pointer atomically. > > v13: > - Add Docs items about the switch DPIF command here rather than in > later commit. > - Document operation in manpages as well as rST. > - Minor code refactoring to address review comments. > --- > Documentation/topics/dpdk/bridge.rst | 34 ++++++++ > acinclude.m4 | 15 ++++ > configure.ac | 1 + > lib/automake.mk | 1 + > lib/dpif-netdev-avx512.c | 14 +++ > lib/dpif-netdev-private-dpif.c | 122 +++++++++++++++++++++++++++ > lib/dpif-netdev-private-dpif.h | 47 +++++++++++ > lib/dpif-netdev-private-thread.h | 10 --- > lib/dpif-netdev-unixctl.man | 3 + > lib/dpif-netdev.c | 74 ++++++++++++++-- > 10 files changed, 306 insertions(+), 15 deletions(-) > create mode 100644 lib/dpif-netdev-private-dpif.c > > diff --git a/Documentation/topics/dpdk/bridge.rst > b/Documentation/topics/dpdk/bridge.rst > index 526d5c959..06d1f943c 100644 > --- a/Documentation/topics/dpdk/bridge.rst > +++ b/Documentation/topics/dpdk/bridge.rst > @@ -214,3 +214,37 @@ implementation :: > > Compile OVS in debug mode to have `ovs_assert` statements error out if > there is a mis-match in the DPCLS lookup implementation. > + > +Datapath Interface Performance > +------------------------------ > + > +The datapath interface (DPIF) or dp_netdev_input() is responsible for taking > +packets through the major components of the userspace datapath; such as > +miniflow_extract, EMC, SMC and DPCLS lookups, and a lot of the performance > +stats associated with the datapath. > + > +Just like with the SIMD DPCLS feature above, SIMD can be applied to the DPIF > to > +improve performance. > + > +By default, dpif_scalar is used. The DPIF implementation can be selected by > +name :: > + > + $ ovs-appctl dpif-netdev/dpif-impl-set dpif_avx512 > + DPIF implementation set to dpif_avx512. > + > + $ ovs-appctl dpif-netdev/dpif-impl-set dpif_scalar > + DPIF implementation set to dpif_scalar. > + > +Running Unit Tests with AVX512 DPIF > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + > +Since the AVX512 DPIF is disabled by default, a compile time option is > +available in order to test it with the OVS unit test suite. When building > with > +a CPU that supports AVX512, use the following configure option :: > + > + $ ./configure --enable-dpif-default-avx512 > + > +The following line should be seen in the configure output when the above > option > +is used :: > + > + checking whether DPIF AVX512 is default implementation... yes > diff --git a/acinclude.m4 b/acinclude.m4 > index 15a54d636..5fbcd9872 100644 > --- a/acinclude.m4 > +++ b/acinclude.m4 > @@ -30,6 +30,21 @@ AC_DEFUN([OVS_CHECK_DPCLS_AUTOVALIDATOR], [ > fi > ]) > > +dnl Set OVS DPIF default implementation at configure time for running the > unit > +dnl tests on the whole codebase without modifying tests per DPIF impl > +AC_DEFUN([OVS_CHECK_DPIF_AVX512_DEFAULT], [ > + AC_ARG_ENABLE([dpif-default-avx512], > + [AC_HELP_STRING([--enable-dpif-default-avx512], [Enable DPIF > AVX512 implementation as default.])], > + [dpifavx512=yes],[dpifavx512=no]) > + AC_MSG_CHECKING([whether DPIF AVX512 is default implementation]) > + if test "$dpifavx512" != yes; then > + AC_MSG_RESULT([no]) > + else > + OVS_CFLAGS="$OVS_CFLAGS -DDPIF_AVX512_DEFAULT" > + AC_MSG_RESULT([yes]) > + fi > +]) > + > dnl OVS_ENABLE_WERROR > AC_DEFUN([OVS_ENABLE_WERROR], > [AC_ARG_ENABLE( > diff --git a/configure.ac b/configure.ac > index c077034d4..e45685a6c 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -185,6 +185,7 @@ OVS_ENABLE_WERROR > OVS_ENABLE_SPARSE > OVS_CTAGS_IDENTIFIERS > OVS_CHECK_DPCLS_AUTOVALIDATOR > +OVS_CHECK_DPIF_AVX512_DEFAULT > OVS_CHECK_BINUTILS_AVX512 > > AC_ARG_VAR(KARCH, [Kernel Architecture String]) > diff --git a/lib/automake.mk b/lib/automake.mk > index 660cd07f0..49f42c2a3 100644 > --- a/lib/automake.mk > +++ b/lib/automake.mk > @@ -116,6 +116,7 @@ lib_libopenvswitch_la_SOURCES = \ > lib/dpif-netdev-private-dfc.c \ > lib/dpif-netdev-private-dfc.h \ > lib/dpif-netdev-private-dpcls.h \ > + lib/dpif-netdev-private-dpif.c \ > lib/dpif-netdev-private-dpif.h \ > lib/dpif-netdev-private-flow.h \ > lib/dpif-netdev-private-hwol.h \ > diff --git a/lib/dpif-netdev-avx512.c b/lib/dpif-netdev-avx512.c > index d013fea1f..cb252617d 100644 > --- a/lib/dpif-netdev-avx512.c > +++ b/lib/dpif-netdev-avx512.c > @@ -24,6 +24,7 @@ > #include "dpif-netdev-perf.h" > #include "dpif-netdev-private.h" > > +#include <errno.h> > #include <immintrin.h> > > #include "dp-packet.h" > @@ -57,6 +58,19 @@ struct dpif_userdata { > struct pkt_flow_meta pkt_meta[NETDEV_MAX_BURST]; > }; > > +int32_t > +dp_netdev_input_outer_avx512_probe(void) > +{ > + bool avx512f_available = dpdk_get_cpu_has_isa("x86_64", "avx512f"); > + bool bmi2_available = dpdk_get_cpu_has_isa("x86_64", "bmi2"); > + > + if (!avx512f_available || !bmi2_available) { > + return -ENOTSUP; > + } > + > + return 0; > +} > + > int32_t > dp_netdev_input_outer_avx512(struct dp_netdev_pmd_thread *pmd, > struct dp_packet_batch *packets, > diff --git a/lib/dpif-netdev-private-dpif.c b/lib/dpif-netdev-private-dpif.c > new file mode 100644 > index 000000000..da3511f51 > --- /dev/null > +++ b/lib/dpif-netdev-private-dpif.c > @@ -0,0 +1,122 @@ > +/* > + * Copyright (c) 2021 Intel Corporation. > + * > + * Licensed under the Apache License, Version 2.0 (the "License"); > + * you may not use this file except in compliance with the License. > + * You may obtain a copy of the License at: > + * > + * http://www.apache.org/licenses/LICENSE-2.0 > + * > + * Unless required by applicable law or agreed to in writing, software > + * distributed under the License is distributed on an "AS IS" BASIS, > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > + * See the License for the specific language governing permissions and > + * limitations under the License. > + */ > + > +#include <config.h> > + > +#include "dpif-netdev-private-dpif.h" > +#include "dpif-netdev-private-thread.h" > + > +#include <errno.h> > +#include <string.h> > + > +#include "openvswitch/dynamic-string.h" > +#include "openvswitch/vlog.h" > +#include "util.h" > + > +VLOG_DEFINE_THIS_MODULE(dpif_netdev_impl); > + > +enum dpif_netdev_impl_info_idx { > + DPIF_NETDEV_IMPL_SCALAR, > + DPIF_NETDEV_IMPL_AVX512 > +}; > + > +/* Actual list of implementations goes here. */ > +static struct dpif_netdev_impl_info_t dpif_impls[] = { > + /* The default scalar C code implementation. */ > + [DPIF_NETDEV_IMPL_SCALAR] = { .input_func = dp_netdev_input, > + .probe = NULL, > + .name = "dpif_scalar", }, > + > +#if (__x86_64__ && HAVE_AVX512F && HAVE_LD_AVX512_GOOD && __SSE4_2__) > + /* Only available on x86_64 bit builds with SSE 4.2 used for OVS core. */ > + [DPIF_NETDEV_IMPL_AVX512] = { .input_func = dp_netdev_input_outer_avx512, > + .probe = dp_netdev_input_outer_avx512_probe, > + .name = "dpif_avx512", }, > +#endif > +}; > + > +static dp_netdev_input_func default_dpif_func; > + > +dp_netdev_input_func > +dp_netdev_impl_get_default(void) > +{ > + /* For the first call, this will be NULL. Compute the compile time > default. > + */ > + if (!default_dpif_func) { > + int dpif_idx = 0; That should be DPIF_NETDEV_IMPL_SCALAR. > + > +/* Configure-time overriding to run test suite on all implementations. */ > +#if (__x86_64__ && HAVE_AVX512F && HAVE_LD_AVX512_GOOD && __SSE4_2__) > +#ifdef DPIF_AVX512_DEFAULT > + ovs_assert(dpif_impls[DPIF_NETDEV_IMPL_AVX512].input_func > + == dp_netdev_input_outer_avx512); This assert() makes little sense now. It's not possible to change the dpif_impls at runtime, and if we change the code we will notice the problem only at runtime. Wouldn't it make more sense to make it generic like below? #ifdef DPIF_AVX512_DEFAULT dp_netdev_input_func_probe probe; /* Check if the compiled default is compatible. */ probe = dpif_impls[DPIF_NETDEV_IMPL_AVX512].probe; if (!probe || !probe()) { dpif_idx = DPIF_NETDEV_IMPL_AVX512; } #endif > + if (!dp_netdev_input_outer_avx512_probe()) { > + dpif_idx = DPIF_NETDEV_IMPL_AVX512; > + }; > +#endif > +#endif > + > + VLOG_INFO("Default DPIF implementation is %s.\n", > + dpif_impls[dpif_idx].name); > + default_dpif_func = dpif_impls[dpif_idx].input_func; > + } > + > + return default_dpif_func; > +} > + > +int32_t > +dp_netdev_impl_set_default_by_name(const char *name) > +{ > + dp_netdev_input_func new_default; > + > + int32_t err = dp_netdev_impl_get_by_name(name, &new_default); > + > + if (!err) { > + default_dpif_func = new_default; > + } > + > + return err; > + > +} > + > +/* This function checks all available DPIF implementations, and selects the > + * returns the function pointer to the one requested by "name". > + */ > +int32_t > +dp_netdev_impl_get_by_name(const char *name, dp_netdev_input_func *out_func) That one should be static and removed from the lib/dpif-netdev-private-dpif.h. > +{ > + ovs_assert(name); > + ovs_assert(out_func); > + > + uint32_t i; > + > + for (i = 0; i < ARRAY_SIZE(dpif_impls); i++) { > + if (strcmp(dpif_impls[i].name, name) == 0) { > + /* Probe function is optional - so check it is set before exec. > */ > + if (dpif_impls[i].probe) { > + int probe_err = dpif_impls[i].probe(); > + if (probe_err) { > + *out_func = NULL; > + return probe_err; > + } > + } > + *out_func = dpif_impls[i].input_func; > + return 0; > + } > + } > + > + return -EINVAL; > +} > diff --git a/lib/dpif-netdev-private-dpif.h b/lib/dpif-netdev-private-dpif.h > index bbd719b22..0e58153f4 100644 > --- a/lib/dpif-netdev-private-dpif.h > +++ b/lib/dpif-netdev-private-dpif.h > @@ -23,7 +23,54 @@ > struct dp_netdev_pmd_thread; > struct dp_packet_batch; > > +/* Typedef for DPIF functions. > + * Returns whether all packets were processed successfully. > + */ > +typedef int32_t (*dp_netdev_input_func)(struct dp_netdev_pmd_thread *pmd, > + struct dp_packet_batch *packets, > + odp_port_t port_no); > + > +/* Probe a DPIF implementation. This allows the implementation to validate > CPU > + * ISA availability. Returns -ENOTSUP if not available, returns 1 if valid to > + * use. Returns 0 if valid to use? > + */ > +typedef int32_t (*dp_netdev_input_func_probe)(void); > + > +/* Structure describing each available DPIF implementation. */ > +struct dpif_netdev_impl_info_t { > + /* Function pointer to execute to have this DPIF implementation run. */ > + dp_netdev_input_func input_func; > + /* Function pointer to execute to check the CPU ISA is available to run. > + * May be NULL, which implies that it is always valid to use. Please reword to make sure setting to NULL is required: /* Function pointer to execute to check the CPU ISA is available to * run. If not necessary, it must be set to NULL which implies that * it is always valid to use. */ > + */ > + dp_netdev_input_func_probe probe; > + /* Name used to select this DPIF implementation. */ > + const char *name; > +}; > + > +/* This function checks all available DPIF implementations, and selects the > + * returns the function pointer to the one requested by "name". > + */ > +int32_t > +dp_netdev_impl_get_by_name(const char *name, dp_netdev_input_func *out_func); That one doesn't need to be exposed as I mentioned before. > + > +/* Returns the default DPIF which is first ./configure selected, but can be > + * overridden at runtime. */ > +dp_netdev_input_func dp_netdev_impl_get_default(void); > + > +/* Overrides the default DPIF with the user set DPIF. */ > +int32_t dp_netdev_impl_set_default_by_name(const char *name); > + > /* Available DPIF implementations below. */ > +int32_t > +dp_netdev_input(struct dp_netdev_pmd_thread *pmd, > + struct dp_packet_batch *packets, > + odp_port_t in_port); > + > +/* AVX512 enabled DPIF implementation and probe functions. */ > +int32_t > +dp_netdev_input_outer_avx512_probe(void); > + > int32_t > dp_netdev_input_outer_avx512(struct dp_netdev_pmd_thread *pmd, > struct dp_packet_batch *packets, > diff --git a/lib/dpif-netdev-private-thread.h > b/lib/dpif-netdev-private-thread.h > index 63b99220b..ba79c4a0a 100644 > --- a/lib/dpif-netdev-private-thread.h > +++ b/lib/dpif-netdev-private-thread.h > @@ -50,16 +50,6 @@ struct dp_netdev_pmd_thread_ctx { > bool smc_enable_db; > }; > > -/* Forward declaration for typedef. */ > -struct dp_netdev_pmd_thread; > - > -/* Typedef for DPIF functions. > - * Returns a bitmask of packets to handle, possibly including upcall/misses. > - */ > -typedef int32_t (*dp_netdev_input_func)(struct dp_netdev_pmd_thread *pmd, > - struct dp_packet_batch *packets, > - odp_port_t port_no); > - > /* PMD: Poll modes drivers. PMD accesses devices via polling to eliminate > * the performance overhead of interrupt processing. Therefore netdev can > * not implement rx-wait for these devices. dpif-netdev needs to poll > diff --git a/lib/dpif-netdev-unixctl.man b/lib/dpif-netdev-unixctl.man > index 858d491df..76cc949f9 100644 > --- a/lib/dpif-netdev-unixctl.man > +++ b/lib/dpif-netdev-unixctl.man > @@ -226,3 +226,6 @@ recirculation (only in balance-tcp mode). > When this is the case, the above command prints the load-balancing > information > of the bonds configured in datapath \fIdp\fR showing the interface associated > with each bucket (hash). > +. > +.IP "\fBdpif-netdev/dpif-impl-set\fR \fIdpif_impl\fR" > +Sets the DPIF to be used to \fIdpif_impl\fR. By default "dpif_scalar" is > used. > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index e0c8f055d..19917c7c5 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -470,8 +470,6 @@ static void dp_netdev_execute_actions(struct > dp_netdev_pmd_thread *pmd, > const struct flow *flow, > const struct nlattr *actions, > size_t actions_len); > -static int32_t dp_netdev_input(struct dp_netdev_pmd_thread *, > - struct dp_packet_batch *, odp_port_t port_no); > static void dp_netdev_recirculate(struct dp_netdev_pmd_thread *, > struct dp_packet_batch *); > > @@ -982,6 +980,66 @@ dpif_netdev_subtable_lookup_set(struct unixctl_conn > *conn, int argc, > ds_destroy(&reply); > } > > +static void > +dpif_netdev_impl_set(struct unixctl_conn *conn, int argc OVS_UNUSED, > + const char *argv[], void *aux OVS_UNUSED) > +{ > + /* This function requires just one parameter, the DPIF name. */ > + const char *dpif_name = argv[1]; > + struct shash_node *node; > + > + static const char *error_description[2] = { > + "Unknown DPIF implementation", > + "CPU doesn't support the required instruction for", > + }; > + > + ovs_mutex_lock(&dp_netdev_mutex); > + int32_t err = dp_netdev_impl_set_default_by_name(dpif_name); > + > + if (err) { > + struct ds reply = DS_EMPTY_INITIALIZER; > + ds_put_format(&reply, "DPIF implementation not available: %s %s.\n", > + error_description[ (err == -ENOTSUP) ], dpif_name); > + const char *reply_str = ds_cstr(&reply); > + unixctl_command_reply(conn, reply_str); That should be unixctl_command_reply_error(conn, reply_str) > + VLOG_INFO("%s", reply_str); > + ds_destroy(&reply); > + ovs_mutex_unlock(&dp_netdev_mutex); > + return; > + } > + > + SHASH_FOR_EACH (node, &dp_netdevs) { > + struct dp_netdev *dp = node->data; > + > + /* Get PMD threads list, required to get DPCLS instances. */ > + size_t n; > + struct dp_netdev_pmd_thread **pmd_list; > + sorted_poll_thread_list(dp, &pmd_list, &n); > + > + for (size_t i = 0; i < n; i++) { > + struct dp_netdev_pmd_thread *pmd = pmd_list[i]; > + if (pmd->core_id == NON_PMD_CORE_ID) { > + continue; > + } > + > + /* Initialize DPIF function pointer to the newly configured > + * default. */ > + dp_netdev_input_func default_func = dp_netdev_impl_get_default(); > + atomic_uintptr_t *pmd_func = (void *) &pmd->netdev_input_func; > + atomic_store_relaxed(pmd_func, (uintptr_t) default_func); > + }; > + } > + ovs_mutex_unlock(&dp_netdev_mutex); > + > + /* Reply with success to command. */ > + struct ds reply = DS_EMPTY_INITIALIZER; > + ds_put_format(&reply, "DPIF implementation set to %s.\n", dpif_name); > + const char *reply_str = ds_cstr(&reply); > + unixctl_command_reply(conn, reply_str); > + VLOG_INFO("%s", reply_str); > + ds_destroy(&reply); > +} > + > static void > dpif_netdev_pmd_rebalance(struct unixctl_conn *conn, int argc, > const char *argv[], void *aux OVS_UNUSED) > @@ -1204,6 +1262,10 @@ dpif_netdev_init(void) > unixctl_command_register("dpif-netdev/subtable-lookup-prio-get", "", > 0, 0, dpif_netdev_subtable_lookup_get, > NULL); > + unixctl_command_register("dpif-netdev/dpif-impl-set", > + "dpif_implementation_name", > + 1, 1, dpif_netdev_impl_set, > + NULL); > return 0; > } > > @@ -6106,8 +6168,10 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread > *pmd, struct dp_netdev *dp, > hmap_init(&pmd->send_port_cache); > cmap_init(&pmd->tx_bonds); > > - /* Initialize the DPIF function pointer to the default scalar version. */ > - pmd->netdev_input_func = dp_netdev_input; > + /* Initialize DPIF function pointer to the default configured version. */ > + dp_netdev_input_func default_func = dp_netdev_impl_get_default(); > + atomic_uintptr_t *pmd_func = (void *) &pmd->netdev_input_func; > + atomic_init(pmd_func, (uintptr_t) default_func); > > /* init the 'flow_cache' since there is no > * actual thread created for NON_PMD_CORE_ID. */ > @@ -7078,7 +7142,7 @@ dp_netdev_input__(struct dp_netdev_pmd_thread *pmd, > } > } > > -static int32_t > +int32_t > dp_netdev_input(struct dp_netdev_pmd_thread *pmd, > struct dp_packet_batch *packets, > odp_port_t port_no) > -- > 2.32.0 > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev -- fbl _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev