On Mon, Jul 12, 2021 at 02:22:46PM +0200, Eelco Chaudron wrote:
> See some comments below…
> 
> For this patch series, I’m only looking at the diff from v6..v9, not a full 
> review.
> I will do basic compilation and some tests at the end.
> 
> Cheers,
> 
> Eelco
> 
> 
> On 12 Jul 2021, at 7:51, kumar Amber wrote:
> 
> > From: Kumar Amber <kumar.am...@intel.com>
> >
> > This patch introduces the MFEX function pointers which allows
> > the user to switch between different miniflow extract implementations
> > which are provided by the OVS based on optimized ISA CPU.
> >
> > The user can query for the available minflow extract variants available
> > for that CPU by following commands:
> >
> > $ovs-appctl dpif-netdev/miniflow-parser-get
> >
> > Similarly an user can set the miniflow implementation by the following
> > command :
> >
> > $ ovs-appctl dpif-netdev/miniflow-parser-set name
> >
> > This allows for more performance and flexibility to the user to choose
> > the miniflow implementation according to the needs.
> >
> > Signed-off-by: Kumar Amber <kumar.am...@intel.com>
> > Co-authored-by: Harry van Haaren <harry.van.haa...@intel.com>
> > Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com>
> >
> > ---
> > v9:
> > - fix review comments from Flavio
> > v7:
> > - fix review comments(Eelco, Flavio)
> > v5:
> > - fix review comments(Ian, Flavio, Eelco)
> > - add enum to hold mfex indexes
> > - add new get and set implemenatations
> > - add Atomic set and get
> > ---
> > ---
> >  NEWS                              |   1 +
> >  lib/automake.mk                   |   2 +
> >  lib/dpif-netdev-avx512.c          |  31 +++++-
> >  lib/dpif-netdev-private-extract.c | 162 ++++++++++++++++++++++++++++++
> >  lib/dpif-netdev-private-extract.h | 111 ++++++++++++++++++++
> >  lib/dpif-netdev-private-thread.h  |   8 ++
> >  lib/dpif-netdev.c                 | 105 +++++++++++++++++++
> >  7 files changed, 416 insertions(+), 4 deletions(-)
> >  create mode 100644 lib/dpif-netdev-private-extract.c
> >  create mode 100644 lib/dpif-netdev-private-extract.h
> >
> > diff --git a/NEWS b/NEWS
> > index 6cdccc715..b0f08e96d 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -32,6 +32,7 @@ Post-v2.15.0
> >       * Enable the AVX512 DPCLS implementation to use VPOPCNT instruction 
> > if the
> >         CPU supports it. This enhances performance by using the native 
> > vpopcount
> >         instructions, instead of the emulated version of vpopcount.
> > +     * Add command line option to switch between MFEX function pointers.
> >     - ovs-ctl:
> >       * New option '--no-record-hostname' to disable hostname configuration
> >         in ovsdb on startup.
> > diff --git a/lib/automake.mk b/lib/automake.mk
> > index 3c9523c1a..53b8abc0f 100644
> > --- a/lib/automake.mk
> > +++ b/lib/automake.mk
> > @@ -118,6 +118,8 @@ lib_libopenvswitch_la_SOURCES = \
> >     lib/dpif-netdev-private-dpcls.h \
> >     lib/dpif-netdev-private-dpif.c \
> >     lib/dpif-netdev-private-dpif.h \
> > +   lib/dpif-netdev-private-extract.c \
> > +   lib/dpif-netdev-private-extract.h \
> >     lib/dpif-netdev-private-flow.h \
> >     lib/dpif-netdev-private-thread.h \
> >     lib/dpif-netdev-private.h \
> > diff --git a/lib/dpif-netdev-avx512.c b/lib/dpif-netdev-avx512.c
> > index 6f9aa8284..7772b7abf 100644
> > --- a/lib/dpif-netdev-avx512.c
> > +++ b/lib/dpif-netdev-avx512.c
> > @@ -149,6 +149,15 @@ dp_netdev_input_outer_avx512(struct 
> > dp_netdev_pmd_thread *pmd,
> >       *     // do all processing (HWOL->MFEX->EMC->SMC)
> >       * }
> >       */
> > +
> > +    /* Do a batch minfilow extract into keys. */
> > +    uint32_t mf_mask = 0;
> > +    miniflow_extract_func mfex_func;
> > +    atomic_read_relaxed(&pmd->miniflow_extract_opt, &mfex_func);
> > +    if (mfex_func) {
> > +        mf_mask = mfex_func(packets, keys, batch_size, in_port, pmd);
> > +    }
> > +
> >      uint32_t lookup_pkts_bitmask = (1ULL << batch_size) - 1;
> >      uint32_t iter = lookup_pkts_bitmask;
> >      while (iter) {
> > @@ -167,6 +176,13 @@ dp_netdev_input_outer_avx512(struct 
> > dp_netdev_pmd_thread *pmd,
> >          pkt_metadata_init(&packet->md, in_port);
> >
> >          struct dp_netdev_flow *f = NULL;
> > +        struct netdev_flow_key *key = &keys[i];
> > +
> > +        /* Check the minfiflow mask to see if the packet was correctly
> > +         * classifed by vector mfex else do a scalar miniflow extract
> > +         * for that packet.
> > +         */
> > +        bool mfex_hit = !!(mf_mask & (1 << i));
> >
> >          /* Check for a partial hardware offload match. */
> >          if (hwol_enabled) {
> > @@ -177,7 +193,13 @@ dp_netdev_input_outer_avx512(struct 
> > dp_netdev_pmd_thread *pmd,
> >              }
> >              if (f) {
> >                  rules[i] = &f->cr;
> > -                pkt_meta[i].tcp_flags = parse_tcp_flags(packet);
> > +                /* If AVX512 MFEX already classified the packet, use it. */
> > +                if (mfex_hit) {
> > +                    pkt_meta[i].tcp_flags = 
> > miniflow_get_tcp_flags(&key->mf);
> > +                } else {
> > +                    pkt_meta[i].tcp_flags = parse_tcp_flags(packet);
> > +                }
> > +
> >                  pkt_meta[i].bytes = dp_packet_size(packet);
> >                  phwol_hits++;
> >                  hwol_emc_smc_hitmask |= (1 << i);
> > @@ -185,9 +207,10 @@ dp_netdev_input_outer_avx512(struct 
> > dp_netdev_pmd_thread *pmd,
> >              }
> >          }
> >
> > -        /* Do miniflow extract into keys. */
> > -        struct netdev_flow_key *key = &keys[i];
> > -        miniflow_extract(packet, &key->mf);
> > +        if (!mfex_hit) {
> > +            /* Do a scalar miniflow extract into keys. */
> > +            miniflow_extract(packet, &key->mf);
> > +        }
> >
> >          /* Cache TCP and byte values for all packets. */
> >          pkt_meta[i].bytes = dp_packet_size(packet);
> > diff --git a/lib/dpif-netdev-private-extract.c 
> > b/lib/dpif-netdev-private-extract.c
> > new file mode 100644
> > index 000000000..f476069a6
> > --- /dev/null
> > +++ b/lib/dpif-netdev-private-extract.c
> > @@ -0,0 +1,162 @@
> > +/*
> > + * Copyright (c) 2021 Intel.
> > + *
> > + * Licensed under the Apache License, Version 2.0 (the "License");
> > + * you may not use this file except in compliance with the License.
> > + * You may obtain a copy of the License at:
> > + *
> > + *     http://www.apache.org/licenses/LICENSE-2.0
> > + *
> > + * Unless required by applicable law or agreed to in writing, software
> > + * distributed under the License is distributed on an "AS IS" BASIS,
> > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> > + * See the License for the specific language governing permissions and
> > + * limitations under the License.
> > + */
> > +
> > +#include <config.h>
> > +#include <errno.h>
> > +#include <stdint.h>
> > +#include <string.h>
> > +
> > +#include "dp-packet.h"
> > +#include "dpif-netdev-private-dpcls.h"
> > +#include "dpif-netdev-private-extract.h"
> > +#include "dpif-netdev-private-thread.h"
> > +#include "flow.h"
> > +#include "openvswitch/vlog.h"
> > +#include "ovs-thread.h"
> > +#include "util.h"
> > +
> > +VLOG_DEFINE_THIS_MODULE(dpif_netdev_extract);
> > +
> > +/* Variable to hold the default MFEX implementation. */
> > +static miniflow_extract_func default_mfex_func = NULL;
> > +
> > +/* Implementations of available extract options and
> > + * the implementations are always in order of preference.
> > + */
> > +static struct dpif_miniflow_extract_impl mfex_impls[] = {
> > +
> > +    [MFEX_IMPL_SCALAR] = {
> > +        .probe = NULL,
> > +        .extract_func = NULL,
> > +        .name = "scalar", },
> > +};
> > +
> > +BUILD_ASSERT_DECL(MFEX_IMPL_MAX == ARRAY_SIZE(mfex_impls));
> > +
> > +void
> > +dpif_miniflow_extract_init(void)
> > +{
> > +    /* Call probe on each impl, and cache the result. */
> > +    for (int i = 0; i < MFEX_IMPL_MAX; i++) {
> > +        bool avail = true;
> > +        if (mfex_impls[i].probe) {
> > +            /* Return zero is success, non-zero means error. */
> > +            avail = (mfex_impls[i].probe() == 0);
> > +        }
> > +        VLOG_INFO("Miniflow Extract implementation %s (available: %s)\n",
> > +                  mfex_impls[i].name, avail ? "available" : "not 
> > available");
> > +        mfex_impls[i].available = avail;
> > +    }
> > +}
> > +
> > +miniflow_extract_func
> > +dp_mfex_impl_get_default(void)
> > +{
> > +    atomic_uintptr_t *mfex_func = (void *)&default_mfex_func;
> > +    static bool default_mfex_func_set = false;
> > +    int mfex_idx = MFEX_IMPL_SCALAR;
> > +
> > +    /* For the first call, this will be choosen based on the
> > +     * compile time flag and if nor flag is set it is set to
> > +     * default scalar.
> > +     */
> > +    if (OVS_UNLIKELY(!default_mfex_func_set)) {
> > +        VLOG_INFO("Default MFEX implementation is %s.\n",
> > +                  mfex_impls[mfex_idx].name);
> > +        atomic_store_relaxed(mfex_func, (uintptr_t) mfex_impls
> > +                             [mfex_idx].extract_func);
> > +        default_mfex_func_set = true;
> 
> If this only needs to be done once, why not move it to 
> dpif_miniflow_extract_init() as suggested during the v6 review (in a later 
> patch)?
> This will remove this check every time dp_mfex_impl_get_default() gets called.

Good point, +1.


> > +    }
> > +
> > +    return default_mfex_func;
> > +}
> > +
> > +int
> > +dp_mfex_impl_set_default_by_name(const char *name)
> > +{
> > +    miniflow_extract_func new_default;
> > +    atomic_uintptr_t *mfex_func = (void *)&default_mfex_func;
> > +
> > +    int err = dp_mfex_impl_get_by_name(name, &new_default);
> > +
> > +    if (!err) {
> > +        atomic_store_relaxed(mfex_func, (uintptr_t) new_default);
> > +    }
> > +
> > +    return err;
> > +
> > +}
> > +
> > +void
> > +dp_mfex_impl_get(struct ds *reply, struct dp_netdev_pmd_thread **pmd_list,
> > +                 size_t pmd_list_size)
> > +{
> > +    /* Add all MFEX functions to reply string. */
> > +    ds_put_cstr(reply, "Available MFEX implementations:\n");
> > +
> > +    for (int i = 0; i < MFEX_IMPL_MAX; i++) {
> > +        ds_put_format(reply, "  %s (available: %s pmds: ",
> > +                      mfex_impls[i].name, mfex_impls[i].available ?
> > +                      "True" : "False");
> 
> Flavio mentioned that True/False did not make sense to an end-user, not sure 
> if he has the same feeling here?
> Maybe yes/no make more sense here? Flavio?

We should strive to be as human friendly as possible in the logs.
It would be nice if that is possible with command outputs too, but
we already have a command like dpif-netdev/pmd-rxq-show showing
"true" or "false".

I have no strong opinion here.

fbl
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to