Hi,
I was reviewing the patch while testing and I can consistently loss 1Mpps (or more) on a P2P scenario with this flow table: ovs-ofctl add-flow br0 in_port=dpdk0,actions=output:dpdk1 TX: 14Mpps RX without patch: +12.6Mpps RX with patch: 11.67Mpps CPU: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz Perf diff: # Event 'cycles' # # Baseline Delta Abs Shared Object Symbol # ........ ......... .................. .............................................. # 8.32% -2.64% libc-2.28.so [.] __memcmp_avx2_movbe +2.04% ovs-vswitchd [.] dp_netdev_pmd_flush_output_packets.part.41 14.60% -1.78% ovs-vswitchd [.] mlx5_rx_burst_vec 2.78% +1.78% ovs-vswitchd [.] non_atomic_ullong_add 23.95% +1.60% ovs-vswitchd [.] miniflow_extract 2.02% +1.54% ovs-vswitchd [.] netdev_dpdk_rxq_recv 14.77% -0.82% ovs-vswitchd [.] dp_netdev_input__ 5.46% +0.79% ovs-vswitchd [.] mlx5_tx_burst_none_empw 2.79% -0.77% ovs-vswitchd [.] dp_netdev_pmd_flush_output_on_port 3.70% -0.58% ovs-vswitchd [.] dp_execute_output_action 3.34% -0.58% ovs-vswitchd [.] netdev_send 2.92% +0.41% ovs-vswitchd [.] dp_netdev_process_rxq_port 0.36% +0.39% ovs-vswitchd [.] netdev_dpdk_vhost_rxq_recv 4.25% +0.38% ovs-vswitchd [.] mlx5_tx_handle_completion.isra.49 0.82% +0.30% ovs-vswitchd [.] pmd_perf_end_iteration 1.53% -0.12% ovs-vswitchd [.] netdev_dpdk_filter_packet_len 0.53% -0.11% ovs-vswitchd [.] netdev_is_flow_api_enabled 0.54% -0.10% [vdso] [.] 0x00000000000009c0 1.72% +0.10% ovs-vswitchd [.] netdev_rxq_recv 0.61% +0.09% ovs-vswitchd [.] pmd_thread_main 0.08% +0.07% ovs-vswitchd [.] userspace_tso_enabled 0.45% -0.07% ovs-vswitchd [.] memcmp@plt 0.22% -0.05% ovs-vswitchd [.] dp_execute_cb On Thu, Jul 01, 2021 at 04:06:09PM +0100, Cian Ferriter wrote: > From: Harry van Haaren <harry.van.haa...@intel.com> > > Split the very large file dpif-netdev.c and the datastructures > it contains into multiple header files. Each header file is > responsible for the datastructures of that component. 
>
> This logical split allows better reuse and modularity of the code,
> and reduces the very large file dpif-netdev.c to be more manageable.
>
> Due to dependencies between components, it is not possible to
> move components at a smaller granularity than this patch does.
>
> To explain the dependencies better, eg:
>
> DPCLS has no deps (from dpif-netdev.c file)
> FLOW depends on DPCLS (struct dpcls_rule)
> DFC depends on DPCLS (netdev_flow_key) and FLOW (netdev_flow_key)
> THREAD depends on DFC (struct dfc_cache)
>
> DFC_PROC depends on THREAD (struct pmd_thread)
>
> DPCLS lookup.h/c require only DPCLS
> DPCLS implementations require only dpif-netdev-lookup.h.
> - This change was made in the 2.12 release with function pointers
> - This commit only refactors the name to "private-dpcls.h"
>
> netdev_flow_key_equal_mf() is renamed to emc_flow_key_equal_mf().
>
> Rename functions specific to dpcls from the netdev_* namespace to the
> dpcls_* namespace, as they are only used by dpcls code.
>
> 'inline' is added to dp_netdev_flow_hash() when its definition is
> moved, to fix a compiler error.
>
> One valid checkpatch issue with the use of the
> EMC_FOR_EACH_POS_WITH_HASH() macro was fixed.
>
> Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com>
> Co-authored-by: Cian Ferriter <cian.ferri...@intel.com>
> Signed-off-by: Cian Ferriter <cian.ferri...@intel.com>
>
> ---
>
> Cc: Gaetan Rivet <gaet...@nvidia.com>
> Cc: Sriharsha Basavapatna <sriharsha.basavapa...@broadcom.com>
>
> v14:
> - Make some functions in lib/dpif-netdev-private-dfc.c private as they
>   aren't used in other files.
> - Fix the order of includes to match what is laid out in
>   coding-style.rst
>
> v13:
> - Add NEWS item in this commit rather than later.
> - Add lib/dpif-netdev-private-dfc.c file and move non fast path dfc
>   related functions there.
> - Squash the commit which renames functions specific to dpcls from the
>   netdev_* namespace to the dpcls_* namespace (as they are only used
>   by dpcls code) into this commit.
> - Minor fixes from review comments.
> ---
>  NEWS                                   |   1 +
>  lib/automake.mk                        |   5 +
>  lib/dpif-netdev-lookup-autovalidator.c |   1 -
>  lib/dpif-netdev-lookup-avx512-gather.c |   1 -
>  lib/dpif-netdev-lookup-generic.c       |   1 -
>  lib/dpif-netdev-lookup.h               |   2 +-
>  lib/dpif-netdev-private-dfc.c          | 110 +++++
>  lib/dpif-netdev-private-dfc.h          | 164 ++++++++
>  lib/dpif-netdev-private-dpcls.h        | 128 ++++++
>  lib/dpif-netdev-private-flow.h         | 163 ++++++++
>  lib/dpif-netdev-private-thread.h       | 206 ++++++++++
>  lib/dpif-netdev-private.h              | 100 +----
>  lib/dpif-netdev.c                      | 539 +------------------------
>  13 files changed, 801 insertions(+), 620 deletions(-)
>  create mode 100644 lib/dpif-netdev-private-dfc.c
>  create mode 100644 lib/dpif-netdev-private-dfc.h
>  create mode 100644 lib/dpif-netdev-private-dpcls.h
>  create mode 100644 lib/dpif-netdev-private-flow.h
>  create mode 100644 lib/dpif-netdev-private-thread.h
>
> diff --git a/NEWS b/NEWS
> index f02f07cdf..bc64183d3 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -10,6 +10,7 @@ Post-v2.15.0
>       * Auto load balancing of PMDs now partially supports cross-NUMA polling
>         cases, e.g if all PMD threads are running on the same NUMA node.
>       * Userspace datapath now supports up to 2^18 meters.
> +     * Refactor lib/dpif-netdev.c to multiple header files.
>     - ovs-ctl:
>       * New option '--no-record-hostname' to disable hostname configuration
>         in ovsdb on startup.
> diff --git a/lib/automake.mk b/lib/automake.mk > index db9017591..fdba3c6c0 100644 > --- a/lib/automake.mk > +++ b/lib/automake.mk > @@ -111,6 +111,11 @@ lib_libopenvswitch_la_SOURCES = \ > lib/dpif-netdev-lookup-generic.c \ > lib/dpif-netdev.c \ > lib/dpif-netdev.h \ > + lib/dpif-netdev-private-dfc.c \ > + lib/dpif-netdev-private-dfc.h \ > + lib/dpif-netdev-private-dpcls.h \ > + lib/dpif-netdev-private-flow.h \ > + lib/dpif-netdev-private-thread.h \ > lib/dpif-netdev-private.h \ > lib/dpif-netdev-perf.c \ > lib/dpif-netdev-perf.h \ > diff --git a/lib/dpif-netdev-lookup-autovalidator.c > b/lib/dpif-netdev-lookup-autovalidator.c > index 97b59fdd0..475e1ab1e 100644 > --- a/lib/dpif-netdev-lookup-autovalidator.c > +++ b/lib/dpif-netdev-lookup-autovalidator.c > @@ -17,7 +17,6 @@ > #include <config.h> > #include "dpif-netdev.h" > #include "dpif-netdev-lookup.h" > -#include "dpif-netdev-private.h" > #include "openvswitch/vlog.h" > > VLOG_DEFINE_THIS_MODULE(dpif_lookup_autovalidator); > diff --git a/lib/dpif-netdev-lookup-avx512-gather.c > b/lib/dpif-netdev-lookup-avx512-gather.c > index 5e3634249..8fc1cdfa5 100644 > --- a/lib/dpif-netdev-lookup-avx512-gather.c > +++ b/lib/dpif-netdev-lookup-avx512-gather.c > @@ -21,7 +21,6 @@ > > #include "dpif-netdev.h" > #include "dpif-netdev-lookup.h" > -#include "dpif-netdev-private.h" > #include "cmap.h" > #include "flow.h" > #include "pvector.h" > diff --git a/lib/dpif-netdev-lookup-generic.c > b/lib/dpif-netdev-lookup-generic.c > index b1a0cfc36..e3b6be4b6 100644 > --- a/lib/dpif-netdev-lookup-generic.c > +++ b/lib/dpif-netdev-lookup-generic.c > @@ -17,7 +17,6 @@ > > #include <config.h> > #include "dpif-netdev.h" > -#include "dpif-netdev-private.h" > #include "dpif-netdev-lookup.h" > > #include "bitmap.h" > diff --git a/lib/dpif-netdev-lookup.h b/lib/dpif-netdev-lookup.h > index bd72aa29b..59f51faa0 100644 > --- a/lib/dpif-netdev-lookup.h > +++ b/lib/dpif-netdev-lookup.h > @@ -19,7 +19,7 @@ > > #include <config.h> > #include 
"dpif-netdev.h" > -#include "dpif-netdev-private.h" > +#include "dpif-netdev-private-dpcls.h" > > /* Function to perform a probe for the subtable bit fingerprint. > * Returns NULL if not valid, or a valid function pointer to call for this > diff --git a/lib/dpif-netdev-private-dfc.c b/lib/dpif-netdev-private-dfc.c > new file mode 100644 > index 000000000..1d53fafff > --- /dev/null > +++ b/lib/dpif-netdev-private-dfc.c > @@ -0,0 +1,110 @@ > +/* > + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2015 Nicira, Inc. > + * Copyright (c) 2019, 2020, 2021 Intel Corporation. > + * > + * Licensed under the Apache License, Version 2.0 (the "License"); > + * you may not use this file except in compliance with the License. > + * You may obtain a copy of the License at: > + * > + * http://www.apache.org/licenses/LICENSE-2.0 > + * > + * Unless required by applicable law or agreed to in writing, software > + * distributed under the License is distributed on an "AS IS" BASIS, > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > + * See the License for the specific language governing permissions and > + * limitations under the License. 
> + */ > + > + > +#include <config.h> > + > +#include "dpif-netdev-private-dfc.h" > + > +static void > +emc_clear_entry(struct emc_entry *ce) > +{ > + if (ce->flow) { > + dp_netdev_flow_unref(ce->flow); > + ce->flow = NULL; > + } > +} > + > +static void > +smc_clear_entry(struct smc_bucket *b, int idx) > +{ > + b->flow_idx[idx] = UINT16_MAX; > +} > + > +static void > +emc_cache_init(struct emc_cache *flow_cache) > +{ > + int i; > + > + flow_cache->sweep_idx = 0; > + for (i = 0; i < ARRAY_SIZE(flow_cache->entries); i++) { > + flow_cache->entries[i].flow = NULL; > + flow_cache->entries[i].key.hash = 0; > + flow_cache->entries[i].key.len = sizeof(struct miniflow); > + flowmap_init(&flow_cache->entries[i].key.mf.map); > + } > +} > + > +static void > +smc_cache_init(struct smc_cache *smc_cache) > +{ > + int i, j; > + for (i = 0; i < SMC_BUCKET_CNT; i++) { > + for (j = 0; j < SMC_ENTRY_PER_BUCKET; j++) { > + smc_cache->buckets[i].flow_idx[j] = UINT16_MAX; > + } > + } > +} > + > +void > +dfc_cache_init(struct dfc_cache *flow_cache) > +{ > + emc_cache_init(&flow_cache->emc_cache); > + smc_cache_init(&flow_cache->smc_cache); > +} > + > +static void > +emc_cache_uninit(struct emc_cache *flow_cache) > +{ > + int i; > + > + for (i = 0; i < ARRAY_SIZE(flow_cache->entries); i++) { > + emc_clear_entry(&flow_cache->entries[i]); > + } > +} > + > +static void > +smc_cache_uninit(struct smc_cache *smc) > +{ > + int i, j; > + > + for (i = 0; i < SMC_BUCKET_CNT; i++) { > + for (j = 0; j < SMC_ENTRY_PER_BUCKET; j++) { > + smc_clear_entry(&(smc->buckets[i]), j); > + } > + } > +} > + > +void > +dfc_cache_uninit(struct dfc_cache *flow_cache) > +{ > + smc_cache_uninit(&flow_cache->smc_cache); > + emc_cache_uninit(&flow_cache->emc_cache); > +} > + > +/* Check and clear dead flow references slowly (one entry at each > + * invocation). 
*/ > +void > +emc_cache_slow_sweep(struct emc_cache *flow_cache) > +{ > + struct emc_entry *entry = &flow_cache->entries[flow_cache->sweep_idx]; > + > + if (!emc_entry_alive(entry)) { > + emc_clear_entry(entry); > + } > + flow_cache->sweep_idx = (flow_cache->sweep_idx + 1) & EM_FLOW_HASH_MASK; > +} > diff --git a/lib/dpif-netdev-private-dfc.h b/lib/dpif-netdev-private-dfc.h > new file mode 100644 > index 000000000..6f1570355 > --- /dev/null > +++ b/lib/dpif-netdev-private-dfc.h > @@ -0,0 +1,164 @@ > +/* > + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2015 Nicira, Inc. > + * Copyright (c) 2019, 2020, 2021 Intel Corporation. > + * > + * Licensed under the Apache License, Version 2.0 (the "License"); > + * you may not use this file except in compliance with the License. > + * You may obtain a copy of the License at: > + * > + * http://www.apache.org/licenses/LICENSE-2.0 > + * > + * Unless required by applicable law or agreed to in writing, software > + * distributed under the License is distributed on an "AS IS" BASIS, > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > + * See the License for the specific language governing permissions and > + * limitations under the License. > + */ > + > +#ifndef DPIF_NETDEV_PRIVATE_DFC_H > +#define DPIF_NETDEV_PRIVATE_DFC_H 1 > + > +#include "dpif.h" > +#include "dpif-netdev-private-dpcls.h" > +#include "dpif-netdev-private-flow.h" > + > +#include <stdbool.h> > +#include <stdint.h> > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +/* EMC cache and SMC cache compose the datapath flow cache (DFC) > + * > + * Exact match cache for frequently used flows > + * > + * The cache uses a 32-bit hash of the packet (which can be the RSS hash) to > + * search its entries for a miniflow that matches exactly the miniflow of the > + * packet. It stores the 'dpcls_rule' (rule) that matches the miniflow. > + * > + * A cache entry holds a reference to its 'dp_netdev_flow'. 
> + * > + * A miniflow with a given hash can be in one of EM_FLOW_HASH_SEGS different > + * entries. The 32-bit hash is split into EM_FLOW_HASH_SEGS values (each of > + * them is EM_FLOW_HASH_SHIFT bits wide and the remainder is thrown away). > Each > + * value is the index of a cache entry where the miniflow could be. > + * > + * > + * Signature match cache (SMC) > + * > + * This cache stores a 16-bit signature for each flow without storing keys, > and > + * stores the corresponding 16-bit flow_table index to the 'dp_netdev_flow'. > + * Each flow thus occupies 32bit which is much more memory efficient than > EMC. > + * SMC uses a set-associative design that each bucket contains > + * SMC_ENTRY_PER_BUCKET number of entries. > + * Since 16-bit flow_table index is used, if there are more than 2^16 > + * dp_netdev_flow, SMC will miss them that cannot be indexed by a 16-bit > value. > + * > + * > + * Thread-safety > + * ============= > + * > + * Each pmd_thread has its own private exact match cache. > + * If dp_netdev_input is not called from a pmd thread, a mutex is used. > + */ > + > +#define EM_FLOW_HASH_SHIFT 13 > +#define EM_FLOW_HASH_ENTRIES (1u << EM_FLOW_HASH_SHIFT) > +#define EM_FLOW_HASH_MASK (EM_FLOW_HASH_ENTRIES - 1) > +#define EM_FLOW_HASH_SEGS 2 > + > +/* SMC uses a set-associative design. A bucket contains a set of entries that > + * a flow item can occupy. For now, it uses one hash function rather than two > + * as for the EMC design. */ > +#define SMC_ENTRY_PER_BUCKET 4 > +#define SMC_ENTRIES (1u << 20) > +#define SMC_BUCKET_CNT (SMC_ENTRIES / SMC_ENTRY_PER_BUCKET) > +#define SMC_MASK (SMC_BUCKET_CNT - 1) > + > +/* Default EMC insert probability is 1 / DEFAULT_EM_FLOW_INSERT_INV_PROB */ > +#define DEFAULT_EM_FLOW_INSERT_INV_PROB 100 > +#define DEFAULT_EM_FLOW_INSERT_MIN (UINT32_MAX / \ > + DEFAULT_EM_FLOW_INSERT_INV_PROB) > + > +struct emc_entry { > + struct dp_netdev_flow *flow; > + struct netdev_flow_key key; /* key.hash used for emc hash value. 
*/ > +}; > + > +struct emc_cache { > + struct emc_entry entries[EM_FLOW_HASH_ENTRIES]; > + int sweep_idx; /* For emc_cache_slow_sweep(). */ > +}; > + > +struct smc_bucket { > + uint16_t sig[SMC_ENTRY_PER_BUCKET]; > + uint16_t flow_idx[SMC_ENTRY_PER_BUCKET]; > +}; > + > +/* Signature match cache, differentiate from EMC cache */ > +struct smc_cache { > + struct smc_bucket buckets[SMC_BUCKET_CNT]; > +}; > + > +struct dfc_cache { > + struct emc_cache emc_cache; > + struct smc_cache smc_cache; > +}; > + > +/* Iterate in the exact match cache through every entry that might contain a > + * miniflow with hash 'HASH'. */ > +#define EMC_FOR_EACH_POS_WITH_HASH(EMC, CURRENT_ENTRY, HASH) > \ > + for (uint32_t i__ = 0, srch_hash__ = (HASH); > \ > + (CURRENT_ENTRY) = &(EMC)->entries[srch_hash__ & EM_FLOW_HASH_MASK], > \ > + i__ < EM_FLOW_HASH_SEGS; > \ > + i__++, srch_hash__ >>= EM_FLOW_HASH_SHIFT) > + > +void dfc_cache_init(struct dfc_cache *flow_cache); > + > +void dfc_cache_uninit(struct dfc_cache *flow_cache); > + > +/* Check and clear dead flow references slowly (one entry at each > + * invocation). */ > +void emc_cache_slow_sweep(struct emc_cache *flow_cache); > + > +static inline bool > +emc_entry_alive(struct emc_entry *ce) > +{ > + return ce->flow && !ce->flow->dead; > +} > + > +/* Used to compare 'netdev_flow_key' in the exact match cache to a miniflow. > + * The maps are compared bitwise, so both 'key->mf' and 'mf' must have been > + * generated by miniflow_extract. 
*/ > +static inline bool > +emc_flow_key_equal_mf(const struct netdev_flow_key *key, > + const struct miniflow *mf) > +{ > + return !memcmp(&key->mf, mf, key->len); > +} > + > +static inline struct dp_netdev_flow * > +emc_lookup(struct emc_cache *cache, const struct netdev_flow_key *key) > +{ > + struct emc_entry *current_entry; > + > + EMC_FOR_EACH_POS_WITH_HASH (cache, current_entry, key->hash) { > + if (current_entry->key.hash == key->hash > + && emc_entry_alive(current_entry) > + && emc_flow_key_equal_mf(¤t_entry->key, &key->mf)) { > + > + /* We found the entry with the 'key->mf' miniflow */ > + return current_entry->flow; > + } > + } > + > + return NULL; > +} > + > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* dpif-netdev-private-dfc.h */ > diff --git a/lib/dpif-netdev-private-dpcls.h b/lib/dpif-netdev-private-dpcls.h > new file mode 100644 > index 000000000..dc22431a3 > --- /dev/null > +++ b/lib/dpif-netdev-private-dpcls.h > @@ -0,0 +1,128 @@ > +/* > + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2015 Nicira, Inc. > + * Copyright (c) 2019, 2020, 2021 Intel Corporation. > + * > + * Licensed under the Apache License, Version 2.0 (the "License"); > + * you may not use this file except in compliance with the License. > + * You may obtain a copy of the License at: > + * > + * http://www.apache.org/licenses/LICENSE-2.0 > + * > + * Unless required by applicable law or agreed to in writing, software > + * distributed under the License is distributed on an "AS IS" BASIS, > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > + * See the License for the specific language governing permissions and > + * limitations under the License. 
> + */ > + > +#ifndef DPIF_NETDEV_PRIVATE_DPCLS_H > +#define DPIF_NETDEV_PRIVATE_DPCLS_H 1 > + > +#include "dpif.h" > + > +#include <stdbool.h> > +#include <stdint.h> > + > +#include "cmap.h" > +#include "openvswitch/thread.h" > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +/* Forward declaration for lookup_func typedef. */ > +struct dpcls_subtable; > +struct dpcls_rule; > + > +/* Must be public as it is instantiated in subtable struct below. */ > +struct netdev_flow_key { > + uint32_t hash; /* Hash function differs for different users. */ > + uint32_t len; /* Length of the following miniflow (incl. map). */ > + struct miniflow mf; > + uint64_t buf[FLOW_MAX_PACKET_U64S]; > +}; > + > +/* A rule to be inserted to the classifier. */ > +struct dpcls_rule { > + struct cmap_node cmap_node; /* Within struct dpcls_subtable 'rules'. */ > + struct netdev_flow_key *mask; /* Subtable's mask. */ > + struct netdev_flow_key flow; /* Matching key. */ > + /* 'flow' must be the last field, additional space is allocated here. */ > +}; > + > +/* Lookup function for a subtable in the dpcls. This function is called > + * by each subtable with an array of packets, and a bitmask of packets to > + * perform the lookup on. Using a function pointer gives flexibility to > + * optimize the lookup function based on subtable properties and the > + * CPU instruction set available at runtime. > + */ > +typedef > +uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, > + uint32_t keys_map, > + const struct netdev_flow_key *keys[], > + struct dpcls_rule **rules); > + > +/* A set of rules that all have the same fields wildcarded. */ > +struct dpcls_subtable { > + /* The fields are only used by writers. */ > + struct cmap_node cmap_node OVS_GUARDED; /* Within dpcls 'subtables_map'. > */ > + > + /* These fields are accessed by readers. */ > + struct cmap rules; /* Contains "struct dpcls_rule"s. 
*/ > + uint32_t hit_cnt; /* Number of match hits in subtable in > current > + optimization interval. */ > + > + /* Miniflow fingerprint that the subtable matches on. The miniflow "bits" > + * are used to select the actual dpcls lookup implementation at subtable > + * creation time. > + */ > + uint8_t mf_bits_set_unit0; > + uint8_t mf_bits_set_unit1; > + > + /* The lookup function to use for this subtable. If there is a known > + * property of the subtable (eg: only 3 bits of miniflow metadata is > + * used for the lookup) then this can point at an optimized version of > + * the lookup function for this particular subtable. */ > + dpcls_subtable_lookup_func lookup_func; > + > + /* Caches the masks to match a packet to, reducing runtime calculations. > */ > + uint64_t *mf_masks; > + > + struct netdev_flow_key mask; /* Wildcards for fields (const). */ > + /* 'mask' must be the last field, additional space is allocated here. */ > +}; > + > +/* Iterate through netdev_flow_key TNL u64 values specified by 'FLOWMAP'. */ > +#define NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP(VALUE, KEY, FLOWMAP) \ > + MINIFLOW_FOR_EACH_IN_FLOWMAP (VALUE, &(KEY)->mf, FLOWMAP) > + > +/* Generates a mask for each bit set in the subtable's miniflow. 
*/ > +void > +dpcls_flow_key_gen_masks(const struct netdev_flow_key *tbl, uint64_t > *mf_masks, > + const uint32_t mf_bits_u0, const uint32_t > mf_bits_u1); > + > +/* Matches a dpcls rule against the incoming packet in 'target' */ > +bool dpcls_rule_matches_key(const struct dpcls_rule *rule, > + const struct netdev_flow_key *target); > + > +static inline uint32_t > +dpif_netdev_packet_get_rss_hash_orig_pkt(struct dp_packet *packet, > + const struct miniflow *mf) > +{ > + uint32_t hash; > + > + if (OVS_LIKELY(dp_packet_rss_valid(packet))) { > + hash = dp_packet_get_rss_hash(packet); > + } else { > + hash = miniflow_hash_5tuple(mf, 0); > + dp_packet_set_rss_hash(packet, hash); > + } > + > + return hash; > +} > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* dpif-netdev-private-dpcls.h */ > diff --git a/lib/dpif-netdev-private-flow.h b/lib/dpif-netdev-private-flow.h > new file mode 100644 > index 000000000..303066067 > --- /dev/null > +++ b/lib/dpif-netdev-private-flow.h > @@ -0,0 +1,163 @@ > +/* > + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2015 Nicira, Inc. > + * Copyright (c) 2019, 2020, 2021 Intel Corporation. > + * > + * Licensed under the Apache License, Version 2.0 (the "License"); > + * you may not use this file except in compliance with the License. > + * You may obtain a copy of the License at: > + * > + * http://www.apache.org/licenses/LICENSE-2.0 > + * > + * Unless required by applicable law or agreed to in writing, software > + * distributed under the License is distributed on an "AS IS" BASIS, > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > + * See the License for the specific language governing permissions and > + * limitations under the License. 
> + */ > + > +#ifndef DPIF_NETDEV_PRIVATE_FLOW_H > +#define DPIF_NETDEV_PRIVATE_FLOW_H 1 > + > +#include "dpif.h" > +#include "dpif-netdev-private-dpcls.h" > + > +#include <stdbool.h> > +#include <stdint.h> > + > +#include "cmap.h" > +#include "openvswitch/thread.h" > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +/* Contained by struct dp_netdev_flow's 'stats' member. */ > +struct dp_netdev_flow_stats { > + atomic_llong used; /* Last used time, in monotonic msecs. */ > + atomic_ullong packet_count; /* Number of packets matched. */ > + atomic_ullong byte_count; /* Number of bytes matched. */ > + atomic_uint16_t tcp_flags; /* Bitwise-OR of seen tcp_flags values. */ > +}; > + > +/* Contained by struct dp_netdev_flow's 'last_attrs' member. */ > +struct dp_netdev_flow_attrs { > + atomic_bool offloaded; /* True if flow is offloaded to HW. */ > + ATOMIC(const char *) dp_layer; /* DP layer the flow is handled in. */ > +}; > + > +/* A flow in 'dp_netdev_pmd_thread's 'flow_table'. > + * > + * > + * Thread-safety > + * ============= > + * > + * Except near the beginning or ending of its lifespan, rule 'rule' belongs > to > + * its pmd thread's classifier. The text below calls this classifier 'cls'. > + * > + * Motivation > + * ---------- > + * > + * The thread safety rules described here for "struct dp_netdev_flow" are > + * motivated by two goals: > + * > + * - Prevent threads that read members of "struct dp_netdev_flow" from > + * reading bad data due to changes by some thread concurrently modifying > + * those members. > + * > + * - Prevent two threads making changes to members of a given "struct > + * dp_netdev_flow" from interfering with each other. > + * > + * > + * Rules > + * ----- > + * > + * A flow 'flow' may be accessed without a risk of being freed during an RCU > + * grace period. Code that needs to hold onto a flow for a while > + * should try incrementing 'flow->ref_cnt' with dp_netdev_flow_ref(). 
> + * > + * 'flow->ref_cnt' protects 'flow' from being freed. It doesn't protect the > + * flow from being deleted from 'cls' and it doesn't protect members of > 'flow' > + * from modification. > + * > + * Some members, marked 'const', are immutable. Accessing other members > + * requires synchronization, as noted in more detail below. > + */ > +struct dp_netdev_flow { > + const struct flow flow; /* Unmasked flow that created this entry. */ > + /* Hash table index by unmasked flow. */ > + const struct cmap_node node; /* In owning dp_netdev_pmd_thread's */ > + /* 'flow_table'. */ > + const struct cmap_node mark_node; /* In owning flow_mark's mark_to_flow > */ > + const ovs_u128 ufid; /* Unique flow identifier. */ > + const ovs_u128 mega_ufid; /* Unique mega flow identifier. */ > + const unsigned pmd_id; /* The 'core_id' of pmd thread owning this > */ > + /* flow. */ > + > + /* Number of references. > + * The classifier owns one reference. > + * Any thread trying to keep a rule from being freed should hold its own > + * reference. */ > + struct ovs_refcount ref_cnt; > + > + bool dead; > + uint32_t mark; /* Unique flow mark assigned to a flow */ > + > + /* Statistics. */ > + struct dp_netdev_flow_stats stats; > + > + /* Statistics and attributes received from the netdev offload provider. > */ > + atomic_int netdev_flow_get_result; > + struct dp_netdev_flow_stats last_stats; > + struct dp_netdev_flow_attrs last_attrs; > + > + /* Actions. */ > + OVSRCU_TYPE(struct dp_netdev_actions *) actions; > + > + /* While processing a group of input packets, the datapath uses the next > + * member to store a pointer to the output batch for the flow. It is > + * reset after the batch has been sent out (See > dp_netdev_queue_batches(), > + * packet_batch_per_flow_init() and packet_batch_per_flow_execute()). */ > + struct packet_batch_per_flow *batch; > + > + /* Packet classification. */ > + char *dp_extra_info; /* String to return in a flow dump/get. 
*/ > + struct dpcls_rule cr; /* In owning dp_netdev's 'cls'. */ > + /* 'cr' must be the last member. */ > +}; > + > +static inline uint32_t > +dp_netdev_flow_hash(const ovs_u128 *ufid) > +{ > + return ufid->u32[0]; > +} > + > +/* Given the number of bits set in miniflow's maps, returns the size of the > + * 'netdev_flow_key.mf' */ > +static inline size_t > +netdev_flow_key_size(size_t flow_u64s) > +{ > + return sizeof(struct miniflow) + MINIFLOW_VALUES_SIZE(flow_u64s); > +} > + > +/* forward declaration required for EMC to unref flows */ > +void dp_netdev_flow_unref(struct dp_netdev_flow *); > + > +/* A set of datapath actions within a "struct dp_netdev_flow". > + * > + * > + * Thread-safety > + * ============= > + * > + * A struct dp_netdev_actions 'actions' is protected with RCU. */ > +struct dp_netdev_actions { > + /* These members are immutable: they do not change during the struct's > + * lifetime. */ > + unsigned int size; /* Size of 'actions', in bytes. */ > + struct nlattr actions[]; /* Sequence of OVS_ACTION_ATTR_* attributes. > */ > +}; > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* dpif-netdev-private-flow.h */ > diff --git a/lib/dpif-netdev-private-thread.h > b/lib/dpif-netdev-private-thread.h > new file mode 100644 > index 000000000..91f3753d1 > --- /dev/null > +++ b/lib/dpif-netdev-private-thread.h > @@ -0,0 +1,206 @@ > +/* > + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2015 Nicira, Inc. > + * Copyright (c) 2019, 2020, 2021 Intel Corporation. > + * > + * Licensed under the Apache License, Version 2.0 (the "License"); > + * you may not use this file except in compliance with the License. > + * You may obtain a copy of the License at: > + * > + * http://www.apache.org/licenses/LICENSE-2.0 > + * > + * Unless required by applicable law or agreed to in writing, software > + * distributed under the License is distributed on an "AS IS" BASIS, > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
> + * See the License for the specific language governing permissions and > + * limitations under the License. > + */ > + > +#ifndef DPIF_NETDEV_PRIVATE_THREAD_H > +#define DPIF_NETDEV_PRIVATE_THREAD_H 1 > + > +#include "dpif.h" > +#include "dpif-netdev-perf.h" > +#include "dpif-netdev-private-dfc.h" > + > +#include <stdbool.h> > +#include <stdint.h> > + > +#include "cmap.h" > +#include "openvswitch/thread.h" > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +/* PMD Thread Structures */ > + > +/* A set of properties for the current processing loop that is not directly > + * associated with the pmd thread itself, but with the packets being > + * processed or the short-term system configuration (for example, time). > + * Contained by struct dp_netdev_pmd_thread's 'ctx' member. */ > +struct dp_netdev_pmd_thread_ctx { > + /* Latest measured time. See 'pmd_thread_ctx_time_update()'. */ > + long long now; > + /* RX queue from which last packet was received. */ > + struct dp_netdev_rxq *last_rxq; > + /* EMC insertion probability context for the current processing cycle. */ > + uint32_t emc_insert_min; > +}; > + > +/* PMD: Poll modes drivers. PMD accesses devices via polling to eliminate > + * the performance overhead of interrupt processing. Therefore netdev can > + * not implement rx-wait for these devices. dpif-netdev needs to poll > + * these device to check for recv buffer. pmd-thread does polling for > + * devices assigned to itself. > + * > + * DPDK used PMD for accessing NIC. > + * > + * Note, instance with cpu core id NON_PMD_CORE_ID will be reserved for > + * I/O of all non-pmd threads. There will be no actual thread created > + * for the instance. > + * > + * Each struct has its own flow cache and classifier per managed ingress > port. > + * For packets received on ingress port, a look up is done on corresponding > PMD > + * thread's flow cache and in case of a miss, lookup is performed in the > + * corresponding classifier of port. 
Packets are executed with the found > + * actions in either case. > + * */ > +struct dp_netdev_pmd_thread { > + struct dp_netdev *dp; > + struct ovs_refcount ref_cnt; /* Every reference must be refcount'ed. > */ > + struct cmap_node node; /* In 'dp->poll_threads'. */ > + > + /* Per thread exact-match cache. Note, the instance for cpu core > + * NON_PMD_CORE_ID can be accessed by multiple threads, and thusly > + * need to be protected by 'non_pmd_mutex'. Every other instance > + * will only be accessed by its own pmd thread. */ > + OVS_ALIGNED_VAR(CACHE_LINE_SIZE) struct dfc_cache flow_cache; > + > + /* Flow-Table and classifiers > + * > + * Writers of 'flow_table' must take the 'flow_mutex'. Corresponding > + * changes to 'classifiers' must be made while still holding the > + * 'flow_mutex'. > + */ > + struct ovs_mutex flow_mutex; > + struct cmap flow_table OVS_GUARDED; /* Flow table. */ > + > + /* One classifier per in_port polled by the pmd */ > + struct cmap classifiers; > + /* Periodically sort subtable vectors according to hit frequencies */ > + long long int next_optimization; > + /* End of the next time interval for which processing cycles > + are stored for each polled rxq. */ > + long long int rxq_next_cycle_store; > + > + /* Last interval timestamp. */ > + uint64_t intrvl_tsc_prev; > + /* Last interval cycles. */ > + atomic_ullong intrvl_cycles; > + > + /* Current context of the PMD thread. */ > + struct dp_netdev_pmd_thread_ctx ctx; > + > + struct seq *reload_seq; > + uint64_t last_reload_seq; > + > + /* These are atomic variables used as a synchronization and configuration > + * points for thread reload/exit. > + * > + * 'reload' atomic is the main one and it's used as a memory > + * synchronization point for all other knobs and data. > + * > + * For a thread that requests PMD reload: > + * > + * * All changes that should be visible to the PMD thread must be made > + * before setting the 'reload'. 
These changes could use any memory > + * ordering model including 'relaxed'. > + * * Setting the 'reload' atomic should occur in the same thread where > + * all other PMD configuration options updated. > + * * Setting the 'reload' atomic should be done with 'release' memory > + * ordering model or stricter. This will guarantee that all previous > + * changes (including non-atomic and 'relaxed') will be visible to > + * the PMD thread. > + * * To check that reload is done, thread should poll the 'reload' > atomic > + * to become 'false'. Polling should be done with 'acquire' memory > + * ordering model or stricter. This ensures that PMD thread > completed > + * the reload process. > + * > + * For the PMD thread: > + * > + * * PMD thread should read 'reload' atomic with 'acquire' memory > + * ordering model or stricter. This will guarantee that all changes > + * made before setting the 'reload' in the requesting thread will be > + * visible to the PMD thread. > + * * All other configuration data could be read with any memory > + * ordering model (including non-atomic and 'relaxed') but *only > after* > + * reading the 'reload' atomic set to 'true'. > + * * When the PMD reload done, PMD should (optionally) set all the > below > + * knobs except the 'reload' to their default ('false') values and > + * (mandatory), as the last step, set the 'reload' to 'false' using > + * 'release' memory ordering model or stricter. This will inform the > + * requesting thread that PMD has completed a reload cycle. > + */ > + atomic_bool reload; /* Do we need to reload ports? */ > + atomic_bool wait_for_reload; /* Can we busy wait for the next reload? > */ > + atomic_bool reload_tx_qid; /* Do we need to reload static_tx_qid? */ > + atomic_bool exit; /* For terminating the pmd thread. */ > + > + pthread_t thread; > + unsigned core_id; /* CPU core id of this pmd thread. */ > + int numa_id; /* numa node id of this pmd thread. 
*/ > + bool isolated; > + > + /* Queue id used by this pmd thread to send packets on all netdevs if > + * XPS disabled for this netdev. All static_tx_qid's are unique and less > + * than 'cmap_count(dp->poll_threads)'. */ > + uint32_t static_tx_qid; > + > + /* Number of filled output batches. */ > + int n_output_batches; > + > + struct ovs_mutex port_mutex; /* Mutex for 'poll_list' and 'tx_ports'. > */ > + /* List of rx queues to poll. */ > + struct hmap poll_list OVS_GUARDED; > + /* Map of 'tx_port's used for transmission. Written by the main thread, > + * read by the pmd thread. */ > + struct hmap tx_ports OVS_GUARDED; > + > + struct ovs_mutex bond_mutex; /* Protects updates of 'tx_bonds'. */ > + /* Map of 'tx_bond's used for transmission. Written by the main thread > + * and read by the pmd thread. */ > + struct cmap tx_bonds; > + > + /* These are thread-local copies of 'tx_ports'. One contains only tunnel > + * ports (that support push_tunnel/pop_tunnel), the other contains ports > + * with at least one txq (that support send). A port can be in both. > + * > + * There are two separate maps to make sure that we don't try to execute > + * OUTPUT on a device which has 0 txqs or PUSH/POP on a non-tunnel > device. > + * > + * The instances for cpu core NON_PMD_CORE_ID can be accessed by multiple > + * threads, and thusly need to be protected by 'non_pmd_mutex'. Every > + * other instance will only be accessed by its own pmd thread. */ > + struct hmap tnl_port_cache; > + struct hmap send_port_cache; > + > + /* Keep track of detailed PMD performance statistics. */ > + struct pmd_perf_stats perf_stats; > + > + /* Stats from previous iteration used by automatic pmd > + * load balance logic. */ > + uint64_t prev_stats[PMD_N_STATS]; > + atomic_count pmd_overloaded; > + > + /* Set to true if the pmd thread needs to be reloaded. */ > + bool need_reload; > + > + /* Next time when PMD should try RCU quiescing. 
*/ > + long long next_rcu_quiesce; > +}; > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* dpif-netdev-private-thread.h */ > diff --git a/lib/dpif-netdev-private.h b/lib/dpif-netdev-private.h > index 4fda1220b..d7b6fd7ec 100644 > --- a/lib/dpif-netdev-private.h > +++ b/lib/dpif-netdev-private.h > @@ -18,95 +18,17 @@ > #ifndef DPIF_NETDEV_PRIVATE_H > #define DPIF_NETDEV_PRIVATE_H 1 > > -#include <stdbool.h> > -#include <stdint.h> > - > -#include "dpif.h" > -#include "cmap.h" > - > -#ifdef __cplusplus > -extern "C" { > -#endif > - > -/* Forward declaration for lookup_func typedef. */ > -struct dpcls_subtable; > -struct dpcls_rule; > - > -/* Must be public as it is instantiated in subtable struct below. */ > -struct netdev_flow_key { > - uint32_t hash; /* Hash function differs for different users. */ > - uint32_t len; /* Length of the following miniflow (incl. map). */ > - struct miniflow mf; > - uint64_t buf[FLOW_MAX_PACKET_U64S]; > -}; > - > -/* A rule to be inserted to the classifier. */ > -struct dpcls_rule { > - struct cmap_node cmap_node; /* Within struct dpcls_subtable 'rules'. */ > - struct netdev_flow_key *mask; /* Subtable's mask. */ > - struct netdev_flow_key flow; /* Matching key. */ > - /* 'flow' must be the last field, additional space is allocated here. */ > -}; > - > -/* Lookup function for a subtable in the dpcls. This function is called > - * by each subtable with an array of packets, and a bitmask of packets to > - * perform the lookup on. Using a function pointer gives flexibility to > - * optimize the lookup function based on subtable properties and the > - * CPU instruction set available at runtime. > +/* This header includes the various dpif-netdev components' header > + * files in the appropriate order. Unfortunately there is a strict > + * requirement in the include order due to dependences between components. 
> + * E.g: > + * DFC/EMC/SMC requires the netdev_flow_key struct > + * PMD thread requires DFC_flow struct > + * > */ > -typedef > -uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, > - uint32_t keys_map, > - const struct netdev_flow_key *keys[], > - struct dpcls_rule **rules); > - > -/* A set of rules that all have the same fields wildcarded. */ > -struct dpcls_subtable { > - /* The fields are only used by writers. */ > - struct cmap_node cmap_node OVS_GUARDED; /* Within dpcls 'subtables_map'. > */ > - > - /* These fields are accessed by readers. */ > - struct cmap rules; /* Contains "struct dpcls_rule"s. */ > - uint32_t hit_cnt; /* Number of match hits in subtable in > current > - optimization interval. */ > - > - /* Miniflow fingerprint that the subtable matches on. The miniflow "bits" > - * are used to select the actual dpcls lookup implementation at subtable > - * creation time. > - */ > - uint8_t mf_bits_set_unit0; > - uint8_t mf_bits_set_unit1; > - > - /* The lookup function to use for this subtable. If there is a known > - * property of the subtable (eg: only 3 bits of miniflow metadata is > - * used for the lookup) then this can point at an optimized version of > - * the lookup function for this particular subtable. */ > - dpcls_subtable_lookup_func lookup_func; > - > - /* Caches the masks to match a packet to, reducing runtime calculations. > */ > - uint64_t *mf_masks; > - > - struct netdev_flow_key mask; /* Wildcards for fields (const). */ > - /* 'mask' must be the last field, additional space is allocated here. */ > -}; > - > -/* Iterate through netdev_flow_key TNL u64 values specified by 'FLOWMAP'. */ > -#define NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP(VALUE, KEY, FLOWMAP) \ > - MINIFLOW_FOR_EACH_IN_FLOWMAP (VALUE, &(KEY)->mf, FLOWMAP) > - > -/* Generates a mask for each bit set in the subtable's miniflow. 
*/ > -void > -netdev_flow_key_gen_masks(const struct netdev_flow_key *tbl, > - uint64_t *mf_masks, > - const uint32_t mf_bits_u0, > - const uint32_t mf_bits_u1); > - > -/* Matches a dpcls rule against the incoming packet in 'target' */ > -bool dpcls_rule_matches_key(const struct dpcls_rule *rule, > - const struct netdev_flow_key *target); > - > -#ifdef __cplusplus > -} > -#endif > +#include "dpif-netdev-private-flow.h" > +#include "dpif-netdev-private-dpcls.h" > +#include "dpif-netdev-private-dfc.h" > +#include "dpif-netdev-private-thread.h" > > #endif /* netdev-private.h */ > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index c5ab35d2a..2e29980c5 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -17,6 +17,7 @@ > #include <config.h> > #include "dpif-netdev.h" > #include "dpif-netdev-private.h" > +#include "dpif-netdev-private-dfc.h" > > #include <ctype.h> > #include <errno.h> > @@ -142,90 +143,6 @@ static struct odp_support dp_netdev_support = { > .ct_orig_tuple6 = true, > }; > > -/* EMC cache and SMC cache compose the datapath flow cache (DFC) > - * > - * Exact match cache for frequently used flows > - * > - * The cache uses a 32-bit hash of the packet (which can be the RSS hash) to > - * search its entries for a miniflow that matches exactly the miniflow of the > - * packet. It stores the 'dpcls_rule' (rule) that matches the miniflow. > - * > - * A cache entry holds a reference to its 'dp_netdev_flow'. > - * > - * A miniflow with a given hash can be in one of EM_FLOW_HASH_SEGS different > - * entries. The 32-bit hash is split into EM_FLOW_HASH_SEGS values (each of > - * them is EM_FLOW_HASH_SHIFT bits wide and the remainder is thrown away). > Each > - * value is the index of a cache entry where the miniflow could be. > - * > - * > - * Signature match cache (SMC) > - * > - * This cache stores a 16-bit signature for each flow without storing keys, > and > - * stores the corresponding 16-bit flow_table index to the 'dp_netdev_flow'. 
> - * Each flow thus occupies 32bit which is much more memory efficient than > EMC. > - * SMC uses a set-associative design that each bucket contains > - * SMC_ENTRY_PER_BUCKET number of entries. > - * Since 16-bit flow_table index is used, if there are more than 2^16 > - * dp_netdev_flow, SMC will miss them that cannot be indexed by a 16-bit > value. > - * > - * > - * Thread-safety > - * ============= > - * > - * Each pmd_thread has its own private exact match cache. > - * If dp_netdev_input is not called from a pmd thread, a mutex is used. > - */ > - > -#define EM_FLOW_HASH_SHIFT 13 > -#define EM_FLOW_HASH_ENTRIES (1u << EM_FLOW_HASH_SHIFT) > -#define EM_FLOW_HASH_MASK (EM_FLOW_HASH_ENTRIES - 1) > -#define EM_FLOW_HASH_SEGS 2 > - > -/* SMC uses a set-associative design. A bucket contains a set of entries that > - * a flow item can occupy. For now, it uses one hash function rather than two > - * as for the EMC design. */ > -#define SMC_ENTRY_PER_BUCKET 4 > -#define SMC_ENTRIES (1u << 20) > -#define SMC_BUCKET_CNT (SMC_ENTRIES / SMC_ENTRY_PER_BUCKET) > -#define SMC_MASK (SMC_BUCKET_CNT - 1) > - > -/* Default EMC insert probability is 1 / DEFAULT_EM_FLOW_INSERT_INV_PROB */ > -#define DEFAULT_EM_FLOW_INSERT_INV_PROB 100 > -#define DEFAULT_EM_FLOW_INSERT_MIN (UINT32_MAX / \ > - DEFAULT_EM_FLOW_INSERT_INV_PROB) > - > -struct emc_entry { > - struct dp_netdev_flow *flow; > - struct netdev_flow_key key; /* key.hash used for emc hash value. */ > -}; > - > -struct emc_cache { > - struct emc_entry entries[EM_FLOW_HASH_ENTRIES]; > - int sweep_idx; /* For emc_cache_slow_sweep(). 
*/ > -}; > - > -struct smc_bucket { > - uint16_t sig[SMC_ENTRY_PER_BUCKET]; > - uint16_t flow_idx[SMC_ENTRY_PER_BUCKET]; > -}; > - > -/* Signature match cache, differentiate from EMC cache */ > -struct smc_cache { > - struct smc_bucket buckets[SMC_BUCKET_CNT]; > -}; > - > -struct dfc_cache { > - struct emc_cache emc_cache; > - struct smc_cache smc_cache; > -}; > - > -/* Iterate in the exact match cache through every entry that might contain a > - * miniflow with hash 'HASH'. */ > -#define EMC_FOR_EACH_POS_WITH_HASH(EMC, CURRENT_ENTRY, HASH) > \ > - for (uint32_t i__ = 0, srch_hash__ = (HASH); > \ > - (CURRENT_ENTRY) = &(EMC)->entries[srch_hash__ & EM_FLOW_HASH_MASK], > \ > - i__ < EM_FLOW_HASH_SEGS; > \ > - i__++, srch_hash__ >>= EM_FLOW_HASH_SHIFT) > > /* Simple non-wildcarding single-priority classifier. */ > > @@ -478,119 +395,10 @@ struct dp_netdev_port { > char *rxq_affinity_list; /* Requested affinity of rx queues. */ > }; > > -/* Contained by struct dp_netdev_flow's 'stats' member. */ > -struct dp_netdev_flow_stats { > - atomic_llong used; /* Last used time, in monotonic msecs. */ > - atomic_ullong packet_count; /* Number of packets matched. */ > - atomic_ullong byte_count; /* Number of bytes matched. */ > - atomic_uint16_t tcp_flags; /* Bitwise-OR of seen tcp_flags values. */ > -}; > - > -/* Contained by struct dp_netdev_flow's 'last_attrs' member. */ > -struct dp_netdev_flow_attrs { > - atomic_bool offloaded; /* True if flow is offloaded to HW. */ > - ATOMIC(const char *) dp_layer; /* DP layer the flow is handled in. */ > -}; > - > -/* A flow in 'dp_netdev_pmd_thread's 'flow_table'. > - * > - * > - * Thread-safety > - * ============= > - * > - * Except near the beginning or ending of its lifespan, rule 'rule' belongs > to > - * its pmd thread's classifier. The text below calls this classifier 'cls'. 
> - * > - * Motivation > - * ---------- > - * > - * The thread safety rules described here for "struct dp_netdev_flow" are > - * motivated by two goals: > - * > - * - Prevent threads that read members of "struct dp_netdev_flow" from > - * reading bad data due to changes by some thread concurrently modifying > - * those members. > - * > - * - Prevent two threads making changes to members of a given "struct > - * dp_netdev_flow" from interfering with each other. > - * > - * > - * Rules > - * ----- > - * > - * A flow 'flow' may be accessed without a risk of being freed during an RCU > - * grace period. Code that needs to hold onto a flow for a while > - * should try incrementing 'flow->ref_cnt' with dp_netdev_flow_ref(). > - * > - * 'flow->ref_cnt' protects 'flow' from being freed. It doesn't protect the > - * flow from being deleted from 'cls' and it doesn't protect members of > 'flow' > - * from modification. > - * > - * Some members, marked 'const', are immutable. Accessing other members > - * requires synchronization, as noted in more detail below. > - */ > -struct dp_netdev_flow { > - const struct flow flow; /* Unmasked flow that created this entry. */ > - /* Hash table index by unmasked flow. */ > - const struct cmap_node node; /* In owning dp_netdev_pmd_thread's */ > - /* 'flow_table'. */ > - const struct cmap_node mark_node; /* In owning flow_mark's mark_to_flow > */ > - const ovs_u128 ufid; /* Unique flow identifier. */ > - const ovs_u128 mega_ufid; /* Unique mega flow identifier. */ > - const unsigned pmd_id; /* The 'core_id' of pmd thread owning this > */ > - /* flow. */ > - > - /* Number of references. > - * The classifier owns one reference. > - * Any thread trying to keep a rule from being freed should hold its own > - * reference. */ > - struct ovs_refcount ref_cnt; > - > - bool dead; > - uint32_t mark; /* Unique flow mark assigned to a flow */ > - > - /* Statistics. 
*/ > - struct dp_netdev_flow_stats stats; > - > - /* Statistics and attributes received from the netdev offload provider. > */ > - atomic_int netdev_flow_get_result; > - struct dp_netdev_flow_stats last_stats; > - struct dp_netdev_flow_attrs last_attrs; > - > - /* Actions. */ > - OVSRCU_TYPE(struct dp_netdev_actions *) actions; > - > - /* While processing a group of input packets, the datapath uses the next > - * member to store a pointer to the output batch for the flow. It is > - * reset after the batch has been sent out (See > dp_netdev_queue_batches(), > - * packet_batch_per_flow_init() and packet_batch_per_flow_execute()). */ > - struct packet_batch_per_flow *batch; > - > - /* Packet classification. */ > - char *dp_extra_info; /* String to return in a flow dump/get. */ > - struct dpcls_rule cr; /* In owning dp_netdev's 'cls'. */ > - /* 'cr' must be the last member. */ > -}; > - > -static void dp_netdev_flow_unref(struct dp_netdev_flow *); > static bool dp_netdev_flow_ref(struct dp_netdev_flow *); > static int dpif_netdev_flow_from_nlattrs(const struct nlattr *, uint32_t, > struct flow *, bool); > > -/* A set of datapath actions within a "struct dp_netdev_flow". > - * > - * > - * Thread-safety > - * ============= > - * > - * A struct dp_netdev_actions 'actions' is protected with RCU. */ > -struct dp_netdev_actions { > - /* These members are immutable: they do not change during the struct's > - * lifetime. */ > - unsigned int size; /* Size of 'actions', in bytes. */ > - struct nlattr actions[]; /* Sequence of OVS_ACTION_ATTR_* attributes. 
> */ > -}; > - > struct dp_netdev_actions *dp_netdev_actions_create(const struct nlattr *, > size_t); > struct dp_netdev_actions *dp_netdev_flow_get_actions( > @@ -637,171 +445,6 @@ struct tx_bond { > struct member_entry member_buckets[BOND_BUCKETS]; > }; > > -/* A set of properties for the current processing loop that is not directly > - * associated with the pmd thread itself, but with the packets being > - * processed or the short-term system configuration (for example, time). > - * Contained by struct dp_netdev_pmd_thread's 'ctx' member. */ > -struct dp_netdev_pmd_thread_ctx { > - /* Latest measured time. See 'pmd_thread_ctx_time_update()'. */ > - long long now; > - /* RX queue from which last packet was received. */ > - struct dp_netdev_rxq *last_rxq; > - /* EMC insertion probability context for the current processing cycle. */ > - uint32_t emc_insert_min; > -}; > - > -/* PMD: Poll modes drivers. PMD accesses devices via polling to eliminate > - * the performance overhead of interrupt processing. Therefore netdev can > - * not implement rx-wait for these devices. dpif-netdev needs to poll > - * these device to check for recv buffer. pmd-thread does polling for > - * devices assigned to itself. > - * > - * DPDK used PMD for accessing NIC. > - * > - * Note, instance with cpu core id NON_PMD_CORE_ID will be reserved for > - * I/O of all non-pmd threads. There will be no actual thread created > - * for the instance. > - * > - * Each struct has its own flow cache and classifier per managed ingress > port. > - * For packets received on ingress port, a look up is done on corresponding > PMD > - * thread's flow cache and in case of a miss, lookup is performed in the > - * corresponding classifier of port. Packets are executed with the found > - * actions in either case. > - * */ > -struct dp_netdev_pmd_thread { > - struct dp_netdev *dp; > - struct ovs_refcount ref_cnt; /* Every reference must be refcount'ed. > */ > - struct cmap_node node; /* In 'dp->poll_threads'. 
*/ > - > - /* Per thread exact-match cache. Note, the instance for cpu core > - * NON_PMD_CORE_ID can be accessed by multiple threads, and thusly > - * need to be protected by 'non_pmd_mutex'. Every other instance > - * will only be accessed by its own pmd thread. */ > - OVS_ALIGNED_VAR(CACHE_LINE_SIZE) struct dfc_cache flow_cache; > - > - /* Flow-Table and classifiers > - * > - * Writers of 'flow_table' must take the 'flow_mutex'. Corresponding > - * changes to 'classifiers' must be made while still holding the > - * 'flow_mutex'. > - */ > - struct ovs_mutex flow_mutex; > - struct cmap flow_table OVS_GUARDED; /* Flow table. */ > - > - /* One classifier per in_port polled by the pmd */ > - struct cmap classifiers; > - /* Periodically sort subtable vectors according to hit frequencies */ > - long long int next_optimization; > - /* End of the next time interval for which processing cycles > - are stored for each polled rxq. */ > - long long int rxq_next_cycle_store; > - > - /* Last interval timestamp. */ > - uint64_t intrvl_tsc_prev; > - /* Last interval cycles. */ > - atomic_ullong intrvl_cycles; > - > - /* Current context of the PMD thread. */ > - struct dp_netdev_pmd_thread_ctx ctx; > - > - struct seq *reload_seq; > - uint64_t last_reload_seq; > - > - /* These are atomic variables used as a synchronization and configuration > - * points for thread reload/exit. > - * > - * 'reload' atomic is the main one and it's used as a memory > - * synchronization point for all other knobs and data. > - * > - * For a thread that requests PMD reload: > - * > - * * All changes that should be visible to the PMD thread must be made > - * before setting the 'reload'. These changes could use any memory > - * ordering model including 'relaxed'. > - * * Setting the 'reload' atomic should occur in the same thread where > - * all other PMD configuration options updated. > - * * Setting the 'reload' atomic should be done with 'release' memory > - * ordering model or stricter. 
This will guarantee that all previous > - * changes (including non-atomic and 'relaxed') will be visible to > - * the PMD thread. > - * * To check that reload is done, thread should poll the 'reload' > atomic > - * to become 'false'. Polling should be done with 'acquire' memory > - * ordering model or stricter. This ensures that PMD thread > completed > - * the reload process. > - * > - * For the PMD thread: > - * > - * * PMD thread should read 'reload' atomic with 'acquire' memory > - * ordering model or stricter. This will guarantee that all changes > - * made before setting the 'reload' in the requesting thread will be > - * visible to the PMD thread. > - * * All other configuration data could be read with any memory > - * ordering model (including non-atomic and 'relaxed') but *only > after* > - * reading the 'reload' atomic set to 'true'. > - * * When the PMD reload done, PMD should (optionally) set all the > below > - * knobs except the 'reload' to their default ('false') values and > - * (mandatory), as the last step, set the 'reload' to 'false' using > - * 'release' memory ordering model or stricter. This will inform the > - * requesting thread that PMD has completed a reload cycle. > - */ > - atomic_bool reload; /* Do we need to reload ports? */ > - atomic_bool wait_for_reload; /* Can we busy wait for the next reload? > */ > - atomic_bool reload_tx_qid; /* Do we need to reload static_tx_qid? */ > - atomic_bool exit; /* For terminating the pmd thread. */ > - > - pthread_t thread; > - unsigned core_id; /* CPU core id of this pmd thread. */ > - int numa_id; /* numa node id of this pmd thread. */ > - bool isolated; > - > - /* Queue id used by this pmd thread to send packets on all netdevs if > - * XPS disabled for this netdev. All static_tx_qid's are unique and less > - * than 'cmap_count(dp->poll_threads)'. */ > - uint32_t static_tx_qid; > - > - /* Number of filled output batches. 
*/ > - int n_output_batches; > - > - struct ovs_mutex port_mutex; /* Mutex for 'poll_list' and 'tx_ports'. > */ > - /* List of rx queues to poll. */ > - struct hmap poll_list OVS_GUARDED; > - /* Map of 'tx_port's used for transmission. Written by the main thread, > - * read by the pmd thread. */ > - struct hmap tx_ports OVS_GUARDED; > - > - struct ovs_mutex bond_mutex; /* Protects updates of 'tx_bonds'. */ > - /* Map of 'tx_bond's used for transmission. Written by the main thread > - * and read by the pmd thread. */ > - struct cmap tx_bonds; > - > - /* These are thread-local copies of 'tx_ports'. One contains only tunnel > - * ports (that support push_tunnel/pop_tunnel), the other contains ports > - * with at least one txq (that support send). A port can be in both. > - * > - * There are two separate maps to make sure that we don't try to execute > - * OUTPUT on a device which has 0 txqs or PUSH/POP on a non-tunnel > device. > - * > - * The instances for cpu core NON_PMD_CORE_ID can be accessed by multiple > - * threads, and thusly need to be protected by 'non_pmd_mutex'. Every > - * other instance will only be accessed by its own pmd thread. */ > - struct hmap tnl_port_cache; > - struct hmap send_port_cache; > - > - /* Keep track of detailed PMD performance statistics. */ > - struct pmd_perf_stats perf_stats; > - > - /* Stats from previous iteration used by automatic pmd > - * load balance logic. */ > - uint64_t prev_stats[PMD_N_STATS]; > - atomic_count pmd_overloaded; > - > - /* Set to true if the pmd thread needs to be reloaded. */ > - bool need_reload; > - > - /* Next time when PMD should try RCU quiescing. */ > - long long next_rcu_quiesce; > -}; > - > /* Interface to netdev-based datapath. 
*/ > struct dpif_netdev { > struct dpif dpif; > @@ -906,90 +549,12 @@ static inline struct dpcls * > dp_netdev_pmd_lookup_dpcls(struct dp_netdev_pmd_thread *pmd, > odp_port_t in_port); > > -static inline bool emc_entry_alive(struct emc_entry *ce); > -static void emc_clear_entry(struct emc_entry *ce); > -static void smc_clear_entry(struct smc_bucket *b, int idx); > - > static void dp_netdev_request_reconfigure(struct dp_netdev *dp); > static inline bool > pmd_perf_metrics_enabled(const struct dp_netdev_pmd_thread *pmd); > static void queue_netdev_flow_del(struct dp_netdev_pmd_thread *pmd, > struct dp_netdev_flow *flow); > > -static void > -emc_cache_init(struct emc_cache *flow_cache) > -{ > - int i; > - > - flow_cache->sweep_idx = 0; > - for (i = 0; i < ARRAY_SIZE(flow_cache->entries); i++) { > - flow_cache->entries[i].flow = NULL; > - flow_cache->entries[i].key.hash = 0; > - flow_cache->entries[i].key.len = sizeof(struct miniflow); > - flowmap_init(&flow_cache->entries[i].key.mf.map); > - } > -} > - > -static void > -smc_cache_init(struct smc_cache *smc_cache) > -{ > - int i, j; > - for (i = 0; i < SMC_BUCKET_CNT; i++) { > - for (j = 0; j < SMC_ENTRY_PER_BUCKET; j++) { > - smc_cache->buckets[i].flow_idx[j] = UINT16_MAX; > - } > - } > -} > - > -static void > -dfc_cache_init(struct dfc_cache *flow_cache) > -{ > - emc_cache_init(&flow_cache->emc_cache); > - smc_cache_init(&flow_cache->smc_cache); > -} > - > -static void > -emc_cache_uninit(struct emc_cache *flow_cache) > -{ > - int i; > - > - for (i = 0; i < ARRAY_SIZE(flow_cache->entries); i++) { > - emc_clear_entry(&flow_cache->entries[i]); > - } > -} > - > -static void > -smc_cache_uninit(struct smc_cache *smc) > -{ > - int i, j; > - > - for (i = 0; i < SMC_BUCKET_CNT; i++) { > - for (j = 0; j < SMC_ENTRY_PER_BUCKET; j++) { > - smc_clear_entry(&(smc->buckets[i]), j); > - } > - } > -} > - > -static void > -dfc_cache_uninit(struct dfc_cache *flow_cache) > -{ > - smc_cache_uninit(&flow_cache->smc_cache); > - 
emc_cache_uninit(&flow_cache->emc_cache); > -} > - > -/* Check and clear dead flow references slowly (one entry at each > - * invocation). */ > -static void > -emc_cache_slow_sweep(struct emc_cache *flow_cache) > -{ > - struct emc_entry *entry = &flow_cache->entries[flow_cache->sweep_idx]; > - > - if (!emc_entry_alive(entry)) { > - emc_clear_entry(entry); > - } > - flow_cache->sweep_idx = (flow_cache->sweep_idx + 1) & EM_FLOW_HASH_MASK; > -} > - > /* Updates the time in PMD threads context and should be called in three > cases: > * > * 1. PMD structure initialization: > @@ -2377,19 +1942,13 @@ dp_netdev_flow_free(struct dp_netdev_flow *flow) > free(flow); > } > > -static void dp_netdev_flow_unref(struct dp_netdev_flow *flow) > +void dp_netdev_flow_unref(struct dp_netdev_flow *flow) > { > if (ovs_refcount_unref_relaxed(&flow->ref_cnt) == 1) { > ovsrcu_postpone(dp_netdev_flow_free, flow); > } > } > > -static uint32_t > -dp_netdev_flow_hash(const ovs_u128 *ufid) > -{ > - return ufid->u32[0]; > -} > - > static inline struct dpcls * > dp_netdev_pmd_lookup_dpcls(struct dp_netdev_pmd_thread *pmd, > odp_port_t in_port) > @@ -3009,14 +2568,6 @@ static bool dp_netdev_flow_ref(struct dp_netdev_flow > *flow) > * single memcmp(). > * - These functions can be inlined by the compiler. */ > > -/* Given the number of bits set in miniflow's maps, returns the size of the > - * 'netdev_flow_key.mf' */ > -static inline size_t > -netdev_flow_key_size(size_t flow_u64s) > -{ > - return sizeof(struct miniflow) + MINIFLOW_VALUES_SIZE(flow_u64s); > -} > - > static inline bool > netdev_flow_key_equal(const struct netdev_flow_key *a, > const struct netdev_flow_key *b) > @@ -3025,16 +2576,6 @@ netdev_flow_key_equal(const struct netdev_flow_key *a, > return a->hash == b->hash && !memcmp(&a->mf, &b->mf, a->len); > } > > -/* Used to compare 'netdev_flow_key' in the exact match cache to a miniflow. 
> - * The maps are compared bitwise, so both 'key->mf' and 'mf' must have been > - * generated by miniflow_extract. */ > -static inline bool > -netdev_flow_key_equal_mf(const struct netdev_flow_key *key, > - const struct miniflow *mf) > -{ > - return !memcmp(&key->mf, mf, key->len); > -} > - > static inline void > netdev_flow_key_clone(struct netdev_flow_key *dst, > const struct netdev_flow_key *src) > @@ -3101,21 +2642,6 @@ netdev_flow_key_init_masked(struct netdev_flow_key > *dst, > (dst_u64 - miniflow_get_values(&dst->mf)) * 8); > } > > -static inline bool > -emc_entry_alive(struct emc_entry *ce) > -{ > - return ce->flow && !ce->flow->dead; > -} > - > -static void > -emc_clear_entry(struct emc_entry *ce) > -{ > - if (ce->flow) { > - dp_netdev_flow_unref(ce->flow); > - ce->flow = NULL; > - } > -} > - > static inline void > emc_change_entry(struct emc_entry *ce, struct dp_netdev_flow *flow, > const struct netdev_flow_key *key) > @@ -3181,24 +2707,6 @@ emc_probabilistic_insert(struct dp_netdev_pmd_thread > *pmd, > } > } > > -static inline struct dp_netdev_flow * > -emc_lookup(struct emc_cache *cache, const struct netdev_flow_key *key) > -{ > - struct emc_entry *current_entry; > - > - EMC_FOR_EACH_POS_WITH_HASH(cache, current_entry, key->hash) { > - if (current_entry->key.hash == key->hash > - && emc_entry_alive(current_entry) > - && netdev_flow_key_equal_mf(¤t_entry->key, &key->mf)) { > - > - /* We found the entry with the 'key->mf' miniflow */ > - return current_entry->flow; > - } > - } > - > - return NULL; > -} > - > static inline const struct cmap_node * > smc_entry_get(struct dp_netdev_pmd_thread *pmd, const uint32_t hash) > { > @@ -3219,12 +2727,6 @@ smc_entry_get(struct dp_netdev_pmd_thread *pmd, const > uint32_t hash) > return NULL; > } > > -static void > -smc_clear_entry(struct smc_bucket *b, int idx) > -{ > - b->flow_idx[idx] = UINT16_MAX; > -} > - > /* Insert the flow_table index into SMC. 
Insertion may fail when 1) SMC is > * turned off, 2) the flow_table index is larger than uint16_t can handle. > * If there is already an SMC entry having same signature, the index will be > @@ -6912,22 +6414,6 @@ dp_netdev_upcall(struct dp_netdev_pmd_thread *pmd, > struct dp_packet *packet_, > actions, wc, put_actions, dp->upcall_aux); > } > > -static inline uint32_t > -dpif_netdev_packet_get_rss_hash_orig_pkt(struct dp_packet *packet, > - const struct miniflow *mf) > -{ > - uint32_t hash; > - > - if (OVS_LIKELY(dp_packet_rss_valid(packet))) { > - hash = dp_packet_get_rss_hash(packet); > - } else { > - hash = miniflow_hash_5tuple(mf, 0); > - dp_packet_set_rss_hash(packet, hash); > - } > - > - return hash; > -} > - > static inline uint32_t > dpif_netdev_packet_get_rss_hash(struct dp_packet *packet, > const struct miniflow *mf) > @@ -8775,7 +8261,7 @@ dpcls_create_subtable(struct dpcls *cls, const struct > netdev_flow_key *mask) > subtable->mf_bits_set_unit0 = unit0; > subtable->mf_bits_set_unit1 = unit1; > subtable->mf_masks = xmalloc(sizeof(uint64_t) * (unit0 + unit1)); > - netdev_flow_key_gen_masks(mask, subtable->mf_masks, unit0, unit1); > + dpcls_flow_key_gen_masks(mask, subtable->mf_masks, unit0, unit1); > > /* Get the preferred subtable search function for this (u0,u1) subtable. > * The function is guaranteed to always return a valid implementation, > and > @@ -8950,11 +8436,10 @@ dpcls_remove(struct dpcls *cls, struct dpcls_rule > *rule) > } > } > > -/* Inner loop for mask generation of a unit, see netdev_flow_key_gen_masks. > */ > +/* Inner loop for mask generation of a unit, see dpcls_flow_key_gen_masks. 
*/ > static inline void > -netdev_flow_key_gen_mask_unit(uint64_t iter, > - const uint64_t count, > - uint64_t *mf_masks) > +dpcls_flow_key_gen_mask_unit(uint64_t iter, const uint64_t count, > + uint64_t *mf_masks) > { > int i; > for (i = 0; i < count; i++) { > @@ -8975,16 +8460,16 @@ netdev_flow_key_gen_mask_unit(uint64_t iter, > * @param mf_bits_unit0 Number of bits set in unit0 of the miniflow > */ > void > -netdev_flow_key_gen_masks(const struct netdev_flow_key *tbl, > - uint64_t *mf_masks, > - const uint32_t mf_bits_u0, > - const uint32_t mf_bits_u1) > +dpcls_flow_key_gen_masks(const struct netdev_flow_key *tbl, > + uint64_t *mf_masks, > + const uint32_t mf_bits_u0, > + const uint32_t mf_bits_u1) > { > uint64_t iter_u0 = tbl->mf.map.bits[0]; > uint64_t iter_u1 = tbl->mf.map.bits[1]; > > - netdev_flow_key_gen_mask_unit(iter_u0, mf_bits_u0, &mf_masks[0]); > - netdev_flow_key_gen_mask_unit(iter_u1, mf_bits_u1, > &mf_masks[mf_bits_u0]); > + dpcls_flow_key_gen_mask_unit(iter_u0, mf_bits_u0, &mf_masks[0]); > + dpcls_flow_key_gen_mask_unit(iter_u1, mf_bits_u1, &mf_masks[mf_bits_u0]); > } > > /* Returns true if 'target' satisfies 'key' in 'mask', that is, if each 1-bit > -- > 2.32.0 > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev -- fbl