On 02/14/2017 11:46 AM, Ciara Loftus wrote:
> Unconditional insertion of EMC entries results in EMC thrashing at high
> numbers of parallel flows. When this occurs, the performance of the EMC
> often falls below that of the dpcls classifier, rendering the EMC
> practically useless.
> 
> Instead of unconditionally inserting entries into the EMC when a miss
> occurs, use a 1% probability of insertion. This ensures that the most
> frequent flows have the highest chance of creating an entry in the EMC,
> and the probability of thrashing the EMC is also greatly reduced.
> 
> The probability of insertion is configurable, via the
> other_config:emc-insert-inv-prob option. This value sets the average
> probability of insertion to 1/emc-insert-inv-prob.
> 
> For example the following command changes the insertion proability to

typo

> (on average) 1 in every 20 packets ie. 1/20 ie. 5%.
> 
> ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=20
> 

Overall LGTM. Couple of minor comments below. I see around 10%
throughput increase for 1M flows.

Acked-by: Kevin Traynor <ktray...@redhat.com>

> Signed-off-by: Ciara Loftus <ciara.lof...@intel.com>
> Signed-off-by: Georg Schmuecking <georg.schmueck...@ericsson.com>
> Co-authored-by: Georg Schmuecking <georg.schmueck...@ericsson.com>
> ---
> v9:
> - Revert back to original 1/N formula for configuring the probability
>   (don't use percentiles).
> - Rename option to reflect inverse probability & update documentation to
>   make the wording clearer.
> v8:
> - Floating point precision percentiles
> - Moved NEWS entry to post-2.7 section and is no longer in the DPDK
>   specific section.
> v7:
> - Remove further code duplication
> v6:
> - Refactor the code to remove duplication around calls to
>   emc_probabilistic_insert()
> v5:
> - Use percentiles for emc-insert-prob (0-100%)
> - Update docs to reflect the option not exclusive to the DPDK datapath.
> v4:
> - Added Georg to Authors file
> - Set emc-insert-prob=1 for 'PMD - stats' unit test
> - Use read_relaxed on atomic var
> - Correctly allow for 0 and 100% probababilites
> - Cache align new element in dp_netdev struct
> - Revert to default probability if invalid emc-insert-prob set
> - Allow configurability for non-DPDK case
> v3:
> - Use new dpif other_config infrastructure to tidy up how the
>   emc-insert-prob value is passed to dpif-netdev.
> v2:
> - Enable probability configurability via other_config:emc-insert-prob
>   option.
> 
>  AUTHORS.rst                  |  1 +
>  Documentation/howto/dpdk.rst | 20 ++++++++++++++++++
>  NEWS                         |  2 ++
>  lib/dpif-netdev.c            | 49 
> ++++++++++++++++++++++++++++++++++++++++----
>  tests/pmd.at                 |  1 +
>  vswitchd/vswitch.xml         | 17 +++++++++++++++
>  6 files changed, 86 insertions(+), 4 deletions(-)
> 
> diff --git a/AUTHORS.rst b/AUTHORS.rst
> index f247df5..9a37423 100644
> --- a/AUTHORS.rst
> +++ b/AUTHORS.rst
> @@ -385,6 +385,7 @@ Eric Lopez                      elo...@nicira.com
>  Frido Roose                     fr.ro...@gmail.com
>  Gaetano Catalli                 gaetano.cata...@gmail.com
>  Gavin Remaley                   gavin_rema...@selinc.com
> +Georg Schmuecking               georg.schmueck...@ericsson.com
>  George Shuklin                  ama...@desunote.ru
>  Gerald Rogers                   gerald.rog...@intel.com
>  Ghanem Bahri                    bahri.gha...@gmail.com
> diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
> index d1e6e89..52cb3fc 100644
> --- a/Documentation/howto/dpdk.rst
> +++ b/Documentation/howto/dpdk.rst
> @@ -354,6 +354,26 @@ the `DPDK documentation
>  
>  Note: Not all DPDK virtual PMD drivers have been tested and verified to work.
>  
> +EMC Insertion Probability
> +-------------------------
> +By default 1 in every 100 flows are inserted into the Exact Match Cache 
> (EMC).
> +It is possible to change this insertion probability by setting the
> +``emc-insert-inv-prob`` option::
> +
> +    $ ovs-vsctl --no-wait set Open_vSwitch . 
> other_config:emc-insert-inv-prob=N
> +
> +where:
> +
> +``N``
> +  is a positive integer representing the inverse probability of insertion ie.
> +  on average 1 in every N packets with a unique flow will generate an EMC
> +  insertion.
> +
> +If ``N`` is set to 1, an insertion will be performed for every flow. If set 
> to
> +0, no insertions will be performed and the EMC will effectively be disabled.
> +
> +For more information on the EMC refer to :doc:`/intro/install/dpdk` .
> +
>  .. _dpdk-ovs-in-guest:
>  
>  OVS with DPDK Inside VMs
> diff --git a/NEWS b/NEWS
> index aebd99c..b14c76d 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -3,6 +3,8 @@ Post-v2.7.0
>     - Tunnels:
>       * Added support to set packet mark for tunnel endpoint using
>         `egress_pkt_mark` OVSDB option.
> +   - EMC insertion probability is reduced to 1% and is configurable via
> +     the new 'other_config:emc-insert-inv-prob' option.
>  
>  v2.7.0 - xx xxx xxxx
>  ---------------------
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 0be5db5..54af4ac 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -144,6 +144,11 @@ struct netdev_flow_key {
>  #define EM_FLOW_HASH_MASK (EM_FLOW_HASH_ENTRIES - 1)
>  #define EM_FLOW_HASH_SEGS 2
>  
> +/* Default EMC insert probability is 1 / DEFAULT_EM_FLOW_INSERT_INV_PROB */
> +#define DEFAULT_EM_FLOW_INSERT_INV_PROB 100
> +#define DEFAULT_EM_FLOW_INSERT_MIN (UINT32_MAX /                     \
> +                                    DEFAULT_EM_FLOW_INSERT_INV_PROB)
> +
>  struct emc_entry {
>      struct dp_netdev_flow *flow;
>      struct netdev_flow_key key;   /* key.hash used for emc hash value. */
> @@ -254,6 +259,9 @@ struct dp_netdev {
>      uint64_t last_tnl_conf_seq;
>  
>      struct conntrack conntrack;
> +
> +    /* Probability of EMC insertions is a factor of 'emc_insert_min'.*/
> +    OVS_ALIGNED_VAR(CACHE_LINE_SIZE)atomic_uint32_t emc_insert_min;
>  };
>  
>  static struct dp_netdev_port *dp_netdev_lookup_port(const struct dp_netdev 
> *dp,
> @@ -1066,6 +1074,8 @@ create_dp_netdev(const char *name, const struct 
> dpif_class *class,
>  
>      conntrack_init(&dp->conntrack);
>  
> +    atomic_init(&dp->emc_insert_min, DEFAULT_EM_FLOW_INSERT_MIN);
> +
>      cmap_init(&dp->poll_threads);
>      ovs_mutex_init_recursive(&dp->non_pmd_mutex);
>      ovsthread_key_create(&dp->per_pmd_key, NULL);
> @@ -1943,6 +1953,27 @@ emc_insert(struct emc_cache *cache, const struct 
> netdev_flow_key *key,
>      emc_change_entry(to_be_replaced, flow, key);
>  }
>  
> +static inline void
> +emc_probabilistic_insert(struct dp_netdev_pmd_thread *pmd,
> +                         const struct netdev_flow_key *key,
> +                         struct dp_netdev_flow *flow)
> +{
> +    /* Insert an entry into the EMC based on probability value 'min'. By
> +     * default the value is UINT32_MAX / 100 which yields an insertion
> +     * probability of 1/100 ie. 1% */
> +
> +    uint32_t min;
> +    atomic_read_relaxed(&pmd->dp->emc_insert_min, &min);
> +
> +#ifdef DPDK_NETDEV
> +    if (min && (key->hash ^ (uint32_t)pmd->last_cycles) <= min) {
> +#else
> +    if (min && (key->hash ^ random_uint32()) <= min) {
> +#endif
> +        emc_insert(&pmd->flow_cache, key, flow);
> +    }
> +}
> +
>  static inline struct dp_netdev_flow *
>  emc_lookup(struct emc_cache *cache, const struct netdev_flow_key *key)
>  {
> @@ -2731,6 +2762,7 @@ dpif_netdev_set_config(struct dpif *dpif, const struct 
> smap *other_config)
>  {
>      struct dp_netdev *dp = get_dp_netdev(dpif);
>      const char *cmask = smap_get(other_config, "pmd-cpu-mask");
> +    int insert_prob = smap_get_int(other_config, "emc-insert-inv-prob", -1);
>  
>      if (!nullable_string_is_equal(dp->pmd_cmask, cmask)) {
>          free(dp->pmd_cmask);
> @@ -2738,6 +2770,17 @@ dpif_netdev_set_config(struct dpif *dpif, const struct 
> smap *other_config)
>          dp_netdev_request_reconfigure(dp);
>      }
>  
> +    if (insert_prob >= 0 && insert_prob <= UINT32_MAX) {
> +        uint32_t insert_min = insert_prob == 0 ? 0 : UINT32_MAX / 
> insert_prob;
> +        uint32_t cur_min;
> +        atomic_read_relaxed(&dp->emc_insert_min, &cur_min);
> +        if (insert_min != cur_min) {
> +            atomic_store_relaxed(&dp->emc_insert_min, insert_min);
> +        }
> +    } else {
> +        atomic_store_relaxed(&dp->emc_insert_min, 
> DEFAULT_EM_FLOW_INSERT_MIN);
> +    }
> +

You could 'if' on just setting the insert_min, and combine 'read, if
!=cur_min, store'. That would avoid storing emc_insert_min every time
another parameter is changed and emc-inv-prob is not in the db (the
default). You would still have to do a read instead of the store, so I'm
not sure if it's much/any better.

A VLOG_INFO of a new setting might be useful also, especially where
someone sets it =0.

>      return 0;
>  }
>  
> @@ -4193,8 +4236,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd, 
> struct dp_packet *packet,
>                                               add_actions->size);
>          }
>          ovs_mutex_unlock(&pmd->flow_mutex);
> -
> -        emc_insert(&pmd->flow_cache, key, netdev_flow);
> +        emc_probabilistic_insert(pmd, key, netdev_flow);
>      }
>  }
>  
> @@ -4217,7 +4259,6 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>      struct dpcls *cls;
>      struct dpcls_rule *rules[PKT_ARRAY_SIZE];
>      struct dp_netdev *dp = pmd->dp;
> -    struct emc_cache *flow_cache = &pmd->flow_cache;
>      int miss_cnt = 0, lost_cnt = 0;
>      int lookup_cnt = 0, add_lookup_cnt;
>      bool any_miss;
> @@ -4288,7 +4329,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>  
>          flow = dp_netdev_flow_cast(rules[i]);
>  
> -        emc_insert(flow_cache, &keys[i], flow);
> +        emc_probabilistic_insert(pmd, &keys[i], flow);
>          dp_netdev_queue_batches(packet, flow, &keys[i].mf, batches, 
> n_batches);
>      }
>  
> diff --git a/tests/pmd.at b/tests/pmd.at
> index 35dbcfe..5686bed 100644
> --- a/tests/pmd.at
> +++ b/tests/pmd.at
> @@ -158,6 +158,7 @@ CHECK_PMD_THREADS_CREATED()
>  
>  AT_CHECK([ovs-appctl vlog/set dpif_netdev:dbg])
>  AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=1])
>  
>  sleep 1
>  
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 146a816..782417f 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -335,6 +335,23 @@
>            datapaths.
>          </p>
>        </column>
> +
> +      <column name="other_config" key="emc-insert-inv-prob"
> +              type='{"type": "integer", "minInteger": 0, "maxInteger": 
> 4294967295}'>
> +        <p>
> +          Specifies the inverse probability (1/emc-insert-inv-prob) of a flow
> +          being inserted into the Exact Match Cache (EMC). On average one in
> +          every <code>emc-insert-inv-prob</code> packets that generate a 
> unique
> +          flow will cause an insertion into the EMC.
> +
> +          A value of 1 will result in an insertion for every flow (1/1 = 
> 100%)
> +          whereas a value of zero will result in no insertions and 
> essentially
> +          disable the EMC.
> +        </p>
> +        <p>
> +          Defaults to 100 ie. there is (1/100 =) 1% chance of EMC insertion.
> +        </p>
> +      </column>
>      </group>
>  
>      <group title="Status">
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to