LGTM.

I think this patch could work fine in conjunction with the one I posted
https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335940.html
where I'm targeting a congestion usecase with recirculated packets.
The goal is still to limit thrashing and the criteria is to avoid 
EMC lookup and insertions for recirculated packets.

I gave a try to your 2 patches, I'm using 
Commit 325b2b1a493a2230072de726bbb53a8337759f39

On top of that I applied Ciara's patch "dpif-netdev: add EMC entry count 
and %full figure to pmd-stats-show" at:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327570.html
because I wanted to see the count of EMC entries.

I generated the different traffic streams by looping on IP addresses.
Of course results are strictly dependent on the particular data traffic.

I sent traffic 64B UDP packets at the line rate and measured the Rx packet 
rate, regardless of pkt loss.

Flow setup:
table=0, in_port=dpdk0 actions=output:dpdk1
table=0, in_port=dpdk1 actions=output:dpdk0

2 PMDs, 3 Tx queues.

To read entries counts:
 sudo ./utilities/ovs-appctl dpif-netdev/pmd-stats-show | grep entries

Results
=======

        +=========================+============================
        |  Orig + Ciara's patch   |  + these 2 patches     
========+=========================+============================
Streams | Entries      Rx [Mpps]  |   Entries      Rx [Mpps]
--------+-------------------------+----------------------------
100     | 100, 100    9.76, 10.05 |   100, 100    10.55, 10.72 
1000    | 1000, 999   7.85, 8.09  |   1000, 1000  8.77, 8.91 
3000    | 2993, 2987  7.61, 7.87  |   3000, 3000  8.58, 7.89
5000    | 4336, 4872  7.49, 7.26  |   4496, 5000  8.57, 7.28
7000    | 5870, 7000  8.57, 7.20  |   6039, 7000  8.56, 7.26 
9000    | 6550, 7572  7.19, 6.91  |   6643, 7836  7.06, 6.90 
11000   | 7152, 8192  6.81, 6.83  |   7158, 8192  6.81, 6.79
--------+-------------------------+----------------------------

It's interesting to see how these patches allow to use more EMC locations -
see cases from 3000 to 9000 streams - so reducing thrashing.


-Antonio


> -----Original Message-----
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org]
> On Behalf Of Ilya Maximets
> Sent: Monday, July 31, 2017 3:41 PM
> To: ovs-dev@openvswitch.org
> Cc: Heetae Ahn <heetae82....@samsung.com>; Ilya Maximets
> <i.maxim...@samsung.com>
> Subject: [ovs-dev] [PATCH v2 2/2] dpif-netdev: Fix emc replacement policy.
> 
> Current EMC replacement policy allows to replace active EMC entry
> even if there are dead (empty) entries available. This leads to
> EMC trashing even on few hundreds of flows. In some cases PMD
> threads starts to execute classifier lookups even in tests with
> 50 - 100 active flows.
> 
> Looks like the hash comparison rule was introduced to randomly
> choose one of alive entries to replace. But it doesn't work as
> needed and also hashes has nothing common with randomness.
> 
> Lets fix the replacement policy by removing hash checking and
> using the random value passed from 'emc_probabilistic_insert()'
> only while considering replace of the alive entry.

[Antonio]
I like the approach to re-use the info we already have, in this case
the random value.

> This should give us nearly fair way to choose the entry to replace.
> 
> We are avoiding calculation of the new random value by reusing
> bits of already generated random for probabilistic EMC insertion.
> Bits higher than 'EM_FLOW_INSERT_INV_PROB_SHIFT' are used because
> lower bits are less than 'min' and not fully random.
> 
> Not replacing of alive entries while dead ones exists allows to
> significantly decrease EMC trashing.
> 
> Testing shows stable work of exact match cache without misses
> with up to 3072 - 6144 active flows (depends on traffic pattern).
> 
> Signed-off-by: Ilya Maximets <i.maxim...@samsung.com>
> ---
>  lib/dpif-netdev.c | 23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 123a7c9..a714329 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -155,6 +155,9 @@ struct netdev_flow_key {
>  #define EM_FLOW_INSERT_INV_PROB_SHIFT 20
>  #define EM_FLOW_INSERT_INV_PROB_MAX  (1 << EM_FLOW_INSERT_INV_PROB_SHIFT)
>  #define EM_FLOW_INSERT_INV_PROB_MASK (EM_FLOW_INSERT_INV_PROB_MAX - 1)
> +/* We will use bits higher than EM_FLOW_INSERT_INV_PROB_SHIFT of the random
> + * value for EMC replacement policy. */
> +BUILD_ASSERT_DECL(32 - EM_FLOW_INSERT_INV_PROB_SHIFT >= EM_FLOW_HASH_SEGS);
>  /* Default EMC insert probability is 1 / DEFAULT_EM_FLOW_INSERT_INV_PROB */
>  #define DEFAULT_EM_FLOW_INSERT_INV_PROB 100
>  #define DEFAULT_EM_FLOW_INSERT_MIN (EM_FLOW_INSERT_INV_PROB_MAX /        \
> @@ -2041,7 +2044,7 @@ emc_change_entry(struct emc_entry *ce, struct
> dp_netdev_flow *flow,
> 
>  static inline void
>  emc_insert(struct emc_cache *cache, const struct netdev_flow_key *key,
> -           struct dp_netdev_flow *flow)
> +           struct dp_netdev_flow *flow, uint32_t random_value)
>  {
>      struct emc_entry *to_be_replaced = NULL;
>      struct emc_entry *current_entry;
> @@ -2054,12 +2057,12 @@ emc_insert(struct emc_cache *cache, const struct
> netdev_flow_key *key,
>          }
> 
>          /* Replacement policy: put the flow in an empty (not alive) entry, or
> -         * in the first entry where it can be */
> +         * in the random entry where it can be */
>          if (!to_be_replaced
>              || (emc_entry_alive(to_be_replaced)
> -                && !emc_entry_alive(current_entry))
> -            || current_entry->key.hash < to_be_replaced->key.hash) {
> +                && (!emc_entry_alive(current_entry) || random_value & 1))) {
>              to_be_replaced = current_entry;
> +            random_value >>= 1;
>          }
>      }
>      /* We didn't find the miniflow in the cache.
> @@ -2077,11 +2080,17 @@ emc_probabilistic_insert(struct dp_netdev_pmd_thread
> *pmd,
>       * default the value is UINT32_MAX / 100 which yields an insertion
>       * probability of 1/100 ie. 1% */
> 
> -    uint32_t min;
> +    uint32_t min, random_value;
>      atomic_read_relaxed(&pmd->dp->emc_insert_min, &min);
> 
> -    if (min && (random_uint32() & EM_FLOW_INSERT_INV_PROB_MASK) < min) {
> -        emc_insert(&pmd->flow_cache, key, flow);
> +    if (!min) {
> +        return;
> +    }
> +
> +    random_value = random_uint32();
> +    if ((random_value & EM_FLOW_INSERT_INV_PROB_MASK) < min) {
> +        emc_insert(&pmd->flow_cache, key, flow,
> +                   random_value >> EM_FLOW_INSERT_INV_PROB_SHIFT);
>      }
>  }
> 
> --
> 2.7.4
> 
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to