Hi Antonio
Would you mind sharing your distribution algorithm ?
I would like to understand how you saw some benefit in the 1000-5000 range.
1000 | 1000, 999 7.85, 8.09 | 1000, 1000 8.77, 8.91
3000 | 2993, 2987 7.61, 7.87 | 3000, 3000 8.58, 7.89
5000 | 4336, 4872 7.49, 7.26 | 4496, 5000 8.57, 7.28
Thanks Darrell
-----Original Message-----
From: <[email protected]> on behalf of "Fischetti, Antonio"
<[email protected]>
Date: Thursday, August 3, 2017 at 8:10 AM
To: Ilya Maximets <[email protected]>, "[email protected]"
<[email protected]>
Cc: Heetae Ahn <[email protected]>
Subject: Re: [ovs-dev] [PATCH v2 2/2] dpif-netdev: Fix emc replacement policy.
LGTM.
I think this patch could work fine in conjunction with the one I posted
https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddev_2017-2DJuly_335940.html&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=gPngVJNy2q5ZxGlA8YyqJvmxn6ITNf2p_ExFf4bI1qA&s=Ro4bZVnRtHb2-gSch_oxiU5ypxPWW8ONyoMeuXml4Dk&e=
where I'm targeting a congestion usecase with recirculated packets.
The goal is still to limit thrashing and the criteria is to avoid
EMC lookup and insertions for recirculated packets.
I gave a try to your 2 patches, I'm using
Commit 325b2b1a493a2230072de726bbb53a8337759f39
On top of that I applied Ciara's patch "dpif-netdev: add EMC entry count
and %full figure to pmd-stats-show" at:
https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddev_2017-2DJanuary_327570.html&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=gPngVJNy2q5ZxGlA8YyqJvmxn6ITNf2p_ExFf4bI1qA&s=Np71SDnUWCgZNENFZ0A95iSWDqHN9SBtRso9UU6Wzlk&e=
because I wanted to see the count of EMC entries.
I generated the different traffic streams by looping on IP addresses.
Of course results are strictly dependent on the particular data traffic.
I sent traffic 64B UDP packets at the line rate and measured the Rx packet
rate, regardless of pkt loss.
Flow setup:
table=0, in_port=dpdk0 actions=output:dpdk1
table=0, in_port=dpdk1 actions=output:dpdk0
2 PMDs, 3 Tx queues.
To read entries counts:
sudo ./utilities/ovs-appctl dpif-netdev/pmd-stats-show | grep entries
Results
=======
+=========================+============================
| Orig + Ciara's patch | + these 2 patches
========+=========================+============================
Streams | Entries Rx [Mpps] | Entries Rx [Mpps]
--------+-------------------------+----------------------------
100 | 100, 100 9.76, 10.05 | 100, 100 10.55, 10.72
1000 | 1000, 999 7.85, 8.09 | 1000, 1000 8.77, 8.91
3000 | 2993, 2987 7.61, 7.87 | 3000, 3000 8.58, 7.89
5000 | 4336, 4872 7.49, 7.26 | 4496, 5000 8.57, 7.28
7000 | 5870, 7000 8.57, 7.20 | 6039, 7000 8.56, 7.26
9000 | 6550, 7572 7.19, 6.91 | 6643, 7836 7.06, 6.90
11000 | 7152, 8192 6.81, 6.83 | 7158, 8192 6.81, 6.79
--------+-------------------------+----------------------------
It's interesting to see how these patches allow to use more EMC locations -
see cases from 3000 to 9000 streams - so reducing thrashing.
-Antonio
> -----Original Message-----
> From: [email protected]
[mailto:[email protected]]
> On Behalf Of Ilya Maximets
> Sent: Monday, July 31, 2017 3:41 PM
> To: [email protected]
> Cc: Heetae Ahn <[email protected]>; Ilya Maximets
> <[email protected]>
> Subject: [ovs-dev] [PATCH v2 2/2] dpif-netdev: Fix emc replacement policy.
>
> Current EMC replacement policy allows to replace active EMC entry
> even if there are dead (empty) entries available. This leads to
> EMC trashing even on few hundreds of flows. In some cases PMD
> threads starts to execute classifier lookups even in tests with
> 50 - 100 active flows.
>
> Looks like the hash comparison rule was introduced to randomly
> choose one of alive entries to replace. But it doesn't work as
> needed and also hashes has nothing common with randomness.
>
> Lets fix the replacement policy by removing hash checking and
> using the random value passed from 'emc_probabilistic_insert()'
> only while considering replace of the alive entry.
[Antonio]
I like the approach to re-use the info we already have, in this case
the random value.
> This should give us nearly fair way to choose the entry to replace.
>
> We are avoiding calculation of the new random value by reusing
> bits of already generated random for probabilistic EMC insertion.
> Bits higher than 'EM_FLOW_INSERT_INV_PROB_SHIFT' are used because
> lower bits are less than 'min' and not fully random.
>
> Not replacing of alive entries while dead ones exists allows to
> significantly decrease EMC trashing.
>
> Testing shows stable work of exact match cache without misses
> with up to 3072 - 6144 active flows (depends on traffic pattern).
>
> Signed-off-by: Ilya Maximets <[email protected]>
> ---
> lib/dpif-netdev.c | 23 ++++++++++++++++-------
> 1 file changed, 16 insertions(+), 7 deletions(-)
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 123a7c9..a714329 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -155,6 +155,9 @@ struct netdev_flow_key {
> #define EM_FLOW_INSERT_INV_PROB_SHIFT 20
> #define EM_FLOW_INSERT_INV_PROB_MAX (1 << EM_FLOW_INSERT_INV_PROB_SHIFT)
> #define EM_FLOW_INSERT_INV_PROB_MASK (EM_FLOW_INSERT_INV_PROB_MAX - 1)
> +/* We will use bits higher than EM_FLOW_INSERT_INV_PROB_SHIFT of the
random
> + * value for EMC replacement policy. */
> +BUILD_ASSERT_DECL(32 - EM_FLOW_INSERT_INV_PROB_SHIFT >=
EM_FLOW_HASH_SEGS);
> /* Default EMC insert probability is 1 / DEFAULT_EM_FLOW_INSERT_INV_PROB
*/
> #define DEFAULT_EM_FLOW_INSERT_INV_PROB 100
> #define DEFAULT_EM_FLOW_INSERT_MIN (EM_FLOW_INSERT_INV_PROB_MAX /
\
> @@ -2041,7 +2044,7 @@ emc_change_entry(struct emc_entry *ce, struct
> dp_netdev_flow *flow,
>
> static inline void
> emc_insert(struct emc_cache *cache, const struct netdev_flow_key *key,
> - struct dp_netdev_flow *flow)
> + struct dp_netdev_flow *flow, uint32_t random_value)
> {
> struct emc_entry *to_be_replaced = NULL;
> struct emc_entry *current_entry;
> @@ -2054,12 +2057,12 @@ emc_insert(struct emc_cache *cache, const struct
> netdev_flow_key *key,
> }
>
> /* Replacement policy: put the flow in an empty (not alive)
entry, or
> - * in the first entry where it can be */
> + * in the random entry where it can be */
> if (!to_be_replaced
> || (emc_entry_alive(to_be_replaced)
> - && !emc_entry_alive(current_entry))
> - || current_entry->key.hash < to_be_replaced->key.hash) {
> + && (!emc_entry_alive(current_entry) || random_value &
1))) {
> to_be_replaced = current_entry;
> + random_value >>= 1;
> }
> }
> /* We didn't find the miniflow in the cache.
> @@ -2077,11 +2080,17 @@ emc_probabilistic_insert(struct
dp_netdev_pmd_thread
> *pmd,
> * default the value is UINT32_MAX / 100 which yields an insertion
> * probability of 1/100 ie. 1% */
>
> - uint32_t min;
> + uint32_t min, random_value;
> atomic_read_relaxed(&pmd->dp->emc_insert_min, &min);
>
> - if (min && (random_uint32() & EM_FLOW_INSERT_INV_PROB_MASK) < min) {
> - emc_insert(&pmd->flow_cache, key, flow);
> + if (!min) {
> + return;
> + }
> +
> + random_value = random_uint32();
> + if ((random_value & EM_FLOW_INSERT_INV_PROB_MASK) < min) {
> + emc_insert(&pmd->flow_cache, key, flow,
> + random_value >> EM_FLOW_INSERT_INV_PROB_SHIFT);
> }
> }
>
> --
> 2.7.4
>
> _______________________________________________
> dev mailing list
> [email protected]
>
https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=gPngVJNy2q5ZxGlA8YyqJvmxn6ITNf2p_ExFf4bI1qA&s=catT2G8FNT5ScEbzPLhiCPWCH1Lk3A9BN4tt0egFtTo&e=
_______________________________________________
dev mailing list
[email protected]
https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=gPngVJNy2q5ZxGlA8YyqJvmxn6ITNf2p_ExFf4bI1qA&s=catT2G8FNT5ScEbzPLhiCPWCH1Lk3A9BN4tt0egFtTo&e=
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev