LGTM. I think this patch could work fine in conjunction with the one I posted https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335940.html where I'm targeting a congestion usecase with recirculated packets. The goal is still to limit thrashing and the criteria is to avoid EMC lookup and insertions for recirculated packets.
I gave a try to your 2 patches, I'm using Commit 325b2b1a493a2230072de726bbb53a8337759f39 On top of that I applied Ciara's patch "dpif-netdev: add EMC entry count and %full figure to pmd-stats-show" at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327570.html because I wanted to see the count of EMC entries. I generated the different traffic streams by looping on IP addresses. Of course results are strictly dependent on the particular data traffic. I sent traffic 64B UDP packets at the line rate and measured the Rx packet rate, regardless of pkt loss. Flow setup: table=0, in_port=dpdk0 actions=output:dpdk1 table=0, in_port=dpdk1 actions=output:dpdk0 2 PMDs, 3 Tx queues. To read entries counts: sudo ./utilities/ovs-appctl dpif-netdev/pmd-stats-show | grep entries Results ======= +=========================+============================ | Orig + Ciara's patch | + these 2 patches ========+=========================+============================ Streams | Entries Rx [Mpps] | Entries Rx [Mpps] --------+-------------------------+---------------------------- 100 | 100, 100 9.76, 10.05 | 100, 100 10.55, 10.72 1000 | 1000, 999 7.85, 8.09 | 1000, 1000 8.77, 8.91 3000 | 2993, 2987 7.61, 7.87 | 3000, 3000 8.58, 7.89 5000 | 4336, 4872 7.49, 7.26 | 4496, 5000 8.57, 7.28 7000 | 5870, 7000 8.57, 7.20 | 6039, 7000 8.56, 7.26 9000 | 6550, 7572 7.19, 6.91 | 6643, 7836 7.06, 6.90 11000 | 7152, 8192 6.81, 6.83 | 7158, 8192 6.81, 6.79 --------+-------------------------+---------------------------- It's interesting to see how these patches allow to use more EMC locations - see cases from 3000 to 9000 streams - so reducing thrashing. -Antonio > -----Original Message----- > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org] > On Behalf Of Ilya Maximets > Sent: Monday, July 31, 2017 3:41 PM > To: ovs-dev@openvswitch.org > Cc: Heetae Ahn <heetae82....@samsung.com>; Ilya Maximets > <i.maxim...@samsung.com> > Subject: [ovs-dev] [PATCH v2 2/2] dpif-netdev: Fix emc replacement policy. > > Current EMC replacement policy allows to replace active EMC entry > even if there are dead (empty) entries available. This leads to > EMC trashing even on few hundreds of flows. In some cases PMD > threads starts to execute classifier lookups even in tests with > 50 - 100 active flows. > > Looks like the hash comparison rule was introduced to randomly > choose one of alive entries to replace. But it doesn't work as > needed and also hashes has nothing common with randomness. > > Lets fix the replacement policy by removing hash checking and > using the random value passed from 'emc_probabilistic_insert()' > only while considering replace of the alive entry. [Antonio] I like the approach to re-use the info we already have, in this case the random value. > This should give us nearly fair way to choose the entry to replace. > > We are avoiding calculation of the new random value by reusing > bits of already generated random for probabilistic EMC insertion. > Bits higher than 'EM_FLOW_INSERT_INV_PROB_SHIFT' are used because > lower bits are less than 'min' and not fully random. > > Not replacing of alive entries while dead ones exists allows to > significantly decrease EMC trashing. > > Testing shows stable work of exact match cache without misses > with up to 3072 - 6144 active flows (depends on traffic pattern). > > Signed-off-by: Ilya Maximets <i.maxim...@samsung.com> > --- > lib/dpif-netdev.c | 23 ++++++++++++++++------- > 1 file changed, 16 insertions(+), 7 deletions(-) > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index 123a7c9..a714329 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -155,6 +155,9 @@ struct netdev_flow_key { > #define EM_FLOW_INSERT_INV_PROB_SHIFT 20 > #define EM_FLOW_INSERT_INV_PROB_MAX (1 << EM_FLOW_INSERT_INV_PROB_SHIFT) > #define EM_FLOW_INSERT_INV_PROB_MASK (EM_FLOW_INSERT_INV_PROB_MAX - 1) > +/* We will use bits higher than EM_FLOW_INSERT_INV_PROB_SHIFT of the random > + * value for EMC replacement policy. */ > +BUILD_ASSERT_DECL(32 - EM_FLOW_INSERT_INV_PROB_SHIFT >= EM_FLOW_HASH_SEGS); > /* Default EMC insert probability is 1 / DEFAULT_EM_FLOW_INSERT_INV_PROB */ > #define DEFAULT_EM_FLOW_INSERT_INV_PROB 100 > #define DEFAULT_EM_FLOW_INSERT_MIN (EM_FLOW_INSERT_INV_PROB_MAX / \ > @@ -2041,7 +2044,7 @@ emc_change_entry(struct emc_entry *ce, struct > dp_netdev_flow *flow, > > static inline void > emc_insert(struct emc_cache *cache, const struct netdev_flow_key *key, > - struct dp_netdev_flow *flow) > + struct dp_netdev_flow *flow, uint32_t random_value) > { > struct emc_entry *to_be_replaced = NULL; > struct emc_entry *current_entry; > @@ -2054,12 +2057,12 @@ emc_insert(struct emc_cache *cache, const struct > netdev_flow_key *key, > } > > /* Replacement policy: put the flow in an empty (not alive) entry, or > - * in the first entry where it can be */ > + * in the random entry where it can be */ > if (!to_be_replaced > || (emc_entry_alive(to_be_replaced) > - && !emc_entry_alive(current_entry)) > - || current_entry->key.hash < to_be_replaced->key.hash) { > + && (!emc_entry_alive(current_entry) || random_value & 1))) { > to_be_replaced = current_entry; > + random_value >>= 1; > } > } > /* We didn't find the miniflow in the cache. > @@ -2077,11 +2080,17 @@ emc_probabilistic_insert(struct dp_netdev_pmd_thread > *pmd, > * default the value is UINT32_MAX / 100 which yields an insertion > * probability of 1/100 ie. 1% */ > > - uint32_t min; > + uint32_t min, random_value; > atomic_read_relaxed(&pmd->dp->emc_insert_min, &min); > > - if (min && (random_uint32() & EM_FLOW_INSERT_INV_PROB_MASK) < min) { > - emc_insert(&pmd->flow_cache, key, flow); > + if (!min) { > + return; > + } > + > + random_value = random_uint32(); > + if ((random_value & EM_FLOW_INSERT_INV_PROB_MASK) < min) { > + emc_insert(&pmd->flow_cache, key, flow, > + random_value >> EM_FLOW_INSERT_INV_PROB_SHIFT); > } > } > > -- > 2.7.4 > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev