> -----Original Message----- > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev, Konstantin > Sent: Wednesday, February 14, 2018 12:35 PM > To: Richardson, Bruce <bruce.richard...@intel.com> > Cc: Yongseok Koh <ys...@mellanox.com>; Olivier Matz <olivier.m...@6wind.com>; > dev@dpdk.org > Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg() > > > > > -----Original Message----- > > From: Richardson, Bruce > > Sent: Wednesday, February 14, 2018 12:12 PM > > To: Ananyev, Konstantin <konstantin.anan...@intel.com> > > Cc: Yongseok Koh <ys...@mellanox.com>; Olivier Matz > > <olivier.m...@6wind.com>; dev@dpdk.org > > Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg() > > > > On Wed, Feb 14, 2018 at 12:03:55PM +0000, Ananyev, Konstantin wrote: > > > > > > > > > > -----Original Message----- > > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev, Konstantin > > > > Sent: Wednesday, February 14, 2018 11:48 AM > > > > To: Yongseok Koh <ys...@mellanox.com>; Olivier Matz > > > > <olivier.m...@6wind.com> > > > > Cc: dev@dpdk.org > > > > Subject: Re: [dpdk-dev] Accessing 2nd cacheline in > > > > rte_pktmbuf_prefree_seg() > > > > > > > > Hi Yongseok, > > > > > > > > > > On Feb 13, 2018, at 2:45 PM, Yongseok Koh <ys...@mellanox.com> > > > > > > wrote: > > > > > > > > > > > > Hi Olivier > > > > > > > > > > > > I'm wondering why rte_pktmbuf_prefree_seg() checks m->next instead > > > > > > of > > > > > > m->nb_segs? As 'next' is in the 2nd cacheline, checking nb_segs > > > > > > seems beneficial > > > > > > to the cases where almost mbufs have single segment. > > > > > > > > > > > > A customer reported high rate of cache misses in the code and I > > > > > > thought the > > > > > > following patch could be helpful. I haven't had them try it yet but > > > > > > just wanted > > > > > > to hear from you. > > > > > > > > > > > > I'd appreciate if you can review this idea. > > > > > > > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h > > > > > > index 62740254d..96edbcb9e 100644 > > > > > > --- a/lib/librte_mbuf/rte_mbuf.h > > > > > > +++ b/lib/librte_mbuf/rte_mbuf.h > > > > > > @@ -1398,7 +1398,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m) > > > > > > if (RTE_MBUF_INDIRECT(m)) > > > > > > rte_pktmbuf_detach(m); > > > > > > > > > > > > - if (m->next != NULL) { > > > > > > + if (m->nb_segs > 1) { > > > > > > m->next = NULL; > > > > > > m->nb_segs = 1; > > > > > > } > > > > > > @@ -1410,7 +1410,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m) > > > > > > if (RTE_MBUF_INDIRECT(m)) > > > > > > rte_pktmbuf_detach(m); > > > > > > > > > > > > - if (m->next != NULL) { > > > > > > + if (m->nb_segs > 1) { > > > > > > m->next = NULL; > > > > > > m->nb_segs = 1; > > > > > > } > > > > > > > > > > Well, m->pool in the 2nd cacheline has to be accessed anyway in order > > > > > to put it back to the mempool. > > > > > It looks like the cache miss is unavoidable. > > > > > > > > As a thought: in theory PMD can store pool pointer together with each > > > > mbuf it has to free, > > > > then it could be something like: > > > > > > > > if (rte_pktmbuf_prefree_seg(m[x] != NULL) > > > > rte_mempool_put(pool[x], m[x]); > > > > > > > > Then what you suggested above might help. > > > > > > After another thought - we have to check m->next not m->nb_segs. > > > There could be a situations where nb_segs==1, but m->next != NULL > > > (2-nd segment of the 3 segment packet for example). > > > So probably we have to keep it as it is. > > > Sorry for the noise > > > Konstantin > > > > It's still worth considering as an option. We could check nb_segs for > > the first segment of a packet and thereafter iterate using the next > > pointer. > > In multi-seg case PMD frees segments (not packets). > It could happen that first segment would be already freed while the second > still not. > > > It means that your idea of storing the pool pointer for each > > mbuf becomes useful for single-segment packets. > > But then we'll have to support 2 different flavors of prefree_seg(). > Alternative would be to change all PMDs multi-seg TX so when first segment is > going to be freed we update nb_segs for the second and so on. > Both options seems like too much hassle. >
As a side thought what probably can be done to minimize access to 2-nd mbuf's cache line at PMD tx free: Introduce something like that: static __rte_always_inline struct rte_mepool * xxx_prefree_seg(struct rte_mbuf *m) { if (rte_mbuf_refcnt_read(m) == 1 && RTE_MBUF_DIRECT(m)) { if (m->next != NULL) { m->next = NULL; m->nb_segs = 1; } return m->pool; } return NULL; } Then at tx_burst() before doing actual TX PMD can call that function and store it's return value along with mbuf: .. m[x] = pkt; pool[x] = xxx_prefree_seg(m[x]); Then at free time, we can do something ilike: If (pool[x] != NULL) rte_mempool_put(pool[x], m[x]); else rte_pktmbuf_free_seg(m[x]); We still access m->next but doing that before actual TX is done. Hopefully there would be more chances that m->next is still in the cache at that moment. In theory, that might help for most common case when we have direct mbufs with refcnt==1. Though for indirect/refcnt>1 mbufs there would be extra overhead. Konstantin