On Sat, Apr 18, 2026 at 09:56:38AM +0000, Morten Brørup wrote:
> Freeing mbufs directly into the mempool meant that mbuf instrumentation,
> including mbuf history marking, was omitted.
> The mbufs are now freed via the rte_mbuf_raw_free_bulk() function instead.
>
> Added a static_assert to ensure that type casting the array of struct
> ci_tx_entry_vec to an array of rte_mbuf pointers remains sound.
>
> Performance note:
> The (n & 31) condition was not removed.
> For the default tx_rs_thresh value (32), the condition will be true.
> And due to inlining, the rte_mbuf_raw_free_bulk() ends up in an
> rte_memcpy(), where the optimizer takes advantage of knowing that the
> lower bits are not set.
> This should compensate somewhat for removing the handcoded optimization of
> copying in chunks of 32 mbufs.
>
> Signed-off-by: Morten Brørup <[email protected]>
> ---

Ran a very quick perf test using a couple of 100G ports, no regression seen
with this patch, maybe even a slight perf bump. Therefore:

Acked-by: Bruce Richardson <[email protected]>
Tested-by: Bruce Richardson <[email protected]>

One comment inline below:

>  doc/guides/rel_notes/release_26_07.rst |  4 +++
>  drivers/net/intel/common/tx.h          | 36 +++-----------------------
>  2 files changed, 7 insertions(+), 33 deletions(-)
>
> diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
> index 060b26ff61..9367d38b13 100644
> --- a/doc/guides/rel_notes/release_26_07.rst
> +++ b/doc/guides/rel_notes/release_26_07.rst
> @@ -24,6 +24,10 @@ DPDK Release 26.07
>  New Features
>  ------------
>
> +* **Updated Intel common driver.**
> +
> +  * Added missing mbuf history marking to vectorized Tx path for MBUF_FAST_FREE.
> +

I don't think this is a big enough change to require a release note update.
It's really more of a bug fix. If you are ok with it, I'd like to drop this RN
entry on apply of the patch?

>  .. This section should contain new features added in this release.
>     Sample format:
>
> diff --git a/drivers/net/intel/common/tx.h b/drivers/net/intel/common/tx.h
> index 283bd58d5d..4a201da83c 100644
> --- a/drivers/net/intel/common/tx.h
> +++ b/drivers/net/intel/common/tx.h
> @@ -285,42 +285,12 @@ ci_tx_free_bufs_vec(struct ci_tx_queue *txq, ci_desc_done_fn desc_done, bool ctx
>  			(txq->fast_free_mp = txep[0].mbuf->pool);
>
>  	if (mp != NULL && (n & 31) == 0) {
> -		void **cache_objs;
> -		struct rte_mempool_cache *cache = rte_mempool_default_cache(mp, rte_lcore_id());
> -
> -		if (cache == NULL)
> -			goto normal;
> -
> -		cache_objs = &cache->objs[cache->len];
> -
> -		if (n > RTE_MEMPOOL_CACHE_MAX_SIZE) {
> -			rte_mempool_ops_enqueue_bulk(mp, (void *)txep, n);
> -			goto done;
> -		}
> -
> -		/* The cache follows the following algorithm
> -		 *   1. Add the objects to the cache
> -		 *   2. Anything greater than the cache min value (if it
> -		 *   crosses the cache flush threshold) is flushed to the ring.
> -		 */
> -		/* Add elements back into the cache */
> -		uint32_t copied = 0;
> -		/* n is multiple of 32 */
> -		while (copied < n) {
> -			memcpy(&cache_objs[copied], &txep[copied], 32 * sizeof(void *));
> -			copied += 32;
> -		}
> -		cache->len += n;
> -
> -		if (cache->len >= cache->flushthresh) {
> -			rte_mempool_ops_enqueue_bulk(mp, &cache->objs[cache->size],
> -					cache->len - cache->size);
> -			cache->len = cache->size;
> -		}
> +		static_assert(sizeof(*txep) == sizeof(struct rte_mbuf *),
> +				"txep array is not similar to an array of rte_mbuf pointers");
> +		rte_mbuf_raw_free_bulk(mp, (void *)txep, n);
>  		goto done;
>  	}
>
> -normal:
>  	m = rte_pktmbuf_prefree_seg(txep[0].mbuf);
>  	if (likely(m)) {
>  		free[0] = m;
> --
> 2.43.0
>
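
For anyone following the static_assert reasoning in the tx.h hunk, here is a
minimal standalone sketch of the cast it guards. It does not use DPDK at all:
struct mbuf, struct tx_entry_vec, raw_free_bulk() and free_entries() are
hypothetical stand-ins for rte_mbuf, ci_tx_entry_vec, rte_mbuf_raw_free_bulk()
and the fast-free branch, and the single-pointer-member layout of the entry
struct is an assumption implied by the assertion in the patch, not copied from
the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_mbuf; only tracks whether it was freed. */
struct mbuf {
	int freed;
};

/* Hypothetical stand-in for struct ci_tx_entry_vec.
 * Assumption: the entry holds exactly one mbuf pointer and nothing else.
 */
struct tx_entry_vec {
	struct mbuf *mbuf;
};

/* Compile-time guard: if someone adds a field to the entry struct, the cast
 * in free_entries() would silently walk the array with the wrong stride.
 */
static_assert(sizeof(struct tx_entry_vec) == sizeof(struct mbuf *),
		"entry array is not similar to an array of mbuf pointers");

/* Hypothetical stand-in for a bulk free taking an array of mbuf pointers.
 * Marks each mbuf as freed and returns how many were processed.
 */
static unsigned int
raw_free_bulk(struct mbuf **mbufs, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		mbufs[i]->freed = 1;
	return n;
}

/* Treats the entry array as an array of mbuf pointers, as the patch does.
 * This relies on the layout checked by the static_assert above.
 */
static unsigned int
free_entries(struct tx_entry_vec *txep, unsigned int n)
{
	return raw_free_bulk((void *)txep, n);
}
```

Calling free_entries() on an array of 32 entries should mark all 32 backing
mbufs as freed. If a second member were added to struct tx_entry_vec, the
build would fail at the static_assert instead of corrupting the walk at run
time, which is the point of the guard.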

