Hi Jerin,

> -----Original Message-----
> From: Jerin Jacob <[email protected]>
> Sent: Friday, March 6, 2020 3:45 PM
> To: Gavin Hu <[email protected]>
> Cc: dpdk-dev <[email protected]>; nd <[email protected]>; David Marchand
> <[email protected]>; [email protected];
> [email protected]; Ye, Xiaolong <[email protected]>; Honnappa
> Nagarahalli <[email protected]>; Ruifeng Wang
> <[email protected]>; Phil Yang <[email protected]>; Joyce Kong
> <[email protected]>; Steve Capper <[email protected]>
> Subject: Re: [dpdk-dev] [PATCH v1 3/3] net/i40e: auto-vectorization to
> speed up Tx free
> 
> On Fri, Mar 6, 2020 at 10:35 AM Gavin Hu <[email protected]> wrote:
> >
> > Tx mbuf free is a hotspot for i40e on aarch64. As there are no
> > inter-loop dependencies, it is safe to enable auto-vectorization
> > to speed it up.
> >
> > This patch showed 2~3% performance lift on ThunderX2 and no degradation
> > on Arm N1SDP. The test case is single core RFC2544 zero-loss test.
> >
> > Signed-off-by: Gavin Hu <[email protected]>
> > Reviewed-by: Steve Capper <[email protected]>
> > ---
> >  drivers/net/i40e/i40e_rxtx_vec_common.h | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h b/drivers/net/i40e/i40e_rxtx_vec_common.h
> > index 0e6ffa007..fc0fa45d4 100644
> > --- a/drivers/net/i40e/i40e_rxtx_vec_common.h
> > +++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
> > @@ -98,6 +98,11 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)
> >         if (likely(m != NULL)) {
> >                 free[0] = m;
> >                 nb_free = 1;
> > +#if defined(__clang__)
> > +#pragma clang loop vectorize(assume_safety)
> > +#elif defined(__GNUC__)
> > +#pragma GCC ivdep
> > +#endif
> 
> IMO, it is better to abstract the compiler features (the pragmas above
> and __restrict__) as macros in rte_common.h or similar. That would make
> it easier to support other compilers (ICC or Windows) and keep such
> changes in one place.

How about defining RTE_LOOP_AUTO_VECTORIZATION in rte_common.h?
/* _Pragma() is used because #pragma cannot appear inside a macro body. */
#if defined(__clang__)
#define RTE_LOOP_AUTO_VECTORIZATION \
        _Pragma("clang loop vectorize(assume_safety)")
#elif defined(__GNUC__)
#define RTE_LOOP_AUTO_VECTORIZATION \
        _Pragma("GCC ivdep")
#else
#define RTE_LOOP_AUTO_VECTORIZATION
#endif
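
For reference, a #pragma directive cannot be produced by a macro expansion, so a macro like this would presumably have to use the C99 _Pragma operator. Below is a minimal sketch of how it might look and be used. The macro name is the one proposed above (it is not an existing DPDK macro), the GCC version check reflects that "GCC ivdep" first appeared in GCC 4.9, and add_arrays() is a made-up helper for illustration only, not the i40e code.

```c
#include <stddef.h>

/* Hypothetical macro, name taken from the proposal above. */
#if defined(__clang__)
#define RTE_LOOP_AUTO_VECTORIZATION \
	_Pragma("clang loop vectorize(assume_safety)")
#elif defined(__GNUC__) && \
	(__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9))
#define RTE_LOOP_AUTO_VECTORIZATION _Pragma("GCC ivdep")
#else
/* Unknown compiler: expand to nothing, the loop stays scalar-safe. */
#define RTE_LOOP_AUTO_VECTORIZATION
#endif

/*
 * Illustrative helper (assumption, not driver code): element-wise add.
 * Each iteration is independent, which is what the hint asserts.
 */
static void
add_arrays(int *dst, const int *a, const int *b, size_t n)
{
	size_t i;

	/* The hint must sit directly in front of the loop it applies to. */
	RTE_LOOP_AUTO_VECTORIZATION
	for (i = 0; i < n; i++)
		dst[i] = a[i] + b[i];
}
```

The hint only tells the compiler it may assume there are no loop-carried dependencies; whether the loop is actually vectorized still depends on the optimization level and the target.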

If you agree, I will submit a v2. Thanks for your comments! 
/Gavin
> 
> 
> 
> >                 for (i = 1; i < n; i++) {
> >                         m = rte_pktmbuf_prefree_seg(txep[i].mbuf);
> >                         if (likely(m != NULL)) {
> > --
> > 2.17.1
> >
