> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Tuesday, November 20, 2018 2:54 PM
> To: Zhang, Qi Z <qi.z.zh...@intel.com>; Richardson, Bruce
> <bruce.richard...@intel.com>; Wiles, Keith <keith.wi...@intel.com>
> Cc: dev@dpdk.org; Lu, Wenzhuo <wenzhuo...@intel.com>; Iremonger, Bernard
> <bernard.iremon...@intel.com>; sta...@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap performance
> 
> 
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Tuesday, November 20, 2018 5:26 PM
> > To: Zhang, Qi Z <qi.z.zh...@intel.com>; Richardson, Bruce
> > <bruce.richard...@intel.com>; Wiles, Keith <keith.wi...@intel.com>
> > Cc: dev@dpdk.org; Lu, Wenzhuo <wenzhuo...@intel.com>; Iremonger,
> > Bernard <bernard.iremon...@intel.com>; sta...@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap
> > performance
> >
> >
> >
> > > -----Original Message-----
> > > From: Zhang, Qi Z
> > > Sent: Tuesday, November 20, 2018 4:58 PM
> > > To: Ananyev, Konstantin <konstantin.anan...@intel.com>; Richardson,
> > > Bruce <bruce.richard...@intel.com>; Wiles, Keith
> > > <keith.wi...@intel.com>
> > > Cc: dev@dpdk.org; Lu, Wenzhuo <wenzhuo...@intel.com>; Iremonger,
> > > Bernard <bernard.iremon...@intel.com>; sta...@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap
> > > performance
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Ananyev, Konstantin
> > > > Sent: Tuesday, November 20, 2018 1:17 AM
> > > > To: Zhang, Qi Z <qi.z.zh...@intel.com>; Richardson, Bruce
> > > > <bruce.richard...@intel.com>; Wiles, Keith <keith.wi...@intel.com>
> > > > Cc: dev@dpdk.org; Lu, Wenzhuo <wenzhuo...@intel.com>; Iremonger,
> > > > Bernard <bernard.iremon...@intel.com>; Zhang, Qi Z
> > > > <qi.z.zh...@intel.com>; sta...@dpdk.org
> > > > Subject: RE: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap
> > > > performance
> > > >
> > > > Hi Qi,
> > > >
> > > > > -----Original Message-----
> > > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Qi Zhang
> > > > > Sent: Tuesday, November 20, 2018 4:46 AM
> > > > > To: Richardson, Bruce <bruce.richard...@intel.com>; Wiles, Keith
> > > > > <keith.wi...@intel.com>
> > > > > Cc: dev@dpdk.org; Lu, Wenzhuo <wenzhuo...@intel.com>; Iremonger,
> > > > > Bernard <bernard.iremon...@intel.com>; Zhang, Qi Z
> > > > > <qi.z.zh...@intel.com>; sta...@dpdk.org
> > > > > Subject: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap
> > > > > performance
> > > > >
> > > > > The patch optimizes the mac swap operation by taking advantage
> > > > > of SSE instructions, it only impacts x86 platform.
> > > > >
> > > > > Cc: sta...@dpdk.org
> > > > >
> > > > > Signed-off-by: Qi Zhang <qi.z.zh...@intel.com>
> > > > > ---
> > > > >  app/test-pmd/macswap.c | 16 +++++++++++++++-
> > > > >  1 file changed, 15 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
> > > > > > index a8384d5b8..0722782b0 100644
> > > > > --- a/app/test-pmd/macswap.c
> > > > > +++ b/app/test-pmd/macswap.c
> > > > > @@ -78,7 +78,6 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
> > > > >       struct rte_port  *txp;
> > > > >       struct rte_mbuf  *mb;
> > > > >       struct ether_hdr *eth_hdr;
> > > > > -     struct ether_addr addr;
> > > > >       uint16_t nb_rx;
> > > > >       uint16_t nb_tx;
> > > > >       uint16_t i;
> > > > > @@ -95,6 +94,15 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
> > > > >       start_tsc = rte_rdtsc();
> > > > >  #endif
> > > > >
> > > > > +#ifdef RTE_ARCH_X86
> > > > > +     __m128i addr;
> > > > > +     __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12,
> > > > > +                                     5, 4, 3, 2,
> > > > > +                                     1, 0, 11, 10,
> > > > > +                                     9, 8, 7, 6);
> > > > > +#else
> > > > > +     struct ether_addr addr;
> > > > > +#endif
> > > >
> > > > I think it would be better to place IA specific code into a separate
> > > > function (and probably into a separate .h file).
> > >
> > > OK, I will think about how to rework this.
> >
> > Ideally it would be good to have a generic one and an IA optimized version.
> >
> > >
> > > > BTW, just curious what % of improvement it gives?
> > >
> > > So far, the only server I can test on is a 1.6GHz Broadwell server
> > > with 2 ports on 1 i40e 25G.
> > > The macswap performance increased from 16.8mpps to 20mpps (about
> > > 19% improvement)

I need to add a note here: I found that the previous test was running on a
CPU from the remote socket.
For the test on a CPU from the local socket on the same server, the mac swap
performance actually improved from 23.34mpps to 26.36mpps. That is about a
12.9% increase, smaller than first reported but still considerable.

> >
> > Quite a lot, definitely looks like worth it.
> 
> You can probably squeeze out a few more cycles by doing it in bulks of 4 or so.

That's a good idea. In my testing I get a further increase of more than 4%
by batching with 4: it reaches 27.46mpps, so the total is now about a 17.7%
increase over the 23.34mpps baseline. I will send a patch later, please help
to polish :)
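
For reference, a minimal sketch of the idea discussed above, not the actual
follow-up patch. It reuses the shuffle mask from the diff and keeps a generic
path next to the IA one. The helper names mac_swap_one() and mac_swap_burst()
are hypothetical, and the headers are treated as raw byte buffers instead of
mbufs. Note that _mm_shuffle_epi8 is an SSSE3 instruction, so the sketch
guards on __SSSE3__ instead of RTE_ARCH_X86 (an assumption of this sketch).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSSE3__)
#include <tmmintrin.h>	/* _mm_shuffle_epi8 (SSSE3) */
#endif

/*
 * Hypothetical helper: swap destination and source MAC of one frame.
 * hdr must point to at least 16 accessible bytes, because the SSSE3
 * path loads and stores a full 128-bit register.
 */
static inline void
mac_swap_one(uint8_t *hdr)
{
#if defined(__SSSE3__)
	/* Same mask as in the patch: bytes 0-5 <-> 6-11, bytes 12-15 kept. */
	const __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12,
					      5, 4, 3, 2,
					      1, 0, 11, 10,
					      9, 8, 7, 6);
	__m128i addr = _mm_loadu_si128((const __m128i *)hdr);

	addr = _mm_shuffle_epi8(addr, shfl_msk);
	_mm_storeu_si128((__m128i *)hdr, addr);
#else
	/* Generic path: three 6-byte copies through a temporary. */
	uint8_t tmp[6];

	memcpy(tmp, hdr, 6);
	memcpy(hdr, hdr + 6, 6);
	memcpy(hdr + 6, tmp, 6);
#endif
}

/* Hypothetical burst helper: handle groups of 4, then the remainder. */
static inline void
mac_swap_burst(uint8_t **hdrs, unsigned int n)
{
	unsigned int i = 0;

	assert(hdrs != NULL || n == 0);

	/* Four independent swaps per iteration, so their loads can overlap. */
	for (; i + 4 <= n; i += 4) {
		mac_swap_one(hdrs[i]);
		mac_swap_one(hdrs[i + 1]);
		mac_swap_one(hdrs[i + 2]);
		mac_swap_one(hdrs[i + 3]);
	}
	for (; i < n; i++)
		mac_swap_one(hdrs[i]);
}
```

The 16-byte load touches 2 bytes past the 14-byte Ethernet header, which the
mask leaves unchanged. Whether unrolling by 4 in this form reproduces the
measured gain depends on the compiler and the platform, so it would need to
be benchmarked.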

Thanks
Qi

> Konstantin
