> > >
> > >> diff --git a/app/test-pmd/macswap_sse.h b/app/test-pmd/macswap_sse.h
> > >> index 223f87a539..29088843b7 100644
> > >> --- a/app/test-pmd/macswap_sse.h
> > >> +++ b/app/test-pmd/macswap_sse.h
> > >> @@ -16,13 +16,13 @@ do_macswap(struct rte_mbuf *pkts[], uint16_t nb,
> > >>        uint64_t ol_flags;
> > >>        int i;
> > >>        int r;
> > >> -     __m128i addr0, addr1, addr2, addr3;
> > >> +     register __m128i addr0, addr1, addr2, addr3;
> > > Some compilers treat register as a no-op. Are you sure? Did you check
> > > with godbolt?
> >
> > Thank you Stephen, I have tested the code changes on Linux using the GCC
> > and Clang compilers.
> >
> > In both cases, on Linux, we have seen the values loaded into an `xmm`
> > register.
> >
> > ```
> > register const __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12, 5, 4, 3, 2,
> > 1, 0, 11, 10, 9, 8, 7, 6);
> > vmovdqa xmm0, xmmword ptr [rip + .LCPI0_0]

Yep, that's what I would expect: a one-time load before the loop starts,
right?
Curious what exactly it generates if the 'register' keyword is omitted?
BTW, on my box, gcc-11 with '-O3 -msse4.2 ...' shows the expected behavior
without the 'register' keyword.
Is it some particular compiler version that misbehaves?
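
For reference, the mask in the quoted snippet encodes a plain swap of the two
MAC addresses. Below is a minimal scalar sketch of that permutation, assuming
pshufb-style indexing (output byte i takes the source byte named by index i).
`shfl_idx` and `swap_mac_scalar` are hypothetical names used only for
illustration; this is not the test-pmd code.

```c
#include <stdint.h>
#include <string.h>

/*
 * Byte indices from the quoted _mm_set_epi8() call, listed here from
 * byte 0 upward (_mm_set_epi8 takes its arguments from byte 15 down to 0).
 */
static const uint8_t shfl_idx[16] = {
	6, 7, 8, 9, 10, 11,	/* output bytes 0-5  <- source MAC */
	0, 1, 2, 3, 4, 5,	/* output bytes 6-11 <- destination MAC */
	12, 13, 14, 15		/* remaining bytes stay in place */
};

/* Hypothetical helper: scalar stand-in for one 16-byte shuffle. */
static void
swap_mac_scalar(uint8_t hdr[16])
{
	uint8_t out[16];
	int i;

	for (i = 0; i < 16; i++)
		out[i] = hdr[shfl_idx[i]];
	memcpy(hdr, out, sizeof(out));
}
```

The index table is a compile-time constant and does not change between
packets, so it is loop-invariant. That is why a single load ahead of the loop
is the code I would expect from an optimizing compiler, with or without
'register'.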
 
> >
> > ```
> >
> > In both cases we see a performance improvement.
> >
> >
> > Can you please help us understand if we have missed something?
> 
> Ok, not sure why the compiler would not already decide to use a register here?
