https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107563

--- Comment #4 from cqwrteur <unlvsur at live dot com> ---
(In reply to cqwrteur from comment #2)
> (In reply to cqwrteur from comment #0)
> > #if defined(__SSE2__)
> > 
> > using temp_vec_type [[__gnu__::__vector_size__ (16)]] = char;
> > void foo(temp_vec_type& v) noexcept
> > {
> >     v=__builtin_shufflevector(v,v,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0);
> > }
> > 
> > #endif
> > 
> > g++ -S pq.cc -Ofast
> > shows that SSE2 is enabled by default, but the output uses neither
> > https://www.felixcloutier.com/x86/pshufb
> > nor
> > https://www.felixcloutier.com/x86/pshufd
> > 
> > whereas g++ -S pq.cc -Ofast -msse4.2 generates them correctly. This looks
> > like a bug.
> 
> Sorry, pshufb is SSSE3, not SSE2. But pshufd is SSE2, so it could be used to
> generate the right code.

https://godbolt.org/z/6baWWoE4e
BTW, -msse3 does not use pshufb either. I do not know why.
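
For reference, here is a rough sketch of how the same 16-byte reversal could be written by hand using only SSE2 intrinsics. It is not what GCC emits and not a proposed patch, only an illustration that pshufd alone is not enough (it moves 32-bit elements) and needs help from pshuflw/pshufhw and a shift/or pair. The names bytes16 and reverse16 are made up for this example. Note also that the Intel manual lists pshufb under SSSE3, so -mssse3 rather than -msse3 would be the flag to try for it.

```cpp
#include <algorithm>
#include <array>

#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics (also provides _MM_SHUFFLE)
#endif

// Hypothetical helper type for this sketch: 16 raw bytes.
using bytes16 = std::array<unsigned char, 16>;

// Reverse the order of 16 bytes. The name reverse16 is an assumption of
// this example, not something taken from the bug report.
inline bytes16 reverse16(const bytes16& in) noexcept
{
    bytes16 out;
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in.data()));
    // Swap the two bytes inside every 16-bit lane (psllw / psrlw / por).
    v = _mm_or_si128(_mm_slli_epi16(v, 8), _mm_srli_epi16(v, 8));
    // Reverse the four 16-bit lanes of the low half (pshuflw)
    // and of the high half (pshufhw).
    v = _mm_shufflelo_epi16(v, _MM_SHUFFLE(0, 1, 2, 3));
    v = _mm_shufflehi_epi16(v, _MM_SHUFFLE(0, 1, 2, 3));
    // Swap the two 64-bit halves (pshufd).
    v = _mm_shuffle_epi32(v, _MM_SHUFFLE(1, 0, 3, 2));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out.data()), v);
#else
    // Scalar fallback so the sketch still builds on non-x86 targets.
    std::reverse_copy(in.begin(), in.end(), out.begin());
#endif
    return out;
}
```

Whether this sequence is actually better than what the compiler currently produces at -Ofast would have to be measured; the sketch only shows that an SSE2-only shuffle sequence exists.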
