https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67553

--- Comment #2 from tmb99 at gmx dot net ---
This seems to be the same for most saturating instructions:

__m128i v0 = _mm_setzero_si128();
__m128i v2 = _mm_setzero_si128();
__m128i sum = _mm_adds_epi16(v0,v2);
__m128i dif = _mm_subs_epi8(v0,v2);
__m128i hsum = _mm_hadds_epi16(v0,v2);
__m128i hdif = _mm_hsubs_epi16(v0,v2);
__m128i pacu = _mm_packus_epi16(v0,v2);
__m128i pacs = _mm_packs_epi32(v0,v2);

compiles to:

vpxor   %xmm0, %xmm0, %xmm0
vpxor   %xmm2, %xmm2, %xmm2
vphsubsw        %xmm0, %xmm0, %xmm4
vpackuswb       %xmm0, %xmm0, %xmm3
vphaddsw        %xmm0, %xmm0, %xmm5
vpsubsb %xmm2, %xmm2, %xmm2
vpxor   %xmm1, %xmm1, %xmm1
vpaddsw %xmm0, %xmm0, %xmm0
vpackssdw       %xmm1, %xmm1, %xmm1

Also: three setzero/vpxor instructions are emitted instead of just one.
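
For reference, here is a minimal scalar sketch of what a single lane of two of these instructions computes. It assumes the point of the report is that the results are known at compile time when both inputs are zero. The helper names adds16 and packus16 are hypothetical, made up for illustration; they are not part of GCC or of the intrinsics headers.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane of _mm_adds_epi16:
   signed addition that clamps to the int16_t range instead of wrapping. */
static int16_t adds16(int16_t a, int16_t b)
{
    int32_t s = (int32_t)a + (int32_t)b;   /* widen so the sum cannot overflow */
    if (s > INT16_MAX) return INT16_MAX;
    if (s < INT16_MIN) return INT16_MIN;
    return (int16_t)s;
}

/* Hypothetical scalar model of one lane of _mm_packus_epi16:
   narrow a signed 16-bit value to unsigned 8 bits with saturation. */
static uint8_t packus16(int16_t a)
{
    if (a > UINT8_MAX) return UINT8_MAX;
    if (a < 0) return 0;
    return (uint8_t)a;
}
```

With zero inputs, as in the test case above, every lane is simply zero (adds16(0, 0) and packus16(0) both return 0), so in principle the whole sequence could collapse to a single zeroed register.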
