Hello,

I would like to use the SSE2 psrlq instruction. There used to be builtins named __builtin_ia32_psrlq128 and __builtin_ia32_psrlqi128.
The patch http://gcc.gnu.org/ml/gcc-patches/2007-03/msg01571.html has removed some SSE builtins, including these. However, the documentation has not been updated accordingly, so one might be led to think that they are still usable. Furthermore, the builtins at this moment seem to half-exist; they don't have prototypes, or so it seems. Things such as this no longer compile:

    typedef uint64_t v2di __attribute__((vector_size(16)));
    v2di a,b,c;
    int with_builtins()
    {
        a = __builtin_ia32_psrlqi128(b, 1);
        a = __builtin_ia32_psrlq128(b, a);
    }

    sse-builtin-test.c: In function ‘with_builtins’:
    sse-builtin-test.c:15: note: use -flax-vector-conversions to permit conversions between vectors with differing element types or numbers of subparts
    sse-builtin-test.c:15: error: incompatible type for argument 1 of ‘__builtin_ia32_psrlqi128’
    sse-builtin-test.c:15: error: incompatible types in assignment
    sse-builtin-test.c:16: error: incompatible type for argument 1 of ‘__builtin_ia32_psrlq128’
    sse-builtin-test.c:16: error: incompatible type for argument 2 of ‘__builtin_ia32_psrlq128’
    sse-builtin-test.c:16: error: incompatible types in assignment

(I would like to stay away from -flax-vector-conversions, as the code above in reality does no ``vector conversions'' per se).

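The note in the diagnostics above suggests that the mismatch is one of element type rather than of size: v2di here is built from uint64_t, while the __m128i type of emmintrin.h is, as far as I can tell, built from long long. Under that assumption, a minimal sketch that goes through explicit same-size vector casts, and so should not need -flax-vector-conversions, could look as follows. The helper names psrlq_imm and psrlq_var are made up for illustration, and this has not been checked against 4.3.0 specifically.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_srli_epi64, _mm_srl_epi64 */
#include <stdint.h>

/* Same vector type as in the test case above. */
typedef uint64_t v2di __attribute__((vector_size(16)));

/* Hypothetical helper: shift both 64-bit lanes right by the same count.
   The casts reinterpret the 16 bytes; they do not convert values. */
static inline v2di psrlq_imm(v2di x, int count)
{
    return (v2di) _mm_srli_epi64((__m128i) x, count);
}

/* Hypothetical helper: shift both lanes right by the count held in the
   low 64 bits of sh, which is how psrlq reads a vector count operand. */
static inline v2di psrlq_var(v2di x, v2di sh)
{
    return (v2di) _mm_srl_epi64((__m128i) x, (__m128i) sh);
}
```

With these, the two calls in with_builtins() would read a = psrlq_imm(b, 1); and a = psrlq_var(b, a);. Note that the upper lane of the count vector is ignored by the instruction.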
The patch http://gcc.gnu.org/ml/gcc-patches/2005-01/msg00468.html which was one of the purported justifications for the patch above, has apparently failed to make it for gcc 4.3.0; this means that using emmintrin.h is no better than above:

    int with_emm()
    {
        a = _mm_srli_epi64(b, 1);
        a = _mm_srl_epi64(b, a);
    }

    sse-builtin-test.c: In function ‘with_emm’:
    sse-builtin-test.c:23: note: use -flax-vector-conversions to permit conversions between vectors with differing element types or numbers of subparts
    sse-builtin-test.c:23: error: incompatible type for argument 1 of ‘_mm_srli_epi64’
    sse-builtin-test.c:23: error: incompatible types in assignment
    sse-builtin-test.c:24: error: incompatible type for argument 1 of ‘_mm_srl_epi64’
    sse-builtin-test.c:24: error: incompatible type for argument 2 of ‘_mm_srl_epi64’
    sse-builtin-test.c:24: error: incompatible types in assignment

Next, I try with inline assembly.

    static inline v2di my_psrlq(v2di x, v2di sh) __attribute__((always_inline));
    static inline v2di my_psrlq(v2di x, v2di sh)
    {
        __asm__("psrlq %1,%0" : "+x,x"(x) : "x,m"(sh));
        return x;
    }
    static inline v2di my_psrlqi(v2di x_, const unsigned int r_) __attribute__((always_inline));
    static inline v2di my_psrlqi(v2di x_, const unsigned int r_)
    {
        __asm__("psrlq %1,%0" : "+x"(x_) : "J"(r_));
        return x_;
    }
    int with_extended_asm()
    {
        a = my_psrlqi(b, 23);
        a = my_psrlq(b, a);
    }

It works at first glance, but the my_psrlq() code above causes an ICE on more involved code, as in the attachment below.

    /tmp/l.c:33: internal compiler error: in reg_overlap_mentioned_p, at rtlanal.c:1398

So my questions are:

- what is the status of __builtin_ia32_psrlq128 and friends?
- should emmintrin.h be fixed?
- what is now the recommended way of accessing the psrlq instruction reliably?

Thanks,

E.

-- 
Summary: Is the sse2 psrlq insn accessible?
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: Emmanuel dot Thome at inria dot fr

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36370