Hello,

I would like to use the SSE2 psrlq instruction. There used to be
builtins named __builtin_ia32_psrlq128 and __builtin_ia32_psrlqi128.


The patch http://gcc.gnu.org/ml/gcc-patches/2007-03/msg01571.html
removed some SSE builtins, including these. However, the documentation
has not been updated accordingly, so one might be led to think that they
are still usable. Furthermore, the builtins currently seem to
half-exist; they do not appear to have prototypes. Code such as the
following no longer compiles:

typedef uint64_t v2di __attribute__((vector_size(16)));
v2di a,b,c;
int with_builtins()
{
    a =  __builtin_ia32_psrlqi128(b, 1);
    a =  __builtin_ia32_psrlq128(b, a);
}

sse-builtin-test.c: In function ‘with_builtins’:
sse-builtin-test.c:15: note: use -flax-vector-conversions to permit conversions
between vectors with differing element types or numbers of subparts
sse-builtin-test.c:15: error: incompatible type for argument 1 of
‘__builtin_ia32_psrlqi128’
sse-builtin-test.c:15: error: incompatible types in assignment
sse-builtin-test.c:16: error: incompatible type for argument 1 of
‘__builtin_ia32_psrlq128’
sse-builtin-test.c:16: error: incompatible type for argument 2 of
‘__builtin_ia32_psrlq128’
sse-builtin-test.c:16: error: incompatible types in assignment

(I would like to stay away from -flax-vector-conversions, as the code above
in reality does no ``vector conversions'' per se).

The patch http://gcc.gnu.org/ml/gcc-patches/2005-01/msg00468.html,
which was one of the purported justifications for the patch above,
apparently did not make it into gcc 4.3.0. This means that using
emmintrin.h is no better than the above:

int with_emm()
{
    a = _mm_srli_epi64(b, 1);
    a = _mm_srl_epi64(b, a);
}

sse-builtin-test.c: In function ‘with_emm’:
sse-builtin-test.c:23: note: use -flax-vector-conversions to permit conversions
between vectors with differing element types or numbers of subparts
sse-builtin-test.c:23: error: incompatible type for argument 1 of
‘_mm_srli_epi64’
sse-builtin-test.c:23: error: incompatible types in assignment
sse-builtin-test.c:24: error: incompatible type for argument 1 of
‘_mm_srl_epi64’
sse-builtin-test.c:24: error: incompatible type for argument 2 of
‘_mm_srl_epi64’
sse-builtin-test.c:24: error: incompatible types in assignment
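
For reference, the mismatch reported above seems to come from v2di being
built on uint64_t, whereas emmintrin.h declares __m128i with long long
elements. A minimal sketch of a possible workaround follows, assuming the
compiler accepts an explicit cast between two 16-byte vector types (a GCC
vector extension; I have not checked every release). The helper names
shift_right_imm1 and shift_right_var are hypothetical and only serve the
illustration.

```c
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics; 32-bit x86 builds need -msse2 */

typedef uint64_t v2di __attribute__((vector_size(16)));

/* Hypothetical helper: shift each 64-bit lane right by the immediate 1.
   The casts only reinterpret the 16 bytes; no element is converted. */
static inline v2di shift_right_imm1(v2di x)
{
    return (v2di)_mm_srli_epi64((__m128i)x, 1);
}

/* Hypothetical helper: shift each 64-bit lane right by the count stored
   in the low 64 bits of sh (the register form of psrlq). */
static inline v2di shift_right_var(v2di x, v2di sh)
{
    return (v2di)_mm_srl_epi64((__m128i)x, (__m128i)sh);
}
```

With such wrappers, a = shift_right_imm1(b) and a = shift_right_var(b, a)
would stand in for the two failing statements, without resorting to
-flax-vector-conversions.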

Next, I tried inline assembly:

static inline v2di my_psrlq(v2di x, v2di sh) __attribute__((always_inline));
static inline v2di my_psrlq(v2di x, v2di sh)
{
    __asm__("psrlq %1,%0" : "+x,x"(x) : "x,m"(sh));
    return x;
}

static inline v2di my_psrlqi(v2di x_, const unsigned int r_)
__attribute__((always_inline));
static inline v2di my_psrlqi(v2di x_, const unsigned int r_)
{
    __asm__("psrlq %1,%0" : "+x"(x_) : "J"(r_));
    return x_;
}

int with_extended_asm()
{
    a = my_psrlqi(b, 23);
    a = my_psrlq(b, a);
}

It works at first glance, but the my_psrlq() code above causes an ICE on
more involved code, as in the attachment below.

/tmp/l.c:33: internal compiler error: in reg_overlap_mentioned_p, at
rtlanal.c:1398



So my questions are:

- What is the status of __builtin_ia32_psrlq128 and friends?
- Should emmintrin.h be fixed?
- What is now the recommended way of accessing the psrlq instruction reliably?


Thanks,

E.


-- 
           Summary: Is the sse2 psrlq insn accessible ?
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: Emmanuel dot Thome at inria dot fr


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36370
