http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-03-20
     Ever Confirmed|0                           |1

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-03-20 09:17:41 UTC ---
Testing the 3 patches now (AVX2 improvements, expand_vselect and #c8 with
further comments).  For 3/4 insn sequences, I agree with the proposal to
attempt to handle d->op0 == d->op1 cross-lane shuffles as two-operand in-lane
shuffles after a vperm2f128 that swaps the lanes.  Two-insn expanders could be
grouped into expand_vec_perm_2 and three-insn expanders into expand_vec_perm_3.
We need to write some further 2 and 3 insn in-lane expanders though, as shown
by:
typedef double V4DF __attribute__((vector_size (4 * sizeof (double))));
typedef long V4DI __attribute__((vector_size (4 * sizeof (long))));

#define A(a, b, c, d) \
__attribute__((noinline, noclone)) V4DF \
f##a##b##c##d (V4DF x, V4DF y) \
{\
  V4DI m = { a, b, c, d }; \
  return __builtin_shuffle (x, y, m); \
}
#define B(b, c, d) A(0, b, c, d) A(1, b, c, d) A(4, b, c, d) A(5, b, c, d)
#define C(c, d) B(0, c, d) B(1, c, d) B(4, c, d) B(5, c, d)
#define D(d) C(2, d) C(3, d) C(6, d) C(7, d)
#define E D(2) D(3) D(6) D(7)
E

int
main ()
{
  V4DF x = { 0.5, 1.5, 2.5, 3.5 }, y = { 4.5, 5.5, 6.5, 7.5 }, z;
#undef A
#define A(a, b, c, d) \
  z = f##a##b##c##d (x, y); \
  if (z[0] != a + .5 || z[1] != b + .5 || z[2] != c + .5 || z[3] != d + .5) \
    __builtin_abort ();
  E
  return 0;
}
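
For reference, the lane-swap idea discussed above can be modelled in plain
scalar C.  This is only a sketch, not the actual i386 expander code: the
helper names (swap_lanes, to_in_lane_mask, shuffle2, permute_via_lane_swap)
are hypothetical and do not exist in GCC, and the arrays merely stand in for
256-bit V4DF registers.  It shows how a one-operand selector with cross-lane
elements turns into a two-operand selector in which every element stays in
its own 128-bit lane, once a lane-swapped copy of the input is available.

```c
#include <assert.h>

/* Scalar model only; all function names here are hypothetical.  */

/* Models vperm2f128 with the two 128-bit lanes swapped:
   out = { in[2], in[3], in[0], in[1] }.  */
static void
swap_lanes (const double in[4], double out[4])
{
  out[0] = in[2];
  out[1] = in[3];
  out[2] = in[0];
  out[3] = in[1];
}

/* Rewrite a one-operand selector (values 0..3) into a two-operand
   selector (values 0..7) over the pair (x, lane-swapped x).
   Values 0..3 pick from x, values 4..7 pick from the swapped copy.  */
static void
to_in_lane_mask (const int sel[4], int mask[4])
{
  for (int i = 0; i < 4; i++)
    {
      int dst_lane = i / 2;      /* lane the result element lives in */
      int src_lane = sel[i] / 2; /* lane the wanted element comes from */
      int pos = sel[i] % 2;      /* position within its lane */
      if (src_lane == dst_lane)
        mask[i] = dst_lane * 2 + pos;     /* already in-lane: use x */
      else
        mask[i] = 4 + dst_lane * 2 + pos; /* cross-lane: use swapped copy */
    }
}

/* Returns 1 if every element of a two-operand mask stays in its lane.  */
static int
mask_is_in_lane (const int mask[4])
{
  for (int i = 0; i < 4; i++)
    if ((mask[i] % 4) / 2 != i / 2)
      return 0;
  return 1;
}

/* Two-operand shuffle with the same index convention as
   __builtin_shuffle (a, b, mask) on 4-element vectors.  */
static void
shuffle2 (const double a[4], const double b[4], const int mask[4],
          double out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = mask[i] < 4 ? a[mask[i]] : b[mask[i] - 4];
}

/* One-operand permutation done as lane swap + in-lane shuffle.  */
static void
permute_via_lane_swap (const double x[4], const int sel[4], double out[4])
{
  double swapped[4];
  int mask[4];

  swap_lanes (x, swapped);
  to_in_lane_mask (sel, mask);
  assert (mask_is_in_lane (mask));
  shuffle2 (x, swapped, mask, out);
}
```

With x = { 0.5, 1.5, 2.5, 3.5 } and the one-operand selector { 3, 0, 1, 2 },
to_in_lane_mask yields { 5, 0, 7, 2 }, which has the same shape as the masks
the testcase above generates: elements 0 and 1 come from { 0, 1, 4, 5 } and
elements 2 and 3 from { 2, 3, 6, 7 }.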
