https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115709

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |53947
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
I don't think this works; in the end we have to add the even and odd elements
to compute b[i] (the real and imag parts).  Yes, the multiplies could happen
on unpermuted data, but your example assembly accumulates in the wrong way.

GCC produces

  vect__4.10_77 = MEM <vector(4) double> [(double *)a_15(D) + ivtmp.33_113 * 2];
  vect__4.11_79 = MEM <vector(4) double> [(double *)a_15(D) + 32B + ivtmp.33_113 * 2];
  vect_perm_even_80 = VEC_PERM_EXPR <vect__4.10_77, vect__4.11_79, { 0, 2, 4, 6 }>;
  vect_perm_odd_81 = VEC_PERM_EXPR <vect__4.10_77, vect__4.11_79, { 1, 3, 5, 7 }>;
  vect_powmult_7.13_83 = vect_perm_odd_81 * vect_perm_odd_81;
  vect__10.14_84 = .FMA (vect_perm_even_80, vect_perm_even_80, vect_powmult_7.13_83);
  MEM <vector(4) double> [(double *)b_16(D) + ivtmp.33_113 * 1] = vect__10.14_84;
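
For readers following along, here is a rough C sketch of what the GIMPLE
above computes. It is reconstructed from the dump only, not taken from the
testcase in the report, and all function names are hypothetical. It assumes
a[] holds interleaved (real, imag) pairs and that one vector iteration reads
8 doubles and writes 4. The last function illustrates the point that the
multiplies can run on unpermuted data, while the even/odd add is still needed.

```c
#include <stddef.h>

/* Scalar form implied by the dump: b[i] = re*re + im*im.
   Hypothetical name; the original testcase is not shown in this comment. */
void norm_scalar(const double *a, double *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        double re = a[2 * i];
        double im = a[2 * i + 1];
        b[i] = re * re + im * im;
    }
}

/* Emulates one vector iteration as in the dump: permute first, then
   multiply and accumulate. */
void norm_block4_perm_first(const double *a, double *b)
{
    double even[4], odd[4];
    for (int k = 0; k < 4; k++) {
        even[k] = a[2 * k];      /* selector { 0, 2, 4, 6 } */
        odd[k]  = a[2 * k + 1];  /* selector { 1, 3, 5, 7 } */
    }
    for (int k = 0; k < 4; k++) {
        double powmult = odd[k] * odd[k];
        /* Stands in for .FMA (even, even, powmult); a real FMA rounds once. */
        b[k] = even[k] * even[k] + powmult;
    }
}

/* Alternative ordering: square the unpermuted data, then add each
   even/odd neighbour pair.  The pairwise add cannot be skipped. */
void norm_block4_square_first(const double *a, double *b)
{
    double sq[8];
    for (int k = 0; k < 8; k++)
        sq[k] = a[k] * a[k];
    for (int k = 0; k < 4; k++)
        b[k] = sq[2 * k] + sq[2 * k + 1];
}
```

Both block variants produce one output per (even, odd) input pair; they differ
only in whether the permute happens before or after the multiplies.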


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
