------- Comment #1 from rguenth at gcc dot gnu dot org  2009-07-16 09:44 -------
The middle-end presents the vectorizer with

<bb 3>:
  # i_13 = PHI <i_7(4), 0(2)>
  # ivtmp.26_8 = PHI <ivtmp.26_16(4), 1024(2)>
  D.1623_3 = xd[i_13];
  sincostmp.21_1 = __builtin_cexpi (D.1623_3);
  D.1624_4 = IMAGPART_EXPR <sincostmp.21_1>;
  sd[i_13] = D.1624_4;
  D.1625_6 = REALPART_EXPR <sincostmp.21_1>;
  cd[i_13] = D.1625_6;
  i_7 = i_13 + 1;
  ivtmp.26_16 = ivtmp.26_8 - 1;
  if (ivtmp.26_16 != 0)
    goto <bb 4>;
  else
    goto <bb 5>;

which, first of all, contains complex types.  These should be recognized as
V2DF with a vectorization factor of 1, and are thus SLP-able.

For the float case

<bb 3>:
  # i_13 = PHI <i_7(4), 0(2)>
  # ivtmp.6_8 = PHI <ivtmp.6_16(4), 1024(2)>
  D.1610_3 = xf[i_13];
  sincostmp.1_1 = __builtin_cexpif (D.1610_3);
  D.1611_4 = IMAGPART_EXPR <sincostmp.1_1>;
  sf[i_13] = D.1611_4;
  D.1612_6 = REALPART_EXPR <sincostmp.1_1>;
  cf[i_13] = D.1612_6;
  i_7 = i_13 + 1;
  ivtmp.6_16 = ivtmp.6_8 - 1;
  if (ivtmp.6_16 != 0)
    goto <bb 4>;
  else
    goto <bb 5>;

the complex values should be V2SF, so the vectorizer should use V4SF with a
vectorization factor of 2, probably still via SLP.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
