http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48329

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Keywords|openmp                      |
   Last reconfirmed|                            |2011.03.29 10:31:56
          Component|middle-end                  |tree-optimization
                 CC|                            |rguenth at gcc dot gnu.org
     Ever Confirmed|0                           |1
            Summary|Program takes twice as long |Missed vectorization of
                   |*without* -fopenmp than     |reduction due to PRE
                   |with 1 OpenMP thread        |

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-03-29 10:31:56 UTC ---
We vectorize the reduction if the function is outlined.  I suppose something
confuses the vectorizer in the non-OMP path.  Yep, it's PRE, so try
-fno-tree-pre.  After PRE the loop

<bb 3>:
  # i_1 = PHI <1(2), i_22(4)>
  # sum_2 = PHI <0.0(2), sum_20(4)>
  # prephitmp.9_50 = PHI <5.66893424036281234980410020432668056299176519904892395524e-20(2), D.1586_48(4)>
  # ivtmp.12_10 = PHI <2100000000(2), ivtmp.12_11(4)>
  D.1574_17 = prephitmp.9_50 + 1.0e+0;
  D.1575_18 = ((D.1574_17));
  D.1576_19 = 4.0e+0 / D.1575_18;
  sum_20 = D.1576_19 + sum_2;
  ivtmp.12_11 = ivtmp.12_10 - 1;
  if (ivtmp.12_11 == 0)
    goto <bb 5>;
  else
    goto <bb 4>;

<bb 4>:
  i_22 = i_1 + 1;
  pretmp.8_44 = (real(kind=8)) i_22;
  pretmp.8_45 = pretmp.8_44 - 5.0e-1;
  pretmp.8_46 = ((pretmp.8_45));
  pretmp.8_47 = pretmp.8_46 * 4.76190476190476200439314681013558416822206709184683859348e-10;
  D.1586_48 = __builtin_pow (pretmp.8_47, 2.0e+0);
  goto <bb 3>;

is not detected as a reduction.  The non-empty latch block (bb 4) is
probably not the only reason, but it is at least one of them.
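
For readers who find the dump hard to follow, here is a rough C rendering
of the two loop shapes.  This is only a sketch reconstructed from the
GIMPLE above, not the reporter's test case (which is Fortran).  The
function names and the square() helper are made up for illustration;
square() stands in for the __builtin_pow (x, 2.0e+0) call in the dump,
and h for the constant 4.76...e-10 (1/2100000000).

```c
#include <assert.h>

/* Hypothetical stand-in for __builtin_pow (x, 2.0) from the dump. */
static double square(double x) { return x * x; }

/* Assumed source-level shape: everything is computed in the loop body,
   so the latch is empty and sum is a plain reduction. */
double sum_plain(long n)
{
    assert(n >= 1);
    const double h = 1.0 / (double)n;
    double sum = 0.0;
    for (long i = 1; i <= n; i++) {
        double x = h * ((double)i - 0.5);
        sum += 4.0 / (1.0 + square(x));
    }
    return sum;
}

/* Shape after PRE, mirroring bb 3 / bb 4: the square for the *next*
   iteration is computed in the latch and carried around the back edge. */
double sum_after_pre(long n)
{
    assert(n >= 1);
    const double h = 1.0 / (double)n;
    double sum = 0.0;
    double x2 = square(h * (1.0 - 0.5));   /* initial prephitmp value */
    long i = 1;
    long left = n;                         /* ivtmp counting down */
    for (;;) {
        sum += 4.0 / (1.0 + x2);           /* bb 3: header and body */
        if (--left == 0)
            break;
        i = i + 1;                         /* bb 4: non-empty latch */
        x2 = square(h * ((double)i - 0.5));
    }
    return sum;
}
```

Both functions compute the same sum; the second just has an extra
loop-carried value (x2) and work in the latch, which is the shape that
is not recognized as a reduction above.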
