https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101207

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ah, so what happens is that we originally elide the load permutation that
feeds the plus reduction, but then we vectorize the live operands of the
minus reduction as BIT_FIELD_REFs, which end up extracting the wrong lanes.
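
To illustrate why dropping the permutation is harmless for the plus but not
for the lane extracts, here is a rough sketch in plain C. It is only a model of
the effect, not what the vectorizer emits: the v2df struct and all helper names
are hypothetical, and the lane order assumed for the permuted load is an
assumption made for illustration.

```c
/* Hypothetical two-lane vector model; not GCC's internal representation. */
typedef struct { double lane[2]; } v2df;

/* Assumed permuted load: lanes in the order the minus expects, { a[1], a[0] }. */
v2df load_permuted(const double *a)
{
  v2df v = { { a[1], a[0] } };
  return v;
}

/* Load with the permutation elided: lanes in memory order, { a[0], a[1] }. */
v2df load_plain(const double *a)
{
  v2df v = { { a[0], a[1] } };
  return v;
}

/* Plus reduction: the result does not depend on lane order. */
double reduce_plus(v2df v)
{
  return v.lane[0] + v.lane[1];
}

/* Minus on extracted lanes (standing in for the BIT_FIELD_REFs):
   the result does depend on lane order. */
double lane0_minus_lane1(v2df v)
{
  return v.lane[0] - v.lane[1];
}
```

With a[0] = 0. and a[1] = 1., reduce_plus gives the same value for both loads,
while lane0_minus_lane1 changes sign once the permutation is dropped.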

Testcase for x86_64:

/* { dg-additional-options "-ftree-slp-vectorize -ffast-math" } */

double a[2];
double x, y;

void __attribute__((noipa)) foo ()
{
  x = a[1] - a[0];
  y = a[0] + a[1];
}

int main()
{
  a[0] = 0.;
  a[1] = 1.;
  foo ();
  if (x != 1. || y != 1.)
    __builtin_abort ();
  return 0;
}
