https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109230

--- Comment #6 from avieira at gcc dot gnu.org ---
Thanks!

My initial investigation has led me to think the change is introduced in
vrp2, which is the only pass where the pattern triggers at -O2.  The tree
before the pass (at the place where the transformation happens):

  vect__83.466_787 = VEC_PERM_EXPR <vect__83.456_796, vect__83.456_796, { 1, 1 }>;
  vect__87.467_786 = vect__81.462_791 * vect__83.466_787;
  vect__91.469_784 = vect__84.458_794 - vect__87.467_786;
  vect__88.468_785 = vect__84.458_794 + vect__87.467_786;
  _783 = VEC_PERM_EXPR <vect__88.468_785, vect__91.469_784, { 0, 3 }>;
 ...
  vect__96.470_782 = vect__95.450_800 - _783;

after the pass:
  vect__83.466_787 = VEC_PERM_EXPR <vect__83.456_796, vect__83.456_796, { 1, 1 }>;
  vect__87.467_786 = vect__83.466_787 * vect__81.462_791;
  vect__91.469_784 = vect__84.458_794 - vect__87.467_786;
  vect__88.468_785 = vect__87.467_786 + vect__84.458_794;
  _756 = VIEW_CONVERT_EXPR<double>(vect__87.467_786);
  _755 = -_756;
  _739 = VIEW_CONVERT_EXPR<vector(2) float>(_755);
  _783 = _739 + vect__84.458_794;
...
  vect__96.470_782 = vect__95.450_800 - _783;

So before we had:
_783 = the first element of vect__88 and the second element of vect__91.
These are, respectively:
vect__88 = vect__84 + vect__87
vect__91 = vect__84 - vect__87
so _783 = { vect__84[0] + vect__87[0], vect__84[1] - vect__87[1] }
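
The lane selection above can be checked with a small scalar model.  This is
only a sketch: vec_perm2 and addsub_before are hypothetical helpers, not GCC
functions, and the model assumes the usual VEC_PERM_EXPR convention that
selector indices 0..N-1 pick from the first operand and N..2N-1 from the
second.

```c
#include <stddef.h>

/* Hypothetical scalar model of a two-lane VEC_PERM_EXPR <a, b, sel>:
   indices 0..1 select from a, indices 2..3 select from b.  */
static void
vec_perm2 (const float a[2], const float b[2], const int sel[2], float out[2])
{
  for (size_t i = 0; i < 2; i++)
    out[i] = sel[i] < 2 ? a[sel[i]] : b[sel[i] - 2];
}

/* Model of _783 before vrp2: lane 0 of the sum, lane 1 of the difference.  */
static void
addsub_before (const float v84[2], const float v87[2], float out[2])
{
  float sum[2]  = { v84[0] + v87[0], v84[1] + v87[1] };  /* vect__88 */
  float diff[2] = { v84[0] - v87[0], v84[1] - v87[1] };  /* vect__91 */
  const int sel[2] = { 0, 3 };
  vec_perm2 (sum, diff, sel, out);
}
```

With vect__84 = { 10, 20 } and vect__87 = { 1, 2 } this model gives
{ 10 + 1, 20 - 2 }, i.e. the add lands in lane 0 and the subtract in lane 1.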

after the pass
_783 = _739 + vect__84
This is where I'm not sure I'm reading the optimization correctly.  It says
all 'even' lanes are negated; does that mean we end up with:
_739 = { -vect__87[0], vect__87[1] }
If so, that explains the wrong result, since you want to negate lane 1, not
lane 0.  If instead lane 1 is the one that gets negated, it should be OK, as
you'd get:
_783 = { vect__87[0] + vect__84[0], -vect__87[1] + vect__84[1] }
That of course assumes -a + b == b - a (I'm not sure that holds exactly in
floating point, given rounding errors etc.)
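
Which lane the negation hits can be tested outside the compiler.  The sketch
below models the VIEW_CONVERT_EXPR / negate / VIEW_CONVERT_EXPR sequence with
memcpy.  The helper names are made up for illustration, and it assumes IEEE 754
float and double, where negating a double only flips its top bit.  Under that
assumption the flipped bit belongs to whichever float occupies the high half of
the double, which would be lane 1 on a little-endian target and lane 0 on a
big-endian one.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of:
     _756 = VIEW_CONVERT_EXPR<double>(v);
     _755 = -_756;
     _739 = VIEW_CONVERT_EXPR<vector(2) float>(_755);
   Assumes sizeof (double) == 2 * sizeof (float).  */
static void
negate_via_double (const float v[2], float out[2])
{
  double d;
  assert (sizeof d == 2 * sizeof (float));
  memcpy (&d, v, sizeof d);   /* reinterpret the two floats as one double */
  d = -d;                     /* flips the sign bit of the double */
  memcpy (out, &d, sizeof d); /* reinterpret back as two floats */
}

/* Return the index of the single lane whose sign changed, or -1 if
   neither or both lanes changed.  */
static int
negated_lane (void)
{
  const float v[2] = { 1.0f, 2.0f };
  float r[2];
  negate_via_double (v, r);
  int lane0 = r[0] == -v[0];
  int lane1 = r[1] == -v[1];
  return lane0 == lane1 ? -1 : (lane1 ? 1 : 0);
}

/* Model of _783 after vrp2: _739 + vect__84.  */
static void
addsub_after (const float v84[2], const float v87[2], float out[2])
{
  float neg[2];
  negate_via_double (v87, neg);
  out[0] = neg[0] + v84[0];
  out[1] = neg[1] + v84[1];
}
```

If negated_lane () returns 1, addsub_after computes
{ vect__87[0] + vect__84[0], -vect__87[1] + vect__84[1] }.  On the
-a + b == b - a question: as far as I understand IEEE 754, b - a is defined as
b + (-a) and addition is commutative, so for non-NaN inputs the two forms
should round identically; NaN sign and payload are the cases I would not rely
on.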
