https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109230
--- Comment #6 from avieira at gcc dot gnu.org ---
Thanks!

My initial investigation has led me to think the change is caused at vrp2, which is the only time the pattern gets triggered with -O2.

The tree before the pass (at the place where the transformation happens):

  vect__83.466_787 = VEC_PERM_EXPR <vect__83.456_796, vect__83.456_796, { 1, 1 }>;
  vect__87.467_786 = vect__81.462_791 * vect__83.466_787;
  vect__91.469_784 = vect__84.458_794 - vect__87.467_786;
  vect__88.468_785 = vect__84.458_794 + vect__87.467_786;
  _783 = VEC_PERM_EXPR <vect__88.468_785, vect__91.469_784, { 0, 3 }>;
  ...
  vect__96.470_782 = vect__95.450_800 - _783;

After the pass:

  vect__83.466_787 = VEC_PERM_EXPR <vect__83.456_796, vect__83.456_796, { 1, 1 }>;
  vect__87.467_786 = vect__83.466_787 * vect__81.462_791;
  vect__91.469_784 = vect__84.458_794 - vect__87.467_786;
  vect__88.468_785 = vect__87.467_786 + vect__84.458_794;
  _756 = VIEW_CONVERT_EXPR<double>(vect__87.467_786);
  _755 = -_756;
  _739 = VIEW_CONVERT_EXPR<vector(2) float>(_755);
  _783 = _739 + vect__84.458_794;
  ...
  vect__96.470_782 = vect__95.450_800 - _783;

So before the pass, _783 was the first element of vect__88 and the second element of vect__91. These are respectively:

  vect__88 = vect__84 + vect__87
  vect__91 = vect__84 - vect__87

so

  _783 = { vect__84[0] + vect__87[0], vect__84[1] - vect__87[1] }

After the pass:

  _783 = _739 + vect__84

This is where I don't know if I'm reading the optimization correctly, but it says all 'even' lanes are negated. Does that mean we end up with:

  _739 = { -vect__87[0], vect__87[1] }

If so, then that's why we have a wrong result, as you want to negate lane 1, not lane 0. Otherwise, if lane 1 is the one that gets negated, then it should be OK, as you'd get:

  _783 = { vect__87[0] + vect__84[0], -vect__87[1] + vect__84[1] }

Now obviously that's assuming -a + b == b - a (not sure if that's true with floating point errors etc).
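
To make the lane question concrete, here is a minimal C sketch of what the rewritten sequence does to a pair of floats. It is an illustration only, not GCC's implementation: the helper names (negate_as_double, blend_add_sub, neg_then_add) are made up for this example, and memcpy stands in for VIEW_CONVERT_EXPR. Negating the double flips only the top bit of the 64-bit value, so which float lane changes sign depends on the byte order of the machine running it. On a little-endian host that bit lands in the float at index 1 in memory.

```c
#include <assert.h>
#include <string.h>

static_assert(sizeof(double) == 2 * sizeof(float),
              "sketch assumes a double spans exactly two floats");

/* Hypothetical helper: view two float lanes as one double, negate it,
   and view the result as two float lanes again. */
static void negate_as_double(const float in[2], float out[2])
{
    double d;
    memcpy(&d, in, sizeof d);   /* reinterpret both lanes as one double */
    d = -d;                     /* flips only the sign bit of the 64-bit value */
    memcpy(out, &d, sizeof d);  /* reinterpret back as two float lanes */
}

/* Form before the pass: lane 0 from a + b, lane 1 from a - b,
   i.e. the { 0, 3 } permute of the two results. */
static void blend_add_sub(const float a[2], const float b[2], float out[2])
{
    out[0] = a[0] + b[0];
    out[1] = a[1] - b[1];
}

/* Form after the pass: negate b through the double view, then add a. */
static void neg_then_add(const float a[2], const float b[2], float out[2])
{
    float nb[2];
    negate_as_double(b, nb);
    out[0] = nb[0] + a[0];
    out[1] = nb[1] + a[1];
}
```

Calling blend_add_sub and neg_then_add on the same inputs and comparing the two lanes shows whether the rewrite preserves the result for a given lane layout. The two forms agree only when the negated lane is the one that was subtracted in the original.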