https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68483
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Ah, no, the problem is not on the backend side, but in the veclower2 pass.
Before that pass, after the replacement of the v >> 64 and v >> 32 shifts, we
have:

  vect_sum_15.12_58 = VEC_PERM_EXPR <vect_sum_15.10_57, { 0, 0, 0, 0 }, { 2, 3, 4, 5 }>;
  vect_sum_15.12_59 = vect_sum_15.12_58 + vect_sum_15.10_57;
  vect_sum_15.12_60 = VEC_PERM_EXPR <vect_sum_15.12_59, { 0, 0, 0, 0 }, { 1, 2, 3, 4 }>;
  vect_sum_15.12_61 = vect_sum_15.12_60 + vect_sum_15.12_59;

but veclower2 for some reason decides to lower the latter VEC_PERM_EXPR into:

  _32 = BIT_FIELD_REF <vect_sum_15.12_59, 32, 32>;
  _17 = BIT_FIELD_REF <vect_sum_15.12_59, 32, 64>;
  _23 = BIT_FIELD_REF <vect_sum_15.12_59, 32, 96>;
  vect_sum_15.12_60 = {_32, _17, _23, 0};

The first VEC_PERM_EXPR is kept and generates efficient code.  If I manually
disable the lowering in the debugger, the code regression is gone.