https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98291

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |53947
   Last reconfirmed|                            |2021-01-04
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think this is a missed optimization in SLP reduction vectorization.  We're
detecting the optimal vectorization opportunity but:

x.c:6:36: note:   ==> examining statement: d2_30 = PHI <d2_25(7), 0.0(6)>
x.c:6:36: note:   vect_is_simple_use: operand _11 * _14, type of def: internal
x.c:6:36: note:   vect_is_simple_use: operand d2_30 = PHI <d2_25(7), 0.0(6)>, type of def: reduction
x.c:6:36: missed:   reduc op not supported by target.
x.c:6:36: missed:   in-order unchained SLP reductions not supported.
x.c:1:8: missed:   not vectorized: relevant stmt not supported: d2_30 = PHI <d2_25(7), 0.0(6)>
x.c:6:36: note:   removing SLP instance operations starting from: d1_24 = _7 + d1_28;
x.c:6:36: missed:  unsupported SLP instances
x.c:6:36: note:  re-trying with SLP disabled

and end up vectorizing two reductions on interleaved data (ugh).

What we fail to notice is that no reduction is needed in the epilogue
(VF == 1) and that the reduction is already vectorized in-order.
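
To illustrate the VF == 1 point, here is a minimal sketch in C.  The loop
shape is an assumption inferred from the dump (two accumulators, each fed by
a product of adjacent elements); the actual test case is attached to the bug
and not quoted in this comment, and the function names are hypothetical.
With one two-lane accumulator, each lane still adds its own products in the
original order, so nothing is reassociated across lanes inside the loop.

```c
#include <stddef.h>

/* Scalar form: two independent accumulators, each updated in order.
   This loop shape is an assumption, not the test case from the bug.  */
double dot2_scalar (const double *a, const double *b, size_t n)
{
  double d1 = 0.0, d2 = 0.0;
  for (size_t i = 0; i + 1 < n; i += 2)
    {
      d1 += a[i] * b[i];
      d2 += a[i + 1] * b[i + 1];
    }
  return d1 + d2;
}

/* Emulation of the SLP form with VF == 1: a single two-lane accumulator
   where lane 0 plays the role of d1 and lane 1 the role of d2.  Each lane
   sees exactly the same sequence of additions as its scalar counterpart,
   so no extra reduction step is needed after the loop.  */
double dot2_slp (const double *a, const double *b, size_t n)
{
  double acc[2] = { 0.0, 0.0 };
  for (size_t i = 0; i + 1 < n; i += 2)
    for (int lane = 0; lane < 2; lane++)
      acc[lane] += a[i + lane] * b[i + lane];
  /* This final add is the d1 + d2 from the source, not a vector epilogue.  */
  return acc[0] + acc[1];
}
```

Both functions perform the same additions in the same order per accumulator,
which is why treating the two-lane form as an in-order reduction should be
safe in this sketch.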

  if (reduction_type == FOLD_LEFT_REDUCTION
      && slp_node
      && !REDUC_GROUP_FIRST_ELEMENT (stmt_info))
    {
      /* We cannot use in-order reductions in this case because there is
         an implicit reassociation of the operations involved.  */
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "in-order unchained SLP reductions not supported.\n");
      return false;
    }

If we allow VF == 1 here we end up with odd (wrong?) code, so I guess we
shouldn't classify a VF == 1 SLP reduction as FOLD_LEFT_REDUCTION in the
first place, but IIRC this classification is done before the VF is fixed.
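
The reassociation that the quoted source comment guards against can be shown
with a small C sketch.  It is not GCC code; the function names and the input
mentioned below are made up for illustration.  Splitting a single in-order
sum across two lanes changes the order of the additions, which can change
the rounded floating-point result.

```c
#include <stddef.h>

/* Strict left-to-right sum, as the source program specifies it.  */
double sum_in_order (const double *x, size_t n)
{
  double s = 0.0;
  for (size_t i = 0; i < n; i++)
    s += x[i];
  return s;
}

/* The same data summed as two lanes (even and odd elements) that are
   combined only at the end.  This is the implicit reassociation: the
   additions happen in a different order than in sum_in_order.  */
double sum_two_lanes (const double *x, size_t n)
{
  double lane[2] = { 0.0, 0.0 };
  for (size_t i = 0; i < n; i++)
    lane[i % 2] += x[i];
  return lane[0] + lane[1];
}
```

With IEEE double arithmetic and an input such as { 1e16, 1.0, -1e16, 1.0 },
the two functions are expected to return different values, because the
in-order sum rounds away the first 1.0 while the two-lane sum keeps both.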

Mine.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
