https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113358

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue with block.c is

Analyzing loop at block.c:22
block.c:22:39: note:  === analyze_loop_nest ===
block.c:22:39: note:   === vect_analyze_loop_form ===
block.c:22:39: note:    === get_loop_niters ===
block.c:22:39: missed:   not vectorized: number of iterations cannot be computed.
block.c:22:39: missed:  bad loop form.
block.c:22:39: missed: couldn't vectorize loop

we fail to compute an expression for the number of scalar iterations in the
innermost loop.  That's because we have 'j < J + BLOCK && j < n' as
the terminating condition.  I suspect that the blocking should peel the
case where J + BLOCK > n, basically

      if (J + BLOCK > n || I + BLOCK > n)
        {
          ... blocking nest with < n exit condition
        }
      else
        {
          ... blocking nest with < {J,I} + BLOCK exit condition
        }
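
A minimal sketch of that versioning in C, for illustration only. The kernel
(scaling an n x n row-major matrix by 2), the name tile_versioned and the
small BLOCK value are assumptions; the body of block.c is not shown in this
report. The edge path here keeps both bounds, as a conservative choice,
because only one of the two dimensions may be clipped by n.

```c
#define BLOCK 4  /* hypothetical block size, kept small for illustration */

/* Hypothetical kernel: out[i][j] = 2 * in[i][j] on one BLOCK x BLOCK tile
   starting at (I, J) of an n x n row-major matrix. */
static void tile_versioned(const int *in, int *out, int n, int I, int J)
{
    if (J + BLOCK > n || I + BLOCK > n) {
        /* Edge tile: at least one dimension is clipped by n, so keep the
           combined exit conditions. */
        for (int i = I; i < I + BLOCK && i < n; i++)
            for (int j = J; j < J + BLOCK && j < n; j++)
                out[i * n + j] = 2 * in[i * n + j];
    } else {
        /* Interior tile: both loops run exactly BLOCK times, so each has a
           single exit condition. */
        for (int i = I; i < I + BLOCK; i++)
            for (int j = J; j < J + BLOCK; j++)
                out[i * n + j] = 2 * in[i * n + j];
    }
}
```

A caller would step I and J by BLOCK over the whole matrix; only the tiles
touching the last rows or columns take the edge path.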

the vectorizer (or rather niter analysis) could try to recover in a similar
way by using 'assumptions': basically, we can compute the number of
iterations as BLOCK if we assume that J + BLOCK <= n.  The exit condition
looks like

  _145 = J_86 + 999;
...
  <bb 4> [local count: 958878294]:
  # j_88 = PHI <j_58(18), J_86(7)>
...
  j_58 = j_88 + 1;
  _63 = n_49(D) > j_58;
  _64 = j_58 <= _145;
  _65 = _63 & _64;
  if (_65 != 0)

we could try to pattern-match this NE_EXPR (we need to choose which
condition we use as the assumption and which to base the niters on).
Another possibility would be (I think this came up in another bug report
as well) to use j < MIN (J + BLOCK, n).
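
To illustrate why the MIN form helps, here is a small sketch of the trip
count at the source level. The helper names are hypothetical, BLOCK = 1000 is
only inferred from the 'J_86 + 999' bound in the dump above, and J + BLOCK is
assumed not to overflow. With a single bound m = MIN (J + BLOCK, n) the count
is simply m - J (or 0 if the loop is not entered), and it equals BLOCK
whenever J + BLOCK <= n holds.

```c
#define BLOCK 1000  /* assumed, inferred from the J_86 + 999 bound */

/* Closed form for the number of iterations of
     for (j = J; j < J + BLOCK && j < n; j++)
   assuming J + BLOCK does not overflow int. */
static int niters_closed_form(int J, int n)
{
    int m = J + BLOCK < n ? J + BLOCK : n;  /* MIN (J + BLOCK, n) */
    return m > J ? m - J : 0;
}

/* Reference: count the iterations by actually running the loop. */
static int niters_counted(int J, int n)
{
    int count = 0;
    for (int j = J; j < J + BLOCK && j < n; j++)
        count++;
    return count;
}
```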

The following source modification works:

    for (int i = I; i < I + BLOCK && i < n; i++) {
        int m = J + BLOCK > n ? n : J + BLOCK;
        for (int j = J; j < m; j++) {

whether it's a generally profitable transform or should instead be matched
only during niter analysis I'm not sure (if the MIN is loop invariant
and this is an exit condition it surely is profitable).
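
For reference, a self-contained sketch of the modified nest. The loop body
(scaling a row-major n x n matrix by 2), the function name tile_min_bound and
the small BLOCK value are assumptions, since the body of block.c is not shown
here; only the loop headers follow the modification above.

```c
#define BLOCK 4  /* hypothetical block size, kept small for illustration */

/* Hypothetical kernel: out[i][j] = 2 * in[i][j] on the tile at (I, J). */
static void tile_min_bound(const int *in, int *out, int n, int I, int J)
{
    for (int i = I; i < I + BLOCK && i < n; i++) {
        /* Loop-invariant upper bound, i.e. MIN (J + BLOCK, n). */
        int m = J + BLOCK > n ? n : J + BLOCK;
        /* The inner loop now has a single exit condition, j < m. */
        for (int j = J; j < m; j++)
            out[i * n + j] = 2 * in[i * n + j];
    }
}
```

Since m depends only on J and n, it could equally be hoisted out of the i
loop; it is kept inside here to match the modification as posted.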
