juzhe.zh...@rivai.ai writes:
> +  /* If we're using decrement IV approach in loop control, we can use output of
> +     SELECT_VL to adjust IV of loop control and data reference when it satisfies
> +     the following checks:
> +
> +     (a) SELECT_VL is supported by the target.
> +     (b) LOOP_VINFO is single-rgroup control.
> +     (c) non-SLP.
> +     (d) LOOP can not be unrolled.
> +
> +     Otherwise, we use MIN_EXPR approach.
> +
> +     1. We only apply SELECT_VL on single-rgroup since:
> +
> +     (1). Multiple-rgroup controls N vector loads/stores would need N pointer
> +       updates by variable amounts.
> +     (2). SELECT_VL allows flexible length (<=VF) in each iteration.
> +     (3). For decrement IV approach, we calculate the MAX length of the loop
> +       and then deduce the length of each control from this MAX length.
> +
> +     Base on (1), (2) and (3) situations, if we try to use SELECT_VL on
> +     multiple-rgroup control, we need to generate multiple SELECT_VL to
> +     carefully adjust length of each control.

If we use SELECT_VL to refer only to the target-independent ifn, I don't
see why this last bit is true.  Like I said in the previous message,
when it comes to determining the length of each control, the approach we
take for MIN_EXPR IVs should work for SELECT_VL IVs.  The point is that,
in both cases, any inactive lanes are always the last lanes.

E.g. suppose that, for one particular iteration, SELECT_VL decides that
6 lanes should be active in a loop with VF==8.  If there is a 2-control
rgroup with 4 lanes each, the first control must be 4 and the second
control must be 2, just as if a MIN_EXPR had decided that 6 lanes of
the final iteration are active.
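
To make the arithmetic concrete, here is a minimal sketch in C.  It is not
vectorizer code: control_len and its parameters are names made up for this
illustration, and it only assumes what is stated above, namely that any
inactive lanes are always the last lanes.

```c
#include <assert.h>

/* Hypothetical helper, not a GCC function: return the length of control
   IDX when TOTAL lanes are active in this iteration and each control in
   the rgroup covers LANES_PER_CTRL lanes.  Because inactive lanes are
   always the last lanes, earlier controls fill up before later ones.  */
static unsigned
control_len (unsigned total, unsigned lanes_per_ctrl, unsigned idx)
{
  assert (lanes_per_ctrl > 0);

  /* First lane covered by this control.  */
  unsigned start = idx * lanes_per_ctrl;

  /* All lanes of this control are past the active range.  */
  if (total <= start)
    return 0;

  /* Otherwise the control gets whatever is left, capped at its width.  */
  unsigned rem = total - start;
  return rem < lanes_per_ctrl ? rem : lanes_per_ctrl;
}
```

With VF == 8 and 6 active lanes, control_len (6, 4, 0) is 4 and
control_len (6, 4, 1) is 2.  Nothing in the calculation depends on whether
the 6 came from SELECT_VL or from a MIN_EXPR.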

I'm not saying the decision itself is wrong.  But I think the explanation
could be clearer.

> +     Such approach is very inefficient
> +     and unprofitable for targets that are using a standalone instruction
> +     to configure the length of each operation.
> +     E.g. RISC-V vector use 'vsetvl' to configure the length of each operation.

What I don't understand is why this isn't also a problem with the
fallback MIN_EXPR approach.  That is, with the same example as above,
but using MIN_EXPR IVs, I would have expected:

  VF == 8

  1-control rgroup "A":
    A set by MIN_EXPR IV

  2-control rgroup "B1", "B2":
    B1 = MIN (A, 4)
    B2 = A - B1

and so the vectors controlled by A, B1 and B2 would all have different
lengths.
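
As a rough model of that scheme, the sketch below walks a scalar iteration
count through a MIN_EXPR-style IV and counts the iterations in which A is
shorter than VF.  All names are hypothetical, and the code assumes the
two-control rgroup splits VF evenly, as in the example; it models only the
arithmetic above, not what the vectorizer actually emits.

```c
#include <assert.h>

/* Lengths for the 1-control rgroup (a) and the 2-control rgroup (b1, b2).  */
struct lens
{
  unsigned a, b1, b2;
};

static unsigned
umin (unsigned x, unsigned y)
{
  return x < y ? x : y;
}

/* Hypothetical model of the MIN_EXPR scheme: A = MIN (remaining, VF),
   B1 = MIN (A, VF / 2), B2 = A - B1.  */
static struct lens
min_expr_lens (unsigned remaining, unsigned vf)
{
  struct lens l;
  l.a = umin (remaining, vf);
  l.b1 = umin (l.a, vf / 2);
  l.b2 = l.a - l.b1;
  return l;
}

/* Count the iterations of an N-element loop in which A < VF.  */
static unsigned
count_partial_iterations (unsigned n, unsigned vf)
{
  assert (vf > 0 && vf % 2 == 0);

  unsigned partial = 0;
  for (unsigned remaining = n; remaining > 0;)
    {
      struct lens l = min_expr_lens (remaining, vf);
      if (l.a < vf)
        partial++;
      /* l.a >= 1 whenever remaining > 0, so the loop terminates.  */
      remaining -= l.a;
    }
  return partial;
}
```

For n == 22 and VF == 8 the model gives A = 8, 8, 6, so only the last
iteration has A != VF, and in that iteration B1 == 4 and B2 == 2.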

Is the point that, when using MIN_EXPR, this only happens in the
final iteration?  And that you use a tail/epilogue loop for that,
so that the main loop body operates on full vectors only?

Thanks,
Richard
