On Mon, Apr 09, 2018 at 06:47:45PM +0100, Richard Sandiford wrote:
> In this PR we used WIDEN_SUM_EXPR to vectorise:
> 
>   short i, y;
>   int sum;
>   [...]
>   for (i = x; i > 0; i--)
>     sum += y;
> 
> with 4 ints and 8 shorts per vector.  The problem was that we set
> the VF based only on the ints, then calculated the number of vector
> copies based on the shorts, giving 4/8.  Previously that led to
> ncopies==0, but after r249897 we pick it up as an ICE.
> 
> In this particular case we could vectorise the reduction by setting
> ncopies based on the output type rather than the input type, but it
> doesn't seem worth adding a special "optimisation" for such a
> pathological case.  I think it's really an instance of the more general
> problem that we can't vectorise using combinations of (say) 64-bit and
> 128-bit vectors on targets that support both.

We badly need that; there are plenty of PRs where we generate really large
vectorized loops because of it, e.g. on x86, where we can easily use 128-bit,
256-bit and 512-bit vectors.  But I'm afraid it is not stage4 material.
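
For reference, the 4/8 arithmetic from the quoted description can be
reproduced with a minimal sketch.  It assumes 128-bit vectors, a 32-bit int
and a 16-bit short; the helper names are hypothetical and this is not the
vectorizer's actual code, just the same truncating division:

```c
#include <assert.h>

/* Hypothetical helper: how many elements of ELEM_BITS bits fit in one
   vector of VECTOR_BITS bits.  */
static unsigned lanes_per_vector(unsigned vector_bits, unsigned elem_bits)
{
    assert(elem_bits != 0);
    return vector_bits / elem_bits;
}

/* Hypothetical helper: number of vector copies needed for a given
   vectorization factor VF when each vector holds LANES elements.
   Unsigned integer division truncates, so VF < LANES yields 0.  */
static unsigned copies_for(unsigned vf, unsigned lanes)
{
    assert(lanes != 0);
    return vf / lanes;
}
```

With the VF taken from the ints, copies_for(lanes_per_vector(128, 32),
lanes_per_vector(128, 16)) is 4/8, which truncates to 0.  Dividing by the
lane count of the output type instead gives 4/4 == 1, which is the special
case Richard mentions.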

        Jakub
