https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101842

--- Comment #3 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> OK, so with a hack like the following we vectorize the BB as
> 
>   vect__1.10_62 = MEM <vector(4) float> [(float *)p_34];
>   vect_powmult_9.11_61 = vect__1.10_62 * vect__1.10_62;
>   _60 = .REDUC_PLUS (vect_powmult_9.11_61);
>   d_25 = d_35 - _60;
>   p_26 = p_34 + 16;
>   i_27 = i_37 + 4;
>   _10 = len_20(D) > i_27;
>   _11 = lim_21(D) <= d_25;
>   _12 = _10 & _11;
>   if (_12 != 0)
> 

Ah awesome!

> 
> the hack simply re-starts reduction discovery at the "previous" stmt
> (this breaks down after skipping the first stmt eventually).  As said,
> it's a hack.  But is that the kind of vectorization you expect?

Yeah, that looks perfect! The patch seems to be based on different code than
upstream, so I couldn't apply it to test the full loop.
(We already vectorize a similar loop without the `&& d >= lim` condition.)
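
For reference, here is a minimal sketch of the loop shape the GIMPLE above
corresponds to. It is reconstructed from the dump, not taken from the bug's
testcase: the function names and signatures are assumptions, and only the
names p, d, i, len and lim mirror the SSA names. The second function spells
out by hand what the vectorized block computes, with a plain four-element sum
standing in for .REDUC_PLUS. Since this reassociates float additions, GCC
would presumably only do it when that is allowed (e.g. with -ffast-math).

```c
/* Hypothetical scalar form: a chain of four subtractions from d per
   iteration, with the extra `d >= lim` exit test.
   Assumes len is a multiple of 4.  */
float sq_sub_scalar(const float *p, float d, int len, float lim)
{
  for (int i = 0; i < len && d >= lim; i += 4, p += 4)
    {
      d -= p[0] * p[0];
      d -= p[1] * p[1];
      d -= p[2] * p[2];
      d -= p[3] * p[3];
    }
  return d;
}

/* Hand-written equivalent of the vectorized block: square four lanes,
   sum them (stand-in for .REDUC_PLUS), then do one subtraction.  */
float sq_sub_reduc(const float *p, float d, int len, float lim)
{
  for (int i = 0; i < len && d >= lim; i += 4, p += 4)
    {
      float v[4];
      for (int k = 0; k < 4; k++)
        v[k] = p[k] * p[k];          /* vect_powmult = vect * vect */
      float sum = (v[0] + v[1]) + (v[2] + v[3]);
      d -= sum;                      /* d_25 = d_35 - _60 */
    }
  return d;
}
```

With small integer-valued inputs both forms agree exactly; in general the
results can differ in the last bits because the additions are reassociated.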
