https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102512

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
>   max_9 = *p_8(D);
>   _10 = {max_9, max_9, max_9, max_9, max_9, max_9, max_9, max_9};
>   vect__4.7_13 = MEM <vector(8) short int> [(short int *)p_8(D)];
>   vect_max_11.8_14 = MAX_EXPR <_10, vect__4.7_13>;
>   _20 = .REDUC_MAX (vect_max_11.8_14); [tail call]
> 
> it's a bit difficult to improve here - match.pd doesn't like MEMs too much
> and this all just collapses because _10 is a splat of element zero of
> vect__4.7_13 ...
> 
> In theory the vectorizer could use the first full vector as initial value
> or of course a vector of all SHORT_MIN.  But the intent of using the first
> scalar value was that this would optimize better ...

That is, the alternative is to apply the 'short max = p[0]' "bias" after
the epilogue and have the initial value be { SHORT_MIN, ... }.
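
To make the two options concrete, here is a minimal scalar sketch in C. It is
a model of the idea, not what the vectorizer emits. The function names, the
LANES constant and the reduc_max helper (standing in for .REDUC_MAX) are made
up for illustration, and the 8-element loop shape is inferred from the GIMPLE
quoted above. SHRT_MIN from <limits.h> plays the role of SHORT_MIN.

```c
#include <assert.h>
#include <limits.h>

#define LANES 8   /* assumed: one vector(8) short int, as in the dump */

/* Scalar stand-in for .REDUC_MAX: horizontal max over all lanes. */
static short reduc_max(const short v[LANES])
{
    short m = v[0];
    for (int i = 1; i < LANES; i++)
        if (v[i] > m)
            m = v[i];
    return m;
}

/* Current scheme: the accumulator starts as a splat of p[0]. */
short max_splat_init(const short *p)
{
    short acc[LANES];
    for (int i = 0; i < LANES; i++)
        acc[i] = p[0];                           /* _10 = {max_9, ...} */
    for (int i = 0; i < LANES; i++)
        acc[i] = acc[i] > p[i] ? acc[i] : p[i];  /* MAX_EXPR <_10, vect> */
    return reduc_max(acc);
}

/* Alternative: start from { SHRT_MIN, ... } and apply the p[0] "bias"
   as a scalar max after the epilogue reduction. */
short max_min_init_bias(const short *p)
{
    short acc[LANES];
    for (int i = 0; i < LANES; i++)
        acc[i] = SHRT_MIN;
    for (int i = 0; i < LANES; i++)
        acc[i] = acc[i] > p[i] ? acc[i] : p[i];
    short r = reduc_max(acc);                    /* epilogue */
    short bias = p[0];
    return r > bias ? r : bias;                  /* bias applied last */
}

/* Both schemes must agree for any input of LANES elements. */
void check_equivalent(const short *p)
{
    assert(max_splat_init(p) == max_min_init_bias(p));
}
```

In the second variant the MAX against { SHRT_MIN, ... } is an identity, so in
principle the vector part reduces to the plain load followed by the
reduction, leaving only one scalar max with p[0] at the end. Whether GCC
actually folds it that way is not shown here.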
