https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102512
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
>   max_9 = *p_8(D);
>   _10 = {max_9, max_9, max_9, max_9, max_9, max_9, max_9, max_9};
>   vect__4.7_13 = MEM <vector(8) short int> [(short int *)p_8(D)];
>   vect_max_11.8_14 = MAX_EXPR <_10, vect__4.7_13>;
>   _20 = .REDUC_MAX (vect_max_11.8_14); [tail call]
>
> it's a bit difficult to improve here - match.pd doesn't like MEMs too much
> and this all just collapses because _10 is a splat of element zero of
> vect__4.7_13 ...
>
> In theory the vectorizer could use the first full vector as initial value
> or of course a vector of all SHORT_MIN.  But the intent of using the first
> scalar value was that this would optimize better ...

That is, the alternative is to apply the 'short max = p[0]' "bias" after the
epilogue and have the initial value be { SHORT_MIN, ... }.
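For illustration, here is a scalar C sketch of the two initial-value choices discussed above. It assumes the source is an 8-element max reduction over p seeded with p[0], which is what the dump suggests but is not shown in this comment. The function names and the LANES constant are hypothetical, and SHRT_MIN from <limits.h> stands in for SHORT_MIN. It models the lane-wise semantics only, not what the vectorizer actually emits.

```c
#include <limits.h>

enum { LANES = 8 };  /* assumed vector width, matching vector(8) short int */

static short max_short(short a, short b) { return a > b ? a : b; }

/* Horizontal max over all lanes, modelling .REDUC_MAX. */
static short reduc_max(const short *acc)
{
    short r = acc[0];
    for (int i = 1; i < LANES; i++)
        r = max_short(r, acc[i]);
    return r;
}

/* Current shape: the initial vector is a splat of p[0] (_10 in the dump). */
short max_splat_init(const short *p)
{
    short acc[LANES];
    for (int i = 0; i < LANES; i++)
        acc[i] = p[0];                      /* _10 = {max_9, ...} */
    for (int i = 0; i < LANES; i++)
        acc[i] = max_short(acc[i], p[i]);   /* MAX_EXPR <_10, vect__4.7_13> */
    return reduc_max(acc);
}

/* Alternative: neutral initial vector, with the p[0] "bias" applied
   to the scalar result after the reduction epilogue. */
short max_neutral_init(const short *p)
{
    short acc[LANES];
    for (int i = 0; i < LANES; i++)
        acc[i] = SHRT_MIN;                  /* { SHORT_MIN, ... } */
    for (int i = 0; i < LANES; i++)
        acc[i] = max_short(acc[i], p[i]);
    return max_short(reduc_max(acc), p[0]); /* bias after the epilogue */
}
```

Both variants return the same value for any input. In the second, the MAX_EXPR against an all-SHRT_MIN vector is a no-op, so the reduction no longer depends on recognising _10 as a splat of element zero of the loaded vector.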