https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111793
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #4)
> So, shouldn't we match.pd (or something else) pattern match
>   vect_cst__50 = {mask.48_7(D), mask.48_7(D), mask.48_7(D), mask.48_7(D),
>                   mask.48_7(D), mask.48_7(D), mask.48_7(D), mask.48_7(D),
>                   mask.48_7(D), mask.48_7(D), mask.48_7(D), mask.48_7(D),
>                   mask.48_7(D), mask.48_7(D), mask.48_7(D), mask.48_7(D)};
>   vect__8.132_51 = vect_cst__50 >> { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
>   vect__9.133_53 = vect__8.132_51 & { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 };
>   mask__39.139_60 = vect__9.133_53 != { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
> back into
>   mask__39.139_60 = mask.48_7(D);
> ?

Yes, that's a possibility.  I wonder whether it's possible to arrange things in the vectorizer itself so that costing becomes more accurate (probably not that important for OMP SIMD though).

Maybe it would work a bit better if we emitted mask & (1 << iv), but I guess we canonicalize that back.

I've opened this for tracking for now, working on PR111795 first.
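For context, a minimal sketch of the kind of scalar source that produces the quoted vector sequence; this is a hypothetical reduced example, not the PR's actual testcase, and the names (foo, a, b, mask) are made up for illustration:

void
foo (float *restrict a, const float *restrict b, unsigned short mask)
{
  /* Hypothetical OMP SIMD loop: each lane tests one bit of a scalar mask.
     The vectorizer broadcasts mask into a 16-element vector, shifts it
     right by { 0, 1, ..., 15 }, ands with 1 and compares against zero --
     the sequence quoted above -- even though the result is just the mask
     reinterpreted as a vector mask.  */
  #pragma omp simd
  for (int i = 0; i < 16; i++)
    if ((mask >> i) & 1)   /* per-lane bit test of the scalar mask */
      a[i] = b[i];
}

The alternative form mentioned above would write the bit test as mask & (1 << i) instead, though as noted that is presumably canonicalized back to the shift-right form.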