https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107
--- Comment #11 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to N Schaeffer from comment #9)
> In addition, optimizing for size with -Os leads to a non-vectorized
> double-loop (51 bytes) while the vectorized loop with vbroadcastsd (produced
> by clang -Os) leads to 40 bytes.
> It is thus also a missed optimization for -Os.

Vectorization is enabled at -O2 but not at -Os.
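
To make the point about -Os concrete, here is a minimal sketch. The kernel is hypothetical and is not the test case attached to this PR; it only assumes the general shape discussed here, where one double from x is applied to a group of four consecutive doubles in y. The function name, file name, and -mavx2 are illustrative choices. -ftree-vectorize is the existing GCC option that turns the vectorizer on explicitly, so adding it to -Os is one way to check whether the vectorized form would be smaller. Whether the cost model then picks a vbroadcastsd sequence has not been verified here.

```c
#include <stddef.h>

/* Hypothetical reduced kernel (an assumption, not the PR's attachment):
   each x[i] scales a group of four consecutive doubles in y. */
void scale_groups(double *restrict y, const double *restrict x, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < 4; k++)
            y[4 * i + k] *= x[i];
}

/* Possible comparison (kernel.c is an illustrative file name):
     gcc -Os -mavx2 -S kernel.c                    vectorizer off by default
     gcc -Os -mavx2 -ftree-vectorize -S kernel.c   vectorizer requested explicitly
     gcc -O2 -mavx2 -S kernel.c                    vectorizer on by default */
```

Comparing the size of the function in the three outputs would show whether the gap reported in comment #9 comes from the vectorizer simply being off at -Os, or from the code it generates once enabled.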