https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82136

--- Comment #1 from Peter Cordes <peter at cordes dot ca> ---
Whoops, the Compiler Explorer link in the original report had aligned=1.  This
one produces the asm I showed there: https://godbolt.org/g/WsZ5S9

See bug 82137 for a much more efficient vectorization strategy that gcc should
use instead: just in-lane shuffles plus a blend, at the cost of some duplicated
work.
