https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762
--- Comment #15 from jbeulich at suse dot com ---
(In reply to Richard Biener from comment #12)
> _mm_storel_pi could be implemented using __builtin_shufflevector these days.
> Which shows exactly the same issue:

(also related to comment 10)

I don't think the problem is how the registers are filled (and in my example I simply used the first approach that came to mind and worked). The problem is that the arithmetic insn assumes the upper parts to not hold certain special values (or pairs thereof). Aiui one could create the exact same situation with inline assembly instead of any of the builtins.

This isn't any different from using 512-bit operations for more narrow vectors when AVX512VL isn't enabled. Afaict such uses are carefully avoided for floating point vectors, and are used only in a limited number of cases on integer vectors (Hongtao recently asked me to not go any further in that direction either).
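
To make the concern about the upper parts concrete, here is a minimal sketch. It is not the test case from this bug: the helper name add_v2sf_full_width and the choice of a +inf / -inf pair as the "special values" are assumptions made purely for illustration. It uses intrinsics to emulate a 2-float addition carried out as a full-width ADDPS and returns the exception flags that end up set in MXCSR.

```c
#include <assert.h>
#include <math.h>       /* INFINITY (macro only) */
#include <xmmintrin.h>  /* SSE intrinsics and MXCSR helpers */

/* Hypothetical helper, for illustration only: adds two 2-float vectors by
   way of a full 4-lane ADDPS and returns the SSE exception flags raised.
   hi_a / hi_b stand for whatever happens to sit in the upper two lanes. */
static unsigned int add_v2sf_full_width(float hi_a, float hi_b)
{
    /* volatile keeps the compiler from folding or moving the arithmetic */
    volatile float a0 = 1.0f, a1 = 2.0f, b0 = 3.0f, b1 = 4.0f;
    volatile float ua = hi_a, ub = hi_b;
    volatile float r0, r1;

    _MM_SET_EXCEPTION_STATE(0);             /* clear sticky flags in MXCSR */

    float ha = ua, hb = ub;                 /* one volatile read each */
    __m128 a = _mm_set_ps(ha, ha, a1, a0);  /* lanes 0-1: data, 2-3: leftovers */
    __m128 b = _mm_set_ps(hb, hb, b1, b0);
    __m128 sum = _mm_add_ps(a, b);          /* operates on all four lanes */

    /* Only the low two lanes are the "real" 2-float result. */
    r0 = _mm_cvtss_f32(sum);
    r1 = _mm_cvtss_f32(_mm_shuffle_ps(sum, sum, _MM_SHUFFLE(1, 1, 1, 1)));
    assert(r0 == 4.0f && r1 == 6.0f);

    return _MM_GET_EXCEPTION_STATE();       /* flags raised by any lane */
}
```

With zeros in the upper lanes, add_v2sf_full_width(0.0f, 0.0f) should report no flags. With add_v2sf_full_width(INFINITY, -INFINITY) the upper lanes compute inf + (-inf), so the invalid-operation flag (_MM_EXCEPT_INVALID) is set even though the two low lanes hold perfectly ordinary values. How the upper lanes came to hold those values plays no role in the outcome.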