https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105965
            Bug ID: 105965
           Summary: x86: single-element vectors don't have scalar FMA insns
                    used anymore
           Product: gcc
           Version: 12.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jbeulich at suse dot com
  Target Milestone: ---

While this used to work fine up to gcc8, gcc9 and newer use
vmuls[sdh]+vadds[sdh] instead. No similar issue exists when operating on
scalars, or when operating on multi-element vectors not matching any available
register size (so my guess of "target" as the component may not be correct).

This has regressed the test harness of the Xen Project's insn emulator [1],
which no longer exercises any scalar FMA insns because of the compiler not
emitting any. (Note that using intrinsics is not really an option, as the
primary goal is to test insns with memory operands. Yet the intrinsics don't
lend themselves to such because of using 128-bit parameter types.)

The issue is uniform for FMA, FMA4, AVX512F, and AVX512-FP16. It can be easily
seen by compiling

T test(T x, T y, T z) {
    return x * y + z;
}

#define TEST(n) \
typedef T __attribute__((vector_size(n * sizeof(T)))) v##n##_t; \
v##n##_t test##n(v##n##_t x, v##n##_t y, v##n##_t z) { \
    return x * y + z; \
}

TEST(1)
TEST(2)
TEST(4)
TEST(8)
TEST(16)
TEST(32)
TEST(64)

with e.g. "-mfpmath=sse -O3 -c -mfma -DT=float", but obvious other option
combinations similarly demonstrate the issue.

[1] https://xenbits.xen.org/gitweb/?p=xen.git;a=tree;f=tools/tests/x86_emulator
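
For illustration, the scalar case and the TEST(1) case for T=float can be written out without the macro, as sketched below. The function names (fma_scalar, fma_v1, fma_v1_mem) are hypothetical and not part of the report or of the Xen harness. The pointer variant is only an assumption about what a memory-operand test might look like. Compiling with something like "gcc -O3 -mfpmath=sse -mfma -S" and comparing the assembly of the functions should show whether a fused insn or a separate multiply and add is emitted; no expected assembly is claimed here.

```c
/*
 * Illustrative sketch only: fma_scalar, fma_v1 and fma_v1_mem are
 * hypothetical names, not taken from the report or the Xen harness.
 * Relies on GCC's vector_size attribute (vector extensions).
 */
typedef float v1_t __attribute__((vector_size(1 * sizeof(float))));

/* Scalar form: the report says this case is not affected. */
float fma_scalar(float x, float y, float z)
{
    return x * y + z;
}

/*
 * Single-element vector form, equivalent to TEST(1) with T=float.
 * This is the case the report describes as no longer using a scalar
 * FMA insn with gcc9 and newer.
 */
v1_t fma_v1(v1_t x, v1_t y, v1_t z)
{
    return x * y + z;
}

/*
 * Same computation with the addend read through a pointer. This is an
 * assumed shape for a memory-operand test; whether the compiler folds
 * the load into the insn is up to the compiler.
 */
v1_t fma_v1_mem(v1_t x, v1_t y, const v1_t *z)
{
    return x * y + *z;
}
```

All three functions compute the same value, so any difference between them is confined to the generated insns, which is what the report is about.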