https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113104
Richard Sandiford <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2023-12-30
     Ever confirmed|0                             |1
                 CC|                              |rsandifo at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot gnu.org

--- Comment #4 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
FWIW, we do get the desired code with -march=armv8-a+sve (even though the
test doesn't use SVE). This is because of:

  /* Consider enabling VECT_COMPARE_COSTS for SVE, both so that we
     can compare SVE against Advanced SIMD and so that we can compare
     multiple SVE vectorization approaches against each other. There's
     not really any point doing this for Advanced SIMD only, since the
     first mode that works should always be the best. */
  if (TARGET_SVE && aarch64_sve_compare_costs)
    flags |= VECT_COMPARE_COSTS;

The testcase in this PR is a counterexample to the claim in the final
sentence. I think the comment might predate significant support for
mixed-sized Advanced SIMD vectorisation.

If we enable SVE (or uncomment the "if" line), the costs are 13 units per
vector iteration for 128-bit vectors and 4 units per vector iteration for
64-bit vectors (so 8 units per 128 bits on a parity basis). The 64-bit
version is therefore seen as significantly cheaper and is chosen ahead of
the 128-bit version.

I think this PR is enough proof that we should enable VECT_COMPARE_COSTS
even without SVE. Assigning to myself for that.
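
For reference, the parity comparison quoted above (13 units per 128-bit iteration versus 4 units per 64-bit iteration) can be reproduced with a small sketch. This only restates the arithmetic from the comment; it is not how the vectoriser's cost model is implemented. The function names cost_per_128_bits and cheaper_vector_bits are hypothetical helpers invented for illustration, and the tie-breaking rule (prefer the first candidate) is an assumption.

```c
#include <assert.h>

/* Hypothetical helper (not a GCC function): scale a per-iteration cost
   to the cost of processing 128 bits of data, so that candidates with
   different vector widths can be compared on a parity basis.  */
unsigned int
cost_per_128_bits (unsigned int cost_per_iter, unsigned int vector_bits)
{
  /* Only widths that divide 128 evenly are handled in this sketch.  */
  assert (vector_bits > 0 && 128 % vector_bits == 0);
  return cost_per_iter * (128 / vector_bits);
}

/* Hypothetical helper: return the width, in bits, of whichever candidate
   has the lower normalised cost.  Ties go to the first candidate; that
   choice is an assumption made for this example.  */
unsigned int
cheaper_vector_bits (unsigned int cost_a, unsigned int bits_a,
                     unsigned int cost_b, unsigned int bits_b)
{
  unsigned int norm_a = cost_per_128_bits (cost_a, bits_a);
  unsigned int norm_b = cost_per_128_bits (cost_b, bits_b);
  return norm_b < norm_a ? bits_b : bits_a;
}
```

With the figures from this PR, cost_per_128_bits (4, 64) gives 8 against 13 for the 128-bit candidate, so cheaper_vector_bits (13, 128, 4, 64) picks the 64-bit version.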