https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123603
--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <[email protected]>:

https://gcc.gnu.org/g:4b2db74430233302e4c711d6f958a2e7fbc643f3

commit r16-6891-g4b2db74430233302e4c711d6f958a2e7fbc643f3
Author: Richard Biener <[email protected]>
Date:   Fri Jan 16 10:22:17 2026 +0100

    target/123603 - add --param ix86-vect-compare-costs

    The following allows switching the x86 target to use the vectorizer
    cost comparison mechanism to select between different vector mode
    variants of a vectorization.  The default is still not to do this,
    but the new --param allows an opt-in.

    On SPEC CPU 2017 with -Ofast -march=znver4, 2463 out of 39706
    vectorized loops change mode.  In 503 out of 12378 cases we decide
    not to use masked epilogs.  Compile time increases by ~1% overall.
    With a quick 1-run there do not seem to be off-noise effects for
    INT with this particular optimization and target option combination
    on the hardware used.  For FP, 549.fotonik3d_r improves by 6%
    (confirmed with a 2-run).

    This was triggered by PR123190 and PR123603, which have cases where
    comparing costs would have resulted in the faster vector size being
    used.  Both were reported for -O2 -march=x86-64-v3 -flto and with
    PGO.  The 548.exchange2_r regression recorded in PR123603 with these
    flags is resolved with the new --param (performance improves by
    13%).  I don't have SPEC 2006 on that machine so did not verify the
    PR123190 433.milc regression, but that has been improved with the
    two earlier patches.  The --param has no effect on the testcase in
    the PR.

    I do expect that some of our tricks in the x86 cost model to make
    larger vector sizes unprofitable will be obsolete or
    counter-productive with cost comparison turned on.

            PR target/123603
            * config/i386/i386.opt (-param=ix86-vect-compare-costs=): Add.
            * config/i386/i386.cc (ix86_autovectorize_vector_modes): Honor it.
            * doc/invoke.texi (ix86-vect-compare-costs): Document.

            * gcc.dg/vect/costmodel/x86_64/costmodel-pr123603.c: New testcase.
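
For anyone who wants to try the opt-in, here is a minimal sketch of how the two settings might be compared on a simple loop. This is a hypothetical example, not the testcase added by the commit. The function name `saxpy` and the file name are made up for illustration, and the assumption that a value of 1 enables the behaviour should be checked against doc/invoke.texi.

```c
/* Hypothetical example, NOT the testcase from the commit.

   Possible way to compare the chosen vector size (assumes that
   --param ix86-vect-compare-costs=1 is the opt-in value):

     gcc -O2 -march=x86-64-v3 -fopt-info-vec -c saxpy.c
     gcc -O2 -march=x86-64-v3 --param ix86-vect-compare-costs=1 \
         -fopt-info-vec -c saxpy.c

   -fopt-info-vec makes GCC report which loops were vectorized.  */

#include <assert.h>
#include <stddef.h>

/* y[i] += a * x[i].  A simple candidate for auto-vectorization; with
   -march=x86-64-v3 both 128-bit and 256-bit vector modes are available,
   so the vectorizer has more than one variant to choose from.  */
void
saxpy (float *restrict y, const float *restrict x, float a, size_t n)
{
  assert (x != NULL && y != NULL);
  for (size_t i = 0; i < n; i++)
    y[i] += a * x[i];
}
```

Whether the chosen mode actually differs for a loop like this depends on the cost model. The commit message itself reports a mode change for only a fraction of vectorized loops, so seeing no difference on a given loop would not be surprising.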
