https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94375
Bug ID: 94375
Summary: 548.exchange2_r run time is 8-18% worse than GCC 9 at -Ofast -march=native
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jamborm at gcc dot gnu.org
Blocks: 26163
Target Milestone: ---
Host: x86_64-linux
Target: x86_64-linux

When compiled with trunk revision 26b3e568a60 and options -Ofast -march=native -mtune=native, the SPEC 2017 INTrate benchmark 548.exchange2_r runs 19% slower on AMD Zen2 and 12% slower on Intel Cascade Lake than when built with GCC 9.2.

It appears that the main culprit is the vectorizer: switching it off recovers the performance (on AMD the result is in fact even some 4% better than GCC 9).

Side note: with --param ipa-cp-eval-threshold=1 --param ipa-cp-unit-growth=80 one can get an exchange2 build that is 25% faster still, but that is a different issue.

This started happening in the autumn of 2019, though not at a single point, as the following table of run times relative to GCC 9.2 shows.

Revision                     time
-------------------------    ----
d82f38123b5 (Nov 14 2019)    117%
d9adca6e663 (Nov 5 2019)     117%
bf037872d3c (Oct 24 2019)    101%
77ef339456f (Oct 14 2019)    118%
38a734350fd (Oct 3 2019)     100%
d469a71e5a0 (Sep 23 2019)    101%

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
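As a rough aid for reading the revision table in the report, the following Python sketch lists the adjacent revision pairs between which the relative run time moves noticeably. The run times are copied from the table; the 5-point threshold and the names `RUNS`, `THRESHOLD` and `find_shifts` are assumptions made for illustration and are not part of the report.

```python
# Run times relative to GCC 9.2 (percent), copied from the table in the
# report and listed oldest first.
RUNS = [
    ("d469a71e5a0", "Sep 23 2019", 101),
    ("38a734350fd", "Oct 3 2019", 100),
    ("77ef339456f", "Oct 14 2019", 118),
    ("bf037872d3c", "Oct 24 2019", 101),
    ("d9adca6e663", "Nov 5 2019", 117),
    ("d82f38123b5", "Nov 14 2019", 117),
]

# Percentage points; an arbitrary cut-off chosen for illustration only.
THRESHOLD = 5


def find_shifts(runs, threshold=THRESHOLD):
    """Return (old_rev, new_rev, delta) for each adjacent pair of revisions
    whose relative run time differs by at least `threshold` points."""
    shifts = []
    for (old_rev, _, old_time), (new_rev, _, new_time) in zip(runs, runs[1:]):
        delta = new_time - old_time
        if abs(delta) >= threshold:
            shifts.append((old_rev, new_rev, delta))
    return shifts


if __name__ == "__main__":
    for old_rev, new_rev, delta in find_shifts(RUNS):
        kind = "regression" if delta > 0 else "recovery"
        print(f"{old_rev}..{new_rev}: {delta:+d} points ({kind})")
```

With the figures from the table this reports three windows: a slowdown between 38a734350fd and 77ef339456f, a recovery between 77ef339456f and bf037872d3c, and a second slowdown between bf037872d3c and d9adca6e663, which matches the observation that the regression did not appear at a single point. Each window would be a plausible starting range for a bisection.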