[Bug tree-optimization/94375] New: 548.exchange2_r run time is 8-18% worse than GCC 9 at -Ofast -march=native

jamborm at gcc dot gnu.org Fri, 27 Mar 2020 16:31:00 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94375


            Bug ID: 94375
           Summary: 548.exchange2_r run time is 8-18% worse than GCC 9 at
                    -Ofast -march=native
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

When compiled with trunk revision 26b3e568a60 and options -Ofast
-march=native -mtune=native, SPEC 2017 INTrate benchmark
548.exchange2_r runs 19% slower on AMD Zen2 and 12% slower on Intel
Cascade Lake than when built with GCC 9.2.

It appears that the main culprit is the vectorizer, switching it off
recovers the performance - it is in fact even some 4% better than GCC
9 on AMD).

Side note: with --param ipa-cp-eval-threshold=1 --param
ipa-cp-unit-growth=80 one can exchange that is 25% faster yet but that
is a different issue.

This started happening in the autumn but not exactly at one point, as
the following table of run-times relative to GCC 9.2 shows. 

Revision:                  time 
-------------------------  ----
d82f38123b5 (Nov 14 2019)  117%
d9adca6e663 (Nov 5 2019)   117%
bf037872d3c (Oct 24 2019)  101%
77ef339456f (Oct 14 2019)  118%
38a734350fd (Oct 3 2019)   100%
d469a71e5a0 (Sep 23 2019)  101%


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

[Bug tree-optimization/94375] New: 548.exchange2_r run time is 8-18% worse than GCC 9 at -Ofast -march=native

Reply via email to