https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102473

            Bug ID: 102473
           Summary: 521.wrf_r 5% slower at -Ofast and generic x86_64
                    tuning after r12-3426-g8f323c712ea76c
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
                CC: crazylht at gmail dot com
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

All three x86_64 LNT machines have detected a 4.5-5.2% performance
regression of the SPEC FPrate 2017 benchmark 521.wrf_r when compiled
with -Ofast and the default (generic) -march and -mtune.

Zen2 based machine regressed by 5%:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=294.548.0
Zen1 based machine regressed by 5.2%:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=35.548.0
Kabylake based machine regressed by 4.5%:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=34.548.0

On an AMD zen2 based machine I have bisected the regression to commit
r12-3426-g8f323c712ea76c:

8f323c712ea76cc4506b03895e9b991e4e4b2baf is the first bad commit
commit 8f323c712ea76cc4506b03895e9b991e4e4b2baf
Author: liuhongt <hongtao....@intel.com>
Date:   Tue Sep 7 12:39:04 2021 +0800

    Optimize v4sf reduction.

    gcc/ChangeLog:

            PR target/101059
            * config/i386/sse.md (reduc_plus_scal_<mode>): Split to ..
            (reduc_plus_scal_v4sf): .. this, New define_expand.
            (reduc_plus_scal_v2df): .. and this, New define_expand.


I have confirmed that the commit causes a similar regression on
another Intel Skylake server.

On the Zen2 machine, this is the difference in samples collected by
perf for different symbols (before is commit 60eec23b5ed, after commit
8f323c712ea):

| Symbol                                      | sys lib | Before | After |  diff |     % |
|---------------------------------------------+---------+--------+-------+-------+-------|
| __logf_fma                                  | yes     |  68882 | 68940 |   +58 | +0.08 |
| __atanf                                     | yes     |  66664 | 66196 |  -468 | -0.70 |
| __module_advect_em_MOD_advect_scalar_pd     | no      |  62286 | 62348 |   +62 | +0.10 |
| __powf_fma                                  | yes     |  56213 | 56127 |   -86 | -0.15 |
| __module_mp_wsm5_MOD_nislfv_rain_plm        | no      |  46990 | 48340 | +1350 | +2.87 |
| __module_mp_wsm5_MOD_wsm52d                 | no      |  41031 | 40968 |   -63 | -0.15 |
| __module_small_step_em_MOD_advance_uv       | no      |  30908 | 30909 |    +1 | +0.00 |
| __module_small_step_em_MOD_advance_w        | no      |  28738 | 28600 |  -138 | -0.48 |
| __module_advect_em_MOD_advect_scalar        | no      |  28400 | 28429 |   +29 | +0.10 |
| __expf_fma                                  | yes     |  26702 | 26516 |  -186 | -0.70 |
| __module_big_step_utilities_em_MOD_phy_prep | no      |  25878 | 25816 |   -62 | -0.24 |
| psim_unstable_                              | no      |  24994 | 25106 |  +112 | +0.45 |
| __module_bl_ysu_MOD_ysu2d                   | no      |  24799 | 25251 |  +452 | +1.82 |
| psih_unstable_                              | no      |  22600 | 23139 |  +539 | +2.38 |
| __module_small_step_em_MOD_advance_mu_t     | no      |  22250 | 22232 |   -18 | -0.08 |
| __memset_avx2_unaligned_erms                | yes     |  21748 | 21613 |  -135 | -0.62 |
| _ZGVbN4vv_powf_sse4                         | yes     |  21206 | 21355 |  +149 | +0.70 |


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
