https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102750

            Bug ID: 102750
           Summary: 433.milc regressed by 10% on AMD zen2 at -Ofast
                    -march=native -flto after r12-3893-g6390c5047adb75
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
                CC: rguenth at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

I have bisected a 10% performance regression of the SPEC 2006 FP
433.milc benchmark on AMD zen2, when compiled with -Ofast -march=native
-flto, to commit r12-3893-g6390c5047adb75.  See also:

 
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=412.70.0&plot.1=289.70.0&;

As Richi asked, I am filing this bug even though I can reproduce the
regression neither on an AMD zen3 machine nor on Intel CascadeLake, because
of the history of the benchmark's performance and because I know milc can be
sensitive to conditions outside our control.  OTOH, the regression
reproduces reliably for me.

Some relevant perf data:

BEFORE:
# Samples: 585K of event 'cycles:u'
# Event count (approx.): 472738682838
#
# Overhead       Samples  Command          Shared Object           Symbol
# ........  ............  ...............  ......................  .........................................
#
    24.59%        140397  milc_peak.mine-  milc_peak.mine-lto-nat  [.] u_shift_fermion
    18.47%        105497  milc_peak.mine-  milc_peak.mine-lto-nat  [.] add_force_to_mom
    15.97%         96343  milc_peak.mine-  milc_peak.mine-lto-nat  [.] mult_su3_na
    15.29%         90027  milc_peak.mine-  milc_peak.mine-lto-nat  [.] mult_su3_nn
     5.55%         35114  milc_peak.mine-  milc_peak.mine-lto-nat  [.] path_product
     4.75%         27693  milc_peak.mine-  milc_peak.mine-lto-nat  [.] compute_gen_staple
     2.76%         16109  milc_peak.mine-  milc_peak.mine-lto-nat  [.] mult_su3_an
     2.42%         14255  milc_peak.mine-  milc_peak.mine-lto-nat  [.] imp_gauge_force.constprop.0
     2.02%         11561  milc_peak.mine-  milc_peak.mine-lto-nat  [.] mult_adj_su3_mat_4vec

AFTER:
# Samples: 634K of event 'cycles:u'
# Event count (approx.): 513635733685
#
# Overhead       Samples  Command          Shared Object           Symbol
# ........  ............  ...............  ......................  .........................................
#
    24.04%        149010  milc_peak.mine-  milc_peak.mine-lto-nat  [.] add_force_to_mom
    23.76%        147370  milc_peak.mine-  milc_peak.mine-lto-nat  [.] u_shift_fermion
    14.19%         90929  milc_peak.mine-  milc_peak.mine-lto-nat  [.] mult_su3_nn
    14.14%         92912  milc_peak.mine-  milc_peak.mine-lto-nat  [.] mult_su3_na
     4.90%         33846  milc_peak.mine-  milc_peak.mine-lto-nat  [.] path_product
     3.89%         24621  milc_peak.mine-  milc_peak.mine-lto-nat  [.] mult_su3_an
     3.62%         22831  milc_peak.mine-  milc_peak.mine-lto-nat  [.] compute_gen_staple
     2.05%         13215  milc_peak.mine-  milc_peak.mine-lto-nat  [.] imp_gauge_force.constprop.0
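
To make the two profiles easier to compare, here is a rough sketch that
computes the per-symbol change in sample counts.  The counts are copied
from the reports above.  The helper name sample_deltas is only
illustrative, and the comparison assumes that both runs were sampled the
same way, so that sample counts are directly comparable:

```python
# Sample counts copied from the BEFORE and AFTER perf reports (cycles:u).
before = {
    "u_shift_fermion": 140397,
    "add_force_to_mom": 105497,
    "mult_su3_na": 96343,
    "mult_su3_nn": 90027,
    "path_product": 35114,
    "compute_gen_staple": 27693,
    "mult_su3_an": 16109,
    "imp_gauge_force.constprop.0": 14255,
    "mult_adj_su3_mat_4vec": 11561,  # not listed in the AFTER report
}
after = {
    "add_force_to_mom": 149010,
    "u_shift_fermion": 147370,
    "mult_su3_nn": 90929,
    "mult_su3_na": 92912,
    "path_product": 33846,
    "mult_su3_an": 24621,
    "compute_gen_staple": 22831,
    "imp_gauge_force.constprop.0": 13215,
}


def sample_deltas(before, after):
    """Return (symbol, delta, relative change) for symbols present in
    both reports, sorted by absolute sample delta, largest first."""
    rows = []
    for sym in before.keys() & after.keys():
        delta = after[sym] - before[sym]
        rows.append((sym, delta, delta / before[sym]))
    return sorted(rows, key=lambda r: r[1], reverse=True)


for sym, delta, rel in sample_deltas(before, after):
    print(f"{sym:30s} {delta:+8d} {rel:+7.1%}")
```

By this measure add_force_to_mom accounts for by far the largest increase
in samples (105497 to 149010), so it looks like the first place to check.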


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
