[Bug tree-optimization/94364] New: 505.mcf_r is 8% faster when compiled with -mprefer-vector-width=128

jamborm at gcc dot gnu.org Fri, 27 Mar 2020 11:08:00 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94364


            Bug ID: 94364
           Summary: 505.mcf_r is 8% faster when compiled with
                    -mprefer-vector-width=128
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

The SPEC 2017 INTrate benchmark 505.mcf_r is 8% slower when compiled
with -Ofast -march=native -mtune=native than when the option
-mprefer-vector-width=128 is added to those flags.  I have observed
this on both AMD Zen2 and Intel Cascade Lake Server CPUs, using master
revision 26b3e568a60.

Better vector width selection would therefore bring a noticeable
speed-up on this benchmark.


Symbol profiles (collected on AMD Rome):

-Ofast -march=native -mtune=native:

  Overhead       Samples  Shared Object    Symbol                          
  ........  ............  ...............  ................................

    28.64%        462302  mcf_r_peak.mine  spec_qsort
    21.58%        348703  mcf_r_peak.mine  cost_compare
    15.81%        255029  mcf_r_peak.mine  primal_bea_mpp
    15.58%        251176  mcf_r_peak.mine  replace_weaker_arc
     7.37%        118646  mcf_r_peak.mine  arc_compare
     6.53%        105337  mcf_r_peak.mine  price_out_impl
     1.38%         22276  mcf_r_peak.mine  update_tree

-Ofast -march=native -mtune=native -mprefer-vector-width=128:

  Overhead       Samples  Shared Object    Symbol                          
  ........  ............  ...............  ................................

    23.57%        354536  mcf_r_peak.mine  spec_qsort
    23.51%        353767  mcf_r_peak.mine  cost_compare
    16.98%        255104  mcf_r_peak.mine  primal_bea_mpp
    16.65%        249891  mcf_r_peak.mine  replace_weaker_arc
     7.29%        109267  mcf_r_peak.mine  arc_compare
     7.09%        106380  mcf_r_peak.mine  price_out_impl
     1.53%         22968  mcf_r_peak.mine  update_tree
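
To make the two profiles easier to compare, here is a minimal sketch
that computes the per-symbol change in sample counts.  The numbers are
copied from the two listings above; the helper name sample_deltas is
only illustrative, and the comparison assumes both runs were sampled
at the same rate, which the report does not state.

```python
# Sample counts copied from the two perf symbol profiles above.
baseline = {
    "spec_qsort": 462302,
    "cost_compare": 348703,
    "primal_bea_mpp": 255029,
    "replace_weaker_arc": 251176,
    "arc_compare": 118646,
    "price_out_impl": 105337,
    "update_tree": 22276,
}
width128 = {
    "spec_qsort": 354536,
    "cost_compare": 353767,
    "primal_bea_mpp": 255104,
    "replace_weaker_arc": 249891,
    "arc_compare": 109267,
    "price_out_impl": 106380,
    "update_tree": 22968,
}


def sample_deltas(before, after):
    """Return {symbol: (delta, percent_change)} for symbols in both profiles."""
    return {
        sym: (after[sym] - before[sym],
              100.0 * (after[sym] - before[sym]) / before[sym])
        for sym in before
        if sym in after
    }


deltas = sample_deltas(baseline, width128)

# Print symbols ordered from the largest drop in samples to the largest gain.
for sym, (delta, pct) in sorted(deltas.items(), key=lambda kv: kv[1][0]):
    print(f"{sym:20s} {delta:+9d} {pct:+7.2f}%")
```

Under that assumption, almost all of the difference among the listed
symbols is in spec_qsort; the other symbols change comparatively
little in either direction.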


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
