https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616
--- Comment #47 from Andrew Roberts <andrewm.roberts at sky dot com> ---
Again with the latest snapshot: gcc version 8.0.1 20180121

matrix.c still needs additional options to get the best out of the Ryzen processor. It is better than before (223029 clocks vs 371978 originally), but 122677 is achievable with the right options. However, the same can also be said for Haswell as things stand. The Haswell (-march=haswell -mtune=haswell) time has gone from 190000 to 230000 clocks, but do we put that down to Meltdown/Spectre updates or compiler updates?

With just -O3

On Ryzen:

Top 5
mult took 115669 clocks -march=ivybridge -mtune=skylake-avx512
mult took 118403 clocks -march=corei7-avx -mtune=skylake-avx512
mult took 119379 clocks -march=core-avx-i -mtune=skylake-avx512
mult took 119735 clocks -march=corei7-avx -mtune=skylake
mult took 119901 clocks -march=sandybridge -mtune=broadwell

mult took 120023 clocks -march=sandybridge -mtune=haswell
mult took 121010 clocks -march=corei7-avx -mtune=haswell
mult took 127371 clocks -march=sandybridge -mtune=x86-64
mult took 151208 clocks -march=btver2 -mtune=generic
mult took 152360 clocks -march=ivybridge -mtune=generic
mult took 173926 clocks -march=haswell -mtune=haswell
mult took 177359 clocks -march=znver1 -mtune=athlon64
mult took 180000 clocks -march=ivybridge -mtune=znver1
mult took 188219 clocks -march=znver1 -mtune=generic
mult took 199721 clocks -march=znver1 -mtune=x86-64
mult took 223029 clocks -march=znver1 -mtune=znver1

Bottom 5
mult took 377398 clocks -march=znver1 -mtune=bdver3
mult took 377650 clocks -march=knl -mtune=bdver3
mult took 378600 clocks -march=core-avx2 -mtune=bonnell
mult took 381447 clocks -march=skylake-avx512 -mtune=haswell
mult took 388837 clocks -march=skylake-avx512 -mtune=bdver4

On Haswell:

Top 5
mult took 133704 clocks -march=ivybridge -mtune=k8-sse3
mult took 150000 clocks -march=btver2 -mtune=k8
mult took 150000 clocks -march=core-avx-i -mtune=x86-64
mult took 150000 clocks -march=corei7-avx -mtune=nano
mult took 150000 clocks -march=corei7-avx -mtune=opteron

mult took 160000 clocks -march=core-avx-i -mtune=haswell
mult took 190000 clocks -march=haswell -mtune=eden-x4
mult took 190000 clocks -march=ivybridge -mtune=generic
mult took 200000 clocks -march=haswell -mtune=x86-64
mult took 230000 clocks -march=haswell -mtune=haswell
mult took 270000 clocks -march=haswell -mtune=generic

Bottom 5
mult took 420000 clocks -march=skylake-avx512 -mtune=bdver2
mult took 420000 clocks -march=znver1 -mtune=bdver3
mult took 420000 clocks -march=znver1 -mtune=bdver4
mult took 430000 clocks -march=bdver2 -mtune=bdver2
mult took 430000 clocks -march=knl -mtune=bdver2

Using -mprefer-vector-width=none -mno-fma -mno-avx2 -O3

On Ryzen:

Top 5
mult took 116558 clocks -march=haswell -mtune=bdver3
mult took 116673 clocks -march=haswell -mtune=skylake
mult took 117268 clocks -march=sandybridge -mtune=skylake-avx512
mult took 117288 clocks -march=broadwell -mtune=nocona
mult took 118450 clocks -march=corei7-avx -mtune=haswell

mult took 119719 clocks -march=core-avx-i -mtune=znver1
mult took 120028 clocks -march=znver1 -mtune=skylake
mult took 122677 clocks -march=znver1 -mtune=znver1
mult took 123423 clocks -march=haswell -mtune=haswell
mult took 127388 clocks -march=skylake -mtune=x86-64
mult took 130475 clocks -march=znver1 -mtune=x86-64
mult took 132374 clocks -march=sandybridge -mtune=generic
mult took 162317 clocks -march=znver1 -mtune=generic

Bottom 5
mult took 300000 clocks -march=nano-x2 -mtune=btver2
mult took 310000 clocks -march=skylake-avx512 -mtune=westmere
mult took 319772 clocks -march=knl -mtune=sandybridge
mult took 320000 clocks -march=eden-x2 -mtune=amdfam10
mult took 330000 clocks -march=atom -mtune=broadwell

On Haswell:

Top 5
mult took 123148 clocks -march=bonnell -mtune=ivybridge
mult took 130262 clocks -march=ivybridge -mtune=silvermont
mult took 135299 clocks -march=core-avx2 -mtune=nano-3000
mult took 150000 clocks -march=core-avx2 -mtune=intel
mult took 150000 clocks -march=haswell -mtune=btver1

mult took 170000 clocks -march=core-avx-i -mtune=haswell
mult took 170000 clocks -march=znver1 -mtune=x86-64
mult took 180000 clocks -march=haswell -mtune=haswell
mult took 180000 clocks -march=znver1 -mtune=generic
mult took 210000 clocks -march=haswell -mtune=generic
mult took 230000 clocks -march=haswell -mtune=x86-64

Bottom 5
mult took 350000 clocks -march=nano-x4 -mtune=nano-2000
mult took 350000 clocks -march=slm -mtune=skylake-avx512
mult took 360000 clocks -march=barcelona -mtune=broadwell
mult took 360000 clocks -march=nano -mtune=corei7
mult took 360000 clocks -march=nocona -mtune=btver2
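
For anyone wanting to compare runs like these, here is a minimal sketch (not the script used to produce the numbers in this comment) that parses result lines of the form "mult took N clocks -march=X -mtune=Y", sorts them fastest first, and reports each one's slowdown relative to the best. The helper names parse_results and rank are hypothetical, and the three sample lines are copied from the Ryzen -O3 results in this comment.

```python
import re

# Matches one result line in the format used in this comment.
LINE_RE = re.compile(r"mult took\s+(\d+) clocks -march=(\S+) -mtune=(\S+)")


def parse_results(text):
    """Return a list of (clocks, march, mtune) tuples found in text."""
    return [(int(c), a, t) for c, a, t in LINE_RE.findall(text)]


def rank(results):
    """Sort fastest first and attach the slowdown relative to the best run."""
    ordered = sorted(results)
    best = ordered[0][0]
    return [(c, a, t, c / best) for c, a, t in ordered]


# Three lines copied from the Ryzen -O3 results in this comment.
sample = """\
mult took 223029 clocks -march=znver1 -mtune=znver1
mult took 115669 clocks -march=ivybridge -mtune=skylake-avx512
mult took 173926 clocks -march=haswell -mtune=haswell
"""

for clocks, march, mtune, ratio in rank(parse_results(sample)):
    print(f"{clocks:>7} clocks  {ratio:4.2f}x  -march={march} -mtune={mtune}")
```

Feeding it the full output of a sweep instead of the three sample lines should make it easier to see how far -march=znver1 -mtune=znver1 sits from the best combination on a given machine.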