https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616

--- Comment #46 from Andrew Roberts <andrewm.roberts at sky dot com> ---
With the latest snapshot:
gcc version 8.0.1 20180121

For mt19937ar, things now look reasonable on Ryzen without any strange
options.

Top 5
mt19937ar took 226849 clocks -march=amdfam10 -mtune=btver2
mt19937ar took 228970 clocks -march=amdfam10 -mtune=barcelona
mt19937ar took 229494 clocks -march=bdver1 -mtune=btver1
mt19937ar took 229524 clocks -march=nano -mtune=nano
mt19937ar took 230003 clocks -march=opteron-sse3 -mtune=athlon64-sse3

mt19937ar took 233793 clocks -march=k8-sse3 -mtune=x86-64
mt19937ar took 241700 clocks -march=corei7 -mtune=generic
mt19937ar took 242373 clocks -march=nano-3000 -mtune=znver1
mt19937ar took 245550 clocks -march=k8-sse3 -mtune=haswell
mt19937ar took 251431 clocks -march=znver1 -mtune=generic
mt19937ar took 262200 clocks -march=znver1 -mtune=znver1
mt19937ar took 276993 clocks -march=haswell -mtune=haswell

Bottom 5
mt19937ar took 341326 clocks -march=nano-x4 -mtune=silvermont
mt19937ar took 341750 clocks -march=core-avx-i -mtune=nocona
mt19937ar took 342457 clocks -march=k8 -mtune=znver1
mt19937ar took 347453 clocks -march=ivybridge -mtune=bonnell
mt19937ar took 364041 clocks -march=haswell -mtune=core-avx-i

With -mno-avx2:
mt19937ar took 235997 clocks -march=znver1 -mtune=opteron
mt19937ar took 233921 clocks -march=nano-1000 -mtune=x86-64
mt19937ar took 243452 clocks -march=znver1 -mtune=x86-64
mt19937ar took 243540 clocks -march=silvermont -mtune=generic
mt19937ar took 247113 clocks -march=znver1 -mtune=generic
mt19937ar took 241368 clocks -march=nano-2000 -mtune=haswell
mt19937ar took 247806 clocks -march=znver1 -mtune=znver1

Compare this with the 430875 clocks it originally took with -march=znver1
-mtune=znver1.
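
To put that comparison in numbers, here is a small sketch that computes the relative change for -march=znver1 -mtune=znver1 on Ryzen. The three clock counts are copied from the results above; the function and constant names are only illustrative.

```python
# Clock counts copied from the Ryzen results in this comment
# (-march=znver1 -mtune=znver1). Names below are illustrative only.
ORIGINAL = 430875         # originally reported figure
SNAPSHOT = 262200         # 20180121 snapshot, default options
SNAPSHOT_NO_AVX2 = 247806 # 20180121 snapshot with -mno-avx2


def reduction_pct(before, after):
    """Percentage of clocks saved going from `before` to `after`."""
    return 100.0 * (before - after) / before


def speedup(before, after):
    """How many times faster `after` is than `before`."""
    return before / after


if __name__ == "__main__":
    for label, now in (("default", SNAPSHOT), ("-mno-avx2", SNAPSHOT_NO_AVX2)):
        print(f"{label}: {reduction_pct(ORIGINAL, now):.1f}% fewer clocks, "
              f"{speedup(ORIGINAL, now):.2f}x speedup")
```

That works out to roughly 39% fewer clocks with default options and roughly 42% fewer with -mno-avx2.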

On Haswell:

Top 5

mt19937ar took 220000 clocks -march=amdfam10 -mtune=amdfam10
mt19937ar took 220000 clocks -march=amdfam10 -mtune=athlon64
mt19937ar took 220000 clocks -march=amdfam10 -mtune=athlon64-sse3
mt19937ar took 220000 clocks -march=amdfam10 -mtune=athlon-fx
mt19937ar took 220000 clocks -march=amdfam10 -mtune=barcelona

mt19937ar took 220000 clocks -march=corei7-avx -mtune=x86-64
mt19937ar took 230000 clocks -march=haswell -mtune=haswell
mt19937ar took 240000 clocks -march=haswell -mtune=generic
mt19937ar took 260000 clocks -march=haswell -mtune=x86-64

Bottom 5 (all some variant of -mtune=bdverZ or -mtune=btverZ)
mt19937ar took 310000 clocks -march=core-avx2 -mtune=bdver1
mt19937ar took 310000 clocks -march=haswell -mtune=bdver1
mt19937ar took 310000 clocks -march=skylake -mtune=bdver1
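
For anyone who wants to rank result lines like these themselves, here is a minimal sketch. It assumes only the line format shown in this comment ("NAME took N clocks -march=X -mtune=Y"); it is not the script used to produce the lists above, and the helper names are hypothetical.

```python
import re

# Matches lines of the form shown in this comment, e.g.
# "mt19937ar took 226849 clocks -march=amdfam10 -mtune=btver2"
LINE_RE = re.compile(
    r"^(?P<bench>\S+) took (?P<clocks>\d+) clocks "
    r"-march=(?P<march>\S+) -mtune=(?P<mtune>\S+)\s*$"
)


def parse_results(text):
    """Return a list of (clocks, march, mtune) tuples, skipping other lines."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            rows.append((int(m["clocks"]), m["march"], m["mtune"]))
    return rows


def rank(rows, n=5):
    """Return the n fastest and the n slowest rows by clock count."""
    ordered = sorted(rows)
    return ordered[:n], ordered[-n:]


# Three lines taken from the Ryzen results above, used as sample input.
SAMPLE = """\
mt19937ar took 226849 clocks -march=amdfam10 -mtune=btver2
mt19937ar took 262200 clocks -march=znver1 -mtune=znver1
mt19937ar took 364041 clocks -march=haswell -mtune=core-avx-i
"""

if __name__ == "__main__":
    top, bottom = rank(parse_results(SAMPLE), n=1)
    print("fastest:", top[0])
    print("slowest:", bottom[0])
```

Sorting on the tuple puts the clock count first, so ties (as in the Haswell list) fall back to alphabetical order of the -march and -mtune values.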
