[Bug target/113570] RISC-V: SPEC2017 549 fotonik3d miscompilation in autovec VLS 256 build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113570

--- Comment #5 from JuzheZhong ---
It seems that we don't have any bugs in current SPEC 2017 testing. So I strongly suggest "full coverage" testing on SPEC 2017, which I mentioned in PR https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087:

-march=rv64gcv --param=riscv-autovec-lmul=m2
-march=rv64gcv --param=riscv-autovec-lmul=m4
-march=rv64gcv --param=riscv-autovec-lmul=m8
-march=rv64gcv --param=riscv-autovec-lmul=dynamic
-march=rv64gcv_zvl256b --param=riscv-autovec-lmul=m2
-march=rv64gcv_zvl256b --param=riscv-autovec-lmul=m4
-march=rv64gcv_zvl256b --param=riscv-autovec-lmul=m8
-march=rv64gcv_zvl256b --param=riscv-autovec-lmul=dynamic
-march=rv64gcv_zvl512b --param=riscv-autovec-lmul=m2
-march=rv64gcv_zvl512b --param=riscv-autovec-lmul=m4
-march=rv64gcv_zvl512b --param=riscv-autovec-lmul=m8
-march=rv64gcv_zvl512b --param=riscv-autovec-lmul=dynamic
-march=rv64gcv --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv --param=riscv-autovec-lmul=m2 --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv --param=riscv-autovec-lmul=m4 --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv --param=riscv-autovec-lmul=m8 --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv --param=riscv-autovec-lmul=dynamic --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv_zvl256b --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv_zvl256b --param=riscv-autovec-lmul=m2 --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv_zvl256b --param=riscv-autovec-lmul=m4 --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv_zvl256b --param=riscv-autovec-lmul=m8 --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv_zvl256b --param=riscv-autovec-lmul=dynamic --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv_zvl512b --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv_zvl512b --param=riscv-autovec-lmul=m2 --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv_zvl512b --param=riscv-autovec-lmul=m4 --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv_zvl512b --param=riscv-autovec-lmul=m8 --param=riscv-autovec-preference=fixed-vlmax
-march=rv64gcv_zvl512b --param=riscv-autovec-lmul=dynamic --param=riscv-autovec-preference=fixed-vlmax

Could you trigger this testing?
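
The flag combinations requested in this comment follow a regular pattern, so a test driver could generate them rather than hard-code them. The following is a minimal sketch of that idea, not an existing script; the names `MARCHES`, `LMULS`, `FIXED_VLMAX` and `flag_matrix` are made up for illustration, and only the flag strings themselves come from the comment.

```python
from itertools import product

# Values taken from the comment; the variable names are illustrative only.
MARCHES = ["rv64gcv", "rv64gcv_zvl256b", "rv64gcv_zvl512b"]
LMULS = ["m2", "m4", "m8", "dynamic"]
FIXED_VLMAX = "--param=riscv-autovec-preference=fixed-vlmax"


def flag_matrix():
    """Return the flag strings in the same order as the comment lists them."""
    configs = []
    # Default preference: every -march paired with an explicit LMUL.
    for march, lmul in product(MARCHES, LMULS):
        configs.append(f"-march={march} --param=riscv-autovec-lmul={lmul}")
    # fixed-vlmax: every -march with the default LMUL (None), then each LMUL.
    for march, lmul in product(MARCHES, [None] + LMULS):
        parts = [f"-march={march}"]
        if lmul is not None:
            parts.append(f"--param=riscv-autovec-lmul={lmul}")
        parts.append(FIXED_VLMAX)
        configs.append(" ".join(parts))
    return configs


if __name__ == "__main__":
    for flags in flag_matrix():
        print(flags)
```

Each returned string could then be handed to whatever mechanism the SPEC config uses for optimization flags; that wiring is left out because it depends on the local setup.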
[Bug target/113570] RISC-V: SPEC2017 549 fotonik3d miscompilation in autovec VLS 256 build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113570

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #4 from Jeffrey A. Law ---
Just looked a little closer. Looks like the same signature. Essentially you can't use -Ofast (or fast-math in general) with this benchmark, especially with vectorization.

*** This bug has been marked as a duplicate of bug 84201 ***
[Bug target/113570] RISC-V: SPEC2017 549 fotonik3d miscompilation in autovec VLS 256 build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113570

--- Comment #3 from Jeffrey A. Law ---
See PR 84201 for more details, as well as https://www.spec.org/cpu2017/Docs/benchmarks/549.fotonik3d_r.html
[Bug target/113570] RISC-V: SPEC2017 549 fotonik3d miscompilation in autovec VLS 256 build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113570

--- Comment #2 from Robin Dapp ---
I'm pretty certain this is "works as intended": -Ofast causes the precision to differ from -O3 (and the result is dependent on the target). See also:

  It has been reported that with gfortran -Ofast -march=native verification errors may be seen, for example:

    *** Miscompare of pscyee.out; for details see
        /data2/johnh/out.v1.1.5/benchspec/CPU/549.fotonik3d_r/run/run_base_refrate_Ofastnative./pscyee.out.mis
    0646:  -1.91273086037953E-17, -1.46491401919706E-15,
           -1.91273086057460E-17, -1.46491401919687E-15,
                       ^
    0668:  -1.91251317582607E-17, -1.42348205527085E-15,
           -1.91251317602571E-17, -1.42348205527068E-15,
                      ^

  The errors may occur with other compilers as well, depending on your particular compiler version, hardware platform, and optimization options.

  The problem arises when a compiler chooses to vectorize a particular loop from power.F90 line number 369

    369    do ifreq = 1, tmppower%nofreq
    370      frequency(ifreq,ipower) = freq
    371      freq = freq + freqstep
    372    end do

from https://www.spec.org/cpu2017/Docs/benchmarks/549.fotonik3d_r.html which further states:

  Workaround: You will need to specify optimization options that do not cause this loop to be vectorized. For example, on a particular platform studied in mid-2020 using GCC 10.2, these results were seen:

    OK    -Ofast -march=native -fno-unsafe-math-optimizations

  If you apply one of the above workarounds in base, be sure to obey the same-for-all rule which requires that all benchmarks in a suite of a given language must use the same flags. For example, the sections below turn off unsafe math optimizations for all Fortran modules in the floating point rate and floating point speed benchmark suites:

    default=base:
        OPTIMIZE  = -Ofast -flto -march=native
    fprate,fpspeed=base:
        FOPTIMIZE = -fno-unsafe-math-optimizations
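
To illustrate why vectorizing the quoted loop can change the low-order digits, here is a small Python sketch. It compares the loop as written (one floating-point add per element, in order) with one plausible reassociated form in which each of 4 lanes advances by 4*freqstep per iteration. The lane count, the start and step values, and the function names are assumptions chosen for illustration; this is not the code GCC generates.

```python
def sequential(start, step, n):
    """The loop as written: store freq, then add step, one element at a time."""
    out, freq = [], start
    for _ in range(n):
        out.append(freq)
        freq = freq + step
    return out


def lane_wise(start, step, n, lanes=4):
    """A reassociated form: lane k starts at start + k*step and every lane
    advances by lanes*step per iteration (only legal under fast-math rules)."""
    vec = [start + k * step for k in range(lanes)]
    vec_step = lanes * step
    out = []
    while len(out) < n:
        out.extend(vec)
        vec = [v + vec_step for v in vec]
    return out[:n]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    a = sequential(0.0, 0.1, 1000)
    b = lane_wise(0.0, 0.1, 1000)
    print("identical:", a == b)
    print("max abs diff:", max_abs_diff(a, b))
```

Both versions are mathematically the same sequence, but the rounding errors accumulate along different paths, so the stored values need not match bit for bit. Whether such a difference survives to the benchmark's output check depends on how the values are used downstream.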
[Bug target/113570] RISC-V: SPEC2017 549 fotonik3d miscompilation in autovec VLS 256 build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113570

--- Comment #1 from Vineet Gupta ---
This one is a headache as we don't know where the problem is, and it takes ~7 hr for a QEMU run to finish. The good thing is that there is a comparison point, as the VLA build works fine.

(1) Use bloat-o-meter (from the Linux kernel) to diff the VLS (nok) and VLA (ok) builds.

Function                                     old    new  delta
init_                                       6722   6752    +30
__huygens_mod_MOD_huygense                 17078  17990   +912
__huygens_mod_MOD_huygensh                 14412  15614  +1202
__huygens_mod_MOD_uin.isra                  2922   2944    +22
__material_mod_MOD_mat_updatee              4228   4272    +44
__mur_mod_MOD_mur_init                      9054   9162   +108
__mur_mod_MOD_mur_storee                    2546   2446   -100
__mur_mod_MOD_mur_updatee                  10124  10354   +230
__pec_mod_MOD_pec_init                      8522   8046   -476
__plane_source_mod_MOD_plane_source_init    6942   7072   +130
__power_mod_MOD___copy_power_mod_Powertyp     14     26    +12
__power_mod_MOD_power_dft                   1280   1156   -124
__power_mod_MOD_power_init                  9830   9994   +164
__power_mod_MOD_power_print                 2304   1556   -748
__upml_mod_MOD_upml_allocate.isra          19384  19614   +230
__upml_mod_MOD_upml_init                   25564  26178   +614
__upml_mod_MOD_upml_set_eps_arrays.isra     5406   6356   +950
__upml_mod_MOD_upml_updatee                32516  27130  -5386
__upml_mod_MOD_upml_updatee_simple         36112  29612  -6500
__upml_mod_MOD_upml_updateh                15962  15992    +30
writeout_                                  10856  11002   +146

(2) Assuming the issue is in one of the functions above (which may not be true), manually rebuild, changing the build flags to VLA one module at a time, then relink, rerun QEMU and compare the output.
    - This identified power.fppized.f90 as the culprit.

(3) Manually split the power module into multiple files, one function at a time, and repeat the same exercise to identify the function.
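
The one-module-at-a-time search in step (2) can be sketched as a small driver. This is only an outline under stated assumptions: `output_ok` is a hypothetical hook that the caller would implement to rebuild the given modules with VLA flags, relink, run under QEMU and compare against the reference output. The module names in the usage example are inferred from the symbol prefixes in the table, and the stand-in oracle simply encodes the result reported in step (2).

```python
def find_culprit_modules(modules, output_ok):
    """Return the modules whose switch from VLS to VLA flags, on its own,
    makes the run pass.

    output_ok(vla_modules) is a caller-supplied (hypothetical) hook: it
    rebuilds exactly the modules in vla_modules with VLA flags, keeps the
    rest as VLS, relinks, runs the benchmark and returns True if the output
    matches the reference.
    """
    return [m for m in modules if output_ok({m})]


if __name__ == "__main__":
    # Module names inferred from the symbol prefixes in the size table.
    modules = ["huygens", "material", "mur", "pec",
               "plane_source", "power", "upml"]

    # Stand-in oracle for demonstration only: pretends the run passes
    # exactly when the power module is built with VLA flags.
    def fake_output_ok(vla_modules):
        return "power" in vla_modules

    print(find_culprit_modules(modules, fake_output_ok))
```

With a run taking about 7 hours, the linear search costs one full run per module. The same hook could be reused for step (3) by passing per-function source files instead of modules.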