Dear Ole,

I suspect the problem is more with OpenBLAS than GCC.

OpenBLAS 0.3.20 probably doesn't detect AMD Genoa (Zen4) correctly yet, so it doesn't try to use AVX-512 instructions there.

OpenBLAS 0.3.21 detects Genoa and enables AVX-512, but there's a bug in one of the kernels being used.
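
To rule out the hardware/OS side first, it may help to confirm that the node actually advertises AVX-512 to the kernel. A minimal sketch, assuming a Linux host whose /proc/cpuinfo has the usual "flags" lines; the helper name avx512_flags is made up for illustration and the sample text is not taken from your machine:

```python
def avx512_flags(cpuinfo_text):
    """Return the sorted AVX-512 feature flags found in /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 Linux lists CPU features on lines whose key is "flags"
        if sep and key.strip() == "flags":
            found.update(f for f in value.split() if f.startswith("avx512"))
    return sorted(found)


# Illustrative input only; on the node itself, pass open("/proc/cpuinfo").read()
sample = "processor : 0\nflags : fpu avx2 avx512f avx512vl\n"
print(avx512_flags(sample))  # ['avx512f', 'avx512vl']
```

An empty list would mean the flags are not exposed at all, in which case the OpenBLAS kernel selection is unlikely to be the whole story.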

I would check whether you still see any problems with a more recent OpenBLAS version, like OpenBLAS-0.3.23-GCC-12.3.0.eb.

If the newer version works, we may be able to track down the fix and patch OpenBLAS 0.3.21 to resolve the problem you're seeing...


regards,

Kenneth

On 28/09/2023 09:26, Ole Holm Nielsen wrote:
It's interesting that when building the foss-2022a toolchain instead of foss-2022b, the build of OpenBLAS with GCC 11.3.0 succeeds without errors:

== processing EasyBuild easyconfig /home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.20-GCC-11.3.0.eb
== building and installing OpenBLAS/0.3.20-GCC-11.3.0...
== fetching files...
== ... (took 4 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 56 secs)
== testing...
== ... (took 2 mins 24 secs)
== installing...
== ... (took 1 secs)
== taking care of extensions...
== restore after iterating...
== postprocessing...
== sanity checking...
== cleaning up...
== creating module...
== permissions...
== packaging...
== COMPLETED: Installation ended successfully (took 3 mins 28 secs)

The only difference here appears to be GCC version 12.2.0 versus 11.3.0!

Any ideas about what's causing this error in the tests?

Perhaps GCC version 12.2.0 tries to use the new AVX-512 instructions in AMD Genoa and has a bug?

Thanks,
Ole


On 9/26/23 08:04, Ole Holm Nielsen wrote:
I'm starting EasyBuild up on our new AMD "Genoa" platform with 1 AMD EPYC 9124 16-Core Processor with 2 threads/core, 384 GB RAM, and AlmaLinux 8.8 OS.

Unfortunately, building the foss-2022b toolchain exits during the testing phase of OpenBLAS-0.3.21-GCC-12.2.0.eb as shown below.  Does anyone have ideas about what might be wrong?

$ eb foss-2022b.eb -r
(lines deleted)
== processing EasyBuild easyconfig /home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.21-GCC-12.2.0.eb
== building and installing OpenBLAS/0.3.21-GCC-12.2.0...
== fetching files...
== ... (took 7 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 53 secs)
== testing...
== ... (took 12 secs)
== FAILED: Installation ended unsuccessfully (build directory: /dev/shm/OpenBLAS/0.3.21/GCC-12.2.0): build failed (first 300 chars): cmd " make tests  BINARY='64'  CC='gcc'  FC='gfortran' MAKE_NB_JOBS='-1' USE_OPENMP='1'  USE_THREAD='1' " exited with exit code 2 and output: /home/modules/software/binutils/2.39-GCCcore-12.2.0/bin/ld: warning: /tmp/eb-74m3kzgo/ccy1Gkzg.o: missing .note.GNU-stack section implies executable stack
/ (took 1 min 14 secs)
== Results of the build can be found in the log file(s) /tmp/eb-74m3kzgo/easybuild-OpenBLAS-0.3.21-20230925.161149.UfDUO.log
ERROR: Build of /home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.21-GCC-12.2.0.eb failed (err: 'build failed (first 300 chars): cmd " make tests BINARY=\'64\'  CC=\'gcc\'  FC=\'gfortran\'  MAKE_NB_JOBS=\'-1\' USE_OPENMP=\'1\'  USE_THREAD=\'1\' " exited with exit code 2 and output:\n/home/modules/software/binutils/2.39-GCCcore-12.2.0/bin/ld: warning: /tmp/eb-74m3kzgo/ccy1Gkzg.o: missing .note.GNU-stack section implies executable stack\n/')


The log file shows an error in test_kernel_regress.c:50:

(lines deleted)
./openblas_utest
TEST 1/37 max:smax_zero [OK]
TEST 2/37 max:dmax_positive [OK]
TEST 3/37 max:smax_negative [OK]
TEST 4/37 min:smin_zero [OK]
TEST 5/37 min:dmin_positive [OK]
TEST 6/37 min:smin_negative [OK]
TEST 7/37 amax:damax [OK]
TEST 8/37 amax:samax [OK]
TEST 9/37 ismax:negative_step_2 [OK]
TEST 10/37 ismax:positive_step_2 [OK]
TEST 11/37 ismin:negative_step_2 [OK]
TEST 12/37 ismin:positive_step_2 [OK]
TEST 13/37 drotmg:drotmg_D1_big_D2_big_flag_zero [OK]
TEST 14/37 drotmg:rotmg_D1eqD2_X1eqX2 [OK]
TEST 15/37 drotmg:rotmg_issue1452 [OK]
TEST 16/37 drotmg:rotmg [OK]
TEST 17/37 axpy:caxpy_inc_0 [OK]
TEST 18/37 axpy:saxpy_inc_0 [OK]
TEST 19/37 axpy:zaxpy_inc_0 [OK]
TEST 20/37 axpy:daxpy_inc_0 [OK]
TEST 21/37 zdotu:zdotu_offset_1 [OK]
TEST 22/37 zdotu:zdotu_n_1 [OK]
TEST 23/37 dsdot:dsdot_n_1 [OK]
TEST 24/37 swap:cswap_inc_0 [OK]
TEST 25/37 swap:sswap_inc_0 [OK]
TEST 26/37 swap:zswap_inc_0 [OK]
TEST 27/37 swap:dswap_inc_0 [OK]
TEST 28/37 rot:csrot_inc_0 [OK]
TEST 29/37 rot:srot_inc_0 [OK]
TEST 30/37 rot:zdrot_inc_0 [OK]
TEST 31/37 rot:drot_inc_0 [OK]
TEST 32/37 dnrm2:dnrm2_tiny [OK]
TEST 33/37 dnrm2:dnrm2_inf [OK]
TEST 34/37 potrf:smoketest_trivial [OK]
TEST 35/37 potrf:bug_695 [OK]
TEST 36/37 kernel_regress:skx_avx [FAIL]
   ERR: test_kernel_regress.c:50  expected 0.000e+00, got 6.734e+01 (diff -6.734e+01, tol 1.000e-10)
TEST 37/37 fork:safety_after_fork_in_parent [OK]
RESULTS: 37 tests (36 ok, 1 failed, 0 skipped) ran in 3 ms
make[1]: *** [Makefile:52: run_test] Error 1
make[1]: Leaving directory '/dev/shm/OpenBLAS/0.3.21/GCC-12.2.0/OpenBLAS-0.3.21/utest'
make: *** [Makefile:150: tests] Error 2
  (at easybuild/tools/run.py:681 in parse_cmd_output)
== 2023-09-25 16:13:04,292 build_log.py:267 INFO ... (took 12 secs)
== 2023-09-25 16:13:04,292 filetools.py:2012 INFO Removing lock /home/modules/software/.locks/_home_modules_software_OpenBLAS_0.3.21-GCC-12.2.0.lock...
== 2023-09-25 16:13:04,293 filetools.py:383 INFO Path /home/modules/software/.locks/_home_modules_software_OpenBLAS_0.3.21-GCC-12.2.0.lock successfully removed.
== 2023-09-25 16:13:04,293 filetools.py:2016 INFO Lock removed: /home/modules/software/.locks/_home_modules_software_OpenBLAS_0.3.21-GCC-12.2.0.lock
== 2023-09-25 16:13:04,293 easyblock.py:4277 WARNING build failed (first 300 chars): cmd " make tests  BINARY='64'  CC='gcc' FC='gfortran' MAKE_NB_JOBS='-1'  USE_OPENMP='1'  USE_THREAD='1' " exited with exit code 2 and output: /home/modules/software/binutils/2.39-GCCcore-12.2.0/bin/ld: warning: /tmp/eb-74m3kzgo/ccy1Gkzg.o: missing .note.GNU-stack section implies executable stack
/
== 2023-09-25 16:13:04,293 easyblock.py:328 INFO Closing log for application name OpenBLAS version 0.3.21
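
When comparing several OpenBLAS builds, it can be convenient to pull just the failing test names out of the openblas_utest output rather than reading the whole log. A rough sketch, assuming the "TEST n/m name [STATUS]" line format shown above; the function name failed_tests and the regular expression are illustrative and not part of OpenBLAS or EasyBuild:

```python
import re

# Matches lines such as "TEST 36/37 kernel_regress:skx_avx [FAIL]"
TEST_LINE = re.compile(r"^TEST\s+(\d+)/(\d+)\s+(\S+)\s+\[(\w+)\]\s*$")


def failed_tests(log_text):
    """Return the names of tests marked [FAIL] in openblas_utest-style output."""
    failed = []
    for line in log_text.splitlines():
        m = TEST_LINE.match(line.strip())
        if m and m.group(4) == "FAIL":
            failed.append(m.group(3))
    return failed


# Sample lines copied from the log above
sample = """\
TEST 35/37 potrf:bug_695 [OK]
TEST 36/37 kernel_regress:skx_avx [FAIL]
   ERR: test_kernel_regress.c:50  expected 0.000e+00, got 6.734e+01 (diff -6.734e+01, tol 1.000e-10)
TEST 37/37 fork:safety_after_fork_in_parent [OK]
"""
print(failed_tests(sample))  # ['kernel_regress:skx_avx']
```

Running the same function over the EasyBuild log of each build would show at a glance whether kernel_regress:skx_avx is the only test that differs.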

Thanks,
Ole


