When I recompile clpeak, it works, so it seems that the issue is in clpeak:
Test project /tmp/clpeak-1.1.2/obj-x86_64-linux-gnu
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 1
Start 1: clpeak_test_run
1: Test command: /home/picca/tmp/clpeak-1.1.2/obj-x86_64-linux-gnu/clpeak
1: Working Directory: /home/picca/tmp/clpeak-1.1.2/obj-x86_64-linux-gnu
1: Test timeout computed to be: 10000000
1:
1: Platform: AMD Accelerated Parallel Processing
1: Device: gfx90c:xnack-
1: Driver version : 3590.0 (HSA1.1,LC) (Linux x64)
1: Compute units : 6
1: Clock frequency : 1500 MHz
1:
1: Global memory bandwidth (GBPS)
1: float : 15.60
1: float2 : 15.58
1: float4 : 17.05
1: float8 : 18.45
1: float16 : 18.06
1:
1: Single-precision compute (GFLOPS)
1: float : 1103.20
1: float2 : 1102.38
1: float4 : 1098.42
1: float8 : 1090.52
1: float16 : 1073.20
1:
1: Half-precision compute (GFLOPS)
1: half : 1104.12
1: half2 : 2116.23
1: half4 : 2123.43
1: half8 : 2094.10
1: half16 : 2048.29
1:
1: Double-precision compute (GFLOPS)
1: double : 71.04
1: double2 : 70.81
1: double4 : 70.80
1: double8 : 70.34
1: double16 : 70.42
1:
1: Integer compute (GIOPS)
1: int : 225.82
1: int2 : 225.59
1: int4 : 225.31
1: int8 : 225.06
1: int16 : 224.10
1:
1: Integer compute Fast 24bit (GIOPS)
1: int : 1056.93
1: int2 : 1051.22
1: int4 : 1051.37
1: int8 : 1037.44
1: int16 : 939.23
1:
1: Transfer bandwidth (GBPS)
1: enqueueWriteBuffer : 8.38
1: enqueueReadBuffer : 8.70
1: enqueueWriteBuffer non-blocking : 8.65
1: enqueueReadBuffer non-blocking : 8.71
1: enqueueMapBuffer(for read) : 12156.92
1: memcpy from mapped ptr : 8.63
1: enqueueUnmap(after write) : 79639.61
1: memcpy to mapped ptr : 8.65
1:
1: Kernel launch latency : 14.63 us
1:
1:
1: Platform: Portable Computing Language
1: Device: cpu-haswell-AMD Ryzen 5 4500U with Radeon Graphics
1: Driver version : 6.0+debian (Linux x64)
1: Compute units : 6
1: Clock frequency : 2375 MHz
1: 64 warnings generated.
1:
1: Global memory bandwidth (GBPS)
1: float : 14.73
1: float2 : 14.60
1: float4 : 15.98
1: float8 : 12.29
1: float16 : 14.40
1:
1: Single-precision compute (GFLOPS)
1: float : 7.96
1: float2 : 16.10
1: float4 : 32.47
1: float8 : 65.57
1: float16 : 128.78
1:
1: No half precision support! Skipped
1:
1: Double-precision compute (GFLOPS)
1: double : 7.98
1: double2 : 15.96
1: double4 : 32.40
1: double8 : 62.80
1: double16 : 123.23
1:
1: Integer compute (GIOPS)
1: int : 11.94
1: int2 : 21.54
1: int4 : 44.47
1: int8 : 88.77
1: int16 : 167.00
1:
1: Integer compute Fast 24bit (GIOPS)
1: int : 12.00
1: int2 : 21.76
1: int4 : 43.90
1: int8 : 88.93
1: int16 : 159.47
1:
1: Transfer bandwidth (GBPS)
1: enqueueWriteBuffer : 9.13
1: enqueueReadBuffer : 9.30
1: enqueueWriteBuffer non-blocking : 9.56
1: enqueueReadBuffer non-blocking : 9.49
1: enqueueMapBuffer(for read) : 15449.52
1: memcpy from mapped ptr : 8.82
1: enqueueUnmap(after write) : 20297.58
1: memcpy to mapped ptr : 9.41
1:
1: Kernel launch latency : 10.06 us
1:
1/1 Test #1: clpeak_test_run .................. Passed 173.09 sec
100% tests passed, 0 tests failed out of 1