Hi Clement,

I am the upstream author of pyFAI, and this bug probably has little to do
with Debian packaging. That said, I don't test pyFAI on AMD hardware
regularly.

Can you run the tests more verbosely, to find out exactly which test is
failing? Maybe we should follow up on this discussion in a pyFAI issue:
https://github.com/silx-kit/pyFAI/issues/2584

Could it be that it is `pyFAI-benchmark -h` which fails?
I also find it surprising that your card advertises WG=1024 while the output
reports a limit of WG=256... but maybe this is unrelated.

Cheers,
Jerome

On Mon, 28 Jul 2025 13:58:08 +0000
LONGEAC Clement <[email protected]> wrote:

> Hello,
>
> I am an intern at Synchrotron-Soleil working on the pyFAI package; my
> supervisors are Frederic-Emmanuel PICCA and Emmanuel FARHI. I implemented
> ROCm and PoCL autopkgtests, using OpenCL, for the amd64 and arm64
> architectures, and I run them locally on the pyfai package. The aim is to
> get an overview of the code's compatibility with the various AMD graphics
> cards available for CI, with ROCm for the GPU and PoCL for the CPU.
>
> However, I have several problems. It takes a very long time to build, so
> some tests are marked as timed out whatever I do. I set the time limit to
> 42 200 seconds. In the ROCm part, I get this error:
>
> "Maximum valid workgroup size 256 on device <pyopencl.Device 'gfx1034' on
> 'AMD Accelerated Parallel Processing' at 0xe90bf90> 0.0
> 1.871411379818157e-05"
>
> I don't know where it comes from or how to solve it, even after a lot of
> research. It seems to be hardware-related: to solve it, we would need an
> AMD GPU marketed as PRO, not a gaming graphics card.
>
> Our config:
> *******
> Agent 2
> *******
>   Name: gfx1034
>   Uuid: GPU-XX
>   Marketing Name: AMD Radeon RX 6400
>   Vendor Name: AMD
>   Feature: KERNEL_DISPATCH
>   Profile: BASE_PROFILE
>   Float Round Mode: NEAR
>   Max Queue Number: 128(0x80)
>   Queue Min Size: 64(0x40)
>   Queue Max Size: 131072(0x20000)
>   Queue Type: MULTI
>   Node: 1
>   Device Type: GPU
>   Cache Info:
>     L1: 16(0x10) KB
>     L2: 1024(0x400) KB
>     L3: 16384(0x4000) KB
>   Chip ID: 29759(0x743f)
>   ASIC Revision: 0(0x0)
>   Cacheline Size: 128(0x80)
>   Max Clock Freq. (MHz): 2320
>   BDFID: 20224
>   Internal Node ID: 1
>   Compute Unit: 12
>   SIMDs per CU: 2
>   Shader Engines: 1
>   Shader Arrs. per Eng.: 2
>   WatchPts on Addr. Ranges:4
>   Coherent Host Access: FALSE
>   Features: KERNEL_DISPATCH
>   Fast F16 Operation: TRUE
>   Wavefront Size: 32(0x20)
>   Workgroup Max Size: 1024(0x400)
>   Workgroup Max Size per Dimension:
>     x 1024(0x400)
>     y 1024(0x400)
>     z 1024(0x400)
>   Max Waves Per CU: 32(0x20)
>   Max Work-item Per CU: 1024(0x400)
>   Grid Max Size: 4294967295(0xffffffff)
>   Grid Max Size per Dimension:
>     x 4294967295(0xffffffff)
>     y 4294967295(0xffffffff)
>     z 4294967295(0xffffffff)
>   Max fbarriers/Workgrp: 32
>   Packet Processor uCode:: 129
>   SDMA engine uCode:: 34
>   IOMMU Support:: None
>   Pool Info:
>     Pool 1
>       Segment: GLOBAL; FLAGS: COARSE GRAINED
>       Size: 4177920(0x3fc000) KB
>       Allocatable: TRUE
>       Alloc Granule: 4KB
>       Alloc Recommended Granule:2048KB
>       Alloc Alignment: 4KB
>       Accessible by all: FALSE
>     Pool 2
>       Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
>       Size: 4177920(0x3fc000) KB
>       Allocatable: TRUE
>       Alloc Granule: 4KB
>       Alloc Recommended Granule:2048KB
>       Alloc Alignment: 4KB
>       Accessible by all: FALSE
>     Pool 3
>       Segment: GROUP
>       Size: 64(0x40) KB
>       Allocatable: FALSE
>       Alloc Granule: 0KB
>       Alloc Recommended Granule:0KB
>       Alloc Alignment: 0KB
>       Accessible by all: FALSE
>   ISA Info:
>     ISA 1
>       Name: amdgcn-amd-amdhsa--gfx1034
>       Machine Models: HSA_MACHINE_MODEL_LARGE
>       Profiles: HSA_PROFILE_BASE
>       Default Rounding Mode: NEAR
>       Fast f16: TRUE
>       Workgroup Max Size: 1024(0x400)
>       Workgroup Max Size per Dimension:
>         x 1024(0x400)
>         y 1024(0x400)
>         z 1024(0x400)
>       Grid Max Size: 4294967295(0xffffffff)
>       Grid Max Size per Dimension:
>         x 4294967295(0xffffffff)
>         y 4294967295(0xffffffff)
>         z 4294967295(0xffffffff)
>       FBarrier Max Size: 32
>
>
> I added ROCm and PoCL tools in debian/tests/control:
>
> # tests that must pass
>
> Test-Command: no-opencl
> Architecture: !amd64 !arm64 !armel !armhf !i386
> Depends:
>  bitshuffle,
>  python3-all,
>  python3-pyfai,
>  python3-tk,
>  xauth,
>  xvfb,
>  python3-pyqt5.qtopengl,
>  python3-pyqt5,
>  libgl1-mesa-glx,
> Features: test-name=no-opencl
> Restrictions: allow-stderr, skip-not-installable
>
>
> Test-Command: rocm-test-launcher debian/tests/opencl
> Architecture: amd64 arm64 armel armhf i386
> Depends:
>  bitshuffle,
>  clinfo,
>  rocminfo,
>  libnuma1,
>  ocl-icd-libopencl1,
>  rocm-opencl-icd,
>  pkg-rocm-tools,
>  python3-all,
>  python3-pyfai,
>  python3-tk,
>  xauth,
>  xvfb,
>  libclang-common-17-dev,
>  hipcc,
>  rocm-device-libs-17,
> Features: test-name=opencl-rocm
> Restrictions: allow-stderr, skip-not-installable, skippable
>
> Test-Command: debian/tests/opencl
> Architecture: amd64 arm64 armel armhf i386
> Depends:
>  bitshuffle,
>  pocl-opencl-icd,
>  clinfo,
>  python3-all,
>  python3-pyfai,
>  python3-tk,
>  xauth,
>  xvfb,
>  libclang-common-17-dev,
> Features: test-name=opencl-pocl
> Restrictions: allow-stderr, skip-not-installable
>
>
> Test-Command: xvfb-run -s "-screen 0 1024x768x24 -ac +extension GLX +render -noreset" sh debian/tests/gui
> Depends:
>  debhelper,
>  mesa-utils,
>  @,
>  xauth,
>  xvfb,
> Restrictions: allow-stderr
>
> And the file debian/tests/opencl:
>
> #!/bin/sh -e
>
> # Check that OpenCL isn't totally broken (note that it isn't totally
> # working either)
> # Uses device 0 platform 0, i.e.
> # to use a real GPU, manually install its opencl-icd before running this
> # Mark the test as flaky, the important part is the CPU computation.
>
> export PYFAI_OPENCL=True
> export PYOPENCL_COMPILER_OUTPUT=1
>
> # skip test
> # TestAzimHalfFrelon.test_medfilt1d
>
> cp bootstrap.py run_tests.py pyproject.toml version.py README.rst "$AUTOPKGTEST_TMP"
>
> for py in $(py3versions -s 2>/dev/null)
> do cd "$AUTOPKGTEST_TMP"
>    echo "Testing with $py:"
>    xvfb-run -a --server-args="-screen 0 1024x768x24" $py run_tests.py -v -m --low-mem --installed
> done
>
> The error log for the ROCm part:
>
> When the autopkgtest for ROCm is launched, I get this error at the end.
> Where does this come from?
>
> INFO:memProf:Time: 60.074s RAM: 0.000 Mb pyFAI.test.test_containers.TestContainer.test_rebin1d
> ======================================================================
> FAIL: testPyfaiBenchmark (pyFAI.test.test_scripts.TestScriptsHelp.testPyfaiBenchmark)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/lib/python3/dist-packages/pyFAI/test/test_scripts.py", line 105, in testPyfaiBenchmark
>     self.executeAppHelp("pyFAI-benchmark", "pyFAI.app.benchmark")
>     ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/lib/python3/dist-packages/pyFAI/test/test_scripts.py", line 86, in executeAppHelp
>     self.executeCommandLine(command_line, env)
>     ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
>   File "/usr/lib/python3/dist-packages/pyFAI/test/test_scripts.py", line 79, in executeCommandLine
>     self.assertEqual(p.returncode, 0)
>     ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
> AssertionError: 1 != 0
> ----------------------------------------------------------------------
> Ran 453 tests in 5584.067s
> FAILED (failures=1, skipped=95)
> Maximum valid workgroup size 256 on device <pyopencl.Device 'gfx1034' on 'AMD Accelerated Parallel Processing' at 0xe90bf90>
> 0.0 1.871411379818157e-05
> autopkgtest [18:23:38]: test opencl-rocm: -----------------------]
> autopkgtest [18:23:38]: test opencl-rocm: - - - - - - - - - - results - - - - - - - - - -
> opencl-rocm FAIL non-zero exit status 1
>
> Thank you very much
> Clément LONGEAC
>

-- 
Jérôme Kieffer
tel +33 476 882 445
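
PS: Judging from the traceback, the failing test only checks that the
return code of `pyFAI-benchmark -h` is 0, so the actual error message is not
shown in the log. A minimal sketch to reproduce that call by hand and see
what the command prints is given here. It is not the code of the pyFAI test
suite: the helper name `run_help` is made up for this example, and the
return code 127 for a missing command is just a convention borrowed from
the shell.

```python
import subprocess


def run_help(command):
    """Run `command -h` and return (returncode, stdout, stderr).

    `command` is a list of strings, e.g. ["pyFAI-benchmark"].
    Hypothetical helper for debugging, not part of pyFAI.
    """
    try:
        proc = subprocess.run(
            list(command) + ["-h"],
            capture_output=True,  # keep stdout and stderr for inspection
            text=True,            # decode the output as str
            check=False,          # a non-zero exit code is not an exception
        )
    except FileNotFoundError as exc:
        # The executable is not on the PATH (assumed convention: 127).
        return 127, "", str(exc)
    return proc.returncode, proc.stdout, proc.stderr
```

For instance, `code, out, err = run_help(["pyFAI-benchmark"])` followed by
`print(code, err)` inside the autopkgtest environment should tell whether
the help call itself exits with 1 and what it writes to stderr.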

