Source: kokkos
Severity: wishlist
X-Debbugs-Cc: c...@slerp.xyz

Dear Maintainer,

The kokkos project has an AMD ROCm backend for hardware acceleration on
AMD GPUs. It would be good to enable this functionality. The AMD backend
depends on the following packages:

    hipcc
    librocthrust-dev

The kokkos library uses an unusual set of options to choose the AMD GPU
targets. Please see this table from the Spack package [1] that translates
from the LLVM target name to the kokkos architecture suffix:

    amdgpu_arch_map = {
        "gfx900": "vega900",
        "gfx906": "vega906",
        "gfx908": "vega908",
        "gfx90a": "vega90A",
        "gfx940": "amd_gfx940",
        "gfx942": "amd_gfx942",
        "gfx1030": "navi1030",
        "gfx1100": "navi1100",
    }

The names are somewhat misleading because gfx908 and gfx90a are not vega.
The gfx908 and gfx90a architectures were arcturus and aldebaran,
respectively. The option name doesn't affect anything, but perhaps that
clarification will help avoid confusion in the future.

I would suggest enabling all of the architectures listed above, except
perhaps gfx940, as that architecture was only used for early MI300
engineering samples. All retail MI300 hardware is gfx942.

The kokkos architecture names would be used with the Kokkos_ARCH_ prefix,
so I believe this would result in the following flags:

    -DKokkos_ARCH_vega900=ON
    -DKokkos_ARCH_vega906=ON
    -DKokkos_ARCH_vega908=ON
    -DKokkos_ARCH_vega90A=ON
    -DKokkos_ARCH_amd_gfx942=ON
    -DKokkos_ARCH_navi1030=ON
    -DKokkos_ARCH_navi1100=ON

You may also need:

    -DKokkos_ENABLE_HIP=ON
    -DKokkos_ENABLE_ROCTHRUST=ON

Compiling custom kernels will require the use of hipcc as the C++
compiler. This can be done with CXX=hipcc before calling into CMake. Any
options that are incompatible with device code can be restricted to host
code by prefixing -Xarch_host. For an example of this, see the rocthrust
package [2].

I would suggest adding -DROCPRIM_USE_ARCH_CONVERSION to the
DEB_CXXFLAGS_MAINT_PREPEND flags when building kokkos. This is a rocprim
build option that affects rocthrust.
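
To illustrate the architecture-name translation described earlier in this
report, here is a minimal Python sketch that derives the Kokkos_ARCH_ flags
from the Spack table. The table is copied from [1]; the helper name
kokkos_arch_flags and the SKIPPED_TARGETS set are hypothetical names made up
for this example. The suffix is used verbatim, as in the flag list above, so
the exact option spelling (letter case, for instance) is an assumption that
should be checked against the kokkos CMake options before use.

```python
# Table copied from the Spack package [1]: LLVM target -> kokkos arch suffix.
amdgpu_arch_map = {
    "gfx900": "vega900",
    "gfx906": "vega906",
    "gfx908": "vega908",
    "gfx90a": "vega90A",
    "gfx940": "amd_gfx940",
    "gfx942": "amd_gfx942",
    "gfx1030": "navi1030",
    "gfx1100": "navi1100",
}

# Hypothetical default for this sketch: skip gfx940, which the report says
# was only used for early MI300 engineering samples.
SKIPPED_TARGETS = frozenset({"gfx940"})


def kokkos_arch_flags(arch_map, skipped=SKIPPED_TARGETS):
    """Return one -DKokkos_ARCH_<suffix>=ON flag per target not in `skipped`.

    The suffix is inserted verbatim; this mirrors the flags listed in the
    report and is not verified against the kokkos build system.
    """
    return [
        f"-DKokkos_ARCH_{suffix}=ON"
        for target, suffix in arch_map.items()
        if target not in skipped
    ]


if __name__ == "__main__":
    # Print the flags on one line, ready to paste into a CMake invocation.
    print(" ".join(kokkos_arch_flags(amdgpu_arch_map)))
```

The sketch only covers the architecture flags. -DROCPRIM_USE_ARCH_CONVERSION
is a preprocessor define passed through the C++ flags rather than a CMake
option, so it is left out here.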
The ROCPRIM_USE_ARCH_CONVERSION option causes gfx902, gfx909, and gfx90c to
use gfx900 code paths and causes gfx1031, gfx1032, gfx1033, gfx1034,
gfx1035, and gfx1036 to use gfx1030 code paths. It may or may not be
sufficient to extend kokkos support to those architectures, but it will at
least be necessary.

If you require access to compatible hardware to test this functionality,
please reach out privately and I may be able to help.

Sincerely,
Cory Bloor

[1]: https://github.com/spack/spack/blob/0191e15a6a0c86520152694d2a75c3e595763d0a/var/spack/repos/builtin/packages/kokkos/package.py#L163-L172
[2]: https://salsa.debian.org/rocm-team/rocthrust/-/blob/debian/5.7.1-3/debian/rules#L8-10

-- System Information:
Debian Release: trixie/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.10.11-amd64 (SMP w/32 CPU threads; PREEMPT)
Locale: LANG=C, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: unable to detect