Package: clang-16 Version: 1:16.0.0-1~exp5 Severity: normal X-Debbugs-Cc: c...@slerp.xyz, debian...@lists.debian.org
Dear Maintainer, Thank you for packaging amdgpu-arch as part of clang-tools-16! I see that it can correctly detect the architecture of my installed GPU when libhsa-runtime-dev is installed. Should libhsa-runtime-dev be added to the list of Suggested packages? In any case, I would expect that clang++-16 would be using amdgpu-arch to autodetect the default GPU build target for hip programs, but that doesn't seem to be happening. I have an RX 5700 XT installed on my workstation and when I run amdgpu-arch-16, I see: $ amdgpu-arch-16 2> /dev/null gfx1010 However, when I build a sample program, it doesn't build for the right architecture unless I manually specify. For example, main.hip: #include <stdio.h> #include <stdlib.h> #include <hip/hip_runtime.h> #define CHECK_HIP(expr) do { \ hipError_t result = (expr); \ if (result != hipSuccess) { \ fprintf(stderr, "%s:%d: %s (%d)\n", \ __FILE__, __LINE__, \ hipGetErrorString(result), result); \ exit(EXIT_FAILURE); \ } \ } while(0) __global__ void sq_arr(float *arr, int n) { int tid = blockDim.x*blockIdx.x + threadIdx.x; if (tid < n) { arr[tid] = arr[tid] * arr[tid]; } } int main() { enum { N = 5 }; float hArr[N] = { 1, 2, 3, 4, 5 }; float *dArr; CHECK_HIP(hipMalloc(&dArr, sizeof(float) * N)); CHECK_HIP(hipMemcpy(dArr, hArr, sizeof(float) * N, hipMemcpyHostToDevice)); sq_arr<<<dim3(1), dim3(32,1,1), 0, 0>>>(dArr, N); CHECK_HIP(hipMemcpy(hArr, dArr, sizeof(float) * N, hipMemcpyDeviceToHost)); for (int i = 0; i < N; ++i) { printf("%f\n", hArr[i]); } CHECK_HIP(hipFree(dArr)); return 0; } build command: clang++-16 --rocm-device-lib-path=/usr/lib/x86_64-linux-gnu/amdgcn/bitcode -x hip main.hip -lamdhip64 output: $ ./a.out KFD does not support xnack mode query. ROCr must assume xnack is disabled. "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!" Aborted (core dumped) If I add -v to the build command, I can see that the compiler is building for gfx906, which I presume is the default build architecture. I'm not sure exactly why the architecture detection is failing. Could it be a caused by the rename from amdgpu-arch to amdgpu-arch-16? Or perhaps it's because in addition to printing my gpu architecture to stdout, amdgpu-arch also prints this message to stderr? KFD does not support xnack mode query. ROCr must assume xnack is disabled. I'm not exactly sure what the problem is, but I get my expected output when I explicitly specify the architecture: clang++-16 --rocm-device-lib-path=/usr/lib/x86_64-linux-gnu/amdgcn/bitcode --offload-arch=gfx1010 -x hip main.hip -lamdhip64 expected output: # ./a.out KFD does not support xnack mode query. ROCr must assume xnack is disabled. 1.000000 4.000000 9.000000 16.000000 25.000000 The logic within clang to use amdgpu-arch is in the AMDGPU ToolChain Driver. The answer to what is happening can probably be found within the short function that deals with the amdgpu-arch [1]. P.S. I guess maybe the bitcode works cross-versions after all? I wouldn't trust it without running a large test suite on a variety of hardware platforms, but it's nice that it at least seems to work for this particular example on this particular architecture. If it does turn out to be compatible, we won't need to patch clang like was done for LLVM 15. There is now a configuration file in LLVM 16 for specifying default command-line options [2]. [1]: https://github.com/llvm/llvm-project/blob/llvmorg-16.0.0/clang/lib/Driver/ToolChains/AMDGPU.cpp#L751-L774 [2]: https://clang.llvm.org/docs/UsersManual.html#configuration-files -- System Information: Debian Release: 12.0 APT prefers unstable APT policy: (500, 'unstable'), (1, 'experimental') Architecture: amd64 (x86_64) Kernel: Linux 6.1.0-5-amd64 (SMP w/32 CPU threads; PREEMPT) Kernel taint flags: TAINT_WARN Locale: LANG=C, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set Shell: /bin/sh linked to /usr/bin/dash Init: unable to detect Versions of packages clang-16 depends on: ii binutils 2.40-2 ii libc6 2.36-8 ii libc6-dev 2.36-8 ii libclang-common-16-dev 1:16.0.0-1~exp5 ii libclang-cpp16 1:16.0.0-1~exp5 ii libclang1-16 1:16.0.0-1~exp5 ii libgcc-12-dev 12.2.0-14 ii libgcc-s1 12.2.0-14 ii libllvm16 1:16.0.0-1~exp5 ii libobjc-12-dev 12.2.0-14 ii libstdc++-12-dev 12.2.0-14 ii libstdc++6 12.2.0-14 ii llvm-16-linker-tools 1:16.0.0-1~exp5 Versions of packages clang-16 recommends: ii llvm-16-dev 1:16.0.0-1~exp5 ii python3 3.11.2-1 Versions of packages clang-16 suggests: pn clang-16-doc <none> pn wasi-libc <none> -- no debconf information