[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Artem Belevich via cfe-commits
Artem-B wrote:
> What's the config to set this by default without any graphics?
https://docs.nvidia.com/deploy/driver-persistence/index.html I usually use "nvidia-smi -i -pm ENABLED" to force the driver to be loaded permanently. As for `__nvcc_device_query`, my guess is that it just uses a

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Joseph Huber via cfe-commits
jhuber6 wrote:
> Ooh... I think I know exactly what may be causing this.
I've observed this a few times. In my case it usually happens when some application hangs on the GPU and no one notices; then these tools hang forever, and it takes a while to notice. Figured an error is friendlier since I highl

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Artem Belevich via cfe-commits
Artem-B wrote: Ooh... I think I know exactly what may be causing this. On machines where NVIDIA GPUs are used for compute only (e.g. a headless server machine), NVIDIA drivers are not always loaded by default and may not have driver persistence enabled. The drivers get loaded when GPU is acces

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/94751 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/94751

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/94751
From 0e367c72a1cc163fd781f98b9fac809d90f4beb7 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 7 Jun 2024 08:15:06 -0500
Subject: [PATCH] [Clang] Add timeout for GPU detection utilities
Summary: The utili

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -205,7 +205,7 @@ class ToolChain {
   /// Executes the given \p Executable and returns the stdout.
   llvm::Expected<std::string>
-  executeToolChainProgram(StringRef Executable) const;
+  executeToolChainProgram(StringRef Executable, unsigned Timeout = 0) const;
arsenm

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Joseph Huber via cfe-commits
jhuber6 wrote: No active test because I have no clue how you would write one, but I intentionally made it time out and it returns a `Child timed out` error as expected. https://github.com/llvm/llvm-project/pull/94751

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread via cfe-commits
llvmbot wrote: @llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-clang-driver Author: Joseph Huber (jhuber6) Changes Summary: The utilities `nvptx-arch` and `amdgpu-arch` are used to support `--offload-arch=native` among other utilities in clang. However, these rely on the GPU drive

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/94751 Summary: The utilities `nvptx-arch` and `amdgpu-arch` are used to support `--offload-arch=native` among other utilities in clang. However, these rely on the GPU drivers to query the features. In certain cases thes
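For readers unfamiliar with the pattern the summary describes, the following is a minimal POSIX-only sketch of running a helper program with a time limit. It is not the code from this PR: the PR threads a `Timeout` argument through `executeToolChainProgram`, whereas `runWithTimeout` here is a hypothetical standalone helper written for illustration, and the 10 ms polling interval is an arbitrary assumption.

```cpp
#include <cerrno>
#include <chrono>
#include <optional>
#include <signal.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <thread>
#include <unistd.h>
#include <vector>

// Hypothetical helper (not Clang's API). Runs Argv and waits at most
// TimeoutSecs seconds for it to finish. Returns the exit code on a normal
// exit, or std::nullopt if the child could not be started, was killed by a
// signal, or was killed here after timing out. TimeoutSecs == 0 means
// "wait forever", mirroring the default of 0 in the patch's signature.
std::optional<int> runWithTimeout(const std::vector<std::string> &Argv,
                                  unsigned TimeoutSecs) {
  if (Argv.empty())
    return std::nullopt;

  // Build a null-terminated argv array for execvp.
  std::vector<char *> Args;
  for (const std::string &A : Argv)
    Args.push_back(const_cast<char *>(A.c_str()));
  Args.push_back(nullptr);

  pid_t Pid = fork();
  if (Pid < 0)
    return std::nullopt;
  if (Pid == 0) {
    execvp(Args[0], Args.data());
    _exit(127); // Only reached if exec failed.
  }

  const auto Deadline = std::chrono::steady_clock::now() +
                        std::chrono::seconds(TimeoutSecs);
  int Status = 0;
  for (;;) {
    // Block when there is no timeout; otherwise poll without blocking.
    pid_t Done = waitpid(Pid, &Status, TimeoutSecs == 0 ? 0 : WNOHANG);
    if (Done == Pid)
      break;
    if (Done < 0 && errno != EINTR)
      return std::nullopt;
    if (Done == 0) {
      if (std::chrono::steady_clock::now() >= Deadline) {
        // The child is still running past the deadline: kill and reap it.
        kill(Pid, SIGKILL);
        waitpid(Pid, &Status, 0);
        return std::nullopt;
      }
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
  }

  if (WIFEXITED(Status))
    return WEXITSTATUS(Status);
  return std::nullopt;
}
```

A caller could invoke, for example, `runWithTimeout({"amdgpu-arch"}, 10)` and treat an empty result as "detection failed", which gives the user an error instead of an indefinite hang. The 10-second value is only an example, not a figure taken from the patch.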