[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2023-01-03 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments.



Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:34
+  const char *ErrStr = nullptr;
+  CUresult Result = cuGetErrorString(Err, &ErrStr);
+  if (Result != CUDA_SUCCESS)

jhuber6 wrote:
> tra wrote:
> > One problem with this approach is that `nvptx-arch` will fail to run on a 
> > machine without NVIDIA drivers installed because the dynamic linker will not 
> > find `libcuda.so.1`.
> > 
> > Ideally we want it to run on any machine and fail the way we want.
> > 
> > A typical way to achieve that is to dlopen("libcuda.so.1"), and obtain the 
> > pointers to the functions we need via `dlsym()`.
> > 
> > 
> We do this in the OpenMP runtime. I mostly copied this approach from the 
> existing `amdgpu-arch` but we could change both to use this method.
An alternative would be to enumerate GPUs using the CUDA runtime API and link 
statically against libcudart_static.a.

The CUDA runtime will take care of finding libcuda.so and will return an error 
if it fails, so you do not need to mess with dlopen, etc.

E.g. this could be used as a base:
https://github.com/NVIDIA/cuda-samples/blob/master/Samples/1_Utilities/deviceQuery/deviceQuery.cpp
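
A minimal sketch of that runtime-API alternative, loosely following the 
deviceQuery sample above (illustrative only, not part of this patch; it assumes 
cuda_runtime.h and linking against libcudart_static.a):

```
// Sketch: enumerate GPUs via the CUDA runtime API instead of the driver API.
// The runtime loads libcuda.so.1 itself and reports an error if it cannot.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
  int Count = 0;
  if (cudaGetDeviceCount(&Count) != cudaSuccess)
    return 1; // No driver, no devices, or another runtime error.

  for (int DeviceId = 0; DeviceId < Count; ++DeviceId) {
    cudaDeviceProp Prop;
    if (cudaGetDeviceProperties(&Prop, DeviceId) != cudaSuccess)
      return 1;
    printf("sm_%d%d\n", Prop.major, Prop.minor);
  }
  return 0;
}
```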


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2023-01-03 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/tools/nvptx-arch/CMakeLists.txt:19
+if (NOT CUDA_FOUND OR NOT cuda-library)
+  message(STATUS "Not building nvptx-arch: cuda runtime not found")
+  return()

tra wrote:
> Nit: libcuda.so is part of the NVIDIA driver, which provides the NVIDIA 
> driver API; it has nothing to do with the CUDA runtime.
> Here, it's actually not even libcuda.so itself that's not found, but its 
> stub. 
> I think a sensible error here should say "Failed to find stubs/libcuda.so in 
> CUDA_LIBDIR"
Good point. Never thought about the difference because they're both called 
`cuda` somewhere.



Comment at: clang/tools/nvptx-arch/CMakeLists.txt:25
+
+set_target_properties(nvptx-arch PROPERTIES INSTALL_RPATH_USE_LINK_PATH ON)
+target_include_directories(nvptx-arch PRIVATE ${CUDA_INCLUDE_DIRS})

tra wrote:
> Does it mean that the executable will have RPATH pointing to 
> CUDA_LIBDIR/stubs?
> 
> This should not be necessary. The stub shipped with CUDA comes as 
> "libcuda.so" only. It's SONAME is libcuda.so.1, but there's no symlink with 
> that name in stubs, so RPATH pointing there will do nothing. At runtime, 
> dynamic linker will attempt to open libcuda.so.1 and it will only be found 
> among the actual libraries installed by NVIDIA drivers.
> 
> 
Interesting, I can probably delete it. Another thing I mostly just copied from 
the existing tool.



Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:26
+#if !CUDA_HEADER_FOUND
+int main() { return 1; }
+#else

tra wrote:
> How do we distinguish "we didn't have CUDA at build time" reported here from 
> "some driver API failed with CUDA_ERROR_INVALID_VALUE=1" ?
> 
I guess the latter would print an error message. We do the same thing with 
`amdgpu-arch`, so I just copied it.



Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:34
+  const char *ErrStr = nullptr;
+  CUresult Result = cuGetErrorString(Err, &ErrStr);
+  if (Result != CUDA_SUCCESS)

tra wrote:
> One problem with this approach is that `nvptx-arch` will fail to run on a 
> machine without NVIDIA drivers installed because the dynamic linker will not find 
> `libcuda.so.1`.
> 
> Ideally we want it to run on any machine and fail the way we want.
> 
> A typical way to achieve that is to dlopen("libcuda.so.1"), and obtain the 
> pointers to the functions we need via `dlsym()`.
> 
> 
We do this in the OpenMP runtime. I mostly copied this approach from the 
existing `amdgpu-arch` but we could change both to use this method.



Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:63
+
+printf("sm_%d%d\n", Major, Minor);
+  }

tra wrote:
> jhuber6 wrote:
> > tianshilei1992 wrote:
> > > Do we want to include device number here?
> > For `amdgpu-arch` and here we just have it implicitly in the order, so the 
> > n-th line is the n-th device, i.e.
> > ```
> > sm_70 // device 0
> > sm_80 // device 1
> > sm_70 // device 2
> > ```
> NVIDIA GPU enumeration order is more or less arbitrary. By default it's 
> arranged by "sort of fastest GPU first", but can be rearranged in order of 
> PCI(e) bus IDs or in an arbitrary user-specified order using 
> `CUDA_VISIBLE_DEVICES`. Printing compute capability in the enumeration order 
> is pretty much all the user needs. If we want to print something uniquely 
> identifying the device, we would need to print the device UUID, similarly to 
> what `nvidia-smi -L` does. Or PCIe bus IDs. In other words -- we can uniquely 
> identify devices, but there's no such thing as inherent canonical order among 
> the devices.
I think it's mostly just important that it prints a valid GPU. Most of the uses 
for this tool will just be "Give me a valid GPU I can run on this machine".


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2023-01-03 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments.



Comment at: clang/tools/nvptx-arch/CMakeLists.txt:19
+if (NOT CUDA_FOUND OR NOT cuda-library)
+  message(STATUS "Not building nvptx-arch: cuda runtime not found")
+  return()

Nit: libcuda.so is part of the NVIDIA driver, which provides the NVIDIA driver 
API; it has nothing to do with the CUDA runtime.
Here, it's actually not even libcuda.so itself that's not found, but its 
stub. 
I think a sensible error here should say "Failed to find stubs/libcuda.so in 
CUDA_LIBDIR"



Comment at: clang/tools/nvptx-arch/CMakeLists.txt:25
+
+set_target_properties(nvptx-arch PROPERTIES INSTALL_RPATH_USE_LINK_PATH ON)
+target_include_directories(nvptx-arch PRIVATE ${CUDA_INCLUDE_DIRS})

Does it mean that the executable will have RPATH pointing to CUDA_LIBDIR/stubs?

This should not be necessary. The stub shipped with CUDA comes as "libcuda.so" 
only. Its SONAME is libcuda.so.1, but there's no symlink with that name in 
stubs, so RPATH pointing there will do nothing. At runtime, the dynamic linker 
will attempt to open libcuda.so.1 and it will only be found among the actual 
attempt to open libcuda.so.1 and it will only be found among the actual 
libraries installed by NVIDIA drivers.





Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:26
+#if !CUDA_HEADER_FOUND
+int main() { return 1; }
+#else

How do we distinguish "we didn't have CUDA at build time" reported here from 
"some driver API failed with CUDA_ERROR_INVALID_VALUE=1" ?




Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:34
+  const char *ErrStr = nullptr;
+  CUresult Result = cuGetErrorString(Err, &ErrStr);
+  if (Result != CUDA_SUCCESS)

One problem with this approach is that `nvptx-arch` will fail to run on a 
machine without NVIDIA drivers installed because the dynamic linker will not find 
`libcuda.so.1`.

Ideally we want it to run on any machine and fail the way we want.

A typical way to achieve that is to dlopen("libcuda.so.1"), and obtain the 
pointers to the functions we need via `dlsym()`.
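
A rough sketch of that dlopen()/dlsym() pattern (illustrative only; the typedef 
below is a stand-in for the real CUresult from cuda.h):

```
// Sketch: load the driver API at runtime so the tool still starts, and can
// fail cleanly, on machines where libcuda.so.1 is not installed.
#include <cstdio>
#include <dlfcn.h>

typedef int CUresult_t;                       // stand-in for CUresult
typedef CUresult_t (*cuInit_t)(unsigned int); // matches cuInit's signature

int main() {
  void *Handle = dlopen("libcuda.so.1", RTLD_LAZY);
  if (!Handle) {
    fprintf(stderr, "Failed to load libcuda.so.1: %s\n", dlerror());
    return 1;
  }

  cuInit_t CuInit = reinterpret_cast<cuInit_t>(dlsym(Handle, "cuInit"));
  if (!CuInit || CuInit(0) != 0 /* CUDA_SUCCESS */) {
    dlclose(Handle);
    return 1;
  }

  // ... look up cuDeviceGetCount, cuDeviceGet, cuDeviceGetAttribute the same
  // way and print "sm_<major><minor>" per device ...

  dlclose(Handle);
  return 0;
}
```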





Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:63
+
+printf("sm_%d%d\n", Major, Minor);
+  }

jhuber6 wrote:
> tianshilei1992 wrote:
> > Do we want to include device number here?
> For `amdgpu-arch` and here we just have it implicitly in the order, so the 
> n-th line is the n-th device, i.e.
> ```
> sm_70 // device 0
> sm_80 // device 1
> sm_70 // device 2
> ```
NVIDIA GPU enumeration order is more or less arbitrary. By default it's 
arranged by "sort of fastest GPU first", but can be rearranged in order of 
PCI(e) bus IDs or in an arbitrary user-specified order using 
`CUDA_VISIBLE_DEVICES`. Printing compute capability in the enumeration order is 
pretty much all the user needs. If we want to print something uniquely 
identifying the device, we would need to print the device UUID, similarly to 
what `nvidia-smi -L` does. Or PCIe bus IDs. In other words -- we can uniquely 
identify devices, but there's no such thing as inherent canonical order among 
the devices.
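
For reference, a sketch of what printing a unique identifier could look like 
with the driver API's cuDeviceGetUuid (available in newer CUDA releases); this 
is illustrative only and assumes cuInit() has already succeeded:

```
// Sketch: print the device UUID next to the compute capability, similar in
// spirit to `nvidia-smi -L`.
#include <cstdio>
#include "cuda.h"

static void printDeviceLine(CUdevice Device) {
  int Major = 0, Minor = 0;
  cuDeviceGetAttribute(&Major, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, Device);
  cuDeviceGetAttribute(&Minor, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, Device);

  CUuuid Uuid;
  if (cuDeviceGetUuid(&Uuid, Device) == CUDA_SUCCESS) {
    printf("sm_%d%d GPU-", Major, Minor);
    for (unsigned char Byte : Uuid.bytes)
      printf("%02x", Byte);
    printf("\n");
  } else {
    printf("sm_%d%d\n", Major, Minor);
  }
}
```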


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2022-12-29 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added inline comments.



Comment at: clang/tools/nvptx-arch/CMakeLists.txt:28
+
+clang_target_link_libraries(nvptx-arch PRIVATE ${cuda-library})

This broke my build with `CLANG_LINK_CLANG_DYLIB`; we must use the standard 
CMake `target_link_libraries` for the CUDA libraries. I fixed this in commit 
rGf3c9342a3d56e1782e3b6db081401af334648492.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2022-12-25 Thread Joseph Huber via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGd5a5ee856e7c: [Clang] Add `nvptx-arch` tool to query 
installed NVIDIA GPUs (authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

Files:
  clang/tools/CMakeLists.txt
  clang/tools/nvptx-arch/CMakeLists.txt
  clang/tools/nvptx-arch/NVPTXArch.cpp

Index: clang/tools/nvptx-arch/NVPTXArch.cpp
===
--- /dev/null
+++ clang/tools/nvptx-arch/NVPTXArch.cpp
@@ -0,0 +1,72 @@
+//===- NVPTXArch.cpp - list installed NVPTX devices --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file implements a tool for detecting name of CUDA gpus installed in the
+// system.
+//
+//===--===//
+
+#if defined(__has_include)
+#if __has_include("cuda.h")
+#include "cuda.h"
+#define CUDA_HEADER_FOUND 1
+#else
+#define CUDA_HEADER_FOUND 0
+#endif
+#else
+#define CUDA_HEADER_FOUND 0
+#endif
+
+#if !CUDA_HEADER_FOUND
+int main() { return 1; }
+#else
+
+#include <cstdint>
+#include <cstdio>
+
+static int handleError(CUresult Err) {
+  const char *ErrStr = nullptr;
+  CUresult Result = cuGetErrorString(Err, &ErrStr);
+  if (Result != CUDA_SUCCESS)
+return EXIT_FAILURE;
+  fprintf(stderr, "CUDA error: %s\n", ErrStr);
+  return EXIT_FAILURE;
+}
+
+int main() {
+  if (CUresult Err = cuInit(0)) {
+if (Err == CUDA_ERROR_NO_DEVICE)
+  return EXIT_SUCCESS;
+else
+  return handleError(Err);
+  }
+
+  int Count = 0;
+  if (CUresult Err = cuDeviceGetCount(&Count))
+return handleError(Err);
+  if (Count == 0)
+return EXIT_SUCCESS;
+  for (int DeviceId = 0; DeviceId < Count; ++DeviceId) {
+CUdevice Device;
+if (CUresult Err = cuDeviceGet(&Device, DeviceId))
+  return handleError(Err);
+
+int32_t Major, Minor;
+if (CUresult Err = cuDeviceGetAttribute(
+&Major, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, Device))
+  return handleError(Err);
+if (CUresult Err = cuDeviceGetAttribute(
+&Minor, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, Device))
+  return handleError(Err);
+
+printf("sm_%d%d\n", Major, Minor);
+  }
+  return EXIT_SUCCESS;
+}
+
+#endif
Index: clang/tools/nvptx-arch/CMakeLists.txt
===
--- /dev/null
+++ clang/tools/nvptx-arch/CMakeLists.txt
@@ -0,0 +1,28 @@
+# //======//
+# //
+# // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# // See https://llvm.org/LICENSE.txt for details.
+# // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+# //
+# //======//
+
+
+# TODO: This is deprecated. Since CMake 3.17 we can use FindCUDAToolkit instead.
+find_package(CUDA QUIET)
+find_library(cuda-library NAMES cuda PATHS /lib64)
+if (NOT cuda-library AND CUDA_FOUND)
+  get_filename_component(CUDA_LIBDIR "${CUDA_cudart_static_LIBRARY}" DIRECTORY)
+  find_library(cuda-library NAMES cuda HINTS "${CUDA_LIBDIR}/stubs")
+endif()
+
+if (NOT CUDA_FOUND OR NOT cuda-library)
+  message(STATUS "Not building nvptx-arch: cuda runtime not found")
+  return()
+endif()
+
+add_clang_tool(nvptx-arch NVPTXArch.cpp)
+
+set_target_properties(nvptx-arch PROPERTIES INSTALL_RPATH_USE_LINK_PATH ON)
+target_include_directories(nvptx-arch PRIVATE ${CUDA_INCLUDE_DIRS})
+
+clang_target_link_libraries(nvptx-arch PRIVATE ${cuda-library})
Index: clang/tools/CMakeLists.txt
===
--- clang/tools/CMakeLists.txt
+++ clang/tools/CMakeLists.txt
@@ -50,3 +50,4 @@
 add_clang_subdirectory(libclang)
 
 add_clang_subdirectory(amdgpu-arch)
+add_clang_subdirectory(nvptx-arch)
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2022-12-21 Thread Shilei Tian via Phabricator via cfe-commits
tianshilei1992 accepted this revision.
tianshilei1992 added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2022-12-21 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 484637.
jhuber6 added a comment.

Print to `stderr` and only return `1` if there was an actual error. A lack of 
devices is considered a success and we print nothing.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

Files:
  clang/tools/CMakeLists.txt
  clang/tools/nvptx-arch/CMakeLists.txt
  clang/tools/nvptx-arch/NVPTXArch.cpp

Index: clang/tools/nvptx-arch/NVPTXArch.cpp
===
--- /dev/null
+++ clang/tools/nvptx-arch/NVPTXArch.cpp
@@ -0,0 +1,72 @@
+//===- NVPTXArch.cpp - list installed NVPTX devices --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file implements a tool for detecting name of CUDA gpus installed in the
+// system.
+//
+//===--===//
+
+#if defined(__has_include)
+#if __has_include("cuda.h")
+#include "cuda.h"
+#define CUDA_HEADER_FOUND 1
+#else
+#define CUDA_HEADER_FOUND 0
+#endif
+#else
+#define CUDA_HEADER_FOUND 0
+#endif
+
+#if !CUDA_HEADER_FOUND
+int main() { return 1; }
+#else
+
+#include <cstdint>
+#include <cstdio>
+
+static int handleError(CUresult Err) {
+  const char *ErrStr = nullptr;
+  CUresult Result = cuGetErrorString(Err, &ErrStr);
+  if (Result != CUDA_SUCCESS)
+return EXIT_FAILURE;
+  fprintf(stderr, "CUDA error: %s\n", ErrStr);
+  return EXIT_FAILURE;
+}
+
+int main() {
+  if (CUresult Err = cuInit(0)) {
+if (Err == CUDA_ERROR_NO_DEVICE)
+  return EXIT_SUCCESS;
+else
+  return handleError(Err);
+  }
+
+  int Count = 0;
+  if (CUresult Err = cuDeviceGetCount(&Count))
+return handleError(Err);
+  if (Count == 0)
+return EXIT_SUCCESS;
+  for (int DeviceId = 0; DeviceId < Count; ++DeviceId) {
+CUdevice Device;
+if (CUresult Err = cuDeviceGet(&Device, DeviceId))
+  return handleError(Err);
+
+int32_t Major, Minor;
+if (CUresult Err = cuDeviceGetAttribute(
+&Major, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, Device))
+  return handleError(Err);
+if (CUresult Err = cuDeviceGetAttribute(
+&Minor, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, Device))
+  return handleError(Err);
+
+printf("sm_%d%d\n", Major, Minor);
+  }
+  return EXIT_SUCCESS;
+}
+
+#endif
Index: clang/tools/nvptx-arch/CMakeLists.txt
===
--- /dev/null
+++ clang/tools/nvptx-arch/CMakeLists.txt
@@ -0,0 +1,28 @@
+# //======//
+# //
+# // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# // See https://llvm.org/LICENSE.txt for details.
+# // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+# //
+# //======//
+
+
+# TODO: This is deprecated. Since CMake 3.17 we can use FindCUDAToolkit instead.
+find_package(CUDA QUIET)
+find_library(cuda-library NAMES cuda PATHS /lib64)
+if (NOT cuda-library AND CUDA_FOUND)
+  get_filename_component(CUDA_LIBDIR "${CUDA_cudart_static_LIBRARY}" DIRECTORY)
+  find_library(cuda-library NAMES cuda HINTS "${CUDA_LIBDIR}/stubs")
+endif()
+
+if (NOT CUDA_FOUND OR NOT cuda-library)
+  message(STATUS "Not building nvptx-arch: cuda runtime not found")
+  return()
+endif()
+
+add_clang_tool(nvptx-arch NVPTXArch.cpp)
+
+set_target_properties(nvptx-arch PROPERTIES INSTALL_RPATH_USE_LINK_PATH ON)
+target_include_directories(nvptx-arch PRIVATE ${CUDA_INCLUDE_DIRS})
+
+clang_target_link_libraries(nvptx-arch PRIVATE ${cuda-library})
Index: clang/tools/CMakeLists.txt
===
--- clang/tools/CMakeLists.txt
+++ clang/tools/CMakeLists.txt
@@ -50,3 +50,4 @@
 add_clang_subdirectory(libclang)
 
 add_clang_subdirectory(amdgpu-arch)
+add_clang_subdirectory(nvptx-arch)
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2022-12-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.



Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:37
+return 1;
+  printf("CUDA error: %s\n", ErrStr);
+  return 1;

stderr?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2022-12-21 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 484594.
jhuber6 added a comment.

Change header I copied from the AMD implementation.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

Files:
  clang/tools/CMakeLists.txt
  clang/tools/nvptx-arch/CMakeLists.txt
  clang/tools/nvptx-arch/NVPTXArch.cpp

Index: clang/tools/nvptx-arch/NVPTXArch.cpp
===
--- /dev/null
+++ clang/tools/nvptx-arch/NVPTXArch.cpp
@@ -0,0 +1,67 @@
+//===- NVPTXArch.cpp - list installed NVPTX devices --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file implements a tool for detecting name of CUDA gpus installed in the
+// system.
+//
+//===--===//
+
+#if defined(__has_include)
+#if __has_include("cuda.h")
+#include "cuda.h"
+#define CUDA_HEADER_FOUND 1
+#else
+#define CUDA_HEADER_FOUND 0
+#endif
+#else
+#define CUDA_HEADER_FOUND 0
+#endif
+
+#if !CUDA_HEADER_FOUND
+int main() { return 1; }
+#else
+
+#include <cstdint>
+#include <cstdio>
+
+static int handleError(CUresult Err) {
+  const char *ErrStr = nullptr;
+  CUresult Result = cuGetErrorString(Err, &ErrStr);
+  if (Result != CUDA_SUCCESS)
+return 1;
+  printf("CUDA error: %s\n", ErrStr);
+  return 1;
+}
+
+int main() {
+  if (CUresult Err = cuInit(0))
+return 1;
+
+  int Count = 0;
+  if (cuDeviceGetCount(&Count))
+return 1;
+  if (Count == 0)
+return 0;
+  for (int DeviceId = 0; DeviceId < Count; ++DeviceId) {
+CUdevice Device;
+if (CUresult Err = cuDeviceGet(&Device, DeviceId))
+  return handleError(Err);
+
+int32_t Major, Minor;
+if (CUresult Err = cuDeviceGetAttribute(
+&Major, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, Device))
+  return handleError(Err);
+if (CUresult Err = cuDeviceGetAttribute(
+&Minor, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, Device))
+  return handleError(Err);
+
+printf("sm_%d%d\n", Major, Minor);
+  }
+  return 0;
+}
+#endif
Index: clang/tools/nvptx-arch/CMakeLists.txt
===
--- /dev/null
+++ clang/tools/nvptx-arch/CMakeLists.txt
@@ -0,0 +1,28 @@
+# //======//
+# //
+# // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# // See https://llvm.org/LICENSE.txt for details.
+# // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+# //
+# //======//
+
+
+# TODO: This is deprecated. Since CMake 3.17 we can use FindCUDAToolkit instead.
+find_package(CUDA QUIET)
+find_library(cuda-library NAMES cuda PATHS /lib64)
+if (NOT cuda-library AND CUDA_FOUND)
+  get_filename_component(CUDA_LIBDIR "${CUDA_cudart_static_LIBRARY}" DIRECTORY)
+  find_library(cuda-library NAMES cuda HINTS "${CUDA_LIBDIR}/stubs")
+endif()
+
+if (NOT CUDA_FOUND OR NOT cuda-library)
+  message(STATUS "Not building nvptx-arch: cuda runtime not found")
+  return()
+endif()
+
+add_clang_tool(nvptx-arch NVPTXArch.cpp)
+
+set_target_properties(nvptx-arch PROPERTIES INSTALL_RPATH_USE_LINK_PATH ON)
+target_include_directories(nvptx-arch PRIVATE ${CUDA_INCLUDE_DIRS})
+
+clang_target_link_libraries(nvptx-arch PRIVATE ${cuda-library})
Index: clang/tools/CMakeLists.txt
===
--- clang/tools/CMakeLists.txt
+++ clang/tools/CMakeLists.txt
@@ -50,3 +50,4 @@
 add_clang_subdirectory(libclang)
 
 add_clang_subdirectory(amdgpu-arch)
+add_clang_subdirectory(nvptx-arch)
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2022-12-21 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:63
+
+printf("sm_%d%d\n", Major, Minor);
+  }

tianshilei1992 wrote:
> Do we want to include device number here?
For `amdgpu-arch` and here we just have it implicitly in the order, so the n-th 
line is the n-th device, i.e.
```
sm_70 // device 0
sm_80 // device 1
sm_70 // device 2
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2022-12-21 Thread Shilei Tian via Phabricator via cfe-commits
tianshilei1992 added inline comments.



Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:63
+
+printf("sm_%d%d\n", Major, Minor);
+  }

Do we want to include device number here?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2022-12-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision.
jhuber6 added reviewers: JonChesterfield, tra, yaxunl, jdoerfert, 
tianshilei1992, MaskRay.
Herald added subscribers: kosarev, mattd, gchakrabarti, asavonic, StephenFan, 
tpr.
Herald added a project: All.
jhuber6 requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1, jholewinski.
Herald added a project: clang.

We already have a tool called `amdgpu-arch` which returns the GPUs on
the system. This is used to determine the default architecture when
doing offloading. This patch introduces a similar tool `nvptx-arch`.
Right now we use the detected GPU at compile time. This is unhelpful
when building on a login node and moving execution to a compute node for
example. This will allow us to better choose a default architecture when
targeting NVPTX. Also we can probably use this with CMake's `native`
setting for CUDA now.

CUDA since 11.6 provides `__nvcc_device_query`, which has a similar
function, but it is probably better to define this locally if we want to
depend on it in clang.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D140433

Files:
  clang/tools/CMakeLists.txt
  clang/tools/nvptx-arch/CMakeLists.txt
  clang/tools/nvptx-arch/NVPTXArch.cpp

Index: clang/tools/nvptx-arch/NVPTXArch.cpp
===
--- /dev/null
+++ clang/tools/nvptx-arch/NVPTXArch.cpp
@@ -0,0 +1,67 @@
+//===- NVPTXArch.cpp - list installed NVPTX devices --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file implements a tool for detecting name of AMDGPU installed in system
+// using HSA. This tool is used by AMDGPU OpenMP driver.
+//
+//===--===//
+
+#if defined(__has_include)
+#if __has_include("cuda.h")
+#include "cuda.h"
+#define CUDA_HEADER_FOUND 1
+#else
+#define CUDA_HEADER_FOUND 0
+#endif
+#else
+#define CUDA_HEADER_FOUND 0
+#endif
+
+#if !CUDA_HEADER_FOUND
+int main() { return 1; }
+#else
+
+#include <cstdint>
+#include <cstdio>
+
+static int handleError(CUresult Err) {
+  const char *ErrStr = nullptr;
+  CUresult Result = cuGetErrorString(Err, &ErrStr);
+  if (Result != CUDA_SUCCESS)
+return 1;
+  printf("CUDA error: %s\n", ErrStr);
+  return 1;
+}
+
+int main() {
+  if (CUresult Err = cuInit(0))
+return 1;
+
+  int Count = 0;
+  if (cuDeviceGetCount(&Count))
+return 1;
+  if (Count == 0)
+return 0;
+  for (int DeviceId = 0; DeviceId < Count; ++DeviceId) {
+CUdevice Device;
+if (CUresult Err = cuDeviceGet(&Device, DeviceId))
+  return handleError(Err);
+
+int32_t Major, Minor;
+if (CUresult Err = cuDeviceGetAttribute(
+&Major, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, Device))
+  return handleError(Err);
+if (CUresult Err = cuDeviceGetAttribute(
+&Minor, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, Device))
+  return handleError(Err);
+
+printf("sm_%d%d\n", Major, Minor);
+  }
+  return 0;
+}
+#endif
Index: clang/tools/nvptx-arch/CMakeLists.txt
===
--- /dev/null
+++ clang/tools/nvptx-arch/CMakeLists.txt
@@ -0,0 +1,28 @@
+# //======//
+# //
+# // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# // See https://llvm.org/LICENSE.txt for details.
+# // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+# //
+# //======//
+
+
+# TODO: This is deprecated. Since CMake 3.17 we can use FindCUDAToolkit instead.
+find_package(CUDA QUIET)
+find_library(cuda-library NAMES cuda PATHS /lib64)
+if (NOT cuda-library AND CUDA_FOUND)
+  get_filename_component(CUDA_LIBDIR "${CUDA_cudart_static_LIBRARY}" DIRECTORY)
+  find_library(cuda-library NAMES cuda HINTS "${CUDA_LIBDIR}/stubs")
+endif()
+
+if (NOT CUDA_FOUND OR NOT cuda-library)
+  message(STATUS "Not building nvptx-arch: cuda runtime not found")
+  return()
+endif()
+
+add_clang_tool(nvptx-arch NVPTXArch.cpp)
+
+set_target_properties(nvptx-arch PROPERTIES INSTALL_RPATH_USE_LINK_PATH ON)
+target_include_directories(nvptx-arch PRIVATE ${CUDA_INCLUDE_DIRS})
+
+clang_target_link_libraries(nvptx-arch PRIVATE ${cuda-library})
Index: clang/tools/CMakeLists.txt
===
--- clang/tools/CMakeLists.txt
+++ clang/tools/CMakeLists.txt
@@ -50,3 +50,4 @@
 add_clang_subdirectory(libclang)
 
 add_clang_subdirectory(amdgpu-arch)
+add_clang_subdirectory(nvptx-arch)
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits