jhuber6 added inline comments.
================
Comment at: clang/tools/nvptx-arch/CMakeLists.txt:19
+if (NOT CUDA_FOUND OR NOT cuda-library)
+ message(STATUS "Not building nvptx-arch: cuda runtime not found")
+ return()
----------------
tra wrote:
> Nit: libcuda.so is part of the NVIDIA driver, which provides the NVIDIA
> driver API. It has nothing to do with the CUDA runtime.
> Here, it's actually not even libcuda.so itself that's not found, but its
> stub.
> I think a sensible error here should say "Failed to find stubs/libcuda.so in
> CUDA_LIBDIR".
Good point. I never thought about the difference because they're both called
`cuda` somewhere.
================
Comment at: clang/tools/nvptx-arch/CMakeLists.txt:25
+
+set_target_properties(nvptx-arch PROPERTIES INSTALL_RPATH_USE_LINK_PATH ON)
+target_include_directories(nvptx-arch PRIVATE ${CUDA_INCLUDE_DIRS})
----------------
tra wrote:
> Does it mean that the executable will have RPATH pointing to
> CUDA_LIBDIR/stubs?
>
> This should not be necessary. The stub shipped with CUDA comes as
> "libcuda.so" only. Its SONAME is libcuda.so.1, but there's no symlink with
> that name in stubs, so an RPATH pointing there will do nothing. At runtime,
> the dynamic linker will attempt to open libcuda.so.1, and it will only be
> found among the actual libraries installed by the NVIDIA drivers.
>
>
Interesting, I can probably delete it. This is another thing I mostly just
copied from the existing tool.
================
Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:26
+#if !CUDA_HEADER_FOUND
+int main() { return 1; }
+#else
----------------
tra wrote:
> How do we distinguish "we didn't have CUDA at build time" reported here from
> "some driver API failed with CUDA_ERROR_INVALID_VALUE=1"?
>
I guess the latter would print an error message. We do the same thing with
`amdgpu-arch`, so I just copied it.
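One way to make the distinction explicit would be to have the CUDA-less build
print its own diagnostic instead of just returning 1. A minimal sketch (the
message wording here is only an illustration, not what the patch does):
```
#if !CUDA_HEADER_FOUND
#include <cstdio>
// Sketch: print a diagnostic so a bare exit code 1 from a CUDA-less build
// cannot be mistaken for a driver call failing with CUDA_ERROR_INVALID_VALUE.
int main() {
  std::fprintf(stderr, "nvptx-arch: built without CUDA support\n");
  return 1;
}
#else
// ... the real implementation using the CUDA driver API ...
#endif
```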
================
Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:34
+ const char *ErrStr = nullptr;
+ CUresult Result = cuGetErrorString(Err, &ErrStr);
+ if (Result != CUDA_SUCCESS)
----------------
tra wrote:
> One problem with this approach is that `nvptx-arch` will fail to run on a
> machine without NVIDIA drivers installed, because the dynamic linker will
> not find `libcuda.so.1`.
>
> Ideally we want it to run on any machine and fail the way we want.
>
> A typical way to achieve that is to `dlopen("libcuda.so.1")` and obtain
> pointers to the functions we need via `dlsym()`.
>
>
We do this in the OpenMP runtime. I mostly copied this approach from the
existing `amdgpu-arch`, but we could change both tools to use this method.
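Roughly what that would look like here, as a sketch that only resolves
`cuInit` for brevity (the real tool would resolve every driver entry point it
calls the same way):
```
#include <dlfcn.h>
#include <cstdio>

// Sketch: load the driver library at runtime instead of linking the stub, so
// the tool still starts (and can fail gracefully) on machines without drivers.
int main() {
  void *Handle = dlopen("libcuda.so.1", RTLD_LAZY);
  if (!Handle) {
    std::fprintf(stderr, "nvptx-arch: %s\n", dlerror());
    return 1;
  }

  // Resolve the entry points we need via dlsym().
  using CuInitTy = int (*)(unsigned);
  auto *CuInit = reinterpret_cast<CuInitTy>(dlsym(Handle, "cuInit"));
  if (!CuInit || CuInit(0) != 0 /* CUDA_SUCCESS */) {
    std::fprintf(stderr, "nvptx-arch: failed to initialize the CUDA driver\n");
    return 1;
  }
  // ... resolve and call cuDeviceGetCount / cuDeviceGetAttribute similarly ...
  return 0;
}
```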
================
Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:63
+
+ printf("sm_%d%d\n", Major, Minor);
+ }
----------------
tra wrote:
> jhuber6 wrote:
> > tianshilei1992 wrote:
> > > Do we want to include device number here?
> > For `amdgpu-arch`, and here, the device number is implicit in the output
> > order, so the n-th line is the n-th device, i.e.
> > ```
> > sm_70 // device 0
> > sm_80 // device 1
> > sm_70 // device 2
> > ```
> NVIDIA GPU enumeration order is more or less arbitrary. By default it's
> arranged by "sort of fastest GPU first", but can be rearranged in order of
> PCI(e) bus IDs or in an arbitrary user-specified order using
> `CUDA_VISIBLE_DEVICES`. Printing compute capability in the enumeration order
> is pretty much all the user needs. If we want to print something uniquely
> identifying the device, we would need to print the device UUID, similarly to
> what `nvidia-smi -L` does. Or PCIe bus IDs. In other words -- we can uniquely
> identify devices, but there's no such thing as an inherent canonical order
> among the devices.
I think it's mostly just important that it prints a valid GPU. Most uses of
this tool will just be "give me a valid GPU I can run on this machine".
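For reference, the whole query is just a few driver-API calls per device. A
sketch of the shape of it (not the exact patch contents, and without the
nicer error reporting the patch has):
```
#include <cuda.h>
#include <cstdio>

// Sketch: print one "sm_<major><minor>" line per device, in enumeration order.
int main() {
  if (cuInit(0) != CUDA_SUCCESS)
    return 1;

  int Count = 0;
  if (cuDeviceGetCount(&Count) != CUDA_SUCCESS)
    return 1;

  for (int I = 0; I < Count; ++I) {
    CUdevice Device;
    int Major = 0, Minor = 0;
    if (cuDeviceGet(&Device, I) != CUDA_SUCCESS ||
        cuDeviceGetAttribute(&Major,
                             CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR,
                             Device) != CUDA_SUCCESS ||
        cuDeviceGetAttribute(&Minor,
                             CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR,
                             Device) != CUDA_SUCCESS)
      return 1;
    std::printf("sm_%d%d\n", Major, Minor);
  }
  return 0;
}
```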
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D140433/new/
https://reviews.llvm.org/D140433
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits