jhuber6 added inline comments.

================
Comment at: clang/tools/nvptx-arch/CMakeLists.txt:19
+if (NOT CUDA_FOUND OR NOT cuda-library)
+  message(STATUS "Not building nvptx-arch: cuda runtime not found")
+  return()
----------------
tra wrote:
> Nit: libcuda.so is part of the NVIDIA driver, which provides the NVIDIA driver API. It has nothing to do with the CUDA runtime.
> Here it's actually not even libcuda.so itself that's not found, but its stub.
> I think a sensible error here should say "Failed to find stubs/libcuda.so in CUDA_LIBDIR".

Good point. I never thought about the difference because they're both called `cuda` somewhere.


================
Comment at: clang/tools/nvptx-arch/CMakeLists.txt:25
+
+set_target_properties(nvptx-arch PROPERTIES INSTALL_RPATH_USE_LINK_PATH ON)
+target_include_directories(nvptx-arch PRIVATE ${CUDA_INCLUDE_DIRS})
----------------
tra wrote:
> Does it mean that the executable will have an RPATH pointing to CUDA_LIBDIR/stubs?
>
> This should not be necessary. The stub shipped with CUDA comes as "libcuda.so" only. Its SONAME is libcuda.so.1, but there's no symlink with that name in stubs, so an RPATH pointing there will do nothing. At runtime, the dynamic linker will attempt to open libcuda.so.1, and it will only be found among the actual libraries installed by the NVIDIA drivers.

Interesting, I can probably delete it. It's another thing I mostly just copied from the existing tool.


================
Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:26
+#if !CUDA_HEADER_FOUND
+int main() { return 1; }
+#else
----------------
tra wrote:
> How do we distinguish "we didn't have CUDA at build time" reported here from "some driver API failed with CUDA_ERROR_INVALID_VALUE=1"?
> I guess the latter would print an error message.

We do the same thing with `amdgpu-arch`, so I just copied it.


================
Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:34
+  const char *ErrStr = nullptr;
+  CUresult Result = cuGetErrorString(Err, &ErrStr);
+  if (Result != CUDA_SUCCESS)
----------------
tra wrote:
> One problem with this approach is that `nvptx-arch` will fail to run on a machine without the NVIDIA drivers installed, because the dynamic linker will not find `libcuda.so.1`.
>
> Ideally we want it to run on any machine and fail the way we want.
>
> A typical way to achieve that is to dlopen("libcuda.so.1") and obtain pointers to the functions we need via `dlsym()`.
>
> We do this in the OpenMP runtime.

I mostly copied this approach from the existing `amdgpu-arch`, but we could change both to use this method.
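
A minimal sketch of the dlopen()/dlsym() approach tra describes, folded together with the per-device `sm_XY` printing from the hunk below. The driver types and the two attribute values (75/76, as defined in cuda.h) are redeclared locally so the sketch stands alone; the real tool would include cuda.h and do its own error reporting:

```cpp
#include <cstdio>
#include <dlfcn.h>

// Minimal local redeclarations of the CUDA driver API pieces this sketch
// needs; in the real tool these come from cuda.h.
typedef int CUresult; // CUDA_SUCCESS == 0
typedef int CUdevice;
constexpr int CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR = 75;
constexpr int CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR = 76;

int main() {
  // dlopen() the driver at runtime so the tool still starts on machines
  // without the NVIDIA driver; a missing libcuda.so.1 becomes a clean
  // error message instead of a dynamic-linker failure.
  void *Handle = dlopen("libcuda.so.1", RTLD_NOW);
  if (!Handle) {
    fprintf(stderr, "nvptx-arch: %s\n", dlerror());
    return 1;
  }

  // Resolve only the entry points we actually use.
  auto *CuInit =
      reinterpret_cast<CUresult (*)(unsigned)>(dlsym(Handle, "cuInit"));
  auto *CuDeviceGetCount =
      reinterpret_cast<CUresult (*)(int *)>(dlsym(Handle, "cuDeviceGetCount"));
  auto *CuDeviceGetAttribute =
      reinterpret_cast<CUresult (*)(int *, int, CUdevice)>(
          dlsym(Handle, "cuDeviceGetAttribute"));
  if (!CuInit || !CuDeviceGetCount || !CuDeviceGetAttribute)
    return 1;

  int Count = 0;
  if (CuInit(0) != 0 || CuDeviceGetCount(&Count) != 0)
    return 1;

  // One line per device, in the driver's enumeration order: sm_<major><minor>.
  for (CUdevice Dev = 0; Dev < Count; ++Dev) {
    int Major = 0, Minor = 0;
    if (CuDeviceGetAttribute(
            &Major, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, Dev) != 0 ||
        CuDeviceGetAttribute(
            &Minor, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, Dev) != 0)
      return 1;
    printf("sm_%d%d\n", Major, Minor);
  }

  dlclose(Handle);
  return 0;
}
```

Built with `-ldl`, this runs even where no driver is installed and fails the way we want, which also makes the RPATH and stubs questions above moot.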
================
Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:63
+
+    printf("sm_%d%d\n", Major, Minor);
+  }
----------------
tra wrote:
> jhuber6 wrote:
> > tianshilei1992 wrote:
> > > Do we want to include the device number here?
> > For `amdgpu-arch` and here we just have it implicitly in the order, so the n-th line is the n-th device, i.e.
> > ```
> > sm_70 // device 0
> > sm_80 // device 1
> > sm_70 // device 2
> > ```
> NVIDIA GPU enumeration order is more or less arbitrary. By default it's arranged "sort of fastest GPU first", but it can be rearranged in order of PCI(e) bus IDs or in an arbitrary user-specified order using `CUDA_VISIBLE_DEVICES`. Printing the compute capability in enumeration order is pretty much all the user needs. If we want to print something that uniquely identifies the device, we would need to print the device UUID, similarly to what `nvidia-smi -L` does, or the PCIe bus IDs.
> In other words -- we can uniquely identify devices, but there's no such thing as an inherent canonical order among the devices.

I think it's mostly just important that it prints a valid GPU. Most of the uses for this tool will just be "give me a valid GPU I can run on this machine".


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433