tra added inline comments.

================
Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:34
+  const char *ErrStr = nullptr;
+  CUresult Result = cuGetErrorString(Err, &ErrStr);
+  if (Result != CUDA_SUCCESS)
----------------
jhuber6 wrote:
> tra wrote:
> > One problem with this approach is that `nvptx-arch` will fail to run on a
> > machine without NVIDIA drivers installed because dynamic linker will not
> > find `libcuda.so.1`.
> >
> > Ideally we want it to run on any machine and fail the way we want.
> >
> > A typical way to achieve that is to dlopen("libcuda.so.1"), and obtain the
> > pointers to the functions we need via `dlsym()`.
> >
> >
> We do this in the OpenMP runtime. I mostly copied this approach from the
> existing `amdgpu-arch` but we could change both to use this method.

An alternative would be to enumerate GPUs using CUDA runtime API, and link statically with libcudart_static.a.

CUDA runtime will take care of finding libcuda.so and will return an error if it fails, so you do not need to mess with dlopen, etc.

E.g. this could be used as a base:
https://github.com/NVIDIA/cuda-samples/blob/master/Samples/1_Utilities/deviceQuery/deviceQuery.cpp


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
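
For reference, the dlopen()/dlsym() approach discussed in the quoted comment above could look roughly like the sketch below. This is not the code from the patch or from the OpenMP runtime. The names `DriverApi`, `loadDriver`, and `countDevices` are hypothetical, and the sketch assumes that `CUresult` is ABI-compatible with `int` and that `CUDA_SUCCESS` is 0, so that it builds without `cuda.h`. Only `cuInit` and `cuDeviceGetCount` are looked up; a real tool would need more entry points.

```cpp
#include <dlfcn.h>

// Stand-ins for the driver API types (assumption: CUresult behaves like int
// and CUDA_SUCCESS == 0), so no CUDA headers are needed at build time.
using CUresultLike = int;
using cuInit_t = CUresultLike (*)(unsigned int);
using cuDeviceGetCount_t = CUresultLike (*)(int *);

// Hypothetical holder for the library handle and the resolved entry points.
struct DriverApi {
  void *Handle = nullptr;
  cuInit_t Init = nullptr;
  cuDeviceGetCount_t DeviceGetCount = nullptr;
};

// Tries to load the driver library and resolve the two symbols.
// Returns false and leaves Api untouched if anything is missing.
bool loadDriver(const char *LibName, DriverApi &Api) {
  void *Handle = dlopen(LibName, RTLD_LAZY | RTLD_LOCAL);
  if (!Handle)
    return false; // No driver installed: fail the way we want.

  auto Init = reinterpret_cast<cuInit_t>(dlsym(Handle, "cuInit"));
  auto Count =
      reinterpret_cast<cuDeviceGetCount_t>(dlsym(Handle, "cuDeviceGetCount"));
  if (!Init || !Count) {
    dlclose(Handle);
    return false;
  }

  Api.Handle = Handle;
  Api.Init = Init;
  Api.DeviceGetCount = Count;
  return true;
}

// Returns the number of devices, or -1 if the driver is unavailable or any
// driver call fails. The library is intentionally left loaded on success.
int countDevices(const char *LibName = "libcuda.so.1") {
  DriverApi Api;
  if (!loadDriver(LibName, Api))
    return -1;

  int N = 0;
  if (Api.Init(0) != 0 || Api.DeviceGetCount(&N) != 0)
    return -1;
  return N;
}
```

With something like this, the binary has no link-time dependency on `libcuda.so.1`, so it starts on any machine and `countDevices()` simply reports -1 when the driver is absent. On older glibc versions the program may need to be linked with `-ldl`.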