tra added inline comments.

================
Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:34
+  const char *ErrStr = nullptr;
+  CUresult Result = cuGetErrorString(Err, &ErrStr);
+  if (Result != CUDA_SUCCESS)
----------------
jhuber6 wrote:
> tra wrote:
> > One problem with this approach is that `nvptx-arch` will fail to run on a 
> > machine without NVIDIA drivers installed because dynamic linker will not 
> > find `libcuda.so.1`.
> > 
> > Ideally we want it to run on any machine and fail the way we want.
> > 
> > A typical way to achieve that is to dlopen("libcuda.so.1"), and obtain the 
> > pointers to the functions we need via `dlsym()`.
> > 
> > 
> We do this in the OpenMP runtime. I mostly copied this approach from the 
> existing `amdgpu-arch` but we could change both to use this method.
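
For reference, the `dlopen`/`dlsym` approach quoted above could look roughly 
like the sketch below. It is not code from this patch. `countCudaDevices` is a 
hypothetical helper, and the local `CUresult` declarations are assumptions 
standing in for `cuda.h` so that the tool builds without the CUDA SDK. Only 
`cuInit` and `cuDeviceGetCount` are looked up. Older glibc versions may also 
need `-ldl` at link time.

```cpp
#include <dlfcn.h>

#include <cstdio>

// Assumption: local stand-ins for the cuda.h declarations. CUresult is an
// enum in the real header and CUDA_SUCCESS is 0.
using CUresult = int;
constexpr CUresult CUDA_SUCCESS = 0;
using cuInit_t = CUresult (*)(unsigned int);
using cuDeviceGetCount_t = CUresult (*)(int *);

// Hypothetical helper: returns the number of CUDA devices, or -1 if the
// driver library cannot be loaded or any driver call fails.
int countCudaDevices() {
  void *Lib = dlopen("libcuda.so.1", RTLD_LAZY | RTLD_LOCAL);
  if (!Lib) {
    // No NVIDIA driver installed: report it and fail the way we want.
    std::fprintf(stderr, "cannot load libcuda.so.1: %s\n", dlerror());
    return -1;
  }

  // Resolve only the entry points we need.
  auto Init = reinterpret_cast<cuInit_t>(dlsym(Lib, "cuInit"));
  auto GetCount =
      reinterpret_cast<cuDeviceGetCount_t>(dlsym(Lib, "cuDeviceGetCount"));

  int Count = -1;
  if (!Init || !GetCount || Init(0) != CUDA_SUCCESS ||
      GetCount(&Count) != CUDA_SUCCESS)
    Count = -1;

  dlclose(Lib);
  return Count;
}
```

With this shape the binary has no link-time dependency on `libcuda.so.1`, so it 
starts on any machine and a missing driver becomes an ordinary error return 
(-1 here) rather than a dynamic-linker failure.
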
An alternative would be to enumerate GPUs using the CUDA runtime API and link 
statically with `libcudart_static.a`.

The CUDA runtime will take care of finding `libcuda.so` and will return an 
error if it fails, so you do not need to deal with `dlopen`, etc.

E.g. this could be used as a base:
https://github.com/NVIDIA/cuda-samples/blob/master/Samples/1_Utilities/deviceQuery/deviceQuery.cpp


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140433/new/

https://reviews.llvm.org/D140433

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
