[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > Yes, this probably would become untenable with really large sizes. I think that horse had left the barn long ago. Vast majority of NVIDIA GPU cycles these days spent in libraries shipped by NVIDIA and those are huge (O(1GB)), numerous (cuDNN, TensorRT, cuBLAS, cuSPARSE,

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-11 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D137154#3918949 , @bader wrote: > Is binary size a concern here? NVIDIA, AMD and Intel GPUs are already have ~ > 20 different architectures each, so I want my app/library to run on any GPU > from these vendors (which is

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-10 Thread Alexey Bader via Phabricator via cfe-commits
bader added a comment. Is binary size a concern here? NVIDIA, AMD and Intel GPUs are already have ~ 20 different architectures each, so I want my app/library to run on any GPU from these vendors (which is quite reasonable expectation), I'll need to have/distribute ~ 60 different binaries.

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-09 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D137154#3917752 , @tra wrote: > As for the code specialization, why not build for individual GPUs? To me it > looks like this use case is a good match for the "new-driver" offloading > that's been recently implemented in

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-09 Thread Artem Belevich via Phabricator via cfe-commits
tra added a subscriber: jhuber6. tra added a comment. In D137154#3917692 , @hdelan wrote: > Thanks for feedback. Instead of adding `__nvvm_reflect` as a clang builtin, > would it be acceptable if I modified the NVVMReflect pass That would be less

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-09 Thread Hugh Delaney via Phabricator via cfe-commits
hdelan added a comment. Thanks for feedback. Instead of adding `__nvvm_reflect` as a clang builtin, would it be acceptable if I modified the NVVMReflect pass so that it works with addrspace casting as well? This would allow us to use `__nvvm_reflect` in openCL Repository: rG LLVM Github

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-04 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:827 +BUILTIN(__nvvm_reflect, "icC*", "nc") + hdelan wrote: > tra wrote: > > Do we actually need this patch at all. > > > > `extern "C" int __nvvm_reflect(const char *);`

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-04 Thread Hugh Delaney via Phabricator via cfe-commits
hdelan added inline comments. Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:827 +BUILTIN(__nvvm_reflect, "icC*", "nc") + tra wrote: > Do we actually need this patch at all. > > `extern "C" int __nvvm_reflect(const char *);` appears to work just fine

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-03 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:827 +BUILTIN(__nvvm_reflect, "icC*", "nc") + Do we actually need this patch at all. `extern "C" int __nvvm_reflect(const char *);` appears to work just fine already:

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D137154#3905711 , @yaxunl wrote: > LLVM does not maintain bitcode backward compatibility between major releases. > How do you make sure the clang/LLVM you are using is compatible with > libdevice? Yes it does? We support

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D137154#3904810 , @hdelan wrote: > In DPC++ for CUDA we use libclc as a wrapper around CUDA SDK's libdevice. > Like libdevice we want to precompile libclc to bc for the CUDA backend > without specializing for a particular

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-03 Thread Hugh Delaney via Phabricator via cfe-commits
hdelan marked an inline comment as not done. hdelan added a comment. In DPC++ for CUDA we use libclc as a wrapper around CUDA SDK's libdevice. Like libdevice we want to precompile libclc to bc for the CUDA backend without specializing for a particular arch, so that we can call different __nv

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a subscriber: yaxunl. tra added a comment. I don't think it's a good idea. `__nvvm_reflect` is a hack that we should not propagate. The only code that currently relies on it is NVIDIA's libdevice bitcode and I'd prefer to keep it that way. Can you elaborate on what motivates this

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-01 Thread Hugh Delaney via Phabricator via cfe-commits
hdelan updated this revision to Diff 472311. hdelan added a comment. Removing redundant check Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D137154/new/ https://reviews.llvm.org/D137154 Files: clang/include/clang/Basic/BuiltinsNVPTX.def

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-01 Thread Hugh Delaney via Phabricator via cfe-commits
hdelan updated this revision to Diff 472310. hdelan added a comment. Removing redundant check Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D137154/new/ https://reviews.llvm.org/D137154 Files: llvm/lib/Target/NVPTX/NVVMReflect.cpp Index:

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-01 Thread Andrew Savonichev via Phabricator via cfe-commits
asavonic added inline comments. Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:1581 def int_nvvm_reflect : - Intrinsic<[llvm_i32_ty], [llvm_anyptr_ty], [IntrNoMem], "llvm.nvvm.reflect">; + Intrinsic<[llvm_i32_ty], [llvm_ptr_ty], [IntrNoMem], "llvm.nvvm.reflect">, +

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-01 Thread Hugh Delaney via Phabricator via cfe-commits
hdelan added inline comments. Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:1581 def int_nvvm_reflect : - Intrinsic<[llvm_i32_ty], [llvm_anyptr_ty], [IntrNoMem], "llvm.nvvm.reflect">; + Intrinsic<[llvm_i32_ty], [llvm_ptr_ty], [IntrNoMem], "llvm.nvvm.reflect">, +

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-01 Thread Andrew Savonichev via Phabricator via cfe-commits
asavonic added inline comments. Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:1581 def int_nvvm_reflect : - Intrinsic<[llvm_i32_ty], [llvm_anyptr_ty], [IntrNoMem], "llvm.nvvm.reflect">; + Intrinsic<[llvm_i32_ty], [llvm_ptr_ty], [IntrNoMem], "llvm.nvvm.reflect">, +

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-01 Thread Hugh Delaney via Phabricator via cfe-commits
hdelan created this revision. Herald added subscribers: mattd, gchakrabarti, asavonic, hiraditya. Herald added a project: All. hdelan requested review of this revision. Herald added subscribers: llvm-commits, cfe-commits, jholewinski. Herald added projects: clang, LLVM. This patch adds