================
@@ -794,7 +794,7 @@ void CodeGenModule::Release() {
       AddGlobalCtor(ObjCInitFunction);
   if (Context.getLangOpts().CUDA && CUDARuntime) {
     if (llvm::Function *CudaCtorFunction = CUDARuntime->finalizeModule())
-      AddGlobalCtor(CudaCtorFunction);
+      AddGlobalCtor(CudaCtorFunction, /*Priority=*/0);
----------------
Artem-B wrote:

> User code in the Clang interpreter is also executed through global_ctors. This 
> patch ensures a kernel can be launched in the same iteration in which it is 
> defined, by making the registration first in the list.

This sounds like an application-specific problem that may be addressable by 
lowering the priority of the user-code initializers, so that they run after 
kernel registration.
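
As a rough illustration of that ordering idea (not the interpreter's actual mechanism), the sketch below uses the GCC/Clang `constructor(priority)` attribute, where smaller numbers run earlier and values up to 100 are reserved for the implementation. The function names `register_kernels` and `user_initializer` and the priority values 101 and 200 are hypothetical, chosen only for the example.

```cpp
#include <cassert>
#include <vector>

// Function-local static, so the log is constructed on first use and does not
// depend on the initialization order of namespace-scope objects.
static std::vector<const char *> &init_log() {
    static std::vector<const char *> log;
    return log;
}

// Hypothetical stand-in for compiler-generated kernel registration.
// Smaller priority number => runs earlier (0-100 are reserved).
__attribute__((constructor(101))) static void register_kernels() {
    init_log().push_back("register_kernels");
}

// Hypothetical stand-in for a user-code initializer. Giving it a larger
// priority number means it runs after register_kernels().
__attribute__((constructor(200))) static void user_initializer() {
    // Registration has already happened by the time user code runs.
    assert(!init_log().empty());
    init_log().push_back("user_initializer");
}

// Reports whether registration was recorded before the user initializer.
bool registration_ran_first() {
    const std::vector<const char *> &log = init_log();
    return log.size() == 2 && log[0][0] == 'r' && log[1][0] == 'u';
}
```

Calling `registration_ran_first()` from `main` is expected to return true on GCC or Clang. The point is that the ordering is obtained by changing only the user-side priority, leaving the registration constructor as it is.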

In general, I'm very reluctant to change the initialization order to be 
different from what NVCC generates. We do need to interoperate with NVIDIA's 
libraries and the change in initialization order is potentially risky. 
Considering that we have no practical way to test it, and that it appears to 
address something that affects only one application (and may be dealt with on 
the app level), I do not think we should change the priority for the 
clang-generated kernel registration code.



https://github.com/llvm/llvm-project/pull/66658