tra added inline comments.

================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:1201
+      return Loc->second;
+    // non-HIP compilation may end up with a different F and need to have
+    // handles and stubs updated.
----------------
dcastagna wrote:
> The different F is because sometimes at first we use an F with incomplete 
> types as a key and later we use a different F (even if it's coming from the 
> same function) with the instantiated types (usually after we encounter a 
> statement where the size of the template parameter is necessary, like a 
> function call).
> Another option could be to always instantiate the complete type for F when we 
> call GetOrCreateLLVMFunction (IIRC) if the function has a __global__ 
> attribute.
> 
> If you prefer this solution, the patch LGTM.
> 
> 
Instantiating a complete function when we normally would not sounds a bit too 
invasive and may buy us unintended side effects. E.g. I can see it potentially 
causing code explosion in some cases, where we previously could get by with 
just a function decl. 

Fixing the cache locally maintains the global status quo which seems to be 
about the right balance between improving the situation, the effort involved, 
and the potential risk of the fix.

The way we track kernels could potentially be improved (I think it would be 
better to track them explicitly vs relying on observing the functions flying 
by), but that should be a separate patch. Right now I just want to fix the 
regression w/o affecting anything else.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146448/new/

https://reviews.llvm.org/D146448

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to