[PATCH] D47070: [CUDA] Upgrade linked bitcode to enable inlining

Jonas Hahnfeld via Phabricator via cfe-commits Thu, 31 May 2018 23:28:24 -0700

Hahnfeld added a comment.

In https://reviews.llvm.org/D47070#1108803, @tra wrote:


> Here's my understanding of what happens: 
>  We've started adding target-features and target-cpu to everything clang 
> generates. 
>  We also need to link with libdevice (or IR generated by clang), which 
> has functions w/o those attributes. Or we need to link with IR produced by 
> clang which used a different CUDA SDK and thus a different PTX version in 
> target-features.
>  Due to attribute mismatch we are failing to inline some of the functions and 
> that hurts performance.


In the case of OpenMP we are linking runtime functions from a bitcode library 
so that Clang can inline them. This dramatically improves performance, so I'm 
really interested in making this work again with libraries compiled by older 
versions of Clang.

Is there a viable path forward? Should I put up a patch that just ignores all 
`target-features` in LLVM?
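
To make the mismatch concrete, here is a minimal sketch. It assumes the 
inliner's compatibility check amounts to an exact comparison of the 
`target-cpu` and `target-features` strings, which is a simplification. 
`FnAttrs`, `strictInlineCompatible` and `exampleMismatch` are hypothetical 
names for illustration, not LLVM APIs, and the attribute values are made up:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the two function attributes discussed above.
struct FnAttrs {
  std::string target_cpu;       // e.g. "sm_60"
  std::string target_features;  // e.g. "+ptx60"
};

// Simplified model (an assumption): inlining is allowed only when both
// attribute strings of caller and callee match exactly.
bool strictInlineCompatible(const FnAttrs &caller, const FnAttrs &callee) {
  return caller.target_cpu == callee.target_cpu &&
         caller.target_features == callee.target_features;
}

// Illustrative values only; they are not taken from a real build.
void exampleMismatch() {
  FnAttrs kernel{"sm_60", "+ptx60"};       // freshly compiled user code
  FnAttrs runtime_old{"sm_60", "+ptx50"};  // library built with an older SDK
  FnAttrs no_attrs{"", ""};                // function without the attributes

  assert(strictInlineCompatible(kernel, kernel));
  assert(!strictInlineCompatible(kernel, runtime_old));  // PTX version differs
  assert(!strictInlineCompatible(kernel, no_attrs));     // attributes missing
}
```

Under this model, both a runtime library built against an older SDK and a 
function carrying no attributes at all are rejected as inline candidates, 
even though the target CPU is the same or unspecified.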


Repository:
  rC Clang

https://reviews.llvm.org/D47070



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
