yxsamliu wrote:

When clang does host compilation, it essentially assumes one of two things: 
either the IR generated for the host does not depend on the assumed GPU arch, 
or the IR may be affected by the assumed GPU arch in a way that does not 
change the program output. This is true in most cases. For example, rocprim 
needs to see __AMDGCN_WAVEFRONT_SIZE to be able to parse device functions in 
host compilation, but that does not affect what host IR is generated. That is 
why we can assume a default GPU arch in host compilation.
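
A minimal sketch of that situation, written so it also builds with a plain 
C++ compiler. SKETCH_WAVEFRONT_SIZE, waves_needed_sketch and host_sum_sketch 
are hypothetical names for illustration (not rocprim code), and the fallback 
value 64 is only an assumed default:

```cpp
#include <cassert>

// Stand-ins so the sketch compiles outside a HIP/CUDA toolchain.
#if !defined(__HIP__) && !defined(__CUDACC__)
#define __device__
#endif

// Use the compiler-provided macro if present; otherwise assume a default.
#ifdef __AMDGCN_WAVEFRONT_SIZE
#define SKETCH_WAVEFRONT_SIZE __AMDGCN_WAVEFRONT_SIZE
#else
#define SKETCH_WAVEFRONT_SIZE 64  // assumption: default arch uses wave64
#endif

// Device function: the host pass must still parse this body, so the macro
// has to be defined even though no host code is generated from it.
__device__ int waves_needed_sketch(int threads) {
  return (threads + SKETCH_WAVEFRONT_SIZE - 1) / SKETCH_WAVEFRONT_SIZE;
}

// Host function: never mentions the macro, so the code generated for it is
// the same whatever wavefront size the host pass assumed.
int host_sum_sketch(const int *data, int n) {
  assert(n >= 0);
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += data[i];
  return sum;
}
```

Here the assumed arch only matters for parsing waves_needed_sketch; the host 
function is unaffected.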

I think a more practical approach to avoid __AMDGCN_WAVEFRONT_SIZE being used 
in host functions is to define it as a clang constexpr builtin that evaluates 
to an integer constant in device functions or variables, but is diagnosed 
when used in host functions or variables. warpSize should also be defined as 
a device constexpr variable to avoid it being used in host code.
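
To illustrate the kind of host-side use such a diagnostic would catch, here 
is a rough sketch in plain C++. kAssumedWavefrontSize is a hypothetical 
stand-in for the value the host pass would see under a default arch, and both 
function names are made up for the example; this is not the proposed builtin 
itself:

```cpp
// Assumption: stands in for the macro value seen during host compilation.
constexpr int kAssumedWavefrontSize = 64;

// Problematic pattern: a host function bakes the assumed value into host
// code, so its result is wrong if the actual device uses another size.
int waves_baked_in_sketch(int threads) {
  return (threads + kAssumedWavefrontSize - 1) / kAssumedWavefrontSize;
}

// Alternative: the host receives the wavefront size as a runtime value
// (for example, one queried from the device) rather than a compile-time
// constant tied to the assumed arch.
int waves_queried_sketch(int threads, int runtime_wave_size) {
  return (threads + runtime_wave_size - 1) / runtime_wave_size;
}
```

For 128 threads on a device whose wavefront size is actually 32, the first 
function returns 2 while the second returns 4. With the approach described 
above, the first pattern would be rejected at compile time instead of 
silently producing the wrong count.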

https://github.com/llvm/llvm-project/pull/83558
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
