yaxunl added a comment.

In D56411#1398291 <https://reviews.llvm.org/D56411#1398291>, @tra wrote:

> >> That said, does CUDA have a general rule resolving `__host__` vs. 
> >> `__device__` overloads based on context?  And does it allow overloading 
> >> based solely on `__host__` vs. `__device__`?
>
> NVCC does not. Clang does. See https://goo.gl/EXnymm for the details.
>
> AFAICT, NVIDIA is starting to consider adopting Clang's approach:
>  http://lists.llvm.org/pipermail/cfe-dev/2018-November/060070.html (original 
> message from Bryce apparently didn't make it to the cfe-dev archive)


So my concern about checking host/device compatibility in template 
instantiation is still valid.

I verified that the following code is valid with clang:

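  #define __device__ __attribute__((device))
  
  // Device overload of f.
  __device__ void f();
  
  // Host overload of f (no attribute), same signature.
  void f();
  
  // Direct call from device code.
  __device__ void g() {
    f();
  }
  
  // F is a non-type template parameter of function-pointer type.
  template<void (*F)()> __device__ void t() {
    F();
  }
  
  // f is passed as a template argument from device code.
  __device__ void h() {
    t<f>();
  }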

To resolve a function-typed template argument (such as `f` in `t<f>`) based 
on its host/device attributes, we need to do the check before template 
instantiation.
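
As a rough plain-C++ analogy (not CUDA, and not code from this patch), the 
sketch below shows why the choice has to be made early. The two `f` overloads 
here differ by parameter type instead of by host/device attribute; that 
substitution, the `std::string` return values, and this version of `h()` are 
assumptions made purely for illustration. The point is that an overload set 
passed as a template argument is narrowed to one function when `t<f>` is 
formed, so the template body only ever sees a single, already-chosen function.

```cpp
#include <string>

// Hypothetical stand-ins for the two overloads of f. In the CUDA case they
// would differ by __device__ vs. host; here they differ by parameter type.
std::string f(int)    { return "int overload"; }
std::string f(double) { return "double overload"; }

// Non-type template parameter of function-pointer type, like t<F> above.
template <std::string (*F)(int)>
std::string t() {
  // When this body is instantiated, F already names one specific function.
  return F(0);
}

std::string h() {
  // The overload set `f` is resolved here, against the type of the template
  // parameter, when the template-id t<f> is formed.
  return t<f>();
}
```

In standard C++ the target type drives that selection; for CUDA the analogous 
selection would have to take the host/device attribute of the caller into 
account at the same point.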


https://reviews.llvm.org/D56411
