gtbercea added a comment.

Thanks @Hahnfeld for your suggestions.

Unfortunately, doing the lowering in the backend would require replacing the 
math function calls with calls to libdevice functions. I have not been able to 
do that in an elegant way. Encoding the interface to libdevice is not a clean 
process, and any changes to libdevice would have to be tracked manually with 
every new CUDA version. That does not make the code more maintainable; on the 
contrary, I think it makes libdevice changes harder to track.
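
To illustrate the maintenance concern, here is a minimal sketch of what a 
hand-encoded mapping could look like. The table and the helper 
lookupLibdeviceName are hypothetical and not part of this patch; only the 
__nv_* names are libdevice entry points, and only a few are shown.

```cpp
#include <map>
#include <string>

// Hypothetical hand-maintained table from standard math function names to
// libdevice entry points. Every entry has to be kept in sync by hand when a
// new CUDA version changes libdevice.
static const std::map<std::string, std::string> LibdeviceNames = {
    {"pow", "__nv_pow"},
    {"powf", "__nv_powf"},
    {"sin", "__nv_sin"},
    {"sinf", "__nv_sinf"},
};

// Returns the libdevice name for a math function, or an empty string when
// the table has no entry for it.
std::string lookupLibdeviceName(const std::string &Name) {
  auto It = LibdeviceNames.find(Name);
  return It == LibdeviceNames.end() ? std::string() : It->second;
}
```

Any function missing from the table (cbrt in this sketch) silently falls 
through, which is the kind of gap that would have to be caught manually.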

On the same note, Clang-CUDA doesn't do the pow(a,2) -> a*a optimization; I 
checked. That needs to be fixed for Clang-CUDA first before OpenMP can make use 
of it. The OpenMP-NVPTX toolchain is designed to sit on top of the CUDA 
toolchain. It therefore inherits all the Clang-CUDA benefits and, in this 
particular case, its limitations.
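
For reference, the rewrite under discussion is shown below as a minimal 
host-side sketch. The function names are illustrative only; the point is that 
the second form computes the same square without a library call.

```cpp
#include <cmath>

// Square computed through the math library call.
double squareViaPow(double A) { return std::pow(A, 2.0); }

// Square computed with a plain multiply, the form the optimization produces.
double squareViaMul(double A) { return A * A; }
```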

As for the Sema check error you report (the one related to the x restriction), 
I think the fix you proposed is good and should be pushed in a separate patch.


Repository:
  rC Clang

https://reviews.llvm.org/D47849


