hfinkel added a comment.

In D60907#1479370 <https://reviews.llvm.org/D60907#1479370>, @gtbercea wrote:

> In D60907#1479142 <https://reviews.llvm.org/D60907#1479142>, @hfinkel wrote:
>
> > In D60907#1479118 <https://reviews.llvm.org/D60907#1479118>, @gtbercea 
> > wrote:
> >
> > > Ping @hfinkel @tra
> >
> >
> > The last two comments in D47849 <https://reviews.llvm.org/D47849> indicated 
> > exploration of a different approach, and one which still seems superior to 
> > this one. Can you please comment on why you're now pursuing this approach 
> > instead?
>
>
> ...
>
> Hal, as far as I can tell, this solution is similar to yours but with a 
> slightly different implementation. If there are particular aspects about this 
> patch you would like to discuss/give feedback on please let me know.


The solution I suggested had the advantages of:

1. Being able to directly reuse the code in `__clang_cuda_device_functions.h`. 
On the other hand, using this solution we need to implement a wrapper function 
for every math function. When `__clang_cuda_device_functions.h` is updated, we 
need to update the OpenMP wrapper as well.
2. Providing access to wrappers for other CUDA intrinsics in a natural way 
(e.g., rnorm3d) [it looks a bit nicer to provide a host version of rnorm3d than 
__nv_rnorm3d in user code].
3. Being similar to the "declare variant" functionality from OpenMP 5, and 
thus, I suspect, closer to the solution we'll eventually be able to apply in a 
standard way to all targets.

What are all of the long-double functions going to do on NVPTX?

> This solution is following Alexey's suggestions. This solution allows the 
> optimization of math calls if they apply (example: pow(x,2) => x*x ) which 
> was one of the issues in the previous solution I implemented.

So we're also missing that optimization for CUDA code when compiling with 
Clang? Isn't this also something that, regardless, should be fixed?

Also, how fragile is this? We inline bottom up but this optimization needs to 
apply before inlining?

Finally, regardless of all of this, do we really need to preinclude this 
header? Can't we do this with a math.h wrapper?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60907/new/

https://reviews.llvm.org/D60907



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to