hfinkel added a comment. In D60907#1483660 <https://reviews.llvm.org/D60907#1483660>, @jdoerfert wrote:
> In D60907#1483615 <https://reviews.llvm.org/D60907#1483615>, @hfinkel wrote: > > > In D60907#1479370 <https://reviews.llvm.org/D60907#1479370>, @gtbercea > > wrote: > > > > > In D60907#1479142 <https://reviews.llvm.org/D60907#1479142>, @hfinkel > > > wrote: > > > > > > > In D60907#1479118 <https://reviews.llvm.org/D60907#1479118>, @gtbercea > > > > wrote: > > > > > > > > > Ping @hfinkel @tra > > > > > > > > > > > > The last two comments in D47849 <https://reviews.llvm.org/D47849> > > > > indicated exploration of a different approach, and one which still > > > > seems superior to this one. Can you please comment on why you're now > > > > pursuing this approach instead? > > > > > > > > > ... > > > > > > Hal, as far as I can tell, this solution is similar to yours but with a > > > slightly different implementation. If there are particular aspects about > > > this patch you would like to discuss/give feedback on please let me know. > > > > > > The solution I suggested had the advantages of: > > > > 1. Being able to directly reuse the code in > > `__clang_cuda_device_functions.h`. On the other hand, using this solution > > we need to implement a wrapper function for every math function. When > > `__clang_cuda_device_functions.h` is updated, we need to update the OpenMP > > wrapper as well. > > > I'd even go as far as to argue that `__clang_cuda_device_functions.h` should > include the internal math.h wrapper to get all math functions. See also the > next comment. > > > 2. Providing access to wrappers for other CUDA intrinsics in a natural way > > (e.g., rnorm3d) [it looks a bit nicer to provide a host version of rnorm3d > > than __nv_rnorm3d in user code]. > > @hfinkel > I don't see why you want to mix CUDA intrinsics with math.h overloads. What I had in mind was matching non-standard functions in a standard way. For example, let's just say that I have a CUDA kernel that uses the rnorm3d function, or I otherwise have a function that I'd like to write in OpenMP that will make good use of this CUDA function (because it happens to have an efficient device implementation). This is a function that CUDA provides, in the global namespace, although it's not standard. Then I can do something like this (depending on how we setup the implementation): double rnorm3d(double a, double b, double c) { return sqrt(a*a + b*b + c*c); } ... #pragma omp target { double a = ..., b = ..., c = ...; double r = rnorm3d(a, b, c) } and, if we use the CUDA math headers for CUDA math-function support, than this might "just work." To be clear, I can see an argument for having this work being a bad idea ;) -- but it has the advantage of providing a way to take advantage of system-specific functions while still writing completely-portable code. > I added a rough outline of how I imagined the internal math.h header to > look like as a comment in D47849. Could you elaborate how that differs from > what you imagine and how the other intrinsics come in? That looks like what I had in mind (including `__clang_cuda_device_functions.h` to get the device functions.) > > >> 3. Being similar to the "declare variant" functionality from OpenMP 5, and >> thus, I suspect, closer to the solution we'll eventually be able to apply in >> a standard way to all targets. > > I can see this. > >>> This solution is following Alexey's suggestions. This solution allows the >>> optimization of math calls if they apply (example: pow(x,2) => x*x ) which >>> was one of the issues in the previous solution I implemented. >> >> So we're also missing that optimization for CUDA code when compiling with >> Clang? Isn't this also something that, regardless, should be fixed? > > Maybe through a general built-in recognition and lowering into target > specific implementations/intrinsics late again? I suspect that we need to match the intrinsics and perform the optimizations in LLVM at that level in order to get the optimizations for CUDA. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D60907/new/ https://reviews.llvm.org/D60907 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits