hfinkel added a comment.

In D61399#1488366 <https://reviews.llvm.org/D61399#1488366>, @ABataev wrote:

> In D61399#1488329 <https://reviews.llvm.org/D61399#1488329>, @hfinkel wrote:
>
> > In D61399#1488309 <https://reviews.llvm.org/D61399#1488309>, @ABataev wrote:
> >
> > > In D61399#1488299 <https://reviews.llvm.org/D61399#1488299>, @hfinkel 
> > > wrote:
> > >
> > > > In D61399#1488262 <https://reviews.llvm.org/D61399#1488262>, @ABataev 
> > > > wrote:
> > > >
> > > > > I don't like this implementation. Seems to me, it breaks one of the 
> > > > > OpenMP standard requirements: the program can be compiled without 
> > > > > openmp support. I assume, that with this includes the program won't 
> > > > > be able to be compiled without OpenMP support anymore because it may 
> > > > > use some device-specific math functions explicitly.
> > > > >  Instead, I would like to see some additional, device-scpecific math 
> > > > > header file, that must be included explicitly to support some 
> > > > > device-specific math functions. And we need to provide default 
> > > > > implementations for those extra math functions for all the platforms 
> > > > > we're going to support, including default host implementations.
> > > >
> > > >
> > > > Can you provide an example of a conforming program that can't be 
> > > > compiled without OpenMP support? Regardless of the use of any 
> > > > device-specific functions (which isn't covered by the standard, of 
> > > > course, but might be needed in practice), the code still needs to be 
> > > > compilable by the host in order to generate the host-fallback version. 
> > > > This doesn't change that. Thus, any program that uses anything from 
> > > > this math.h, etc. needs to compile for the host, and thus, likely 
> > > > compiles without OpenMP support. Maybe I'm missing your point, however.
> > >
> > >
> > > Assume we have something like this:
> > >
> > >   #pragma omp target if(cond)
> > >   a = __nv_xxxx(....);
> > >
> > >
> > > Instead of `__nv_xxx` you can try to use any Cuda-specific function, 
> > > which is not the part of the standard `math.h`/`cmath` files. Will it be 
> > > compilable even with OpenMP?
> >
> >
> > I don't think that this changes that one way or the other. Your example 
> > won't work, AFAIK, unless you do something like:
> >
> >   #pragma omp target if(cond)
> >   #ifdef __NVPTX__
> >   a = __nv_xxxx(....);
> >   #else
> >   a = something_on_the_host;
> >   #endif
> >   
> >
> > and anything from these headers that doesn't also have a host version will 
> > suffer the same fate: if it won't also compile for the host (one way or 
> > another), then it won't work.
>
>
> The problem with this header file is that it allows to use those 
> Cuda-specific functions unconditionally in some cases:
>
>   #pragma omp target
>   a = __nv_xxxx(....);
>
>
> It won't require any target-specific guards to compile this code (if we 
> compile it only for Cuda-specific devices) and we're loosing the consistency 
> here: in some cases target regions will require special device guards, in 
> others, with the same function calls, it is not. And the worst thing, is that 
> we implicitly allow to introduce this kind of incostistency into users code. 
> That's why I would prefer to see a special kind of the include file, NVPTX 
> specific, that must be included explicitly, so the user explictly commanded 
> to use some target-specific math functions, if he really wants it. Plus, 
> maybe, in this files we need force check of the platform and warn users that 
> the functions from this header file must be used using device-specific 
> checks. Or provide some kind of the default implementations for all the 
> platforms, that do not support those math function natively.


I believe that I understand your point, but two things:

1. I think that you're mistaken on the underlying premise. That code will not 
meaningfully compile without ifdefs, even if only CUDA-specific devices are the 
only ones selected. We *always* compile code for the host as well, not for 
offloading proper, but for the fallback (for execution when the offloading 
fails). If I emulate this situation by writing this:



  #ifdef __NVPTX__
  int __nv_floor();
  #endif
  
  int main() {
  #pragma omp target
  __nv_floor();
  }

and try to compile using Clang with -fopenmp -fopenmp-targets=nvptx64, the 
compilation fails:

  int1.cpp:8:1: error: use of undeclared identifier '__nv_floor'

and this is because, when we invoke the compilation for the host, there is no 
declaration for that function. This is true even though nvptx64 is the only 
target for which the code is being compiled (because we always also compile the 
host fallback).

2. I believe that the future state -- what we get by following this patch, and 
then when declare variant is available using that -- gives us all what we want. 
When we have declare variant, then all of the definitions in these headers will 
be declared as variants only available on the nvptx device, and so, for a user 
to use such a function they would need to explicit add a variant that is only 
available on the host. This would be explicit.

I think that you're getting at what I mentioned earlier, where if you have code 
that uses some function called, for example, rnorm3d -- this is a non-standard 
function, and so a user is free to define it, and then they compile with OpenMP 
target and include math.h, then the version of rnorm3d we bring in from the 
CUDA header will silently override their version (which would otherwise be 
implicitly declare target). But I don't think that this will happen either with 
this patch, instead, they'll get a warning about conflicting definitions 
(`error: redefinition of 'rnorm3d'`), and if it's an external-linkage function, 
then something should give them an error (although we should check this). The 
more interesting case, I think, actually comes when we switch to using `declare 
variant`, because then I think this silent override does occur, so we'd want to 
add a warning for when a variant delcared in a system header file would 
silently override a plain (non-variant) function declare/defined elsewhere. I 
believe that will give us all of the benefits of this while also addressing the 
concern you highlight.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61399/new/

https://reviews.llvm.org/D61399



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to