tra added inline comments.

================
Comment at: clang/lib/Headers/__clang_cuda_device_functions.h:32
+
+#define __DEVICE__ static __device__ __forceinline__
+// There are number of functions that became compiler builtins in CUDA-9 and 
are
----------------
jlebar wrote:
> tra wrote:
> > jlebar wrote:
> > > I don't think we should need `__forceinline__`?  I'm sure they had that 
> > > in their headers, but clang doesn't rely on its inliner for correctness, 
> > > unlike some *other* compilers.  :)
> > > 
> > > I'm also not sure about `static`.  I think we mark all CUDA code as 
> > > internal, so it shouldn't be necessary?  If they're not static, they 
> > > should be marked as `inline`, though.
> > The idea is to make those wrappers always inlined, so they are effectively 
> > replaced with a call to __nv_* libdevice function. Whether the callee 
> > itself get inlined is controlled by the attributes in the libdevice bitcode.
> > 
> > `__attribute__((always_inline))` which `__forceinline__` expands to is 
> > handled differently from regular inline. We have 
> > Transforms/IPO/AlwaysInliner.cpp which handles them with -O1. Higher opt 
> > levels will presumably inline these functions, too. Without always_inline 
> > these wrappers will remain uninlined at -O1. Replacing it with just 
> > `inline` will work OK (only affects -O1 in a minor way), but 
> > `__forceinline__` is a bit closer to what I want from these wrappers.
> > 
> > As for the `static`, unused non-static functions will be emitted by clang 
> > -- as far as clang is concerned they are externally visible. I've checked 
> > and we do end up emitting them in PTX, even though they don't have .visible 
> > directive (I think that's what you referred to when you mentioned 'mark as 
> > internal'). I don't think we want that.
> > 
> > I'll keep this combination for now. We can adjust it later, if necessary.
> Can you help me understand what makes you want this particular inlining 
> behavior (i.e., always inline, even at -O0) for these wrappers?
> 
> Does e.g. libc++ do the same thing for the functions it has that are thin 
> wrappers around e.g. libc functions?
> 
> > As for the static, unused non-static functions will be emitted by clang -- 
> > as far as clang is concerned they are externally visible. I've checked and 
> > we do end up emitting them in PTX, even though they don't have .visible 
> > directive (I think that's what you referred to when you mentioned 'mark as 
> > internal'). I don't think we want that.
> 
> Agree.  This is kind of unfortunate, though -- it's not How You Write 
> Headers.  Could we add a comment?
libc++ does seem to use _LIBCPP_ALWAYS_INLINE for a lot of wrapper-like 
functions:

E.g. libcxx/include/__locale:
```
535:    _LIBCPP_ALWAYS_INLINE
536-    const char_type* tolower(char_type* __low, const char_type* __high) 
const
537-    {
538-        return do_tolower(__low, __high);
539-    }
```




https://reviews.llvm.org/D42513



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to