tra added inline comments.
================ Comment at: clang/lib/Headers/__clang_cuda_device_functions.h:32 + +#define __DEVICE__ static __device__ __forceinline__ +// There are number of functions that became compiler builtins in CUDA-9 and are ---------------- jlebar wrote: > tra wrote: > > jlebar wrote: > > > I don't think we should need `__forceinline__`? I'm sure they had that > > > in their headers, but clang doesn't rely on its inliner for correctness, > > > unlike some *other* compilers. :) > > > > > > I'm also not sure about `static`. I think we mark all CUDA code as > > > internal, so it shouldn't be necessary? If they're not static, they > > > should be marked as `inline`, though. > > The idea is to make those wrappers always inlined, so they are effectively > > replaced with a call to __nv_* libdevice function. Whether the callee > > itself get inlined is controlled by the attributes in the libdevice bitcode. > > > > `__attribute__((always_inline))` which `__forceinline__` expands to is > > handled differently from regular inline. We have > > Transforms/IPO/AlwaysInliner.cpp which handles them with -O1. Higher opt > > levels will presumably inline these functions, too. Without always_inline > > these wrappers will remain uninlined at -O1. Replacing it with just > > `inline` will work OK (only affects -O1 in a minor way), but > > `__forceinline__` is a bit closer to what I want from these wrappers. > > > > As for the `static`, unused non-static functions will be emitted by clang > > -- as far as clang is concerned they are externally visible. I've checked > > and we do end up emitting them in PTX, even though they don't have .visible > > directive (I think that's what you referred to when you mentioned 'mark as > > internal'). I don't think we want that. > > > > I'll keep this combination for now. We can adjust it later, if necessary. > Can you help me understand what makes you want this particular inlining > behavior (i.e., always inline, even at -O0) for these wrappers? > > Does e.g. libc++ do the same thing for the functions it has that are thin > wrappers around e.g. libc functions? > > > As for the static, unused non-static functions will be emitted by clang -- > > as far as clang is concerned they are externally visible. I've checked and > > we do end up emitting them in PTX, even though they don't have .visible > > directive (I think that's what you referred to when you mentioned 'mark as > > internal'). I don't think we want that. > > Agree. This is kind of unfortunate, though -- it's not How You Write > Headers. Could we add a comment? libc++ does seem to use _LIBCPP_ALWAYS_INLINE for a lot of wrapper-like functions: E.g. libcxx/include/__locale: ``` 535: _LIBCPP_ALWAYS_INLINE 536- const char_type* tolower(char_type* __low, const char_type* __high) const 537- { 538- return do_tolower(__low, __high); 539- } ``` https://reviews.llvm.org/D42513 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits