tra marked 2 inline comments as done.
tra added inline comments.

================
Comment at: clang/lib/Headers/__clang_cuda_device_functions.h:32
+
+#define __DEVICE__ static __device__ __forceinline__
+// There are number of functions that became compiler builtins in CUDA-9 and 
are
----------------
jlebar wrote:
> I don't think we should need `__forceinline__`?  I'm sure they had that in 
> their headers, but clang doesn't rely on its inliner for correctness, unlike 
> some *other* compilers.  :)
> 
> I'm also not sure about `static`.  I think we mark all CUDA code as internal, 
> so it shouldn't be necessary?  If they're not static, they should be marked 
> as `inline`, though.
The idea is to make those wrappers always inlined, so they are effectively 
replaced with a call to __nv_* libdevice function. Whether the callee itself 
get inlined is controlled by the attributes in the libdevice bitcode.

`__attribute__((always_inline))` which `__forceinline__` expands to is handled 
differently from regular inline. We have Transforms/IPO/AlwaysInliner.cpp which 
handles them with -O1. Higher opt levels will presumably inline these 
functions, too. Without always_inline these wrappers will remain uninlined at 
-O1. Replacing it with just `inline` will work OK (only affects -O1 in a minor 
way), but `__forceinline__` is a bit closer to what I want from these wrappers.

As for the `static`, unused non-static functions will be emitted by clang -- as 
far as clang is concerned they are externally visible. I've checked and we do 
end up emitting them in PTX, even though they don't have .visible directive (I 
think that's what you referred to when you mentioned 'mark as internal'). I 
don't think we want that.

I'll keep this combination for now. We can adjust it later, if necessary.


================
Comment at: clang/lib/Headers/__clang_cuda_device_functions.h:51
+__DEVICE__ void __brkpt() { asm volatile("brkpt;"); }
+__DEVICE__ void __brkpt(int __a) { __brkpt(); }
+__DEVICE__ unsigned int __byte_perm(unsigned int __a, unsigned int __b,
----------------
jlebar wrote:
> Sanity check: Ignoring the __a argument here is correct?
Yes. It's a leftover from the old versions and has been deprecated in CUDA-8:
```
__DEPRECATED__("Please use __brkpt() instead.") void brkpt(int c = 0)
```

CUDA-9.1 does not have it at all.


================
Comment at: clang/lib/Headers/__clang_cuda_runtime_wrapper.h:144
+// Declare or define device-side functions that particular CUDA version relies
+// on but does (no longer) declare in its headers. E.g. some of the device-side
+// functions that used to be implemented in a header in CUDA-8, became NVCC's
----------------
jlebar wrote:
> Sentence doesn't read right if you skip the parens.
> 
> But also: Our header provides *everything* from 
> cuda/device_functions_decls.h, not just the things that were once in that 
> header and, in newer CUDA versions, are no longer there, right?  I'm actually 
> even more confused now that I read the big header, because here we say the 
> header "defines or declares" these functions, but that suggests that for 
> every function, it decides whether to define or to declare it, but that's not 
> the case...
I should rephrase that then.

`__clang_cuda_device_functions.h` does two jobs. It always includes 
`__clang_cuda_libdevice_declares.h` which provides **declarations** for the 
libdevice functions. Those are gone in CUDA-9.1.

It also provides **definitions** for the standard library and __* device-side 
functions that call libdevice.

I can explicitly include _clang_cuda_libdevice_declares.h  here and update the 
comment to make more sense.


================
Comment at: clang/lib/Headers/__clang_cuda_runtime_wrapper.h:155
+#undef __THROW
+#define __THROW
 
----------------
jlebar wrote:
> Should we be pushing/popping the macro?  That is, do we want this macro 
> exposed to user code?
We do  push __THROW at the beginning of this header and pop it closer to the 
end, so this redefinition is not visible outside.



https://reviews.llvm.org/D42513



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to