jlebar added inline comments.

================
Comment at: lib/Headers/cuda_builtin_vars.h:72
@@ -66,1 +71,3 @@
+  // uint3).  This function is defined after we pull in vector_types.h.
+  __attribute__((device)) operator uint3() const;
 private:
----------------
tra wrote:
> Considering that built-in variables are never instantiated, I wonder how it's 
> going to work as the operator will presumably need 'this' pointing 
> *somewhere*, even if we don't use it. Unused 'this' would probably get 
> optimized away with optimizations on, but -O0 may cause problems.
This is interesting.  In the PTX, threadIdx actually gets instantiated as a 
non-weak global:

  .global .align 1 .b8 threadIdx[1];

Then we take the address of this thing.

At -O2, we don't emit a threadIdx global at all.

I think this is basically fine.  It's actually not right to change extern to 
static in the decl, because then we try to construct a 
__cuda_builtin_threadIdx_t, and the default constructor is deleted.  :)
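
To illustrate the extern-vs-static point, here is a minimal host-side C++ sketch. It is not the actual header: Uint3, BuiltinIdx and fakeThreadIdx are hypothetical stand-ins for uint3, __cuda_builtin_threadIdx_t and threadIdx, and the conversion operator returns placeholder values instead of reading special registers.

```cpp
#include <type_traits>

// Hypothetical stand-in for CUDA's uint3 (the real one comes from
// vector_types.h).
struct Uint3 {
  unsigned x, y, z;
};

// Hypothetical stand-in for __cuda_builtin_threadIdx_t: an empty type that
// cannot be constructed, copied, or assigned, but converts to Uint3.
struct BuiltinIdx {
  // The operator receives 'this' but never reads through it.  The values
  // here are placeholders, not real thread indices.
  operator Uint3() const { return Uint3{1u, 2u, 3u}; }

  BuiltinIdx() = delete;
  BuiltinIdx(const BuiltinIdx &) = delete;
  BuiltinIdx &operator=(const BuiltinIdx &) = delete;
};

// A declaration only: no object is constructed here, so the deleted default
// constructor is never needed.
extern const BuiltinIdx fakeThreadIdx;

// Changing 'extern' to 'static' turns the line into a definition, which
// requires the deleted default constructor and fails to compile:
//   static const BuiltinIdx fakeThreadIdxStatic;

static_assert(!std::is_default_constructible<BuiltinIdx>::value,
              "the type cannot be default-constructed");
static_assert(std::is_convertible<const BuiltinIdx &, Uint3>::value,
              "the conversion operator makes it usable as a Uint3");
static_assert(std::is_empty<BuiltinIdx>::value,
              "the type carries no data of its own");
```

The sketch never odr-uses fakeThreadIdx, so it compiles and links on the host without a definition.  An empty class like this typically occupies one byte, which would be consistent with the one-byte global in the PTX above, though that correspondence is an assumption on my part.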


http://reviews.llvm.org/D17561



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
