[PATCH] D25704: [CUDA] When we emit an error that might have been deferred, also print a callstack.

2016-10-19 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL284647: [CUDA] When we emit an error that might have been 
deferred, also print a… (authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D25704?vs=75178=75226#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25704

Files:
  cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
  cfe/trunk/include/clang/Sema/Sema.h
  cfe/trunk/lib/Sema/SemaCUDA.cpp
  cfe/trunk/test/SemaCUDA/bad-calls-on-same-line.cu
  cfe/trunk/test/SemaCUDA/call-device-fn-from-host.cu
  cfe/trunk/test/SemaCUDA/call-host-fn-from-device.cu
  cfe/trunk/test/SemaCUDA/call-stack-for-deferred-err.cu
  cfe/trunk/test/SemaCUDA/exceptions.cu
  cfe/trunk/test/SemaCUDA/no-call-stack-for-immediate-errs.cu
  cfe/trunk/test/SemaCUDA/trace-through-global.cu

Index: cfe/trunk/include/clang/Sema/Sema.h
===
--- cfe/trunk/include/clang/Sema/Sema.h
+++ cfe/trunk/include/clang/Sema/Sema.h
@@ -9249,26 +9249,42 @@
   /// Diagnostics that are emitted only if we discover that the given function
   /// must be codegen'ed.  Because handling these correctly adds overhead to
   /// compilation, this is currently only enabled for CUDA compilations.
-  llvm::DenseMap>
+  llvm::DenseMap
   CUDADeferredDiags;
 
   /// FunctionDecls plus raw encodings of SourceLocations for which
   /// CheckCUDACall has emitted a (maybe deferred) "bad call" diagnostic.  We
   /// use this to avoid emitting the same deferred diag twice.
-  llvm::DenseSet LocsWithCUDACallDiags;
+  llvm::DenseSet>
+  LocsWithCUDACallDiags;
 
-  /// The set of CUDA functions that we've discovered must be emitted by tracing
-  /// the call graph.  Functions that we can tell a priori must be emitted
-  /// aren't added to this set.
-  llvm::DenseSet CUDAKnownEmittedFns;
+  /// A pair of a canonical FunctionDecl and a SourceLocation.
+  struct FunctionDeclAndLoc {
+CanonicalDeclPtr FD;
+SourceLocation Loc;
+  };
+
+  /// An inverse call graph, mapping known-emitted functions to one of their
+  /// known-emitted callers (plus the location of the call).
+  ///
+  /// Functions that we can tell a priori must be emitted aren't added to this
+  /// map.
+  llvm::DenseMap,
+ /* Caller = */ FunctionDeclAndLoc>
+  CUDAKnownEmittedFns;
 
   /// A partial call graph maintained during CUDA compilation to support
-  /// deferred diagnostics.  Specifically, functions are only added here if, at
-  /// the time they're added, they are not known-emitted.  As soon as we
-  /// discover that a function is known-emitted, we remove it and everything it
-  /// transitively calls from this set and add those functions to
-  /// CUDAKnownEmittedFns.
-  llvm::DenseMap> CUDACallGraph;
+  /// deferred diagnostics.
+  ///
+  /// Functions are only added here if, at the time they're considered, they are
+  /// not known-emitted.  As soon as we discover that a function is
+  /// known-emitted, we remove it and everything it transitively calls from this
+  /// set and add those functions to CUDAKnownEmittedFns.
+  llvm::DenseMap,
+ /* Callees = */ llvm::MapVector>
+  CUDACallGraph;
 
   /// Diagnostic builder for CUDA errors which may or may not be deferred.
   ///
@@ -9291,13 +9307,19 @@
   K_Nop,
   /// Emit the diagnostic immediately (i.e., behave like Sema::Diag()).
   K_Immediate,
+  /// Emit the diagnostic immediately, and, if it's a warning or error, also
+  /// emit a call stack showing how this function can be reached by an a
+  /// priori known-emitted function.
+  K_ImmediateWithCallStack,
   /// Create a deferred diagnostic, which is emitted only if the function
-  /// it's attached to is codegen'ed.
+  /// it's attached to is codegen'ed.  Also emit a call stack as with
+  /// K_ImmediateWithCallStack.
   K_Deferred
 };
 
 CUDADiagBuilder(Kind K, SourceLocation Loc, unsigned DiagID,
 FunctionDecl *Fn, Sema );
+~CUDADiagBuilder();
 
 /// Convertible to bool: True if we immediately emitted an error, false if
 /// we didn't emit an error or we created a deferred error.
@@ -9309,38 +9331,29 @@
 ///
 /// But see CUDADiagIfDeviceCode() and CUDADiagIfHostCode() -- you probably
 /// want to use these instead of creating a CUDADiagBuilder yourself.
-operator bool() const { return ImmediateDiagBuilder.hasValue(); }
+operator bool() const { return ImmediateDiag.hasValue(); }
 
 template 
 friend const CUDADiagBuilder <<(const CUDADiagBuilder ,
  const T ) {
-  if (Diag.ImmediateDiagBuilder.hasValue())
-*Diag.ImmediateDiagBuilder << Value;
-  else if 

[PATCH] D25704: [CUDA] When we emit an error that might have been deferred, also print a callstack.

2016-10-19 Thread Justin Lebar via cfe-commits
jlebar marked 2 inline comments as done.
jlebar added a comment.

Thank you for the review, Reid.


https://reviews.llvm.org/D25704



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25704: [CUDA] When we emit an error that might have been deferred, also print a callstack.

2016-10-19 Thread Justin Lebar via cfe-commits
jlebar marked 2 inline comments as done.
jlebar added a comment.

I'm going to submit this and send a patch to reuse FunctionDeclAndLoc.  But I'm 
happy to add a comment about the note as well.




Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:6707
 
+def note_called_by : Note<"called by %0">;
 def err_kern_type_not_void_return : Error<

rnk wrote:
> Do you think it's worth trying to indicate why the root of the "called by" 
> notes must be emitted? I'm not suggesting we do it in this patch, just 
> wondering.
It seems just like e.g. note_template_class_instantiation_here, so I am not 
entirely sure what is the point of confusion.  But if you can clarify that, I 
am happy to add a comment in a separate patch -- this stuff is already quite 
complicated, so I'm in favor of whatever we can do to make it clearer.



Comment at: clang/include/clang/Sema/Sema.h:9259
   /// use this to avoid emitting the same deferred diag twice.
-  llvm::DenseSet LocsWithCUDACallDiags;
+  llvm::DenseSet>
+  LocsWithCUDACallDiags;

rnk wrote:
> So, part of me wants to use FunctionDeclAndLoc here instead of std::pair, but 
> then you'd have to bring back all the hashing machinery instead of getting it 
> for free. Up to you, I guess.
Oh, and now make its hash function depend on both members instead of just the 
FD?

I actually like that change, but let me make it in a separate patch.



Comment at: clang/include/clang/Sema/Sema.h:9274
+  llvm::DenseMap,
+ /* Caller = */ FunctionDeclAndLoc>
+  CUDAKnownEmittedFns;

rnk wrote:
> I guess there can be many callers, but we always record the first one that 
> caused this function to be emitted.
Yes, exactly.

I was briefly worried about cycles here, but I think we're OK because 
ultimately we need to end up at an a priori known-emitted function, and that's 
a root in this tree.


https://reviews.llvm.org/D25704



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25704: [CUDA] When we emit an error that might have been deferred, also print a callstack.

2016-10-19 Thread Reid Kleckner via cfe-commits
rnk accepted this revision.
rnk added a comment.
This revision is now accepted and ready to land.

lgtm




Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:6707
 
+def note_called_by : Note<"called by %0">;
 def err_kern_type_not_void_return : Error<

Do you think it's worth trying to indicate why the root of the "called by" 
notes must be emitted? I'm not suggesting we do it in this patch, just 
wondering.



Comment at: clang/include/clang/Sema/Sema.h:9259
   /// use this to avoid emitting the same deferred diag twice.
-  llvm::DenseSet LocsWithCUDACallDiags;
+  llvm::DenseSet>
+  LocsWithCUDACallDiags;

So, part of me wants to use FunctionDeclAndLoc here instead of std::pair, but 
then you'd have to bring back all the hashing machinery instead of getting it 
for free. Up to you, I guess.



Comment at: clang/include/clang/Sema/Sema.h:9274
+  llvm::DenseMap,
+ /* Caller = */ FunctionDeclAndLoc>
+  CUDAKnownEmittedFns;

I guess there can be many callers, but we always record the first one that 
caused this function to be emitted.


https://reviews.llvm.org/D25704



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25704: [CUDA] When we emit an error that might have been deferred, also print a callstack.

2016-10-19 Thread Justin Lebar via cfe-commits
jlebar added inline comments.



Comment at: clang/include/clang/Sema/Sema.h:9292
+  llvm::DenseMap,
+ /* Callees = */ llvm::SetVector>
+  CUDACallGraph;

rnk wrote:
> Rather than having a custom key type, maybe it would be better to phrase this 
> as a `MapVector` ? Despite 
> all the comments, I assumed FunctionDeclAndLoc was hashed by both elements 
> for a long time.
That's much better; thank you.


https://reviews.llvm.org/D25704



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25704: [CUDA] When we emit an error that might have been deferred, also print a callstack.

2016-10-19 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 75178.
jlebar marked 2 inline comments as done.
jlebar added a comment.

Address rnk's comments.


https://reviews.llvm.org/D25704

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaCUDA.cpp
  clang/test/SemaCUDA/bad-calls-on-same-line.cu
  clang/test/SemaCUDA/call-device-fn-from-host.cu
  clang/test/SemaCUDA/call-host-fn-from-device.cu
  clang/test/SemaCUDA/call-stack-for-deferred-err.cu
  clang/test/SemaCUDA/exceptions-host-device.cu
  clang/test/SemaCUDA/no-call-stack-for-immediate-errs.cu
  clang/test/SemaCUDA/trace-through-global.cu

Index: clang/test/SemaCUDA/trace-through-global.cu
===
--- clang/test/SemaCUDA/trace-through-global.cu
+++ clang/test/SemaCUDA/trace-through-global.cu
@@ -35,10 +35,16 @@
 template 
 void launch_kernel() {
   kernel<<<0, 0>>>(T());
-  hd1();
-  hd3(T());
+
+  // Notice that these two diagnostics are different: Because the call to hd1
+  // is not dependent on T, the call to hd1 comes from 'launch_kernel', while
+  // the call to hd3, being dependent, comes from 'launch_kernel'.
+  hd1(); // expected-note {{called by 'launch_kernel'}}
+  hd3(T()); // expected-note {{called by 'launch_kernel'}}
 }
 
 void host_fn() {
   launch_kernel();
+  // expected-note@-1 {{called by 'host_fn'}}
+  // expected-note@-2 {{called by 'host_fn'}}
 }
Index: clang/test/SemaCUDA/no-call-stack-for-immediate-errs.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/no-call-stack-for-immediate-errs.cu
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -fcuda-is-device -fsyntax-only -verify %s
+
+#include "Inputs/cuda.h"
+
+// Here we should dump an error about the VLA in device_fn, but we should not
+// print a callstack indicating how device_fn becomes known-emitted, because
+// it's an error to use a VLA in any __device__ function, even one that doesn't
+// get emitted.
+
+inline __device__ void device_fn(int n);
+inline __device__ void device_fn2() { device_fn(42); }
+
+__global__ void kernel() { device_fn2(); }
+
+inline __device__ void device_fn(int n) {
+  int vla[n]; // expected-error {{variable-length array}}
+}
Index: clang/test/SemaCUDA/exceptions-host-device.cu
===
--- clang/test/SemaCUDA/exceptions-host-device.cu
+++ clang/test/SemaCUDA/exceptions-host-device.cu
@@ -36,3 +36,6 @@
 #endif
 }
 __device__ void call_hd3() { hd3(); }
+#ifndef HOST
+// expected-note@-2 {{called by 'call_hd3'}}
+#endif
Index: clang/test/SemaCUDA/call-stack-for-deferred-err.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/call-stack-for-deferred-err.cu
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -fcuda-is-device -fsyntax-only -verify %s
+
+#include "Inputs/cuda.h"
+
+// We should emit an error for hd_fn's use of a VLA.  This would have been
+// legal if hd_fn were never codegen'ed on the device, so we should also print
+// out a callstack showing how we determine that hd_fn is known-emitted.
+//
+// Compare to no-call-stack-for-deferred-err.cu.
+
+inline __host__ __device__ void hd_fn(int n);
+inline __device__ void device_fn2() { hd_fn(42); } // expected-note {{called by 'device_fn2'}}
+
+__global__ void kernel() { device_fn2(); } // expected-note {{called by 'kernel'}}
+
+inline __host__ __device__ void hd_fn(int n) {
+  int vla[n]; // expected-error {{variable-length array}}
+}
Index: clang/test/SemaCUDA/call-host-fn-from-device.cu
===
--- clang/test/SemaCUDA/call-host-fn-from-device.cu
+++ clang/test/SemaCUDA/call-host-fn-from-device.cu
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 %s --std=c++11 -triple nvptx-unknown-unknown -fcuda-is-device \
-// RUN:   -emit-llvm -o /dev/null -verify
+// RUN:   -emit-llvm -o /dev/null -verify -verify-ignore-unexpected=note
 
 // Note: This test won't work with -fsyntax-only, because some of these errors
 // are emitted during codegen.
Index: clang/test/SemaCUDA/call-device-fn-from-host.cu
===
--- clang/test/SemaCUDA/call-device-fn-from-host.cu
+++ clang/test/SemaCUDA/call-device-fn-from-host.cu
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 %s --std=c++11 -triple x86_64-unknown-linux -emit-llvm -o - -verify
+// RUN: %clang_cc1 %s --std=c++11 -triple x86_64-unknown-linux -emit-llvm -o - \
+// RUN:   -verify -verify-ignore-unexpected=note
 
 // Note: This test won't work with -fsyntax-only, because some of these errors
 // are emitted during codegen.
Index: clang/test/SemaCUDA/bad-calls-on-same-line.cu
===
--- clang/test/SemaCUDA/bad-calls-on-same-line.cu
+++ clang/test/SemaCUDA/bad-calls-on-same-line.cu
@@ -35,5 +35,7 @@
 void host_fn() {
   hd();
   hd();  // expected-note {{function 

[PATCH] D25704: [CUDA] When we emit an error that might have been deferred, also print a callstack.

2016-10-19 Thread Reid Kleckner via cfe-commits
rnk added inline comments.



Comment at: clang/include/clang/Sema/Sema.h:9292
+  llvm::DenseMap,
+ /* Callees = */ llvm::SetVector>
+  CUDACallGraph;

Rather than having a custom key type, maybe it would be better to phrase this 
as a `MapVector` ? Despite all 
the comments, I assumed FunctionDeclAndLoc was hashed by both elements for a 
long time.



Comment at: clang/include/clang/Sema/Sema.h:9322
+  /// it's attached to is codegen'ed.  Also emit a call stack as with
+  /// K_ImmedaiteWithCallStack.
   K_Deferred

typo on "immediate"


https://reviews.llvm.org/D25704



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25704: [CUDA] When we emit an error that might have been deferred, also print a callstack.

2016-10-17 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: rnk.
jlebar added subscribers: tra, cfe-commits.

Previously, when you did something not allowed in a host+device function
and then caused it to be codegen'ed, we would print out an error telling
you that you did something bad, but we wouldn't tell you how we decided
that the function needed to be codegen'ed.

This change causes us to print out a callstack when emitting deferred
errors.  This is immensely helpful when debugging highly-templated code,
where it's often unclear how a function became known-emitted.

We only print the callstack once per function, after we print the all
deferred errors.

This patch also switches all of our hashtables to using canonical
FunctionDecls instead of regular FunctionDecls.  This prevents a number
of bugs, some of which are caught by tests added here, in which we
assume that two FDs for the same function have the same pointer value.


https://reviews.llvm.org/D25704

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaCUDA.cpp
  clang/test/SemaCUDA/bad-calls-on-same-line.cu
  clang/test/SemaCUDA/call-device-fn-from-host.cu
  clang/test/SemaCUDA/call-host-fn-from-device.cu
  clang/test/SemaCUDA/call-stack-for-deferred-err.cu
  clang/test/SemaCUDA/exceptions-host-device.cu
  clang/test/SemaCUDA/no-call-stack-for-immediate-errs.cu
  clang/test/SemaCUDA/trace-through-global.cu

Index: clang/test/SemaCUDA/trace-through-global.cu
===
--- clang/test/SemaCUDA/trace-through-global.cu
+++ clang/test/SemaCUDA/trace-through-global.cu
@@ -35,10 +35,16 @@
 template 
 void launch_kernel() {
   kernel<<<0, 0>>>(T());
-  hd1();
-  hd3(T());
+
+  // Notice that these two diagnostics are different: Because the call to hd1
+  // is not dependent on T, the call to hd1 comes from 'launch_kernel', while
+  // the call to hd3, being dependent, comes from 'launch_kernel'.
+  hd1(); // expected-note {{called by 'launch_kernel'}}
+  hd3(T()); // expected-note {{called by 'launch_kernel'}}
 }
 
 void host_fn() {
   launch_kernel();
+  // expected-note@-1 {{called by 'host_fn'}}
+  // expected-note@-2 {{called by 'host_fn'}}
 }
Index: clang/test/SemaCUDA/no-call-stack-for-immediate-errs.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/no-call-stack-for-immediate-errs.cu
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -fcuda-is-device -fsyntax-only -verify %s
+
+#include "Inputs/cuda.h"
+
+// Here we should dump an error about the VLA in device_fn, but we should not
+// print a callstack indicating how device_fn becomes known-emitted, because
+// it's an error to use a VLA in any __device__ function, even one that doesn't
+// get emitted.
+
+inline __device__ void device_fn(int n);
+inline __device__ void device_fn2() { device_fn(42); }
+
+__global__ void kernel() { device_fn2(); }
+
+inline __device__ void device_fn(int n) {
+  int vla[n]; // expected-error {{variable-length array}}
+}
Index: clang/test/SemaCUDA/exceptions-host-device.cu
===
--- clang/test/SemaCUDA/exceptions-host-device.cu
+++ clang/test/SemaCUDA/exceptions-host-device.cu
@@ -36,3 +36,6 @@
 #endif
 }
 __device__ void call_hd3() { hd3(); }
+#ifndef HOST
+// expected-note@-2 {{called by 'call_hd3'}}
+#endif
Index: clang/test/SemaCUDA/call-stack-for-deferred-err.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/call-stack-for-deferred-err.cu
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -fcuda-is-device -fsyntax-only -verify %s
+
+#include "Inputs/cuda.h"
+
+// We should emit an error for hd_fn's use of a VLA.  This would have been
+// legal if hd_fn were never codegen'ed on the device, so we should also print
+// out a callstack showing how we determine that hd_fn is known-emitted.
+//
+// Compare to no-call-stack-for-deferred-err.cu.
+
+inline __host__ __device__ void hd_fn(int n);
+inline __device__ void device_fn2() { hd_fn(42); } // expected-note {{called by 'device_fn2'}}
+
+__global__ void kernel() { device_fn2(); } // expected-note {{called by 'kernel'}}
+
+inline __host__ __device__ void hd_fn(int n) {
+  int vla[n]; // expected-error {{variable-length array}}
+}
Index: clang/test/SemaCUDA/call-host-fn-from-device.cu
===
--- clang/test/SemaCUDA/call-host-fn-from-device.cu
+++ clang/test/SemaCUDA/call-host-fn-from-device.cu
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 %s --std=c++11 -triple nvptx-unknown-unknown -fcuda-is-device \
-// RUN:   -emit-llvm -o /dev/null -verify
+// RUN:   -emit-llvm -o /dev/null -verify -verify-ignore-unexpected=note
 
 // Note: This test won't work with -fsyntax-only, because some of these errors
 // are emitted during codegen.
Index: clang/test/SemaCUDA/call-device-fn-from-host.cu