rjmccall added inline comments.

================
Comment at: clang/lib/Sema/Sema.cpp:1514
+  void visitUsedDecl(SourceLocation Loc, Decl *D) {
+    if (auto *TD = dyn_cast<TranslationUnitDecl>(D)) {
+      for (auto *DD : TD->decls()) {
----------------
bader wrote:
> yaxunl wrote:
> > rjmccall wrote:
> > > yaxunl wrote:
> > > > rjmccall wrote:
> > > > > erichkeane wrote:
> > > > > > rjmccall wrote:
> > > > > > > erichkeane wrote:
> > > > > > > > Note that when recommitting this (if you choose to), this needs 
> > > > > > > > to also handle NamespaceDecl.  We're a downstream and 
> > > > > > > > discovered that this doesn't properly handle functions or 
> > > > > > > > records handled in a namespace.
> > > > > > > > 
> > > > > > > > It can be implemented identically to TranslationUnitDecl.
> > > > > > > Wait, what?  We shouldn't be doing this for TranslationUnitDecl 
> > > > > > > either.   I don't even know how we're "using" a 
> > > > > > > TranslationUnitDecl, but neither this case not the case for 
> > > > > > > `NamespaceDecl` should be recursively using every declaration 
> > > > > > > declared inside it.  If there's a declaration in a namespace 
> > > > > > > that's being used, it should be getting visited as part of the 
> > > > > > > actual use of it.
> > > > > > > 
> > > > > > > The logic for `RecordDecl` has the same problem.  
> > > > > > Despite the name, this seems to be more of a home-written ast 
> > > > > > walking class.  The entry point is the 'translation unit' which 
> > > > > > seems to walk through everything in an attempt to find all the 
> > > > > > functions (including those that are 'marked' as used by an 
> > > > > > attribute).
> > > > > > 
> > > > > > You'll see the FunctionDecl section makes this assumption as well 
> > > > > > (not necessarily that we got to a function via a call). IMO, this 
> > > > > > approach is strange, and we should register entry points in some 
> > > > > > manner (functions marked as emitted to the device in some fashion), 
> > > > > > then just follow its call-graph (via the clang::CallGraph?) to emit 
> > > > > > all of these functions.
> > > > > > 
> > > > > > It seemed really odd to see this approach here, but it seemed well 
> > > > > > reviewed by the time I noticed it (via a downstream bug) so I 
> > > > > > figured I'd lost my chance to disagree with the approach.
> > > > > > 
> > > > > > 
> > > > > Sure, but `visitUsedDecl` isn't the right place to be entering the 
> > > > > walk.  `visitUsedDecl` is supposed to be the *callback* from the 
> > > > > walk.  If they need to walk all the global declarations to find 
> > > > > kernels instead of tracking the kernels as they're encountered (which 
> > > > > would be a *much* better approach), it should be done as a separate 
> > > > > function.
> > > > > 
> > > > > I just missed this in the review.
> > > > The deferred diagnostics could be initiated by non-kernel functions or 
> > > > even host functions.
> > > > 
> > > > Let's consider a device code library where no kernels are defined. A 
> > > > device function is emitted, which calls a host device function which 
> > > > has a deferred diagnostic. All device functions that are emitted need 
> > > > to be checked.
> > > > 
> > > > Same with host functions that are emitted, which may call a host device 
> > > > function which has deferred diagnostic.
> > > > 
> > > > Also not just function calls need to be checked. A function address may 
> > > > be taken then called through function pointer. Therefore any reference 
> > > > to a function needs to be followed.
> > > > 
> > > > In the case of OpenMP, the initialization of a global function pointer 
> > > > which refers a function may trigger a deferred diangostic. There are 
> > > > tests for that.
> > > Right, I get that emitting deferred diagnostics for a declaration D needs 
> > > to trigger any deferred diagnostics in declarations used by D, 
> > > recursively.  You essentially have a graph of lazily-emitted declarations 
> > > (which may or may not have deferred diagnostics) and a number of 
> > > eagerly-emitted "root" declarations with use-edges leading into that 
> > > graph.  Any declaration that's reachable from a root will need to be 
> > > emitted and so needs to have any deferred diagnostics emitted as well.  
> > > My question is why you're finding these roots with a retroactive walk of 
> > > the entire translation unit instead of either building a list of roots as 
> > > you go or (better yet) building a list of lazily-emitted declarations 
> > > that are used by those roots.  You can unambiguously identify at the 
> > > point of declaration whether an entity will be eagerly or lazily emitted, 
> > > right?  If you just store those initial edges into the lazily-emitted 
> > > declarations graph and then initiate the recursive walk from them at the 
> > > end of the translation unit, you'll only end up walking declarations that 
> > > are actually relevant to your compilation, so you'll have much better 
> > > locality and (if this matters to you) you'll naturally work a lot better 
> > > with PCH and modules.
> > I will try the approach you suggested. Basically I will record the emitted 
> > functions and variables during parsing and use them as starting point for 
> > the final traversal.
> > 
> > This should work for CUDA/HIP. However it may be tricky for OpenMP since 
> > the emission of some entities depending on pragmas. Still it may be doable. 
> > If I encounter difficulty I will come back for discussion.
> > 
> > I will post the change for review.
> > 
> > Thanks.
> FYI: SYCL is also using deferred diagnostics engine to emit device side 
> diagnostics, although this part hasn't been up-streamed yet, but we are 
> tracking changes in this area.
> SYCL support implementation should be quite similar to CUDA/HIP.
Okay, thank you.  Do you still need all the cases in here for records, 
templates, and so on?  It looks to me like you should always end up here with 
exactly the variables and functions that are being used, and you should never 
need to make special efforts to e.g. visit all the specializations of a 
template or visit all the methods of a class.


================
Comment at: clang/lib/Sema/Sema.cpp:1540
+    } else if (auto *VD = dyn_cast<VarDecl>(D)) {
+      if (auto *Init = VD->getInit()) {
+        auto DevTy = OMPDeclareTargetDeclAttr::getDeviceType(VD);
----------------
Can there also be deferred diagnostics associated with this initializer?


================
Comment at: clang/lib/Sema/SemaDecl.cpp:12229
+  if (LangOpts.OpenMP)
+    DeclsToCheckForDeferredDiags.push_back(VDecl);
   CheckCompleteVariableDeclaration(VDecl);
----------------
`DeclsToCheckForDeferredDiags` is basically a set of declarations that you know 
to have to emit, right?  It doesn't seem right to be adding every variable with 
an initializer to that set — especially because I'm pretty sure this function 
gets called for literally every variable with an initializer, including local 
variables.  Presumably you only need to do this for global variables that 
you're definitely going to emit in the current mode.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70172/new/

https://reviews.llvm.org/D70172



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to