yaxunl marked an inline comment as done.
yaxunl added inline comments.

================
Comment at: clang/lib/Sema/Sema.cpp:1514
+  void visitUsedDecl(SourceLocation Loc, Decl *D) {
+    if (auto *TD = dyn_cast<TranslationUnitDecl>(D)) {
+      for (auto *DD : TD->decls()) {
----------------
rjmccall wrote:
> yaxunl wrote:
> > rjmccall wrote:
> > > erichkeane wrote:
> > > > rjmccall wrote:
> > > > > erichkeane wrote:
> > > > > > Note that when recommitting this (if you choose to), this needs to 
> > > > > > also handle NamespaceDecl.  We're a downstream and discovered that 
> > > > > > this doesn't properly handle functions or records handled in a 
> > > > > > namespace.
> > > > > > 
> > > > > > It can be implemented identically to TranslationUnitDecl.
> > > > > Wait, what?  We shouldn't be doing this for TranslationUnitDecl 
> > > > > either.   I don't even know how we're "using" a TranslationUnitDecl, 
> > > > > but neither this case not the case for `NamespaceDecl` should be 
> > > > > recursively using every declaration declared inside it.  If there's a 
> > > > > declaration in a namespace that's being used, it should be getting 
> > > > > visited as part of the actual use of it.
> > > > > 
> > > > > The logic for `RecordDecl` has the same problem.  
> > > > Despite the name, this seems to be more of a home-written ast walking 
> > > > class.  The entry point is the 'translation unit' which seems to walk 
> > > > through everything in an attempt to find all the functions (including 
> > > > those that are 'marked' as used by an attribute).
> > > > 
> > > > You'll see the FunctionDecl section makes this assumption as well (not 
> > > > necessarily that we got to a function via a call). IMO, this approach 
> > > > is strange, and we should register entry points in some manner 
> > > > (functions marked as emitted to the device in some fashion), then just 
> > > > follow its call-graph (via the clang::CallGraph?) to emit all of these 
> > > > functions.
> > > > 
> > > > It seemed really odd to see this approach here, but it seemed well 
> > > > reviewed by the time I noticed it (via a downstream bug) so I figured 
> > > > I'd lost my chance to disagree with the approach.
> > > > 
> > > > 
> > > Sure, but `visitUsedDecl` isn't the right place to be entering the walk.  
> > > `visitUsedDecl` is supposed to be the *callback* from the walk.  If they 
> > > need to walk all the global declarations to find kernels instead of 
> > > tracking the kernels as they're encountered (which would be a *much* 
> > > better approach), it should be done as a separate function.
> > > 
> > > I just missed this in the review.
> > The deferred diagnostics could be initiated by non-kernel functions or even 
> > host functions.
> > 
> > Let's consider a device code library where no kernels are defined. A device 
> > function is emitted, which calls a host device function which has a 
> > deferred diagnostic. All device functions that are emitted need to be 
> > checked.
> > 
> > Same with host functions that are emitted, which may call a host device 
> > function which has deferred diagnostic.
> > 
> > Also not just function calls need to be checked. A function address may be 
> > taken then called through function pointer. Therefore any reference to a 
> > function needs to be followed.
> > 
> > In the case of OpenMP, the initialization of a global function pointer 
> > which refers a function may trigger a deferred diangostic. There are tests 
> > for that.
> Right, I get that emitting deferred diagnostics for a declaration D needs to 
> trigger any deferred diagnostics in declarations used by D, recursively.  You 
> essentially have a graph of lazily-emitted declarations (which may or may not 
> have deferred diagnostics) and a number of eagerly-emitted "root" 
> declarations with use-edges leading into that graph.  Any declaration that's 
> reachable from a root will need to be emitted and so needs to have any 
> deferred diagnostics emitted as well.  My question is why you're finding 
> these roots with a retroactive walk of the entire translation unit instead of 
> either building a list of roots as you go or (better yet) building a list of 
> lazily-emitted declarations that are used by those roots.  You can 
> unambiguously identify at the point of declaration whether an entity will be 
> eagerly or lazily emitted, right?  If you just store those initial edges into 
> the lazily-emitted declarations graph and then initiate the recursive walk 
> from them at the end of the translation unit, you'll only end up walking 
> declarations that are actually relevant to your compilation, so you'll have 
> much better locality and (if this matters to you) you'll naturally work a lot 
> better with PCH and modules.
I will try the approach you suggested. Basically I will record the emitted 
functions and variables during parsing and use them as starting point for the 
final traversal.

This should work for CUDA/HIP. However it may be tricky for OpenMP since the 
emission of some entities depending on pragmas. Still it may be doable. If I 
encounter difficulty I will come back for discussion.

I will post the change for review.

Thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70172/new/

https://reviews.llvm.org/D70172



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to