yaxunl marked an inline comment as done. yaxunl added inline comments.
================ Comment at: clang/lib/Sema/Sema.cpp:1514 + void visitUsedDecl(SourceLocation Loc, Decl *D) { + if (auto *TD = dyn_cast<TranslationUnitDecl>(D)) { + for (auto *DD : TD->decls()) { ---------------- rjmccall wrote: > yaxunl wrote: > > rjmccall wrote: > > > erichkeane wrote: > > > > rjmccall wrote: > > > > > erichkeane wrote: > > > > > > Note that when recommitting this (if you choose to), this needs to > > > > > > also handle NamespaceDecl. We're a downstream and discovered that > > > > > > this doesn't properly handle functions or records handled in a > > > > > > namespace. > > > > > > > > > > > > It can be implemented identically to TranslationUnitDecl. > > > > > Wait, what? We shouldn't be doing this for TranslationUnitDecl > > > > > either. I don't even know how we're "using" a TranslationUnitDecl, > > > > > but neither this case not the case for `NamespaceDecl` should be > > > > > recursively using every declaration declared inside it. If there's a > > > > > declaration in a namespace that's being used, it should be getting > > > > > visited as part of the actual use of it. > > > > > > > > > > The logic for `RecordDecl` has the same problem. > > > > Despite the name, this seems to be more of a home-written ast walking > > > > class. The entry point is the 'translation unit' which seems to walk > > > > through everything in an attempt to find all the functions (including > > > > those that are 'marked' as used by an attribute). > > > > > > > > You'll see the FunctionDecl section makes this assumption as well (not > > > > necessarily that we got to a function via a call). IMO, this approach > > > > is strange, and we should register entry points in some manner > > > > (functions marked as emitted to the device in some fashion), then just > > > > follow its call-graph (via the clang::CallGraph?) to emit all of these > > > > functions. > > > > > > > > It seemed really odd to see this approach here, but it seemed well > > > > reviewed by the time I noticed it (via a downstream bug) so I figured > > > > I'd lost my chance to disagree with the approach. > > > > > > > > > > > Sure, but `visitUsedDecl` isn't the right place to be entering the walk. > > > `visitUsedDecl` is supposed to be the *callback* from the walk. If they > > > need to walk all the global declarations to find kernels instead of > > > tracking the kernels as they're encountered (which would be a *much* > > > better approach), it should be done as a separate function. > > > > > > I just missed this in the review. > > The deferred diagnostics could be initiated by non-kernel functions or even > > host functions. > > > > Let's consider a device code library where no kernels are defined. A device > > function is emitted, which calls a host device function which has a > > deferred diagnostic. All device functions that are emitted need to be > > checked. > > > > Same with host functions that are emitted, which may call a host device > > function which has deferred diagnostic. > > > > Also not just function calls need to be checked. A function address may be > > taken then called through function pointer. Therefore any reference to a > > function needs to be followed. > > > > In the case of OpenMP, the initialization of a global function pointer > > which refers a function may trigger a deferred diangostic. There are tests > > for that. > Right, I get that emitting deferred diagnostics for a declaration D needs to > trigger any deferred diagnostics in declarations used by D, recursively. You > essentially have a graph of lazily-emitted declarations (which may or may not > have deferred diagnostics) and a number of eagerly-emitted "root" > declarations with use-edges leading into that graph. Any declaration that's > reachable from a root will need to be emitted and so needs to have any > deferred diagnostics emitted as well. My question is why you're finding > these roots with a retroactive walk of the entire translation unit instead of > either building a list of roots as you go or (better yet) building a list of > lazily-emitted declarations that are used by those roots. You can > unambiguously identify at the point of declaration whether an entity will be > eagerly or lazily emitted, right? If you just store those initial edges into > the lazily-emitted declarations graph and then initiate the recursive walk > from them at the end of the translation unit, you'll only end up walking > declarations that are actually relevant to your compilation, so you'll have > much better locality and (if this matters to you) you'll naturally work a lot > better with PCH and modules. I will try the approach you suggested. Basically I will record the emitted functions and variables during parsing and use them as starting point for the final traversal. This should work for CUDA/HIP. However it may be tricky for OpenMP since the emission of some entities depending on pragmas. Still it may be doable. If I encounter difficulty I will come back for discussion. I will post the change for review. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70172/new/ https://reviews.llvm.org/D70172 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits