sammccall added a comment. In D136723#3915853 <https://reviews.llvm.org/D136723#3915853>, @maryammo wrote:
> This commit causes build failure on > `https://lab.llvm.org/buildbot/#/builders/121/builds/24947` : > I was able to reproduce the failure and by reverting this commit locally it > passed, can you please take a look? @sammccall It looks like this was already fixed in https://github.com/llvm/llvm-project/commit/d19ba74dee0b9ab553bd8a6ef5b67ff349f4bf13 ================ Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Record.h:81 + llvm::StringMap<llvm::SmallVector<unsigned>> BySpelling; + llvm::DenseMap<const FileEntry *, llvm::SmallVector<unsigned>> ByFile; + } Includes; ---------------- kadircet wrote: > it feels like having a set of includes corresponding to the same fileentry, > rather than a single one, feels subtle enough to deserve a comment. > > the only case i can think of is, two logically different files resolving to > same physical file (i.e. symlinks). these will actually have completely > different spellings, and probably it'll make things hard when deciding which > include to "keep" or "insert". are there other cases where we can have > multiple includes corresponding to the same fileentry? Sure: ``` #include "foo.h" #include "bar.h" #include "foo.h" ``` it isn't useful, but people write it (by mistake). Maybe we want to clean it up, maybe we don't, but I don't see much reason to model it wrong on purpose - it doesn't really simplify anything, and may further confuse a confusing situation. I can add a comment - what do you want it to say? "There may be more than one #include of the same file" just seems to restate the obvious interpretation of the type here. (And this is the header, so it's not an ideal place to dump a bunch of implementation comments) ================ Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Record.h:54 +struct RecordedPP { + // The callback (when installed into clang) tracks macros/includes in this. + std::unique_ptr<PPCallbacks> record(const Preprocessor &PP); ---------------- kadircet wrote: > s/this/the main file/ "this" as in the current object, as the data flow here isn't obvious. Changed to `*this`, clearer? ================ Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Record.h:75 + // - for a logical file like <vector>, we check Spelled + llvm::SmallVector<const Include *> match(Header H) const; + ---------------- kadircet wrote: > hokein wrote: > > sammccall wrote: > > > in the prototype I reimplemented this function in clangd, but I expect we > > > can just reuse the RecordedIncludes class instead, copying from clangd's > > > includes list is cheap. > > > > > > (That's one argument for putting this in a different header, which I can > > > do already if you prefer) > > that's an interesting idea, but we need to do it carefully because of > > `FileEntry*`, the `Include` structure has a filed of `FileEntry*`, it is > > not feasible for clangd to propagate it (clangd's `Inclusion` doesn't have > > it, instead it uses a `HeaderID` which is based on the `fs::UniqueID` to > > identify a physical file). > > > > But I think this is not something we should worry about at the moment, and > > the current interfaces look quite good. > right, let's keep it to the minimum now. we can extend/move-around later on Good point. We could give Include a UniqueID instead I think. Let's defer this though. ================ Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Record.h:75 + // - for a logical file like <vector>, we check Spelled + llvm::SmallVector<const Include *> match(Header H) const; + ---------------- sammccall wrote: > kadircet wrote: > > hokein wrote: > > > sammccall wrote: > > > > in the prototype I reimplemented this function in clangd, but I expect > > > > we can just reuse the RecordedIncludes class instead, copying from > > > > clangd's includes list is cheap. > > > > > > > > (That's one argument for putting this in a different header, which I > > > > can do already if you prefer) > > > that's an interesting idea, but we need to do it carefully because of > > > `FileEntry*`, the `Include` structure has a filed of `FileEntry*`, it is > > > not feasible for clangd to propagate it (clangd's `Inclusion` doesn't > > > have it, instead it uses a `HeaderID` which is based on the > > > `fs::UniqueID` to identify a physical file). > > > > > > But I think this is not something we should worry about at the moment, > > > and the current interfaces look quite good. > > right, let's keep it to the minimum now. we can extend/move-around later on > Good point. We could give Include a UniqueID instead I think. Let's defer > this though. > that's an interesting idea, but we need to do it carefully because of > `FileEntry*`, the `Include` structure has a filed of `FileEntry*`, it is not > feasible for clangd to propagate it (clangd's `Inclusion` doesn't have it, > instead it uses a `HeaderID` which is based on the `fs::UniqueID` to identify > a physical file). > > But I think this is not something we should worry about at the moment, and > the current interfaces look quite good. ================ Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Types.h:52 struct Symbol { enum Kind { // A canonical clang declaration. ---------------- hokein wrote: > probably worth adding a comment: the order of the enumerators must match > the the order of template arguments in Storage. Done in the base patch. D136710 ================ Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Types.h:117 + llvm::StringRef Spelled; // e.g. vector + const FileEntry *Resolved = nullptr; // e.g. /path/to/c++/v1/vector + SourceLocation HashLocation; // of hash in #include <vector> ---------------- kadircet wrote: > i think it's worth leaving a comment here mentioning that this can be null > (couldn't resolve, or maybe we shouldn't be caring, e.g. logical stl include). > > even better, maybe we should actually have a `Header` here? that way this > would be conceptual equivalent of SymbolReference, but for includes? Done (comment.) > maybe we should actually have a Header here A `Header` is conceptually something like a `Matcher<Include>`. If we put Header here then we're communicating what the reference is going to look like: `#include "foo.h"` is going to satisfy a symbol usage in `/path/to/foo.h` or one in a file with `// IWYU pragma private: include "foo.h"`, which we don't know. (And it could be both, which would no longer be expressible). Technically you could store an Header that *always* has a FileEntry, but that just seems like obfuscation. ================ Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Types.h:41 +struct Macro { + IdentifierInfo *Name; + // The location of the Name where the macro is defined. ---------------- kadircet wrote: > what about just a stringref to name? That's twice as big, and this is already the critical path for sizeof(SymbolReference) which me can accumulate a lot of. I don't think it's critical, but I also don't see why StringRef is better. (Less indirection vs carries some semantics + lifetime information) ================ Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Types.h:46 + bool operator==(const Macro &S) const { + return Name == S.Name && Definition == S.Definition; + } ---------------- kadircet wrote: > OOC: can we really have multiple macro identifiers defined at the same source > location? I guess it's cheap to compare another pointer, but might be easier > if we actually didn't make the identifierinfo part of the identity. Probably not? Removed from comparison. (I think it's *simpler* to include it in the comparison, but it doesn't matter much) ================ Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Types.h:118 + const FileEntry *Resolved = nullptr; // e.g. /path/to/c++/v1/vector + SourceLocation Location; // of hash in #include <vector> + unsigned Line = 0; // 1-based line number for #include ---------------- hokein wrote: > nit: HashLoc seems clearer than `Location`. > > And it looks like `Location` and `Line` feels somewhat redundant -- we can > get the Line loc from the `Location` with `SourceManager`. But I think it is > fair to hold both, Line is an important property of the include, which will > probably widely used. Agree, we could one of these but it's a bit sad: we need SourceLocation to diagnose things, but Line is much more convenient the rest of the time. Renamed, leaving this open to see what Kadir thinks too. ================ Comment at: clang-tools-extra/include-cleaner/lib/Record.cpp:20 +class PPRecorder : public PPCallbacks { +public: ---------------- hokein wrote: > this should be hidden in anonymous namespace, right? Right! This used to be friended. ================ Comment at: clang-tools-extra/include-cleaner/lib/Record.cpp:139 + case Header::Physical: + for (unsigned I : ByFile.lookup(H.physical())) + Result.push_back(&All[I]); ---------------- kadircet wrote: > we might have different handles (fileentry*) to the same file (if we end up > having multiple filemanagers in action, which is unlikely as things stand > today, but still). so what about using uniqueids for comparison here? > > AFAICT, filemanager won't have multiple `fileenrty`s with the same `uniqueid` > anyway. so we wouldn't really lose any generality. This is approximately the same issue as above (`> that's an interesting idea, but we need to do it carefully because of `FileEntry*`). I agree and it doesn't really add code complexity but it does add subtlety, especially if we use FileEntry in the data structure and UniqueID for matching. So I think deferring *all* the multi-file-manager concerns makes sense (or handling them all now if you prefer) ================ Comment at: clang-tools-extra/include-cleaner/lib/Record.cpp:28 + FileID PrevFID) override { + Active = SM.isWrittenInMainFile(Loc); + } ---------------- kadircet wrote: > we can keep a include depth based on the `Reason` and bail out on other > operations when depth is not `1` (getFileID on Loc here probably hits the > SourceManager cache anyways, so might not be important, but i feel like it's > better to not rely on that). That sounds complicated, wait until it shows up in a profile? The conceptual stack definitely isn't properly maintained in all situations. e.g. a normal parse leaks 1 entry for... builtins or something, and clangd's parse with preamble leaks 3. I'm not specifically sure there's a problem in this scenario, but it's *not* a trivial thing that obviously can't go wrong. (& I think if we're afraid to rely on the cache for the file we're *currently lexing* then we should seriously consider fixing the cache instead) ================ Comment at: clang-tools-extra/include-cleaner/unittests/RecordTest.cpp:122 +// Matches an Include with a particular spelling. +MATCHER_P(spelled, S, "") { return arg.Spelled == S; } + ---------------- kadircet wrote: > i know these look more like functions, but convention is to use `PascalCase` > for matchers, so `Spelled` instead? These are functions - I don't think there's a convention here, just confusion and bugs. OK to fix `named` intead? ================ Comment at: clang-tools-extra/include-cleaner/unittests/RecordTest.cpp:186 + Header.point("def")); + ASSERT_NE(Recorded.MacroReferences.size(), 0u); + Symbol OrigX = Recorded.MacroReferences.front().Symbol; ---------------- kadircet wrote: > nit: `ASSERT_FALSE(Recorded.MacroReferences.empty())` Why? This means we get "expected empty() is false, got true" instead of "expected size() is 0, got 7", which seems strictly worse... Added a somewhat-useful printer for SymbolReference and switched to IsEmpty() instead. ================ Comment at: clang-tools-extra/include-cleaner/unittests/RecordTest.cpp:180 + Inputs.ExtraFiles["header.h"] = Header.code(); + Inputs.ErrorOK = true; // missing header + auto AST = build(); ---------------- hokein wrote: > IIUC, ErrorOK is irrelevant, and should be removed. Hmm, it's not irrelevant, it means I get to make lots of mistakes in my examples :-) Removed. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136723/new/ https://reviews.llvm.org/D136723 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits