dexonsmith added a comment. Just a few comments on implementation details. The only high-level piece to call out is that I wonder if `NumIncludes` could/should be simplified (semantically) to a Boolean in a prep commit.
================ Comment at: clang/include/clang/Lex/Preprocessor.h:720-723 + /// Efficient map with FileEntry keys. + template <typename T> class FileEntryMap { + /// The underlying storage. Entries are indexed by FileEntry UID. + std::vector<T> UIDToValue; ---------------- I'm not sure about this choice. UIDs are unlikely to be adjacent and in SubmoduleIncludeState, since a given submodule is unlikely to have "most" files. Also, memory usage characteristics could be "weird" if the FileManager is being reused between different compilations. `DenseMap<unsigned, unsigned>` will have the same number of allocations (i.e., just 1). If UIDs really are dense and near each other in one of the submodules then that map will be a little bigger than necessary (~2x), but it should be better for the rest of them. ================ Comment at: clang/lib/Lex/Preprocessor.cpp:1323 + auto &ModNumIncludes = SubmoduleIncludeStates[M].NumIncludes; + for (unsigned UID = 0; UID <= ModNumIncludes.maxUID(); ++UID) { + CurSubmoduleIncludeState->NumIncludes[UID] += ModNumIncludes[UID]; ---------------- jansvoboda11 wrote: > Iterating over all FileEntries is probably not very efficient, as Volodymyr > mentioned. Thinking about how to make this more efficient... My suggestion above to drop FileEntryMap in favour of a simple DenseMap would help a bit, just iterating through the files actually included by the submodules. Further, I wonder if "num-includes"/file/submodule (`unsigned`) is actually useful, vs. "was-included"/file/submodule (`bool`). The only observer I see is `HeaderSearch::PrintStats()` but maybe I missed something? If I'm right and we can switch to `bool`, then NumIncludes becomes a `DenseSet<FileEntry *> IncludedFiles` (or `DenseSet<unsigned>` for UIDs). (BTW, I also wondered if you could rotate the map, using File as the outer key, and then use bitsets for the sbumodules, but I doubt this is better, since most files are only included by a few submodules, right?) Then you can just do a set union here. Also simplifies bitcode serialization. (If a `bool`/set is sufficient, then I'd suggest landing that first/separately, before adding the per-submodule granularity in this patch.) ================ Comment at: clang/lib/Serialization/ASTWriter.cpp:2392-2393 + + // TODO: Use plain vector, submodule IDs are small and dense. + std::map<unsigned, std::map<StringRef, unsigned>> ToBeSerialized; + ---------------- A vector of maps would be an improvement, but that'll still be a lot of allocations. - Since insertion/lookup/deletion aren't intermingled, the simplest way to avoid adding unnecessary overhead is a sorted vector (https://llvm.org/docs/ProgrammersManual.html#dss-sortedvectormap). - With no lookups (at all), there's no benefit to a tiered data structure (vs flat). Leading me toward a simple flat vector + sort. ``` lang=c++ struct IncludeToSerialize { // Probably more straightforward than a std::tuple... StringRef Filename; unsigned SMID; unsigned NumIncludes; bool operator<(const IncludeToSerialize &RHS) const { if (SMID != RHS.SMID) return SMID < RHS.SMID; int Diff = Filename.compare(RHS.Filename); assert(Diff && "Expected unique SMID / Filename pairs"); return Diff < 0; } }; SmallVector<IncludeToSerialize> IncludesToSerialize; // ... IncludesToSerialize.push_back({Filename, LocalSMID, NumIncludes}); // ... llvm::sort(IncludesToSerialize); for (const IncludeToSerialize &SI : IncludesToSerialize) { // emit record } ``` (Or if there are duplicates expected to be encountered and ignored, you can remove the assertion and use stable_sort + unique + erase.) ================ Comment at: clang/lib/Serialization/ASTWriter.cpp:2418 + NumIncludes}; + Stream.EmitRecordWithBlob(IncludesAbbrev, Record, Filename); + } ---------------- I wonder, will the `Filename` already be serialized elsewhere? Could an ID from that be reused here, rather than writing the filename again? (Maybe that'd need a bigger refactor of some sort to create a filename table?) Stepping back, it looks like this is always eagerly loaded. - Could it be lazily-loaded by submodule? - Could it be lazily-loaded by filename? In the former case, seems like a single record per submodule makes sense, with a single blob that can be decoded on-demand. In the latter case, maybe it should be rotated, and stored a single record per filename as a blob that can be lazily decoded: ``` <filename-size> <filename> <num-submodules> (<smid> <num-includes>)+ ``` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112915/new/ https://reviews.llvm.org/D112915 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits