jansvoboda11 marked 5 inline comments as done. jansvoboda11 added inline comments.
================ Comment at: clang/include/clang/Serialization/ASTWriter.h:449-452 + /// Exclusive prefix sum of the lengths of preceding non-affecting inputs. + std::vector<SourceLocation::UIntTy> NonAffectingInputOffsetAdjustments; + /// Exclusive prefix sum of the count of preceding non-affecting inputs. + std::vector<unsigned> NonAffectingInputFileIDAdjustments; ---------------- dexonsmith wrote: > Can you collect a histogram for how big these vectors are? Can we avoid > pointer chasing in the common case by making them `SmallVector` of some size > during lookup? > Usually 4-6 elements. Making them a `SmallVector<T, 8>` didn't affect performance, though. ================ Comment at: clang/lib/Serialization/ASTWriter.cpp:5282-5283 + +SourceLocation::UIntTy +ASTWriter::getAdjustment(SourceLocation::UIntTy Offset) const { + if (PP->getSourceManager().isLoadedOffset(Offset) || ---------------- dexonsmith wrote: > How often does `getAdjustment()` return the same answer in consecutive calls? > If at all common, this would likely benefit from a peephole: > ``` > lang=c++ > Optional<SLocRange> ASTWriter::CachedAdjustmentRange; > Optional<UIntTy> ASTWriter::CachedAdjustment; > > SourceLocation::UIntTy > ASTWriter::getAdjustment(SourceLocation::UIntTy Offset) const { > // Check for 0. > // > // How fast is "isLoadedOffset()"? Can/should we add a peephole, or is it > just bit > // manipulation? (I seem to remember it checking the high bit or something, > but if > // it's doing some sort of look up, maybe it should be in the slow path so > it can > // get cached by LastAdjustment.) > if (PP->getSourceManager().isLoadedOffset(Offset) || > NonAffectingInputs.empty()) > return 0; > > // Check CachedAdjustment. > if (CachedAdjustment && CachedAdjustmentRange->includes(Offset)) > return *CachedAdjustment; > > // Call getAdjustmentSlow, which updates CachedAdjustment and > CachedAdjustmentRange. > // It's out-of-line so that `getAdjustment` can easily be inlined without > inlining > // the slow path. > // > // LastAdjustmentRange would be the size of the "gap" between this > adjustment > // level and the next one (end would be UINTMAX if it's after the last > // non-affecting range). > return getAdjustmentSlow(Offset); > } > ``` > Not that often, see my top-level comment. ================ Comment at: clang/lib/Serialization/ASTWriter.cpp:5289-5290 + ? NonAffectingInputs.end() + : llvm::lower_bound(NonAffectingInputs, + PP->getSourceManager().getFileID(Offset)); + unsigned Idx = std::distance(NonAffectingInputs.begin(), It); ---------------- jansvoboda11 wrote: > dexonsmith wrote: > > Why do you need to call `getFileID()` here? > > > > Instead, I would expect this to be a search through a range of offsets > > (e.g., see my suggestion at https://reviews.llvm.org/D106876#3869247 -- > > `DroppedMMs` contains SourceLocations, not FileIDs). > > > > Two benefits: > > 1. You don't need to call `getFileID()` to look up an offset. > > 2. You can merge adjacent non-affecting files (shrinking the search/storage > > significantly). > > > > > My reasoning was that if we search through a range of offsets, we're doing > conceptually the same thing as `getFileID()` (which already has some > optimizations baked in). Maybe the non-affecting files are indeed adjacent > and we'll be able to merge most of them. I'll give it a shot and report back. This ended up being faster due to merging of non-affecting files. Thanks for the suggestion! ================ Comment at: clang/test/Modules/non-affecting-module-maps-source-locations.m:32 + +// RUN: %clang_cc1 -I %t/first -I %t/second -I %t/third -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/cache %t/tu.m -o %t/tu.o ---------------- dexonsmith wrote: > dexonsmith wrote: > > This is exercising the code, but it could do one better and check if the > > output PCMs are bit-for-bit identical when we (now) expect them to be. > > > > Maybe you could do this by having two run lines: one that includes `-I > > %t/second` and another that doesn't. Then check if the output PCMs are > > equal. > (Or if the PCM isn't bit-for-bit identical yet, maybe at least the AST block > should be...) Yes, I'll probably drop this test entirely and just check the PCM files are bit-for-bit identical when a non-affecting file is not loaded at all. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136624/new/ https://reviews.llvm.org/D136624 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits