[PATCH] D151938: [clang][index] NFCI: Make `CXFile` a `FileEntryRef`

2023-09-13 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

In D151938#4645381, @Jake-Egan wrote:

> @jansvoboda11 Actually, we could give you access to an AIX machine to debug 
> this. Would that work for you?

Hi, I'd be happy to try that!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151938/new/

https://reviews.llvm.org/D151938

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D151938: [clang][index] NFCI: Make `CXFile` a `FileEntryRef`

2023-09-10 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

In D151938#4626992, @Jake-Egan wrote:

> Sorry for the late response @jansvoboda11, when `lookupModuleFile` is called, 
> the file 
> `build/tools/clang/test/Index/Core/Output/index-with-module.m.tmp.mcp/A03A61VI43WA/ModA-21USRMHJNU3PG.pcm`
>  doesn't exist. So it seems there's some issue with the creation of the file.
>
> The backtrace is:
>
>   ...

@Jake-Egan Thank you for the stack trace and XFAILing the test! Other tests 
started failing after converting more `FileEntry` usages to `FileEntryRef`: 
https://lab.llvm.org/buildbot/#/builders/214/builds/9416

There are definitely some behavioral changes these patches introduce, but I 
don't see anything obviously wrong. What's more concerning, though, is that 
`llvm::expectedToOptional()`/`llvm::cantFail()` end up in `llvm_unreachable()`. 
Are you able to investigate? If not, could you provide me a way to reproduce 
this? These failures only occur on the AIX bots AFAIK.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151938/new/

https://reviews.llvm.org/D151938



[PATCH] D137213: [clang][modules] NFCI: Pragma diagnostic mappings: write/read FileID instead of SourceLocation

2023-08-28 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a subscriber: rsmith.
jansvoboda11 added inline comments.



Comment at: clang/lib/Serialization/ASTReader.cpp:6343
  "Invalid data, missing pragma diagnostic states");
-  SourceLocation Loc = ReadSourceLocation(F, Record[Idx++]);
-  auto IDAndOffset = SourceMgr.getDecomposedLoc(Loc);
-  assert(IDAndOffset.first.isValid() && "invalid FileID for transition");
-  assert(IDAndOffset.second == 0 && "not a start location for a FileID");
+  FileID FID = ReadFileID(F, Record, Idx);
+  assert(FID.isValid() && "invalid FileID for transition");

dexonsmith wrote:
> alexfh wrote:
> > alexfh wrote:
> > > dexonsmith wrote:
> > > > alexfh wrote:
> > > > > alexfh wrote:
> > > > > > jansvoboda11 wrote:
> > > > > > > alexfh wrote:
> > > > > > > > dexonsmith wrote:
> > > > > > > > > eaeltsin wrote:
> > > > > > > > > > This doesn't work as before, likely because ReadFileID 
> > > > > > > > > > doesn't do TranslateSourceLocation.
> > > > > > > > > > 
> > > > > > > > > > Our tests fail.
> > > > > > > > > > 
> > > > > > > > > > I tried calling TranslateSourceLocation here and the tests 
> > > > > > > > > > passed:
> > > > > > > > > > ```
> > > > > > > > > >   SourceLocation Loc = 
> > > > > > > > > > Diag.SourceMgr->getComposedLoc(FID, 0);
> > > > > > > > > >   SourceLocation Loc2 = TranslateSourceLocation(F, Loc);
> > > > > > > > > >   auto IDAndOffset = SourceMgr.getDecomposedLoc(Loc2);
> > > > > > > > > > 
> > > > > > > > > >   // Note that we don't need to set up 
> > > > > > > > > > Parent/ParentOffset here, because
> > > > > > > > > >   // we won't be changing the diagnostic state within 
> > > > > > > > > > imported FileIDs
> > > > > > > > > >   // (other than perhaps appending to the main source 
> > > > > > > > > > file, which has no
> > > > > > > > > >   // parent).
> > > > > > > > > >   auto &F = 
> > > > > > > > > > Diag.DiagStatesByLoc.Files[IDAndOffset.first];
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > Sorry I don't know the codebase, so this fix is definitely 
> > > > > > > > > > ugly :) But it shows the problem.
> > > > > > > > > > 
> > > > > > > > > > I don't think that's the issue, since `ReadFileID()` calls 
> > > > > > > > > > `TranslateFileID`, which seems like it should be 
> > > > > > > > > > equivalent.
> > > > > > > > > 
> > > > > > > > > However, I notice that the post-increment for `Idx` got 
> > > > > > > > > dropped! Can you try replacing the line of code with the 
> > > > > > > > > following and see if that fixes your tests (without any extra 
> > > > > > > > > TranslateSourceLocation logic)?
> > > > > > > > > ```
> > > > > > > > > lang=c++
> > > > > > > > > FileID FID = ReadFileID(F, Record, Idx++);
> > > > > > > > > ```
> > > > > > > > > 
> > > > > > > > > If so, maybe you can contribute that fix with a reduced 
> > > > > > > > > testcase? I suggest adding me, @vsapsai, @Bigcheese, and 
> > > > > > > > > @jansvoboda11 as reviewers.
> > > > > > > > > 
> > > > > > > > > @alexfh, maybe you can check if this fixes your tests as well?
> > > > > > > > > 
> > > > > > > > > (If this is the issue, it's a bit surprising we don't have 
> > > > > > > > > existing tests covering this case... and embarrassing I 
> > > > > > > > > missed it when reviewing initially!)
> > > > > > > > I've noticed the dropped `Idx` post-increment as well, but I 
> > > > > > > > went a step further and looked at the `ReadFileID` 
> > > > > > > > implementation, which is actually doing a post-increment 
> > > > > > > > itself, and accepts `Idx` by reference:
> > > > > > > > ```
> > > > > > > >   FileID ReadFileID(ModuleFile &F, const RecordDataImpl &Record,
> > > > > > > > unsigned &Idx) const {
> > > > > > > > return TranslateFileID(F, FileID::get(Record[Idx++]));
> > > > > > > >   }
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > Thus, it seems to be correct. But what @eaeltsin  has found is 
> > > > > > > > actually a problem for us.  I'm currently trying to make an 
> > > > > > > > isolated test case, but it's quite tricky (as header modules 
> > > > > > > > are =\). It may be the case that our build setup relies on 
> > > > > > > > > something clang doesn't explicitly promise, but the fact is 
> > > > > > > > that the behavior (as observed by our build setup) has changed. 
> > > > > > > > I'll try to revert the commit for now to get us unblocked and 
> > > > > > > > provide a test case as soon as I can.
> > > > > > > Thanks for helping out @dexonsmith, we did have the week off.
> > > > > > > 
> > > > > > > > @eaeltsin @alexfh, are you able to provide the failing test 
> > > > > > > case? I'm happy to look into it and re-land this with a fix.
> > > > > > I've spent multiple hours trying to extract an observable test 
> > > > > > > case. It turned out to be too hairy of a yak to shave: for each 
> > > > > > compilation a separate sandboxed environment is created
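
The point made above about `ReadFileID` taking `Idx` by reference can be illustrated with a minimal, self-contained sketch. The types and the `readTwoFields` helper below are hypothetical stand-ins, not the clang declarations. Because the helper advances `Idx` itself, the caller must not increment it again; in fact `Idx++` is a prvalue and would not bind to the non-const reference parameter at all.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for RecordDataImpl and FileID; not the clang types.
using RecordData = std::vector<std::uint64_t>;
struct FileID {
  int ID;
};

// Same shape as the quoted ReadFileID: Idx is taken by reference and advanced
// inside the helper, so the caller's index moves past the consumed field.
FileID readFileID(const RecordData &Record, unsigned &Idx) {
  assert(Idx < Record.size() && "record index out of range");
  return FileID{static_cast<int>(Record[Idx++])};
}

// Reads two consecutive fields and returns the index of the next unread one.
unsigned readTwoFields(const RecordData &Record, FileID &First,
                       FileID &Second) {
  unsigned Idx = 0;
  First = readFileID(Record, Idx);  // Idx is now 1
  Second = readFileID(Record, Idx); // Idx is now 2
  return Idx;
}
```

With a record of `{7, 9, 11}`, the two calls yield IDs 7 and 9 and leave the index at 2, with no explicit increment at the call site.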

[PATCH] D158572: [clang][modules] Use relative offsets for input files

2023-08-24 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

@phosek BTW can you confirm whether these started failing with this patch or 
with D158573?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158572/new/

https://reviews.llvm.org/D158572



[PATCH] D158572: [clang][modules] Use relative offsets for input files

2023-08-24 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

In D158572#4615198, @phosek wrote:

> After this change, all libc++ `clang_modules_include.gen.py` tests started 
> failing on our Linux builders:
>
>   ...
>
> You can see the full output at 
> https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-arm64/b8771862804321535569/test-results.
>  Would it be possible to revert this change?

Thanks for notifying me. Reverting now...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158572/new/

https://reviews.llvm.org/D158572



[PATCH] D158573: [clang][modules] Move `UNHASHED_CONTROL_BLOCK` up in the AST file

2023-08-24 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
jansvoboda11 marked an inline comment as done.
Closed by commit rG7d1565727dad: [clang][modules] Move `UNHASHED_CONTROL_BLOCK` 
up in the AST file (authored by jansvoboda11).

Changed prior to commit:
  https://reviews.llvm.org/D158573?vs=552897&id=553156#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158573/new/

https://reviews.llvm.org/D158573

Files:
  clang/include/clang/Basic/Module.h
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/include/clang/Serialization/ASTWriter.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Serialization/GlobalModuleIndex.cpp
  clang/test/Modules/ASTSignature.c

Index: clang/test/Modules/ASTSignature.c
===
--- clang/test/Modules/ASTSignature.c
+++ clang/test/Modules/ASTSignature.c
@@ -1,6 +1,6 @@
 // RUN: rm -rf %t
-// RUN: %clang_cc1 -iquote %S/Inputs/ASTHash/ -fsyntax-only -fmodules \
-// RUN:   -fimplicit-module-maps -fmodules-strict-context-hash \
+// RUN: %clang_cc1 -iquote %S/Inputs/ASTHash/ -fsyntax-only \
+// RUN:   -fmodules -fimplicit-module-maps -fmodules-strict-context-hash \
 // RUN:   -fmodules-cache-path=%t -fdisable-module-hash %s
 // RUN: cp %t/MyHeader2.pcm %t1.pcm
 // RUN: rm -rf %t
@@ -8,17 +8,18 @@
 // RUN:   -fmodules -fimplicit-module-maps -fmodules-strict-context-hash \
 // RUN:   -fmodules-cache-path=%t -fdisable-module-hash %s
 // RUN: cp %t/MyHeader2.pcm %t2.pcm
-// RUN: llvm-bcanalyzer --dump --disable-histogram %t1.pcm > %t1.dump
-// RUN: llvm-bcanalyzer --dump --disable-histogram %t2.pcm > %t2.dump
+// RUN: llvm-bcanalyzer --dump --disable-histogram --show-binary-blobs %t1.pcm > %t1.dump
+// RUN: llvm-bcanalyzer --dump --disable-histogram --show-binary-blobs %t2.pcm > %t2.dump
 // RUN: cat %t1.dump %t2.dump | FileCheck %s
 
 #include "my_header_2.h"
 
 my_int var = 42;
 
-// CHECK: [[AST_BLOCK_HASH:]]
-// CHECK: [[SIGNATURE:]]
-// CHECK: [[AST_BLOCK_HASH]]
-// CHECK-NOT: [[SIGNATURE]]
-// The modules built by this test are designed to yield the same AST. If this
-// test fails, it means that the AST block is has become non-relocatable.
+// CHECK:  blob data = '[[AST_BLOCK_HASH:.*]]'
+// CHECK:  blob data = '[[SIGNATURE:.*]]'
+// CHECK:  blob data = '[[AST_BLOCK_HASH]]'
+// CHECK-NOT:  blob data = '[[SIGNATURE]]'
+// The modules built by this test are designed to yield the same AST but distinct AST files.
+// If this test fails, it means that either the AST block has become non-relocatable,
+// or the file signature stopped hashing some parts of the AST file.
Index: clang/lib/Serialization/GlobalModuleIndex.cpp
===
--- clang/lib/Serialization/GlobalModuleIndex.cpp
+++ clang/lib/Serialization/GlobalModuleIndex.cpp
@@ -697,9 +697,12 @@
 }
 
 // Get Signature.
-if (State == DiagnosticOptionsBlock && Code == SIGNATURE)
-  getModuleFileInfo(File).Signature = ASTFileSignature::create(
-  Record.begin(), Record.begin() + ASTFileSignature::size);
+if (State == DiagnosticOptionsBlock && Code == SIGNATURE) {
+  auto Signature = ASTFileSignature::create(Blob.begin(), Blob.end());
+  assert(Signature != ASTFileSignature::createDummy() &&
+ "Dummy AST file signature not backpatched in ASTWriter.");
+  getModuleFileInfo(File).Signature = Signature;
+}
 
 // We don't care about this record.
   }
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1120,50 +1120,97 @@
 }
 
 std::pair
-ASTWriter::createSignature(StringRef AllBytes, StringRef ASTBlockBytes) {
+ASTWriter::createSignature() const {
+  StringRef AllBytes(Buffer.data(), Buffer.size());
+
   llvm::SHA1 Hasher;
-  Hasher.update(ASTBlockBytes);
+  Hasher.update(AllBytes.slice(ASTBlockRange.first, ASTBlockRange.second));
   ASTFileSignature ASTBlockHash = ASTFileSignature::create(Hasher.result());
 
-  // Add the remaining bytes (i.e. bytes before the unhashed control block that
-  // are not part of the AST block).
-  Hasher.update(
-  AllBytes.take_front(ASTBlockBytes.bytes_end() - AllBytes.bytes_begin()));
+  // Add the remaining bytes:
+  //  1. Before the unhashed control block.
+  Hasher.update(AllBytes.slice(0, UnhashedControlBlockRange.first));
+  //  2. Between the unhashed control block and the AST block.
   Hasher.update(
-  AllBytes.take_back(AllBytes.bytes_end() - ASTBlockBytes.bytes_end()));
+  AllBytes.slice(UnhashedControlBlockRange.second, ASTBlockRange.first));
+  //  3. After the AST block.
+  Hasher.update(AllBytes.slice(ASTBlockRange.second, StringRef::npos));
   ASTFileSignature Signatu

[PATCH] D158572: [clang][modules] Use relative offsets for input files

2023-08-24 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGb9d78bdc730b: [clang][modules] Use relative offsets for 
input files (authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158572/new/

https://reviews.llvm.org/D158572

Files:
  clang/include/clang/Serialization/ModuleFile.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp


Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1570,6 +1570,8 @@
   IFHAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 32));
   unsigned IFHAbbrevCode = Stream.EmitAbbrev(std::move(IFHAbbrev));
 
+  uint64_t InputFilesOffsetBase = Stream.GetCurrentBitNo();
+
   // Get all ContentCache objects for files.
   std::vector UserFiles;
   std::vector SystemFiles;
@@ -1633,7 +1635,7 @@
   continue; // already recorded this file.
 
 // Record this entry's offset.
-InputFileOffsets.push_back(Stream.GetCurrentBitNo());
+InputFileOffsets.push_back(Stream.GetCurrentBitNo() - 
InputFilesOffsetBase);
 
 InputFileID = InputFileOffsets.size();
 
Index: clang/lib/Serialization/ASTReader.cpp
===
--- clang/lib/Serialization/ASTReader.cpp
+++ clang/lib/Serialization/ASTReader.cpp
@@ -2326,7 +2326,8 @@
   // Go find this input file.
   BitstreamCursor &Cursor = F.InputFilesCursor;
   SavedStreamPosition SavedPosition(Cursor);
-  if (llvm::Error Err = Cursor.JumpToBit(F.InputFileOffsets[ID - 1])) {
+  if (llvm::Error Err = Cursor.JumpToBit(F.InputFilesOffsetBase +
+ F.InputFileOffsets[ID - 1])) {
 // FIXME this drops errors on the floor.
 consumeError(std::move(Err));
   }
@@ -2410,7 +2411,8 @@
   // Go find this input file.
   BitstreamCursor &Cursor = F.InputFilesCursor;
   SavedStreamPosition SavedPosition(Cursor);
-  if (llvm::Error Err = Cursor.JumpToBit(F.InputFileOffsets[ID - 1])) {
+  if (llvm::Error Err = Cursor.JumpToBit(F.InputFilesOffsetBase +
+ F.InputFileOffsets[ID - 1])) {
 // FIXME this drops errors on the floor.
 consumeError(std::move(Err));
   }
@@ -2788,6 +2790,7 @@
   Error("malformed block record in AST file");
   return Failure;
 }
+F.InputFilesOffsetBase = F.InputFilesCursor.GetCurrentBitNo();
 continue;
 
   case OPTIONS_BLOCK_ID:
@@ -5328,6 +5331,7 @@
   bool NeedsSystemInputFiles = Listener.needsSystemInputFileVisitation();
   bool NeedsImports = Listener.needsImportVisitation();
   BitstreamCursor InputFilesCursor;
+  uint64_t InputFilesOffsetBase = 0;
 
   RecordData Record;
   std::string ModuleDir;
@@ -5363,6 +5367,7 @@
 if (NeedsInputFiles &&
 ReadBlockAbbrevs(InputFilesCursor, INPUT_FILES_BLOCK_ID))
   return true;
+InputFilesOffsetBase = InputFilesCursor.GetCurrentBitNo();
 break;
 
   default:
@@ -5435,7 +5440,8 @@
 
 BitstreamCursor &Cursor = InputFilesCursor;
 SavedStreamPosition SavedPosition(Cursor);
-if (llvm::Error Err = Cursor.JumpToBit(InputFileOffs[I])) {
+if (llvm::Error Err =
+Cursor.JumpToBit(InputFilesOffsetBase + InputFileOffs[I])) {
   // FIXME this drops errors on the floor.
   consumeError(std::move(Err));
 }
Index: clang/include/clang/Serialization/ModuleFile.h
===
--- clang/include/clang/Serialization/ModuleFile.h
+++ clang/include/clang/Serialization/ModuleFile.h
@@ -245,7 +245,10 @@
   /// The cursor to the start of the input-files block.
   llvm::BitstreamCursor InputFilesCursor;
 
-  /// Offsets for all of the input file entries in the AST file.
+  /// Absolute offset of the start of the input-files block.
+  uint64_t InputFilesOffsetBase = 0;
+
+  /// Relative offsets for all of the input file entries in the AST file.
   const llvm::support::unaligned_uint64_t *InputFileOffsets = nullptr;
 
   /// The input files that have been loaded from this AST file.
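
The arithmetic in the diff above can be sketched outside of clang. This is a simplified model under stated assumptions: `BlockWriter` and `absoluteOffset` are hypothetical names, and a plain counter stands in for `Stream.GetCurrentBitNo()`. The writer stores each entry's position minus the block's base, and the reader adds back the base it observes when entering the block.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical writer model: records entry positions relative to the start of
// the block rather than to the start of the file.
struct BlockWriter {
  std::uint64_t CurrentBit = 0; // stands in for Stream.GetCurrentBitNo()
  std::uint64_t OffsetBase = 0;
  std::vector<std::uint64_t> RelativeOffsets;

  // PrecedingBits models everything emitted before the input-files block.
  void beginBlock(std::uint64_t PrecedingBits) {
    CurrentBit = PrecedingBits;
    OffsetBase = CurrentBit;
  }

  void emitEntry(std::uint64_t SizeInBits) {
    RelativeOffsets.push_back(CurrentBit - OffsetBase);
    CurrentBit += SizeInBits;
  }
};

// Reader model: IDs are 1-based, as in InputFileOffsets[ID - 1].
std::uint64_t absoluteOffset(std::uint64_t ReaderBase,
                             const std::vector<std::uint64_t> &Relative,
                             unsigned ID) {
  assert(ID >= 1 && ID <= Relative.size() && "input file ID out of range");
  return ReaderBase + Relative[ID - 1];
}
```

One consequence of this scheme is that the stored values do not depend on how many bits precede the block: two writers that start the block at different positions but emit the same entries record identical relative offsets.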



[PATCH] D158469: [clang][deps] Compute command lines and file deps on-demand

2023-08-23 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

In D158469#4612012, @benlangmuir wrote:

>> I tried that approach, but found it way too easy to keep `ModuleDeps` 
>> around, which keep scanning instances alive, and use tons of memory.
>
> It seems like the problem (too easy to keep around MD) is the same either 
> way, it's just instead of wasting memory it's leaving dangling pointers that 
> could cause UAF if they're actually used.

You're right. I have a version of this patch that stores 
`std::weak_ptr MDC` in `ModuleDeps` and asserts when you're 
about to dereference a dangling pointer. That might solve that part of the issue.
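
A minimal sketch of the `std::weak_ptr` idea, for illustration only: `Collector` and `Deps` are hypothetical stand-ins for the collector and for `ModuleDeps`, and the real patch may look quite different. The dependent object holds a non-owning handle, so it cannot keep the heavy state alive, and it asserts if that state is gone when data is requested.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the object that owns the scan state.
struct Collector {
  std::vector<std::string> FileDeps;
};

// Hypothetical stand-in for ModuleDeps: it refers to the collector without
// extending its lifetime.
struct Deps {
  std::weak_ptr<Collector> Owner;

  bool isAlive() const { return !Owner.expired(); }

  // Computes the result on demand; asserts instead of dereferencing a
  // dangling pointer if the collector has already been destroyed.
  std::vector<std::string> getFileDeps() const {
    std::shared_ptr<Collector> C = Owner.lock();
    assert(C && "collector destroyed before file deps were requested");
    return C->FileDeps;
  }
};
```

The assertion only fires in builds with assertions enabled, so this turns a silent use-after-free into a diagnosable failure during testing rather than preventing misuse outright.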

> Where were you seeing MD held too long? I wonder if there's another way to 
> fix that.

For example in `clang-scan-deps`, we accumulate all `ModuleDeps` into 
`FullDeps::Modules`. This means that at the end of the scan, that can be 
keeping up to `NumTUs` worth of `CompilerInvocations` alive.

>> Hmm, maybe we could avoid holding on to the whole CompilerInstance.
>
> This seems promising!

OK, I'll try to explore this a little further. I'm a bit concerned that 
`FileManager` (and its caches) might still be too heavy to keep around, though.

---

I guess what makes this harder to get right is the fact that `ModuleDeps` and 
`ModuleDepsGraph` live on different levels. I think that maybe restructuring 
the API a little bit could simplify things. What I have in mind is replacing 
the `DependencyConsumer` function `handleModuleDependency(ModuleDeps)` with 
`handleModuleDepsGraph(ModuleDepsGraph)`. We could then turn `ModuleDepsGraph` 
into an iterator over proxy objects representing what's currently called 
`ModuleDeps`. (Lifetime of these proxies would be clearly tied to that of 
`ModuleDepsGraph`.) I think we can assume that scanner clients are less tempted 
to "permanently" store the full `ModuleDepsGraph` (since that's wasteful due to 
cross-TU sharing), compared to `ModuleDeps` (which you probably want to store 
in some form). Clients would therefore be nudged into creating their own 
storage-friendly version of `ModuleDeps` before disposing of `ModuleDepsGraph`, 
which would force them to call `getBuildArguments()` and `getFileDeps()` on our 
`ModuleDeps` proxy before disposing of the heavy-weight graph. WDYT about this 
direction?
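
To make the proposed direction concrete, here is a rough sketch with entirely hypothetical names (`DepsGraph`, `Proxy`, `StoredDeps`, `handleGraph`); it is not the scanner API and index-based access stands in for a real iterator. The graph hands out lightweight proxies whose lifetime is tied to the graph, and the client copies what it needs into its own storage before the graph is disposed of.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical heavy-weight graph, standing in for ModuleDepsGraph.
class DepsGraph {
  struct Node {
    std::string Name;
    std::vector<std::string> Files;
  };
  std::vector<Node> Nodes;

public:
  // Lightweight view of one node; valid only while the graph is alive.
  class Proxy {
    const Node *N;

  public:
    explicit Proxy(const Node *N) : N(N) {}
    const std::string &name() const { return N->Name; }
    std::vector<std::string> getFileDeps() const { return N->Files; }
  };

  void add(std::string Name, std::vector<std::string> Files) {
    Nodes.push_back({std::move(Name), std::move(Files)});
  }

  std::size_t size() const { return Nodes.size(); }

  Proxy operator[](std::size_t I) const {
    assert(I < Nodes.size() && "node index out of range");
    return Proxy(&Nodes[I]);
  }
};

// Client-owned, storage-friendly copy that outlives the graph.
struct StoredDeps {
  std::string Name;
  std::vector<std::string> FileDeps;
};

// Hypothetical consumer callback: extracts everything it needs up front.
std::vector<StoredDeps> handleGraph(const DepsGraph &G) {
  std::vector<StoredDeps> Out;
  for (std::size_t I = 0; I != G.size(); ++I) {
    DepsGraph::Proxy P = G[I];
    Out.push_back({P.name(), P.getFileDeps()});
  }
  return Out;
}
```

In this shape the expensive queries happen inside the callback, and nothing the client retains afterwards points back into the graph.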


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158469/new/

https://reviews.llvm.org/D158469



[PATCH] D158469: [clang][deps] Compute command lines and file deps on-demand

2023-08-23 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

Hmm, maybe we could avoid holding on to the whole `CompilerInstance`. For 
generating the command-line, we only need the `CompilerInvocation`. For 
collecting file dependencies, we could hold on to the `MemoryBuffer` (and maybe 
offset to the input files block), and deserialize that on-demand, without 
keeping the whole `CompilerInstance` around.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158469/new/

https://reviews.llvm.org/D158469



[PATCH] D158469: [clang][deps] Compute command lines and file deps on-demand

2023-08-23 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

In D158469#4611923, @benlangmuir wrote:

> I find this a bit hard to understand as far as object lifetime is concerned: 
> you're storing the `ScanInstance` in `TranslationUnitDeps`, but that's a 
> layer above the actual consumer interface, which means every consumer needs 
> to understand how this lifetime is managed or it will have dangling 
> references by default.  You also seem to have ScanInstance stored twice -- 
> once explicitly and once in the graph.

Agreed, I'm not happy about the lifetime complexity here.

> What do you think of an alternate design where `ModuleDeps` has 
> `shared_ptr MDC` and `ModuleDepCollector` has 
> `shared_ptr`?  That way we don't need to explicitly expose 
> the scan instance to the consumer, it comes with the `ModuleDeps`, which then 
> keeps all the necessary memory alive?

I tried that approach, but found it way too easy to keep `ModuleDeps` around, 
which keep scanning instances alive, and use tons of memory.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158469/new/

https://reviews.llvm.org/D158469



[PATCH] D158573: [clang][modules] Move `UNHASHED_CONTROL_BLOCK` up in the AST file

2023-08-23 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 marked 5 inline comments as done.
jansvoboda11 added inline comments.



Comment at: clang/lib/Serialization/ASTWriter.cpp:1169
+  writeSignature(Sig, Out);
+  std::copy_n(Out.begin(), Out.size(), Buffer.begin() + Offset);
+};

benlangmuir wrote:
> jansvoboda11 wrote:
> > I don't feel great about removing `const` from `Buffer` and writing into it 
> > directly, circumventing `Stream`. This currently works fine, because the 
> > `Stream` in `ASTWriter` is never backed by a file (and therefore never 
> > flushed). But if that ever changes, this code is problematic. Do you think 
> > this is worth spending more time on?
> > 
> > `Stream` already has the `BackpatchWord()` function, which makes sure the 
> > underlying file is updated as well in case we're backpatching 
> > already-flushed data.
> Interesting; what was the reason for not using BackpatchWord from the start? 
> IIUC our signatures should be a multiple of 4 bytes already.
Not sure how I arrived at this. This is fixed in the latest revision.
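
The placeholder-then-backpatch pattern discussed above can be sketched in isolation. Everything here is an assumption made for illustration: FNV-1a replaces `llvm::SHA1` only to keep the example dependency-free, offsets are in bytes rather than bits, and `backpatchWord` is a hypothetical helper that does not mirror the signature of `Stream.BackpatchWord()`. The idea is to emit a dummy value, hash every byte outside the placeholder, and then overwrite the placeholder with the result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 32-bit FNV-1a over Bytes[Begin, End), continuing from state H.
std::uint32_t fnv1a(const std::vector<std::uint8_t> &Bytes, std::size_t Begin,
                    std::size_t End, std::uint32_t H = 2166136261u) {
  for (std::size_t I = Begin; I != End; ++I) {
    H ^= Bytes[I];
    H *= 16777619u;
  }
  return H;
}

// Hashes the whole buffer except [SkipBegin, SkipEnd), which holds the
// signature placeholder and therefore must not influence the signature.
std::uint32_t hashExcluding(const std::vector<std::uint8_t> &Bytes,
                            std::size_t SkipBegin, std::size_t SkipEnd) {
  assert(SkipBegin <= SkipEnd && SkipEnd <= Bytes.size());
  std::uint32_t H = fnv1a(Bytes, 0, SkipBegin);
  return fnv1a(Bytes, SkipEnd, Bytes.size(), H);
}

// Overwrites four bytes at Offset with Word in little-endian order.
void backpatchWord(std::vector<std::uint8_t> &Bytes, std::size_t Offset,
                   std::uint32_t Word) {
  assert(Offset + 4 <= Bytes.size() && "backpatch out of range");
  for (int I = 0; I != 4; ++I)
    Bytes[Offset + I] = static_cast<std::uint8_t>(Word >> (8 * I));
}
```

Because the placeholder range is excluded from the hash, recomputing the signature after backpatching gives the same value, which is what lets a reader verify the file later.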


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158573/new/

https://reviews.llvm.org/D158573



[PATCH] D158573: [clang][modules] Move `UNHASHED_CONTROL_BLOCK` up in the AST file

2023-08-23 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 552897.
jansvoboda11 added a comment.

Early return.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158573/new/

https://reviews.llvm.org/D158573

Files:
  clang/include/clang/Basic/Module.h
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/include/clang/Serialization/ASTWriter.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Serialization/GlobalModuleIndex.cpp
  clang/test/Modules/ASTSignature.c

Index: clang/test/Modules/ASTSignature.c
===
--- clang/test/Modules/ASTSignature.c
+++ clang/test/Modules/ASTSignature.c
@@ -1,6 +1,6 @@
 // RUN: rm -rf %t
-// RUN: %clang_cc1 -iquote %S/Inputs/ASTHash/ -fsyntax-only -fmodules \
-// RUN:   -fimplicit-module-maps -fmodules-strict-context-hash \
+// RUN: %clang_cc1 -iquote %S/Inputs/ASTHash/ -fsyntax-only \
+// RUN:   -fmodules -fimplicit-module-maps -fmodules-strict-context-hash \
 // RUN:   -fmodules-cache-path=%t -fdisable-module-hash %s
 // RUN: cp %t/MyHeader2.pcm %t1.pcm
 // RUN: rm -rf %t
@@ -8,17 +8,18 @@
 // RUN:   -fmodules -fimplicit-module-maps -fmodules-strict-context-hash \
 // RUN:   -fmodules-cache-path=%t -fdisable-module-hash %s
 // RUN: cp %t/MyHeader2.pcm %t2.pcm
-// RUN: llvm-bcanalyzer --dump --disable-histogram %t1.pcm > %t1.dump
-// RUN: llvm-bcanalyzer --dump --disable-histogram %t2.pcm > %t2.dump
+// RUN: llvm-bcanalyzer --dump --disable-histogram --show-binary-blobs %t1.pcm > %t1.dump
+// RUN: llvm-bcanalyzer --dump --disable-histogram --show-binary-blobs %t2.pcm > %t2.dump
 // RUN: cat %t1.dump %t2.dump | FileCheck %s
 
 #include "my_header_2.h"
 
 my_int var = 42;
 
-// CHECK: [[AST_BLOCK_HASH:]]
-// CHECK: [[SIGNATURE:]]
-// CHECK: [[AST_BLOCK_HASH]]
-// CHECK-NOT: [[SIGNATURE]]
-// The modules built by this test are designed to yield the same AST. If this
-// test fails, it means that the AST block is has become non-relocatable.
+// CHECK:  blob data = '[[AST_BLOCK_HASH:.*]]'
+// CHECK:  blob data = '[[SIGNATURE:.*]]'
+// CHECK:  blob data = '[[AST_BLOCK_HASH]]'
+// CHECK-NOT:  blob data = '[[SIGNATURE]]'
+// The modules built by this test are designed to yield the same AST but distinct AST files.
+// If this test fails, it means that either the AST block has become non-relocatable,
+// or the file signature stopped hashing some parts of the AST file.
Index: clang/lib/Serialization/GlobalModuleIndex.cpp
===
--- clang/lib/Serialization/GlobalModuleIndex.cpp
+++ clang/lib/Serialization/GlobalModuleIndex.cpp
@@ -15,6 +15,7 @@
 #include "clang/Basic/FileManager.h"
 #include "clang/Lex/HeaderSearch.h"
 #include "clang/Serialization/ASTBitCodes.h"
+#include "clang/Serialization/ASTReader.h"
 #include "clang/Serialization/ModuleFile.h"
 #include "clang/Serialization/PCHContainerOperations.h"
 #include "llvm/ADT/DenseMap.h"
@@ -697,9 +698,12 @@
 }
 
 // Get Signature.
-if (State == DiagnosticOptionsBlock && Code == SIGNATURE)
-  getModuleFileInfo(File).Signature = ASTFileSignature::create(
-  Record.begin(), Record.begin() + ASTFileSignature::size);
+if (State == DiagnosticOptionsBlock && Code == SIGNATURE) {
+  auto Signature = ASTFileSignature::create(Blob.begin(), Blob.end());
+  assert(Signature != ASTFileSignature::createDummy() &&
+ "Dummy AST file signature not backpatched in ASTWriter.");
+  getModuleFileInfo(File).Signature = Signature;
+}
 
 // We don't care about this record.
   }
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1120,50 +1120,96 @@
 }
 
 std::pair
-ASTWriter::createSignature(StringRef AllBytes, StringRef ASTBlockBytes) {
+ASTWriter::createSignature() const {
+  StringRef AllBytes(Buffer.data(), Buffer.size());
+
   llvm::SHA1 Hasher;
-  Hasher.update(ASTBlockBytes);
+  Hasher.update(AllBytes.slice(ASTBlockRange.first, ASTBlockRange.second));
   ASTFileSignature ASTBlockHash = ASTFileSignature::create(Hasher.result());
 
-  // Add the remaining bytes (i.e. bytes before the unhashed control block that
-  // are not part of the AST block).
-  Hasher.update(
-  AllBytes.take_front(ASTBlockBytes.bytes_end() - AllBytes.bytes_begin()));
+  // Add the remaining bytes:
+  //  1. Before the unhashed control block.
+  Hasher.update(AllBytes.slice(0, UnhashedControlBlockRange.first));
+  //  2. Between the unhashed control block and the AST block.
   Hasher.update(
-  AllBytes.take_back(AllBytes.bytes_end() - ASTBlockBytes.bytes_end()));
+  AllBytes.slice(UnhashedControlBlockRange.second, ASTBlockRange.first));
+  //  3. After the AST block.
+  Hasher.update(AllBytes.slice(ASTBlockRange.second, StringRef::npos));
   

[PATCH] D158573: [clang][modules] Move `UNHASHED_CONTROL_BLOCK` up in the AST file

2023-08-23 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 552896.
jansvoboda11 added a comment.

Use `Stream::BackpatchWord()` instead of manipulating `Buffer` directly.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158573/new/

https://reviews.llvm.org/D158573

Files:
  clang/include/clang/Basic/Module.h
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/include/clang/Serialization/ASTWriter.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Serialization/GlobalModuleIndex.cpp
  clang/test/Modules/ASTSignature.c

Index: clang/test/Modules/ASTSignature.c
===
--- clang/test/Modules/ASTSignature.c
+++ clang/test/Modules/ASTSignature.c
@@ -1,6 +1,6 @@
 // RUN: rm -rf %t
-// RUN: %clang_cc1 -iquote %S/Inputs/ASTHash/ -fsyntax-only -fmodules \
-// RUN:   -fimplicit-module-maps -fmodules-strict-context-hash \
+// RUN: %clang_cc1 -iquote %S/Inputs/ASTHash/ -fsyntax-only \
+// RUN:   -fmodules -fimplicit-module-maps -fmodules-strict-context-hash \
 // RUN:   -fmodules-cache-path=%t -fdisable-module-hash %s
 // RUN: cp %t/MyHeader2.pcm %t1.pcm
 // RUN: rm -rf %t
@@ -8,17 +8,18 @@
 // RUN:   -fmodules -fimplicit-module-maps -fmodules-strict-context-hash \
 // RUN:   -fmodules-cache-path=%t -fdisable-module-hash %s
 // RUN: cp %t/MyHeader2.pcm %t2.pcm
-// RUN: llvm-bcanalyzer --dump --disable-histogram %t1.pcm > %t1.dump
-// RUN: llvm-bcanalyzer --dump --disable-histogram %t2.pcm > %t2.dump
+// RUN: llvm-bcanalyzer --dump --disable-histogram --show-binary-blobs %t1.pcm > %t1.dump
+// RUN: llvm-bcanalyzer --dump --disable-histogram --show-binary-blobs %t2.pcm > %t2.dump
 // RUN: cat %t1.dump %t2.dump | FileCheck %s
 
 #include "my_header_2.h"
 
 my_int var = 42;
 
-// CHECK: [[AST_BLOCK_HASH:]]
-// CHECK: [[SIGNATURE:]]
-// CHECK: [[AST_BLOCK_HASH]]
-// CHECK-NOT: [[SIGNATURE]]
-// The modules built by this test are designed to yield the same AST. If this
-// test fails, it means that the AST block is has become non-relocatable.
+// CHECK:  blob data = '[[AST_BLOCK_HASH:.*]]'
+// CHECK:  blob data = '[[SIGNATURE:.*]]'
+// CHECK:  blob data = '[[AST_BLOCK_HASH]]'
+// CHECK-NOT:  blob data = '[[SIGNATURE]]'
+// The modules built by this test are designed to yield the same AST but distinct AST files.
+// If this test fails, it means that either the AST block has become non-relocatable,
+// or the file signature stopped hashing some parts of the AST file.
Index: clang/lib/Serialization/GlobalModuleIndex.cpp
===
--- clang/lib/Serialization/GlobalModuleIndex.cpp
+++ clang/lib/Serialization/GlobalModuleIndex.cpp
@@ -15,6 +15,7 @@
 #include "clang/Basic/FileManager.h"
 #include "clang/Lex/HeaderSearch.h"
 #include "clang/Serialization/ASTBitCodes.h"
+#include "clang/Serialization/ASTReader.h"
 #include "clang/Serialization/ModuleFile.h"
 #include "clang/Serialization/PCHContainerOperations.h"
 #include "llvm/ADT/DenseMap.h"
@@ -697,9 +698,12 @@
 }
 
 // Get Signature.
-if (State == DiagnosticOptionsBlock && Code == SIGNATURE)
-  getModuleFileInfo(File).Signature = ASTFileSignature::create(
-  Record.begin(), Record.begin() + ASTFileSignature::size);
+if (State == DiagnosticOptionsBlock && Code == SIGNATURE) {
+  auto Signature = ASTFileSignature::create(Blob.begin(), Blob.end());
+  assert(Signature != ASTFileSignature::createDummy() &&
+ "Dummy AST file signature not backpatched in ASTWriter.");
+  getModuleFileInfo(File).Signature = Signature;
+}
 
 // We don't care about this record.
   }
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1120,50 +1120,96 @@
 }
 
 std::pair
-ASTWriter::createSignature(StringRef AllBytes, StringRef ASTBlockBytes) {
+ASTWriter::createSignature() const {
+  StringRef AllBytes(Buffer.data(), Buffer.size());
+
   llvm::SHA1 Hasher;
-  Hasher.update(ASTBlockBytes);
+  Hasher.update(AllBytes.slice(ASTBlockRange.first, ASTBlockRange.second));
   ASTFileSignature ASTBlockHash = ASTFileSignature::create(Hasher.result());
 
-  // Add the remaining bytes (i.e. bytes before the unhashed control block that
-  // are not part of the AST block).
-  Hasher.update(
-  AllBytes.take_front(ASTBlockBytes.bytes_end() - AllBytes.bytes_begin()));
+  // Add the remaining bytes:
+  //  1. Before the unhashed control block.
+  Hasher.update(AllBytes.slice(0, UnhashedControlBlockRange.first));
+  //  2. Between the unhashed control block and the AST block.
   Hasher.update(
-  AllBytes.take_back(AllBytes.bytes_end() - ASTBlockBytes.bytes_end()));
+  AllBytes.slice(UnhashedControlBlockRange.second, ASTBlockRange.first));
+  //  3. After the AST block.
+  Hasher.update(

[PATCH] D158572: [clang][modules] Use relative offsets for input files

2023-08-23 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 552894.
jansvoboda11 added a comment.

Initialize absolute offsets in `ASTReader`/`ModuleFile`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158572/new/

https://reviews.llvm.org/D158572

Files:
  clang/include/clang/Serialization/ModuleFile.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp


Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1570,6 +1570,8 @@
   IFHAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 32));
   unsigned IFHAbbrevCode = Stream.EmitAbbrev(std::move(IFHAbbrev));
 
+  uint64_t InputFilesOffsetBase = Stream.GetCurrentBitNo();
+
   // Get all ContentCache objects for files.
   std::vector UserFiles;
   std::vector SystemFiles;
@@ -1633,7 +1635,7 @@
   continue; // already recorded this file.
 
 // Record this entry's offset.
-InputFileOffsets.push_back(Stream.GetCurrentBitNo());
+InputFileOffsets.push_back(Stream.GetCurrentBitNo() - InputFilesOffsetBase);
 
 InputFileID = InputFileOffsets.size();
 
Index: clang/lib/Serialization/ASTReader.cpp
===
--- clang/lib/Serialization/ASTReader.cpp
+++ clang/lib/Serialization/ASTReader.cpp
@@ -2326,7 +2326,8 @@
   // Go find this input file.
   BitstreamCursor &Cursor = F.InputFilesCursor;
   SavedStreamPosition SavedPosition(Cursor);
-  if (llvm::Error Err = Cursor.JumpToBit(F.InputFileOffsets[ID - 1])) {
+  if (llvm::Error Err = Cursor.JumpToBit(F.InputFilesOffsetBase +
+ F.InputFileOffsets[ID - 1])) {
 // FIXME this drops errors on the floor.
 consumeError(std::move(Err));
   }
@@ -2410,7 +2411,8 @@
   // Go find this input file.
   BitstreamCursor &Cursor = F.InputFilesCursor;
   SavedStreamPosition SavedPosition(Cursor);
-  if (llvm::Error Err = Cursor.JumpToBit(F.InputFileOffsets[ID - 1])) {
+  if (llvm::Error Err = Cursor.JumpToBit(F.InputFilesOffsetBase +
+ F.InputFileOffsets[ID - 1])) {
 // FIXME this drops errors on the floor.
 consumeError(std::move(Err));
   }
@@ -2788,6 +2790,7 @@
   Error("malformed block record in AST file");
   return Failure;
 }
+F.InputFilesOffsetBase = F.InputFilesCursor.GetCurrentBitNo();
 continue;
 
   case OPTIONS_BLOCK_ID:
@@ -5328,6 +5331,7 @@
   bool NeedsSystemInputFiles = Listener.needsSystemInputFileVisitation();
   bool NeedsImports = Listener.needsImportVisitation();
   BitstreamCursor InputFilesCursor;
+  uint64_t InputFilesOffsetBase = 0;
 
   RecordData Record;
   std::string ModuleDir;
@@ -5363,6 +5367,7 @@
 if (NeedsInputFiles &&
 ReadBlockAbbrevs(InputFilesCursor, INPUT_FILES_BLOCK_ID))
   return true;
+InputFilesOffsetBase = InputFilesCursor.GetCurrentBitNo();
 break;
 
   default:
@@ -5435,7 +5440,8 @@
 
 BitstreamCursor &Cursor = InputFilesCursor;
 SavedStreamPosition SavedPosition(Cursor);
-if (llvm::Error Err = Cursor.JumpToBit(InputFileOffs[I])) {
+if (llvm::Error Err =
+Cursor.JumpToBit(InputFilesOffsetBase + InputFileOffs[I])) {
   // FIXME this drops errors on the floor.
   consumeError(std::move(Err));
 }
Index: clang/include/clang/Serialization/ModuleFile.h
===
--- clang/include/clang/Serialization/ModuleFile.h
+++ clang/include/clang/Serialization/ModuleFile.h
@@ -245,7 +245,10 @@
   /// The cursor to the start of the input-files block.
   llvm::BitstreamCursor InputFilesCursor;
 
-  /// Offsets for all of the input file entries in the AST file.
+  /// Absolute offset of the start of the input-files block.
+  uint64_t InputFilesOffsetBase = 0;
+
+  /// Relative offsets for all of the input file entries in the AST file.
   const llvm::support::unaligned_uint64_t *InputFileOffsets = nullptr;
 
   /// The input files that have been loaded from this AST file.



[PATCH] D158572: [clang][modules] Use relative offsets for input files

2023-08-23 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added inline comments.



Comment at: clang/include/clang/Serialization/ModuleFile.h:249
+  /// Absolute offset of the start of the input-files block.
+  uint64_t InputFilesOffsetBase;
+

benlangmuir wrote:
> Doesn't `InputFilesCursor` already know where the input files block starts?
I think it does. We should be able to remove this and always call 
`InputFilesCursor::GetCurrentBitNo()` at the start. I implemented it this way 
because it's consistent with the existing pattern: 
`SLocEntryCursor`/`SourceManagerBlockStartOffset`, 
`MacroCursor`/`MacroOffsetsBase`, `DeclsCursor`/`DeclsBlockStartOffset`.

LMK if you feel strongly about it and I can look into getting rid of the extra 
offset base members.



Comment at: clang/lib/Serialization/ASTReader.cpp:5334
   BitstreamCursor InputFilesCursor;
+  uint64_t InputFilesOffsetBase;
 

benlangmuir wrote:
> We should initialize this to something - either 0 or maybe ~0 so it will be 
> invalid?
Good point, I'll do that along with initializing the member variable.
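
The trade-off raised here (0 versus an all-ones "invalid" value) can be sketched as below. This is only an illustration: `InputFilesState`, `InvalidOffsetBase`, `hasBase()` and `resolve()` are hypothetical names, not part of Clang, and the updated diff in this thread initializes the variable to 0 rather than to a sentinel.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sentinel: all bits set, so it cannot be mistaken for a
// plausible block start the way 0 could.
constexpr uint64_t InvalidOffsetBase = ~uint64_t(0);

// Hypothetical stand-in for the reader state that owns the offset base.
struct InputFilesState {
  uint64_t OffsetBase = InvalidOffsetBase;

  // True once the input-files block has been entered and the base recorded.
  bool hasBase() const { return OffsetBase != InvalidOffsetBase; }

  // Turn a relative offset into an absolute one; asserts if the base was
  // never set, which a default of 0 would silently allow.
  uint64_t resolve(uint64_t Relative) const {
    assert(hasBase() && "offset base used before the block was entered");
    return OffsetBase + Relative;
  }
};
```

With a default of 0, a forgotten assignment yields a wrong but valid-looking position; with the sentinel, the mistake can be caught by an assertion in debug builds.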


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158572/new/

https://reviews.llvm.org/D158572



[PATCH] D158573: [clang][modules] Move `UNHASHED_CONTROL_BLOCK` up in the AST file

2023-08-23 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

Good suggestions all around, thanks!




Comment at: clang/lib/Serialization/ASTWriter.cpp:1169
+  writeSignature(Sig, Out);
+  std::copy_n(Out.begin(), Out.size(), Buffer.begin() + Offset);
+};

I don't feel great about removing `const` from `Buffer` and writing into it 
directly, circumventing `Stream`. This currently works fine, because the 
`Stream` in `ASTWriter` is never backed by a file (and therefore never 
flushed). But if that ever changes, this code is problematic. Do you think this 
is worth spending more time on?

`Stream` already has the `BackpatchWord()` function, which makes sure the 
underlying file is updated as well in case we're backpatching already-flushed 
data.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158573/new/

https://reviews.llvm.org/D158573



[PATCH] D158469: [clang][deps] Compute command lines and file deps on-demand

2023-08-22 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 552566.
jansvoboda11 added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158469/new/

https://reviews.llvm.org/D158469

Files:
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
  clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
  clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
  clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
  clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
  clang/tools/clang-scan-deps/ClangScanDeps.cpp

Index: clang/tools/clang-scan-deps/ClangScanDeps.cpp
===
--- clang/tools/clang-scan-deps/ClangScanDeps.cpp
+++ clang/tools/clang-scan-deps/ClangScanDeps.cpp
@@ -351,14 +351,24 @@
   }
 
   void mergeDeps(ModuleDepsGraph Graph, size_t InputIndex) {
-std::unique_lock ul(Lock);
-for (const ModuleDeps &MD : Graph) {
-  auto I = Modules.find({MD.ID, 0});
-  if (I != Modules.end()) {
-I->first.InputIndex = std::min(I->first.InputIndex, InputIndex);
-continue;
+std::vector NewMDs;
+{
+  std::unique_lock ul(Lock);
+  for (const ModuleDeps &MD : Graph.MDs) {
+auto I = Modules.find({MD.ID, 0});
+if (I != Modules.end()) {
+  I->first.InputIndex = std::min(I->first.InputIndex, InputIndex);
+  continue;
+}
+auto NewIt = Modules.insert(I, {{MD.ID, InputIndex}, std::move(MD)});
+NewMDs.push_back(&NewIt->second);
   }
-  Modules.insert(I, {{MD.ID, InputIndex}, std::move(MD)});
+}
+// Eagerly compute the lazy members before the graph goes out of scope.
+// This is somewhat costly, so do it outside the critical section.
+for (ModuleDeps *MD : NewMDs) {
+  (void)MD->getBuildArguments();
+  (void)MD->getFileDeps();
 }
   }
 
@@ -382,7 +392,7 @@
 /*ShouldOwnClient=*/false);
 
 for (auto &&M : Modules)
-  if (roundTripCommand(M.second.BuildArguments, *Diags))
+  if (roundTripCommand(M.second.getBuildArguments(), *Diags))
 return true;
 
 for (auto &&I : Inputs)
@@ -408,10 +418,10 @@
   Object O{
   {"name", MD.ID.ModuleName},
   {"context-hash", MD.ID.ContextHash},
-  {"file-deps", toJSONSorted(MD.FileDeps)},
+  {"file-deps", toJSONSorted(MD.getFileDeps())},
   {"clang-module-deps", toJSONSorted(MD.ClangModuleDeps)},
   {"clang-modulemap-file", MD.ClangModuleMapFile},
-  {"command-line", MD.BuildArguments},
+  {"command-line", MD.getBuildArguments()},
   };
   OutModules.push_back(std::move(O));
 }
Index: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
===
--- clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -467,20 +467,6 @@
   serialization::ModuleFile *MF =
   MDC.ScanInstance.getASTReader()->getModuleManager().lookup(
   M->getASTFile());
-  MDC.ScanInstance.getASTReader()->visitInputFileInfos(
-  *MF, /*IncludeSystem=*/true,
-  [&](const serialization::InputFileInfo &IFI, bool IsSystem) {
-// __inferred_module.map is the result of the way in which an implicit
-// module build handles inferred modules. It adds an overlay VFS with
-// this file in the proper directory and relies on the rest of Clang to
-// handle it like normal. With explicitly built modules we don't need
-// to play VFS tricks, so replace it with the correct module map.
-if (StringRef(IFI.Filename).endswith("__inferred_module.map")) {
-  MDC.addFileDep(MD, ModuleMap->getName());
-  return;
-}
-MDC.addFileDep(MD, IFI.Filename);
-  });
 
   llvm::DenseSet SeenDeps;
   addAllSubmodulePrebuiltDeps(M, MD, SeenDeps);
@@ -510,7 +496,9 @@
   // Finish the compiler invocation. Requires dependencies and the context hash.
   MDC.addOutputPaths(CI, MD);
 
-  MD.BuildArguments = CI.getCC1CommandLine();
+  // Wire up lazy info computation.
+  MD.MDC = &MDC;
+  MDC.LazyModuleDepsInfoByID.insert({MD.ID, {MF, std::move(CI)}});
 
   return MD.ID;
 }
@@ -643,5 +631,50 @@
 void ModuleDepCollector::addFileDep(ModuleDeps &MD, StringRef Path) {
   llvm::SmallString<256> Storage;
   Path = makeAbsoluteAndPreferred(ScanInstance, Path, Storage);
-  MD.FileDeps.insert(Path);
+  MD.FileDeps->insert(Path);
+}
+
+void ModuleDepCollector::addFileDeps(ModuleDeps &MD) {
+  auto It = LazyModuleDepsInfoByID.find(MD.ID);
+  assert(It != LazyModuleDepsInfoByID.end());
+
+  MD.FileDeps = llvm::StringSet<>{};
+
+  ScanInstance.getASTReader()->visitInputFileInfos(
+  *It->second.MF, /*IncludeSystem=*/tru

[PATCH] D158573: [clang][modules] Move `UNHASHED_CONTROL_BLOCK` up in the AST file

2023-08-22 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added subscribers: ributzka, arphaman.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

When loading a (transitively) imported AST file, `ModuleManager::addModule()` 
first checks that it has the expected signature via `readASTFileSignature()`. 
The signature is part of `UNHASHED_CONTROL_BLOCK`, which is placed at the end 
of the AST file. This means that just to verify the signature of an AST file, 
we need to skip over all top-level blocks, paging in the whole AST file from 
disk. This is pretty slow.

This patch moves `UNHASHED_CONTROL_BLOCK` to the start of the AST file, so that 
it can be read more efficiently. To achieve this, we use a dummy signature when 
first emitting the unhashed control block, and then backpatch the real 
signature at the end of the serialization process.

This speeds up dependency scanning by over 9% and significantly reduces 
run-to-run variability of my benchmarks.

Depends on D158572 .
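
The dummy-then-backpatch scheme described above can be sketched roughly as follows. This is a simplified illustration and not the `ASTWriter` code: `emitPlaceholder`, `toySignature` and `backpatch` are hypothetical helpers, and an FNV-1a checksum stands in for the SHA-1 hash the real writer uses. The point it shows is that the signature bytes themselves must be excluded from the hash, otherwise backpatching would invalidate the value just written.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr size_t SignatureSize = 20; // SHA-1-sized, for illustration only
using Signature = std::array<uint8_t, SignatureSize>;

// Reserve room for the signature early in the buffer and return its offset.
size_t emitPlaceholder(std::vector<uint8_t> &Buffer) {
  size_t Offset = Buffer.size();
  Buffer.insert(Buffer.end(), SignatureSize, 0xFF); // dummy bytes
  return Offset;
}

// Toy stand-in for the real hash: FNV-1a over every byte outside
// [SkipBegin, SkipEnd), spread across the signature bytes.
Signature toySignature(const std::vector<uint8_t> &Buffer, size_t SkipBegin,
                       size_t SkipEnd) {
  uint64_t H = 0xcbf29ce484222325ULL;
  for (size_t I = 0; I < Buffer.size(); ++I) {
    if (I >= SkipBegin && I < SkipEnd)
      continue; // never hash the signature slot itself
    H ^= Buffer[I];
    H *= 0x100000001b3ULL;
  }
  Signature Sig{};
  for (size_t I = 0; I < Sig.size(); ++I)
    Sig[I] = static_cast<uint8_t>(H >> (8 * (I % 8)));
  return Sig;
}

// Overwrite the dummy bytes once the rest of the file has been emitted.
void backpatch(std::vector<uint8_t> &Buffer, size_t Offset,
               const Signature &Sig) {
  assert(Offset + Sig.size() <= Buffer.size() && "placeholder out of range");
  std::copy_n(Sig.begin(), Sig.size(), Buffer.begin() + Offset);
}
```

A reader can then verify the file by reading the signature near the start and, if needed, recomputing the hash over the same byte ranges, without first skipping every other block to reach the end of the file.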


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D158573

Files:
  clang/include/clang/Basic/Module.h
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/include/clang/Serialization/ASTReader.h
  clang/include/clang/Serialization/ASTWriter.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Serialization/GlobalModuleIndex.cpp
  clang/test/Modules/ASTSignature.c

Index: clang/test/Modules/ASTSignature.c
===
--- clang/test/Modules/ASTSignature.c
+++ clang/test/Modules/ASTSignature.c
@@ -1,6 +1,6 @@
 // RUN: rm -rf %t
-// RUN: %clang_cc1 -iquote %S/Inputs/ASTHash/ -fsyntax-only -fmodules \
-// RUN:   -fimplicit-module-maps -fmodules-strict-context-hash \
+// RUN: %clang_cc1 -iquote %S/Inputs/ASTHash/ -fsyntax-only \
+// RUN:   -fmodules -fimplicit-module-maps -fmodules-strict-context-hash \
 // RUN:   -fmodules-cache-path=%t -fdisable-module-hash %s
 // RUN: cp %t/MyHeader2.pcm %t1.pcm
 // RUN: rm -rf %t
@@ -8,17 +8,18 @@
 // RUN:   -fmodules -fimplicit-module-maps -fmodules-strict-context-hash \
 // RUN:   -fmodules-cache-path=%t -fdisable-module-hash %s
 // RUN: cp %t/MyHeader2.pcm %t2.pcm
-// RUN: llvm-bcanalyzer --dump --disable-histogram %t1.pcm > %t1.dump
-// RUN: llvm-bcanalyzer --dump --disable-histogram %t2.pcm > %t2.dump
+// RUN: llvm-bcanalyzer --dump --disable-histogram --show-binary-blobs %t1.pcm > %t1.dump
+// RUN: llvm-bcanalyzer --dump --disable-histogram --show-binary-blobs %t2.pcm > %t2.dump
 // RUN: cat %t1.dump %t2.dump | FileCheck %s
 
 #include "my_header_2.h"
 
 my_int var = 42;
 
-// CHECK: [[AST_BLOCK_HASH:]]
-// CHECK: [[SIGNATURE:]]
-// CHECK: [[AST_BLOCK_HASH]]
-// CHECK-NOT: [[SIGNATURE]]
-// The modules built by this test are designed to yield the same AST. If this
-// test fails, it means that the AST block is has become non-relocatable.
+// CHECK:  blob data = '[[AST_BLOCK_HASH:.*]]'
+// CHECK:  blob data = '[[SIGNATURE:.*]]'
+// CHECK:  blob data = '[[AST_BLOCK_HASH]]'
+// CHECK-NOT:  blob data = '[[SIGNATURE]]'
+// The modules built by this test are designed to yield the same AST but distinct AST files.
+// If this test fails, it means that either the AST block has become non-relocatable,
+// or the file signature stopped hashing some parts of the AST file.
Index: clang/lib/Serialization/GlobalModuleIndex.cpp
===
--- clang/lib/Serialization/GlobalModuleIndex.cpp
+++ clang/lib/Serialization/GlobalModuleIndex.cpp
@@ -15,6 +15,7 @@
 #include "clang/Basic/FileManager.h"
 #include "clang/Lex/HeaderSearch.h"
 #include "clang/Serialization/ASTBitCodes.h"
+#include "clang/Serialization/ASTReader.h"
 #include "clang/Serialization/ModuleFile.h"
 #include "clang/Serialization/PCHContainerOperations.h"
 #include "llvm/ADT/DenseMap.h"
@@ -698,8 +699,7 @@
 
 // Get Signature.
 if (State == DiagnosticOptionsBlock && Code == SIGNATURE)
-  getModuleFileInfo(File).Signature = ASTFileSignature::create(
-  Record.begin(), Record.begin() + ASTFileSignature::size);
+  getModuleFileInfo(File).Signature = ASTReader::readSignature(Blob.data());
 
 // We don't care about this record.
   }
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1120,50 +1120,117 @@
 }
 
 std::pair
-ASTWriter::createSignature(StringRef AllBytes, StringRef ASTBlockBytes) {
+ASTWriter::createSignature() const {
+  StringRef AllBytes(Buffer.data(), Buffer.size());
+
   llvm::SHA1 Hasher;
-  Hasher.update(ASTBlockBytes);
+  Hasher.update(AllBytes.slice(ASTBlockRange.first, ASTBlockRange.second));
   ASTFileSignature ASTBloc

[PATCH] D158572: [clang][modules] Use relative offsets for input files

2023-08-22 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This patch replaces absolute offsets into the input files block with offsets 
relative to the block start. This makes the whole section "relocatable". I 
confirmed all other uses of `GetCurrentBitNo()` are turned into relative 
offsets before being serialized into the AST file.
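
As a rough illustration of why relative offsets make the block relocatable, consider the sketch below. `Block`, `recordEntry` and `resolveEntry` are hypothetical names chosen for this example, not Clang APIs; the writer stores distances from the block start and the reader adds back whatever base it observes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a serialized block; not a Clang type.
struct Block {
  uint64_t Base = 0;                // absolute position of the block start
  std::vector<uint64_t> RelOffsets; // entry positions relative to Base
};

// Writer side: store the distance from the block start, not the absolute
// position in the file.
void recordEntry(Block &B, uint64_t AbsolutePos) {
  assert(AbsolutePos >= B.Base && "entry precedes block start");
  B.RelOffsets.push_back(AbsolutePos - B.Base);
}

// Reader side: rebase onto wherever the block was actually found.
uint64_t resolveEntry(const Block &B, size_t Index) {
  return B.Base + B.RelOffsets[Index];
}
```

If something earlier in the file grows or moves, only `Base` changes; the stored offsets, and therefore the bytes of the block itself, stay identical.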


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D158572

Files:
  clang/include/clang/Serialization/ModuleFile.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp


Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1570,6 +1570,8 @@
   IFHAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 32));
   unsigned IFHAbbrevCode = Stream.EmitAbbrev(std::move(IFHAbbrev));
 
+  uint64_t InputFilesOffsetBase = Stream.GetCurrentBitNo();
+
   // Get all ContentCache objects for files.
   std::vector UserFiles;
   std::vector SystemFiles;
@@ -1633,7 +1635,7 @@
   continue; // already recorded this file.
 
 // Record this entry's offset.
-InputFileOffsets.push_back(Stream.GetCurrentBitNo());
+InputFileOffsets.push_back(Stream.GetCurrentBitNo() - InputFilesOffsetBase);
 
 InputFileID = InputFileOffsets.size();
 
Index: clang/lib/Serialization/ASTReader.cpp
===
--- clang/lib/Serialization/ASTReader.cpp
+++ clang/lib/Serialization/ASTReader.cpp
@@ -2326,7 +2326,8 @@
   // Go find this input file.
   BitstreamCursor &Cursor = F.InputFilesCursor;
   SavedStreamPosition SavedPosition(Cursor);
-  if (llvm::Error Err = Cursor.JumpToBit(F.InputFileOffsets[ID - 1])) {
+  if (llvm::Error Err = Cursor.JumpToBit(F.InputFilesOffsetBase +
+ F.InputFileOffsets[ID - 1])) {
 // FIXME this drops errors on the floor.
 consumeError(std::move(Err));
   }
@@ -2410,7 +2411,8 @@
   // Go find this input file.
   BitstreamCursor &Cursor = F.InputFilesCursor;
   SavedStreamPosition SavedPosition(Cursor);
-  if (llvm::Error Err = Cursor.JumpToBit(F.InputFileOffsets[ID - 1])) {
+  if (llvm::Error Err = Cursor.JumpToBit(F.InputFilesOffsetBase +
+ F.InputFileOffsets[ID - 1])) {
 // FIXME this drops errors on the floor.
 consumeError(std::move(Err));
   }
@@ -2788,6 +2790,7 @@
   Error("malformed block record in AST file");
   return Failure;
 }
+F.InputFilesOffsetBase = F.InputFilesCursor.GetCurrentBitNo();
 continue;
 
   case OPTIONS_BLOCK_ID:
@@ -5328,6 +5331,7 @@
   bool NeedsSystemInputFiles = Listener.needsSystemInputFileVisitation();
   bool NeedsImports = Listener.needsImportVisitation();
   BitstreamCursor InputFilesCursor;
+  uint64_t InputFilesOffsetBase;
 
   RecordData Record;
   std::string ModuleDir;
@@ -5363,6 +5367,7 @@
 if (NeedsInputFiles &&
 ReadBlockAbbrevs(InputFilesCursor, INPUT_FILES_BLOCK_ID))
   return true;
+InputFilesOffsetBase = InputFilesCursor.GetCurrentBitNo();
 break;
 
   default:
@@ -5435,7 +5440,8 @@
 
 BitstreamCursor &Cursor = InputFilesCursor;
 SavedStreamPosition SavedPosition(Cursor);
-if (llvm::Error Err = Cursor.JumpToBit(InputFileOffs[I])) {
+if (llvm::Error Err =
+Cursor.JumpToBit(InputFilesOffsetBase + InputFileOffs[I])) {
   // FIXME this drops errors on the floor.
   consumeError(std::move(Err));
 }
Index: clang/include/clang/Serialization/ModuleFile.h
===
--- clang/include/clang/Serialization/ModuleFile.h
+++ clang/include/clang/Serialization/ModuleFile.h
@@ -245,7 +245,10 @@
   /// The cursor to the start of the input-files block.
   llvm::BitstreamCursor InputFilesCursor;
 
-  /// Offsets for all of the input file entries in the AST file.
+  /// Absolute offset of the start of the input-files block.
+  uint64_t InputFilesOffsetBase;
+
+  /// Relative offsets for all of the input file entries in the AST file.
   const llvm::support::unaligned_uint64_t *InputFileOffsets = nullptr;
 
   /// The input files that have been loaded from this AST file.



[PATCH] D158469: [clang][deps] Compute command lines and file deps on-demand

2023-08-21 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Although generating command lines and collecting file dependencies recently got 
faster, there are still benefits to be had from doing these lazily, on-demand.

This patch makes it so that the `ModuleDepsGraph` keeps the scanning instance 
alive. Instances of `ModuleDeps` can then use a pointer to the 
`ModuleDepCollector` to compute the information lazily.
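
The general shape of this on-demand computation might look like the sketch below. It is a minimal illustration under assumptions, not the patch itself: `Collector`, `LazyModuleInfo`, `computeArguments()` and `ComputeCount` are hypothetical names, and the real `ModuleDeps` carries considerably more state.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the object that owns the scanning state.
struct Collector {
  int ComputeCount = 0; // counts how often the expensive path runs

  std::vector<std::string> computeArguments(const std::string &Name) {
    ++ComputeCount;
    return {"-cc1", "-fmodule-name=" + Name};
  }
};

// Hypothetical stand-in for a per-module record with a lazily computed member.
class LazyModuleInfo {
  std::string Name;
  Collector *Owner; // must stay alive until the first getBuildArguments() call
  std::optional<std::vector<std::string>> BuildArguments;

public:
  LazyModuleInfo(std::string Name, Collector &Owner)
      : Name(std::move(Name)), Owner(&Owner) {}

  // Compute on first use, then serve the cached value.
  const std::vector<std::string> &getBuildArguments() {
    if (!BuildArguments)
      BuildArguments = Owner->computeArguments(Name);
    return *BuildArguments;
  }
};
```

The lifetime requirement on `Owner` is the reason the description has the graph keep the scanning instance alive: a lazy getter is only safe while the object it defers to still exists.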


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D158469

Files:
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
  clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
  clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
  clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
  clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
  clang/tools/clang-scan-deps/ClangScanDeps.cpp

Index: clang/tools/clang-scan-deps/ClangScanDeps.cpp
===
--- clang/tools/clang-scan-deps/ClangScanDeps.cpp
+++ clang/tools/clang-scan-deps/ClangScanDeps.cpp
@@ -351,14 +351,24 @@
   }
 
   void mergeDeps(ModuleDepsGraph Graph, size_t InputIndex) {
-std::unique_lock ul(Lock);
-for (const ModuleDeps &MD : Graph) {
-  auto I = Modules.find({MD.ID, 0});
-  if (I != Modules.end()) {
-I->first.InputIndex = std::min(I->first.InputIndex, InputIndex);
-continue;
+std::vector NewMDs;
+{
+  std::unique_lock ul(Lock);
+  for (const ModuleDeps &MD : Graph.MDs) {
+auto I = Modules.find({MD.ID, 0});
+if (I != Modules.end()) {
+  I->first.InputIndex = std::min(I->first.InputIndex, InputIndex);
+  continue;
+}
+auto NewIt = Modules.insert(I, {{MD.ID, InputIndex}, std::move(MD)});
+NewMDs.push_back(&NewIt->second);
   }
-  Modules.insert(I, {{MD.ID, InputIndex}, std::move(MD)});
+}
+// Eagerly compute the lazy members before the graph goes out of scope.
+// This is somewhat costly, so do it outside the critical section.
+for (ModuleDeps *MD : NewMDs) {
+  (void)MD->getBuildArguments();
+  (void)MD->getFileDeps();
 }
   }
 
@@ -382,7 +392,7 @@
 /*ShouldOwnClient=*/false);
 
 for (auto &&M : Modules)
-  if (roundTripCommand(M.second.BuildArguments, *Diags))
+  if (roundTripCommand(M.second.getBuildArguments(), *Diags))
 return true;
 
 for (auto &&I : Inputs)
@@ -408,10 +418,10 @@
   Object O{
   {"name", MD.ID.ModuleName},
   {"context-hash", MD.ID.ContextHash},
-  {"file-deps", toJSONSorted(MD.FileDeps)},
+  {"file-deps", toJSONSorted(MD.getFileDeps())},
   {"clang-module-deps", toJSONSorted(MD.ClangModuleDeps)},
   {"clang-modulemap-file", MD.ClangModuleMapFile},
-  {"command-line", MD.BuildArguments},
+  {"command-line", MD.getBuildArguments()},
   };
   OutModules.push_back(std::move(O));
 }
Index: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
===
--- clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -467,20 +467,6 @@
   serialization::ModuleFile *MF =
   MDC.ScanInstance.getASTReader()->getModuleManager().lookup(
   M->getASTFile());
-  MDC.ScanInstance.getASTReader()->visitInputFileInfos(
-  *MF, /*IncludeSystem=*/true,
-  [&](const serialization::InputFileInfo &IFI, bool IsSystem) {
-// __inferred_module.map is the result of the way in which an implicit
-// module build handles inferred modules. It adds an overlay VFS with
-// this file in the proper directory and relies on the rest of Clang to
-// handle it like normal. With explicitly built modules we don't need
-// to play VFS tricks, so replace it with the correct module map.
-if (StringRef(IFI.Filename).endswith("__inferred_module.map")) {
-  MDC.addFileDep(MD, ModuleMap->getName());
-  return;
-}
-MDC.addFileDep(MD, IFI.Filename);
-  });
 
   llvm::DenseSet SeenDeps;
   addAllSubmodulePrebuiltDeps(M, MD, SeenDeps);
@@ -510,7 +496,9 @@
   // Finish the compiler invocation. Requires dependencies and the context hash.
   MDC.addOutputPaths(CI, MD);
 
-  MD.BuildArguments = CI.getCC1CommandLine();
+  // Wire up lazy info computation.
+  MD.MDC = &MDC;
+  MDC.LazyModuleDepsInfoByID.insert({MD.ID, {MF, std::move(CI)}});
 
   return MD.ID;
 }
@@ -643,5 +631,50 @@
 void ModuleDepCollector::addFileDep(ModuleDeps &MD, String

[PATCH] D158136: [clang][modules] Avoid storing command-line macro definitions into implicitly built PCM files

2023-08-17 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG6a115578324f: [clang][modules] Avoid storing command-line 
macro definitions into implicitly… (authored by jansvoboda11).

Changed prior to commit:
  https://reviews.llvm.org/D158136?vs=550964&id=551197#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158136/new/

https://reviews.llvm.org/D158136

Files:
  clang/include/clang/Serialization/ASTReader.h
  clang/include/clang/Serialization/ASTWriter.h
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/FrontendActions.cpp
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Serialization/GeneratePCH.cpp
  clang/test/Modules/module_file_info.m

Index: clang/test/Modules/module_file_info.m
===
--- clang/test/Modules/module_file_info.m
+++ clang/test/Modules/module_file_info.m
@@ -1,14 +1,15 @@
 // UNSUPPORTED: target={{.*}}-zos{{.*}}, target={{.*}}-aix{{.*}}
 @import DependsOnModule;
 
-// RUN: rm -rf %t %t-obj
+// RUN: rm -rf %t %t-obj %t-hash
 // RUN: %clang_cc1 -w -Wunused -fmodules -fmodule-format=raw -fimplicit-module-maps -fdisable-module-hash -fmodules-cache-path=%t -F %S/Inputs -DBLARG -DWIBBLE=WOBBLE -fmodule-feature myfeature %s
-// RUN: %clang_cc1 -module-file-info %t/DependsOnModule.pcm | FileCheck %s
-// RUN: %clang_cc1 -module-file-info %t/DependsOnModule.pcm | FileCheck %s --check-prefix=RAW
+// RUN: %clang_cc1 -module-file-info %t/DependsOnModule.pcm | FileCheck %s --check-prefixes=RAW,CHECK,MACROS
 
 // RUN: %clang_cc1 -w -Wunused -fmodules -fmodule-format=obj -fimplicit-module-maps -fdisable-module-hash -fmodules-cache-path=%t-obj -F %S/Inputs -DBLARG -DWIBBLE=WOBBLE -fmodule-feature myfeature %s
-// RUN: %clang_cc1 -module-file-info %t-obj/DependsOnModule.pcm | FileCheck %s
-// RUN: %clang_cc1 -module-file-info %t-obj/DependsOnModule.pcm | FileCheck %s --check-prefix=OBJ
+// RUN: %clang_cc1 -module-file-info %t-obj/DependsOnModule.pcm | FileCheck %s --check-prefixes=OBJ,CHECK,MACROS
+
+// RUN: %clang_cc1 -w -Wunused -fmodules -fmodule-format=obj -fimplicit-module-maps   -fmodules-cache-path=%t-hash -F %S/Inputs -DBLARG -DWIBBLE=WOBBLE -fmodule-feature myfeature %s
+// RUN: %clang_cc1 -module-file-info %t-hash/*/DependsOnModule-*.pcm | FileCheck %s --check-prefixes=OBJ,CHECK,NO_MACROS
 
 // RAW:   Module format: raw
 // OBJ:   Module format: obj
@@ -16,7 +17,7 @@
 
 // CHECK: Module name: DependsOnModule
 // CHECK: Module map file: {{.*}}DependsOnModule.framework{{[/\\]}}module.map
-// CHECK: Imports module 'Module': {{.*}}Module.pcm
+// CHECK: Imports module 'Module': {{.*}}Module{{.*}}.pcm
 
 // CHECK: Language options:
 // CHECK:   C99: Yes
@@ -42,9 +43,10 @@
 // CHECK: Preprocessor options:
 // CHECK:   Uses compiler/target-specific predefines [-undef]: Yes
 // CHECK:   Uses detailed preprocessing record (for indexing): No
-// CHECK:   Predefined macros:
-// CHECK: -DBLARG
-// CHECK: -DWIBBLE=WOBBLE
+// NO_MACROS-NOT: Predefined macros:
+// MACROS:   Predefined macros:
+// MACROS-NEXT: -DBLARG
+// MACROS-NEXT: -DWIBBLE=WOBBLE
 // CHECK: Input file: {{.*}}module.map
 // CHECK-NEXT: Input file: {{.*}}module_private.map
 // CHECK-NEXT: Input file: {{.*}}DependsOnModule.h
Index: clang/lib/Serialization/GeneratePCH.cpp
===
--- clang/lib/Serialization/GeneratePCH.cpp
+++ clang/lib/Serialization/GeneratePCH.cpp
@@ -25,11 +25,11 @@
 StringRef OutputFile, StringRef isysroot, std::shared_ptr Buffer,
 ArrayRef> Extensions,
 bool AllowASTWithErrors, bool IncludeTimestamps,
-bool ShouldCacheASTInMemory)
+bool BuildingImplicitModule, bool ShouldCacheASTInMemory)
 : PP(PP), OutputFile(OutputFile), isysroot(isysroot.str()),
   SemaPtr(nullptr), Buffer(std::move(Buffer)), Stream(this->Buffer->Data),
   Writer(Stream, this->Buffer->Data, ModuleCache, Extensions,
- IncludeTimestamps),
+ IncludeTimestamps, BuildingImplicitModule),
   AllowASTWithErrors(AllowASTWithErrors),
   ShouldCacheASTInMemory(ShouldCacheASTInMemory) {
   this->Buffer->IsComplete = false;
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1467,11 +1467,19 @@
   Record.clear();
   const PreprocessorOptions &PPOpts = PP.getPreprocessorOpts();
 
-  // Macro definitions.
-  Record.push_back(PPOpts.Macros.size());
-  for (unsigned I = 0, N = PPOpts.Macros.size(); I != N; ++I) {
-AddString(PPOpts.Macros[I].first, Record);
-Record.push_back(PPOpts.Macros[I].second);
+  // If we're building an implicit module with a context hash, the importer is
+  // guaranteed to have the same macros defined on the command line. Skip
+  // writing them.
+  boo

[PATCH] D158136: [clang][modules] Avoid storing command-line macro definitions into implicitly built PCM files

2023-08-16 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 550964.
jansvoboda11 added a comment.

Format


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158136/new/

https://reviews.llvm.org/D158136

Files:
  clang/include/clang/Serialization/ASTReader.h
  clang/include/clang/Serialization/ASTWriter.h
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/FrontendActions.cpp
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Serialization/GeneratePCH.cpp
  clang/test/Modules/module_file_info.m

Index: clang/test/Modules/module_file_info.m
===
--- clang/test/Modules/module_file_info.m
+++ clang/test/Modules/module_file_info.m
@@ -42,9 +42,6 @@
 // CHECK: Preprocessor options:
 // CHECK:   Uses compiler/target-specific predefines [-undef]: Yes
 // CHECK:   Uses detailed preprocessing record (for indexing): No
-// CHECK:   Predefined macros:
-// CHECK: -DBLARG
-// CHECK: -DWIBBLE=WOBBLE
 // CHECK: Input file: {{.*}}module.map
 // CHECK-NEXT: Input file: {{.*}}module_private.map
 // CHECK-NEXT: Input file: {{.*}}DependsOnModule.h
Index: clang/lib/Serialization/GeneratePCH.cpp
===
--- clang/lib/Serialization/GeneratePCH.cpp
+++ clang/lib/Serialization/GeneratePCH.cpp
@@ -25,11 +25,11 @@
 StringRef OutputFile, StringRef isysroot, std::shared_ptr<PCHBuffer> Buffer,
 ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,
 bool AllowASTWithErrors, bool IncludeTimestamps,
-bool ShouldCacheASTInMemory)
+bool BuildingImplicitModule, bool ShouldCacheASTInMemory)
 : PP(PP), OutputFile(OutputFile), isysroot(isysroot.str()),
   SemaPtr(nullptr), Buffer(std::move(Buffer)), Stream(this->Buffer->Data),
   Writer(Stream, this->Buffer->Data, ModuleCache, Extensions,
- IncludeTimestamps),
+ IncludeTimestamps, BuildingImplicitModule),
   AllowASTWithErrors(AllowASTWithErrors),
   ShouldCacheASTInMemory(ShouldCacheASTInMemory) {
   this->Buffer->IsComplete = false;
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1466,11 +1466,19 @@
   Record.clear();
   const PreprocessorOptions &PPOpts = PP.getPreprocessorOpts();
 
-  // Macro definitions.
-  Record.push_back(PPOpts.Macros.size());
-  for (unsigned I = 0, N = PPOpts.Macros.size(); I != N; ++I) {
-AddString(PPOpts.Macros[I].first, Record);
-Record.push_back(PPOpts.Macros[I].second);
+  // If we're building an implicit module with a context hash, the importer is
+  // guaranteed to have the same macros defined on the command line. Skip
+  // writing them.
+  bool SkipMacros = BuildingImplicitModule && !HSOpts.DisableModuleHash;
+  bool WriteMacros = !SkipMacros;
+  Record.push_back(WriteMacros);
+  if (WriteMacros) {
+// Macro definitions.
+Record.push_back(PPOpts.Macros.size());
+for (unsigned I = 0, N = PPOpts.Macros.size(); I != N; ++I) {
+  AddString(PPOpts.Macros[I].first, Record);
+  Record.push_back(PPOpts.Macros[I].second);
+}
   }
 
   // Includes
@@ -4539,9 +4547,10 @@
  SmallVectorImpl<char> &Buffer,
  InMemoryModuleCache &ModuleCache,
  ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,
- bool IncludeTimestamps)
+ bool IncludeTimestamps, bool BuildingImplicitModule)
 : Stream(Stream), Buffer(Buffer), ModuleCache(ModuleCache),
-  IncludeTimestamps(IncludeTimestamps) {
+  IncludeTimestamps(IncludeTimestamps),
+  BuildingImplicitModule(BuildingImplicitModule) {
   for (const auto &Ext : Extensions) {
 if (auto Writer = Ext->createExtensionWriter(*this))
   ModuleFileExtensionWriters.push_back(std::move(Writer));
Index: clang/lib/Serialization/ASTReader.cpp
===
--- clang/lib/Serialization/ASTReader.cpp
+++ clang/lib/Serialization/ASTReader.cpp
@@ -208,11 +208,12 @@
 }
 
 bool ChainedASTReaderListener::ReadPreprocessorOptions(
-const PreprocessorOptions &PPOpts, bool Complain,
+const PreprocessorOptions &PPOpts, bool ReadMacros, bool Complain,
 std::string &SuggestedPredefines) {
-  return First->ReadPreprocessorOptions(PPOpts, Complain,
+  return First->ReadPreprocessorOptions(PPOpts, ReadMacros, Complain,
 SuggestedPredefines) ||
- Second->ReadPreprocessorOptions(PPOpts, Complain, SuggestedPredefines);
+ Second->ReadPreprocessorOptions(PPOpts, ReadMacros, Complain,
+ SuggestedPredefines);
 }
 
 void ChainedASTReaderListener::ReadCounter(const serialization::ModuleFile &M,
@@ -658,92 +659,95 @@
 ///are no differences in the options between the two.
 static bool checkPreprocessorOptions(
 const Preproc

[PATCH] D158136: [clang][modules] Avoid storing command-line macro definitions into implicitly built PCM files

2023-08-16 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added reviewers: benlangmuir, rsmith, Bigcheese.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

With implicit modules, it's impossible to load a PCM file that was built using 
different command-line macro definitions. This is guaranteed by the fact that 
they contribute to the context hash. This means that we don't need to store 
those macros into PCM files for validation purposes. This patch avoids 
serializing them in those circumstances, since there's no other use for 
command-line macro definitions (besides "-module-file-info").

For a typical Apple project, this speeds up the dependency scan by 5.6% and 
shrinks the cache with scanning PCMs by 26%.
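
The writer-side change described above can be modelled in isolation. The following is a minimal sketch, not the actual Clang code: `PPOptsModel`, `RecordModel`, `addString`, and both free functions are hypothetical stand-ins, and the reader half is only an assumption about how the `ReadMacros` flag mirrors the writer's `WriteMacros` flag.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for clang::PreprocessorOptions; only the field used here.
struct PPOptsModel {
  // (macro text, isUndef) pairs, mirroring PPOpts.Macros in the diff.
  std::vector<std::pair<std::string, bool>> Macros;
};

// Simplified record: a flat list of integers, similar to a bitstream record.
using RecordModel = std::vector<std::uint64_t>;

// Encodes a string as its length followed by its bytes. This encoding is an
// assumption made for the sketch, not a claim about ASTWriter::AddString.
static void addString(const std::string &S, RecordModel &R) {
  R.push_back(S.size());
  for (unsigned char C : S)
    R.push_back(C);
}

// Mirrors the condition in the patch: macros are skipped only when building
// an implicit module whose path includes the context hash.
RecordModel writePreprocessorOptions(const PPOptsModel &PPOpts,
                                     bool BuildingImplicitModule,
                                     bool DisableModuleHash) {
  RecordModel R;
  bool SkipMacros = BuildingImplicitModule && !DisableModuleHash;
  bool WriteMacros = !SkipMacros;
  R.push_back(WriteMacros);
  if (WriteMacros) {
    R.push_back(PPOpts.Macros.size());
    for (const auto &M : PPOpts.Macros) {
      addString(M.first, R);
      R.push_back(M.second);
    }
  }
  return R;
}

// Assumed reader counterpart: returns false when the record carries no macros,
// in which case there is nothing to validate against.
bool readPreprocessorOptions(const RecordModel &R, PPOptsModel &Out) {
  std::size_t Idx = 0;
  bool ReadMacros = R[Idx++] != 0;
  if (!ReadMacros)
    return false;
  std::uint64_t NumMacros = R[Idx++];
  for (std::uint64_t I = 0; I != NumMacros; ++I) {
    std::uint64_t Len = R[Idx++];
    std::string Text;
    for (std::uint64_t J = 0; J != Len; ++J)
      Text.push_back(static_cast<char>(R[Idx++]));
    bool IsUndef = R[Idx++] != 0;
    Out.Macros.emplace_back(std::move(Text), IsUndef);
  }
  return true;
}
```

In this model, an implicit build with the module hash enabled produces a record holding only the leading flag, while an explicit build (or one with the hash disabled) still round-trips the full macro list.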


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D158136

Files:
  clang/include/clang/Serialization/ASTReader.h
  clang/include/clang/Serialization/ASTWriter.h
  clang/lib/Frontend/ASTUnit.cpp
  clang/lib/Frontend/FrontendActions.cpp
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Serialization/GeneratePCH.cpp
  clang/test/Modules/module_file_info.m

Index: clang/test/Modules/module_file_info.m
===
--- clang/test/Modules/module_file_info.m
+++ clang/test/Modules/module_file_info.m
@@ -42,9 +42,6 @@
 // CHECK: Preprocessor options:
 // CHECK:   Uses compiler/target-specific predefines [-undef]: Yes
 // CHECK:   Uses detailed preprocessing record (for indexing): No
-// CHECK:   Predefined macros:
-// CHECK: -DBLARG
-// CHECK: -DWIBBLE=WOBBLE
 // CHECK: Input file: {{.*}}module.map
 // CHECK-NEXT: Input file: {{.*}}module_private.map
 // CHECK-NEXT: Input file: {{.*}}DependsOnModule.h
Index: clang/lib/Serialization/GeneratePCH.cpp
===
--- clang/lib/Serialization/GeneratePCH.cpp
+++ clang/lib/Serialization/GeneratePCH.cpp
@@ -25,11 +25,11 @@
 StringRef OutputFile, StringRef isysroot, std::shared_ptr<PCHBuffer> Buffer,
 ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,
 bool AllowASTWithErrors, bool IncludeTimestamps,
-bool ShouldCacheASTInMemory)
+bool BuildingImplicitModule, bool ShouldCacheASTInMemory)
 : PP(PP), OutputFile(OutputFile), isysroot(isysroot.str()),
   SemaPtr(nullptr), Buffer(std::move(Buffer)), Stream(this->Buffer->Data),
   Writer(Stream, this->Buffer->Data, ModuleCache, Extensions,
- IncludeTimestamps),
+ IncludeTimestamps, BuildingImplicitModule),
   AllowASTWithErrors(AllowASTWithErrors),
   ShouldCacheASTInMemory(ShouldCacheASTInMemory) {
   this->Buffer->IsComplete = false;
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1466,11 +1466,19 @@
   Record.clear();
   const PreprocessorOptions &PPOpts = PP.getPreprocessorOpts();
 
-  // Macro definitions.
-  Record.push_back(PPOpts.Macros.size());
-  for (unsigned I = 0, N = PPOpts.Macros.size(); I != N; ++I) {
-AddString(PPOpts.Macros[I].first, Record);
-Record.push_back(PPOpts.Macros[I].second);
+  // If we're building an implicit module with a context hash, the importer is
+  // guaranteed to have the same macros defined on the command line. Skip
+  // writing them.
+  bool SkipMacros = BuildingImplicitModule && !HSOpts.DisableModuleHash;
+  bool WriteMacros = !SkipMacros;
+  Record.push_back(WriteMacros);
+  if (WriteMacros) {
+// Macro definitions.
+Record.push_back(PPOpts.Macros.size());
+for (unsigned I = 0, N = PPOpts.Macros.size(); I != N; ++I) {
+  AddString(PPOpts.Macros[I].first, Record);
+  Record.push_back(PPOpts.Macros[I].second);
+}
   }
 
   // Includes
@@ -4539,9 +4547,10 @@
  SmallVectorImpl<char> &Buffer,
  InMemoryModuleCache &ModuleCache,
  ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,
- bool IncludeTimestamps)
+ bool IncludeTimestamps, bool BuildingImplicitModule)
 : Stream(Stream), Buffer(Buffer), ModuleCache(ModuleCache),
-  IncludeTimestamps(IncludeTimestamps) {
+  IncludeTimestamps(IncludeTimestamps),
+  BuildingImplicitModule(BuildingImplicitModule) {
   for (const auto &Ext : Extensions) {
 if (auto Writer = Ext->createExtensionWriter(*this))
   ModuleFileExtensionWriters.push_back(std::move(Writer));
Index: clang/lib/Serialization/ASTReader.cpp
===
--- clang/lib/Serialization/ASTReader.cpp
+++ clang/lib/Serialization/ASTReader.cpp
@@ -208,11 +208,12 @@
 }
 
 bool ChainedASTReaderListener::ReadPreprocessorOptions(
-const PreprocessorOptions &PPOpts, bool Complain,
+const PreprocessorOptions &PPOpts, bool ReadMacros, bo

[PATCH] D157559: [clang][modules] Respect "-fmodule-name=" when serializing included files into a PCH

2023-08-10 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGbbdb0c7e4496: [clang][modules] Respect 
"-fmodule-name=" when serializing included files into… (authored by 
jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157559/new/

https://reviews.llvm.org/D157559

Files:
  clang/lib/Lex/ModuleMap.cpp
  clang/test/Modules/pch-with-module-name-import-twice.c


Index: clang/test/Modules/pch-with-module-name-import-twice.c
===
--- /dev/null
+++ clang/test/Modules/pch-with-module-name-import-twice.c
@@ -0,0 +1,19 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// This test checks that headers that are part of a module named by
+// -fmodule-name= don't get included again if previously included from a PCH.
+
+//--- include/module.modulemap
+module Mod { header "Mod.h" }
+//--- include/Mod.h
+struct Symbol {};
+//--- pch.h
+#import "Mod.h"
+//--- tu.c
+#import "Mod.h" // expected-no-diagnostics
+
+// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t/cache -fimplicit-module-maps -fmodule-name=Mod -I %t/include \
+// RUN:   -emit-pch -x c-header %t/pch.h -o %t/pch.pch
+// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t/cache -fimplicit-module-maps -fmodule-name=Mod -I %t/include \
+// RUN:   -fsyntax-only %t/tu.c -include-pch %t/pch.pch -verify
Index: clang/lib/Lex/ModuleMap.cpp
===
--- clang/lib/Lex/ModuleMap.cpp
+++ clang/lib/Lex/ModuleMap.cpp
@@ -1268,8 +1268,7 @@
   HeaderList.push_back(KH);
   Mod->Headers[headerRoleToKind(Role)].push_back(Header);
 
-  bool isCompilingModuleHeader =
-  LangOpts.isCompilingModule() && Mod->getTopLevelModule() == SourceModule;
+  bool isCompilingModuleHeader = Mod->isForBuilding(LangOpts);
   if (!Imported || isCompilingModuleHeader) {
 // When we import HeaderFileInfo, the external source is expected to
 // set the isModuleHeader flag itself.




[PATCH] D156749: [modules] Fix error about the same module being defined in different .pcm files when using VFS overlays.

2023-08-09 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added inline comments.



Comment at: clang/lib/Serialization/ASTWriter.cpp:1330
 AddPath(WritingModule->PresumedModuleMapFile.empty()
-? Map.getModuleMapFileForUniquing(WritingModule)->getName()
+? Map.getModuleMapFileForUniquing(WritingModule)
+  ->getNameAsRequested()

vsapsai wrote:
> jansvoboda11 wrote:
> > Can we canonicalize this also? It'd be useful in the scanner.
> I'm not sure about that. ASTReader has some complicated logic around reading 
> this value 
> https://github.com/llvm/llvm-project/blob/c1803d5366c794ecade4e4ccd0013690a1976d49/clang/lib/Serialization/ASTReader.cpp#L4005
>  So if we don't have a proven need for it with a test case, I wouldn't change 
> it.
Good point. I do have a use for it, but I think it'd be safer to do separately 
from this patch and qualify it in isolation.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156749/new/

https://reviews.llvm.org/D156749



[PATCH] D157559: [clang][modules] Respect "-fmodule-name=" when serializing included files into a PCH

2023-08-09 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Clang writes the set of textually included files into AST files, so that 
importers know to avoid including those files again and instead deserialize 
their contents from the AST on-demand.

The logic for determining the set of included files only considers headers 
that are either non-modular or that are modular but with 
`HeaderFileInfo::isCompilingModuleHeader` set. The logic for computing that bit 
differs from the logic that determines whether to include a header textually 
with the "-fmodule-name=Mod" option. That can lead to a header from module "Mod" 
being included textually in a PCH but omitted from the serialized set of 
included files. This can then allow such a header to be textually included from 
an importer of the PCH, wreaking havoc.

This patch fixes that by aligning the logic for computing 
`HeaderFileInfo::isCompilingModuleHeader` with the logic for deciding whether 
to include modular header textually.

As far as I can tell, this bug has been in Clang forever. It got 
accidentally "fixed" by D114095 (which changed the logic for determining the 
set of included files) and got broken again in D155131 (which is essentially a 
revert of the former).
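
To make the mismatch concrete, here is a minimal sketch of the two predicates, assuming a heavily simplified model. `LangOptsModel`, `ModuleModel`, and both free functions are hypothetical names; the real `Module::isForBuilding` may handle additional cases that this sketch leaves out.

```cpp
#include <string>

// Hypothetical stand-in for the relevant parts of clang::LangOptions.
struct LangOptsModel {
  std::string CurrentModule;     // value of -fmodule-name=
  bool CompilingModule = false;  // true only when the output is a module
};

// Hypothetical stand-in for clang::Module.
struct ModuleModel {
  std::string TopLevelName;

  // Simplified form of the check the patch switches to: the header's module
  // is "for building" whenever it is the one named by -fmodule-name=,
  // regardless of whether the compilation emits a module or a PCH.
  bool isForBuilding(const LangOptsModel &LO) const {
    return TopLevelName == LO.CurrentModule;
  }
};

// Old computation from the removed lines: additionally requires that a module
// is being compiled, which is false when emitting a PCH.
bool oldIsCompilingModuleHeader(const ModuleModel &Mod,
                                const LangOptsModel &LO) {
  return LO.CompilingModule && Mod.TopLevelName == LO.CurrentModule;
}

// New computation from the added line.
bool newIsCompilingModuleHeader(const ModuleModel &Mod,
                                const LangOptsModel &LO) {
  return Mod.isForBuilding(LO);
}
```

For a PCH build with "-fmodule-name=Mod", the old predicate is false for headers of "Mod" while the new one is true, which is the disagreement the test case in this patch exercises.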


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157559

Files:
  clang/lib/Lex/ModuleMap.cpp
  clang/test/Modules/pch-with-module-name-import-twice.c


Index: clang/test/Modules/pch-with-module-name-import-twice.c
===
--- /dev/null
+++ clang/test/Modules/pch-with-module-name-import-twice.c
@@ -0,0 +1,19 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// This test checks that headers that are part of a module named by
+// -fmodule-name= don't get included again if previously included from a PCH.
+
+//--- include/module.modulemap
+module Mod { header "Mod.h" }
+//--- include/Mod.h
+struct Symbol {};
+//--- pch.h
+#import "Mod.h"
+//--- tu.c
+#import "Mod.h" // expected-no-diagnostics
+
+// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t/cache -fimplicit-module-maps -fmodule-name=Mod -I %t/include \
+// RUN:   -emit-pch -x c-header %t/pch.h -o %t/pch.pch
+// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t/cache -fimplicit-module-maps -fmodule-name=Mod -I %t/include \
+// RUN:   -fsyntax-only %t/tu.c -include-pch %t/pch.pch -verify
Index: clang/lib/Lex/ModuleMap.cpp
===
--- clang/lib/Lex/ModuleMap.cpp
+++ clang/lib/Lex/ModuleMap.cpp
@@ -1268,8 +1268,7 @@
   HeaderList.push_back(KH);
   Mod->Headers[headerRoleToKind(Role)].push_back(Header);
 
-  bool isCompilingModuleHeader =
-  LangOpts.isCompilingModule() && Mod->getTopLevelModule() == SourceModule;
+  bool isCompilingModuleHeader = Mod->isForBuilding(LangOpts);
   if (!Imported || isCompilingModuleHeader) {
 // When we import HeaderFileInfo, the external source is expected to
 // set the isModuleHeader flag itself.



[PATCH] D157066: [clang][modules][deps] Create more efficient API for visitation of `ModuleFile` inputs

2023-08-09 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGdcd3a0c9f13b: [clang][modules][deps] Create more efficient 
API for visitation of `ModuleFile`… (authored by jansvoboda11).

Changed prior to commit:
  https://reviews.llvm.org/D157066?vs=547097&id=548666#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157066/new/

https://reviews.llvm.org/D157066

Files:
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/include/clang/Serialization/ASTReader.h
  clang/include/clang/Serialization/ModuleFile.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp

Index: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
===
--- clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -459,18 +459,19 @@
   serialization::ModuleFile *MF =
   MDC.ScanInstance.getASTReader()->getModuleManager().lookup(
   M->getASTFile());
-  MDC.ScanInstance.getASTReader()->visitInputFiles(
-  *MF, true, true, [&](const serialization::InputFile &IF, bool isSystem) {
+  MDC.ScanInstance.getASTReader()->visitInputFileInfos(
+  *MF, /*IncludeSystem=*/true,
+  [&](const serialization::InputFileInfo &IFI, bool IsSystem) {
 // __inferred_module.map is the result of the way in which an implicit
 // module build handles inferred modules. It adds an overlay VFS with
 // this file in the proper directory and relies on the rest of Clang to
 // handle it like normal. With explicitly built modules we don't need
 // to play VFS tricks, so replace it with the correct module map.
-if (IF.getFile()->getName().endswith("__inferred_module.map")) {
+if (StringRef(IFI.Filename).endswith("__inferred_module.map")) {
   MDC.addFileDep(MD, ModuleMap->getName());
   return;
 }
-MDC.addFileDep(MD, IF.getFile()->getName());
+MDC.addFileDep(MD, IFI.Filename);
   });
 
   llvm::DenseSet<const Module *> SeenDeps;
@@ -478,11 +479,15 @@
   addAllSubmoduleDeps(M, MD, SeenDeps);
   addAllAffectingClangModules(M, MD, SeenDeps);
 
-  MDC.ScanInstance.getASTReader()->visitTopLevelModuleMaps(
-  *MF, [&](FileEntryRef FE) {
-if (FE.getNameAsRequested().endswith("__inferred_module.map"))
+  MDC.ScanInstance.getASTReader()->visitInputFileInfos(
+  *MF, /*IncludeSystem=*/true,
+  [&](const serialization::InputFileInfo &IFI, bool IsSystem) {
+if (!(IFI.TopLevel && IFI.ModuleMap))
   return;
-MD.ModuleMapFileDeps.emplace_back(FE.getNameAsRequested());
+if (StringRef(IFI.FilenameAsRequested)
+.endswith("__inferred_module.map"))
+  return;
+MD.ModuleMapFileDeps.emplace_back(IFI.FilenameAsRequested);
   });
 
   CompilerInvocation CI = MDC.makeInvocationForModuleBuildWithoutOutputs(
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1525,7 +1525,8 @@
   bool IsSystemFile;
   bool IsTransient;
   bool BufferOverridden;
-  bool IsTopLevelModuleMap;
+  bool IsTopLevel;
+  bool IsModuleMap;
   uint32_t ContentHash[2];
 
   InputFileEntry(FileEntryRef File) : File(File) {}
@@ -1547,8 +1548,10 @@
   IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 32)); // Modification time
   IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Overridden
   IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Transient
+  IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Top-level
   IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Module map
-  IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob)); // File name
+  IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 16)); // Name as req. len
+  IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob)); // Name as req. + name
   unsigned IFAbbrevCode = Stream.EmitAbbrev(std::move(IFAbbrev));
 
   // Create input file hash abbreviation.
@@ -1582,8 +1585,8 @@
 Entry.IsSystemFile = isSystem(File.getFileCharacteristic());
 Entry.IsTransient = Cache->IsTransient;
 Entry.BufferOverridden = Cache->BufferOverridden;
-Entry.IsTopLevelModuleMap = isModuleMap(File.getFileCharacteristic()) &&
-File.getIncludeLoc().isInvalid();
+Entry.IsTopLevel = File.getIncludeLoc().isInvalid();
+Entry.IsModuleMap = isModuleMap(File.getFileCharacteristic());
 
 auto ContentHash = hash_code(-1);
 if (PP->getHeaderSearchInfo()
@@ -1631,6 +1634,15 @@
 // Emit size/modification time for this file.
 // And whether this file was overridden.
 {
+  SmallStrin

[PATCH] D157066: [clang][modules][deps] Create more efficient API for visitation of `ModuleFile` inputs

2023-08-09 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

Landing with one more use of `Filename` converted to `FilenameAsRequested` (in 
the call to `Listener.visitInputFile()`). The only remaining usages of `Filename` 
are now in the scanner (intentional) and in `ASTReader` when deciding whether an 
`InputFileInfo` has already been deserialized or not (a default-initialized 
`InputFileInfo` has an empty `Filename`).
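
The "empty `Filename` means not yet deserialized" convention can be sketched as follows. This is a simplified model under stated assumptions, not the `ASTReader` implementation: `InputFileInfoModel`, `StoredInputFile`, `decodeInputFileInfo`, and `InputFileInfoCache` are hypothetical names. The blob layout (the as-requested name followed by the name, split by a stored length) follows the abbreviation comments in the diff; the fallback for an empty name is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for serialization::InputFileInfo.
struct InputFileInfoModel {
  std::string FilenameAsRequested;
  std::string Filename;
};

// One serialized entry: a blob holding both names plus the length of the first.
struct StoredInputFile {
  std::string Blob;
  std::size_t AsRequestedLength;
};

InputFileInfoModel decodeInputFileInfo(const StoredInputFile &S) {
  InputFileInfoModel Info;
  Info.FilenameAsRequested = S.Blob.substr(0, S.AsRequestedLength);
  Info.Filename = S.Blob.substr(S.AsRequestedLength);
  // Assumption: when no separate name was stored, reuse the as-requested one,
  // so that a decoded entry never has an empty Filename.
  if (Info.Filename.empty())
    Info.Filename = Info.FilenameAsRequested;
  return Info;
}

class InputFileInfoCache {
  std::vector<StoredInputFile> Stored;
  std::vector<InputFileInfoModel> Loaded; // default-initialized: empty names
  unsigned Decodes = 0;

public:
  explicit InputFileInfoCache(std::vector<StoredInputFile> S)
      : Stored(std::move(S)), Loaded(Stored.size()) {}

  const InputFileInfoModel &get(std::size_t I) {
    // An empty Filename marks a slot that has not been deserialized yet.
    if (Loaded[I].Filename.empty()) {
      Loaded[I] = decodeInputFileInfo(Stored[I]);
      ++Decodes;
    }
    return Loaded[I];
  }

  unsigned decodeCount() const { return Decodes; }
};
```

Calling `get()` twice for the same index decodes the blob only once in this model, which is the property the sentinel check is meant to provide.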


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157066/new/

https://reviews.llvm.org/D157066



[PATCH] D157054: [clang] NFC: Use compile-time option spelling when generating command line

2023-08-09 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGacf57858c1ac: [clang] NFC: Use compile-time option spelling 
when generating command line (authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157054/new/

https://reviews.llvm.org/D157054

Files:
  clang/lib/Frontend/CompilerInvocation.cpp


Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -615,7 +615,7 @@
 static void GenerateArg(ArgumentConsumer Consumer,
 llvm::opt::OptSpecifier OptSpecifier) {
   Option Opt = getDriverOptTable().getOption(OptSpecifier);
-  denormalizeSimpleFlag(Consumer, Opt.getPrefix() + Opt.getName(),
+  denormalizeSimpleFlag(Consumer, Opt.getPrefixedName(),
 Option::OptionClass::FlagClass, 0);
 }
 
@@ -623,8 +623,7 @@
 llvm::opt::OptSpecifier OptSpecifier,
 const Twine &Value) {
   Option Opt = getDriverOptTable().getOption(OptSpecifier);
-  denormalizeString(Consumer, Opt.getPrefix() + Opt.getName(), Opt.getKind(), 0,
-Value);
+  denormalizeString(Consumer, Opt.getPrefixedName(), Opt.getKind(), 0, Value);
 }
 
 // Parse command line arguments into CompilerInvocation.




[PATCH] D157029: [llvm] Construct option's prefixed name at compile-time

2023-08-09 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG501f92d34382: [llvm] Construct option's prefixed name 
at compile-time (authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157029/new/

https://reviews.llvm.org/D157029

Files:
  clang-tools-extra/clangd/CompileCommands.cpp
  clang/lib/Driver/Driver.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/lib/Tooling/Tooling.cpp
  llvm/include/llvm/Option/OptTable.h
  llvm/include/llvm/Option/Option.h
  llvm/lib/Option/OptTable.cpp
  llvm/unittests/Option/OptionMarshallingTest.cpp
  llvm/utils/TableGen/OptParserEmitter.cpp

Index: llvm/utils/TableGen/OptParserEmitter.cpp
===
--- llvm/utils/TableGen/OptParserEmitter.cpp
+++ llvm/utils/TableGen/OptParserEmitter.cpp
@@ -34,30 +34,14 @@
   return OS;
 }
 
-static std::string getOptionSpelling(const Record &R, size_t &PrefixLength) {
+static std::string getOptionPrefixedName(const Record &R) {
   std::vector<StringRef> Prefixes = R.getValueAsListOfStrings("Prefixes");
   StringRef Name = R.getValueAsString("Name");
 
-  if (Prefixes.empty()) {
-PrefixLength = 0;
+  if (Prefixes.empty())
 return Name.str();
-  }
-
-  PrefixLength = Prefixes[0].size();
-  return (Twine(Prefixes[0]) + Twine(Name)).str();
-}
 
-static std::string getOptionSpelling(const Record &R) {
-  size_t PrefixLength;
-  return getOptionSpelling(R, PrefixLength);
-}
-
-static void emitNameUsingSpelling(raw_ostream &OS, const Record &R) {
-  size_t PrefixLength;
-  OS << "llvm::StringLiteral(";
-  write_cstring(
-  OS, StringRef(getOptionSpelling(R, PrefixLength)).substr(PrefixLength));
-  OS << ")";
+  return (Prefixes[0] + Twine(Name)).str();
 }
 
 class MarshallingInfo {
@@ -105,8 +89,6 @@
   }
 
   void emit(raw_ostream &OS) const {
-write_cstring(OS, StringRef(getOptionSpelling(R)));
-OS << ", ";
 OS << ShouldParse;
 OS << ", ";
 OS << ShouldAlwaysEmit;
@@ -346,8 +328,8 @@
 std::vector<StringRef> RPrefixes = R.getValueAsListOfStrings("Prefixes");
 OS << Prefixes[PrefixKeyT(RPrefixes.begin(), RPrefixes.end())] << ", ";
 
-// The option string.
-emitNameUsingSpelling(OS, R);
+// The option prefixed name.
+write_cstring(OS, getOptionPrefixedName(R));
 
 // The option identifier name.
 OS << ", " << getOptionName(R);
Index: llvm/unittests/Option/OptionMarshallingTest.cpp
===
--- llvm/unittests/Option/OptionMarshallingTest.cpp
+++ llvm/unittests/Option/OptionMarshallingTest.cpp
@@ -10,7 +10,7 @@
 #include "gtest/gtest.h"
 
 struct OptionWithMarshallingInfo {
-  llvm::StringRef Name;
+  llvm::StringLiteral PrefixedName;
   const char *KeyPath;
   const char *ImpliedCheck;
   const char *ImpliedValue;
@@ -18,20 +18,20 @@
 
 static const OptionWithMarshallingInfo MarshallingTable[] = {
 #define OPTION_WITH_MARSHALLING(   \
-PREFIX_TYPE, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,\
-HELPTEXT, METAVAR, VALUES, SPELLING, SHOULD_PARSE, ALWAYS_EMIT, KEYPATH,   \
+PREFIX_TYPE, PREFIXED_NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS,  \
+PARAM, HELPTEXT, METAVAR, VALUES, SHOULD_PARSE, ALWAYS_EMIT, KEYPATH,  \
 DEFAULT_VALUE, IMPLIED_CHECK, IMPLIED_VALUE, NORMALIZER, DENORMALIZER, \
 MERGER, EXTRACTOR, TABLE_INDEX)\
-  {NAME, #KEYPATH, #IMPLIED_CHECK, #IMPLIED_VALUE},
+  {PREFIXED_NAME, #KEYPATH, #IMPLIED_CHECK, #IMPLIED_VALUE},
 #include "Opts.inc"
 #undef OPTION_WITH_MARSHALLING
 };
 
 TEST(OptionMarshalling, EmittedOrderSameAsDefinitionOrder) {
-  ASSERT_STREQ(MarshallingTable[0].Name.data(), "marshalled-flag-d");
-  ASSERT_STREQ(MarshallingTable[1].Name.data(), "marshalled-flag-c");
-  ASSERT_STREQ(MarshallingTable[2].Name.data(), "marshalled-flag-b");
-  ASSERT_STREQ(MarshallingTable[3].Name.data(), "marshalled-flag-a");
+  ASSERT_EQ(MarshallingTable[0].PrefixedName, "-marshalled-flag-d");
+  ASSERT_EQ(MarshallingTable[1].PrefixedName, "-marshalled-flag-c");
+  ASSERT_EQ(MarshallingTable[2].PrefixedName, "-marshalled-flag-b");
+  ASSERT_EQ(MarshallingTable[3].PrefixedName, "-marshalled-flag-a");
 }
 
 TEST(OptionMarshalling, EmittedSpecifiedKeyPath) {
Index: llvm/lib/Option/OptTable.cpp
===
--- llvm/lib/Option/OptTable.cpp
+++ llvm/lib/Option/OptTable.cpp
@@ -59,7 +59,7 @@
   if (&A == &B)
 return false;
 
-  if (int N = StrCmpOptionName(A.Name, B.Name))
+  if (int N = StrCmpOptionName(A.getName(), B.getName()))
 return N < 0;
 
   for (size_t I = 0, K = std::min(A.Prefixes.size(), B.Prefixes.size()); I != K;
@@ -77,7 +77,7 @@
 
 // Support lower_bound between info and an option name.
 static inline bool operator<(const OptTable::Info &I, StringRef Name) {
-  retu

[PATCH] D157054: [clang] NFC: Use compile-time option spelling when generating command line

2023-08-08 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 548425.
jansvoboda11 added a comment.

Rebase


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157054/new/

https://reviews.llvm.org/D157054

Files:
  clang/lib/Frontend/CompilerInvocation.cpp


Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -615,7 +615,7 @@
 static void GenerateArg(ArgumentConsumer Consumer,
 llvm::opt::OptSpecifier OptSpecifier) {
   Option Opt = getDriverOptTable().getOption(OptSpecifier);
-  denormalizeSimpleFlag(Consumer, Opt.getPrefix() + Opt.getName(),
+  denormalizeSimpleFlag(Consumer, Opt.getPrefixedName(),
 Option::OptionClass::FlagClass, 0);
 }
 
@@ -623,8 +623,7 @@
 llvm::opt::OptSpecifier OptSpecifier,
 const Twine &Value) {
   Option Opt = getDriverOptTable().getOption(OptSpecifier);
-  denormalizeString(Consumer, Opt.getPrefix() + Opt.getName(), Opt.getKind(), 0,
-Value);
+  denormalizeString(Consumer, Opt.getPrefixedName(), Opt.getKind(), 0, Value);
 }
 
 // Parse command line arguments into CompilerInvocation.




[PATCH] D157029: [llvm] Construct option's prefixed name at compile-time

2023-08-08 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 548423.
jansvoboda11 added a comment.

Rebase, remove unnecessary changes


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157029/new/

https://reviews.llvm.org/D157029

Files:
  clang-tools-extra/clangd/CompileCommands.cpp
  clang/lib/Driver/Driver.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/lib/Tooling/Tooling.cpp
  llvm/include/llvm/Option/OptTable.h
  llvm/include/llvm/Option/Option.h
  llvm/lib/Option/OptTable.cpp
  llvm/unittests/Option/OptionMarshallingTest.cpp
  llvm/utils/TableGen/OptParserEmitter.cpp

Index: llvm/utils/TableGen/OptParserEmitter.cpp
===
--- llvm/utils/TableGen/OptParserEmitter.cpp
+++ llvm/utils/TableGen/OptParserEmitter.cpp
@@ -34,30 +34,14 @@
   return OS;
 }
 
-static std::string getOptionSpelling(const Record &R, size_t &PrefixLength) {
+static std::string getOptionPrefixedName(const Record &R) {
   std::vector Prefixes = R.getValueAsListOfStrings("Prefixes");
   StringRef Name = R.getValueAsString("Name");
 
-  if (Prefixes.empty()) {
-PrefixLength = 0;
+  if (Prefixes.empty())
 return Name.str();
-  }
-
-  PrefixLength = Prefixes[0].size();
-  return (Twine(Prefixes[0]) + Twine(Name)).str();
-}
 
-static std::string getOptionSpelling(const Record &R) {
-  size_t PrefixLength;
-  return getOptionSpelling(R, PrefixLength);
-}
-
-static void emitNameUsingSpelling(raw_ostream &OS, const Record &R) {
-  size_t PrefixLength;
-  OS << "llvm::StringLiteral(";
-  write_cstring(
-  OS, StringRef(getOptionSpelling(R, PrefixLength)).substr(PrefixLength));
-  OS << ")";
+  return (Prefixes[0] + Twine(Name)).str();
 }
 
 class MarshallingInfo {
@@ -105,8 +89,6 @@
   }
 
   void emit(raw_ostream &OS) const {
-write_cstring(OS, StringRef(getOptionSpelling(R)));
-OS << ", ";
 OS << ShouldParse;
 OS << ", ";
 OS << ShouldAlwaysEmit;
@@ -346,8 +328,8 @@
 std::vector RPrefixes = R.getValueAsListOfStrings("Prefixes");
 OS << Prefixes[PrefixKeyT(RPrefixes.begin(), RPrefixes.end())] << ", ";
 
-// The option string.
-emitNameUsingSpelling(OS, R);
+// The option prefixed name.
+write_cstring(OS, getOptionPrefixedName(R));
 
 // The option identifier name.
 OS << ", " << getOptionName(R);
Index: llvm/unittests/Option/OptionMarshallingTest.cpp
===
--- llvm/unittests/Option/OptionMarshallingTest.cpp
+++ llvm/unittests/Option/OptionMarshallingTest.cpp
@@ -10,7 +10,7 @@
 #include "gtest/gtest.h"
 
 struct OptionWithMarshallingInfo {
-  llvm::StringRef Name;
+  llvm::StringLiteral PrefixedName;
   const char *KeyPath;
   const char *ImpliedCheck;
   const char *ImpliedValue;
@@ -18,20 +18,20 @@
 
 static const OptionWithMarshallingInfo MarshallingTable[] = {
 #define OPTION_WITH_MARSHALLING(   \
-PREFIX_TYPE, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,\
-HELPTEXT, METAVAR, VALUES, SPELLING, SHOULD_PARSE, ALWAYS_EMIT, KEYPATH,   \
+PREFIX_TYPE, PREFIXED_NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS,  \
+PARAM, HELPTEXT, METAVAR, VALUES, SHOULD_PARSE, ALWAYS_EMIT, KEYPATH,  \
 DEFAULT_VALUE, IMPLIED_CHECK, IMPLIED_VALUE, NORMALIZER, DENORMALIZER, \
 MERGER, EXTRACTOR, TABLE_INDEX)\
-  {NAME, #KEYPATH, #IMPLIED_CHECK, #IMPLIED_VALUE},
+  {PREFIXED_NAME, #KEYPATH, #IMPLIED_CHECK, #IMPLIED_VALUE},
 #include "Opts.inc"
 #undef OPTION_WITH_MARSHALLING
 };
 
 TEST(OptionMarshalling, EmittedOrderSameAsDefinitionOrder) {
-  ASSERT_STREQ(MarshallingTable[0].Name.data(), "marshalled-flag-d");
-  ASSERT_STREQ(MarshallingTable[1].Name.data(), "marshalled-flag-c");
-  ASSERT_STREQ(MarshallingTable[2].Name.data(), "marshalled-flag-b");
-  ASSERT_STREQ(MarshallingTable[3].Name.data(), "marshalled-flag-a");
+  ASSERT_EQ(MarshallingTable[0].PrefixedName, "-marshalled-flag-d");
+  ASSERT_EQ(MarshallingTable[1].PrefixedName, "-marshalled-flag-c");
+  ASSERT_EQ(MarshallingTable[2].PrefixedName, "-marshalled-flag-b");
+  ASSERT_EQ(MarshallingTable[3].PrefixedName, "-marshalled-flag-a");
 }
 
 TEST(OptionMarshalling, EmittedSpecifiedKeyPath) {
Index: llvm/lib/Option/OptTable.cpp
===
--- llvm/lib/Option/OptTable.cpp
+++ llvm/lib/Option/OptTable.cpp
@@ -59,7 +59,7 @@
   if (&A == &B)
 return false;
 
-  if (int N = StrCmpOptionName(A.Name, B.Name))
+  if (int N = StrCmpOptionName(A.getName(), B.getName()))
 return N < 0;
 
   for (size_t I = 0, K = std::min(A.Prefixes.size(), B.Prefixes.size()); I != K;
@@ -77,7 +77,7 @@
 
 // Support lower_bound between info and an option name.
 static inline bool operator<(const OptTable::Info &I, StringRef Name) {
-  return StrCmpOptionNameIgnoreCase(I.Name, Name) < 0;
+  return StrCmpOptionNam

[PATCH] D156749: [modules] Fix error about the same module being defined in different .pcm files when using VFS overlays.

2023-08-07 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 accepted this revision.
jansvoboda11 added a comment.

In D156749#4562469, @vsapsai wrote:

> In D156749#4561803, @jansvoboda11 wrote:
>
>> My suggestion is to use the actual real on-disk path. Not 
>> `FileEntryRef::getName()`, but something that always behaves as if 
>> `use-external-name` was set to `true`. I believe this would handle your 
>> VFS/VFS-use-external-name-true/VFS-use-external-name-false problem. It would 
>> also handle another pitfall: two compilations with distinct VFS overlays 
>> that redirect two different as-requested module map paths into the same 
>> on-disk path.
>
> Do you suggest doing it for the hashing or for ASTWriter or both? We are 
> already doing some module map path canonicalization (added a comment in 
> corresponding place) but it's not pure on-disk path, it takes into account 
> VFS.

I mainly meant in ``. And then make the PCM files VFS-agnostic, which 
would probably require us to do the same thing for all the paths we serialize 
there. But I don't know how feasible that is. Besides, the scanner relies on 
the virtual paths in PCM files.

I think your solution is the most pragmatic. If you're confident this doesn't 
break anything internally, I say go for it. But I think it's good to be aware 
of the pitfall I mentioned, and make sure the build system doesn't trigger that.




Comment at: clang/lib/Serialization/ASTWriter.cpp:1330
 AddPath(WritingModule->PresumedModuleMapFile.empty()
-? Map.getModuleMapFileForUniquing(WritingModule)->getName()
+? Map.getModuleMapFileForUniquing(WritingModule)
+  ->getNameAsRequested()

Can we canonicalize this also? It'd be useful in the scanner.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156749/new/

https://reviews.llvm.org/D156749



[PATCH] D157066: [clang][modules][deps] Create more efficient API for visitation of `ModuleFile` inputs

2023-08-04 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added inline comments.



Comment at: clang/lib/Serialization/ASTReader.cpp:2411
   bool Transient = FI.Transient;
-  StringRef Filename = FI.Filename;
+  StringRef Filename = FI.FilenameAsRequested;
   uint64_t StoredContentHash = FI.ContentHash;

jansvoboda11 wrote:
> benlangmuir wrote:
> > It's not clear to me why this one changed
> This actually maintains the same semantics - `FI.Filename` was previously the 
> as-requested path, now it's the on-disk path. Without changing this line, 
> `FileManager::getFileRef()` would get called with the on-disk path, meaning 
> consumers of this function would get the incorrect path when calling 
> `FileEntryRef::getNameAsRequested()` on the returned file. I recall one 
> clang-scan-deps test failing because of it.
Just remembered: in `ModuleDepCollector.cpp`, we call 
`ModuleMap::getModuleMapFileForUniquing()`. This calls 
`SourceMgr.getFileEntryRefForID()` based on `Module::DefinitionLoc`, which 
triggers deserialization of the associated source location entry and ends up 
calling this function right here. We'd end up with the on-disk module map path 
for modular dependencies.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157066/new/

https://reviews.llvm.org/D157066



[PATCH] D157066: [clang][modules][deps] Create more efficient API for visitation of `ModuleFile` inputs

2023-08-04 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added inline comments.



Comment at: clang/include/clang/Serialization/ModuleFile.h:72
+  bool TopLevel;
+  bool ModuleMap;
 };

benlangmuir wrote:
> Is there something using this new split? It seems like every user is using 
> `&&` for these
You're right. I recall needing this on a (now probably abandoned) patch, so I
thought I might as well add the extra bit now, since I'm changing the format
anyway. I'm fine with removing this.



Comment at: clang/lib/Serialization/ASTReader.cpp:2411
   bool Transient = FI.Transient;
-  StringRef Filename = FI.Filename;
+  StringRef Filename = FI.FilenameAsRequested;
   uint64_t StoredContentHash = FI.ContentHash;

benlangmuir wrote:
> It's not clear to me why this one changed
This actually maintains the same semantics - `FI.Filename` was previously the 
as-requested path, now it's the on-disk path. Without changing this line, 
`FileManager::getFileRef()` would get called with the on-disk path, meaning 
consumers of this function would get the incorrect path when calling 
`FileEntryRef::getNameAsRequested()` on the returned file. I recall one 
clang-scan-deps test failing because of it.
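
The distinction drawn above can be illustrated with a toy model. This is only a sketch under stated assumptions: `FileRefSketch` and `FileManagerSketch` are hypothetical stand-ins written for this example, not the real `FileEntryRef` or `FileManager`, and the redirect table is a simplification of what a VFS overlay does. It shows that a reference remembers whichever path the lookup was made with, so looking up by the on-disk path loses the as-requested path.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a file reference that carries both names.
struct FileRefSketch {
  std::string NameAsRequested; // the path the caller asked for
  std::string OnDiskName;      // the path the file actually lives at
};

// Hypothetical stand-in for a file manager sitting on top of a VFS overlay.
class FileManagerSketch {
  std::map<std::string, std::string> Redirects; // as-requested -> on-disk

public:
  void addRedirect(std::string From, std::string To) {
    Redirects[std::move(From)] = std::move(To);
  }

  // The returned reference records the path used for the lookup.
  FileRefSketch getFileRef(const std::string &Path) const {
    auto It = Redirects.find(Path);
    return {Path, It == Redirects.end() ? Path : It->second};
  }
};
```

In this model, a lookup by the virtual path yields a reference whose as-requested name is the virtual path, while a lookup by the on-disk path yields a reference that only knows the on-disk path. That is the behaviour the comment describes as handing consumers the incorrect path.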



Comment at: clang/lib/Serialization/ASTReader.cpp:9323
 
+void ASTReader::visitInputFileInfos(
+serialization::ModuleFile &MF, bool IncludeSystem,

benlangmuir wrote:
> Can we rewrite `visitInputFiles` on top of this?
I can try consolidating these in a follow-up if you're fine with that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157066/new/

https://reviews.llvm.org/D157066



[PATCH] D156234: [clang][deps] add support for dependency scanning with cc1 command line

2023-08-04 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG6b4de7b1c71b: [clang][deps] add support for dependency 
scanning with cc1 command line (authored by cpsughrue, committed by 
jansvoboda11).
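
The diff in this message dispatches on whether the second command-line argument is `-cc1`. A minimal sketch of that check follows, assuming a hypothetical helper name `isCC1CommandLine` that is not part of clang. The size check is an addition made for this example; the quoted diff indexes `FinalCommandLine[1]` directly.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of clang): report whether a command line is
// already a frontend (cc1) invocation rather than a driver invocation.
bool isCC1CommandLine(const std::vector<std::string> &CommandLine) {
  // CommandLine[0] is the executable; the frontend marker comes right after.
  // The size check guards against command lines with fewer than two entries.
  return CommandLine.size() > 1 && CommandLine[1] == "-cc1";
}
```

A caller could use the result to choose between running the tool invocation directly and expanding driver jobs first, which is the branch the diff introduces.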

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156234/new/

https://reviews.llvm.org/D156234

Files:
  clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
  clang/test/ClangScanDeps/modules-cc1.cpp

Index: clang/test/ClangScanDeps/modules-cc1.cpp
===
--- /dev/null
+++ clang/test/ClangScanDeps/modules-cc1.cpp
@@ -0,0 +1,29 @@
+// Check that clang-scan-deps works with cc1 command lines
+
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+
+//--- modules_cc1.cpp
+#include "header.h"
+
+//--- header.h
+
+//--- module.modulemap
+module header1 { header "header.h" }
+
+//--- cdb.json.template
+[{
+  "file": "DIR/modules_cc1.cpp",
+  "directory": "DIR",
+  "command": "clang -cc1 DIR/modules_cc1.cpp -fimplicit-module-maps -o modules_cc1.o"
+}]
+
+// RUN: sed "s|DIR|%/t|g" %t/cdb.json.template > %t/cdb.json
+// RUN: clang-scan-deps -compilation-database %t/cdb.json -j 1 -mode preprocess-dependency-directives > %t/result
+// RUN: cat %t/result | sed 's:\?:/:g' | FileCheck %s -DPREFIX=%/t
+
+// CHECK: modules_cc1.o:
+// CHECK-NEXT: [[PREFIX]]/modules_cc1.cpp
+// CHECK-NEXT: [[PREFIX]]/module.modulemap
+// CHECK-NEXT: [[PREFIX]]/header.h
Index: clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -385,6 +385,9 @@
   if (!Compilation)
 return false;
 
+  if (Compilation->containsError())
+return false;
+
   for (const driver::Command &Job : Compilation->getJobs()) {
 if (!Callback(Job))
   return false;
@@ -392,6 +395,26 @@
   return true;
 }
 
+static bool createAndRunToolInvocation(
+std::vector CommandLine, DependencyScanningAction &Action,
+FileManager &FM,
+std::shared_ptr &PCHContainerOps,
+DiagnosticsEngine &Diags, DependencyConsumer &Consumer) {
+
+  // Save executable path before providing CommandLine to ToolInvocation
+  std::string Executable = CommandLine[0];
+  ToolInvocation Invocation(std::move(CommandLine), &Action, &FM,
+PCHContainerOps);
+  Invocation.setDiagnosticConsumer(Diags.getClient());
+  Invocation.setDiagnosticOptions(&Diags.getDiagnosticOptions());
+  if (!Invocation.run())
+return false;
+
+  std::vector Args = Action.takeLastCC1Arguments();
+  Consumer.handleBuildCommand({std::move(Executable), std::move(Args)});
+  return true;
+}
+
 bool DependencyScanningWorker::computeDependencies(
 StringRef WorkingDirectory, const std::vector &CommandLine,
 DependencyConsumer &Consumer, DependencyActionController &Controller,
@@ -454,37 +477,37 @@
   DependencyScanningAction Action(WorkingDirectory, Consumer, Controller, DepFS,
   Format, OptimizeArgs, EagerLoadModules,
   DisableFree, ModuleName);
-  bool Success = forEachDriverJob(
-  FinalCommandLine, *Diags, *FileMgr, [&](const driver::Command &Cmd) {
-if (StringRef(Cmd.getCreator().getName()) != "clang") {
-  // Non-clang command. Just pass through to the dependency
-  // consumer.
-  Consumer.handleBuildCommand(
-  {Cmd.getExecutable(),
-   {Cmd.getArguments().begin(), Cmd.getArguments().end()}});
-  return true;
-}
-
-std::vector Argv;
-Argv.push_back(Cmd.getExecutable());
-Argv.insert(Argv.end(), Cmd.getArguments().begin(),
-Cmd.getArguments().end());
-
-// Create an invocation that uses the underlying file
-// system to ensure that any file system requests that
-// are made by the driver do not go through the
-// dependency scanning filesystem.
-ToolInvocation Invocation(std::move(Argv), &Action, &*FileMgr,
-  PCHContainerOps);
-Invocation.setDiagnosticConsumer(Diags->getClient());
-Invocation.setDiagnosticOptions(&Diags->getDiagnosticOptions());
-if (!Invocation.run())
-  return false;
-
-std::vector Args = Action.takeLastCC1Arguments();
-Consumer.handleBuildCommand({Cmd.getExecutable(), std::move(Args)});
-return true;
-  });
+
+  bool Success = false;
+  if (FinalCommandLine[1] == "-cc1") {
+Success = createAndRunToolInvocation(FinalCommandLine, Action, *FileMgr,
+ PCHContainerOps, *Diags, Consumer);
+  } else {
+Success = forEachDriverJob(
+FinalCommandLine, *Diags, *FileMgr, [&](const driver::Command &Cmd) {
+  if (String

[PATCH] D157029: [llvm] Construct option's prefixed name at compile-time

2023-08-04 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 547339.
jansvoboda11 added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157029/new/

https://reviews.llvm.org/D157029

Files:
  clang-tools-extra/clangd/CompileCommands.cpp
  clang/lib/Driver/Driver.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/lib/Tooling/Tooling.cpp
  llvm/include/llvm/Option/OptTable.h
  llvm/include/llvm/Option/Option.h
  llvm/lib/Option/OptTable.cpp
  llvm/unittests/Option/OptionMarshallingTest.cpp
  llvm/utils/TableGen/OptParserEmitter.cpp

Index: llvm/utils/TableGen/OptParserEmitter.cpp
===
--- llvm/utils/TableGen/OptParserEmitter.cpp
+++ llvm/utils/TableGen/OptParserEmitter.cpp
@@ -34,30 +34,14 @@
   return OS;
 }
 
-static std::string getOptionSpelling(const Record &R, size_t &PrefixLength) {
+static std::string getOptionPrefixedName(const Record &R) {
   std::vector Prefixes = R.getValueAsListOfStrings("Prefixes");
   StringRef Name = R.getValueAsString("Name");
 
-  if (Prefixes.empty()) {
-PrefixLength = 0;
+  if (Prefixes.empty())
 return Name.str();
-  }
-
-  PrefixLength = Prefixes[0].size();
-  return (Twine(Prefixes[0]) + Twine(Name)).str();
-}
 
-static std::string getOptionSpelling(const Record &R) {
-  size_t PrefixLength;
-  return getOptionSpelling(R, PrefixLength);
-}
-
-static void emitNameUsingSpelling(raw_ostream &OS, const Record &R) {
-  size_t PrefixLength;
-  OS << "llvm::StringLiteral(";
-  write_cstring(
-  OS, StringRef(getOptionSpelling(R, PrefixLength)).substr(PrefixLength));
-  OS << ")";
+  return (Prefixes[0] + Twine(Name)).str();
 }
 
 class MarshallingInfo {
@@ -105,8 +89,6 @@
   }
 
   void emit(raw_ostream &OS) const {
-write_cstring(OS, StringRef(getOptionSpelling(R)));
-OS << ", ";
 OS << ShouldParse;
 OS << ", ";
 OS << ShouldAlwaysEmit;
@@ -303,7 +285,7 @@
 // The option prefix;
 OS << "llvm::ArrayRef()";
 
-// The option string.
+// The option prefixed name.
 OS << ", \"" << R.getValueAsString("Name") << '"';
 
 // The option identifier name.
@@ -346,8 +328,10 @@
 std::vector RPrefixes = R.getValueAsListOfStrings("Prefixes");
 OS << Prefixes[PrefixKeyT(RPrefixes.begin(), RPrefixes.end())] << ", ";
 
-// The option string.
-emitNameUsingSpelling(OS, R);
+// The option prefixed name.
+OS << "llvm::StringLiteral(";
+write_cstring(OS, getOptionPrefixedName(R));
+OS << ")";
 
 // The option identifier name.
 OS << ", " << getOptionName(R);
Index: llvm/unittests/Option/OptionMarshallingTest.cpp
===
--- llvm/unittests/Option/OptionMarshallingTest.cpp
+++ llvm/unittests/Option/OptionMarshallingTest.cpp
@@ -10,7 +10,7 @@
 #include "gtest/gtest.h"
 
 struct OptionWithMarshallingInfo {
-  llvm::StringRef Name;
+  llvm::StringLiteral PrefixedName;
   const char *KeyPath;
   const char *ImpliedCheck;
   const char *ImpliedValue;
@@ -18,20 +18,20 @@
 
 static const OptionWithMarshallingInfo MarshallingTable[] = {
 #define OPTION_WITH_MARSHALLING(   \
-PREFIX_TYPE, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,\
-HELPTEXT, METAVAR, VALUES, SPELLING, SHOULD_PARSE, ALWAYS_EMIT, KEYPATH,   \
+PREFIX_TYPE, PREFIXED_NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS,  \
+PARAM, HELPTEXT, METAVAR, VALUES, SHOULD_PARSE, ALWAYS_EMIT, KEYPATH,  \
 DEFAULT_VALUE, IMPLIED_CHECK, IMPLIED_VALUE, NORMALIZER, DENORMALIZER, \
 MERGER, EXTRACTOR, TABLE_INDEX)\
-  {NAME, #KEYPATH, #IMPLIED_CHECK, #IMPLIED_VALUE},
+  {PREFIXED_NAME, #KEYPATH, #IMPLIED_CHECK, #IMPLIED_VALUE},
 #include "Opts.inc"
 #undef OPTION_WITH_MARSHALLING
 };
 
 TEST(OptionMarshalling, EmittedOrderSameAsDefinitionOrder) {
-  ASSERT_STREQ(MarshallingTable[0].Name.data(), "marshalled-flag-d");
-  ASSERT_STREQ(MarshallingTable[1].Name.data(), "marshalled-flag-c");
-  ASSERT_STREQ(MarshallingTable[2].Name.data(), "marshalled-flag-b");
-  ASSERT_STREQ(MarshallingTable[3].Name.data(), "marshalled-flag-a");
+  ASSERT_EQ(MarshallingTable[0].PrefixedName, "-marshalled-flag-d");
+  ASSERT_EQ(MarshallingTable[1].PrefixedName, "-marshalled-flag-c");
+  ASSERT_EQ(MarshallingTable[2].PrefixedName, "-marshalled-flag-b");
+  ASSERT_EQ(MarshallingTable[3].PrefixedName, "-marshalled-flag-a");
 }
 
 TEST(OptionMarshalling, EmittedSpecifiedKeyPath) {
Index: llvm/lib/Option/OptTable.cpp
===
--- llvm/lib/Option/OptTable.cpp
+++ llvm/lib/Option/OptTable.cpp
@@ -59,7 +59,7 @@
   if (&A == &B)
 return false;
 
-  if (int N = StrCmpOptionName(A.Name, B.Name))
+  if (int N = StrCmpOptionName(A.getName(), B.getName()))
 return N < 0;
 
   for (size_t I = 0, K = std::min(A.Prefixes.size(),

[PATCH] D157028: [llvm] Extract common `OptTable` bits into macros

2023-08-04 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG3f092f37b736: [llvm] Extract common `OptTable` bits into 
macros (authored by jansvoboda11).
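
The refactoring named in the title can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `SKETCH_OPTS_INC` stands in for a TableGen-generated `Opts.inc`, the option entries have only four fields, and `SKETCH_MAKE_OPT_ID` / `SKETCH_CONSTRUCT_OPT_INFO` merely imitate the shape of the shared macros that appear in the diff. The point is that each user defines `OPTION(...)` as a variadic forwarder, so the field list is spelled out in one place only.

```cpp
#include <string_view>

// Hypothetical option list standing in for a generated Opts.inc.
#define SKETCH_OPTS_INC                                                        \
  OPTION("-", "help", help, "Show help")                                       \
  OPTION("-", "verbose", verbose, "Be chatty")

// Shared helpers (hypothetical names): the only places that know the
// per-option field list.
#define SKETCH_MAKE_OPT_ID(PREFIX, NAME, ID, HELPTEXT) OPT_##ID
#define SKETCH_CONSTRUCT_OPT_INFO(PREFIX, NAME, ID, HELPTEXT)                  \
  {PREFIX, NAME, HELPTEXT, OPT_##ID}

enum OptID {
  OPT_INVALID = 0, // This is not an option ID.
#define OPTION(...) SKETCH_MAKE_OPT_ID(__VA_ARGS__),
  SKETCH_OPTS_INC
#undef OPTION
  OPT_LAST
};

struct InfoSketch {
  std::string_view Prefix;
  std::string_view Name;
  std::string_view HelpText;
  OptID ID;
};

static constexpr InfoSketch InfoTable[] = {
#define OPTION(...) SKETCH_CONSTRUCT_OPT_INFO(__VA_ARGS__),
    SKETCH_OPTS_INC
#undef OPTION
};
```

With this arrangement, adding or removing a field means touching the shared helper macros and the generated list, not every tool that builds an ID enum or an info table.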

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157028/new/

https://reviews.llvm.org/D157028

Files:
  clang/include/clang/Driver/Options.h
  clang/lib/Driver/DriverOptions.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
  clang/tools/clang-scan-deps/ClangScanDeps.cpp
  lld/COFF/Driver.h
  lld/COFF/DriverUtils.cpp
  lld/ELF/Driver.h
  lld/ELF/DriverUtils.cpp
  lld/MachO/Driver.h
  lld/MinGW/Driver.cpp
  lld/wasm/Driver.cpp
  lldb/tools/driver/Driver.cpp
  lldb/tools/lldb-server/lldb-gdbserver.cpp
  lldb/tools/lldb-vscode/lldb-vscode.cpp
  llvm/include/llvm/Option/OptTable.h
  llvm/lib/ExecutionEngine/JITLink/COFFDirectiveParser.cpp
  llvm/lib/ExecutionEngine/JITLink/COFFDirectiveParser.h
  llvm/lib/ToolDrivers/llvm-dlltool/DlltoolDriver.cpp
  llvm/lib/ToolDrivers/llvm-lib/LibDriver.cpp
  llvm/tools/dsymutil/dsymutil.cpp
  llvm/tools/llvm-cvtres/llvm-cvtres.cpp
  llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
  llvm/tools/llvm-debuginfod/llvm-debuginfod.cpp
  llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp
  llvm/tools/llvm-dwp/llvm-dwp.cpp
  llvm/tools/llvm-gsymutil/llvm-gsymutil.cpp
  llvm/tools/llvm-ifs/llvm-ifs.cpp
  llvm/tools/llvm-libtool-darwin/llvm-libtool-darwin.cpp
  llvm/tools/llvm-lipo/llvm-lipo.cpp
  llvm/tools/llvm-ml/llvm-ml.cpp
  llvm/tools/llvm-mt/llvm-mt.cpp
  llvm/tools/llvm-nm/llvm-nm.cpp
  llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
  llvm/tools/llvm-objdump/ObjdumpOptID.h
  llvm/tools/llvm-objdump/llvm-objdump.cpp
  llvm/tools/llvm-rc/llvm-rc.cpp
  llvm/tools/llvm-readobj/llvm-readobj.cpp
  llvm/tools/llvm-size/llvm-size.cpp
  llvm/tools/llvm-strings/llvm-strings.cpp
  llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
  llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
  llvm/tools/sancov/sancov.cpp
  llvm/unittests/Option/OptionParsingTest.cpp

Index: llvm/unittests/Option/OptionParsingTest.cpp
===
--- llvm/unittests/Option/OptionParsingTest.cpp
+++ llvm/unittests/Option/OptionParsingTest.cpp
@@ -17,9 +17,7 @@
 
 enum ID {
   OPT_INVALID = 0, // This is not an option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  OPT_##ID,
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
 #include "Opts.inc"
   LastOption
 #undef OPTION
@@ -47,10 +45,7 @@
 };
 
 static constexpr OptTable::Info InfoTable[] = {
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  {PREFIX, NAME,  HELPTEXT,METAVAR, OPT_##ID,  Option::KIND##Class,\
-   PARAM,  FLAGS, OPT_##GROUP, OPT_##ALIAS, ALIASARGS, VALUES},
+#define OPTION(...) LLVM_CONSTRUCT_OPT_INFO(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
Index: llvm/tools/sancov/sancov.cpp
===
--- llvm/tools/sancov/sancov.cpp
+++ llvm/tools/sancov/sancov.cpp
@@ -63,9 +63,7 @@
 using namespace llvm::opt;
 enum ID {
   OPT_INVALID = 0, // This is not an option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  OPT_##ID,
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
@@ -78,13 +76,7 @@
 #undef PREFIX
 
 static constexpr opt::OptTable::Info InfoTable[] = {
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  {\
-  PREFIX,  NAME,  HELPTEXT,\
-  METAVAR, OPT_##ID,  opt::Option::KIND##Class,\
-  PARAM,   FLAGS, OPT_##GROUP, \
-  OPT_##ALIAS, ALIASARGS, VALUES},
+#define OPTION(...) LLVM_CONSTRUCT_OPT_INFO(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
Index: llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
===
--- llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
+++ llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
@@ -28,9 +28,7 @@
 namespace {
 enum ID {
   OPT_INVALID = 0, // This is not an option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  OPT_##ID,
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
@@ -43,13 +41,7 @@
 #undef PREFIX
 
 static constexpr o

[PATCH] D156749: [modules] Fix error about the same module being defined in different .pcm files when using VFS overlays.

2023-08-04 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

In D156749#4552590, @vsapsai wrote:

> In D156749#4551596, @jansvoboda11 wrote:
>
>> Alternatively, we could keep VFS overlays out of the context hash but create 
>> `` from the on-disk real path of the defining module map and make the 
>> whole PCM VFS-agnostic. Then it'd be correct to import that PCM regardless 
>> of the specific VFS overlay setup, as long as all VFS queries of the 
>> importer resolve the same way they resolved within the instance that built 
>> the PCM. Maybe we can force the importer to recompile the PCM if that's not 
>> the case, similar to what we do for diagnostic options.
>
> I'm not sure I understand your proposal. Before this change we were 
> calculating hash from the on-disk real path of the defining module map. And 
> due to different VFS/no-VFS options defining module map is at different 
> locations on-disk.

My suggestion is to use the actual real on-disk path. Not 
`FileEntryRef::getName()`, but something that always behaves as if 
`use-external-name` was set to `true`. I believe this would handle your 
VFS/VFS-use-external-name-true/VFS-use-external-name-false problem. It would 
also handle another pitfall: two compilations with distinct VFS overlays that 
redirect two different as-requested module map paths into the same on-disk path.
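
The second pitfall can be made concrete with a toy model. This is a sketch only: `OverlaySketch`, `resolveOnDisk`, and `moduleKey` are hypothetical names invented for this example, and a plain map stands in for a VFS overlay. It shows two compilations whose overlays redirect different as-requested module map paths to the same on-disk file, and that keying on the resolved on-disk path gives both the same key.

```cpp
#include <map>
#include <string>

// Hypothetical model of a VFS overlay: maps an as-requested path to the
// on-disk path it redirects to. Paths absent from the map are not redirected.
using OverlaySketch = std::map<std::string, std::string>;

// Resolve a path as if `use-external-name` were always `true`.
std::string resolveOnDisk(const OverlaySketch &Overlay,
                          const std::string &AsRequested) {
  auto It = Overlay.find(AsRequested);
  return It == Overlay.end() ? AsRequested : It->second;
}

// Key a module by the resolved on-disk path of its defining module map,
// rather than by the path the compilation happened to request.
std::string moduleKey(const OverlaySketch &Overlay,
                      const std::string &ModuleMapPath) {
  return resolveOnDisk(Overlay, ModuleMapPath);
}
```

Keying on the as-requested path instead would give the two compilations different keys for what is the same module map on disk.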


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156749/new/

https://reviews.llvm.org/D156749



[PATCH] D157029: [llvm] Construct option's prefixed name at compile-time

2023-08-04 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

In D157029#4561519, @jansvoboda11 wrote:

> In D157029#4561490, @MaskRay wrote:
>
>> This increases the size of `Info` (static data size and static 
>> relocations). In return, some dynamic relocations are saved. Is this a net 
>> win?
>
> If that's a concern, I can remove `Info::Name` and replace its usages with a 
> function call that drops the prefix from prefixed name.

Done.
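
The idea of deriving the name from the prefixed name can be sketched as follows. This is an illustration under stated assumptions, not the patch's code: `OptInfoSketch` is a hypothetical struct, and storing the prefix length directly is an assumption made to keep the example self-contained (the real table might obtain that length some other way).

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for an option table entry; not the real
// llvm::opt::OptTable::Info layout.
struct OptInfoSketch {
  std::string_view PrefixedName; // e.g. "-marshalled-flag-a"
  std::size_t PrefixLength;      // length of the first prefix, 0 if none

  // Derive the unprefixed name on demand instead of storing it separately.
  std::string_view getName() const {
    assert(PrefixLength <= PrefixedName.size());
    return PrefixedName.substr(PrefixLength);
  }
};
```

Only one string per option is stored this way, and callers that need the bare name pay for a cheap substring view rather than a second table field.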


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157029/new/

https://reviews.llvm.org/D157029



[PATCH] D157029: [llvm] Construct option's prefixed name at compile-time

2023-08-04 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 547316.
jansvoboda11 added a comment.

Remove `OptTable::Info::Name`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157029/new/

https://reviews.llvm.org/D157029

Files:
  clang-tools-extra/clangd/CompileCommands.cpp
  clang/lib/Driver/Driver.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/lib/Tooling/Tooling.cpp
  llvm/include/llvm/Option/OptTable.h
  llvm/include/llvm/Option/Option.h
  llvm/lib/Option/OptTable.cpp
  llvm/unittests/Option/OptionMarshallingTest.cpp
  llvm/utils/TableGen/OptParserEmitter.cpp

Index: llvm/utils/TableGen/OptParserEmitter.cpp
===
--- llvm/utils/TableGen/OptParserEmitter.cpp
+++ llvm/utils/TableGen/OptParserEmitter.cpp
@@ -34,30 +34,14 @@
   return OS;
 }
 
-static std::string getOptionSpelling(const Record &R, size_t &PrefixLength) {
+static std::string getOptionPrefixedName(const Record &R) {
   std::vector Prefixes = R.getValueAsListOfStrings("Prefixes");
   StringRef Name = R.getValueAsString("Name");
 
-  if (Prefixes.empty()) {
-PrefixLength = 0;
+  if (Prefixes.empty())
 return Name.str();
-  }
-
-  PrefixLength = Prefixes[0].size();
-  return (Twine(Prefixes[0]) + Twine(Name)).str();
-}
 
-static std::string getOptionSpelling(const Record &R) {
-  size_t PrefixLength;
-  return getOptionSpelling(R, PrefixLength);
-}
-
-static void emitNameUsingSpelling(raw_ostream &OS, const Record &R) {
-  size_t PrefixLength;
-  OS << "llvm::StringLiteral(";
-  write_cstring(
-  OS, StringRef(getOptionSpelling(R, PrefixLength)).substr(PrefixLength));
-  OS << ")";
+  return (Prefixes[0] + Twine(Name)).str();
 }
 
 class MarshallingInfo {
@@ -105,8 +89,6 @@
   }
 
   void emit(raw_ostream &OS) const {
-write_cstring(OS, StringRef(getOptionSpelling(R)));
-OS << ", ";
 OS << ShouldParse;
 OS << ", ";
 OS << ShouldAlwaysEmit;
@@ -303,7 +285,7 @@
 // The option prefix;
 OS << "llvm::ArrayRef()";
 
-// The option string.
+// The option prefixed name.
 OS << ", \"" << R.getValueAsString("Name") << '"';
 
 // The option identifier name.
@@ -346,8 +328,10 @@
 std::vector RPrefixes = R.getValueAsListOfStrings("Prefixes");
 OS << Prefixes[PrefixKeyT(RPrefixes.begin(), RPrefixes.end())] << ", ";
 
-// The option string.
-emitNameUsingSpelling(OS, R);
+// The option prefixed name.
+OS << "llvm::StringLiteral(";
+write_cstring(OS, getOptionPrefixedName(R));
+OS << ")";
 
 // The option identifier name.
 OS << ", " << getOptionName(R);
Index: llvm/unittests/Option/OptionMarshallingTest.cpp
===
--- llvm/unittests/Option/OptionMarshallingTest.cpp
+++ llvm/unittests/Option/OptionMarshallingTest.cpp
@@ -10,7 +10,7 @@
 #include "gtest/gtest.h"
 
 struct OptionWithMarshallingInfo {
-  llvm::StringRef Name;
+  llvm::StringLiteral PrefixedName;
   const char *KeyPath;
   const char *ImpliedCheck;
   const char *ImpliedValue;
@@ -18,20 +18,20 @@
 
 static const OptionWithMarshallingInfo MarshallingTable[] = {
 #define OPTION_WITH_MARSHALLING(   \
-PREFIX_TYPE, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,\
-HELPTEXT, METAVAR, VALUES, SPELLING, SHOULD_PARSE, ALWAYS_EMIT, KEYPATH,   \
+PREFIX_TYPE, PREFIXED_NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS,  \
+PARAM, HELPTEXT, METAVAR, VALUES, SHOULD_PARSE, ALWAYS_EMIT, KEYPATH,  \
 DEFAULT_VALUE, IMPLIED_CHECK, IMPLIED_VALUE, NORMALIZER, DENORMALIZER, \
 MERGER, EXTRACTOR, TABLE_INDEX)\
-  {NAME, #KEYPATH, #IMPLIED_CHECK, #IMPLIED_VALUE},
+  {PREFIXED_NAME, #KEYPATH, #IMPLIED_CHECK, #IMPLIED_VALUE},
 #include "Opts.inc"
 #undef OPTION_WITH_MARSHALLING
 };
 
 TEST(OptionMarshalling, EmittedOrderSameAsDefinitionOrder) {
-  ASSERT_STREQ(MarshallingTable[0].Name.data(), "marshalled-flag-d");
-  ASSERT_STREQ(MarshallingTable[1].Name.data(), "marshalled-flag-c");
-  ASSERT_STREQ(MarshallingTable[2].Name.data(), "marshalled-flag-b");
-  ASSERT_STREQ(MarshallingTable[3].Name.data(), "marshalled-flag-a");
+  ASSERT_EQ(MarshallingTable[0].PrefixedName, "-marshalled-flag-d");
+  ASSERT_EQ(MarshallingTable[1].PrefixedName, "-marshalled-flag-c");
+  ASSERT_EQ(MarshallingTable[2].PrefixedName, "-marshalled-flag-b");
+  ASSERT_EQ(MarshallingTable[3].PrefixedName, "-marshalled-flag-a");
 }
 
 TEST(OptionMarshalling, EmittedSpecifiedKeyPath) {
Index: llvm/lib/Option/OptTable.cpp
===
--- llvm/lib/Option/OptTable.cpp
+++ llvm/lib/Option/OptTable.cpp
@@ -59,7 +59,7 @@
   if (&A == &B)
 return false;
 
-  if (int N = StrCmpOptionName(A.Name, B.Name))
+  if (int N = StrCmpOptionName(A.getName(), B.getName()))
 return N < 0;
 
   for (size_t I = 0, K = std:

[PATCH] D157029: [llvm] Construct option's prefixed name at compile-time

2023-08-04 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

In D157029#4561490 , @MaskRay wrote:

> This increases the size of `Info` (static data size and static relocations). 
> In return, some dynamic relocations are saved. Is this a net win?

If that's a concern, I can remove `Info::Name` and replace its usages with a 
function call that drops the prefix from the prefixed name.
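
A minimal sketch of that alternative, using standard-library types rather than the real LLVM ones. The `OptionInfo` struct and the `nameWithoutPrefix` helper are hypothetical stand-ins for `OptTable::Info` and the proposed function, and the sketch assumes the prefixed name is always the first prefix followed by the name:

```cpp
#include <cassert>
#include <string_view>
#include <vector>

// Hypothetical, simplified stand-in for OptTable::Info: only the prefixed
// name is stored, the plain name is derived on demand.
struct OptionInfo {
  std::vector<std::string_view> Prefixes; // e.g. {"-", "--"}
  std::string_view PrefixedName;          // first prefix + name, e.g. "-foo"
};

// Recover the unprefixed name by dropping the first (default) prefix.
// Assumes PrefixedName was built as Prefixes[0] + Name.
inline std::string_view nameWithoutPrefix(const OptionInfo &Info) {
  if (Info.Prefixes.empty())
    return Info.PrefixedName;
  std::string_view Prefix = Info.Prefixes.front();
  assert(Info.PrefixedName.substr(0, Prefix.size()) == Prefix &&
         "prefixed name must start with the default prefix");
  return Info.PrefixedName.substr(Prefix.size());
}
```

The trade-off in this sketch is one fewer field per table entry in exchange for a small amount of work on each name lookup.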


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157029/new/

https://reviews.llvm.org/D157029

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157029: [llvm] Construct option's prefixed name at compile-time

2023-08-04 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added inline comments.



Comment at: llvm/include/llvm/Option/Option.h:103
 
+  StringLiteral getSpelling() const {
+assert(Info && "Must have a valid info!");

benlangmuir wrote:
> This could use a doc comment to differentiate it from other string 
> representations.
> 
> How does this compare with `Arg::getSpelling`? With `Arg`, IIUC the 
> "spelling" is how it was actually written rather than a canonical form.  That 
> might be confusing if this one is canonical; so we should at least clearly 
> document it or maybe put "canonical" in the API name?
You're right: `Arg::getSpelling()` is the as-written prefix and name, while 
`Option::getSpelling()` was the canonical prefix and name. I noticed there's 
also `Option::getPrefixedName()` which used to return `std::string`. I decided 
to reuse that and return `StringLiteral` instead. Thanks!
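
To illustrate the distinction discussed above, here is a small sketch with standard-library types. `OptionSketch`, `ArgSketch` and `matchArg` are hypothetical names, not the LLVM `Option`/`Arg` API; the only point is that the argument keeps what the user typed while the option keeps one canonical spelling:

```cpp
#include <cassert>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical option description: several accepted prefixes, one canonical
// prefixed name (first prefix + name).
struct OptionSketch {
  std::vector<std::string_view> Prefixes;
  std::string_view Name;
  std::string_view PrefixedName;
};

// Hypothetical parsed argument: remembers the spelling as written.
struct ArgSketch {
  std::string_view Spelling;
};

// Try every accepted prefix; on a match, keep the as-written text.
inline std::optional<ArgSketch> matchArg(const OptionSketch &Opt,
                                         std::string_view Written) {
  for (std::string_view Prefix : Opt.Prefixes) {
    if (Written.size() == Prefix.size() + Opt.Name.size() &&
        Written.substr(0, Prefix.size()) == Prefix &&
        Written.substr(Prefix.size()) == Opt.Name)
      return ArgSketch{Written};
  }
  return std::nullopt;
}
```

With prefixes `-` and `--`, an argument written as `--foo` keeps the spelling `--foo`, while the option's canonical prefixed name stays `-foo`.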


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157029/new/

https://reviews.llvm.org/D157029



[PATCH] D157029: [llvm] Construct option spelling at compile-time

2023-08-04 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 547298.
jansvoboda11 added a comment.

Improve `Option::getPrefixedName()` instead of introducing new 
`Option::getSpelling()`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157029/new/

https://reviews.llvm.org/D157029

Files:
  clang-tools-extra/clangd/CompileCommands.cpp
  clang/lib/Driver/Driver.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/lib/Tooling/Tooling.cpp
  llvm/include/llvm/Option/OptTable.h
  llvm/include/llvm/Option/Option.h
  llvm/lib/Option/OptTable.cpp
  llvm/utils/TableGen/OptParserEmitter.cpp

Index: llvm/utils/TableGen/OptParserEmitter.cpp
===
--- llvm/utils/TableGen/OptParserEmitter.cpp
+++ llvm/utils/TableGen/OptParserEmitter.cpp
@@ -105,8 +105,6 @@
   }
 
   void emit(raw_ostream &OS) const {
-write_cstring(OS, StringRef(getOptionSpelling(R)));
-OS << ", ";
 OS << ShouldParse;
 OS << ", ";
 OS << ShouldAlwaysEmit;
@@ -306,6 +304,9 @@
 // The option string.
 OS << ", \"" << R.getValueAsString("Name") << '"';
 
+// The option spelling.
+OS << ", \"" << R.getValueAsString("Name") << '"';
+
 // The option identifier name.
 OS << ", " << getOptionName(R);
 
@@ -349,6 +350,11 @@
 // The option string.
 emitNameUsingSpelling(OS, R);
 
+// The option spelling.
+OS << ", llvm::StringLiteral(";
+write_cstring(OS, getOptionSpelling(R));
+OS << ")";
+
 // The option identifier name.
 OS << ", " << getOptionName(R);
 
Index: llvm/lib/Option/OptTable.cpp
===
--- llvm/lib/Option/OptTable.cpp
+++ llvm/lib/Option/OptTable.cpp
@@ -529,7 +529,7 @@
 
 static std::string getOptionHelpName(const OptTable &Opts, OptSpecifier Id) {
   const Option O = Opts.getOption(Id);
-  std::string Name = O.getPrefixedName();
+  std::string Name = O.getPrefixedName().str();
 
   // Add metavar, if used.
   switch (O.getKind()) {
Index: llvm/include/llvm/Option/Option.h
===
--- llvm/include/llvm/Option/Option.h
+++ llvm/include/llvm/Option/Option.h
@@ -130,10 +130,9 @@
   }
 
   /// Get the name of this option with the default prefix.
-  std::string getPrefixedName() const {
-std::string Ret(getPrefix());
-Ret += getName();
-return Ret;
+  StringLiteral getPrefixedName() const {
+assert(Info && "Must have a valid info!");
+return Info->PrefixedName;
   }
 
   /// Get the help text for this option.
Index: llvm/include/llvm/Option/OptTable.h
===
--- llvm/include/llvm/Option/OptTable.h
+++ llvm/include/llvm/Option/OptTable.h
@@ -44,7 +44,9 @@
 /// A null terminated array of prefix strings to apply to name while
 /// matching.
 ArrayRef<StringLiteral> Prefixes;
+// TODO: Compute this from PrefixedName.
 StringRef Name;
+StringLiteral PrefixedName;
 const char *HelpText;
 const char *MetaVar;
 unsigned ID;
@@ -298,31 +300,31 @@
 
 } // end namespace llvm
 
-#define LLVM_MAKE_OPT_ID_WITH_ID_PREFIX(ID_PREFIX, PREFIX, NAME, ID, KIND, \
-GROUP, ALIAS, ALIASARGS, FLAGS, PARAM, \
-HELPTEXT, METAVAR, VALUES) \
+#define LLVM_MAKE_OPT_ID_WITH_ID_PREFIX(   \
+ID_PREFIX, PREFIX, NAME, PREFIXED_NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, \
+FLAGS, PARAM, HELPTEXT, METAVAR, VALUES)   \
   ID_PREFIX##ID
 
-#define LLVM_MAKE_OPT_ID(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS,  \
- FLAGS, PARAM, HELPTEXT, METAVAR, VALUES)  \
-  LLVM_MAKE_OPT_ID_WITH_ID_PREFIX(OPT_, PREFIX, NAME, ID, KIND, GROUP, ALIAS,  \
-  ALIASARGS, FLAGS, PARAM, HELPTEXT, METAVAR,  \
-  VALUE)
+#define LLVM_MAKE_OPT_ID(PREFIX, NAME, PREFIXED_NAME, ID, KIND, GROUP, ALIAS,  \
+ ALIASARGS, FLAGS, PARAM, HELPTEXT, METAVAR, VALUES)   \
+  LLVM_MAKE_OPT_ID_WITH_ID_PREFIX(OPT_, PREFIX, NAME, PREFIXED_NAME, ID, KIND, \
+  GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,   \
+  HELPTEXT, METAVAR, VALUE)
 
 #define LLVM_CONSTRUCT_OPT_INFO_WITH_ID_PREFIX(\
-ID_PREFIX, PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-HELPTEXT, METAVAR, VALUES) \
+ID_PREFIX, PREFIX, NAME, PREFIXED_NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, \
+FLAGS, PARAM, HELPTEXT, METAVAR, VALUES)   \
   llvm::opt::OptTable::Info {  \
-PREFIX, NAME, HELPTEXT, METAVAR, ID_PREFIX##ID, 

[PATCH] D156948: [clang][modules] Add -Wsystem-headers-in-module=

2023-08-04 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 accepted this revision.
jansvoboda11 added a comment.
This revision is now accepted and ready to land.

LGTM




Comment at: clang/include/clang/Basic/DiagnosticOptions.h:128
+  /// whether -Wsystem-headers is enabled on a per-module basis.
+  std::vector<std::string> SystemHeaderWarningsModules;
+

Out of interest, is there an existing use-case for having multiple modules 
here, or is this just future-proofing?



Comment at: clang/lib/Frontend/CompilerInstance.cpp:1238-1239
 
+  for (StringRef Name : DiagOpts.SystemHeaderWarningsModules)
+if (Name == ModuleName)
+  Instance.getDiagnostics().setSuppressSystemWarnings(false);

I assume `llvm::is_contained()` wouldn't be okay with the `std::string` and 
`StringRef` mismatch?
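
For comparison, a standard-library analogue of that membership check. This is only a sketch: `containsName` is a hypothetical helper, and `std::string_view` stands in for `StringRef`, so it says nothing about how `llvm::is_contained()` itself behaves with mixed types:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Returns true if any stored module name equals Name. The comparison between
// std::string and std::string_view does not allocate.
inline bool containsName(const std::vector<std::string> &Names,
                         std::string_view Name) {
  return std::any_of(
      Names.begin(), Names.end(),
      [Name](const std::string &Candidate) { return Candidate == Name; });
}
```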


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156948/new/

https://reviews.llvm.org/D156948



[PATCH] D157066: [clang][modules][deps] Create more efficient API for visitation of `ModuleFile` inputs

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 547097.
jansvoboda11 added a comment.

Remove leftover `std::string` constructor


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157066/new/

https://reviews.llvm.org/D157066

Files:
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/include/clang/Serialization/ASTReader.h
  clang/include/clang/Serialization/ModuleFile.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp

Index: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
===
--- clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -459,18 +459,19 @@
   serialization::ModuleFile *MF =
   MDC.ScanInstance.getASTReader()->getModuleManager().lookup(
   M->getASTFile());
-  MDC.ScanInstance.getASTReader()->visitInputFiles(
-  *MF, true, true, [&](const serialization::InputFile &IF, bool isSystem) {
+  MDC.ScanInstance.getASTReader()->visitInputFileInfos(
+  *MF, /*IncludeSystem=*/true,
+  [&](const serialization::InputFileInfo &IFI, bool IsSystem) {
 // __inferred_module.map is the result of the way in which an implicit
 // module build handles inferred modules. It adds an overlay VFS with
 // this file in the proper directory and relies on the rest of Clang to
 // handle it like normal. With explicitly built modules we don't need
 // to play VFS tricks, so replace it with the correct module map.
-if (IF.getFile()->getName().endswith("__inferred_module.map")) {
+if (StringRef(IFI.Filename).endswith("__inferred_module.map")) {
   MDC.addFileDep(MD, ModuleMap->getName());
   return;
 }
-MDC.addFileDep(MD, IF.getFile()->getName());
+MDC.addFileDep(MD, IFI.Filename);
   });
 
   llvm::DenseSet<const Module *> SeenDeps;
@@ -478,11 +479,15 @@
   addAllSubmoduleDeps(M, MD, SeenDeps);
   addAllAffectingClangModules(M, MD, SeenDeps);
 
-  MDC.ScanInstance.getASTReader()->visitTopLevelModuleMaps(
-  *MF, [&](FileEntryRef FE) {
-if (FE.getNameAsRequested().endswith("__inferred_module.map"))
+  MDC.ScanInstance.getASTReader()->visitInputFileInfos(
+  *MF, /*IncludeSystem=*/true,
+  [&](const serialization::InputFileInfo &IFI, bool IsSystem) {
+if (!(IFI.TopLevel && IFI.ModuleMap))
   return;
-MD.ModuleMapFileDeps.emplace_back(FE.getNameAsRequested());
+if (StringRef(IFI.FilenameAsRequested)
+.endswith("__inferred_module.map"))
+  return;
+MD.ModuleMapFileDeps.emplace_back(IFI.FilenameAsRequested);
   });
 
   CompilerInvocation CI = MDC.makeInvocationForModuleBuildWithoutOutputs(
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1525,7 +1525,8 @@
   bool IsSystemFile;
   bool IsTransient;
   bool BufferOverridden;
-  bool IsTopLevelModuleMap;
+  bool IsTopLevel;
+  bool IsModuleMap;
   uint32_t ContentHash[2];
 
   InputFileEntry(FileEntryRef File) : File(File) {}
@@ -1547,8 +1548,10 @@
   IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 32)); // Modification time
   IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Overridden
   IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Transient
+  IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Top-level
   IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Module map
-  IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob)); // File name
+  IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 16)); // Name as req. len
+  IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob)); // Name as req. + name
   unsigned IFAbbrevCode = Stream.EmitAbbrev(std::move(IFAbbrev));
 
   // Create input file hash abbreviation.
@@ -1582,8 +1585,8 @@
 Entry.IsSystemFile = isSystem(File.getFileCharacteristic());
 Entry.IsTransient = Cache->IsTransient;
 Entry.BufferOverridden = Cache->BufferOverridden;
-Entry.IsTopLevelModuleMap = isModuleMap(File.getFileCharacteristic()) &&
-File.getIncludeLoc().isInvalid();
+Entry.IsTopLevel = File.getIncludeLoc().isInvalid();
+Entry.IsModuleMap = isModuleMap(File.getFileCharacteristic());
 
 auto ContentHash = hash_code(-1);
 if (PP->getHeaderSearchInfo()
@@ -1631,6 +1634,15 @@
 // Emit size/modification time for this file.
 // And whether this file was overridden.
 {
+  SmallString<128> NameAsRequested = Entry.File.getNameAsRequested();
+  SmallString<128> Name = Entry.File.getName();
+
+  PreparePathForOutput(NameAsRequested);
+  PreparePathForOutput(Name);
+
+  if (Name == NameAsRequested)
+

[PATCH] D157066: [clang][modules][deps] Create more efficient API for visitation of `ModuleFile` inputs

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

The current `ASTReader::visitInputFiles()` function calls into `FileManager` to 
create `FileEntryRef` objects. This ends up being fairly costly in 
`clang-scan-deps`, where we mostly only care about file paths.

This patch introduces a new `ASTReader` API that gives clients access to just the 
serialized paths. Since the scanner needs both the as-requested path and the 
on-disk one (and doesn't want to transform the former into the latter via 
`FileManager`), this patch starts serializing both of them into the PCM file if 
they differ.

This increases the size of scanning PCMs by 0.1% and speeds up scanning by 5%.
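
The "serialize both paths only if they differ" idea can be sketched as follows. The record layout here (a length for the as-requested name, then one blob holding the as-requested name followed by the on-disk name) mirrors the abbreviation comments in the diff, but the struct and function names are hypothetical, and treating an empty remainder as "same as the as-requested name" is an assumption, since the diff is truncated at that point:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical in-memory form of the serialized record.
struct EncodedNames {
  std::size_t AsRequestedLen; // length of the as-requested name
  std::string Blob;           // as-requested name, then on-disk name if different
};

struct DecodedNames {
  std::string FilenameAsRequested;
  std::string Filename;
};

// Store the on-disk name only when it differs from the as-requested one.
inline EncodedNames encodeNames(std::string_view AsRequested,
                                std::string_view OnDisk) {
  EncodedNames E{AsRequested.size(), std::string(AsRequested)};
  if (OnDisk != AsRequested)
    E.Blob.append(OnDisk);
  return E;
}

// Split the blob back into two names; an empty remainder means both are equal.
inline DecodedNames decodeNames(const EncodedNames &E) {
  assert(E.AsRequestedLen <= E.Blob.size());
  std::string_view Blob = E.Blob;
  DecodedNames D;
  D.FilenameAsRequested = std::string(Blob.substr(0, E.AsRequestedLen));
  std::string_view Rest = Blob.substr(E.AsRequestedLen);
  D.Filename = Rest.empty() ? D.FilenameAsRequested : std::string(Rest);
  return D;
}
```

In this sketch the common case, where both names match, costs only the extra length field, which is consistent with the small size increase reported above.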


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157066

Files:
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/include/clang/Serialization/ASTReader.h
  clang/include/clang/Serialization/ModuleFile.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp

Index: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
===
--- clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -459,18 +459,19 @@
   serialization::ModuleFile *MF =
   MDC.ScanInstance.getASTReader()->getModuleManager().lookup(
   M->getASTFile());
-  MDC.ScanInstance.getASTReader()->visitInputFiles(
-  *MF, true, true, [&](const serialization::InputFile &IF, bool isSystem) {
+  MDC.ScanInstance.getASTReader()->visitInputFileInfos(
+  *MF, /*IncludeSystem=*/true,
+  [&](const serialization::InputFileInfo &IFI, bool IsSystem) {
 // __inferred_module.map is the result of the way in which an implicit
 // module build handles inferred modules. It adds an overlay VFS with
 // this file in the proper directory and relies on the rest of Clang to
 // handle it like normal. With explicitly built modules we don't need
 // to play VFS tricks, so replace it with the correct module map.
-if (IF.getFile()->getName().endswith("__inferred_module.map")) {
+if (StringRef(IFI.Filename).endswith("__inferred_module.map")) {
   MDC.addFileDep(MD, ModuleMap->getName());
   return;
 }
-MDC.addFileDep(MD, IF.getFile()->getName());
+MDC.addFileDep(MD, IFI.Filename);
   });
 
   llvm::DenseSet<const Module *> SeenDeps;
@@ -478,11 +479,15 @@
   addAllSubmoduleDeps(M, MD, SeenDeps);
   addAllAffectingClangModules(M, MD, SeenDeps);
 
-  MDC.ScanInstance.getASTReader()->visitTopLevelModuleMaps(
-  *MF, [&](FileEntryRef FE) {
-if (FE.getNameAsRequested().endswith("__inferred_module.map"))
+  MDC.ScanInstance.getASTReader()->visitInputFileInfos(
+  *MF, /*IncludeSystem=*/true,
+  [&](const serialization::InputFileInfo &IFI, bool IsSystem) {
+if (!(IFI.TopLevel && IFI.ModuleMap))
   return;
-MD.ModuleMapFileDeps.emplace_back(FE.getNameAsRequested());
+if (StringRef(IFI.FilenameAsRequested)
+.endswith("__inferred_module.map"))
+  return;
+MD.ModuleMapFileDeps.emplace_back(IFI.FilenameAsRequested);
   });
 
   CompilerInvocation CI = MDC.makeInvocationForModuleBuildWithoutOutputs(
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1525,7 +1525,8 @@
   bool IsSystemFile;
   bool IsTransient;
   bool BufferOverridden;
-  bool IsTopLevelModuleMap;
+  bool IsTopLevel;
+  bool IsModuleMap;
   uint32_t ContentHash[2];
 
   InputFileEntry(FileEntryRef File) : File(File) {}
@@ -1547,8 +1548,10 @@
   IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 32)); // Modification time
   IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Overridden
   IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Transient
+  IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Top-level
   IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Module map
-  IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob)); // File name
+  IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 16)); // Name as req. len
+  IFAbbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob)); // Name as req. + name
   unsigned IFAbbrevCode = Stream.EmitAbbrev(std::move(IFAbbrev));
 
   // Create input file hash abbreviation.
@@ -1582,8 +1585,8 @@
 Entry.IsSystemFile = isSystem(File.getFileCharacteristic());
 Entry.IsTransient = Cache->IsTransient;
 Entry.BufferOverridden = Cache->BufferOverridden;
-Entry.IsTopLevelModuleM

[PATCH] D157046: [clang] Abstract away string allocation in command line generation

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added inline comments.



Comment at: clang/lib/Frontend/CompilerInvocation.cpp:4323
+GenerateArg(Consumer, OPT_darwin_target_variant_sdk_version_EQ,
+Opts.DarwinTargetVariantSDKVersion.getAsString());
 }

benlangmuir wrote:
> Maybe not worth micro optimizing, but I noticed these two are allocating 
> strings unnecessarily if we had an overload for things that can print to a 
> raw_ostream.
Interesting, there are a couple of other instances where this might help. I 
probably won't be spending time on this right now, but good to be aware.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157046/new/

https://reviews.llvm.org/D157046



[PATCH] D157052: [clang][deps] NFC: Speed up canonical context hash computation

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG8fd56ea11256: [clang][deps] NFC: Speed up canonical context 
hash computation (authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157052/new/

https://reviews.llvm.org/D157052

Files:
  clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp


Index: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
===
--- clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -269,12 +269,13 @@
   HashBuilder.add(serialization::VERSION_MAJOR, serialization::VERSION_MINOR);
 
   // Hash the BuildInvocation without any input files.
-  SmallVector<const char *, 32> Args;
-  llvm::BumpPtrAllocator Alloc;
-  llvm::StringSaver Saver(Alloc);
-  CI.generateCC1CommandLine(
-  Args, [&](const Twine &Arg) { return Saver.save(Arg).data(); });
-  HashBuilder.addRange(Args);
+  SmallString<0> ArgVec;
+  ArgVec.reserve(4096);
+  CI.generateCC1CommandLine([&](const Twine &Arg) {
+Arg.toVector(ArgVec);
+ArgVec.push_back('\0');
+  });
+  HashBuilder.add(ArgVec);
 
   // Hash the module dependencies. These paths may differ even if the invocation
   // is identical if they depend on the contents of the files in the TU -- for
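
The change above swaps a vector of individually saved strings for one flat buffer that is hashed in a single call. A minimal standard-library sketch of that flattening step follows; `flattenArgs` is a hypothetical helper, and the remark about the terminator keeping argument boundaries distinct is an inference from the diff, not something the commit message states:

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Append every argument, each followed by a NUL terminator, into one buffer.
// Without the terminator, {"ab", "c"} and {"a", "bc"} would flatten to the
// same bytes and therefore hash identically.
inline std::string flattenArgs(const std::vector<std::string_view> &Args) {
  std::string Buffer;
  Buffer.reserve(4096); // same up-front reservation as in the diff
  for (std::string_view Arg : Args) {
    Buffer.append(Arg);
    Buffer.push_back('\0');
  }
  return Buffer;
}
```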




[PATCH] D157048: [clang] NFC: Avoid double allocation when generating command line

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGacd1ab869fca: [clang] NFC: Avoid double allocation when 
generating command line (authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157048/new/

https://reviews.llvm.org/D157048

Files:
  clang/lib/Frontend/CompilerInvocation.cpp


Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -4611,17 +4611,10 @@
 }
 
 std::vector<std::string> CompilerInvocation::getCC1CommandLine() const {
-  // Set up string allocator.
-  llvm::BumpPtrAllocator Alloc;
-  llvm::StringSaver Strings(Alloc);
-  auto SA = [&Strings](const Twine &Arg) { return Strings.save(Arg).data(); };
-
-  // Synthesize full command line from the CompilerInvocation, including "-cc1".
-  SmallVector<const char *, 32> Args{"-cc1"};
-  generateCC1CommandLine(Args, SA);
-
-  // Convert arguments to the return type.
-  return std::vector<std::string>{Args.begin(), Args.end()};
+  std::vector<std::string> Args{"-cc1"};
+  generateCC1CommandLine(
+  [&Args](const Twine &Arg) { Args.push_back(Arg.str()); });
+  return Args;
 }
 
 void CompilerInvocation::resetNonModularOptions() {
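
The consumer-callback shape used in this change can be sketched with standard-library types. `ArgConsumer`, `generateArgs` and `collectArgs` are hypothetical names, with `std::function` and `std::string_view` standing in for the LLVM types in the diff; the flags emitted are placeholders:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for CompilerInvocation::ArgumentConsumer.
using ArgConsumer = std::function<void(std::string_view)>;

// Hypothetical generator: it hands each argument to the consumer and leaves
// the decision about storage to the caller.
inline void generateArgs(bool EmitFlag, std::string_view Value,
                         const ArgConsumer &Consumer) {
  if (EmitFlag)
    Consumer("-some-flag");
  Consumer("-some-option");
  Consumer(Value);
}

// One possible caller: copy each argument once, straight into the result.
inline std::vector<std::string> collectArgs(bool EmitFlag,
                                            std::string_view Value) {
  std::vector<std::string> Args{"-cc1"};
  generateArgs(EmitFlag, Value,
               [&Args](std::string_view Arg) { Args.emplace_back(Arg); });
  return Args;
}
```

Because the generator no longer owns the storage, a different caller could append into a single buffer instead, without an intermediate allocator.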




[PATCH] D157046: [clang] Abstract away string allocation in command line generation

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG83452650490e: [clang] Abstract away string allocation in 
command line generation (authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157046/new/

https://reviews.llvm.org/D157046

Files:
  clang/include/clang/Frontend/CompilerInvocation.h
  clang/lib/Frontend/CompilerInvocation.cpp

Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -165,6 +165,8 @@
 // Normalizers
 //===--===//
 
+using ArgumentConsumer = CompilerInvocation::ArgumentConsumer;
+
 #define SIMPLE_ENUM_VALUE_TABLE
 #include "clang/Driver/Options.inc"
 #undef SIMPLE_ENUM_VALUE_TABLE
@@ -191,13 +193,10 @@
 /// denormalizeSimpleFlags never looks at it. Avoid bloating compile-time with
 /// unnecessary template instantiations and just ignore it with a variadic
 /// argument.
-static void denormalizeSimpleFlag(SmallVectorImpl<const char *> &Args,
-  const Twine &Spelling,
-  CompilerInvocation::StringAllocator,
-  Option::OptionClass, unsigned, /*T*/...) {
-  // Spelling is already allocated or a static string, no need to call SA.
-  assert(*Spelling.getSingleStringRef().end() == '\0');
-  Args.push_back(Spelling.getSingleStringRef().data());
+static void denormalizeSimpleFlag(ArgumentConsumer Consumer,
+  const Twine &Spelling, Option::OptionClass,
+  unsigned, /*T*/...) {
+  Consumer(Spelling);
 }
 
 template <typename T> static constexpr bool is_uint64_t_convertible() {
@@ -234,34 +233,27 @@
 }
 
 static auto makeBooleanOptionDenormalizer(bool Value) {
-  return [Value](SmallVectorImpl<const char *> &Args, const Twine &Spelling,
- CompilerInvocation::StringAllocator, Option::OptionClass,
- unsigned, bool KeyPath) {
-if (KeyPath == Value) {
-  // Spelling is already allocated or a static string, no need to call SA.
-  assert(*Spelling.getSingleStringRef().end() == '\0');
-  Args.push_back(Spelling.getSingleStringRef().data());
-}
+  return [Value](ArgumentConsumer Consumer, const Twine &Spelling,
+ Option::OptionClass, unsigned, bool KeyPath) {
+if (KeyPath == Value)
+  Consumer(Spelling);
   };
 }
 
-static void denormalizeStringImpl(SmallVectorImpl<const char *> &Args,
+static void denormalizeStringImpl(ArgumentConsumer Consumer,
   const Twine &Spelling,
-  CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass, unsigned,
   const Twine &Value) {
   switch (OptClass) {
   case Option::SeparateClass:
   case Option::JoinedOrSeparateClass:
   case Option::JoinedAndSeparateClass:
-// Spelling is already allocated or a static string, no need to call SA.
-assert(*Spelling.getSingleStringRef().end() == '\0');
-Args.push_back(Spelling.getSingleStringRef().data());
-Args.push_back(SA(Value));
+Consumer(Spelling);
+Consumer(Value);
 break;
   case Option::JoinedClass:
   case Option::CommaJoinedClass:
-Args.push_back(SA(Twine(Spelling) + Value));
+Consumer(Spelling + Value);
 break;
   default:
 llvm_unreachable("Cannot denormalize an option with option class "
@@ -270,11 +262,10 @@
 }
 
 template <typename T>
-static void
-denormalizeString(SmallVectorImpl<const char *> &Args, const Twine &Spelling,
-  CompilerInvocation::StringAllocator SA,
-  Option::OptionClass OptClass, unsigned TableIndex, T Value) {
-  denormalizeStringImpl(Args, Spelling, SA, OptClass, TableIndex, Twine(Value));
+static void denormalizeString(ArgumentConsumer Consumer, const Twine &Spelling,
+  Option::OptionClass OptClass, unsigned TableIndex,
+  T Value) {
+  denormalizeStringImpl(Consumer, Spelling, OptClass, TableIndex, Twine(Value));
 }
 
 static std::optional
@@ -315,15 +306,14 @@
   return std::nullopt;
 }
 
-static void denormalizeSimpleEnumImpl(SmallVectorImpl<const char *> &Args,
+static void denormalizeSimpleEnumImpl(ArgumentConsumer Consumer,
   const Twine &Spelling,
-  CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass,
   unsigned TableIndex, unsigned Value) {
   assert(TableIndex < SimpleEnumValueTablesSize);
   const SimpleEnumValueTable &Table = SimpleEnumValueTables[TableIndex];
   if (auto MaybeEnumVal = findValueTableByValue(Table, Value)) {
-denormalizeString(Args, Spelling, SA, OptClass, Tabl

[PATCH] D157028: [llvm] Extract common `OptTable` bits into macros

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added inline comments.



Comment at: lld/ELF/Driver.h:28
   OPT_INVALID = 0,
-#define OPTION(_1, _2, ID, _4, _5, _6, _7, _8, _9, _10, _11, _12) OPT_##ID,
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
 #include "Options.inc"

MaskRay wrote:
> lld/wasm lld/COFF lld/MachO are not updated?
You're right, I accidentally missed some LLD parts. Updated.
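
For readers unfamiliar with the pattern, the variadic `OPTION(...)` forwarding can be sketched in isolation. The `SKETCH_*` macros and the two options below are hypothetical; `SKETCH_OPTIONS` stands in for the generated `Options.inc`, and `SKETCH_MAKE_OPT_ID` plays the role that `LLVM_MAKE_OPT_ID` plays in the patch:

```cpp
#include <cassert>

// Hypothetical shared helper: picks the ID out of the full argument list.
// Adding a parameter here does not require touching any OPTION definition.
#define SKETCH_MAKE_OPT_ID(PREFIX, NAME, ID, KIND) OPT_##ID

// Hypothetical stand-in for the contents of a generated Options.inc file.
#define SKETCH_OPTIONS                                                         \
  OPTION("-", "help", help, Flag)                                              \
  OPTION("-", "o", output, Separate)

enum ID {
  OPT_INVALID = 0, // This is not an option ID.
#define OPTION(...) SKETCH_MAKE_OPT_ID(__VA_ARGS__),
  SKETCH_OPTIONS
#undef OPTION
  OPT_LAST
};
```

Each tool then only forwards `__VA_ARGS__`, so a change to the option tuple needs to be made in the shared macro alone.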


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157028/new/

https://reviews.llvm.org/D157028



[PATCH] D157028: [llvm] Extract common `OptTable` bits into macros

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 547090.
jansvoboda11 added a comment.
Herald added subscribers: pmatos, asb, aheejin, sbc100.
Herald added a project: lld-macho.
Herald added a reviewer: lld-macho.

Convert missed LLD parts.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157028/new/

https://reviews.llvm.org/D157028

Files:
  clang/include/clang/Driver/Options.h
  clang/lib/Driver/DriverOptions.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
  clang/tools/clang-scan-deps/ClangScanDeps.cpp
  lld/COFF/Driver.h
  lld/COFF/DriverUtils.cpp
  lld/ELF/Driver.h
  lld/ELF/DriverUtils.cpp
  lld/MachO/Driver.h
  lld/MinGW/Driver.cpp
  lld/wasm/Driver.cpp
  lldb/tools/driver/Driver.cpp
  lldb/tools/lldb-server/lldb-gdbserver.cpp
  lldb/tools/lldb-vscode/lldb-vscode.cpp
  llvm/include/llvm/Option/OptTable.h
  llvm/lib/ExecutionEngine/JITLink/COFFDirectiveParser.cpp
  llvm/lib/ExecutionEngine/JITLink/COFFDirectiveParser.h
  llvm/lib/ToolDrivers/llvm-dlltool/DlltoolDriver.cpp
  llvm/lib/ToolDrivers/llvm-lib/LibDriver.cpp
  llvm/tools/dsymutil/dsymutil.cpp
  llvm/tools/llvm-cvtres/llvm-cvtres.cpp
  llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
  llvm/tools/llvm-debuginfod/llvm-debuginfod.cpp
  llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp
  llvm/tools/llvm-dwp/llvm-dwp.cpp
  llvm/tools/llvm-gsymutil/llvm-gsymutil.cpp
  llvm/tools/llvm-ifs/llvm-ifs.cpp
  llvm/tools/llvm-libtool-darwin/llvm-libtool-darwin.cpp
  llvm/tools/llvm-lipo/llvm-lipo.cpp
  llvm/tools/llvm-ml/llvm-ml.cpp
  llvm/tools/llvm-mt/llvm-mt.cpp
  llvm/tools/llvm-nm/llvm-nm.cpp
  llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
  llvm/tools/llvm-objdump/ObjdumpOptID.h
  llvm/tools/llvm-objdump/llvm-objdump.cpp
  llvm/tools/llvm-rc/llvm-rc.cpp
  llvm/tools/llvm-readobj/llvm-readobj.cpp
  llvm/tools/llvm-size/llvm-size.cpp
  llvm/tools/llvm-strings/llvm-strings.cpp
  llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
  llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
  llvm/tools/sancov/sancov.cpp
  llvm/unittests/Option/OptionParsingTest.cpp

Index: llvm/unittests/Option/OptionParsingTest.cpp
===
--- llvm/unittests/Option/OptionParsingTest.cpp
+++ llvm/unittests/Option/OptionParsingTest.cpp
@@ -17,9 +17,7 @@
 
 enum ID {
   OPT_INVALID = 0, // This is not an option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  OPT_##ID,
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
 #include "Opts.inc"
   LastOption
 #undef OPTION
@@ -47,10 +45,7 @@
 };
 
 static constexpr OptTable::Info InfoTable[] = {
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  {PREFIX, NAME,  HELPTEXT,METAVAR, OPT_##ID,  Option::KIND##Class,\
-   PARAM,  FLAGS, OPT_##GROUP, OPT_##ALIAS, ALIASARGS, VALUES},
+#define OPTION(...) LLVM_CONSTRUCT_OPT_INFO(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
Index: llvm/tools/sancov/sancov.cpp
===
--- llvm/tools/sancov/sancov.cpp
+++ llvm/tools/sancov/sancov.cpp
@@ -63,9 +63,7 @@
 using namespace llvm::opt;
 enum ID {
   OPT_INVALID = 0, // This is not an option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  OPT_##ID,
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
@@ -78,13 +76,7 @@
 #undef PREFIX
 
 static constexpr opt::OptTable::Info InfoTable[] = {
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  {\
-  PREFIX,  NAME,  HELPTEXT,\
-  METAVAR, OPT_##ID,  opt::Option::KIND##Class,\
-  PARAM,   FLAGS, OPT_##GROUP, \
-  OPT_##ALIAS, ALIASARGS, VALUES},
+#define OPTION(...) LLVM_CONSTRUCT_OPT_INFO(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
Index: llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
===
--- llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
+++ llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
@@ -28,9 +28,7 @@
 namespace {
 enum ID {
   OPT_INVALID = 0, // This is not an option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  OPT_##ID,
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
@@ -4

[PATCH] D157055: [clang][deps] Avoid unnecessary work for seen dependencies

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Modular dependencies the client has already seen don't need to be reported 
again. This patch leverages that to skip somewhat expensive computation: 
generating the full command line and collecting the full set of file 
dependencies. Everything else is necessary for computation of the canonical 
context hash, which we cannot skip.
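
The idea can be illustrated with a minimal, self-contained sketch. It does not use the Clang API: `ModuleID`, `ModuleDeps`, `Consumer` and `Collector` are hypothetical stand-ins, and the "context hash" is a placeholder string. The sketch only shows the ordering: compute the ID first, ask the consumer whether it has been seen, and do the expensive work only if it has not.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the scanner's types; not the Clang API.
struct ModuleID {
  std::string Name;
  std::string ContextHash;
  bool operator<(const ModuleID &O) const {
    return std::tie(Name, ContextHash) < std::tie(O.Name, O.ContextHash);
  }
};

struct ModuleDeps {
  ModuleID ID;
  std::vector<std::string> BuildArguments; // assumed expensive to produce
  std::vector<std::string> FileDeps;       // assumed expensive to produce
};

class Consumer {
public:
  bool alreadySeen(const ModuleID &ID) const { return Seen.count(ID) != 0; }

  void handle(ModuleDeps MD) {
    assert(!alreadySeen(MD.ID) && "seen modules should not be reported again");
    Seen.insert(MD.ID);
    Reported.push_back(std::move(MD));
  }

  std::size_t reportedCount() const { return Reported.size(); }

private:
  std::set<ModuleID> Seen;
  std::vector<ModuleDeps> Reported;
};

struct Collector {
  int ExpensiveRuns = 0; // counts how often the expensive part ran

  ModuleID collect(Consumer &C, const std::string &Name) {
    ModuleDeps MD;
    MD.ID.Name = Name;
    // The ID depends on the context hash, so this part always runs.
    MD.ID.ContextHash = "hash-of-" + Name; // placeholder, not a real hash

    // Early exit: the client already has the details for this module.
    if (C.alreadySeen(MD.ID))
      return MD.ID;

    // Expensive part: full command line and full set of file dependencies.
    ++ExpensiveRuns;
    MD.BuildArguments = {"-cc1", "-fmodule-name=" + Name};
    MD.FileDeps = {Name + ".modulemap"};

    ModuleID ID = MD.ID;
    C.handle(std::move(MD));
    return ID;
  }
};
```

Collecting the same module twice through one `Consumer` runs the expensive part only once. In this simplified sketch the consumer itself records what it has seen, whereas in the patch that knowledge comes from the client.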


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157055

Files:
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
  clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
  clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp

Index: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
===
--- clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -406,7 +406,8 @@
 MDC.ProvidedStdCXXModule, MDC.RequiredStdCXXModules);
 
   for (auto &&I : MDC.ModularDeps)
-MDC.Consumer.handleModuleDependency(*I.second);
+if (!MDC.Consumer.alreadySeenModuleDependency(I.second->ID))
+  MDC.Consumer.handleModuleDependency(*I.second);
 
   for (const Module *M : MDC.DirectModularDeps) {
 auto It = MDC.ModularDeps.find(M);
@@ -458,19 +459,6 @@
   serialization::ModuleFile *MF =
   MDC.ScanInstance.getASTReader()->getModuleManager().lookup(
   M->getASTFile());
-  MDC.ScanInstance.getASTReader()->visitInputFiles(
-  *MF, true, true, [&](const serialization::InputFile &IF, bool isSystem) {
-// __inferred_module.map is the result of the way in which an implicit
-// module build handles inferred modules. It adds an overlay VFS with
-// this file in the proper directory and relies on the rest of Clang to
-// handle it like normal. With explicitly built modules we don't need
-// to play VFS tricks, so replace it with the correct module map.
-if (IF.getFile()->getName().endswith("__inferred_module.map")) {
-  MDC.addFileDep(MD, ModuleMap->getName());
-  return;
-}
-MDC.addFileDep(MD, IF.getFile()->getName());
-  });
 
   llvm::DenseSet SeenDeps;
   addAllSubmodulePrebuiltDeps(M, MD, SeenDeps);
@@ -493,11 +481,28 @@
 
   MDC.associateWithContextHash(CI, MD);
 
+  if (MDC.Consumer.alreadySeenModuleDependency(MD.ID))
+return MD.ID;
+
   // Finish the compiler invocation. Requires dependencies and the context hash.
   MDC.addOutputPaths(CI, MD);
 
   MD.BuildArguments = CI.getCC1CommandLine();
 
+  MDC.ScanInstance.getASTReader()->visitInputFiles(
+  *MF, true, true, [&](const serialization::InputFile &IF, bool isSystem) {
+// __inferred_module.map is the result of the way in which an implicit
+// module build handles inferred modules. It adds an overlay VFS with
+// this file in the proper directory and relies on the rest of Clang to
+// handle it like normal. With explicitly built modules we don't need
+// to play VFS tricks, so replace it with the correct module map.
+if (IF.getFile()->getName().endswith("__inferred_module.map")) {
+  MDC.addFileDep(MD, ModuleMap->getName());
+  return;
+}
+MDC.addFileDep(MD, IF.getFile()->getName());
+  });
+
   return MD.ID;
 }
 
Index: clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
@@ -172,15 +172,7 @@
   TU.FileDeps = std::move(Dependencies);
   TU.PrebuiltModuleDeps = std::move(PrebuiltModuleDeps);
   TU.Commands = std::move(Commands);
-
-  for (auto &&M : ClangModuleDeps) {
-auto &MD = M.second;
-// TODO: Avoid handleModuleDependency even being called for modules
-//   we've already seen.
-if (AlreadySeen.count(M.first))
-  continue;
-TU.ModuleGraph.push_back(std::move(MD));
-  }
+  TU.ModuleGraph = takeModuleGraphDeps();
   TU.ClangModuleDeps = std::move(DirectModuleDeps);
 
   return TU;
@@ -189,14 +181,8 @@
 ModuleDepsGraph FullDependencyConsumer::takeModuleGraphDeps() {
   ModuleDepsGraph ModuleGraph;
 
-  for (auto &&M : ClangModuleDeps) {
-auto &MD = M.second;
-// TODO: Avoid handleModuleDependency even being called for modules
-//   we've already seen.
-if (AlreadySeen.count(M.first))
-  continue;
+  for (auto &&[ID, MD] : ClangModuleDeps)
 ModuleGraph.push_back(std::move(MD));
-  }
 
   return ModuleGraph;
 }
Index: clang/include/clang/Tooling/Dependency

[PATCH] D157054: [clang] NFC: Use compile-time option spelling when generating command line

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

When generating command lines, use the option spelling generated by TableGen 
(`StringLiteral`) instead of constructing it at runtime. This saves some 
needless allocations.

Depends on D157029.
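
A rough sketch of the difference, using only the standard library: `OptionInfo`, `Table`, `spellingAtRuntime` and `spellingFromTable` are hypothetical names, and the hand-written table stands in for what TableGen would generate. Concatenating prefix and name allocates a new string on every call, while returning a view of a stored literal does not.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical option record; the real table is generated by TableGen.
struct OptionInfo {
  std::string_view Prefix;   // e.g. "-"
  std::string_view Name;     // e.g. "fmodules"
  std::string_view Spelling; // e.g. "-fmodules", stored as a literal
};

// Hand-written stand-in for a generated option table.
constexpr OptionInfo Table[] = {
    {"-", "fmodules", "-fmodules"},
    {"-", "o", "-o"},
};

// Runtime approach: builds prefix + name, allocating a new string each time.
inline std::string spellingAtRuntime(const OptionInfo &O) {
  std::string S(O.Prefix);
  S += O.Name;
  // The precomputed spelling is expected to match the concatenation.
  assert(S == O.Spelling);
  return S;
}

// Compile-time approach: returns a view of the stored literal, no allocation.
constexpr std::string_view spellingFromTable(const OptionInfo &O) {
  return O.Spelling;
}
```

Both functions yield the same text; the second one only hands out a pointer and a length that were fixed when the table was compiled.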


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157054

Files:
  clang/lib/Frontend/CompilerInvocation.cpp


Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -628,7 +628,7 @@
 llvm::opt::OptSpecifier OptSpecifier,
 CompilerInvocation::StringAllocator SA) {
   Option Opt = getDriverOptTable().getOption(OptSpecifier);
-  denormalizeSimpleFlag(Args, SA(Opt.getPrefix() + Opt.getName()), SA,
+  denormalizeSimpleFlag(Args, Opt.getSpelling(), SA,
 Option::OptionClass::FlagClass, 0);
 }
 
@@ -637,8 +637,7 @@
 const Twine &Value,
 CompilerInvocation::StringAllocator SA) {
   Option Opt = getDriverOptTable().getOption(OptSpecifier);
-  denormalizeString(Args, SA(Opt.getPrefix() + Opt.getName()), SA,
-Opt.getKind(), 0, Value);
+  denormalizeString(Args, Opt.getSpelling(), SA, Opt.getKind(), 0, Value);
 }
 
 // Parse command line arguments into CompilerInvocation.




[PATCH] D157050: [clang] NFC: Avoid double allocation when generating command line

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 abandoned this revision.
jansvoboda11 added a comment.

In D157050#4559138, @benlangmuir wrote:

> Dupe of https://reviews.llvm.org/D157048 ?

Yes, weird. Closing this one.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157050/new/

https://reviews.llvm.org/D157050



[PATCH] D157052: [clang][deps] NFC: Speed up canonical context hash computation

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This patch makes use of the infrastructure established in D157046 to speed up
computation of the canonical context hash in the dependency scanner. This is
somewhat hot code, since it's run for all modules in the dependency graph of
every TU.

I also tried an alternative approach that tried to avoid allocations as much as
possible (essentially doing `HashBuilder.add(Arg.toStringRef(ArgVec))`), but
that turned out to be slower than the approach in this patch.

Note that this is not problematic in the same way command-line hashing used to
be prior to D143027. The lambda is now being called even for constant strings.

Depends on D157046.
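
The buffer-based approach can be sketched without LLVM as follows. This is only an illustration under assumptions: `generateArgs`, `flattenArgs` and `hashCommandLine` are hypothetical names, `std::string` stands in for `SmallString`, and `std::hash` stands in for `HashBuilder`. Each argument is appended to one contiguous buffer followed by a `'\0'`, so that argument boundaries still influence the hash.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for command-line generation: feeds every argument to
// a callback instead of returning a container.
using ArgConsumer = std::function<void(std::string_view)>;

inline void generateArgs(const std::vector<std::string> &Args,
                         const ArgConsumer &Consume) {
  for (const std::string &A : Args)
    Consume(A);
}

// Appends all arguments to a single buffer, each terminated by '\0'.
inline std::string flattenArgs(const std::vector<std::string> &Args) {
  std::string Buffer;
  Buffer.reserve(4096); // one up-front allocation covers typical command lines
  generateArgs(Args, [&Buffer](std::string_view Arg) {
    // The separator is only unambiguous if arguments contain no '\0'.
    assert(Arg.find('\0') == std::string_view::npos);
    Buffer.append(Arg);
    Buffer.push_back('\0');
  });
  return Buffer;
}

// Hashes the flattened buffer in one go (std::hash is a stand-in here).
inline std::size_t hashCommandLine(const std::vector<std::string> &Args) {
  return std::hash<std::string>{}(flattenArgs(Args));
}
```

Without the terminator, the argument lists `{"ab", "c"}` and `{"a", "bc"}` would flatten to the same bytes; with it, the buffers differ.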


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157052

Files:
  clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp


Index: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
===
--- clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -269,12 +269,13 @@
   HashBuilder.add(serialization::VERSION_MAJOR, serialization::VERSION_MINOR);
 
   // Hash the BuildInvocation without any input files.
-  SmallVector Args;
-  llvm::BumpPtrAllocator Alloc;
-  llvm::StringSaver Saver(Alloc);
-  CI.generateCC1CommandLine(
-  Args, [&](const Twine &Arg) { return Saver.save(Arg).data(); });
-  HashBuilder.addRange(Args);
+  SmallString<0> ArgVec;
+  ArgVec.reserve(4096);
+  CI.generateCC1CommandLine([&](const Twine &Arg) {
+Arg.toVector(ArgVec);
+ArgVec.push_back('\0');
+  });
+  HashBuilder.add(ArgVec);
 
   // Hash the module dependencies. These paths may differ even if the invocation
   // is identical if they depend on the contents of the files in the TU -- for


[PATCH] D157050: [clang] NFC: Avoid double allocation when generating command line

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This patch makes use of the infrastructure established in D157046 to avoid
needless allocations via `StringSaver`.

Depends on D157046.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157050

Files:
  clang/lib/Frontend/CompilerInvocation.cpp


Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -4611,17 +4611,10 @@
 }
 
 std::vector CompilerInvocation::getCC1CommandLine() const {
-  // Set up string allocator.
-  llvm::BumpPtrAllocator Alloc;
-  llvm::StringSaver Strings(Alloc);
-  auto SA = [&Strings](const Twine &Arg) { return Strings.save(Arg).data(); };
-
-  // Synthesize full command line from the CompilerInvocation, including "-cc1".
-  SmallVector Args{"-cc1"};
-  generateCC1CommandLine(Args, SA);
-
-  // Convert arguments to the return type.
-  return std::vector{Args.begin(), Args.end()};
+  std::vector Args{"-cc1"};
+  generateCC1CommandLine(
+  [&Args](const Twine &Arg) { Args.push_back(Arg.str()); });
+  return Args;
 }
 
 void CompilerInvocation::resetNonModularOptions() {


[PATCH] D157048: [clang] NFC: Avoid double allocation when generating command line

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This patch makes use of the infrastructure established in D157046 to avoid
needless allocations via `StringSaver`.

Depends on D157046.
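
To make the "double allocation" concrete, here is a small sketch using only the standard library. All names are hypothetical: `generateArgs` stands in for command-line generation, and a `std::deque<std::string>` stands in for arena-backed string saving. The first function copies every argument twice (once into temporary storage, once into the result); the second lets the callback append to the result directly.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <vector>

using ArgConsumer = std::function<void(const std::string &)>;

// Hypothetical generator emitting two fixed arguments.
inline void generateArgs(const ArgConsumer &Consume) {
  Consume("-triple");
  Consume("x86_64-unknown-linux-gnu");
}

// Before: every argument is saved into temporary storage, referenced through
// a raw pointer, and then copied again into the returned vector.
inline std::vector<std::string> commandLineTwoCopies() {
  std::deque<std::string> Saved; // stand-in for an arena-backed string saver
  std::vector<const char *> Ptrs{"-cc1"};
  generateArgs([&](const std::string &Arg) {
    Saved.push_back(Arg); // first copy
    Ptrs.push_back(Saved.back().c_str());
  });
  assert(Saved.size() + 1 == Ptrs.size());
  // Second copy: each C string becomes a std::string in the result.
  return std::vector<std::string>(Ptrs.begin(), Ptrs.end());
}

// After: the callback appends to the result, so each argument is copied once.
inline std::vector<std::string> commandLineOneCopy() {
  std::vector<std::string> Args{"-cc1"};
  generateArgs([&Args](const std::string &Arg) { Args.push_back(Arg); });
  return Args;
}
```

Both functions return the same three arguments; only the number of intermediate copies differs.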


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157048

Files:
  clang/lib/Frontend/CompilerInvocation.cpp


Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -4611,17 +4611,10 @@
 }
 
 std::vector CompilerInvocation::getCC1CommandLine() const {
-  // Set up string allocator.
-  llvm::BumpPtrAllocator Alloc;
-  llvm::StringSaver Strings(Alloc);
-  auto SA = [&Strings](const Twine &Arg) { return Strings.save(Arg).data(); };
-
-  // Synthesize full command line from the CompilerInvocation, including "-cc1".
-  SmallVector Args{"-cc1"};
-  generateCC1CommandLine(Args, SA);
-
-  // Convert arguments to the return type.
-  return std::vector{Args.begin(), Args.end()};
+  std::vector Args{"-cc1"};
+  generateCC1CommandLine(
+  [&Args](const Twine &Arg) { Args.push_back(Arg.str()); });
+  return Args;
 }
 
 void CompilerInvocation::resetNonModularOptions() {


[PATCH] D157046: [clang] Abstract string allocation in command line generation

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added subscribers: cfe-commits, jplehr, sstefan1.
Herald added a project: clang.

This patch abstracts away the string allocation and vector push-back from 
command line generation. Instead, **all** generated arguments are passed into 
`ArgumentConsumer`, which may choose to do the string allocation and vector 
push-back, or something else entirely.
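
A minimal sketch of this kind of abstraction, with assumptions stated up front: `std::function` is used here as a stand-in for the callback type, and `Options` and `generateCommandLine` are hypothetical names rather than the Clang interface. The generator only calls the consumer; what happens to each argument is entirely up to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Stand-in callback type; the generator never sees a container or allocator.
using ArgumentConsumer = std::function<void(const std::string &)>;

// Hypothetical options to be turned back into command-line arguments.
struct Options {
  bool Verbose = false;
  std::string Output;
};

inline void generateCommandLine(const Options &Opts,
                                const ArgumentConsumer &Consumer) {
  assert(Consumer && "consumer must be callable");
  if (Opts.Verbose)
    Consumer("-v");
  if (!Opts.Output.empty()) {
    Consumer("-o");
    Consumer(Opts.Output);
  }
}
```

One caller might collect the arguments into a `std::vector<std::string>`, while another might only sum their lengths or feed them to a hash, without storing any strings at all.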


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157046

Files:
  clang/include/clang/Frontend/CompilerInvocation.h
  clang/lib/Frontend/CompilerInvocation.cpp

Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -165,6 +165,8 @@
 // Normalizers
 //===--===//
 
+using ArgumentConsumer = CompilerInvocation::ArgumentConsumer;
+
 #define SIMPLE_ENUM_VALUE_TABLE
 #include "clang/Driver/Options.inc"
 #undef SIMPLE_ENUM_VALUE_TABLE
@@ -191,13 +193,10 @@
 /// denormalizeSimpleFlags never looks at it. Avoid bloating compile-time with
 /// unnecessary template instantiations and just ignore it with a variadic
 /// argument.
-static void denormalizeSimpleFlag(SmallVectorImpl &Args,
-  const Twine &Spelling,
-  CompilerInvocation::StringAllocator,
-  Option::OptionClass, unsigned, /*T*/...) {
-  // Spelling is already allocated or a static string, no need to call SA.
-  assert(*Spelling.getSingleStringRef().end() == '\0');
-  Args.push_back(Spelling.getSingleStringRef().data());
+static void denormalizeSimpleFlag(ArgumentConsumer Consumer,
+  const Twine &Spelling, Option::OptionClass,
+  unsigned, /*T*/...) {
+  Consumer(Spelling);
 }
 
 template  static constexpr bool is_uint64_t_convertible() {
@@ -234,34 +233,27 @@
 }
 
 static auto makeBooleanOptionDenormalizer(bool Value) {
-  return [Value](SmallVectorImpl &Args, const Twine &Spelling,
- CompilerInvocation::StringAllocator, Option::OptionClass,
- unsigned, bool KeyPath) {
-if (KeyPath == Value) {
-  // Spelling is already allocated or a static string, no need to call SA.
-  assert(*Spelling.getSingleStringRef().end() == '\0');
-  Args.push_back(Spelling.getSingleStringRef().data());
-}
+  return [Value](ArgumentConsumer Consumer, const Twine &Spelling,
+ Option::OptionClass, unsigned, bool KeyPath) {
+if (KeyPath == Value)
+  Consumer(Spelling);
   };
 }
 
-static void denormalizeStringImpl(SmallVectorImpl &Args,
+static void denormalizeStringImpl(ArgumentConsumer Consumer,
   const Twine &Spelling,
-  CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass, unsigned,
   const Twine &Value) {
   switch (OptClass) {
   case Option::SeparateClass:
   case Option::JoinedOrSeparateClass:
   case Option::JoinedAndSeparateClass:
-// Spelling is already allocated or a static string, no need to call SA.
-assert(*Spelling.getSingleStringRef().end() == '\0');
-Args.push_back(Spelling.getSingleStringRef().data());
-Args.push_back(SA(Value));
+Consumer(Spelling);
+Consumer(Value);
 break;
   case Option::JoinedClass:
   case Option::CommaJoinedClass:
-Args.push_back(SA(Twine(Spelling) + Value));
+Consumer(Spelling + Value);
 break;
   default:
 llvm_unreachable("Cannot denormalize an option with option class "
@@ -270,11 +262,10 @@
 }
 
 template 
-static void
-denormalizeString(SmallVectorImpl &Args, const Twine &Spelling,
-  CompilerInvocation::StringAllocator SA,
-  Option::OptionClass OptClass, unsigned TableIndex, T Value) {
-  denormalizeStringImpl(Args, Spelling, SA, OptClass, TableIndex, Twine(Value));
+static void denormalizeString(ArgumentConsumer Consumer, const Twine &Spelling,
+  Option::OptionClass OptClass, unsigned TableIndex,
+  T Value) {
+  denormalizeStringImpl(Consumer, Spelling, OptClass, TableIndex, Twine(Value));
 }
 
 static std::optional
@@ -315,15 +306,14 @@
   return std::nullopt;
 }
 
-static void denormalizeSimpleEnumImpl(SmallVectorImpl &Args,
+static void denormalizeSimpleEnumImpl(ArgumentConsumer Consumer,
   const Twine &Spelling,
-  CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass,
  

[PATCH] D157029: [llvm] Construct option spelling at compile-time

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 547019.
jansvoboda11 added a comment.

Rebase on top of D157028.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157029/new/

https://reviews.llvm.org/D157029

Files:
  clang-tools-extra/clangd/CompileCommands.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  llvm/include/llvm/Option/OptTable.h
  llvm/include/llvm/Option/Option.h
  llvm/utils/TableGen/OptParserEmitter.cpp

Index: llvm/utils/TableGen/OptParserEmitter.cpp
===
--- llvm/utils/TableGen/OptParserEmitter.cpp
+++ llvm/utils/TableGen/OptParserEmitter.cpp
@@ -105,8 +105,6 @@
   }
 
   void emit(raw_ostream &OS) const {
-write_cstring(OS, StringRef(getOptionSpelling(R)));
-OS << ", ";
 OS << ShouldParse;
 OS << ", ";
 OS << ShouldAlwaysEmit;
@@ -306,6 +304,9 @@
 // The option string.
 OS << ", \"" << R.getValueAsString("Name") << '"';
 
+// The option spelling.
+OS << ", \"" << R.getValueAsString("Name") << '"';
+
 // The option identifier name.
 OS << ", " << getOptionName(R);
 
@@ -349,6 +350,11 @@
 // The option string.
 emitNameUsingSpelling(OS, R);
 
+// The option spelling.
+OS << ", llvm::StringLiteral(";
+write_cstring(OS, getOptionSpelling(R));
+OS << ")";
+
 // The option identifier name.
 OS << ", " << getOptionName(R);
 
Index: llvm/include/llvm/Option/Option.h
===
--- llvm/include/llvm/Option/Option.h
+++ llvm/include/llvm/Option/Option.h
@@ -100,6 +100,11 @@
 return Info->Name;
   }
 
+  StringLiteral getSpelling() const {
+assert(Info && "Must have a valid info!");
+return Info->Spelling;
+  }
+
   const Option getGroup() const {
 assert(Info && "Must have a valid info!");
 assert(Owner && "Must have a valid owner!");
Index: llvm/include/llvm/Option/OptTable.h
===
--- llvm/include/llvm/Option/OptTable.h
+++ llvm/include/llvm/Option/OptTable.h
@@ -45,6 +45,7 @@
 /// matching.
 ArrayRef Prefixes;
 StringRef Name;
+StringLiteral Spelling;
 const char *HelpText;
 const char *MetaVar;
 unsigned ID;
@@ -298,31 +299,31 @@
 
 } // end namespace llvm
 
-#define LLVM_MAKE_OPT_ID_WITH_ID_PREFIX(ID_PREFIX, PREFIX, NAME, ID, KIND, \
-GROUP, ALIAS, ALIASARGS, FLAGS, PARAM, \
-HELPTEXT, METAVAR, VALUES) \
+#define LLVM_MAKE_OPT_ID_WITH_ID_PREFIX(ID_PREFIX, PREFIX, NAME, SPELLING, ID, \
+KIND, GROUP, ALIAS, ALIASARGS, FLAGS,  \
+PARAM, HELPTEXT, METAVAR, VALUES)  \
   ID_PREFIX##ID
 
-#define LLVM_MAKE_OPT_ID(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS,  \
- FLAGS, PARAM, HELPTEXT, METAVAR, VALUES)  \
-  LLVM_MAKE_OPT_ID_WITH_ID_PREFIX(OPT_, PREFIX, NAME, ID, KIND, GROUP, ALIAS,  \
-  ALIASARGS, FLAGS, PARAM, HELPTEXT, METAVAR,  \
-  VALUE)
+#define LLVM_MAKE_OPT_ID(PREFIX, NAME, SPELLING, ID, KIND, GROUP, ALIAS,   \
+ ALIASARGS, FLAGS, PARAM, HELPTEXT, METAVAR, VALUES)   \
+  LLVM_MAKE_OPT_ID_WITH_ID_PREFIX(OPT_, PREFIX, NAME, SPELLING, ID, KIND,  \
+  GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,   \
+  HELPTEXT, METAVAR, VALUE)
 
 #define LLVM_CONSTRUCT_OPT_INFO_WITH_ID_PREFIX(\
-ID_PREFIX, PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-HELPTEXT, METAVAR, VALUES) \
+ID_PREFIX, PREFIX, NAME, SPELLING, ID, KIND, GROUP, ALIAS, ALIASARGS,  \
+FLAGS, PARAM, HELPTEXT, METAVAR, VALUES)   \
   llvm::opt::OptTable::Info {  \
-PREFIX, NAME, HELPTEXT, METAVAR, ID_PREFIX##ID,\
+PREFIX, NAME, SPELLING, HELPTEXT, METAVAR, ID_PREFIX##ID,  \
 llvm::opt::Option::KIND##Class, PARAM, FLAGS, ID_PREFIX##GROUP,\
 ID_PREFIX##ALIAS, ALIASARGS, VALUES\
   }
 
-#define LLVM_CONSTRUCT_OPT_INFO(PREFIX, NAME, ID, KIND, GROUP, ALIAS,  \
-ALIASARGS, FLAGS, PARAM, HELPTEXT, METAVAR,\
-VALUES)\
-  LLVM_CONSTRUCT_OPT_INFO_WITH_ID_PREFIX(OPT_, PREFIX, NAME, ID, KIND, GROUP,  \
- ALIAS, ALIASARGS, FLAGS, PARAM,   \
- HELPTEXT, METAVAR, VALUES)
+#define LLVM_CONSTRUCT_OPT_INFO

[PATCH] D157028: [llvm] Extract common `OptTable` bits into macros

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 547018.
jansvoboda11 added a comment.
Herald added a reviewer: alexander-shaposhnikov.

Consolidate all usages by adding extra `_WITH_ID_PREFIX` macros.
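
For readers unfamiliar with the pattern, here is a reduced, self-contained sketch of sharing the field layout through helper macros. It is not the LLVM code: the `DEMO_*` macros, `OptInfo`, and the four-field option list are hypothetical, and an inline X-macro list replaces the generated `Opts.inc`. The point is that each use site only forwards `__VA_ARGS__`, so adding a field means touching the helper macros rather than every tool.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for a generated Opts.inc: one OPTION(...) entry per option.
#define DEMO_OPTIONS(OPTION)                                                   \
  OPTION("-", "help", HELP, "Display help")                                    \
  OPTION("-", "version", VERSION, "Display version")

// Shared helpers: the only place that knows the order of the fields.
#define DEMO_MAKE_OPT_ID(PREFIX, NAME, ID, HELPTEXT) OPT_##ID
#define DEMO_CONSTRUCT_OPT_INFO(PREFIX, NAME, ID, HELPTEXT)                    \
  OptInfo { PREFIX, NAME, HELPTEXT, OPT_##ID }

enum OptID {
  OPT_INVALID = 0, // This is not an option ID.
#define OPTION(...) DEMO_MAKE_OPT_ID(__VA_ARGS__),
  DEMO_OPTIONS(OPTION)
#undef OPTION
};

struct OptInfo {
  const char *Prefix;
  const char *Name;
  const char *HelpText;
  unsigned ID;
};

static constexpr OptInfo InfoTable[] = {
#define OPTION(...) DEMO_CONSTRUCT_OPT_INFO(__VA_ARGS__),
    DEMO_OPTIONS(OPTION)
#undef OPTION
};

// Looks up the table entry for a valid (non-zero) option ID.
static const OptInfo &getInfo(unsigned Id) {
  constexpr unsigned NumOptions = sizeof(InfoTable) / sizeof(InfoTable[0]);
  assert(Id != OPT_INVALID && Id <= NumOptions && "invalid option ID");
  return InfoTable[Id - 1];
}
```

If a fifth field were added to every `OPTION` entry, only `DEMO_MAKE_OPT_ID` and `DEMO_CONSTRUCT_OPT_INFO` would need a new parameter; the `enum` and table definitions would stay unchanged.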


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157028/new/

https://reviews.llvm.org/D157028

Files:
  clang/include/clang/Driver/Options.h
  clang/lib/Driver/DriverOptions.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
  clang/tools/clang-scan-deps/ClangScanDeps.cpp
  lld/ELF/Driver.h
  lldb/tools/driver/Driver.cpp
  lldb/tools/lldb-server/lldb-gdbserver.cpp
  lldb/tools/lldb-vscode/lldb-vscode.cpp
  llvm/include/llvm/Option/OptTable.h
  llvm/lib/ExecutionEngine/JITLink/COFFDirectiveParser.cpp
  llvm/lib/ExecutionEngine/JITLink/COFFDirectiveParser.h
  llvm/lib/ToolDrivers/llvm-dlltool/DlltoolDriver.cpp
  llvm/lib/ToolDrivers/llvm-lib/LibDriver.cpp
  llvm/tools/dsymutil/dsymutil.cpp
  llvm/tools/llvm-cvtres/llvm-cvtres.cpp
  llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
  llvm/tools/llvm-debuginfod/llvm-debuginfod.cpp
  llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp
  llvm/tools/llvm-dwp/llvm-dwp.cpp
  llvm/tools/llvm-gsymutil/llvm-gsymutil.cpp
  llvm/tools/llvm-ifs/llvm-ifs.cpp
  llvm/tools/llvm-libtool-darwin/llvm-libtool-darwin.cpp
  llvm/tools/llvm-lipo/llvm-lipo.cpp
  llvm/tools/llvm-ml/llvm-ml.cpp
  llvm/tools/llvm-mt/llvm-mt.cpp
  llvm/tools/llvm-nm/llvm-nm.cpp
  llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
  llvm/tools/llvm-objdump/ObjdumpOptID.h
  llvm/tools/llvm-objdump/llvm-objdump.cpp
  llvm/tools/llvm-rc/llvm-rc.cpp
  llvm/tools/llvm-readobj/llvm-readobj.cpp
  llvm/tools/llvm-size/llvm-size.cpp
  llvm/tools/llvm-strings/llvm-strings.cpp
  llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
  llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
  llvm/tools/sancov/sancov.cpp
  llvm/unittests/Option/OptionParsingTest.cpp

Index: llvm/unittests/Option/OptionParsingTest.cpp
===
--- llvm/unittests/Option/OptionParsingTest.cpp
+++ llvm/unittests/Option/OptionParsingTest.cpp
@@ -17,9 +17,7 @@
 
 enum ID {
   OPT_INVALID = 0, // This is not an option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  OPT_##ID,
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
 #include "Opts.inc"
   LastOption
 #undef OPTION
@@ -47,10 +45,7 @@
 };
 
 static constexpr OptTable::Info InfoTable[] = {
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  {PREFIX, NAME,  HELPTEXT,METAVAR, OPT_##ID,  Option::KIND##Class,\
-   PARAM,  FLAGS, OPT_##GROUP, OPT_##ALIAS, ALIASARGS, VALUES},
+#define OPTION(...) LLVM_CONSTRUCT_OPT_INFO(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
Index: llvm/tools/sancov/sancov.cpp
===
--- llvm/tools/sancov/sancov.cpp
+++ llvm/tools/sancov/sancov.cpp
@@ -63,9 +63,7 @@
 using namespace llvm::opt;
 enum ID {
   OPT_INVALID = 0, // This is not an option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  OPT_##ID,
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
@@ -78,13 +76,7 @@
 #undef PREFIX
 
 static constexpr opt::OptTable::Info InfoTable[] = {
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  {\
-  PREFIX,  NAME,  HELPTEXT,\
-  METAVAR, OPT_##ID,  opt::Option::KIND##Class,\
-  PARAM,   FLAGS, OPT_##GROUP, \
-  OPT_##ALIAS, ALIASARGS, VALUES},
+#define OPTION(...) LLVM_CONSTRUCT_OPT_INFO(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
Index: llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
===
--- llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
+++ llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
@@ -28,9 +28,7 @@
 namespace {
 enum ID {
   OPT_INVALID = 0, // This is not an option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  OPT_##ID,
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
@@ -43,13 +41,7 @@
 #undef PREFIX
 
 static constexpr opt::OptTable::Info InfoTable[] = {
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HEL

[PATCH] D157035: [clang][cli] Accept option spelling as `Twine`

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG243bc7504965: [clang][cli] Accept option spelling as `Twine` 
(authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157035/new/

https://reviews.llvm.org/D157035

Files:
  clang/lib/Frontend/CompilerInvocation.cpp


Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -192,10 +192,12 @@
 /// unnecessary template instantiations and just ignore it with a variadic
 /// argument.
 static void denormalizeSimpleFlag(SmallVectorImpl &Args,
-  const char *Spelling,
+  const Twine &Spelling,
   CompilerInvocation::StringAllocator,
   Option::OptionClass, unsigned, /*T*/...) {
-  Args.push_back(Spelling);
+  // Spelling is already allocated or a static string, no need to call SA.
+  assert(*Spelling.getSingleStringRef().end() == '\0');
+  Args.push_back(Spelling.getSingleStringRef().data());
 }
 
 template <typename T> static constexpr bool is_uint64_t_convertible() {
@@ -232,16 +234,19 @@
 }
 
 static auto makeBooleanOptionDenormalizer(bool Value) {
-  return [Value](SmallVectorImpl<const char *> &Args, const char *Spelling,
+  return [Value](SmallVectorImpl<const char *> &Args, const Twine &Spelling,
  CompilerInvocation::StringAllocator, Option::OptionClass,
  unsigned, bool KeyPath) {
-if (KeyPath == Value)
-  Args.push_back(Spelling);
+if (KeyPath == Value) {
+  // Spelling is already allocated or a static string, no need to call SA.
+  assert(*Spelling.getSingleStringRef().end() == '\0');
+  Args.push_back(Spelling.getSingleStringRef().data());
+}
   };
 }
 
 static void denormalizeStringImpl(SmallVectorImpl<const char *> &Args,
-  const char *Spelling,
+  const Twine &Spelling,
   CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass, unsigned,
   const Twine &Value) {
@@ -249,7 +254,9 @@
   case Option::SeparateClass:
   case Option::JoinedOrSeparateClass:
   case Option::JoinedAndSeparateClass:
-Args.push_back(Spelling);
+// Spelling is already allocated or a static string, no need to call SA.
+assert(*Spelling.getSingleStringRef().end() == '\0');
+Args.push_back(Spelling.getSingleStringRef().data());
 Args.push_back(SA(Value));
 break;
   case Option::JoinedClass:
@@ -264,7 +271,7 @@
 
 template <typename T>
 static void
-denormalizeString(SmallVectorImpl<const char *> &Args, const char *Spelling,
+denormalizeString(SmallVectorImpl<const char *> &Args, const Twine &Spelling,
   CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass, unsigned TableIndex, T Value) {
   denormalizeStringImpl(Args, Spelling, SA, OptClass, TableIndex, 
Twine(Value));
@@ -309,7 +316,7 @@
 }
 
 static void denormalizeSimpleEnumImpl(SmallVectorImpl<const char *> &Args,
-  const char *Spelling,
+  const Twine &Spelling,
   CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass,
   unsigned TableIndex, unsigned Value) {
@@ -326,7 +333,7 @@
 
 template <typename T>
 static void denormalizeSimpleEnum(SmallVectorImpl<const char *> &Args,
-  const char *Spelling,
+  const Twine &Spelling,
   CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass,
   unsigned TableIndex, T Value) {
@@ -367,7 +374,7 @@
 }
 
 static void denormalizeStringVector(SmallVectorImpl<const char *> &Args,
-const char *Spelling,
+const Twine &Spelling,
 CompilerInvocation::StringAllocator SA,
 Option::OptionClass OptClass,
 unsigned TableIndex,



[PATCH] D157035: [clang][cli] Accept option spelling as `Twine`

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

That's a good point. Updated.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157035/new/

https://reviews.llvm.org/D157035



[PATCH] D157035: [clang][cli] Accept option spelling as `Twine`

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 547014.
jansvoboda11 added a comment.

Assert that spelling is null-terminated
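
To illustrate why the assertion is useful: a string view carries only a pointer and a length, so handing its `data()` to an API that expects a C string is only safe when a terminating `'\0'` follows the viewed characters. The sketch below uses standard-library types only; `pushSpelling` is a hypothetical stand-in for the denormalizers in the patch, with `std::string_view` in place of `Twine`/`StringRef` and `std::vector` in place of `SmallVectorImpl`.

```cpp
#include <cassert>
#include <string_view>
#include <vector>

// Hypothetical stand-in for denormalizeSimpleFlag, not the Clang function.
inline void pushSpelling(std::vector<const char *> &Args,
                         std::string_view Spelling) {
  // The view itself does not promise a terminator. Reading one element past
  // the view is only valid because callers are assumed to pass views into
  // null-terminated storage (string literals or allocator-owned C strings).
  assert(Spelling.data()[Spelling.size()] == '\0');
  // No copy is made: the caller-owned storage must outlive Args.
  Args.push_back(Spelling.data());
}
```

With a string literal such as `"-fmodules"` the assertion holds, because the literal's storage includes the terminator right after the viewed characters.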


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157035/new/

https://reviews.llvm.org/D157035

Files:
  clang/lib/Frontend/CompilerInvocation.cpp


Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -192,10 +192,12 @@
 /// unnecessary template instantiations and just ignore it with a variadic
 /// argument.
 static void denormalizeSimpleFlag(SmallVectorImpl<const char *> &Args,
-  const char *Spelling,
+  const Twine &Spelling,
   CompilerInvocation::StringAllocator,
   Option::OptionClass, unsigned, /*T*/...) {
-  Args.push_back(Spelling);
+  // Spelling is already allocated or a static string, no need to call SA.
+  assert(*Spelling.getSingleStringRef().end() == '\0');
+  Args.push_back(Spelling.getSingleStringRef().data());
 }
 
 template <typename T> static constexpr bool is_uint64_t_convertible() {
@@ -232,16 +234,19 @@
 }
 
 static auto makeBooleanOptionDenormalizer(bool Value) {
-  return [Value](SmallVectorImpl<const char *> &Args, const char *Spelling,
+  return [Value](SmallVectorImpl<const char *> &Args, const Twine &Spelling,
  CompilerInvocation::StringAllocator, Option::OptionClass,
  unsigned, bool KeyPath) {
-if (KeyPath == Value)
-  Args.push_back(Spelling);
+if (KeyPath == Value) {
+  // Spelling is already allocated or a static string, no need to call SA.
+  assert(*Spelling.getSingleStringRef().end() == '\0');
+  Args.push_back(Spelling.getSingleStringRef().data());
+}
   };
 }
 
 static void denormalizeStringImpl(SmallVectorImpl<const char *> &Args,
-  const char *Spelling,
+  const Twine &Spelling,
   CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass, unsigned,
   const Twine &Value) {
@@ -249,7 +254,9 @@
   case Option::SeparateClass:
   case Option::JoinedOrSeparateClass:
   case Option::JoinedAndSeparateClass:
-Args.push_back(Spelling);
+// Spelling is already allocated or a static string, no need to call SA.
+assert(*Spelling.getSingleStringRef().end() == '\0');
+Args.push_back(Spelling.getSingleStringRef().data());
 Args.push_back(SA(Value));
 break;
   case Option::JoinedClass:
@@ -264,7 +271,7 @@
 
 template <typename T>
 static void
-denormalizeString(SmallVectorImpl<const char *> &Args, const char *Spelling,
+denormalizeString(SmallVectorImpl<const char *> &Args, const Twine &Spelling,
   CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass, unsigned TableIndex, T Value) {
   denormalizeStringImpl(Args, Spelling, SA, OptClass, TableIndex, 
Twine(Value));
@@ -309,7 +316,7 @@
 }
 
 static void denormalizeSimpleEnumImpl(SmallVectorImpl<const char *> &Args,
-  const char *Spelling,
+  const Twine &Spelling,
   CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass,
   unsigned TableIndex, unsigned Value) {
@@ -326,7 +333,7 @@
 
 template <typename T>
 static void denormalizeSimpleEnum(SmallVectorImpl<const char *> &Args,
-  const char *Spelling,
+  const Twine &Spelling,
   CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass,
   unsigned TableIndex, T Value) {
@@ -367,7 +374,7 @@
 }
 
 static void denormalizeStringVector(SmallVectorImpl<const char *> &Args,
-const char *Spelling,
+const Twine &Spelling,
 CompilerInvocation::StringAllocator SA,
 Option::OptionClass OptClass,
 unsigned TableIndex,



[PATCH] D157035: [clang][cli] Accept option spelling as `Twine`

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This will make it possible to accept the spelling as `StringLiteral` in D157029 
and avoid some unnecessary allocations in a later patch.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157035

Files:
  clang/lib/Frontend/CompilerInvocation.cpp


Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -192,10 +192,11 @@
 /// unnecessary template instantiations and just ignore it with a variadic
 /// argument.
 static void denormalizeSimpleFlag(SmallVectorImpl<const char *> &Args,
-  const char *Spelling,
+  const Twine &Spelling,
   CompilerInvocation::StringAllocator,
   Option::OptionClass, unsigned, /*T*/...) {
-  Args.push_back(Spelling);
+  // Spelling is already allocated or a static string, no need to call SA.
+  Args.push_back(Spelling.getSingleStringRef().data());
 }
 
 template <typename T> static constexpr bool is_uint64_t_convertible() {
@@ -232,16 +233,17 @@
 }
 
 static auto makeBooleanOptionDenormalizer(bool Value) {
-  return [Value](SmallVectorImpl<const char *> &Args, const char *Spelling,
+  return [Value](SmallVectorImpl<const char *> &Args, const Twine &Spelling,
  CompilerInvocation::StringAllocator, Option::OptionClass,
  unsigned, bool KeyPath) {
 if (KeyPath == Value)
-  Args.push_back(Spelling);
+  // Spelling is already allocated or a static string, no need to call SA.
+  Args.push_back(Spelling.getSingleStringRef().data());
   };
 }
 
 static void denormalizeStringImpl(SmallVectorImpl<const char *> &Args,
-  const char *Spelling,
+  const Twine &Spelling,
   CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass, unsigned,
   const Twine &Value) {
@@ -249,7 +251,8 @@
   case Option::SeparateClass:
   case Option::JoinedOrSeparateClass:
   case Option::JoinedAndSeparateClass:
-Args.push_back(Spelling);
+// Spelling is already allocated or a static string, no need to call SA.
+Args.push_back(Spelling.getSingleStringRef().data());
 Args.push_back(SA(Value));
 break;
   case Option::JoinedClass:
@@ -264,7 +267,7 @@
 
 template <typename T>
 static void
-denormalizeString(SmallVectorImpl<const char *> &Args, const char *Spelling,
+denormalizeString(SmallVectorImpl<const char *> &Args, const Twine &Spelling,
   CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass, unsigned TableIndex, T Value) {
   denormalizeStringImpl(Args, Spelling, SA, OptClass, TableIndex, 
Twine(Value));
@@ -309,7 +312,7 @@
 }
 
 static void denormalizeSimpleEnumImpl(SmallVectorImpl<const char *> &Args,
-  const char *Spelling,
+  const Twine &Spelling,
   CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass,
   unsigned TableIndex, unsigned Value) {
@@ -326,7 +329,7 @@
 
 template <typename T>
 static void denormalizeSimpleEnum(SmallVectorImpl<const char *> &Args,
-  const char *Spelling,
+  const Twine &Spelling,
   CompilerInvocation::StringAllocator SA,
   Option::OptionClass OptClass,
   unsigned TableIndex, T Value) {
@@ -367,7 +370,7 @@
 }
 
 static void denormalizeStringVector(SmallVectorImpl<const char *> &Args,
-const char *Spelling,
+const Twine &Spelling,
 CompilerInvocation::StringAllocator SA,
 Option::OptionClass OptClass,
 unsigned TableIndex,



[PATCH] D157028: [llvm] Extract common `OptTable` bits into macros

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

Here's an example of a patch that changes the `OPTION` macro: D157029. I wonder 
if we could have counterparts to `LLVM_MAKE_OPT_ID` and 
`LLVM_CONSTRUCT_OPT_INFO` that allow overriding the default `OPT_` prefix. That 
would make D157029 even smaller. WDYT @MaskRay?
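
A rough, self-contained sketch of what such prefix-overriding counterparts might look like. Every `SKETCH_*` name and the two-field option list are hypothetical, chosen only to keep the example short; this is not the layout of the real generated `.inc` files.

```cpp
#include <cassert>

// Hypothetical stand-in for a generated WindresOpts.inc (two fields only).
#define SKETCH_WINDRES_OPTIONS(OPTION)                                         \
  OPTION("input", input)                                                       \
  OPTION("output", output)

// Assumed helper: like a plain "make ID" macro, but the ID prefix is the
// first argument instead of being hard-coded to OPT_.
#define SKETCH_MAKE_OPT_ID_WITH_PREFIX(ID_PREFIX, NAME, ID) ID_PREFIX##ID

// The default-prefix form simply forwards to the general one.
#define SKETCH_MAKE_OPT_ID(...)                                                \
  SKETCH_MAKE_OPT_ID_WITH_PREFIX(OPT_, __VA_ARGS__)

enum SketchWindresID {
  WINDRES_INVALID = 0, // This is not a correct option ID.
#define OPTION(...) SKETCH_MAKE_OPT_ID_WITH_PREFIX(WINDRES_, __VA_ARGS__),
  SKETCH_WINDRES_OPTIONS(OPTION)
#undef OPTION
};

enum SketchDefaultID {
  OPT_INVALID = 0, // This is not an option ID.
#define OPTION(...) SKETCH_MAKE_OPT_ID(__VA_ARGS__),
  SKETCH_WINDRES_OPTIONS(OPTION)
#undef OPTION
};
```

With helpers of this shape, a tool that uses a custom prefix such as `WINDRES_` would not need to spell out the full `OPTION` parameter list either.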


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157028/new/

https://reviews.llvm.org/D157028



[PATCH] D157029: [llvm] Construct option spelling at compile-time

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: MaskRay.
Herald added subscribers: ormris, ributzka, kadircet, arphaman, hiraditya.
Herald added a reviewer: alexander-shaposhnikov.
Herald added a reviewer: jhenderson.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added projects: clang, LLVM, clang-tools-extra.
Herald added subscribers: cfe-commits, llvm-commits.

Some Clang command-line handling code could benefit from the option spelling 
being a `StringLiteral`. This patch changes the `llvm::opt` TableGen backend to 
generate the "canonical" spelling and emit it into the .inc file.
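
As a simplified illustration of the idea (not the actual `OptTable::Info` layout), the sketch below contrasts a table whose full spelling is emitted as a literal with the run-time concatenation it can replace. `SketchOptInfo`, `SketchTable` and `spellingAtRuntime` are hypothetical names, and `std::string_view` stands in for `llvm::StringLiteral`.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Simplified, hypothetical table entry; the real opt::OptTable::Info differs.
struct SketchOptInfo {
  std::string_view Prefix;   // e.g. "--"
  std::string_view Name;     // e.g. "help"
  std::string_view Spelling; // full spelling, emitted by the generator
};

// What a generator could emit: the spelling is a literal in the table.
constexpr SketchOptInfo SketchTable[] = {
    {"-", "o", "-o"},
    {"--", "help", "--help"},
};

// Available to constant evaluation, with no allocation.
static_assert(SketchTable[1].Spelling == "--help");

// The alternative the literal avoids: building the spelling at run time.
inline std::string spellingAtRuntime(const SketchOptInfo &Info) {
  return std::string(Info.Prefix) + std::string(Info.Name);
}
```

Because the literal lives in static storage and is null-terminated, code that needs a `const char *` can use it directly instead of allocating a concatenated copy.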

Depends on D157028.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157029

Files:
  clang-tools-extra/clangd/CompileCommands.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  llvm/include/llvm/Option/OptTable.h
  llvm/include/llvm/Option/Option.h
  llvm/lib/ExecutionEngine/JITLink/COFFDirectiveParser.cpp
  llvm/lib/ExecutionEngine/JITLink/COFFDirectiveParser.h
  llvm/tools/llvm-lipo/llvm-lipo.cpp
  llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
  llvm/tools/llvm-objdump/ObjdumpOptID.h
  llvm/tools/llvm-objdump/llvm-objdump.cpp
  llvm/tools/llvm-rc/llvm-rc.cpp
  llvm/utils/TableGen/OptParserEmitter.cpp

Index: llvm/utils/TableGen/OptParserEmitter.cpp
===
--- llvm/utils/TableGen/OptParserEmitter.cpp
+++ llvm/utils/TableGen/OptParserEmitter.cpp
@@ -105,8 +105,6 @@
   }
 
   void emit(raw_ostream &OS) const {
-write_cstring(OS, StringRef(getOptionSpelling(R)));
-OS << ", ";
 OS << ShouldParse;
 OS << ", ";
 OS << ShouldAlwaysEmit;
@@ -306,6 +304,9 @@
 // The option string.
 OS << ", \"" << R.getValueAsString("Name") << '"';
 
+// The option spelling.
+OS << ", \"" << R.getValueAsString("Name") << '"';
+
 // The option identifier name.
 OS << ", " << getOptionName(R);
 
@@ -349,6 +350,11 @@
 // The option string.
 emitNameUsingSpelling(OS, R);
 
+// The option spelling.
+OS << ", llvm::StringLiteral(";
+write_cstring(OS, getOptionSpelling(R));
+OS << ")";
+
 // The option identifier name.
 OS << ", " << getOptionName(R);
 
Index: llvm/tools/llvm-rc/llvm-rc.cpp
===
--- llvm/tools/llvm-rc/llvm-rc.cpp
+++ llvm/tools/llvm-rc/llvm-rc.cpp
@@ -77,8 +77,8 @@
 
 enum Windres_ID {
   WINDRES_INVALID = 0, // This is not a correct option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
+#define OPTION(PREFIX, NAME, SPELLING, ID, KIND, GROUP, ALIAS, ALIASARGS,  \
+   FLAGS, PARAM, HELPTEXT, METAVAR, VALUES)\
   WINDRES_##ID,
 #include "WindresOpts.inc"
 #undef OPTION
@@ -93,12 +93,21 @@
 #undef PREFIX
 
 static constexpr opt::OptTable::Info InfoTable[] = {
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  {PREFIX,  NAME, HELPTEXT,\
-   METAVAR, WINDRES_##ID, opt::Option::KIND##Class,\
-   PARAM,   FLAGS,WINDRES_##GROUP, \
-   WINDRES_##ALIAS, ALIASARGS,VALUES},
+#define OPTION(PREFIX, NAME, SPELLING, ID, KIND, GROUP, ALIAS, ALIASARGS,  \
+   FLAGS, PARAM, HELPTEXT, METAVAR, VALUES)\
+  {PREFIX, \
+   NAME,   \
+   SPELLING,   \
+   HELPTEXT,   \
+   METAVAR,\
+   WINDRES_##ID,   \
+   opt::Option::KIND##Class,   \
+   PARAM,  \
+   FLAGS,  \
+   WINDRES_##GROUP,\
+   WINDRES_##ALIAS,\
+   ALIASARGS,  \
+   VALUES},
 #include "WindresOpts.inc"
 #undef OPTION
 };
Index: llvm/tools/llvm-objdump/llvm-objdump.cpp
===
--- llvm/tools/llvm-objdump/llvm-objdump.cpp
+++ llvm/tools/llvm-objdump/llvm-objdump.cpp
@@ -128,12 +128,21 @@
 #undef PREFIX
 
 static constexpr opt::OptTable::Info ObjdumpInfoTable[]

[PATCH] D157028: [llvm] Extract common `OptTable` bits into macros

2023-08-03 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: MaskRay.
Herald added subscribers: jhenderson, ormris, ributzka, steven_wu, hiraditya, 
arichardson, emaste.
Herald added a reviewer: JDevlieghere.
Herald added a reviewer: jhenderson.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added projects: clang, LLDB, LLVM.
Herald added subscribers: llvm-commits, lldb-commits, cfe-commits.

All command-line tools using `llvm::opt` create an enum of option IDs and a 
table of `OptTable::Info` objects. Most of the tools use the same ID 
(`OPT_##ID`), kind (`Option::KIND##Class`), group ID (`OPT_##GROUP`) and alias 
ID (`OPT_##ALIAS`). This patch extracts that common code into canonical macros. 
This results in fewer changes when tweaking the `OPTION` macros emitted by the 
TableGen backend.
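
The pattern can be sketched in a self-contained form as follows. This is an illustration under simplifying assumptions, not the definitions in `OptTable.h`: the `SKETCH_*` macros, `SketchInfo`, and the three-field option list are hypothetical stand-ins for the generated `.inc` file and the real `Info` struct.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a TableGen-generated Opts.inc. Real entries carry
// many more fields (PREFIX, GROUP, ALIAS, FLAGS, ...); three are enough here.
#define SKETCH_OPTIONS(OPTION)                                                 \
  OPTION("help", help, Flag)                                                   \
  OPTION("output", output, Separate)

enum class SketchKind { FlagClass, SeparateClass };

// Simplified stand-in for opt::OptTable::Info.
struct SketchInfo {
  const char *Name;
  unsigned ID;
  SketchKind Kind;
};

// Shared helpers in the spirit of LLVM_MAKE_OPT_ID / LLVM_CONSTRUCT_OPT_INFO.
// Their names and field layout are assumptions made for this sketch.
#define SKETCH_MAKE_OPT_ID(NAME, ID, KIND) OPT_##ID
#define SKETCH_CONSTRUCT_OPT_INFO(NAME, ID, KIND)                              \
  SketchInfo { NAME, OPT_##ID, SketchKind::KIND##Class }

enum SketchID {
  OPT_INVALID = 0, // This is not an option ID.
#define OPTION(...) SKETCH_MAKE_OPT_ID(__VA_ARGS__),
  SKETCH_OPTIONS(OPTION)
#undef OPTION
};

static constexpr SketchInfo SketchInfoTable[] = {
#define OPTION(...) SKETCH_CONSTRUCT_OPT_INFO(__VA_ARGS__),
    SKETCH_OPTIONS(OPTION)
#undef OPTION
};

static_assert(sizeof(SketchInfoTable) / sizeof(SketchInfoTable[0]) == 2,
              "one Info entry per OPTION");
```

Since each tool only writes `#define OPTION(...) HELPER(__VA_ARGS__),`, adding or reordering a field in the generated entries means updating the helper macros once rather than every tool's `OPTION` definition.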


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157028

Files:
  clang/include/clang/Driver/Options.h
  clang/lib/Driver/DriverOptions.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
  clang/tools/clang-scan-deps/ClangScanDeps.cpp
  lld/ELF/Driver.h
  lldb/tools/driver/Driver.cpp
  lldb/tools/lldb-server/lldb-gdbserver.cpp
  lldb/tools/lldb-vscode/lldb-vscode.cpp
  llvm/include/llvm/Option/OptTable.h
  llvm/lib/ToolDrivers/llvm-dlltool/DlltoolDriver.cpp
  llvm/lib/ToolDrivers/llvm-lib/LibDriver.cpp
  llvm/tools/dsymutil/dsymutil.cpp
  llvm/tools/llvm-cvtres/llvm-cvtres.cpp
  llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
  llvm/tools/llvm-debuginfod/llvm-debuginfod.cpp
  llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp
  llvm/tools/llvm-dwp/llvm-dwp.cpp
  llvm/tools/llvm-gsymutil/llvm-gsymutil.cpp
  llvm/tools/llvm-ifs/llvm-ifs.cpp
  llvm/tools/llvm-libtool-darwin/llvm-libtool-darwin.cpp
  llvm/tools/llvm-ml/llvm-ml.cpp
  llvm/tools/llvm-mt/llvm-mt.cpp
  llvm/tools/llvm-nm/llvm-nm.cpp
  llvm/tools/llvm-rc/llvm-rc.cpp
  llvm/tools/llvm-readobj/llvm-readobj.cpp
  llvm/tools/llvm-size/llvm-size.cpp
  llvm/tools/llvm-strings/llvm-strings.cpp
  llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
  llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
  llvm/tools/sancov/sancov.cpp
  llvm/unittests/Option/OptionParsingTest.cpp

Index: llvm/unittests/Option/OptionParsingTest.cpp
===
--- llvm/unittests/Option/OptionParsingTest.cpp
+++ llvm/unittests/Option/OptionParsingTest.cpp
@@ -17,9 +17,7 @@
 
 enum ID {
   OPT_INVALID = 0, // This is not an option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  OPT_##ID,
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
 #include "Opts.inc"
   LastOption
 #undef OPTION
@@ -47,10 +45,7 @@
 };
 
 static constexpr OptTable::Info InfoTable[] = {
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  {PREFIX, NAME,  HELPTEXT,METAVAR, OPT_##ID,  Option::KIND##Class,\
-   PARAM,  FLAGS, OPT_##GROUP, OPT_##ALIAS, ALIASARGS, VALUES},
+#define OPTION(...) LLVM_CONSTRUCT_OPT_INFO(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
Index: llvm/tools/sancov/sancov.cpp
===
--- llvm/tools/sancov/sancov.cpp
+++ llvm/tools/sancov/sancov.cpp
@@ -63,9 +63,7 @@
 using namespace llvm::opt;
 enum ID {
   OPT_INVALID = 0, // This is not an option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  OPT_##ID,
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
@@ -78,13 +76,7 @@
 #undef PREFIX
 
 static constexpr opt::OptTable::Info InfoTable[] = {
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES)  \
-  {\
-  PREFIX,  NAME,  HELPTEXT,\
-  METAVAR, OPT_##ID,  opt::Option::KIND##Class,\
-  PARAM,   FLAGS, OPT_##GROUP, \
-  OPT_##ALIAS, ALIASARGS, VALUES},
+#define OPTION(...) LLVM_CONSTRUCT_OPT_INFO(__VA_ARGS__),
 #include "Opts.inc"
 #undef OPTION
 };
Index: llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
===
--- llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
+++ llvm/tools/llvm-tli-checker/llvm-tli-checker.cpp
@@ -28,9 +28,7 @@
 namespace {
 enum ID {
   OPT_INVALID = 0, // This is not an option ID.
-#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
-   HELPTEXT, METAVAR, VALUES) 

[PATCH] D156234: [clang][deps] add support for dependency scanning with cc1 command line

2023-08-02 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 accepted this revision.
jansvoboda11 added a comment.
This revision is now accepted and ready to land.

LGTM after Windows CI gets fixed.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156234/new/

https://reviews.llvm.org/D156234



[PATCH] D156749: [modules] Fix error about the same module being defined in different .pcm files when using VFS overlays.

2023-08-01 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a subscriber: benlangmuir.
jansvoboda11 added a comment.

CC @benlangmuir, since we've talked about this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156749/new/

https://reviews.llvm.org/D156749



[PATCH] D156749: [modules] Fix error about the same module being defined in different .pcm files when using VFS overlays.

2023-08-01 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

> And it is build system's responsibility to provide `-ivfsoverlay` options 
> that don't cause observable differences.

I wasn't aware of that. Do we document this anywhere? It surprises me that we'd 
impose such a restriction on the build system. This seems fairly easy to 
accidentally violate and end up in the same situation: Clang instances with 
different VFS overlays, identical context hashes and different canonical module 
map paths for the same module.

What are the performance implications of making VFS overlays part of the 
context hash?

Alternatively, we could keep VFS overlays out of the context hash but create 
`` from the on-disk real path of the defining module map and make the 
whole PCM VFS-agnostic. Then it'd be correct to import that PCM regardless of 
the specific VFS overlay setup, as long as all VFS queries of the importer 
resolve the same way they resolved within the instance that built the PCM. 
Maybe we can force the importer to recompile the PCM if that's not the case, 
similar to what we do for diagnostic options.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156749/new/

https://reviews.llvm.org/D156749



[PATCH] D156234: [clang][deps] add support for dependency scanning with cc1 command line

2023-08-01 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added inline comments.



Comment at: 
clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp:482
+  bool Success = false;
+  if (FinalCommandLine[1] == "-cc1") {
+Success = createAndRunToolInvocation(FinalCommandLine, Action, *FileMgr,

cpsughrue wrote:
> Is there a good way to validate cc1 command lines?
I think that happens in `ToolInvocation::run()` that's called by 
`createAndRunToolInvocation`. Are you seeing cases where we don't emit a 
diagnostic for an invalid `-cc1` command line?



Comment at: clang/test/ClangScanDeps/Inputs/modules_cc1_cdb.json:1
+[
+{

I assume this was cargo-culted from 
`clang/test/ClangScanDeps/Inputs/modules_cdb.json`. I don't think we need 
multiple entries here, and lots of the flags are unnecessary for just testing 
that `-cc1` command lines work. I suggest having just one minimal `-cc1` 
command line here.



Comment at: clang/test/ClangScanDeps/modules-cc1.cpp:10
+// RUN: mkdir %t.dir/Inputs
+// RUN: cp %S/Inputs/header.h %t.dir/Inputs/header.h
+// RUN: cp %S/Inputs/header2.h %t.dir/Inputs/header2.h

Recently, we've been using `split-file` to set up the file system for our 
tests. It's much easier to see what's going on at a glance. Can you transform 
the test into that format? You can take inspiration from 
`clang/test/ClangScanDeps/modules-transitive.c` for example.



Comment at: clang/test/ClangScanDeps/modules-cc1.cpp:18
+//
+// The output order is non-deterministic when using more than one thread,
+// so check the output using two runs. Note that the 'NOT' check is not used

Yeah, this is already covered by the other test. I suggest dropping this 
comment and the `-j` flag in the `clang-scan-deps` invocation below.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156234/new/

https://reviews.llvm.org/D156234



[PATCH] D156563: [clang][deps] Remove `ModuleDeps::ImportedByMainFile`

2023-07-28 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGc75b331fc231: [clang][deps] Remove 
`ModuleDeps::ImportedByMainFile` (authored by jansvoboda11).

Changed prior to commit:
  https://reviews.llvm.org/D156563?vs=545234&id=545238#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156563/new/

https://reviews.llvm.org/D156563

Files:
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
  clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
  clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
  clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp

Index: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
===
--- clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -242,7 +242,7 @@
 
 SmallVector<ModuleID> DirectDeps;
 for (const auto &KV : ModularDeps)
-  if (KV.second->ImportedByMainFile)
+  if (DirectModularDeps.contains(KV.first))
 DirectDeps.push_back(KV.second->ID);
 
 // TODO: Report module maps the same way it's done for modular dependencies.
@@ -364,7 +364,7 @@
 MDC.DirectPrebuiltModularDeps.insert(
 {TopLevelModule, PrebuiltModuleDep{TopLevelModule}});
   else
-DirectModularDeps.insert(TopLevelModule);
+MDC.DirectModularDeps.insert(TopLevelModule);
 }
 
 void ModuleDepCollectorPP::EndOfMainFile() {
@@ -394,9 +394,9 @@
   for (const Module *M :
MDC.ScanInstance.getPreprocessor().getAffectingClangModules())
 if (!MDC.isPrebuiltModule(M))
-  DirectModularDeps.insert(M);
+  MDC.DirectModularDeps.insert(M);
 
-  for (const Module *M : DirectModularDeps)
+  for (const Module *M : MDC.DirectModularDeps)
 handleTopLevelModule(M);
 
   MDC.Consumer.handleDependencyOutputOpts(*MDC.Opts);
@@ -408,6 +408,13 @@
   for (auto &&I : MDC.ModularDeps)
 MDC.Consumer.handleModuleDependency(*I.second);
 
+  for (const Module *M : MDC.DirectModularDeps) {
+auto It = MDC.ModularDeps.find(M);
+// Only report direct dependencies that were successfully handled.
+if (It != MDC.ModularDeps.end())
+  MDC.Consumer.handleDirectModuleDependency(MDC.ModularDeps[M]->ID);
+  }
+
   for (auto &&I : MDC.FileDeps)
 MDC.Consumer.handleFileDependency(I);
 
@@ -435,7 +442,6 @@
   ModuleDeps &MD = *ModI.first->second;
 
   MD.ID.ModuleName = M->getFullModuleName();
-  MD.ImportedByMainFile = DirectModularDeps.contains(M);
   MD.IsSystem = M->IsSystem;
 
   ModuleMap &ModMapInfo =
Index: clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
@@ -34,16 +34,12 @@
 Dependencies.push_back(std::string(File));
   }
 
-  void handlePrebuiltModuleDependency(PrebuiltModuleDep PMD) override {
-// Same as `handleModuleDependency`.
-  }
-
-  void handleModuleDependency(ModuleDeps MD) override {
-// These are ignored for the make format as it can't support the full
-// set of deps, and handleFileDependency handles enough for implicitly
-// built modules to work.
-  }
-
+  // These are ignored for the make format as it can't support the full
+  // set of deps, and handleFileDependency handles enough for implicitly
+  // built modules to work.
+  void handlePrebuiltModuleDependency(PrebuiltModuleDep PMD) override {}
+  void handleModuleDependency(ModuleDeps MD) override {}
+  void handleDirectModuleDependency(ModuleID ID) override {}
   void handleContextHash(std::string Hash) override {}
 
   void printDependencies(std::string &S) {
@@ -179,14 +175,13 @@
 
   for (auto &&M : ClangModuleDeps) {
 auto &MD = M.second;
-if (MD.ImportedByMainFile)
-  TU.ClangModuleDeps.push_back(MD.ID);
 // TODO: Avoid handleModuleDependency even being called for modules
 //   we've already seen.
 if (AlreadySeen.count(M.first))
   continue;
 TU.ModuleGraph.push_back(std::move(MD));
   }
+  TU.ClangModuleDeps = std::move(DirectModuleDeps);
 
   return TU;
 }
Index: clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
===
--- clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
+++ clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
@@ -136,10 +136,6 @@
   /// determined that the differences are benign for this compilation.
   std::vector<ModuleID> ClangModuleDeps;
 
-  // Used to track which modules that were discovered were directly imported by
-  // the primary TU.
-  bool ImportedByMainFile = false;
-
   /// Compiler

[PATCH] D156492: [clang][deps] Make the C++ API more type-safe

2023-07-28 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG8a077cfe23e3: [clang][deps] Make the C++ API more type-safe 
(authored by jansvoboda11).

Changed prior to commit:
  https://reviews.llvm.org/D156492?vs=544938&id=545237#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156492/new/

https://reviews.llvm.org/D156492

Files:
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
  clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
  clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
  clang/tools/clang-scan-deps/ClangScanDeps.cpp

Index: clang/tools/clang-scan-deps/ClangScanDeps.cpp
===
--- clang/tools/clang-scan-deps/ClangScanDeps.cpp
+++ clang/tools/clang-scan-deps/ClangScanDeps.cpp
@@ -472,8 +472,7 @@
 mutable size_t InputIndex;
 
 bool operator==(const IndexedModuleID &Other) const {
-  return std::tie(ID.ModuleName, ID.ContextHash) ==
- std::tie(Other.ID.ModuleName, Other.ID.ContextHash);
+  return ID == Other.ID;
 }
 
 bool operator<(const IndexedModuleID &Other) const {
@@ -493,7 +492,7 @@
 
 struct Hasher {
   std::size_t operator()(const IndexedModuleID &IMID) const {
-return llvm::hash_combine(IMID.ID.ModuleName, IMID.ID.ContextHash);
+return llvm::hash_value(IMID.ID);
   }
 };
   };
@@ -880,7 +879,7 @@
 
   for (unsigned I = 0; I < Pool.getThreadCount(); ++I) {
 Pool.async([&, I]() {
-  llvm::StringSet<> AlreadySeenModules;
+  llvm::DenseSet<ModuleID> AlreadySeenModules;
   while (auto MaybeInputIndex = GetNextInputIndex()) {
 size_t LocalIndex = *MaybeInputIndex;
 const tooling::CompileCommand *Input = &Inputs[LocalIndex];
Index: clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
@@ -145,7 +145,7 @@
 llvm::Expected<TranslationUnitDeps>
 DependencyScanningTool::getTranslationUnitDependencies(
 const std::vector<std::string> &CommandLine, StringRef CWD,
-const llvm::StringSet<> &AlreadySeen,
+const llvm::DenseSet<ModuleID> &AlreadySeen,
 LookupModuleOutputCallback LookupModuleOutput) {
   FullDependencyConsumer Consumer(AlreadySeen);
   CallbackActionController Controller(LookupModuleOutput);
@@ -158,7 +158,7 @@
 
 llvm::Expected DependencyScanningTool::getModuleDependencies(
 StringRef ModuleName, const std::vector &CommandLine,
-StringRef CWD, const llvm::StringSet<> &AlreadySeen,
+StringRef CWD, const llvm::DenseSet<ModuleID> &AlreadySeen,
 LookupModuleOutputCallback LookupModuleOutput) {
   FullDependencyConsumer Consumer(AlreadySeen);
   CallbackActionController Controller(LookupModuleOutput);
Index: clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
===
--- clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
+++ clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
@@ -17,6 +17,7 @@
 #include "clang/Lex/PPCallbacks.h"
 #include "clang/Serialization/ASTReader.h"
 #include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/Hashing.h"
 #include "llvm/ADT/StringSet.h"
 #include "llvm/Support/raw_ostream.h"
 #include 
@@ -296,15 +297,17 @@
 } // end namespace clang
 
 namespace llvm {
+inline hash_code hash_value(const clang::tooling::dependencies::ModuleID &ID) {
+  return hash_combine(ID.ModuleName, ID.ContextHash);
+}
+
 template <> struct DenseMapInfo {
   using ModuleID = clang::tooling::dependencies::ModuleID;
   static inline ModuleID getEmptyKey() { return ModuleID{"", ""}; }
   static inline ModuleID getTombstoneKey() {
 return ModuleID{"~", "~"}; // ~ is not a valid module name or context hash
   }
-  static unsigned getHashValue(const ModuleID &ID) {
-return hash_combine(ID.ModuleName, ID.ContextHash);
-  }
+  static unsigned getHashValue(const ModuleID &ID) { return hash_value(ID); }
   static bool isEqual(const ModuleID &LHS, const ModuleID &RHS) {
 return LHS == RHS;
   }
Index: clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
===
--- clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
+++ clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
@@ -13,9 +13,8 @@
 #include "clang/Tooling/DependencyScanning/DependencyScanningWorker.h"
 #include "clang/Tooling/DependencyScanning/ModuleDepCollector.h"
 #include "clang/Tooling/JSONCompilationDatabase.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/MapVector.h"
-#include "llvm/ADT/StringSet.h"
-#include "llvm/ADT/StringMap.h"
 #include 
 #include 
 #include 
@@ -125,

[PATCH] D156563: [clang][deps] Remove `ModuleDeps::ImportedByMainFile`

2023-07-28 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This information is already exposed via `TranslationUnitDeps::ClangModuleDeps` 
on the `DependencyScanningTool` level, and this patch also adds it on the 
`DependencyScanningWorker` level via 
`DependencyConsumer::handleDirectModuleDependency()`.

Besides being redundant, this bit of information is misleading for clients that 
share a single `ModuleDeps` instance between multiple TUs (by using the 
`AlreadySeen` set). The module can be imported directly in some TUs but 
transitively in others.
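
To make the last point concrete, here is a minimal standalone sketch. It does 
not use the real scanner types: `TUScan`, `sharedImportedByMainFile()` and 
`perTUDirectDeps()` are hypothetical names invented for illustration, and 
modules are identified by plain strings. It only shows why a single flag on a 
shared record cannot describe every TU, while a per-TU list can.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of one TU's scan result.
struct TUScan {
  std::vector<std::string> AllModules; // direct and transitive modules
  std::set<std::string> DirectImports; // modules imported by the main file
};

// Old scheme (sketch): one flag per shared module record. The record is
// created by the first TU that reports the module and reused afterwards.
inline std::map<std::string, bool>
sharedImportedByMainFile(const std::vector<TUScan> &TUs) {
  std::map<std::string, bool> Flags;
  for (const TUScan &TU : TUs)
    for (const std::string &M : TU.AllModules)
      // emplace() keeps an existing entry, like an already-seen module.
      Flags.emplace(M, TU.DirectImports.count(M) != 0);
  return Flags;
}

// New scheme (sketch): direct dependencies are reported for every TU.
inline std::vector<std::set<std::string>>
perTUDirectDeps(const std::vector<TUScan> &TUs) {
  std::vector<std::set<std::string>> Result;
  for (const TUScan &TU : TUs)
    Result.push_back(TU.DirectImports);
  return Result;
}
```

With a first TU that imports A directly (and B only through A) and a second TU 
that imports B directly, the shared flag for B stays false, because the record 
was created while scanning the first TU. The per-TU list for the second TU 
still contains B.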


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D156563

Files:
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
  clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
  clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
  clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp

Index: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
===
--- clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -242,7 +242,7 @@
 
 SmallVector DirectDeps;
 for (const auto &KV : ModularDeps)
-  if (KV.second->ImportedByMainFile)
+  if (DirectModularDeps.contains(KV.first))
 DirectDeps.push_back(KV.second->ID);
 
 // TODO: Report module maps the same way it's done for modular dependencies.
@@ -364,7 +364,7 @@
 MDC.DirectPrebuiltModularDeps.insert(
 {TopLevelModule, PrebuiltModuleDep{TopLevelModule}});
   else
-DirectModularDeps.insert(TopLevelModule);
+MDC.DirectModularDeps.insert(TopLevelModule);
 }
 
 void ModuleDepCollectorPP::EndOfMainFile() {
@@ -394,9 +394,9 @@
   for (const Module *M :
MDC.ScanInstance.getPreprocessor().getAffectingClangModules())
 if (!MDC.isPrebuiltModule(M))
-  DirectModularDeps.insert(M);
+  MDC.DirectModularDeps.insert(M);
 
-  for (const Module *M : DirectModularDeps)
+  for (const Module *M : MDC.DirectModularDeps)
 handleTopLevelModule(M);
 
   MDC.Consumer.handleDependencyOutputOpts(*MDC.Opts);
@@ -408,6 +408,13 @@
   for (auto &&I : MDC.ModularDeps)
 MDC.Consumer.handleModuleDependency(*I.second);
 
+  for (const Module *M : MDC.DirectModularDeps) {
+auto It = MDC.ModularDeps.find(M);
+// Only report direct dependencies that were successfully handled.
+if (It != MDC.ModularDeps.end())
+  MDC.Consumer.handleDirectModuleDependency(MDC.ModularDeps[M]->ID);
+  }
+
   for (auto &&I : MDC.FileDeps)
 MDC.Consumer.handleFileDependency(I);
 
@@ -435,7 +442,6 @@
   ModuleDeps &MD = *ModI.first->second;
 
   MD.ID.ModuleName = M->getFullModuleName();
-  MD.ImportedByMainFile = DirectModularDeps.contains(M);
   MD.IsSystem = M->IsSystem;
 
   ModuleMap &ModMapInfo =
Index: clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
@@ -34,16 +34,12 @@
 Dependencies.push_back(std::string(File));
   }
 
-  void handlePrebuiltModuleDependency(PrebuiltModuleDep PMD) override {
-// Same as `handleModuleDependency`.
-  }
-
-  void handleModuleDependency(ModuleDeps MD) override {
-// These are ignored for the make format as it can't support the full
-// set of deps, and handleFileDependency handles enough for implicitly
-// built modules to work.
-  }
-
+  // These are ignored for the make format as it can't support the full
+  // set of deps, and handleFileDependency handles enough for implicitly
+  // built modules to work.
+  void handlePrebuiltModuleDependency(PrebuiltModuleDep PMD) override {}
+  void handleModuleDependency(ModuleDeps MD) override {}
+  void handleDirectModuleDependency(ModuleID ID) override {}
   void handleContextHash(std::string Hash) override {}
 
   void printDependencies(std::string &S) {
@@ -179,14 +175,13 @@
 
   for (auto &&M : ClangModuleDeps) {
 auto &MD = M.second;
-if (MD.ImportedByMainFile)
-  TU.ClangModuleDeps.push_back(MD.ID);
 // TODO: Avoid handleModuleDependency even being called for modules
 //   we've already seen.
 if (AlreadySeen.count(M.first))
   continue;
 TU.ModuleGraph.push_back(std::move(MD));
   }
+  TU.ClangModuleDeps = std::move(DirectModuleDeps);
 
   return TU;
 }
Index: clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
===
--- clang/include/clang/Tooling/DependencyScanning/ModuleDepC

[PATCH] D156492: [clang][deps] Make the C++ API more type-safe

2023-07-27 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: artemcm.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

The scanner's C++ API accepts a set of modular dependencies the client has 
already seen and for which it doesn't need the full details. This is currently 
a set of strings, which somewhat implies that it should contain module names. 
However, the scanner internally expects the values to be in the format 
"{name}{hash}". Besides not being documented, this is not intuitive at all. 
This patch makes this expectation explicit by changing the type to a set of 
`ModuleID`.
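
As a rough illustration of why a typed key is easier to use correctly than a 
concatenated string, consider the standalone sketch below. The `ModuleID` here 
is a simplified stand-in with only the two fields the patch compares and 
hashes; `concatKey()`, `countStringKeys()` and `countTypedKeys()` are 
hypothetical helpers, and the values used with them are contrived (real 
context hashes look different).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <tuple>

// Simplified stand-in for clang::tooling::dependencies::ModuleID.
struct ModuleID {
  std::string ModuleName;
  std::string ContextHash;
};

// Order by both fields so ModuleID can be used as a std::set key.
inline bool operator<(const ModuleID &L, const ModuleID &R) {
  return std::tie(L.ModuleName, L.ContextHash) <
         std::tie(R.ModuleName, R.ContextHash);
}

// Hypothetical helper mimicking the implicit "{name}{hash}" string format.
inline std::string concatKey(const ModuleID &ID) {
  return ID.ModuleName + ID.ContextHash;
}

// Number of distinct entries when the two IDs are stored as strings.
inline std::size_t countStringKeys(const ModuleID &A, const ModuleID &B) {
  std::set<std::string> Seen;
  Seen.insert(concatKey(A));
  Seen.insert(concatKey(B));
  return Seen.size();
}

// Number of distinct entries when the two IDs are stored as typed keys.
inline std::size_t countTypedKeys(const ModuleID &A, const ModuleID &B) {
  std::set<ModuleID> Seen;
  Seen.insert(A);
  Seen.insert(B);
  return Seen.size();
}
```

For the contrived pair {"AB", "C"} and {"A", "BC"}, the string form collapses 
both IDs into the single key "ABC", while the typed set keeps two entries. The 
typed signature also means the client never has to know the string format.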


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D156492

Files:
  clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
  clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
  clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
  clang/tools/clang-scan-deps/ClangScanDeps.cpp

Index: clang/tools/clang-scan-deps/ClangScanDeps.cpp
===
--- clang/tools/clang-scan-deps/ClangScanDeps.cpp
+++ clang/tools/clang-scan-deps/ClangScanDeps.cpp
@@ -472,8 +472,7 @@
 mutable size_t InputIndex;
 
 bool operator==(const IndexedModuleID &Other) const {
-  return std::tie(ID.ModuleName, ID.ContextHash) ==
- std::tie(Other.ID.ModuleName, Other.ID.ContextHash);
+  return ID == Other.ID;
 }
 
 bool operator<(const IndexedModuleID &Other) const {
@@ -493,7 +492,7 @@
 
 struct Hasher {
   std::size_t operator()(const IndexedModuleID &IMID) const {
-return llvm::hash_combine(IMID.ID.ModuleName, IMID.ID.ContextHash);
+return llvm::hash_value(IMID.ID);
   }
 };
   };
@@ -880,7 +879,7 @@
 
   for (unsigned I = 0; I < Pool.getThreadCount(); ++I) {
 Pool.async([&, I]() {
-  llvm::StringSet<> AlreadySeenModules;
+  llvm::DenseSet<ModuleID> AlreadySeenModules;
   while (auto MaybeInputIndex = GetNextInputIndex()) {
 size_t LocalIndex = *MaybeInputIndex;
 const tooling::CompileCommand *Input = &Inputs[LocalIndex];
Index: clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
@@ -145,7 +145,7 @@
 llvm::Expected
 DependencyScanningTool::getTranslationUnitDependencies(
 const std::vector &CommandLine, StringRef CWD,
-const llvm::StringSet<> &AlreadySeen,
+const llvm::DenseSet<ModuleID> &AlreadySeen,
 LookupModuleOutputCallback LookupModuleOutput) {
   FullDependencyConsumer Consumer(AlreadySeen);
   CallbackActionController Controller(LookupModuleOutput);
@@ -158,7 +158,7 @@
 
 llvm::Expected DependencyScanningTool::getModuleDependencies(
 StringRef ModuleName, const std::vector &CommandLine,
-StringRef CWD, const llvm::StringSet<> &AlreadySeen,
+StringRef CWD, const llvm::DenseSet<ModuleID> &AlreadySeen,
 LookupModuleOutputCallback LookupModuleOutput) {
   FullDependencyConsumer Consumer(AlreadySeen);
   CallbackActionController Controller(LookupModuleOutput);
Index: clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
===
--- clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
+++ clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
@@ -17,6 +17,7 @@
 #include "clang/Lex/PPCallbacks.h"
 #include "clang/Serialization/ASTReader.h"
 #include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/Hashing.h"
 #include "llvm/ADT/StringSet.h"
 #include "llvm/Support/raw_ostream.h"
 #include 
@@ -296,6 +297,10 @@
 } // end namespace clang
 
 namespace llvm {
+llvm::hash_code hash_value(clang::tooling::dependencies::ModuleID ID) {
+  return llvm::hash_combine(ID.ModuleName, ID.ContextHash);
+}
+
 template <> struct DenseMapInfo {
   using ModuleID = clang::tooling::dependencies::ModuleID;
   static inline ModuleID getEmptyKey() { return ModuleID{"", ""}; }
@@ -303,7 +308,7 @@
 return ModuleID{"~", "~"}; // ~ is not a valid module name or context hash
   }
   static unsigned getHashValue(const ModuleID &ID) {
-return hash_combine(ID.ModuleName, ID.ContextHash);
+return llvm::hash_value(ID);
   }
   static bool isEqual(const ModuleID &LHS, const ModuleID &RHS) {
 return LHS == RHS;
Index: clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
===
--- clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
+++ clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
@@ -13,9 +13,8 @@
 #include "clang/Tooling/DependencyScanning/DependencyScanningWorker.h"
 #include "clang/Tooli

[PATCH] D156234: [clang][deps] provide support for cc1 command line scanning

2023-07-25 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

This looks pretty good!

I'm not sure unit testing is the best choice here, since we're not checking for 
low-level properties or hard-to-observe behavior. In general, LIT tests are 
easier to write/maintain/understand and don't require recompiling, so I'd 
suggest transforming the existing unit test into something similar to the tests 
in `clang/test/ClangScanDeps/`.




Comment at: 
clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp:414
+  std::vector Args = Action.takeLastCC1Arguments();
+  Consumer.handleBuildCommand({Executable, std::move(Args)});
+  return true;





Comment at: 
clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp:507
+  // dependency scanning filesystem.
+  return createAndRunToolInvocation(Argv, Action, *FileMgr,
+PCHContainerOps, *Diags, Consumer);




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156234/new/

https://reviews.llvm.org/D156234



[PATCH] D156000: Track the RequestingModule in the HeaderSearch LookupFile cache.

2023-07-21 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 accepted this revision.
jansvoboda11 added a comment.
This revision is now accepted and ready to land.

LGTM, thanks for the quick turnaround!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156000/new/

https://reviews.llvm.org/D156000



[PATCH] D132779: Enforce module decl-use restrictions and private header restrictions in textual headers

2023-07-21 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

Hi @rsmith, this commit makes it possible for `HeaderInfo::LookupFile()` to be 
called with a different `RequestingModule` within a single `CompilerInstance`. 
This is problematic, since some modules may see headers that other modules 
can't (due to `[no_undeclared_includes]`). This can permanently corrupt the 
contents of the lookup cache (`HeaderSearch::LookupFileCache`), which uses only 
the lookup name as the key. You can see the minimal reproducer below. On our 
side, we can work around this by using 
`-fno-modules-validate-textual-header-includes`, but I think this will need to 
be fixed before that option goes away.

  // RUN: rm -rf %t
  // RUN: split-file %s %t
  
  //--- include/module.modulemap
  module A [no_undeclared_includes] { textual header "A.h" }
  module B { header "B.h" }
  //--- include/A.h
  #if __has_include(<B.h>)
  #error Even textual headers within module A now inherit [no_undeclared_includes] \
         and thus do not have that include.
  #endif
  //--- include/B.h
  
  //--- tu.c
  #if !__has_include(<B.h>)
  #error Main TU does have that include.
  #endif
  
  #include "A.h"
  
  #if !__has_include(<B.h>)
  #error Main TU still has that include.
  // We hit the above because the unsuccessful __has_include check in A.h taints
  // lookup cache (HeaderSearch::LookupFileCache) of this CompilerInstance.
  #endif
  
  // RUN: %clang_cc1 -I %t/include -fmodules -fimplicit-module-maps \
  // RUN:   -fmodules-cache-path=%t/cache -fsyntax-only %t/tu.c
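
The keying problem can also be sketched in isolation. The code below is not 
the `HeaderSearch` implementation: `canSee()`, `NameKeyedCache` and 
`ModuleAwareCache` are hypothetical names, the visibility rule is hard-coded 
to mirror the reproducer, and the real cache stores more than a boolean. The 
order of lookups also differs from the reproducer, so this is only meant to 
show why the requesting module has to be part of the cache key (or otherwise 
tracked).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical visibility rule modeled on the reproducer: module "A" is
// [no_undeclared_includes] and cannot see B.h; everything else can.
inline bool canSee(const std::string &RequestingModule,
                   const std::string &Header) {
  return !(RequestingModule == "A" && Header == "B.h");
}

// Cache keyed only by the lookup name: the first requester decides the
// answer for everyone that asks later.
struct NameKeyedCache {
  std::map<std::string, bool> Cache;

  bool lookup(const std::string &RequestingModule, const std::string &Header) {
    auto It = Cache.find(Header);
    if (It == Cache.end())
      It = Cache.emplace(Header, canSee(RequestingModule, Header)).first;
    return It->second;
  }
};

// Cache keyed by (requesting module, lookup name): answers stay separate.
struct ModuleAwareCache {
  std::map<std::pair<std::string, std::string>, bool> Cache;

  bool lookup(const std::string &RequestingModule, const std::string &Header) {
    auto Key = std::make_pair(RequestingModule, Header);
    auto It = Cache.find(Key);
    if (It == Cache.end())
      It = Cache.emplace(Key, canSee(RequestingModule, Header)).first;
    return It->second;
  }
};
```

If module A asks for B.h first, the name-keyed cache keeps answering "not 
found" for the main TU (represented here by an empty module name) as well. The 
module-aware cache gives the main TU its own entry and finds the header.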


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132779/new/

https://reviews.llvm.org/D132779



[PATCH] D150320: [clang][modules][deps] Avoid checks for relocated modules

2023-07-17 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG227f71995804: [clang][modules][deps] Avoid checks for 
relocated modules (authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150320/new/

https://reviews.llvm.org/D150320

Files:
  clang/include/clang/Lex/PreprocessorOptions.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
  clang/test/ClangScanDeps/header-search-pruning-transitive.c

Index: clang/test/ClangScanDeps/header-search-pruning-transitive.c
===
--- clang/test/ClangScanDeps/header-search-pruning-transitive.c
+++ clang/test/ClangScanDeps/header-search-pruning-transitive.c
@@ -72,8 +72,8 @@
 // CHECK:],
 // CHECK-NEXT:   "context-hash": "[[HASH_X:.*]]",
 // CHECK-NEXT:   "file-deps": [
-// CHECK-NEXT: "[[PREFIX]]/./X.h",
-// CHECK-NEXT: "[[PREFIX]]/./module.modulemap"
+// CHECK-NEXT: "[[PREFIX]]/X.h",
+// CHECK-NEXT: "[[PREFIX]]/module.modulemap"
 // CHECK-NEXT:   ],
 // CHECK-NEXT:   "name": "X"
 // CHECK-NEXT: },
@@ -84,11 +84,11 @@
 // CHECK:],
 // CHECK-NEXT:   "context-hash": "[[HASH_Y_WITH_A]]",
 // CHECK-NEXT:   "file-deps": [
-// CHECK-NEXT: "[[PREFIX]]/./Y.h",
-// CHECK-NEXT: "[[PREFIX]]/./a/a.h",
-// CHECK-NEXT: "[[PREFIX]]/./begin/begin.h",
-// CHECK-NEXT: "[[PREFIX]]/./end/end.h",
-// CHECK-NEXT: "[[PREFIX]]/./module.modulemap"
+// CHECK-NEXT: "[[PREFIX]]/Y.h",
+// CHECK-NEXT: "[[PREFIX]]/a/a.h",
+// CHECK-NEXT: "[[PREFIX]]/begin/begin.h",
+// CHECK-NEXT: "[[PREFIX]]/end/end.h",
+// CHECK-NEXT: "[[PREFIX]]/module.modulemap"
 // CHECK-NEXT:   ],
 // CHECK-NEXT:   "name": "Y"
 // CHECK-NEXT: }
@@ -126,8 +126,8 @@
 // also has a different context hash from the first version of module X.
 // CHECK-NOT:"context-hash": "[[HASH_X]]",
 // CHECK:"file-deps": [
-// CHECK-NEXT: "[[PREFIX]]/./X.h",
-// CHECK-NEXT: "[[PREFIX]]/./module.modulemap"
+// CHECK-NEXT: "[[PREFIX]]/X.h",
+// CHECK-NEXT: "[[PREFIX]]/module.modulemap"
 // CHECK-NEXT:   ],
 // CHECK-NEXT:   "name": "X"
 // CHECK-NEXT: },
@@ -138,10 +138,10 @@
 // CHECK:],
 // CHECK-NEXT:   "context-hash": "[[HASH_Y_WITHOUT_A]]",
 // CHECK-NEXT:   "file-deps": [
-// CHECK-NEXT: "[[PREFIX]]/./Y.h",
-// CHECK-NEXT: "[[PREFIX]]/./begin/begin.h",
-// CHECK-NEXT: "[[PREFIX]]/./end/end.h",
-// CHECK-NEXT: "[[PREFIX]]/./module.modulemap"
+// CHECK-NEXT: "[[PREFIX]]/Y.h",
+// CHECK-NEXT: "[[PREFIX]]/begin/begin.h",
+// CHECK-NEXT: "[[PREFIX]]/end/end.h",
+// CHECK-NEXT: "[[PREFIX]]/module.modulemap"
 // CHECK-NEXT:   ],
 // CHECK-NEXT:   "name": "Y"
 // CHECK-NEXT: }
Index: clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -253,6 +253,9 @@
 // context hashing.
 ScanInstance.getHeaderSearchOpts().ModulesStrictContextHash = true;
 
+// Avoid some checks and module map parsing when loading PCM files.
+ScanInstance.getPreprocessorOpts().ModulesCheckRelocated = false;
+
 std::unique_ptr Action;
 
 if (ModuleName)
Index: clang/lib/Serialization/ASTReader.cpp
===
--- clang/lib/Serialization/ASTReader.cpp
+++ clang/lib/Serialization/ASTReader.cpp
@@ -2990,6 +2990,9 @@
   BaseDirectoryAsWritten = Blob;
   assert(!F.ModuleName.empty() &&
  "MODULE_DIRECTORY found before MODULE_NAME");
+  F.BaseDirectory = std::string(Blob);
+  if (!PP.getPreprocessorOpts().ModulesCheckRelocated)
+break;
   // If we've already loaded a module map file covering this module, we may
   // have a better path for it (relative to the current build).
   Module *M = PP.getHeaderSearchInfo().lookupModule(
@@ -3011,8 +3014,6 @@
   }
 }
 F.BaseDirectory = std::string(M->Directory->getName());
-  } else {
-F.BaseDirectory = std::string(Blob);
   }
   break;
 }
@@ -3990,7 +3991,8 @@
   // usable header search context.
   assert(!F.ModuleName.empty() &&
  "MODULE_NAME should come before MODULE_MAP_FILE");
-  if (F.Kind == MK_ImplicitModule && ModuleMgr.begin()->Kind != MK_MainFile) {
+  if (PP.getPreprocessorOpts().ModulesCheckRelocated &&
+  F.Kind == MK_ImplicitModule && ModuleMgr.begin()->Kind != MK_MainFile) {
 // An implicitly-loaded module file should h

[PATCH] D150479: [clang][modules] Skip submodule & framework re-definitions in module maps

2023-07-17 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGdba2b5c9314e: [clang][modules] Skip submodule & 
framework re-definitions in module maps (authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150479/new/

https://reviews.llvm.org/D150479

Files:
  clang/lib/Lex/ModuleMap.cpp
  clang/test/Modules/no-check-relocated-fw-private-sub.c
  clang/test/Modules/shadow-framework.m


Index: clang/test/Modules/shadow-framework.m
===
--- /dev/null
+++ clang/test/Modules/shadow-framework.m
@@ -0,0 +1,20 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// This test checks that redefinitions of frameworks are ignored.
+
+//--- include/module.modulemap
+module first { header "first.h" }
+module FW {}
+//--- include/first.h
+
+//--- frameworks/FW.framework/Modules/module.modulemap
+framework module FW { header "FW.h" }
+//--- frameworks/FW.framework/Headers/FW.h
+
+//--- tu.c
+#import "first.h" // expected-remark {{importing module 'first'}}
+#import 
+
+// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t/cache -fimplicit-module-maps \
+// RUN:   -I %t/include -F %t/frameworks -fsyntax-only %t/tu.c -Rmodule-import -verify
Index: clang/test/Modules/no-check-relocated-fw-private-sub.c
===
--- /dev/null
+++ clang/test/Modules/no-check-relocated-fw-private-sub.c
@@ -0,0 +1,23 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+//--- frameworks1/FW1.framework/Modules/module.modulemap
+framework module FW1 { header "FW1.h" }
+//--- frameworks1/FW1.framework/Headers/FW1.h
+#import 
+
+//--- frameworks2/FW2.framework/Modules/module.modulemap
+framework module FW2 { header "FW2.h" }
+//--- frameworks2/FW2.framework/Modules/module.private.modulemap
+framework module FW2.Private { header "FW2_Private.h" }
+//--- frameworks2/FW2.framework/Headers/FW2.h
+//--- frameworks2/FW2.framework/PrivateHeaders/FW2_Private.h
+
+//--- tu.c
+#import  // expected-remark{{importing module 'FW1'}} \
+// expected-remark{{importing module 'FW2'}}
+#import  // expected-remark{{importing module 'FW2'}}
+
+// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t/cache -fimplicit-module-maps \
+// RUN:   -F %t/frameworks1 -F %t/frameworks2 -fsyntax-only %t/tu.c \
+// RUN:   -fno-modules-check-relocated -Rmodule-import -verify
Index: clang/lib/Lex/ModuleMap.cpp
===
--- clang/lib/Lex/ModuleMap.cpp
+++ clang/lib/Lex/ModuleMap.cpp
@@ -2016,12 +2016,28 @@
   Module *ShadowingModule = nullptr;
   if (Module *Existing = Map.lookupModuleQualified(ModuleName, ActiveModule)) {
 // We might see a (re)definition of a module that we already have a
-// definition for in three cases:
+// definition for in four cases:
 //  - If we loaded one definition from an AST file and we've just found a
 //corresponding definition in a module map file, or
 bool LoadedFromASTFile = Existing->IsFromModuleFile;
 //  - If we previously inferred this module from different module map file.
 bool Inferred = Existing->IsInferred;
+//  - If we're building a framework that vends a module map, we might've
+//previously seen the one in intermediate products and now the system
+//one.
+// FIXME: If we're parsing module map file that looks like this:
+//  framework module FW { ... }
+//  module FW.Sub { ... }
+//We can't check the framework qualifier, since it's not attached to
+//the definition of Sub. Checking that qualifier on \c Existing is
+//not correct either, since we might've previously seen:
+//  module FW { ... }
+//  module FW.Sub { ... }
+//We should enforce consistency of redefinitions so that we can rely
+//that \c Existing is part of a framework iff the redefinition of FW
+//we have just skipped had it too. Once we do that, stop checking
+//the local framework qualifier and only rely on \c Existing.
+bool PartOfFramework = Framework || Existing->isPartOfFramework();
 //  - If we're building a (preprocessed) module and we've just loaded the
 //module map file from which it was created.
 bool ParsedAsMainInput =
@@ -2029,7 +2045,8 @@
 Map.LangOpts.CurrentModule == ModuleName &&
 SourceMgr.getDecomposedLoc(ModuleNameLoc).first !=
 SourceMgr.getDecomposedLoc(Existing->DefinitionLoc).first;
-if (!ActiveModule && (LoadedFromASTFile || Inferred || ParsedAsMainInput)) {
+if (LoadedFromASTFile || Inferred || PartOfFramework || ParsedAsMainInput) {
+  ActiveModule = PreviousActiveModule;
   // Skip the module definition.
   skipUntil(MMToken::RB

[PATCH] D150478: [clang][modules][deps] Parse "FW_Private" module map even after loading "FW" PCM

2023-07-17 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGbe014563f2f4: [clang][modules][deps] Parse 
"FW_Private" module map even after loading "FW" PCM 
(authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150478/new/

https://reviews.llvm.org/D150478

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Lex/HeaderSearch.cpp
  clang/test/Modules/no-check-relocated-fw-private.c


Index: clang/test/Modules/no-check-relocated-fw-private.c
===
--- /dev/null
+++ clang/test/Modules/no-check-relocated-fw-private.c
@@ -0,0 +1,23 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+//--- frameworks1/FW1.framework/Modules/module.modulemap
+framework module FW1 { header "FW1.h" }
+//--- frameworks1/FW1.framework/Headers/FW1.h
+#import 
+
+//--- frameworks2/FW2.framework/Modules/module.modulemap
+framework module FW2 { header "FW2.h" }
+//--- frameworks2/FW2.framework/Modules/module.private.modulemap
+framework module FW2_Private { header "FW2_Private.h" }
+//--- frameworks2/FW2.framework/Headers/FW2.h
+//--- frameworks2/FW2.framework/PrivateHeaders/FW2_Private.h
+
+//--- tu.c
+#import  // expected-remark{{importing module 'FW1'}} \
+// expected-remark{{importing module 'FW2'}}
+#import  // expected-remark{{importing module 'FW2_Private'}}
+
+// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t/cache -fimplicit-module-maps \
+// RUN:   -F %t/frameworks1 -F %t/frameworks2 -fsyntax-only %t/tu.c \
+// RUN:   -fno-modules-check-relocated -Rmodule-import -verify
Index: clang/lib/Lex/HeaderSearch.cpp
===
--- clang/lib/Lex/HeaderSearch.cpp
+++ clang/lib/Lex/HeaderSearch.cpp
@@ -1783,9 +1783,6 @@
 
 Module *HeaderSearch::loadFrameworkModule(StringRef Name, DirectoryEntryRef Dir,
   bool IsSystem) {
-  if (Module *Module = ModMap.findModule(Name))
-return Module;
-
   // Try to load a module map file.
   switch (loadModuleMapFile(Dir, IsSystem, /*IsFramework*/true)) {
   case LMM_InvalidModuleMap:
@@ -1794,10 +1791,10 @@
   ModMap.inferFrameworkModule(Dir, IsSystem, /*Parent=*/nullptr);
 break;
 
-  case LMM_AlreadyLoaded:
   case LMM_NoDirectory:
 return nullptr;
 
+  case LMM_AlreadyLoaded:
   case LMM_NewlyLoaded:
 break;
   }
Index: clang/include/clang/Driver/Options.td
===
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -2588,6 +2588,10 @@
 defm implicit_modules : BoolFOption<"implicit-modules",
   LangOpts<"ImplicitModules">, DefaultTrue,
   NegFlag, PosFlag, BothFlags<[NoXarchOption,CoreOption]>>;
+def fno_modules_check_relocated : Joined<["-"], "fno-modules-check-relocated">,
+  Group, Flags<[CC1Option]>,
+  HelpText<"Skip checks for relocated modules when loading PCM files">,
+  MarshallingInfoNegativeFlag>;
 def fretain_comments_from_system_headers : Flag<["-"], "fretain-comments-from-system-headers">, Group, Flags<[CC1Option]>,
   MarshallingInfoFlag>;
 def fmodule_header : Flag <["-"], "fmodule-header">, Group,



[PATCH] D150292: [clang][modules] Serialize `Module::DefinitionLoc`

2023-07-17 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGabcf7ce45794: [clang][modules] Serialize 
`Module::DefinitionLoc` (authored by jansvoboda11).

Changed prior to commit:
  https://reviews.llvm.org/D150292?vs=539120&id=541223#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150292/new/

https://reviews.llvm.org/D150292

Files:
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/lib/Lex/ModuleMap.cpp
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp

Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -200,7 +200,9 @@
 CB(F);
 FileID FID = SourceMgr.translateFile(F);
 SourceLocation Loc = SourceMgr.getIncludeLoc(FID);
-while (Loc.isValid()) {
+// The include location of inferred module maps can point into the header
+// file that triggered the inferring. Cut off the walk if that's the case.
+while (Loc.isValid() && isModuleMap(SourceMgr.getFileCharacteristic(Loc))) {
   FID = SourceMgr.getFileID(Loc);
   CB(*SourceMgr.getFileEntryRefForID(FID));
   Loc = SourceMgr.getIncludeLoc(FID);
@@ -209,11 +211,18 @@
 
   auto ProcessModuleOnce = [&](const Module *M) {
 for (const Module *Mod = M; Mod; Mod = Mod->Parent)
-  if (ProcessedModules.insert(Mod).second)
+  if (ProcessedModules.insert(Mod).second) {
+auto Insert = [&](FileEntryRef F) { ModuleMaps.insert(F); };
+// The containing module map is affecting, because it's being pointed
+// into by Module::DefinitionLoc.
+if (auto ModuleMapFile = MM.getContainingModuleMapFile(Mod))
+  ForIncludeChain(*ModuleMapFile, Insert);
+// For inferred modules, the module map that allowed inferring is not in
+// the include chain of the virtual containing module map file. It did
+// affect the compilation, though.
 if (auto ModuleMapFile = MM.getModuleMapFileForUniquing(Mod))
-  ForIncludeChain(*ModuleMapFile, [&](FileEntryRef F) {
-ModuleMaps.insert(F);
-  });
+  ForIncludeChain(*ModuleMapFile, Insert);
+  }
   };
 
   for (const Module *CurrentModule : ModulesToProcess) {
@@ -2687,6 +2696,7 @@
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // ID
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Parent
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 4)); // Kind
+  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // Definition location
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // IsFramework
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // IsExplicit
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // IsSystem
@@ -2787,12 +2797,16 @@
   ParentID = SubmoduleIDs[Mod->Parent];
 }
 
+uint64_t DefinitionLoc =
+SourceLocationEncoding::encode(getAdjustedLocation(Mod->DefinitionLoc));
+
 // Emit the definition of the block.
 {
   RecordData::value_type Record[] = {SUBMODULE_DEFINITION,
  ID,
  ParentID,
  (RecordData::value_type)Mod->Kind,
+ DefinitionLoc,
  Mod->IsFramework,
  Mod->IsExplicit,
  Mod->IsSystem,
Index: clang/lib/Serialization/ASTReader.cpp
===
--- clang/lib/Serialization/ASTReader.cpp
+++ clang/lib/Serialization/ASTReader.cpp
@@ -5607,7 +5607,7 @@
   break;
 
 case SUBMODULE_DEFINITION: {
-  if (Record.size() < 12)
+  if (Record.size() < 13)
 return llvm::createStringError(std::errc::illegal_byte_sequence,
"malformed module definition");
 
@@ -5616,6 +5616,7 @@
   SubmoduleID GlobalID = getGlobalSubmoduleID(F, Record[Idx++]);
   SubmoduleID Parent = getGlobalSubmoduleID(F, Record[Idx++]);
   Module::ModuleKind Kind = (Module::ModuleKind)Record[Idx++];
+  SourceLocation DefinitionLoc = ReadSourceLocation(F, Record[Idx++]);
   bool IsFramework = Record[Idx++];
   bool IsExplicit = Record[Idx++];
   bool IsSystem = Record[Idx++];
@@ -5636,8 +5637,7 @@
   ModMap.findOrCreateModule(Name, ParentModule, IsFramework, IsExplicit)
   .first;
 
-  // FIXME: set the definition loc for CurrentModule, or call
-  // ModMap.setInferredModuleAllowedBy()
+  // FIXME: Call ModMap.setInferredModuleAllowedBy()
 
   SubmoduleID GlobalIndex = GlobalID - NUM_PREDEF_SUBMODULE_IDS;
   if (GlobalIndex >= SubmodulesLoaded.size() ||
@@ 

[PATCH] D114173: [clang][modules] Apply local submodule visibility to includes

2023-07-17 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 541218.
jansvoboda11 edited the summary of this revision.
jansvoboda11 added a comment.

Rebase on top of D155503.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114173/new/

https://reviews.llvm.org/D114173

Files:
  clang/include/clang/Lex/Preprocessor.h
  clang/test/Modules/import-textual-noguard.mm


Index: clang/test/Modules/import-textual-noguard.mm
===
--- clang/test/Modules/import-textual-noguard.mm
+++ clang/test/Modules/import-textual-noguard.mm
@@ -1,7 +1,9 @@
 // RUN: rm -rf %t
 // RUN: %clang_cc1 -fsyntax-only -std=c++11 -fmodules -fimplicit-module-maps -I%S/Inputs/import-textual/M2 -fmodules-cache-path=%t -x objective-c++ -fmodules-local-submodule-visibility %s -verify
 
-#include "A/A.h" // expected-error {{could not build module 'M'}}
+// expected-no-diagnostics
+
+#include "A/A.h"
 #include "B/B.h"
 
 typedef aint xxx;
Index: clang/include/clang/Lex/Preprocessor.h
===
--- clang/include/clang/Lex/Preprocessor.h
+++ clang/include/clang/Lex/Preprocessor.h
@@ -976,6 +976,9 @@
 /// The macros for the submodule.
 MacroMap Macros;
 
+/// The set of files that have been included in the submodule.
+IncludedFilesSet IncludedFiles;
+
 /// The set of modules that are visible within the submodule.
 VisibleModuleSet VisibleModules;
 
@@ -995,9 +998,6 @@
   /// Files included outside of any module (e.g. in PCH) have nullptr key.
   llvm::DenseMap IncludedFilesPerSubmodule;
 
-  /// The files that have been included.
-  IncludedFilesSet IncludedFiles;
-
   /// The set of top-level modules that affected preprocessing, but were not
   /// imported.
   llvm::SmallSetVector AffectingClangModules;
@@ -1486,7 +1486,7 @@
   /// of its dependencies (transitively).
   void markTransitivelyIncluded(const FileEntry *File) {
 HeaderInfo.getFileInfo(File);
-IncludedFiles.insert(File);
+CurSubmoduleState->IncludedFiles.insert(File);
   }
 
   /// Mark the file as included in the current state and attribute it to the
@@ -1500,13 +1500,13 @@
 : getCurrentModule();
 IncludedFilesPerSubmodule[M].insert(File);
 
-return IncludedFiles.insert(File).second;
+return CurSubmoduleState->IncludedFiles.insert(File).second;
   }
 
   /// Return true if this header has already been included.
   bool alreadyIncluded(const FileEntry *File) const {
 HeaderInfo.getFileInfo(File);
-return IncludedFiles.count(File);
+return CurSubmoduleState->IncludedFiles.count(File);
   }
 
   /// Invoke the callback for every module known to include the given file.
@@ -1523,9 +1523,10 @@
 return IncludedFilesPerSubmodule[M];
   }
 
-  /// Get the set of included files.
-  IncludedFilesSet &getIncludedFiles() { return IncludedFiles; }
-  const IncludedFilesSet &getIncludedFiles() const { return IncludedFiles; }
+  /// Get the set of files included in the current state.
+  IncludedFilesSet &getIncludedFiles() {
+return CurSubmoduleState->IncludedFiles;
+  }
 
   /// Return the name of the macro defined before \p Loc that has
   /// spelling \p Tokens.  If there are multiple macros with same spelling,


[PATCH] D155503: [clang][modules] Track included files per submodule

2023-07-17 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

Note to self: lost the change that bumps PCM format version during a rebase. 
Make sure to increment the major version before committing.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155503/new/

https://reviews.llvm.org/D155503


[PATCH] D155503: [clang][modules] Track included files per submodule

2023-07-17 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added reviewers: benlangmuir, vsapsai, Bigcheese.
Herald added subscribers: ributzka, mgrang.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

When building a module consisting of submodules, the preprocessor maintains a
global set of included headers. This information gets serialized into the PCM
file (specifically into the HeaderFileInfo table). After loading such a PCM
file, this information is deserialized into the state of the importing
preprocessor. This happens even if the headers were included by (sub)modules
that are not visible. This can incorrectly prevent imports of textual headers
in the importing instance (see attached tests).

This patch fixes this bug by splitting the set of included files per submodule.
This is an alternative to D112915 and D104344.
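
For readers skimming the thread, the idea can be illustrated with a small standalone sketch. This is not the Clang implementation: `ToyPreprocessor`, `SubmoduleState`, `makeVisible()` and the string-based file and submodule names are hypothetical stand-ins. It only shows that a header included by an invisible submodule should not count as already included in the importer.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the per-submodule preprocessor state.
struct SubmoduleState {
  std::set<std::string> IncludedFiles;
};

class ToyPreprocessor {
  std::map<std::string, SubmoduleState> States; // keyed by submodule name
  std::set<std::string> Visible;                // submodules visible to the importer

public:
  // Record that Submodule included File while the module was built.
  void markIncluded(const std::string &Submodule, const std::string &File) {
    assert(!File.empty() && "expected a file name");
    States[Submodule].IncludedFiles.insert(File);
  }

  void makeVisible(const std::string &Submodule) { Visible.insert(Submodule); }

  // A file counts as already included only if a *visible* submodule
  // included it. A single global set would ignore visibility.
  bool alreadyIncluded(const std::string &File) const {
    for (const std::string &Name : Visible) {
      auto It = States.find(Name);
      if (It != States.end() && It->second.IncludedFiles.count(File))
        return true;
    }
    return false;
  }
};
```

With this model, importing only `D.D2` leaves `Textual.h` available for a later `#import`, which mirrors the `test_sub.c` case in the attached test.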


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D155503

Files:
  clang/include/clang/Lex/Preprocessor.h
  clang/lib/Lex/HeaderSearch.cpp
  clang/lib/Lex/Preprocessor.cpp
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTReaderInternals.h
  clang/lib/Serialization/ASTWriter.cpp
  clang/test/Modules/import-submodule-visibility.c

Index: clang/test/Modules/import-submodule-visibility.c
===
--- /dev/null
+++ clang/test/Modules/import-submodule-visibility.c
@@ -0,0 +1,64 @@
+// This test checks that imports of headers that appeared in a different submodule than
+// what is imported by the current TU don't affect the compilation.
+
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+//--- C/C.h
+#include "Textual.h"
+//--- C/module.modulemap
+module C { header "C.h" }
+
+//--- D/D1.h
+#include "Textual.h"
+//--- D/D2.h
+//--- D/module.modulemap
+module D {
+  module D1 { header "D1.h" }
+  module D2 { header "D2.h" }
+}
+
+//--- E/E1.h
+#include "E2.h"
+//--- E/E2.h
+#include "Textual.h"
+//--- E/module.modulemap
+module E {
+  module E1 { header "E1.h" }
+  module E2 { header "E2.h" }
+}
+
+//--- Textual.h
+#define MACRO_TEXTUAL 1
+
+//--- test_top.c
+#import "Textual.h"
+static int x = MACRO_TEXTUAL;
+
+//--- test_sub.c
+#import "D/D2.h"
+#import "Textual.h"
+static int x = MACRO_TEXTUAL;
+
+//--- test_transitive.c
+#import "E/E1.h"
+#import "Textual.h"
+static int x = MACRO_TEXTUAL;
+
+// Simply loading a PCM file containing top-level module including a header does
+// not prevent inclusion of that header in the TU.
+//
+// RUN: %clang_cc1 -fmodules -I %t -emit-module %t/C/module.modulemap -fmodule-name=C -o %t/C.pcm
+// RUN: %clang_cc1 -fmodules -I %t -fsyntax-only %t/test_top.c -fmodule-file=%t/C.pcm
+
+// Loading a PCM file and importing its empty submodule does not prevent
+// inclusion of headers included by invisible sibling submodules.
+//
+// RUN: %clang_cc1 -fmodules -I %t -emit-module %t/D/module.modulemap -fmodule-name=D -o %t/D.pcm
+// RUN: %clang_cc1 -fmodules -I %t -fsyntax-only %t/test_sub.c -fmodule-file=%t/D.pcm
+
+// Loading a PCM file and importing a submodule does not prevent inclusion of
+// headers included by some of its transitive un-exported dependencies.
+//
+// RUN: %clang_cc1 -fmodules -I %t -emit-module %t/E/module.modulemap -fmodule-name=E -o %t/E.pcm
+// RUN: %clang_cc1 -fmodules -I %t -fsyntax-only %t/test_transitive.c -fmodule-file=%t/E.pcm
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1762,7 +1762,7 @@
 
 struct data_type {
   const HeaderFileInfo &HFI;
-  bool AlreadyIncluded;
+  SmallVector IncludedIn;
   ArrayRef KnownHeaders;
   UnresolvedModule Unresolved;
 };
@@ -1787,6 +1787,7 @@
   DataLen += 4;
   if (Data.Unresolved.getPointer())
 DataLen += 4;
+  DataLen += 4 + 4 * Data.IncludedIn.size();
   return emitULEBKeyDataLength(KeyLen, DataLen, Out);
 }
 
@@ -1808,8 +1809,7 @@
   endian::Writer LE(Out, little);
   uint64_t Start = Out.tell(); (void)Start;
 
-  unsigned char Flags = (Data.AlreadyIncluded << 6)
-  | (Data.HFI.isImport << 5)
+  unsigned char Flags = (Data.HFI.isImport << 5)
   | (Data.HFI.isPragmaOnce << 4)
   | (Data.HFI.DirInfo << 1)
   | Data.HFI.IndexHeaderMapHeader;
@@ -1820,6 +1820,10 @@
   else
 LE.write(Writer.getIdentifierRef(Data.HFI.ControllingMacro));
 
+  LE.write(Data.IncludedIn.size());
+  for (uint32_t ModID : Data.IncludedIn)
+LE.write(ModID);
+
   unsigned Offset = 0;
   if (!Data.HFI.Framework.empty()) {
 // If this header refers into a framework, save the framework name.
@@ -1910,7 +1914,7 @@
   

[PATCH] D155131: [clang][modules] Deserialize included files lazily

2023-07-13 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG6504d87fc0c8: [clang][modules] Deserialize included files 
lazily (authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155131/new/

https://reviews.llvm.org/D155131

Files:
  clang/include/clang/Lex/Preprocessor.h
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/include/clang/Serialization/ASTReader.h
  clang/include/clang/Serialization/ASTWriter.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTReaderInternals.h
  clang/lib/Serialization/ASTWriter.cpp

Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -866,7 +866,6 @@
   RECORD(CUDA_PRAGMA_FORCE_HOST_DEVICE_DEPTH);
   RECORD(PP_CONDITIONAL_STACK);
   RECORD(DECLS_TO_CHECK_FOR_DEFERRED_DIAGS);
-  RECORD(PP_INCLUDED_FILES);
   RECORD(PP_ASSUME_NONNULL_LOC);
 
   // SourceManager Block.
@@ -1763,6 +1762,7 @@
 
 struct data_type {
   const HeaderFileInfo &HFI;
+  bool AlreadyIncluded;
   ArrayRef KnownHeaders;
   UnresolvedModule Unresolved;
 };
@@ -1808,7 +1808,8 @@
   endian::Writer LE(Out, little);
   uint64_t Start = Out.tell(); (void)Start;
 
-  unsigned char Flags = (Data.HFI.isImport << 5)
+  unsigned char Flags = (Data.AlreadyIncluded << 6)
+  | (Data.HFI.isImport << 5)
   | (Data.HFI.isPragmaOnce << 4)
   | (Data.HFI.DirInfo << 1)
   | Data.HFI.IndexHeaderMapHeader;
@@ -1909,7 +1910,7 @@
 HeaderFileInfoTrait::key_type Key = {
 FilenameDup, *U.Size, IncludeTimestamps ? *U.ModTime : 0};
 HeaderFileInfoTrait::data_type Data = {
-Empty, {}, {M, ModuleMap::headerKindToRole(U.Kind)}};
+Empty, false, {}, {M, ModuleMap::headerKindToRole(U.Kind)}};
 // FIXME: Deal with cases where there are multiple unresolved header
 // directives in different submodules for the same header.
 Generator.insert(Key, Data, GeneratorTrait);
@@ -1952,11 +1953,13 @@
   SavedStrings.push_back(Filename.data());
 }
 
+bool Included = PP->alreadyIncluded(File);
+
 HeaderFileInfoTrait::key_type Key = {
   Filename, File->getSize(), getTimestampForOutput(File)
 };
 HeaderFileInfoTrait::data_type Data = {
-  *HFI, HS.getModuleMap().findResolvedModulesForHeader(File), {}
+  *HFI, Included, HS.getModuleMap().findResolvedModulesForHeader(File), {}
 };
 Generator.insert(Key, Data, GeneratorTrait);
 ++NumHeaderSearchEntries;
@@ -2262,29 +2265,6 @@
   return false;
 }
 
-void ASTWriter::writeIncludedFiles(raw_ostream &Out, const Preprocessor &PP) {
-  using namespace llvm::support;
-
-  const Preprocessor::IncludedFilesSet &IncludedFiles = PP.getIncludedFiles();
-
-  std::vector IncludedInputFileIDs;
-  IncludedInputFileIDs.reserve(IncludedFiles.size());
-
-  for (const FileEntry *File : IncludedFiles) {
-auto InputFileIt = InputFileIDs.find(File);
-if (InputFileIt == InputFileIDs.end())
-  continue;
-IncludedInputFileIDs.push_back(InputFileIt->second);
-  }
-
-  llvm::sort(IncludedInputFileIDs);
-
-  endian::Writer LE(Out, little);
-  LE.write(IncludedInputFileIDs.size());
-  for (uint32_t ID : IncludedInputFileIDs)
-LE.write(ID);
-}
-
 /// Writes the block containing the serialized form of the
 /// preprocessor.
 void ASTWriter::WritePreprocessor(const Preprocessor &PP, bool IsModule) {
@@ -2533,20 +2513,6 @@
MacroOffsetsBase - ASTBlockStartOffset};
 Stream.EmitRecordWithBlob(MacroOffsetAbbrev, Record, bytes(MacroOffsets));
   }
-
-  {
-auto Abbrev = std::make_shared();
-Abbrev->Add(BitCodeAbbrevOp(PP_INCLUDED_FILES));
-Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));
-unsigned IncludedFilesAbbrev = Stream.EmitAbbrev(std::move(Abbrev));
-
-SmallString<2048> Buffer;
-raw_svector_ostream Out(Buffer);
-writeIncludedFiles(Out, PP);
-RecordData::value_type Record[] = {PP_INCLUDED_FILES};
-Stream.EmitRecordWithBlob(IncludedFilesAbbrev, Record, Buffer.data(),
-  Buffer.size());
-  }
 }
 
 void ASTWriter::WritePreprocessorDetail(PreprocessingRecord &PPRec,
Index: clang/lib/Serialization/ASTReaderInternals.h
===
--- clang/lib/Serialization/ASTReaderInternals.h
+++ clang/lib/Serialization/ASTReaderInternals.h
@@ -276,6 +276,9 @@
   static internal_key_type ReadKey(const unsigned char *d, unsigned);
 
   data_type ReadData(internal_key_ref,const unsigned char *d, unsigned DataLen);
+
+private:
+  const FileEntry *getFile(const internal_key_type &Key);
 };
 
 /// Th

[PATCH] D155131: [clang][modules] Deserialize included files lazily

2023-07-13 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 540174.
jansvoboda11 added a comment.
Herald added a subscriber: mgrang.

Remove dead code, make sure `getFileInfo()` is called in `alreadyIncluded()`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155131/new/

https://reviews.llvm.org/D155131

Files:
  clang/include/clang/Lex/Preprocessor.h
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/include/clang/Serialization/ASTReader.h
  clang/include/clang/Serialization/ASTWriter.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTReaderInternals.h
  clang/lib/Serialization/ASTWriter.cpp

Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -866,7 +866,6 @@
   RECORD(CUDA_PRAGMA_FORCE_HOST_DEVICE_DEPTH);
   RECORD(PP_CONDITIONAL_STACK);
   RECORD(DECLS_TO_CHECK_FOR_DEFERRED_DIAGS);
-  RECORD(PP_INCLUDED_FILES);
   RECORD(PP_ASSUME_NONNULL_LOC);
 
   // SourceManager Block.
@@ -1763,6 +1762,7 @@
 
 struct data_type {
   const HeaderFileInfo &HFI;
+  bool AlreadyIncluded;
   ArrayRef KnownHeaders;
   UnresolvedModule Unresolved;
 };
@@ -1808,7 +1808,8 @@
   endian::Writer LE(Out, little);
   uint64_t Start = Out.tell(); (void)Start;
 
-  unsigned char Flags = (Data.HFI.isImport << 5)
+  unsigned char Flags = (Data.AlreadyIncluded << 6)
+  | (Data.HFI.isImport << 5)
   | (Data.HFI.isPragmaOnce << 4)
   | (Data.HFI.DirInfo << 1)
   | Data.HFI.IndexHeaderMapHeader;
@@ -1909,7 +1910,7 @@
 HeaderFileInfoTrait::key_type Key = {
 FilenameDup, *U.Size, IncludeTimestamps ? *U.ModTime : 0};
 HeaderFileInfoTrait::data_type Data = {
-Empty, {}, {M, ModuleMap::headerKindToRole(U.Kind)}};
+Empty, false, {}, {M, ModuleMap::headerKindToRole(U.Kind)}};
 // FIXME: Deal with cases where there are multiple unresolved header
 // directives in different submodules for the same header.
 Generator.insert(Key, Data, GeneratorTrait);
@@ -1952,11 +1953,13 @@
   SavedStrings.push_back(Filename.data());
 }
 
+bool Included = PP->alreadyIncluded(File);
+
 HeaderFileInfoTrait::key_type Key = {
   Filename, File->getSize(), getTimestampForOutput(File)
 };
 HeaderFileInfoTrait::data_type Data = {
-  *HFI, HS.getModuleMap().findResolvedModulesForHeader(File), {}
+  *HFI, Included, HS.getModuleMap().findResolvedModulesForHeader(File), {}
 };
 Generator.insert(Key, Data, GeneratorTrait);
 ++NumHeaderSearchEntries;
@@ -2262,29 +2265,6 @@
   return false;
 }
 
-void ASTWriter::writeIncludedFiles(raw_ostream &Out, const Preprocessor &PP) {
-  using namespace llvm::support;
-
-  const Preprocessor::IncludedFilesSet &IncludedFiles = PP.getIncludedFiles();
-
-  std::vector IncludedInputFileIDs;
-  IncludedInputFileIDs.reserve(IncludedFiles.size());
-
-  for (const FileEntry *File : IncludedFiles) {
-auto InputFileIt = InputFileIDs.find(File);
-if (InputFileIt == InputFileIDs.end())
-  continue;
-IncludedInputFileIDs.push_back(InputFileIt->second);
-  }
-
-  llvm::sort(IncludedInputFileIDs);
-
-  endian::Writer LE(Out, little);
-  LE.write(IncludedInputFileIDs.size());
-  for (uint32_t ID : IncludedInputFileIDs)
-LE.write(ID);
-}
-
 /// Writes the block containing the serialized form of the
 /// preprocessor.
 void ASTWriter::WritePreprocessor(const Preprocessor &PP, bool IsModule) {
@@ -2533,20 +2513,6 @@
MacroOffsetsBase - ASTBlockStartOffset};
 Stream.EmitRecordWithBlob(MacroOffsetAbbrev, Record, bytes(MacroOffsets));
   }
-
-  {
-auto Abbrev = std::make_shared();
-Abbrev->Add(BitCodeAbbrevOp(PP_INCLUDED_FILES));
-Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));
-unsigned IncludedFilesAbbrev = Stream.EmitAbbrev(std::move(Abbrev));
-
-SmallString<2048> Buffer;
-raw_svector_ostream Out(Buffer);
-writeIncludedFiles(Out, PP);
-RecordData::value_type Record[] = {PP_INCLUDED_FILES};
-Stream.EmitRecordWithBlob(IncludedFilesAbbrev, Record, Buffer.data(),
-  Buffer.size());
-  }
 }
 
 void ASTWriter::WritePreprocessorDetail(PreprocessingRecord &PPRec,
Index: clang/lib/Serialization/ASTReaderInternals.h
===
--- clang/lib/Serialization/ASTReaderInternals.h
+++ clang/lib/Serialization/ASTReaderInternals.h
@@ -276,6 +276,9 @@
   static internal_key_type ReadKey(const unsigned char *d, unsigned);
 
   data_type ReadData(internal_key_ref,const unsigned char *d, unsigned DataLen);
+
+private:
+  const FileEntry *getFile(const internal_key_type &Key);
 };
 
 /// The on-disk hash table used for known header file

[PATCH] D155131: [clang][modules] Deserialize included files lazily

2023-07-13 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

In D155131#4495489, @benlangmuir wrote:

> Now that it's not eagerly deserialized, should 
> `Preprocessor::alreadyIncluded` call `HeaderInfo.getFileInfo(File)` to ensure 
> the information is up to date?

It should, but `Preprocessor::alreadyIncluded()` is only called from 
`HeaderSearch::ShouldEnterIncludeFile()` and 
`Preprocessor::HandleHeaderIncludeOrImport()`, where 
`HeaderSearch::getFileInfo(File)` has already been called. But I agree it would 
be better to ensure that within `Preprocessor::alreadyIncluded()` itself. I'll 
try to include that in the next revision.

> Similarly, we expose the list of files in `Preprocessor::getIncludedFiles` -- 
> is it okay if this list is incomplete?

That should be okay. This function only needs to return files included in the 
current module compilation, not all transitive includes.




Comment at: clang/lib/Serialization/ASTReader.cpp:1947
+if (const FileEntry *FE = getFile(key))
+  Reader.getPreprocessor().getIncludedFiles().insert(FE);
+

benlangmuir wrote:
> `Reader.getPreprocessor().markIncluded`?
That would trigger infinite recursion, since that calls `getFileInfo()` which 
attempts to deserialize.
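
The re-entrancy concern can be modelled with a toy class. This is a hedged sketch, not the actual `ASTReader`/`Preprocessor` code: `ToyHeaderState` and `readData()` are hypothetical, and the model assumes that every serialized entry has its "included" bit set and that an entry is flagged as loaded only after its data has been read.

```cpp
#include <cassert>
#include <set>
#include <string>

class ToyHeaderState {
  std::set<std::string> Loaded;        // entries whose data has been read
  std::set<std::string> IncludedFiles; // importer-side included-file set
  unsigned Depth = 0;                  // guards against re-entrant reads

  // Stand-in for reading one serialized entry.
  void readData(const std::string &File) {
    // Insert directly into the set. Calling markIncluded(File) here would
    // re-enter getFileInfo(File) before the entry is flagged as loaded.
    IncludedFiles.insert(File);
  }

public:
  // Lazily read the entry for File on first use.
  void getFileInfo(const std::string &File) {
    if (Loaded.count(File))
      return;
    ++Depth;
    assert(Depth == 1 && "re-entered deserialization");
    readData(File);
    --Depth;
    Loaded.insert(File);
  }

  // Returns true if File was not included before this call.
  bool markIncluded(const std::string &File) {
    getFileInfo(File);
    return IncludedFiles.insert(File).second;
  }

  bool alreadyIncluded(const std::string &File) {
    getFileInfo(File);
    return IncludedFiles.count(File) != 0;
  }
};
```

In this model, replacing the direct insert in `readData()` with a call to `markIncluded()` would trip the depth assertion, which is the toy equivalent of the recursion described above.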



Comment at: clang/lib/Serialization/ASTWriter.cpp:2545
-raw_svector_ostream Out(Buffer);
-writeIncludedFiles(Out, PP);
-RecordData::value_type Record[] = {PP_INCLUDED_FILES};

benlangmuir wrote:
> Can we remove `ASTWriter::writeIncludedFiles` now?
Yes, forgot about that, thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155131/new/

https://reviews.llvm.org/D155131


[PATCH] D155131: [clang][modules] Deserialize included files lazily

2023-07-12 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added reviewers: benlangmuir, vsapsai, Bigcheese.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

In D114095, `HeaderFileInfo::NumIncludes` was moved into `Preprocessor`. This
still makes sense, because we want to track this at the granularity of
submodules (D112915, D114173), but the way this information is serialized is
not ideal. In `ASTReader`, the set of included files gets deserialized eagerly,
issuing lots of calls to `FileManager::getFile()` for input files the PCM
consumer might not be interested in.

This patch makes the information part of the header file info table, taking
advantage of its lazy deserialization which typically happens when a file is
about to be included.
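
The difference between eager and lazy loading can be sketched in isolation. The following is a minimal model under stated assumptions, not the real header file info table: `LazyHeaderTable`, `SerializedHeaderInfo` and the lookup counter are hypothetical, and the counter merely stands in for the cost of a `FileManager::getFile()` call.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical serialized entry: the "already included" bit stored next to
// the rest of the per-header information.
struct SerializedHeaderInfo {
  bool AlreadyIncluded = false;
};

class LazyHeaderTable {
  std::map<std::string, SerializedHeaderInfo> OnDisk; // stand-in for the PCM table
  std::set<std::string> Loaded;        // entries already deserialized
  std::set<std::string> IncludedFiles; // importer-side state
  unsigned NumFileLookups = 0;         // counts the "expensive" file resolutions

public:
  explicit LazyHeaderTable(std::map<std::string, SerializedHeaderInfo> Table)
      : OnDisk(std::move(Table)) {}

  // Deserialize the entry for File on first use only.
  void getFileInfo(const std::string &File) {
    assert(!File.empty() && "expected a file name");
    if (!Loaded.insert(File).second)
      return; // already deserialized
    auto It = OnDisk.find(File);
    if (It == OnDisk.end())
      return;
    ++NumFileLookups; // an eager scheme would pay this for every entry up front
    if (It->second.AlreadyIncluded)
      IncludedFiles.insert(File);
  }

  bool alreadyIncluded(const std::string &File) {
    getFileInfo(File); // make sure the lazily loaded state is up to date
    return IncludedFiles.count(File) != 0;
  }

  unsigned numFileLookups() const { return NumFileLookups; }
};
```

Only the headers the importer actually asks about are resolved; entries that are never queried cost nothing.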


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D155131

Files:
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/include/clang/Serialization/ASTReader.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTReaderInternals.h
  clang/lib/Serialization/ASTWriter.cpp

Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -866,7 +866,6 @@
   RECORD(CUDA_PRAGMA_FORCE_HOST_DEVICE_DEPTH);
   RECORD(PP_CONDITIONAL_STACK);
   RECORD(DECLS_TO_CHECK_FOR_DEFERRED_DIAGS);
-  RECORD(PP_INCLUDED_FILES);
   RECORD(PP_ASSUME_NONNULL_LOC);
 
   // SourceManager Block.
@@ -1763,6 +1762,7 @@
 
 struct data_type {
   const HeaderFileInfo &HFI;
+  bool AlreadyIncluded;
   ArrayRef KnownHeaders;
   UnresolvedModule Unresolved;
 };
@@ -1808,7 +1808,8 @@
   endian::Writer LE(Out, little);
   uint64_t Start = Out.tell(); (void)Start;
 
-  unsigned char Flags = (Data.HFI.isImport << 5)
+  unsigned char Flags = (Data.AlreadyIncluded << 6)
+  | (Data.HFI.isImport << 5)
   | (Data.HFI.isPragmaOnce << 4)
   | (Data.HFI.DirInfo << 1)
   | Data.HFI.IndexHeaderMapHeader;
@@ -1909,7 +1910,7 @@
 HeaderFileInfoTrait::key_type Key = {
 FilenameDup, *U.Size, IncludeTimestamps ? *U.ModTime : 0};
 HeaderFileInfoTrait::data_type Data = {
-Empty, {}, {M, ModuleMap::headerKindToRole(U.Kind)}};
+Empty, false, {}, {M, ModuleMap::headerKindToRole(U.Kind)}};
 // FIXME: Deal with cases where there are multiple unresolved header
 // directives in different submodules for the same header.
 Generator.insert(Key, Data, GeneratorTrait);
@@ -1952,11 +1953,13 @@
   SavedStrings.push_back(Filename.data());
 }
 
+bool Included = PP->alreadyIncluded(File);
+
 HeaderFileInfoTrait::key_type Key = {
   Filename, File->getSize(), getTimestampForOutput(File)
 };
 HeaderFileInfoTrait::data_type Data = {
-  *HFI, HS.getModuleMap().findResolvedModulesForHeader(File), {}
+  *HFI, Included, HS.getModuleMap().findResolvedModulesForHeader(File), {}
 };
 Generator.insert(Key, Data, GeneratorTrait);
 ++NumHeaderSearchEntries;
@@ -2533,20 +2536,6 @@
MacroOffsetsBase - ASTBlockStartOffset};
 Stream.EmitRecordWithBlob(MacroOffsetAbbrev, Record, bytes(MacroOffsets));
   }
-
-  {
-auto Abbrev = std::make_shared();
-Abbrev->Add(BitCodeAbbrevOp(PP_INCLUDED_FILES));
-Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));
-unsigned IncludedFilesAbbrev = Stream.EmitAbbrev(std::move(Abbrev));
-
-SmallString<2048> Buffer;
-raw_svector_ostream Out(Buffer);
-writeIncludedFiles(Out, PP);
-RecordData::value_type Record[] = {PP_INCLUDED_FILES};
-Stream.EmitRecordWithBlob(IncludedFilesAbbrev, Record, Buffer.data(),
-  Buffer.size());
-  }
 }
 
 void ASTWriter::WritePreprocessorDetail(PreprocessingRecord &PPRec,
Index: clang/lib/Serialization/ASTReaderInternals.h
===
--- clang/lib/Serialization/ASTReaderInternals.h
+++ clang/lib/Serialization/ASTReaderInternals.h
@@ -276,6 +276,9 @@
   static internal_key_type ReadKey(const unsigned char *d, unsigned);
 
   data_type ReadData(internal_key_ref,const unsigned char *d, unsigned DataLen);
+
+private:
+  const FileEntry *getFile(const internal_key_type &Key);
 };
 
 /// The on-disk hash table used for known header files.
Index: clang/lib/Serialization/ASTReader.cpp
===
--- clang/lib/Serialization/ASTReader.cpp
+++ clang/lib/Serialization/ASTReader.cpp
@@ -1875,6 +1875,21 @@
   return LocalID + I->second;
 }
 

[PATCH] D154905: [clang] Implement `PointerLikeTraits` for `{File,Directory}EntryRef`

2023-07-11 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
jansvoboda11 marked an inline comment as done.
Closed by commit rG06611e361363: [clang] Implement `PointerLikeTraits` for 
`{File,Directory}EntryRef` (authored by jansvoboda11).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154905/new/

https://reviews.llvm.org/D154905

Files:
  clang/include/clang/Basic/DirectoryEntry.h
  clang/include/clang/Basic/FileEntry.h
  clang/include/clang/Basic/Module.h
  clang/lib/Basic/Module.cpp
  clang/lib/Lex/ModuleMap.cpp

Index: clang/lib/Lex/ModuleMap.cpp
===
--- clang/lib/Lex/ModuleMap.cpp
+++ clang/lib/Lex/ModuleMap.cpp
@@ -1162,7 +1162,7 @@
 Module *Mod, FileEntryRef UmbrellaHeader, const Twine &NameAsWritten,
 const Twine &PathRelativeToRootModuleDirectory) {
   Headers[UmbrellaHeader].push_back(KnownHeader(Mod, NormalHeader));
-  Mod->Umbrella = &UmbrellaHeader.getMapEntry();
+  Mod->Umbrella = UmbrellaHeader;
   Mod->UmbrellaAsWritten = NameAsWritten.str();
   Mod->UmbrellaRelativeToRootModuleDirectory =
   PathRelativeToRootModuleDirectory.str();
@@ -1176,7 +1176,7 @@
 void ModuleMap::setUmbrellaDirAsWritten(
 Module *Mod, DirectoryEntryRef UmbrellaDir, const Twine &NameAsWritten,
 const Twine &PathRelativeToRootModuleDirectory) {
-  Mod->Umbrella = &UmbrellaDir.getMapEntry();
+  Mod->Umbrella = UmbrellaDir;
   Mod->UmbrellaAsWritten = NameAsWritten.str();
   Mod->UmbrellaRelativeToRootModuleDirectory =
   PathRelativeToRootModuleDirectory.str();
Index: clang/lib/Basic/Module.cpp
===
--- clang/lib/Basic/Module.cpp
+++ clang/lib/Basic/Module.cpp
@@ -264,10 +264,10 @@
 }
 
 OptionalDirectoryEntryRef Module::getEffectiveUmbrellaDir() const {
-  if (const auto *ME = Umbrella.dyn_cast())
-return FileEntryRef(*ME).getDir();
-  if (const auto *ME = Umbrella.dyn_cast())
-return DirectoryEntryRef(*ME);
+  if (Umbrella && Umbrella.is())
+return Umbrella.get().getDir();
+  if (Umbrella && Umbrella.is())
+return Umbrella.get();
   return std::nullopt;
 }
 
Index: clang/include/clang/Basic/Module.h
===
--- clang/include/clang/Basic/Module.h
+++ clang/include/clang/Basic/Module.h
@@ -156,9 +156,7 @@
   std::string PresumedModuleMapFile;
 
   /// The umbrella header or directory.
-  llvm::PointerUnion
-  Umbrella;
+  llvm::PointerUnion Umbrella;
 
   /// The module signature.
   ASTFileSignature Signature;
@@ -650,19 +648,18 @@
 
   /// Retrieve the umbrella directory as written.
   std::optional getUmbrellaDirAsWritten() const {
-if (const auto *ME =
-Umbrella.dyn_cast())
+if (Umbrella && Umbrella.is())
   return DirectoryName{UmbrellaAsWritten,
UmbrellaRelativeToRootModuleDirectory,
-   DirectoryEntryRef(*ME)};
+   Umbrella.get()};
 return std::nullopt;
   }
 
   /// Retrieve the umbrella header as written.
   std::optional getUmbrellaHeaderAsWritten() const {
-if (const auto *ME = Umbrella.dyn_cast())
+if (Umbrella && Umbrella.is())
   return Header{UmbrellaAsWritten, UmbrellaRelativeToRootModuleDirectory,
-FileEntryRef(*ME)};
+Umbrella.get()};
 return std::nullopt;
   }
 
Index: clang/include/clang/Basic/FileEntry.h
===
--- clang/include/clang/Basic/FileEntry.h
+++ clang/include/clang/Basic/FileEntry.h
@@ -234,6 +234,21 @@
 } // namespace clang
 
 namespace llvm {
+
+template <> struct PointerLikeTypeTraits {
+  static inline void *getAsVoidPointer(clang::FileEntryRef File) {
+return const_cast(&File.getMapEntry());
+  }
+
+  static inline clang::FileEntryRef getFromVoidPointer(void *Ptr) {
+return clang::FileEntryRef(
+*reinterpret_cast(Ptr));
+  }
+
+  static constexpr int NumLowBitsAvailable = PointerLikeTypeTraits<
+  const clang::FileEntryRef::MapEntry *>::NumLowBitsAvailable;
+};
+
 /// Specialisation of DenseMapInfo for FileEntryRef.
 template <> struct DenseMapInfo {
   static inline clang::FileEntryRef getEmptyKey() {
Index: clang/include/clang/Basic/DirectoryEntry.h
===
--- clang/include/clang/Basic/DirectoryEntry.h
+++ clang/include/clang/Basic/DirectoryEntry.h
@@ -72,7 +72,7 @@
   bool isSameRef(DirectoryEntryRef RHS) const { return ME == RHS.ME; }
 
   DirectoryEntryRef() = delete;
-  DirectoryEntryRef(const MapEntry &ME) : ME(&ME) {}
+  explicit DirectoryEntryRef(const MapEntry &ME) : ME(&ME) {}
 
   /// Allow DirectoryEntryRef to degrade into 'const DirectoryEntry*' to
   /// facilitate incremental adoption.
@@ -197,6 +197,21 @@
 } // namespace clang
 

[PATCH] D150479: [clang][modules] Skip submodule & framework re-definitions in module maps

2023-07-11 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 539156.
jansvoboda11 added a comment.

Skip re-definitions of frameworks too, add new tests


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150479/new/

https://reviews.llvm.org/D150479

Files:
  clang/lib/Lex/ModuleMap.cpp
  clang/test/Modules/no-check-relocated-fw-private-sub.c
  clang/test/Modules/shadow-framework.m


Index: clang/test/Modules/shadow-framework.m
===
--- /dev/null
+++ clang/test/Modules/shadow-framework.m
@@ -0,0 +1,20 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// This test checks that redefinitions of frameworks are ignored.
+
+//--- include/module.modulemap
+module first { header "first.h" }
+module FW {}
+//--- include/first.h
+
+//--- frameworks/FW.framework/Modules/module.modulemap
+framework module FW { header "FW.h" }
+//--- frameworks/FW.framework/Headers/FW.h
+
+//--- tu.c
+#import "first.h" // expected-remark {{importing module 'first'}}
+#import 
+
+// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t/cache -fimplicit-module-maps \
+// RUN:   -I %t/include -F %t/frameworks -fsyntax-only %t/tu.c -Rmodule-import -verify
Index: clang/test/Modules/no-check-relocated-fw-private-sub.c
===
--- /dev/null
+++ clang/test/Modules/no-check-relocated-fw-private-sub.c
@@ -0,0 +1,23 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+//--- frameworks1/FW1.framework/Modules/module.modulemap
+framework module FW1 { header "FW1.h" }
+//--- frameworks1/FW1.framework/Headers/FW1.h
+#import 
+
+//--- frameworks2/FW2.framework/Modules/module.modulemap
+framework module FW2 { header "FW2.h" }
+//--- frameworks2/FW2.framework/Modules/module.private.modulemap
+framework module FW2.Private { header "FW2_Private.h" }
+//--- frameworks2/FW2.framework/Headers/FW2.h
+//--- frameworks2/FW2.framework/PrivateHeaders/FW2_Private.h
+
+//--- tu.c
+#import  // expected-remark{{importing module 'FW1'}} \
+// expected-remark{{importing module 'FW2'}}
+#import  // expected-remark{{importing module 'FW2'}}
+
+// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t/cache -fimplicit-module-maps \
+// RUN:   -F %t/frameworks1 -F %t/frameworks2 -fsyntax-only %t/tu.c \
+// RUN:   -fno-modules-check-relocated -Rmodule-import -verify
Index: clang/lib/Lex/ModuleMap.cpp
===
--- clang/lib/Lex/ModuleMap.cpp
+++ clang/lib/Lex/ModuleMap.cpp
@@ -2016,12 +2016,28 @@
   Module *ShadowingModule = nullptr;
   if (Module *Existing = Map.lookupModuleQualified(ModuleName, ActiveModule)) {
 // We might see a (re)definition of a module that we already have a
-// definition for in three cases:
+// definition for in four cases:
 //  - If we loaded one definition from an AST file and we've just found a
 //corresponding definition in a module map file, or
 bool LoadedFromASTFile = Existing->IsFromModuleFile;
 //  - If we previously inferred this module from different module map file.
 bool Inferred = Existing->IsInferred;
+//  - If we're building a framework that vends a module map, we might've
+//previously seen the one in intermediate products and now the system
+//one.
+// FIXME: If we're parsing module map file that looks like this:
+//  framework module FW { ... }
+//  module FW.Sub { ... }
+//We can't check the framework qualifier, since it's not attached to
+//the definition of Sub. Checking that qualifier on \c Existing is
+//not correct either, since we might've previously seen:
+//  module FW { ... }
+//  module FW.Sub { ... }
+//We should enforce consistency of redefinitions so that we can rely
+//that \c Existing is part of a framework iff the redefinition of FW
+//we have just skipped had it too. Once we do that, stop checking
+//the local framework qualifier and only rely on \c Existing.
+bool PartOfFramework = Framework || Existing->isPartOfFramework();
 //  - If we're building a (preprocessed) module and we've just loaded the
 //module map file from which it was created.
 bool ParsedAsMainInput =
@@ -2029,7 +2045,8 @@
 Map.LangOpts.CurrentModule == ModuleName &&
 SourceMgr.getDecomposedLoc(ModuleNameLoc).first !=
 SourceMgr.getDecomposedLoc(Existing->DefinitionLoc).first;
-if (!ActiveModule && (LoadedFromASTFile || Inferred || ParsedAsMainInput)) {
+if (LoadedFromASTFile || Inferred || PartOfFramework || ParsedAsMainInput) {
+  ActiveModule = PreviousActiveModule;
   // Skip the module definition.
   skipUntil(MMToken::RBrace);
   if (Tok.is(MMToken::RBrace))
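
Read on its own, the updated condition in the ModuleMap.cpp hunk above amounts to a small predicate. The following is a minimal stand-alone sketch of that reading, not the parser code itself: `ExistingModule` and `shouldSkipRedefinition` are hypothetical names, and the struct only models the few properties the condition consults.

```cpp
#include <cassert>

// Hypothetical stand-in for the handful of module properties the condition
// reads; this is not clang::Module.
struct ExistingModule {
  bool IsFromModuleFile = false;
  bool IsInferred = false;
  bool PartOfFramework = false;
};

// Mirrors the updated condition: any single reason is enough to skip the
// redefinition. The earlier `!ActiveModule` requirement is gone, so the same
// rule now applies to submodule redefinitions too.
bool shouldSkipRedefinition(const ExistingModule &Existing,
                            bool HasFrameworkQualifier,
                            bool ParsedAsMainInput) {
  bool PartOfFramework = HasFrameworkQualifier || Existing.PartOfFramework;
  return Existing.IsFromModuleFile || Existing.IsInferred || PartOfFramework ||
         ParsedAsMainInput;
}
```

With this reading, a plain redefinition that matches none of the four cases is not skipped and still goes through the usual redefinition handling.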



[PATCH] D150478: [clang][modules][deps] Parse "FW_Private" module map even after loading "FW" PCM

2023-07-11 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 539127.
jansvoboda11 added a comment.

Expose new preprocessor option as a `-cc1` flag, move test from `ClangScanDeps` 
to `Modules` directory.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150478/new/

https://reviews.llvm.org/D150478

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Lex/HeaderSearch.cpp
  clang/test/Modules/no-check-relocated-fw-private.c


Index: clang/test/Modules/no-check-relocated-fw-private.c
===
--- /dev/null
+++ clang/test/Modules/no-check-relocated-fw-private.c
@@ -0,0 +1,23 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+//--- frameworks1/FW1.framework/Modules/module.modulemap
+framework module FW1 { header "FW1.h" }
+//--- frameworks1/FW1.framework/Headers/FW1.h
+#import 
+
+//--- frameworks2/FW2.framework/Modules/module.modulemap
+framework module FW2 { header "FW2.h" }
+//--- frameworks2/FW2.framework/Modules/module.private.modulemap
+framework module FW2_Private { header "FW2_Private.h" }
+//--- frameworks2/FW2.framework/Headers/FW2.h
+//--- frameworks2/FW2.framework/PrivateHeaders/FW2_Private.h
+
+//--- tu.c
+#import  // expected-remark{{importing module 'FW1'}} \
+// expected-remark{{importing module 'FW2'}}
+#import  // expected-remark{{importing module 'FW2_Private'}}
+
+// RUN: %clang_cc1 -fmodules -fmodules-cache-path=%t/cache -fimplicit-module-maps \
+// RUN:   -F %t/frameworks1 -F %t/frameworks2 -fsyntax-only %t/tu.c \
+// RUN:   -fno-modules-check-relocated -Rmodule-import -verify
Index: clang/lib/Lex/HeaderSearch.cpp
===
--- clang/lib/Lex/HeaderSearch.cpp
+++ clang/lib/Lex/HeaderSearch.cpp
@@ -1783,9 +1783,6 @@
 
 Module *HeaderSearch::loadFrameworkModule(StringRef Name, DirectoryEntryRef Dir,
   bool IsSystem) {
-  if (Module *Module = ModMap.findModule(Name))
-return Module;
-
   // Try to load a module map file.
   switch (loadModuleMapFile(Dir, IsSystem, /*IsFramework*/true)) {
   case LMM_InvalidModuleMap:
@@ -1794,10 +1791,10 @@
   ModMap.inferFrameworkModule(Dir, IsSystem, /*Parent=*/nullptr);
 break;
 
-  case LMM_AlreadyLoaded:
   case LMM_NoDirectory:
 return nullptr;
 
+  case LMM_AlreadyLoaded:
   case LMM_NewlyLoaded:
 break;
   }
Index: clang/include/clang/Driver/Options.td
===
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -2570,6 +2570,10 @@
 defm implicit_modules : BoolFOption<"implicit-modules",
   LangOpts<"ImplicitModules">, DefaultTrue,
   NegFlag, PosFlag, BothFlags<[NoXarchOption,CoreOption]>>;
+def fno_modules_check_relocated : Joined<["-"], "fno-modules-check-relocated">,
+  Group, Flags<[CC1Option]>,
+  HelpText<"Skip checks for relocated modules when loading PCM files">,
+  MarshallingInfoNegativeFlag>;
 def fretain_comments_from_system_headers : Flag<["-"], "fretain-comments-from-system-headers">, Group, Flags<[CC1Option]>,
   MarshallingInfoFlag>;
 def fmodule_header : Flag <["-"], "fmodule-header">, Group,



[PATCH] D150320: [clang][modules][deps] Avoid checks for relocated modules

2023-07-11 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 539122.
jansvoboda11 added a comment.

Rename the new preprocessor option, fix failing test


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150320/new/

https://reviews.llvm.org/D150320

Files:
  clang/include/clang/Lex/PreprocessorOptions.h
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
  clang/test/ClangScanDeps/header-search-pruning-transitive.c

Index: clang/test/ClangScanDeps/header-search-pruning-transitive.c
===
--- clang/test/ClangScanDeps/header-search-pruning-transitive.c
+++ clang/test/ClangScanDeps/header-search-pruning-transitive.c
@@ -72,8 +72,8 @@
 // CHECK:],
 // CHECK-NEXT:   "context-hash": "[[HASH_X:.*]]",
 // CHECK-NEXT:   "file-deps": [
-// CHECK-NEXT: "[[PREFIX]]/./X.h",
-// CHECK-NEXT: "[[PREFIX]]/./module.modulemap"
+// CHECK-NEXT: "[[PREFIX]]/X.h",
+// CHECK-NEXT: "[[PREFIX]]/module.modulemap"
 // CHECK-NEXT:   ],
 // CHECK-NEXT:   "name": "X"
 // CHECK-NEXT: },
@@ -84,11 +84,11 @@
 // CHECK:],
 // CHECK-NEXT:   "context-hash": "[[HASH_Y_WITH_A]]",
 // CHECK-NEXT:   "file-deps": [
-// CHECK-NEXT: "[[PREFIX]]/./Y.h",
-// CHECK-NEXT: "[[PREFIX]]/./a/a.h",
-// CHECK-NEXT: "[[PREFIX]]/./begin/begin.h",
-// CHECK-NEXT: "[[PREFIX]]/./end/end.h",
-// CHECK-NEXT: "[[PREFIX]]/./module.modulemap"
+// CHECK-NEXT: "[[PREFIX]]/Y.h",
+// CHECK-NEXT: "[[PREFIX]]/a/a.h",
+// CHECK-NEXT: "[[PREFIX]]/begin/begin.h",
+// CHECK-NEXT: "[[PREFIX]]/end/end.h",
+// CHECK-NEXT: "[[PREFIX]]/module.modulemap"
 // CHECK-NEXT:   ],
 // CHECK-NEXT:   "name": "Y"
 // CHECK-NEXT: }
@@ -126,8 +126,8 @@
 // also has a different context hash from the first version of module X.
 // CHECK-NOT:"context-hash": "[[HASH_X]]",
 // CHECK:"file-deps": [
-// CHECK-NEXT: "[[PREFIX]]/./X.h",
-// CHECK-NEXT: "[[PREFIX]]/./module.modulemap"
+// CHECK-NEXT: "[[PREFIX]]/X.h",
+// CHECK-NEXT: "[[PREFIX]]/module.modulemap"
 // CHECK-NEXT:   ],
 // CHECK-NEXT:   "name": "X"
 // CHECK-NEXT: },
@@ -138,10 +138,10 @@
 // CHECK:],
 // CHECK-NEXT:   "context-hash": "[[HASH_Y_WITHOUT_A]]",
 // CHECK-NEXT:   "file-deps": [
-// CHECK-NEXT: "[[PREFIX]]/./Y.h",
-// CHECK-NEXT: "[[PREFIX]]/./begin/begin.h",
-// CHECK-NEXT: "[[PREFIX]]/./end/end.h",
-// CHECK-NEXT: "[[PREFIX]]/./module.modulemap"
+// CHECK-NEXT: "[[PREFIX]]/Y.h",
+// CHECK-NEXT: "[[PREFIX]]/begin/begin.h",
+// CHECK-NEXT: "[[PREFIX]]/end/end.h",
+// CHECK-NEXT: "[[PREFIX]]/module.modulemap"
 // CHECK-NEXT:   ],
 // CHECK-NEXT:   "name": "Y"
 // CHECK-NEXT: }
Index: clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
===
--- clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -253,6 +253,9 @@
 // context hashing.
 ScanInstance.getHeaderSearchOpts().ModulesStrictContextHash = true;
 
+// Avoid some checks and module map parsing when loading PCM files.
+ScanInstance.getPreprocessorOpts().ModulesCheckRelocated = false;
+
 std::unique_ptr Action;
 
 if (ModuleName)
Index: clang/lib/Serialization/ASTReader.cpp
===
--- clang/lib/Serialization/ASTReader.cpp
+++ clang/lib/Serialization/ASTReader.cpp
@@ -2982,6 +2982,9 @@
   BaseDirectoryAsWritten = Blob;
   assert(!F.ModuleName.empty() &&
  "MODULE_DIRECTORY found before MODULE_NAME");
+  F.BaseDirectory = std::string(Blob);
+  if (!PP.getPreprocessorOpts().ModulesCheckRelocated)
+break;
   // If we've already loaded a module map file covering this module, we may
   // have a better path for it (relative to the current build).
   Module *M = PP.getHeaderSearchInfo().lookupModule(
@@ -3003,8 +3006,6 @@
   }
 }
 F.BaseDirectory = std::string(M->Directory->getName());
-  } else {
-F.BaseDirectory = std::string(Blob);
   }
   break;
 }
@@ -4002,7 +4003,8 @@
   // usable header search context.
   assert(!F.ModuleName.empty() &&
  "MODULE_NAME should come before MODULE_MAP_FILE");
-  if (F.Kind == MK_ImplicitModule && ModuleMgr.begin()->Kind != MK_MainFile) {
+  if (PP.getPreprocessorOpts().ModulesCheckRelocated &&
+  F.Kind == MK_ImplicitModule && ModuleMgr.begin()->Kind != MK_MainFile) {
 // An implicitly-loaded module file should have its module listed in some
 // module map file that we've already loaded.
 Module *M =
Index: clang/inc

[PATCH] D150292: [clang][modules] Serialize `Module::DefinitionLoc`

2023-07-11 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 539120.
jansvoboda11 added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150292/new/

https://reviews.llvm.org/D150292

Files:
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/lib/Lex/ModuleMap.cpp
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp

Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -200,7 +200,9 @@
 CB(F);
 FileID FID = SourceMgr.translateFile(F);
 SourceLocation Loc = SourceMgr.getIncludeLoc(FID);
-while (Loc.isValid()) {
+// The include location of inferred module maps can point into the header
+// file that triggered the inferring. Cut off the walk if that's the case.
+while (Loc.isValid() && isModuleMap(SourceMgr.getFileCharacteristic(Loc))) {
   FID = SourceMgr.getFileID(Loc);
   CB(*SourceMgr.getFileEntryRefForID(FID));
   Loc = SourceMgr.getIncludeLoc(FID);
@@ -209,11 +211,18 @@
 
   auto ProcessModuleOnce = [&](const Module *M) {
 for (const Module *Mod = M; Mod; Mod = Mod->Parent)
-  if (ProcessedModules.insert(Mod).second)
+  if (ProcessedModules.insert(Mod).second) {
+auto Insert = [&](FileEntryRef F) { ModuleMaps.insert(F); };
+// The containing module map is affecting, because it's being pointed
+// into by Module::DefinitionLoc.
+if (auto ModuleMapFile = MM.getContainingModuleMapFile(Mod))
+  ForIncludeChain(*ModuleMapFile, Insert);
+// For inferred modules, the module map that allowed inferring is not in
+// the include chain of the virtual containing module map file. It did
+// affect the compilation, though.
 if (auto ModuleMapFile = MM.getModuleMapFileForUniquing(Mod))
-  ForIncludeChain(*ModuleMapFile, [&](FileEntryRef F) {
-ModuleMaps.insert(F);
-  });
+  ForIncludeChain(*ModuleMapFile, Insert);
+  }
   };
 
   for (const Module *CurrentModule : ModulesToProcess) {
@@ -2721,6 +2730,7 @@
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // ID
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Parent
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 4)); // Kind
+  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // Definition location
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // IsFramework
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // IsExplicit
   Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // IsSystem
@@ -2821,12 +2831,16 @@
   ParentID = SubmoduleIDs[Mod->Parent];
 }
 
+uint64_t DefinitionLoc =
+SourceLocationEncoding::encode(getAdjustedLocation(Mod->DefinitionLoc));
+
 // Emit the definition of the block.
 {
   RecordData::value_type Record[] = {SUBMODULE_DEFINITION,
  ID,
  ParentID,
  (RecordData::value_type)Mod->Kind,
+ DefinitionLoc,
  Mod->IsFramework,
  Mod->IsExplicit,
  Mod->IsSystem,
Index: clang/lib/Serialization/ASTReader.cpp
===
--- clang/lib/Serialization/ASTReader.cpp
+++ clang/lib/Serialization/ASTReader.cpp
@@ -5619,7 +5619,7 @@
   break;
 
 case SUBMODULE_DEFINITION: {
-  if (Record.size() < 12)
+  if (Record.size() < 13)
 return llvm::createStringError(std::errc::illegal_byte_sequence,
"malformed module definition");
 
@@ -5628,6 +5628,7 @@
   SubmoduleID GlobalID = getGlobalSubmoduleID(F, Record[Idx++]);
   SubmoduleID Parent = getGlobalSubmoduleID(F, Record[Idx++]);
   Module::ModuleKind Kind = (Module::ModuleKind)Record[Idx++];
+  SourceLocation DefinitionLoc = ReadSourceLocation(F, Record[Idx++]);
   bool IsFramework = Record[Idx++];
   bool IsExplicit = Record[Idx++];
   bool IsSystem = Record[Idx++];
@@ -5648,8 +5649,7 @@
   ModMap.findOrCreateModule(Name, ParentModule, IsFramework, IsExplicit)
   .first;
 
-  // FIXME: set the definition loc for CurrentModule, or call
-  // ModMap.setInferredModuleAllowedBy()
+  // FIXME: Call ModMap.setInferredModuleAllowedBy()
 
   SubmoduleID GlobalIndex = GlobalID - NUM_PREDEF_SUBMODULE_IDS;
   if (GlobalIndex >= SubmodulesLoaded.size() ||
@@ -5678,6 +5678,7 @@
   }
 
   CurrentModule->Kind = Kind;
+  CurrentModule->DefinitionLoc = DefinitionLoc;
   CurrentModule->Signature = F.Signature;
   CurrentModule->IsFromModuleFile = true;
   CurrentModule->IsS

[PATCH] D154905: [clang] Implement `PointerLikeTraits` for `{File,Directory}EntryRef`

2023-07-11 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 marked an inline comment as done.
jansvoboda11 added inline comments.



Comment at: clang/include/clang/Basic/DirectoryEntry.h:211
+
+  static constexpr int NumLowBitsAvailable = 3;
+};

benlangmuir wrote:
> I suggest not hard-coding it if you can get away with it; maybe
> `PointerLikeTypeTraits *>::NumLowBitsAvailable`? I think 3 could be wrong
> on a 32-bit platform for example.
Good idea, updated.
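
To illustrate the concern, here is a small stand-alone sketch (no LLVM headers) of why a hard-coded 3 is fragile: the number of spare low bits in a pointer follows from the pointee's alignment, which varies by type and target. The helpers `log2Floor` and `lowBitsFor` and the two structs are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: floor(log2(N)) for N >= 1, usable at compile time.
constexpr int log2Floor(std::size_t N) {
  return N <= 1 ? 0 : 1 + log2Floor(N / 2);
}

// Spare low bits in a pointer to a properly aligned T.
template <typename T> constexpr int lowBitsFor() {
  return log2Floor(alignof(T));
}

// Two toy pointee types with different alignment requirements.
struct alignas(8) Aligned8 { char C; };
struct alignas(4) Aligned4 { char C; };

static_assert(lowBitsFor<Aligned8>() == 3, "8-byte alignment leaves 3 bits");
static_assert(lowBitsFor<Aligned4>() == 2, "4-byte alignment leaves only 2");
```

A pointee that is only 4-byte aligned leaves two spare bits, so claiming three would let a tag overwrite a real address bit. Forwarding to the traits of the underlying pointer type avoids having to make that assumption.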


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154905/new/

https://reviews.llvm.org/D154905



[PATCH] D154905: [clang] Implement `PointerLikeTraits` for `{File,Directory}EntryRef`

2023-07-11 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 updated this revision to Diff 539089.
jansvoboda11 added a comment.

Forward `NumLowBitsAvailable` to `PointerLikeTypeTraits` of the underlying 
pointer.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154905/new/

https://reviews.llvm.org/D154905

Files:
  clang/include/clang/Basic/DirectoryEntry.h
  clang/include/clang/Basic/FileEntry.h
  clang/include/clang/Basic/Module.h
  clang/lib/Basic/Module.cpp
  clang/lib/Lex/ModuleMap.cpp

Index: clang/lib/Lex/ModuleMap.cpp
===
--- clang/lib/Lex/ModuleMap.cpp
+++ clang/lib/Lex/ModuleMap.cpp
@@ -1162,7 +1162,7 @@
 Module *Mod, FileEntryRef UmbrellaHeader, const Twine &NameAsWritten,
 const Twine &PathRelativeToRootModuleDirectory) {
   Headers[UmbrellaHeader].push_back(KnownHeader(Mod, NormalHeader));
-  Mod->Umbrella = &UmbrellaHeader.getMapEntry();
+  Mod->Umbrella = UmbrellaHeader;
   Mod->UmbrellaAsWritten = NameAsWritten.str();
   Mod->UmbrellaRelativeToRootModuleDirectory =
   PathRelativeToRootModuleDirectory.str();
@@ -1176,7 +1176,7 @@
 void ModuleMap::setUmbrellaDirAsWritten(
 Module *Mod, DirectoryEntryRef UmbrellaDir, const Twine &NameAsWritten,
 const Twine &PathRelativeToRootModuleDirectory) {
-  Mod->Umbrella = &UmbrellaDir.getMapEntry();
+  Mod->Umbrella = UmbrellaDir;
   Mod->UmbrellaAsWritten = NameAsWritten.str();
   Mod->UmbrellaRelativeToRootModuleDirectory =
   PathRelativeToRootModuleDirectory.str();
Index: clang/lib/Basic/Module.cpp
===
--- clang/lib/Basic/Module.cpp
+++ clang/lib/Basic/Module.cpp
@@ -264,10 +264,10 @@
 }
 
 OptionalDirectoryEntryRef Module::getEffectiveUmbrellaDir() const {
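-  if (const auto *ME = Umbrella.dyn_cast<const FileEntryRef::MapEntry *>())
-    return FileEntryRef(*ME).getDir();
-  if (const auto *ME = Umbrella.dyn_cast<const DirectoryEntryRef::MapEntry *>())
-    return DirectoryEntryRef(*ME);
+  if (Umbrella && Umbrella.is<FileEntryRef>())
+    return Umbrella.get<FileEntryRef>().getDir();
+  if (Umbrella && Umbrella.is<DirectoryEntryRef>())
+    return Umbrella.get<DirectoryEntryRef>();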
   return std::nullopt;
 }
 
Index: clang/include/clang/Basic/Module.h
===
--- clang/include/clang/Basic/Module.h
+++ clang/include/clang/Basic/Module.h
@@ -156,9 +156,7 @@
   std::string PresumedModuleMapFile;
 
   /// The umbrella header or directory.
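-  llvm::PointerUnion<const FileEntryRef::MapEntry *,
-                     const DirectoryEntryRef::MapEntry *>
-      Umbrella;
+  llvm::PointerUnion<FileEntryRef, DirectoryEntryRef> Umbrella;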
 
   /// The module signature.
   ASTFileSignature Signature;
@@ -650,19 +648,18 @@
 
   /// Retrieve the umbrella directory as written.
   std::optional<DirectoryName> getUmbrellaDirAsWritten() const {
-    if (const auto *ME =
-            Umbrella.dyn_cast<const DirectoryEntryRef::MapEntry *>())
+    if (Umbrella && Umbrella.is<DirectoryEntryRef>())
       return DirectoryName{UmbrellaAsWritten,
                            UmbrellaRelativeToRootModuleDirectory,
-                           DirectoryEntryRef(*ME)};
+                           Umbrella.get<DirectoryEntryRef>()};
     return std::nullopt;
   }
 
   /// Retrieve the umbrella header as written.
   std::optional<Header> getUmbrellaHeaderAsWritten() const {
-    if (const auto *ME = Umbrella.dyn_cast<const FileEntryRef::MapEntry *>())
+    if (Umbrella && Umbrella.is<FileEntryRef>())
       return Header{UmbrellaAsWritten, UmbrellaRelativeToRootModuleDirectory,
-                    FileEntryRef(*ME)};
+                    Umbrella.get<FileEntryRef>()};
     return std::nullopt;
   }
 
Index: clang/include/clang/Basic/FileEntry.h
===
--- clang/include/clang/Basic/FileEntry.h
+++ clang/include/clang/Basic/FileEntry.h
@@ -234,6 +234,21 @@
 } // namespace clang
 
 namespace llvm {
+
+template <> struct PointerLikeTypeTraits<clang::FileEntryRef> {
+  static inline void *getAsVoidPointer(clang::FileEntryRef File) {
+    return const_cast<clang::FileEntryRef::MapEntry *>(&File.getMapEntry());
+  }
+
+  static inline clang::FileEntryRef getFromVoidPointer(void *Ptr) {
+    return clang::FileEntryRef(
+        *reinterpret_cast<const clang::FileEntryRef::MapEntry *>(Ptr));
+  }
+
+  static constexpr int NumLowBitsAvailable = PointerLikeTypeTraits<
+      const clang::FileEntryRef::MapEntry *>::NumLowBitsAvailable;
+};
+
 /// Specialisation of DenseMapInfo for FileEntryRef.
 template <> struct DenseMapInfo<clang::FileEntryRef> {
   static inline clang::FileEntryRef getEmptyKey() {
Index: clang/include/clang/Basic/DirectoryEntry.h
===
--- clang/include/clang/Basic/DirectoryEntry.h
+++ clang/include/clang/Basic/DirectoryEntry.h
@@ -72,7 +72,7 @@
   bool isSameRef(DirectoryEntryRef RHS) const { return ME == RHS.ME; }
 
   DirectoryEntryRef() = delete;
-  DirectoryEntryRef(const MapEntry &ME) : ME(&ME) {}
+  explicit DirectoryEntryRef(const MapEntry &ME) : ME(&ME) {}
 
   /// Allow DirectoryEntryRef to degrade into 'const DirectoryEntry*' to
   /// facilitate incremental adoption.
@@ -197,6 +197,21 @@
 } // namespace clang
 
 namespace llvm {
+
+template <> struct PointerLikeTypeTraits<clang::DirectoryEntryRef> {
+  static inline void *getAsVoidPointer(clang::DirectoryEntryRef Dir) {
+   

[PATCH] D154905: [clang] Implement `PointerLikeTraits` for `{File,Directory}EntryRef`

2023-07-10 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added a comment.

Note that this enables usage of `{File,Directory}EntryRef` in other containers 
using `PointerLikeTraits` such as `SmallPtrSet`.
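
As a rough illustration of what such a traits specialization buys, here is a stand-alone sketch with no LLVM dependency. Every name in it (`MapEntry`, `EntryRef`, `PtrTraits`, `PtrSet`) is a hypothetical stand-in, and `PtrSet` is a toy built on `std::set`, not `SmallPtrSet`. The point is only that a container can store any value type that knows how to round-trip through `void *`.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-ins; these are NOT the clang types.
struct MapEntry { std::string Name; };

class EntryRef {
  const MapEntry *ME;
public:
  explicit EntryRef(const MapEntry &E) : ME(&E) {}
  const MapEntry &getMapEntry() const { return *ME; }
};

// Primary template is left undefined: only types that opt in can be stored.
template <typename T> struct PtrTraits;

template <> struct PtrTraits<EntryRef> {
  static void *getAsVoidPointer(EntryRef R) {
    return const_cast<MapEntry *>(&R.getMapEntry());
  }
  static EntryRef getFromVoidPointer(void *P) {
    return EntryRef(*static_cast<const MapEntry *>(P));
  }
};

// Toy pointer set usable with any T that specializes PtrTraits.
template <typename T> class PtrSet {
  std::set<void *> Storage;
public:
  // Returns true if the value was newly inserted.
  bool insert(T V) {
    return Storage.insert(PtrTraits<T>::getAsVoidPointer(V)).second;
  }
  bool contains(T V) const {
    return Storage.count(PtrTraits<T>::getAsVoidPointer(V)) != 0;
  }
  std::size_t size() const { return Storage.size(); }
};
```

Two refs to the same map entry collapse to one element, because identity is the address of the underlying entry.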


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154905/new/

https://reviews.llvm.org/D154905



[PATCH] D154905: [clang] Implement `PointerLikeTraits` for `{File,Directory}EntryRef`

2023-07-10 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 created this revision.
jansvoboda11 added a reviewer: benlangmuir.
Herald added a subscriber: ributzka.
Herald added a project: All.
jansvoboda11 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This patch implements `llvm::PointerLikeTraits<FileEntryRef>` and 
`llvm::PointerLikeTraits<DirectoryEntryRef>`, allowing some simplifications 
around umbrella header/directory code.
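
The kind of simplification meant here can be sketched without LLVM. The toy class below is not `llvm::PointerUnion`; it is a hand-rolled two-way tagged pointer, and all names in it (`FileMapEntry`, `DirMapEntry`, `FileRef`, `DirRef`, `UmbrellaUnion`) are hypothetical. It shows how a union that stores the ref types directly lets callers ask "is it a file?" and get a ref back, instead of casting to a raw map-entry pointer and rebuilding the ref by hand.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the map entries; alignas(8) guarantees that the
// low bit of any entry address is zero and can carry a tag.
struct alignas(8) FileMapEntry { std::string Name; };
struct alignas(8) DirMapEntry { std::string Name; };

class FileRef {
  const FileMapEntry *ME;
public:
  explicit FileRef(const FileMapEntry &E) : ME(&E) {}
  const FileMapEntry &getMapEntry() const { return *ME; }
};

class DirRef {
  const DirMapEntry *ME;
public:
  explicit DirRef(const DirMapEntry &E) : ME(&E) {}
  const DirMapEntry &getMapEntry() const { return *ME; }
};

// Toy two-way union: one pointer-sized word, low bit set for a directory.
class UmbrellaUnion {
  std::uintptr_t Bits = 0; // 0 encodes "no umbrella"
public:
  UmbrellaUnion() = default;
  UmbrellaUnion(FileRef F)
      : Bits(reinterpret_cast<std::uintptr_t>(&F.getMapEntry())) {}
  UmbrellaUnion(DirRef D)
      : Bits(reinterpret_cast<std::uintptr_t>(&D.getMapEntry()) | 1u) {}

  explicit operator bool() const { return Bits != 0; }
  bool isFile() const { return Bits != 0 && (Bits & 1u) == 0; }
  bool isDir() const { return (Bits & 1u) != 0; }

  FileRef getFile() const {
    assert(isFile());
    return FileRef(*reinterpret_cast<const FileMapEntry *>(Bits));
  }
  DirRef getDir() const {
    assert(isDir());
    // Clear the tag bit before turning the word back into a pointer.
    return DirRef(
        *reinterpret_cast<const DirMapEntry *>(Bits & ~std::uintptr_t(1)));
  }
};
```

The null check before `isFile()` plays the same role as the `Umbrella &&` guard in the patch: an empty union holds neither alternative.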


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D154905

Files:
  clang/include/clang/Basic/DirectoryEntry.h
  clang/include/clang/Basic/FileEntry.h
  clang/include/clang/Basic/Module.h
  clang/lib/Basic/Module.cpp
  clang/lib/Lex/ModuleMap.cpp

Index: clang/lib/Lex/ModuleMap.cpp
===
--- clang/lib/Lex/ModuleMap.cpp
+++ clang/lib/Lex/ModuleMap.cpp
@@ -1162,7 +1162,7 @@
 Module *Mod, FileEntryRef UmbrellaHeader, const Twine &NameAsWritten,
 const Twine &PathRelativeToRootModuleDirectory) {
   Headers[UmbrellaHeader].push_back(KnownHeader(Mod, NormalHeader));
-  Mod->Umbrella = &UmbrellaHeader.getMapEntry();
+  Mod->Umbrella = UmbrellaHeader;
   Mod->UmbrellaAsWritten = NameAsWritten.str();
   Mod->UmbrellaRelativeToRootModuleDirectory =
   PathRelativeToRootModuleDirectory.str();
@@ -1176,7 +1176,7 @@
 void ModuleMap::setUmbrellaDirAsWritten(
 Module *Mod, DirectoryEntryRef UmbrellaDir, const Twine &NameAsWritten,
 const Twine &PathRelativeToRootModuleDirectory) {
-  Mod->Umbrella = &UmbrellaDir.getMapEntry();
+  Mod->Umbrella = UmbrellaDir;
   Mod->UmbrellaAsWritten = NameAsWritten.str();
   Mod->UmbrellaRelativeToRootModuleDirectory =
   PathRelativeToRootModuleDirectory.str();
Index: clang/lib/Basic/Module.cpp
===
--- clang/lib/Basic/Module.cpp
+++ clang/lib/Basic/Module.cpp
@@ -264,10 +264,10 @@
 }
 
 OptionalDirectoryEntryRef Module::getEffectiveUmbrellaDir() const {
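-  if (const auto *ME = Umbrella.dyn_cast<const FileEntryRef::MapEntry *>())
-    return FileEntryRef(*ME).getDir();
-  if (const auto *ME = Umbrella.dyn_cast<const DirectoryEntryRef::MapEntry *>())
-    return DirectoryEntryRef(*ME);
+  if (Umbrella && Umbrella.is<FileEntryRef>())
+    return Umbrella.get<FileEntryRef>().getDir();
+  if (Umbrella && Umbrella.is<DirectoryEntryRef>())
+    return Umbrella.get<DirectoryEntryRef>();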
   return std::nullopt;
 }
 
Index: clang/include/clang/Basic/Module.h
===
--- clang/include/clang/Basic/Module.h
+++ clang/include/clang/Basic/Module.h
@@ -156,9 +156,7 @@
   std::string PresumedModuleMapFile;
 
   /// The umbrella header or directory.
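-  llvm::PointerUnion<const FileEntryRef::MapEntry *,
-                     const DirectoryEntryRef::MapEntry *>
-      Umbrella;
+  llvm::PointerUnion<FileEntryRef, DirectoryEntryRef> Umbrella;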
 
   /// The module signature.
   ASTFileSignature Signature;
@@ -650,19 +648,18 @@
 
   /// Retrieve the umbrella directory as written.
   std::optional<DirectoryName> getUmbrellaDirAsWritten() const {
-    if (const auto *ME =
-            Umbrella.dyn_cast<const DirectoryEntryRef::MapEntry *>())
+    if (Umbrella && Umbrella.is<DirectoryEntryRef>())
       return DirectoryName{UmbrellaAsWritten,
                            UmbrellaRelativeToRootModuleDirectory,
-                           DirectoryEntryRef(*ME)};
+                           Umbrella.get<DirectoryEntryRef>()};
     return std::nullopt;
   }
 
   /// Retrieve the umbrella header as written.
   std::optional<Header> getUmbrellaHeaderAsWritten() const {
-    if (const auto *ME = Umbrella.dyn_cast<const FileEntryRef::MapEntry *>())
+    if (Umbrella && Umbrella.is<FileEntryRef>())
       return Header{UmbrellaAsWritten, UmbrellaRelativeToRootModuleDirectory,
-                    FileEntryRef(*ME)};
+                    Umbrella.get<FileEntryRef>()};
     return std::nullopt;
   }
 
Index: clang/include/clang/Basic/FileEntry.h
===
--- clang/include/clang/Basic/FileEntry.h
+++ clang/include/clang/Basic/FileEntry.h
@@ -234,6 +234,20 @@
 } // namespace clang
 
 namespace llvm {
+
+template <> struct PointerLikeTypeTraits<clang::FileEntryRef> {
+  static inline void *getAsVoidPointer(clang::FileEntryRef File) {
+    return const_cast<clang::FileEntryRef::MapEntry *>(&File.getMapEntry());
+  }
+
+  static inline clang::FileEntryRef getFromVoidPointer(void *Ptr) {
+    return clang::FileEntryRef(
+        *reinterpret_cast<const clang::FileEntryRef::MapEntry *>(Ptr));
+  }
+
+  static constexpr int NumLowBitsAvailable = 3;
+};
+
 /// Specialisation of DenseMapInfo for FileEntryRef.
 template <> struct DenseMapInfo<clang::FileEntryRef> {
   static inline clang::FileEntryRef getEmptyKey() {
Index: clang/include/clang/Basic/DirectoryEntry.h
===
--- clang/include/clang/Basic/DirectoryEntry.h
+++ clang/include/clang/Basic/DirectoryEntry.h
@@ -72,7 +72,7 @@
   bool isSameRef(DirectoryEntryRef RHS) const { return ME == RHS.ME; }
 
   DirectoryEntryRef() = delete;
-  DirectoryEntryRef(const MapEntry &ME) : ME(&ME) {}
+  explicit DirectoryEntryRef(const MapEntry &ME) : ME(&ME) {}
 
   /// Allow DirectoryEntryRef to degrade into 'const DirectoryEntry*' to
   /// facilitate incremental adoption.
@@ -197,6 +197,20 @@
 } // namespace clang
 
 namespace llvm {
+
+template <> struct Pointe
