[PATCH] D120465: [clang][deps] Disable implicit module maps

2022-03-12 Thread Jan Svoboda via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGa6ef3635461c: [clang][deps] Disable implicit module maps 
(authored by jansvoboda11).

Changed prior to commit:
  https://reviews.llvm.org/D120465?vs=413384=414814#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120465/new/

https://reviews.llvm.org/D120465

Files:
  clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
  clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
  clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
  clang/test/ClangScanDeps/modules-full.cpp
  clang/test/ClangScanDeps/modules-inferred.m
  clang/test/ClangScanDeps/modules-no-undeclared-includes.c
  clang/test/ClangScanDeps/modules-pch.c

Index: clang/test/ClangScanDeps/modules-pch.c
===
--- clang/test/ClangScanDeps/modules-pch.c
+++ clang/test/ClangScanDeps/modules-pch.c
@@ -26,6 +26,7 @@
 // CHECK-PCH-NEXT: "-cc1"
 // CHECK-PCH:  "-emit-module"
 // CHECK-PCH:  "-fmodules"
+// CHECK-PCH-NOT:  "-fimplicit-module-maps"
 // CHECK-PCH:  "-fmodule-name=ModCommon1"
 // CHECK-PCH:  "-fno-implicit-modules"
 // CHECK-PCH:],
@@ -43,6 +44,7 @@
 // CHECK-PCH-NEXT: "-cc1"
 // CHECK-PCH:  "-emit-module"
 // CHECK-PCH:  "-fmodules"
+// CHECK-PCH-NOT:  "-fimplicit-module-maps"
 // CHECK-PCH:  "-fmodule-name=ModCommon2"
 // CHECK-PCH:  "-fno-implicit-modules"
 // CHECK-PCH:],
@@ -66,6 +68,7 @@
 // CHECK-PCH:  "-emit-module"
 // CHECK-PCH:  "-fmodule-file=[[PREFIX]]/build/[[HASH_MOD_COMMON_2]]/ModCommon2-{{.*}}.pcm"
 // CHECK-PCH:  "-fmodules"
+// CHECK-PCH-NOT:  "-fimplicit-module-maps"
 // CHECK-PCH:  "-fmodule-name=ModPCH"
 // CHECK-PCH:  "-fno-implicit-modules"
 // CHECK-PCH:],
@@ -139,6 +142,7 @@
 // CHECK-TU-NEXT:   "command-line": [
 // CHECK-TU-NEXT: "-cc1",
 // CHECK-TU:  "-emit-module",
+// CHECK-TU-NOT:  "-fimplicit-module-maps",
 // CHECK-TU:  "-fmodule-name=ModTU",
 // CHECK-TU:  "-fno-implicit-modules",
 // CHECK-TU:],
@@ -202,6 +206,7 @@
 // CHECK-TU-WITH-COMMON-NEXT: "-cc1",
 // CHECK-TU-WITH-COMMON:  "-emit-module",
 // CHECK-TU-WITH-COMMON:  "-fmodule-file=[[PREFIX]]/build/{{.*}}/ModCommon1-{{.*}}.pcm",
+// CHECK-TU-WITH-COMMON-NOT:  "-fimplicit-module-maps",
 // CHECK-TU-WITH-COMMON:  "-fmodule-name=ModTUWithCommon",
 // CHECK-TU-WITH-COMMON:  "-fno-implicit-modules",
 // CHECK-TU-WITH-COMMON:],
Index: clang/test/ClangScanDeps/modules-no-undeclared-includes.c
===
--- /dev/null
+++ clang/test/ClangScanDeps/modules-no-undeclared-includes.c
@@ -0,0 +1,72 @@
+// RUN: rm -rf %t && mkdir %t
+// RUN: split-file %s %t
+
+//--- undeclared/module.modulemap
+module Undeclared { header "undeclared.h" }
+
+//--- undeclared/undeclared.h
+
+//--- module.modulemap
+module User [no_undeclared_includes] { header "user.h" }
+
+//--- user.h
+#if __has_include("undeclared.h")
+#error Unreachable. Undeclared comes from a module that's not 'use'd, meaning the compiler should pretend it doesn't exist.
+#endif
+
+//--- test.c
+#include "user.h"
+
+//--- cdb.json.template
+[{
+  "directory": "DIR",
+  "command": "clang -fmodules -gmodules -fimplicit-module-maps -fmodules-cache-path=DIR/cache -IDIR/undeclared -c DIR/test.c -o DIR/test.o",
+  "file": "DIR/test.c"
+}]
+
+// RUN: sed "s|DIR|%/t|g" %t/cdb.json.template > %t/cdb.json
+// RUN: clang-scan-deps -compilation-database %t/cdb.json -format experimental-full \
+// RUN:   -generate-modules-path-args -module-files-dir %t/build > %t/result.json
+// RUN: cat %t/result.json | sed 's:\?:/:g' | FileCheck %s -DPREFIX=%t
+
+// CHECK:{
+// CHECK-NEXT:   "modules": [
+// CHECK-NEXT: {
+// CHECK-NEXT:   "clang-module-deps": [],
+// CHECK-NEXT:   "clang-modulemap-file": "[[PREFIX]]/module.modulemap",
+// CHECK-NEXT:   "command-line": [
+// CHECK:  "-fmodule-map-file=[[PREFIX]]/undeclared/module.modulemap"
+// CHECK:],
+// CHECK-NEXT:   "context-hash": "{{.*}}",
+// CHECK-NEXT:   "file-deps": [
+// CHECK-NEXT: "[[PREFIX]]/module.modulemap",
+// CHECK-NEXT: "[[PREFIX]]/undeclared/module.modulemap",
+// CHECK-NEXT: "[[PREFIX]]/user.h"
+// CHECK-NEXT:   ],
+// CHECK-NEXT:   "name": "User"
+// CHECK-NEXT: }
+// CHECK-NEXT:   ],
+// CHECK-NEXT:   "translation-units": [
+// CHECK-NEXT: {
+// CHECK-NEXT:   "clang-context-hash": "{{.*}}"
+// CHECK-NEXT:   "clang-module-deps": [
+// CHECK-NEXT: 

[PATCH] D120465: [clang][deps] Disable implicit module maps

2022-03-11 Thread Duncan P. N. Exon Smith via Phabricator via cfe-commits
dexonsmith accepted this revision.
dexonsmith added a comment.
This revision is now accepted and ready to land.

LGTM, with a suggested change for the text of the comment.




Comment at: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp:269-273
+  // However, some module maps loaded implicitly during the dependency scan can
+  // describe anti-dependencies. That happens when the current module is marked
+  // as '[no_undeclared_includes]', doesn't 'use' module from such module map,
+  // but tries to import it anyway. We need to tell the explicit build about
+  // such module map for it to have the same semantics as the implicit build.

jansvoboda11 wrote:
> dexonsmith wrote:
> > jansvoboda11 wrote:
> > > dexonsmith wrote:
> > > > Is there another long-term solution to this that could be pointed at 
> > > > with a FIXME? E.g., could the module map be encoded redundantly here? 
> > > > If so, what else would need to change to make that okay?
> > > I'm not sure I understand why this would warrant "long-term solution" or 
> > > a FIXME. This code handles an existing feature that just happens to be a 
> > > corner case from the dependency scanning point of view. (You can read up 
> > > on the feature [[ https://clang.llvm.org/docs/Modules.html | here ]].)
> > > 
> > > What do you mean by encoding the module map redundantly?
> > Yes, I know the feature.
> > 
> > ... but I was probably wrong about how it affects the logic here.
> > 
> > I also now have two guesses at the scenario being handled here (previously 
> > I assumed (1)):
> > 
> > 1. A textual include from a module that is not marked as used. I wasn't 
> > sure what you meant by "tries to import", but I guess I thought it was just 
> > "loads the module map and finds the textual header listed there". IIUC, 
> > there's no actual attempt to import a module when the only thing referenced 
> > from it is a textual header, but I could see how parsing the module map 
> > could affect the state anyway.
> > 
> > 2. A modular include from a module that is not marked as used. Something 
> > like a `__has_include`, or a `#include` that fails but there's another 
> > search path with the same header. In this case, there'd be an actual import 
> > attempt, which would fail. And that could also affect state.
> > 
> > Can you clarify which scenario you need to handle? Or is it a 3rd?
> > 
> > > I'm not sure I understand why this would warrant "long-term solution" or 
> > > a FIXME.
> > 
> > This smells like a workaround to me. IIUC, sending in module maps that 
> > aren't actually "needed" except to reproduce side effects of failed queries.
> > 
> > My intuition is that the right long-term fix would involve isolate the 
> > failed queries from the compiler state in the scanning phase so that they 
> > don't have side effects. Then there would be no side effects to reproduce 
> > in the explicit build.
> > 
> > > What do you mean by encoding the module map redundantly?
> > 
> > I think I was confused about scanning vs building, thinking that a module 
> > using a textual include (1) could be copied into each module PCM that 
> > directly imports it. (I know that this wouldn't scale right now for various 
> > reasons, but I wondered if there was some way to get there... but 
> > regardless, seems like it's totally unrelated)
> The scenario being handled is the following:
> 
> 3. Modular `#include` of B from module A, where A is marked 
> `[no_undeclared_includes]` and doesn't `use B`. Typically, that `#include` 
> would be guarded by `__has_include`.
> 
> With implicit module maps disabled, the presence of module map B allows us to 
> evaluate the `__has_include` the same way as with them enabled. This is the 
> only reason we need module map B. There are no side effects from failed 
> queries. The query failure itself is the behavior we need to reproduce.
> 
> I'm not even thinking about "another search path with the same header" in 
> this patch.
Thanks! I was making things way too complicated!

IIUC:
- Without the module map, a `__has_include` would succeed because it'd be 
textually included.
- With the module map, a `__has_include` would fail because the module map 
would block textual inclusion and `[no_undeclared_includes]` would block 
modular inclusion.

Here's some text that I'd suggest:
```
  // We usually don't need to list the module map files of our dependencies when
  // building a module explicitly: their semantics will be deserialized from PCM
  // files.
  //
  // However, some module maps loaded implicitly during the dependency scan can
  // describe anti-dependencies. That happens when this module, let's call it 
M1,
  // is marked as '[no_undeclared_includes]' and tries to access a header 
"M2/M2.h"
  // from another module, M2, but doesn't have a 'use M2;' declaration. The 
explicit
  // build needs the module map for M2 so that it knows that textually including
  // "M2/M2.h" is not allowed. E.g., 

[PATCH] D120465: [clang][deps] Disable implicit module maps

2022-03-11 Thread Jan Svoboda via Phabricator via cfe-commits
jansvoboda11 added inline comments.



Comment at: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp:269-273
+  // However, some module maps loaded implicitly during the dependency scan can
+  // describe anti-dependencies. That happens when the current module is marked
+  // as '[no_undeclared_includes]', doesn't 'use' module from such module map,
+  // but tries to import it anyway. We need to tell the explicit build about
+  // such module map for it to have the same semantics as the implicit build.

dexonsmith wrote:
> jansvoboda11 wrote:
> > dexonsmith wrote:
> > > Is there another long-term solution to this that could be pointed at with 
> > > a FIXME? E.g., could the module map be encoded redundantly here? If so, 
> > > what else would need to change to make that okay?
> > I'm not sure I understand why this would warrant "long-term solution" or a 
> > FIXME. This code handles an existing feature that just happens to be a 
> > corner case from the dependency scanning point of view. (You can read up on 
> > the feature [[ https://clang.llvm.org/docs/Modules.html | here ]].)
> > 
> > What do you mean by encoding the module map redundantly?
> Yes, I know the feature.
> 
> ... but I was probably wrong about how it affects the logic here.
> 
> I also now have two guesses at the scenario being handled here (previously I 
> assumed (1)):
> 
> 1. A textual include from a module that is not marked as used. I wasn't sure 
> what you meant by "tries to import", but I guess I thought it was just "loads 
> the module map and finds the textual header listed there". IIUC, there's no 
> actual attempt to import a module when the only thing referenced from it is a 
> textual header, but I could see how parsing the module map could affect the 
> state anyway.
> 
> 2. A modular include from a module that is not marked as used. Something like 
> a `__has_include`, or a `#include` that fails but there's another search path 
> with the same header. In this case, there'd be an actual import attempt, 
> which would fail. And that could also affect state.
> 
> Can you clarify which scenario you need to handle? Or is it a 3rd?
> 
> > I'm not sure I understand why this would warrant "long-term solution" or a 
> > FIXME.
> 
> This smells like a workaround to me. IIUC, sending in module maps that aren't 
> actually "needed" except to reproduce side effects of failed queries.
> 
> My intuition is that the right long-term fix would involve isolate the failed 
> queries from the compiler state in the scanning phase so that they don't have 
> side effects. Then there would be no side effects to reproduce in the 
> explicit build.
> 
> > What do you mean by encoding the module map redundantly?
> 
> I think I was confused about scanning vs building, thinking that a module 
> using a textual include (1) could be copied into each module PCM that 
> directly imports it. (I know that this wouldn't scale right now for various 
> reasons, but I wondered if there was some way to get there... but regardless, 
> seems like it's totally unrelated)
The scenario being handled is the following:

3. Modular `#include` of B from module A, where A is marked 
`[no_undeclared_includes]` and doesn't `use B`. Typically, that `#include` 
would be guarded by `__has_include`.

With implicit module maps disabled, the presence of module map B allows us to 
evaluate the `__has_include` the same way as with them enabled. This is the 
only reason we need module map B. There are no side effects from failed 
queries. The query failure itself is the behavior we need to reproduce.

I'm not even thinking about "another search path with the same header" in this 
patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120465/new/

https://reviews.llvm.org/D120465

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D120465: [clang][deps] Disable implicit module maps

2022-03-10 Thread Duncan P. N. Exon Smith via Phabricator via cfe-commits
dexonsmith added inline comments.



Comment at: clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp:269-273
+  // However, some module maps loaded implicitly during the dependency scan can
+  // describe anti-dependencies. That happens when the current module is marked
+  // as '[no_undeclared_includes]', doesn't 'use' module from such module map,
+  // but tries to import it anyway. We need to tell the explicit build about
+  // such module map for it to have the same semantics as the implicit build.

jansvoboda11 wrote:
> dexonsmith wrote:
> > Is there another long-term solution to this that could be pointed at with a 
> > FIXME? E.g., could the module map be encoded redundantly here? If so, what 
> > else would need to change to make that okay?
> I'm not sure I understand why this would warrant "long-term solution" or a 
> FIXME. This code handles an existing feature that just happens to be a corner 
> case from the dependency scanning point of view. (You can read up on the 
> feature [[ https://clang.llvm.org/docs/Modules.html | here ]].)
> 
> What do you mean by encoding the module map redundantly?
Yes, I know the feature.

... but I was probably wrong about how it affects the logic here.

I also now have two guesses at the scenario being handled here (previously I 
assumed (1)):

1. A textual include from a module that is not marked as used. I wasn't sure 
what you meant by "tries to import", but I guess I thought it was just "loads 
the module map and finds the textual header listed there". IIUC, there's no 
actual attempt to import a module when the only thing referenced from it is a 
textual header, but I could see how parsing the module map could affect the 
state anyway.

2. A modular include from a module that is not marked as used. Something like a 
`__has_include`, or a `#include` that fails but there's another search path 
with the same header. In this case, there'd be an actual import attempt, which 
would fail. And that could also affect state.

Can you clarify which scenario you need to handle? Or is it a 3rd?

> I'm not sure I understand why this would warrant "long-term solution" or a 
> FIXME.

This smells like a workaround to me. IIUC, sending in module maps that aren't 
actually "needed" except to reproduce side effects of failed queries.

My intuition is that the right long-term fix would involve isolate the failed 
queries from the compiler state in the scanning phase so that they don't have 
side effects. Then there would be no side effects to reproduce in the explicit 
build.

> What do you mean by encoding the module map redundantly?

I think I was confused about scanning vs building, thinking that a module using 
a textual include (1) could be copied into each module PCM that directly 
imports it. (I know that this wouldn't scale right now for various reasons, but 
I wondered if there was some way to get there... but regardless, seems like 
it's totally unrelated)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120465/new/

https://reviews.llvm.org/D120465

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits