[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-16 Thread Ian Anderson via cfe-commits


@@ -1403,94 +1421,146 @@ bool 
HeaderSearch::isFileMultipleIncludeGuarded(FileEntryRef File) const {
 void HeaderSearch::MarkFileModuleHeader(FileEntryRef FE,
 ModuleMap::ModuleHeaderRole Role,
 bool isCompilingModuleHeader) {
-  bool isModularHeader = ModuleMap::isModular(Role);
-
   // Don't mark the file info as non-external if there's nothing to change.
   if (!isCompilingModuleHeader) {
-if (!isModularHeader)
+if ((Role & ModuleMap::ExcludedHeader))
   return;
 auto *HFI = getExistingFileInfo(FE);
 if (HFI && HFI->isModuleHeader)
   return;
   }
 
   auto &HFI = getFileInfo(FE);
-  HFI.isModuleHeader |= isModularHeader;
+  HFI.mergeModuleMembership(Role);
   HFI.isCompilingModuleHeader |= isCompilingModuleHeader;

ian-twilightcoder wrote:

https://github.com/llvm/llvm-project/pull/89005 and thank you!!

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-16 Thread Ian Anderson via cfe-commits


@@ -1403,94 +1421,146 @@ bool 
HeaderSearch::isFileMultipleIncludeGuarded(FileEntryRef File) const {
 void HeaderSearch::MarkFileModuleHeader(FileEntryRef FE,
 ModuleMap::ModuleHeaderRole Role,
 bool isCompilingModuleHeader) {
-  bool isModularHeader = ModuleMap::isModular(Role);
-
   // Don't mark the file info as non-external if there's nothing to change.
   if (!isCompilingModuleHeader) {
-if (!isModularHeader)
+if ((Role & ModuleMap::ExcludedHeader))
   return;
 auto *HFI = getExistingFileInfo(FE);
 if (HFI && HFI->isModuleHeader)
   return;
   }
 
   auto &HFI = getFileInfo(FE);
-  HFI.isModuleHeader |= isModularHeader;
+  HFI.mergeModuleMembership(Role);
   HFI.isCompilingModuleHeader |= isCompilingModuleHeader;

ian-twilightcoder wrote:

Ohhh. So I need to be checking the `getExistingFileInfo` to see if 
`isModuleHeader` and `isTextualModuleHeader` match the given `ModuleHeaderRole`?

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-16 Thread Richard Smith via cfe-commits


@@ -1403,94 +1421,146 @@ bool 
HeaderSearch::isFileMultipleIncludeGuarded(FileEntryRef File) const {
 void HeaderSearch::MarkFileModuleHeader(FileEntryRef FE,
 ModuleMap::ModuleHeaderRole Role,
 bool isCompilingModuleHeader) {
-  bool isModularHeader = ModuleMap::isModular(Role);
-
   // Don't mark the file info as non-external if there's nothing to change.
   if (!isCompilingModuleHeader) {
-if (!isModularHeader)
+if ((Role & ModuleMap::ExcludedHeader))
   return;
 auto *HFI = getExistingFileInfo(FE);
 if (HFI && HFI->isModuleHeader)
   return;
   }
 
   auto &HFI = getFileInfo(FE);
-  HFI.isModuleHeader |= isModularHeader;
+  HFI.mergeModuleMembership(Role);
   HFI.isCompilingModuleHeader |= isCompilingModuleHeader;

zygoloid wrote:

It looks to me like we're now calling `getFileInfo(FE)` even in cases where the 
info doesn't change, for a textual header. I think that's what's causing the 
regression we're seeing -- we're now considering a repeatedly-used module map 
file to be affecting, even though it isn't, because we end up with a local 
`HeaderFileInfo` instead of an imported one.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-16 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

> To get back to my previous point about the semantics implemented by this 
> patch being unfortunate -- the upshot is that, now:
> 
> ```
> #include 
> #define NDEBUG
> #import 
> ```

It would be nice if we could make this work without modules too. `#pragma many` 
or something would do that, but I fear we'd have compatibility issues if we 
added a pragma that's only supported by clang to a bunch of system headers.

> This doesn't make sense: with modules disabled, everything is textual -- to 
> say that "textual header" in a modulemap should make a header somehow "more" 
> textual than the default non-modular behavior is weird.

With modules disabled, everything isn't textual - it's effectively excluded. 
`textual` was probably not the best name for that keyword in the module map.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-16 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

Wall of text incoming... Sorry but our documentation in this area is poor, so I 
feel like I need to spell out the assumptions behind this change.

First of all, this _should_ be a relatively minor change that should affect a 
very limited set of circumstances.
1. Only `#import` should be affected, not `#include`.
2. Only headers marked `textual header` in a module map should be affected, 
nothing else.
3. Headers that use `#pragma once` or have header guards should not be affected.
4. Module map reading should not be affected.

That should end up being a rather narrow change, and if something else got 
affected I'd like to know what and fix it.


`#import` is already treated differently if modules are enabled. This adds 
another case to the special treatment, but doesn't introduce that fact that 
sometimes `#import`ed headers will be re-entered when building with modules. We 
can say we don't like `#import` behaving differently when modules are enabled. 
However, it's not going to be any harder to undo this special treatment than it 
will to undo the other ones, if we ever care enough to try, so I don't think 
this is such an egregious change.

`textual header`s are already treated differently as well. To better define 
that, there are 4 different types of headers in a clang modules world.

1. Headers covered by a `header` statement in a module map - ye olde standard 
include-once header.
2. Headers covered by a `textual header` statement in a module map - usually 
headers that are intended to be included multiple times. Generally these are X 
macro generator headers, or special headers like  or clang's 
 and  which are designed to have different behavior based 
on the preprocessor environment of the includer.
3. Headers covered by an `exclude header` statement in a module map - headers 
that can't be used in a module. Sometimes these are just obsolete headers, and 
sometimes it's headers that have dependencies that can't be used in a module.
4. Headers that are not listed in any module map. These are essentially the 
same thing as excluded headers.

There's a semantic difference between textual and excluded headers. Textual 
headers are anointed to be safe to use in modules, and excluded headers are not 
supposed to be included from any modular header. (If I had my way, 
`-Wnon-modular-include-in-module` would default to an error much like 
`-Wmodule-import-in-extern-c` does, but that can be a soap box for another 
day.) Textual headers are already treated differently from excluded headers in 
a few ways, such as they respect `[no_undeclared_includes]`/`use`, while 
excluded headers don't.

All this to say that `#import`ed headers are already treated different when 
modules are enabled, and textual headers are already treated different than 
excluded headers. I think it's fine to tweak this different treatment, 
especially since it's solving a real world problem that occurs in a fully 
modularized system. I also think it's preferable to focus the change down to 
the very specific problem, which is `#import`ed textual headers, instead of 
trying to fix all of the submodule visibility problems we can think of.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-13 Thread James Y Knight via cfe-commits

jyknight wrote:

To get back to my previous point about the semantics implemented by this patch 
being unfortunate -- the upshot is that, now:
```
#include 
#define NDEBUG
#import 
```
will include assert.h twice (even though the latter is an "import") _only_ if 
modules are enabled. If modules are disabled, import keeps the original 
behavior of suppressing multiple inclusion. This doesn't make sense: with 
modules disabled, everything is textual -- to say that "textual header" in a 
modulemap should make a header somehow "more" textual than the default 
non-modular behavior is weird.

That is what I meant by my previous comment that this change has tied together 
features which should not be related -- and that's the main reason why I 
believe this PR should be reverted.

If we _do_ need to implement a behavior of suppressing `#import`'s include-once 
behavior for certain headers, it should be based on something within the header 
being included. E.g. could create a new`#pragma 
multiply_include_even_with_import` and add it to assert.h -- this would be 
effectively the reverse of `#pragma once`. I do not believe it's really needed 
(once the "header is hidden entirely" bug is fixed), but it would be consistent 
and not produce broken/surprising semantics.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-12 Thread James Y Knight via cfe-commits

jyknight wrote:

>> The end result should be that #imported and #pragma once-guarded files are 
>> treated the same way as #ifndef-guarded files.

> While I don't necessarily disagree with that goal in principle, it wasn't 
> true before this change either.

Well, yes, I know it isn't true yet -- that's exactly what I've been trying to 
say, and demonstrated with the test -- it's not true _and_ that's a bug.

> Modular headers including non-modular headers is a special problem. If we let 
> every module enter the non-modular header (which Jan's patch does), its 
> declarations will build into multiple modules/pcm files

I'm not really sure what you're getting at in this paragraph. Is there a _new_ 
special problem introduced here?

Yes, a textual header can be built into multiple modules/pcm files -- that's 
the nature of it being textual, right? Certainly it's always be a better to 
modularize headers from the bottom-up, and avoid textually including a header 
into multiple modules. And, yes, the performance cost of duplicating the decls 
into multiple modules' pcm files can be quite significant, depending on the 
size of the header. Yet, sometimes it happens. And, when it does, the 
declaration-merging takes care of it -- for C/C++ clients. This all works (and 
is used) today, with ifndef-guarded header inclusion.

But a textually-included header shouldn't be re-entered if it's been _visibly_ 
included already in the current compilation -- including if that was exported 
from an imported module. It should only be re-entered if the previous include 
is _not_ visible in the current context. Maybe you're saying Jan's patch 
doesn't do that properly yet. (I could see that being the case, since the 
behavior mostly just falls out automatically for the ifndef-guard case, based 
on whether the guard-macro was made visible or not!)

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-12 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

> The end result should be that #imported and #pragma once-guarded files are 
> treated the same way as #ifndef-guarded files.

While I don't necessarily disagree with that goal in principle, it wasn't true 
before this change either. There was already some special casing to defeat 
`#import`. This change just adds another special case to the `#import` path. 
`#include` doesn't even go through the modified code that is 
`MaybeReenterImportedFile` (née `TryEnterImported`).

> I know you weren't trying to fix that -- but I'm saying that's the thing that 
> _should_ be fixed, and that doing so would fix the _serious_ part of the 
> problem you described in the initial message.

Modular headers including non-modular headers is a special problem. If we let 
every module enter the non-modular header (which Jan's patch does), its 
declarations will build into multiple modules/pcm files, where they will 1) 
conflict and cause errors in the several cases where type merging can't handle 
them, and 2) come from multiple modules which is undefined behavior for Swift 
(i.e. Swift will see `mod1.type` and `mod2.type` and won't know how to resolve 
un-qualified `type` in code). So I don't think that case really _can_ be fixed.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-12 Thread James Y Knight via cfe-commits

jyknight wrote:

> The problem I'm trying to fix is that nobody knows when it's appropriate to 
> use #import vs #include

But you haven't really (and I think cannot) fixed that.

> using header guards or #pragma once is very "un-Objective-C". 

Yes, this is quite unfortunate. The best answer would be to change the 
recommended style, but unfortunately, it's probably 20-30 years too late to do 
so effectively. (I mean, it would be a start if Apple added `#pragma once` to 
all of the SDK headers.)

Your initial comment said:
> Normally this isn't a big problem, [...] However, when clang modules are 
> involved, this can cause the header to be hidden entirely.

And I am basically agreeing with this. The current state of objc 
include/import, while unfortunate, has existed for decades, and is usually not 
a big problem. And, I think that means we shouldn't add more complexity to the 
semantics of `#import` in an attempt to fix it. But the "header is hidden 
entirely" _is_ a big problem, and that part is due to the bug I described -- so 
if that bug is fixed, we're back to "isn't a big problem", without making this 
change in semantics.

> An alternative solution would be @jansvoboda11's 
> https://github.com/llvm/llvm-project/pull/71117, but I believe that will 
> cause issues with non-modular headers building into multiple modules.

I haven't looked at that, but it sounds promising. The end result should be 
that `#import`ed and `#pragma once`-guarded files are treated the same way as 
`#ifndef`-guarded files. I wouldn't expect that to cause issues with 
non-modular headers included in multiple modules, given that works for 
`#ifndef`-guarded files, today.

> Also this change isn't trying to modify #include in any kind of way, only 
> #import. So it's quite intentional that your test behaves the same before and 
> after.

I know you weren't trying to fix that -- but I'm saying that's the thing that 
_should_ be fixed, and that doing so would fix the _serious_ part of the 
problem you described in the initial message.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-12 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

Also this change isn't trying to modify `#include` in any kind of way, _only_ 
`#import`. So it's quite intentional that your test behaves the same before and 
after.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-12 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

I don't really think it's the same thing. The problem I'm trying to fix is that 
nobody knows when it's appropriate to use `#import` vs `#include`, and the 
unfortunate convention of Objective-C makes it impossible for header owners to 
indicate if they support being included multiple times or not, as using header 
guards or `#pragma once` is very "un-Objective-C". Marking the header as 
`textual` is a fairly reasonable way for header owners to indicate that their 
header is meant to be included multiple times (e.g. stddef.h), even if ObjC 
developers don't know that and used `#import`. (The other use of `textual` that 
I've seen is for headers like `unwind_arm_ehabi.h` that are only allowed to 
have a single includer, and aren't really standalone.) If you don't want the 
header to build with the module, that's what `excluded` is for is it not?

And alternative solution would be @jansvoboda11's 
https://github.com/llvm/llvm-project/pull/71117, but I believe that will cause 
issues with non-modular headers building into multiple modules.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-12 Thread James Y Knight via cfe-commits

jyknight wrote:

Oh -- I'd also note that even if this is reverted, ilya-biryukov may want to 
continue to investigate the source-location issue -- it's entirely possible 
that the correct fix will trigger that same problem!

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-12 Thread James Y Knight via cfe-commits

jyknight wrote:

I think the bug this change was attempting to fix is actually the same as 
#38554 and related bugs -- which, it appears, was not really fixed.

The underlying problem here is that `#pragma once` should should work 
identically to ifndef guards, as far as what macros/decls are made visible to 
the includer. However, with the current modules implementation, it fails to do 
so. This PR addresses only one _symptom_ of that issue, but does not address 
the underlying problem.

Additionally, I think the semantics change implemented here is a bad idea. It 
significantly complicates the semantics of `#import`, in a way that I think is 
justifiable, especially given that there is a way to fix the identified issue 
in a more principled way.

Yes, `#import` was a bad and confusing idea, and should have been dropped 20 
years ago, but adding more special cases to it does not make it better. A 
header file being marked "textual" in a module-map does _not_ mean a file 
should be be included multiple times -- it simply means that you didn't want to 
build it into a module. Linking these two concepts together ties together 
concepts which have no 

If this PR was the only way to fix the identified issue, maybe it could via an 
assumption that the current buggy `#pragma once` behavior is expected behavior 
which should be preserved.

As such, I would propose this PR ought to be reverted, and work to fix the 
underlying issue be done instead.

Here's a test case which I think demonstrates the actual bug -- it fails both 
before and after this PR. If you modify 'header.h' to use include guards, then 
it passes. This difference in behavior should not exist.
```
// RUN: rm -rf %t
// RUN: split-file %s %t
// RUN: %clang_cc1 -std=c17 -fmodules-cache-path=%t/no-lsv -fmodules 
-fimplicit-module-maps -I%t %t/pragma-once-modules.c -verify
// RUN: %clang_cc1 -std=c17 -fmodules-cache-path=%t/lsv -fmodules 
-fimplicit-module-maps -fmodules-local-submodule-visibility -I%t 
%t/pragma-once-modules.c -verify

//--- pragma-once-modules.c
// expected-no-diagnostics
#include "one.h"
#include "header.h"
my_int test(void) {
  return MY_MACRO;
}

//--- module.modulemap
module Submodules {
  module one {
header "one.h"
export *
  }
  module two {
header "two.h"
export *
  }
}

//--- one.h
#ifndef one_h
#define one_h
#endif

//--- two.h
#ifndef two_h
#define two_h
#include "header.h"
#endif

//--- header.h
#pragma once
#define MY_MACRO 5
typedef int my_int;
```

It produces the following unexpected output.
```
error: 'expected-error' diagnostics seen but not expected: 
  [...]/pragma-once-modules.c Line 4: missing '#include "header.h"'; 'my_int' 
must be declared before it is used
  [...]/pragma-once-modules.c Line 5: use of undeclared identifier 'MY_MACRO'
error: 'expected-note' diagnostics seen but not expected: 
  [...]/header.h Line 3: declaration here is not visible
```

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-12 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

Wow that's a _ton_ of PCM files. Why are your headers `textual` if they're only 
meant to be included once? Do they need to inherit the includer's preprocessor 
environment or something like that?

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-12 Thread Ilya Biryukov via cfe-commits

ilya-biryukov wrote:

I still don't understand why the number of `FileID`s for module maps has 
increased so much. I am observing this increase when compiling a file that has 
many module dependencies, it creates only one `FileID` for a module map I 
mentioned above, the rest 2259 are from loaded binary PCM files.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-12 Thread Ilya Biryukov via cfe-commits

ilya-biryukov wrote:

> Are your `textual` headers meant to be included more than once? If not, do 
> they use header guards or `#pragma once`?

As I mentioned, the increase is coming from the module files.
The textual headers are header-guarded and not intended to be included more 
than once.

> clang never transforms #include to #import. It will effectively attempt to 
> transform #include and #import to @import but that's different.

I meant to say that we use module maps to replace `#include` with module 
imports. Sorry for getting the exact terminology wrong.


https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-11 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

> It's definitely caused by that change, reverting it fixes the failure.
> 
> We use module maps so that some of our `#include` get processed as `#import`, 
> I believe the same code gets executed for our use-case. Because our setup is 
> so unusual, producing a repro is hard and make take some time. (Build system 
> generated module maps and `#include` is essentially turned into `#import` by 
> Clang).

clang never transforms `#include` to `#import`. It will effectively attempt to 
transform `#include` and `#import` to `@import` but that's different.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-11 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

Are your `textual` headers meant to be included more than once? If not, do they 
use header guards or `#pragma once`?

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-11 Thread Ilya Biryukov via cfe-commits

ilya-biryukov wrote:

Most of the increase seems to be coming from module maps that describe textual 
headers:
```
// This describes builtin headers.
Before: gen-crosstool-x86.cppmap:1:1: note: file entered 630 times using 
83989080B of space
After: gen-crosstool-x86.cppmap:1:1: note: file entered 2260 times using 
319523320B of space
```
I will dig further to see where it's coming from.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-11 Thread Ilya Biryukov via cfe-commits

ilya-biryukov wrote:

It's definitely caused by that change, reverting it fixes the failure.

We use module maps so that some of our `#include` get processed as `#import`, I 
believe the same code gets executed for our use-case. Because our setup is so 
unusual, producing a repro is hard and make take some time. (Build system 
generated module maps and `#include` is essentially turned into `#import` by 
Clang). I will try to investigate this further today and come back with more 
details.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-10 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

> @ian-twilightcoder this change seemed to cause our internal builds to run out 
> of source location space.
> 
> ```
> remark: source manager location address space usage: [-Rsloc-usage]
> note: 408559B in local locations, 2146921126B in locations loaded from AST 
> files, for a total of 2147329685B (99% of available space)
> ```
> 
> The main offenders from the follow-ups are:
> 
> * module map files for standard library and C/C++ runtime headers, each 
> entered a few thousand times,
> * various headers without header guards, e.g. libc++'s `__config` and others,
> * `arm_neon.h`.
>   Without looking deeper into what the change does, this might be an expected 
> outcome here. However, let me know if something already looks off from this 
> small description (e.g. the module maps entered so many times is surprising, 
> I'll need to see if this happened before).
> 
> I will investigate this further, but we might not be able to change how our 
> builds work quickly. Would it be ok to revert this change or put the new 
> behavior under a flag to give us some more time to investigate?

That's surprising, the code to re-enter files is 
`HeaderSearch::ShouldEnterIncludeFile`, and I only changed the behavior there 
for `#import`. The `#include` behavior is still governed by `#pragma once` and 
`FileInfo.getControllingMacro`, and that isn't changed. Are you sure it's this 
change? Do you have a repro case?

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-10 Thread Ilya Biryukov via cfe-commits

ilya-biryukov wrote:

@ian-twilightcoder this change seemed to cause our internal builds to run out 
of source location space.
```
remark: source manager location address space usage: [-Rsloc-usage]
note: 408559B in local locations, 2146921126B in locations loaded from AST 
files, for a total of 2147329685B (99% of available space)
```

The main offenders from the 

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-05 Thread Ian Anderson via cfe-commits

https://github.com/ian-twilightcoder closed 
https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-05 Thread Jan Svoboda via cfe-commits

https://github.com/jansvoboda11 approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-04-04 Thread Ian Anderson via cfe-commits

https://github.com/ian-twilightcoder updated 
https://github.com/llvm/llvm-project/pull/83660

>From 63ff00ec49ac20c5ac97bd673166dabb0fb56136 Mon Sep 17 00:00:00 2001
From: Ian Anderson 
Date: Fri, 1 Mar 2024 22:17:09 -0800
Subject: [PATCH] [clang][modules] Headers meant to be included multiple times
 can be completely invisible in clang module builds

Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` 
and will not be re-entered, even on #include. This means that any errant 
#import of a file designed to be included multiple times, such as , 
will incorrectly mark it as include-once and break the multiple include 
functionality. Normally this isn't a big problem, e.g.  can't have 
its NDEBUG mode changed after the first #import, but it is still mostly 
functional. However, when clang modules are involved, this can cause the header 
to be hidden entirely.

Objective-C code most often uses #import for everything, because it's required 
for most Objective-C headers to prevent double inclusion and redeclaration 
errors. (It's rare for Objective-C headers to use macro guards or `#pragma 
once`.) The problem arises when a submodule includes a multiple-include header. 
The "already included" state is global across all modules (which is necessary 
so that non-modular headers don't get compiled into multiple translation units 
and cause redeclaration errors). If another module or the main file #import's 
the same header, it becomes invisible from then on. If the original submodule 
is not imported, the include of the header will effectively do nothing and the 
header will be invisible. The only way to actually get the header's 
declarations is to somehow figure out which submodule consumed the header, and 
import that instead. That's basically impossible since it depends on exactly 
which modules were built in which order.

#import is a poor indicator of whether a header is actually include-once, as 
the #import is external to the header it applies to, and requires that all 
inclusions correctly and consistently use #import vs #include. When modules are 
enabled, consider a header marked `textual` in its module as a stronger 
indicator of multiple-include than #import's indication of include-once. This 
will allow headers like  to always be included when modules are 
enabled, even if #import is erroneously used somewhere.
---
 clang/include/clang/Lex/HeaderSearch.h   |  26 ++-
 clang/lib/Lex/HeaderSearch.cpp   | 183 +--
 clang/lib/Serialization/ASTReader.cpp|   2 +-
 clang/test/Modules/builtin-import.mm |   2 +
 clang/test/Modules/import-textual-noguard.mm |   6 +-
 clang/test/Modules/import-textual.mm |   2 +
 clang/test/Modules/multiple-import.m |  43 +
 7 files changed, 201 insertions(+), 63 deletions(-)
 create mode 100644 clang/test/Modules/multiple-import.m

diff --git a/clang/include/clang/Lex/HeaderSearch.h 
b/clang/include/clang/Lex/HeaderSearch.h
index 705dcfa8aacc3f..855f81f775f8a8 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -78,11 +78,19 @@ struct HeaderFileInfo {
   LLVM_PREFERRED_TYPE(bool)
   unsigned External : 1;
 
-  /// Whether this header is part of a module.
+  /// Whether this header is part of and built with a module.  i.e. it is 
listed
+  /// in a module map, and is not `excluded` or `textual`. (same meaning as
+  /// `ModuleMap::isModular()`).
   LLVM_PREFERRED_TYPE(bool)
   unsigned isModuleHeader : 1;
 
-  /// Whether this header is part of the module that we are building.
+  /// Whether this header is a `textual header` in a module.
+  LLVM_PREFERRED_TYPE(bool)
+  unsigned isTextualModuleHeader : 1;
+
+  /// Whether this header is part of the module that we are building, even if 
it
+  /// doesn't build with the module. i.e. this will include `excluded` and
+  /// `textual` headers as well as normal headers.
   LLVM_PREFERRED_TYPE(bool)
   unsigned isCompilingModuleHeader : 1;
 
@@ -128,13 +136,20 @@ struct HeaderFileInfo {
 
   HeaderFileInfo()
   : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User),
-External(false), isModuleHeader(false), isCompilingModuleHeader(false),
-Resolved(false), IndexHeaderMapHeader(false), IsValid(false)  {}
+External(false), isModuleHeader(false), isTextualModuleHeader(false),
+isCompilingModuleHeader(false), Resolved(false),
+IndexHeaderMapHeader(false), IsValid(false) {}
 
   /// Retrieve the controlling macro for this header file, if
   /// any.
   const IdentifierInfo *
   getControllingMacro(ExternalPreprocessorSource *External);
+
+  /// Update the module membership bits based on the header role.
+  ///
+  /// isModuleHeader will potentially be set, but not cleared.
+  /// isTextualModuleHeader will be set or cleared based on the role update.
+  void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role);
 };
 
 /// An external source of header

[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-12 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

> > Sometimes it does confuse clang, at least I saw problems with a `typedef 
> > enum` when I made an include-once header `textual`.
> 
> Ok, I see. I would just consider it a bug, not a design decision.
> 
> > That's correct. `#import` is an external source - often it comes from the 
> > users of the header and not the author, and the users might not be 
> > consistent with each other. `textual` comes from the author and a much 
> > stronger indicator of intent.
> 
> Hmm, I need to think more about it. Usually I prefer for modular builds to 
> act like non-modular as it minimizes the surprises and allows to have a 
> portable code. But I haven't really considered the implications of such 
> preference.

Yes generally, but modules build more headers than non-modular builds, nothing 
we can do about that.

> I see why you prefer to give more priority to `textual`. To give a different 
> perspective, what if a developer wants to use a newer library, for example, 
> ncurses. The rest of SDK expects an older version and the developer might 
> need to do some header gymnastics to make it work. So there are cases when 
> developers need to go against library author's intent. I'm thinking that 
> library authors aren't able to predict all the ways their library is used, so 
> it is useful to have some flexibility around that, even at the cost of 
> misusing the library. But this is a general opinion, I don't have specific 
> data about the needs to tweak libraries or the risks of a library misuse. So 
> the opinion isn't particularly strong.

That one seems like an edge case, and using `#import` wouldn't necessarily help 
you because the SDK module could build first before your `#import` is seen.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-12 Thread Volodymyr Sapsai via cfe-commits

vsapsai wrote:

> Sometimes it does confuse clang, at least I saw problems with a `typedef 
> enum` when I made an include-once header `textual`.

Ok, I see. I would just consider it a bug, not a design decision.

> That's correct. `#import` is an external source - often it comes from the 
> users of the header and not the author, and the users might not be consistent 
> with each other. `textual` comes from the author and a much stronger 
> indicator of intent.

Hmm, I need to think more about it. Usually I prefer for modular builds to act 
like non-modular as it minimizes the surprises and allows to have a portable 
code. But I haven't really considered the implications of such preference.

I see why you prefer to give more priority to `textual`. To give a different 
perspective, what if a developer wants to use a newer library, for example, 
ncurses. The rest of SDK expects an older version and the developer might need 
to do some header gymnastics to make it work. So there are cases when 
developers need to go against library author's intent. I'm thinking that 
library authors aren't able to predict all the ways their library is used, so 
it is useful to have some flexibility around that, even at the cost of misusing 
the library. But this is a general opinion, I don't have specific data about 
the needs to tweak libraries or the risks of a library misuse. So the opinion 
isn't particularly strong.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-11 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

> To clarify a little bit
> 
> > [...] The "already included" state is global across all modules (which is 
> > necessary so that non-modular headers don't get compiled into multiple 
> > translation units and cause redeclaration errors).
> 
> The necessity isn't actually true. The same definition in multiple modules 
> shouldn't confuse clang as long as these definitions are identical. But this 
> is a side issue.
Sometimes it does confuse clang, at least I saw problems with a `typedef enum` 
when I made an include-once header `textual`.

> To re-iterate what this change is about:
> 
> 1. `#import` implies a file is a single-include
> 2. Textual header in a module map implies a file is a multi-include (aka 
> re-entrant)
> 3. When we have both `#import` and the header marked as textual, `#import` 
> "wins", i.e. the file is single-include
> 4. You want to make that when we have both `#import` and a textual header, 
> textual should "win", i.e. the file should be multi-include
> 
> Is it correct?

That's correct. `#import` is an external source - often it comes from the users 
of the header and not the author, and the users might not be consistent with 
each other. `textual` comes from the author and a much stronger indicator of 
intent.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-11 Thread Volodymyr Sapsai via cfe-commits

vsapsai wrote:

To clarify a little bit
> [...] The "already included" state is global across all modules (which is 
> necessary so that non-modular headers don't get compiled into multiple 
> translation units and cause redeclaration errors).

The necessity isn't actually true. The same definition in multiple modules 
shouldn't confuse clang as long as these definitions are identical. But this is 
a side issue.

To re-iterate what this change is about:
1. `#import` implies a file is a single-include
2. Textual header in a module map implies a file is a multi-include (aka 
re-entrant)
3. When we have both `#import` and the header marked as textual, `#import` 
"wins", i.e. the file is single-include
4. You want to make that when we have both `#import` and a textual header, 
textual should "win", i.e. the file should be multi-include

Is it correct?

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-11 Thread Ian Anderson via cfe-commits

https://github.com/ian-twilightcoder updated 
https://github.com/llvm/llvm-project/pull/83660

>From 1cb3d459f3a9ae73ac98bf8c06b905d788be954f Mon Sep 17 00:00:00 2001
From: Ian Anderson 
Date: Fri, 1 Mar 2024 22:17:09 -0800
Subject: [PATCH] [clang][modules] Headers meant to be included multiple times
 can be completely invisible in clang module builds

Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` 
and will not be re-entered, even on #include. This means that any errant 
#import of a file designed to be included multiple times, such as , 
will incorrectly mark it as include-once and break the multiple include 
functionality. Normally this isn't a big problem, e.g.  can't have 
its NDEBUG mode changed after the first #import, but it is still mostly 
functional. However, when clang modules are involved, this can cause the header 
to be hidden entirely.

Objective-C code most often uses #import for everything, because it's required 
for most Objective-C headers to prevent double inclusion and redeclaration 
errors. (It's rare for Objective-C headers to use macro guards or `#pragma 
once`.) The problem arises when a submodule includes a multiple-include header. 
The "already included" state is global across all modules (which is necessary 
so that non-modular headers don't get compiled into multiple translation units 
and cause redeclaration errors). If another module or the main file #import's 
the same header, it becomes invisible from then on. If the original submodule 
is not imported, the include of the header will effectively do nothing and the 
header will be invisible. The only way to actually get the header's 
declarations is to somehow figure out which submodule consumed the header, and 
import that instead. That's basically impossible since it depends on exactly 
which modules were built in which order.

#import is a poor indicator of whether a header is actually include-once, as 
the #import is external to the header it applies to, and requires that all 
inclusions correctly and consistently use #import vs #include. When modules are 
enabled, consider a header marked `textual` in its module as a stronger 
indicator of multiple-include than #import's indication of include-once. This 
will allow headers like  to always be included when modules are 
enabled, even if #import is erroneously used somewhere.
---
 clang/include/clang/Lex/HeaderSearch.h   |  26 ++-
 clang/lib/Lex/HeaderSearch.cpp   | 183 +--
 clang/lib/Serialization/ASTReader.cpp|   2 +-
 clang/test/Modules/builtin-import.mm |   2 +
 clang/test/Modules/import-textual-noguard.mm |   6 +-
 clang/test/Modules/import-textual.mm |   2 +
 clang/test/Modules/multiple-import.m |  43 +
 7 files changed, 201 insertions(+), 63 deletions(-)
 create mode 100644 clang/test/Modules/multiple-import.m

diff --git a/clang/include/clang/Lex/HeaderSearch.h 
b/clang/include/clang/Lex/HeaderSearch.h
index 705dcfa8aacc3f..4c9fb58fbd35ef 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -78,11 +78,19 @@ struct HeaderFileInfo {
   LLVM_PREFERRED_TYPE(bool)
   unsigned External : 1;
 
-  /// Whether this header is part of a module.
+  /// Whether this header is part of and built with a module
+  /// (`isTextualModuleHeader` will be `false`).
   LLVM_PREFERRED_TYPE(bool)
   unsigned isModuleHeader : 1;
 
-  /// Whether this header is part of the module that we are building.
+  /// Whether this header is a textual header in a module (`isModuleHeader` 
will
+  /// be `false`).
+  LLVM_PREFERRED_TYPE(bool)
+  unsigned isTextualModuleHeader : 1;
+
+  /// Whether this header is part of the module that we are building
+  /// (independent of `isModuleHeader` and `isTextualModuleHeader`, they can
+  /// both be `false`).
   LLVM_PREFERRED_TYPE(bool)
   unsigned isCompilingModuleHeader : 1;
 
@@ -128,13 +136,20 @@ struct HeaderFileInfo {
 
   HeaderFileInfo()
   : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User),
-External(false), isModuleHeader(false), isCompilingModuleHeader(false),
-Resolved(false), IndexHeaderMapHeader(false), IsValid(false)  {}
+External(false), isModuleHeader(false), isTextualModuleHeader(false),
+isCompilingModuleHeader(false), Resolved(false),
+IndexHeaderMapHeader(false), IsValid(false) {}
 
   /// Retrieve the controlling macro for this header file, if
   /// any.
   const IdentifierInfo *
   getControllingMacro(ExternalPreprocessorSource *External);
+
+  /// Update the module membership bits based on the header role.
+  ///
+  /// isModuleHeader will potentially be set, but not cleared.
+  /// isTextualModuleHeader will be set or cleared based on the role update.
+  void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role);
 };
 
 /// An external source of header file information, which may supply
@@ -522,6 +537,9 @@ class HeaderSearch {
  

[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-04 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

> Thanks to @Bigcheese for helping with this patch.
> 
> https://reviews.llvm.org/D26267 looks like the most recent significant update 
> to this code that we could find. It had an additional allowance where clang 
> headers that were _not_ marked `textual` could be re-entered. I think this 
> was to allow stddef.h (which is actually a multiple-include file despite it 
> not being `textual`) to be entered multiple times while building the Darwin 
> module. That shouldn't be necessary since nothing in the Darwin module or 
> libc++ defines the `__need_` macros.
> 
> The unit tests added in D26267 still pass, so I don't think the builtin 
> special case is needed anymore, but I'm still doing testing.

@bcardosolopes as far as I can tell the builtin special case isn't needed. Can 
you think of anything else I might need to test?

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-04 Thread Ian Anderson via cfe-commits

https://github.com/ian-twilightcoder updated 
https://github.com/llvm/llvm-project/pull/83660

>From f90fe8b1b7c73b68614ade3cf7e1c292f02d3573 Mon Sep 17 00:00:00 2001
From: Ian Anderson 
Date: Fri, 1 Mar 2024 22:17:09 -0800
Subject: [PATCH] [clang][modules] Headers meant to be included multiple times
 can be completely invisible in clang module builds

Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` 
and will not be re-entered, even on #include. This means that any errant 
#import of a file designed to be included multiple times, such as , 
will incorrectly mark it as include-once and break the multiple include 
functionality. Normally this isn't a big problem, e.g.  can't have 
its NDEBUG mode changed after the first #import, but it is still mostly 
functional. However, when clang modules are involved, this can cause the header 
to be hidden entirely.

Objective-C code most often uses #import for everything, because it's required 
for most Objective-C headers to prevent double inclusion and redeclaration 
errors. (It's rare for Objective-C headers to use macro guards or `#pragma 
once`.) The problem arises when a submodule includes a multiple-include header. 
The "already included" state is global across all modules (which is necessary 
so that non-modular headers don't get compiled into multiple translation units 
and cause redeclaration errors). If another module or the main file #import's 
the same header, it becomes invisible from then on. If the original submodule 
is not imported, the include of the header will effectively do nothing and the 
header will be invisible. The only way to actually get the header's 
declarations is to somehow figure out which submodule consumed the header, and 
import that instead. That's basically impossible since it depends on exactly 
which modules were built in which order.

#import is a poor indicator of whether a header is actually include-once, as 
the #import is external to the header it applies to, and requires that all 
inclusions correctly and consistently use #import vs #include. When modules are 
enabled, consider a header marked `textual` in its module as a stronger 
indicator of multiple-include than #import's indication of include-once. This 
will allow headers like  to always be included when modules are 
enabled, even if #import is erroneously used somewhere.
---
 clang/include/clang/Lex/HeaderSearch.h   |  26 +++-
 clang/lib/Lex/HeaderSearch.cpp   | 154 ---
 clang/lib/Serialization/ASTReader.cpp|   2 +-
 clang/test/Modules/builtin-import.mm |   2 +
 clang/test/Modules/import-textual-noguard.mm |   6 +-
 clang/test/Modules/import-textual.mm |   2 +
 clang/test/Modules/multiple-import.m |  43 ++
 7 files changed, 173 insertions(+), 62 deletions(-)
 create mode 100644 clang/test/Modules/multiple-import.m

diff --git a/clang/include/clang/Lex/HeaderSearch.h 
b/clang/include/clang/Lex/HeaderSearch.h
index 705dcfa8aacc3f8..4c9fb58fbd35ef1 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -78,11 +78,19 @@ struct HeaderFileInfo {
   LLVM_PREFERRED_TYPE(bool)
   unsigned External : 1;
 
-  /// Whether this header is part of a module.
+  /// Whether this header is part of and built with a module
+  /// (`isTextualModuleHeader` will be `false`).
   LLVM_PREFERRED_TYPE(bool)
   unsigned isModuleHeader : 1;
 
-  /// Whether this header is part of the module that we are building.
+  /// Whether this header is a textual header in a module (`isModuleHeader` 
will
+  /// be `false`).
+  LLVM_PREFERRED_TYPE(bool)
+  unsigned isTextualModuleHeader : 1;
+
+  /// Whether this header is part of the module that we are building
+  /// (independent of `isModuleHeader` and `isTextualModuleHeader`, they can
+  /// both be `false`).
   LLVM_PREFERRED_TYPE(bool)
   unsigned isCompilingModuleHeader : 1;
 
@@ -128,13 +136,20 @@ struct HeaderFileInfo {
 
   HeaderFileInfo()
   : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User),
-External(false), isModuleHeader(false), isCompilingModuleHeader(false),
-Resolved(false), IndexHeaderMapHeader(false), IsValid(false)  {}
+External(false), isModuleHeader(false), isTextualModuleHeader(false),
+isCompilingModuleHeader(false), Resolved(false),
+IndexHeaderMapHeader(false), IsValid(false) {}
 
   /// Retrieve the controlling macro for this header file, if
   /// any.
   const IdentifierInfo *
   getControllingMacro(ExternalPreprocessorSource *External);
+
+  /// Update the module membership bits based on the header role.
+  ///
+  /// isModuleHeader will potentially be set, but not cleared.
+  /// isTextualModuleHeader will be set or cleared based on the role update.
+  void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role);
 };
 
 /// An external source of header file information, which may supply
@@ -522,6 +537,9 @@ class HeaderSearch 

[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-04 Thread Ian Anderson via cfe-commits

https://github.com/ian-twilightcoder updated 
https://github.com/llvm/llvm-project/pull/83660

>From 08681ff77f432806316109146ab4fef058137648 Mon Sep 17 00:00:00 2001
From: Ian Anderson 
Date: Fri, 1 Mar 2024 22:17:09 -0800
Subject: [PATCH] [clang][modules] Headers meant to be included multiple times
 can be completely invisible in clang module builds

Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` 
and will not be re-entered, even on #include. This means that any errant 
#import of a file designed to be included multiple times, such as , 
will incorrectly mark it as include-once and break the multiple include 
functionality. Normally this isn't a big problem, e.g.  can't have 
its NDEBUG mode changed after the first #import, but it is still mostly 
functional. However, when clang modules are involved, this can cause the header 
to be hidden entirely.

Objective-C code most often uses #import for everything, because it's required 
for most Objective-C headers to prevent double inclusion and redeclaration 
errors. (It's rare for Objective-C headers to use macro guards or `#pragma 
once`.) The problem arises when a submodule includes a multiple-include header. 
The "already included" state is global across all modules (which is necessary 
so that non-modular headers don't get compiled into multiple translation units 
and cause redeclaration errors). If another module or the main file #import's 
the same header, it becomes invisible from then on. If the original submodule 
is not imported, the include of the header will effectively do nothing and the 
header will be invisible. The only way to actually get the header's 
declarations is to somehow figure out which submodule consumed the header, and 
import that instead. That's basically impossible since it depends on exactly 
which modules were built in which order.

#import is a poor indicator of whether a header is actually include-once, as 
the #import is external to the header it applies to, and requires that all 
inclusions correctly and consistently use #import vs #include. When modules are 
enabled, consider a header marked `textual` in its module as a stronger 
indicator of multiple-include than #import's indication of include-once. This 
will allow headers like  to always be included when modules are 
enabled, even if #import is erroneously used somewhere.
---
 clang/include/clang/Lex/HeaderSearch.h   |  26 +++-
 clang/lib/Lex/HeaderSearch.cpp   | 154 ---
 clang/lib/Serialization/ASTReader.cpp|   2 +-
 clang/test/Modules/builtin-import.mm |   2 +
 clang/test/Modules/import-textual-noguard.mm |   6 +-
 clang/test/Modules/import-textual.mm |   2 +
 clang/test/Modules/multiple-import.m |  43 ++
 7 files changed, 173 insertions(+), 62 deletions(-)
 create mode 100644 clang/test/Modules/multiple-import.m

diff --git a/clang/include/clang/Lex/HeaderSearch.h 
b/clang/include/clang/Lex/HeaderSearch.h
index 705dcfa8aacc3f8..4c9fb58fbd35ef1 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -78,11 +78,19 @@ struct HeaderFileInfo {
   LLVM_PREFERRED_TYPE(bool)
   unsigned External : 1;
 
-  /// Whether this header is part of a module.
+  /// Whether this header is part of and built with a module
+  /// (`isTextualModuleHeader` will be `false`).
   LLVM_PREFERRED_TYPE(bool)
   unsigned isModuleHeader : 1;
 
-  /// Whether this header is part of the module that we are building.
+  /// Whether this header is a textual header in a module (`isModuleHeader` 
will
+  /// be `false`).
+  LLVM_PREFERRED_TYPE(bool)
+  unsigned isTextualModuleHeader : 1;
+
+  /// Whether this header is part of the module that we are building
+  /// (independent of `isModuleHeader` and `isTextualModuleHeader`, they can
+  /// both be `false`).
   LLVM_PREFERRED_TYPE(bool)
   unsigned isCompilingModuleHeader : 1;
 
@@ -128,13 +136,20 @@ struct HeaderFileInfo {
 
   HeaderFileInfo()
   : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User),
-External(false), isModuleHeader(false), isCompilingModuleHeader(false),
-Resolved(false), IndexHeaderMapHeader(false), IsValid(false)  {}
+External(false), isModuleHeader(false), isTextualModuleHeader(false),
+isCompilingModuleHeader(false), Resolved(false),
+IndexHeaderMapHeader(false), IsValid(false) {}
 
   /// Retrieve the controlling macro for this header file, if
   /// any.
   const IdentifierInfo *
   getControllingMacro(ExternalPreprocessorSource *External);
+
+  /// Update the module membership bits based on the header role.
+  ///
+  /// isModuleHeader will potentially be set, but not cleared.
+  /// isTextualModuleHeader will be set or cleared based on the role update.
+  void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role);
 };
 
 /// An external source of header file information, which may supply
@@ -522,6 +537,9 @@ class HeaderSearch 

[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-04 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

> > Once a file has been `#import`'ed, it gets stamped as if it was `#pragma 
> > once` and will not be re-entered, even on #include.
> 
> Can you explain how this is happening? The only place where 
> `HeaderFileInfo::isPragmaOnce` is set to `true` is in 
> `HeaderSearch::MarkFileIncludeOnce()`, only called from 
> `Preprocessor::HandlePragmaOnce()`, which seems correct.

It gets `HeaderFileInfo::isImport` which causes it to be treated the same as 
`#pragma once` from then on.

> > The problem arises when a submodule includes a multiple-include header. The 
> > "already included" state is global across all modules (which is necessary 
> > so that non-modular headers don't get compiled into multiple translation 
> > units and cause redeclaration errors). If another module or the main file 
> > #import's the same header, it becomes invisible from then on. If the 
> > original submodule is not imported, the include of the header will 
> > effectively do nothing and the header will be invisible. The only way to 
> > actually get the header's declarations is to somehow figure out which 
> > submodule consumed the header, and import that instead. That's basically 
> > impossible since it depends on exactly which modules were built in which 
> > order.
> 
> Could this be also solved by tracking the "already included" state [per 
> submodule](https://github.com/llvm/llvm-project/pull/71117)?

That would fix headers that can be included multiple times (which is what I'm 
trying to fix here). But I think that would break headers that can only be 
included once and are not part of a module, since their declarations would go 
into multiple pcms.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-04 Thread Jan Svoboda via cfe-commits

jansvoboda11 wrote:

> Once a file has been `#import`'ed, it gets stamped as if it was `#pragma 
> once` and will not be re-entered, even on #include.

Can you explain how this is happening? The only place where 
`HeaderFileInfo::isPragmaOnce` is set to `true` is in 
`HeaderSearch::MarkFileIncludeOnce()`, only called from 
`Preprocessor::HandlePragmaOnce()`, which seems correct.

> The problem arises when a submodule includes a multiple-include header. The 
> "already included" state is global across all modules (which is necessary so 
> that non-modular headers don't get compiled into multiple translation units 
> and cause redeclaration errors). If another module or the main file #import's 
> the same header, it becomes invisible from then on. If the original submodule 
> is not imported, the include of the header will effectively do nothing and 
> the header will be invisible. The only way to actually get the header's 
> declarations is to somehow figure out which submodule consumed the header, 
> and import that instead. That's basically impossible since it depends on 
> exactly which modules were built in which order.

Could this be also solved by tracking the "already included" state [per 
submodule](https://github.com/llvm/llvm-project/pull/71117)?

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-02 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

```
error: 'expected-error' diagnostics seen but not expected:
  File 
/var/lib/buildkite-agent/builds/linux-56-59b8f5d88-6mhbj-1/llvm-project/clang-ci/clang/test/Modules/explicit-build-overlap.cpp
 Line 11: module use does not directly depend on a module exporting 'a.h', 
which is part of indirectly-used module b
1 error generated.
```

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-01 Thread Ian Anderson via cfe-commits

https://github.com/ian-twilightcoder updated 
https://github.com/llvm/llvm-project/pull/83660

>From 771c0740dcc438d99baf4ad9555bc57e963001df Mon Sep 17 00:00:00 2001
From: Ian Anderson 
Date: Fri, 1 Mar 2024 22:17:09 -0800
Subject: [PATCH] [clang][modules] Headers meant to be included multiple times
 can be completely invisible in clang module builds

Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` 
and will not be re-entered, even on #include. This means that any errant 
#import of a file designed to be included multiple times, such as , 
will incorrectly mark it as include-once and break the multiple include 
functionality. Normally this isn't a big problem, e.g.  can't have 
its NDEBUG mode changed after the first #import, but it is still mostly 
functional. However, when clang modules are involved, this can cause the header 
to be hidden entirely.

Objective-C code most often uses #import for everything, because it's required 
for most Objective-C headers to prevent double inclusion and redeclaration 
errors. (It's rare for Objective-C headers to use macro guards or `#pragma 
once`.) The problem arises when a submodule includes a multiple-include header. 
The "already included" state is global across all modules (which is necessary 
so that non-modular headers don't get compiled into multiple translation units 
and cause redeclaration errors). If another module or the main file #import's 
the same header, it becomes invisible from then on. If the original submodule 
is not imported, the include of the header will effectively do nothing and the 
header will be invisible. The only way to actually get the header's 
declarations is to somehow figure out which submodule consumed the header, and 
import that instead. That's basically impossible since it depends on exactly 
which modules were built in which order.

#import is a poor indicator of whether a header is actually include-once, as 
the #import is external to the header it applies to, and requires that all 
inclusions correctly and consistently use #import vs #include. When modules are 
enabled, consider a header marked `textual` in its module as a stronger 
indicator of multiple-include than #import's indication of include-once. This 
will allow headers like  to always be included when modules are 
enabled, even if #import is erroneously used somewhere.
---
 clang/include/clang/Lex/HeaderSearch.h   |  24 ++-
 clang/lib/Lex/HeaderSearch.cpp   | 157 ---
 clang/lib/Serialization/ASTReader.cpp|   2 +-
 clang/test/Modules/builtin-import.mm |   2 +
 clang/test/Modules/import-textual-noguard.mm |   6 +-
 clang/test/Modules/import-textual.mm |   2 +
 clang/test/Modules/multiple-import.m |  43 +
 7 files changed, 173 insertions(+), 63 deletions(-)
 create mode 100644 clang/test/Modules/multiple-import.m

diff --git a/clang/include/clang/Lex/HeaderSearch.h 
b/clang/include/clang/Lex/HeaderSearch.h
index 705dcfa8aacc3f..bf8981b94d1b57 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -78,11 +78,17 @@ struct HeaderFileInfo {
   LLVM_PREFERRED_TYPE(bool)
   unsigned External : 1;
 
-  /// Whether this header is part of a module.
+  /// Whether this header is part of and built with a module.
   LLVM_PREFERRED_TYPE(bool)
   unsigned isModuleHeader : 1;
 
-  /// Whether this header is part of the module that we are building.
+  /// Whether this header is a textual header in a module (isModuleHeader will
+  /// be false)
+  LLVM_PREFERRED_TYPE(bool)
+  unsigned isTextualModuleHeader : 1;
+
+  /// Whether this header is part of and built with the module that we are
+  /// building.
   LLVM_PREFERRED_TYPE(bool)
   unsigned isCompilingModuleHeader : 1;
 
@@ -128,13 +134,20 @@ struct HeaderFileInfo {
 
   HeaderFileInfo()
   : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User),
-External(false), isModuleHeader(false), isCompilingModuleHeader(false),
-Resolved(false), IndexHeaderMapHeader(false), IsValid(false)  {}
+External(false), isModuleHeader(false), isTextualModuleHeader(false),
+isCompilingModuleHeader(false), Resolved(false),
+IndexHeaderMapHeader(false), IsValid(false) {}
 
   /// Retrieve the controlling macro for this header file, if
   /// any.
   const IdentifierInfo *
   getControllingMacro(ExternalPreprocessorSource *External);
+
+  /// Update the module membership bits based on the header role.
+  ///
+  /// isModuleHeader will potentially be set, but not cleared.
+  /// isTextualModuleHeader will be set or cleared based on the role update.
+  void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role);
 };
 
 /// An external source of header file information, which may supply
@@ -522,6 +535,9 @@ class HeaderSearch {
   ///
   /// \return false if \#including the file will have no effect or true
   /// if we should include it.
+  ///
+  /// \param M Th

[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-01 Thread Ian Anderson via cfe-commits

ian-twilightcoder wrote:

Thanks to @Bigcheese for helping with this patch.

https://reviews.llvm.org/D26267 looks like the most recent significant update 
to this code that we could find. It had an additional allowance where clang 
headers that were _not_ marked `textual` could be re-entered. I think this was 
to allow stddef.h (which is actually a multiple-include file despite it not 
being `textual`) to be entered multiple times while building the Darwin module. 
That shouldn't be necessary since nothing in the Darwin module or libc++ 
defines the `__need_` macros.

The unit tests added in D26267 still pass, so I don't think the builtin special 
case is needed anymore, but I'm still doing testing.

https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-01 Thread via cfe-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


``bash
git-clang-format --diff 07317bbc66d1f2d7663af3c9f04d0f6c0487ac03 
171d0e299dd676ce29583e16fdf8c3e6f3dd7925 -- 
clang/include/clang/Lex/HeaderSearch.h clang/lib/Lex/HeaderSearch.cpp 
clang/lib/Serialization/ASTReader.cpp
``





View the diff from clang-format here.


``diff
diff --git a/clang/lib/Lex/HeaderSearch.cpp b/clang/lib/Lex/HeaderSearch.cpp
index fb6ec456dd..f56781e785 100644
--- a/clang/lib/Lex/HeaderSearch.cpp
+++ b/clang/lib/Lex/HeaderSearch.cpp
@@ -1432,7 +1432,8 @@ void HeaderSearch::MarkFileModuleHeader(FileEntryRef FE,
 
   auto &HFI = getFileInfo(FE);
   HFI.mergeModuleMembership(Role);
-  HFI.isCompilingModuleHeader |= (HFI.isModuleHeader && 
isCompilingModuleHeader);
+  HFI.isCompilingModuleHeader |=
+  (HFI.isModuleHeader && isCompilingModuleHeader);
 }
 
 bool HeaderSearch::ShouldEnterIncludeFile(Preprocessor &PP,

``




https://github.com/llvm/llvm-project/pull/83660
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-01 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Ian Anderson (ian-twilightcoder)


Changes

Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` 
and will not be re-entered, even on #include. This means that any errant 
#import of a file designed to be included multiple times, such as 
, will incorrectly mark it as include-once and break the 
multiple include functionality. Normally this isn't a big problem, e.g. 
 can't have its NDEBUG mode changed after the first #import, 
but it is still mostly functional. However, when clang modules are involved, 
this can cause the header to be hidden entirely.

Objective-C code most often uses #import for everything, because it's required 
for most Objective-C headers to prevent double inclusion and redeclaration 
errors. (It's rare for Objective-C headers to use macro guards or `#pragma 
once`.) The problem arises when a submodule includes a multiple-include header. 
The "already included" state is global across all modules (which is necessary 
so that non-modular headers don't get compiled into multiple translation units 
and cause redeclaration errors). If another module or the main file #import's 
the same header, it becomes invisible from then on. If the original submodule 
is not imported, the include of the header will effectively do nothing and the 
header will be invisible. The only way to actually get the header's 
declarations is to somehow figure out which submodule consumed the header, and 
import that instead. That's basically impossible since it depends on exactly 
which modules were built in which order.

#import is a poor indicator of whether a header is actually include-once, as 
the #import is external to the header it applies to, and requires that all 
inclusions correctly and consistently use #import vs #include. When modules are 
enabled, consider a header marked `textual` in its module as a stronger 
indicator of multiple-include than #import's indication of include-once. This 
will allow headers like  to always be included when modules are 
enabled, even if #import is erroneously used somewhere.

---
Full diff: https://github.com/llvm/llvm-project/pull/83660.diff


7 Files Affected:

- (modified) clang/include/clang/Lex/HeaderSearch.h (+20-4) 
- (modified) clang/lib/Lex/HeaderSearch.cpp (+99-57) 
- (modified) clang/lib/Serialization/ASTReader.cpp (+1-1) 
- (modified) clang/test/Modules/builtin-import.mm (+2) 
- (modified) clang/test/Modules/import-textual-noguard.mm (+5-1) 
- (modified) clang/test/Modules/import-textual.mm (+2) 
- (added) clang/test/Modules/multiple-import.m (+43) 


``diff
diff --git a/clang/include/clang/Lex/HeaderSearch.h 
b/clang/include/clang/Lex/HeaderSearch.h
index 705dcfa8aacc3f..bf8981b94d1b57 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -78,11 +78,17 @@ struct HeaderFileInfo {
   LLVM_PREFERRED_TYPE(bool)
   unsigned External : 1;
 
-  /// Whether this header is part of a module.
+  /// Whether this header is part of and built with a module.
   LLVM_PREFERRED_TYPE(bool)
   unsigned isModuleHeader : 1;
 
-  /// Whether this header is part of the module that we are building.
+  /// Whether this header is a textual header in a module (isModuleHeader will
+  /// be false)
+  LLVM_PREFERRED_TYPE(bool)
+  unsigned isTextualModuleHeader : 1;
+
+  /// Whether this header is part of and built with the module that we are
+  /// building.
   LLVM_PREFERRED_TYPE(bool)
   unsigned isCompilingModuleHeader : 1;
 
@@ -128,13 +134,20 @@ struct HeaderFileInfo {
 
   HeaderFileInfo()
   : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User),
-External(false), isModuleHeader(false), isCompilingModuleHeader(false),
-Resolved(false), IndexHeaderMapHeader(false), IsValid(false)  {}
+External(false), isModuleHeader(false), isTextualModuleHeader(false),
+isCompilingModuleHeader(false), Resolved(false),
+IndexHeaderMapHeader(false), IsValid(false) {}
 
   /// Retrieve the controlling macro for this header file, if
   /// any.
   const IdentifierInfo *
   getControllingMacro(ExternalPreprocessorSource *External);
+
+  /// Update the module membership bits based on the header role.
+  ///
+  /// isModuleHeader will potentially be set, but not cleared.
+  /// isTextualModuleHeader will be set or cleared based on the role update.
+  void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role);
 };
 
 /// An external source of header file information, which may supply
@@ -522,6 +535,9 @@ class HeaderSearch {
   ///
   /// \return false if \#including the file will have no effect or true
   /// if we should include it.
+  ///
+  /// \param M The module to which `File` belongs (this should usually be the
+  /// SuggestedModule returned by LookupFile/LookupSubframeworkHeader)
   bool ShouldEnterIncludeFile(Preprocessor &PP, FileEntryRef File,
   

[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)

2024-03-01 Thread Ian Anderson via cfe-commits

https://github.com/ian-twilightcoder created 
https://github.com/llvm/llvm-project/pull/83660

Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` 
and will not be re-entered, even on #include. This means that any errant 
#import of a file designed to be included multiple times, such as , 
will incorrectly mark it as include-once and break the multiple include 
functionality. Normally this isn't a big problem, e.g.  can't have 
its NDEBUG mode changed after the first #import, but it is still mostly 
functional. However, when clang modules are involved, this can cause the header 
to be hidden entirely.

Objective-C code most often uses #import for everything, because it's required 
for most Objective-C headers to prevent double inclusion and redeclaration 
errors. (It's rare for Objective-C headers to use macro guards or `#pragma 
once`.) The problem arises when a submodule includes a multiple-include header. 
The "already included" state is global across all modules (which is necessary 
so that non-modular headers don't get compiled into multiple translation units 
and cause redeclaration errors). If another module or the main file #import's 
the same header, it becomes invisible from then on. If the original submodule 
is not imported, the include of the header will effectively do nothing and the 
header will be invisible. The only way to actually get the header's 
declarations is to somehow figure out which submodule consumed the header, and 
import that instead. That's basically impossible since it depends on exactly 
which modules were built in which order.

#import is a poor indicator of whether a header is actually include-once, as 
the #import is external to the header it applies to, and requires that all 
inclusions correctly and consistently use #import vs #include. When modules are 
enabled, consider a header marked `textual` in its module as a stronger 
indicator of multiple-include than #import's indication of include-once. This 
will allow headers like  to always be included when modules are 
enabled, even if #import is erroneously used somewhere.

>From 171d0e299dd676ce29583e16fdf8c3e6f3dd7925 Mon Sep 17 00:00:00 2001
From: Ian Anderson 
Date: Fri, 1 Mar 2024 22:17:09 -0800
Subject: [PATCH] [clang][modules] Headers meant to be included multiple times
 can be completely invisible in clang module builds

Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` 
and will not be re-entered, even on #include. This means that any errant 
#import of a file designed to be included multiple times, such as , 
will incorrectly mark it as include-once and break the multiple include 
functionality. Normally this isn't a big problem, e.g.  can't have 
its NDEBUG mode changed after the first #import, but it is still mostly 
functional. However, when clang modules are involved, this can cause the header 
to be hidden entirely.

Objective-C code most often uses #import for everything, because it's required 
for most Objective-C headers to prevent double inclusion and redeclaration 
errors. (It's rare for Objective-C headers to use macro guards or `#pragma 
once`.) The problem arises when a submodule includes a multiple-include header. 
The "already included" state is global across all modules (which is necessary 
so that non-modular headers don't get compiled into multiple translation units 
and cause redeclaration errors). If another module or the main file #import's 
the same header, it becomes invisible from then on. If the original submodule 
is not imported, the include of the header will effectively do nothing and the 
header will be invisible. The only way to actually get the header's 
declarations is to somehow figure out which submodule consumed the header, and 
import that instead. That's basically impossible since it depends on exactly 
which modules were built in which order.

#import is a poor indicator of whether a header is actually include-once, as 
the #import is external to the header it applies to, and requires that all 
inclusions correctly and consistently use #import vs #include. When modules are 
enabled, consider a header marked `textual` in its module as a stronger 
indicator of multiple-include than #import's indication of include-once. This 
will allow headers like  to always be included when modules are 
enabled, even if #import is erroneously used somewhere.
---
 clang/include/clang/Lex/HeaderSearch.h   |  24 ++-
 clang/lib/Lex/HeaderSearch.cpp   | 156 ---
 clang/lib/Serialization/ASTReader.cpp|   2 +-
 clang/test/Modules/builtin-import.mm |   2 +
 clang/test/Modules/import-textual-noguard.mm |   6 +-
 clang/test/Modules/import-textual.mm |   2 +
 clang/test/Modules/multiple-import.m |  43 +
 7 files changed, 172 insertions(+), 63 deletions(-)
 create mode 100644 clang/test/Modules/multiple-import.m

diff --git a/clang/include/clang/Lex/HeaderSearch.h 
b/clang/incl