[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
@@ -1403,94 +1421,146 @@ bool HeaderSearch::isFileMultipleIncludeGuarded(FileEntryRef File) const { void HeaderSearch::MarkFileModuleHeader(FileEntryRef FE, ModuleMap::ModuleHeaderRole Role, bool isCompilingModuleHeader) { - bool isModularHeader = ModuleMap::isModular(Role); - // Don't mark the file info as non-external if there's nothing to change. if (!isCompilingModuleHeader) { -if (!isModularHeader) +if ((Role & ModuleMap::ExcludedHeader)) return; auto *HFI = getExistingFileInfo(FE); if (HFI && HFI->isModuleHeader) return; } auto &HFI = getFileInfo(FE); - HFI.isModuleHeader |= isModularHeader; + HFI.mergeModuleMembership(Role); HFI.isCompilingModuleHeader |= isCompilingModuleHeader; ian-twilightcoder wrote: https://github.com/llvm/llvm-project/pull/89005 and thank you!! https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
@@ -1403,94 +1421,146 @@ bool HeaderSearch::isFileMultipleIncludeGuarded(FileEntryRef File) const { void HeaderSearch::MarkFileModuleHeader(FileEntryRef FE, ModuleMap::ModuleHeaderRole Role, bool isCompilingModuleHeader) { - bool isModularHeader = ModuleMap::isModular(Role); - // Don't mark the file info as non-external if there's nothing to change. if (!isCompilingModuleHeader) { -if (!isModularHeader) +if ((Role & ModuleMap::ExcludedHeader)) return; auto *HFI = getExistingFileInfo(FE); if (HFI && HFI->isModuleHeader) return; } auto &HFI = getFileInfo(FE); - HFI.isModuleHeader |= isModularHeader; + HFI.mergeModuleMembership(Role); HFI.isCompilingModuleHeader |= isCompilingModuleHeader; ian-twilightcoder wrote: Ohhh. So I need to be checking the `getExistingFileInfo` to see if `isModuleHeader` and `isTextualModuleHeader` match the given `ModuleHeaderRole`? https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
@@ -1403,94 +1421,146 @@ bool HeaderSearch::isFileMultipleIncludeGuarded(FileEntryRef File) const { void HeaderSearch::MarkFileModuleHeader(FileEntryRef FE, ModuleMap::ModuleHeaderRole Role, bool isCompilingModuleHeader) { - bool isModularHeader = ModuleMap::isModular(Role); - // Don't mark the file info as non-external if there's nothing to change. if (!isCompilingModuleHeader) { -if (!isModularHeader) +if ((Role & ModuleMap::ExcludedHeader)) return; auto *HFI = getExistingFileInfo(FE); if (HFI && HFI->isModuleHeader) return; } auto &HFI = getFileInfo(FE); - HFI.isModuleHeader |= isModularHeader; + HFI.mergeModuleMembership(Role); HFI.isCompilingModuleHeader |= isCompilingModuleHeader; zygoloid wrote: It looks to me like we're now calling `getFileInfo(FE)` even in cases where the info doesn't change, for a textual header. I think that's what's causing the regression we're seeing -- we're now considering a repeatedly-used module map file to be affecting, even though it isn't, because we end up with a local `HeaderFileInfo` instead of an imported one. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: > To get back to my previous point about the semantics implemented by this > patch being unfortunate -- the upshot is that, now: > > ``` > #include > #define NDEBUG > #import > ``` It would be nice if we could make this work without modules too. `#pragma many` or something would do that, but I fear we'd have compatibility issues if we added a pragma that's only supported by clang to a bunch of system headers. > This doesn't make sense: with modules disabled, everything is textual -- to > say that "textual header" in a modulemap should make a header somehow "more" > textual than the default non-modular behavior is weird. With modules disabled, everything isn't textual - it's effectively excluded. `textual` was probably not the best name for that keyword in the module map. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: Wall of text incoming... Sorry but our documentation in this area is poor, so I feel like I need to spell out the assumptions behind this change. First of all, this _should_ be a relatively minor change that should affect a very limited set of circumstances. 1. Only `#import` should be affected, not `#include`. 2. Only headers marked `textual header` in a module map should be affected, nothing else. 3. Headers that use `#pragma once` or have header guards should not be affected. 4. Module map reading should not be affected. That should end up being a rather narrow change, and if something else got affected I'd like to know what and fix it. `#import` is already treated differently if modules are enabled. This adds another case to the special treatment, but doesn't introduce that fact that sometimes `#import`ed headers will be re-entered when building with modules. We can say we don't like `#import` behaving differently when modules are enabled. However, it's not going to be any harder to undo this special treatment than it will to undo the other ones, if we ever care enough to try, so I don't think this is such an egregious change. `textual header`s are already treated differently as well. To better define that, there are 4 different types of headers in a clang modules world. 1. Headers covered by a `header` statement in a module map - ye olde standard include-once header. 2. Headers covered by a `textual header` statement in a module map - usually headers that are intended to be included multiple times. Generally these are X macro generator headers, or special headers like or clang's and which are designed to have different behavior based on the preprocessor environment of the includer. 3. Headers covered by an `exclude header` statement in a module map - headers that can't be used in a module. Sometimes these are just obsolete headers, and sometimes it's headers that have dependencies that can't be used in a module. 4. Headers that are not listed in any module map. These are essentially the same thing as excluded headers. There's a semantic difference between textual and excluded headers. Textual headers are anointed to be safe to use in modules, and excluded headers are not supposed to be included from any modular header. (If I had my way, `-Wnon-modular-include-in-module` would default to an error much like `-Wmodule-import-in-extern-c` does, but that can be a soap box for another day.) Textual headers are already treated differently from excluded headers in a few ways, such as they respect `[no_undeclared_includes]`/`use`, while excluded headers don't. All this to say that `#import`ed headers are already treated different when modules are enabled, and textual headers are already treated different than excluded headers. I think it's fine to tweak this different treatment, especially since it's solving a real world problem that occurs in a fully modularized system. I also think it's preferable to focus the change down to the very specific problem, which is `#import`ed textual headers, instead of trying to fix all of the submodule visibility problems we can think of. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
jyknight wrote: To get back to my previous point about the semantics implemented by this patch being unfortunate -- the upshot is that, now: ``` #include #define NDEBUG #import ``` will include assert.h twice (even though the latter is an "import") _only_ if modules are enabled. If modules are disabled, import keeps the original behavior of suppressing multiple inclusion. This doesn't make sense: with modules disabled, everything is textual -- to say that "textual header" in a modulemap should make a header somehow "more" textual than the default non-modular behavior is weird. That is what I meant by my previous comment that this change has tied together features which should not be related -- and that's the main reason why I believe this PR should be reverted. If we _do_ need to implement a behavior of suppressing `#import`'s include-once behavior for certain headers, it should be based on something within the header being included. E.g. could create a new`#pragma multiply_include_even_with_import` and add it to assert.h -- this would be effectively the reverse of `#pragma once`. I do not believe it's really needed (once the "header is hidden entirely" bug is fixed), but it would be consistent and not produce broken/surprising semantics. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
jyknight wrote: >> The end result should be that #imported and #pragma once-guarded files are >> treated the same way as #ifndef-guarded files. > While I don't necessarily disagree with that goal in principle, it wasn't > true before this change either. Well, yes, I know it isn't true yet -- that's exactly what I've been trying to say, and demonstrated with the test -- it's not true _and_ that's a bug. > Modular headers including non-modular headers is a special problem. If we let > every module enter the non-modular header (which Jan's patch does), its > declarations will build into multiple modules/pcm files I'm not really sure what you're getting at in this paragraph. Is there a _new_ special problem introduced here? Yes, a textual header can be built into multiple modules/pcm files -- that's the nature of it being textual, right? Certainly it's always be a better to modularize headers from the bottom-up, and avoid textually including a header into multiple modules. And, yes, the performance cost of duplicating the decls into multiple modules' pcm files can be quite significant, depending on the size of the header. Yet, sometimes it happens. And, when it does, the declaration-merging takes care of it -- for C/C++ clients. This all works (and is used) today, with ifndef-guarded header inclusion. But a textually-included header shouldn't be re-entered if it's been _visibly_ included already in the current compilation -- including if that was exported from an imported module. It should only be re-entered if the previous include is _not_ visible in the current context. Maybe you're saying Jan's patch doesn't do that properly yet. (I could see that being the case, since the behavior mostly just falls out automatically for the ifndef-guard case, based on whether the guard-macro was made visible or not!) https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: > The end result should be that #imported and #pragma once-guarded files are > treated the same way as #ifndef-guarded files. While I don't necessarily disagree with that goal in principle, it wasn't true before this change either. There was already some special casing to defeat `#import`. This change just adds another special case to the `#import` path. `#include` doesn't even go through the modified code that is `MaybeReenterImportedFile` (née `TryEnterImported`). > I know you weren't trying to fix that -- but I'm saying that's the thing that > _should_ be fixed, and that doing so would fix the _serious_ part of the > problem you described in the initial message. Modular headers including non-modular headers is a special problem. If we let every module enter the non-modular header (which Jan's patch does), its declarations will build into multiple modules/pcm files, where they will 1) conflict and cause errors in the several cases where type merging can't handle them, and 2) come from multiple modules which is undefined behavior for Swift (i.e. Swift will see `mod1.type` and `mod2.type` and won't know how to resolve un-qualified `type` in code). So I don't think that case really _can_ be fixed. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
jyknight wrote: > The problem I'm trying to fix is that nobody knows when it's appropriate to > use #import vs #include But you haven't really (and I think cannot) fixed that. > using header guards or #pragma once is very "un-Objective-C". Yes, this is quite unfortunate. The best answer would be to change the recommended style, but unfortunately, it's probably 20-30 years too late to do so effectively. (I mean, it would be a start if Apple added `#pragma once` to all of the SDK headers.) Your initial comment said: > Normally this isn't a big problem, [...] However, when clang modules are > involved, this can cause the header to be hidden entirely. And I am basically agreeing with this. The current state of objc include/import, while unfortunate, has existed for decades, and is usually not a big problem. And, I think that means we shouldn't add more complexity to the semantics of `#import` in an attempt to fix it. But the "header is hidden entirely" _is_ a big problem, and that part is due to the bug I described -- so if that bug is fixed, we're back to "isn't a big problem", without making this change in semantics. > An alternative solution would be @jansvoboda11's > https://github.com/llvm/llvm-project/pull/71117, but I believe that will > cause issues with non-modular headers building into multiple modules. I haven't looked at that, but it sounds promising. The end result should be that `#import`ed and `#pragma once`-guarded files are treated the same way as `#ifndef`-guarded files. I wouldn't expect that to cause issues with non-modular headers included in multiple modules, given that works for `#ifndef`-guarded files, today. > Also this change isn't trying to modify #include in any kind of way, only > #import. So it's quite intentional that your test behaves the same before and > after. I know you weren't trying to fix that -- but I'm saying that's the thing that _should_ be fixed, and that doing so would fix the _serious_ part of the problem you described in the initial message. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: Also this change isn't trying to modify `#include` in any kind of way, _only_ `#import`. So it's quite intentional that your test behaves the same before and after. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: I don't really think it's the same thing. The problem I'm trying to fix is that nobody knows when it's appropriate to use `#import` vs `#include`, and the unfortunate convention of Objective-C makes it impossible for header owners to indicate if they support being included multiple times or not, as using header guards or `#pragma once` is very "un-Objective-C". Marking the header as `textual` is a fairly reasonable way for header owners to indicate that their header is meant to be included multiple times (e.g. stddef.h), even if ObjC developers don't know that and used `#import`. (The other use of `textual` that I've seen is for headers like `unwind_arm_ehabi.h` that are only allowed to have a single includer, and aren't really standalone.) If you don't want the header to build with the module, that's what `excluded` is for is it not? And alternative solution would be @jansvoboda11's https://github.com/llvm/llvm-project/pull/71117, but I believe that will cause issues with non-modular headers building into multiple modules. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
jyknight wrote: Oh -- I'd also note that even if this is reverted, ilya-biryukov may want to continue to investigate the source-location issue -- it's entirely possible that the correct fix will trigger that same problem! https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
jyknight wrote: I think the bug this change was attempting to fix is actually the same as #38554 and related bugs -- which, it appears, was not really fixed. The underlying problem here is that `#pragma once` should should work identically to ifndef guards, as far as what macros/decls are made visible to the includer. However, with the current modules implementation, it fails to do so. This PR addresses only one _symptom_ of that issue, but does not address the underlying problem. Additionally, I think the semantics change implemented here is a bad idea. It significantly complicates the semantics of `#import`, in a way that I think is justifiable, especially given that there is a way to fix the identified issue in a more principled way. Yes, `#import` was a bad and confusing idea, and should have been dropped 20 years ago, but adding more special cases to it does not make it better. A header file being marked "textual" in a module-map does _not_ mean a file should be be included multiple times -- it simply means that you didn't want to build it into a module. Linking these two concepts together ties together concepts which have no If this PR was the only way to fix the identified issue, maybe it could via an assumption that the current buggy `#pragma once` behavior is expected behavior which should be preserved. As such, I would propose this PR ought to be reverted, and work to fix the underlying issue be done instead. Here's a test case which I think demonstrates the actual bug -- it fails both before and after this PR. If you modify 'header.h' to use include guards, then it passes. This difference in behavior should not exist. ``` // RUN: rm -rf %t // RUN: split-file %s %t // RUN: %clang_cc1 -std=c17 -fmodules-cache-path=%t/no-lsv -fmodules -fimplicit-module-maps -I%t %t/pragma-once-modules.c -verify // RUN: %clang_cc1 -std=c17 -fmodules-cache-path=%t/lsv -fmodules -fimplicit-module-maps -fmodules-local-submodule-visibility -I%t %t/pragma-once-modules.c -verify //--- pragma-once-modules.c // expected-no-diagnostics #include "one.h" #include "header.h" my_int test(void) { return MY_MACRO; } //--- module.modulemap module Submodules { module one { header "one.h" export * } module two { header "two.h" export * } } //--- one.h #ifndef one_h #define one_h #endif //--- two.h #ifndef two_h #define two_h #include "header.h" #endif //--- header.h #pragma once #define MY_MACRO 5 typedef int my_int; ``` It produces the following unexpected output. ``` error: 'expected-error' diagnostics seen but not expected: [...]/pragma-once-modules.c Line 4: missing '#include "header.h"'; 'my_int' must be declared before it is used [...]/pragma-once-modules.c Line 5: use of undeclared identifier 'MY_MACRO' error: 'expected-note' diagnostics seen but not expected: [...]/header.h Line 3: declaration here is not visible ``` https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: Wow that's a _ton_ of PCM files. Why are your headers `textual` if they're only meant to be included once? Do they need to inherit the includer's preprocessor environment or something like that? https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ilya-biryukov wrote: I still don't understand why the number of `FileID`s for module maps has increased so much. I am observing this increase when compiling a file that has many module dependencies, it creates only one `FileID` for a module map I mentioned above, the rest 2259 are from loaded binary PCM files. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ilya-biryukov wrote: > Are your `textual` headers meant to be included more than once? If not, do > they use header guards or `#pragma once`? As I mentioned, the increase is coming from the module files. The textual headers are header-guarded and not intended to be included more than once. > clang never transforms #include to #import. It will effectively attempt to > transform #include and #import to @import but that's different. I meant to say that we use module maps to replace `#include` with module imports. Sorry for getting the exact terminology wrong. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: > It's definitely caused by that change, reverting it fixes the failure. > > We use module maps so that some of our `#include` get processed as `#import`, > I believe the same code gets executed for our use-case. Because our setup is > so unusual, producing a repro is hard and make take some time. (Build system > generated module maps and `#include` is essentially turned into `#import` by > Clang). clang never transforms `#include` to `#import`. It will effectively attempt to transform `#include` and `#import` to `@import` but that's different. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: Are your `textual` headers meant to be included more than once? If not, do they use header guards or `#pragma once`? https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ilya-biryukov wrote: Most of the increase seems to be coming from module maps that describe textual headers: ``` // This describes builtin headers. Before: gen-crosstool-x86.cppmap:1:1: note: file entered 630 times using 83989080B of space After: gen-crosstool-x86.cppmap:1:1: note: file entered 2260 times using 319523320B of space ``` I will dig further to see where it's coming from. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ilya-biryukov wrote: It's definitely caused by that change, reverting it fixes the failure. We use module maps so that some of our `#include` get processed as `#import`, I believe the same code gets executed for our use-case. Because our setup is so unusual, producing a repro is hard and make take some time. (Build system generated module maps and `#include` is essentially turned into `#import` by Clang). I will try to investigate this further today and come back with more details. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: > @ian-twilightcoder this change seemed to cause our internal builds to run out > of source location space. > > ``` > remark: source manager location address space usage: [-Rsloc-usage] > note: 408559B in local locations, 2146921126B in locations loaded from AST > files, for a total of 2147329685B (99% of available space) > ``` > > The main offenders from the follow-ups are: > > * module map files for standard library and C/C++ runtime headers, each > entered a few thousand times, > * various headers without header guards, e.g. libc++'s `__config` and others, > * `arm_neon.h`. > Without looking deeper into what the change does, this might be an expected > outcome here. However, let me know if something already looks off from this > small description (e.g. the module maps entered so many times is surprising, > I'll need to see if this happened before). > > I will investigate this further, but we might not be able to change how our > builds work quickly. Would it be ok to revert this change or put the new > behavior under a flag to give us some more time to investigate? That's surprising, the code to re-enter files is `HeaderSearch::ShouldEnterIncludeFile`, and I only changed the behavior there for `#import`. The `#include` behavior is still governed by `#pragma once` and `FileInfo.getControllingMacro`, and that isn't changed. Are you sure it's this change? Do you have a repro case? https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ilya-biryukov wrote: @ian-twilightcoder this change seemed to cause our internal builds to run out of source location space. ``` remark: source manager location address space usage: [-Rsloc-usage] note: 408559B in local locations, 2146921126B in locations loaded from AST files, for a total of 2147329685B (99% of available space) ``` The main offenders from the https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
https://github.com/ian-twilightcoder closed https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
https://github.com/jansvoboda11 approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
https://github.com/ian-twilightcoder updated https://github.com/llvm/llvm-project/pull/83660 >From 63ff00ec49ac20c5ac97bd673166dabb0fb56136 Mon Sep 17 00:00:00 2001 From: Ian Anderson Date: Fri, 1 Mar 2024 22:17:09 -0800 Subject: [PATCH] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` and will not be re-entered, even on #include. This means that any errant #import of a file designed to be included multiple times, such as , will incorrectly mark it as include-once and break the multiple include functionality. Normally this isn't a big problem, e.g. can't have its NDEBUG mode changed after the first #import, but it is still mostly functional. However, when clang modules are involved, this can cause the header to be hidden entirely. Objective-C code most often uses #import for everything, because it's required for most Objective-C headers to prevent double inclusion and redeclaration errors. (It's rare for Objective-C headers to use macro guards or `#pragma once`.) The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order. #import is a poor indicator of whether a header is actually include-once, as the #import is external to the header it applies to, and requires that all inclusions correctly and consistently use #import vs #include. When modules are enabled, consider a header marked `textual` in its module as a stronger indicator of multiple-include than #import's indication of include-once. This will allow headers like to always be included when modules are enabled, even if #import is erroneously used somewhere. --- clang/include/clang/Lex/HeaderSearch.h | 26 ++- clang/lib/Lex/HeaderSearch.cpp | 183 +-- clang/lib/Serialization/ASTReader.cpp| 2 +- clang/test/Modules/builtin-import.mm | 2 + clang/test/Modules/import-textual-noguard.mm | 6 +- clang/test/Modules/import-textual.mm | 2 + clang/test/Modules/multiple-import.m | 43 + 7 files changed, 201 insertions(+), 63 deletions(-) create mode 100644 clang/test/Modules/multiple-import.m diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/include/clang/Lex/HeaderSearch.h index 705dcfa8aacc3f..855f81f775f8a8 100644 --- a/clang/include/clang/Lex/HeaderSearch.h +++ b/clang/include/clang/Lex/HeaderSearch.h @@ -78,11 +78,19 @@ struct HeaderFileInfo { LLVM_PREFERRED_TYPE(bool) unsigned External : 1; - /// Whether this header is part of a module. + /// Whether this header is part of and built with a module. i.e. it is listed + /// in a module map, and is not `excluded` or `textual`. (same meaning as + /// `ModuleMap::isModular()`). LLVM_PREFERRED_TYPE(bool) unsigned isModuleHeader : 1; - /// Whether this header is part of the module that we are building. + /// Whether this header is a `textual header` in a module. + LLVM_PREFERRED_TYPE(bool) + unsigned isTextualModuleHeader : 1; + + /// Whether this header is part of the module that we are building, even if it + /// doesn't build with the module. i.e. this will include `excluded` and + /// `textual` headers as well as normal headers. LLVM_PREFERRED_TYPE(bool) unsigned isCompilingModuleHeader : 1; @@ -128,13 +136,20 @@ struct HeaderFileInfo { HeaderFileInfo() : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User), -External(false), isModuleHeader(false), isCompilingModuleHeader(false), -Resolved(false), IndexHeaderMapHeader(false), IsValid(false) {} +External(false), isModuleHeader(false), isTextualModuleHeader(false), +isCompilingModuleHeader(false), Resolved(false), +IndexHeaderMapHeader(false), IsValid(false) {} /// Retrieve the controlling macro for this header file, if /// any. const IdentifierInfo * getControllingMacro(ExternalPreprocessorSource *External); + + /// Update the module membership bits based on the header role. + /// + /// isModuleHeader will potentially be set, but not cleared. + /// isTextualModuleHeader will be set or cleared based on the role update. + void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role); }; /// An external source of header
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: > > Sometimes it does confuse clang, at least I saw problems with a `typedef > > enum` when I made an include-once header `textual`. > > Ok, I see. I would just consider it a bug, not a design decision. > > > That's correct. `#import` is an external source - often it comes from the > > users of the header and not the author, and the users might not be > > consistent with each other. `textual` comes from the author and a much > > stronger indicator of intent. > > Hmm, I need to think more about it. Usually I prefer for modular builds to > act like non-modular as it minimizes the surprises and allows to have a > portable code. But I haven't really considered the implications of such > preference. Yes generally, but modules build more headers than non-modular builds, nothing we can do about that. > I see why you prefer to give more priority to `textual`. To give a different > perspective, what if a developer wants to use a newer library, for example, > ncurses. The rest of SDK expects an older version and the developer might > need to do some header gymnastics to make it work. So there are cases when > developers need to go against library author's intent. I'm thinking that > library authors aren't able to predict all the ways their library is used, so > it is useful to have some flexibility around that, even at the cost of > misusing the library. But this is a general opinion, I don't have specific > data about the needs to tweak libraries or the risks of a library misuse. So > the opinion isn't particularly strong. That one seems like an edge case, and using `#import` wouldn't necessarily help you because the SDK module could build first before your `#import` is seen. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
vsapsai wrote: > Sometimes it does confuse clang, at least I saw problems with a `typedef > enum` when I made an include-once header `textual`. Ok, I see. I would just consider it a bug, not a design decision. > That's correct. `#import` is an external source - often it comes from the > users of the header and not the author, and the users might not be consistent > with each other. `textual` comes from the author and a much stronger > indicator of intent. Hmm, I need to think more about it. Usually I prefer for modular builds to act like non-modular as it minimizes the surprises and allows to have a portable code. But I haven't really considered the implications of such preference. I see why you prefer to give more priority to `textual`. To give a different perspective, what if a developer wants to use a newer library, for example, ncurses. The rest of SDK expects an older version and the developer might need to do some header gymnastics to make it work. So there are cases when developers need to go against library author's intent. I'm thinking that library authors aren't able to predict all the ways their library is used, so it is useful to have some flexibility around that, even at the cost of misusing the library. But this is a general opinion, I don't have specific data about the needs to tweak libraries or the risks of a library misuse. So the opinion isn't particularly strong. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: > To clarify a little bit > > > [...] The "already included" state is global across all modules (which is > > necessary so that non-modular headers don't get compiled into multiple > > translation units and cause redeclaration errors). > > The necessity isn't actually true. The same definition in multiple modules > shouldn't confuse clang as long as these definitions are identical. But this > is a side issue. Sometimes it does confuse clang, at least I saw problems with a `typedef enum` when I made an include-once header `textual`. > To re-iterate what this change is about: > > 1. `#import` implies a file is a single-include > 2. Textual header in a module map implies a file is a multi-include (aka > re-entrant) > 3. When we have both `#import` and the header marked as textual, `#import` > "wins", i.e. the file is single-include > 4. You want to make that when we have both `#import` and a textual header, > textual should "win", i.e. the file should be multi-include > > Is it correct? That's correct. `#import` is an external source - often it comes from the users of the header and not the author, and the users might not be consistent with each other. `textual` comes from the author and a much stronger indicator of intent. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
vsapsai wrote: To clarify a little bit > [...] The "already included" state is global across all modules (which is > necessary so that non-modular headers don't get compiled into multiple > translation units and cause redeclaration errors). The necessity isn't actually true. The same definition in multiple modules shouldn't confuse clang as long as these definitions are identical. But this is a side issue. To re-iterate what this change is about: 1. `#import` implies a file is a single-include 2. Textual header in a module map implies a file is a multi-include (aka re-entrant) 3. When we have both `#import` and the header marked as textual, `#import` "wins", i.e. the file is single-include 4. You want to make that when we have both `#import` and a textual header, textual should "win", i.e. the file should be multi-include Is it correct? https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
https://github.com/ian-twilightcoder updated https://github.com/llvm/llvm-project/pull/83660 >From 1cb3d459f3a9ae73ac98bf8c06b905d788be954f Mon Sep 17 00:00:00 2001 From: Ian Anderson Date: Fri, 1 Mar 2024 22:17:09 -0800 Subject: [PATCH] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` and will not be re-entered, even on #include. This means that any errant #import of a file designed to be included multiple times, such as , will incorrectly mark it as include-once and break the multiple include functionality. Normally this isn't a big problem, e.g. can't have its NDEBUG mode changed after the first #import, but it is still mostly functional. However, when clang modules are involved, this can cause the header to be hidden entirely. Objective-C code most often uses #import for everything, because it's required for most Objective-C headers to prevent double inclusion and redeclaration errors. (It's rare for Objective-C headers to use macro guards or `#pragma once`.) The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order. #import is a poor indicator of whether a header is actually include-once, as the #import is external to the header it applies to, and requires that all inclusions correctly and consistently use #import vs #include. When modules are enabled, consider a header marked `textual` in its module as a stronger indicator of multiple-include than #import's indication of include-once. This will allow headers like to always be included when modules are enabled, even if #import is erroneously used somewhere. --- clang/include/clang/Lex/HeaderSearch.h | 26 ++- clang/lib/Lex/HeaderSearch.cpp | 183 +-- clang/lib/Serialization/ASTReader.cpp| 2 +- clang/test/Modules/builtin-import.mm | 2 + clang/test/Modules/import-textual-noguard.mm | 6 +- clang/test/Modules/import-textual.mm | 2 + clang/test/Modules/multiple-import.m | 43 + 7 files changed, 201 insertions(+), 63 deletions(-) create mode 100644 clang/test/Modules/multiple-import.m diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/include/clang/Lex/HeaderSearch.h index 705dcfa8aacc3f..4c9fb58fbd35ef 100644 --- a/clang/include/clang/Lex/HeaderSearch.h +++ b/clang/include/clang/Lex/HeaderSearch.h @@ -78,11 +78,19 @@ struct HeaderFileInfo { LLVM_PREFERRED_TYPE(bool) unsigned External : 1; - /// Whether this header is part of a module. + /// Whether this header is part of and built with a module + /// (`isTextualModuleHeader` will be `false`). LLVM_PREFERRED_TYPE(bool) unsigned isModuleHeader : 1; - /// Whether this header is part of the module that we are building. + /// Whether this header is a textual header in a module (`isModuleHeader` will + /// be `false`). + LLVM_PREFERRED_TYPE(bool) + unsigned isTextualModuleHeader : 1; + + /// Whether this header is part of the module that we are building + /// (independent of `isModuleHeader` and `isTextualModuleHeader`, they can + /// both be `false`). LLVM_PREFERRED_TYPE(bool) unsigned isCompilingModuleHeader : 1; @@ -128,13 +136,20 @@ struct HeaderFileInfo { HeaderFileInfo() : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User), -External(false), isModuleHeader(false), isCompilingModuleHeader(false), -Resolved(false), IndexHeaderMapHeader(false), IsValid(false) {} +External(false), isModuleHeader(false), isTextualModuleHeader(false), +isCompilingModuleHeader(false), Resolved(false), +IndexHeaderMapHeader(false), IsValid(false) {} /// Retrieve the controlling macro for this header file, if /// any. const IdentifierInfo * getControllingMacro(ExternalPreprocessorSource *External); + + /// Update the module membership bits based on the header role. + /// + /// isModuleHeader will potentially be set, but not cleared. + /// isTextualModuleHeader will be set or cleared based on the role update. + void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role); }; /// An external source of header file information, which may supply @@ -522,6 +537,9 @@ class HeaderSearch {
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: > Thanks to @Bigcheese for helping with this patch. > > https://reviews.llvm.org/D26267 looks like the most recent significant update > to this code that we could find. It had an additional allowance where clang > headers that were _not_ marked `textual` could be re-entered. I think this > was to allow stddef.h (which is actually a multiple-include file despite it > not being `textual`) to be entered multiple times while building the Darwin > module. That shouldn't be necessary since nothing in the Darwin module or > libc++ defines the `__need_` macros. > > The unit tests added in D26267 still pass, so I don't think the builtin > special case is needed anymore, but I'm still doing testing. @bcardosolopes as far as I can tell the builtin special case isn't needed. Can you think of anything else I might need to test? https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
https://github.com/ian-twilightcoder updated https://github.com/llvm/llvm-project/pull/83660 >From f90fe8b1b7c73b68614ade3cf7e1c292f02d3573 Mon Sep 17 00:00:00 2001 From: Ian Anderson Date: Fri, 1 Mar 2024 22:17:09 -0800 Subject: [PATCH] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` and will not be re-entered, even on #include. This means that any errant #import of a file designed to be included multiple times, such as , will incorrectly mark it as include-once and break the multiple include functionality. Normally this isn't a big problem, e.g. can't have its NDEBUG mode changed after the first #import, but it is still mostly functional. However, when clang modules are involved, this can cause the header to be hidden entirely. Objective-C code most often uses #import for everything, because it's required for most Objective-C headers to prevent double inclusion and redeclaration errors. (It's rare for Objective-C headers to use macro guards or `#pragma once`.) The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order. #import is a poor indicator of whether a header is actually include-once, as the #import is external to the header it applies to, and requires that all inclusions correctly and consistently use #import vs #include. When modules are enabled, consider a header marked `textual` in its module as a stronger indicator of multiple-include than #import's indication of include-once. This will allow headers like to always be included when modules are enabled, even if #import is erroneously used somewhere. --- clang/include/clang/Lex/HeaderSearch.h | 26 +++- clang/lib/Lex/HeaderSearch.cpp | 154 --- clang/lib/Serialization/ASTReader.cpp| 2 +- clang/test/Modules/builtin-import.mm | 2 + clang/test/Modules/import-textual-noguard.mm | 6 +- clang/test/Modules/import-textual.mm | 2 + clang/test/Modules/multiple-import.m | 43 ++ 7 files changed, 173 insertions(+), 62 deletions(-) create mode 100644 clang/test/Modules/multiple-import.m diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/include/clang/Lex/HeaderSearch.h index 705dcfa8aacc3f8..4c9fb58fbd35ef1 100644 --- a/clang/include/clang/Lex/HeaderSearch.h +++ b/clang/include/clang/Lex/HeaderSearch.h @@ -78,11 +78,19 @@ struct HeaderFileInfo { LLVM_PREFERRED_TYPE(bool) unsigned External : 1; - /// Whether this header is part of a module. + /// Whether this header is part of and built with a module + /// (`isTextualModuleHeader` will be `false`). LLVM_PREFERRED_TYPE(bool) unsigned isModuleHeader : 1; - /// Whether this header is part of the module that we are building. + /// Whether this header is a textual header in a module (`isModuleHeader` will + /// be `false`). + LLVM_PREFERRED_TYPE(bool) + unsigned isTextualModuleHeader : 1; + + /// Whether this header is part of the module that we are building + /// (independent of `isModuleHeader` and `isTextualModuleHeader`, they can + /// both be `false`). LLVM_PREFERRED_TYPE(bool) unsigned isCompilingModuleHeader : 1; @@ -128,13 +136,20 @@ struct HeaderFileInfo { HeaderFileInfo() : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User), -External(false), isModuleHeader(false), isCompilingModuleHeader(false), -Resolved(false), IndexHeaderMapHeader(false), IsValid(false) {} +External(false), isModuleHeader(false), isTextualModuleHeader(false), +isCompilingModuleHeader(false), Resolved(false), +IndexHeaderMapHeader(false), IsValid(false) {} /// Retrieve the controlling macro for this header file, if /// any. const IdentifierInfo * getControllingMacro(ExternalPreprocessorSource *External); + + /// Update the module membership bits based on the header role. + /// + /// isModuleHeader will potentially be set, but not cleared. + /// isTextualModuleHeader will be set or cleared based on the role update. + void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role); }; /// An external source of header file information, which may supply @@ -522,6 +537,9 @@ class HeaderSearch
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
https://github.com/ian-twilightcoder updated https://github.com/llvm/llvm-project/pull/83660 >From 08681ff77f432806316109146ab4fef058137648 Mon Sep 17 00:00:00 2001 From: Ian Anderson Date: Fri, 1 Mar 2024 22:17:09 -0800 Subject: [PATCH] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` and will not be re-entered, even on #include. This means that any errant #import of a file designed to be included multiple times, such as , will incorrectly mark it as include-once and break the multiple include functionality. Normally this isn't a big problem, e.g. can't have its NDEBUG mode changed after the first #import, but it is still mostly functional. However, when clang modules are involved, this can cause the header to be hidden entirely. Objective-C code most often uses #import for everything, because it's required for most Objective-C headers to prevent double inclusion and redeclaration errors. (It's rare for Objective-C headers to use macro guards or `#pragma once`.) The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order. #import is a poor indicator of whether a header is actually include-once, as the #import is external to the header it applies to, and requires that all inclusions correctly and consistently use #import vs #include. When modules are enabled, consider a header marked `textual` in its module as a stronger indicator of multiple-include than #import's indication of include-once. This will allow headers like to always be included when modules are enabled, even if #import is erroneously used somewhere. --- clang/include/clang/Lex/HeaderSearch.h | 26 +++- clang/lib/Lex/HeaderSearch.cpp | 154 --- clang/lib/Serialization/ASTReader.cpp| 2 +- clang/test/Modules/builtin-import.mm | 2 + clang/test/Modules/import-textual-noguard.mm | 6 +- clang/test/Modules/import-textual.mm | 2 + clang/test/Modules/multiple-import.m | 43 ++ 7 files changed, 173 insertions(+), 62 deletions(-) create mode 100644 clang/test/Modules/multiple-import.m diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/include/clang/Lex/HeaderSearch.h index 705dcfa8aacc3f8..4c9fb58fbd35ef1 100644 --- a/clang/include/clang/Lex/HeaderSearch.h +++ b/clang/include/clang/Lex/HeaderSearch.h @@ -78,11 +78,19 @@ struct HeaderFileInfo { LLVM_PREFERRED_TYPE(bool) unsigned External : 1; - /// Whether this header is part of a module. + /// Whether this header is part of and built with a module + /// (`isTextualModuleHeader` will be `false`). LLVM_PREFERRED_TYPE(bool) unsigned isModuleHeader : 1; - /// Whether this header is part of the module that we are building. + /// Whether this header is a textual header in a module (`isModuleHeader` will + /// be `false`). + LLVM_PREFERRED_TYPE(bool) + unsigned isTextualModuleHeader : 1; + + /// Whether this header is part of the module that we are building + /// (independent of `isModuleHeader` and `isTextualModuleHeader`, they can + /// both be `false`). LLVM_PREFERRED_TYPE(bool) unsigned isCompilingModuleHeader : 1; @@ -128,13 +136,20 @@ struct HeaderFileInfo { HeaderFileInfo() : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User), -External(false), isModuleHeader(false), isCompilingModuleHeader(false), -Resolved(false), IndexHeaderMapHeader(false), IsValid(false) {} +External(false), isModuleHeader(false), isTextualModuleHeader(false), +isCompilingModuleHeader(false), Resolved(false), +IndexHeaderMapHeader(false), IsValid(false) {} /// Retrieve the controlling macro for this header file, if /// any. const IdentifierInfo * getControllingMacro(ExternalPreprocessorSource *External); + + /// Update the module membership bits based on the header role. + /// + /// isModuleHeader will potentially be set, but not cleared. + /// isTextualModuleHeader will be set or cleared based on the role update. + void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role); }; /// An external source of header file information, which may supply @@ -522,6 +537,9 @@ class HeaderSearch
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: > > Once a file has been `#import`'ed, it gets stamped as if it was `#pragma > > once` and will not be re-entered, even on #include. > > Can you explain how this is happening? The only place where > `HeaderFileInfo::isPragmaOnce` is set to `true` is in > `HeaderSearch::MarkFileIncludeOnce()`, only called from > `Preprocessor::HandlePragmaOnce()`, which seems correct. It gets `HeaderFileInfo::isImport` which causes it to be treated the same as `#pragma once` from then on. > > The problem arises when a submodule includes a multiple-include header. The > > "already included" state is global across all modules (which is necessary > > so that non-modular headers don't get compiled into multiple translation > > units and cause redeclaration errors). If another module or the main file > > #import's the same header, it becomes invisible from then on. If the > > original submodule is not imported, the include of the header will > > effectively do nothing and the header will be invisible. The only way to > > actually get the header's declarations is to somehow figure out which > > submodule consumed the header, and import that instead. That's basically > > impossible since it depends on exactly which modules were built in which > > order. > > Could this be also solved by tracking the "already included" state [per > submodule](https://github.com/llvm/llvm-project/pull/71117)? That would fix headers that can be included multiple times (which is what I'm trying to fix here). But I think that would break headers that can only be included once and are not part of a module, since their declarations would go into multiple pcms. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
jansvoboda11 wrote: > Once a file has been `#import`'ed, it gets stamped as if it was `#pragma > once` and will not be re-entered, even on #include. Can you explain how this is happening? The only place where `HeaderFileInfo::isPragmaOnce` is set to `true` is in `HeaderSearch::MarkFileIncludeOnce()`, only called from `Preprocessor::HandlePragmaOnce()`, which seems correct. > The problem arises when a submodule includes a multiple-include header. The > "already included" state is global across all modules (which is necessary so > that non-modular headers don't get compiled into multiple translation units > and cause redeclaration errors). If another module or the main file #import's > the same header, it becomes invisible from then on. If the original submodule > is not imported, the include of the header will effectively do nothing and > the header will be invisible. The only way to actually get the header's > declarations is to somehow figure out which submodule consumed the header, > and import that instead. That's basically impossible since it depends on > exactly which modules were built in which order. Could this be also solved by tracking the "already included" state [per submodule](https://github.com/llvm/llvm-project/pull/71117)? https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: ``` error: 'expected-error' diagnostics seen but not expected: File /var/lib/buildkite-agent/builds/linux-56-59b8f5d88-6mhbj-1/llvm-project/clang-ci/clang/test/Modules/explicit-build-overlap.cpp Line 11: module use does not directly depend on a module exporting 'a.h', which is part of indirectly-used module b 1 error generated. ``` https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
https://github.com/ian-twilightcoder updated https://github.com/llvm/llvm-project/pull/83660 >From 771c0740dcc438d99baf4ad9555bc57e963001df Mon Sep 17 00:00:00 2001 From: Ian Anderson Date: Fri, 1 Mar 2024 22:17:09 -0800 Subject: [PATCH] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` and will not be re-entered, even on #include. This means that any errant #import of a file designed to be included multiple times, such as , will incorrectly mark it as include-once and break the multiple include functionality. Normally this isn't a big problem, e.g. can't have its NDEBUG mode changed after the first #import, but it is still mostly functional. However, when clang modules are involved, this can cause the header to be hidden entirely. Objective-C code most often uses #import for everything, because it's required for most Objective-C headers to prevent double inclusion and redeclaration errors. (It's rare for Objective-C headers to use macro guards or `#pragma once`.) The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order. #import is a poor indicator of whether a header is actually include-once, as the #import is external to the header it applies to, and requires that all inclusions correctly and consistently use #import vs #include. When modules are enabled, consider a header marked `textual` in its module as a stronger indicator of multiple-include than #import's indication of include-once. This will allow headers like to always be included when modules are enabled, even if #import is erroneously used somewhere. --- clang/include/clang/Lex/HeaderSearch.h | 24 ++- clang/lib/Lex/HeaderSearch.cpp | 157 --- clang/lib/Serialization/ASTReader.cpp| 2 +- clang/test/Modules/builtin-import.mm | 2 + clang/test/Modules/import-textual-noguard.mm | 6 +- clang/test/Modules/import-textual.mm | 2 + clang/test/Modules/multiple-import.m | 43 + 7 files changed, 173 insertions(+), 63 deletions(-) create mode 100644 clang/test/Modules/multiple-import.m diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/include/clang/Lex/HeaderSearch.h index 705dcfa8aacc3f..bf8981b94d1b57 100644 --- a/clang/include/clang/Lex/HeaderSearch.h +++ b/clang/include/clang/Lex/HeaderSearch.h @@ -78,11 +78,17 @@ struct HeaderFileInfo { LLVM_PREFERRED_TYPE(bool) unsigned External : 1; - /// Whether this header is part of a module. + /// Whether this header is part of and built with a module. LLVM_PREFERRED_TYPE(bool) unsigned isModuleHeader : 1; - /// Whether this header is part of the module that we are building. + /// Whether this header is a textual header in a module (isModuleHeader will + /// be false) + LLVM_PREFERRED_TYPE(bool) + unsigned isTextualModuleHeader : 1; + + /// Whether this header is part of and built with the module that we are + /// building. LLVM_PREFERRED_TYPE(bool) unsigned isCompilingModuleHeader : 1; @@ -128,13 +134,20 @@ struct HeaderFileInfo { HeaderFileInfo() : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User), -External(false), isModuleHeader(false), isCompilingModuleHeader(false), -Resolved(false), IndexHeaderMapHeader(false), IsValid(false) {} +External(false), isModuleHeader(false), isTextualModuleHeader(false), +isCompilingModuleHeader(false), Resolved(false), +IndexHeaderMapHeader(false), IsValid(false) {} /// Retrieve the controlling macro for this header file, if /// any. const IdentifierInfo * getControllingMacro(ExternalPreprocessorSource *External); + + /// Update the module membership bits based on the header role. + /// + /// isModuleHeader will potentially be set, but not cleared. + /// isTextualModuleHeader will be set or cleared based on the role update. + void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role); }; /// An external source of header file information, which may supply @@ -522,6 +535,9 @@ class HeaderSearch { /// /// \return false if \#including the file will have no effect or true /// if we should include it. + /// + /// \param M Th
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
ian-twilightcoder wrote: Thanks to @Bigcheese for helping with this patch. https://reviews.llvm.org/D26267 looks like the most recent significant update to this code that we could find. It had an additional allowance where clang headers that were _not_ marked `textual` could be re-entered. I think this was to allow stddef.h (which is actually a multiple-include file despite it not being `textual`) to be entered multiple times while building the Darwin module. That shouldn't be necessary since nothing in the Darwin module or libc++ defines the `__need_` macros. The unit tests added in D26267 still pass, so I don't think the builtin special case is needed anymore, but I'm still doing testing. https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning: You can test this locally with the following command: ``bash git-clang-format --diff 07317bbc66d1f2d7663af3c9f04d0f6c0487ac03 171d0e299dd676ce29583e16fdf8c3e6f3dd7925 -- clang/include/clang/Lex/HeaderSearch.h clang/lib/Lex/HeaderSearch.cpp clang/lib/Serialization/ASTReader.cpp `` View the diff from clang-format here. ``diff diff --git a/clang/lib/Lex/HeaderSearch.cpp b/clang/lib/Lex/HeaderSearch.cpp index fb6ec456dd..f56781e785 100644 --- a/clang/lib/Lex/HeaderSearch.cpp +++ b/clang/lib/Lex/HeaderSearch.cpp @@ -1432,7 +1432,8 @@ void HeaderSearch::MarkFileModuleHeader(FileEntryRef FE, auto &HFI = getFileInfo(FE); HFI.mergeModuleMembership(Role); - HFI.isCompilingModuleHeader |= (HFI.isModuleHeader && isCompilingModuleHeader); + HFI.isCompilingModuleHeader |= + (HFI.isModuleHeader && isCompilingModuleHeader); } bool HeaderSearch::ShouldEnterIncludeFile(Preprocessor &PP, `` https://github.com/llvm/llvm-project/pull/83660 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
llvmbot wrote: @llvm/pr-subscribers-clang Author: Ian Anderson (ian-twilightcoder) Changes Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` and will not be re-entered, even on #include. This means that any errant #import of a file designed to be included multiple times, such as, will incorrectly mark it as include-once and break the multiple include functionality. Normally this isn't a big problem, e.g. can't have its NDEBUG mode changed after the first #import, but it is still mostly functional. However, when clang modules are involved, this can cause the header to be hidden entirely. Objective-C code most often uses #import for everything, because it's required for most Objective-C headers to prevent double inclusion and redeclaration errors. (It's rare for Objective-C headers to use macro guards or `#pragma once`.) The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order. #import is a poor indicator of whether a header is actually include-once, as the #import is external to the header it applies to, and requires that all inclusions correctly and consistently use #import vs #include. When modules are enabled, consider a header marked `textual` in its module as a stronger indicator of multiple-include than #import's indication of include-once. This will allow headers like to always be included when modules are enabled, even if #import is erroneously used somewhere. --- Full diff: https://github.com/llvm/llvm-project/pull/83660.diff 7 Files Affected: - (modified) clang/include/clang/Lex/HeaderSearch.h (+20-4) - (modified) clang/lib/Lex/HeaderSearch.cpp (+99-57) - (modified) clang/lib/Serialization/ASTReader.cpp (+1-1) - (modified) clang/test/Modules/builtin-import.mm (+2) - (modified) clang/test/Modules/import-textual-noguard.mm (+5-1) - (modified) clang/test/Modules/import-textual.mm (+2) - (added) clang/test/Modules/multiple-import.m (+43) ``diff diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/include/clang/Lex/HeaderSearch.h index 705dcfa8aacc3f..bf8981b94d1b57 100644 --- a/clang/include/clang/Lex/HeaderSearch.h +++ b/clang/include/clang/Lex/HeaderSearch.h @@ -78,11 +78,17 @@ struct HeaderFileInfo { LLVM_PREFERRED_TYPE(bool) unsigned External : 1; - /// Whether this header is part of a module. + /// Whether this header is part of and built with a module. LLVM_PREFERRED_TYPE(bool) unsigned isModuleHeader : 1; - /// Whether this header is part of the module that we are building. + /// Whether this header is a textual header in a module (isModuleHeader will + /// be false) + LLVM_PREFERRED_TYPE(bool) + unsigned isTextualModuleHeader : 1; + + /// Whether this header is part of and built with the module that we are + /// building. LLVM_PREFERRED_TYPE(bool) unsigned isCompilingModuleHeader : 1; @@ -128,13 +134,20 @@ struct HeaderFileInfo { HeaderFileInfo() : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User), -External(false), isModuleHeader(false), isCompilingModuleHeader(false), -Resolved(false), IndexHeaderMapHeader(false), IsValid(false) {} +External(false), isModuleHeader(false), isTextualModuleHeader(false), +isCompilingModuleHeader(false), Resolved(false), +IndexHeaderMapHeader(false), IsValid(false) {} /// Retrieve the controlling macro for this header file, if /// any. const IdentifierInfo * getControllingMacro(ExternalPreprocessorSource *External); + + /// Update the module membership bits based on the header role. + /// + /// isModuleHeader will potentially be set, but not cleared. + /// isTextualModuleHeader will be set or cleared based on the role update. + void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role); }; /// An external source of header file information, which may supply @@ -522,6 +535,9 @@ class HeaderSearch { /// /// \return false if \#including the file will have no effect or true /// if we should include it. + /// + /// \param M The module to which `File` belongs (this should usually be the + /// SuggestedModule returned by LookupFile/LookupSubframeworkHeader) bool ShouldEnterIncludeFile(Preprocessor &PP, FileEntryRef File,
[clang] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (PR #83660)
https://github.com/ian-twilightcoder created https://github.com/llvm/llvm-project/pull/83660 Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` and will not be re-entered, even on #include. This means that any errant #import of a file designed to be included multiple times, such as , will incorrectly mark it as include-once and break the multiple include functionality. Normally this isn't a big problem, e.g. can't have its NDEBUG mode changed after the first #import, but it is still mostly functional. However, when clang modules are involved, this can cause the header to be hidden entirely. Objective-C code most often uses #import for everything, because it's required for most Objective-C headers to prevent double inclusion and redeclaration errors. (It's rare for Objective-C headers to use macro guards or `#pragma once`.) The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order. #import is a poor indicator of whether a header is actually include-once, as the #import is external to the header it applies to, and requires that all inclusions correctly and consistently use #import vs #include. When modules are enabled, consider a header marked `textual` in its module as a stronger indicator of multiple-include than #import's indication of include-once. This will allow headers like to always be included when modules are enabled, even if #import is erroneously used somewhere. >From 171d0e299dd676ce29583e16fdf8c3e6f3dd7925 Mon Sep 17 00:00:00 2001 From: Ian Anderson Date: Fri, 1 Mar 2024 22:17:09 -0800 Subject: [PATCH] [clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` and will not be re-entered, even on #include. This means that any errant #import of a file designed to be included multiple times, such as , will incorrectly mark it as include-once and break the multiple include functionality. Normally this isn't a big problem, e.g. can't have its NDEBUG mode changed after the first #import, but it is still mostly functional. However, when clang modules are involved, this can cause the header to be hidden entirely. Objective-C code most often uses #import for everything, because it's required for most Objective-C headers to prevent double inclusion and redeclaration errors. (It's rare for Objective-C headers to use macro guards or `#pragma once`.) The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order. #import is a poor indicator of whether a header is actually include-once, as the #import is external to the header it applies to, and requires that all inclusions correctly and consistently use #import vs #include. When modules are enabled, consider a header marked `textual` in its module as a stronger indicator of multiple-include than #import's indication of include-once. This will allow headers like to always be included when modules are enabled, even if #import is erroneously used somewhere. --- clang/include/clang/Lex/HeaderSearch.h | 24 ++- clang/lib/Lex/HeaderSearch.cpp | 156 --- clang/lib/Serialization/ASTReader.cpp| 2 +- clang/test/Modules/builtin-import.mm | 2 + clang/test/Modules/import-textual-noguard.mm | 6 +- clang/test/Modules/import-textual.mm | 2 + clang/test/Modules/multiple-import.m | 43 + 7 files changed, 172 insertions(+), 63 deletions(-) create mode 100644 clang/test/Modules/multiple-import.m diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/incl