[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: I'll be busy during WWDC week too, so no real changes are expected in the next 2 weeks. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
jansvoboda11 wrote: > What's worse, it really depends on whether any `FileID` used by that file was > ever deserialized That reminds me of the issue @alexfh and I discussed here: * [[clang][modules] NFCI: Pragma diagnostic mappings: write/read FileID instead of SourceLocation](https://reviews.llvm.org/D137213) * https://github.com/llvm/llvm-project/pull/87427 * https://github.com/llvm/llvm-project/pull/87442 https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
ilya-biryukov wrote: > So far we are still aiming for a more principled solution. For example, if > support of a header in multiple modules is a supported feature, then some > kind of a spec with a more comprehensive testing would be appropriate. It is > up to you but personally I feel like that's a lot of work and it is more > pragmatic to skirt the issue and make sure your use case keeps working with > smaller investments. I agree we should have better testing and specification, but to your point it is a lot of work and for small issues like this we rather need a quick solution. It's not to say that we should never do it, just want to stay pragmatic about it. If this gets **really** gets in the way, we will absolutely have to invest here. > Personally, at the moment I'm interested in how -fmodules-embed-all-files > works and if its behavior is consistent. Though it is possible I like putting > stuff there because we don't rely on this flag and it feels less risky. > Anyway, the first impression of using -fmodules-embed-all-files makes me > suspicious it works inconsistently. The mental model that I have after staring at the implementation is that any file that `SourceManager` reads gets written into a PCM and should later be used whenever someone looks up the same file (at least with the same path, symlinks make this much-much more complicated and I don't think there's any good semantics there). However, this mental model breaks down when looking at your example. What's worse, it really depends on whether any `FileID` used by that file was ever deserialized, e.g. here's a slightly modified example: ```cpp // RUN: rm -rf %t // RUN: split-file %s %t // // Build modules 'a' and 'b' embedding all files. // RUN: %clang_cc1 -fmodules -fmodule-map-file=%t/modules1.map -fmodule-name=a -emit-module -xc++ -fmodules-embed-all-files -o %t/a.pcm %t/modules1.map // RUN: %clang_cc1 -fmodules -fmodule-map-file=%t/modules2.map -fmodule-name=b -emit-module -xc++ -fmodules-embed-all-files -o %t/b.pcm %t/modules2.map // // Try to access foo.h from a module. // RUN: rm %t/foo.h // RUN: %clang_cc1 -fmodules -fmodule-map-file=%t/modules1.map -fmodule-file=%t/a.pcm -fsyntax-only %t/use.cpp //--- modules1.map module a { header "foo.h" header "bar.h" export * } //--- modules2.map module b { private header "foo.h" private header "bar.h" export * } //--- foo.h [[deprecated("to show location")]] void foo(void) ; //--- bar.h #include "foo.h" //--- use.cpp #include "bar.h" void test() { foo(); // When this call is present, no error below. Because the deprecation warning gets shown and FileID for `foo.h` gets deserialized. // If you comment out the previous line, the #include below gives an error 🤷 } #include "foo.h" ``` I would actually consider this a bug, so my take on it: your example should compile just fine and my follow-up example should not depend on whether there is an error mentioning `foo.h` (or something else that causes a deserialization) before `#include`. My idea to fix this would be to store overridden buffers separately and make sure we always create virtual files for all of them by path, the contents should actually be loaded lazily to avoid wasting memory. I am happy to take a stab at this, but wanted to confirm this direction seems right in advance. As for the original issue, I'll wait for some more thoughts from your side on what the principled solution would be. Happy to pursue other directions and help implement them too. PS I won't be able to respond for the next 10 days, back on June 16. Sorry about the inconvenience. If you urgently need to land the original patch, please do, I have asked @alexfh to locally revert it until I'm back and can handle it. It seems a small enough change and we understand the root cause and impact pretty well now, so I am sure we can deal with it until there's a final solution in place. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: > Thanks a lot, I am sorry for misreading or misunderstanding your comments > too. I hope we can put this behind ourselves. Appreciate the understanding. > [skip the technical description] Thanks for the explanation, it gives a lot of food for thought. Still building a mental model of how `-fmodules-embed-all-files` and private headers work [and how they should work]. For example, it was surprising to me that ``` // RUN: rm -rf %t // RUN: split-file %s %t // // Build modules 'a' and 'b' embedding all files. // RUN: %clang_cc1 -fmodules -fmodule-map-file=%t/modules1.map -fmodule-name=a -emit-module -xc++ -fmodules-embed-all-files -o %t/a.pcm %t/modules1.map // RUN: %clang_cc1 -fmodules -fmodule-map-file=%t/modules2.map -fmodule-name=b -emit-module -xc++ -fmodules-embed-all-files -o %t/b.pcm %t/modules2.map // // Try to access foo.h from a module. // RUN: rm %t/foo.h // RUN: %clang_cc1 -fmodules -fmodule-map-file=%t/modules1.map -fmodule-file=%t/a.pcm -fsyntax-only %t/use.cpp //--- modules1.map module a { header "foo.h" export * } //--- modules2.map module b { private header "foo.h" export * } //--- foo.h void foo(void); //--- use.cpp #include "foo.h" void test() { foo(); } ``` still cannot find "foo.h" despite `-fmodules-embed-all-files`. > ## Looping back to #141792. > (I am not sure if this was already addressed, sorry if I missed the > comments). How does the idea of always preferring to load a header from PCM > sound to folks? Given that access checking of headers and module resolution > are independent in Clang right now and given that same header in multiple > modules is rare, it seems like a behavior change we could potentially agree > on. > > That being said, I also wouldn't be surprised if folks don't like this, it's > certainly not the cleanest design one could imagine. I would want to hear > opinions on that approach. So far we are still aiming for a more principled solution. For example, if support of a header in multiple modules is a supported feature, then some kind of a spec with a more comprehensive testing would be appropriate. It is up to you but personally I feel like that's a lot of work and it is more pragmatic to skirt the issue and make sure your use case keeps working with smaller investments. Personally, at the moment I'm interested in how `-fmodules-embed-all-files` works and if its behavior is consistent. Though it is possible I like putting stuff there because we don't rely on this flag and it feels less risky. Anyway, the first impression of using `-fmodules-embed-all-files` makes me suspicious it works inconsistently. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
ilya-biryukov wrote: > I'm sorry that my replies came across as hostile and snarky, I didn't mean it > that way. Though I'll admit I'm not thrilled about your current approach, > that's what I was communicating. But that's what you have right now and there > is no easy way to change it fast. Thanks a lot, I am sorry for misreading or misunderstanding your comments too. I hope we can put this behind ourselves. > and the lack of the commitment to any work on your side made it sound as a > demand. Sorry that it came out this way. I meant to say that changing this would likely be a very lengthy and labor-intenstive commitment on our side. Simply because it's so closely interacting with the build system and the build system is used by all of our code, I expect that we'll hit every corner case imaginable no matter what change we do. It does not mean that we should not do it if that's for the better of Clang, just want to make sure we have the tools at our disposal to land a change internally. > I think I understand the logic behind your approach. Right, I believe you're right on spot with the conclusions. > Access to headers from .cc files is restricted through the used module maps > and maybe through some Bazel sand-boxing. Doesn't particularly matter in this > case. One clarification here, for .cc files we simply pass the same module maps and set `-fmodule-name=`, making Clang treat `.cc` file as internal to the module, the rest of the access checking rules applied are the same. > What about the approach to provide all_textual.pcm to use.cpp and for clang > to find foo.h in all_textual.pcm? It doesn't work right now but I'm not sure > that it shouldn't (there can be good reasons why it shouldn't work, tbh). We don't actually produce `.pcm` files for the modules that only have textual headers. The rational is that doing so would be detrimental to build performance and resource usage on our distributed build infrastructure. Normally `.pcm` files are a net win because they allow to save work required to generate the AST in their dependecies. However, if all the headers in a module are textual, we merely carry around an empty PCM and add unnecessary dependencies in the build action graph (`.cc` files that depend only on textual headers will require the PCM even though it contains no useful information). We also have content-addressed caches for the inputs to build actions that benefit from the fine granularity. When producing a `.pcm`, embedding the contents turns our to be a net win downstream because we have to carry it around anyway and any change in its headers will cause the PCM itself to change. But shipping all textual headers like this would make this cache to be evicted much more frequently, increase the size of inputs to actions depending on those libraries and decrease the cache hit rates. If we were to make such a change in Clang, we'll have to make a few more changes: - `-fmodules-embed-all-files` currently only writes files that were actually read during the compilation. In order to be able to ship `all_textual.pcm` we would need to add additional logic to write all files mentioned in module maps. - In addition, after landing the current PR, we would end up in a situation where the modules the compilation may succeed or fail when the textual headers are missing depending on the presence of `-fmodules-embed-all-files` flag. This feels weird, although I feel that the flag itself is rather specialized, so it's probably not the end of the world for those using it (I only know of our use-case, though, so may be missing some important problems that others have). > I'm relieved that you aren't a heavy user of textual headers to make macros > work (this is one of the use cases for textual headers). That's one of the use cases too, but indirectly. If the headers define macros, users are expected to mark them as such. Sometimes this is necessary, e.g. certain C++ libraries do rely on macros and our code can have those libraries as dependencies. ## Looping back to #141792. (I am not sure if this was already addressed, sorry if I missed the comments). How does the idea of always preferring to load a header from PCM sound to folks? Given that access checking of headers and module resolution are independent in Clang right now and given that same header in multiple modules is rare, it seems like a behavior change we could potentially agree on. That being said, I also wouldn't be surprised if folks don't like this, it's certainly not the cleanest design one could imagine. I would want to hear opinions on that approach. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: What about the approach to provide all_textual.pcm to use.cpp and for clang to find foo.h in all_textual.pcm? It doesn't work right now but I'm not sure that it shouldn't (there can be good reasons why it shouldn't work, tbh). https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: I think I understand the logic behind your approach. * Access between headers is enforced through decl-use, so you cannot access headers from a target that's not your direct dependency. * But some headers shouldn't be accessed outside of a target [regardless of the dependencies], only inside the target. These become `private header`. * Access to headers from .cc files is restricted through the used module maps and maybe through some Bazel sand-boxing. Doesn't particularly matter in this case. And the same header in multiple modules just happened to be the case, it wasn't a deliberate design choice. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
AaronBallman wrote: We reviewed this thread today at the Clang Area Team meeting and it seems like y'all have perhaps found a solution everyone is happy with. However, if you'd like a meeting or if there's some other way we can support you, we're definitely happy to help! > Please let me know if this is the right escalation path, if not, I am happy > to find another one if this is wrong. The usual path is: relevant maintainer(s), then lead maintainer, then relevant area team(s), then project council. But there's really no "wrong" group to escalate to, so this was a perfectly reasonable escalation path. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
ilya-biryukov wrote: Many thanks for engaging in this discussion and sorry for the delays in my responses too. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
ilya-biryukov wrote: > In general, though, I share the same concern as @vsapsai about leaving this > patch reverted for too long. I have a concrete proposal to move forward, #141792. I have checked that the failure we've seen in our infrastructure goes away with both changes applied. Given that multiple modules owning the same header seem to be a rare case outside of Google, I hope this also won't cause any breakages and the change acceptable (it's only changing the logic that kicks in when we do have multiple modules for the same header). I really appreciate the steps forward here and would be happy to hop on a video call if necessary. I would still suggest to discuss #141792 first and if that stalls, we can schedule a call. Does that work? Unfortunately, I will be OOO until next Monday and won't be able to respond here. If folks feel there is urgency in relanding this or are happy with merging #141792 without discussion, feel free to do so. (We can definitely live with a local revert for a week). The only major objection that I would raise would be to a path forward that breaks the test case I've added above without a migration path that we can employ. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
cyndyishida wrote: Firstly, thank you for your responsiveness @ilya-biryukov, and for helping us understand how this bug fix impacts Google's build. For all individuals involved in this discussion, I would like to remind that our [code of conduct](https://llvm.org/docs/CodeOfConduct.html) does ask us to understand each other when we disagree. In general, though, I share the same concern as @vsapsai about leaving this patch reverted for too long. But in fear of speaking too much on behalf of others, perhaps it would be helpful for us to have a video call, coordinated by the area team, or otherwise, to understand how to move forward if we cannot come to a resolution soon. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
ilya-biryukov wrote: > Can you clarify, are you saying this pattern of having a header in two > different modules has to keep working indefinitely, or are you willing to > migrate off of it? > I don't understand what the reason is for the header to be in two different > modules in the first place. It sounds like it is somehow related to a build > optimization, but what exactly is that? Our whole build infrastructure relies on producing those module maps. In Bazel (and its internal equivalent), you can have multiple targets (`cc_library`) that own the same header. Bazel generates a module map for each of the `cc_library` like this: - each file in `hdrs` becomes a `header` (can be modular or `textual`, depending on the target, e.g. all protos produce modular targets, we can also make some targets modular), - each **header** file in `srcs` becomes a **private** header. - `.inc` files and `textual_hdrs` become `textual headers`. Things are a little more complicated on top of that, we actually have a submodule per header in each module matching a target. And we also rely on `-fmodule-strict-decluse` to make sure we never have any headers that are not covered by module maps. That way we enforce the structure from our BUILD rules: headers are only allowed to include headers from their direct dependencies and only the non-private ones. I believe the way access checking of headers is done today, Clang allows us to have a dependency on a header that you **could** get through a module that has it publicly available, even if it ends up picking a module where the header was actually private. In turn, this allows us to add optimization by limiting the number of header and modules we need to ship to distributed build for each of the compile actions. I know there are many details missing from this explanation, but I hope this gives a start. Changing how our build is done within all of Google is obviously hard. I understand that having multiple modules for a header may not be something others care about, but we need it and are willing to put the work to make it work too. Does that explanation help to clarify our use case or do folks still have questions? > Do you include proto headers directly or through the wrapper headers? Directly. > Are there any important macros in proto headers? I don't think there are. I am not sure why it's important. > Do you need to abuse macros to get something out of proto headers? Kinda like > `#define private public` but more realistic. > > My initial reaction is to offer a solution by not using private headers > (especially that it is so easily subverted by textual headers) and to rely on > `-fmodules-embed-all-files` (which you already do). I believe you have > constraints that make this "simple" solution not feasible. So I'm curious to > know what about this solution doesn't work. As mentioned above, we rely on private headers to check we do not accidentally include a header in `srcs` from a file that's not part of the same target (caveat: unless there's some other target we depend on exposing the same header as non-private). > Do you realize you can ask for a different deadline? If you don't tell how > much time you need, you get an external deadline imposed on you. I find exchanges like this non-respectful and non-considerate. Please give us some benefit of a doubt, we are trying to explain our use-case and we are happy to put in the work to make both our cases work. > What makes you think that no discussion is possible? For the record, "we need > this to work, you must make it work" is not a discussion. Could you directly quote something in my writing that you interpreted like this? I definitely did not mean this to come across as a demand. On the contrary, I have interpreted the xkcd link and some of the replies as hostile and snarky. I would like to move away from that style and find a productive way forward here. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: Do you include proto headers directly or through the wrapper headers? Are there any important macros in proto headers? Do you need to abuse macros to get something out of proto headers? Kinda like `#define private public` but more realistic. My initial reaction is to offer a solution by not using private headers (especially that it is so easily subverted by textual headers) and to rely on `-fmodules-embed-all-files` (which you already do). I believe you have constraints that make this "simple" solution not feasible. So I'm curious to know what about this solution doesn't work. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
cyndyishida wrote: > And, in the end, the folks debugging this (Ilya & co) are busy working on > clang header module reproduction and reduction tools: > https://github.com/emaxx-google/cvise/branches @emaxx-google Interestingly enough, @vsapsai is also working on reproducers for explicitly built clang modules, and that is how he ran into the compiler bug about a not-included textual header. When discovering which files to capture for the reproduction, we technically shouldn't need to also include this header input since it's not needed to build the module, it seems like that principle would also help for reducing reproducers. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
rnk wrote: (This is mostly written wearing my "Googler" hat, not the clang area team lead hat) There's been a lot of discussion of what the exact use case is. I think it's helpful to share the context that Google essentially uses Clang header modules to wrap generated proto headers, and this is critical for enabling our builds to scale. The details here about private modules, module wrappers, and textual inclusions are Blazel-generated, low-level implementation details that don't matter in the grand scheme of things. The intention is to provide a IWYU-clean, header module programming model that is efficient as possible, meaning sometimes the large generated proto header needs to be included textually, and sometimes handled as a module import. I believe the arrangement of the module maps is changeable, but it is a low-level detail at the bottom of a performance critical system, so the lead time for changes is probably measured in months, not days, and probably requires significant effort. Focusing back on the commit at hand, it would have forced our internal build system to provide thousands more header files as inputs to compilations across the system, and we can't afford to do that. We backed it out of the last few internal release candidates, but carrying it over requires manual effort. The purpose of the commit seems to have been to emit fewer errors, not more. I'm sure Google's usage of header modules relies on internal implementation details that it shouldn't, but we appreciate having more time to debug the issue, so thank you for the accommodation. And, in the end, the folks debugging this (Ilya & co) are busy working on clang header module reproduction and reduction tools: https://github.com/emaxx-google/cvise/branches @emaxx-google Effective open source collaboration is all about balancing the needs of contributors, and while this work on cvise and reblaze isn't happening in LLVM, I hope the value of these contributions outweighs the inconvenience of dealing with Google's reliance on header module implementation details. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: > Back to the original issue. Do you control the module map that specifies a header as private? Does it need to stay private? Personally, to me it looks like a bug that a private header is getting picked up [somewhat accidentally]. I'm not planning to fix this bug, just that relying on outright buggy behavior doesn't look promising long-term. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: > It is concerning to me that we as a company are given **a week** to adjust to > the change and there is no discussion either about alternative approaches or > the scale of the changes needed to support this. We do rely on all kinds of > weird behaviors and side effects in Clang, and are happy to adjust to make > the best for the community. I am pretty sure we can find a way to make this > work too, but not under undue pressure. Do you realize you can ask for a different deadline? If you don't tell how much time you need, you get an external deadline imposed on you. What makes you think that no discussion is possible? For the record, "we need this to work, **you must** make it work" is not a discussion. > I agree that maintaining anything for a single company is not something that > the community must or should do, but we are also not asking for maintaining > this behavior indefinitely. Sorry, my bad, I've missed how much time do you need. I.e., for how long this behavior should be maintained. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
benlangmuir wrote: > I don't think we can ever get rid of that pattern and we will keep relying on > Clang supporting this. > We would need to example I shared above to keep working. > I think the change we want is to always prioritize modular over textual, > before we look at non-private vs private. Can you clarify, are you saying this pattern of having a header in two different modules has to keep working indefinitely, or are you willing to migrate off of it? I don't understand what the reason is for the header to be in two different modules in the first place. It sounds like it is somehow related to a build optimization, but what exactly is that? https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
benlangmuir wrote: > I don't think we can ever get rid of that pattern and we will keep relying on > Clang supporting this. > We would need to example I shared above to keep working. I don't have a strong opinion on the best path forward here, but I'm not clear if you're actually willing to https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
ilya-biryukov wrote: Back to the original issue. > I'm willing to help figure out how to achieve the desired result in a > different way. But for that need to know what is the desired result. We would need to example I shared above to keep working. We rely on an optimization that picks a module with a modular header and imports it rather than a textual one whenever it's available. The reason we cannot change that easily is that: - our module maps are generated by the build system, - our builds are run remotely and we have an optimization that reduces the number of inputs by not shipping the headers if they is an available module that has the header as modular, - our codebase is really big and we have no easy way to find and replace the places we have. Any change on our side is either in compile time or will take long to implement. >From first principles, we need to realign the build optimization for reducing >number of inputs and Clang's module semantics. I would need to confirm that, >but I think that'll boil down to preferring to use a module where header is >modular over another where header is textual, even if the underlying header is >marked as private. Currently we happen to have this behavior by accident >(module with textual header is marked unavailable). We already prefer modules where header is modular over textual, but we the non-private vs private header takes higher priority. I think the change we want is to always prioritize modular over textual, before we look at non-private vs private. The open questions that I have are: - Would that change be breaking? - What are the consequences of picking a different module? (Given that multiple modules for a single header is unusual, it might be that it would not matter, but per Hyrum's law, we want cause problems) If that does not work by default, we would probably need a (`-cc1`? ) flag that we can keep using on our side. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
ilya-biryukov wrote: > Ok, I'm going to revert the change to help you out. But I'm going to re-land > it in a week or when you are ready, whichever comes first. > > There was no indication there is anything wrong with the change or if the > issue is wide-spread. And if a single company relies on an existing side > effect, there is no obligation for everyone else [to maintain such a side > effect](https://xkcd.com/1172/) forever. > > I'm willing to help figure out how to achieve the desired result in a > different way. But for that need to know what is the desired result. I would like a second opinion on your proposed course of action from @llvm/clang-area-team. Please let me know if this is the right escalation path, if not, I am happy to find another one if this is wrong. It is concerning to me that we as a company are given **a week** to adjust to the change and there is no discussion either about alternative approaches or the scale of the changes needed to support this. We do rely on all kinds of weird behaviors and side effects in Clang, and are happy to adjust to make the best for the community. I am pretty sure we can find a way to make this work too, but not under undue pressure. I agree that maintaining anything for a single company is not something that the community must or should do, but we are also not asking for maintaining this behavior indefinitely. We ask to give us reasonable time to understand the problem and adjust and engage with us to help fix this. I also do not know how to react to the xkcd, it seems rather snarky and dismissive. I hope it was not intended that way, but I still wanted to call it out that it came over like this to me. With all that being said, thank you for reverting the commit, this really helps relieve some pressure from the discussion. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: Ok, I'm going to revert the change to help you out. But I'm going to re-land it in a week or when you are ready, whichever comes first. There was no indication there is anything wrong with the change or if the issue is wide-spread. And if a single company relies on an existing side effect, there is no obligation for everyone else [to maintain such a side effect](https://xkcd.com/1172/) forever. I'm willing to help figure out how to achieve the desired result in a different way. But for that need to know what is the desired result. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
ilya-biryukov wrote: > And I'm not sure such a use case is worth supporting. > But I understand that is only my own interpretation which can be incorrect. > And I want to believe you have a better use case that doesn't rely on > accessing private headers The code I shared compiles with no error before this change and has errors now. It is okay to discuss if our use-case is "worth supporting", but we heavily rely on this behavior and this has already broken us and this puts us in a corner. Could we revert this and have a discussion on how to satisfy both our use cases on equal grounds? Is there any reason why this commit must be present and head and can't wait until we figure out a mutual way out of this? https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: > @vsapsai I guess, it's a good sign? :) Do you see how our use case can be > supported by a trivial and low-risk forward fix? If not, I'd insist on a > revert before we can figure out the way forward. We can run this sort of a > change through our testing before it relands, and ensure it doesn't break our > code. We did this on multiple occasions before (especially for changes that > touch Clang header modules). I still don't understand what is your use case. My [very ungenerous] interpretation is "to access a file you were told not to access without triggering any tripwires". And I'm not sure such a use case is worth supporting. But I understand that is only my own interpretation which can be incorrect. And I want to believe you have a better use case that doesn't rely on accessing private headers. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
alexfh wrote: > Interesting, changing `private header "wrap_foo.h"` to `header "wrap_foo.h"` > in `b.wrap_foo` stops reproducing the error. I'm not sure it guarantees > > > whenever resolving module for the header, pick one that has the header as > > modular over textual. > > But seems we are closer to this behavior than I expected. @vsapsai I guess, it's a good sign? :) Do you see how our use case can be supported by a trivial and low-risk forward fix? If not, I'd insist on a revert before we can figure out the way forward. We can run this sort of a change through our testing before it relands, and ensure it doesn't break our code. We did this on multiple occasions before (especially for changes that touch Clang header modules). https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: Interesting, changing `private header "wrap_foo.h"` to `header "wrap_foo.h"` in `b.wrap_foo` stops reproducing the error. I'm not sure it guarantees > whenever resolving module for the header, pick one that has the header as > modular over textual. But seems we are closer to this behavior than I expected. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: It's weird that clang picked `b.wrap_foo` for "wrap_foo.h" as it is specified as `private header`. To me it looks like circumventing "use of private header from outside its module [-Wprivate-header]". https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
ilya-biryukov wrote: > What is your timeframe for stopping putting the same header ("wrap_foo.h") > into multiple modules? If you do that it was never guaranteed which module > would be used for a header. I'm willing to unblock you but I'd like to know > if you are committed to fixing the issue in the module maps. I don't think we can ever get rid of that pattern and we will keep relying on Clang supporting this. I am sure there are ways to meet both your needs and our needs. To reiterate, Clang supports having the same header in multiple module maps and we don't view our current setup as having at a high level. I just want to explicit that our stance is that there are no issues that need "fixing in the module maps". Relying on Clang picking a header from a module when it's available is also something we'd like to keep. How we achieve this today is shaky (relying on availability doesn't look right), but we absolutely need: - to allow multiple modules own the same header, - whenever resolving module for the header, pick one that has the header as modular over textual. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: > I know what we do is cheesy and we might need to reconsider how to approach > it, but this is a breaking change and it would be great to find a resolution > for it before committing to the new behavior. Could we revert this while we > discuss the potential ways out of it? What is your timeframe for stopping putting the same header ("wrap_foo.h") into multiple modules? If you do that it was never guaranteed which module would be used for a header. I'm willing to unblock you but I'd like to know if you are committed to fixing the issue in the module maps. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
ilya-biryukov wrote: Here's a reproducer for our breakage: ```cpp // RUN: rm -rf %t // RUN: split-file %s %t // // First, build a module with a header. // // RUN: %clang_cc1 -fmodules -fmodule-map-file=%t/modules1.map -fmodule-name=a -emit-module -xc++ -fmodules-embed-all-files -o %t/a.pcm %t/modules1.map // RUN: %clang_cc1 -fmodules -fmodule-map-file=%t/modules1.map -fmodule-map-file=%t/modules2.map -fmodule-name=b -emit-module \ // RUN: -fmodule-file=%t/a.pcm -xc++ -fmodules-embed-all-files -o %t/b.pcm %t/modules2.map // // Remove the header and check we can still build the code that finds it in a PCM. // // RUN: rm %t/foo.h // RUN: %clang_cc1 -fmodules -fmodule-map-file=%t/modules2.map -fmodule-file=%t/b.pcm -fsyntax-only %t/use.cpp //--- modules1.map module a { module foo { header "foo.h" export * } export * } //--- modules2.map module all_textual { module foo { textual header "foo.h" export * } module wrap_foo { textual header "wrap_foo.h" export * } export * } module b { module wrap_foo { private header "wrap_foo.h" export * } export * } //--- foo.h #ifndef FOO_H #define FOO_H void foo(); #endif //--- wrap_foo.h #include "foo.h" //--- use.cpp #include "wrap_foo.h" void test() { foo(); } ``` It previously built fine because the `all_textual.foo` and its parent module used to be marked as unavailable because `foo.h` could not be found here: https://github.com/llvm/llvm-project/blob/09fd8f0093b8ff489d76285d893be152e4ca4c24/clang/lib/Lex/ModuleMap.cpp#L326 and therefore it made us pick the module `b.wrap_foo` later when resolving `wrap_foo.h` instead of `all_textual.wrap_foo` based on the code here: https://github.com/llvm/llvm-project/blob/09fd8f0093b8ff489d76285d893be152e4ca4c24/clang/lib/Lex/ModuleMap.cpp#L618 After the change, we no longer mark `all_textual` as unavailable and pick `all_textual.wrap_foo`, which causes us to reprocess the header. I know what we do is cheesy and we might need to reconsider how to approach it, but this is a breaking change and it would be great to find a resolution for it before committing to the new behavior. Could we revert this while we discuss the potential ways out of it? https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
bgra8 wrote: We have bisected build breakages internally at google at this change. Ironically the compiler complains about a missing header while before it didn't. We do not have a reproducer yet. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
https://github.com/vsapsai closed https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: Thanks for the review! https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
https://github.com/Bigcheese approved this pull request. Just looking at the `missing_textual_header` module alone this is a bit odd, but that's just the semantics of textual headers. Really textual headers just exist to control the non-modular include warning, so this is fine. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
vsapsai wrote: The change for `exclude` headers was done in 040e12662a674e2ebfc93f86a70eddb7d6fc76da. https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
llvmbot wrote: @llvm/pr-subscribers-clang-modules Author: Volodymyr Sapsai (vsapsai) Changes According to the documentation > A header declaration that does not contain `exclude` nor `textual` specifies a header that contributes to the enclosing module. Which means that `exclude` and `textual` header don't contribute to the enclosing module and their presence isn't required to build such a module. The keywords tell clang how a header should be treated in a context of the module but they don't add headers to the module. When a textual header *is* used, clang still emits "file not found" error pointing to the location where the missing file is included. --- Full diff: https://github.com/llvm/llvm-project/pull/138227.diff 3 Files Affected: - (modified) clang/lib/Lex/ModuleMap.cpp (+4-2) - (modified) clang/test/Modules/Inputs/submodules/module.modulemap (+4) - (modified) clang/test/Modules/missing-header.m (+3) ``diff diff --git a/clang/lib/Lex/ModuleMap.cpp b/clang/lib/Lex/ModuleMap.cpp index c2f13fa48d464..1ba02b919a22a 100644 --- a/clang/lib/Lex/ModuleMap.cpp +++ b/clang/lib/Lex/ModuleMap.cpp @@ -310,8 +310,10 @@ void ModuleMap::resolveHeader(Module *Mod, } else if (Header.HasBuiltinHeader && !Header.Size && !Header.ModTime) { // There's a builtin header but no corresponding on-disk header. Assume // this was supposed to modularize the builtin header alone. - } else if (Header.Kind == Module::HK_Excluded) { -// Ignore missing excluded header files. They're optional anyway. + } else if ((Header.Kind == Module::HK_Excluded) || + (Header.Kind == Module::HK_Textual)) { +// Ignore excluded and textual header files as a module can be built with +// such headers missing. } else { // If we find a module that has a missing header, we mark this module as // unavailable and store the header directive for displaying diagnostics. diff --git a/clang/test/Modules/Inputs/submodules/module.modulemap b/clang/test/Modules/Inputs/submodules/module.modulemap index 1c1b76a08969e..9e8143b8101de 100644 --- a/clang/test/Modules/Inputs/submodules/module.modulemap +++ b/clang/test/Modules/Inputs/submodules/module.modulemap @@ -30,3 +30,7 @@ module missing_umbrella_with_inferred_submodules { module * { export * } export * } + +module missing_textual_header { + textual header "missing_textual.h" +} diff --git a/clang/test/Modules/missing-header.m b/clang/test/Modules/missing-header.m index c162e1b5f08b3..84d82e5ceda32 100644 --- a/clang/test/Modules/missing-header.m +++ b/clang/test/Modules/missing-header.m @@ -8,6 +8,9 @@ @import missing_unavailable_headers.not_missing; // OK // CHECK-NOT: missing_unavailable_headers +@import missing_textual_header; // OK +// CHECK-NOT: missing_textual_header + @import missing_headers; // CHECK: module.modulemap:15:27: error: header 'missing.h' not found // CHECK: could not build module 'missing_headers' `` https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
llvmbot wrote: @llvm/pr-subscribers-clang Author: Volodymyr Sapsai (vsapsai) Changes According to the documentation > A header declaration that does not contain `exclude` nor `textual` specifies a header that contributes to the enclosing module. Which means that `exclude` and `textual` header don't contribute to the enclosing module and their presence isn't required to build such a module. The keywords tell clang how a header should be treated in a context of the module but they don't add headers to the module. When a textual header *is* used, clang still emits "file not found" error pointing to the location where the missing file is included. --- Full diff: https://github.com/llvm/llvm-project/pull/138227.diff 3 Files Affected: - (modified) clang/lib/Lex/ModuleMap.cpp (+4-2) - (modified) clang/test/Modules/Inputs/submodules/module.modulemap (+4) - (modified) clang/test/Modules/missing-header.m (+3) ``diff diff --git a/clang/lib/Lex/ModuleMap.cpp b/clang/lib/Lex/ModuleMap.cpp index c2f13fa48d464..1ba02b919a22a 100644 --- a/clang/lib/Lex/ModuleMap.cpp +++ b/clang/lib/Lex/ModuleMap.cpp @@ -310,8 +310,10 @@ void ModuleMap::resolveHeader(Module *Mod, } else if (Header.HasBuiltinHeader && !Header.Size && !Header.ModTime) { // There's a builtin header but no corresponding on-disk header. Assume // this was supposed to modularize the builtin header alone. - } else if (Header.Kind == Module::HK_Excluded) { -// Ignore missing excluded header files. They're optional anyway. + } else if ((Header.Kind == Module::HK_Excluded) || + (Header.Kind == Module::HK_Textual)) { +// Ignore excluded and textual header files as a module can be built with +// such headers missing. } else { // If we find a module that has a missing header, we mark this module as // unavailable and store the header directive for displaying diagnostics. diff --git a/clang/test/Modules/Inputs/submodules/module.modulemap b/clang/test/Modules/Inputs/submodules/module.modulemap index 1c1b76a08969e..9e8143b8101de 100644 --- a/clang/test/Modules/Inputs/submodules/module.modulemap +++ b/clang/test/Modules/Inputs/submodules/module.modulemap @@ -30,3 +30,7 @@ module missing_umbrella_with_inferred_submodules { module * { export * } export * } + +module missing_textual_header { + textual header "missing_textual.h" +} diff --git a/clang/test/Modules/missing-header.m b/clang/test/Modules/missing-header.m index c162e1b5f08b3..84d82e5ceda32 100644 --- a/clang/test/Modules/missing-header.m +++ b/clang/test/Modules/missing-header.m @@ -8,6 +8,9 @@ @import missing_unavailable_headers.not_missing; // OK // CHECK-NOT: missing_unavailable_headers +@import missing_textual_header; // OK +// CHECK-NOT: missing_textual_header + @import missing_headers; // CHECK: module.modulemap:15:27: error: header 'missing.h' not found // CHECK: could not build module 'missing_headers' `` https://github.com/llvm/llvm-project/pull/138227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)
https://github.com/vsapsai created https://github.com/llvm/llvm-project/pull/138227 According to the documentation > A header declaration that does not contain `exclude` nor `textual` specifies > a header that contributes to the enclosing module. Which means that `exclude` and `textual` header don't contribute to the enclosing module and their presence isn't required to build such a module. The keywords tell clang how a header should be treated in a context of the module but they don't add headers to the module. When a textual header *is* used, clang still emits "file not found" error pointing to the location where the missing file is included. >From d974d5e49406f02b37e9606e15e0d829ae3a5637 Mon Sep 17 00:00:00 2001 From: Volodymyr Sapsai Date: Thu, 1 May 2025 20:39:36 -0700 Subject: [PATCH] [Modules] Don't fail when an unused textual header is missing. According to the documentation > A header declaration that does not contain `exclude` nor `textual` specifies > a header that contributes to the enclosing module. Which means that `exclude` and `textual` header don't contribute to the enclosing module and their presence isn't required to build such a module. The keywords tell clang how a header should be treated in a context of the module but they don't add headers to the module. When a textual header *is* used, clang still emits "file not found" error pointing to the location where the missing file is included. --- clang/lib/Lex/ModuleMap.cpp | 6 -- clang/test/Modules/Inputs/submodules/module.modulemap | 4 clang/test/Modules/missing-header.m | 3 +++ 3 files changed, 11 insertions(+), 2 deletions(-) diff --git a/clang/lib/Lex/ModuleMap.cpp b/clang/lib/Lex/ModuleMap.cpp index c2f13fa48d464..1ba02b919a22a 100644 --- a/clang/lib/Lex/ModuleMap.cpp +++ b/clang/lib/Lex/ModuleMap.cpp @@ -310,8 +310,10 @@ void ModuleMap::resolveHeader(Module *Mod, } else if (Header.HasBuiltinHeader && !Header.Size && !Header.ModTime) { // There's a builtin header but no corresponding on-disk header. Assume // this was supposed to modularize the builtin header alone. - } else if (Header.Kind == Module::HK_Excluded) { -// Ignore missing excluded header files. They're optional anyway. + } else if ((Header.Kind == Module::HK_Excluded) || + (Header.Kind == Module::HK_Textual)) { +// Ignore excluded and textual header files as a module can be built with +// such headers missing. } else { // If we find a module that has a missing header, we mark this module as // unavailable and store the header directive for displaying diagnostics. diff --git a/clang/test/Modules/Inputs/submodules/module.modulemap b/clang/test/Modules/Inputs/submodules/module.modulemap index 1c1b76a08969e..9e8143b8101de 100644 --- a/clang/test/Modules/Inputs/submodules/module.modulemap +++ b/clang/test/Modules/Inputs/submodules/module.modulemap @@ -30,3 +30,7 @@ module missing_umbrella_with_inferred_submodules { module * { export * } export * } + +module missing_textual_header { + textual header "missing_textual.h" +} diff --git a/clang/test/Modules/missing-header.m b/clang/test/Modules/missing-header.m index c162e1b5f08b3..84d82e5ceda32 100644 --- a/clang/test/Modules/missing-header.m +++ b/clang/test/Modules/missing-header.m @@ -8,6 +8,9 @@ @import missing_unavailable_headers.not_missing; // OK // CHECK-NOT: missing_unavailable_headers +@import missing_textual_header; // OK +// CHECK-NOT: missing_textual_header + @import missing_headers; // CHECK: module.modulemap:15:27: error: header 'missing.h' not found // CHECK: could not build module 'missing_headers' ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits