================
@@ -1320,6 +1320,280 @@ indirectly imported internal partition units are not
reachable.
The suggested approach for using an internal partition unit in Clang is
to only import them in the implementation unit.
+Using Clang Module Map to Avoid Mixing #include and import Problems
+-------------------------------------------------------------------
+
+.. note::
+ The approach discussed in this section is experimental.
+
+Problems Background
+~~~~~~~~~~~~~~~~~~~
+
+As discussed before, redeclaration of the same entity in different translation
+units (TUs) is one of the major problems of using modules from the compiler's
+perspective. The redeclaration pattern is a major trigger of compiler bugs, and
+even when the compiler handles it as expected, compilation performance suffers.
+
+For example:
+
+.. code-block:: c++
+
+ // a.h
+ #pragma once
+ class A { ... };
+
+ // a.cppm
+ module;
+ #include "a.h"
+ export module a;
+ export using ::A;
+
+ // a.cc
+ import a;
+ #include "a.h"
+ A a;
+
+Here, ``a.cc`` sees two declarations of ``A``: one imported from ``a.cppm`` and
+one from the ``#include`` in ``a.cc`` itself.
+
+To avoid the redeclaration pattern, a previous section suggested that users
+comment out third-party headers manually.
+
+This section introduces another way to avoid the redeclaration pattern: using a
+Clang module map.
+
+Clang Module Map Background
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Module maps are a feature of Clang header modules. See `Clang Module
+<Modules.html>`_ for a full introduction to Clang header modules. Here we
+introduce only as much of Clang header modules as is needed to make this
+document self-contained.
+
+Clang Implicit Header Modules
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In implicit header module mode, Clang reads the module map, compiles the
+headers listed in it into module files, and uses those module files
+automatically. This sounds very nice, but due to the complexity it does not
+work so well in practice. To stay correct, Clang conservatively compiles the
+same header into different module files when it is seen in different
+preprocessor contexts. This in turn may trigger the problem of redeclaration in
+different TUs, so users of implicit header modules have to design their module
+system carefully from the bottom up. In addition, Clang implicit header modules
+`have many issues with soundness and performance due to tradeoffs made for
+module reuse and filesystem contention
+<https://discourse.llvm.org/t/clang-modules-build-daemon-build-system-agnostic-support-for-explicitly-built-modules>`_.
+
+Clang Explicit Header Modules
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Clang explicit header modules offload the job of creating and managing module
+files to the build system. Because C++20 named modules and Clang header modules
+share the same underlying implementation, it is possible to reuse the module
+map interface for C++20 named modules.
+
+Technically, Clang explicit header modules may be able to solve the
+redeclaration problem. Consider the example above again:
+
+.. code-block:: c++
+
+ // a.h
+ #pragma once
+ class A { ... };
+
+ // a.cppm
+ module;
+ #include "a.h"
+ export module a;
+ export using ::A;
+
+ // a.cc
+ import a;
+ #include "a.h"
+ A a;
+
+The build system can build the header ``a.h`` into a module file and use it in
+both ``a.cppm`` and ``a.cc``. Then there is no redeclaration in the example:
+all declarations of ``class A`` come from the synthesized TU ``a.h``.
+
+But there are problems: (1) the build system needs to support Clang explicit
+modules, and (2) the interaction between named modules and Clang header
+modules is theoretically fine but has not been verified in practice. Also, this
+document is about standard C++ modules, so we won't expand on this here.
+
+Examples
+~~~~~~~~
+
+To use Clang module maps with C++20 named modules, end users have to wait for
+support from build systems. Here we leave build systems aside to help users
+understand the mechanism.
+
+Here is an example of using a Clang module map to replace the inclusion of a
+header with the import of a module.
+
+.. code-block:: c++
+
+ // a.h
+ #pragma once
+ static_assert(false, "don't include a.h");
+
+ // main.cpp
+ #include "a.h"
+ int main() {
+ return 0;
+ }
+
+ // a.cppm
+ module;
+ #include <iostream>
+ export module a;
+ struct Init {
+ Init() {
+ std::cout << "Module 'a' got imported" << std::endl;
+ }
+ };
+ Init a;
+
+ // a.cppm.modulemap
+ module a {
+ header "a.h"
+ }
+
+Then invoke Clang with:
+
+.. code-block:: console
+
+ $ clang++ -std=c++20 a.cppm -c -fmodule-output=a.pcm -o a.o
+ $ clang++ -std=c++20 main.cpp -fmodule-map-file=a.cppm.modulemap -fmodule-file=a=a.pcm a.o -o main
+ $ ./main
+ Module 'a' got imported
+
+Here the header file ``a.h`` is not actually included (otherwise the
+compilation would fail due to the static assert). Instead, the compiler imports
+module ``a``, and the variable in module ``a`` gets initialized.
+
+The secret is the flag ``-fmodule-map-file=a.cppm.modulemap``. The content of
+``a.cppm.modulemap`` says: map the ``#include`` of ``a.h`` to an import of
+module ``a``. So when the compiler sees ``#include "a.h"``, it does not
+actually include ``a.h`` but tries to import module ``a`` instead. From
+``-fmodule-file=a=a.pcm`` on the command line, the compiler finds the module
+file of module ``a``; that module file gets imported and the inclusion of
+``a.h`` is skipped.
+
+We can then use this mechanism to avoid the redeclaration pattern for
+header-wrapping modules.
+
+.. code-block:: c++
+
+ // a.h
+ #pragma once
+ class A { ... };
+
+ // a.cppm
+ module;
+ #include "a.h"
+ export module a;
+ export using ::A;
+
+ // a.cc
+ import a;
+ #include "a.h"
+ A a;
+
+ // a.cppm.modulemap
+ module a {
+ header "a.h"
+ }
+
+Similarly, if we add the flag ``-fmodule-map-file=a.cppm.modulemap`` when
+compiling ``a.cc``, the compiler maps the inclusion of ``a.h`` to an import of
+module ``a``. Module ``a`` is already imported, so we avoid the redeclaration
+of class ``A`` in ``a.cc``.
+
+A foreseeable problem with this approach is hidden inclusion. For example:
+
+.. code-block:: c++
+
+ // b.h
+ #pragma once
+ struct B {};
+
+ // a.h
+ #pragma once
+ #include "b.h"
+ struct A { B b; };
+
+ // b.cppm
+ export module b;
+ export extern "C++" struct B { };
+
+ // a.cppm
+ export module a;
+ import b;
+ export extern "C++" struct A { B b; };
+
+ // test.cc
+ import a;
+ #include "a.h"
+ A a;
+ B b;
+
+ // a.cppm.modulemap
+ module a {
+ header "a.h"
+ }
+
+ // b.cppm.modulemap
+ module b {
+ header "b.h"
+ }
+
+The example is valid if we don't use the module map:
+
+.. code-block:: console
+
+ $ clang++ -std=c++20 b.cppm -c -fmodule-output=b.pcm -o b.o
+ $ clang++ -std=c++20 a.cppm -c -fmodule-output=a.pcm -fmodule-file=b=b.pcm -o a.o
+ $ clang++ -std=c++20 test.cc -fmodule-file=a=a.pcm -fmodule-file=b=b.pcm -fsyntax-only
+
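+But if we enable the module maps, the example becomes invalid. The
+``#include "a.h"`` is now mapped to an import of module ``a``, so ``b.h`` is no
+longer textually included, and module ``a`` imports ``b`` without re-exporting
+it: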
+
+.. code-block:: console
+
+ $ clang++ -std=c++20 test.cc -fmodule-map-file=a.cppm.modulemap -fmodule-file=a=a.pcm -fmodule-map-file=b.cppm.modulemap -fmodule-file=b=b.pcm -fsyntax-only
+ test.cc:4:1: error: declaration of 'B' must be imported from module 'b' before it is required
+ 4 | B b;
+ | ^
+ b.cppm:2:28: note: declaration here is not visible
+ 2 | export extern "C++" struct B { };
+ | ^
+ 1 error generated.
+
+A suggested convention for end users and build systems
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As noted above, the build system plays a vital role in this strategy.
+However, it is not easy for build systems to support Clang explicit header
+modules, or to support module maps with C++20 named modules in general. The
+complexity for build systems would be no less than that of supporting C++20
+named modules.
+
+So here we suggest a convention between end users and build systems that eases
+the implementation burden on build systems and helps end users avoid the
+redeclaration problem caused by mixing ``#include`` and ``import``.
+
+For end users who are authors of a header-based library that offers named
+module wrappers, the header's interface should be a subset of the module
+interface, excluding user-facing macros.
+
+* Extract all user-facing headers into a single header file. Since C++20 named modules
+* For each named module interface, provide a module map file that maps the interface headers to the named module. The name of the module map file should be the name of the module interface unit plus ``.modulemap``.
+
+The number of module maps should not be large, since this is still a
+header-based library.
+
+For build systems:
+
+* For each translation unit, if the unit doesn't import any named modules, stop. Such a TU is not what this convention targets.
+* If the TU imports named modules, then for each imported named module unit, look in the same directory as the imported module unit for a module map file whose name is the name of the module unit plus ``.modulemap``. For example, if the module unit is ``a.cppm``, look for ``a.cppm.modulemap``.
+* For each module map found, pass ``-fmodule-map-file=<module_map_file_path>`` to the Clang compiler.
----------------
waruqi wrote:
This seems to require users to manually maintain additional modulemap files,
but modulemaps are a feature specific to clang.
If users are only using clang to build C++20 modules, this won't be a problem.
However, for build systems, a cross-compiler solution is preferred, at least
one that eliminates the need for users to maintain additional compiler-related
configuration files or add compiler-related build configurations.
https://github.com/llvm/llvm-project/pull/178368