#pragma once behavior

Jeremy Rifkin Thu, 05 Sep 2024 22:04:12 -0700

Hello,

I'm looking at #pragma once behavior among the major C/C++ compilers as
part of a proposal paper for standardizing #pragma once. (This is
apparently a very controversial topic.)


To put my question up front: Would GCC ever be open to altering its
#pragma once behavior to bring it more in line with the behavior of other
compilers and possibly more in line with what users expect?

To elaborate more:

Design decisions for #pragma once essentially boil down to a file-based
definition vs a content-based definition of "same file".

A file-based definition is easier to reason about and more in line with
what users expect. However, it can't handle distinct copies of a header,
and multiple mount points are problematic.

A content-based definition works for distinct copies and multiple mount
points, and is completely sufficient 99% of the time. However, it can
break in hard-to-debug ways in a few notable cases (more information
later).

Currently the three major C/C++ compilers treat #pragma once very differently:
- GCC uses file mtime + file contents
- Clang uses inodes
- MSVC uses file path

None of the major compilers have documented their #pragma once semantics.

In practice all three of these approaches work pretty well most of the
time (which is why people feel comfortable using #pragma once). However,
they can each break in their own ways.

As mentioned earlier, Clang's and MSVC's file-based definitions of "same
file" break for multiple mount points and multiple copies of the same
header. MSVC's approach additionally breaks for symbolic links and hard
links.
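
To make the three definitions concrete, here is a rough model of them as
predicates over a hypothetical HeaderFile record. The struct, its fields,
and the function names are my own illustration and are not taken from any
compiler's source; pairing the inode with a device id in the inode-based
rule is also an assumption on my part.

```cpp
#include <cstdint>
#include <string>

// Hypothetical record of what a preprocessor might know about a header.
struct HeaderFile {
    std::string path;      // resolved path used to open the file
    std::uint64_t device;  // device id (assumed to accompany the inode)
    std::uint64_t inode;   // inode number
    std::int64_t mtime;    // modification time, in seconds
    std::string contents;  // full file contents
};

// Path-based "same file" (the approach attributed to MSVC above).
bool same_by_path(const HeaderFile& a, const HeaderFile& b) {
    return a.path == b.path;
}

// Inode-based "same file" (the approach attributed to Clang above).
bool same_by_inode(const HeaderFile& a, const HeaderFile& b) {
    return a.device == b.device && a.inode == b.inode;
}

// mtime + contents "same file" (the approach attributed to GCC above).
bool same_by_mtime_and_contents(const HeaderFile& a, const HeaderFile& b) {
    return a.mtime == b.mtime && a.contents == b.contents;
}
```

Under this model a hard link has a different path but the same inode, so
only the path-based rule treats it as a second file, while a byte-for-byte
copy has a different inode and is only ever merged by the mtime + contents
rule, and only when the mtimes happen to agree.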

GCC's hybrid approach can break in surprising ways. I have three
examples to share:

Example 1:

Consider a scenario such as:

usr/
  include/
    library_a/
      library_main.hpp
      foo.hpp
    library_b/
      library_main.hpp
      foo.hpp
src/
  main.cpp

main.cpp:
#include "library_a/library_main.hpp"
#include "library_b/library_main.hpp"

And both library_main.hpp's have:
#pragma once
#include "foo.hpp"

Example 2:

namespace v1 {
    #include "library_v1.hpp"
}
namespace v2 {
    #include "library_v2.hpp"
}

Here both library headers include their own copy of a shared header, and
that shared header uses #pragma once.

Example 3:

usr/
  include/
    library/
      library.hpp
      vendored-dependency.hpp
src/
  main.cpp
  vendored-dependency.hpp

main.cpp:
#include "vendored-dependency.hpp"
#include <library/library.hpp>

library.hpp:
#pragma once
#include "vendored-dependency.hpp"

Assume both copies of vendored-dependency.hpp have the same contents
byte-for-byte and both use #pragma once.

Each of these examples is a plausible scenario where two files with the
same contents could be #included. In each example, on GCC, the code can
work or break based on mtime:
- Example 1: Breaks if mtimes for library_main.hpp happen to be the same
- Example 2: Breaks if mtimes for the shared dependency copies happen to
be the same
- Example 3: Only works if mtimes are the same
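
The outcomes above can be reproduced with a small model of the mtime +
contents rule. This is only a sketch under my own assumptions: Header and
entered_headers are hypothetical names, every header in the list is
assumed to use #pragma once, nested includes are not followed, and none
of this is GCC's actual code.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one header that uses #pragma once.
struct Header {
    std::string path;
    std::int64_t mtime;    // modification time, in seconds
    std::string contents;
};

// Returns the paths that are actually entered, in request order, when
// "same file" means equal mtime and equal contents. A header matching an
// earlier one under that rule is skipped.
std::vector<std::string> entered_headers(const std::vector<Header>& requested) {
    std::vector<Header> seen;
    std::vector<std::string> entered;
    for (const Header& h : requested) {
        bool already_seen = false;
        for (const Header& s : seen) {
            if (s.mtime == h.mtime && s.contents == h.contents) {
                already_seen = true;
                break;
            }
        }
        if (!already_seen) {
            seen.push_back(h);
            entered.push_back(h.path);
        }
    }
    return entered;
}
```

For Example 1, passing both library_main.hpp files with equal mtimes
yields only library_a/library_main.hpp, so library_b's foo.hpp is never
reached. With different mtimes both are entered. For Example 3 the same
function shows the reverse: the two vendored-dependency.hpp copies are
only collapsed into one when their mtimes match.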

File mtimes can happen to match sometimes, e.g. in a fresh git clone.
However, this is a rather fickle criterion to rely on and could easily
diverge in the middle of development. Notably, Example 2 was shared with
me as an example where #pragma once worked great in development and
broke in CI.

Additionally, while GCC's approach might be able to handle multiple
mounts better than other approaches, it can still break under multiple
mounts if mtime resolution differs.
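
As a rough illustration of the resolution problem, suppose the same
header is visible through two mounts, one storing mtimes with nanosecond
resolution and one with one-second resolution. The helper below is
hypothetical and the truncation behavior is an assumption about how a
coarser filesystem might store the timestamp.

```cpp
#include <chrono>

// Model of the mtime a filesystem reports when it can only store
// timestamps at the given resolution (assumed to truncate).
constexpr std::chrono::nanoseconds stored_mtime(
    std::chrono::nanoseconds written, std::chrono::nanoseconds resolution) {
    return written - written % resolution;
}
```

A timestamp with a non-zero sub-second part then reads back differently
through the two mounts, so an mtime comparison fails even though the
contents are identical.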

Obviously there is no silver bullet for making #pragma once work
perfectly all the time. However, I think it's easier to provide clear
guarantees for #pragma once behavior when the definition of "same file"
is based on file identity on device, i.e. device id + inode.
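
For reference, a minimal sketch of what a device id + inode comparison
could look like on a POSIX system, using stat(). FileIdentity,
file_identity, and same_file are names I made up for illustration; this
is not how any particular compiler implements it, and it requires C++17
for std::optional.

```cpp
#include <sys/stat.h>

#include <optional>
#include <string>
#include <utility>

// (device id, inode) pair identifying a file on a POSIX system.
using FileIdentity = std::pair<dev_t, ino_t>;

// Returns the identity of the file at `path`, following symbolic links,
// or an empty optional if stat() fails (e.g. the file does not exist).
std::optional<FileIdentity> file_identity(const std::string& path) {
    struct stat st;
    if (stat(path.c_str(), &st) != 0) {
        return std::nullopt;
    }
    return FileIdentity{st.st_dev, st.st_ino};
}

// Two paths name the same file if both can be stat'ed and they share a
// device id and inode number.
bool same_file(const std::string& a, const std::string& b) {
    const auto id_a = file_identity(a);
    const auto id_b = file_identity(b);
    return id_a && id_b && *id_a == *id_b;
}
```

With this definition symbolic links and hard links compare equal to their
target, while a distinct copy never does, regardless of mtime. As far as
I know, std::filesystem::equivalent performs the same st_dev and st_ino
comparison on POSIX platforms.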

Would GCC ever consider using device id + inode instead of mtime +
contents for #pragma once? 

I presume the primary reason against changing the mtime + file contents
approach in GCC would be caution over breaking any existing use. The
three examples above are cases where fickle mtime can be problematic,
and I can't imagine any situation where mtime could be depended on
reliably, but I do understand the degree of caution required for
changes like this.


Cheers,
Jeremy
