Thanks Andrew, I appreciate the context and links. It looks like the prior implementation failed to handle links due to being based on file path, given cpp_simplify_pathname. Do you have thoughts on the use of device ID + inode as a way to also accommodate symbolic links and hard links without the fickleness of mtime?

Cheers,
Jeremy

On Sep 6 2024, at 12:25 am, Andrew Pinski <pins...@gmail.com> wrote:
> On Thu, Sep 5, 2024 at 10:04 PM Jeremy Rifkin <jer...@rifkin.dev> wrote:
>>
>> Hello,
>>
>> I'm looking at #pragma once behavior among the major C/C++ compilers as
>> part of a proposal paper for standardizing #pragma once. (This is
>> apparently a very controversial topic)
>>
>> To put my question up-front: Would GCC ever be open to altering its
>> #pragma once behavior to bring it more in-line with behavior from other
>> compilers and possibly more in-line with what users expect?
>>
>> To elaborate more:
>>
>> Design decisions for #pragma once essentially boil down to a file-based
>> definitions vs a content-based definition of "same file".
>>
>> A file-based definition is easier to reason about and more in-line with
>> what users expect, however, distinct copies of headers can't be handled
>> and multiple mount points are problematic.
>>
>> A content-based definition works for distinct copies, multiple mount
>> points, and is completely sufficient 99% of the time, however, it could
>> potentially break in hard-to-debug ways in a few notable cases (more
>> information later).
>>
>> Currently the three major C/C++ compilers treat #pragma once very
>> differently:
>> - GCC uses file mtime + file contents
>> - Clang uses inodes
>> - MSVC uses file path
>
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52566#c2 .
> Note this was changed specifically in GCC 3.4 to fix the issue around
> symlinks and hard links.
> See https://gcc.gnu.org/pipermail/gcc-patches/2003-July/111203.html
> for more information on the fixes.
>
> In fact `#pragma once` was deprecated before GCC 3.4 because it would
> do incorrectly what clang and MSVC are doing and that was considered
> wrong.
> So GCC behavior has been this way before clang was even written.
>
> Thanks,
> Andrew
>
>>
>> None of the major compilers have documented their #pragma once semantics.
>>
>> In practice all three of these approaches work pretty well most of the
>> time (which is why people feel comfortable using #pragma once). However,
>> they can each break in their own ways.
>>
>> As mentioned earlier, clang and MSVC's file-based definitions of "same
>> file" break for multiple mount points and multiple copies of the same
>> header. MSVC's approach breaks for symbolic links and hard links.
>>
>> GCC's hybrid approach can break in surprising ways. I have three
>> examples to share:
>>
>> Example 1:
>>
>> Consider a scenario such as:
>>
>> usr/
>>   include/
>>     library_a/
>>       library_main.hpp
>>       foo.hpp
>>     library_b/
>>       library_main.hpp
>>       foo.hpp
>> src/
>>   main.cpp
>>
>> main.cpp:
>> #include "library_a/library_main.hpp"
>> #include "library_b/library_main.hpp"
>>
>> And both library_main.hpp's have:
>> #pragma once
>> #include "foo.hpp"
>>
>> Example 2:
>>
>> namespace v1 {
>>     #include "library_v1.hpp"
>> }
>> namespace v2 {
>>     #include "library_v2.hpp"
>> }
>>
>> Where both library headers include their own copy of a shared header
>> using #pragma once.
>>
>> Example 3:
>>
>> usr/
>>   include/
>>     library/
>>       library.hpp
>>       vendored-dependency.hpp
>> src/
>>   main.cpp
>>   vendored-dependency.hpp
>>
>> main.cpp:
>> #include "vendored-dependency.hpp"
>> #include <library/library.hpp>
>>
>> library.hpp:
>> #pragma once
>> #include "vendored-dependency.hpp"
>>
>> Assuming the same contents byte-for-byte of vendored-dependency.hpp, and
>> it uses #pragma once.
>>
>> Each of these examples are plausible scenarios where two files with the
>> same contents could be #included. In each example, on GCC, the code can
>> work or break based on mtime:
>> - Example 1: Breaks if mtimes for library_main.hpp happen to be the same
>> - Example 2: Breaks if mtimes for the shared dependency copies happen to
>>   be the same
>> - Example 3: Only works if mtimes are the same
>>
>> File mtimes can happen to match sometimes, e.g. in a fresh git clone.
>> However, this is a rather fickle criteria to rely on and could easily
>> diverge in the middle of development. Notably, Example 2 was shared with
>> me as an example where #pragma once worked great in development and
>> broke in CI.
>>
>> Additionally, while GCC's approach might be able to handle multiple
>> mounts better than other approaches, it can still break under multiple
>> mounts if mtime resolution differs.
>>
>> Obviously there is no silver bullet for making #pragma once work
>> perfectly all the time, however, I think it's easier to provide clear
>> guarantees for #pragma once behavior when the definition of "same file"
>> is based on file identity on device, i.e. device id + inode.
>>
>> Would GCC ever consider using device id + inode instead of mtime +
>> contents for #pragma once?
>>
>> I presume the primary reason against changing the mtime + file contents
>> approach in GCC would be caution over breaking any existing use. While
>> the three examples above are cases where fickle mtime can be
>> problematic, and I can't imagine any situations where mtime could
>> reliably be relied upon, I do understand the degree of caution required
>> for changes like this.
>>
>>
>> Cheers,
>> Jeremy
>