Thanks Andrew, I appreciate the context and links. It looks like the
prior implementation failed to handle links due to being based on file
path, given cpp_simplify_pathname. Do you have thoughts on the use of
device ID + inode as a way to also accommodate symbolic links and hard
links without the fickleness of mtime?
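
To make the question concrete, here is a minimal sketch of the identity
check I have in mind, assuming a POSIX system with stat(). The name
same_file is hypothetical and is not taken from libcpp or any existing
compiler; it only illustrates the comparison itself.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper (not an existing GCC/libcpp function): returns true
   when both paths name the same file, judged by device ID + inode.
   stat() follows symbolic links, so a symlink compares equal to its
   target, and hard links share an inode, so they compare equal too.
   mtime and file contents are deliberately not consulted. */
bool same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    /* If either path cannot be examined, treat the files as distinct. */
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return false;

    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Under this definition, two byte-identical copies of a header count as
different files regardless of their mtimes, while different spellings
of one path, symlinks, and hard links all count as the same file.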

Cheers,
Jeremy

On Sep 6 2024, at 12:25 am, Andrew Pinski <pins...@gmail.com> wrote:

> On Thu, Sep 5, 2024 at 10:04 PM Jeremy Rifkin <jer...@rifkin.dev> wrote:
>>  
>> Hello,
>>  
>> I'm looking at #pragma once behavior among the major C/C++ compilers as
>> part of a proposal paper for standardizing #pragma once. (This is
>> apparently a very controversial topic)
>>  
>> To put my question up-front: Would GCC ever be open to altering its
>> #pragma once behavior to bring it more in-line with behavior from other
>> compilers and possibly more in-line with what users expect?
>>  
>> To elaborate more:
>>  
>> Design decisions for #pragma once essentially boil down to a file-based
>> definition vs. a content-based definition of "same file".
>>  
>> A file-based definition is easier to reason about and more in-line with
>> what users expect, however, distinct copies of headers can't be handled
>> and multiple mount points are problematic.
>>  
>> A content-based definition works for distinct copies, multiple mount
>> points, and is completely sufficient 99% of the time, however, it could
>> potentially break in hard-to-debug ways in a few notable cases (more
>> information later).
>>  
>> Currently the three major C/C++ compilers treat #pragma once very 
>> differently:
>> - GCC uses file mtime + file contents
>> - Clang uses inodes
>> - MSVC uses file path
>  
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52566#c2 .
> Note this was changed specifically in GCC 3.4 to fix the issue around
> symlinks and hard links.
> See https://gcc.gnu.org/pipermail/gcc-patches/2003-July/111203.html
> for more information on the fixes.
>  
> In fact `#pragma once` was deprecated before GCC 3.4 because its
> implementation did what clang and MSVC do now, and that was considered
> wrong.
> So GCC's behavior has been this way since before clang was even written.
>  
> Thanks,
> Andrew
>  
>>  
>> None of the major compilers have documented their #pragma once semantics.
>>  
>> In practice all three of these approaches work pretty well most of the
>> time (which is why people feel comfortable using #pragma once). However,
>> they can each break in their own ways.
>>  
>> As mentioned earlier, clang and MSVC's file-based definitions of "same
>> file" break for multiple mount points and multiple copies of the same
>> header. MSVC's approach breaks for symbolic links and hard links.
>>  
>> GCC's hybrid approach can break in surprising ways. I have three
>> examples to share:
>>  
>> Example 1:
>>  
>> Consider a scenario such as:
>>  
>> usr/
>>   include/
>>     library_a/
>>       library_main.hpp
>>       foo.hpp
>>     library_b/
>>       library_main.hpp
>>       foo.hpp
>> src/
>>   main.cpp
>>  
>> main.cpp:
>> #include "library_a/library_main.hpp"
>> #include "library_b/library_main.hpp"
>>  
>> And both library_main.hpp's have:
>> #pragma once
>> #include "foo.hpp"
>>  
>> Example 2:
>>  
>> namespace v1 {
>>     #include "library_v1.hpp"
>> }
>> namespace v2 {
>>     #include "library_v2.hpp"
>> }
>>  
>> Where both library headers include their own copy of a shared header
>> using #pragma once.
>>  
>> Example 3:
>>  
>> usr/
>>   include/
>>     library/
>>       library.hpp
>>       vendored-dependency.hpp
>> src/
>>   main.cpp
>>   vendored-dependency.hpp
>>  
>> main.cpp:
>> #include "vendored-dependency.hpp"
>> #include <library/library.hpp>
>>  
>> library.hpp:
>> #pragma once
>> #include "vendored-dependency.hpp"
>>  
>> Assuming the same contents byte-for-byte of vendored-dependency.hpp, and
>> it uses #pragma once.
>>  
>> Each of these examples is a plausible scenario where two files with the
>> same contents could be #included. In each example, on GCC, the code can
>> work or break based on mtime:
>> - Example 1: Breaks if mtimes for library_main.hpp happen to be the same
>> - Example 2: Breaks if mtimes for the shared dependency copies happen to
>> be the same
>> - Example 3: Only works if mtimes are the same
>>  
>> File mtimes can happen to match sometimes, e.g. in a fresh git clone.
>> However, this is a rather fickle criterion to rely on and could easily
>> diverge in the middle of development. Notably, Example 2 was shared with
>> me as an example where #pragma once worked great in development and
>> broke in CI.
>>  
>> Additionally, while GCC's approach might be able to handle multiple
>> mounts better than other approaches, it can still break under multiple
>> mounts if mtime resolution differs.
>>  
>> Obviously there is no silver bullet for making #pragma once work
>> perfectly all the time, however, I think it's easier to provide clear
>> guarantees for #pragma once behavior when the definition of "same file"
>> is based on file identity on device, i.e. device id + inode.
>>  
>> Would GCC ever consider using device id + inode instead of mtime +
>> contents for #pragma once?
>>  
>> I presume the primary reason against changing the mtime + file contents
>> approach in GCC would be caution over breaking any existing use. While
>> the three examples above are cases where fickle mtime can be
>> problematic, and I can't imagine any situations where mtime could
>> reliably be relied upon, I do understand the degree of caution required
>> for changes like this.
>>  
>>  
>> Cheers,
>> Jeremy
>
