https://gcc.gnu.org/g:b6a8e6e78926adb220210b0a740c82f73610b33d

commit r13-10315-gb6a8e6e78926adb220210b0a740c82f73610b33d
Author: Jonathan Wakely <[email protected]>
Date:   Fri Jan 9 13:39:49 2026 +0000

    libstdc++: Fix chrono::current_zone() for three-level names [PR122567]
    
    chrono::current_zone() fails if /etc/localtime is a symlink to a zone
    with three components, like "America/Indiana/Indianapolis", because we
    only try to find "Indianapolis" and "Indiana/Indianapolis" but neither
    of those is a valid zone name.
    
    We need to try up to three components to handle all valid cases, such as
    "UTC", "America/Indianapolis", and "America/Indiana/Indianapolis". It's
    also possible that users could provide a custom tzdata.zi file which
    includes zones with names using more than three levels, so loop over all
    filename components of the path that /etc/localtime points to.
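
Editorial sketch of the lookup order described above, assuming a hypothetical predicate `is_known_zone` in place of the library-internal zone lookup (the helper name `match_trailing_zone` is also invented for illustration and is not part of libstdc++):

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: try ever-longer trailing parts of a symlink target
// ("Indianapolis", then "Indiana/Indianapolis", ...) and return the first
// one that the caller-supplied predicate accepts as a zone name.
std::optional<std::string>
match_trailing_zone(std::string_view target,
                    const std::function<bool(std::string_view)>& is_known_zone)
{
  assert(is_known_zone && "a predicate is required");
  if (target.empty())
    return std::nullopt;

  auto pos = target.rfind('/');
  while (pos != target.npos && pos != 0)
    {
      auto candidate = target.substr(pos + 1);
      if (is_known_zone(candidate))
        return std::string(candidate);
      pos = target.rfind('/', pos - 1);
    }

  // Last attempt: the whole string for a relative target (pos == npos, so
  // pos + 1 wraps to 0), or everything after a leading slash (pos == 0).
  auto candidate = target.substr(pos + 1);
  if (is_known_zone(candidate))
    return std::string(candidate);
  return std::nullopt;
}
```

Because shorter suffixes are tried first, a target such as /usr/share/zoneinfo/America/Europe/London would match "Europe/London" in this sketch before "America" is ever considered.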
    
    This also replaces std::filesystem::read_symlink with a plain readlink
    call and find+substr operations on a std::string_view, which is
    approximately twice as fast as using std::filesystem::path and
    std::string.
    
    By default we use a fixed char[128] buffer for readlink to write into,
    but if that doesn't fit we use a std::string as a dynamic buffer that
    grows as needed. We could use ::stat to find the exact length of the
    symlink and avoid looping with an increasingly large std::string
    capacity, but it's already expected to be rare for the char[128] buffer
    to be exceeded, so needing to double the std::string capacity more than
    once (i.e. to 512 or more) should be exceedingly rare. Adding a call to
    ::stat would perform a third filesystem operation when two readlink
    calls should be sufficient for the vast majority of realistic cases.
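
Editorial sketch of the retry-with-a-larger-buffer idea, assuming a POSIX system. The helper name `read_link_target` and its `initial_size` parameter are invented for illustration. Unlike the patch, this version uses a std::string from the start rather than a stack buffer, and returns an empty string on error instead of throwing:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unistd.h>   // readlink (POSIX)

// Hypothetical helper: read the target of the symlink at `path`.
// readlink does not report truncation, so a result that fills the whole
// buffer is treated as "possibly truncated" and retried with twice the space.
std::string read_link_target(const char* path, std::size_t initial_size = 128)
{
  assert(initial_size > 0);
  std::string buf(initial_size, '\0');
  for (;;)
    {
      ssize_t n = ::readlink(path, buf.data(), buf.size());
      if (n < 0)
        return {};               // not a symlink, or it could not be read
      if (static_cast<std::size_t>(n) < buf.size())
        {
          buf.resize(static_cast<std::size_t>(n));
          return buf;            // the whole target fitted
        }
      buf.resize(buf.size() * 2);
    }
}
```

With the default size, a typical target needs a single readlink call. A longer target costs one extra call per doubling, which is the trade-off against an up-front stat-style call discussed above.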
    
    One consequence of not using filesystem::path is that redundant
    consecutive slashes in the pathname aren't automatically ignored, e.g.
    /usr/share/zoneinfo/Europe//London worked fine with the old
    implementation because we manually concatenated the path components,
    i.e. "Europe" + '/' + "London". So that this continues to work there is
    a new loop to remove redundant slashes from the string being processed.
    That adds a slower, allocating path, but is unlikely to be needed in
    practice (the systemd spec for /etc/localtime explicitly says it should
    end with a time zone name, so "Europe//London" would be invalid anyway,
    even if it points to a valid file). Again, this loop is expected to be
    rare so optimizing this case further isn't important.
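
Editorial sketch of the slash-collapsing loop, using a hypothetical free function `collapse_slashes`. Unlike the patch, which copies into a std::string only when "//" is actually present, this simplified version always copies:

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: return `path` with every run of consecutive
// slashes reduced to a single slash, scanning from the end of the string.
std::string collapse_slashes(std::string_view path)
{
  std::string out(path);
  auto pos = out.rfind("//");
  while (pos != out.npos)
    {
      out.erase(pos, 1);            // drop one slash of the pair
      pos = out.rfind("//", pos);   // keep searching at or before pos
    }
  assert(out.find("//") == out.npos);
  return out;
}
```

Searching backwards from the position just erased means each run of slashes is shortened one character at a time until no "//" remains.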
    
    While manually testing this I noticed that we will interpret a bogus
    symlink such as /usr/share/zoneinfo/America/Europe/London as a valid
    timezone, even though it's a dangling symlink. We find a name match for
    "Europe/London" before we get to the "America" component. This seems
    unlikely to matter in practice, and was a pre-existing problem.
    
    There's no testcase for current_zone() correctly handling three-level
    names or symlinks with unusual targets. It cannot be tested without
    changing the target of /etc/localtime which requires root access.
    
    I'm still considering whether we want to cache the result of
    current_zone(), either globally or in the tzdb object. Just returning a
    cached variable takes 20-30ns instead of more than 700ns to access the
    filesystem and read the symlink. Using ::lstat to check the symlink's
    mtime would add some overhead though.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/122567
            * src/c++20/tzdb.cc (tzdb::current_zone): Loop over all trailing
            components of /etc/localtime path. Use readlink instead of
            std::filesystem::read_symlink.
    
    Reviewed-by: Tomasz Kamiński <[email protected]>
    
    (cherry picked from commit 20a6ff7a4877a25ba78461a19417e956bd6c0095)

Diff:
---
 libstdc++-v3/src/c++20/tzdb.cc | 105 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 87 insertions(+), 18 deletions(-)

diff --git a/libstdc++-v3/src/c++20/tzdb.cc b/libstdc++-v3/src/c++20/tzdb.cc
index 83848cf67020..4cfa4b60543d 100644
--- a/libstdc++-v3/src/c++20/tzdb.cc
+++ b/libstdc++-v3/src/c++20/tzdb.cc
@@ -35,7 +35,11 @@
 #include <atomic>     // atomic<T*>, atomic<int>
 #include <memory>     // atomic<shared_ptr<T>>
 #include <mutex>      // mutex
-#include <filesystem> // filesystem::read_symlink
+#include <iomanip>    // quoted
+
+#if defined(_GLIBCXX_HAVE_READLINK) && defined(_GLIBCXX_HAVE_UNISTD_H)
+# include <unistd.h>  // readlink
+#endif
 
 #ifdef _AIX
 # include <cstdlib>   // getenv
@@ -1737,28 +1741,88 @@ namespace std::chrono
   tzdb::current_zone() const
   {
     // TODO cache this function's result?
+    // Could check the modification time of /etc/localtime, and not re-read
+    // it if it hasn't changed. reload_tzdb() could clear the cache too,
+    // to have a way to force a re-read.
 
 #ifndef _AIX
-    // Repeat the preprocessor condition used by filesystem::read_symlink,
-    // to avoid a dependency on src/c++17/fs_ops.o if it won't work anyway.
-#if defined(_GLIBCXX_HAVE_READLINK) && defined(_GLIBCXX_HAVE_SYS_STAT_H)
-    error_code ec;
-    // This should be a symlink to e.g. /usr/share/zoneinfo/Europe/London
-    auto path = filesystem::read_symlink("/etc/localtime", ec);
-    if (!ec)
+#if defined(_GLIBCXX_HAVE_READLINK) && defined(_GLIBCXX_HAVE_UNISTD_H)
+    string_view str;
+    char buf[128]; // strlen("../usr/share/zoneinfo/...") is usually < 55
+    string dynbuf;
+    // /etc/localtime should be a symlink that ends with a zone name,
+    // e.g. /etc/localtime -> /usr/share/zoneinfo/Europe/London
+    // https://www.freedesktop.org/software/systemd/man/latest/localtime.html
+    // This should work on GNU/Linux, macOS, NetBSD, and OpenBSD.
+    // Some FreeBSD systems also use a symlink for /etc/localtime.
+    // Use readlink directly to avoid std::filesystem overhead.
+    if (auto n = ::readlink("/etc/localtime", buf, sizeof(buf)); n > 0)
       {
-       auto first = path.begin(), last = path.end();
-       if (std::distance(first, last) > 2)
+       if (static_cast<size_t>(n) < sizeof(buf))
+         str = string_view(buf, n);
+       else [[unlikely]]
          {
-           --last;
-           string name = last->string();
-           if (auto tz = do_locate_zone(this->zones, this->links, name))
-             return tz;
-           --last;
-           name = last->string() + '/' + name;
-           if (auto tz = do_locate_zone(this->zones, this->links, name))
+           // We read the symlink but it didn't fit in buf[], use dynbuf.
+           do
+             {
+               n *= 2;
+               dynbuf.resize(n);
+               // In gcc-14 and later this uses dynbuf.__resize_and_overwrite:
+               {
+                 auto n2 = ::readlink("/etc/localtime", dynbuf.data(), n);
+                 if (n2 == -1) // symlink removed or replaced by file?!
+                   __throw_runtime_error("tzdb: error reading /etc/localtime");
+                 dynbuf.resize(n2 < n ? n2 : 0);
+               }
+             }
+           while (dynbuf.empty());
+           str = dynbuf;
+         }
+      }
+
+    if (!str.empty())
+      {
+       // Remove any redundant slashes so we can match zone names.
+       // e.g. /usr/share/zoneinfo/Europe//London is a valid symlink,
+       // but won't match against "Europe/London".
+       if (auto pos = str.rfind("//"); pos != str.npos) [[unlikely]]
+         {
+           if (str.data() != dynbuf.data())
+             dynbuf = str;
+           string::size_type spos = pos;
+           do
+             {
+               dynbuf.erase(spos, 1);
+               spos = dynbuf.rfind("//", spos);
+             }
+           while (spos != dynbuf.npos);
+           str = dynbuf;
+         }
+
+       // Check the trailing components of the path against known zone names.
+       // Valid IANA times zones can have one, two, or three parts, e.g.
+       // "UTC", "Europe/London", and "America/Indiana/Indianapolis".
+       // Custom tzdata.zi files could in theory use four or more parts.
+
+       auto pos = str.rfind('/');
+       while (pos != str.npos && pos != 0)
+         {
+           if (auto tz = do_locate_zone(this->zones, this->links,
+                                        str.substr(pos + 1)))
              return tz;
+           pos = str.rfind('/', pos - 1);
          }
+       // If we didn't match yet, try once more so that we will match
+       // a symlink to a relative path such as "Europe/London"
+       // or symlink to an absolute path such as "/Europe/London".
+       // Both cases seem unlikely because it would require either
+       // /etc/Europe or /Europe to be a directory (or a symlink to one)
+       // containing the TZif files, but it's theoretically possible.
+       // If pos==npos then pos+1 wraps to 0 and we use the whole string.
+       // If pos==0 then substr(1) discards the leading slash.
+       if (auto tz = do_locate_zone(this->zones, this->links,
+                                    str.substr(pos + 1)))
+         return tz;
       }
 #endif
     // Otherwise, look for a file naming the time zone.
@@ -1795,7 +1859,12 @@ namespace std::chrono
                  return tz;
              }
       }
-#else
+
+    // FIXME: For DragonFly BSD /etc/localtime is a copy of one of the
+    // zone files in /usr/share/zoneinfo so we need to compare its contents
+    // to each one until we find a match.
+
+#else // _AIX
     // AIX stores current zone in $TZ in /etc/environment but the value
     // is typically a POSIX time zone name, not IANA zone.
     // https://developer.ibm.com/articles/au-aix-posix/
