[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
Sirraide wrote: > In that case WYGIWYD (what you get is what you deserve) applies. Sgtm; I’ll leave the clang-tidy side of things as-is then and just update the test to reflect this. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
LegalizeAdulthood wrote: > What about symlinks though? In that case WYGIWYD (what you get is what you deserve) applies. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
AaronBallman wrote: > My expectation would be that if I specify a header filter I'm not going to > use weird paths like a/b/../foo.h, but just a/foo.h because that is where > foo.h lives. What about symlinks though? Would you expect that passing `path/to/file` fails because `path` is a symlink and you should have specified `/var/foo/bar/to/file`? (It's basically the same problem -- do we make the user pass the resolved path or do we canonicalize the path for the user and use that to do the filtering?) https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
LegalizeAdulthood wrote: My expectation would be that if I specify a header filter I'm not going to use weird paths like a/b/../foo.h, but just a/foo.h because that is where foo.h lives. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
AaronBallman wrote: > Ok, looks like the clang-tidy test failure is related to the `-header-filter` > option: > > ```c++ > // Check that `-header-filter` operates on the same file paths as paths in > // diagnostics printed by ClangTidy. > #include "dir1/dir2/../header_alias.h" > // CHECK_HEADER_ALIAS: dir1/dir2/../header_alias.h:1:11: warning: > single-argument constructors > ``` > > So, I guess my question is now, should the header filter apply to the > original filename, the simplified filename, or both? > > @AaronBallman Thoughts? Or alternatively who do best ping for clang-tidy > related questions? CC @5chmidti @PiotrZSL @HerrCai0907 @LegalizeAdulthood for more opinions on this I would naively expect that I'm giving the tool a path, the tool will resolve all symlinks and relative parts, etc for me same as it would do when specifying the file to compile/tidy. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
Sirraide wrote: > should the header filter apply to the original filename, I mean, I guess this beacuse it’s what the user specified and it’s what we’re currently doing? It’s just that the end result might be weird, e.g. if a user writes `-exclude-header=filter='a/foo.h'` and then we print diagnostics in `'a/foo.h'` because it was actually included via `"a/b/../foo.h"` (assuming I’m not misinterpreting what this option does), but maybe that’s ok? https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
https://github.com/Sirraide updated https://github.com/llvm/llvm-project/pull/143520 >From 15c0a79d6a0cd65d88fbe152275b224201e632a1 Mon Sep 17 00:00:00 2001 From: Sirraide Date: Tue, 10 Jun 2025 14:32:32 +0200 Subject: [PATCH 1/5] [Clang] [Diagnostics] Simplify filenames that contain '..' --- clang/include/clang/Frontend/TextDiagnostic.h | 1 + clang/lib/Frontend/TextDiagnostic.cpp | 32 --- 2 files changed, 21 insertions(+), 12 deletions(-) diff --git a/clang/include/clang/Frontend/TextDiagnostic.h b/clang/include/clang/Frontend/TextDiagnostic.h index e2e88d4d648a2..9c77bc3e00e19 100644 --- a/clang/include/clang/Frontend/TextDiagnostic.h +++ b/clang/include/clang/Frontend/TextDiagnostic.h @@ -35,6 +35,7 @@ namespace clang { class TextDiagnostic : public DiagnosticRenderer { raw_ostream &OS; const Preprocessor *PP; + llvm::StringMap> SimplifiedFileNameCache; public: TextDiagnostic(raw_ostream &OS, const LangOptions &LangOpts, diff --git a/clang/lib/Frontend/TextDiagnostic.cpp b/clang/lib/Frontend/TextDiagnostic.cpp index b9e681b52e509..edbad42b39950 100644 --- a/clang/lib/Frontend/TextDiagnostic.cpp +++ b/clang/lib/Frontend/TextDiagnostic.cpp @@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream &OS, } void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) { -#ifdef _WIN32 - SmallString<4096> TmpFilename; -#endif - if (DiagOpts.AbsolutePath) { -auto File = SM.getFileManager().getOptionalFileRef(Filename); -if (File) { + auto File = SM.getFileManager().getOptionalFileRef(Filename); + + // Try to simplify paths that contain '..' in any case since paths to + // standard library headers especially tend to get quite long otherwise. + // Only do that for local filesystems though to avoid slowing down + // compilation too much. + auto AlwaysSimplify = [&] { +return File->getName().contains("..") && + llvm::sys::fs::is_local(File->getName()); + }; + + if (File && (DiagOpts.AbsolutePath || AlwaysSimplify())) { +SmallString<128> &CacheEntry = SimplifiedFileNameCache[Filename]; +if (CacheEntry.empty()) { // We want to print a simplified absolute path, i. e. without "dots". // // The hardest part here are the paths like "//../". @@ -759,15 +767,15 @@ void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) { // on Windows we can just use llvm::sys::path::remove_dots(), because, // on that system, both aforementioned paths point to the same place. #ifdef _WIN32 - TmpFilename = File->getName(); - llvm::sys::fs::make_absolute(TmpFilename); - llvm::sys::path::native(TmpFilename); - llvm::sys::path::remove_dots(TmpFilename, /* remove_dot_dot */ true); - Filename = StringRef(TmpFilename.data(), TmpFilename.size()); + CacheEntry = File->getName(); + llvm::sys::fs::make_absolute(CacheEntry); + llvm::sys::path::native(CacheEntry); + llvm::sys::path::remove_dots(CacheEntry, /* remove_dot_dot */ true); #else - Filename = SM.getFileManager().getCanonicalName(*File); + CacheEntry = SM.getFileManager().getCanonicalName(*File); #endif } +Filename = CacheEntry; } OS << Filename; >From 97c0accf4fb3dc983bbb9f344ad66d0281b231fb Mon Sep 17 00:00:00 2001 From: Sirraide Date: Tue, 10 Jun 2025 14:47:43 +0200 Subject: [PATCH 2/5] Use whichever path ends up being shorter --- clang/lib/Frontend/TextDiagnostic.cpp | 6 ++ 1 file changed, 6 insertions(+) diff --git a/clang/lib/Frontend/TextDiagnostic.cpp b/clang/lib/Frontend/TextDiagnostic.cpp index edbad42b39950..b77c1797c51c5 100644 --- a/clang/lib/Frontend/TextDiagnostic.cpp +++ b/clang/lib/Frontend/TextDiagnostic.cpp @@ -774,6 +774,12 @@ void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) { #else CacheEntry = SM.getFileManager().getCanonicalName(*File); #endif + + // In some cases, the resolved path may actually end up being longer (e.g. + // if it was originally a relative path), so just retain whichever one + // ends up being shorter. + if (!DiagOpts.AbsolutePath && CacheEntry.size() > Filename.size()) +CacheEntry = Filename; } Filename = CacheEntry; } >From 529aac1a9ae4969b4389156cb5d44fcecd445cbc Mon Sep 17 00:00:00 2001 From: Sirraide Date: Tue, 10 Jun 2025 20:36:14 +0200 Subject: [PATCH 3/5] Move path computation into SourceManager --- clang/include/clang/Basic/SourceManager.h | 9 clang/lib/Basic/SourceManager.cpp | 55 +++ clang/lib/Frontend/SARIFDiagnostic.cpp| 31 + clang/lib/Frontend/TextDiagnostic.cpp | 48 +--- 4 files changed, 66 insertions(+), 77 deletions(-) diff --git a/clang/include/clang/Basic/SourceManager.h b/clang/include/clang/Basic/SourceManager.h index cd3dac9133223..92080e7891eca 100644 --- a/clang/include/clang/Basic/SourceManager.h
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
Sirraide wrote: Ok, looks like the clang-tidy test failure is related to the `-header-filter` option: ```C++ // Check that `-header-filter` operates on the same file paths as paths in // diagnostics printed by ClangTidy. #include "dir1/dir2/../header_alias.h" // CHECK_HEADER_ALIAS: dir1/dir2/../header_alias.h:1:11: warning: single-argument constructors ``` So, I guess my question is now, should the header filter apply to the original filename, the simplified filename, or both? @AaronBallman Thoughts? Or alternatively who do best ping for clang-tidy related questions? https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
@@ -2403,3 +2403,75 @@ SourceManagerForFile::SourceManagerForFile(StringRef FileName, assert(ID.isValid()); SourceMgr->setMainFileID(ID); } + +StringRef +SourceManager::getNameForDiagnostic(StringRef Filename, +const DiagnosticOptions &Opts) const { + OptionalFileEntryRef File = getFileManager().getOptionalFileRef(Filename); + if (!File) +return Filename; + + bool SimplifyPath = [&] { +if (Opts.AbsolutePath) + return true; + +// Try to simplify paths that contain '..' in any case since paths to +// standard library headers especially tend to get quite long otherwise. +// Only do that for local filesystems though to avoid slowing down +// compilation too much. +if (!File->getName().contains("..")) + return false; + +// If we're not on Windows, check if we're on a network file system and +// avoid simplifying the path in that case since that can be slow. On +// Windows, the check for a local filesystem is already slow, so skip it. +#ifndef _WIN32 +if (!llvm::sys::fs::is_local(File->getName())) + return false; +#endif + +return true; + }(); + + if (!SimplifyPath) +return Filename; Sirraide wrote: Hmm, not sure. This is a `perf/` branch anyway so when I’m done fixing the tests (I think it’s just a clang-tidy test that’s left at this point?) we can just check if there are any regressions and try doing this instead if so. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
@@ -2403,3 +2403,75 @@ SourceManagerForFile::SourceManagerForFile(StringRef FileName, assert(ID.isValid()); SourceMgr->setMainFileID(ID); } + +StringRef +SourceManager::getNameForDiagnostic(StringRef Filename, +const DiagnosticOptions &Opts) const { + OptionalFileEntryRef File = getFileManager().getOptionalFileRef(Filename); + if (!File) +return Filename; + + bool SimplifyPath = [&] { +if (Opts.AbsolutePath) + return true; + +// Try to simplify paths that contain '..' in any case since paths to +// standard library headers especially tend to get quite long otherwise. +// Only do that for local filesystems though to avoid slowing down +// compilation too much. +if (!File->getName().contains("..")) + return false; + +// If we're not on Windows, check if we're on a network file system and +// avoid simplifying the path in that case since that can be slow. On +// Windows, the check for a local filesystem is already slow, so skip it. +#ifndef _WIN32 +if (!llvm::sys::fs::is_local(File->getName())) + return false; +#endif + +return true; + }(); + + if (!SimplifyPath) +return Filename; + + // This may involve computing canonical names, so cache the result. + StringRef &CacheEntry = DiagNames[Filename]; + if (!CacheEntry.empty()) +return CacheEntry; + + // We want to print a simplified absolute path, i. e. without "dots". + // + // The hardest part here are the paths like "//../". + // On Unix-like systems, we cannot just collapse "/..", because + // paths are resolved sequentially, and, thereby, the path + // "/" may point to a different location. That is why + // we use FileManager::getCanonicalName(), which expands all indirections + // with llvm::sys::fs::real_path() and caches the result. + // + // On the other hand, it would be better to preserve as much of the + // original path as possible, because that helps a user to recognize it. + // real_path() expands all links, which sometimes too much. Luckily, + // on Windows we can just use llvm::sys::path::remove_dots(), because, + // on that system, both aforementioned paths point to the same place. + SmallString<256> TempBuf; +#ifdef _WIN32 + TempBuf = File->getName(); + llvm::sys::fs::make_absolute(TempBuf); + llvm::sys::path::native(TempBuf); + llvm::sys::path::remove_dots(TempBuf, /* remove_dot_dot */ true); +#else + TempBuf = getFileManager().getCanonicalName(*File); +#endif Sirraide wrote: Hmm, I was interpreting the comment above this to mean that Windows itself doesn’t care about symlinks when resolving `..` and actually just deletes preceding path segments, but I’m not much of a Windows person so I don’t know to be fair... https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
@@ -35,6 +35,7 @@ namespace clang { class TextDiagnostic : public DiagnosticRenderer { raw_ostream &OS; const Preprocessor *PP; + llvm::StringMap> SimplifiedFileNameCache; Sirraide wrote: Not anymore; forgot to remove that. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
@@ -2403,3 +2403,75 @@ SourceManagerForFile::SourceManagerForFile(StringRef FileName, assert(ID.isValid()); SourceMgr->setMainFileID(ID); } + +StringRef +SourceManager::getNameForDiagnostic(StringRef Filename, +const DiagnosticOptions &Opts) const { + OptionalFileEntryRef File = getFileManager().getOptionalFileRef(Filename); + if (!File) +return Filename; + + bool SimplifyPath = [&] { +if (Opts.AbsolutePath) + return true; + +// Try to simplify paths that contain '..' in any case since paths to +// standard library headers especially tend to get quite long otherwise. +// Only do that for local filesystems though to avoid slowing down +// compilation too much. +if (!File->getName().contains("..")) + return false; + +// If we're not on Windows, check if we're on a network file system and +// avoid simplifying the path in that case since that can be slow. On +// Windows, the check for a local filesystem is already slow, so skip it. +#ifndef _WIN32 +if (!llvm::sys::fs::is_local(File->getName())) + return false; +#endif + +return true; + }(); + + if (!SimplifyPath) +return Filename; + + // This may involve computing canonical names, so cache the result. + StringRef &CacheEntry = DiagNames[Filename]; + if (!CacheEntry.empty()) +return CacheEntry; + + // We want to print a simplified absolute path, i. e. without "dots". + // + // The hardest part here are the paths like "//../". + // On Unix-like systems, we cannot just collapse "/..", because + // paths are resolved sequentially, and, thereby, the path + // "/" may point to a different location. That is why + // we use FileManager::getCanonicalName(), which expands all indirections + // with llvm::sys::fs::real_path() and caches the result. + // + // On the other hand, it would be better to preserve as much of the + // original path as possible, because that helps a user to recognize it. + // real_path() expands all links, which sometimes too much. Luckily, + // on Windows we can just use llvm::sys::path::remove_dots(), because, + // on that system, both aforementioned paths point to the same place. + SmallString<256> TempBuf; +#ifdef _WIN32 + TempBuf = File->getName(); + llvm::sys::fs::make_absolute(TempBuf); + llvm::sys::path::native(TempBuf); + llvm::sys::path::remove_dots(TempBuf, /* remove_dot_dot */ true); +#else + TempBuf = getFileManager().getCanonicalName(*File); +#endif cor3ntin wrote: Sort of pre-existing, but I am not sure doing something different for windows actually make sense. Symlinks on Windows exist, they are just very rare. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
@@ -35,6 +35,7 @@ namespace clang { class TextDiagnostic : public DiagnosticRenderer { raw_ostream &OS; const Preprocessor *PP; + llvm::StringMap> SimplifiedFileNameCache; cor3ntin wrote: Is this used? https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
@@ -2403,3 +2403,75 @@ SourceManagerForFile::SourceManagerForFile(StringRef FileName, assert(ID.isValid()); SourceMgr->setMainFileID(ID); } + +StringRef +SourceManager::getNameForDiagnostic(StringRef Filename, +const DiagnosticOptions &Opts) const { + OptionalFileEntryRef File = getFileManager().getOptionalFileRef(Filename); + if (!File) +return Filename; + + bool SimplifyPath = [&] { +if (Opts.AbsolutePath) + return true; + +// Try to simplify paths that contain '..' in any case since paths to +// standard library headers especially tend to get quite long otherwise. +// Only do that for local filesystems though to avoid slowing down +// compilation too much. +if (!File->getName().contains("..")) + return false; + +// If we're not on Windows, check if we're on a network file system and +// avoid simplifying the path in that case since that can be slow. On +// Windows, the check for a local filesystem is already slow, so skip it. +#ifndef _WIN32 +if (!llvm::sys::fs::is_local(File->getName())) + return false; +#endif + +return true; + }(); + + if (!SimplifyPath) +return Filename; cor3ntin wrote: I wonder if it would be more efficient to check the map first https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
https://github.com/AaronBallman commented: In general, I'm in favor of this patch. Precommit CI found relevant failures that need to be fixed, but I think this is otherwise good to go once those are addressed. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream &OS, } void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) { -#ifdef _WIN32 - SmallString<4096> TmpFilename; -#endif - if (DiagOpts.AbsolutePath) { -auto File = SM.getFileManager().getOptionalFileRef(Filename); -if (File) { + auto File = SM.getFileManager().getOptionalFileRef(Filename); + + // Try to simplify paths that contain '..' in any case since paths to + // standard library headers especially tend to get quite long otherwise. + // Only do that for local filesystems though to avoid slowing down + // compilation too much. + auto AlwaysSimplify = [&] { +return File->getName().contains("..") && + llvm::sys::fs::is_local(File->getName()); + }; Sirraide wrote: Hmm, on my system the difference was about 20ms (180ms vs 160ms), so I think there may be a performance difference, but it’s really not that bad. Also, this stress test is resolving 1000 different paths which is... hopefully a lot more than any user would ever see in a single compilation. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream &OS, } void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) { -#ifdef _WIN32 - SmallString<4096> TmpFilename; -#endif - if (DiagOpts.AbsolutePath) { -auto File = SM.getFileManager().getOptionalFileRef(Filename); -if (File) { + auto File = SM.getFileManager().getOptionalFileRef(Filename); + + // Try to simplify paths that contain '..' in any case since paths to + // standard library headers especially tend to get quite long otherwise. + // Only do that for local filesystems though to avoid slowing down + // compilation too much. + auto AlwaysSimplify = [&] { +return File->getName().contains("..") && + llvm::sys::fs::is_local(File->getName()); + }; AaronBallman wrote: Without your patch, the average is consistently just shy of 7 seconds With your patch, the average is kind of hard to trust; I've seen it as low as 5 seconds and as high as 9 seconds. Overall, I think this means I'm either not testing what I think I'm testing or the performance changes aren't significant. If I was consistently getting 9 second times, I think that's demonstrate something tangible. WDYT? https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
Sirraide wrote: Actually, we could also just put it in `SourceManager` because that already has a reference to the `DiagnosticsEngine` and then a single `getNameForDiagnostic(StringRef Filename)` function would do. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream &OS, } void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) { -#ifdef _WIN32 - SmallString<4096> TmpFilename; -#endif - if (DiagOpts.AbsolutePath) { -auto File = SM.getFileManager().getOptionalFileRef(Filename); -if (File) { + auto File = SM.getFileManager().getOptionalFileRef(Filename); + + // Try to simplify paths that contain '..' in any case since paths to + // standard library headers especially tend to get quite long otherwise. + // Only do that for local filesystems though to avoid slowing down + // compilation too much. + auto AlwaysSimplify = [&] { +return File->getName().contains("..") && + llvm::sys::fs::is_local(File->getName()); + }; Sirraide wrote: I might be able to generate something real quick. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
Sirraide wrote: > The downside to it being in `TextDiagnostic` is that consumers then all have > to normalize the path themselves (some file system APIs on some systems are > better about relative paths than others). If the paths are always equivalent, > it might be kinder to pass the resolved path. WDYT? I mean, that’s also true I suppose; the only thing is then that we’d be normalising them twice if `-fdiagnostics-absolute-paths` is passed—unless we move the handling for that elsewhere as well, but now that’s dependent on the diagnostic options, so it probably shouldn’t be in `FileManager`—which leaves `DiagnosticsEngine`? But consumers don’t generally have access to the `DiagnosticsEngine`, so it’d have to be in the `FileManager` after all. I guess we could *always* compute both the absolute and the ‘short’ path for a file whenever `FileManager` opens one so that they’re always available. But that might have some impact on performance (though I guess this is a `perf/` branch already so we can try and see how it goes)? @AaronBallman Thoughts? https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream &OS, } void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) { -#ifdef _WIN32 - SmallString<4096> TmpFilename; -#endif - if (DiagOpts.AbsolutePath) { -auto File = SM.getFileManager().getOptionalFileRef(Filename); -if (File) { + auto File = SM.getFileManager().getOptionalFileRef(Filename); + + // Try to simplify paths that contain '..' in any case since paths to + // standard library headers especially tend to get quite long otherwise. + // Only do that for local filesystems though to avoid slowing down + // compilation too much. + auto AlwaysSimplify = [&] { +return File->getName().contains("..") && + llvm::sys::fs::is_local(File->getName()); + }; AaronBallman wrote: If we can come up with a good stress test, I might be able to try it out for you on my Windows machine and report back timings for my setup. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream &OS, } void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) { -#ifdef _WIN32 - SmallString<4096> TmpFilename; -#endif - if (DiagOpts.AbsolutePath) { -auto File = SM.getFileManager().getOptionalFileRef(Filename); -if (File) { + auto File = SM.getFileManager().getOptionalFileRef(Filename); + + // Try to simplify paths that contain '..' in any case since paths to + // standard library headers especially tend to get quite long otherwise. + // Only do that for local filesystems though to avoid slowing down + // compilation too much. + auto AlwaysSimplify = [&] { +return File->getName().contains("..") && + llvm::sys::fs::is_local(File->getName()); + }; Sirraide wrote: Good point; I’ll look into what MSDN has to say about those functions. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream &OS, } void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) { -#ifdef _WIN32 - SmallString<4096> TmpFilename; -#endif - if (DiagOpts.AbsolutePath) { -auto File = SM.getFileManager().getOptionalFileRef(Filename); -if (File) { + auto File = SM.getFileManager().getOptionalFileRef(Filename); + + // Try to simplify paths that contain '..' in any case since paths to + // standard library headers especially tend to get quite long otherwise. + // Only do that for local filesystems though to avoid slowing down + // compilation too much. + auto AlwaysSimplify = [&] { +return File->getName().contains("..") && + llvm::sys::fs::is_local(File->getName()); + }; AaronBallman wrote: Are we sure that `is_local()` isn't going to cause a performance issue by itself? I notice the Windows implementation looks like it could be expensive: https://github.com/llvm/llvm-project/blob/bdcbe67400b8b3d31d89df6c74d06ab80d8c7862/llvm/lib/Support/Windows/Path.inc#L313 https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
AaronBallman wrote: > I’m not exactly sure how to test this change since this is not only > platform-dependent but also path-dependent since we may end up producing > absolute paths here. I think this is a case where maybe we want to use unit tests. We have `clang/unittests/Basic/DiagnosticTest.cpp` or `FileManagerTest.cpp` already, so perhaps in one of those (depending on where the functionality ends up living)? https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
https://github.com/Sirraide edited https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
Sirraide wrote: I’m not exactly sure how to test this change since this is not only platform-dependent but also path-dependent since we may end up producing absolute paths here. https://github.com/llvm/llvm-project/pull/143520 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)
https://github.com/Sirraide updated https://github.com/llvm/llvm-project/pull/143520 >From 15c0a79d6a0cd65d88fbe152275b224201e632a1 Mon Sep 17 00:00:00 2001 From: Sirraide Date: Tue, 10 Jun 2025 14:32:32 +0200 Subject: [PATCH 1/2] [Clang] [Diagnostics] Simplify filenames that contain '..' --- clang/include/clang/Frontend/TextDiagnostic.h | 1 + clang/lib/Frontend/TextDiagnostic.cpp | 32 --- 2 files changed, 21 insertions(+), 12 deletions(-) diff --git a/clang/include/clang/Frontend/TextDiagnostic.h b/clang/include/clang/Frontend/TextDiagnostic.h index e2e88d4d648a2..9c77bc3e00e19 100644 --- a/clang/include/clang/Frontend/TextDiagnostic.h +++ b/clang/include/clang/Frontend/TextDiagnostic.h @@ -35,6 +35,7 @@ namespace clang { class TextDiagnostic : public DiagnosticRenderer { raw_ostream &OS; const Preprocessor *PP; + llvm::StringMap> SimplifiedFileNameCache; public: TextDiagnostic(raw_ostream &OS, const LangOptions &LangOpts, diff --git a/clang/lib/Frontend/TextDiagnostic.cpp b/clang/lib/Frontend/TextDiagnostic.cpp index b9e681b52e509..edbad42b39950 100644 --- a/clang/lib/Frontend/TextDiagnostic.cpp +++ b/clang/lib/Frontend/TextDiagnostic.cpp @@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream &OS, } void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) { -#ifdef _WIN32 - SmallString<4096> TmpFilename; -#endif - if (DiagOpts.AbsolutePath) { -auto File = SM.getFileManager().getOptionalFileRef(Filename); -if (File) { + auto File = SM.getFileManager().getOptionalFileRef(Filename); + + // Try to simplify paths that contain '..' in any case since paths to + // standard library headers especially tend to get quite long otherwise. + // Only do that for local filesystems though to avoid slowing down + // compilation too much. + auto AlwaysSimplify = [&] { +return File->getName().contains("..") && + llvm::sys::fs::is_local(File->getName()); + }; + + if (File && (DiagOpts.AbsolutePath || AlwaysSimplify())) { +SmallString<128> &CacheEntry = SimplifiedFileNameCache[Filename]; +if (CacheEntry.empty()) { // We want to print a simplified absolute path, i. e. without "dots". // // The hardest part here are the paths like "//../". @@ -759,15 +767,15 @@ void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) { // on Windows we can just use llvm::sys::path::remove_dots(), because, // on that system, both aforementioned paths point to the same place. #ifdef _WIN32 - TmpFilename = File->getName(); - llvm::sys::fs::make_absolute(TmpFilename); - llvm::sys::path::native(TmpFilename); - llvm::sys::path::remove_dots(TmpFilename, /* remove_dot_dot */ true); - Filename = StringRef(TmpFilename.data(), TmpFilename.size()); + CacheEntry = File->getName(); + llvm::sys::fs::make_absolute(CacheEntry); + llvm::sys::path::native(CacheEntry); + llvm::sys::path::remove_dots(CacheEntry, /* remove_dot_dot */ true); #else - Filename = SM.getFileManager().getCanonicalName(*File); + CacheEntry = SM.getFileManager().getCanonicalName(*File); #endif } +Filename = CacheEntry; } OS << Filename; >From 97c0accf4fb3dc983bbb9f344ad66d0281b231fb Mon Sep 17 00:00:00 2001 From: Sirraide Date: Tue, 10 Jun 2025 14:47:43 +0200 Subject: [PATCH 2/2] Use whichever path ends up being shorter --- clang/lib/Frontend/TextDiagnostic.cpp | 6 ++ 1 file changed, 6 insertions(+) diff --git a/clang/lib/Frontend/TextDiagnostic.cpp b/clang/lib/Frontend/TextDiagnostic.cpp index edbad42b39950..b77c1797c51c5 100644 --- a/clang/lib/Frontend/TextDiagnostic.cpp +++ b/clang/lib/Frontend/TextDiagnostic.cpp @@ -774,6 +774,12 @@ void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) { #else CacheEntry = SM.getFileManager().getCanonicalName(*File); #endif + + // In some cases, the resolved path may actually end up being longer (e.g. + // if it was originally a relative path), so just retain whichever one + // ends up being shorter. + if (!DiagOpts.AbsolutePath && CacheEntry.size() > Filename.size()) +CacheEntry = Filename; } Filename = CacheEntry; } ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits