[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-04 Thread via cfe-commits

Sirraide wrote:

> In that case WYGIWYD (what you get is what you deserve) applies.

Sgtm; I’ll leave the clang-tidy side of things as-is then and just update the 
test to reflect this.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-03 Thread Richard Thomson via cfe-commits

LegalizeAdulthood wrote:

> What about symlinks though?

In that case WYGIWYD (what you get is what you deserve) applies.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-03 Thread Aaron Ballman via cfe-commits

AaronBallman wrote:

> My expectation would be that if I specify a header filter I'm not going to 
> use weird paths like a/b/../foo.h, but just a/foo.h because that is where 
> foo.h lives.

What about symlinks though? Would you expect that passing `path/to/file` fails 
because `path` is a symlink and you should have specified 
`/var/foo/bar/to/file`? (It's basically the same problem -- do we make the user 
pass the resolved path or do we canonicalize the path for the user and use that 
to do the filtering?)

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-02 Thread Richard Thomson via cfe-commits

LegalizeAdulthood wrote:

My expectation would be that if I specify a header filter I'm not going to use 
weird paths like a/b/../foo.h, but just a/foo.h because that is where foo.h 
lives.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-02 Thread Aaron Ballman via cfe-commits

AaronBallman wrote:

> Ok, looks like the clang-tidy test failure is related to the `-header-filter` 
> option:
> 
> ```c++
> // Check that `-header-filter` operates on the same file paths as paths in
> // diagnostics printed by ClangTidy.
> #include "dir1/dir2/../header_alias.h"
> // CHECK_HEADER_ALIAS: dir1/dir2/../header_alias.h:1:11: warning: 
> single-argument constructors
> ```
> 
> So, I guess my question is now, should the header filter apply to the 
> original filename, the simplified filename, or both?
> 
> @AaronBallman Thoughts? Or alternatively who do best ping for clang-tidy 
> related questions?

CC @5chmidti @PiotrZSL @HerrCai0907 @LegalizeAdulthood  for more opinions on 
this

I would naively expect that I'm giving the tool a path, the tool will resolve 
all symlinks and relative parts, etc for me same as it would do when specifying 
the file to compile/tidy.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-02 Thread via cfe-commits

Sirraide wrote:

> should the header filter apply to the original filename,

I mean, I guess this beacuse it’s what the user specified and it’s what we’re 
currently doing? It’s just that the end result might be weird, e.g. if a user 
writes `-exclude-header=filter='a/foo.h'` and then we print diagnostics in 
`'a/foo.h'` because it was actually included via `"a/b/../foo.h"` (assuming I’m 
not misinterpreting what this option does), but maybe that’s ok?

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-02 Thread via cfe-commits

https://github.com/Sirraide updated 
https://github.com/llvm/llvm-project/pull/143520

>From 15c0a79d6a0cd65d88fbe152275b224201e632a1 Mon Sep 17 00:00:00 2001
From: Sirraide 
Date: Tue, 10 Jun 2025 14:32:32 +0200
Subject: [PATCH 1/5] [Clang] [Diagnostics] Simplify filenames that contain
 '..'

---
 clang/include/clang/Frontend/TextDiagnostic.h |  1 +
 clang/lib/Frontend/TextDiagnostic.cpp | 32 ---
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/clang/include/clang/Frontend/TextDiagnostic.h 
b/clang/include/clang/Frontend/TextDiagnostic.h
index e2e88d4d648a2..9c77bc3e00e19 100644
--- a/clang/include/clang/Frontend/TextDiagnostic.h
+++ b/clang/include/clang/Frontend/TextDiagnostic.h
@@ -35,6 +35,7 @@ namespace clang {
 class TextDiagnostic : public DiagnosticRenderer {
   raw_ostream &OS;
   const Preprocessor *PP;
+  llvm::StringMap> SimplifiedFileNameCache;
 
 public:
   TextDiagnostic(raw_ostream &OS, const LangOptions &LangOpts,
diff --git a/clang/lib/Frontend/TextDiagnostic.cpp 
b/clang/lib/Frontend/TextDiagnostic.cpp
index b9e681b52e509..edbad42b39950 100644
--- a/clang/lib/Frontend/TextDiagnostic.cpp
+++ b/clang/lib/Frontend/TextDiagnostic.cpp
@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream 
&OS,
 }
 
 void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) 
{
-#ifdef _WIN32
-  SmallString<4096> TmpFilename;
-#endif
-  if (DiagOpts.AbsolutePath) {
-auto File = SM.getFileManager().getOptionalFileRef(Filename);
-if (File) {
+  auto File = SM.getFileManager().getOptionalFileRef(Filename);
+
+  // Try to simplify paths that contain '..' in any case since paths to
+  // standard library headers especially tend to get quite long otherwise.
+  // Only do that for local filesystems though to avoid slowing down
+  // compilation too much.
+  auto AlwaysSimplify = [&] {
+return File->getName().contains("..") &&
+   llvm::sys::fs::is_local(File->getName());
+  };
+
+  if (File && (DiagOpts.AbsolutePath || AlwaysSimplify())) {
+SmallString<128> &CacheEntry = SimplifiedFileNameCache[Filename];
+if (CacheEntry.empty()) {
   // We want to print a simplified absolute path, i. e. without "dots".
   //
   // The hardest part here are the paths like "//../".
@@ -759,15 +767,15 @@ void TextDiagnostic::emitFilename(StringRef Filename, 
const SourceManager &SM) {
   // on Windows we can just use llvm::sys::path::remove_dots(), because,
   // on that system, both aforementioned paths point to the same place.
 #ifdef _WIN32
-  TmpFilename = File->getName();
-  llvm::sys::fs::make_absolute(TmpFilename);
-  llvm::sys::path::native(TmpFilename);
-  llvm::sys::path::remove_dots(TmpFilename, /* remove_dot_dot */ true);
-  Filename = StringRef(TmpFilename.data(), TmpFilename.size());
+  CacheEntry = File->getName();
+  llvm::sys::fs::make_absolute(CacheEntry);
+  llvm::sys::path::native(CacheEntry);
+  llvm::sys::path::remove_dots(CacheEntry, /* remove_dot_dot */ true);
 #else
-  Filename = SM.getFileManager().getCanonicalName(*File);
+  CacheEntry = SM.getFileManager().getCanonicalName(*File);
 #endif
 }
+Filename = CacheEntry;
   }
 
   OS << Filename;

>From 97c0accf4fb3dc983bbb9f344ad66d0281b231fb Mon Sep 17 00:00:00 2001
From: Sirraide 
Date: Tue, 10 Jun 2025 14:47:43 +0200
Subject: [PATCH 2/5] Use whichever path ends up being shorter

---
 clang/lib/Frontend/TextDiagnostic.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/clang/lib/Frontend/TextDiagnostic.cpp 
b/clang/lib/Frontend/TextDiagnostic.cpp
index edbad42b39950..b77c1797c51c5 100644
--- a/clang/lib/Frontend/TextDiagnostic.cpp
+++ b/clang/lib/Frontend/TextDiagnostic.cpp
@@ -774,6 +774,12 @@ void TextDiagnostic::emitFilename(StringRef Filename, 
const SourceManager &SM) {
 #else
   CacheEntry = SM.getFileManager().getCanonicalName(*File);
 #endif
+
+  // In some cases, the resolved path may actually end up being longer 
(e.g.
+  // if it was originally a relative path), so just retain whichever one
+  // ends up being shorter.
+  if (!DiagOpts.AbsolutePath && CacheEntry.size() > Filename.size())
+CacheEntry = Filename;
 }
 Filename = CacheEntry;
   }

>From 529aac1a9ae4969b4389156cb5d44fcecd445cbc Mon Sep 17 00:00:00 2001
From: Sirraide 
Date: Tue, 10 Jun 2025 20:36:14 +0200
Subject: [PATCH 3/5] Move path computation into SourceManager

---
 clang/include/clang/Basic/SourceManager.h |  9 
 clang/lib/Basic/SourceManager.cpp | 55 +++
 clang/lib/Frontend/SARIFDiagnostic.cpp| 31 +
 clang/lib/Frontend/TextDiagnostic.cpp | 48 +---
 4 files changed, 66 insertions(+), 77 deletions(-)

diff --git a/clang/include/clang/Basic/SourceManager.h 
b/clang/include/clang/Basic/SourceManager.h
index cd3dac9133223..92080e7891eca 100644
--- a/clang/include/clang/Basic/SourceManager.h

[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-02 Thread via cfe-commits

Sirraide wrote:

Ok, looks like the clang-tidy test failure is related to the `-header-filter` 
option:
```C++
// Check that `-header-filter` operates on the same file paths as paths in
// diagnostics printed by ClangTidy.
#include "dir1/dir2/../header_alias.h"
// CHECK_HEADER_ALIAS: dir1/dir2/../header_alias.h:1:11: warning: 
single-argument constructors
```

So, I guess my question is now, should the header filter apply to the original 
filename, the simplified filename, or both?

@AaronBallman Thoughts? Or alternatively who do best ping for clang-tidy 
related questions?

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-02 Thread via cfe-commits


@@ -2403,3 +2403,75 @@ SourceManagerForFile::SourceManagerForFile(StringRef 
FileName,
   assert(ID.isValid());
   SourceMgr->setMainFileID(ID);
 }
+
+StringRef
+SourceManager::getNameForDiagnostic(StringRef Filename,
+const DiagnosticOptions &Opts) const {
+  OptionalFileEntryRef File = getFileManager().getOptionalFileRef(Filename);
+  if (!File)
+return Filename;
+
+  bool SimplifyPath = [&] {
+if (Opts.AbsolutePath)
+  return true;
+
+// Try to simplify paths that contain '..' in any case since paths to
+// standard library headers especially tend to get quite long otherwise.
+// Only do that for local filesystems though to avoid slowing down
+// compilation too much.
+if (!File->getName().contains(".."))
+  return false;
+
+// If we're not on Windows, check if we're on a network file system and
+// avoid simplifying the path in that case since that can be slow. On
+// Windows, the check for a local filesystem is already slow, so skip it.
+#ifndef _WIN32
+if (!llvm::sys::fs::is_local(File->getName()))
+  return false;
+#endif
+
+return true;
+  }();
+
+  if (!SimplifyPath)
+return Filename;

Sirraide wrote:

Hmm, not sure. This is a `perf/` branch anyway so when I’m done fixing the 
tests (I think it’s just a clang-tidy test that’s left at this point?) we can 
just check if there are any regressions and try doing this instead if so.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-02 Thread via cfe-commits


@@ -2403,3 +2403,75 @@ SourceManagerForFile::SourceManagerForFile(StringRef 
FileName,
   assert(ID.isValid());
   SourceMgr->setMainFileID(ID);
 }
+
+StringRef
+SourceManager::getNameForDiagnostic(StringRef Filename,
+const DiagnosticOptions &Opts) const {
+  OptionalFileEntryRef File = getFileManager().getOptionalFileRef(Filename);
+  if (!File)
+return Filename;
+
+  bool SimplifyPath = [&] {
+if (Opts.AbsolutePath)
+  return true;
+
+// Try to simplify paths that contain '..' in any case since paths to
+// standard library headers especially tend to get quite long otherwise.
+// Only do that for local filesystems though to avoid slowing down
+// compilation too much.
+if (!File->getName().contains(".."))
+  return false;
+
+// If we're not on Windows, check if we're on a network file system and
+// avoid simplifying the path in that case since that can be slow. On
+// Windows, the check for a local filesystem is already slow, so skip it.
+#ifndef _WIN32
+if (!llvm::sys::fs::is_local(File->getName()))
+  return false;
+#endif
+
+return true;
+  }();
+
+  if (!SimplifyPath)
+return Filename;
+
+  // This may involve computing canonical names, so cache the result.
+  StringRef &CacheEntry = DiagNames[Filename];
+  if (!CacheEntry.empty())
+return CacheEntry;
+
+  // We want to print a simplified absolute path, i. e. without "dots".
+  //
+  // The hardest part here are the paths like "//../".
+  // On Unix-like systems, we cannot just collapse "/..", because
+  // paths are resolved sequentially, and, thereby, the path
+  // "/" may point to a different location. That is why
+  // we use FileManager::getCanonicalName(), which expands all indirections
+  // with llvm::sys::fs::real_path() and caches the result.
+  //
+  // On the other hand, it would be better to preserve as much of the
+  // original path as possible, because that helps a user to recognize it.
+  // real_path() expands all links, which sometimes too much. Luckily,
+  // on Windows we can just use llvm::sys::path::remove_dots(), because,
+  // on that system, both aforementioned paths point to the same place.
+  SmallString<256> TempBuf;
+#ifdef _WIN32
+  TempBuf = File->getName();
+  llvm::sys::fs::make_absolute(TempBuf);
+  llvm::sys::path::native(TempBuf);
+  llvm::sys::path::remove_dots(TempBuf, /* remove_dot_dot */ true);
+#else
+  TempBuf = getFileManager().getCanonicalName(*File);
+#endif

Sirraide wrote:

Hmm, I was interpreting the comment above this to mean that Windows itself 
doesn’t care about symlinks when resolving `..` and actually just deletes 
preceding path segments, but I’m not much of a Windows person so I don’t know 
to be fair...

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-02 Thread via cfe-commits


@@ -35,6 +35,7 @@ namespace clang {
 class TextDiagnostic : public DiagnosticRenderer {
   raw_ostream &OS;
   const Preprocessor *PP;
+  llvm::StringMap> SimplifiedFileNameCache;

Sirraide wrote:

Not anymore; forgot to remove that.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-02 Thread Corentin Jabot via cfe-commits


@@ -2403,3 +2403,75 @@ SourceManagerForFile::SourceManagerForFile(StringRef 
FileName,
   assert(ID.isValid());
   SourceMgr->setMainFileID(ID);
 }
+
+StringRef
+SourceManager::getNameForDiagnostic(StringRef Filename,
+const DiagnosticOptions &Opts) const {
+  OptionalFileEntryRef File = getFileManager().getOptionalFileRef(Filename);
+  if (!File)
+return Filename;
+
+  bool SimplifyPath = [&] {
+if (Opts.AbsolutePath)
+  return true;
+
+// Try to simplify paths that contain '..' in any case since paths to
+// standard library headers especially tend to get quite long otherwise.
+// Only do that for local filesystems though to avoid slowing down
+// compilation too much.
+if (!File->getName().contains(".."))
+  return false;
+
+// If we're not on Windows, check if we're on a network file system and
+// avoid simplifying the path in that case since that can be slow. On
+// Windows, the check for a local filesystem is already slow, so skip it.
+#ifndef _WIN32
+if (!llvm::sys::fs::is_local(File->getName()))
+  return false;
+#endif
+
+return true;
+  }();
+
+  if (!SimplifyPath)
+return Filename;
+
+  // This may involve computing canonical names, so cache the result.
+  StringRef &CacheEntry = DiagNames[Filename];
+  if (!CacheEntry.empty())
+return CacheEntry;
+
+  // We want to print a simplified absolute path, i. e. without "dots".
+  //
+  // The hardest part here are the paths like "//../".
+  // On Unix-like systems, we cannot just collapse "/..", because
+  // paths are resolved sequentially, and, thereby, the path
+  // "/" may point to a different location. That is why
+  // we use FileManager::getCanonicalName(), which expands all indirections
+  // with llvm::sys::fs::real_path() and caches the result.
+  //
+  // On the other hand, it would be better to preserve as much of the
+  // original path as possible, because that helps a user to recognize it.
+  // real_path() expands all links, which sometimes too much. Luckily,
+  // on Windows we can just use llvm::sys::path::remove_dots(), because,
+  // on that system, both aforementioned paths point to the same place.
+  SmallString<256> TempBuf;
+#ifdef _WIN32
+  TempBuf = File->getName();
+  llvm::sys::fs::make_absolute(TempBuf);
+  llvm::sys::path::native(TempBuf);
+  llvm::sys::path::remove_dots(TempBuf, /* remove_dot_dot */ true);
+#else
+  TempBuf = getFileManager().getCanonicalName(*File);
+#endif

cor3ntin wrote:

Sort of pre-existing, but I am not sure doing something different for windows 
actually make sense.
Symlinks on Windows exist, they are just very rare.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-02 Thread Corentin Jabot via cfe-commits


@@ -35,6 +35,7 @@ namespace clang {
 class TextDiagnostic : public DiagnosticRenderer {
   raw_ostream &OS;
   const Preprocessor *PP;
+  llvm::StringMap> SimplifiedFileNameCache;

cor3ntin wrote:

Is this used?

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-07-02 Thread Corentin Jabot via cfe-commits


@@ -2403,3 +2403,75 @@ SourceManagerForFile::SourceManagerForFile(StringRef 
FileName,
   assert(ID.isValid());
   SourceMgr->setMainFileID(ID);
 }
+
+StringRef
+SourceManager::getNameForDiagnostic(StringRef Filename,
+const DiagnosticOptions &Opts) const {
+  OptionalFileEntryRef File = getFileManager().getOptionalFileRef(Filename);
+  if (!File)
+return Filename;
+
+  bool SimplifyPath = [&] {
+if (Opts.AbsolutePath)
+  return true;
+
+// Try to simplify paths that contain '..' in any case since paths to
+// standard library headers especially tend to get quite long otherwise.
+// Only do that for local filesystems though to avoid slowing down
+// compilation too much.
+if (!File->getName().contains(".."))
+  return false;
+
+// If we're not on Windows, check if we're on a network file system and
+// avoid simplifying the path in that case since that can be slow. On
+// Windows, the check for a local filesystem is already slow, so skip it.
+#ifndef _WIN32
+if (!llvm::sys::fs::is_local(File->getName()))
+  return false;
+#endif
+
+return true;
+  }();
+
+  if (!SimplifyPath)
+return Filename;

cor3ntin wrote:

I wonder if it would be more efficient to check the map first

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-26 Thread Aaron Ballman via cfe-commits

https://github.com/AaronBallman commented:

In general, I'm in favor of this patch. Precommit CI found relevant failures 
that need to be fixed, but I think this is otherwise good to go once those are 
addressed.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-10 Thread via cfe-commits


@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream 
&OS,
 }
 
 void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) 
{
-#ifdef _WIN32
-  SmallString<4096> TmpFilename;
-#endif
-  if (DiagOpts.AbsolutePath) {
-auto File = SM.getFileManager().getOptionalFileRef(Filename);
-if (File) {
+  auto File = SM.getFileManager().getOptionalFileRef(Filename);
+
+  // Try to simplify paths that contain '..' in any case since paths to
+  // standard library headers especially tend to get quite long otherwise.
+  // Only do that for local filesystems though to avoid slowing down
+  // compilation too much.
+  auto AlwaysSimplify = [&] {
+return File->getName().contains("..") &&
+   llvm::sys::fs::is_local(File->getName());
+  };

Sirraide wrote:

Hmm, on my system the difference was about 20ms (180ms vs 160ms), so I think 
there may be a performance difference, but it’s really not that bad. Also, this 
stress test is resolving 1000 different paths which is... hopefully a lot more 
than any user would ever see in a single compilation.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-10 Thread Aaron Ballman via cfe-commits


@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream 
&OS,
 }
 
 void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) 
{
-#ifdef _WIN32
-  SmallString<4096> TmpFilename;
-#endif
-  if (DiagOpts.AbsolutePath) {
-auto File = SM.getFileManager().getOptionalFileRef(Filename);
-if (File) {
+  auto File = SM.getFileManager().getOptionalFileRef(Filename);
+
+  // Try to simplify paths that contain '..' in any case since paths to
+  // standard library headers especially tend to get quite long otherwise.
+  // Only do that for local filesystems though to avoid slowing down
+  // compilation too much.
+  auto AlwaysSimplify = [&] {
+return File->getName().contains("..") &&
+   llvm::sys::fs::is_local(File->getName());
+  };

AaronBallman wrote:

Without your patch, the average is consistently just shy of 7 seconds
With your patch, the average is kind of hard to trust; I've seen it as low as 5 
seconds and as high as 9 seconds.

Overall, I think this means I'm either not testing what I think I'm testing or 
the performance changes aren't significant. If I was consistently getting 9 
second times, I think that's demonstrate something tangible.

WDYT?

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-10 Thread via cfe-commits

Sirraide wrote:

Actually, we could also just put it in `SourceManager` because that already has 
a reference to the `DiagnosticsEngine` and then a single 
`getNameForDiagnostic(StringRef Filename)` function would do. 

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-10 Thread via cfe-commits


@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream 
&OS,
 }
 
 void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) 
{
-#ifdef _WIN32
-  SmallString<4096> TmpFilename;
-#endif
-  if (DiagOpts.AbsolutePath) {
-auto File = SM.getFileManager().getOptionalFileRef(Filename);
-if (File) {
+  auto File = SM.getFileManager().getOptionalFileRef(Filename);
+
+  // Try to simplify paths that contain '..' in any case since paths to
+  // standard library headers especially tend to get quite long otherwise.
+  // Only do that for local filesystems though to avoid slowing down
+  // compilation too much.
+  auto AlwaysSimplify = [&] {
+return File->getName().contains("..") &&
+   llvm::sys::fs::is_local(File->getName());
+  };

Sirraide wrote:

I might be able to generate something real quick.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-10 Thread via cfe-commits

Sirraide wrote:

> The downside to it being in `TextDiagnostic` is that consumers then all have 
> to normalize the path themselves (some file system APIs on some systems are 
> better about relative paths than others). If the paths are always equivalent, 
> it might be kinder to pass the resolved path. WDYT?

I mean, that’s also true I suppose; the only thing is then that we’d be 
normalising them twice if `-fdiagnostics-absolute-paths` is passed—unless we 
move the handling for that elsewhere as well, but now that’s dependent on the 
diagnostic options, so it probably shouldn’t be in `FileManager`—which leaves 
`DiagnosticsEngine`? But consumers don’t generally have access to the 
`DiagnosticsEngine`, so it’d have to be in the `FileManager` after all. 

I guess we could *always* compute both the absolute and the ‘short’ path for a 
file whenever `FileManager` opens one so that they’re always available. But 
that might have some impact on performance (though I guess this is a `perf/` 
branch already so we can try and see how it goes)?

@AaronBallman Thoughts?

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-10 Thread Aaron Ballman via cfe-commits


@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream 
&OS,
 }
 
 void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) 
{
-#ifdef _WIN32
-  SmallString<4096> TmpFilename;
-#endif
-  if (DiagOpts.AbsolutePath) {
-auto File = SM.getFileManager().getOptionalFileRef(Filename);
-if (File) {
+  auto File = SM.getFileManager().getOptionalFileRef(Filename);
+
+  // Try to simplify paths that contain '..' in any case since paths to
+  // standard library headers especially tend to get quite long otherwise.
+  // Only do that for local filesystems though to avoid slowing down
+  // compilation too much.
+  auto AlwaysSimplify = [&] {
+return File->getName().contains("..") &&
+   llvm::sys::fs::is_local(File->getName());
+  };

AaronBallman wrote:

If we can come up with a good stress test, I might be able to try it out for 
you on my Windows machine and report back timings for my setup.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-10 Thread via cfe-commits


@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream 
&OS,
 }
 
 void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) 
{
-#ifdef _WIN32
-  SmallString<4096> TmpFilename;
-#endif
-  if (DiagOpts.AbsolutePath) {
-auto File = SM.getFileManager().getOptionalFileRef(Filename);
-if (File) {
+  auto File = SM.getFileManager().getOptionalFileRef(Filename);
+
+  // Try to simplify paths that contain '..' in any case since paths to
+  // standard library headers especially tend to get quite long otherwise.
+  // Only do that for local filesystems though to avoid slowing down
+  // compilation too much.
+  auto AlwaysSimplify = [&] {
+return File->getName().contains("..") &&
+   llvm::sys::fs::is_local(File->getName());
+  };

Sirraide wrote:

Good point; I’ll look into what MSDN has to say about those functions.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-10 Thread Aaron Ballman via cfe-commits


@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream 
&OS,
 }
 
 void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) 
{
-#ifdef _WIN32
-  SmallString<4096> TmpFilename;
-#endif
-  if (DiagOpts.AbsolutePath) {
-auto File = SM.getFileManager().getOptionalFileRef(Filename);
-if (File) {
+  auto File = SM.getFileManager().getOptionalFileRef(Filename);
+
+  // Try to simplify paths that contain '..' in any case since paths to
+  // standard library headers especially tend to get quite long otherwise.
+  // Only do that for local filesystems though to avoid slowing down
+  // compilation too much.
+  auto AlwaysSimplify = [&] {
+return File->getName().contains("..") &&
+   llvm::sys::fs::is_local(File->getName());
+  };

AaronBallman wrote:

Are we sure that `is_local()` isn't going to cause a performance issue by 
itself? I notice the Windows implementation looks like it could be expensive: 
https://github.com/llvm/llvm-project/blob/bdcbe67400b8b3d31d89df6c74d06ab80d8c7862/llvm/lib/Support/Windows/Path.inc#L313

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-10 Thread Aaron Ballman via cfe-commits

AaronBallman wrote:

> I’m not exactly sure how to test this change since this is not only 
> platform-dependent but also path-dependent since we may end up producing 
> absolute paths here.

I think this is a case where maybe we want to use unit tests. We have 
`clang/unittests/Basic/DiagnosticTest.cpp` or `FileManagerTest.cpp` already, so 
perhaps in one of those (depending on where the functionality ends up living)?

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-10 Thread via cfe-commits

https://github.com/Sirraide edited 
https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-10 Thread via cfe-commits

Sirraide wrote:

I’m not exactly sure how to test this change since this is not only 
platform-dependent but also path-dependent since we may end up producing 
absolute paths here.

https://github.com/llvm/llvm-project/pull/143520
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [Diagnostics] Simplify filenames that contain '..' (PR #143520)

2025-06-10 Thread via cfe-commits

https://github.com/Sirraide updated 
https://github.com/llvm/llvm-project/pull/143520

>From 15c0a79d6a0cd65d88fbe152275b224201e632a1 Mon Sep 17 00:00:00 2001
From: Sirraide 
Date: Tue, 10 Jun 2025 14:32:32 +0200
Subject: [PATCH 1/2] [Clang] [Diagnostics] Simplify filenames that contain
 '..'

---
 clang/include/clang/Frontend/TextDiagnostic.h |  1 +
 clang/lib/Frontend/TextDiagnostic.cpp | 32 ---
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/clang/include/clang/Frontend/TextDiagnostic.h 
b/clang/include/clang/Frontend/TextDiagnostic.h
index e2e88d4d648a2..9c77bc3e00e19 100644
--- a/clang/include/clang/Frontend/TextDiagnostic.h
+++ b/clang/include/clang/Frontend/TextDiagnostic.h
@@ -35,6 +35,7 @@ namespace clang {
 class TextDiagnostic : public DiagnosticRenderer {
   raw_ostream &OS;
   const Preprocessor *PP;
+  llvm::StringMap> SimplifiedFileNameCache;
 
 public:
   TextDiagnostic(raw_ostream &OS, const LangOptions &LangOpts,
diff --git a/clang/lib/Frontend/TextDiagnostic.cpp 
b/clang/lib/Frontend/TextDiagnostic.cpp
index b9e681b52e509..edbad42b39950 100644
--- a/clang/lib/Frontend/TextDiagnostic.cpp
+++ b/clang/lib/Frontend/TextDiagnostic.cpp
@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream 
&OS,
 }
 
 void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) 
{
-#ifdef _WIN32
-  SmallString<4096> TmpFilename;
-#endif
-  if (DiagOpts.AbsolutePath) {
-auto File = SM.getFileManager().getOptionalFileRef(Filename);
-if (File) {
+  auto File = SM.getFileManager().getOptionalFileRef(Filename);
+
+  // Try to simplify paths that contain '..' in any case since paths to
+  // standard library headers especially tend to get quite long otherwise.
+  // Only do that for local filesystems though to avoid slowing down
+  // compilation too much.
+  auto AlwaysSimplify = [&] {
+return File->getName().contains("..") &&
+   llvm::sys::fs::is_local(File->getName());
+  };
+
+  if (File && (DiagOpts.AbsolutePath || AlwaysSimplify())) {
+SmallString<128> &CacheEntry = SimplifiedFileNameCache[Filename];
+if (CacheEntry.empty()) {
   // We want to print a simplified absolute path, i. e. without "dots".
   //
   // The hardest part here are the paths like "//../".
@@ -759,15 +767,15 @@ void TextDiagnostic::emitFilename(StringRef Filename, 
const SourceManager &SM) {
   // on Windows we can just use llvm::sys::path::remove_dots(), because,
   // on that system, both aforementioned paths point to the same place.
 #ifdef _WIN32
-  TmpFilename = File->getName();
-  llvm::sys::fs::make_absolute(TmpFilename);
-  llvm::sys::path::native(TmpFilename);
-  llvm::sys::path::remove_dots(TmpFilename, /* remove_dot_dot */ true);
-  Filename = StringRef(TmpFilename.data(), TmpFilename.size());
+  CacheEntry = File->getName();
+  llvm::sys::fs::make_absolute(CacheEntry);
+  llvm::sys::path::native(CacheEntry);
+  llvm::sys::path::remove_dots(CacheEntry, /* remove_dot_dot */ true);
 #else
-  Filename = SM.getFileManager().getCanonicalName(*File);
+  CacheEntry = SM.getFileManager().getCanonicalName(*File);
 #endif
 }
+Filename = CacheEntry;
   }
 
   OS << Filename;

>From 97c0accf4fb3dc983bbb9f344ad66d0281b231fb Mon Sep 17 00:00:00 2001
From: Sirraide 
Date: Tue, 10 Jun 2025 14:47:43 +0200
Subject: [PATCH 2/2] Use whichever path ends up being shorter

---
 clang/lib/Frontend/TextDiagnostic.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/clang/lib/Frontend/TextDiagnostic.cpp 
b/clang/lib/Frontend/TextDiagnostic.cpp
index edbad42b39950..b77c1797c51c5 100644
--- a/clang/lib/Frontend/TextDiagnostic.cpp
+++ b/clang/lib/Frontend/TextDiagnostic.cpp
@@ -774,6 +774,12 @@ void TextDiagnostic::emitFilename(StringRef Filename, 
const SourceManager &SM) {
 #else
   CacheEntry = SM.getFileManager().getCanonicalName(*File);
 #endif
+
+  // In some cases, the resolved path may actually end up being longer 
(e.g.
+  // if it was originally a relative path), so just retain whichever one
+  // ends up being shorter.
+  if (!DiagOpts.AbsolutePath && CacheEntry.size() > Filename.size())
+CacheEntry = Filename;
 }
 Filename = CacheEntry;
   }

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits