https://github.com/DataCorrupted updated 
https://github.com/llvm/llvm-project/pull/197322

>From 97a81c9ca131ad76d8f3adb5fca04d5f7971b629 Mon Sep 17 00:00:00 2001
From: Peter Rong <[email protected]>
Date: Tue, 12 May 2026 15:35:39 -0700
Subject: [PATCH 1/2] [clang][SerializedDiagnostics] Make RECORD_FIXIT
 text-size VBR,16 instead of Fixed

The RECORD_DIAG abbreviation already uses VBR,16 for its text-size
field, but RECORD_FIXIT was left with Fixed,16. That mismatch silently
caps a single FixIt's CodeToInsert at 65535 bytes; on +assertions
builds, BitstreamWriter::Emit aborts with "High bits set!" when the
value exceeds that width.

In-tree warnings don't currently produce FixIts that large, so this
isn't a user-visible bug upstream today. Apple's downstream fork
(used by Swift toolchains and Xcode) does hit it: their refactoring
engine adds a "fill in missing switch cases" note that attaches a
single FixIt enumerating every uncovered enumerator. For an auto-
generated enum with thousands of values, that single CodeToInsert
grows past 64 KiB and crashes any compile that also passes
-serialize-diagnostic-file on a +assertions build.

Two changes:

  - SerializedDiagnosticPrinter.cpp: RECORD_FIXIT's text-size field
    becomes VBR,16, matching RECORD_DIAG. The abbreviation is
    serialized in the .dia file itself, so existing readers self-
    adjust; no version bump is required.

  - CXLoadedDiagnostic.cpp: drop the corresponding
    `if (CodeToInsert.size() > 65536)` defensive check in
    visitFixitRecord. It was added in 2011 (commit d010ba47294d) as
    a "Basic buffer overflow check" when the writer's Fixed,16 made
    larger sizes impossible; the 2014 NFC refactor that split it
    per-record-kind (commit f884723799951) already flagged it with
    `// FIXME: Why do we care about long strings?`. With the writer
    change this cap stops being protective and starts actively
    rejecting .dia files the writer can now legitimately emit.

Adds a gtest that constructs a synthetic 70 KiB FixItHint, serializes
it through SDiagsWriter, reads it back via SerializedDiagnosticReader,
and asserts the payload round-trips intact. Without either change,
the test fails -- the writer asserts on +assertions builds, or the
reader rejects the file with "Out-of-bounds string in FIXIT".

Signed-off-by: Peter Rong <[email protected]>
---
 .../Frontend/SerializedDiagnosticPrinter.cpp  |   4 +-
 clang/tools/libclang/CXLoadedDiagnostic.cpp   |   3 -
 clang/unittests/Frontend/CMakeLists.txt       |   1 +
 .../SerializedDiagnosticsLargeFixitTest.cpp   | 108 ++++++++++++++++++
 4 files changed, 111 insertions(+), 5 deletions(-)
 create mode 100644 
clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp

diff --git a/clang/lib/Frontend/SerializedDiagnosticPrinter.cpp 
b/clang/lib/Frontend/SerializedDiagnosticPrinter.cpp
index 58d080f5504c2..1dc44993613c2 100644
--- a/clang/lib/Frontend/SerializedDiagnosticPrinter.cpp
+++ b/clang/lib/Frontend/SerializedDiagnosticPrinter.cpp
@@ -500,8 +500,8 @@ void SDiagsWriter::EmitBlockInfoBlock() {
   Abbrev = std::make_shared<BitCodeAbbrev>();
   Abbrev->Add(BitCodeAbbrevOp(RECORD_FIXIT));
   AddRangeLocationAbbrev(*Abbrev);
-  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 16)); // Text size.
-  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));      // FixIt text.
+  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 16)); // Text size.
+  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));    // FixIt text.
   Abbrevs.set(RECORD_FIXIT, Stream.EmitBlockInfoAbbrev(BLOCK_DIAG,
                                                        Abbrev));
 
diff --git a/clang/tools/libclang/CXLoadedDiagnostic.cpp 
b/clang/tools/libclang/CXLoadedDiagnostic.cpp
index ed89a6a7754da..bfb02d4dcf49d 100644
--- a/clang/tools/libclang/CXLoadedDiagnostic.cpp
+++ b/clang/tools/libclang/CXLoadedDiagnostic.cpp
@@ -366,9 +366,6 @@ DiagLoader::visitFixitRecord(const 
serialized_diags::Location &Start,
   CXSourceRange SR;
   if (std::error_code EC = readRange(Start, End, SR))
     return EC;
-  // FIXME: Why do we care about long strings?
-  if (CodeToInsert.size() > 65536)
-    return reportInvalidFile("Out-of-bounds string in FIXIT");
   CurrentDiags.back()->FixIts.push_back(
       std::make_pair(SR, TopDiags->copyString(CodeToInsert)));
   return std::error_code();
diff --git a/clang/unittests/Frontend/CMakeLists.txt 
b/clang/unittests/Frontend/CMakeLists.txt
index 0d9fc714212d5..112786daded0e 100644
--- a/clang/unittests/Frontend/CMakeLists.txt
+++ b/clang/unittests/Frontend/CMakeLists.txt
@@ -11,6 +11,7 @@ add_clang_unittest(FrontendTests
   ReparseWorkingDirTest.cpp
   OutputStreamTest.cpp
   SearchPathTest.cpp
+  SerializedDiagnosticsLargeFixitTest.cpp
   TextDiagnosticTest.cpp
   UtilsTest.cpp
   CLANG_LIBS
diff --git a/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp 
b/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp
new file mode 100644
index 0000000000000..d42fee6c28358
--- /dev/null
+++ b/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp
@@ -0,0 +1,108 @@
+//===- SerializedDiagnosticsLargeFixitTest.cpp 
----------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Regression test for the RECORD_FIXIT text-size field overflow. Historically
+// the abbreviation used Fixed,16 for the FixIt's CodeToInsert size, which
+// silently caps payloads at 65535 bytes and asserts in BitstreamWriter::Emit
+// ("High bits set!") on +assertions builds. RECORD_DIAG already uses VBR,16
+// for the same purpose; this test pins the equivalent behavior for
+// RECORD_FIXIT so the two stay in sync.
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/Basic/Diagnostic.h"
+#include "clang/Basic/DiagnosticOptions.h"
+#include "clang/Basic/FileManager.h"
+#include "clang/Basic/LangOptions.h"
+#include "clang/Basic/SourceManager.h"
+#include "clang/Frontend/SerializedDiagnosticPrinter.h"
+#include "clang/Frontend/SerializedDiagnosticReader.h"
+#include "clang/Frontend/SerializedDiagnostics.h"
+#include "llvm/ADT/ScopeExit.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/Path.h"
+#include "llvm/Support/SmallVectorMemoryBuffer.h"
+#include "gtest/gtest.h"
+
+using namespace clang;
+using namespace llvm;
+
+namespace {
+
+// Reader that captures the text of every FixIt it sees.
+class CapturingReader : public serialized_diags::SerializedDiagnosticReader {
+public:
+  std::vector<std::string> FixItTexts;
+
+protected:
+  std::error_code visitFixitRecord(const serialized_diags::Location &Start,
+                                   const serialized_diags::Location &End,
+                                   StringRef Text) override {
+    FixItTexts.emplace_back(Text);
+    return {};
+  }
+};
+
+TEST(SerializedDiagnostics, LargeFixItRoundTrips) {
+  // Build a temp output path for the .dia file.
+  SmallString<128> TmpPath;
+  ASSERT_FALSE(sys::fs::createTemporaryFile("sdiag-large-fixit", "dia",
+                                            TmpPath));
+  auto Cleanup =
+      llvm::scope_exit([&] { sys::fs::remove(TmpPath); });
+
+  // Build a 70 KiB FixIt payload — comfortably above the historical 65535
+  // limit but small enough to keep the test cheap.
+  constexpr size_t kPayloadSize = 70 * 1024;
+  std::string BigInsertion(kPayloadSize, 'x');
+
+  // Set up the writer in an inner scope so its destructor flushes the .dia
+  // file before we read it back.
+  {
+    FileSystemOptions FSOpts;
+    FileManager FileMgr(FSOpts);
+    DiagnosticOptions DiagOpts;
+    DiagnosticsEngine Diags(DiagnosticIDs::create(), DiagOpts,
+                            new IgnoringDiagConsumer());
+    SourceManager SrcMgr(Diags, FileMgr);
+
+    StringRef Path = "main.cpp";
+    StringRef Contents = "int x;\n";
+    FileEntryRef FE = FileMgr.getVirtualFileRef(
+        Path, /*Size=*/static_cast<off_t>(Contents.size()),
+        /*ModificationTime=*/0);
+    SmallVector<char, 16> Buffer(Contents.begin(), Contents.end());
+    SrcMgr.overrideFileContents(
+        FE, std::make_unique<SmallVectorMemoryBuffer>(
+                std::move(Buffer), Path, /*RequiresNullTerminator=*/false));
+    FileID FID = SrcMgr.createFileID(FE, SourceLocation(), SrcMgr::C_User);
+    SrcMgr.setMainFileID(FID);
+    SourceLocation Loc = SrcMgr.translateLineCol(FID, 1, 1);
+
+    // Attach the serialized-diagnostics consumer.
+    Diags.setClient(serialized_diags::create(TmpPath, DiagOpts).release());
+    Diags.setSourceManager(&SrcMgr);
+    LangOptions LO;
+    Diags.getClient()->BeginSourceFile(LO, /*PP=*/nullptr);
+
+    // Emit one custom warning carrying the giant FixIt. Without the
+    // RECORD_FIXIT VBR widening, this Report() asserts in
+    // BitstreamWriter::Emit on +assertions builds.
+    unsigned ID = Diags.getCustomDiagID(DiagnosticsEngine::Warning, "test");
+    Diags.Report(Loc, ID) << FixItHint::CreateInsertion(Loc, BigInsertion);
+  } // SDiagsWriter dtor flushes the .dia file here.
+
+  // Read the file back and verify the FixIt round-trips intact.
+  CapturingReader Reader;
+  ASSERT_FALSE(Reader.readDiagnostics(TmpPath));
+  ASSERT_EQ(Reader.FixItTexts.size(), 1u);
+  EXPECT_EQ(Reader.FixItTexts[0].size(), kPayloadSize);
+  EXPECT_EQ(Reader.FixItTexts[0], BigInsertion);
+}
+
+} // anonymous namespace

>From 035a64eaf7757d7c33e12685388c8438ba9cbab0 Mon Sep 17 00:00:00 2001
From: Peter Rong <[email protected]>
Date: Tue, 12 May 2026 15:49:02 -0700
Subject: [PATCH 2/2] format

---
 .../Frontend/SerializedDiagnosticsLargeFixitTest.cpp       | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp 
b/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp
index d42fee6c28358..a2e905d4123b2 100644
--- a/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp
+++ b/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp
@@ -51,10 +51,9 @@ class CapturingReader : public 
serialized_diags::SerializedDiagnosticReader {
 TEST(SerializedDiagnostics, LargeFixItRoundTrips) {
   // Build a temp output path for the .dia file.
   SmallString<128> TmpPath;
-  ASSERT_FALSE(sys::fs::createTemporaryFile("sdiag-large-fixit", "dia",
-                                            TmpPath));
-  auto Cleanup =
-      llvm::scope_exit([&] { sys::fs::remove(TmpPath); });
+  ASSERT_FALSE(
+      sys::fs::createTemporaryFile("sdiag-large-fixit", "dia", TmpPath));
+  auto Cleanup = llvm::scope_exit([&] { sys::fs::remove(TmpPath); });
 
   // Build a 70 KiB FixIt payload — comfortably above the historical 65535
   // limit but small enough to keep the test cheap.

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to