https://github.com/DataCorrupted updated https://github.com/llvm/llvm-project/pull/197322
>From 97a81c9ca131ad76d8f3adb5fca04d5f7971b629 Mon Sep 17 00:00:00 2001 From: Peter Rong <[email protected]> Date: Tue, 12 May 2026 15:35:39 -0700 Subject: [PATCH 1/2] [clang][SerializedDiagnostics] Make RECORD_FIXIT text-size VBR,16 instead of Fixed The RECORD_DIAG abbreviation already uses VBR,16 for its text-size field, but RECORD_FIXIT was left with Fixed,16. That mismatch silently caps a single FixIt's CodeToInsert at 65535 bytes; on +assertions builds, BitstreamWriter::Emit aborts with "High bits set!" when the value exceeds that width. In-tree warnings don't currently produce FixIts that large, so this isn't a user-visible bug upstream today. Apple's downstream fork (used by Swift toolchains and Xcode) does hit it: their refactoring engine adds a "fill in missing switch cases" note that attaches a single FixIt enumerating every uncovered enumerator. For an auto- generated enum with thousands of values, that single CodeToInsert grows past 64 KiB and crashes any compile that also passes -serialize-diagnostic-file on a +assertions build. Two changes: - SerializedDiagnosticPrinter.cpp: RECORD_FIXIT's text-size field becomes VBR,16, matching RECORD_DIAG. The abbreviation is serialized in the .dia file itself, so existing readers self- adjust; no version bump is required. - CXLoadedDiagnostic.cpp: drop the corresponding `if (CodeToInsert.size() > 65536)` defensive check in visitFixitRecord. It was added in 2011 (commit d010ba47294d) as a "Basic buffer overflow check" when the writer's Fixed,16 made larger sizes impossible; the 2014 NFC refactor that split it per-record-kind (commit f884723799951) already flagged it with `// FIXME: Why do we care about long strings?`. With the writer change this cap stops being protective and starts actively rejecting .dia files the writer can now legitimately emit. Adds a gtest that constructs a synthetic 70 KiB FixItHint, serializes it through SDiagsWriter, reads it back via SerializedDiagnosticReader, and asserts the payload round-trips intact. Without either change, the test fails -- the writer asserts on +assertions builds, or the reader rejects the file with "Out-of-bounds string in FIXIT". Signed-off-by: Peter Rong <[email protected]> --- .../Frontend/SerializedDiagnosticPrinter.cpp | 4 +- clang/tools/libclang/CXLoadedDiagnostic.cpp | 3 - clang/unittests/Frontend/CMakeLists.txt | 1 + .../SerializedDiagnosticsLargeFixitTest.cpp | 108 ++++++++++++++++++ 4 files changed, 111 insertions(+), 5 deletions(-) create mode 100644 clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp diff --git a/clang/lib/Frontend/SerializedDiagnosticPrinter.cpp b/clang/lib/Frontend/SerializedDiagnosticPrinter.cpp index 58d080f5504c2..1dc44993613c2 100644 --- a/clang/lib/Frontend/SerializedDiagnosticPrinter.cpp +++ b/clang/lib/Frontend/SerializedDiagnosticPrinter.cpp @@ -500,8 +500,8 @@ void SDiagsWriter::EmitBlockInfoBlock() { Abbrev = std::make_shared<BitCodeAbbrev>(); Abbrev->Add(BitCodeAbbrevOp(RECORD_FIXIT)); AddRangeLocationAbbrev(*Abbrev); - Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 16)); // Text size. - Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob)); // FixIt text. + Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 16)); // Text size. + Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob)); // FixIt text. Abbrevs.set(RECORD_FIXIT, Stream.EmitBlockInfoAbbrev(BLOCK_DIAG, Abbrev)); diff --git a/clang/tools/libclang/CXLoadedDiagnostic.cpp b/clang/tools/libclang/CXLoadedDiagnostic.cpp index ed89a6a7754da..bfb02d4dcf49d 100644 --- a/clang/tools/libclang/CXLoadedDiagnostic.cpp +++ b/clang/tools/libclang/CXLoadedDiagnostic.cpp @@ -366,9 +366,6 @@ DiagLoader::visitFixitRecord(const serialized_diags::Location &Start, CXSourceRange SR; if (std::error_code EC = readRange(Start, End, SR)) return EC; - // FIXME: Why do we care about long strings? - if (CodeToInsert.size() > 65536) - return reportInvalidFile("Out-of-bounds string in FIXIT"); CurrentDiags.back()->FixIts.push_back( std::make_pair(SR, TopDiags->copyString(CodeToInsert))); return std::error_code(); diff --git a/clang/unittests/Frontend/CMakeLists.txt b/clang/unittests/Frontend/CMakeLists.txt index 0d9fc714212d5..112786daded0e 100644 --- a/clang/unittests/Frontend/CMakeLists.txt +++ b/clang/unittests/Frontend/CMakeLists.txt @@ -11,6 +11,7 @@ add_clang_unittest(FrontendTests ReparseWorkingDirTest.cpp OutputStreamTest.cpp SearchPathTest.cpp + SerializedDiagnosticsLargeFixitTest.cpp TextDiagnosticTest.cpp UtilsTest.cpp CLANG_LIBS diff --git a/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp b/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp new file mode 100644 index 0000000000000..d42fee6c28358 --- /dev/null +++ b/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp @@ -0,0 +1,108 @@ +//===- SerializedDiagnosticsLargeFixitTest.cpp ----------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Regression test for the RECORD_FIXIT text-size field overflow. Historically +// the abbreviation used Fixed,16 for the FixIt's CodeToInsert size, which +// silently caps payloads at 65535 bytes and asserts in BitstreamWriter::Emit +// ("High bits set!") on +assertions builds. RECORD_DIAG already uses VBR,16 +// for the same purpose; this test pins the equivalent behavior for +// RECORD_FIXIT so the two stay in sync. +// +//===----------------------------------------------------------------------===// + +#include "clang/Basic/Diagnostic.h" +#include "clang/Basic/DiagnosticOptions.h" +#include "clang/Basic/FileManager.h" +#include "clang/Basic/LangOptions.h" +#include "clang/Basic/SourceManager.h" +#include "clang/Frontend/SerializedDiagnosticPrinter.h" +#include "clang/Frontend/SerializedDiagnosticReader.h" +#include "clang/Frontend/SerializedDiagnostics.h" +#include "llvm/ADT/ScopeExit.h" +#include "llvm/Support/FileSystem.h" +#include "llvm/Support/Path.h" +#include "llvm/Support/SmallVectorMemoryBuffer.h" +#include "gtest/gtest.h" + +using namespace clang; +using namespace llvm; + +namespace { + +// Reader that captures the text of every FixIt it sees. +class CapturingReader : public serialized_diags::SerializedDiagnosticReader { +public: + std::vector<std::string> FixItTexts; + +protected: + std::error_code visitFixitRecord(const serialized_diags::Location &Start, + const serialized_diags::Location &End, + StringRef Text) override { + FixItTexts.emplace_back(Text); + return {}; + } +}; + +TEST(SerializedDiagnostics, LargeFixItRoundTrips) { + // Build a temp output path for the .dia file. + SmallString<128> TmpPath; + ASSERT_FALSE(sys::fs::createTemporaryFile("sdiag-large-fixit", "dia", + TmpPath)); + auto Cleanup = + llvm::scope_exit([&] { sys::fs::remove(TmpPath); }); + + // Build a 70 KiB FixIt payload — comfortably above the historical 65535 + // limit but small enough to keep the test cheap. + constexpr size_t kPayloadSize = 70 * 1024; + std::string BigInsertion(kPayloadSize, 'x'); + + // Set up the writer in an inner scope so its destructor flushes the .dia + // file before we read it back. + { + FileSystemOptions FSOpts; + FileManager FileMgr(FSOpts); + DiagnosticOptions DiagOpts; + DiagnosticsEngine Diags(DiagnosticIDs::create(), DiagOpts, + new IgnoringDiagConsumer()); + SourceManager SrcMgr(Diags, FileMgr); + + StringRef Path = "main.cpp"; + StringRef Contents = "int x;\n"; + FileEntryRef FE = FileMgr.getVirtualFileRef( + Path, /*Size=*/static_cast<off_t>(Contents.size()), + /*ModificationTime=*/0); + SmallVector<char, 16> Buffer(Contents.begin(), Contents.end()); + SrcMgr.overrideFileContents( + FE, std::make_unique<SmallVectorMemoryBuffer>( + std::move(Buffer), Path, /*RequiresNullTerminator=*/false)); + FileID FID = SrcMgr.createFileID(FE, SourceLocation(), SrcMgr::C_User); + SrcMgr.setMainFileID(FID); + SourceLocation Loc = SrcMgr.translateLineCol(FID, 1, 1); + + // Attach the serialized-diagnostics consumer. + Diags.setClient(serialized_diags::create(TmpPath, DiagOpts).release()); + Diags.setSourceManager(&SrcMgr); + LangOptions LO; + Diags.getClient()->BeginSourceFile(LO, /*PP=*/nullptr); + + // Emit one custom warning carrying the giant FixIt. Without the + // RECORD_FIXIT VBR widening, this Report() asserts in + // BitstreamWriter::Emit on +assertions builds. + unsigned ID = Diags.getCustomDiagID(DiagnosticsEngine::Warning, "test"); + Diags.Report(Loc, ID) << FixItHint::CreateInsertion(Loc, BigInsertion); + } // SDiagsWriter dtor flushes the .dia file here. + + // Read the file back and verify the FixIt round-trips intact. + CapturingReader Reader; + ASSERT_FALSE(Reader.readDiagnostics(TmpPath)); + ASSERT_EQ(Reader.FixItTexts.size(), 1u); + EXPECT_EQ(Reader.FixItTexts[0].size(), kPayloadSize); + EXPECT_EQ(Reader.FixItTexts[0], BigInsertion); +} + +} // anonymous namespace >From 035a64eaf7757d7c33e12685388c8438ba9cbab0 Mon Sep 17 00:00:00 2001 From: Peter Rong <[email protected]> Date: Tue, 12 May 2026 15:49:02 -0700 Subject: [PATCH 2/2] format --- .../Frontend/SerializedDiagnosticsLargeFixitTest.cpp | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp b/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp index d42fee6c28358..a2e905d4123b2 100644 --- a/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp +++ b/clang/unittests/Frontend/SerializedDiagnosticsLargeFixitTest.cpp @@ -51,10 +51,9 @@ class CapturingReader : public serialized_diags::SerializedDiagnosticReader { TEST(SerializedDiagnostics, LargeFixItRoundTrips) { // Build a temp output path for the .dia file. SmallString<128> TmpPath; - ASSERT_FALSE(sys::fs::createTemporaryFile("sdiag-large-fixit", "dia", - TmpPath)); - auto Cleanup = - llvm::scope_exit([&] { sys::fs::remove(TmpPath); }); + ASSERT_FALSE( + sys::fs::createTemporaryFile("sdiag-large-fixit", "dia", TmpPath)); + auto Cleanup = llvm::scope_exit([&] { sys::fs::remove(TmpPath); }); // Build a 70 KiB FixIt payload — comfortably above the historical 65535 // limit but small enough to keep the test cheap. _______________________________________________ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
