https://github.com/davidstone updated https://github.com/llvm/llvm-project/pull/169126
>From 271a40d48bd977b8351cc346be705ff187399ebb Mon Sep 17 00:00:00 2001 From: David Stone <[email protected]> Date: Fri, 21 Nov 2025 16:12:54 -0700 Subject: [PATCH 1/2] [llvm][clang] Remove `llvm::OwningArrayRef` `OwningArrayRef` has several problems. The naming is strange: `ArrayRef` is specifically a non-owning view, so the name means "owning non-owning view". It has a const-correctness bug that is inherent to the interface. `OwningArrayRef<T>` publicly derives from `MutableArrayRef<T>`. This means that the following code compiles: ```c++ void const_incorrect(llvm::OwningArrayRef<int> const a) { a[0] = 5; } ``` It's surprising for a non-reference type to allow modification of its elements even when it's declared `const`. However, the problems from this inheritance (which ultimately stem from the same issue as the weird name) are even worse. The following function compiles without warning but corrupts memory when called: ```c++ void memory_corruption(llvm::OwningArrayRef<int> a) { a.consume_front(); } ``` This happens because `MutableArrayRef::consume_front` modifies the internal data pointer to advance the referenced array forward. That's not an issue for `MutableArrayRef` because it's just a view. It is an issue for `OwningArrayRef` because that pointer is passed as the argument to `delete[]`, so when it's modified by advancing it forward it ceases to be valid to `delete[]`. From there, undefined behavior occurs. It is less convenient than `llvm::SmallVector` for construction. By combining the `size` and the `capacity` together without going through `std::allocator` to get memory, it's not possible to fill in data with the correct value to begin with. Instead, the user must construct an `OwningArrayRef` of the appropriate size, then fill in the data. This has one of two consequences: 1. If `T` is a class type, we have to first default construct all of the elements when we construct `OwningArrayRef` and then in a second pass we can assign to those elements to give what we want. This wastes time and for some classes is not possible. 2. If `T` is a built-in type, the data starts out uninitialized. This easily forgotten step means we access uninitialized memory. Using `llvm::SmallVector`, by constrast, has well-known constructors that can fill in the data that we actually want on construction. `OwningArrayRef` has slightly different performance characteristics than `llvm::SmallVector`, but the difference is minimal. The first difference is a theoretical negative for `OwningArrayRef`: by implementing in terms of `new[]` and `delete[]`, the implementation has less room to optimize these calls. However, I say this is theoretical because for clang, at least, the extra freedom of optimization given to `std::allocator` is not yet taken advantage of (see https://github.com/llvm/llvm-project/issues/68365) The second difference is slightly in favor of `OwningArrayRef`: `sizeof(llvm::SmallVector<T>) == sizeof(void *) * 3` on pretty much any implementation, whereas `sizeof(OwningArrayRef) == sizeof(void *) * 2` which seems like a win. However, this is just a misdirection of the accounting costs: array-new sticks bookkeeping information in the allocated storage. There are some cases where this is beneficial to reduce stack usage, but that minor benefit doesn't seem worth the costs. If we actually need that optimization, we'd be better served by writing a `DynamicArray` type that implements a full vector-like feature set (except for operations that change the size of the container) while allocating through `std::allocator` to avoid the pitfalls outlined earlier. --- clang/include/clang/AST/VTableBuilder.h | 4 ++-- clang/include/clang/Basic/LLVM.h | 4 +--- llvm/include/llvm/ADT/ArrayRef.h | 23 ------------------- llvm/include/llvm/CGData/CGDataPatchItem.h | 4 ++-- llvm/include/llvm/CodeGen/PBQP/Math.h | 10 ++++---- .../Transforms/Vectorize/SLPVectorizer.cpp | 7 +++--- llvm/unittests/ADT/ArrayRefTest.cpp | 7 ------ 7 files changed, 14 insertions(+), 45 deletions(-) diff --git a/clang/include/clang/AST/VTableBuilder.h b/clang/include/clang/AST/VTableBuilder.h index cec3d8aee0650..14a2070061a1c 100644 --- a/clang/include/clang/AST/VTableBuilder.h +++ b/clang/include/clang/AST/VTableBuilder.h @@ -253,10 +253,10 @@ class VTableLayout { // in the virtual table group. VTableIndicesTy VTableIndices; - OwningArrayRef<VTableComponent> VTableComponents; + std::vector<VTableComponent> VTableComponents; /// Contains thunks needed by vtables, sorted by indices. - OwningArrayRef<VTableThunkTy> VTableThunks; + std::vector<VTableThunkTy> VTableThunks; /// Address points for all vtables. AddressPointsMapTy AddressPoints; diff --git a/clang/include/clang/Basic/LLVM.h b/clang/include/clang/Basic/LLVM.h index f4956cd16cbcf..fed23f0c44f38 100644 --- a/clang/include/clang/Basic/LLVM.h +++ b/clang/include/clang/Basic/LLVM.h @@ -29,8 +29,7 @@ namespace llvm { class Twine; class VersionTuple; template<typename T> class ArrayRef; - template<typename T> class MutableArrayRef; - template<typename T> class OwningArrayRef; + template <typename T> class MutableArrayRef; template<unsigned InternalLen> class SmallString; template<typename T, unsigned N> class SmallVector; template<typename T> class SmallVectorImpl; @@ -65,7 +64,6 @@ namespace clang { // ADT's. using llvm::ArrayRef; using llvm::MutableArrayRef; - using llvm::OwningArrayRef; using llvm::SaveAndRestore; using llvm::SmallString; using llvm::SmallVector; diff --git a/llvm/include/llvm/ADT/ArrayRef.h b/llvm/include/llvm/ADT/ArrayRef.h index d7ed2c78749f0..00b5534469d65 100644 --- a/llvm/include/llvm/ADT/ArrayRef.h +++ b/llvm/include/llvm/ADT/ArrayRef.h @@ -445,29 +445,6 @@ namespace llvm { } }; - /// This is a MutableArrayRef that owns its array. - template <typename T> class OwningArrayRef : public MutableArrayRef<T> { - public: - OwningArrayRef() = default; - OwningArrayRef(size_t Size) : MutableArrayRef<T>(new T[Size], Size) {} - - OwningArrayRef(ArrayRef<T> Data) - : MutableArrayRef<T>(new T[Data.size()], Data.size()) { - std::copy(Data.begin(), Data.end(), this->begin()); - } - - OwningArrayRef(OwningArrayRef &&Other) { *this = std::move(Other); } - - OwningArrayRef &operator=(OwningArrayRef &&Other) { - delete[] this->data(); - this->MutableArrayRef<T>::operator=(Other); - Other.MutableArrayRef<T>::operator=(MutableArrayRef<T>()); - return *this; - } - - ~OwningArrayRef() { delete[] this->data(); } - }; - /// @name ArrayRef Deduction guides /// @{ /// Deduction guide to construct an ArrayRef from a single element. diff --git a/llvm/include/llvm/CGData/CGDataPatchItem.h b/llvm/include/llvm/CGData/CGDataPatchItem.h index d13f89b032542..e9aad8c23fefc 100644 --- a/llvm/include/llvm/CGData/CGDataPatchItem.h +++ b/llvm/include/llvm/CGData/CGDataPatchItem.h @@ -22,10 +22,10 @@ struct CGDataPatchItem { // Where to patch. uint64_t Pos; // Source data. - OwningArrayRef<uint64_t> D; + std::vector<uint64_t> D; CGDataPatchItem(uint64_t Pos, const uint64_t *D, int N) - : Pos(Pos), D(ArrayRef<uint64_t>(D, N)) {} + : Pos(Pos), D(D, D + N) {} }; } // namespace llvm diff --git a/llvm/include/llvm/CodeGen/PBQP/Math.h b/llvm/include/llvm/CodeGen/PBQP/Math.h index 1cbbeeba3f32b..ba673dd9380a8 100644 --- a/llvm/include/llvm/CodeGen/PBQP/Math.h +++ b/llvm/include/llvm/CodeGen/PBQP/Math.h @@ -41,10 +41,10 @@ class Vector { Vector(Vector &&V) : Data(std::move(V.Data)) {} // Iterator-based access. - const PBQPNum *begin() const { return Data.begin(); } - const PBQPNum *end() const { return Data.end(); } - PBQPNum *begin() { return Data.begin(); } - PBQPNum *end() { return Data.end(); } + const PBQPNum *begin() const { return Data.data(); } + const PBQPNum *end() const { return Data.data() + Data.size(); } + PBQPNum *begin() { return Data.data(); } + PBQPNum *end() { return Data.data() + Data.size(); } /// Comparison operator. bool operator==(const Vector &V) const { @@ -87,7 +87,7 @@ class Vector { } private: - OwningArrayRef<PBQPNum> Data; + std::vector<PBQPNum> Data; }; /// Return a hash_value for the given vector. diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp index b7ab056ef4363..dd9332118d8fa 100644 --- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp +++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp @@ -23977,9 +23977,10 @@ bool SLPVectorizerPass::vectorizeStores( unsigned End = Operands.size(); unsigned Repeat = 0; constexpr unsigned MaxAttempts = 4; - OwningArrayRef<std::pair<unsigned, unsigned>> RangeSizes(Operands.size()); - for (std::pair<unsigned, unsigned> &P : RangeSizes) - P.first = P.second = 1; + std::vector<std::pair<unsigned, unsigned>> RangeSizesStorage( + Operands.size(), {1, 1}); + // The `slice` and `drop_front` interfaces are convenient + const auto RangeSizes = MutableArrayRef(RangeSizesStorage); DenseMap<Value *, std::pair<unsigned, unsigned>> NonSchedulable; auto IsNotVectorized = [](bool First, const std::pair<unsigned, unsigned> &P) { diff --git a/llvm/unittests/ADT/ArrayRefTest.cpp b/llvm/unittests/ADT/ArrayRefTest.cpp index 736c8fbb26b38..b1a86f0214b74 100644 --- a/llvm/unittests/ADT/ArrayRefTest.cpp +++ b/llvm/unittests/ADT/ArrayRefTest.cpp @@ -307,13 +307,6 @@ TEST(ArrayRefTest, ArrayRef) { EXPECT_TRUE(AR2.equals(AR2Ref)); } -TEST(ArrayRefTest, OwningArrayRef) { - static const int A1[] = {0, 1}; - OwningArrayRef<int> A{ArrayRef(A1)}; - OwningArrayRef<int> B(std::move(A)); - EXPECT_EQ(A.data(), nullptr); -} - TEST(ArrayRefTest, ArrayRefFromStdArray) { std::array<int, 5> A1{{42, -5, 0, 1000000, -1000000}}; ArrayRef<int> A2 = ArrayRef(A1); >From 88fa12a22ae8754993fef4d6265cb5dc8f202128 Mon Sep 17 00:00:00 2001 From: David Stone <[email protected]> Date: Fri, 16 Jan 2026 11:51:08 -0700 Subject: [PATCH 2/2] Use `llvm::SmallVector` instead of `std::vector` --- clang/include/clang/AST/VTableBuilder.h | 5 +++-- llvm/include/llvm/CGData/CGDataPatchItem.h | 4 ++-- llvm/include/llvm/CodeGen/PBQP/Math.h | 3 ++- llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | 2 +- 4 files changed, 8 insertions(+), 6 deletions(-) diff --git a/clang/include/clang/AST/VTableBuilder.h b/clang/include/clang/AST/VTableBuilder.h index 14a2070061a1c..a796599fbdb9f 100644 --- a/clang/include/clang/AST/VTableBuilder.h +++ b/clang/include/clang/AST/VTableBuilder.h @@ -20,6 +20,7 @@ #include "clang/Basic/ABI.h" #include "clang/Basic/Thunk.h" #include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/SmallVector.h" #include <memory> #include <utility> @@ -253,10 +254,10 @@ class VTableLayout { // in the virtual table group. VTableIndicesTy VTableIndices; - std::vector<VTableComponent> VTableComponents; + llvm::SmallVector<VTableComponent, 0> VTableComponents; /// Contains thunks needed by vtables, sorted by indices. - std::vector<VTableThunkTy> VTableThunks; + llvm::SmallVector<VTableThunkTy, 0> VTableThunks; /// Address points for all vtables. AddressPointsMapTy AddressPoints; diff --git a/llvm/include/llvm/CGData/CGDataPatchItem.h b/llvm/include/llvm/CGData/CGDataPatchItem.h index e9aad8c23fefc..e5a1328beacd5 100644 --- a/llvm/include/llvm/CGData/CGDataPatchItem.h +++ b/llvm/include/llvm/CGData/CGDataPatchItem.h @@ -13,7 +13,7 @@ #ifndef LLVM_CGDATA_CGDATAPATCHITEM_H #define LLVM_CGDATA_CGDATAPATCHITEM_H -#include "llvm/ADT/ArrayRef.h" +#include "llvm/ADT/SmallVector.h" namespace llvm { @@ -22,7 +22,7 @@ struct CGDataPatchItem { // Where to patch. uint64_t Pos; // Source data. - std::vector<uint64_t> D; + llvm::SmallVector<uint64_t, 0> D; CGDataPatchItem(uint64_t Pos, const uint64_t *D, int N) : Pos(Pos), D(D, D + N) {} diff --git a/llvm/include/llvm/CodeGen/PBQP/Math.h b/llvm/include/llvm/CodeGen/PBQP/Math.h index ba673dd9380a8..aba9e3676ef4b 100644 --- a/llvm/include/llvm/CodeGen/PBQP/Math.h +++ b/llvm/include/llvm/CodeGen/PBQP/Math.h @@ -12,6 +12,7 @@ #include "llvm/ADT/ArrayRef.h" #include "llvm/ADT/Hashing.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/SmallVector.h" #include "llvm/Support/InterleavedRange.h" #include <algorithm> #include <cassert> @@ -87,7 +88,7 @@ class Vector { } private: - std::vector<PBQPNum> Data; + llvm::SmallVector<PBQPNum, 0> Data; }; /// Return a hash_value for the given vector. diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp index dd9332118d8fa..4c4901c314406 100644 --- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp +++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp @@ -23977,7 +23977,7 @@ bool SLPVectorizerPass::vectorizeStores( unsigned End = Operands.size(); unsigned Repeat = 0; constexpr unsigned MaxAttempts = 4; - std::vector<std::pair<unsigned, unsigned>> RangeSizesStorage( + llvm::SmallVector<std::pair<unsigned, unsigned>> RangeSizesStorage( Operands.size(), {1, 1}); // The `slice` and `drop_front` interfaces are convenient const auto RangeSizes = MutableArrayRef(RangeSizesStorage); _______________________________________________ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
