teemperor created this revision. teemperor added reviewers: dblaikie, rjmccall, rsmith, v.g.vassilev. teemperor added a project: clang. Herald added a subscriber: dexonsmith. teemperor requested review of this revision. Herald added subscribers: llvm-commits, cfe-commits. Herald added a project: LLVM.
This is a WIP patch that tries to avoid creating a RecordLayout in Clang and instead just emit an opaque structure type as if we only had a forward declarations. The main motivation for this patch is actually just supporting a use case in LLDB where laying out types can be very expensive as it usually triggers parsing of debug information. The changes in this patch can be summarized as: - `CodeGenTypes::ConvertRecordDeclType` (and related funcs) have a new parameter that tells us if we need the definition. It's currently only set to false for Clang pointer types. - There are a few new places where I added (temporary) calls to `ConvertTypeForMem()` on some pointee types. The reason is that the code after is usually creating GEP instructions where we need a non-opaque source type. We can't do this automatically from the GEP factory methods as they would need to know the clang::Type to automatically do this (and they only have the llvm::Type that can't be mapped back to a clang::Type from what I can see, but that might be incorrect). - A few test that needed to be adjusted as they relied on e.g. `Foo *x` to be enough to force `Foo` to be laid out/emitted. There are still about a dozen more tests failing but from what I can see they all just need to be adjusted to force specific types to be emitted. I'll fix those up once there is consensus that this patch is going in the right direction. Some benchmarks: I did a stage2 build of LLVM+Clang with my patch and those are the stats: current ToT Clang: 2232421 - total amount of struct types created 94911 - of which are opaque struct types with this patch: 1715074 - total amount of struct types created (-23%) 173127 - of which are opaque struct types (+82%) I built a part of Clang (the last 300 source files in the compile_commands.json) and the average time on my 64 core machine changes like this (as per hyperfine): Benchmark #1: parallel --progress -j63 -a ToT-clang Time (mean ± σ): 27.703 s ± 0.168 s [User: 1434.619 s, System: 66.687 s] Range (min … max): 27.459 s … 27.891 s 10 runs Benchmark #2: parallel --progress -j63 -a with-patch Time (mean ± σ): 27.439 s ± 0.111 s [User: 1427.739 s, System: 66.220 s] Range (min … max): 27.300 s … 27.625 s 10 runs Summary 'parallel --progress -j63 -a with-patch' ran 1.01 ± 0.01 times faster than 'parallel --progress -j63 -a ToT-clang' Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D108407 Files: clang/include/clang/AST/Type.h clang/lib/CodeGen/CGExpr.cpp clang/lib/CodeGen/CGExprCXX.cpp clang/lib/CodeGen/CGExprScalar.cpp clang/lib/CodeGen/CGOpenMPRuntime.cpp clang/lib/CodeGen/CodeGenTypes.cpp clang/lib/CodeGen/CodeGenTypes.h clang/test/CodeGen/c11atomics.c clang/test/CodeGenCXX/class-layout.cpp clang/test/CodeGenCXX/pr18962.cpp clang/test/CodeGenCXX/warn-padded-packed.cpp llvm/include/llvm/IR/Instructions.h
Index: llvm/include/llvm/IR/Instructions.h =================================================================== --- llvm/include/llvm/IR/Instructions.h +++ llvm/include/llvm/IR/Instructions.h @@ -1173,6 +1173,7 @@ ResultElementType(getIndexedType(PointeeType, IdxList)) { assert(cast<PointerType>(getType()->getScalarType()) ->isOpaqueOrPointeeTypeMatches(ResultElementType)); + assert(PointeeType->isSized()); init(Ptr, IdxList, NameStr); } @@ -1187,6 +1188,7 @@ ResultElementType(getIndexedType(PointeeType, IdxList)) { assert(cast<PointerType>(getType()->getScalarType()) ->isOpaqueOrPointeeTypeMatches(ResultElementType)); + assert(PointeeType->isSized()); init(Ptr, IdxList, NameStr); } Index: clang/test/CodeGenCXX/warn-padded-packed.cpp =================================================================== --- clang/test/CodeGenCXX/warn-padded-packed.cpp +++ clang/test/CodeGenCXX/warn-padded-packed.cpp @@ -148,6 +148,6 @@ // The warnings are emitted when the layout of the structs is computed, so we have to use them. -void f(S1*, S2*, S3*, S4*, S5*, S6*, S7*, S8*, S9*, S10*, S11*, S12*, S13*, - S14*, S15*, S16*, S17*, S18*, S19*, S20*, S21*, S22*, S23*, S24*, S25*, - S26*, S27*){} +void f(S1, S2, S3, S4, S5, S6, S7, S8, S9, S10, S11, S12, S13, + S14, S15, S16, S17, S18, S19, S20, S21, S22, S23, S24, S25, + S26, S27){} Index: clang/test/CodeGenCXX/pr18962.cpp =================================================================== --- clang/test/CodeGenCXX/pr18962.cpp +++ clang/test/CodeGenCXX/pr18962.cpp @@ -27,6 +27,5 @@ // We end up using an opaque type for 'append' to avoid circular references. // CHECK: %class.A = type { {}* } // CHECK: %class.C = type <{ %class.D*, %class.B, [3 x i8] }> -// CHECK: %class.D = type { %class.C.base, [3 x i8] } -// CHECK: %class.C.base = type <{ %class.D*, %class.B }> +// CHECK: %class.D = type opaque // CHECK: %class.B = type { i8 } Index: clang/test/CodeGenCXX/class-layout.cpp =================================================================== --- clang/test/CodeGenCXX/class-layout.cpp +++ clang/test/CodeGenCXX/class-layout.cpp @@ -3,19 +3,19 @@ // An extra byte should be allocated for an empty class. namespace Test1 { // CHECK: %"struct.Test1::A" = type { i8 } - struct A { } *a; + struct A { } a; } namespace Test2 { // No need to add tail padding here. // CHECK: %"struct.Test2::A" = type { i8*, i32 } - struct A { void *a; int b; } *a; + struct A { void *a; int b; } a; } namespace Test3 { // C should have a vtable pointer. // CHECK: %"struct.Test3::A" = type <{ i32 (...)**, i32, [4 x i8] }> - struct A { virtual void f(); int a; } *a; + struct A { virtual void f(); int a; } a; } namespace Test4 { @@ -30,7 +30,7 @@ struct B : public A { short d; double e; - } *b; + } b; } namespace Test5 { @@ -43,7 +43,7 @@ struct B : A { char b : 1; char c; - } *b; + } b; } // PR10912: don't crash @@ -85,10 +85,12 @@ class A {}; // CHECK: %"class.Test7::B" = type <{ i32 (...)**, %"class.Test7::A" }> class B { + public: virtual ~B(); + private: A a; }; - B* b; + B b; #pragma pack () } Index: clang/test/CodeGen/c11atomics.c =================================================================== --- clang/test/CodeGen/c11atomics.c +++ clang/test/CodeGen/c11atomics.c @@ -22,7 +22,7 @@ struct elem { _Atomic(struct ptr) link; }; -// CHECK-DAG: %struct.elem = type { %struct.ptr } +// CHECK-DAG: %struct.elem = type opaque struct ptr object; // CHECK-DAG: @object ={{.*}} global %struct.ptr zeroinitializer Index: clang/lib/CodeGen/CodeGenTypes.h =================================================================== --- clang/lib/CodeGen/CodeGenTypes.h +++ clang/lib/CodeGen/CodeGenTypes.h @@ -128,13 +128,13 @@ CanQualType DeriveThisType(const CXXRecordDecl *RD, const CXXMethodDecl *MD); /// ConvertType - Convert type T into a llvm::Type. - llvm::Type *ConvertType(QualType T); + llvm::Type *ConvertType(QualType T, bool RequireSize = true); /// ConvertTypeForMem - Convert type T into a llvm::Type. This differs from /// ConvertType in that it is used to convert to the memory representation for /// a type. For example, the scalar representation for _Bool is i1, but the /// memory representation is usually i8 or i32, depending on the target. - llvm::Type *ConvertTypeForMem(QualType T, bool ForBitField = false); + llvm::Type *ConvertTypeForMem(QualType T, bool ForBitField = false, bool RequireSize = true); /// GetFunctionType - Get the LLVM function type for \arg Info. llvm::FunctionType *GetFunctionType(const CGFunctionInfo &Info); @@ -283,7 +283,7 @@ public: // These are internal details of CGT that shouldn't be used externally. /// ConvertRecordDeclType - Lay out a tagged decl type like struct or union. - llvm::StructType *ConvertRecordDeclType(const RecordDecl *TD); + llvm::StructType *ConvertRecordDeclType(const RecordDecl *TD, bool RequireSize = true); /// getExpandedTypes - Expand the type \arg Ty into the LLVM /// argument types it would be passed as. See ABIArgInfo::Expand. Index: clang/lib/CodeGen/CodeGenTypes.cpp =================================================================== --- clang/lib/CodeGen/CodeGenTypes.cpp +++ clang/lib/CodeGen/CodeGenTypes.cpp @@ -87,7 +87,7 @@ /// ConvertType in that it is used to convert to the memory representation for /// a type. For example, the scalar representation for _Bool is i1, but the /// memory representation is usually i8 or i32, depending on the target. -llvm::Type *CodeGenTypes::ConvertTypeForMem(QualType T, bool ForBitField) { +llvm::Type *CodeGenTypes::ConvertTypeForMem(QualType T, bool ForBitField, bool RequireSize) { if (T->isConstantMatrixType()) { const Type *Ty = Context.getCanonicalType(T).getTypePtr(); const ConstantMatrixType *MT = cast<ConstantMatrixType>(Ty); @@ -95,7 +95,7 @@ MT->getNumRows() * MT->getNumColumns()); } - llvm::Type *R = ConvertType(T); + llvm::Type *R = ConvertType(T, RequireSize); // If this is a bool type, or an ExtIntType in a bitfield representation, // map this integer to the target-specified size. @@ -392,7 +392,7 @@ } /// ConvertType - Convert the specified type to its LLVM form. -llvm::Type *CodeGenTypes::ConvertType(QualType T) { +llvm::Type *CodeGenTypes::ConvertType(QualType T, bool RequireSize) { T = Context.getCanonicalType(T); const Type *Ty = T.getTypePtr(); @@ -413,7 +413,7 @@ // RecordTypes are cached and processed specially. if (const RecordType *RT = dyn_cast<RecordType>(Ty)) - return ConvertRecordDeclType(RT->getDecl()); + return ConvertRecordDeclType(RT->getAnyDecl(), RequireSize); // See if type is already cached. llvm::DenseMap<const Type *, llvm::Type *>::iterator TCI = TypeCache.find(Ty); @@ -639,7 +639,7 @@ case Type::Pointer: { const PointerType *PTy = cast<PointerType>(Ty); QualType ETy = PTy->getPointeeType(); - llvm::Type *PointeeType = ConvertTypeForMem(ETy); + llvm::Type *PointeeType = ConvertTypeForMem(ETy, false, false); if (PointeeType->isVoidTy()) PointeeType = llvm::Type::getInt8Ty(getLLVMContext()); @@ -807,7 +807,7 @@ } /// ConvertRecordDeclType - Lay out a tagged decl type like struct or union. -llvm::StructType *CodeGenTypes::ConvertRecordDeclType(const RecordDecl *RD) { +llvm::StructType *CodeGenTypes::ConvertRecordDeclType(const RecordDecl *RD, bool RequireSize) { // TagDecl's are not necessarily unique, instead use the (clang) // type connected to the decl. const Type *Key = Context.getTagDeclType(RD).getTypePtr(); @@ -821,9 +821,17 @@ } llvm::StructType *Ty = Entry; + // If the size of the type isn't required, then nothing left to do. + // FIXME: Implicit records are always emitted as a many parts of CodeGen + // expect internal builtin records such as kmp_depend_info_t, __va_list etc. + // to be laid out. + if (!RequireSize && !RD->isImplicit()) + return Ty; + + RD = RD->getDefinition(); + // If this is still a forward declaration, or the LLVM type is already // complete, there's nothing more to do. - RD = RD->getDefinition(); if (!RD || !RD->isCompleteDefinition() || !Ty->isOpaque()) return Ty; Index: clang/lib/CodeGen/CGOpenMPRuntime.cpp =================================================================== --- clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -5346,6 +5346,8 @@ llvm::Value *RHSBegin = RHSAddr.getPointer(); llvm::Value *LHSBegin = LHSAddr.getPointer(); + // Ensure sie for CreateGEP below. + CGF.ConvertTypeForMem(LHSVar->getType()); // Cast from pointer to array type to pointer to single element. llvm::Value *LHSEnd = CGF.Builder.CreateGEP(LHSAddr.getElementType(), LHSBegin, NumElements); @@ -9897,6 +9899,7 @@ Fn->removeFnAttr(llvm::Attribute::OptimizeNone); // Start the mapper function code generation. CodeGenFunction MapperCGF(CGM); + MapperCGF.ConvertTypeForMem(Ty); MapperCGF.StartFunction(GlobalDecl(), C.VoidTy, Fn, FnInfo, Args, Loc, Loc); // Compute the starting and end addresses of array elements. llvm::Value *Size = MapperCGF.EmitLoadOfScalar( Index: clang/lib/CodeGen/CGExprScalar.cpp =================================================================== --- clang/lib/CodeGen/CGExprScalar.cpp +++ clang/lib/CodeGen/CGExprScalar.cpp @@ -2603,6 +2603,8 @@ // Next most common: pointer increment. } else if (const PointerType *ptr = type->getAs<PointerType>()) { QualType type = ptr->getPointeeType(); + // Size required for pointer arithmetic. + CGF.ConvertTypeForMem(type); // VLA types don't have constant size. if (const VariableArrayType *vla @@ -3463,6 +3465,8 @@ QualType objectType = pointerOperand->getType() ->castAs<ObjCObjectPointerType>() ->getPointeeType(); + // Size needed for pointer arithmetic. + CGF.ConvertTypeForMem(objectType); llvm::Value *objectSize = CGF.CGM.getSize(CGF.getContext().getTypeSizeInChars(objectType)); @@ -3474,6 +3478,8 @@ } QualType elementType = pointerType->getPointeeType(); + // Size needed for pointer arithmetic. + CGF.ConvertTypeForMem(elementType); if (const VariableArrayType *vla = CGF.getContext().getAsVariableArrayType(elementType)) { // The element count here is the total number of non-VLA elements. Index: clang/lib/CodeGen/CGExprCXX.cpp =================================================================== --- clang/lib/CodeGen/CGExprCXX.cpp +++ clang/lib/CodeGen/CGExprCXX.cpp @@ -2196,6 +2196,8 @@ ConvertType(E->getType())->getPointerTo(); if (E->isTypeOperand()) { + // Just to get tests passing which expect the struct type to be complete. + ConvertTypeForMem(E->getTypeOperand(getContext())); llvm::Constant *TypeInfo = CGM.GetAddrOfRTTIDescriptor(E->getTypeOperand(getContext())); return Builder.CreateBitCast(TypeInfo, StdTypeInfoPtrTy); Index: clang/lib/CodeGen/CGExpr.cpp =================================================================== --- clang/lib/CodeGen/CGExpr.cpp +++ clang/lib/CodeGen/CGExpr.cpp @@ -3669,6 +3669,8 @@ assert(isa<llvm::ConstantInt>(idx) && cast<llvm::ConstantInt>(idx)->isZero()); #endif + // Size needed for pointer arithmethic. + CGF.ConvertTypeForMem(eltType); // Determine the element size of the statically-sized base. This is // the thing that the indices are expressed in terms of. Index: clang/include/clang/AST/Type.h =================================================================== --- clang/include/clang/AST/Type.h +++ clang/include/clang/AST/Type.h @@ -4593,6 +4593,7 @@ friend class ASTReader; template <class T> friend class serialization::AbstractTypeReader; +protected: /// Stores the TagDecl associated with this type. The decl may point to any /// TagDecl that declares the entity. TagDecl *decl; @@ -4627,6 +4628,11 @@ return reinterpret_cast<RecordDecl*>(TagType::getDecl()); } + /// Returns one of the RecordDecls that declare this RecordType. + RecordDecl *getAnyDecl() const { + return reinterpret_cast<RecordDecl*>(decl); + } + /// Recursively check all fields in the record for const-ness. If any field /// is declared const, return true. Otherwise, return false. bool hasConstFields() const;
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits