[clang] [clang] Move tailclipping to bitfield allocation (PR #87090)

2024-03-29 Thread Nathan Sidwell via cfe-commits

https://github.com/urnathan created 
https://github.com/llvm/llvm-project/pull/87090

This follows on from PR65742. As I suspected clipTailPadding's functionality 
can be moved into accumulateBitfields; it has sufficient information now.  It 
turns out that clipTailPadding was undoing some of the work just added to 
accumulateBitfields.  Specifically, on a strict-alignment LP64 machine, 
accumulateBitfields could determine that a 24-bit access unit was the best unit 
to use (loading 3 separate bytes being better than 4). But if those 3 bytes 
were followed by a padding byte, clipTailPadding would not make the storage a 
3-byte array, and the access unit would be an (unaligned) i32.  Boo!  The new 
testcase shows this -- it's written to work on ILP32 and LP64, but the common 
case I think would be on LP64 where a pointer follows the bitfields, and 
there's 4 bytes of alignment padding between.

The fine-grained-bitfield accesses needed adjusting, to preserve the existing 
behaviour of extending them into useable tail padding. You'll see we now check 
this later -- allowing it to grow, but preventing accumulation of a subsequent 
span. I think that's in keeping with the desired behaviour of that option.


>From 4d9861a9ca23d8de7b33c4b830b5c6f78e8a3768 Mon Sep 17 00:00:00 2001
From: Nathan Sidwell 
Date: Sun, 24 Mar 2024 09:37:46 -0400
Subject: [PATCH 1/2] Move bitfield clipping into bitfield accumulation

---
 clang/lib/CodeGen/CGRecordLayoutBuilder.cpp | 63 -
 clang/test/CodeGen/bitfield-access-unit.c   | 18 ++
 2 files changed, 55 insertions(+), 26 deletions(-)

diff --git a/clang/lib/CodeGen/CGRecordLayoutBuilder.cpp 
b/clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
index e32023aeac1e6f..5b9045f661e048 100644
--- a/clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
+++ b/clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
@@ -41,10 +41,11 @@ namespace {
 ///   contains enough information to determine where the runs break.  Microsoft
 ///   and Itanium follow different rules and use different codepaths.
 /// * It is desired that, when possible, bitfields use the appropriate iN type
-///   when lowered to llvm types.  For example unsigned x : 24 gets lowered to
+///   when lowered to llvm types. For example unsigned x : 24 gets lowered to
 ///   i24.  This isn't always possible because i24 has storage size of 32 bit
-///   and if it is possible to use that extra byte of padding we must use
-///   [i8 x 3] instead of i24.  The function clipTailPadding does this.
+///   and if it is possible to use that extra byte of padding we must use [i8 x
+///   3] instead of i24. This is computed when accumulating bitfields in
+///   accumulateBitfields.
 ///   C++ examples that require clipping:
 ///   struct { int a : 24; char b; }; // a must be clipped, b goes at offset 3
 ///   struct A { int a : 24; ~A(); }; // a must be clipped because:
@@ -197,9 +198,7 @@ struct CGRecordLowering {
   /// not the primary vbase of some base class.
   bool hasOwnStorage(const CXXRecordDecl *Decl, const CXXRecordDecl *Query);
   void calculateZeroInit();
-  /// Lowers bitfield storage types to I8 arrays for bitfields with tail
-  /// padding that is or can potentially be used.
-  void clipTailPadding();
+  void checkTailPadding();
   /// Determines if we need a packed llvm struct.
   void determinePacked(bool NVBaseType);
   /// Inserts padding everywhere it's needed.
@@ -302,7 +301,7 @@ void CGRecordLowering::lower(bool NVBaseType) {
   }
   llvm::stable_sort(Members);
   Members.push_back(StorageInfo(Size, getIntNType(8)));
-  clipTailPadding();
+  checkTailPadding();
   determinePacked(NVBaseType);
   insertPadding();
   Members.pop_back();
@@ -521,6 +520,7 @@ 
CGRecordLowering::accumulateBitFields(RecordDecl::field_iterator Field,
   // available padding characters.
   RecordDecl::field_iterator BestEnd = Begin;
   CharUnits BestEndOffset;
+  bool BestClipped; // Whether the representation must be in a byte array.
 
   for (;;) {
 // AtAlignedBoundary is true iff Field is the (potential) start of a new
@@ -583,10 +583,9 @@ 
CGRecordLowering::accumulateBitFields(RecordDecl::field_iterator Field,
 // this is the best seen so far.
 BestEnd = Field;
 BestEndOffset = BeginOffset + AccessSize;
-if (Types.getCodeGenOpts().FineGrainedBitfieldAccesses)
-  // Fine-grained access, so no merging of spans.
-  InstallBest = true;
-else if (!BitSizeSinceBegin)
+// Assume clipped until proven not below.
+BestClipped = true;
+if (!BitSizeSinceBegin)
   // A zero-sized initial span -- this will install nothing and reset
   // for another.
   InstallBest = true;
@@ -614,6 +613,12 @@ 
CGRecordLowering::accumulateBitFields(RecordDecl::field_iterator Field,
 // The access unit is not at a naturally aligned offset within the
 // structure.
 InstallBest = true;
+
+  if (InstallBest && BestEnd == Fi

[clang] [clang] Move tailclipping to bitfield allocation (PR #87090)

2024-03-29 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Nathan Sidwell (urnathan)


Changes

This follows on from PR65742. As I suspected clipTailPadding's functionality 
can be moved into accumulateBitfields; it has sufficient information now.  It 
turns out that clipTailPadding was undoing some of the work just added to 
accumulateBitfields.  Specifically, on a strict-alignment LP64 machine, 
accumulateBitfields could determine that a 24-bit access unit was the best unit 
to use (loading 3 separate bytes being better than 4). But if those 3 bytes 
were followed by a padding byte, clipTailPadding would not make the storage a 
3-byte array, and the access unit would be an (unaligned) i32.  Boo!  The new 
testcase shows this -- it's written to work on ILP32 and LP64, but the common 
case I think would be on LP64 where a pointer follows the bitfields, and 
there's 4 bytes of alignment padding between.

The fine-grained-bitfield accesses needed adjusting, to preserve the existing 
behaviour of extending them into useable tail padding. You'll see we now check 
this later -- allowing it to grow, but preventing accumulation of a subsequent 
span. I think that's in keeping with the desired behaviour of that option.


---
Full diff: https://github.com/llvm/llvm-project/pull/87090.diff


2 Files Affected:

- (modified) clang/lib/CodeGen/CGRecordLayoutBuilder.cpp (+43-35) 
- (modified) clang/test/CodeGen/bitfield-access-unit.c (+18) 


``diff
diff --git a/clang/lib/CodeGen/CGRecordLayoutBuilder.cpp 
b/clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
index e32023aeac1e6f..acf1746294ef73 100644
--- a/clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
+++ b/clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
@@ -41,10 +41,11 @@ namespace {
 ///   contains enough information to determine where the runs break.  Microsoft
 ///   and Itanium follow different rules and use different codepaths.
 /// * It is desired that, when possible, bitfields use the appropriate iN type
-///   when lowered to llvm types.  For example unsigned x : 24 gets lowered to
+///   when lowered to llvm types. For example unsigned x : 24 gets lowered to
 ///   i24.  This isn't always possible because i24 has storage size of 32 bit
-///   and if it is possible to use that extra byte of padding we must use
-///   [i8 x 3] instead of i24.  The function clipTailPadding does this.
+///   and if it is possible to use that extra byte of padding we must use [i8 x
+///   3] instead of i24. This is computed when accumulating bitfields in
+///   accumulateBitfields.
 ///   C++ examples that require clipping:
 ///   struct { int a : 24; char b; }; // a must be clipped, b goes at offset 3
 ///   struct A { int a : 24; ~A(); }; // a must be clipped because:
@@ -197,9 +198,7 @@ struct CGRecordLowering {
   /// not the primary vbase of some base class.
   bool hasOwnStorage(const CXXRecordDecl *Decl, const CXXRecordDecl *Query);
   void calculateZeroInit();
-  /// Lowers bitfield storage types to I8 arrays for bitfields with tail
-  /// padding that is or can potentially be used.
-  void clipTailPadding();
+  void checkTailPadding();
   /// Determines if we need a packed llvm struct.
   void determinePacked(bool NVBaseType);
   /// Inserts padding everywhere it's needed.
@@ -302,7 +301,7 @@ void CGRecordLowering::lower(bool NVBaseType) {
   }
   llvm::stable_sort(Members);
   Members.push_back(StorageInfo(Size, getIntNType(8)));
-  clipTailPadding();
+  checkTailPadding();
   determinePacked(NVBaseType);
   insertPadding();
   Members.pop_back();
@@ -521,6 +520,7 @@ 
CGRecordLowering::accumulateBitFields(RecordDecl::field_iterator Field,
   // available padding characters.
   RecordDecl::field_iterator BestEnd = Begin;
   CharUnits BestEndOffset;
+  bool BestClipped; // Whether the representation must be in a byte array.
 
   for (;;) {
 // AtAlignedBoundary is true iff Field is the (potential) start of a new
@@ -583,10 +583,9 @@ 
CGRecordLowering::accumulateBitFields(RecordDecl::field_iterator Field,
 // this is the best seen so far.
 BestEnd = Field;
 BestEndOffset = BeginOffset + AccessSize;
-if (Types.getCodeGenOpts().FineGrainedBitfieldAccesses)
-  // Fine-grained access, so no merging of spans.
-  InstallBest = true;
-else if (!BitSizeSinceBegin)
+// Assume clipped until proven not below.
+BestClipped = true;
+if (!BitSizeSinceBegin)
   // A zero-sized initial span -- this will install nothing and reset
   // for another.
   InstallBest = true;
@@ -614,6 +613,12 @@ 
CGRecordLowering::accumulateBitFields(RecordDecl::field_iterator Field,
 // The access unit is not at a naturally aligned offset within the
 // structure.
 InstallBest = true;
+
+  if (InstallBest && BestEnd == Field)
+// We're installing the first span, who's clipping was
+// conservatively presumed above. Compute it correctly.
+  

[clang] [clang] Move tailclipping to bitfield allocation (PR #87090)

2024-03-29 Thread Nathan Sidwell via cfe-commits

urnathan wrote:

Sigh, clipTailPadding is still needed, because of no_unique_address and empty 
base placement.  Will update.

https://github.com/llvm/llvm-project/pull/87090
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Move tailclipping to bitfield allocation (PR #87090)

2024-04-01 Thread Nathan Sidwell via cfe-commits

https://github.com/urnathan converted_to_draft 
https://github.com/llvm/llvm-project/pull/87090
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Move tailclipping to bitfield allocation (PR #87090)

2024-04-15 Thread Nathan Sidwell via cfe-commits

urnathan wrote:

@rjmccall can you take a look at this? as the description says, this turns out 
to not be NFC, but important on (probably only) LP64 machines.

https://github.com/llvm/llvm-project/pull/87090
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Move tailclipping to bitfield allocation (PR #87090)

2024-04-15 Thread John McCall via cfe-commits

https://github.com/rjmccall approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/87090
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Move tailclipping to bitfield allocation (PR #87090)

2024-04-15 Thread Nathan Sidwell via cfe-commits

https://github.com/urnathan closed 
https://github.com/llvm/llvm-project/pull/87090
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits