llvmbot wrote:


@llvm/pr-subscribers-clang

Author: None (ille-apple)

<details>
<summary>Changes</summary>

This is an updated version of https://reviews.llvm.org/D90434 from 2020.  Due 
to time constraints I stopped working on the patch, and unfortunately never got 
back around to it until now.

Clang special-cases `__block` variables whose initializer contains a block 
literal that captures the variable itself.  But this special case doesn't work 
for variables of record types, causing Clang to emit broken code.

Fix this by initializing such variables directly on the heap, and add a warning 
for when this needs to be done.

### Background on the special case

`__block` variables are unique in that their address can change during the 
lifetime of the variable.  Each `__block` variable starts on the stack.  But 
when a block is retained using `Block_copy`, both the block itself and any 
`__block` variables it captures are moved to the heap.

What's tricky is when this move happens while the variable is still being 
initialized.  For example:

```c
#include <stdio.h>
#include <Block.h>

int copyBlockAndReturnInt(void (^block)(void)) {
    Block_copy(block);
    return 42;
}

int main() {
  __block int val = copyBlockAndReturnInt(^{
      printf("%d\n", val);
  });
}
```

Here, when `Block_copy` is called, `val` is moved to the heap while still in an 
uninitialized state.  Then when `copyBlockAndReturnInt` returns, `main` needs 
to store the return value to the variable's new location, not to the original 
stack slot.

This can only happen if a block literal that captures the variable is located 
syntactically within the initializer itself.  If it's before the initializer 
then it couldn't capture the variable, and if it's after the initializer then 
it couldn't execute during initialization.

So Clang detects when such a literal exists inside the initializer; 
`CGDecl.cpp` refers to this condition as `capturedByInit`.  If it does, then 
the initialization code operates in a special mode.  In several functions in 
that file, there's an `LValue` object and an accompanying `capturedByInit` 
flag. If `capturedByInit` is false (the typical case), then the lvalue points 
to the object being initialized.  But if it's true, the lvalue points to the 
byref header.  Then, depending on the type of the variable, you end up at one 
of many checks of the form:

```cpp
if (capturedByInit)
  drillIntoBlockVariable(*this, lvalue, cast<VarDecl>(D));
```

`drillIntoBlockVariable` emits a load of the forwarding pointer from the byref 
header (to find the object's current location), and then modifies `lvalue` to 
point to the object, unifying the two cases.  Importantly, this happens at the 
last minute: after emitting code to evaluate the initializer, but before 
emitting the `store` instruction to write the evaluated value into the variable.

Okay, but then why are there two different code paths?  If the address 
calculation can be safely deferred, why not just do that all the time rather 
than having a separate mode just for self-capturing `__block` variables?

Because it *can't* always be safely deferred.

### Harder cases

First, consider the case where the variable being initialized is a C struct.  This 
is already problematic.  `drillIntoBlockVariable` is supposed to be called 
after evaluating the initializer but before doing a `store`.  But when 
initializing a struct, Clang doesn't evaluate the whole initializer before 
storing.  Instead it evaluates the first field, stores the first field, 
evaluates the second field, stores the second field, and so on.  This is 
actually necessary for correctness[^1] given code like:

```c
struct Foo { int a, b; };
struct Foo foo = {1, foo.a};
```

But if any one of the field evaluations can move the object, then Clang would 
need to reload the forwarding pointer before storing each field.  And that 
would need to apply recursively to any subfields.  Such a scheme is definitely 
possible, but Clang doesn't implement it.

And if instead of a C struct you have a C++ class, supporting relocation to the 
heap becomes flat-out impossible.  Consider:

```cpp
struct Foo {
    void (^block_)(void);
    Foo(void (^block)(void)) {
        printf("I am at %p\n", this);
        block_ = Block_copy(block);
        printf("I am now at %p\n", this); // shouldn't change!
    }
};

__block Foo foo(^{
    printf("%p\n", &foo);
});
```

C++ constructors can observe the address of the object being constructed, and 
expect that address not to change.  Once the object has finished being 
constructed, it can be relocated by calling the move or copy constructor, and 
that's what Clang uses for block captures in the non-self-capturing case.  But 
in this case, `Block_copy` is called while the stack object is still running 
its constructor.  It would be nonsensical to try to move out of it in this 
state.

### Clang's current behavior

So what does Clang currently do?  Well, for both C structs and C++ classes (as 
well as unions[^2]), it runs into this TODO dating to 2011:

```cpp
case TEK_Aggregate:
  [...]
    // TODO: how can we delay here if D is captured by its initializer?
```

Clang simply neglects to call `drillIntoBlockVariable` and falls into its 
normal initialization code.  That means the lvalue is initialized as if it 
pointed to the object, when it actually points to the byref header.  As such, 
the generated code is non-functional.  And this is true for all `__block` 
variables of record type that Clang detects as self-capturing, even if the 
initializer never actually calls `Block_copy`.

### Solution

As stated, it's impossible for self-capturing variables of at least some record 
types to support moving to the heap.  But there is a way to make them work 
regardless.  Any given `__block` variable can move at most once, from the stack 
to the heap.  Once on the heap, it never needs to move again.  So, as suggested 
by @rjmccall in 2020, we can create a heap allocation before 
initialization even starts, and then initialize the object directly on the 
heap, knowing its address will remain stable.

This patch applies this approach to all self-capturing `__block` variables with 
record type, i.e. all variables that previously triggered the broken code 
generation.

The obvious downside of the approach is that we pay the cost of a heap allocation 
even if `Block_copy` is never called.  Conveniently, this can never be a 
performance regression *per se*, since the affected cases didn't work at all 
before.  Still, there is room for future improvement, such as by making 
mid-initialization relocation work for C structs[^3], or detecting initializers 
that do contain a block literal but can't possibly call `Block_copy` on it.

To assist performance-sensitive codebases, this patch adds a warning, 
`-Wimplicit-block-var-alloc`, that triggers whenever a block requires an 
implicit allocation.  Since most codebases probably don't care, this warning is 
disabled by default and is not included in any standard warning groups.
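Usage might look like the following (a sketch; everything except the new `-Wimplicit-block-var-alloc` flag is a standard clang option):

```console
# Surface implicit heap allocations, or escalate them to errors.
$ clang -fblocks -Wimplicit-block-var-alloc example.c
$ clang -fblocks -Werror=implicit-block-var-alloc example.c
```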

### Allocation failure

Another downside of the approach is that because the heap allocation is 
completely implicit, there is no way to check for allocation failure.  
Normally, heap allocation happens only when you call `Block_copy`, which means 
you can catch allocation failure by checking whether `Block_copy` returns NULL.

...Or so I thought.  It turns out that checking `Block_copy`'s return value is 
only a half-measure.  `Block_copy` does return NULL if there's an allocation 
failure when moving the block itself to the heap.  But then `Block_copy` calls 
into the copy helper to move captured variables to the heap, which can also 
fail.  If it does, the copy helper has no way to report that failure.  Instead 
you just get a crash (which should probably be a proper abort; Apple folks, see 
rdar://126632757).  This limitation is more or less baked into the ABI, since 
copy helpers return void.  Even xnu, which uses blocks in kernel space, does 
not support allocation failure for them - though, luckily, xnu's version [waits 
for memory to be available rather than 
crashing](https://github.com/apple-oss-distributions/xnu/blob/94d3b452840153a99b38a3a9659680b2a006908e/libkern/libclosure/runtime.cpp#L49).

The point is, if recovery from allocation failure is already not a thing, then 
this patch isn't responsible for breaking it.  Even if blocks are changed 
someday to support it, it would probably be desired only by a tiny minority of 
codebases, and those codebases could avoid implicit allocations by enabling the 
new warning and setting it as an error.

### ABI limitation

The blocks runtime has no API to allocate a byref struct directly on the heap, 
because it's never been necessary before.  There is only `_Block_object_assign` 
to move one from the stack to the heap.  It would be nice to add an API to 
allocate directly on the heap, but for the dual sakes of backward compatibility 
and laziness, I have not attempted to do so.

Instead, the codegen works with the existing runtime, as follows.  First, a 
byref struct is allocated on the stack (including space for the object), but 
only the header is initialized; the object part is left uninitialized.  Then 
`_Block_object_assign` is called to migrate it to the heap (which unfortunately 
pays the cost of memcpying the uninitialized stack memory to the heap).  This 
produces the desired heap allocation, but also increases the reference count, 
which is balanced by immediately calling `_Block_object_dispose`.  After that, 
the object can finally be initialized inside the new heap allocation.

### Improved self-capture analysis

The existing check for self-capturing `__block` variables is implemented in the 
function `isCapturedBy`.  But that implementation turns out to be 
overcomplicated for no good reason.  In ancient days (2011), `isCapturedBy` 
only worked on `Expr`s, and it recursed on the `Expr`'s children, which it 
assumed would also be `Expr`s.  This assumption turned out to be false for GNU 
statement expressions.  The simple fix would have been to change `isCapturedBy` 
to accept and recurse on arbitrary `Stmt`s instead, since `children()` is 
defined for all `Stmt`s, not just `Expr`s.  I'm not sure if there was some 
reason that wasn't possible at the time, but what happened instead is that a 
special case was added for statement expressions, which over a series of 
patches got more and more convoluted.

This patch replaces the whole thing with an `EvaluatedExprVisitor` (another 
suggestion from @rjmccall).  This is not only simpler, but also more 
precise, eliminating false positive self-captures including:

- Block literals in unevaluated contexts (`sizeof(^{ ... })`).

- Some types of statements inside GNU statement expressions.

This patch also moves the self-capture check earlier in the compilation 
process, from CodeGen to Sema.  This is necessary in order to support the 
aforementioned warning.

### Constant initialization

This patch also fixes a related bug where self-capturing `__block` variables 
could trigger this assert:

```cpp
assert(!capturedByInit && "constant init contains a capturing block?");
```

The assert is important, because constant initialization doesn't support the 
`drillIntoBlockVariable` mechanism.  Now, a constant initializer obviously 
can't include a call to `Block_copy`; in fact, it can't include block literals 
at all unless they have no captures.  But I found two cases where the assert 
can be triggered anyway.

First, the initializer can syntactically contain a capturing block literal but 
never evaluate it:

```cpp
__block constexpr int foo = true ? 1 : ^{ return foo; }();
```

This is addressed in this patch by forcing constant-initialized variables to be 
treated as non-self-capturing.

Second, the initializer can simply not be constant.  With variables of types 
like `const int`, Clang first tries to constant-evaluate the initializer, then 
falls back to runtime initialization if that fails.  But the assert is located 
in a place where it triggers just based on the try.  This is addressed in this 
patch by moving the assert down and conditioning it on constant evaluation 
succeeding.

[^1]: Supporting such code appears to be required by the C++ spec (see
    https://timsong-cpp.github.io/cppwp/n4861/dcl.init.aggr#6), though not
    by the C spec (see https://port70.net/~nsz/c/c11/n1570.html#6.7.9p23).

[^2]: There are other `TEK_Aggregate` types besides records, but all such
    types either can't be captured by blocks (arrays) or can't be the
    type of a local variable at all.

[^3]: Though even in C, moving mid-initializer could break code that took the 
address of the variable:

    ```c
    struct Foo { void *a; void *(^b)(void); };
    __block struct Foo x = {&x.b, Block_copy(^{ return x.a; })};
    assert(x.a == &x.b); // fails without up-front heap allocation
    ```

    However, it's a bit less obvious that such code *needs* to work; perhaps 
the user should just be aware the variable will be moving.  It's not like C++ 
where construction is clearly an indivisible operation performed at a single 
address.



---

Patch is 33.79 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/89475.diff


12 Files Affected:

- (modified) clang/include/clang/AST/Decl.h (+20-2) 
- (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+6) 
- (modified) clang/lib/AST/Decl.cpp (+13) 
- (modified) clang/lib/CodeGen/CGBlocks.cpp (+57-24) 
- (modified) clang/lib/CodeGen/CGBlocks.h (+1) 
- (modified) clang/lib/CodeGen/CGDecl.cpp (+32-82) 
- (modified) clang/lib/CodeGen/CodeGenFunction.h (+11-1) 
- (modified) clang/lib/CodeGen/CodeGenModule.h (+6-2) 
- (modified) clang/lib/Sema/Sema.cpp (+65) 
- (modified) clang/lib/Serialization/ASTReaderDecl.cpp (+1) 
- (modified) clang/lib/Serialization/ASTWriterDecl.cpp (+4-2) 
- (modified) clang/test/CodeGenCXX/block-capture.cpp (+124-3) 


``````````diff
diff --git a/clang/include/clang/AST/Decl.h b/clang/include/clang/AST/Decl.h
index 8b121896d66d15..dc816786d7f11d 100644
--- a/clang/include/clang/AST/Decl.h
+++ b/clang/include/clang/AST/Decl.h
@@ -1103,6 +1103,11 @@ class VarDecl : public DeclaratorDecl, public 
Redeclarable<VarDecl> {
 
     LLVM_PREFERRED_TYPE(bool)
     unsigned IsCXXCondDecl : 1;
+
+    /// Whether this is a __block variable that is captured by a block
+    /// referenced in its own initializer.
+    LLVM_PREFERRED_TYPE(bool)
+    unsigned CapturedByOwnInit : 1;
   };
 
   union {
@@ -1588,10 +1593,23 @@ class VarDecl : public DeclaratorDecl, public 
Redeclarable<VarDecl> {
   /// escaping block.
   bool isNonEscapingByref() const;
 
-  void setEscapingByref() {
-    NonParmVarDeclBits.EscapingByref = true;
+  void setEscapingByref();
+
+  /// Indicates this is a __block variable that is captured by a block
+  /// referenced in its own initializer.
+  bool isCapturedByOwnInit() const {
+    return !isa<ParmVarDecl>(this) && NonParmVarDeclBits.CapturedByOwnInit;
   }
 
+  void setCapturedByOwnInit() {
+    assert(!isa<ParmVarDecl>(this));
+    NonParmVarDeclBits.CapturedByOwnInit = true;
+  }
+
+  /// Check if this is a __block variable which needs to be initialized
+  /// directly on the heap.
+  bool needsInitOnHeap() const;
+
   bool isCXXCondDecl() const {
     return isa<ParmVarDecl>(this) ? false : NonParmVarDeclBits.IsCXXCondDecl;
   }
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td 
b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 1bbe76ff6bd2ac..f19ee8fbc21f3a 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -2356,6 +2356,12 @@ def note_var_fixit_add_initialization : Note<
 def note_uninit_fixit_remove_cond : Note<
   "remove the %select{'%1' if its condition|condition if it}0 "
   "is always %select{false|true}2">;
+def warn_implicit_block_var_alloc : Warning<
+  "variable %q0 will be initialized in a heap allocation">,
+  InGroup<DiagGroup<"implicit-block-var-alloc">>, DefaultIgnore;
+def note_because_captured_by_block : Note<
+  "because it is captured by a block used in its own initializer">;
+
 def err_init_incomplete_type : Error<"initialization of incomplete type %0">;
 def err_list_init_in_parens : Error<
   "cannot initialize %select{non-class|reference}0 type %1 with a "
diff --git a/clang/lib/AST/Decl.cpp b/clang/lib/AST/Decl.cpp
index 474e0ccde5bbf7..fda9010ba126d7 100644
--- a/clang/lib/AST/Decl.cpp
+++ b/clang/lib/AST/Decl.cpp
@@ -2680,6 +2680,19 @@ bool VarDecl::isNonEscapingByref() const {
   return hasAttr<BlocksAttr>() && !NonParmVarDeclBits.EscapingByref;
 }
 
+void VarDecl::setEscapingByref() {
+  assert(!isa<ParmVarDecl>(this) && hasAttr<BlocksAttr>());
+  NonParmVarDeclBits.EscapingByref = true;
+}
+
+bool VarDecl::needsInitOnHeap() const {
+  // We need direct initialization on the heap for self-capturing __block
+  // variables of types that cause CodeGenFunction::getEvaluationKind to return
+  // TEK_Aggregate.  The only such types that can be captured are records.
+  return isCapturedByOwnInit() &&
+         getType().getAtomicUnqualifiedType()->isRecordType();
+}
+
 bool VarDecl::hasDependentAlignment() const {
   QualType T = getType();
   return T->isDependentType() || T->isUndeducedType() ||
diff --git a/clang/lib/CodeGen/CGBlocks.cpp b/clang/lib/CodeGen/CGBlocks.cpp
index 2742c39965b2c8..4bea057a15a52c 100644
--- a/clang/lib/CodeGen/CGBlocks.cpp
+++ b/clang/lib/CodeGen/CGBlocks.cpp
@@ -45,6 +45,13 @@ CGBlockInfo::CGBlockInfo(const BlockDecl *block, StringRef 
name)
     name = name.substr(1);
 }
 
+BlockByrefHelpers::BlockByrefHelpers(const BlockByrefInfo &info)
+    : CopyHelper(nullptr), DisposeHelper(nullptr),
+      // The alignment we care about for the purposes of uniquing byref
+      // helpers is the alignment of the actual byref value field.
+      Alignment(info.ByrefAlignment.alignmentAtOffset(info.FieldOffset)),
+      ForceNoopCopy(info.ForceNoopCopy) {}
+
 // Anchor the vtable to this translation unit.
 BlockByrefHelpers::~BlockByrefHelpers() {}
 
@@ -2127,8 +2134,8 @@ class ObjectByrefHelpers final : public BlockByrefHelpers 
{
   BlockFieldFlags Flags;
 
 public:
-  ObjectByrefHelpers(CharUnits alignment, BlockFieldFlags flags)
-    : BlockByrefHelpers(alignment), Flags(flags) {}
+  ObjectByrefHelpers(const BlockByrefInfo &info, BlockFieldFlags flags)
+      : BlockByrefHelpers(info), Flags(flags) {}
 
   void emitCopy(CodeGenFunction &CGF, Address destField,
                 Address srcField) override {
@@ -2161,7 +2168,7 @@ class ObjectByrefHelpers final : public BlockByrefHelpers 
{
 /// Emits the copy/dispose helpers for an ARC __block __weak variable.
 class ARCWeakByrefHelpers final : public BlockByrefHelpers {
 public:
-  ARCWeakByrefHelpers(CharUnits alignment) : BlockByrefHelpers(alignment) {}
+  ARCWeakByrefHelpers(const BlockByrefInfo &info) : BlockByrefHelpers(info) {}
 
   void emitCopy(CodeGenFunction &CGF, Address destField,
                 Address srcField) override {
@@ -2182,7 +2189,7 @@ class ARCWeakByrefHelpers final : public 
BlockByrefHelpers {
 /// that's not of block-pointer type.
 class ARCStrongByrefHelpers final : public BlockByrefHelpers {
 public:
-  ARCStrongByrefHelpers(CharUnits alignment) : BlockByrefHelpers(alignment) {}
+  ARCStrongByrefHelpers(const BlockByrefInfo &info) : BlockByrefHelpers(info) 
{}
 
   void emitCopy(CodeGenFunction &CGF, Address destField,
                 Address srcField) override {
@@ -2218,8 +2225,8 @@ class ARCStrongByrefHelpers final : public 
BlockByrefHelpers {
 /// variable that's of block-pointer type.
 class ARCStrongBlockByrefHelpers final : public BlockByrefHelpers {
 public:
-  ARCStrongBlockByrefHelpers(CharUnits alignment)
-    : BlockByrefHelpers(alignment) {}
+  ARCStrongBlockByrefHelpers(const BlockByrefInfo &info)
+      : BlockByrefHelpers(info) {}
 
   void emitCopy(CodeGenFunction &CGF, Address destField,
                 Address srcField) override {
@@ -2248,9 +2255,9 @@ class CXXByrefHelpers final : public BlockByrefHelpers {
   const Expr *CopyExpr;
 
 public:
-  CXXByrefHelpers(CharUnits alignment, QualType type,
+  CXXByrefHelpers(const BlockByrefInfo &info, QualType type,
                   const Expr *copyExpr)
-    : BlockByrefHelpers(alignment), VarType(type), CopyExpr(copyExpr) {}
+    : BlockByrefHelpers(info), VarType(type), CopyExpr(copyExpr) {}
 
   bool needsCopy() const override { return CopyExpr != nullptr; }
   void emitCopy(CodeGenFunction &CGF, Address destField,
@@ -2276,8 +2283,8 @@ class NonTrivialCStructByrefHelpers final : public 
BlockByrefHelpers {
   QualType VarType;
 
 public:
-  NonTrivialCStructByrefHelpers(CharUnits alignment, QualType type)
-    : BlockByrefHelpers(alignment), VarType(type) {}
+  NonTrivialCStructByrefHelpers(const BlockByrefInfo &info, QualType type)
+      : BlockByrefHelpers(info), VarType(type) {}
 
   void emitCopy(CodeGenFunction &CGF, Address destField,
                 Address srcField) override {
@@ -2336,7 +2343,7 @@ generateByrefCopyHelper(CodeGenFunction &CGF, const 
BlockByrefInfo &byrefInfo,
     // Create a scope with an artificial location for the body of this 
function.
   auto AL = ApplyDebugLocation::CreateArtificial(CGF);
 
-  if (generator.needsCopy()) {
+  if (generator.needsCopy() && !generator.ForceNoopCopy) {
     // dst->x
     Address destField = CGF.GetAddrOfLocalVar(&Dst);
     destField = Address(CGF.Builder.CreateLoad(destField), byrefInfo.Type,
@@ -2458,18 +2465,13 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType 
&byrefType,
 
   auto &byrefInfo = getBlockByrefInfo(&var);
 
-  // The alignment we care about for the purposes of uniquing byref
-  // helpers is the alignment of the actual byref value field.
-  CharUnits valueAlignment =
-    byrefInfo.ByrefAlignment.alignmentAtOffset(byrefInfo.FieldOffset);
-
   if (const CXXRecordDecl *record = type->getAsCXXRecordDecl()) {
     const Expr *copyExpr =
         CGM.getContext().getBlockVarCopyInit(&var).getCopyExpr();
     if (!copyExpr && record->hasTrivialDestructor()) return nullptr;
 
     return ::buildByrefHelpers(
-        CGM, byrefInfo, CXXByrefHelpers(valueAlignment, type, copyExpr));
+        CGM, byrefInfo, CXXByrefHelpers(byrefInfo, type, copyExpr));
   }
 
   // If type is a non-trivial C struct type that is non-trivial to
@@ -2477,7 +2479,7 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType 
&byrefType,
   if (type.isNonTrivialToPrimitiveDestructiveMove() == QualType::PCK_Struct ||
       type.isDestructedType() == QualType::DK_nontrivial_c_struct)
     return ::buildByrefHelpers(
-        CGM, byrefInfo, NonTrivialCStructByrefHelpers(valueAlignment, type));
+        CGM, byrefInfo, NonTrivialCStructByrefHelpers(byrefInfo, type));
 
   // Otherwise, if we don't have a retainable type, there's nothing to do.
   // that the runtime does extra copies.
@@ -2499,7 +2501,7 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType 
&byrefType,
     // byref routines.
     case Qualifiers::OCL_Weak:
       return ::buildByrefHelpers(CGM, byrefInfo,
-                                 ARCWeakByrefHelpers(valueAlignment));
+                                 ARCWeakByrefHelpers(byrefInfo));
 
     // ARC __strong __block variables need to be retained.
     case Qualifiers::OCL_Strong:
@@ -2507,13 +2509,13 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType 
&byrefType,
       // transfer possible.
       if (type->isBlockPointerType()) {
         return ::buildByrefHelpers(CGM, byrefInfo,
-                                   ARCStrongBlockByrefHelpers(valueAlignment));
+                                   ARCStrongBlockByrefHelpers(byrefInfo));
 
       // Otherwise, we transfer ownership of the retain from the stack
       // to the heap.
       } else {
         return ::buildByrefHelpers(CGM, byrefInfo,
-                                   ARCStrongByrefHelpers(valueAlignment));
+                                   ARCStrongByrefHelpers(byrefInfo));
       }
     }
     llvm_unreachable("fell out of lifetime switch!");
@@ -2533,7 +2535,7 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType 
&byrefType,
     flags |= BLOCK_FIELD_IS_WEAK;
 
   return ::buildByrefHelpers(CGM, byrefInfo,
-                             ObjectByrefHelpers(valueAlignment, flags));
+                             ObjectByrefHelpers(byrefInfo, flags));
 }
 
 Address CodeGenFunction::emitBlockByrefAddress(Address baseAddr,
@@ -2653,11 +2655,39 @@ const BlockByrefInfo 
&CodeGenFunction::getBlockByrefInfo(const VarDecl *D) {
   info.FieldOffset = varOffset;
   info.ByrefAlignment = std::max(varAlign, getPointerAlign());
 
+  // If we're initializing directly on the heap, then we should emit a
+  // no-op copy helper, both because we don't need a real one (the
+  // object will never move), and because a real one would break the
+  // pre-init _Block_object_assign.
+  info.ForceNoopCopy = D->needsInitOnHeap();
+
   auto pair = BlockByrefInfos.insert({D, info});
   assert(pair.second && "info was inserted recursively?");
   return pair.first->second;
 }
 
+void CodeGenFunction::emitByrefInitOnHeap(llvm::Value *P) {
+  // The object itself is initialized directly on the heap.  But for ABI
+  // backwards compatibility reasons, we have to set up a fake byref struct on
+  // the stack (with the structural components initialized but not the object
+  // itself), then call _Block_object_assign to move it to the heap (which is
+  // safe because we forced a no-op copy helper), then call
+  // _Block_object_dispose to release the extra ref from _Block_object_assign.
+  //
+  // 'P' points to the fake byref struct.
+
+  BlockFieldFlags flags = BLOCK_FIELD_IS_BYREF;
+  // Ignored out-parameter.  We'll use the forwarding pointer instead.
+  RawAddress out = CreateDefaultAlignTempAlloca(P->getType(), 
"initOnHeap.out");
+
+  llvm::Value *args[] = {Builder.CreateBitCast(out.getPointer(), VoidPtrTy),
+                         Builder.CreateBitCast(P, VoidPtrTy),
+                         llvm::ConstantInt::get(Int32Ty, flags.getBitMask())};
+  EmitNounwindRuntimeCall(CGM.getBlockObjectAssign(), args);
+
+  BuildBlockRelease(P, flags, false);
+}
+
 /// Initialize the structural components of a __block variable, i.e.
 /// everything but the actual object.
 void CodeGenFunction::emitByrefStructureInit(const AutoVarEmission &emission) {
@@ -2699,8 +2729,8 @@ void CodeGenFunction::emitByrefStructureInit(const 
AutoVarEmission &emission) {
   storeHeaderField(V, getPointerSize(), "byref.isa");
 
   // Store the address of the variable into its own forwarding pointer.
-  storeHeaderField(addr.emitRawPointer(*this), getPointerSize(),
-                   "byref.forwarding");
+  llvm::Value *pointer = addr.emitRawPointer(*this);
+  storeHeaderField(pointer, getPointerSize(), "byref.forwarding");
 
   // Blocks ABI:
   //   c) the flags field is set to either 0 if no helper functions are
@@ -2764,6 +2794,9 @@ void CodeGenFunction::emitByrefStructureInit(const 
AutoVarEmission &emission) {
     auto layoutInfo = CGM.getObjCRuntime().BuildByrefLayout(CGM, type);
     storeHeaderField(layoutInfo, getPointerSize(), "byref.layout");
   }
+
+  if (emission.NeedsInitOnHeap)
+    emitByrefInitOnHeap(pointer);
 }
 
 void CodeGenFunction::BuildBlockRelease(llvm::Value *V, BlockFieldFlags flags,
diff --git a/clang/lib/CodeGen/CGBlocks.h b/clang/lib/CodeGen/CGBlocks.h
index 8d10c4f69b2026..526dbbfd9b38d3 100644
--- a/clang/lib/CodeGen/CGBlocks.h
+++ b/clang/lib/CodeGen/CGBlocks.h
@@ -139,6 +139,7 @@ class BlockByrefInfo {
   unsigned FieldIndex;
   CharUnits ByrefAlignment;
   CharUnits FieldOffset;
+  bool ForceNoopCopy;
 };
 
 /// Represents a type of copy/destroy operation that should be performed for an
diff --git a/clang/lib/CodeGen/CGDecl.cpp b/clang/lib/CodeGen/CGDecl.cpp
index ce6d6d8956076e..954dd1079215fd 100644
--- a/clang/lib/CodeGen/CGDecl.cpp
+++ b/clang/lib/CodeGen/CGDecl.cpp
@@ -1450,6 +1450,8 @@ CodeGenFunction::EmitAutoVarAlloca(const VarDecl &D) {
 
   bool isEscapingByRef = D.isEscapingByref();
   emission.IsEscapingByRef = isEscapingByRef;
+  emission.IsCapturedByOwnInit = D.isCapturedByOwnInit();
+  emission.NeedsInitOnHeap = D.needsInitOnHeap();
 
   CharUnits alignment = getContext().getDeclAlign(&D);
 
@@ -1692,68 +1694,6 @@ CodeGenFunction::EmitAutoVarAlloca(const VarDecl &D) {
   return emission;
 }
 
-static bool isCapturedBy(const VarDecl &, const Expr *);
-
-/// Determines whether the given __block variable is potentially
-/// captured by the given statement.
-static bool isCapturedBy(const VarDecl &Var, const Stmt *S) {
-  if (const Expr *E = dyn_cast<Expr>(S))
-    return isCapturedBy(Var, E);
-  for (const Stmt *SubStmt : S->children())
-    if (isCapturedBy(Var, SubStmt))
-      return true;
-  return false;
-}
-
-/// Determines whether the given __block variable is potentially
-/// captured by the given expression.
-static bool isCapturedBy(const VarDecl &Var, const Expr *E) {
-  // Skip the most common kinds of expressions that make
-  // hierarchy-walking expensive.
-  E = E->IgnoreParenCasts();
-
-  if (const BlockExpr *BE = dyn_cast<BlockExpr>(E)) {
-    const BlockDecl *Block = BE->getBlockDecl();
-    for (const auto &I : Block->captures()) {
-      if (I.getVariable() == &Var)
-        return true;
-    }
-
-    // No need to walk into the subexpressions.
-    return false;
-  }
-
-  if (const StmtExpr *SE = dyn_cast<StmtExpr>(E)) {
-    const CompoundStmt *CS = SE->getSubStmt();
-    for (const auto *BI : CS->body())
-      if (const auto *BIE = dyn_cast<Expr>(BI)) {
-        if (isCapturedBy(Var, BIE))
-          return true;
-      }
-      else if (const auto *DS = dyn_cast<DeclStmt>(BI)) {
-          // special case declarations
-          for (const auto *I : DS->decls()) {
-              if (const auto *VD = dyn_cast<VarDecl>((I))) {
-                const Expr *Init = VD->getInit();
-                if (Init && isCapturedBy(Var, Init))
-                  return true;
-              }
-          }
-      }
-      else
-        // FIXME. Make safe assumption assuming arbitrary statements cause 
capturing.
-        // Later, provide code to poke into statements for capture analysis.
-        return true;
-    return false;
-  }
-
-  for (const Stmt *SubStmt : E->children())
-    if (isCapturedBy(Var, SubStmt))
-      return true;
-
-  return false;
-}
-
 /// Determine whether the given initializer is trivial in the sense
 /// that it requires no code to be generated.
 bool CodeGenFunction::isTrivialInitializer(const Expr *Init) {
@@ -1918,8 +1858,7 @@ void CodeGenFunction::EmitAutoVarInit(const 
AutoVarEmission &emission) {
   // Check whether this is a byref variable that's potentially
   // captured and moved by its own initializer.  If so, we'll need to
   // emit the initializer first, then copy into the variable.
-  bool capturedByInit =
-      Init && emission.IsEscapingByRef && isCapturedBy(D, Init);
+  bool capturedByInit = emission.IsCapturedByOwnInit;
 
   bool locIsByrefHeader = !capturedByInit;
   const Address Loc =
@@ -1951,8 +1890,9 @@ void CodeGenFunction::EmitAutoVarInit(const 
AutoVarEmission &emission) {
   llvm::Constant *constant = nullptr;
   if (emission.IsConstantAggregate ||
       D.mightBeUsableInConstantExpressions(getContext())) {
-    assert(!capturedByInit && "constant init contains a capturing block?");
     constant = ConstantEmitter(*this).tryEmitAbstractForInitializer(D);
+    assert(!(constant && capturedByInit) &&
+           "constant init contains a capturing block?");
     if (constant && !constant->isZeroValue() &&
         (trivialAutoVarInit !=
          LangOptions::TrivialAutoVarInitKind::Uninitialized)) {
@@ -1975,6 +1915,10 @@ void CodeGenFunction::EmitAutoVarInit(const 
AutoVarEmission &emission) {
     initializeWhatIsTechnicallyUninitialized(Loc);
     LValue lv = MakeAddrLValue(Loc, type);
     lv.setNonGC(true);
+    if (emission.NeedsInitOnHeap) {
+      drillIntoBlockVariable(*this, lv, &D);
+      capturedByInit = false; // We don't need to drill again.
+    }
     return EmitExprAsInit(Init, &D, lv, capturedByInit);
   }
 
@@ -2023,6 +1967,9 @@ void CodeGenFunction::EmitExprAsInit(const Expr *init, 
const ValueDecl *D,
     return;
   }
   case TEK_Aggregate:
+    assert(
+        !capturedByInit &&
+        "Capture-by-initializer should have been handled by NeedsInitOnHeap!");
     if (type->isAtomicType()) {
       EmitAtomicInit(const_cast<Expr*>(init), lvalue);
     } else {
@@ -2031,7 +1978,6 @@ void CodeGenFunction::EmitExprAsInit(const Expr *init, 
const ValueDecl *D,
         Overlap = AggValueSlot::DoesNotOverlap;
       else if (auto *FD = dyn_cast<FieldDecl>(D))
         Overlap = getOverlapForFieldInit(FD);
-      // TODO: how can we delay here if D is captured by its initializer?
       EmitAggExpr(init, AggValueSlot::forLValue(
                             lvalue, *this, AggValueSlot::IsDestructed,
                             AggValueSlot::DoesNotNeedGCBarriers,
@@ -2120,27 +2066,31 @@ void CodeGenFunction::EmitAutoVarCleanups(const 
AutoVarEmission &emission) {
   // us from jumping into any of these scopes anyway.
   if (!HaveInsertPoint()) return;
 
-  const VarDecl &D = *emission.Variable;
+  // If we're initializing directly on the heap, _Block_object_destroy will
+  // handle destruction, so we don't need to perform cleanups directly.
+  if (!emission.NeedsInitOnHeap) {
+    const VarDecl &D = *emission.Variable;
 
-  // Check the type for a cleanup.
-  if (QualType::DestructionKind dtorKind = D.needsDestruction(getContext()))
-    emitAutoVarTypeCleanup(emission, dtorKind);
+    // Check the type for a cleanup.
+    if (QualType::DestructionKind dtorKind = D.needsDestruction(getContext()))
+      emitAutoVarTypeCleanup(emission, dtorKind);
 
-  // In GC mode, honor objc_precise_lifetime.
-  if (getLangOpts().getGC() != LangOptions::NonGC &&
-      D.hasAttr<ObjCPreciseLifetimeAttr>()) {
-    EHStack.pushCleanup<ExtendGCLifetime>(NormalCleanup, &D);
-  }
+    // In GC mode, honor objc_precise_lifetime.
+    if (getLangOpts().getGC() != LangOptions::NonGC &&
+        D.hasAttr<ObjCPreciseLifetimeAttr>()) {
+      EHStack.pushCleanup<ExtendGCLifetime>(NormalCleanup, &D);
+    }
 
-  // Handle the cleanup attribute.
-  if (const CleanupAttr *CA = D.getAttr<CleanupAttr>()) {
-    const FunctionDecl *FD = CA->getFunctionDec...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/89475