[PATCH] D100739: [Coroutines] Handle overaligned frame allocation (2)

Chuanqi Xu via Phabricator via cfe-commits Sun, 25 Apr 2021 23:55:39 -0700

ChuanqiXu added a comment.

In D100739#2713579 <https://reviews.llvm.org/D100739#2713579>, @ychen wrote:


> In D100739#2711698 <https://reviews.llvm.org/D100739#2711698>, @ChuanqiXu 
> wrote:
>
>>> This is an alternative to D97915 <https://reviews.llvm.org/D97915> which 
>>> missed proper deallocation of the over-allocated frame. This patch handles 
>>> both allocations and deallocations.
>>
>> If D97915 <https://reviews.llvm.org/D97915> is not needed, we should abandon 
>> it.
>>
>> For the example shows in D97915 <https://reviews.llvm.org/D97915>, it says:
>>
>>   #include <experimental/coroutine>
>>   #include <iostream>
>>   #include <stdexcept>
>>   #include <thread>
>>   #include <cassert>
>>   
>>   struct task{
>>     struct alignas(64) promise_type {
>>       task get_return_object() { return {}; }
>>       std::experimental::suspend_never initial_suspend() { return {}; }
>>       std::experimental::suspend_never final_suspend() noexcept { return {}; 
>> }
>>       void return_void() {}
>>       void unhandled_exception() {}
>>     };
>>     using handle = std::experimental::coroutine_handle<promise_type>;
>>   };
>>   
>>   auto switch_to_new_thread() {
>>     struct awaitable {
>>       bool await_ready() { return false; }
>>       void await_suspend(task::handle h) {
>>         auto i = reinterpret_cast<std::uintptr_t>(&h.promise());
>>         std::cout << i << std::endl;
>>         assert(i % 64 == 0);
>>       }
>>       void await_resume() {}
>>     };
>>     return awaitable{};
>>   }
>>   
>>   task resuming_on_new_thread() {
>>     co_await switch_to_new_thread();
>>   }
>>   
>>   int main() {
>>     resuming_on_new_thread();
>>   }
>>
>> The assertion would fail. If this is the root problem, I think we could 
>> adjust the align for the promise alloca like:
>
> The problem is that any member of the coroutine frame could be overaligned 
> (thus make the frame overaligned) including promise, local variables, spills. 
> The problem is *not* specific to promise.
>
>>   %promise = alloca %promise_type, align 8
>>
>> into
>>
>>   %promise = alloca %promise_type, align 128
>>
>> In other words, if this the problem we need to solve, I think we could make 
>> it in a simpler way.
>
> This may not fix the problem.
>
>> Then I looked into the document you give in the summary. The issue#3 says 
>> the frontend can't do some work in the process of template instantiation due 
>> to the frontend doesn't know about the align and size of the coroutine. But 
>> from the implementation, it looks like not the problem this patch wants to 
>> solve.
>
> I meant to use that as a reference to help describe the problem (but not the 
> solution). The document itself includes both problem statements (issue#3) and 
> solutions (frontend-based) which are totally unrelated to this patch. It 
> looks like it is not that useful in this case so please disregard that.
>
>> I am really confused about the problem. Could you please restate your 
>> problem more in more detail? For example, would it make the alignment 
>> incorrect like the example above? Or does we want the frontend to get 
>> alignment information? Then what would be affected? From the title, I can 
>> guess the size of frame would get bigger. But how big would it be? Who would 
>> control and determine the final size?
>
> understood.
>
> There are two kinds of alignments: the alignment of a type/object at 
> compile-time (ABI specified or user-specified), and the alignment the object 
> of that type actually gets during runtime. The compiler assumes that the 
> alignment of a struct is the maximal alignment of all its members. However, 
> that assumption may not be true at runtime where the memory allocator may 
> return a memory block that has insufficient alignment which causes some 
> members aligned incorrectly.
>
> For C++ coroutine, right now the default memory allocator could only return 
> 16 bytes aligned memory block. When any member of the coroutine frame 
> (promise, local variables, spills etc.) has alignment > 16, the frame becomes 
> overaligned. This could only be fixed dynamically at runtime: by 
> over-allocating memory and then adjust the frame start address so that it 
> aligns correctly.
>
> For example, suppose malloc returns 16 bytes aligned address 16, how do we 
> make it 64 bytes aligned? align 16 up to an address that is 64 bytes aligned 
> which is 64, so the adjustment amount is 64-16=48
>
> Another similar example, suppose malloc returns 16 bytes aligned address 32, 
> how do we make it 64 bytes aligned? align 32 up to an address that is 64 
> bytes aligned which is 64, so the adjustment amount is 64-32=32
>
> Another similar example, suppose malloc returns 16 bytes aligned address 48, 
> how do we make it 64 bytes aligned? align 48 up to an address that is 64 
> bytes aligned which is 64, so the adjustment amount is 64-48=16
>
> Another similar example, suppose malloc returns 16 bytes aligned address 64, 
> how do we make it 64 bytes aligned? align 64 up to an address that is 64 
> bytes aligned which is 64, so the adjustment amount is 64-64=0
>
> So the mamximal adjustment amount is 64-16=48. We don't know until runtime if 
> the malloc returned address X is (X % 64 == 0) or (X % 64 == 16) or (X % 64 
> == 32) or (X % 64 == 48), so we must emit extra code to deal with all cases 
> (by bitwise operations).

Thanks for the explanation. I think I got the problem now. And my understanding 
for the solution of this patch is, if the align of the original frame is 64, 
then we allocate (64+48) space  to the new frame now. And the original frame 
becomes an alloca with align 64 in the new frame. So the actually frame gets 
the right alignment now. Do I get the problem and solution right?

I am wondering if there is a simpler solution. For example, after we construct 
the frame, we can get the alignment requirement for the frame. Then if the 
alignment is bigger than 16, we could lower the `coro.begin` to aligned new 
instead of default new. It looks like  that the implementation would be much 
simpler.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100739/new/

https://reviews.llvm.org/D100739

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D100739: [Coroutines] Handle overaligned frame allocation (2)

Reply via email to