ChuanqiXu added a comment. In D100739#2713579 <https://reviews.llvm.org/D100739#2713579>, @ychen wrote:
> In D100739#2711698 <https://reviews.llvm.org/D100739#2711698>, @ChuanqiXu > wrote: > >>> This is an alternative to D97915 <https://reviews.llvm.org/D97915> which >>> missed proper deallocation of the over-allocated frame. This patch handles >>> both allocations and deallocations. >> >> If D97915 <https://reviews.llvm.org/D97915> is not needed, we should abandon >> it. >> >> For the example shows in D97915 <https://reviews.llvm.org/D97915>, it says: >> >> #include <experimental/coroutine> >> #include <iostream> >> #include <stdexcept> >> #include <thread> >> #include <cassert> >> >> struct task{ >> struct alignas(64) promise_type { >> task get_return_object() { return {}; } >> std::experimental::suspend_never initial_suspend() { return {}; } >> std::experimental::suspend_never final_suspend() noexcept { return {}; >> } >> void return_void() {} >> void unhandled_exception() {} >> }; >> using handle = std::experimental::coroutine_handle<promise_type>; >> }; >> >> auto switch_to_new_thread() { >> struct awaitable { >> bool await_ready() { return false; } >> void await_suspend(task::handle h) { >> auto i = reinterpret_cast<std::uintptr_t>(&h.promise()); >> std::cout << i << std::endl; >> assert(i % 64 == 0); >> } >> void await_resume() {} >> }; >> return awaitable{}; >> } >> >> task resuming_on_new_thread() { >> co_await switch_to_new_thread(); >> } >> >> int main() { >> resuming_on_new_thread(); >> } >> >> The assertion would fail. If this is the root problem, I think we could >> adjust the align for the promise alloca like: > > The problem is that any member of the coroutine frame could be overaligned > (thus make the frame overaligned) including promise, local variables, spills. > The problem is *not* specific to promise. > >> %promise = alloca %promise_type, align 8 >> >> into >> >> %promise = alloca %promise_type, align 128 >> >> In other words, if this the problem we need to solve, I think we could make >> it in a simpler way. > > This may not fix the problem. > >> Then I looked into the document you give in the summary. The issue#3 says >> the frontend can't do some work in the process of template instantiation due >> to the frontend doesn't know about the align and size of the coroutine. But >> from the implementation, it looks like not the problem this patch wants to >> solve. > > I meant to use that as a reference to help describe the problem (but not the > solution). The document itself includes both problem statements (issue#3) and > solutions (frontend-based) which are totally unrelated to this patch. It > looks like it is not that useful in this case so please disregard that. > >> I am really confused about the problem. Could you please restate your >> problem more in more detail? For example, would it make the alignment >> incorrect like the example above? Or does we want the frontend to get >> alignment information? Then what would be affected? From the title, I can >> guess the size of frame would get bigger. But how big would it be? Who would >> control and determine the final size? > > understood. > > There are two kinds of alignments: the alignment of a type/object at > compile-time (ABI specified or user-specified), and the alignment the object > of that type actually gets during runtime. The compiler assumes that the > alignment of a struct is the maximal alignment of all its members. However, > that assumption may not be true at runtime where the memory allocator may > return a memory block that has insufficient alignment which causes some > members aligned incorrectly. > > For C++ coroutine, right now the default memory allocator could only return > 16 bytes aligned memory block. When any member of the coroutine frame > (promise, local variables, spills etc.) has alignment > 16, the frame becomes > overaligned. This could only be fixed dynamically at runtime: by > over-allocating memory and then adjust the frame start address so that it > aligns correctly. > > For example, suppose malloc returns 16 bytes aligned address 16, how do we > make it 64 bytes aligned? align 16 up to an address that is 64 bytes aligned > which is 64, so the adjustment amount is 64-16=48 > > Another similar example, suppose malloc returns 16 bytes aligned address 32, > how do we make it 64 bytes aligned? align 32 up to an address that is 64 > bytes aligned which is 64, so the adjustment amount is 64-32=32 > > Another similar example, suppose malloc returns 16 bytes aligned address 48, > how do we make it 64 bytes aligned? align 48 up to an address that is 64 > bytes aligned which is 64, so the adjustment amount is 64-48=16 > > Another similar example, suppose malloc returns 16 bytes aligned address 64, > how do we make it 64 bytes aligned? align 64 up to an address that is 64 > bytes aligned which is 64, so the adjustment amount is 64-64=0 > > So the mamximal adjustment amount is 64-16=48. We don't know until runtime if > the malloc returned address X is (X % 64 == 0) or (X % 64 == 16) or (X % 64 > == 32) or (X % 64 == 48), so we must emit extra code to deal with all cases > (by bitwise operations). Thanks for the explanation. I think I got the problem now. And my understanding for the solution of this patch is, if the align of the original frame is 64, then we allocate (64+48) space to the new frame now. And the original frame becomes an alloca with align 64 in the new frame. So the actually frame gets the right alignment now. Do I get the problem and solution right? I am wondering if there is a simpler solution. For example, after we construct the frame, we can get the alignment requirement for the frame. Then if the alignment is bigger than 16, we could lower the `coro.begin` to aligned new instead of default new. It looks like that the implementation would be much simpler. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D100739/new/ https://reviews.llvm.org/D100739 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits