================
@@ -8,470 +8,966 @@ Debugging C++ Coroutines
 Introduction
 ============

-For performance and other architectural reasons, the C++ Coroutines feature in
-the Clang compiler is implemented in two parts of the compiler. Semantic
-analysis is performed in Clang, and Coroutine construction and optimization
-takes place in the LLVM middle-end.
+Coroutines in C++ were introduced in C++20, and the user experience for
+debugging them can still be challenging. This document shows you how to most
+efficiently debug coroutines and how to navigate existing shortcomings in
+debuggers and compilers.
+
+Coroutines are generally used either as generators or for asynchronous
+programming. In this document, we will discuss both use cases. Even if you are
+using coroutines for asynchronous programming, you should still read the
+generators section, as it will introduce foundational debugging techniques also
+applicable to the debugging of asynchronous programming.
+
+Both compilers (clang, gcc, ...) and debuggers (lldb, gdb, ...) are
+still improving their support for coroutines. As such, we recommend using the
+latest available version of your toolchain.
+
+This document focuses on clang and lldb. The screenshots show
+[lldb-dap](https://marketplace.visualstudio.com/items?itemName=llvm-vs-code-extensions.lldb-dap)
+in combination with VS Code. The same techniques can also be used in other
+IDEs.
+
+Debugging clang-compiled binaries with gdb is possible, but requires more
+scripting. This guide comes with a basic GDB script for coroutine debugging.
+
+This guide will first showcase the more polished, bleeding-edge experience, but
+will also show you how to debug coroutines with older toolchains. In general,
+the older your toolchain, the deeper you will have to dive into the
+implementation details of coroutines (such as their ABI). The further down in
+this document you go, the more low-level and technical the content becomes. If
+you are on an up-to-date toolchain, you will hopefully be able to stop reading
+earlier.
+
+Debugging generators
+====================
+
+The first major use case for coroutines in C++ is generators, i.e., functions
+which can produce values via ``co_yield``. Values are produced lazily,
+on-demand. For that purpose, every time a new value is requested the coroutine
+gets resumed. As soon as it reaches a ``co_yield`` and thereby returns the
+requested value, the coroutine is suspended again.
+
+This logic is encapsulated in a ``generator`` type similar to this one:

-However, this design forces us to generate insufficient debugging information.
-Typically, the compiler generates debug information in the Clang frontend, as
-debug information is highly language specific. However, this is not possible
-for Coroutine frames because the frames are constructed in the LLVM middle-end.
-
-To mitigate this problem, the LLVM middle end attempts to generate some debug
-information, which is unfortunately incomplete, since much of the language
-specific information is missing in the middle end.
+.. code-block:: c++

-This document describes how to use this debug information to better debug
-coroutines.
+  // generator.hpp
+  #include <coroutine>

-Terminology
-===========
+  // `generator` is a stripped down, minimal generator type.
+  template<typename T>
+  struct generator {
+    struct promise_type {
+      T current_value{};

-Due to the recent nature of C++20 Coroutines, the terminology used to describe
-the concepts of Coroutines is not settled. This section defines a common,
-understandable terminology to be used consistently throughout this document.
+      auto get_return_object() {
+        return std::coroutine_handle<promise_type>::from_promise(*this);
+      }
+      auto initial_suspend() { return std::suspend_always(); }
+      auto final_suspend() noexcept { return std::suspend_always(); }
+      auto return_void() { return std::suspend_always(); }
+      void unhandled_exception() { __builtin_unreachable(); }
+      auto yield_value(T v) {
+        current_value = v;
+        return std::suspend_always();
+      }
+    };

-coroutine type
---------------
+    generator(std::coroutine_handle<promise_type> h) : hdl(h) { hdl.resume(); }
+    ~generator() { hdl.destroy(); }

-A `coroutine function` is any function that contains any of the Coroutine
-Keywords `co_await`, `co_yield`, or `co_return`. A `coroutine type` is a
-possible return type of one of these `coroutine functions`. `Task` and
-`Generator` are commonly referred to coroutine types.
+    generator<T>& operator++() { hdl.resume(); return *this; } // resume the coroutine
+    T operator*() const { return hdl.promise().current_value; }

-coroutine
----------
+    private:
+    std::coroutine_handle<promise_type> hdl;
+  };

-By technical definition, a `coroutine` is a suspendable function. However,
-programmers typically use `coroutine` to refer to an individual instance.
-For example:
+We can then use this ``generator`` class to print the Fibonacci sequence:

 .. code-block:: c++

-  std::vector<Task> Coros; // Task is a coroutine type.
-  for (int i = 0; i < 3; i++)
-    Coros.push_back(CoroTask()); // CoroTask is a coroutine function, which
-                                 // would return a coroutine type 'Task'.
+  #include "generator.hpp"
+  #include <iostream>

-In practice, we typically say "`Coros` contains 3 coroutines" in the above
-example, though this is not strictly correct. More technically, this should
-say "`Coros` contains 3 coroutine instances" or "Coros contains 3 coroutine
-objects."
+  generator<int> fibonacci() {
+    co_yield 0;
+    int prev = 0;
+    co_yield 1;
+    int current = 1;
+    while (true) {
+      int next = current + prev;
+      co_yield next;
+      prev = current;
+      current = next;
+    }
+  }

-In this document, we follow the common practice of using `coroutine` to refer
-to an individual `coroutine instance`, since the terms `coroutine instance` and
-`coroutine object` aren't sufficiently defined in this case.
+  template<typename T>
+  void print10Elements(generator<T>& gen) {
+    for (unsigned i = 0; i < 10; ++i) {
+      std::cerr << *gen << "\n";
+      ++gen;
+    }
+  }

-coroutine frame
----------------
+  int main() {
+    std::cerr << "Fibonacci sequence - here we go\n";
+    generator<int> fib = fibonacci();
+    for (unsigned i = 0; i < 5; ++i) {
+      ++fib;
+    }
+    print10Elements(fib);
+  }

-The C++ Standard uses `coroutine state` to describe the allocated storage. In
-the compiler, we use `coroutine frame` to describe the generated data structure
-that contains the necessary information.
+To compile this code, use ``clang++ --std=c++23 generator-example.cpp -g``.

-The structure of coroutine frames
-=================================
+Breakpoints inside the generators
+---------------------------------

-The structure of coroutine frames is defined as:
+We can set breakpoints inside coroutines just as we set them in regular
+functions. For VS Code, that means clicking next to the line number in the editor.
+In the ``lldb`` CLI or in ``gdb``, you can use ``b`` to set a breakpoint.

-.. code-block:: c++
+Inspecting variables in a coroutine
+-----------------------------------

-  struct {
-    void (*__r)(); // function pointer to the `resume` function
-    void (*__d)(); // function pointer to the `destroy` function
-    promise_type;  // the corresponding `promise_type`
-    ...            // Any other needed information
-  }
+If you hit a breakpoint inside the ``fibonacci`` function, you should be able
+to inspect all local variables (``prev``, ``current``, ``next``) just like in
+a regular function.

-In the debugger, the function's name is obtainable from the address of the
-function. And the name of `resume` function is equal to the name of the
-coroutine function. So the name of the coroutine is obtainable once the
-address of the coroutine is known.
+.. image:: ./coro-generator-variables.png

-Print promise_type
-==================
+Note the two additional variables ``__promise`` and ``__coro_frame``. Those
+show the internal state of the coroutine. They are not relevant for our
+generator example, but will be relevant for asynchronous programming described
+in the next section.

-Every coroutine has a `promise_type`, which defines the behavior
-for the corresponding coroutine. In other words, if two coroutines have the
-same `promise_type`, they should behave in the same way.
-To print a `promise_type` in a debugger when stopped at a breakpoint inside a
-coroutine, printing the `promise_type` can be done by:
+Stepping out of a coroutine
+---------------------------

-.. parsed-literal::
+When single-stepping, you will notice that the debugger will leave the
+``fibonacci`` function as soon as you hit a ``co_yield`` statement. You might
+find yourself inside some standard library code. After stepping out of the
+library code, you will be back in the ``main`` function.

-  print __promise
+Stepping into a coroutine
+-------------------------

-It is also possible to print the `promise_type` of a coroutine from the address
-of the coroutine frame. For example, if the address of a coroutine frame is
-0x416eb0, and the type of the `promise_type` is `task::promise_type`, printing
-the `promise_type` can be done by:
+If you stop at ``++fib`` and try to step into the generator, you will first
+find yourself inside ``operator++``. Stepping into the ``hdl.resume()`` call will
+not work by default.

-.. parsed-literal::
+This is because lldb does not step into functions from the standard library by
+default. To make this work, you first need to run ``settings set
+target.process.thread.step-avoid-regexp ""``. You can do so from the "Debug
+Console" towards the bottom of the screen. With that setting change, you can
+step through ``coroutine_handle::resume`` and into your generator.

----------------
ChuanqiXu9 wrote:

Agreed.

https://github.com/llvm/llvm-project/pull/142651
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits