On Mon, Mar 25, 2024 at 05:18:50PM +0800, zhuyangyang via wrote: > If g_main_loop_run()/aio_poll() is called in the coroutine context, > the pending coroutine may be woken up repeatedly, and the co_queue_wakeup > may be disordered. > > When the poll() syscall exited in g_main_loop_run()/aio_poll(), it means > some listened events is completed. Therefore, the completion callback > function is dispatched. > > If this callback function needs to invoke aio_co_enter(), it will only > wake up the coroutine (because we are already in coroutine context), > which may cause that the data on this listening event_fd/socket_fd > is not read/cleared. When the next poll () exits, it will be woken up again > and inserted into the wakeup queue again. > > For example, if TLS is enabled in NBD, the server will call g_main_loop_run() > in the coroutine, and repeatedly wake up the io_read event on a socket. > The call stack is as follows: > > aio_co_enter() > aio_co_wake() > qio_channel_restart_read() > aio_dispatch_handler() > aio_dispatch_handlers() > aio_dispatch() > aio_ctx_dispatch() > g_main_context_dispatch() > g_main_loop_run() > nbd_negotiate_handle_starttls() > nbd_negotiate_options() > nbd_negotiate() > nbd_co_client_start() > coroutine_trampoline()
zhuyangyang, do you have a reliable reproduction setup for how you were able to trigger this? Obviously, it only happens when TLS is enabled (we aren't creating a g_main_loop_run for any other NBD command), and only when the server is first starting to serve a client; is this a case where you were hammering a long-running qemu process running an NBD server with multiple clients trying to reconnect to the server all near the same time? If we can come up with a reliable formula for reproducing the corrupted coroutine list, it would make a great iotest addition alongside the existing qemu-iotests 233 for ensuring that NBD TLS traffic is handled correctly in both server and client. > > Signed-off-by: zhuyangyang <zhuyangyan...@huawei.com> Side note: this appears to be your first qemu contribution (based on 'git shortlog --author zhuyangyang'). While I am not in a position to presume how you would like your name Anglicized, I will point out that the prevailing style is to separate given name from family name (just because your username at work has no spaces does not mean that your S-o-b has to follow suit). It is also permissible to list your name in native characters alongside or in place of the Anglicized version; for example, 'git log --author="Stefano Dong"' shows this technique. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org