On Thu, Sep 07, 2023 at 01:28:55PM +0200, Paolo Bonzini wrote:
> On 9/6/23 21:01, Stefan Hajnoczi wrote:
> > It is not safe to call drain_call_rcu() from qmp_device_add() because
> > some call stacks are not prepared for drain_call_rcu() to drop the Big
> > QEMU Lock (BQL).
> >
> > For example, device emulation code is protected by the BQL but when it
> > calls aio_poll() -> ... -> qmp_device_add() -> drain_call_rcu() then the
> > BQL is dropped. See https://bugzilla.redhat.com/show_bug.cgi?id=2215192
> > for a concrete bug of this type.
> >
> > Another limitation of drain_call_rcu() is that it cannot be invoked
> > within an RCU read-side critical section since the reclamation phase
> > cannot complete until the end of the critical section. Unfortunately,
> > call stacks have been seen where this happens (see
> > https://bugzilla.redhat.com/show_bug.cgi?id=2214985).
>
> I think the root cause here is that do_qmp_dispatch_bh is called on the
> wrong context, namely qemu_get_aio_context() instead of
> iohandler_get_aio_context(). This is what causes it to move to the vCPU
> thread.
>
> Auditing all subsystems that use iohandler_get_aio_context(), for example
> via qemu_set_fd_handler(), together with bottom halves, would be a bit
> daunting.
>
> I don't have any objection to this patch series actually, but I would like
> to see if using the right AioContext also fixes the bug---and then treat
> these changes as more of a cleanup. Coroutines are pretty pervasive in QEMU
> and are not going away which, as you say in the updated docs, makes
> drain_call_rcu_co() preferable to drain_call_rcu().

While I agree that the issue would not happen if monitor commands only
ran in the iohandler AioContext, I don't think we can change that.

When Kevin implemented coroutine commands in commit 9ce44e2ce267 ("qmp:
Move dispatcher to a coroutine"), he used qemu_get_aio_context()
deliberately so that AIO_WAIT_WHILE() can make progress.

I'm not clear on the exact scenario though, because coroutines shouldn't
call AIO_WAIT_WHILE().

Kevin?

There is only one coroutine monitor command that calls the QEMU block
layer: qmp_block_resize(). If we're going to change how the AioContext
works then now is the time to do it before there are more commands that
need to be audited/refactored.

Stefan

>
> Paolo
>
> >
> > This patch series introduces drain_call_rcu_co(), which does the same
> > thing as drain_call_rcu() but asynchronously. By yielding back to the
> > event loop we can wait until the caller drops the BQL and leaves its
> > RCU read-side critical section.
> >
> > Patch 1 changes HMP so that coroutine monitor commands yield back to
> > the event loop instead of running inside a nested event loop.
> >
> > Patch 2 introduces the new drain_call_rcu_co() API.
> >
> > Patch 3 converts qmp_device_add() into a coroutine monitor command and
> > uses drain_call_rcu_co().
> >
> > I'm sending this as an RFC because I don't have confirmation yet that
> > the bugs mentioned above are fixed by this patch series.
> >
> > Stefan Hajnoczi (3):
> >   hmp: avoid the nested event loop in handle_hmp_command()
> >   rcu: add drain_call_rcu_co() API
> >   qmp: make qmp_device_add() a coroutine
> >
> >  MAINTAINERS            |  2 ++
> >  docs/devel/rcu.txt     | 21 ++++++++++++++++
> >  qapi/qdev.json         |  1 +
> >  include/monitor/qdev.h |  3 ++-
> >  include/qemu/rcu.h     |  1 +
> >  util/rcu-internal.h    |  8 ++++++
> >  monitor/hmp.c          | 28 +++++++++++---------
> >  monitor/qmp-cmds.c     |  2 +-
> >  softmmu/qdev-monitor.c | 34 +++++++++++++++++++++++---
> >  util/rcu-co.c          | 55 ++++++++++++++++++++++++++++++++++++++++++
> >  util/rcu.c             |  3 ++-
> >  hmp-commands.hx        |  1 +
> >  util/meson.build       |  2 +-
> >  13 files changed, 140 insertions(+), 21 deletions(-)
> >  create mode 100644 util/rcu-internal.h
> >  create mode 100644 util/rcu-co.c
> >
>
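
For readers skimming the thread, the idea in the quoted cover letter
("yield back to the event loop, then drain") can be sketched with a toy,
single-threaded model. This is not QEMU code: every toy_* name below is
hypothetical, and it models neither coroutines, threads, nor real RCU grace
periods. It only shows the ordering: the caller queues the drain instead of
running it, and the drain runs once the caller has unwound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. None of these names are QEMU APIs. */
typedef void (*toy_cb)(void *opaque);

struct toy_work {
    toy_cb fn;
    void *opaque;
};

#define TOY_MAX_PENDING 8

static struct toy_work toy_pending[TOY_MAX_PENDING];
static size_t toy_npending;
static int toy_read_depth;          /* modelled read-side nesting depth */
static bool toy_caller_needs_lock;  /* a caller relies on the lock being held */

void toy_read_lock(void)
{
    toy_read_depth++;
}

void toy_read_unlock(void)
{
    assert(toy_read_depth > 0);
    toy_read_depth--;
}

/* A blocking drain is only safe outside a read-side section and when no
 * caller up the stack depends on the lock not being dropped. */
bool toy_drain_blocking_is_safe(void)
{
    return toy_read_depth == 0 && !toy_caller_needs_lock;
}

/* Queue work for the next event loop iteration instead of running it now. */
bool toy_defer(toy_cb fn, void *opaque)
{
    if (toy_npending == TOY_MAX_PENDING) {
        return false;
    }
    toy_pending[toy_npending].fn = fn;
    toy_pending[toy_npending].opaque = opaque;
    toy_npending++;
    return true;
}

/* Run one event loop iteration; returns how many callbacks ran. */
size_t toy_event_loop_iteration(void)
{
    struct toy_work batch[TOY_MAX_PENDING];
    size_t n = toy_npending;
    size_t i;

    /* Copy the batch first so callbacks may safely queue more work. */
    for (i = 0; i < n; i++) {
        batch[i] = toy_pending[i];
    }
    toy_npending = 0;

    for (i = 0; i < n; i++) {
        batch[i].fn(batch[i].opaque);
    }
    return n;
}

/* Stand-in for the drain itself: counts how often it ran. */
void toy_drain_cb(void *opaque)
{
    int *drained = opaque;

    /* The original caller has unwound, so draining is safe here. */
    assert(toy_drain_blocking_is_safe());
    (*drained)++;
}

/* Models device emulation code that needs a drain deep in its call stack. */
void toy_device_code(int *drained)
{
    toy_caller_needs_lock = true;
    toy_read_lock();

    /* Draining synchronously here is the unsafe case from the thread. */
    assert(!toy_drain_blocking_is_safe());
    toy_defer(toy_drain_cb, drained);

    toy_read_unlock();
    toy_caller_needs_lock = false;
}
```

Calling toy_device_code(&n) leaves n untouched; the drain only runs on the
next toy_event_loop_iteration(), by which point the caller has left its
read-side section. The real series achieves the deferral by yielding from a
coroutine rather than by queuing a plain callback.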