On 9/6/23 21:01, Stefan Hajnoczi wrote:
It is not safe to call drain_call_rcu() from qmp_device_add() because
some call stacks are not prepared for drain_call_rcu() to drop the Big
QEMU Lock (BQL).
For example, device emulation code is protected by the BQL but when it
calls aio_poll() -> ... -> qmp_device_add() -> drain_call_rcu() then the
BQL is dropped. See https://bugzilla.redhat.com/show_bug.cgi?id=2215192 for a
concrete bug of this type.
Another limitation of drain_call_rcu() is that it cannot be invoked within an
RCU read-side critical section since the reclamation phase cannot complete
until the end of the critical section. Unfortunately, call stacks have been
seen where this happens (see
https://bugzilla.redhat.com/show_bug.cgi?id=2214985).
I think the root cause here is that do_qmp_dispatch_bh is called on the
wrong context, namely qemu_get_aio_context() instead of
iohandler_get_aio_context(). This is what causes it to move to the vCPU
thread.
Auditing all subsystems that use iohandler_get_aio_context(), for
example via qemu_set_fd_handler(), together with bottom halves, would be
a bit daunting.
I don't have any objection to this patch series actually, but I would
like to see if using the right AioContext also fixes the bug---and then
treat these changes as more of a cleanup. Coroutines are pretty
pervasive in QEMU and are not going away which, as you say in the
updated docs, makes drain_call_rcu_co() preferrable to drain_call_rcu().
Paolo
This patch series introduces drain_call_rcu_co(), which does the same thing as
drain_call_rcu() but asynchronously. By yielding back to the event loop we can
wait until the caller drops the BQL and leaves its RCU read-side critical
section.
Patch 1 changes HMP so that coroutine monitor commands yield back to the event
loop instead of running inside a nested event loop.
Patch 2 introduces the new drain_call_rcu_co() API.
Patch 3 converts qmp_device_add() into a coroutine monitor command and uses
drain_call_rcu_co().
I'm sending this as an RFC because I don't have confirmation yet that the bugs
mentioned above are fixed by this patch series.
Stefan Hajnoczi (3):
hmp: avoid the nested event loop in handle_hmp_command()
rcu: add drain_call_rcu_co() API
qmp: make qmp_device_add() a coroutine
MAINTAINERS | 2 ++
docs/devel/rcu.txt | 21 ++++++++++++++++
qapi/qdev.json | 1 +
include/monitor/qdev.h | 3 ++-
include/qemu/rcu.h | 1 +
util/rcu-internal.h | 8 ++++++
monitor/hmp.c | 28 +++++++++++----------
monitor/qmp-cmds.c | 2 +-
softmmu/qdev-monitor.c | 34 +++++++++++++++++++++++---
util/rcu-co.c | 55 ++++++++++++++++++++++++++++++++++++++++++
util/rcu.c | 3 ++-
hmp-commands.hx | 1 +
util/meson.build | 2 +-
13 files changed, 140 insertions(+), 21 deletions(-)
create mode 100644 util/rcu-internal.h
create mode 100644 util/rcu-co.c