On Thu, Sep 07, 2023 at 01:28:55PM +0200, Paolo Bonzini wrote:
> On 9/6/23 21:01, Stefan Hajnoczi wrote:
> > It is not safe to call drain_call_rcu() from qmp_device_add() because
> > some call stacks are not prepared for drain_call_rcu() to drop the Big
> > QEMU Lock (BQL).
> > 
> > For example, device emulation code is protected by the BQL but when it
> > calls aio_poll() -> ... -> qmp_device_add() -> drain_call_rcu() then the
> > BQL is dropped. See https://bugzilla.redhat.com/show_bug.cgi?id=2215192 for a
> > concrete bug of this type.
> > 
> > Another limitation of drain_call_rcu() is that it cannot be invoked within an
> > RCU read-side critical section since the reclamation phase cannot complete
> > until the end of the critical section. Unfortunately, call stacks have been
> > seen where this happens (see
> > https://bugzilla.redhat.com/show_bug.cgi?id=2214985).
> 
> I think the root cause here is that do_qmp_dispatch_bh is called on the
> wrong context, namely qemu_get_aio_context() instead of
> iohandler_get_aio_context().  This is what causes it to move to the vCPU
> thread.
> 
> Auditing all subsystems that use iohandler_get_aio_context(), for example
> via qemu_set_fd_handler(), together with bottom halves, would be a bit
> daunting.
> 
> I don't have any objection to this patch series actually, but I would like
> to see if using the right AioContext also fixes the bug---and then treat
> these changes as more of a cleanup.  Coroutines are pretty pervasive in QEMU
> and are not going away which, as you say in the updated docs, makes
> drain_call_rcu_co() preferable to drain_call_rcu().

While I agree that the issue would not happen if monitor commands only
ran in the iohandler AioContext, I don't think we can change that.

When Kevin implemented coroutine commands in commit 9ce44e2ce267 ("qmp:
Move dispatcher to a coroutine"), he used qemu_get_aio_context()
deliberately so that AIO_WAIT_WHILE() can make progress.

I'm not clear on the exact scenario though, because coroutines shouldn't
call AIO_WAIT_WHILE().

Kevin?

There is only one coroutine monitor command that calls the QEMU block
layer: qmp_block_resize(). If we're going to change how the AioContext
works then now is the time to do it before there are more commands that
need to be audited/refactored.
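
For readers following along, here is a toy model of the idea behind the series: instead of draining while the caller still holds the lock, the work is queued and only runs once control is back in the top-level event loop. This is only a sketch under stated assumptions. Every toy_* name is hypothetical and none of it is QEMU code; a plain flag stands in for the BQL and a fixed-size array stands in for bottom halves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are QEMU APIs. */
#define TOY_MAX_PENDING 8

typedef void (*toy_cb)(void *opaque);

struct toy_loop {
    bool lock_held;                   /* stands in for the BQL */
    toy_cb pending[TOY_MAX_PENDING];  /* deferred callbacks, FIFO order */
    void *opaque[TOY_MAX_PENDING];
    size_t n_pending;
};

struct toy_drain_state {
    struct toy_loop *loop;
    bool ran_without_lock;            /* set by toy_drain() when it runs */
};

/* Queue work instead of running it while the caller holds the lock. */
static void toy_defer(struct toy_loop *loop, toy_cb cb, void *opaque)
{
    assert(loop->n_pending < TOY_MAX_PENDING);
    loop->pending[loop->n_pending] = cb;
    loop->opaque[loop->n_pending] = opaque;
    loop->n_pending++;
}

/*
 * One iteration of the top-level loop. It asserts that no caller still
 * holds the lock, runs every queued callback and returns how many ran.
 */
static size_t toy_loop_run(struct toy_loop *loop)
{
    size_t ran = 0;

    assert(!loop->lock_held);
    for (size_t i = 0; i < loop->n_pending; i++) {
        loop->pending[i](loop->opaque[i]);
        ran++;
    }
    loop->n_pending = 0;
    return ran;
}

/* Stand-in for the drain: records whether the lock was free when it ran. */
static void toy_drain(void *opaque)
{
    struct toy_drain_state *s = opaque;

    s->ran_without_lock = !s->loop->lock_held;
}

/*
 * Stand-in for device emulation code that holds the lock and, somewhere
 * down its call stack, requests a drain. The request is only queued, so
 * the lock is never dropped behind the caller's back.
 */
static void toy_device_emulation(struct toy_loop *loop,
                                 struct toy_drain_state *s)
{
    loop->lock_held = true;
    toy_defer(loop, toy_drain, s);
    loop->lock_held = false;
}
```

Calling toy_device_emulation() leaves one entry queued and does not run the drain; a later toy_loop_run() executes it with the lock free. The real series has to do this with coroutines and the actual RCU machinery, which this sketch does not attempt to model.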

Stefan

> 
> Paolo
> 
> 
> > This patch series introduces drain_call_rcu_co(), which does the same thing as
> > drain_call_rcu() but asynchronously. By yielding back to the event loop we can
> > wait until the caller drops the BQL and leaves its RCU read-side critical
> > section.
> > 
> > Patch 1 changes HMP so that coroutine monitor commands yield back to the event
> > loop instead of running inside a nested event loop.
> > 
> > Patch 2 introduces the new drain_call_rcu_co() API.
> > 
> > Patch 3 converts qmp_device_add() into a coroutine monitor command and uses
> > drain_call_rcu_co().
> > 
> > I'm sending this as an RFC because I don't have confirmation yet that the bugs
> > mentioned above are fixed by this patch series.
> > 
> > Stefan Hajnoczi (3):
> >    hmp: avoid the nested event loop in handle_hmp_command()
> >    rcu: add drain_call_rcu_co() API
> >    qmp: make qmp_device_add() a coroutine
> > 
> >   MAINTAINERS            |  2 ++
> >   docs/devel/rcu.txt     | 21 ++++++++++++++++
> >   qapi/qdev.json         |  1 +
> >   include/monitor/qdev.h |  3 ++-
> >   include/qemu/rcu.h     |  1 +
> >   util/rcu-internal.h    |  8 ++++++
> >   monitor/hmp.c          | 28 +++++++++++----------
> >   monitor/qmp-cmds.c     |  2 +-
> >   softmmu/qdev-monitor.c | 34 +++++++++++++++++++++++---
> >   util/rcu-co.c          | 55 ++++++++++++++++++++++++++++++++++++++++++
> >   util/rcu.c             |  3 ++-
> >   hmp-commands.hx        |  1 +
> >   util/meson.build       |  2 +-
> >   13 files changed, 140 insertions(+), 21 deletions(-)
> >   create mode 100644 util/rcu-internal.h
> >   create mode 100644 util/rcu-co.c
> > 
> 
