Am 06.02.2024 um 20:06 hat Stefan Hajnoczi geschrieben:
> The aio_co_reschedule_self() API is designed to avoid the race
> condition between scheduling the coroutine in another AioContext and
> yielding.
> 
> The QMP dispatch code uses the open-coded version that appears
> susceptible to the race condition at first glance:
> 
>   aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self());
>   qemu_coroutine_yield();
> 
> The code is actually safe because the iohandler and qemu_aio_context
> AioContext run under the Big QEMU Lock. Nevertheless, set a good example
> and use aio_co_reschedule_self() so it's obvious that there is no race.
> 
> Suggested-by: Hanna Reitz <hre...@redhat.com>
> Reviewed-by: Manos Pitsidianakis <manos.pitsidiana...@linaro.org>
> Reviewed-by: Hanna Czenczek <hre...@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
> ---
>  qapi/qmp-dispatch.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
> index 176b549473..f3488afeef 100644
> --- a/qapi/qmp-dispatch.c
> +++ b/qapi/qmp-dispatch.c
> @@ -212,8 +212,7 @@ QDict *coroutine_mixed_fn qmp_dispatch(const 
> QmpCommandList *cmds, QObject *requ
>               * executing the command handler so that it can make progress if 
> it
>               * involves an AIO_WAIT_WHILE().
>               */
> -            aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self());
> -            qemu_coroutine_yield();
> +            aio_co_reschedule_self(qemu_get_aio_context());

Turns out that this one actually causes a regression. [1] This code is
ŕun in iohandler_ctx, aio_co_reschedule_self() looks at the new context
and compares it with qemu_get_current_aio_context() - and because both
are qemu_aio_context, it decides that it has nothing to do. So the
command handler coroutine actually still runs in iohandler_ctx now,
which is not what we want.

We could just revert this patch because it was only meant as a cleanup
without a semantic difference.

Or aio_co_reschedule_self() could look at qemu_coroutine_self()->ctx
instead of using qemu_get_current_aio_context(). That would be a little
more indirect, though, and I'm not sure if co->ctx is always up to date.

Any opinions on what is the best way to fix this?

Kevin

[1] https://issues.redhat.com/browse/RHEL-34618


Reply via email to