On 5/18/22 14:28, Emanuele Giuseppe Esposito wrote:
For example, all callers of bdrv_open() always take the AioContext lock.
Often it is taken very high in the call stack, but it's always taken.
I think it's actually not a problem of who takes the AioContext lock or
where; the requirements are contradictory:
* IO_OR_GS_CODE() functions, when called from coroutine context, expect
to be called with the AioContext lock taken (example:
bdrv_co_yield_to_drain)
* to call these functions with the lock taken, the code has to run in
the BDS's home iothread. Attempts to do otherwise result in deadlocks
(the main loop's AIO_WAIT_WHILEs expect progress from the iothread,
which cannot happen without releasing the AioContext lock)
* running the code in the BDS's home iothread is not possible for
GLOBAL_STATE_CODE() functions (unless the BDS home iothread is the main
thread, but that cannot be guaranteed in general)
We might suppose that many callbacks are called under drain and in
GLOBAL_STATE, which should be enough, but from our experimentation in
the previous series we saw that currently not everything is under drain,
leaving some operations unprotected (remember that assert_graph_writable
had to be temporarily disabled, since drain coverage for
bdrv_replace_child_noperm was not 100%?).
Therefore we need to add more drains. But isn't drain what we decided to
drop at the beginning? Why isn't drain good?
To sum up the patch ordering deadlock that we have right now:
* in some cases, graph manipulations are protected by the AioContext lock
* eliminating the AioContext lock is needed to move callbacks to
coroutine contexts (see above for the deadlock scenario)
* moving callbacks to coroutine context is needed by the graph rwlock
implementation
On one hand, we cannot protect the graph across manipulations with a
graph rwlock without removing the AioContext lock; on the other hand,
the AioContext lock is what _right now_ protects the graph.
So I'd rather go back to Emanuele's draining approach. It may not be
beautiful, but it allows progress. Once that is in place, we can remove
the AioContext lock (which mostly protects virtio-blk/virtio-scsi code
right now) and reevaluate our next steps.
Paolo