On 5/22/22 17:06, Stefan Hajnoczi wrote:
However, I hit on a problem that I think Emanuele and Paolo have already
pointed out: draining is GS & IO. This might have worked under the 1 IOThread
model but it does not make sense for multi-queue. It is possible to submit I/O
requests in drained sections. How can multiple threads be in drained sections
simultaneously and possibly submit further I/O requests in their drained
sections? Those sections wouldn't be "drained" in any useful sense of the word.
Yeah, that works only if the drained sections are well-behaved.
"External" sources of I/O are fine; they are disabled using is_external,
and don't drain themselves I think.
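(For context, this is roughly the shape of the is_external mechanism that
drain relies on today; the wrapper function below is only an illustration,
not code from the tree:)

    #include "block/aio.h"
    #include "block/block.h"

    /* Handlers registered with is_external=true are simply not dispatched
     * while the AioContext's "disable external" counter is raised, which is
     * what bdrv_drained_begin() arranges for the BDS's context. */
    static void quiesce_external_sources(BlockDriverState *bs)
    {
        AioContext *ctx = bdrv_get_aio_context(bs);

        aio_disable_external(ctx);   /* e.g. ioeventfd handlers stop firing */
        /* ... work that must not race with new external I/O ... */
        aio_enable_external(ctx);    /* external sources resume */
    }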
Mirror is the only I/O user of drain, and it's fine because it never
submits I/O to the drained BDS.
Drained sections in the main thread can be special cased to allow I/O
(wrlock in this series would also allow I/O).
So I think that the "cooperation from all relevant places" that Kevin
mentioned is already there, except for coroutine commands in the
monitor. Those are a bad idea in my opinion and I'd rather revert
commit eb94b81a94 ("block: Convert 'block_resize' to coroutine") until
we have a clearer idea of how to handle them.
I agree that it's basically impossible to review the change. On the
other hand, there's already a substantial amount of faith involved in
the correctness of the current code.
In particular the AioContext lock does absolutely nothing to protect
coroutines in the main thread against graph changes---both from the
monitor (including BHs as in "block: Fix BB.root changing across
bdrv_next()") and from BDS coroutines. The former are unprotected; the
latter are protected by drain only: using drain to protect against graph
writes would be a matter of extending *existing* faith to the
multi-iothread case.
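(To make that "existing faith" concrete, this is the pattern the main thread
already relies on, sketched with bdrv_replace_node() as the example graph
write; the wrapper name is made up:)

    #include "block/block.h"
    #include "qapi/error.h"

    /* Quiesce I/O on the subtree, perform the graph write, resume.  Extending
     * this to multiple IOThreads asks for the same trust we already place in
     * drain in the single-IOThread case. */
    static void replace_node_drained(BlockDriverState *from,
                                     BlockDriverState *to, Error **errp)
    {
        bdrv_drained_begin(from);           /* no new I/O, in-flight I/O done */
        bdrv_replace_node(from, to, errp);  /* graph write while quiesced */
        bdrv_drained_end(from);
    }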
Once the deadlock is broken, we can proceed to remove the AioContext
lock and then introduce actual coroutine-based locking.
Possible steps for AioContext removal
-------------------------------------
I also wanted to share my assumptions about multi-queue and AioContext removal.
Please let me know if anything seems wrong or questionable:
- IO code can execute in any thread that has an AioContext.
- Multiple threads may execute IO code at the same time.
- GS code only executes under the BQL.
For AioContext removal this means:
- bdrv_get_aio_context() becomes mostly meaningless since there is no need for
a special "home" AioContext.
Correct. bdrv_set_aio_context() remains useful as a way to set a home
AioContext for sockets.
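(Concretely, a socket-based driver would keep following the BDS around via
its .bdrv_attach_aio_context callback; the driver and struct names below are
made up:)

    #include "io/channel.h"
    #include "block/block_int.h"

    typedef struct HypotheticalNetDriverState {
        QIOChannel *ioc;   /* the driver's socket */
    } HypotheticalNetDriverState;

    /* Called when the BDS's home AioContext changes: re-register the socket's
     * fd handlers with the new context. */
    static void hypothetical_attach_aio_context(BlockDriverState *bs,
                                                AioContext *new_context)
    {
        HypotheticalNetDriverState *s = bs->opaque;

        qio_channel_attach_aio_context(s->ioc, new_context);
    }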
- bdrv_coroutine_enter() becomes mostly meaningless because there is no need to
run a coroutine in the BDS's AioContext.
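(The difference boils down to something like the following; both helpers are
just illustrations:)

    #include "block/aio.h"
    #include "block/block.h"
    #include "qemu/coroutine.h"

    /* Today: bounce the coroutine to the BDS's home AioContext. */
    static void submit_today(BlockDriverState *bs, Coroutine *co)
    {
        bdrv_coroutine_enter(bs, co);
    }

    /* With multi-queue: any thread that has an AioContext just enters the
     * coroutine locally. */
    static void submit_multiqueue(Coroutine *co)
    {
        aio_co_enter(qemu_get_current_aio_context(), co);
    }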
- aio_disable_external(bdrv_get_aio_context(bs)) no longer works because many
threads/AioContexts may submit new I/O requests. BlockDevOps.drained_begin()
may be used instead (e.g. to temporarily disable ioeventfds on a multi-queue
virtio-blk device).
This is a change that can be done independent of this work.
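(A sketch of what the virtio-blk side could look like; the two data-plane
helpers are hypothetical placeholders for "stop/restart ioeventfd handlers
on every queue, in every AioContext":)

    #include "hw/virtio/virtio-blk.h"
    #include "sysemu/block-backend.h"

    static void virtio_blk_drained_begin_cb(void *opaque)
    {
        VirtIOBlock *s = opaque;

        virtio_blk_stop_ioeventfds(s);   /* hypothetical helper */
    }

    static void virtio_blk_drained_end_cb(void *opaque)
    {
        VirtIOBlock *s = opaque;

        virtio_blk_start_ioeventfds(s);  /* hypothetical helper */
    }

    static const BlockDevOps virtio_blk_dev_ops = {
        .drained_begin = virtio_blk_drained_begin_cb,
        .drained_end   = virtio_blk_drained_end_cb,
    };

    /* Registered once at realize time:
     *     blk_set_dev_ops(s->blk, &virtio_blk_dev_ops, s);
     */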
- AIO_WAIT_WHILE() simplifies to
    while ((cond)) {
        aio_poll(qemu_get_current_aio_context(), true);
        ...
    }
and the distinction between home AioContext and non-home context is
eliminated. AioContext unlocking is dropped.
(I'll reply on this from elsewhere in the thread).
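(As a rough sketch, the simplified macro could look like this; the waiter
bookkeeping that wakes up other threads is omitted for brevity:)

    #define AIO_WAIT_WHILE_SIMPLIFIED(cond) ({                    \
        while ((cond)) {                                           \
            aio_poll(qemu_get_current_aio_context(), true);        \
        }                                                          \
    })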
Does this make sense? I haven't seen these things in recent patch series.
I agree, and yeah, all of these are blocked on protecting graph modifications.
In parallel to the block layer discussions, it's possible to work on
introducing a request queue lock in virtio-blk and virtio-scsi. That's
the only thing that relies on the AioContext lock outside the block layer.
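(For illustration, the direction would be giving the device a mutex for the
state that the AioContext lock currently guards; the req_lock field is made
up, while s->rq is the existing restart list:)

    #include "qemu/thread.h"
    #include "hw/virtio/virtio-blk.h"

    /* Defer a failed request so it can be restarted later.  Today callers
     * rely on holding the AioContext lock while touching s->rq; a dedicated
     * mutex removes that dependency. */
    static void virtio_blk_defer_request(VirtIOBlock *s, VirtIOBlockReq *req)
    {
        qemu_mutex_lock(&s->req_lock);   /* hypothetical QemuMutex member */
        req->next = s->rq;
        s->rq = req;
        qemu_mutex_unlock(&s->req_lock);
    }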
Paolo