On 5/22/22 17:06, Stefan Hajnoczi wrote:
> However, I hit on a problem that I think Emanuele and Paolo have already
> pointed out: draining is GS & IO. This might have worked under the 1 IOThread
> model but it does not make sense for multi-queue. It is possible to submit I/O
> requests in drained sections. How can multiple threads be in drained sections
> simultaneously and possibly submit further I/O requests in their drained
> sections? Those sections wouldn't be "drained" in any useful sense of the word.

Yeah, that works only if the drained sections are well-behaved.

"External" sources of I/O are fine; they are disabled using is_external, and I don't think they drain themselves.

Mirror is the only I/O user of drain, and it's fine because it never submits I/O to the drained BDS.

Drained sections in the main thread can be special cased to allow I/O (wrlock in this series would also allow I/O).
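To make "well-behaved" concrete, here is a toy model (not QEMU code; every name below is invented for illustration): a drained section bumps a quiesce counter that rejects new external I/O, and it may only begin once in-flight requests have reached zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of drained-section semantics. In real QEMU,
 * bdrv_drained_begin() polls the event loop until in-flight
 * requests complete; here completion is assumed up front. */
typedef struct {
    int in_flight;      /* requests submitted but not yet completed */
    int quiesce_count;  /* nesting level of active drained sections */
} ToyBDS;

static bool toy_io_allowed(ToyBDS *bs)
{
    /* External I/O is rejected while any drained section is active. */
    return bs->quiesce_count == 0;
}

static void toy_drained_begin(ToyBDS *bs)
{
    bs->quiesce_count++;
    /* A well-behaved drained section starts with nothing in flight. */
    assert(bs->in_flight == 0);
}

static void toy_drained_end(ToyBDS *bs)
{
    assert(bs->quiesce_count > 0);
    bs->quiesce_count--;
}
```

The problematic case from the quoted text is exactly a caller that submits new I/O between toy_drained_begin() and toy_drained_end(), which this model would flag via toy_io_allowed().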

So I think that the "cooperation from all relevant places" that Kevin mentioned is already there, except for coroutine commands in the monitor. Those are a bad idea in my opinion and I'd rather revert commit eb94b81a94 ("block: Convert 'block_resize' to coroutine") until we have a clearer idea of how to handle them.

I agree that it's basically impossible to review the change. On the other hand, there's already a substantial amount of faith involved in the correctness of the current code.

In particular, the AioContext lock does absolutely nothing to protect coroutines in the main thread against graph changes, whether from the monitor (including BHs, as in "block: Fix BB.root changing across bdrv_next()") or from BDS coroutines. The former are unprotected; the latter are protected only by drain. Using drain to protect against graph writes would be a matter of extending *existing* faith to the multi-iothread case.

Once the deadlock is broken, we can proceed to remove the AioContext lock and then introduce actual coroutine-based locking.

> Possible steps for AioContext removal
> -------------------------------------
> I also wanted to share my assumptions about multi-queue and AioContext removal.
> Please let me know if anything seems wrong or questionable:
>
> - IO code can execute in any thread that has an AioContext.
> - Multiple threads may execute IO code at the same time.
> - GS code only executes under the BQL.
>
> For AioContext removal this means:
>
> - bdrv_get_aio_context() becomes mostly meaningless since there is no need for
>    a special "home" AioContext.

Correct. bdrv_set_aio_context() remains useful as a way to set a home AioContext for sockets.

> - bdrv_coroutine_enter() becomes mostly meaningless because there is no need to
>    run a coroutine in the BDS's AioContext.
> - aio_disable_external(bdrv_get_aio_context(bs)) no longer works because many
>    threads/AioContexts may submit new I/O requests. BlockDevOps.drained_begin()
>    may be used instead (e.g. to temporarily disable ioeventfds on a multi-queue
>    virtio-blk device).

This is a change that can be done independent of this work.
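For illustration, a toy model of the callback-based alternative (names invented; this is not QEMU's actual BlockDevOps or virtio-blk code): the device registers drained_begin/drained_end hooks that stop and restart its own queues, rather than relying on a per-AioContext aio_disable_external().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: per-device callbacks that a drained section invokes
 * so the device itself stops submitting new I/O. */
typedef struct ToyDevOps {
    void (*drained_begin)(void *opaque);
    void (*drained_end)(void *opaque);
} ToyDevOps;

typedef struct {
    bool queues_active;  /* stands in for virtio-blk's ioeventfds */
} ToyVirtioBlk;

static void vblk_drained_begin(void *opaque)
{
    /* Stop processing the virtqueues: no new requests reach the BDS. */
    ((ToyVirtioBlk *)opaque)->queues_active = false;
}

static void vblk_drained_end(void *opaque)
{
    /* Resume queue processing once the drained section ends. */
    ((ToyVirtioBlk *)opaque)->queues_active = true;
}

static const ToyDevOps vblk_ops = {
    .drained_begin = vblk_drained_begin,
    .drained_end   = vblk_drained_end,
};
```

The point of the callback shape is that it works no matter how many threads service the device's queues, since the device quiesces itself instead of one AioContext being switched off.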

> - AIO_WAIT_WHILE() simplifies to
>
>      while ((cond)) {
>          aio_poll(qemu_get_current_aio_context(), true);
>          ...
>      }
>
>    and the distinction between home AioContext and non-home context is
>    eliminated. AioContext unlocking is dropped.

(I'll reply on this from elsewhere in the thread).

> Does this make sense? I haven't seen these things in recent patch series.

I agree, and yeah all these are blocked on protecting graph modifications.

In parallel to the block layer discussions, it's possible to work on introducing a request queue lock in virtio-blk and virtio-scsi. That's the only thing that relies on the AioContext lock outside the block layer.
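To illustrate the idea (a toy sketch, not virtio-blk or virtio-scsi code; all names are invented): the request queue gets its own mutex, so any iothread can enqueue and the device no longer needs the AioContext lock for this.

```c
#include <assert.h>
#include <pthread.h>

/* Toy model: a request queue guarded by its own lock instead of the
 * per-AioContext lock, so several iothreads can enqueue safely. */
enum { QUEUE_MAX = 128 };

typedef struct {
    pthread_mutex_t lock;
    int reqs[QUEUE_MAX];  /* stands in for queued request pointers */
    int len;
} ToyReqQueue;

static void queue_init(ToyReqQueue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->len = 0;
}

static void queue_push(ToyReqQueue *q, int req)
{
    /* The queue's own lock replaces aio_context_acquire() here. */
    pthread_mutex_lock(&q->lock);
    assert(q->len < QUEUE_MAX);
    q->reqs[q->len++] = req;
    pthread_mutex_unlock(&q->lock);
}

static int queue_len(ToyReqQueue *q)
{
    pthread_mutex_lock(&q->lock);
    int n = q->len;
    pthread_mutex_unlock(&q->lock);
    return n;
}
```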

Paolo

