On 5/24/22 12:36, Kevin Wolf wrote:
* IO_OR_GS_CODE() functions, when called from coroutine context, expect to
be called with the AioContext lock taken (example: bdrv_co_yield_to_drain)

Am I right to say this is not inherently part of the definition of
IO_OR_GS_CODE(), but just a property that these functions have in
practice? In practice, the functions that are IO_OR_GS_CODE()
are those that call AIO_WAIT_WHILE().

Yes.
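
For concreteness, the pattern in question looks like this (bdrv_do_something() and the async operation are made up for illustration; IO_OR_GS_CODE() and AIO_WAIT_WHILE() are the real macros):

    /* Illustrative sketch only.  The function is IO_OR_GS_CODE() precisely
     * because it contains an AIO_WAIT_WHILE(), which may only run in the
     * main thread or in the BDS's home iothread.  The caller holds bs's
     * AioContext lock exactly once. */
    static void bdrv_do_something(BlockDriverState *bs)
    {
        bool done = false;

        IO_OR_GS_CODE();
        start_something_async(bs, &done);   /* hypothetical async op that
                                             * sets done on completion */
        AIO_WAIT_WHILE(bdrv_get_aio_context(bs), !done);
    }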

Using a different code path means that the restrictions from
AIO_WAIT_WHILE() don't really apply any more and these functions become
effectively IO_CODE() when called in a coroutine. (At least I'm not
aware of any other piece of code apart from AIO_WAIT_WHILE() that makes
a function IO_OR_GS_CODE().)

The second point is correct.

The first point depends on the definition of the coroutine path. For example, bdrv_co_yield_to_drain() expects to run with bdrv_get_aio_context(bs)'s lock taken. To avoid AB-BA deadlocks, an iothread must not take another iothread's AioContext lock; therefore, bdrv_co_yield_to_drain() can only run in the main thread or in bs's home iothread.
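
To illustrate the AB-BA hazard (both contexts and the interleaving are hypothetical):

        iothread 1 (home of ctx_a)            iothread 2 (home of ctx_b)
        aio_context_acquire(ctx_a)            aio_context_acquire(ctx_b)
        aio_context_acquire(ctx_b)            aio_context_acquire(ctx_a)
          -> blocks waiting for iothread 2      -> blocks waiting for iothread 1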

* to call these functions with the lock taken, the code has to run in the
BDS's home iothread.  Attempts to do otherwise result in deadlocks (the
main loop's AIO_WAIT_WHILEs expect progress from the iothread, which cannot
happen without releasing the AioContext lock)

This problem can't happen in the main thread itself: AIO_WAIT_WHILE() is
safe both in the home thread and the main thread (at least as long as
you lock only once) because it temporarily drops the lock. It has also
become the definition of IO_OR_GS_CODE(): this code has to run in the
home thread or the main thread.
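
For reference, the main-thread path of AIO_WAIT_WHILE(ctx, cond) boils down to roughly this (heavily simplified from include/block/aio-wait.h; the real macro also handles the in-home-thread case and the aio_wait_kick() bookkeeping):

    while (cond) {
        aio_context_release(ctx);               /* let the iothread run */
        aio_poll(qemu_get_aio_context(), true); /* wait for progress */
        aio_context_acquire(ctx);
    }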

It doesn't work to run bdrv_open_driver() in the home iothread, because bdrv_open_driver() can change the AioContext of a BDS (causing extreme confusion about what runs where and which AioContext locks have to be taken and released).

So, in order to have bdrv_open_driver() run in the main thread, Emanuele added a variant of generated_co_wrapper that waits on the main thread. But that didn't work either, because then AIO_WAIT_WHILE() does not release the BDS's AioContext lock.

When ->bdrv_open() calls bdrv_replace_child_noperm() and it tries to drain, bdrv_co_yield_to_drain() schedules a bottom half on the iothread and yields; the bottom half can never run, because the AioContext lock is already taken elsewhere.

        main thread                             ctx home thread
        aio_context_acquire(ctx)
        bdrv_open()
          drv->bdrv_co_open()
            bdrv_replace_child_noperm()
              bdrv_co_yield_to_drain()
                aio_context_release(ctx)
                aio_bh_schedule_oneshot()

So, bdrv_open_driver() and bdrv_close() are un-coroutin-izable. I guess we could split the callback in two parts, one doing I/O and one doing graph manipulation (it may even be a good idea; the ->bdrv_inactivate() callback already exists for the bdrv_close() side, but I'm not sure the split is always applicable).
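
Purely as a sketch of that idea (neither callback exists today; both names are invented, mirroring how ->bdrv_inactivate() already covers the bdrv_close() side):

    /* Hypothetical split of ->bdrv_open() into an I/O part and a graph part */
    int coroutine_fn (*bdrv_co_open_io)(BlockDriverState *bs, QDict *options,
                                        int flags, Error **errp);
                                /* coroutine context, home iothread:
                                 * I/O only, no graph changes */
    int (*bdrv_open_graph)(BlockDriverState *bs, Error **errp);
                                /* main thread, outside coroutine context:
                                 * graph manipulation only */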

Come to think of it, I believe that many of the functions we declared
IO_CODE() are actually just IO_OR_GS_CODE() (at best; for iothreads,
they certainly require running in the home thread, but the main thread
allowed by IO_OR_GS_CODE() might not work). We have all the coroutine
machinery so that the AioContext lock of the current thread is
automatically released and reacquired across yield. However, this is the
wrong AioContext when called from a different thread, so we need an
additional lock - which should be dropped during yield, too, but that
doesn't happen.

There's no need for additional locks. Drivers have to be protected by individual locks, which are either a QemuMutex that is dropped around yield, or a CoMutex. A QemuMutex used to protect a CoQueue is also dropped safely across the yield.
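
As a minimal sketch of the CoQueue case (the driver state and field names are made up; qemu_co_queue_wait() really does drop the given lock across the yield and reacquire it before returning):

    typedef struct FooState {
        QemuMutex lock;     /* qemu_mutex_init() in the driver's open */
        CoQueue waiters;    /* qemu_co_queue_init() likewise */
        bool busy;
    } FooState;

    static void coroutine_fn foo_co_do_request(FooState *s)
    {
        qemu_mutex_lock(&s->lock);
        while (s->busy) {
            /* drops s->lock while this coroutine sleeps,
             * reacquires it before returning */
            qemu_co_queue_wait(&s->waiters, &s->lock);
        }
        s->busy = true;
        qemu_mutex_unlock(&s->lock);

        /* ... do the actual I/O without holding the lock ... */

        qemu_mutex_lock(&s->lock);
        s->busy = false;
        qemu_co_queue_next(&s->waiters);
        qemu_mutex_unlock(&s->lock);
    }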

The IO_CODE() distinction is indeed mostly theoretical as long as the file I/O BlockDriver protocols are protected by the AioContext lock and are therefore actually IO_OR_GS_CODE(). But that's the least of our worries; if file-posix AIO is done on qemu_get_current_aio_context() instead of bdrv_get_aio_context(bs), the AioContext lock is not needed anymore for I/O.
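
In sketch form, that change would amount to something like this at the file-posix submission site (condensed; aio_get_linux_aio() is the real helper):

    /* before: AIO state tied to the BDS's home AioContext */
    LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
    /* after: AIO state tied to whichever thread submits the request,
     * so no AioContext lock is needed for I/O */
    LinuxAioState *aio = aio_get_linux_aio(qemu_get_current_aio_context());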

Apart from file I/O, all drivers were thread-safe at some point (now it seems blklogwrites.c is not, but that code also seems unsafe against power loss, and I wonder if it should just be deleted).

Switching to drain for locking doesn't solve the problem, but only
possibly defers it. In order to complete the multiqueue work, we need
to make IO_CODE() functions actually conform to the definition of
IO_CODE(). Do we have a plan for what this should look like in the final
state when all the multiqueue work is completed? Or will we only have
coroutine locks, which don't have this problem?

The latter, apart from the occasional QemuMutex, which is used only where it does not cause the problem.

* running the code in the BDS's home iothread is not possible for
GLOBAL_STATE_CODE() functions (unless the BDS home iothread is the main
thread, but that cannot be guaranteed in general)

There is nothing that stops GLOBAL_STATE_CODE() from scheduling work in
the home iothread of a BDS and then waiting for it - or if it is a
coroutine, even to reschedule itself into the BDS home thread
temporarily.

There are still things that an iothread cannot do, for example bdrv_set_aio_context(). Even if it worked, I don't think it would be a good idea: it would basically mean a "movable" main thread, which is even harder to reason about.
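
For reference, the two mechanisms Kevin mentions look roughly like this (the operation itself is made up; aio_wait_bh_oneshot() and aio_co_reschedule_self() are real helpers):

    static void do_work_in_home_thread(void *opaque)
    {
        BlockDriverState *bs = opaque;
        /* runs in bdrv_get_aio_context(bs)'s thread */
        (void)bs;
    }

    /* (1) non-coroutine main-thread code: schedule a BH in the home
     * iothread and wait for it (caller holds bs's AioContext lock once) */
    static void gs_do_work(BlockDriverState *bs)
    {
        GLOBAL_STATE_CODE();
        aio_wait_bh_oneshot(bdrv_get_aio_context(bs),
                            do_work_in_home_thread, bs);
    }

    /* (2) coroutine: temporarily reschedule itself into the home thread */
    static void coroutine_fn gs_co_do_work(BlockDriverState *bs)
    {
        aio_co_reschedule_self(bdrv_get_aio_context(bs));
        /* ... now running in bs's home thread ... */
        aio_co_reschedule_self(qemu_get_aio_context());
    }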

So that's how I got to the point where I don't think it's possible to proceed with the idea of coroutin-izing more code (a prerequisite for a coroutine-based graph rwlock) without first removing the AioContext lock. But removing the AioContext lock requires a way to make the graph operations thread-safe, and then we go back to draining.

Paolo
