On 230125 1624, Stefan Hajnoczi wrote:
> On Thu, Jan 19, 2023 at 02:03:07AM -0500, Alexander Bulekov wrote:
> > Devices can pass their MemoryReentrancyGuard (from their DeviceState),
> > when creating new BHes. Then, the async API will toggle the guard
> > before/after calling the BH call-back. This prevents bh->mmio reentrancy
> > issues.
> > 
> > Signed-off-by: Alexander Bulekov <alx...@bu.edu>
> > ---
> >  docs/devel/multiple-iothreads.txt |  2 ++
> >  include/block/aio.h               | 18 ++++++++++++++++--
> >  include/qemu/main-loop.h          |  7 +++++--
> >  tests/unit/ptimer-test-stubs.c    |  3 ++-
> >  util/async.c                      | 12 +++++++++++-
> >  util/main-loop.c                  |  5 +++--
> >  6 files changed, 39 insertions(+), 8 deletions(-)
> > 
> > diff --git a/docs/devel/multiple-iothreads.txt b/docs/devel/multiple-iothreads.txt
> > index 343120f2ef..e4fafed9d9 100644
> > --- a/docs/devel/multiple-iothreads.txt
> > +++ b/docs/devel/multiple-iothreads.txt
> > @@ -61,6 +61,7 @@ There are several old APIs that use the main loop AioContext:
> >   * LEGACY qemu_aio_set_event_notifier() - monitor an event notifier
> >   * LEGACY timer_new_ms() - create a timer
> >   * LEGACY qemu_bh_new() - create a BH
> > + * LEGACY qemu_bh_new_guarded() - create a BH with a device re-entrancy guard
> >   * LEGACY qemu_aio_wait() - run an event loop iteration
> > 
> >  Since they implicitly work on the main loop they cannot be used in code that
> > @@ -72,6 +73,7 @@ Instead, use the AioContext functions directly (see include/block/aio.h):
> >   * aio_set_event_notifier() - monitor an event notifier
> >   * aio_timer_new() - create a timer
> >   * aio_bh_new() - create a BH
> > + * aio_bh_new_guarded() - create a BH with a device re-entrancy guard
> >   * aio_poll() - run an event loop iteration
> > 
> >  The AioContext can be obtained from the IOThread using
> > diff --git a/include/block/aio.h b/include/block/aio.h
> > index 0f65a3cc9e..94d661ff7e 100644
> > --- a/include/block/aio.h
> > +++ b/include/block/aio.h
> > @@ -23,6 +23,8 @@
> >  #include "qemu/thread.h"
> >  #include "qemu/timer.h"
> >  #include "block/graph-lock.h"
> > +#include "hw/qdev-core.h"
> > +
> > 
> >  typedef struct BlockAIOCB BlockAIOCB;
> >  typedef void BlockCompletionFunc(void *opaque, int ret);
> > @@ -332,9 +334,11 @@ void aio_bh_schedule_oneshot_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
> >   * is opaque and must be allocated prior to its use.
> >   *
> >   * @name: A human-readable identifier for debugging purposes.
> > + * @reentrancy_guard: A guard set when entering a cb to prevent
> > + * device-reentrancy issues
> >   */
> >  QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
> > -                        const char *name);
> > +                        const char *name, MemReentrancyGuard *reentrancy_guard);
> > 
> >  /**
> >   * aio_bh_new: Allocate a new bottom half structure
> > @@ -343,7 +347,17 @@ QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
> >   * string.
> >   */
> >  #define aio_bh_new(ctx, cb, opaque) \
> > -    aio_bh_new_full((ctx), (cb), (opaque), (stringify(cb)))
> > +    aio_bh_new_full((ctx), (cb), (opaque), (stringify(cb)), NULL)
> > +
> > +/**
> > + * aio_bh_new_guarded: Allocate a new bottom half structure with a
> > + * reentrancy_guard
> > + *
> > + * A convenience wrapper for aio_bh_new_full() that uses the cb as the name
> > + * string.
> > + */
> > +#define aio_bh_new_guarded(ctx, cb, opaque, guard) \
> > +    aio_bh_new_full((ctx), (cb), (opaque), (stringify(cb)), guard)
> > 
> >  /**
> >   * aio_notify: Force processing of pending events.
> > diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
> > index c25f390696..84d1ce57f0 100644
> > --- a/include/qemu/main-loop.h
> > +++ b/include/qemu/main-loop.h
> > @@ -389,9 +389,12 @@ void qemu_cond_timedwait_iothread(QemuCond *cond, int ms);
> > 
> >  void qemu_fd_register(int fd);
> > 
> > +#define qemu_bh_new_guarded(cb, opaque, guard) \
> > +    qemu_bh_new_full((cb), (opaque), (stringify(cb)), guard)
> >  #define qemu_bh_new(cb, opaque) \
> > -    qemu_bh_new_full((cb), (opaque), (stringify(cb)))
> > -QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name);
> > +    qemu_bh_new_full((cb), (opaque), (stringify(cb)), NULL)
> > +QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name,
> > +                         MemReentrancyGuard *reentrancy_guard);
> >  void qemu_bh_schedule_idle(QEMUBH *bh);
> > 
> >  enum {
> > diff --git a/tests/unit/ptimer-test-stubs.c b/tests/unit/ptimer-test-stubs.c
> > index f5e75a96b6..24d5413f9d 100644
> > --- a/tests/unit/ptimer-test-stubs.c
> > +++ b/tests/unit/ptimer-test-stubs.c
> > @@ -107,7 +107,8 @@ int64_t qemu_clock_deadline_ns_all(QEMUClockType type, int attr_mask)
> >      return deadline;
> >  }
> > 
> > -QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name)
> > +QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name,
> > +                         MemReentrancyGuard *reentrancy_guard)
> >  {
> >      QEMUBH *bh = g_new(QEMUBH, 1);
> > 
> > diff --git a/util/async.c b/util/async.c
> > index 14d63b3091..08924c3212 100644
> > --- a/util/async.c
> > +++ b/util/async.c
> > @@ -65,6 +65,7 @@ struct QEMUBH {
> >      void *opaque;
> >      QSLIST_ENTRY(QEMUBH) next;
> >      unsigned flags;
> > +    MemReentrancyGuard *reentrancy_guard;
> >  };
> > 
> >  /* Called concurrently from any thread */
> > @@ -133,7 +134,7 @@ void aio_bh_schedule_oneshot_full(AioContext *ctx, QEMUBHFunc *cb,
> >  }
> > 
> >  QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
> > -                        const char *name)
> > +                        const char *name, MemReentrancyGuard *reentrancy_guard)
> >  {
> >      QEMUBH *bh;
> >      bh = g_new(QEMUBH, 1);
> > @@ -142,13 +143,22 @@ QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
> >          .cb = cb,
> >          .opaque = opaque,
> >          .name = name,
> > +        .reentrancy_guard = reentrancy_guard,
> >      };
> >      return bh;
> >  }
> > 
> >  void aio_bh_call(QEMUBH *bh)
> >  {
> > +    if (bh->reentrancy_guard) {
> > +        bh->reentrancy_guard->engaged_in_io = true;
> > +    }
> > +
> >      bh->cb(bh->opaque);
> > +
> > +    if (bh->reentrancy_guard) {
> > +        bh->reentrancy_guard->engaged_in_io = false;
> > +    }
> >  }
> 
> QEMU supports nested event loops. I think aio_bh_call() -> cb() ->
> aio_poll() -> aio_bh_call() -> ... is possible although it should be
> rare.
> 
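To make sure I'm picturing the same thing, here is a rough sketch of that
call chain (outer_bh_cb() and request_done() are made-up names, not
anything from this patch):

    /* request_done() stands in for whatever condition the BH waits on. */
    static bool request_done(void *opaque);

    static void outer_bh_cb(void *opaque)
    {
        /* aio_bh_call() has already set guard->engaged_in_io = true here */

        while (!request_done(opaque)) {
            /*
             * Nested event loop: aio_poll() can run aio_bh_call() for
             * another BH that carries the same device's guard. With the
             * patch as posted, that inner call resets engaged_in_io to
             * false on return, so the rest of outer_bh_cb() runs with
             * the guard disengaged.
             */
            aio_poll(qemu_get_aio_context(), true);
        }
    }
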
Maybe 9p's v9fs_co_run_in_worker is an example of that, though I'm not
sure. That was one of the calls to qemu_bh_new that I could not find a
straightforward way to refactor.

> ->engaged_in_io will be set to false after the innermost aio_bh_call()
> returns. Therefore the protection doesn't cover the remainder of the
> parent cb() functions.
> 
> I think aio_bh_call() should be:
> 
> void aio_bh_call(QEMUBH *bh)
> {
>     bool last_engaged_in_io = false;
> 
>     if (bh->reentrancy_guard) {
>         last_engaged_in_io = bh->reentrancy_guard->engaged_in_io;
>         bh->reentrancy_guard->engaged_in_io = true;
>     }
> 
>     bh->cb(bh->opaque);
> 
>     if (bh->reentrancy_guard) {
>         bh->reentrancy_guard->engaged_in_io = last_engaged_in_io;
>     }
> }
> 
> That way nested aio_poll() calls work as expected.
> 
> This also raises the question whether aio_bh_call() should call abort(3)
> if ->engaged_in_io is already true when the function is entered? I think
> that may be too strict, but I'm not sure. A scenario where this can
> happen:
> 
> The memory region read/write function calls aio_poll() -> aio_bh_call()
> and a BH with our device's re-entrancy guard is executed.
> 

Is this sort of "bh reentrancy" only likely through a deliberate design
decision by the code author? If so, then maybe it doesn't need to be
treated with the same severity as the memory-reentrancy case.

I'll add a tracepoint in the next version.

Thanks
-Alex

> > 
> >  /* Multiple occurrences of aio_bh_poll cannot be called concurrently. */
> > diff --git a/util/main-loop.c b/util/main-loop.c
> > index 58f776a8c9..07d2e2040a 100644
> > --- a/util/main-loop.c
> > +++ b/util/main-loop.c
> > @@ -617,9 +617,10 @@ void main_loop_wait(int nonblocking)
> > 
> >  /* Functions to operate on the main QEMU AioContext. */
> > 
> > -QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name)
> > +QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name, MemReentrancyGuard *reentrancy_guard)
> >  {
> > -    return aio_bh_new_full(qemu_aio_context, cb, opaque, name);
> > +    return aio_bh_new_full(qemu_aio_context, cb, opaque, name,
> > +                           reentrancy_guard);
> >  }
> > 
> >  /*
> > -- 
> > 2.39.0
> > 
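For anyone reading along, the intended usage from a device looks roughly
like the sketch below. The device type, callback name, and the DeviceState
field name (mem_reentrancy_guard) are assumptions made for illustration,
taken from the rest of the series rather than from this particular patch:

    #include "qemu/osdep.h"
    #include "qemu/main-loop.h"
    #include "hw/qdev-core.h"

    typedef struct DemoDevState {
        DeviceState parent_obj;
        QEMUBH *bh;
    } DemoDevState;

    static void demo_dev_bh(void *opaque)
    {
        /*
         * While this callback runs, the guard passed below is engaged,
         * which is what the memory-access side of the series checks to
         * catch bh->mmio reentrancy.
         */
    }

    static void demo_dev_realize(DeviceState *dev, Error **errp)
    {
        DemoDevState *s = (DemoDevState *)dev;

        /* Pass the device's guard so aio_bh_call() toggles it around the cb. */
        s->bh = qemu_bh_new_guarded(demo_dev_bh, s,
                                    &dev->mem_reentrancy_guard);
    }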