Re: [PATCH 0/3] monitor: only run coroutine commands in qemu_aio_context

2024-02-02 Thread YangHang Liu
It's easily for me to encounter " ../block/qcow2.c:5263:
ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *, Error
**): Assertion `false' failed" issue during 1Q vhost-user interface +
RT VM + post-copy migration

After applying this patch, the issue is still not reproduced even if I
repeat the same migration test for 60 times.

Tested-by: Yanghang Liu 


Best Regards,
YangHang Liu

On Mon, Jan 29, 2024 at 7:39 PM Peter Maydell  wrote:
>
> On Tue, 16 Jan 2024 at 19:01, Stefan Hajnoczi  wrote:
> >
> > Several bugs have been reported related to how QMP commands are rescheduled 
> > in
> > qemu_aio_context:
> > - https://gitlab.com/qemu-project/qemu/-/issues/1933
> > - https://issues.redhat.com/browse/RHEL-17369
> > - https://bugzilla.redhat.com/show_bug.cgi?id=2215192
> > - https://bugzilla.redhat.com/show_bug.cgi?id=2214985
> >
> > The first instance of the bug interacted with drain_call_rcu() temporarily
> > dropping the BQL and resulted in vCPU threads entering device emulation code
> > simultaneously (something that should never happen). I set out to make
> > drain_call_rcu() safe to use in this environment, but Paolo and Kevin 
> > discussed
> > the possibility of avoiding rescheduling the monitor_qmp_dispatcher_co()
> > coroutine for non-coroutine commands. This would prevent monitor commands 
> > from
> > running during vCPU thread aio_poll() entirely and addresses the root cause.
> >
> > This patch series implements this idea. qemu-iotests is sensitive to the 
> > exact
> > order in which QMP events and responses are emitted. Running QMP handlers in
> > the iohandler AioContext causes some QMP events to be ordered differently 
> > than
> > before. It is therefore necessary to adjust the reference output in many 
> > test
> > cases. The actual QMP code change is small and everything else is just to 
> > make
> > qemu-iotests happy.
>
> Hi; we have a suspicion that this change has resulted in a flaky-CI
> test: iotest-144 sometimes fails, apparently because a "return"
> result from QMP isn't always returned at the same place in relation
> to other QMP events. Could you have a look at it?
>
> https://gitlab.com/qemu-project/qemu/-/issues/2126
>
> thanks
> -- PMM
>




[PATCH v2 2/3] virtio: Re-enable notifications after drain

2024-02-02 Thread Hanna Czenczek
During drain, we do not care about virtqueue notifications, which is why
we remove the handlers on it.  When removing those handlers, whether vq
notifications are enabled or not depends on whether we were in polling
mode or not; if not, they are enabled (by default); if so, they have
been disabled by the io_poll_start callback.

Because we do not care about those notifications after removing the
handlers, this is fine.  However, we have to explicitly ensure they are
enabled when re-attaching the handlers, so we will resume receiving
notifications.  We do this in virtio_queue_aio_attach_host_notifier*().
If such a function is called while we are in a polling section,
attaching the notifiers will then invoke the io_poll_start callback,
re-disabling notifications.

Because we will always miss virtqueue updates in the drained section, we
also need to poll the virtqueue once after attaching the notifiers.

Buglink: https://issues.redhat.com/browse/RHEL-3934
Signed-off-by: Hanna Czenczek 
---
 include/block/aio.h |  7 ++-
 hw/virtio/virtio.c  | 42 ++
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index 5d0a114988..8378553eb9 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -480,9 +480,14 @@ void aio_set_event_notifier(AioContext *ctx,
 AioPollFn *io_poll,
 EventNotifierHandler *io_poll_ready);
 
-/* Set polling begin/end callbacks for an event notifier that has already been
+/*
+ * Set polling begin/end callbacks for an event notifier that has already been
  * registered with aio_set_event_notifier.  Do nothing if the event notifier is
  * not registered.
+ *
+ * Note that if the io_poll_end() callback (or the entire notifier) is removed
+ * during polling, it will not be called, so an io_poll_begin() is not
+ * necessarily always followed by an io_poll_end().
  */
 void aio_set_event_notifier_poll(AioContext *ctx,
  EventNotifier *notifier,
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 7549094154..d229755eae 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3556,6 +3556,17 @@ static void 
virtio_queue_host_notifier_aio_poll_end(EventNotifier *n)
 
 void virtio_queue_aio_attach_host_notifier(VirtQueue *vq, AioContext *ctx)
 {
+/*
+ * virtio_queue_aio_detach_host_notifier() can leave notifications 
disabled.
+ * Re-enable them.  (And if detach has not been used before, notifications
+ * being enabled is still the default state while a notifier is attached;
+ * see virtio_queue_host_notifier_aio_poll_end(), which will always leave
+ * notifications enabled once the polling section is left.)
+ */
+if (!virtio_queue_get_notification(vq)) {
+virtio_queue_set_notification(vq, 1);
+}
+
 aio_set_event_notifier(ctx, >host_notifier,
virtio_queue_host_notifier_read,
virtio_queue_host_notifier_aio_poll,
@@ -3563,6 +3574,13 @@ void virtio_queue_aio_attach_host_notifier(VirtQueue 
*vq, AioContext *ctx)
 aio_set_event_notifier_poll(ctx, >host_notifier,
 virtio_queue_host_notifier_aio_poll_begin,
 virtio_queue_host_notifier_aio_poll_end);
+
+/*
+ * We will have ignored notifications about new requests from the guest
+ * while no notifiers were attached, so "kick" the virt queue to process
+ * those requests now.
+ */
+event_notifier_set(>host_notifier);
 }
 
 /*
@@ -3573,14 +3591,38 @@ void virtio_queue_aio_attach_host_notifier(VirtQueue 
*vq, AioContext *ctx)
  */
 void virtio_queue_aio_attach_host_notifier_no_poll(VirtQueue *vq, AioContext 
*ctx)
 {
+/* See virtio_queue_aio_attach_host_notifier() */
+if (!virtio_queue_get_notification(vq)) {
+virtio_queue_set_notification(vq, 1);
+}
+
 aio_set_event_notifier(ctx, >host_notifier,
virtio_queue_host_notifier_read,
NULL, NULL);
+
+/*
+ * See virtio_queue_aio_attach_host_notifier().
+ * Note that this may be unnecessary for the type of virtqueues this
+ * function is used for.  Still, it will not hurt to have a quick look into
+ * whether we can/should process any of the virtqueue elements.
+ */
+event_notifier_set(>host_notifier);
 }
 
 void virtio_queue_aio_detach_host_notifier(VirtQueue *vq, AioContext *ctx)
 {
 aio_set_event_notifier(ctx, >host_notifier, NULL, NULL, NULL);
+
+/*
+ * aio_set_event_notifier_poll() does not guarantee whether io_poll_end()
+ * will run after io_poll_begin(), so by removing the notifier, we do not
+ * know whether virtio_queue_host_notifier_aio_poll_end() has run after a
+ * previous virtio_queue_host_notifier_aio_poll_begin(), i.e. whether
+ * notifications are enabled or disabled.  It does not 

[PATCH v2 1/3] virtio-scsi: Attach event vq notifier with no_poll

2024-02-02 Thread Hanna Czenczek
As of commit 38738f7dbbda90fbc161757b7f4be35b52205552 ("virtio-scsi:
don't waste CPU polling the event virtqueue"), we only attach an io_read
notifier for the virtio-scsi event virtqueue instead, and no polling
notifiers.  During operation, the event virtqueue is typically
non-empty, but none of the buffers are intended to be used immediately.
Instead, they only get used when certain events occur.  Therefore, it
makes no sense to continuously poll it when non-empty, because it is
supposed to be and stay non-empty.

We do this by using virtio_queue_aio_attach_host_notifier_no_poll()
instead of virtio_queue_aio_attach_host_notifier() for the event
virtqueue.

Commit 766aa2de0f29b657148e04599320d771c36fd126 ("virtio-scsi: implement
BlockDevOps->drained_begin()") however has virtio_scsi_drained_end() use
virtio_queue_aio_attach_host_notifier() for all virtqueues, including
the event virtqueue.  This can lead to it being polled again, undoing
the benefit of commit 38738f7dbbda90fbc161757b7f4be35b52205552.

Fix it by using virtio_queue_aio_attach_host_notifier_no_poll() for the
event virtqueue.

Reported-by: Fiona Ebner 
Fixes: 766aa2de0f29b657148e04599320d771c36fd126
   ("virtio-scsi: implement BlockDevOps->drained_begin()")
Reviewed-by: Stefan Hajnoczi 
Tested-by: Fiona Ebner 
Reviewed-by: Fiona Ebner 
Signed-off-by: Hanna Czenczek 
---
 hw/scsi/virtio-scsi.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 690aceec45..9f02ceea09 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -1149,6 +1149,7 @@ static void virtio_scsi_drained_begin(SCSIBus *bus)
 static void virtio_scsi_drained_end(SCSIBus *bus)
 {
 VirtIOSCSI *s = container_of(bus, VirtIOSCSI, bus);
+VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
 VirtIODevice *vdev = VIRTIO_DEVICE(s);
 uint32_t total_queues = VIRTIO_SCSI_VQ_NUM_FIXED +
 s->parent_obj.conf.num_queues;
@@ -1166,7 +1167,11 @@ static void virtio_scsi_drained_end(SCSIBus *bus)
 
 for (uint32_t i = 0; i < total_queues; i++) {
 VirtQueue *vq = virtio_get_queue(vdev, i);
-virtio_queue_aio_attach_host_notifier(vq, s->ctx);
+if (vq == vs->event_vq) {
+virtio_queue_aio_attach_host_notifier_no_poll(vq, s->ctx);
+} else {
+virtio_queue_aio_attach_host_notifier(vq, s->ctx);
+}
 }
 }
 
-- 
2.43.0




[PATCH v2 3/3] virtio-blk: Use ioeventfd_attach in start_ioeventfd

2024-02-02 Thread Hanna Czenczek
Commit d3f6f294aeadd5f88caf0155e4360808c95b3146 ("virtio-blk: always set
ioeventfd during startup") has made virtio_blk_start_ioeventfd() always
kick the virtqueue (set the ioeventfd), regardless of whether the BB is
drained.  That is no longer necessary, because attaching the host
notifier will now set the ioeventfd, too; this happens either
immediately right here in virtio_blk_start_ioeventfd(), or later when
the drain ends, in virtio_blk_ioeventfd_attach().

With event_notifier_set() removed, the code becomes the same as the one
in virtio_blk_ioeventfd_attach(), so we can reuse that function.

Signed-off-by: Hanna Czenczek 
---
 hw/block/virtio-blk.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 227d83569f..22b8eef69b 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -37,6 +37,8 @@
 #include "hw/virtio/virtio-blk-common.h"
 #include "qemu/coroutine.h"
 
+static void virtio_blk_ioeventfd_attach(VirtIOBlock *s);
+
 static void virtio_blk_init_request(VirtIOBlock *s, VirtQueue *vq,
 VirtIOBlockReq *req)
 {
@@ -1808,17 +1810,14 @@ static int virtio_blk_start_ioeventfd(VirtIODevice 
*vdev)
 s->ioeventfd_started = true;
 smp_wmb(); /* paired with aio_notify_accept() on the read side */
 
-/* Get this show started by hooking up our callbacks */
-for (i = 0; i < nvqs; i++) {
-VirtQueue *vq = virtio_get_queue(vdev, i);
-AioContext *ctx = s->vq_aio_context[i];
-
-/* Kick right away to begin processing requests already in vring */
-event_notifier_set(virtio_queue_get_host_notifier(vq));
-
-if (!blk_in_drain(s->conf.conf.blk)) {
-virtio_queue_aio_attach_host_notifier(vq, ctx);
-}
+/*
+ * Get this show started by hooking up our callbacks.  If drained now,
+ * virtio_blk_drained_end() will do this later.
+ * Attaching the notifier also kicks the virtqueues, processing any 
requests
+ * they may already have.
+ */
+if (!blk_in_drain(s->conf.conf.blk)) {
+virtio_blk_ioeventfd_attach(s);
 }
 return 0;
 
-- 
2.43.0




[PATCH v2 0/3] virtio: Re-enable notifications after drain

2024-02-02 Thread Hanna Czenczek
v1:

https://lists.nongnu.org/archive/html/qemu-block/2024-01/msg00336.html


Hi,

This is basically the same series as v1: When using
aio_set_event_notifier_poll(), the io_poll_end() callback is only
invoked when polling ends, not when the notifier is being removed while
in a polling section.  This can leave the virtqueue notifier disabled
during drained sections, which however is not a bad thing.  We just need
to ensure they are re-enabled after the drain, and kick the virtqueue
once to pick up all the requests that came in during the drained
section.

Patch 1 is a technically unrelated fix, but addresses a problem that
became visible with patch 2 applied.

Patch 3 is a small (optional) clean-up patch.


v2:
- Changed the title of this series and patch 2 (was: "Keep notifications
  disabled durin drain"): Keeping the notifier disabled was something
  the initial RFC did, this version (v1 too) just ensures the notifier
  is enabled after the drain, regardless of its state before.

- Use event_notifier_set() instead of virtio_queue_notify() in patch 2

- Added patch 3


Hanna Czenczek (3):
  virtio-scsi: Attach event vq notifier with no_poll
  virtio: Re-enable notifications after drain
  virtio-blk: Use ioeventfd_attach in start_ioeventfd

 include/block/aio.h   |  7 ++-
 hw/block/virtio-blk.c | 21 ++---
 hw/scsi/virtio-scsi.c |  7 ++-
 hw/virtio/virtio.c| 42 ++
 4 files changed, 64 insertions(+), 13 deletions(-)

-- 
2.43.0




[PATCH 2/2] scsi: Await request purging

2024-02-02 Thread Hanna Czenczek
scsi_device_for_each_req_async() currently does not provide any way to
be awaited.  One of its callers is scsi_device_purge_requests(), which
therefore currently does not guarantee that all requests are fully
settled when it returns.

We want all requests to be settled, because scsi_device_purge_requests()
is called through the unrealize path, including the one invoked by
virtio_scsi_hotunplug() through qdev_simple_device_unplug_cb(), which
most likely assumes that all SCSI requests are done then.

In fact, scsi_device_purge_requests() already contains a blk_drain(),
but this will not fully await scsi_device_for_each_req_async(), only the
I/O requests it potentially cancels (not the non-I/O requests).
However, we can have scsi_device_for_each_req_async() increment the BB
in-flight counter, and have scsi_device_for_each_req_async_bh()
decrement it when it is done.  This way, the blk_drain() will fully
await all SCSI requests to be purged.

This also removes the need for scsi_device_for_each_req_async_bh() to
double-check the current context and potentially re-schedule itself,
should it now differ from the BB's context: Changing a BB's AioContext
with a root node is done through bdrv_try_change_aio_context(), which
creates a drained section.  With this patch, we keep the BB in-flight
counter elevated throughout, so we know the BB's context cannot change.

Signed-off-by: Hanna Czenczek 
---
 hw/scsi/scsi-bus.c | 30 +-
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 0a2eb11c56..230313022c 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -120,17 +120,13 @@ static void scsi_device_for_each_req_async_bh(void 
*opaque)
 SCSIRequest *next;
 
 /*
- * If the AioContext changed before this BH was called then reschedule into
- * the new AioContext before accessing ->requests. This can happen when
- * scsi_device_for_each_req_async() is called and then the AioContext is
- * changed before BHs are run.
+ * The BB cannot have changed contexts between this BH being scheduled and
+ * now: BBs' AioContexts, when they have a node attached, can only be
+ * changed via bdrv_try_change_aio_context(), in a drained section.  While
+ * we have the in-flight counter incremented, that drain must block.
  */
 ctx = blk_get_aio_context(s->conf.blk);
-if (ctx != qemu_get_current_aio_context()) {
-aio_bh_schedule_oneshot(ctx, scsi_device_for_each_req_async_bh,
-g_steal_pointer());
-return;
-}
+assert(ctx == qemu_get_current_aio_context());
 
 QTAILQ_FOREACH_SAFE(req, >requests, next, next) {
 data->fn(req, data->fn_opaque);
@@ -138,11 +134,16 @@ static void scsi_device_for_each_req_async_bh(void 
*opaque)
 
 /* Drop the reference taken by scsi_device_for_each_req_async() */
 object_unref(OBJECT(s));
+
+/* Paired with blk_inc_in_flight() in scsi_device_for_each_req_async() */
+blk_dec_in_flight(s->conf.blk);
 }
 
 /*
  * Schedule @fn() to be invoked for each enqueued request in device @s. @fn()
  * runs in the AioContext that is executing the request.
+ * Keeps the BlockBackend's in-flight counter incremented until everything is
+ * done, so draining it will settle all scheduled @fn() calls.
  */
 static void scsi_device_for_each_req_async(SCSIDevice *s,
void (*fn)(SCSIRequest *, void *),
@@ -163,6 +164,8 @@ static void scsi_device_for_each_req_async(SCSIDevice *s,
  */
 object_ref(OBJECT(s));
 
+/* Paired with blk_dec_in_flight() in scsi_device_for_each_req_async_bh() 
*/
+blk_inc_in_flight(s->conf.blk);
 aio_bh_schedule_oneshot(blk_get_aio_context(s->conf.blk),
 scsi_device_for_each_req_async_bh,
 data);
@@ -1728,11 +1731,20 @@ static void scsi_device_purge_one_req(SCSIRequest *req, 
void *opaque)
 scsi_req_cancel_async(req, NULL);
 }
 
+/**
+ * Cancel all requests, and block until they are deleted.
+ */
 void scsi_device_purge_requests(SCSIDevice *sdev, SCSISense sense)
 {
 scsi_device_for_each_req_async(sdev, scsi_device_purge_one_req, NULL);
 
+/*
+ * Await all the scsi_device_purge_one_req() calls scheduled by
+ * scsi_device_for_each_req_async(), and all I/O requests that were
+ * cancelled this way, but may still take a bit of time to settle.
+ */
 blk_drain(sdev->conf.blk);
+
 scsi_device_set_ua(sdev, sense);
 }
 
-- 
2.43.0




[PATCH 1/2] block-backend: Allow concurrent context changes

2024-02-02 Thread Hanna Czenczek
Since AioContext locks have been removed, a BlockBackend's AioContext
may really change at any time (only exception is that it is often
confined to a drained section, as noted in this patch).  Therefore,
blk_get_aio_context() cannot rely on its root node's context always
matching that of the BlockBackend.

In practice, whether they match does not matter anymore anyway: Requests
can be sent to BDSs from any context, so anyone who requests the BB's
context should have no reason to require the root node to have the same
context.  Therefore, we can and should remove the assertion to that
effect.

In addition, because the context can be set and queried from different
threads concurrently, it has to be accessed with atomic operations.

Buglink: https://issues.redhat.com/browse/RHEL-19381
Suggested-by: Kevin Wolf 
Signed-off-by: Hanna Czenczek 
---
 block/block-backend.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 209eb07528..9c4de79e6b 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -44,7 +44,7 @@ struct BlockBackend {
 char *name;
 int refcnt;
 BdrvChild *root;
-AioContext *ctx;
+AioContext *ctx; /* access with atomic operations only */
 DriveInfo *legacy_dinfo;/* null unless created by drive_new() */
 QTAILQ_ENTRY(BlockBackend) link; /* for block_backends */
 QTAILQ_ENTRY(BlockBackend) monitor_link; /* for monitor_block_backends */
@@ -2414,22 +2414,22 @@ void blk_op_unblock_all(BlockBackend *blk, Error 
*reason)
 }
 }
 
+/**
+ * Return BB's current AioContext.  Note that this context may change
+ * concurrently at any time, with one exception: If the BB has a root node
+ * attached, its context will only change through 
bdrv_try_change_aio_context(),
+ * which creates a drained section.  Therefore, incrementing such a BB's
+ * in-flight counter will prevent its context from changing.
+ */
 AioContext *blk_get_aio_context(BlockBackend *blk)
 {
-BlockDriverState *bs;
 IO_CODE();
 
 if (!blk) {
 return qemu_get_aio_context();
 }
 
-bs = blk_bs(blk);
-if (bs) {
-AioContext *ctx = bdrv_get_aio_context(blk_bs(blk));
-assert(ctx == blk->ctx);
-}
-
-return blk->ctx;
+return qatomic_read(>ctx);
 }
 
 int blk_set_aio_context(BlockBackend *blk, AioContext *new_context,
@@ -2442,7 +2442,7 @@ int blk_set_aio_context(BlockBackend *blk, AioContext 
*new_context,
 GLOBAL_STATE_CODE();
 
 if (!bs) {
-blk->ctx = new_context;
+qatomic_set(>ctx, new_context);
 return 0;
 }
 
@@ -2471,7 +2471,7 @@ static void blk_root_set_aio_ctx_commit(void *opaque)
 AioContext *new_context = s->new_ctx;
 ThrottleGroupMember *tgm = >public.throttle_group_member;
 
-blk->ctx = new_context;
+qatomic_set(>ctx, new_context);
 if (tgm->throttle_state) {
 throttle_group_detach_aio_context(tgm);
 throttle_group_attach_aio_context(tgm, new_context);
-- 
2.43.0




[PATCH 0/2] block: Allow concurrent BB context changes

2024-02-02 Thread Hanna Czenczek
Hi,

Without the AioContext lock, a BB's context may kind of change at any
time (unless it has a root node, and I/O requests are pending).  That
also means that its own context (BlockBackend.ctx) and that of its root
node can differ sometimes (while the context is being changed).

blk_get_aio_context() doesn't know this yet and asserts that both are
always equal (if there is a root node).  Because it's no longer true,
and because callers don't seem to really care about the root node's
context, we can and should remove the assertion and just return the BB's
context.

Beyond that, the question is whether the callers of
blk_get_aio_context() are OK with the context potentially changing
concurrently.  Honestly, it isn't entirely clear to me; most look OK,
except for the virtio-scsi code, which operates under the general
assumption that the BB's context is always equal to that of the
virtio-scsi device.  I doubt that this assumption always holds (it is
definitely not obvious to me that it would), but then again, this series
will not make matters worse in that regard, and that is what counts for
me now.

One clear point of contention is scsi_device_for_each_req_async(), which
is addressed by patch 2.  Right now, it schedules a BH in the BB
context, then the BH double-checks whether the context still fits, and
if not, re-schedules itself.  Because virtio-scsi's context is fixed,
this seems to indicate to me that it wants to be able to deal with a
case where BB and virtio-scsi context differ, which seems to break that
aforementioned general virtio-scsi assumption.

Unfortunately, I definitely have to touch that code, because accepting
concurrent changes of AioContexts breaks the double-check (just because
the BB has the right context in that place does not mean it stays in
that context); instead, we must prevent any concurrent change until the
BH is done.  Because changing contexts generally involves a drained
section, we can prevent it by keeping the BB in-flight counter elevated.

Question is, how to reason for that.  I’d really rather not include the
need to follow the BB context in my argument, because I find that part a
bit fishy.

Luckily, there’s a second, completely different reason for having
scsi_device_for_each_req_async() increment the in-flight counter:
Specifically, scsi_device_purge_requests() probably wants to await full
completion of scsi_device_for_each_req_async(), and we can do that most
easily in the very same way by incrementing the in-flight counter.  This
way, the blk_drain() in scsi_device_purge_requests() will not only await
all (cancelled) I/O requests, but also the non-I/O requests.

The fact that this prevents the BB AioContext from changing while the BH
is scheduled/running then is just a nice side effect.


Hanna Czenczek (2):
  block-backend: Allow concurrent context changes
  scsi: Await request purging

 block/block-backend.c | 22 +++---
 hw/scsi/scsi-bus.c| 30 +-
 2 files changed, 32 insertions(+), 20 deletions(-)

-- 
2.43.0




[PATCH] vdpa-dev: Fix initialisation order to restore VDUSE compatibility

2024-02-02 Thread Kevin Wolf
VDUSE requires that virtqueues are first enabled before the DRIVER_OK
status flag is set; with the current API of the kernel module, it is
impossible to enable the opposite order in our block export code because
userspace is not notified when a virtqueue is enabled.

This requirement also mathces the normal initialisation order as done by
the generic vhost code in QEMU. However, commit 6c482547 accidentally
changed the order for vdpa-dev and broke access to VDUSE devices with
this.

This changes vdpa-dev to use the normal order again and use the standard
vhost callback .vhost_set_vring_enable for this. VDUSE devices can be
used with vdpa-dev again after this fix.

Cc: qemu-sta...@nongnu.org
Fixes: 6c4825476a4351530bcac17abab72295b75ffe98
Signed-off-by: Kevin Wolf 
---
 hw/virtio/vdpa-dev.c   |  5 +
 hw/virtio/vhost-vdpa.c | 17 +
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index eb9ecea83b..13e87f06f6 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -253,14 +253,11 @@ static int vhost_vdpa_device_start(VirtIODevice *vdev, 
Error **errp)
 
 s->dev.acked_features = vdev->guest_features;
 
-ret = vhost_dev_start(>dev, vdev, false);
+ret = vhost_dev_start(>dev, vdev, true);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "Error starting vhost");
 goto err_guest_notifiers;
 }
-for (i = 0; i < s->dev.nvqs; ++i) {
-vhost_vdpa_set_vring_ready(>vdpa, i);
-}
 s->started = true;
 
 /*
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 3a43beb312..c4574d56c5 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -904,6 +904,22 @@ int vhost_vdpa_set_vring_ready(struct vhost_vdpa *v, 
unsigned idx)
 return r;
 }
 
+static int vhost_vdpa_set_vring_enable(struct vhost_dev *dev, int enable)
+{
+struct vhost_vdpa *v = dev->opaque;
+unsigned int i;
+int ret;
+
+for (i = 0; i < dev->nvqs; ++i) {
+ret = vhost_vdpa_set_vring_ready(v, i);
+if (ret < 0) {
+return ret;
+}
+}
+
+return 0;
+}
+
 static int vhost_vdpa_set_config_call(struct vhost_dev *dev,
int fd)
 {
@@ -1524,6 +1540,7 @@ const VhostOps vdpa_ops = {
 .vhost_set_features = vhost_vdpa_set_features,
 .vhost_reset_device = vhost_vdpa_reset_device,
 .vhost_get_vq_index = vhost_vdpa_get_vq_index,
+.vhost_set_vring_enable = vhost_vdpa_set_vring_enable,
 .vhost_get_config  = vhost_vdpa_get_config,
 .vhost_set_config = vhost_vdpa_set_config,
 .vhost_requires_shm_log = NULL,
-- 
2.43.0




Re: [PULL 11/33] scsi: only access SCSIDevice->requests from one thread

2024-02-02 Thread Hanna Czenczek

On 01.02.24 16:25, Hanna Czenczek wrote:

On 01.02.24 15:28, Stefan Hajnoczi wrote:


[...]


Did you find a scenario where the virtio-scsi AioContext is different
from the scsi-hd BB's Aiocontext?


Technically, that’s the reason for this thread, specifically that 
virtio_scsi_hotunplug() switches the BB back to the main context while 
scsi_device_for_each_req_async_bh() is running.  Yes, we can fix that 
specific case via the in-flight counter, but I’m wondering whether 
there’s really any merit in requiring the BB to always be in 
virtio-scsi’s context, or whether it would make more sense to schedule 
everything in virtio-scsi’s context.  Now that BBs/BDSs can receive 
requests from any context, that is.


Now that I know that wouldn’t be easy, let me turn this around: As far 
as I understand, scsi_device_for_each_req_async_bh() should still run in 
virtio-scsi’s context, but that’s hard, so we take the BB’s context, 
which we therefore require to be the same one. Further, (again AFAIU,) 
virtio-scsi’s context cannot change (only set in 
virtio_scsi_dataplane_setup(), which is run in 
virtio_scsi_device_realize()).  Therefore, why does the 
scsi_device_for_each_req_async() code accommodate for BB context changes?


Hanna