Stefan Hajnoczi <stefa...@redhat.com> writes:
> [[PGP Signed Part:Undecided]] > On Wed, Nov 30, 2022 at 11:24:38AM +0000, Alex Bennée wrote: >> ..and use for both virtio-user-blk and virtio-user-gpio. This avoids >> the circular close by deferring shutdown due to disconnection until a >> later point. > > I thought re-entrancy was already avoided by Patch 3? No because vu_gpio_disconnect() gets called while we are in the process of tearing things down for the vu_gpio_stop(). As it hasn't disconnected yet we end up doing the teardown there: #2 0x0000557adbe04d56 in vu_gpio_disconnect (dev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:255 #3 0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640, event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274 #4 0x0000557adc0539ef in chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61 #5 0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81 #6 0x0000557adc04f666 in tcp_chr_disconnect_locked (chr=0x557adea51f10) at ../../chardev/char-socket.c:470 #7 0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-socket.c:129 #8 0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121 #9 0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173 #10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53 #11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490 #12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260 #13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220 #14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916 #15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142 and when we finally return the disconnect event has wiped out vhost_user. > >> virtio-user-blk already had this mechanism in place so >> generalise it as a vhost-user helper function and use for both blk and >> gpio devices. >> >> While we are at it we also fix up vhost-user-gpio to re-establish the >> event handler after close down so we can reconnect later. > > I don't understand how the BH is supposed to help. Normally BHs are used > to defer processing while we're in a call stack of some depth. The > callers still require the resource we want to free, so we scheduled a > BH. > > That's not the case here. The CHR_EVENT_CLOSED callback is just another > top-level event handler, not something that is called deep in a call > stack. Well it is in the shutdown case because as we write to the vhost_user backend to shut things down it dies and we immediately process the disconnect event. It is racy though so some times you get away with it. > This BH approach isn't safe against nested event loops (something that > calls aio_poll() to wait), because now the BH can be invoked deep in a > call stack. Well I did ask if anyone had any better suggestions. In this case I'm mostly just copying what vhost-user-blk did. > > That said, if Patch 3 isn't enough and this patch fixes a problem, then > let's keep it for 7.2. It feels like a possibly incomplete solution and > maybe something that should be solved differently in 8.0. > >> >> Signed-off-by: Alex Bennée <alex.ben...@linaro.org> >> Reviewed-by: Michael S. Tsirkin <m...@redhat.com> >> Reviewed-by: Raphael Norwitz <raphael.norw...@nutanix.com> >> --- >> include/hw/virtio/vhost-user.h | 18 +++++++++ >> hw/block/vhost-user-blk.c | 41 +++----------------- >> hw/virtio/vhost-user-gpio.c | 11 +++++- >> hw/virtio/vhost-user.c | 71 ++++++++++++++++++++++++++++++++++ >> 4 files changed, 104 insertions(+), 37 deletions(-) >> >> diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h >> index c6e693cd3f..191216a74f 100644 >> --- a/include/hw/virtio/vhost-user.h >> +++ b/include/hw/virtio/vhost-user.h >> @@ -68,4 +68,22 @@ bool vhost_user_init(VhostUserState *user, CharBackend >> *chr, Error **errp); >> */ >> void vhost_user_cleanup(VhostUserState *user); >> >> +/** >> + * vhost_user_async_close() - cleanup vhost-user post connection drop >> + * @d: DeviceState for the associated device (passed to callback) >> + * @chardev: the CharBackend associated with the connection >> + * @vhost: the common vhost device >> + * @cb: the user callback function to complete the clean-up >> + * >> + * This function is used to handle the shutdown of a vhost-user >> + * connection to a backend. We handle this centrally to make sure we >> + * do all the steps and handle potential races due to VM shutdowns. >> + * Once the connection is disabled we call a backhalf to ensure >> + */ >> +typedef void (*vu_async_close_fn)(DeviceState *cb); >> + >> +void vhost_user_async_close(DeviceState *d, >> + CharBackend *chardev, struct vhost_dev *vhost, >> + vu_async_close_fn cb); >> + >> #endif >> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c >> index 1177064631..aff4d2b8cb 100644 >> --- a/hw/block/vhost-user-blk.c >> +++ b/hw/block/vhost-user-blk.c >> @@ -369,17 +369,10 @@ static void vhost_user_blk_disconnect(DeviceState *dev) >> vhost_user_blk_stop(vdev); >> >> vhost_dev_cleanup(&s->dev); >> -} >> >> -static void vhost_user_blk_chr_closed_bh(void *opaque) >> -{ >> - DeviceState *dev = opaque; >> - VirtIODevice *vdev = VIRTIO_DEVICE(dev); >> - VHostUserBlk *s = VHOST_USER_BLK(vdev); >> - >> - vhost_user_blk_disconnect(dev); >> + /* Re-instate the event handler for new connections */ >> qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event, >> - NULL, opaque, NULL, true); >> + NULL, dev, NULL, true); >> } >> >> static void vhost_user_blk_event(void *opaque, QEMUChrEvent event) >> @@ -398,33 +391,9 @@ static void vhost_user_blk_event(void *opaque, >> QEMUChrEvent event) >> } >> break; >> case CHR_EVENT_CLOSED: >> - if (!runstate_check(RUN_STATE_SHUTDOWN)) { >> - /* >> - * A close event may happen during a read/write, but vhost >> - * code assumes the vhost_dev remains setup, so delay the >> - * stop & clear. >> - */ >> - AioContext *ctx = qemu_get_current_aio_context(); >> - >> - qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL, >> - NULL, NULL, false); >> - aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, >> opaque); >> - >> - /* >> - * Move vhost device to the stopped state. The vhost-user device >> - * will be clean up and disconnected in BH. This can be useful >> in >> - * the vhost migration code. If disconnect was caught there is >> an >> - * option for the general vhost code to get the dev state >> without >> - * knowing its type (in this case vhost-user). >> - * >> - * FIXME: this is sketchy to be reaching into vhost_dev >> - * now because we are forcing something that implies we >> - * have executed vhost_dev_stop() but that won't happen >> - * until vhost_user_blk_stop() gets called from the bh. >> - * Really this state check should be tracked locally. >> - */ >> - s->dev.started = false; >> - } >> + /* defer close until later to avoid circular close */ >> + vhost_user_async_close(dev, &s->chardev, &s->dev, >> + vhost_user_blk_disconnect); >> break; >> case CHR_EVENT_BREAK: >> case CHR_EVENT_MUX_IN: >> diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c >> index be9be08b4c..b7b82a1099 100644 >> --- a/hw/virtio/vhost-user-gpio.c >> +++ b/hw/virtio/vhost-user-gpio.c >> @@ -233,6 +233,8 @@ static int vu_gpio_connect(DeviceState *dev, Error >> **errp) >> return 0; >> } >> >> +static void vu_gpio_event(void *opaque, QEMUChrEvent event); >> + >> static void vu_gpio_disconnect(DeviceState *dev) >> { >> VirtIODevice *vdev = VIRTIO_DEVICE(dev); >> @@ -245,6 +247,11 @@ static void vu_gpio_disconnect(DeviceState *dev) >> >> vu_gpio_stop(vdev); >> vhost_dev_cleanup(&gpio->vhost_dev); >> + >> + /* Re-instate the event handler for new connections */ >> + qemu_chr_fe_set_handlers(&gpio->chardev, >> + NULL, NULL, vu_gpio_event, >> + NULL, dev, NULL, true); >> } >> >> static void vu_gpio_event(void *opaque, QEMUChrEvent event) >> @@ -262,7 +269,9 @@ static void vu_gpio_event(void *opaque, QEMUChrEvent >> event) >> } >> break; >> case CHR_EVENT_CLOSED: >> - vu_gpio_disconnect(dev); >> + /* defer close until later to avoid circular close */ >> + vhost_user_async_close(dev, &gpio->chardev, &gpio->vhost_dev, >> + vu_gpio_disconnect); >> break; >> case CHR_EVENT_BREAK: >> case CHR_EVENT_MUX_IN: >> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c >> index abe23d4ebe..8f635844af 100644 >> --- a/hw/virtio/vhost-user.c >> +++ b/hw/virtio/vhost-user.c >> @@ -21,6 +21,7 @@ >> #include "qemu/error-report.h" >> #include "qemu/main-loop.h" >> #include "qemu/sockets.h" >> +#include "sysemu/runstate.h" >> #include "sysemu/cryptodev.h" >> #include "migration/migration.h" >> #include "migration/postcopy-ram.h" >> @@ -2670,6 +2671,76 @@ void vhost_user_cleanup(VhostUserState *user) >> user->chr = NULL; >> } >> >> + >> +typedef struct { >> + vu_async_close_fn cb; >> + DeviceState *dev; >> + CharBackend *cd; >> + struct vhost_dev *vhost; >> +} VhostAsyncCallback; >> + >> +static void vhost_user_async_close_bh(void *opaque) >> +{ >> + VhostAsyncCallback *data = opaque; >> + struct vhost_dev *vhost = data->vhost; >> + >> + /* >> + * If the vhost_dev has been cleared in the meantime there is >> + * nothing left to do as some other path has completed the >> + * cleanup. >> + */ >> + if (vhost->vdev) { >> + data->cb(data->dev); >> + } >> + >> + g_free(data); >> +} >> + >> +/* >> + * We only schedule the work if the machine is running. If suspended >> + * we want to keep all the in-flight data as is for migration >> + * purposes. >> + */ >> +void vhost_user_async_close(DeviceState *d, >> + CharBackend *chardev, struct vhost_dev *vhost, >> + vu_async_close_fn cb) >> +{ >> + if (!runstate_check(RUN_STATE_SHUTDOWN)) { >> + /* >> + * A close event may happen during a read/write, but vhost >> + * code assumes the vhost_dev remains setup, so delay the >> + * stop & clear. >> + */ >> + AioContext *ctx = qemu_get_current_aio_context(); >> + VhostAsyncCallback *data = g_new0(VhostAsyncCallback, 1); >> + >> + /* Save data for the callback */ >> + data->cb = cb; >> + data->dev = d; >> + data->cd = chardev; >> + data->vhost = vhost; >> + >> + /* Disable any further notifications on the chardev */ >> + qemu_chr_fe_set_handlers(chardev, >> + NULL, NULL, NULL, NULL, NULL, NULL, >> + false); >> + >> + aio_bh_schedule_oneshot(ctx, vhost_user_async_close_bh, data); >> + >> + /* >> + * Move vhost device to the stopped state. The vhost-user device >> + * will be clean up and disconnected in BH. This can be useful in >> + * the vhost migration code. If disconnect was caught there is an >> + * option for the general vhost code to get the dev state without >> + * knowing its type (in this case vhost-user). >> + * >> + * Note if the vhost device is fully cleared by the time we >> + * execute the bottom half we won't continue with the cleanup. >> + */ >> + vhost->started = false; >> + } >> +} >> + >> static int vhost_user_dev_start(struct vhost_dev *dev, bool started) >> { >> if (!virtio_has_feature(dev->protocol_features, >> -- >> 2.34.1 >> > > [[End of PGP Signed Part]] -- Alex Bennée