From: Dima Stepanov <dimas...@yandex-team.ru>

A socket write during vhost-user communication may trigger a disconnect
event, calling vhost_user_blk_disconnect() and clearing all the
vhost_dev structures holding data that vhost-user functions expect to
remain valid to roll back initialization correctly. Delay the cleanup to
keep vhost_dev structure valid.
There are two possible states to handle:
1. RUN_STATE_PRELAUNCH: skip bh oneshot call and perform disconnect in
the caller routine.
2. RUN_STATE_RUNNING: delay by using bh

BH changes are based on the similar changes for the vhost-user-net
device:
  commit e7c83a885f865128ae3cf1946f8cb538b63cbfba
  "vhost-user: delay vhost_user_stop"

Signed-off-by: Dima Stepanov <dimas...@yandex-team.ru>
Message-Id: <69b73b94dcd066065595266c852810e0863a0895.1590396396.git.dimas...@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Li Feng <fen...@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norw...@nutanix.com>
---
 hw/block/vhost-user-blk.c | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 9d8c0b3909..76838e76d3 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -349,6 +349,19 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
     vhost_dev_cleanup(&s->dev);
 }
 
+static void vhost_user_blk_event(void *opaque, QEMUChrEvent event);
+
+static void vhost_user_blk_chr_closed_bh(void *opaque)
+{
+    DeviceState *dev = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VHostUserBlk *s = VHOST_USER_BLK(vdev);
+
+    vhost_user_blk_disconnect(dev);
+    qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
+            NULL, opaque, NULL, true);
+}
+
 static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
 {
     DeviceState *dev = opaque;
@@ -363,7 +376,30 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
         }
         break;
     case CHR_EVENT_CLOSED:
-        vhost_user_blk_disconnect(dev);
+        /*
+         * A close event may happen during a read/write, but vhost
+         * code assumes the vhost_dev remains setup, so delay the
+         * stop & clear. There are two possible paths to hit this
+         * disconnect event:
+         * 1. When VM is in the RUN_STATE_PRELAUNCH state. The
+         * vhost_user_blk_device_realize() is a caller.
+         * 2. In the main loop phase after VM start.
+         *
+         * For p2 the disconnect event will be delayed. We can't
+         * do the same for p1, because we are not running the loop
+         * at this moment. So just skip this step and perform
+         * disconnect in the caller function.
+         *
+         * TODO: maybe it is a good idea to make the same fix
+         * for other vhost-user devices.
+         */
+        if (runstate_is_running()) {
+            AioContext *ctx = qemu_get_current_aio_context();
+
+            qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL,
+                    NULL, NULL, false);
+            aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
+        }
         break;
     case CHR_EVENT_BREAK:
     case CHR_EVENT_MUX_IN:
-- 
MST
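
For readers who want to see the deferral idea outside of QEMU, here is a
minimal stand-alone sketch of the two cases the patch handles. It is a toy
model only and does not use any QEMU API: struct toy_dev, struct toy_loop
and every toy_* function are hypothetical names invented for illustration,
and the single-slot queue only approximates what a oneshot bottom half
does. The point it tries to show is that the close handler never tears the
device down inline, so the code that triggered the event can still read the
device state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for vhost_dev: state that in-flight code still reads. */
struct toy_dev {
    bool connected;
};

typedef void (*toy_bh_fn)(void *opaque);

/* Hypothetical one-slot deferred-call queue, drained by the main loop. */
struct toy_loop {
    bool running;       /* models "the VM is running, the loop is active" */
    toy_bh_fn pending;  /* at most one deferred callback in this sketch */
    void *opaque;
};

/* The actual teardown; after this the device state is no longer valid. */
static void toy_disconnect(struct toy_dev *d)
{
    d->connected = false;
}

/* Deferred callback: runs later from the loop, not from the event handler. */
static void toy_closed_bh(void *opaque)
{
    toy_disconnect(opaque);
}

/* Close-event handler: schedules the teardown instead of doing it inline. */
static void toy_on_close(struct toy_loop *loop, struct toy_dev *d)
{
    if (loop->running) {
        loop->pending = toy_closed_bh;
        loop->opaque = d;
    }
    /* Not running (prelaunch): do nothing here, the caller disconnects. */
}

/* One loop iteration: run the pending callback once, then clear the slot. */
static void toy_loop_iterate(struct toy_loop *loop)
{
    toy_bh_fn fn = loop->pending;

    if (fn) {
        loop->pending = NULL;
        fn(loop->opaque);
    }
}

/*
 * Models a write that triggers a close event halfway through. Returns
 * whether the device state is still valid right after the event, which is
 * what the rollback path relies on.
 */
static bool toy_write_with_close(struct toy_loop *loop, struct toy_dev *d)
{
    toy_on_close(loop, d);
    return d->connected;
}
```

In the running case the device stays valid until toy_loop_iterate() runs
the deferred callback; in the prelaunch case nothing is scheduled and the
caller is expected to call toy_disconnect() itself once its own rollback is
done.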