OK - I'm happy with this approach then.
On Wed, May 15, 2024 at 10:48 PM Li Feng <fen...@smartx.com> wrote:
>
>
>
> 2024年5月15日 23:47,Raphael Norwitz <raph...@enfabrica.net> 写道:
>
> The case your describing makes sense but now I have some concerns on
> the vhost_dev_cleanup bit.
>
> On Wed, May 15, 2024 at 1:47 AM Li Feng <fen...@smartx.com> wrote:
>
>
>
>
> 2024年5月14日 21:58,Raphael Norwitz <raph...@enfabrica.net> 写道:
>
> Code looks good. Just a question on the error case you're trying to fix.
>
> On Tue, May 14, 2024 at 2:12 AM Li Feng <fen...@smartx.com> wrote:
>
>
> When the vhost-user is reconnecting to the backend, and if the vhost-user
> fails
> at the get_features in vhost_dev_init(), then the reconnect will fail
> and it will not be retriggered forever.
>
> The reason is:
> When the vhost-user fail at get_features, the vhost_dev_cleanup will be called
> immediately.
>
> vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
>
> The reconnect path is:
> vhost_user_blk_event
> vhost_user_async_close(.. vhost_user_blk_disconnect ..)
> qemu_chr_fe_set_handlers <----- clear the notifier callback
> schedule vhost_user_async_close_bh
>
> The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
> called, then the event fd callback will not be reinstalled.
>
> With this patch, the vhost_user_blk_disconnect will call the
> vhost_dev_cleanup() again, it's safe.
>
> In addition, the CLOSE event may occur in a scenario where connected is false.
> At this time, the event handler will be cleared. We need to ensure that the
> event handler can remain installed.
>
>
> Following on from the prior patch, why would "connected" be false when
> a CLOSE event happens?
>
>
> In OPEN event handling, vhost_user_blk_connect calls vhost_dev_init and
> encounters
> an error such that s->connected remains false.
> Next, after the CLOSE event arrives, it is found that s->connected is false,
> so nothing
> is done, but the event handler will be cleaned up in `vhost_user_async_close`
> before the CLOSE event is executed.
>
>
> Got it - I see why the event handler is never re-installed in the code
> as it was before if we fail at get_features. That said, how do you
> explain your comment:
>
>
> OK, I will update the commit message because this code has changed some
> months ago.
>
>
> With this patch, the vhost_user_blk_disconnect will call the
> vhost_dev_cleanup() again, it's safe.
>
>
> I see vhost_dev_cleanup() accessing hdev without even a NULL check. In
> the case we're talking about here I don't think it's a problem because
> if vhost_dev_init() fails, connected will be false and hit the goto
> but I am concerned that there could be double-frees or use-after-frees
> in other cases.
>
>
> OK, you are right, with this patch, the vhost_dev_cleanup will not be
> called multiple times now.
>
> I think there is no need to worry about calling vhost_dev_cleanup multiple
> times,
> because historically vhost_dev_cleanup has been allowed to be called multiple
> times, and looking at the code, it can be found that calling vhost_dev_cleanup
> multiple times is indeed safe.
>
> Look this patch:
>
> commit e0547b59dc0ead4c605d3f02d1c8829630a1311b
> Author: Marc-André Lureau <marcandre.lur...@redhat.com>
> Date: Wed Jul 27 01:15:02 2016 +0400
>
> vhost: make vhost_dev_cleanup() idempotent
>
> It is called on multiple code path, so make it safe to call several
> times (note: I don't remember a reproducer here, but a function called
> 'cleanup' should probably be idempotent in my book)
>
> Signed-off-by: Marc-André Lureau <marcandre.lur...@redhat.com>
> Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>
> Thanks,
> Li
>
>
> Thanks,
> Li
>
>
>
> All vhost-user devices have this issue, including vhost-user-blk/scsi.
>
> Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
>
> Signed-off-by: Li Feng <fen...@smartx.com>
> ---
> hw/block/vhost-user-blk.c | 3 ++-
> hw/scsi/vhost-user-scsi.c | 3 ++-
> hw/virtio/vhost-user-base.c | 3 ++-
> hw/virtio/vhost-user.c | 10 +---------
> 4 files changed, 7 insertions(+), 12 deletions(-)
>
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index 41d1ac3a5a..c6842ced48 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -353,7 +353,7 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
> VHostUserBlk *s = VHOST_USER_BLK(vdev);
>
> if (!s->connected) {
> - return;
> + goto done;
> }
> s->connected = false;
>
> @@ -361,6 +361,7 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
>
> vhost_dev_cleanup(&s->dev);
>
> +done:
> /* Re-instate the event handler for new connections */
> qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
> NULL, dev, NULL, true);
> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
> index 48a59e020e..b49a11d23b 100644
> --- a/hw/scsi/vhost-user-scsi.c
> +++ b/hw/scsi/vhost-user-scsi.c
> @@ -181,7 +181,7 @@ static void vhost_user_scsi_disconnect(DeviceState *dev)
> VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>
> if (!s->connected) {
> - return;
> + goto done;
> }
> s->connected = false;
>
> @@ -189,6 +189,7 @@ static void vhost_user_scsi_disconnect(DeviceState *dev)
>
> vhost_dev_cleanup(&vsc->dev);
>
> +done:
> /* Re-instate the event handler for new connections */
> qemu_chr_fe_set_handlers(&vs->conf.chardev, NULL, NULL,
> vhost_user_scsi_event, NULL, dev, NULL, true);
> diff --git a/hw/virtio/vhost-user-base.c b/hw/virtio/vhost-user-base.c
> index 4b54255682..11e72b1e3b 100644
> --- a/hw/virtio/vhost-user-base.c
> +++ b/hw/virtio/vhost-user-base.c
> @@ -225,13 +225,14 @@ static void vub_disconnect(DeviceState *dev)
> VHostUserBase *vub = VHOST_USER_BASE(vdev);
>
> if (!vub->connected) {
> - return;
> + goto done;
> }
> vub->connected = false;
>
> vub_stop(vdev);
> vhost_dev_cleanup(&vub->vhost_dev);
>
> +done:
> /* Re-instate the event handler for new connections */
> qemu_chr_fe_set_handlers(&vub->chardev,
> NULL, NULL, vub_event,
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index c929097e87..c407ea8939 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -2781,16 +2781,8 @@ typedef struct {
> static void vhost_user_async_close_bh(void *opaque)
> {
> VhostAsyncCallback *data = opaque;
> - struct vhost_dev *vhost = data->vhost;
>
> - /*
> - * If the vhost_dev has been cleared in the meantime there is
> - * nothing left to do as some other path has completed the
> - * cleanup.
> - */
> - if (vhost->vdev) {
> - data->cb(data->dev);
> - }
> + data->cb(data->dev);
>
> g_free(data);
> }
> --
> 2.45.0
>
>