On Mon, Feb 5, 2024 at 2:49 PM Kevin Wolf <kw...@redhat.com> wrote:
>
> Am 05.02.2024 um 13:22 hat Eugenio Perez Martin geschrieben:
> > On Fri, Feb 2, 2024 at 2:25 PM Kevin Wolf <kw...@redhat.com> wrote:
> > >
> > > VDUSE requires that virtqueues are first enabled before the DRIVER_OK
> > > status flag is set; with the current API of the kernel module, it is
> > > impossible to enable the opposite order in our block export code because
> > > userspace is not notified when a virtqueue is enabled.
> > >
> > > This requirement also matches the normal initialisation order as done by
> > > the generic vhost code in QEMU. However, commit 6c482547 accidentally
> > > changed the order for vdpa-dev and broke access to VDUSE devices with
> > > this.
> > >
> > > This changes vdpa-dev to use the normal order again and use the standard
> > > vhost callback .vhost_set_vring_enable for this. VDUSE devices can be
> > > used with vdpa-dev again after this fix.
> > >
> > > Cc: qemu-sta...@nongnu.org
> > > Fixes: 6c4825476a4351530bcac17abab72295b75ffe98
> > > Signed-off-by: Kevin Wolf <kw...@redhat.com>
> > > ---
> > >  hw/virtio/vdpa-dev.c   |  5 +----
> > >  hw/virtio/vhost-vdpa.c | 17 +++++++++++++++++
> > >  2 files changed, 18 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> > > index eb9ecea83b..13e87f06f6 100644
> > > --- a/hw/virtio/vdpa-dev.c
> > > +++ b/hw/virtio/vdpa-dev.c
> > > @@ -253,14 +253,11 @@ static int vhost_vdpa_device_start(VirtIODevice *vdev, Error **errp)
> > >
> > >      s->dev.acked_features = vdev->guest_features;
> > >
> > > -    ret = vhost_dev_start(&s->dev, vdev, false);
> > > +    ret = vhost_dev_start(&s->dev, vdev, true);
> > >      if (ret < 0) {
> > >          error_setg_errno(errp, -ret, "Error starting vhost");
> > >          goto err_guest_notifiers;
> > >      }
> > > -    for (i = 0; i < s->dev.nvqs; ++i) {
> > > -        vhost_vdpa_set_vring_ready(&s->vdpa, i);
> > > -    }
> > >      s->started = true;
> > >
> > >      /*
> > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > index 3a43beb312..c4574d56c5 100644
> > > --- a/hw/virtio/vhost-vdpa.c
> > > +++ b/hw/virtio/vhost-vdpa.c
> > > @@ -904,6 +904,22 @@ int vhost_vdpa_set_vring_ready(struct vhost_vdpa *v, unsigned idx)
> > >      return r;
> > >  }
> > >
> > > +static int vhost_vdpa_set_vring_enable(struct vhost_dev *dev, int enable)
> > > +{
> > > +    struct vhost_vdpa *v = dev->opaque;
> > > +    unsigned int i;
> > > +    int ret;
> > > +
> > > +    for (i = 0; i < dev->nvqs; ++i) {
> > > +        ret = vhost_vdpa_set_vring_ready(v, i);
> > > +        if (ret < 0) {
> > > +            return ret;
> > > +        }
> > > +    }
> > > +
> > > +    return 0;
> > > +}
> > > +
> > >  static int vhost_vdpa_set_config_call(struct vhost_dev *dev,
> > >                                        int fd)
> > >  {
> > > @@ -1524,6 +1540,7 @@ const VhostOps vdpa_ops = {
> > >          .vhost_set_features = vhost_vdpa_set_features,
> > >          .vhost_reset_device = vhost_vdpa_reset_device,
> > >          .vhost_get_vq_index = vhost_vdpa_get_vq_index,
> > > +        .vhost_set_vring_enable = vhost_vdpa_set_vring_enable,
> > >          .vhost_get_config = vhost_vdpa_get_config,
> > >          .vhost_set_config = vhost_vdpa_set_config,
> > >          .vhost_requires_shm_log = NULL,
> >
> > vhost-vdpa net enables CVQ before dataplane ones to configure all the
> > device in the destination of a live migration. To go back again to
> > this callback would cause the device to enable the dataplane before
> > virtqueues are configured again.
>
> Not that it makes a difference, but I don't think it would actually be
> going back.
> Even before your commit 6c482547, we were not making use of
> the generic callback but just called the function in a slightly
> different place (but less different than after commit 6c482547).
>
> But anyway... Why don't the other vhost backends need the same for
> vhost-net then? Do they just not support live migration?
>
They don't support a control virtqueue. More specifically, the control
virtqueue is handled directly in QEMU.

> I don't know the code well enough to say where the problem is, but if
> vhost-vdpa networking code relies on the usual vhost operations not
> being implemented and bypasses VhostOps to replace it, that sounds like
> a design problem to me.

I don't follow this. Which vhost operation is expected not to be
implemented?

> Maybe VhostOps needs a new operation to enable
> just a single virtqueue that can be used by the networking code instead?
>
> > How does VDUSE userspace know how many queues it should enable? Can't
> > the kernel perform the necessary actions after DRIVER_OK, like
> > configuring the kick etc?
>
> Not sure if I understand the question. The vdpa kernel interface always
> enables individual queues, so the VDUSE userspace will enable whatever
> queues it was asked to enable. The only restriction is that the queues
> need to be enabled before setting DRIVER_OK.
>
> The interface that enables all virtqueues at once seems to be just
> .vhost_set_vring_enable in QEMU.
>

It enables all virtqueues of the same vhost device in QEMU, but QEMU
creates one vhost_dev per vhost-net virtqueue pair and another one for
CVQ. This goes back to the time when multiqueue vhost-kernel net devices
were implemented by merging many tap devices. net/vhost-vdpa.c only
enables the last vhost_dev, which corresponds to CVQ, and then enables
the rest once all the CVQ messages have been received.

On the other hand, .vhost_set_vring_enable is also used for queue reset
(vhost_net_virtqueue_reset and vhost_net_virtqueue_restart), in other
words, after DRIVER_OK has been set. I guess that is fine for VDUSE as
long as it does not offer vring reset capabilities, but looking forward,
should we start going in that direction?

But kernels without VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK should be
supported too, right? Maybe we can add vhost_vdpa_set_vring_enable and
make the vrings parameter of vhost_dev_start conditional on that feature
flag? That way the caller can either enable the rings later itself or
have them all enabled at start.

Thanks!
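
PS: to make the last paragraph a bit more concrete, this is roughly what
I have in mind. It is only a sketch, not compile-tested, and the field
and macro names (dev->backend_cap, BIT_ULL,
VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK) are written from memory, so
please double-check them:

    /*
     * Sketch: let vhost_dev_start() enable the vrings only when the
     * backend did not ack VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK.
     */
    static bool vhost_vdpa_enable_vrings_on_start(struct vhost_dev *dev)
    {
        /*
         * Without the feature (e.g. VDUSE today) the rings must be
         * enabled before DRIVER_OK, so vhost_dev_start() has to do it.
         * With the feature, the caller can enable them later in
         * whatever order it needs (e.g. CVQ first in net/vhost-vdpa.c).
         */
        return !(dev->backend_cap &
                 BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK));
    }

and then in vhost_vdpa_device_start():

    ret = vhost_dev_start(&s->dev, vdev,
                          vhost_vdpa_enable_vrings_on_start(&s->dev));

That way VDUSE-style backends still get all rings enabled before
DRIVER_OK, while callers that need a special ordering can keep enabling
the rings themselves when the backend offers the feature.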