On Mon, May 27, 2024 at 01:27:10PM GMT, Halil Pasic wrote:
On Thu, 16 May 2024 10:39:42 +0200
Stefano Garzarella <sgarz...@redhat.com> wrote:
[..]
>---
>
>This is a minimal fix, that follows the current patterns in the
>codebase, and not necessarily the best one.
Yeah, I did something similar with commit 562a7d23bf ("vhost: mask
VIRTIO_F_RING_RESET for vhost and vhost-user devices") so I think for
now is the right approach.
I suggest to check also other devices like we did in that commit (e.g.
hw/scsi/vhost-scsi.c, hw/scsi/vhost-user-scsi.c, etc. )
Hi Stefano!
Thank you for chiming in, and sorry for the late response. I was hoping
that Michael is going to chime in and that I can base my reply on his
take. Anyway here I go.
A very valid observation! I do agree that we need this for
basically every vhost device, and since:
* net/vhost-vdpa.c
* hw/net/vhost_net.c
* hw/virtio/vhost-user-fs.c
already have it, that translates to shotgun it to the rest. Which
isn't nice in my opinion, which is why I am hoping for a discussion
on this topic, and a better solution (even if it turns out to be
something like a common macro).
Yeah, I see your point and I agree on a better solution.
[..]
>
>The documentation however does kind of state, that feature_bits is
>supposed to contain the supported features. And under the assumption
>that feature bit not in feature_bits implies that the corresponding bit
>must not be set in the 3rd argument (features), then even with the
>current implementation we do end up with the intersection of the three
>as stated. And then vsock would be at fault for violating that
>assumption, and my fix would be the best thing to do -- I guess.
>
>Is the implementation the way it is for a good reason, I can't judge
>that with certainty for myself.
Yes, I think we should fix the documentation, and after a few years of
not looking at it I'm confused again about what it does.
I would prefer to fix the algorithm and make whole thing less fragile.
But re-reading my commit for VIRTIO_F_RING_RESET, it seems that I had
interpreted `feature_bits` (2nd argument) as a list of features that
QEMU doesn't know how to emulate and therefore are required by the
backend (vhost/vhost-user/vdpa). Because the problem is that `features`
(3rd argument) is a set of features required by the driver that can be
provided by both QEMU and the backend.
Hm. I would say, this does sound like the sanest explanation, that might
justify the current code, but I will argue that for me, it isn't sane
enough.
Here comes my argument.
1) The uses is explicitly asking for a vhost device and giving the user
a non vhost device is not an option.
I didn't get this point :-( can you elaborate?
2) The whole purpose of vhost is that at least the data plane is
implemented outside of QEMU (I am maybe a little sloppy here with
dataplane). That means a rather substantial portion of the device
implementation is not in QEMU, while QEMU remains in charge of the
setup.
Yep
3) Thus I would argue, that all the "transport feature bits" from 24 to
40 should have a corresponding vhost feature because the vhost part needs
some sort of a support.
What do we have there in bits from 24 to 40 according to the spec?
* VIRTIO_F_INDIRECT_DESC
* VIRTIO_F_EVENT_IDX
* VIRTIO_F_VERSION_1
* VIRTIO_F_ACCESS_PLATFORM
* VIRTIO_F_RING_PACKED
* VIRTIO_F_IN_ORDER
* VIRTIO_F_ORDER_PLATFORM
* VIRTIO_F_SR_IOV
* VIRTIO_F_NOTIFICATION_DATA
* VIRTIO_F_NOTIF_CONFIG_DATA
* VIRTIO_F_RING_RESET
and for transitional:
* VIRTIO_F_NOTIFY_ON_EMPTY
* VIRTIO_F_ANY_LAYOUT
* UNUSED
I would say, form these only VIRTIO_F_SR_IOV and
VIRTIO_F_NOTIF_CONFIG_DATA look iffy in a sense things may work out
for vhost devices without the vhost part doing something for it. And
even there, I don't think it would hurt to make vhost part of the
negotiation (I don't think those are supported by QEMU at this point).
I would very much prefer having a consolidated and safe handling for
these.
I completely agree on this!
4) I would also argue that a bunch of the device specific feature bits
should have vhost feature bits as well for the same reason:
features are also such that for a vhost device, the vhost part needs
some sort of a support.
Looking through all of these would require a lot of time, so instead
of that, let me use SCSI as an example. The features are:
* VIRTIO_SCSI_F_INOUT
* VIRTIO_SCSI_F_HOTPLUG
* VIRTIO_SCSI_F_CHANGE
* VIRTIO_SCSI_F_T10_PI
The in the Linux kernel we have
VHOST_SCSI_FEATURES = VHOST_FEATURES | (1ULL << VIRTIO_SCSI_F_HOTPLUG) |
(1ULL << VIRTIO_SCSI_F_T10_PI)
but in QEMU kernel_feature_bits does not have
VIRTIO_SCSI_F_T10_PI which together does not make much sense to me. And I would
also expect VIRTIO_SCSI_F_INOUT to be a part of the negotiation, because
to me that the side that is processing the queue is the one that should
care about it, but I don't think it is supported by QEMU at all, and
then not negotiating it is fine. VIRTIO_SCSI_F_HOTPLUG is in QEMU's
kernel_feature_bits and thus negotiated. in So the only one that can be
done purely in QEMU seems to be VIRTIO_SCSI_F_CHANGE.
And for vsock we have
VIRTIO_VSOCK_F_SEQPACKET and VIRTIO_VSOCK_F_STREAM, with VIRTIO_VSOCK_F_STREAM
being strange in a sense that the spec says "If no feature bit is set,
only stream socket type is supported. If VIRTIO_VSOCK_F_SEQPACKET has
been negotiated, the device MAY act as if VIRTIO_VSOCK_F_STREAM has also
been negotiated."
Yes, this was a problem we introduced with F_SEQPACKET. Unfortunately,
when we added that feature, we didn't define F_STREAM well because the
devices always supported it. So we had to resort to this workaround.
VIRTIO_VSOCK_F_SEQPACKET is negotiated with vhost.
It seems for whatever reason we just assume that the vhost device
supports VIRTIO_VSOCK_F_STREAM and that is why we don't negotiate it. I
would assume based on what I see that the feature VIRTIO_VSOCK_F_STREAM
was retrofitted, and that is what we ended up with. Thus I guess now
we have an implicit rule that any QEMU compatible vhost-vsock
implementation would have to support that. But in practice we care only
about Linux, and thus it does not matter.
Yep, is this, but I have in my todo to fix this somehow.
5) Based on the following, I would very much prefer a per device list of
features with the semantic "hey QEMU can do that feature without any
specialized vhost-device support (e.g. like VIRTIO_SCSI_F_CHANGE)" over
the current list with the semantics "these are the features that
need vhost backend support, and anything else will just work out". That
way if an omission is made, we would end up with the usually safer
under-indication instead of the current over-indication.
Makes sense to me!
@Michael, Jason: Could you guys chime in?
>
>But I'm pretty convinced that the current approach is fragile,
>especially for the feature bits form the range 24 to 40, as those are
>not specific to a device.
>
>BTW vsock also lacks VIRTIO_F_ACCESS_PLATFORM, and VIRTIO_F_RING_RESET
>as well while vhost-net has both.
VIRTIO_F_RING_RESET is just above the line added by this patch.
My bad! :)
>
>If our design is indeed to make the individual devices responsible for
>having a complete list of possible features in feature_bits, then at
>least having a common macro for the non-device specific features would
>make sense to me.
Yeah, I agree on this!
I guess, "possible features" shifted towards "possible features that QEMU
can not provide without the vhost backend".
But some sort of a common list does seem viable. I would however prefer
a can do without list, or maybe even two ("can do without because I do it
myself and alone" and "can do without, because although I must rely on a
vhost capability to provide that feature, the presence of that capability
is not indicated via the feature bits and based on knowing how things are
I'm assuming the capability is there although not specifically
indicated".
>
>On the other hand, I'm also very happy to send a patch which changes the
>behavior of vhost_get_features(), should the community decide that the
>current behavior does not make all that much sense -- I lean towards:
>probably it does not make much sense, but things like
>VIRTIO_F_ACCESS_PLATFORM, which are mandatory feature bits, need careful
>consideration, because there vhost can't do so we just won't offer it
>and proceed on our merry way is not the right behavior.
>
>Please comment!
Maybe we should discuss it in another thread, but I agree that we should
fix it in someway. Thank you for raising this discussion!
Hm, I ended up replying in this thread because we still don't have
closure on this patch. But if you think it makes, feel free to start a
different thread, and please do include me.
I see, this thread is fine!
>
>Regards,
>Halil
>---
> hw/virtio/vhost-vsock-common.c | 1 +
> 1 file changed, 1 insertion(+)
This patch LGTM, but as I mention we should fix other devices as well,
but maybe we can do with the common macro you suggested in another
patch.
I agree, this should be definitely another patch or even series
depending on how the discussion pans out.
Ack!
Thanks,
Stefano