Re: [Virtio-fs] (no subject)

Hanna Czenczek Tue, 10 Oct 2023 06:19:51 -0700

On 10.10.23 12:36, Alex Bennée wrote:

Hanna Czenczek <hre...@redhat.com> writes:

On 10.10.23 06:00, Yajun Wu wrote:

On 10/9/2023 5:13 PM, Hanna Czenczek wrote:

External email: Use caution opening links or attachments


On 09.10.23 11:07, Hanna Czenczek wrote:

On 09.10.23 10:21, Hanna Czenczek wrote:

On 07.10.23 04:22, Yajun Wu wrote:

[...]

<snip>

So as far as I understand, the feature is supposed to rely on
implementation-specific behavior between specifically qemu as a
front-end and dpdk as a back-end, nothing else.  Honestly, that to me
is a very good reason to deprecate it.  That would make it clear that
any implementation that implements it does so because it relies on
implementation-specific behavior from other implementations.

Option 2 is to fix it.  It is not right to use this broadly defined
feature with its clear protocol as given in the virtio specification
just to set and clear a single bit (DRIVER_OK).  The vhost-user
specification points to that virtio protocol.  We must adhere to the
protocol.  And note that we must not reset devices just because the VM
is paused/resumed.  (That is why I wanted to deprecate SET_STATUS, so
that Stefan’s series would introduce RESET_DEVICE where we need it,
and we can (for now) ignore the SET_STATUS 0 in vhost_dev_stop().)

Option 3 would be to just be honest in the specification, and limit
the scope of F_STATUS to say the only bit that matters is DRIVER_OK.
I would say this is not really different from deprecating, though it
wouldn’t affect your case.  However, I understand Alex relies on a
full status byte.  I’m still interested to know why that is.

For an F_TRANSPORT backend (or whatever the final name ends up being) we
need the backend to have full control of the status byte because all the
handling of VirtIO is deferred to it. Therefor it has to handle all the
feature negotiation and indicate when the device needs resetting.

(side note: feature negotiation is another slippery area when QEMU gets
involved in gating which feature bits may or may not be exposed to the
backend. The only one it should ever mask is F_UNUSED which is used
(sic) to trigger the vhost protocol negotiation)

That’s the thing, feature negotiation is done with GET_FEATURES andSET_FEATURES. Configuring F_REPLY_ACK lets SET_FEATURES return errors.

Indicating that the device needs reset is a good point, there is noother feature to do that. (And something qemu currently ignores, justlike any value the device returns through GET_STATUS, but that’s besidesthe point.)

Option 4 is of course not to do anything, and leave everything as-is,
waiting for the next person to stir the hornet’s nest.

Cc-ing Alex on this mail, because to me, this seems like an important
detail when he plans on using the byte in the future. If we need a
virtio status byte, I can’t see how we could use the existing F_STATUS
for it.

What would we use instead of F_STATUS to query the Device Status field?

We would emulate it in the front-end, just like we need to do forback-ends without F_STATUS. We can’t emulate the DEVICE_NEEDS_RESETbit, though, that’s correct.

Given that qemu currently ignores DEVICE_NEEDS_RESET, I’m not 100 %convinced that your use case has a hard dependency on F_STATUS. However,this still does make a fair point in general that it would be useful tokeep it.

That still leaves us with the situation that currently, the onlyimplementations with F_STATUS support are qemu and dpdk, which bothhandle it incorrectly. Furthermore, the specification leaves much to bedesired, specifically in how F_STATUS interacts with other vhost-usercommands (which is something I cited as a reason for my original patch),i.e. whether RESET_DEVICE and SET_STATUS 0 are equivalent, and whetherfailures in feature negotiation must result in both SET_FEATURESreturning an error (with F_REPLY_ACK), and FEATURES_OK being reset inthe status byte, or whether either is sufficient. What happens whenDEVICE_NEEDS_RESET is set, i.e. do we just need RESET_DEVICE /SET_STATUS 0, or do we also need to reset some protocol state? (This isalso connected to the fact that what happens on RESET_DEVICE is largelyundefined, which I said on Stefan’s series.)

In general, because we have our own transport, we should make a note howit interacts with the status negotiation phases, i.e. that GET_FEATURESmust not be called before S_ACKNOWLEDGE | S_DRIVER are set, thatFEATURES_OK must be set after the SET_FEATURES call, and that DRIVER_OKmust not be set without FEATURES_OK set / SET_FEATURES having returnedsuccess. Here we would also answer the question about the interactionof F_REPLY_ACK+SET_FEATURES with F_STATUS, specifically whether animplementation with F_REPLY_ACK even needs to read back the status byteafter setting FEATURES_OK because it could have got the featurenegotiation result already as a result of the SET_FEATURES call.

After migration, can you just set all flags immediately or do we need tofollow this step-by-step protocol? I think we do need to do itstep-by-step, mostly for simplicity in the back-end, i.e. that it justsees a normal device start-up.

We should also clarify whether SET_STATUS can fail, i.e. whether settingan invalid status (is setting FEATURES_OK when the device doesn’t thinkso invalid?) has SET_STATUS fail (with F_REPLY_ACK) and/or immediatelygets the device into DEVICE_NEEDS_RESET.

We should clarify whether SET_STATUS can block. The current use ofDRIVER_OK seems to indicate to me that dpdk does do time-consumingoperations when it sees DRIVER_OK (code looks like it, too) and onlyreturns when that’s done, but naïvely, I would expect SET_STATUS to bejust setting some value and doing whatever needs to be done in thebackground, not actually launching and blocking on an operation.

I think it is dangerous to just push ahead with using F_STATUS withoutacknowledging that its implementation is broken right now, and that itis so *on purpose* because the DRIVER_OK bit is the only thing that itwas supposed to be used for. Using it for its purported original use(actually the virtio status byte) is contradictory to that. It’sprobably fixable, but I think it requires taking a step back and seeingwhat needs to be done to remedy the conflict. If you rip out all theexisting STATUS code and replace it such that qemu will let the guesthave full control over the status byte (except for migration, where werestore it on the destination, which will result in DRIVER_OK being setat the end, fulfilling that requirement), that will fix theimplementation in qemu. I think. But the specification should beamended to think about all these corner cases, not least because I thinkthey will also affect your implementation.

(The answers to many of the questions I raise for documentation may beobvious to you, based on “in virtio, it’s just an MMIO byte that’swritten and read, so the rest follows from there”. But evidently theimplementation we have kind of ignores that e.g. SET_STATUS 0 is a reset(6f8be29ec17d44496b9ed67599bceaaba72d1864 is a work-around, not muchmore) or that there is actually a protocol to setting the status flagsand you can’t just set them all at once, so I don’t think the answersare immediately obvious, and should be documented.)

As for me and the original patch: I claimed nobody really needsF_STATUS, you say you do, so plainly, I assumed wrong and will naturallytake my hands off of F_STATUS and just ensure not to implement it in anyback-end until you’ve fixed it, as Yajun has advised. I’d still prefermentioning this advice in the documentation until it’s fixed, but, youknow, I wouldn’t be the first one to say “I now know about the quirk, soI can work around it, no need to tell anyone else as long as my stuffworks”.


Hanna

Re: [Virtio-fs] (no subject)

Reply via email to