Hi,

Thank you for your reply.

On Wednesday, June 19, 2024 1:07:54 PM GMT+5:30 Eugenio Perez Martin wrote:
> [...]
> > "curr" is being updated here, but descs[i].id is always set to id which
> > doesn't change in the loop. So all the descriptors in the chain will have
> > the same id. I can't find anything in the virtio specification [1] that
> > suggests that all descriptors in the chain have the same id. Also, going
> > by the figure captioned "Three chained descriptors available" in the blog
> > post on packed virtqueues [2], it looks like the descriptors in the chain
> > have different buffer ids.
> > 
> > The virtio implementation in Linux also reuses the same id value for all
> > the descriptors in a single chain. I am not sure if I am missing
> > something here.
> 
> The code is right, the id that identifies the whole chain is just the
> one on the last descriptor. The key is that all the tail descriptors
> of the chains will have a different id, the rest ids are ignored so it
> is easier this way. I got it wrong in a recent mail in the list, where
> you can find more information. Let me know if you cannot find it :).

I found the mail here [1] :)

> In the split vq is different as a chained descriptor can go back and
> forth in the descriptor ring with the next id. So all of them must be
> different. But in the packed vq, the device knows the next descriptor
> is placed at the next entry in the descriptor ring, so the only
> important id is the last one.

Ok, this makes sense now.
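
To double-check my understanding, this is how I picture a chain being
written into the packed ring (a standalone sketch following the virtio 1.1
layout rather than QEMU's helpers; ring wrap-around, endianness and the
flag write ordering are ignored here):

#include <stdint.h>

#define VRING_DESC_F_NEXT 1    /* value from the virtio spec */

struct vring_packed_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
};

void write_chain(struct vring_packed_desc *ring, uint16_t start,
                 const uint64_t *addr, const uint32_t *len, uint16_t num,
                 uint16_t chain_id, uint16_t avail_used_flags)
{
    for (uint16_t n = 0; n < num; n++) {
        ring[start + n].addr  = addr[n];
        ring[start + n].len   = len[n];
        ring[start + n].id    = chain_id;  /* only read from the tail */
        ring[start + n].flags = avail_used_flags |
                                (n + 1 < num ? VRING_DESC_F_NEXT : 0);
    }
    /* The device consumes the chain as one buffer and writes back a
     * single used element whose id is chain_id. */
}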

> > > +        if (++i >= svq->vring_packed.vring.num) {
> > > +            i = 0;
> > > +            svq->vring_packed.avail_used_flags ^=
> > > +                    1 << VRING_PACKED_DESC_F_AVAIL |
> > > +                    1 << VRING_PACKED_DESC_F_USED;
> > > +        }
> > > +    }
> > > +
> > > +    if (i <= head) {
> > > +        svq->vring_packed.avail_wrap_counter ^= 1;
> > > +    }
> > > +
> > > +    svq->vring_packed.next_avail_idx = i;
> > > +    svq->free_head = curr;
> > 
> > Even though the same id is used, curr will not be id+1 here.
> 
> curr is not the descriptor index, but the id. They're used in a stack
> format: One available chain pops an id and one used id pushes its id
> in the stack.
> 
> Maybe I'm wrong, but I think the main reason is to reuse the same
> memory region of the descriptor state etc so less memory is changed to
> be used in all the operations.

Right, curr is the id. I didn't really understand the popping and pushing
part.

In the implementation, the possible ids are stored in svq->desc_next, which
is initialized so that the id following a given id is (id + 1) % queue_size.

By the following line:
> Even though the same id is used, curr will not be id+1 here.

I meant that if, for example, there is a chain of 3 descriptors and the
current id is 1, then all 3 descriptors will have 1 as their id. If the vring
size is 5, the value of curr after the loop (i.e., the id that the 4th
descriptor will use) will be 4 instead of 2.
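
Here is a small standalone program (not QEMU code) with the numbers from
that example, just to show the bookkeeping I described above:

#include <stdio.h>

#define RING_SIZE 5

int main(void)
{
    unsigned desc_next[RING_SIZE];
    unsigned free_head = 1, curr, id;

    /* The id that follows i is (i + 1) % RING_SIZE. */
    for (unsigned i = 0; i < RING_SIZE; i++) {
        desc_next[i] = (i + 1) % RING_SIZE;
    }

    id = free_head;                    /* id stamped on every descriptor */
    curr = id;
    for (unsigned n = 0; n < 3; n++) { /* chain of 3 descriptors */
        /* descs[i].id = id for each of the 3 ring slots ... */
        curr = desc_next[curr];        /* ... but curr keeps advancing */
    }
    free_head = curr;

    /* Prints "chain id = 1, next free_head = 4". */
    printf("chain id = %u, next free_head = %u\n", id, free_head);
    return 0;
}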

> > > +    return true;
> > > +}
> > > +
> > > +static bool vhost_svq_add_packed(VhostShadowVirtqueue *svq,
> > > +                                 const struct iovec *out_sg, size_t out_num,
> > > +                                 const struct iovec *in_sg, size_t in_num,
> > > +                                 unsigned *head)
> > > +{
> > > +    bool ok;
> > > +    uint16_t head_flags = 0;
> > > +    g_autofree hwaddr *sgs = g_new(hwaddr, out_num + in_num);
> > 
> > I chose to use out_num + in_num as the size instead of MAX(out_num,
> > in_num). I found it easier to implement
> > "vhost_svq_vring_write_descs_packed()" like this. Please let me know if
> > this isn't feasible or ideal.
> 
> Not a big deal, I picked the MAX just because it is all the
> hwaddresses the function needs at the same time. Addition should work
> too, and AFAIK chains are usually short. We should get rid of this
> dynamic allocation in the future anyway.

Ok, understood.

> [...]
> > In "struct VhostShadowVirtqueue", I rearranged the order in which some
> > members appear. I tried to keep the members common to split and packed
> > virtqueues above the union and the rest below the union. I haven't
> > entirely understood the role of some of the members (for example,
> > VhostShadowVirtqueueOps *ops). I'll change this ordering if need be as I
> > continue to understand them better.
> 
> That's fine, but do it in a separate patch for future series, so it is
> easier to review.

Sure, I'll do that.

> ops is used when a kind of device wants specialized handling for the
> descriptors forwarded. vdpa-net uses it when QEMU also needs to
> inspect the descriptors. Feel free to ask more about it, but adding
> packed format to SVQ should not affect the ops member.

Got it. I don't have any other questions related to ops.

> > For the next step, I think I should work on "vhost_svq_start()" which is
> > where members of the struct are actually initialized. At the moment, only
> > the split ring part of the structure is initialized.
> 
> Sounds reasonable. My recommendation is to mimic the patches of the
> kernel, doing a git log and following that order. You also need to
> apply the fixes in the history from that moment.
> 
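
For the packed ring, the state I think vhost_svq_start() will have to set
up looks roughly like this toy sketch (standalone, mirroring the kernel
driver's initialization; the field names are the ones from this patch, and
mapping the rings themselves is left out):

#include <stdbool.h>
#include <stdint.h>

#define VRING_PACKED_DESC_F_AVAIL 7    /* bit positions from the spec */
#define VRING_PACKED_DESC_F_USED  15
#define TOY_QUEUE_SIZE 256

struct toy_svq_packed {
    uint16_t num;
    uint16_t next_avail_idx;
    bool avail_wrap_counter;
    uint16_t avail_used_flags;
    uint16_t free_head;
    uint16_t desc_next[TOY_QUEUE_SIZE];
};

void toy_svq_packed_start(struct toy_svq_packed *svq, uint16_t num)
{
    svq->num = num;
    svq->next_avail_idx = 0;
    /* The driver's wrap counter starts at 1, so descriptors are made
     * available with AVAIL = 1 and USED = 0. */
    svq->avail_wrap_counter = true;
    svq->avail_used_flags = 1 << VRING_PACKED_DESC_F_AVAIL;
    /* Free id list: the id after i is (i + 1) % num. */
    svq->free_head = 0;
    for (uint16_t i = 0; i < num; i++) {
        svq->desc_next[i] = (i + 1) % num;
    }
}
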
> > I think I should also start working on enhancing "vhost_svq_kick()" to
> > actually send the buffers to the device. I think it'll be easier to test
> > these changes once that's done (I am not sure about this though). Would
> > this involve implementing the notification mechanism and event_idx?
> 
> You can start omitting event_idx if you disable it from the device or
> from qemu's commandline with event_idx=off. If you use vdpa_sim, it's
> easier to remove it using qemu's cmdline in my opinion. Also, there is
> a needs_kick boolean that you can set to always true for testing
> purposes, since it is valid to send extra notifications. I think you
> don't need to modify anything else from vhost_svq_kick to test the
> device receives the buffer, but let me know if you find problems.

I'll take a look at the kernel patches, and I'll also try testing these
changes with vdpa_sim to check whether the device receives the buffers.
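
The setup I have in mind is roughly the following (written from memory, so
the exact module and option names may need adjusting):

modprobe vdpa_sim_net
vdpa dev add name vdpa0 mgmtdev vdpasim_net

qemu-system-x86_64 ... \
    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0,x-svq=on \
    -device virtio-net-pci,netdev=vhost-vdpa0,packed=on,event_idx=off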

Thanks,
Sahil

[1] https://lists.nongnu.org/archive/html/qemu-devel/2024-05/msg01843.html


