Dor Laor wrote:
> On Sat, 2008-03-29 at 16:55 -0500, Anthony Liguori wrote:
>   
>> This patch introduces virtio support over PCI.  virtio is a generic
>> virtual IO framework for Linux first introduced in 2.6.23.  Since 2.6.25,
>> virtio has supported a PCI transport which this patch implements.
>>
>> Since the last time these patches were posted to qemu-devel, I've reworked it
>> to use the proper access functions to manipulate guest memory.
>>
>> Signed-off-by: Anthony Liguori <[EMAIL PROTECTED]>
>>     
>
> It will be great to drop the nasty hacks :)
> Do you still get 1G net performance using the extra copy from tap
> (memcpy_to_iovector)?
>   

We take a bit of a hit (probably 10-20%) doing the copy.  The "tap hacks" 
require a more invasive set of patches to refactor the VLAN support in 
QEMU.  The fundamental problem with tap is that it only supports one tap 
device per VLAN.  What we really want is a VLAN where each VLAN client 
has its own tap device.  This is also necessary to properly support the 
upcoming TAP ring queue work along with GSO.
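Roughly, the copy path in question looks something like the sketch below 
(the actual memcpy_to_iovector in the patch may differ in signature and 
details; this is just to show where the extra pass over the data happens):

#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Sketch: scatter a linearly received packet into the guest's iovec.
 * The packet has already been read from the tap fd into 'buf', which is
 * the extra copy being discussed. */
static size_t memcpy_to_iovector(struct iovec *iov, int iovcnt,
                                 const void *buf, size_t len)
{
    size_t offset = 0;
    int i;

    for (i = 0; i < iovcnt && offset < len; i++) {
        size_t n = len - offset;
        if (n > iov[i].iov_len)
            n = iov[i].iov_len;
        memcpy(iov[i].iov_base, (const char *)buf + offset, n);
        offset += n;
    }

    return offset; /* bytes actually copied into the guest buffers */
}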

That patch set is why I revisited this one in fact :-)  Once we get 
virtio merged, I'll then send out the VLAN refactoring.  The nice thing 
though is that once we have the VLAN refactoring, we can optimize the 
e1000 device to make use of it.

> [snip]
>
>   
>> +static uint32_t vring_desc_len(VirtQueue *vq, unsigned int i)
>> +{
>>     
>
> Below there were places where you used offsetof(vq->vring.desc[i], len), so
> we'd better be consistent + it's nicer.
>
>   
>> +    return ldl_phys(vq->vring.desc + i * sizeof(VRingDesc) + 
>> +                offsetof(VRingDesc, len));
>> +}
>>     

Yup, I just missed this one.  Thanks for the catch!
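
Whichever style we settle on, the important thing is that all of the 
descriptor accessors read the same way.  For example (sketch only; the 
desc_pa() helper is invented here and the field types are assumptions), 
routing them all through one address helper keeps them from drifting apart:

/* Hypothetical helper: guest-physical address of a given field of
 * descriptor i.  Not part of the patch. */
static target_phys_addr_t desc_pa(VirtQueue *vq, unsigned int i, size_t field)
{
    return vq->vring.desc + i * sizeof(VRingDesc) + field;
}

static uint64_t vring_desc_addr(VirtQueue *vq, unsigned int i)
{
    return ldq_phys(desc_pa(vq, i, offsetof(VRingDesc, addr)));
}

static uint32_t vring_desc_len(VirtQueue *vq, unsigned int i)
{
    return ldl_phys(desc_pa(vq, i, offsetof(VRingDesc, len)));
}

static uint16_t vring_desc_flags(VirtQueue *vq, unsigned int i)
{
    return lduw_phys(desc_pa(vq, i, offsetof(VRingDesc, flags)));
}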

>> +VirtQueueElement *virtqueue_pop(VirtQueue *vq)
>> +{
>> +    unsigned int i, head;
>> +    unsigned int position;
>> +    VirtQueueElement *elem;
>> +
>> +    /* Check it isn't doing very strange things with descriptor numbers. */
>> +    if ((uint16_t)(vring_avail_idx(vq) - vq->last_avail_idx) > vq->vring.num)
>> +        errx(1, "Guest moved used index from %u to %u",
>> +             vq->last_avail_idx, vring_avail_idx(vq));
>> +
>> +    /* If there's nothing new since last we looked, return invalid. */
>> +    if (vring_avail_idx(vq) == vq->last_avail_idx)
>> +        return NULL;
>> +
>> +    /* Grab the next descriptor number they're advertising, and increment
>> +     * the index we've seen. */
>> +    head = vring_avail_ring(vq, vq->last_avail_idx++ % vq->vring.num);
>> +
>> +    /* If their number is silly, that's a fatal mistake. */
>> +    if (head >= vq->vring.num)
>> +        errx(1, "Guest says index %u is available", head);
>> +
>> +    /* When we start there are none of either input nor output. */
>> +    position = 0;
>> +
>> +    elem = qemu_mallocz(sizeof(VirtQueueElement));
>> +
>> +    elem->phys_in = qemu_mallocz(sizeof(PhysIOVector) +
>> +                             vq->vring.num * sizeof(PhysIOVectorElement));
>> +    elem->phys_out = qemu_mallocz(sizeof(PhysIOVector) +
>> +                              vq->vring.num * sizeof(PhysIOVectorElement));
>>     
>
> I was wondering whether this can be optimized, since vring.num is sometimes
> 512; we could either use a pool of these or calculate the needed count from
> the descriptors, but it seems like your way is the best.
>   

My thinking right now is to use qemu_mallocz() for everything and then 
we can go back and optimize with pooling if necessary.
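
If we do end up pooling, it probably doesn't need to be fancier than a 
per-size free list, along these lines (sketch only; none of these names 
exist in the patch):

#include <stddef.h>
#include <string.h>

/* Sketch: freed chunks are threaded onto a free list through their first
 * word and reused instead of hitting qemu_mallocz() on every pop.  All
 * chunks in one pool share a single size. */
typedef struct ChunkPool {
    void *free_list;
    size_t chunk_size;
} ChunkPool;

static void *pool_alloc(ChunkPool *pool)
{
    void *p = pool->free_list;

    if (p) {
        pool->free_list = *(void **)p;   /* unlink from the free list */
        memset(p, 0, pool->chunk_size);  /* preserve mallocz semantics */
        return p;
    }
    return qemu_mallocz(pool->chunk_size);
}

static void pool_free(ChunkPool *pool, void *p)
{
    *(void **)p = pool->free_list;       /* link onto the free list */
    pool->free_list = p;
}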

>> +
>> +    i = head;
>> +    do {
>> +        PhysIOVectorElement *sge;
>> +
>> +        if (vring_desc_flags(vq, i) & VRING_DESC_F_WRITE)
>> +            sge = &elem->phys_in->sg[elem->phys_in->num++];
>> +        else
>> +            sge = &elem->phys_out->sg[elem->phys_out->num++];
>> +
>> +        /* Grab the first descriptor, and check it's OK. */
>> +        sge->len = vring_desc_len(vq, i);
>> +        sge->base = vring_desc_addr(vq, i);
>> +
>> +        /* If we've got too many, that implies a descriptor loop. */
>> +        if ((elem->phys_in->num + elem->phys_out->num) > vq->vring.num)
>> +            errx(1, "Looped descriptor");
>> +    } while ((i = virtqueue_next_desc(vq, i)) != vq->vring.num);
>> +
>> +    elem->virt_in = pci_device_dma_map(&vq->vdev->pci_dev, elem->phys_in);
>> +    elem->virt_out = pci_device_dma_map(&vq->vdev->pci_dev, elem->phys_out);
>> +    elem->index = head;
>> +
>> +    if (elem->virt_in == NULL || elem->virt_out == NULL)
>> +        errx(1, "Bad DMA");
>> +
>> +    return elem;
>> +}
>> +
>> +
>>     
>
> The name below is a bit misleading since when enable is true you
> actually set no_notify.
> So I'd name it something like virtio_vring_set_no_notify(...) or similar.
>   

Yeah, that's not a bad suggestion.
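
Something like this, then (behaviour identical to the function quoted below, 
just renamed along the lines you suggested):

void virtio_vring_set_no_notify(VirtQueue *vq, int no_notify)
{
    if (no_notify)
        vring_used_set_flag(vq, VRING_USED_F_NO_NOTIFY);
    else
        vring_used_unset_flag(vq, VRING_USED_F_NO_NOTIFY);
}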

Thanks,

Anthony Liguori

>> +void virtio_ring_set_used_notify(VirtQueue *vq, int enable)
>> +{
>> +    if (enable)
>> +        vring_used_set_flag(vq, VRING_USED_F_NO_NOTIFY);
>> +    else
>> +        vring_used_unset_flag(vq, VRING_USED_F_NO_NOTIFY);
>> +}
>> +
>>     
>
> Cheers,
> Dor
>

