Re: [Qemu-devel] [PATCH v4 07/11] virtio: fill/flush/pop for packed ring

2019-02-19 Thread Wei Xu
On Tue, Feb 19, 2019 at 05:33:57PM +0800, Jason Wang wrote:
> 
> On 2019/2/19 4:21 PM, Wei Xu wrote:
> >On Tue, Feb 19, 2019 at 02:49:42PM +0800, Jason Wang wrote:
> >>On 2019/2/18 10:46 PM, Wei Xu wrote:
> Do we allow chaining more descriptors than the vq size in the case of indirect descriptors?
> According to the spec:
> 
> "
> 
> The device limits the number of descriptors in a list through a
> transport-specific and/or device-specific value. If not limited,
> the maximum number of descriptors in a list is the virt queue
> size.
> "
> 
> This looks possible, so the above is probably wrong if the max number of
> chained descriptors is negotiated in a device-specific way.
> 
> >>>OK, I will remove this check, though it is still necessary to have some
> >>>limit on indirect descriptors; otherwise it is possible to hit an overflow,
> >>>since presumably u16 is used for most counts/sizes in the spec.
> >>>
> >>Please try to test the block and scsi devices with your changes as well.
> >Any idea what kind of tests should be covered? AFAICT, packed ring is
> >currently targeted at virtio-net, as discussed when the spec was voted on.
> >
> >Wei
> 
> 
> Well, it's certainly not only for net; it should support all kinds of devices.
> For testing, you should cover basic function plus migration.

We will certainly support all the other devices; can we make it work for the
virtio-net device first and then move on to the others?

Also, can anybody give me a CLI example for block and scsi devices?
I will give it a quick shot.

Wei
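
For reference, a minimal invocation that attaches both a virtio-blk and a
virtio-scsi disk might look like the following sketch (image paths, IDs and
the machine options are placeholders, not taken from this thread):

    qemu-system-x86_64 -M q35 -m 2G -enable-kvm [...] \
        -drive file=blk.img,format=raw,if=none,id=drive0 \
        -device virtio-blk-pci,drive=drive0 \
        -drive file=scsi.img,format=raw,if=none,id=drive1 \
        -device virtio-scsi-pci,id=scsi0 \
        -device scsi-hd,bus=scsi0.0,drive=drive1

For the migration part, the usual approach is to start a second QEMU with the
same devices plus -incoming tcp:0:4444 and issue "migrate -d tcp:<dst>:4444"
from the source monitor.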

> 
> Thanks
> 
> 
> >
> >>Thanks
> >>
> >>
> 



Re: [Qemu-devel] [PATCH v4 07/11] virtio: fill/flush/pop for packed ring

2019-02-19 Thread Jason Wang



On 2019/2/19 4:21 PM, Wei Xu wrote:

On Tue, Feb 19, 2019 at 02:49:42PM +0800, Jason Wang wrote:

On 2019/2/18 10:46 PM, Wei Xu wrote:

Do we allow chaining more descriptors than the vq size in the case of indirect descriptors?
According to the spec:

"

The device limits the number of descriptors in a list through a
transport-specific and/or device-specific value. If not limited,
the maximum number of descriptors in a list is the virt queue
size.
"

This looks possible, so the above is probably wrong if the max number of
chained descriptors is negotiated in a device-specific way.


OK, I will remove this check, though it is still necessary to have some
limit on indirect descriptors; otherwise it is possible to hit an overflow,
since presumably u16 is used for most counts/sizes in the spec.
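
As a standalone illustration of the kind of bound being discussed (the
constant names and the exact policy here are assumptions for the sketch,
not code from this patch):

    #include <stdbool.h>
    #include <stdint.h>

    /* 16 bytes per packed descriptor per the virtio 1.1 layout. */
    #define PACKED_DESC_SIZE      16
    /* QEMU caps virtqueue sizes at 1024 (VIRTQUEUE_MAX_SIZE); reusing that
     * value as the indirect-table bound keeps any u16 descriptor counter
     * far away from overflow. */
    #define MAX_INDIRECT_DESCS    1024

    /* Accept an indirect table only if it is a whole number of descriptors
     * and short enough that a u16 running count cannot wrap. */
    static bool indirect_table_len_ok(uint32_t table_len)
    {
        if (table_len % PACKED_DESC_SIZE) {
            return false;
        }
        return table_len / PACKED_DESC_SIZE <= MAX_INDIRECT_DESCS;
    }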


Please try to test the block and scsi devices with your changes as well.

Any idea what kind of tests should be covered? AFAICT, packed ring is
currently targeted at virtio-net, as discussed when the spec was voted on.

Wei



Well, it's certainly not only for net; it should support all kinds of devices.
For testing, you should cover basic function plus migration.


Thanks





Thanks






Re: [Qemu-devel] [PATCH v4 07/11] virtio: fill/flush/pop for packed ring

2019-02-19 Thread Wei Xu
On Tue, Feb 19, 2019 at 02:49:42PM +0800, Jason Wang wrote:
> 
> On 2019/2/18 10:46 PM, Wei Xu wrote:
> >>Do we allow chaining more descriptors than the vq size in the case of indirect descriptors?
> >>According to the spec:
> >>
> >>"
> >>
> >>The device limits the number of descriptors in a list through a
> >>transport-specific and/or device-specific value. If not limited,
> >>the maximum number of descriptors in a list is the virt queue
> >>size.
> >>"
> >>
> >>This looks possible, so the above is probably wrong if the max number of
> >>chained descriptors is negotiated in a device-specific way.
> >>
> >OK, I will remove this check, though it is still necessary to have some
> >limit on indirect descriptors; otherwise it is possible to hit an overflow,
> >since presumably u16 is used for most counts/sizes in the spec.
> >
> 
> Please try to test the block and scsi devices with your changes as well.

Any idea what kind of tests should be covered? AFAICT, packed ring is
currently targeted at virtio-net, as discussed when the spec was voted on.

Wei

> 
> Thanks
> 
> 



Re: [Qemu-devel] [PATCH v4 07/11] virtio: fill/flush/pop for packed ring

2019-02-18 Thread Jason Wang



On 2019/2/18 10:46 PM, Wei Xu wrote:

Do we allow chaining more descriptors than the vq size in the case of indirect descriptors?
According to the spec:

"

The device limits the number of descriptors in a list through a
transport-specific and/or device-specific value. If not limited,
the maximum number of descriptors in a list is the virt queue
size.
"

This looks possible, so the above is probably wrong if the max number of
chained descriptors is negotiated in a device-specific way.


OK, I will remove this check, though it is still necessary to have some
limit on indirect descriptors; otherwise it is possible to hit an overflow,
since presumably u16 is used for most counts/sizes in the spec.



Please try to test the block and scsi devices with your changes as well.

Thanks





Re: [Qemu-devel] [PATCH v4 07/11] virtio: fill/flush/pop for packed ring

2019-02-18 Thread Wei Xu
On Mon, Feb 18, 2019 at 03:51:05PM +0800, Jason Wang wrote:
> 
> >On 2019/2/14 12:26 PM, w...@redhat.com wrote:
> >From: Wei Xu 
> >
> >last_used_idx/wrap_counter should be equal to last_avail_idx/wrap_counter
> >after a successful flush.
> >
> >Batching as done in vhost-net & dpdk testpmd is not equivalently supported
> >in the userspace backend, but a chain of Rx descriptors is presented
> >similarly as a lightweight batch, so a write barrier is applied only for
> >the first (head) descriptor.
> >
> >Signed-off-by: Wei Xu 
> >---
> >  hw/virtio/virtio.c | 291 +
> >  1 file changed, 274 insertions(+), 17 deletions(-)
> >
> >diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >index 832287b..7e276b4 100644
> >--- a/hw/virtio/virtio.c
> >+++ b/hw/virtio/virtio.c
> >@@ -379,6 +379,25 @@ static void vring_packed_desc_read(VirtIODevice *vdev, VRingPackedDesc *desc,
> >  virtio_tswap16s(vdev, &desc->id);
> >  }
> >+static void vring_packed_desc_write_data(VirtIODevice *vdev,
> >+VRingPackedDesc *desc, MemoryRegionCache *cache, int i)
> >+{
> >+virtio_tswap32s(vdev, &desc->len);
> >+virtio_tswap16s(vdev, &desc->id);
> >+address_space_write_cached(cache,
> >+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, id),
> >+  &desc->id, sizeof(desc->id));
> >+address_space_cache_invalidate(cache,
> >+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, id),
> >+  sizeof(desc->id));
> >+address_space_write_cached(cache,
> >+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, len),
> >+  &desc->len, sizeof(desc->len));
> >+address_space_cache_invalidate(cache,
> >+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, len),
> >+  sizeof(desc->len));
> >+}
> >+
> >  static void vring_packed_desc_read_flags(VirtIODevice *vdev,
> >  VRingPackedDesc *desc, MemoryRegionCache *cache, int i)
> >  {
> >@@ -388,6 +407,18 @@ static void vring_packed_desc_read_flags(VirtIODevice *vdev,
> >  virtio_tswap16s(vdev, &desc->flags);
> >  }
> >+static void vring_packed_desc_write_flags(VirtIODevice *vdev,
> >+VRingPackedDesc *desc, MemoryRegionCache *cache, int i)
> >+{
> >+virtio_tswap16s(vdev, &desc->flags);
> >+address_space_write_cached(cache,
> >+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, flags),
> >+  &desc->flags, sizeof(desc->flags));
> >+address_space_cache_invalidate(cache,
> >+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, flags),
> >+  sizeof(desc->flags));
> >+}
> >+
> >  static inline bool is_desc_avail(struct VRingPackedDesc *desc,
> >  bool wrap_counter)
> >  {
> >@@ -554,19 +585,11 @@ bool virtqueue_rewind(VirtQueue *vq, unsigned int num)
> >  }
> >  /* Called within rcu_read_lock().  */
> >-void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
> >+static void virtqueue_split_fill(VirtQueue *vq, const VirtQueueElement *elem,
> >  unsigned int len, unsigned int idx)
> >  {
> >  VRingUsedElem uelem;
> >-trace_virtqueue_fill(vq, elem, len, idx);
> >-
> >-virtqueue_unmap_sg(vq, elem, len);
> >-
> >-if (unlikely(vq->vdev->broken)) {
> >-return;
> >-}
> >-
> >  if (unlikely(!vq->vring.used)) {
> >  return;
> >  }
> >@@ -578,16 +601,71 @@ void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
> >  vring_used_write(vq, &uelem, idx);
> >  }
> >-/* Called within rcu_read_lock().  */
> >-void virtqueue_flush(VirtQueue *vq, unsigned int count)
> >+static void virtqueue_packed_fill(VirtQueue *vq, const VirtQueueElement *elem,
> >+unsigned int len, unsigned int idx)
> >  {
> >-uint16_t old, new;
> >+uint16_t head;
> >+VRingMemoryRegionCaches *caches;
> >+VRingPackedDesc desc = {
> >+.flags = 0,
> >+.id = elem->index,
> >+.len = len,
> >+};
> >+bool wrap_counter = vq->used_wrap_counter;
> >+
> >+if (unlikely(!vq->vring.desc)) {
> >+return;
> >+}
> >+
> >+head = vq->used_idx + idx;
> >+if (head >= vq->vring.num) {
> >+head -= vq->vring.num;
> >+wrap_counter ^= 1;
> >+}
> >+if (wrap_counter) {
> >+desc.flags |= (1 << VRING_PACKED_DESC_F_AVAIL);
> >+desc.flags |= (1 << VRING_PACKED_DESC_F_USED);
> >+} else {
> >+desc.flags &= ~(1 << VRING_PACKED_DESC_F_AVAIL);
> >+desc.flags &= ~(1 << VRING_PACKED_DESC_F_USED);
> >+}
> >+
> >+caches = vring_get_region_caches(vq);
> >+vring_packed_desc_write_data(vq->vdev, &desc, &caches->desc, head);
> >+if (idx == 0) {
> >+/*
> >+ * Make sure descriptor id and len is written before
> >+ * flags for the first used buffer.
> >+ */
> >+smp_wmb();
> >+}
> 
> 
> I suspect you need 

Re: [Qemu-devel] [PATCH v4 07/11] virtio: fill/flush/pop for packed ring

2019-02-17 Thread Jason Wang



On 2019/2/14 12:26 PM, w...@redhat.com wrote:

From: Wei Xu 

last_used_idx/wrap_counter should be equal to last_avail_idx/wrap_counter
after a successful flush.

Batching as done in vhost-net & dpdk testpmd is not equivalently supported
in the userspace backend, but a chain of Rx descriptors is presented
similarly as a lightweight batch, so a write barrier is applied only for
the first (head) descriptor.

Signed-off-by: Wei Xu 
---
  hw/virtio/virtio.c | 291 +
  1 file changed, 274 insertions(+), 17 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 832287b..7e276b4 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -379,6 +379,25 @@ static void vring_packed_desc_read(VirtIODevice *vdev, VRingPackedDesc *desc,
  virtio_tswap16s(vdev, &desc->id);
  }
  
+static void vring_packed_desc_write_data(VirtIODevice *vdev,

+VRingPackedDesc *desc, MemoryRegionCache *cache, int i)
+{
+virtio_tswap32s(vdev, &desc->len);
+virtio_tswap16s(vdev, &desc->id);
+address_space_write_cached(cache,
+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, id),
+  &desc->id, sizeof(desc->id));
+address_space_cache_invalidate(cache,
+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, id),
+  sizeof(desc->id));
+address_space_write_cached(cache,
+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, len),
+  &desc->len, sizeof(desc->len));
+address_space_cache_invalidate(cache,
+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, len),
+  sizeof(desc->len));
+}
+
  static void vring_packed_desc_read_flags(VirtIODevice *vdev,
  VRingPackedDesc *desc, MemoryRegionCache *cache, int i)
  {
@@ -388,6 +407,18 @@ static void vring_packed_desc_read_flags(VirtIODevice *vdev,
  virtio_tswap16s(vdev, &desc->flags);
  }
  
+static void vring_packed_desc_write_flags(VirtIODevice *vdev,

+VRingPackedDesc *desc, MemoryRegionCache *cache, int i)
+{
+virtio_tswap16s(vdev, &desc->flags);
+address_space_write_cached(cache,
+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, flags),
+  &desc->flags, sizeof(desc->flags));
+address_space_cache_invalidate(cache,
+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, flags),
+  sizeof(desc->flags));
+}
+
  static inline bool is_desc_avail(struct VRingPackedDesc *desc,
  bool wrap_counter)
  {
@@ -554,19 +585,11 @@ bool virtqueue_rewind(VirtQueue *vq, unsigned int num)
  }
  
  /* Called within rcu_read_lock().  */

-void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
+static void virtqueue_split_fill(VirtQueue *vq, const VirtQueueElement *elem,
  unsigned int len, unsigned int idx)
  {
  VRingUsedElem uelem;
  
-trace_virtqueue_fill(vq, elem, len, idx);

-
-virtqueue_unmap_sg(vq, elem, len);
-
-if (unlikely(vq->vdev->broken)) {
-return;
-}
-
  if (unlikely(!vq->vring.used)) {
  return;
  }
@@ -578,16 +601,71 @@ void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
  vring_used_write(vq, &uelem, idx);
  }
  
-/* Called within rcu_read_lock().  */

-void virtqueue_flush(VirtQueue *vq, unsigned int count)
+static void virtqueue_packed_fill(VirtQueue *vq, const VirtQueueElement *elem,
+unsigned int len, unsigned int idx)
  {
-uint16_t old, new;
+uint16_t head;
+VRingMemoryRegionCaches *caches;
+VRingPackedDesc desc = {
+.flags = 0,
+.id = elem->index,
+.len = len,
+};
+bool wrap_counter = vq->used_wrap_counter;
+
+if (unlikely(!vq->vring.desc)) {
+return;
+}
+
+head = vq->used_idx + idx;
+if (head >= vq->vring.num) {
+head -= vq->vring.num;
+wrap_counter ^= 1;
+}
+if (wrap_counter) {
+desc.flags |= (1 << VRING_PACKED_DESC_F_AVAIL);
+desc.flags |= (1 << VRING_PACKED_DESC_F_USED);
+} else {
+desc.flags &= ~(1 << VRING_PACKED_DESC_F_AVAIL);
+desc.flags &= ~(1 << VRING_PACKED_DESC_F_USED);
+}
+
+caches = vring_get_region_caches(vq);
+vring_packed_desc_write_data(vq->vdev, &desc, &caches->desc, head);
+if (idx == 0) {
+/*
+ * Make sure descriptor id and len is written before
+ * flags for the first used buffer.
+ */
+smp_wmb();
+}



I suspect you need to do this unconditionally, since this function doesn't
do any batched writing to the used ring. What it does is write used
descriptors at idx, so there's no reason to suppress the wmb for idx != 0.
See its usage in virtio-net.c.
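
A sketch of that ordering (illustrative only, reusing the helper names from
the patch above; not the final code):

    vring_packed_desc_write_data(vq->vdev, &desc, &caches->desc, head);
    /* id/len must be visible before flags for every used descriptor,
     * not only the first one, since fill() may be called per element. */
    smp_wmb();
    vring_packed_desc_write_flags(vq->vdev, &desc, &caches->desc, head);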




+
+vring_packed_desc_write_flags(vq->vdev, &desc, &caches->desc, head);
+}
+
+void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
+unsigned int len, unsigned int idx)
+{
+

[Qemu-devel] [PATCH v4 07/11] virtio: fill/flush/pop for packed ring

2019-02-13 Thread wexu
From: Wei Xu 

last_used_idx/wrap_counter should be equal to last_avail_idx/wrap_counter
after a successful flush.

Batching as done in vhost-net & dpdk testpmd is not equivalently supported
in the userspace backend, but a chain of Rx descriptors is presented
similarly as a lightweight batch, so a write barrier is applied only for
the first (head) descriptor.

Signed-off-by: Wei Xu 
---
 hw/virtio/virtio.c | 291 +
 1 file changed, 274 insertions(+), 17 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 832287b..7e276b4 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -379,6 +379,25 @@ static void vring_packed_desc_read(VirtIODevice *vdev, VRingPackedDesc *desc,
  virtio_tswap16s(vdev, &desc->id);
 }
 
+static void vring_packed_desc_write_data(VirtIODevice *vdev,
+VRingPackedDesc *desc, MemoryRegionCache *cache, int i)
+{
+virtio_tswap32s(vdev, &desc->len);
+virtio_tswap16s(vdev, &desc->id);
+address_space_write_cached(cache,
+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, id),
+  &desc->id, sizeof(desc->id));
+address_space_cache_invalidate(cache,
+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, id),
+  sizeof(desc->id));
+address_space_write_cached(cache,
+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, len),
+  &desc->len, sizeof(desc->len));
+address_space_cache_invalidate(cache,
+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, len),
+  sizeof(desc->len));
+}
+
 static void vring_packed_desc_read_flags(VirtIODevice *vdev,
 VRingPackedDesc *desc, MemoryRegionCache *cache, int i)
 {
@@ -388,6 +407,18 @@ static void vring_packed_desc_read_flags(VirtIODevice *vdev,
  virtio_tswap16s(vdev, &desc->flags);
 }
 
+static void vring_packed_desc_write_flags(VirtIODevice *vdev,
+VRingPackedDesc *desc, MemoryRegionCache *cache, int i)
+{
+virtio_tswap16s(vdev, &desc->flags);
+address_space_write_cached(cache,
+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, flags),
+  &desc->flags, sizeof(desc->flags));
+address_space_cache_invalidate(cache,
+  i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, flags),
+  sizeof(desc->flags));
+}
+
 static inline bool is_desc_avail(struct VRingPackedDesc *desc,
 bool wrap_counter)
 {
@@ -554,19 +585,11 @@ bool virtqueue_rewind(VirtQueue *vq, unsigned int num)
 }
 
 /* Called within rcu_read_lock().  */
-void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
+static void virtqueue_split_fill(VirtQueue *vq, const VirtQueueElement *elem,
 unsigned int len, unsigned int idx)
 {
 VRingUsedElem uelem;
 
-trace_virtqueue_fill(vq, elem, len, idx);
-
-virtqueue_unmap_sg(vq, elem, len);
-
-if (unlikely(vq->vdev->broken)) {
-return;
-}
-
 if (unlikely(!vq->vring.used)) {
 return;
 }
@@ -578,16 +601,71 @@ void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
  vring_used_write(vq, &uelem, idx);
 }
 
-/* Called within rcu_read_lock().  */
-void virtqueue_flush(VirtQueue *vq, unsigned int count)
+static void virtqueue_packed_fill(VirtQueue *vq, const VirtQueueElement *elem,
+unsigned int len, unsigned int idx)
 {
-uint16_t old, new;
+uint16_t head;
+VRingMemoryRegionCaches *caches;
+VRingPackedDesc desc = {
+.flags = 0,
+.id = elem->index,
+.len = len,
+};
+bool wrap_counter = vq->used_wrap_counter;
+
+if (unlikely(!vq->vring.desc)) {
+return;
+}
+
+head = vq->used_idx + idx;
+if (head >= vq->vring.num) {
+head -= vq->vring.num;
+wrap_counter ^= 1;
+}
+if (wrap_counter) {
+desc.flags |= (1 << VRING_PACKED_DESC_F_AVAIL);
+desc.flags |= (1 << VRING_PACKED_DESC_F_USED);
+} else {
+desc.flags &= ~(1 << VRING_PACKED_DESC_F_AVAIL);
+desc.flags &= ~(1 << VRING_PACKED_DESC_F_USED);
+}
+
+caches = vring_get_region_caches(vq);
+vring_packed_desc_write_data(vq->vdev, &desc, &caches->desc, head);
+if (idx == 0) {
+/*
+ * Make sure descriptor id and len is written before
+ * flags for the first used buffer.
+ */
+smp_wmb();
+}
+
+vring_packed_desc_write_flags(vq->vdev, &desc, &caches->desc, head);
+}
+
+void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
+unsigned int len, unsigned int idx)
+{
+trace_virtqueue_fill(vq, elem, len, idx);
+
+virtqueue_unmap_sg(vq, elem, len);
 
 if (unlikely(vq->vdev->broken)) {
-vq->inuse -= count;
 return;
 }
 
+if (virtio_vdev_has_feature(vq->vdev, VIRTIO_F_RING_PACKED)) {
+virtqueue_packed_fill(vq, elem, len, idx);
+} else {
+virtqueue_split_fill(vq, elem,