Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio

2015-05-13 Thread Michael S. Tsirkin
On Wed, May 13, 2015 at 12:39:07PM +0200, Cornelia Huck wrote:
 On Tue, 12 May 2015 18:40:35 +0200
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Tue, May 12, 2015 at 06:25:32PM +0200, Cornelia Huck wrote:
   On Tue, 12 May 2015 17:15:53 +0200
   Michael S. Tsirkin m...@redhat.com wrote:
 
One issue with virtio 1 patches as they are is with how features are
handled ATM.  There are 3 types of features

a. virtio 1 only features
b. virtio 0 only features
c. shared features

and 3 types of devices
a. legacy device: has b+c features
b. modern device: has a+c features
c. transitional device: has a+c features but exposes
   only c through the legacy interface
   
   Wouldn't a transitional device be able to expose b as well?
  
  No because the virtio 1 spec says it shouldn't.
 
 Can you point me to the part where it says that? Can't seem to find it.


What I mean is that these features can't be operated through the modern
interface.
Consider the BLK SCSI command feature. If you set it you really
expect guest to be able to e.g. use your device as a cd writer.
So, a legacy guest is able to do this, but a modern guest isn't?
Doesn't sound like an expected outcome to me.


So the principle of the least surprise dictates that we
- disable the feature by default for everyone
- add flag to enable the feature, available for legacy devices only


So I think a callback that gets features depending on guest
version isn't a good way to model it because fundamentally device
has one set of features.
A better way to model this is really just a single
host_features bitmask, and for transitional devices, a mask
hiding a features - which are so far all bits  31, so maybe
for now we can just have a global mask.
   
   How would this work for transitional presenting a modern device - would
   you have a superset of bits and masks for legacy and modern?
  
  Basically we expose through modern interface a superset
  of bits exposed through legacy.
  F_BAD for pci is probably the only exception.
 
 ccw has F_BAD as well.

It does but it ignores it, so I think setting that bit
in host features is a bug. PCI uses it to detect very old
guests that just acked all features without looking
at them. ccw never had such guests.

 NOTIFY_ON_EMPTY is in the MAY be offered category.

Right. Current guests simply ignore this feature bit,
so I don't much care what we do with this bit as long as
- pci exposes this on the legacy interface
- we don't spend many lines of code working around that special case

 I'm still wondering about the old legacy bits. Transitional devices not
 offering them is easiest to implement from the host side, but I'm
 wondering if all legacy devices will be able to function with missing
 bits? Is breaking some legacy drivers OK?

We should add flags to re-enable them, for legacy devices only.

-- 
MST



Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio

2015-05-13 Thread Cornelia Huck
On Tue, 12 May 2015 18:40:35 +0200
Michael S. Tsirkin m...@redhat.com wrote:

 On Tue, May 12, 2015 at 06:25:32PM +0200, Cornelia Huck wrote:
  On Tue, 12 May 2015 17:15:53 +0200
  Michael S. Tsirkin m...@redhat.com wrote:

   One issue with virtio 1 patches as they are is with how features are
   handled ATM.  There are 3 types of features
   
 a. virtio 1 only features
 b. virtio 0 only features
 c. shared features
   
   and 3 types of devices
 a. legacy device: has b+c features
 b. modern device: has a+c features
 c. transitional device: has a+c features but exposes
only c through the legacy interface
  
  Wouldn't a transitional device be able to expose b as well?
 
 No because the virtio 1 spec says it shouldn't.

Can you point me to the part where it says that? Can't seem to find it.

   So I think a callback that gets features depending on guest
   version isn't a good way to model it because fundamentally device
   has one set of features.
   A better way to model this is really just a single
   host_features bitmask, and for transitional devices, a mask
   hiding a features - which are so far all bits  31, so maybe
   for now we can just have a global mask.
  
  How would this work for transitional presenting a modern device - would
  you have a superset of bits and masks for legacy and modern?
 
 Basically we expose through modern interface a superset
 of bits exposed through legacy.
 F_BAD for pci is probably the only exception.

ccw has F_BAD as well.

NOTIFY_ON_EMPTY is in the MAY be offered category.

I'm still wondering about the old legacy bits. Transitional devices not
offering them is easiest to implement from the host side, but I'm
wondering if all legacy devices will be able to function with missing
bits? Is breaking some legacy drivers OK?




Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio

2015-05-12 Thread Michael S. Tsirkin
On Tue, May 12, 2015 at 06:25:32PM +0200, Cornelia Huck wrote:
 On Tue, 12 May 2015 17:15:53 +0200
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Tue, May 12, 2015 at 03:25:30PM +0200, Cornelia Huck wrote:
   On Wed, 06 May 2015 14:08:02 +0200
   Greg Kurz gk...@linux.vnet.ibm.com wrote:
   
Legacy virtio is native endian: if the guest and host endianness differ,
we have to tell vhost so it can swap bytes where appropriate. This is
done through a vhost ring ioctl.

Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
---
 hw/virtio/vhost.c |   50 
+-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 54851b7..1d7b939 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
   (...)
@@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev 
*dev,
 return -errno;
 }

+if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) 
   
   I think this should either go in after the virtio-1 base support (more
   feature bits etc.) or get a big fat comment and be touched up later.
   I'd prefer the first solution so it does not get forgotten, but I'm not
   sure when Michael plans to proceed with the virtio-1 patches (I think
   they're mostly fine already).
  
  There are three main issues with virtio 1 patches that I am aware of.
  
  One issue with virtio 1 patches as they are is with how features are
  handled ATM.  There are 3 types of features
  
  a. virtio 1 only features
  b. virtio 0 only features
  c. shared features
  
  and 3 types of devices
  a. legacy device: has b+c features
  b. modern device: has a+c features
  c. transitional device: has a+c features but exposes
 only c through the legacy interface
 
 Wouldn't a transitional device be able to expose b as well?

No because the virtio 1 spec says it shouldn't.


  
  
  So I think a callback that gets features depending on guest
  version isn't a good way to model it because fundamentally device
  has one set of features.
  A better way to model this is really just a single
  host_features bitmask, and for transitional devices, a mask
  hiding a features - which are so far all bits  31, so maybe
  for now we can just have a global mask.
 
 How would this work for transitional presenting a modern device - would
 you have a superset of bits and masks for legacy and modern?

Basically we expose through modern interface a superset
of bits exposed through legacy.
F_BAD for pci is probably the only exception.


  
  We need to validate features at initialization time and make
  sure they make sense, fail if not (sometimes we need to mask
  features if they don't make sense - this is unfortunate
  but might be needed for compatibility).
  
  Moving host_features to virtio core would make all of the above
  easier.
 
 I have started hacking up code that moves host_features, but I'm quite
 lost with all the different virtio versions floating around. Currently
 trying against master, but that of course ignores the virtio-1 issues.

Yes, I think we should focus on infrastructure cleanups in master first.

  
  
  Second issue is migration, some of it is with migrating the new
  features, so that's tied to the first one.
 
 There's also the used and avail addresses, but that kind of follows
 from virtio-1 support.
 
  
  
  Third issue is fixing devices so they don't try to
  access guest memory until DRIVER_OK is set.
  This is surprisingly hard to do generally given need to support old
  drivers which don't set DRIVER_OK or set it very late, and the fact that
  we tied work-arounds for even older drivers which dont' set pci bus
  master to the DRIVER_OK bit. I tried, and I'm close to giving up and
  just checking guest ack for virtio 1, and ignoring DRIVER_OK requirement
  if not there.
 
 If legacy survived like it is until now, it might be best to focus on
 modern devices for this.

I'm kind of unhappy that it's up to guest though as that controls
whether we run in modern mode. But yea.

-- 
MST



Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio

2015-05-12 Thread Cornelia Huck
On Tue, 12 May 2015 17:15:53 +0200
Michael S. Tsirkin m...@redhat.com wrote:

 On Tue, May 12, 2015 at 03:25:30PM +0200, Cornelia Huck wrote:
  On Wed, 06 May 2015 14:08:02 +0200
  Greg Kurz gk...@linux.vnet.ibm.com wrote:
  
   Legacy virtio is native endian: if the guest and host endianness differ,
   we have to tell vhost so it can swap bytes where appropriate. This is
   done through a vhost ring ioctl.
   
   Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
   ---
hw/virtio/vhost.c |   50 
   +-
1 file changed, 49 insertions(+), 1 deletion(-)
   
   diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
   index 54851b7..1d7b939 100644
   --- a/hw/virtio/vhost.c
   +++ b/hw/virtio/vhost.c
  (...)
   @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev 
   *dev,
return -errno;
}
   
   +if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) 
  
  I think this should either go in after the virtio-1 base support (more
  feature bits etc.) or get a big fat comment and be touched up later.
  I'd prefer the first solution so it does not get forgotten, but I'm not
  sure when Michael plans to proceed with the virtio-1 patches (I think
  they're mostly fine already).
 
 There are three main issues with virtio 1 patches that I am aware of.
 
 One issue with virtio 1 patches as they are is with how features are
 handled ATM.  There are 3 types of features
 
   a. virtio 1 only features
   b. virtio 0 only features
   c. shared features
 
 and 3 types of devices
   a. legacy device: has b+c features
   b. modern device: has a+c features
   c. transitional device: has a+c features but exposes
  only c through the legacy interface

Wouldn't a transitional device be able to expose b as well?

 
 
 So I think a callback that gets features depending on guest
 version isn't a good way to model it because fundamentally device
 has one set of features.
 A better way to model this is really just a single
 host_features bitmask, and for transitional devices, a mask
 hiding a features - which are so far all bits  31, so maybe
 for now we can just have a global mask.

How would this work for transitional presenting a modern device - would
you have a superset of bits and masks for legacy and modern?

 
 We need to validate features at initialization time and make
 sure they make sense, fail if not (sometimes we need to mask
 features if they don't make sense - this is unfortunate
 but might be needed for compatibility).
 
 Moving host_features to virtio core would make all of the above
 easier.

I have started hacking up code that moves host_features, but I'm quite
lost with all the different virtio versions floating around. Currently
trying against master, but that of course ignores the virtio-1 issues.

 
 
 Second issue is migration, some of it is with migrating the new
 features, so that's tied to the first one.

There's also the used and avail addresses, but that kind of follows
from virtio-1 support.

 
 
 Third issue is fixing devices so they don't try to
 access guest memory until DRIVER_OK is set.
 This is surprisingly hard to do generally given need to support old
 drivers which don't set DRIVER_OK or set it very late, and the fact that
 we tied work-arounds for even older drivers which dont' set pci bus
 master to the DRIVER_OK bit. I tried, and I'm close to giving up and
 just checking guest ack for virtio 1, and ignoring DRIVER_OK requirement
 if not there.

If legacy survived like it is until now, it might be best to focus on
modern devices for this.




Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio

2015-05-12 Thread Cornelia Huck
On Wed, 06 May 2015 14:08:02 +0200
Greg Kurz gk...@linux.vnet.ibm.com wrote:

 Legacy virtio is native endian: if the guest and host endianness differ,
 we have to tell vhost so it can swap bytes where appropriate. This is
 done through a vhost ring ioctl.
 
 Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
 ---
  hw/virtio/vhost.c |   50 +-
  1 file changed, 49 insertions(+), 1 deletion(-)
 
 diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
 index 54851b7..1d7b939 100644
 --- a/hw/virtio/vhost.c
 +++ b/hw/virtio/vhost.c
(...)
 @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
  return -errno;
  }
 
 +if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) 

I think this should either go in after the virtio-1 base support (more
feature bits etc.) or get a big fat comment and be touched up later.
I'd prefer the first solution so it does not get forgotten, but I'm not
sure when Michael plans to proceed with the virtio-1 patches (I think
they're mostly fine already).

 +virtio_legacy_is_cross_endian(vdev)) {
 +r = vhost_virtqueue_set_vring_endian_legacy(dev,
 +
 virtio_is_big_endian(vdev),
 +vhost_vq_index);
 +if (r) {
 +return -errno;
 +}
 +}
 +
  s = l = virtio_queue_get_desc_size(vdev, idx);
  a = virtio_queue_get_desc_addr(vdev, idx);
  vq-desc = cpu_physical_memory_map(a, l, 0);




Re: [Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio

2015-05-12 Thread Michael S. Tsirkin
On Tue, May 12, 2015 at 03:25:30PM +0200, Cornelia Huck wrote:
 On Wed, 06 May 2015 14:08:02 +0200
 Greg Kurz gk...@linux.vnet.ibm.com wrote:
 
  Legacy virtio is native endian: if the guest and host endianness differ,
  we have to tell vhost so it can swap bytes where appropriate. This is
  done through a vhost ring ioctl.
  
  Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
  ---
   hw/virtio/vhost.c |   50 +-
   1 file changed, 49 insertions(+), 1 deletion(-)
  
  diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
  index 54851b7..1d7b939 100644
  --- a/hw/virtio/vhost.c
  +++ b/hw/virtio/vhost.c
 (...)
  @@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
   return -errno;
   }
  
  +if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) 
 
 I think this should either go in after the virtio-1 base support (more
 feature bits etc.) or get a big fat comment and be touched up later.
 I'd prefer the first solution so it does not get forgotten, but I'm not
 sure when Michael plans to proceed with the virtio-1 patches (I think
 they're mostly fine already).

There are three main issues with virtio 1 patches that I am aware of.

One issue with virtio 1 patches as they are is with how features are
handled ATM.  There are 3 types of features

a. virtio 1 only features
b. virtio 0 only features
c. shared features

and 3 types of devices
a. legacy device: has b+c features
b. modern device: has a+c features
c. transitional device: has a+c features but exposes
   only c through the legacy interface


So I think a callback that gets features depending on guest
version isn't a good way to model it because fundamentally device
has one set of features.
A better way to model this is really just a single
host_features bitmask, and for transitional devices, a mask
hiding a features - which are so far all bits  31, so maybe
for now we can just have a global mask.

We need to validate features at initialization time and make
sure they make sense, fail if not (sometimes we need to mask
features if they don't make sense - this is unfortunate
but might be needed for compatibility).

Moving host_features to virtio core would make all of the above
easier.


Second issue is migration, some of it is with migrating the new
features, so that's tied to the first one.


Third issue is fixing devices so they don't try to
access guest memory until DRIVER_OK is set.
This is surprisingly hard to do generally given need to support old
drivers which don't set DRIVER_OK or set it very late, and the fact that
we tied work-arounds for even older drivers which dont' set pci bus
master to the DRIVER_OK bit. I tried, and I'm close to giving up and
just checking guest ack for virtio 1, and ignoring DRIVER_OK requirement
if not there.



  +virtio_legacy_is_cross_endian(vdev)) {
  +r = vhost_virtqueue_set_vring_endian_legacy(dev,
  +
  virtio_is_big_endian(vdev),
  +vhost_vq_index);
  +if (r) {
  +return -errno;
  +}
  +}
  +
   s = l = virtio_queue_get_desc_size(vdev, idx);
   a = virtio_queue_get_desc_addr(vdev, idx);
   vq-desc = cpu_physical_memory_map(a, l, 0);



[Qemu-devel] [PATCH RFC 4/7] vhost: set vring endianness for legacy virtio

2015-05-06 Thread Greg Kurz
Legacy virtio is native endian: if the guest and host endianness differ,
we have to tell vhost so it can swap bytes where appropriate. This is
done through a vhost ring ioctl.

Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
---
 hw/virtio/vhost.c |   50 +-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 54851b7..1d7b939 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -17,9 +17,11 @@
 #include hw/hw.h
 #include qemu/atomic.h
 #include qemu/range.h
+#include qemu/error-report.h
 #include linux/vhost.h
 #include exec/address-spaces.h
 #include hw/virtio/virtio-bus.h
+#include hw/virtio/virtio-access.h
 #include migration/migration.h
 
 static void vhost_dev_sync_region(struct vhost_dev *dev,
@@ -647,6 +649,27 @@ static void vhost_log_stop(MemoryListener *listener,
 /* FIXME: implement */
 }
 
+static int vhost_virtqueue_set_vring_endian_legacy(struct vhost_dev *dev,
+   bool is_big_endian,
+   int vhost_vq_index)
+{
+struct vhost_vring_state s = {
+.index = vhost_vq_index,
+.num = is_big_endian
+};
+
+if (!dev-vhost_ops-vhost_call(dev, VHOST_SET_VRING_ENDIAN, s)) {
+return 0;
+}
+
+if (errno == ENOTTY) {
+error_report(vhost does not support cross-endian);
+return -ENOSYS;
+}
+
+return -errno;
+}
+
 static int vhost_virtqueue_start(struct vhost_dev *dev,
 struct VirtIODevice *vdev,
 struct vhost_virtqueue *vq,
@@ -677,6 +700,16 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
 return -errno;
 }
 
+if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) 
+virtio_legacy_is_cross_endian(vdev)) {
+r = vhost_virtqueue_set_vring_endian_legacy(dev,
+virtio_is_big_endian(vdev),
+vhost_vq_index);
+if (r) {
+return -errno;
+}
+}
+
 s = l = virtio_queue_get_desc_size(vdev, idx);
 a = virtio_queue_get_desc_addr(vdev, idx);
 vq-desc = cpu_physical_memory_map(a, l, 0);
@@ -747,8 +780,9 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
 struct vhost_virtqueue *vq,
 unsigned idx)
 {
+int vhost_vq_index = idx - dev-vq_index;
 struct vhost_vring_state state = {
-.index = idx - dev-vq_index
+.index = vhost_vq_index,
 };
 int r;
 assert(idx = dev-vq_index  idx  dev-vq_index + dev-nvqs);
@@ -759,6 +793,20 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
 }
 virtio_queue_set_last_avail_idx(vdev, idx, state.num);
 virtio_queue_invalidate_signalled_used(vdev, idx);
+
+/* In the cross-endian case, we need to reset the vring endianness to
+ * native as legacy devices expect so by default.
+ */
+if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) 
+virtio_legacy_is_cross_endian(vdev)) {
+r = vhost_virtqueue_set_vring_endian_legacy(dev,
+
!virtio_is_big_endian(vdev),
+vhost_vq_index);
+if (r  0) {
+error_report(failed to reset vring endianness);
+}
+}
+
 assert (r = 0);
 cpu_physical_memory_unmap(vq-ring, virtio_queue_get_ring_size(vdev, idx),
   0, virtio_queue_get_ring_size(vdev, idx));