Re: [Qemu-devel] [PATCH v2 00/11] qemu: use virtio linux headers in portable code

2013-05-28 Thread Michael S. Tsirkin
On Mon, May 27, 2013 at 11:14:47AM -0500, Anthony Liguori wrote:
  Well specifically if_ether.h says GPLv2+ so it's OK for QEMU.
  Do you mean for some other non GPL app?
 
 Ignore QEMU for the moment.
 
 The headers say they are BSD licensed... but they include a GPLv2+
 header.

Above is a bit misleading.  To be precise, they don't include a GPLv2+
header. One header merely includes this line:
#include linux/if_ether.h

-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC 7/11] virtio_pci: new, capability-aware driver.

2013-05-28 Thread Michael S. Tsirkin
On Mon, Dec 12, 2011 at 01:49:13PM +0200, Michael S. Tsirkin wrote:
 On Mon, Dec 12, 2011 at 09:15:03AM +1030, Rusty Russell wrote:
  On Sun, 11 Dec 2011 11:42:56 +0200, Michael S. Tsirkin m...@redhat.com 
  wrote:
   On Thu, Dec 08, 2011 at 09:09:33PM +1030, Rusty Russell wrote:
+/* There is no iowrite64.  We use two 32-bit ops. */
+static void iowrite64(u64 val, const __le64 *addr)
+{
+   iowrite32((u32)val, (__le32 *)addr);
+   iowrite32(val  32, (__le32 *)addr + 1);
+}
+
   
   Let's put addr_lo/addr_hi in the structure then,
   to make the fact this field is not atomic explicit?
  
  Good point, assuming I haven't missed something.
  
  Are 64-bit accesses actually unknown in PCI-land?  Or is this a limited
  availability thing?
  
  Thanks,
  Rusty.
 
 I think PCI can optionally support atomic 64 bit accesses, but not all
 architectures can generate them.

Ping. Going to change this in the layout struct?

 -- 
 MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH] virtio_pci: fix capability format, comments

2013-05-28 Thread Michael S. Tsirkin
- queue size can actually be 0 which is not a power of 2
- fix capability format. PCI spec says:
The layout of the information is vendor specific, except that the byte
immediately following the “Next” pointer in the capability structure is
defined to be a length field.
This length field provides the number of bytes in the capability
structure (including the ID and Next pointer bytes).

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---

This patch is on top of the new layout branch, too

 include/uapi/linux/virtio_pci.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
index cda688f..a5ef8cd 100644
--- a/include/uapi/linux/virtio_pci.h
+++ b/include/uapi/linux/virtio_pci.h
@@ -129,6 +129,7 @@
 struct virtio_pci_cap {
__u8 cap_vndr;  /* Generic PCI field: PCI_CAP_ID_VNDR */
__u8 cap_next;  /* Generic PCI field: next ptr. */
+   __u8 cap_len;   /* Generic PCI field: capability length */
__u8 cfg_type;  /* One of the VIRTIO_PCI_CAP_*_CFG. */
__u8 bar;   /* Where to find it. */
__le32 offset;  /* Offset within bar. */
@@ -154,7 +155,7 @@ struct virtio_pci_common_cfg {
 
/* About a specific virtqueue. */
__le16 queue_select;/* read-write */
-   __le16 queue_size;  /* read-write, power of 2. */
+   __le16 queue_size;  /* read-write, power of 2, or 0. */
__le16 queue_msix_vector;   /* read-write */
__le16 queue_enable;/* read-write */
__le16 queue_notify_off;/* read-only */
-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtio-balloon spec: rework VIRTIO_BALLOON_F_MUST_TELL_HOST feature, support silent deflation

2013-05-28 Thread Paolo Bonzini
Il 27/05/2013 19:02, Michael S. Tsirkin ha scritto:
 So if we don't want to require all guests to tell host
 first, all we need to do is admit it's not a bug.

 I think we want the possibility for the host to require that.
 
 But why? TELL_HOST makes some optimizations possible, but if
 guest won't cooperate, balloon is useless anyway.

If the guest won't tell host and still propose the feature, then we can
crash it.  So we need to know what the guest is going to do, in order to
enable/disable the optimization.

 If guest cooperates we don't have to require anything,
 just go with what guest tells us it will do.

Yes.

 Please see
 [PATCH] virtio-spec: balloon: MUST_TELL_HOST is optional
 that does exactly this.

 That patch mandates a change in guest behavior that is not compatible
 with the existing Windows driver.  Mine doesn't.

 Paolo
 
 Hmm I don't see it.
 In fact the goal was to document the Windows driver behaviour
 as correct.
 Can you explain the incompatibility please?

Whenever If the X feature is (not) negotiated is used in the spec, it
means in general you should be ready to implement both behaviors, or
perhaps the guest should fail to initialize if the feature is not available.

Here it is the other way round.  The existing guest is not checking the
outcome of the negotiation, so the host must check whether negotiation
happened and possibly fail the initialization of the device.  It is
sufficiently different from any other case that I don't think a one-word
change is enough.

The way I read it yesterday I didn't see any change from the current
specification, so the problem of having a negative feature remains.
Now rereading it, it may be correct, but it is not clear enough.

Perhaps my patch is even too verbose, but it doesn't leave anything open
for interpretation.

Paolo
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vhost-scsi: return -ENOENT when no matching tcm_vhost_tpg found

2013-05-28 Thread Asias He
On Tue, May 28, 2013 at 04:54:44PM +0800, Wenchao Xia wrote:
 ioctl for VHOST_SCSI_SET_ENDPOINT report file exist errori, when I forget
 to set it correctly in configfs, make user confused. Actually it fail
 to find a matching one, so change the error value.
 
 Signed-off-by: Wenchao Xia wenchaoli...@gmail.com

Acked-by: Asias He as...@redhat.com

BTW, It would be nice to print more informative info in qemu when wwpn
is not available as well.

 ---
  drivers/vhost/scsi.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
 index 7014202..6325b1d 100644
 --- a/drivers/vhost/scsi.c
 +++ b/drivers/vhost/scsi.c
 @@ -1219,7 +1219,7 @@ static int vhost_scsi_set_endpoint(
   }
   ret = 0;
   } else {
 - ret = -EEXIST;
 + ret = -ENOENT;
   }
  
   /*
 -- 
 1.7.1
 

-- 
Asias
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-balloon spec: rework VIRTIO_BALLOON_F_MUST_TELL_HOST feature, support silent deflation

2013-05-28 Thread Michael S. Tsirkin
On Tue, May 28, 2013 at 10:38:35AM +0200, Paolo Bonzini wrote:
 Il 27/05/2013 19:02, Michael S. Tsirkin ha scritto:
  So if we don't want to require all guests to tell host
  first, all we need to do is admit it's not a bug.
 
  I think we want the possibility for the host to require that.
  
  But why? TELL_HOST makes some optimizations possible, but if
  guest won't cooperate, balloon is useless anyway.
 
 If the guest won't tell host and still propose the feature,

Ack feature but don't tell host? That would be a clear guest bug.
AFAIK that's not what windows drivers do.
Am I wrong?

 then we can
 crash it.  So we need to know what the guest is going to do, in order to
 enable/disable the optimization.
 
  If guest cooperates we don't have to require anything,
  just go with what guest tells us it will do.
 
 Yes.
 
  Please see
[PATCH] virtio-spec: balloon: MUST_TELL_HOST is optional
  that does exactly this.
 
  That patch mandates a change in guest behavior that is not compatible
  with the existing Windows driver.  Mine doesn't.
 
  Paolo
  
  Hmm I don't see it.
  In fact the goal was to document the Windows driver behaviour
  as correct.
  Can you explain the incompatibility please?
 
 Whenever If the X feature is (not) negotiated is used in the spec, it
 means in general you should be ready to implement both behaviors, or
 perhaps the guest should fail to initialize if the feature is not available.

you meaning host. Yes.
But here guest can tell host first *always if it wants to - it will
just be a bit slower when reusing pages from balloon.
If it acked the feature, it *must* tell host first.

 
 Here it is the other way round.  The existing guest is not checking the
 outcome of the negotiation, so the host must check whether negotiation
 happened and possibly fail the initialization of the device.  It is
 sufficiently different from any other case that I don't think a one-word
 change is enough.
 
 The way I read it yesterday I didn't see any change from the current
 specification, so the problem of having a negative feature remains.

This is standard behaviour:

- guest can ignore any feature that it does not ack
- host must implement both behaviours for guests that
  do and for guests that do not ack features

This is exactly what I'm proposing for TELL_HOST.

 Now rereading it, it may be correct, but it is not clear enough.
 
 Perhaps my patch is even too verbose, but it doesn't leave anything open
 for interpretation.
 
 Paolo

I'm fine with adding more clarifications but I don't yet see why
do we need a new bit.

-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-balloon spec: rework VIRTIO_BALLOON_F_MUST_TELL_HOST feature, support silent deflation

2013-05-28 Thread Paolo Bonzini
Il 28/05/2013 12:45, Michael S. Tsirkin ha scritto:
 On Tue, May 28, 2013 at 10:38:35AM +0200, Paolo Bonzini wrote:
 Il 27/05/2013 19:02, Michael S. Tsirkin ha scritto:
 So if we don't want to require all guests to tell host
 first, all we need to do is admit it's not a bug.

 I think we want the possibility for the host to require that.

 But why? TELL_HOST makes some optimizations possible, but if
 guest won't cooperate, balloon is useless anyway.

 If the guest won't tell host and still propose the feature,
 
 Ack feature but don't tell host? That would be a clear guest bug.
 AFAIK that's not what windows drivers do.
 Am I wrong?

Yes.  I think we are in agreement on this part.

 Please see
   [PATCH] virtio-spec: balloon: MUST_TELL_HOST is optional
 that does exactly this.

 That patch mandates a change in guest behavior that is not compatible
 with the existing Windows driver.  Mine doesn't.

 Hmm I don't see it.
 In fact the goal was to document the Windows driver behaviour
 as correct. Can you explain the incompatibility please?

 Whenever If the X feature is (not) negotiated is used in the spec, it
 means in general you should be ready to implement both behaviors, or
 perhaps the guest should fail to initialize if the feature is not available.
 
 you meaning host. Yes.

Even guest.  A virtio-net guest driver should be ready to use an older
method if the host doesn't support merged rx buffers, for example.

In this case, a tell host first guest has to do nothing special if the
host doesn't advertise the feature.  It is a bit different from other
uses of negotiated in the spec.

 But here guest can tell host first *always if it wants to - it will
 just be a bit slower when reusing pages from balloon.
 If it acked the feature, it *must* tell host first.

Yes.

 The way I read it yesterday I didn't see any change from the current
 specification, so the problem of having a negative feature remains.
 
 This is standard behaviour:
 
 - guest can ignore any feature that it does not ack
 - host must implement both behaviours for guests that
   do and for guests that do not ack features
 
 This is exactly what I'm proposing for TELL_HOST.

I know, but I think the use negotiated part is unclear.

 Now rereading it, it may be correct, but it is not clear enough.

 Perhaps my patch is even too verbose, but it doesn't leave anything open
 for interpretation.
 
 I'm fine with adding more clarifications but I don't yet see why
 do we need a new bit.

There are three cases:

1) the drivers is not able to tell the host first (or never tell the
host at all), like the Windows driver or the Google fileballoon driver.
 If the host always wants to be told first (e.g. a hypothetical
virtio-balloon running on Xen) it should somehow prevent these drivers
from running.

2) the driver will always tell the host first, like the Linux driver.
The host can trust the guest to do the right thing.

3) the driver wants to optimize if the host can be told last (or not
told altogether).  Again, the host can trust the guest to do the right
thing, but there are two possible behaviors for the guest driver.

Case (3) would be a trivial optimization to implement on the Linux
driver for example, but one could also imagine switching the
implementation entirely: use something like Luiz's shrinker if the host
needs to be told, use something like Google's fileballoon if it doesn't.

The existing bit lets the host distinguish 1 from 2+3.  The other bit is
needed for the guest to pick the right behavior in case 3.

Paolo
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-balloon spec: rework VIRTIO_BALLOON_F_MUST_TELL_HOST feature, support silent deflation

2013-05-28 Thread Michael S. Tsirkin
On Tue, May 28, 2013 at 01:13:03PM +0200, Paolo Bonzini wrote:
 Il 28/05/2013 12:45, Michael S. Tsirkin ha scritto:
  On Tue, May 28, 2013 at 10:38:35AM +0200, Paolo Bonzini wrote:
  Il 27/05/2013 19:02, Michael S. Tsirkin ha scritto:
  So if we don't want to require all guests to tell host
  first, all we need to do is admit it's not a bug.
 
  I think we want the possibility for the host to require that.
 
  But why? TELL_HOST makes some optimizations possible, but if
  guest won't cooperate, balloon is useless anyway.
 
  If the guest won't tell host and still propose the feature,
  
  Ack feature but don't tell host? That would be a clear guest bug.
  AFAIK that's not what windows drivers do.
  Am I wrong?
 
 Yes.  I think we are in agreement on this part.
 
  Please see
  [PATCH] virtio-spec: balloon: MUST_TELL_HOST is optional
  that does exactly this.
 
  That patch mandates a change in guest behavior that is not compatible
  with the existing Windows driver.  Mine doesn't.
 
  Hmm I don't see it.
  In fact the goal was to document the Windows driver behaviour
  as correct. Can you explain the incompatibility please?
 
  Whenever If the X feature is (not) negotiated is used in the spec, it
  means in general you should be ready to implement both behaviors, or
  perhaps the guest should fail to initialize if the feature is not 
  available.
  
  you meaning host. Yes.
 
 Even guest.  A virtio-net guest driver should be ready to use an older
 method if the host doesn't support merged rx buffers, for example.
 
 In this case, a tell host first guest has to do nothing special if the
 host doesn't advertise the feature.  It is a bit different from other
 uses of negotiated in the spec.

If you just want to add this line
It's legal for guest to tell host first even if
 it did not acknowledge the TELL_HOST feature
then that's fine.

  But here guest can tell host first *always if it wants to - it will
  just be a bit slower when reusing pages from balloon.
  If it acked the feature, it *must* tell host first.
 
 Yes.
 
  The way I read it yesterday I didn't see any change from the current
  specification, so the problem of having a negative feature remains.
  
  This is standard behaviour:
  
  - guest can ignore any feature that it does not ack
  - host must implement both behaviours for guests that
do and for guests that do not ack features
  
  This is exactly what I'm proposing for TELL_HOST.
 
 I know, but I think the use negotiated part is unclear.

negotiated in spec means present and acked by guest.
We can try and replace negotiated by present and acked by guest
everywhere - think it will be clearer?

  Now rereading it, it may be correct, but it is not clear enough.
 
  Perhaps my patch is even too verbose, but it doesn't leave anything open
  for interpretation.
  
  I'm fine with adding more clarifications but I don't yet see why
  do we need a new bit.
 
 There are three cases:
 
 1) the drivers is not able to tell the host first (or never tell the
 host at all), like the Windows driver or the Google fileballoon driver.
  If the host always wants to be told first (e.g. a hypothetical
 virtio-balloon running on Xen) it should somehow prevent these drivers
 from running.

I don't think hosts that always want to be told first can exist.
Handling guests that don't tell first is super easy -
e.g. just don't do anything.

 2) the driver will always tell the host first, like the Linux driver.
 The host can trust the guest to do the right thing.

It should not if guest did not ack the feature bit.
And no host we ever released thankfully does this.

 3) the driver wants to optimize if the host can be told last (or not
 told altogether).  Again, the host can trust the guest to do the right
 thing, but there are two possible behaviors for the guest driver.
 
 Case (3) would be a trivial optimization to implement on the Linux
 driver for example, but one could also imagine switching the
 implementation entirely: use something like Luiz's shrinker if the host
 needs to be told, use something like Google's fileballoon if it doesn't.
 
 The existing bit lets the host distinguish 1 from 2+3.  The other bit is
 needed for the guest to pick the right behavior in case 3.
 
 Paolo

1 does not exist.

And guests simply do not treat the existing bit as your spec patch says
they should. Instead they treat it as they would treat your new bit.  So
let's just make spec match driver behaviour.  Less work for everyone,
and no need for the new bit.

-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-balloon spec: rework VIRTIO_BALLOON_F_MUST_TELL_HOST feature, support silent deflation

2013-05-28 Thread Paolo Bonzini
Il 28/05/2013 13:44, Michael S. Tsirkin ha scritto:
 negotiated in spec means present and acked by guest.
 We can try and replace negotiated by present and acked by guest
 everywhere - think it will be clearer?

No, I understand what negotiated means.  But in this case, negotiated
is not the word that you want.

 Now rereading it, it may be correct, but it is not clear enough.

 Perhaps my patch is even too verbose, but it doesn't leave anything open
 for interpretation.

 I'm fine with adding more clarifications but I don't yet see why
 do we need a new bit.

 There are three cases:

 1) the drivers is not able to tell the host first (or never tell the
 host at all), like the Windows driver or the Google fileballoon driver.
  If the host always wants to be told first (e.g. a hypothetical
 virtio-balloon running on Xen) it should somehow prevent these drivers
 from running.
 
 I don't think hosts that always want to be told first can exist.
 Handling guests that don't tell first is super easy -
 e.g. just don't do anything.

Of course you can work around it and not do anything.  This doesn't
change the fact that the host needs to know whether to actually balloon
out pages, or fake it.

 2) the driver will always tell the host first, like the Linux driver.
 The host can trust the guest to do the right thing.
 
 It should not if guest did not ack the feature bit.

Of course.

1 = guest doesn't ack feature bit, host may have to fake ballooning
2 or 3 = guest acks feature bit, host assumes that guest will tell first

 3) the driver wants to optimize if the host can be told last (or not
 told altogether).  Again, the host can trust the guest to do the right
 thing, but there are two possible behaviors for the guest driver.

 The existing bit lets the host distinguish 1 from 2+3.  The other bit is
 needed for the guest to pick the right behavior in case 3.
 
 1 does not exist.

1 does not exist in the sense that it's always possible to work around
the problem.  But you need to know when to work around it.  In that
sense, 1 exists.

 And guests simply do not treat the existing bit as your spec patch says
 they should. Instead they treat it as they would treat your new bit.

How so?  I even changed the name of the existing bit to
VIRTIO_F_GUEST_TELLS_HOST.

Paolo
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-balloon spec: rework VIRTIO_BALLOON_F_MUST_TELL_HOST feature, support silent deflation

2013-05-28 Thread Michael S. Tsirkin
On Tue, May 28, 2013 at 02:04:03PM +0200, Paolo Bonzini wrote:
 Il 28/05/2013 13:44, Michael S. Tsirkin ha scritto:
  negotiated in spec means present and acked by guest.
  We can try and replace negotiated by present and acked by guest
  everywhere - think it will be clearer?
 
 No, I understand what negotiated means.  But in this case, negotiated
 is not the word that you want.
 
  Now rereading it, it may be correct, but it is not clear enough.
 
  Perhaps my patch is even too verbose, but it doesn't leave anything open
  for interpretation.
 
  I'm fine with adding more clarifications but I don't yet see why
  do we need a new bit.
 
  There are three cases:
 
  1) the drivers is not able to tell the host first (or never tell the
  host at all), like the Windows driver or the Google fileballoon driver.
   If the host always wants to be told first (e.g. a hypothetical
  virtio-balloon running on Xen) it should somehow prevent these drivers
  from running.
  
  I don't think hosts that always want to be told first can exist.
  Handling guests that don't tell first is super easy -
  e.g. just don't do anything.
 
 Of course you can work around it and not do anything.  This doesn't
 change the fact that the host needs to know whether to actually balloon
 out pages, or fake it.
 
  2) the driver will always tell the host first, like the Linux driver.
  The host can trust the guest to do the right thing.
  
  It should not if guest did not ack the feature bit.
 
 Of course.
 
 1 = guest doesn't ack feature bit, host may have to fake ballooning
 2 or 3 = guest acks feature bit, host assumes that guest will tell first
 
  3) the driver wants to optimize if the host can be told last (or not
  told altogether).  Again, the host can trust the guest to do the right
  thing, but there are two possible behaviors for the guest driver.
 
  The existing bit lets the host distinguish 1 from 2+3.  The other bit is
  needed for the guest to pick the right behavior in case 3.
  
  1 does not exist.
 
 1 does not exist in the sense that it's always possible to work around
 the problem.  But you need to know when to work around it.  In that
 sense, 1 exists.
 
  And guests simply do not treat the existing bit as your spec patch says
  they should. Instead they treat it as they would treat your new bit.
 
 How so?  I even changed the name of the existing bit to
 VIRTIO_F_GUEST_TELLS_HOST.
 
 Paolo


At this point I am confused. I think there are two changes in your patch:

1. Handling of VIRTIO_F_GUEST_MUST_TELL_HOST
 Is this functionally identical to what I proposed?
 If yes, I am fine with either change being applied.

2. New SILENT_DEFLATE feature
 Since guest can get same functionality by not acking
 TELL_HOST, I still don't see what good it does:
 Historically a host with no features supports silent
 deflate and guest with no features can do silent deflate.
 I conclude silent deflate is the default behaviour for
 both host and guest, and we can't change default without
 breaking compatibility.

How about splitting the patches so we can discuss them separately?

-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-balloon spec: rework VIRTIO_BALLOON_F_MUST_TELL_HOST feature, support silent deflation

2013-05-28 Thread Paolo Bonzini
Il 28/05/2013 15:32, Michael S. Tsirkin ha scritto:
 At this point I am confused. I think there are two changes in your patch:
 
 1. Handling of VIRTIO_F_GUEST_MUST_TELL_HOST
  Is this functionally identical to what I proposed?
  If yes, I am fine with either change being applied.

Yes.

 2. New SILENT_DEFLATE feature
  Since guest can get same functionality by not acking
  TELL_HOST, I still don't see what good it does:
  Historically a host with no features supports silent
  deflate and guest with no features can do silent deflate.
  I conclude silent deflate is the default behaviour for
  both host and guest, and we can't change default without
  breaking compatibility.

You're right that for correctness the existing feature is enough:
if it is not negotiated by the guest, the host ensures correctness by
only giving the guest a fake balloon.

However, the new feature is about optimization, not correctness.
In fact, VIRTIO_BALLOON_F_SILENT_DEFLATE is the optimization
feature that VIRTIO_BALLOON_F_MUST_TELL_HOST was meant to be.

What I'm interested in, is drivers that can _optionally_ use silent 
deflation (as an optimization).  These should not get a fake balloon!

With the new feature bit, these drivers should propose both
VIRTIO_BALLOON_F_GUEST_TELLS_HOST and VIRTIO_BALLOON_F_SILENT_DEFLATE.
The driver can then use silent  deflation if and only if the host
has negotiated  VIRTIO_BALLOON_F_SILENT_DEFLATE too.  Like this:

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index bd3ae32..05fe948 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -186,12 +186,8 @@ static void leak_balloon(struct virtio_balloon *vb, size_t 
num)
vb-num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
}
 
-   /*
-* Note that if
-* virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
-* is true, we *have* to do it in this order
-*/
-   tell_host(vb, vb-deflate_vq);
+   if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_SILENT_DEFLATE)
+   tell_host(vb, vb-deflate_vq);
mutex_unlock(vb-balloon_lock);
release_pages_by_pfn(vb-pfns, vb-num_pfns);
 }
@@ -543,6 +539,7 @@ static int virtballoon_restore(struct virtio_device *vdev)
 static unsigned int features[] = {
VIRTIO_BALLOON_F_MUST_TELL_HOST,
VIRTIO_BALLOON_F_STATS_VQ,
+   VIRTIO_BALLOON_F_SILENT_DEFLATE,
 };
 
 static struct virtio_driver virtio_balloon_driver = {


Of course with the current implementation of the balloon it does not
matter much.  But for example, with Luiz's work, releasing pages as soon
as the shrinker is called will increase effectiveness of the shrinker.
At the same time, not all is lost if the guest prefers not to allow
silent deflation (e.g. because there is an assigned device).

On old hosts, a guest that can optionally use silent deflation will
not use it.  That's the same as for any other feature bit.

 How about splitting the patches so we can discuss them separately?

I can do that, but I hope the above clarifies it.

Paolo
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v8, part3 12/14] mm: correctly update zone-mamaged_pages

2013-05-28 Thread Liu Jiang
Hi Sergei,
Thanks for review!

On 05/27/2013 09:57 PM, Sergei Shtylyov wrote:
 On 26-05-2013 17:38, Jiang Liu wrote:
 
Typo in the subject: s/mamaged_pages/managed_pages/.
Will fix it in next version.

 
 Enhance adjust_managed_page_count() to adjust totalhigh_pages for
 highmem pages. And change code which directly adjusts totalram_pages
 to use adjust_managed_page_count() because it adjusts totalram_pages,
 totalhigh_pages and zone-managed_pages altogether in a safe way.
 
 Remove inc_totalhigh_pages() and dec_totalhigh_pages() from xen/balloon
 driver bacause adjust_managed_page_count() has already adjusted
 totalhigh_pages.
 
 This patch also fixes two bugs:
 1) enhances virtio_balloon driver to adjust totalhigh_pages when
 reserve/unreserve pages.
 2) enhance memory_hotplug.c to adjust totalhigh_pages when hot-removing
 memory.
 
 We still need to deal with modifications of totalram_pages in file
 arch/powerpc/platforms/pseries/cmm.c, but need help from PPC experts.
 
 Signed-off-by: Jiang Liu jiang@huawei.com
 Cc: Chris Metcalf cmetc...@tilera.com
 Cc: Rusty Russell ru...@rustcorp.com.au
 Cc: Michael S. Tsirkin m...@redhat.com
 Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 Cc: Jeremy Fitzhardinge jer...@goop.org
 Cc: Wen Congyang we...@cn.fujitsu.com
 Cc: Andrew Morton a...@linux-foundation.org
 Cc: Tang Chen tangc...@cn.fujitsu.com
 Cc: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
 Cc: Mel Gorman mgor...@suse.de
 Cc: Minchan Kim minc...@kernel.org
 Cc: linux-ker...@vger.kernel.org
 Cc: virtualization@lists.linux-foundation.org
 Cc: xen-de...@lists.xensource.com
 Cc: linux...@kvack.org
 ---
   drivers/virtio/virtio_balloon.c |  8 +---
   drivers/xen/balloon.c   | 23 +--
   mm/hugetlb.c|  2 +-
   mm/memory_hotplug.c | 16 +++-
   mm/page_alloc.c | 10 +-
   5 files changed, 19 insertions(+), 40 deletions(-)
 
 diff --git a/drivers/virtio/virtio_balloon.c
 b/drivers/virtio/virtio_balloon.c
 index bd3ae32..6649968 100644
 --- a/drivers/virtio/virtio_balloon.c
 +++ b/drivers/virtio/virtio_balloon.c
 [...]
 @@ -160,11 +160,13 @@ static void fill_balloon(struct virtio_balloon
 *vb, size_t num)
   static void release_pages_by_pfn(const u32 pfns[], unsigned int num)
   {
   unsigned int i;
 +struct page *page;
 
Why not declare it right in the *for* loop? You could use intializer
 then...
Good suggestion, will change it in next version.

 

   /* Find pfns pointing at start of each page, get pages and free
 them. */
   for (i = 0; i  num; i += VIRTIO_BALLOON_PAGES_PER_PAGE) {
 -balloon_page_free(balloon_pfn_to_page(pfns[i]));
 -totalram_pages++;
 +page = balloon_pfn_to_page(pfns[i]);
 +balloon_page_free(page);
 +adjust_managed_page_count(page, 1);
   }
   }

 [...]
 
 WBR, Sergei
 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-balloon spec: rework VIRTIO_BALLOON_F_MUST_TELL_HOST feature, support silent deflation

2013-05-28 Thread Michael S. Tsirkin
On Tue, May 28, 2013 at 04:06:02PM +0200, Paolo Bonzini wrote:
 Il 28/05/2013 15:32, Michael S. Tsirkin ha scritto:
  At this point I am confused. I think there are two changes in your patch:
  
  1. Handling of VIRTIO_F_GUEST_MUST_TELL_HOST
   Is this functionally identical to what I proposed?
   If yes, I am fine with either change being applied.
 
 Yes.
 
  2. New SILENT_DEFLATE feature
   Since guest can get same functionality by not acking
   TELL_HOST, I still don't see what good it does:
   Historically a host with no features supports silent
   deflate and guest with no features can do silent deflate.
   I conclude silent deflate is the default behaviour for
   both host and guest, and we can't change default without
   breaking compatibility.
 
 You're right that for correctness the existing feature is enough:
 if it is not negotiated by the guest, the host ensures correctness by
 only giving the guest a fake balloon.
 
 However, the new feature is about optimization, not correctness.
 In fact, VIRTIO_BALLOON_F_SILENT_DEFLATE is the optimization
 feature that VIRTIO_BALLOON_F_MUST_TELL_HOST was meant to be.
 
 What I'm interested in, is drivers that can _optionally_ use silent 
 deflation (as an optimization).  These should not get a fake balloon!
 
 With the new feature bit, these drivers should propose both
 VIRTIO_BALLOON_F_GUEST_TELLS_HOST and VIRTIO_BALLOON_F_SILENT_DEFLATE.
 The driver can then use silent  deflation if and only if the host
 has negotiated  VIRTIO_BALLOON_F_SILENT_DEFLATE too.  Like this:
 
 diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
 index bd3ae32..05fe948 100644
 --- a/drivers/virtio/virtio_balloon.c
 +++ b/drivers/virtio/virtio_balloon.c
 @@ -186,12 +186,8 @@ static void leak_balloon(struct virtio_balloon *vb, 
 size_t num)
   vb-num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
   }
  
 - /*
 -  * Note that if
 -  * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
 -  * is true, we *have* to do it in this order
 -  */
 - tell_host(vb, vb-deflate_vq);
 + if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_SILENT_DEFLATE)
 + tell_host(vb, vb-deflate_vq);
   mutex_unlock(vb-balloon_lock);
   release_pages_by_pfn(vb-pfns, vb-num_pfns);
  }
 @@ -543,6 +539,7 @@ static int virtballoon_restore(struct virtio_device *vdev)
  static unsigned int features[] = {
   VIRTIO_BALLOON_F_MUST_TELL_HOST,
   VIRTIO_BALLOON_F_STATS_VQ,
 + VIRTIO_BALLOON_F_SILENT_DEFLATE,
  };
  
  static struct virtio_driver virtio_balloon_driver = {
 
 
 Of course with the current implementation of the balloon it does not
 matter much.  But for example, with Luiz's work, releasing pages as soon
 as the shrinker is called will increase effectiveness of the shrinker.
 At the same time, not all is lost if the guest prefers not to allow
 silent deflation (e.g. because there is an assigned device).
 
 On old hosts, a guest that can optionally use silent deflation will
 not use it.  That's the same as for any other feature bit.
 
  How about splitting the patches so we can discuss them separately?
 
 I can do that, but I hope the above clarifies it.
 
 Paolo

Maybe I'm just dense.
Let's see the split spec patchset?

-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-balloon spec: rework VIRTIO_BALLOON_F_MUST_TELL_HOST feature, support silent deflation

2013-05-28 Thread Paolo Bonzini
Il 28/05/2013 16:29, Michael S. Tsirkin ha scritto:
 On Tue, May 28, 2013 at 04:06:02PM +0200, Paolo Bonzini wrote:
 Il 28/05/2013 15:32, Michael S. Tsirkin ha scritto:
 At this point I am confused. I think there are two changes in your patch:

 1. Handling of VIRTIO_F_GUEST_MUST_TELL_HOST
  Is this functionally identical to what I proposed?
  If yes, I am fine with either change being applied.

 Yes.

 2. New SILENT_DEFLATE feature
  Since guest can get same functionality by not acking
  TELL_HOST, I still don't see what good it does:
  Historically a host with no features supports silent
  deflate and guest with no features can do silent deflate.
  I conclude silent deflate is the default behaviour for
  both host and guest, and we can't change default without
  breaking compatibility.

 You're right that for correctness the existing feature is enough:
 if it is not negotiated by the guest, the host ensures correctness by
 only giving the guest a fake balloon.

 However, the new feature is about optimization, not correctness.
 In fact, VIRTIO_BALLOON_F_SILENT_DEFLATE is the optimization
 feature that VIRTIO_BALLOON_F_MUST_TELL_HOST was meant to be.

 What I'm interested in, is drivers that can _optionally_ use silent 
 deflation (as an optimization).  These should not get a fake balloon!

 With the new feature bit, these drivers should propose both
 VIRTIO_BALLOON_F_GUEST_TELLS_HOST and VIRTIO_BALLOON_F_SILENT_DEFLATE.
 The driver can then use silent  deflation if and only if the host
 has negotiated  VIRTIO_BALLOON_F_SILENT_DEFLATE too.  Like this:

 diff --git a/drivers/virtio/virtio_balloon.c 
 b/drivers/virtio/virtio_balloon.c
 index bd3ae32..05fe948 100644
 --- a/drivers/virtio/virtio_balloon.c
 +++ b/drivers/virtio/virtio_balloon.c
 @@ -186,12 +186,8 @@ static void leak_balloon(struct virtio_balloon *vb, 
 size_t num)
  vb-num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
  }
  
 -/*
 - * Note that if
 - * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
 - * is true, we *have* to do it in this order
 - */
 -tell_host(vb, vb-deflate_vq);
 +if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_SILENT_DEFLATE)
 +tell_host(vb, vb-deflate_vq);
  mutex_unlock(vb-balloon_lock);
  release_pages_by_pfn(vb-pfns, vb-num_pfns);
  }
 @@ -543,6 +539,7 @@ static int virtballoon_restore(struct virtio_device 
 *vdev)
  static unsigned int features[] = {
  VIRTIO_BALLOON_F_MUST_TELL_HOST,
  VIRTIO_BALLOON_F_STATS_VQ,
 +VIRTIO_BALLOON_F_SILENT_DEFLATE,
  };
  
  static struct virtio_driver virtio_balloon_driver = {


 Of course with the current implementation of the balloon it does not
 matter much.  But for example, with Luiz's work, releasing pages as soon
 as the shrinker is called will increase effectiveness of the shrinker.
 At the same time, not all is lost if the guest prefers not to allow
 silent deflation (e.g. because there is an assigned device).

 On old hosts, a guest that can optionally use silent deflation will
 not use it.  That's the same as for any other feature bit.

 How about splitting the patches so we can discuss them separately?

 I can do that, but I hope the above clarifies it.
 
 Maybe I'm just dense.
 Let's see the split spec patchset?

What's unclear exactly?  I'm not sure the spec patchset improves things
that much, I can split it in two or three (change old feature, add new
feature, add explanation) but it's not like changing logic in a program.

Paolo

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-balloon spec: rework VIRTIO_BALLOON_F_MUST_TELL_HOST feature, support silent deflation

2013-05-28 Thread Michael S. Tsirkin
On Tue, May 28, 2013 at 04:32:44PM +0200, Paolo Bonzini wrote:
 Il 28/05/2013 16:29, Michael S. Tsirkin ha scritto:
  On Tue, May 28, 2013 at 04:06:02PM +0200, Paolo Bonzini wrote:
  Il 28/05/2013 15:32, Michael S. Tsirkin ha scritto:
  At this point I am confused. I think there are two changes in your patch:
 
  1. Handling of VIRTIO_F_GUEST_MUST_TELL_HOST
   Is this functionally identical to what I proposed?
   If yes, I am fine with either change being applied.
 
  Yes.
 
  2. New SILENT_DEFLATE feature
   Since guest can get same functionality by not acking
   TELL_HOST, I still don't see what good it does:
   Historically a host with no features supports silent
   deflate and guest with no features can do silent deflate.
   I conclude silent deflate is the default behaviour for
   both host and guest, and we can't change default without
   breaking compatibility.
 
  You're right that for correctness the existing feature is enough:
  if it is not negotiated by the guest, the host ensures correctness by
  only giving the guest a fake balloon.
 
  However, the new feature is about optimization, not correctness.
  In fact, VIRTIO_BALLOON_F_SILENT_DEFLATE is the optimization
  feature that VIRTIO_BALLOON_F_MUST_TELL_HOST was meant to be.
 
  What I'm interested in, is drivers that can _optionally_ use silent 
  deflation (as an optimization).  These should not get a fake balloon!
 
  With the new feature bit, these drivers should propose both
  VIRTIO_BALLOON_F_GUEST_TELLS_HOST and VIRTIO_BALLOON_F_SILENT_DEFLATE.
  The driver can then use silent  deflation if and only if the host
  has negotiated  VIRTIO_BALLOON_F_SILENT_DEFLATE too.  Like this:
 
  diff --git a/drivers/virtio/virtio_balloon.c 
  b/drivers/virtio/virtio_balloon.c
  index bd3ae32..05fe948 100644
  --- a/drivers/virtio/virtio_balloon.c
  +++ b/drivers/virtio/virtio_balloon.c
  @@ -186,12 +186,8 @@ static void leak_balloon(struct virtio_balloon *vb, 
  size_t num)
 vb-num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
 }
   
  -  /*
  -   * Note that if
  -   * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
  -   * is true, we *have* to do it in this order
  -   */
  -  tell_host(vb, vb-deflate_vq);
  +  if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_SILENT_DEFLATE)
  +  tell_host(vb, vb-deflate_vq);
 mutex_unlock(vb-balloon_lock);
 release_pages_by_pfn(vb-pfns, vb-num_pfns);
   }
  @@ -543,6 +539,7 @@ static int virtballoon_restore(struct virtio_device 
  *vdev)
   static unsigned int features[] = {
 VIRTIO_BALLOON_F_MUST_TELL_HOST,
 VIRTIO_BALLOON_F_STATS_VQ,
  +  VIRTIO_BALLOON_F_SILENT_DEFLATE,
   };
   
   static struct virtio_driver virtio_balloon_driver = {
 
 
  Of course with the current implementation of the balloon it does not
  matter much.  But for example, with Luiz's work, releasing pages as soon
  as the shrinker is called will increase effectiveness of the shrinker.
  At the same time, not all is lost if the guest prefers not to allow
  silent deflation (e.g. because there is an assigned device).
 
  On old hosts, a guest that can optionally use silent deflation will
  not use it.  That's the same as for any other feature bit.
 
  How about splitting the patches so we can discuss them separately?
 
  I can do that, but I hope the above clarifies it.
  
  Maybe I'm just dense.
  Let's see the split spec patchset?
 
 What's unclear exactly?  I'm not sure the spec patchset improves things
 that much, I can split it in two or three (change old feature, add new
 feature, add explanation) but it's not like changing logic in a program.
 
 Paolo

Both your code and what you say here about the new bit seem to break
compatibility with old hosts and guests.
Again, could be just me seeing things.
If it's in spec, I think it would be clearer what are we trying to
achieve, and how.

-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-05-28 Thread Michael S. Tsirkin
This adds support for new config, and is designed to work with
the new layout code in Rusty's new layout branch.

At the moment all fields are in the same memory BAR (bar 2).
This will be used to test performance and compare
memory, io and hypercall latency.

Compiles but does not work yet.
Migration isn't handled yet.

It's not clear what do queue_enable/queue_disable
fields do, not yet implemented.

Gateway for config access with config cycles
not yet implemented.

Sending out for early review/flames.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/virtio/virtio-pci.c | 393 +++--
 hw/virtio/virtio-pci.h |  55 +++
 hw/virtio/virtio.c |  20 +++
 include/hw/virtio/virtio.h |   4 +
 4 files changed, 458 insertions(+), 14 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 752991a..f4db224 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -259,6 +259,26 @@ static void virtio_pci_stop_ioeventfd(VirtIOPCIProxy 
*proxy)
 proxy-ioeventfd_started = false;
 }
 
+static void virtio_pci_set_status(VirtIOPCIProxy *proxy, uint8_t val)
+{
+VirtIODevice *vdev = proxy-vdev;
+
+if (!(val  VIRTIO_CONFIG_S_DRIVER_OK)) {
+virtio_pci_stop_ioeventfd(proxy);
+}
+
+virtio_set_status(vdev, val  0xFF);
+
+if (val  VIRTIO_CONFIG_S_DRIVER_OK) {
+virtio_pci_start_ioeventfd(proxy);
+}
+
+if (vdev-status == 0) {
+virtio_reset(proxy-vdev);
+msix_unuse_all_vectors(proxy-pci_dev);
+}
+}
+
 static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
 {
 VirtIOPCIProxy *proxy = opaque;
@@ -293,20 +313,7 @@ static void virtio_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
 }
 break;
 case VIRTIO_PCI_STATUS:
-if (!(val  VIRTIO_CONFIG_S_DRIVER_OK)) {
-virtio_pci_stop_ioeventfd(proxy);
-}
-
-virtio_set_status(vdev, val  0xFF);
-
-if (val  VIRTIO_CONFIG_S_DRIVER_OK) {
-virtio_pci_start_ioeventfd(proxy);
-}
-
-if (vdev-status == 0) {
-virtio_reset(proxy-vdev);
-msix_unuse_all_vectors(proxy-pci_dev);
-}
+virtio_pci_set_status(proxy, val);
 
 /* Linux before 2.6.34 sets the device as OK without enabling
the PCI device bus master bit. In this case we need to disable
@@ -455,6 +462,226 @@ static void virtio_pci_config_write(void *opaque, hwaddr 
addr,
 }
 }
 
+static uint64_t virtio_pci_config_common_read(void *opaque, hwaddr addr,
+  unsigned size)
+{
+VirtIOPCIProxy *proxy = opaque;
+VirtIODevice *vdev = proxy-vdev;
+
+uint64_t low = 0xull;
+
+switch (addr) {
+case offsetof(struct virtio_pci_common_cfg, device_feature_select):
+return proxy-device_feature_select;
+case offsetof(struct virtio_pci_common_cfg, device_feature):
+/* TODO: 64-bit features */
+   return proxy-device_feature_select ? 0 : proxy-host_features;
+case offsetof(struct virtio_pci_common_cfg, guest_feature_select):
+return proxy-guest_feature_select;
+case offsetof(struct virtio_pci_common_cfg, guest_feature):
+/* TODO: 64-bit features */
+   return proxy-guest_feature_select ? 0 : vdev-guest_features;
+case offsetof(struct virtio_pci_common_cfg, msix_config):
+   return vdev-config_vector;
+case offsetof(struct virtio_pci_common_cfg, num_queues):
+/* TODO: more exact limit? */
+   return VIRTIO_PCI_QUEUE_MAX;
+case offsetof(struct virtio_pci_common_cfg, device_status):
+return vdev-status;
+
+   /* About a specific virtqueue. */
+case offsetof(struct virtio_pci_common_cfg, queue_select):
+return  vdev-queue_sel;
+case offsetof(struct virtio_pci_common_cfg, queue_size):
+return virtio_queue_get_num(vdev, vdev-queue_sel);
+case offsetof(struct virtio_pci_common_cfg, queue_msix_vector):
+   return virtio_queue_vector(vdev, vdev-queue_sel);
+case offsetof(struct virtio_pci_common_cfg, queue_enable):
+/* TODO */
+   return 0;
+case offsetof(struct virtio_pci_common_cfg, queue_notify_off):
+return vdev-queue_sel;
+case offsetof(struct virtio_pci_common_cfg, queue_desc):
+return virtio_queue_get_desc_addr(vdev, vdev-queue_sel)  low;
+case offsetof(struct virtio_pci_common_cfg, queue_desc) + 4:
+return virtio_queue_get_desc_addr(vdev, vdev-queue_sel)  32;
+case offsetof(struct virtio_pci_common_cfg, queue_avail):
+return virtio_queue_get_avail_addr(vdev, vdev-queue_sel)  low;
+case offsetof(struct virtio_pci_common_cfg, queue_avail) + 4:
+return virtio_queue_get_avail_addr(vdev, vdev-queue_sel)  32;
+case offsetof(struct virtio_pci_common_cfg, queue_used):
+return virtio_queue_get_used_addr(vdev, vdev-queue_sel)  low;
+case offsetof(struct 

Re: [PATCH] virtio-balloon spec: rework VIRTIO_BALLOON_F_MUST_TELL_HOST feature, support silent deflation

2013-05-28 Thread Michael S. Tsirkin
On Tue, May 28, 2013 at 06:23:19PM +0200, Paolo Bonzini wrote:
 Il 28/05/2013 17:09, Michael S. Tsirkin ha scritto:
  On Tue, May 28, 2013 at 04:32:44PM +0200, Paolo Bonzini wrote:
  Il 28/05/2013 16:29, Michael S. Tsirkin ha scritto:
  On Tue, May 28, 2013 at 04:06:02PM +0200, Paolo Bonzini wrote:
  Il 28/05/2013 15:32, Michael S. Tsirkin ha scritto:
  At this point I am confused. I think there are two changes in your 
  patch:
 
  1. Handling of VIRTIO_F_GUEST_MUST_TELL_HOST
   Is this functionally identical to what I proposed?
   If yes, I am fine with either change being applied.
 
  Yes.
 
  2. New SILENT_DEFLATE feature
   Since guest can get same functionality by not acking
   TELL_HOST, I still don't see what good it does:
   Historically a host with no features supports silent
   deflate and guest with no features can do silent deflate.
   I conclude silent deflate is the default behaviour for
   both host and guest, and we can't change default without
   breaking compatibility.
 
  You're right that for correctness the existing feature is enough:
  if it is not negotiated by the guest, the host ensures correctness by
  only giving the guest a fake balloon.
 
  However, the new feature is about optimization, not correctness.
  In fact, VIRTIO_BALLOON_F_SILENT_DEFLATE is the optimization
  feature that VIRTIO_BALLOON_F_MUST_TELL_HOST was meant to be.
 
  What I'm interested in, is drivers that can _optionally_ use silent 
  deflation (as an optimization).  These should not get a fake balloon!
 
  With the new feature bit, these drivers should propose both
  VIRTIO_BALLOON_F_GUEST_TELLS_HOST and VIRTIO_BALLOON_F_SILENT_DEFLATE.
  The driver can then use silent  deflation if and only if the host
  has negotiated  VIRTIO_BALLOON_F_SILENT_DEFLATE too.  Like this:
 
  diff --git a/drivers/virtio/virtio_balloon.c 
  b/drivers/virtio/virtio_balloon.c
  index bd3ae32..05fe948 100644
  --- a/drivers/virtio/virtio_balloon.c
  +++ b/drivers/virtio/virtio_balloon.c
  @@ -186,12 +186,8 @@ static void leak_balloon(struct virtio_balloon *vb, 
  size_t num)
   vb-num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
   }
   
  -/*
  - * Note that if
  - * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
  - * is true, we *have* to do it in this order
  - */
  -tell_host(vb, vb-deflate_vq);
  +if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_SILENT_DEFLATE)
  +tell_host(vb, vb-deflate_vq);
   mutex_unlock(vb-balloon_lock);
   release_pages_by_pfn(vb-pfns, vb-num_pfns);
   }
  @@ -543,6 +539,7 @@ static int virtballoon_restore(struct virtio_device 
  *vdev)
   static unsigned int features[] = {
   VIRTIO_BALLOON_F_MUST_TELL_HOST,
   VIRTIO_BALLOON_F_STATS_VQ,
  +VIRTIO_BALLOON_F_SILENT_DEFLATE,
   };
   
   static struct virtio_driver virtio_balloon_driver = {
 
 
  Of course with the current implementation of the balloon it does not
  matter much.  But for example, with Luiz's work, releasing pages as soon
  as the shrinker is called will increase effectiveness of the shrinker.
  At the same time, not all is lost if the guest prefers not to allow
  silent deflation (e.g. because there is an assigned device).
 
  On old hosts, a guest that can optionally use silent deflation will
  not use it.  That's the same as for any other feature bit.
 
  How about splitting the patches so we can discuss them separately?
 
  I can do that, but I hope the above clarifies it.
 
  Maybe I'm just dense.
  Let's see the split spec patchset?
 
  What's unclear exactly?  I'm not sure the spec patchset improves things
  that much, I can split it in two or three (change old feature, add new
  feature, add explanation) but it's not like changing logic in a program.
 
  Paolo
  
  Both your code and what you say here about the new bit seem to break
  compatibility with old hosts and guests.
 
 What is the exact scenario that you have in mind?

Existing host follows spec, advertises MUST_TELL_HOST (only)
guest acks that and still does not tell host.

 Here are all the possibilities:

Basically it looks like besides TELL_HOST you want another bit
DONT_TELL_HOST. This just seems weird, and interactions
between the two become very complex. Look at the amount of
text in this thread.


 
 - host that requires tell-first for a real balloon (doesn't
 propose new bit), any guest (doesn't matter if they are old
 or new since the new bit is never negotiated).
 
 Here, the old text suggested that the host need not do anything special,
 but it wouldn't have worked with Windows guests.  The host has to
 provide a fake balloon, or prevent the driver from working (e.g. hide
 the virtqueues).  So this is a change indeed, but the same change is
 present with your 1-word change too.  We have:
 
 negotiated old bit  host operation  guest operation
   F fake balloonneed not 

Re: [PATCH] virtio-balloon spec: rework VIRTIO_BALLOON_F_MUST_TELL_HOST feature, support silent deflation

2013-05-28 Thread Paolo Bonzini
Il 28/05/2013 18:50, Michael S. Tsirkin ha scritto:
 Both your code and what you say here about the new bit seem to break
 compatibility with old hosts and guests.

 What is the exact scenario that you have in mind?
 
 Existing host follows spec, advertises MUST_TELL_HOST (only)
 guest acks that and still does not tell host.

Guest bug.  Guest should only do so if it sees SILENT_DEFLATE too.

 Here are all the possibilities:
 
 Basically it looks like besides TELL_HOST you want another bit
 DONT_TELL_HOST. This just seems weird, and interactions
 between the two become very complex. Look at the amount of
 text in this thread.

Well, there are three cases, you need two bits.

The amount of text is because I'm writing the same thing 20 times in
different ways.  I'm still more satisfied with the text in the patch
than with anything I've written here.  It's possible I contradicted
myself, in fact.

 If it's in spec, I think it would be clearer what are we trying to
 achieve, and how.

 Having one or three patches doesn't change the final text...
 
 It changes the fact that we can stop arguing about
 the thing we agree on (making TELL_HOST optional
 for guests).
 
 We can separately argue about the one we don't seem
 to agree on (need for a new SILENT_DEFLATE).

Okay.  If you agree that your patch does entail a change for old hosts,
that would be a step forward indeed.

Paolo
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-05-28 Thread Anthony Liguori
Michael S. Tsirkin m...@redhat.com writes:

 This adds support for new config, and is designed to work with
 the new layout code in Rusty's new layout branch.

 At the moment all fields are in the same memory BAR (bar 2).
 This will be used to test performance and compare
 memory, io and hypercall latency.

 Compiles but does not work yet.
 Migration isn't handled yet.

 It's not clear what do queue_enable/queue_disable
 fields do, not yet implemented.

 Gateway for config access with config cycles
 not yet implemented.

 Sending out for early review/flames.

 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---
  hw/virtio/virtio-pci.c | 393 
 +++--
  hw/virtio/virtio-pci.h |  55 +++
  hw/virtio/virtio.c |  20 +++
  include/hw/virtio/virtio.h |   4 +
  4 files changed, 458 insertions(+), 14 deletions(-)

 diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
 index 752991a..f4db224 100644
 --- a/hw/virtio/virtio-pci.c
 +++ b/hw/virtio/virtio-pci.c
 @@ -259,6 +259,26 @@ static void virtio_pci_stop_ioeventfd(VirtIOPCIProxy 
 *proxy)
  proxy-ioeventfd_started = false;
  }
  
 +static void virtio_pci_set_status(VirtIOPCIProxy *proxy, uint8_t val)
 +{
 +VirtIODevice *vdev = proxy-vdev;
 +
 +if (!(val  VIRTIO_CONFIG_S_DRIVER_OK)) {
 +virtio_pci_stop_ioeventfd(proxy);
 +}
 +
 +virtio_set_status(vdev, val  0xFF);
 +
 +if (val  VIRTIO_CONFIG_S_DRIVER_OK) {
 +virtio_pci_start_ioeventfd(proxy);
 +}
 +
 +if (vdev-status == 0) {
 +virtio_reset(proxy-vdev);
 +msix_unuse_all_vectors(proxy-pci_dev);
 +}
 +}
 +
  static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
  {
  VirtIOPCIProxy *proxy = opaque;
 @@ -293,20 +313,7 @@ static void virtio_ioport_write(void *opaque, uint32_t 
 addr, uint32_t val)
  }
  break;
  case VIRTIO_PCI_STATUS:
 -if (!(val  VIRTIO_CONFIG_S_DRIVER_OK)) {
 -virtio_pci_stop_ioeventfd(proxy);
 -}
 -
 -virtio_set_status(vdev, val  0xFF);
 -
 -if (val  VIRTIO_CONFIG_S_DRIVER_OK) {
 -virtio_pci_start_ioeventfd(proxy);
 -}
 -
 -if (vdev-status == 0) {
 -virtio_reset(proxy-vdev);
 -msix_unuse_all_vectors(proxy-pci_dev);
 -}
 +virtio_pci_set_status(proxy, val);
  
  /* Linux before 2.6.34 sets the device as OK without enabling
 the PCI device bus master bit. In this case we need to disable
 @@ -455,6 +462,226 @@ static void virtio_pci_config_write(void *opaque, 
 hwaddr addr,
  }
  }
  
 +static uint64_t virtio_pci_config_common_read(void *opaque, hwaddr addr,
 +  unsigned size)
 +{
 +VirtIOPCIProxy *proxy = opaque;
 +VirtIODevice *vdev = proxy-vdev;
 +
 +uint64_t low = 0xull;
 +
 +switch (addr) {
 +case offsetof(struct virtio_pci_common_cfg, device_feature_select):
 +return proxy-device_feature_select;

Oh dear no...  Please use defines like the rest of QEMU.

From a QEMU pov, take a look at:

https://github.com/aliguori/qemu/commit/587c35c1a3fe90f6af0f97927047ef4d3182a659

And:

https://github.com/aliguori/qemu/commit/01ba80a23cf2eb1e15056f82b44b94ec381565cb

Which lets virtio-pci be subclassable and then remaps the config space to
BAR2.

Haven't looked at the proposed new ring layout yet.

Regards,

Anthony Liguori

 +case offsetof(struct virtio_pci_common_cfg, device_feature):
 +/* TODO: 64-bit features */
 + return proxy-device_feature_select ? 0 : proxy-host_features;
 +case offsetof(struct virtio_pci_common_cfg, guest_feature_select):
 +return proxy-guest_feature_select;
 +case offsetof(struct virtio_pci_common_cfg, guest_feature):
 +/* TODO: 64-bit features */
 + return proxy-guest_feature_select ? 0 : vdev-guest_features;
 +case offsetof(struct virtio_pci_common_cfg, msix_config):
 + return vdev-config_vector;
 +case offsetof(struct virtio_pci_common_cfg, num_queues):
 +/* TODO: more exact limit? */
 + return VIRTIO_PCI_QUEUE_MAX;
 +case offsetof(struct virtio_pci_common_cfg, device_status):
 +return vdev-status;
 +
 + /* About a specific virtqueue. */
 +case offsetof(struct virtio_pci_common_cfg, queue_select):
 +return  vdev-queue_sel;
 +case offsetof(struct virtio_pci_common_cfg, queue_size):
 +return virtio_queue_get_num(vdev, vdev-queue_sel);
 +case offsetof(struct virtio_pci_common_cfg, queue_msix_vector):
 + return virtio_queue_vector(vdev, vdev-queue_sel);
 +case offsetof(struct virtio_pci_common_cfg, queue_enable):
 +/* TODO */
 + return 0;
 +case offsetof(struct virtio_pci_common_cfg, queue_notify_off):
 +return vdev-queue_sel;
 +case offsetof(struct virtio_pci_common_cfg, queue_desc):
 +return virtio_queue_get_desc_addr(vdev, 

Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-05-28 Thread Michael S. Tsirkin
On Tue, May 28, 2013 at 12:15:16PM -0500, Anthony Liguori wrote:
  @@ -455,6 +462,226 @@ static void virtio_pci_config_write(void *opaque, 
  hwaddr addr,
   }
   }
   
  +static uint64_t virtio_pci_config_common_read(void *opaque, hwaddr addr,
  +  unsigned size)
  +{
  +VirtIOPCIProxy *proxy = opaque;
  +VirtIODevice *vdev = proxy-vdev;
  +
  +uint64_t low = 0xull;
  +
  +switch (addr) {
  +case offsetof(struct virtio_pci_common_cfg, device_feature_select):
  +return proxy-device_feature_select;
 
 Oh dear no...  Please use defines like the rest of QEMU.

Any good reason not to use offsetof?
I see about 138 examples in qemu.

Anyway, that's the way Rusty wrote it in the kernel header -
I started with defines.
If you convince Rusty to switch I can switch too,
but I prefer reusing same structures as linux even if
for now I've given up on reusing same headers.


 From a QEMU pov, take a look at:
 
 https://github.com/aliguori/qemu/commit/587c35c1a3fe90f6af0f97927047ef4d3182a659
 
 And:
 
 https://github.com/aliguori/qemu/commit/01ba80a23cf2eb1e15056f82b44b94ec381565cb
 
 Which lets virtio-pci be subclassable and then remaps the config space to
 BAR2.


Interesting. Have the spec anywhere?
You are saying this is going to conflict because
of BAR2 usage, yes?

So let's only do this virtio-fb only for new layout, so we don't need
to maintain compatibility. In particular, we are working
on making memory BAR access fast for virtio devices
in a generic way. At the moment they are measureably slower
than PIO on x86.


 Haven't looked at the proposed new ring layout yet.
 
 Regards,

No new ring layout. It's new config layout.


-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 0/2] virtio-balloon spec: silent deflation

2013-05-28 Thread Paolo Bonzini
Here is the series, split in two patches, with small edits and new
commit messages.

Paolo Bonzini (2):
  virtio-balloon spec: rewrite description of
VIRTIO_BALLOON_F_MUST_TELL_HOST
  virtio-balloon spec: reintroduce silent deflation feature

 virtio-spec.lyx | 238 ++--
 1 file changed, 233 insertions(+), 5 deletions(-)

-- 
1.8.2.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 1/2] virtio-balloon spec: rewrite description of VIRTIO_BALLOON_F_MUST_TELL_HOST

2013-05-28 Thread Paolo Bonzini
The idea of the VIRTIO_BALLOON_F_MUST_TELL_HOST feature was to let drivers
skip usage of the deflate queue when leaking the balloon (silent
deflation).  Guests may benefit from silent deflate by aggressively
inflating the balloon; they know that they will be able to use ballooned
pages without issuing a (blocking) request to the device.

The original spec assumed that every driver supports
VIRTIO_BALLOON_F_MUST_TELL_HOST, but this was not explicitly documented
and in practice it turns out not to be the case; the Windows balloon
driver does not tell the host correctly.

Since all known device implementations support silent deflation, they
do not negotiate the feature and we are thus free to redefine what
the host should do about this feature.

The Windows driver is also the only known driver that does not
negotiate VIRTIO_BALLOON_F_MUST_TELL_HOST.  Thus, even though the
used to be meant for communication from the host, known drivers are
really using it to communicate was in the other direction.

Adjust the spec to conform.  The original intent is reintroduced with
a new feature bit in the next patch, while also fixing a problem with
its definition.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 virtio-spec.lyx | 62 -
 1 file changed, 57 insertions(+), 5 deletions(-)

diff --git a/virtio-spec.lyx b/virtio-spec.lyx
index adec0a5..5c76a87 100644
--- a/virtio-spec.lyx
+++ b/virtio-spec.lyx
@@ -7219,11 +7219,46 @@ bits
 
 \begin_deeper
 \begin_layout Description
-VIRTIO_BALLOON_F_MUST_TELL_HOST
+VIRTIO_BALLOON_F_
+\change_deleted 1531152142 1347020601
+MUST
+\change_inserted 1531152142 1347020602
+CAN
+\change_unchanged
+_TELL_HOST
 \begin_inset space ~
 \end_inset
 
-(0) Host must be told before pages from the balloon are used.
+(0) 
+\change_deleted 1531152142 1347020625
+Host must be told
+\change_inserted 1531152142 1347020617
+Guest is able to tell host
+\change_unchanged
+ before pages from the balloon are used.
+
+\change_inserted 1531152142 1368005603
+ The host must propose this feature if it has to be told
+ before pages from the balloon are used.
+\begin_inset Foot
+status open
+
+\begin_layout Plain Layout
+
+\change_inserted 1531152142 1347022389
+This feature used to be named VIRTIO_BALLOON_F_\SpecialChar \-
+MUST_TELL_HOST.
+ However, after a few years it was observed that drivers were not using
+ it as specified.
+ The virtio-balloon spec was then adjusted to what the drivers had been
+ doing.
+\end_layout
+
+\end_inset
+
+
+\change_unchanged
+
 \end_layout
 
 \begin_layout Description
@@ -7382,9 +7417,15 @@ The driver constructs an array of addresses of memory 
pages it has previously
 \end_layout
 
 \begin_layout Enumerate
-If the VIRTIO_BALLOON_F_MUST_TELL_HOST feature is set, the guest may not
- use these requested pages until that descriptor in the deflateq has been
- used by the device.
+If the VIRTIO_BALLOON_F_
+\change_deleted 1531152142 1369761770
+MUST
+\change_inserted 1531152142 1369761770
+CAN
+\change_unchanged
+_TELL_HOST feature is set, the guest may not use these requested pages until
+ that descriptor in the deflateq has been used by the device.
+
 \end_layout
 
 \begin_layout Enumerate
@@ -7396,10 +7437,21 @@ status open
 
 \begin_layout Plain Layout
 In this case, deflation advice is merely a courtesy
+\change_inserted 1531152142 1369761798
+.
+ The guest need not use the deflateq at all.
+\change_unchanged
+
 \end_layout
 
 \end_inset
 
+
+\change_inserted 1531152142 1369761801
+ If the host does not support this, it should not do anything when the balloon
+ is inflated or deflated, except put the descriptors on the used ring.
+
+\change_unchanged
  
 \end_layout
 
-- 
1.8.2.1


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 2/2] virtio-balloon spec: reintroduce silent deflation feature

2013-05-28 Thread Paolo Bonzini
The original idea of the VIRTIO_BALLOON_F_MUST_TELL_HOST feature was to
let drivers skip usage of the deflate queue when leaking the balloon
(silent deflation).  Guests may benefit from silent deflate by
aggressively inflating the balloon; they know that they will be able to
use ballooned pages without issuing a (blocking) request to the device.

The previous patch redefined the feature to ensure correctness of the
operation when drivers do not correctly report deflation.  This patch
adds back the optimization.

The new feature bit is for the host to tell the drivers if silent
deflation is actually supported.  The meaning of the feature bit is
reversed compared to the original, because the original meaning was
not safe against migration.

For features to be safe against migration, they have to be defined as
this is true if the guest _can_ do X.  For such a positive feature,
migration is possible if the destination supports it, or the source
didn't set it:

dest support  source set  ok?
  TT  T
  TF  T
  FT  F
  FF  T

Instead, the old feature was defined as this is true if the guest
_cannot_ do X.  For such a negative feature, migration is possible
if the destination supports it, or the source sets it:

dest support  source set  ok?
  TT  T
  TF  F
  FT  T
  FF  T

However, the negotiated features are supposed to be the AND of the
device- and driver-supported features.  In the F/T case, the feature
would be negotiated by the source as T, and become F when negotiated on
the destination.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 virtio-spec.lyx | 197 
 1 file changed, 197 insertions(+)

diff --git a/virtio-spec.lyx b/virtio-spec.lyx
index 5c76a87..5da4fde 100644
--- a/virtio-spec.lyx
+++ b/virtio-spec.lyx
@@ -7267,6 +7267,20 @@ VIRTIO_BALLOON_F_STATS_VQ
 \end_inset
 
 (1) A virtqueue for reporting guest memory statistics is present.
+\change_inserted 1531152142 1347020627
+
+\end_layout
+
+\begin_layout Description
+
+\change_inserted 1531152142 1347020648
+VIRTIO_BALLOON_F_SILENT_DEFLATE
+\begin_inset space ~
+\end_inset
+
+(2) Guest need not tell host before pages from the balloon are used.
+\change_unchanged
+
 \end_layout
 
 \end_deeper
@@ -7627,6 +7641,168 @@ VIRTIO_BALLOON_S_MEMFREE The amount of memory not being 
used for any purpose
 VIRTIO_BALLOON_S_MEMTOT The total amount of memory available (in bytes).
 \end_layout
 
+\begin_layout Description
+
+\change_inserted 1531152142 1368005515
+Silent
+\begin_inset space ~
+\end_inset
+
+deflation
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1531152142 1368005515
+
+\series medium
+Some implementation of the balloon device may not require the guest to deflate
+ the balloon explicitly; instead, the guest may just take a page from its
+ reserve and start using it.
+ This is called 
+\begin_inset Quotes eld
+\end_inset
+
+silent deflate
+\begin_inset Quotes erd
+\end_inset
+
+
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1531152142 1368005515
+In order to use this feature effectively, both the guest and the host need
+ to know how the other part intends to use the balloon.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1531152142 1368005515
+
+\series medium
+Guests may benefit from silent deflate by aggressively inflating the balloon;
+ they know that they will be able to use ballooned pages without issuing
+ a (blocking) request to the device.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1531152142 1368005515
+Knowing that the guest will 
+\emph on
+not
+\emph default
+ deflate silently also benefits the host.
+ For example, if the host is pinning the guest's memory, it may unpin ballooned
+ pages and pin them again upon deflation.
+ This allows cooperative memory overcommit even if the guest's memory is
+ pinned.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1531152142 1368005515
+Thus, there are two possibilities for the host (either it does not support
+ silent deflate, or it does), and three for the guest (it doesn't need silent
+ deflate, it may choose to use it if available, it requires it).
+ Because there are three possibilities for the guest, support for silent
+ deflate is represented by two different feature bits.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1531152142 1368005515
+The feature bits are used as follows:
+\end_layout
+
+\begin_layout Itemize
+
+\change_inserted 1531152142 1368005671
+if the host does not support silent deflate, it must propose the
+VIRTIO_\SpecialChar \-
+BALLOON_F_\SpecialChar \-
+CAN_\SpecialChar \-
+TELL_\SpecialChar \-
+HOST feature.
+ If the guest 

Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-05-28 Thread Paolo Bonzini
Il 28/05/2013 19:32, Michael S. Tsirkin ha scritto:
   +
   +switch (addr) {
   +case offsetof(struct virtio_pci_common_cfg, device_feature_select):
   +return proxy-device_feature_select;
  
  Oh dear no...  Please use defines like the rest of QEMU.
 Any good reason not to use offsetof?

I'm not sure it's portable to use it in case labels.  IIRC, the
definition ((int)(((T *)0)-field)) is not a valid C integer constant
expression.  Laszlo?

 I see about 138 examples in qemu.

Almost all of them are about fields in the host, not the guest, and none
of them are in case statements.

Paolo

 
 Anyway, that's the way Rusty wrote it in the kernel header -
 I started with defines.
 If you convince Rusty to switch I can switch too,
 but I prefer reusing same structures as linux even if
 for now I've given up on reusing same headers.
 
 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 1/2] virtio-balloon spec: rewrite description of VIRTIO_BALLOON_F_MUST_TELL_HOST

2013-05-28 Thread Michael S. Tsirkin
On Tue, May 28, 2013 at 07:40:17PM +0200, Paolo Bonzini wrote:
 The idea of the VIRTIO_BALLOON_F_MUST_TELL_HOST feature was to let drivers
 skip usage of the deflate queue when leaking the balloon (silent
 deflation).  Guests may benefit from silent deflate by aggressively
 inflating the balloon; they know that they will be able to use ballooned
 pages without issuing a (blocking) request to the device.
 
 The original spec assumed that every driver supports
 VIRTIO_BALLOON_F_MUST_TELL_HOST, but this was not explicitly documented
 and in practice it turns out not to be the case; the Windows balloon
 driver does not tell the host correctly.
 
 Since all known device implementations support silent deflation, they
 do not negotiate the feature and we are thus free to redefine what
 the host should do about this feature.
 
 The Windows driver is also the only known driver that does not
 negotiate VIRTIO_BALLOON_F_MUST_TELL_HOST.  Thus, even though the
 used to be meant for communication from the host, known drivers are
 really using it to communicate was in the other direction.
 
 Adjust the spec to conform.  The original intent is reintroduced with
 a new feature bit in the next patch, while also fixing a problem with
 its definition.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com

What you write in spec below and what you write above seems to
contradict.

Look, how about the simpler patch that I sent:
[PATCH] virtio-spec: balloon: MUST_TELL_HOST is optional

does it functionally do everything you want in this patch?

If yes maybe you could pick that up, and add
a patch with renames and text clarifications on top?

More comments below.

 ---
  virtio-spec.lyx | 62 
 -
  1 file changed, 57 insertions(+), 5 deletions(-)
 
 diff --git a/virtio-spec.lyx b/virtio-spec.lyx
 index adec0a5..5c76a87 100644
 --- a/virtio-spec.lyx
 +++ b/virtio-spec.lyx
 @@ -7219,11 +7219,46 @@ bits
  
  \begin_deeper
  \begin_layout Description
 -VIRTIO_BALLOON_F_MUST_TELL_HOST
 +VIRTIO_BALLOON_F_
 +\change_deleted 1531152142 1347020601
 +MUST
 +\change_inserted 1531152142 1347020602
 +CAN
 +\change_unchanged
 +_TELL_HOST
  \begin_inset space ~
  \end_inset
  
 -(0) Host must be told before pages from the balloon are used.
 +(0) 
 +\change_deleted 1531152142 1347020625
 +Host must be told
 +\change_inserted 1531152142 1347020617
 +Guest is able to tell host
 +\change_unchanged
 + before pages from the balloon are used.
 +

This can and is able is IMO more confusing than clarifying.
Let's be definite. If feature is negotiated, guest must tell host.
If it's not, it does not.

That's why it's MUST not CAN.

And text should be

When negotiated, guest tells host
before pages from the balloon are used.



 +\change_inserted 1531152142 1368005603
 + The host must propose this feature

We don't say propose feature anyway in the spec.
You really mean set this feature bit ?

 if it has to be told
 + before pages from the balloon are used.

Has to?  Not at all. It can't assume anything
until guest has negotiated the feature.
So it should be
if it wants to be told before pages from the balloon are used.


 +\begin_inset Foot
 +status open
 +
 +\begin_layout Plain Layout
 +
 +\change_inserted 1531152142 1347022389
 +This feature used to be named VIRTIO_BALLOON_F_\SpecialChar \-
 +MUST_TELL_HOST.
 + However, after a few years it was observed that drivers were not using
 + it as specified.
 + The virtio-balloon spec was then adjusted to what the drivers had been
 + doing.

I'm not sure what good does this historical note do us.
I would say:
Historically the spec required all drivers
to acknowledge this feature bit.
However, no known hypervisor relies on this
feature bit being acknowledged unconditionally.
Thus, it's safe for drivers to ignore the TELL_HOST
feature.


 +\end_layout
 +
 +\end_inset
 +
 +
 +\change_unchanged
 +
  \end_layout
  
  \begin_layout Description
 @@ -7382,9 +7417,15 @@ The driver constructs an array of addresses of memory 
 pages it has previously
  \end_layout
  
  \begin_layout Enumerate
 -If the VIRTIO_BALLOON_F_MUST_TELL_HOST feature is set, the guest may not
 - use these requested pages until that descriptor in the deflateq has been
 - used by the device.
 +If the VIRTIO_BALLOON_F_
 +\change_deleted 1531152142 1369761770
 +MUST
 +\change_inserted 1531152142 1369761770
 +CAN
 +\change_unchanged
 +_TELL_HOST feature is set, the guest may not use these requested pages until
 + that descriptor in the deflateq has been used by the device.

Not if it is set. If it's *negotiated*.

 +
  \end_layout
  
  \begin_layout Enumerate
 @@ -7396,10 +7437,21 @@ status open
  
  \begin_layout Plain Layout
  In this case, deflation advice is merely a courtesy
 +\change_inserted 1531152142 1369761798
 +.
 + The guest need not use the deflateq at all.
 +\change_unchanged
 +
  \end_layout
  
  \end_inset
  
 +
 +\change_inserted 

Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-05-28 Thread Anthony Liguori
Michael S. Tsirkin m...@redhat.com writes:

 On Tue, May 28, 2013 at 12:15:16PM -0500, Anthony Liguori wrote:
  @@ -455,6 +462,226 @@ static void virtio_pci_config_write(void *opaque, 
  hwaddr addr,
   }
   }
   
  +static uint64_t virtio_pci_config_common_read(void *opaque, hwaddr addr,
  +  unsigned size)
  +{
  +VirtIOPCIProxy *proxy = opaque;
  +VirtIODevice *vdev = proxy-vdev;
  +
  +uint64_t low = 0xull;
  +
  +switch (addr) {
  +case offsetof(struct virtio_pci_common_cfg, device_feature_select):
  +return proxy-device_feature_select;
 
 Oh dear no...  Please use defines like the rest of QEMU.

 Any good reason not to use offsetof?
 I see about 138 examples in qemu.

There are exactly zero:

$ find . -name *.c -exec grep -l case offset {} \;
$

 Anyway, that's the way Rusty wrote it in the kernel header -
 I started with defines.
 If you convince Rusty to switch I can switch too,

We have 300+ devices in QEMU that use #defines.  We're not using this
kind of thing just because you want to copy code from the kernel.

 https://github.com/aliguori/qemu/commit/587c35c1a3fe90f6af0f97927047ef4d3182a659
 
 And:
 
 https://github.com/aliguori/qemu/commit/01ba80a23cf2eb1e15056f82b44b94ec381565cb
 
 Which lets virtio-pci be subclassable and then remaps the config space to
 BAR2.


 Interesting. Have the spec anywhere?

Not yet, but working on that.

 You are saying this is going to conflict because
 of BAR2 usage, yes?

No, this whole thing is flexible.  I had to use BAR2 because BAR0 has to
be the vram mapping.  It also had to be an MMIO bar.

The new layout might make it easier to implement a device like this.  I
shared it mainly because I wanted to show the subclassing idea vs. just
tacking an option onto the existing virtio-pci code in QEMU.

Regards,

Anthony Liguori

 So let's only do this virtio-fb only for new layout, so we don't need
 to maintain compatibility. In particular, we are working
 on making memory BAR access fast for virtio devices
 in a generic way. At the moment they are measureably slower
 than PIO on x86.


 Haven't looked at the proposed new ring layout yet.
 
 Regards,

 No new ring layout. It's new config layout.


 -- 
 MST
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-05-28 Thread Michael S. Tsirkin
On Tue, May 28, 2013 at 01:53:35PM -0500, Anthony Liguori wrote:
 Michael S. Tsirkin m...@redhat.com writes:
 
  On Tue, May 28, 2013 at 12:15:16PM -0500, Anthony Liguori wrote:
   @@ -455,6 +462,226 @@ static void virtio_pci_config_write(void *opaque, 
   hwaddr addr,
}
}

   +static uint64_t virtio_pci_config_common_read(void *opaque, hwaddr addr,
   +  unsigned size)
   +{
   +VirtIOPCIProxy *proxy = opaque;
   +VirtIODevice *vdev = proxy-vdev;
   +
   +uint64_t low = 0xull;
   +
   +switch (addr) {
   +case offsetof(struct virtio_pci_common_cfg, device_feature_select):
   +return proxy-device_feature_select;
  
  Oh dear no...  Please use defines like the rest of QEMU.
 
  Any good reason not to use offsetof?
  I see about 138 examples in qemu.
 
 There are exactly zero:
 
 $ find . -name *.c -exec grep -l case offset {} \;
 $
 
  Anyway, that's the way Rusty wrote it in the kernel header -
  I started with defines.
  If you convince Rusty to switch I can switch too,
 
 We have 300+ devices in QEMU that use #defines.  We're not using this
 kind of thing just because you want to copy code from the kernel.

Rusty, let's switch to defines?

  https://github.com/aliguori/qemu/commit/587c35c1a3fe90f6af0f97927047ef4d3182a659
  
  And:
  
  https://github.com/aliguori/qemu/commit/01ba80a23cf2eb1e15056f82b44b94ec381565cb
  
  Which lets virtio-pci be subclassable and then remaps the config space to
  BAR2.
 
 
  Interesting. Have the spec anywhere?
 
 Not yet, but working on that.
 
  You are saying this is going to conflict because
  of BAR2 usage, yes?
 
 No, this whole thing is flexible.  I had to use BAR2 because BAR0 has to
 be the vram mapping.  It also had to be an MMIO bar.
 
 The new layout might make it easier to implement a device like this.

Absolutely, you can put thing anywhere you like.

  I
 shared it mainly because I wanted to show the subclassing idea vs. just
 tacking an option onto the existing virtio-pci code in QEMU.
 
 Regards,
 
 Anthony Liguori

I don't think a subclass will work for the new layout. This should be
completely transparent to users, just have more devices
work with more flexibility.

  So let's only do this virtio-fb only for new layout, so we don't need
  to maintain compatibility. In particular, we are working
  on making memory BAR access fast for virtio devices
  in a generic way. At the moment they are measureably slower
  than PIO on x86.
 
 
  Haven't looked at the proposed new ring layout yet.
  
  Regards,
 
  No new ring layout. It's new config layout.
 
 
  -- 
  MST
  --
  To unsubscribe from this list: send the line unsubscribe kvm in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-05-28 Thread Laszlo Ersek
On 05/28/13 19:43, Paolo Bonzini wrote:
 Il 28/05/2013 19:32, Michael S. Tsirkin ha scritto:
 +
 +switch (addr) {
 +case offsetof(struct virtio_pci_common_cfg, device_feature_select):
 +return proxy-device_feature_select;

 Oh dear no...  Please use defines like the rest of QEMU.
 Any good reason not to use offsetof?
 
 I'm not sure it's portable to use it in case labels.  IIRC, the
 definition ((int)(((T *)0)-field)) is not a valid C integer constant
 expression.  Laszlo?

As long as we use the offsetof() that comes with the C implementation
(ie. we don't hack it up ourselves), we should be safe; ISO C99
elegantly defines offsetof() in 7.17 Common definitions stddef.h p3
as

  The macros are [...]

offsetof(type, member-designator)

  which expands to an integer constant expression that has type
  *size_t*, the value of which is the offset in bytes, to the structure
  member (designated by /member-designator/), from the beginning of its
  structure (designated by /type/). The type and member designator shall
  be such that given

static type t;

  then the expression (t.member-designator) evaluates to an address
  constant. (If the specified member is a bit-field, the behavior is
  undefined.)

(Address constant is described in 6.6p9 but quoting even that here
doesn't seem necesary.)

From 6.8.4.2p3, The expression of each case label shall be an integer
constant expression [...].

So,
(a) if we use the standard offsetof(),
(b) and you can also write, for example,

  static struct virtio_pci_common_cfg t;
  static void *p = t.device_feature_select;

then the construct seems valid.

(Consistently with the above mention of UB if the specified member is a
bit-field: the address-of operator's constraints also forbid a bit-field
object as operand.)

Laszlo
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [Qemu-devel] [PATCH v2 00/11] qemu: use virtio linux headers in portable code

2013-05-28 Thread Rusty Russell
Anthony Liguori anth...@codemonkey.ws writes:
 Rusty Russell ru...@rustcorp.com.au writes:

 Anthony Liguori anth...@codemonkey.ws writes:
 Paolo Bonzini pbonz...@redhat.com writes:

 Il 26/05/2013 22:02, Michael S. Tsirkin ha scritto:
  My fault.  I should have looked at linux/types.h (actually 
  asm-generic/).
 
 Not really, __uX appear in the headers that were posted.

 Which is a problem because this is a reserved namespace in C99.

 Personally, I find it hard to care.  What matters is not what the
 standard has carved out, but whether we have clashes, reserved namespace
 or no.  And that won't happen for these.

 If someone wants to convert all the kernel headers, I won't NAK it
 though.

 virtio headers are special.  Linux headers are normally only consumed in
 the kernel or in a userspace application running on Linux.

 virtio headers may be used either in a userspace application running on
 !Linux (we need to support QEMU on Windows) or even in a foreign kernel.

No.

s/virtio/SCSI/.  s/virtio/if_eth/.  s/virtio/TCPIP/.

Cheers,
Rusty.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-05-28 Thread Rusty Russell
Paolo Bonzini pbonz...@redhat.com writes:
 Il 28/05/2013 19:32, Michael S. Tsirkin ha scritto:
   +
   +switch (addr) {
   +case offsetof(struct virtio_pci_common_cfg, 
   device_feature_select):
   +return proxy-device_feature_select;
  
  Oh dear no...  Please use defines like the rest of QEMU.
 Any good reason not to use offsetof?

 I'm not sure it's portable to use it in case labels.  IIRC, the
 definition ((int)(((T *)0)-field)) is not a valid C integer constant
 expression.  Laszlo?

It's defined to yield an integer constant expression in the ISO standard
(and I think ANSI too, though that's not at hand):

7.19, para 3:

...offsetof(type, member-designator)
which expands to an integer constant expression that has type
size_t, ...

The real question is whether compilers qemu cares about meet the
standard (there's some evidence that older compilers fail this).  If
not, we'll have to define them as raw offsets...

Cheers,
Rusty.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-05-28 Thread Rusty Russell
Anthony Liguori aligu...@us.ibm.com writes:
 Michael S. Tsirkin m...@redhat.com writes:
 +case offsetof(struct virtio_pci_common_cfg, device_feature_select):
 +return proxy-device_feature_select;

 Oh dear no...  Please use defines like the rest of QEMU.

It is pretty ugly.

Yet the structure definitions are descriptive, capturing layout, size
and endianness in natural a format readable by any C programmer.

So AFAICT the question is, do we put the required

#define VIRTIO_PCI_CFG_FEATURE_SEL \
 (offsetof(struct virtio_pci_common_cfg, device_feature_select))

etc. in the kernel headers or qemu?

 Haven't looked at the proposed new ring layout yet.

No change, but there's an open question on whether we should nail it to
little endian (or define the endian by the transport).

Of course, I can't rule out that the 1.0 standard *may* decide to frob
the ring layout somehow, but I'd think it would require a compelling
reason.  I suggest that's 2.0 material...

Cheers,
Rusty.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: updated: kvm networking todo wiki

2013-05-28 Thread Rusty Russell
Michael S. Tsirkin m...@redhat.com writes:
 On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
 Michael S. Tsirkin m...@redhat.com writes:
 
  On Fri, May 24, 2013 at 05:41:11PM +0800, Jason Wang wrote:
  On 05/23/2013 04:50 PM, Michael S. Tsirkin wrote:
   Hey guys,
   I've updated the kvm networking todo wiki with current projects.
   Will try to keep it up to date more often.
   Original announcement below.
  
  Thanks a lot. I've added the tasks I'm currently working on to the wiki.
  
  btw. I notice the virtio-net data plane were missed in the wiki. Is the
  project still being considered?
 
  It might have been interesting several years ago, but now that linux has
  vhost-net in kernel, the only point seems to be to
  speed up networking on non-linux hosts.
 
 Data plane just means having a dedicated thread for virtqueue processing
 that doesn't hold qemu_mutex.
 
 Of course we're going to do this in QEMU.  It's a no brainer.  But not
 as a separate device, just as an improvement to the existing userspace
 virtio-net.
 
  Since non-linux does not have kvm, I doubt virtio is a bottleneck.
 
 FWIW, I think what's more interesting is using vhost-net as a networking
 backend with virtio-net in QEMU being what's guest facing.
 
 In theory, this gives you the best of both worlds: QEMU acts as a first
 line of defense against a malicious guest while still getting the
 performance advantages of vhost-net (zero-copy).

 Great idea, that sounds very intresting.

 I'll add it to the wiki.

 In fact a bit of complexity in vhost was put there in the vague hope to
 support something like this: virtio rings are not translated through
 regular memory tables, instead, vhost gets a pointer to ring address.

 This allows qemu acting as a man in the middle,
 verifying the descriptors but not touching the

 Anyone interested in working on such a project?

It would be an interesting idea if we didn't already have the vhost
model where we don't need the userspace bounce.  We already have two
sets of host side ring code in the kernel (vhost and vringh, though
they're being unified).

All an accelerator can offer on the tx side is zero copy and direct
update of the used ring.  On rx userspace could register the buffers and
the accelerator could fill them and update the used ring.  It still
needs to deal with merged buffers, for example.

You avoid the address translation in the kernel, but I'm not convinced
that's a key problem.

Cheers,
Rusty.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC 7/11] virtio_pci: new, capability-aware driver.

2013-05-28 Thread Rusty Russell
Michael S. Tsirkin m...@redhat.com writes:

 On Mon, Dec 12, 2011 at 01:49:13PM +0200, Michael S. Tsirkin wrote:
 On Mon, Dec 12, 2011 at 09:15:03AM +1030, Rusty Russell wrote:
  On Sun, 11 Dec 2011 11:42:56 +0200, Michael S. Tsirkin m...@redhat.com 
  wrote:
   On Thu, Dec 08, 2011 at 09:09:33PM +1030, Rusty Russell wrote:
+/* There is no iowrite64.  We use two 32-bit ops. */
+static void iowrite64(u64 val, const __le64 *addr)
+{
+  iowrite32((u32)val, (__le32 *)addr);
+  iowrite32(val  32, (__le32 *)addr + 1);
+}
+
   
   Let's put addr_lo/addr_hi in the structure then,
   to make the fact this field is not atomic explicit?
  
  Good point, assuming I haven't missed something.
  
  Are 64-bit accesses actually unknown in PCI-land?  Or is this a limited
  availability thing?
  
  Thanks,
  Rusty.
 
 I think PCI can optionally support atomic 64 bit accesses, but not all
 architectures can generate them.

 Ping. Going to change this in the layout struct?

Not sure what you mean  We use a queue_enable field, so it doesn't
matter if the accesses aren't atomic for that.

Cheers,
Rusty.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [Qemu-devel] [PATCH v2 00/11] qemu: use virtio linux headers in portable code

2013-05-28 Thread Rusty Russell
Anthony Liguori anth...@codemonkey.ws writes:
 The headers say they are BSD licensed... but they include a GPLv2+
 header.  Doesn't make a lot of sense, does it?

It makes perfect sense: you're overthinking it.  It just means that
copying the BSD headers outside Linux is encouraged.

And it's clearly nonsensical to claim the GPL on kernel headers means
that userspace needs to be GPL.  So please ignore this licensing
red-herring.

And we'll bikeshed the headers in the standard when we have to :)  They
certainly don't need to be cutpaste into the kernel sources.

Cheers,
Rusty.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization