Re: [Qemu-devel] [PATCH v4 1/1] vhost user: add support of live migration

2015-07-12 Thread Linhaifeng


On 2015/7/10 21:05, Paolo Bonzini wrote:


On 26/06/2015 11:22, Thibaut Collet wrote:

Some vhost client/backend are able to support live migration.
To provide this service the following features must be added:
1. Add the VIRTIO_NET_F_GUEST_ANNOUNCE capability to vhost-net when the netdev
backend is vhost-user.
2. Provide a nop receive callback to vhost-user. This callback is for the RARP
packets automatically sent by qemu_announce_self after a migration.
These packets are useless for vhost-user and are just discarded.

When a packet is received by vhost-user, the vhost-user backend writes the
packet into guest memory.  QEMU must then copy that page of guest memory
from source to destination; it uses a dirty bitmap for this purpose.

How does vhost-user do this?  I can see this patch providing enough
support for *non*live migration.  However, it cannot be enough for live
migration unless I'm missing something obvious.

Paolo
Agreed. vhost-user should mmap the log memory and mark dirty pages when
sending or receiving packets.

Signed-off-by: Thibaut Collet thibaut.col...@6wind.com
---
  hw/net/vhost_net.c |2 ++
  net/vhost-user.c   |   21 +++--
  2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 9bd360b..668c422 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -85,6 +85,8 @@ static const int user_feature_bits[] = {
  VIRTIO_NET_F_CTRL_MAC_ADDR,
  VIRTIO_NET_F_CTRL_GUEST_OFFLOADS,
  
+VIRTIO_NET_F_GUEST_ANNOUNCE,

+
  VIRTIO_NET_F_MQ,
  
  VHOST_INVALID_FEATURE_BIT

diff --git a/net/vhost-user.c b/net/vhost-user.c
index b51bc04..20778a1 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -65,6 +65,24 @@ static void vhost_user_stop(VhostUserState *s)
  s->vhost_net = 0;
  }
  
+static ssize_t vhost_user_receive(NetClientState *nc, const uint8_t *buf,
+  size_t size)
+{
+/* A live migration is done. Display an error if the packet is not a RARP.
+ * RARPs are just discarded: the guest is already notified of the live
+ * migration by the virtio-net NIC or by the vhost-user backend. */
+if (size != 60) {
+static int display_trace = 1;
+
+if (display_trace) {
+fprintf(stderr, "Vhost user receives unexpected packets\n");
+fflush(stderr);
+display_trace = 0;
+}
+}
+return size;
+}
+
  static void vhost_user_cleanup(NetClientState *nc)
  {
  VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
@@ -90,6 +108,7 @@ static bool vhost_user_has_ufo(NetClientState *nc)
  static NetClientInfo net_vhost_user_info = {
  .type = NET_CLIENT_OPTIONS_KIND_VHOST_USER,
  .size = sizeof(VhostUserState),
+.receive = vhost_user_receive,
  .cleanup = vhost_user_cleanup,
  .has_vnet_hdr = vhost_user_has_vnet_hdr,
  .has_ufo = vhost_user_has_ufo,
@@ -146,8 +165,6 @@ static int net_vhost_user_init(NetClientState *peer, const 
char *device,
  
  s = DO_UPCAST(VhostUserState, nc, nc);
  
-/* We don't provide a receive callback */
-s->nc.receive_disabled = 1;
  s->chr = chr;
  s->nc.queue_index = i;
  










Re: [Qemu-devel] [PATCH v1] vhost-user: fix not send all hugepage files to vhost-user

2015-02-15 Thread Linhaifeng


On 2015/2/2 7:29, Paolo Bonzini wrote:
 
 
 On 17/12/2014 07:02, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 If we create a VM with two or more NUMA nodes, QEMU will create two
 or more hugepage files, but QEMU only sends one hugepage file fd
 to vhost-user when the VM's memory size is 2G and it has two NUMA nodes.

 Signed-off-by: linhaifeng haifeng@huawei.com
 
 The bug is in vhost_dev_assign_memory.  It doesn't check that the file
 descriptor matches when merging regions.  Michael, does the merging
 trigger in practice?  Can we just eliminate it?
 
 Paolo
 

qemu-system-x86_64 -name vm1 -enable-kvm -smp 2 -m 2048 -object 
memory-backend-file,id=mem1,size=1024M,mem-path=/dev/hugepages,share=on -object 
memory-backend-file,id=mem2,size=1024M,mem-path=/dev/hugepages,share=on -numa 
node,memdev=mem1 -numa node,memdev=mem2 -chardev 
socket,id=chr0,path=/var/run/vhost-user/tap10  -netdev 
type=vhost-user,id=net0,chardev=chr0,vhostforce -device 
virtio-net-pci,netdev=net0,mac=00:00:00:00:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off
 -drive file=/mnt/sdc/linhf/suse11_sp3_64_0 -vnc :0
hugepage file fd=8 addr=0x7f1ea720 size=1073741824
hugepage file fd=9 addr=0x7f1ee720 size=1073741824
qemu-system-x86_64: -netdev type=vhost-user,id=net0,chardev=chr0,vhostforce: 
chardev chr0 went up

WARNING: Image format was not specified for '/mnt/sdc/linhf/suse11_sp3_64_0' 
and probing guessed raw.
 Automatically detecting the format is dangerous for raw images, write 
operations on block 0 will be restricted.
 Specify the 'raw' format explicitly to remove the restrictions.
reg->userspace_addr=0x7f1ea72c ram_addr=0xc fd=8
reg->userspace_addr=0x7f1e9ee0 ram_addr=0x8000 fd=-1
reg->userspace_addr=0x7f1ea720 ram_addr=0x0 fd=8


It seems like the second region's address is invalid (not in the hugepage
area), so we lose this region.




Re: [Qemu-devel] [PATCH v2 1/3] vhost-user: update the protocol.

2015-02-14 Thread Linhaifeng


On 2015/2/15 5:03, Michael S. Tsirkin wrote:
 On Fri, Feb 13, 2015 at 09:45:37PM +0800, linhaifeng wrote:
 @@ -35,7 +39,7 @@ consists of 3 header fields and a payload:
  
   * Request: 32-bit type of the request
   * Flags: 32-bit bit field:
 -   - Lower 2 bits are the version (currently 0x01)
 +   - Lower 2 bits are the version (currently 0x06)
 - Bit 2 is the reply flag - needs to be sent on each reply from the slave
   * Size - 32-bit size of the payload
  
 
 How do you encode 0x6 in a 2 bit field?
 

I think the reply flag is unnecessary, so I removed it and use the whole
field as the version.
The existing version is 0x5 now.

Sorry, I need to update vhost-user.txt to describe it.


-- 
Regards,
Haifeng




[Qemu-devel] vhost-user: Is vhost-user support PXE?

2015-02-14 Thread Linhaifeng
Hi, Michael

I'm trying to install a guest OS with PXE (vhost-user backend), but it failed.

Is there any plans to support it?

-- 
Regards,
Haifeng




[Qemu-devel] [PATCH v3 1/3] vhost-user: update the protocol.

2015-02-14 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

Every message needs a reply.
This patch just updates vhost-user.txt to version 0x6.

Signed-off-by: Linhaifeng haifeng@huawei.com
---
 docs/specs/vhost-user.txt | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 650bb18..c54d1e2 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -23,6 +23,10 @@ be a software Ethernet switch running in user space, such as 
Snabbswitch.
 Master and slave can be either a client (i.e. connecting) or server (listening)
 in the socket communication.
 
+version 0x5: Supplies the base communication between master and slave.
+version 0x6: Adds a reply to each message for more robustness.
+
+
 Message Specification
 -
 
@@ -30,13 +34,11 @@ Note that all numbers are in the machine native byte order. 
A vhost-user message
 consists of 3 header fields and a payload:
 
 
-| request | flags | size | payload |
+| request | version | size | payload |
 
 
  * Request: 32-bit type of the request
- * Flags: 32-bit bit field:
-   - Lower 2 bits are the version (currently 0x01)
-   - Bit 2 is the reply flag - needs to be sent on each reply from the slave
+ * Version: 32-bit version
  * Size - 32-bit size of the payload
 
 
@@ -144,6 +146,7 @@ Message types
   Id: 2
   Ioctl: VHOST_SET_FEATURES
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Enable features in the underlying vhost implementation using a bitmask.
 
@@ -171,6 +174,7 @@ Message types
   Id: 5
   Equivalent ioctl: VHOST_SET_MEM_TABLE
   Master payload: memory regions description
+  Slave payload: u64 0:success else:fail
 
   Sets the memory map regions on the slave so it can translate the vring
   addresses. In the ancillary data there is an array of file descriptors
@@ -182,6 +186,7 @@ Message types
   Id: 6
   Equivalent ioctl: VHOST_SET_LOG_BASE
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Sets the logging base address.
 
@@ -190,6 +195,7 @@ Message types
   Id: 7
   Equivalent ioctl: VHOST_SET_LOG_FD
   Master payload: N/A
+  Slave payload: u64 0:success else:fail
 
   Sets the logging file descriptor, which is passed as ancillary data.
 
@@ -198,6 +204,7 @@ Message types
   Id: 8
   Equivalent ioctl: VHOST_SET_VRING_NUM
   Master payload: vring state description
+  Slave payload: u64 0:success else:fail
 
   Sets the number of vrings for this owner.
 
@@ -206,7 +213,7 @@ Message types
   Id: 9
   Equivalent ioctl: VHOST_SET_VRING_ADDR
   Master payload: vring address description
-  Slave payload: N/A
+  Slave payload: u64 0:success else:fail
 
   Sets the addresses of the different aspects of the vring.
 
@@ -215,6 +222,7 @@ Message types
   Id: 10
   Equivalent ioctl: VHOST_SET_VRING_BASE
   Master payload: vring state description
+  Slave payload: u64 0:success else:fail
 
   Sets the base offset in the available vring.
 
@@ -232,6 +240,7 @@ Message types
   Id: 12
   Equivalent ioctl: VHOST_SET_VRING_KICK
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Set the event file descriptor for adding buffers to the vring. It
   is passed in the ancillary data.
@@ -245,6 +254,7 @@ Message types
   Id: 13
   Equivalent ioctl: VHOST_SET_VRING_CALL
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Set the event file descriptor to signal when buffers are used. It
   is passed in the ancillary data.
@@ -258,6 +268,7 @@ Message types
   Id: 14
   Equivalent ioctl: VHOST_SET_VRING_ERR
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Set the event file descriptor to signal when error occurs. It
   is passed in the ancillary data.
-- 
1.7.12.4





[Qemu-devel] [PATCH v3 3/3] vhost-user: add reply for other messages

2015-02-14 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

If the slave's version is bigger than 0x5 we will wait for a reply.

Signed-off-by: Linhaifeng haifeng@huawei.com
---
 hw/virtio/vhost-user.c | 42 +-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index d56115a..ae684b6 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -80,10 +80,17 @@ static VhostUserMsg m __attribute__ ((unused));
 #define VHOST_USER_PAYLOAD_SIZE (sizeof(m) - VHOST_USER_HDR_SIZE)
 
 /* The version of the protocol we support.
- * The slave's version should be smaller than VHOST_USER_VERSION.
+ * The slave's version must not be bigger than VHOST_USER_VERSION.
  */
+#define VHOST_USER_BASE   (0x5)
 #define VHOST_USER_VERSION(0x6)
 
+#define VHOST_NEED_REPLY \
+{\
+if (slave_version > VHOST_USER_BASE) \
+need_reply = 1;\
+}
+
 static bool ioeventfd_enabled(void)
 {
  return kvm_enabled() && kvm_eventfds_enabled();
@@ -207,6 +214,8 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 case VHOST_SET_LOG_BASE:
 msg.u64 = *((__u64 *) arg);
 msg.size = sizeof(m.u64);
+
+VHOST_NEED_REPLY;
 break;
 
 case VHOST_SET_OWNER:
@@ -244,16 +253,21 @@ static int vhost_user_call(struct vhost_dev *dev, 
unsigned long int request,
 msg.size += sizeof(m.memory.padding);
 msg.size += fd_num * sizeof(VhostUserMemoryRegion);
 
+VHOST_NEED_REPLY;
 break;
 
 case VHOST_SET_LOG_FD:
 fds[fd_num++] = *((int *) arg);
+
+VHOST_NEED_REPLY;
 break;
 
 case VHOST_SET_VRING_NUM:
 case VHOST_SET_VRING_BASE:
 memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
 msg.size = sizeof(m.state);
+
+VHOST_NEED_REPLY;
 break;
 
 case VHOST_GET_VRING_BASE:
@@ -265,6 +279,8 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 case VHOST_SET_VRING_ADDR:
 memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr));
 msg.size = sizeof(m.addr);
+
+VHOST_NEED_REPLY;
 break;
 
 case VHOST_SET_VRING_KICK:
@@ -278,6 +294,8 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 } else {
 msg.u64 |= VHOST_USER_VRING_NOFD_MASK;
 }
+
+VHOST_NEED_REPLY;
 break;
 default:
 error_report("vhost-user trying to send unhandled ioctl\n");
@@ -315,6 +333,28 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 }
 memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
 break;
+case VHOST_USER_SET_FEATURES:
+case VHOST_USER_SET_LOG_BASE:
+case VHOST_USER_SET_OWNER:
+case VHOST_USER_RESET_OWNER:
+case VHOST_USER_SET_MEM_TABLE:
+case VHOST_USER_SET_LOG_FD:
+case VHOST_USER_SET_VRING_NUM:
+case VHOST_USER_SET_VRING_BASE:
+case VHOST_USER_SET_VRING_ADDR:
+case VHOST_USER_SET_VRING_KICK:
+case VHOST_USER_SET_VRING_CALL:
+case VHOST_USER_SET_VRING_ERR:
+if (msg.size != sizeof(m.u64)) {
+error_report("Received bad msg size.");
+return -1;
+} else {
+if (msg.u64) {
+error_report("Failed to handle request %d.", msg_request);
+return -1;
+}
+}
+break;
 default:
 error_report("Received unexpected msg type.\n");
 return -1;
-- 
1.7.12.4





[Qemu-devel] [PATCH v3 2/3] vhost-user: update the version to 0x6

2015-02-14 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

We do not need VHOST_USER_REPLY_MASK any more, so the base version is now 0x5.
  - update the version to 0x6.
  - change the name from flags to version.

Signed-off-by: Linhaifeng haifeng@huawei.com
---
 hw/virtio/vhost-user.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index aefe0bb..d56115a 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -59,10 +59,7 @@ typedef struct VhostUserMemory {
 
 typedef struct VhostUserMsg {
 VhostUserRequest request;
-
-#define VHOST_USER_VERSION_MASK (0x3)
-#define VHOST_USER_REPLY_MASK   (0x1<<2)
-uint32_t flags;
+uint32_t version;
 uint32_t size; /* the following payload size */
 union {
 #define VHOST_USER_VRING_IDX_MASK   (0xff)
@@ -74,15 +71,18 @@ typedef struct VhostUserMsg {
 };
 } QEMU_PACKED VhostUserMsg;
 
+static uint32_t slave_version;
 static VhostUserMsg m __attribute__ ((unused));
 #define VHOST_USER_HDR_SIZE (sizeof(m.request) \
-+ sizeof(m.flags) \
++ sizeof(m.version) \
 + sizeof(m.size))
 
 #define VHOST_USER_PAYLOAD_SIZE (sizeof(m) - VHOST_USER_HDR_SIZE)
 
-/* The version of the protocol we support */
-#define VHOST_USER_VERSION(0x1)
+/* The version of the protocol we support.
+ * The slave's version should be smaller than VHOST_USER_VERSION.
+ */
+#define VHOST_USER_VERSION(0x6)
 
 static bool ioeventfd_enabled(void)
 {
@@ -134,12 +134,12 @@ static int vhost_user_read(struct vhost_dev *dev, 
VhostUserMsg *msg)
 }
 
 /* validate received flags */
-if (msg->flags != (VHOST_USER_REPLY_MASK | VHOST_USER_VERSION)) {
-error_report("Failed to read msg header."
-" Flags 0x%x instead of 0x%x.\n", msg->flags,
-VHOST_USER_REPLY_MASK | VHOST_USER_VERSION);
+if (msg->version > VHOST_USER_VERSION) {
+error_report("Invalid version 0x%x.\n"
+"Vhost user version is 0x%x", msg->version, VHOST_USER_VERSION);
 goto fail;
 }
+slave_version = msg->version;
 
 /* validate message size is sane */
 if (msg->size > VHOST_USER_PAYLOAD_SIZE) {
@@ -195,7 +195,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 
 msg_request = vhost_user_request_translate(request);
 msg.request = msg_request;
-msg.flags = VHOST_USER_VERSION;
+msg.version = VHOST_USER_VERSION;
 msg.size = 0;
 
 switch (request) {
-- 
1.7.12.4





[Qemu-devel] [PATCH v3 0/3] vhost-user: support safe protocol

2015-02-14 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

Mostly the same as with ioctl: the master needs the return value to
decide whether to go on or not. So we add these patches for safer
communication.

change log:
v1->v2: modify the annotation about the slave's version.
v2->v3: update the description of the version in vhost-user.txt.

Linhaifeng (3):
  vhost-user: update the protocol.
  vhost-user: update the version to 0x6
  vhost-user: add reply for other messages

 docs/specs/vhost-user.txt | 21 
 hw/virtio/vhost-user.c| 64 ++-
 2 files changed, 68 insertions(+), 17 deletions(-)

-- 
1.7.12.4





[Qemu-devel] [PATCH v1 2/3] vhost-user:update the version to 0x6

2015-02-13 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

We do not need VHOST_USER_REPLY_MASK any more, so the base version is now 0x5.
  - update the version to 0x6.
  - change the name from flags to version.

Signed-off-by: Linhaifeng haifeng@huawei.com
---
 hw/virtio/vhost-user.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index aefe0bb..d56115a 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -59,10 +59,7 @@ typedef struct VhostUserMemory {
 
 typedef struct VhostUserMsg {
 VhostUserRequest request;
-
-#define VHOST_USER_VERSION_MASK (0x3)
-#define VHOST_USER_REPLY_MASK   (0x1<<2)
-uint32_t flags;
+uint32_t version;
 uint32_t size; /* the following payload size */
 union {
 #define VHOST_USER_VRING_IDX_MASK   (0xff)
@@ -74,15 +71,18 @@ typedef struct VhostUserMsg {
 };
 } QEMU_PACKED VhostUserMsg;
 
+static uint32_t slave_version;
 static VhostUserMsg m __attribute__ ((unused));
 #define VHOST_USER_HDR_SIZE (sizeof(m.request) \
-+ sizeof(m.flags) \
++ sizeof(m.version) \
 + sizeof(m.size))
 
 #define VHOST_USER_PAYLOAD_SIZE (sizeof(m) - VHOST_USER_HDR_SIZE)
 
-/* The version of the protocol we support */
-#define VHOST_USER_VERSION(0x1)
+/* The version of the protocol we support.
+ * The slave's version should be smaller than VHOST_USER_VERSION.
+ */
+#define VHOST_USER_VERSION(0x6)
 
 static bool ioeventfd_enabled(void)
 {
@@ -134,12 +134,12 @@ static int vhost_user_read(struct vhost_dev *dev, 
VhostUserMsg *msg)
 }
 
 /* validate received flags */
-if (msg->flags != (VHOST_USER_REPLY_MASK | VHOST_USER_VERSION)) {
-error_report("Failed to read msg header."
-" Flags 0x%x instead of 0x%x.\n", msg->flags,
-VHOST_USER_REPLY_MASK | VHOST_USER_VERSION);
+if (msg->version > VHOST_USER_VERSION) {
+error_report("Invalid version 0x%x.\n"
+"Vhost user version is 0x%x", msg->version, VHOST_USER_VERSION);
 goto fail;
 }
+slave_version = msg->version;
 
 /* validate message size is sane */
 if (msg->size > VHOST_USER_PAYLOAD_SIZE) {
@@ -195,7 +195,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 
 msg_request = vhost_user_request_translate(request);
 msg.request = msg_request;
-msg.flags = VHOST_USER_VERSION;
+msg.version = VHOST_USER_VERSION;
 msg.size = 0;
 
 switch (request) {
-- 
1.7.12.4





[Qemu-devel] [PATCH v1 0/3] vhost-user: support safe protocol

2015-02-13 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

Mostly the same as with ioctl: the master needs the return value to
decide whether to go on or not. So we add these patches for safer
communication.

Linhaifeng (3):
  vhost-user: add reply let the portocol more safe.
  vhost-user:update the version to 0x6
  vhost-user:add reply for other messages

 docs/specs/vhost-user.txt | 19 --
 hw/virtio/vhost-user.c| 64 ++-
 2 files changed, 69 insertions(+), 14 deletions(-)

-- 
1.7.12.4





[Qemu-devel] [PATCH v2 0/3] vhost-user: support safe protocol

2015-02-13 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

Mostly the same as with ioctl: the master needs the return value to
decide whether to go on or not. So we add these patches for safer
communication.

change log:
v1->v2: modify the annotation about the slave's version.

Linhaifeng (3):
  vhost-user: update the protocol.
  vhost-user:update the version to 0x6
  vhost-user:add reply for other messages

 docs/specs/vhost-user.txt | 17 +++--
 hw/virtio/vhost-user.c| 64 ++-
 2 files changed, 67 insertions(+), 14 deletions(-)

-- 
1.7.12.4





[Qemu-devel] [PATCH v2 2/3] vhost-user:update the version to 0x6

2015-02-13 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

We do not need VHOST_USER_REPLY_MASK any more, so the base version is now 0x5.
  - update the version to 0x6.
  - change the name from flags to version.

Signed-off-by: Linhaifeng haifeng@huawei.com
---
 hw/virtio/vhost-user.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index aefe0bb..d56115a 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -59,10 +59,7 @@ typedef struct VhostUserMemory {
 
 typedef struct VhostUserMsg {
 VhostUserRequest request;
-
-#define VHOST_USER_VERSION_MASK (0x3)
-#define VHOST_USER_REPLY_MASK   (0x1<<2)
-uint32_t flags;
+uint32_t version;
 uint32_t size; /* the following payload size */
 union {
 #define VHOST_USER_VRING_IDX_MASK   (0xff)
@@ -74,15 +71,18 @@ typedef struct VhostUserMsg {
 };
 } QEMU_PACKED VhostUserMsg;
 
+static uint32_t slave_version;
 static VhostUserMsg m __attribute__ ((unused));
 #define VHOST_USER_HDR_SIZE (sizeof(m.request) \
-+ sizeof(m.flags) \
++ sizeof(m.version) \
 + sizeof(m.size))
 
 #define VHOST_USER_PAYLOAD_SIZE (sizeof(m) - VHOST_USER_HDR_SIZE)
 
-/* The version of the protocol we support */
-#define VHOST_USER_VERSION(0x1)
+/* The version of the protocol we support.
+ * The slave's version should be smaller than VHOST_USER_VERSION.
+ */
+#define VHOST_USER_VERSION(0x6)
 
 static bool ioeventfd_enabled(void)
 {
@@ -134,12 +134,12 @@ static int vhost_user_read(struct vhost_dev *dev, 
VhostUserMsg *msg)
 }
 
 /* validate received flags */
-if (msg->flags != (VHOST_USER_REPLY_MASK | VHOST_USER_VERSION)) {
-error_report("Failed to read msg header."
-" Flags 0x%x instead of 0x%x.\n", msg->flags,
-VHOST_USER_REPLY_MASK | VHOST_USER_VERSION);
+if (msg->version > VHOST_USER_VERSION) {
+error_report("Invalid version 0x%x.\n"
+"Vhost user version is 0x%x", msg->version, VHOST_USER_VERSION);
 goto fail;
 }
+slave_version = msg->version;
 
 /* validate message size is sane */
 if (msg->size > VHOST_USER_PAYLOAD_SIZE) {
@@ -195,7 +195,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 
 msg_request = vhost_user_request_translate(request);
 msg.request = msg_request;
-msg.flags = VHOST_USER_VERSION;
+msg.version = VHOST_USER_VERSION;
 msg.size = 0;
 
 switch (request) {
-- 
1.7.12.4





[Qemu-devel] [PATCH v2 1/3] vhost-user: update the protocol.

2015-02-13 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

Every message needs a reply.
This patch just updates vhost-user.txt to version 0x6.

Signed-off-by: Linhaifeng haifeng@huawei.com
---
 docs/specs/vhost-user.txt | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 650bb18..448babc 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -23,6 +23,10 @@ be a software Ethernet switch running in user space, such as 
Snabbswitch.
 Master and slave can be either a client (i.e. connecting) or server (listening)
 in the socket communication.
 
+version 0x1: Supplies the base communication between master and slave.
+version 0x6: Adds a reply to each message for more robustness.
+
+
 Message Specification
 -
 
@@ -35,7 +39,7 @@ consists of 3 header fields and a payload:
 
  * Request: 32-bit type of the request
  * Flags: 32-bit bit field:
-   - Lower 2 bits are the version (currently 0x01)
+   - Lower 2 bits are the version (currently 0x06)
- Bit 2 is the reply flag - needs to be sent on each reply from the slave
  * Size - 32-bit size of the payload
 
@@ -144,6 +148,7 @@ Message types
   Id: 2
   Ioctl: VHOST_SET_FEATURES
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Enable features in the underlying vhost implementation using a bitmask.
 
@@ -171,6 +176,7 @@ Message types
   Id: 5
   Equivalent ioctl: VHOST_SET_MEM_TABLE
   Master payload: memory regions description
+  Slave payload: u64 0:success else:fail
 
   Sets the memory map regions on the slave so it can translate the vring
   addresses. In the ancillary data there is an array of file descriptors
@@ -182,6 +188,7 @@ Message types
   Id: 6
   Equivalent ioctl: VHOST_SET_LOG_BASE
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Sets the logging base address.
 
@@ -190,6 +197,7 @@ Message types
   Id: 7
   Equivalent ioctl: VHOST_SET_LOG_FD
   Master payload: N/A
+  Slave payload: u64 0:success else:fail
 
   Sets the logging file descriptor, which is passed as ancillary data.
 
@@ -198,6 +206,7 @@ Message types
   Id: 8
   Equivalent ioctl: VHOST_SET_VRING_NUM
   Master payload: vring state description
+  Slave payload: u64 0:success else:fail
 
   Sets the number of vrings for this owner.
 
@@ -206,7 +215,7 @@ Message types
   Id: 9
   Equivalent ioctl: VHOST_SET_VRING_ADDR
   Master payload: vring address description
-  Slave payload: N/A
+  Slave payload: u64 0:success else:fail
 
   Sets the addresses of the different aspects of the vring.
 
@@ -215,6 +224,7 @@ Message types
   Id: 10
   Equivalent ioctl: VHOST_SET_VRING_BASE
   Master payload: vring state description
+  Slave payload: u64 0:success else:fail
 
   Sets the base offset in the available vring.
 
@@ -232,6 +242,7 @@ Message types
   Id: 12
   Equivalent ioctl: VHOST_SET_VRING_KICK
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Set the event file descriptor for adding buffers to the vring. It
   is passed in the ancillary data.
@@ -245,6 +256,7 @@ Message types
   Id: 13
   Equivalent ioctl: VHOST_SET_VRING_CALL
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Set the event file descriptor to signal when buffers are used. It
   is passed in the ancillary data.
@@ -258,6 +270,7 @@ Message types
   Id: 14
   Equivalent ioctl: VHOST_SET_VRING_ERR
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Set the event file descriptor to signal when error occurs. It
   is passed in the ancillary data.
-- 
1.7.12.4





[Qemu-devel] [PATCH v1 3/3] vhost-user:add reply for other messages

2015-02-13 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

If the slave's version is bigger than 0x5 we will wait for a reply.

Signed-off-by: Linhaifeng haifeng@huawei.com
---
 hw/virtio/vhost-user.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index d56115a..fdfd14b 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -82,8 +82,15 @@ static VhostUserMsg m __attribute__ ((unused));
 /* The version of the protocol we support.
  * Slaves' version should maller than  VHOST_USER_VERSION.
  */
+#define VHOST_USER_BASE   (0x5)
 #define VHOST_USER_VERSION(0x6)
 
+#define VHOST_NEED_REPLY \
+{\
+if (slave_version > VHOST_USER_BASE) \
+need_reply = 1;\
+}
+
 static bool ioeventfd_enabled(void)
 {
 return kvm_enabled() && kvm_eventfds_enabled();
@@ -207,6 +214,8 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 case VHOST_SET_LOG_BASE:
 msg.u64 = *((__u64 *) arg);
 msg.size = sizeof(m.u64);
+
+VHOST_NEED_REPLY;
 break;
 
 case VHOST_SET_OWNER:
@@ -244,16 +253,21 @@ static int vhost_user_call(struct vhost_dev *dev, 
unsigned long int request,
 msg.size += sizeof(m.memory.padding);
 msg.size += fd_num * sizeof(VhostUserMemoryRegion);
 
+VHOST_NEED_REPLY;
 break;
 
 case VHOST_SET_LOG_FD:
 fds[fd_num++] = *((int *) arg);
+
+VHOST_NEED_REPLY;
 break;
 
 case VHOST_SET_VRING_NUM:
 case VHOST_SET_VRING_BASE:
 memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
 msg.size = sizeof(m.state);
+
+VHOST_NEED_REPLY;
 break;
 
 case VHOST_GET_VRING_BASE:
@@ -265,6 +279,8 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 case VHOST_SET_VRING_ADDR:
 memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr));
 msg.size = sizeof(m.addr);
+
+VHOST_NEED_REPLY;
 break;
 
 case VHOST_SET_VRING_KICK:
@@ -278,6 +294,8 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 } else {
 msg.u64 |= VHOST_USER_VRING_NOFD_MASK;
 }
+
+VHOST_NEED_REPLY;
 break;
 default:
 error_report("vhost-user trying to send unhandled ioctl\n");
@@ -315,6 +333,28 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 }
 memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
 break;
+case VHOST_USER_SET_FEATURES:
+case VHOST_USER_SET_LOG_BASE:
+case VHOST_USER_SET_OWNER:
+case VHOST_USER_RESET_OWNER:
+case VHOST_USER_SET_MEM_TABLE:
+case VHOST_USER_SET_LOG_FD:
+case VHOST_USER_SET_VRING_NUM:
+case VHOST_USER_SET_VRING_BASE:
+case VHOST_USER_SET_VRING_ADDR:
+case VHOST_USER_SET_VRING_KICK:
+case VHOST_USER_SET_VRING_CALL:
+case VHOST_USER_SET_VRING_ERR:
+if (msg.size != sizeof(m.u64)) {
+error_report("Received bad msg size.");
+return -1;
+} else {
+if (msg.u64) {
+error_report("Failed to handle request %d.", msg_request);
+return -1;
+}
+}
+break;
 default:
 error_report("Received unexpected msg type.\n");
 return -1;
-- 
1.7.12.4





[Qemu-devel] [PATCH v1 1/3] vhost-user: add reply let the portocol more safe.

2015-02-13 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

Every message needs a reply.

Signed-off-by: Linhaifeng haifeng@huawei.com
---
 docs/specs/vhost-user.txt | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 650bb18..4a14e63 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -23,6 +23,10 @@ be a software Ethernet switch running in user space, such as 
Snabbswitch.
 Master and slave can be either a client (i.e. connecting) or server (listening)
 in the socket communication.
 
+version 0x1: Supplies the base communication between master and slave.
+version 0x6: Adds a reply to each message for more robustness.
+
+
 Message Specification
 -
 
@@ -35,7 +39,7 @@ consists of 3 header fields and a payload:
 
  * Request: 32-bit type of the request
  * Flags: 32-bit bit field:
-   - Lower 2 bits are the version (currently 0x01)
+   - Lower 2 bits are the version (currently 0x06)
- Bit 2 is the reply flag - needs to be sent on each reply from the slave
  * Size - 32-bit size of the payload
 
@@ -144,6 +148,7 @@ Message types
   Id: 2
   Ioctl: VHOST_SET_FEATURES
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Enable features in the underlying vhost implementation using a bitmask.
 
@@ -152,6 +157,7 @@ Message types
   Id: 3
   Equivalent ioctl: VHOST_SET_OWNER
   Master payload: N/A
+  Slave payload: u64 0:success else:fail
 
   Issued when a new connection is established. It sets the current Master
   as an owner of the session. This can be used on the Slave as a
@@ -162,6 +168,7 @@ Message types
   Id: 4
   Equivalent ioctl: VHOST_RESET_OWNER
   Master payload: N/A
+  Slave payload: u64 0:success else:fail
 
   Issued when a new connection is about to be closed. The Master will no
   longer own this connection (and will usually close it).
@@ -171,6 +178,7 @@ Message types
   Id: 5
   Equivalent ioctl: VHOST_SET_MEM_TABLE
   Master payload: memory regions description
+  Slave payload: u64 0:success else:fail
 
   Sets the memory map regions on the slave so it can translate the vring
   addresses. In the ancillary data there is an array of file descriptors
@@ -182,6 +190,7 @@ Message types
   Id: 6
   Equivalent ioctl: VHOST_SET_LOG_BASE
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Sets the logging base address.
 
@@ -190,6 +199,7 @@ Message types
   Id: 7
   Equivalent ioctl: VHOST_SET_LOG_FD
   Master payload: N/A
+  Slave payload: u64 0:success else:fail
 
   Sets the logging file descriptor, which is passed as ancillary data.
 
@@ -198,6 +208,7 @@ Message types
   Id: 8
   Equivalent ioctl: VHOST_SET_VRING_NUM
   Master payload: vring state description
+  Slave payload: u64 0:success else:fail
 
   Sets the number of vrings for this owner.
 
@@ -206,7 +217,7 @@ Message types
   Id: 9
   Equivalent ioctl: VHOST_SET_VRING_ADDR
   Master payload: vring address description
-  Slave payload: N/A
+  Slave payload: u64 0:success else:fail
 
   Sets the addresses of the different aspects of the vring.
 
@@ -215,6 +226,7 @@ Message types
   Id: 10
   Equivalent ioctl: VHOST_SET_VRING_BASE
   Master payload: vring state description
+  Slave payload: u64 0:success else:fail
 
   Sets the base offset in the available vring.
 
@@ -232,6 +244,7 @@ Message types
   Id: 12
   Equivalent ioctl: VHOST_SET_VRING_KICK
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Set the event file descriptor for adding buffers to the vring. It
   is passed in the ancillary data.
@@ -245,6 +258,7 @@ Message types
   Id: 13
   Equivalent ioctl: VHOST_SET_VRING_CALL
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Set the event file descriptor to signal when buffers are used. It
   is passed in the ancillary data.
@@ -258,6 +272,7 @@ Message types
   Id: 14
   Equivalent ioctl: VHOST_SET_VRING_ERR
   Master payload: u64
+  Slave payload: u64 0:success else:fail
 
   Set the event file descriptor to signal when error occurs. It
   is passed in the ancillary data.
-- 
1.7.12.4





Re: [Qemu-devel] [PATCH v1 1/2] vhost-user: support SET_MEM_TABLE waite the result of mmap

2015-02-11 Thread Linhaifeng


 No. Maybe the existing slaves need to add the reply in their code.
 
 So that's not good. We need a way to negotiate the capability,
 we can't just deadlock with legacy slaves.
 

Hi,Michael

Do you have any suggestions?




Re: [Qemu-devel] [PATCH v1 1/2] vhost-user: support SET_MEM_TABLE waite the result of mmap

2015-02-10 Thread Linhaifeng


On 2015/2/10 20:04, Michael S. Tsirkin wrote:
 So that's not good. We need a way to negotiate the capability,
 we can't just deadlock with legacy slaves.

Should we wait for some seconds and, if the slave does not reply, just return an error?

-- 
Regards,
Haifeng




Re: [Qemu-devel] [PATCH v1 1/2] vhost-user: support SET_MEM_TABLE waite the result of mmap

2015-02-10 Thread Linhaifeng


On 2015/2/10 16:46, Michael S. Tsirkin wrote:
 On Tue, Feb 10, 2015 at 01:48:12PM +0800, linhaifeng wrote:
 From: Linhaifeng haifeng@huawei.com

 The slave should reply to the master and set u64 to 0 if
 mmapping all regions succeeds, otherwise set u64 to 1.

 Signed-off-by: Linhaifeng haifeng@huawei.com
 
 How does this work with existig slaves though?
 

Slaves should work like this:

int set_mem_table(...)
{
    for (idx = 0; idx < memory.nregions; idx++) {
        mem = mmap(...);
        if (MAP_FAILED == mem) {
            msg->msg.u64 = 1;
            msg->msg.size = MEMB_SIZE(VhostUserMsg, u64);
            return 1;
        }
    }

    msg->msg.u64 = 0;
    msg->msg.size = MEMB_SIZE(VhostUserMsg, u64);
    return 1;
}

If slaves do not reply, QEMU will wait forever.

 ---
  docs/specs/vhost-user.txt | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
 index 650bb18..c96bf6b 100644
 --- a/docs/specs/vhost-user.txt
 +++ b/docs/specs/vhost-user.txt
 @@ -171,6 +171,7 @@ Message types
Id: 5
Equivalent ioctl: VHOST_SET_MEM_TABLE
Master payload: memory regions description
 +  Slave payload: u64 (0:success >0:failed)
  
Sets the memory map regions on the slave so it can translate the vring
addresses. In the ancillary data there is an array of file descriptors
 -- 
 1.7.12.4

 
 

-- 
Regards,
Haifeng




Re: [Qemu-devel] [PATCH v1 1/2] vhost-user: support SET_MEM_TABLE waite the result of mmap

2015-02-10 Thread Linhaifeng


On 2015/2/10 18:41, Michael S. Tsirkin wrote:
 On Tue, Feb 10, 2015 at 06:27:04PM +0800, Linhaifeng wrote:


 On 2015/2/10 16:46, Michael S. Tsirkin wrote:
 On Tue, Feb 10, 2015 at 01:48:12PM +0800, linhaifeng wrote:
 From: Linhaifeng haifeng@huawei.com

  The slave should reply to the master and set u64 to 0 if
  mmapping all regions succeeds, otherwise set u64 to 1.

 Signed-off-by: Linhaifeng haifeng@huawei.com

 How does this work with existig slaves though?


 Slaves should work like this:

 int set_mem_table(...)
 {
 
 for (idx = 0; idx < memory.nregions; idx++) {
  mem = mmap(...);
  if (MAP_FAILED == mem) {
  msg->msg.u64 = 1;
  msg->msg.size = MEMB_SIZE(VhostUserMsg, u64);
  return 1;
  }
 }

 msg->msg.u64 = 0;
 msg->msg.size = MEMB_SIZE(VhostUserMsg, u64);
 return 1;
 }

 If slaves do not reply, QEMU will wait forever.
 
 Are you sure existing slaves reply?

No. Maybe the existing slaves need to add the reply in their code.

 
 ---
  docs/specs/vhost-user.txt | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
 index 650bb18..c96bf6b 100644
 --- a/docs/specs/vhost-user.txt
 +++ b/docs/specs/vhost-user.txt
 @@ -171,6 +171,7 @@ Message types
Id: 5
Equivalent ioctl: VHOST_SET_MEM_TABLE
Master payload: memory regions description
 +  Slave payload: u64 (0:success >0:failed)
  
Sets the memory map regions on the slave so it can translate the 
 vring
addresses. In the ancillary data there is an array of file 
 descriptors
 -- 
 1.7.12.4




 -- 
 Regards,
 Haifeng
 
 .
 

-- 
Regards,
Haifeng




Re: [Qemu-devel] [PATCH v1 1/2] vhost-user: support SET_MEM_TABLE waite the result of mmap

2015-02-10 Thread Linhaifeng


On 2015/2/10 20:04, Michael S. Tsirkin wrote:
 So that's not good. We need a way to negotiate the capability,
 we can't just deadlock with legacy slaves.

Or add a new message to query the slave's version: if the slave does not reply
we don't wait; otherwise, if the version is the same as QEMU's, we wait for the
reply.

Mostly the same as with ioctl: maybe all messages need a reply.

-- 
Regards,
Haifeng




[Qemu-devel] [PATCH 2/2] vhost-user: add reply for set_mem_table

2015-02-09 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

If u64 is not 0 we should return -1 to tell QEMU not to go on.

Signed-off-by: Linhaifeng haifeng@huawei.com
---
 hw/virtio/vhost-user.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index aefe0bb..a68ce36 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -243,7 +243,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 msg.size = sizeof(m.memory.nregions);
 msg.size += sizeof(m.memory.padding);
 msg.size += fd_num * sizeof(VhostUserMemoryRegion);
-
+need_reply = 1;
 break;
 
 case VHOST_SET_LOG_FD:
@@ -315,6 +315,17 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 }
 memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
 break;
+case VHOST_SET_MEM_TABLE:
+if (msg.size != sizeof(m.u64)) {
+error_report("Received bad msg size.\n");
+return -1;
+} else {
+if (msg.u64) {
+error_report("Failed to set memory table.\n");
+return -1;
+}
+}
+break;
 default:
 error_report("Received unexpected msg type.\n");
 return -1;
-- 
1.7.12.4





[Qemu-devel] [PATCH v1 2/2] vhost-user: add reply for set_mem_table

2015-02-09 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

If u64 is not 0 we should return -1 to tell QEMU not to go on.

Remove some unnecessary '\n' in error_report.

Signed-off-by: Linhaifeng haifeng@huawei.com
---
 hw/virtio/vhost-user.c | 33 ++---
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index aefe0bb..d69bb33 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -128,7 +128,7 @@ static int vhost_user_read(struct vhost_dev *dev, 
VhostUserMsg *msg)
 
 r = qemu_chr_fe_read_all(chr, p, size);
 if (r != size) {
-error_report("Failed to read msg header. Read %d instead of %d.\n", r,
+error_report("Failed to read msg header. Read %d instead of %d.", r,
 size);
 goto fail;
 }
@@ -136,7 +136,7 @@ static int vhost_user_read(struct vhost_dev *dev, 
VhostUserMsg *msg)
 /* validate received flags */
 if (msg->flags != (VHOST_USER_REPLY_MASK | VHOST_USER_VERSION)) {
 error_report("Failed to read msg header."
-" Flags 0x%x instead of 0x%x.\n", msg->flags,
+" Flags 0x%x instead of 0x%x.", msg->flags,
 VHOST_USER_REPLY_MASK | VHOST_USER_VERSION);
 goto fail;
 }
@@ -144,7 +144,7 @@ static int vhost_user_read(struct vhost_dev *dev, 
VhostUserMsg *msg)
 /* validate message size is sane */
 if (msg->size > VHOST_USER_PAYLOAD_SIZE) {
 error_report("Failed to read msg header."
-" Size %d exceeds the maximum %zu.\n", msg->size,
+" Size %d exceeds the maximum %zu.", msg->size,
 VHOST_USER_PAYLOAD_SIZE);
 goto fail;
 }
@@ -155,7 +155,7 @@ static int vhost_user_read(struct vhost_dev *dev, 
VhostUserMsg *msg)
 r = qemu_chr_fe_read_all(chr, p, size);
 if (r != size) {
 error_report("Failed to read msg payload."
-" Read %d instead of %d.\n", r, msg->size);
+" Read %d instead of %d.", r, msg->size);
 goto fail;
 }
 }
@@ -236,14 +236,14 @@ static int vhost_user_call(struct vhost_dev *dev, 
unsigned long int request,
 
 if (!fd_num) {
 error_report("Failed initializing vhost-user memory map\n"
-"consider using -object memory-backend-file share=on\n");
+"consider using -object memory-backend-file share=on");
 return -1;
 }
 
 msg.size = sizeof(m.memory.nregions);
 msg.size += sizeof(m.memory.padding);
 msg.size += fd_num * sizeof(VhostUserMemoryRegion);
-
+need_reply = 1;
 break;
 
 case VHOST_SET_LOG_FD:
@@ -280,7 +280,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 }
 break;
 default:
-error_report("vhost-user trying to send unhandled ioctl\n");
+error_report("vhost-user trying to send unhandled ioctl");
 return -1;
 break;
 }
@@ -296,27 +296,38 @@ static int vhost_user_call(struct vhost_dev *dev, 
unsigned long int request,
 
 if (msg_request != msg.request) {
 error_report("Received unexpected msg type."
-" Expected %d received %d\n", msg_request, msg.request);
+" Expected %d received %d", msg_request, msg.request);
 return -1;
 }
 
 switch (msg_request) {
 case VHOST_USER_GET_FEATURES:
 if (msg.size != sizeof(m.u64)) {
-error_report("Received bad msg size.\n");
+error_report("Received bad msg size.");
 return -1;
 }
 *((__u64 *) arg) = msg.u64;
 break;
 case VHOST_USER_GET_VRING_BASE:
 if (msg.size != sizeof(m.state)) {
-error_report("Received bad msg size.\n");
+error_report("Received bad msg size.");
 return -1;
 }
 memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
 break;
+case VHOST_SET_MEM_TABLE:
+if (msg.size != sizeof(m.u64)) {
+error_report("Received bad msg size.");
+return -1;
+} else {
+if (msg.u64) {
+error_report("Failed to set memory table.");
+return -1;
+}
+}
+break;
 default:
-error_report("Received unexpected msg type.\n");
+error_report("Received unexpected msg type.");
 return -1;
 break;
 }
-- 
1.7.12.4





[Qemu-devel] [PATCH 1/2] vhost-user: support SET_MEM_TABLE waite the result of mmap

2015-02-09 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

The slave should reply to the master and set u64 to 0 if
mmapping all regions succeeds, otherwise set u64 to 1.

Signed-off-by: Linhaifeng haifeng@huawei.com
---
 docs/specs/vhost-user.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 650bb18..c96bf6b 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -171,6 +171,7 @@ Message types
   Id: 5
   Equivalent ioctl: VHOST_SET_MEM_TABLE
   Master payload: memory regions description
+  Slave payload: u64 (0:success >0:failed)
 
   Sets the memory map regions on the slave so it can translate the vring
   addresses. In the ancillary data there is an array of file descriptors
-- 
1.7.12.4





Re: [Qemu-devel] [PATCH 2/2] vhost-user: add reply for set_mem_table

2015-02-09 Thread Linhaifeng


On 2015/2/10 11:57, Gonglei wrote:
 On 2015/2/10 11:24, linhaifeng wrote:
 From: Linhaifeng haifeng@huawei.com

 If u64 is not 0 we should return -1 to tell QEMU not to go on.

 Signed-off-by: Linhaifeng haifeng@huawei.com
 ---
  hw/virtio/vhost-user.c | 13 -
  1 file changed, 12 insertions(+), 1 deletion(-)

 diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
 index aefe0bb..a68ce36 100644
 --- a/hw/virtio/vhost-user.c
 +++ b/hw/virtio/vhost-user.c
 @@ -243,7 +243,7 @@ static int vhost_user_call(struct vhost_dev *dev, 
 unsigned long int request,
  msg.size = sizeof(m.memory.nregions);
  msg.size += sizeof(m.memory.padding);
  msg.size += fd_num * sizeof(VhostUserMemoryRegion);
 -
 +need_reply = 1;
  break;
  
  case VHOST_SET_LOG_FD:
 @@ -315,6 +315,17 @@ static int vhost_user_call(struct vhost_dev *dev, 
 unsigned long int request,
  }
  memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
  break;
 +case VHOST_SET_MEM_TABLE:
 +if (msg.size != sizeof(m.u64)) {
 +error_report("Received bad msg size.\n");
 A superfluous \n character.

Hi Gonglei

Thank you for your review.

Yes, error_report already prints '\n'.
I will remove it.

 +return -1;
 +} else {
 +if (msg.u64) {
 +error_report("Failed to set memory table.\n");
 A superfluous \n character.
 
 Regards,
 -Gonglei
 +return -1;
 +}
 +}
 +break;
  default:
  error_report("Received unexpected msg type.\n");
  return -1;

 
 
 
 .
 

-- 
Regards,
Haifeng




[Qemu-devel] [PATCH v1 1/2] vhost-user: support SET_MEM_TABLE waite the result of mmap

2015-02-09 Thread linhaifeng
From: Linhaifeng haifeng@huawei.com

The slave should reply to the master and set u64 to 0 if
mmapping all regions succeeds, otherwise set u64 to 1.

Signed-off-by: Linhaifeng haifeng@huawei.com
---
 docs/specs/vhost-user.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 650bb18..c96bf6b 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -171,6 +171,7 @@ Message types
   Id: 5
   Equivalent ioctl: VHOST_SET_MEM_TABLE
   Master payload: memory regions description
+  Slave payload: u64 (0:success >0:failed)
 
   Sets the memory map regions on the slave so it can translate the vring
   addresses. In the ancillary data there is an array of file descriptors
-- 
1.7.12.4





Re: [Qemu-devel] [PATCH v1 2/2] vhost-user: add reply for set_mem_table

2015-02-09 Thread Linhaifeng


On 2015/2/10 14:35, Gonglei wrote:
 On 2015/2/10 13:48, linhaifeng wrote:
 From: Linhaifeng haifeng@huawei.com

 If u64 is not 0 we should return -1 to tell qemu not going on.

 Remove some unnecessary '\n' in error_report.
 Hi, haifeng:
 
 You'd better split this work into a separate patch,
 and use the '--cover' argument of git format-patch for the patch series. :)
 

good idea!

 Regards,
 -Gonglei

 Signed-off-by: Linhaifeng haifeng@huawei.com
 ---
  hw/virtio/vhost-user.c | 33 ++---
  1 file changed, 22 insertions(+), 11 deletions(-)
 
 
 
 .
 

-- 
Regards,
Haifeng




Re: [Qemu-devel] [PATCH v1] vhost-user: fix not send all hugepage files to vhost-user

2015-02-03 Thread Linhaifeng


On 2015/2/2 7:29, Paolo Bonzini wrote:
 
 
 On 17/12/2014 07:02, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 If we create a VM with two or more NUMA nodes, QEMU will create two
 or more hugepage files, but QEMU only sends one hugepage file fd
 to vhost-user when the VM's memory size is 2G and it has two NUMA nodes.

 Signed-off-by: linhaifeng haifeng@huawei.com
 
 The bug is in vhost_dev_assign_memory.  It doesn't check that the file
 descriptor matches when merging regions.  Michael, does the merging
 trigger in practice?  Can we just eliminate it?
 
 Paolo
 

Nice!

Has this bug been fixed?

 ---
  hw/virtio/vhost-user.c  | 78 
 ++---
  hw/virtio/vhost.c   | 13 
  linux-headers/linux/vhost.h |  7 
  3 files changed, 73 insertions(+), 25 deletions(-)

 diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
 index aefe0bb..439cbba 100644
 --- a/hw/virtio/vhost-user.c
 +++ b/hw/virtio/vhost-user.c
 @@ -24,6 +24,10 @@
   #include <linux/vhost.h>
  
  #define VHOST_MEMORY_MAX_NREGIONS    8
 +/* FIXME: same as the max number of NUMA nodes? */
 +#define HUGEPAGE_MAX_FILES   8
 +
 +#define RAM_SHARED (1 << 1)
  
  typedef enum VhostUserRequest {
  VHOST_USER_NONE = 0,
 @@ -41,14 +45,15 @@ typedef enum VhostUserRequest {
  VHOST_USER_SET_VRING_KICK = 12,
  VHOST_USER_SET_VRING_CALL = 13,
  VHOST_USER_SET_VRING_ERR = 14,
 -VHOST_USER_MAX
 +VHOST_USER_MMAP_HUGEPAGE_FILE = 15,
 +VHOST_USER_UNMAP_HUGEPAGE_FILE = 16,
 +VHOST_USER_MAX,
  } VhostUserRequest;
  
  typedef struct VhostUserMemoryRegion {
  uint64_t guest_phys_addr;
  uint64_t memory_size;
  uint64_t userspace_addr;
 -uint64_t mmap_offset;
  } VhostUserMemoryRegion;
  
  typedef struct VhostUserMemory {
 @@ -57,6 +62,16 @@ typedef struct VhostUserMemory {
  VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS];
  } VhostUserMemory;
  
 +typedef struct HugepageMemoryInfo {
 +uint64_t base_addr;
 +uint64_t size;
 +}HugeMemInfo;
 +
 +typedef struct HugepageInfo {
 +uint32_t num;
 +HugeMemInfo files[HUGEPAGE_MAX_FILES];
 +}HugepageInfo;
 +
  typedef struct VhostUserMsg {
  VhostUserRequest request;
  
 @@ -71,6 +86,7 @@ typedef struct VhostUserMsg {
  struct vhost_vring_state state;
  struct vhost_vring_addr addr;
  VhostUserMemory memory;
 +HugepageInfo huge_info;
  };
  } QEMU_PACKED VhostUserMsg;
  
 @@ -104,7 +120,9 @@ static unsigned long int 
 ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
  VHOST_GET_VRING_BASE,   /* VHOST_USER_GET_VRING_BASE */
  VHOST_SET_VRING_KICK,   /* VHOST_USER_SET_VRING_KICK */
  VHOST_SET_VRING_CALL,   /* VHOST_USER_SET_VRING_CALL */
 -VHOST_SET_VRING_ERR /* VHOST_USER_SET_VRING_ERR */
 +VHOST_SET_VRING_ERR,/* VHOST_USER_SET_VRING_ERR */
 +VHOST_MMAP_HUGEPAGE_FILE,  /* VHOST_USER_MMAP_HUGEPAGE_FILE */
 +VHOST_UNMAP_HUGEPAGE_FILE, /* VHOST_USER_UNMAP_HUGEPAGE_FILE */
  };
  
  static VhostUserRequest vhost_user_request_translate(unsigned long int 
 request)
 @@ -190,6 +208,7 @@ static int vhost_user_call(struct vhost_dev *dev, 
 unsigned long int request,
  int fds[VHOST_MEMORY_MAX_NREGIONS];
  int i, fd;
  size_t fd_num = 0;
 +RAMBlock *block;
  
  assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER);
  
 @@ -213,37 +232,46 @@ static int vhost_user_call(struct vhost_dev *dev, 
 unsigned long int request,
  case VHOST_RESET_OWNER:
  break;
  
 -case VHOST_SET_MEM_TABLE:
 -for (i = 0; i < dev->mem->nregions; ++i) {
 -struct vhost_memory_region *reg = dev->mem->regions + i;
 -ram_addr_t ram_addr;
 +case VHOST_MMAP_HUGEPAGE_FILE:
 +qemu_mutex_lock_ramlist();
  
 -assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
 -qemu_ram_addr_from_host((void *)(uintptr_t)reg->userspace_addr, &ram_addr);
 -fd = qemu_get_ram_fd(ram_addr);
 -if (fd > 0) {
 -msg.memory.regions[fd_num].userspace_addr = reg->userspace_addr;
 -msg.memory.regions[fd_num].memory_size  = reg->memory_size;
 -msg.memory.regions[fd_num].guest_phys_addr = reg->guest_phys_addr;
 -msg.memory.regions[fd_num].mmap_offset = reg->userspace_addr -
 -(uintptr_t) qemu_get_ram_block_host_ptr(ram_addr);
 -assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
 -fds[fd_num++] = fd;
 +/* Get hugepage file information */
 +QTAILQ_FOREACH(block, ram_list.blocks, next) {
 +if (block->flags & RAM_SHARED && block->fd > 0) {
 +msg.huge_info.files[fd_num].size = block->length;
 +msg.huge_info.files[fd_num].base_addr = block->host;
 +fds[fd_num++] = block->fd;
  }
  }
 +msg.huge_info.num = fd_num

Re: [Qemu-devel] [PATCH v1] vhost-user: fix not send all hugepage files to vhost-user

2015-01-29 Thread Linhaifeng


On 2015/1/29 18:51, Michael S. Tsirkin wrote:
 On Thu, Jan 29, 2015 at 11:58:08AM +0800, Linhaifeng wrote:
 Hi, Michael S. Tsirkin

 The vhost-user device will not work if there are two NUMA nodes in the VM.

 Should we fix this bug or ignore it?
 
 I suggest we fix this bug.
 I saw that you responded to self so I assume you will
 send v2 with the modification you listed.
 Did I get it right?
 
 Also, pls Cc qemu-stable on bugfixes.
 
 Thanks!
 

Hi, Michael S. Tsirkin

Yes, in v2 I want to write vhost-user-test.c to show how to mmap and how to
calculate the address, so that you and others can review it. But I don't know
how to compile and run the test.
Is there any document that shows it?

BTW, I know you wrote the virtio_net driver and vhost. Is there any document
that introduces the principle of how virtio and vhost work?

 On 2014/12/18 13:06, Linhaifeng wrote:


 On 2014/12/17 14:02, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 If we create a VM with two or more NUMA nodes, QEMU will create two
 or more hugepage files, but QEMU only sends one hugepage file fd
 to vhost-user when the VM's memory size is 2G and it has two NUMA nodes.

 Signed-off-by: linhaifeng haifeng@huawei.com
 ---
  hw/virtio/vhost-user.c  | 78 
 ++---
  hw/virtio/vhost.c   | 13 
  linux-headers/linux/vhost.h |  7 
  3 files changed, 73 insertions(+), 25 deletions(-)

 diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
 index aefe0bb..439cbba 100644
 --- a/hw/virtio/vhost-user.c
 +++ b/hw/virtio/vhost-user.c
 @@ -24,6 +24,10 @@
   #include <linux/vhost.h>
  
  #define VHOST_MEMORY_MAX_NREGIONS    8
 +/* FIXME: same as the max number of NUMA nodes? */
 +#define HUGEPAGE_MAX_FILES   8
 +
 +#define RAM_SHARED (1 << 1)
  
  typedef enum VhostUserRequest {
  VHOST_USER_NONE = 0,
 @@ -41,14 +45,15 @@ typedef enum VhostUserRequest {
  VHOST_USER_SET_VRING_KICK = 12,
  VHOST_USER_SET_VRING_CALL = 13,
  VHOST_USER_SET_VRING_ERR = 14,
 -VHOST_USER_MAX
 +VHOST_USER_MMAP_HUGEPAGE_FILE = 15,
 +VHOST_USER_UNMAP_HUGEPAGE_FILE = 16,
 +VHOST_USER_MAX,
  } VhostUserRequest;
  
  typedef struct VhostUserMemoryRegion {
  uint64_t guest_phys_addr;
  uint64_t memory_size;
  uint64_t userspace_addr;
 -uint64_t mmap_offset;
  } VhostUserMemoryRegion;
  
  typedef struct VhostUserMemory {
 @@ -57,6 +62,16 @@ typedef struct VhostUserMemory {
  VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS];
  } VhostUserMemory;
  
 +typedef struct HugepageMemoryInfo {
 +uint64_t base_addr;
 +uint64_t size;
 +}HugeMemInfo;
 +
 +typedef struct HugepageInfo {
 +uint32_t num;
 +HugeMemInfo files[HUGEPAGE_MAX_FILES];
 +}HugepageInfo;
 +
  typedef struct VhostUserMsg {
  VhostUserRequest request;
  
 @@ -71,6 +86,7 @@ typedef struct VhostUserMsg {
  struct vhost_vring_state state;
  struct vhost_vring_addr addr;
  VhostUserMemory memory;
 +HugepageInfo huge_info;
  };
  } QEMU_PACKED VhostUserMsg;
  
 @@ -104,7 +120,9 @@ static unsigned long int 
 ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
  VHOST_GET_VRING_BASE,   /* VHOST_USER_GET_VRING_BASE */
  VHOST_SET_VRING_KICK,   /* VHOST_USER_SET_VRING_KICK */
  VHOST_SET_VRING_CALL,   /* VHOST_USER_SET_VRING_CALL */
 -VHOST_SET_VRING_ERR /* VHOST_USER_SET_VRING_ERR */
 +VHOST_SET_VRING_ERR,/* VHOST_USER_SET_VRING_ERR */
 +VHOST_MMAP_HUGEPAGE_FILE,  /* VHOST_USER_MMAP_HUGEPAGE_FILE */
 +VHOST_UNMAP_HUGEPAGE_FILE, /* VHOST_USER_UNMAP_HUGEPAGE_FILE */
  };
  
  static VhostUserRequest vhost_user_request_translate(unsigned long int 
 request)
 @@ -190,6 +208,7 @@ static int vhost_user_call(struct vhost_dev *dev, 
 unsigned long int request,
  int fds[VHOST_MEMORY_MAX_NREGIONS];
  int i, fd;
  size_t fd_num = 0;
 +RAMBlock *block;
  
  assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER);
  
 @@ -213,37 +232,46 @@ static int vhost_user_call(struct vhost_dev *dev, 
 unsigned long int request,
  case VHOST_RESET_OWNER:
  break;
  
 -case VHOST_SET_MEM_TABLE:
 -for (i = 0; i < dev->mem->nregions; ++i) {
 -struct vhost_memory_region *reg = dev->mem->regions + i;
 -ram_addr_t ram_addr;
 +case VHOST_MMAP_HUGEPAGE_FILE:
 +qemu_mutex_lock_ramlist();
  
 -assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
 -qemu_ram_addr_from_host((void *)(uintptr_t)reg->userspace_addr, &ram_addr);
 -fd = qemu_get_ram_fd(ram_addr);
 -if (fd > 0) {
 -msg.memory.regions[fd_num].userspace_addr = reg->userspace_addr;
 -msg.memory.regions[fd_num].memory_size  = reg->memory_size;
 -msg.memory.regions[fd_num].guest_phys_addr = reg->guest_phys_addr;
 -msg.memory.regions[fd_num].mmap_offset = reg->userspace_addr

Re: [Qemu-devel] [PATCH v1] vhost-user: fix not send all hugepage files to vhost-user

2015-01-28 Thread Linhaifeng
Hi, Michael S. Tsirkin

The vhost-user device will not work if there are two NUMA nodes in the VM.

Should we fix this bug or ignore it?

On 2014/12/18 13:06, Linhaifeng wrote:
 
 
 On 2014/12/17 14:02, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 If we create a VM with two or more NUMA nodes, QEMU will create two
 or more hugepage files, but QEMU only sends one hugepage file fd
 to vhost-user when the VM's memory size is 2G and it has two NUMA nodes.

 Signed-off-by: linhaifeng haifeng@huawei.com
 ---
  hw/virtio/vhost-user.c  | 78 
 ++---
  hw/virtio/vhost.c   | 13 
  linux-headers/linux/vhost.h |  7 
  3 files changed, 73 insertions(+), 25 deletions(-)

 diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
 index aefe0bb..439cbba 100644
 --- a/hw/virtio/vhost-user.c
 +++ b/hw/virtio/vhost-user.c
 @@ -24,6 +24,10 @@
   #include <linux/vhost.h>
  
  #define VHOST_MEMORY_MAX_NREGIONS    8
 +/* FIXME: same as the max number of NUMA nodes? */
 +#define HUGEPAGE_MAX_FILES   8
 +
 +#define RAM_SHARED (1 << 1)
  
  typedef enum VhostUserRequest {
  VHOST_USER_NONE = 0,
 @@ -41,14 +45,15 @@ typedef enum VhostUserRequest {
  VHOST_USER_SET_VRING_KICK = 12,
  VHOST_USER_SET_VRING_CALL = 13,
  VHOST_USER_SET_VRING_ERR = 14,
 -VHOST_USER_MAX
 +VHOST_USER_MMAP_HUGEPAGE_FILE = 15,
 +VHOST_USER_UNMAP_HUGEPAGE_FILE = 16,
 +VHOST_USER_MAX,
  } VhostUserRequest;
  
  typedef struct VhostUserMemoryRegion {
  uint64_t guest_phys_addr;
  uint64_t memory_size;
  uint64_t userspace_addr;
 -uint64_t mmap_offset;
  } VhostUserMemoryRegion;
  
  typedef struct VhostUserMemory {
 @@ -57,6 +62,16 @@ typedef struct VhostUserMemory {
  VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS];
  } VhostUserMemory;
  
 +typedef struct HugepageMemoryInfo {
 +uint64_t base_addr;
 +uint64_t size;
 +}HugeMemInfo;
 +
 +typedef struct HugepageInfo {
 +uint32_t num;
 +HugeMemInfo files[HUGEPAGE_MAX_FILES];
 +}HugepageInfo;
 +
  typedef struct VhostUserMsg {
  VhostUserRequest request;
  
 @@ -71,6 +86,7 @@ typedef struct VhostUserMsg {
  struct vhost_vring_state state;
  struct vhost_vring_addr addr;
  VhostUserMemory memory;
 +HugepageInfo huge_info;
  };
  } QEMU_PACKED VhostUserMsg;
  
 @@ -104,7 +120,9 @@ static unsigned long int 
 ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
  VHOST_GET_VRING_BASE,   /* VHOST_USER_GET_VRING_BASE */
  VHOST_SET_VRING_KICK,   /* VHOST_USER_SET_VRING_KICK */
  VHOST_SET_VRING_CALL,   /* VHOST_USER_SET_VRING_CALL */
 -VHOST_SET_VRING_ERR /* VHOST_USER_SET_VRING_ERR */
 +VHOST_SET_VRING_ERR,/* VHOST_USER_SET_VRING_ERR */
 +VHOST_MMAP_HUGEPAGE_FILE,  /* VHOST_USER_MMAP_HUGEPAGE_FILE */
 +VHOST_UNMAP_HUGEPAGE_FILE, /* VHOST_USER_UNMAP_HUGEPAGE_FILE */
  };
  
  static VhostUserRequest vhost_user_request_translate(unsigned long int 
 request)
 @@ -190,6 +208,7 @@ static int vhost_user_call(struct vhost_dev *dev, 
 unsigned long int request,
  int fds[VHOST_MEMORY_MAX_NREGIONS];
  int i, fd;
  size_t fd_num = 0;
 +RAMBlock *block;
  
  assert(dev-vhost_ops-backend_type == VHOST_BACKEND_TYPE_USER);
  
 @@ -213,37 +232,46 @@ static int vhost_user_call(struct vhost_dev *dev, 
 unsigned long int request,
  case VHOST_RESET_OWNER:
  break;
  
 -case VHOST_SET_MEM_TABLE:
 -for (i = 0; i  dev-mem-nregions; ++i) {
 -struct vhost_memory_region *reg = dev-mem-regions + i;
 -ram_addr_t ram_addr;
 +case VHOST_MMAP_HUGEPAGE_FILE:
 +qemu_mutex_lock_ramlist();
  
 -assert((uintptr_t)reg-userspace_addr == reg-userspace_addr);
 -qemu_ram_addr_from_host((void *)(uintptr_t)reg-userspace_addr, 
 ram_addr);
 -fd = qemu_get_ram_fd(ram_addr);
 -if (fd  0) {
 -msg.memory.regions[fd_num].userspace_addr = 
 reg-userspace_addr;
 -msg.memory.regions[fd_num].memory_size  = reg-memory_size;
 -msg.memory.regions[fd_num].guest_phys_addr = 
 reg-guest_phys_addr;
 -msg.memory.regions[fd_num].mmap_offset = 
 reg-userspace_addr -
 -(uintptr_t) qemu_get_ram_block_host_ptr(ram_addr);
 -assert(fd_num  VHOST_MEMORY_MAX_NREGIONS);
 -fds[fd_num++] = fd;
 +/* Get hugepage file informations */
 +QTAILQ_FOREACH(block, ram_list.blocks, next) {
 +if (block-flags  RAM_SHARED  block-fd  0) {
 +msg.huge_info.files[fd_num].size = block-length;
 +msg.huge_info.files[fd_num].base_addr = block-host;
 +fds[fd_num++] = block-fd;
  }
  }
 +msg.huge_info.num = fd_num;
  
 -msg.memory.nregions = fd_num;
 +/* Calculate msg size */
 +msg.size = sizeof(m.huge_info.num

Re: [Qemu-devel] [PATCH v1] vhost-user: fix not send all hugepage files to vhost-user

2014-12-17 Thread Linhaifeng


On 2014/12/17 14:02, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com
 
 If we create VM with two or more numa nodes qemu will create two
 or more hugepage files but qemu only send one hugepage file fd
 to vhost-user when VM's memory size is 2G and with two numa nodes.
 
 Signed-off-by: linhaifeng haifeng@huawei.com
 ---
  hw/virtio/vhost-user.c  | 78 
 ++---
  hw/virtio/vhost.c   | 13 
  linux-headers/linux/vhost.h |  7 
  3 files changed, 73 insertions(+), 25 deletions(-)
 
 diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
 index aefe0bb..439cbba 100644
 --- a/hw/virtio/vhost-user.c
 +++ b/hw/virtio/vhost-user.c
 @@ -24,6 +24,10 @@
  #include linux/vhost.h
  
  #define VHOST_MEMORY_MAX_NREGIONS8
 +/* FIXME: same as the max number of numa node?*/
 +#define HUGEPAGE_MAX_FILES   8
 +
 +#define RAM_SHARED (1  1)
  
  typedef enum VhostUserRequest {
  VHOST_USER_NONE = 0,
 @@ -41,14 +45,15 @@ typedef enum VhostUserRequest {
  VHOST_USER_SET_VRING_KICK = 12,
  VHOST_USER_SET_VRING_CALL = 13,
  VHOST_USER_SET_VRING_ERR = 14,
 -VHOST_USER_MAX
 +VHOST_USER_MMAP_HUGEPAGE_FILE = 15,
 +VHOST_USER_UNMAP_HUGEPAGE_FILE = 16,
 +VHOST_USER_MAX,
  } VhostUserRequest;
  
  typedef struct VhostUserMemoryRegion {
  uint64_t guest_phys_addr;
  uint64_t memory_size;
  uint64_t userspace_addr;
 -uint64_t mmap_offset;
  } VhostUserMemoryRegion;
  
  typedef struct VhostUserMemory {
 @@ -57,6 +62,16 @@ typedef struct VhostUserMemory {
  VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS];
  } VhostUserMemory;
  
 +typedef struct HugepageMemoryInfo {
 +uint64_t base_addr;
 +uint64_t size;
 +}HugeMemInfo;
 +
 +typedef struct HugepageInfo {
 +uint32_t num;
 +HugeMemInfo files[HUGEPAGE_MAX_FILES];
 +}HugepageInfo;
 +
  typedef struct VhostUserMsg {
  VhostUserRequest request;
  
 @@ -71,6 +86,7 @@ typedef struct VhostUserMsg {
  struct vhost_vring_state state;
  struct vhost_vring_addr addr;
  VhostUserMemory memory;
 +HugepageInfo huge_info;
  };
  } QEMU_PACKED VhostUserMsg;
  
 @@ -104,7 +120,9 @@ static unsigned long int 
 ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
  VHOST_GET_VRING_BASE,   /* VHOST_USER_GET_VRING_BASE */
  VHOST_SET_VRING_KICK,   /* VHOST_USER_SET_VRING_KICK */
  VHOST_SET_VRING_CALL,   /* VHOST_USER_SET_VRING_CALL */
 -VHOST_SET_VRING_ERR /* VHOST_USER_SET_VRING_ERR */
 +VHOST_SET_VRING_ERR,/* VHOST_USER_SET_VRING_ERR */
 +VHOST_MMAP_HUGEPAGE_FILE,  /* VHOST_USER_MMAP_HUGEPAGE_FILE */
 +VHOST_UNMAP_HUGEPAGE_FILE, /* VHOST_USER_UNMAP_HUGEPAGE_FILE */
  };
  
  static VhostUserRequest vhost_user_request_translate(unsigned long int 
 request)
 @@ -190,6 +208,7 @@ static int vhost_user_call(struct vhost_dev *dev, 
 unsigned long int request,
  int fds[VHOST_MEMORY_MAX_NREGIONS];
  int i, fd;
  size_t fd_num = 0;
 +RAMBlock *block;
  
  assert(dev-vhost_ops-backend_type == VHOST_BACKEND_TYPE_USER);
  
 @@ -213,37 +232,46 @@ static int vhost_user_call(struct vhost_dev *dev, 
 unsigned long int request,
  case VHOST_RESET_OWNER:
  break;
  
 -case VHOST_SET_MEM_TABLE:
 -for (i = 0; i  dev-mem-nregions; ++i) {
 -struct vhost_memory_region *reg = dev-mem-regions + i;
 -ram_addr_t ram_addr;
 +case VHOST_MMAP_HUGEPAGE_FILE:
 +qemu_mutex_lock_ramlist();
  
 -assert((uintptr_t)reg-userspace_addr == reg-userspace_addr);
 -qemu_ram_addr_from_host((void *)(uintptr_t)reg-userspace_addr, 
 ram_addr);
 -fd = qemu_get_ram_fd(ram_addr);
 -if (fd  0) {
 -msg.memory.regions[fd_num].userspace_addr = 
 reg-userspace_addr;
 -msg.memory.regions[fd_num].memory_size  = reg-memory_size;
 -msg.memory.regions[fd_num].guest_phys_addr = 
 reg-guest_phys_addr;
 -msg.memory.regions[fd_num].mmap_offset = reg-userspace_addr 
 -
 -(uintptr_t) qemu_get_ram_block_host_ptr(ram_addr);
 -assert(fd_num  VHOST_MEMORY_MAX_NREGIONS);
 -fds[fd_num++] = fd;
 +/* Get hugepage file informations */
 +QTAILQ_FOREACH(block, ram_list.blocks, next) {
 +if (block-flags  RAM_SHARED  block-fd  0) {
 +msg.huge_info.files[fd_num].size = block-length;
 +msg.huge_info.files[fd_num].base_addr = block-host;
 +fds[fd_num++] = block-fd;
  }
  }
 +msg.huge_info.num = fd_num;
  
 -msg.memory.nregions = fd_num;
 +/* Calculate msg size */
 +msg.size = sizeof(m.huge_info.num);
 +msg.size += fd_num * sizeof(HugeMemInfo);
 +
 +qemu_mutex_unlock_ramlist();
 +break;
  
 -if (!fd_num

Re: [Qemu-devel] vhost-user:How to send memory info

2014-12-11 Thread Linhaifeng

here is qemu's log :
file_ram_alloc 
filename[/dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node0.okpjmP] 
fd[7] size[1073741824]
file_ram_alloc 
filename[/dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node1.3b43YF] 
fd[8] size[1073741824]

region fd[7]:
gpa = 0xC
size = 2146697216
ua = 0x2acc
offset = 786432

region fd[7]:
gpa = 0x0
size = 655360
ua = 0x2ac0
offset = 0

We can see that the memory regions only cover one of the hugepage files. Is this a bug?

On 2014/12/11 11:10, Linhaifeng wrote:
 Hi,all
 
 Yestoday i tested the set_mem_table message found that qemu not send all the 
 memory info(fd and size) when
 VM memory size is 2G and have two numa nodes(two hugepage files).If VM memory 
 size is 4G and have two numa nodes
 will send all the memory info.
 
 Here is my understand,is this right?
 If qemu not send all the memory infomation about hugepage files,vhost-user 
 couldn't map all the VM memory.
 If vhost-user couldn't map all the VM memory vhost-user maybe couldn't read 
 the packets which allocate by Guest.
 
 
 1.Information about VM who has 2G and two numa nodes(vhost-user couldn't read 
 the packets from tx_ring of Guest):
 xml:
   cpu
 numa
   cell id='0' cpus='0' memory='1048576' memAccess='shared'/
   cell id='1' cpus='1' memory='1048576' memAccess='shared'/
 /numa
   /cpu
  memoryBacking
 hugepages
   page size=2 unit=M nodeset=0,1/
 /hugepages
  /memoryBacking
  interface type='vhostuser'
 mac address='52:54:00:3b:83:1a'/
 source type='unix' path='/var/run/vhost-user/port1' mode='client'/
 model type='virtio'/
  /interface
 
 qemu command:
 -m 2048 -smp 2,sockets=2,cores=1,threads=1
 -object 
 memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=1024M,id=ram-node0
  -numa node,nodeid=0,cpus=0,memdev=ram-node0
 -object 
 memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=1024M,id=ram-node1
  -numa node,nodeid=1,cpus=1,memdev=ram-node1
 
 
 memory regions:
 gpa = 0xC
 size = 2146697216
 ua = 0x2acc
 offset = 786432
 
 gpa = 0x0
 size = 655360
 ua = 0x2ac0
 offset = 0
 
 hugepage:
 cat /proc/pidof qemu/maps
 2ac0-2aaaeac0 rw-s  00:18 10357788   
 /dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node0.MvcPyi (deleted)
 2aaaeac0-2aab2ac0 rw-s  00:18 10357789   
 /dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node1.tjAVin (deleted)
 
 The memory size of each region is not match to the size of each hugepage 
 file,is this ok?How does vhost-user to mmap all the hugepage?
 
 2.Information about VM who has 4G and two numa nodes(vhost-user could read 
 the packets from tx_ring of Guest):
 xml:
   cpu
 numa
   cell id='0' cpus='0' memory='2097152' memAccess='shared'/
   cell id='1' cpus='1' memory='2097152' memAccess='shared'/
 /numa
   /cpu
 
  memoryBacking
 hugepages
   page size=2 unit=M nodeset=0,1/
 /hugepages
  /memoryBacking
 
 qemu command:
 -m 4096 -smp 2,sockets=2,cores=1,threads=1
 -object 
 memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=2048M,id=ram-node0
  -numa node,nodeid=0,cpus=0,memdev=ram-node0
 -object 
 memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=2048M,id=ram-node1
  -numa node,nodeid=1,cpus=1,memdev=ram-node1
 
 memory regions:
 gpa = 0xC
 size = 3220439040
 ua = 0x2acc
 offset = 786432
 
 gpa = 0x1
 size = 1073741824
 ua = 0x2aab6ac0
 offset = 1073741824
 
 gpa = 0x0
 size = 655360
 ua = 0x2ac0
 offset = 0
 
 hugepage:
 cat /proc/pidof qemu/maps
 2ac0-2aab2ac0 rw-s  00:18 10756623   
 /dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node0.If4Qkf (deleted)
 2aab2ac0-2aabaac0 rw-s  00:18 10756624   
 /dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node1.jQZPcI (deleted)
 

-- 
Regards,
Haifeng




[Qemu-devel] How to send memory info

2014-12-10 Thread Linhaifeng
Hi,all

Yesterday I tested the set_mem_table message and found that QEMU does not send all of the memory info (fd and size) when the VM has 2G of memory and two NUMA nodes (two hugepage files). If the VM has 4G of memory and two NUMA nodes, all of the memory info is sent.

Here is my understanding, is this right?
If QEMU does not send the memory information for every hugepage file, vhost-user cannot map all of the VM memory.
If vhost-user cannot map all of the VM memory, it may be unable to read the packets allocated by the guest.


1.Information about VM who has 2G and two numa nodes(vhost-user couldn't read 
the packets from tx_ring of Guest):
xml:
  cpu
numa
  cell id='0' cpus='0' memory='1048576' memAccess='shared'/
  cell id='1' cpus='1' memory='1048576' memAccess='shared'/
/numa
  /cpu
 memoryBacking
hugepages
  page size=2 unit=M nodeset=0,1/
/hugepages
 /memoryBacking
 interface type='vhostuser'
mac address='52:54:00:3b:83:1a'/
source type='unix' path='/var/run/vhost-user/port1' mode='client'/
model type='virtio'/
 /interface

qemu command:
-m 2048 -smp 2,sockets=2,cores=1,threads=1
-object 
memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=1024M,id=ram-node0
 -numa node,nodeid=0,cpus=0,memdev=ram-node0
-object 
memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=1024M,id=ram-node1
 -numa node,nodeid=1,cpus=1,memdev=ram-node1


memory regions:
gpa = 0xC
size = 2146697216
ua = 0x2acc
offset = 786432

gpa = 0x0
size = 655360
ua = 0x2ac0
offset = 0

hugepage:
cat /proc/pidof qemu/maps
2ac0-2aaaeac0 rw-s  00:18 10357788   
/dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node0.MvcPyi (deleted)
2aaaeac0-2aab2ac0 rw-s  00:18 10357789   
/dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node1.tjAVin (deleted)

The memory size of each region does not match the size of each hugepage file; is this OK? How is vhost-user supposed to mmap all of the hugepages?
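
For context, here is a minimal sketch (the struct and function names are my own assumptions, not QEMU's or any particular backend's API) of what the backend has to do with each region/fd pair it receives in SET_MEM_TABLE; if one hugepage file's fd is never sent, the corresponding part of guest memory simply cannot be reached:

#include <stdint.h>
#include <sys/mman.h>

struct mem_region {
    uint64_t guest_phys_addr;
    uint64_t memory_size;
    uint64_t userspace_addr;
    uint64_t mmap_offset;
    void    *mmap_base;           /* filled in by map_region() */
};

/* Map the whole backing file area: memory_size + mmap_offset bytes at
 * file offset 0, then remember the base so GPAs can be translated. */
static int map_region(struct mem_region *r, int fd)
{
    size_t len = r->memory_size + r->mmap_offset;
    void *base = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    if (base == MAP_FAILED) {
        return -1;
    }
    r->mmap_base = base;
    return 0;
}

/* Translate a guest physical address to a pointer inside our mapping. */
static void *gpa_to_va(const struct mem_region *r, uint64_t gpa)
{
    if (gpa < r->guest_phys_addr ||
        gpa >= r->guest_phys_addr + r->memory_size) {
        return NULL;
    }
    return (uint8_t *)r->mmap_base + r->mmap_offset +
           (gpa - r->guest_phys_addr);
}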

2.Information about VM who has 4G and two numa nodes(vhost-user could read the 
packets from tx_ring of Guest):
xml:
  cpu
numa
  cell id='0' cpus='0' memory='2097152' memAccess='shared'/
  cell id='1' cpus='1' memory='2097152' memAccess='shared'/
/numa
  /cpu

 memoryBacking
hugepages
  page size=2 unit=M nodeset=0,1/
/hugepages
 /memoryBacking

qemu command:
-m 4096 -smp 2,sockets=2,cores=1,threads=1
-object 
memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=2048M,id=ram-node0
 -numa node,nodeid=0,cpus=0,memdev=ram-node0
-object 
memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=2048M,id=ram-node1
 -numa node,nodeid=1,cpus=1,memdev=ram-node1

memory regions:
gpa = 0xC
size = 3220439040
ua = 0x2acc
offset = 786432

gpa = 0x1
size = 1073741824
ua = 0x2aab6ac0
offset = 1073741824

gpa = 0x0
size = 655360
ua = 0x2ac0
offset = 0

hugepage:
cat /proc/pidof qemu/maps
2ac0-2aab2ac0 rw-s  00:18 10756623   
/dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node0.If4Qkf (deleted)
2aab2ac0-2aabaac0 rw-s  00:18 10756624   
/dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node1.jQZPcI (deleted)

-- 
Regards,
Haifeng




[Qemu-devel] vhost_user: How to send memory info

2014-12-10 Thread Linhaifeng


On 2014/12/11 11:10, Linhaifeng wrote:
 Hi,all
 
 Yestoday i tested the set_mem_table message found that qemu not send all the 
 memory info(fd and size) when
 VM memory size is 2G and have two numa nodes(two hugepage files).If VM memory 
 size is 4G and have two numa nodes
 will send all the memory info.
 
 Here is my understand,is this right?
 If qemu not send all the memory infomation about hugepage files,vhost-user 
 couldn't map all the VM memory.
 If vhost-user couldn't map all the VM memory vhost-user maybe couldn't read 
 the packets which allocate by Guest.
 
 
 1.Information about VM who has 2G and two numa nodes(vhost-user couldn't read 
 the packets from tx_ring of Guest):
 xml:
   cpu
 numa
   cell id='0' cpus='0' memory='1048576' memAccess='shared'/
   cell id='1' cpus='1' memory='1048576' memAccess='shared'/
 /numa
   /cpu
  memoryBacking
 hugepages
   page size=2 unit=M nodeset=0,1/
 /hugepages
  /memoryBacking
  interface type='vhostuser'
 mac address='52:54:00:3b:83:1a'/
 source type='unix' path='/var/run/vhost-user/port1' mode='client'/
 model type='virtio'/
  /interface
 
 qemu command:
 -m 2048 -smp 2,sockets=2,cores=1,threads=1
 -object 
 memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=1024M,id=ram-node0
  -numa node,nodeid=0,cpus=0,memdev=ram-node0
 -object 
 memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=1024M,id=ram-node1
  -numa node,nodeid=1,cpus=1,memdev=ram-node1
 
 
 memory regions:
 gpa = 0xC
 size = 2146697216
 ua = 0x2acc
 offset = 786432
 
 gpa = 0x0
 size = 655360
 ua = 0x2ac0
 offset = 0
 
 hugepage:
 cat /proc/pidof qemu/maps
 2ac0-2aaaeac0 rw-s  00:18 10357788   
 /dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node0.MvcPyi (deleted)
 2aaaeac0-2aab2ac0 rw-s  00:18 10357789   
 /dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node1.tjAVin (deleted)
 
 The memory size of each region is not match to the size of each hugepage 
 file,is this ok?How does vhost-user to mmap all the hugepage?
 
 2.Information about VM who has 4G and two numa nodes(vhost-user could read 
 the packets from tx_ring of Guest):
 xml:
   cpu
 numa
   cell id='0' cpus='0' memory='2097152' memAccess='shared'/
   cell id='1' cpus='1' memory='2097152' memAccess='shared'/
 /numa
   /cpu
 
  memoryBacking
 hugepages
   page size=2 unit=M nodeset=0,1/
 /hugepages
  /memoryBacking
 
 qemu command:
 -m 4096 -smp 2,sockets=2,cores=1,threads=1
 -object 
 memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=2048M,id=ram-node0
  -numa node,nodeid=0,cpus=0,memdev=ram-node0
 -object 
 memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=2048M,id=ram-node1
  -numa node,nodeid=1,cpus=1,memdev=ram-node1
 
 memory regions:
 gpa = 0xC
 size = 3220439040
 ua = 0x2acc
 offset = 786432
 
 gpa = 0x1
 size = 1073741824
 ua = 0x2aab6ac0
 offset = 1073741824
 
 gpa = 0x0
 size = 655360
 ua = 0x2ac0
 offset = 0
 
 hugepage:
 cat /proc/pidof qemu/maps
 2ac0-2aab2ac0 rw-s  00:18 10756623   
 /dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node0.If4Qkf (deleted)
 2aab2ac0-2aabaac0 rw-s  00:18 10756624   
 /dev/hugepages/libvirt/qemu/qemu_back_mem._objects_ram-node1.jQZPcI (deleted)
 

-- 
Regards,
Haifeng




Re: [Qemu-devel] [PATCH] vhost-user: fix mmap offset calculation

2014-11-02 Thread Linhaifeng
good job!passed test bigger than 3.5G VM.

On 2014/11/3 2:01, Michael S. Tsirkin wrote:
 qemu_get_ram_block_host_ptr should get ram_addr_t,
 vhost-user passes in GPA.
 That's very wrong.
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---
  hw/virtio/vhost-user.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
 index 4e88d9c..aefe0bb 100644
 --- a/hw/virtio/vhost-user.c
 +++ b/hw/virtio/vhost-user.c
 @@ -226,7 +226,7 @@ static int vhost_user_call(struct vhost_dev *dev, 
 unsigned long int request,
  msg.memory.regions[fd_num].memory_size  = reg-memory_size;
  msg.memory.regions[fd_num].guest_phys_addr = 
 reg-guest_phys_addr;
  msg.memory.regions[fd_num].mmap_offset = reg-userspace_addr 
 -
 -(uintptr_t) 
 qemu_get_ram_block_host_ptr(reg-guest_phys_addr);
 +(uintptr_t) qemu_get_ram_block_host_ptr(ram_addr);
  assert(fd_num  VHOST_MEMORY_MAX_NREGIONS);
  fds[fd_num++] = fd;
  }
 

-- 
Regards,
Haifeng




[Qemu-devel] vhost-user:Bad ram offset

2014-11-01 Thread Linhaifeng
Hi,all

A VM using the vhost-user backend cannot start up when its memory is bigger than 3.5G. The log prints "Bad ram offset 1". Is this a bug?

log:
[2014-11-01T08:39:07.245324Z] virtio_set_status:524 virtio-net device status is 
1 that means ACKNOWLEDGE
[2014-11-01T08:39:07.247225Z] virtio_set_status:524 virtio-balloon device 
status is 1 that means ACKNOWLEDGE
[2014-11-01T08:39:07.320191Z] virtio_set_status:524 virtio-net device status is 
3 that means DRIVER
Bad ram offset 1
[2014-11-01 08:39:07]: shutting down


command:
/usr/bin/qemu-system-x86_64 -name vm1
-smp 4
-drive 
file=/mnt/sdb/linhf/imgs/vm1.img,if=none,id=drive-ide0-0-0,format=raw,cache=none,aio=native
 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 \
-chardev socket,id=charnet0,path=/var/run/vhost-user/port1 -netdev 
type=vhost-user,id=hostnet0,chardev=charnet0 -device 
virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:00:00:00:01,bus=pci.0,addr=0x3 
\
-enable-kvm -mem-prealloc -object 
memory-backend-file,id=mem,size=3968M,mem-path=/dev/hugepages,share=on -numa 
node,memdev=mem

When memory is smaller than 3.5G the VM runs well.


-- 
Regards,
Haifeng




Re: [Qemu-devel] vhost-user:why region[0] always mmap failed ?

2014-10-31 Thread Linhaifeng


On 2014/10/16 5:28, Anshul Makkar wrote:
 Hi,
 
 Please can you share in what scenario this mapping fails. I am not seeing any 
 such issue.
 
 Thanks
 Anshul Makkar
 

VM info:
memory:4G
hugepage size:2M

memory regions info:
gpa = 0xC
size = 3220439040
ua = 0x2acc
offset = 786432

gpa = 0x1
size = 1073741824
ua = 0x2aab6ac0
offset = 18446650252267094016

gpa = 0x0
size = 655360
ua = 0x2ac0
offset = 0

log:
mmap fd[61] size[3221225472] failed
mmap fd[62] size[18446650253340835840] failed


 On Wed, Sep 17, 2014 at 10:33:23AM +0800, Linhaifeng wrote:
 Hi,

 There is two memory regions when receive VHOST_SET_MEM_TABLE message:
 region[0]
 gpa = 0x0
 size = 655360
 ua = 0x2ac0
 offset = 0
 region[1]
 gpa = 0xC
 size = 2146697216
 ua = 0x2acc
 offset = 786432

 region[0] always fails to mmap. The user code is:

 for (idx = 0; idx < msg->msg.memory.nregions; idx++) {
     if (msg->fds[idx] > 0) {
         size_t size;
         uint64_t *guest_mem;
         Region *region = &vhost_server->memory.regions[i];

         region->guest_phys_addr = msg->msg.memory.regions[idx].guest_phys_addr;
         region->memory_size = msg->msg.memory.regions[idx].memory_size;
         region->userspace_addr = msg->msg.memory.regions[idx].userspace_addr;
         region->mmap_offset = msg->msg.memory.regions[idx].mmap_offset;

         assert(idx < msg->fd_num);
         assert(msg->fds[idx] > 0);

         size = region->memory_size + region->mmap_offset;
         guest_mem = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                          msg->fds[idx], 0);
         if (MAP_FAILED == guest_mem) {
             continue;
         }
         i++;
         guest_mem += (region->mmap_offset / sizeof(*guest_mem));
         region->mmap_addr = (uint64_t)guest_mem;
         vhost_server->memory.nregions++;
     }
 }
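
One guess (not verified against your kernel) about why region[0] fails: its length, 655360 bytes plus a zero mmap_offset, is not a multiple of the 2M huge page size, and on some kernel versions mmap() of a hugetlbfs fd returns EINVAL unless the length is hugepage-aligned. A minimal sketch of aligning the length up before mapping, assuming st_blksize of a hugetlbfs file reports the huge page size:

#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Huge page size of the backing file (hugetlbfs reports it in st_blksize). */
static uint64_t get_blk_size(int fd)
{
    struct stat st;

    return (fstat(fd, &st) == 0) ? (uint64_t)st.st_blksize : 0;
}

static void *map_region_aligned(int fd, uint64_t memory_size, uint64_t mmap_offset)
{
    uint64_t blksz = get_blk_size(fd);
    uint64_t len = memory_size + mmap_offset;

    if (blksz == 0) {
        return MAP_FAILED;
    }
    /* Round the length up to a multiple of the huge page size. */
    len = (len + blksz - 1) & ~(blksz - 1);
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}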


 
 

-- 
Regards,
Haifeng




[Qemu-devel] is qemu-2.1 support 1G hugepage

2014-10-24 Thread Linhaifeng
Hi,

I ran qemu 2.1 with 1G hugepages and found that the VM can't start (or starts too slowly?), but it can start quickly with 2M hugepages.

kernel version:3.10.0-123.6.3.el7.x86_64

command:qemu-kvm -name vm1 -enable-kvm -smp 2 -m 2048 -object 
memory-backend-file,id=mem,size=2048M,mem-path=/dev/hugepages,share=on -numa 
node,memdev=mem \
-chardev socket,id=chr0,path=/path/socket  -netdev 
type=vhost-user,id=net0,chardev=chr0 -device 
virtio-net-pci,netdev=net0,mac=00:00:00:00:00:01
-drive file=/mnt/sdb/imgs/vm1.img -vnc :0
-- 
Regards,
Haifeng




Re: [Qemu-devel] [PATCH] fix the memory leak for share hugepage

2014-10-20 Thread Linhaifeng


On 2014/10/20 13:32, Wen Congyang wrote:
 On 10/20/2014 12:48 PM, Linhaifeng wrote:


 On 2014/10/20 10:12, Wen Congyang wrote:
 On 10/18/2014 11:20 AM, Linhaifeng wrote:


 On 2014/10/17 21:26, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:57:27PM +0800, Linhaifeng wrote:


 On 2014/10/17 16:33, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:27:17PM +0800, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 The VM start with share hugepage should close the hugefile fd
 when exit.Because the hugepage fd may be send to other process
 e.g vhost-user If qemu not close the fd the other process can
 not free the hugepage otherwise exit process,this is ugly,so
 qemu should close all shared fd when exit.

 Signed-off-by: linhaifeng haifeng@huawei.com

 Err, all file descriptors are closed automatically when a process
 exits. So manually calling close(fd) before exit can't have any
 functional effect on a resource leak.

 If QEMU has sent the FD to another process, that process has a
 completely separate copy of the FD. Closing the FD in QEMU will
 not close the FD in the other process. You need the other process
 to exit for the copy to be closed.

 Regards,
 Daniel

 Hi,daniel

 QEMU send the fd by unix domain socket.unix domain socket just install 
 the fd to
 other process and inc the f_count,if qemu not close the fd the f_count 
 is not dec.
 Then the other process even close the fd the hugepage would not freed 
 whise the other process exit.

 The kernel always closes all FDs when a process exits. So if this FD is
 not being correctly closed then it is a kernel bug. There should never
 be any reason for an application to do close(fd) before exiting.

 Regards,
 Daniel

 Hi,daniel

 I don't think this is kernel's bug.May be this a problem about usage.
 If you open a file you should close it too.

 If you don't close it, the kernel will help you when the program exits.

 Yes,when the hugepage is only used for qemu,the kernel will free the file 
 object.If the hugepage shared for other process,when qemu exit the kernel 
 will not free the file.
 
 Even if the hugepage is shared with the other process, the kernel will auto 
 close the fd when qemu
 exits. If the kernel doesn't do it, it is a kernel bug.
 
The kernel supplies close() for this. If you call open you must call close.
If not, the result is unpredictable.

 This is linux man pageabout how to free resource of file.
 http://linux.die.net/man/2/close


 I'm trying to describe my problem.

 For example, there are 2 VMs run with hugepage and the hugepage only for 
 QEMU to use.

 Before run VM the meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the two VMs.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage.After this step the meminfo is :
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB
 5.shutdown VM with signal 15 without close(fd).After this step the meminfo 
 is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Yes,it works well,like you said the kernel recycle all resources.

 For another example,there are 2 VMs run with hugepage and share the 
 hugepage with vapp(a vhost-user application).

 The vapp is your internal application?

 Yes vapp is a application to share the QEMU's hugepage.So threr are two 
 process use the hugepage.


 Before run VM the meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the first VM.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage and send the fd to vapp with unix domain 
 socket.After this step the meminfo is:
 
 Do you modify qemu?
 
 HugePages_Total:4096
 HugePages_Free: 2048
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the second VM.After this step the meminfo is:
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Then I want to close the first VM and run another VM.After close the first 
 VM and close the fd in vapp the meminfo is :
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Does the qemu still run after you close the first VM? If the qemu exits, 
 the fd will be closed by the kernel, so this
 bug is very strange.

 qemu is not run when close the first VM.If other process used the file will 
 be closed by kernel too?
 
 If qeum doesn't run after the first vm is closed, the fd should be closed 
 even if another process uses the file.
 


 So failed to run the third VM because the first VM have

Re: [Qemu-devel] [PATCH] fix the memory leak for share hugepage

2014-10-20 Thread Linhaifeng
Hi,all

Maybe this is a unix domain socket issue. I found that when qemu sends the fd to vapp, the fd's f_count is incremented twice in the kernel.

1. Kernel calls when we call send:
unix_stream_sendmsg -> unix_scm_to_skb -> unix_attach_fds -> scm_fp_dup -> get_file -> atomic_long_inc(&f->f_count)

Maybe the f_count should not be incremented on send.


2. Kernel calls when we call recv:
unix_stream_recvmsg -> scm_fp_dup -> get_file -> atomic_long_inc(&f->f_count)
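
For reference, a minimal sketch (not QEMU code) of how the fd is carried over the unix domain socket with SCM_RIGHTS; the receiver ends up with its own descriptor referring to the same struct file, which is why the hugepage can only be released once every holder, sender included, has closed its copy:

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

static ssize_t send_fd(int sock, int fd)
{
    char payload = 0;                         /* at least one data byte */
    struct iovec iov = { .iov_base = &payload, .iov_len = sizeof(payload) };
    union {
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } control;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&control, 0, sizeof(control));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;             /* the kernel dups the file reference */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0);            /* < 0 on error */
}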



On 2014/10/20 14:26, Wen Congyang wrote:
 On 10/20/2014 02:17 PM, Linhaifeng wrote:


 On 2014/10/20 13:32, Wen Congyang wrote:
 On 10/20/2014 12:48 PM, Linhaifeng wrote:


 On 2014/10/20 10:12, Wen Congyang wrote:
 On 10/18/2014 11:20 AM, Linhaifeng wrote:


 On 2014/10/17 21:26, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:57:27PM +0800, Linhaifeng wrote:


 On 2014/10/17 16:33, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:27:17PM +0800, haifeng@huawei.com 
 wrote:
 From: linhaifeng haifeng@huawei.com

 The VM start with share hugepage should close the hugefile fd
 when exit.Because the hugepage fd may be send to other process
 e.g vhost-user If qemu not close the fd the other process can
 not free the hugepage otherwise exit process,this is ugly,so
 qemu should close all shared fd when exit.

 Signed-off-by: linhaifeng haifeng@huawei.com

 Err, all file descriptors are closed automatically when a process
 exits. So manually calling close(fd) before exit can't have any
 functional effect on a resource leak.

 If QEMU has sent the FD to another process, that process has a
 completely separate copy of the FD. Closing the FD in QEMU will
 not close the FD in the other process. You need the other process
 to exit for the copy to be closed.

 Regards,
 Daniel

 Hi,daniel

 QEMU send the fd by unix domain socket.unix domain socket just install 
 the fd to
 other process and inc the f_count,if qemu not close the fd the f_count 
 is not dec.
 Then the other process even close the fd the hugepage would not freed 
 whise the other process exit.

 The kernel always closes all FDs when a process exits. So if this FD is
 not being correctly closed then it is a kernel bug. There should never
 be any reason for an application to do close(fd) before exiting.

 Regards,
 Daniel

 Hi,daniel

 I don't think this is kernel's bug.May be this a problem about usage.
 If you open a file you should close it too.

 If you don't close it, the kernel will help you when the program exits.

 Yes,when the hugepage is only used for qemu,the kernel will free the file 
 object.If the hugepage shared for other process,when qemu exit the kernel 
 will not free the file.

 Even if the hugepage is shared with the other process, the kernel will auto 
 close the fd when qemu
 exits. If the kernel doesn't do it, it is a kernel bug.

 Kernel supply close to fix this bug.If you call open you must call close.
 If not, the result is unpredictability.
 
 No, if the program exists, the kernel must close all fd used by the program.
 So, there is no need to close fd before program exists.
 
 Thanks
 Wen Congyang
 

 This is linux man pageabout how to free resource of file.
 http://linux.die.net/man/2/close


 I'm trying to describe my problem.

 For example, there are 2 VMs run with hugepage and the hugepage only for 
 QEMU to use.

 Before run VM the meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the two VMs.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage.After this step the meminfo is :
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB
 5.shutdown VM with signal 15 without close(fd).After this step the 
 meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Yes,it works well,like you said the kernel recycle all resources.

 For another example,there are 2 VMs run with hugepage and share the 
 hugepage with vapp(a vhost-user application).

 The vapp is your internal application?

 Yes vapp is a application to share the QEMU's hugepage.So threr are two 
 process use the hugepage.


 Before run VM the meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the first VM.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage and send the fd to vapp with unix domain 
 socket.After this step the meminfo is:

 Do you modify qemu?

 HugePages_Total:4096
 HugePages_Free: 2048
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the second VM.After this step the meminfo is:
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Then I want

Re: [Qemu-devel] [PATCH] fix the memory leak for share hugepage

2014-10-19 Thread Linhaifeng


On 2014/10/20 10:12, Wen Congyang wrote:
 On 10/18/2014 11:20 AM, Linhaifeng wrote:


 On 2014/10/17 21:26, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:57:27PM +0800, Linhaifeng wrote:


 On 2014/10/17 16:33, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:27:17PM +0800, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 The VM start with share hugepage should close the hugefile fd
 when exit.Because the hugepage fd may be send to other process
 e.g vhost-user If qemu not close the fd the other process can
 not free the hugepage otherwise exit process,this is ugly,so
 qemu should close all shared fd when exit.

 Signed-off-by: linhaifeng haifeng@huawei.com

 Err, all file descriptors are closed automatically when a process
 exits. So manually calling close(fd) before exit can't have any
 functional effect on a resource leak.

 If QEMU has sent the FD to another process, that process has a
 completely separate copy of the FD. Closing the FD in QEMU will
 not close the FD in the other process. You need the other process
 to exit for the copy to be closed.

 Regards,
 Daniel

 Hi,daniel

 QEMU send the fd by unix domain socket.unix domain socket just install the 
 fd to
 other process and inc the f_count,if qemu not close the fd the f_count is 
 not dec.
 Then the other process even close the fd the hugepage would not freed 
 whise the other process exit.

 The kernel always closes all FDs when a process exits. So if this FD is
 not being correctly closed then it is a kernel bug. There should never
 be any reason for an application to do close(fd) before exiting.

 Regards,
 Daniel

 Hi,daniel

 I don't think this is kernel's bug.May be this a problem about usage.
 If you open a file you should close it too.
 
 If you don't close it, the kernel will help you when the program exits.
 
Yes, when the hugepage is only used by qemu, the kernel will free the file object. If the hugepage is shared with another process, the kernel will not free the file when qemu exits.

 This is linux man pageabout how to free resource of file.
 http://linux.die.net/man/2/close


 I'm trying to describe my problem.

 For example, there are 2 VMs run with hugepage and the hugepage only for 
 QEMU to use.

 Before run VM the meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the two VMs.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage.After this step the meminfo is :
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB
 5.shutdown VM with signal 15 without close(fd).After this step the meminfo 
 is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Yes,it works well,like you said the kernel recycle all resources.

 For another example,there are 2 VMs run with hugepage and share the hugepage 
 with vapp(a vhost-user application).
 
 The vapp is your internal application?
 
Yes, vapp is an application that shares QEMU's hugepages. So there are two processes using the hugepage.


 Before run VM the meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the first VM.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage and send the fd to vapp with unix domain 
 socket.After this step the meminfo is:
 HugePages_Total:4096
 HugePages_Free: 2048
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the second VM.After this step the meminfo is:
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Then I want to close the first VM and run another VM.After close the first 
 VM and close the fd in vapp the meminfo is :
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB
 
 Does the qemu still run after you close the first VM? If the qemu exits, the 
 fd will be closed by the kernel, so this
 bug is very strange.
 
qemu is no longer running after I close the first VM. If another process still uses the file, will it be closed by the kernel too?


 So failed to run the third VM because the first VM have not free the 
 hugepage.After apply this patch the meminfo is:
 HugePages_Total:4096
 HugePages_Free: 2048
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB
 So i can run the third VM success.

 
 
 .
 

-- 
Regards,
Haifeng




Re: [Qemu-devel] [PATCH] fix the memory leak for share hugepage

2014-10-17 Thread Linhaifeng


On 2014/10/17 16:33, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:27:17PM +0800, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 The VM start with share hugepage should close the hugefile fd
 when exit.Because the hugepage fd may be send to other process
 e.g vhost-user If qemu not close the fd the other process can
 not free the hugepage otherwise exit process,this is ugly,so
 qemu should close all shared fd when exit.

 Signed-off-by: linhaifeng haifeng@huawei.com
 
 Err, all file descriptors are closed automatically when a process
 exits. So manually calling close(fd) before exit can't have any
 functional effect on a resource leak.
 
 If QEMU has sent the FD to another process, that process has a
 completely separate copy of the FD. Closing the FD in QEMU will
 not close the FD in the other process. You need the other process
 to exit for the copy to be closed.
 
 Regards,
 Daniel
 
Hi,daniel

QEMU sends the fd over a unix domain socket. The unix domain socket just installs the fd into the other process and increments the f_count; if qemu does not close the fd, the f_count is never decremented.
Then even if the other process closes the fd, the hugepage is not freed unless the other process exits.

I have tested it many times.
-- 
Regards,
Haifeng




Re: [Qemu-devel] [PATCH] fix the memory leak for share hugepage

2014-10-17 Thread Linhaifeng


On 2014/10/17 16:56, Gonglei wrote:
 On 2014/10/17 16:33, Daniel P. Berrange wrote:
 
 On Fri, Oct 17, 2014 at 04:27:17PM +0800, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 The VM start with share hugepage should close the hugefile fd
 when exit.Because the hugepage fd may be send to other process
 e.g vhost-user If qemu not close the fd the other process can
 not free the hugepage otherwise exit process,this is ugly,so
 qemu should close all shared fd when exit.

 Signed-off-by: linhaifeng haifeng@huawei.com

 Err, all file descriptors are closed automatically when a process
 exits. So manually calling close(fd) before exit can't have any
 functional effect on a resource leak.

 If QEMU has sent the FD to another process, that process has a
 completely separate copy of the FD. Closing the FD in QEMU will
 not close the FD in the other process. You need the other process
 to exit for the copy to be closed.

 
 Actually, when vhost-user close the FD manually, the hugepage leak too
 unless the vhost-user process exit. So, maybe the FD is not a separate
 copy IMHO, but simply add the ref-count of FD. When QEMU exit,
 because the ref is not zero, the operate system will not free the FD
 automatically, and when vhost-user close the FD, because of the same
 reason, OS will not free FD resource.
 
 BTW, I don't think this patch is good. When Qemu exit exceptionally,
 sush as 'by kill -9', this problem of memory leak still exist.
 

So we should close qemu with 'kill -15' or shut it down with virsh.

 Best Regards,
 -Gonglei
 
 
 
 
 .
 



-- 
Regards,
Haifeng




Re: [Qemu-devel] [PATCH] fix the memory leak for share hugepage

2014-10-17 Thread Linhaifeng


On 2014/10/17 16:57, Linhaifeng wrote:
 
 
 On 2014/10/17 16:33, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:27:17PM +0800, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 The VM start with share hugepage should close the hugefile fd
 when exit.Because the hugepage fd may be send to other process
 e.g vhost-user If qemu not close the fd the other process can
 not free the hugepage otherwise exit process,this is ugly,so
 qemu should close all shared fd when exit.

 Signed-off-by: linhaifeng haifeng@huawei.com

 Err, all file descriptors are closed automatically when a process
 exits. So manually calling close(fd) before exit can't have any
 functional effect on a resource leak.

 If QEMU has sent the FD to another process, that process has a
 completely separate copy of the FD. Closing the FD in QEMU will
 not close the FD in the other process. You need the other process
 to exit for the copy to be closed.

 Regards,
 Daniel

 Hi,daniel
 
 QEMU send the fd by unix domain socket.unix domain socket just install the fd 
 to
 other process and inc the f_count,if qemu not close the fd the f_count is not 
 dec.
 Then the other process even close the fd the hugepage would not freed whise 
 the other process exit.
 
 I have test it for many times.
 

The point is that I want to free the hugepage when the port is closed in the vhost-user process, without exiting the process.
-- 
Regards,
Haifeng




Re: [Qemu-devel] [PATCH] fix the memory leak for share hugepage

2014-10-17 Thread Linhaifeng


On 2014/10/17 21:26, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:57:27PM +0800, Linhaifeng wrote:


 On 2014/10/17 16:33, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:27:17PM +0800, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 The VM start with share hugepage should close the hugefile fd
 when exit.Because the hugepage fd may be send to other process
 e.g vhost-user If qemu not close the fd the other process can
 not free the hugepage otherwise exit process,this is ugly,so
 qemu should close all shared fd when exit.

 Signed-off-by: linhaifeng haifeng@huawei.com

 Err, all file descriptors are closed automatically when a process
 exits. So manually calling close(fd) before exit can't have any
 functional effect on a resource leak.

 If QEMU has sent the FD to another process, that process has a
 completely separate copy of the FD. Closing the FD in QEMU will
 not close the FD in the other process. You need the other process
 to exit for the copy to be closed.

 Regards,
 Daniel

 Hi,daniel

 QEMU send the fd by unix domain socket.unix domain socket just install the 
 fd to
 other process and inc the f_count,if qemu not close the fd the f_count is 
 not dec.
 Then the other process even close the fd the hugepage would not freed whise 
 the other process exit.
 
 The kernel always closes all FDs when a process exits. So if this FD is
 not being correctly closed then it is a kernel bug. There should never
 be any reason for an application to do close(fd) before exiting.
 
 Regards,
 Daniel
 
Hi,daniel

I don't think this is a kernel bug. Maybe it is a problem of usage.
If you open a file you should close it too.

This is the linux man page about how to free the resources of a file:
http://linux.die.net/man/2/close


I'm trying to describe my problem.

For example, there are 2 VMs running with hugepages and the hugepages are only used by QEMU.

Before run VM the meminfo is :
HugePages_Total:4096
HugePages_Free: 4096
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB

Run the two VMs. QEMU deals with the hugepages in the following steps:
1.open
2.unlink
3.mmap
4.use memory of hugepage.After this step the meminfo is :
HugePages_Total:4096
HugePages_Free:0
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB
5.shutdown VM with signal 15 without close(fd).After this step the meminfo is :
HugePages_Total:4096
HugePages_Free: 4096
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB

Yes, it works well; like you said, the kernel recycles all resources.
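
As a reference for steps 1-3 above, a minimal sketch of the open/unlink/mmap pattern (the path, template name and error handling are illustrative, not QEMU's actual code); once the file is unlinked, the hugepages go back to the pool as soon as the last fd and mapping are gone:

#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* size is assumed to be a multiple of the huge page size. */
static void *map_hugepage_backing(size_t size, int *fd_out)
{
    char path[] = "/dev/hugepages/qemu_back_mem.XXXXXX";  /* illustrative template */
    int fd = mkstemp(path);                                /* 1. open */
    void *addr;

    if (fd < 0) {
        return NULL;
    }
    unlink(path);               /* 2. the name disappears immediately */
    if (ftruncate(fd, size) < 0) {
        close(fd);
        return NULL;
    }
    addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);  /* 3. mmap */
    if (addr == MAP_FAILED) {
        close(fd);
        return NULL;
    }
    *fd_out = fd;               /* kept open, e.g. to pass to a vhost-user backend */
    return addr;
}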

For another example, there are 2 VMs running with hugepages and sharing the hugepages with vapp (a vhost-user application).

Before run VM the meminfo is :
HugePages_Total:4096
HugePages_Free: 4096
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB

Run the first VM. QEMU deals with the hugepages in the following steps:
1.open
2.unlink
3.mmap
4.use the memory of the hugepages and send the fd to vapp over a unix domain socket. After this step the meminfo is:
HugePages_Total:4096
HugePages_Free: 2048
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB

Run the second VM.After this step the meminfo is:
HugePages_Total:4096
HugePages_Free:0
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB

Then I want to close the first VM and run another VM. After closing the first VM and closing the fd in vapp, the meminfo is:
HugePages_Total:4096
HugePages_Free:0
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB

So running the third VM fails because the first VM has not freed its hugepages. After applying this patch the meminfo is:
HugePages_Total:4096
HugePages_Free: 2048
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB
So I can run the third VM successfully.

-- 
Regards,
Haifeng




Re: [Qemu-devel] [PATCH] fix the memory leak for share hugepage

2014-10-17 Thread Linhaifeng


On 2014/10/17 16:43, zhanghailiang wrote:
 On 2014/10/17 16:27, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 The VM start with share hugepage should close the hugefile fd
 when exit.Because the hugepage fd may be send to other process
 e.g vhost-user If qemu not close the fd the other process can
 not free the hugepage otherwise exit process,this is ugly,so
 qemu should close all shared fd when exit.

 Signed-off-by: linhaifeng haifeng@huawei.com
 ---
   exec.c | 12 
   vl.c   |  7 +++
   2 files changed, 19 insertions(+)

 diff --git a/exec.c b/exec.c
 index 759055d..d120b73 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -1535,6 +1535,18 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t 
 length)
   }
   }
   }
 +
 +void qemu_close_all_ram_fd(void)
 +{
 +RAMBlock *block;
 +
 +qemu_mutex_lock_ramlist();
 +QTAILQ_FOREACH(block, &ram_list.blocks, next) {
 +close(block->fd);
 +}
 +qemu_mutex_unlock_ramlist();
 +}
 +
   #endif /* !_WIN32 */

   int qemu_get_ram_fd(ram_addr_t addr)
 diff --git a/vl.c b/vl.c
 index aee73e1..0b78f3f 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -1658,6 +1658,7 @@ static int qemu_shutdown_requested(void)
   return r;
   }

 +extern void qemu_close_all_ram_fd(void);
   static void qemu_kill_report(void)
   {
   if (!qtest_driver()  shutdown_signal != -1) {
 @@ -1671,6 +1672,12 @@ static void qemu_kill_report(void)
   fprintf(stderr,  from pid  FMT_pid \n, shutdown_pid);
   }
   shutdown_signal = -1;
 +
 +/* Close all ram fd when exit. If the ram is shared by othter 
 process
 
 s/othter/other/
 

OK. Thank you.

 + * e.g vhost-user, it can free the hugepage by close fd after qemu 
 exit,
 + * otherwise the process have to exit to free hugepage.
 + */
 +qemu_close_all_ram_fd();
   }
   }


 
 
 
 .
 

-- 
Regards,
Haifeng




Re: [Qemu-devel] [Bug?]When close VM the hugepage not freed

2014-10-15 Thread Linhaifeng


On 2014/10/14 20:02, Linhaifeng wrote:
 Hi,all
 
 I was trying to use hugepage with VM and found that the hugepage not freed 
 when close VM.
 
 
 1.Before start VM the /proc/meminfo is:
 AnonHugePages:124928 kB
 HugePages_Total:4096
 HugePages_Free: 3072
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB
 
 2.Start VM the /proc/meminfo is:
 AnonHugePages:139264 kB
 HugePages_Total:4096
 HugePages_Free: 2048
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB
 
 3.Close VM the /proc/meminfo is:
 AnonHugePages:124928 kB
 HugePages_Total:4096
 HugePages_Free: 2048
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB
 
 We can see there are 1024 hugepage leak!
 
 I try to found which function used to free hugepage but i'm not sure where 
 the qemu_ram_free is the function to free hugepage.
 I found that the qemu_ram_free function not call unlink and we know unlink is 
 used to free hugepage(see example of hugepage-mmap.c in kernel source).
 
 void qemu_ram_free(ram_addr_t addr)
 {
     RAMBlock *block;

     /* This assumes the iothread lock is taken here too.  */
     qemu_mutex_lock_ramlist();
     QTAILQ_FOREACH(block, &ram_list.blocks, next) {
         if (addr == block->offset) {
             QTAILQ_REMOVE(&ram_list.blocks, block, next);
             ram_list.mru_block = NULL;
             ram_list.version++;
             if (block->flags & RAM_PREALLOC) {
                 ;
             } else if (xen_enabled()) {
                 xen_invalidate_map_cache_entry(block->host);
 #ifndef _WIN32
             } else if (block->fd >= 0) {
                 munmap(block->host, block->length);
                 close(block->fd);
                 // should we add unlink here to free hugepage?
 #endif
             } else {
                 qemu_anon_ram_free(block->host, block->length);
             }
             g_free(block);
             break;
         }
     }
     qemu_mutex_unlock_ramlist();
 }
 
 
 
 

When I run QEMU with the tap backend the hugepages are freed, but not with the vhost-user backend.
Maybe the vhost-user process should close the hugepage file.
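
If so, a minimal sketch of what such a cleanup could look like on the backend side (the struct and field names are assumptions, not a real vhost-user backend API): drop the mapping and close the received fd, otherwise the already-unlinked hugepage file stays pinned and the pages are never returned to the pool.

#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

struct mapped_region {
    void   *mmap_addr;
    size_t  mmap_size;          /* memory_size + mmap_offset */
    int     fd;
};

static void unmap_guest_memory(struct mapped_region *regions, int nregions)
{
    int i;

    for (i = 0; i < nregions; i++) {
        if (regions[i].mmap_addr != NULL) {
            munmap(regions[i].mmap_addr, regions[i].mmap_size);
            regions[i].mmap_addr = NULL;
        }
        if (regions[i].fd >= 0) {
            close(regions[i].fd);   /* drops the backend's reference to the file */
            regions[i].fd = -1;
        }
    }
}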

-- 
Regards,
Haifeng




[Qemu-devel] [Bug?]When close VM the hugepage not freed

2014-10-14 Thread Linhaifeng
Hi,all

I was trying to use hugepages with a VM and found that the hugepages are not freed when the VM is closed.


1.Before start VM the /proc/meminfo is:
AnonHugePages:124928 kB
HugePages_Total:4096
HugePages_Free: 3072
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB

2.Start VM the /proc/meminfo is:
AnonHugePages:139264 kB
HugePages_Total:4096
HugePages_Free: 2048
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB

3.Close VM the /proc/meminfo is:
AnonHugePages:124928 kB
HugePages_Total:4096
HugePages_Free: 2048
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB

We can see there are 1024 hugepage leak!

I tried to find which function is used to free the hugepages, but I'm not sure whether qemu_ram_free is that function.
I found that qemu_ram_free does not call unlink, and we know unlink is needed to free the hugepage file (see the hugepage-mmap.c example in the kernel source).

void qemu_ram_free(ram_addr_t addr)
{
    RAMBlock *block;

    /* This assumes the iothread lock is taken here too.  */
    qemu_mutex_lock_ramlist();
    QTAILQ_FOREACH(block, &ram_list.blocks, next) {
        if (addr == block->offset) {
            QTAILQ_REMOVE(&ram_list.blocks, block, next);
            ram_list.mru_block = NULL;
            ram_list.version++;
            if (block->flags & RAM_PREALLOC) {
                ;
            } else if (xen_enabled()) {
                xen_invalidate_map_cache_entry(block->host);
#ifndef _WIN32
            } else if (block->fd >= 0) {
                munmap(block->host, block->length);
                close(block->fd);
                // should we add unlink here to free hugepage?
#endif
            } else {
                qemu_anon_ram_free(block->host, block->length);
            }
            g_free(block);
            break;
        }
    }
    qemu_mutex_unlock_ramlist();
}




Re: [Qemu-devel] [Bug?]When close VM the hugepage not freed

2014-10-14 Thread Linhaifeng


On 2014/10/14 20:08, Daniel P. Berrange wrote:
 On Tue, Oct 14, 2014 at 08:02:38PM +0800, Linhaifeng wrote:
 Hi,all

 I was trying to use hugepage with VM and found that the hugepage not freed 
 when close VM.


 1.Before start VM the /proc/meminfo is:
 AnonHugePages:124928 kB
 HugePages_Total:4096
 HugePages_Free: 3072
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 2.Start VM the /proc/meminfo is:
 AnonHugePages:139264 kB
 HugePages_Total:4096
 HugePages_Free: 2048
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 3.Close VM the /proc/meminfo is:
 AnonHugePages:124928 kB
 HugePages_Total:4096
 HugePages_Free: 2048
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 We can see there are 1024 hugepage leak!

 I try to found which function used to free hugepage but i'm not sure
 where the qemu_ram_free is the function to free hugepage.
 I found that the qemu_ram_free function not call unlink and we know
 unlink is used to free hugepage(see example of hugepage-mmap.c in
 kernel source).
 
 We can't rely on 'qemu_ram_free' ever executing because we must
 ensure hugepages are freed upon QEMU crash.
 
 It seems we should rely on UNIX filesytstem semantics and simply
 unlink the memory segment the moment we create it  open the FD.
 That way the kernel will automatically free it when the FD is
 closed when QEMU process exits.
 
 
 Regards,
 Daniel
 

Hi, daniel

Thank you for your answer.

Does it mean libvirt should free the hugepages?

QEMU creates the hugepage backing file from a template file and unlinks it before mmap.
Do you know why the hugepage file is unlinked before mmap?

When the hugepage file is unlinked before mmap, libvirt cannot find it.
How is libvirt supposed to free the hugepages?

Regards,
Haifeng




[Qemu-devel] vhost-user:add VHOST_USER_CLEAR_MEM_TABLE

2014-10-13 Thread Linhaifeng
I want to add this message for when the vhost-user backend's memory layout changes. Any suggestions?

 * VHOST_USER_CLEAR_MEM_TABLE
   Id: 15
   Equivalent ioctl: VHOST_USER_CLEAR_MEM_TABLE
   Master payload: u64
   Clear the memory regions on the slave when the forwarded memory is freed, e.g. when the vhost-user device is unplugged or the virtio-net driver is reloaded.
   Bits (0-7) of the payload contain nothing.
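
Purely as an illustration of the proposal (the message id, struct and handler names are my assumptions and this is not part of the upstream vhost-user protocol), a backend-side sketch of what handling such a message could look like:

enum { VHOST_USER_CLEAR_MEM_TABLE = 15 };   /* id as proposed above */

struct vhost_backend {
    int nregions;
    /* ... mapped regions, fds, rings ... */
};

/* Stub standing in for the real cleanup: munmap every region and close
 * every fd so the hugepages behind them can be reclaimed. */
static void backend_unmap_all_regions(struct vhost_backend *b)
{
    b->nregions = 0;
}

static int handle_request(struct vhost_backend *b, unsigned request)
{
    switch (request) {
    case VHOST_USER_CLEAR_MEM_TABLE:
        /* Forwarded memory was freed (device unplug, driver reload):
         * drop every mapping and fd. */
        backend_unmap_all_regions(b);
        return 0;
    default:
        return -1;                          /* other requests elided */
    }
}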




Re: [Qemu-devel] vhost-user: VHOST_SET_MEM_TABLE, VHOST_SET_VRING_CALL need a reply?

2014-09-18 Thread Linhaifeng


On 2014/9/18 13:16, Michael S. Tsirkin wrote:
 On Thu, Sep 18, 2014 at 08:45:37AM +0800, Linhaifeng wrote:


 On 2014/9/17 17:56, Michael S. Tsirkin wrote:
 On Wed, Sep 17, 2014 at 05:39:04PM +0800, Linhaifeng wrote:
 I think maybe is not need for the backend to wait for response.

 There is another way.vhost-user send VHOST_GET_MEM_TABLE to qemu then 
 qemu send VHOST_SET_MEM_TABLE to update the regions of vhost-user.same as 
 other command.
 If qemu could response the request of the vhost-user.the vhost-user could 
 update date at anytime.

 The updates are initiated by QEMU, as a result of IOMMU,
 memory hotplug or some other configuration change.

 How to deal with the vhost-user restart?
 when vhost-user restart it will lost the infomation which QEMU send.

 In the kernel mode vhost will restart with QEMU but in the user mode vhost 
 will not.
 
 vhost-user must restart with qemu only.
 


Sometimes qemu is not allowed to restart, e.g. the customer wants to update vhost-user to a newer version but does not want to restart the VM.


 The nature of virtio protocol is such that there's not enough in-memory
 state for host to gracefully recover from losing VQ state.
 
 We could add a new feature to allow recovery by reporting
 failure to guest, and disabling processing new requests.
 Guest could respond by recovering / discarding submitted
 requests, re-enabling the device (likely by executing a reset)
 and re-submitting requests.
 
 
 This would need a new feature bit though, and would have to
 be acknowledged by guest. As such this would have to be
 virtio 1.0 feature, virtio 0.x is frozen.
 

 I think it's very useful for Commercialization.

 On 2014/9/17 16:38, Michael S. Tsirkin wrote:
 Reply-To: 

 Thinking about the vhost-user protocol, VHOST_SET_MEM_TABLE
 is used to update the memory mappings.

 So shouldn't we want for response?
 Otherwise e.g. guest can start using the memory
 that vhost-user can't access.

 Similarly, with an IOMMU vhost-user might access memory it shouldn't.

 VHOST_SET_VRING_CALL is used for MSI-X masking.
 Again, after vector is masted by switching the call fd,
 backend shouldn't assert the old one.

 Thoughts?



 .

 
 .
 




Re: [Qemu-devel] vhost-user:is miss command VHOST_NET_SET_BACKEND?

2014-09-17 Thread Linhaifeng
Sorry, there is no need to add VHOST_NET_SET_BACKEND. I found I can reset in VHOST_GET_VRING_BASE, because QEMU sends VHOST_GET_VRING_BASE before the backend cleanup.
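
A minimal sketch of that idea (the struct and field names are illustrative, not a real backend API): treat VHOST_USER_GET_VRING_BASE as the point where the ring is stopped, since QEMU sends it before the backend is cleaned up.

struct vring_state {
    int      enabled;
    unsigned last_avail_idx;
};

/* Called when VHOST_USER_GET_VRING_BASE arrives for this ring. */
static unsigned handle_get_vring_base(struct vring_state *vr)
{
    vr->enabled = 0;               /* stop processing this ring */
    return vr->last_avail_idx;     /* value reported back to QEMU */
}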

On 2014/9/17 13:50, Linhaifeng wrote:
 Hi,
 
 The VHOST_NET_SET_BACKEND message could tell the user when the backend is created or destroyed. It is useful for the user, but this command is missing from the protocol.
 
 
 
 




Re: [Qemu-devel] vhost-user: VHOST_SET_MEM_TABLE, VHOST_SET_VRING_CALL need a reply?

2014-09-17 Thread Linhaifeng
I think maybe the backend does not need to wait for a response.

There is another way: vhost-user sends VHOST_GET_MEM_TABLE to qemu, then qemu sends VHOST_SET_MEM_TABLE to update the regions of vhost-user, the same as for the other commands.
If qemu could respond to requests from vhost-user, vhost-user could update its data at any time.

I think this is very useful for commercialization.

On 2014/9/17 16:38, Michael S. Tsirkin wrote:
 Reply-To: 
 
 Thinking about the vhost-user protocol, VHOST_SET_MEM_TABLE
 is used to update the memory mappings.
 
 So shouldn't we want for response?
 Otherwise e.g. guest can start using the memory
 that vhost-user can't access.
 
 Similarly, with an IOMMU vhost-user might access memory it shouldn't.
 
 VHOST_SET_VRING_CALL is used for MSI-X masking.
 Again, after vector is masted by switching the call fd,
 backend shouldn't assert the old one.
 
 Thoughts?
 
 




Re: [Qemu-devel] vhost-user: VHOST_SET_MEM_TABLE, VHOST_SET_VRING_CALL need a reply?

2014-09-17 Thread Linhaifeng


On 2014/9/17 17:56, Michael S. Tsirkin wrote:
 On Wed, Sep 17, 2014 at 05:39:04PM +0800, Linhaifeng wrote:
 I think maybe there is no need for the backend to wait for a response.

 There is another way: vhost-user sends VHOST_GET_MEM_TABLE to QEMU, and QEMU 
 then sends VHOST_SET_MEM_TABLE to update vhost-user's regions, the same as 
 for other commands.
 If QEMU could answer requests from vhost-user, vhost-user could refresh its 
 data at any time.
 
 The updates are initiated by QEMU, as a result of IOMMU,
 memory hotplug or some other configuration change.

How do we deal with a vhost-user restart?
When vhost-user restarts it loses the information QEMU has sent.

In kernel mode vhost restarts together with QEMU, but in user mode vhost 
does not.

 
 I think it's very useful for commercialization.

 On 2014/9/17 16:38, Michael S. Tsirkin wrote:
 Reply-To: 

 Thinking about the vhost-user protocol, VHOST_SET_MEM_TABLE
 is used to update the memory mappings.

 So shouldn't we wait for a response?
 Otherwise e.g. guest can start using the memory
 that vhost-user can't access.

 Similarly, with an IOMMU vhost-user might access memory it shouldn't.

 VHOST_SET_VRING_CALL is used for MSI-X masking.
 Again, after the vector is masked by switching the call fd,
 backend shouldn't assert the old one.

 Thoughts?


 
 .
 




Re: [Qemu-devel] vhost-user: VHOST_SET_MEM_TABLE, VHOST_SET_VRING_CALL need a reply?

2014-09-17 Thread Linhaifeng


On 2014/9/17 17:56, Michael S. Tsirkin wrote:
 On Wed, Sep 17, 2014 at 05:39:04PM +0800, Linhaifeng wrote:
 I think maybe there is no need for the backend to wait for a response.

 There is another way: vhost-user sends VHOST_GET_MEM_TABLE to QEMU, and QEMU 
 then sends VHOST_SET_MEM_TABLE to update vhost-user's regions, the same as 
 for other commands.
 If QEMU could answer requests from vhost-user, vhost-user could refresh its 
 data at any time.
 
 The updates are initiated by QEMU, as a result of IOMMU,
 memory hotplug or some other configuration change.

Yes. When memory is unplugged or plugged, QEMU should tell vhost-user.

 
 I think it's very useful for commercialization.

 On 2014/9/17 16:38, Michael S. Tsirkin wrote:
 Reply-To: 

 Thinking about the vhost-user protocol, VHOST_SET_MEM_TABLE
 is used to update the memory mappings.

 So shouldn't we wait for a response?
 Otherwise e.g. guest can start using the memory
 that vhost-user can't access.

 Similarly, with an IOMMU vhost-user might access memory it shouldn't.

 VHOST_SET_VRING_CALL is used for MSI-X masking.
 Again, after the vector is masked by switching the call fd,
 backend shouldn't assert the old one.

 Thoughts?


 
 .
 




[Qemu-devel] vhost-user:how to notify the guest?

2014-09-16 Thread Linhaifeng
Hi,

I write the data to the rx ring and write to the fd to notify the guest, but 
there are no interrupts in the guest.

my notify code:

uint64_t kick_it = 1;
write(vring[0]->kickfd, &kick_it, sizeof(kick_it));

cat /proc/interrupts in the guest:
41:          0   PCI-MSI-EDGE   virtio0-input
42:          0   PCI-MSI-EDGE   virtio0-output
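
For comparison, this is how I understand guest notification is supposed to 
work: the interrupt is raised by writing an 8-byte value to the call fd that 
QEMU passes with VHOST_USER_SET_VRING_CALL, while the kick fd only carries 
notifications in the other direction (guest to backend). A minimal sketch, 
not the code I am actually running:

#include <stdint.h>
#include <unistd.h>

/* Raise an interrupt in the guest for one virtqueue.  callfd is the eventfd
 * received with VHOST_USER_SET_VRING_CALL. */
static void notify_guest(int callfd)
{
    uint64_t v = 1;   /* eventfds are written with an 8-byte counter */

    if (write(callfd, &v, sizeof(v)) != sizeof(v)) {
        /* a short or failed write on a non-blocking eventfd can be retried */
    }
}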




Re: [Qemu-devel] [PULL 12/12] vhost-user: fix VIRTIO_NET_F_MRG_RXBUF negotiation

2014-09-16 Thread Linhaifeng


On 2014/9/15 2:41, Michael S. Tsirkin wrote:
 From: Damjan Marion damar...@cisco.com
 
 Header length check should happen only if backend is kernel. For user
 backend there is no reason to reset this bit.
 
 vhost-user code does not define .has_vnet_hdr_len so
 VIRTIO_NET_F_MRG_RXBUF cannot be negotiated even if both sides
 support it.
 
 Signed-off-by: Damjan Marion damar...@cisco.com
 Reviewed-by: Michael S. Tsirkin m...@redhat.com
 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---
  hw/net/vhost_net.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)
 
 diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
 index b21e7a4..77bb93e 100644
 --- a/hw/net/vhost_net.c
 +++ b/hw/net/vhost_net.c
 @@ -163,11 +163,11 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
      if (r < 0) {
          goto fail;
      }
 -    if (!qemu_has_vnet_hdr_len(options->net_backend,
 -                               sizeof(struct virtio_net_hdr_mrg_rxbuf))) {
 -        net->dev.features &= ~(1 << VIRTIO_NET_F_MRG_RXBUF);
 -    }
      if (backend_kernel) {
 +        if (!qemu_has_vnet_hdr_len(options->net_backend,
 +                                   sizeof(struct virtio_net_hdr_mrg_rxbuf))) {
 +            net->dev.features &= ~(1 << VIRTIO_NET_F_MRG_RXBUF);
 +        }
          if (~net->dev.features & net->dev.backend_features) {
              fprintf(stderr, "vhost lacks feature mask %" PRIu64
                      " for backend\n",
 

Why does the vhost-user code not define .has_vnet_hdr_len? I think the 
vhost-user code should define it.

If a packet is bigger than the MTU, will this be a bug?
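
If vhost-user did define .has_vnet_hdr_len, a sketch might look like the 
following; this is only an illustration of the question (the signature 
mirrors the other NetClientInfo callbacks such as .has_vnet_hdr), not code 
that exists in QEMU today:

/* Hypothetical: report that a vhost-user backend can handle both virtio-net
 * header layouts, so VIRTIO_NET_F_MRG_RXBUF could be negotiated. */
static bool vhost_user_has_vnet_hdr_len(NetClientState *nc, int len)
{
    assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER);

    return len == sizeof(struct virtio_net_hdr) ||
           len == sizeof(struct virtio_net_hdr_mrg_rxbuf);
}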




[Qemu-devel] vhost-user:why region[0] always mmap failed ?

2014-09-16 Thread Linhaifeng
Hi,

There are two memory regions when the VHOST_SET_MEM_TABLE message is received:
region[0]
gpa = 0x0
size = 655360
ua = 0x2ac0
offset = 0
region[1]
gpa = 0xC
size = 2146697216
ua = 0x2acc
offset = 786432

region[0] always fails to mmap. The user code is:

for (idx = 0; idx < msg->msg.memory.nregions; idx++) {
    if (msg->fds[idx] > 0) {
        size_t size;
        uint64_t *guest_mem;
        Region *region = &vhost_server->memory.regions[i];

        region->guest_phys_addr =
            msg->msg.memory.regions[idx].guest_phys_addr;
        region->memory_size = msg->msg.memory.regions[idx].memory_size;
        region->userspace_addr =
            msg->msg.memory.regions[idx].userspace_addr;
        region->mmap_offset = msg->msg.memory.regions[idx].mmap_offset;

        assert(idx < msg->fd_num);
        assert(msg->fds[idx] > 0);

        size = region->memory_size + region->mmap_offset;
        guest_mem = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                         msg->fds[idx], 0);
        if (MAP_FAILED == guest_mem) {
            continue;
        }
        i++;
        guest_mem += (region->mmap_offset / sizeof(*guest_mem));
        region->mmap_addr = (uint64_t)guest_mem;
        vhost_server->memory.nregions++;
    }
}
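
For context, what I eventually need from these mappings is guest physical 
address translation for the vring. A sketch of how I plan to use the regions 
(placeholder Region type, assuming mmap_addr already has mmap_offset applied 
as in the loop above):

#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t guest_phys_addr;
    uint64_t memory_size;
    uint64_t userspace_addr;
    uint64_t mmap_offset;
    uint64_t mmap_addr;   /* backend VA corresponding to guest_phys_addr */
} Region;

/* Translate a guest physical address into a pointer the backend can use. */
static void *gpa_to_va(Region *regions, unsigned int nregions, uint64_t gpa)
{
    unsigned int i;

    for (i = 0; i < nregions; i++) {
        Region *r = &regions[i];

        if (gpa >= r->guest_phys_addr &&
            gpa < r->guest_phys_addr + r->memory_size) {
            return (void *)(uintptr_t)(r->mmap_addr +
                                       (gpa - r->guest_phys_addr));
        }
    }
    return NULL;   /* gpa is not covered by any region */
}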




[Qemu-devel] vhost-user:is miss command VHOST_NET_SET_BACKEND?

2014-09-16 Thread Linhaifeng
Hi,

VHOST_NET_SET_BACKEND could tell the user process when the backend is created 
or destroyed. It is useful for the user process, but this command is missing 
from the protocol.




Re: [Qemu-devel] [PULL v2 6/7] qtest: Adapt vhost-user-test to latest vhost-user changes

2014-09-13 Thread Linhaifeng
How do I test sending data to the VM?

On 2014/7/18 7:44, Michael S. Tsirkin wrote:
 From: Nikolay Nikolaev n.nikol...@virtualopensystems.com
 
 A new field mmap_offset was added in the vhost-user message, we need to 
 reflect
 this change in the test too.
 
 Signed-off-by: Nikolay Nikolaev n.nikol...@virtualopensystems.com
 Reviewed-by: Michael S. Tsirkin m...@redhat.com
 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---
  tests/vhost-user-test.c | 11 +--
  1 file changed, 9 insertions(+), 2 deletions(-)
 
 diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
 index 406ba70..75fedf0 100644
 --- a/tests/vhost-user-test.c
 +++ b/tests/vhost-user-test.c
 @@ -76,6 +76,7 @@ typedef struct VhostUserMemoryRegion {
  uint64_t guest_phys_addr;
  uint64_t memory_size;
  uint64_t userspace_addr;
 +uint64_t mmap_offset;
  } VhostUserMemoryRegion;
  
  typedef struct VhostUserMemory {
 @@ -205,6 +206,7 @@ static void read_guest_mem(void)
  uint32_t *guest_mem;
  gint64 end_time;
  int i, j;
 +size_t size;
  
  g_mutex_lock(data_mutex);
  
 @@ -231,8 +233,13 @@ static void read_guest_mem(void)
  
 g_assert_cmpint(memory.regions[i].memory_size, >, 1024);
  
 -guest_mem = mmap(0, memory.regions[i].memory_size,
 -PROT_READ | PROT_WRITE, MAP_SHARED, fds[i], 0);
 +size = memory.regions[i].memory_size + memory.regions[i].mmap_offset;
 +
 +guest_mem = mmap(0, size, PROT_READ | PROT_WRITE,
 + MAP_SHARED, fds[i], 0);
 +
 +g_assert(guest_mem != MAP_FAILED);
 +guest_mem += (memory.regions[i].mmap_offset / sizeof(*guest_mem));
  
 for (j = 0; j < 256; j++) {
  uint32_t a = readl(memory.regions[i].guest_phys_addr + j*4);
 




Re: [Qemu-devel] the userspace process vapp mmap filed // [PULL 13/37] vhost-user: fix regions provied with VHOST_USER_SET_MEM_TABLE message

2014-09-09 Thread Linhaifeng
Hi,

Thank you for your answer. I think the problem is not how to publish the 
patch; the problem is that there is no standard vhost-user module.

I just use vapp to test the new vhost-user backend. I found that the kernel 
has a module named vhost-net for the vhost backend, but there is no vhost-user 
module for the vhost-user backend. Who will supply a standard vhost-user 
library for the user process? If everybody implements it themselves, I think 
QEMU will be hard to maintain, so I think some questions must be answered:

1. Who supplies the standard vhost-user module that talks to QEMU's backend? 
The kernel maintains vhost-net, so there must be an organization to maintain 
the vhost-user module.
2. The vhost-user module should be generally usable. I think it could be a 
shared library with interfaces like open, close, send and recv, so that user 
processes are easy to write, rather than just a test program.
3. QEMU supports multiple net devices, so the vhost-user module should support 
multiple net devices too.

-Original Message-
From: Nikolay Nikolaev [mailto:n.nikol...@virtualopensystems.com] 
Sent: Wednesday, September 10, 2014 1:54 AM
To: Linhaifeng; Daniel Raho
Cc: qemu-devel; m...@redhat.com  Michael S. Tsirkin; Lilijun (Jerry); Paolo 
Bonzini; Damjan Marion; VirtualOpenSystems Technical Team
Subject: Re: Re: the userspace process vapp mmap filed //[Qemu-devel] [PULL 
13/37] vhost-user: fix regions provied with VHOST_USER_SET_MEM_TABLE message

Hello,

Vapp is a VOSYS application, currently not meant to be part of QEMU;
as such your proposed patch might not be meaningful if pushed towards
QEMU devel list. As the current Vapp implementation is not updated
since last March, custom support and any related potential design need
for a software switch implementation can be discussed at a custom
commercial level.

regards,
Nikolay Nikolaev
Virtual Open Systems


On Tue, Sep 9, 2014 at 3:28 PM, linhafieng haifeng@huawei.com wrote:



  Forwarded Message 
 Subject: Re: the userspace process vapp mmap filed //[Qemu-devel] [PULL 
 13/37] vhost-user: fix regions provied with VHOST_USER_SET_MEM_TABLE message
 Date: Tue, 09 Sep 2014 19:45:08 +0800
 From: linhafieng haifeng@huawei.com
 To: Michael S. Tsirkin m...@redhat.com
 CC: n.nikol...@virtualopensystems.com, jerry.lili...@huawei.com, 
 qemu-devel@nongnu.org, pbonz...@redhat.com, damar...@cisco.com, 
 t...@virtualopensystems.com

 On 2014/9/3 15:08, Michael S. Tsirkin wrote:
 On Wed, Sep 03, 2014 at 02:26:03PM +0800, linhafieng wrote:
 I ran the user process vapp to test the VHOST_USER_SET_MEM_TABLE message and 
 found that the mmap failed in user space.

 Why off-list?
 pls copy qemu mailing list and pbonz...@redhat.com




 I wrote a patch for vapp to test the broken-mem-regions patch. vapp can 
 receive data from the VM, but mmap fails with an error.

 I have some questions about the patch and vhost-user:
 1. Can I mmap all the fds of the mem regions? Why do some regions fail, and 
 does that have any impact?
 2. Why is vapp not updated to match the broken-mem-regions patch?
 3. Would a vhost-user test program that exercises the vring memory be more 
 meaningful?
 4. How does a switch port find the vhost-user device? By the socket path?
 5. Should the vhost-user process manage all the vhost-user backend socket 
 fds, or is there a better approach?


 My patch for vapp is:

 diff -uNr vapp/vhost_server.c vapp-for-broken-mem-region//vhost_server.c
 --- vapp/vhost_server.c 2014-08-30 09:39:20.0 +
 +++ vapp-for-broken-mem-region//vhost_server.c  2014-09-09 11:36:50.0 +
 @@ -147,18 +147,22 @@

      for (idx = 0; idx < msg->msg.memory.nregions; idx++) {
          if (msg->fds[idx] > 0) {
 +            size_t size;
 +            uint64_t *guest_mem;
              VhostServerMemoryRegion *region = &vhost_server->memory.regions[idx];

              region->guest_phys_addr = msg->msg.memory.regions[idx].guest_phys_addr;
              region->memory_size = msg->msg.memory.regions[idx].memory_size;
              region->userspace_addr = msg->msg.memory.regions[idx].userspace_addr;
 -
 +            region->mmap_offset = msg->msg.memory.regions[idx].mmap_offset;
 +
              assert(idx < msg->fd_num);
              assert(msg->fds[idx] > 0);

 -            region->mmap_addr =
 -                (uintptr_t) init_shm_from_fd(msg->fds[idx], region->memory_size);
 -
 +            size = region->memory_size + region->mmap_offset;
 +            guest_mem = init_shm_from_fd(msg->fds[idx], size);
 +            guest_mem += (region->mmap_offset / sizeof(*guest_mem));
 +            region->mmap_addr = (uint64_t)guest_mem;
              vhost_server->memory.nregions++;
          }
      }
 diff -uNr vapp/vhost_server.h vapp-for-broken-mem-region//vhost_server.h
 --- vapp/vhost_server.h 2014-08-30 09:39:20.0 +
 +++ vapp-for-broken-mem-region//vhost_server.h  2014-09-05 01:41:27.0 +
 @@ -13,7 +13,9 @@
      uint64_t guest_phys_addr;
      uint64_t memory_size;
      uint64_t

Re: [Qemu-devel] the userspace process vapp mmap filed // [PULL 13/37] vhost-user: fix regions provied with VHOST_USER_SET_MEM_TABLE message

2014-09-09 Thread Linhaifeng



 From: Michael S. Tsirkin [mailto:m...@redhat.com]
 Sent: Wednesday, September 10, 2014 4:41 AM
 To: Nikolay Nikolaev
 Cc: Linhaifeng; Daniel Raho; qemu-devel; Lilijun (Jerry); Paolo Bonzini; 
 Damjan Marion; VirtualOpenSystems Technical Team
 Subject: Re: Re: the userspace process vapp mmap filed //[Qemu-devel] [PULL 
 13/37] vhost-user: fix regions provied with VHOST_USER_SET_MEM_TABLE message
 
 On Tue, Sep 09, 2014 at 08:54:08PM +0300, Nikolay Nikolaev wrote:
 Hello,

 Vapp is a VOSYS application, currently not meant to be part of QEMU;
 as such your proposed patch might not be meaningful if pushed towards
 QEMU devel list.
 As the current Vapp implementation is not updated
 since last March, custom support and any related potential design need
 for a software switch implementation can be discussed at a custom
 commercial level.

 regards,
 Nikolay Nikolaev
 Virtual Open Systems
 
 
 I'm not familiar enough with the software itself to comment on the
 patch.
 But I don't think yours is a valid answer, and I was the one who asked
 that the question is sent to the list.
 I merged the vhost-user protocol so we can support multiple backends.
 Someone wants to work on another backend, more power to them,
 Way I read it the question is whether there's a bug in vhost in qemu,
 and how to use vhost-user, and this seems relevant on our list.
 
 
 

I agree with you.
I want to ask how to use vhost-user, and who will supply and maintain 
vhost-user.