date:20140902

[Qemu-devel] Fw:Re:What does "COW" mean?

2014-09-02 Thread shhuiw

Hi,

Sorry to trouble you.

I posted my question in the user mailing list, but can not get the right answer.
So I forward it to the devel mailing list. Hope to find answer here.

At 2014-09-02 04:33:50, "shhuiw"  wrote:
>
>
>Hi,
>
>I'm new to qemu community, and I'm trying the COW image format (old but 
>simple:-).
>I have read through its source code, and didn't find anything about 'copy on 
>write'.
>I wonder wthat "COW" stands for?

Sorry for my unclear expression.
I mean when copy-on-write happens if COW image format is used, and how the COW 
driver code handles cop-on-write?

>
>
>--
>
>Regards,
>shhuiw

--

Regards,
shhuiw

[Qemu-devel] Question about cow format with hexdump

2014-09-02 Thread shhuiw


Hi,

I'm reading the source code of cow.c: 
https://github.com/qemu/qemu/blob/master/block/cow.c
and try to understand the format better.

I created a cow format imagefile and can run 'qume-img info' to query the 
header info
---
-bash-4.1$ qemu-img create -f cow dummy 1M
Formatting 'test/dummy', fmt=cow size=1048576 
-bash-4.1$ qemu-img info dummy 
image: dummy
file format: cow
virtual size: 1.0M (1048576 bytes)
disk size: 4.0K


But when I used hexdump to dis the header part, I cannot find all info recorded:
(compared the define of struct cow_header_v2 and cow_create())
--
1) recognize the magic and version info:
-bash-4.1$ hexdump -C dummy -n 8
  4f 4f 4f 4d 00 00 00 02   |OOOM|
0008

2) backing_file and mtime fields are 0s:
# I think the "dummy" should be recorded
-bash-4.1$ hexdump -C dummy -n 1032
  4f 4f 4f 4d 00 00 00 02  00 00 00 00 00 00 00 00  |OOOM|
0010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
*
0400  00 00 00 00 00 00 00 00   ||
0408
-bash-4.1$ hexdump -C dummy -n 1036
  4f 4f 4f 4d 00 00 00 02  00 00 00 00 00 00 00 00  |OOOM|
0010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
*
0400  00 00 00 00 00 00 00 00  00 00 00 00  ||
040c

3) size field is 0s:
-bash-4.1$ hexdump -C dummy -n 1044 
   # size should be 1M
  4f 4f 4f 4d 00 00 00 02  00 00 00 00 00 00 00 00  |OOOM|
0010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
*
0410  00 00 00 00   ||
0414

4) sectorsize is 512:
-bash-4.1$ hexdump -C dummy -n 1048
  4f 4f 4f 4d 00 00 00 02  00 00 00 00 00 00 00 00  |OOOM|
0010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
*
0410  00 00 00 00 00 10 00 00   ||
0418


Can anyone help to explain this? Or how to debug further?

--

Regards,
shhuiw

Re: [Qemu-devel] [PATCH v6 02/27] bootindex: add del_boot_device_path function

2014-09-02 Thread Gonglei (Arei)






Best regards,
-Gonglei


> -Original Message-
> From: Gerd Hoffmann [mailto:kra...@redhat.com]
> Sent: Wednesday, September 03, 2014 2:25 PM
> To: Gonglei (Arei)
> Cc: Eduardo Habkost; qemu-devel@nongnu.org; aligu...@amazon.com;
> m...@redhat.com; pbonz...@redhat.com; ak...@redhat.com;
> hu...@cn.fujitsu.com; ebl...@redhat.com; afaer...@suse.de;
> arm...@redhat.com; imamm...@redhat.com; a...@ozlabs.ru;
> peter.crosthwa...@xilinx.com; lcapitul...@redhat.com; h...@linux.com;
> stefa...@redhat.com; ag...@suse.de; chenliang (T); Huangweidong (C);
> Luonengjun; Huangpeng (Peter); kw...@redhat.com
> Subject: Re: [PATCH v6 02/27] bootindex: add del_boot_device_path function
> Importance: High
> 
> 
>   Hi,
> 
> > 4. When we hotplug the virtio-net-pci device, only pass virtio-net-pci's 
> > pointer
> to
> > del_boot_device_path(). But virtio-net-pci != virtio-net-device, so I add a
> function
> > named is_same_fw_dev_path() to handle this situation.
> 
> When hot-unplugging virtio-net-pci I'd expect we free both
> virtio-net-pci and virtio-net-device (and therefore call
> del_boot_device_path twice, once for each device).  Can you check that?
> 
Yes, I can. 

The del_boot_device_path() is called only once, just for virtio-net-pci.
For its child, virtio-net-devcie's resource is cleaned by qbus->child 
unrealizing 
process, will not call device_finalize().

Command line:

# ./qemu-system-x86_64 -enable-kvm -m 4096 -smp 4 -name redhat6.2 -drive 
file=/mnt/sdb/gonglei/image/win7_32_2U,if=none,id=drive-ide0-0-0 -device 
ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive 
file=/mnt/sdb/gonglei/iso/rhel-server-7.0-x86_64-dvd.iso,if=none,id=drive-ide0-0-1
 -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=4 
-vnc 0.0.0.0:10 -netdev type=user,id=net0 -device 
virtio-net-pci,netdev=net0,bootindex=3,id=nic1
QEMU 2.1.50 monitor - type 'help' for more information
(qemu) device_del nic1
(qemu)

The below is the backtrace by gdb:

Breakpoint 1, virtio_net_device_unrealize (dev=0x7f1b1ea87d58, 
errp=0x7f1b17d8c350)
at /mnt/sdb/gonglei/qemu.git/qemu/hw/net/virtio-net.c:1649
1649{
(gdb) bt
#0  virtio_net_device_unrealize (dev=0x7f1b1ea87d58, errp=0x7f1b17d8c350)
at /mnt/sdb/gonglei/qemu.git/qemu/hw/net/virtio-net.c:1649
#1  0x7f1b1dd69b71 in virtio_device_unrealize (dev=0x7f1b1ea87d58, 
errp=0x7f1b17d8c3c8)
at /mnt/sdb/gonglei/qemu.git/qemu/hw/virtio/virtio.c:1312
#2  0x7f1b1de8a1a4 in device_set_realized (obj=0x7f1b1ea87d58, value=false, 
errp=0x7f1b17d8c568) at hw/core/qdev.c:885
#3  0x7f1b1dfa01f7 in property_set_bool (obj=0x7f1b1ea87d58, 
v=0x7f1b1eb29de0, opaque=0x7f1b1eaa72c0, name=
0x7f1b1e0c2b69 "realized", errp=0x7f1b17d8c568) at qom/object.c:1467
#4  0x7f1b1df9e82f in object_property_set (obj=0x7f1b1ea87d58, 
v=0x7f1b1eb29de0, name=0x7f1b1e0c2b69 "realized", errp=
0x7f1b17d8c568) at qom/object.c:814
#5  0x7f1b1dfa0c45 in object_property_set_qobject (obj=0x7f1b1ea87d58, 
value=0x7f1b1eb09c80, name=0x7f1b1e0c2b69 "realized", 
errp=0x7f1b17d8c568) at qom/qom-qobject.c:24
#6  0x7f1b1df9ebd7 in object_property_set_bool (obj=0x7f1b1ea87d58, 
value=false, name=0x7f1b1e0c2b69 "realized", errp=
0x7f1b17d8c568) at qom/object.c:878
#7  0x7f1b1de893fa in bus_set_realized (obj=0x7f1b1ea87ce0, value=false, 
errp=0x7f1b17d8c718) at hw/core/qdev.c:583
#8  0x7f1b1dfa01f7 in property_set_bool (obj=0x7f1b1ea87ce0, 
v=0x7f1b1ead5a10, opaque=0x7f1b1eaa1640, name=
0x7f1b1e0c2b69 "realized", errp=0x7f1b17d8c718) at qom/object.c:1467
#9  0x7f1b1df9e82f in object_property_set (obj=0x7f1b1ea87ce0, 
v=0x7f1b1ead5a10, name=0x7f1b1e0c2b69 "realized", errp=
0x7f1b17d8c718) at qom/object.c:814
#10 0x7f1b1dfa0c45 in object_property_set_qobject (obj=0x7f1b1ea87ce0, 
value=0x7f1b1eb09c60, name=0x7f1b1e0c2b69 "realized", 
errp=0x7f1b17d8c718) at qom/qom-qobject.c:24
#11 0x7f1b1df9ebd7 in object_property_set_bool (obj=0x7f1b1ea87ce0, 
value=false, name=0x7f1b1e0c2b69 "realized", errp=
0x7f1b17d8c718) at qom/object.c:878
#12 0x7f1b1de8a127 in device_set_realized (obj=0x7f1b1ea87380, value=false, 
errp=0x0) at hw/core/qdev.c:875
#13 0x7f1b1dfa01f7 in property_set_bool (obj=0x7f1b1ea87380, 
v=0x7f1b1eacf940, opaque=0x7f1b1ea89f20, name=
0x7f1b1e0c2b69 "realized", errp=0x0) at qom/object.c:1467
#14 0x7f1b1df9e82f in object_property_set (obj=0x7f1b1ea87380, 
v=0x7f1b1eacf940, name=0x7f1b1e0c2b69 "realized", errp=0x0)
at qom/object.c:814
#15 0x7f1b1dfa0c45 in object_property_set_qobject (obj=0x7f1b1ea87380, 
value=0x7f1b1eb085e0, name=0x7f1b1e0c2b69 "realized", 
errp=0x0) at qom/qom-qobject.c:24
#16 0x7f1b1df9ebd7 in object_property_set_bool (obj=0x7f1b1ea87380, 
value=false, name=0x7f1b1e0c2b69 "realized", errp=0x0)
at qom/object.c:878
#17 0x7f1b1de8a7d0 in device_unparent (obj=0x7f1b1ea87380) at 
hw/core/qdev.c:1010
#18 0x7f1b1df9f26b in object_finalize_child_proper

[Qemu-devel] [PULL 06/10] MAINTAINERS: Add VMWare devices maintainer

2014-09-02 Thread Michael Tokarev

From: Dmitry Fleytman 

Signed-off-by: Dmitry Fleytman 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Michael Tokarev 
---
 MAINTAINERS |6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 142f68a..8622e01 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -684,6 +684,12 @@ S: Maintained
 F: hw/*/xilinx_*
 F: include/hw/xilinx.h
 
+Vmware
+M: Dmitry Fleytman 
+S: Maintained
+F: hw/net/vmxnet*
+F: hw/scsi/vmw_pvscsi*
+
 Subsystems
 --
 Audio
-- 
1.7.10.4

[Qemu-devel] [PULL 08/10] qom/object.c, hmp.c: fix string_output_get_string() memory leak

2014-09-02 Thread Michael Tokarev

From: Chen Fan 

string_output_get_string() uses g_string_free(str, false) to
transfer the 'str' pointer to callers and never free it.

Signed-off-by: Chen Fan 
Reviewed-by: Peter Crosthwaite 
Reviewed-by: Hu Tao 
Signed-off-by: Michael Tokarev 
---
 hmp.c|6 --
 qom/object.c |   12 ++--
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/hmp.c b/hmp.c
index 4d1838e..ba40c75 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1687,6 +1687,7 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
 MemdevList *memdev_list = qmp_query_memdev(&err);
 MemdevList *m = memdev_list;
 StringOutputVisitor *ov;
+char *str;
 int i = 0;
 
 
@@ -1704,9 +1705,10 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
m->value->prealloc ? "true" : "false");
 monitor_printf(mon, "  policy: %s\n",
HostMemPolicy_lookup[m->value->policy]);
-monitor_printf(mon, "  host nodes: %s\n",
-   string_output_get_string(ov));
+str = string_output_get_string(ov);
+monitor_printf(mon, "  host nodes: %s\n", str);
 
+g_free(str);
 string_output_visitor_cleanup(ov);
 m = m->next;
 i++;
diff --git a/qom/object.c b/qom/object.c
index 1b00831..79bd0cc 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -938,14 +938,18 @@ int object_property_get_enum(Object *obj, const char 
*name,
 {
 StringOutputVisitor *sov;
 StringInputVisitor *siv;
+char *str;
 int ret;
 
 sov = string_output_visitor_new(false);
 object_property_get(obj, string_output_get_visitor(sov), name, errp);
-siv = string_input_visitor_new(string_output_get_string(sov));
+str = string_output_get_string(sov);
+siv = string_input_visitor_new(str);
 string_output_visitor_cleanup(sov);
 visit_type_enum(string_input_get_visitor(siv),
 &ret, strings, NULL, name, errp);
+
+g_free(str);
 string_input_visitor_cleanup(siv);
 
 return ret;
@@ -956,13 +960,17 @@ void object_property_get_uint16List(Object *obj, const 
char *name,
 {
 StringOutputVisitor *ov;
 StringInputVisitor *iv;
+char *str;
 
 ov = string_output_visitor_new(false);
 object_property_get(obj, string_output_get_visitor(ov),
 name, errp);
-iv = string_input_visitor_new(string_output_get_string(ov));
+str = string_output_get_string(ov);
+iv = string_input_visitor_new(str);
 visit_type_uint16List(string_input_get_visitor(iv),
   list, NULL, errp);
+
+g_free(str);
 string_output_visitor_cleanup(ov);
 string_input_visitor_cleanup(iv);
 }
-- 
1.7.10.4

[Qemu-devel] [PULL 01/10] curl: The macro that you have to uncomment to get debugging is DEBUG_CURL.

2014-09-02 Thread Michael Tokarev

From: "Richard W.M. Jones" 

Signed-off-by: Richard W.M. Jones 
Signed-off-by: Michael Tokarev 
---
 block/curl.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/curl.c b/block/curl.c
index 0258339..938f9d9 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -26,7 +26,7 @@
 #include "qapi/qmp/qbool.h"
 #include 
 
-// #define DEBUG
+// #define DEBUG_CURL
 // #define DEBUG_VERBOSE
 
 #ifdef DEBUG_CURL
-- 
1.7.10.4

[Qemu-devel] [PULL 04/10] device_tree.c: redirect load_device_tree err message to stderr

2014-09-02 Thread Michael Tokarev

From: Li Liu 

Reviewed-by: Peter Crosthwaite 
Signed-off-by: Li Liu 
Signed-off-by: Michael Tokarev 
---
 device_tree.c |   15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index ca83504..9d47195 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -20,6 +20,7 @@
 
 #include "config.h"
 #include "qemu-common.h"
+#include "qemu/error-report.h"
 #include "sysemu/device_tree.h"
 #include "sysemu/sysemu.h"
 #include "hw/loader.h"
@@ -79,8 +80,8 @@ void *load_device_tree(const char *filename_path, int *sizep)
 *sizep = 0;
 dt_size = get_image_size(filename_path);
 if (dt_size < 0) {
-printf("Unable to get size of device tree file '%s'\n",
-filename_path);
+error_report("Unable to get size of device tree file '%s'",
+ filename_path);
 goto fail;
 }
 
@@ -92,21 +93,21 @@ void *load_device_tree(const char *filename_path, int 
*sizep)
 
 dt_file_load_size = load_image(filename_path, fdt);
 if (dt_file_load_size < 0) {
-printf("Unable to open device tree file '%s'\n",
-   filename_path);
+error_report("Unable to open device tree file '%s'",
+ filename_path);
 goto fail;
 }
 
 ret = fdt_open_into(fdt, fdt, dt_size);
 if (ret) {
-printf("Unable to copy device tree in memory\n");
+error_report("Unable to copy device tree in memory");
 goto fail;
 }
 
 /* Check sanity of device tree */
 if (fdt_check_header(fdt)) {
-printf ("Device tree file loaded into memory is invalid: %s\n",
-filename_path);
+error_report("Device tree file loaded into memory is invalid: %s",
+ filename_path);
 goto fail;
 }
 *sizep = dt_size;
-- 
1.7.10.4

[Qemu-devel] [PULL 09/10] hmp: fix MemdevList memory leak

2014-09-02 Thread Michael Tokarev

From: Chen Fan 

the memdev_list in hmp_info_memdev() is never freed.
so we use existent method qapi_free_MemdevList() to free it.
and also we can use qapi_free_MemdevList() to replace list loops
to clean up the memdev list in error path.

Signed-off-by: Chen Fan 
Reviewed-by: Peter Crosthwaite 
Reviewed-by: Hu Tao 
Signed-off-by: Michael Tokarev 
---
 hmp.c  |2 ++
 numa.c |9 ++---
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/hmp.c b/hmp.c
index ba40c75..40a90da 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1715,4 +1715,6 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
 }
 
 monitor_printf(mon, "\n");
+
+qapi_free_MemdevList(memdev_list);
 }
diff --git a/numa.c b/numa.c
index aa772aa..f07149b 100644
--- a/numa.c
+++ b/numa.c
@@ -379,7 +379,7 @@ error:
 MemdevList *qmp_query_memdev(Error **errp)
 {
 Object *obj;
-MemdevList *list = NULL, *m;
+MemdevList *list = NULL;
 
 obj = object_resolve_path("/objects", NULL);
 if (obj == NULL) {
@@ -393,11 +393,6 @@ MemdevList *qmp_query_memdev(Error **errp)
 return list;
 
 error:
-while (list) {
-m = list;
-list = list->next;
-g_free(m->value);
-g_free(m);
-}
+qapi_free_MemdevList(list);
 return NULL;
 }
-- 
1.7.10.4

[Qemu-devel] [PULL 03/10] scripts: Remove scripts/qtest

2014-09-02 Thread Michael Tokarev

From: Fam Zheng 

This is a dummy file with no user, drop it.

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Michael Tokarev 
---
 scripts/qtest |5 -
 1 file changed, 5 deletions(-)
 delete mode 100755 scripts/qtest

diff --git a/scripts/qtest b/scripts/qtest
deleted file mode 100755
index 4ef6c1c..000
--- a/scripts/qtest
+++ /dev/null
@@ -1,5 +0,0 @@
-#!/bin/sh
-
-export QTEST_QEMU_BINARY=$1
-shift
-"$@"
-- 
1.7.10.4

[Qemu-devel] [PULL 05/10] device_tree.c: dump all err mesages with error_report

2014-09-02 Thread Michael Tokarev

From: Li Liu 

Signed-off-by: Li Liu 
Signed-off-by: Michael Tokarev 
---
 device_tree.c |   40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index 9d47195..df9eed9 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -60,13 +60,13 @@ void *create_device_tree(int *sizep)
 }
 ret = fdt_open_into(fdt, fdt, *sizep);
 if (ret) {
-fprintf(stderr, "Unable to copy device tree in memory\n");
+error_report("Unable to copy device tree in memory");
 exit(1);
 }
 
 return fdt;
 fail:
-fprintf(stderr, "%s Couldn't create dt: %s\n", __func__, 
fdt_strerror(ret));
+error_report("%s Couldn't create dt: %s", __func__, fdt_strerror(ret));
 exit(1);
 }
 
@@ -124,8 +124,8 @@ static int findnode_nofail(void *fdt, const char *node_path)
 
 offset = fdt_path_offset(fdt, node_path);
 if (offset < 0) {
-fprintf(stderr, "%s Couldn't find node %s: %s\n", __func__, node_path,
-fdt_strerror(offset));
+error_report("%s Couldn't find node %s: %s", __func__, node_path,
+ fdt_strerror(offset));
 exit(1);
 }
 
@@ -139,8 +139,8 @@ int qemu_fdt_setprop(void *fdt, const char *node_path,
 
 r = fdt_setprop(fdt, findnode_nofail(fdt, node_path), property, val, size);
 if (r < 0) {
-fprintf(stderr, "%s: Couldn't set %s/%s: %s\n", __func__, node_path,
-property, fdt_strerror(r));
+error_report("%s: Couldn't set %s/%s: %s", __func__, node_path,
+ property, fdt_strerror(r));
 exit(1);
 }
 
@@ -154,8 +154,8 @@ int qemu_fdt_setprop_cell(void *fdt, const char *node_path,
 
 r = fdt_setprop_cell(fdt, findnode_nofail(fdt, node_path), property, val);
 if (r < 0) {
-fprintf(stderr, "%s: Couldn't set %s/%s = %#08x: %s\n", __func__,
-node_path, property, val, fdt_strerror(r));
+error_report("%s: Couldn't set %s/%s = %#08x: %s", __func__,
+ node_path, property, val, fdt_strerror(r));
 exit(1);
 }
 
@@ -176,8 +176,8 @@ int qemu_fdt_setprop_string(void *fdt, const char 
*node_path,
 
 r = fdt_setprop_string(fdt, findnode_nofail(fdt, node_path), property, 
string);
 if (r < 0) {
-fprintf(stderr, "%s: Couldn't set %s/%s = %s: %s\n", __func__,
-node_path, property, string, fdt_strerror(r));
+error_report("%s: Couldn't set %s/%s = %s: %s", __func__,
+ node_path, property, string, fdt_strerror(r));
 exit(1);
 }
 
@@ -194,8 +194,8 @@ const void *qemu_fdt_getprop(void *fdt, const char 
*node_path,
 }
 r = fdt_getprop(fdt, findnode_nofail(fdt, node_path), property, lenp);
 if (!r) {
-fprintf(stderr, "%s: Couldn't get %s/%s: %s\n", __func__,
-node_path, property, fdt_strerror(*lenp));
+error_report("%s: Couldn't get %s/%s: %s", __func__,
+ node_path, property, fdt_strerror(*lenp));
 exit(1);
 }
 return r;
@@ -207,8 +207,8 @@ uint32_t qemu_fdt_getprop_cell(void *fdt, const char 
*node_path,
 int len;
 const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, &len);
 if (len != 4) {
-fprintf(stderr, "%s: %s/%s not 4 bytes long (not a cell?)\n",
-__func__, node_path, property);
+error_report("%s: %s/%s not 4 bytes long (not a cell?)",
+ __func__, node_path, property);
 exit(1);
 }
 return be32_to_cpu(*p);
@@ -220,8 +220,8 @@ uint32_t qemu_fdt_get_phandle(void *fdt, const char *path)
 
 r = fdt_get_phandle(fdt, findnode_nofail(fdt, path));
 if (r == 0) {
-fprintf(stderr, "%s: Couldn't get phandle for %s: %s\n", __func__,
-path, fdt_strerror(r));
+error_report("%s: Couldn't get phandle for %s: %s", __func__,
+ path, fdt_strerror(r));
 exit(1);
 }
 
@@ -266,8 +266,8 @@ int qemu_fdt_nop_node(void *fdt, const char *node_path)
 
 r = fdt_nop_node(fdt, findnode_nofail(fdt, node_path));
 if (r < 0) {
-fprintf(stderr, "%s: Couldn't nop node %s: %s\n", __func__, node_path,
-fdt_strerror(r));
+error_report("%s: Couldn't nop node %s: %s", __func__, node_path,
+ fdt_strerror(r));
 exit(1);
 }
 
@@ -295,8 +295,8 @@ int qemu_fdt_add_subnode(void *fdt, const char *name)
 
 retval = fdt_add_subnode(fdt, parent, basename);
 if (retval < 0) {
-fprintf(stderr, "FDT: Failed to create subnode %s: %s\n", name,
-fdt_strerror(retval));
+error_report("FDT: Failed to create subnode %s: %s", name,
+ fdt_strerror(retval));
 exit(1);
 }
 
-- 
1.7.10.4

[Qemu-devel] [PULL 02/10] Fix debug print warning

2014-09-02 Thread Michael Tokarev

From: Gonglei 

Steps:

1.enable qemu debug print, using simply scprit as below:
 grep "//#define DEBUG" * -rl | xargs sed -i "s/\/\/#define DEBUG/#define 
DEBUG/g"
2. make -j
3. get some warning:
hw/i2c/pm_smbus.c: In function 'smb_ioport_writeb':
hw/i2c/pm_smbus.c:142: warning: format '%04x' expects type 'unsigned int', but 
argument 2 has type 'hwaddr'
hw/i2c/pm_smbus.c:142: warning: format '%02x' expects type 'unsigned int', but 
argument 3 has type 'uint64_t'
hw/i2c/pm_smbus.c: In function 'smb_ioport_readb':
hw/i2c/pm_smbus.c:209: warning: format '%04x' expects type 'unsigned int', but 
argument 2 has type 'hwaddr'
hw/intc/i8259.c: In function 'pic_ioport_read':
hw/intc/i8259.c:373: warning: format '%02x' expects type 'unsigned int', but 
argument 2 has type 'hwaddr'
hw/input/pckbd.c: In function 'kbd_write_command':
hw/input/pckbd.c:232: warning: format '%02x' expects type 'unsigned int', but 
argument 2 has type 'uint64_t'
hw/input/pckbd.c: In function 'kbd_write_data':
hw/input/pckbd.c:333: warning: format '%02x' expects type 'unsigned int', but 
argument 2 has type 'uint64_t'
hw/isa/apm.c: In function 'apm_ioport_writeb':
hw/isa/apm.c:44: warning: format '%x' expects type 'unsigned int', but argument 
2 has type 'hwaddr'
hw/isa/apm.c:44: warning: format '%02x' expects type 'unsigned int', but 
argument 3 has type 'uint64_t'
hw/isa/apm.c: In function 'apm_ioport_readb':
hw/isa/apm.c:67: warning: format '%x' expects type 'unsigned int', but argument 
2 has type 'hwaddr'
hw/timer/mc146818rtc.c: In function 'cmos_ioport_write':
hw/timer/mc146818rtc.c:394: warning: format '%02x' expects type 'unsigned int', 
but argument 3 has type 'uint64_t'
hw/i386/pc.c: In function 'port92_write':
hw/i386/pc.c:479: warning: format '%02x' expects type 'unsigned int', but 
argument 2 has type 'uint64_t'

Fix them.

Cc: qemu-triv...@nongnu.org
Signed-off-by: Gonglei 
Signed-off-by: Michael Tokarev 
---
 hw/i2c/pm_smbus.c  |5 +++--
 hw/i386/pc.c   |2 +-
 hw/input/pckbd.c   |4 ++--
 hw/intc/i8259.c|2 +-
 hw/isa/apm.c   |5 +++--
 hw/timer/mc146818rtc.c |2 +-
 6 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/hw/i2c/pm_smbus.c b/hw/i2c/pm_smbus.c
index fedb5fb..ce1713d 100644
--- a/hw/i2c/pm_smbus.c
+++ b/hw/i2c/pm_smbus.c
@@ -139,7 +139,8 @@ static void smb_ioport_writeb(void *opaque, hwaddr addr, 
uint64_t val,
 {
 PMSMBus *s = opaque;
 
-SMBUS_DPRINTF("SMB writeb port=0x%04x val=0x%02x\n", addr, val);
+SMBUS_DPRINTF("SMB writeb port=0x%04" HWADDR_PRIx
+  " val=0x%02" PRIx64 "\n", addr, val);
 switch(addr) {
 case SMBHSTSTS:
 s->smb_stat = (~(val & 0xff)) & s->smb_stat;
@@ -206,7 +207,7 @@ static uint64_t smb_ioport_readb(void *opaque, hwaddr addr, 
unsigned width)
 val = 0;
 break;
 }
-SMBUS_DPRINTF("SMB readb port=0x%04x val=0x%02x\n", addr, val);
+SMBUS_DPRINTF("SMB readb port=0x%04" HWADDR_PRIx " val=0x%02x\n", addr, 
val);
 return val;
 }
 
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0ca4deb..b6c9b61 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -481,7 +481,7 @@ static void port92_write(void *opaque, hwaddr addr, 
uint64_t val,
 Port92State *s = opaque;
 int oldval = s->outport;
 
-DPRINTF("port92: write 0x%02x\n", val);
+DPRINTF("port92: write 0x%02" PRIx64 "\n", val);
 s->outport = val;
 qemu_set_irq(*s->a20_out, (val >> 1) & 1);
 if ((val & 1) && !(oldval & 1)) {
diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
index ca1cffc..2ab8c87 100644
--- a/hw/input/pckbd.c
+++ b/hw/input/pckbd.c
@@ -229,7 +229,7 @@ static void kbd_write_command(void *opaque, hwaddr addr,
 {
 KBDState *s = opaque;
 
-DPRINTF("kbd: write cmd=0x%02x\n", val);
+DPRINTF("kbd: write cmd=0x%02" PRIx64 "\n", val);
 
 /* Bits 3-0 of the output port P2 of the keyboard controller may be pulsed
  * low for approximately 6 micro seconds. Bits 3-0 of the KBD_CCMD_PULSE
@@ -330,7 +330,7 @@ static void kbd_write_data(void *opaque, hwaddr addr,
 {
 KBDState *s = opaque;
 
-DPRINTF("kbd: write data=0x%02x\n", val);
+DPRINTF("kbd: write data=0x%02" PRIx64 "\n", val);
 
 switch(s->write_cmd) {
 case 0:
diff --git a/hw/intc/i8259.c b/hw/intc/i8259.c
index a563b82..c51901b 100644
--- a/hw/intc/i8259.c
+++ b/hw/intc/i8259.c
@@ -370,7 +370,7 @@ static uint64_t pic_ioport_read(void *opaque, hwaddr addr,
 ret = s->imr;
 }
 }
-DPRINTF("read: addr=0x%02x val=0x%02x\n", addr, ret);
+DPRINTF("read: addr=0x%02" HWADDR_PRIx " val=0x%02x\n", addr, ret);
 return ret;
 }
 
diff --git a/hw/isa/apm.c b/hw/isa/apm.c
index 054d529..26ab170 100644
--- a/hw/isa/apm.c
+++ b/hw/isa/apm.c
@@ -41,7 +41,8 @@ static void apm_ioport_writeb(void *opaque, hwaddr addr, 
uint64_t val,
 {
 APMState *apm = opaque;
 addr &= 1;
-APM_DPRINTF("apm_ioport_writeb addr=0x%x val=0x%02x\n", addr, val);
+APM_DPRINTF("apm_ioport_writeb

[Qemu-devel] [PULL 10/10] slirp: Honour vlan/stack in hostfwd_remove commands

2014-09-02 Thread Michael Tokarev

From: Peter Maydell 

The hostfwd_add and hostfwd_remove monitor commands allow the user
to optionally specify a vlan/stack tuple. hostfwd_add honours this,
but hostfwd_remove does not (it looks up the tuple but then ignores
the SlirpState it has looked up and always uses the first stack
in the list anyway). Correct this to honour what the user requested.

Signed-off-by: Peter Maydell 
Signed-off-by: Michael Tokarev 
---
 net/slirp.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/slirp.c b/net/slirp.c
index 647039e..c171119 100644
--- a/net/slirp.c
+++ b/net/slirp.c
@@ -345,8 +345,7 @@ void net_slirp_hostfwd_remove(Monitor *mon, const QDict 
*qdict)
 
 host_port = atoi(p);
 
-err = slirp_remove_hostfwd(QTAILQ_FIRST(&slirp_stacks)->slirp, is_udp,
-   host_addr, host_port);
+err = slirp_remove_hostfwd(s->slirp, is_udp, host_addr, host_port);
 
 monitor_printf(mon, "host forwarding rule for %s %s\n", src_str,
err ? "not found" : "removed");
-- 
1.7.10.4

[Qemu-devel] [PULL 07/10] query-memdev: fix potential memory leaks

2014-09-02 Thread Michael Tokarev

From: Chen Fan 

Signed-off-by: Chen Fan 
Reviewed-by: Peter Crosthwaite 
Reviewed-by: Hu Tao 
Signed-off-by: Michael Tokarev 
---
 numa.c |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/numa.c b/numa.c
index c78cec9..aa772aa 100644
--- a/numa.c
+++ b/numa.c
@@ -318,10 +318,11 @@ void memory_region_allocate_system_memory(MemoryRegion 
*mr, Object *owner,
 static int query_memdev(Object *obj, void *opaque)
 {
 MemdevList **list = opaque;
+MemdevList *m = NULL;
 Error *err = NULL;
 
 if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
-MemdevList *m = g_malloc0(sizeof(*m));
+m = g_malloc0(sizeof(*m));
 
 m->value = g_malloc0(sizeof(*m->value));
 
@@ -369,6 +370,9 @@ static int query_memdev(Object *obj, void *opaque)
 
 return 0;
 error:
+g_free(m->value);
+g_free(m);
+
 return -1;
 }
 
-- 
1.7.10.4

[Qemu-devel] [PULL 00/10] Trivial patches for 2014-09-03

2014-09-02 Thread Michael Tokarev

Here's a next trivial-patches batch.  A few little things here and there.
Please consider applying.

Thanks,

/mjt

The following changes since commit 30eaca3acdf17d7bcbd1213eb149c02037edfb0b:

  Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20140902-1' into 
staging (2014-09-02 10:26:10 +0100)

are available in the git repository at:

  git://git.corpit.ru/qemu.git tags/trivial-patches-2014-09-03

for you to fetch changes up to 70381662aa97b294f3c81a085ed143f37f0ab0cb:

  slirp: Honour vlan/stack in hostfwd_remove commands (2014-09-02 22:38:16 
+0400)


trivial patches for 2014-09-03


Chen Fan (3):
  query-memdev: fix potential memory leaks
  qom/object.c, hmp.c: fix string_output_get_string() memory leak
  hmp: fix MemdevList memory leak

Dmitry Fleytman (1):
  MAINTAINERS: Add VMWare devices maintainer

Fam Zheng (1):
  scripts: Remove scripts/qtest

Gonglei (1):
  Fix debug print warning

Li Liu (2):
  device_tree.c: redirect load_device_tree err message to stderr
  device_tree.c: dump all err mesages with error_report

Peter Maydell (1):
  slirp: Honour vlan/stack in hostfwd_remove commands

Richard W.M. Jones (1):
  curl: The macro that you have to uncomment to get debugging is DEBUG_CURL.

 MAINTAINERS|6 ++
 block/curl.c   |2 +-
 device_tree.c  |   55 
 hmp.c  |8 +--
 hw/i2c/pm_smbus.c  |5 +++--
 hw/i386/pc.c   |2 +-
 hw/input/pckbd.c   |4 ++--
 hw/intc/i8259.c|2 +-
 hw/isa/apm.c   |5 +++--
 hw/timer/mc146818rtc.c |2 +-
 net/slirp.c|3 +--
 numa.c |   15 ++---
 qom/object.c   |   12 +--
 scripts/qtest  |5 -
 14 files changed, 70 insertions(+), 56 deletions(-)
 delete mode 100755 scripts/qtest

[Qemu-devel] [PATCH] vhost_net: initialize acked_features to a safe value during ack

2014-09-02 Thread Jason Wang

commit 2e6d46d77ed328d34a94688da8371bcbe243479b (vhost: add
vhost_get_features and vhost_ack_features) removes the step that
initializes the acked_features to backend_features. This will result an
unexpected value of acked_features which may fail the features setting of
vhost. This patch fixes it by recover this step.

Cc: Nikolay Nikolaev 
Cc: Andrey Korolyov 
Cc: Michael S. Tsirkin 
Cc: Michael Roth 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Jason Wang 
---
 hw/net/vhost_net.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index f87c798..b1d4b1f 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -115,6 +115,7 @@ unsigned vhost_net_get_features(struct vhost_net *net, 
unsigned features)

 void vhost_net_ack_features(struct vhost_net *net, unsigned features)
 {
+net->dev.acked_features = net->dev.backend_features;
 vhost_ack_features(&net->dev, vhost_net_get_feature_bits(net), features);
 }

-- 
1.8.3.1

Re: [Qemu-devel] [PATCH v6 02/27] bootindex: add del_boot_device_path function

2014-09-02 Thread Gerd Hoffmann


  Hi,

> 4. When we hotplug the virtio-net-pci device, only pass virtio-net-pci's 
> pointer to 
> del_boot_device_path(). But virtio-net-pci != virtio-net-device, so I add a 
> function
> named is_same_fw_dev_path() to handle this situation. 

When hot-unplugging virtio-net-pci I'd expect we free both
virtio-net-pci and virtio-net-device (and therefore call
del_boot_device_path twice, once for each device).  Can you check that?

thanks,
  Gerd

[Qemu-devel] [PATCH V2] fix: unix sockets created for virtio-serail has insufficient permissions

2014-09-02 Thread Chunyan Liu

Add umask to _virCommand, allow user to set umask to command.
Set umask(002) to qemu process to overwrite default umask(022) so
that unix sockets created for virtio-serial has expected permissions.

Fix problem reported here:
https://sourceware.org/bugzilla/show_bug.cgi?id=13078#c11
https://bugzilla.novell.com/show_bug.cgi?id=888166

To use virtio-serial device, unix socket created for chardev with
default umask(022) has insufficient permissions.
e.g.:
-device virtio-serial \
-chardev socket,path=/tmp/foo,server,nowait,id=foo \
-device virtserialport,chardev=foo,name=org.fedoraproject.port.0

srwxr-xr-x 1 qemu qemu 0 21. Jul 14:19 /tmp/somefile.sock

Other users in the same group (like real user, test engines, etc)
cannot write to this socket.

Signed-off-by: Chunyan Liu 
---
Changes:
  * set umask(002) to the whole qemu process instead of calling
umask in qemu unix_listen_opts.

  V1 is here:
  http://www.mail-archive.com/libvir-list@redhat.com/msg101513.html

 src/libvirt_private.syms |  1 +
 src/qemu/qemu_process.c  |  1 +
 src/util/vircommand.c| 11 +++
 src/util/vircommand.h|  1 +
 4 files changed, 14 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 71fc063..f136d24 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1171,6 +1171,7 @@ virCommandSetPidFile;
 virCommandSetPreExecHook;
 virCommandSetSELinuxLabel;
 virCommandSetUID;
+virCommandSetUmask;
 virCommandSetWorkingDirectory;
 virCommandToString;
 virCommandWait;
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index f68dfbe..f76aa5a 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -4141,6 +4141,7 @@ int qemuProcessStart(virConnectPtr conn,
 virCommandSetPreExecHook(cmd, qemuProcessHook, &hookData);
 virCommandSetMaxProcesses(cmd, cfg->maxProcesses);
 virCommandSetMaxFiles(cmd, cfg->maxFiles);
+virCommandSetUmask(cmd, 0x002);
 
 VIR_DEBUG("Setting up security labelling");
 if (virSecurityManagerSetChildProcessLabel(driver->securityManager,
diff --git a/src/util/vircommand.c b/src/util/vircommand.c
index 1d6dbd9..efb5f69 100644
--- a/src/util/vircommand.c
+++ b/src/util/vircommand.c
@@ -133,6 +133,7 @@ struct _virCommand {
 #if defined(WITH_SECDRIVER_APPARMOR)
 char *appArmorProfile;
 #endif
+int umask;
 };
 
 /* See virCommandSetDryRun for description for this variable */
@@ -582,6 +583,8 @@ virExec(virCommandPtr cmd)
 
 /* child */
 
+if (cmd->umask)
+umask(cmd->umask);
 ret = EXIT_CANCELED;
 openmax = sysconf(_SC_OPEN_MAX);
 if (openmax < 0) {
@@ -1082,6 +1085,14 @@ virCommandSetMaxFiles(virCommandPtr cmd, unsigned int 
files)
 cmd->maxFiles = files;
 }
 
+void virCommandSetUmask(virCommandPtr cmd, int umask)
+{
+if (!cmd || cmd->has_error)
+return;
+
+cmd->umask = umask;
+}
+
 /**
  * virCommandClearCaps:
  * @cmd: the command to modify
diff --git a/src/util/vircommand.h b/src/util/vircommand.h
index d3b286d..bf65de4 100644
--- a/src/util/vircommand.h
+++ b/src/util/vircommand.h
@@ -72,6 +72,7 @@ void virCommandSetUID(virCommandPtr cmd, uid_t uid);
 void virCommandSetMaxMemLock(virCommandPtr cmd, unsigned long long bytes);
 void virCommandSetMaxProcesses(virCommandPtr cmd, unsigned int procs);
 void virCommandSetMaxFiles(virCommandPtr cmd, unsigned int files);
+void virCommandSetUmask(virCommandPtr cmd, int umask);
 
 void virCommandClearCaps(virCommandPtr cmd);
 
-- 
1.8.4.5

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03

2014-09-02 Thread Michael S. Tsirkin

On Wed, Sep 03, 2014 at 02:17:02AM +0400, Andrey Korolyov wrote:
> On Wed, Sep 3, 2014 at 2:09 AM, Andrey Korolyov  wrote:
> > On Wed, Sep 3, 2014 at 1:51 AM, Michael S. Tsirkin  wrote:
> >> On Wed, Sep 03, 2014 at 01:29:29AM +0400, Andrey Korolyov wrote:
> >>> On Wed, Sep 3, 2014 at 1:03 AM, Michael S. Tsirkin  
> >>> wrote:
> >>> >> bad one is the
> >>> >>
> >>> >> Author: Jason Wang 
> >>> >> Date:   Tue Sep 2 18:07:46 2014 +0300
> >>> >>
> >>> >> vhost_net: start/stop guest notifiers properly
> >>> >
> >>> >
> >>> >
> >>> > upstream has this (pull request sent today):
> >>> > vhost_net: cleanup start/stop condition
> >>> >
> >>> > Could you apply it and see if it helps please?
> >>> >
> >>> > Michael, if it helps it should be before start/stop guest notifiers
> >>> > ideally to avoid bisect problems.
> >>>
> >>> It is already applied as shown from the list in the previous message
> >>> (there are some aio fixes too on top of 2.1 I picked before but they
> >>> should not impact vhost-net interaction in any mean). The symptoms are
> >>> a bit interesting - VM crashes only at PCI device initalization (e.g.
> >>> grub stage after reset and initrd unpacking are passing well, but then
> >>> things getting ugly). I am running 3.14 guest i686-pae kernel from
> >>> debian backports in guest, so it may be version-specific after all. If
> >>> it`ll be hard to reproduce, I can try 64bit, expecting same behavior.
> >>> Please find args in attached file.
> >>
> >>
> >>
> >> ok just to make sure - which tree do I clone exactly?
> >>
> >
> > https://github.com/mdroth/qemu.git stable-2.1-staging showing same
> > behavior for me with those patches
> 
> Forgot to mention important detail - I am playing with -mq now, so
> actually virtio-net working in a bit different way than it may
> expected (it also shown in args list from above, but someone may miss
> it):
> ...
> qemu-system-x86_64: unable to start vhost net: 95: falling back on
> userspace virtio
> qemu-system-x86_64: unable to start vhost net: 95: falling back on
> userspace virtio
> ...


OK I see at least one obvious bug there: does the following fix the
crash for you?
Separately, we need to debug why mq vhost is broken for you.
Is this a regression?

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index ba5d544..1fe18c7 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -289,7 +289,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(dev)));
 VirtioBusState *vbus = VIRTIO_BUS(qbus);
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
-int r, i = 0;
+int r, i;
 
 if (!vhost_net_device_endian_ok(dev)) {
 error_report("vhost-net does not support cross-endian");
@@ -317,16 +317,22 @@ int vhost_net_start(VirtIODevice *dev, NetClientState 
*ncs,
 r = vhost_net_start_one(get_vhost_net(ncs[i].peer), dev);
 
 if (r < 0) {
-goto err;
+goto err_start;
 }
 }
 
 return 0;
 
-err:
+err_start:
 while (--i >= 0) {
 vhost_net_stop_one(get_vhost_net(ncs[i].peer), dev);
 }
+err:
+r = k->set_guest_notifiers(qbus->parent, total_queues * 2, false);
+if (r < 0) {
+fprintf(stderr, "vhost guest notifier cleanup failed: %d\n", r);
+fflush(stderr);
+}
 return r;
 }

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03

2014-09-02 Thread Jason Wang

On 09/03/2014 02:35 PM, Michael S. Tsirkin wrote:
> On Wed, Sep 03, 2014 at 02:17:02AM +0400, Andrey Korolyov wrote:
>> On Wed, Sep 3, 2014 at 2:09 AM, Andrey Korolyov  wrote:
>>> On Wed, Sep 3, 2014 at 1:51 AM, Michael S. Tsirkin  wrote:
 On Wed, Sep 03, 2014 at 01:29:29AM +0400, Andrey Korolyov wrote:
> On Wed, Sep 3, 2014 at 1:03 AM, Michael S. Tsirkin  
> wrote:
>>> bad one is the
>>>
>>> Author: Jason Wang 
>>> Date:   Tue Sep 2 18:07:46 2014 +0300
>>>
>>> vhost_net: start/stop guest notifiers properly
>>
>>
>> upstream has this (pull request sent today):
>> vhost_net: cleanup start/stop condition
>>
>> Could you apply it and see if it helps please?
>>
>> Michael, if it helps it should be before start/stop guest notifiers
>> ideally to avoid bisect problems.
> It is already applied as shown from the list in the previous message
> (there are some aio fixes too on top of 2.1 I picked before but they
> should not impact vhost-net interaction in any mean). The symptoms are
> a bit interesting - VM crashes only at PCI device initalization (e.g.
> grub stage after reset and initrd unpacking are passing well, but then
> things getting ugly). I am running 3.14 guest i686-pae kernel from
> debian backports in guest, so it may be version-specific after all. If
> it`ll be hard to reproduce, I can try 64bit, expecting same behavior.
> Please find args in attached file.


 ok just to make sure - which tree do I clone exactly?

>>> https://github.com/mdroth/qemu.git stable-2.1-staging showing same
>>> behavior for me with those patches
>> Forgot to mention important detail - I am playing with -mq now, so
>> actually virtio-net working in a bit different way than it may
>> expected (it also shown in args list from above, but someone may miss
>> it):
>> ...
>> qemu-system-x86_64: unable to start vhost net: 95: falling back on
>> userspace virtio
>> qemu-system-x86_64: unable to start vhost net: 95: falling back on
>> userspace virtio
>> ...
> Okay, so there's some bug in the error handling then.
> I'll dig into it - meanwhile can you please strace
> the binary to figure out which ioctl is failing?
>
> Or just trace it by hand: I am guessing vhost_net_start_one
> is the one failing here, add printfs there and check
> (note to self: we need more error messages in that function).
>
>

Looks like the issue was caused by this commit:

commit 2e6d46d77ed328d34a94688da8371bcbe243479b
Author: Nikolay Nikolaev 
Date:   Tue May 27 15:04:42 2014 +0300

vhost: add vhost_get_features and vhost_ack_features

It remove the step of initialization of acked_features to
backend_features. This will result a unexpected value acked_features
which may fail during setting features.

Will post a patch for this.

Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node

2014-09-02 Thread Bharata B Rao

On Tue, Aug 19, 2014 at 5:51 AM, Michael Roth  wrote:
> From: Nathan Fontenot 
>
> This add entries to the root OF node to advertise our PHBs as being
> DR-capable in according with PAPR specification.
>
> Each PHB is given a name of PHB, advertised as a PHB type,
> and associated with a power domain of -1 (indicating to guests that
> power management is handled automatically by hardware).
>
> We currently allocate entries for up to 32 DR-capable PHBs, though
> this limit can be increased later.
>
> DrcEntry objects to track the state of the DR-connector associated
> with each PHB are stored in a 32-entry array, and each DrcEntry has
> in turn have a dynamically-sized number of child DR-connectors,
> which we will use later to track the state of DR-connectors
> associated with a PHB's physical slots.
>
> Signed-off-by: Nathan Fontenot 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr.c | 143 
> +
>  hw/ppc/spapr_pci.c |   1 +
>  include/hw/ppc/spapr.h |  35 
>  3 files changed, 179 insertions(+)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 5c92707..d5e46c3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>  return ram_size;
>  }
>
> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
> +{
> +int i;
> +
> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +if (spapr->drc_table[i].phb_buid == buid) {
> +return &spapr->drc_table[i];
> +}
> + }
> +
> + return NULL;
> +}
> +
> +static void spapr_init_drc_table(void)
> +{
> +int i;
> +
> +memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
> +
> +/* For now we only care about PHB entries */
> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +spapr->drc_table[i].drc_index = 0x201 + i;
> +}
> +}
> +
> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
> +{
> +sPAPRDrcEntry *empty_drc = NULL;
> +sPAPRDrcEntry *found_drc = NULL;
> +int i, phb_index;
> +
> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +if (spapr->drc_table[i].phb_buid == 0) {
> +empty_drc = &spapr->drc_table[i];
> +}
> +
> +if (spapr->drc_table[i].phb_buid == buid) {
> +found_drc = &spapr->drc_table[i];
> +break;
> +}
> +}
> +
> +if (found_drc) {
> +return found_drc;
> +}
> +
> +if (empty_drc) {
> +empty_drc->phb_buid = buid;
> +empty_drc->state = state;

Shouldn't this be

empty_drc->state = state << INDICATOR_ENTITY_SENSE_SHIFT ?

Regards,
Bharata.

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03

2014-09-02 Thread Michael S. Tsirkin

On Wed, Sep 03, 2014 at 02:17:02AM +0400, Andrey Korolyov wrote:
> On Wed, Sep 3, 2014 at 2:09 AM, Andrey Korolyov  wrote:
> > On Wed, Sep 3, 2014 at 1:51 AM, Michael S. Tsirkin  wrote:
> >> On Wed, Sep 03, 2014 at 01:29:29AM +0400, Andrey Korolyov wrote:
> >>> On Wed, Sep 3, 2014 at 1:03 AM, Michael S. Tsirkin  
> >>> wrote:
> >>> >> bad one is the
> >>> >>
> >>> >> Author: Jason Wang 
> >>> >> Date:   Tue Sep 2 18:07:46 2014 +0300
> >>> >>
> >>> >> vhost_net: start/stop guest notifiers properly
> >>> >
> >>> >
> >>> >
> >>> > upstream has this (pull request sent today):
> >>> > vhost_net: cleanup start/stop condition
> >>> >
> >>> > Could you apply it and see if it helps please?
> >>> >
> >>> > Michael, if it helps it should be before start/stop guest notifiers
> >>> > ideally to avoid bisect problems.
> >>>
> >>> It is already applied as shown from the list in the previous message
> >>> (there are some aio fixes too on top of 2.1 I picked before but they
> >>> should not impact vhost-net interaction in any mean). The symptoms are
> >>> a bit interesting - VM crashes only at PCI device initalization (e.g.
> >>> grub stage after reset and initrd unpacking are passing well, but then
> >>> things getting ugly). I am running 3.14 guest i686-pae kernel from
> >>> debian backports in guest, so it may be version-specific after all. If
> >>> it`ll be hard to reproduce, I can try 64bit, expecting same behavior.
> >>> Please find args in attached file.
> >>
> >>
> >>
> >> ok just to make sure - which tree do I clone exactly?
> >>
> >
> > https://github.com/mdroth/qemu.git stable-2.1-staging showing same
> > behavior for me with those patches
> 
> Forgot to mention important detail - I am playing with -mq now, so
> actually virtio-net working in a bit different way than it may
> expected (it also shown in args list from above, but someone may miss
> it):
> ...
> qemu-system-x86_64: unable to start vhost net: 95: falling back on
> userspace virtio
> qemu-system-x86_64: unable to start vhost net: 95: falling back on
> userspace virtio
> ...

Also - does it work fine if you disable mq?

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03

2014-09-02 Thread Michael S. Tsirkin

On Wed, Sep 03, 2014 at 02:17:02AM +0400, Andrey Korolyov wrote:
> On Wed, Sep 3, 2014 at 2:09 AM, Andrey Korolyov  wrote:
> > On Wed, Sep 3, 2014 at 1:51 AM, Michael S. Tsirkin  wrote:
> >> On Wed, Sep 03, 2014 at 01:29:29AM +0400, Andrey Korolyov wrote:
> >>> On Wed, Sep 3, 2014 at 1:03 AM, Michael S. Tsirkin  
> >>> wrote:
> >>> >> bad one is the
> >>> >>
> >>> >> Author: Jason Wang 
> >>> >> Date:   Tue Sep 2 18:07:46 2014 +0300
> >>> >>
> >>> >> vhost_net: start/stop guest notifiers properly
> >>> >
> >>> >
> >>> >
> >>> > upstream has this (pull request sent today):
> >>> > vhost_net: cleanup start/stop condition
> >>> >
> >>> > Could you apply it and see if it helps please?
> >>> >
> >>> > Michael, if it helps it should be before start/stop guest notifiers
> >>> > ideally to avoid bisect problems.
> >>>
> >>> It is already applied as shown from the list in the previous message
> >>> (there are some aio fixes too on top of 2.1 I picked before but they
> >>> should not impact vhost-net interaction in any mean). The symptoms are
> >>> a bit interesting - VM crashes only at PCI device initalization (e.g.
> >>> grub stage after reset and initrd unpacking are passing well, but then
> >>> things getting ugly). I am running 3.14 guest i686-pae kernel from
> >>> debian backports in guest, so it may be version-specific after all. If
> >>> it`ll be hard to reproduce, I can try 64bit, expecting same behavior.
> >>> Please find args in attached file.
> >>
> >>
> >>
> >> ok just to make sure - which tree do I clone exactly?
> >>
> >
> > https://github.com/mdroth/qemu.git stable-2.1-staging showing same
> > behavior for me with those patches
> 
> Forgot to mention important detail - I am playing with -mq now, so
> actually virtio-net working in a bit different way than it may
> expected (it also shown in args list from above, but someone may miss
> it):
> ...
> qemu-system-x86_64: unable to start vhost net: 95: falling back on
> userspace virtio
> qemu-system-x86_64: unable to start vhost net: 95: falling back on
> userspace virtio
> ...

Okay, so there's some bug in the error handling then.
I'll dig into it - meanwhile can you please strace
the binary to figure out which ioctl is failing?

Or just trace it by hand: I am guessing vhost_net_start_one
is the one failing here, add printfs there and check
(note to self: we need more error messages in that function).

Re: [Qemu-devel] [Xen-devel] [PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough

2014-09-02 Thread Michael S. Tsirkin

On Wed, Sep 03, 2014 at 01:40:45AM +, Kay, Allen M wrote:
> 
> 
> > -Original Message-
> > From: Chen, Tiejun
> > Sent: Monday, September 01, 2014 12:50 AM
> > To: Michael S. Tsirkin
> > Cc: xen-de...@lists.xensource.com; Kay, Allen M; qemu-devel@nongnu.org;
> > Konrad Rzeszutek Wilk
> > Subject: Re: [Qemu-devel] [Xen-devel] [PATCH 2/2] xen:i386:pc_piix: create
> > isa bridge specific to IGD passthrough
> > 
> > On 2014/9/1 14:05, Michael S. Tsirkin wrote:
> > > On Mon, Sep 01, 2014 at 10:50:37AM +0800, Chen, Tiejun wrote:
> > >> On 2014/8/31 16:58, Michael S. Tsirkin wrote:
> > >>> On Fri, Aug 29, 2014 at 09:28:50AM +0800, Chen, Tiejun wrote:
> > 
> > 
> >  On 2014/8/28 8:56, Chen, Tiejun wrote:
> > >>> + */
> > >>> +dev = pci_create_simple(bus, PCI_DEVFN(0x1f, 0),
> > >>> +"xen-igd-passthrough-isa-bridge");
> > >>> +if (dev) {
> > >>> +r = xen_host_pci_device_get(&hdev, 0, 0,
> > >>> + PCI_DEVFN(0x1f,
> > >>> 0), 0);
> > >>> +if (!r) {
> > >>> +pci_config_set_vendor_id(dev->config,
> > hdev.vendor_id);
> > >>> +pci_config_set_device_id(dev->config,
> > >>> + hdev.device_id);
> > 
> >  Can you, instead, implement the reverse logic, probing the card
> >  and supplying the correct device id for PCH?
> > 
> > >>>
> > >>> Here what is your so-called reverse logic as I already asked you
> > >>> previously? Do you mean I should list all PCHs with a combo
> > >>> illustrated with the vendor/device id in advance? Then look up
> > >>> if we can find a
> > >>
> > >> Michael,
> > >>
> > >
> > > Ping.
> > >
> > > Thanks
> > > Tiejun
> > >
> > >> Could you explain this exactly? Then I can try follow-up your
> > >> idea ASAP if this is necessary and possible.
> > 
> >  Michel,
> > 
> >  Could you give us some explanation for your "reverse logic" when
> >  you're free?
> > 
> >  Thanks
> >  Tiejun
> > >>>
> > >>> So future drivers will look at device ID for the card and figure out
> > >>> how things should work from there.
> > >>> Old drivers still poke at device id of the chipset for this, but
> > >>> maybe qemu can do what newer drivers do:
> > >>> look at the card and figure out what guest should do, then present
> > >>> the appropriate chipset id.
> > >>>
> > >>> This is based on what Jesse said:
> > >>> Practically speaking, we could probably assume specific GPU/PCH
> > combos,
> > >>> as I don't think they're generally mixed across generations, 
> > >>> though
> > SNB
> > >>> and IVB did have compatible sockets, so there is the 
> > >>> possibility of
> > >>> mixing CPT and PPT PCHs, but those are register identical as 
> > >>> far as the
> > >>> graphics driver is concerned, so even that should be safe.
> > >>>
> > >>
> > >> Michael,
> > >>
> > >> Thanks for your explanation.
> > >>
> > >>> so the idea is to have a reverse table mapping GPU back to PCH.
> > >>> Present to guest the ID that will let it assume the correct GPU.
> > >>
> > >> I guess you mean we should create to maintain such a table:
> > >>
> > >> [GPU Card: device_id(s), PCH: device_id]
> > >>
> > >> Then each time, instead of exposing that real PCH device id directly,
> > >> qemu first can get the real GPU device id to lookup this table to
> > >> present a appropriate PCH's device id to the guest.
> > >>
> > >> And looks here that appropriate PCH's device id is not always a that
> > >> real PCH's device id. Right? If I'm wrong please correct me.
> > >
> > > Exactly: we don't really care what the PCH ID is - we only need it for
> > > the guest driver to do the right thing.
> > 
> > Agreed.
> > 
> > I need to ask Allen to check if one given GPU card device id is always
> > corresponding to one given PCH on both HSW and BDW currently. If yes, I can
> > do this quickly. Otherwise I need some time to establish that sort
> > of connection.
> > 
> 
> If I understand this correctly, the only difference is instead of reading PCH 
> DevID/RevID from the host hardware, QEMU inserts those values into PCH 
> virtual device by looking at the reverse mapping table it maintains.
> 
> I agree the downside of doing this is the reverse mapping table may be hard 
> to maintain.

Point is it won't be, since future devices will not need this hack.
So we just need the table for the existing ones, build it
once and forget about it.

> What is the advantage of doing this instead of having QEMU reading it from 
> the host?  Is it to test to make sure reverse mapping methods works before it 
> is adopted in the new drivers?

That's one.
But a more important one is security: we don't want QEMU
to poke at any new files in the host if we can help it,
otherwise all distros have to change their LSM policies
to allow this. In

[Qemu-devel] [Bug 1254828] Re: qemu-sparc64-static: Segmentation Fault during debootstrap second stage

2014-09-02 Thread Itaru Kitayama

I am seeing the same:

# qemu-sparc64-static -version
qemu-sparc64 version 2.1.50, Copyright (c) 2003-2008 Fabrice Bellard

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1254828

Title:
  qemu-sparc64-static: Segmentation Fault during debootstrap second
  stage

Status in QEMU:
  New
Status in Linaro QEMU:
  New
Status in “qemu-linaro” package in Ubuntu:
  New

Bug description:
  Host: Ubuntu Precise amd64
  Guest: Debian Sid (ports) sparc64

  When attempting the second stage of a debootstrap for a sparc64 Debian
  Sid guest, a segmentation fault occurs.

  $ sudo qemu-debootstrap --no-check-gpg --arch=sparc64 sid sparc64 
http://ftp.debian-ports.org/debian
  I: Running command: debootstrap --arch sparc64 --foreign --no-check-gpg sid 
sparc64 http://ftp.debian-ports.org/debian
  [...]
  I: Running command: chroot sparc64 /debootstrap/debootstrap --second-stage
  /debootstrap/debootstrap: 22: .: Can't open /usr/share/debootstrap/functions
  Segmentation fault (core dumped)

  Running a simple "sudo chroot sparc64" exits silently on amd64, and
  reports a segfault on i386.

  ProblemType: Bug
  DistroRelease: Ubuntu 12.04
  Package: qemu-user-static 1.0.50-2012.03-0ubuntu2.1
  ProcVersionSignature: Ubuntu 3.8.0-33.48~precise1-generic 3.8.13.11
  Uname: Linux 3.8.0-33-generic x86_64
  NonfreeKernelModules: wl
  ApportVersion: 2.0.1-0ubuntu17.6
  Architecture: amd64
  Date: Mon Nov 25 17:49:34 2013
  Dependencies:
   
  InstallationMedia: Ubuntu 12.04.3 LTS "Precise Pangolin" - Release amd64 
(20130820.1)
  MarkForUpload: True
  ProcEnviron:
   LANGUAGE=en_GB:en
   TERM=xterm
   PATH=(custom, no user)
   LANG=en_GB.UTF-8
   SHELL=/bin/bash
  SourcePackage: qemu-linaro
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1254828/+subscriptions

[Qemu-devel] [PATCH 1/3] trace: Only link generated-tracers.o with "simple" backend

2014-09-02 Thread Fam Zheng

In any other cases the object file is effectively empty, which is
disliked by ranlib and nm on Mac OS X.

Reported-by: Peter Maydell 
Signed-off-by: Fam Zheng 
---
 trace/Makefile.objs | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/trace/Makefile.objs b/trace/Makefile.objs
index 387f191..46de95c 100644
--- a/trace/Makefile.objs
+++ b/trace/Makefile.objs
@@ -140,8 +140,7 @@ $(obj)/generated-tcg-tracers.h-timestamp: 
$(SRC_PATH)/trace-events $(BUILD_DIR)/
 ##
 # Backend code
 
-util-obj-$(CONFIG_TRACE_SIMPLE) += simple.o
+util-obj-$(CONFIG_TRACE_SIMPLE) += simple.o generated-tracers.o
 util-obj-$(CONFIG_TRACE_FTRACE) += ftrace.o
 util-obj-$(CONFIG_TRACE_UST) += generated-ust.o
 util-obj-y += control.o
-util-obj-y += generated-tracers.o
-- 
2.1.0.27.g96db324

[Qemu-devel] [PATCH 3/3] util: Don't link host-utils.o if it's empty

2014-09-02 Thread Fam Zheng

Reported-by: Peter Maydell 
Signed-off-by: Fam Zheng 
---
 util/Makefile.objs | 3 ++-
 util/host-utils.c  | 2 --
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/util/Makefile.objs b/util/Makefile.objs
index 6b3c83b..cb8862b 100644
--- a/util/Makefile.objs
+++ b/util/Makefile.objs
@@ -1,7 +1,8 @@
 util-obj-y = osdep.o cutils.o unicode.o qemu-timer-common.o
 util-obj-$(CONFIG_WIN32) += oslib-win32.o qemu-thread-win32.o 
event_notifier-win32.o
 util-obj-$(CONFIG_POSIX) += oslib-posix.o qemu-thread-posix.o 
event_notifier-posix.o qemu-openpty.o
-util-obj-y += envlist.o path.o host-utils.o module.o
+util-obj-y += envlist.o path.o module.o
+util-obj-$(call lnot,$(CONFIG_INT128)) += host-utils.o
 util-obj-y += bitmap.o bitops.o hbitmap.o
 util-obj-y += fifo8.o
 util-obj-y += acl.o
diff --git a/util/host-utils.c b/util/host-utils.c
index ee57ef5..102e5bf 100644
--- a/util/host-utils.c
+++ b/util/host-utils.c
@@ -28,7 +28,6 @@
 #include "qemu/host-utils.h"
 
 /* Long integer helpers */
-#ifndef CONFIG_INT128
 static inline void mul64(uint64_t *plow, uint64_t *phigh,
  uint64_t a, uint64_t b)
 {
@@ -161,4 +160,3 @@ int divs128(int64_t *plow, int64_t *phigh, int64_t divisor)
 return overflow;
 }
 
-#endif /* !CONFIG_INT128 */
-- 
2.1.0.27.g96db324

[Qemu-devel] [PATCH 0/3] build-sys: Exclude empty object files when linking libqemuutil.a

2014-09-02 Thread Fam Zheng

On Mac OS X, ranlib complains on a few empty objects:

  ARlibqemuutil.a
  
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
  file: libqemuutil.a(generated-tracers.o) has no symbols
  
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
  file: libqemuutil.a(host-utils.o) has no symbols
  
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
  file: libqemuutil.a(getauxval.o) has no symbols

This series fixes the warnings.

Fam

Fam Zheng (3):
  trace: Only link generated-tracers.o with "simple" backend
  util: Move general qemu_getauxval to util/getauxval.c
  util: Don't link host-utils.o if it's empty

 include/qemu/osdep.h | 4 
 trace/Makefile.objs  | 3 +--
 util/Makefile.objs   | 3 ++-
 util/getauxval.c | 8 
 util/host-utils.c| 2 --
 5 files changed, 11 insertions(+), 9 deletions(-)

-- 
2.1.0.27.g96db324

[Qemu-devel] [PATCH 2/3] util: Move general qemu_getauxval to util/getauxval.c

2014-09-02 Thread Fam Zheng

So that we won't have an empty getauxval.o which is disliked by ranlib.

Reported-by: Peter Maydell 
Suggested-by: Paolo bonz...@redhat.com
Signed-off-by: Fam Zheng 
---
 include/qemu/osdep.h | 4 
 util/getauxval.c | 8 
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 9dd43fc..1565404 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -246,11 +246,7 @@ char *qemu_get_exec_dir(void);
  * Search the auxiliary vector for @type, returning the value
  * or 0 if @type is not present.
  */
-#if defined(CONFIG_GETAUXVAL) || defined(__linux__)
 unsigned long qemu_getauxval(unsigned long type);
-#else
-static inline unsigned long qemu_getauxval(unsigned long type) { return 0; }
-#endif
 
 void qemu_set_tty_echo(int fd, bool echo);
 
diff --git a/util/getauxval.c b/util/getauxval.c
index 25f48e5..1732ace 100644
--- a/util/getauxval.c
+++ b/util/getauxval.c
@@ -98,4 +98,12 @@ unsigned long qemu_getauxval(unsigned long type)
 
 return 0;
 }
+
+#else
+
+unsigned long qemu_getauxval(unsigned long type)
+{
+return 0;
+}
+
 #endif
-- 
2.1.0.27.g96db324

Re: [Qemu-devel] [PATCH v2] rules.mak: Fix DSO build by pulling in archive symbols

2014-09-02 Thread Fam Zheng

On Tue, 09/02 14:07, Paolo Bonzini wrote:
> Il 02/09/2014 03:19, Fam Zheng ha scritto:
> > Thanks both of you for tracking down the problem. Let's add 2>/dev/null to
> > suppress the warning.
> 
> Actually I think the warning would complicate tracking down real
> problems.  We can fix the source of the warnings, especially since they
> already occur with ranlib.  I've queued v2 of the patch, but I'll wait a
> bit before submitting it for inclusion.
> 

OK, thanks. I'll post some patches to exclude the empty objects in the
util-obj-y.

Fam

Re: [Qemu-devel] [PATCH v6 02/27] bootindex: add del_boot_device_path function

2014-09-02 Thread Gonglei (Arei)

> Subject: RE: [PATCH v6 02/27] bootindex: add del_boot_device_path function
> 
> Hi,
> 
> > From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> > Sent: Wednesday, September 03, 2014 2:00 AM
> > Subject: Re: [PATCH v6 02/27] bootindex: add del_boot_device_path function
> > Importance: High
> >
> > On Mon, Sep 01, 2014 at 06:47:13AM +, Gonglei (Arei) wrote:
> > > > From: Gerd Hoffmann [mailto:kra...@redhat.com]
> > > > Sent: Monday, September 01, 2014 2:43 PM
> > > > Subject: Re: [PATCH v6 02/27] bootindex: add del_boot_device_path
> > function
> > > > Importance: High
> > > >
> > > >   Hi,
> > > >
> > > > > +static bool is_same_fw_dev_path(DeviceState *src, DeviceState *dst)
> > > > > +{
> > > > > +bool ret = false;
> > > > > +char *devpath_src = qdev_get_fw_dev_path(src);
> > > > > +char *devpath_dst = qdev_get_fw_dev_path(dst);
> > > > > +
> > > > > +if (!strcmp(devpath_src, devpath_dst)) {
> > > > > +ret = true;
> > > > > +}
> > > > > +
> > > > > +g_free(devpath_src);
> > > > > +g_free(devpath_dst);
> > > > > +return ret;
> > > > > +}
> > > > > +
> > > > > +void del_boot_device_path(DeviceState *dev)
> > > > > +{
> > > > > +FWBootEntry *i;
> > > > > +
> > > > > +assert(dev != NULL);
> > > > > +
> > > > > +/* remove all entries of the assigned device */
> > > > > +QTAILQ_FOREACH(i, &fw_boot_order, link) {
> > > > > +if (i->dev == NULL) {
> > > > > +continue;
> > > > > +}
> > > > > +if ((i->dev == dev || is_same_fw_dev_path(i->dev, dev))) {
> > > >
> > > > Why this is needed?  Is there any case where i-->dev != dev but
> > > > is_same_fw_dev_path() returns true?
> > > >
> > > Yes, it is needed. At present, the virito-net-pci device
> > > compliance with this situation.
> > >
> > > Please see the qom path about virtio-net-pci and virtio-net device:
> > >
> > > id: null, /machine/peripheral/nic1/virtio-backend
> > > id: nic1, /machine/peripheral/nic1
> >
> > And why exactly is the caller passing a different pointer to
> > del_boot_device_path()? Why not simply change the caller to pass the
> > same pointer to add_boot_device_path() and del_boot_device_path()?
> >
> 1. When we want to create a virtio-net device, we must use '-device
> virtio-net-pci'.
> This device is a pci device, and virtio-net-device is its child:
>  object_property_add_child(obj, "virtio-backend", OBJECT(&dev->vdev),
> NULL);
> 2. Both virtio-net-pci and virtio-net-device own bootindex property because
> they
> include "DEFINE_NIC_PROPERTIES(VirtIONetPCI, vdev.nic_conf)".
> 3. Only virtio-net-device will call add_boot_device_path() in
> virtio_net_device_realize(),
> which use its own pointer, but not virtio-net-pci's pointer:
>  add_boot_device_path(n->nic_conf.bootindex, dev, "/ethernet-phy@0");
> 4. When we hotplug the virtio-net-pci device, only pass virtio-net-pci's 
> pointer

Sorry, my typo, s/hotplug/hot-unplug/

> to del_boot_device_path(). But virtio-net-pci != virtio-net-device, so I add a
> function named is_same_fw_dev_path() to handle this situation.
> 
> 
> Best regards,
> -Gonglei

Re: [Qemu-devel] [PATCH v6 02/27] bootindex: add del_boot_device_path function

2014-09-02 Thread Gonglei (Arei)

Hi,

> From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> Sent: Wednesday, September 03, 2014 2:00 AM
> Subject: Re: [PATCH v6 02/27] bootindex: add del_boot_device_path function
> Importance: High
> 
> On Mon, Sep 01, 2014 at 06:47:13AM +, Gonglei (Arei) wrote:
> > > From: Gerd Hoffmann [mailto:kra...@redhat.com]
> > > Sent: Monday, September 01, 2014 2:43 PM
> > > Subject: Re: [PATCH v6 02/27] bootindex: add del_boot_device_path
> function
> > > Importance: High
> > >
> > >   Hi,
> > >
> > > > +static bool is_same_fw_dev_path(DeviceState *src, DeviceState *dst)
> > > > +{
> > > > +bool ret = false;
> > > > +char *devpath_src = qdev_get_fw_dev_path(src);
> > > > +char *devpath_dst = qdev_get_fw_dev_path(dst);
> > > > +
> > > > +if (!strcmp(devpath_src, devpath_dst)) {
> > > > +ret = true;
> > > > +}
> > > > +
> > > > +g_free(devpath_src);
> > > > +g_free(devpath_dst);
> > > > +return ret;
> > > > +}
> > > > +
> > > > +void del_boot_device_path(DeviceState *dev)
> > > > +{
> > > > +FWBootEntry *i;
> > > > +
> > > > +assert(dev != NULL);
> > > > +
> > > > +/* remove all entries of the assigned device */
> > > > +QTAILQ_FOREACH(i, &fw_boot_order, link) {
> > > > +if (i->dev == NULL) {
> > > > +continue;
> > > > +}
> > > > +if ((i->dev == dev || is_same_fw_dev_path(i->dev, dev))) {
> > >
> > > Why this is needed?  Is there any case where i-->dev != dev but
> > > is_same_fw_dev_path() returns true?
> > >
> > Yes, it is needed. At present, the virito-net-pci device
> > compliance with this situation.
> >
> > Please see the qom path about virtio-net-pci and virtio-net device:
> >
> > id: null, /machine/peripheral/nic1/virtio-backend
> > id: nic1, /machine/peripheral/nic1
> 
> And why exactly is the caller passing a different pointer to
> del_boot_device_path()? Why not simply change the caller to pass the
> same pointer to add_boot_device_path() and del_boot_device_path()?
> 
1. When we want to create a virtio-net device, we must use '-device 
virtio-net-pci'.
This device is a pci device, and virtio-net-device is its child:
 object_property_add_child(obj, "virtio-backend", OBJECT(&dev->vdev), NULL);
2. Both virtio-net-pci and virtio-net-device own bootindex property because they
include "DEFINE_NIC_PROPERTIES(VirtIONetPCI, vdev.nic_conf)".
3. Only virtio-net-device will call add_boot_device_path() in 
virtio_net_device_realize(),
which use its own pointer, but not virtio-net-pci's pointer:
 add_boot_device_path(n->nic_conf.bootindex, dev, "/ethernet-phy@0");
4. When we hotplug the virtio-net-pci device, only pass virtio-net-pci's 
pointer to 
del_boot_device_path(). But virtio-net-pci != virtio-net-device, so I add a 
function
named is_same_fw_dev_path() to handle this situation. 


Best regards,
-Gonglei

[Qemu-devel] vhost-user mmap failed

2014-09-02 Thread linhafieng

hi,

I run the qemu-master(vhost-user v10)with the vapp(vhost-user v9).It says mmap 
failed,do you know why this happen?

./vhost -s /path/to/socket

Processing message: VHOST_USER_SET_MEM_TABLE
_set_mem_table
mmap: Invalid argument
mmap: Invalid argument
Got memory.nregions 2
Cmd: VHOST_USER_SET_VRING_NUM (0x8)
Flags: 0x1
state: 0 256

./x86_64-softmmu/qemu-system-x86_64 -m 2048 -object 
memory-backend-file,id=mem,size=2048M,mem-path=/mnt/huge,share=on -numa 
node,memdev=mem -chardev socket,id=chr0,path=/path/to/socket  -netdev 
type=vhost-user,id=net0,chardev=chr0 -device virtio-net-pci,netdev=net0

Re: [Qemu-devel] [PATCH v13 5/6] raw-posix: Add full preallocation option

2014-09-02 Thread Hu Tao

On Tue, Sep 02, 2014 at 11:45:38PM +0200, Max Reitz wrote:
> On 29.08.2014 10:33, Hu Tao wrote:
> >This patch adds a new option preallocation for raw format, and implements
> >full preallocation.
> >
> >Signed-off-by: Hu Tao 
> >---
> >  block/raw-posix.c | 92 
> > +++
> >  qemu-doc.texi |  8 +
> >  qemu-img.texi |  8 +
> >  3 files changed, 88 insertions(+), 20 deletions(-)
> >
> >diff --git a/block/raw-posix.c b/block/raw-posix.c
> >index abe0759..25f66f2 100644
> >--- a/block/raw-posix.c
> >+++ b/block/raw-posix.c
> >@@ -30,6 +30,7 @@
> >  #include "block/thread-pool.h"
> >  #include "qemu/iov.h"
> >  #include "raw-aio.h"
> >+#include "qapi/util.h"
> >  #if defined(__APPLE__) && (__MACH__)
> >  #include 
> >@@ -1365,6 +1366,9 @@ static int raw_create(const char *filename, QemuOpts 
> >*opts, Error **errp)
> >  int result = 0;
> >  int64_t total_size = 0;
> >  bool nocow = false;
> >+PreallocMode prealloc = PREALLOC_MODE_OFF;
> >+char *buf = NULL;
> >+Error *local_err = NULL;
> >  strstart(filename, "file:", &filename);
> >@@ -1372,37 +1376,80 @@ static int raw_create(const char *filename, QemuOpts 
> >*opts, Error **errp)
> >  total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
> >BDRV_SECTOR_SIZE);
> >  nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
> >+buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
> >+prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
> >+   PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
> >+   &local_err);
> >+g_free(buf);
> >+if (local_err) {
> >+error_propagate(errp, local_err);
> >+result = -EINVAL;
> >+goto out;
> >+}
> >  fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
> > 0644);
> >  if (fd < 0) {
> >  result = -errno;
> >  error_setg_errno(errp, -result, "Could not create file");
> >-} else {
> >-if (nocow) {
> >+goto out;
> >+}
> >+
> >+if (nocow) {
> >  #ifdef __linux__
> >-/* Set NOCOW flag to solve performance issue on fs like btrfs.
> >- * This is an optimisation. The FS_IOC_SETFLAGS ioctl return 
> >value
> >- * will be ignored since any failure of this operation should 
> >not
> >- * block the left work.
> >- */
> >-int attr;
> >-if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
> >-attr |= FS_NOCOW_FL;
> >-ioctl(fd, FS_IOC_SETFLAGS, &attr);
> >-}
> >-#endif
> >+/* Set NOCOW flag to solve performance issue on fs like btrfs.
> >+ * This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
> >+ * will be ignored since any failure of this operation should not
> >+ * block the left work.
> >+ */
> >+int attr;
> >+if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
> >+attr |= FS_NOCOW_FL;
> >+ioctl(fd, FS_IOC_SETFLAGS, &attr);
> >  }
> >+#endif
> >+}
> >-if (ftruncate(fd, total_size) != 0) {
> >-result = -errno;
> >-error_setg_errno(errp, -result, "Could not resize file");
> >-}
> >-if (qemu_close(fd) != 0) {
> >-result = -errno;
> >-error_setg_errno(errp, -result, "Could not close the new file");
> >+if (ftruncate(fd, total_size) != 0) {
> >+result = -errno;
> >+error_setg_errno(errp, -result, "Could not resize file");
> >+goto out_close;
> >+}
> >+
> >+if (prealloc == PREALLOC_MODE_FULL) {
> >+/* posix_fallocate() doesn't set errno. */
> >+result = -posix_fallocate(fd, 0, total_size);
> >+if (result != 0) {
> >+buf = g_malloc0(65536);
> >+int64_t num = 0, left = total_size;
> >+
> >+while (left > 0) {
> >+num = MIN(left, 65536);
> >+result = write(fd, buf, num);
> >+if (result < 0) {
> >+result = -errno;
> >+error_setg_errno(errp, -result,
> >+ "Could not write to the new file");
> >+g_free(buf);
> >+goto out_close;
> >+}
> >+left -= num;
> >+}
> >+fsync(fd);
> >+g_free(buf);
> >  }
> >+} else if (prealloc != PREALLOC_MODE_OFF) {
> >+result = -1;
> 
> As for qcow2 in patch 4, I'd prefer -EINVAL.

Okay.

> 
> >+error_setg(errp, "Unsupported preallocation mode: %s",
> >+   PreallocMode_lookup[prealloc]);
> >+}
> >+
> >+out_close:
> >+if (qemu_close(fd) != 0 && result == 0) {
> >+result = -errno;
> >+error_setg_errno(errp, -result, "Could not close the new file");

Re: [Qemu-devel] [Xen-devel] [PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough

2014-09-02 Thread Kay, Allen M



> -Original Message-
> From: Chen, Tiejun
> Sent: Monday, September 01, 2014 12:50 AM
> To: Michael S. Tsirkin
> Cc: xen-de...@lists.xensource.com; Kay, Allen M; qemu-devel@nongnu.org;
> Konrad Rzeszutek Wilk
> Subject: Re: [Qemu-devel] [Xen-devel] [PATCH 2/2] xen:i386:pc_piix: create
> isa bridge specific to IGD passthrough
> 
> On 2014/9/1 14:05, Michael S. Tsirkin wrote:
> > On Mon, Sep 01, 2014 at 10:50:37AM +0800, Chen, Tiejun wrote:
> >> On 2014/8/31 16:58, Michael S. Tsirkin wrote:
> >>> On Fri, Aug 29, 2014 at 09:28:50AM +0800, Chen, Tiejun wrote:
> 
> 
>  On 2014/8/28 8:56, Chen, Tiejun wrote:
> >>> + */
> >>> +dev = pci_create_simple(bus, PCI_DEVFN(0x1f, 0),
> >>> +"xen-igd-passthrough-isa-bridge");
> >>> +if (dev) {
> >>> +r = xen_host_pci_device_get(&hdev, 0, 0,
> >>> + PCI_DEVFN(0x1f,
> >>> 0), 0);
> >>> +if (!r) {
> >>> +pci_config_set_vendor_id(dev->config,
> hdev.vendor_id);
> >>> +pci_config_set_device_id(dev->config,
> >>> + hdev.device_id);
> 
>  Can you, instead, implement the reverse logic, probing the card
>  and supplying the correct device id for PCH?
> 
> >>>
> >>> Here what is your so-called reverse logic as I already asked you
> >>> previously? Do you mean I should list all PCHs with a combo
> >>> illustrated with the vendor/device id in advance? Then look up
> >>> if we can find a
> >>
> >> Michael,
> >>
> >
> > Ping.
> >
> > Thanks
> > Tiejun
> >
> >> Could you explain this exactly? Then I can try follow-up your
> >> idea ASAP if this is necessary and possible.
> 
>  Michel,
> 
>  Could you give us some explanation for your "reverse logic" when
>  you're free?
> 
>  Thanks
>  Tiejun
> >>>
> >>> So future drivers will look at device ID for the card and figure out
> >>> how things should work from there.
> >>> Old drivers still poke at device id of the chipset for this, but
> >>> maybe qemu can do what newer drivers do:
> >>> look at the card and figure out what guest should do, then present
> >>> the appropriate chipset id.
> >>>
> >>> This is based on what Jesse said:
> >>>   Practically speaking, we could probably assume specific GPU/PCH
> combos,
> >>>   as I don't think they're generally mixed across generations, though
> SNB
> >>>   and IVB did have compatible sockets, so there is the possibility of
> >>>   mixing CPT and PPT PCHs, but those are register identical as far as the
> >>>   graphics driver is concerned, so even that should be safe.
> >>>
> >>
> >> Michael,
> >>
> >> Thanks for your explanation.
> >>
> >>> so the idea is to have a reverse table mapping GPU back to PCH.
> >>> Present to guest the ID that will let it assume the correct GPU.
> >>
> >> I guess you mean we should create to maintain such a table:
> >>
> >> [GPU Card: device_id(s), PCH: device_id]
> >>
> >> Then each time, instead of exposing that real PCH device id directly,
> >> qemu first can get the real GPU device id to lookup this table to
> >> present a appropriate PCH's device id to the guest.
> >>
> >> And looks here that appropriate PCH's device id is not always a that
> >> real PCH's device id. Right? If I'm wrong please correct me.
> >
> > Exactly: we don't really care what the PCH ID is - we only need it for
> > the guest driver to do the right thing.
> 
> Agreed.
> 
> I need to ask Allen to check if one given GPU card device id is always
> corresponding to one given PCH on both HSW and BDW currently. If yes, I can
> do this quickly. Otherwise I need some time to establish that sort
> of connection.
> 

If I understand this correctly, the only difference is instead of reading PCH 
DevID/RevID from the host hardware, QEMU inserts those values into PCH virtual 
device by looking at the reverse mapping table it maintains.

I agree the downside of doing this is the reverse mapping table may be hard to 
maintain.

What is the advantage of doing this instead of having QEMU reading it from the 
host?  Is it to test to make sure reverse mapping methods works before it is 
adopted in the new drivers?

> Thanks
> Tiejun
> 
> >
> >>>
> >>> the problem with these tables is they are hard to keep up to date
> >>
> >> Yeah. But I think currently we can just start from some modern CPU
> >> such as HSW and BDW, then things could be easy.
> >>
> >> Allen,
> >>
> >> Any idea to this suggestion?
> >>
> >>> as new hardware comes out, but as future hardware won't need these
> >>> hacks, we shall be fine.
> >>
> >> Yeah.
> >>
> >> Thanks
> >> Tiejun
> >>
> >>>
> >>>
> >>
> >> Thanks
> >> Tiejun
> >>
> >>> matched PCH? If yes, what is that benefit you expect in
> >>> passthrough case? Shouldn't we pass these info to VM directly in
> passthrough case?
> >>>
> >>> Thanks
>

Re: [Qemu-devel] [PATCH v13 4/6] qapi: introduce PreallocMode and a new PreallocMode full.

2014-09-02 Thread Hu Tao

On Tue, Sep 02, 2014 at 03:51:23PM -0600, Eric Blake wrote:
> On 08/29/2014 02:33 AM, Hu Tao wrote:
> > This patch prepares for the subsequent patches.
> > 
> > Signed-off-by: Hu Tao 
> > ---
> >  block/qcow2.c  | 23 +++
> >  qapi/block-core.json   | 16 
> >  tests/qemu-iotests/049.out |  2 +-
> >  3 files changed, 32 insertions(+), 9 deletions(-)
> > 
> 
> > @@ -1958,6 +1958,13 @@ static int qcow2_create(const char *filename, 
> > QemuOpts *opts, Error **errp)
> >  flags |= BLOCK_FLAG_LAZY_REFCOUNTS;
> >  }
> >  
> > +if (prealloc && prealloc != PREALLOC_MODE_METADATA) {

This one reads as 'support PREALLOC_MODE_METADATA' only,

> 
> I find it a bit awkward that you are checking for PREALLOC_MODE_OFF
> implicitly ('prealloc &&') vs. checking for prealloc mode METADATA
> explicitly.  Since there are only three modes, would it be any simpler
> to just have written:
> 
> if (prealloc == PREALLOC_MODE_FULL) {

and this one reads as 'does't support PREALLOC_MODE_FULL'.  Althrough
they are the same, but I'd prefer the former one. Anyway, the check is
removed in patch 6.

Regards,
Hu

> 
> -- 
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>

Re: [Qemu-devel] [PATCH v13 4/6] qapi: introduce PreallocMode and a new PreallocMode full.

2014-09-02 Thread Hu Tao

On Tue, Sep 02, 2014 at 11:32:50PM +0200, Max Reitz wrote:
> On 29.08.2014 10:33, Hu Tao wrote:
> >This patch prepares for the subsequent patches.
> >
> >Signed-off-by: Hu Tao 
> >---
> >  block/qcow2.c  | 23 +++
> >  qapi/block-core.json   | 16 
> >  tests/qemu-iotests/049.out |  2 +-
> >  3 files changed, 32 insertions(+), 9 deletions(-)
> >
> >diff --git a/block/qcow2.c b/block/qcow2.c
> >index cf27c3f..95fb240 100644
> >--- a/block/qcow2.c
> >+++ b/block/qcow2.c
> >@@ -30,6 +30,7 @@
> >  #include "qemu/error-report.h"
> >  #include "qapi/qmp/qerror.h"
> >  #include "qapi/qmp/qbool.h"
> >+#include "qapi/util.h"
> >  #include "trace.h"
> >  #include "qemu/option_int.h"
> >@@ -1738,7 +1739,7 @@ static int preallocate(BlockDriverState *bs)
> >  static int qcow2_create2(const char *filename, int64_t total_size,
> >   const char *backing_file, const char 
> > *backing_format,
> >- int flags, size_t cluster_size, int prealloc,
> >+ int flags, size_t cluster_size, PreallocMode 
> >prealloc,
> >   QemuOpts *opts, int version,
> >   Error **errp)
> >  {
> >@@ -1915,7 +1916,7 @@ static int qcow2_create(const char *filename, QemuOpts 
> >*opts, Error **errp)
> >  uint64_t size = 0;
> >  int flags = 0;
> >  size_t cluster_size = DEFAULT_CLUSTER_SIZE;
> >-int prealloc = 0;
> >+PreallocMode prealloc = PREALLOC_MODE_OFF;
> >  int version = 3;
> >  Error *local_err = NULL;
> >  int ret;
> >@@ -1931,12 +1932,11 @@ static int qcow2_create(const char *filename, 
> >QemuOpts *opts, Error **errp)
> >  cluster_size = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
> >   DEFAULT_CLUSTER_SIZE);
> >  buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
> >-if (!buf || !strcmp(buf, "off")) {
> >-prealloc = 0;
> >-} else if (!strcmp(buf, "metadata")) {
> >-prealloc = 1;
> >-} else {
> >-error_setg(errp, "Invalid preallocation mode: '%s'", buf);
> >+prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
> >+   PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
> >+   &local_err);
> >+if (local_err) {
> >+error_propagate(errp, local_err);
> >  ret = -EINVAL;
> >  goto finish;
> >  }
> >@@ -1958,6 +1958,13 @@ static int qcow2_create(const char *filename, 
> >QemuOpts *opts, Error **errp)
> >  flags |= BLOCK_FLAG_LAZY_REFCOUNTS;
> >  }
> >+if (prealloc && prealloc != PREALLOC_MODE_METADATA) {
> >+ret = -1;
> 
> Since the return value is expected to be -errno, I'd propose "ret =
> -EINVAL;" here. With that fixed (or a good explanation why we want
> -1 here):

Good.

> 
> Reviewed-by: Max Reitz 
> 
> >+error_setg(errp, "Unsupported preallocate mode: %s",
> >+   PreallocMode_lookup[prealloc]);
> >+goto finish;
> >+}
> >+
> >  if (backing_file && prealloc) {
> >  error_setg(errp, "Backing file and preallocation cannot be used at 
> > "
> > "the same time");
> >diff --git a/qapi/block-core.json b/qapi/block-core.json
> >index fb74c56..543b00b 100644
> >--- a/qapi/block-core.json
> >+++ b/qapi/block-core.json
> >@@ -1679,3 +1679,19 @@
> >  'len'   : 'int',
> >  'offset': 'int',
> >  'speed' : 'int' } }
> >+
> >+# @PreallocMode
> >+#
> >+# Preallocation mode of QEMU image file
> >+#
> >+# @off: no preallocation
> >+# @metadata: preallocate only for metadata
> >+# @full: preallocate all data by calling posix_fallocate() if it is
> >+#available, otherwise by writing zeros to device to ensure disk
> >+#space is really available. @full preallocation also sets up
> >+#metadata correctly.
> >+#
> >+# Since 2.2
> >+##
> >+{ 'enum': 'PreallocMode',
> >+  'data': [ 'off', 'metadata', 'full' ] }
> >diff --git a/tests/qemu-iotests/049.out b/tests/qemu-iotests/049.out
> >index 71ca44d..09ca0ae 100644
> >--- a/tests/qemu-iotests/049.out
> >+++ b/tests/qemu-iotests/049.out
> >@@ -179,7 +179,7 @@ qemu-img create -f qcow2 -o preallocation=metadata 
> >TEST_DIR/t.qcow2 64M
> >  Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 encryption=off 
> > cluster_size=65536 preallocation='metadata' lazy_refcounts=off
> >  qemu-img create -f qcow2 -o preallocation=1234 TEST_DIR/t.qcow2 64M
> >-qemu-img: TEST_DIR/t.qcow2: Invalid preallocation mode: '1234'
> >+qemu-img: TEST_DIR/t.qcow2: invalid parameter value: 1234
> >  Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 encryption=off 
> > cluster_size=65536 preallocation='1234' lazy_refcounts=off
> >  == Check encryption option ==

Re: [Qemu-devel] [PATCH v13 3/6] rename parse_enum_option to qapi_enum_parse and make it public

2014-09-02 Thread Hu Tao

On Tue, Sep 02, 2014 at 11:27:00PM +0200, Max Reitz wrote:
> On 29.08.2014 10:33, Hu Tao wrote:
> >From: Peter Lieven 
> >
> >relaxing the license to LGPLv2+ is intentional.
> >
> >Suggested-by: Markus Armbruster 
> >Signed-off-by: Hu Tao 
> >Signed-off-by: Peter Lieven 
> >Reviewed-by: Eric Blake 
> >Reviewed-by: Benoit Canet 
> >---
> >  blockdev.c  | 30 ++
> >  include/qapi/util.h | 17 +
> >  qapi/Makefile.objs  |  1 +
> >  qapi/qapi-util.c| 34 ++
> >  4 files changed, 58 insertions(+), 24 deletions(-)
> >  create mode 100644 include/qapi/util.h
> >  create mode 100644 qapi/qapi-util.c
> 
> Seems pretty much unchanged (except for the author note), so:

Yes. There was a license discussion, so I just take the latest version
by Peter.

Regards,
Hu

> 
> Reviewed-by: Max Reitz

Re: [Qemu-devel] [PATCH v13 5/6] raw-posix: Add full preallocation option

2014-09-02 Thread Hu Tao

On Fri, Aug 29, 2014 at 09:48:01AM +0100, Richard W.M. Jones wrote:
> On Fri, Aug 29, 2014 at 04:33:12PM +0800, Hu Tao wrote:
> > +if (prealloc == PREALLOC_MODE_FULL) {
> > +/* posix_fallocate() doesn't set errno. */
> > +result = -posix_fallocate(fd, 0, total_size);
> > +if (result != 0) {
> 
> Is it better to test:
> 
>   result != ENOSYS && result != EOPNOTSUPP
> 
> here?

posix_fallocate() doesn't return ENOSYS or EOPNOTSUPP. All the errors
returned by posix_fallocate() apply to writing zeros, too. that is, if
posix_fallocate() returns an error, we should not do writing zeros,
neither.

I'm wondering what is the right way to test if posix_fallocate() is
supported, something like AC_CHECK_FUNC? how?

Regards,
Hu

> 
> I think this is definitely the right approach.
> 
> Rich.
> 
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-p2v converts physical machines to virtual machines.  Boot with a
> live CD or over the network (PXE) and turn machines into KVM guests.
> http://libguestfs.org/virt-v2v

Re: [Qemu-devel] [PATCH] xen-hvm.c: Improve the return method for xen_hvm_init()

2014-09-02 Thread Chen Gang


Oh, sorry, forgot Cc to qemu trivial.

Thanks.

On 9/3/14 0:22, Chen Gang wrote:
> When failure occurs, it need use "return -1" instead of exit(1), so can
> let upper caller has chance to print failure information, too, then user
> can know the failure result more clearly.
> 
> xen_hvm_init() may also return -errno, which may let upper caller think
> more (e.g. free some other related resources and try again), although at
> present, all related upper callers still exit(1).
> 
> It is not a normal function which does not release related resources, if
> return -1. So need give the related comments for it.
> 
> It passes common test under fedora 20 x86_64:
> 
>   "./configure --enable-xen && make -j4 && make check"
>   execute result: "echo $? == 0".
> 
> Signed-off-by: Chen Gang 
> ---
>  xen-hvm.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/xen-hvm.c b/xen-hvm.c
> index 0d09940..35efec0 100644
> --- a/xen-hvm.c
> +++ b/xen-hvm.c
> @@ -978,6 +978,7 @@ static void xen_wakeup_notifier(Notifier *notifier, void 
> *data)
>  xc_set_hvm_param(xen_xc, xen_domid, HVM_PARAM_ACPI_S_STATE, 0);
>  }
>  
> +/* return value:  0 is OK, -errno is failure, -1 is critical issue -- 
> exit(1) */
>  int xen_hvm_init(ram_addr_t *below_4g_mem_size, ram_addr_t 
> *above_4g_mem_size,
>   MemoryRegion **ram_memory)
>  {
> @@ -998,6 +999,7 @@ int xen_hvm_init(ram_addr_t *below_4g_mem_size, 
> ram_addr_t *above_4g_mem_size,
>  state->xenstore = xs_daemon_open();
>  if (state->xenstore == NULL) {
>  perror("xen: xenstore open");
> +xc_evtchn_close(state->xce_handle);
>  g_free(state);
>  return -errno;
>  }
> @@ -1069,7 +1071,7 @@ int xen_hvm_init(ram_addr_t *below_4g_mem_size, 
> ram_addr_t *above_4g_mem_size,
>  /* Initialize backend core & drivers */
>  if (xen_be_init() != 0) {
>  fprintf(stderr, "%s: xen backend core setup failed\n", __FUNCTION__);
> -exit(1);
> +return -1;
>  }
>  xen_be_register("console", &xen_console_ops);
>  xen_be_register("vkbd", &xen_kbdmouse_ops);
> 

-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed

Re: [Qemu-devel] [PATCH V2] acpi-build: Set FORCE_APIC_CLUSTER_MODEL bit for FADT flags

2014-09-02 Thread zhanghailiang


Hi,

Ping...

Thanks,
zhanghailiang

On 2014/8/29 11:52, zhanghailiang wrote:

If we start Windows 2008 R2 DataCenter with number of cpu less than 8,
The system will use APIC Flat Logical destination mode as default configuration,
Which has an upper limit of 8 CPUs.

The fault is that VM can not show all processors within Task Manager if
we hot-add cpus when the number of cpus in VM extends the limit of 8.

If we use cluster destination model, the problem will be solved.

Signed-off-by: huangzhichao
Signed-off-by: zhanghailiang
---
v2:
  - Set this bit when max_cpus>  8
---
  hw/i386/acpi-build.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 85e5834..cdc3b08 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -550,6 +550,12 @@ static void fadt_setup(AcpiFadtDescriptorRev1 *fadt, 
AcpiPmInfo *pm)
(1<<  ACPI_FADT_F_SLP_BUTTON) |
(1<<  ACPI_FADT_F_RTC_S4));
  fadt->flags |= cpu_to_le32(1<<  ACPI_FADT_F_USE_PLATFORM_CLOCK);
+/* APIC destination mode ("Flat Logical") has an upper limit of 8 CPUs
+ * For more than 8 CPUs, "Clustered Logical" mode has to be used
+ */
+if (max_cpus>  8) {
+fadt->flags |= cpu_to_le32(1<<  ACPI_FADT_F_FORCE_APIC_CLUSTER_MODEL);
+}
  }

Re: [Qemu-devel] [PATCH v3 0/8] block: Asynchronous request cancellation

2014-09-02 Thread Fam Zheng

On Tue, 09/02 16:09, Stefan Hajnoczi wrote:
> On Wed, Aug 27, 2014 at 10:49:08AM +0800, Fam Zheng wrote:
> > v3: Drop "RFC".
> > Improvements according to Paolo's comments:
> > 05: Just use THREAD_DONE and ret = -ECANCELED in thread-pool.c
> > 06: Don't check dbs->cancelled for twice.
> > Don't set dbs->acb to NULL.
> > 
> > v2: Drop the unfinished scsi part, which was broken in v1. (Paolo)
> > Add refcnt in BlockDriverAIOCB to maintain invariant of acb 
> > availability in
> > bdrv_aio_cancel_async. (Paolo)
> > Drop blkdebug change. (Stefan)
> > 
> > This series adds a new block layer API:
> > 
> >   void bdrv_aio_cancel_async(BlockDriverAIOCB *acb);
> > 
> > which is similar to existing bdrv_aio_cancel in that it cancels an AIO 
> > request,
> > but different that it doesn't block until the request is completely 
> > cancelled
> > or done.
> > 
> > More importantly, the completion callback, BlockDriverAIOCB.cb, is 
> > guaranteed
> > to be called, so that the cb can take care of resource releasing and status
> > reporting to guest, etc.
> > 
> > In the following work, scsi emulation code will be shifted to use the async
> > cancelling.
> > 
> > One major benefit would be that when guest tries to cancel a request, where 
> > the
> > request cannot be cancelled easily, (due to throttled BlockDriverState, a 
> > lost
> > connection, or a large request queue), we don't need to block the whole vm 
> > with
> > a busy loop, which is how bdrv_aio_cancel is implemented now.
> > 
> > A test case that is easy to reproduce is, throttle a scsi-disk to a very low
> > limit, for example 50 bps, then stress the guest block device with dd or 
> > fio.
> > 
> > Currently, the vm will quickly hang when it loses patience and send a tmf
> > command to cancel the request, at which point we will busy wait in
> > bdrv_aio_cancel, until the request is slowly spit out from throttled_reqs.
> > 
> > Later, we will change scsi device code to make this asynchronous, on top of
> > bdrv_aio_cancel_async.
> 
> We need to get rid of .bdrv_aio_cancel().  Keeping both
> .bdrv_aio_cancel() and .bdrv_aio_cancel_async() around is problematic
> because they have slightly different semantics.
> 
> This patch series makes block driver cancellation more complex because
> we support both approaches :(.
> 
> Could we do something like:
> 
> void bdrv_aio_cancel(BdrvAIOCB *acb)
> {
> bdrv_aiocb_ref(acb);
> bdrv_aio_cancel_async(acb);
> while (acb->ref > 1) {
> aio_poll(bdrv_get_aio_context(acb->bs), true);
> }
> bdrv_aiocb_release(acb);
> }
> 
> (pseudo-code)
> 
> Then .bdrv_aio_cancel() should be deleted.
> 

OK, I will drop .io_cancel field from AIOCBInfo, and convert all AIO
implementations to only support .io_cancel_async. That way we can emulate
bdrv_aio_cancel with bdrv_aio_cancel_async.

Thanks for reviewing!

Fam

Re: [Qemu-devel] tcmu-runner and QEMU

2014-09-02 Thread Andy Grover

On 09/02/2014 02:25 AM, Stefan Hajnoczi wrote:

The easiest approach is to write a tool similar to qemu-nbd that speaks
the userspace target protocol (i.e. mmap the shared memory).

If the tcmu setup code is involved, maybe providing a libtcmu with the
setup code would be useful.  I suspect that other projects may want to
integrate userspace target support too.  It's easier to let people add
it to their codebase rather than hope they bring their codebase into
tcmu-runner.

What other projects were you thinking of?

From my perspective, QEMU is singular. QEMU's block support seems to 
cover just about everything, even ceph, gluster, and sheepdog!

We certainly don't want to duplicate that code so a qemu-lio-tcmu in 
qemu.git like qemu-nbd, basically statically linking the BlockDriver 
object files, sounds like the first thing to try.

We can make tcmu-runner a library (libtcmu) if it makes sense, but let's 
do some work to try the current way and see how it goes before 
"flipping" it.

> The qemu-lio tool would live in the QEMU codebase and reuse all the
> infrastructure.  For example, it could include a QMP monitor just like
> the one you are adding to qemu-nbd.

Benoit and I talked a little about QMP on another part of the thread... 
I said I didn't think we needed a QMP monitor in qemu-lio-tcmu, but let 
me spin up on qemu a little more and I'll be able to speak more 
intelligently.

-- Andy

Re: [Qemu-devel] [PATCH memory v2 0/3] Memory Region Naming - take 2

2014-09-02 Thread Stefano Stabellini

Sorry for the late reply.

I just wanted to confirm that it's all working now.

Thanks,

Stefano

On Mon, 25 Aug 2014, Peter Crosthwaite wrote:
> Hi,
> 
> Here is an attempt at a proper fix to the memory region naming bugs as
> reported by Peter. The Xen compile bug and the memory leak.
> 
> Regards,
> Peter
> 
> Changed since v1:
> Constify xen physmap name string (Stefan W review).
> Change memory_region_name to caching based approach.
> 
> 
> 
> Peter Crosthwaite (3):
>   xen-hvm: Constify string
>   xen: hvm: Abstract away memory region name ref
>   memory: Lazy init name from QOM name as needed
> 
>  memory.c  |  5 -
>  xen-hvm.c | 11 +++
>  2 files changed, 11 insertions(+), 5 deletions(-)
> 
> -- 
> 2.1.0.1.g27b9230
>

Re: [Qemu-devel] [RFC PATCH v2 2/2] VFIO: Clear stale MSIx table during EEH reset

2014-09-02 Thread Gavin Shan

On Tue, Sep 02, 2014 at 02:10:42PM -0600, Alex Williamson wrote:
>On Mon, 2014-09-01 at 10:53 +1000, Gavin Shan wrote:
>> The PCI device MSIx table is cleaned out in hardware after EEH PE
>> reset. However, we still hold the stale MSIx entries in QEMU, which
>> should be cleared accordingly. Otherwise, we will run into another
>> (recursive) EEH error and the PCI devices contained in the PE have
>> to be offlined exceptionally.
>> 
>> The patch clears stale MSIx table before EEH PE reset so that MSIx
>> table could be restored properly after EEH PE reset.
>> 
>> Signed-off-by: Gavin Shan 
>> ---
>>  hw/misc/vfio.c | 32 +++-
>>  1 file changed, 31 insertions(+), 1 deletion(-)
>> 
>> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
>> index 1a3e7eb..1f55051 100644
>> --- a/hw/misc/vfio.c
>> +++ b/hw/misc/vfio.c
>> @@ -2724,6 +2724,17 @@ static void vfio_disable_interrupts(VFIODevice *vdev)
>>  }
>>  }
>>  
>> +static void vfio_disable_and_reset_interrupts(VFIODevice *vdev)
>> +{
>> +vfio_disable_interrupts(vdev);
>> +
>> +switch (vdev->interrupt) {
>> +case VFIO_INT_MSIX:
>> +msix_reset(&vdev->pdev);
>> +break;
>> +}
>
>This is apparently untested because vdev->interrupt should never be set
>to VFIO_INT_MSIX after vfio_disable_interrupts().  Also, you need to
>update the normal reset path to call msix_reset() unless it's already
>happening via another reset handler.  Thanks,
>

Yes, I didn't test this revision. I'll change the code according to
your comments and retest, then send v3 for your comments.

Thanks,
Gavin

>Alex
>
>> +}
>> +
>>  static int vfio_setup_msi(VFIODevice *vdev, int pos)
>>  {
>>  uint16_t ctrl;
>> @@ -4442,8 +4453,27 @@ int vfio_container_ioctl(AddressSpace *as, int32_t 
>> groupid,
>>  switch (req) {
>>  case VFIO_CHECK_EXTENSION:
>>  case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
>> -case VFIO_EEH_PE_OP:
>>  break;
>> +case VFIO_EEH_PE_OP: {
>> +VFIODevice *vdev;
>> +struct vfio_eeh_pe_op *arg = (struct vfio_eeh_pe_op *)param;
>> +
>> +switch (arg->op) {
>> +case VFIO_EEH_PE_RESET_HOT:
>> +case VFIO_EEH_PE_RESET_FUNDAMENTAL:
>> +/*
>> + * The MSIx table will be cleaned out by reset. We need
>> + * disable it so that it can be reenabled properly. Also,
>> + * the cached MSIx table should be cleared as it's not
>> + * reflecting the contents in hardware.
>> + */
>> +QLIST_FOREACH(vdev, &group->device_list, next) {
>> +vfio_disable_and_reset_interrupts(vdev);
>> +}
>> +}
>> +
>> +break;
>> +}
>>  default:
>>  /* Return an error on unknown requests */
>>  error_report("vfio: unsupported ioctl %X", req);
>
>
>

Re: [Qemu-devel] [RFC V2 7/8] throttle: Add throttle group support

2014-09-02 Thread Eric Blake

On 08/13/2014 08:23 AM, Benoît Canet wrote:
> The throttle group support use a cooperative round robin scheduling algorithm.
> 
> The principle of the algorithm are simple:

s/are/is/

> - Each BDS of the group is used as a token in a circular way.
> - The active BDS compute if a wait must be done and arm the right timer.

s/compute/computes/
s/arm/arms/

> - If a wait must be done the token timer will be armed so the token will 
> become
>   the next active BDS.
> 
> Signed-off-by: Benoit Canet 
> ---

> +++ b/qapi/block-core.json
> @@ -886,6 +886,9 @@
>  #
>  # @iops_size: #optional an I/O size in bytes (Since 1.7)
>  #
> +#
> +# @group: #optional throttle group name (Since 2.2)

Extra blank comment line.

Interface seems okay; I haven't looked closely at the code.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 3/3] docs/qcow2: Limit refcount_order to [0, 6]

2014-09-02 Thread Eric Blake

On 09/02/2014 04:25 PM, Max Reitz wrote:
> Specify the upper limit of refcount_order to be 6 (that is,
> refcount_bits = 64). Any larger value does not make much sense when all
> offsets, sizes, cluster counts etc. "only" have a width of 64 bit as
> well, and very large values would be very difficult to support.
> Therefore, just cap it at the largest reasonable value.
> 
> Suggested-by: Eric Blake 
> Signed-off-by: Max Reitz 
> ---
>  docs/specs/qcow2.txt | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Eric Blake 

> 
> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index 0a878aa..121dfc8 100644
> --- a/docs/specs/qcow2.txt
> +++ b/docs/specs/qcow2.txt
> @@ -110,6 +110,7 @@ in the description of a field.
>  in bits: refcount_bits = 1 << refcount_order). For 
> version 2
>  images, the order is always assumed to be 4
>  (i.e. refcount_bits = 16).
> +This value may not exceed 6 (i.e. refcount_bits = 64).
>  
>  100 - 103:  header_length
>  Length of the header structure in bytes. For version 2
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 2/3] docs/qcow2: Correct refcount_block_entries

2014-09-02 Thread Eric Blake

On 09/02/2014 04:25 PM, Max Reitz wrote:
> A refblock entry may have a different size than 16 bits, it may even be
> smaller than a byte. Correct the refcount_block_entries calculation
> accordingly.
> 
> Signed-off-by: Max Reitz 
> ---
>  docs/specs/qcow2.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Eric Blake 

> 
> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index cfbc8b0..0a878aa 100644
> --- a/docs/specs/qcow2.txt
> +++ b/docs/specs/qcow2.txt
> @@ -183,7 +183,7 @@ blocks and are exactly one cluster in size.
>  Given a offset into the image file, the refcount of its cluster can be 
> obtained
>  as follows:
>  
> -refcount_block_entries = (cluster_size / sizeof(uint16_t))
> +refcount_block_entries = (cluster_size * 8 / refcount_bits)
>  
>  refcount_block_index = (offset / cluster_size) % refcount_block_entries
>  refcount_table_index = (offset / cluster_size) / refcount_block_entries
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v2 2/3] docs/qcow2: Correct refcount_block_entries

2014-09-02 Thread Max Reitz

A refblock entry may have a different size than 16 bits, it may even be
smaller than a byte. Correct the refcount_block_entries calculation
accordingly.

Signed-off-by: Max Reitz 
---
 docs/specs/qcow2.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index cfbc8b0..0a878aa 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -183,7 +183,7 @@ blocks and are exactly one cluster in size.
 Given a offset into the image file, the refcount of its cluster can be obtained
 as follows:
 
-refcount_block_entries = (cluster_size / sizeof(uint16_t))
+refcount_block_entries = (cluster_size * 8 / refcount_bits)
 
 refcount_block_index = (offset / cluster_size) % refcount_block_entries
 refcount_table_index = (offset / cluster_size) / refcount_block_entries
-- 
2.1.0

[Qemu-devel] [PATCH v2 0/3] qcow2: Drop REFCOUNT_SHIFT

2014-09-02 Thread Max Reitz

In preparation for qemu maybe actually supporting variable refcount
widths in the far future, drop the hardcoded REFCOUNT_SHIFT and instead
use the value as given by the image.

Also, the qcow2 documentation gave the width of a refcount block entry
as sizeof(uint16_t), which is wrong for any refcount order other than 4.
Fix that.

This is a follow-up to my "[PATCH v5 00/11] qcow2: Fix image repairing"
series and therefore depends on it.


v2:
- Patch 2: Fix wrongly fixed calculation [Benoît, Eric]
- Patch 3: Added [Eric]


git-backport-diff against v1:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/3:[] [--] 'qcow2: Drop REFCOUNT_SHIFT'
002/3:[0002] [FC] 'docs/qcow2: Correct refcount_block_entries'
003/3:[down] 'docs/qcow2: Limit refcount_order to [0, 6]'


Max Reitz (3):
  qcow2: Drop REFCOUNT_SHIFT
  docs/qcow2: Correct refcount_block_entries
  docs/qcow2: Limit refcount_order to [0, 6]

 block/qcow2-refcount.c | 32 ++--
 block/qcow2.c  |  2 +-
 block/qcow2.h  |  2 --
 docs/specs/qcow2.txt   |  3 ++-
 4 files changed, 17 insertions(+), 22 deletions(-)

-- 
2.1.0

[Qemu-devel] [PATCH v2 1/3] qcow2: Drop REFCOUNT_SHIFT

2014-09-02 Thread Max Reitz

With BDRVQcowState.refcount_block_bits, we don't need REFCOUNT_SHIFT
anymore.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
---
 block/qcow2-refcount.c | 32 ++--
 block/qcow2.c  |  2 +-
 block/qcow2.h  |  2 --
 3 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 29136ee..cd6f5a0 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -102,7 +102,7 @@ static int get_refcount(BlockDriverState *bs, int64_t 
cluster_index)
 uint16_t *refcount_block;
 uint16_t refcount;
 
-refcount_table_index = cluster_index >> (s->cluster_bits - REFCOUNT_SHIFT);
+refcount_table_index = cluster_index >> s->refcount_block_bits;
 if (refcount_table_index >= s->refcount_table_size)
 return 0;
 refcount_block_offset =
@@ -116,8 +116,7 @@ static int get_refcount(BlockDriverState *bs, int64_t 
cluster_index)
 return ret;
 }
 
-block_index = cluster_index &
-((1 << (s->cluster_bits - REFCOUNT_SHIFT)) - 1);
+block_index = cluster_index & (s->refcount_block_size - 1);
 refcount = be16_to_cpu(refcount_block[block_index]);
 
 ret = qcow2_cache_put(bs, s->refcount_block_cache,
@@ -152,8 +151,8 @@ static unsigned int next_refcount_table_size(BDRVQcowState 
*s,
 static int in_same_refcount_block(BDRVQcowState *s, uint64_t offset_a,
 uint64_t offset_b)
 {
-uint64_t block_a = offset_a >> (2 * s->cluster_bits - REFCOUNT_SHIFT);
-uint64_t block_b = offset_b >> (2 * s->cluster_bits - REFCOUNT_SHIFT);
+uint64_t block_a = offset_a >> (s->cluster_bits + s->refcount_block_bits);
+uint64_t block_b = offset_b >> (s->cluster_bits + s->refcount_block_bits);
 
 return (block_a == block_b);
 }
@@ -174,7 +173,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
 BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_ALLOC);
 
 /* Find the refcount block for the given cluster */
-refcount_table_index = cluster_index >> (s->cluster_bits - REFCOUNT_SHIFT);
+refcount_table_index = cluster_index >> s->refcount_block_bits;
 
 if (refcount_table_index < s->refcount_table_size) {
 
@@ -243,7 +242,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
 
 /* The block describes itself, need to update the cache */
 int block_index = (new_block >> s->cluster_bits) &
-((1 << (s->cluster_bits - REFCOUNT_SHIFT)) - 1);
+(s->refcount_block_size - 1);
 (*refcount_block)[block_index] = cpu_to_be16(1);
 } else {
 /* Described somewhere else. This can recurse at most twice before we
@@ -315,8 +314,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
 BLKDBG_EVENT(bs->file, BLKDBG_REFTABLE_GROW);
 
 /* Calculate the number of refcount blocks needed so far */
-uint64_t refcount_block_clusters = 1 << (s->cluster_bits - REFCOUNT_SHIFT);
-uint64_t blocks_used = DIV_ROUND_UP(cluster_index, 
refcount_block_clusters);
+uint64_t blocks_used = DIV_ROUND_UP(cluster_index, s->refcount_block_size);
 
 if (blocks_used > QCOW_MAX_REFTABLE_SIZE / sizeof(uint64_t)) {
 return -EFBIG;
@@ -330,14 +328,14 @@ static int alloc_refcount_block(BlockDriverState *bs,
 uint64_t table_clusters =
 size_to_clusters(s, table_size * sizeof(uint64_t));
 blocks_clusters = 1 +
-((table_clusters + refcount_block_clusters - 1)
-/ refcount_block_clusters);
+((table_clusters + s->refcount_block_size - 1)
+/ s->refcount_block_size);
 uint64_t meta_clusters = table_clusters + blocks_clusters;
 
 last_table_size = table_size;
 table_size = next_refcount_table_size(s, blocks_used +
-((meta_clusters + refcount_block_clusters - 1)
-/ refcount_block_clusters));
+((meta_clusters + s->refcount_block_size - 1)
+/ s->refcount_block_size));
 
 } while (last_table_size != table_size);
 
@@ -347,7 +345,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
 #endif
 
 /* Create the new refcount table and blocks */
-uint64_t meta_offset = (blocks_used * refcount_block_clusters) *
+uint64_t meta_offset = (blocks_used * s->refcount_block_size) *
 s->cluster_size;
 uint64_t table_offset = meta_offset + blocks_clusters * s->cluster_size;
 uint64_t *new_table = g_try_new0(uint64_t, table_size);
@@ -546,8 +544,7 @@ static int QEMU_WARN_UNUSED_RESULT 
update_refcount(BlockDriverState *bs,
 {
 int block_index, refcount;
 int64_t cluster_index = cluster_offset >> s->cluster_bits;
-int64_t table_index =
-cluster_index >> (s->cluster_bits - REFCOUNT_SHIFT);
+int64_t table_index = cluster_index >> s->refcount_block_bits;
 
 /* Load the refcount block and allocate it if needed */
 if (table_index != old_table_index) {
@@ -569,8 +566,7 @@ static int QEMU_WARN_UNUSED_RESULT 
u

[Qemu-devel] [PATCH v2 3/3] docs/qcow2: Limit refcount_order to [0, 6]

2014-09-02 Thread Max Reitz

Specify the upper limit of refcount_order to be 6 (that is,
refcount_bits = 64). Any larger value does not make much sense when all
offsets, sizes, cluster counts etc. "only" have a width of 64 bit as
well, and very large values would be very difficult to support.
Therefore, just cap it at the largest reasonable value.

Suggested-by: Eric Blake 
Signed-off-by: Max Reitz 
---
 docs/specs/qcow2.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 0a878aa..121dfc8 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -110,6 +110,7 @@ in the description of a field.
 in bits: refcount_bits = 1 << refcount_order). For version 
2
 images, the order is always assumed to be 4
 (i.e. refcount_bits = 16).
+This value may not exceed 6 (i.e. refcount_bits = 64).
 
 100 - 103:  header_length
 Length of the header structure in bytes. For version 2
-- 
2.1.0

Re: [Qemu-devel] [PATCH 0/8] add basic recovery logic to quorum driver

2014-09-02 Thread Benoît Canet

The Monday 01 Sep 2014 à 15:43:06 (+0800), Liu Yuan wrote :

Liu,

Do you think this could work with qcow2 file backed by NFS servers ?

Best regards

Benoît

> This patch set mainly add mainly two logics to implement device recover
> - notify qourum driver of the broken states from the child driver(s)
> - dirty track and sync the device after it is repaired
> 
> Thus quorum allow VMs to continue while some child devices are broken and when
> the child devices are repaired and return back, we sync dirty bits during
> downtime to keep data consistency.
> 
> The recovery logic is based on the driver state bitmap and will sync the dirty
> bits with a timeslice window in a coroutine in this prtimive implementation.
> 
> Simple graph about 2 children with threshold=1 and read-pattern=fifo:
> (similary to DRBD)
> 
> + denote device sync iteration
> - IO on a single device
> = IO on two devices
> 
>   sync complete, release dirty bitmap
>  ^
>  |
>   -++==
>  | |
>  | v
>  |   device repaired and begin to sync
>  v
>device broken, create a dirty bitmap
> 
>   This sync logic can take care of nested broken problem, that devices are
>   broken while in sync. We just start a sync process after the devices are
>   repaired again and switch the devices from broken to sound only when the 
> sync
>   completes.
> 
> For read-pattern=quorum mode, it enjoys the recovery logic without any 
> problem.
> 
> Todo:
> - use aio interface to sync data (multiple transfer in one go)
> - dynamic slice window to control sync bandwidth more smoothly
> - add auto-reconnection mechanism to other protocol (if not support yet)
> - add tests
> 
> Cc: Eric Blake 
> Cc: Benoit Canet 
> Cc: Kevin Wolf 
> Cc: Stefan Hajnoczi 
> 
> Liu Yuan (8):
>   block/quorum: initialize qcrs.aiocb for read
>   block: add driver operation callbacks
>   block/sheepdog: propagate disconnect/reconnect events to upper driver
>   block/quorum: add quorum_aio_release() helper
>   quorum: fix quorum_aio_cancel()
>   block/quorum: add broken state to BlockDriverState
>   block: add two helpers
>   quorum: add basic device recovery logic
> 
>  block.c   |  17 +++
>  block/quorum.c| 324 
> +-
>  block/sheepdog.c  |   9 ++
>  include/block/block.h |   9 ++
>  include/block/block_int.h |   6 +
>  trace-events  |   5 +
>  6 files changed, 336 insertions(+), 34 deletions(-)
> 
> -- 
> 1.9.1
>

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03

2014-09-02 Thread Andrey Korolyov

On Wed, Sep 3, 2014 at 2:09 AM, Andrey Korolyov  wrote:
> On Wed, Sep 3, 2014 at 1:51 AM, Michael S. Tsirkin  wrote:
>> On Wed, Sep 03, 2014 at 01:29:29AM +0400, Andrey Korolyov wrote:
>>> On Wed, Sep 3, 2014 at 1:03 AM, Michael S. Tsirkin  wrote:
>>> >> bad one is the
>>> >>
>>> >> Author: Jason Wang 
>>> >> Date:   Tue Sep 2 18:07:46 2014 +0300
>>> >>
>>> >> vhost_net: start/stop guest notifiers properly
>>> >
>>> >
>>> >
>>> > upstream has this (pull request sent today):
>>> > vhost_net: cleanup start/stop condition
>>> >
>>> > Could you apply it and see if it helps please?
>>> >
>>> > Michael, if it helps it should be before start/stop guest notifiers
>>> > ideally to avoid bisect problems.
>>>
>>> It is already applied as shown from the list in the previous message
>>> (there are some aio fixes too on top of 2.1 I picked before but they
>>> should not impact vhost-net interaction in any mean). The symptoms are
>>> a bit interesting - VM crashes only at PCI device initalization (e.g.
>>> grub stage after reset and initrd unpacking are passing well, but then
>>> things getting ugly). I am running 3.14 guest i686-pae kernel from
>>> debian backports in guest, so it may be version-specific after all. If
>>> it`ll be hard to reproduce, I can try 64bit, expecting same behavior.
>>> Please find args in attached file.
>>
>>
>>
>> ok just to make sure - which tree do I clone exactly?
>>
>
> https://github.com/mdroth/qemu.git stable-2.1-staging showing same
> behavior for me with those patches

Forgot to mention important detail - I am playing with -mq now, so
actually virtio-net working in a bit different way than it may
expected (it also shown in args list from above, but someone may miss
it):
...
qemu-system-x86_64: unable to start vhost net: 95: falling back on
userspace virtio
qemu-system-x86_64: unable to start vhost net: 95: falling back on
userspace virtio
...

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03

2014-09-02 Thread Andrey Korolyov

On Wed, Sep 3, 2014 at 1:51 AM, Michael S. Tsirkin  wrote:
> On Wed, Sep 03, 2014 at 01:29:29AM +0400, Andrey Korolyov wrote:
>> On Wed, Sep 3, 2014 at 1:03 AM, Michael S. Tsirkin  wrote:
>> >> bad one is the
>> >>
>> >> Author: Jason Wang 
>> >> Date:   Tue Sep 2 18:07:46 2014 +0300
>> >>
>> >> vhost_net: start/stop guest notifiers properly
>> >
>> >
>> >
>> > upstream has this (pull request sent today):
>> > vhost_net: cleanup start/stop condition
>> >
>> > Could you apply it and see if it helps please?
>> >
>> > Michael, if it helps it should be before start/stop guest notifiers
>> > ideally to avoid bisect problems.
>>
>> It is already applied as shown from the list in the previous message
>> (there are some aio fixes too on top of 2.1 I picked before but they
>> should not impact vhost-net interaction in any mean). The symptoms are
>> a bit interesting - VM crashes only at PCI device initalization (e.g.
>> grub stage after reset and initrd unpacking are passing well, but then
>> things getting ugly). I am running 3.14 guest i686-pae kernel from
>> debian backports in guest, so it may be version-specific after all. If
>> it`ll be hard to reproduce, I can try 64bit, expecting same behavior.
>> Please find args in attached file.
>
>
>
> ok just to make sure - which tree do I clone exactly?
>

https://github.com/mdroth/qemu.git stable-2.1-staging showing same
behavior for me with those patches

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03

2014-09-02 Thread Michael Roth

Quoting Andrey Korolyov (2014-09-02 16:29:29)
> On Wed, Sep 3, 2014 at 1:03 AM, Michael S. Tsirkin  wrote:
> >> bad one is the
> >>
> >> Author: Jason Wang 
> >> Date:   Tue Sep 2 18:07:46 2014 +0300
> >>
> >> vhost_net: start/stop guest notifiers properly
> >
> >
> >
> > upstream has this (pull request sent today):
> > vhost_net: cleanup start/stop condition
> >
> > Could you apply it and see if it helps please?
> >
> > Michael, if it helps it should be before start/stop guest notifiers
> > ideally to avoid bisect problems.
> 
> It is already applied as shown from the list in the previous message
> (there are some aio fixes too on top of 2.1 I picked before but they
> should not impact vhost-net interaction in any mean). The symptoms are

I also had it applied. Not sure what the main difference is with our
setups, but can't seem to reproduce it on my end. Also tried FC18 64-bit.

I have only 2.1.1 + the 4 commits mentioned in my previous email however,
so it may be worth retrying your test with only those applied as a sanity
check, or even just testing Michael's tree directly to confirm that the
right commit.

> a bit interesting - VM crashes only at PCI device initalization (e.g.
> grub stage after reset and initrd unpacking are passing well, but then
> things getting ugly). I am running 3.14 guest i686-pae kernel from
> debian backports in guest, so it may be version-specific after all. If
> it`ll be hard to reproduce, I can try 64bit, expecting same behavior.
> Please find args in attached file.

Re: [Qemu-devel] [PATCH v13 6/6] qcow2: Add full preallocation option

2014-09-02 Thread Max Reitz


On 29.08.2014 10:33, Hu Tao wrote:

preallocation=full allocates disk space by fallocating the space if
posix_fallocate() is available, otherwise by writing zeros to disk to
ensure disk space in any cases.

Signed-off-by: Hu Tao 
---
  block/qcow2.c  | 61 +++---
  qemu-doc.texi  |  7 +++---
  qemu-img.texi  |  7 +++---
  tests/qemu-iotests/082.out | 54 
  4 files changed, 87 insertions(+), 42 deletions(-)


Reviewed-by: Max Reitz

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03

2014-09-02 Thread Michael S. Tsirkin

On Wed, Sep 03, 2014 at 01:29:29AM +0400, Andrey Korolyov wrote:
> On Wed, Sep 3, 2014 at 1:03 AM, Michael S. Tsirkin  wrote:
> >> bad one is the
> >>
> >> Author: Jason Wang 
> >> Date:   Tue Sep 2 18:07:46 2014 +0300
> >>
> >> vhost_net: start/stop guest notifiers properly
> >
> >
> >
> > upstream has this (pull request sent today):
> > vhost_net: cleanup start/stop condition
> >
> > Could you apply it and see if it helps please?
> >
> > Michael, if it helps it should be before start/stop guest notifiers
> > ideally to avoid bisect problems.
> 
> It is already applied as shown from the list in the previous message
> (there are some aio fixes too on top of 2.1 I picked before but they
> should not impact vhost-net interaction in any mean). The symptoms are
> a bit interesting - VM crashes only at PCI device initalization (e.g.
> grub stage after reset and initrd unpacking are passing well, but then
> things getting ugly). I am running 3.14 guest i686-pae kernel from
> debian backports in guest, so it may be version-specific after all. If
> it`ll be hard to reproduce, I can try 64bit, expecting same behavior.
> Please find args in attached file.



ok just to make sure - which tree do I clone exactly?

Re: [Qemu-devel] [PATCH v13 4/6] qapi: introduce PreallocMode and a new PreallocMode full.

2014-09-02 Thread Eric Blake

On 08/29/2014 02:33 AM, Hu Tao wrote:
> This patch prepares for the subsequent patches.
> 
> Signed-off-by: Hu Tao 
> ---
>  block/qcow2.c  | 23 +++
>  qapi/block-core.json   | 16 
>  tests/qemu-iotests/049.out |  2 +-
>  3 files changed, 32 insertions(+), 9 deletions(-)
> 

> @@ -1958,6 +1958,13 @@ static int qcow2_create(const char *filename, QemuOpts 
> *opts, Error **errp)
>  flags |= BLOCK_FLAG_LAZY_REFCOUNTS;
>  }
>  
> +if (prealloc && prealloc != PREALLOC_MODE_METADATA) {

I find it a bit awkward that you are checking for PREALLOC_MODE_OFF
implicitly ('prealloc &&') vs. checking for prealloc mode METADATA
explicitly.  Since there are only three modes, would it be any simpler
to just have written:

if (prealloc == PREALLOC_MODE_FULL) {

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v13 5/6] raw-posix: Add full preallocation option

2014-09-02 Thread Max Reitz


On 29.08.2014 10:33, Hu Tao wrote:

This patch adds a new option preallocation for raw format, and implements
full preallocation.

Signed-off-by: Hu Tao 
---
  block/raw-posix.c | 92 +++
  qemu-doc.texi |  8 +
  qemu-img.texi |  8 +
  3 files changed, 88 insertions(+), 20 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index abe0759..25f66f2 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -30,6 +30,7 @@
  #include "block/thread-pool.h"
  #include "qemu/iov.h"
  #include "raw-aio.h"
+#include "qapi/util.h"
  
  #if defined(__APPLE__) && (__MACH__)

  #include 
@@ -1365,6 +1366,9 @@ static int raw_create(const char *filename, QemuOpts 
*opts, Error **errp)
  int result = 0;
  int64_t total_size = 0;
  bool nocow = false;
+PreallocMode prealloc = PREALLOC_MODE_OFF;
+char *buf = NULL;
+Error *local_err = NULL;
  
  strstart(filename, "file:", &filename);
  
@@ -1372,37 +1376,80 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)

  total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
  nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
+buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
+prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
+   PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
+   &local_err);
+g_free(buf);
+if (local_err) {
+error_propagate(errp, local_err);
+result = -EINVAL;
+goto out;
+}
  
  fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,

 0644);
  if (fd < 0) {
  result = -errno;
  error_setg_errno(errp, -result, "Could not create file");
-} else {
-if (nocow) {
+goto out;
+}
+
+if (nocow) {
  #ifdef __linux__
-/* Set NOCOW flag to solve performance issue on fs like btrfs.
- * This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
- * will be ignored since any failure of this operation should not
- * block the left work.
- */
-int attr;
-if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
-attr |= FS_NOCOW_FL;
-ioctl(fd, FS_IOC_SETFLAGS, &attr);
-}
-#endif
+/* Set NOCOW flag to solve performance issue on fs like btrfs.
+ * This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
+ * will be ignored since any failure of this operation should not
+ * block the left work.
+ */
+int attr;
+if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
+attr |= FS_NOCOW_FL;
+ioctl(fd, FS_IOC_SETFLAGS, &attr);
  }
+#endif
+}
  
-if (ftruncate(fd, total_size) != 0) {

-result = -errno;
-error_setg_errno(errp, -result, "Could not resize file");
-}
-if (qemu_close(fd) != 0) {
-result = -errno;
-error_setg_errno(errp, -result, "Could not close the new file");
+if (ftruncate(fd, total_size) != 0) {
+result = -errno;
+error_setg_errno(errp, -result, "Could not resize file");
+goto out_close;
+}
+
+if (prealloc == PREALLOC_MODE_FULL) {
+/* posix_fallocate() doesn't set errno. */
+result = -posix_fallocate(fd, 0, total_size);
+if (result != 0) {
+buf = g_malloc0(65536);
+int64_t num = 0, left = total_size;
+
+while (left > 0) {
+num = MIN(left, 65536);
+result = write(fd, buf, num);
+if (result < 0) {
+result = -errno;
+error_setg_errno(errp, -result,
+ "Could not write to the new file");
+g_free(buf);
+goto out_close;
+}
+left -= num;
+}
+fsync(fd);
+g_free(buf);
  }
+} else if (prealloc != PREALLOC_MODE_OFF) {
+result = -1;


As for qcow2 in patch 4, I'd prefer -EINVAL.


+error_setg(errp, "Unsupported preallocation mode: %s",
+   PreallocMode_lookup[prealloc]);
+}
+
+out_close:
+if (qemu_close(fd) != 0 && result == 0) {
+result = -errno;
+error_setg_errno(errp, -result, "Could not close the new file");
  }
+out:
  return result;
  }
  
@@ -1585,6 +1632,11 @@ static QemuOptsList raw_create_opts = {

  .type = QEMU_OPT_BOOL,
  .help = "Turn off copy-on-write (valid only on btrfs)"
  },
+{
+.name = BLOCK_OPT_PREALLOC,
+.type = QEMU_OPT_STRING,
+.help = "Preallocation mode (allowed values: off, full)"
+},
  { /* end of list */ }
  }
  }

Re: [Qemu-devel] [PATCH v13 4/6] qapi: introduce PreallocMode and a new PreallocMode full.

2014-09-02 Thread Max Reitz


On 29.08.2014 10:33, Hu Tao wrote:

This patch prepares for the subsequent patches.

Signed-off-by: Hu Tao 
---
  block/qcow2.c  | 23 +++
  qapi/block-core.json   | 16 
  tests/qemu-iotests/049.out |  2 +-
  3 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index cf27c3f..95fb240 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -30,6 +30,7 @@
  #include "qemu/error-report.h"
  #include "qapi/qmp/qerror.h"
  #include "qapi/qmp/qbool.h"
+#include "qapi/util.h"
  #include "trace.h"
  #include "qemu/option_int.h"
  
@@ -1738,7 +1739,7 @@ static int preallocate(BlockDriverState *bs)
  
  static int qcow2_create2(const char *filename, int64_t total_size,

   const char *backing_file, const char *backing_format,
- int flags, size_t cluster_size, int prealloc,
+ int flags, size_t cluster_size, PreallocMode prealloc,
   QemuOpts *opts, int version,
   Error **errp)
  {
@@ -1915,7 +1916,7 @@ static int qcow2_create(const char *filename, QemuOpts 
*opts, Error **errp)
  uint64_t size = 0;
  int flags = 0;
  size_t cluster_size = DEFAULT_CLUSTER_SIZE;
-int prealloc = 0;
+PreallocMode prealloc = PREALLOC_MODE_OFF;
  int version = 3;
  Error *local_err = NULL;
  int ret;
@@ -1931,12 +1932,11 @@ static int qcow2_create(const char *filename, QemuOpts 
*opts, Error **errp)
  cluster_size = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
   DEFAULT_CLUSTER_SIZE);
  buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
-if (!buf || !strcmp(buf, "off")) {
-prealloc = 0;
-} else if (!strcmp(buf, "metadata")) {
-prealloc = 1;
-} else {
-error_setg(errp, "Invalid preallocation mode: '%s'", buf);
+prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
+   PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
+   &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
  ret = -EINVAL;
  goto finish;
  }
@@ -1958,6 +1958,13 @@ static int qcow2_create(const char *filename, QemuOpts 
*opts, Error **errp)
  flags |= BLOCK_FLAG_LAZY_REFCOUNTS;
  }
  
+if (prealloc && prealloc != PREALLOC_MODE_METADATA) {

+ret = -1;


Since the return value is expected to be -errno, I'd propose "ret = 
-EINVAL;" here. With that fixed (or a good explanation why we want -1 here):


Reviewed-by: Max Reitz 


+error_setg(errp, "Unsupported preallocate mode: %s",
+   PreallocMode_lookup[prealloc]);
+goto finish;
+}
+
  if (backing_file && prealloc) {
  error_setg(errp, "Backing file and preallocation cannot be used at "
 "the same time");
diff --git a/qapi/block-core.json b/qapi/block-core.json
index fb74c56..543b00b 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1679,3 +1679,19 @@
  'len'   : 'int',
  'offset': 'int',
  'speed' : 'int' } }
+
+# @PreallocMode
+#
+# Preallocation mode of QEMU image file
+#
+# @off: no preallocation
+# @metadata: preallocate only for metadata
+# @full: preallocate all data by calling posix_fallocate() if it is
+#available, otherwise by writing zeros to device to ensure disk
+#space is really available. @full preallocation also sets up
+#metadata correctly.
+#
+# Since 2.2
+##
+{ 'enum': 'PreallocMode',
+  'data': [ 'off', 'metadata', 'full' ] }
diff --git a/tests/qemu-iotests/049.out b/tests/qemu-iotests/049.out
index 71ca44d..09ca0ae 100644
--- a/tests/qemu-iotests/049.out
+++ b/tests/qemu-iotests/049.out
@@ -179,7 +179,7 @@ qemu-img create -f qcow2 -o preallocation=metadata 
TEST_DIR/t.qcow2 64M
  Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 encryption=off 
cluster_size=65536 preallocation='metadata' lazy_refcounts=off
  
  qemu-img create -f qcow2 -o preallocation=1234 TEST_DIR/t.qcow2 64M

-qemu-img: TEST_DIR/t.qcow2: Invalid preallocation mode: '1234'
+qemu-img: TEST_DIR/t.qcow2: invalid parameter value: 1234
  Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 encryption=off 
cluster_size=65536 preallocation='1234' lazy_refcounts=off
  
  == Check encryption option ==

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03

2014-09-02 Thread Andrey Korolyov

On Wed, Sep 3, 2014 at 1:03 AM, Michael S. Tsirkin  wrote:
>> bad one is the
>>
>> Author: Jason Wang 
>> Date:   Tue Sep 2 18:07:46 2014 +0300
>>
>> vhost_net: start/stop guest notifiers properly
>
>
>
> upstream has this (pull request sent today):
> vhost_net: cleanup start/stop condition
>
> Could you apply it and see if it helps please?
>
> Michael, if it helps it should be before start/stop guest notifiers
> ideally to avoid bisect problems.

It is already applied as shown from the list in the previous message
(there are some aio fixes too on top of 2.1 I picked before but they
should not impact vhost-net interaction in any mean). The symptoms are
a bit interesting - VM crashes only at PCI device initalization (e.g.
grub stage after reset and initrd unpacking are passing well, but then
things getting ugly). I am running 3.14 guest i686-pae kernel from
debian backports in guest, so it may be version-specific after all. If
it`ll be hard to reproduce, I can try 64bit, expecting same behavior.
Please find args in attached file.

launchstring.txt.gz
Description: GNU Zip compressed data

Re: [Qemu-devel] [PATCH v13 3/6] rename parse_enum_option to qapi_enum_parse and make it public

2014-09-02 Thread Max Reitz


On 29.08.2014 10:33, Hu Tao wrote:

From: Peter Lieven 

relaxing the license to LGPLv2+ is intentional.

Suggested-by: Markus Armbruster 
Signed-off-by: Hu Tao 
Signed-off-by: Peter Lieven 
Reviewed-by: Eric Blake 
Reviewed-by: Benoit Canet 
---
  blockdev.c  | 30 ++
  include/qapi/util.h | 17 +
  qapi/Makefile.objs  |  1 +
  qapi/qapi-util.c| 34 ++
  4 files changed, 58 insertions(+), 24 deletions(-)
  create mode 100644 include/qapi/util.h
  create mode 100644 qapi/qapi-util.c


Seems pretty much unchanged (except for the author note), so:

Reviewed-by: Max Reitz

Re: [Qemu-devel] [PATCH v13 2/6] block: don't convert file size to sector size

2014-09-02 Thread Max Reitz


On 29.08.2014 10:33, Hu Tao wrote:

and avoid converting it back later.

Signed-off-by: Hu Tao 
---
  block/gluster.c   |  9 -
  block/qcow.c  |  8 
  block/qcow2.c | 10 +-
  block/raw-posix.c |  6 +++---
  block/raw-win32.c |  6 +++---
  5 files changed, 19 insertions(+), 20 deletions(-)


Reviewed-by: Max Reitz

Re: [Qemu-devel] [PATCH v13 1/6] block: round up file size to nearest sector

2014-09-02 Thread Max Reitz


On 29.08.2014 10:33, Hu Tao wrote:

Signed-off-by: Hu Tao 
---
  block/archipelago.c  |  3 ++-
  block/cow.c  |  3 ++-
  block/gluster.c  |  4 +--
  block/iscsi.c|  4 +--
  block/nfs.c  |  3 ++-
  block/qcow.c |  3 ++-
  block/qcow2.c|  3 ++-
  block/qed.c  |  3 ++-
  block/raw-posix.c|  8 +++---
  block/raw-win32.c|  4 +--
  block/rbd.c  |  3 ++-
  block/sheepdog.c |  3 ++-
  block/ssh.c  |  3 ++-
  block/vdi.c  |  3 ++-
  block/vhdx.c |  3 ++-
  block/vmdk.c |  3 ++-
  block/vpc.c  |  3 ++-
  tests/qemu-iotests/104   | 57 
  tests/qemu-iotests/104.out   | 12 +
  tests/qemu-iotests/common.filter | 21 +++
  tests/qemu-iotests/group |  1 +
  21 files changed, 127 insertions(+), 23 deletions(-)
  create mode 100755 tests/qemu-iotests/104
  create mode 100644 tests/qemu-iotests/104.out


Reviewed-by: Max Reitz

Re: [Qemu-devel] [PATCH v5 06/10] hw/vfio: create common module

2014-09-02 Thread Alex Williamson

On Tue, 2014-09-02 at 09:13 +0200, Eric Auger wrote:
> On 09/01/2014 07:41 PM, Alexander Graf wrote:
> > 
> > 
> >> Am 01.09.2014 um 18:31 schrieb Eric Auger :
> >>
> >>> On 08/13/2014 09:59 PM, Alex Williamson wrote:
>  On Tue, 2014-08-12 at 08:09 +0200, Eric Auger wrote:
> > On 08/11/2014 09:25 PM, Alex Williamson wrote:
> >> On Sat, 2014-08-09 at 15:25 +0100, Eric Auger wrote:
> >> diff --git a/include/hw/vfio/vfio-common.h 
> >> b/include/hw/vfio/vfio-common.h
> >> new file mode 100644
> >> index 000..4684ee5
> >> --- /dev/null
> >> +++ b/include/hw/vfio/vfio-common.h
> >> @@ -0,0 +1,151 @@
> >> +/*
> >> + * common header for vfio based device assignment support
> >> + *
> >> + * Copyright Red Hat, Inc. 2012
> >> + *
> >> + * Authors:
> >> + *  Alex Williamson 
> >> + *
> >> + * This work is licensed under the terms of the GNU GPL, version 2.  
> >> See
> >> + * the COPYING file in the top-level directory.
> >> + *
> >> + * Based on qemu-kvm device-assignment:
> >> + *  Adapted for KVM by Qumranet.
> >> + *  Copyright (c) 2007, Neocleus, Alex Novik (a...@neocleus.com)
> >> + *  Copyright (c) 2007, Neocleus, Guy Zana (g...@neocleus.com)
> >> + *  Copyright (C) 2008, Qumranet, Amit Shah (amit.s...@qumranet.com)
> >> + *  Copyright (C) 2008, Red Hat, Amit Shah (amit.s...@redhat.com)
> >> + *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (m...@il.ibm.com)
> >> + */
> >> +#ifndef HW_VFIO_VFIO_COMMON_H
> >> +#define HW_VFIO_VFIO_COMMON_H
> >> +
> >> +#include "qemu-common.h"
> >> +#include "exec/address-spaces.h"
> >> +#include "exec/memory.h"
> >> +#include "qemu/queue.h"
> >> +#include "qemu/notify.h"
> >> +
> >> +/*#define DEBUG_VFIO*/
> >> +#ifdef DEBUG_VFIO
> >> +#define DPRINTF(fmt, ...) \
> >> +do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> >> +#else
> >> +#define DPRINTF(fmt, ...) \
> >> +do { } while (0)
> >> +#endif
> >
> >
> > DPRINTF also need to be renamed to avoid conflicting namespace issues.
>  Ji Alex,
> 
>  OK.
> 
>  As I am going to touch at traces,
>  - are you OK if I use the new .name field to simply format strings?
> >>>
> >>> Sure, that's fine.
> >>>
> DPRINTF("%s(%04x:%02x:%02x.%x) Pin %c\n", __func__, vdev->host.domain,
> vdev->host.bus, vdev->host.slot, vdev->host.function,
> 'A' + vdev->intx.pin);
>  - Also Alex was suggesting to use trace points. What is your position
>  about that? Also I am not 100% sure of what it consists in? is it trace
>  events as documented in docs/tracing.txt
> >>>
> >>> I think it would be a great conversion, but it's not required.  Thanks,
> >>
> >> Hi Alex,
> >>
> >> I am currently progressing on the conversion to trace points (I did it
> >> for platform and common and now do the job for PCI). I wonder whether it
> >> makes sense I convert all DPRINTF into trace-points or only convert a
> >> subset (state transitions, ...). Would you accept a mixture of DPRINTFs
> >> and trace-points or do you advise to convert everything?
> > 
> > Yeah, it's perfectly good to even just nit introduce new dprintfs.
> ok thanks
> > 
> >>
> >> Also the tracing.txt doc says we should use the name of the function as
> >> prefix. That being said it could be interesting to trace all pci* or all
> >> platform* and wildcard seems to work fine to select the trace-events. So
> >> my second question is would you accept using pci__* as a
> >> generic pattern.
> > 
> > Not sure - maybe be more explicit and call it vfio_pci_...?
> well. maybe as a first draft I will follow the tracing.txt guideline and
> you will tell me, both Alex's, what you think of the outcome. Anyway it
> is not a big deal then to change ...

I haven't touched tracing yet, so I'll defer to you and agraf for now ;)
Thanks,

Alex

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03

2014-09-02 Thread Michael S. Tsirkin

On Tue, Sep 02, 2014 at 09:33:13PM +0400, Andrey Korolyov wrote:
> On Tue, Sep 2, 2014 at 7:27 PM, Michael S. Tsirkin  wrote:
> > On Tue, Sep 02, 2014 at 06:25:46PM +0300, Michael S. Tsirkin wrote:
> >> On Tue, Sep 02, 2014 at 10:20:50AM -0500, Michael Roth wrote:
> >> > Quoting Michael Roth (2014-08-27 12:35:57)
> >> > > Hi everyone,
> >> > >
> >> > > The following new patches are queued for QEMU stable v2.1.1:
> >> > >
> >> > >   https://github.com/mdroth/qemu/commits/stable-2.1-staging
> >> >
> >> > As of now the following additional patches have been applied to
> >> > the 2.1.1 staging tree (see stable commit for upstream commit ref):
> >> >
> >> >   a7f9ea2 qxl-render: add more sanity checks
> >> >   1511595 target-arm: Correct Cortex-A57 ISAR5 and AA64ISAR0 ID register 
> >> > values
> >> >   44a1530 target-arm: Fix regression that disabled VFP for ARMv5 CPUs
> >> >   95dcaa4 x86: Clear MTRRs on vCPU reset
> >> >   7fd25d3 x86: kvm: Add MTRR support for kvm_get|put_msrs()
> >> >   0f9c63b x86: Use common variable range MTRR counts
> >> >   0c69595 target-i386: Don't forbid NX bit on PAE PDEs and PTEs
> >> >   89713cb vl: process -object after other backend options
> >> >   93dcbd3 spapr_pci: map the MSI window in each PHB
> >> >
> >> > The following patches have been Cc'd to qemu-stable and are still
> >> > pending upstream commit/merge:
> >> >
> >> >   [PATCH] virtio-net: don't run bh on vm stopped (Michael S. Tsirkin)
> >> > * pull just sent by Michael Tsirkin
> >> >   [PATCH] net: prevent sending packets while guest is stopped (Stefan 
> >> > Hajnoczi)
> >> >   [PATCH v2 2/4] pci: Avoid losing config updates to MSI/MSIX cap regs 
> >> > (Knut Omang)
> >>
> >>   this is also in that pull
> >>
> >> >   [PATCH V4] net: Forbid dealing with packets when VM is not running 
> >> > (zhanghailiang)
> >> > * looks like Stefan has this queued for net
> >> >   [PATCH V2] acpi-build: Set FORCE_APIC_CLUSTER_MODEL bit for FADT flags 
> >> > (zhanghailiang)
> >>
> >>   hmm I forgot to include that one. I'll redo the pull request.
> >
> > or maybe not
> > I recalled why I didn't include it - I wanted someone else to review it
> > first
> >
> > if it's ready, I'll send it separately tomorrow
> >
> >
> >> >   [RESEND v2 0/3] Fix some memory leaks about query memdev (Chen Fan)
> >> > * looks like Michael Tokarev has this queued for trivial
> >> >   [PATCH] target-i386: Support migratable=no properly (Eduardo Habkost)
> >> > * pinged
> >> >   [PATCH 1/3] pc: Fix disabling of vapic for compat PC models (Jan 
> >> > Kiszka)
> >> > * pinged
> >> >   [PATCH] pty: Fix byte loss bug when connecting to pty (Sebastian 
> >> > Tanase)
> >> > * pinged, presumably going through Gerd's tree
> >> >
> >> > Let me know if anything is missing.
> >> >
> >> > >
> >> > > The release is planned for 2014-09-08:
> >> > >
> >> > >   http://wiki.qemu.org/Planning/2.1
> >> > >
> >> > > Please respond here or CC qemu-sta...@nongnu.org on any patches you
> >> > > think should be included in the release.
> >> > >
> >> > > Testing/feedback is greatly appreciated.
> >> > >
> >> > > Thanks!
> >> > >
> >> > > 
> >> > > Alex Williamson (1):
> >> > >   vfio: Fix MSI-X vector expansion
> >> > >
> >> > > Ben Draper (1):
> >> > >   vmxnet3: Pad short frames to minimum size (60 bytes)
> >> > >
> >> > > Christoffer Dall (2):
> >> > >   target-arm: Rename QEMU PSCI v0.1 definitions
> >> > >   arm/virt: Use PSCI v0.2 function IDs in the DT when KVM uses 
> >> > > PSCI v0.2
> >> > >
> >> > > Fam Zheng (1):
> >> > >   blkdebug: Delete BH in bdrv_aio_cancel
> >> > >
> >> > > Gonglei (1):
> >> > >   pcihp: fix possible array out of bounds
> >> > >
> >> > > Hu Tao (3):
> >> > >   hw:i386: typo fix: MEMORY_HOPTLUG_DEVICE -> MEMORY_HOTPLUG_DEVICE
> >> > >   pc-dimm: validate node property
> >> > >   numa: show hex number in error message for consistency and 
> >> > > prefix them with 0x
> >> > >
> >> > > Jan Kiszka (1):
> >> > >   pci: Use bus master address space for delivering MSI/MSI-X 
> >> > > messages
> >> > >
> >> > > Michael S. Tsirkin (4):
> >> > >   pc-dimm: fix up error message
> >> > >   acpi: align RSDP
> >> > >   hostmem: set MPOL_MF_MOVE
> >> > >   pc: reserve more memory for ACPI for new machine types
> >> > >
> >> > > Michael Tokarev (2):
> >> > >   l2tpv3 (configure): it is linux-specific
> >> > >   ide: only constrain read/write requests to drive size, not other 
> >> > > types
> >> > >
> >> > > Peter Lieven (1):
> >> > >   block/iscsi: fix memory corruption on iscsi resize
> >> > >
> >> > > Peter Maydell (1):
> >> > >   target-arm: Fix return address for A64 BRK instructions
> >> > >
> >> > > Stefan Hajnoczi (6):
> >> > >   qmp: hide "hotplugged" device property from 
> >> > > device-list-properties
> >> > >   qdev-monitor: include QOM properties in -device FOO, help outpu

[Qemu-devel] QEMU correctness and performance testing

2014-09-02 Thread Xin Tong

where can i find most recent correctness and performance #s for QEMU ? are
there any organizations running these and making their data available
online ?

Thanks,
Xin

Re: [Qemu-devel] [PATCH v4 00/14] ivshmem: update documentation, add client/server tools

2014-09-02 Thread Eric Blake

On 09/02/2014 09:25 AM, David Marchand wrote:
> Here is a patchset containing an update on ivshmem specs documentation and
> importing ivshmem server and client tools.
> These tools have been written from scratch and are not related to what is
> available in nahanni repository.
> I put them in contrib/ directory as the qemu-doc.texi was already telling the
> server was supposed to be there.
> 
> Changes since v3:
> - first patch is untouched
> - just restored the Reviewed-By Claudio in second patch
> - following patches 3-8 take into account Stefan's comments
> - patches 9-12 take into account Gonglei's comments
> - patch 13 adjusts ivshmem-server default values
> - last patch introduces a change in the ivshmem client-server protocol to
>   check a protocol version at connect time

Rather than introducing new files with bugs, followed by patches to
clean it up, why not just introduce the new files correct in the first
place?  I think you are better off squashing in a lot of the cleanup
patches into patch 1.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 3/5] qemu-char: Convert udp backend to QAPI

2014-09-02 Thread Peter Maydell

On 2 September 2014 21:04, Eric Blake  wrote:
> On 09/02/2014 04:24 AM, Peter Maydell wrote:
>> +
>> +if (host == NULL || strlen(host) == 0) {
>> +host = "localhost";
>> +}
>> +if (port == NULL || strlen(port) == 0) {
>> +error_setg(errp, "chardev: udp: remote port not specified");
>
> In the common case of these strings being non-empty, you end up having
> to hunt for the end of the string only to then throw that information
> away.  Rather than 'strlen(foo) == 0)', it's slightly faster to check
> '*foo' for being a non-NUL byte.

I think that's a lot less clear to read, and in fact the
compiler is entirely capable of turning "strlen(x) == 0"
into "check whether *x is 0":

mnementh$ cat /tmp/zz9.c
#include 

int is_empty_string(const char *s)
{
return (strlen(s) == 0);
}
mnementh$ gcc -g -Wall -O2 -o /tmp/zz9.o -c /tmp/zz9.c
mnementh$ objdump --disassemble /tmp/zz9.o

/tmp/zz9.o: file format elf32-i386


Disassembly of section .text:

 :
   0:8b 44 24 04  mov0x4(%esp),%eax
   4:80 38 00 cmpb   $0x0,(%eax)
   7:0f 94 c0 sete   %al
   a:0f b6 c0 movzbl %al,%eax
   d:c3   ret

thanks
-- PMM

Re: [Qemu-devel] [PATCH v4 05/14] contrib/ivshmem-*: switch to QEMU headers

2014-09-02 Thread Eric Blake

On 09/02/2014 09:25 AM, David Marchand wrote:
> Reuse parsers from QEMU, C99 boolean.
> 
> Signed-off-by: David Marchand 
> ---
>  contrib/ivshmem-client/ivshmem-client.c |   12 +
>  contrib/ivshmem-client/ivshmem-client.h |4 +-
>  contrib/ivshmem-client/main.c   |   12 +
>  contrib/ivshmem-server/ivshmem-server.c |   14 +-
>  contrib/ivshmem-server/ivshmem-server.h |4 +-
>  contrib/ivshmem-server/main.c   |   73 
> +--
>  6 files changed, 20 insertions(+), 99 deletions(-)

Why introduce code in patch 1 only to rip it back out in patch 5?  Why
not just squash these together and just introduce the contrib file
correct on its first commit?


>  
>  case 'l': /* shm_size */
> -if (parse_size(optarg, &args->shm_size) < 0) {
> +parse_option_size("shm_size", optarg, &args->shm_size, &errp);
> +if (errp) {
> +error_free(errp);
>  fprintf(stderr, "cannot parse shm size\n");

It would be nicer to print the contents of errp, instead of discarding
what is likely to be a more specific error message.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v4 02/14] docs: update ivshmem device spec

2014-09-02 Thread Eric Blake

On 09/02/2014 09:25 AM, David Marchand wrote:
> Add some notes on the parts needed to use ivshmem devices: more specifically,
> explain the purpose of an ivshmem server and the basic concept to use the
> ivshmem devices in guests.
> Move some parts of the documentation and re-organise it.
> 
> Signed-off-by: David Marchand 
> Reviewed-by: Claudio Fontana 
> ---
>  docs/specs/ivshmem_device_spec.txt |  124 
> +++-
>  1 file changed, 93 insertions(+), 31 deletions(-)
> 

>  
> -The device currently supports 4 registers of 32-bits each.  Registers
> -are used for synchronization between guests sharing the same memory object 
> when
> -interrupts are supported (this requires using the shared memory server).
> +The server must be started on the host before any guest.
> +It creates a shared memory object then waits for clients to connect on an 
> unix
> +socket.

s/an unix/a unix/

>  
> -The server assigns each VM an ID number and sends this ID number to the QEMU
> -process when the guest starts.
> +For each client (QEMU processes) that connects to the server:

s/processes/process/


> +The client IDs are limited to 16 bits because of the current implementation 
> (see
> +Doorbell register in 'PCI device registers' subsection). Hence on 65536 
> clients

s/on/only/


> +At initialisation, when creating the ivshmem device, QEMU gets its ID from 
> the
> +server then make it available through BAR0 IVPosition register for the VM to 
> use

s/make/makes/

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v4 01/14] contrib: add ivshmem client and server

2014-09-02 Thread Eric Blake

On 09/02/2014 09:25 AM, David Marchand wrote:
> When using ivshmem devices, notifications between guests can be sent as
> interrupts using a ivshmem-server (typical use described in documentation).
> The client is provided as a debug tool.
> 
> Signed-off-by: Olivier Matz 
> Signed-off-by: David Marchand 
> ---

> +/* parse the size of shm */
> +static int
> +parse_size(const char *val_str, size_t *val)
> +{
> +char *endptr;
> +unsigned long long tmp;
> +
> +errno = 0;
> +tmp = strtoull(val_str, &endptr, 0);
> +if ((errno == ERANGE && tmp == ULLONG_MAX) || (errno != 0 && tmp == 0)) {
> +return -1;
> +}

This is overly complex; usually, this is just done as:

if (errno)
return -1;

> +/* parse an unsigned int */
> +static int
> +parse_uint(const char *val_str, unsigned *val)
> +{
> +char *endptr;
> +unsigned long tmp;
> +
> +errno = 0;
> +tmp = strtoul(val_str, &endptr, 0);
> +if ((errno == ERANGE && tmp == ULONG_MAX) || (errno != 0 && tmp == 0)) {
> +return -1;
> +}

and again

> +if (endptr == val_str || endptr[0] != '\0') {
> +return -1;
> +}
> +*val = tmp;
> +return 0;
> +}

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [RFC PATCH v2 2/2] VFIO: Clear stale MSIx table during EEH reset

2014-09-02 Thread Alex Williamson

On Mon, 2014-09-01 at 10:53 +1000, Gavin Shan wrote:
> The PCI device MSIx table is cleaned out in hardware after EEH PE
> reset. However, we still hold the stale MSIx entries in QEMU, which
> should be cleared accordingly. Otherwise, we will run into another
> (recursive) EEH error and the PCI devices contained in the PE have
> to be offlined exceptionally.
> 
> The patch clears stale MSIx table before EEH PE reset so that MSIx
> table could be restored properly after EEH PE reset.
> 
> Signed-off-by: Gavin Shan 
> ---
>  hw/misc/vfio.c | 32 +++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index 1a3e7eb..1f55051 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -2724,6 +2724,17 @@ static void vfio_disable_interrupts(VFIODevice *vdev)
>  }
>  }
>  
> +static void vfio_disable_and_reset_interrupts(VFIODevice *vdev)
> +{
> +vfio_disable_interrupts(vdev);
> +
> +switch (vdev->interrupt) {
> +case VFIO_INT_MSIX:
> +msix_reset(&vdev->pdev);
> +break;
> +}

This is apparently untested because vdev->interrupt should never be set
to VFIO_INT_MSIX after vfio_disable_interrupts().  Also, you need to
update the normal reset path to call msix_reset() unless it's already
happening via another reset handler.  Thanks,

Alex

> +}
> +
>  static int vfio_setup_msi(VFIODevice *vdev, int pos)
>  {
>  uint16_t ctrl;
> @@ -4442,8 +4453,27 @@ int vfio_container_ioctl(AddressSpace *as, int32_t 
> groupid,
>  switch (req) {
>  case VFIO_CHECK_EXTENSION:
>  case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
> -case VFIO_EEH_PE_OP:
>  break;
> +case VFIO_EEH_PE_OP: {
> +VFIODevice *vdev;
> +struct vfio_eeh_pe_op *arg = (struct vfio_eeh_pe_op *)param;
> +
> +switch (arg->op) {
> +case VFIO_EEH_PE_RESET_HOT:
> +case VFIO_EEH_PE_RESET_FUNDAMENTAL:
> +/*
> + * The MSIx table will be cleaned out by reset. We need
> + * disable it so that it can be reenabled properly. Also,
> + * the cached MSIx table should be cleared as it's not
> + * reflecting the contents in hardware.
> + */
> +QLIST_FOREACH(vdev, &group->device_list, next) {
> +vfio_disable_and_reset_interrupts(vdev);
> +}
> +}
> +
> +break;
> +}
>  default:
>  /* Return an error on unknown requests */
>  error_report("vfio: unsupported ioctl %X", req);

Re: [Qemu-devel] [PATCH v2 3/5] qemu-char: Convert udp backend to QAPI

2014-09-02 Thread Eric Blake

On 09/02/2014 04:24 AM, Peter Maydell wrote:
> Convert the udp char backend to the new style QAPI framework.
> 
> Signed-off-by: Peter Maydell 
> ---
>  qemu-char.c | 69 
> +++--
>  1 file changed, 54 insertions(+), 15 deletions(-)
> 

> +
> +if (host == NULL || strlen(host) == 0) {
> +host = "localhost";
> +}
> +if (port == NULL || strlen(port) == 0) {
> +error_setg(errp, "chardev: udp: remote port not specified");

In the common case of these strings being non-empty, you end up having
to hunt for the end of the string only to then throw that information
away.  Rather than 'strlen(foo) == 0)', it's slightly faster to check
'*foo' for being a non-NUL byte.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2] dump: let dump_error return error info to caller

2014-09-02 Thread Eric Blake

On 09/02/2014 02:25 AM, zhanghailiang wrote:
> The second parameter of dump_error is unused, but one purpose of
> using this function is to report the error info.
> 
> Use error_set to return the error info to the caller.
> 
> Signed-off-by: zhanghailiang 
> ---
>  V2:
> - Return the error reason to the caller which suggested by Luiz Capitulino.
> ---
>  dump.c | 165 
> -
>  1 file changed, 82 insertions(+), 83 deletions(-)
> 
> diff --git a/dump.c b/dump.c
> index 71d3e94..0ab72e7 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -81,9 +81,10 @@ static int dump_cleanup(DumpState *s)
>  return 0;
>  }
>  
> -static void dump_error(DumpState *s, const char *reason)
> +static void dump_error(DumpState *s, Error **errp, const char *reason)

The Error **errp is typically listed last.

>  {
>  dump_cleanup(s);
> +error_setg(errp, "%s", reason);
>  }
>  
>  static int fd_write_vmcore(const void *buf, size_t size, void *opaque)
> @@ -99,7 +100,7 @@ static int fd_write_vmcore(const void *buf, size_t size, 
> void *opaque)
>  return 0;
>  }
>  
> -static int write_elf64_header(DumpState *s)
> +static int write_elf64_header(DumpState *s, Error **errp)
>  {
>  Elf64_Ehdr elf_header;
>  int ret;
> @@ -126,14 +127,14 @@ static int write_elf64_header(DumpState *s)
>  
>  ret = fd_write_vmcore(&elf_header, sizeof(elf_header), s);
>  if (ret < 0) {
> -dump_error(s, "dump: failed to write elf header.\n");
> +dump_error(s, errp, "dump: failed to write elf header.\n");

This ends up calling error_setg with a trailing newline, which should
not be needed.  It looks like all of your conversions to the new
dump_error should drop the \n in the message.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03

2014-09-02 Thread Michael Roth

Quoting Andrey Korolyov (2014-09-02 12:33:13)
> On Tue, Sep 2, 2014 at 7:27 PM, Michael S. Tsirkin  wrote:
> > On Tue, Sep 02, 2014 at 06:25:46PM +0300, Michael S. Tsirkin wrote:
> >> On Tue, Sep 02, 2014 at 10:20:50AM -0500, Michael Roth wrote:
> >> > Quoting Michael Roth (2014-08-27 12:35:57)
> >> > > Hi everyone,
> >> > >
> >> > > The following new patches are queued for QEMU stable v2.1.1:
> >> > >
> >> > >   https://github.com/mdroth/qemu/commits/stable-2.1-staging
> >> >
> >> > As of now the following additional patches have been applied to
> >> > the 2.1.1 staging tree (see stable commit for upstream commit ref):
> >> >
> >> >   a7f9ea2 qxl-render: add more sanity checks
> >> >   1511595 target-arm: Correct Cortex-A57 ISAR5 and AA64ISAR0 ID register 
> >> > values
> >> >   44a1530 target-arm: Fix regression that disabled VFP for ARMv5 CPUs
> >> >   95dcaa4 x86: Clear MTRRs on vCPU reset
> >> >   7fd25d3 x86: kvm: Add MTRR support for kvm_get|put_msrs()
> >> >   0f9c63b x86: Use common variable range MTRR counts
> >> >   0c69595 target-i386: Don't forbid NX bit on PAE PDEs and PTEs
> >> >   89713cb vl: process -object after other backend options
> >> >   93dcbd3 spapr_pci: map the MSI window in each PHB
> >> >
> >> > The following patches have been Cc'd to qemu-stable and are still
> >> > pending upstream commit/merge:
> >> >
> >> >   [PATCH] virtio-net: don't run bh on vm stopped (Michael S. Tsirkin)
> >> > * pull just sent by Michael Tsirkin
> >> >   [PATCH] net: prevent sending packets while guest is stopped (Stefan 
> >> > Hajnoczi)
> >> >   [PATCH v2 2/4] pci: Avoid losing config updates to MSI/MSIX cap regs 
> >> > (Knut Omang)
> >>
> >>   this is also in that pull
> >>
> >> >   [PATCH V4] net: Forbid dealing with packets when VM is not running 
> >> > (zhanghailiang)
> >> > * looks like Stefan has this queued for net
> >> >   [PATCH V2] acpi-build: Set FORCE_APIC_CLUSTER_MODEL bit for FADT flags 
> >> > (zhanghailiang)
> >>
> >>   hmm I forgot to include that one. I'll redo the pull request.
> >
> > or maybe not
> > I recalled why I didn't include it - I wanted someone else to review it
> > first
> >
> > if it's ready, I'll send it separately tomorrow
> >
> >
> >> >   [RESEND v2 0/3] Fix some memory leaks about query memdev (Chen Fan)
> >> > * looks like Michael Tokarev has this queued for trivial
> >> >   [PATCH] target-i386: Support migratable=no properly (Eduardo Habkost)
> >> > * pinged
> >> >   [PATCH 1/3] pc: Fix disabling of vapic for compat PC models (Jan 
> >> > Kiszka)
> >> > * pinged
> >> >   [PATCH] pty: Fix byte loss bug when connecting to pty (Sebastian 
> >> > Tanase)
> >> > * pinged, presumably going through Gerd's tree
> >> >
> >> > Let me know if anything is missing.
> >> >
> >> > >
> >> > > The release is planned for 2014-09-08:
> >> > >
> >> > >   http://wiki.qemu.org/Planning/2.1
> >> > >
> >> > > Please respond here or CC qemu-sta...@nongnu.org on any patches you
> >> > > think should be included in the release.
> >> > >
> >> > > Testing/feedback is greatly appreciated.
> >> > >
> >> > > Thanks!
> >> > >
> >> > > 
> >> > > Alex Williamson (1):
> >> > >   vfio: Fix MSI-X vector expansion
> >> > >
> >> > > Ben Draper (1):
> >> > >   vmxnet3: Pad short frames to minimum size (60 bytes)
> >> > >
> >> > > Christoffer Dall (2):
> >> > >   target-arm: Rename QEMU PSCI v0.1 definitions
> >> > >   arm/virt: Use PSCI v0.2 function IDs in the DT when KVM uses 
> >> > > PSCI v0.2
> >> > >
> >> > > Fam Zheng (1):
> >> > >   blkdebug: Delete BH in bdrv_aio_cancel
> >> > >
> >> > > Gonglei (1):
> >> > >   pcihp: fix possible array out of bounds
> >> > >
> >> > > Hu Tao (3):
> >> > >   hw:i386: typo fix: MEMORY_HOPTLUG_DEVICE -> MEMORY_HOTPLUG_DEVICE
> >> > >   pc-dimm: validate node property
> >> > >   numa: show hex number in error message for consistency and 
> >> > > prefix them with 0x
> >> > >
> >> > > Jan Kiszka (1):
> >> > >   pci: Use bus master address space for delivering MSI/MSI-X 
> >> > > messages
> >> > >
> >> > > Michael S. Tsirkin (4):
> >> > >   pc-dimm: fix up error message
> >> > >   acpi: align RSDP
> >> > >   hostmem: set MPOL_MF_MOVE
> >> > >   pc: reserve more memory for ACPI for new machine types
> >> > >
> >> > > Michael Tokarev (2):
> >> > >   l2tpv3 (configure): it is linux-specific
> >> > >   ide: only constrain read/write requests to drive size, not other 
> >> > > types
> >> > >
> >> > > Peter Lieven (1):
> >> > >   block/iscsi: fix memory corruption on iscsi resize
> >> > >
> >> > > Peter Maydell (1):
> >> > >   target-arm: Fix return address for A64 BRK instructions
> >> > >
> >> > > Stefan Hajnoczi (6):
> >> > >   qmp: hide "hotplugged" device property from 
> >> > > device-list-properties
> >> > >   qdev-monitor: include QOM properties in -device FOO, help output
> >> > >   ra

Re: [Qemu-devel] [PATCH v5 11/11] iotests: Add test for potentially damaging repairs

2014-09-02 Thread Eric Blake

On 08/29/2014 03:41 PM, Max Reitz wrote:
> There are certain cases where repairing a qcow2 image might actually
> damage it further (or rather, where repairing it has in fact damaged it
> further with the old qcow2 check implementation). This should not
> happen, so add a test for these cases.
> 
> Furthermore, the repair function now repairs refblocks beyond the image
> end by resizing the image accordingly. Add several tests for this as
> well.
> 
> Signed-off-by: Max Reitz 
> ---
>  tests/qemu-iotests/104 | 141 
> +
>  tests/qemu-iotests/104.out | 110 +++
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 252 insertions(+)
>  create mode 100755 tests/qemu-iotests/104
>  create mode 100644 tests/qemu-iotests/104.out
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v5 10/11] iotests: Fix test outputs

2014-09-02 Thread Eric Blake

On 08/29/2014 03:41 PM, Max Reitz wrote:
> 039, 060 and 061 all create images with referenced clusters having a
> refcount of 0. Because previous commits changed handling of such errors,
> these tests now have a different output. Fix it.
> 
> Furthermore, 060 created a refblock with a refcount greater than one
> which now results in having to rebuild the refcount structure as well.
> 
> Signed-off-by: Max Reitz 
> ---
>  tests/qemu-iotests/039.out | 10 --
>  tests/qemu-iotests/060.out | 10 --
>  tests/qemu-iotests/061.out | 18 --
>  3 files changed, 28 insertions(+), 10 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [Bug 1364501] Re: Gdb hangs when trying to single-step after an invalid instruction

2014-09-02 Thread Martin

>> Public bug reported:
>>
>> When using Gdb to remote-debug a program and manually setting its PC to
>> point to an address containing an invalid instruction and then doing a
>> single step, Qemu will never return control to the remote Gdb.
>>
>> For instance, let's say address 0x114 contains an invalid instruction.
>> On the remote Gdb, we'd do:
>>
>> (gdb) set $pc = 0x114
>> (gdb) stepi
>>
>> After doing that we won't get the (gdb) prompt unless we do a Ctrl-C. If
>> we do so we'll be left at 0x114 instead of going towards the exception
>> handler as we should. This happens with stepi, step and next. If instead
>> of single-stepping we used continue, the program will proceed into the
>> exception handler as it should.
>>
>> The reason this is happening is that when Qemu realizes it's about to
>> translate an instruction it doesn't recognize it'll generate a call to
>> helper_exception_with_syndrome(), which will register the exception and
>> then call cpu_loop_exit(). At the same time, because we're doing a
>> single-step, Qemu will also generate a call to
>> helper_exception_internal() passing it an EXCP_DEBUG, which lets the
>> system know it'll give control back to the remote debugger, and it also
>> ends with a call to cpu_loop_exit(). However, because the syndrome
>> exception calls cpu_loop_exit() first, the call to the internal
>> exception won't be reached and Qemu will be stuck in a loop without
>> returning control to the remote debugger.
>
> Just to check, does your system image include a valid handler
> for the undef exception?

Yes, it does. RTEMS sets a default handler for every exception, and just
to be sure I added a bunch of small wrappers so I'll be able to know
what kind of exception are we dealing with.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1364501

Title:
  Gdb hangs when trying to single-step after an invalid instruction

Status in QEMU:
  New

Bug description:
  When using Gdb to remote-debug a program and manually setting its PC
  to point to an address containing an invalid instruction and then
  doing a single step, Qemu will never return control to the remote Gdb.

  For instance, let's say address 0x114 contains an invalid instruction.
  On the remote Gdb, we'd do:

  (gdb) set $pc = 0x114
  (gdb) stepi

  After doing that we won't get the (gdb) prompt unless we do a Ctrl-C.
  If we do so we'll be left at 0x114 instead of going towards the
  exception handler as we should. This happens with stepi, step and
  next. If instead of single-stepping we used continue, the program will
  proceed into the exception handler as it should.

  The reason this is happening is that when Qemu realizes it's about to
  translate an instruction it doesn't recognize it'll generate a call to
  helper_exception_with_syndrome(), which will register the exception
  and then call cpu_loop_exit(). At the same time, because we're doing a
  single-step, Qemu will also generate a call to
  helper_exception_internal() passing it an EXCP_DEBUG, which lets the
  system know it'll give control back to the remote debugger, and it
  also ends with a call to cpu_loop_exit(). However, because the
  syndrome exception calls cpu_loop_exit() first, the call to the
  internal exception won't be reached and Qemu will be stuck in a loop
  without returning control to the remote debugger.

  What makes this a bit tricky to fix is that we must call
  cpu_loop_exit() at the end of helper_exception_with_syndrome(),
  otherwise the target exception will go undetected and its handler
  won't be excecuted.

  Tested on latest head by emulating a Stellaris lm3s6965 board and
  running RTEMS 4.11:

  $ qemu-system-arm -nographic -s -S -M lm3s6965evb -kernel my_rtems_app

  Commit hash in qemu.git: 30eaca3acdf17d7bcbd1213eb149c02037edfb0b

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1364501/+subscriptions

Re: [Qemu-devel] [PATCH 2/2] docs/qcow2: Correct refcount_block_entries

2014-09-02 Thread Eric Blake

On 09/02/2014 12:59 PM, Max Reitz wrote:

>> Consider refcount_order == 0 (that is, no shared blocks, ALL blocks have
>> at most refcount 1).  Then, refcount_bits is (1 << 0) == 1.  But 1/8 in
>> integer math truncates to 0 (oops, division by zero is undefined); when
>> in reality, the expression you want here is (cluster_size * 8 /
>> refcount_bits).
> 
> If it is integer division, that is. ;-)
> 
> I'm counting on you accepting "cluster_size * 8 / refcount_bits" and not
> rejecting it because of a possible integer overflow. *g*

We already document that qemu's maximum cluster size is 2M; and 2M*8 is
less than 32 bits :)

Maybe the qcow2 spec allows the theoretical file with a cluster size of
512M, where overflow then matters.  But you are correct that I won't
reject your patch, given that qemu doesn't parse all possible
theoretical qcow2 files :)

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] block/iscsi: use 16 byte CDBs only when necessary

2014-09-02 Thread Peter Lieven

Looking at the code, is it possible that not the guest is causing trouble here, 
but
multiwrite_merge code?

>From what I see the only limit it has when merging requests is the number of 
>IOVs.


Any thoughts?

Mine are:
 a) Introducing bs->bl.max_request_size and set merge = 0 if the result would 
be too big. Default
max request size to 32768 sectors (see below).
 b) Hardcoding the limit in multiwrite_merge for now limiting the merged size 
to 16MB (32768 sectors).
 Which is the limit we already use in bdrv_co_discard and 
bdrv_co_write_zeroes if we don't know
 better.

Peter

Am 02.09.2014 um 17:28 schrieb ronnie sahlberg:
> That is one big request.  I assume the device reports "no limit" in
> the VPD page so we can not state it is the guest/application going
> beyond the allowed limit?
>
>
> I am not entirely sure what meaning the target assigns to Protocol
> Error means here.
> It could be that ~100M is way higher than MaxBurstLength ?  What is
> the MaxBurstLength that was reported by the server during login
> negotiation?
> If so, we should make libiscsi check the maxburstlength and fail the
> request early. We would still fail the I/O so it will not really solve
> anything much
> but at least we should not send the request to the server.
>
> Best would probably be to take the smallest of a non-zero
> Block-Limits.max_transfer_length and iscsi-MaxBurstLength/block-size
> and pass this back to the guest in the emulated Block-Limits-VPD.
> At least then you have tried to tell the guest "never do SCSI I/O
> bigger than this".
>
> I.e. even if the target reports BlockLimits.MaxTransferLength == 0 ==
> no limit to QEMU, QEMU should probably take the iscsi transport limit
> into account and pass this to the guest
> by setting the emulated BlockLimits page it passes to scale to the
> maximum that MaxBurstLength allows.
>
>
> Then if BTRFS or SG_IO in the guest ignores the BlockLimits it is
> clearly a guest problem.
>
> (A different interpretation for ProtocolError could be the mismatch
> between the iscsi expected data transfer length and the scsi transfer
> length, but that should result in residuals, not protocol error.)
>
>
>
> Hypothetically there could be targets that support really huge
> MaxBurstLengths > 32MB. For those you probably want to switch to
> WRITE16 when the SCSI transfer length goes > 0x.
>
> - if (iscsilun->use_16_for_rw)  {
> + if (iscsilun->use_16_for_rw || num_sectors > 0x)  {
>
>
> regards
> ronnie sahlberg
>
> On Mon, Sep 1, 2014 at 8:21 AM, Peter Lieven  wrote:
>> On 17.06.2014 13:46, Paolo Bonzini wrote:
>>
>> Il 17/06/2014 13:37, Peter Lieven ha scritto:
>>
>> On 17.06.2014 13:15, Paolo Bonzini wrote:
>>
>> Il 17/06/2014 08:14, Peter Lieven ha scritto:
>>
>>
>>
>> BTW, while debugging a case with a bigger storage supplier I found
>> that open-iscsi seems to do exactly this undeterministic behaviour.
>> I have a 3TB LUN. If I access < 2TB sectors it uses READ10/WRITE10 and
>> if I go beyond 2TB it changes to READ16/WRITE16.
>>
>>
>> Isn't that exactly what your latest patch does for >64K sector writes? :)
>>
>>
>> Not exactly, we choose the default by checking the LUN size. 10 Byte for
>> < 2TB and 16 Byte otherwise.
>>
>>
>> Yeah, I meant introducing the non-determinism.
>>
>> My latest patch makes an exception if a request is bigger than 64K
>> sectors and
>> switches to 16 Byte requests. These would otherwise end in an I/O error.
>>
>>
>> It could also be split at the block layer, like we do for unmap.  I think
>> there's also a maximum transfer size somewhere in the VPD, we could to
>> READ16/WRITE16 if it is >64K sectors.
>>
>>
>> It seems that there might be a real world example where Linux issues >32MB
>> write requests. Maybe someone familiar with btrfs can advise.
>> I see iSCSI Protocol Errors in my logs:
>>
>> Sep  1 10:10:14 libiscsi:0 PDU header: 01 a1 00 00 00 01 00 00 00 00 00 00
>> 00 00 00 00 00 00 00 07 06 8f 30 00 00 00 00 06 00 00 00 0a 2a 00 01 09 9e
>> 50 00 47 98 00 00 00 00 00 00 00 [XXX]
>> Sep  1 10:10:14 qemu-2.0.0: iSCSI: Failed to write10 data to iSCSI lun.
>> Request was rejected with reason: 0x04 (Protocol Error)
>>
>> Looking at the headers the xferlen in the iSCSI PDU is 110047232 Byte which
>> is 214936 sectors.
>> 214936 % 65536 = 18328 which is exactly the number of blocks in the SCSI
>> WRITE10 CDB.
>>
>> Can someone advise if this is something that btrfs can cause
>> or if I have to
>> blame the customer that he issues very big write requests with Direct I/O?
>>
>> The user sseems something like this in the log:
>> [34640.489284] BTRFS: bdev /dev/vda2 errs: wr 8232, rd 0, flush 0, corrupt
>> 0, gen 0
>> [34640.490379] end_request: I/O error, dev vda, sector 17446880
>> [34640.491251] end_request: I/O error, dev vda, sector 5150144
>> [34640.491290] end_request: I/O error, dev vda, sector 17472080
>> [34640.492201] end_request: I/O error, dev vda, sector 17523488
>> [34640.492201] end_request: I/O error, dev vda, sector 17536

Re: [Qemu-devel] [PATCH v2 2/2] qemu-img: fix rebase src_cache option documentation

2014-09-02 Thread Max Reitz


On 02.09.2014 12:01, Stefan Hajnoczi wrote:

The src_cache option (-T) specifies the cache mode for backing files.
It applies both the image's old backing file as well as the new backing
file:

   ret = bdrv_open(&bs_old_backing, backing_name, NULL, NULL, src_flags,
   old_backing_drv, &local_err);
   if (ret) {
   ...
   }
   if (out_baseimg[0]) {
   bs_new_backing = bdrv_new("new_backing", &error_abort);
   ret = bdrv_open(&bs_new_backing, out_baseimg, NULL, NULL, src_flags,
   new_backing_drv, &local_err);
   if (ret) {
   ...
   }
   }

The documentation only mentions the new backing file but it really
applies to both.

Suggested-by: Jeff Nelson 
Signed-off-by: Stefan Hajnoczi 
---
  qemu-img.texi | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Max Reitz

Re: [Qemu-devel] [PATCH v2 1/2] qemu-img: clarify src_cache option documentation

2014-09-02 Thread Max Reitz


On 02.09.2014 12:01, Stefan Hajnoczi wrote:

The source cache option takes the same values as the cache option.  The
documentation reads a little strange because it starts with "In contrast
the src_cache option ...".  The fact that this is comparing with the
previous documented option (the 'cache' option) is implicit.  Readers
may be confused, especially if they jump to src_cache without reading
cache documentation first.

Suggested-by: Jeff Nelson 
Signed-off-by: Stefan Hajnoczi 
---
  qemu-img.c| 3 ++-
  qemu-img.texi | 5 +++--
  2 files changed, 5 insertions(+), 3 deletions(-)


Reviewed-by: Max Reitz

Re: [Qemu-devel] [PATCH 2/2] docs/qcow2: Correct refcount_block_entries

2014-09-02 Thread Max Reitz


On 30.08.2014 00:37, Eric Blake wrote:

On 08/29/2014 03:45 PM, Max Reitz wrote:

A refblock entry may have a different size than 16 bits, it may even be
smaller than a byte. Correct the refcount_block_entries calculation
accordingly.

Signed-off-by: Max Reitz 
---
  docs/specs/qcow2.txt | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index cfbc8b0..531c478 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -183,7 +183,7 @@ blocks and are exactly one cluster in size.
  Given a offset into the image file, the refcount of its cluster can be 
obtained
  as follows:
  
-refcount_block_entries = (cluster_size / sizeof(uint16_t))

+refcount_block_entries = (cluster_size / (refcount_bits / 8))
  


Consider refcount_order == 0 (that is, no shared blocks, ALL blocks have
at most refcount 1).  Then, refcount_bits is (1 << 0) == 1.  But 1/8 in
integer math truncates to 0 (oops, division by zero is undefined); when
in reality, the expression you want here is (cluster_size * 8 /
refcount_bits).


If it is integer division, that is. ;-)

I'm counting on you accepting "cluster_size * 8 / refcount_bits" and not 
rejecting it because of a possible integer overflow. *g*


Max

Re: [Qemu-devel] [PATCH v5 01/11] qcow2: Calculate refcount block entry count

2014-09-02 Thread Max Reitz


On 30.08.2014 01:03, Eric Blake wrote:

On 08/29/2014 03:40 PM, Max Reitz wrote:

The size of a refblock entry is (in theory) variable; calculate
therefore the number of entries per refblock and the according bit shift
(1 << x == entry count) when opening an image.

Signed-off-by: Max Reitz 
---
  block/qcow2.c | 2 ++
  block/qcow2.h | 2 ++
  2 files changed, 4 insertions(+)

What is the maximum refcount_order?  The specs don't mention; the file
format is wide open to overflows.  Even something as benign-sounding as
refcount_order=6 (64 bits) means that each cluster can be referenced
2**64 times, which is far longer than our lifetimes to build it up that
high incrementally, and represents far greater than the amount of
storage in existence being deduplicated!  Shockingly easy to start
getting into undefined territory, so maybe we ought to explicitly cap
refcount_order to 6.


Well, the most obvious issue to me is that qcow2 only supports 64 bit 
offsets and sizes etc., so it shouldn't have refcounts wider than 64 bits.


On the other hand, it is probably possible to generate a valid image 
with a cluster having a refcount which exceeds 2^{64} - 1: Set the 
virtual size to (2^{64} - cluster_size), use a single data cluster for 
all virtual clusters which makes its refcount (2^{64} - cluster_size) / 
cluster_size (which would be 2^{55} - 1 in the most extreme case). Then 
you create cluster_size snapshots and the refcount of that cluster is 
now at (2^{55} - 1) * (1 + 512) = 2^{64} + 2^{55} - 512 - 1 >= 2^{64}.


But that would be crazy, so I think it very reasonable to forbid 
refcount_order > 6, too.



diff --git a/block/qcow2.c b/block/qcow2.c
index f9e045f..172ad00 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -689,6 +689,8 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, 
int flags,
  
  s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */

  s->l2_size = 1 << s->l2_bits;
+s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3);

Hmm; we document that qemu requires cluster_bits to be between 9 and 21
inclusive.  So, if cluster_bits is 9 (512-byte clusters), and
refcount_order is 6, then we can pack in 9 - (6 - 3) or 2**6 (that is,
64) refcounts per cluster.

On the other extreme, the minimum refcount_order of 0 (each cluster
occupies refcount bits, and so is either allocated or not, but no
sharing), starts having the math looks ugly, because you are mixing:

(int) = (uint32_t) - ( (uint32_t) - (int) )

so at one point, you are doing s->cluster_bits - (4294967293U), but that
wraps around (thankfully, wraparound is well-defined on unsigned types)
for a net answer of cluster_bits + 3.  But in the worst case, that means
an image with 2M clusters will be packing 21 - (0 - 3) or 2**24 (that
is, 16M) refcounts in one cluster.  Still fits in an int, so it looks
like you are safe...


+s->refcount_block_size = 1 << s->refcount_block_bits;

...that this particular shift will not cause undefined behavior, for
reasonable refcount_order in the range [0,6].


Well, for now refcount_order is asserted to be 4, anyway. ;-)


Reviewed-by: Eric Blake 

[We really ought to tighten the qcow2 spec - but that's a separate patch]


Yep, I'll include it in v2 of the follow-up series.

Max

Re: [Qemu-devel] [Bug 1364501] [NEW] Gdb hangs when trying to single-step after an invalid instruction

2014-09-02 Thread Peter Maydell

On 2 September 2014 17:38, martin  wrote:
> Public bug reported:
>
> When using Gdb to remote-debug a program and manually setting its PC to
> point to an address containing an invalid instruction and then doing a
> single step, Qemu will never return control to the remote Gdb.
>
> For instance, let's say address 0x114 contains an invalid instruction.
> On the remote Gdb, we'd do:
>
> (gdb) set $pc = 0x114
> (gdb) stepi
>
> After doing that we won't get the (gdb) prompt unless we do a Ctrl-C. If
> we do so we'll be left at 0x114 instead of going towards the exception
> handler as we should. This happens with stepi, step and next. If instead
> of single-stepping we used continue, the program will proceed into the
> exception handler as it should.
>
> The reason this is happening is that when Qemu realizes it's about to
> translate an instruction it doesn't recognize it'll generate a call to
> helper_exception_with_syndrome(), which will register the exception and
> then call cpu_loop_exit(). At the same time, because we're doing a
> single-step, Qemu will also generate a call to
> helper_exception_internal() passing it an EXCP_DEBUG, which lets the
> system know it'll give control back to the remote debugger, and it also
> ends with a call to cpu_loop_exit(). However, because the syndrome
> exception calls cpu_loop_exit() first, the call to the internal
> exception won't be reached and Qemu will be stuck in a loop without
> returning control to the remote debugger.

Just to check, does your system image include a valid handler
for the undef exception? I'm wondering if we really never return
control in all cases, or only if we get stuck in an infinite loop
of exceptions beacuse the exception handler's first instruction
causes an exception which causes an exception which...

> What makes this a bit tricky to fix is that we must call cpu_loop_exit()
> at the end of helper_exception_with_syndrome(), otherwise the target
> exception will go undetected and its handler won't be excecuted.

I suspect there are actually more general issues with the interaction
of single-stepping with exceptions. See also
https://bugs.launchpad.net/qemu/+bug/757702 which is a report
that singlestepping an invalid instruction stops on the insn after
the first one in the UNDEF handler rather than before it.

thanks
-- PMM

Re: [Qemu-devel] [Qemu-trivial] [PATCH] slirp: Honour vlan/stack in hostfwd_remove commands

2014-09-02 Thread Michael Tokarev

26.06.2014 16:35, Peter Maydell wrote:
> On 16 June 2014 16:47, Peter Maydell  wrote:
>> The hostfwd_add and hostfwd_remove monitor commands allow the user
>> to optionally specify a vlan/stack tuple. hostfwd_add honours this,
>> but hostfwd_remove does not (it looks up the tuple but then ignores
>> the SlirpState it has looked up and always uses the first stack
>> in the list anyway). Correct this to honour what the user requested.

I've applied this to -trivial.  Somehow this patch has been marked as
"handled" in my queue, but actually it weren't - that's why I haven't
really seen the pings.

Thanks,

/mjt

Re: [Qemu-devel] [question] virtio-blk performance degradationhappened with virito-serial

2014-09-02 Thread Andrey Korolyov

On Tue, Sep 2, 2014 at 10:11 PM, Amit Shah  wrote:
> On (Tue) 02 Sep 2014 [22:05:45], Andrey Korolyov wrote:
>
>> Can confirm serious degradation comparing to the 1.1 with regular
>> serial output  - I am able to hang VM forever after some tens of
>> seconds after continuously printing dmest to the ttyS0. VM just ate
>> all available CPU quota during test and hanged over some tens of
>> seconds, not even responding to regular pings and progressively
>> raising CPU consumption up to the limit.
>
> Entirely different to what's being discussed here.  You're observing
> slowdown with ttyS0 in the guest -- the isa-serial device.  This
> thread is discussing virtio-blk and virtio-serial.
>
> Amit

Sorry for thread hijacking, the problem definitely not related to the
interrupt rework, will start a new thread.

Re: [Qemu-devel] [PATCH] block/iscsi: use 16 byte CDBs only when necessary

2014-09-02 Thread Peter Lieven

Am 02.09.2014 um 17:28 schrieb ronnie sahlberg:
> That is one big request.  I assume the device reports "no limit" in
> the VPD page so we can not state it is the guest/application going
> beyond the allowed limit?

Yes, my storage reports no limit, but afaik there is no way to tell the
guest the limit at least with virtio-blk. Or can we?

If so we should define a sane limit in case the storage reports no limits.
It can't be our goal to allow the guest to issue a request in orders
of gigabytes. Which is possible. The xferlen in the iSCSI PDU is 32bit so
we have an actual limit of 2^32 - 1.

>
>
> I am not entirely sure what meaning the target assigns to Protocol
> Error means here.
> It could be that ~100M is way higher than MaxBurstLength ?  What is
> the MaxBurstLength that was reported by the server during login
> negotiation?
> If so, we should make libiscsi check the maxburstlength and fail the
> request early. We would still fail the I/O so it will not really solve
> anything much
> but at least we should not send the request to the server.

The "problem" is that we switched to read/write10. In this case
we can write 65535 sectors max. The procotol error refers to
the xferlen not matching the block count in the SCSI CDB in the
payload.

>
> Best would probably be to take the smallest of a non-zero
> Block-Limits.max_transfer_length and iscsi-MaxBurstLength/block-size
> and pass this back to the guest in the emulated Block-Limits-VPD.
> At least then you have tried to tell the guest "never do SCSI I/O
> bigger than this".

I would vote for the biggest number of sectors fitting in uint16_t. Which
is 32768. This is 16 MB with 512 byte sectors. We have already used
this for write zeroes and discard in case there is no limit specified.

>
> I.e. even if the target reports BlockLimits.MaxTransferLength == 0 ==
> no limit to QEMU, QEMU should probably take the iscsi transport limit
> into account and pass this to the guest
> by setting the emulated BlockLimits page it passes to scale to the
> maximum that MaxBurstLength allows.
>
>
> Then if BTRFS or SG_IO in the guest ignores the BlockLimits it is
> clearly a guest problem.
>
> (A different interpretation for ProtocolError could be the mismatch
> between the iscsi expected data transfer length and the scsi transfer
> length, but that should result in residuals, not protocol error.)
>
>
>
> Hypothetically there could be targets that support really huge
> MaxBurstLengths > 32MB. For those you probably want to switch to
> WRITE16 when the SCSI transfer length goes > 0x.
>
> - if (iscsilun->use_16_for_rw)  {
> + if (iscsilun->use_16_for_rw || num_sectors > 0x)  {

I already proposed this, but I think Michael voted that this
would cause unpredictable and hard to debug behaviour
for qemu.

[PATCH] block/iscsi: use 16 byte CDBs also for big requests
https://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg03628.html

But maybe we have to do this if we cannot report the max
transfer length for virtio-blk or IDE. I will check what open-iscsi or
Linux does.

Peter

Re: [Qemu-devel] [question] virtio-blk performance degradationhappened with virito-serial

2014-09-02 Thread Amit Shah

On (Tue) 02 Sep 2014 [22:05:45], Andrey Korolyov wrote:

> Can confirm serious degradation comparing to the 1.1 with regular
> serial output  - I am able to hang VM forever after some tens of
> seconds after continuously printing dmest to the ttyS0. VM just ate
> all available CPU quota during test and hanged over some tens of
> seconds, not even responding to regular pings and progressively
> raising CPU consumption up to the limit.

Entirely different to what's being discussed here.  You're observing
slowdown with ttyS0 in the guest -- the isa-serial device.  This
thread is discussing virtio-blk and virtio-serial.

Amit

Re: [Qemu-devel] [question] virtio-blk performance degradationhappened with virito-serial

2014-09-02 Thread Andrey Korolyov

On Tue, Sep 2, 2014 at 10:36 AM, Amit Shah  wrote:
> On (Mon) 01 Sep 2014 [20:52:46], Zhang Haoyu wrote:
>> >>> Hi, all
>> >>>
>> >>> I start a VM with virtio-serial (default ports number: 31), and found 
>> >>> that virtio-blk performance degradation happened, about 25%, this 
>> >>> problem can be reproduced 100%.
>> >>> without virtio-serial:
>> >>> 4k-read-random 1186 IOPS
>> >>> with virtio-serial:
>> >>> 4k-read-random 871 IOPS
>> >>>
>> >>> but if use max_ports=2 option to limit the max number of virio-serial 
>> >>> ports, then the IO performance degradation is not so serious, about 5%.
>> >>>
>> >>> And, ide performance degradation does not happen with virtio-serial.
>> >>
>> >>Pretty sure it's related to MSI vectors in use.  It's possible that
>> >>the virtio-serial device takes up all the avl vectors in the guests,
>> >>leaving old-style irqs for the virtio-blk device.
>> >>
>> >I don't think so,
>> >I use iometer to test 64k-read(or write)-sequence case, if I disable the 
>> >virtio-serial dynamically via device manager->virtio-serial => disable,
>> >then the performance get promotion about 25% immediately, then I re-enable 
>> >the virtio-serial via device manager->virtio-serial => enable,
>> >the performance got back again, very obvious.
>> add comments:
>> Although the virtio-serial is enabled, I don't use it at all, the 
>> degradation still happened.
>
> Using the vectors= option as mentioned below, you can restrict the
> number of MSI vectors the virtio-serial device gets.  You can then
> confirm whether it's MSI that's related to these issues.
>
>> >So, I think it has no business with legacy interrupt mode, right?
>> >
>> >I am going to observe the difference of perf top data on qemu and perf kvm 
>> >stat data when disable/enable virtio-serial in guest,
>> >and the difference of perf top data on guest when disable/enable 
>> >virtio-serial in guest,
>> >any ideas?
>> >
>> >Thanks,
>> >Zhang Haoyu
>> >>If you restrict the number of vectors the virtio-serial device gets
>> >>(using the -device virtio-serial-pci,vectors= param), does that make
>> >>things better for you?
>
>
>
> Amit
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Can confirm serious degradation comparing to the 1.1 with regular
serial output  - I am able to hang VM forever after some tens of
seconds after continuously printing dmest to the ttyS0. VM just ate
all available CPU quota during test and hanged over some tens of
seconds, not even responding to regular pings and progressively
raising CPU consumption up to the limit.

Re: [Qemu-devel] [PATCH v6 02/27] bootindex: add del_boot_device_path function

2014-09-02 Thread Eduardo Habkost

On Mon, Sep 01, 2014 at 06:47:13AM +, Gonglei (Arei) wrote:
> > From: Gerd Hoffmann [mailto:kra...@redhat.com]
> > Sent: Monday, September 01, 2014 2:43 PM
> > Subject: Re: [PATCH v6 02/27] bootindex: add del_boot_device_path function
> > Importance: High
> > 
> >   Hi,
> > 
> > > +static bool is_same_fw_dev_path(DeviceState *src, DeviceState *dst)
> > > +{
> > > +bool ret = false;
> > > +char *devpath_src = qdev_get_fw_dev_path(src);
> > > +char *devpath_dst = qdev_get_fw_dev_path(dst);
> > > +
> > > +if (!strcmp(devpath_src, devpath_dst)) {
> > > +ret = true;
> > > +}
> > > +
> > > +g_free(devpath_src);
> > > +g_free(devpath_dst);
> > > +return ret;
> > > +}
> > > +
> > > +void del_boot_device_path(DeviceState *dev)
> > > +{
> > > +FWBootEntry *i;
> > > +
> > > +assert(dev != NULL);
> > > +
> > > +/* remove all entries of the assigned device */
> > > +QTAILQ_FOREACH(i, &fw_boot_order, link) {
> > > +if (i->dev == NULL) {
> > > +continue;
> > > +}
> > > +if ((i->dev == dev || is_same_fw_dev_path(i->dev, dev))) {
> > 
> > Why this is needed?  Is there any case where i-->dev != dev but
> > is_same_fw_dev_path() returns true?
> > 
> Yes, it is needed. At present, the virito-net-pci device 
> compliance with this situation.
> 
> Please see the qom path about virtio-net-pci and virtio-net device:
> 
> id: null, /machine/peripheral/nic1/virtio-backend
> id: nic1, /machine/peripheral/nic1

And why exactly is the caller passing a different pointer to
del_boot_device_path()? Why not simply change the caller to pass the
same pointer to add_boot_device_path() and del_boot_device_path()?

-- 
Eduardo

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03

2014-09-02 Thread Andrey Korolyov

On Tue, Sep 2, 2014 at 7:27 PM, Michael S. Tsirkin  wrote:
> On Tue, Sep 02, 2014 at 06:25:46PM +0300, Michael S. Tsirkin wrote:
>> On Tue, Sep 02, 2014 at 10:20:50AM -0500, Michael Roth wrote:
>> > Quoting Michael Roth (2014-08-27 12:35:57)
>> > > Hi everyone,
>> > >
>> > > The following new patches are queued for QEMU stable v2.1.1:
>> > >
>> > >   https://github.com/mdroth/qemu/commits/stable-2.1-staging
>> >
>> > As of now the following additional patches have been applied to
>> > the 2.1.1 staging tree (see stable commit for upstream commit ref):
>> >
>> >   a7f9ea2 qxl-render: add more sanity checks
>> >   1511595 target-arm: Correct Cortex-A57 ISAR5 and AA64ISAR0 ID register 
>> > values
>> >   44a1530 target-arm: Fix regression that disabled VFP for ARMv5 CPUs
>> >   95dcaa4 x86: Clear MTRRs on vCPU reset
>> >   7fd25d3 x86: kvm: Add MTRR support for kvm_get|put_msrs()
>> >   0f9c63b x86: Use common variable range MTRR counts
>> >   0c69595 target-i386: Don't forbid NX bit on PAE PDEs and PTEs
>> >   89713cb vl: process -object after other backend options
>> >   93dcbd3 spapr_pci: map the MSI window in each PHB
>> >
>> > The following patches have been Cc'd to qemu-stable and are still
>> > pending upstream commit/merge:
>> >
>> >   [PATCH] virtio-net: don't run bh on vm stopped (Michael S. Tsirkin)
>> > * pull just sent by Michael Tsirkin
>> >   [PATCH] net: prevent sending packets while guest is stopped (Stefan 
>> > Hajnoczi)
>> >   [PATCH v2 2/4] pci: Avoid losing config updates to MSI/MSIX cap regs 
>> > (Knut Omang)
>>
>>   this is also in that pull
>>
>> >   [PATCH V4] net: Forbid dealing with packets when VM is not running 
>> > (zhanghailiang)
>> > * looks like Stefan has this queued for net
>> >   [PATCH V2] acpi-build: Set FORCE_APIC_CLUSTER_MODEL bit for FADT flags 
>> > (zhanghailiang)
>>
>>   hmm I forgot to include that one. I'll redo the pull request.
>
> or maybe not
> I recalled why I didn't include it - I wanted someone else to review it
> first
>
> if it's ready, I'll send it separately tomorrow
>
>
>> >   [RESEND v2 0/3] Fix some memory leaks about query memdev (Chen Fan)
>> > * looks like Michael Tokarev has this queued for trivial
>> >   [PATCH] target-i386: Support migratable=no properly (Eduardo Habkost)
>> > * pinged
>> >   [PATCH 1/3] pc: Fix disabling of vapic for compat PC models (Jan Kiszka)
>> > * pinged
>> >   [PATCH] pty: Fix byte loss bug when connecting to pty (Sebastian Tanase)
>> > * pinged, presumably going through Gerd's tree
>> >
>> > Let me know if anything is missing.
>> >
>> > >
>> > > The release is planned for 2014-09-08:
>> > >
>> > >   http://wiki.qemu.org/Planning/2.1
>> > >
>> > > Please respond here or CC qemu-sta...@nongnu.org on any patches you
>> > > think should be included in the release.
>> > >
>> > > Testing/feedback is greatly appreciated.
>> > >
>> > > Thanks!
>> > >
>> > > 
>> > > Alex Williamson (1):
>> > >   vfio: Fix MSI-X vector expansion
>> > >
>> > > Ben Draper (1):
>> > >   vmxnet3: Pad short frames to minimum size (60 bytes)
>> > >
>> > > Christoffer Dall (2):
>> > >   target-arm: Rename QEMU PSCI v0.1 definitions
>> > >   arm/virt: Use PSCI v0.2 function IDs in the DT when KVM uses PSCI 
>> > > v0.2
>> > >
>> > > Fam Zheng (1):
>> > >   blkdebug: Delete BH in bdrv_aio_cancel
>> > >
>> > > Gonglei (1):
>> > >   pcihp: fix possible array out of bounds
>> > >
>> > > Hu Tao (3):
>> > >   hw:i386: typo fix: MEMORY_HOPTLUG_DEVICE -> MEMORY_HOTPLUG_DEVICE
>> > >   pc-dimm: validate node property
>> > >   numa: show hex number in error message for consistency and prefix 
>> > > them with 0x
>> > >
>> > > Jan Kiszka (1):
>> > >   pci: Use bus master address space for delivering MSI/MSI-X messages
>> > >
>> > > Michael S. Tsirkin (4):
>> > >   pc-dimm: fix up error message
>> > >   acpi: align RSDP
>> > >   hostmem: set MPOL_MF_MOVE
>> > >   pc: reserve more memory for ACPI for new machine types
>> > >
>> > > Michael Tokarev (2):
>> > >   l2tpv3 (configure): it is linux-specific
>> > >   ide: only constrain read/write requests to drive size, not other 
>> > > types
>> > >
>> > > Peter Lieven (1):
>> > >   block/iscsi: fix memory corruption on iscsi resize
>> > >
>> > > Peter Maydell (1):
>> > >   target-arm: Fix return address for A64 BRK instructions
>> > >
>> > > Stefan Hajnoczi (6):
>> > >   qmp: hide "hotplugged" device property from device-list-properties
>> > >   qdev-monitor: include QOM properties in -device FOO, help output
>> > >   raw-posix: fix O_DIRECT short reads
>> > >   qemu-iotests: add test case 101 for short file I/O
>> > >   thread-pool: avoid per-thread-pool EventNotifier
>> > >   thread-pool: avoid deadlock in nested aio_poll() calls
>> > >
>> > > zhanghailiang (1):
>> > >   virtio-blk: fix reference a pointer which might be f

[Qemu-devel] [Bug 1364501] [NEW] Gdb hangs when trying to single-step after an invalid instruction

2014-09-02 Thread martin

Public bug reported:

When using Gdb to remote-debug a program and manually setting its PC to
point to an address containing an invalid instruction and then doing a
single step, Qemu will never return control to the remote Gdb.

For instance, let's say address 0x114 contains an invalid instruction.
On the remote Gdb, we'd do:

(gdb) set $pc = 0x114
(gdb) stepi

After doing that we won't get the (gdb) prompt unless we do a Ctrl-C. If
we do so we'll be left at 0x114 instead of going towards the exception
handler as we should. This happens with stepi, step and next. If instead
of single-stepping we used continue, the program will proceed into the
exception handler as it should.

The reason this is happening is that when Qemu realizes it's about to
translate an instruction it doesn't recognize it'll generate a call to
helper_exception_with_syndrome(), which will register the exception and
then call cpu_loop_exit(). At the same time, because we're doing a
single-step, Qemu will also generate a call to
helper_exception_internal() passing it an EXCP_DEBUG, which lets the
system know it'll give control back to the remote debugger, and it also
ends with a call to cpu_loop_exit(). However, because the syndrome
exception calls cpu_loop_exit() first, the call to the internal
exception won't be reached and Qemu will be stuck in a loop without
returning control to the remote debugger.

What makes this a bit tricky to fix is that we must call cpu_loop_exit()
at the end of helper_exception_with_syndrome(), otherwise the target
exception will go undetected and its handler won't be excecuted.

Tested on latest head by emulating a Stellaris lm3s6965 board and
running RTEMS 4.11:

$ qemu-system-arm -nographic -s -S -M lm3s6965evb -kernel my_rtems_app

Commit hash in qemu.git: 30eaca3acdf17d7bcbd1213eb149c02037edfb0b

** Affects: qemu
 Importance: Undecided
 Status: New

** Description changed:

  When using Gdb to remote-debug a program and manually setting its PC to
- point to an address containing an invalid instruction, then doing a
- single step Qemu will never return control to the remote Gdb.
+ point to an address containing an invalid instruction and then doing a
+ single step, Qemu will never return control to the remote Gdb.
  
  For instance, let's say address 0x114 contains an invalid instruction.
  On the remote Gdb, we'd do:
  
  (gdb) set $pc = 0x114
  (gdb) stepi
  
  After doing that we won't get the (gdb) prompt unless we do a Ctrl-C. If
  we do so we'll be left at 0x114 instead of going towards the exception
  handler as we should. This happens with stepi, step and next. If instead
  of single-stepping we used continue, the program will proceed into the
  exception handler as it should.
  
  The reason this is happening is that when Qemu realizes it's about to
  translate an instruction it doesn't recognize it'll generate a call to
  helper_exception_with_syndrome(), which will register the exception and
  then call cpu_loop_exit(). At the same time, because we're doing a
  single-step, Qemu will also generate a call to
  helper_exception_internal() passing it an EXCP_DEBUG, which lets the
  system know it'll give control back to the remote debugger, and it also
  ends with a call to cpu_loop_exit(). However, because the syndrome
  exception calls cpu_loop_exit() first, the call to the internal
  exception won't be reached and Qemu will be stuck in a loop without
  returning control to the remote debugger.
  
  What makes this a bit tricky to fix is that we must call cpu_loop_exit()
  at the end of helper_exception_with_syndrome(), otherwise the target
  exception will go undetected and its handler won't be excecuted.
  
  Tested on latest head by emulating a Stellaris lm3s6965 board and
  running RTEMS 4.11:
  
  $ qemu-system-arm -nographic -s -S -M lm3s6965evb -kernel my_rtems_app
  
  Commit hash in qemu.git: 30eaca3acdf17d7bcbd1213eb149c02037edfb0b

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1364501

Title:
  Gdb hangs when trying to single-step after an invalid instruction

Status in QEMU:
  New

Bug description:
  When using Gdb to remote-debug a program and manually setting its PC
  to point to an address containing an invalid instruction and then
  doing a single step, Qemu will never return control to the remote Gdb.

  For instance, let's say address 0x114 contains an invalid instruction.
  On the remote Gdb, we'd do:

  (gdb) set $pc = 0x114
  (gdb) stepi

  After doing that we won't get the (gdb) prompt unless we do a Ctrl-C.
  If we do so we'll be left at 0x114 instead of going towards the
  exception handler as we should. This happens with stepi, step and
  next. If instead of single-stepping we used continue, the program will
  proceed into the exception handler as it should.

  The reason this is happening is that when Qemu realizes it's about to
  translate an instruction it does

Re: [Qemu-devel] [PATCH v2] qcow2: add update refcount table realization for update_refcount

2014-09-02 Thread Greg Kurz

On Mon,  1 Sep 2014 18:52:48 +0800
Jun Li  wrote:

> When every item of refcount block is NULL, free refcount block and reset the
> corresponding item of refcount table with NULL.
> 
> Signed-off-by: Jun Li 
> ---
> 
> The v2 do following change to modify some potential issue.
> 
>  +--- Here should start from "0".
>  |
> for (k = 0; k < refcount_block_entries; k++) {
> if (refcount_block[k] != cpu_to_be16(0)) {
> ...| |
> }  | |
> }  | + Using "0" is more safe.
>|
>+ This should be "k" not "++k".
> ---
>  block/qcow2-refcount.c | 31 +++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 43665b8..63f36e6 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -586,6 +586,37 @@ static int QEMU_WARN_UNUSED_RESULT 
> update_refcount(BlockDriverState *bs,
>  if (refcount == 0 && s->discard_passthrough[type]) {
>  update_refcount_discard(bs, cluster_offset, s->cluster_size);
>  }
> +
> +/* When refcount block is NULL, update refcount table */
> +if (block_index == 0) {
> +int k = block_index;

Hi,

k = 0 is also done in the for block below...

> +int refcount_block_entries = s->cluster_size / sizeof(uint16_t);

It's better for maintainance to count elements in an array this way:

int refcount_block_entries = s->cluster_size / sizeof(refcount_block[0]);

> +for (k = 0; k < refcount_block_entries; k++) {
> +if (refcount_block[k] != cpu_to_be16(0)) {
> +break;
> +}
> +}
> +
> +if (k == refcount_block_entries) {
> +qemu_vfree(refcount_block);
> +/* update refcount table */
> +unsigned int refcount_table_index;
> +uint64_t data64 = cpu_to_be64(0);
> +refcount_table_index = cluster_index >> (s->cluster_bits -
> +   REFCOUNT_SHIFT);
> +ret = bdrv_pwrite_sync(bs->file,
> +   s->refcount_table_offset +
> +   refcount_table_index *
> +   sizeof(uint64_t),
> +   &data64, sizeof(data64));
> +if (ret < 0) {
> +goto fail;
> +}
> +
> +s->refcount_table[refcount_table_index] = data64;
> +
> +}
> +}
>  }
> 
>  ret = 0;

Cheers.

-- 
Gregory Kurz kurzg...@fr.ibm.com
 gk...@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
Alan Moore.

Re: [Qemu-devel] [PATCH v4 07/33] target-arm: add non-secure Translation Block flag

2014-09-02 Thread Greg Bellows

Thanks Peter.  I'll fix these in v5.


On 2 September 2014 11:11, Peter Maydell  wrote:

> On 1 July 2014 00:09,   wrote:
> > From: Sergey Fedorov 
> >
> > This patch is based on idea found in patch at
> > git://github.com/jowinter/qemu-trustzone.git
> > f3d955c6c0ed8c46bc0eb10b634201032a651dd2 by
> > Johannes Winter .
> >
> > This flag prevents QEMU from executing TCG code generated for other CPU
> > security state. It also allows to generate different TCG code depending
> on
> > CPU secure state.
> >
> > Signed-off-by: Sergey Fedorov 
> > Signed-off-by: Fabian Aggeler 
> > Signed-off-by: Greg Bellows 
> > ---
> >  target-arm/cpu.h   | 10 ++
> >  target-arm/translate-a64.c |  1 +
> >  target-arm/translate.c |  3 +++
> >  target-arm/translate.h |  1 +
> >  4 files changed, 15 insertions(+)
> >
> > diff --git a/target-arm/cpu.h b/target-arm/cpu.h
> > index 1faf1e2..44e0943 100644
> > --- a/target-arm/cpu.h
> > +++ b/target-arm/cpu.h
> > @@ -1291,6 +1291,8 @@ static inline int cpu_mmu_index (CPUARMState *env)
> >  #define ARM_TBFLAG_BSWAP_CODE_MASK  (1 << ARM_TBFLAG_BSWAP_CODE_SHIFT)
> >  #define ARM_TBFLAG_CPACR_FPEN_SHIFT 17
> >  #define ARM_TBFLAG_CPACR_FPEN_MASK  (1 << ARM_TBFLAG_CPACR_FPEN_SHIFT)
> > +#define ARM_TBFLAG_NS_SHIFT 18
> > +#define ARM_TBFLAG_NS_MASK  (1 << ARM_TBFLAG_NS_SHIFT)
> >
> >  /* Bit usage when in AArch64 state */
> >  #define ARM_TBFLAG_AA64_EL_SHIFT0
> > @@ -1321,6 +1323,8 @@ static inline int cpu_mmu_index (CPUARMState *env)
> >  (((F) & ARM_TBFLAG_AA64_EL_MASK) >> ARM_TBFLAG_AA64_EL_SHIFT)
> >  #define ARM_TBFLAG_AA64_FPEN(F) \
> >  (((F) & ARM_TBFLAG_AA64_FPEN_MASK) >> ARM_TBFLAG_AA64_FPEN_SHIFT)
> > +#define ARM_TBFLAG_NS(F) \
> > +(((F) & ARM_TBFLAG_NS_MASK) >> ARM_TBFLAG_NS_SHIFT)
> >
> >  static inline void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong
> *pc,
> >  target_ulong *cs_base, int
> *flags)
> > @@ -1334,6 +1338,9 @@ static inline void
> cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
> >  if (fpen == 3 || (fpen == 1 && arm_current_pl(env) != 0)) {
> >  *flags |= ARM_TBFLAG_AA64_FPEN_MASK;
> >  }
> > +if (!arm_is_secure(env)) {
> > +*flags |= ARM_TBFLAG_NS_MASK;
> > +}
> >  } else {
> >  int privmode;
> >  *pc = env->regs[15];
> > @@ -1350,6 +1357,9 @@ static inline void
> cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
> >  if (privmode) {
> >  *flags |= ARM_TBFLAG_PRIV_MASK;
> >  }
> > +if (!arm_is_secure(env)) {
> > +*flags |= ARM_TBFLAG_NS_MASK;
> > +}
>
> You can't share the same TBFLAG between the AArch64
> and AArch32 tb flags like this -- they have different layouts
> of the bits in the flags word.
>
> >  if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)
> >  || arm_el_is_aa64(env, 1)) {
> >  *flags |= ARM_TBFLAG_VFPEN_MASK;
> > diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
> > index 446d2cd..ad30903 100644
> > --- a/target-arm/translate-a64.c
> > +++ b/target-arm/translate-a64.c
> > @@ -10879,6 +10879,7 @@ void gen_intermediate_code_internal_a64(ARMCPU
> *cpu,
> >  dc->condexec_cond = 0;
> >  #if !defined(CONFIG_USER_ONLY)
> >  dc->user = (ARM_TBFLAG_AA64_EL(tb->flags) == 0);
> > +dc->ns = ARM_TBFLAG_NS(tb->flags);
> >  #endif
> >  dc->cpacr_fpen = ARM_TBFLAG_AA64_FPEN(tb->flags);
> >  dc->vec_len = 0;
> > diff --git a/target-arm/translate.c b/target-arm/translate.c
> > index cf4e767..bf17952 100644
> > --- a/target-arm/translate.c
> > +++ b/target-arm/translate.c
> > @@ -53,8 +53,10 @@ static uint32_t gen_opc_condexec_bits[OPC_BUF_SIZE];
> >
> >  #if defined(CONFIG_USER_ONLY)
> >  #define IS_USER(s) 1
> > +#define IS_NS(s) 1
> >  #else
> >  #define IS_USER(s) (s->user)
> > +#define IS_NS(s) (s->ns)
> >  #endif
> >
> >  TCGv_ptr cpu_env;
> > @@ -10904,6 +10906,7 @@ static inline void
> gen_intermediate_code_internal(ARMCPU *cpu,
> >  dc->condexec_cond = ARM_TBFLAG_CONDEXEC(tb->flags) >> 4;
> >  #if !defined(CONFIG_USER_ONLY)
> >  dc->user = (ARM_TBFLAG_PRIV(tb->flags) == 0);
> > +dc->ns = ARM_TBFLAG_NS(tb->flags);
> >  #endif
> >  dc->cpacr_fpen = ARM_TBFLAG_CPACR_FPEN(tb->flags);
> >  dc->vfp_enabled = ARM_TBFLAG_VFPEN(tb->flags);
> > diff --git a/target-arm/translate.h b/target-arm/translate.h
> > index 31a0104..6e8620a 100644
> > --- a/target-arm/translate.h
> > +++ b/target-arm/translate.h
> > @@ -19,6 +19,7 @@ typedef struct DisasContext {
> >  int bswap_code;
> >  #if !defined(CONFIG_USER_ONLY)
> >  int user;
> > +int ns;
> >  #endif
>
> Please don't follow the way the "user" flag is done, it's
> weird and something I might eventually tidy up. Just have
> a "bool ns" in the DisasContext (not inside the #if), and
> set it to true if CONFIG_USER_ONLY is set. Then you
> don't need to indirect via an IS_

Re: [Qemu-devel] [PATCH v4 01/33] target-arm: add cpu feature EL3 to CPUs with Security Extensions

2014-09-02 Thread Peter Maydell

On 1 July 2014 00:09,   wrote:
> From: Fabian Aggeler 
>
> Set ARM_FEATURE_EL3 feature for CPUs that implement Security Extensions.
>
> Signed-off-by: Fabian Aggeler 
> Signed-off-by: Greg Bellows 

When this patchset is eventually ready to commit, this
patch needs to be the last one, so we don't enable the
EL3 support until it's complete. (Obviously for testing
purposes it's handy to have it at the front for now though.)

-- PMM

[Qemu-devel] [PATCH] xen-hvm.c: Improve the return method for xen_hvm_init()

2014-09-02 Thread Chen Gang

When failure occurs, it need use "return -1" instead of exit(1), so can
let upper caller has chance to print failure information, too, then user
can know the failure result more clearly.

xen_hvm_init() may also return -errno, which may let upper caller think
more (e.g. free some other related resources and try again), although at
present, all related upper callers still exit(1).

It is not a normal function which does not release related resources, if
return -1. So need give the related comments for it.

It passes common test under fedora 20 x86_64:

  "./configure --enable-xen && make -j4 && make check"
  execute result: "echo $? == 0".

Signed-off-by: Chen Gang 
---
 xen-hvm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/xen-hvm.c b/xen-hvm.c
index 0d09940..35efec0 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -978,6 +978,7 @@ static void xen_wakeup_notifier(Notifier *notifier, void 
*data)
 xc_set_hvm_param(xen_xc, xen_domid, HVM_PARAM_ACPI_S_STATE, 0);
 }
 
+/* return value:  0 is OK, -errno is failure, -1 is critical issue -- exit(1) 
*/
 int xen_hvm_init(ram_addr_t *below_4g_mem_size, ram_addr_t *above_4g_mem_size,
  MemoryRegion **ram_memory)
 {
@@ -998,6 +999,7 @@ int xen_hvm_init(ram_addr_t *below_4g_mem_size, ram_addr_t 
*above_4g_mem_size,
 state->xenstore = xs_daemon_open();
 if (state->xenstore == NULL) {
 perror("xen: xenstore open");
+xc_evtchn_close(state->xce_handle);
 g_free(state);
 return -errno;
 }
@@ -1069,7 +1071,7 @@ int xen_hvm_init(ram_addr_t *below_4g_mem_size, 
ram_addr_t *above_4g_mem_size,
 /* Initialize backend core & drivers */
 if (xen_be_init() != 0) {
 fprintf(stderr, "%s: xen backend core setup failed\n", __FUNCTION__);
-exit(1);
+return -1;
 }
 xen_be_register("console", &xen_console_ops);
 xen_be_register("vkbd", &xen_kbdmouse_ops);
-- 
1.9.3

Re: [Qemu-devel] [PATCH v4 07/33] target-arm: add non-secure Translation Block flag

2014-09-02 Thread Peter Maydell

On 1 July 2014 00:09,   wrote:
> From: Sergey Fedorov 
>
> This patch is based on idea found in patch at
> git://github.com/jowinter/qemu-trustzone.git
> f3d955c6c0ed8c46bc0eb10b634201032a651dd2 by
> Johannes Winter .
>
> This flag prevents QEMU from executing TCG code generated for other CPU
> security state. It also allows to generate different TCG code depending on
> CPU secure state.
>
> Signed-off-by: Sergey Fedorov 
> Signed-off-by: Fabian Aggeler 
> Signed-off-by: Greg Bellows 
> ---
>  target-arm/cpu.h   | 10 ++
>  target-arm/translate-a64.c |  1 +
>  target-arm/translate.c |  3 +++
>  target-arm/translate.h |  1 +
>  4 files changed, 15 insertions(+)
>
> diff --git a/target-arm/cpu.h b/target-arm/cpu.h
> index 1faf1e2..44e0943 100644
> --- a/target-arm/cpu.h
> +++ b/target-arm/cpu.h
> @@ -1291,6 +1291,8 @@ static inline int cpu_mmu_index (CPUARMState *env)
>  #define ARM_TBFLAG_BSWAP_CODE_MASK  (1 << ARM_TBFLAG_BSWAP_CODE_SHIFT)
>  #define ARM_TBFLAG_CPACR_FPEN_SHIFT 17
>  #define ARM_TBFLAG_CPACR_FPEN_MASK  (1 << ARM_TBFLAG_CPACR_FPEN_SHIFT)
> +#define ARM_TBFLAG_NS_SHIFT 18
> +#define ARM_TBFLAG_NS_MASK  (1 << ARM_TBFLAG_NS_SHIFT)
>
>  /* Bit usage when in AArch64 state */
>  #define ARM_TBFLAG_AA64_EL_SHIFT0
> @@ -1321,6 +1323,8 @@ static inline int cpu_mmu_index (CPUARMState *env)
>  (((F) & ARM_TBFLAG_AA64_EL_MASK) >> ARM_TBFLAG_AA64_EL_SHIFT)
>  #define ARM_TBFLAG_AA64_FPEN(F) \
>  (((F) & ARM_TBFLAG_AA64_FPEN_MASK) >> ARM_TBFLAG_AA64_FPEN_SHIFT)
> +#define ARM_TBFLAG_NS(F) \
> +(((F) & ARM_TBFLAG_NS_MASK) >> ARM_TBFLAG_NS_SHIFT)
>
>  static inline void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>  target_ulong *cs_base, int *flags)
> @@ -1334,6 +1338,9 @@ static inline void cpu_get_tb_cpu_state(CPUARMState 
> *env, target_ulong *pc,
>  if (fpen == 3 || (fpen == 1 && arm_current_pl(env) != 0)) {
>  *flags |= ARM_TBFLAG_AA64_FPEN_MASK;
>  }
> +if (!arm_is_secure(env)) {
> +*flags |= ARM_TBFLAG_NS_MASK;
> +}
>  } else {
>  int privmode;
>  *pc = env->regs[15];
> @@ -1350,6 +1357,9 @@ static inline void cpu_get_tb_cpu_state(CPUARMState 
> *env, target_ulong *pc,
>  if (privmode) {
>  *flags |= ARM_TBFLAG_PRIV_MASK;
>  }
> +if (!arm_is_secure(env)) {
> +*flags |= ARM_TBFLAG_NS_MASK;
> +}

You can't share the same TBFLAG between the AArch64
and AArch32 tb flags like this -- they have different layouts
of the bits in the flags word.

>  if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)
>  || arm_el_is_aa64(env, 1)) {
>  *flags |= ARM_TBFLAG_VFPEN_MASK;
> diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
> index 446d2cd..ad30903 100644
> --- a/target-arm/translate-a64.c
> +++ b/target-arm/translate-a64.c
> @@ -10879,6 +10879,7 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
>  dc->condexec_cond = 0;
>  #if !defined(CONFIG_USER_ONLY)
>  dc->user = (ARM_TBFLAG_AA64_EL(tb->flags) == 0);
> +dc->ns = ARM_TBFLAG_NS(tb->flags);
>  #endif
>  dc->cpacr_fpen = ARM_TBFLAG_AA64_FPEN(tb->flags);
>  dc->vec_len = 0;
> diff --git a/target-arm/translate.c b/target-arm/translate.c
> index cf4e767..bf17952 100644
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -53,8 +53,10 @@ static uint32_t gen_opc_condexec_bits[OPC_BUF_SIZE];
>
>  #if defined(CONFIG_USER_ONLY)
>  #define IS_USER(s) 1
> +#define IS_NS(s) 1
>  #else
>  #define IS_USER(s) (s->user)
> +#define IS_NS(s) (s->ns)
>  #endif
>
>  TCGv_ptr cpu_env;
> @@ -10904,6 +10906,7 @@ static inline void 
> gen_intermediate_code_internal(ARMCPU *cpu,
>  dc->condexec_cond = ARM_TBFLAG_CONDEXEC(tb->flags) >> 4;
>  #if !defined(CONFIG_USER_ONLY)
>  dc->user = (ARM_TBFLAG_PRIV(tb->flags) == 0);
> +dc->ns = ARM_TBFLAG_NS(tb->flags);
>  #endif
>  dc->cpacr_fpen = ARM_TBFLAG_CPACR_FPEN(tb->flags);
>  dc->vfp_enabled = ARM_TBFLAG_VFPEN(tb->flags);
> diff --git a/target-arm/translate.h b/target-arm/translate.h
> index 31a0104..6e8620a 100644
> --- a/target-arm/translate.h
> +++ b/target-arm/translate.h
> @@ -19,6 +19,7 @@ typedef struct DisasContext {
>  int bswap_code;
>  #if !defined(CONFIG_USER_ONLY)
>  int user;
> +int ns;
>  #endif

Please don't follow the way the "user" flag is done, it's
weird and something I might eventually tidy up. Just have
a "bool ns" in the DisasContext (not inside the #if), and
set it to true if CONFIG_USER_ONLY is set. Then you
don't need to indirect via an IS_NS() macro, you can just
use s->ns directly.

>  bool cpacr_fpen; /* FP enabled via CPACR.FPEN */
>  bool vfp_enabled; /* FP enabled via FPSCR.EN */
> --

thanks
-- PMM

Re: [Qemu-devel] [PULL 13/13] vhost_net: start/stop guest notifiers properly

2014-09-02 Thread Michael S. Tsirkin

On Tue, Sep 02, 2014 at 05:47:54PM +0200, William Dauchy wrote:
> On Sep02 18:07, Michael S. Tsirkin wrote:
> > From: Jason Wang 
> > 
> > commit a9f98bb5ebe6fb1869321dcc58e72041ae626ad8 vhost: multiqueue
> > support changed the order of stopping the device. Previously
> > vhost_dev_stop would disable backend and only afterwards, unset guest
> > notifiers. We now unset guest notifiers while vhost is still
> > active. This can lose interrupts causing guest networking to fail. In
> > particular, this has been observed during migration.
> > 
> > To adapt this, several other changes are needed:
> > - remove the hdev->started assertion in vhost.c since we may want to
> > start the guest notifiers before vhost starts and stop the guest
> > notifiers after vhost is stopped.
> > - introduce the vhost_net_set_vq_index() and call it before setting
> > guest notifiers. This is used to guarantee vhost_net has the correct
> > virtqueue index when setting guest notifiers.
> > 
> > Cc: qemu-sta...@nongnu.org
> > Reported-by: "Zhangjie (HZ)" 
> > Tested-by: William Dauchy 
> 
> please use:
> 
> Tested-by: William Dauchy 

It's a pull request so not easy to fix.
I'll do it like that next time.

> > Signed-off-by: Michael S. Tsirkin 
> > Signed-off-by: Jason Wang 
> > Reviewed-by: Michael S. Tsirkin 
> > Signed-off-by: Michael S. Tsirkin 
> 
> Thanks,
> -- 
> William

1 2 3 >

1 - 100 of 268 matches

Mail list logo