Re: [Qemu-devel] [PATCH v7 14/31] qapi: Drop unused error argument for list and implicit struct

2015-12-08 Thread David Gibson
On Mon, Dec 07, 2015 at 08:55:04PM -0700, Eric Blake wrote:
> No backend was setting an error when ending the visit of a list
> or implicit struct.  Make the callers a bit easier to follow by
> making this a part of the contract, and removing the errp
> argument - callers can then unconditionally end an object as
> part of cleanup without having to think about whether a second
> error is dominated by a first, because there is no second error.
> 
> A later patch will then tackle the larger task of splitting
> visit_end_struct(), which can indeed set an error.
> 
> Signed-off-by: Eric Blake 

For spapr parts:

Acked-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v7 13/31] qapi: Drop unused 'kind' for struct/enum visit

2015-12-08 Thread David Gibson
On Mon, Dec 07, 2015 at 08:55:03PM -0700, Eric Blake wrote:
> visit_start_struct() and visit_type_enum() had a 'kind' argument
> that was usually set to either the stringized version of the
> corresponding qapi type name, or to NULL (although some clients
> didn't even get that right).  But nothing ever used the argument.
> It's even hard to argue that it would be useful in a debugger,
> as a stack backtrace also tells which type is being visited.
> 
> Therefore, drop the 'kind' argument as dead.  While at it, change
> the signature of visit_start_struct() to place the 'name'
> argument at the end (other than 'errp'), and the 'size' argument
> next to 'obj'; this placement of 'name' matches matches how all
> other functions in visit.h do it (visit_type_enum() places
> 'strings' between 'obj' and 'name'; visit_get_next_type() places
> 'promote_int' between 'type' and 'name').  This also avoids the
> confusion caused by splitting related pieces of information,
> where the old signature an unrelated parameter in between the
> "typename" and sizeof(typename) arguments.
> 
> Signed-off-by: Eric Blake 

For spapr parts:

Acked-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [ANNOUNCE] QEMU 2.5.0-rc3 is now available

2015-12-08 Thread Peter Maydell
On 8 December 2015 at 00:16, Michael Roth  wrote:
> Hello,
>
> On behalf of the QEMU Team, I'd like to announce the availability of the
> fourth release candidate for the QEMU 2.5 release.  This release is meant
> for testing purposes and should not be used in a production environment.
>
>   http://wiki.qemu.org/download/qemu-2.5.0-rc3.tar.bz2
>
> You can help improve the quality of the QEMU 2.5 release by testing this
> release and reporting bugs on Launchpad:
>
>   https://bugs.launchpad.net/qemu/
>
> The release plan for the 2.5 release is available at:
>
>   http://wiki.qemu.org/Planning/2.5

My plan here is for us to fix only any absolutely showstopper
bugs between here and release, so ideally an rc4 (if needed)
at the end of the week and final release same-as-rc4 middle
of the week after that.

Please make sure any such showstoppers are listed in the
Planning page on the wiki so we don't overlook them.

thanks
-- PMM



Re: [Qemu-devel] [Qemu-block] [PATCH for-2.5?] qcow2: always initialize specific image info

2015-12-08 Thread Kevin Wolf
Am 07.12.2015 um 20:44 hat Max Reitz geschrieben:
> I'd be completely fine with adding an "else { abort(); }" branch to
> qcow2_get_specific_info().

This is actually what I was going to suggest, too. Of course, it's not
supposed to fix anything now, but defensive coding has never hurt.

Kevin


pgp4xqUwG7kEK.pgp
Description: PGP signature


[Qemu-devel] QEMU/KVM performance gets worser - high load - high interrupts - high context switches

2015-12-08 Thread Gerhard Wiesinger

Hello,

Yesterday I looked at my munin statistics on my KVM host and I swar that 
performance gets worser: load is getting higher, interrupts are getting 
higher and are high as well as context switches. VMs and applications 
didn't change that way.


You can find graphics at: http://www.wiesinger.com/tmp/kvm/
Last spike I guess was upgrade from FC22 to FC23 or a kernel update. And 
it was even lower on older versions


For me it looks like the high interrupt load and context switches are 
the root cause. Interrupts inside the VM are <100, so with 10 VMs I'm 
expecting 1000+baseload => <2000, see statistics below.


All VMs are virtio on disk/network except one (IDE/rtl8139).

# Host as well as all guests (except 2 VMs):
uname -a
Linux kvm 4.2.6-301.fc23.x86_64 #1 SMP Fri Nov 20 22:22:41 UTC 2015 
x86_64 x86_64 x86_64 GNU/Linux


qemu-system-x86-2.4.1-1.fc23.x86_64

Platform:

All VMs have the pc-i440fx-2.4 profile (I upgraded yesterday from 
pc-i440fx-2.3 without any change).


Any ideas, anyone having same issues?

Ciao,
Gerhard

kvm: no VM running
 r  b   swpd   free   buff  cache   si   sobibo in   cs us sy 
id wa st
 0  0  0 3308516 102408 379856800 012 197  679  0  
0 99  0  0
 0  0  0 3308516 102416 379856400 042 197  914  0  
0 99  1  0
 0  0  0 3308516 102416 379856800 0 0 190  791  0  
0 100  0  0
 2  0  0 3308484 102416 379856800 0 0 129  440  0  
0 100  0  0


kvm: 2 VMs running
 procs ---memory-- ---swap-- -io -system-- 
--cpu-
 r  b   swpd   free   buff  cache   si   sobibo in   cs us sy 
id wa st
 1  0  0 2641464 103052 381470000 0 0 2715 5648  3  
2 95  0  0
 0  0  0 2641340 103052 381470000 0 0 2601   1  
2 97  0  0
 1  0  0 2641308 103052 381470000 0 5 2687 5708  3  
2 95  0  0
 0  0  0 2640620 103060 381462800 030 2779 5756  4  
3 93  1  0
 0  0  0 2640644 103060 381463600 0 0 2436 5364  1  
2 97  0  0
 1  0  0 2640520 103060 381463600 0   119 2734 5975  3  
2 95  0  0


kvm: all 10 VMs running
procs ---memory-- ---swap-- -io -system-- 
--cpu-
 r  b   swpd   free   buff  cache   si   sobibo in   cs us sy 
id wa st
 1  0  0  60408  78892 337198400 085 9015 17357  4  
9 87  0  0
 2  0  0  60408  78892 337196800 047 9375 17797  9  
9 82  0  0
 0  0  0  60472  78892 3372092004060 8882 17343  4  
8 86  1  0
 1  0  0  60316  78892 337208000 059 8863 17517  4  
8 87  0  0
 0  0  0  59540  78900 337209200 055 9135 17796  8  
9 81  1  0
 0  0  0  59168  78900 337211200 051 8931 17484  4  
9 87  0  0


cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 15
model name  : Intel(R) Core(TM)2 Quad CPU   @ 2.66GHz
stepping: 7




Re: [Qemu-devel] [PATCH] virtio-blk: Drop x-data-plane option

2015-12-08 Thread Cornelia Huck
On Tue, 8 Dec 2015 09:56:14 +0800
Fam Zheng  wrote:

> On Mon, 12/07 21:02, Fam Zheng wrote:
> > On Mon, 12/07 12:29, Cornelia Huck wrote:
> > > No general objection to removing x-data-plane; but this probably wants
> > > a mention on the changelog as x-data-plane has been described in
> > > various howtos etc. over the years.
> 
> Add a changelog line,
> 
> http://wiki.qemu.org/ChangeLog/2.5#Block_devices_and_tools
> 
> please review.

Looks sane, although I'd probably add a line to "Incompatible changes"
as well.




Re: [Qemu-devel] [PATCH v7 31/31] RFC: qapi: Adjust layout of FooList types

2015-12-08 Thread David Gibson
On Mon, Dec 07, 2015 at 08:55:21PM -0700, Eric Blake wrote:
> By sticking the next pointer first, we don't need a union with
> 64-bit padding for smaller types.  On 32-bit platforms, this
> can reduce the size of uint8List from 16 bytes (or 12, depending
> on whether 64-bit ints can tolerate 4-byte alignment) down to 8.
> It has no effect on 64-bit platforms (where alignment still
> dictates a 16-byte struct); but fewer anonymous unions is still
> a win in my book.
> 
> However, this requires visit_start_list() and visit_next_list()
> to gain a size parameter, to know what size element to allocate.
> 
> I debated about going one step further, to allow for fewer casts,
> by doing:
> typedef GenericList GenericList;
> struct GenericList {
> GenericList *next;
> };
> struct FooList {
> GenericList base;
> Foo value;
> };
> so that you convert to 'GenericList *' by '>base', and
> back by 'container_of(generic, GenericList, base)' (as opposed to
> the existing '(GenericList *)foolist' and '(FooList *)generic').
> But doing that would require hoisting the declaration of
> GenericList prior to inclusion of qapi-types.h, rather than its
> current spot in visitor.h; it also makes iteration a bit more
> verbose through 'foolist->base.next' instead of 'foolist->next'.
> 
> Signed-off-by: Eric Blake 

For the spapr change

Acked-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v7 20/31] spapr_drc: Expose 'null' in qom-get when there is no fdt

2015-12-08 Thread David Gibson
On Mon, Dec 07, 2015 at 08:55:10PM -0700, Eric Blake wrote:
> Now that the QMP output visitor supports an explicit null
> output, we should utilize it to make it easier to diagnose
> the difference between a missing fdt vs. a present-but-empty
> one.
> 
> (Note that this reverts the behavior of commit ab8bf1d, taking
> us back to the behavior of commit 1d10b44; but that this time,
> the change is intentional and not an accidental side-effect.)
> 
> Signed-off-by: Eric Blake 
> Cc: David Gibson 

Acked-by: David Gibson 

> 
> ---
> v7: new patch, based on discussion about spapr_drc.c
> ---
>  hw/ppc/spapr_drc.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index dcce563..0c675ff 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -259,11 +259,7 @@ static void prop_get_fdt(Object *obj, Visitor *v, void 
> *opaque,
>  void *fdt;
> 
>  if (!drc->fdt) {
> -visit_start_struct(v, NULL, 0, name, );
> -if (!err) {
> -visit_end_struct(v, );
> -}
> -error_propagate(errp, err);
> +visit_type_null(v, NULL, errp);
>  return;
>  }
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v7 28/31] qapi: Split visit_end_struct() into pieces

2015-12-08 Thread David Gibson
On Mon, Dec 07, 2015 at 08:55:18PM -0700, Eric Blake wrote:
> As mentioned in previous patches, we want to call visit_end_struct()
> functions unconditionally, so that visitors can release resources
> tied up since the matching visit_start_struct() without also having
> to worry about error priority if more than one error occurs.
> 
> Even though error_propagate() can be safely used to ignore a second
> error during cleanup caused by a first error, it is simpler if the
> cleanup cannot set an error, and we instead split the task of
> checking that an input visitor has no unvisited input as a new
> function visit_check_struct(), called only if all prior steps are
> successful.
> 
> Generated code has diffs resembling:
> 
> |@@ -59,10 +59,12 @@ void visit_type_ACPIOSTInfo(Visitor *v,
> | goto out_obj;
> | }
> | visit_type_ACPIOSTInfo_fields(v, obj, );
> |+if (err) {
> |+goto out_obj;
> |+}
> |+visit_check_struct(v, );
> | out_obj:
> |-error_propagate(errp, err);
> |-err = NULL;
> |-visit_end_struct(v, );
> |+visit_end_struct(v);
> | out:
> 
> Signed-off-by: Eric Blake 

For spapr parts:

Acked-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v7 29/31] qapi: Simplify semantics of visit_next_list()

2015-12-08 Thread David Gibson
On Mon, Dec 07, 2015 at 08:55:19PM -0700, Eric Blake wrote:
> We have two uses of list visits in the entire code base: one in
> spapr_drc (which completely avoids visit_next_list(), feeding in
> integers from a different source than uint8List), and one in
> qapi-visit.py (that is, all other list visitors are generated
> in qapi-visit.c, and share the same paradigm based on a qapi
> FooList type).  What's more, the semantics of the list visit are
> somewhat baroque, with the following pseudocode when FooList is
> used:
> 
> start()
> prev = head
> while (cur = next(prev)) {
> visit(cur)
> prev = 
> }
> 
> Note that these semantics (advance before visit) requires that
> the first call to next() return the list head, while all other
> calls return the next element of the list; that is, every visitor
> implementation is required to track extra state to decide whether
> to return the input as-is, or to advance.  It also requires an
> argument of 'GenericList **' to next(), solely because the first
> iteration might need to modify the caller's GenericList head, so
> that all other calls have to do a layer of dereferencing.
> 
> We can greatly simplify things by hoisting the special case
> into the start() routine, and flipping the order in the loop
> to visit before advance:
> 
> start(head)
> element = *head
> while (element) {
> visit(element)
> element = next(element)
> }
> 
> With the simpler semantics, visitors have less state to track,
> the argument to next() is reduced to 'GenericList *', and it
> also becomes obvious whether an input visitor is allocating a
> FooList during visit_start_list() (rather than the old way of
> not knowing if an allocation happened until the first
> visit_next_list()).
> 
> The spapr_drc case requires that visit_start_list() has to pay
> attention to whether visit_next_list() will even be used to
> visit a FooList qapi struct; this is done by passing NULL for
> list, similarly to how NULL is passed to visit_start_struct()
> when a qapi type is not used in those visits.  It was easy to
> provide these semantics for qmp-output and dealloc visitors,
> and a bit harder for qmp-input (it required hoisting the
> advance of the current qlist entry out of qmp_input_next_list()
> into qmp_input_get_object()).  But it turned out that the
> string and opts visitors munge enough state during
> visit_next_list() to make those conversions simpler if they
> require a GenericList visit for now; an assertion will remind
> us to adjust things if we need the semantics in the future.
> 
> Signed-off-by: Eric Blake 

For the spapr change:

Acked-by: David Gibson 
-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v9 6/6] tests/guest-debug: introduce basic gdbstub tests

2015-12-08 Thread Alex Bennée

Peter Maydell  writes:

> On 12 November 2015 at 16:20, Alex Bennée  wrote:
>> From: Alex Bennée 
>>
>> The aim of these tests is to combine with an appropriate kernel
>> image (with symbol-file vmlinux) and check it behaves as it should.
>> Given a kernel it checks:
>>
>>   - single step
>>   - software breakpoint
>>   - hardware breakpoint
>>   - access, read and write watchpoints
>>
>> On success it returns 0 to the calling process.
>>
>> I've not plumbed this into the "make check" logic though as we need a
>> solution for providing non-host binaries to the tests. However the test
>> is structured to work with pretty much any Linux kernel image as it
>> uses the basic kernel_init code which is common across architectures.
>
> Do these tests pass if you run them on the TCG QEMU, just out
> of interest?

You'll be glad to know they do.

> I'm not a great fan of tests that aren't in 'make check'
> because IME they just bitrot, but as you say we have no
> sensible approach for handling tests that need to run real
> guest code :-(

I was pondering if a git sub-project with large file support would work.
We could add pre-built binaries to the tree with appropriate meta-data
(src tree, version, config) to rebuild if required.

There would be some degree of trust implied in the original builder
though. Maybe a signed commit?

>
> thanks
> -- PMM


--
Alex Bennée



[Qemu-devel] [v3 1/3] cutils: add avx2 instruction optimization

2015-12-08 Thread Liang Li
buffer_find_nonzero_offset() is a hot function during live migration.
Now it use SSE2 intructions for optimization. For platform supports
AVX2 instructions, use the AVX2 instructions for optimization can help
to improve the performance about 30% comparing to SSE2.
Zero page check can be faster with this optimization, the test result
shows that for an 8GB RAM idle guest, this patch can help to shorten
the total live migration time about 6%.

This patch use the ifunc mechanism to select the proper function when
running, for platform supports AVX2, excute the AVX2 instructions,
else, excute the original code.

Signed-off-by: Liang Li 
---
 include/qemu-common.h   | 13 +-
 util/Makefile.objs  |  2 ++
 util/buffer-zero-avx2.c | 54 
 util/cutils.c   | 65 +++--
 4 files changed, 125 insertions(+), 9 deletions(-)
 create mode 100644 util/buffer-zero-avx2.c

diff --git a/include/qemu-common.h b/include/qemu-common.h
index 405364f..be8ba79 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -484,15 +484,14 @@ void qemu_hexdump(const char *buf, FILE *fp, const char 
*prefix, size_t size);
 #endif
 
 #define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
-static inline bool
-can_use_buffer_find_nonzero_offset(const void *buf, size_t len)
-{
-return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
-   * sizeof(VECTYPE)) == 0
-&& ((uintptr_t) buf) % sizeof(VECTYPE) == 0);
-}
+bool can_use_buffer_find_nonzero_offset(const void *buf, size_t len);
 size_t buffer_find_nonzero_offset(const void *buf, size_t len);
 
+#if defined CONFIG_IFUNC && defined CONFIG_AVX2
+bool can_use_buffer_find_nonzero_offset_avx2(const void *buf, size_t len);
+size_t buffer_find_nonzero_offset_avx2(const void *buf, size_t len);
+#endif
+
 /*
  * helper to parse debug environment variables
  */
diff --git a/util/Makefile.objs b/util/Makefile.objs
index 89dd80e..a130b35 100644
--- a/util/Makefile.objs
+++ b/util/Makefile.objs
@@ -1,4 +1,5 @@
 util-obj-y = osdep.o cutils.o unicode.o qemu-timer-common.o
+util-obj-$(CONFIG_AVX2) += buffer-zero-avx2.o
 util-obj-$(CONFIG_POSIX) += compatfd.o
 util-obj-$(CONFIG_POSIX) += event_notifier-posix.o
 util-obj-$(CONFIG_POSIX) += mmap-alloc.o
@@ -30,3 +31,4 @@ util-obj-y += qemu-coroutine-sleep.o
 util-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o
 util-obj-y += buffer.o
 util-obj-y += timed-average.o
+buffer-zero-avx2.o-cflags  := $(AVX2_CFLAGS)
diff --git a/util/buffer-zero-avx2.c b/util/buffer-zero-avx2.c
new file mode 100644
index 000..b9da0e3
--- /dev/null
+++ b/util/buffer-zero-avx2.c
@@ -0,0 +1,54 @@
+#include "qemu-common.h"
+
+#if defined CONFIG_IFUNC && defined CONFIG_AVX2
+#include 
+#define AVX2_VECTYPE__m256i
+#define AVX2_SPLAT(p)   _mm256_set1_epi8(*(p))
+#define AVX2_ALL_EQ(v1, v2) \
+(_mm256_movemask_epi8(_mm256_cmpeq_epi8(v1, v2)) == 0x)
+#define AVX2_VEC_OR(v1, v2) (_mm256_or_si256(v1, v2))
+
+inline bool
+can_use_buffer_find_nonzero_offset_avx2(const void *buf, size_t len)
+{
+return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
+   * sizeof(AVX2_VECTYPE)) == 0
+&& ((uintptr_t) buf) % sizeof(AVX2_VECTYPE) == 0);
+}
+
+size_t buffer_find_nonzero_offset_avx2(const void *buf, size_t len)
+{
+const AVX2_VECTYPE *p = buf;
+const AVX2_VECTYPE zero = (AVX2_VECTYPE){0};
+size_t i;
+
+assert(can_use_buffer_find_nonzero_offset_avx2(buf, len));
+
+if (!len) {
+return 0;
+}
+
+for (i = 0; i < BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR; i++) {
+if (!AVX2_ALL_EQ(p[i], zero)) {
+return i * sizeof(AVX2_VECTYPE);
+}
+}
+
+for (i = BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR;
+ i < len / sizeof(AVX2_VECTYPE);
+ i += BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR) {
+AVX2_VECTYPE tmp0 = AVX2_VEC_OR(p[i + 0], p[i + 1]);
+AVX2_VECTYPE tmp1 = AVX2_VEC_OR(p[i + 2], p[i + 3]);
+AVX2_VECTYPE tmp2 = AVX2_VEC_OR(p[i + 4], p[i + 5]);
+AVX2_VECTYPE tmp3 = AVX2_VEC_OR(p[i + 6], p[i + 7]);
+AVX2_VECTYPE tmp01 = AVX2_VEC_OR(tmp0, tmp1);
+AVX2_VECTYPE tmp23 = AVX2_VEC_OR(tmp2, tmp3);
+if (!AVX2_ALL_EQ(AVX2_VEC_OR(tmp01, tmp23), zero)) {
+break;
+}
+}
+
+return i * sizeof(AVX2_VECTYPE);
+}
+
+#endif
diff --git a/util/cutils.c b/util/cutils.c
index cfeb848..3631c02 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "qemu/sockets.h"
 #include "qemu/iov.h"
@@ -161,6 +162,14 @@ int qemu_fdatasync(int fd)
 #endif
 }
 
+static inline bool
+can_use_buffer_find_nonzero_offset_inner(const void *buf, size_t len)
+{
+return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
+   * sizeof(VECTYPE)) == 0
+&& ((uintptr_t) buf) % sizeof(VECTYPE) == 

[Qemu-devel] [v3 3/3] configure: add options to config avx2

2015-12-08 Thread Liang Li
Add the '--enable-avx2' & '--disable-avx2' option so as to config
the AVX2 instruction optimization.

If '--disable-avx2' is not set, configure will detect if the compiler
can support AVX2 option, if yes, AVX2 optimization is eabled, else
disabled.

Signed-off-by: Liang Li 
---
 configure | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/configure b/configure
index 394db3b..94e45fa 100755
--- a/configure
+++ b/configure
@@ -311,6 +311,7 @@ libusb=""
 usb_redir=""
 opengl=""
 ifunc=""
+avx2=""
 zlib="yes"
 lzo=""
 snappy=""
@@ -1063,6 +1064,10 @@ for opt do
   ;;
   --enable-usb-redir) usb_redir="yes"
   ;;
+  --disable-avx2) avx2="no"
+  ;;
+  --enable-avx2) avx2="yes"
+  ;;
   --disable-zlib-test) zlib="no"
   ;;
   --disable-lzo) lzo="no"
@@ -1378,6 +1383,7 @@ disabled with --disable-FEATURE, default is enabled if 
available:
   smartcard   smartcard support (libcacard)
   libusb  libusb (for usb passthrough)
   usb-redir   usb network redirection support
+  avx2support of avx2 instruction
   lzo support of lzo compression library
   snappy  support of snappy compression library
   bzip2   support of bzip2 compression library
@@ -1841,6 +1847,23 @@ else
 ifunc="no"
 fi
 
+
+# avx2 check
+
+if test "$avx2" != "no" ; then
+cat > $TMPC << EOF
+int main(void) { return 0; }
+EOF
+if compile_prog "" "-mavx2" ; then
+avx2="yes"
+else
+if test "$avx2" = "yes" ; then
+feature_not_found "avx2" "Your compiler don't support avx2"
+fi
+avx2="no"
+fi
+fi
+
 #
 # zlib check
 
@@ -4853,6 +4876,7 @@ echo "TPM passthrough   $tpm_passthrough"
 echo "QOM debugging $qom_cast_debug"
 echo "vhdx  $vhdx"
 echo "ifunc support $ifunc"
+echo "avx2 support  $avx2"
 echo "lzo support   $lzo"
 echo "snappy support$snappy"
 echo "bzip2 support $bzip2"
@@ -5241,6 +5265,12 @@ if test "$ifunc" = "yes" ; then
   echo "CONFIG_IFUNC=y" >> $config_host_mak
 fi
 
+if test "$avx2" = "yes" ; then
+  avx2_cflags=" -mavx2"
+  echo "AVX2_CFLAGS=$avx2_cflags" >> $config_host_mak
+  echo "CONFIG_AVX2=y" >> $config_host_mak
+fi
+
 if test "$lzo" = "yes" ; then
   echo "CONFIG_LZO=y" >> $config_host_mak
 fi
-- 
1.9.1




[Qemu-devel] [v3 2/3] configure: detect ifunc attribute

2015-12-08 Thread Liang Li
Detect if the compiler can support the ifunc attribute, the avx2
optimization depends on ifunc attribute.

Signed-off-by: Liang Li 
---
 configure | 20 
 1 file changed, 20 insertions(+)

diff --git a/configure b/configure
index b9552fd..394db3b 100755
--- a/configure
+++ b/configure
@@ -310,6 +310,7 @@ smartcard=""
 libusb=""
 usb_redir=""
 opengl=""
+ifunc=""
 zlib="yes"
 lzo=""
 snappy=""
@@ -1827,6 +1828,20 @@ EOF
 fi
 
 ##
+# ifunc check
+
+cat > $TMPC << EOF
+static void bar(void) {}
+static void foo(void) __attribute__((ifunc("bar")));
+int main(void) { foo(); return 0; }
+EOF
+if compile_prog "" "" ; then
+ifunc="yes"
+else
+ifunc="no"
+fi
+
+#
 # zlib check
 
 if test "$zlib" != "no" ; then
@@ -4837,6 +4852,7 @@ echo "libssh2 support   $libssh2"
 echo "TPM passthrough   $tpm_passthrough"
 echo "QOM debugging $qom_cast_debug"
 echo "vhdx  $vhdx"
+echo "ifunc support $ifunc"
 echo "lzo support   $lzo"
 echo "snappy support$snappy"
 echo "bzip2 support $bzip2"
@@ -5221,6 +5237,10 @@ if test "$opengl" = "yes" ; then
   echo "OPENGL_LIBS=$opengl_libs" >> $config_host_mak
 fi
 
+if test "$ifunc" = "yes" ; then
+  echo "CONFIG_IFUNC=y" >> $config_host_mak
+fi
+
 if test "$lzo" = "yes" ; then
   echo "CONFIG_LZO=y" >> $config_host_mak
 fi
-- 
1.9.1




Re: [Qemu-devel] tcg: improve MAX_CODE_GEN_BUFFER_SIZE for arm

2015-12-08 Thread Laurent Desnogues
Hello,

On Tue, Dec 8, 2015 at 11:39 AM, Aurelien Jarno  wrote:
[...]
> I already posted a patch a long time ago to remove the 16MB limit on ARM
> hosts:
>
> http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg01684.html
>
> However as you can see in the thread, it has been rejected as it doesn't
> not bring improvement in all cases.

We could perhaps resurrect it and do some more benchmarking?  Who
would be able to do testing on (recent) ARM hardware?


Laurent



Re: [Qemu-devel] [PATCH v9 3/6] target-arm: kvm - support for single step

2015-12-08 Thread Alex Bennée

Peter Maydell  writes:

> On 12 November 2015 at 16:20, Alex Bennée  wrote:
>> This adds support for single-step. There isn't much to do on the QEMU
>> side as after we set-up the request for single step via the debug ioctl
>> it is all handled within the kernel.
>>
>> Signed-off-by: Alex Bennée 
>>
>> ---
>> v2
>>   - convert to using HSR_EC
>> v3
>>   - use internals.h definitions
>> ---
>>  target-arm/kvm.c | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/target-arm/kvm.c b/target-arm/kvm.c
>> index 50f70ef..d505a7e 100644
>> --- a/target-arm/kvm.c
>> +++ b/target-arm/kvm.c
>> @@ -535,6 +535,13 @@ static int kvm_handle_debug(CPUState *cs, struct 
>> kvm_run *run)
>>  kvm_cpu_synchronize_state(cs);
>>
>>  switch (hsr_ec) {
>> +case EC_SOFTWARESTEP:
>> +if (cs->singlestep_enabled) {
>> +return true;
>> +} else {
>> +error_report("Came out of SINGLE STEP when not enabled");
>> +}
>> +break;
>>  case EC_AA64_BKPT:
>>  if (kvm_find_sw_breakpoint(cs, env->pc)) {
>>  return true;
>> @@ -595,6 +602,9 @@ int kvm_arch_on_sigbus(int code, void *addr)
>>
>>  void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg)
>>  {
>> +if (cs->singlestep_enabled) {
>> +dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP;
>> +}
>
> Doesn't kvm_update_guest_debug() already set these bits, or am
> I misreading it?

Yeah. This raises an interesting problem about what to do when we don't
have the capability. I could suppress those bits in the update function
but that seems a bit hacky.

Looking at the GDB capability code there doesn't seem to report
breakpoint capability short of just failing when you try to set one.

>
>>  if (kvm_sw_breakpoints_active(cs)) {
>>  dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP;
>>  }
>> --
>> 2.6.3
>
> thanks
> -- PMM


--
Alex Bennée



Re: [Qemu-devel] net: vmxnet3: memory leakage issue

2015-12-08 Thread P J P
  Hello Jason,

+-- On Fri, 4 Dec 2015, Jason Wang wrote --+
| Better with "git send-email".

  Okay.

| What if guest deactivate the device before re-activate the device?
|Looks like it could be done through methods:
|
|1) VMXNET3_CMD_QUIESCE_DEV

  IIUC, it is used to pause the device when the receiver end is unable to 
keee-up with the incoming flow. After a brief period, the operation could be 
resumed again.

|2) VMXNET3_REG_DSAL

   Shared memory between a driver and the device appears to be set in two 
steps. Firs low address, followed by the high address(VMXNET3_REG_DSAH). I 
guess 's->device_active' needs to be enabled again while setting the higher 
part of the address.

|So looks like need to free both tx_pkt and rx_pkt during deactivating?

  I think freeing 'tx_pkt' & 'rx_pkt' during pause wouldn't be good. It needs 
to call something like - 'vmxnet3_resume_device'.

Please see the patch below for the above two cases, does it look okay?

===
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 37373e5..4b2305b 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -1198,6 +1198,12 @@ static void vmxnet3_deactivate_device(VMXNET3State *s)
 s->device_active = false;
 }
 
+static void vmxnet3_resume_device(VMXNET3State *s)
+{
+VMW_CBPRN("Resuming vmxnet3...");
+s->device_active = true;
+}
+
 static void vmxnet3_reset(VMXNET3State *s)
 {
 VMW_CBPRN("Resetting vmxnet3...");
@@ -1627,8 +1633,13 @@ static void vmxnet3_handle_command(VMXNET3State *s, 
uint64_t cmd)
 break;
 
 case VMXNET3_CMD_QUIESCE_DEV:
-VMW_CBPRN("Set: VMXNET3_CMD_QUIESCE_DEV - pause the device");
-vmxnet3_deactivate_device(s);
+if (s->device_active) {
+VMW_CBPRN("Set: VMXNET3_CMD_QUIESCE_DEV - pause the device");
+vmxnet3_deactivate_device(s);
+} else {
+VMW_CBPRN("Set: VMXNET3_CMD_QUIESCE_DEV - resume the device");
+vmxnet3_resume_device(s);
+}
 break;
 
 case VMXNET3_CMD_GET_CONF_INTR:
@@ -1756,6 +1767,9 @@ vmxnet3_io_bar1_write(void *opaque,
  * We already should have low address part.
  */
 s->drv_shmem = s->temp_shared_guest_driver_memory | (val << 32);
+if (s->drv_shmem) {
+s->device_active = true;
+}
 break;
 
 /* Command */
===


Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F



Re: [Qemu-devel] tcg: improve MAX_CODE_GEN_BUFFER_SIZE for arm

2015-12-08 Thread Aurelien Jarno
On 2015-12-08 10:43, TeLeMan wrote:
> I know MAX_CODE_GEN_BUFFER_SIZE is limited by the host direct branch
> instructions.But the arm's MAX_CODE_GEN_BUFFER_SIZE is so small.I
> tried improving MAX_CODE_GEN_BUFFER_SIZE.I wrote some check codes for
> the overflow offset in tcg_out_b(), tcg_out_bl(),
> tcg_out_blx_imm(),reloc_pc24(). But I didn't catch any overflow case
> when tb_size and MAX_CODE_GEN_BUFFER_SIZE were larger than 32MB. After
> the generated code size was larger than 32MB, qemu crashed.

Instrumenting all the tcg_out_* branch related functions do not work
here as the address is actually not known at code generation:

case INDEX_op_goto_tb:
if (s->tb_jmp_offset) {
/* Direct jump method */
s->tb_jmp_offset[args[0]] = tcg_current_code_size(s);
tcg_out_b_noaddr(s, COND_AL);

It is patched later during TB linking.

> Any suggest for this issue?

I already posted a patch a long time ago to remove the 16MB limit on ARM
hosts:

http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg01684.html

However as you can see in the thread, it has been rejected as it doesn't
not bring improvement in all cases.

Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests

2015-12-08 Thread Kevin Wolf
Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben:
> On Mon, 7 Dec 2015 11:02:51 +0100
> Cornelia Huck  wrote:
> 
> > On Thu,  3 Dec 2015 13:00:00 +0800
> > Stefan Hajnoczi  wrote:
> > 
> > > From: Fam Zheng 
> > > 
> > > The assertion problem was noticed in 06c3916b35a, but it wasn't
> > > completely fixed, because even though the req is not marked as
> > > serialising, it still gets serialised by wait_serialising_requests
> > > against other serialising requests, which could lead to the same
> > > assertion failure.
> > > 
> > > Fix it by even more explicitly skipping the serialising for this
> > > specific case.
> > > 
> > > Signed-off-by: Fam Zheng 
> > > Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com
> > > Signed-off-by: Stefan Hajnoczi 
> > > ---
> > >  block/backup.c|  2 +-
> > >  block/io.c| 12 +++-
> > >  include/block/block.h |  4 ++--
> > >  trace-events  |  2 +-
> > >  4 files changed, 11 insertions(+), 9 deletions(-)
> > 
> > This one causes segfaults for me:
> > 
> > Program received signal SIGSEGV, Segmentation fault.
> > bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071
> > 3071if (!drv) {
> > 
> > (gdb) bt
> > #0  bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071

This looks like some kind of memory corruption that hit blk->bs. It's
most definitely not a valid pointer anyway.

> > #1  0x80216974 in blk_is_inserted (blk=)
> > at /data/git/yyy/qemu/block/block-backend.c:986
> > #2  0x802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960)
> > at /data/git/yyy/qemu/block/block-backend.c:991
> > #3  0x80216d12 in blk_check_byte_request 
> > (blk=blk@entry=0x3ffb17e7960, 
> > offset=offset@entry=4928966656, size=16384)
> > at /data/git/yyy/qemu/block/block-backend.c:558
> > #4  0x80216df2 in blk_check_request (blk=blk@entry=0x3ffb17e7960, 
> > sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32)
> > at /data/git/yyy/qemu/block/block-backend.c:589
> > #5  0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num=
> > 9626888, iov=0x8098c658, nb_sectors=, cb=
> > 0x80081150 , opaque=0x80980620)
> > at /data/git/yyy/qemu/block/block-backend.c:727
> > #6  0x8008186e in submit_requests (niov=, 
> > num_reqs=, start=, mrb=, 
> > blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366
> > #7  virtio_blk_submit_multireq (mrb=, blk=)
> > at /data/git/yyy/qemu/hw/block/virtio-blk.c:444
> > #8  virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58)
> > at /data/git/yyy/qemu/hw/block/virtio-blk.c:389
> > #9  0x800823ee in virtio_blk_handle_output (vdev=, 
> > vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615
> > #10 0x801e367e in aio_dispatch (ctx=0x80918520)
> > at /data/git/yyy/qemu/aio-posix.c:326
> > #11 0x801d28b0 in aio_ctx_dispatch (source=, 
> > callback=, user_data=)
> > at /data/git/yyy/qemu/async.c:231
> > #12 0x03fffd36a05a in g_main_context_dispatch ()
> >from /lib64/libglib-2.0.so.0
> > #13 0x801e0ffa in glib_pollfds_poll ()
> > at /data/git/yyy/qemu/main-loop.c:211
> > #14 os_host_main_loop_wait (timeout=)
> > at /data/git/yyy/qemu/main-loop.c:256
> > #15 main_loop_wait (nonblocking=)
> > at /data/git/yyy/qemu/main-loop.c:504
> > #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923
> > #17 main (argc=, argv=, envp=)
> > at /data/git/yyy/qemu/vl.c:4684
> > 
> > Relevant part of command line:
> > 
> > -drive 
> > file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none
> >  -device 
> > virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off
> 
> I played around a bit. The main part of this change seems to be calling
> wait_serialising_requests() conditionally; reverting this makes the
> guest boot again.
> 
> I then tried to find out when wait_serialising_requests() was NOT
> called and added fprintfs: well, it was _always_ called. I then added a
> fprintf for flags at the beginning of the function: this produced a
> segfault no matter whether wait_serialising_requests() was called
> conditionally or unconditionally. Weird race?
> 
> Anything further I can do? I guess this patch fixes a bug for someone,
> but it means insta-death for my setup...

If it happens immediately, perhaps running under valgrind is possible
and could give some hints about what happened with blk->bs?

Kevin



Re: [Qemu-devel] tcg: improve MAX_CODE_GEN_BUFFER_SIZE for arm

2015-12-08 Thread Aurelien Jarno
On 2015-12-08 11:51, Laurent Desnogues wrote:
> Hello,
> 
> On Tue, Dec 8, 2015 at 11:39 AM, Aurelien Jarno  wrote:
> [...]
> > I already posted a patch a long time ago to remove the 16MB limit on ARM
> > hosts:
> >
> > http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg01684.html
> >
> > However as you can see in the thread, it has been rejected as it doesn't
> > not bring improvement in all cases.
> 
> We could perhaps resurrect it and do some more benchmarking?  Who
> would be able to do testing on (recent) ARM hardware?

I can provide an updated patch, but I would prefer if someone else does
the benchmarking on a really recent hardware. Not sure the hardware I
have (cortex A7) is really representative of a modern ARM CPU.

Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [v3 3/3] configure: add options to config avx2

2015-12-08 Thread Peter Maydell
On 8 December 2015 at 12:08, Liang Li  wrote:
> Add the '--enable-avx2' & '--disable-avx2' option so as to config
> the AVX2 instruction optimization.
>
> If '--disable-avx2' is not set, configure will detect if the compiler
> can support AVX2 option, if yes, AVX2 optimization is eabled, else
> disabled.

Is the configure option necessary? For other things like
this (eg our use of SSE2 or Altivec) we just go ahead and
use the feature if the compiler supports it.

When would somebody building QEMU want to disable this option?

thanks
-- PMM



Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests

2015-12-08 Thread Christian Borntraeger
On 12/08/2015 01:30 PM, Christian Borntraeger wrote:
> On 12/08/2015 01:00 PM, Cornelia Huck wrote:
>> On Tue, 8 Dec 2015 10:59:54 +0100
>> Kevin Wolf  wrote:
>>
>>> Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben:
 On Mon, 7 Dec 2015 11:02:51 +0100
 Cornelia Huck  wrote:

> On Thu,  3 Dec 2015 13:00:00 +0800
> Stefan Hajnoczi  wrote:
>
>> From: Fam Zheng 
>>
>> The assertion problem was noticed in 06c3916b35a, but it wasn't
>> completely fixed, because even though the req is not marked as
>> serialising, it still gets serialised by wait_serialising_requests
>> against other serialising requests, which could lead to the same
>> assertion failure.
>>
>> Fix it by even more explicitly skipping the serialising for this
>> specific case.
>>
>> Signed-off-by: Fam Zheng 
>> Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com
>> Signed-off-by: Stefan Hajnoczi 
>> ---
>>  block/backup.c|  2 +-
>>  block/io.c| 12 +++-
>>  include/block/block.h |  4 ++--
>>  trace-events  |  2 +-
>>  4 files changed, 11 insertions(+), 9 deletions(-)
>
> This one causes segfaults for me:
>
> Program received signal SIGSEGV, Segmentation fault.
> bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071
> 3071  if (!drv) {
>
> (gdb) bt
> #0  bdrv_is_inserted (bs=0x8000) at 
> /data/git/yyy/qemu/block.c:3071
>>>
>>> This looks like some kind of memory corruption that hit blk->bs. It's
>>> most definitely not a valid pointer anyway.
>>>
> #1  0x80216974 in blk_is_inserted (blk=)
> at /data/git/yyy/qemu/block/block-backend.c:986
> #2  0x802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960)
> at /data/git/yyy/qemu/block/block-backend.c:991
> #3  0x80216d12 in blk_check_byte_request 
> (blk=blk@entry=0x3ffb17e7960, 
> offset=offset@entry=4928966656, size=16384)
> at /data/git/yyy/qemu/block/block-backend.c:558
> #4  0x80216df2 in blk_check_request (blk=blk@entry=0x3ffb17e7960, 
> sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32)
> at /data/git/yyy/qemu/block/block-backend.c:589
> #5  0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num=
> 9626888, iov=0x8098c658, nb_sectors=, cb=
> 0x80081150 , opaque=0x80980620)
> at /data/git/yyy/qemu/block/block-backend.c:727
> #6  0x8008186e in submit_requests (niov=, 
> num_reqs=, start=, mrb=, 
> blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366
> #7  virtio_blk_submit_multireq (mrb=, blk=)
> at /data/git/yyy/qemu/hw/block/virtio-blk.c:444
> #8  virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58)
> at /data/git/yyy/qemu/hw/block/virtio-blk.c:389
> #9  0x800823ee in virtio_blk_handle_output (vdev=, 
> vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615
> #10 0x801e367e in aio_dispatch (ctx=0x80918520)
> at /data/git/yyy/qemu/aio-posix.c:326
> #11 0x801d28b0 in aio_ctx_dispatch (source=, 
> callback=, user_data=)
> at /data/git/yyy/qemu/async.c:231
> #12 0x03fffd36a05a in g_main_context_dispatch ()
>from /lib64/libglib-2.0.so.0
> #13 0x801e0ffa in glib_pollfds_poll ()
> at /data/git/yyy/qemu/main-loop.c:211
> #14 os_host_main_loop_wait (timeout=)
> at /data/git/yyy/qemu/main-loop.c:256
> #15 main_loop_wait (nonblocking=)
> at /data/git/yyy/qemu/main-loop.c:504
> #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923
> #17 main (argc=, argv=, envp= out>)
> at /data/git/yyy/qemu/vl.c:4684
>
> Relevant part of command line:
>
> -drive 
> file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none
>  -device 
> virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off

 I played around a bit. The main part of this change seems to be calling
 wait_serialising_requests() conditionally; reverting this makes the
 guest boot again.

 I then tried to find out when wait_serialising_requests() was NOT
 called and added fprintfs: well, it was _always_ called. I then added a
 fprintf for flags at the beginning of the function: this produced a
 segfault no matter whether wait_serialising_requests() was called
 conditionally or unconditionally. Weird race?

 Anything further I can do? I guess this patch fixes a bug for someone,
 but it means insta-death for my setup...
>>>
>>> If it happens immediately, perhaps running under valgrind is possible
>>> 

Re: [Qemu-devel] [PATCH v2 1/4] xen/MSI-X: latch MSI-X table writes

2015-12-08 Thread Stefano Stabellini
On Mon, 7 Dec 2015, Jan Beulich wrote:
> >>> On 07.12.15 at 13:41,  wrote:
> > I know that in your opinion is superfluous, nonetheless could you please
> > add 2-3 lines of in-code comment right here, to explain what you are
> > doing with the check?  Something like:
> > 
> > /*
> >  * Update the entry addr and data to the latest values only when the
> >  * entry is masked or they are all masked, as required by the spec.
> >  * Addr and data changes while the MSI-X entry is unmasked will be
> >  * delayed until the next masking->unmasking.
> >  */
> 
> Btw, will it be okay to just resend this one patch as v3, or do I need
> to resend the whole series (the rest of which didn't change)?

Just this patch is fine.



[Qemu-devel] [PATCH v3 1/4] xen/MSI-X: latch MSI-X table writes

2015-12-08 Thread Jan Beulich
The remaining log message in pci_msix_write() is wrong, as there guest
behavior may only appear to be wrong: For one, the old logic didn't
take the mask-all bit into account. And then this shouldn't depend on
host device state (i.e. the host may have masked the entry without the
guest having done so). Plus these writes shouldn't be dropped even when
an entry gets unmasked. Instead, if they can't be made take effect
right away, they should take effect on the next unmasking or enabling
operation - the specification explicitly describes such caching
behavior.

Signed-off-by: Jan Beulich 
---
v3: Add comment to xen_pt_msix_update_one().
v2: Pass original vec_ctrl to xen_pt_msix_update_one() instead of
(ab)using latch[].

--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1499,6 +1499,8 @@ static int xen_pt_msixctrl_reg_write(Xen
 xen_pt_msix_disable(s);
 }
 
+s->msix->maskall = *val & PCI_MSIX_FLAGS_MASKALL;
+
 debug_msix_enabled_old = s->msix->enabled;
 s->msix->enabled = !!(*val & PCI_MSIX_FLAGS_ENABLE);
 if (s->msix->enabled != debug_msix_enabled_old) {
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -187,13 +187,13 @@ typedef struct XenPTMSIXEntry {
 int pirq;
 uint64_t addr;
 uint32_t data;
-uint32_t vector_ctrl;
+uint32_t latch[4];
 bool updated; /* indicate whether MSI ADDR or DATA is updated */
-bool warned;  /* avoid issuing (bogus) warning more than once */
 } XenPTMSIXEntry;
 typedef struct XenPTMSIX {
 uint32_t ctrl_offset;
 bool enabled;
+bool maskall;
 int total_entries;
 int bar_index;
 uint64_t table_base;
--- a/hw/xen/xen_pt_msi.c
+++ b/hw/xen/xen_pt_msi.c
@@ -25,6 +25,7 @@
 #define XEN_PT_GFLAGSSHIFT_DELIV_MODE 12
 #define XEN_PT_GFLAGSSHIFT_TRG_MODE   15
 
+#define latch(fld) latch[PCI_MSIX_ENTRY_##fld / sizeof(uint32_t)]
 
 /*
  * Helpers
@@ -314,7 +315,8 @@ static int msix_set_enable(XenPCIPassthr
enabled);
 }
 
-static int xen_pt_msix_update_one(XenPCIPassthroughState *s, int entry_nr)
+static int xen_pt_msix_update_one(XenPCIPassthroughState *s, int entry_nr,
+  uint32_t vec_ctrl)
 {
 XenPTMSIXEntry *entry = NULL;
 int pirq;
@@ -332,6 +334,19 @@ static int xen_pt_msix_update_one(XenPCI
 
 pirq = entry->pirq;
 
+/*
+ * Update the entry addr and data to the latest values only when the
+ * entry is masked or they are all masked, as required by the spec.
+ * Addr and data changes while the MSI-X entry is unmasked get deferred
+ * until the next masked -> unmasked transition.
+ */
+if (pirq == XEN_PT_UNASSIGNED_PIRQ || s->msix->maskall ||
+(vec_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
+entry->addr = entry->latch(LOWER_ADDR) |
+  ((uint64_t)entry->latch(UPPER_ADDR) << 32);
+entry->data = entry->latch(DATA);
+}
+
 rc = msi_msix_setup(s, entry->addr, entry->data, , true, entry_nr,
 entry->pirq == XEN_PT_UNASSIGNED_PIRQ);
 if (rc) {
@@ -357,7 +372,7 @@ int xen_pt_msix_update(XenPCIPassthrough
 int i;
 
 for (i = 0; i < msix->total_entries; i++) {
-xen_pt_msix_update_one(s, i);
+xen_pt_msix_update_one(s, i, msix->msix_entry[i].latch(VECTOR_CTRL));
 }
 
 return 0;
@@ -406,35 +421,15 @@ int xen_pt_msix_update_remap(XenPCIPasst
 
 static uint32_t get_entry_value(XenPTMSIXEntry *e, int offset)
 {
-switch (offset) {
-case PCI_MSIX_ENTRY_LOWER_ADDR:
-return e->addr & UINT32_MAX;
-case PCI_MSIX_ENTRY_UPPER_ADDR:
-return e->addr >> 32;
-case PCI_MSIX_ENTRY_DATA:
-return e->data;
-case PCI_MSIX_ENTRY_VECTOR_CTRL:
-return e->vector_ctrl;
-default:
-return 0;
-}
+return !(offset % sizeof(*e->latch))
+   ? e->latch[offset / sizeof(*e->latch)] : 0;
 }
 
 static void set_entry_value(XenPTMSIXEntry *e, int offset, uint32_t val)
 {
-switch (offset) {
-case PCI_MSIX_ENTRY_LOWER_ADDR:
-e->addr = (e->addr & ((uint64_t)UINT32_MAX << 32)) | val;
-break;
-case PCI_MSIX_ENTRY_UPPER_ADDR:
-e->addr = (uint64_t)val << 32 | (e->addr & UINT32_MAX);
-break;
-case PCI_MSIX_ENTRY_DATA:
-e->data = val;
-break;
-case PCI_MSIX_ENTRY_VECTOR_CTRL:
-e->vector_ctrl = val;
-break;
+if (!(offset % sizeof(*e->latch)))
+{
+e->latch[offset / sizeof(*e->latch)] = val;
 }
 }
 
@@ -454,39 +449,26 @@ static void pci_msix_write(void *opaque,
 offset = addr % PCI_MSIX_ENTRY_SIZE;
 
 if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL) {
-const volatile uint32_t *vec_ctrl;
-
 if (get_entry_value(entry, offset) == val
 && entry->pirq != XEN_PT_UNASSIGNED_PIRQ) {
 return;
 }
 
+entry->updated = true;
+} else if (msix->enabled && entry->updated &&
+   !(val & 

Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests

2015-12-08 Thread Christian Borntraeger
On 12/08/2015 01:00 PM, Cornelia Huck wrote:
> On Tue, 8 Dec 2015 10:59:54 +0100
> Kevin Wolf  wrote:
> 
>> Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben:
>>> On Mon, 7 Dec 2015 11:02:51 +0100
>>> Cornelia Huck  wrote:
>>>
 On Thu,  3 Dec 2015 13:00:00 +0800
 Stefan Hajnoczi  wrote:

> From: Fam Zheng 
>
> The assertion problem was noticed in 06c3916b35a, but it wasn't
> completely fixed, because even though the req is not marked as
> serialising, it still gets serialised by wait_serialising_requests
> against other serialising requests, which could lead to the same
> assertion failure.
>
> Fix it by even more explicitly skipping the serialising for this
> specific case.
>
> Signed-off-by: Fam Zheng 
> Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com
> Signed-off-by: Stefan Hajnoczi 
> ---
>  block/backup.c|  2 +-
>  block/io.c| 12 +++-
>  include/block/block.h |  4 ++--
>  trace-events  |  2 +-
>  4 files changed, 11 insertions(+), 9 deletions(-)

 This one causes segfaults for me:

 Program received signal SIGSEGV, Segmentation fault.
 bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071
 3071   if (!drv) {

 (gdb) bt
 #0  bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071
>>
>> This looks like some kind of memory corruption that hit blk->bs. It's
>> most definitely not a valid pointer anyway.
>>
 #1  0x80216974 in blk_is_inserted (blk=)
 at /data/git/yyy/qemu/block/block-backend.c:986
 #2  0x802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960)
 at /data/git/yyy/qemu/block/block-backend.c:991
 #3  0x80216d12 in blk_check_byte_request 
 (blk=blk@entry=0x3ffb17e7960, 
 offset=offset@entry=4928966656, size=16384)
 at /data/git/yyy/qemu/block/block-backend.c:558
 #4  0x80216df2 in blk_check_request (blk=blk@entry=0x3ffb17e7960, 
 sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32)
 at /data/git/yyy/qemu/block/block-backend.c:589
 #5  0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num=
 9626888, iov=0x8098c658, nb_sectors=, cb=
 0x80081150 , opaque=0x80980620)
 at /data/git/yyy/qemu/block/block-backend.c:727
 #6  0x8008186e in submit_requests (niov=, 
 num_reqs=, start=, mrb=, 
 blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366
 #7  virtio_blk_submit_multireq (mrb=, blk=)
 at /data/git/yyy/qemu/hw/block/virtio-blk.c:444
 #8  virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58)
 at /data/git/yyy/qemu/hw/block/virtio-blk.c:389
 #9  0x800823ee in virtio_blk_handle_output (vdev=, 
 vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615
 #10 0x801e367e in aio_dispatch (ctx=0x80918520)
 at /data/git/yyy/qemu/aio-posix.c:326
 #11 0x801d28b0 in aio_ctx_dispatch (source=, 
 callback=, user_data=)
 at /data/git/yyy/qemu/async.c:231
 #12 0x03fffd36a05a in g_main_context_dispatch ()
from /lib64/libglib-2.0.so.0
 #13 0x801e0ffa in glib_pollfds_poll ()
 at /data/git/yyy/qemu/main-loop.c:211
 #14 os_host_main_loop_wait (timeout=)
 at /data/git/yyy/qemu/main-loop.c:256
 #15 main_loop_wait (nonblocking=)
 at /data/git/yyy/qemu/main-loop.c:504
 #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923
 #17 main (argc=, argv=, envp=)
 at /data/git/yyy/qemu/vl.c:4684

 Relevant part of command line:

 -drive 
 file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none
  -device 
 virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off
>>>
>>> I played around a bit. The main part of this change seems to be calling
>>> wait_serialising_requests() conditionally; reverting this makes the
>>> guest boot again.
>>>
>>> I then tried to find out when wait_serialising_requests() was NOT
>>> called and added fprintfs: well, it was _always_ called. I then added a
>>> fprintf for flags at the beginning of the function: this produced a
>>> segfault no matter whether wait_serialising_requests() was called
>>> conditionally or unconditionally. Weird race?
>>>
>>> Anything further I can do? I guess this patch fixes a bug for someone,
>>> but it means insta-death for my setup...
>>
>> If it happens immediately, perhaps running under valgrind is possible
>> and could give some hints about what happened with blk->bs?
> 
> Just a quick update: This triggers on a qemu built with a not-so-fresh
> gcc 4.7.2 (and it seems to depend on 

Re: [Qemu-devel] [PATCH 1/2] Add param Error** to msi_init() & modify the callers

2015-12-08 Thread Cao jin

Hi Markus,
I have to say, you really did a amazing review for this "trivial 
"patch, thanks a lot & really appreciate it:)


On 12/07/2015 05:59 PM, Markus Armbruster wrote:

Cao jin  writes:


msi_init() is a supporting function in PCI device initialization, in order to
convert .init() to .realize(), it should be modified first. Also modify the
callers

Bonus: add more comment for msi_init().


Incomplete.  See notes on impact inline.


Signed-off-by: Cao jin 
---
  hw/audio/intel-hda.c   |  7 ++-
  hw/ide/ich.c   |  2 +-
  hw/net/vmxnet3.c   |  3 ++-
  hw/pci-bridge/ioh3420.c|  6 +-
  hw/pci-bridge/pci_bridge_dev.c |  6 +-
  hw/pci-bridge/xio3130_downstream.c |  7 ++-
  hw/pci-bridge/xio3130_upstream.c   |  7 ++-
  hw/pci/msi.c   | 17 +
  hw/scsi/megasas.c  | 12 +---
  hw/scsi/vmw_pvscsi.c   |  3 ++-
  hw/usb/hcd-xhci.c  |  5 -
  hw/vfio/pci.c  |  3 ++-
  include/hw/pci/msi.h   |  4 ++--
  13 files changed, 63 insertions(+), 19 deletions(-)

diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c
index 433463e..9d733da 100644
--- a/hw/audio/intel-hda.c
+++ b/hw/audio/intel-hda.c
@@ -1130,6 +1130,7 @@ static void intel_hda_realize(PCIDevice *pci, Error 
**errp)
  {
  IntelHDAState *d = INTEL_HDA(pci);
  uint8_t *conf = d->pci.config;
+int ret;

  d->name = object_get_typename(OBJECT(d));

@@ -1142,7 +1143,11 @@ static void intel_hda_realize(PCIDevice *pci, Error 
**errp)
"intel-hda", 0x4000);
  pci_register_bar(>pci, 0, 0, >mmio);
  if (d->msi) {
-msi_init(>pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, false);
+ret = msi_init(>pci, d->old_msi_addr ? 0x50 : 0x60, 1, true,
+false, errp);
+if(ret < 0) {


Please use scripts/checkpatch.pl to check your patches.  It's
occasionally wrong, so use your judgement.



Thanks for the tips, seems I got dizzy looking because many trivial 
place need to be modified...



+return;


This returns with the device in a half-realized state.  Do we have to
undo prior side effects to put it back into unrealized state?  See also
ioh3420_initfn() below.

Before: msi_init() failure is ignored.  After: it makes device
realization fail.  To assess impact, we need to understand how
msi_init() can fail.



It seems I missed the reality: devices are default to be hot-pluggable & 
most devices are hot-pluggable:-[ Because when cold plugged, process 
will exit on device-init failing, so, the half-realized state doesn`t 
matter in this condition.

Will rework it later.


+}
  }

  hda_codec_bus_init(DEVICE(pci), >codecs, sizeof(d->codecs),
diff --git a/hw/ide/ich.c b/hw/ide/ich.c
index 16925fa..94b1809 100644
--- a/hw/ide/ich.c
+++ b/hw/ide/ich.c
@@ -145,7 +145,7 @@ static void pci_ich9_ahci_realize(PCIDevice *dev, Error 
**errp)
  /* Although the AHCI 1.3 specification states that the first capability
   * should be PMCAP, the Intel ICH9 data sheet specifies that the ICH9
   * AHCI device puts the MSI capability first, pointing to 0x80. */
-msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false);
+msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false, errp);


Do we have to put the device back into unrealized state on failure?


  }

  static void pci_ich9_uninit(PCIDevice *dev)
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 5e3a233..4269141 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2140,9 +2140,10 @@ vmxnet3_init_msi(VMXNET3State *s)
  {
  PCIDevice *d = PCI_DEVICE(s);
  int res;
+Error *local_err = NULL;

  res = msi_init(d, VMXNET3_MSI_OFFSET, VMXNET3_MAX_NMSIX_INTRS,
-   VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK);
+   VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK, _err);
  if (0 > res) {
  VMW_WRPRN("Failed to initialize MSI, error %d", res);
  s->msi_used = false;


The error is neither propagated nor handled, and the error object leaks.

Since this function can't handle it, it needs to propagate it.  Requires
adding an Error ** parameter.



[*]Actually, here is my consideration: a device-realize function(take 
the following ioh3420 for example) will call many supporting functions 
like msi_init(), so I am planning, every supporting function goes into a 
patch first, then every "device convert to realize()" goes into a patch, 
otherwise, it may will be a big patch for the reviewer. That`s why I 
didn`t add Error ** param, and propagate it, and plan to do it in 
"convert to realize()" patch. But for now, I think this patch should at 
least be successfully compiled & won`t impact the existed things.


Yes, it seems may have leaks when error happens, but will be fixed when 
the "convert to realize()" patch 

Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests

2015-12-08 Thread Cornelia Huck
On Tue, 8 Dec 2015 10:59:54 +0100
Kevin Wolf  wrote:

> Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben:
> > On Mon, 7 Dec 2015 11:02:51 +0100
> > Cornelia Huck  wrote:
> > 
> > > On Thu,  3 Dec 2015 13:00:00 +0800
> > > Stefan Hajnoczi  wrote:
> > > 
> > > > From: Fam Zheng 
> > > > 
> > > > The assertion problem was noticed in 06c3916b35a, but it wasn't
> > > > completely fixed, because even though the req is not marked as
> > > > serialising, it still gets serialised by wait_serialising_requests
> > > > against other serialising requests, which could lead to the same
> > > > assertion failure.
> > > > 
> > > > Fix it by even more explicitly skipping the serialising for this
> > > > specific case.
> > > > 
> > > > Signed-off-by: Fam Zheng 
> > > > Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com
> > > > Signed-off-by: Stefan Hajnoczi 
> > > > ---
> > > >  block/backup.c|  2 +-
> > > >  block/io.c| 12 +++-
> > > >  include/block/block.h |  4 ++--
> > > >  trace-events  |  2 +-
> > > >  4 files changed, 11 insertions(+), 9 deletions(-)
> > > 
> > > This one causes segfaults for me:
> > > 
> > > Program received signal SIGSEGV, Segmentation fault.
> > > bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071
> > > 3071  if (!drv) {
> > > 
> > > (gdb) bt
> > > #0  bdrv_is_inserted (bs=0x8000) at 
> > > /data/git/yyy/qemu/block.c:3071
> 
> This looks like some kind of memory corruption that hit blk->bs. It's
> most definitely not a valid pointer anyway.
> 
> > > #1  0x80216974 in blk_is_inserted (blk=)
> > > at /data/git/yyy/qemu/block/block-backend.c:986
> > > #2  0x802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960)
> > > at /data/git/yyy/qemu/block/block-backend.c:991
> > > #3  0x80216d12 in blk_check_byte_request 
> > > (blk=blk@entry=0x3ffb17e7960, 
> > > offset=offset@entry=4928966656, size=16384)
> > > at /data/git/yyy/qemu/block/block-backend.c:558
> > > #4  0x80216df2 in blk_check_request (blk=blk@entry=0x3ffb17e7960, 
> > > sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32)
> > > at /data/git/yyy/qemu/block/block-backend.c:589
> > > #5  0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num=
> > > 9626888, iov=0x8098c658, nb_sectors=, cb=
> > > 0x80081150 , opaque=0x80980620)
> > > at /data/git/yyy/qemu/block/block-backend.c:727
> > > #6  0x8008186e in submit_requests (niov=, 
> > > num_reqs=, start=, mrb=, 
> > > blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366
> > > #7  virtio_blk_submit_multireq (mrb=, blk=)
> > > at /data/git/yyy/qemu/hw/block/virtio-blk.c:444
> > > #8  virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58)
> > > at /data/git/yyy/qemu/hw/block/virtio-blk.c:389
> > > #9  0x800823ee in virtio_blk_handle_output (vdev=, 
> > > vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615
> > > #10 0x801e367e in aio_dispatch (ctx=0x80918520)
> > > at /data/git/yyy/qemu/aio-posix.c:326
> > > #11 0x801d28b0 in aio_ctx_dispatch (source=, 
> > > callback=, user_data=)
> > > at /data/git/yyy/qemu/async.c:231
> > > #12 0x03fffd36a05a in g_main_context_dispatch ()
> > >from /lib64/libglib-2.0.so.0
> > > #13 0x801e0ffa in glib_pollfds_poll ()
> > > at /data/git/yyy/qemu/main-loop.c:211
> > > #14 os_host_main_loop_wait (timeout=)
> > > at /data/git/yyy/qemu/main-loop.c:256
> > > #15 main_loop_wait (nonblocking=)
> > > at /data/git/yyy/qemu/main-loop.c:504
> > > #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923
> > > #17 main (argc=, argv=, envp= > > out>)
> > > at /data/git/yyy/qemu/vl.c:4684
> > > 
> > > Relevant part of command line:
> > > 
> > > -drive 
> > > file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none
> > >  -device 
> > > virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off
> > 
> > I played around a bit. The main part of this change seems to be calling
> > wait_serialising_requests() conditionally; reverting this makes the
> > guest boot again.
> > 
> > I then tried to find out when wait_serialising_requests() was NOT
> > called and added fprintfs: well, it was _always_ called. I then added a
> > fprintf for flags at the beginning of the function: this produced a
> > segfault no matter whether wait_serialising_requests() was called
> > conditionally or unconditionally. Weird race?
> > 
> > Anything further I can do? I guess this patch fixes a bug for someone,
> > but it means insta-death for my setup...
> 
> If it happens immediately, perhaps running under valgrind is possible
> and could give some hints about what happened with blk->bs?

Just a quick update: This triggers on a qemu 

[Qemu-devel] [v3 0/3] add avx2 instruction optimization

2015-12-08 Thread Liang Li
buffer_find_nonzero_offset() is a hot function during live migration.
Now it use SSE2 intructions for optimization. For platform supports
AVX2 instructions, use the AVX2 instructions for optimization can help
to improve the performance about 30% comparing to SSE2.
Zero page check can be faster with this optimization, the test result
shows that for an 8GB RAM idle guest, this patch can help to shorten
the total live migration time about 6%.

This patch use the ifunc mechanism to select the proper function when
running, for platform supports AVX2, excute the AVX2 instructions,
else, excute the original code.

With this patch, the QEMU binary can run on both platforms support AVX2
or not.

Compiler which desn't support the AVX2 or ifunc attribute can build the
source code successfully.


v2 -> v3 changes:
  * Detect the ifunc attribute support (Paolo's suggestion) 
  * Use the ifunc attribute instead of the inline asm (Richard's suggestion)
  * Change the configure (Juan's suggestion)


Liang Li (3):
  cutils: add avx2 instruction optimization
  configure: detect ifunc attribute
  configure: add options to config avx2

 configure   | 50 +
 include/qemu-common.h   | 13 +-
 util/Makefile.objs  |  2 ++
 util/buffer-zero-avx2.c | 54 
 util/cutils.c   | 65 +++--
 5 files changed, 175 insertions(+), 9 deletions(-)
 create mode 100644 util/buffer-zero-avx2.c

-- 
1.9.1




[Qemu-devel] [PATCH 1/7] pc: wire up TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE for !xen

2015-12-08 Thread Gerd Hoffmann
rename pc_xen_hvm_init_pci to pc_i440fx_init_pci,
use it for both xen and non-xen init.

Signed-off-by: Gerd Hoffmann 
---
 hw/i386/pc_piix.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 2e41efe..ce6c3c5 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -419,10 +419,9 @@ static void pc_init_isa(MachineState *machine)
 pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, TYPE_I440FX_PCI_DEVICE);
 }
 
-#ifdef CONFIG_XEN
-static void pc_xen_hvm_init_pci(MachineState *machine)
+static void pc_i440fx_init_pci(MachineState *machine)
 {
-const char *pci_type = has_igd_gfx_passthru ?
+const char *pci_type = machine->igd_gfx_passthru ?
 TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE : 
TYPE_I440FX_PCI_DEVICE;
 
 pc_init1(machine,
@@ -430,6 +429,7 @@ static void pc_xen_hvm_init_pci(MachineState *machine)
  pci_type);
 }
 
+#ifdef CONFIG_XEN
 static void pc_xen_hvm_init(MachineState *machine)
 {
 PCIBus *bus;
@@ -439,7 +439,7 @@ static void pc_xen_hvm_init(MachineState *machine)
 exit(1);
 }
 
-pc_xen_hvm_init_pci(machine);
+pc_i440fx_init_pci(machine);
 
 bus = pci_find_primary_bus();
 if (bus != NULL) {
@@ -455,8 +455,7 @@ static void pc_xen_hvm_init(MachineState *machine)
 if (compat) { \
 compat(machine); \
 } \
-pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, \
- TYPE_I440FX_PCI_DEVICE); \
+pc_i440fx_init_pci(machine); \
 } \
 DEFINE_PC_MACHINE(suffix, name, pc_init_##suffix, optionfn)
 
-- 
1.8.3.1




[Qemu-devel] [PATCH 0/7] igd passthrough chipset tweaks

2015-12-08 Thread Gerd Hoffmann
  Hi,

We have some code in our tree to support pci passthrough of intel
graphics devices (igd) on xen, which requires some chipset tweaks
for (a) the host bridge and (b) the lpc/isa-bridge to meat the
expectations of the guest driver.  For kvm we need pretty much
the same, also the requirements for vgpu (xengt/kvmgt) are very
simliar.

This patch series tackles (a) only, (b) will follow later.  It
wires up the igd-passthru machine option for tcg/kvm too, moves
the code to its own file so it is nicely separated, fixes a bunch
of issues and finally adds q35 support.

This patch series has seen very light testing, basically doing
lspci in the guest to check whenever pci config space got updated
correctly.  Trying actual device assignment needs more pieces
being in place.  But I suspect even that is more testing than
the code has seen on xen so far (see patch #6 ...).

cheers,
  Gerd

Gerd Hoffmann (7):
  pc: wire up TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE for !xen
  pc: move igd support code to igd.c
  igd: switch TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE to realize
  igd: TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE: call parent realize
  igd: use defines for standard pci config space offsets
  igd: revamp host config read
  igd: add q35 support

 hw/i386/pc_piix.c |  11 ++--
 hw/pci-host/Makefile.objs |   3 ++
 hw/pci-host/igd.c | 132 ++
 hw/pci-host/piix.c|  88 ---
 hw/pci-host/q35.c |   6 ++-
 5 files changed, 145 insertions(+), 95 deletions(-)
 create mode 100644 hw/pci-host/igd.c

-- 
1.8.3.1




[Qemu-devel] [PATCH 2/7] pc: move igd support code to igd.c

2015-12-08 Thread Gerd Hoffmann
Pure code motion, except for dropping instance_size for
TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE (no need to set,
we can inherit it from TYPE_I440FX_PCI_DEVICE).

Signed-off-by: Gerd Hoffmann 
---
 hw/pci-host/Makefile.objs |  3 ++
 hw/pci-host/igd.c | 96 +++
 hw/pci-host/piix.c| 88 ---
 3 files changed, 99 insertions(+), 88 deletions(-)
 create mode 100644 hw/pci-host/igd.c

diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index 45f1f0e..e341a49 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -11,6 +11,9 @@ common-obj-$(CONFIG_PPCE500_PCI) += ppce500.o
 # ARM devices
 common-obj-$(CONFIG_VERSATILE_PCI) += versatile.o
 
+# igd passthrough support
+common-obj-$(CONFIG_LINUX) += igd.o
+
 common-obj-$(CONFIG_PCI_APB) += apb.o
 common-obj-$(CONFIG_FULONG) += bonito.o
 common-obj-$(CONFIG_PCI_PIIX) += piix.o
diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
new file mode 100644
index 000..ef0273b
--- /dev/null
+++ b/hw/pci-host/igd.c
@@ -0,0 +1,96 @@
+#include "qemu-common.h"
+#include "hw/pci/pci.h"
+#include "hw/i386/pc.h"
+
+/* IGD Passthrough Host Bridge. */
+typedef struct {
+uint8_t offset;
+uint8_t len;
+} IGDHostInfo;
+
+/* Here we just expose minimal host bridge offset subset. */
+static const IGDHostInfo igd_host_bridge_infos[] = {
+{0x08, 2},  /* revision id */
+{0x2c, 2},  /* sybsystem vendor id */
+{0x2e, 2},  /* sybsystem id */
+{0x50, 2},  /* SNB: processor graphics control register */
+{0x52, 2},  /* processor graphics control register */
+{0xa4, 4},  /* SNB: graphics base of stolen memory */
+{0xa8, 4},  /* SNB: base of GTT stolen memory */
+};
+
+static int host_pci_config_read(int pos, int len, uint32_t val)
+{
+char path[PATH_MAX];
+int config_fd;
+ssize_t size = sizeof(path);
+/* Access real host bridge. */
+int rc = snprintf(path, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s",
+  0, 0, 0, 0, "config");
+int ret = 0;
+
+if (rc >= size || rc < 0) {
+return -ENODEV;
+}
+
+config_fd = open(path, O_RDWR);
+if (config_fd < 0) {
+return -ENODEV;
+}
+
+if (lseek(config_fd, pos, SEEK_SET) != pos) {
+ret = -errno;
+goto out;
+}
+do {
+rc = read(config_fd, (uint8_t *), len);
+} while (rc < 0 && (errno == EINTR || errno == EAGAIN));
+if (rc != len) {
+ret = -errno;
+}
+out:
+close(config_fd);
+return ret;
+}
+
+static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev)
+{
+uint32_t val = 0;
+int rc, i, num;
+int pos, len;
+
+num = ARRAY_SIZE(igd_host_bridge_infos);
+for (i = 0; i < num; i++) {
+pos = igd_host_bridge_infos[i].offset;
+len = igd_host_bridge_infos[i].len;
+rc = host_pci_config_read(pos, len, val);
+if (rc) {
+return -ENODEV;
+}
+pci_default_write_config(pci_dev, pos, val, len);
+}
+
+return 0;
+}
+
+static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+k->init = igd_pt_i440fx_initfn;
+dc->desc = "IGD Passthrough Host bridge";
+}
+
+static const TypeInfo igd_passthrough_i440fx_info = {
+.name  = TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE,
+.parent= TYPE_I440FX_PCI_DEVICE,
+.class_init= igd_passthrough_i440fx_class_init,
+};
+
+static void igd_register_types(void)
+{
+type_register_static(_passthrough_i440fx_info);
+}
+
+type_init(igd_register_types)
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 715208b..ccacb57 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -744,93 +744,6 @@ static const TypeInfo i440fx_info = {
 .class_init= i440fx_class_init,
 };
 
-/* IGD Passthrough Host Bridge. */
-typedef struct {
-uint8_t offset;
-uint8_t len;
-} IGDHostInfo;
-
-/* Here we just expose minimal host bridge offset subset. */
-static const IGDHostInfo igd_host_bridge_infos[] = {
-{0x08, 2},  /* revision id */
-{0x2c, 2},  /* sybsystem vendor id */
-{0x2e, 2},  /* sybsystem id */
-{0x50, 2},  /* SNB: processor graphics control register */
-{0x52, 2},  /* processor graphics control register */
-{0xa4, 4},  /* SNB: graphics base of stolen memory */
-{0xa8, 4},  /* SNB: base of GTT stolen memory */
-};
-
-static int host_pci_config_read(int pos, int len, uint32_t val)
-{
-char path[PATH_MAX];
-int config_fd;
-ssize_t size = sizeof(path);
-/* Access real host bridge. */
-int rc = snprintf(path, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s",
-  0, 0, 0, 0, "config");
-int ret = 0;
-
-if (rc >= size || rc < 0) {
-return -ENODEV;
-}
-
-config_fd = open(path, O_RDWR);
-if 

[Qemu-devel] FD passing for chardevs and chardev backend multiplexing

2015-12-08 Thread Daniel P. Berrange
Historically libvirt has connected stdout & stderr from QEMU directly to
a plain file (/var/log/libvirt/qemu/$GUESTNAME.log).  This has worked
well enough in general, but is susceptible to a guest denial of service
if the guest can cause QEMU to spew messages to stderr. There are enough
places in QEMU and SPICE that still print to stderr that this isn't
very hard to achieve.

So In libvirt 1.3.0 we introduce a new daemon 'virtlogd' which is used
for handling log file writing. When this is used, QEMU's stdout/stderr
will be connected to an anonymous pipe file descriptor, the other end
of which is held by the virtlogd daemon. The virtlogd daemon will only
permit a fixed file size to be created before rotating the log file,
so we no longer have the possibility of unbounded disk usage, which is
nice.


I'm now looking to extend the use of 'virtlogd' to also handle character
devices. OpenStack has historically configured the primary serial port
to log to a file in order to capture kernel boot up messages and later
report them to the user via its API. This serial port file backend is
of course susceptible to the same disk space denial of service. We don't
really want to push file rotation logic into QEMU because that would
involve giving QEMU permission to create / rename files, which is
undesirable from a security POV. We also prefer a solution that ideally
works with existing QEMU builds.

So for this my plan is to stop using the QEMU 'file' backend for char
devs and instead pass across a pre-opened file descriptor, connected
to virtlogd. There is no "officially documented" way to pass in a
file descriptor to QEMU chardevs, but since QEMU uses qemu_open(),
we can make use of the fdset feature to achieve this. eg

eg, consider fd 33 is the write end of a pipe file descriptor
I can (in theory) do

  -add-fd set=2,fd=33 -chardev file,id=charserial0,path=/dev/fdset/2

Now in practice this doesn't work, because qmp_chardev_open_file()
passes the O_CREAT|O_TRUNC flags in, which means the qemu_open()
call will fail when using the pipe FD pased in via fdsets.

After more investigation I found it *is* possible to use a socketpair
and a pipe backend though...

  -add-fd set=2,fd=33 -chardev pipe,id=charserial0,path=/dev/fdset/2

..because for reasons I don't understand, if QEMU can't open $PATH.in
and $PATH.out, it'll fallback to just opening $PATH in read-write
mode even. AFAICT, this is pretty useless with pipes since they
are unidirectional, but, it works nicely with socketpairs, where
virtlogd has one of the socketpairs and QEMU gets passed the other
via fdset.

I can easily check this works for historical QEMU versions back
to when fdsets support was added to chardevs, but I'm working if
the QEMU maintainers consider this usage acceptable over the long
term, and if so, should we explicitly document it as supported ?

If not, should we introduce a more explicit syntax for passing in
a pre-opened FD for chardevs ? eg

  -add-fd set=2,fd=33 -chardev fd,id=charserial0,path=/dev/fdset/2

Or just make  -chardev file,id=charserial0,path=/dev/fdset/2 actually
work ?

Or something else ?


OpenStack has a further requirement to allow use of the serial port
as an interactive console, at the same time that it is logging to a
file which is something QEMU can't support at all currently. This
essentially means being able to have multiple chardev backends all
connected to the same serial frontend - specifically we would need
a TCP backend and a file backend concurrently. Again this could be
implemented in QEMU, but we'd prefer something that works with
existing QEMU.

This is not too difficult to achieve with virtlogd really. Instead
of using the QEMU 'tcp' or 'unix' chardev protocol, we'd just always
pass QEMU a pre-opened socketpair, and then leave the TCP/UNIX
socket listening to the virtlogd daemon.

This is portable with existing QEMU versions, but the obvious downside
with this is extra copies in the interactive console path. So might it
be worth exploring the posibility of a chardev multiplexor in QEMU. We
would still pass in a pre-opened socketpair to QEMU for the logging side
of things, but would leave the TCP/UNIX socket listening upto QEMU still.

eg should we make something like this work:

  -add-fd set=2,fd=33
  -chardev pipe,id=charserial0file,path=/dev/fdset/2
  -chardev 
socket,id=charserial0tcp,host=127.0.0.1,port=,telnet,server,nowait
  -chardev multiplex,id=charserial0,muxA=charserial0file,muxB=charserial1
  -serial isa-serial,chardev=charserial0,id=serial0

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests

2015-12-08 Thread Christian Borntraeger
On 12/08/2015 03:10 PM, Kevin Wolf wrote:
[...]
 Not a compiler bug. gcc uses a floating point register 8 to spill
 the pointer of blk (which is call saved) submit_request will later
 on call  qemu_coroutine_enter and after returning from 
 qemu_coroutine_enter, the fpr8 contains junk. Not sure yet, what happened.
>>>
>>> Coroutines don't save the FPU state, so you're not supposed to use
>>> floating point operations inside coroutines. That the compiler spills
>>> some integer value into a floating point register is a bit nasty...
>>
>> Just checked.  bdrv_aligned_preadv does also use fprs (also for filling
>> and spilling). Some versions of gcc seem to like that as the LDGR and LGDR
>> instructions are pretty cheap and move the content from/to fprs in a bitwise
>> fashion. So this coroutine DOES trash floating point registers.
>>
>> Without the patch gcc seems to be fine with the 16 gprs and does not
>> spilling/filling from/to fprs in bdrv_aligned_preadv.
> 
> Actually, on closer look it seems that the reason why there is no code
> for saving the floating point registers in setjmp() on x86 is that they
> are caller-save registers anyway, so it doesn't have to. Otherwise the
> internet seems to be of the opinion that longjmp() must indeed restore
> floating point registers.
> 
> So this might be a libc bug on s390 then.

Fixed with 
https://sourceware.org/ml/libc-alpha/2013-01/msg00853.html

Christian




Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests

2015-12-08 Thread Kevin Wolf
Am 08.12.2015 um 14:28 hat Christian Borntraeger geschrieben:
> On 12/08/2015 01:30 PM, Christian Borntraeger wrote:
> > On 12/08/2015 01:00 PM, Cornelia Huck wrote:
> >> On Tue, 8 Dec 2015 10:59:54 +0100
> >> Kevin Wolf  wrote:
> >>
> >>> Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben:
>  On Mon, 7 Dec 2015 11:02:51 +0100
>  Cornelia Huck  wrote:
> 
> > On Thu,  3 Dec 2015 13:00:00 +0800
> > Stefan Hajnoczi  wrote:
> >
> >> From: Fam Zheng 
> >>
> >> The assertion problem was noticed in 06c3916b35a, but it wasn't
> >> completely fixed, because even though the req is not marked as
> >> serialising, it still gets serialised by wait_serialising_requests
> >> against other serialising requests, which could lead to the same
> >> assertion failure.
> >>
> >> Fix it by even more explicitly skipping the serialising for this
> >> specific case.
> >>
> >> Signed-off-by: Fam Zheng 
> >> Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com
> >> Signed-off-by: Stefan Hajnoczi 
> >> ---
> >>  block/backup.c|  2 +-
> >>  block/io.c| 12 +++-
> >>  include/block/block.h |  4 ++--
> >>  trace-events  |  2 +-
> >>  4 files changed, 11 insertions(+), 9 deletions(-)
> >
> > This one causes segfaults for me:
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071
> > 3071if (!drv) {
> >
> > (gdb) bt
> > #0  bdrv_is_inserted (bs=0x8000) at 
> > /data/git/yyy/qemu/block.c:3071
> >>>
> >>> This looks like some kind of memory corruption that hit blk->bs. It's
> >>> most definitely not a valid pointer anyway.
> >>>
> > #1  0x80216974 in blk_is_inserted (blk=)
> > at /data/git/yyy/qemu/block/block-backend.c:986
> > #2  0x802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960)
> > at /data/git/yyy/qemu/block/block-backend.c:991
> > #3  0x80216d12 in blk_check_byte_request 
> > (blk=blk@entry=0x3ffb17e7960, 
> > offset=offset@entry=4928966656, size=16384)
> > at /data/git/yyy/qemu/block/block-backend.c:558
> > #4  0x80216df2 in blk_check_request 
> > (blk=blk@entry=0x3ffb17e7960, 
> > sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32)
> > at /data/git/yyy/qemu/block/block-backend.c:589
> > #5  0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num=
> > 9626888, iov=0x8098c658, nb_sectors=, cb=
> > 0x80081150 , opaque=0x80980620)
> > at /data/git/yyy/qemu/block/block-backend.c:727
> > #6  0x8008186e in submit_requests (niov=, 
> > num_reqs=, start=, mrb= > out>, 
> > blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366
> > #7  virtio_blk_submit_multireq (mrb=, blk= > out>)
> > at /data/git/yyy/qemu/hw/block/virtio-blk.c:444
> > #8  virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58)
> > at /data/git/yyy/qemu/hw/block/virtio-blk.c:389
> > #9  0x800823ee in virtio_blk_handle_output (vdev= > out>, 
> > vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615
> > #10 0x801e367e in aio_dispatch (ctx=0x80918520)
> > at /data/git/yyy/qemu/aio-posix.c:326
> > #11 0x801d28b0 in aio_ctx_dispatch (source=, 
> > callback=, user_data=)
> > at /data/git/yyy/qemu/async.c:231
> > #12 0x03fffd36a05a in g_main_context_dispatch ()
> >from /lib64/libglib-2.0.so.0
> > #13 0x801e0ffa in glib_pollfds_poll ()
> > at /data/git/yyy/qemu/main-loop.c:211
> > #14 os_host_main_loop_wait (timeout=)
> > at /data/git/yyy/qemu/main-loop.c:256
> > #15 main_loop_wait (nonblocking=)
> > at /data/git/yyy/qemu/main-loop.c:504
> > #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923
> > #17 main (argc=, argv=, envp= > out>)
> > at /data/git/yyy/qemu/vl.c:4684
> >
> > Relevant part of command line:
> >
> > -drive 
> > file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none
> >  -device 
> > virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off
> 
>  I played around a bit. The main part of this change seems to be calling
>  wait_serialising_requests() conditionally; reverting this makes the
>  guest boot again.
> 
>  I then tried to find out when wait_serialising_requests() was NOT
>  called and added fprintfs: well, it was _always_ called. I then added a
>  fprintf for flags at the beginning of the function: this produced a
>  segfault no matter 

Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests

2015-12-08 Thread Christian Borntraeger
On 12/08/2015 02:45 PM, Kevin Wolf wrote:
> Am 08.12.2015 um 14:28 hat Christian Borntraeger geschrieben:
>> On 12/08/2015 01:30 PM, Christian Borntraeger wrote:
>>> On 12/08/2015 01:00 PM, Cornelia Huck wrote:
 On Tue, 8 Dec 2015 10:59:54 +0100
 Kevin Wolf  wrote:

> Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben:
>> On Mon, 7 Dec 2015 11:02:51 +0100
>> Cornelia Huck  wrote:
>>
>>> On Thu,  3 Dec 2015 13:00:00 +0800
>>> Stefan Hajnoczi  wrote:
>>>
 From: Fam Zheng 

 The assertion problem was noticed in 06c3916b35a, but it wasn't
 completely fixed, because even though the req is not marked as
 serialising, it still gets serialised by wait_serialising_requests
 against other serialising requests, which could lead to the same
 assertion failure.

 Fix it by even more explicitly skipping the serialising for this
 specific case.

 Signed-off-by: Fam Zheng 
 Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com
 Signed-off-by: Stefan Hajnoczi 
 ---
  block/backup.c|  2 +-
  block/io.c| 12 +++-
  include/block/block.h |  4 ++--
  trace-events  |  2 +-
  4 files changed, 11 insertions(+), 9 deletions(-)
>>>
>>> This one causes segfaults for me:
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071
>>> 3071if (!drv) {
>>>
>>> (gdb) bt
>>> #0  bdrv_is_inserted (bs=0x8000) at 
>>> /data/git/yyy/qemu/block.c:3071
>
> This looks like some kind of memory corruption that hit blk->bs. It's
> most definitely not a valid pointer anyway.
>
>>> #1  0x80216974 in blk_is_inserted (blk=)
>>> at /data/git/yyy/qemu/block/block-backend.c:986
>>> #2  0x802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960)
>>> at /data/git/yyy/qemu/block/block-backend.c:991
>>> #3  0x80216d12 in blk_check_byte_request 
>>> (blk=blk@entry=0x3ffb17e7960, 
>>> offset=offset@entry=4928966656, size=16384)
>>> at /data/git/yyy/qemu/block/block-backend.c:558
>>> #4  0x80216df2 in blk_check_request 
>>> (blk=blk@entry=0x3ffb17e7960, 
>>> sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32)
>>> at /data/git/yyy/qemu/block/block-backend.c:589
>>> #5  0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num=
>>> 9626888, iov=0x8098c658, nb_sectors=, cb=
>>> 0x80081150 , opaque=0x80980620)
>>> at /data/git/yyy/qemu/block/block-backend.c:727
>>> #6  0x8008186e in submit_requests (niov=, 
>>> num_reqs=, start=, mrb=>> out>, 
>>> blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366
>>> #7  virtio_blk_submit_multireq (mrb=, blk=>> out>)
>>> at /data/git/yyy/qemu/hw/block/virtio-blk.c:444
>>> #8  virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58)
>>> at /data/git/yyy/qemu/hw/block/virtio-blk.c:389
>>> #9  0x800823ee in virtio_blk_handle_output (vdev=>> out>, 
>>> vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615
>>> #10 0x801e367e in aio_dispatch (ctx=0x80918520)
>>> at /data/git/yyy/qemu/aio-posix.c:326
>>> #11 0x801d28b0 in aio_ctx_dispatch (source=, 
>>> callback=, user_data=)
>>> at /data/git/yyy/qemu/async.c:231
>>> #12 0x03fffd36a05a in g_main_context_dispatch ()
>>>from /lib64/libglib-2.0.so.0
>>> #13 0x801e0ffa in glib_pollfds_poll ()
>>> at /data/git/yyy/qemu/main-loop.c:211
>>> #14 os_host_main_loop_wait (timeout=)
>>> at /data/git/yyy/qemu/main-loop.c:256
>>> #15 main_loop_wait (nonblocking=)
>>> at /data/git/yyy/qemu/main-loop.c:504
>>> #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923
>>> #17 main (argc=, argv=, envp=>> out>)
>>> at /data/git/yyy/qemu/vl.c:4684
>>>
>>> Relevant part of command line:
>>>
>>> -drive 
>>> file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none
>>>  -device 
>>> virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off
>>
>> I played around a bit. The main part of this change seems to be calling
>> wait_serialising_requests() conditionally; reverting this makes the
>> guest boot again.
>>
>> I then tried to find out when wait_serialising_requests() was NOT
>> called and added fprintfs: well, it was _always_ called. I then added a
>> fprintf for flags at the beginning of the 

[Qemu-devel] [PATCH 3/7] igd: switch TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE to realize

2015-12-08 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann 
---
 hw/pci-host/igd.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index ef0273b..d1eeafb 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -53,7 +53,7 @@ out:
 return ret;
 }
 
-static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev)
+static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp)
 {
 uint32_t val = 0;
 int rc, i, num;
@@ -65,12 +65,11 @@ static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev)
 len = igd_host_bridge_infos[i].len;
 rc = host_pci_config_read(pos, len, val);
 if (rc) {
-return -ENODEV;
+error_setg(errp, "failed to read host config");
+return;
 }
 pci_default_write_config(pci_dev, pos, val, len);
 }
-
-return 0;
 }
 
 static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data)
@@ -78,7 +77,7 @@ static void igd_passthrough_i440fx_class_init(ObjectClass 
*klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
-k->init = igd_pt_i440fx_initfn;
+k->realize = igd_pt_i440fx_realize;
 dc->desc = "IGD Passthrough Host bridge";
 }
 
-- 
1.8.3.1




[Qemu-devel] [PATCH 5/7] igd: use defines for standard pci config space offsets

2015-12-08 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann 
---
 hw/pci-host/igd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index 6f52ab1..0784128 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -10,9 +10,9 @@ typedef struct {
 
 /* Here we just expose minimal host bridge offset subset. */
 static const IGDHostInfo igd_host_bridge_infos[] = {
-{0x08, 2},  /* revision id */
-{0x2c, 2},  /* sybsystem vendor id */
-{0x2e, 2},  /* sybsystem id */
+{PCI_REVISION_ID, 2},
+{PCI_SUBSYSTEM_VENDOR_ID, 2},
+{PCI_SUBSYSTEM_ID,2},
 {0x50, 2},  /* SNB: processor graphics control register */
 {0x52, 2},  /* processor graphics control register */
 {0xa4, 4},  /* SNB: graphics base of stolen memory */
-- 
1.8.3.1




Re: [Qemu-devel] Error handling in realize() methods

2015-12-08 Thread Dr. David Alan Gilbert
* Markus Armbruster (arm...@redhat.com) wrote:
> In general, code running withing a realize() method should not exit() on
> error.  Instad, errors should be propagated through the realize()
> method.  Additionally, the realize() method should fail cleanly,
> i.e. carefully undo its side effects such as wiring of interrupts,
> mapping of memory, and so forth.  Tedious work, but necessary to make
> hot plug safe.
> 
> Quite a few devices don't do that.
> 
> Some of them can be usefully hot-plugged, and for them unclean failures
> are simply bugs.  I'm going to mark the ones I can find.
> 
> Others are used only as onboard devices, and if their realize() fails,
> the machine's init() will exit()[*].  In an ideal world, we'd start with
> an empty board and cold-plugg devices, and there, clean failure may be
> useful.  In the world we live in, making these devices fail cleanly is a
> lot of tedious work for no immediate gain.
> 
> Example: machine "kzm" and device "fsl,imx31".  fsl_imx31_realize()
> returns without cleanup on error.  kzm_init() exit(1)s when realize
> fails, so the lack of cleanup is a non-issue.
> 
> I think this is basically okay for now, but I'd like us to mark these
> devices cannot_instantiate_with_device_add_yet, with /* Reason:
> realize() method fails uncleanly */.
> 
> Opinions?
> 
> Next, let's consider the special case "out of memory".
> 
> Our general approach is to treat it as immediately fatal.  This makes
> sense, because when a smallish allocation fails, the process is almost
> certainly doomed anyway.  Moreover, memory allocation is frequent, and
> attempting to recover from failed memory allocation adds loads of
> hard-to-test error paths.  These are *dangerous* unless carefully tested
> (and we don't).
> 
> Certain important allocations we handle more gracefully.  For instance,
> we don't want to die when the user tries to hot-plug more memory than we
> can allocate, or tries to open a QCOW2 image with a huge L1 table.
> 
> Guest memory allocation used to have the "immediately fatal" policy
> baked in at a fairly low level, but it's since been lifted into callers;
> see commit c261d77..fc7a580 and fixups 4f96676..0bdaa3a.  During review
> of the latter, Peter Crosthwaite called out the _fatal in the
> realize methods and their supporting code.  I agreed with him back then
> that the errors should really be propagated.  But now I've changed my
> mind: I think we should treat these memory allocation failures like
> we've always treated them, namely report and exit(1).  Except for
> "large" allocations, where we have a higher probability of failure, and
> a more realistic chance to recover safely.
> 
> Can we agree that passing _fatal to memory_region_init_ram() &
> friends is basically okay even in realize() methods and their supporting
> code?

I'd say it depends if they can be hotplugged; I think anything that we really
want to hotplug onto real running machines (as opposed to where we're just
hotplugging during setup) we should propagate errors properly.

And tbh I don't buy the small allocation argument; I think we should handle them
all; in my utopian world a guest wouldn't die unless there was no way out.

Dave

> 
> [*] Well, the ones that bother to check for errors, but that's a
> separate problem.
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests

2015-12-08 Thread Cornelia Huck
On Tue, 8 Dec 2015 15:24:29 +0100
Christian Borntraeger  wrote:

> On 12/08/2015 03:10 PM, Kevin Wolf wrote:
> > So this might be a libc bug on s390 then.
> 
> Fixed with 
> https://sourceware.org/ml/libc-alpha/2013-01/msg00853.html

OK, so I need to upgrade that system; no bug in qemu. Thank you for
looking into this!




[Qemu-devel] Error handling in realize() methods

2015-12-08 Thread Markus Armbruster
In general, code running withing a realize() method should not exit() on
error.  Instad, errors should be propagated through the realize()
method.  Additionally, the realize() method should fail cleanly,
i.e. carefully undo its side effects such as wiring of interrupts,
mapping of memory, and so forth.  Tedious work, but necessary to make
hot plug safe.

Quite a few devices don't do that.

Some of them can be usefully hot-plugged, and for them unclean failures
are simply bugs.  I'm going to mark the ones I can find.

Others are used only as onboard devices, and if their realize() fails,
the machine's init() will exit()[*].  In an ideal world, we'd start with
an empty board and cold-plugg devices, and there, clean failure may be
useful.  In the world we live in, making these devices fail cleanly is a
lot of tedious work for no immediate gain.

Example: machine "kzm" and device "fsl,imx31".  fsl_imx31_realize()
returns without cleanup on error.  kzm_init() exit(1)s when realize
fails, so the lack of cleanup is a non-issue.

I think this is basically okay for now, but I'd like us to mark these
devices cannot_instantiate_with_device_add_yet, with /* Reason:
realize() method fails uncleanly */.

Opinions?

Next, let's consider the special case "out of memory".

Our general approach is to treat it as immediately fatal.  This makes
sense, because when a smallish allocation fails, the process is almost
certainly doomed anyway.  Moreover, memory allocation is frequent, and
attempting to recover from failed memory allocation adds loads of
hard-to-test error paths.  These are *dangerous* unless carefully tested
(and we don't).

Certain important allocations we handle more gracefully.  For instance,
we don't want to die when the user tries to hot-plug more memory than we
can allocate, or tries to open a QCOW2 image with a huge L1 table.

Guest memory allocation used to have the "immediately fatal" policy
baked in at a fairly low level, but it's since been lifted into callers;
see commit c261d77..fc7a580 and fixups 4f96676..0bdaa3a.  During review
of the latter, Peter Crosthwaite called out the _fatal in the
realize methods and their supporting code.  I agreed with him back then
that the errors should really be propagated.  But now I've changed my
mind: I think we should treat these memory allocation failures like
we've always treated them, namely report and exit(1).  Except for
"large" allocations, where we have a higher probability of failure, and
a more realistic chance to recover safely.

Can we agree that passing _fatal to memory_region_init_ram() &
friends is basically okay even in realize() methods and their supporting
code?


[*] Well, the ones that bother to check for errors, but that's a
separate problem.



Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests

2015-12-08 Thread Kevin Wolf
Am 08.12.2015 um 14:58 hat Christian Borntraeger geschrieben:
> On 12/08/2015 02:45 PM, Kevin Wolf wrote:
> > Am 08.12.2015 um 14:28 hat Christian Borntraeger geschrieben:
> >> On 12/08/2015 01:30 PM, Christian Borntraeger wrote:
> >>> On 12/08/2015 01:00 PM, Cornelia Huck wrote:
>  On Tue, 8 Dec 2015 10:59:54 +0100
>  Kevin Wolf  wrote:
> 
> > Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben:
> >> On Mon, 7 Dec 2015 11:02:51 +0100
> >> Cornelia Huck  wrote:
> >>
> >>> On Thu,  3 Dec 2015 13:00:00 +0800
> >>> Stefan Hajnoczi  wrote:
> >>>
>  From: Fam Zheng 
> 
>  The assertion problem was noticed in 06c3916b35a, but it wasn't
>  completely fixed, because even though the req is not marked as
>  serialising, it still gets serialised by wait_serialising_requests
>  against other serialising requests, which could lead to the same
>  assertion failure.
> 
>  Fix it by even more explicitly skipping the serialising for this
>  specific case.
> 
>  Signed-off-by: Fam Zheng 
>  Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com
>  Signed-off-by: Stefan Hajnoczi 
>  ---
>   block/backup.c|  2 +-
>   block/io.c| 12 +++-
>   include/block/block.h |  4 ++--
>   trace-events  |  2 +-
>   4 files changed, 11 insertions(+), 9 deletions(-)
> >>>
> >>> This one causes segfaults for me:
> >>>
> >>> Program received signal SIGSEGV, Segmentation fault.
> >>> bdrv_is_inserted (bs=0x8000) at 
> >>> /data/git/yyy/qemu/block.c:3071
> >>> 3071  if (!drv) {
> >>>
> >>> (gdb) bt
> >>> #0  bdrv_is_inserted (bs=0x8000) at 
> >>> /data/git/yyy/qemu/block.c:3071
> >
> > This looks like some kind of memory corruption that hit blk->bs. It's
> > most definitely not a valid pointer anyway.
> >
> >>> #1  0x80216974 in blk_is_inserted (blk=)
> >>> at /data/git/yyy/qemu/block/block-backend.c:986
> >>> #2  0x802169c6 in blk_is_available 
> >>> (blk=blk@entry=0x3ffb17e7960)
> >>> at /data/git/yyy/qemu/block/block-backend.c:991
> >>> #3  0x80216d12 in blk_check_byte_request 
> >>> (blk=blk@entry=0x3ffb17e7960, 
> >>> offset=offset@entry=4928966656, size=16384)
> >>> at /data/git/yyy/qemu/block/block-backend.c:558
> >>> #4  0x80216df2 in blk_check_request 
> >>> (blk=blk@entry=0x3ffb17e7960, 
> >>> sector_num=sector_num@entry=9626888, 
> >>> nb_sectors=nb_sectors@entry=32)
> >>> at /data/git/yyy/qemu/block/block-backend.c:589
> >>> #5  0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, 
> >>> sector_num=
> >>> 9626888, iov=0x8098c658, nb_sectors=, cb=
> >>> 0x80081150 , opaque=0x80980620)
> >>> at /data/git/yyy/qemu/block/block-backend.c:727
> >>> #6  0x8008186e in submit_requests (niov=, 
> >>> num_reqs=, start=, mrb= >>> out>, 
> >>> blk=) at 
> >>> /data/git/yyy/qemu/hw/block/virtio-blk.c:366
> >>> #7  virtio_blk_submit_multireq (mrb=, blk= >>> out>)
> >>> at /data/git/yyy/qemu/hw/block/virtio-blk.c:444
> >>> #8  virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58)
> >>> at /data/git/yyy/qemu/hw/block/virtio-blk.c:389
> >>> #9  0x800823ee in virtio_blk_handle_output (vdev= >>> out>, 
> >>> vq=) at 
> >>> /data/git/yyy/qemu/hw/block/virtio-blk.c:615
> >>> #10 0x801e367e in aio_dispatch (ctx=0x80918520)
> >>> at /data/git/yyy/qemu/aio-posix.c:326
> >>> #11 0x801d28b0 in aio_ctx_dispatch (source=, 
> >>> callback=, user_data=)
> >>> at /data/git/yyy/qemu/async.c:231
> >>> #12 0x03fffd36a05a in g_main_context_dispatch ()
> >>>from /lib64/libglib-2.0.so.0
> >>> #13 0x801e0ffa in glib_pollfds_poll ()
> >>> at /data/git/yyy/qemu/main-loop.c:211
> >>> #14 os_host_main_loop_wait (timeout=)
> >>> at /data/git/yyy/qemu/main-loop.c:256
> >>> #15 main_loop_wait (nonblocking=)
> >>> at /data/git/yyy/qemu/main-loop.c:504
> >>> #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923
> >>> #17 main (argc=, argv=, envp= >>> out>)
> >>> at /data/git/yyy/qemu/vl.c:4684
> >>>
> >>> Relevant part of command line:
> >>>
> >>> -drive 
> >>> file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none
> >>>  -device 
> >>> virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off
> >>
> >> I played around a bit. The main part of this change seems 

[Qemu-devel] [PATCH 4/7] igd: TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE: call parent realize

2015-12-08 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann 
---
 hw/pci-host/igd.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index d1eeafb..6f52ab1 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -53,12 +53,20 @@ out:
 return ret;
 }
 
+static void (*i440fx_realize)(PCIDevice *pci_dev, Error **errp);
 static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp)
 {
+Error *err = NULL;
 uint32_t val = 0;
 int rc, i, num;
 int pos, len;
 
+i440fx_realize(pci_dev, );
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+
 num = ARRAY_SIZE(igd_host_bridge_infos);
 for (i = 0; i < num; i++) {
 pos = igd_host_bridge_infos[i].offset;
@@ -77,6 +85,7 @@ static void igd_passthrough_i440fx_class_init(ObjectClass 
*klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
+i440fx_realize = k->realize;
 k->realize = igd_pt_i440fx_realize;
 dc->desc = "IGD Passthrough Host bridge";
 }
-- 
1.8.3.1




Re: [Qemu-devel] Qemu Crashing

2015-12-08 Thread Peter Maydell
On 8 December 2015 at 07:43, Saqib Khan  wrote:
> I compiled Qemu with following command:
>
> ./configure --target-list=x86_64-softmmu --enable-debug
>
> Then I started up my VM using following command :
>
> /home/user/qemu/qemu/bin/debug/native/x86_64-softmmu/qemu-system-x86_64-m
> 1024 -drive if=ide,file=Windows7qemu.qcow2,cache=none -cdrom Win7Sp1.iso
>
> Qemu keep crashing with segmentation fault have attached PNG showing seg
> fault at 2 different locations as in attacments:

Which version of QEMU are you building? Does the bug still occur
if you use the latest 2.5.0-rc3 ?

thanks
-- PMM



Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests

2015-12-08 Thread Christian Borntraeger
On 12/08/2015 02:58 PM, Christian Borntraeger wrote:
[...9
>>>
>>> Not a compiler bug. gcc uses a floating point register 8 to spill
>>> the pointer of blk (which is call saved) submit_request will later
>>> on call  qemu_coroutine_enter and after returning from 
>>> qemu_coroutine_enter, the fpr8 contains junk. Not sure yet, what happened.
>>
>> Coroutines don't save the FPU state, so you're not supposed to use
>> floating point operations inside coroutines. That the compiler spills
>> some integer value into a floating point register is a bit nasty...
> 
> Just checked.  bdrv_aligned_preadv does also use fprs (also for filling
> and spilling). Some versions of gcc seem to like that as the LDGR and LGDR
> instructions are pretty cheap and move the content from/to fprs in a bitwise
> fashion. So this coroutine DOES trash floating point registers.
> 
> Without the patch gcc seems to be fine with the 16 gprs and does not
> spilling/filling from/to fprs in bdrv_aligned_preadv.
> 
> Christian

Kevin,

I am wondering. gcc saves/restores f8 in the generated code for the
coroutine and setjmp/longjmp also save/restore the fprs. why do 
coroutines do not save the FPU state (which code does a light weight
switching)

Christian




Re: [Qemu-devel] [v3 3/3] configure: add options to config avx2

2015-12-08 Thread Li, Liang Z
> On 8 December 2015 at 12:08, Liang Li  wrote:
> > Add the '--enable-avx2' & '--disable-avx2' option so as to config the
> > AVX2 instruction optimization.
> >
> > If '--disable-avx2' is not set, configure will detect if the compiler
> > can support AVX2 option, if yes, AVX2 optimization is eabled, else
> > disabled.
> 
> Is the configure option necessary? For other things like this (eg our use of
> SSE2 or Altivec) we just go ahead and use the feature if the compiler
> supports it.
> 

It seems unnecessary.

> When would somebody building QEMU want to disable this option?
> 
> thanks
> -- PMM

The v1 of this patch had the  '--enable-avx2' & '--disable-avx2'  options 
because this version did not
 support ifunc, and I left them here in the following version ...
I will remove them if they are unnecessary.  

Thanks for your comments.

Liang




Re: [Qemu-devel] [PATCH 1/2] Add param Error** to msi_init() & modify the callers

2015-12-08 Thread Markus Armbruster
Cao jin  writes:

> Hi Markus,
> I have to say, you really did a amazing review for this "trivial
> "patch, thanks a lot & really appreciate it:)

Thanks!  I'm afraid the problem you picked isn't trivial, but I hope
it's still simple enough to be a useful exercise to get you going with
the code.

> On 12/07/2015 05:59 PM, Markus Armbruster wrote:
>> Cao jin  writes:
>>
>>> msi_init() is a supporting function in PCI device initialization, in order 
>>> to
>>> convert .init() to .realize(), it should be modified first. Also modify the
>>> callers
>>>
>>> Bonus: add more comment for msi_init().
>>
>> Incomplete.  See notes on impact inline.
>>
>>> Signed-off-by: Cao jin 
>>> ---
>>>   hw/audio/intel-hda.c   |  7 ++-
>>>   hw/ide/ich.c   |  2 +-
>>>   hw/net/vmxnet3.c   |  3 ++-
>>>   hw/pci-bridge/ioh3420.c|  6 +-
>>>   hw/pci-bridge/pci_bridge_dev.c |  6 +-
>>>   hw/pci-bridge/xio3130_downstream.c |  7 ++-
>>>   hw/pci-bridge/xio3130_upstream.c   |  7 ++-
>>>   hw/pci/msi.c   | 17 +
>>>   hw/scsi/megasas.c  | 12 +---
>>>   hw/scsi/vmw_pvscsi.c   |  3 ++-
>>>   hw/usb/hcd-xhci.c  |  5 -
>>>   hw/vfio/pci.c  |  3 ++-
>>>   include/hw/pci/msi.h   |  4 ++--
>>>   13 files changed, 63 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c
>>> index 433463e..9d733da 100644
>>> --- a/hw/audio/intel-hda.c
>>> +++ b/hw/audio/intel-hda.c
>>> @@ -1130,6 +1130,7 @@ static void intel_hda_realize(PCIDevice *pci, Error 
>>> **errp)
>>>   {
>>>   IntelHDAState *d = INTEL_HDA(pci);
>>>   uint8_t *conf = d->pci.config;
>>> +int ret;
>>>
>>>   d->name = object_get_typename(OBJECT(d));
>>>
>>> @@ -1142,7 +1143,11 @@ static void intel_hda_realize(PCIDevice *pci, Error 
>>> **errp)
>>> "intel-hda", 0x4000);
>>>   pci_register_bar(>pci, 0, 0, >mmio);
>>>   if (d->msi) {
>>> -msi_init(>pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, false);
>>> +ret = msi_init(>pci, d->old_msi_addr ? 0x50 : 0x60, 1, true,
>>> +false, errp);
>>> +if(ret < 0) {
>>
>> Please use scripts/checkpatch.pl to check your patches.  It's
>> occasionally wrong, so use your judgement.
>>
>
> Thanks for the tips, seems I got dizzy looking because many trivial
> place need to be modified...
>
>>> +return;
>>
>> This returns with the device in a half-realized state.  Do we have to
>> undo prior side effects to put it back into unrealized state?  See also
>> ioh3420_initfn() below.
>>
>> Before: msi_init() failure is ignored.  After: it makes device
>> realization fail.  To assess impact, we need to understand how
>> msi_init() can fail.
>>
>
> It seems I missed the reality: devices are default to be hot-pluggable
> & most devices are hot-pluggable:-[ Because when cold plugged, process
> will exit on device-init failing, so, the half-realized state doesn`t
> matter in this condition.
> Will rework it later.

In theory, realize() should always fail cleanly.  In practice, unclean
realize() failure doesn't matter when it's fatal anyway.  Some devices
are only used where it's always fatal.  See also "Error handling in
realize() methods" I just sent to the list; I hope we can come up with
some guidance on when shortcuts in realize() methods are tolerable.

>>> +}
>>>   }
>>>
>>>   hda_codec_bus_init(DEVICE(pci), >codecs, sizeof(d->codecs),
>>> diff --git a/hw/ide/ich.c b/hw/ide/ich.c
>>> index 16925fa..94b1809 100644
>>> --- a/hw/ide/ich.c
>>> +++ b/hw/ide/ich.c
>>> @@ -145,7 +145,7 @@ static void pci_ich9_ahci_realize(PCIDevice *dev, Error 
>>> **errp)
>>>   /* Although the AHCI 1.3 specification states that the first 
>>> capability
>>>* should be PMCAP, the Intel ICH9 data sheet specifies that the ICH9
>>>* AHCI device puts the MSI capability first, pointing to 0x80. */
>>> -msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false);
>>> +msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false, errp);
>>
>> Do we have to put the device back into unrealized state on failure?
>>
>>>   }
>>>
>>>   static void pci_ich9_uninit(PCIDevice *dev)
>>> diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
>>> index 5e3a233..4269141 100644
>>> --- a/hw/net/vmxnet3.c
>>> +++ b/hw/net/vmxnet3.c
>>> @@ -2140,9 +2140,10 @@ vmxnet3_init_msi(VMXNET3State *s)
>>>   {
>>>   PCIDevice *d = PCI_DEVICE(s);
>>>   int res;
>>> +Error *local_err = NULL;
>>>
>>>   res = msi_init(d, VMXNET3_MSI_OFFSET, VMXNET3_MAX_NMSIX_INTRS,
>>> -   VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK);
>>> +   VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK, _err);
>>>   if (0 > res) {
>>>   VMW_WRPRN("Failed to 

[Qemu-devel] [PATCH 6/7] igd: revamp host config read

2015-12-08 Thread Gerd Hoffmann
Move all work to the host_pci_config_copy helper function,
which we can easily reuse when adding q35 support.
Open sysfs file only once for all values.  Use pread.
Proper error handling.  Fix bugs:

 * Don't throw away results (like old host_pci_config_read
   did because val was passed by value not reference).
 * Update config space directly (writing via
   pci_default_write_config only works for registers
   whitelisted in wmask).

Hmm, this code can hardly ever worked before,
/me wonders what test coverage it had.

With this patch in place igd-passthru=on actually
works, although it still requires root priviledges
because linux refuses to allow non-root users access
pci config space above offset 0x50.

Signed-off-by: Gerd Hoffmann 
---
 hw/pci-host/igd.c | 65 +++
 1 file changed, 27 insertions(+), 38 deletions(-)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index 0784128..ec48875 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -19,47 +19,39 @@ static const IGDHostInfo igd_host_bridge_infos[] = {
 {0xa8, 4},  /* SNB: base of GTT stolen memory */
 };
 
-static int host_pci_config_read(int pos, int len, uint32_t val)
+static void host_pci_config_copy(PCIDevice *guest, const char *host,
+ const IGDHostInfo *list, int len, Error 
**errp)
 {
-char path[PATH_MAX];
-int config_fd;
-ssize_t size = sizeof(path);
-/* Access real host bridge. */
-int rc = snprintf(path, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s",
-  0, 0, 0, 0, "config");
-int ret = 0;
+char *path;
+int config_fd, rc, i;
 
-if (rc >= size || rc < 0) {
-return -ENODEV;
-}
-
-config_fd = open(path, O_RDWR);
+path = g_strdup_printf("/sys/bus/pci/devices/%s/config", host);
+config_fd = open(path, O_RDONLY);
 if (config_fd < 0) {
-return -ENODEV;
+error_setg_file_open(errp, errno, path);
+goto out_free;
 }
 
-if (lseek(config_fd, pos, SEEK_SET) != pos) {
-ret = -errno;
-goto out;
+for (i = 0; i < len; i++) {
+rc = pread(config_fd, guest->config + list[i].offset,
+   list[i].len, list[i].offset);
+if (rc != list[i].len) {
+error_setg_errno(errp, errno, "read %s, offset 0x%x",
+ path, list[i].offset);
+goto out_close;
+}
 }
-do {
-rc = read(config_fd, (uint8_t *), len);
-} while (rc < 0 && (errno == EINTR || errno == EAGAIN));
-if (rc != len) {
-ret = -errno;
-}
-out:
+
+out_close:
 close(config_fd);
-return ret;
+out_free:
+g_free(path);
 }
 
 static void (*i440fx_realize)(PCIDevice *pci_dev, Error **errp);
 static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp)
 {
 Error *err = NULL;
-uint32_t val = 0;
-int rc, i, num;
-int pos, len;
 
 i440fx_realize(pci_dev, );
 if (err != NULL) {
@@ -67,16 +59,13 @@ static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error 
**errp)
 return;
 }
 
-num = ARRAY_SIZE(igd_host_bridge_infos);
-for (i = 0; i < num; i++) {
-pos = igd_host_bridge_infos[i].offset;
-len = igd_host_bridge_infos[i].len;
-rc = host_pci_config_read(pos, len, val);
-if (rc) {
-error_setg(errp, "failed to read host config");
-return;
-}
-pci_default_write_config(pci_dev, pos, val, len);
+host_pci_config_copy(pci_dev, ":00:00.0",
+ igd_host_bridge_infos,
+ ARRAY_SIZE(igd_host_bridge_infos),
+ );
+if (err != NULL) {
+error_propagate(errp, err);
+return;
 }
 }
 
-- 
1.8.3.1




[Qemu-devel] [PATCH 7/7] igd: add q35 support

2015-12-08 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann 
---
 hw/pci-host/igd.c | 41 -
 hw/pci-host/q35.c |  6 +-
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index ec48875..f6e3f7a 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -1,5 +1,6 @@
 #include "qemu-common.h"
 #include "hw/pci/pci.h"
+#include "hw/pci-host/q35.h"
 #include "hw/i386/pc.h"
 
 /* IGD Passthrough Host Bridge. */
@@ -76,7 +77,7 @@ static void igd_passthrough_i440fx_class_init(ObjectClass 
*klass, void *data)
 
 i440fx_realize = k->realize;
 k->realize = igd_pt_i440fx_realize;
-dc->desc = "IGD Passthrough Host bridge";
+dc->desc = "IGD Passthrough Host bridge (i440fx)";
 }
 
 static const TypeInfo igd_passthrough_i440fx_info = {
@@ -85,9 +86,47 @@ static const TypeInfo igd_passthrough_i440fx_info = {
 .class_init= igd_passthrough_i440fx_class_init,
 };
 
+static void (*q35_realize)(PCIDevice *pci_dev, Error **errp);
+static void igd_pt_q35_realize(PCIDevice *pci_dev, Error **errp)
+{
+Error *err = NULL;
+
+q35_realize(pci_dev, );
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+
+host_pci_config_copy(pci_dev, ":00:00.0",
+ igd_host_bridge_infos,
+ ARRAY_SIZE(igd_host_bridge_infos),
+ );
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+}
+
+static void igd_passthrough_q35_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+q35_realize = k->realize;
+k->realize = igd_pt_q35_realize;
+dc->desc = "IGD Passthrough Host bridge (q35)";
+}
+
+static const TypeInfo igd_passthrough_q35_info = {
+.name  = "igd-passthrough-q35-mch",
+.parent= TYPE_MCH_PCI_DEVICE,
+.class_init= igd_passthrough_q35_class_init,
+};
+
 static void igd_register_types(void)
 {
 type_register_static(_passthrough_i440fx_info);
+type_register_static(_passthrough_q35_info);
 }
 
 type_init(igd_register_types)
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 1fb4707..07dc595 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -151,7 +151,11 @@ static void q35_host_initfn(Object *obj)
 memory_region_init_io(>data_mem, obj, _host_data_le_ops, phb,
   "pci-conf-data", 4);
 
-object_initialize(>mch, sizeof(s->mch), TYPE_MCH_PCI_DEVICE);
+if (object_property_get_bool(qdev_get_machine(), "igd-passthru", NULL)) {
+object_initialize(>mch, sizeof(s->mch), "igd-passthrough-q35-mch");
+} else {
+object_initialize(>mch, sizeof(s->mch), TYPE_MCH_PCI_DEVICE);
+}
 object_property_add_child(OBJECT(s), "mch", OBJECT(>mch), NULL);
 qdev_prop_set_uint32(DEVICE(>mch), "addr", PCI_DEVFN(0, 0));
 qdev_prop_set_bit(DEVICE(>mch), "multifunction", false);
-- 
1.8.3.1




Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests

2015-12-08 Thread Peter Maydell
On 8 December 2015 at 13:45, Kevin Wolf  wrote:
> Coroutines don't save the FPU state, so you're not supposed to use
> floating point operations inside coroutines. That the compiler spills
> some integer value into a floating point register is a bit nasty...

The compiler will happily use FP registers even for apparently
integer code if it thinks that is a better way to do it (eg on
some CPUs doing memcpy and other kinds of block data move may
go faster via the FPU registers, or it might be faster to spill
an integer register into an FP register rather than spilling it
to memory). As I see you've already determined, it's the
job of setjmp/longjmp to make sure that everything is saved
and restored correctly, fp or otherwise...

thanks
-- PMM



Re: [Qemu-devel] [PATCH v9 00/10] qcow2: Support refcount order amendment

2015-12-08 Thread Kevin Wolf
Am 27.07.2015 um 17:51 hat Max Reitz geschrieben:
> (v1..v7 were named "qcow2: Support refcount orders != 4")
> 
> This series contains the final 10 patches of my qcow2 refcount order
> series, which add refcount order amendment functionality to qemu-img.

Thanks, applied to block-next (after some trivial rebasing).

Kevin



Re: [Qemu-devel] FD passing for chardevs and chardev backend multiplexing

2015-12-08 Thread Eric Blake
On 12/08/2015 07:59 AM, Daniel P. Berrange wrote:

> So for this my plan is to stop using the QEMU 'file' backend for char
> devs and instead pass across a pre-opened file descriptor, connected
> to virtlogd. There is no "officially documented" way to pass in a
> file descriptor to QEMU chardevs, but since QEMU uses qemu_open(),
> we can make use of the fdset feature to achieve this. eg
> 
> eg, consider fd 33 is the write end of a pipe file descriptor
> I can (in theory) do
> 
>   -add-fd set=2,fd=33 -chardev file,id=charserial0,path=/dev/fdset/2
> 
> Now in practice this doesn't work, because qmp_chardev_open_file()
> passes the O_CREAT|O_TRUNC flags in, which means the qemu_open()
> call will fail when using the pipe FD pased in via fdsets.

Is it just the O_TRUNC that is failing? If so, there is a recent patch
to add an 'append':true flag that switches O_TRUNC off in favor of O_APPEND:
https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg00762.html

Or is it that the pipe is one-way, but chardev insists on O_RDWR and
fails because it is not two-way?

> 
> After more investigation I found it *is* possible to use a socketpair
> and a pipe backend though...
> 
>   -add-fd set=2,fd=33 -chardev pipe,id=charserial0,path=/dev/fdset/2

Yes, a socketpair is bi-directional, so it supports O_RDWR opening.

> 
> ..because for reasons I don't understand, if QEMU can't open $PATH.in
> and $PATH.out, it'll fallback to just opening $PATH in read-write
> mode even. AFAICT, this is pretty useless with pipes since they
> are unidirectional, but, it works nicely with socketpairs, where
> virtlogd has one of the socketpairs and QEMU gets passed the other
> via fdset.

Is it something where we'd want to support two pipes, and open
/dev/fdset/2 tied to char.in and /dev/fdset/3 tied to char.out, where
uni-directional pipes are again okay?

> 
> I can easily check this works for historical QEMU versions back
> to when fdsets support was added to chardevs, but I'm working if
> the QEMU maintainers consider this usage acceptable over the long
> term, and if so, should we explicitly document it as supported ?

It seems like a bi-directional socketpair as the single endpoint for a
chardev is useful enough to support and document, but I'm not the
maintainer to give final say-so.

> 
> If not, should we introduce a more explicit syntax for passing in
> a pre-opened FD for chardevs ? eg
> 
>   -add-fd set=2,fd=33 -chardev fd,id=charserial0,path=/dev/fdset/2
> 

Difference to the line you tried above:

>   -add-fd set=2,fd=33 -chardev file,id=charserial0,path=/dev/fdset/2

is 'fd' instead of 'file'.  But if we're going to add a new protocol, do
we even need to go through the "/dev/fdset/..." name, or can we just
pass the fd number directly?

> Or just make  -chardev file,id=charserial0,path=/dev/fdset/2 actually
> work ?

I'd lean more to this case - the whole point of fdsets was that we don't
have to add multiple fd protocols; that everyone that understood file
syntax and uses qemu_open() magically gained fd support.

> 
> Or something else ?
> 
> 
> OpenStack has a further requirement to allow use of the serial port
> as an interactive console, at the same time that it is logging to a
> file which is something QEMU can't support at all currently. This
> essentially means being able to have multiple chardev backends all
> connected to the same serial frontend - specifically we would need
> a TCP backend and a file backend concurrently. Again this could be
> implemented in QEMU, but we'd prefer something that works with
> existing QEMU.
> 
> This is not too difficult to achieve with virtlogd really. Instead
> of using the QEMU 'tcp' or 'unix' chardev protocol, we'd just always
> pass QEMU a pre-opened socketpair, and then leave the TCP/UNIX
> socket listening to the virtlogd daemon.
> 
> This is portable with existing QEMU versions, but the obvious downside
> with this is extra copies in the interactive console path. So might it
> be worth exploring the posibility of a chardev multiplexor in QEMU. We
> would still pass in a pre-opened socketpair to QEMU for the logging side
> of things, but would leave the TCP/UNIX socket listening upto QEMU still.
> 
> eg should we make something like this work:
> 
>   -add-fd set=2,fd=33
>   -chardev pipe,id=charserial0file,path=/dev/fdset/2
>   -chardev 
> socket,id=charserial0tcp,host=127.0.0.1,port=,telnet,server,nowait
>   -chardev multiplex,id=charserial0,muxA=charserial0file,muxB=charserial1

wouldn't muxB be charserial0tcp (not charserial1)?

>   -serial isa-serial,chardev=charserial0,id=serial0

But the idea of a multiplex protocol that has multiple data sinks (guest
output copied to all sinks) and a single source (at most one source can
provide input to the guest) makes sense on the surface.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] qom: change object property iterator API contract

2015-12-08 Thread Eric Blake
On 11/27/2015 08:27 AM, Daniel P. Berrange wrote:
> Currently the object property iterator API works as follows
> 
>   ObjectPropertyIterator *iter;
> 
>   iter = object_property_iter_init(obj);
>   while ((prop = object_property_iter_next(iter))) {
>  ...
>   }
>   object_property_iter_free(iter);
> 
> This has the benefit that the ObjectPropertyIterator struct
> can be opaque, but has the downside that callers need to
> explicitly call a free function. It is also not in keeping
> with iterator style used elsewhere in QEMU/glib2
> 
> This patch changes the API to use stack allocation instead
> 
>   ObjectPropertyIterator iter;
> 
>   object_property_iter_init(, obj);
>   while ((prop = object_property_iter_next())) {
>  ...
>   }
> 
> Signed-off-by: Daniel P. Berrange 
> ---
> 
> NB, this patch is not against master, it is intended to apply
> after
> 
>   "qom: allow properties to be registered against classes"
> 
> which is queued in qom-next for 2.6
> 
>  hw/ppc/spapr_drc.c |  7 +++
>  include/qom/object.h   | 42 +++---
>  net/filter.c   |  7 +++
>  qmp.c  | 14 ++
>  qom/object.c   | 22 --
>  tests/check-qom-proplist.c |  7 +++
>  vl.c   |  7 +++
>  7 files changed, 49 insertions(+), 57 deletions(-)
> 

> +++ b/include/qom/object.h
> @@ -346,6 +346,7 @@ typedef struct ObjectProperty
>  void *opaque;
>  } ObjectProperty;
>  
> +
>  /**
>   * ObjectUnparent:
>   * @obj: the object that is being removed from the composition tree

Spurious whitespace change?


>  
> + /**
> + * object_property_iter_free:
> + * @iter: the iterator instance
> + *
> + * Releases any resources associated with the iterator. It is
> + * not necessary to call this method if object_property_iter_next
> + * has returned %NULL. It is only required if an application wishes
> + * to abort iteration before it is complete
> + */
> +void object_property_iter_free(ObjectPropertyIterator *iter);
> +

Huh? Why is this being added?  I thought the point was to get rid of the
need for object_property_iter_free().

> +++ b/qom/object.c
> @@ -67,11 +67,6 @@ struct TypeImpl

Other than that snafu, everything else looked fine.   If that's all you
fix for v2, you can add:

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v3 4/5] crypto: add QCryptoSecret object class for password/key handling

2015-12-08 Thread Eric Blake
On 11/27/2015 09:30 AM, Daniel P. Berrange wrote:
> Introduce a new QCryptoSecret object class which will be used
> for providing passwords and keys to other objects which need
> sensitive credentials.
> 

> More examples are shown in the updated docs.
> 
> Signed-off-by: Daniel P. Berrange 
> ---

> +++ b/crypto/secret.c

> +static void
> +qcrypto_secret_load_data(QCryptoSecret *secret,
> + uint8_t **output,
> + size_t *outputlen,
> + Error **errp)
> +{

> +if (!g_file_get_contents(secret->file, , , )) {
> +error_setg(errp,
> +   "Unable to read %s: %s",
> +   secret->file, gerr->message);
> +g_error_free(gerr);
> +return;
> +}
> +if (length) {
> +/* Even though data is raw 8-bit, so may contain
> + * arbitrary NULs, ensure it is explicitly NUL
> + * terminated */
> +*output = g_renew(uint8_t, data, length + 1);
> +(*output)[length] = '\0';

These two lines are dead code. g_file_get_contents() guarantees that on
success, contents is malloc'd large enough and that contents[length] == 0.

https://developer.gnome.org/glib/stable/glib-File-Utilities.html#g-file-get-contents

> +*outputlen = length;
> +} else {
> +error_setg(errp, "Secret file %s is empty",
> +   secret->file);

Is there any technical reason why we must forbid a 0-length password?
(Sometimes, having the empty string as a password can be a useful for
development tests).  I'm not opposed to rejecting it, especially if
doing so now avoids a more cryptic error message later because there is
indeed a technical reason; but just want to make sure it is not an
arbitrary limitation.

> +g_free(data);
> +}
> +} else if (secret->data) {
> +*outputlen = strlen(secret->data);
> +*output = g_new(uint8_t, *outputlen + 1);
> +memcpy(*output, secret->data, *outputlen + 1);

These two lines could be shortened to:
*output = g_strdup(secret->data);

> +
> +static void qcrypto_secret_decrypt(QCryptoSecret *secret,
> +   const uint8_t *input,
> +   size_t inputlen,
> +   uint8_t **output,
> +   size_t *outputlen,
> +   Error **errp)
> +{

> +if (secret->format == QCRYPTO_SECRET_FORMAT_BASE64) {
> +ciphertext = qbase64_decode((const gchar*)input,
> +inputlen,
> +,
> +errp);
> +if (!ciphertext) {
> +goto cleanup;
> +}
> +plaintext = g_new0(uint8_t, ciphertextlen + 1);
> +} else {
> +ciphertextlen = inputlen;
> +plaintext = g_new0(uint8_t, inputlen + 1);

g_new0(uint8_t, value) is the same as g_malloc0(value); I don't know if
it is worth the distinction.  But not worth a respin on its own.

I found some style or efficiency things you might want to touch up, but
no actual bugs that would prevent this patch from being usable as-is.

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [iGVT-g] [PATCH 6/7] igd: revamp host config read

2015-12-08 Thread Jike Song

On 12/08/2015 10:07 PM, Gerd Hoffmann wrote:

Move all work to the host_pci_config_copy helper function,
which we can easily reuse when adding q35 support.
Open sysfs file only once for all values.  Use pread.
Proper error handling.  Fix bugs:

  * Don't throw away results (like old host_pci_config_read
did because val was passed by value not reference).
  * Update config space directly (writing via
pci_default_write_config only works for registers
whitelisted in wmask).

Hmm, this code can hardly ever worked before,
/me wonders what test coverage it had.



WOW, that's really impressive! :)

--
Thanks,
Jike


With this patch in place igd-passthru=on actually
works, although it still requires root priviledges
because linux refuses to allow non-root users access
pci config space above offset 0x50.

Signed-off-by: Gerd Hoffmann 
---
  hw/pci-host/igd.c | 65 +++
  1 file changed, 27 insertions(+), 38 deletions(-)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index 0784128..ec48875 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -19,47 +19,39 @@ static const IGDHostInfo igd_host_bridge_infos[] = {
  {0xa8, 4},  /* SNB: base of GTT stolen memory */
  };

-static int host_pci_config_read(int pos, int len, uint32_t val)
+static void host_pci_config_copy(PCIDevice *guest, const char *host,
+ const IGDHostInfo *list, int len, Error 
**errp)
  {
-char path[PATH_MAX];
-int config_fd;
-ssize_t size = sizeof(path);
-/* Access real host bridge. */
-int rc = snprintf(path, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s",
-  0, 0, 0, 0, "config");
-int ret = 0;
+char *path;
+int config_fd, rc, i;

-if (rc >= size || rc < 0) {
-return -ENODEV;
-}
-
-config_fd = open(path, O_RDWR);
+path = g_strdup_printf("/sys/bus/pci/devices/%s/config", host);
+config_fd = open(path, O_RDONLY);
  if (config_fd < 0) {
-return -ENODEV;
+error_setg_file_open(errp, errno, path);
+goto out_free;
  }

-if (lseek(config_fd, pos, SEEK_SET) != pos) {
-ret = -errno;
-goto out;
+for (i = 0; i < len; i++) {
+rc = pread(config_fd, guest->config + list[i].offset,
+   list[i].len, list[i].offset);
+if (rc != list[i].len) {
+error_setg_errno(errp, errno, "read %s, offset 0x%x",
+ path, list[i].offset);
+goto out_close;
+}
  }
-do {
-rc = read(config_fd, (uint8_t *), len);
-} while (rc < 0 && (errno == EINTR || errno == EAGAIN));
-if (rc != len) {
-ret = -errno;
-}
-out:
+
+out_close:
  close(config_fd);
-return ret;
+out_free:
+g_free(path);
  }

  static void (*i440fx_realize)(PCIDevice *pci_dev, Error **errp);
  static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp)
  {
  Error *err = NULL;
-uint32_t val = 0;
-int rc, i, num;
-int pos, len;

  i440fx_realize(pci_dev, );
  if (err != NULL) {
@@ -67,16 +59,13 @@ static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error 
**errp)
  return;
  }

-num = ARRAY_SIZE(igd_host_bridge_infos);
-for (i = 0; i < num; i++) {
-pos = igd_host_bridge_infos[i].offset;
-len = igd_host_bridge_infos[i].len;
-rc = host_pci_config_read(pos, len, val);
-if (rc) {
-error_setg(errp, "failed to read host config");
-return;
-}
-pci_default_write_config(pci_dev, pos, val, len);
+host_pci_config_copy(pci_dev, ":00:00.0",
+ igd_host_bridge_infos,
+ ARRAY_SIZE(igd_host_bridge_infos),
+ );
+if (err != NULL) {
+error_propagate(errp, err);
+return;
  }
  }






[Qemu-devel] [PATCH for-2.5] virtio-9p-device: add minimal unrealize handler

2015-12-08 Thread Greg Kurz
Since commit 4652f1640e029e1f2433fa77ba6af285 "virtio-9p: add savevm handlers",
if the user hot-unplugs a quiescent 9p device and live migrates, the source
QEMU crashes before migration completetion... This happens because virtio-9p
devices have a realize handler which calls virtio_init() and register_savevm().
Both calls store pointers to the device internals, that get dereferenced during
migration even if the device got unplugged.

This patch simply adds an unrealize handler to perform minimal cleanup and
avoid the crash. Hot unplug of non-quiescent 9p devices is still not supported
in QEMU, and not supported by linux guests either.

Signed-off-by: Greg Kurz 
---
 hw/9pfs/virtio-9p-device.c |   12 
 1 file changed, 12 insertions(+)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 944b5f5e9fcc..b42d3b30a027 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -145,6 +145,17 @@ out:
 v9fs_path_free();
 }
 
+static void virtio_9p_device_unrealize(DeviceState *dev, Error **errp)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+V9fsState *s = VIRTIO_9P(dev);
+
+virtio_cleanup(vdev);
+unregister_savevm(dev, "virtio-9p", s);
+g_free(s->ctx.fs_root);
+g_free(s->tag);
+}
+
 /* virtio-9p device */
 
 static Property virtio_9p_properties[] = {
@@ -161,6 +172,7 @@ static void virtio_9p_class_init(ObjectClass *klass, void 
*data)
 dc->props = virtio_9p_properties;
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 vdc->realize = virtio_9p_device_realize;
+vdc->unrealize = virtio_9p_device_unrealize;
 vdc->get_features = virtio_9p_get_features;
 vdc->get_config = virtio_9p_get_config;
 }




[Qemu-devel] [PATCH v2 1/3] lib/x86: Make free_page() available to call

2015-12-08 Thread Andrey Smetanin
This will be used to release allocated pages by Hyper-V
SynIC timers test.

Signed-off-by: Andrey Smetanin 
Reviewed-by: Roman Kagan 
CC: Paolo Bonzini 
CC: Marcelo Tosatti 
CC: Roman Kagan 
CC: Denis V. Lunev 
CC: qemu-devel@nongnu.org
---
 lib/x86/vm.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/x86/vm.h b/lib/x86/vm.h
index bd73840..28794d7 100644
--- a/lib/x86/vm.h
+++ b/lib/x86/vm.h
@@ -33,6 +33,7 @@ unsigned long *install_pte(unsigned long *cr3,
unsigned long *pt_page);
 
 void *alloc_page();
+void free_page(void *page);
 
 unsigned long *install_large_page(unsigned long *cr3,unsigned long phys,
   void *virt);
-- 
2.4.3




Re: [Qemu-devel] [v3 1/3] cutils: add avx2 instruction optimization

2015-12-08 Thread Richard Henderson
On 12/08/2015 04:08 AM, Liang Li wrote:
> +++ b/util/buffer-zero-avx2.c
> @@ -0,0 +1,54 @@
> +#include "qemu-common.h"
> +
> +#if defined CONFIG_IFUNC && defined CONFIG_AVX2
> +#include 
> +#define AVX2_VECTYPE__m256i
> +#define AVX2_SPLAT(p)   _mm256_set1_epi8(*(p))
> +#define AVX2_ALL_EQ(v1, v2) \
> +(_mm256_movemask_epi8(_mm256_cmpeq_epi8(v1, v2)) == 0x)
> +#define AVX2_VEC_OR(v1, v2) (_mm256_or_si256(v1, v2))
> +
> +inline bool
> +can_use_buffer_find_nonzero_offset_avx2(const void *buf, size_t len)
> +{
> +return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
> +   * sizeof(AVX2_VECTYPE)) == 0
> +&& ((uintptr_t) buf) % sizeof(AVX2_VECTYPE) == 0);
> +}

I'm not keen on adding a new file for this.  You ought to be able to use
__attribute__((target("-mavx2"))) on any compiler that supports the
command-line option.  Which means you can do this all in one file with static
functions.

Nor am I keen on marking a function inline when we know it must be out-of-line
because of the ifunc usage.


r~



Re: [Qemu-devel] [PATCH for-2.5] virtio-9p-device: add minimal unrealize handler

2015-12-08 Thread Michael S. Tsirkin
On Tue, Dec 08, 2015 at 04:54:57PM +0100, Greg Kurz wrote:
> Since commit 4652f1640e029e1f2433fa77ba6af285 "virtio-9p: add savevm 
> handlers",
> if the user hot-unplugs a quiescent 9p device and live migrates, the source
> QEMU crashes before migration completetion... This happens because virtio-9p
> devices have a realize handler which calls virtio_init() and 
> register_savevm().
> Both calls store pointers to the device internals, that get dereferenced 
> during
> migration even if the device got unplugged.
> 
> This patch simply adds an unrealize handler to perform minimal cleanup and
> avoid the crash. Hot unplug of non-quiescent 9p devices is still not supported
> in QEMU, and not supported by linux guests either.
> 
> Signed-off-by: Greg Kurz 

Reviewed-by: Michael S. Tsirkin 

> ---
>  hw/9pfs/virtio-9p-device.c |   12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
> index 944b5f5e9fcc..b42d3b30a027 100644
> --- a/hw/9pfs/virtio-9p-device.c
> +++ b/hw/9pfs/virtio-9p-device.c
> @@ -145,6 +145,17 @@ out:
>  v9fs_path_free();
>  }
>  
> +static void virtio_9p_device_unrealize(DeviceState *dev, Error **errp)
> +{
> +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +V9fsState *s = VIRTIO_9P(dev);
> +
> +virtio_cleanup(vdev);
> +unregister_savevm(dev, "virtio-9p", s);
> +g_free(s->ctx.fs_root);
> +g_free(s->tag);
> +}
> +
>  /* virtio-9p device */
>  
>  static Property virtio_9p_properties[] = {
> @@ -161,6 +172,7 @@ static void virtio_9p_class_init(ObjectClass *klass, void 
> *data)
>  dc->props = virtio_9p_properties;
>  set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
>  vdc->realize = virtio_9p_device_realize;
> +vdc->unrealize = virtio_9p_device_unrealize;
>  vdc->get_features = virtio_9p_get_features;
>  vdc->get_config = virtio_9p_get_config;
>  }



Re: [Qemu-devel] [PATCH v3 1/5] util: add base64 decoding function

2015-12-08 Thread Eric Blake
On 11/27/2015 09:30 AM, Daniel P. Berrange wrote:
> The standard glib provided g_base64_decode doesn't provide any
> kind of sensible error checking on its input. Add a QEMU custom
> wrapper qbase64_decode which can be used with untrustworthy
> input that can contain invalid base64 characters, embedded
> NUL characters, or not be NUL terminated at all.
> 
> Signed-off-by: Daniel P. Berrange 
> ---

> +
> +/**
> + * qbase64_decode:
> + * @input: the (possibly) base64 encoded text
> + * @in_len: length of @input or -1 if NUL terminated
> + * @out_len: filled with length of decoded data
> + * @errp: pointer to uninitialized error object

That almost implies that I could do:

Error *err;
qbase64_decode(,);

In reality, it should be the NULL-initialized error object, as in:

Error *err = NULL;

but I don't know if there is a better way to represent it in text.  At
any rate, that phrase exists elsewhere, so it would be easier to do a
tree-wide cleanup if we have a better terminology to use, and I won't
hold up review on this patch for it.

> + *
> + * Attempt to decode the (possibly) base64 encoded
> + * text provided in @input. If the @input text may
> + * contain embedded NUL characters, or may not be
> + * NUL terminated, then @in_len must be set to the
> + * known size of the @input buffer.
> + *
> + * Note that embedded NULs, or lack of a NUL terminator
> + * are considered invalid base64 data and errors
> + * will be reported to this effect.
> + *
> + * If decoding is successful, the decoded data will
> + * be returned and @out_len set to indicate the
> + * number of bytes in the decoded data.
> + *

Maybe mention that caller must g_free() the successful result?

> + * Returns: the decoded data or NULL
> + */
> +uint8_t *qbase64_decode(const char *input,
> +size_t in_len,
> +size_t *out_len,
> +Error **errp);
> +
> +

> +++ b/tests/test-base64.c
> @@ -0,0 +1,98 @@

> +static void test_base64_bad(const char *input,
> +size_t input_len)
> +{
> +size_t len;
> +Error *err = NULL;
> +uint8_t *actual = qbase64_decode(input,
> + input_len,
> + ,
> + );
> +
> +g_assert(err != NULL);

Could use _abort in the original call instead of a second check
for err != NULL; but that's cosmetic.

> +g_assert(actual == NULL);
> +g_assert_cmpint(len, ==, 0);

So you are testing that we initialize output length even on error,
rather than leaving it uninitialized.  That's fair.


> +
> +static void test_base64_embedded_nul(void)
> +{
> +const char input[] = "There's no such\0thing as a free lunch.";
> +
> +test_base64_bad(input, G_N_ELEMENTS(input) - 1);
> +}
> +

This asserts that you have a failure, but doesn't say what that failure
would be...

> +
> +static void test_base64_not_nul_terminated(void)
> +{
> +char input[] = "There's no such thing as a free lunch.";
> +input[G_N_ELEMENTS(input) - 1] = '!';
> +
> +test_base64_bad(input, G_N_ELEMENTS(input) - 1);
> +}
> +
> +
> +static void test_base64_invalid_chars(void)
> +{
> +const char *input = "There's no such thing as a free lunch.";
> +
> +test_base64_bad(input, strlen(input));
> +}

...and this same input already fails because it doesn't match base64,
regardless of whether NUL bytes are mishandled.  I wonder if
test_base64_embedded_nul() and test_base64_not_nul_terminated() should
be using variations of your known-good base64 string
("QmVjYXVzZSB3ZSBmb2N1c2VkIG9uIHRoZSBzbmFrZSwgd2UgbW\n"
"lzc2VkIHRoZSBzY29ycGlvbi4="), to prove that they are failing purely
because of the NUL handling and not because of invalid base64 content.
But that's only a weak complaint - because by peering into the black
box, I see that you are fully testing all code paths (that is, the black
box checks for NUL abuse prior to checking for valid base64 data), so
I'm not going to insist on a respin.

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH v2 0/3] KVM-UNIT-TESTS: Hyper-V SynIC timers test

2015-12-08 Thread Andrey Smetanin
The test checks Hyper-V SynIC timers functionality.
The test runs on every vCPU and performs start/stop
of periodic/one-shot timers (with period=1ms) and checks
validity of received expiration messages in appropriate
ISR's.

Changes v2:
* Share generic Hyper-V tests code
* Hyper-V SynIC timers test fixes to improve
readability and output

Signed-off-by: Andrey Smetanin 
Reviewed-by: Roman Kagan 
CC: Paolo Bonzini 
CC: Marcelo Tosatti 
CC: Roman Kagan 
CC: Denis V. Lunev 
CC: qemu-devel@nongnu.org

Andrey Smetanin (3):
  lib/x86: Make free_page() available to call
  x86/hyperv: Move Hyper-V generic code into hyperv.h/hyperv.c
  x86: Hyper-V SynIC timers test

 config/config-x86-common.mak |   8 +-
 lib/x86/msr.h|  23 ---
 lib/x86/vm.h |   1 +
 x86/hyperv.c |  24 +++
 x86/hyperv.h | 183 +
 x86/hyperv_stimer.c  | 376 +++
 x86/hyperv_synic.c   |  42 +
 x86/unittests.cfg|   5 +
 8 files changed, 603 insertions(+), 59 deletions(-)
 create mode 100644 x86/hyperv.c
 create mode 100644 x86/hyperv.h
 create mode 100644 x86/hyperv_stimer.c

-- 
2.4.3




Re: [Qemu-devel] [Qemu-arm] [PATCH v2 18/19] [RFC] hw/arm/virt: add secure memory region and UART

2015-12-08 Thread Peter Maydell
On 16 November 2015 at 14:05, Peter Maydell  wrote:
> Add a secure memory region to the virt board, which is the
> same as the nonsecure memory region except that it also has
> a secure-only UART in it. This is only created if the
> board is started with the '-machine secure=on' property.
>
> This is an RFC patch, beacuse the device tree bindings for
> indicating secure vs nonsecure devices are still under
> discussion upstream:
>   https://lkml.org/lkml/2015/10/29/287

The bindings have been accepted upstream, so we can take the
'RFC' tag off this patch now.

thanks
-- PMM



[Qemu-devel] [PATCH v2 2/3] x86/hyperv: Move Hyper-V generic code into hyperv.h/hyperv.c

2015-12-08 Thread Andrey Smetanin
This code will be used as shared between hyperv_synic
and hyperv_stimer tests.

Signed-off-by: Andrey Smetanin 
CC: Paolo Bonzini 
CC: Marcelo Tosatti 
CC: Roman Kagan 
CC: Denis V. Lunev 
CC: qemu-devel@nongnu.org
---
 config/config-x86-common.mak |  3 ++-
 lib/x86/msr.h| 23 --
 x86/hyperv.c | 24 ++
 x86/hyperv.h | 58 
 x86/hyperv_synic.c   | 42 ++--
 5 files changed, 92 insertions(+), 58 deletions(-)
 create mode 100644 x86/hyperv.c
 create mode 100644 x86/hyperv.h

diff --git a/config/config-x86-common.mak b/config/config-x86-common.mak
index f64874d..156be1c 100644
--- a/config/config-x86-common.mak
+++ b/config/config-x86-common.mak
@@ -113,7 +113,8 @@ $(TEST_DIR)/debug.elf: $(cstart.o) $(TEST_DIR)/debug.o
 
 $(TEST_DIR)/memory.elf: $(cstart.o) $(TEST_DIR)/memory.o
 
-$(TEST_DIR)/hyperv_synic.elf: $(cstart.o) $(TEST_DIR)/hyperv_synic.o
+$(TEST_DIR)/hyperv_synic.elf: $(cstart.o) $(TEST_DIR)/hyperv.o \
+  $(TEST_DIR)/hyperv_synic.o
 
 arch_clean:
$(RM) $(TEST_DIR)/*.o $(TEST_DIR)/*.flat $(TEST_DIR)/*.elf \
diff --git a/lib/x86/msr.h b/lib/x86/msr.h
index 54da420..281255a 100644
--- a/lib/x86/msr.h
+++ b/lib/x86/msr.h
@@ -408,27 +408,4 @@
 #define MSR_VM_IGNNE0xc0010115
 #define MSR_VM_HSAVE_PA 0xc0010117
 
-/* Define synthetic interrupt controller model specific registers. */
-#define HV_X64_MSR_SCONTROL 0x4080
-#define HV_X64_MSR_SVERSION 0x4081
-#define HV_X64_MSR_SIEFP0x4082
-#define HV_X64_MSR_SIMP 0x4083
-#define HV_X64_MSR_EOM  0x4084
-#define HV_X64_MSR_SINT00x4090
-#define HV_X64_MSR_SINT10x4091
-#define HV_X64_MSR_SINT20x4092
-#define HV_X64_MSR_SINT30x4093
-#define HV_X64_MSR_SINT40x4094
-#define HV_X64_MSR_SINT50x4095
-#define HV_X64_MSR_SINT60x4096
-#define HV_X64_MSR_SINT70x4097
-#define HV_X64_MSR_SINT80x4098
-#define HV_X64_MSR_SINT90x4099
-#define HV_X64_MSR_SINT10   0x409A
-#define HV_X64_MSR_SINT11   0x409B
-#define HV_X64_MSR_SINT12   0x409C
-#define HV_X64_MSR_SINT13   0x409D
-#define HV_X64_MSR_SINT14   0x409E
-#define HV_X64_MSR_SINT15   0x409F
-
 #endif /* _ASM_X86_MSR_INDEX_H */
diff --git a/x86/hyperv.c b/x86/hyperv.c
new file mode 100644
index 000..824773d
--- /dev/null
+++ b/x86/hyperv.c
@@ -0,0 +1,24 @@
+#include "hyperv.h"
+
+static void synic_ctl(u8 ctl, u8 vcpu_id, u8 sint)
+{
+outl((ctl << 16)|((vcpu_id) << 8)|sint, 0x3000);
+}
+
+void synic_sint_create(int vcpu, int sint, int vec, bool auto_eoi)
+{
+wrmsr(HV_X64_MSR_SINT0 + sint,
+  (u64)vec | ((auto_eoi) ? HV_SYNIC_SINT_AUTO_EOI : 0));
+synic_ctl(HV_TEST_DEV_SINT_ROUTE_CREATE, vcpu, sint);
+}
+
+void synic_sint_set(int vcpu, int sint)
+{
+synic_ctl(HV_TEST_DEV_SINT_ROUTE_SET_SINT, vcpu, sint);
+}
+
+void synic_sint_destroy(int vcpu, int sint)
+{
+wrmsr(HV_X64_MSR_SINT0 + sint, 0xFF|HV_SYNIC_SINT_MASKED);
+synic_ctl(HV_TEST_DEV_SINT_ROUTE_DESTROY, vcpu, sint);
+}
diff --git a/x86/hyperv.h b/x86/hyperv.h
new file mode 100644
index 000..0dd1d0d
--- /dev/null
+++ b/x86/hyperv.h
@@ -0,0 +1,58 @@
+#ifndef __HYPERV_H
+#define __HYPERV_H
+
+#include "libcflat.h"
+#include "processor.h"
+#include "io.h"
+
+#define HYPERV_CPUID_FEATURES   0x4003
+
+#define HV_X64_MSR_SYNIC_AVAILABLE  (1 << 2)
+
+/* Define synthetic interrupt controller model specific registers. */
+#define HV_X64_MSR_SCONTROL 0x4080
+#define HV_X64_MSR_SVERSION 0x4081
+#define HV_X64_MSR_SIEFP0x4082
+#define HV_X64_MSR_SIMP 0x4083
+#define HV_X64_MSR_EOM  0x4084
+#define HV_X64_MSR_SINT00x4090
+#define HV_X64_MSR_SINT10x4091
+#define HV_X64_MSR_SINT20x4092
+#define HV_X64_MSR_SINT30x4093
+#define HV_X64_MSR_SINT40x4094
+#define HV_X64_MSR_SINT50x4095
+#define HV_X64_MSR_SINT60x4096
+#define HV_X64_MSR_SINT70x4097
+#define HV_X64_MSR_SINT8

[Qemu-devel] [PATCH v2 3/3] x86: Hyper-V SynIC timers test

2015-12-08 Thread Andrey Smetanin
The test checks Hyper-V SynIC timers functionality.
The test runs on every vCPU and performs start/stop
of periodic/one-shot timers (with period=1ms) and checks
validity of received expiration messages in appropriate
ISR's.

Changes v2:
* reorg code to use generic hyperv.h
* split timer test into test cases with separate callbacks
* removed unnecessary irq_enable() calls
* moved sint's create/destoy into test prepare/cleanup callbacks
* defined used sint's numbers and vectors

Signed-off-by: Andrey Smetanin 
Reviewed-by: Roman Kagan 
CC: Paolo Bonzini 
CC: Marcelo Tosatti 
CC: Roman Kagan 
CC: Denis V. Lunev 
CC: qemu-devel@nongnu.org
---
 config/config-x86-common.mak |   5 +-
 x86/hyperv.h | 125 ++
 x86/hyperv_stimer.c  | 376 +++
 x86/unittests.cfg|   5 +
 4 files changed, 510 insertions(+), 1 deletion(-)
 create mode 100644 x86/hyperv_stimer.c

diff --git a/config/config-x86-common.mak b/config/config-x86-common.mak
index 156be1c..72b95e3 100644
--- a/config/config-x86-common.mak
+++ b/config/config-x86-common.mak
@@ -37,7 +37,7 @@ tests-common = $(TEST_DIR)/vmexit.flat $(TEST_DIR)/tsc.flat \
$(TEST_DIR)/s3.flat $(TEST_DIR)/pmu.flat \
$(TEST_DIR)/tsc_adjust.flat $(TEST_DIR)/asyncpf.flat \
$(TEST_DIR)/init.flat $(TEST_DIR)/smap.flat \
-   $(TEST_DIR)/hyperv_synic.flat
+   $(TEST_DIR)/hyperv_synic.flat $(TEST_DIR)/hyperv_stimer.flat \
 
 ifdef API
 tests-common += api/api-sample
@@ -116,6 +116,9 @@ $(TEST_DIR)/memory.elf: $(cstart.o) $(TEST_DIR)/memory.o
 $(TEST_DIR)/hyperv_synic.elf: $(cstart.o) $(TEST_DIR)/hyperv.o \
   $(TEST_DIR)/hyperv_synic.o
 
+$(TEST_DIR)/hyperv_stimer.elf: $(cstart.o) $(TEST_DIR)/hyperv.o \
+   $(TEST_DIR)/hyperv_stimer.o
+
 arch_clean:
$(RM) $(TEST_DIR)/*.o $(TEST_DIR)/*.flat $(TEST_DIR)/*.elf \
$(TEST_DIR)/.*.d lib/x86/.*.d
diff --git a/x86/hyperv.h b/x86/hyperv.h
index 0dd1d0d..faf931b 100644
--- a/x86/hyperv.h
+++ b/x86/hyperv.h
@@ -7,7 +7,11 @@
 
 #define HYPERV_CPUID_FEATURES   0x4003
 
+#define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE (1 << 1)
 #define HV_X64_MSR_SYNIC_AVAILABLE  (1 << 2)
+#define HV_X64_MSR_SYNTIMER_AVAILABLE   (1 << 3)
+
+#define HV_X64_MSR_TIME_REF_COUNT   0x4020
 
 /* Define synthetic interrupt controller model specific registers. */
 #define HV_X64_MSR_SCONTROL 0x4080
@@ -32,6 +36,19 @@
 #define HV_X64_MSR_SINT14   0x409E
 #define HV_X64_MSR_SINT15   0x409F
 
+/*
+ * Synthetic Timer MSRs. Four timers per vcpu.
+ */
+
+#define HV_X64_MSR_STIMER0_CONFIG   0x40B0
+#define HV_X64_MSR_STIMER0_COUNT0x40B1
+#define HV_X64_MSR_STIMER1_CONFIG   0x40B2
+#define HV_X64_MSR_STIMER1_COUNT0x40B3
+#define HV_X64_MSR_STIMER2_CONFIG   0x40B4
+#define HV_X64_MSR_STIMER2_COUNT0x40B5
+#define HV_X64_MSR_STIMER3_CONFIG   0x40B6
+#define HV_X64_MSR_STIMER3_COUNT0x40B7
+
 #define HV_SYNIC_CONTROL_ENABLE (1ULL << 0)
 #define HV_SYNIC_SIMP_ENABLE(1ULL << 0)
 #define HV_SYNIC_SIEFP_ENABLE   (1ULL << 0)
@@ -40,6 +57,104 @@
 #define HV_SYNIC_SINT_VECTOR_MASK   (0xFF)
 #define HV_SYNIC_SINT_COUNT 16
 
+#define HV_STIMER_ENABLE(1ULL << 0)
+#define HV_STIMER_PERIODIC  (1ULL << 1)
+#define HV_STIMER_LAZY  (1ULL << 2)
+#define HV_STIMER_AUTOENABLE(1ULL << 3)
+#define HV_STIMER_SINT(config)  (__u8)(((config) >> 16) & 0x0F)
+
+#define HV_SYNIC_STIMER_COUNT   (4)
+
+/* Define synthetic interrupt controller message constants. */
+#define HV_MESSAGE_SIZE (256)
+#define HV_MESSAGE_PAYLOAD_BYTE_COUNT   (240)
+#define HV_MESSAGE_PAYLOAD_QWORD_COUNT  (30)
+
+/* Define hypervisor message types. */
+enum hv_message_type {
+HVMSG_NONE  = 0x,
+
+/* Memory access messages. */
+HVMSG_UNMAPPED_GPA  = 0x8000,
+HVMSG_GPA_INTERCEPT = 0x8001,
+
+/* Timer notification messages. */
+HVMSG_TIMER_EXPIRED = 0x8010,
+
+/* Error messages. */
+HVMSG_INVALID_VP_REGISTER_VALUE = 0x8020,
+HVMSG_UNRECOVERABLE_EXCEPTION   = 0x8021,
+HVMSG_UNSUPPORTED_FEATURE   = 0x8022,
+
+/* Trace buffer complete messages. */
+HVMSG_EVENTLOG_BUFFERCOMPLETE   = 0x8040,
+
+/* Platform-specific processor intercept messages. */
+

Re: [Qemu-devel] [PATCH 06/16] acpi: Save PCMachineState on AcpiBuildState

2015-12-08 Thread Eduardo Habkost
On Mon, Dec 07, 2015 at 05:39:29PM +0200, Marcel Apfelbaum wrote:
> On 12/02/2015 03:47 AM, Eduardo Habkost wrote:
> >PCMachineState will be used in some of the steps of ACPI table
> >building.
> >
> >Signed-off-by: Eduardo Habkost 
> >---
> >  hw/i386/acpi-build.c | 8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> >diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> >index 85a5c53..ca11c88 100644
> >--- a/hw/i386/acpi-build.c
> >+++ b/hw/i386/acpi-build.c
> >@@ -1644,7 +1644,7 @@ struct AcpiBuildState {
> >  MemoryRegion *table_mr;
> >  /* Is table patched? */
> >  uint8_t patched;
> >-PcGuestInfo *guest_info;
> >+PCMachineState *pcms;
> >  void *rsdp;
> >  MemoryRegion *rsdp_mr;
> >  MemoryRegion *linker_mr;
> >@@ -1855,7 +1855,7 @@ static void acpi_build_update(void *build_opaque, 
> >uint32_t offset)
> >
> >  acpi_build_tables_init();
> >
> >-acpi_build(build_state->guest_info, );
> >+acpi_build(_state->pcms->acpi_guest_info, );
> >
> >  acpi_ram_update(build_state->table_mr, tables.table_data);
> >
> >@@ -1916,12 +1916,12 @@ void acpi_setup(PCMachineState *pcms)
> >
> >  build_state = g_malloc0(sizeof *build_state);
> >
> >-build_state->guest_info = guest_info;
> >+build_state->pcms = pcms;
> 
> I am not "sold" on keeping a reference to machine in the build_state.
> We can always query current machine using qdev_machine() or something.
> 
> Keeping the "guest info" made sense since is used especially for ACPI,
> however the machine has a wider scope. (And not having to keep it
> around is a very good thing!)

I wouldn't mind using qdev_get_machine() if preferred by the
maintainer of that code, but I like to avoid it when possible. To
me, qdev_get_machine() is just a global variable disguised as a
harder-to-understand API.

-- 
Eduardo



[Qemu-devel] [PATCH v10 4/6] target-arm: kvm - add support for HW assisted debug

2015-12-08 Thread Alex Bennée
This adds basic support for HW assisted debug. The ioctl interface to
KVM allows us to pass an implementation defined number of break and
watch point registers. When KVM_GUESTDBG_USE_HW is specified these
debug registers will be installed in place on the world switch into the
guest.

The hardware is actually capable of more advanced matching but it is
unclear if this expressiveness is available via the gdbstub protocol.

Signed-off-by: Alex Bennée 

---
v2
  - correct setting of PMC/BAS/MASK
  - improved commentary
  - added helper function to check watchpoint in range
  - fix find/deletion of watchpoints
v3
  - use internals.h definitions
v6
  - KVM_GUESTDBG_USE_HW_BP->KVM_GUESTDBG_USE_HW
  - renamed some helper functions to avoid confusion
v9
  - fix merge conflicts on re-base
  - rm asm/ptrace.h include
  - add additional commentry for hw breakpoints
  - explain gdb's model for HW bkpts
  - fix up spacing, formatting as per checkpatch
  - better PAC values
  - use is_power_of_2
  - use _arm_ fn naming and add docs
  - add a CPUWatchpoint structure for reporting
  - replace manual array manipulation with g_array abstraction
v10
  - fix compilation for arm32/split imps between kvm32/64
  - make find_hw_watchpoint/breakpoint local static functions
  - cleaned up comment grammar
  - fixed up missing spaces
  - removed pointless ?: booleanisation
  - removed pointless kvm_arm_hw_debug_active wrappers
  - s/is/if/
---
 target-arm/kvm.c |  26 +---
 target-arm/kvm32.c   |  29 +
 target-arm/kvm64.c   | 361 ++-
 target-arm/kvm_arm.h |  21 +++
 4 files changed, 415 insertions(+), 22 deletions(-)

diff --git a/target-arm/kvm.c b/target-arm/kvm.c
index 7f44e22..84974bb 100644
--- a/target-arm/kvm.c
+++ b/target-arm/kvm.c
@@ -566,26 +566,12 @@ void kvm_arch_update_guest_debug(CPUState *cs, struct 
kvm_guest_debug *dbg)
 dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP;
 }
 #endif
-}
-
-int kvm_arch_insert_hw_breakpoint(target_ulong addr,
-  target_ulong len, int type)
-{
-qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__);
-return -EINVAL;
-}
-
-int kvm_arch_remove_hw_breakpoint(target_ulong addr,
-  target_ulong len, int type)
-{
-qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__);
-return -EINVAL;
-}
-
-
-void kvm_arch_remove_all_hw_breakpoints(void)
-{
-qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__);
+#ifdef KVM_GUESTDBG_USE_HW
+if (kvm_arm_hw_debug_active(cs)) {
+dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW;
+kvm_arm_copy_hw_debug_data(>arch);
+}
+#endif
 }
 
 void kvm_arch_init_irq_routing(KVMState *s)
diff --git a/target-arm/kvm32.c b/target-arm/kvm32.c
index 5ce969f..ff83ce6 100644
--- a/target-arm/kvm32.c
+++ b/target-arm/kvm32.c
@@ -493,3 +493,32 @@ bool kvm_arm_handle_debug(CPUState *cs, struct 
kvm_debug_exit_arch *debug_exit)
 qemu_log_mask(LOG_UNIMP, "%s: guest debug not yet implemented\n", 
__func__);
 return false;
 }
+
+int kvm_arch_insert_hw_breakpoint(target_ulong addr,
+  target_ulong len, int type)
+{
+qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__);
+return -EINVAL;
+}
+
+int kvm_arch_remove_hw_breakpoint(target_ulong addr,
+  target_ulong len, int type)
+{
+qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__);
+return -EINVAL;
+}
+
+void kvm_arch_remove_all_hw_breakpoints(void)
+{
+qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__);
+}
+
+void kvm_arm_copy_hw_debug_data(struct kvm_guest_debug_arch *ptr)
+{
+qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__);
+}
+
+bool kvm_arm_hw_debug_active(CPUState *cs)
+{
+return false;
+}
diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c
index 5f96cde..771ecdb 100644
--- a/target-arm/kvm64.c
+++ b/target-arm/kvm64.c
@@ -2,6 +2,7 @@
  * ARM implementation of KVM hooks, 64 bit specific code
  *
  * Copyright Mian-M. Hamayun 2013, Virtual Open Systems
+ * Copyright Alex Bennée 2014, Linaro
  *
  * This work is licensed under the terms of the GNU GPL, version 2 or later.
  * See the COPYING file in the top-level directory.
@@ -12,13 +13,17 @@
 #include 
 #include 
 #include 
+#include 
 
+#include 
 #include 
 
 #include "config-host.h"
 #include "qemu-common.h"
 #include "qemu/timer.h"
 #include "qemu/error-report.h"
+#include "qemu/host-utils.h"
+#include "exec/gdbstub.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/kvm.h"
 #include "kvm_arm.h"
@@ -28,20 +33,358 @@
 
 static bool have_guest_debug;
 
+/*
+ * Although the ARM implementation of hardware assisted debugging
+ * allows for different breakpoints per-core, the current GDB
+ * interface treats them as a global pool of registers (which seems to
+ * be the case for x86, ppc and s390). As a result we store one 

[Qemu-devel] [PATCH v10 6/6] tests/guest-debug: introduce basic gdbstub tests

2015-12-08 Thread Alex Bennée
The aim of these tests is to combine with an appropriate kernel
image (with symbol-file vmlinux) and check it behaves as it should.
Given a kernel it checks:

  - single step
  - software breakpoint
  - hardware breakpoint
  - access, read and write watchpoints

On success it returns 0 to the calling process.

I've not plumbed this into the "make check" logic though as we need a
solution for providing non-host binaries to the tests. However the test
is structured to work with pretty much any Linux kernel image as it
uses the basic kernel_init code which is common across architectures.

Signed-off-by: Alex Bennée 

---
v10:
 - fixup for Py2/3 cleanliness
 - drop to shell on exception
---
 tests/guest-debug/test-gdbstub.py | 176 ++
 1 file changed, 176 insertions(+)
 create mode 100644 tests/guest-debug/test-gdbstub.py

diff --git a/tests/guest-debug/test-gdbstub.py 
b/tests/guest-debug/test-gdbstub.py
new file mode 100644
index 000..31ba6c9
--- /dev/null
+++ b/tests/guest-debug/test-gdbstub.py
@@ -0,0 +1,176 @@
+#
+# This script needs to be run on startup
+# qemu -kernel ${KERNEL} -s -S
+# and then:
+# gdb ${KERNEL}.vmlinux -x ${QEMU_SRC}/tests/guest-debug/test-gdbstub.py
+
+import gdb
+
+failcount = 0
+
+
+def report(cond, msg):
+"Report success/fail of test"
+if cond:
+print ("PASS: %s" % (msg))
+else:
+print ("FAIL: %s" % (msg))
+failcount += 1
+
+
+def check_step():
+"Step an instruction, check it moved."
+start_pc = gdb.parse_and_eval('$pc')
+gdb.execute("si")
+end_pc = gdb.parse_and_eval('$pc')
+
+return not (start_pc == end_pc)
+
+
+def check_break(sym_name):
+"Setup breakpoint, continue and check we stopped."
+sym, ok = gdb.lookup_symbol(sym_name)
+bp = gdb.Breakpoint(sym_name)
+
+gdb.execute("c")
+
+# hopefully we came back
+end_pc = gdb.parse_and_eval('$pc')
+print ("%s == %s %d" % (end_pc, sym.value(), bp.hit_count))
+bp.delete()
+
+# can we test we hit bp?
+return end_pc == sym.value()
+
+
+# We need to do hbreak manually as the python interface doesn't export it
+def check_hbreak(sym_name):
+"Setup hardware breakpoint, continue and check we stopped."
+sym, ok = gdb.lookup_symbol(sym_name)
+gdb.execute("hbreak %s" % (sym_name))
+gdb.execute("c")
+
+# hopefully we came back
+end_pc = gdb.parse_and_eval('$pc')
+print ("%s == %s" % (end_pc, sym.value()))
+
+if end_pc == sym.value():
+gdb.execute("d 1")
+return True
+else:
+return False
+
+
+class WatchPoint(gdb.Breakpoint):
+
+def get_wpstr(self, sym_name):
+"Setup sym and wp_str for given symbol."
+self.sym, ok = gdb.lookup_symbol(sym_name)
+wp_addr = gdb.parse_and_eval(sym_name).address
+self.wp_str = '*(%(type)s)(&%(address)s)' % dict(
+type = wp_addr.type, address = sym_name)
+
+return(self.wp_str)
+
+def __init__(self, sym_name, type):
+wp_str = self.get_wpstr(sym_name)
+super(WatchPoint, self).__init__(wp_str, gdb.BP_WATCHPOINT, type)
+
+def stop(self):
+end_pc = gdb.parse_and_eval('$pc')
+print ("HIT WP @ %s" % (end_pc))
+return True
+
+
+def do_one_watch(sym, wtype, text):
+
+wp = WatchPoint(sym, wtype)
+gdb.execute("c")
+report_str = "%s for %s (%s)" % (text, sym, wp.sym.value())
+
+if wp.hit_count > 0:
+report(True, report_str)
+wp.delete()
+else:
+report(False, report_str)
+
+
+def check_watches(sym_name):
+"Watch a symbol for any access."
+
+# Should hit for any read
+do_one_watch(sym_name, gdb.WP_ACCESS, "awatch")
+
+# Again should hit for reads
+do_one_watch(sym_name, gdb.WP_READ, "rwatch")
+
+# Finally when it is written
+do_one_watch(sym_name, gdb.WP_WRITE, "watch")
+
+
+class CatchBreakpoint(gdb.Breakpoint):
+def __init__(self, sym_name):
+super(CatchBreakpoint, self).__init__(sym_name)
+self.sym, ok = gdb.lookup_symbol(sym_name)
+
+def stop(self):
+end_pc = gdb.parse_and_eval('$pc')
+print ("CB: %s == %s" % (end_pc, self.sym.value()))
+if end_pc == self.sym.value():
+report(False, "Hit final catchpoint")
+
+
+def run_test():
+"Run throught the tests one by one"
+
+print ("Checking we can step the first few instructions")
+step_ok = 0
+for i in range(3):
+if check_step():
+step_ok += 1
+
+report(step_ok == 3, "single step in boot code")
+
+print ("Checking HW breakpoint works")
+break_ok = check_hbreak("kernel_init")
+report(break_ok, "hbreak @ kernel_init")
+
+# Can't set this up until we are in the kernel proper
+# if we make it to run_init_process we've over-run and
+# one of the tests failed
+print ("Setup catch-all for run_init_process")
+cbp = CatchBreakpoint("run_init_process")
+cpb2 = 

[Qemu-devel] [PATCH v10 0/6] QEMU support for KVM Guest Debug on arm64

2015-12-08 Thread Alex Bennée
Hi,

Here is the latest patch set to support debugging of KVM guests on
arm64. The main changes are fixing arm32 compiles (mostly with stubs
for the upcomming arm32 debug) and the usual bunch of minor tweaks and
clarifications following review.

I've kept the GDB Python based test in tests/guest-debug and cleaned
it up so it will work with python2/3 linked GDBs. It still isn't
plumbed it in to the "make check" so can be dropped until we have a
solution for testing against non-host binaries.

So in summary the changes are:

  - Fixed arm32 compile
  - Use results of debug capability checks
  - Whitespace and comment cleanups
  - Py2/3 cleanliness for test script

More detailed changelogs are attached to each patch.

GIT Repo:

The patch series is based off a recent master and can be found at:

https://github.com/stsquad/qemu
branch: kvm/guest-debug-v10

Alex Bennée (6):
  target-arm: kvm64 - introduce kvm_arm_init_debug()
  target-arm: kvm - implement software breakpoints
  target-arm: kvm - support for single step
  target-arm: kvm - add support for HW assisted debug
  target-arm: kvm - re-inject guest debug exceptions
  tests/guest-debug: introduce basic gdbstub tests

 target-arm/helper-a64.c   |  12 +-
 target-arm/kvm.c  |  65 +++---
 target-arm/kvm32.c|  47 
 target-arm/kvm64.c| 464 ++
 target-arm/kvm_arm.h  |  30 +++
 tests/guest-debug/test-gdbstub.py | 176 +++
 6 files changed, 757 insertions(+), 37 deletions(-)
 create mode 100644 tests/guest-debug/test-gdbstub.py

-- 
2.6.3




[Qemu-devel] [PATCH v10 3/6] target-arm: kvm - support for single step

2015-12-08 Thread Alex Bennée
This adds support for single-step. There isn't much to do on the QEMU
side as after we set-up the request for single step via the debug ioctl
it is all handled within the kernel.

The actual setting of the KVM_GUESTDBG_SINGLESTEP flag is already in the
common code. If the kernel doesn't support guest debug the ioctl will
simply error.

Signed-off-by: Alex Bennée 

---
v2
  - convert to using HSR_EC
v3
  - use internals.h definitions
v10
  - fix arm32 build
  - remove redundent flag setting (done in main kvm.c)
  - more words on fail case
---
 target-arm/kvm64.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c
index 3b3929d..5f96cde 100644
--- a/target-arm/kvm64.c
+++ b/target-arm/kvm64.c
@@ -534,6 +534,13 @@ bool kvm_arm_handle_debug(CPUState *cs, struct 
kvm_debug_exit_arch *debug_exit)
 kvm_cpu_synchronize_state(cs);
 
 switch (hsr_ec) {
+case EC_SOFTWARESTEP:
+if (cs->singlestep_enabled) {
+return true;
+} else {
+error_report("Came out of SINGLE STEP when not enabled");
+}
+break;
 case EC_AA64_BKPT:
 if (kvm_find_sw_breakpoint(cs, env->pc)) {
 return true;
-- 
2.6.3




Re: [Qemu-devel] [PATCH 00/16] pc: Eliminate struct PcGuestInfo

2015-12-08 Thread Eduardo Habkost
On Mon, Dec 07, 2015 at 08:57:03PM +0200, Marcel Apfelbaum wrote:
> On 12/02/2015 03:46 AM, Eduardo Habkost wrote:
> >This moves all data from PcGuestInfo to either PCMachineState or
> >PCMachineClass.
> >
> >This series depends on other two series:
> >* [PATCH v3 0/6] pc: Initialization and compat function cleanup
> >* [PATCH V3 0/3]  hw/pcie: Multi-root support for Q35
> >
> >For reference, there's a git tree containing this series plus all
> >the dependencies, at:
> >   git://github.com/ehabkost/qemu-hacks.git work/pcguestinfo-eliminate
> >
> >Eduardo Habkost (16):
> >   pc: Move PcGuestInfo declaration to top of file
> >   pc: Eliminate struct PcGuestInfoState
> >   pc: Remove guest_info parameter from pc_memory_init()
> >   acpi: Make acpi_setup() get PCMachineState as argument
> >   acpi: Remove unused build_facs() PcGuestInfo paramter
> >   acpi: Save PCMachineState on AcpiBuildState
> >   acpi: Make acpi_build() get PCMachineState as argument
> >   acpi: Make build_srat() get PCMachineState as argument
> >   acpi: Remove ram size fields fron PcGuestInfo
> >   pc: Move PcGuestInfo.fw_cfg field to PCMachineState
> >   pc: Simplify signature of xen_load_linux()
> >   pc: Remove PcGuestInfo.isapc_ram_fw field
> >   q35: Remove MCHPCIState.guest_info field
> >   acpi: Use PCMachineClass fields directly
> >   pc: Move PcGuestInfo.apic_xrupt_override field to PCMachineState
> >   pc: Move APIC and NUMA data from PcGuestInfo to PCMachineState
> 
> Hi,
> 
> I mainly agree with the removal of PcGuestInfo , I commented on some patches.
> 
> I do have a minor reservation, we kind of loose some information about the 
> fields.
> Until now it was pretty clear that the fields were related to guest because
> they were part of PcGuestInfo. Now this information is lost and the fields
> appear as yet other machine attributes.

But they really are just machine attributes, aren't they?

> 
> I suppose this can be addressed by:
> - a prefix for guest fields (e.g numa_nodes-> guest_numa_nodes),
> - a comment in the class /* guest fields */,
> - keeping the fields in PcGuestInfo struct but make the machine field short: 
> guest so we can call machine->guest.numa_nodes
> - or not be addressed at all :)

I don't see your point. Could you explain what you mean by
"related to the guest" and "guest fields"?

They are just machine attributes, and they happen to be used as
input when building ACPI tables (just like other machine
attributes are used as input for other guest-visible data, like
CPUID, SMBIOS, and other tables). What exactly make them "related
to guest"?

-- 
Eduardo



Re: [Qemu-devel] [PATCH 1/7] pc: wire up TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE for !xen

2015-12-08 Thread Marcel Apfelbaum

On 12/08/2015 04:07 PM, Gerd Hoffmann wrote:

rename pc_xen_hvm_init_pci to pc_i440fx_init_pci,
use it for both xen and non-xen init.

Signed-off-by: Gerd Hoffmann 
---
  hw/i386/pc_piix.c | 11 +--
  1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 2e41efe..ce6c3c5 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -419,10 +419,9 @@ static void pc_init_isa(MachineState *machine)
  pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, TYPE_I440FX_PCI_DEVICE);
  }

-#ifdef CONFIG_XEN
-static void pc_xen_hvm_init_pci(MachineState *machine)
+static void pc_i440fx_init_pci(MachineState *machine)
  {
-const char *pci_type = has_igd_gfx_passthru ?
+const char *pci_type = machine->igd_gfx_passthru ?
  TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE : 
TYPE_I440FX_PCI_DEVICE;

  pc_init1(machine,
@@ -430,6 +429,7 @@ static void pc_xen_hvm_init_pci(MachineState *machine)
   pci_type);
  }

+#ifdef CONFIG_XEN
  static void pc_xen_hvm_init(MachineState *machine)
  {
  PCIBus *bus;
@@ -439,7 +439,7 @@ static void pc_xen_hvm_init(MachineState *machine)
  exit(1);
  }

-pc_xen_hvm_init_pci(machine);
+pc_i440fx_init_pci(machine);

  bus = pci_find_primary_bus();
  if (bus != NULL) {
@@ -455,8 +455,7 @@ static void pc_xen_hvm_init(MachineState *machine)
  if (compat) { \
  compat(machine); \
  } \
-pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, \
- TYPE_I440FX_PCI_DEVICE); \
+pc_i440fx_init_pci(machine); \


Hi Gerd,

A quick question, does IGD_PASSTHROUGH makes sense for compat machine types?
On the same topic, does machine->igd_gfx_passthru makes sense for all machine 
types?

Thanks,
Marcel



  } \
  DEFINE_PC_MACHINE(suffix, name, pc_init_##suffix, optionfn)







Re: [Qemu-devel] [PATCH 04/16] acpi: Make acpi_setup() get PCMachineState as argument

2015-12-08 Thread Eduardo Habkost
On Mon, Dec 07, 2015 at 05:24:27PM +0200, Marcel Apfelbaum wrote:
> On 12/02/2015 03:46 AM, Eduardo Habkost wrote:
> >Lots of PcGuestInfo fields are duplicates of PCMachineClass or
> >PCMachineState fields. Pass PCMachineState as argument to
> >acpi_setup(), so we can simply let the ACPI code use those fields
> >directly.
> 
> I completely agree with removing duplicated fields and using PCMachine
> fields directly, but this not what this patch does.
> It only extracts PcGuestInfo info from the machine.

I should have appended "later" to the commit message. "So we are
able to simply let the ACPI code use those fields later (in
another commit)". The goal of this commit is to just change the
function signature to allow us to move the fields later. Maybe I
will squash some of those changes together in a new version of
the series.

-- 
Eduardo



[Qemu-devel] [Bug 1308341] Re: Multiple CPUs causes blue screen on Windows guest (14.04 regression)

2015-12-08 Thread Serge Hallyn
*** This bug is a duplicate of bug 1346917 ***
https://bugs.launchpad.net/bugs/1346917

Hi,

could you please file a new bug with debugging information as per
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1346917/comments/11
?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1308341

Title:
  Multiple CPUs causes blue screen on Windows guest (14.04 regression)

Status in QEMU:
  New
Status in linux package in Ubuntu:
  Confirmed
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  Configuring a Windows 7 guest using more than one CPU cases the guest to 
fail. This happens after a few hours after guest boot. This is the error on the 
blue screen:
  "A clock interrupt was not received on a secondary processor within the 
allocated time interval"

  After resetting, the guest will never boot and a new bluescreen with
  the error "STOP: 0x005c" appears. Shutting down the guest
  completely and restarting it will allow it to boot and run for a few
  hours again.

  The guest was created using virt-manager. The error happens with or
  without virtio devices and with both 32-bit and 64-bit Windows 7
  guests.

  I am using Ubuntu 14.04 release candidate.

  qemu-kvm version 2.0.0~rc1+dfsg-0ubuntu3

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1308341/+subscriptions



[Qemu-devel] [PATCH v10 5/6] target-arm: kvm - re-inject guest debug exceptions

2015-12-08 Thread Alex Bennée
If we can't find details for the debug exception in our debug state
then we can assume the exception is due to debugging inside the guest.
To inject the exception into the guest state we re-use the TCG exception
code (do_interrupt).

However while guest debugging is in effect we currently can't handle the
guest using single step as we will keep trapping to back to userspace.
GDB makes heavy use of single-step behind the scenes which effectively
means the guests ability to debug itself is disabled while it is being
debugged.

Signed-off-by: Alex Bennée 

---
v5:
  - new for v5
v10:
  - fix arm32 compile
  - add full stop at end of sentance
  - attempted to expand on limitations in commit msg
---
 target-arm/helper-a64.c | 12 ++--
 target-arm/kvm64.c  | 24 +---
 2 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
index deb8dbe..fc3ccdf 100644
--- a/target-arm/helper-a64.c
+++ b/target-arm/helper-a64.c
@@ -25,6 +25,7 @@
 #include "qemu/bitops.h"
 #include "internals.h"
 #include "qemu/crc32c.h"
+#include "sysemu/kvm.h"
 #include  /* For crc32 */
 
 /* C2.4.7 Multiply and divide */
@@ -469,7 +470,8 @@ void aarch64_cpu_do_interrupt(CPUState *cs)
   new_el);
 if (qemu_loglevel_mask(CPU_LOG_INT)
 && !excp_is_internal(cs->exception_index)) {
-qemu_log_mask(CPU_LOG_INT, "...with ESR 0x%" PRIx32 "\n",
+qemu_log_mask(CPU_LOG_INT, "...with ESR %x/0x%" PRIx32 "\n",
+  env->exception.syndrome >> ARM_EL_EC_SHIFT,
   env->exception.syndrome);
 }
 
@@ -535,6 +537,12 @@ void aarch64_cpu_do_interrupt(CPUState *cs)
 aarch64_restore_sp(env, new_el);
 
 env->pc = addr;
-cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+
+qemu_log_mask(CPU_LOG_INT, "...to EL%d PC 0x%" PRIx64 " PSTATE 0x%x\n",
+  new_el, env->pc, pstate_read(env));
+
+if (!kvm_enabled()) {
+cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+}
 }
 #endif
diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c
index 771ecdb..8e6d044 100644
--- a/target-arm/kvm64.c
+++ b/target-arm/kvm64.c
@@ -871,6 +871,7 @@ bool kvm_arm_handle_debug(CPUState *cs, struct 
kvm_debug_exit_arch *debug_exit)
 {
 int hsr_ec = debug_exit->hsr >> ARM_EL_EC_SHIFT;
 ARMCPU *cpu = ARM_CPU(cs);
+CPUClass *cc = CPU_GET_CLASS(cs);
 CPUARMState *env = >env;
 
 /* Ensure PC is synchronised */
@@ -881,7 +882,14 @@ bool kvm_arm_handle_debug(CPUState *cs, struct 
kvm_debug_exit_arch *debug_exit)
 if (cs->singlestep_enabled) {
 return true;
 } else {
-error_report("Came out of SINGLE STEP when not enabled");
+/*
+ * The kernel should have supressed the guests ability to
+ * single step at this point so something has gone wrong.
+ */
+error_report("%s: guest single-step while debugging unsupported"
+ " (%"PRIx64", %"PRIx32")\n",
+ __func__, env->pc, debug_exit->hsr);
+return false;
 }
 break;
 case EC_AA64_BKPT:
@@ -908,12 +916,14 @@ bool kvm_arm_handle_debug(CPUState *cs, struct 
kvm_debug_exit_arch *debug_exit)
  __func__, debug_exit->hsr, env->pc);
 }
 
-/* If we don't handle this it could be it really is for the
-   guest to handle */
-qemu_log_mask(LOG_UNIMP,
-  "%s: re-injecting exception not yet implemented"
-  " (0x%"PRIx32", %"PRIx64")\n",
-  __func__, hsr_ec, env->pc);
+/* If we are not handling the debug exception it must belong to
+ * the guest. Let's re-use the existing TCG interrupt code to set
+ * everything up properly.
+ */
+cs->exception_index = EXCP_BKPT;
+env->exception.syndrome = debug_exit->hsr;
+env->exception.vaddress = debug_exit->far;
+cc->do_interrupt(cs);
 
 return false;
 }
-- 
2.6.3




Re: [Qemu-devel] [Qemu-block] [PATCH] qemu-img / curl: When fetching Content-Size use GET instead of HEAD.

2015-12-08 Thread John Snow


On 12/07/2015 04:23 PM, Boris Schrijver wrote:
> Hi all,
> 

Hi!

> I was testing out the "qemu-img info/convert" options in combination with
> "http/https" when I stumbled upon this issue. When "qemu-img info/convert" 
> tries
> to collect the file info it will first try to fetch the Content-Size of the
> remote file. It does a HEAD request and after a GET request for the correct
> range.
> 
> The HEAD request is an issue. Because when you've got a pre-signed url, for
> example from S3, which INCLUDES the REQUEST METHOD in it's signature, you'll 
> get
> a 403 Forbidden.
> 
> It's is therefore better to use only the GET request method, and discard the
> body at the first call.
> 

How big is the body? Won't this introduce a really large overhead?

> Please review! I'll be ready for answers!
> 

Please use the git format-patch format for sending patch emails; see
http://qemu-project.org/Contribute/SubmitAPatch#Use_git_format-patch --
and remember to include a Signed-off-by line.

> [PATCH] qemu-img / curl: When fetching Content-Size use GET instead of HEAD.
> 
> A server can respond different to both methods, or can block one of the two.
> ---
>  block/curl.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/curl.c b/block/curl.c
> index 8994182..2e74c32 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -594,11 +594,11 @@ static int curl_open(BlockDriverState *bs, QDict 
> *options,
> int flags,
>  // Get file size
>  
>  s->accept_range = false;
> -curl_easy_setopt(state->curl, CURLOPT_NOBODY, 1);
> +curl_easy_setopt(state->curl, CURLOPT_HTTPGET, 1);
>  curl_easy_setopt(state->curl, CURLOPT_HEADERFUNCTION,
>   curl_header_cb);
>  curl_easy_setopt(state->curl, CURLOPT_HEADERDATA, s);
> -if (curl_easy_perform(state->curl))
> +if (curl_easy_perform(state->curl) != 23)

We go from making sure there were no errors to enforcing that we *do*
get CURLE_WRITE_ERROR? Can you explain why this change doesn't break
error handling scenarios for all other cases?

>  goto out;
>  curl_easy_getinfo(state->curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, );
>  if (d)
> 



Re: [Qemu-devel] [PATCH 06/16] acpi: Save PCMachineState on AcpiBuildState

2015-12-08 Thread Marcel Apfelbaum

On 12/08/2015 07:59 PM, Eduardo Habkost wrote:

On Mon, Dec 07, 2015 at 05:39:29PM +0200, Marcel Apfelbaum wrote:

On 12/02/2015 03:47 AM, Eduardo Habkost wrote:

PCMachineState will be used in some of the steps of ACPI table
building.

Signed-off-by: Eduardo Habkost 
---
  hw/i386/acpi-build.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 85a5c53..ca11c88 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1644,7 +1644,7 @@ struct AcpiBuildState {
  MemoryRegion *table_mr;
  /* Is table patched? */
  uint8_t patched;
-PcGuestInfo *guest_info;
+PCMachineState *pcms;
  void *rsdp;
  MemoryRegion *rsdp_mr;
  MemoryRegion *linker_mr;
@@ -1855,7 +1855,7 @@ static void acpi_build_update(void *build_opaque, 
uint32_t offset)

  acpi_build_tables_init();

-acpi_build(build_state->guest_info, );
+acpi_build(_state->pcms->acpi_guest_info, );

  acpi_ram_update(build_state->table_mr, tables.table_data);

@@ -1916,12 +1916,12 @@ void acpi_setup(PCMachineState *pcms)

  build_state = g_malloc0(sizeof *build_state);

-build_state->guest_info = guest_info;
+build_state->pcms = pcms;


I am not "sold" on keeping a reference to machine in the build_state.
We can always query current machine using qdev_machine() or something.

Keeping the "guest info" made sense since is used especially for ACPI,
however the machine has a wider scope. (And not having to keep it
around is a very good thing!)


I wouldn't mind using qdev_get_machine() if preferred by the
maintainer of that code, but I like to avoid it when possible. To
me, qdev_get_machine() is just a global variable disguised as a
harder-to-understand API.


Really? Hmm, for me is looking like the other way around :)
I see it as "query the QOM tree", instead of "keep the reference around" 
everywhere.
But it may be just a personal preference.

Thanks,
Marcel










Re: [Qemu-devel] [Qemu-block] QEMU being able to use audio cdroms

2015-12-08 Thread John Snow


On 11/25/2015 03:44 PM, Programmingkid wrote:
> Is there any platform where a guest in QEMU can play an audio cd? If not, is 
> this a feature that you would allow into QEMU?
> 

I haven't tested it, nobody has ever asked. I wouldn't reject patches
for such a feature, but I likely wouldn't make reviewing them a priority
either.

I assume this would be primarily for emulating games/etc that use mixed
Data/Audio formats?

--js



[Qemu-devel] [PATCH v10 2/6] target-arm: kvm - implement software breakpoints

2015-12-08 Thread Alex Bennée
These don't involve messing around with debug registers, just setting
the breakpoint instruction in memory. GDB will not use this mechanism if
it can't access the memory to write the breakpoint.

All the kernel has to do is ensure the hypervisor traps the breakpoint
exceptions and returns to userspace.

Signed-off-by: Alex Bennée 

--
v2
  - handle debug exit with new hsr exception info
  - add verbosity to UNIMP message
v3
  - sync with kvm_cpu_synchronize_state() before checking PC.
  - use internals.h defines
  - use env->pc
  - use proper format types
v9
  - add include for error_report
  - define a brk_insn constant
v10
  - fix up for arm32 compile
  - move sw_bp_code to kvm32 (stubs)/64 (working)
  - move kvm_handle_debug to kvm32/64 as kvm_arm_handle_debug
  - don't enable SW_BP unless the define is there
  - wrap in have_guest_debug check
---
 target-arm/kvm.c | 39 +---
 target-arm/kvm32.c   | 18 +
 target-arm/kvm64.c   | 72 
 target-arm/kvm_arm.h |  9 +++
 4 files changed, 123 insertions(+), 15 deletions(-)

diff --git a/target-arm/kvm.c b/target-arm/kvm.c
index 79ef4c6..7f44e22 100644
--- a/target-arm/kvm.c
+++ b/target-arm/kvm.c
@@ -17,6 +17,7 @@
 
 #include "qemu-common.h"
 #include "qemu/timer.h"
+#include "qemu/error-report.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/kvm.h"
 #include "kvm_arm.h"
@@ -516,9 +517,23 @@ MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run 
*run)
 return MEMTXATTRS_UNSPECIFIED;
 }
 
+
 int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
 {
-return 0;
+int ret = 0;
+
+switch (run->exit_reason) {
+case KVM_EXIT_DEBUG:
+if (kvm_arm_handle_debug(cs, >debug.arch)) {
+ret = EXCP_DEBUG;
+} /* otherwise return to guest */
+break;
+default:
+qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
+  __func__, run->exit_reason);
+break;
+}
+return ret;
 }
 
 bool kvm_arch_stop_on_emulation_error(CPUState *cs)
@@ -541,16 +556,16 @@ int kvm_arch_on_sigbus(int code, void *addr)
 return 1;
 }
 
+/* The #ifdef protections are until 32bit headers are imported and can
+ * be removed once both 32 and 64 bit reach feature parity.
+ */
 void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg)
 {
-qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__);
-}
-
-int kvm_arch_insert_sw_breakpoint(CPUState *cs,
-  struct kvm_sw_breakpoint *bp)
-{
-qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__);
-return -EINVAL;
+#ifdef KVM_GUESTDBG_USE_SW_BP
+if (kvm_sw_breakpoints_active(cs)) {
+dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP;
+}
+#endif
 }
 
 int kvm_arch_insert_hw_breakpoint(target_ulong addr,
@@ -567,12 +582,6 @@ int kvm_arch_remove_hw_breakpoint(target_ulong addr,
 return -EINVAL;
 }
 
-int kvm_arch_remove_sw_breakpoint(CPUState *cs,
-  struct kvm_sw_breakpoint *bp)
-{
-qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__);
-return -EINVAL;
-}
 
 void kvm_arch_remove_all_hw_breakpoints(void)
 {
diff --git a/target-arm/kvm32.c b/target-arm/kvm32.c
index df1e2b0..5ce969f 100644
--- a/target-arm/kvm32.c
+++ b/target-arm/kvm32.c
@@ -475,3 +475,21 @@ int kvm_arch_get_registers(CPUState *cs)
 
 return 0;
 }
+
+int kvm_arch_insert_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
+{
+qemu_log_mask(LOG_UNIMP, "%s: guest debug not yet implemented\n", 
__func__);
+return -EINVAL;
+}
+
+int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
+{
+qemu_log_mask(LOG_UNIMP, "%s: guest debug not yet implemented\n", 
__func__);
+return -EINVAL;
+}
+
+bool kvm_arm_handle_debug(CPUState *cs, struct kvm_debug_exit_arch *debug_exit)
+{
+qemu_log_mask(LOG_UNIMP, "%s: guest debug not yet implemented\n", 
__func__);
+return false;
+}
diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c
index d087794..3b3929d 100644
--- a/target-arm/kvm64.c
+++ b/target-arm/kvm64.c
@@ -18,6 +18,7 @@
 #include "config-host.h"
 #include "qemu-common.h"
 #include "qemu/timer.h"
+#include "qemu/error-report.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/kvm.h"
 #include "kvm_arm.h"
@@ -481,3 +482,74 @@ int kvm_arch_get_registers(CPUState *cs)
 /* TODO: other registers */
 return ret;
 }
+
+/* C6.6.29 BRK instruction */
+static const uint32_t brk_insn = 0xd420;
+
+int kvm_arch_insert_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
+{
+if (have_guest_debug) {
+if (cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)>saved_insn, 4, 0) 
||
+cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)_insn, 4, 1)) {
+return -EINVAL;
+}
+return 0;
+} else {
+error_report("guest debug not supported on this kernel");

[Qemu-devel] [PATCH v10 1/6] target-arm: kvm64 - introduce kvm_arm_init_debug()

2015-12-08 Thread Alex Bennée
As we haven't always had guest debug support we need to probe for it.
Additionally we don't do this in the start-up capability code so we
don't fall over on old kernels.

Signed-off-by: Alex Bennée 
---
 target-arm/kvm64.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c
index ceebfeb..d087794 100644
--- a/target-arm/kvm64.c
+++ b/target-arm/kvm64.c
@@ -25,6 +25,22 @@
 #include "internals.h"
 #include "hw/arm/arm.h"
 
+static bool have_guest_debug;
+
+/**
+ * kvm_arm_init_debug()
+ * @cs: CPUState
+ *
+ * Check for guest debug capabilities.
+ *
+ */
+static void kvm_arm_init_debug(CPUState *cs)
+{
+have_guest_debug = kvm_check_extension(cs->kvm_state,
+   KVM_CAP_SET_GUEST_DEBUG);
+return;
+}
+
 static inline void set_feature(uint64_t *features, int feature)
 {
 *features |= 1ULL << feature;
@@ -121,6 +137,8 @@ int kvm_arch_init_vcpu(CPUState *cs)
 }
 cpu->mp_affinity = mpidr & ARM64_AFFINITY_MASK;
 
+kvm_arm_init_debug(cs);
+
 return kvm_arm_init_cpreg_list(cpu);
 }
 
-- 
2.6.3




Re: [Qemu-devel] [Qemu-block] QEMU being able to use audio cdroms

2015-12-08 Thread Programmingkid

On Dec 8, 2015, at 1:49 PM, John Snow wrote:

> 
> 
> On 11/25/2015 03:44 PM, Programmingkid wrote:
>> Is there any platform where a guest in QEMU can play an audio cd? If not, is 
>> this a feature that you would allow into QEMU?
>> 
> 
> I haven't tested it, nobody has ever asked. I wouldn't reject patches
> for such a feature, but I likely wouldn't make reviewing them a priority
> either.
> 
> I assume this would be primarily for emulating games/etc that use mixed
> Data/Audio formats?

That could be one reason to do it. I just thought it would be neat to be able 
to play CDs inside a guest.


Re: [Qemu-devel] [PATCH] qemu-img / curl: When fetching Content-Size use GET instead of HEAD.

2015-12-08 Thread Michael Tokarev
08.12.2015 00:23, Boris Schrijver wrote:
[]
> It's is therefore better to use only the GET request method, and discard the
> body at the first call.

Nooo!  Please N!

Fetching a large file once might be too long already.
Fetching it twice is twice as long.  Oh well!..

Thanks,

/mjt



Re: [Qemu-devel] [Qemu-block] [PATCH] qemu-img / curl: When fetching Content-Size use GET instead of HEAD.

2015-12-08 Thread Boris Schrijver
See inline! Thanks for your response!

-- 

Met vriendelijke groet / Kind regards,

Boris Schrijver

PCextreme B.V.

http://www.pcextreme.nl/contact
Tel direct: +31 (0) 118 700 215

> On December 8, 2015 at 8:40 PM John Snow  wrote:
> 
> 
> 
> 
> On 12/07/2015 04:23 PM, Boris Schrijver wrote:
> > Hi all,
> > 
> 
> Hi!
> 
> > I was testing out the "qemu-img info/convert" options in combination with
> > "http/https" when I stumbled upon this issue. When "qemu-img info/convert"
> > tries
> > to collect the file info it will first try to fetch the Content-Size of the
> > remote file. It does a HEAD request and after a GET request for the correct
> > range.
> > 
> > The HEAD request is an issue. Because when you've got a pre-signed url, for
> > example from S3, which INCLUDES the REQUEST METHOD in it's signature, you'll
> > get
> > a 403 Forbidden.
> > 
> > It's is therefore better to use only the GET request method, and discard the
> > body at the first call.
> > 
> 
> How big is the body? Won't this introduce a really large overhead?

The body is "worst-case" the size of the Ethernet v2 frame, around 1500 bytes.

> 
> > Please review! I'll be ready for answers!
> > 
> 
> Please use the git format-patch format for sending patch emails; see
> http://qemu-project.org/Contribute/SubmitAPatch#Use_git_format-patch --
> and remember to include a Signed-off-by line.
>

Ok, will do!

> > [PATCH] qemu-img / curl: When fetching Content-Size use GET instead of HEAD.
> > 
> > A server can respond different to both methods, or can block one of the two.
> > ---
> >  block/curl.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/curl.c b/block/curl.c
> > index 8994182..2e74c32 100644
> > --- a/block/curl.c
> > +++ b/block/curl.c
> > @@ -594,11 +594,11 @@ static int curl_open(BlockDriverState *bs, QDict
> > *options,
> > int flags,
> >  // Get file size
> >  
> >  s->accept_range = false;
> > -curl_easy_setopt(state->curl, CURLOPT_NOBODY, 1);
> > +curl_easy_setopt(state->curl, CURLOPT_HTTPGET, 1);
> >  curl_easy_setopt(state->curl, CURLOPT_HEADERFUNCTION,
> >   curl_header_cb);
> >  curl_easy_setopt(state->curl, CURLOPT_HEADERDATA, s);
> > -if (curl_easy_perform(state->curl))
> > +if (curl_easy_perform(state->curl) != 23)
> 
> We go from making sure there were no errors to enforcing that we *do*
> get CURLE_WRITE_ERROR? Can you explain why this change doesn't break
> error handling scenarios for all other cases?
>

We're enforcing the CURLE_WRITE_ERROR here. We receive data, but don't want to
save it anywhere -> We only want the header. CURLE_WRITE_ERROR implicitly means
the connection is successful, because we received a response body! Any other
error will not be CURLE_WRITE_ERROR and thus fail.

Please run the following command: curl -v -X GET -I http://qemu.org/
It will at the last line read:

* Excess found in a non pipelined read: excess = 279 url = / (zero-length body)

That is the body we're discarding.

Libcurl basically doesn't provide a nice way to handle this. That's why I've
implemented in this fashion.


> >  goto out;
> >  curl_easy_getinfo(state->curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, );
> >  if (d)
> >

[PATCH]

commit ec8d3ef01eaca9264d97e9ad757fe536e0dc037b
Author: Boris Schrijver 
Date:   Mon Dec 7 22:01:59 2015 +0100

qemu-img / curl: When fetching Content-Size use GET instead of HEAD.

A server can respond different to both methods, or can block one of the two.

Signed-off-by: Boris Schrijver 

diff --git a/block/curl.c b/block/curl.c
index 8994182..2e74c32 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -594,11 +594,11 @@ static int curl_open(BlockDriverState *bs, QDict *options,
int flags,
 // Get file size
 
 s->accept_range = false;
-curl_easy_setopt(state->curl, CURLOPT_NOBODY, 1);
+curl_easy_setopt(state->curl, CURLOPT_HTTPGET, 1);
 curl_easy_setopt(state->curl, CURLOPT_HEADERFUNCTION,
  curl_header_cb);
 curl_easy_setopt(state->curl, CURLOPT_HEADERDATA, s);
-if (curl_easy_perform(state->curl))
+if (curl_easy_perform(state->curl) != 23)
 goto out;
 curl_easy_getinfo(state->curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, );
 if (d)



[Qemu-devel] [PATCH for-2.5] sparc: allow CASA with ASI 0xa from user space

2015-12-08 Thread Richard Henderson
On 12/04/2015 07:01 AM, Alex Zuepke wrote:
> LEON3 allows the CASA instruction to be used from user space
> if the ASI is set to 0xa (user data).
> 
> Signed-off-by: Alex Zuepke 
> ---
>  target-sparc/translate.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/target-sparc/translate.c b/target-sparc/translate.c
> index 41a3319..63440dd 100644
> --- a/target-sparc/translate.c
> +++ b/target-sparc/translate.c
> @@ -5097,7 +5097,8 @@ static void disas_sparc_insn(DisasContext * dc, 
> unsigned int insn)
>  if (IS_IMM) {
>  goto illegal_insn;
>  }
> -if (!supervisor(dc)) {
> +/* LEON3 allows CASA from user space with ASI 0xa */
> +if ((GET_FIELD(insn, 19, 26) != 0xa) && !supervisor(dc)) 
> {
>  goto priv_insn;
>  }
>  #endif
> 

Reviewed-by: Richard Henderson 

This should probably be merged for 2.5.

For 2.6, I have a branch with substantial changes for the sparc backend.  Part
of which totally revamps the way ASIs are handled.  I believe it gets this
right.  See

  git://github.com/rth7680/qemu.git tgt-sparc


r~



Re: [Qemu-devel] [PATCH] qemu-img / curl: When fetching Content-Size use GET instead of HEAD.

2015-12-08 Thread Boris Schrijver
Hi!

To clarify: The body of the response is the maximum size defined by MTU network
policies, so by default around ~1500 bytes. After that is received, the header
is parsed and the connection is dropped!

So no whole file transfers!!! Please test and see for your self!

-- 

Met vriendelijke groet / Kind regards,

Boris Schrijver

PCextreme B.V.

http://www.pcextreme.nl/contact
Tel direct: +31 (0) 118 700 215

> On December 8, 2015 at 8:56 PM Michael Tokarev  wrote:
> 
> 
> 08.12.2015 00:23, Boris Schrijver wrote:
> []
> > It's is therefore better to use only the GET request method, and discard the
> > body at the first call.
> 
> Nooo!  Please N!
> 
> Fetching a large file once might be too long already.
> Fetching it twice is twice as long.  Oh well!..
> 
> Thanks,
> 
> /mjt



Re: [Qemu-devel] [PATCH 4/8] bcm2835_emmc: add bcm2835 MMC/SD controller

2015-12-08 Thread Peter Crosthwaite
On Tue, Dec 8, 2015 at 10:19 PM, Andrew Baumann
 wrote:
>> From: Peter Crosthwaite [mailto:crosthwaitepe...@gmail.com]
>> Sent: Saturday, 5 December 2015 21:26
>> Is this IP just SDHCI? We already model SDHCI in QEMU, see
>> hw/sd/sdhci.c. If there are RPi specific features to the SDHCI
>> implementation they should be added as optional extensions (probabably
>> via subclassing) to the existing SDHCI model.
>
> So yes, it turns out this is fairly similar to SDHCI (-> lots of wasted work 
> by Gregory and me, sigh), and indeed Linux boots with the existing sdhci 
> emulation. However, there are some quirks, and UEFI/Windows depend on them. 
> Namely:
>  * The host control registers (offset 0x28 and above) seem to differ 
> significantly. Maybe this is due to the SDHC version -- according to the 
> BCM2835 peripherals spec, the controller implements "Version 3.0 Draft 1.0" 
> of the SDHC spec, but of course I can't find that spec online anywhere. 
> Luckily nothing seems to depend on this, besides a few spurious warnings 
> about invalid writes.

Looks reasonably consistent from a quick scan? 0x28 in shdci.c is only
doing the ADMA stuff while there are other fields on the BCM model.

>  * Power is assumed to be always on -- the sdhci model requires the guest to 
> turn it on by a write at offset 0x29 before issuing any commands, but on pi 
> this bit is marked reserved, and commands are issued immediately after reset.

Does this help?:

https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06271.html

>  * The card inserted interrupt is rather broken on pi: it is set at the start 
> of day, but a reset command clears it and it stays clear thereafter (and 
> never generates interrupts).
>

Is that more likely to be an IP connectivity problem (wierd input to
the card-detect pin in the SoC)?

> There's an inconsistency with response handling, too, although I'm not sure 
> if it's a quirk of the Pi or a general bug in sdhci. Pi UEFI sends a CMD23 
> without setting any of the response bits, but this command does in fact 
> generate a 4-byte R1 response. The question is whether this should be treated 
> as an error, or whether it simply means that the host wants to ignore the 
> response. In sdhci, the following code path (around line 246) raises a 
> "command index" error in the case that a non-zero response is returned but no 
> response bits were set in the command register:
>
> } else if (rlen != 0 && (s->errintstsen & SDHC_EISEN_CMDIDX)) {
> s->errintsts |= SDHC_EIS_CMDIDX;
> s->norintsts |= SDHC_NIS_ERR;
> }
>
> I do not observe this behaviour on the real Pi2 (and it breaks UEFI). The 
> hardware semantics appear to be "if the command generates a response, but you 
> didn't want to see it, we'll successfully complete the command and ignore the 
> response", whereas the sdhci implementation raises an error for this as well 
> as signalling completion. I have read the "SD Specifications Part A2 SD Host 
> Controller Simplified Specification Version 2.00", but did not find anything 
> describing this case, so it could be that this is open to interpretation. (It 
> could also be specified in SDHC v3.) The specific error also seems odd -- my 
> understanding is that a "command index" error means that the index in the 
> response didn't match the index of the issued command, but that's hardly what 
> is happening here.
>

Starting to sound like a bug.

> Assuming this latter bug can be fixed generically, how do you propose 
> handling the Pi quirks? I could add a bool property for "bcm2835-quirks" or 
> similar and just special-case the relevant code (my preferred approach). I'm 
> also open to subclassing, but no idea how that would work in practice, so 
> would need some pointers.
>

I think we need a more definitive list of the register level features
that are different or added, I am not seeing what is BCM specific just
yet.

Regards,
Peter

> Thanks,
> Andrew



Re: [Qemu-devel] [PATCH] virtio-blk: Drop x-data-plane option

2015-12-08 Thread Stefan Hajnoczi
On Mon, Dec 07, 2015 at 05:10:26PM +, Peter Maydell wrote:
> On 7 December 2015 at 15:19, Paolo Bonzini  wrote:
> >
> >
> > On 07/12/2015 14:02, Fam Zheng wrote:
> >> On Mon, 12/07 12:29, Cornelia Huck wrote:
> >>> On Mon,  7 Dec 2015 18:59:27 +0800
> >>> Fam Zheng  wrote:
> >>>
>  The official way of enabling dataplane is through the "iothread"
>  property that references an iothread object created by "-object
>  iothread".  Since the old "x-data-plane=on" way now even crashes, it's
>  probably easier to just drop it:
> 
>  $ qemu-system-x86_64 -drive file=null-co://,id=d0,if=none \
>  -device virtio-blk-pci,drive=d0,x-data-plane=on
> 
>  ERROR:/home/fam/work/qemu/qom/object.c:1515:
>  object_get_canonical_path_component: assertion failed: (obj->parent != 
>  NULL)
>  Aborted
> >>>
> >>> Do we understand yet why this crashes, btw?
> >>
> >> I think it's because with x-data-plane=on, virtio-blk initialize an object 
> >> that
> >> doesn't have a parent, therefore it doesn't have a valid "canonical path
> >> component" thing, which is different from objects created with "-object" 
> >> CLI.
> >> I'm not very familiar with the QOM semantics here.
> >>
> >>>
> 
>  Signed-off-by: Fam Zheng 
>  ---
>   hw/block/dataplane/virtio-blk.c | 15 ++-
>   hw/block/virtio-blk.c   |  1 -
>   include/hw/virtio/virtio-blk.h  |  1 -
>   3 files changed, 2 insertions(+), 15 deletions(-)
> 
> >>>
> >>> No general objection to removing x-data-plane; but this probably wants
> >>> a mention on the changelog as x-data-plane has been described in
> >>> various howtos etc. over the years.
> >>
> >> Yes, that is a good point.  I don't know if it's too rushing in removing 
> >> it for
> >> 2.5 (this is just posted as one option) and we'll have to count on QOM 
> >> experts
> >> for the fix, if it is.
> >
> > The solution would be to add object_property_add_child to
> > virtio_blk_data_plane_create, between object_initialize and
> > user_creatable_complete.  But I think this patch is ok for 2.5.
> 
> Paolo asked me to apply this to master, so I have done so.

Okay.  I will do my best to communicate that x-data-plane is gone.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] vfio: Align iova also to IOMMU page size

2015-12-08 Thread Alex Williamson
On Mon, 2015-12-07 at 11:20 +, Peter Maydell wrote:
> On 7 December 2015 at 10:53, Pavel Fedin  wrote:
> >> TAGET_PAGE_ALIGN tells us that it *could* be a valid DMA target though.
> >> The VM model is capable of using that as a page size, which means we
> >> assume it is and want to generate a fault.
> >
> >  We seem to have looped back. So...
> >  It is possible to fix this according to this assumption. In this
> > case we would need to make TARGET_PAGE_BITS a variable. If we are
> > emulating ancient armv5te, it will be set to 10. For modern targets,
> > ARMv6 and newer, it will be 12.
> 
> You can't just make TARGET_PAGE_BITS a variable, it is used as a compile
> time constant in a bunch of TCG internal stuff. It would be nice
> if we didn't require it to be compile time, but it would be a lot of
> work to fix (especially if you want to avoid it being a performance
> hit).
> 
> In any case, that still doesn't fix the problem. On an AArch64
> target CPU, TARGET_PAGE_BITS still has to be 12 (for a 4K
> minimum page size), but the guest and host could still be using
> 64K pages. So your VFIO code *must* be able to deal with the
> situation where TARGET_PAGE_BITS is smaller than any alignment
> that the guest, host or IOMMU need to care about.
> 
> I still think the VFIO code needs to figure out what alignment
> it actually cares about and find some way to determine what
> that is, or alternatively if the relevant alignment is not
> possible to determine, write the code so that it doesn't
> need to care. Either way, TARGET_PAGE_ALIGN is not the answer.

Ok, let's work our way down through the relevant page sizes, host,
IOMMU, and target.

The host page size is relevant because this is the granularity with
which the kernel can pin pages.  Every IOMMU mapping must be backed by a
pinned page in the current model since we don't really have hardware to
support IOMMU page faults.

The IOMMU page size defines the granularity with which we can map IOVA
to physical memory.  The IOMMU may support multiple page sizes, but what
we're really talking about here is the minimum page size.

The target page size is relevant because this defines the minimum
possible page size used within the VM.  We presume that anything less
than TARGET_PAGE_ALIGN cannot be referenced as a page by the VM CPU and
therefore is probably not allocated as a DMA buffer for a driver running
within the guest.

An implementation detail here is that the vfio type1 IOMMU model
currently exposes the host page size as the minimum IOMMU page size.
The reason for this is to simplify page accounting, if we don't allow
sub-host page mappings we don't need per page reference counting.  This
can be fixed within the current API, but kernel changes are required or
else locked page requirements due to over-counting become a problem.
The benefit though is that this abstracts the host page size from QEMU.

So let's take the easy scenario first, if target page size is greater
than or equal to the minimum IOMMU page size, we're golden.  We can map
anything that could be a target DMA buffer.  This leads to the current
situation that we simply ignore any ranges which disappear when we align
to the target page size.  It can't be a DMA buffer, ignore it.  Note
that the 64k host, 4k target problem goes away if type1 accounting is
fixed to allow IOMMU granularity mapping, since I think in the cases we
care about the IOMMU still supports 4k pages, otherwise...

Then we come to the scenario here, where target page size is less than
the minimum IOMMU page size.  The current code is intentionally trying
to trigger the vfio type1 error that this cannot be mapped.  To resolve
this, QEMU needs to decide if it's ok to provide the device with DMA
access to everything on that IOMMU granularity page, ensure that aliases
mapping the same IOMMU page are consistent and handle the reference
counting for those sub-mappings to avoid duplicate mappings and
premature unmaps.

So I think in the end, the one page size we care about is the minimum
IOMMU granularity.  We don't really care about the target page size at
all and maybe we only care about the host page size for determining what
might share a page with a sub-page mapping.  However, there's work to
get there (QEMU, kernel, or both depending on the specific config) and
the target page size trick has so far been a useful simplification.
Thanks,

Alex




Re: [Qemu-devel] tcg: improve MAX_CODE_GEN_BUFFER_SIZE for arm

2015-12-08 Thread TeLeMan
On Tue, Dec 8, 2015 at 7:21 PM, Aurelien Jarno  wrote:
> On 2015-12-08 11:51, Laurent Desnogues wrote:
>> Hello,
>>
>> On Tue, Dec 8, 2015 at 11:39 AM, Aurelien Jarno  wrote:
>> [...]
>> > I already posted a patch a long time ago to remove the 16MB limit on ARM
>> > hosts:
>> >
>> > http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg01684.html
>> >
>> > However as you can see in the thread, it has been rejected as it doesn't
>> > not bring improvement in all cases.
>>
>> We could perhaps resurrect it and do some more benchmarking?  Who
>> would be able to do testing on (recent) ARM hardware?
>
> I can provide an updated patch, but I would prefer if someone else does
> the benchmarking on a really recent hardware. Not sure the hardware I
> have (cortex A7) is really representative of a modern ARM CPU.

ok,I wait your new patch, thanks. I have arm A7 too.

> Aurelien
>
> --
> Aurelien Jarno  GPG: 4096R/1DDD8C9B
> aurel...@aurel32.net http://www.aurel32.net



[Qemu-devel] [PATCH] xen_pt: fix failure of attaching & detaching a PCI device to VM repeatedly

2015-12-08 Thread Jianzhong,Chang
Add pci = [ '$VF_BDF1', '$VF_BDF2', '$VF_BDF3'] in
hvm guest configuration file. After the guest boot up,
detach the VFs in sequence by "xl pci-detach $DOMID $VF_BDF",
reattach the VFs by "xl pci-attach $VF_BDF" in sequence.
An error message will be reported like this:
"libxl: error: libxl_qmp.c:287:qmp_handle_error_response: received
an error message from QMP server: Duplicate ID 'pci-pt-07_10.1' for device"

When xen_pt_region_add/del() is called, MemoryRegion
may not belong to the XenPCIPassthroughState.
xen_pt_region_update() checks it but memory_region_ref/unref() does not.
This case causes obj->ref issue and affects the release of related objects.
So, memory_region_ref/unref() is moved from
xen_pt_region_add/del inside xen_pt_region_update.

Signed-off-by: Jianzhong,Chang 
---
 hw/xen/xen_pt.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index aa96288..45d4d6c 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -590,7 +590,11 @@ static void xen_pt_region_update(XenPCIPassthroughState *s,
 if (bar == -1 && (!s->msix || >msix->mmio != mr)) {
 return;
 }
-
+if (adding) {
+memory_region_ref(mr);
+} else {
+memory_region_unref(mr);
+}
 if (s->msix && >msix->mmio == mr) {
 if (adding) {
 s->msix->mmio_base_addr = sec->offset_within_address_space;
@@ -642,7 +646,6 @@ static void xen_pt_region_add(MemoryListener *l, 
MemoryRegionSection *sec)
 XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState,
  memory_listener);
 
-memory_region_ref(sec->mr);
 xen_pt_region_update(s, sec, true);
 }
 
@@ -652,7 +655,6 @@ static void xen_pt_region_del(MemoryListener *l, 
MemoryRegionSection *sec)
  memory_listener);
 
 xen_pt_region_update(s, sec, false);
-memory_region_unref(sec->mr);
 }
 
 static void xen_pt_io_region_add(MemoryListener *l, MemoryRegionSection *sec)
@@ -660,7 +662,6 @@ static void xen_pt_io_region_add(MemoryListener *l, 
MemoryRegionSection *sec)
 XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState,
  io_listener);
 
-memory_region_ref(sec->mr);
 xen_pt_region_update(s, sec, true);
 }
 
@@ -670,7 +671,6 @@ static void xen_pt_io_region_del(MemoryListener *l, 
MemoryRegionSection *sec)
  io_listener);
 
 xen_pt_region_update(s, sec, false);
-memory_region_unref(sec->mr);
 }
 
 static const MemoryListener xen_pt_memory_listener = {
-- 
1.7.1




Re: [Qemu-devel] [Qemu-ppc] [PATCHv2 07/10] pseries: DEFINE_SPAPR_MACHINE

2015-12-08 Thread Alexey Kardashevskiy

On 12/08/2015 01:38 PM, Sam Bobroff wrote:

On Mon, Dec 07, 2015 at 02:34:37PM +1100, David Gibson wrote:

At the moment all the class_init functions and TypeInfo structures for the
various versioned pseries machine types are open-coded.  As more versions
are created this is getting increasingly clumsy.

This patch borrows the approach used in PC, using a DEFINE_SPAPR_MACHINE()
macro to construct most of the boilerplate from simpler 'class_options' and
'instance_options' functions.

This patch makes a small semantic change - the versioned machine types are
now registered through machine_init() instead of type_init().  Since the
new way is how PC already did it, I'm assuming that's correct.

Signed-off-by: David Gibson 
---
  hw/ppc/spapr.c | 119 -
  1 file changed, 49 insertions(+), 70 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 3078e60..4f645f3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2302,24 +2302,47 @@ static const TypeInfo spapr_machine_info = {
  },
  };

+#define DEFINE_SPAPR_MACHINE(suffix, verstr) \
+static void spapr_machine_##suffix##_class_init(ObjectClass *oc, \
+void *data)  \
+{\
+MachineClass *mc = MACHINE_CLASS(oc);\
+spapr_machine_##suffix##_class_options(mc);  \
+}\
+static void spapr_machine_##suffix##_instance_init(Object *obj)  \
+{\
+MachineState *machine = MACHINE(obj);\
+spapr_machine_##suffix##_instance_options(machine);  \
+}\
+static const TypeInfo spapr_machine_##suffix##_info = {  \
+.name = MACHINE_TYPE_NAME("pseries-" verstr),\
+.parent = TYPE_SPAPR_MACHINE,\
+.class_init = spapr_machine_##suffix##_class_init,   \
+.instance_init = spapr_machine_##suffix##_instance_init, \
+};   \
+static void spapr_machine_register_##suffix(void)\
+{\
+type_register(_machine_##suffix##_info);   \
+}\
+machine_init(spapr_machine_register_##suffix)
+
  /*
   * pseries-2.5
   */
-static void spapr_machine_2_5_class_init(ObjectClass *oc, void *data)
+static void spapr_machine_2_5_instance_options(MachineState *machine)
  {
-MachineClass *mc = MACHINE_CLASS(oc);
-sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
+}
+
+static void spapr_machine_2_5_class_options(MachineClass *mc)
+{
+sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);

  mc->alias = "pseries";
  mc->is_default = 1;
  smc->dr_lmb_enabled = true;
  }

-static const TypeInfo spapr_machine_2_5_info = {
-.name  = MACHINE_TYPE_NAME("pseries-2.5"),
-.parent= TYPE_SPAPR_MACHINE,
-.class_init= spapr_machine_2_5_class_init,
-};
+DEFINE_SPAPR_MACHINE(2_5, "2.5");

  /*
   * pseries-2.4
@@ -2327,18 +2350,17 @@ static const TypeInfo spapr_machine_2_5_info = {
  #define SPAPR_COMPAT_2_4 \
  HW_COMPAT_2_4

-static void spapr_machine_2_4_class_init(ObjectClass *oc, void *data)
+static void spapr_machine_2_4_instance_options(MachineState *machine)
  {
-MachineClass *mc = MACHINE_CLASS(oc);
+spapr_machine_2_5_instance_options(machine);
+}

+static void spapr_machine_2_4_class_options(MachineClass *mc)
+{
  SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_4);
  }

-static const TypeInfo spapr_machine_2_4_info = {
-.name  = MACHINE_TYPE_NAME("pseries-2.4"),
-.parent= TYPE_SPAPR_MACHINE,
-.class_init= spapr_machine_2_4_class_init,
-};
+DEFINE_SPAPR_MACHINE(2_4, "2.4");

  /*
   * pseries-2.3
@@ -2352,30 +2374,18 @@ static const TypeInfo spapr_machine_2_4_info = {
  .value= "off",\
  },

-static void spapr_compat_2_3(Object *obj)
+static void spapr_machine_2_3_instance_options(MachineState *machine)
  {
+spapr_machine_2_4_instance_options(machine);
  savevm_skip_section_footers();
  global_state_set_optional();
  }

-static void spapr_machine_2_3_instance_init(Object *obj)
-{
-spapr_compat_2_3(obj);
-}
-
-static void spapr_machine_2_3_class_init(ObjectClass *oc, void *data)
+static void spapr_machine_2_3_class_options(MachineClass *mc)
  {
-MachineClass *mc = MACHINE_CLASS(oc);
-
  SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_3);
  }
-
-static const TypeInfo spapr_machine_2_3_info = {
-.name   

Re: [Qemu-devel] [PATCH 1/7] pc: wire up TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE for !xen

2015-12-08 Thread Gerd Hoffmann
> Hi Gerd,
> 
> A quick question, does IGD_PASSTHROUGH makes sense for compat machine types?

Unlikely to be used in practice, but I don't feel like creating
different initialization code paths because of that ...

> On the same topic, does machine->igd_gfx_passthru makes sense for all machine 
> types?

Same answer ;)

cheers,
  Gerd




[Qemu-devel] [PATCH v6 00/11] Add basic "detach" support for dump-guest-memory

2015-12-08 Thread Peter Xu
v6 changes:
- patch 10
  - English error fix [Fam]
- patch 11
  - remove useless var: "not_used" [me]
- all
  - move patch 8 to the end to be patch 11 (v5 patches 9-11 become
  v6 patches 8-10) [Eric]

v5 changes:
- patch 1
  - comment English fix [Fam]
- patch 2
  - pass has_detach=true always in hmp_dump_guest_memory [Paolo]
- patch 3
  - always use local_err and error_propagate() when need to check
the result [Fam]
- patch 8
  - add "DumpQueryResult" in DUMP_COMPLETED event [Eric]
(since DumpQueryResult is introduced in patch 10, so doing it in
patch 10 for convenience. Please let me know if I should not do
this, e.g., if patch re-ordering is required)

v4 changes:
- patch 2:
  - hmp: fix default value lost [Eric]
  - English errors [Eric]
- patch 3:
  - use global DumpState, leverage C99 struct init [Paolo]
  - English errors [Eric]
- patch 5:
  - more cleanup for dump_process [Paolo]
- patch 8:
  - make sure qmp-events.txt is sorted [Eric]
  - enhance error_get_pretty() [Eric]
  - emit DUMP_COMPLETED no matter detach or not
- patch 10:
  - use g_new0 to replace g_malloc0 [Eric]
  - rename "written_bytes" to "completed", "total_bytes" to "total"
[Eric]
  - use atomic ops and [rw]mb to protect status read/write [Paolo]
- patch 12:
  - English errors [Eric]
  - merge contents into older patches [Eric]

v3 changes (patch number corresponds to v2 patch set):
- patch 1
  - fix commit message. no memory leak, only code cleanup [Fam]
- patch 2
  - better documentation for "dump-guest-memory" (new patch 9) [Fam]
- patch 3
  - remove rcu lock/unlock in dump_init() [Fam, Paolo]
  - embed mr pointer into GuestPhysBlock [Paolo]
  - remove global dump state [Paolo]
- patch 4
  - fix memory leak for error [Fam]
  - evt DUMP_COMPLETED data: change to an optional "*error" [Paolo]
- patch 5
  - fix documents [Fam]
  - change "dump-query" to "query-dump", HMP to "info dump" [Paolo]
- patch 6
  - for query-dump command: define enum for DumpStatus, use "int"
for written/total [Paolo]
- all
  - reorder the commits as suggested, no fake values [Paolo]
  - split big commit into smaller ones [me]

v2 changes:
- fixed English errors [Drew]
- reordered the "detach" field, first make it optional, then make sure
  it's order is consistent [Drew, Fam]
- added doc for new detach flag [Eric]
- collected error msg even detached [Drew]
- added qmp event DUMP_COMPLETED to notify user [Eric, Fam]
- added "dump-query" QMP & HMP commands to query dump status [Eric]
- "stop" is not allowed when dump in background (also include
  "cont" and "dump-guest-memory") [Fam]
- added codes to calculate how many dump work finished, which could
  be queried from "dump-query" [Laszlo]
- added list to track all used MemoryRegion objects, also ref before
  use [Paolo]
- dump-guest-memory will be forbidden during incoming migrate [Paolo]
- taking rcu lock when collecting memory info [Paolo]

Test Done:
- QMP & HMP
  - test default dump (sync), work as usual
  - test detached dump, command return immediately.
  - When dump finished, will receive event DUMP_COMPLETED. 
  - test query-dump before/during/after dump
  - test kdump with zlib compression, w/ and w/o detach
- libvirt
  - test "virsh dump --memory-only" with default format and
kdump-zlib format, work as usual

Peter Xu (11):
  dump-guest-memory: cleanup: removing dump_{error|cleanup}().
  dump-guest-memory: add "detach" flag for QMP/HMP interfaces.
  dump-guest-memory: using static DumpState, add DumpStatus
  dump-guest-memory: add dump_in_progress() helper function
  dump-guest-memory: introduce dump_process() helper function.
  dump-guest-memory: disable dump when in INMIGRATE state
  dump-guest-memory: add "detach" support
  DumpState: adding total_size and written_size fields
  Dump: add qmp command "query-dump"
  Dump: add hmp command "info dump"
  dump-guest-memory: add qmp event DUMP_COMPLETED

 docs/qmp-events.txt |  18 
 dump.c  | 215 ++--
 hmp-commands-info.hx|  14 +++
 hmp-commands.hx |   5 +-
 hmp.c   |  26 -
 hmp.h   |   1 +
 include/qemu-common.h   |   4 +
 include/sysemu/dump.h   |  15 +++
 include/sysemu/memory_mapping.h |   4 +
 memory_mapping.c|   3 +
 qapi-schema.json|  56 ++-
 qapi/event.json |  16 +++
 qmp-commands.hx |  31 +-
 qmp.c   |  14 +++
 14 files changed, 359 insertions(+), 63 deletions(-)

-- 
2.4.3




Re: [Qemu-devel] [Qemu-ppc] [PATCHv2 07/10] pseries: DEFINE_SPAPR_MACHINE

2015-12-08 Thread Alexey Kardashevskiy

On 12/09/2015 02:30 PM, Alexey Kardashevskiy wrote:

On 12/08/2015 01:38 PM, Sam Bobroff wrote:

On Mon, Dec 07, 2015 at 02:34:37PM +1100, David Gibson wrote:

At the moment all the class_init functions and TypeInfo structures for the
various versioned pseries machine types are open-coded.  As more versions
are created this is getting increasingly clumsy.

This patch borrows the approach used in PC, using a DEFINE_SPAPR_MACHINE()
macro to construct most of the boilerplate from simpler 'class_options' and
'instance_options' functions.

This patch makes a small semantic change - the versioned machine types are
now registered through machine_init() instead of type_init().  Since the
new way is how PC already did it, I'm assuming that's correct.

Signed-off-by: David Gibson 
---
  hw/ppc/spapr.c | 119
-
  1 file changed, 49 insertions(+), 70 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 3078e60..4f645f3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2302,24 +2302,47 @@ static const TypeInfo spapr_machine_info = {
  },
  };

+#define DEFINE_SPAPR_MACHINE(suffix, verstr) \
+static void spapr_machine_##suffix##_class_init(ObjectClass *oc, \
+void *data)  \
+{\
+MachineClass *mc = MACHINE_CLASS(oc);\
+spapr_machine_##suffix##_class_options(mc);  \
+}\
+static void spapr_machine_##suffix##_instance_init(Object *obj)  \
+{\
+MachineState *machine = MACHINE(obj);\
+spapr_machine_##suffix##_instance_options(machine);  \
+}\
+static const TypeInfo spapr_machine_##suffix##_info = {  \
+.name = MACHINE_TYPE_NAME("pseries-" verstr),\
+.parent = TYPE_SPAPR_MACHINE,\
+.class_init = spapr_machine_##suffix##_class_init,   \
+.instance_init = spapr_machine_##suffix##_instance_init, \
+};   \
+static void spapr_machine_register_##suffix(void)\
+{\
+type_register(_machine_##suffix##_info);   \
+}\
+machine_init(spapr_machine_register_##suffix)
+
  /*
   * pseries-2.5
   */
-static void spapr_machine_2_5_class_init(ObjectClass *oc, void *data)
+static void spapr_machine_2_5_instance_options(MachineState *machine)
  {
-MachineClass *mc = MACHINE_CLASS(oc);
-sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
+}
+
+static void spapr_machine_2_5_class_options(MachineClass *mc)
+{
+sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);

  mc->alias = "pseries";
  mc->is_default = 1;
  smc->dr_lmb_enabled = true;
  }

-static const TypeInfo spapr_machine_2_5_info = {
-.name  = MACHINE_TYPE_NAME("pseries-2.5"),
-.parent= TYPE_SPAPR_MACHINE,
-.class_init= spapr_machine_2_5_class_init,
-};
+DEFINE_SPAPR_MACHINE(2_5, "2.5");

  /*
   * pseries-2.4
@@ -2327,18 +2350,17 @@ static const TypeInfo spapr_machine_2_5_info = {
  #define SPAPR_COMPAT_2_4 \
  HW_COMPAT_2_4

-static void spapr_machine_2_4_class_init(ObjectClass *oc, void *data)
+static void spapr_machine_2_4_instance_options(MachineState *machine)
  {
-MachineClass *mc = MACHINE_CLASS(oc);
+spapr_machine_2_5_instance_options(machine);
+}

+static void spapr_machine_2_4_class_options(MachineClass *mc)
+{
  SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_4);
  }

-static const TypeInfo spapr_machine_2_4_info = {
-.name  = MACHINE_TYPE_NAME("pseries-2.4"),
-.parent= TYPE_SPAPR_MACHINE,
-.class_init= spapr_machine_2_4_class_init,
-};
+DEFINE_SPAPR_MACHINE(2_4, "2.4");

  /*
   * pseries-2.3
@@ -2352,30 +2374,18 @@ static const TypeInfo spapr_machine_2_4_info = {
  .value= "off",\
  },

-static void spapr_compat_2_3(Object *obj)
+static void spapr_machine_2_3_instance_options(MachineState *machine)
  {
+spapr_machine_2_4_instance_options(machine);
  savevm_skip_section_footers();
  global_state_set_optional();
  }

-static void spapr_machine_2_3_instance_init(Object *obj)
-{
-spapr_compat_2_3(obj);
-}
-
-static void spapr_machine_2_3_class_init(ObjectClass *oc, void *data)
+static void spapr_machine_2_3_class_options(MachineClass *mc)
  {
-MachineClass *mc = MACHINE_CLASS(oc);
-
  SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_3);
  }
-
-static const 

Re: [Qemu-devel] [PATCHv2 09/10] pseries: Improve setting of default machine version

2015-12-08 Thread Alexey Kardashevskiy

On 12/07/2015 02:34 PM, David Gibson wrote:

This tweaks the way the default machine version is controlled, so that
there will be a bit less churn when each new version is introduced.

Signed-off-by: David Gibson 
---
  hw/ppc/spapr.c | 20 ++--
  1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 5af3d13..8b8eb18 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2308,12 +2308,16 @@ static const TypeInfo spapr_machine_info = {
  },
  };

-#define DEFINE_SPAPR_MACHINE(suffix, verstr) \
+#define DEFINE_SPAPR_MACHINE(suffix, verstr, latest) \
  static void spapr_machine_##suffix##_class_init(ObjectClass *oc, \
  void *data)  \
  {\
  MachineClass *mc = MACHINE_CLASS(oc);\
  spapr_machine_##suffix##_class_options(mc);  \
+if (latest) {\
+mc->alias = "pseries";   \
+mc->is_default = 1;  \
+}\
  }\
  static void spapr_machine_##suffix##_instance_init(Object *obj)  \
  {\
@@ -2342,11 +2346,9 @@ static void 
spapr_machine_2_5_instance_options(MachineState *machine)
  static void spapr_machine_2_5_class_options(MachineClass *mc)
  {
  /* Defaults for the latest behaviour inherited from the base class */
-mc->alias = "pseries";
-mc->is_default = 1;
  }

-DEFINE_SPAPR_MACHINE(2_5, "2.5");
+DEFINE_SPAPR_MACHINE(2_5, "2.5", true);

  /*
   * pseries-2.4
@@ -2364,13 +2366,11 @@ static void 
spapr_machine_2_4_class_options(MachineClass *mc)
  sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);

  spapr_machine_2_5_class_options(mc);
-mc->alias = NULL;
-mc->is_default = 0;
  smc->dr_lmb_enabled = false;
  SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_4);
  }

-DEFINE_SPAPR_MACHINE(2_4, "2.4");
+DEFINE_SPAPR_MACHINE(2_4, "2.4", false);

  /*
   * pseries-2.3
@@ -2396,7 +2396,7 @@ static void spapr_machine_2_3_class_options(MachineClass 
*mc)
  spapr_machine_2_4_class_options(mc);
  SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_3);
  }
-DEFINE_SPAPR_MACHINE(2_3, "2.3");
+DEFINE_SPAPR_MACHINE(2_3, "2.3", false);

  /*
   * pseries-2.2
@@ -2421,7 +2421,7 @@ static void spapr_machine_2_2_class_options(MachineClass 
*mc)
  spapr_machine_2_3_class_options(mc);
  SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_2);
  }
-DEFINE_SPAPR_MACHINE(2_2, "2.2");
+DEFINE_SPAPR_MACHINE(2_2, "2.2", false);

  /*
   * pseries-2.1
@@ -2440,7 +2440,7 @@ static void spapr_machine_2_1_class_options(MachineClass 
*mc)
  spapr_machine_2_2_class_options(mc);
  SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_1);
  }
-DEFINE_SPAPR_MACHINE(2_1, "2.1");
+DEFINE_SPAPR_MACHINE(2_1, "2.1", false);

  static void spapr_machine_register_types(void)
  {



Reviewed-by: Alexey Kardashevskiy 

--
Alexey



Re: [Qemu-devel] [PATCH 4/8] bcm2835_emmc: add bcm2835 MMC/SD controller

2015-12-08 Thread Andrew Baumann
> From: Peter Crosthwaite [mailto:crosthwaitepe...@gmail.com]
> Sent: Saturday, 5 December 2015 21:26
> Is this IP just SDHCI? We already model SDHCI in QEMU, see
> hw/sd/sdhci.c. If there are RPi specific features to the SDHCI
> implementation they should be added as optional extensions (probabably
> via subclassing) to the existing SDHCI model.

So yes, it turns out this is fairly similar to SDHCI (-> lots of wasted work by 
Gregory and me, sigh), and indeed Linux boots with the existing sdhci 
emulation. However, there are some quirks, and UEFI/Windows depend on them. 
Namely:
 * The host control registers (offset 0x28 and above) seem to differ 
significantly. Maybe this is due to the SDHC version -- according to the 
BCM2835 peripherals spec, the controller implements "Version 3.0 Draft 1.0" of 
the SDHC spec, but of course I can't find that spec online anywhere. Luckily 
nothing seems to depend on this, besides a few spurious warnings about invalid 
writes.
 * Power is assumed to be always on -- the sdhci model requires the guest to 
turn it on by a write at offset 0x29 before issuing any commands, but on pi 
this bit is marked reserved, and commands are issued immediately after reset.
 * The card inserted interrupt is rather broken on pi: it is set at the start 
of day, but a reset command clears it and it stays clear thereafter (and never 
generates interrupts).

There's an inconsistency with response handling, too, although I'm not sure if 
it's a quirk of the Pi or a general bug in sdhci. Pi UEFI sends a CMD23 without 
setting any of the response bits, but this command does in fact generate a 
4-byte R1 response. The question is whether this should be treated as an error, 
or whether it simply means that the host wants to ignore the response. In 
sdhci, the following code path (around line 246) raises a "command index" error 
in the case that a non-zero response is returned but no response bits were set 
in the command register:

} else if (rlen != 0 && (s->errintstsen & SDHC_EISEN_CMDIDX)) {
s->errintsts |= SDHC_EIS_CMDIDX;
s->norintsts |= SDHC_NIS_ERR;
}

I do not observe this behaviour on the real Pi2 (and it breaks UEFI). The 
hardware semantics appear to be "if the command generates a response, but you 
didn't want to see it, we'll successfully complete the command and ignore the 
response", whereas the sdhci implementation raises an error for this as well as 
signalling completion. I have read the "SD Specifications Part A2 SD Host 
Controller Simplified Specification Version 2.00", but did not find anything 
describing this case, so it could be that this is open to interpretation. (It 
could also be specified in SDHC v3.) The specific error also seems odd -- my 
understanding is that a "command index" error means that the index in the 
response didn't match the index of the issued command, but that's hardly what 
is happening here.

Assuming this latter bug can be fixed generically, how do you propose handling 
the Pi quirks? I could add a bool property for "bcm2835-quirks" or similar and 
just special-case the relevant code (my preferred approach). I'm also open to 
subclassing, but no idea how that would work in practice, so would need some 
pointers.

Thanks,
Andrew


[Qemu-devel] [PATCH v6 01/11] dump-guest-memory: cleanup: removing dump_{error|cleanup}().

2015-12-08 Thread Peter Xu
It might be a little bit confusing and error prone to do
dump_cleanup() in these two functions. A better way is to do
dump_cleanup() before dump finish, no matter whether dump has
succeeded or not.

Signed-off-by: Peter Xu 
Reviewed-by: Fam Zheng 
---
 dump.c | 78 +++---
 1 file changed, 32 insertions(+), 46 deletions(-)

diff --git a/dump.c b/dump.c
index 78b7d84..445e739 100644
--- a/dump.c
+++ b/dump.c
@@ -82,12 +82,6 @@ static int dump_cleanup(DumpState *s)
 return 0;
 }
 
-static void dump_error(DumpState *s, const char *reason, Error **errp)
-{
-dump_cleanup(s);
-error_setg(errp, "%s", reason);
-}
-
 static int fd_write_vmcore(const void *buf, size_t size, void *opaque)
 {
 DumpState *s = opaque;
@@ -128,7 +122,7 @@ static void write_elf64_header(DumpState *s, Error **errp)
 
 ret = fd_write_vmcore(_header, sizeof(elf_header), s);
 if (ret < 0) {
-dump_error(s, "dump: failed to write elf header", errp);
+error_setg(errp, "dump: failed to write elf header");
 }
 }
 
@@ -159,7 +153,7 @@ static void write_elf32_header(DumpState *s, Error **errp)
 
 ret = fd_write_vmcore(_header, sizeof(elf_header), s);
 if (ret < 0) {
-dump_error(s, "dump: failed to write elf header", errp);
+error_setg(errp, "dump: failed to write elf header");
 }
 }
 
@@ -182,7 +176,7 @@ static void write_elf64_load(DumpState *s, MemoryMapping 
*memory_mapping,
 
 ret = fd_write_vmcore(, sizeof(Elf64_Phdr), s);
 if (ret < 0) {
-dump_error(s, "dump: failed to write program header table", errp);
+error_setg(errp, "dump: failed to write program header table");
 }
 }
 
@@ -205,7 +199,7 @@ static void write_elf32_load(DumpState *s, MemoryMapping 
*memory_mapping,
 
 ret = fd_write_vmcore(, sizeof(Elf32_Phdr), s);
 if (ret < 0) {
-dump_error(s, "dump: failed to write program header table", errp);
+error_setg(errp, "dump: failed to write program header table");
 }
 }
 
@@ -225,7 +219,7 @@ static void write_elf64_note(DumpState *s, Error **errp)
 
 ret = fd_write_vmcore(, sizeof(Elf64_Phdr), s);
 if (ret < 0) {
-dump_error(s, "dump: failed to write program header table", errp);
+error_setg(errp, "dump: failed to write program header table");
 }
 }
 
@@ -245,7 +239,7 @@ static void write_elf64_notes(WriteCoreDumpFunction f, 
DumpState *s,
 id = cpu_index(cpu);
 ret = cpu_write_elf64_note(f, cpu, id, s);
 if (ret < 0) {
-dump_error(s, "dump: failed to write elf notes", errp);
+error_setg(errp, "dump: failed to write elf notes");
 return;
 }
 }
@@ -253,7 +247,7 @@ static void write_elf64_notes(WriteCoreDumpFunction f, 
DumpState *s,
 CPU_FOREACH(cpu) {
 ret = cpu_write_elf64_qemunote(f, cpu, s);
 if (ret < 0) {
-dump_error(s, "dump: failed to write CPU status", errp);
+error_setg(errp, "dump: failed to write CPU status");
 return;
 }
 }
@@ -275,7 +269,7 @@ static void write_elf32_note(DumpState *s, Error **errp)
 
 ret = fd_write_vmcore(, sizeof(Elf32_Phdr), s);
 if (ret < 0) {
-dump_error(s, "dump: failed to write program header table", errp);
+error_setg(errp, "dump: failed to write program header table");
 }
 }
 
@@ -290,7 +284,7 @@ static void write_elf32_notes(WriteCoreDumpFunction f, 
DumpState *s,
 id = cpu_index(cpu);
 ret = cpu_write_elf32_note(f, cpu, id, s);
 if (ret < 0) {
-dump_error(s, "dump: failed to write elf notes", errp);
+error_setg(errp, "dump: failed to write elf notes");
 return;
 }
 }
@@ -298,7 +292,7 @@ static void write_elf32_notes(WriteCoreDumpFunction f, 
DumpState *s,
 CPU_FOREACH(cpu) {
 ret = cpu_write_elf32_qemunote(f, cpu, s);
 if (ret < 0) {
-dump_error(s, "dump: failed to write CPU status", errp);
+error_setg(errp, "dump: failed to write CPU status");
 return;
 }
 }
@@ -326,7 +320,7 @@ static void write_elf_section(DumpState *s, int type, Error 
**errp)
 
 ret = fd_write_vmcore(, shdr_size, s);
 if (ret < 0) {
-dump_error(s, "dump: failed to write section header table", errp);
+error_setg(errp, "dump: failed to write section header table");
 }
 }
 
@@ -336,7 +330,7 @@ static void write_data(DumpState *s, void *buf, int length, 
Error **errp)
 
 ret = fd_write_vmcore(buf, length, s);
 if (ret < 0) {
-dump_error(s, "dump: failed to save memory", errp);
+error_setg(errp, "dump: failed to save memory");
 }
 }
 
@@ -568,11 +562,6 @@ static void dump_begin(DumpState *s, Error **errp)
 }
 }
 
-static void dump_completed(DumpState *s)
-{
-dump_cleanup(s);
-}
-
 static int get_next_block(DumpState *s, 

[Qemu-devel] [PATCH v6 05/11] dump-guest-memory: introduce dump_process() helper function.

2015-12-08 Thread Peter Xu
No functional change. Cleanup only.

Signed-off-by: Peter Xu 
Reviewed-by: Fam Zheng 
---
 dump.c| 31 +--
 include/sysemu/dump.h |  3 +++
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/dump.c b/dump.c
index ccd56c8..f0ee9a8 100644
--- a/dump.c
+++ b/dump.c
@@ -1441,6 +1441,9 @@ static void dump_init(DumpState *s, int fd, bool 
has_format,
 Error *err = NULL;
 int ret;
 
+s->has_format = has_format;
+s->format = format;
+
 /* kdump-compressed is conflict with paging and filter */
 if (has_format && format != DUMP_GUEST_MEMORY_FORMAT_ELF) {
 assert(!paging && !has_filter);
@@ -1594,6 +1597,23 @@ cleanup:
 dump_cleanup(s);
 }
 
+/* this operation might be time consuming. */
+static void dump_process(DumpState *s, Error **errp)
+{
+Error *local_err = NULL;
+
+if (s->has_format && s->format != DUMP_GUEST_MEMORY_FORMAT_ELF) {
+create_kdump_vmcore(s, _err);
+} else {
+create_vmcore(s, _err);
+}
+
+s->status = (local_err ? DUMP_STATUS_FAILED : DUMP_STATUS_COMPLETED);
+error_propagate(errp, local_err);
+
+dump_cleanup(s);
+}
+
 void qmp_dump_guest_memory(bool paging, const char *file,
bool has_detach, bool detach,
bool has_begin, int64_t begin, bool has_length,
@@ -1679,16 +1699,7 @@ void qmp_dump_guest_memory(bool paging, const char *file,
 return;
 }
 
-if (has_format && format != DUMP_GUEST_MEMORY_FORMAT_ELF) {
-create_kdump_vmcore(s, _err);
-} else {
-create_vmcore(s, _err);
-}
-
-s->status = (local_err ? DUMP_STATUS_FAILED : DUMP_STATUS_COMPLETED);
-error_propagate(errp, local_err);
-
-dump_cleanup(s);
+dump_process(s, errp);
 }
 
 DumpGuestMemoryCapability *qmp_query_dump_guest_memory_capability(Error **errp)
diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h
index affef38..d6f4a9c 100644
--- a/include/sysemu/dump.h
+++ b/include/sysemu/dump.h
@@ -185,6 +185,9 @@ typedef struct DumpState {
 size_t num_dumpable;/* number of page that can be dumped */
 uint32_t flag_compress; /* indicate the compression format */
 DumpStatus status;  /* current dump status */
+
+bool has_format;  /* whether format is provided */
+DumpGuestMemoryFormat format; /* valid only if has_format == true */
 } DumpState;
 
 uint16_t cpu_to_dump16(DumpState *s, uint16_t val);
-- 
2.4.3




[Qemu-devel] [PATCH v6 10/11] Dump: add hmp command "info dump"

2015-12-08 Thread Peter Xu
It will calculate percentage of finished work from completed and
total.

Signed-off-by: Peter Xu 
---
 hmp-commands-info.hx | 14 ++
 hmp.c| 17 +
 hmp.h|  1 +
 3 files changed, 32 insertions(+)

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 9b71351..52539c3 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -786,6 +786,20 @@ STEXI
 Display the value of a storage key (s390 only)
 ETEXI
 
+{
+.name   = "dump",
+.args_type  = "",
+.params = "",
+.help   = "Display the latest dump status",
+.mhandler.cmd = hmp_info_dump,
+},
+
+STEXI
+@item info dump
+@findex dump
+Display the latest dump status.
+ETEXI
+
 STEXI
 @end table
 ETEXI
diff --git a/hmp.c b/hmp.c
index 1f4d0b6..c824064 100644
--- a/hmp.c
+++ b/hmp.c
@@ -2383,3 +2383,20 @@ void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict 
*qdict)
 
 qapi_free_RockerOfDpaGroupList(list);
 }
+
+void hmp_info_dump(Monitor *mon, const QDict *qdict)
+{
+DumpQueryResult *result = qmp_query_dump(NULL);
+
+assert(result->status < DUMP_STATUS_MAX);
+monitor_printf(mon, "Status: %s\n", DumpStatus_lookup[result->status]);
+
+if (result->status == DUMP_STATUS_ACTIVE) {
+float percent = 0;
+assert(result->total != 0);
+percent = 100.0 * result->completed / result->total;
+monitor_printf(mon, "Finished: %.2f %%\n", percent);
+}
+
+qapi_free_DumpQueryResult(result);
+}
diff --git a/hmp.h b/hmp.h
index a8c5b5a..093d65f 100644
--- a/hmp.h
+++ b/hmp.h
@@ -131,5 +131,6 @@ void hmp_rocker(Monitor *mon, const QDict *qdict);
 void hmp_rocker_ports(Monitor *mon, const QDict *qdict);
 void hmp_rocker_of_dpa_flows(Monitor *mon, const QDict *qdict);
 void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict);
+void hmp_info_dump(Monitor *mon, const QDict *qdict);
 
 #endif
-- 
2.4.3




[Qemu-devel] [PATCH v6 06/11] dump-guest-memory: disable dump when in INMIGRATE state

2015-12-08 Thread Peter Xu
Signed-off-by: Peter Xu 
Reviewed-by: Fam Zheng 
---
 dump.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/dump.c b/dump.c
index f0ee9a8..aa9d1f8 100644
--- a/dump.c
+++ b/dump.c
@@ -1625,6 +1625,11 @@ void qmp_dump_guest_memory(bool paging, const char *file,
 DumpState *s;
 Error *local_err = NULL;
 
+if (runstate_check(RUN_STATE_INMIGRATE)) {
+error_setg(errp, "Dump not allowed during incoming migration.");
+return;
+}
+
 /* if there is a dump in background, we should wait until the dump
  * finished */
 if (dump_in_progress()) {
-- 
2.4.3




[Qemu-devel] [PATCH v6 02/11] dump-guest-memory: add "detach" flag for QMP/HMP interfaces.

2015-12-08 Thread Peter Xu
This patch only adds the interfaces, but does not implement them.
"detach" parameter is made optional, to make sure that all the old
dump-guest-memory requests will still be able to work.

Signed-off-by: Peter Xu 
Reviewed-by: Fam Zheng 
---
 dump.c   | 5 +++--
 hmp-commands.hx  | 5 +++--
 hmp.c| 9 +++--
 qapi-schema.json | 8 ++--
 qmp-commands.hx  | 6 --
 5 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/dump.c b/dump.c
index 445e739..d79e0ed 100644
--- a/dump.c
+++ b/dump.c
@@ -1580,8 +1580,9 @@ cleanup:
 dump_cleanup(s);
 }
 
-void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
-   int64_t begin, bool has_length,
+void qmp_dump_guest_memory(bool paging, const char *file,
+   bool has_detach, bool detach,
+   bool has_begin, int64_t begin, bool has_length,
int64_t length, bool has_format,
DumpGuestMemoryFormat format, Error **errp)
 {
diff --git a/hmp-commands.hx b/hmp-commands.hx
index bb52e4d..664d794 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1056,10 +1056,11 @@ ETEXI
 
 {
 .name   = "dump-guest-memory",
-.args_type  = 
"paging:-p,zlib:-z,lzo:-l,snappy:-s,filename:F,begin:i?,length:i?",
-.params = "[-p] [-z|-l|-s] filename [begin length]",
+.args_type  = 
"paging:-p,detach:-d,zlib:-z,lzo:-l,snappy:-s,filename:F,begin:i?,length:i?",
+.params = "[-p] [-d] [-z|-l|-s] filename [begin length]",
 .help   = "dump guest memory into file 'filename'.\n\t\t\t"
   "-p: do paging to get guest's memory mapping.\n\t\t\t"
+  "-d: return immediately (do not wait for 
completion).\n\t\t\t"
   "-z: dump in kdump-compressed format, with zlib 
compression.\n\t\t\t"
   "-l: dump in kdump-compressed format, with lzo 
compression.\n\t\t\t"
   "-s: dump in kdump-compressed format, with snappy 
compression.\n\t\t\t"
diff --git a/hmp.c b/hmp.c
index 2140605..1f4d0b6 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1586,8 +1586,10 @@ void hmp_dump_guest_memory(Monitor *mon, const QDict 
*qdict)
 const char *file = qdict_get_str(qdict, "filename");
 bool has_begin = qdict_haskey(qdict, "begin");
 bool has_length = qdict_haskey(qdict, "length");
+bool has_detach = qdict_haskey(qdict, "detach");
 int64_t begin = 0;
 int64_t length = 0;
+bool detach = false;
 enum DumpGuestMemoryFormat dump_format = DUMP_GUEST_MEMORY_FORMAT_ELF;
 char *prot;
 
@@ -1615,11 +1617,14 @@ void hmp_dump_guest_memory(Monitor *mon, const QDict 
*qdict)
 if (has_length) {
 length = qdict_get_int(qdict, "length");
 }
+if (has_detach) {
+detach = qdict_get_bool(qdict, "detach");
+}
 
 prot = g_strconcat("file:", file, NULL);
 
-qmp_dump_guest_memory(paging, prot, has_begin, begin, has_length, length,
-  true, dump_format, );
+qmp_dump_guest_memory(paging, prot, true, detach, has_begin, begin,
+  has_length, length, true, dump_format, );
 hmp_handle_error(mon, );
 g_free(prot);
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index 8b1a423..97c3ac4 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2115,6 +2115,9 @@
 #2. fd: the protocol starts with "fd:", and the following string
 #   is the fd's name.
 #
+# @detach: #optional if true, QMP will return immediately rather than
+#  waiting for the dump to finish. (since 2.6).
+#
 # @begin: #optional if specified, the starting physical address.
 #
 # @length: #optional if specified, the memory size, in bytes. If you don't
@@ -2131,8 +2134,9 @@
 # Since: 1.2
 ##
 { 'command': 'dump-guest-memory',
-  'data': { 'paging': 'bool', 'protocol': 'str', '*begin': 'int',
-'*length': 'int', '*format': 'DumpGuestMemoryFormat' } }
+  'data': { 'paging': 'bool', 'protocol': 'str', '*detach': 'bool',
+'*begin': 'int', '*length': 'int',
+'*format': 'DumpGuestMemoryFormat'} }
 
 ##
 # @DumpGuestMemoryCapability:
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 9d8b42f..6b51585 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -840,8 +840,8 @@ EQMP
 
 {
 .name   = "dump-guest-memory",
-.args_type  = "paging:b,protocol:s,begin:i?,end:i?,format:s?",
-.params = "-p protocol [begin] [length] [format]",
+.args_type  = 
"paging:b,protocol:s,detach:b?,begin:i?,end:i?,format:s?",
+.params = "-p protocol [-d] [begin] [length] [format]",
 .help   = "dump guest memory to file",
 .mhandler.cmd_new = qmp_marshal_dump_guest_memory,
 },
@@ -857,6 +857,8 @@ Arguments:
 - "paging": do paging to get guest's memory mapping (json-bool)
 - "protocol": 

Re: [Qemu-devel] tcg: booting Windows on arm

2015-12-08 Thread Ya Ho
Hi all,
I am trying to boot Windows 8 (x86) on arm host using qemu dynamic
translation.
It is not successfull but Windows xp boots fine.
Any suggest for this issue?

On Wednesday, December 9, 2015,  wrote:

> Send Qemu-devel mailing list submissions to
> qemu-devel@nongnu.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.nongnu.org/mailman/listinfo/qemu-devel
> or, via email, send a message with subject or body 'help' to
> qemu-devel-requ...@nongnu.org
>
> You can reach the person managing the list at
> qemu-devel-ow...@nongnu.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Qemu-devel digest..."
>
>
> Today's Topics:
>
>1. Re: [PATCH for-2.5] sparc: allow CASA with ASI 0xafrom user
>   space (Peter Maydell)
>2. [Bug 1308341] Re: Multiple CPUs causes blue screen on Windows
>   guest (14.04 regression) (Cristian Aires)
>3. Re: [PATCH] vfio: Align iova also to IOMMU page size
>   (Alex Williamson)
>4. Re: tcg: improve MAX_CODE_GEN_BUFFER_SIZE for arm (TeLeMan)
>5. [PATCH] xen_pt: fix failure of attaching & detaching aPCI
>   device to VM repeatedly (Jianzhong,Chang)
>6. Re: [PATCH] virtio-blk: Drop x-data-plane option (Stefan Hajnoczi)
>7. [PATCH v6 00/11] Add basic "detach" support for
>   dump-guest-memory (Peter Xu)
>8. [PATCH v6 01/11] dump-guest-memory: cleanup: removing
>   dump_{error|cleanup}(). (Peter Xu)
>9. [PATCH v6 02/11] dump-guest-memory: add "detach" flag for
>   QMP/HMP interfaces. (Peter Xu)
>   10. [PATCH v6 03/11] dump-guest-memory: using static  DumpState,
>   add DumpStatus (Peter Xu)
>   11. [PATCH v6 04/11] dump-guest-memory: add   dump_in_progress()
>   helper function (Peter Xu)
>   12. [PATCH v6 05/11] dump-guest-memory: introduce dump_process()
>   helper function. (Peter Xu)
>   13. [PATCH v6 06/11] dump-guest-memory: disable dump when in
>   INMIGRATE state (Peter Xu)
>   14. [PATCH v6 07/11] dump-guest-memory: add "detach"  support
>   (Peter Xu)
>   15. [PATCH v6 08/11] DumpState: adding total_size and
>   written_size fields (Peter Xu)
>   16. [PATCH v6 09/11] Dump: add qmp command "query-dump" (Peter Xu)
>   17. [PATCH v6 10/11] Dump: add hmp command "info dump" (Peter Xu)
>   18. [PATCH v6 11/11] dump-guest-memory: add qmp event
>   DUMP_COMPLETED (Peter Xu)
>   19. Re: [Qemu-ppc] [PATCHv2 07/10] pseries:   DEFINE_SPAPR_MACHINE
>   (Alexey Kardashevskiy)
>   20. Re: [PATCHv2 01/10] pseries: Remove redundant setting of
>   mc->name for pseries-2.5 machine (Alexey Kardashevskiy)
>
>
> --
>
> Message: 1
> Date: Tue, 8 Dec 2015 21:28:49 +
> From: Peter Maydell 
> To: Richard Henderson 
> Cc: Alex Zuepke , Mark Cave-Ayland
> , QEMU Developers
> , Fabien Chouteau  >
> Subject: Re: [Qemu-devel] [PATCH for-2.5] sparc: allow CASA with ASI
> 0xa from user space
> Message-ID:
> 

  1   2   >