Re: [Qemu-devel] [PATCH v7 14/31] qapi: Drop unused error argument for list and implicit struct
On Mon, Dec 07, 2015 at 08:55:04PM -0700, Eric Blake wrote: > No backend was setting an error when ending the visit of a list > or implicit struct. Make the callers a bit easier to follow by > making this a part of the contract, and removing the errp > argument - callers can then unconditionally end an object as > part of cleanup without having to think about whether a second > error is dominated by a first, because there is no second error. > > A later patch will then tackle the larger task of splitting > visit_end_struct(), which can indeed set an error. > > Signed-off-by: Eric BlakeFor spapr parts: Acked-by: David Gibson -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH v7 13/31] qapi: Drop unused 'kind' for struct/enum visit
On Mon, Dec 07, 2015 at 08:55:03PM -0700, Eric Blake wrote: > visit_start_struct() and visit_type_enum() had a 'kind' argument > that was usually set to either the stringized version of the > corresponding qapi type name, or to NULL (although some clients > didn't even get that right). But nothing ever used the argument. > It's even hard to argue that it would be useful in a debugger, > as a stack backtrace also tells which type is being visited. > > Therefore, drop the 'kind' argument as dead. While at it, change > the signature of visit_start_struct() to place the 'name' > argument at the end (other than 'errp'), and the 'size' argument > next to 'obj'; this placement of 'name' matches matches how all > other functions in visit.h do it (visit_type_enum() places > 'strings' between 'obj' and 'name'; visit_get_next_type() places > 'promote_int' between 'type' and 'name'). This also avoids the > confusion caused by splitting related pieces of information, > where the old signature an unrelated parameter in between the > "typename" and sizeof(typename) arguments. > > Signed-off-by: Eric BlakeFor spapr parts: Acked-by: David Gibson -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [ANNOUNCE] QEMU 2.5.0-rc3 is now available
On 8 December 2015 at 00:16, Michael Rothwrote: > Hello, > > On behalf of the QEMU Team, I'd like to announce the availability of the > fourth release candidate for the QEMU 2.5 release. This release is meant > for testing purposes and should not be used in a production environment. > > http://wiki.qemu.org/download/qemu-2.5.0-rc3.tar.bz2 > > You can help improve the quality of the QEMU 2.5 release by testing this > release and reporting bugs on Launchpad: > > https://bugs.launchpad.net/qemu/ > > The release plan for the 2.5 release is available at: > > http://wiki.qemu.org/Planning/2.5 My plan here is for us to fix only any absolutely showstopper bugs between here and release, so ideally an rc4 (if needed) at the end of the week and final release same-as-rc4 middle of the week after that. Please make sure any such showstoppers are listed in the Planning page on the wiki so we don't overlook them. thanks -- PMM
Re: [Qemu-devel] [Qemu-block] [PATCH for-2.5?] qcow2: always initialize specific image info
Am 07.12.2015 um 20:44 hat Max Reitz geschrieben: > I'd be completely fine with adding an "else { abort(); }" branch to > qcow2_get_specific_info(). This is actually what I was going to suggest, too. Of course, it's not supposed to fix anything now, but defensive coding has never hurt. Kevin pgp4xqUwG7kEK.pgp Description: PGP signature
[Qemu-devel] QEMU/KVM performance gets worser - high load - high interrupts - high context switches
Hello, Yesterday I looked at my munin statistics on my KVM host and I swar that performance gets worser: load is getting higher, interrupts are getting higher and are high as well as context switches. VMs and applications didn't change that way. You can find graphics at: http://www.wiesinger.com/tmp/kvm/ Last spike I guess was upgrade from FC22 to FC23 or a kernel update. And it was even lower on older versions For me it looks like the high interrupt load and context switches are the root cause. Interrupts inside the VM are <100, so with 10 VMs I'm expecting 1000+baseload => <2000, see statistics below. All VMs are virtio on disk/network except one (IDE/rtl8139). # Host as well as all guests (except 2 VMs): uname -a Linux kvm 4.2.6-301.fc23.x86_64 #1 SMP Fri Nov 20 22:22:41 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux qemu-system-x86-2.4.1-1.fc23.x86_64 Platform: All VMs have the pc-i440fx-2.4 profile (I upgraded yesterday from pc-i440fx-2.3 without any change). Any ideas, anyone having same issues? Ciao, Gerhard kvm: no VM running r b swpd free buff cache si sobibo in cs us sy id wa st 0 0 0 3308516 102408 379856800 012 197 679 0 0 99 0 0 0 0 0 3308516 102416 379856400 042 197 914 0 0 99 1 0 0 0 0 3308516 102416 379856800 0 0 190 791 0 0 100 0 0 2 0 0 3308484 102416 379856800 0 0 129 440 0 0 100 0 0 kvm: 2 VMs running procs ---memory-- ---swap-- -io -system-- --cpu- r b swpd free buff cache si sobibo in cs us sy id wa st 1 0 0 2641464 103052 381470000 0 0 2715 5648 3 2 95 0 0 0 0 0 2641340 103052 381470000 0 0 2601 1 2 97 0 0 1 0 0 2641308 103052 381470000 0 5 2687 5708 3 2 95 0 0 0 0 0 2640620 103060 381462800 030 2779 5756 4 3 93 1 0 0 0 0 2640644 103060 381463600 0 0 2436 5364 1 2 97 0 0 1 0 0 2640520 103060 381463600 0 119 2734 5975 3 2 95 0 0 kvm: all 10 VMs running procs ---memory-- ---swap-- -io -system-- --cpu- r b swpd free buff cache si sobibo in cs us sy id wa st 1 0 0 60408 78892 337198400 085 9015 17357 4 9 87 0 0 2 0 0 60408 78892 337196800 047 9375 17797 9 9 82 0 0 0 0 0 60472 78892 3372092004060 8882 17343 4 8 86 1 0 1 0 0 60316 78892 337208000 059 8863 17517 4 8 87 0 0 0 0 0 59540 78900 337209200 055 9135 17796 8 9 81 1 0 0 0 0 59168 78900 337211200 051 8931 17484 4 9 87 0 0 cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Core(TM)2 Quad CPU @ 2.66GHz stepping: 7
Re: [Qemu-devel] [PATCH] virtio-blk: Drop x-data-plane option
On Tue, 8 Dec 2015 09:56:14 +0800 Fam Zhengwrote: > On Mon, 12/07 21:02, Fam Zheng wrote: > > On Mon, 12/07 12:29, Cornelia Huck wrote: > > > No general objection to removing x-data-plane; but this probably wants > > > a mention on the changelog as x-data-plane has been described in > > > various howtos etc. over the years. > > Add a changelog line, > > http://wiki.qemu.org/ChangeLog/2.5#Block_devices_and_tools > > please review. Looks sane, although I'd probably add a line to "Incompatible changes" as well.
Re: [Qemu-devel] [PATCH v7 31/31] RFC: qapi: Adjust layout of FooList types
On Mon, Dec 07, 2015 at 08:55:21PM -0700, Eric Blake wrote: > By sticking the next pointer first, we don't need a union with > 64-bit padding for smaller types. On 32-bit platforms, this > can reduce the size of uint8List from 16 bytes (or 12, depending > on whether 64-bit ints can tolerate 4-byte alignment) down to 8. > It has no effect on 64-bit platforms (where alignment still > dictates a 16-byte struct); but fewer anonymous unions is still > a win in my book. > > However, this requires visit_start_list() and visit_next_list() > to gain a size parameter, to know what size element to allocate. > > I debated about going one step further, to allow for fewer casts, > by doing: > typedef GenericList GenericList; > struct GenericList { > GenericList *next; > }; > struct FooList { > GenericList base; > Foo value; > }; > so that you convert to 'GenericList *' by '>base', and > back by 'container_of(generic, GenericList, base)' (as opposed to > the existing '(GenericList *)foolist' and '(FooList *)generic'). > But doing that would require hoisting the declaration of > GenericList prior to inclusion of qapi-types.h, rather than its > current spot in visitor.h; it also makes iteration a bit more > verbose through 'foolist->base.next' instead of 'foolist->next'. > > Signed-off-by: Eric BlakeFor the spapr change Acked-by: David Gibson -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH v7 20/31] spapr_drc: Expose 'null' in qom-get when there is no fdt
On Mon, Dec 07, 2015 at 08:55:10PM -0700, Eric Blake wrote: > Now that the QMP output visitor supports an explicit null > output, we should utilize it to make it easier to diagnose > the difference between a missing fdt vs. a present-but-empty > one. > > (Note that this reverts the behavior of commit ab8bf1d, taking > us back to the behavior of commit 1d10b44; but that this time, > the change is intentional and not an accidental side-effect.) > > Signed-off-by: Eric Blake> Cc: David Gibson Acked-by: David Gibson > > --- > v7: new patch, based on discussion about spapr_drc.c > --- > hw/ppc/spapr_drc.c | 6 +- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c > index dcce563..0c675ff 100644 > --- a/hw/ppc/spapr_drc.c > +++ b/hw/ppc/spapr_drc.c > @@ -259,11 +259,7 @@ static void prop_get_fdt(Object *obj, Visitor *v, void > *opaque, > void *fdt; > > if (!drc->fdt) { > -visit_start_struct(v, NULL, 0, name, ); > -if (!err) { > -visit_end_struct(v, ); > -} > -error_propagate(errp, err); > +visit_type_null(v, NULL, errp); > return; > } > -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH v7 28/31] qapi: Split visit_end_struct() into pieces
On Mon, Dec 07, 2015 at 08:55:18PM -0700, Eric Blake wrote: > As mentioned in previous patches, we want to call visit_end_struct() > functions unconditionally, so that visitors can release resources > tied up since the matching visit_start_struct() without also having > to worry about error priority if more than one error occurs. > > Even though error_propagate() can be safely used to ignore a second > error during cleanup caused by a first error, it is simpler if the > cleanup cannot set an error, and we instead split the task of > checking that an input visitor has no unvisited input as a new > function visit_check_struct(), called only if all prior steps are > successful. > > Generated code has diffs resembling: > > |@@ -59,10 +59,12 @@ void visit_type_ACPIOSTInfo(Visitor *v, > | goto out_obj; > | } > | visit_type_ACPIOSTInfo_fields(v, obj, ); > |+if (err) { > |+goto out_obj; > |+} > |+visit_check_struct(v, ); > | out_obj: > |-error_propagate(errp, err); > |-err = NULL; > |-visit_end_struct(v, ); > |+visit_end_struct(v); > | out: > > Signed-off-by: Eric BlakeFor spapr parts: Acked-by: David Gibson -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH v7 29/31] qapi: Simplify semantics of visit_next_list()
On Mon, Dec 07, 2015 at 08:55:19PM -0700, Eric Blake wrote: > We have two uses of list visits in the entire code base: one in > spapr_drc (which completely avoids visit_next_list(), feeding in > integers from a different source than uint8List), and one in > qapi-visit.py (that is, all other list visitors are generated > in qapi-visit.c, and share the same paradigm based on a qapi > FooList type). What's more, the semantics of the list visit are > somewhat baroque, with the following pseudocode when FooList is > used: > > start() > prev = head > while (cur = next(prev)) { > visit(cur) > prev = > } > > Note that these semantics (advance before visit) requires that > the first call to next() return the list head, while all other > calls return the next element of the list; that is, every visitor > implementation is required to track extra state to decide whether > to return the input as-is, or to advance. It also requires an > argument of 'GenericList **' to next(), solely because the first > iteration might need to modify the caller's GenericList head, so > that all other calls have to do a layer of dereferencing. > > We can greatly simplify things by hoisting the special case > into the start() routine, and flipping the order in the loop > to visit before advance: > > start(head) > element = *head > while (element) { > visit(element) > element = next(element) > } > > With the simpler semantics, visitors have less state to track, > the argument to next() is reduced to 'GenericList *', and it > also becomes obvious whether an input visitor is allocating a > FooList during visit_start_list() (rather than the old way of > not knowing if an allocation happened until the first > visit_next_list()). > > The spapr_drc case requires that visit_start_list() has to pay > attention to whether visit_next_list() will even be used to > visit a FooList qapi struct; this is done by passing NULL for > list, similarly to how NULL is passed to visit_start_struct() > when a qapi type is not used in those visits. It was easy to > provide these semantics for qmp-output and dealloc visitors, > and a bit harder for qmp-input (it required hoisting the > advance of the current qlist entry out of qmp_input_next_list() > into qmp_input_get_object()). But it turned out that the > string and opts visitors munge enough state during > visit_next_list() to make those conversions simpler if they > require a GenericList visit for now; an assertion will remind > us to adjust things if we need the semantics in the future. > > Signed-off-by: Eric BlakeFor the spapr change: Acked-by: David Gibson -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH v9 6/6] tests/guest-debug: introduce basic gdbstub tests
Peter Maydellwrites: > On 12 November 2015 at 16:20, Alex Bennée wrote: >> From: Alex Bennée >> >> The aim of these tests is to combine with an appropriate kernel >> image (with symbol-file vmlinux) and check it behaves as it should. >> Given a kernel it checks: >> >> - single step >> - software breakpoint >> - hardware breakpoint >> - access, read and write watchpoints >> >> On success it returns 0 to the calling process. >> >> I've not plumbed this into the "make check" logic though as we need a >> solution for providing non-host binaries to the tests. However the test >> is structured to work with pretty much any Linux kernel image as it >> uses the basic kernel_init code which is common across architectures. > > Do these tests pass if you run them on the TCG QEMU, just out > of interest? You'll be glad to know they do. > I'm not a great fan of tests that aren't in 'make check' > because IME they just bitrot, but as you say we have no > sensible approach for handling tests that need to run real > guest code :-( I was pondering if a git sub-project with large file support would work. We could add pre-built binaries to the tree with appropriate meta-data (src tree, version, config) to rebuild if required. There would be some degree of trust implied in the original builder though. Maybe a signed commit? > > thanks > -- PMM -- Alex Bennée
[Qemu-devel] [v3 1/3] cutils: add avx2 instruction optimization
buffer_find_nonzero_offset() is a hot function during live migration. Now it use SSE2 intructions for optimization. For platform supports AVX2 instructions, use the AVX2 instructions for optimization can help to improve the performance about 30% comparing to SSE2. Zero page check can be faster with this optimization, the test result shows that for an 8GB RAM idle guest, this patch can help to shorten the total live migration time about 6%. This patch use the ifunc mechanism to select the proper function when running, for platform supports AVX2, excute the AVX2 instructions, else, excute the original code. Signed-off-by: Liang Li--- include/qemu-common.h | 13 +- util/Makefile.objs | 2 ++ util/buffer-zero-avx2.c | 54 util/cutils.c | 65 +++-- 4 files changed, 125 insertions(+), 9 deletions(-) create mode 100644 util/buffer-zero-avx2.c diff --git a/include/qemu-common.h b/include/qemu-common.h index 405364f..be8ba79 100644 --- a/include/qemu-common.h +++ b/include/qemu-common.h @@ -484,15 +484,14 @@ void qemu_hexdump(const char *buf, FILE *fp, const char *prefix, size_t size); #endif #define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8 -static inline bool -can_use_buffer_find_nonzero_offset(const void *buf, size_t len) -{ -return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR - * sizeof(VECTYPE)) == 0 -&& ((uintptr_t) buf) % sizeof(VECTYPE) == 0); -} +bool can_use_buffer_find_nonzero_offset(const void *buf, size_t len); size_t buffer_find_nonzero_offset(const void *buf, size_t len); +#if defined CONFIG_IFUNC && defined CONFIG_AVX2 +bool can_use_buffer_find_nonzero_offset_avx2(const void *buf, size_t len); +size_t buffer_find_nonzero_offset_avx2(const void *buf, size_t len); +#endif + /* * helper to parse debug environment variables */ diff --git a/util/Makefile.objs b/util/Makefile.objs index 89dd80e..a130b35 100644 --- a/util/Makefile.objs +++ b/util/Makefile.objs @@ -1,4 +1,5 @@ util-obj-y = osdep.o cutils.o unicode.o qemu-timer-common.o +util-obj-$(CONFIG_AVX2) += buffer-zero-avx2.o util-obj-$(CONFIG_POSIX) += compatfd.o util-obj-$(CONFIG_POSIX) += event_notifier-posix.o util-obj-$(CONFIG_POSIX) += mmap-alloc.o @@ -30,3 +31,4 @@ util-obj-y += qemu-coroutine-sleep.o util-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o util-obj-y += buffer.o util-obj-y += timed-average.o +buffer-zero-avx2.o-cflags := $(AVX2_CFLAGS) diff --git a/util/buffer-zero-avx2.c b/util/buffer-zero-avx2.c new file mode 100644 index 000..b9da0e3 --- /dev/null +++ b/util/buffer-zero-avx2.c @@ -0,0 +1,54 @@ +#include "qemu-common.h" + +#if defined CONFIG_IFUNC && defined CONFIG_AVX2 +#include +#define AVX2_VECTYPE__m256i +#define AVX2_SPLAT(p) _mm256_set1_epi8(*(p)) +#define AVX2_ALL_EQ(v1, v2) \ +(_mm256_movemask_epi8(_mm256_cmpeq_epi8(v1, v2)) == 0x) +#define AVX2_VEC_OR(v1, v2) (_mm256_or_si256(v1, v2)) + +inline bool +can_use_buffer_find_nonzero_offset_avx2(const void *buf, size_t len) +{ +return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR + * sizeof(AVX2_VECTYPE)) == 0 +&& ((uintptr_t) buf) % sizeof(AVX2_VECTYPE) == 0); +} + +size_t buffer_find_nonzero_offset_avx2(const void *buf, size_t len) +{ +const AVX2_VECTYPE *p = buf; +const AVX2_VECTYPE zero = (AVX2_VECTYPE){0}; +size_t i; + +assert(can_use_buffer_find_nonzero_offset_avx2(buf, len)); + +if (!len) { +return 0; +} + +for (i = 0; i < BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR; i++) { +if (!AVX2_ALL_EQ(p[i], zero)) { +return i * sizeof(AVX2_VECTYPE); +} +} + +for (i = BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR; + i < len / sizeof(AVX2_VECTYPE); + i += BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR) { +AVX2_VECTYPE tmp0 = AVX2_VEC_OR(p[i + 0], p[i + 1]); +AVX2_VECTYPE tmp1 = AVX2_VEC_OR(p[i + 2], p[i + 3]); +AVX2_VECTYPE tmp2 = AVX2_VEC_OR(p[i + 4], p[i + 5]); +AVX2_VECTYPE tmp3 = AVX2_VEC_OR(p[i + 6], p[i + 7]); +AVX2_VECTYPE tmp01 = AVX2_VEC_OR(tmp0, tmp1); +AVX2_VECTYPE tmp23 = AVX2_VEC_OR(tmp2, tmp3); +if (!AVX2_ALL_EQ(AVX2_VEC_OR(tmp01, tmp23), zero)) { +break; +} +} + +return i * sizeof(AVX2_VECTYPE); +} + +#endif diff --git a/util/cutils.c b/util/cutils.c index cfeb848..3631c02 100644 --- a/util/cutils.c +++ b/util/cutils.c @@ -26,6 +26,7 @@ #include #include #include +#include #include "qemu/sockets.h" #include "qemu/iov.h" @@ -161,6 +162,14 @@ int qemu_fdatasync(int fd) #endif } +static inline bool +can_use_buffer_find_nonzero_offset_inner(const void *buf, size_t len) +{ +return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR + * sizeof(VECTYPE)) == 0 +&& ((uintptr_t) buf) % sizeof(VECTYPE) ==
[Qemu-devel] [v3 3/3] configure: add options to config avx2
Add the '--enable-avx2' & '--disable-avx2' option so as to config the AVX2 instruction optimization. If '--disable-avx2' is not set, configure will detect if the compiler can support AVX2 option, if yes, AVX2 optimization is eabled, else disabled. Signed-off-by: Liang Li--- configure | 30 ++ 1 file changed, 30 insertions(+) diff --git a/configure b/configure index 394db3b..94e45fa 100755 --- a/configure +++ b/configure @@ -311,6 +311,7 @@ libusb="" usb_redir="" opengl="" ifunc="" +avx2="" zlib="yes" lzo="" snappy="" @@ -1063,6 +1064,10 @@ for opt do ;; --enable-usb-redir) usb_redir="yes" ;; + --disable-avx2) avx2="no" + ;; + --enable-avx2) avx2="yes" + ;; --disable-zlib-test) zlib="no" ;; --disable-lzo) lzo="no" @@ -1378,6 +1383,7 @@ disabled with --disable-FEATURE, default is enabled if available: smartcard smartcard support (libcacard) libusb libusb (for usb passthrough) usb-redir usb network redirection support + avx2support of avx2 instruction lzo support of lzo compression library snappy support of snappy compression library bzip2 support of bzip2 compression library @@ -1841,6 +1847,23 @@ else ifunc="no" fi + +# avx2 check + +if test "$avx2" != "no" ; then +cat > $TMPC << EOF +int main(void) { return 0; } +EOF +if compile_prog "" "-mavx2" ; then +avx2="yes" +else +if test "$avx2" = "yes" ; then +feature_not_found "avx2" "Your compiler don't support avx2" +fi +avx2="no" +fi +fi + # # zlib check @@ -4853,6 +4876,7 @@ echo "TPM passthrough $tpm_passthrough" echo "QOM debugging $qom_cast_debug" echo "vhdx $vhdx" echo "ifunc support $ifunc" +echo "avx2 support $avx2" echo "lzo support $lzo" echo "snappy support$snappy" echo "bzip2 support $bzip2" @@ -5241,6 +5265,12 @@ if test "$ifunc" = "yes" ; then echo "CONFIG_IFUNC=y" >> $config_host_mak fi +if test "$avx2" = "yes" ; then + avx2_cflags=" -mavx2" + echo "AVX2_CFLAGS=$avx2_cflags" >> $config_host_mak + echo "CONFIG_AVX2=y" >> $config_host_mak +fi + if test "$lzo" = "yes" ; then echo "CONFIG_LZO=y" >> $config_host_mak fi -- 1.9.1
[Qemu-devel] [v3 2/3] configure: detect ifunc attribute
Detect if the compiler can support the ifunc attribute, the avx2 optimization depends on ifunc attribute. Signed-off-by: Liang Li--- configure | 20 1 file changed, 20 insertions(+) diff --git a/configure b/configure index b9552fd..394db3b 100755 --- a/configure +++ b/configure @@ -310,6 +310,7 @@ smartcard="" libusb="" usb_redir="" opengl="" +ifunc="" zlib="yes" lzo="" snappy="" @@ -1827,6 +1828,20 @@ EOF fi ## +# ifunc check + +cat > $TMPC << EOF +static void bar(void) {} +static void foo(void) __attribute__((ifunc("bar"))); +int main(void) { foo(); return 0; } +EOF +if compile_prog "" "" ; then +ifunc="yes" +else +ifunc="no" +fi + +# # zlib check if test "$zlib" != "no" ; then @@ -4837,6 +4852,7 @@ echo "libssh2 support $libssh2" echo "TPM passthrough $tpm_passthrough" echo "QOM debugging $qom_cast_debug" echo "vhdx $vhdx" +echo "ifunc support $ifunc" echo "lzo support $lzo" echo "snappy support$snappy" echo "bzip2 support $bzip2" @@ -5221,6 +5237,10 @@ if test "$opengl" = "yes" ; then echo "OPENGL_LIBS=$opengl_libs" >> $config_host_mak fi +if test "$ifunc" = "yes" ; then + echo "CONFIG_IFUNC=y" >> $config_host_mak +fi + if test "$lzo" = "yes" ; then echo "CONFIG_LZO=y" >> $config_host_mak fi -- 1.9.1
Re: [Qemu-devel] tcg: improve MAX_CODE_GEN_BUFFER_SIZE for arm
Hello, On Tue, Dec 8, 2015 at 11:39 AM, Aurelien Jarnowrote: [...] > I already posted a patch a long time ago to remove the 16MB limit on ARM > hosts: > > http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg01684.html > > However as you can see in the thread, it has been rejected as it doesn't > not bring improvement in all cases. We could perhaps resurrect it and do some more benchmarking? Who would be able to do testing on (recent) ARM hardware? Laurent
Re: [Qemu-devel] [PATCH v9 3/6] target-arm: kvm - support for single step
Peter Maydellwrites: > On 12 November 2015 at 16:20, Alex Bennée wrote: >> This adds support for single-step. There isn't much to do on the QEMU >> side as after we set-up the request for single step via the debug ioctl >> it is all handled within the kernel. >> >> Signed-off-by: Alex Bennée >> >> --- >> v2 >> - convert to using HSR_EC >> v3 >> - use internals.h definitions >> --- >> target-arm/kvm.c | 10 ++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/target-arm/kvm.c b/target-arm/kvm.c >> index 50f70ef..d505a7e 100644 >> --- a/target-arm/kvm.c >> +++ b/target-arm/kvm.c >> @@ -535,6 +535,13 @@ static int kvm_handle_debug(CPUState *cs, struct >> kvm_run *run) >> kvm_cpu_synchronize_state(cs); >> >> switch (hsr_ec) { >> +case EC_SOFTWARESTEP: >> +if (cs->singlestep_enabled) { >> +return true; >> +} else { >> +error_report("Came out of SINGLE STEP when not enabled"); >> +} >> +break; >> case EC_AA64_BKPT: >> if (kvm_find_sw_breakpoint(cs, env->pc)) { >> return true; >> @@ -595,6 +602,9 @@ int kvm_arch_on_sigbus(int code, void *addr) >> >> void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg) >> { >> +if (cs->singlestep_enabled) { >> +dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP; >> +} > > Doesn't kvm_update_guest_debug() already set these bits, or am > I misreading it? Yeah. This raises an interesting problem about what to do when we don't have the capability. I could suppress those bits in the update function but that seems a bit hacky. Looking at the GDB capability code there doesn't seem to report breakpoint capability short of just failing when you try to set one. > >> if (kvm_sw_breakpoints_active(cs)) { >> dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; >> } >> -- >> 2.6.3 > > thanks > -- PMM -- Alex Bennée
Re: [Qemu-devel] net: vmxnet3: memory leakage issue
Hello Jason, +-- On Fri, 4 Dec 2015, Jason Wang wrote --+ | Better with "git send-email". Okay. | What if guest deactivate the device before re-activate the device? |Looks like it could be done through methods: | |1) VMXNET3_CMD_QUIESCE_DEV IIUC, it is used to pause the device when the receiver end is unable to keee-up with the incoming flow. After a brief period, the operation could be resumed again. |2) VMXNET3_REG_DSAL Shared memory between a driver and the device appears to be set in two steps. Firs low address, followed by the high address(VMXNET3_REG_DSAH). I guess 's->device_active' needs to be enabled again while setting the higher part of the address. |So looks like need to free both tx_pkt and rx_pkt during deactivating? I think freeing 'tx_pkt' & 'rx_pkt' during pause wouldn't be good. It needs to call something like - 'vmxnet3_resume_device'. Please see the patch below for the above two cases, does it look okay? === diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c index 37373e5..4b2305b 100644 --- a/hw/net/vmxnet3.c +++ b/hw/net/vmxnet3.c @@ -1198,6 +1198,12 @@ static void vmxnet3_deactivate_device(VMXNET3State *s) s->device_active = false; } +static void vmxnet3_resume_device(VMXNET3State *s) +{ +VMW_CBPRN("Resuming vmxnet3..."); +s->device_active = true; +} + static void vmxnet3_reset(VMXNET3State *s) { VMW_CBPRN("Resetting vmxnet3..."); @@ -1627,8 +1633,13 @@ static void vmxnet3_handle_command(VMXNET3State *s, uint64_t cmd) break; case VMXNET3_CMD_QUIESCE_DEV: -VMW_CBPRN("Set: VMXNET3_CMD_QUIESCE_DEV - pause the device"); -vmxnet3_deactivate_device(s); +if (s->device_active) { +VMW_CBPRN("Set: VMXNET3_CMD_QUIESCE_DEV - pause the device"); +vmxnet3_deactivate_device(s); +} else { +VMW_CBPRN("Set: VMXNET3_CMD_QUIESCE_DEV - resume the device"); +vmxnet3_resume_device(s); +} break; case VMXNET3_CMD_GET_CONF_INTR: @@ -1756,6 +1767,9 @@ vmxnet3_io_bar1_write(void *opaque, * We already should have low address part. */ s->drv_shmem = s->temp_shared_guest_driver_memory | (val << 32); +if (s->drv_shmem) { +s->device_active = true; +} break; /* Command */ === Thank you. -- Prasad J Pandit / Red Hat Product Security Team 47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F
Re: [Qemu-devel] tcg: improve MAX_CODE_GEN_BUFFER_SIZE for arm
On 2015-12-08 10:43, TeLeMan wrote: > I know MAX_CODE_GEN_BUFFER_SIZE is limited by the host direct branch > instructions.But the arm's MAX_CODE_GEN_BUFFER_SIZE is so small.I > tried improving MAX_CODE_GEN_BUFFER_SIZE.I wrote some check codes for > the overflow offset in tcg_out_b(), tcg_out_bl(), > tcg_out_blx_imm(),reloc_pc24(). But I didn't catch any overflow case > when tb_size and MAX_CODE_GEN_BUFFER_SIZE were larger than 32MB. After > the generated code size was larger than 32MB, qemu crashed. Instrumenting all the tcg_out_* branch related functions do not work here as the address is actually not known at code generation: case INDEX_op_goto_tb: if (s->tb_jmp_offset) { /* Direct jump method */ s->tb_jmp_offset[args[0]] = tcg_current_code_size(s); tcg_out_b_noaddr(s, COND_AL); It is patched later during TB linking. > Any suggest for this issue? I already posted a patch a long time ago to remove the 16MB limit on ARM hosts: http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg01684.html However as you can see in the thread, it has been rejected as it doesn't not bring improvement in all cases. Aurelien -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests
Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben: > On Mon, 7 Dec 2015 11:02:51 +0100 > Cornelia Huckwrote: > > > On Thu, 3 Dec 2015 13:00:00 +0800 > > Stefan Hajnoczi wrote: > > > > > From: Fam Zheng > > > > > > The assertion problem was noticed in 06c3916b35a, but it wasn't > > > completely fixed, because even though the req is not marked as > > > serialising, it still gets serialised by wait_serialising_requests > > > against other serialising requests, which could lead to the same > > > assertion failure. > > > > > > Fix it by even more explicitly skipping the serialising for this > > > specific case. > > > > > > Signed-off-by: Fam Zheng > > > Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com > > > Signed-off-by: Stefan Hajnoczi > > > --- > > > block/backup.c| 2 +- > > > block/io.c| 12 +++- > > > include/block/block.h | 4 ++-- > > > trace-events | 2 +- > > > 4 files changed, 11 insertions(+), 9 deletions(-) > > > > This one causes segfaults for me: > > > > Program received signal SIGSEGV, Segmentation fault. > > bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071 > > 3071if (!drv) { > > > > (gdb) bt > > #0 bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071 This looks like some kind of memory corruption that hit blk->bs. It's most definitely not a valid pointer anyway. > > #1 0x80216974 in blk_is_inserted (blk=) > > at /data/git/yyy/qemu/block/block-backend.c:986 > > #2 0x802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960) > > at /data/git/yyy/qemu/block/block-backend.c:991 > > #3 0x80216d12 in blk_check_byte_request > > (blk=blk@entry=0x3ffb17e7960, > > offset=offset@entry=4928966656, size=16384) > > at /data/git/yyy/qemu/block/block-backend.c:558 > > #4 0x80216df2 in blk_check_request (blk=blk@entry=0x3ffb17e7960, > > sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32) > > at /data/git/yyy/qemu/block/block-backend.c:589 > > #5 0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num= > > 9626888, iov=0x8098c658, nb_sectors=, cb= > > 0x80081150 , opaque=0x80980620) > > at /data/git/yyy/qemu/block/block-backend.c:727 > > #6 0x8008186e in submit_requests (niov=, > > num_reqs=, start=, mrb=, > > blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366 > > #7 virtio_blk_submit_multireq (mrb=, blk=) > > at /data/git/yyy/qemu/hw/block/virtio-blk.c:444 > > #8 virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58) > > at /data/git/yyy/qemu/hw/block/virtio-blk.c:389 > > #9 0x800823ee in virtio_blk_handle_output (vdev=, > > vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615 > > #10 0x801e367e in aio_dispatch (ctx=0x80918520) > > at /data/git/yyy/qemu/aio-posix.c:326 > > #11 0x801d28b0 in aio_ctx_dispatch (source=, > > callback=, user_data=) > > at /data/git/yyy/qemu/async.c:231 > > #12 0x03fffd36a05a in g_main_context_dispatch () > >from /lib64/libglib-2.0.so.0 > > #13 0x801e0ffa in glib_pollfds_poll () > > at /data/git/yyy/qemu/main-loop.c:211 > > #14 os_host_main_loop_wait (timeout=) > > at /data/git/yyy/qemu/main-loop.c:256 > > #15 main_loop_wait (nonblocking=) > > at /data/git/yyy/qemu/main-loop.c:504 > > #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923 > > #17 main (argc=, argv=, envp=) > > at /data/git/yyy/qemu/vl.c:4684 > > > > Relevant part of command line: > > > > -drive > > file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none > > -device > > virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off > > I played around a bit. The main part of this change seems to be calling > wait_serialising_requests() conditionally; reverting this makes the > guest boot again. > > I then tried to find out when wait_serialising_requests() was NOT > called and added fprintfs: well, it was _always_ called. I then added a > fprintf for flags at the beginning of the function: this produced a > segfault no matter whether wait_serialising_requests() was called > conditionally or unconditionally. Weird race? > > Anything further I can do? I guess this patch fixes a bug for someone, > but it means insta-death for my setup... If it happens immediately, perhaps running under valgrind is possible and could give some hints about what happened with blk->bs? Kevin
Re: [Qemu-devel] tcg: improve MAX_CODE_GEN_BUFFER_SIZE for arm
On 2015-12-08 11:51, Laurent Desnogues wrote: > Hello, > > On Tue, Dec 8, 2015 at 11:39 AM, Aurelien Jarnowrote: > [...] > > I already posted a patch a long time ago to remove the 16MB limit on ARM > > hosts: > > > > http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg01684.html > > > > However as you can see in the thread, it has been rejected as it doesn't > > not bring improvement in all cases. > > We could perhaps resurrect it and do some more benchmarking? Who > would be able to do testing on (recent) ARM hardware? I can provide an updated patch, but I would prefer if someone else does the benchmarking on a really recent hardware. Not sure the hardware I have (cortex A7) is really representative of a modern ARM CPU. Aurelien -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [v3 3/3] configure: add options to config avx2
On 8 December 2015 at 12:08, Liang Liwrote: > Add the '--enable-avx2' & '--disable-avx2' option so as to config > the AVX2 instruction optimization. > > If '--disable-avx2' is not set, configure will detect if the compiler > can support AVX2 option, if yes, AVX2 optimization is eabled, else > disabled. Is the configure option necessary? For other things like this (eg our use of SSE2 or Altivec) we just go ahead and use the feature if the compiler supports it. When would somebody building QEMU want to disable this option? thanks -- PMM
Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests
On 12/08/2015 01:30 PM, Christian Borntraeger wrote: > On 12/08/2015 01:00 PM, Cornelia Huck wrote: >> On Tue, 8 Dec 2015 10:59:54 +0100 >> Kevin Wolfwrote: >> >>> Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben: On Mon, 7 Dec 2015 11:02:51 +0100 Cornelia Huck wrote: > On Thu, 3 Dec 2015 13:00:00 +0800 > Stefan Hajnoczi wrote: > >> From: Fam Zheng >> >> The assertion problem was noticed in 06c3916b35a, but it wasn't >> completely fixed, because even though the req is not marked as >> serialising, it still gets serialised by wait_serialising_requests >> against other serialising requests, which could lead to the same >> assertion failure. >> >> Fix it by even more explicitly skipping the serialising for this >> specific case. >> >> Signed-off-by: Fam Zheng >> Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com >> Signed-off-by: Stefan Hajnoczi >> --- >> block/backup.c| 2 +- >> block/io.c| 12 +++- >> include/block/block.h | 4 ++-- >> trace-events | 2 +- >> 4 files changed, 11 insertions(+), 9 deletions(-) > > This one causes segfaults for me: > > Program received signal SIGSEGV, Segmentation fault. > bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071 > 3071 if (!drv) { > > (gdb) bt > #0 bdrv_is_inserted (bs=0x8000) at > /data/git/yyy/qemu/block.c:3071 >>> >>> This looks like some kind of memory corruption that hit blk->bs. It's >>> most definitely not a valid pointer anyway. >>> > #1 0x80216974 in blk_is_inserted (blk=) > at /data/git/yyy/qemu/block/block-backend.c:986 > #2 0x802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960) > at /data/git/yyy/qemu/block/block-backend.c:991 > #3 0x80216d12 in blk_check_byte_request > (blk=blk@entry=0x3ffb17e7960, > offset=offset@entry=4928966656, size=16384) > at /data/git/yyy/qemu/block/block-backend.c:558 > #4 0x80216df2 in blk_check_request (blk=blk@entry=0x3ffb17e7960, > sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32) > at /data/git/yyy/qemu/block/block-backend.c:589 > #5 0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num= > 9626888, iov=0x8098c658, nb_sectors=, cb= > 0x80081150 , opaque=0x80980620) > at /data/git/yyy/qemu/block/block-backend.c:727 > #6 0x8008186e in submit_requests (niov=, > num_reqs=, start=, mrb=, > blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366 > #7 virtio_blk_submit_multireq (mrb=, blk=) > at /data/git/yyy/qemu/hw/block/virtio-blk.c:444 > #8 virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58) > at /data/git/yyy/qemu/hw/block/virtio-blk.c:389 > #9 0x800823ee in virtio_blk_handle_output (vdev=, > vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615 > #10 0x801e367e in aio_dispatch (ctx=0x80918520) > at /data/git/yyy/qemu/aio-posix.c:326 > #11 0x801d28b0 in aio_ctx_dispatch (source=, > callback=, user_data=) > at /data/git/yyy/qemu/async.c:231 > #12 0x03fffd36a05a in g_main_context_dispatch () >from /lib64/libglib-2.0.so.0 > #13 0x801e0ffa in glib_pollfds_poll () > at /data/git/yyy/qemu/main-loop.c:211 > #14 os_host_main_loop_wait (timeout=) > at /data/git/yyy/qemu/main-loop.c:256 > #15 main_loop_wait (nonblocking=) > at /data/git/yyy/qemu/main-loop.c:504 > #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923 > #17 main (argc=, argv=, envp= out>) > at /data/git/yyy/qemu/vl.c:4684 > > Relevant part of command line: > > -drive > file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none > -device > virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off I played around a bit. The main part of this change seems to be calling wait_serialising_requests() conditionally; reverting this makes the guest boot again. I then tried to find out when wait_serialising_requests() was NOT called and added fprintfs: well, it was _always_ called. I then added a fprintf for flags at the beginning of the function: this produced a segfault no matter whether wait_serialising_requests() was called conditionally or unconditionally. Weird race? Anything further I can do? I guess this patch fixes a bug for someone, but it means insta-death for my setup... >>> >>> If it happens immediately, perhaps running under valgrind is possible >>>
Re: [Qemu-devel] [PATCH v2 1/4] xen/MSI-X: latch MSI-X table writes
On Mon, 7 Dec 2015, Jan Beulich wrote: > >>> On 07.12.15 at 13:41,wrote: > > I know that in your opinion is superfluous, nonetheless could you please > > add 2-3 lines of in-code comment right here, to explain what you are > > doing with the check? Something like: > > > > /* > > * Update the entry addr and data to the latest values only when the > > * entry is masked or they are all masked, as required by the spec. > > * Addr and data changes while the MSI-X entry is unmasked will be > > * delayed until the next masking->unmasking. > > */ > > Btw, will it be okay to just resend this one patch as v3, or do I need > to resend the whole series (the rest of which didn't change)? Just this patch is fine.
[Qemu-devel] [PATCH v3 1/4] xen/MSI-X: latch MSI-X table writes
The remaining log message in pci_msix_write() is wrong, as there guest behavior may only appear to be wrong: For one, the old logic didn't take the mask-all bit into account. And then this shouldn't depend on host device state (i.e. the host may have masked the entry without the guest having done so). Plus these writes shouldn't be dropped even when an entry gets unmasked. Instead, if they can't be made take effect right away, they should take effect on the next unmasking or enabling operation - the specification explicitly describes such caching behavior. Signed-off-by: Jan Beulich--- v3: Add comment to xen_pt_msix_update_one(). v2: Pass original vec_ctrl to xen_pt_msix_update_one() instead of (ab)using latch[]. --- a/hw/xen/xen_pt_config_init.c +++ b/hw/xen/xen_pt_config_init.c @@ -1499,6 +1499,8 @@ static int xen_pt_msixctrl_reg_write(Xen xen_pt_msix_disable(s); } +s->msix->maskall = *val & PCI_MSIX_FLAGS_MASKALL; + debug_msix_enabled_old = s->msix->enabled; s->msix->enabled = !!(*val & PCI_MSIX_FLAGS_ENABLE); if (s->msix->enabled != debug_msix_enabled_old) { --- a/hw/xen/xen_pt.h +++ b/hw/xen/xen_pt.h @@ -187,13 +187,13 @@ typedef struct XenPTMSIXEntry { int pirq; uint64_t addr; uint32_t data; -uint32_t vector_ctrl; +uint32_t latch[4]; bool updated; /* indicate whether MSI ADDR or DATA is updated */ -bool warned; /* avoid issuing (bogus) warning more than once */ } XenPTMSIXEntry; typedef struct XenPTMSIX { uint32_t ctrl_offset; bool enabled; +bool maskall; int total_entries; int bar_index; uint64_t table_base; --- a/hw/xen/xen_pt_msi.c +++ b/hw/xen/xen_pt_msi.c @@ -25,6 +25,7 @@ #define XEN_PT_GFLAGSSHIFT_DELIV_MODE 12 #define XEN_PT_GFLAGSSHIFT_TRG_MODE 15 +#define latch(fld) latch[PCI_MSIX_ENTRY_##fld / sizeof(uint32_t)] /* * Helpers @@ -314,7 +315,8 @@ static int msix_set_enable(XenPCIPassthr enabled); } -static int xen_pt_msix_update_one(XenPCIPassthroughState *s, int entry_nr) +static int xen_pt_msix_update_one(XenPCIPassthroughState *s, int entry_nr, + uint32_t vec_ctrl) { XenPTMSIXEntry *entry = NULL; int pirq; @@ -332,6 +334,19 @@ static int xen_pt_msix_update_one(XenPCI pirq = entry->pirq; +/* + * Update the entry addr and data to the latest values only when the + * entry is masked or they are all masked, as required by the spec. + * Addr and data changes while the MSI-X entry is unmasked get deferred + * until the next masked -> unmasked transition. + */ +if (pirq == XEN_PT_UNASSIGNED_PIRQ || s->msix->maskall || +(vec_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT)) { +entry->addr = entry->latch(LOWER_ADDR) | + ((uint64_t)entry->latch(UPPER_ADDR) << 32); +entry->data = entry->latch(DATA); +} + rc = msi_msix_setup(s, entry->addr, entry->data, , true, entry_nr, entry->pirq == XEN_PT_UNASSIGNED_PIRQ); if (rc) { @@ -357,7 +372,7 @@ int xen_pt_msix_update(XenPCIPassthrough int i; for (i = 0; i < msix->total_entries; i++) { -xen_pt_msix_update_one(s, i); +xen_pt_msix_update_one(s, i, msix->msix_entry[i].latch(VECTOR_CTRL)); } return 0; @@ -406,35 +421,15 @@ int xen_pt_msix_update_remap(XenPCIPasst static uint32_t get_entry_value(XenPTMSIXEntry *e, int offset) { -switch (offset) { -case PCI_MSIX_ENTRY_LOWER_ADDR: -return e->addr & UINT32_MAX; -case PCI_MSIX_ENTRY_UPPER_ADDR: -return e->addr >> 32; -case PCI_MSIX_ENTRY_DATA: -return e->data; -case PCI_MSIX_ENTRY_VECTOR_CTRL: -return e->vector_ctrl; -default: -return 0; -} +return !(offset % sizeof(*e->latch)) + ? e->latch[offset / sizeof(*e->latch)] : 0; } static void set_entry_value(XenPTMSIXEntry *e, int offset, uint32_t val) { -switch (offset) { -case PCI_MSIX_ENTRY_LOWER_ADDR: -e->addr = (e->addr & ((uint64_t)UINT32_MAX << 32)) | val; -break; -case PCI_MSIX_ENTRY_UPPER_ADDR: -e->addr = (uint64_t)val << 32 | (e->addr & UINT32_MAX); -break; -case PCI_MSIX_ENTRY_DATA: -e->data = val; -break; -case PCI_MSIX_ENTRY_VECTOR_CTRL: -e->vector_ctrl = val; -break; +if (!(offset % sizeof(*e->latch))) +{ +e->latch[offset / sizeof(*e->latch)] = val; } } @@ -454,39 +449,26 @@ static void pci_msix_write(void *opaque, offset = addr % PCI_MSIX_ENTRY_SIZE; if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL) { -const volatile uint32_t *vec_ctrl; - if (get_entry_value(entry, offset) == val && entry->pirq != XEN_PT_UNASSIGNED_PIRQ) { return; } +entry->updated = true; +} else if (msix->enabled && entry->updated && + !(val &
Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests
On 12/08/2015 01:00 PM, Cornelia Huck wrote: > On Tue, 8 Dec 2015 10:59:54 +0100 > Kevin Wolfwrote: > >> Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben: >>> On Mon, 7 Dec 2015 11:02:51 +0100 >>> Cornelia Huck wrote: >>> On Thu, 3 Dec 2015 13:00:00 +0800 Stefan Hajnoczi wrote: > From: Fam Zheng > > The assertion problem was noticed in 06c3916b35a, but it wasn't > completely fixed, because even though the req is not marked as > serialising, it still gets serialised by wait_serialising_requests > against other serialising requests, which could lead to the same > assertion failure. > > Fix it by even more explicitly skipping the serialising for this > specific case. > > Signed-off-by: Fam Zheng > Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com > Signed-off-by: Stefan Hajnoczi > --- > block/backup.c| 2 +- > block/io.c| 12 +++- > include/block/block.h | 4 ++-- > trace-events | 2 +- > 4 files changed, 11 insertions(+), 9 deletions(-) This one causes segfaults for me: Program received signal SIGSEGV, Segmentation fault. bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071 3071 if (!drv) { (gdb) bt #0 bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071 >> >> This looks like some kind of memory corruption that hit blk->bs. It's >> most definitely not a valid pointer anyway. >> #1 0x80216974 in blk_is_inserted (blk=) at /data/git/yyy/qemu/block/block-backend.c:986 #2 0x802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960) at /data/git/yyy/qemu/block/block-backend.c:991 #3 0x80216d12 in blk_check_byte_request (blk=blk@entry=0x3ffb17e7960, offset=offset@entry=4928966656, size=16384) at /data/git/yyy/qemu/block/block-backend.c:558 #4 0x80216df2 in blk_check_request (blk=blk@entry=0x3ffb17e7960, sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32) at /data/git/yyy/qemu/block/block-backend.c:589 #5 0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num= 9626888, iov=0x8098c658, nb_sectors=, cb= 0x80081150 , opaque=0x80980620) at /data/git/yyy/qemu/block/block-backend.c:727 #6 0x8008186e in submit_requests (niov=, num_reqs=, start=, mrb=, blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366 #7 virtio_blk_submit_multireq (mrb=, blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:444 #8 virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58) at /data/git/yyy/qemu/hw/block/virtio-blk.c:389 #9 0x800823ee in virtio_blk_handle_output (vdev=, vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615 #10 0x801e367e in aio_dispatch (ctx=0x80918520) at /data/git/yyy/qemu/aio-posix.c:326 #11 0x801d28b0 in aio_ctx_dispatch (source=, callback=, user_data=) at /data/git/yyy/qemu/async.c:231 #12 0x03fffd36a05a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 #13 0x801e0ffa in glib_pollfds_poll () at /data/git/yyy/qemu/main-loop.c:211 #14 os_host_main_loop_wait (timeout=) at /data/git/yyy/qemu/main-loop.c:256 #15 main_loop_wait (nonblocking=) at /data/git/yyy/qemu/main-loop.c:504 #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923 #17 main (argc=, argv=, envp=) at /data/git/yyy/qemu/vl.c:4684 Relevant part of command line: -drive file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none -device virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off >>> >>> I played around a bit. The main part of this change seems to be calling >>> wait_serialising_requests() conditionally; reverting this makes the >>> guest boot again. >>> >>> I then tried to find out when wait_serialising_requests() was NOT >>> called and added fprintfs: well, it was _always_ called. I then added a >>> fprintf for flags at the beginning of the function: this produced a >>> segfault no matter whether wait_serialising_requests() was called >>> conditionally or unconditionally. Weird race? >>> >>> Anything further I can do? I guess this patch fixes a bug for someone, >>> but it means insta-death for my setup... >> >> If it happens immediately, perhaps running under valgrind is possible >> and could give some hints about what happened with blk->bs? > > Just a quick update: This triggers on a qemu built with a not-so-fresh > gcc 4.7.2 (and it seems to depend on
Re: [Qemu-devel] [PATCH 1/2] Add param Error** to msi_init() & modify the callers
Hi Markus, I have to say, you really did a amazing review for this "trivial "patch, thanks a lot & really appreciate it:) On 12/07/2015 05:59 PM, Markus Armbruster wrote: Cao jinwrites: msi_init() is a supporting function in PCI device initialization, in order to convert .init() to .realize(), it should be modified first. Also modify the callers Bonus: add more comment for msi_init(). Incomplete. See notes on impact inline. Signed-off-by: Cao jin --- hw/audio/intel-hda.c | 7 ++- hw/ide/ich.c | 2 +- hw/net/vmxnet3.c | 3 ++- hw/pci-bridge/ioh3420.c| 6 +- hw/pci-bridge/pci_bridge_dev.c | 6 +- hw/pci-bridge/xio3130_downstream.c | 7 ++- hw/pci-bridge/xio3130_upstream.c | 7 ++- hw/pci/msi.c | 17 + hw/scsi/megasas.c | 12 +--- hw/scsi/vmw_pvscsi.c | 3 ++- hw/usb/hcd-xhci.c | 5 - hw/vfio/pci.c | 3 ++- include/hw/pci/msi.h | 4 ++-- 13 files changed, 63 insertions(+), 19 deletions(-) diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c index 433463e..9d733da 100644 --- a/hw/audio/intel-hda.c +++ b/hw/audio/intel-hda.c @@ -1130,6 +1130,7 @@ static void intel_hda_realize(PCIDevice *pci, Error **errp) { IntelHDAState *d = INTEL_HDA(pci); uint8_t *conf = d->pci.config; +int ret; d->name = object_get_typename(OBJECT(d)); @@ -1142,7 +1143,11 @@ static void intel_hda_realize(PCIDevice *pci, Error **errp) "intel-hda", 0x4000); pci_register_bar(>pci, 0, 0, >mmio); if (d->msi) { -msi_init(>pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, false); +ret = msi_init(>pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, +false, errp); +if(ret < 0) { Please use scripts/checkpatch.pl to check your patches. It's occasionally wrong, so use your judgement. Thanks for the tips, seems I got dizzy looking because many trivial place need to be modified... +return; This returns with the device in a half-realized state. Do we have to undo prior side effects to put it back into unrealized state? See also ioh3420_initfn() below. Before: msi_init() failure is ignored. After: it makes device realization fail. To assess impact, we need to understand how msi_init() can fail. It seems I missed the reality: devices are default to be hot-pluggable & most devices are hot-pluggable:-[ Because when cold plugged, process will exit on device-init failing, so, the half-realized state doesn`t matter in this condition. Will rework it later. +} } hda_codec_bus_init(DEVICE(pci), >codecs, sizeof(d->codecs), diff --git a/hw/ide/ich.c b/hw/ide/ich.c index 16925fa..94b1809 100644 --- a/hw/ide/ich.c +++ b/hw/ide/ich.c @@ -145,7 +145,7 @@ static void pci_ich9_ahci_realize(PCIDevice *dev, Error **errp) /* Although the AHCI 1.3 specification states that the first capability * should be PMCAP, the Intel ICH9 data sheet specifies that the ICH9 * AHCI device puts the MSI capability first, pointing to 0x80. */ -msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false); +msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false, errp); Do we have to put the device back into unrealized state on failure? } static void pci_ich9_uninit(PCIDevice *dev) diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c index 5e3a233..4269141 100644 --- a/hw/net/vmxnet3.c +++ b/hw/net/vmxnet3.c @@ -2140,9 +2140,10 @@ vmxnet3_init_msi(VMXNET3State *s) { PCIDevice *d = PCI_DEVICE(s); int res; +Error *local_err = NULL; res = msi_init(d, VMXNET3_MSI_OFFSET, VMXNET3_MAX_NMSIX_INTRS, - VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK); + VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK, _err); if (0 > res) { VMW_WRPRN("Failed to initialize MSI, error %d", res); s->msi_used = false; The error is neither propagated nor handled, and the error object leaks. Since this function can't handle it, it needs to propagate it. Requires adding an Error ** parameter. [*]Actually, here is my consideration: a device-realize function(take the following ioh3420 for example) will call many supporting functions like msi_init(), so I am planning, every supporting function goes into a patch first, then every "device convert to realize()" goes into a patch, otherwise, it may will be a big patch for the reviewer. That`s why I didn`t add Error ** param, and propagate it, and plan to do it in "convert to realize()" patch. But for now, I think this patch should at least be successfully compiled & won`t impact the existed things. Yes, it seems may have leaks when error happens, but will be fixed when the "convert to realize()" patch
Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests
On Tue, 8 Dec 2015 10:59:54 +0100 Kevin Wolfwrote: > Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben: > > On Mon, 7 Dec 2015 11:02:51 +0100 > > Cornelia Huck wrote: > > > > > On Thu, 3 Dec 2015 13:00:00 +0800 > > > Stefan Hajnoczi wrote: > > > > > > > From: Fam Zheng > > > > > > > > The assertion problem was noticed in 06c3916b35a, but it wasn't > > > > completely fixed, because even though the req is not marked as > > > > serialising, it still gets serialised by wait_serialising_requests > > > > against other serialising requests, which could lead to the same > > > > assertion failure. > > > > > > > > Fix it by even more explicitly skipping the serialising for this > > > > specific case. > > > > > > > > Signed-off-by: Fam Zheng > > > > Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com > > > > Signed-off-by: Stefan Hajnoczi > > > > --- > > > > block/backup.c| 2 +- > > > > block/io.c| 12 +++- > > > > include/block/block.h | 4 ++-- > > > > trace-events | 2 +- > > > > 4 files changed, 11 insertions(+), 9 deletions(-) > > > > > > This one causes segfaults for me: > > > > > > Program received signal SIGSEGV, Segmentation fault. > > > bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071 > > > 3071 if (!drv) { > > > > > > (gdb) bt > > > #0 bdrv_is_inserted (bs=0x8000) at > > > /data/git/yyy/qemu/block.c:3071 > > This looks like some kind of memory corruption that hit blk->bs. It's > most definitely not a valid pointer anyway. > > > > #1 0x80216974 in blk_is_inserted (blk=) > > > at /data/git/yyy/qemu/block/block-backend.c:986 > > > #2 0x802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960) > > > at /data/git/yyy/qemu/block/block-backend.c:991 > > > #3 0x80216d12 in blk_check_byte_request > > > (blk=blk@entry=0x3ffb17e7960, > > > offset=offset@entry=4928966656, size=16384) > > > at /data/git/yyy/qemu/block/block-backend.c:558 > > > #4 0x80216df2 in blk_check_request (blk=blk@entry=0x3ffb17e7960, > > > sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32) > > > at /data/git/yyy/qemu/block/block-backend.c:589 > > > #5 0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num= > > > 9626888, iov=0x8098c658, nb_sectors=, cb= > > > 0x80081150 , opaque=0x80980620) > > > at /data/git/yyy/qemu/block/block-backend.c:727 > > > #6 0x8008186e in submit_requests (niov=, > > > num_reqs=, start=, mrb=, > > > blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366 > > > #7 virtio_blk_submit_multireq (mrb=, blk=) > > > at /data/git/yyy/qemu/hw/block/virtio-blk.c:444 > > > #8 virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58) > > > at /data/git/yyy/qemu/hw/block/virtio-blk.c:389 > > > #9 0x800823ee in virtio_blk_handle_output (vdev=, > > > vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615 > > > #10 0x801e367e in aio_dispatch (ctx=0x80918520) > > > at /data/git/yyy/qemu/aio-posix.c:326 > > > #11 0x801d28b0 in aio_ctx_dispatch (source=, > > > callback=, user_data=) > > > at /data/git/yyy/qemu/async.c:231 > > > #12 0x03fffd36a05a in g_main_context_dispatch () > > >from /lib64/libglib-2.0.so.0 > > > #13 0x801e0ffa in glib_pollfds_poll () > > > at /data/git/yyy/qemu/main-loop.c:211 > > > #14 os_host_main_loop_wait (timeout=) > > > at /data/git/yyy/qemu/main-loop.c:256 > > > #15 main_loop_wait (nonblocking=) > > > at /data/git/yyy/qemu/main-loop.c:504 > > > #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923 > > > #17 main (argc=, argv=, envp= > > out>) > > > at /data/git/yyy/qemu/vl.c:4684 > > > > > > Relevant part of command line: > > > > > > -drive > > > file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none > > > -device > > > virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off > > > > I played around a bit. The main part of this change seems to be calling > > wait_serialising_requests() conditionally; reverting this makes the > > guest boot again. > > > > I then tried to find out when wait_serialising_requests() was NOT > > called and added fprintfs: well, it was _always_ called. I then added a > > fprintf for flags at the beginning of the function: this produced a > > segfault no matter whether wait_serialising_requests() was called > > conditionally or unconditionally. Weird race? > > > > Anything further I can do? I guess this patch fixes a bug for someone, > > but it means insta-death for my setup... > > If it happens immediately, perhaps running under valgrind is possible > and could give some hints about what happened with blk->bs? Just a quick update: This triggers on a qemu
[Qemu-devel] [v3 0/3] add avx2 instruction optimization
buffer_find_nonzero_offset() is a hot function during live migration. Now it use SSE2 intructions for optimization. For platform supports AVX2 instructions, use the AVX2 instructions for optimization can help to improve the performance about 30% comparing to SSE2. Zero page check can be faster with this optimization, the test result shows that for an 8GB RAM idle guest, this patch can help to shorten the total live migration time about 6%. This patch use the ifunc mechanism to select the proper function when running, for platform supports AVX2, excute the AVX2 instructions, else, excute the original code. With this patch, the QEMU binary can run on both platforms support AVX2 or not. Compiler which desn't support the AVX2 or ifunc attribute can build the source code successfully. v2 -> v3 changes: * Detect the ifunc attribute support (Paolo's suggestion) * Use the ifunc attribute instead of the inline asm (Richard's suggestion) * Change the configure (Juan's suggestion) Liang Li (3): cutils: add avx2 instruction optimization configure: detect ifunc attribute configure: add options to config avx2 configure | 50 + include/qemu-common.h | 13 +- util/Makefile.objs | 2 ++ util/buffer-zero-avx2.c | 54 util/cutils.c | 65 +++-- 5 files changed, 175 insertions(+), 9 deletions(-) create mode 100644 util/buffer-zero-avx2.c -- 1.9.1
[Qemu-devel] [PATCH 1/7] pc: wire up TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE for !xen
rename pc_xen_hvm_init_pci to pc_i440fx_init_pci, use it for both xen and non-xen init. Signed-off-by: Gerd Hoffmann--- hw/i386/pc_piix.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 2e41efe..ce6c3c5 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -419,10 +419,9 @@ static void pc_init_isa(MachineState *machine) pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, TYPE_I440FX_PCI_DEVICE); } -#ifdef CONFIG_XEN -static void pc_xen_hvm_init_pci(MachineState *machine) +static void pc_i440fx_init_pci(MachineState *machine) { -const char *pci_type = has_igd_gfx_passthru ? +const char *pci_type = machine->igd_gfx_passthru ? TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE : TYPE_I440FX_PCI_DEVICE; pc_init1(machine, @@ -430,6 +429,7 @@ static void pc_xen_hvm_init_pci(MachineState *machine) pci_type); } +#ifdef CONFIG_XEN static void pc_xen_hvm_init(MachineState *machine) { PCIBus *bus; @@ -439,7 +439,7 @@ static void pc_xen_hvm_init(MachineState *machine) exit(1); } -pc_xen_hvm_init_pci(machine); +pc_i440fx_init_pci(machine); bus = pci_find_primary_bus(); if (bus != NULL) { @@ -455,8 +455,7 @@ static void pc_xen_hvm_init(MachineState *machine) if (compat) { \ compat(machine); \ } \ -pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, \ - TYPE_I440FX_PCI_DEVICE); \ +pc_i440fx_init_pci(machine); \ } \ DEFINE_PC_MACHINE(suffix, name, pc_init_##suffix, optionfn) -- 1.8.3.1
[Qemu-devel] [PATCH 0/7] igd passthrough chipset tweaks
Hi, We have some code in our tree to support pci passthrough of intel graphics devices (igd) on xen, which requires some chipset tweaks for (a) the host bridge and (b) the lpc/isa-bridge to meat the expectations of the guest driver. For kvm we need pretty much the same, also the requirements for vgpu (xengt/kvmgt) are very simliar. This patch series tackles (a) only, (b) will follow later. It wires up the igd-passthru machine option for tcg/kvm too, moves the code to its own file so it is nicely separated, fixes a bunch of issues and finally adds q35 support. This patch series has seen very light testing, basically doing lspci in the guest to check whenever pci config space got updated correctly. Trying actual device assignment needs more pieces being in place. But I suspect even that is more testing than the code has seen on xen so far (see patch #6 ...). cheers, Gerd Gerd Hoffmann (7): pc: wire up TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE for !xen pc: move igd support code to igd.c igd: switch TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE to realize igd: TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE: call parent realize igd: use defines for standard pci config space offsets igd: revamp host config read igd: add q35 support hw/i386/pc_piix.c | 11 ++-- hw/pci-host/Makefile.objs | 3 ++ hw/pci-host/igd.c | 132 ++ hw/pci-host/piix.c| 88 --- hw/pci-host/q35.c | 6 ++- 5 files changed, 145 insertions(+), 95 deletions(-) create mode 100644 hw/pci-host/igd.c -- 1.8.3.1
[Qemu-devel] [PATCH 2/7] pc: move igd support code to igd.c
Pure code motion, except for dropping instance_size for TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE (no need to set, we can inherit it from TYPE_I440FX_PCI_DEVICE). Signed-off-by: Gerd Hoffmann--- hw/pci-host/Makefile.objs | 3 ++ hw/pci-host/igd.c | 96 +++ hw/pci-host/piix.c| 88 --- 3 files changed, 99 insertions(+), 88 deletions(-) create mode 100644 hw/pci-host/igd.c diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs index 45f1f0e..e341a49 100644 --- a/hw/pci-host/Makefile.objs +++ b/hw/pci-host/Makefile.objs @@ -11,6 +11,9 @@ common-obj-$(CONFIG_PPCE500_PCI) += ppce500.o # ARM devices common-obj-$(CONFIG_VERSATILE_PCI) += versatile.o +# igd passthrough support +common-obj-$(CONFIG_LINUX) += igd.o + common-obj-$(CONFIG_PCI_APB) += apb.o common-obj-$(CONFIG_FULONG) += bonito.o common-obj-$(CONFIG_PCI_PIIX) += piix.o diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c new file mode 100644 index 000..ef0273b --- /dev/null +++ b/hw/pci-host/igd.c @@ -0,0 +1,96 @@ +#include "qemu-common.h" +#include "hw/pci/pci.h" +#include "hw/i386/pc.h" + +/* IGD Passthrough Host Bridge. */ +typedef struct { +uint8_t offset; +uint8_t len; +} IGDHostInfo; + +/* Here we just expose minimal host bridge offset subset. */ +static const IGDHostInfo igd_host_bridge_infos[] = { +{0x08, 2}, /* revision id */ +{0x2c, 2}, /* sybsystem vendor id */ +{0x2e, 2}, /* sybsystem id */ +{0x50, 2}, /* SNB: processor graphics control register */ +{0x52, 2}, /* processor graphics control register */ +{0xa4, 4}, /* SNB: graphics base of stolen memory */ +{0xa8, 4}, /* SNB: base of GTT stolen memory */ +}; + +static int host_pci_config_read(int pos, int len, uint32_t val) +{ +char path[PATH_MAX]; +int config_fd; +ssize_t size = sizeof(path); +/* Access real host bridge. */ +int rc = snprintf(path, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s", + 0, 0, 0, 0, "config"); +int ret = 0; + +if (rc >= size || rc < 0) { +return -ENODEV; +} + +config_fd = open(path, O_RDWR); +if (config_fd < 0) { +return -ENODEV; +} + +if (lseek(config_fd, pos, SEEK_SET) != pos) { +ret = -errno; +goto out; +} +do { +rc = read(config_fd, (uint8_t *), len); +} while (rc < 0 && (errno == EINTR || errno == EAGAIN)); +if (rc != len) { +ret = -errno; +} +out: +close(config_fd); +return ret; +} + +static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev) +{ +uint32_t val = 0; +int rc, i, num; +int pos, len; + +num = ARRAY_SIZE(igd_host_bridge_infos); +for (i = 0; i < num; i++) { +pos = igd_host_bridge_infos[i].offset; +len = igd_host_bridge_infos[i].len; +rc = host_pci_config_read(pos, len, val); +if (rc) { +return -ENODEV; +} +pci_default_write_config(pci_dev, pos, val, len); +} + +return 0; +} + +static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); +PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); + +k->init = igd_pt_i440fx_initfn; +dc->desc = "IGD Passthrough Host bridge"; +} + +static const TypeInfo igd_passthrough_i440fx_info = { +.name = TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE, +.parent= TYPE_I440FX_PCI_DEVICE, +.class_init= igd_passthrough_i440fx_class_init, +}; + +static void igd_register_types(void) +{ +type_register_static(_passthrough_i440fx_info); +} + +type_init(igd_register_types) diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index 715208b..ccacb57 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -744,93 +744,6 @@ static const TypeInfo i440fx_info = { .class_init= i440fx_class_init, }; -/* IGD Passthrough Host Bridge. */ -typedef struct { -uint8_t offset; -uint8_t len; -} IGDHostInfo; - -/* Here we just expose minimal host bridge offset subset. */ -static const IGDHostInfo igd_host_bridge_infos[] = { -{0x08, 2}, /* revision id */ -{0x2c, 2}, /* sybsystem vendor id */ -{0x2e, 2}, /* sybsystem id */ -{0x50, 2}, /* SNB: processor graphics control register */ -{0x52, 2}, /* processor graphics control register */ -{0xa4, 4}, /* SNB: graphics base of stolen memory */ -{0xa8, 4}, /* SNB: base of GTT stolen memory */ -}; - -static int host_pci_config_read(int pos, int len, uint32_t val) -{ -char path[PATH_MAX]; -int config_fd; -ssize_t size = sizeof(path); -/* Access real host bridge. */ -int rc = snprintf(path, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s", - 0, 0, 0, 0, "config"); -int ret = 0; - -if (rc >= size || rc < 0) { -return -ENODEV; -} - -config_fd = open(path, O_RDWR); -if
[Qemu-devel] FD passing for chardevs and chardev backend multiplexing
Historically libvirt has connected stdout & stderr from QEMU directly to a plain file (/var/log/libvirt/qemu/$GUESTNAME.log). This has worked well enough in general, but is susceptible to a guest denial of service if the guest can cause QEMU to spew messages to stderr. There are enough places in QEMU and SPICE that still print to stderr that this isn't very hard to achieve. So In libvirt 1.3.0 we introduce a new daemon 'virtlogd' which is used for handling log file writing. When this is used, QEMU's stdout/stderr will be connected to an anonymous pipe file descriptor, the other end of which is held by the virtlogd daemon. The virtlogd daemon will only permit a fixed file size to be created before rotating the log file, so we no longer have the possibility of unbounded disk usage, which is nice. I'm now looking to extend the use of 'virtlogd' to also handle character devices. OpenStack has historically configured the primary serial port to log to a file in order to capture kernel boot up messages and later report them to the user via its API. This serial port file backend is of course susceptible to the same disk space denial of service. We don't really want to push file rotation logic into QEMU because that would involve giving QEMU permission to create / rename files, which is undesirable from a security POV. We also prefer a solution that ideally works with existing QEMU builds. So for this my plan is to stop using the QEMU 'file' backend for char devs and instead pass across a pre-opened file descriptor, connected to virtlogd. There is no "officially documented" way to pass in a file descriptor to QEMU chardevs, but since QEMU uses qemu_open(), we can make use of the fdset feature to achieve this. eg eg, consider fd 33 is the write end of a pipe file descriptor I can (in theory) do -add-fd set=2,fd=33 -chardev file,id=charserial0,path=/dev/fdset/2 Now in practice this doesn't work, because qmp_chardev_open_file() passes the O_CREAT|O_TRUNC flags in, which means the qemu_open() call will fail when using the pipe FD pased in via fdsets. After more investigation I found it *is* possible to use a socketpair and a pipe backend though... -add-fd set=2,fd=33 -chardev pipe,id=charserial0,path=/dev/fdset/2 ..because for reasons I don't understand, if QEMU can't open $PATH.in and $PATH.out, it'll fallback to just opening $PATH in read-write mode even. AFAICT, this is pretty useless with pipes since they are unidirectional, but, it works nicely with socketpairs, where virtlogd has one of the socketpairs and QEMU gets passed the other via fdset. I can easily check this works for historical QEMU versions back to when fdsets support was added to chardevs, but I'm working if the QEMU maintainers consider this usage acceptable over the long term, and if so, should we explicitly document it as supported ? If not, should we introduce a more explicit syntax for passing in a pre-opened FD for chardevs ? eg -add-fd set=2,fd=33 -chardev fd,id=charserial0,path=/dev/fdset/2 Or just make -chardev file,id=charserial0,path=/dev/fdset/2 actually work ? Or something else ? OpenStack has a further requirement to allow use of the serial port as an interactive console, at the same time that it is logging to a file which is something QEMU can't support at all currently. This essentially means being able to have multiple chardev backends all connected to the same serial frontend - specifically we would need a TCP backend and a file backend concurrently. Again this could be implemented in QEMU, but we'd prefer something that works with existing QEMU. This is not too difficult to achieve with virtlogd really. Instead of using the QEMU 'tcp' or 'unix' chardev protocol, we'd just always pass QEMU a pre-opened socketpair, and then leave the TCP/UNIX socket listening to the virtlogd daemon. This is portable with existing QEMU versions, but the obvious downside with this is extra copies in the interactive console path. So might it be worth exploring the posibility of a chardev multiplexor in QEMU. We would still pass in a pre-opened socketpair to QEMU for the logging side of things, but would leave the TCP/UNIX socket listening upto QEMU still. eg should we make something like this work: -add-fd set=2,fd=33 -chardev pipe,id=charserial0file,path=/dev/fdset/2 -chardev socket,id=charserial0tcp,host=127.0.0.1,port=,telnet,server,nowait -chardev multiplex,id=charserial0,muxA=charserial0file,muxB=charserial1 -serial isa-serial,chardev=charserial0,id=serial0 Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests
On 12/08/2015 03:10 PM, Kevin Wolf wrote: [...] Not a compiler bug. gcc uses a floating point register 8 to spill the pointer of blk (which is call saved) submit_request will later on call qemu_coroutine_enter and after returning from qemu_coroutine_enter, the fpr8 contains junk. Not sure yet, what happened. >>> >>> Coroutines don't save the FPU state, so you're not supposed to use >>> floating point operations inside coroutines. That the compiler spills >>> some integer value into a floating point register is a bit nasty... >> >> Just checked. bdrv_aligned_preadv does also use fprs (also for filling >> and spilling). Some versions of gcc seem to like that as the LDGR and LGDR >> instructions are pretty cheap and move the content from/to fprs in a bitwise >> fashion. So this coroutine DOES trash floating point registers. >> >> Without the patch gcc seems to be fine with the 16 gprs and does not >> spilling/filling from/to fprs in bdrv_aligned_preadv. > > Actually, on closer look it seems that the reason why there is no code > for saving the floating point registers in setjmp() on x86 is that they > are caller-save registers anyway, so it doesn't have to. Otherwise the > internet seems to be of the opinion that longjmp() must indeed restore > floating point registers. > > So this might be a libc bug on s390 then. Fixed with https://sourceware.org/ml/libc-alpha/2013-01/msg00853.html Christian
Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests
Am 08.12.2015 um 14:28 hat Christian Borntraeger geschrieben: > On 12/08/2015 01:30 PM, Christian Borntraeger wrote: > > On 12/08/2015 01:00 PM, Cornelia Huck wrote: > >> On Tue, 8 Dec 2015 10:59:54 +0100 > >> Kevin Wolfwrote: > >> > >>> Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben: > On Mon, 7 Dec 2015 11:02:51 +0100 > Cornelia Huck wrote: > > > On Thu, 3 Dec 2015 13:00:00 +0800 > > Stefan Hajnoczi wrote: > > > >> From: Fam Zheng > >> > >> The assertion problem was noticed in 06c3916b35a, but it wasn't > >> completely fixed, because even though the req is not marked as > >> serialising, it still gets serialised by wait_serialising_requests > >> against other serialising requests, which could lead to the same > >> assertion failure. > >> > >> Fix it by even more explicitly skipping the serialising for this > >> specific case. > >> > >> Signed-off-by: Fam Zheng > >> Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com > >> Signed-off-by: Stefan Hajnoczi > >> --- > >> block/backup.c| 2 +- > >> block/io.c| 12 +++- > >> include/block/block.h | 4 ++-- > >> trace-events | 2 +- > >> 4 files changed, 11 insertions(+), 9 deletions(-) > > > > This one causes segfaults for me: > > > > Program received signal SIGSEGV, Segmentation fault. > > bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071 > > 3071if (!drv) { > > > > (gdb) bt > > #0 bdrv_is_inserted (bs=0x8000) at > > /data/git/yyy/qemu/block.c:3071 > >>> > >>> This looks like some kind of memory corruption that hit blk->bs. It's > >>> most definitely not a valid pointer anyway. > >>> > > #1 0x80216974 in blk_is_inserted (blk=) > > at /data/git/yyy/qemu/block/block-backend.c:986 > > #2 0x802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960) > > at /data/git/yyy/qemu/block/block-backend.c:991 > > #3 0x80216d12 in blk_check_byte_request > > (blk=blk@entry=0x3ffb17e7960, > > offset=offset@entry=4928966656, size=16384) > > at /data/git/yyy/qemu/block/block-backend.c:558 > > #4 0x80216df2 in blk_check_request > > (blk=blk@entry=0x3ffb17e7960, > > sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32) > > at /data/git/yyy/qemu/block/block-backend.c:589 > > #5 0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num= > > 9626888, iov=0x8098c658, nb_sectors=, cb= > > 0x80081150 , opaque=0x80980620) > > at /data/git/yyy/qemu/block/block-backend.c:727 > > #6 0x8008186e in submit_requests (niov=, > > num_reqs=, start=, mrb= > out>, > > blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366 > > #7 virtio_blk_submit_multireq (mrb=, blk= > out>) > > at /data/git/yyy/qemu/hw/block/virtio-blk.c:444 > > #8 virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58) > > at /data/git/yyy/qemu/hw/block/virtio-blk.c:389 > > #9 0x800823ee in virtio_blk_handle_output (vdev= > out>, > > vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615 > > #10 0x801e367e in aio_dispatch (ctx=0x80918520) > > at /data/git/yyy/qemu/aio-posix.c:326 > > #11 0x801d28b0 in aio_ctx_dispatch (source=, > > callback=, user_data=) > > at /data/git/yyy/qemu/async.c:231 > > #12 0x03fffd36a05a in g_main_context_dispatch () > >from /lib64/libglib-2.0.so.0 > > #13 0x801e0ffa in glib_pollfds_poll () > > at /data/git/yyy/qemu/main-loop.c:211 > > #14 os_host_main_loop_wait (timeout=) > > at /data/git/yyy/qemu/main-loop.c:256 > > #15 main_loop_wait (nonblocking=) > > at /data/git/yyy/qemu/main-loop.c:504 > > #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923 > > #17 main (argc=, argv=, envp= > out>) > > at /data/git/yyy/qemu/vl.c:4684 > > > > Relevant part of command line: > > > > -drive > > file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none > > -device > > virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off > > I played around a bit. The main part of this change seems to be calling > wait_serialising_requests() conditionally; reverting this makes the > guest boot again. > > I then tried to find out when wait_serialising_requests() was NOT > called and added fprintfs: well, it was _always_ called. I then added a > fprintf for flags at the beginning of the function: this produced a > segfault no matter
Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests
On 12/08/2015 02:45 PM, Kevin Wolf wrote: > Am 08.12.2015 um 14:28 hat Christian Borntraeger geschrieben: >> On 12/08/2015 01:30 PM, Christian Borntraeger wrote: >>> On 12/08/2015 01:00 PM, Cornelia Huck wrote: On Tue, 8 Dec 2015 10:59:54 +0100 Kevin Wolfwrote: > Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben: >> On Mon, 7 Dec 2015 11:02:51 +0100 >> Cornelia Huck wrote: >> >>> On Thu, 3 Dec 2015 13:00:00 +0800 >>> Stefan Hajnoczi wrote: >>> From: Fam Zheng The assertion problem was noticed in 06c3916b35a, but it wasn't completely fixed, because even though the req is not marked as serialising, it still gets serialised by wait_serialising_requests against other serialising requests, which could lead to the same assertion failure. Fix it by even more explicitly skipping the serialising for this specific case. Signed-off-by: Fam Zheng Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com Signed-off-by: Stefan Hajnoczi --- block/backup.c| 2 +- block/io.c| 12 +++- include/block/block.h | 4 ++-- trace-events | 2 +- 4 files changed, 11 insertions(+), 9 deletions(-) >>> >>> This one causes segfaults for me: >>> >>> Program received signal SIGSEGV, Segmentation fault. >>> bdrv_is_inserted (bs=0x8000) at /data/git/yyy/qemu/block.c:3071 >>> 3071if (!drv) { >>> >>> (gdb) bt >>> #0 bdrv_is_inserted (bs=0x8000) at >>> /data/git/yyy/qemu/block.c:3071 > > This looks like some kind of memory corruption that hit blk->bs. It's > most definitely not a valid pointer anyway. > >>> #1 0x80216974 in blk_is_inserted (blk=) >>> at /data/git/yyy/qemu/block/block-backend.c:986 >>> #2 0x802169c6 in blk_is_available (blk=blk@entry=0x3ffb17e7960) >>> at /data/git/yyy/qemu/block/block-backend.c:991 >>> #3 0x80216d12 in blk_check_byte_request >>> (blk=blk@entry=0x3ffb17e7960, >>> offset=offset@entry=4928966656, size=16384) >>> at /data/git/yyy/qemu/block/block-backend.c:558 >>> #4 0x80216df2 in blk_check_request >>> (blk=blk@entry=0x3ffb17e7960, >>> sector_num=sector_num@entry=9626888, nb_sectors=nb_sectors@entry=32) >>> at /data/git/yyy/qemu/block/block-backend.c:589 >>> #5 0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, sector_num= >>> 9626888, iov=0x8098c658, nb_sectors=, cb= >>> 0x80081150 , opaque=0x80980620) >>> at /data/git/yyy/qemu/block/block-backend.c:727 >>> #6 0x8008186e in submit_requests (niov=, >>> num_reqs=, start=, mrb=>> out>, >>> blk=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:366 >>> #7 virtio_blk_submit_multireq (mrb=, blk=>> out>) >>> at /data/git/yyy/qemu/hw/block/virtio-blk.c:444 >>> #8 virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58) >>> at /data/git/yyy/qemu/hw/block/virtio-blk.c:389 >>> #9 0x800823ee in virtio_blk_handle_output (vdev=>> out>, >>> vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:615 >>> #10 0x801e367e in aio_dispatch (ctx=0x80918520) >>> at /data/git/yyy/qemu/aio-posix.c:326 >>> #11 0x801d28b0 in aio_ctx_dispatch (source=, >>> callback=, user_data=) >>> at /data/git/yyy/qemu/async.c:231 >>> #12 0x03fffd36a05a in g_main_context_dispatch () >>>from /lib64/libglib-2.0.so.0 >>> #13 0x801e0ffa in glib_pollfds_poll () >>> at /data/git/yyy/qemu/main-loop.c:211 >>> #14 os_host_main_loop_wait (timeout=) >>> at /data/git/yyy/qemu/main-loop.c:256 >>> #15 main_loop_wait (nonblocking=) >>> at /data/git/yyy/qemu/main-loop.c:504 >>> #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923 >>> #17 main (argc=, argv=, envp=>> out>) >>> at /data/git/yyy/qemu/vl.c:4684 >>> >>> Relevant part of command line: >>> >>> -drive >>> file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none >>> -device >>> virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off >> >> I played around a bit. The main part of this change seems to be calling >> wait_serialising_requests() conditionally; reverting this makes the >> guest boot again. >> >> I then tried to find out when wait_serialising_requests() was NOT >> called and added fprintfs: well, it was _always_ called. I then added a >> fprintf for flags at the beginning of the
[Qemu-devel] [PATCH 3/7] igd: switch TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE to realize
Signed-off-by: Gerd Hoffmann--- hw/pci-host/igd.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c index ef0273b..d1eeafb 100644 --- a/hw/pci-host/igd.c +++ b/hw/pci-host/igd.c @@ -53,7 +53,7 @@ out: return ret; } -static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev) +static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp) { uint32_t val = 0; int rc, i, num; @@ -65,12 +65,11 @@ static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev) len = igd_host_bridge_infos[i].len; rc = host_pci_config_read(pos, len, val); if (rc) { -return -ENODEV; +error_setg(errp, "failed to read host config"); +return; } pci_default_write_config(pci_dev, pos, val, len); } - -return 0; } static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data) @@ -78,7 +77,7 @@ static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data) DeviceClass *dc = DEVICE_CLASS(klass); PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); -k->init = igd_pt_i440fx_initfn; +k->realize = igd_pt_i440fx_realize; dc->desc = "IGD Passthrough Host bridge"; } -- 1.8.3.1
[Qemu-devel] [PATCH 5/7] igd: use defines for standard pci config space offsets
Signed-off-by: Gerd Hoffmann--- hw/pci-host/igd.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c index 6f52ab1..0784128 100644 --- a/hw/pci-host/igd.c +++ b/hw/pci-host/igd.c @@ -10,9 +10,9 @@ typedef struct { /* Here we just expose minimal host bridge offset subset. */ static const IGDHostInfo igd_host_bridge_infos[] = { -{0x08, 2}, /* revision id */ -{0x2c, 2}, /* sybsystem vendor id */ -{0x2e, 2}, /* sybsystem id */ +{PCI_REVISION_ID, 2}, +{PCI_SUBSYSTEM_VENDOR_ID, 2}, +{PCI_SUBSYSTEM_ID,2}, {0x50, 2}, /* SNB: processor graphics control register */ {0x52, 2}, /* processor graphics control register */ {0xa4, 4}, /* SNB: graphics base of stolen memory */ -- 1.8.3.1
Re: [Qemu-devel] Error handling in realize() methods
* Markus Armbruster (arm...@redhat.com) wrote: > In general, code running withing a realize() method should not exit() on > error. Instad, errors should be propagated through the realize() > method. Additionally, the realize() method should fail cleanly, > i.e. carefully undo its side effects such as wiring of interrupts, > mapping of memory, and so forth. Tedious work, but necessary to make > hot plug safe. > > Quite a few devices don't do that. > > Some of them can be usefully hot-plugged, and for them unclean failures > are simply bugs. I'm going to mark the ones I can find. > > Others are used only as onboard devices, and if their realize() fails, > the machine's init() will exit()[*]. In an ideal world, we'd start with > an empty board and cold-plugg devices, and there, clean failure may be > useful. In the world we live in, making these devices fail cleanly is a > lot of tedious work for no immediate gain. > > Example: machine "kzm" and device "fsl,imx31". fsl_imx31_realize() > returns without cleanup on error. kzm_init() exit(1)s when realize > fails, so the lack of cleanup is a non-issue. > > I think this is basically okay for now, but I'd like us to mark these > devices cannot_instantiate_with_device_add_yet, with /* Reason: > realize() method fails uncleanly */. > > Opinions? > > Next, let's consider the special case "out of memory". > > Our general approach is to treat it as immediately fatal. This makes > sense, because when a smallish allocation fails, the process is almost > certainly doomed anyway. Moreover, memory allocation is frequent, and > attempting to recover from failed memory allocation adds loads of > hard-to-test error paths. These are *dangerous* unless carefully tested > (and we don't). > > Certain important allocations we handle more gracefully. For instance, > we don't want to die when the user tries to hot-plug more memory than we > can allocate, or tries to open a QCOW2 image with a huge L1 table. > > Guest memory allocation used to have the "immediately fatal" policy > baked in at a fairly low level, but it's since been lifted into callers; > see commit c261d77..fc7a580 and fixups 4f96676..0bdaa3a. During review > of the latter, Peter Crosthwaite called out the _fatal in the > realize methods and their supporting code. I agreed with him back then > that the errors should really be propagated. But now I've changed my > mind: I think we should treat these memory allocation failures like > we've always treated them, namely report and exit(1). Except for > "large" allocations, where we have a higher probability of failure, and > a more realistic chance to recover safely. > > Can we agree that passing _fatal to memory_region_init_ram() & > friends is basically okay even in realize() methods and their supporting > code? I'd say it depends if they can be hotplugged; I think anything that we really want to hotplug onto real running machines (as opposed to where we're just hotplugging during setup) we should propagate errors properly. And tbh I don't buy the small allocation argument; I think we should handle them all; in my utopian world a guest wouldn't die unless there was no way out. Dave > > [*] Well, the ones that bother to check for errors, but that's a > separate problem. > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests
On Tue, 8 Dec 2015 15:24:29 +0100 Christian Borntraegerwrote: > On 12/08/2015 03:10 PM, Kevin Wolf wrote: > > So this might be a libc bug on s390 then. > > Fixed with > https://sourceware.org/ml/libc-alpha/2013-01/msg00853.html OK, so I need to upgrade that system; no bug in qemu. Thank you for looking into this!
[Qemu-devel] Error handling in realize() methods
In general, code running withing a realize() method should not exit() on error. Instad, errors should be propagated through the realize() method. Additionally, the realize() method should fail cleanly, i.e. carefully undo its side effects such as wiring of interrupts, mapping of memory, and so forth. Tedious work, but necessary to make hot plug safe. Quite a few devices don't do that. Some of them can be usefully hot-plugged, and for them unclean failures are simply bugs. I'm going to mark the ones I can find. Others are used only as onboard devices, and if their realize() fails, the machine's init() will exit()[*]. In an ideal world, we'd start with an empty board and cold-plugg devices, and there, clean failure may be useful. In the world we live in, making these devices fail cleanly is a lot of tedious work for no immediate gain. Example: machine "kzm" and device "fsl,imx31". fsl_imx31_realize() returns without cleanup on error. kzm_init() exit(1)s when realize fails, so the lack of cleanup is a non-issue. I think this is basically okay for now, but I'd like us to mark these devices cannot_instantiate_with_device_add_yet, with /* Reason: realize() method fails uncleanly */. Opinions? Next, let's consider the special case "out of memory". Our general approach is to treat it as immediately fatal. This makes sense, because when a smallish allocation fails, the process is almost certainly doomed anyway. Moreover, memory allocation is frequent, and attempting to recover from failed memory allocation adds loads of hard-to-test error paths. These are *dangerous* unless carefully tested (and we don't). Certain important allocations we handle more gracefully. For instance, we don't want to die when the user tries to hot-plug more memory than we can allocate, or tries to open a QCOW2 image with a huge L1 table. Guest memory allocation used to have the "immediately fatal" policy baked in at a fairly low level, but it's since been lifted into callers; see commit c261d77..fc7a580 and fixups 4f96676..0bdaa3a. During review of the latter, Peter Crosthwaite called out the _fatal in the realize methods and their supporting code. I agreed with him back then that the errors should really be propagated. But now I've changed my mind: I think we should treat these memory allocation failures like we've always treated them, namely report and exit(1). Except for "large" allocations, where we have a higher probability of failure, and a more realistic chance to recover safely. Can we agree that passing _fatal to memory_region_init_ram() & friends is basically okay even in realize() methods and their supporting code? [*] Well, the ones that bother to check for errors, but that's a separate problem.
Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests
Am 08.12.2015 um 14:58 hat Christian Borntraeger geschrieben: > On 12/08/2015 02:45 PM, Kevin Wolf wrote: > > Am 08.12.2015 um 14:28 hat Christian Borntraeger geschrieben: > >> On 12/08/2015 01:30 PM, Christian Borntraeger wrote: > >>> On 12/08/2015 01:00 PM, Cornelia Huck wrote: > On Tue, 8 Dec 2015 10:59:54 +0100 > Kevin Wolfwrote: > > > Am 07.12.2015 um 17:42 hat Cornelia Huck geschrieben: > >> On Mon, 7 Dec 2015 11:02:51 +0100 > >> Cornelia Huck wrote: > >> > >>> On Thu, 3 Dec 2015 13:00:00 +0800 > >>> Stefan Hajnoczi wrote: > >>> > From: Fam Zheng > > The assertion problem was noticed in 06c3916b35a, but it wasn't > completely fixed, because even though the req is not marked as > serialising, it still gets serialised by wait_serialising_requests > against other serialising requests, which could lead to the same > assertion failure. > > Fix it by even more explicitly skipping the serialising for this > specific case. > > Signed-off-by: Fam Zheng > Message-id: 1448962590-2842-2-git-send-email-f...@redhat.com > Signed-off-by: Stefan Hajnoczi > --- > block/backup.c| 2 +- > block/io.c| 12 +++- > include/block/block.h | 4 ++-- > trace-events | 2 +- > 4 files changed, 11 insertions(+), 9 deletions(-) > >>> > >>> This one causes segfaults for me: > >>> > >>> Program received signal SIGSEGV, Segmentation fault. > >>> bdrv_is_inserted (bs=0x8000) at > >>> /data/git/yyy/qemu/block.c:3071 > >>> 3071 if (!drv) { > >>> > >>> (gdb) bt > >>> #0 bdrv_is_inserted (bs=0x8000) at > >>> /data/git/yyy/qemu/block.c:3071 > > > > This looks like some kind of memory corruption that hit blk->bs. It's > > most definitely not a valid pointer anyway. > > > >>> #1 0x80216974 in blk_is_inserted (blk=) > >>> at /data/git/yyy/qemu/block/block-backend.c:986 > >>> #2 0x802169c6 in blk_is_available > >>> (blk=blk@entry=0x3ffb17e7960) > >>> at /data/git/yyy/qemu/block/block-backend.c:991 > >>> #3 0x80216d12 in blk_check_byte_request > >>> (blk=blk@entry=0x3ffb17e7960, > >>> offset=offset@entry=4928966656, size=16384) > >>> at /data/git/yyy/qemu/block/block-backend.c:558 > >>> #4 0x80216df2 in blk_check_request > >>> (blk=blk@entry=0x3ffb17e7960, > >>> sector_num=sector_num@entry=9626888, > >>> nb_sectors=nb_sectors@entry=32) > >>> at /data/git/yyy/qemu/block/block-backend.c:589 > >>> #5 0x80217ee8 in blk_aio_readv (blk=0x3ffb17e7960, > >>> sector_num= > >>> 9626888, iov=0x8098c658, nb_sectors=, cb= > >>> 0x80081150 , opaque=0x80980620) > >>> at /data/git/yyy/qemu/block/block-backend.c:727 > >>> #6 0x8008186e in submit_requests (niov=, > >>> num_reqs=, start=, mrb= >>> out>, > >>> blk=) at > >>> /data/git/yyy/qemu/hw/block/virtio-blk.c:366 > >>> #7 virtio_blk_submit_multireq (mrb=, blk= >>> out>) > >>> at /data/git/yyy/qemu/hw/block/virtio-blk.c:444 > >>> #8 virtio_blk_submit_multireq (blk=0x3ffb17e7960, mrb=0x3ffeb58) > >>> at /data/git/yyy/qemu/hw/block/virtio-blk.c:389 > >>> #9 0x800823ee in virtio_blk_handle_output (vdev= >>> out>, > >>> vq=) at > >>> /data/git/yyy/qemu/hw/block/virtio-blk.c:615 > >>> #10 0x801e367e in aio_dispatch (ctx=0x80918520) > >>> at /data/git/yyy/qemu/aio-posix.c:326 > >>> #11 0x801d28b0 in aio_ctx_dispatch (source=, > >>> callback=, user_data=) > >>> at /data/git/yyy/qemu/async.c:231 > >>> #12 0x03fffd36a05a in g_main_context_dispatch () > >>>from /lib64/libglib-2.0.so.0 > >>> #13 0x801e0ffa in glib_pollfds_poll () > >>> at /data/git/yyy/qemu/main-loop.c:211 > >>> #14 os_host_main_loop_wait (timeout=) > >>> at /data/git/yyy/qemu/main-loop.c:256 > >>> #15 main_loop_wait (nonblocking=) > >>> at /data/git/yyy/qemu/main-loop.c:504 > >>> #16 0x800148a6 in main_loop () at /data/git/yyy/qemu/vl.c:1923 > >>> #17 main (argc=, argv=, envp= >>> out>) > >>> at /data/git/yyy/qemu/vl.c:4684 > >>> > >>> Relevant part of command line: > >>> > >>> -drive > >>> file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none > >>> -device > >>> virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off > >> > >> I played around a bit. The main part of this change seems
[Qemu-devel] [PATCH 4/7] igd: TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE: call parent realize
Signed-off-by: Gerd Hoffmann--- hw/pci-host/igd.c | 9 + 1 file changed, 9 insertions(+) diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c index d1eeafb..6f52ab1 100644 --- a/hw/pci-host/igd.c +++ b/hw/pci-host/igd.c @@ -53,12 +53,20 @@ out: return ret; } +static void (*i440fx_realize)(PCIDevice *pci_dev, Error **errp); static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp) { +Error *err = NULL; uint32_t val = 0; int rc, i, num; int pos, len; +i440fx_realize(pci_dev, ); +if (err != NULL) { +error_propagate(errp, err); +return; +} + num = ARRAY_SIZE(igd_host_bridge_infos); for (i = 0; i < num; i++) { pos = igd_host_bridge_infos[i].offset; @@ -77,6 +85,7 @@ static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data) DeviceClass *dc = DEVICE_CLASS(klass); PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); +i440fx_realize = k->realize; k->realize = igd_pt_i440fx_realize; dc->desc = "IGD Passthrough Host bridge"; } -- 1.8.3.1
Re: [Qemu-devel] Qemu Crashing
On 8 December 2015 at 07:43, Saqib Khanwrote: > I compiled Qemu with following command: > > ./configure --target-list=x86_64-softmmu --enable-debug > > Then I started up my VM using following command : > > /home/user/qemu/qemu/bin/debug/native/x86_64-softmmu/qemu-system-x86_64-m > 1024 -drive if=ide,file=Windows7qemu.qcow2,cache=none -cdrom Win7Sp1.iso > > Qemu keep crashing with segmentation fault have attached PNG showing seg > fault at 2 different locations as in attacments: Which version of QEMU are you building? Does the bug still occur if you use the latest 2.5.0-rc3 ? thanks -- PMM
Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests
On 12/08/2015 02:58 PM, Christian Borntraeger wrote: [...9 >>> >>> Not a compiler bug. gcc uses a floating point register 8 to spill >>> the pointer of blk (which is call saved) submit_request will later >>> on call qemu_coroutine_enter and after returning from >>> qemu_coroutine_enter, the fpr8 contains junk. Not sure yet, what happened. >> >> Coroutines don't save the FPU state, so you're not supposed to use >> floating point operations inside coroutines. That the compiler spills >> some integer value into a floating point register is a bit nasty... > > Just checked. bdrv_aligned_preadv does also use fprs (also for filling > and spilling). Some versions of gcc seem to like that as the LDGR and LGDR > instructions are pretty cheap and move the content from/to fprs in a bitwise > fashion. So this coroutine DOES trash floating point registers. > > Without the patch gcc seems to be fine with the 16 gprs and does not > spilling/filling from/to fprs in bdrv_aligned_preadv. > > Christian Kevin, I am wondering. gcc saves/restores f8 in the generated code for the coroutine and setjmp/longjmp also save/restore the fprs. why do coroutines do not save the FPU state (which code does a light weight switching) Christian
Re: [Qemu-devel] [v3 3/3] configure: add options to config avx2
> On 8 December 2015 at 12:08, Liang Liwrote: > > Add the '--enable-avx2' & '--disable-avx2' option so as to config the > > AVX2 instruction optimization. > > > > If '--disable-avx2' is not set, configure will detect if the compiler > > can support AVX2 option, if yes, AVX2 optimization is eabled, else > > disabled. > > Is the configure option necessary? For other things like this (eg our use of > SSE2 or Altivec) we just go ahead and use the feature if the compiler > supports it. > It seems unnecessary. > When would somebody building QEMU want to disable this option? > > thanks > -- PMM The v1 of this patch had the '--enable-avx2' & '--disable-avx2' options because this version did not support ifunc, and I left them here in the following version ... I will remove them if they are unnecessary. Thanks for your comments. Liang
Re: [Qemu-devel] [PATCH 1/2] Add param Error** to msi_init() & modify the callers
Cao jinwrites: > Hi Markus, > I have to say, you really did a amazing review for this "trivial > "patch, thanks a lot & really appreciate it:) Thanks! I'm afraid the problem you picked isn't trivial, but I hope it's still simple enough to be a useful exercise to get you going with the code. > On 12/07/2015 05:59 PM, Markus Armbruster wrote: >> Cao jin writes: >> >>> msi_init() is a supporting function in PCI device initialization, in order >>> to >>> convert .init() to .realize(), it should be modified first. Also modify the >>> callers >>> >>> Bonus: add more comment for msi_init(). >> >> Incomplete. See notes on impact inline. >> >>> Signed-off-by: Cao jin >>> --- >>> hw/audio/intel-hda.c | 7 ++- >>> hw/ide/ich.c | 2 +- >>> hw/net/vmxnet3.c | 3 ++- >>> hw/pci-bridge/ioh3420.c| 6 +- >>> hw/pci-bridge/pci_bridge_dev.c | 6 +- >>> hw/pci-bridge/xio3130_downstream.c | 7 ++- >>> hw/pci-bridge/xio3130_upstream.c | 7 ++- >>> hw/pci/msi.c | 17 + >>> hw/scsi/megasas.c | 12 +--- >>> hw/scsi/vmw_pvscsi.c | 3 ++- >>> hw/usb/hcd-xhci.c | 5 - >>> hw/vfio/pci.c | 3 ++- >>> include/hw/pci/msi.h | 4 ++-- >>> 13 files changed, 63 insertions(+), 19 deletions(-) >>> >>> diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c >>> index 433463e..9d733da 100644 >>> --- a/hw/audio/intel-hda.c >>> +++ b/hw/audio/intel-hda.c >>> @@ -1130,6 +1130,7 @@ static void intel_hda_realize(PCIDevice *pci, Error >>> **errp) >>> { >>> IntelHDAState *d = INTEL_HDA(pci); >>> uint8_t *conf = d->pci.config; >>> +int ret; >>> >>> d->name = object_get_typename(OBJECT(d)); >>> >>> @@ -1142,7 +1143,11 @@ static void intel_hda_realize(PCIDevice *pci, Error >>> **errp) >>> "intel-hda", 0x4000); >>> pci_register_bar(>pci, 0, 0, >mmio); >>> if (d->msi) { >>> -msi_init(>pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, false); >>> +ret = msi_init(>pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, >>> +false, errp); >>> +if(ret < 0) { >> >> Please use scripts/checkpatch.pl to check your patches. It's >> occasionally wrong, so use your judgement. >> > > Thanks for the tips, seems I got dizzy looking because many trivial > place need to be modified... > >>> +return; >> >> This returns with the device in a half-realized state. Do we have to >> undo prior side effects to put it back into unrealized state? See also >> ioh3420_initfn() below. >> >> Before: msi_init() failure is ignored. After: it makes device >> realization fail. To assess impact, we need to understand how >> msi_init() can fail. >> > > It seems I missed the reality: devices are default to be hot-pluggable > & most devices are hot-pluggable:-[ Because when cold plugged, process > will exit on device-init failing, so, the half-realized state doesn`t > matter in this condition. > Will rework it later. In theory, realize() should always fail cleanly. In practice, unclean realize() failure doesn't matter when it's fatal anyway. Some devices are only used where it's always fatal. See also "Error handling in realize() methods" I just sent to the list; I hope we can come up with some guidance on when shortcuts in realize() methods are tolerable. >>> +} >>> } >>> >>> hda_codec_bus_init(DEVICE(pci), >codecs, sizeof(d->codecs), >>> diff --git a/hw/ide/ich.c b/hw/ide/ich.c >>> index 16925fa..94b1809 100644 >>> --- a/hw/ide/ich.c >>> +++ b/hw/ide/ich.c >>> @@ -145,7 +145,7 @@ static void pci_ich9_ahci_realize(PCIDevice *dev, Error >>> **errp) >>> /* Although the AHCI 1.3 specification states that the first >>> capability >>>* should be PMCAP, the Intel ICH9 data sheet specifies that the ICH9 >>>* AHCI device puts the MSI capability first, pointing to 0x80. */ >>> -msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false); >>> +msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false, errp); >> >> Do we have to put the device back into unrealized state on failure? >> >>> } >>> >>> static void pci_ich9_uninit(PCIDevice *dev) >>> diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c >>> index 5e3a233..4269141 100644 >>> --- a/hw/net/vmxnet3.c >>> +++ b/hw/net/vmxnet3.c >>> @@ -2140,9 +2140,10 @@ vmxnet3_init_msi(VMXNET3State *s) >>> { >>> PCIDevice *d = PCI_DEVICE(s); >>> int res; >>> +Error *local_err = NULL; >>> >>> res = msi_init(d, VMXNET3_MSI_OFFSET, VMXNET3_MAX_NMSIX_INTRS, >>> - VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK); >>> + VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK, _err); >>> if (0 > res) { >>> VMW_WRPRN("Failed to
[Qemu-devel] [PATCH 6/7] igd: revamp host config read
Move all work to the host_pci_config_copy helper function, which we can easily reuse when adding q35 support. Open sysfs file only once for all values. Use pread. Proper error handling. Fix bugs: * Don't throw away results (like old host_pci_config_read did because val was passed by value not reference). * Update config space directly (writing via pci_default_write_config only works for registers whitelisted in wmask). Hmm, this code can hardly ever worked before, /me wonders what test coverage it had. With this patch in place igd-passthru=on actually works, although it still requires root priviledges because linux refuses to allow non-root users access pci config space above offset 0x50. Signed-off-by: Gerd Hoffmann--- hw/pci-host/igd.c | 65 +++ 1 file changed, 27 insertions(+), 38 deletions(-) diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c index 0784128..ec48875 100644 --- a/hw/pci-host/igd.c +++ b/hw/pci-host/igd.c @@ -19,47 +19,39 @@ static const IGDHostInfo igd_host_bridge_infos[] = { {0xa8, 4}, /* SNB: base of GTT stolen memory */ }; -static int host_pci_config_read(int pos, int len, uint32_t val) +static void host_pci_config_copy(PCIDevice *guest, const char *host, + const IGDHostInfo *list, int len, Error **errp) { -char path[PATH_MAX]; -int config_fd; -ssize_t size = sizeof(path); -/* Access real host bridge. */ -int rc = snprintf(path, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s", - 0, 0, 0, 0, "config"); -int ret = 0; +char *path; +int config_fd, rc, i; -if (rc >= size || rc < 0) { -return -ENODEV; -} - -config_fd = open(path, O_RDWR); +path = g_strdup_printf("/sys/bus/pci/devices/%s/config", host); +config_fd = open(path, O_RDONLY); if (config_fd < 0) { -return -ENODEV; +error_setg_file_open(errp, errno, path); +goto out_free; } -if (lseek(config_fd, pos, SEEK_SET) != pos) { -ret = -errno; -goto out; +for (i = 0; i < len; i++) { +rc = pread(config_fd, guest->config + list[i].offset, + list[i].len, list[i].offset); +if (rc != list[i].len) { +error_setg_errno(errp, errno, "read %s, offset 0x%x", + path, list[i].offset); +goto out_close; +} } -do { -rc = read(config_fd, (uint8_t *), len); -} while (rc < 0 && (errno == EINTR || errno == EAGAIN)); -if (rc != len) { -ret = -errno; -} -out: + +out_close: close(config_fd); -return ret; +out_free: +g_free(path); } static void (*i440fx_realize)(PCIDevice *pci_dev, Error **errp); static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp) { Error *err = NULL; -uint32_t val = 0; -int rc, i, num; -int pos, len; i440fx_realize(pci_dev, ); if (err != NULL) { @@ -67,16 +59,13 @@ static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp) return; } -num = ARRAY_SIZE(igd_host_bridge_infos); -for (i = 0; i < num; i++) { -pos = igd_host_bridge_infos[i].offset; -len = igd_host_bridge_infos[i].len; -rc = host_pci_config_read(pos, len, val); -if (rc) { -error_setg(errp, "failed to read host config"); -return; -} -pci_default_write_config(pci_dev, pos, val, len); +host_pci_config_copy(pci_dev, ":00:00.0", + igd_host_bridge_infos, + ARRAY_SIZE(igd_host_bridge_infos), + ); +if (err != NULL) { +error_propagate(errp, err); +return; } } -- 1.8.3.1
[Qemu-devel] [PATCH 7/7] igd: add q35 support
Signed-off-by: Gerd Hoffmann--- hw/pci-host/igd.c | 41 - hw/pci-host/q35.c | 6 +- 2 files changed, 45 insertions(+), 2 deletions(-) diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c index ec48875..f6e3f7a 100644 --- a/hw/pci-host/igd.c +++ b/hw/pci-host/igd.c @@ -1,5 +1,6 @@ #include "qemu-common.h" #include "hw/pci/pci.h" +#include "hw/pci-host/q35.h" #include "hw/i386/pc.h" /* IGD Passthrough Host Bridge. */ @@ -76,7 +77,7 @@ static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data) i440fx_realize = k->realize; k->realize = igd_pt_i440fx_realize; -dc->desc = "IGD Passthrough Host bridge"; +dc->desc = "IGD Passthrough Host bridge (i440fx)"; } static const TypeInfo igd_passthrough_i440fx_info = { @@ -85,9 +86,47 @@ static const TypeInfo igd_passthrough_i440fx_info = { .class_init= igd_passthrough_i440fx_class_init, }; +static void (*q35_realize)(PCIDevice *pci_dev, Error **errp); +static void igd_pt_q35_realize(PCIDevice *pci_dev, Error **errp) +{ +Error *err = NULL; + +q35_realize(pci_dev, ); +if (err != NULL) { +error_propagate(errp, err); +return; +} + +host_pci_config_copy(pci_dev, ":00:00.0", + igd_host_bridge_infos, + ARRAY_SIZE(igd_host_bridge_infos), + ); +if (err != NULL) { +error_propagate(errp, err); +return; +} +} + +static void igd_passthrough_q35_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); +PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); + +q35_realize = k->realize; +k->realize = igd_pt_q35_realize; +dc->desc = "IGD Passthrough Host bridge (q35)"; +} + +static const TypeInfo igd_passthrough_q35_info = { +.name = "igd-passthrough-q35-mch", +.parent= TYPE_MCH_PCI_DEVICE, +.class_init= igd_passthrough_q35_class_init, +}; + static void igd_register_types(void) { type_register_static(_passthrough_i440fx_info); +type_register_static(_passthrough_q35_info); } type_init(igd_register_types) diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c index 1fb4707..07dc595 100644 --- a/hw/pci-host/q35.c +++ b/hw/pci-host/q35.c @@ -151,7 +151,11 @@ static void q35_host_initfn(Object *obj) memory_region_init_io(>data_mem, obj, _host_data_le_ops, phb, "pci-conf-data", 4); -object_initialize(>mch, sizeof(s->mch), TYPE_MCH_PCI_DEVICE); +if (object_property_get_bool(qdev_get_machine(), "igd-passthru", NULL)) { +object_initialize(>mch, sizeof(s->mch), "igd-passthrough-q35-mch"); +} else { +object_initialize(>mch, sizeof(s->mch), TYPE_MCH_PCI_DEVICE); +} object_property_add_child(OBJECT(s), "mch", OBJECT(>mch), NULL); qdev_prop_set_uint32(DEVICE(>mch), "addr", PCI_DEVFN(0, 0)); qdev_prop_set_bit(DEVICE(>mch), "multifunction", false); -- 1.8.3.1
Re: [Qemu-devel] [PULL for-2.5 2/4] block: Don't wait serialising for non-COR read requests
On 8 December 2015 at 13:45, Kevin Wolfwrote: > Coroutines don't save the FPU state, so you're not supposed to use > floating point operations inside coroutines. That the compiler spills > some integer value into a floating point register is a bit nasty... The compiler will happily use FP registers even for apparently integer code if it thinks that is a better way to do it (eg on some CPUs doing memcpy and other kinds of block data move may go faster via the FPU registers, or it might be faster to spill an integer register into an FP register rather than spilling it to memory). As I see you've already determined, it's the job of setjmp/longjmp to make sure that everything is saved and restored correctly, fp or otherwise... thanks -- PMM
Re: [Qemu-devel] [PATCH v9 00/10] qcow2: Support refcount order amendment
Am 27.07.2015 um 17:51 hat Max Reitz geschrieben: > (v1..v7 were named "qcow2: Support refcount orders != 4") > > This series contains the final 10 patches of my qcow2 refcount order > series, which add refcount order amendment functionality to qemu-img. Thanks, applied to block-next (after some trivial rebasing). Kevin
Re: [Qemu-devel] FD passing for chardevs and chardev backend multiplexing
On 12/08/2015 07:59 AM, Daniel P. Berrange wrote: > So for this my plan is to stop using the QEMU 'file' backend for char > devs and instead pass across a pre-opened file descriptor, connected > to virtlogd. There is no "officially documented" way to pass in a > file descriptor to QEMU chardevs, but since QEMU uses qemu_open(), > we can make use of the fdset feature to achieve this. eg > > eg, consider fd 33 is the write end of a pipe file descriptor > I can (in theory) do > > -add-fd set=2,fd=33 -chardev file,id=charserial0,path=/dev/fdset/2 > > Now in practice this doesn't work, because qmp_chardev_open_file() > passes the O_CREAT|O_TRUNC flags in, which means the qemu_open() > call will fail when using the pipe FD pased in via fdsets. Is it just the O_TRUNC that is failing? If so, there is a recent patch to add an 'append':true flag that switches O_TRUNC off in favor of O_APPEND: https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg00762.html Or is it that the pipe is one-way, but chardev insists on O_RDWR and fails because it is not two-way? > > After more investigation I found it *is* possible to use a socketpair > and a pipe backend though... > > -add-fd set=2,fd=33 -chardev pipe,id=charserial0,path=/dev/fdset/2 Yes, a socketpair is bi-directional, so it supports O_RDWR opening. > > ..because for reasons I don't understand, if QEMU can't open $PATH.in > and $PATH.out, it'll fallback to just opening $PATH in read-write > mode even. AFAICT, this is pretty useless with pipes since they > are unidirectional, but, it works nicely with socketpairs, where > virtlogd has one of the socketpairs and QEMU gets passed the other > via fdset. Is it something where we'd want to support two pipes, and open /dev/fdset/2 tied to char.in and /dev/fdset/3 tied to char.out, where uni-directional pipes are again okay? > > I can easily check this works for historical QEMU versions back > to when fdsets support was added to chardevs, but I'm working if > the QEMU maintainers consider this usage acceptable over the long > term, and if so, should we explicitly document it as supported ? It seems like a bi-directional socketpair as the single endpoint for a chardev is useful enough to support and document, but I'm not the maintainer to give final say-so. > > If not, should we introduce a more explicit syntax for passing in > a pre-opened FD for chardevs ? eg > > -add-fd set=2,fd=33 -chardev fd,id=charserial0,path=/dev/fdset/2 > Difference to the line you tried above: > -add-fd set=2,fd=33 -chardev file,id=charserial0,path=/dev/fdset/2 is 'fd' instead of 'file'. But if we're going to add a new protocol, do we even need to go through the "/dev/fdset/..." name, or can we just pass the fd number directly? > Or just make -chardev file,id=charserial0,path=/dev/fdset/2 actually > work ? I'd lean more to this case - the whole point of fdsets was that we don't have to add multiple fd protocols; that everyone that understood file syntax and uses qemu_open() magically gained fd support. > > Or something else ? > > > OpenStack has a further requirement to allow use of the serial port > as an interactive console, at the same time that it is logging to a > file which is something QEMU can't support at all currently. This > essentially means being able to have multiple chardev backends all > connected to the same serial frontend - specifically we would need > a TCP backend and a file backend concurrently. Again this could be > implemented in QEMU, but we'd prefer something that works with > existing QEMU. > > This is not too difficult to achieve with virtlogd really. Instead > of using the QEMU 'tcp' or 'unix' chardev protocol, we'd just always > pass QEMU a pre-opened socketpair, and then leave the TCP/UNIX > socket listening to the virtlogd daemon. > > This is portable with existing QEMU versions, but the obvious downside > with this is extra copies in the interactive console path. So might it > be worth exploring the posibility of a chardev multiplexor in QEMU. We > would still pass in a pre-opened socketpair to QEMU for the logging side > of things, but would leave the TCP/UNIX socket listening upto QEMU still. > > eg should we make something like this work: > > -add-fd set=2,fd=33 > -chardev pipe,id=charserial0file,path=/dev/fdset/2 > -chardev > socket,id=charserial0tcp,host=127.0.0.1,port=,telnet,server,nowait > -chardev multiplex,id=charserial0,muxA=charserial0file,muxB=charserial1 wouldn't muxB be charserial0tcp (not charserial1)? > -serial isa-serial,chardev=charserial0,id=serial0 But the idea of a multiplex protocol that has multiple data sinks (guest output copied to all sinks) and a single source (at most one source can provide input to the guest) makes sense on the surface. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] qom: change object property iterator API contract
On 11/27/2015 08:27 AM, Daniel P. Berrange wrote: > Currently the object property iterator API works as follows > > ObjectPropertyIterator *iter; > > iter = object_property_iter_init(obj); > while ((prop = object_property_iter_next(iter))) { > ... > } > object_property_iter_free(iter); > > This has the benefit that the ObjectPropertyIterator struct > can be opaque, but has the downside that callers need to > explicitly call a free function. It is also not in keeping > with iterator style used elsewhere in QEMU/glib2 > > This patch changes the API to use stack allocation instead > > ObjectPropertyIterator iter; > > object_property_iter_init(, obj); > while ((prop = object_property_iter_next())) { > ... > } > > Signed-off-by: Daniel P. Berrange> --- > > NB, this patch is not against master, it is intended to apply > after > > "qom: allow properties to be registered against classes" > > which is queued in qom-next for 2.6 > > hw/ppc/spapr_drc.c | 7 +++ > include/qom/object.h | 42 +++--- > net/filter.c | 7 +++ > qmp.c | 14 ++ > qom/object.c | 22 -- > tests/check-qom-proplist.c | 7 +++ > vl.c | 7 +++ > 7 files changed, 49 insertions(+), 57 deletions(-) > > +++ b/include/qom/object.h > @@ -346,6 +346,7 @@ typedef struct ObjectProperty > void *opaque; > } ObjectProperty; > > + > /** > * ObjectUnparent: > * @obj: the object that is being removed from the composition tree Spurious whitespace change? > > + /** > + * object_property_iter_free: > + * @iter: the iterator instance > + * > + * Releases any resources associated with the iterator. It is > + * not necessary to call this method if object_property_iter_next > + * has returned %NULL. It is only required if an application wishes > + * to abort iteration before it is complete > + */ > +void object_property_iter_free(ObjectPropertyIterator *iter); > + Huh? Why is this being added? I thought the point was to get rid of the need for object_property_iter_free(). > +++ b/qom/object.c > @@ -67,11 +67,6 @@ struct TypeImpl Other than that snafu, everything else looked fine. If that's all you fix for v2, you can add: Reviewed-by: Eric Blake -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v3 4/5] crypto: add QCryptoSecret object class for password/key handling
On 11/27/2015 09:30 AM, Daniel P. Berrange wrote: > Introduce a new QCryptoSecret object class which will be used > for providing passwords and keys to other objects which need > sensitive credentials. > > More examples are shown in the updated docs. > > Signed-off-by: Daniel P. Berrange> --- > +++ b/crypto/secret.c > +static void > +qcrypto_secret_load_data(QCryptoSecret *secret, > + uint8_t **output, > + size_t *outputlen, > + Error **errp) > +{ > +if (!g_file_get_contents(secret->file, , , )) { > +error_setg(errp, > + "Unable to read %s: %s", > + secret->file, gerr->message); > +g_error_free(gerr); > +return; > +} > +if (length) { > +/* Even though data is raw 8-bit, so may contain > + * arbitrary NULs, ensure it is explicitly NUL > + * terminated */ > +*output = g_renew(uint8_t, data, length + 1); > +(*output)[length] = '\0'; These two lines are dead code. g_file_get_contents() guarantees that on success, contents is malloc'd large enough and that contents[length] == 0. https://developer.gnome.org/glib/stable/glib-File-Utilities.html#g-file-get-contents > +*outputlen = length; > +} else { > +error_setg(errp, "Secret file %s is empty", > + secret->file); Is there any technical reason why we must forbid a 0-length password? (Sometimes, having the empty string as a password can be a useful for development tests). I'm not opposed to rejecting it, especially if doing so now avoids a more cryptic error message later because there is indeed a technical reason; but just want to make sure it is not an arbitrary limitation. > +g_free(data); > +} > +} else if (secret->data) { > +*outputlen = strlen(secret->data); > +*output = g_new(uint8_t, *outputlen + 1); > +memcpy(*output, secret->data, *outputlen + 1); These two lines could be shortened to: *output = g_strdup(secret->data); > + > +static void qcrypto_secret_decrypt(QCryptoSecret *secret, > + const uint8_t *input, > + size_t inputlen, > + uint8_t **output, > + size_t *outputlen, > + Error **errp) > +{ > +if (secret->format == QCRYPTO_SECRET_FORMAT_BASE64) { > +ciphertext = qbase64_decode((const gchar*)input, > +inputlen, > +, > +errp); > +if (!ciphertext) { > +goto cleanup; > +} > +plaintext = g_new0(uint8_t, ciphertextlen + 1); > +} else { > +ciphertextlen = inputlen; > +plaintext = g_new0(uint8_t, inputlen + 1); g_new0(uint8_t, value) is the same as g_malloc0(value); I don't know if it is worth the distinction. But not worth a respin on its own. I found some style or efficiency things you might want to touch up, but no actual bugs that would prevent this patch from being usable as-is. Reviewed-by: Eric Blake -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [iGVT-g] [PATCH 6/7] igd: revamp host config read
On 12/08/2015 10:07 PM, Gerd Hoffmann wrote: Move all work to the host_pci_config_copy helper function, which we can easily reuse when adding q35 support. Open sysfs file only once for all values. Use pread. Proper error handling. Fix bugs: * Don't throw away results (like old host_pci_config_read did because val was passed by value not reference). * Update config space directly (writing via pci_default_write_config only works for registers whitelisted in wmask). Hmm, this code can hardly ever worked before, /me wonders what test coverage it had. WOW, that's really impressive! :) -- Thanks, Jike With this patch in place igd-passthru=on actually works, although it still requires root priviledges because linux refuses to allow non-root users access pci config space above offset 0x50. Signed-off-by: Gerd Hoffmann--- hw/pci-host/igd.c | 65 +++ 1 file changed, 27 insertions(+), 38 deletions(-) diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c index 0784128..ec48875 100644 --- a/hw/pci-host/igd.c +++ b/hw/pci-host/igd.c @@ -19,47 +19,39 @@ static const IGDHostInfo igd_host_bridge_infos[] = { {0xa8, 4}, /* SNB: base of GTT stolen memory */ }; -static int host_pci_config_read(int pos, int len, uint32_t val) +static void host_pci_config_copy(PCIDevice *guest, const char *host, + const IGDHostInfo *list, int len, Error **errp) { -char path[PATH_MAX]; -int config_fd; -ssize_t size = sizeof(path); -/* Access real host bridge. */ -int rc = snprintf(path, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s", - 0, 0, 0, 0, "config"); -int ret = 0; +char *path; +int config_fd, rc, i; -if (rc >= size || rc < 0) { -return -ENODEV; -} - -config_fd = open(path, O_RDWR); +path = g_strdup_printf("/sys/bus/pci/devices/%s/config", host); +config_fd = open(path, O_RDONLY); if (config_fd < 0) { -return -ENODEV; +error_setg_file_open(errp, errno, path); +goto out_free; } -if (lseek(config_fd, pos, SEEK_SET) != pos) { -ret = -errno; -goto out; +for (i = 0; i < len; i++) { +rc = pread(config_fd, guest->config + list[i].offset, + list[i].len, list[i].offset); +if (rc != list[i].len) { +error_setg_errno(errp, errno, "read %s, offset 0x%x", + path, list[i].offset); +goto out_close; +} } -do { -rc = read(config_fd, (uint8_t *), len); -} while (rc < 0 && (errno == EINTR || errno == EAGAIN)); -if (rc != len) { -ret = -errno; -} -out: + +out_close: close(config_fd); -return ret; +out_free: +g_free(path); } static void (*i440fx_realize)(PCIDevice *pci_dev, Error **errp); static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp) { Error *err = NULL; -uint32_t val = 0; -int rc, i, num; -int pos, len; i440fx_realize(pci_dev, ); if (err != NULL) { @@ -67,16 +59,13 @@ static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp) return; } -num = ARRAY_SIZE(igd_host_bridge_infos); -for (i = 0; i < num; i++) { -pos = igd_host_bridge_infos[i].offset; -len = igd_host_bridge_infos[i].len; -rc = host_pci_config_read(pos, len, val); -if (rc) { -error_setg(errp, "failed to read host config"); -return; -} -pci_default_write_config(pci_dev, pos, val, len); +host_pci_config_copy(pci_dev, ":00:00.0", + igd_host_bridge_infos, + ARRAY_SIZE(igd_host_bridge_infos), + ); +if (err != NULL) { +error_propagate(errp, err); +return; } }
[Qemu-devel] [PATCH for-2.5] virtio-9p-device: add minimal unrealize handler
Since commit 4652f1640e029e1f2433fa77ba6af285 "virtio-9p: add savevm handlers", if the user hot-unplugs a quiescent 9p device and live migrates, the source QEMU crashes before migration completetion... This happens because virtio-9p devices have a realize handler which calls virtio_init() and register_savevm(). Both calls store pointers to the device internals, that get dereferenced during migration even if the device got unplugged. This patch simply adds an unrealize handler to perform minimal cleanup and avoid the crash. Hot unplug of non-quiescent 9p devices is still not supported in QEMU, and not supported by linux guests either. Signed-off-by: Greg Kurz--- hw/9pfs/virtio-9p-device.c | 12 1 file changed, 12 insertions(+) diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c index 944b5f5e9fcc..b42d3b30a027 100644 --- a/hw/9pfs/virtio-9p-device.c +++ b/hw/9pfs/virtio-9p-device.c @@ -145,6 +145,17 @@ out: v9fs_path_free(); } +static void virtio_9p_device_unrealize(DeviceState *dev, Error **errp) +{ +VirtIODevice *vdev = VIRTIO_DEVICE(dev); +V9fsState *s = VIRTIO_9P(dev); + +virtio_cleanup(vdev); +unregister_savevm(dev, "virtio-9p", s); +g_free(s->ctx.fs_root); +g_free(s->tag); +} + /* virtio-9p device */ static Property virtio_9p_properties[] = { @@ -161,6 +172,7 @@ static void virtio_9p_class_init(ObjectClass *klass, void *data) dc->props = virtio_9p_properties; set_bit(DEVICE_CATEGORY_STORAGE, dc->categories); vdc->realize = virtio_9p_device_realize; +vdc->unrealize = virtio_9p_device_unrealize; vdc->get_features = virtio_9p_get_features; vdc->get_config = virtio_9p_get_config; }
[Qemu-devel] [PATCH v2 1/3] lib/x86: Make free_page() available to call
This will be used to release allocated pages by Hyper-V SynIC timers test. Signed-off-by: Andrey SmetaninReviewed-by: Roman Kagan CC: Paolo Bonzini CC: Marcelo Tosatti CC: Roman Kagan CC: Denis V. Lunev CC: qemu-devel@nongnu.org --- lib/x86/vm.h | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/x86/vm.h b/lib/x86/vm.h index bd73840..28794d7 100644 --- a/lib/x86/vm.h +++ b/lib/x86/vm.h @@ -33,6 +33,7 @@ unsigned long *install_pte(unsigned long *cr3, unsigned long *pt_page); void *alloc_page(); +void free_page(void *page); unsigned long *install_large_page(unsigned long *cr3,unsigned long phys, void *virt); -- 2.4.3
Re: [Qemu-devel] [v3 1/3] cutils: add avx2 instruction optimization
On 12/08/2015 04:08 AM, Liang Li wrote: > +++ b/util/buffer-zero-avx2.c > @@ -0,0 +1,54 @@ > +#include "qemu-common.h" > + > +#if defined CONFIG_IFUNC && defined CONFIG_AVX2 > +#include > +#define AVX2_VECTYPE__m256i > +#define AVX2_SPLAT(p) _mm256_set1_epi8(*(p)) > +#define AVX2_ALL_EQ(v1, v2) \ > +(_mm256_movemask_epi8(_mm256_cmpeq_epi8(v1, v2)) == 0x) > +#define AVX2_VEC_OR(v1, v2) (_mm256_or_si256(v1, v2)) > + > +inline bool > +can_use_buffer_find_nonzero_offset_avx2(const void *buf, size_t len) > +{ > +return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR > + * sizeof(AVX2_VECTYPE)) == 0 > +&& ((uintptr_t) buf) % sizeof(AVX2_VECTYPE) == 0); > +} I'm not keen on adding a new file for this. You ought to be able to use __attribute__((target("-mavx2"))) on any compiler that supports the command-line option. Which means you can do this all in one file with static functions. Nor am I keen on marking a function inline when we know it must be out-of-line because of the ifunc usage. r~
Re: [Qemu-devel] [PATCH for-2.5] virtio-9p-device: add minimal unrealize handler
On Tue, Dec 08, 2015 at 04:54:57PM +0100, Greg Kurz wrote: > Since commit 4652f1640e029e1f2433fa77ba6af285 "virtio-9p: add savevm > handlers", > if the user hot-unplugs a quiescent 9p device and live migrates, the source > QEMU crashes before migration completetion... This happens because virtio-9p > devices have a realize handler which calls virtio_init() and > register_savevm(). > Both calls store pointers to the device internals, that get dereferenced > during > migration even if the device got unplugged. > > This patch simply adds an unrealize handler to perform minimal cleanup and > avoid the crash. Hot unplug of non-quiescent 9p devices is still not supported > in QEMU, and not supported by linux guests either. > > Signed-off-by: Greg KurzReviewed-by: Michael S. Tsirkin > --- > hw/9pfs/virtio-9p-device.c | 12 > 1 file changed, 12 insertions(+) > > diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c > index 944b5f5e9fcc..b42d3b30a027 100644 > --- a/hw/9pfs/virtio-9p-device.c > +++ b/hw/9pfs/virtio-9p-device.c > @@ -145,6 +145,17 @@ out: > v9fs_path_free(); > } > > +static void virtio_9p_device_unrealize(DeviceState *dev, Error **errp) > +{ > +VirtIODevice *vdev = VIRTIO_DEVICE(dev); > +V9fsState *s = VIRTIO_9P(dev); > + > +virtio_cleanup(vdev); > +unregister_savevm(dev, "virtio-9p", s); > +g_free(s->ctx.fs_root); > +g_free(s->tag); > +} > + > /* virtio-9p device */ > > static Property virtio_9p_properties[] = { > @@ -161,6 +172,7 @@ static void virtio_9p_class_init(ObjectClass *klass, void > *data) > dc->props = virtio_9p_properties; > set_bit(DEVICE_CATEGORY_STORAGE, dc->categories); > vdc->realize = virtio_9p_device_realize; > +vdc->unrealize = virtio_9p_device_unrealize; > vdc->get_features = virtio_9p_get_features; > vdc->get_config = virtio_9p_get_config; > }
Re: [Qemu-devel] [PATCH v3 1/5] util: add base64 decoding function
On 11/27/2015 09:30 AM, Daniel P. Berrange wrote: > The standard glib provided g_base64_decode doesn't provide any > kind of sensible error checking on its input. Add a QEMU custom > wrapper qbase64_decode which can be used with untrustworthy > input that can contain invalid base64 characters, embedded > NUL characters, or not be NUL terminated at all. > > Signed-off-by: Daniel P. Berrange> --- > + > +/** > + * qbase64_decode: > + * @input: the (possibly) base64 encoded text > + * @in_len: length of @input or -1 if NUL terminated > + * @out_len: filled with length of decoded data > + * @errp: pointer to uninitialized error object That almost implies that I could do: Error *err; qbase64_decode(,); In reality, it should be the NULL-initialized error object, as in: Error *err = NULL; but I don't know if there is a better way to represent it in text. At any rate, that phrase exists elsewhere, so it would be easier to do a tree-wide cleanup if we have a better terminology to use, and I won't hold up review on this patch for it. > + * > + * Attempt to decode the (possibly) base64 encoded > + * text provided in @input. If the @input text may > + * contain embedded NUL characters, or may not be > + * NUL terminated, then @in_len must be set to the > + * known size of the @input buffer. > + * > + * Note that embedded NULs, or lack of a NUL terminator > + * are considered invalid base64 data and errors > + * will be reported to this effect. > + * > + * If decoding is successful, the decoded data will > + * be returned and @out_len set to indicate the > + * number of bytes in the decoded data. > + * Maybe mention that caller must g_free() the successful result? > + * Returns: the decoded data or NULL > + */ > +uint8_t *qbase64_decode(const char *input, > +size_t in_len, > +size_t *out_len, > +Error **errp); > + > + > +++ b/tests/test-base64.c > @@ -0,0 +1,98 @@ > +static void test_base64_bad(const char *input, > +size_t input_len) > +{ > +size_t len; > +Error *err = NULL; > +uint8_t *actual = qbase64_decode(input, > + input_len, > + , > + ); > + > +g_assert(err != NULL); Could use _abort in the original call instead of a second check for err != NULL; but that's cosmetic. > +g_assert(actual == NULL); > +g_assert_cmpint(len, ==, 0); So you are testing that we initialize output length even on error, rather than leaving it uninitialized. That's fair. > + > +static void test_base64_embedded_nul(void) > +{ > +const char input[] = "There's no such\0thing as a free lunch."; > + > +test_base64_bad(input, G_N_ELEMENTS(input) - 1); > +} > + This asserts that you have a failure, but doesn't say what that failure would be... > + > +static void test_base64_not_nul_terminated(void) > +{ > +char input[] = "There's no such thing as a free lunch."; > +input[G_N_ELEMENTS(input) - 1] = '!'; > + > +test_base64_bad(input, G_N_ELEMENTS(input) - 1); > +} > + > + > +static void test_base64_invalid_chars(void) > +{ > +const char *input = "There's no such thing as a free lunch."; > + > +test_base64_bad(input, strlen(input)); > +} ...and this same input already fails because it doesn't match base64, regardless of whether NUL bytes are mishandled. I wonder if test_base64_embedded_nul() and test_base64_not_nul_terminated() should be using variations of your known-good base64 string ("QmVjYXVzZSB3ZSBmb2N1c2VkIG9uIHRoZSBzbmFrZSwgd2UgbW\n" "lzc2VkIHRoZSBzY29ycGlvbi4="), to prove that they are failing purely because of the NUL handling and not because of invalid base64 content. But that's only a weak complaint - because by peering into the black box, I see that you are fully testing all code paths (that is, the black box checks for NUL abuse prior to checking for valid base64 data), so I'm not going to insist on a respin. Reviewed-by: Eric Blake -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
[Qemu-devel] [PATCH v2 0/3] KVM-UNIT-TESTS: Hyper-V SynIC timers test
The test checks Hyper-V SynIC timers functionality. The test runs on every vCPU and performs start/stop of periodic/one-shot timers (with period=1ms) and checks validity of received expiration messages in appropriate ISR's. Changes v2: * Share generic Hyper-V tests code * Hyper-V SynIC timers test fixes to improve readability and output Signed-off-by: Andrey SmetaninReviewed-by: Roman Kagan CC: Paolo Bonzini CC: Marcelo Tosatti CC: Roman Kagan CC: Denis V. Lunev CC: qemu-devel@nongnu.org Andrey Smetanin (3): lib/x86: Make free_page() available to call x86/hyperv: Move Hyper-V generic code into hyperv.h/hyperv.c x86: Hyper-V SynIC timers test config/config-x86-common.mak | 8 +- lib/x86/msr.h| 23 --- lib/x86/vm.h | 1 + x86/hyperv.c | 24 +++ x86/hyperv.h | 183 + x86/hyperv_stimer.c | 376 +++ x86/hyperv_synic.c | 42 + x86/unittests.cfg| 5 + 8 files changed, 603 insertions(+), 59 deletions(-) create mode 100644 x86/hyperv.c create mode 100644 x86/hyperv.h create mode 100644 x86/hyperv_stimer.c -- 2.4.3
Re: [Qemu-devel] [Qemu-arm] [PATCH v2 18/19] [RFC] hw/arm/virt: add secure memory region and UART
On 16 November 2015 at 14:05, Peter Maydellwrote: > Add a secure memory region to the virt board, which is the > same as the nonsecure memory region except that it also has > a secure-only UART in it. This is only created if the > board is started with the '-machine secure=on' property. > > This is an RFC patch, beacuse the device tree bindings for > indicating secure vs nonsecure devices are still under > discussion upstream: > https://lkml.org/lkml/2015/10/29/287 The bindings have been accepted upstream, so we can take the 'RFC' tag off this patch now. thanks -- PMM
[Qemu-devel] [PATCH v2 2/3] x86/hyperv: Move Hyper-V generic code into hyperv.h/hyperv.c
This code will be used as shared between hyperv_synic and hyperv_stimer tests. Signed-off-by: Andrey SmetaninCC: Paolo Bonzini CC: Marcelo Tosatti CC: Roman Kagan CC: Denis V. Lunev CC: qemu-devel@nongnu.org --- config/config-x86-common.mak | 3 ++- lib/x86/msr.h| 23 -- x86/hyperv.c | 24 ++ x86/hyperv.h | 58 x86/hyperv_synic.c | 42 ++-- 5 files changed, 92 insertions(+), 58 deletions(-) create mode 100644 x86/hyperv.c create mode 100644 x86/hyperv.h diff --git a/config/config-x86-common.mak b/config/config-x86-common.mak index f64874d..156be1c 100644 --- a/config/config-x86-common.mak +++ b/config/config-x86-common.mak @@ -113,7 +113,8 @@ $(TEST_DIR)/debug.elf: $(cstart.o) $(TEST_DIR)/debug.o $(TEST_DIR)/memory.elf: $(cstart.o) $(TEST_DIR)/memory.o -$(TEST_DIR)/hyperv_synic.elf: $(cstart.o) $(TEST_DIR)/hyperv_synic.o +$(TEST_DIR)/hyperv_synic.elf: $(cstart.o) $(TEST_DIR)/hyperv.o \ + $(TEST_DIR)/hyperv_synic.o arch_clean: $(RM) $(TEST_DIR)/*.o $(TEST_DIR)/*.flat $(TEST_DIR)/*.elf \ diff --git a/lib/x86/msr.h b/lib/x86/msr.h index 54da420..281255a 100644 --- a/lib/x86/msr.h +++ b/lib/x86/msr.h @@ -408,27 +408,4 @@ #define MSR_VM_IGNNE0xc0010115 #define MSR_VM_HSAVE_PA 0xc0010117 -/* Define synthetic interrupt controller model specific registers. */ -#define HV_X64_MSR_SCONTROL 0x4080 -#define HV_X64_MSR_SVERSION 0x4081 -#define HV_X64_MSR_SIEFP0x4082 -#define HV_X64_MSR_SIMP 0x4083 -#define HV_X64_MSR_EOM 0x4084 -#define HV_X64_MSR_SINT00x4090 -#define HV_X64_MSR_SINT10x4091 -#define HV_X64_MSR_SINT20x4092 -#define HV_X64_MSR_SINT30x4093 -#define HV_X64_MSR_SINT40x4094 -#define HV_X64_MSR_SINT50x4095 -#define HV_X64_MSR_SINT60x4096 -#define HV_X64_MSR_SINT70x4097 -#define HV_X64_MSR_SINT80x4098 -#define HV_X64_MSR_SINT90x4099 -#define HV_X64_MSR_SINT10 0x409A -#define HV_X64_MSR_SINT11 0x409B -#define HV_X64_MSR_SINT12 0x409C -#define HV_X64_MSR_SINT13 0x409D -#define HV_X64_MSR_SINT14 0x409E -#define HV_X64_MSR_SINT15 0x409F - #endif /* _ASM_X86_MSR_INDEX_H */ diff --git a/x86/hyperv.c b/x86/hyperv.c new file mode 100644 index 000..824773d --- /dev/null +++ b/x86/hyperv.c @@ -0,0 +1,24 @@ +#include "hyperv.h" + +static void synic_ctl(u8 ctl, u8 vcpu_id, u8 sint) +{ +outl((ctl << 16)|((vcpu_id) << 8)|sint, 0x3000); +} + +void synic_sint_create(int vcpu, int sint, int vec, bool auto_eoi) +{ +wrmsr(HV_X64_MSR_SINT0 + sint, + (u64)vec | ((auto_eoi) ? HV_SYNIC_SINT_AUTO_EOI : 0)); +synic_ctl(HV_TEST_DEV_SINT_ROUTE_CREATE, vcpu, sint); +} + +void synic_sint_set(int vcpu, int sint) +{ +synic_ctl(HV_TEST_DEV_SINT_ROUTE_SET_SINT, vcpu, sint); +} + +void synic_sint_destroy(int vcpu, int sint) +{ +wrmsr(HV_X64_MSR_SINT0 + sint, 0xFF|HV_SYNIC_SINT_MASKED); +synic_ctl(HV_TEST_DEV_SINT_ROUTE_DESTROY, vcpu, sint); +} diff --git a/x86/hyperv.h b/x86/hyperv.h new file mode 100644 index 000..0dd1d0d --- /dev/null +++ b/x86/hyperv.h @@ -0,0 +1,58 @@ +#ifndef __HYPERV_H +#define __HYPERV_H + +#include "libcflat.h" +#include "processor.h" +#include "io.h" + +#define HYPERV_CPUID_FEATURES 0x4003 + +#define HV_X64_MSR_SYNIC_AVAILABLE (1 << 2) + +/* Define synthetic interrupt controller model specific registers. */ +#define HV_X64_MSR_SCONTROL 0x4080 +#define HV_X64_MSR_SVERSION 0x4081 +#define HV_X64_MSR_SIEFP0x4082 +#define HV_X64_MSR_SIMP 0x4083 +#define HV_X64_MSR_EOM 0x4084 +#define HV_X64_MSR_SINT00x4090 +#define HV_X64_MSR_SINT10x4091 +#define HV_X64_MSR_SINT20x4092 +#define HV_X64_MSR_SINT30x4093 +#define HV_X64_MSR_SINT40x4094 +#define HV_X64_MSR_SINT50x4095 +#define HV_X64_MSR_SINT60x4096 +#define HV_X64_MSR_SINT70x4097 +#define HV_X64_MSR_SINT8
[Qemu-devel] [PATCH v2 3/3] x86: Hyper-V SynIC timers test
The test checks Hyper-V SynIC timers functionality. The test runs on every vCPU and performs start/stop of periodic/one-shot timers (with period=1ms) and checks validity of received expiration messages in appropriate ISR's. Changes v2: * reorg code to use generic hyperv.h * split timer test into test cases with separate callbacks * removed unnecessary irq_enable() calls * moved sint's create/destoy into test prepare/cleanup callbacks * defined used sint's numbers and vectors Signed-off-by: Andrey SmetaninReviewed-by: Roman Kagan CC: Paolo Bonzini CC: Marcelo Tosatti CC: Roman Kagan CC: Denis V. Lunev CC: qemu-devel@nongnu.org --- config/config-x86-common.mak | 5 +- x86/hyperv.h | 125 ++ x86/hyperv_stimer.c | 376 +++ x86/unittests.cfg| 5 + 4 files changed, 510 insertions(+), 1 deletion(-) create mode 100644 x86/hyperv_stimer.c diff --git a/config/config-x86-common.mak b/config/config-x86-common.mak index 156be1c..72b95e3 100644 --- a/config/config-x86-common.mak +++ b/config/config-x86-common.mak @@ -37,7 +37,7 @@ tests-common = $(TEST_DIR)/vmexit.flat $(TEST_DIR)/tsc.flat \ $(TEST_DIR)/s3.flat $(TEST_DIR)/pmu.flat \ $(TEST_DIR)/tsc_adjust.flat $(TEST_DIR)/asyncpf.flat \ $(TEST_DIR)/init.flat $(TEST_DIR)/smap.flat \ - $(TEST_DIR)/hyperv_synic.flat + $(TEST_DIR)/hyperv_synic.flat $(TEST_DIR)/hyperv_stimer.flat \ ifdef API tests-common += api/api-sample @@ -116,6 +116,9 @@ $(TEST_DIR)/memory.elf: $(cstart.o) $(TEST_DIR)/memory.o $(TEST_DIR)/hyperv_synic.elf: $(cstart.o) $(TEST_DIR)/hyperv.o \ $(TEST_DIR)/hyperv_synic.o +$(TEST_DIR)/hyperv_stimer.elf: $(cstart.o) $(TEST_DIR)/hyperv.o \ + $(TEST_DIR)/hyperv_stimer.o + arch_clean: $(RM) $(TEST_DIR)/*.o $(TEST_DIR)/*.flat $(TEST_DIR)/*.elf \ $(TEST_DIR)/.*.d lib/x86/.*.d diff --git a/x86/hyperv.h b/x86/hyperv.h index 0dd1d0d..faf931b 100644 --- a/x86/hyperv.h +++ b/x86/hyperv.h @@ -7,7 +7,11 @@ #define HYPERV_CPUID_FEATURES 0x4003 +#define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE (1 << 1) #define HV_X64_MSR_SYNIC_AVAILABLE (1 << 2) +#define HV_X64_MSR_SYNTIMER_AVAILABLE (1 << 3) + +#define HV_X64_MSR_TIME_REF_COUNT 0x4020 /* Define synthetic interrupt controller model specific registers. */ #define HV_X64_MSR_SCONTROL 0x4080 @@ -32,6 +36,19 @@ #define HV_X64_MSR_SINT14 0x409E #define HV_X64_MSR_SINT15 0x409F +/* + * Synthetic Timer MSRs. Four timers per vcpu. + */ + +#define HV_X64_MSR_STIMER0_CONFIG 0x40B0 +#define HV_X64_MSR_STIMER0_COUNT0x40B1 +#define HV_X64_MSR_STIMER1_CONFIG 0x40B2 +#define HV_X64_MSR_STIMER1_COUNT0x40B3 +#define HV_X64_MSR_STIMER2_CONFIG 0x40B4 +#define HV_X64_MSR_STIMER2_COUNT0x40B5 +#define HV_X64_MSR_STIMER3_CONFIG 0x40B6 +#define HV_X64_MSR_STIMER3_COUNT0x40B7 + #define HV_SYNIC_CONTROL_ENABLE (1ULL << 0) #define HV_SYNIC_SIMP_ENABLE(1ULL << 0) #define HV_SYNIC_SIEFP_ENABLE (1ULL << 0) @@ -40,6 +57,104 @@ #define HV_SYNIC_SINT_VECTOR_MASK (0xFF) #define HV_SYNIC_SINT_COUNT 16 +#define HV_STIMER_ENABLE(1ULL << 0) +#define HV_STIMER_PERIODIC (1ULL << 1) +#define HV_STIMER_LAZY (1ULL << 2) +#define HV_STIMER_AUTOENABLE(1ULL << 3) +#define HV_STIMER_SINT(config) (__u8)(((config) >> 16) & 0x0F) + +#define HV_SYNIC_STIMER_COUNT (4) + +/* Define synthetic interrupt controller message constants. */ +#define HV_MESSAGE_SIZE (256) +#define HV_MESSAGE_PAYLOAD_BYTE_COUNT (240) +#define HV_MESSAGE_PAYLOAD_QWORD_COUNT (30) + +/* Define hypervisor message types. */ +enum hv_message_type { +HVMSG_NONE = 0x, + +/* Memory access messages. */ +HVMSG_UNMAPPED_GPA = 0x8000, +HVMSG_GPA_INTERCEPT = 0x8001, + +/* Timer notification messages. */ +HVMSG_TIMER_EXPIRED = 0x8010, + +/* Error messages. */ +HVMSG_INVALID_VP_REGISTER_VALUE = 0x8020, +HVMSG_UNRECOVERABLE_EXCEPTION = 0x8021, +HVMSG_UNSUPPORTED_FEATURE = 0x8022, + +/* Trace buffer complete messages. */ +HVMSG_EVENTLOG_BUFFERCOMPLETE = 0x8040, + +/* Platform-specific processor intercept messages. */ +
Re: [Qemu-devel] [PATCH 06/16] acpi: Save PCMachineState on AcpiBuildState
On Mon, Dec 07, 2015 at 05:39:29PM +0200, Marcel Apfelbaum wrote: > On 12/02/2015 03:47 AM, Eduardo Habkost wrote: > >PCMachineState will be used in some of the steps of ACPI table > >building. > > > >Signed-off-by: Eduardo Habkost> >--- > > hw/i386/acpi-build.c | 8 > > 1 file changed, 4 insertions(+), 4 deletions(-) > > > >diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > >index 85a5c53..ca11c88 100644 > >--- a/hw/i386/acpi-build.c > >+++ b/hw/i386/acpi-build.c > >@@ -1644,7 +1644,7 @@ struct AcpiBuildState { > > MemoryRegion *table_mr; > > /* Is table patched? */ > > uint8_t patched; > >-PcGuestInfo *guest_info; > >+PCMachineState *pcms; > > void *rsdp; > > MemoryRegion *rsdp_mr; > > MemoryRegion *linker_mr; > >@@ -1855,7 +1855,7 @@ static void acpi_build_update(void *build_opaque, > >uint32_t offset) > > > > acpi_build_tables_init(); > > > >-acpi_build(build_state->guest_info, ); > >+acpi_build(_state->pcms->acpi_guest_info, ); > > > > acpi_ram_update(build_state->table_mr, tables.table_data); > > > >@@ -1916,12 +1916,12 @@ void acpi_setup(PCMachineState *pcms) > > > > build_state = g_malloc0(sizeof *build_state); > > > >-build_state->guest_info = guest_info; > >+build_state->pcms = pcms; > > I am not "sold" on keeping a reference to machine in the build_state. > We can always query current machine using qdev_machine() or something. > > Keeping the "guest info" made sense since is used especially for ACPI, > however the machine has a wider scope. (And not having to keep it > around is a very good thing!) I wouldn't mind using qdev_get_machine() if preferred by the maintainer of that code, but I like to avoid it when possible. To me, qdev_get_machine() is just a global variable disguised as a harder-to-understand API. -- Eduardo
[Qemu-devel] [PATCH v10 4/6] target-arm: kvm - add support for HW assisted debug
This adds basic support for HW assisted debug. The ioctl interface to KVM allows us to pass an implementation defined number of break and watch point registers. When KVM_GUESTDBG_USE_HW is specified these debug registers will be installed in place on the world switch into the guest. The hardware is actually capable of more advanced matching but it is unclear if this expressiveness is available via the gdbstub protocol. Signed-off-by: Alex Bennée--- v2 - correct setting of PMC/BAS/MASK - improved commentary - added helper function to check watchpoint in range - fix find/deletion of watchpoints v3 - use internals.h definitions v6 - KVM_GUESTDBG_USE_HW_BP->KVM_GUESTDBG_USE_HW - renamed some helper functions to avoid confusion v9 - fix merge conflicts on re-base - rm asm/ptrace.h include - add additional commentry for hw breakpoints - explain gdb's model for HW bkpts - fix up spacing, formatting as per checkpatch - better PAC values - use is_power_of_2 - use _arm_ fn naming and add docs - add a CPUWatchpoint structure for reporting - replace manual array manipulation with g_array abstraction v10 - fix compilation for arm32/split imps between kvm32/64 - make find_hw_watchpoint/breakpoint local static functions - cleaned up comment grammar - fixed up missing spaces - removed pointless ?: booleanisation - removed pointless kvm_arm_hw_debug_active wrappers - s/is/if/ --- target-arm/kvm.c | 26 +--- target-arm/kvm32.c | 29 + target-arm/kvm64.c | 361 ++- target-arm/kvm_arm.h | 21 +++ 4 files changed, 415 insertions(+), 22 deletions(-) diff --git a/target-arm/kvm.c b/target-arm/kvm.c index 7f44e22..84974bb 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -566,26 +566,12 @@ void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg) dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; } #endif -} - -int kvm_arch_insert_hw_breakpoint(target_ulong addr, - target_ulong len, int type) -{ -qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); -return -EINVAL; -} - -int kvm_arch_remove_hw_breakpoint(target_ulong addr, - target_ulong len, int type) -{ -qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); -return -EINVAL; -} - - -void kvm_arch_remove_all_hw_breakpoints(void) -{ -qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); +#ifdef KVM_GUESTDBG_USE_HW +if (kvm_arm_hw_debug_active(cs)) { +dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW; +kvm_arm_copy_hw_debug_data(>arch); +} +#endif } void kvm_arch_init_irq_routing(KVMState *s) diff --git a/target-arm/kvm32.c b/target-arm/kvm32.c index 5ce969f..ff83ce6 100644 --- a/target-arm/kvm32.c +++ b/target-arm/kvm32.c @@ -493,3 +493,32 @@ bool kvm_arm_handle_debug(CPUState *cs, struct kvm_debug_exit_arch *debug_exit) qemu_log_mask(LOG_UNIMP, "%s: guest debug not yet implemented\n", __func__); return false; } + +int kvm_arch_insert_hw_breakpoint(target_ulong addr, + target_ulong len, int type) +{ +qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); +return -EINVAL; +} + +int kvm_arch_remove_hw_breakpoint(target_ulong addr, + target_ulong len, int type) +{ +qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); +return -EINVAL; +} + +void kvm_arch_remove_all_hw_breakpoints(void) +{ +qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); +} + +void kvm_arm_copy_hw_debug_data(struct kvm_guest_debug_arch *ptr) +{ +qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); +} + +bool kvm_arm_hw_debug_active(CPUState *cs) +{ +return false; +} diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index 5f96cde..771ecdb 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -2,6 +2,7 @@ * ARM implementation of KVM hooks, 64 bit specific code * * Copyright Mian-M. Hamayun 2013, Virtual Open Systems + * Copyright Alex Bennée 2014, Linaro * * This work is licensed under the terms of the GNU GPL, version 2 or later. * See the COPYING file in the top-level directory. @@ -12,13 +13,17 @@ #include #include #include +#include +#include #include #include "config-host.h" #include "qemu-common.h" #include "qemu/timer.h" #include "qemu/error-report.h" +#include "qemu/host-utils.h" +#include "exec/gdbstub.h" #include "sysemu/sysemu.h" #include "sysemu/kvm.h" #include "kvm_arm.h" @@ -28,20 +33,358 @@ static bool have_guest_debug; +/* + * Although the ARM implementation of hardware assisted debugging + * allows for different breakpoints per-core, the current GDB + * interface treats them as a global pool of registers (which seems to + * be the case for x86, ppc and s390). As a result we store one
[Qemu-devel] [PATCH v10 6/6] tests/guest-debug: introduce basic gdbstub tests
The aim of these tests is to combine with an appropriate kernel image (with symbol-file vmlinux) and check it behaves as it should. Given a kernel it checks: - single step - software breakpoint - hardware breakpoint - access, read and write watchpoints On success it returns 0 to the calling process. I've not plumbed this into the "make check" logic though as we need a solution for providing non-host binaries to the tests. However the test is structured to work with pretty much any Linux kernel image as it uses the basic kernel_init code which is common across architectures. Signed-off-by: Alex Bennée--- v10: - fixup for Py2/3 cleanliness - drop to shell on exception --- tests/guest-debug/test-gdbstub.py | 176 ++ 1 file changed, 176 insertions(+) create mode 100644 tests/guest-debug/test-gdbstub.py diff --git a/tests/guest-debug/test-gdbstub.py b/tests/guest-debug/test-gdbstub.py new file mode 100644 index 000..31ba6c9 --- /dev/null +++ b/tests/guest-debug/test-gdbstub.py @@ -0,0 +1,176 @@ +# +# This script needs to be run on startup +# qemu -kernel ${KERNEL} -s -S +# and then: +# gdb ${KERNEL}.vmlinux -x ${QEMU_SRC}/tests/guest-debug/test-gdbstub.py + +import gdb + +failcount = 0 + + +def report(cond, msg): +"Report success/fail of test" +if cond: +print ("PASS: %s" % (msg)) +else: +print ("FAIL: %s" % (msg)) +failcount += 1 + + +def check_step(): +"Step an instruction, check it moved." +start_pc = gdb.parse_and_eval('$pc') +gdb.execute("si") +end_pc = gdb.parse_and_eval('$pc') + +return not (start_pc == end_pc) + + +def check_break(sym_name): +"Setup breakpoint, continue and check we stopped." +sym, ok = gdb.lookup_symbol(sym_name) +bp = gdb.Breakpoint(sym_name) + +gdb.execute("c") + +# hopefully we came back +end_pc = gdb.parse_and_eval('$pc') +print ("%s == %s %d" % (end_pc, sym.value(), bp.hit_count)) +bp.delete() + +# can we test we hit bp? +return end_pc == sym.value() + + +# We need to do hbreak manually as the python interface doesn't export it +def check_hbreak(sym_name): +"Setup hardware breakpoint, continue and check we stopped." +sym, ok = gdb.lookup_symbol(sym_name) +gdb.execute("hbreak %s" % (sym_name)) +gdb.execute("c") + +# hopefully we came back +end_pc = gdb.parse_and_eval('$pc') +print ("%s == %s" % (end_pc, sym.value())) + +if end_pc == sym.value(): +gdb.execute("d 1") +return True +else: +return False + + +class WatchPoint(gdb.Breakpoint): + +def get_wpstr(self, sym_name): +"Setup sym and wp_str for given symbol." +self.sym, ok = gdb.lookup_symbol(sym_name) +wp_addr = gdb.parse_and_eval(sym_name).address +self.wp_str = '*(%(type)s)(&%(address)s)' % dict( +type = wp_addr.type, address = sym_name) + +return(self.wp_str) + +def __init__(self, sym_name, type): +wp_str = self.get_wpstr(sym_name) +super(WatchPoint, self).__init__(wp_str, gdb.BP_WATCHPOINT, type) + +def stop(self): +end_pc = gdb.parse_and_eval('$pc') +print ("HIT WP @ %s" % (end_pc)) +return True + + +def do_one_watch(sym, wtype, text): + +wp = WatchPoint(sym, wtype) +gdb.execute("c") +report_str = "%s for %s (%s)" % (text, sym, wp.sym.value()) + +if wp.hit_count > 0: +report(True, report_str) +wp.delete() +else: +report(False, report_str) + + +def check_watches(sym_name): +"Watch a symbol for any access." + +# Should hit for any read +do_one_watch(sym_name, gdb.WP_ACCESS, "awatch") + +# Again should hit for reads +do_one_watch(sym_name, gdb.WP_READ, "rwatch") + +# Finally when it is written +do_one_watch(sym_name, gdb.WP_WRITE, "watch") + + +class CatchBreakpoint(gdb.Breakpoint): +def __init__(self, sym_name): +super(CatchBreakpoint, self).__init__(sym_name) +self.sym, ok = gdb.lookup_symbol(sym_name) + +def stop(self): +end_pc = gdb.parse_and_eval('$pc') +print ("CB: %s == %s" % (end_pc, self.sym.value())) +if end_pc == self.sym.value(): +report(False, "Hit final catchpoint") + + +def run_test(): +"Run throught the tests one by one" + +print ("Checking we can step the first few instructions") +step_ok = 0 +for i in range(3): +if check_step(): +step_ok += 1 + +report(step_ok == 3, "single step in boot code") + +print ("Checking HW breakpoint works") +break_ok = check_hbreak("kernel_init") +report(break_ok, "hbreak @ kernel_init") + +# Can't set this up until we are in the kernel proper +# if we make it to run_init_process we've over-run and +# one of the tests failed +print ("Setup catch-all for run_init_process") +cbp = CatchBreakpoint("run_init_process") +cpb2 =
[Qemu-devel] [PATCH v10 0/6] QEMU support for KVM Guest Debug on arm64
Hi, Here is the latest patch set to support debugging of KVM guests on arm64. The main changes are fixing arm32 compiles (mostly with stubs for the upcomming arm32 debug) and the usual bunch of minor tweaks and clarifications following review. I've kept the GDB Python based test in tests/guest-debug and cleaned it up so it will work with python2/3 linked GDBs. It still isn't plumbed it in to the "make check" so can be dropped until we have a solution for testing against non-host binaries. So in summary the changes are: - Fixed arm32 compile - Use results of debug capability checks - Whitespace and comment cleanups - Py2/3 cleanliness for test script More detailed changelogs are attached to each patch. GIT Repo: The patch series is based off a recent master and can be found at: https://github.com/stsquad/qemu branch: kvm/guest-debug-v10 Alex Bennée (6): target-arm: kvm64 - introduce kvm_arm_init_debug() target-arm: kvm - implement software breakpoints target-arm: kvm - support for single step target-arm: kvm - add support for HW assisted debug target-arm: kvm - re-inject guest debug exceptions tests/guest-debug: introduce basic gdbstub tests target-arm/helper-a64.c | 12 +- target-arm/kvm.c | 65 +++--- target-arm/kvm32.c| 47 target-arm/kvm64.c| 464 ++ target-arm/kvm_arm.h | 30 +++ tests/guest-debug/test-gdbstub.py | 176 +++ 6 files changed, 757 insertions(+), 37 deletions(-) create mode 100644 tests/guest-debug/test-gdbstub.py -- 2.6.3
[Qemu-devel] [PATCH v10 3/6] target-arm: kvm - support for single step
This adds support for single-step. There isn't much to do on the QEMU side as after we set-up the request for single step via the debug ioctl it is all handled within the kernel. The actual setting of the KVM_GUESTDBG_SINGLESTEP flag is already in the common code. If the kernel doesn't support guest debug the ioctl will simply error. Signed-off-by: Alex Bennée--- v2 - convert to using HSR_EC v3 - use internals.h definitions v10 - fix arm32 build - remove redundent flag setting (done in main kvm.c) - more words on fail case --- target-arm/kvm64.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index 3b3929d..5f96cde 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -534,6 +534,13 @@ bool kvm_arm_handle_debug(CPUState *cs, struct kvm_debug_exit_arch *debug_exit) kvm_cpu_synchronize_state(cs); switch (hsr_ec) { +case EC_SOFTWARESTEP: +if (cs->singlestep_enabled) { +return true; +} else { +error_report("Came out of SINGLE STEP when not enabled"); +} +break; case EC_AA64_BKPT: if (kvm_find_sw_breakpoint(cs, env->pc)) { return true; -- 2.6.3
Re: [Qemu-devel] [PATCH 00/16] pc: Eliminate struct PcGuestInfo
On Mon, Dec 07, 2015 at 08:57:03PM +0200, Marcel Apfelbaum wrote: > On 12/02/2015 03:46 AM, Eduardo Habkost wrote: > >This moves all data from PcGuestInfo to either PCMachineState or > >PCMachineClass. > > > >This series depends on other two series: > >* [PATCH v3 0/6] pc: Initialization and compat function cleanup > >* [PATCH V3 0/3] hw/pcie: Multi-root support for Q35 > > > >For reference, there's a git tree containing this series plus all > >the dependencies, at: > > git://github.com/ehabkost/qemu-hacks.git work/pcguestinfo-eliminate > > > >Eduardo Habkost (16): > > pc: Move PcGuestInfo declaration to top of file > > pc: Eliminate struct PcGuestInfoState > > pc: Remove guest_info parameter from pc_memory_init() > > acpi: Make acpi_setup() get PCMachineState as argument > > acpi: Remove unused build_facs() PcGuestInfo paramter > > acpi: Save PCMachineState on AcpiBuildState > > acpi: Make acpi_build() get PCMachineState as argument > > acpi: Make build_srat() get PCMachineState as argument > > acpi: Remove ram size fields fron PcGuestInfo > > pc: Move PcGuestInfo.fw_cfg field to PCMachineState > > pc: Simplify signature of xen_load_linux() > > pc: Remove PcGuestInfo.isapc_ram_fw field > > q35: Remove MCHPCIState.guest_info field > > acpi: Use PCMachineClass fields directly > > pc: Move PcGuestInfo.apic_xrupt_override field to PCMachineState > > pc: Move APIC and NUMA data from PcGuestInfo to PCMachineState > > Hi, > > I mainly agree with the removal of PcGuestInfo , I commented on some patches. > > I do have a minor reservation, we kind of loose some information about the > fields. > Until now it was pretty clear that the fields were related to guest because > they were part of PcGuestInfo. Now this information is lost and the fields > appear as yet other machine attributes. But they really are just machine attributes, aren't they? > > I suppose this can be addressed by: > - a prefix for guest fields (e.g numa_nodes-> guest_numa_nodes), > - a comment in the class /* guest fields */, > - keeping the fields in PcGuestInfo struct but make the machine field short: > guest so we can call machine->guest.numa_nodes > - or not be addressed at all :) I don't see your point. Could you explain what you mean by "related to the guest" and "guest fields"? They are just machine attributes, and they happen to be used as input when building ACPI tables (just like other machine attributes are used as input for other guest-visible data, like CPUID, SMBIOS, and other tables). What exactly make them "related to guest"? -- Eduardo
Re: [Qemu-devel] [PATCH 1/7] pc: wire up TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE for !xen
On 12/08/2015 04:07 PM, Gerd Hoffmann wrote: rename pc_xen_hvm_init_pci to pc_i440fx_init_pci, use it for both xen and non-xen init. Signed-off-by: Gerd Hoffmann--- hw/i386/pc_piix.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 2e41efe..ce6c3c5 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -419,10 +419,9 @@ static void pc_init_isa(MachineState *machine) pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, TYPE_I440FX_PCI_DEVICE); } -#ifdef CONFIG_XEN -static void pc_xen_hvm_init_pci(MachineState *machine) +static void pc_i440fx_init_pci(MachineState *machine) { -const char *pci_type = has_igd_gfx_passthru ? +const char *pci_type = machine->igd_gfx_passthru ? TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE : TYPE_I440FX_PCI_DEVICE; pc_init1(machine, @@ -430,6 +429,7 @@ static void pc_xen_hvm_init_pci(MachineState *machine) pci_type); } +#ifdef CONFIG_XEN static void pc_xen_hvm_init(MachineState *machine) { PCIBus *bus; @@ -439,7 +439,7 @@ static void pc_xen_hvm_init(MachineState *machine) exit(1); } -pc_xen_hvm_init_pci(machine); +pc_i440fx_init_pci(machine); bus = pci_find_primary_bus(); if (bus != NULL) { @@ -455,8 +455,7 @@ static void pc_xen_hvm_init(MachineState *machine) if (compat) { \ compat(machine); \ } \ -pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, \ - TYPE_I440FX_PCI_DEVICE); \ +pc_i440fx_init_pci(machine); \ Hi Gerd, A quick question, does IGD_PASSTHROUGH makes sense for compat machine types? On the same topic, does machine->igd_gfx_passthru makes sense for all machine types? Thanks, Marcel } \ DEFINE_PC_MACHINE(suffix, name, pc_init_##suffix, optionfn)
Re: [Qemu-devel] [PATCH 04/16] acpi: Make acpi_setup() get PCMachineState as argument
On Mon, Dec 07, 2015 at 05:24:27PM +0200, Marcel Apfelbaum wrote: > On 12/02/2015 03:46 AM, Eduardo Habkost wrote: > >Lots of PcGuestInfo fields are duplicates of PCMachineClass or > >PCMachineState fields. Pass PCMachineState as argument to > >acpi_setup(), so we can simply let the ACPI code use those fields > >directly. > > I completely agree with removing duplicated fields and using PCMachine > fields directly, but this not what this patch does. > It only extracts PcGuestInfo info from the machine. I should have appended "later" to the commit message. "So we are able to simply let the ACPI code use those fields later (in another commit)". The goal of this commit is to just change the function signature to allow us to move the fields later. Maybe I will squash some of those changes together in a new version of the series. -- Eduardo
[Qemu-devel] [Bug 1308341] Re: Multiple CPUs causes blue screen on Windows guest (14.04 regression)
*** This bug is a duplicate of bug 1346917 *** https://bugs.launchpad.net/bugs/1346917 Hi, could you please file a new bug with debugging information as per https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1346917/comments/11 ? -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1308341 Title: Multiple CPUs causes blue screen on Windows guest (14.04 regression) Status in QEMU: New Status in linux package in Ubuntu: Confirmed Status in qemu package in Ubuntu: Confirmed Bug description: Configuring a Windows 7 guest using more than one CPU cases the guest to fail. This happens after a few hours after guest boot. This is the error on the blue screen: "A clock interrupt was not received on a secondary processor within the allocated time interval" After resetting, the guest will never boot and a new bluescreen with the error "STOP: 0x005c" appears. Shutting down the guest completely and restarting it will allow it to boot and run for a few hours again. The guest was created using virt-manager. The error happens with or without virtio devices and with both 32-bit and 64-bit Windows 7 guests. I am using Ubuntu 14.04 release candidate. qemu-kvm version 2.0.0~rc1+dfsg-0ubuntu3 To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1308341/+subscriptions
[Qemu-devel] [PATCH v10 5/6] target-arm: kvm - re-inject guest debug exceptions
If we can't find details for the debug exception in our debug state then we can assume the exception is due to debugging inside the guest. To inject the exception into the guest state we re-use the TCG exception code (do_interrupt). However while guest debugging is in effect we currently can't handle the guest using single step as we will keep trapping to back to userspace. GDB makes heavy use of single-step behind the scenes which effectively means the guests ability to debug itself is disabled while it is being debugged. Signed-off-by: Alex Bennée--- v5: - new for v5 v10: - fix arm32 compile - add full stop at end of sentance - attempted to expand on limitations in commit msg --- target-arm/helper-a64.c | 12 ++-- target-arm/kvm64.c | 24 +--- 2 files changed, 27 insertions(+), 9 deletions(-) diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c index deb8dbe..fc3ccdf 100644 --- a/target-arm/helper-a64.c +++ b/target-arm/helper-a64.c @@ -25,6 +25,7 @@ #include "qemu/bitops.h" #include "internals.h" #include "qemu/crc32c.h" +#include "sysemu/kvm.h" #include /* For crc32 */ /* C2.4.7 Multiply and divide */ @@ -469,7 +470,8 @@ void aarch64_cpu_do_interrupt(CPUState *cs) new_el); if (qemu_loglevel_mask(CPU_LOG_INT) && !excp_is_internal(cs->exception_index)) { -qemu_log_mask(CPU_LOG_INT, "...with ESR 0x%" PRIx32 "\n", +qemu_log_mask(CPU_LOG_INT, "...with ESR %x/0x%" PRIx32 "\n", + env->exception.syndrome >> ARM_EL_EC_SHIFT, env->exception.syndrome); } @@ -535,6 +537,12 @@ void aarch64_cpu_do_interrupt(CPUState *cs) aarch64_restore_sp(env, new_el); env->pc = addr; -cs->interrupt_request |= CPU_INTERRUPT_EXITTB; + +qemu_log_mask(CPU_LOG_INT, "...to EL%d PC 0x%" PRIx64 " PSTATE 0x%x\n", + new_el, env->pc, pstate_read(env)); + +if (!kvm_enabled()) { +cs->interrupt_request |= CPU_INTERRUPT_EXITTB; +} } #endif diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index 771ecdb..8e6d044 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -871,6 +871,7 @@ bool kvm_arm_handle_debug(CPUState *cs, struct kvm_debug_exit_arch *debug_exit) { int hsr_ec = debug_exit->hsr >> ARM_EL_EC_SHIFT; ARMCPU *cpu = ARM_CPU(cs); +CPUClass *cc = CPU_GET_CLASS(cs); CPUARMState *env = >env; /* Ensure PC is synchronised */ @@ -881,7 +882,14 @@ bool kvm_arm_handle_debug(CPUState *cs, struct kvm_debug_exit_arch *debug_exit) if (cs->singlestep_enabled) { return true; } else { -error_report("Came out of SINGLE STEP when not enabled"); +/* + * The kernel should have supressed the guests ability to + * single step at this point so something has gone wrong. + */ +error_report("%s: guest single-step while debugging unsupported" + " (%"PRIx64", %"PRIx32")\n", + __func__, env->pc, debug_exit->hsr); +return false; } break; case EC_AA64_BKPT: @@ -908,12 +916,14 @@ bool kvm_arm_handle_debug(CPUState *cs, struct kvm_debug_exit_arch *debug_exit) __func__, debug_exit->hsr, env->pc); } -/* If we don't handle this it could be it really is for the - guest to handle */ -qemu_log_mask(LOG_UNIMP, - "%s: re-injecting exception not yet implemented" - " (0x%"PRIx32", %"PRIx64")\n", - __func__, hsr_ec, env->pc); +/* If we are not handling the debug exception it must belong to + * the guest. Let's re-use the existing TCG interrupt code to set + * everything up properly. + */ +cs->exception_index = EXCP_BKPT; +env->exception.syndrome = debug_exit->hsr; +env->exception.vaddress = debug_exit->far; +cc->do_interrupt(cs); return false; } -- 2.6.3
Re: [Qemu-devel] [Qemu-block] [PATCH] qemu-img / curl: When fetching Content-Size use GET instead of HEAD.
On 12/07/2015 04:23 PM, Boris Schrijver wrote: > Hi all, > Hi! > I was testing out the "qemu-img info/convert" options in combination with > "http/https" when I stumbled upon this issue. When "qemu-img info/convert" > tries > to collect the file info it will first try to fetch the Content-Size of the > remote file. It does a HEAD request and after a GET request for the correct > range. > > The HEAD request is an issue. Because when you've got a pre-signed url, for > example from S3, which INCLUDES the REQUEST METHOD in it's signature, you'll > get > a 403 Forbidden. > > It's is therefore better to use only the GET request method, and discard the > body at the first call. > How big is the body? Won't this introduce a really large overhead? > Please review! I'll be ready for answers! > Please use the git format-patch format for sending patch emails; see http://qemu-project.org/Contribute/SubmitAPatch#Use_git_format-patch -- and remember to include a Signed-off-by line. > [PATCH] qemu-img / curl: When fetching Content-Size use GET instead of HEAD. > > A server can respond different to both methods, or can block one of the two. > --- > block/curl.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/block/curl.c b/block/curl.c > index 8994182..2e74c32 100644 > --- a/block/curl.c > +++ b/block/curl.c > @@ -594,11 +594,11 @@ static int curl_open(BlockDriverState *bs, QDict > *options, > int flags, > // Get file size > > s->accept_range = false; > -curl_easy_setopt(state->curl, CURLOPT_NOBODY, 1); > +curl_easy_setopt(state->curl, CURLOPT_HTTPGET, 1); > curl_easy_setopt(state->curl, CURLOPT_HEADERFUNCTION, > curl_header_cb); > curl_easy_setopt(state->curl, CURLOPT_HEADERDATA, s); > -if (curl_easy_perform(state->curl)) > +if (curl_easy_perform(state->curl) != 23) We go from making sure there were no errors to enforcing that we *do* get CURLE_WRITE_ERROR? Can you explain why this change doesn't break error handling scenarios for all other cases? > goto out; > curl_easy_getinfo(state->curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, ); > if (d) >
Re: [Qemu-devel] [PATCH 06/16] acpi: Save PCMachineState on AcpiBuildState
On 12/08/2015 07:59 PM, Eduardo Habkost wrote: On Mon, Dec 07, 2015 at 05:39:29PM +0200, Marcel Apfelbaum wrote: On 12/02/2015 03:47 AM, Eduardo Habkost wrote: PCMachineState will be used in some of the steps of ACPI table building. Signed-off-by: Eduardo Habkost--- hw/i386/acpi-build.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 85a5c53..ca11c88 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1644,7 +1644,7 @@ struct AcpiBuildState { MemoryRegion *table_mr; /* Is table patched? */ uint8_t patched; -PcGuestInfo *guest_info; +PCMachineState *pcms; void *rsdp; MemoryRegion *rsdp_mr; MemoryRegion *linker_mr; @@ -1855,7 +1855,7 @@ static void acpi_build_update(void *build_opaque, uint32_t offset) acpi_build_tables_init(); -acpi_build(build_state->guest_info, ); +acpi_build(_state->pcms->acpi_guest_info, ); acpi_ram_update(build_state->table_mr, tables.table_data); @@ -1916,12 +1916,12 @@ void acpi_setup(PCMachineState *pcms) build_state = g_malloc0(sizeof *build_state); -build_state->guest_info = guest_info; +build_state->pcms = pcms; I am not "sold" on keeping a reference to machine in the build_state. We can always query current machine using qdev_machine() or something. Keeping the "guest info" made sense since is used especially for ACPI, however the machine has a wider scope. (And not having to keep it around is a very good thing!) I wouldn't mind using qdev_get_machine() if preferred by the maintainer of that code, but I like to avoid it when possible. To me, qdev_get_machine() is just a global variable disguised as a harder-to-understand API. Really? Hmm, for me is looking like the other way around :) I see it as "query the QOM tree", instead of "keep the reference around" everywhere. But it may be just a personal preference. Thanks, Marcel
Re: [Qemu-devel] [Qemu-block] QEMU being able to use audio cdroms
On 11/25/2015 03:44 PM, Programmingkid wrote: > Is there any platform where a guest in QEMU can play an audio cd? If not, is > this a feature that you would allow into QEMU? > I haven't tested it, nobody has ever asked. I wouldn't reject patches for such a feature, but I likely wouldn't make reviewing them a priority either. I assume this would be primarily for emulating games/etc that use mixed Data/Audio formats? --js
[Qemu-devel] [PATCH v10 2/6] target-arm: kvm - implement software breakpoints
These don't involve messing around with debug registers, just setting the breakpoint instruction in memory. GDB will not use this mechanism if it can't access the memory to write the breakpoint. All the kernel has to do is ensure the hypervisor traps the breakpoint exceptions and returns to userspace. Signed-off-by: Alex Bennée-- v2 - handle debug exit with new hsr exception info - add verbosity to UNIMP message v3 - sync with kvm_cpu_synchronize_state() before checking PC. - use internals.h defines - use env->pc - use proper format types v9 - add include for error_report - define a brk_insn constant v10 - fix up for arm32 compile - move sw_bp_code to kvm32 (stubs)/64 (working) - move kvm_handle_debug to kvm32/64 as kvm_arm_handle_debug - don't enable SW_BP unless the define is there - wrap in have_guest_debug check --- target-arm/kvm.c | 39 +--- target-arm/kvm32.c | 18 + target-arm/kvm64.c | 72 target-arm/kvm_arm.h | 9 +++ 4 files changed, 123 insertions(+), 15 deletions(-) diff --git a/target-arm/kvm.c b/target-arm/kvm.c index 79ef4c6..7f44e22 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -17,6 +17,7 @@ #include "qemu-common.h" #include "qemu/timer.h" +#include "qemu/error-report.h" #include "sysemu/sysemu.h" #include "sysemu/kvm.h" #include "kvm_arm.h" @@ -516,9 +517,23 @@ MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run) return MEMTXATTRS_UNSPECIFIED; } + int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run) { -return 0; +int ret = 0; + +switch (run->exit_reason) { +case KVM_EXIT_DEBUG: +if (kvm_arm_handle_debug(cs, >debug.arch)) { +ret = EXCP_DEBUG; +} /* otherwise return to guest */ +break; +default: +qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n", + __func__, run->exit_reason); +break; +} +return ret; } bool kvm_arch_stop_on_emulation_error(CPUState *cs) @@ -541,16 +556,16 @@ int kvm_arch_on_sigbus(int code, void *addr) return 1; } +/* The #ifdef protections are until 32bit headers are imported and can + * be removed once both 32 and 64 bit reach feature parity. + */ void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg) { -qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); -} - -int kvm_arch_insert_sw_breakpoint(CPUState *cs, - struct kvm_sw_breakpoint *bp) -{ -qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); -return -EINVAL; +#ifdef KVM_GUESTDBG_USE_SW_BP +if (kvm_sw_breakpoints_active(cs)) { +dbg->control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP; +} +#endif } int kvm_arch_insert_hw_breakpoint(target_ulong addr, @@ -567,12 +582,6 @@ int kvm_arch_remove_hw_breakpoint(target_ulong addr, return -EINVAL; } -int kvm_arch_remove_sw_breakpoint(CPUState *cs, - struct kvm_sw_breakpoint *bp) -{ -qemu_log_mask(LOG_UNIMP, "%s: not implemented\n", __func__); -return -EINVAL; -} void kvm_arch_remove_all_hw_breakpoints(void) { diff --git a/target-arm/kvm32.c b/target-arm/kvm32.c index df1e2b0..5ce969f 100644 --- a/target-arm/kvm32.c +++ b/target-arm/kvm32.c @@ -475,3 +475,21 @@ int kvm_arch_get_registers(CPUState *cs) return 0; } + +int kvm_arch_insert_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp) +{ +qemu_log_mask(LOG_UNIMP, "%s: guest debug not yet implemented\n", __func__); +return -EINVAL; +} + +int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp) +{ +qemu_log_mask(LOG_UNIMP, "%s: guest debug not yet implemented\n", __func__); +return -EINVAL; +} + +bool kvm_arm_handle_debug(CPUState *cs, struct kvm_debug_exit_arch *debug_exit) +{ +qemu_log_mask(LOG_UNIMP, "%s: guest debug not yet implemented\n", __func__); +return false; +} diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index d087794..3b3929d 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -18,6 +18,7 @@ #include "config-host.h" #include "qemu-common.h" #include "qemu/timer.h" +#include "qemu/error-report.h" #include "sysemu/sysemu.h" #include "sysemu/kvm.h" #include "kvm_arm.h" @@ -481,3 +482,74 @@ int kvm_arch_get_registers(CPUState *cs) /* TODO: other registers */ return ret; } + +/* C6.6.29 BRK instruction */ +static const uint32_t brk_insn = 0xd420; + +int kvm_arch_insert_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp) +{ +if (have_guest_debug) { +if (cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)>saved_insn, 4, 0) || +cpu_memory_rw_debug(cs, bp->pc, (uint8_t *)_insn, 4, 1)) { +return -EINVAL; +} +return 0; +} else { +error_report("guest debug not supported on this kernel");
[Qemu-devel] [PATCH v10 1/6] target-arm: kvm64 - introduce kvm_arm_init_debug()
As we haven't always had guest debug support we need to probe for it. Additionally we don't do this in the start-up capability code so we don't fall over on old kernels. Signed-off-by: Alex Bennée--- target-arm/kvm64.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c index ceebfeb..d087794 100644 --- a/target-arm/kvm64.c +++ b/target-arm/kvm64.c @@ -25,6 +25,22 @@ #include "internals.h" #include "hw/arm/arm.h" +static bool have_guest_debug; + +/** + * kvm_arm_init_debug() + * @cs: CPUState + * + * Check for guest debug capabilities. + * + */ +static void kvm_arm_init_debug(CPUState *cs) +{ +have_guest_debug = kvm_check_extension(cs->kvm_state, + KVM_CAP_SET_GUEST_DEBUG); +return; +} + static inline void set_feature(uint64_t *features, int feature) { *features |= 1ULL << feature; @@ -121,6 +137,8 @@ int kvm_arch_init_vcpu(CPUState *cs) } cpu->mp_affinity = mpidr & ARM64_AFFINITY_MASK; +kvm_arm_init_debug(cs); + return kvm_arm_init_cpreg_list(cpu); } -- 2.6.3
Re: [Qemu-devel] [Qemu-block] QEMU being able to use audio cdroms
On Dec 8, 2015, at 1:49 PM, John Snow wrote: > > > On 11/25/2015 03:44 PM, Programmingkid wrote: >> Is there any platform where a guest in QEMU can play an audio cd? If not, is >> this a feature that you would allow into QEMU? >> > > I haven't tested it, nobody has ever asked. I wouldn't reject patches > for such a feature, but I likely wouldn't make reviewing them a priority > either. > > I assume this would be primarily for emulating games/etc that use mixed > Data/Audio formats? That could be one reason to do it. I just thought it would be neat to be able to play CDs inside a guest.
Re: [Qemu-devel] [PATCH] qemu-img / curl: When fetching Content-Size use GET instead of HEAD.
08.12.2015 00:23, Boris Schrijver wrote: [] > It's is therefore better to use only the GET request method, and discard the > body at the first call. Nooo! Please N! Fetching a large file once might be too long already. Fetching it twice is twice as long. Oh well!.. Thanks, /mjt
Re: [Qemu-devel] [Qemu-block] [PATCH] qemu-img / curl: When fetching Content-Size use GET instead of HEAD.
See inline! Thanks for your response! -- Met vriendelijke groet / Kind regards, Boris Schrijver PCextreme B.V. http://www.pcextreme.nl/contact Tel direct: +31 (0) 118 700 215 > On December 8, 2015 at 8:40 PM John Snowwrote: > > > > > On 12/07/2015 04:23 PM, Boris Schrijver wrote: > > Hi all, > > > > Hi! > > > I was testing out the "qemu-img info/convert" options in combination with > > "http/https" when I stumbled upon this issue. When "qemu-img info/convert" > > tries > > to collect the file info it will first try to fetch the Content-Size of the > > remote file. It does a HEAD request and after a GET request for the correct > > range. > > > > The HEAD request is an issue. Because when you've got a pre-signed url, for > > example from S3, which INCLUDES the REQUEST METHOD in it's signature, you'll > > get > > a 403 Forbidden. > > > > It's is therefore better to use only the GET request method, and discard the > > body at the first call. > > > > How big is the body? Won't this introduce a really large overhead? The body is "worst-case" the size of the Ethernet v2 frame, around 1500 bytes. > > > Please review! I'll be ready for answers! > > > > Please use the git format-patch format for sending patch emails; see > http://qemu-project.org/Contribute/SubmitAPatch#Use_git_format-patch -- > and remember to include a Signed-off-by line. > Ok, will do! > > [PATCH] qemu-img / curl: When fetching Content-Size use GET instead of HEAD. > > > > A server can respond different to both methods, or can block one of the two. > > --- > > block/curl.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/block/curl.c b/block/curl.c > > index 8994182..2e74c32 100644 > > --- a/block/curl.c > > +++ b/block/curl.c > > @@ -594,11 +594,11 @@ static int curl_open(BlockDriverState *bs, QDict > > *options, > > int flags, > > // Get file size > > > > s->accept_range = false; > > -curl_easy_setopt(state->curl, CURLOPT_NOBODY, 1); > > +curl_easy_setopt(state->curl, CURLOPT_HTTPGET, 1); > > curl_easy_setopt(state->curl, CURLOPT_HEADERFUNCTION, > > curl_header_cb); > > curl_easy_setopt(state->curl, CURLOPT_HEADERDATA, s); > > -if (curl_easy_perform(state->curl)) > > +if (curl_easy_perform(state->curl) != 23) > > We go from making sure there were no errors to enforcing that we *do* > get CURLE_WRITE_ERROR? Can you explain why this change doesn't break > error handling scenarios for all other cases? > We're enforcing the CURLE_WRITE_ERROR here. We receive data, but don't want to save it anywhere -> We only want the header. CURLE_WRITE_ERROR implicitly means the connection is successful, because we received a response body! Any other error will not be CURLE_WRITE_ERROR and thus fail. Please run the following command: curl -v -X GET -I http://qemu.org/ It will at the last line read: * Excess found in a non pipelined read: excess = 279 url = / (zero-length body) That is the body we're discarding. Libcurl basically doesn't provide a nice way to handle this. That's why I've implemented in this fashion. > > goto out; > > curl_easy_getinfo(state->curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, ); > > if (d) > > [PATCH] commit ec8d3ef01eaca9264d97e9ad757fe536e0dc037b Author: Boris Schrijver Date: Mon Dec 7 22:01:59 2015 +0100 qemu-img / curl: When fetching Content-Size use GET instead of HEAD. A server can respond different to both methods, or can block one of the two. Signed-off-by: Boris Schrijver diff --git a/block/curl.c b/block/curl.c index 8994182..2e74c32 100644 --- a/block/curl.c +++ b/block/curl.c @@ -594,11 +594,11 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags, // Get file size s->accept_range = false; -curl_easy_setopt(state->curl, CURLOPT_NOBODY, 1); +curl_easy_setopt(state->curl, CURLOPT_HTTPGET, 1); curl_easy_setopt(state->curl, CURLOPT_HEADERFUNCTION, curl_header_cb); curl_easy_setopt(state->curl, CURLOPT_HEADERDATA, s); -if (curl_easy_perform(state->curl)) +if (curl_easy_perform(state->curl) != 23) goto out; curl_easy_getinfo(state->curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, ); if (d)
[Qemu-devel] [PATCH for-2.5] sparc: allow CASA with ASI 0xa from user space
On 12/04/2015 07:01 AM, Alex Zuepke wrote: > LEON3 allows the CASA instruction to be used from user space > if the ASI is set to 0xa (user data). > > Signed-off-by: Alex Zuepke> --- > target-sparc/translate.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/target-sparc/translate.c b/target-sparc/translate.c > index 41a3319..63440dd 100644 > --- a/target-sparc/translate.c > +++ b/target-sparc/translate.c > @@ -5097,7 +5097,8 @@ static void disas_sparc_insn(DisasContext * dc, > unsigned int insn) > if (IS_IMM) { > goto illegal_insn; > } > -if (!supervisor(dc)) { > +/* LEON3 allows CASA from user space with ASI 0xa */ > +if ((GET_FIELD(insn, 19, 26) != 0xa) && !supervisor(dc)) > { > goto priv_insn; > } > #endif > Reviewed-by: Richard Henderson This should probably be merged for 2.5. For 2.6, I have a branch with substantial changes for the sparc backend. Part of which totally revamps the way ASIs are handled. I believe it gets this right. See git://github.com/rth7680/qemu.git tgt-sparc r~
Re: [Qemu-devel] [PATCH] qemu-img / curl: When fetching Content-Size use GET instead of HEAD.
Hi! To clarify: The body of the response is the maximum size defined by MTU network policies, so by default around ~1500 bytes. After that is received, the header is parsed and the connection is dropped! So no whole file transfers!!! Please test and see for your self! -- Met vriendelijke groet / Kind regards, Boris Schrijver PCextreme B.V. http://www.pcextreme.nl/contact Tel direct: +31 (0) 118 700 215 > On December 8, 2015 at 8:56 PM Michael Tokarevwrote: > > > 08.12.2015 00:23, Boris Schrijver wrote: > [] > > It's is therefore better to use only the GET request method, and discard the > > body at the first call. > > Nooo! Please N! > > Fetching a large file once might be too long already. > Fetching it twice is twice as long. Oh well!.. > > Thanks, > > /mjt
Re: [Qemu-devel] [PATCH 4/8] bcm2835_emmc: add bcm2835 MMC/SD controller
On Tue, Dec 8, 2015 at 10:19 PM, Andrew Baumannwrote: >> From: Peter Crosthwaite [mailto:crosthwaitepe...@gmail.com] >> Sent: Saturday, 5 December 2015 21:26 >> Is this IP just SDHCI? We already model SDHCI in QEMU, see >> hw/sd/sdhci.c. If there are RPi specific features to the SDHCI >> implementation they should be added as optional extensions (probabably >> via subclassing) to the existing SDHCI model. > > So yes, it turns out this is fairly similar to SDHCI (-> lots of wasted work > by Gregory and me, sigh), and indeed Linux boots with the existing sdhci > emulation. However, there are some quirks, and UEFI/Windows depend on them. > Namely: > * The host control registers (offset 0x28 and above) seem to differ > significantly. Maybe this is due to the SDHC version -- according to the > BCM2835 peripherals spec, the controller implements "Version 3.0 Draft 1.0" > of the SDHC spec, but of course I can't find that spec online anywhere. > Luckily nothing seems to depend on this, besides a few spurious warnings > about invalid writes. Looks reasonably consistent from a quick scan? 0x28 in shdci.c is only doing the ADMA stuff while there are other fields on the BCM model. > * Power is assumed to be always on -- the sdhci model requires the guest to > turn it on by a write at offset 0x29 before issuing any commands, but on pi > this bit is marked reserved, and commands are issued immediately after reset. Does this help?: https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06271.html > * The card inserted interrupt is rather broken on pi: it is set at the start > of day, but a reset command clears it and it stays clear thereafter (and > never generates interrupts). > Is that more likely to be an IP connectivity problem (wierd input to the card-detect pin in the SoC)? > There's an inconsistency with response handling, too, although I'm not sure > if it's a quirk of the Pi or a general bug in sdhci. Pi UEFI sends a CMD23 > without setting any of the response bits, but this command does in fact > generate a 4-byte R1 response. The question is whether this should be treated > as an error, or whether it simply means that the host wants to ignore the > response. In sdhci, the following code path (around line 246) raises a > "command index" error in the case that a non-zero response is returned but no > response bits were set in the command register: > > } else if (rlen != 0 && (s->errintstsen & SDHC_EISEN_CMDIDX)) { > s->errintsts |= SDHC_EIS_CMDIDX; > s->norintsts |= SDHC_NIS_ERR; > } > > I do not observe this behaviour on the real Pi2 (and it breaks UEFI). The > hardware semantics appear to be "if the command generates a response, but you > didn't want to see it, we'll successfully complete the command and ignore the > response", whereas the sdhci implementation raises an error for this as well > as signalling completion. I have read the "SD Specifications Part A2 SD Host > Controller Simplified Specification Version 2.00", but did not find anything > describing this case, so it could be that this is open to interpretation. (It > could also be specified in SDHC v3.) The specific error also seems odd -- my > understanding is that a "command index" error means that the index in the > response didn't match the index of the issued command, but that's hardly what > is happening here. > Starting to sound like a bug. > Assuming this latter bug can be fixed generically, how do you propose > handling the Pi quirks? I could add a bool property for "bcm2835-quirks" or > similar and just special-case the relevant code (my preferred approach). I'm > also open to subclassing, but no idea how that would work in practice, so > would need some pointers. > I think we need a more definitive list of the register level features that are different or added, I am not seeing what is BCM specific just yet. Regards, Peter > Thanks, > Andrew
Re: [Qemu-devel] [PATCH] virtio-blk: Drop x-data-plane option
On Mon, Dec 07, 2015 at 05:10:26PM +, Peter Maydell wrote: > On 7 December 2015 at 15:19, Paolo Bonziniwrote: > > > > > > On 07/12/2015 14:02, Fam Zheng wrote: > >> On Mon, 12/07 12:29, Cornelia Huck wrote: > >>> On Mon, 7 Dec 2015 18:59:27 +0800 > >>> Fam Zheng wrote: > >>> > The official way of enabling dataplane is through the "iothread" > property that references an iothread object created by "-object > iothread". Since the old "x-data-plane=on" way now even crashes, it's > probably easier to just drop it: > > $ qemu-system-x86_64 -drive file=null-co://,id=d0,if=none \ > -device virtio-blk-pci,drive=d0,x-data-plane=on > > ERROR:/home/fam/work/qemu/qom/object.c:1515: > object_get_canonical_path_component: assertion failed: (obj->parent != > NULL) > Aborted > >>> > >>> Do we understand yet why this crashes, btw? > >> > >> I think it's because with x-data-plane=on, virtio-blk initialize an object > >> that > >> doesn't have a parent, therefore it doesn't have a valid "canonical path > >> component" thing, which is different from objects created with "-object" > >> CLI. > >> I'm not very familiar with the QOM semantics here. > >> > >>> > > Signed-off-by: Fam Zheng > --- > hw/block/dataplane/virtio-blk.c | 15 ++- > hw/block/virtio-blk.c | 1 - > include/hw/virtio/virtio-blk.h | 1 - > 3 files changed, 2 insertions(+), 15 deletions(-) > > >>> > >>> No general objection to removing x-data-plane; but this probably wants > >>> a mention on the changelog as x-data-plane has been described in > >>> various howtos etc. over the years. > >> > >> Yes, that is a good point. I don't know if it's too rushing in removing > >> it for > >> 2.5 (this is just posted as one option) and we'll have to count on QOM > >> experts > >> for the fix, if it is. > > > > The solution would be to add object_property_add_child to > > virtio_blk_data_plane_create, between object_initialize and > > user_creatable_complete. But I think this patch is ok for 2.5. > > Paolo asked me to apply this to master, so I have done so. Okay. I will do my best to communicate that x-data-plane is gone. Stefan signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH] vfio: Align iova also to IOMMU page size
On Mon, 2015-12-07 at 11:20 +, Peter Maydell wrote: > On 7 December 2015 at 10:53, Pavel Fedinwrote: > >> TAGET_PAGE_ALIGN tells us that it *could* be a valid DMA target though. > >> The VM model is capable of using that as a page size, which means we > >> assume it is and want to generate a fault. > > > > We seem to have looped back. So... > > It is possible to fix this according to this assumption. In this > > case we would need to make TARGET_PAGE_BITS a variable. If we are > > emulating ancient armv5te, it will be set to 10. For modern targets, > > ARMv6 and newer, it will be 12. > > You can't just make TARGET_PAGE_BITS a variable, it is used as a compile > time constant in a bunch of TCG internal stuff. It would be nice > if we didn't require it to be compile time, but it would be a lot of > work to fix (especially if you want to avoid it being a performance > hit). > > In any case, that still doesn't fix the problem. On an AArch64 > target CPU, TARGET_PAGE_BITS still has to be 12 (for a 4K > minimum page size), but the guest and host could still be using > 64K pages. So your VFIO code *must* be able to deal with the > situation where TARGET_PAGE_BITS is smaller than any alignment > that the guest, host or IOMMU need to care about. > > I still think the VFIO code needs to figure out what alignment > it actually cares about and find some way to determine what > that is, or alternatively if the relevant alignment is not > possible to determine, write the code so that it doesn't > need to care. Either way, TARGET_PAGE_ALIGN is not the answer. Ok, let's work our way down through the relevant page sizes, host, IOMMU, and target. The host page size is relevant because this is the granularity with which the kernel can pin pages. Every IOMMU mapping must be backed by a pinned page in the current model since we don't really have hardware to support IOMMU page faults. The IOMMU page size defines the granularity with which we can map IOVA to physical memory. The IOMMU may support multiple page sizes, but what we're really talking about here is the minimum page size. The target page size is relevant because this defines the minimum possible page size used within the VM. We presume that anything less than TARGET_PAGE_ALIGN cannot be referenced as a page by the VM CPU and therefore is probably not allocated as a DMA buffer for a driver running within the guest. An implementation detail here is that the vfio type1 IOMMU model currently exposes the host page size as the minimum IOMMU page size. The reason for this is to simplify page accounting, if we don't allow sub-host page mappings we don't need per page reference counting. This can be fixed within the current API, but kernel changes are required or else locked page requirements due to over-counting become a problem. The benefit though is that this abstracts the host page size from QEMU. So let's take the easy scenario first, if target page size is greater than or equal to the minimum IOMMU page size, we're golden. We can map anything that could be a target DMA buffer. This leads to the current situation that we simply ignore any ranges which disappear when we align to the target page size. It can't be a DMA buffer, ignore it. Note that the 64k host, 4k target problem goes away if type1 accounting is fixed to allow IOMMU granularity mapping, since I think in the cases we care about the IOMMU still supports 4k pages, otherwise... Then we come to the scenario here, where target page size is less than the minimum IOMMU page size. The current code is intentionally trying to trigger the vfio type1 error that this cannot be mapped. To resolve this, QEMU needs to decide if it's ok to provide the device with DMA access to everything on that IOMMU granularity page, ensure that aliases mapping the same IOMMU page are consistent and handle the reference counting for those sub-mappings to avoid duplicate mappings and premature unmaps. So I think in the end, the one page size we care about is the minimum IOMMU granularity. We don't really care about the target page size at all and maybe we only care about the host page size for determining what might share a page with a sub-page mapping. However, there's work to get there (QEMU, kernel, or both depending on the specific config) and the target page size trick has so far been a useful simplification. Thanks, Alex
Re: [Qemu-devel] tcg: improve MAX_CODE_GEN_BUFFER_SIZE for arm
On Tue, Dec 8, 2015 at 7:21 PM, Aurelien Jarnowrote: > On 2015-12-08 11:51, Laurent Desnogues wrote: >> Hello, >> >> On Tue, Dec 8, 2015 at 11:39 AM, Aurelien Jarno wrote: >> [...] >> > I already posted a patch a long time ago to remove the 16MB limit on ARM >> > hosts: >> > >> > http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg01684.html >> > >> > However as you can see in the thread, it has been rejected as it doesn't >> > not bring improvement in all cases. >> >> We could perhaps resurrect it and do some more benchmarking? Who >> would be able to do testing on (recent) ARM hardware? > > I can provide an updated patch, but I would prefer if someone else does > the benchmarking on a really recent hardware. Not sure the hardware I > have (cortex A7) is really representative of a modern ARM CPU. ok,I wait your new patch, thanks. I have arm A7 too. > Aurelien > > -- > Aurelien Jarno GPG: 4096R/1DDD8C9B > aurel...@aurel32.net http://www.aurel32.net
[Qemu-devel] [PATCH] xen_pt: fix failure of attaching & detaching a PCI device to VM repeatedly
Add pci = [ '$VF_BDF1', '$VF_BDF2', '$VF_BDF3'] in hvm guest configuration file. After the guest boot up, detach the VFs in sequence by "xl pci-detach $DOMID $VF_BDF", reattach the VFs by "xl pci-attach $VF_BDF" in sequence. An error message will be reported like this: "libxl: error: libxl_qmp.c:287:qmp_handle_error_response: received an error message from QMP server: Duplicate ID 'pci-pt-07_10.1' for device" When xen_pt_region_add/del() is called, MemoryRegion may not belong to the XenPCIPassthroughState. xen_pt_region_update() checks it but memory_region_ref/unref() does not. This case causes obj->ref issue and affects the release of related objects. So, memory_region_ref/unref() is moved from xen_pt_region_add/del inside xen_pt_region_update. Signed-off-by: Jianzhong,Chang--- hw/xen/xen_pt.c | 10 +- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c index aa96288..45d4d6c 100644 --- a/hw/xen/xen_pt.c +++ b/hw/xen/xen_pt.c @@ -590,7 +590,11 @@ static void xen_pt_region_update(XenPCIPassthroughState *s, if (bar == -1 && (!s->msix || >msix->mmio != mr)) { return; } - +if (adding) { +memory_region_ref(mr); +} else { +memory_region_unref(mr); +} if (s->msix && >msix->mmio == mr) { if (adding) { s->msix->mmio_base_addr = sec->offset_within_address_space; @@ -642,7 +646,6 @@ static void xen_pt_region_add(MemoryListener *l, MemoryRegionSection *sec) XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState, memory_listener); -memory_region_ref(sec->mr); xen_pt_region_update(s, sec, true); } @@ -652,7 +655,6 @@ static void xen_pt_region_del(MemoryListener *l, MemoryRegionSection *sec) memory_listener); xen_pt_region_update(s, sec, false); -memory_region_unref(sec->mr); } static void xen_pt_io_region_add(MemoryListener *l, MemoryRegionSection *sec) @@ -660,7 +662,6 @@ static void xen_pt_io_region_add(MemoryListener *l, MemoryRegionSection *sec) XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState, io_listener); -memory_region_ref(sec->mr); xen_pt_region_update(s, sec, true); } @@ -670,7 +671,6 @@ static void xen_pt_io_region_del(MemoryListener *l, MemoryRegionSection *sec) io_listener); xen_pt_region_update(s, sec, false); -memory_region_unref(sec->mr); } static const MemoryListener xen_pt_memory_listener = { -- 1.7.1
Re: [Qemu-devel] [Qemu-ppc] [PATCHv2 07/10] pseries: DEFINE_SPAPR_MACHINE
On 12/08/2015 01:38 PM, Sam Bobroff wrote: On Mon, Dec 07, 2015 at 02:34:37PM +1100, David Gibson wrote: At the moment all the class_init functions and TypeInfo structures for the various versioned pseries machine types are open-coded. As more versions are created this is getting increasingly clumsy. This patch borrows the approach used in PC, using a DEFINE_SPAPR_MACHINE() macro to construct most of the boilerplate from simpler 'class_options' and 'instance_options' functions. This patch makes a small semantic change - the versioned machine types are now registered through machine_init() instead of type_init(). Since the new way is how PC already did it, I'm assuming that's correct. Signed-off-by: David Gibson--- hw/ppc/spapr.c | 119 - 1 file changed, 49 insertions(+), 70 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 3078e60..4f645f3 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -2302,24 +2302,47 @@ static const TypeInfo spapr_machine_info = { }, }; +#define DEFINE_SPAPR_MACHINE(suffix, verstr) \ +static void spapr_machine_##suffix##_class_init(ObjectClass *oc, \ +void *data) \ +{\ +MachineClass *mc = MACHINE_CLASS(oc);\ +spapr_machine_##suffix##_class_options(mc); \ +}\ +static void spapr_machine_##suffix##_instance_init(Object *obj) \ +{\ +MachineState *machine = MACHINE(obj);\ +spapr_machine_##suffix##_instance_options(machine); \ +}\ +static const TypeInfo spapr_machine_##suffix##_info = { \ +.name = MACHINE_TYPE_NAME("pseries-" verstr),\ +.parent = TYPE_SPAPR_MACHINE,\ +.class_init = spapr_machine_##suffix##_class_init, \ +.instance_init = spapr_machine_##suffix##_instance_init, \ +}; \ +static void spapr_machine_register_##suffix(void)\ +{\ +type_register(_machine_##suffix##_info); \ +}\ +machine_init(spapr_machine_register_##suffix) + /* * pseries-2.5 */ -static void spapr_machine_2_5_class_init(ObjectClass *oc, void *data) +static void spapr_machine_2_5_instance_options(MachineState *machine) { -MachineClass *mc = MACHINE_CLASS(oc); -sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc); +} + +static void spapr_machine_2_5_class_options(MachineClass *mc) +{ +sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc); mc->alias = "pseries"; mc->is_default = 1; smc->dr_lmb_enabled = true; } -static const TypeInfo spapr_machine_2_5_info = { -.name = MACHINE_TYPE_NAME("pseries-2.5"), -.parent= TYPE_SPAPR_MACHINE, -.class_init= spapr_machine_2_5_class_init, -}; +DEFINE_SPAPR_MACHINE(2_5, "2.5"); /* * pseries-2.4 @@ -2327,18 +2350,17 @@ static const TypeInfo spapr_machine_2_5_info = { #define SPAPR_COMPAT_2_4 \ HW_COMPAT_2_4 -static void spapr_machine_2_4_class_init(ObjectClass *oc, void *data) +static void spapr_machine_2_4_instance_options(MachineState *machine) { -MachineClass *mc = MACHINE_CLASS(oc); +spapr_machine_2_5_instance_options(machine); +} +static void spapr_machine_2_4_class_options(MachineClass *mc) +{ SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_4); } -static const TypeInfo spapr_machine_2_4_info = { -.name = MACHINE_TYPE_NAME("pseries-2.4"), -.parent= TYPE_SPAPR_MACHINE, -.class_init= spapr_machine_2_4_class_init, -}; +DEFINE_SPAPR_MACHINE(2_4, "2.4"); /* * pseries-2.3 @@ -2352,30 +2374,18 @@ static const TypeInfo spapr_machine_2_4_info = { .value= "off",\ }, -static void spapr_compat_2_3(Object *obj) +static void spapr_machine_2_3_instance_options(MachineState *machine) { +spapr_machine_2_4_instance_options(machine); savevm_skip_section_footers(); global_state_set_optional(); } -static void spapr_machine_2_3_instance_init(Object *obj) -{ -spapr_compat_2_3(obj); -} - -static void spapr_machine_2_3_class_init(ObjectClass *oc, void *data) +static void spapr_machine_2_3_class_options(MachineClass *mc) { -MachineClass *mc = MACHINE_CLASS(oc); - SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_3); } - -static const TypeInfo spapr_machine_2_3_info = { -.name
Re: [Qemu-devel] [PATCH 1/7] pc: wire up TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE for !xen
> Hi Gerd, > > A quick question, does IGD_PASSTHROUGH makes sense for compat machine types? Unlikely to be used in practice, but I don't feel like creating different initialization code paths because of that ... > On the same topic, does machine->igd_gfx_passthru makes sense for all machine > types? Same answer ;) cheers, Gerd
[Qemu-devel] [PATCH v6 00/11] Add basic "detach" support for dump-guest-memory
v6 changes: - patch 10 - English error fix [Fam] - patch 11 - remove useless var: "not_used" [me] - all - move patch 8 to the end to be patch 11 (v5 patches 9-11 become v6 patches 8-10) [Eric] v5 changes: - patch 1 - comment English fix [Fam] - patch 2 - pass has_detach=true always in hmp_dump_guest_memory [Paolo] - patch 3 - always use local_err and error_propagate() when need to check the result [Fam] - patch 8 - add "DumpQueryResult" in DUMP_COMPLETED event [Eric] (since DumpQueryResult is introduced in patch 10, so doing it in patch 10 for convenience. Please let me know if I should not do this, e.g., if patch re-ordering is required) v4 changes: - patch 2: - hmp: fix default value lost [Eric] - English errors [Eric] - patch 3: - use global DumpState, leverage C99 struct init [Paolo] - English errors [Eric] - patch 5: - more cleanup for dump_process [Paolo] - patch 8: - make sure qmp-events.txt is sorted [Eric] - enhance error_get_pretty() [Eric] - emit DUMP_COMPLETED no matter detach or not - patch 10: - use g_new0 to replace g_malloc0 [Eric] - rename "written_bytes" to "completed", "total_bytes" to "total" [Eric] - use atomic ops and [rw]mb to protect status read/write [Paolo] - patch 12: - English errors [Eric] - merge contents into older patches [Eric] v3 changes (patch number corresponds to v2 patch set): - patch 1 - fix commit message. no memory leak, only code cleanup [Fam] - patch 2 - better documentation for "dump-guest-memory" (new patch 9) [Fam] - patch 3 - remove rcu lock/unlock in dump_init() [Fam, Paolo] - embed mr pointer into GuestPhysBlock [Paolo] - remove global dump state [Paolo] - patch 4 - fix memory leak for error [Fam] - evt DUMP_COMPLETED data: change to an optional "*error" [Paolo] - patch 5 - fix documents [Fam] - change "dump-query" to "query-dump", HMP to "info dump" [Paolo] - patch 6 - for query-dump command: define enum for DumpStatus, use "int" for written/total [Paolo] - all - reorder the commits as suggested, no fake values [Paolo] - split big commit into smaller ones [me] v2 changes: - fixed English errors [Drew] - reordered the "detach" field, first make it optional, then make sure it's order is consistent [Drew, Fam] - added doc for new detach flag [Eric] - collected error msg even detached [Drew] - added qmp event DUMP_COMPLETED to notify user [Eric, Fam] - added "dump-query" QMP & HMP commands to query dump status [Eric] - "stop" is not allowed when dump in background (also include "cont" and "dump-guest-memory") [Fam] - added codes to calculate how many dump work finished, which could be queried from "dump-query" [Laszlo] - added list to track all used MemoryRegion objects, also ref before use [Paolo] - dump-guest-memory will be forbidden during incoming migrate [Paolo] - taking rcu lock when collecting memory info [Paolo] Test Done: - QMP & HMP - test default dump (sync), work as usual - test detached dump, command return immediately. - When dump finished, will receive event DUMP_COMPLETED. - test query-dump before/during/after dump - test kdump with zlib compression, w/ and w/o detach - libvirt - test "virsh dump --memory-only" with default format and kdump-zlib format, work as usual Peter Xu (11): dump-guest-memory: cleanup: removing dump_{error|cleanup}(). dump-guest-memory: add "detach" flag for QMP/HMP interfaces. dump-guest-memory: using static DumpState, add DumpStatus dump-guest-memory: add dump_in_progress() helper function dump-guest-memory: introduce dump_process() helper function. dump-guest-memory: disable dump when in INMIGRATE state dump-guest-memory: add "detach" support DumpState: adding total_size and written_size fields Dump: add qmp command "query-dump" Dump: add hmp command "info dump" dump-guest-memory: add qmp event DUMP_COMPLETED docs/qmp-events.txt | 18 dump.c | 215 ++-- hmp-commands-info.hx| 14 +++ hmp-commands.hx | 5 +- hmp.c | 26 - hmp.h | 1 + include/qemu-common.h | 4 + include/sysemu/dump.h | 15 +++ include/sysemu/memory_mapping.h | 4 + memory_mapping.c| 3 + qapi-schema.json| 56 ++- qapi/event.json | 16 +++ qmp-commands.hx | 31 +- qmp.c | 14 +++ 14 files changed, 359 insertions(+), 63 deletions(-) -- 2.4.3
Re: [Qemu-devel] [Qemu-ppc] [PATCHv2 07/10] pseries: DEFINE_SPAPR_MACHINE
On 12/09/2015 02:30 PM, Alexey Kardashevskiy wrote: On 12/08/2015 01:38 PM, Sam Bobroff wrote: On Mon, Dec 07, 2015 at 02:34:37PM +1100, David Gibson wrote: At the moment all the class_init functions and TypeInfo structures for the various versioned pseries machine types are open-coded. As more versions are created this is getting increasingly clumsy. This patch borrows the approach used in PC, using a DEFINE_SPAPR_MACHINE() macro to construct most of the boilerplate from simpler 'class_options' and 'instance_options' functions. This patch makes a small semantic change - the versioned machine types are now registered through machine_init() instead of type_init(). Since the new way is how PC already did it, I'm assuming that's correct. Signed-off-by: David Gibson--- hw/ppc/spapr.c | 119 - 1 file changed, 49 insertions(+), 70 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 3078e60..4f645f3 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -2302,24 +2302,47 @@ static const TypeInfo spapr_machine_info = { }, }; +#define DEFINE_SPAPR_MACHINE(suffix, verstr) \ +static void spapr_machine_##suffix##_class_init(ObjectClass *oc, \ +void *data) \ +{\ +MachineClass *mc = MACHINE_CLASS(oc);\ +spapr_machine_##suffix##_class_options(mc); \ +}\ +static void spapr_machine_##suffix##_instance_init(Object *obj) \ +{\ +MachineState *machine = MACHINE(obj);\ +spapr_machine_##suffix##_instance_options(machine); \ +}\ +static const TypeInfo spapr_machine_##suffix##_info = { \ +.name = MACHINE_TYPE_NAME("pseries-" verstr),\ +.parent = TYPE_SPAPR_MACHINE,\ +.class_init = spapr_machine_##suffix##_class_init, \ +.instance_init = spapr_machine_##suffix##_instance_init, \ +}; \ +static void spapr_machine_register_##suffix(void)\ +{\ +type_register(_machine_##suffix##_info); \ +}\ +machine_init(spapr_machine_register_##suffix) + /* * pseries-2.5 */ -static void spapr_machine_2_5_class_init(ObjectClass *oc, void *data) +static void spapr_machine_2_5_instance_options(MachineState *machine) { -MachineClass *mc = MACHINE_CLASS(oc); -sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc); +} + +static void spapr_machine_2_5_class_options(MachineClass *mc) +{ +sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc); mc->alias = "pseries"; mc->is_default = 1; smc->dr_lmb_enabled = true; } -static const TypeInfo spapr_machine_2_5_info = { -.name = MACHINE_TYPE_NAME("pseries-2.5"), -.parent= TYPE_SPAPR_MACHINE, -.class_init= spapr_machine_2_5_class_init, -}; +DEFINE_SPAPR_MACHINE(2_5, "2.5"); /* * pseries-2.4 @@ -2327,18 +2350,17 @@ static const TypeInfo spapr_machine_2_5_info = { #define SPAPR_COMPAT_2_4 \ HW_COMPAT_2_4 -static void spapr_machine_2_4_class_init(ObjectClass *oc, void *data) +static void spapr_machine_2_4_instance_options(MachineState *machine) { -MachineClass *mc = MACHINE_CLASS(oc); +spapr_machine_2_5_instance_options(machine); +} +static void spapr_machine_2_4_class_options(MachineClass *mc) +{ SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_4); } -static const TypeInfo spapr_machine_2_4_info = { -.name = MACHINE_TYPE_NAME("pseries-2.4"), -.parent= TYPE_SPAPR_MACHINE, -.class_init= spapr_machine_2_4_class_init, -}; +DEFINE_SPAPR_MACHINE(2_4, "2.4"); /* * pseries-2.3 @@ -2352,30 +2374,18 @@ static const TypeInfo spapr_machine_2_4_info = { .value= "off",\ }, -static void spapr_compat_2_3(Object *obj) +static void spapr_machine_2_3_instance_options(MachineState *machine) { +spapr_machine_2_4_instance_options(machine); savevm_skip_section_footers(); global_state_set_optional(); } -static void spapr_machine_2_3_instance_init(Object *obj) -{ -spapr_compat_2_3(obj); -} - -static void spapr_machine_2_3_class_init(ObjectClass *oc, void *data) +static void spapr_machine_2_3_class_options(MachineClass *mc) { -MachineClass *mc = MACHINE_CLASS(oc); - SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_3); } - -static const
Re: [Qemu-devel] [PATCHv2 09/10] pseries: Improve setting of default machine version
On 12/07/2015 02:34 PM, David Gibson wrote: This tweaks the way the default machine version is controlled, so that there will be a bit less churn when each new version is introduced. Signed-off-by: David Gibson--- hw/ppc/spapr.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 5af3d13..8b8eb18 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -2308,12 +2308,16 @@ static const TypeInfo spapr_machine_info = { }, }; -#define DEFINE_SPAPR_MACHINE(suffix, verstr) \ +#define DEFINE_SPAPR_MACHINE(suffix, verstr, latest) \ static void spapr_machine_##suffix##_class_init(ObjectClass *oc, \ void *data) \ {\ MachineClass *mc = MACHINE_CLASS(oc);\ spapr_machine_##suffix##_class_options(mc); \ +if (latest) {\ +mc->alias = "pseries"; \ +mc->is_default = 1; \ +}\ }\ static void spapr_machine_##suffix##_instance_init(Object *obj) \ {\ @@ -2342,11 +2346,9 @@ static void spapr_machine_2_5_instance_options(MachineState *machine) static void spapr_machine_2_5_class_options(MachineClass *mc) { /* Defaults for the latest behaviour inherited from the base class */ -mc->alias = "pseries"; -mc->is_default = 1; } -DEFINE_SPAPR_MACHINE(2_5, "2.5"); +DEFINE_SPAPR_MACHINE(2_5, "2.5", true); /* * pseries-2.4 @@ -2364,13 +2366,11 @@ static void spapr_machine_2_4_class_options(MachineClass *mc) sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc); spapr_machine_2_5_class_options(mc); -mc->alias = NULL; -mc->is_default = 0; smc->dr_lmb_enabled = false; SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_4); } -DEFINE_SPAPR_MACHINE(2_4, "2.4"); +DEFINE_SPAPR_MACHINE(2_4, "2.4", false); /* * pseries-2.3 @@ -2396,7 +2396,7 @@ static void spapr_machine_2_3_class_options(MachineClass *mc) spapr_machine_2_4_class_options(mc); SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_3); } -DEFINE_SPAPR_MACHINE(2_3, "2.3"); +DEFINE_SPAPR_MACHINE(2_3, "2.3", false); /* * pseries-2.2 @@ -2421,7 +2421,7 @@ static void spapr_machine_2_2_class_options(MachineClass *mc) spapr_machine_2_3_class_options(mc); SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_2); } -DEFINE_SPAPR_MACHINE(2_2, "2.2"); +DEFINE_SPAPR_MACHINE(2_2, "2.2", false); /* * pseries-2.1 @@ -2440,7 +2440,7 @@ static void spapr_machine_2_1_class_options(MachineClass *mc) spapr_machine_2_2_class_options(mc); SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_1); } -DEFINE_SPAPR_MACHINE(2_1, "2.1"); +DEFINE_SPAPR_MACHINE(2_1, "2.1", false); static void spapr_machine_register_types(void) { Reviewed-by: Alexey Kardashevskiy -- Alexey
Re: [Qemu-devel] [PATCH 4/8] bcm2835_emmc: add bcm2835 MMC/SD controller
> From: Peter Crosthwaite [mailto:crosthwaitepe...@gmail.com] > Sent: Saturday, 5 December 2015 21:26 > Is this IP just SDHCI? We already model SDHCI in QEMU, see > hw/sd/sdhci.c. If there are RPi specific features to the SDHCI > implementation they should be added as optional extensions (probabably > via subclassing) to the existing SDHCI model. So yes, it turns out this is fairly similar to SDHCI (-> lots of wasted work by Gregory and me, sigh), and indeed Linux boots with the existing sdhci emulation. However, there are some quirks, and UEFI/Windows depend on them. Namely: * The host control registers (offset 0x28 and above) seem to differ significantly. Maybe this is due to the SDHC version -- according to the BCM2835 peripherals spec, the controller implements "Version 3.0 Draft 1.0" of the SDHC spec, but of course I can't find that spec online anywhere. Luckily nothing seems to depend on this, besides a few spurious warnings about invalid writes. * Power is assumed to be always on -- the sdhci model requires the guest to turn it on by a write at offset 0x29 before issuing any commands, but on pi this bit is marked reserved, and commands are issued immediately after reset. * The card inserted interrupt is rather broken on pi: it is set at the start of day, but a reset command clears it and it stays clear thereafter (and never generates interrupts). There's an inconsistency with response handling, too, although I'm not sure if it's a quirk of the Pi or a general bug in sdhci. Pi UEFI sends a CMD23 without setting any of the response bits, but this command does in fact generate a 4-byte R1 response. The question is whether this should be treated as an error, or whether it simply means that the host wants to ignore the response. In sdhci, the following code path (around line 246) raises a "command index" error in the case that a non-zero response is returned but no response bits were set in the command register: } else if (rlen != 0 && (s->errintstsen & SDHC_EISEN_CMDIDX)) { s->errintsts |= SDHC_EIS_CMDIDX; s->norintsts |= SDHC_NIS_ERR; } I do not observe this behaviour on the real Pi2 (and it breaks UEFI). The hardware semantics appear to be "if the command generates a response, but you didn't want to see it, we'll successfully complete the command and ignore the response", whereas the sdhci implementation raises an error for this as well as signalling completion. I have read the "SD Specifications Part A2 SD Host Controller Simplified Specification Version 2.00", but did not find anything describing this case, so it could be that this is open to interpretation. (It could also be specified in SDHC v3.) The specific error also seems odd -- my understanding is that a "command index" error means that the index in the response didn't match the index of the issued command, but that's hardly what is happening here. Assuming this latter bug can be fixed generically, how do you propose handling the Pi quirks? I could add a bool property for "bcm2835-quirks" or similar and just special-case the relevant code (my preferred approach). I'm also open to subclassing, but no idea how that would work in practice, so would need some pointers. Thanks, Andrew
[Qemu-devel] [PATCH v6 01/11] dump-guest-memory: cleanup: removing dump_{error|cleanup}().
It might be a little bit confusing and error prone to do dump_cleanup() in these two functions. A better way is to do dump_cleanup() before dump finish, no matter whether dump has succeeded or not. Signed-off-by: Peter XuReviewed-by: Fam Zheng --- dump.c | 78 +++--- 1 file changed, 32 insertions(+), 46 deletions(-) diff --git a/dump.c b/dump.c index 78b7d84..445e739 100644 --- a/dump.c +++ b/dump.c @@ -82,12 +82,6 @@ static int dump_cleanup(DumpState *s) return 0; } -static void dump_error(DumpState *s, const char *reason, Error **errp) -{ -dump_cleanup(s); -error_setg(errp, "%s", reason); -} - static int fd_write_vmcore(const void *buf, size_t size, void *opaque) { DumpState *s = opaque; @@ -128,7 +122,7 @@ static void write_elf64_header(DumpState *s, Error **errp) ret = fd_write_vmcore(_header, sizeof(elf_header), s); if (ret < 0) { -dump_error(s, "dump: failed to write elf header", errp); +error_setg(errp, "dump: failed to write elf header"); } } @@ -159,7 +153,7 @@ static void write_elf32_header(DumpState *s, Error **errp) ret = fd_write_vmcore(_header, sizeof(elf_header), s); if (ret < 0) { -dump_error(s, "dump: failed to write elf header", errp); +error_setg(errp, "dump: failed to write elf header"); } } @@ -182,7 +176,7 @@ static void write_elf64_load(DumpState *s, MemoryMapping *memory_mapping, ret = fd_write_vmcore(, sizeof(Elf64_Phdr), s); if (ret < 0) { -dump_error(s, "dump: failed to write program header table", errp); +error_setg(errp, "dump: failed to write program header table"); } } @@ -205,7 +199,7 @@ static void write_elf32_load(DumpState *s, MemoryMapping *memory_mapping, ret = fd_write_vmcore(, sizeof(Elf32_Phdr), s); if (ret < 0) { -dump_error(s, "dump: failed to write program header table", errp); +error_setg(errp, "dump: failed to write program header table"); } } @@ -225,7 +219,7 @@ static void write_elf64_note(DumpState *s, Error **errp) ret = fd_write_vmcore(, sizeof(Elf64_Phdr), s); if (ret < 0) { -dump_error(s, "dump: failed to write program header table", errp); +error_setg(errp, "dump: failed to write program header table"); } } @@ -245,7 +239,7 @@ static void write_elf64_notes(WriteCoreDumpFunction f, DumpState *s, id = cpu_index(cpu); ret = cpu_write_elf64_note(f, cpu, id, s); if (ret < 0) { -dump_error(s, "dump: failed to write elf notes", errp); +error_setg(errp, "dump: failed to write elf notes"); return; } } @@ -253,7 +247,7 @@ static void write_elf64_notes(WriteCoreDumpFunction f, DumpState *s, CPU_FOREACH(cpu) { ret = cpu_write_elf64_qemunote(f, cpu, s); if (ret < 0) { -dump_error(s, "dump: failed to write CPU status", errp); +error_setg(errp, "dump: failed to write CPU status"); return; } } @@ -275,7 +269,7 @@ static void write_elf32_note(DumpState *s, Error **errp) ret = fd_write_vmcore(, sizeof(Elf32_Phdr), s); if (ret < 0) { -dump_error(s, "dump: failed to write program header table", errp); +error_setg(errp, "dump: failed to write program header table"); } } @@ -290,7 +284,7 @@ static void write_elf32_notes(WriteCoreDumpFunction f, DumpState *s, id = cpu_index(cpu); ret = cpu_write_elf32_note(f, cpu, id, s); if (ret < 0) { -dump_error(s, "dump: failed to write elf notes", errp); +error_setg(errp, "dump: failed to write elf notes"); return; } } @@ -298,7 +292,7 @@ static void write_elf32_notes(WriteCoreDumpFunction f, DumpState *s, CPU_FOREACH(cpu) { ret = cpu_write_elf32_qemunote(f, cpu, s); if (ret < 0) { -dump_error(s, "dump: failed to write CPU status", errp); +error_setg(errp, "dump: failed to write CPU status"); return; } } @@ -326,7 +320,7 @@ static void write_elf_section(DumpState *s, int type, Error **errp) ret = fd_write_vmcore(, shdr_size, s); if (ret < 0) { -dump_error(s, "dump: failed to write section header table", errp); +error_setg(errp, "dump: failed to write section header table"); } } @@ -336,7 +330,7 @@ static void write_data(DumpState *s, void *buf, int length, Error **errp) ret = fd_write_vmcore(buf, length, s); if (ret < 0) { -dump_error(s, "dump: failed to save memory", errp); +error_setg(errp, "dump: failed to save memory"); } } @@ -568,11 +562,6 @@ static void dump_begin(DumpState *s, Error **errp) } } -static void dump_completed(DumpState *s) -{ -dump_cleanup(s); -} - static int get_next_block(DumpState *s,
[Qemu-devel] [PATCH v6 05/11] dump-guest-memory: introduce dump_process() helper function.
No functional change. Cleanup only. Signed-off-by: Peter XuReviewed-by: Fam Zheng --- dump.c| 31 +-- include/sysemu/dump.h | 3 +++ 2 files changed, 24 insertions(+), 10 deletions(-) diff --git a/dump.c b/dump.c index ccd56c8..f0ee9a8 100644 --- a/dump.c +++ b/dump.c @@ -1441,6 +1441,9 @@ static void dump_init(DumpState *s, int fd, bool has_format, Error *err = NULL; int ret; +s->has_format = has_format; +s->format = format; + /* kdump-compressed is conflict with paging and filter */ if (has_format && format != DUMP_GUEST_MEMORY_FORMAT_ELF) { assert(!paging && !has_filter); @@ -1594,6 +1597,23 @@ cleanup: dump_cleanup(s); } +/* this operation might be time consuming. */ +static void dump_process(DumpState *s, Error **errp) +{ +Error *local_err = NULL; + +if (s->has_format && s->format != DUMP_GUEST_MEMORY_FORMAT_ELF) { +create_kdump_vmcore(s, _err); +} else { +create_vmcore(s, _err); +} + +s->status = (local_err ? DUMP_STATUS_FAILED : DUMP_STATUS_COMPLETED); +error_propagate(errp, local_err); + +dump_cleanup(s); +} + void qmp_dump_guest_memory(bool paging, const char *file, bool has_detach, bool detach, bool has_begin, int64_t begin, bool has_length, @@ -1679,16 +1699,7 @@ void qmp_dump_guest_memory(bool paging, const char *file, return; } -if (has_format && format != DUMP_GUEST_MEMORY_FORMAT_ELF) { -create_kdump_vmcore(s, _err); -} else { -create_vmcore(s, _err); -} - -s->status = (local_err ? DUMP_STATUS_FAILED : DUMP_STATUS_COMPLETED); -error_propagate(errp, local_err); - -dump_cleanup(s); +dump_process(s, errp); } DumpGuestMemoryCapability *qmp_query_dump_guest_memory_capability(Error **errp) diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h index affef38..d6f4a9c 100644 --- a/include/sysemu/dump.h +++ b/include/sysemu/dump.h @@ -185,6 +185,9 @@ typedef struct DumpState { size_t num_dumpable;/* number of page that can be dumped */ uint32_t flag_compress; /* indicate the compression format */ DumpStatus status; /* current dump status */ + +bool has_format; /* whether format is provided */ +DumpGuestMemoryFormat format; /* valid only if has_format == true */ } DumpState; uint16_t cpu_to_dump16(DumpState *s, uint16_t val); -- 2.4.3
[Qemu-devel] [PATCH v6 10/11] Dump: add hmp command "info dump"
It will calculate percentage of finished work from completed and total. Signed-off-by: Peter Xu--- hmp-commands-info.hx | 14 ++ hmp.c| 17 + hmp.h| 1 + 3 files changed, 32 insertions(+) diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx index 9b71351..52539c3 100644 --- a/hmp-commands-info.hx +++ b/hmp-commands-info.hx @@ -786,6 +786,20 @@ STEXI Display the value of a storage key (s390 only) ETEXI +{ +.name = "dump", +.args_type = "", +.params = "", +.help = "Display the latest dump status", +.mhandler.cmd = hmp_info_dump, +}, + +STEXI +@item info dump +@findex dump +Display the latest dump status. +ETEXI + STEXI @end table ETEXI diff --git a/hmp.c b/hmp.c index 1f4d0b6..c824064 100644 --- a/hmp.c +++ b/hmp.c @@ -2383,3 +2383,20 @@ void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict) qapi_free_RockerOfDpaGroupList(list); } + +void hmp_info_dump(Monitor *mon, const QDict *qdict) +{ +DumpQueryResult *result = qmp_query_dump(NULL); + +assert(result->status < DUMP_STATUS_MAX); +monitor_printf(mon, "Status: %s\n", DumpStatus_lookup[result->status]); + +if (result->status == DUMP_STATUS_ACTIVE) { +float percent = 0; +assert(result->total != 0); +percent = 100.0 * result->completed / result->total; +monitor_printf(mon, "Finished: %.2f %%\n", percent); +} + +qapi_free_DumpQueryResult(result); +} diff --git a/hmp.h b/hmp.h index a8c5b5a..093d65f 100644 --- a/hmp.h +++ b/hmp.h @@ -131,5 +131,6 @@ void hmp_rocker(Monitor *mon, const QDict *qdict); void hmp_rocker_ports(Monitor *mon, const QDict *qdict); void hmp_rocker_of_dpa_flows(Monitor *mon, const QDict *qdict); void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict); +void hmp_info_dump(Monitor *mon, const QDict *qdict); #endif -- 2.4.3
[Qemu-devel] [PATCH v6 06/11] dump-guest-memory: disable dump when in INMIGRATE state
Signed-off-by: Peter XuReviewed-by: Fam Zheng --- dump.c | 5 + 1 file changed, 5 insertions(+) diff --git a/dump.c b/dump.c index f0ee9a8..aa9d1f8 100644 --- a/dump.c +++ b/dump.c @@ -1625,6 +1625,11 @@ void qmp_dump_guest_memory(bool paging, const char *file, DumpState *s; Error *local_err = NULL; +if (runstate_check(RUN_STATE_INMIGRATE)) { +error_setg(errp, "Dump not allowed during incoming migration."); +return; +} + /* if there is a dump in background, we should wait until the dump * finished */ if (dump_in_progress()) { -- 2.4.3
[Qemu-devel] [PATCH v6 02/11] dump-guest-memory: add "detach" flag for QMP/HMP interfaces.
This patch only adds the interfaces, but does not implement them. "detach" parameter is made optional, to make sure that all the old dump-guest-memory requests will still be able to work. Signed-off-by: Peter XuReviewed-by: Fam Zheng --- dump.c | 5 +++-- hmp-commands.hx | 5 +++-- hmp.c| 9 +++-- qapi-schema.json | 8 ++-- qmp-commands.hx | 6 -- 5 files changed, 23 insertions(+), 10 deletions(-) diff --git a/dump.c b/dump.c index 445e739..d79e0ed 100644 --- a/dump.c +++ b/dump.c @@ -1580,8 +1580,9 @@ cleanup: dump_cleanup(s); } -void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin, - int64_t begin, bool has_length, +void qmp_dump_guest_memory(bool paging, const char *file, + bool has_detach, bool detach, + bool has_begin, int64_t begin, bool has_length, int64_t length, bool has_format, DumpGuestMemoryFormat format, Error **errp) { diff --git a/hmp-commands.hx b/hmp-commands.hx index bb52e4d..664d794 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1056,10 +1056,11 @@ ETEXI { .name = "dump-guest-memory", -.args_type = "paging:-p,zlib:-z,lzo:-l,snappy:-s,filename:F,begin:i?,length:i?", -.params = "[-p] [-z|-l|-s] filename [begin length]", +.args_type = "paging:-p,detach:-d,zlib:-z,lzo:-l,snappy:-s,filename:F,begin:i?,length:i?", +.params = "[-p] [-d] [-z|-l|-s] filename [begin length]", .help = "dump guest memory into file 'filename'.\n\t\t\t" "-p: do paging to get guest's memory mapping.\n\t\t\t" + "-d: return immediately (do not wait for completion).\n\t\t\t" "-z: dump in kdump-compressed format, with zlib compression.\n\t\t\t" "-l: dump in kdump-compressed format, with lzo compression.\n\t\t\t" "-s: dump in kdump-compressed format, with snappy compression.\n\t\t\t" diff --git a/hmp.c b/hmp.c index 2140605..1f4d0b6 100644 --- a/hmp.c +++ b/hmp.c @@ -1586,8 +1586,10 @@ void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict) const char *file = qdict_get_str(qdict, "filename"); bool has_begin = qdict_haskey(qdict, "begin"); bool has_length = qdict_haskey(qdict, "length"); +bool has_detach = qdict_haskey(qdict, "detach"); int64_t begin = 0; int64_t length = 0; +bool detach = false; enum DumpGuestMemoryFormat dump_format = DUMP_GUEST_MEMORY_FORMAT_ELF; char *prot; @@ -1615,11 +1617,14 @@ void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict) if (has_length) { length = qdict_get_int(qdict, "length"); } +if (has_detach) { +detach = qdict_get_bool(qdict, "detach"); +} prot = g_strconcat("file:", file, NULL); -qmp_dump_guest_memory(paging, prot, has_begin, begin, has_length, length, - true, dump_format, ); +qmp_dump_guest_memory(paging, prot, true, detach, has_begin, begin, + has_length, length, true, dump_format, ); hmp_handle_error(mon, ); g_free(prot); } diff --git a/qapi-schema.json b/qapi-schema.json index 8b1a423..97c3ac4 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -2115,6 +2115,9 @@ #2. fd: the protocol starts with "fd:", and the following string # is the fd's name. # +# @detach: #optional if true, QMP will return immediately rather than +# waiting for the dump to finish. (since 2.6). +# # @begin: #optional if specified, the starting physical address. # # @length: #optional if specified, the memory size, in bytes. If you don't @@ -2131,8 +2134,9 @@ # Since: 1.2 ## { 'command': 'dump-guest-memory', - 'data': { 'paging': 'bool', 'protocol': 'str', '*begin': 'int', -'*length': 'int', '*format': 'DumpGuestMemoryFormat' } } + 'data': { 'paging': 'bool', 'protocol': 'str', '*detach': 'bool', +'*begin': 'int', '*length': 'int', +'*format': 'DumpGuestMemoryFormat'} } ## # @DumpGuestMemoryCapability: diff --git a/qmp-commands.hx b/qmp-commands.hx index 9d8b42f..6b51585 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -840,8 +840,8 @@ EQMP { .name = "dump-guest-memory", -.args_type = "paging:b,protocol:s,begin:i?,end:i?,format:s?", -.params = "-p protocol [begin] [length] [format]", +.args_type = "paging:b,protocol:s,detach:b?,begin:i?,end:i?,format:s?", +.params = "-p protocol [-d] [begin] [length] [format]", .help = "dump guest memory to file", .mhandler.cmd_new = qmp_marshal_dump_guest_memory, }, @@ -857,6 +857,8 @@ Arguments: - "paging": do paging to get guest's memory mapping (json-bool) - "protocol":
Re: [Qemu-devel] tcg: booting Windows on arm
Hi all, I am trying to boot Windows 8 (x86) on arm host using qemu dynamic translation. It is not successfull but Windows xp boots fine. Any suggest for this issue? On Wednesday, December 9, 2015,wrote: > Send Qemu-devel mailing list submissions to > qemu-devel@nongnu.org > > To subscribe or unsubscribe via the World Wide Web, visit > https://lists.nongnu.org/mailman/listinfo/qemu-devel > or, via email, send a message with subject or body 'help' to > qemu-devel-requ...@nongnu.org > > You can reach the person managing the list at > qemu-devel-ow...@nongnu.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Qemu-devel digest..." > > > Today's Topics: > >1. Re: [PATCH for-2.5] sparc: allow CASA with ASI 0xafrom user > space (Peter Maydell) >2. [Bug 1308341] Re: Multiple CPUs causes blue screen on Windows > guest (14.04 regression) (Cristian Aires) >3. Re: [PATCH] vfio: Align iova also to IOMMU page size > (Alex Williamson) >4. Re: tcg: improve MAX_CODE_GEN_BUFFER_SIZE for arm (TeLeMan) >5. [PATCH] xen_pt: fix failure of attaching & detaching aPCI > device to VM repeatedly (Jianzhong,Chang) >6. Re: [PATCH] virtio-blk: Drop x-data-plane option (Stefan Hajnoczi) >7. [PATCH v6 00/11] Add basic "detach" support for > dump-guest-memory (Peter Xu) >8. [PATCH v6 01/11] dump-guest-memory: cleanup: removing > dump_{error|cleanup}(). (Peter Xu) >9. [PATCH v6 02/11] dump-guest-memory: add "detach" flag for > QMP/HMP interfaces. (Peter Xu) > 10. [PATCH v6 03/11] dump-guest-memory: using static DumpState, > add DumpStatus (Peter Xu) > 11. [PATCH v6 04/11] dump-guest-memory: add dump_in_progress() > helper function (Peter Xu) > 12. [PATCH v6 05/11] dump-guest-memory: introduce dump_process() > helper function. (Peter Xu) > 13. [PATCH v6 06/11] dump-guest-memory: disable dump when in > INMIGRATE state (Peter Xu) > 14. [PATCH v6 07/11] dump-guest-memory: add "detach" support > (Peter Xu) > 15. [PATCH v6 08/11] DumpState: adding total_size and > written_size fields (Peter Xu) > 16. [PATCH v6 09/11] Dump: add qmp command "query-dump" (Peter Xu) > 17. [PATCH v6 10/11] Dump: add hmp command "info dump" (Peter Xu) > 18. [PATCH v6 11/11] dump-guest-memory: add qmp event > DUMP_COMPLETED (Peter Xu) > 19. Re: [Qemu-ppc] [PATCHv2 07/10] pseries: DEFINE_SPAPR_MACHINE > (Alexey Kardashevskiy) > 20. Re: [PATCHv2 01/10] pseries: Remove redundant setting of > mc->name for pseries-2.5 machine (Alexey Kardashevskiy) > > > -- > > Message: 1 > Date: Tue, 8 Dec 2015 21:28:49 + > From: Peter Maydell > To: Richard Henderson > Cc: Alex Zuepke , Mark Cave-Ayland > , QEMU Developers > , Fabien Chouteau > > Subject: Re: [Qemu-devel] [PATCH for-2.5] sparc: allow CASA with ASI > 0xa from user space > Message-ID: >