Re: [Qemu-devel] [PATCH v3 4/4] target-ppc: Handle ibm, nmi-register RTAS call
On Friday 14 November 2014 06:12 AM, David Gibson wrote: On Thu, Nov 13, 2014 at 08:06:55PM +0530, Aravinda Prasad wrote: On Thursday 13 November 2014 06:14 PM, David Gibson wrote: On Thu, Nov 13, 2014 at 05:18:16PM +0530, Aravinda Prasad wrote: On Thursday 13 November 2014 04:02 PM, David Gibson wrote: On Thu, Nov 13, 2014 at 11:28:30AM +0530, Aravinda Prasad wrote: [snip] Having to retry the hcall from here seems very awkward. This is a private hcall, so you can define it to do whatever retries are necessary internally (and I don't think your current implementation can fail anyway). Retrying is required in the cases when multi-processors experience machine check at or about the same time. As per PAPR, subsequent processors should serialize and wait for the first processor to issue the ibm,nmi-interlock call. The second processor retries if the first processor which received a machine check is still reading the error log and is yet to issue ibm,nmi-interlock call. Hmm.. ok. But I don't see any mechanism in the patches by which H_REPORT_MC_ERR will report failure if another CPU has an MC in progress. h_report_mc_err returns 0 if another VCPU is processing machine check and in that case we retry. h_report_mc_err returns error log address if no other VCPU is processing machine check. Uh.. how? I'm only seeing one return statement in the implementation in 3/4. This part is in 4/4 which handles ibm,nmi-interlock call in h_report_mc_err() +if (mc_in_progress == 1) { +return 0; +} Ah, right, missed the change to h_report_mc_err() in the later patch. Retrying cannot be done internally in h_report_mc_err hcall: only one thread can succeed entering qemu upon parallel hcall and hence retrying inside the hcall will not allow the ibm,nmi-interlock from first CPU to succeed. It's possible, but would require some fiddling inside the h_call to unlock and wait for the other CPUs to finish, so yes, it might be more trouble than it's worth. +mtsprg 2,4 Um.. doesn't this clobber the value of r3 you saved in SPRG2 just above. The r3 saved in SPRG2 is moved to rtas area in the private hcall and hence it is fine to clobber r3 here Ok, if you're going to do some magic register saving inside the HCALL, why not do the SRR[01] and CR restoration inside there as well. SRR0/1 is clobbered while returning from HCALL and hence cannot be restored in HCALL. For CR, we need to do the restoration here as we clobber CR after returning from HCALL (the instruction checking the return value of hcall clobbers CR). Hrm. AFAICT SRR0/1 shouldn't be clobbered when returning from an As hcall is an interrupt, SRR0 is set to nip and SRR1 to msr just before executing rfid. AFAICT the return path from the hypervisor - including for hcalls - uses HSSR0/1 and hrfid, so ordinary SRR0/SRR1 should be ok. I see SRR0 and SRR1 clobbered when the HCALL from guest returns. Previous discussions on this is in the link below: http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg01148.html Hrm. Well, I guess if it happened it happened, but Alex's explanation for why doesn't make sense to me. Did you execute cpu_synchronize_state() *before* attempting to set SRR0/1 in the hcall? Yes I did. Further I searched QEMU source code but could not find whether it is using rfid/hrfid. However, ISA for sc instruction mentions that SRR0 and SRR1 are modified. Well of course it isn't in the qemu source, the low-level return to guest is within the host kernel, specifically fast_guest_return in arch/powerpc/kvm/book3s_hv_rmhandlers.S which uses hrfid. If I'm reading the ISA correctly then yes, SRR0/1 are clobbered on entry, but that's on *entry* so can be overwritten by the hcall handler itself. Hmm.. ok. I need to take a look into it in detail. -- Regards, Aravinda
Re: [Qemu-devel] [PATCH] gtk: fix possible memory leak about local_err
On Fr, 2014-11-14 at 11:25 +0800, zhanghailiang wrote: local_err in gd_vc_gfx_init() is not freed, and we don't use it, so remove it. Added to gtk patch queue. thanks, Gerd
Re: [Qemu-devel] [PATCH] vmdk: Leave bdi intact if -ENOTSUP in vmdk_get_info
On 2014-11-14 at 05:09, Fam Zheng wrote: When extent types don't match, we return -ENOTSUP. In this case, be polite to the caller and don't modify bdi. Signed-off-by: Fam Zheng f...@redhat.com --- block/vmdk.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) Reviewed-by: Max Reitz mre...@redhat.com
Re: [Qemu-devel] iothread object hotplug ?
On 14/11/2014 08:25, Alexandre DERUMIER wrote: Hi, I would like to known if it's possible to hot-add|hot-plug an iothread object on a running guest ? Yes, there is an object_add/object-add command (respectively HMP and QMP), but I don't think libvirt has bindings already. Paolo
Re: [Qemu-devel] [PATCH] l2tpv3: fix possible double free
On 14/11/2014 02:39, zhanghailiang wrote: freeaddrinfo(result) does not assign result = NULL, after frees it. There will be a double free when it goes error case. It is reported by covertiy. Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com --- net/l2tpv3.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/l2tpv3.c b/net/l2tpv3.c index 528d95b..f9e0c98 100644 --- a/net/l2tpv3.c +++ b/net/l2tpv3.c @@ -661,6 +661,7 @@ int net_init_l2tpv3(const NetClientOptions *opts, fd = -errno; error_report(l2tpv3_open : socket creation failed, errno = %d, -fd); freeaddrinfo(result); +result = NULL; You can just remove the call to freeaddrinfo(). I made the change and applied the patch. Paolo goto outerr; } if (bind(fd, (struct sockaddr *) result-ai_addr, result-ai_addrlen)) {
[Qemu-devel] [Bug 1392504] Re: USB Passthrough is not working anymore
Under Ubuntu 14.04 the USB devices where attached to the virtual machines. After upgrading to Ubuntu 14.10 the USB devices are not visible in the virtual machines anymore. Same issue appeard after upgrading from Ubuntu 13.04 to 13.10, but I can not find that bug report anymore. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1392504 Title: USB Passthrough is not working anymore Status in QEMU: New Bug description: After upgrading from Ubuntu 14.04 to Ubuntu 14.10 USB passthrough in QEMU (version is now 2.1.0 - Debian2.1+dfsg-4ubuntu6.1) is not working any more. Already tried to remove and attach the USB devices. I use 1 USB2 HDD + 1 USB3 HDD to a virtual linux machine; 1 USB2 HDD to a virual FNAS machine and a USB 2camera to virtual Win7 machine. All these devices are not working any more. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1392504/+subscriptions
[Qemu-devel] [PATCH v2 1/3] chardev: Add -qmp-pretty
Add a command line option for adding a QMP monitor using pretty JSON formatting. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- qemu-options.hx | 8 vl.c| 15 ++- 2 files changed, 18 insertions(+), 5 deletions(-) diff --git a/qemu-options.hx b/qemu-options.hx index da9851d..bc7b4c2 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -2788,6 +2788,14 @@ STEXI @findex -qmp Like -monitor but opens in 'control' mode. ETEXI +DEF(qmp-pretty, HAS_ARG, QEMU_OPTION_qmp_pretty, \ +-qmp-pretty dev like -qmp but uses pretty JSON formatting\n, +QEMU_ARCH_ALL) +STEXI +@item -qmp-pretty @var{dev} +@findex -qmp-pretty +Like -qmp but uses pretty JSON formatting. +ETEXI DEF(mon, HAS_ARG, QEMU_OPTION_mon, \ -mon [chardev=]name[,mode=readline|control][,default]\n, QEMU_ARCH_ALL) diff --git a/vl.c b/vl.c index f4a6e5e..c7bebad 100644 --- a/vl.c +++ b/vl.c @@ -2284,7 +2284,7 @@ static int mon_init_func(QemuOpts *opts, void *opaque) return 0; } -static void monitor_parse(const char *optarg, const char *mode) +static void monitor_parse(const char *optarg, const char *mode, bool pretty) { static int monitor_device_index = 0; QemuOpts *opts; @@ -2314,6 +2314,7 @@ static void monitor_parse(const char *optarg, const char *mode) } qemu_opt_set(opts, mode, mode); qemu_opt_set(opts, chardev, label); +qemu_opt_set_bool(opts, pretty, pretty); if (def) qemu_opt_set(opts, default, on); monitor_device_index++; @@ -3292,11 +3293,15 @@ int main(int argc, char **argv, char **envp) case QEMU_OPTION_monitor: default_monitor = 0; if (strncmp(optarg, none, 4)) { -monitor_parse(optarg, readline); +monitor_parse(optarg, readline, false); } break; case QEMU_OPTION_qmp: -monitor_parse(optarg, control); +monitor_parse(optarg, control, false); +default_monitor = 0; +break; +case QEMU_OPTION_qmp_pretty: +monitor_parse(optarg, control, true); default_monitor = 0; break; case QEMU_OPTION_mon: @@ -3994,7 +3999,7 @@ int main(int argc, char **argv, char **envp) add_device_config(DEV_SCLP, stdio); } if (default_monitor) -monitor_parse(stdio, readline); +monitor_parse(stdio, readline, false); } } else { if (default_serial) @@ -4002,7 +4007,7 @@ int main(int argc, char **argv, char **envp) if (default_parallel) add_device_config(DEV_PARALLEL, vc:80Cx24C); if (default_monitor) -monitor_parse(vc:80Cx24C, readline); +monitor_parse(vc:80Cx24C, readline, false); if (default_virtcon) add_device_config(DEV_VIRTCON, vc:80Cx24C); if (default_sclp) { -- 1.9.3
[Qemu-devel] [PATCH v2 3/3] iotests: Use -qmp-pretty in 067
067 invokes query-block, resulting in a reference output with really long lines (which may pose a problem in email patches and always poses a problem when the output changes, because it is hard to see what has actually changed). Use -qmp-pretty to mitigate this issue. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- tests/qemu-iotests/067 | 2 +- tests/qemu-iotests/067.out | 779 + 2 files changed, 723 insertions(+), 58 deletions(-) diff --git a/tests/qemu-iotests/067 b/tests/qemu-iotests/067 index d025192..29cd6b5 100755 --- a/tests/qemu-iotests/067 +++ b/tests/qemu-iotests/067 @@ -39,7 +39,7 @@ _supported_os Linux function do_run_qemu() { echo Testing: $@ -$QEMU -nographic -qmp stdio -serial none $@ +$QEMU -nographic -qmp-pretty stdio -serial none $@ echo } diff --git a/tests/qemu-iotests/067.out b/tests/qemu-iotests/067.out index 0f72dcf..929dc74 100644 --- a/tests/qemu-iotests/067.out +++ b/tests/qemu-iotests/067.out @@ -4,77 +4,742 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 === -drive/-device and device_del === Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk -device virtio-blk-pci,drive=disk,id=virtio0 -QMP_VERSION -{return: {}} -{return: [{io-status: ok, device: disk, locked: false, removable: false, inserted: {iops_rd: 0, detect_zeroes: off, image: {virtual-size: 134217728, filename: TEST_DIR/t.qcow2, cluster-size: 65536, format: qcow2, actual-size: SIZE, format-specific: {type: qcow2, data: {compat: 1.1, lazy-refcounts: false, corrupt: false}}, dirty-flag: false}, iops_wr: 0, ro: false, backing_file_depth: 0, drv: qcow2, iops: 0, bps_wr: 0, encrypted: false, bps: 0, bps_rd: 0, file: TEST_DIR/t.qcow2, encryption_key_missing: false}, type: unknown}, {io-status: ok, device: ide1-cd0, locked: false, removable: true, tray_open: false, type: unknown}, {device: floppy0, locked: false, removable: true, tray_open: false, type: unknown}, {device: sd0, locked: false, removable: true, tray_open: false, type: unknown}]} -{return: {}} -{return: {}} -{timestamp: {seconds: TIMESTAMP, microseconds: TIMESTAMP}, event: DEVICE_DELETED, data: {path: /machine/peripheral/virtio0/virtio-backend}} -{timestamp: {seconds: TIMESTAMP, microseconds: TIMESTAMP}, event: DEVICE_DELETED, data: {device: virtio0, path: /machine/peripheral/virtio0}} -{timestamp: {seconds: TIMESTAMP, microseconds: TIMESTAMP}, event: RESET} -{return: [{io-status: ok, device: ide1-cd0, locked: false, removable: true, tray_open: false, type: unknown}, {device: floppy0, locked: false, removable: true, tray_open: false, type: unknown}, {device: sd0, locked: false, removable: true, tray_open: false, type: unknown}]} -{return: {}} -{timestamp: {seconds: TIMESTAMP, microseconds: TIMESTAMP}, event: SHUTDOWN} -{timestamp: {seconds: TIMESTAMP, microseconds: TIMESTAMP}, event: DEVICE_TRAY_MOVED, data: {device: ide1-cd0, tray-open: true}} -{timestamp: {seconds: TIMESTAMP, microseconds: TIMESTAMP}, event: DEVICE_TRAY_MOVED, data: {device: floppy0, tray-open: true}} +{ +QMP_VERSION +} +{ +return: { +} +} +{ +return: [ +{ +io-status: ok, +device: disk, +locked: false, +removable: false, +inserted: { +iops_rd: 0, +detect_zeroes: off, +image: { +virtual-size: 134217728, +filename: TEST_DIR/t.qcow2, +cluster-size: 65536, +format: qcow2, +actual-size: SIZE, +format-specific: { +type: qcow2, +data: { +compat: 1.1, +lazy-refcounts: false, +corrupt: false +} +}, +dirty-flag: false +}, +iops_wr: 0, +ro: false, +backing_file_depth: 0, +drv: qcow2, +iops: 0, +bps_wr: 0, +encrypted: false, +bps: 0, +bps_rd: 0, +file: TEST_DIR/t.qcow2, +encryption_key_missing: false +}, +type: unknown +}, +{ +io-status: ok, +device: ide1-cd0, +locked: false, +removable: true, +tray_open: false, +type: unknown +}, +{ +device: floppy0, +locked: false, +removable: true, +tray_open: false, +type: unknown +}, +{ +device: sd0, +locked: false, +removable: true, +tray_open: false, +type: unknown +} +] +} +{ +
[Qemu-devel] [PATCH v2 0/3] chardev: Add -qmp-pretty
This series does not add new functionality. Adding a QMP monitor with prettily formatted JSON output can be done as follows: $ qemu -chardev stdio,id=mon0 -mon chardev=mon0,mode=control,pretty=on However, this is rather cumbersome, so this series (its first patch) adds a shortcut in the form of the new command line option -qmp-pretty. Since the argument given to a monitor command line option (such as -qmp) is parsed depending on its prefix and probably also depending on the current phase of the moon, this is cleaner than trying to add a switch to -qmp itself (in the form of -qmp stdio,pretty=on). Patch 3 makes uses of the new option in qemu-iotest 067 to greatly increase maintainability of its reference output. Patch 2 extends the QMP filter for qemu-iotests so it is able to filter out the QMP version object in pretty mode. v2: - Patch 2: Replaced the multi-line QMP_VERSION replacement written in bash by a nice sed script [Eric] Max Reitz (3): chardev: Add -qmp-pretty iotests: _filter_qmp for pretty JSON output iotests: Use -qmp-pretty in 067 qemu-options.hx | 8 + tests/qemu-iotests/067 | 2 +- tests/qemu-iotests/067.out | 779 --- tests/qemu-iotests/common.filter | 5 +- vl.c | 15 +- 5 files changed, 745 insertions(+), 64 deletions(-) -- 1.9.3
[Qemu-devel] [PATCH v2 2/3] iotests: _filter_qmp for pretty JSON output
_filter_qmp should be able to correctly filter out the QMP version object for pretty JSON output. Signed-off-by: Max Reitz mre...@redhat.com --- tests/qemu-iotests/common.filter | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter index 3acdb30..63d15b3 100644 --- a/tests/qemu-iotests/common.filter +++ b/tests/qemu-iotests/common.filter @@ -165,9 +165,12 @@ _filter_qemu() # replace problematic QMP output like timestamps _filter_qmp() { +discard=0 + _filter_win32 | \ sed -e 's#\(\(micro\)\?seconds: \)[0-9]\+#\1 TIMESTAMP#g' \ --e 's#^{QMP:.*}$#QMP_VERSION#' +-e 's#^{QMP:.*}$#QMP_VERSION#' \ +-e '/^QMP: {\s*$/, /^}\s*$/ c\QMP_VERSION' } # replace driver-specific options in the Formatting... line -- 1.9.3
Re: [Qemu-devel] [Bug 1392504] Re: USB Passthrough is not working anymore
On 2014/11/14 16:47, Leen Keus wrote: Under Ubuntu 14.04 the USB devices where attached to the virtual machines. After upgrading to Ubuntu 14.10 the USB devices are not visible in the virtual machines anymore. Same issue appeard after upgrading from Ubuntu 13.04 to 13.10, but I can not find that bug report anymore. Please show your command line booting VM. Or you can get the latest Qemu master code to reproduce your problem. Maybe it is a resolved issue. :) Best regards, -Gonglei
Re: [Qemu-devel] [PATCH v2 2/3] iotests: _filter_qmp for pretty JSON output
On 2014-11-14 at 10:02, Max Reitz wrote: _filter_qmp should be able to correctly filter out the QMP version object for pretty JSON output. Signed-off-by: Max Reitz mre...@redhat.com --- tests/qemu-iotests/common.filter | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter index 3acdb30..63d15b3 100644 --- a/tests/qemu-iotests/common.filter +++ b/tests/qemu-iotests/common.filter @@ -165,9 +165,12 @@ _filter_qemu() # replace problematic QMP output like timestamps _filter_qmp() { +discard=0 + Urgh, self-NACK. We don't need that anymore. Max _filter_win32 | \ sed -e 's#\(\(micro\)\?seconds: \)[0-9]\+#\1 TIMESTAMP#g' \ --e 's#^{QMP:.*}$#QMP_VERSION#' +-e 's#^{QMP:.*}$#QMP_VERSION#' \ +-e '/^QMP: {\s*$/, /^}\s*$/ c\QMP_VERSION' } # replace driver-specific options in the Formatting... line
[Qemu-devel] [PATCH v3 1/3] chardev: Add -qmp-pretty
Add a command line option for adding a QMP monitor using pretty JSON formatting. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- qemu-options.hx | 8 vl.c| 15 ++- 2 files changed, 18 insertions(+), 5 deletions(-) diff --git a/qemu-options.hx b/qemu-options.hx index da9851d..bc7b4c2 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -2788,6 +2788,14 @@ STEXI @findex -qmp Like -monitor but opens in 'control' mode. ETEXI +DEF(qmp-pretty, HAS_ARG, QEMU_OPTION_qmp_pretty, \ +-qmp-pretty dev like -qmp but uses pretty JSON formatting\n, +QEMU_ARCH_ALL) +STEXI +@item -qmp-pretty @var{dev} +@findex -qmp-pretty +Like -qmp but uses pretty JSON formatting. +ETEXI DEF(mon, HAS_ARG, QEMU_OPTION_mon, \ -mon [chardev=]name[,mode=readline|control][,default]\n, QEMU_ARCH_ALL) diff --git a/vl.c b/vl.c index f4a6e5e..c7bebad 100644 --- a/vl.c +++ b/vl.c @@ -2284,7 +2284,7 @@ static int mon_init_func(QemuOpts *opts, void *opaque) return 0; } -static void monitor_parse(const char *optarg, const char *mode) +static void monitor_parse(const char *optarg, const char *mode, bool pretty) { static int monitor_device_index = 0; QemuOpts *opts; @@ -2314,6 +2314,7 @@ static void monitor_parse(const char *optarg, const char *mode) } qemu_opt_set(opts, mode, mode); qemu_opt_set(opts, chardev, label); +qemu_opt_set_bool(opts, pretty, pretty); if (def) qemu_opt_set(opts, default, on); monitor_device_index++; @@ -3292,11 +3293,15 @@ int main(int argc, char **argv, char **envp) case QEMU_OPTION_monitor: default_monitor = 0; if (strncmp(optarg, none, 4)) { -monitor_parse(optarg, readline); +monitor_parse(optarg, readline, false); } break; case QEMU_OPTION_qmp: -monitor_parse(optarg, control); +monitor_parse(optarg, control, false); +default_monitor = 0; +break; +case QEMU_OPTION_qmp_pretty: +monitor_parse(optarg, control, true); default_monitor = 0; break; case QEMU_OPTION_mon: @@ -3994,7 +3999,7 @@ int main(int argc, char **argv, char **envp) add_device_config(DEV_SCLP, stdio); } if (default_monitor) -monitor_parse(stdio, readline); +monitor_parse(stdio, readline, false); } } else { if (default_serial) @@ -4002,7 +4007,7 @@ int main(int argc, char **argv, char **envp) if (default_parallel) add_device_config(DEV_PARALLEL, vc:80Cx24C); if (default_monitor) -monitor_parse(vc:80Cx24C, readline); +monitor_parse(vc:80Cx24C, readline, false); if (default_virtcon) add_device_config(DEV_VIRTCON, vc:80Cx24C); if (default_sclp) { -- 1.9.3
[Qemu-devel] [PATCH v3 2/3] iotests: _filter_qmp for pretty JSON output
_filter_qmp should be able to correctly filter out the QMP version object for pretty JSON output. Signed-off-by: Max Reitz mre...@redhat.com --- tests/qemu-iotests/common.filter | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter index 3acdb30..93642f3 100644 --- a/tests/qemu-iotests/common.filter +++ b/tests/qemu-iotests/common.filter @@ -167,7 +167,8 @@ _filter_qmp() { _filter_win32 | \ sed -e 's#\(\(micro\)\?seconds: \)[0-9]\+#\1 TIMESTAMP#g' \ --e 's#^{QMP:.*}$#QMP_VERSION#' +-e 's#^{QMP:.*}$#QMP_VERSION#' \ +-e '/^QMP: {\s*$/, /^}\s*$/ c\QMP_VERSION' } # replace driver-specific options in the Formatting... line -- 1.9.3
[Qemu-devel] [PATCH v3 0/3] chardev: Add -qmp-pretty
This series does not add new functionality. Adding a QMP monitor with prettily formatted JSON output can be done as follows: $ qemu -chardev stdio,id=mon0 -mon chardev=mon0,mode=control,pretty=on However, this is rather cumbersome, so this series (its first patch) adds a shortcut in the form of the new command line option -qmp-pretty. Since the argument given to a monitor command line option (such as -qmp) is parsed depending on its prefix and probably also depending on the current phase of the moon, this is cleaner than trying to add a switch to -qmp itself (in the form of -qmp stdio,pretty=on). Patch 3 makes uses of the new option in qemu-iotest 067 to greatly increase maintainability of its reference output. Patch 2 extends the QMP filter for qemu-iotests so it is able to filter out the QMP version object in pretty mode. v3: - Patch 2: Cull useless discard=0 v2: - Patch 2: Replaced the multi-line QMP_VERSION replacement written in bash by a nice sed script [Eric] git-backport-diff against v1: Key: [] : patches are identical [] : number of functional differences between upstream/downstream patch [down] : patch is downstream-only The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively 001/3:[] [--] 'chardev: Add -qmp-pretty' 002/3:[0015] [FC] 'iotests: _filter_qmp for pretty JSON output' 003/3:[] [--] 'iotests: Use -qmp-pretty in 067' Max Reitz (3): chardev: Add -qmp-pretty iotests: _filter_qmp for pretty JSON output iotests: Use -qmp-pretty in 067 qemu-options.hx | 8 + tests/qemu-iotests/067 | 2 +- tests/qemu-iotests/067.out | 779 --- tests/qemu-iotests/common.filter | 3 +- vl.c | 15 +- 5 files changed, 743 insertions(+), 64 deletions(-) -- 1.9.3
[Qemu-devel] [PATCH v3 3/3] iotests: Use -qmp-pretty in 067
067 invokes query-block, resulting in a reference output with really long lines (which may pose a problem in email patches and always poses a problem when the output changes, because it is hard to see what has actually changed). Use -qmp-pretty to mitigate this issue. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- tests/qemu-iotests/067 | 2 +- tests/qemu-iotests/067.out | 779 + 2 files changed, 723 insertions(+), 58 deletions(-) diff --git a/tests/qemu-iotests/067 b/tests/qemu-iotests/067 index d025192..29cd6b5 100755 --- a/tests/qemu-iotests/067 +++ b/tests/qemu-iotests/067 @@ -39,7 +39,7 @@ _supported_os Linux function do_run_qemu() { echo Testing: $@ -$QEMU -nographic -qmp stdio -serial none $@ +$QEMU -nographic -qmp-pretty stdio -serial none $@ echo } diff --git a/tests/qemu-iotests/067.out b/tests/qemu-iotests/067.out index 0f72dcf..929dc74 100644 --- a/tests/qemu-iotests/067.out +++ b/tests/qemu-iotests/067.out @@ -4,77 +4,742 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 === -drive/-device and device_del === Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk -device virtio-blk-pci,drive=disk,id=virtio0 -QMP_VERSION -{return: {}} -{return: [{io-status: ok, device: disk, locked: false, removable: false, inserted: {iops_rd: 0, detect_zeroes: off, image: {virtual-size: 134217728, filename: TEST_DIR/t.qcow2, cluster-size: 65536, format: qcow2, actual-size: SIZE, format-specific: {type: qcow2, data: {compat: 1.1, lazy-refcounts: false, corrupt: false}}, dirty-flag: false}, iops_wr: 0, ro: false, backing_file_depth: 0, drv: qcow2, iops: 0, bps_wr: 0, encrypted: false, bps: 0, bps_rd: 0, file: TEST_DIR/t.qcow2, encryption_key_missing: false}, type: unknown}, {io-status: ok, device: ide1-cd0, locked: false, removable: true, tray_open: false, type: unknown}, {device: floppy0, locked: false, removable: true, tray_open: false, type: unknown}, {device: sd0, locked: false, removable: true, tray_open: false, type: unknown}]} -{return: {}} -{return: {}} -{timestamp: {seconds: TIMESTAMP, microseconds: TIMESTAMP}, event: DEVICE_DELETED, data: {path: /machine/peripheral/virtio0/virtio-backend}} -{timestamp: {seconds: TIMESTAMP, microseconds: TIMESTAMP}, event: DEVICE_DELETED, data: {device: virtio0, path: /machine/peripheral/virtio0}} -{timestamp: {seconds: TIMESTAMP, microseconds: TIMESTAMP}, event: RESET} -{return: [{io-status: ok, device: ide1-cd0, locked: false, removable: true, tray_open: false, type: unknown}, {device: floppy0, locked: false, removable: true, tray_open: false, type: unknown}, {device: sd0, locked: false, removable: true, tray_open: false, type: unknown}]} -{return: {}} -{timestamp: {seconds: TIMESTAMP, microseconds: TIMESTAMP}, event: SHUTDOWN} -{timestamp: {seconds: TIMESTAMP, microseconds: TIMESTAMP}, event: DEVICE_TRAY_MOVED, data: {device: ide1-cd0, tray-open: true}} -{timestamp: {seconds: TIMESTAMP, microseconds: TIMESTAMP}, event: DEVICE_TRAY_MOVED, data: {device: floppy0, tray-open: true}} +{ +QMP_VERSION +} +{ +return: { +} +} +{ +return: [ +{ +io-status: ok, +device: disk, +locked: false, +removable: false, +inserted: { +iops_rd: 0, +detect_zeroes: off, +image: { +virtual-size: 134217728, +filename: TEST_DIR/t.qcow2, +cluster-size: 65536, +format: qcow2, +actual-size: SIZE, +format-specific: { +type: qcow2, +data: { +compat: 1.1, +lazy-refcounts: false, +corrupt: false +} +}, +dirty-flag: false +}, +iops_wr: 0, +ro: false, +backing_file_depth: 0, +drv: qcow2, +iops: 0, +bps_wr: 0, +encrypted: false, +bps: 0, +bps_rd: 0, +file: TEST_DIR/t.qcow2, +encryption_key_missing: false +}, +type: unknown +}, +{ +io-status: ok, +device: ide1-cd0, +locked: false, +removable: true, +tray_open: false, +type: unknown +}, +{ +device: floppy0, +locked: false, +removable: true, +tray_open: false, +type: unknown +}, +{ +device: sd0, +locked: false, +removable: true, +tray_open: false, +type: unknown +} +] +} +{ +
Re: [Qemu-devel] iothread object hotplug ?
Yes, there is an object_add/object-add command (respectively HMP and QMP), but I don't think libvirt has bindings already. Thanks Paolo ! I'm currently implementing iothread on proxmox, so no libvirt. Alexandre. - Mail original - De: Paolo Bonzini pbonz...@redhat.com À: Alexandre DERUMIER aderum...@odiso.com, qemu-devel qemu-devel@nongnu.org Envoyé: Vendredi 14 Novembre 2014 09:37:02 Objet: Re: iothread object hotplug ? On 14/11/2014 08:25, Alexandre DERUMIER wrote: Hi, I would like to known if it's possible to hot-add|hot-plug an iothread object on a running guest ? Yes, there is an object_add/object-add command (respectively HMP and QMP), but I don't think libvirt has bindings already. Paolo
Re: [Qemu-devel] [PATCH] mips/gdbstub: Correct the handling of register #72 on writes
On 03/11/2014 18:47, Maciej W. Rozycki wrote: Fix an off-by-one error in `mips_cpu_gdb_write_register' for register #72 that is handled further down in that function rather than here, matching how `mips_cpu_gdb_read_register' handles it. This register slot is a fake anyway, there's nothing in hardware that corresponds to it. Signed-off-by: Maciej W. Rozycki ma...@codesourcery.com --- I have a further change down the queue to clean up `mips_cpu_gdb_read_register' and `mips_cpu_gdb_write_register' and make them more consistent with respect to each other as far as the handling of FP registers is concerned. For now please apply this obvious change. Thanks. Maciej qemu-mips-fpreg72.diff Index: qemu-git-trunk/target-mips/gdbstub.c === --- qemu-git-trunk.orig/target-mips/gdbstub.c 2013-07-29 11:23:07.048742983 +0100 +++ qemu-git-trunk/target-mips/gdbstub.c 2014-10-27 04:17:19.159003270 + @@ -97,7 +97,7 @@ int mips_cpu_gdb_write_register(CPUState return sizeof(target_ulong); } if (env-CP0_Config1 (1 CP0C1_FP) - n = 38 n 73) { + n = 38 n 72) { if (n 70) { if (env-CP0_Status (1 CP0St_FR)) { env-active_fpu.fpr[n - 38].d = tmp; Reviewed-by: Leon Alrae leon.al...@imgtec.com
Re: [Qemu-devel] [PATCH] mips/gdbstub: Make CP1.FIR read-only here too
On 03/11/2014 18:51, Maciej W. Rozycki wrote: CP1.FIR is read-only in hardware so gdbstub must respect it. We already respect it for CTC1 instructions, so do it here too. Signed-off-by: Maciej W. Rozycki ma...@codesourcery.com --- Not much to say about it here. Please apply. Maciej qemu-mips-fir.diff Index: qemu-git-trunk/target-mips/gdbstub.c === --- qemu-git-trunk.orig/target-mips/gdbstub.c 2014-11-02 17:51:33.458968203 + +++ qemu-git-trunk/target-mips/gdbstub.c 2014-11-02 17:51:35.958924223 + @@ -112,7 +112,7 @@ int mips_cpu_gdb_write_register(CPUState RESTORE_ROUNDING_MODE; break; case 71: -env-active_fpu.fcr0 = tmp; +/* FIR is read-only. Ignore writes. */ break; } return sizeof(target_ulong); Reviewed-by: Leon Alrae leon.al...@imgtec.com
Re: [Qemu-devel] [PATCH] vmdk: Leave bdi intact if -ENOTSUP in vmdk_get_info
On Fri, Nov 14, 2014 at 12:09:21PM +0800, Fam Zheng wrote: When extent types don't match, we return -ENOTSUP. In this case, be polite to the caller and don't modify bdi. Signed-off-by: Fam Zheng f...@redhat.com --- block/vmdk.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) At this stage of the release process it would be very helpful to include some context for this patch: is it a pure cleanup or actually a bug fix that needs to go into QEMU 2.2? Since you are a regular contributor and I therefore trust you, I have merged this patch. But please be clear in commit descriptions about whether or not a patch is a bug fix (how to reproduce the bug, if a previous commit's regression is being fixed, etc). Thanks, applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan pgpeDt8fK8gEK.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH v2] libcacard: fix resource leak
zhanghailiang zhang.zhanghaili...@huawei.com writes: In function connect_to_qemu(), getaddrinfo() will allocate memory that is stored into server, it should be freed by using freeaddrinfo() before connect_to_qemu() return. Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com --- v2: - fix typo in title ;) --- libcacard/vscclient.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/libcacard/vscclient.c b/libcacard/vscclient.c index 80111df..fa6041d 100644 --- a/libcacard/vscclient.c +++ b/libcacard/vscclient.c @@ -597,7 +597,7 @@ connect_to_qemu( const char *port ) { struct addrinfo hints; -struct addrinfo *server; +struct addrinfo *server = NULL; int ret, sock; sock = socket(AF_INET, SOCK_STREAM, 0); @@ -629,9 +629,14 @@ connect_to_qemu( if (verbose) { printf(Connected (sizeof Header=%zd)!\n, sizeof(VSCMsgHeader)); } + +freeaddrinfo(server); return sock; cleanup_socket: +if (server) { +freeaddrinfo(server); +} closesocket(sock); return -1; } Reviewed-by: Markus Armbruster arm...@redhat.com Aside: this code uses the first result from getaddrinfo(), and fails if it can't connect. This is a common misuse of getaddrinfo(). Consider a remote host with both an IPv4 and an IPv6 address, but only one of them actually connects. If getaddrinfo() puts the connectable one first, we succeed. Else, we fail. We should try the addresses in order until connect() succeeds, like qemu-sockets.c does.
Re: [Qemu-devel] iothread object hotplug ?
On Fri, Nov 14, 2014 at 09:37:02AM +0100, Paolo Bonzini wrote: On 14/11/2014 08:25, Alexandre DERUMIER wrote: Hi, I would like to known if it's possible to hot-add|hot-plug an iothread object on a running guest ? Yes, there is an object_add/object-add command (respectively HMP and QMP), but I don't think libvirt has bindings already. Right, libvirt does not support iothread hotplug at the moment. I should add that there hasn't been much testing of iothread hotplug in QEMU either yet, even though the object-add/object-del commands make it possible. Please let me know if you encounter any problems. Stefan pgpSLEdzgribF.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH v3 for-xen-4.5] xen_disk: fix unmapping of persistent grants
On Thu, 13 Nov 2014, Roger Pau Monne wrote: This patch fixes two issues with persistent grants and the disk PV backend (Qdisk): - Keep track of memory regions where persistent grants have been mapped since we need to unmap them as a whole. It is not possible to unmap a single grant if it has been batch-mapped. A new check has also been added to make sure persistent grants are only used if the whole mapped region can be persistently mapped in the batch_maps case. - Unmap persistent grants before switching to the closed state, so the frontend can also free them. Signed-off-by: Roger Pau Monné roger@citrix.com Reported-by: George Dunlap george.dun...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Kevin Wolf kw...@redhat.com Cc: Stefan Hajnoczi stefa...@redhat.com Cc: George Dunlap george.dun...@eu.citrix.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Acked-by: Stefano Stabellini stefano.stabell...@eu.citrix.com I'll send a pull request and backport to qemu-xen. Xen release exception: this is obviously an important bug-fix that should be included in 4.5. Without this fix, trying to do a block-detach of a Qdisk from a running guest can lead to guest crash depending on the blkfront implementation. --- Changea since v2: - Expand the commit message to mention the newly added check. - Don't use persistent_regions in the !batch_maps case, we can use the previous implementation which works fine in that case. Changes since v1: - Instead of disabling batch mappings when using persistent grants, keep track of the persistently mapped regions and unmap them on disconnection. This prevents unmapping single persistent grants, but the current implementation doesn't require it. This new version is slightly faster than the previous one, the following test results are in IOPS from 20 runs of a block-attach, fio run, block-detach test loop: x batch + no_batch +--+ | + + xx + + + x xx x | |+ + x x+ *+++ * +x* +++x* x xx x *| | |_A_M__|| ||A_M___| | +--+ N Min MaxMedian AvgStddev x 20 52941 63822 62396 61180.15 2673.6497 + 20 49967 63849 60322 59116.9 3456.3455 Difference at 95.0% confidence -2063.25 +/- 1977.66 -3.37242% +/- 3.23252% (Student's t, pooled s = 3089.88) --- hw/block/xen_disk.c | 72 - 1 file changed, 66 insertions(+), 6 deletions(-) diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index 231e9a7..21842a0 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -59,6 +59,13 @@ struct PersistentGrant { typedef struct PersistentGrant PersistentGrant; +struct PersistentRegion { +void *addr; +int num; +}; + +typedef struct PersistentRegion PersistentRegion; + struct ioreq { blkif_request_t req; int16_t status; @@ -118,6 +125,7 @@ struct XenBlkDev { gbooleanfeature_discard; gbooleanfeature_persistent; GTree *persistent_gnts; +GSList *persistent_regions; unsigned intpersistent_gnt_count; unsigned intmax_grants; @@ -177,6 +185,23 @@ static void destroy_grant(gpointer pgnt) g_free(grant); } +static void remove_persistent_region(gpointer data, gpointer dev) +{ +PersistentRegion *region = data; +struct XenBlkDev *blkdev = dev; +XenGnttab gnt = blkdev-xendev.gnttabdev; + +if (xc_gnttab_munmap(gnt, region-addr, region-num) != 0) { +xen_be_printf(blkdev-xendev, 0, + xc_gnttab_munmap region %p failed: %s\n, + region-addr, strerror(errno)); +} +xen_be_printf(blkdev-xendev, 3, + unmapped grant region %p with %d pages\n, + region-addr, region-num); +g_free(region); +} + static struct ioreq *ioreq_start(struct XenBlkDev *blkdev) { struct ioreq *ioreq = NULL; @@ -343,6 +368,7 @@ static int ioreq_map(struct ioreq *ioreq) void *page[BLKIF_MAX_SEGMENTS_PER_REQUEST]; int i, j, new_maps = 0; PersistentGrant *grant; +PersistentRegion *region; /* domids and refs variables will contain the information necessary * to map the grants that are needed to fulfill this request. * @@ -421,7 +447,22 @@ static int ioreq_map(struct ioreq *ioreq) } } } -if (ioreq-blkdev-feature_persistent) { +
[Qemu-devel] [PULL for-2.2 02/11] ide: repair PIO transfers for cases where nsector 1
From: John Snow js...@redhat.com Currently, for emulated PIO transfers through the AHCI device, any attempt made to request more than a single sector's worth of data will result in the same sector being transferred over and over. For example, if we request 8 sectors via PIO READ SECTORS, the AHCI device will give us the same sector eight times. This patch adds offset tracking into the PIO pathways so that we can fulfill these requests appropriately. Signed-off-by: John Snow js...@redhat.com Reviewed-by: Paolo Bonzini pbonz...@redhat.com Message-id: 1414785819-26209-2-git-send-email-js...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- hw/ide/ahci.c | 2 +- hw/ide/core.c | 4 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 1f3f951..dbd6773 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -1093,7 +1093,7 @@ static void ahci_start_transfer(IDEDMA *dma) goto out; } -if (!ahci_populate_sglist(ad, s-sg, 0)) { +if (!ahci_populate_sglist(ad, s-sg, s-io_buffer_offset)) { has_sglist = 1; } diff --git a/hw/ide/core.c b/hw/ide/core.c index d316ccf..dab21f0 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -592,6 +592,7 @@ static void ide_sector_read_cb(void *opaque, int ret) ide_set_sector(s, ide_get_sector(s) + n); s-nsector -= n; +s-io_buffer_offset += 512 * n; } void ide_sector_read(IDEState *s) @@ -832,6 +833,8 @@ static void ide_sector_write_cb(void *opaque, int ret) n = s-req_nb_sectors; } s-nsector -= n; +s-io_buffer_offset += 512 * n; + if (s-nsector == 0) { /* no more sectors to write */ ide_transfer_stop(s); @@ -1824,6 +1827,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val) s-status = READY_STAT | BUSY_STAT; s-error = 0; +s-io_buffer_offset = 0; complete = ide_cmd_table[val].handler(s, val); if (complete) { -- 2.1.0
[Qemu-devel] [PULL for-2.2 04/11] ide: Correct handling of malformed/short PRDTs
From: John Snow js...@redhat.com This impacts both BMDMA and AHCI HBA interfaces for IDE. Currently, we confuse the difference between a PRDT having 0 bytes and a PRDT having 0 complete sectors. When we receive an incomplete sector, inconsistent error checking leads to an infinite loop wherein the call succeeds, but it didn't give us enough bytes -- leading us to re-call the DMA chain over and over again. This leads to, in the BMDMA case, leaked memory for short PRDTs, and infinite loops and resource usage in the AHCI case. The .prepare_buf() callback is reworked to return the number of bytes that it successfully prepared. 0 is a valid, non-error answer that means the table was empty and described no bytes. -1 indicates an error. Our current implementation uses the io_buffer in IDEState to ultimately describe the size of a prepared scatter-gather list. Even though the AHCI PRDT/SGList can be as large as 256GiB, the AHCI command header limits transactions to just 4GiB. ATA8-ACS3, however, defines the largest transaction to be an LBA48 command that transfers 65,536 sectors. With a 512 byte sector size, this is just 32MiB. Since our current state structures use the int type to describe the size of the buffer, and this state is migrated as int32, we are limited to describing 2GiB buffer sizes unless we change the migration protocol. For this reason, this patch begins to unify the assertions in the IDE pathways that the scatter-gather list provided by either the AHCI PRDT or the PCI BMDMA PRDs can only describe, at a maximum, 2GiB. This should be resilient enough unless we need a sector size that exceeds 32KiB. Further, the likelihood of any guest operating system actually attempting to transfer this much data in a single operation is very slim. To this end, the IDEState variables have been updated to more explicitly clarify our maximum supported size. Callers to the prepare_buf callback have been reworked to understand the new return code, and all versions of the prepare_buf callback have been adjusted accordingly. Lastly, the ahci_populate_sglist helper, relied upon by the AHCI implementation of .prepare_buf() as well as the PCI implementation of the callback have had overflow assertions added to help make clear the reasonings behind the various type changes. [Added %d - %PRId64 fix John sent because off_pos changed from int to int64_t. --Stefan] Signed-off-by: John Snow js...@redhat.com Reviewed-by: Paolo Bonzini pbonz...@redhat.com Message-id: 1414785819-26209-4-git-send-email-js...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- hw/ide/ahci.c | 33 ++--- hw/ide/core.c | 10 -- hw/ide/internal.h | 13 +++-- hw/ide/macio.c| 7 ++- hw/ide/pci.c | 27 +-- 5 files changed, 68 insertions(+), 22 deletions(-) diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 28aa105..9647d94 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -730,7 +730,8 @@ static int prdt_tbl_entry_size(const AHCI_SG *tbl) return (le32_to_cpu(tbl-flags_size) AHCI_PRDT_SIZE_MASK) + 1; } -static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset) +static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, +int32_t offset) { AHCICmdHdr *cmd = ad-cur_cmd; uint32_t opts = le32_to_cpu(cmd-opts); @@ -741,13 +742,21 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset) uint8_t *prdt; int i; int r = 0; -int sum = 0; +uint64_t sum = 0; int off_idx = -1; -int off_pos = -1; +int64_t off_pos = -1; int tbl_entry_size; IDEBus *bus = ad-port; BusState *qbus = BUS(bus); +/* + * Note: AHCI PRDT can describe up to 256GiB. SATA/ATA only support + * transactions of up to 32MiB as of ATA8-ACS3 rev 1b, assuming a + * 512 byte sector size. We limit the PRDT in this implementation to + * a reasonably large 2GiB, which can accommodate the maximum transfer + * request for sector sizes up to 32K. + */ + if (!sglist_alloc_hint) { DPRINTF(ad-port_no, no sg list given by guest: 0x%08x\n, opts); return -1; @@ -782,7 +791,7 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset) } if ((off_idx == -1) || (off_pos 0) || (off_pos tbl_entry_size)) { DPRINTF(ad-port_no, %s: Incorrect offset! -off_idx: %d, off_pos: %d\n, +off_idx: %d, off_pos: %PRId64\n, __func__, off_idx, off_pos); r = -1; goto out; @@ -797,6 +806,13 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset) /* flags_size is zero-based */ qemu_sglist_add(sglist, le64_to_cpu(tbl[i].addr), prdt_tbl_entry_size(tbl[i])); +if
[Qemu-devel] [PULL for-2.2 01/11] ahci: Fix byte count regression for ATAPI/PIO
From: John Snow js...@redhat.com This patch fixes a regression caused by commit 659142ecf71a0da240ab0ff7cf929ee25c32b9bc. The problem occurs when we wish to return early from the ahci_start_transfer function, but are now updating the transferred byte count in the AHCI command header via ahci_commit_buf. This will cause problems in the Windows 8 installer. Don't update the byte count in the command header for the transmission of ATAPI packets: These commands will distort the final byte count of the actual data payload. The call to ahci_commit_buf remains in the out portion of the call in order to clean up the sglist. The byte count is maintained by forcing size to be 0. Signed-off-by: John Snow js...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- hw/ide/ahci.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 61dbed1..1f3f951 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -1089,6 +1089,7 @@ static void ahci_start_transfer(IDEDMA *dma) if (is_atapi !ad-done_atapi_packet) { /* already prepopulated iobuffer */ ad-done_atapi_packet = true; +size = 0; goto out; } -- 2.1.0
[Qemu-devel] [PULL for-2.2 00/11] Block patches
The following changes since commit c52e67924fbdadfa00668248f5c075542943c54c: Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2014-11-13 15:44:16 +) are available in the git repository at: git://github.com/stefanha/qemu.git tags/block-pull-request for you to fetch changes up to 5f58330790b72c4705b373ee0646fb3adf800b4e: vmdk: Leave bdi intact if -ENOTSUP in vmdk_get_info (2014-11-14 09:20:45 +) Fam Zheng (2): block: Fix max nb_sectors in bdrv_make_zero vmdk: Leave bdi intact if -ENOTSUP in vmdk_get_info John Snow (9): ahci: Fix byte count regression for ATAPI/PIO ide: repair PIO transfers for cases where nsector 1 ahci: unify sglist preparation ide: Correct handling of malformed/short PRDTs ahci: add is_ncq predicate helper ahci: Fix FIS decomposition ahci: Reorder error cases in handle_cmd ahci: Check cmd_fis[1] more explicitly ahci: factor out FIS decomposition from handle_cmd block.c | 4 +- block/vmdk.c | 20 ++-- hw/ide/ahci.c | 269 ++ hw/ide/ahci.h | 3 + hw/ide/core.c | 14 ++- hw/ide/internal.h | 13 +-- hw/ide/macio.c| 7 +- hw/ide/pci.c | 27 -- 8 files changed, 214 insertions(+), 143 deletions(-) -- 2.1.0
[Qemu-devel] [PULL for-2.2 08/11] ahci: Check cmd_fis[1] more explicitly
From: John Snow js...@redhat.com Instead of checking for a known byte, inspect the fields of this byte explicitly to produce more meaningful error messages and improve the readability of this section. Signed-off-by: John Snow js...@redhat.com Reviewed-by: Paolo Bonzini pbonz...@redhat.com Message-id: 1415058979-16604-5-git-send-email-js...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- hw/ide/ahci.c | 23 --- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 578a93b..d6b012c 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -1004,17 +1004,18 @@ static int handle_cmd(AHCIState *s, int port, int slot) break; } -switch (cmd_fis[1]) { -case SATA_FIS_REG_H2D_UPDATE_COMMAND_REGISTER: -break; -case 0: -break; -default: -DPRINTF(port, unknown command cmd_fis[0]=%02x cmd_fis[1]=%02x - cmd_fis[2]=%02x\n, cmd_fis[0], cmd_fis[1], - cmd_fis[2]); -goto out; -break; +if (cmd_fis[1] 0x0F) { +DPRINTF(port, Port Multiplier not supported. + cmd_fis[0]=%02x cmd_fis[1]=%02x cmd_fis[2]=%02x\n, +cmd_fis[0], cmd_fis[1], cmd_fis[2]); +goto out; +} + +if (cmd_fis[1] 0x70) { +DPRINTF(port, Reserved flags set in H2D Register FIS. + cmd_fis[0]=%02x cmd_fis[1]=%02x cmd_fis[2]=%02x\n, +cmd_fis[0], cmd_fis[1], cmd_fis[2]); +goto out; } if (!(cmd_fis[1] SATA_FIS_REG_H2D_UPDATE_COMMAND_REGISTER)) { -- 2.1.0
[Qemu-devel] [PULL for-2.2 03/11] ahci: unify sglist preparation
From: John Snow js...@redhat.com The intent of this patch is to further unify the creation and deletion of the sglist used for all AHCI transfers, including emulated PIO, ATAPI R/W, and native DMA R/W. By replacing ahci_start_transfer's call to ahci_populate_sglist with ahci_dma_prepare_buf, we reduce the number of direct calls where we manipulate the scatter-gather list in the AHCI code. To make this switch, the constant 0 passed as an offset in ahci_dma_prepare_buf is adjusted to use io_buffer_offset. For DMA pathways, this has no effect: io_buffer_offset is always updated to 0 at the beginning of a DMA transfer loop regardless. DMA pathways through ide_dma_cb() update the io_buffer_offset accordingly, and for circumstances where we might make several trips through this loop, this may actually correct a design flaw. For PIO pathways, the newly updated ahci_dma_prepare_buf will now prepare the sglist at the correct offset. It will also set io_buffer_size, but this is not used in the cmd_read_pio or cmd_write_pio pathways. Signed-off-by: John Snow js...@redhat.com Reviewed-by: Paolo Bonzini pbonz...@redhat.com Message-id: 1414785819-26209-3-git-send-email-js...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- hw/ide/ahci.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index dbd6773..28aa105 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -1093,7 +1093,7 @@ static void ahci_start_transfer(IDEDMA *dma) goto out; } -if (!ahci_populate_sglist(ad, s-sg, s-io_buffer_offset)) { +if (ahci_dma_prepare_buf(dma, is_write)) { has_sglist = 1; } @@ -1145,7 +1145,7 @@ static int ahci_dma_prepare_buf(IDEDMA *dma, int is_write) AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma); IDEState *s = ad-port.ifs[0]; -ahci_populate_sglist(ad, s-sg, 0); +ahci_populate_sglist(ad, s-sg, s-io_buffer_offset); s-io_buffer_size = s-sg.size; DPRINTF(ad-port_no, len=%#x\n, s-io_buffer_size); -- 2.1.0
[Qemu-devel] [PULL for-2.2 06/11] ahci: Fix FIS decomposition
From: John Snow js...@redhat.com This patch introduces a few changes to how FIS packets are deciphered in the AHCI virtual device. The summary of changes can be grouped into two pieces: [A] Changes to how we apply a preliminary sieve to FISes, [B] Changes in how we internalize a decomposed FIS. == Changes to how we apply a preliminary sieve to FISes == (1) Packets may now either update the Control register or the Command register, but not both. This is according to the SATA 3.2 specification which states: ...the device either initiates processing of the command indicated in the Command register or initiates processing of the control request indicated [...] depending on the state of the C bit in the FIS. See SATA 3.2 section 10.5.5.4, Reception in the 10.5.5 Register Host to Device FIS section. This change accounts for the first two regions of change within the diff. All other changes belong to the following changes. == Changes in how we internalize a decomposed FIS == (2) Instead of trying to extract the sector number out of the FIS from bytes 4-10 and setting it with ide_set_sector, we set the appropriate IDEState registers and trust that ide_get_sector can retrieve the correct sector later. By constructing the sector for use with ide_set_sector, we are duplicating the mechanisms of ide_get_sector. This change makes the FIS decomposition more obvious. SATA 3.2 as a specification does not make the legacy register mapping with respect to the D2H FIS obvious. However, SATA 3.2 section 10.5.5.1 Register Host to Device FIS layout describes all of the cmd_fis bytes: 0 - FIS Type (0x27) 1 - Port Multiplier Port and Command Update flag 2 - ATA Command 3 - Features_Low 4 - LBA 7:0 5 - LBA 15:8 6 - LBA 23:16 7 - Device, AKA Drive Select. 8 - LBA 31:24 9 - LBA 39:32 10 - LBA 47:40 11 - Features_High 12 - Count Low 13 - Count High 14 - ICC 15 - Control 16-19 - Auxiliary (for NCQ, defined per-command) Most of these registers map to existing IDEState registers in obvious ways, especially features, select, hob_features, and nsector (count). ICC is reserved in older specifications but is not supported in our implementation, and remains unused here. The Control register is not valid for a command that is trying to update the command register and is to be considered reserved at this point. What is not obvious is the LBA register mappings, but SATA 1.0 can help inform of us legacy device support, see SATA 1.0 section 8.5.2 Register - Host to Device. LBA 7:0 - Sector Number(sector) LBA 15:8 - Cyl Low (lcyl) LBA 23:16 - Cyl High (hcyl) LBA 31:24 - Sector Num Exp. (hob_sector) LBA 39:32 - Cyl Low Exp. (hob_lcyl) LBA 47:40 - Cyl High Exp.(hob_hcyl) These mappings help guide which registers the FIS should be decomposed into/towards for CHS, LBA28 and LBA48 commands. As a note: The prior confusion that can be seen in the documentation arises from the fact that CHS and LBA28 commands use the low nybble of the drive select register to store LBA 27:24, whereas LNA48 commands use the hob_sector, hob_lcyl and hob_hcyl registers as explained above. The decomposition as it stands now will correctly decompose CHS, LBA28 and LBA48 commands into their appropriate registers where the core IDE/ATAPI layers can deal with them correctly. See the below point for more information. (3) We save cmd_fis[7] as ide_state-select, which informs decisions about if we are using LBA or CHS. This corrects a bug in AHCI wherein we attempt to set and/or retrieve the sector number by using ide_set_sector and ide_get_sector, which depend on the select register to determine if we are using LBA or CHS. Without this adjustment, LBA48 read/writes are currently broken. Thanks to Eniac Zheng @ HP for pointing this out. (4) Save cmd_fis[11] as ide_state-hob_feature, as defined in SATA 3.2. (5) For several ATA commands, the sector count register set to 0 is a magic number that means 256 sectors. For LBA48 commands, this means 65,536 sectors. We drop the magic sector correction here, and trust the ide core layer to handle the conversion appropriately, in ide_cmd_lba48_transform(). As it stands, the current AHCI code is only compliant with LBA28 commands. By simply removing the magic, it will work with LBA28 and LBA48. (6) We expand FIS decomposition to include both ATAPI and IDE devices. We leave the logic of determining if the fields are valid or not to the respective layers. This change intends to make it clearer that AHCI is only a composition mechanism for the FIS packets: the meanings of the registers is best left to the implementation layers for those devices. (7) Forcefully
[Qemu-devel] [PULL for-2.2 05/11] ahci: add is_ncq predicate helper
From: John Snow js...@redhat.com A small helper to determine which S/ATA commands are destined to be routed to the NCQ pathways. This references SATA 3.2 section 13.6, Native Command Queueing. See sections 13.6.4, 13.6.5, 13.6.6, 13.6.7 and 13.6.8 for all SATA commands considered to be part of the NCQ feature set. This is summarized in a small list in section 13.6.3.1 and again in 13.6.3.2. Not all of these NCQ commands are currently supported, so the error pathways are adjusted slightly to be more informative in the case they are encountered. Signed-off-by: John Snow js...@redhat.com Reviewed-by: Paolo Bonzini pbonz...@redhat.com Message-id: 1415058979-16604-2-git-send-email-js...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- hw/ide/ahci.c | 28 hw/ide/ahci.h | 3 +++ 2 files changed, 27 insertions(+), 4 deletions(-) diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 9647d94..f2acddb 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -854,6 +854,21 @@ static void ncq_cb(void *opaque, int ret) ncq_tfs-used = 0; } +static int is_ncq(uint8_t ata_cmd) +{ +/* Based on SATA 3.2 section 13.6.3.2 */ +switch (ata_cmd) { +case READ_FPDMA_QUEUED: +case WRITE_FPDMA_QUEUED: +case NCQ_NON_DATA: +case RECEIVE_FPDMA_QUEUED: +case SEND_FPDMA_QUEUED: +return 1; +default: +return 0; +} +} + static void process_ncq_command(AHCIState *s, int port, uint8_t *cmd_fis, int slot) { @@ -919,9 +934,15 @@ static void process_ncq_command(AHCIState *s, int port, uint8_t *cmd_fis, ncq_cb, ncq_tfs); break; default: -DPRINTF(port, error: tried to process non-NCQ command as NCQ\n); +if (is_ncq(cmd_fis[2])) { +DPRINTF(port, +error: unsupported NCQ command (0x%02x) received\n, +cmd_fis[2]); +} else { +DPRINTF(port, +error: tried to process non-NCQ command as NCQ\n); +} qemu_sglist_destroy(ncq_tfs-sglist); -break; } } @@ -1013,8 +1034,7 @@ static int handle_cmd(AHCIState *s, int port, int slot) if (cmd_fis[1] == SATA_FIS_REG_H2D_UPDATE_COMMAND_REGISTER) { /* Check for NCQ command */ -if ((cmd_fis[2] == READ_FPDMA_QUEUED) || -(cmd_fis[2] == WRITE_FPDMA_QUEUED)) { +if (is_ncq(cmd_fis[2])) { process_ncq_command(s, port, cmd_fis, slot); goto out; } diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h index b123237..e0d2eb8 100644 --- a/hw/ide/ahci.h +++ b/hw/ide/ahci.h @@ -186,6 +186,9 @@ #define READ_FPDMA_QUEUED 0x60 #define WRITE_FPDMA_QUEUED 0x61 +#define NCQ_NON_DATA 0x63 +#define RECEIVE_FPDMA_QUEUED 0x65 +#define SEND_FPDMA_QUEUED 0x64 #define RES_FIS_DSFIS 0x00 #define RES_FIS_PSFIS 0x20 -- 2.1.0
[Qemu-devel] [PULL for-2.2 07/11] ahci: Reorder error cases in handle_cmd
From: John Snow js...@redhat.com Error checking in ahci's handle_cmd is re-ordered so that we initialize as few things as possible before we've done our sanity checking. This simplifies returning from this call in case of an error. A check to make sure the DMA memory map succeeds with the correct size is also added, and the debug print of the command fis is cleaned up with its size corrected. Signed-off-by: John Snow js...@redhat.com Reviewed-by: Paolo Bonzini pbonz...@redhat.com Message-id: 1415058979-16604-4-git-send-email-js...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- hw/ide/ahci.c | 29 ++--- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 43da363..578a93b 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -961,38 +961,37 @@ static int handle_cmd(AHCIState *s, int port, int slot) return -1; } -cmd = ((AHCICmdHdr *)s-dev[port].lst)[slot]; - if (!s-dev[port].lst) { DPRINTF(port, error: lst not given but cmd handled); return -1; } - +cmd = ((AHCICmdHdr *)s-dev[port].lst)[slot]; /* remember current slot handle for later */ s-dev[port].cur_cmd = cmd; +/* The device we are working for */ +ide_state = s-dev[port].port.ifs[0]; +if (!ide_state-blk) { +DPRINTF(port, error: guest accessed unused port); +return -1; +} + opts = le32_to_cpu(cmd-opts); tbl_addr = le64_to_cpu(cmd-tbl_addr); - cmd_len = 0x80; cmd_fis = dma_memory_map(s-as, tbl_addr, cmd_len, DMA_DIRECTION_FROM_DEVICE); - if (!cmd_fis) { DPRINTF(port, error: guest passed us an invalid cmd fis\n); return -1; -} - -/* The device we are working for */ -ide_state = s-dev[port].port.ifs[0]; - -if (!ide_state-blk) { -DPRINTF(port, error: guest accessed unused port); +} else if (cmd_len != 0x80) { +ahci_trigger_irq(s, s-dev[port], PORT_IRQ_HBUS_ERR); +DPRINTF(port, error: dma_memory_map failed: +(len(%02PRIx64) != 0x80)\n, +cmd_len); goto out; } - -debug_print_fis(cmd_fis, 0x90); -//debug_print_fis(cmd_fis, (opts AHCI_CMD_HDR_CMD_FIS_LEN) * 4); +debug_print_fis(cmd_fis, 0x80); switch (cmd_fis[0]) { case SATA_FIS_TYPE_REGISTER_H2D: -- 2.1.0
[Qemu-devel] [PULL for-2.2 09/11] ahci: factor out FIS decomposition from handle_cmd
From: John Snow js...@redhat.com In order to make handle_cmd more readable at the macro level, the details of how to decompose particular types of FIS packets are left to helper functions. In our case, the only type of FIS packet we currently expect to see is a Register H2D FIS packet, but the gory details of its decomposition are of no particular interest in handle_cmd. This patch keeps the receipt of FIS packets and the decomposition thereof separated to two different functions. Signed-off-by: John Snow js...@redhat.com Reviewed-by: Paolo Bonzini pbonz...@redhat.com Message-id: 1415058979-16604-6-git-send-email-js...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- hw/ide/ahci.c | 169 ++ 1 file changed, 86 insertions(+), 83 deletions(-) diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index d6b012c..94f28e6 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -946,10 +946,94 @@ static void process_ncq_command(AHCIState *s, int port, uint8_t *cmd_fis, } } +static void handle_reg_h2d_fis(AHCIState *s, int port, + int slot, uint8_t *cmd_fis) +{ +IDEState *ide_state = s-dev[port].port.ifs[0]; +AHCICmdHdr *cmd = s-dev[port].cur_cmd; +uint32_t opts = le32_to_cpu(cmd-opts); + +if (cmd_fis[1] 0x0F) { +DPRINTF(port, Port Multiplier not supported. + cmd_fis[0]=%02x cmd_fis[1]=%02x cmd_fis[2]=%02x\n, +cmd_fis[0], cmd_fis[1], cmd_fis[2]); +return; +} + +if (cmd_fis[1] 0x70) { +DPRINTF(port, Reserved flags set in H2D Register FIS. + cmd_fis[0]=%02x cmd_fis[1]=%02x cmd_fis[2]=%02x\n, +cmd_fis[0], cmd_fis[1], cmd_fis[2]); +return; +} + +if (!(cmd_fis[1] SATA_FIS_REG_H2D_UPDATE_COMMAND_REGISTER)) { +switch (s-dev[port].port_state) { +case STATE_RUN: +if (cmd_fis[15] ATA_SRST) { +s-dev[port].port_state = STATE_RESET; +} +break; +case STATE_RESET: +if (!(cmd_fis[15] ATA_SRST)) { +ahci_reset_port(s, port); +} +break; +} +return; +} + +/* Check for NCQ command */ +if (is_ncq(cmd_fis[2])) { +process_ncq_command(s, port, cmd_fis, slot); +return; +} + +/* Decompose the FIS: + * AHCI does not interpret FIS packets, it only forwards them. + * SATA 1.0 describes how to decode LBA28 and CHS FIS packets. + * Later specifications, e.g, SATA 3.2, describe LBA48 FIS packets. + * + * ATA4 describes sector number for LBA28/CHS commands. + * ATA6 describes sector number for LBA48 commands. + * ATA8 deprecates CHS fully, describing only LBA28/48. + * + * We dutifully convert the FIS into IDE registers, and allow the + * core layer to interpret them as needed. */ +ide_state-feature = cmd_fis[3]; +ide_state-sector = cmd_fis[4]; /* LBA 7:0 */ +ide_state-lcyl = cmd_fis[5];/* LBA 15:8 */ +ide_state-hcyl = cmd_fis[6];/* LBA 23:16 */ +ide_state-select = cmd_fis[7]; /* LBA 27:24 (LBA28) */ +ide_state-hob_sector = cmd_fis[8]; /* LBA 31:24 */ +ide_state-hob_lcyl = cmd_fis[9];/* LBA 39:32 */ +ide_state-hob_hcyl = cmd_fis[10]; /* LBA 47:40 */ +ide_state-hob_feature = cmd_fis[11]; +ide_state-nsector = (int64_t)((cmd_fis[13] 8) | cmd_fis[12]); +/* 14, 16, 17, 18, 19: Reserved (SATA 1.0) */ +/* 15: Only valid when UPDATE_COMMAND not set. */ + +/* Copy the ACMD field (ATAPI packet, if any) from the AHCI command + * table to ide_state-io_buffer */ +if (opts AHCI_CMD_ATAPI) { +memcpy(ide_state-io_buffer, cmd_fis[AHCI_COMMAND_TABLE_ACMD], 0x10); +debug_print_fis(ide_state-io_buffer, 0x10); +s-dev[port].done_atapi_packet = false; +/* XXX send PIO setup FIS */ +} + +ide_state-error = 0; + +/* Reset transferred byte counter */ +cmd-status = 0; + +/* We're ready to process the command in FIS byte 2. */ +ide_exec_cmd(s-dev[port].port, cmd_fis[2]); +} + static int handle_cmd(AHCIState *s, int port, int slot) { IDEState *ide_state; -uint32_t opts; uint64_t tbl_addr; AHCICmdHdr *cmd; uint8_t *cmd_fis; @@ -976,7 +1060,6 @@ static int handle_cmd(AHCIState *s, int port, int slot) return -1; } -opts = le32_to_cpu(cmd-opts); tbl_addr = le64_to_cpu(cmd-tbl_addr); cmd_len = 0x80; cmd_fis = dma_memory_map(s-as, tbl_addr, cmd_len, @@ -995,93 +1078,13 @@ static int handle_cmd(AHCIState *s, int port, int slot) switch (cmd_fis[0]) { case SATA_FIS_TYPE_REGISTER_H2D: +handle_reg_h2d_fis(s, port, slot, cmd_fis); break; default: DPRINTF(port, unknown command cmd_fis[0]=%02x cmd_fis[1]=%02x cmd_fis[2]=%02x\n, cmd_fis[0],
[Qemu-devel] [PULL for-2.2 10/11] block: Fix max nb_sectors in bdrv_make_zero
From: Fam Zheng f...@redhat.com In bdrv_rw_co we report -EINVAL for nb_sectors INT_MAX / BDRV_SECTOR_SIZE, so a caller shouldn't exceed it. Signed-off-by: Fam Zheng f...@redhat.com Reviewed-by: Markus Armbruster arm...@redhat.com Message-id: 1415603264-21497-1-git-send-email-f...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block.c b/block.c index c612826..a612594 100644 --- a/block.c +++ b/block.c @@ -2790,8 +2790,8 @@ int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags flags) if (nb_sectors = 0) { return 0; } -if (nb_sectors INT_MAX) { -nb_sectors = INT_MAX; +if (nb_sectors INT_MAX / BDRV_SECTOR_SIZE) { +nb_sectors = INT_MAX / BDRV_SECTOR_SIZE; } ret = bdrv_get_block_status(bs, sector_num, nb_sectors, n); if (ret 0) { -- 2.1.0
[Qemu-devel] [PULL for-2.2 11/11] vmdk: Leave bdi intact if -ENOTSUP in vmdk_get_info
From: Fam Zheng f...@redhat.com When extent types don't match, we return -ENOTSUP. In this case, be polite to the caller and don't modify bdi. Signed-off-by: Fam Zheng f...@redhat.com Reviewed-by: Max Reitz mre...@redhat.com Message-id: 1415938161-16217-1-git-send-email-f...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/vmdk.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/block/vmdk.c b/block/vmdk.c index 673d3f5..2cbfd3e 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -2137,23 +2137,29 @@ static ImageInfoSpecific *vmdk_get_specific_info(BlockDriverState *bs) return spec_info; } +static bool vmdk_extents_type_eq(const VmdkExtent *a, const VmdkExtent *b) +{ +return a-flat == b-flat + a-compressed == b-compressed + (a-flat || a-cluster_sectors == b-cluster_sectors); +} + static int vmdk_get_info(BlockDriverState *bs, BlockDriverInfo *bdi) { int i; BDRVVmdkState *s = bs-opaque; assert(s-num_extents); -bdi-needs_compressed_writes = s-extents[0].compressed; -if (!s-extents[0].flat) { -bdi-cluster_size = s-extents[0].cluster_sectors BDRV_SECTOR_BITS; -} + /* See if we have multiple extents but they have different cases */ for (i = 1; i s-num_extents; i++) { -if (bdi-needs_compressed_writes != s-extents[i].compressed || -(bdi-cluster_size bdi-cluster_size != -s-extents[i].cluster_sectors BDRV_SECTOR_BITS)) { +if (!vmdk_extents_type_eq(s-extents[0], s-extents[i])) { return -ENOTSUP; } } +bdi-needs_compressed_writes = s-extents[0].compressed; +if (!s-extents[0].flat) { +bdi-cluster_size = s-extents[0].cluster_sectors BDRV_SECTOR_BITS; +} return 0; } -- 2.1.0
[Qemu-devel] qemu 2.2 crash on linux hvm domU (full backtrace included)
dom0 xen-unstable from staging git with x86/hvm: Extend HVM cpuid leaf with vcpu id and x86/hvm: Add per-vcpu evtchn upcalls patches, and qemu 2.2 from spice git (spice/next commit e779fa0a715530311e6f59fc8adb0f6eca914a89): https://github.com/Fantu/Xen/commits/rebase/m2r-staging Qemu crash on fedora 20 lxde (with software updates of some days ago) boot with this backtrace: Program received signal SIGSEGV, Segmentation fault. 0x55689607 in vmport_ioport_read (opaque=0x56440a20, addr=0, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/hw/misc/vmport.c:73 73 eax = env-regs[R_EAX]; (gdb) bt full #0 0x55689607 in vmport_ioport_read (opaque=0x56440a20, addr=0, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/hw/misc/vmport.c:73 s = 0x56440a20 cs = 0x0 cpu = 0x0 __func__ = vmport_ioport_read env = 0x8250 command = 0 '\000' eax = 0 #1 0x55655b9c in memory_region_read_accessor (mr=0x56440aa8, addr=0, value=0x7fffd8c0, size=4, shift=0, mask=4294967295) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:410 tmp = 0 #2 0x55655e8f in access_with_adjusted_size (addr=0, value=0x7fffd8c0, size=4, access_size_min=4, access_size_max=4, access=0x55655b3a memory_region_read_accessor, mr=0x56440aa8) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:480 access_mask = 4294967295 access_size = 4 i = 0 #3 0x55658cc1 in memory_region_dispatch_read1 (mr=0x56440aa8, addr=0, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:1077 data = 0 #4 0x55658d89 in memory_region_dispatch_read (mr=0x56440aa8, addr=0, pval=0x7fffd998, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:1099 No locals. #5 0x5565c794 in io_mem_read (mr=0x56440aa8, addr=0, pval=0x7fffd998, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:1962 No locals. #6 0x55609fae in address_space_rw (as=0x55eae840, addr=22104, buf=0x7fffda40 \377\377\377\377, len=4, is_write=false) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/exec.c:2169 l = 4 ptr = 0x0 val = 7964229952888770560 addr1 = 0 mr = 0x56440aa8 error = false #7 0x5560a173 in address_space_read (as=0x55eae840, addr=22104, buf=0x7fffda40 \377\377\377\377, len=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/exec.c:2207 No locals. #8 0x5564fac7 in cpu_inl (addr=22104) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/ioport.c:117 buf = \377\377\377\377 val = 21845 #9 0x5567084b in do_inp (addr=22104, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/xen-hvm.c:684 No locals. #10 0x55670ab8 in cpu_ioreq_pio (req=0x77ff3000) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/xen-hvm.c:747 i = 0 #11 0x5567108b in handle_ioreq (state=0x563c1590, req=0x77ff3000) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/xen-hvm.c:853 ---Type return to continue, or q return to quit--- No locals. #12 0x556713fe in cpu_handle_ioreq (opaque=0x563c1590) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/xen-hvm.c:931 state = 0x563c1590 req = 0x77ff3000 #13 0x5596d874 in qemu_iohandler_poll (pollfds=0x56388a30, ret=1) at iohandler.c:143 revents = 1 pioh = 0x563f3c90 ioh = 0x56515f80 #14 0x5596d450 in main_loop_wait (nonblocking=0) at main-loop.c:495 ret = 1 timeout = 4294967295 timeout_ns = 3418165 #15 0x557567b7 in main_loop () at vl.c:1882 nonblocking = false last_io = 1 #16 0x5575e4c1 in main (argc=62, argv=0x7fffe038, envp=0x7fffe230) at vl.c:4400 i = 128 snapshot = 0 linux_boot = 0 initrd_filename = 0x0 kernel_filename = 0x0 kernel_cmdline = 0x55a485c6 boot_order = 0x563864e0 dc ds = 0x564c71b0 cyls = 0 heads = 0 secs = 0 translation = 0 hda_opts = 0x0 opts = 0x56386430 machine_opts = 0x56388090 icount_opts = 0x0 olist = 0x55e56da0 optind = 62 optarg = 0x7fffe914 file=/mnt/vm/disks/FEDORA19.disk1.xm,if=ide,index=0,media=disk,format=raw,cache=writeback loadvm = 0x0 machine_class = 0x5637c5c0 cpu_model = 0x0 vga_model = 0x0 qtest_chrdev = 0x0 ---Type return to continue, or q return to quit--- qtest_log = 0x0 pid_file = 0x0 incoming = 0x0 show_vnc_port = 0 defconfig = true userconfig = true log_mask = 0x0 log_file = 0x0 mem_trace = {malloc = 0x55759e7a malloc_and_trace, realloc = 0x55759ed2 realloc_and_trace, free = 0x55759f39 free_and_trace, calloc = 0, try_malloc = 0, try_realloc = 0} trace_events = 0x0
[Qemu-devel] [PULL for-2.2 0/2] Xen tree 2014-11-14
The following changes since commit c52e67924fbdadfa00668248f5c075542943c54c: Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2014-11-13 15:44:16 +) are available in the git repository at: git://xenbits.xen.org/people/sstabellini/qemu-dm.git xen-2014-11-14 for you to fetch changes up to 2f01dfacb56bc7a0d4639adc9dff9aae131e6216: xen_disk: fix unmapping of persistent grants (2014-11-14 11:12:38 +) Igor Mammedov (1): pc: piix4_pm: init legacy PCI hotplug when running on Xen Roger Pau Monne (1): xen_disk: fix unmapping of persistent grants hw/acpi/piix4.c |4 +++ hw/block/xen_disk.c | 72 ++- hw/i386/pc_piix.c | 11 3 files changed, 70 insertions(+), 17 deletions(-)
[Qemu-devel] [PULL for-2.2 1/2] pc: piix4_pm: init legacy PCI hotplug when running on Xen
From: Igor Mammedov imamm...@redhat.com If user starts QEMU with -machine pc,accel=xen, then compat property in xenfv won't work and it would cause error: Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set when PCI device is added with -device on QEMU CLI. From: Igor Mammedov imamm...@redhat.com In case of Xen instead of using compat property, just use the fact that xen doesn't use QEMU's fw_cfg/acpi tables to switch piix4_pm into legacy PCI hotplug mode when Xen is enabled. Signed-off-by: Igor Mammedov imamm...@redhat.com Signed-off-by: Li Liang liang.z...@intel.com Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Acked-by: Paolo Bonzini pbonz...@redhat.com --- hw/acpi/piix4.c |4 hw/i386/pc_piix.c | 11 --- 2 files changed, 4 insertions(+), 11 deletions(-) diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c index 78c0a6d..481a16c 100644 --- a/hw/acpi/piix4.c +++ b/hw/acpi/piix4.c @@ -36,6 +36,7 @@ #include hw/mem/pc-dimm.h #include hw/acpi/memory_hotplug.h #include hw/acpi/acpi_dev_interface.h +#include hw/xen/xen.h //#define DEBUG @@ -501,6 +502,9 @@ I2CBus *piix4_pm_init(PCIBus *bus, int devfn, uint32_t smb_io_base, s-irq = sci_irq; s-smi_irq = smi_irq; s-kvm_enabled = kvm_enabled; +if (xen_enabled()) { +s-use_acpi_pci_hotplug = false; +} qdev_init_nofail(dev); diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index b559181..7bb97a4 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -916,17 +916,6 @@ static QEMUMachine xenfv_machine = { .max_cpus = HVM_MAX_VCPUS, .default_machine_opts = accel=xen, .hot_add_cpu = pc_hot_add_cpu, -.compat_props = (GlobalProperty[]) { -/* xenfv has no fwcfg and so does not load acpi from QEMU. - * as such new acpi features don't work. - */ -{ -.driver = PIIX4_PM, -.property = acpi-pci-hotplug-with-bridge-support, -.value= off, -}, -{ /* end of list */ } -}, }; #endif -- 1.7.10.4
[Qemu-devel] [PULL for-2.2 2/2] xen_disk: fix unmapping of persistent grants
From: Roger Pau Monne roger@citrix.com This patch fixes two issues with persistent grants and the disk PV backend (Qdisk): - Keep track of memory regions where persistent grants have been mapped since we need to unmap them as a whole. It is not possible to unmap a single grant if it has been batch-mapped. A new check has also been added to make sure persistent grants are only used if the whole mapped region can be persistently mapped in the batch_maps case. - Unmap persistent grants before switching to the closed state, so the frontend can also free them. Signed-off-by: Roger Pau Monné roger@citrix.com Reported-by: George Dunlap george.dun...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Kevin Wolf kw...@redhat.com Cc: Stefan Hajnoczi stefa...@redhat.com Cc: George Dunlap george.dun...@eu.citrix.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- hw/block/xen_disk.c | 72 ++- 1 file changed, 66 insertions(+), 6 deletions(-) diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index 231e9a7..21842a0 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -59,6 +59,13 @@ struct PersistentGrant { typedef struct PersistentGrant PersistentGrant; +struct PersistentRegion { +void *addr; +int num; +}; + +typedef struct PersistentRegion PersistentRegion; + struct ioreq { blkif_request_t req; int16_t status; @@ -118,6 +125,7 @@ struct XenBlkDev { gbooleanfeature_discard; gbooleanfeature_persistent; GTree *persistent_gnts; +GSList *persistent_regions; unsigned intpersistent_gnt_count; unsigned intmax_grants; @@ -177,6 +185,23 @@ static void destroy_grant(gpointer pgnt) g_free(grant); } +static void remove_persistent_region(gpointer data, gpointer dev) +{ +PersistentRegion *region = data; +struct XenBlkDev *blkdev = dev; +XenGnttab gnt = blkdev-xendev.gnttabdev; + +if (xc_gnttab_munmap(gnt, region-addr, region-num) != 0) { +xen_be_printf(blkdev-xendev, 0, + xc_gnttab_munmap region %p failed: %s\n, + region-addr, strerror(errno)); +} +xen_be_printf(blkdev-xendev, 3, + unmapped grant region %p with %d pages\n, + region-addr, region-num); +g_free(region); +} + static struct ioreq *ioreq_start(struct XenBlkDev *blkdev) { struct ioreq *ioreq = NULL; @@ -343,6 +368,7 @@ static int ioreq_map(struct ioreq *ioreq) void *page[BLKIF_MAX_SEGMENTS_PER_REQUEST]; int i, j, new_maps = 0; PersistentGrant *grant; +PersistentRegion *region; /* domids and refs variables will contain the information necessary * to map the grants that are needed to fulfill this request. * @@ -421,7 +447,22 @@ static int ioreq_map(struct ioreq *ioreq) } } } -if (ioreq-blkdev-feature_persistent) { +if (ioreq-blkdev-feature_persistent new_maps != 0 +(!batch_maps || (ioreq-blkdev-persistent_gnt_count + new_maps = +ioreq-blkdev-max_grants))) { +/* + * If we are using persistent grants and batch mappings only + * add the new maps to the list of persistent grants if the whole + * area can be persistently mapped. + */ +if (batch_maps) { +region = g_malloc0(sizeof(*region)); +region-addr = ioreq-pages; +region-num = new_maps; +ioreq-blkdev-persistent_regions = g_slist_append( +ioreq-blkdev-persistent_regions, +region); +} while ((ioreq-blkdev-persistent_gnt_count ioreq-blkdev-max_grants) new_maps) { /* Go through the list of newly mapped grants and add as many @@ -447,6 +488,7 @@ static int ioreq_map(struct ioreq *ioreq) grant); ioreq-blkdev-persistent_gnt_count++; } +assert(!batch_maps || new_maps == 0); } for (i = 0; i ioreq-v.niov; i++) { ioreq-v.iov[i].iov_base += (uintptr_t)page[i]; @@ -971,7 +1013,10 @@ static int blk_connect(struct XenDevice *xendev) blkdev-max_grants = max_requests * BLKIF_MAX_SEGMENTS_PER_REQUEST; blkdev-persistent_gnts = g_tree_new_full((GCompareDataFunc)int_cmp, NULL, NULL, + batch_maps ? + (GDestroyNotify)g_free : (GDestroyNotify)destroy_grant); +blkdev-persistent_regions = NULL; blkdev-persistent_gnt_count = 0; } @@ -1000,6 +1045,26 @@ static void blk_disconnect(struct XenDevice *xendev) blkdev-cnt_map--; blkdev-sring = NULL;
Re: [Qemu-devel] [PULL for-2.2 00/11] Block patches
On 14 November 2014 11:05, Stefan Hajnoczi stefa...@redhat.com wrote: The following changes since commit c52e67924fbdadfa00668248f5c075542943c54c: Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2014-11-13 15:44:16 +) are available in the git repository at: git://github.com/stefanha/qemu.git tags/block-pull-request for you to fetch changes up to 5f58330790b72c4705b373ee0646fb3adf800b4e: vmdk: Leave bdi intact if -ENOTSUP in vmdk_get_info (2014-11-14 09:20:45 +) Applied, thanks. -- PMM
[Qemu-devel] [Bug 1392504] Re: USB Passthrough is not working anymore
Command lines generated by virsh from the libvirt xml files (let me know if you also need the xml configuration files): qemu-system-x86_64 -enable-kvm -name fsg -S -machine pc-1.2,accel=kvm,usb=off -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 30619811-d285-944a-026a-cdd4a753b77a -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/fsg.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device nec-usb-xhci,id=usb,bus=pci.0,addr=0x5 -drive file=/vm2/fsgorg/disk-sda2.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/vm2/fsgorg/disk-sdb.qcow2,if=none,id=drive-virtio-disk1,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk1,id=virtio-disk1 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=2 -netdev tap,fd=23,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:76:fd:b6,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:1 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device usb-host,hostbus=1,hostaddr=4,id=hostdev0 -device usb-host,hostbus=4,hostaddr=2,id=hostdev1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on qemu-system-x86_64 -enable-kvm -name fnas -S -machine pc- i440fx-1.4,accel=kvm,usb=off -m 1024 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 7c2703e5-b8b8-acd9-df8b-6284382fb58b -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/fnas.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no- shutdown -boot menu=off,strict=on -device ich9-usb- ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb- uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb- uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9 -usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -drive file=/vm2/FreeNAS/fnas.qcow2,if=none,id=drive-ide0-0-0,format=qcow2 -device ide-hd,bus=ide.0,unit=0,drive=drive- ide0-0-0,id=ide0-0-0,bootindex=1 -drive file=/vm2/FreeNAS/fnas2.qcow2,if=none,id=drive-ide0-0-1,format=qcow2 -device ide-hd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide- cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=23,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:d8:31:b5,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa- serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:5 -device cirrus- vga,id=video0,bus=pci.0,addr=0x2 -device intel- hda,id=sound0,bus=pci.0,addr=0x4 -device hda- duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device usb- host,hostbus=2,hostaddr=3,id=hostdev0 -device virtio-balloon- pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on qemu-system-x86_64 -enable-kvm -name win7 -S -machine pc-1.2,accel=kvm,usb=off -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 717d41d2-cfc9-5d70-e2f5-817c2ed1a90a -nouser-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/win7.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-shutdown -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -drive file=/vm/Windows7/win7.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/vm/Windows7/joespb.iso,if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=23,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:3c:7b:7f,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:2 -device VGA,id=video0,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device usb-host,hostbus=1,hostaddr=3,id=hostdev0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -usb -msg timestamp=on Regards, Leen -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1392504 Title: USB Passthrough is
[Qemu-devel] [PATCH v2 02/21] qcow2: Add refcount_width to format-specific info
Add the bit width of every refcount entry to the format-specific information. In contrast to lazy_refcounts and the corrupt flag, this should be always emitted, even for compat=0.10 although it does not support any refcount width other than 16 bits. This is because if a boolean is optional, one normally assumes it to be false when omitted; but if an integer is not specified, it is rather difficult to guess its value. This new field breaks some test outputs, fix them. Signed-off-by: Max Reitz mre...@redhat.com --- block/qcow2.c | 4 +++- qapi/block-core.json | 5 - tests/qemu-iotests/060.out | 1 + tests/qemu-iotests/065 | 23 +++ tests/qemu-iotests/067.out | 5 + tests/qemu-iotests/082.out | 7 +++ tests/qemu-iotests/089.out | 2 ++ 7 files changed, 37 insertions(+), 10 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index f57aff9..d70e927 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2475,7 +2475,8 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs) }; if (s-qcow_version == 2) { *spec_info-qcow2 = (ImageInfoSpecificQCow2){ -.compat = g_strdup(0.10), +.compat = g_strdup(0.10), +.refcount_width = s-refcount_bits, }; } else if (s-qcow_version == 3) { *spec_info-qcow2 = (ImageInfoSpecificQCow2){ @@ -2486,6 +2487,7 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs) .corrupt= s-incompatible_features QCOW2_INCOMPAT_CORRUPT, .has_corrupt= true, +.refcount_width = s-refcount_bits, }; } diff --git a/qapi/block-core.json b/qapi/block-core.json index b7083fb..e3a3cb7 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -41,13 +41,16 @@ # @corrupt: #optional true if the image has been marked corrupt; only valid for # compat = 1.1 (since 2.2) # +# @refcount-width: width of a refcount entry in bits (since 2.3) +# # Since: 1.7 ## { 'type': 'ImageInfoSpecificQCow2', 'data': { 'compat': 'str', '*lazy-refcounts': 'bool', - '*corrupt': 'bool' + '*corrupt': 'bool', + 'refcount-width': 'int' } } ## diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out index 9419da1..17b3eaf 100644 --- a/tests/qemu-iotests/060.out +++ b/tests/qemu-iotests/060.out @@ -19,6 +19,7 @@ cluster_size: 65536 Format specific information: compat: 1.1 lazy refcounts: false +refcount width: 16 corrupt: true qemu-io: can't open device TEST_DIR/t.IMGFMT: IMGFMT: Image is corrupt; cannot be opened read/write read 512/512 bytes at offset 0 diff --git a/tests/qemu-iotests/065 b/tests/qemu-iotests/065 index 8d3a9c9..8539aeb 100755 --- a/tests/qemu-iotests/065 +++ b/tests/qemu-iotests/065 @@ -88,34 +88,41 @@ class TestQMP(TestImageInfoSpecific): class TestQCow2(TestQemuImgInfo): '''Testing a qcow2 version 2 image''' img_options = 'compat=0.10' -json_compare = { 'compat': '0.10' } -human_compare = [ 'compat: 0.10' ] +json_compare = { 'compat': '0.10', 'refcount-width': 16 } +human_compare = [ 'compat: 0.10', 'refcount width: 16' ] class TestQCow3NotLazy(TestQemuImgInfo): '''Testing a qcow2 version 3 image with lazy refcounts disabled''' img_options = 'compat=1.1,lazy_refcounts=off' -json_compare = { 'compat': '1.1', 'lazy-refcounts': False, 'corrupt': False } -human_compare = [ 'compat: 1.1', 'lazy refcounts: false', 'corrupt: false' ] +json_compare = { 'compat': '1.1', 'lazy-refcounts': False, + 'refcount-width': 16, 'corrupt': False } +human_compare = [ 'compat: 1.1', 'lazy refcounts: false', + 'refcount width: 16', 'corrupt: false' ] class TestQCow3Lazy(TestQemuImgInfo): '''Testing a qcow2 version 3 image with lazy refcounts enabled''' img_options = 'compat=1.1,lazy_refcounts=on' -json_compare = { 'compat': '1.1', 'lazy-refcounts': True, 'corrupt': False } -human_compare = [ 'compat: 1.1', 'lazy refcounts: true', 'corrupt: false' ] +json_compare = { 'compat': '1.1', 'lazy-refcounts': True, + 'refcount-width': 16, 'corrupt': False } +human_compare = [ 'compat: 1.1', 'lazy refcounts: true', + 'refcount width: 16', 'corrupt: false' ] class TestQCow3NotLazyQMP(TestQMP): '''Testing a qcow2 version 3 image with lazy refcounts disabled, opening with lazy refcounts enabled''' img_options = 'compat=1.1,lazy_refcounts=off' qemu_options = 'lazy-refcounts=on' -compare = { 'compat': '1.1', 'lazy-refcounts': False, 'corrupt': False } +compare = { 'compat': '1.1', 'lazy-refcounts': False, +'refcount-width': 16, 'corrupt': False } + class TestQCow3LazyQMP(TestQMP): '''Testing a qcow2 version 3 image with lazy refcounts
[Qemu-devel] [PATCH v2 04/21] qcow2: Respect error in qcow2_alloc_bytes()
qcow2_update_cluster_refcount() may fail, and qcow2_alloc_bytes() should mind that case. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- block/qcow2-refcount.c | 32 +--- 1 file changed, 21 insertions(+), 11 deletions(-) diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index 6e06531..be4e5fe 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -761,7 +761,8 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset, int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size) { BDRVQcowState *s = bs-opaque; -int64_t offset, cluster_offset; +int64_t offset, cluster_offset, new_cluster; +int64_t ret; int free_in_cluster; BLKDBG_EVENT(bs-file, BLKDBG_CLUSTER_ALLOC_BYTES); @@ -783,23 +784,32 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size) free_in_cluster -= size; if (free_in_cluster == 0) s-free_byte_offset = 0; -if (offset_into_cluster(s, offset) != 0) -qcow2_update_cluster_refcount(bs, offset s-cluster_bits, 1, - QCOW2_DISCARD_NEVER); +if (offset_into_cluster(s, offset) != 0) { +ret = qcow2_update_cluster_refcount(bs, offset s-cluster_bits, +1, QCOW2_DISCARD_NEVER); +if (ret 0) { +return ret; +} +} } else { -offset = qcow2_alloc_clusters(bs, s-cluster_size); -if (offset 0) { -return offset; +new_cluster = qcow2_alloc_clusters(bs, s-cluster_size); +if (new_cluster 0) { +return new_cluster; } cluster_offset = start_of_cluster(s, s-free_byte_offset); -if ((cluster_offset + s-cluster_size) == offset) { +if ((cluster_offset + s-cluster_size) == new_cluster) { /* we are lucky: contiguous data */ offset = s-free_byte_offset; -qcow2_update_cluster_refcount(bs, offset s-cluster_bits, 1, - QCOW2_DISCARD_NEVER); +ret = qcow2_update_cluster_refcount(bs, offset s-cluster_bits, +1, QCOW2_DISCARD_NEVER); +if (ret 0) { +qcow2_free_clusters(bs, new_cluster, s-cluster_size, +QCOW2_DISCARD_NEVER); +return ret; +} s-free_byte_offset += size; } else { -s-free_byte_offset = offset; +s-free_byte_offset = new_cluster; goto redo; } } -- 1.9.3
[Qemu-devel] [PATCH v2 12/21] qcow2: Allow creation with refcount order != 4
Add a creation option to qcow2 for setting the refcount order of images to be created, and respect that option's value. This breaks some test outputs, fix them. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- block/qcow2.c | 20 include/block/block_int.h | 1 + tests/qemu-iotests/049.out | 112 ++--- tests/qemu-iotests/082.out | 41 ++--- tests/qemu-iotests/085.out | 38 +++ 5 files changed, 130 insertions(+), 82 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 6dc1984..657d558 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2065,6 +2065,17 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp) goto finish; } +refcount_width = qemu_opt_get_number_del(opts, BLOCK_OPT_REFCOUNT_WIDTH, + refcount_width); +if (refcount_width = 0 || refcount_width 64 || +!is_power_of_2(refcount_width)) +{ +error_setg(errp, Refcount width must be a power of two and may not + exceed 64 bits); +ret = -EINVAL; +goto finish; +} + if (version 3 refcount_width != 16) { error_setg(errp, Different refcount widths than 16 bits require compatibility level 1.1 or above (use compat=1.1 or @@ -2704,6 +2715,9 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, } else if (!strcmp(desc-name, lazy_refcounts)) { lazy_refcounts = qemu_opt_get_bool(opts, lazy_refcounts, lazy_refcounts); +} else if (!strcmp(desc-name, refcount_width)) { +error_report(Cannot change refcount entry width); +return -ENOTSUP; } else { /* if this assertion fails, this probably means a new option was * added without having it covered here */ @@ -2873,6 +2887,12 @@ static QemuOptsList qcow2_create_opts = { .help = Postpone refcount updates, .def_value_str = off }, +{ +.name = BLOCK_OPT_REFCOUNT_WIDTH, +.type = QEMU_OPT_NUMBER, +.help = Width of a reference count entry in bits, +.def_value_str = 16 +}, { /* end of list */ } } }; diff --git a/include/block/block_int.h b/include/block/block_int.h index a1c17b9..c34d610 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -56,6 +56,7 @@ #define BLOCK_OPT_ADAPTER_TYPE adapter_type #define BLOCK_OPT_REDUNDANCYredundancy #define BLOCK_OPT_NOCOW nocow +#define BLOCK_OPT_REFCOUNT_WIDTHrefcount_width typedef struct BdrvTrackedRequest { BlockDriverState *bs; diff --git a/tests/qemu-iotests/049.out b/tests/qemu-iotests/049.out index 09ca0ae..9369c12 100644 --- a/tests/qemu-iotests/049.out +++ b/tests/qemu-iotests/049.out @@ -4,90 +4,90 @@ QA output created by 049 == 1. Traditional size parameter == qemu-img create -f qcow2 TEST_DIR/t.qcow2 1024 -Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 encryption=off cluster_size=65536 lazy_refcounts=off +Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 encryption=off cluster_size=65536 lazy_refcounts=off refcount_width=16 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1024b -Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 encryption=off cluster_size=65536 lazy_refcounts=off +Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 encryption=off cluster_size=65536 lazy_refcounts=off refcount_width=16 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1k -Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 encryption=off cluster_size=65536 lazy_refcounts=off +Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 encryption=off cluster_size=65536 lazy_refcounts=off refcount_width=16 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1K -Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 encryption=off cluster_size=65536 lazy_refcounts=off +Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 encryption=off cluster_size=65536 lazy_refcounts=off refcount_width=16 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1M -Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1048576 encryption=off cluster_size=65536 lazy_refcounts=off +Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1048576 encryption=off cluster_size=65536 lazy_refcounts=off refcount_width=16 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1G -Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 lazy_refcounts=off +Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 lazy_refcounts=off refcount_width=16 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1T -Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1099511627776 encryption=off cluster_size=65536 lazy_refcounts=off +Formatting 'TEST_DIR/t.qcow2', fmt=qcow2
[Qemu-devel] [PATCH v2 06/21] qcow2: Helper for refcount array reallocation
Add a helper function for reallocating a refcount array, independently of the refcount order. The newly allocated space is zeroed and the function handles failed reallocations gracefully. Signed-off-by: Max Reitz mre...@redhat.com --- block/qcow2-refcount.c | 121 + 1 file changed, 72 insertions(+), 49 deletions(-) diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index 66c78c0..18fcd0d 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -1108,6 +1108,54 @@ fail: /* refcount checking functions */ +static size_t refcount_array_byte_size(BDRVQcowState *s, uint64_t entries) +{ +if (s-refcount_order 3) { +/* sub-byte width */ +int shift = 3 - s-refcount_order; +return (entries + (1 shift) - 1) shift; +} else if (s-refcount_order == 3) { +/* byte width */ +return entries; +} else { +/* multiple bytes wide */ + +/* This assertion holds because there is no way we can address more than + * 2^(64 - 9) clusters at once (with cluster size 512 = 2^9, and because + * offsets have to be representable in bytes); due to every cluster + * corresponding to one refcount entry and because refcount_order has to + * be below 7, we are far below that limit */ +assert(!(entries (64 - (s-refcount_order - 3; + +return entries (s-refcount_order - 3); +} +} + +static int realloc_refcount_array(BDRVQcowState *s, uint16_t **array, + int64_t *size, int64_t new_size) +{ +/* Round to clusters so the array can be directly written to disk */ +size_t old_byte_size = ROUND_UP(refcount_array_byte_size(s, *size), +s-cluster_size); +size_t new_byte_size = ROUND_UP(refcount_array_byte_size(s, new_size), +s-cluster_size); +uint16_t *new_ptr; + +new_ptr = g_try_realloc(*array, new_byte_size); +if (new_byte_size !new_ptr) { +return -ENOMEM; +} + +if (new_ptr) { +memset((void *)((uintptr_t)new_ptr + old_byte_size), 0, + new_byte_size - old_byte_size); +} + +*array = new_ptr; +*size = new_size; + +return 0; +} /* * Increases the refcount for a range of clusters in a given refcount table. @@ -1124,6 +1172,7 @@ static int inc_refcounts(BlockDriverState *bs, { BDRVQcowState *s = bs-opaque; uint64_t start, last, cluster_offset, k; +int ret; if (size = 0) { return 0; @@ -1135,23 +1184,12 @@ static int inc_refcounts(BlockDriverState *bs, cluster_offset += s-cluster_size) { k = cluster_offset s-cluster_bits; if (k = *refcount_table_size) { -int64_t old_refcount_table_size = *refcount_table_size; -uint16_t *new_refcount_table; - -*refcount_table_size = k + 1; -new_refcount_table = g_try_realloc(*refcount_table, - *refcount_table_size * - sizeof(**refcount_table)); -if (!new_refcount_table) { -*refcount_table_size = old_refcount_table_size; +ret = realloc_refcount_array(s, refcount_table, + refcount_table_size, k + 1); +if (ret 0) { res-check_errors++; -return -ENOMEM; +return ret; } -*refcount_table = new_refcount_table; - -memset(*refcount_table + old_refcount_table_size, 0, - (*refcount_table_size - old_refcount_table_size) * - sizeof(**refcount_table)); } if (++(*refcount_table)[k] == 0) { @@ -1518,8 +1556,7 @@ static int check_refblocks(BlockDriverState *bs, BdrvCheckResult *res, fix BDRV_FIX_ERRORS ? Repairing : ERROR, i); if (fix BDRV_FIX_ERRORS) { -int64_t old_nb_clusters = *nb_clusters; -uint16_t *new_refcount_table; +int64_t new_nb_clusters; if (offset INT64_MAX - s-cluster_size) { ret = -EINVAL; @@ -1536,22 +1573,15 @@ static int check_refblocks(BlockDriverState *bs, BdrvCheckResult *res, goto resize_fail; } -*nb_clusters = size_to_clusters(s, size); -assert(*nb_clusters = old_nb_clusters); +new_nb_clusters = size_to_clusters(s, size); +assert(new_nb_clusters = *nb_clusters); -new_refcount_table = g_try_realloc(*refcount_table, - *nb_clusters * - sizeof(**refcount_table)); -if (!new_refcount_table) { -*nb_clusters = old_nb_clusters; +
[Qemu-devel] [PATCH v2 03/21] qcow2: Use 64 bits for refcount values
Refcounts may have a width of up to 64 bits, so qemu should use the same width to represent refcount values internally. Since for instance qcow2_get_refcount() signals an error by returning a negative value, refcount values are generally signed to be able to represent those error values correctly. This limits the maximum refcount value supported by qemu to INT64_MAX (= 63 bits), as established in qcow2: Add two new fields to BDRVQcowState. This limitation should have no implications in practice for normal valid images. If the MSb in a 64 bit refcount value is set, we can safely assume the value to be invalid (because reaching such high refcounts is impossible due to other limitations of the qcow2 format). Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- block/qcow2-cluster.c | 9 ++--- block/qcow2-refcount.c | 37 - block/qcow2.h | 7 --- 3 files changed, 30 insertions(+), 23 deletions(-) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index df0b2c9..ab43902 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -1640,7 +1640,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, for (i = 0; i l1_size; i++) { uint64_t l2_offset = l1_table[i] L1E_OFFSET_MASK; bool l2_dirty = false; -int l2_refcount; +int64_t l2_refcount; if (!l2_offset) { /* unallocated */ @@ -1696,14 +1696,17 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, } if (l2_refcount 1) { +int64_t ret64; + /* For shared L2 tables, set the refcount accordingly (it is * already 1 and needs to be l2_refcount) */ -ret = qcow2_update_cluster_refcount(bs, +ret64 = qcow2_update_cluster_refcount(bs, offset s-cluster_bits, l2_refcount - 1, QCOW2_DISCARD_OTHER); -if (ret 0) { +if (ret64 0) { qcow2_free_clusters(bs, offset, s-cluster_size, QCOW2_DISCARD_OTHER); +ret = ret64; goto fail; } } diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index 6016211..6e06531 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -91,14 +91,14 @@ static int load_refcount_block(BlockDriverState *bs, * return value is the refcount of the cluster, negative values are -errno * and indicate an error. */ -int qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index) +int64_t qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index) { BDRVQcowState *s = bs-opaque; uint64_t refcount_table_index, block_index; int64_t refcount_block_offset; int ret; uint16_t *refcount_block; -uint16_t refcount; +int64_t refcount; refcount_table_index = cluster_index s-refcount_block_bits; if (refcount_table_index = s-refcount_table_size) @@ -556,9 +556,10 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, for(cluster_offset = start; cluster_offset = last; cluster_offset += s-cluster_size) { -int block_index, refcount; +int block_index; int64_t cluster_index = cluster_offset s-cluster_bits; int64_t table_index = cluster_index s-refcount_block_bits; +int64_t refcount; /* Load the refcount block and allocate it if needed */ if (table_index != old_table_index) { @@ -634,10 +635,10 @@ fail: * If the return value is non-negative, it is the new refcount of the cluster. * If it is negative, it is -errno and indicates an error. */ -int qcow2_update_cluster_refcount(BlockDriverState *bs, - int64_t cluster_index, - int addend, - enum qcow2_discard_type type) +int64_t qcow2_update_cluster_refcount(BlockDriverState *bs, + int64_t cluster_index, + int addend, + enum qcow2_discard_type type) { BDRVQcowState *s = bs-opaque; int ret; @@ -663,7 +664,7 @@ static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size) { BDRVQcowState *s = bs-opaque; uint64_t i, nb_clusters; -int refcount; +int64_t refcount; nb_clusters = size_to_clusters(s, size); retry: @@ -722,7 +723,8 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset, BDRVQcowState *s = bs-opaque; uint64_t cluster_index; uint64_t i; -int refcount, ret; +int64_t refcount; +int ret; assert(nb_clusters = 0); if
[Qemu-devel] [PATCH v2 10/21] qcow2: refcount_order parameter for qcow2_create2
Add a refcount_order parameter to qcow2_create2(), use that value for the image header and for calculating the size required for preallocation. For now, always pass 4. Signed-off-by: Max Reitz mre...@redhat.com --- block/qcow2.c | 41 ++--- 1 file changed, 30 insertions(+), 11 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 528d696..6dc1984 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1775,7 +1775,7 @@ static int preallocate(BlockDriverState *bs) static int qcow2_create2(const char *filename, int64_t total_size, const char *backing_file, const char *backing_format, int flags, size_t cluster_size, PreallocMode prealloc, - QemuOpts *opts, int version, + QemuOpts *opts, int version, int refcount_order, Error **errp) { /* Calculate cluster_bits */ @@ -1811,6 +1811,13 @@ static int qcow2_create2(const char *filename, int64_t total_size, int64_t meta_size = 0; uint64_t nreftablee, nrefblocke, nl1e, nl2e; int64_t aligned_total_size = align_offset(total_size, cluster_size); +int refblock_bits, refblock_size; +/* refcount entry size in bytes */ +double rces = (1 refcount_order) / 8.; + +/* see qcow2_open() */ +refblock_bits = cluster_bits - (refcount_order - 3); +refblock_size = 1 refblock_bits; /* header: 1 cluster */ meta_size += cluster_size; @@ -1835,20 +1842,20 @@ static int qcow2_create2(const char *filename, int64_t total_size, * c = cluster size * y1 = number of refcount blocks entries * y2 = meta size including everything + * rces = refcount entry size in bytes * then, * y1 = (y2 + a)/c - * y2 = y1 * sizeof(u16) + y1 * sizeof(u16) * sizeof(u64) / c + m + * y2 = y1 * rces + y1 * rces * sizeof(u64) / c + m * we can get y1: - * y1 = (a + m) / (c - sizeof(u16) - sizeof(u16) * sizeof(u64) / c) + * y1 = (a + m) / (c - rces - rces * sizeof(u64) / c) */ -nrefblocke = (aligned_total_size + meta_size + cluster_size) / -(cluster_size - sizeof(uint16_t) - - 1.0 * sizeof(uint16_t) * sizeof(uint64_t) / cluster_size); -nrefblocke = align_offset(nrefblocke, cluster_size / sizeof(uint16_t)); -meta_size += nrefblocke * sizeof(uint16_t); +nrefblocke = (aligned_total_size + meta_size + cluster_size) + / (cluster_size - rces - rces * sizeof(uint64_t) + / cluster_size); +meta_size += DIV_ROUND_UP(nrefblocke, refblock_size) * cluster_size; /* total size of refcount tables */ -nreftablee = nrefblocke * sizeof(uint16_t) / cluster_size; +nreftablee = nrefblocke / refblock_size; nreftablee = align_offset(nreftablee, cluster_size / sizeof(uint64_t)); meta_size += nreftablee * sizeof(uint64_t); @@ -1883,7 +1890,7 @@ static int qcow2_create2(const char *filename, int64_t total_size, .l1_size= cpu_to_be32(0), .refcount_table_offset = cpu_to_be64(cluster_size), .refcount_table_clusters= cpu_to_be32(1), -.refcount_order = cpu_to_be32(4), +.refcount_order = cpu_to_be32(refcount_order), .header_length = cpu_to_be32(sizeof(*header)), }; @@ -2003,6 +2010,7 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp) size_t cluster_size = DEFAULT_CLUSTER_SIZE; PreallocMode prealloc; int version = 3; +int refcount_width = 16, refcount_order; Error *local_err = NULL; int ret; @@ -2057,8 +2065,19 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp) goto finish; } +if (version 3 refcount_width != 16) { +error_setg(errp, Different refcount widths than 16 bits require + compatibility level 1.1 or above (use compat=1.1 or + greater)); +ret = -EINVAL; +goto finish; +} + +refcount_order = ffs(refcount_width) - 1; + ret = qcow2_create2(filename, size, backing_file, backing_fmt, flags, -cluster_size, prealloc, opts, version, local_err); +cluster_size, prealloc, opts, version, refcount_order, +local_err); if (local_err) { error_propagate(errp, local_err); } -- 1.9.3
[Qemu-devel] [PATCH v2 14/21] qcow2: Use error_report() in qcow2_amend_options()
Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- block/qcow2.c | 14 ++ tests/qemu-iotests/061.out | 14 +++--- 2 files changed, 13 insertions(+), 15 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index d084485..e82120c 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2686,11 +2686,11 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, } else if (!strcmp(compat, 1.1)) { new_version = 3; } else { -fprintf(stderr, Unknown compatibility level %s.\n, compat); +error_report(Unknown compatibility level %s, compat); return -EINVAL; } } else if (!strcmp(desc-name, preallocation)) { -fprintf(stderr, Cannot change preallocation mode.\n); +error_report(Cannot change preallocation mode); return -ENOTSUP; } else if (!strcmp(desc-name, size)) { new_size = qemu_opt_get_size(opts, size, 0); @@ -2701,16 +2701,14 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, } else if (!strcmp(desc-name, encryption)) { encrypt = qemu_opt_get_bool(opts, encryption, s-crypt_method); if (encrypt != !!s-crypt_method) { -fprintf(stderr, Changing the encryption flag is not -supported.\n); +error_report(Changing the encryption flag is not supported); return -ENOTSUP; } } else if (!strcmp(desc-name, cluster_size)) { cluster_size = qemu_opt_get_size(opts, cluster_size, cluster_size); if (cluster_size != s-cluster_size) { -fprintf(stderr, Changing the cluster size is not -supported.\n); +error_report(Changing the cluster size is not supported); return -ENOTSUP; } } else if (!strcmp(desc-name, lazy_refcounts)) { @@ -2756,8 +2754,8 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, if (s-use_lazy_refcounts != lazy_refcounts) { if (lazy_refcounts) { if (s-qcow_version 3) { -fprintf(stderr, Lazy refcounts only supported with compatibility -level 1.1 and above (use compat=1.1 or greater)\n); +error_report(Lazy refcounts only supported with compatibility + level 1.1 and above (use compat=1.1 or greater)); return -EINVAL; } s-compatible_features |= QCOW2_COMPAT_LAZY_REFCOUNTS; diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out index 9045544..2fd92ca 100644 --- a/tests/qemu-iotests/061.out +++ b/tests/qemu-iotests/061.out @@ -281,19 +281,19 @@ No errors were found on the image. === Testing invalid configurations === Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 -Lazy refcounts only supported with compatibility level 1.1 and above (use compat=1.1 or greater) +qemu-img: Lazy refcounts only supported with compatibility level 1.1 and above (use compat=1.1 or greater) qemu-img: Error while amending options: Invalid argument -Lazy refcounts only supported with compatibility level 1.1 and above (use compat=1.1 or greater) +qemu-img: Lazy refcounts only supported with compatibility level 1.1 and above (use compat=1.1 or greater) qemu-img: Error while amending options: Invalid argument -Unknown compatibility level 0.42. +qemu-img: Unknown compatibility level 0.42 qemu-img: Error while amending options: Invalid argument qemu-img: Invalid parameter 'foo' qemu-img: Invalid options for file format 'qcow2' -Changing the cluster size is not supported. +qemu-img: Changing the cluster size is not supported qemu-img: Error while amending options: Operation not supported -Changing the encryption flag is not supported. +qemu-img: Changing the encryption flag is not supported qemu-img: Error while amending options: Operation not supported -Cannot change preallocation mode. +qemu-img: Cannot change preallocation mode qemu-img: Error while amending options: Operation not supported === Testing correct handling of unset value === @@ -301,7 +301,7 @@ qemu-img: Error while amending options: Operation not supported Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 Should work: Should not work: -Changing the cluster size is not supported. +qemu-img: Changing the cluster size is not supported qemu-img: Error while amending options: Operation not supported === Testing zero expansion on inactive clusters === -- 1.9.3
[Qemu-devel] [PATCH v2 08/21] qcow2: More helpers for refcount modification
Add helper functions for getting and setting refcounts in a refcount array for any possible refcount order, and choose the correct one during refcount initialization. Signed-off-by: Max Reitz mre...@redhat.com --- block/qcow2-refcount.c | 146 - 1 file changed, 144 insertions(+), 2 deletions(-) diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index b3ca7d2..2e13a9c 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -32,10 +32,73 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, int64_t offset, int64_t length, int addend, enum qcow2_discard_type type); +static uint64_t get_refcount_ro0(const void *refcount_array, uint64_t index); +static uint64_t get_refcount_ro1(const void *refcount_array, uint64_t index); +static uint64_t get_refcount_ro2(const void *refcount_array, uint64_t index); +static uint64_t get_refcount_ro3(const void *refcount_array, uint64_t index); static uint64_t get_refcount_ro4(const void *refcount_array, uint64_t index); +static uint64_t get_refcount_ro5(const void *refcount_array, uint64_t index); +static uint64_t get_refcount_ro6(const void *refcount_array, uint64_t index); +static void set_refcount_ro0(void *refcount_array, uint64_t index, + uint64_t value); +static void set_refcount_ro1(void *refcount_array, uint64_t index, + uint64_t value); +static void set_refcount_ro2(void *refcount_array, uint64_t index, + uint64_t value); +static void set_refcount_ro3(void *refcount_array, uint64_t index, + uint64_t value); static void set_refcount_ro4(void *refcount_array, uint64_t index, uint64_t value); +static void set_refcount_ro5(void *refcount_array, uint64_t index, + uint64_t value); +static void set_refcount_ro6(void *refcount_array, uint64_t index, + uint64_t value); + +static void get_refcount_functions(int refcount_order, + Qcow2GetRefcountFunc **get, + Qcow2SetRefcountFunc **set) +{ +switch (refcount_order) { +case 0: +*get = get_refcount_ro0; +*set = set_refcount_ro0; +break; + +case 1: +*get = get_refcount_ro1; +*set = set_refcount_ro1; +break; + +case 2: +*get = get_refcount_ro2; +*set = set_refcount_ro2; +break; + +case 3: +*get = get_refcount_ro3; +*set = set_refcount_ro3; +break; + +case 4: +*get = get_refcount_ro4; +*set = set_refcount_ro4; +break; + +case 5: +*get = get_refcount_ro5; +*set = set_refcount_ro5; +break; + +case 6: +*get = get_refcount_ro6; +*set = set_refcount_ro6; +break; + +default: +abort(); +} +} /*/ @@ -47,8 +110,8 @@ int qcow2_refcount_init(BlockDriverState *bs) unsigned int refcount_table_size2, i; int ret; -s-get_refcount = get_refcount_ro4; -s-set_refcount = set_refcount_ro4; +get_refcount_functions(s-refcount_order, + s-get_refcount, s-set_refcount); assert(s-refcount_table_size = INT_MAX / sizeof(uint64_t)); refcount_table_size2 = s-refcount_table_size * sizeof(uint64_t); @@ -80,6 +143,59 @@ void qcow2_refcount_close(BlockDriverState *bs) } +static uint64_t get_refcount_ro0(const void *refcount_array, uint64_t index) +{ +return (((const uint8_t *)refcount_array)[index / 8] (index % 8)) 0x1; +} + +static void set_refcount_ro0(void *refcount_array, uint64_t index, + uint64_t value) +{ +assert(!(value 1)); +((uint8_t *)refcount_array)[index / 8] = ~(0x1 (index % 8)); +((uint8_t *)refcount_array)[index / 8] |= value (index % 8); +} + +static uint64_t get_refcount_ro1(const void *refcount_array, uint64_t index) +{ +return (((const uint8_t *)refcount_array)[index / 4] (2 * (index % 4))) +0x3; +} + +static void set_refcount_ro1(void *refcount_array, uint64_t index, + uint64_t value) +{ +assert(!(value 2)); +((uint8_t *)refcount_array)[index / 4] = ~(0x3 (2 * (index % 4))); +((uint8_t *)refcount_array)[index / 4] |= value (2 * (index % 4)); +} + +static uint64_t get_refcount_ro2(const void *refcount_array, uint64_t index) +{ +return (((const uint8_t *)refcount_array)[index / 2] (4 * (index % 2))) +0xf; +} + +static void set_refcount_ro2(void *refcount_array, uint64_t index, + uint64_t value) +{ +assert(!(value 4)); +
[Qemu-devel] [PATCH v2 15/21] qcow2: Use abort() instead of assert(false)
Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- block/qcow2.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index e82120c..423af48 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2718,9 +2718,9 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, error_report(Cannot change refcount entry width); return -ENOTSUP; } else { -/* if this assertion fails, this probably means a new option was +/* if this point is reached, this probably means a new option was * added without having it covered here */ -assert(false); +abort(); } desc++; -- 1.9.3
[Qemu-devel] [PATCH v2 17/21] qcow2: Use intermediate helper CB for amend
If there is more than one time-consuming operation to be performed for qcow2_amend_options(), we need an intermediate CB which coordinates the progress of the individual operations and passes the result to the original status callback. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- block/qcow2.c | 80 ++- 1 file changed, 79 insertions(+), 1 deletion(-) diff --git a/block/qcow2.c b/block/qcow2.c index d2553b9..0263019 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2655,6 +2655,75 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version, return 0; } +typedef enum Qcow2AmendOperation { +/* This is the value Qcow2AmendHelperCBInfo::last_operation will be + * statically initialized to so that the helper CB can discern the first + * invocation from an operation change */ +QCOW2_NO_OPERATION = 0, + +QCOW2_DOWNGRADING, +} Qcow2AmendOperation; + +typedef struct Qcow2AmendHelperCBInfo { +/* The code coordinating the amend operations should only modify + * these four fields; the rest will be managed by the CB */ +BlockDriverAmendStatusCB *original_status_cb; +void *original_cb_opaque; + +Qcow2AmendOperation current_operation; + +/* Total number of operations to perform (only set once) */ +int total_operations; + +/* The following fields are managed by the CB */ + +/* Number of operations completed */ +int operations_completed; + +/* Cumulative offset of all completed operations */ +int64_t offset_completed; + +Qcow2AmendOperation last_operation; +int64_t last_work_size; +} Qcow2AmendHelperCBInfo; + +static void qcow2_amend_helper_cb(BlockDriverState *bs, + int64_t operation_offset, + int64_t operation_work_size, void *opaque) +{ +Qcow2AmendHelperCBInfo *info = opaque; +int64_t current_work_size; +int64_t projected_work_size; + +if (info-current_operation != info-last_operation) { +if (info-last_operation != QCOW2_NO_OPERATION) { +info-offset_completed += info-last_work_size; +info-operations_completed++; +} + +info-last_operation = info-current_operation; +} + +assert(info-total_operations 0); +assert(info-operations_completed info-total_operations); + +info-last_work_size = operation_work_size; + +current_work_size = info-offset_completed + operation_work_size; + +/* current_work_size is the total work size for (operations_completed + 1) + * operations (which includes this one), so multiply it by the number of + * operations not covered and divide it by the number of operations + * covered to get a projection for the operations not covered */ +projected_work_size = current_work_size * (info-total_operations - + info-operations_completed - 1) +/ (info-operations_completed + 1); + +info-original_status_cb(bs, info-offset_completed + operation_offset, + current_work_size + projected_work_size, + info-original_cb_opaque); +} + static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, BlockDriverAmendStatusCB *status_cb, void *cb_opaque) @@ -2669,6 +2738,7 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, bool encrypt; int ret; QemuOptDesc *desc = opts-list-desc; +Qcow2AmendHelperCBInfo helper_cb_info; while (desc desc-name) { if (!qemu_opt_find(opts, desc-name)) { @@ -2726,6 +2796,12 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, desc++; } +helper_cb_info = (Qcow2AmendHelperCBInfo){ +.original_status_cb = status_cb, +.original_cb_opaque = cb_opaque, +.total_operations = (new_version old_version) +}; + /* Upgrade first (some features may require compat=1.1) */ if (new_version old_version) { s-qcow_version = new_version; @@ -2784,7 +2860,9 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, /* Downgrade last (so unsupported features can be removed before) */ if (new_version old_version) { -ret = qcow2_downgrade(bs, new_version, status_cb, cb_opaque); +helper_cb_info.current_operation = QCOW2_DOWNGRADING; +ret = qcow2_downgrade(bs, new_version, qcow2_amend_helper_cb, + helper_cb_info); if (ret 0) { return ret; } -- 1.9.3
[Qemu-devel] [PATCH v2 05/21] qcow2: Refcount overflow and qcow2_alloc_bytes()
qcow2_alloc_bytes() may reuse a cluster multiple times, in which case the refcount is increased accordingly. However, if this would lead to an overflow the function should instead just not reuse this cluster and allocate a new one. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- block/qcow2-refcount.c | 32 ++-- 1 file changed, 30 insertions(+), 2 deletions(-) diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index be4e5fe..66c78c0 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -761,12 +761,13 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset, int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size) { BDRVQcowState *s = bs-opaque; -int64_t offset, cluster_offset, new_cluster; +int64_t offset, cluster_offset, new_cluster, refcount; int64_t ret; int free_in_cluster; BLKDBG_EVENT(bs-file, BLKDBG_CLUSTER_ALLOC_BYTES); assert(size 0 size = s-cluster_size); + redo: if (s-free_byte_offset == 0) { offset = qcow2_alloc_clusters(bs, s-cluster_size); if (offset 0) { @@ -774,12 +775,25 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size) } s-free_byte_offset = offset; } - redo: + free_in_cluster = s-cluster_size - offset_into_cluster(s, s-free_byte_offset); if (size = free_in_cluster) { /* enough space in current cluster */ offset = s-free_byte_offset; + +if (offset_into_cluster(s, offset) != 0) { +/* We will have to increase the refcount of this cluster; if the + * maximum has been reached already, this cluster cannot be used */ +refcount = qcow2_get_refcount(bs, offset s-cluster_bits); +if (refcount 0) { +return refcount; +} else if (refcount == s-refcount_max) { +s-free_byte_offset = 0; +goto redo; +} +} + s-free_byte_offset += size; free_in_cluster -= size; if (free_in_cluster == 0) @@ -800,6 +814,20 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size) if ((cluster_offset + s-cluster_size) == new_cluster) { /* we are lucky: contiguous data */ offset = s-free_byte_offset; + +/* Same as above: In order to reuse the cluster, the refcount has to + * be increased; if that will not work, we are not so lucky after + * all */ +refcount = qcow2_get_refcount(bs, offset s-cluster_bits); +if (refcount 0) { +qcow2_free_clusters(bs, new_cluster, s-cluster_size, +QCOW2_DISCARD_NEVER); +return refcount; +} else if (refcount == s-refcount_max) { +s-free_byte_offset = offset; +goto redo; +} + ret = qcow2_update_cluster_refcount(bs, offset s-cluster_bits, 1, QCOW2_DISCARD_NEVER); if (ret 0) { -- 1.9.3
[Qemu-devel] [PATCH v2 18/21] qcow2: Add function for refcount order amendment
Add a function qcow2_change_refcount_order() which allows changing the refcount order of a qcow2 image. Signed-off-by: Max Reitz mre...@redhat.com --- block/qcow2-refcount.c | 457 + block/qcow2.h | 4 + 2 files changed, 461 insertions(+) diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index 2e13a9c..b59a028 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -2484,3 +2484,460 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset, return 0; } + +/* A pointer to a function of this type is given to walk_over_reftable(). That + * function will create refblocks and pass them to a RefblockFinishOp once they + * are completed (@refblock). @refblock_empty is set if the refblock is + * completely empty. + * + * Along with the refblock, a corresponding reftable entry is passed, in the + * reftable @reftable (which may be reallocated) at @reftable_index. + * + * @allocated should be set to true if a new cluster has been allocated. + */ +typedef int (RefblockFinishOp)(BlockDriverState *bs, uint64_t **reftable, + uint64_t reftable_index, uint64_t *reftable_size, + void *refblock, bool refblock_empty, + bool *allocated, Error **errp); + +/** + * This operation for walk_over_reftable() allocates the refblock on disk (if + * it is not empty) and inserts its offset into the new reftable. The size of + * this new reftable is increased as required. + */ +static int alloc_refblock(BlockDriverState *bs, uint64_t **reftable, + uint64_t reftable_index, uint64_t *reftable_size, + void *refblock, bool refblock_empty, bool *allocated, + Error **errp) +{ +BDRVQcowState *s = bs-opaque; +int64_t offset; + +if (!refblock_empty reftable_index = *reftable_size) { +uint64_t *new_reftable; +uint64_t new_reftable_size; + +new_reftable_size = ROUND_UP(reftable_index + 1, + s-cluster_size / sizeof(uint64_t)); +if (new_reftable_size QCOW_MAX_REFTABLE_SIZE / sizeof(uint64_t)) { +error_setg(errp, + This operation would make the refcount table grow + beyond the maximum size supported by QEMU, aborting); +return -ENOTSUP; +} + +new_reftable = g_try_realloc(*reftable, new_reftable_size * +sizeof(uint64_t)); +if (!new_reftable) { +error_setg(errp, Failed to increase reftable buffer size); +return -ENOMEM; +} + +memset(new_reftable + *reftable_size, 0, + (new_reftable_size - *reftable_size) * sizeof(uint64_t)); + +*reftable = new_reftable; +*reftable_size = new_reftable_size; +} + +if (!refblock_empty !(*reftable)[reftable_index]) { +offset = qcow2_alloc_clusters(bs, s-cluster_size); +if (offset 0) { +error_setg_errno(errp, -offset, Failed to allocate refblock); +return offset; +} +(*reftable)[reftable_index] = offset; +*allocated = true; +} + +return 0; +} + +/** + * This operation for walk_over_reftable() writes the refblock to disk at the + * offset specified by the new reftable's entry. It does not modify the new + * reftable or change any refcounts. + */ +static int flush_refblock(BlockDriverState *bs, uint64_t **reftable, + uint64_t reftable_index, uint64_t *reftable_size, + void *refblock, bool refblock_empty, bool *allocated, + Error **errp) +{ +BDRVQcowState *s = bs-opaque; +int64_t offset; +int ret; + +if (reftable_index *reftable_size (*reftable)[reftable_index]) { +offset = (*reftable)[reftable_index]; + +ret = qcow2_pre_write_overlap_check(bs, 0, offset, s-cluster_size); +if (ret 0) { +error_setg_errno(errp, -ret, Overlap check failed); +return ret; +} + +ret = bdrv_pwrite(bs-file, offset, refblock, s-cluster_size); +if (ret 0) { +error_setg_errno(errp, -ret, Failed to write refblock); +return ret; +} +} else { +assert(refblock_empty); +} + +return 0; +} + +/** + * This function walks over the existing reftable and every referenced refblock; + * if @new_set_refcount is non-NULL, it is called for every refcount entry to + * create an equal new entry in the passed @new_refblock. Once that + * @new_refblock is completely filled, @operation will be called. + * + * @status_cb and @cb_opaque are used for the amend operation's status callback. + * @index is the index of the walk_over_reftable() calls and @total is the total + * number of walk_over_reftable()
[Qemu-devel] [PATCH v2 09/21] qcow2: Open images with refcount order != 4
No longer refuse to open images with a different refcount entry width than 16 bits; only reject images with a refcount width larger than 64 bits (which is prohibited by the specification). Signed-off-by: Max Reitz mre...@redhat.com --- block/qcow2.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index d70e927..528d696 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -677,10 +677,10 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, } /* Check support for various header values */ -if (header.refcount_order != 4) { -report_unsupported(bs, errp, %d bit reference counts, - 1 header.refcount_order); -ret = -ENOTSUP; +if (header.refcount_order 6) { +error_setg(errp, Reference count entry width too large; may not + exceed 64 bit); +ret = -EINVAL; goto fail; } s-refcount_order = header.refcount_order; -- 1.9.3
[Qemu-devel] [PATCH v2 19/21] qcow2: Invoke refcount order amendment function
Make use of qcow2_change_refcount_order() to support changing the refcount order with qemu-img amend. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- block/qcow2.c | 44 +++- 1 file changed, 35 insertions(+), 9 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 0263019..469650b 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2607,13 +2607,7 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version, } if (s-refcount_order != 4) { -/* we would have to convert the image to a refcount_order == 4 image - * here; however, since qemu (at the time of writing this) does not - * support anything different than 4 anyway, there is no point in doing - * so right now; however, we should error out (if qemu supports this in - * the future and this code has not been adapted) */ -error_report(qcow2_downgrade: Image refcount orders other than 4 are - currently not supported.); +error_report(compat=0.10 requires refcount_width=16); return -ENOTSUP; } @@ -2661,6 +2655,7 @@ typedef enum Qcow2AmendOperation { * invocation from an operation change */ QCOW2_NO_OPERATION = 0, +QCOW2_CHANGING_REFCOUNT_ORDER, QCOW2_DOWNGRADING, } Qcow2AmendOperation; @@ -2736,6 +2731,7 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, const char *compat = NULL; uint64_t cluster_size = s-cluster_size; bool encrypt; +int refcount_width = s-refcount_bits; int ret; QemuOptDesc *desc = opts-list-desc; Qcow2AmendHelperCBInfo helper_cb_info; @@ -2785,8 +2781,16 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, lazy_refcounts = qemu_opt_get_bool(opts, lazy_refcounts, lazy_refcounts); } else if (!strcmp(desc-name, refcount_width)) { -error_report(Cannot change refcount entry width); -return -ENOTSUP; +refcount_width = qemu_opt_get_number(opts, refcount_width, + refcount_width); + +if (refcount_width = 0 || refcount_width 64 || +!is_power_of_2(refcount_width)) +{ +error_report(Refcount width must be a power of two and may + not exceed 64 bits); +return -EINVAL; +} } else { /* if this point is reached, this probably means a new option was * added without having it covered here */ @@ -2800,6 +2804,7 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, .original_status_cb = status_cb, .original_cb_opaque = cb_opaque, .total_operations = (new_version old_version) + + (s-refcount_bits != refcount_width) }; /* Upgrade first (some features may require compat=1.1) */ @@ -2812,6 +2817,27 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, } } +if (s-refcount_bits != refcount_width) { +int refcount_order = ffs(refcount_width) - 1; +Error *local_error = NULL; + +if (new_version 3 refcount_width != 16) { +error_report(Different refcount widths than 16 bits require + compatibility level 1.1 or above (use compat=1.1 or + greater)); +return -EINVAL; +} + +helper_cb_info.current_operation = QCOW2_CHANGING_REFCOUNT_ORDER; +ret = qcow2_change_refcount_order(bs, refcount_order, + qcow2_amend_helper_cb, + helper_cb_info, local_error); +if (ret 0) { +qerror_report_err(local_error); +return ret; +} +} + if (backing_file || backing_format) { ret = qcow2_change_backing_file(bs, backing_file ?: bs-backing_file, backing_format ?: bs-backing_format); -- 1.9.3
[Qemu-devel] [PATCH v2 16/21] qcow2: Split upgrade/downgrade paths for amend
If the image version should be upgraded, that is the first we should do; if it should be downgraded, that is the last we should do. So split the version change block into an upgrade part at the start and a downgrade part at the end. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- block/qcow2.c | 31 --- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 423af48..d2553b9 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2726,20 +2726,13 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, desc++; } -if (new_version != old_version) { -if (new_version old_version) { -/* Upgrade */ -s-qcow_version = new_version; -ret = qcow2_update_header(bs); -if (ret 0) { -s-qcow_version = old_version; -return ret; -} -} else { -ret = qcow2_downgrade(bs, new_version, status_cb, cb_opaque); -if (ret 0) { -return ret; -} +/* Upgrade first (some features may require compat=1.1) */ +if (new_version old_version) { +s-qcow_version = new_version; +ret = qcow2_update_header(bs); +if (ret 0) { +s-qcow_version = old_version; +return ret; } } @@ -2753,7 +2746,7 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, if (s-use_lazy_refcounts != lazy_refcounts) { if (lazy_refcounts) { -if (s-qcow_version 3) { +if (new_version 3) { error_report(Lazy refcounts only supported with compatibility level 1.1 and above (use compat=1.1 or greater)); return -EINVAL; @@ -2789,6 +2782,14 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, } } +/* Downgrade last (so unsupported features can be removed before) */ +if (new_version old_version) { +ret = qcow2_downgrade(bs, new_version, status_cb, cb_opaque); +if (ret 0) { +return ret; +} +} + return 0; } -- 1.9.3
[Qemu-devel] [PATCH v2 13/21] block: Add opaque value to the amend CB
Add an opaque value which is to be passed to the bdrv_amend_options() status callback. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- block.c | 4 ++-- block/qcow2-cluster.c | 14 -- block/qcow2.c | 9 + block/qcow2.h | 3 ++- include/block/block.h | 4 ++-- include/block/block_int.h | 3 ++- qemu-img.c| 5 +++-- 7 files changed, 24 insertions(+), 18 deletions(-) diff --git a/block.c b/block.c index c979d51..c34b188 100644 --- a/block.c +++ b/block.c @@ -5790,12 +5790,12 @@ void bdrv_add_before_write_notifier(BlockDriverState *bs, } int bdrv_amend_options(BlockDriverState *bs, QemuOpts *opts, - BlockDriverAmendStatusCB *status_cb) + BlockDriverAmendStatusCB *status_cb, void *cb_opaque) { if (!bs-drv-bdrv_amend_options) { return -ENOTSUP; } -return bs-drv-bdrv_amend_options(bs, opts, status_cb); +return bs-drv-bdrv_amend_options(bs, opts, status_cb, cb_opaque); } /* This function will be called by the bdrv_recurse_is_first_non_filter method diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index ab43902..2daf334 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -1620,7 +1620,8 @@ fail: static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, int l1_size, int64_t *visited_l1_entries, int64_t l1_entries, - BlockDriverAmendStatusCB *status_cb) + BlockDriverAmendStatusCB *status_cb, + void *cb_opaque) { BDRVQcowState *s = bs-opaque; bool is_active_l1 = (l1_table == s-l1_table); @@ -1646,7 +1647,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, /* unallocated */ (*visited_l1_entries)++; if (status_cb) { -status_cb(bs, *visited_l1_entries, l1_entries); +status_cb(bs, *visited_l1_entries, l1_entries, cb_opaque); } continue; } @@ -1768,7 +1769,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, (*visited_l1_entries)++; if (status_cb) { -status_cb(bs, *visited_l1_entries, l1_entries); +status_cb(bs, *visited_l1_entries, l1_entries, cb_opaque); } } @@ -1797,7 +1798,8 @@ fail: * qcow2 version which doesn't yet support metadata zero clusters. */ int qcow2_expand_zero_clusters(BlockDriverState *bs, - BlockDriverAmendStatusCB *status_cb) + BlockDriverAmendStatusCB *status_cb, + void *cb_opaque) { BDRVQcowState *s = bs-opaque; uint64_t *l1_table = NULL; @@ -1814,7 +1816,7 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs, ret = expand_zero_clusters_in_l1(bs, s-l1_table, s-l1_size, visited_l1_entries, l1_entries, - status_cb); + status_cb, cb_opaque); if (ret 0) { goto fail; } @@ -1849,7 +1851,7 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs, ret = expand_zero_clusters_in_l1(bs, l1_table, s-snapshots[i].l1_size, visited_l1_entries, l1_entries, - status_cb); + status_cb, cb_opaque); if (ret 0) { goto fail; } diff --git a/block/qcow2.c b/block/qcow2.c index 657d558..d084485 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2592,7 +2592,7 @@ static int qcow2_load_vmstate(BlockDriverState *bs, uint8_t *buf, * have to be removed. */ static int qcow2_downgrade(BlockDriverState *bs, int target_version, - BlockDriverAmendStatusCB *status_cb) + BlockDriverAmendStatusCB *status_cb, void *cb_opaque) { BDRVQcowState *s = bs-opaque; int current_version = s-qcow_version; @@ -2641,7 +2641,7 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version, /* clearing autoclear features is trivial */ s-autoclear_features = 0; -ret = qcow2_expand_zero_clusters(bs, status_cb); +ret = qcow2_expand_zero_clusters(bs, status_cb, cb_opaque); if (ret 0) { return ret; } @@ -2656,7 +2656,8 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version, } static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, - BlockDriverAmendStatusCB *status_cb) + BlockDriverAmendStatusCB *status_cb, + void *cb_opaque) {
Re: [Qemu-devel] [PATCH v2 3/4] raw-posix: Fix try_seek_hole()'s handling of SEEK_DATA failure
Eric Blake ebl...@redhat.com writes: On 11/13/2014 08:29 AM, Eric Blake wrote: lseek() with SEEK_DATA starting in a hole when there is no data until EOF is actually the part that isn't documented in the man page, but ENXIO is what I'm seeing here on RHEL 7. Here's the (proposed) POSIX wording: http://austingroupbugs.net/view.php?id=415 And ENXIO is indeed the expected error for SEEK_DATA on a trailing hole, so maybe we should special case it. Uggh. Historical practice on Solaris (and therefore the POSIX wording) says that SEEK_HOLE in a trailing hole is allowed (but not required) to seek to EOF instead of reporting the offset requested. I have no clue why this was done, but it is VERY annoying - it means that if you provide an offset within a tail hole of a file, you cannot reliably tell if the file ends in a hole or with data, without ALSO trying SEEK_DATA. For applications that are reading a file sequentially but skipping over holes, this behavior is fine (it short-circuits the hole/data search points and might shave an iteration off a lop). But for OUR purposes, where we are merely trying to ascertain whether we are in a hole, we have an inaccurate response - since SEEK_HOLE does NOT return the offset we passed in, we are prone to treat the offset as belonging to data, which is a pessimization (you never get wrong results by treating a hole as data and reading it, but it is definitely slower). I think you HAVE to call lseek() twice, both with SEEK_HOLE and with SEEK_DATA, if you want to accurately determine whether an offset happens to live within a trailing hole. Here's a table of possible situations, based solely on POSIX wording (and not on actual tests on Solaris or Linux, although it shouldn't be too hard to confirm behavior): 0-length file: lseek(fd, 0, SEEK_HOLE) = -1 ENXIO lseek(fd, 0, SEEK_DATA) = -1 ENXIO conclusion: 0 is at EOF Isn't this a special case of the next one? file of any size: lseek(fd, size_or_larger, SEEK_HOLE) = -1 ENXIO lseek(fd, size_or_larger, SEEK_DATA) = -1 ENXIO conclusion: size_or_larger is at or beyond EOF file where offset is in a hole, but data appears later: lseek(fd, offset, SEEK_HOLE) = offset lseek(fd, offset, SEEK_DATA) = end_of_hole conclusion: offset through end_of_hole is in a hole file where offset is data, whether or not a hole appears later: lseek(fd, offset, SEEK_HOLE) = end_of_data lseek(fd, offset, SEEK_DATA) = offset conclusion: offset through end_of_data is in data file where offset is in a tail hole, option 1: lseek(fd, offset, SEEK_HOLE) = offset lseek(fd, offset, SEEK_DATA) = -1 ENXIO conclusion: offset through EOF is in hole, but another seek needed to learn EOF file where offset is in a tail hole, option 2: lseek(fd, offset, SEEK_HOLE) = EOF lseek(fd, offset, SEEK_DATA) = -1 ENXIO conclusion: offset through EOF is in hole, no additional seek needed The two calls are both necessary, in order to learn which extant type offset belongs to, and to tell where that extant ends; and the behaviors are distinguishable (if both lseek() succeed, we have both numbers we want; if both fail with ENXIO, we know the offset is at or beyond EOF; and if only SEEK_HOLE fails with ENXIO, we know we have a trailing hole); and we can tell at runtime what to do about a trailing hole (if the return value is offset, we need one more lseek(fd, 0, SEEK_END) to find EOF; if the return value is larger than offset, we have EOF for free). You can optimize by calling SEEK_HOLE first (if it fails with ENXIO, there is no need to try SEEK_DATA); but SEEK_HOLE in isolation is insufficient to give you all the information you need. Not discussed: how to handle failures other than ENXIO. The appended code still avoids a second seek in one case. Useful mostly because it saves us from handling a second seek's contradictory information. /* * Find allocation range in @bs around offset @start. * May change underlying file descriptor's file offset. * If @start is not in a hole, store @start in @data, and the * beginning of the next hole in @hole, and return 0. * If @start is in a non-trailing hole, store @start in @hole and the * beginning of the next non-hole in @data, and return 0. * If @start is in a trailing hole or beyond EOF, return -ENXIO. * If we can't find out, return a negative errno other than -ENXIO. */ static int find_allocation(BlockDriverState *bs, off_t start, off_t *data, off_t *hole) { #if defined SEEK_HOLE defined SEEK_DATA BDRVRawState *s = bs-opaque; off_t offs; /* * SEEK_DATA cases: * D1. offs == start: start is in data * D2. offs start: start is in a hole, next data at offs * D3. offs 0, errno = ENXIO: either start is in a trailing hole * or start is beyond EOF * If the latter happens, the file has been truncated behind * our back since we opened it. Best
[Qemu-devel] [PATCH v2 20/21] qcow2: Point to amend function in check
If a reference count is not representable with the current refcount order, the image check should point to qemu-img amend for increasing the refcount order. However, qemu-img amend needs write access to the image which cannot be provided if the image is marked corrupt; and the image check will not mark the image consistent unless everything actually is consistent. Therefore, if an image is marked corrupt and the image check encounters a reference count overflow, it cannot be fixed by using qemu-img amend to increase the refcount order. Instead, one has to use qemu-img convert to create a completely new copy of the image in this case. Alternatively, we may want to give the user a way of manually removing the corrupt flag, maybe through qemu-img amend, but this is not part of this patch. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- block/qcow2-refcount.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index b59a028..e9647ce 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -1368,6 +1368,9 @@ static int inc_refcounts(BlockDriverState *bs, if (refcount == s-refcount_max) { fprintf(stderr, ERROR: overflow cluster offset=0x% PRIx64 \n, cluster_offset); +fprintf(stderr, Use qemu-img amend to increase the refcount entry +width or qemu-img convert to create a clean copy if the +image cannot be opened for writing\n); res-corruptions++; continue; } -- 1.9.3
[Qemu-devel] [PATCH v2 21/21] iotests: Add test for different refcount widths
Add a test for conversion between different refcount widths and errors specific to certain widths (i.e. snapshots with refcount_width=1). Signed-off-by: Max Reitz mre...@redhat.com --- tests/qemu-iotests/112 | 252 + tests/qemu-iotests/112.out | 131 +++ tests/qemu-iotests/group | 1 + 3 files changed, 384 insertions(+) create mode 100755 tests/qemu-iotests/112 create mode 100644 tests/qemu-iotests/112.out diff --git a/tests/qemu-iotests/112 b/tests/qemu-iotests/112 new file mode 100755 index 000..e824d8a --- /dev/null +++ b/tests/qemu-iotests/112 @@ -0,0 +1,252 @@ +#!/bin/bash +# +# Test cases for different refcount_widths +# +# Copyright (C) 2014 Red Hat, Inc. +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see http://www.gnu.org/licenses/. +# + +# creator +owner=mre...@redhat.com + +seq=$(basename $0) +echo QA output created by $seq + +here=$PWD +tmp=/tmp/$$ +status=1 # failure is the default! + +_cleanup() +{ + _cleanup_test_img +} +trap _cleanup; exit \$status 0 1 2 3 15 + +# get standard environment, filters and checks +. ./common.rc +. ./common.filter + +# This tests qcow2-specific low-level functionality +_supported_fmt qcow2 +_supported_proto file +_supported_os Linux +# This test will set refcount_width on its own which would conflict with the +# manual setting; compat will be overridden as well +_unsupported_imgopts refcount_width 'compat=0.10' + +function print_refcount_width() +{ +$QEMU_IMG info $TEST_IMG | sed -n '/refcount width:/ s/^ *//p' +} + +echo +echo '=== refcount_width limits ===' +echo + +# Must be positive (non-zero) +IMGOPTS=$IMGOPTS,refcount_width=0 _make_test_img 64M +# Must be positive (non-negative) +IMGOPTS=$IMGOPTS,refcount_width=-1 _make_test_img 64M +# May not exceed 64 +IMGOPTS=$IMGOPTS,refcount_width=128 _make_test_img 64M +# Must be a power of two +IMGOPTS=$IMGOPTS,refcount_width=42 _make_test_img 64M + +# 1 is the minimum +IMGOPTS=$IMGOPTS,refcount_width=1 _make_test_img 64M +print_refcount_width + +# 64 is the maximum +IMGOPTS=$IMGOPTS,refcount_width=64 _make_test_img 64M +print_refcount_width + +# 16 is the default +_make_test_img 64M +print_refcount_width + +echo +echo '=== refcount_width and compat=0.10 ===' +echo + +# Should work +IMGOPTS=$IMGOPTS,compat=0.10,refcount_width=16 _make_test_img 64M +print_refcount_width + +# Should not work +IMGOPTS=$IMGOPTS,compat=0.10,refcount_width=1 _make_test_img 64M +IMGOPTS=$IMGOPTS,compat=0.10,refcount_width=64 _make_test_img 64M + + +echo +echo '=== Snapshot limit on refcount_width=1 ===' +echo + +IMGOPTS=$IMGOPTS,refcount_width=1 _make_test_img 64M +print_refcount_width + +$QEMU_IO -c 'write 0 512' $TEST_IMG | _filter_qemu_io + +# Should fail for now; in the future, this might be supported by automatically +# copying all clusters with overflowing refcount +$QEMU_IMG snapshot -c foo $TEST_IMG + +# The new L1 table could/should be leaked +_check_test_img + +echo +echo '=== Snapshot limit on refcount_width=2 ===' +echo + +IMGOPTS=$IMGOPTS,refcount_width=2 _make_test_img 64M +print_refcount_width + +$QEMU_IO -c 'write 0 512' $TEST_IMG | _filter_qemu_io + +# Should succeed +$QEMU_IMG snapshot -c foo $TEST_IMG +$QEMU_IMG snapshot -c bar $TEST_IMG +# Should fail (4th reference) +$QEMU_IMG snapshot -c baz $TEST_IMG + +# The new L1 table could/should be leaked +_check_test_img + +echo +echo '=== Compressed clusters with refcount_width=1 ===' +echo + +IMGOPTS=$IMGOPTS,refcount_width=1 _make_test_img 64M +print_refcount_width + +# Both should fit into a single host cluster; instead of failing to increase the +# refcount of that cluster, qemu should just allocate a new cluster and make +# this operation succeed +$QEMU_IO -c 'write -P 0 -c 0 64k' \ + -c 'write -P 1 -c 64k 64k' \ + $TEST_IMG | _filter_qemu_io + +_check_test_img + +echo +echo '=== Amend from refcount_width=16 to refcount_width=1 ===' +echo + +_make_test_img 64M +print_refcount_width + +$QEMU_IO -c 'write 16M 32M' $TEST_IMG | _filter_qemu_io +$QEMU_IMG amend -o refcount_width=1 $TEST_IMG +_check_test_img +print_refcount_width + +echo +echo '=== Amend from refcount_width=1 to refcount_width=64 ===' +echo + +$QEMU_IMG amend -o refcount_width=64 $TEST_IMG +_check_test_img +print_refcount_width + +echo +echo '=== Amend to compat=0.10 ===' +echo + +# Should not work because refcount_width needs to be 16 for
Re: [Qemu-devel] [PULL for-2.2 0/2] Xen tree 2014-11-14
On 14 November 2014 11:29, Stefano Stabellini stefano.stabell...@eu.citrix.com wrote: The following changes since commit c52e67924fbdadfa00668248f5c075542943c54c: Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2014-11-13 15:44:16 +) are available in the git repository at: git://xenbits.xen.org/people/sstabellini/qemu-dm.git xen-2014-11-14 for you to fetch changes up to 2f01dfacb56bc7a0d4639adc9dff9aae131e6216: xen_disk: fix unmapping of persistent grants (2014-11-14 11:12:38 +) Igor Mammedov (1): pc: piix4_pm: init legacy PCI hotplug when running on Xen Roger Pau Monne (1): xen_disk: fix unmapping of persistent grants hw/acpi/piix4.c |4 +++ hw/block/xen_disk.c | 72 ++- hw/i386/pc_piix.c | 11 3 files changed, 70 insertions(+), 17 deletions(-) Applied, thanks. -- PMM
Re: [Qemu-devel] [PATCH v2 2/3] iotests: _filter_qmp for pretty JSON output
On 11/14/2014 02:02 AM, Max Reitz wrote: _filter_qmp should be able to correctly filter out the QMP version object for pretty JSON output. Signed-off-by: Max Reitz mre...@redhat.com --- tests/qemu-iotests/common.filter | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter index 3acdb30..63d15b3 100644 --- a/tests/qemu-iotests/common.filter +++ b/tests/qemu-iotests/common.filter @@ -165,9 +165,12 @@ _filter_qemu() # replace problematic QMP output like timestamps _filter_qmp() { +discard=0 + _filter_win32 | \ sed -e 's#\(\(micro\)\?seconds: \)[0-9]\+#\1 TIMESTAMP#g' \ --e 's#^{QMP:.*}$#QMP_VERSION#' +-e 's#^{QMP:.*}$#QMP_VERSION#' \ +-e '/^QMP: {\s*$/, /^}\s*$/ c\QMP_VERSION' Technically, portable sed requires a newline between the initial 'c\' and the first line of replacement. Use of multiple -e is the easiest way to provide newlines (sed concatenates separate -e strings with an implicit newline). Even though GNU sed doesn't barf on your variant, I'd still prefer: -e '/^QMP: {/, /^}/ c\' \ -e 'QMP_VERSION' written as two lines, to ensure we don't upset BSD sed. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 4/4] target-tricore: Add instructions of RCR opcode format
On 11/13/2014 06:12 PM, Bastian Koppelmann wrote: +tcg_gen_ext_i32_i64(t3, r3); +tcg_gen_concat_i32_i64(t2, r2_low, r2_high); +/* extend the sign for r2 to high 64 bits */ +tcg_gen_sari_i64(t4, t2, 63); +tcg_gen_ext_i32_i64(t1, r1); + +tcg_gen_muls2_i64(t1, t3, t1, t3); +tcg_gen_add2_i64(t1, t3, t2, t4, t1, t3); + I don't believe that you need 128 bit arithemetic for multiply-accumulate, either here or elsewhere (e.g. msub). Looking at unsigned, the maximum result of the multiply is 2*(2^n-1), or 2^(2n) - 2^(n+1). Which means that the accumulate with a 2^n-1 value cannot overflow a double-word intermediate result. r~
Re: [Qemu-devel] [PATCH v2 0/4] Add TriCore RCPW, RCRR, RCRW, RLC and RCR instructions
On 11/13/2014 06:12 PM, Bastian Koppelmann wrote: Hi, this patch depends on the previous TriCore patches (https://patchwork.ozlabs.org/patch/405459/) and will hopefully end up in 2.3 QEMU. Other than adding the RCPW, RCRR, RCRW, RLC and RCR instructions, it cleans up how ISA versions in the feature bitmask are handled, to simplify the checks, when instructions are available. Thanks, Bastian v1 - v2: - Fixed obvious style errors, given by checkpatch. Bastian Koppelmann (4): target-tricore: Make TRICORE_FEATURES implying others. target-tricore: Add instructions of RCPW, RCRR and RCRW opcode format target-tricore: Add instructions of RLC opcode format target-tricore: Add instructions of RCR opcode format Patches 1-3 get Reviewed-by: Richard Henderson r...@twiddle.net Patch 4 looks correct, but appears to work too hard. r~
Re: [Qemu-devel] [PATCH 2.3 5/8] tcg: Put opcodes in a linked list
On 11/11/2014 04:24 PM, Richard Henderson wrote: +static void tcg_gen_op_begin(TCGContext *ctx, TCGOpcode opc, int args) +{ +int oi = ctx-gen_next_op_idx; +int ni = oi + 1; +int pi = oi - 1; + +tcg_debug_assert(oi OPC_BUF_SIZE); +ctx-gen_last_op_idx = oi; +ctx-gen_next_op_idx = ni; + +ctx-gen_op_buf[oi] = (TCGOp){ +.opc = opc, +.args = args, +.prev = pi, +.next = ni +}; +} + The name of this function says begin while used at the end of each tcg_gen_op. How about tcg_gen_op_list_add? @@ -508,14 +521,10 @@ struct TCGContext { int goto_tb_issue_mask; #endif -uint16_t gen_opc_buf[OPC_BUF_SIZE]; -TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE]; - -uint16_t *gen_opc_ptr; -TCGArg *gen_opparam_ptr; You forgot to remove gen_opc_ptr in the dummy function tcg_liveness_analysis, in case USE_LIVENESS_ANALYSIS is not defined. -target_ulong gen_opc_pc[OPC_BUF_SIZE]; -uint16_t gen_opc_icount[OPC_BUF_SIZE]; -uint8_t gen_opc_instr_start[OPC_BUF_SIZE]; +int gen_first_op_idx; +int gen_last_op_idx; +int gen_next_op_idx; +int gen_next_parm_idx; Other than that it looks good to me. Cheers, Bastian
Re: [Qemu-devel] [PATCH 2.3 6/8] tcg: Remove opcodes instead of noping them out
On 11/11/2014 04:24 PM, Richard Henderson wrote: With the linked list scheme we need not leave nops in the stream that we need to process later. Signed-off-by: Richard Henderson r...@twiddle.net --- tcg/optimize.c | 14 +++--- tcg/tcg.c | 28 tcg/tcg.h | 1 + 3 files changed, 32 insertions(+), 11 deletions(-) Reviewed-by: Bastian Koppelmann kbast...@mail.uni-paderborn.de
Re: [Qemu-devel] [PATCH 0/4] qom: Replace automatic arrayification
Paolo Bonzini pbonz...@redhat.com writes: On 13/11/2014 15:34, Markus Armbruster wrote: -... bla ... +char *name = object_gen_new_property_name(obj, bla); +... name ... +g_free(name); Not quite as mechanical, in particular finding the obj to pass to object_gen_new_property_name(). Correct? Yes (except that the obj is just qdev_get_machine()). Ah, of course. Let me summarize. Automatic arrayification currently has two kinds of users: qdev properties and memory regions. qdev properties can easily use object_gen_new_property_name() instead, as shown in PATCH 3/4. So can memory regions [PATCH 2/4], but only because we clumsily arrayify all of them, by appending [*] in memory_region_init(). Some day, we may want to arrayify only the ones that actually need it, by appending [*] right to their names instead of appending it behind the scenes to all memory region names. This would involve touching the untouchables: non-qdevified devices. But the changes should be limited to string literals. With my series, you'd have to graft in object_gen_new_property_name() and the matching g_free() instead. You called that just too ugly. Here's how to avoid it, and confine the ugliness to memory_region_init(): Change memory_region_init() from char *escaped_name = memory_region_escape_name(name); char *propname = object_gen_new_property_name(owner, escaped_name); object_property_add_child(owner, propname, OBJECT(mr), error_abort); object_unref(OBJECT(mr)); g_free(propname); g_free(escaped_name); to something like if (name ends with [*]) { stem = g_strndup(name, strlen(name) -3); escaped = memory_region_escape_name(stem); propname = object_gen_new_property_name(owner, escaped_name); g_free(escaped); g_free(stem); else propname = memory_region_escape_name(name); object_property_add_child(owner, propname, OBJECT(mr), error_abort); object_unref(OBJECT(mr)); g_free(propname); and append [*] to the names of regions that need arrayification. That way, the bad magic is limited to just memory_region_init() rather than all of QOM, and it's needed only as long as the problem with non-qdevified users remains. [*] What we want to do is drag them along until the Second Coming, when Jesus will set everything right, which naturally includes qdevifying all the old crap nobody wants to touch. Or maybe not. Look at the bottom: https://upload.wikimedia.org/wikipedia/commons/a/a5/Michelangelo%2C_Giudizio_Universale_02.jpg Oww! Well played, sir!
Re: [Qemu-devel] [PATCH 2.3 7/8] tcg: Implement insert_op_before
On 11/11/2014 04:24 PM, Richard Henderson wrote: Rather reserving space in the op stream for optimization, let the optimizer add ops as necessary. Signed-off-by: Richard Henderson r...@twiddle.net --- tcg/optimize.c | 57 +++-- tcg/tcg-op.c | 16 2 files changed, 35 insertions(+), 38 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 973fbb4..067917c 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -67,6 +67,37 @@ static void reset_temp(TCGArg temp) temps[temp].mask = -1; } +static TCGOp *insert_op_before(TCGContext *s, TCGOp *old_op, +TCGOpcode opc, int nargs) +{ +int oi = s-gen_next_op_idx; +int pi = s-gen_next_parm_idx; +int prev = old_op-prev; +int next = old_op - s-gen_op_buf; +TCGOp *new_op; + +tcg_debug_assert(oi OPC_BUF_SIZE); +tcg_debug_assert(pi + nargs = OPPARAM_BUF_SIZE); I thinks it is better to assure these assertion always hold, e.g. if (oi OPC_BUF_SIZE || pi + nargs = OPPARAM_BUF_SIZE) { return NULL; } ... TCGOp *op2 = insert_op_before(s, op, INDEX_op_movi_i32, 2); if (op2) { *args2 = s-gen_opparam_buf[op2-args]; } Or how do we know they always hold? diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c index fbd82bd..8de259a 100644 --- a/tcg/tcg-op.c +++ b/tcg/tcg-op.c @@ -571,8 +571,6 @@ void tcg_gen_add2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 al, { if (TCG_TARGET_HAS_add2_i32) { tcg_gen_op6_i32(INDEX_op_add2_i32, rl, rh, al, ah, bl, bh); -/* Allow the optimizer room to replace add2 with two moves. */ -tcg_gen_op0(tcg_ctx, INDEX_op_nop); All references on tcg_gen_op0 are gone, so lets remove it. Cheers, Bastian
Re: [Qemu-devel] [PATCH 0/4] qom: Replace automatic arrayification
On 14/11/2014 15:21, Markus Armbruster wrote: qdev properties can easily use object_gen_new_property_name() instead, as shown in PATCH 3/4. So can memory regions [PATCH 2/4], but only because we clumsily arrayify all of them, by appending [*] in memory_region_init(). Some day, we may want to arrayify only the ones that actually need it, by appending [*] right to their names instead of appending it behind the scenes to all memory region names. This would involve touching the untouchables: non-qdevified devices. But the changes should be limited to string literals. With my series, you'd have to graft in object_gen_new_property_name() and the matching g_free() instead. You called that just too ugly. Here's how to avoid it, and confine the ugliness to memory_region_init(): Change memory_region_init() from char *escaped_name = memory_region_escape_name(name); char *propname = object_gen_new_property_name(owner, escaped_name); object_property_add_child(owner, propname, OBJECT(mr), error_abort); object_unref(OBJECT(mr)); g_free(propname); g_free(escaped_name); to something like if (name ends with [*]) { stem = g_strndup(name, strlen(name) -3); escaped = memory_region_escape_name(stem); propname = object_gen_new_property_name(owner, escaped_name); g_free(escaped); g_free(stem); else propname = memory_region_escape_name(name); object_property_add_child(owner, propname, OBJECT(mr), error_abort); object_unref(OBJECT(mr)); g_free(propname); and append [*] to the names of regions that need arrayification. That way, the bad magic is limited to just memory_region_init() rather than all of QOM, and it's needed only as long as the problem with non-qdevified users remains. Yes, I agree. Whether it's an improvement to move back the special-casing of [*] to memory_region_init(), that's of course in the eye of the beholder. ;) We do know that it's very unlikely to be removed from memory regions. Paolo
Re: [Qemu-devel] [PATCH 2.3 8/8] tcg: Remove unused opcodes
On 11/11/2014 04:24 PM, Richard Henderson wrote: diff --git a/tci.c b/tci.c index 4711ee4..28292b3 100644 --- a/tci.c +++ b/tci.c @@ -506,19 +506,6 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr) tb_ptr += 2; switch (opc) { -case INDEX_op_end: -case INDEX_op_nop: -break; -case INDEX_op_nop1: -case INDEX_op_nop2: -case INDEX_op_nop3: -case INDEX_op_nopn: -case INDEX_op_discard: -TODO(); -break; -case INDEX_op_set_label: -TODO(); -break; Why do you remove the TODO notice for INDEX_op_discard/set_label? Is TCI no longer maintained? case INDEX_op_call: t0 = tci_read_ri(tb_ptr); #if TCG_TARGET_REG_BITS == 32 Other than that, Reviewed-by: Bastian Koppelmann kbast...@mail.uni-paderborn.de Cheers, Bastian
Re: [Qemu-devel] [PATCH 2.3 7/8] tcg: Implement insert_op_before
On 11/14/2014 04:25 PM, Bastian Koppelmann wrote: On 11/11/2014 04:24 PM, Richard Henderson wrote: Rather reserving space in the op stream for optimization, let the optimizer add ops as necessary. Signed-off-by: Richard Henderson r...@twiddle.net --- tcg/optimize.c | 57 +++-- tcg/tcg-op.c | 16 2 files changed, 35 insertions(+), 38 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 973fbb4..067917c 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -67,6 +67,37 @@ static void reset_temp(TCGArg temp) temps[temp].mask = -1; } +static TCGOp *insert_op_before(TCGContext *s, TCGOp *old_op, +TCGOpcode opc, int nargs) +{ +int oi = s-gen_next_op_idx; +int pi = s-gen_next_parm_idx; +int prev = old_op-prev; +int next = old_op - s-gen_op_buf; +TCGOp *new_op; + +tcg_debug_assert(oi OPC_BUF_SIZE); +tcg_debug_assert(pi + nargs = OPPARAM_BUF_SIZE); I thinks it is better to assure these assertion always hold, e.g. if (oi OPC_BUF_SIZE || pi + nargs = OPPARAM_BUF_SIZE) { return NULL; } ... TCGOp *op2 = insert_op_before(s, op, INDEX_op_movi_i32, 2); if (op2) { *args2 = s-gen_opparam_buf[op2-args]; } Or how do we know they always hold? For the same reason we don't bother checking during initial generation of the opcodes. We simply assume there's enough space. Not a good answer but... All references on tcg_gen_op0 are gone, so lets remove it. Fair enough. r~
Re: [Qemu-devel] [PATCH 2.3 8/8] tcg: Remove unused opcodes
On 11/14/2014 04:31 PM, Bastian Koppelmann wrote: On 11/11/2014 04:24 PM, Richard Henderson wrote: diff --git a/tci.c b/tci.c index 4711ee4..28292b3 100644 --- a/tci.c +++ b/tci.c @@ -506,19 +506,6 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr) tb_ptr += 2; switch (opc) { -case INDEX_op_end: -case INDEX_op_nop: -break; -case INDEX_op_nop1: -case INDEX_op_nop2: -case INDEX_op_nop3: -case INDEX_op_nopn: -case INDEX_op_discard: -TODO(); -break; -case INDEX_op_set_label: -TODO(); -break; Why do you remove the TODO notice for INDEX_op_discard/set_label? Is TCI no longer maintained? Barely. But these opcodes never reach this far anyway, so the todo is bogus. r~
Re: [Qemu-devel] [PATCH 00/12] qcow2: Add new overlap check functions
On 2014-11-03 at 18:04, Max Reitz wrote: As has been requested, this series adds new overlap check functions to the qcow2 code. My local branch is called qcow2-improved-overlap-v1, but I am not so sure whether it is actually an improvement; that is left for you to decide, dear reviewers. See patch 1 for an explanation of why this series exists and what it does. Patch 1 is basically the core of this series, the rest just employs the functions introduced there. I have yet to do benchmarks to test whether this series actually improves things, but judging from the iotests it at least does not slow things down (which it did at one time during development, particularily test 044 is good for testing this, so this actually has some significance to it). In a later patch, we may want to change the meaning of the constant overlap checking option to mean the same as cached, which is everything except for inactive L2 tables. This series does make checking for overlaps with inactive L2 tables at runtime just as cheap as everything else (constant time plus caching), but using these checks means qemu has to read all the snapshot L1 tables when opening a qcow2 file. This does not take long, of course, but it does result in a bit of overhead so I did not want to enable it by default. I think just enabling all overlap checks by default after this series should be fine and useful, though. Rejoice, for I return with benchmarks; the kind of benchmarks which always show the result I want them to show, which I should be known for by now. First, I basically tried the setup from last time only this time I didn't care about the in-VM I/O performance but just use perf record -g to record the amount of cycles used by the overlap check. This worked somehow, but bonnie++ (which I used as the in-VM benchmark tool) does some different tests, both reading and writing and writing with different sizes, so the result is not that bad there. So I did what's absolutely worst for the overlap checks: dd if=/dev/zero of=/dev/vda bs=65536 oflag=direct (even worse would be cluster_size=512 and bs=512, but I wanted to get the test over with today, so I just went for 64k). The image was a 1G qcow2 image in tmpfs with ten snapshots (each having 128 MB of data, all pointing to the same data clusters which are different from the active clusters) just because having internal snapshots makes the overlap checks even more CPU intensive, of course. I ran dd 42 times in a row (for i in $(seq 1 42); do ...; done) and started up perf record just after I hit enter and canceled it just before the last dd exited. I don't remember the exact numbers, but for the currently existing overlap check function (using the default of overlap-check=cached), it used about 13.5 % in the first run, 10.5 % in the second and (this I do know) 12.41 % in the third run. With these patches applied, I had 0.08 % in the first run with overlap-check=cached and 0.09 % in the second run with overlap-check=all. (all percentages are referring to the fraction of cycles used by qcow2_check_metadata_overlap()) So this series apparently is actually worth it. I could yet do another benchmark where I test what happens if the cache is too small, which means that the range list representation has to be converted to the bitmap all the time and vice versa. The biggest problem with that would be to somehow fit the image into tmpfs (if it's not in tmpfs, I don't want to do CPU benchmarks)... Max
Re: [Qemu-devel] [PATCH 0/4] qom: Replace automatic arrayification
Paolo Bonzini pbonz...@redhat.com writes: On 14/11/2014 15:21, Markus Armbruster wrote: qdev properties can easily use object_gen_new_property_name() instead, as shown in PATCH 3/4. So can memory regions [PATCH 2/4], but only because we clumsily arrayify all of them, by appending [*] in memory_region_init(). Some day, we may want to arrayify only the ones that actually need it, by appending [*] right to their names instead of appending it behind the scenes to all memory region names. This would involve touching the untouchables: non-qdevified devices. But the changes should be limited to string literals. With my series, you'd have to graft in object_gen_new_property_name() and the matching g_free() instead. You called that just too ugly. Here's how to avoid it, and confine the ugliness to memory_region_init(): Change memory_region_init() from char *escaped_name = memory_region_escape_name(name); char *propname = object_gen_new_property_name(owner, escaped_name); object_property_add_child(owner, propname, OBJECT(mr), error_abort); object_unref(OBJECT(mr)); g_free(propname); g_free(escaped_name); to something like if (name ends with [*]) { stem = g_strndup(name, strlen(name) -3); escaped = memory_region_escape_name(stem); propname = object_gen_new_property_name(owner, escaped_name); g_free(escaped); g_free(stem); else propname = memory_region_escape_name(name); object_property_add_child(owner, propname, OBJECT(mr), error_abort); object_unref(OBJECT(mr)); g_free(propname); and append [*] to the names of regions that need arrayification. That way, the bad magic is limited to just memory_region_init() rather than all of QOM, and it's needed only as long as the problem with non-qdevified users remains. Yes, I agree. Whether it's an improvement to move back the special-casing of [*] to memory_region_init(), that's of course in the eye of the beholder. ;) We do know that it's very unlikely to be removed from memory regions. Like Paris, we get to choose from three. Unlike Paris, we get to choose from three uglies: 1. Ugly magic in QOM properties (status quo) 2. No ugly magic now, maybe ugly magic in memory region names later 3. No ugly magic, but may have to touch the untouchables later, which is going to be ugly I'd go for 2. For 1, we do nothing, for 2 and 3, we merge my patches now, and worry about 2 vs. 3 later.
[Qemu-devel] [Bug 1392504] Re: USB Passthrough is not working anymore
Yes, it is valid. libvirt is disabling the automagic UHCI controller, which makes sense since he is using XHCI. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1392504 Title: USB Passthrough is not working anymore Status in QEMU: New Bug description: After upgrading from Ubuntu 14.04 to Ubuntu 14.10 USB passthrough in QEMU (version is now 2.1.0 - Debian2.1+dfsg-4ubuntu6.1) is not working any more. Already tried to remove and attach the USB devices. I use 1 USB2 HDD + 1 USB3 HDD to a virtual linux machine; 1 USB2 HDD to a virual FNAS machine and a USB 2camera to virtual Win7 machine. All these devices are not working any more. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1392504/+subscriptions
Re: [Qemu-devel] State of ARM FIQ in Qemu
On 14 November 2014 09:34, Tim Sander t...@krieglstein.org wrote: 0xbfffe000? You where talking about the fact that the security extensions where not implemented. I was not aware that the different vbar's where already part of the security stuff? MVBAR is part of the Security extensions. HVBAR is part of the Virtualization extensions. In mainline QEMU we implement neither of those extensions, and so don't implement the associated registers. (Strictly speaking, VBAR is also only in the Security extensions, but we provide it as a workaround for guests that assume our CPUs should implement it.) Peter beat me to it. None of the VBAR registers should matter in your case which coincides with the use of hivecs. While writing this mail i found out that the integrated debugger is causing harm in combination with the fiq. So everything below the braces seems to be related to the this problem. But i still wanted to keep the data points for reference: { Ok, so qemu only implements the SCTLR.V bit to control the memory address of the interrupt vector. So its either 0 or 0x. That is fine with me. Currently i have the problem that a call to set_fiq_handler does not place the binary stuff loaded at the address where qemu is jumping to which is presumably 0x1240. I have checked that SCTLR.V =1 under linux which is fine. The background info to set_fiq_handler from my understanding is that it copies the given stuff directly at the address where the FIQ vector is located. This works as the FIQ is the last entry and thus there is some memory space for a short interrupt handler. I checked the memory when entering the FIQ with the integrated gdb: (gdb) info reg r0 0x0 0 r1 0x0 0 r2 0x1 1 r3 0x76eb34c8 1995125960 r4 0x76eb34c8 1995125960 r5 0x76f633b8 1995846584 r6 0x2a 42 r7 0x76f4c28c 1995752076 r8 0xf8200100 -132120320 r9 0xe004 -536608768 r100x60004059 1610629209 r110x0 0 r120x0 0 sp 0x908be000 0x908be000 lr 0x76dfc108 1994375432 pc 0x1240 0x1240 firq_fiq_handler cpsr 0x600f01d1 1611596241 (gdb) x 0x1240 0xe599b00c But my firq_fiq_handler starts with 0xee12af10? I know that this works on real hardware so i suspect that this an error within qemu? Or at least that there is something amiss in the way the memory is initialized or handled. Is there a way to instrument the memory below the vector table to get debug logs if the memory is modified? } It may be worthwhile to put a kernel breakpoint in handle_fiq_as_nmi() just to see where it goes. If CONFIG_ARM_GIC is enabled it should take you to your handler I suspect. Plus, if you get there then we have likely proven that QEMU is getting the kernel to the right place. I set a BP in this routine on my A9 run and appear to be hitting it correctly. So you are talking about the linux kernel, right? CONFIG_ARM_GIC=y check but i can't find handle_fiq_as_nmi? Even a fuzzier rgrep nmi * |grep fiq does not find anything. Maybe we are working off different versions of the kernel sources. I'm using a kernel variant of v3.18-rc1. I took a look at my 3.15 kernel and it does not have the routine, so perhaps yours is an earlier version as well. I don't spend much time working in or tracking the Linux kernel, so I am not sure when the difference was introduced. I just found it to be a convenient function to set a BP for early FIQ debugging, you may have something different. Interestingly, as I researched the Linux FIQ support I found this mail thread... http://www.spinics.net/lists/arm-kernel/msg14960.html As I don't have access to your code, I could not verify that the SVC SPSR was being preserved, but it may be worth you looking into it as I can see this potentially resulting in all kinds of random behavior. More interestingly, this comment and code appears to have been changed in later versions of the FIQ code, so perhaps this has been fixed or improved (My 3.18 kernel does not have the comment). Concerning the fact that qemu is jumping to the right address: To i have put a breakpoint to 0x001c which is the fiq base vector address. There is an instruction 0xea000480 which seems to be a pc relative branch to 0x1224 which then lands at 0x1240. But the internal debugger gives me some concerns. If i do at the gdb command line: hb *0x001c hb *0x1240 The debugger only stops at the first breakpoint. If i leave the first breakpoint away the debugger stops at 0x1240. As i know that at 0x01c it should jump right to 0x1240 i would expect that both breakpoints are triggered. Then if i reach the breakpoint at
Re: [Qemu-devel] [PATCH 2.3 1/8] tcg: Move some opcode generation functions out of line
On 11/11/2014 04:24 PM, Richard Henderson wrote: Some of these functions are really quite large. We have a number of things that ought to be circularly dependent, but we duplicated code to break that chain for the inlines. This saved 25% of the code size of one of the translators I examined. Signed-off-by: Richard Henderson r...@twiddle.net --- Makefile.target |2 +- tcg/tcg-op.c| 1978 +++ tcg/tcg-op.h| 2488 --- tcg/tcg.c | 137 --- tcg/tcg.h |3 - 5 files changed, 2339 insertions(+), 2269 deletions(-) create mode 100644 tcg/tcg-op.c Reviewed-by: Bastian Koppelmann kbast...@mail.uni-paderborn.de Cheers, Bastian
Re: [Qemu-devel] [PATCH 2.3 2/8] tcg: Reduce ifdefs in tcg-op.c
On 11/11/2014 04:24 PM, Richard Henderson wrote: Almost completely eliminates the ifdefs in this file, improving confidence in the lesser used 32-bit builds. Signed-off-by: Richard Henderson r...@twiddle.net --- tcg/tcg-op.c | 449 +++ 1 file changed, 207 insertions(+), 242 deletions(-) Reviewed-by: Bastian Koppelmann kbast...@mail.uni-paderborn.de Cheers, Bastian
Re: [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops
On 11/11/2014 04:24 PM, Richard Henderson wrote: Richard Henderson (8): tcg: Move some opcode generation functions out of line tcg: Reduce ifdefs in tcg-op.c tcg: Move emit of INDEX_op_end into gen_tb_end tcg: Introduce tcg_op_buf_count and tcg_op_buf_full tcg: Put opcodes in a linked list tcg: Remove opcodes instead of noping them out tcg: Implement insert_op_before tcg: Remove unused opcodes Makefile.target |2 +- include/exec/gen-icount.h | 22 +- target-alpha/translate.c | 16 +- target-arm/translate-a64.c| 10 +- target-arm/translate.c| 10 +- target-cris/translate.c | 15 +- target-i386/translate.c | 11 +- target-lm32/translate.c | 16 +- target-m68k/translate.c | 10 +- target-microblaze/translate.c | 22 +- target-mips/translate.c | 10 +- target-moxie/translate.c | 10 +- target-openrisc/translate.c | 15 +- target-ppc/translate.c| 11 +- target-s390x/translate.c | 11 +- target-sh4/translate.c| 10 +- target-sparc/translate.c | 10 +- target-tricore/translate.c|5 +- target-unicore32/translate.c | 10 +- target-xtensa/translate.c |8 +- tcg/optimize.c| 307 +++-- tcg/tcg-op.c | 1941 tcg/tcg-op.h | 2488 ++--- tcg/tcg-opc.h |9 - tcg/tcg.c | 535 +++-- tcg/tcg.h | 72 +- tci.c | 13 - 27 files changed, 2761 insertions(+), 2838 deletions(-) create mode 100644 tcg/tcg-op.c Richard, doing the review for the tcg changes helped me in my understanding on how tcg works. So whenever you have more changes for tcg, feel free to CC me. Cheers, Bastian
Re: [Qemu-devel] [RFC PATCH] virtio-mmio: support for multiple irqs
On Thu, 2014-11-13 at 09:39 +, Shannon Zhao wrote: When we use only virtio-mmio or vhost-net without irqfd, the device uses qemu_set_irq(within qemu) to inject interrupt and at the same time qemu update VIRTIO_MMIO_INTERRUPT_STATUS to tell guest driver whom this interrupt to. All these things are done by qemu. Though qemu_set_irq will call kvm_set_irq to do the real interrupt inject, but the trigger is qemu and it can update the VIRTIO_MMIO_INTERRUPT_STATUS register before injecting the interrupt. But if we use vhost-net with irqfd, the device uses ioeventfd mechanism to inject interrupt. When an interrupt happened, it doesn't transfer to qemu, while the irqfd finally call kvm_set_irq to inject the interrupt directly. All these things are done in kvm and it can't update the VIRTIO_MMIO_INTERRUPT_STATUS register. So if the guest driver still uses the old interrupt handler, it has to read the VIRTIO_MMIO_INTERRUPT_STATUS register to get the interrupt reason and run different handlers for different reasons. But the register has nothing and none of handlers will be called. I make myself clear? :-) I see... (well, at least I believe I see ;-) Of course this means that with irqfd, the device is simply non-compliant with the spec. And that extending the status register doesn't help you at all, so we can drop this idea. Paradoxically, it's a good news (of a sort :-) because I don't think we should be doing such a fundamental change in the spec at this date. I'll discuss it with others in the TC and I'm open to be convinced otherwise, but I believe the spec should stay as it is. Having said that, I see no problem with experimenting with the device model so the next version of the standard is extended. Two suggestions I have would be: 1. Drop the virtio-pci like case with two interrupts (one for config changes, one for all queues), as it doesn't bring anything new. Just make it all interrupts individual. 2. Change the way the interrupts are acknowledge - instead of a bitmask, have a register which takes a queue number to ack the queue's interrupt and ~0 to ack the config interrupt. 3. Change the version of the device to (intially) 0x8003 - I've just made an executive decision :-) that non-standard compliant devices should have the MSB of the version number set (I'll propose to reserve this range of versions in future version of the spec). We'll consider it a prototype of the version 3. Then make the driver behave in the new way when (and only when) such device version is observed. Also, I remembered I haven't addressed one of your previous comments: On Wed, 2014-11-12 at 08:32 +, Shannon Zhao wrote: One point I'd like to make is that the device was intentionally designed with simplicity in mind first, performance later (something about embedded etc :-). Changing this assumption is of course possible, but Ah, I think ARM is not only about embedded things. Maybe it could has a wider application such as micro server. Just my personal opinion. By all means, I couldn't agree more. But there's one thing you have to keep in mind - it doesn't matter whether the real hardware has got PCI or not, but what is emulated by qemu/KVM. Therefore, if only you can get the PCI host controller working in the qemu virt machine (which should be much simpler now, as we have generic support for PCI controllers/bridges in the kernel now), you'll be able to forget the issue of virtio-mmio and multiple interrupts and still enjoy your performance gains :-) Does it all make sense? Pawel
[Qemu-devel] [PATCH] target-ppc: Altivec's mtvscr Decodes Wrong Register
The Move to Vector Status and Control Register (mtvscr) instruction uses VRB as the source register. Fix the code generator to correctly decode the VRB field. That is, use rB(ctx-opcode) instead of rD(ctx-opcode). Signed-off-by: Tom Musta tommu...@gmail.com --- target-ppc/translate.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/target-ppc/translate.c b/target-ppc/translate.c index 910ce56..d381632 100644 --- a/target-ppc/translate.c +++ b/target-ppc/translate.c @@ -6848,7 +6848,7 @@ static void gen_mtvscr(DisasContext *ctx) gen_exception(ctx, POWERPC_EXCP_VPU); return; } -p = gen_avr_ptr(rD(ctx-opcode)); +p = gen_avr_ptr(rB(ctx-opcode)); gen_helper_mtvscr(cpu_env, p); tcg_temp_free_ptr(p); } -- 1.7.1
Re: [Qemu-devel] [PATCH 0/3] apic: ExtINT fixes for QNX
I've been testing these against our entire kernel regression suite. While they definitely fix the original problems, there is one test that now hangs. I'm going to have to get to the bottom of that failure before I can sign off. From: Paolo Bonzini pbonz...@redhat.commailto:pbonz...@redhat.com Date: Tue Nov 11 2014 09:23:38 GMT-0500 (EST) To: qemu-devel@nongnu.org qemu-devel@nongnu.orgmailto:qemu-devel@nongnu.org Cc: Richard Bilson rbil...@qnx.commailto:rbil...@qnx.com Subject: [PATCH 0/3] apic: ExtINT fixes for QNX Richard Bilson reported some QNX emulation bugs. I sent him offlist a prototype patch, that he commented on. The first patch here works around what is arguably a QNX bug, that is triggered by the last two patches; but it is better to avoid spurious HLT wakeups, so here it is. The last two patches are what Richard had tested, amended with a fix for Windows XP + KVM + QEMU apic. Richard, can you confirm that these patches work? Paolo Paolo Bonzini (3): apic: avoid getting out of halted state on masked PIC interrupts apic: fix loss of IPI due to masked ExtINT apic: fix incorrect handling of ExtINT interrupts wrt processor priority hw/intc/apic.c | 20 target-i386/cpu.c | 8 ++-- 2 files changed, 18 insertions(+), 10 deletions(-) -- 2.1.0
Re: [Qemu-devel] [PATCH v2 3/4] raw-posix: Fix try_seek_hole()'s handling of SEEK_DATA failure
On 11/14/2014 06:12 AM, Markus Armbruster wrote: 0-length file: lseek(fd, 0, SEEK_HOLE) = -1 ENXIO lseek(fd, 0, SEEK_DATA) = -1 ENXIO conclusion: 0 is at EOF Isn't this a special case of the next one? file of any size: lseek(fd, size_or_larger, SEEK_HOLE) = -1 ENXIO lseek(fd, size_or_larger, SEEK_DATA) = -1 ENXIO conclusion: size_or_larger is at or beyond EOF Yes. The two calls are both necessary, in order to learn which extant type offset belongs to, and to tell where that extant ends; and the behaviors are distinguishable (if both lseek() succeed, we have both numbers we want; if both fail with ENXIO, we know the offset is at or beyond EOF; and if only SEEK_HOLE fails with ENXIO, we know we have a trailing hole); and we can tell at runtime what to do about a trailing hole (if the return value is offset, we need one more lseek(fd, 0, SEEK_END) to find EOF; if the return value is larger than offset, we have EOF for free). You can optimize by calling SEEK_HOLE first (if it fails with ENXIO, there is no need to try SEEK_DATA); but SEEK_HOLE in isolation is insufficient to give you all the information you need. Not discussed: how to handle failures other than ENXIO. The appended code still avoids a second seek in one case. Useful mostly because it saves us from handling a second seek's contradictory information. Slick - I focused on SEEK_HOLE first, but you focused on SEEK_DATA first. Your comments make all the difference. /* * Find allocation range in @bs around offset @start. * May change underlying file descriptor's file offset. * If @start is not in a hole, store @start in @data, and the * beginning of the next hole in @hole, and return 0. * If @start is in a non-trailing hole, store @start in @hole and the * beginning of the next non-hole in @data, and return 0. * If @start is in a trailing hole or beyond EOF, return -ENXIO. And caller can blindly and safely treat that as a trailing hole, as needed. * If we can't find out, return a negative errno other than -ENXIO. */ static int find_allocation(BlockDriverState *bs, off_t start, off_t *data, off_t *hole) { #if defined SEEK_HOLE defined SEEK_DATA I seriously doubt you'd find a system with one but not both of these constants defined. But it doesn't hurt to check both. BDRVRawState *s = bs-opaque; off_t offs; /* * SEEK_DATA cases: * D1. offs == start: start is in data * D2. offs start: start is in a hole, next data at offs * D3. offs 0, errno = ENXIO: either start is in a trailing hole * or start is beyond EOF * If the latter happens, the file has been truncated behind * our back since we opened it. Best we can do is treat like * a trailing hole. * D4. offs 0, errno != ENXIO: we learned nothing */ Correct. offs = lseek(s-fd, start, SEEK_DATA); if (offs 0) { return -errno; /* D3 or D4 */ } assert(offs = start); if (offs start) { /* D2: in hole, next data at offs */ *hole = start; *data = offs; return 0; } /* D1: in data, end not yet known */ /* * SEEK_HOLE cases: * H1. offs == start: start is in a hole * If this happens here, a hole has been dug behind our back * since the previous lseek(). * H2. offs start: either start is in data, next hole at offs, * or start is in trailing hole, EOF at offs * Linux treats trailing holes like any other hole: offs == * start. Solaris seeks to EOF instead: offs start (blech). Correct in isolation. Coupled with the additional knowledge that we are in state D1 (and already treated D3 as a trailing hole with early exit),... * If that happens here, a hole has been dug behind our back * since the previous lseek(). ...this is further true for this function. * H3. offs 0, errno = ENXIO: start is beyond EOF * If this happens, the file has been truncated behind our * back since we opened it. Treat it like a trailing hole. * H4. offs 0, errno != ENXIO: we learned nothing * Pretend we know nothing at all, i.e. forget about D1. */ offs = lseek(s-fd, start, SEEK_HOLE); if (offs 0) { return -errno; /* D1 and (H3 or H4) */ } assert(offs = start); if (offs start) { /* * D1 and H2: either in data, next hole at offs, or it was in * data but is now in a trailing hole. Treating the latter as * if it there was data extending to EOF is safe, so simply do * that. */ *data = start; *hole = offs; return 0; } Reasonable. /* D1 and H1 */ return -EBUSY; #else return -ENOTSUP; #endif } I like it. Maybe we could do
Re: [Qemu-devel] [Bug 1392504] Re: USB Passthrough is not working anymore
On 2014/11/14 19:57, Leen Keus wrote: Command lines generated by virsh from the libvirt xml files (let me know if you also need the xml configuration files): Does this below patch fix your problem? http://git.qemu.org/?p=qemu.git;a=commitdiff;h=79ae25af1569a50a0ec799901a1bb280c088f121 Regards, -Gonglei