[Qemu-devel] [PATCH 2/4] blkverify: Fix leak of opts in blkverify_open
Signed-off-by: Fam Zheng f...@redhat.com --- block/blkverify.c | 1 + 1 file changed, 1 insertion(+) diff --git a/block/blkverify.c b/block/blkverify.c index 7c78ca4..163064c 100644 --- a/block/blkverify.c +++ b/block/blkverify.c @@ -158,6 +158,7 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags, ret = 0; fail: +qemu_opts_del(opts); return ret; } -- 2.1.0
[Qemu-devel] [PATCH 1/4] nfs: Fix leak of opts in nfs_file_open
Signed-off-by: Fam Zheng f...@redhat.com --- block/nfs.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/block/nfs.c b/block/nfs.c index 93d87f3..36e8057 100644 --- a/block/nfs.c +++ b/block/nfs.c @@ -393,15 +393,18 @@ static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags, qemu_opts_absorb_qdict(opts, options, local_err); if (local_err) { error_propagate(errp, local_err); -return -EINVAL; +ret = -EINVAL; +goto out; } ret = nfs_client_open(client, qemu_opt_get(opts, filename), (flags BDRV_O_RDWR) ? O_RDWR : O_RDONLY, errp); if (ret 0) { -return ret; +goto out; } bs-total_sectors = ret; +out: +qemu_opts_del(opts); return 0; } -- 2.1.0
Re: [Qemu-devel] [PATCH v4 5/5] qemu-iotests: Add 093 for IO throttling
On Tue, 08/26 14:50, Stefan Hajnoczi wrote: On Thu, Jun 05, 2014 at 04:47:46PM +0800, Fam Zheng wrote: +def setUp(self): +qemu_img('create', '-f', iotests.imgfmt, test_img, 1G) +#self.vm = iotests.VM().add_drive(test_img, bps=1024,bps_max=1) Commented out lines should be dropped +self.vm = iotests.VM().add_drive(test_img) +self.vm.launch() + +def tearDown(self): +self.vm.shutdown() +os.remove(test_img) + +def do_test_throttle(self, seconds=10, **limits): +def check_limit(limit, num): +# IO throttling algorithm is discrete, allow 10% error so the test +# is more deterministic +return limit == 0 or num seconds * limit * 1.1 + +nsec_per_sec = 10 + +limits['bps_max'] = 1 +limits['iops_max'] = 1 + +# Enqueue many requests to throttling queue This comment is wrong, it actually happens further down +result = self.vm.qmp(block_set_io_throttle, conv_keys=False, **limits) +self.assert_qmp(result, 'return', {}) + +# Set vm clock to a known value +ns = nsec_per_sec +self.vm.qtest_cmd(clock_step %d % ns) + +# Append many requests into the throttle queue +# They drain bps_max and iops_max +# The rest requests won't get executed until qtest clock is driven +for i in range(1000): +self.vm.hmp_qemu_io(drive0, aio_read -a -q 0 512) +self.vm.hmp_qemu_io(drive0, aio_write -a -q 0 512) + +start_rd_bytes, start_rd_iops, start_wr_bytes, start_wr_iops = self.blockstats('drive0') + +ns += seconds * nsec_per_sec +self.vm.qtest_cmd(clock_step %d % ns) +# wait for a while to let requests take off +time.sleep(1) This is not a reliable testing approach. If the system is under heavy load maybe only a few requests completed. We don't know whether that is due to I/O throttling or not. A reliable test would not perform real disk I/O so the test is independent of disk/system speed. And it would not use time.sleep(1) to wait since there is no guarantee that anything happened in the meantime. Do you think this can be improved? Throttling only sets the upper limit of IO, this test checks the IO doesn't cross it: when the test fails, something must be wrong with the throttling; when the check passes, we can't guarantee that everything is correct. That's not uncommon across many other test cases we have. The other half is very hard, because of host load, etc., which are out of control of this script. Regarding to disk IO, I've submitted a separate patch, the null:// protocol, which can be used to sidestep the host disk load uncertainty. I can base this test on top. +end_rd_bytes, end_rd_iops, end_wr_bytes, end_wr_iops = self.blockstats('drive0') + +rd_bytes = end_rd_bytes - start_rd_bytes +rd_iops = end_rd_iops - start_rd_iops +wr_bytes = end_wr_bytes - start_wr_bytes +wr_iops = end_wr_iops - start_wr_iops + +assert check_limit(limits['bps'], rd_bytes) +assert check_limit(limits['bps_rd'], rd_bytes) +assert check_limit(limits['bps'], wr_bytes) +assert check_limit(limits['bps_wr'], wr_bytes) +assert check_limit(limits['iops'], rd_iops) +assert check_limit(limits['iops_rd'], rd_iops) +assert check_limit(limits['iops'], wr_iops) +assert check_limit(limits['iops_wr'], wr_iops) Please use TestCase.assert*() methods instead of plain assert. They produce humand-readable error messages including the failing values. OK. + +def test_bps(self): +self.do_test_throttle(**{ +'device': 'drive0', +'bps': 1000, +'bps_rd': 0, +'bps_wr': 0, +'iops': 0, +'iops_rd': 0, +'iops_wr': 0, +}) Keyword argument syntax is more concise: self.do_test_throttle(device='drive0', bps=1000, bps_rd=0, ...) OK, will change. Thanks, Fam
Re: [Qemu-devel] [PATCH 3/3] qemu-img: always goto out in img_snapshot() error paths
Stefan Hajnoczi stefa...@redhat.com writes: The out label has the qemu_progress_end() and other cleanup calls. Always goto out in error paths so the cleanup happens. Note that bdrv_unref(NULL) is safe. We just need to initialize bs to NULL at the top of the function. We can now remove the obsolete bs_old_backing = NULL and bs_new_backing = NULL for safe mode. Originally it was necessary in commit 3e85c6fd (qemu-img rebase) but became useless in commit c2abcce (qemu-img: avoid calling exit(1) to release resources properly) because the variables are already initialized during declaration. Please mention here that you fix exit code.from -1 to 1. Reported-by: John Snow js...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com [...]
Re: [Qemu-devel] [PATCH] qemu-char: fix terminal crash when using -monitor stdio -nographic
john.liuli john.li...@huawei.com writes: From: Li Liu john.li...@huawei.com Eeay to reproduce, just try qemu -monitor stdio -nographic and type quit, then the terminal will be crashed. There are two pathes try to call tcgetattr of stdio in vl.c: 1) Monitor_parse(optarg, readline); . qemu_opts_foreach(qemu_find_opts(chardev), chardev_init_func, NULL, 1) != 0) 2) if (default_serial) add_device_config(DEV_SERIAL, stdio); if (foreach_device_config(DEV_SERIAL, serial_parse) 0) Both of them will trigger qemu_chr_open_stdio which will disable ECHO attributes. First one has updated the attributes of stdio by calling qemu_chr_fe_set_echo(chr, false). And the tty attributes has been saved in oldtty. Then the second path will redo such actions, and the oldtty is overlapped. So till quit, term_exit can't recove the correct attributes. Signed-off-by: Li Liu john.li...@huawei.com Yes, failure to restore tty settings is a bug. But is having multiple character devices use the same terminal valid? If no, can we catch and reject the attempt? [...]
Re: [Qemu-devel] [PATCH] qemu-char: fix terminal crash when using -monitor stdio -nographic
On 2014/8/27 14:44, Markus Armbruster wrote: john.liuli john.li...@huawei.com writes: From: Li Liu john.li...@huawei.com Eeay to reproduce, just try qemu -monitor stdio -nographic and type quit, then the terminal will be crashed. There are two pathes try to call tcgetattr of stdio in vl.c: 1) Monitor_parse(optarg, readline); . qemu_opts_foreach(qemu_find_opts(chardev), chardev_init_func, NULL, 1) != 0) 2) if (default_serial) add_device_config(DEV_SERIAL, stdio); if (foreach_device_config(DEV_SERIAL, serial_parse) 0) Both of them will trigger qemu_chr_open_stdio which will disable ECHO attributes. First one has updated the attributes of stdio by calling qemu_chr_fe_set_echo(chr, false). And the tty attributes has been saved in oldtty. Then the second path will redo such actions, and the oldtty is overlapped. So till quit, term_exit can't recove the correct attributes. Signed-off-by: Li Liu john.li...@huawei.com Yes, failure to restore tty settings is a bug. But is having multiple character devices use the same terminal valid? I'm not sure. But I have found such comments in vl.c According to documentation and historically, -nographic redirects serial port, parallel port and monitor to stdio Best regards Li. If no, can we catch and reject the attempt? [...]
Re: [Qemu-devel] [PATCH] curl: Add override_accept_ranges flag to force sending range requests.
On Tue, Aug 26, 2014 at 08:57:30PM -0600, Eric Blake wrote: On 08/26/2014 08:38 PM, Fam Zheng wrote: On Tue, 08/26 21:48, Richard W.M. Jones wrote: Some servers (notably VMware ESX) accept range requests, but don't send back the Accept-Ranges: bytes header in their initial response. For these servers you can set override_accept_ranges to 'on' which forces this block driver to send range requests anyway. Is this a case where we should be naming with dashes instead of underscores, as in override-accept-ranges? Yes. I'm not particularly happy with the long name either, but couldn't think of anything shorter. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
[Qemu-devel] [PATCH 1/1] Rename piix4_acpi_system_hot_add_init() to piix4_acpi_system_hotplug_init().
piix4_acpi_system_hot_add_init() handles not only memory hot add initialization, but the whole memory hotplug initialization. So rename it to piix4_acpi_system_hotplug_init(). Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- hw/acpi/piix4.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c index c1d5187..1a15e5e 100644 --- a/hw/acpi/piix4.c +++ b/hw/acpi/piix4.c @@ -93,7 +93,7 @@ typedef struct PIIX4PMState { #define PIIX4_PM(obj) \ OBJECT_CHECK(PIIX4PMState, (obj), TYPE_PIIX4_PM) -static void piix4_acpi_system_hot_add_init(MemoryRegion *parent, +static void piix4_acpi_system_hotplug_init(MemoryRegion *parent, PCIBus *bus, PIIX4PMState *s); #define ACPI_ENABLE 0xf1 @@ -468,7 +468,7 @@ static int piix4_pm_initfn(PCIDevice *dev) qemu_add_machine_init_done_notifier(s-machine_ready); qemu_register_reset(piix4_reset, s); -piix4_acpi_system_hot_add_init(pci_address_space_io(dev), dev-bus, s); +piix4_acpi_system_hotplug_init(pci_address_space_io(dev), dev-bus, s); piix4_pm_add_propeties(s); return 0; @@ -556,7 +556,7 @@ static void piix4_cpu_added_req(Notifier *n, void *opaque) acpi_update_sci(s-ar, s-irq); } -static void piix4_acpi_system_hot_add_init(MemoryRegion *parent, +static void piix4_acpi_system_hotplug_init(MemoryRegion *parent, PCIBus *bus, PIIX4PMState *s) { memory_region_init_io(s-io_gpe, OBJECT(s), piix4_gpe_ops, s, -- 1.8.4.2
[Qemu-devel] [RESEND PATCH v3 0/8] QEmu memory hot unplug support.
This patch-set implements memory hot-remove for QEmu. Approach: QEmu sets GPE status bit, then triggers SCI to notify guest os. Guest os checks device status, and free memory resource if possible, then generate OST. Finally, qemu handles OST events to free dimm device. Change log v2 - v3: 1. Divide the large patch into 7 small patches. (patch 1, 3~8) 2. Add new patch 2/8, implementing memory hot-remove for ich9. 3. Sanitize ACPI hardware implement code in patch 6/8. Hu Tao (6): acpi, piix4: Add memory hot unplug support for piix4. pc: Add memory hot unplug support for pc machine. qdev: Add memory hot unplug support for bus-less devices. pc-dimm: Add pc_dimm_unrealize() for memory hot unplug support. pc, acpi bios: Add memory hot unplug interface. monitor: Add memory hot unplug support for device_del command. Tang Chen (2): acpi, ich9: Add memory hot unplug support for ich9. acpi: Add hardware implementation for memory hot unplug. hw/acpi/ich9.c | 12 ++ hw/acpi/memory_hotplug.c | 87 ++-- hw/acpi/piix4.c | 3 ++ hw/core/qdev.c | 8 hw/i386/pc.c | 32 +++ hw/i386/ssdt-mem.dsl | 5 +++ hw/i386/ssdt-misc.dsl| 15 ++- hw/isa/lpc_ich9.c| 5 ++- hw/mem/pc-dimm.c | 10 + include/hw/acpi/acpi.h | 14 +++ include/hw/acpi/ich9.h | 2 + include/hw/acpi/memory_hotplug.h | 3 ++ include/hw/acpi/pc-hotplug.h | 2 + include/qom/object.h | 1 + qdev-monitor.c | 23 +++ qom/object.c | 2 +- 16 files changed, 217 insertions(+), 7 deletions(-) -- 1.8.4.2
[Qemu-devel] [RESEND PATCH v3 2/8] acpi, ich9: Add memory hot unplug support for ich9.
Implement ich9_pm_device_unplug_cb() to support memory hot-remove, calling acpi_memory_unplug_cb(). And itself will be called in ich9_device_unplug_cb(). Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- hw/acpi/ich9.c | 12 hw/isa/lpc_ich9.c | 5 +++-- include/hw/acpi/ich9.h | 2 ++ 3 files changed, 17 insertions(+), 2 deletions(-) diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c index 7b14bbb..d369151 100644 --- a/hw/acpi/ich9.c +++ b/hw/acpi/ich9.c @@ -310,6 +310,18 @@ void ich9_pm_device_plug_cb(ICH9LPCPMRegs *pm, DeviceState *dev, Error **errp) } } +void ich9_pm_device_unplug_cb(ICH9LPCPMRegs *pm, DeviceState *dev, Error **errp) +{ +if (pm-acpi_memory_hotplug.is_enabled +object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) { +acpi_memory_unplug_cb(pm-acpi_regs, pm-irq, pm-acpi_memory_hotplug, + dev, errp); +} else { +error_setg(errp, acpi: device unplug request for not supported device +type: %s, object_get_typename(OBJECT(dev))); +} +} + void ich9_pm_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list) { ICH9LPCState *s = ICH9_LPC_DEVICE(adev); diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c index 177023b..2c0761a 100644 --- a/hw/isa/lpc_ich9.c +++ b/hw/isa/lpc_ich9.c @@ -610,8 +610,9 @@ static void ich9_device_plug_cb(HotplugHandler *hotplug_dev, static void ich9_device_unplug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, Error **errp) { -error_setg(errp, acpi: device unplug request for not supported device -type: %s, object_get_typename(OBJECT(dev))); +ICH9LPCState *lpc = ICH9_LPC_DEVICE(hotplug_dev); + +ich9_pm_device_unplug_cb(lpc-pm, dev, errp); } static bool ich9_rst_cnt_needed(void *opaque) diff --git a/include/hw/acpi/ich9.h b/include/hw/acpi/ich9.h index 7e42448..029576f 100644 --- a/include/hw/acpi/ich9.h +++ b/include/hw/acpi/ich9.h @@ -60,6 +60,8 @@ extern const VMStateDescription vmstate_ich9_pm; void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm, Error **errp); void ich9_pm_device_plug_cb(ICH9LPCPMRegs *pm, DeviceState *dev, Error **errp); +void ich9_pm_device_unplug_cb(ICH9LPCPMRegs *pm, DeviceState *dev, + Error **errp); void ich9_pm_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list); #endif /* HW_ACPI_ICH9_H */ -- 1.8.4.2
[Qemu-devel] [RESEND PATCH v3 1/8] acpi, piix4: Add memory hot unplug support for piix4.
From: Hu Tao hu...@cn.fujitsu.com Implement acpi_memory_unplug_cb(), sending an sci to guest to trigger memory hot-remove, and call it in piix4_device_unplug_cb(). Signed-off-by: Hu Tao hu...@cn.fujitsu.com Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- hw/acpi/memory_hotplug.c | 25 + hw/acpi/piix4.c | 3 +++ include/hw/acpi/memory_hotplug.h | 2 ++ 3 files changed, 30 insertions(+) diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c index ed39241..c310b44 100644 --- a/hw/acpi/memory_hotplug.c +++ b/hw/acpi/memory_hotplug.c @@ -195,6 +195,31 @@ void acpi_memory_plug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, return; } +void acpi_memory_unplug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, + DeviceState *dev, Error **errp) +{ +Error *local_err = NULL; +int slot = object_property_get_int(OBJECT(dev), slot, local_err); + +if (local_err) { +error_propagate(errp, local_err); +return; +} + +if (slot = mem_st-dev_count) { +char *dev_path = object_get_canonical_path(OBJECT(dev)); +error_setg(errp, acpi_memory_plug_cb: + device [%s] returned invalid memory slot[%d], + dev_path, slot); +g_free(dev_path); +return; +} + +/* do ACPI magic */ +ar-gpe.sts[0] |= ACPI_MEMORY_HOTPLUG_STATUS; +acpi_update_sci(ar, irq); +} + static const VMStateDescription vmstate_memhp_sts = { .name = memory hotplug device state, .version_id = 1, diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c index b72b34e..c1d5187 100644 --- a/hw/acpi/piix4.c +++ b/hw/acpi/piix4.c @@ -362,6 +362,9 @@ static void piix4_device_unplug_cb(HotplugHandler *hotplug_dev, if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { acpi_pcihp_device_unplug_cb(s-ar, s-irq, s-acpi_pci_hotplug, dev, errp); +} else if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) { +acpi_memory_unplug_cb(s-ar, s-irq, s-acpi_memory_hotplug, dev, + errp); } else { error_setg(errp, acpi: device unplug request for not supported device type: %s, object_get_typename(OBJECT(dev))); diff --git a/include/hw/acpi/memory_hotplug.h b/include/hw/acpi/memory_hotplug.h index 7bbf8a0..fc6b868 100644 --- a/include/hw/acpi/memory_hotplug.h +++ b/include/hw/acpi/memory_hotplug.h @@ -28,6 +28,8 @@ void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner, void acpi_memory_plug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, DeviceState *dev, Error **errp); +void acpi_memory_unplug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, + DeviceState *dev, Error **errp); extern const VMStateDescription vmstate_memory_hotplug; #define VMSTATE_MEMORY_HOTPLUG(memhp, state) \ -- 1.8.4.2
[Qemu-devel] [RESEND PATCH v3 4/8] qdev: Add memory hot unplug support for bus-less devices.
From: Hu Tao hu...@cn.fujitsu.com Implement bus-less device hot-remove in qdev_unplug(). It will call PCMachine callback introduced in previous patch. Signed-off-by: Hu Tao hu...@cn.fujitsu.com Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- hw/core/qdev.c | 8 1 file changed, 8 insertions(+) diff --git a/hw/core/qdev.c b/hw/core/qdev.c index da1ba48..e365a74 100644 --- a/hw/core/qdev.c +++ b/hw/core/qdev.c @@ -228,6 +228,14 @@ void qdev_unplug(DeviceState *dev, Error **errp) if (dev-parent_bus dev-parent_bus-hotplug_handler) { hotplug_handler_unplug(dev-parent_bus-hotplug_handler, dev, errp); +} else if (*errp == NULL) { +HotplugHandler *hotplug_ctrl; +MachineState *machine = MACHINE(qdev_get_machine()); +MachineClass *mc = MACHINE_GET_CLASS(machine); + +hotplug_ctrl = mc-get_hotplug_handler(machine, dev); +if (hotplug_ctrl) +hotplug_handler_unplug(hotplug_ctrl, dev, errp); } else { assert(dc-unplug != NULL); if (dc-unplug(dev) 0) { /* legacy handler */ -- 1.8.4.2
[Qemu-devel] [RESEND PATCH v3 3/8] pc: Add memory hot unplug support for pc machine.
From: Hu Tao hu...@cn.fujitsu.com Implement device unplug callback for PCMachine. And it now only support pc-dimm hot-remove. The callback will call piix4 or ich9 callbacks introduced in previous patches. Signed-off-by: Hu Tao hu...@cn.fujitsu.com Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- hw/i386/pc.c | 32 1 file changed, 32 insertions(+) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 8fa8d2f..8bafe7c 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -61,6 +61,8 @@ #include hw/mem/pc-dimm.h #include trace.h #include qapi/visitor.h +#include hw/acpi/piix4.h +#include hw/i386/ich9.h /* debug PC/ISA interrupts */ //#define DEBUG_IRQ @@ -1607,6 +1609,27 @@ out: error_propagate(errp, local_err); } +static void pc_dimm_unplug(HotplugHandler *hotplug_dev, + DeviceState *dev, Error **errp) +{ +Object *acpi_dev; +HotplugHandlerClass *hhc; +Error *local_err = NULL; + +acpi_dev = (acpi_dev = piix4_pm_find()) ? acpi_dev : ich9_lpc_find(); +if (!acpi_dev) { +error_setg(local_err, + memory hotplug is not enabled: missing acpi device); +error_propagate(errp, local_err); +return; +} + +hhc = HOTPLUG_HANDLER_GET_CLASS(acpi_dev); +hhc-unplug(HOTPLUG_HANDLER(acpi_dev), dev, local_err); + +error_propagate(errp, local_err); +} + static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, Error **errp) { @@ -1615,6 +1638,14 @@ static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev, } } +static void pc_machine_device_unplug_cb(HotplugHandler *hotplug_dev, +DeviceState *dev, Error **errp) +{ +if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) { +pc_dimm_unplug(hotplug_dev, dev, errp); +} +} + static HotplugHandler *pc_get_hotpug_handler(MachineState *machine, DeviceState *dev) { @@ -1701,6 +1732,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data) pcmc-get_hotplug_handler = mc-get_hotplug_handler; mc-get_hotplug_handler = pc_get_hotpug_handler; hc-plug = pc_machine_device_plug_cb; +hc-unplug = pc_machine_device_unplug_cb; } static const TypeInfo pc_machine_info = { -- 1.8.4.2
[Qemu-devel] [RFC PATCH v1 2/4] Add event handling for memory device insertion.
Define device insertion OST event and status, and add a function to handle memory insertion. Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- hw/acpi/memory_hotplug.c | 18 ++ include/hw/acpi/acpi.h | 10 +- 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c index 1b21191..fddb0fd 100644 --- a/hw/acpi/memory_hotplug.c +++ b/hw/acpi/memory_hotplug.c @@ -85,6 +85,20 @@ static uint64_t acpi_memory_hotplug_read(void *opaque, hwaddr addr, return val; } +static void acpi_handle_insert(MemStatus *mdev) +{ +switch (mdev-ost_status) { +case ACPI_SUCCESS: +case ACPI_FAILURE: +case ACPI_UNRECOGNIZED_NOTIFY: +case ACPI_INSERT_DRIVER_LOAD_FAILURE: +case ACPI_INSERT_NOT_SUPPORTED: +case ACPI_INSERT_IN_PROGRESS: +default: +break; +} +} + static void acpi_handle_eject(MemStatus *mdev) { switch (mdev-ost_status) { @@ -121,6 +135,10 @@ static void acpi_handle_eject(MemStatus *mdev) static void acpi_handle_ost_event(MemStatus *mdev) { switch (mdev-ost_event) { +case ACPI_NOTIFY_DEVICE_CHECK: +case ACPI_OSPM_INSERT: +acpi_handle_insert(mdev); +break; case ACPI_NOTIFY_EJECT_REQUEST:/* Ejection triggered by hardware. */ case ACPI_OSPM_EJECT: /* Ejection triggered by guest OS. */ acpi_handle_eject(mdev); diff --git a/include/hw/acpi/acpi.h b/include/hw/acpi/acpi.h index e8499f6..ad706d8 100644 --- a/include/hw/acpi/acpi.h +++ b/include/hw/acpi/acpi.h @@ -93,21 +93,29 @@ /* OST_EVENT */ #define ACPI_OSPM_EJECT 0x103 +#define ACPI_OSPM_INSERT0x200 /* NOTIFY_EVENT */ #define ACPI_NOTIFY_DEVICE_CHECK0x1 #define ACPI_NOTIFY_EJECT_REQUEST 0x3 -/* OST_STATUS */ +/* general processing OST_STATUS */ #define ACPI_SUCCESS0x0 #define ACPI_FAILURE0x1 #define ACPI_UNRECOGNIZED_NOTIFY0x2 + +/* ejection processing OST_STATUS */ #define ACPI_EJECT_NOT_SUPPORTED0x80 #define ACPI_EJECT_DEVICE_IN_USE0x81 #define ACPI_EJECT_DEVICE_BUSY 0x82 #define ACPI_EJECT_DEPENDENCY_BUSY 0x83 #define ACPI_EJECT_IN_PROGRESS 0x84 +/* insertion processing OST_STATUS */ +#define ACPI_INSERT_IN_PROGRESS 0x80 +#define ACPI_INSERT_DRIVER_LOAD_FAILURE 0x81 +#define ACPI_INSERT_NOT_SUPPORTED 0x82 + /* structs */ typedef struct ACPIPMTimer ACPIPMTimer; typedef struct ACPIPM1EVT ACPIPM1EVT; -- 1.8.4.2
[Qemu-devel] [RESEND PATCH v3 8/8] monitor: Add memory hot unplug support for device_del command.
From: Hu Tao hu...@cn.fujitsu.com Implement find_peripheral_device() to find bus-less device, and call it in qmp_device_del() so that device_del command will be able to remove memory device. Signed-off-by: Hu Tao hu...@cn.fujitsu.com Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- include/qom/object.h | 1 + qdev-monitor.c | 23 +++ qom/object.c | 2 +- 3 files changed, 25 insertions(+), 1 deletion(-) diff --git a/include/qom/object.h b/include/qom/object.h index 8a05a81..a352beb 100644 --- a/include/qom/object.h +++ b/include/qom/object.h @@ -1300,5 +1300,6 @@ int object_child_foreach(Object *obj, int (*fn)(Object *child, void *opaque), */ Object *container_get(Object *root, const char *path); +bool object_property_is_child(ObjectProperty *prop); #endif diff --git a/qdev-monitor.c b/qdev-monitor.c index fb9ee24..effd928 100644 --- a/qdev-monitor.c +++ b/qdev-monitor.c @@ -24,6 +24,7 @@ #include qmp-commands.h #include sysemu/arch_init.h #include qemu/config-file.h +#include qom/object.h /* * Aliases were a bad idea from the start. Let's keep them @@ -683,12 +684,34 @@ int do_device_add(Monitor *mon, const QDict *qdict, QObject **ret_data) return 0; } +static DeviceState *find_peripheral_device(const char *id) +{ +Object *peripheral = qdev_get_peripheral(); +DeviceState *dev = NULL; +ObjectProperty *prop; + +QTAILQ_FOREACH(prop, peripheral-properties, node) { +if (object_property_is_child(prop)) { +dev = DEVICE(prop-opaque); +if (!strcmp(dev-id, id)) { +return dev; +} +} +} + +return NULL; +} + void qmp_device_del(const char *id, Error **errp) { DeviceState *dev; dev = qdev_find_recursive(sysbus_get_default(), id); if (!dev) { +dev = find_peripheral_device(id); +} + +if (!dev) { error_set(errp, QERR_DEVICE_NOT_FOUND, id); return; } diff --git a/qom/object.c b/qom/object.c index 0e8267b..1183e5d 100644 --- a/qom/object.c +++ b/qom/object.c @@ -351,7 +351,7 @@ void object_initialize(void *data, size_t size, const char *typename) object_initialize_with_type(data, size, type); } -static inline bool object_property_is_child(ObjectProperty *prop) +bool object_property_is_child(ObjectProperty *prop) { return strstart(prop-type, child, NULL); } -- 1.8.4.2
[Qemu-devel] [RFC PATCH v1 0/4] Handle memory hotplug errors from guest OS.
When doing memory hotplug, QEmu is not aware of guest OS error when hotplugging memory devices. Even if guest OS failed to hot-add memory, the pc-dimm device will be added to QEmu. Even if guest OS failed to hot-remove memory, QEmu will remove the pc-dimm device. An example is: for a Linux guest, the Linux kernel limited that the size of hot-added memory should be mutiple of memory section (128MB by default). If we add 130MB memory, the Linux kernel won't add it. We are not able to handle the size check in QEmu commmand line because different OS may have different limits. And also, QEmu outputs nothing but guest OS failed to hot-add memory will confuse users. We should at least report an error. So, we should report the error to users, and cancel the memory hotplug progress in QEmu. QEmu thread sends a SCI to guest OS and return immediately. The vcpu thread will emulate ACPI hardware operations. So this patch-set introduces a wait condition variable to synchronize these two threads. Tang Chen (4): Use macro to define ACPI notification event. Add event handling for memory device insertion. Introduce wait condition to catch guest OS memory hotplug error. Handle memory hotplug error from guest OS in QEmu. hw/acpi/memory_hotplug.c | 146 +-- include/hw/acpi/acpi.h | 15 - 2 files changed, 153 insertions(+), 8 deletions(-) -- 1.8.4.2
[Qemu-devel] [RESEND PATCH v3 5/8] pc-dimm: Add pc_dimm_unrealize() for memory hot unplug support.
From: Hu Tao hu...@cn.fujitsu.com Implement unrealize function for pc-dimm device. It delete subregion from hotplug region, and delete ram address range from guest ram list. Signed-off-by: Hu Tao hu...@cn.fujitsu.com Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- hw/mem/pc-dimm.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c index 20fe0dc..34109a2 100644 --- a/hw/mem/pc-dimm.c +++ b/hw/mem/pc-dimm.c @@ -270,12 +270,22 @@ static MemoryRegion *pc_dimm_get_memory_region(PCDIMMDevice *dimm) return host_memory_backend_get_memory(dimm-hostmem, error_abort); } +static void pc_dimm_unrealize(DeviceState *dev, Error **errp) +{ +PCDIMMDevice *dimm = PC_DIMM(dev); +MemoryRegion *mr = pc_dimm_get_memory_region(dimm); + +memory_region_del_subregion(mr-container, mr); +vmstate_unregister_ram(mr, dev); +} + static void pc_dimm_class_init(ObjectClass *oc, void *data) { DeviceClass *dc = DEVICE_CLASS(oc); PCDIMMDeviceClass *ddc = PC_DIMM_CLASS(oc); dc-realize = pc_dimm_realize; +dc-unrealize = pc_dimm_unrealize; dc-props = pc_dimm_properties; ddc-get_memory_region = pc_dimm_get_memory_region; -- 1.8.4.2
[Qemu-devel] [RFC PATCH v1 4/4] Handle memory hotplug error from guest OS in QEmu.
Before this patch, QEmu is not aware of guest OS error when hotplugging memory devices. Even if guest OS failed to hot-add memory, the pc-dimm device will be added to QEmu. Even if guest OS failed to hot-remove memory, QEmu will remove the pc-dimm device. For example, for a Linux guest, the Linux kernel limited that the size of hot-added memory should be mutiple of memory section (128MB by default). If we add 130MB memory, the Linux kernel won't add it. We are not able to handle the size check in QEmu commmand line because different OS may have different limits. And also, QEmu outputs nothing but guest OS failed to hot-add memory will confuse users. We should at least report an error. So, we should report the error to users, and cancel the memory hotplug progress in QEmu. Since after previous patch, QEmu will wait on qemu_cond_memhp after sending SCI, and will be signaled by vcpu thread after OST status is written to ACPI register, this patch checks OST status, and report an error to users, and cancel hotplug progress if necessary. Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- hw/acpi/memory_hotplug.c | 83 +--- 1 file changed, 79 insertions(+), 4 deletions(-) diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c index 38d9654..59917c3 100644 --- a/hw/acpi/memory_hotplug.c +++ b/hw/acpi/memory_hotplug.c @@ -257,25 +257,95 @@ void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner, qemu_mutex_init(qemu_memhp_mutex); } +static void acpi_handle_memory_insert_error(MemStatus *mdev, char *dev_path, +Error **errp) +{ +switch (mdev-ost_status) { +case ACPI_FAILURE: +error_setg(errp, Failed to insert device [%s], dev_path); +break; +case ACPI_UNRECOGNIZED_NOTIFY: +error_setg(errp, Unrecongnized notification); +break; +case ACPI_INSERT_DRIVER_LOAD_FAILURE: +error_setg(errp, Failed to load driver for device [%s], dev_path); +break; +case ACPI_INSERT_NOT_SUPPORTED: +error_setg(errp, Insertion is not supported for device [%s], + dev_path); +break; +default: +error_setg(errp, Unknown error on insertion of device [%s], dev_path); +break; +} +} + +static void acpi_handle_memory_eject_error(MemStatus *mdev, char *dev_path, + Error **errp) +{ +switch (mdev-ost_status) { +case ACPI_FAILURE: +error_setg(errp, Failed to eject device [%s], dev_path); +break; +case ACPI_UNRECOGNIZED_NOTIFY: +error_setg(errp, Unrecongnized notification); +break; +case ACPI_EJECT_DEVICE_IN_USE: +case ACPI_EJECT_DEVICE_BUSY: +error_setg(errp, Device [%s] is busy, dev_path); +break; +case ACPI_EJECT_DEPENDENCY_BUSY: +error_setg(errp, Ejection dependency is busy or not supported + for device [%s], dev_path); +break; +case ACPI_EJECT_NOT_SUPPORTED: +error_setg(errp, Ejection is not supported for device [%s], + dev_path); +break; +default: +error_setg(errp, Unknown error on ejection of device [%s], dev_path); +break; +} +} + +static void acpi_handle_memory_error(MemStatus *mdev, char *dev_path, + Error **errp) +{ +if (mdev-ost_status == ACPI_SUCCESS) +return; + +switch (mdev-ost_event) { +case ACPI_NOTIFY_DEVICE_CHECK: +case ACPI_OSPM_INSERT: +acpi_handle_memory_insert_error(mdev, dev_path, errp); +break; +case ACPI_NOTIFY_EJECT_REQUEST: +case ACPI_OSPM_EJECT: +acpi_handle_memory_eject_error(mdev, dev_path, errp); +break; +default: +break; +} +} + void acpi_memory_plug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, DeviceState *dev, Error **errp) { MemStatus *mdev; Error *local_err = NULL; int slot = object_property_get_int(OBJECT(dev), slot, local_err); +char *dev_path = object_get_canonical_path(OBJECT(dev)); if (local_err) { error_propagate(errp, local_err); -return; +goto ret; } if (slot = mem_st-dev_count) { -char *dev_path = object_get_canonical_path(OBJECT(dev)); error_setg(errp, acpi_memory_plug_cb: device [%s] returned invalid memory slot[%d], dev_path, slot); -g_free(dev_path); -return; +goto ret; } mdev = mem_st-devs[slot]; @@ -300,9 +370,14 @@ void acpi_memory_plug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, * and will relock it when signaled. */ qemu_cond_wait(qemu_memhp_cond, qemu_memhp_mutex); + +acpi_handle_memory_error(mdev, dev_path, errp); + qemu_mutex_unlock(qemu_memhp_mutex); qemu_mutex_lock_iothread(); +ret: +
[Qemu-devel] [RESEND PATCH v3 6/8] acpi: Add hardware implementation for memory hot unplug.
This patch adds a bool member named is_removing to MemStatus indicating if the memory device is being removed. It is set to true in acpi_memory_unplug_cb() when doing memory hot-remove with device_del command, or write an ACPI_EJECTION_IN_PROGRESS status to ACPI register when triggering memory hot-remove in guest. Signed-off-by: Hu Tao hu...@cn.fujitsu.com Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- hw/acpi/memory_hotplug.c | 62 ++-- include/hw/acpi/acpi.h | 14 + include/hw/acpi/memory_hotplug.h | 1 + 3 files changed, 74 insertions(+), 3 deletions(-) diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c index c310b44..9074e7c 100644 --- a/hw/acpi/memory_hotplug.c +++ b/hw/acpi/memory_hotplug.c @@ -75,6 +75,7 @@ static uint64_t acpi_memory_hotplug_read(void *opaque, hwaddr addr, case 0x14: /* pack and return is_* fields */ val |= mdev-is_enabled ? 1 : 0; val |= mdev-is_inserting ? 2 : 0; +val |= mdev-is_removing ? 4 : 0; trace_mhp_acpi_read_flags(mem_st-selector, val); break; default: @@ -84,6 +85,51 @@ static uint64_t acpi_memory_hotplug_read(void *opaque, hwaddr addr, return val; } +static void acpi_handle_eject(MemStatus *mdev) +{ +switch (mdev-ost_status) { +case ACPI_SUCCESS: +object_unparent(OBJECT(mdev-dimm)); +mdev-is_removing = false; +mdev-dimm = NULL; +break; +case ACPI_EJECT_IN_PROGRESS: +/* For ejection triggered by hardware (device_del command), + * mdev-is_removing is set to true by acpi_memory_unplug_cb() + * before sending SCI. So we only handle ejection triggered by + * guest OS. +*/ +if (mdev-ost_event == ACPI_OSPM_EJECT) { +mdev-is_enabled = false; +mdev-is_removing = true; +} +break; +case ACPI_FAILURE: +case ACPI_UNRECOGNIZED_NOTIFY: +case ACPI_EJECT_NOT_SUPPORTED: +case ACPI_EJECT_DEVICE_IN_USE: +case ACPI_EJECT_DEVICE_BUSY: +case ACPI_EJECT_DEPENDENCY_BUSY: +mdev-is_removing = false; +mdev-is_enabled = true; +break; +default: +break; +} +} + +static void acpi_handle_ost_event(MemStatus *mdev) +{ +switch (mdev-ost_event) { +case ACPI_NOTIFY_EJECT_REQUEST:/* Ejection triggered by hardware. */ +case ACPI_OSPM_EJECT: /* Ejection triggered by guest OS. */ +acpi_handle_eject(mdev); +break; +default: +break; +} +} + static void acpi_memory_hotplug_write(void *opaque, hwaddr addr, uint64_t data, unsigned int size) { @@ -121,21 +167,27 @@ static void acpi_memory_hotplug_write(void *opaque, hwaddr addr, uint64_t data, mdev = mem_st-devs[mem_st-selector]; mdev-ost_status = data; trace_mhp_acpi_write_ost_status(mem_st-selector, mdev-ost_status); -/* TODO: implement memory removal on guest signal */ info = acpi_memory_device_status(mem_st-selector, mdev); qapi_event_send_acpi_device_ost(info, error_abort); qapi_free_ACPIOSTInfo(info); + +acpi_handle_ost_event(mdev); break; -case 0x14: +case 0x14: /* set is_* fields */ mdev = mem_st-devs[mem_st-selector]; + if (data 2) { /* clear insert event */ mdev-is_inserting = false; trace_mhp_acpi_clear_insert_evt(mem_st-selector); +} else if (data 4) { /* request removal of device */ +mdev-is_enabled = false; } + +break; +default: break; } - } static const MemoryRegionOps acpi_memory_hotplug_ops = { .read = acpi_memory_hotplug_read, @@ -198,6 +250,7 @@ void acpi_memory_plug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, void acpi_memory_unplug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, DeviceState *dev, Error **errp) { +MemStatus *mdev; Error *local_err = NULL; int slot = object_property_get_int(OBJECT(dev), slot, local_err); @@ -215,6 +268,9 @@ void acpi_memory_unplug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, return; } +mdev = mem_st-devs[slot]; +mdev-is_removing = true; + /* do ACPI magic */ ar-gpe.sts[0] |= ACPI_MEMORY_HOTPLUG_STATUS; acpi_update_sci(ar, irq); diff --git a/include/hw/acpi/acpi.h b/include/hw/acpi/acpi.h index 1f678b4..e105e45 100644 --- a/include/hw/acpi/acpi.h +++ b/include/hw/acpi/acpi.h @@ -91,6 +91,20 @@ /* PM2_CNT */ #define ACPI_BITMASK_ARB_DISABLE0x0001 +/* OST_EVENT */ +#define ACPI_NOTIFY_EJECT_REQUEST 0x03 +#define ACPI_OSPM_EJECT 0x103 + +/* OST_STATUS */ +#define ACPI_SUCCESS0x0 +#define ACPI_FAILURE0x1 +#define ACPI_UNRECOGNIZED_NOTIFY
[Qemu-devel] [RFC PATCH v1 3/4] Introduce wait condition to catch guest OS memory hotplug error.
When acpi_memory_plug_cb() sends a SCI to guest OS, vcpu thread will handle it. And QEmu thread will return, who is not able to catch the error if guest OS failed to handle the SCI. Of course, if guest OS failed to handle SCI, it will give error messages. But QEmu will not output anything. Furthermore, if QEmu cannot catch these errors, applications based on QEmu, like Libvirt, will also not able to catch them. This could be trouble to end users. This patch introduces a condition variable named qemu_cond_memhp. QEmu will wait on qemu_cond_memhp after sending SCI, and vcpu thread will signal QEmu when OST status is written into ACPI register. This is used by the following patch. Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- hw/acpi/memory_hotplug.c | 38 ++ 1 file changed, 38 insertions(+) diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c index fddb0fd..38d9654 100644 --- a/hw/acpi/memory_hotplug.c +++ b/hw/acpi/memory_hotplug.c @@ -4,6 +4,12 @@ #include hw/boards.h #include trace.h #include qapi-event.h +#include qemu/thread.h +#include qemu/main-loop.h + +/* _OST called by guest OS */ +static QemuCond qemu_memhp_cond; +static QemuMutex qemu_memhp_mutex; static ACPIOSTInfo *acpi_memory_device_status(int slot, MemStatus *mdev) { @@ -93,6 +99,8 @@ static void acpi_handle_insert(MemStatus *mdev) case ACPI_UNRECOGNIZED_NOTIFY: case ACPI_INSERT_DRIVER_LOAD_FAILURE: case ACPI_INSERT_NOT_SUPPORTED: +/* Signal QEmu to continue.*/ +qemu_cond_signal(qemu_memhp_cond); case ACPI_INSERT_IN_PROGRESS: default: break; @@ -106,6 +114,9 @@ static void acpi_handle_eject(MemStatus *mdev) object_unparent(OBJECT(mdev-dimm)); mdev-is_removing = false; mdev-dimm = NULL; + +/* Signal QEmu to continue.*/ +qemu_cond_signal(qemu_memhp_cond); break; case ACPI_EJECT_IN_PROGRESS: /* For ejection triggered by hardware (device_del command), @@ -126,6 +137,9 @@ static void acpi_handle_eject(MemStatus *mdev) case ACPI_EJECT_DEPENDENCY_BUSY: mdev-is_removing = false; mdev-is_enabled = true; + +/* Signal QEmu to catch errors. */ +qemu_cond_signal(qemu_memhp_cond); break; default: break; @@ -181,12 +195,16 @@ static void acpi_memory_hotplug_write(void *opaque, hwaddr addr, uint64_t data, /* TODO: handle device remove OST event */ break; } +qemu_mutex_lock(qemu_memhp_mutex); mdev-ost_event = data; +qemu_mutex_unlock(qemu_memhp_mutex); trace_mhp_acpi_write_ost_ev(mem_st-selector, mdev-ost_event); break; case 0x8: /* _OST status */ mdev = mem_st-devs[mem_st-selector]; +qemu_mutex_lock(qemu_memhp_mutex); mdev-ost_status = data; +qemu_mutex_unlock(qemu_memhp_mutex); trace_mhp_acpi_write_ost_status(mem_st-selector, mdev-ost_status); info = acpi_memory_device_status(mem_st-selector, mdev); @@ -234,6 +252,9 @@ void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner, memory_region_init_io(state-io, owner, acpi_memory_hotplug_ops, state, acpi-mem-hotplug, ACPI_MEMORY_HOTPLUG_IO_LEN); memory_region_add_subregion(as, ACPI_MEMORY_HOTPLUG_BASE, state-io); + +qemu_cond_init(qemu_memhp_cond); +qemu_mutex_init(qemu_memhp_mutex); } void acpi_memory_plug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, @@ -265,6 +286,23 @@ void acpi_memory_plug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, /* do ACPI magic */ ar-gpe.sts[0] |= ACPI_MEMORY_HOTPLUG_STATUS; acpi_update_sci(ar, irq); + +/* When SCI is sent, wait for guest OS handling and catch errors. + * + * QEmu locked all the iothreads before handling monitor command in + * os_host_main_loop_wait(). Unlock them so that vcpu threads are able + * to handle ACPI register writing requests from guest OS. And relock + * them after we are signaled. + */ +qemu_mutex_unlock_iothread(); +qemu_mutex_lock(qemu_memhp_mutex); +/* qemu_cond_wait() will release qemu_memhp_mutex and wait, + * and will relock it when signaled. + */ +qemu_cond_wait(qemu_memhp_cond, qemu_memhp_mutex); +qemu_mutex_unlock(qemu_memhp_mutex); +qemu_mutex_lock_iothread(); + return; } -- 1.8.4.2
[Qemu-devel] [RESEND PATCH v3 7/8] pc, acpi bios: Add memory hot unplug interface.
From: Hu Tao hu...@cn.fujitsu.com This patch implements MEMORY_SLOT_EJECT_METHOD according to ACPI spec. Signed-off-by: Hu Tao hu...@cn.fujitsu.com Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- hw/i386/ssdt-mem.dsl | 5 + hw/i386/ssdt-misc.dsl| 15 ++- include/hw/acpi/pc-hotplug.h | 2 ++ 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/hw/i386/ssdt-mem.dsl b/hw/i386/ssdt-mem.dsl index 22ff5dd..1416639 100644 --- a/hw/i386/ssdt-mem.dsl +++ b/hw/i386/ssdt-mem.dsl @@ -43,6 +43,7 @@ DefinitionBlock (ssdt-mem.aml, SSDT, 0x02, BXPC, CSSDT, 0x1) External(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_STATUS_METHOD, MethodObj) External(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_OST_METHOD, MethodObj) External(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_PROXIMITY_METHOD, MethodObj) +External(\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_EJECT_METHOD, MethodObj) Scope(\_SB) { /* v-- DO NOT EDIT --v */ @@ -72,6 +73,10 @@ DefinitionBlock (ssdt-mem.aml, SSDT, 0x02, BXPC, CSSDT, 0x1) Method(_OST, 3) { \_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_OST_METHOD(_UID, Arg0, Arg1, Arg2) } + +Method(_EJ0, 1) { +\_SB.PCI0.MEMORY_HOTPLUG_DEVICE.MEMORY_SLOT_EJECT_METHOD(_UID, Arg0) +} } } } diff --git a/hw/i386/ssdt-misc.dsl b/hw/i386/ssdt-misc.dsl index 0fd4480..79228ac 100644 --- a/hw/i386/ssdt-misc.dsl +++ b/hw/i386/ssdt-misc.dsl @@ -156,6 +156,7 @@ DefinitionBlock (ssdt-misc.aml, SSDT, 0x01, BXPC, BXSSDTSUSP, 0x1) Offset(20), MEMORY_SLOT_ENABLED, 1, // 1 if enabled, read only MEMORY_SLOT_INSERT_EVENT, 1, // (read) 1 if has a insert event. (write) 1 to clear event +MEMORY_SLOT_REMOVE_EVENT, 1, // 1 if DIMM has a remove request, read only } Mutex (MEMORY_SLOT_LOCK, 0) @@ -174,11 +175,16 @@ DefinitionBlock (ssdt-misc.aml, SSDT, 0x01, BXPC, BXSSDTSUSP, 0x1) Acquire(MEMORY_SLOT_LOCK, 0x) while (LLess(Local0, MEMORY_SLOTS_NUMBER)) { Store(Local0, MEMORY_SLOT_SLECTOR) // select Local0 DIMM + If (LEqual(MEMORY_SLOT_INSERT_EVENT, One)) { // Memory device needs check MEMORY_SLOT_NOTIFY_METHOD(Local0, 1) Store(1, MEMORY_SLOT_INSERT_EVENT) } -// TODO: handle memory eject request + +If (LEqual(MEMORY_SLOT_REMOVE_EVENT, One)) { // Ejection request +MTFY(Local0, 3) +} + Add(Local0, One, Local0) // goto next DIMM } Release(MEMORY_SLOT_LOCK) @@ -278,6 +284,13 @@ DefinitionBlock (ssdt-misc.aml, SSDT, 0x01, BXPC, BXSSDTSUSP, 0x1) Store(Arg2, MEMORY_SLOT_OST_STATUS) Release(MEMORY_SLOT_LOCK) } + +Method(MEMORY_SLOT_EJECT_METHOD, 2) { +Acquire(MLCK, 0x) +Store(ToInteger(Arg0), MSEL) // select DIMM +Store(One, MEMORY_SLOT_REMOVE_EVENT) +Release(MLCK) +} } // Device() } // Scope() } diff --git a/include/hw/acpi/pc-hotplug.h b/include/hw/acpi/pc-hotplug.h index b9db295..b3770a1 100644 --- a/include/hw/acpi/pc-hotplug.h +++ b/include/hw/acpi/pc-hotplug.h @@ -42,6 +42,7 @@ #define MEMORY_SLOT_PROXIMITYMPX #define MEMORY_SLOT_ENABLED MES #define MEMORY_SLOT_INSERT_EVENT MINS +#define MEMORY_SLOT_REMOVE_EVENT MRMV #define MEMORY_SLOT_SLECTOR MSEL #define MEMORY_SLOT_OST_EVENTMOEV #define MEMORY_SLOT_OST_STATUS MOSC @@ -50,6 +51,7 @@ #define MEMORY_SLOT_CRS_METHOD MCRS #define MEMORY_SLOT_OST_METHOD MOST #define MEMORY_SLOT_PROXIMITY_METHOD MPXM +#define MEMORY_SLOT_EJECT_METHOD MDEJ #define MEMORY_SLOT_NOTIFY_METHODMTFY #define MEMORY_SLOT_SCAN_METHOD MSCN -- 1.8.4.2
[Qemu-devel] [RFC PATCH v1 1/4] Use macro to define ACPI notification event.
According to ACPI spec, device object notification values define insertion request (Device Check) as 1, and ejection request as 3. Use macro to define them. Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- hw/acpi/memory_hotplug.c | 7 +-- include/hw/acpi/acpi.h | 5 - 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c index f29715a..1b21191 100644 --- a/hw/acpi/memory_hotplug.c +++ b/hw/acpi/memory_hotplug.c @@ -155,10 +155,13 @@ static void acpi_memory_hotplug_write(void *opaque, hwaddr addr, uint64_t data, break; case 0x4: /* _OST event */ mdev = mem_st-devs[mem_st-selector]; -if (data == 1) { +switch (data) { +case ACPI_NOTIFY_DEVICE_CHECK: /* TODO: handle device insert OST event */ -} else if (data == 3) { +break; +case ACPI_NOTIFY_EJECT_REQUEST: /* TODO: handle device remove OST event */ +break; } mdev-ost_event = data; trace_mhp_acpi_write_ost_ev(mem_st-selector, mdev-ost_event); diff --git a/include/hw/acpi/acpi.h b/include/hw/acpi/acpi.h index e105e45..e8499f6 100644 --- a/include/hw/acpi/acpi.h +++ b/include/hw/acpi/acpi.h @@ -92,9 +92,12 @@ #define ACPI_BITMASK_ARB_DISABLE0x0001 /* OST_EVENT */ -#define ACPI_NOTIFY_EJECT_REQUEST 0x03 #define ACPI_OSPM_EJECT 0x103 +/* NOTIFY_EVENT */ +#define ACPI_NOTIFY_DEVICE_CHECK0x1 +#define ACPI_NOTIFY_EJECT_REQUEST 0x3 + /* OST_STATUS */ #define ACPI_SUCCESS0x0 #define ACPI_FAILURE0x1 -- 1.8.4.2
Re: [Qemu-devel] [RFC PATCH v1 0/4] Handle memory hotplug errors from guest OS.
Forgot to mention, this patch-set is based on the following patch-set: [RESEND PATCH v3 0/8] QEmu memory hot unplug support. https://www.mail-archive.com/qemu-devel@nongnu.org/msg253018.html Thanks. On 08/27/2014 04:09 PM, Tang Chen wrote: When doing memory hotplug, QEmu is not aware of guest OS error when hotplugging memory devices. Even if guest OS failed to hot-add memory, the pc-dimm device will be added to QEmu. Even if guest OS failed to hot-remove memory, QEmu will remove the pc-dimm device. An example is: for a Linux guest, the Linux kernel limited that the size of hot-added memory should be mutiple of memory section (128MB by default). If we add 130MB memory, the Linux kernel won't add it. We are not able to handle the size check in QEmu commmand line because different OS may have different limits. And also, QEmu outputs nothing but guest OS failed to hot-add memory will confuse users. We should at least report an error. So, we should report the error to users, and cancel the memory hotplug progress in QEmu. QEmu thread sends a SCI to guest OS and return immediately. The vcpu thread will emulate ACPI hardware operations. So this patch-set introduces a wait condition variable to synchronize these two threads. Tang Chen (4): Use macro to define ACPI notification event. Add event handling for memory device insertion. Introduce wait condition to catch guest OS memory hotplug error. Handle memory hotplug error from guest OS in QEmu. hw/acpi/memory_hotplug.c | 146 +-- include/hw/acpi/acpi.h | 15 - 2 files changed, 153 insertions(+), 8 deletions(-)
Re: [Qemu-devel] [0/3] target-ppc Fixes for some missing config dependencies
On Tue, Aug 26, 2014 at 10:34:10AM +0200, Paolo Bonzini wrote: Il 26/08/2014 06:30, David Gibson ha scritto: These 3 patches fix some places where things ought to depend on an existing config variable, but don't. Which header provdides the #defines? Ah, crap, none does. So effectively, my patch unconditionally disables those tests. Have I mentioned that I really hate the fact that qemu has different config variables visible to make and C. I'll rework, but probably not for a little while, since I have to focus on some kernel work instead for the time being. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson pgpBfgotY1lTB.pgp Description: PGP signature
Re: [Qemu-devel] [RFC PATCH v2 11/13] spapr_pci_vfio: Enable DDW
On Tue, Aug 26, 2014 at 06:16:51PM +1000, Alexey Kardashevskiy wrote: On 08/26/2014 05:19 PM, David Gibson wrote: On Fri, Aug 15, 2014 at 08:12:33PM +1000, Alexey Kardashevskiy wrote: This implements DDW for VFIO. Host kernel support is required for this. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- Changes: v2: * remove()/reset() callbacks use spapr_pci's ones --- hw/ppc/spapr_pci_vfio.c | 86 + 1 file changed, 86 insertions(+) diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index 11b4272..79df716 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -71,6 +71,88 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp) spapr_tce_get_iommu(tcet)); object_unref(OBJECT(tcet)); + +if (sphb-ddw_enabled) { +sphb-ddw_enabled = !!(info.flags VFIO_IOMMU_SPAPR_TCE_FLAG_DDW); This overrides an explicit ddw= set by the user, which is a bit counter-intuitive. For the user it is rather try ddw when available than do ddw. This was suggested by Alex Graf or I misunderstood his suggestion :) Uh.. never mind. I misread this, not spotting the if :). -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson pgpO9byEvxjzr.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH v2 1/1] pc-dimm: Change PCDIMMDevice-node from UINT32 to INT32, and initialize it as -1.
On 08/26/2014 10:24 PM, Andrey Korolyov wrote: .. Just to remind - Windows will not add pc dimms without populated SRAT, so imho forcing NUMA topology to be set (and whose support is required anyway from guest linux kernel in order to support ACPI hotplug) is better than approach proposed by this patch. . Hi Andrey, Sorry, I don't quite understand. If Windows won't add pc-dimm without SRAT, why is this approach not enough ? If there is no SRAT, there is no NUMA topology. Why do we need to set a NUMA topology ? Thanks.
Re: [Qemu-devel] [PATCH v2 1/1] pc-dimm: Change PCDIMMDevice-node from UINT32 to INT32, and initialize it as -1.
On 08/26/2014 10:24 PM, Andrey Korolyov wrote: .. Just to remind - Windows will not add pc dimms without populated SRAT, so imho forcing NUMA topology to be set (and whose support is required anyway from guest linux kernel in order to support ACPI hotplug) is better than approach proposed by this patch. . Hi Andrey, Sorry, I don't quite understand. If Windows won't add pc-dimm without SRAT, why is this approach not enough ? If there is no SRAT, there is no NUMA topology. Why do we need to set a NUMA topology ? Thanks.
Re: [Qemu-devel] [RFC PATCH v2 12/13] vfio: Enable DDW ioctls to VFIO IOMMU driver
On Tue, Aug 26, 2014 at 06:20:51PM +1000, Alexey Kardashevskiy wrote: On 08/26/2014 05:20 PM, David Gibson wrote: On Fri, Aug 15, 2014 at 08:12:34PM +1000, Alexey Kardashevskiy wrote: This enables DDW RTAS-related ioctls in VFIO. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru This should probably just be folded into the previous patch. It's broken without this change. It won't work but it is not broken - guest will fail to create DDW and continue using the default windows. And since the series needs attention of 2 maintainers (A. Williamson and A. Graf), it is better to draw bold line between areas :) Ah, ok, that makes sense then. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson pgpBE5MLL1UGp.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH v4 5/5] qemu-iotests: Add 093 for IO throttling
On Wed, Aug 27, 2014 at 02:19:44PM +0800, Fam Zheng wrote: On Tue, 08/26 14:50, Stefan Hajnoczi wrote: On Thu, Jun 05, 2014 at 04:47:46PM +0800, Fam Zheng wrote: +result = self.vm.qmp(block_set_io_throttle, conv_keys=False, **limits) +self.assert_qmp(result, 'return', {}) + +# Set vm clock to a known value +ns = nsec_per_sec +self.vm.qtest_cmd(clock_step %d % ns) + +# Append many requests into the throttle queue +# They drain bps_max and iops_max +# The rest requests won't get executed until qtest clock is driven +for i in range(1000): +self.vm.hmp_qemu_io(drive0, aio_read -a -q 0 512) +self.vm.hmp_qemu_io(drive0, aio_write -a -q 0 512) + +start_rd_bytes, start_rd_iops, start_wr_bytes, start_wr_iops = self.blockstats('drive0') + +ns += seconds * nsec_per_sec +self.vm.qtest_cmd(clock_step %d % ns) +# wait for a while to let requests take off +time.sleep(1) This is not a reliable testing approach. If the system is under heavy load maybe only a few requests completed. We don't know whether that is due to I/O throttling or not. A reliable test would not perform real disk I/O so the test is independent of disk/system speed. And it would not use time.sleep(1) to wait since there is no guarantee that anything happened in the meantime. Do you think this can be improved? Throttling only sets the upper limit of IO, this test checks the IO doesn't cross it: when the test fails, something must be wrong with the throttling; when the check passes, we can't guarantee that everything is correct. That's not uncommon across many other test cases we have. The other half is very hard, because of host load, etc., which are out of control of this script. Regarding to disk IO, I've submitted a separate patch, the null:// protocol, which can be used to sidestep the host disk load uncertainty. I can base this test on top. If both a fake disk and fake timers are used then the execution is deterministic. I'll take a look at the null:// protocol you have posted. Although I would love this test to be deterministic, I understand it is tricky to achieve that. I'd like to discuss making this deterministic but I'm not opposed to merging the current approach. Stefan pgpJp8DWeoagx.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH v2 1/1] pc-dimm: Change PCDIMMDevice-node from UINT32 to INT32, and initialize it as -1.
On Wed, Aug 27, 2014 at 12:39 PM, tangchen tangc...@cn.fujitsu.com wrote: On 08/26/2014 10:24 PM, Andrey Korolyov wrote: .. Just to remind - Windows will not add pc dimms without populated SRAT, so imho forcing NUMA topology to be set (and whose support is required anyway from guest linux kernel in order to support ACPI hotplug) is better than approach proposed by this patch. . Hi Andrey, Sorry, I don't quite understand. If Windows won't add pc-dimm without SRAT, why is this approach not enough ? If there is no SRAT, there is no NUMA topology. Why do we need to set a NUMA topology ? Thanks. Sorry, probably it was a bit unclear. I thnk that there should be no cases when dimm is plugged (and check from patch is fired up) without actually populated NUMA, because not every OS will workaround this by faking the node.
Re: [Qemu-devel] [PATCH 0/2] pflash (UEFI varstore) migration shortcut for libvirt
On Sat, Aug 23, 2014 at 12:19:05PM +0200, Laszlo Ersek wrote: Libvirt is growing support for x86_64 OVMF guests: http://www.redhat.com/archives/libvir-list/2014-August/msg01045.html An important feature of such guests is the persistent store for non-volatile UEFI variables. This is implemented with if=pflash drives. The referenced libvirt patchset sets up the varstore files for single-host use. Wrt. migration, two choices have been considered: (a) full-blown live storage migration for the drives backing pflash devices, (b) vs. a shortcut that exploits the special nature of pflash drives (namely, their minuscule size, and a RAMBlock that keeps the full contents of each pflash drive visible to the guest, and is up-to-date, at all times.) So, IIUC, with option b), libvirt will merely need to make sure that the NVRAM var store file exists with the right size. QEMU will then just 'do the right thing' for migration copying across the contents ? If so that sounds nice to me. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [PATCH 1/2] qapi.py: avoid Python 2.5+ any() function
On Fri, Aug 15, 2014 at 10:43:24AM +0300, Riku Voipio wrote: On Tue, Aug 12, 2014 at 01:37:33PM +0100, Stefan Hajnoczi wrote: There is one instance of any() in qapi.py that breaks builds on older distros that ship Python 2.4 (like RHEL5): GEN qmp-commands.h Traceback (most recent call last): File build/scripts/qapi-commands.py, line 445, in ? exprs = parse_schema(input_file) File build/scripts/qapi.py, line 329, in parse_schema schema = QAPISchema(open(input_file, r)) File build/scripts/qapi.py, line 110, in __init__ if any(include_path == elem[1] NameError: global name 'any' is not defined I tried building on RHEL5, and this patch gets a bit more forward. However further down the build I get a similar error: Traceback (most recent call last): File /build/qemu/scripts/tracetool.py, line 139, in ? main(sys.argv) File /build/qemu/scripts/tracetool.py, line 134, in main binary=binary, probe_prefix=probe_prefix) File /build/qemu/scripts/tracetool/__init__.py, line 267, in generate backend = tracetool.backend.Wrapper(backends, format) File /build/qemu/scripts/tracetool/backend/__init__.py, line 105, in __init__ assert all(exists(backend) for backend in self._backends) semi-related - since I'm building --disable-system --disable-tools --enable-user, is there any benefit of tracetool for this build config? Tracetool is part of the build process, there is no way to skip it. Stefan pgpI8cNvmuJsw.pgp Description: PGP signature
[Qemu-devel] [PATCH 1/1] hw/pci-assign: split pci-assign.c
We will try to reuse assign_dev_load_option_rom in xen side, and especially its a good beginning to unify pci assign codes both on kvm and xen in the future. Signed-off-by: Tiejun Chen tiejun.c...@intel.com --- hw/i386/kvm/pci-assign.c| 170 +--- include/hw/pci/pci_assign.h | 204 2 files changed, 207 insertions(+), 167 deletions(-) create mode 100644 include/hw/pci/pci_assign.h diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c index 17c7d6dc..a4fe2d6 100644 --- a/hw/i386/kvm/pci-assign.c +++ b/hw/i386/kvm/pci-assign.c @@ -37,109 +37,7 @@ #include hw/pci/pci.h #include hw/pci/msi.h #include kvm_i386.h - -#define MSIX_PAGE_SIZE 0x1000 - -/* From linux/ioport.h */ -#define IORESOURCE_IO 0x0100 /* Resource type */ -#define IORESOURCE_MEM 0x0200 -#define IORESOURCE_IRQ 0x0400 -#define IORESOURCE_DMA 0x0800 -#define IORESOURCE_PREFETCH 0x2000 /* No side effects */ -#define IORESOURCE_MEM_64 0x0010 - -//#define DEVICE_ASSIGNMENT_DEBUG - -#ifdef DEVICE_ASSIGNMENT_DEBUG -#define DEBUG(fmt, ...) \ -do { \ -fprintf(stderr, %s: fmt, __func__ , __VA_ARGS__); \ -} while (0) -#else -#define DEBUG(fmt, ...) -#endif - -typedef struct PCIRegion { -int type; /* Memory or port I/O */ -int valid; -uint64_t base_addr; -uint64_t size;/* size of the region */ -int resource_fd; -} PCIRegion; - -typedef struct PCIDevRegions { -uint8_t bus, dev, func; /* Bus inside domain, device and function */ -int irq;/* IRQ number */ -uint16_t region_number; /* number of active regions */ - -/* Port I/O or MMIO Regions */ -PCIRegion regions[PCI_NUM_REGIONS - 1]; -int config_fd; -} PCIDevRegions; - -typedef struct AssignedDevRegion { -MemoryRegion container; -MemoryRegion real_iomem; -union { -uint8_t *r_virtbase; /* mmapped access address for memory regions */ -uint32_t r_baseport; /* the base guest port for I/O regions */ -} u; -pcibus_t e_size;/* emulated size of region in bytes */ -pcibus_t r_size;/* real size of region in bytes */ -PCIRegion *region; -} AssignedDevRegion; - -#define ASSIGNED_DEVICE_PREFER_MSI_BIT 0 -#define ASSIGNED_DEVICE_SHARE_INTX_BIT 1 - -#define ASSIGNED_DEVICE_PREFER_MSI_MASK (1 ASSIGNED_DEVICE_PREFER_MSI_BIT) -#define ASSIGNED_DEVICE_SHARE_INTX_MASK (1 ASSIGNED_DEVICE_SHARE_INTX_BIT) - -typedef struct MSIXTableEntry { -uint32_t addr_lo; -uint32_t addr_hi; -uint32_t data; -uint32_t ctrl; -} MSIXTableEntry; - -typedef enum AssignedIRQType { -ASSIGNED_IRQ_NONE = 0, -ASSIGNED_IRQ_INTX_HOST_INTX, -ASSIGNED_IRQ_INTX_HOST_MSI, -ASSIGNED_IRQ_MSI, -ASSIGNED_IRQ_MSIX -} AssignedIRQType; - -typedef struct AssignedDevice { -PCIDevice dev; -PCIHostDeviceAddress host; -uint32_t dev_id; -uint32_t features; -int intpin; -AssignedDevRegion v_addrs[PCI_NUM_REGIONS - 1]; -PCIDevRegions real_device; -PCIINTxRoute intx_route; -AssignedIRQType assigned_irq_type; -struct { -#define ASSIGNED_DEVICE_CAP_MSI (1 0) -#define ASSIGNED_DEVICE_CAP_MSIX (1 1) -uint32_t available; -#define ASSIGNED_DEVICE_MSI_ENABLED (1 0) -#define ASSIGNED_DEVICE_MSIX_ENABLED (1 1) -#define ASSIGNED_DEVICE_MSIX_MASKED (1 2) -uint32_t state; -} cap; -uint8_t emulate_config_read[PCI_CONFIG_SPACE_SIZE]; -uint8_t emulate_config_write[PCI_CONFIG_SPACE_SIZE]; -int msi_virq_nr; -int *msi_virq; -MSIXTableEntry *msix_table; -hwaddr msix_table_addr; -uint16_t msix_max; -MemoryRegion mmio; -char *configfd_name; -int32_t bootindex; -} AssignedDevice; +#include hw/pci/pci_assign.h static void assigned_dev_update_irq_routing(PCIDevice *dev); @@ -1891,72 +1789,10 @@ static void assign_register_types(void) type_init(assign_register_types) -/* - * Scan the assigned devices for the devices that have an option ROM, and then - * load the corresponding ROM data to RAM. If an error occurs while loading an - * option ROM, we just ignore that option ROM and continue with the next one. - */ static void assigned_dev_load_option_rom(AssignedDevice *dev) { -char name[32], rom_file[64]; -FILE *fp; -uint8_t val; -struct stat st; void *ptr; -/* If loading ROM from file, pci handles it */ -if (dev-dev.romfile || !dev-dev.rom_bar) { -return; -} - -snprintf(rom_file, sizeof(rom_file), - /sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom, - dev-host.domain, dev-host.bus, dev-host.slot, - dev-host.function); - -if (stat(rom_file, st)) { -return; -} - -if (access(rom_file, F_OK)) { -error_report(pci-assign: Insufficient privileges for %s, rom_file); -
[Qemu-devel] [PATCH 0/1] qemu:pci-assign: try to pci-assign.c
As you know I'm working on supporting IGD passthrough. Here we need load VGABIOS to work out IGD case. Obviously something may be duplicated to kvm codes, we should unify some codes but looks its not easy to finish that in short time. So as Michael suggestion, at least we'd better split assigned_dev_load_option_rom to reuse on both kvm and xen. I don't finish all IGD stuff patches but here I'd like to post some related codes to show how to use assigned_dev_load_option_rom() lately. +static int get_vgabios(XenPCIPassthroughState *s, void *ptr, + XenHostPCIDevice *dev) +{ +int size = 0; + +size = dev_load_option_rom(s-dev, OBJECT(dev), ptr, dev-domain, + dev-bus, dev-dev, dev-func); + +return size; +} + +int xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev) +{ +void *bios = NULL; +int bios_size = 0; +int rc = 0; + +if (!is_vga_passthrough(dev)) { +return rc; +} + +bios_size = get_vgabios(s, bios, dev); +if (!bios || !bios_size) { +XEN_PT_ERR(NULL, VGA: getting VBIOS!\n); +rc = -1; +goto out; +} ... Tiejun Chen (1): hw/pci-assign: split pci-assign.c hw/i386/kvm/pci-assign.c| 170 +++-- include/hw/pci/pci_assign.h | 204 + 2 files changed, 207 insertions(+), 167 deletions(-) create mode 100644 include/hw/pci/pci_assign.h Thanks Tiejun
Re: [Qemu-devel] [PATCH 0/2] pflash (UEFI varstore) migration shortcut for libvirt
On 08/27/14 10:58, Daniel P. Berrange wrote: On Sat, Aug 23, 2014 at 12:19:05PM +0200, Laszlo Ersek wrote: Libvirt is growing support for x86_64 OVMF guests: http://www.redhat.com/archives/libvir-list/2014-August/msg01045.html An important feature of such guests is the persistent store for non-volatile UEFI variables. This is implemented with if=pflash drives. The referenced libvirt patchset sets up the varstore files for single-host use. Wrt. migration, two choices have been considered: (a) full-blown live storage migration for the drives backing pflash devices, (b) vs. a shortcut that exploits the special nature of pflash drives (namely, their minuscule size, and a RAMBlock that keeps the full contents of each pflash drive visible to the guest, and is up-to-date, at all times.) So, IIUC, with option b), libvirt will merely need to make sure that the NVRAM var store file exists with the right size. QEMU will then just 'do the right thing' for migration copying across the contents ? If so that sounds nice to me. Yes, that is the case. Michal's patchset for libvirt pre-creates the empty varstore on the incoming side, from the template file. That code path is shared with the case when the VM starts up as a non-incoming one, and just has no private varstore yet. The varstore created (from the template) is empty in the PI / UEFI sense, but it's not all zeroes; it has some internal structure. Instantiating the VM-private varstore from the template is - required for both size *and* contents reasons for a non-incoming startup, and - required for size reasons *only* for an incoming startup. http://www.redhat.com/archives/libvir-list/2014-August/msg01048.html +/* If the nvram path is configured already, there's nothing + * we need to do. Unless we are starting the destination side + * of migration in which case nvram is configured in the + * domain XML but the file doesn't exist yet. Moreover, after + * the migration is completed, qemu will invoke a + * synchronization write into the nvram file so we don't have + * to take care about transmitting the real data on the other + * side. */ +if (loader-nvram !migrated) +return 0; Once the pflash_post_load() hook is called in the incoming qemu process, the VM-private varstore (which libvirt has prepared from the template) will be completely overwritten by QEMU. But the size of the file is important information, for starting up the incoming qemu side. (Any mismatch in size would be caught when the backing RAMBlock's size is verified during migration.) Thanks Laszlo
[Qemu-devel] [PATCH] hmp: Add info machines
This is the hmp counterpart of qmp query_machines Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com --- hmp-commands.hx | 2 ++ hmp.c | 22 ++ hmp.h | 1 + monitor.c | 7 +++ 4 files changed, 32 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index d0943b1..1d04235 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1780,6 +1780,8 @@ show qdev device model list show roms @item info tpm show the TPM device +@item info machines +show supported machines information @end table ETEXI diff --git a/hmp.c b/hmp.c index 4d1838e..603f2f5 100644 --- a/hmp.c +++ b/hmp.c @@ -725,6 +725,28 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict) qapi_free_TPMInfoList(info_list); } +void hmp_info_machines(Monitor *mon, const QDict *qdict) +{ +MachineInfoList *mach_list = NULL, *cur_item = NULL; + +mach_list = qmp_query_machines(NULL); +for (cur_item = mach_list; cur_item; cur_item = cur_item-next) { +MachineInfo *machine; +char default_chr = ' '; + +machine = cur_item-value; +if (machine-is_default) { +default_chr = '*'; +} +monitor_printf(mon, %c %s:\n, default_chr, machine-name); +monitor_printf(mon, max_cpus: % PRId64 \n, machine-cpu_max); +if (machine-has_alias) { +monitor_printf(mon, alias: %s\n, machine-alias); +} +} +qapi_free_MachineInfoList(mach_list); +} + void hmp_quit(Monitor *mon, const QDict *qdict) { monitor_suspend(mon); diff --git a/hmp.h b/hmp.h index 4fd3c4a..374e841 100644 --- a/hmp.h +++ b/hmp.h @@ -38,6 +38,7 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict); void hmp_info_pci(Monitor *mon, const QDict *qdict); void hmp_info_block_jobs(Monitor *mon, const QDict *qdict); void hmp_info_tpm(Monitor *mon, const QDict *qdict); +void hmp_info_machines(Monitor *mon, const QDict *qdict); void hmp_quit(Monitor *mon, const QDict *qdict); void hmp_stop(Monitor *mon, const QDict *qdict); void hmp_system_reset(Monitor *mon, const QDict *qdict); diff --git a/monitor.c b/monitor.c index 34cee74..adc3645 100644 --- a/monitor.c +++ b/monitor.c @@ -2921,6 +2921,13 @@ static mon_cmd_t info_cmds[] = { .mhandler.cmd = hmp_info_memdev, }, { +.name = machines, +.args_type = , +.params = , +.help = show supported machines information, +.mhandler.cmd = hmp_info_machines, +}, +{ .name = NULL, }, }; -- 1.7.12.4
Re: [Qemu-devel] [RFC PATCH v2 02/13] spapr_iommu: Disable in-kernel IOMMU tables for 4GB windows
On 15.08.14 12:12, Alexey Kardashevskiy wrote: The existing KVM_CREATE_SPAPR_TCE ioctl only support 4G windows max. We are going to add huge DMA windows support so this will create small window and unexpectedly fail later. This disables KVM_CREATE_SPAPR_TCE for windows bigger that 4GB. Since those windows are normally mapped at the boot time, there will be no performance impact. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/ppc/spapr_iommu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index f6e32a4..36f5d27 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -113,11 +113,11 @@ static MemoryRegionIOMMUOps spapr_iommu_ops = { static int spapr_tce_table_realize(DeviceState *dev) { sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev); +uint64_t window_size = tcet-nb_table tcet-page_shift; -if (kvm_enabled()) { +if (kvm_enabled() !(window_size 32)) { Please add a comment here that explains the reasoning. Alex
Re: [Qemu-devel] [PATCH 0/1] qemu:pci-assign: try to pci-assign.c
On 08/27/14 11:13, Tiejun Chen wrote: As you know I'm working on supporting IGD passthrough. Here we need load VGABIOS to work out IGD case. Obviously something may be duplicated to kvm codes, we should unify some codes but looks its not easy to finish that in short time. So as Michael suggestion, at least we'd better split assigned_dev_load_option_rom to reuse on both kvm and xen. I don't finish all IGD stuff patches but here I'd like to post some related codes to show how to use assigned_dev_load_option_rom() lately. +static int get_vgabios(XenPCIPassthroughState *s, void *ptr, + XenHostPCIDevice *dev) +{ +int size = 0; + +size = dev_load_option_rom(s-dev, OBJECT(dev), ptr, dev-domain, + dev-bus, dev-dev, dev-func); + +return size; +} + +int xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev) +{ +void *bios = NULL; +int bios_size = 0; +int rc = 0; + +if (!is_vga_passthrough(dev)) { +return rc; +} + +bios_size = get_vgabios(s, bios, dev); +if (!bios || !bios_size) { +XEN_PT_ERR(NULL, VGA: getting VBIOS!\n); +rc = -1; +goto out; +} ... Tiejun Chen (1): hw/pci-assign: split pci-assign.c hw/i386/kvm/pci-assign.c| 170 +++-- include/hw/pci/pci_assign.h | 204 + 2 files changed, 207 insertions(+), 167 deletions(-) create mode 100644 include/hw/pci/pci_assign.h Thanks Tiejun Just to make it clear: I'm assuming that you CC'd me because get-maintainer.pl listed my name, as a committer. I had one series in this area that improved error message propagation and error reporting. I have minimal knowledge about device assignment proper, so I will not review your patch. Please don't wait for my review, and if you have further patches in the area, there's no need to CC me -- I can't help. Thanks, Laszlo
Re: [Qemu-devel] [RFC PATCH v2 05/13] spapr_pci: Introduce a liobn number generating macros
On 15.08.14 12:12, Alexey Kardashevskiy wrote: We are going to have multiple DMA windows per PHB and we want them to migrate so we need a predictable way of assigning LIOBNs. This introduces a macro which makes up a LIOBN from fixed prefix, PHB index (unique PHB id) and window number. This introduces a SPAPR_PCI_DMA_WINDOW_NUM() to know the window number from LIOBN, will be used in next patch(es) to distinguish the default 32bit windows from dynamic windows. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/ppc/spapr_pci.c | 2 +- include/hw/ppc/spapr.h | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 5c46c0d..17eb0d8 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -529,7 +529,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) } sphb-buid = SPAPR_PCI_BASE_BUID + sphb-index; -sphb-dma_liobn = SPAPR_PCI_BASE_LIOBN + sphb-index; +sphb-dma_liobn = SPAPR_PCI_LIOBN(sphb-index, 0); windows_base = SPAPR_PCI_WINDOW_BASE + sphb-index * SPAPR_PCI_WINDOW_SPACING; diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h index c9d6c6c..782519a 100644 --- a/include/hw/ppc/spapr.h +++ b/include/hw/ppc/spapr.h @@ -443,7 +443,8 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr, #define SPAPR_TCE_PAGE_MASK(SPAPR_TCE_PAGE_SIZE - 1) #define SPAPR_VIO_BASE_LIOBN0x -#define SPAPR_PCI_BASE_LIOBN0x8000 +#define SPAPR_PCI_LIOBN(i, n) (0x8000 | ((i) 8) | (n)) +#define SPAPR_PCI_DMA_WINDOW_NUM(liobn) ((liobn) 0xff) Good idea. Mind to convert VIO as well to make them consistent? Alex
Re: [Qemu-devel] [PATCH 0/1] qemu:pci-assign: try to pci-assign.c
On 2014/8/27 17:29, Laszlo Ersek wrote: On 08/27/14 11:13, Tiejun Chen wrote: As you know I'm working on supporting IGD passthrough. Here we need load VGABIOS to work out IGD case. Obviously something may be duplicated to kvm codes, we should unify some codes but looks its not easy to finish that in short time. So as Michael suggestion, at least we'd better split assigned_dev_load_option_rom to reuse on both kvm and xen. I don't finish all IGD stuff patches but here I'd like to post some related codes to show how to use assigned_dev_load_option_rom() lately. +static int get_vgabios(XenPCIPassthroughState *s, void *ptr, + XenHostPCIDevice *dev) +{ +int size = 0; + +size = dev_load_option_rom(s-dev, OBJECT(dev), ptr, dev-domain, + dev-bus, dev-dev, dev-func); + +return size; +} + +int xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev) +{ +void *bios = NULL; +int bios_size = 0; +int rc = 0; + +if (!is_vga_passthrough(dev)) { +return rc; +} + +bios_size = get_vgabios(s, bios, dev); +if (!bios || !bios_size) { +XEN_PT_ERR(NULL, VGA: getting VBIOS!\n); +rc = -1; +goto out; +} ... Tiejun Chen (1): hw/pci-assign: split pci-assign.c hw/i386/kvm/pci-assign.c| 170 +++-- include/hw/pci/pci_assign.h | 204 + 2 files changed, 207 insertions(+), 167 deletions(-) create mode 100644 include/hw/pci/pci_assign.h Thanks Tiejun Just to make it clear: I'm assuming that you CC'd me because get-maintainer.pl listed my name, as a committer. I had one series in Yes. this area that improved error message propagation and error reporting. Was that acked? If yes, I think I need to rebase on yours. I have minimal knowledge about device assignment proper, so I will not review your patch. Please don't wait for my review, and if you have further patches in the area, there's no need to CC me -- I can't help. Thanks for your info. Tiejun
Re: [Qemu-devel] [question] e1000 interrupt storm happened becauseofits correspondingioapic-irr bit always set
Hi, all I use a qemu-1.4.1/qemu-2.0.0 to run win7 guest, and encounter e1000 NIC interrupt storm, because if (!ent-fields.mask (ioapic-irr (1 i))) is always true in __kvm_ioapic_update_eoi(). Any ideas? We meet this several times: search the autoneg patches for an example of workaround for this in qemu, and patch kvm: ioapic: conditionally delay irq delivery during eoi broadcast for an workaround in kvm (rejected). Thanks, Jason, I searched e1000 autoneg in gmane.comp.emulators.qemu, and found below patches, http://thread.gmane.org/gmane.comp.emulators.qemu/143001/focus=143007 This series is the first try to fix the guest hang during guest hibernation or driver enable/disable. http://thread.gmane.org/gmane.comp.emulators.qemu/284105/focus=284765 http://thread.gmane.org/gmane.comp.emulators.qemu/186159/focus=187351 Those are follow-up that tries to fix the bugs introduced by the autoneg hack. which one tries to fix this problem, or all of them? As you can see, those kinds of hacking may not as good as we expect since we don't know exactly how e1000 works. Only the register function description from Intel's manual may not be sufficient. And you can search e1000 in the archives and you can find some behaviour of e1000 registers were not fictionalized like what spec said. It was really suggested to use virtio-net instead of e1000 in guest. Will the [PATCH] kvm: ioapic: conditionally delay irq delivery during eoi broadcast add delay to virtual interrupt injection sometimes, then some time delay sensitive applications will be impacted? I don't test it too much but it only give a minor delay of 1% irq in the hope of guest irq handler will be registered shortly. But I suspect it's the bug of e1000 who inject the irq in the wrong time. Under what cases did you meet this issue? Some scenarios, not constant and 100% reproducity, e.g., reboot vm, ifdown e1000 nic, install kaspersky(network configuration is performed during installing stage), .etc. Thanks, Zhang Haoyu Thanks, Zhang Haoyu
Re: [Qemu-devel] [RFC PATCH v2 07/13] spapr_rtas: Add Dynamic DMA windows (DDW) RTAS calls support
On 15.08.14 12:12, Alexey Kardashevskiy wrote: This adds support for Dynamic DMA Windows (DDW) option defined by the SPAPR specification which allows to have additional DMA window(s) which can support page sizes other than 4K. The existing implementation of DDW in the guest tries to create one huge DMA window with 64K or 16MB pages and map the entire guest RAM to. If it succeeds, the guest switches to dma_direct_ops and never calls TCE hypercalls (H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM and not waste time on map/unmap. This adds 4 RTAS handlers: * ibm,query-pe-dma-window * ibm,create-pe-dma-window * ibm,remove-pe-dma-window * ibm,reset-pe-dma-window These are registered from type_init() callback. These RTAS handlers are implemented in a separate file to avoid polluting spapr_iommu.c with PHB. Since no PHB class implements new callback in this patch, no functional change is expected. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- Changes: v2: * double loop squashed to spapr_iommu_fixmask() helper * added @ddw_num counter to PHB, it is used to generate LIOBN for new window; it is reset on ddw-reset event * added ULL to constants used in shift operations * rtas_ibm_reset_pe_dma_window() and rtas_ibm_remove_pe_dma_window() do not remove windows anymore, the PHB callback has to as it will reuse the same code in case of guest reboot as well --- hw/ppc/Makefile.objs| 3 + hw/ppc/spapr_pci.c | 3 +- hw/ppc/spapr_rtas_ddw.c | 283 include/hw/pci-host/spapr.h | 19 +++ include/hw/ppc/spapr.h | 6 +- trace-events| 4 + 6 files changed, 316 insertions(+), 2 deletions(-) create mode 100644 hw/ppc/spapr_rtas_ddw.c diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs index edd44d0..9773294 100644 --- a/hw/ppc/Makefile.objs +++ b/hw/ppc/Makefile.objs @@ -7,6 +7,9 @@ obj-$(CONFIG_PSERIES) += spapr_pci.o ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy) obj-y += spapr_pci_vfio.o endif +ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES), yy) +obj-y += spapr_rtas_ddw.o +endif # PowerPC 4xx boards obj-y += ppc405_boards.o ppc4xx_devs.o ppc405_uc.o ppc440_bamboo.o obj-y += ppc4xx_pci.o diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index aa20c36..9b03d0d 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -759,7 +759,7 @@ static int spapr_pci_post_load(void *opaque, int version_id) static const VMStateDescription vmstate_spapr_pci = { .name = spapr_pci, -.version_id = 2, +.version_id = 3, .minimum_version_id = 2, .pre_save = spapr_pci_pre_save, .post_load = spapr_pci_post_load, @@ -775,6 +775,7 @@ static const VMStateDescription vmstate_spapr_pci = { VMSTATE_INT32(msi_devs_num, sPAPRPHBState), VMSTATE_STRUCT_VARRAY_ALLOC(msi_devs, sPAPRPHBState, msi_devs_num, 0, vmstate_spapr_pci_msi, spapr_pci_msi_mig), +VMSTATE_UINT32_V(ddw_num, sPAPRPHBState, 3), VMSTATE_END_OF_LIST() }, }; diff --git a/hw/ppc/spapr_rtas_ddw.c b/hw/ppc/spapr_rtas_ddw.c new file mode 100644 index 000..2b5376a --- /dev/null +++ b/hw/ppc/spapr_rtas_ddw.c @@ -0,0 +1,283 @@ +/* + * QEMU sPAPR Dynamic DMA windows support + * + * Copyright (c) 2014 Alexey Kardashevskiy, IBM Corporation. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, + * or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see http://www.gnu.org/licenses/. + */ + +#include hw/ppc/spapr.h +#include hw/pci-host/spapr.h +#include trace.h + +static uint32_t spapr_iommu_fixmask(struct ppc_one_seg_page_size *sps, +uint32_t query_mask) +{ +int i, j; +uint32_t mask = 0; +const struct { int shift; uint32_t mask; } masks[] = { +{ 12, DDW_PGSIZE_4K }, +{ 16, DDW_PGSIZE_64K }, +{ 24, DDW_PGSIZE_16M }, +{ 25, DDW_PGSIZE_32M }, +{ 26, DDW_PGSIZE_64M }, +{ 27, DDW_PGSIZE_128M }, +{ 28, DDW_PGSIZE_256M }, +{ 34, DDW_PGSIZE_16G }, +}; + +for (i = 0; i PPC_PAGE_SIZES_MAX_SZ; i++) { +for (j = 0; j ARRAY_SIZE(masks); ++j) { +if ((sps[i].page_shift == masks[j].shift) +(query_mask masks[j].mask)) { +mask |=
Re: [Qemu-devel] [PATCH 0/1] qemu:pci-assign: try to pci-assign.c
On 08/27/14 11:34, Chen, Tiejun wrote: On 2014/8/27 17:29, Laszlo Ersek wrote: Just to make it clear: I'm assuming that you CC'd me because get-maintainer.pl listed my name, as a committer. I had one series in Yes. this area that improved error message propagation and error reporting. Was that acked? If yes, I think I need to rebase on yours. No need to rebase, the series was commited ages ago. get_maintainer.pl gave you my name exactly because my commits are in-tree already. Thanks Laszlo
Re: [Qemu-devel] [PATCH 0/1] qemu:pci-assign: try to pci-assign.c
On 2014/8/27 17:39, Laszlo Ersek wrote: On 08/27/14 11:34, Chen, Tiejun wrote: On 2014/8/27 17:29, Laszlo Ersek wrote: Just to make it clear: I'm assuming that you CC'd me because get-maintainer.pl listed my name, as a committer. I had one series in Yes. this area that improved error message propagation and error reporting. Was that acked? If yes, I think I need to rebase on yours. No need to rebase, the series was commited ages ago. get_maintainer.pl gave you my name exactly because my commits are in-tree already. Oh, understood :) Tiejun
Re: [Qemu-devel] [PATCH v3] block.curl: timeout option
On Wed, Aug 13, 2014 at 12:44:26PM -0300, Daniel Henrique Barboza wrote: Changes in v3: - changed option name from 'curltimeout' to 'timeout' Changes in v2: - remove double quote from the int value in qemu-options.hx Daniel Henrique Barboza (1): block.curl: adding 'timeout' option block/curl.c| 13 - qemu-options.hx | 10 -- 2 files changed, 20 insertions(+), 3 deletions(-) Thanks, applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan pgp9nI9fx3tpV.pgp Description: PGP signature
Re: [Qemu-devel] [RFC PATCH v2 13/13] spapr: Add pseries-2.2 machine with default ddw option
On 15.08.14 12:12, Alexey Kardashevskiy wrote: This defines new pseries machine version which is capable of DDW (dynamic DMA windows) by default. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru This machine should also be the new default for -M pseries. Alex
Re: [Qemu-devel] [RFC PATCH v2 13/13] spapr: Add pseries-2.2 machine with default ddw option
On 15.08.14 12:12, Alexey Kardashevskiy wrote: This defines new pseries machine version which is capable of DDW (dynamic DMA windows) by default. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru In fact, please reverse the logic. The 2.1 machine should get ddw=off and the 2.2 machine an empty compat_props. Alex
Re: [Qemu-devel] [Qemu-ppc] [PATCH 3/5] target-ppc: Build error log
On 25.08.14 15:45, Aravinda Prasad wrote: Whenever there is a physical memory error due to bit flips, which cannot be corrected by hardware, the error is passed on to the kernel. If the memory address in error belongs to guest address space then guest kernel is responsible to take action. Hence the error is passed on to guest via KVM by invoking 0x200 NMI vector. However, guest OS, as per PAPR, expects an error log upon such error. This patch registers a new hcall which is issued from 0x200 interrupt vector and builds the error log, copies the error log to rtas space and passes the address of the error log to guest Enhancement to KVM to perform above functionality is already in upstream kernel. Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com Why do we have to maintain the error log inside the rtas blob? Can't we just create a fixed MMIO region that is backed by an error log device in QEMU? Then this call would just always return that specific address and we'd also have a single file that deals with all of the error reporting. Alex
Re: [Qemu-devel] [PATCH] trace: [ust] Fix format string computation in tcg-enabled events
LluÃs Vilanova writes: TCG-enabled events start with two format strings. Delay per-argument format computation until requested ('Event.formats'). ping Without this fix, QEMU does not compile if an entry in trace-events has the tcg property. Signed-off-by: LluÃs Vilanova vilan...@ac.upc.edu --- scripts/tracetool/__init__.py| 19 ++- scripts/tracetool/format/ust_events_h.py |2 +- 2 files changed, 11 insertions(+), 10 deletions(-) diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py index 36c789d..854fb9e 100644 --- a/scripts/tracetool/__init__.py +++ b/scripts/tracetool/__init__.py @@ -136,8 +136,7 @@ class Event(object): Properties of the event. args : Arguments The event arguments. -arg_fmts : str -The format strings for each argument. + _CRE = re.compile(((?Pprops.*)\s+)? @@ -146,11 +145,10 @@ class Event(object): \s* (?:(?:(?Pfmt_trans\.+),)?\s*(?Pfmt\.+))? \s*) -_FMT = re.compile((%\w+|%.*PRI\S+)) _VALID_PROPS = set([disable, tcg, tcg-trans, tcg-exec]) -def __init__(self, name, props, fmt, args, arg_fmts, orig=None): +def __init__(self, name, props, fmt, args, orig=None): Parameters -- @@ -162,8 +160,6 @@ class Event(object): Event printing format (or formats). args : Arguments Event arguments. -arg_fmts : list of str -Format strings for each argument. orig : Event or None Original Event before transformation. @@ -172,7 +168,6 @@ class Event(object): self.properties = props self.fmt = fmt self.args = args -self.arg_fmts = arg_fmts if orig is None: self.original = weakref.ref(self) @@ -210,7 +205,6 @@ class Event(object): if len(fmt_trans) 0: fmt = [fmt_trans, fmt] args = Arguments.build(groups[args]) -arg_fmts = Event._FMT.findall(fmt) if tcg-trans in props: raise ValueError(Invalid property 'tcg-trans') @@ -221,7 +215,7 @@ class Event(object): if tcg in props and isinstance(fmt, str): raise ValueError(Events with 'tcg' property must have two formats) -return Event(name, props, fmt, args, arg_fmts) +return Event(name, props, fmt, args) def __repr__(self): Evaluable string representation for this object. @@ -234,6 +228,13 @@ class Event(object): self.args, fmt) +_FMT = re.compile((%\w+|%.*PRI\S+)) + +def formats(self): +List of argument print formats. +assert not isinstance(self.fmt, list) +return self._FMT.findall(self.fmt) + QEMU_TRACE = trace_%(name)s QEMU_TRACE_TCG = QEMU_TRACE + _tcg diff --git a/scripts/tracetool/format/ust_events_h.py b/scripts/tracetool/format/ust_events_h.py index d189899..3e8a7cd 100644 --- a/scripts/tracetool/format/ust_events_h.py +++ b/scripts/tracetool/format/ust_events_h.py @@ -65,7 +65,7 @@ def generate(events, backend): types = e.args.types() names = e.args.names() -fmts = e.arg_fmts +fmts = e.formats() for t,n,f in zip(types, names, fmts): if ('char *' in t) or ('char*' in t): out(' ctf_string(' + n + ', ' + n + ')') -- And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer. -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth
Re: [Qemu-devel] [PATCH] hmp: Add info machines
Il 27/08/2014 11:24, zhanghailiang ha scritto: This is the hmp counterpart of qmp query_machines Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com --- hmp-commands.hx | 2 ++ hmp.c | 22 ++ hmp.h | 1 + monitor.c | 7 +++ 4 files changed, 32 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index d0943b1..1d04235 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1780,6 +1780,8 @@ show qdev device model list show roms @item info tpm show the TPM device +@item info machines +show supported machines information @end table ETEXI diff --git a/hmp.c b/hmp.c index 4d1838e..603f2f5 100644 --- a/hmp.c +++ b/hmp.c @@ -725,6 +725,28 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict) qapi_free_TPMInfoList(info_list); } +void hmp_info_machines(Monitor *mon, const QDict *qdict) +{ +MachineInfoList *mach_list = NULL, *cur_item = NULL; + +mach_list = qmp_query_machines(NULL); +for (cur_item = mach_list; cur_item; cur_item = cur_item-next) { +MachineInfo *machine; +char default_chr = ' '; + +machine = cur_item-value; +if (machine-is_default) { +default_chr = '*'; +} +monitor_printf(mon, %c %s:\n, default_chr, machine-name); +monitor_printf(mon, max_cpus: % PRId64 \n, machine-cpu_max); +if (machine-has_alias) { +monitor_printf(mon, alias: %s\n, machine-alias); +} +} +qapi_free_MachineInfoList(mach_list); +} + void hmp_quit(Monitor *mon, const QDict *qdict) { monitor_suspend(mon); diff --git a/hmp.h b/hmp.h index 4fd3c4a..374e841 100644 --- a/hmp.h +++ b/hmp.h @@ -38,6 +38,7 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict); void hmp_info_pci(Monitor *mon, const QDict *qdict); void hmp_info_block_jobs(Monitor *mon, const QDict *qdict); void hmp_info_tpm(Monitor *mon, const QDict *qdict); +void hmp_info_machines(Monitor *mon, const QDict *qdict); void hmp_quit(Monitor *mon, const QDict *qdict); void hmp_stop(Monitor *mon, const QDict *qdict); void hmp_system_reset(Monitor *mon, const QDict *qdict); diff --git a/monitor.c b/monitor.c index 34cee74..adc3645 100644 --- a/monitor.c +++ b/monitor.c @@ -2921,6 +2921,13 @@ static mon_cmd_t info_cmds[] = { .mhandler.cmd = hmp_info_memdev, }, { +.name = machines, +.args_type = , +.params = , +.help = show supported machines information, +.mhandler.cmd = hmp_info_machines, +}, +{ .name = NULL, }, }; What is this useful for? You can use -machine help. Paolo
Re: [Qemu-devel] [0/3] target-ppc Fixes for some missing config dependencies
Il 27/08/2014 10:22, David Gibson ha scritto: Have I mentioned that I really hate the fact that qemu has different config variables visible to make and C. Yeah, the introduction of config-devices.h was nixed by one of the maintainers a few years ago... I'll rework, but probably not for a little while, since I have to focus on some kernel work instead for the time being. No problem. Paolo
Re: [Qemu-devel] [PATCH 09/12] rtl8139: adding new fields to vmstate
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini Il 26/08/2014 09:15, Pavel Dovgalyuk ha scritto: This patch adds virtual clock-dependent timers to VMState to allow correct saving and restoring the state of RTL8139 network controller. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- hw/net/rtl8139.c | 50 -- 1 files changed, 48 insertions(+), 2 deletions(-) Again, this is only needed in your record/replay system (and you haven't yet quite explained why the design has this limitation), so it should not be a part of this series. I see. Updating s-timer and s-TimerExpire in post_load will be enough? Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH] curl: Add override_accept_ranges flag to force sending range requests.
On Wed, Aug 27, 2014 at 08:37:22AM +0200, Markus Armbruster wrote: Eric Blake ebl...@redhat.com writes: On 08/26/2014 08:38 PM, Fam Zheng wrote: On Tue, 08/26 21:48, Richard W.M. Jones wrote: Some servers (notably VMware ESX) accept range requests, but don't send back the Accept-Ranges: bytes header in their initial response. For these servers you can set override_accept_ranges to 'on' which forces this block driver to send range requests anyway. Is this a case where we should be naming with dashes instead of underscores, as in override-accept-ranges? Yes. Thanks for reviewing. I'm actually going to drop this for a couple of reasons: - VMware vSphere does send the header, only ESX doesn't. - When you access ESX (using this patch) eventually the ESX web server crashes. In a sense ESX was correct that it doesn't support ranges -- because it's buggy. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
Re: [Qemu-devel] [PATCH 09/12] rtl8139: adding new fields to vmstate
Il 27/08/2014 12:15, Pavel Dovgaluk ha scritto: Again, this is only needed in your record/replay system (and you haven't yet quite explained why the design has this limitation), so it should not be a part of this series. I see. Updating s-timer and s-TimerExpire in post_load will be enough? In theory it should be done already, I guess it's not deterministic enough though. Have you tried my patch to rewrite the timer stuff? Paolo
Re: [Qemu-devel] [PATCH V4] net: Forbid dealing with packets when VM is not running
zhanghailiang zhang.zhanghaili...@huawei.com wrote: For all NICs(except virtio-net) emulated by qemu, Such as e1000, rtl8139, pcnet and ne2k_pci, Qemu can still receive packets when VM is not running. If this happened in *migration's* last PAUSE VM stage, but before the end of the migration, the new receiving packets will possibly dirty parts of RAM which has been cached in *iovec*(will be sent asynchronously) and dirty parts of new RAM which will be missed. This will lead serious network fault in VM. To avoid this, we forbid receiving packets in generic net code when VM is not running. Bug reproduction steps: (1) Start a VM which configured at least one NIC (2) In VM, open several Terminal and do *Ping IP -i 0.1* (3) Migrate the VM repeatedly between two Hosts And the *PING* command in VM will very likely fail with message: 'Destination HOST Unreachable', the NIC in VM will stay unavailable unless you run 'service network restart' Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Reviewed-by: Juan Quintela quint...@redhat.com It fixes the collateral issue than info migrate after migration has ended shows a remaining ram != 0. Thanks, Juan.
Re: [Qemu-devel] [PATCH 09/12] rtl8139: adding new fields to vmstate
From: Paolo Bonzini [mailto:pbonz...@redhat.com] Il 27/08/2014 12:15, Pavel Dovgaluk ha scritto: Again, this is only needed in your record/replay system (and you haven't yet quite explained why the design has this limitation), so it should not be a part of this series. I see. Updating s-timer and s-TimerExpire in post_load will be enough? In theory it should be done already, I guess it's not deterministic enough though. I split existing function into the two parts: one sets irq (and calls another). And the second part sets only timer and TimerExpire fields. The second function is also called from post_load. Have you tried my patch to rewrite the timer stuff? What patch do you mean? Pavel Dovgalyuk
Re: [Qemu-devel] [Qemu-ppc] [PATCH 4/5] target-ppc: Handle ibm, nmi-register RTAS call
On 25.08.14 15:45, Aravinda Prasad wrote: This patch adds FWNMI support in qemu for powerKVM guests by handling the ibm,nmi-register rtas call. Whenever OS issues ibm,nmi-register RTAS call, the machine check notification address is saved and the machine check interrupt vector 0x200 is patched to issue a private hcall. Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com --- hw/ppc/spapr_rtas.c| 91 include/hw/ppc/spapr.h |8 2 files changed, 98 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c index 02ddbf9..1135d2b 100644 --- a/hw/ppc/spapr_rtas.c +++ b/hw/ppc/spapr_rtas.c @@ -277,6 +277,91 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU *cpu, rtas_st(rets, 0, ret); } +static void rtas_ibm_nmi_register(PowerPCCPU *cpu, + sPAPREnvironment *spapr, + uint32_t token, uint32_t nargs, + target_ulong args, + uint32_t nret, target_ulong rets) +{ +int i; +uint32_t branch_inst = 0x4802; +target_ulong guest_machine_check_addr; +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu); +/* + * Trampoline saves r3 in sprg2 and issues private hcall + * to request qemu to build error log. QEMU builds the + * error log, copies to rtas-blob and returns the address. + * The initial 16 bytes in rtas-blob consists of saved srr0 + * and srr1 which we restore and pass on the actual error + * log address to OS handled mcachine check notification + * routine + */ +uint32_t trampoline[] = { +0x7c7243a6,/* mtspr SPRN_SPRG2,r3 */ +0x3860,/* li r3,0 */ +/* 0xf004 is the KVMPPC_H_REPORT_ERR private HCALL */ +0x6063f004,/* ori r3,r3,f004 */ +/* Issue H_CALL */ +0x4422,/* sc 1 */ So up to here we're saving r3 in SPRG2 (how do we know that we can clobber it?) and call our special hypercall. But what does all the cruft below here do? +0x7c9243a6,/* mtspr r4 sprg2 */ Apart from th fact that your order is wrong, this destroys the value of r3 that we saved above again. +0xe883,/* ld r4, 0(r3) */ +0x7c9a03a6,/* mtspr r4, srr0 */ +0xe8830008,/* ld r4, 8(r3) */ +0x7c9b03a6,/* mtspr r4, srr1 */ Can't we just set srr0 and srr1 directly? +0x38630010,/* addi r3,r3,16 */ +0x7c9242a6,/* mfspr r4 sprg2 */ +0x4802,/* Branch to address registered +* by OS. The branch address is +* patched below */ +0x4800,/* b . */ +}; +int total_inst = sizeof(trampoline) / sizeof(uint32_t); + +/* Store the system reset and machine check address */ +guest_machine_check_addr = rtas_ld(args, 1); + +/* Safety Check */ +if (sizeof(trampoline) = MC_INTERRUPT_VECTOR_SIZE) { +fprintf(stderr, Unable to register ibm,nmi_register: +Trampoline size exceeded\n); +rtas_st(rets, 0, RTAS_OUT_NOT_SUPPORTED); +return; +} + +/* + * Update the branch instruction in trampoline with the absolute + * machine check address requested by OS + */ +branch_inst |= guest_machine_check_addr; Does this even work? You're creating a relative branch here. Alex +memcpy(trampoline[11], branch_inst, sizeof(branch_inst)); + +/* Handle all Host/Guest LE/BE combinations */ +if ((*pcc-interrupts_big_endian)(cpu)) { +for (i = 0; i total_inst; i++) { +trampoline[i] = cpu_to_be32(trampoline[i]); +} +} else { +for (i = 0; i total_inst; i++) { +trampoline[i] = cpu_to_le32(trampoline[i]); +} +} + +/* Patch 0x200 NMI interrupt vector memory area of guest */ +cpu_physical_memory_write(MC_INTERRUPT_VECTOR, trampoline, + sizeof(trampoline)); + +rtas_st(rets, 0, RTAS_OUT_SUCCESS); +} + +static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu, + sPAPREnvironment *spapr, + uint32_t token, uint32_t nargs, + target_ulong args, + uint32_t nret, target_ulong rets) +{ +rtas_st(rets, 0, RTAS_OUT_SUCCESS); +} + static struct rtas_call { const char *name; spapr_rtas_fn fn; @@ -404,6 +489,12 @@ static void core_rtas_register_types(void) spapr_rtas_register(RTAS_IBM_SET_SYSTEM_PARAMETER, ibm,set-system-parameter, rtas_ibm_set_system_parameter); +spapr_rtas_register(RTAS_IBM_NMI_REGISTER, +ibm,nmi-register, +
Re: [Qemu-devel] [Qemu-ppc] [PATCH 5/5] target-ppc: Handle cases when multi-processors get machine-check
On 25.08.14 15:45, Aravinda Prasad wrote: It is possible for multi-processors to experience machine check at or about the same time. As per PAPR, subsequent processors serialize waiting for the first processor to issue the ibm,nmi-interlock call. The second processor retries if the first processor which received a machine check is still reading the error log and is yet to issue ibm,nmi-interlock call. This patch implements this functionality. Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com This patch doesn't make any sense. Both threads will issue an HCALL which will get locked inside of QEMU, so we'll never see the case where both hypercalls get processed at the same time. Alex
Re: [Qemu-devel] [PATCH 09/12] rtl8139: adding new fields to vmstate
Il 27/08/2014 12:30, Pavel Dovgaluk ha scritto: From: Paolo Bonzini [mailto:pbonz...@redhat.com] Il 27/08/2014 12:15, Pavel Dovgaluk ha scritto: Again, this is only needed in your record/replay system (and you haven't yet quite explained why the design has this limitation), so it should not be a part of this series. I see. Updating s-timer and s-TimerExpire in post_load will be enough? In theory it should be done already, I guess it's not deterministic enough though. I split existing function into the two parts: one sets irq (and calls another). And the second part sets only timer and TimerExpire fields. The second function is also called from post_load. Have you tried my patch to rewrite the timer stuff? What patch do you mean? This one: http://article.gmane.org/gmane.comp.emulators.qemu/288521 Paolo
Re: [Qemu-devel] [PATCH 09/12] rtl8139: adding new fields to vmstate
From: Paolo Bonzini [mailto:pbonz...@redhat.com] Il 27/08/2014 12:30, Pavel Dovgaluk ha scritto: From: Paolo Bonzini [mailto:pbonz...@redhat.com] Il 27/08/2014 12:15, Pavel Dovgaluk ha scritto: Again, this is only needed in your record/replay system (and you haven't yet quite explained why the design has this limitation), so it should not be a part of this series. I see. Updating s-timer and s-TimerExpire in post_load will be enough? In theory it should be done already, I guess it's not deterministic enough though. I split existing function into the two parts: one sets irq (and calls another). And the second part sets only timer and TimerExpire fields. The second function is also called from post_load. Have you tried my patch to rewrite the timer stuff? What patch do you mean? This one: http://article.gmane.org/gmane.comp.emulators.qemu/288521 It will solve the problem, because it removes raising an irq from rtl8139_set_next_tctr_time function. Ok then, I'll remove my rtl8139 patch from the series. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH] spapr_pci: turn IOMMU root into an I/O region
On 26.08.14 18:40, Greg Kurz wrote: On sPAPR, virtio devices are connected to the PCI bus and use MSI-X. Commit cc943c36faa192cd4b32af8fe5edb31894017d35 has modified MSI-X so that writes are made using the bus master address space. Unfortunately, the MSI-X notification hits unassigned_mem_write and never reaches the guest... The most visible effect is that all virtio devices are non-fonctionnal on sPAPR. :( This patch plugs the MSI memory ops to the root IOMMU region, and virtio devices work again. I am not sure this is the right way to fix: please comment and enlight ! Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com --- hw/ppc/spapr_pci.c |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 9ed39a9..b638a3c 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -599,8 +599,9 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) */ sprintf(namebuf, %s.iommu-root, sphb-dtbusname); -memory_region_init(sphb-iommu_root, OBJECT(sphb), - namebuf, UINT64_MAX); +memory_region_init_io(sphb-iommu_root, OBJECT(sphb), + spapr_msi_ops, spapr, + namebuf, UINT64_MAX); This will turn all unallocated accesses in iommu_root into calls to spapr_msi_ops, no? Can't we instead just populate the iommu_root memory region with the MSI subregion? Alex
Re: [Qemu-devel] [PATCH] spapr_pci: Fix config space corruption
On 13.08.14 09:20, Alexey Kardashevskiy wrote: When disabling MSI/MSIX via ibm,change-msi RTAS call, no check was made if MSI or MSIX is actually supported and the MSI message was reset unconditionally. If this happened on a device which does not support MSI (but does support MSIX, otherwise ibm,change-msi would not be called), this device would have PCIDevice::msi_cap field (MSI capability offset) set to zero and writing a vector would actually clear PCI status. This clears MSI message only if MSI or MSIX is present on a device. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru Thanks, applied to ppc-next. Alex
Re: [Qemu-devel] [PATCH 1/2] qapi.py: avoid Python 2.5+ any() function
On Fri, Aug 15, 2014 at 10:43:24AM +0300, Riku Voipio wrote: On Tue, Aug 12, 2014 at 01:37:33PM +0100, Stefan Hajnoczi wrote: There is one instance of any() in qapi.py that breaks builds on older distros that ship Python 2.4 (like RHEL5): GEN qmp-commands.h Traceback (most recent call last): File build/scripts/qapi-commands.py, line 445, in ? exprs = parse_schema(input_file) File build/scripts/qapi.py, line 329, in parse_schema schema = QAPISchema(open(input_file, r)) File build/scripts/qapi.py, line 110, in __init__ if any(include_path == elem[1] NameError: global name 'any' is not defined I tried building on RHEL5, and this patch gets a bit more forward. However further down the build I get a similar error: Traceback (most recent call last): File /build/qemu/scripts/tracetool.py, line 139, in ? main(sys.argv) File /build/qemu/scripts/tracetool.py, line 134, in main binary=binary, probe_prefix=probe_prefix) File /build/qemu/scripts/tracetool/__init__.py, line 267, in generate backend = tracetool.backend.Wrapper(backends, format) File /build/qemu/scripts/tracetool/backend/__init__.py, line 105, in __init__ assert all(exists(backend) for backend in self._backends) semi-related - since I'm building --disable-system --disable-tools --enable-user, is there any benefit of tracetool for this build config? I will send a new revision with additional fixes. pgpfxlqp9TIyM.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH] tests: set QEMU_AUDIO_DRV=none for pci sound cards
On Tue, Aug 26, 2014 at 08:39:14AM +0200, Gerd Hoffmann wrote: On Di, 2014-08-05 at 16:05 +0200, Markus Armbruster wrote: Ping? Back online. What is the state here? I've seen Stefan (Cc'ed) posted a different patch for the same issue? Anything merged meanwhile? I am sending a v2 of my series that will use setenv(3) like you did except for all tests (Markus preferred that approach). Stefan pgpMfE9WGkGLO.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH] spapr-vlan: Don't touch last entry in buffer list
On 22.08.14 03:50, Anton Blanchard wrote: The last 8 bytes of the buffer list is defined to contain the number of dropped frames. At the moment we use it to store rx entries, which trips up ethtool -S: rx_no_buffer: 9223380832981355136 Fix this by skipping the last buffer list entry. Signed-off-by: Anton Blanchard an...@samba.org Thanks, applied to ppc-next. Alex
[Qemu-devel] [PATCH v2 0/6] Fixes for buildbot failures
v2: * Use setenv(3) instead of prepending environment variables to shell command-line [Markus] * 'error' should be 'warning' in QEMU_AUDIO_DRV=none commit description [Peter] * Avoid all() in tracetool [Riku] * Avoid GSequence in qemu-img.c (RHEL 5 glib does not support it) I just did a sweep of the buildbot at http://buildbot.b1-systems.de/qemu/builders. These patches solve issues on RHEL5 and OpenBSD buildslaves. Stefan Hajnoczi (6): qapi.py: avoid Python 2.5+ any() function libqtest: launch QEMU with QEMU_AUDIO_DRV=none trace: avoid Python 2.5 all() in tracetool mirror: fix uninitialized variable delay_ns warnings block: sort formats alphabetically in bdrv_iterate_format() Revert qemu-img: sort block formats in help message block.c | 14 +- block/mirror.c| 4 +--- qemu-img.c| 25 +++-- scripts/qapi.py | 8 scripts/tracetool/backend/__init__.py | 3 ++- tests/libqtest.c | 1 + 6 files changed, 24 insertions(+), 31 deletions(-) -- 1.9.3
[Qemu-devel] [PATCH v2 1/6] qapi.py: avoid Python 2.5+ any() function
There is one instance of any() in qapi.py that breaks builds on older distros that ship Python 2.4 (like RHEL5): GEN qmp-commands.h Traceback (most recent call last): File build/scripts/qapi-commands.py, line 445, in ? exprs = parse_schema(input_file) File build/scripts/qapi.py, line 329, in parse_schema schema = QAPISchema(open(input_file, r)) File build/scripts/qapi.py, line 110, in __init__ if any(include_path == elem[1] NameError: global name 'any' is not defined Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- scripts/qapi.py | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/scripts/qapi.py b/scripts/qapi.py index f2c6d1f..77d46aa 100644 --- a/scripts/qapi.py +++ b/scripts/qapi.py @@ -107,10 +107,10 @@ class QAPISchema: 'Expected a file name (string), got: %s' % include) include_path = os.path.join(self.input_dir, include) -if any(include_path == elem[1] - for elem in self.include_hist): -raise QAPIExprError(expr_info, Inclusion loop for %s -% include) +for elem in self.include_hist: +if include_path == elem[1]: +raise QAPIExprError(expr_info, Inclusion loop for %s +% include) # skip multiple include of the same file if include_path in previously_included: continue -- 1.9.3
[Qemu-devel] [PATCH v2 2/6] libqtest: launch QEMU with QEMU_AUDIO_DRV=none
No test case actually uses the audio backend. Disable audio to prevent warnings on hosts with no sound hardware present: GTESTER check-qtest-aarch64 sdl: SDL_OpenAudio failed sdl: Reason: No available audio device sdl: SDL_OpenAudio failed sdl: Reason: No available audio device audio: Failed to create voice `lm4549.out' Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- tests/libqtest.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/libqtest.c b/tests/libqtest.c index ed55686..5e458e8 100644 --- a/tests/libqtest.c +++ b/tests/libqtest.c @@ -165,6 +165,7 @@ QTestState *qtest_init(const char *extra_args) s-qemu_pid = fork(); if (s-qemu_pid == 0) { +setenv(QEMU_AUDIO_DRV, none, true); command = g_strdup_printf(exec %s -qtest unix:%s,nowait -qtest-log %s -- 1.9.3
[Qemu-devel] [PATCH v2 5/6] block: sort formats alphabetically in bdrv_iterate_format()
Format names are best consumed in alphabetical order. This makes human-readable output easy to produce. bdrv_iterate_format() already has an array of format strings. Sort them before invoking the iteration callback. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/block.c b/block.c index e9380f6..1df13ac 100644 --- a/block.c +++ b/block.c @@ -3744,11 +3744,17 @@ const char *bdrv_get_format_name(BlockDriverState *bs) return bs-drv ? bs-drv-format_name : NULL; } +static int qsort_strcmp(const void *a, const void *b) +{ +return strcmp(a, b); +} + void bdrv_iterate_format(void (*it)(void *opaque, const char *name), void *opaque) { BlockDriver *drv; int count = 0; +int i; const char **formats = NULL; QLIST_FOREACH(drv, bdrv_drivers, list) { @@ -3762,10 +3768,16 @@ void bdrv_iterate_format(void (*it)(void *opaque, const char *name), if (!found) { formats = g_renew(const char *, formats, count + 1); formats[count++] = drv-format_name; -it(opaque, drv-format_name); } } } + +qsort(formats, count, sizeof(formats[0]), qsort_strcmp); + +for (i = 0; i count; i++) { +it(opaque, formats[i]); +} + g_free(formats); } -- 1.9.3
[Qemu-devel] [PATCH v2 3/6] trace: avoid Python 2.5 all() in tracetool
Red Hat Enterprise Linux 5 ships Python 2.4.3. The all() function was added in Python 2.5 so we cannot use it. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- scripts/tracetool/backend/__init__.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/scripts/tracetool/backend/__init__.py b/scripts/tracetool/backend/__init__.py index 5bfa1ef..d4b6dab 100644 --- a/scripts/tracetool/backend/__init__.py +++ b/scripts/tracetool/backend/__init__.py @@ -102,7 +102,8 @@ class Wrapper: def __init__(self, backends, format): self._backends = [backend.replace(-, _) for backend in backends] self._format = format.replace(-, _) -assert all(exists(backend) for backend in self._backends) +for backend in self._backends: +assert exists(backend) assert tracetool.format.exists(self._format) def _run_function(self, name, *args, **kwargs): -- 1.9.3
[Qemu-devel] [PATCH v2 4/6] mirror: fix uninitialized variable delay_ns warnings
The gcc 4.1.2 compiler warns that delay_ns may be uninitialized in mirror_iteration(). There are two break statements in the do ... while loop that skip over the delay_ns assignment. These are probably the cause of the warning. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/mirror.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/block/mirror.c b/block/mirror.c index 5e7a166..18b18e0 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -157,7 +157,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s) BlockDriverState *source = s-common.bs; int nb_sectors, sectors_per_chunk, nb_chunks; int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector; -uint64_t delay_ns; +uint64_t delay_ns = 0; MirrorOp *op; s-sector_num = hbitmap_iter_next(s-hbi); @@ -247,8 +247,6 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s) next_chunk += added_chunks; if (!s-synced s-common.speed) { delay_ns = ratelimit_calculate_delay(s-limit, added_sectors); -} else { -delay_ns = 0; } } while (delay_ns == 0 next_sector end); -- 1.9.3
[Qemu-devel] [PATCH v2 6/6] Revert qemu-img: sort block formats in help message
This reverts commit 1a443c1b8b4314d365e82bddeb1de5b4b1c15fb3 and the later commit 395071a76328189f50c778f4dee6dabb90503dd9. GSequence was introduced in glib 2.14. RHEL 5 fails to compile since it uses glib 2.12.3. Now that bdrv_iterate_format() invokes the iteration callback in sorted order these commits are unnecessary. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- qemu-img.c | 25 +++-- 1 file changed, 3 insertions(+), 22 deletions(-) diff --git a/qemu-img.c b/qemu-img.c index c843420..2052b14 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -32,7 +32,6 @@ #include block/block_int.h #include block/qapi.h #include getopt.h -#include glib.h #define QEMU_IMG_VERSION qemu-img version QEMU_VERSION \ , Copyright (c) 2004-2008 Fabrice Bellard\n @@ -56,22 +55,9 @@ typedef enum OutputFormat { #define BDRV_O_FLAGS BDRV_O_CACHE_WB #define BDRV_DEFAULT_CACHE writeback -static gint compare_data(gconstpointer a, gconstpointer b, gpointer user) +static void format_print(void *opaque, const char *name) { -return g_strcmp0(a, b); -} - -static void print_format(gpointer data, gpointer user) -{ -printf( %s, (char *)data); -} - -static void add_format_to_seq(void *opaque, const char *fmt_name) -{ -GSequence *seq = opaque; - -g_sequence_insert_sorted(seq, (gpointer)fmt_name, - compare_data, NULL); +printf( %s, name); } static void QEMU_NORETURN GCC_FMT_ATTR(1, 2) error_exit(const char *fmt, ...) @@ -157,15 +143,10 @@ static void QEMU_NORETURN help(void) '-f' first image format\n '-F' second image format\n '-s' run in Strict mode - fail on different image size or sector allocation\n; -GSequence *seq; printf(%s\nSupported formats:, help_msg); -seq = g_sequence_new(NULL); -bdrv_iterate_format(add_format_to_seq, seq); -g_sequence_foreach(seq, print_format, NULL); +bdrv_iterate_format(format_print, NULL); printf(\n); -g_sequence_free(seq); - exit(EXIT_SUCCESS); } -- 1.9.3
Re: [Qemu-devel] [PATCH 0/6] target-ppc: More Cleanup of FXU Instructions
On 25.08.14 22:21, Richard Henderson wrote: On 08/25/2014 12:25 PM, Tom Musta wrote: This series follows up my previous series of bug fixes to Power fixed point instructions (http://lists.nongnu.org/archive/html/qemu-ppc/2014-08/msg00068.html). Richard Henderson provided additional feedback after the patches had been taken into Aleg Graf's ppc-next tree. Tom Musta (6): target-ppc: Special Case of rlwimi Should Use Deposit target-ppc: Optimize rlwinm MB=0 ME=31 target-ppc: Optimize rlwnm MB=0 ME=31 target-ppc: Clean Up mullw target-ppc: Clean up mullwo target-ppc: Implement mulldo with TCG Thanks for all the cleanups. Reviewed-by: Richard Henderson r...@twiddle.net Thanks, applied all to ppc-next. Alex
Re: [Qemu-devel] [PATCH 3/3] qemu-img: always goto out in img_snapshot() error paths
On Wed, Aug 27, 2014 at 08:35:33AM +0200, Markus Armbruster wrote: Stefan Hajnoczi stefa...@redhat.com writes: The out label has the qemu_progress_end() and other cleanup calls. Always goto out in error paths so the cleanup happens. Note that bdrv_unref(NULL) is safe. We just need to initialize bs to NULL at the top of the function. We can now remove the obsolete bs_old_backing = NULL and bs_new_backing = NULL for safe mode. Originally it was necessary in commit 3e85c6fd (qemu-img rebase) but became useless in commit c2abcce (qemu-img: avoid calling exit(1) to release resources properly) because the variables are already initialized during declaration. Please mention here that you fix exit code.from -1 to 1. Fixed up when merging pgpCMfKBttix7.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH 0/3] qemu-img: error path fixes
On Tue, Aug 26, 2014 at 07:17:53PM +0100, Stefan Hajnoczi wrote: I noticed a few minor issues when looking at cache flag parsing. This series cleans them up. Stefan Hajnoczi (3): qemu-img: fix img_commit() error return value qemu-img: fix img_compare() flags error path qemu-img: always goto out in img_snapshot() error paths qemu-img.c | 23 ++- 1 file changed, 10 insertions(+), 13 deletions(-) Applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan pgpPAyiTW8dW3.pgp Description: PGP signature
[Qemu-devel] [PATCH] dump: let dump_error printf the error reason
The second parameter of dump_error is unused, but one purpose of using this function is to report the error info. Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com --- dump.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/dump.c b/dump.c index 71d3e94..0f44e9d 100644 --- a/dump.c +++ b/dump.c @@ -83,6 +83,9 @@ static int dump_cleanup(DumpState *s) static void dump_error(DumpState *s, const char *reason) { +if (reason) { +error_report(%s, reason); +} dump_cleanup(s); } -- 1.7.12.4
Re: [Qemu-devel] [PATCH] spapr_pci: turn IOMMU root into an I/O region
On 08/27/2014 08:57 PM, Alexander Graf wrote: On 26.08.14 18:40, Greg Kurz wrote: On sPAPR, virtio devices are connected to the PCI bus and use MSI-X. Commit cc943c36faa192cd4b32af8fe5edb31894017d35 has modified MSI-X so that writes are made using the bus master address space. Unfortunately, the MSI-X notification hits unassigned_mem_write and never reaches the guest... The most visible effect is that all virtio devices are non-fonctionnal on sPAPR. :( This patch plugs the MSI memory ops to the root IOMMU region, and virtio devices work again. I am not sure this is the right way to fix: please comment and enlight ! Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com --- hw/ppc/spapr_pci.c |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 9ed39a9..b638a3c 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -599,8 +599,9 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) */ sprintf(namebuf, %s.iommu-root, sphb-dtbusname); -memory_region_init(sphb-iommu_root, OBJECT(sphb), - namebuf, UINT64_MAX); +memory_region_init_io(sphb-iommu_root, OBJECT(sphb), + spapr_msi_ops, spapr, + namebuf, UINT64_MAX); This will turn all unallocated accesses in iommu_root into calls to spapr_msi_ops, no? It should not, the window is small. Can't we instead just populate the iommu_root memory region with the MSI subregion? Makes sense to me. Having one global MSI window sounds like my bug, I just did not any good reason for having it per PHB :) -- Alexey
Re: [Qemu-devel] [PATCH] block: Always compile virtio-blk dataplane
On Tue, Aug 26, 2014 at 02:57:19PM +0800, Fam Zheng wrote: Dataplane doesn't depend on linux-aio any more, so we don't need the compiling condition now. Configure options are kept but just print a message. Signed-off-by: Fam Zheng f...@redhat.com --- configure | 21 ++--- hw/block/Makefile.objs | 2 +- hw/block/virtio-blk.c | 22 ++ hw/virtio/Makefile.objs| 2 +- include/hw/virtio/virtio-blk.h | 2 -- 5 files changed, 6 insertions(+), 43 deletions(-) Thanks, applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan pgp6Xg8Xv8AGJ.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH V4] net: Forbid dealing with packets when VM is not running
On Tue, Aug 26, 2014 at 04:06:17PM +0800, zhanghailiang wrote: For all NICs(except virtio-net) emulated by qemu, Such as e1000, rtl8139, pcnet and ne2k_pci, Qemu can still receive packets when VM is not running. If this happened in *migration's* last PAUSE VM stage, but before the end of the migration, the new receiving packets will possibly dirty parts of RAM which has been cached in *iovec*(will be sent asynchronously) and dirty parts of new RAM which will be missed. This will lead serious network fault in VM. To avoid this, we forbid receiving packets in generic net code when VM is not running. Bug reproduction steps: (1) Start a VM which configured at least one NIC (2) In VM, open several Terminal and do *Ping IP -i 0.1* (3) Migrate the VM repeatedly between two Hosts And the *PING* command in VM will very likely fail with message: 'Destination HOST Unreachable', the NIC in VM will stay unavailable unless you run 'service network restart' Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Reviewed-by: Michael S. Tsirkin m...@redhat.com --- v4: - The action of flush queued packets is unnecessary, remove this. (Thanks for the help of Jason Wang and Stefan). v3: - change the 'vmstate' to 'vm_running' v2: - remove the superfluous check of nc-received_disabled --- net/net.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/net/net.c b/net/net.c index 6d930ea..4cb92c0 100644 --- a/net/net.c +++ b/net/net.c @@ -41,6 +41,7 @@ #include qapi-visit.h #include qapi/opts-visitor.h #include qapi/dealloc-visitor.h +#include sysemu/sysemu.h /* Net bridge is currently not supported for W32. */ #if !defined(_WIN32) @@ -452,6 +453,12 @@ void qemu_set_vnet_hdr_len(NetClientState *nc, int len) int qemu_can_send_packet(NetClientState *sender) { +int vm_running = runstate_is_running(); + +if (!vm_running) { +return 0; +} + if (!sender-peer) { return 1; } -- 1.7.12.4
Re: [Qemu-devel] [PATCH V3] vhost_net: start/stop guest notifiers properly
On Wed, Aug 20, 2014 at 05:23:21PM +0800, Zhangjie (HZ) wrote: On 2014/8/19 12:56, Jason Wang wrote: commit a9f98bb5ebe6fb1869321dcc58e72041ae626ad8 vhost: multiqueue call it before setting Zhang Jie, please test this patch to see if it fixes the issue. +static void vhost_net_set_vq_index(struct vhost_net *net, int vq_index) +{ +net-dev.vq_index = vq_index; +} int vq_index) ... Because of vhost_net_set_vq_index, VM can be start successfully. But, after about 80 times of migration under my environment, virtual nic became unreachable again. When I use jprobe to notify tap, the virtual nic becomes reachable again. This shows that interrupts missing causes the problem. What about the patch net: Forbid dealing with packets when VM is not does it help if you additionally apply that one? -- Best Wishes! Zhang Jie
Re: [Qemu-devel] [PATCH V4] net: Forbid dealing with packets when VM is not running
On Wed, Aug 27, 2014 at 01:53:21PM +0200, Michael S. Tsirkin wrote: On Tue, Aug 26, 2014 at 04:06:17PM +0800, zhanghailiang wrote: For all NICs(except virtio-net) emulated by qemu, Such as e1000, rtl8139, pcnet and ne2k_pci, Qemu can still receive packets when VM is not running. If this happened in *migration's* last PAUSE VM stage, but before the end of the migration, the new receiving packets will possibly dirty parts of RAM which has been cached in *iovec*(will be sent asynchronously) and dirty parts of new RAM which will be missed. This will lead serious network fault in VM. To avoid this, we forbid receiving packets in generic net code when VM is not running. Bug reproduction steps: (1) Start a VM which configured at least one NIC (2) In VM, open several Terminal and do *Ping IP -i 0.1* (3) Migrate the VM repeatedly between two Hosts And the *PING* command in VM will very likely fail with message: 'Destination HOST Unreachable', the NIC in VM will stay unavailable unless you run 'service network restart' Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Reviewed-by: Michael S. Tsirkin m...@redhat.com Please add: Cc: qemu-sta...@nongnu.org when applying this. Thanks! --- v4: - The action of flush queued packets is unnecessary, remove this. (Thanks for the help of Jason Wang and Stefan). v3: - change the 'vmstate' to 'vm_running' v2: - remove the superfluous check of nc-received_disabled --- net/net.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/net/net.c b/net/net.c index 6d930ea..4cb92c0 100644 --- a/net/net.c +++ b/net/net.c @@ -41,6 +41,7 @@ #include qapi-visit.h #include qapi/opts-visitor.h #include qapi/dealloc-visitor.h +#include sysemu/sysemu.h /* Net bridge is currently not supported for W32. */ #if !defined(_WIN32) @@ -452,6 +453,12 @@ void qemu_set_vnet_hdr_len(NetClientState *nc, int len) int qemu_can_send_packet(NetClientState *sender) { +int vm_running = runstate_is_running(); + +if (!vm_running) { +return 0; +} + if (!sender-peer) { return 1; } -- 1.7.12.4
Re: [Qemu-devel] [PATCH 06/12] kvmvapic: fixing loading vmstate
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini Il 26/08/2014 09:15, Pavel Dovgalyuk ha scritto: vapic state should not be synchronized with APIC while loading, because APIC state could be not loaded yet at that moment. We just save vapic_paddr in APIC VMState instead of synchronization. Can you use a vm_change_state_handler, or a QEMU_CLOCK_VIRTUAL timer with expiration time in the past (e.g. at time zero) to run the sync code as soon as possible? Then you can preserve the current migration format and avoid using the invalid APIC state. Does this method guarantee, that nobody (like other timers) will access APIC between loading the vmstate and invocation of the timer? Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH 06/12] kvmvapic: fixing loading vmstate
Il 27/08/2014 14:16, Pavel Dovgaluk ha scritto: Can you use a vm_change_state_handler, or a QEMU_CLOCK_VIRTUAL timer with expiration time in the past (e.g. at time zero) to run the sync code as soon as possible? Then you can preserve the current migration format and avoid using the invalid APIC state. Does this method guarantee, that nobody (like other timers) will access APIC between loading the vmstate and invocation of the timer? Hmm, probably not. The bug would not be other timers accessing the APIC, because that would also call apic_sync_vapic and the only effect would be an extra useless synchronization. The bug would happen if the APIC is accessed by the CPU before the timer has the occasion to run. However, a vm_change_state_handler should work. It runs before VCPUs are started. Paolo
Re: [Qemu-devel] [PATCH v2] blkdebug: make the fault injection functionality callable from QMP
On 08/26/2014 11:34 PM, Hitoshi Mitake wrote: {execute: blkdebug-set-rules, arguments: {device: ide0-hd0, rules:[{event: write_aio, type: inject-error, immediately: Why 'write_aoi'? New QMP commands should prefer dashes (write-aio) if there is no compelling reason for underscore. The name of event is defined in the event_names array of block/blkdebug.c, which uses underscore. I think using the original names would be good because the name shouldn't be different from config file notation of blkdebug. But it should be fairly easy to code things to accept both spellings when parsing the config file, while preferring the dash in QAPI. Furthermore, by defining the enum in QAPI instead of in blkdebug.c, you get a generated enum that blkdebug.c can reuse, and where you are then guaranteed that things stay in sync if new commands are added (by adding them to the .json file). +} else if (!strcmp(type, inject-error)) { +int _errno, sector; The name _errno threw me; is there something better without a leading underscore that would work better? I added underscore to the name because errno is already used by errno.h. How about given_errno? If you use proper .json typing, the generator will automatically rename the QMP interface errno to the C code q_errno. Consistently using the type-safe munging already done by the generator, instead of redoing a completely different munging on your own, is the preferred solution here. + +once = qdict_get_try_bool(dict, once, 0); s/0/bool/ - we use stdbool.h, so you should use the named constants when dealing with bool parameters. Thanks, I'll fix it in v3. Actually, if you use a type-safe qapi definition, you may not even need to do raw qdict operations, but can just directly use the C struct that gets generated as a result of the qapi. I also like the detailed specification. But, there are bunch of event names (the event_names array of block/blkdebug.c). In addition, the rule of blkdebug can be extended. So I think defining it in two palces (qapi-schema.json and block/blkdebug.c) is hard to maintain. Parsing qdict in blkdebug.c would be simpler. How do you think? No, don't do duplication. Instead, fix blkdebug.c to USE the enum generated by the .json, and you only have to maintain the list in one place - the .json file. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] block: Introduce null driver
On 08/27/2014 12:22 AM, Fam Zheng wrote: This is an analogue to Linux null_blk. It can be used for testing block device emulation and general block layer functionalities such as coroutines and throttling, where disk IO is not necessary or wanted. Signed-off-by: Fam Zheng f...@redhat.com --- block/Makefile.objs | 1 + block/null.c| 172 2 files changed, 173 insertions(+) create mode 100644 block/null.c Worth also adding it to BlockdevOptions in qapi/block-core.json to allow hotplug of this driver? (Not that libvirt will ever hotplug one, but I really want to move towards being feature-complete on letting QMP drive all block types) -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] curl: Add override_accept_ranges flag to force sending range requests.
On 08/26/2014 05:48 PM, Richard W.M. Jones wrote: Some servers (notably VMware ESX) accept range requests, but don't send back the Accept-Ranges: bytes header in their initial response. For these servers you can set override_accept_ranges to 'on' which forces this block driver to send range requests anyway. Signed-off-by: Richard W.M. Jones rjo...@redhat.com --- block/curl.c| 10 +- qemu-options.hx | 5 + 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/block/curl.c b/block/curl.c index 095b5a4..3905c6e 100644 --- a/block/curl.c +++ b/block/curl.c @@ -71,6 +71,7 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle, #define CURL_BLOCK_OPT_URL url #define CURL_BLOCK_OPT_READAHEAD readahead #define CURL_BLOCK_OPT_SSLVERIFY sslverify +#define CURL_BLOCK_OPT_OVERRIDE_ACCEPT_RANGES override_accept_ranges struct BDRVCURLState; @@ -489,6 +490,11 @@ static QemuOptsList runtime_opts = { .type = QEMU_OPT_BOOL, .help = Verify SSL certificate }, +{ +.name = CURL_BLOCK_OPT_OVERRIDE_ACCEPT_RANGES, +.type = QEMU_OPT_BOOL, +.help = Server accepts range requests +}, { /* end of list */ } }, }; @@ -547,7 +553,9 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags, // Get file size -s-accept_range = false; +s-accept_range = +qemu_opt_get_bool(opts, CURL_BLOCK_OPT_OVERRIDE_ACCEPT_RANGES, + false); curl_easy_setopt(state-curl, CURLOPT_NOBODY, 1); curl_easy_setopt(state-curl, CURLOPT_HEADERFUNCTION, curl_header_cb); diff --git a/qemu-options.hx b/qemu-options.hx index c573dd8..626ebb3 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -2351,6 +2351,11 @@ multiple of 512 bytes. It defaults to 256k. @item sslverify Whether to verify the remote server's certificate when connecting over SSL. It can have the value 'on' or 'off'. It defaults to 'on'. + +@item override_accept_ranges +Some servers (notably VMware ESX) accept range requests, but don't +declare this fact in the headers that they return. Setting this option +to 'on' forces CURL to use range requests. The default is 'off'. @end table Note that when passing options to qemu explicitly, @option{driver} is the value Reviewed-by: Daniel Barboza danie...@linux.vnet.ibm.com Tested-by: Daniel Barboza danie...@linux.vnet.ibm.com
Re: [Qemu-devel] [PATCH v2 2/6] libqtest: launch QEMU with QEMU_AUDIO_DRV=none
On Mi, 2014-08-27 at 12:08 +0100, Stefan Hajnoczi wrote: No test case actually uses the audio backend. Disable audio to prevent warnings on hosts with no sound hardware present: GTESTER check-qtest-aarch64 sdl: SDL_OpenAudio failed sdl: Reason: No available audio device sdl: SDL_OpenAudio failed sdl: Reason: No available audio device audio: Failed to create voice `lm4549.out' Signed-off-by: Stefan Hajnoczi stefa...@redhat.com Reviewed-by: Gerd Hoffmann kra...@redhat.com
Re: [Qemu-devel] [PATCH V3] vhost_net: start/stop guest notifiers properly
On Thu, Aug 21, 2014 at 03:42:53PM +0800, Zhangjie (HZ) wrote: On 2014/8/21 14:53, Jason Wang wrote: On 08/21/2014 02:28 PM, Zhangjie (HZ) wrote: After migration, vhost is not disabled, virtual nic became unreachable because vhost is not awakened. By the logical of EVENT_IDX, virtio-net will not kick vhost again if the used idx is not updated. So, if one interrupts is lost during migration, virtio_net will not kick vhost again. Then, no skb from virtio-net can be sent to tap. Yes and I mean to test vhost=off to see if it was the issue of vhost. That sounds reasonable, but the test case is to test vhost. Jason's patch reduced the probability of occurrence, from about 1/20 to 1/80. It is really effective. I think the patch should be acked. May be we can try to solve the problem from another perspective. Do you have some methods to sense the migration? We can make up a signal from virtio-net after the migration. You can make a patch like this to debug. If problem disappears, it means interrupt was really lost anyway. Anyway, I will try to reproduce it by myself. The test environment is really terrible, I build a environment myself, but it problem did not occur. The environment I use now is from a colleague Responsible for test work. Two hosts, every host has about 20 vms, they send packages(ipv4 and ipv6) between each other. The VM to be migrated also sens packages itself, and there is a ping(-i 0.001) from another host to it. The physical nic is 1GE, connected through a internal nework. Just want to confirm. For the problem did not occur, you mean with my patch on top? . I mean, with your patch, I have to test 80 times before it occurs, the probability is reduced. Could you please try to apply the patch [PATCH V4] net: Forbid dealing with packets when VM is not running on top and see if this helps? Thanks! -- Best Wishes! Zhang Jie
Re: [Qemu-devel] [PATCH 1/1] hw/pci-assign: split pci-assign.c
On Wed, Aug 27, 2014 at 05:13:07PM +0800, Tiejun Chen wrote: We will try to reuse assign_dev_load_option_rom in xen side, and especially its a good beginning to unify pci assign codes both on kvm and xen in the future. Signed-off-by: Tiejun Chen tiejun.c...@intel.com --- hw/i386/kvm/pci-assign.c| 170 +--- include/hw/pci/pci_assign.h | 204 Why are you making so much code inline? Please move it to an out of line file. Also it should be pci-assign not pci_assign. 2 files changed, 207 insertions(+), 167 deletions(-) create mode 100644 include/hw/pci/pci_assign.h diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c index 17c7d6dc..a4fe2d6 100644 --- a/hw/i386/kvm/pci-assign.c +++ b/hw/i386/kvm/pci-assign.c @@ -37,109 +37,7 @@ #include hw/pci/pci.h #include hw/pci/msi.h #include kvm_i386.h - -#define MSIX_PAGE_SIZE 0x1000 - -/* From linux/ioport.h */ -#define IORESOURCE_IO 0x0100 /* Resource type */ -#define IORESOURCE_MEM 0x0200 -#define IORESOURCE_IRQ 0x0400 -#define IORESOURCE_DMA 0x0800 -#define IORESOURCE_PREFETCH 0x2000 /* No side effects */ -#define IORESOURCE_MEM_64 0x0010 - -//#define DEVICE_ASSIGNMENT_DEBUG - -#ifdef DEVICE_ASSIGNMENT_DEBUG -#define DEBUG(fmt, ...) \ -do { \ -fprintf(stderr, %s: fmt, __func__ , __VA_ARGS__); \ -} while (0) -#else -#define DEBUG(fmt, ...) -#endif - -typedef struct PCIRegion { -int type; /* Memory or port I/O */ -int valid; -uint64_t base_addr; -uint64_t size;/* size of the region */ -int resource_fd; -} PCIRegion; - -typedef struct PCIDevRegions { -uint8_t bus, dev, func; /* Bus inside domain, device and function */ -int irq;/* IRQ number */ -uint16_t region_number; /* number of active regions */ - -/* Port I/O or MMIO Regions */ -PCIRegion regions[PCI_NUM_REGIONS - 1]; -int config_fd; -} PCIDevRegions; - -typedef struct AssignedDevRegion { -MemoryRegion container; -MemoryRegion real_iomem; -union { -uint8_t *r_virtbase; /* mmapped access address for memory regions */ -uint32_t r_baseport; /* the base guest port for I/O regions */ -} u; -pcibus_t e_size;/* emulated size of region in bytes */ -pcibus_t r_size;/* real size of region in bytes */ -PCIRegion *region; -} AssignedDevRegion; - -#define ASSIGNED_DEVICE_PREFER_MSI_BIT 0 -#define ASSIGNED_DEVICE_SHARE_INTX_BIT 1 - -#define ASSIGNED_DEVICE_PREFER_MSI_MASK (1 ASSIGNED_DEVICE_PREFER_MSI_BIT) -#define ASSIGNED_DEVICE_SHARE_INTX_MASK (1 ASSIGNED_DEVICE_SHARE_INTX_BIT) - -typedef struct MSIXTableEntry { -uint32_t addr_lo; -uint32_t addr_hi; -uint32_t data; -uint32_t ctrl; -} MSIXTableEntry; - -typedef enum AssignedIRQType { -ASSIGNED_IRQ_NONE = 0, -ASSIGNED_IRQ_INTX_HOST_INTX, -ASSIGNED_IRQ_INTX_HOST_MSI, -ASSIGNED_IRQ_MSI, -ASSIGNED_IRQ_MSIX -} AssignedIRQType; - -typedef struct AssignedDevice { -PCIDevice dev; -PCIHostDeviceAddress host; -uint32_t dev_id; -uint32_t features; -int intpin; -AssignedDevRegion v_addrs[PCI_NUM_REGIONS - 1]; -PCIDevRegions real_device; -PCIINTxRoute intx_route; -AssignedIRQType assigned_irq_type; -struct { -#define ASSIGNED_DEVICE_CAP_MSI (1 0) -#define ASSIGNED_DEVICE_CAP_MSIX (1 1) -uint32_t available; -#define ASSIGNED_DEVICE_MSI_ENABLED (1 0) -#define ASSIGNED_DEVICE_MSIX_ENABLED (1 1) -#define ASSIGNED_DEVICE_MSIX_MASKED (1 2) -uint32_t state; -} cap; -uint8_t emulate_config_read[PCI_CONFIG_SPACE_SIZE]; -uint8_t emulate_config_write[PCI_CONFIG_SPACE_SIZE]; -int msi_virq_nr; -int *msi_virq; -MSIXTableEntry *msix_table; -hwaddr msix_table_addr; -uint16_t msix_max; -MemoryRegion mmio; -char *configfd_name; -int32_t bootindex; -} AssignedDevice; +#include hw/pci/pci_assign.h static void assigned_dev_update_irq_routing(PCIDevice *dev); @@ -1891,72 +1789,10 @@ static void assign_register_types(void) type_init(assign_register_types) -/* - * Scan the assigned devices for the devices that have an option ROM, and then - * load the corresponding ROM data to RAM. If an error occurs while loading an - * option ROM, we just ignore that option ROM and continue with the next one. - */ static void assigned_dev_load_option_rom(AssignedDevice *dev) { -char name[32], rom_file[64]; -FILE *fp; -uint8_t val; -struct stat st; void *ptr; -/* If loading ROM from file, pci handles it */ -if (dev-dev.romfile || !dev-dev.rom_bar) { -return; -} - -snprintf(rom_file,
Re: [Qemu-devel] [PATCH 06/12] kvmvapic: fixing loading vmstate
From: Paolo Bonzini [mailto:pbonz...@redhat.com] Il 27/08/2014 14:16, Pavel Dovgaluk ha scritto: Can you use a vm_change_state_handler, or a QEMU_CLOCK_VIRTUAL timer with expiration time in the past (e.g. at time zero) to run the sync code as soon as possible? Then you can preserve the current migration format and avoid using the invalid APIC state. Does this method guarantee, that nobody (like other timers) will access APIC between loading the vmstate and invocation of the timer? Hmm, probably not. The bug would not be other timers accessing the APIC, because that would also call apic_sync_vapic and the only effect would be an extra useless synchronization. The bug would happen if the APIC is accessed by the CPU before the timer has the occasion to run. Sorry, but I don't understand which problem we will solve with apic_sync_vapic. All fields that were added to vmstate are not affected by this function. Could you explain your idea? Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v2] monitor: fix debug print compiling error
On Thu, 21 Aug 2014 21:03:09 +0800 arei.gong...@huawei.com wrote: From: Gonglei arei.gong...@huawei.com error: 'i' undeclared (first use in this function) Signed-off-by: Gonglei arei.gong...@huawei.com Applied to the qmp branch, thanks. --- v2: avoid to mix code and declarations, add a pair of {}, thanks Peter. --- monitor.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/monitor.c b/monitor.c index 34cee74..667efb7 100644 --- a/monitor.c +++ b/monitor.c @@ -4747,8 +4747,11 @@ static void monitor_find_completion(void *opaque, return; } #ifdef DEBUG_COMPLETION -for (i = 0; i nb_args; i++) { -monitor_printf(mon, arg%d = '%s'\n, i, args[i]); +{ +int i; +for (i = 0; i nb_args; i++) { +monitor_printf(mon, arg%d = '%s'\n, i, args[i]); +} } #endif
Re: [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset
On Tue, Aug 26, 2014 at 02:25:47PM -0600, Alex Williamson wrote: On Wed, 2014-08-20 at 19:52 +1000, Gavin Shan wrote: The PCI device MSIx table is cleaned out in hardware after EEH PE reset. However, we still hold the stale MSIx entries in QEMU, which should be cleared accordingly. Otherwise, we will run into another (recursive) EEH error and the PCI devices contained in the PE have to be offlined exceptionally. The patch clears stale MSIx table before EEH PE reset so that MSIx table could be restored properly after EEH PE reset. Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com --- hw/misc/vfio.c | 24 1 file changed, 24 insertions(+) diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c index 1a3e7eb..3cf7f02 100644 --- a/hw/misc/vfio.c +++ b/hw/misc/vfio.c @@ -4424,6 +4424,8 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, { VFIOGroup *group; VFIOContainer *container; +VFIODevice *vdev; +struct vfio_eeh_pe_op *arg; Define these within the scope of the case since they're not used outside of it. Yes, I'll fix. int ret = -1; group = vfio_get_group(groupid, as); @@ -4442,7 +,29 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, switch (req) { case VFIO_CHECK_EXTENSION: case VFIO_IOMMU_SPAPR_TCE_GET_INFO: +break; case VFIO_EEH_PE_OP: +arg = (struct vfio_eeh_pe_op *)param; +switch (arg-op) { +case VFIO_EEH_PE_RESET_HOT: +case VFIO_EEH_PE_RESET_FUNDAMENTAL: +/* + * The MSIx table will be cleaned out by reset. We need + * disable it so that it can be reenabled properly. Also, + * the cached MSIx table should be cleared as it's not + * reflecting the contents in hardware. + */ +QLIST_FOREACH(vdev, group-device_list, next) { +if (msix_enabled(vdev-pdev)) { +vfio_disable_msix(vdev); +} + +msix_reset(vdev-pdev); +} We already have vfio_disable_interrupts(), can't we use that (blindly)? Do we really need to call msix_reset()? If so, should vfio_disable_msix() call it? Yes, vfio_disable_interrupts() would be better to be used here as it can covers all types of interrupt (INTx/MSI/MSIx). vfio_disable_interrupts() needn't clear MSIx vectors. If you prefer calling msix_reset() in the function, I guess I have to add one more parameter to vfio_disable_interrupts() to indicate if we need clear MSI/MSIx vector: static void vfio_disable_interrupts(VFIODevice *vdev, bool clr_vector) + +break; Extraneous break Yep, I'll remove it. +} + break; default: /* Return an error on unknown requests */ Thanks, Alex Thanks, Gavin
Re: [Qemu-devel] [PATCH] dump: let dump_error printf the error reason
On Wed, 27 Aug 2014 19:18:53 +0800 zhanghailiang zhang.zhanghaili...@huawei.com wrote: The second parameter of dump_error is unused, but one purpose of using this function is to report the error info. Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com --- dump.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/dump.c b/dump.c index 71d3e94..0f44e9d 100644 --- a/dump.c +++ b/dump.c @@ -83,6 +83,9 @@ static int dump_cleanup(DumpState *s) static void dump_error(DumpState *s, const char *reason) { +if (reason) { +error_report(%s, reason); +} dump_cleanup(s); } Good catch, but error_report() will report the error only to the user. This is QMP code, so we have to use the Error API. I think that the best way to solve this is to make dump_error() fill an Error object (eg. by calling error_setg()) and then returning it after the call to dump_cleanup(). Of course that you will have to change _all_ code paths calling dump_error() to propagate the error up. For more information on this, please read docs/writing-qmp-commands.txt. You can also take a look at simple commands doing error propagation, like qmp_cont() or qmp_block_passwd().
Re: [Qemu-devel] [PATCH v7 0/4] s390: Support for Hotplug of Standby Memory
On 07/30/2014 02:15 PM, Matthew Rosato wrote: This patchset adds support in s390 for a pool of standby memory, which can be set online/offline by the guest (ie, via chmem). The standby pool of memory is allocated as the difference between the initial memory setting and the maxmem setting. As part of this work, additional results are provided for the Read SCP Information SCLP, and new implentation is added for the Read Storage Element Information, Attach Storage Element, Assign Storage and Unassign Storage SCLPs, which enables the s390 guest to manipulate the standby memory pool. This patchset is based on work originally done by Jeng-Fang (Nick) Wang. Sample qemu command snippet: qemu -machine s390-ccw-virtio -m 1024M,maxmem=2048M,slots=32 -enable-kvm This will allocate 1024M of active memory, and another 1024M of standby memory. Example output from s390-tools lsmem: = 0x-0x0fff256 online no 0-127 0x1000-0x1fff256 online yes128-255 0x2000-0x3fff512 online no 256-511 0x4000-0x7fff 1024 offline - 512-1023 Memory device size : 2 MB Memory block size : 256 MB Total online memory : 1024 MB Total offline memory: 1024 MB The guest can dynamically enable part or all of the standby pool via the s390-tools chmem, for example: chmem -e 512M And can attempt to dynamically disable: chmem -d 512M Changes for v7: * Added patch to enforce the same memory alignments in s390-virtio.c, so that shared code (like sclp) doesn't need to be dual paths. Ping... Changes for v6: * Fix in sclp.h - DeviceState parent -- SysBusDevice parent in struct sclpMemoryHotplugDev. * Fix in assign_storage - int this_subregion_size, should be uint64_t. * Added information on how to test in the cover letter. Changes for v5: * Since ACPI memory hotplug is now in, removed Igor's patches from this set. * Updated sclp.c to use object_resolve_path() instead of object_property_find(). Matthew Rosato (4): sclp-s390: Add device to manage s390 memory hotplug virtio-ccw: Include standby memory when calculating storage increment s390-virtio: Apply same memory boundaries as virtio-ccw sclp-s390: Add memory hotplug SCLPs hw/s390x/s390-virtio-ccw.c | 46 +-- hw/s390x/s390-virtio.c | 15 ++- hw/s390x/sclp.c| 289 +++- include/hw/s390x/sclp.h| 20 +++ qemu-options.hx|3 +- target-s390x/cpu.h | 18 +++ target-s390x/kvm.c |5 + 7 files changed, 375 insertions(+), 21 deletions(-)
Re: [Qemu-devel] [PATCH 06/12] kvmvapic: fixing loading vmstate
Il 27/08/2014 15:03, Pavel Dovgaluk ha scritto: Hmm, probably not. The bug would not be other timers accessing the APIC, because that would also call apic_sync_vapic and the only effect would be an extra useless synchronization. The bug would happen if the APIC is accessed by the CPU before the timer has the occasion to run. Sorry, but I don't understand which problem we will solve with apic_sync_vapic. In KVM mode, it is not a problem to call apic_enable_vapic before APIC state is loaded; all vapic processing is delayed anyway to after the VCPUs are started. In TCG mode, apic_enable_vapic calls apic_sync_vapic. Taking inspiration from what KVM does, the fix could be even simpler than a change state handler. run_on_cpu functions do not run while the VM is stopped, so the following should work: diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c index ce3d903..81d1ad7 100644 --- a/hw/intc/apic_common.c +++ b/hw/intc/apic_common.c @@ -91,13 +91,20 @@ void apic_enable_tpr_access_reporting(DeviceState *dev, bool enable) } } +static void do_apic_enable_vapic(void *data) +{ +APICCommonState *s = APIC_COMMON(data); +APICCommonClass *info = APIC_COMMON_GET_CLASS(s); + +info-vapic_base_update(s); +} + void apic_enable_vapic(DeviceState *dev, hwaddr paddr) { APICCommonState *s = APIC_COMMON(dev); -APICCommonClass *info = APIC_COMMON_GET_CLASS(s); s-vapic_paddr = paddr; -info-vapic_base_update(s); +run_on_cpu(CPU(s-cpu), do_apic_enable_vapic, s); } void apic_handle_tpr_access_report(DeviceState *dev, target_ulong ip, All fields that were added to vmstate are not affected by this function. The sipi_vector and wait_for_sipi parts of this patch are okay. You should split those in a separate patch. Paolo
Re: [Qemu-devel] [PATCH] block: Introduce null driver
Eric Blake ebl...@redhat.com writes: On 08/27/2014 12:22 AM, Fam Zheng wrote: This is an analogue to Linux null_blk. It can be used for testing block device emulation and general block layer functionalities such as coroutines and throttling, where disk IO is not necessary or wanted. Signed-off-by: Fam Zheng f...@redhat.com --- block/Makefile.objs | 1 + block/null.c | 172 2 files changed, 173 insertions(+) create mode 100644 block/null.c Worth also adding it to BlockdevOptions in qapi/block-core.json to allow hotplug of this driver? (Not that libvirt will ever hotplug one, but I really want to move towards being feature-complete on letting QMP drive all block types) I'd support a policy of no new block drivers outside QAPI.
Re: [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset
On Wed, 2014-08-27 at 23:15 +1000, Gavin Shan wrote: On Tue, Aug 26, 2014 at 02:25:47PM -0600, Alex Williamson wrote: On Wed, 2014-08-20 at 19:52 +1000, Gavin Shan wrote: The PCI device MSIx table is cleaned out in hardware after EEH PE reset. However, we still hold the stale MSIx entries in QEMU, which should be cleared accordingly. Otherwise, we will run into another (recursive) EEH error and the PCI devices contained in the PE have to be offlined exceptionally. The patch clears stale MSIx table before EEH PE reset so that MSIx table could be restored properly after EEH PE reset. Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com --- hw/misc/vfio.c | 24 1 file changed, 24 insertions(+) diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c index 1a3e7eb..3cf7f02 100644 --- a/hw/misc/vfio.c +++ b/hw/misc/vfio.c @@ -4424,6 +4424,8 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, { VFIOGroup *group; VFIOContainer *container; +VFIODevice *vdev; +struct vfio_eeh_pe_op *arg; Define these within the scope of the case since they're not used outside of it. Yes, I'll fix. int ret = -1; group = vfio_get_group(groupid, as); @@ -4442,7 +,29 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, switch (req) { case VFIO_CHECK_EXTENSION: case VFIO_IOMMU_SPAPR_TCE_GET_INFO: +break; case VFIO_EEH_PE_OP: +arg = (struct vfio_eeh_pe_op *)param; +switch (arg-op) { +case VFIO_EEH_PE_RESET_HOT: +case VFIO_EEH_PE_RESET_FUNDAMENTAL: +/* + * The MSIx table will be cleaned out by reset. We need + * disable it so that it can be reenabled properly. Also, + * the cached MSIx table should be cleared as it's not + * reflecting the contents in hardware. + */ +QLIST_FOREACH(vdev, group-device_list, next) { +if (msix_enabled(vdev-pdev)) { +vfio_disable_msix(vdev); +} + +msix_reset(vdev-pdev); +} We already have vfio_disable_interrupts(), can't we use that (blindly)? Do we really need to call msix_reset()? If so, should vfio_disable_msix() call it? Yes, vfio_disable_interrupts() would be better to be used here as it can covers all types of interrupt (INTx/MSI/MSIx). vfio_disable_interrupts() needn't clear MSIx vectors. If you prefer calling msix_reset() in the function, I guess I have to add one more parameter to vfio_disable_interrupts() to indicate if we need clear MSI/MSIx vector: static void vfio_disable_interrupts(VFIODevice *vdev, bool clr_vector) How about just creating a vfio_reset_interrupts() that calls msix_reset() if the device supports MSIX? We could wrap vfio_disable_interrupts() and vfio_reset_interrupts() into a vfio_disable_and_reset_interrupts(), but I'm not sure it's worth it. The reset device path should also add the interrupt reset call. Thanks, Alex + +break; Extraneous break Yep, I'll remove it. +} + break; default: /* Return an error on unknown requests */ Thanks, Alex Thanks, Gavin
[Qemu-devel] [PATCH] blockdev: fix drive-mirror 'granularity' error message
Name the 'granularity' parameter and give its expected value range. Previously the device name was mistakingly reported as the parameter name. Note that the error class is unchanged from ERROR_CLASS_GENERIC_ERROR. Reported-by: Eric Blake ebl...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- blockdev.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/blockdev.c b/blockdev.c index 6a204c6..eeb414e 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2179,11 +2179,12 @@ void qmp_drive_mirror(const char *device, const char *target, } if (granularity != 0 (granularity 512 || granularity 1048576 * 64)) { -error_set(errp, QERR_INVALID_PARAMETER, device); +error_set(errp, QERR_INVALID_PARAMETER_VALUE, granularity, + a value in range [512B, 64MB]); return; } if (granularity (granularity - 1)) { -error_set(errp, QERR_INVALID_PARAMETER, device); +error_set(errp, QERR_INVALID_PARAMETER_VALUE, granularity, power of 2); return; } -- 1.9.3
Re: [Qemu-devel] bad error message in drive-mirror
On Mon, Aug 25, 2014 at 05:26:31PM -0600, Eric Blake wrote: I just noticed today that passing granularity:1 to QMP drive-mirror results in this confusing error message: Invalid parameter 'drive-virtio-disk0' because the code mistakenly did error_set(errp, QERR_INVALID_PARAMETER, device); instead of the correct error_set(errp, QERR_INVALID_PARAMETER, granularity); I ran out of time to patch it today, and at any rate, whoever writes the patch should probably audit for other bad uses of error_set(QERR_INVALID_PARAMETER). I have sent a patch to fix this. Thanks for reporting it. Stefan pgpt8huFjYu3J.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH] blockdev: fix drive-mirror 'granularity' error message
On 08/27/2014 07:29 AM, Stefan Hajnoczi wrote: Name the 'granularity' parameter and give its expected value range. Previously the device name was mistakingly reported as the parameter s/mistakingly/mistakenly/ name. Note that the error class is unchanged from ERROR_CLASS_GENERIC_ERROR. Reported-by: Eric Blake ebl...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- blockdev.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) buf-size has the same bug a few lines later; might as well fix both at once. But what you have is a strict improvement, so depending on whether you do a v2 to fix buf-size, vs. do a followup patch, you could add: Reviewed-by: Eric Blake ebl...@redhat.com -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature