[Qemu-devel] [Bug 1242963] Re: QEMU loadvm causes guest OS freeze
Hi guys, This bug has reported in a very long time. I have tested v1.7.0, v1.7.1 and even v2.0.0rc1, v2.0.0rc2. The bug is still there. This is very easy to reproduce. I dunno why it is not fixed until today. I think living-snapshot is a very important feature for Qemu. Somebody can take a look at this issue. Or at least, give me some tips to let me know how to fix this? Thanks! -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1242963 Title: QEMU loadvm causes guest OS freeze Status in QEMU: New Bug description: HOST: ubuntu 13.10 x64 GUEST: winxp sp 3 x86 AFFECT QEMU(tested): v1.5.2, v1.5.3, v1.6.0, v1.6.1 I compile QEMU by myself with ./configure --target-list=i386-softmmu make make install. After installing a winxp sp3 into the qemu-system-i386 with command line: qemu-system-i386 -m 512 -hda xp.img -net user -net nic,model=rtl8139 -rtc base=localtime,clock=vm I use monitor to create a live snapshot: stop savevm xxx cont And then I load this snapshot (I also try it in commad line: -loadvm xxx): loadvm xxx cont After that, the windows system is freeze (don't accept any keyboard or mouse input, although I knew vcpu is still working). If I compile with -enable-kvm and launch qemu-system-i386 with -enable-kvm, it looks like everything works well. I think it is a bug for qemu system. BTW: freeze is not appearing 100%, but in my test, 95% cases would cause system freeze. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1242963/+subscriptions
[Qemu-devel] [PATCH] qmp: synchronize state of the cpu before memory reading
calling memsave' through the qmp makes a zero filled dump. the hmp still works properly because hmp_memsave calls cpu_synchronize_state before calling qmp_memsave. --- Andrey Karpov (1): synchronize state of the cpu before memory reading cpus.c |1 + 1 file changed, 1 insertion(+) -- Signature
[Qemu-devel] [PATCH] synchronize state of the cpu before memory reading
--- cpus.c |1 + 1 file changed, 1 insertion(+) diff --git a/cpus.c b/cpus.c index 1104d61..4c53747 100644 --- a/cpus.c +++ b/cpus.c @@ -1417,6 +1417,7 @@ void qmp_memsave(int64_t addr, int64_t size, const char *filename, return; } +cpu_synchronize_state(cpu); while (size != 0) { l = sizeof(buf); if (l size)
[Qemu-devel] [PATCH v8 3/4] raw-posix: Add full image preallocation option
This patch adds a new option preallocation for raw format, and implements full preallocation. Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- block/raw-posix.c | 61 --- 1 file changed, 54 insertions(+), 7 deletions(-) diff --git a/block/raw-posix.c b/block/raw-posix.c index a363b71..44a1b4a 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -1240,6 +1240,7 @@ static int raw_create(const char *filename, QEMUOptionParameter *options, int fd; int result = 0; int64_t total_size = 0; +PreallocMode prealloc = PREALLOC_MODE_OFF; strstart(filename, file:, filename); @@ -1247,6 +1248,18 @@ static int raw_create(const char *filename, QEMUOptionParameter *options, while (options options-name) { if (!strcmp(options-name, BLOCK_OPT_SIZE)) { total_size = options-value.n BDRV_SECTOR_MASK; +} else if (!strcmp(options-name, BLOCK_OPT_PREALLOC)) { +if (!options-value.s || !strcmp(options-value.s, off)) { +prealloc = PREALLOC_MODE_OFF; +} else if (!strcmp(options-value.s, metadata)) { +prealloc = PREALLOC_MODE_METADATA; +} else if (!strcmp(options-value.s, full)) { +prealloc = PREALLOC_MODE_FULL; +} else { +error_setg(errp, Invalid preallocation mode: '%s', + options-value.s); +return -EINVAL; +} } options++; } @@ -1256,16 +1269,45 @@ static int raw_create(const char *filename, QEMUOptionParameter *options, if (fd 0) { result = -errno; error_setg_errno(errp, -result, Could not create file); -} else { -if (ftruncate(fd, total_size) != 0) { -result = -errno; -error_setg_errno(errp, -result, Could not resize file); +goto out; +} +if (ftruncate(fd, total_size) != 0) { +result = -errno; +error_setg_errno(errp, -result, Could not resize file); +goto out_close; +} +if (prealloc == PREALLOC_MODE_METADATA) { +/* posix_fallocate() doesn't set errno. */ +result = -posix_fallocate(fd, 0, total_size); +if (result != 0) { +error_setg_errno(errp, -result, + Could not preallocate data for the new file); } -if (qemu_close(fd) != 0) { -result = -errno; -error_setg_errno(errp, -result, Could not close the new file); +} else if (prealloc == PREALLOC_MODE_FULL) { +char *buf = g_malloc0(65536); +int64_t num = 0, left = total_size; + +while (left 0) { +num = MIN(left, 65536); +result = write(fd, buf, num); +if (result 0) { +result = -errno; +error_setg_errno(errp, -result, + Could not write to the new file); +g_free(buf); +goto out_close; +} +left -= num; } +fsync(fd); +g_free(buf); +} +out_close: +if (qemu_close(fd) != 0) { +result = -errno; +error_setg_errno(errp, -result, Could not close the new file); } +out: return result; } @@ -1416,6 +1458,11 @@ static QEMUOptionParameter raw_create_options[] = { .type = OPT_SIZE, .help = Virtual disk size }, +{ +.name = BLOCK_OPT_PREALLOC, +.type = OPT_STRING, +.help = Preallocation mode (allowed values: off, metadata, full) +}, { NULL } }; -- 1.8.5.2.229.g4448466
[Qemu-devel] [PATCH v8 2/4] raw, qcow2: don't convert file size to sector size
and avoid converting it back later. Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- block/qcow2.c | 8 block/raw-posix.c | 4 ++-- block/raw-win32.c | 4 ++-- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 3377007..b2ff9c3 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1713,7 +1713,7 @@ static int qcow2_create2(const char *filename, int64_t total_size, } /* Okay, now that we have a valid image, let's give it the right size */ -ret = bdrv_truncate(bs, total_size * BDRV_SECTOR_SIZE); +ret = bdrv_truncate(bs, total_size); if (ret 0) { error_setg_errno(errp, -ret, Could not resize image); goto out; @@ -1766,7 +1766,7 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options, { const char *backing_file = NULL; const char *backing_fmt = NULL; -uint64_t sectors = 0; +uint64_t size = 0; int flags = 0; size_t cluster_size = DEFAULT_CLUSTER_SIZE; PreallocMode prealloc = PREALLOC_MODE_OFF; @@ -1777,7 +1777,7 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options, /* Read out options */ while (options options-name) { if (!strcmp(options-name, BLOCK_OPT_SIZE)) { -sectors = options-value.n / 512; +size = options-value.n BDRV_SECTOR_MASK; } else if (!strcmp(options-name, BLOCK_OPT_BACKING_FILE)) { backing_file = options-value.s; } else if (!strcmp(options-name, BLOCK_OPT_BACKING_FMT)) { @@ -1828,7 +1828,7 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options, return -EINVAL; } -ret = qcow2_create2(filename, sectors, backing_file, backing_fmt, flags, +ret = qcow2_create2(filename, size, backing_file, backing_fmt, flags, cluster_size, prealloc, options, version, local_err); if (local_err) { error_propagate(errp, local_err); diff --git a/block/raw-posix.c b/block/raw-posix.c index 1688e16..a363b71 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -1246,7 +1246,7 @@ static int raw_create(const char *filename, QEMUOptionParameter *options, /* Read out options */ while (options options-name) { if (!strcmp(options-name, BLOCK_OPT_SIZE)) { -total_size = options-value.n / BDRV_SECTOR_SIZE; +total_size = options-value.n BDRV_SECTOR_MASK; } options++; } @@ -1257,7 +1257,7 @@ static int raw_create(const char *filename, QEMUOptionParameter *options, result = -errno; error_setg_errno(errp, -result, Could not create file); } else { -if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) { +if (ftruncate(fd, total_size) != 0) { result = -errno; error_setg_errno(errp, -result, Could not resize file); } diff --git a/block/raw-win32.c b/block/raw-win32.c index 48cb2c2..9fe60f3 100644 --- a/block/raw-win32.c +++ b/block/raw-win32.c @@ -486,7 +486,7 @@ static int raw_create(const char *filename, QEMUOptionParameter *options, /* Read out options */ while (options options-name) { if (!strcmp(options-name, BLOCK_OPT_SIZE)) { -total_size = options-value.n / 512; +total_size = options-value.n BDRV_SECTOR_MASK; } options++; } @@ -498,7 +498,7 @@ static int raw_create(const char *filename, QEMUOptionParameter *options, return -EIO; } set_sparse(fd); -ftruncate(fd, total_size * 512); +ftruncate(fd, total_size); qemu_close(fd); return 0; } -- 1.8.5.2.229.g4448466
[Qemu-devel] [PATCH v8 1/4] qapi: introduce PreallocMode and a new PreallocMode full.
This patch prepares for the subsequent patches. Reviewed-by: Fam Zheng f...@redhat.com Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- block/qcow2.c| 8 qapi-schema.json | 14 ++ 2 files changed, 18 insertions(+), 4 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index e903d97..3377007 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1592,7 +1592,7 @@ static int preallocate(BlockDriverState *bs) static int qcow2_create2(const char *filename, int64_t total_size, const char *backing_file, const char *backing_format, - int flags, size_t cluster_size, int prealloc, + int flags, size_t cluster_size, PreallocMode prealloc, QEMUOptionParameter *options, int version, Error **errp) { @@ -1769,7 +1769,7 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options, uint64_t sectors = 0; int flags = 0; size_t cluster_size = DEFAULT_CLUSTER_SIZE; -int prealloc = 0; +PreallocMode prealloc = PREALLOC_MODE_OFF; int version = 3; Error *local_err = NULL; int ret; @@ -1790,9 +1790,9 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options, } } else if (!strcmp(options-name, BLOCK_OPT_PREALLOC)) { if (!options-value.s || !strcmp(options-value.s, off)) { -prealloc = 0; +prealloc = PREALLOC_MODE_OFF; } else if (!strcmp(options-value.s, metadata)) { -prealloc = 1; +prealloc = PREALLOC_MODE_METADATA; } else { error_setg(errp, Invalid preallocation mode: '%s', options-value.s); diff --git a/qapi-schema.json b/qapi-schema.json index 391356f..9e6221a 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -4689,3 +4689,17 @@ 'btn' : 'InputBtnEvent', 'rel' : 'InputMoveEvent', 'abs' : 'InputMoveEvent' } } + +## +# @PreallocMode +# +# Preallocation mode of QEMU image file +# +# @off: no preallocation +# @metadata: preallocate only for metadata +# @full: preallocate all data, including metadata +# +# Since 2.0 +## +{ 'enum': 'PreallocMode', + 'data': [ 'off', 'metadata', 'full' ] } -- 1.8.5.2.229.g4448466
[Qemu-devel] [PATCH v4 2/6] register: Add Register API
This API provides some encapsulation of registers and factors our some common functionality to common code. Bits of device state (usually MMIO registers), often have all sorts of access restrictions and semantics associated with them. This API allow you to define what those restrictions are on a bit-by-bit basis. Helper functions are then used to access the register which observe the semantics defined by the RegisterAccessInfo struct. Some features: Bits can be marked as read_only (ro field) Bits can be marked as write-1-clear (w1c field) Bits can be marked as reserved (rsvd field) Reset values can be defined (reset) Bits can throw guest errors when written certain values (ge0, ge1) Bits can throw unimp errors when written certain values (ui0, ui1) Bits can be marked clear on read (cor) Pre and post action callbacks can be added to read and write ops Verbose debugging info can be enabled/disabled Useful for defining device register spaces in a data driven way. Cuts down on a lot of the verbosity and repetition in the switch-case blocks in the standard foo_mmio_read/write functions. Also useful for automated generation of device models from hardware design sources. Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com --- changed from v2: Simplified! Removed pre-read, nwx, wo Removed byte loops (Gerd Review) Made data pointer optional Added fast paths for simple registers Moved into hw/core and include/hw (Paolo Review) changed from v1: Rebranded as the Register API - I think thats probably what it is. Near total rewrite of implementation. De-arrayified reset (this is client/Memory APIs job). Moved out of bitops into its own file (Blue review) Added debug, the register pointer, and prefix to a struct (Blue Review) Made 64-bit to play friendlier with memory API (Blue review) Made backend storage uint8_t (MST review) Added read/write callbacks (Blue review) Added ui0, ui1 (Blue review) Moved re-purposed width (now byte width defining actual storage size) Arrayified ge0, ge1 (ui0, ui1 too) and added .reason Added wo field (not an April fools joke - this has genuine meaning here) Added we mask to write accessor hw/core/Makefile.objs | 1 + hw/core/register.c| 197 ++ include/hw/register.h | 131 + 3 files changed, 329 insertions(+) create mode 100644 hw/core/register.c create mode 100644 include/hw/register.h diff --git a/hw/core/Makefile.objs b/hw/core/Makefile.objs index 5377d05..7839f46 100644 --- a/hw/core/Makefile.objs +++ b/hw/core/Makefile.objs @@ -4,6 +4,7 @@ common-obj-y += fw-path-provider.o # irq.o needed for qdev GPIO handling: common-obj-y += irq.o common-obj-y += hotplug.o +common-obj-y += register.o common-obj-$(CONFIG_EMPTY_SLOT) += empty_slot.o common-obj-$(CONFIG_XILINX_AXI) += stream.o diff --git a/hw/core/register.c b/hw/core/register.c new file mode 100644 index 000..96712f2 --- /dev/null +++ b/hw/core/register.c @@ -0,0 +1,197 @@ +/* + * Register Definition API + * + * Copyright (c) 2013 Xilinx Inc. + * Copyright (c) 2013 Peter Crosthwaite peter.crosthwa...@xilinx.com + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + */ + +#include hw/register.h +#include qemu/log.h + +static inline void register_write_log(RegisterInfo *reg, int dir, uint64_t val, + int mask, const char *msg, + const char *reason) +{ +qemu_log_mask(mask, %s:%s bits %# PRIx64 %s write of %d%s%s\n, + reg-prefix, reg-access-name, val, msg, dir, + reason ? : : , reason ? reason : ); +} + +static inline void register_write_val(RegisterInfo *reg, uint64_t val) +{ +if (!reg-data) { +return; +} +switch (reg-data_size) { +case 1: +*(uint8_t *)reg-data = val; +break; +case 2: +*(uint16_t *)reg-data = val; +break; +case 4: +*(uint32_t *)reg-data = val; +break; +case 8: +*(uint64_t *)reg-data = val; +break; +default: +abort(); +} +} + +static inline uint64_t register_read_val(RegisterInfo *reg) +{ +switch (reg-data_size) { +case 1: +return *(uint8_t *)reg-data; +case 2: +return *(uint16_t *)reg-data; +case 4: +return *(uint32_t *)reg-data; +case 8: +return *(uint64_t *)reg-data; +default: +abort(); +} +return 0; /* unreachable */ +} + +void register_write(RegisterInfo *reg, uint64_t val, uint64_t we) +{ +uint64_t old_val, new_val, test, no_w_mask; +const RegisterAccessInfo *ac; +const RegisterAccessError *rae; + +assert(reg); + +ac = reg-access; +if (reg-write_lite !~we) { /* fast path!! */ +new_val = val; +goto register_write_fast; +} + +old_val = reg-data ? register_read_val(reg) : ac-reset; +
[Qemu-devel] [PATCH v4 3/6] register: Add Memory API glue
Add memory io handlers that glue the register API to the memory API. Just translation functions at this stage. Although it does allow for devices to be created without all-in-one mmio r/w handlers. Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com --- changed from v2: Added fast path to register_write_memory to skip endianness bitbashing hw/core/register.c| 48 include/hw/register.h | 12 2 files changed, 60 insertions(+) diff --git a/hw/core/register.c b/hw/core/register.c index 96712f2..29e5f04 100644 --- a/hw/core/register.c +++ b/hw/core/register.c @@ -195,3 +195,51 @@ void register_reset(RegisterInfo *reg) register_write_val(reg, reg-access-reset); } + +static inline void register_write_memory(void *opaque, hwaddr addr, + uint64_t value, unsigned size, bool be) +{ +RegisterInfo *reg = opaque; +uint64_t we = ~0; +int shift = 0; + +if (reg-data_size != size) { +we = (size == 8) ? ~0ull : (1ull size * 8) - 1; +shift = 8 * (be ? reg-data_size - size - addr : addr); +} + +assert(size + addr = reg-data_size); +register_write(reg, value shift, we shift); +} + +void register_write_memory_be(void *opaque, hwaddr addr, uint64_t value, + unsigned size) +{ +register_write_memory(opaque, addr, value, size, true); +} + + +void register_write_memory_le(void *opaque, hwaddr addr, uint64_t value, + unsigned size) +{ +register_write_memory(opaque, addr, value, size, false); +} + +static inline uint64_t register_read_memory(void *opaque, hwaddr addr, +unsigned size, bool be) +{ +RegisterInfo *reg = opaque; +int shift = 8 * (be ? reg-data_size - size - addr : addr); + +return register_read(reg) shift; +} + +uint64_t register_read_memory_be(void *opaque, hwaddr addr, unsigned size) +{ +return register_read_memory(opaque, addr, size, true); +} + +uint64_t register_read_memory_le(void *opaque, hwaddr addr, unsigned size) +{ +return register_read_memory(opaque, addr, size, false); +} diff --git a/include/hw/register.h b/include/hw/register.h index f747a7f..c595ed6 100644 --- a/include/hw/register.h +++ b/include/hw/register.h @@ -87,6 +87,8 @@ struct RegisterAccessInfo { * @prefix: String prefix for log and debug messages * * @opaque: Opaque data for the register + * + * @mem: optional Memory region for the register */ struct RegisterInfo { @@ -102,6 +104,8 @@ struct RegisterInfo { /* private */ bool read_lite; bool write_lite; + +MemoryRegion mem; }; /** @@ -128,4 +132,12 @@ uint64_t register_read(RegisterInfo *reg); void register_reset(RegisterInfo *reg); +void register_write_memory_be(void *opaque, hwaddr addr, uint64_t value, + unsigned size); +void register_write_memory_le(void *opaque, hwaddr addr, uint64_t value, + unsigned size); + +uint64_t register_read_memory_be(void *opaque, hwaddr addr, unsigned size); +uint64_t register_read_memory_le(void *opaque, hwaddr addr, unsigned size); + #endif -- 1.9.1.1.gbb9f595
[Qemu-devel] [PATCH v4 0/6] Data Driven device registers + Zynq DEVCFG
Hi All. This is a new scheme i've come up with handling device registers in a data driven way. My motivation for this is to factor out a lot of the access checking that seems to be replicated in every device. See P2 commit message for further discussion. P1 is a trivial addition to bitops.h P2 is the main patch, adds the register definition functionality P3 adds helpers that glue the register API to the Memory API P4 is an example new device (the Xilinx Zynq devcfg) that uses this scheme. P5 adds devcfg to the Zynq machine model This devcfg device was particularly finnicky with per-bit restrictions which prompted all this. I'm also looking for a higher-than-usual modelling fidelity on the register space, with semantics defined for random reserved bits in-between otherwise consistent fields. Here's an example of the qemu_log output for the devcfg device. This is produced by now generic sharable code: /machine/unattached/device[44]:Addr 0x08:CFG: write of value 0508 /machine/unattached/device[44]:Addr 0x80:MCTRL: write of value 00800010 /machine/unattached/device[44]:Addr 0x10:INT_MASK: write of value /machine/unattached/device[44]:Addr :CTRL: write of value 0c00607f And an example of a rogue guest banging on a bad bit: /machine/unattached/device[44]:Addr 0x14:STATUS bits 0x01 may not be \ written to 1 Changed from v3: Rebased (its been a while) Added reserved bits. Cleaner separation of decode and access components (Patch 3) Changed from v2: Fixed for hw/ re-orginisation (Paolo review) Simplified and optimized (PMM and Gerd review) Changed from v1: Added ONES macro patch Dropped bogus former patch 1 (PMM review) Addressed Blue, Gerd and MST comments. Simplified to be more Memory API compatible. Added Memory API helpers. Please see discussion already on list and commit msgs for more detail. Peter Crosthwaite (6): bitops: Add ONES macro register: Add Register API register: Add Memory API glue register: Add support for decoding information xilinx_devcfg: Zynq devcfg device model xilinx_zynq: added devcfg to machine model default-configs/arm-softmmu.mak | 1 + hw/arm/xilinx_zynq.c| 8 + hw/core/Makefile.objs | 1 + hw/core/register.c | 245 + hw/dma/Makefile.objs| 1 + hw/dma/xilinx_devcfg.c | 462 include/hw/register.h | 164 ++ include/qemu/bitops.h | 2 + 8 files changed, 884 insertions(+) create mode 100644 hw/core/register.c create mode 100644 hw/dma/xilinx_devcfg.c create mode 100644 include/hw/register.h -- 1.9.1.1.gbb9f595
[Qemu-devel] [PATCH v4 4/6] register: Add support for decoding information
Allow defining of optional address decoding information in register definitions. This is useful for clients that want to associate registers with specific addresses. Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com --- include/hw/register.h | 21 + 1 file changed, 21 insertions(+) diff --git a/include/hw/register.h b/include/hw/register.h index c595ed6..d7f602f 100644 --- a/include/hw/register.h +++ b/include/hw/register.h @@ -15,6 +15,7 @@ typedef struct RegisterInfo RegisterInfo; typedef struct RegisterAccessInfo RegisterAccessInfo; +typedef struct RegisterDecodeInfo RegisterDecodeInfo; /** * A register access error message @@ -54,6 +55,11 @@ typedef struct RegisterAccessError { * allowing this function to modify the value before return to the client. */ +#define REG_DECODE_READ (1 0) +#define REG_DECODE_WRITE (1 1) +#define REG_DECODE_EXECUTE (1 2) +#define REG_DECODE_RW (REG_DECODE_READ | REG_DECODE_WRITE) + struct RegisterAccessInfo { const char *name; uint64_t ro; @@ -71,6 +77,14 @@ struct RegisterAccessInfo { void (*post_write)(RegisterInfo *reg, uint64_t val); uint64_t (*post_read)(RegisterInfo *reg, uint64_t val); + +size_t storage; +int data_size; + +struct { +hwaddr addr; +uint8_t flags; +} decode; }; /** @@ -108,6 +122,13 @@ struct RegisterInfo { MemoryRegion mem; }; + +struct RegisterDecodeInfo { +RegisterInfo *reg; +hwaddr addr; +unsigned len; +}; + /** * write a value to a register, subject to its restrictions * @reg: register to write to -- 1.9.1.1.gbb9f595
Re: [Qemu-devel] [V3 PATCH 1/2] target-ppc: Fix target_disas
On 8 April 2014 20:26, Tom Musta tommu...@gmail.com wrote: Inspect only bit 16 for the Little Endian test. Correct comment preceding the target_disas() function. Signed-off-by: Tom Musta tommu...@gmail.com Reviewed-by: Peter Maydell peter.mayd...@linaro.org
[Qemu-devel] [PATCH v8 0/4] qemu-img: add preallocation=full
The purpose of this series is to use posix_fallocate() when creating img file to ensure there are disk space for it which is way fast than acturally writing to disk. But this only works in file system level. For cases like thin provisioning, an option full preallocation is added to write zeros to storage to ensure disk space. Hu Tao (4): qapi: introduce PreallocMode and a new PreallocMode full. raw, qcow2: don't convert file size to sector size raw-posix: Add full image preallocation option qcow2: Add full image preallocation option block/qcow2.c | 95 -- block/raw-posix.c | 63 ++ block/raw-win32.c | 4 +- qapi-schema.json | 14 +++ tests/qemu-iotests/082.out | 54 +- 5 files changed, 182 insertions(+), 48 deletions(-) -- 1.8.5.2.229.g4448466
Re: [Qemu-devel] [V3 PATCH 2/2] monitor: QEMU Monitor Instruction Disassembly Incorrect for PowerPC LE Mode
On 8 April 2014 20:26, Tom Musta tommu...@gmail.com wrote: The monitor support for disassembling instructions does not honor the MSR[LE] bit for PowerPC processors. This change enhances the monitor_disas() routine by supporting a flag bit for Little Endian mode. Bit 16 is used since that bit was used in the analagous guest disassembly routine target_disas(). Also, to be consistent with target_disas(), the disassembler bfd_mach field can be passed in the flags argument. Reported-by: Anton Blanchard an...@samba.org Signed-off-by: Tom Musta tommu...@gmail.com --- V2: Look specifically at bit 16 for LE. Support machine configuration in flags. V3: Changed documentation of 'flags' argument to simply refer to the target_disas description (per Peter Maydell's review). The bug can be easily observed by dumping the first few instructions of an interrupt vector (0x300 is the Data Storage Interrupt handler in PPC) (qemu) xp/8i 0x300 0x0300: .long 0x60 0x0304: lhzur18,-19843(r3) 0x0308: .long 0x60 0x030c: lhzur18,-20099(r2) 0x0310: lwz r0,11769(0) 0x0314: lhzur23,8317(r2) 0x0318: .long 0x7813427c 0x031c: lbz r0,19961(0) With the patch applied, the disassembly now looks correct: (qemu) xp/8i 0x300 0x0300: nop 0x0304: mtsprg 2,r13 0x0308: nop 0x030c: mfsprg r13,1 0x0310: std r9,128(r13) 0x0314: mfspr r9,896 0x0318: mr r2,r2 0x031c: std r10,136(r13) disas.c | 14 -- monitor.c |4 2 files changed, 16 insertions(+), 2 deletions(-) diff --git a/disas.c b/disas.c index a427d18..f300102 100644 --- a/disas.c +++ b/disas.c @@ -445,6 +445,8 @@ monitor_fprintf(FILE *stream, const char *fmt, ...) return 0; } +/* Disassembler for the monitor. See target_disas for a description of 'flags'. + */ You could just put that */ on the previous line, right? Or is it too long if you do? void monitor_disas(Monitor *mon, CPUArchState *env, target_ulong pc, int nb_insn, int is_physical, int flags) { @@ -485,11 +487,19 @@ void monitor_disas(Monitor *mon, CPUArchState *env, s.info.mach = bfd_mach_sparc_v9b; #endif #elif defined(TARGET_PPC) +if (flags 0x) { +/* If we have a precise definitions of the instructions set, use it */ definition, instruction. Otherwise: Reviewed-by: Peter Maydell peter.mayd...@linaro.org thanks -- PMM
[Qemu-devel] [PATCH v4 1/6] bitops: Add ONES macro
Little macro that just gives you N ones (justified to LSB). Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com --- include/qemu/bitops.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/qemu/bitops.h b/include/qemu/bitops.h index 340b1e7..b9fd60f 100644 --- a/include/qemu/bitops.h +++ b/include/qemu/bitops.h @@ -403,4 +403,6 @@ static inline uint64_t deposit64(uint64_t value, int start, int length, return (value ~mask) | ((fieldval start) mask); } +#define ONES(num) ((num) == 64 ? ~0ull : (1ull (num)) - 1) + #endif -- 1.9.1.1.gbb9f595
Re: [Qemu-devel] [PATCH for-2.0] configure: add option to disable -fstack-protector flags
On 8 April 2014 21:47, Noonan, Steven snoo...@amazon.com wrote: On Tue, Apr 08, 2014 at 09:37:27PM +0100, Peter Maydell wrote: This is bad because we have that framework argument as part of our linker flags. Effectively this means that clang won't warn about the argument at link time but will warn for every .c-.o compile (as well as ending up with no stack protection). So -framework is designed to transparently add the appropriate -I and -L/-l flags, pointing to the insides of a .framework bundle. To me, the -framework arguments belong in CFLAGS and LIBS, but not LDFLAGS. In the context of QEMU's configure script, I think it'd be QEMU_INCLUDES and LIBS. Unfortunately, putting -framework CoreFoundation in CFLAGS produces a different warning: manooth$ clang -o /tmp/zz9.o -Werror -fstack-protector -c /tmp/zz9.c -framework CoreFoundation clang: error: -framework CoreFoundation: 'linker' input unused which would seem to imply that you shouldn't be passing it on the .c-.o compile command line. thanks -- PMM
[Qemu-devel] [PATCH v8 4/4] qcow2: Add full image preallocation option
This adds a preallocation=full mode to qcow2 image creation, which creates a non-sparse image file. Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- block/qcow2.c | 79 -- tests/qemu-iotests/082.out | 54 +++ 2 files changed, 103 insertions(+), 30 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index b2ff9c3..1b0034d 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1596,6 +1596,7 @@ static int qcow2_create2(const char *filename, int64_t total_size, QEMUOptionParameter *options, int version, Error **errp) { +QEMUOptionParameter *alloc_options = NULL; /* Calculate cluster_bits */ int cluster_bits; cluster_bits = ffs(cluster_size) - 1; @@ -1625,10 +1626,78 @@ static int qcow2_create2(const char *filename, int64_t total_size, Error *local_err = NULL; int ret; +if (prealloc == PREALLOC_MODE_FULL || prealloc == PREALLOC_MODE_METADATA) { +int64_t meta_size = 0; +unsigned nreftablee, nrefblocke, nl1e, nl2e; +BlockDriver *drv; + +total_size = align_offset(total_size, cluster_size); + +drv = bdrv_find_protocol(filename, true); +if (drv == NULL) { +error_setg(errp, Could not find protocol for file '%s', filename); +return -ENOENT; +} + +alloc_options = append_option_parameters(alloc_options, + drv-create_options); +alloc_options = append_option_parameters(alloc_options, options); + +/* header: 1 cluster */ +meta_size += cluster_size; + +/* total size of L2 tables */ +nl2e = total_size / cluster_size; +nl2e = align_offset(nl2e, cluster_size / sizeof(uint64_t)); +meta_size += nl2e * sizeof(uint64_t); + +/* total size of L1 tables */ +nl1e = nl2e * sizeof(uint64_t) / cluster_size; +nl1e = align_offset(nl1e, cluster_size / sizeof(uint64_t)); +meta_size += nl1e * sizeof(uint64_t); + +/* total size of refcount blocks + * + * note: every host cluster is reference-counted, including metadata + * (even refcount blocks are recursively included). + * Let: + * a = total_size (this is the guest disk size) + * m = meta size not including refcount blocks and refcount tables + * c = cluster size + * y1 = number of refcount blocks entries + * y2 = meta size including everything + * then, + * y1 = (y2 + a)/c + * y2 = y1 * sizeof(u16) + y1 * sizeof(u16) * sizeof(u64) / c + m + * we can get y1: + * y1 = (a + m) / (c - sizeof(u16) - sizeof(u16) * sizeof(u64) / c) + */ +nrefblocke = (total_size + meta_size + cluster_size) / +(cluster_size - sizeof(uint16_t) - + 1.0 * sizeof(uint16_t) * sizeof(uint64_t) / cluster_size); +nrefblocke = align_offset(nrefblocke, cluster_size / sizeof(uint16_t)); +meta_size += nrefblocke * sizeof(uint16_t); + +/* total size of refcount tables */ +nreftablee = nrefblocke * sizeof(uint16_t) / cluster_size; +nreftablee = align_offset(nreftablee, cluster_size / sizeof(uint64_t)); +meta_size += nreftablee * sizeof(uint64_t); + +set_option_parameter_int(alloc_options, BLOCK_OPT_SIZE, + total_size + meta_size); +if (prealloc == PREALLOC_MODE_FULL) { +set_option_parameter(alloc_options, BLOCK_OPT_PREALLOC, full); +} else if (prealloc == PREALLOC_MODE_METADATA) { +set_option_parameter(alloc_options, BLOCK_OPT_PREALLOC, metadata); +} + +options = alloc_options; +} + ret = bdrv_create_file(filename, options, local_err); if (ret 0) { error_propagate(errp, local_err); -return ret; +goto out_options; } bs = NULL; @@ -1636,7 +1705,7 @@ static int qcow2_create2(const char *filename, int64_t total_size, NULL, local_err); if (ret 0) { error_propagate(errp, local_err); -return ret; +goto out_options; } /* Write the header */ @@ -1758,6 +1827,8 @@ out: if (bs) { bdrv_unref(bs); } +out_options: +free_option_parameters(alloc_options); return ret; } @@ -1793,6 +1864,8 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options, prealloc = PREALLOC_MODE_OFF; } else if (!strcmp(options-value.s, metadata)) { prealloc = PREALLOC_MODE_METADATA; +} else if (!strcmp(options-value.s, full)) { +prealloc = PREALLOC_MODE_FULL; } else { error_setg(errp, Invalid preallocation mode: '%s', options-value.s); @@ -2358,7 +2431,7 @@ static
[Qemu-devel] [PATCH v4 6/6] xilinx_zynq: added devcfg to machine model
Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com --- Changed since v3: Author reset. Changed since v1: Added manual parenting of devcfg node (evil but needed for early access to canonical path by devcfgs realize fn). hw/arm/xilinx_zynq.c | 8 1 file changed, 8 insertions(+) diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c index 9ee21e7..e3505fa 100644 --- a/hw/arm/xilinx_zynq.c +++ b/hw/arm/xilinx_zynq.c @@ -231,6 +231,14 @@ static void zynq_init(QEMUMachineInitArgs *args) sysbus_connect_irq(busdev, n + 1, pic[dma_irqs[n] - IRQ_OFFSET]); } +dev = qdev_create(NULL, xlnx.ps7-dev-cfg); +object_property_add_child(qdev_get_machine(), xilinx-devcfg, OBJECT(dev), + NULL); +qdev_init_nofail(dev); +busdev = SYS_BUS_DEVICE(dev); +sysbus_connect_irq(busdev, 0, pic[40-IRQ_OFFSET]); +sysbus_mmio_map(busdev, 0, 0xF8007000); + zynq_binfo.ram_size = ram_size; zynq_binfo.kernel_filename = kernel_filename; zynq_binfo.kernel_cmdline = kernel_cmdline; -- 1.9.1.1.gbb9f595
Re: [Qemu-devel] [PULL for-2.0] dsdt: tweak ACPI ID for hotplug resource device
On Tue, 8 Apr 2014 15:24:18 +0300 Michael S. Tsirkin m...@redhat.com wrote: ACPI0004 seems too new: Windows XP complains about an unrecognized device. This is a regression since 1.7. Use PNP0A06 instead - Generic Container Device. Signed-off-by: Michael S. Tsirkin m...@redhat.com Reviewed-By: Igor Mammedov imamm...@redhat.com --- hw/i386/acpi-dsdt-cpu-hotplug.dsl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/i386/acpi-dsdt-cpu-hotplug.dsl b/hw/i386/acpi-dsdt-cpu-hotplug.dsl index dee4843..34aab5a 100644 --- a/hw/i386/acpi-dsdt-cpu-hotplug.dsl +++ b/hw/i386/acpi-dsdt-cpu-hotplug.dsl @@ -93,7 +93,7 @@ Scope(\_SB) { } Device(CPU_HOTPLUG_RESOURCE_DEVICE) { -Name(_HID, ACPI0004) +Name(_HID, EisaId(PNP0A06)) Name(_CRS, ResourceTemplate() { IO(Decode16, CPU_STATUS_BASE, CPU_STATUS_BASE, 0, CPU_STATUS_LEN) There is a need to update test cases with new DSDTs, since it breaks them.
[Qemu-devel] [Bug 1242963] Re: QEMU loadvm causes guest OS freeze
I found something, hope this could be useful. (qemu) info cpus * CPU #0: pc=0x8054b7f1 thread_id=20447 (qemu) savevm x (qemu) loadvm x (qemu) info cpus * CPU #0: pc=0x806f69ba (halted) thread_id=20447 -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1242963 Title: QEMU loadvm causes guest OS freeze Status in QEMU: New Bug description: HOST: ubuntu 13.10 x64 GUEST: winxp sp 3 x86 AFFECT QEMU(tested): v1.5.2, v1.5.3, v1.6.0, v1.6.1 I compile QEMU by myself with ./configure --target-list=i386-softmmu make make install. After installing a winxp sp3 into the qemu-system-i386 with command line: qemu-system-i386 -m 512 -hda xp.img -net user -net nic,model=rtl8139 -rtc base=localtime,clock=vm I use monitor to create a live snapshot: stop savevm xxx cont And then I load this snapshot (I also try it in commad line: -loadvm xxx): loadvm xxx cont After that, the windows system is freeze (don't accept any keyboard or mouse input, although I knew vcpu is still working). If I compile with -enable-kvm and launch qemu-system-i386 with -enable-kvm, it looks like everything works well. I think it is a bug for qemu system. BTW: freeze is not appearing 100%, but in my test, 95% cases would cause system freeze. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1242963/+subscriptions
Re: [Qemu-devel] [RFC PATCH] target-ppc: enable migration within the same CPU family
Am 09.04.2014 um 02:41 schrieb Alexey Kardashevskiy a...@ozlabs.ru: On 04/09/2014 12:59 AM, Alexander Graf wrote: On 04/08/2014 02:19 PM, Michael Mueller wrote: On Tue, 08 Apr 2014 21:47:39 +1000 Alexey Kardashevskiy a...@ozlabs.ru wrote: On 04/08/2014 08:32 PM, Michael Mueller wrote: On Tue, 08 Apr 2014 20:04:42 +1000 Alexey Kardashevskiy a...@ozlabs.ru wrote: On 04/08/2014 07:47 PM, Michael Mueller wrote: On Tue, 08 Apr 2014 11:23:14 +1000 Alexey Kardashevskiy a...@ozlabs.ru wrote: On 04/08/2014 04:53 AM, Andreas Färber wrote: Am 07.04.2014 05:27, schrieb Alexey Kardashevskiy: On 04/04/2014 11:28 PM, Alexander Graf wrote: On 04/04/2014 07:17 AM, Alexey Kardashevskiy wrote: On 03/24/2014 04:28 PM, Alexey Kardashevskiy wrote: Currently only migration fails if CPU version is different even a bit. For example, migration from POWER7 v2.0 to POWER7 v2.1 fails because of that. Since there is no difference between CPU versions which could affect migration stream, we can safely enable it. This adds a helper to find the closest POWERPC family class (i.e. first abstract class in hierarchy). This replaces VMSTATE_UINTTL_EQUAL statement with a custom handler which checks if the source and destination CPUs belong to the same family and fails if they are not. This adds a PVR reset to the default value as it will be overwritten by VMSTATE_UINTTL_ARRAY(env.spr, PowerPCCPU, 1024). Since the actual migration format is not changed by this patch, @version_id of vmstate_ppc_cpu does not have to be changed either. Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru Ping? Can't we just always allow migration to succeed? It's a problem of the tool stack above if it allows migration to an incompatible host, no? This is not how libvirt works. It simply sends the source XML, reconstructs a guest on the destination side and then migrates. hoping that the migration will fail is something (which only QEMU has knowledge of) is incompatible. The new guest will start with -cpu host (as the source) but it will create diffrent CPU class and do different things. If we do not check PVR (and cpu_dt_id and chip_id - the latter is coming soon) and migrate power8-power7, we can easily get a broken guest. The response is very simple: -cpu host is not supported for migration. Same as for x86 hosts. Is there any good reason to limit ourselves on POWERPC? As you say, the domain config is transferred by libvirt: If you use -cpu POWER7, you can migrate from POWER7 to POWER8 and back; if you use -cpu POWER8, you can only migrate on POWER8. -cpu other that host is not supported by HV KVM, only compat which upstream QEMU does not have yet. So you are saying that the migration is not supported by upstream QEMU for at least SPAPR. Well, ok, it is dead anyway so I am fine :) With s390x we have a similar situation. Thus we came up with a mechanism to limit the CPU functionality of a possible target system. Our patch implements CPU models based on TYPE and GA like 2817-ga1, etc. (GA represents a CPU facility set and an IBC value (Instruction Blocking Control, reduces the instruction set to the requested level)) When a guest is started, it receives its CPU model by means of option -cpu. host equates the configuration of the current system. We implemented query-cpu-model returning the actual model, here maybe { name: 2817-ga1 }. To find a suitable migration target in a remote CEC, libvirt has to query-cpu-definitions returning a list of models supported by the target system {{name: 2827-ga2}, {name: 2827-ga1}, {name: 2817-ga2},...]. A match means the system is suitable and can be used as migration target. Sorry, I do not follow you. You hacked libvirt to run the destination QEMU with a specific CPU model? Or it is in QEMU? Where? What I see now is this: static const VMStateDescription vmstate_s390_cpu = { .name = cpu, .unmigratable = 1, }; Does not look like it supports migration :) Thanks! The code you're missing is not upstream yet. The s390x guest can be migrated in the meantime. Yes, libvirt currently gets an extension to be able to identify and startup suitable migration targets for s390x on behalf of the mentioned qemu cpu model. BTW can you point me to the above mentioned SPAPR stuff... Mmm. What stuff? :) At the moment POWERPC guests migrate if PVR (processor version register) value is exactly the same. I am trying to relax this limitation to any version within same CPU family, like power7 v1.0 and v2.1. With stuff I referred to to term sPAPR not realizing it relates to the Power Architecture Platform Requirements, got it now. :-) I see, ppc currently has this limitation to enforce compatibility VMSTATE_UINTTL_EQUAL(env.spr[SPR_PVR], PowerPCCPU), Yes, but the s390 approach is a lot cleaner and I'd rather like to move into that direction. Then we
Re: [Qemu-devel] [PATCH v1 1/1] vl.c: Allow sysbus devices to be attached via commandline
Alistair Francis alistair.fran...@xilinx.com writes: On Wed, Apr 9, 2014 at 11:28 AM, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: On Wed, Mar 26, 2014 at 10:47 AM, Alistair Francis alistair.fran...@xilinx.com wrote: This patch introduces a new command line argument that allows sysbus devices to be attached via the command line. This allows devices to be added after the machine init but before anything is booted The new argument is -sysbusdev A new argument is being used to avoid confusion and user errors that would appear if the same command for hot-plugging was used instead. This could be changed to use -device and specify sysbus by using something like 'bus=sysbus' if that is preferred Can you be more specific about the confusion issue you are seeing? If you went for -device bus=root (meaning the default singleton root sysbus), where are the possible issues? Using -device would be possible (and easy to implement). The reason I started with a different argument name was just to avoid confusion and to make sure I didn't interfere with hot-pluggable devices. I still feel that separating hot-pluggable devices from bus devices is a good idea. They are initialised in slightly different parts of the code (although this wouldn't be hard to fix). The name 'sysbusdev' is confusing though, as it doesn't have to connect to the sysbus. -device / device_add already covers both hot plug and cold plug. In fact, only device_add can hot plug; all -device ever does is cold plug. Plugging in a device involves wiring it up. Hot plug additionally involves communication with guest software, but let's ignore that for now. -device uses generic wiring code, guided by configuration: property bus. This suffices for straightforward cases like PCI and USB devices. It doesn't for sysbus devices, because these may require arbitrarily complex wirings. Therefore, sysbus devices are unavailable with -device; see commit 837d371. What we need is a way for configuration to guide more general wiring. Perhaps -device can be extended. If that turns out to be impractical, we need something more expressive that also covers everything -device does now, and can cover everything eventually. Andreas may have ideas here. What we don't need, in my opinion, is more special-case options :) [...]
[Qemu-devel] [PATCH v4 5/6] xilinx_devcfg: Zynq devcfg device model
Minimal device model for devcfg module of Zynq. DMA capabilities and interrupt generation supported. Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com --- Changed since v3: Stylistic updates. Changed over to new decoding scheme. Use .rsvd in definitions as appropriate. Author reset (s/petalogix/xilinx). Changed since v2: Some QOM styling updates. Re-implemented nw0 for lock register as pre_write Changed since v1: Rebased against new version of Register API. Use action callbacks for side effects rather than switch. Documented reasons for ge0, ge1 (Verbatim from TRM) Added ui1 definitions for unimplemented major features Removed dead lock code default-configs/arm-softmmu.mak | 1 + hw/dma/Makefile.objs| 1 + hw/dma/xilinx_devcfg.c | 462 3 files changed, 464 insertions(+) create mode 100644 hw/dma/xilinx_devcfg.c diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak index f3513fa..e0e518c 100644 --- a/default-configs/arm-softmmu.mak +++ b/default-configs/arm-softmmu.mak @@ -62,6 +62,7 @@ CONFIG_PXA2XX=y CONFIG_BITBANG_I2C=y CONFIG_FRAMEBUFFER=y CONFIG_XILINX_SPIPS=y +CONFIG_ZYNQ_DEVCFG=y CONFIG_ARM11SCU=y CONFIG_A9SCU=y diff --git a/hw/dma/Makefile.objs b/hw/dma/Makefile.objs index 0e65ed0..96025cf 100644 --- a/hw/dma/Makefile.objs +++ b/hw/dma/Makefile.objs @@ -5,6 +5,7 @@ common-obj-$(CONFIG_PL330) += pl330.o common-obj-$(CONFIG_I82374) += i82374.o common-obj-$(CONFIG_I8257) += i8257.o common-obj-$(CONFIG_XILINX_AXI) += xilinx_axidma.o +common-obj-$(CONFIG_ZYNQ_DEVCFG) += xilinx_devcfg.o common-obj-$(CONFIG_ETRAXFS) += etraxfs_dma.o common-obj-$(CONFIG_STP2000) += sparc32_dma.o common-obj-$(CONFIG_SUN4M) += sun4m_iommu.o diff --git a/hw/dma/xilinx_devcfg.c b/hw/dma/xilinx_devcfg.c new file mode 100644 index 000..c85f808 --- /dev/null +++ b/hw/dma/xilinx_devcfg.c @@ -0,0 +1,462 @@ +/* + * QEMU model of the Xilinx Devcfg Interface + * + * (C) 2011 PetaLogix Pty Ltd + * (C) 2014 Xilinx Inc. + * Written by Peter Crosthwaite peter.crosthwa...@xilinx.com + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the Software), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include sysemu/sysemu.h +#include sysemu/dma.h +#include hw/sysbus.h +#include hw/ptimer.h +#include qemu/bitops.h +#include hw/register.h +#include qemu/bitops.h + +#define TYPE_XILINX_DEVCFG xlnx.ps7-dev-cfg + +#define XILINX_DEVCFG(obj) \ +OBJECT_CHECK(XilinxDevcfg, (obj), TYPE_XILINX_DEVCFG) + +/* FIXME: get rid of hardcoded nastiness */ + +#define FREQ_HZ 9 + +#define BTT_MAX 0x400 /* bytes to transfer per delay inteval */ +#define CYCLES_BTT_MAX 1 /*approximate 10k cycles per delay interval */ + +#ifndef XILINX_DEVCFG_ERR_DEBUG +#define XILINX_DEVCFG_ERR_DEBUG 0 +#endif +#define DB_PRINT(...) do { \ +if (XILINX_DEVCFG_ERR_DEBUG) { \ +fprintf(stderr, : %s: , __func__); \ +fprintf(stderr, ## __VA_ARGS__); \ +} \ +} while (0); + +#define R_CTRL(0x00/4) +#define FORCE_RST(1 31) /* Not supported, writes ignored */ +#define PCAP_PR (1 27) /* Forced to 0 on bad unlock */ +#define PCAP_MODE(1 26) +#define MULTIBOOT_EN (1 24) +#define USER_MODE(1 15) +#define PCFG_AES_FUSE(1 12) /* locked by AES_FUSE_LOCK */ +#define PCFG_AES_EN_SHIFT9 /* locked by AES_EN_LOCK */ +#define PCFG_AES_EN_LEN 3 /* locked by AES_EN_LOCK */ +#define PCFG_AES_EN_MASK (((1 PCFG_AES_EN_LEN) - 1) \ + PCFG_AES_EN_SHIFT) +#define SEU_EN (1 8) /* locked by SEU_LOCK */ +#define SEC_EN (1 7) /* locked by SEC_LOCK */ +#define SPNIDEN (1 6) /* locked by DBG_LOCK */ +#define SPIDEN (1 5) /* locked by DBG_LOCK */ +#define NIDEN(1 4) /* locked by DBG_LOCK */ +#define DBGEN(1 3)
Re: [Qemu-devel] [PATCH v2 4/6] qemu-img: Enable progress output for commit
Am 08.04.2014 um 18:53 hat Eric Blake geschrieben: On 04/08/2014 09:34 AM, Kevin Wolf wrote: Am 08.04.2014 um 14:50 hat Max Reitz geschrieben: Implement progress output for the commit command by querying the progress of the block job. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- info = block_job_query(job); +if (info-offset) { +if (!mod_offset) { On a fully populated image this doesn't look entirely right. I think the first 2 MB (or whatever the buffer size is) will be disregarded in the calculation, even though they are real work that is done. Is there any (easy) way to calculate how many dirty sectors remain to be committed, to use that rather than the image size as the job percentage remaining? The very first thing that mirror_run() does is building the dirty bitmap depending on whether clusters are allocated or not. After this loop we have the necessary information, but we don't really communicate it to the caller. Max tries to guess the point at which the loop completed by checking when we make progress for the first time, but that means that we ignore the first actual buffer. We could probably simply change mirror_run() to update s-common.len by subtracting the parts that haven't been marked dirty. Then the caller could use the values as they are, and query-block-jobs would get more accurate values for mirroring jobs, too. commit_run(), for a commit somewhere down in the backing chain, we don't have this initialising loop yet, but we could add it without adding too much overhead, I think. qemu-img convert was changed in December to do the same and noone has complained yet (though it's not in a release yet, perhaps people will complain once 2.0 is out). Kevin pgpksJm2RJ8vu.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH v1 1/1] zynq_slcr: Change the comma to a underscore
Alistair Francis alistair.fran...@xilinx.com writes: On Wed, Apr 9, 2014 at 11:14 AM, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: On Wed, Mar 26, 2014 at 1:05 PM, Alistair Francis alistair.fran...@xilinx.com wrote: This patch changes the comma in the xilinx,zynq_slcr to an underscore. This matches every other xilinx* peripheral and also makes parsing the device via the command line possible. I think its actually a case of this being slightly ahead and everyone else being behind. The comma is probably ultimately wrong and I'm guessing its awkward for your command-line work due to command line character escaping. I am in favor of the xlnx.foo styling that is more widely adopted: [qemu]$ git grep -c xlnx\. hw/arm/xilinx_zynq.c:1 hw/char/xilinx_uartlite.c:2 hw/dma/xilinx_axidma.c:2 hw/intc/xilinx_intc.c:2 hw/microblaze/petalogix_ml605_mmu.c:6 hw/microblaze/petalogix_s3adsp1800_mmu.c:5 hw/net/xilinx_axienet.c:3 hw/net/xilinx_ethlite.c:2 hw/ppc/virtex_ml507.c:2 hw/ssi/xilinx_spi.c:1 hw/ssi/xilinx_spips.c:2 hw/timer/xilinx_timer.c:2 target-microblaze/cpu.c:1 will xlnx.zynq-slcr work? (fix the underscore while at it). Full stops are fine, just as long as it is not a comma The most common separator in device model names is '-'. There's a fair number of '.', some '_', and a few ','. The comma are probably rooted in device tree usage. I doubt that buys us anything but confusing command line trouble. I suspect period breaks -global. I very much recommend picking '-' whenever practical.
Re: [Qemu-devel] [PATCH] HMP: support specifying dump format for dump-guest-memory
Qiao Nuohan qiaonuo...@cn.fujitsu.com writes: Dumping guest memory is available to specify the dump format now. This patch adds options '-z|-l|-s' to HMP command dump-guest-memory to specify dumping in kdump-compression format, with zlib/lzo/snappy compression. And without these options ELF format will be used. The discussion about this feature is here: http://lists.nongnu.org/archive/html/qemu-devel/2014-03/msg04235.html Signed-off-by: Qiao Nuohan qiaonuo...@cn.fujitsu.com Suggested-by: Christian Borntraeger borntrae...@de.ibm.com --- hmp-commands.hx | 11 +++ hmp.c | 25 - 2 files changed, 31 insertions(+), 5 deletions(-) diff --git a/hmp-commands.hx b/hmp-commands.hx index f3fc514..4b9989f 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -998,8 +998,8 @@ ETEXI { .name = dump-guest-memory, -.args_type = paging:-p,filename:F,begin:i?,length:i?, -.params = [-p] filename [begin] [length], +.args_type = paging:-p,zlib:-z,lzo:-l,snappy:-s,filename:F,begin:i?,length:i?, +.params = [-p] [-z|-l|-s] filename [begin] [length], Since you're touching it anyway, please fix it to [begin length]. .help = dump guest memory to file \n\t\t\t begin(optional): the starting physical address \n\t\t\t length(optional): the memory size, in bytes, .help doesn't cover the flags. Please fix that. Both (optional) are redundant, because help also prints .params. @@ -1008,12 +1008,15 @@ ETEXI STEXI -@item dump-guest-memory [-p] @var{protocol} @var{begin} @var{length} +@item dump-guest-memory [-p] [-z|-l|-s] @var{protocol} @var{begin} @var{length} @findex dump-guest-memory Dump guest memory to @var{protocol}. The file can be processed with crash or -gdb. +gdb. Without -z|-l|-s, the dump format is ELF. filename: dump file name This was called protocol above. Not your fault, but please fix it to filename anyway. paging: do paging to get guest's memory mapping This should be -p:, not paging:. Again, not your fault, but please fix it. + zlib: dump in kdump-compressed format, with zlib compression + lzo: dump in kdump-compressed format, with lzo compression +snappy: dump in kdump-compressed format, with snappy compression These should be -z:, -l: and -s:. begin: the starting physical address. It's optional, and should be specified with length together. length: the memory size, in bytes. It's optional, and should be specified with begin together. Please document that -p and begin/length both conflict with -z, -l, -s. Here's a rather formal way to do that: @item dump-guest-memory [[-p] @var{filename} @var{begin} @var{length} | [-z|-l|-s] @var{filename}] Awkward. This might come out better: @item dump-guest-memory [-p] @var{filename} @var{begin} @var{length} @item dump-guest-memory [-z|-l|-s] @var{filename} Alternatively describe the restrictions in the text rather than the Suggest to try and check the resulting .info. In case you decide to touch the lines on begin and length for some reason: with length together should be together with length. I'd probably fix the preexisting issues in a separate patch first. diff --git a/hmp.c b/hmp.c index 2f279c4..37c3961 100644 --- a/hmp.c +++ b/hmp.c @@ -1308,16 +1308,39 @@ void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict) { Error *errp = NULL; int paging = qdict_get_try_bool(qdict, paging, 0); +int zlib = qdict_get_try_bool(qdict, zlib, 0); +int lzo = qdict_get_try_bool(qdict, lzo, 0); +int snappy = qdict_get_try_bool(qdict, snappy, 0); const char *file = qdict_get_str(qdict, filename); bool has_begin = qdict_haskey(qdict, begin); bool has_length = qdict_haskey(qdict, length); -/* kdump-compressed format is not supported for HMP */ bool has_format = false; Suggest to drop has_format, and pass true to qmp_dump_guest_memory() instead. int64_t begin = 0; int64_t length = 0; enum DumpGuestMemoryFormat dump_format = DUMP_GUEST_MEMORY_FORMAT_ELF; char *prot; +if ((zlib + lzo + snappy) 1) { Superfluous parentheses around sum, please drop them. +error_setg(errp, only one of '-z|-l|-s' can be set); +hmp_handle_error(mon, errp); +return; +} + +if (zlib) { +has_format = true; +dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_ZLIB; +} + +if (lzo) { +has_format = true; +dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_LZO; +} + +if (snappy) { +has_format = true; +dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_SNAPPY; +} + if (has_begin) { begin = qdict_get_int(qdict, begin); }
Re: [Qemu-devel] [PATCH v1 1/1] zynq_slcr: Change the comma to a underscore
Markus Armbruster arm...@redhat.com writes: Alistair Francis alistair.fran...@xilinx.com writes: On Wed, Apr 9, 2014 at 11:14 AM, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: On Wed, Mar 26, 2014 at 1:05 PM, Alistair Francis alistair.fran...@xilinx.com wrote: This patch changes the comma in the xilinx,zynq_slcr to an underscore. This matches every other xilinx* peripheral and also makes parsing the device via the command line possible. I think its actually a case of this being slightly ahead and everyone else being behind. The comma is probably ultimately wrong and I'm guessing its awkward for your command-line work due to command line character escaping. I am in favor of the xlnx.foo styling that is more widely adopted: [qemu]$ git grep -c xlnx\. hw/arm/xilinx_zynq.c:1 hw/char/xilinx_uartlite.c:2 hw/dma/xilinx_axidma.c:2 hw/intc/xilinx_intc.c:2 hw/microblaze/petalogix_ml605_mmu.c:6 hw/microblaze/petalogix_s3adsp1800_mmu.c:5 hw/net/xilinx_axienet.c:3 hw/net/xilinx_ethlite.c:2 hw/ppc/virtex_ml507.c:2 hw/ssi/xilinx_spi.c:1 hw/ssi/xilinx_spips.c:2 hw/timer/xilinx_timer.c:2 target-microblaze/cpu.c:1 will xlnx.zynq-slcr work? (fix the underscore while at it). Full stops are fine, just as long as it is not a comma The most common separator in device model names is '-'. There's a fair number of '.', some '_', and a few ','. The comma are probably rooted in device tree usage. I doubt that buys us anything but confusing command line trouble. I suspect period breaks -global. Yes, it does: -global cfi.pflash01.name=foo gets parsed as driver = cfi, property = pflash01.name, value = foo. I'm going to start a new thread for this issue. I very much recommend picking '-' whenever practical. Still do :)
Re: [Qemu-devel] [PATCH for-2.0] configure: add option to disable -fstack-protector flags
On 9 April 2014 10:29, Noonan, Steven snoo...@amazon.com wrote: So in your case all you probably need is to drop the -framework arguments from CFLAGS and plop them into LIBS, and you're probably good to go. This is where they are already. The problem is that putting -framework on the linker command line causes clang to fail to reject -fsome-random-unknown-thing, which means we can't do configure detection of -fsomething arguments using compile_prog :-( I think the only thing we can do about this is to make sure that our configure code to check for -fsomething does a compile-only check and not a link. thanks -- PMM
[Qemu-devel] '.' IDs and certain names breaks -global and -set
We have a number of device model names containing '.'. They're unusable with -global. That's because -global A.B.C=foo gets parsed as driver = A property = B.C value = foo. Wrong when the device model name is really A.B and the property is C, e.g. -global cfi.pflash01.name. Related problem: QemuOpts ID may contain '.'. Such IDs are in fact common practice. Unfortunately, they're unusable with -set. For instance, -set A.B.C.D=foo gets parsed as group = A id = B arg = C.D value = foo Wrong when the id is really B.C and the arg is D, e.g. -device isa-serial,id=s.0 -set device.s.0.chardev=c0. This issue isn't limited to devices. Related historical problem: commit b560a9a. Possible solutions: * Outlaw '.' in QemuOpts parameter names (and thus device model names and property names, because QemuOpts is a major way for users to specify them). * Resolve the syntactic ambiguity semantically (ugh) * Extend the syntax of -global and -set to let users avoid the issue, or replace them outright * Use the old ostrich algorithm Andreas, what restrictions does QOM place on QOM path component names? No '/', obviously. Anything else?
[Qemu-devel] [PATCH] bochs: Fix memory leak in bochs_open() error path
Signed-off-by: Kevin Wolf kw...@redhat.com --- block/bochs.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/block/bochs.c b/block/bochs.c index 826ec12..50b84a9 100644 --- a/block/bochs.c +++ b/block/bochs.c @@ -150,11 +150,13 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags, s-extent_size = le32_to_cpu(bochs.extent); if (s-extent_size == 0) { error_setg(errp, Extent size may not be zero); -return -EINVAL; +ret = -EINVAL; +goto fail; } else if (s-extent_size 0x80) { error_setg(errp, Extent size % PRIu32 is too large, s-extent_size); -return -EINVAL; +ret = -EINVAL; +goto fail; } if (s-catalog_size bs-total_sectors / s-extent_size) { -- 1.8.3.1
Re: [Qemu-devel] [PATCH] bochs: Fix memory leak in bochs_open() error path
On 04/09/14 11:20, Kevin Wolf wrote: Signed-off-by: Kevin Wolf kw...@redhat.com --- block/bochs.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/block/bochs.c b/block/bochs.c index 826ec12..50b84a9 100644 --- a/block/bochs.c +++ b/block/bochs.c @@ -150,11 +150,13 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags, s-extent_size = le32_to_cpu(bochs.extent); if (s-extent_size == 0) { error_setg(errp, Extent size may not be zero); -return -EINVAL; +ret = -EINVAL; +goto fail; } else if (s-extent_size 0x80) { error_setg(errp, Extent size % PRIu32 is too large, s-extent_size); -return -EINVAL; +ret = -EINVAL; +goto fail; } if (s-catalog_size bs-total_sectors / s-extent_size) { Reviewed-by: Laszlo Ersek ler...@redhat.com
Re: [Qemu-devel] [PATCH 2/2] pci-assign: Fix memory out of bound when MSI-X table not fit in a single page
Hi, QEMU only mmap MSIX_PAGE_SIZE memory for all pci devices in assigned_dev_register_msix_mmio(), meanwhile the set the one page memmory to zero, so the rest memory will be random value (maybe etnry.data is not 0). In the assigned_dev_update_msix_mmio() maybe occur the issue of entry_nr 256, and the kmod reports the EINVAL error. Signed-off-by: Gonglei arei.gong...@huawei.com Okay so this kind of works because guest does not try to use so many vectors. Yes. It's true. But I think it's better not to give guest more entries than we can actually support. How about tweaking MSIX capability exposed to guest to limit table size? IMHO, the MSIX table entry size got form the physic NIC hardware. The software layer shouldn't tweak the capability of real hardware, neither QEMU nor KVM. Best regards, -Gonglei --- hw/i386/kvm/pci-assign.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c index 570333f..d25a19e 100644 --- a/hw/i386/kvm/pci-assign.c +++ b/hw/i386/kvm/pci-assign.c @@ -37,6 +37,7 @@ #include hw/pci/pci.h #include hw/pci/msi.h #include kvm_i386.h +#include qemu/osdep.h #define MSIX_PAGE_SIZE 0x1000 @@ -59,6 +60,9 @@ #define DEBUG(fmt, ...) #endif +/* the msix-table size readed from pci device config */ +static int msix_table_size; + typedef struct PCIRegion { int type; /* Memory or port I/O */ int valid; @@ -1604,7 +1608,12 @@ static void assigned_dev_msix_reset(AssignedDevice *dev) static int assigned_dev_register_msix_mmio(AssignedDevice *dev) { -dev-msix_table = mmap(NULL, MSIX_PAGE_SIZE, PROT_READ|PROT_WRITE, +msix_table_size = ROUND_UP(dev-msix_max * sizeof(struct MSIXTableEntry), +MSIX_PAGE_SIZE); + +DEBUG(msix_table_size: 0x%x\n, msix_table_size); + +dev-msix_table = mmap(NULL, msix_table_size, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, 0, 0); if (dev-msix_table == MAP_FAILED) { error_report(fail allocate msix_table! %s, strerror(errno)); @@ -1615,7 +1624,7 @@ static int assigned_dev_register_msix_mmio(AssignedDevice *dev) assigned_dev_msix_reset(dev); memory_region_init_io(dev-mmio, OBJECT(dev), assigned_dev_msix_mmio_ops, - dev, assigned-dev-msix, MSIX_PAGE_SIZE); + dev, assigned-dev-msix, msix_table_size); return 0; } @@ -1627,7 +1636,7 @@ static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev) memory_region_destroy(dev-mmio); -if (munmap(dev-msix_table, MSIX_PAGE_SIZE) == -1) { +if (munmap(dev-msix_table, msix_table_size) == -1) { error_report(error unmapping msix_table! %s, strerror(errno)); } dev-msix_table = NULL; -- 1.7.12.4
[Qemu-devel] [PATCH for-2.0] configure: use do_cc when checking for -fstack-protector support
MacOSX clang silently swallows unrecognized -f options when doing a link with '-framework' also on the command line, so to detect support for the various -fstack-protector options we must do a plain .c to .o compile, not a complete compile-and-link. Signed-off-by: Peter Maydell peter.mayd...@linaro.org --- This should be a pretty safe change and it prevents clang/MacOSX builds from spewing a warning on every C file compilation, so I'd like to get it into 2.0. configure | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/configure b/configure index eb0e7bb..c85475f 100755 --- a/configure +++ b/configure @@ -1448,7 +1448,7 @@ done if test $stack_protector != no ; then gcc_flags=-fstack-protector-strong -fstack-protector-all for flag in $gcc_flags; do -if compile_prog -Werror $flag ; then +if do_cc $QEMU_CFLAGS -Werror $flag -c -o $TMPO $TMPC ; then QEMU_CFLAGS=$QEMU_CFLAGS $flag LIBTOOLFLAGS=$LIBTOOLFLAGS -Wc,$flag break -- 1.8.5.4
[Qemu-devel] [PATCH] bochs: Fix catalog size check
The old check was off by a factor of 512 and didn't consider cases where we don't get an exact division. This could lead to an out-of-bounds array access in seek_to_sector(). Signed-off-by: Kevin Wolf kw...@redhat.com --- block/bochs.c | 14 +++--- tests/qemu-iotests/078 | 6 +- tests/qemu-iotests/078.out | 6 -- 3 files changed, 20 insertions(+), 6 deletions(-) diff --git a/block/bochs.c b/block/bochs.c index 50b84a9..eacf956 100644 --- a/block/bochs.c +++ b/block/bochs.c @@ -148,8 +148,14 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags, s-extent_blocks = 1 + (le32_to_cpu(bochs.extent) - 1) / 512; s-extent_size = le32_to_cpu(bochs.extent); -if (s-extent_size == 0) { -error_setg(errp, Extent size may not be zero); +if (s-extent_size BDRV_SECTOR_SIZE) { +/* bximage actually never creates extents smaller than 4k */ +error_setg(errp, Extent size must be at least 512); +ret = -EINVAL; +goto fail; +} else if (!is_power_of_2(s-extent_size)) { +error_setg(errp, Extent size % PRIu32 is not a power of two, + s-extent_size); ret = -EINVAL; goto fail; } else if (s-extent_size 0x80) { @@ -159,7 +165,9 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } -if (s-catalog_size bs-total_sectors / s-extent_size) { +if (s-catalog_size DIV_ROUND_UP(bs-total_sectors, + s-extent_size / BDRV_SECTOR_SIZE)) +{ error_setg(errp, Catalog size is too small for this disk size); ret = -EINVAL; goto fail; diff --git a/tests/qemu-iotests/078 b/tests/qemu-iotests/078 index 872e734..d4d6da7 100755 --- a/tests/qemu-iotests/078 +++ b/tests/qemu-iotests/078 @@ -69,10 +69,14 @@ _use_sample_img empty.bochs.bz2 poke_file $TEST_IMG $disk_size_offset \x00\xc0\x0f\x00\x00\x00\x00\x7f { $QEMU_IO -c read 2T 4k $TEST_IMG; } 21 | _filter_qemu_io | _filter_testdir +_use_sample_img empty.bochs.bz2 +poke_file $TEST_IMG $catalog_size_offset \x10\x00\x00\x00 +{ $QEMU_IO -c read 0xfbe00 512 $TEST_IMG; } 21 | _filter_qemu_io | _filter_testdir + echo echo == Negative extent size == _use_sample_img empty.bochs.bz2 -poke_file $TEST_IMG $extent_size_offset \xff\xff\xff\xff +poke_file $TEST_IMG $extent_size_offset \x00\x00\x00\x80 { $QEMU_IO -c read 768k 512 $TEST_IMG; } 21 | _filter_qemu_io | _filter_testdir echo diff --git a/tests/qemu-iotests/078.out b/tests/qemu-iotests/078.out index ea95ffd..ca18d2e 100644 --- a/tests/qemu-iotests/078.out +++ b/tests/qemu-iotests/078.out @@ -15,12 +15,14 @@ no file open, try 'help open' == Too small catalog bitmap for image size == qemu-io: can't open device TEST_DIR/empty.bochs: Catalog size is too small for this disk size no file open, try 'help open' +qemu-io: can't open device TEST_DIR/empty.bochs: Catalog size is too small for this disk size +no file open, try 'help open' == Negative extent size == -qemu-io: can't open device TEST_DIR/empty.bochs: Extent size 4294967295 is too large +qemu-io: can't open device TEST_DIR/empty.bochs: Extent size 2147483648 is too large no file open, try 'help open' == Zero extent size == -qemu-io: can't open device TEST_DIR/empty.bochs: Extent size may not be zero +qemu-io: can't open device TEST_DIR/empty.bochs: Extent size must be at least 512 no file open, try 'help open' *** done -- 1.8.3.1
[Qemu-devel] [PATCH v4 2/5] kvm: add kvm_enable_cap_{vm,vcpu}
Provide helper functions for enabling capabilities (on a vcpu and on a vm). Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- include/sysemu/kvm.h |4 kvm-all.c| 19 ++- 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index 0bee1e8..d89911c 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -294,6 +294,10 @@ bool kvm_arch_stop_on_emulation_error(CPUState *cpu); int kvm_check_extension(KVMState *s, unsigned int extension); +int kvm_enable_cap_vm(KVMState *s, unsigned int capability); + +int kvm_enable_cap_vcpu(CPUState *cpu, unsigned int capability); + uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function, uint32_t index, int reg); diff --git a/kvm-all.c b/kvm-all.c index cd4111d..c32eeff 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -501,7 +501,24 @@ int kvm_check_extension(KVMState *s, unsigned int extension) return ret; } -static int kvm_set_ioeventfd_mmio(int fd, hwaddr addr, uint32_t val, +int kvm_enable_cap_vm(KVMState *s, unsigned int capability) +{ +struct kvm_enable_cap cap = {}; + +cap.cap = capability; +return kvm_vm_ioctl(s, KVM_ENABLE_CAP, cap); +} + +int kvm_enable_cap_vcpu(CPUState *cpu, unsigned int capability) +{ +struct kvm_enable_cap cap = {}; + +cap.cap = capability; +return kvm_vcpu_ioctl(cpu, KVM_ENABLE_CAP, cap); +} + + +static int kvm_set_ioeventfd_mmio(int fd, uint32_t addr, uint32_t val, bool assign, uint32_t size, bool datamatch) { int ret; -- 1.7.9.5
[Qemu-devel] [PATCH v4 5/5] s390x/virtio-ccw: Wire up irq routing and irqfds.
Make use of the new s390 adapter irq routing support to enable real in-kernel irqfds for virtio-ccw with adapter interrupts. Note that s390 doesn't provide the common KVM_CAP_IRQCHIP capability, but rather needs KVM_CAP_S390_IRQCHIP to be enabled. This is to ensure backward compatibility. Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/s390x/virtio-ccw.c | 165 hw/s390x/virtio-ccw.h |2 + include/hw/s390x/adapter.h | 23 ++ include/qemu/typedefs.h|1 + include/sysemu/kvm.h |2 + kvm-all.c | 38 +- kvm-stub.c |5 ++ target-s390x/kvm.c |5 ++ 8 files changed, 228 insertions(+), 13 deletions(-) create mode 100644 include/hw/s390x/adapter.h diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c index 69efa6c..5612ccc 100644 --- a/hw/s390x/virtio-ccw.c +++ b/hw/s390x/virtio-ccw.c @@ -21,6 +21,7 @@ #include hw/sysbus.h #include qemu/bitops.h #include hw/virtio/virtio-bus.h +#include hw/s390x/adapter.h #include ioinst.h #include css.h @@ -48,7 +49,7 @@ static IndAddr *get_indicator(hwaddr ind_addr, int len) return indicator; } -static void release_indicator(IndAddr *indicator) +static void release_indicator(uint32_t adapter_id, IndAddr *indicator) { assert(indicator-refcnt 0); indicator-refcnt--; @@ -56,9 +57,31 @@ static void release_indicator(IndAddr *indicator) return; } QTAILQ_REMOVE(indicator_addresses, indicator, sibling); +if (indicator-map) { +s390_io_adapter_map(adapter_id, indicator-map, false); +} g_free(indicator); } +static int map_indicator(uint32_t adapter_id, IndAddr *indicator) +{ +int ret; + +if (indicator-map) { +return 0; /* already mapped is not an error */ +} +indicator-map = indicator-addr; +ret = s390_io_adapter_map(adapter_id, indicator-map, true); +if ((ret != 0) (ret != -ENOSYS)) { +goto out_err; +} +return 0; + +out_err: +indicator-map = 0; +return -EFAULT; +} + static void virtio_ccw_bus_new(VirtioBusState *bus, size_t bus_size, VirtioCcwDevice *dev); @@ -733,7 +756,7 @@ static int virtio_ccw_exit(VirtioCcwDevice *dev) g_free(sch); } if (dev-indicators) { -release_indicator(dev-indicators); +release_indicator(dev-adapter_id, dev-indicators); dev-indicators = NULL; } return 0; @@ -1034,15 +1057,15 @@ static void virtio_ccw_reset(DeviceState *d) virtio_reset(vdev); css_reset_sch(dev-sch); if (dev-indicators) { -release_indicator(dev-indicators); +release_indicator(dev-adapter_id, dev-indicators); dev-indicators = NULL; } if (dev-indicators2) { -release_indicator(dev-indicators2); +release_indicator(dev-adapter_id, dev-indicators2); dev-indicators2 = NULL; } if (dev-summary_indicator) { -release_indicator(dev-summary_indicator); +release_indicator(dev-adapter_id, dev-summary_indicator); dev-summary_indicator = NULL; } } @@ -1078,6 +1101,100 @@ static int virtio_ccw_set_host_notifier(DeviceState *d, int n, bool assign) return virtio_ccw_set_guest2host_notifier(dev, n, assign, false); } +static int virtio_ccw_get_adapter_info(VirtioCcwDevice *dev, + AdapterInfo *adapter) +{ +int r; + +if (!dev-sch-thinint_active) { +return -EINVAL; +} + +r = map_indicator(dev-adapter_id, dev-summary_indicator); +if (r) { +return r; +} +r = map_indicator(dev-adapter_id, dev-indicators); +if (r) { +return r; +} +adapter-summary_addr = dev-summary_indicator-map; +adapter-ind_addr = dev-indicators-map; +adapter-ind_offset = dev-ind_bit; +adapter-summary_offset = 7; +adapter-adapter_id = dev-adapter_id; + +return 0; +} + +static int virtio_ccw_setup_irqroutes(VirtioCcwDevice *dev, int nvqs) +{ +int i; +VirtIODevice *vdev = virtio_bus_get_device(dev-bus); +int ret; +AdapterInfo adapter; + +ret = virtio_ccw_get_adapter_info(dev, adapter); +if (ret) { +return ret; +} +for (i = 0; i nvqs; i++) { +if (!virtio_queue_get_num(vdev, i)) { +break; +} +ret = kvm_irqchip_add_adapter_route(kvm_state, adapter); +if (ret 0) { +goto out_undo; +} +dev-gsi[i] = ret; +adapter.ind_offset++; +} +return 0; +out_undo: +while (--i = 0) { +kvm_irqchip_release_virq(kvm_state, dev-gsi[i]); +dev-gsi[i] = -1; +} +return ret; +} + +static void virtio_ccw_release_irqroutes(VirtioCcwDevice *dev, int nvqs) +{ +int i; +VirtIODevice *vdev = virtio_bus_get_device(dev-bus); + +for (i = 0; i nvqs; i++) { +
Re: [Qemu-devel] [PATCH v4 1/1] Make qemu_peek_buffer loop until it gets it's data
* (chenliang0...@icloud.com) wrote: ?? 2014??4??810:29??Dr. David Alan Gilbert (git) dgilb...@redhat.com ?? From: Dr. David Alan Gilbert dgilb...@redhat.com Make qemu_peek_buffer repeatedly call fill_buffer until it gets all the data it requires, or until there is an error. At the moment, qemu_peek_buffer will try one qemu_fill_buffer if there isn't enough data waiting, however the kernel is entitled to return just a few bytes, and still leave qemu_peek_buffer with less bytes than it needed. I've seen this fail in a dev world, and I think it could theoretically fail in the peeking of the subsection headers in the current world. hmm, I also have got some errors(infrequently). Maybe It is one point. Could you show some messages about the error? I've only seen the errors in my visitor/ber world where I use the peek_buffer a lot more; but the one place it is used in the existing code is in the code to check if we have the start of a subsection; if that goes wrong I'm not sure what error would be produced. Dave In my observation, error is more likely to happen if migration thread is blocked too long. BTW, xbzrle and auto convergence are disable. Best regards ChenLiang However, this patch avoids one potential issue, thanks. Reviewed-by: ChenLiang chenliang0...@icloud.com Comment qemu_peek_byte to point out it's not guaranteed to work for non-continuous peeks Signed-off-by: Dr. David Alan Gilbert dgilb...@redhat.com --- v4: Limit the size qemu_get_buffer passes to qemu_fill_buffer on each loop v3: Make qemu_fill_buffer return ssize_t Remove the other size_t cleanup from this patch - I'll get back to them later Comment additions/cleanups from review Stretch my -ve include/migration/qemu-file.h | 5 qemu-file.c | 53 +++ 2 files changed, 54 insertions(+), 4 deletions(-) diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h index a191fb6..c90f529 100644 --- a/include/migration/qemu-file.h +++ b/include/migration/qemu-file.h @@ -123,6 +123,11 @@ void qemu_put_be32(QEMUFile *f, unsigned int v); void qemu_put_be64(QEMUFile *f, uint64_t v); int qemu_peek_buffer(QEMUFile *f, uint8_t *buf, int size, size_t offset); int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size); +/* + * Note that you can only peek continuous bytes from where the current pointer + * is; you aren't guaranteed to be able to peak to +n bytes unless you've + * previously peeked +n-1. + */ int qemu_peek_byte(QEMUFile *f, int offset); int qemu_get_byte(QEMUFile *f); void qemu_file_skip(QEMUFile *f, int size); diff --git a/qemu-file.c b/qemu-file.c index e5ec798..6ae37d0 100644 --- a/qemu-file.c +++ b/qemu-file.c @@ -529,7 +529,15 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset, return RAM_SAVE_CONTROL_NOT_SUPP; } -static void qemu_fill_buffer(QEMUFile *f) +/* + * Attempt to fill the buffer from the underlying file + * Returns the number of bytes read, or negative value for an error. + * + * Note that it can return a partially full buffer even in a not error/not EOF + * case if the underlying file descriptor gives a short read, and that can + * happen even on a blocking fd. + */ +static ssize_t qemu_fill_buffer(QEMUFile *f) { int len; int pending; @@ -553,6 +561,8 @@ static void qemu_fill_buffer(QEMUFile *f) } else if (len != -EAGAIN) { qemu_file_set_error(f, len); } + +return len; } int qemu_get_fd(QEMUFile *f) @@ -683,17 +693,39 @@ void qemu_file_skip(QEMUFile *f, int size) } } +/* + * Read 'size' bytes from file (at 'offset') into buf without moving the + * pointer. + * + * It will return size bytes unless there was an error, in which case it will + * return as many as it managed to read (assuming blocking fd's which + * all current QEMUFile are) + */ int qemu_peek_buffer(QEMUFile *f, uint8_t *buf, int size, size_t offset) { int pending; int index; assert(!qemu_file_is_writable(f)); +assert(offset IO_BUF_SIZE); +assert(size = IO_BUF_SIZE - offset); +/* The 1st byte to read from */ index = f-buf_index + offset; +/* The number of available bytes starting at index */ pending = f-buf_size - index; -if (pending size) { -qemu_fill_buffer(f); + +/* + * qemu_fill_buffer might return just a few bytes, even when there isn't + * an error, so loop collecting them until we get enough. + */ +while (pending size) { +int received = qemu_fill_buffer(f); + +if (received = 0) { +break; +} + index = f-buf_index + offset; pending = f-buf_size - index; } @@ -709,6 +741,14 @@ int qemu_peek_buffer(QEMUFile *f, uint8_t *buf, int size, size_t offset) return size; } +/* + * Read 'size' bytes of data from the file into buf. +
Re: [Qemu-devel] '.' IDs and certain names breaks -global and -set
On Wed, Apr 9, 2014 at 7:56 PM, Markus Armbruster arm...@redhat.com wrote: We have a number of device model names containing '.'. They're unusable with -global. That's because -global A.B.C=foo gets parsed as driver = A property = B.C value = foo. Wrong when the device model name is really A.B and the property is C, e.g. -global cfi.pflash01.name. Related problem: QemuOpts ID may contain '.'. Such IDs are in fact common practice. Unfortunately, they're unusable with -set. For instance, -set A.B.C.D=foo gets parsed as group = A id = B arg = C.D value = foo Wrong when the id is really B.C and the arg is D, e.g. -device isa-serial,id=s.0 -set device.s.0.chardev=c0. This issue isn't limited to devices. Related historical problem: commit b560a9a. Possible solutions: * Outlaw '.' in QemuOpts parameter names (and thus device model names and property names, because QemuOpts is a major way for users to specify them). I like the .s because they mean this is where the vendor stops and the IP name starts. Whereas - just delimits multiple words. We seem to be running our of characters however with these overloading problems. But AFAICT : is largely unused (within this context anyway). So how about: - seperates multiple words in a name : splits logical groupings of multiple words as appropriate ( e.g. some-vendor:their-device ) , demilits multiple command line arguments . accesses property within an object / walks the canonical path I do also need to confess to my alternate agenda of device tree driven QEMU - I am still parsing device trees and directly mapping the compatible strings to QOM type names for machine creation. Ambiguity of - as meaning the , or the - is hard to deal with. Regards, Peter * Resolve the syntactic ambiguity semantically (ugh) * Extend the syntax of -global and -set to let users avoid the issue, or replace them outright * Use the old ostrich algorithm Andreas, what restrictions does QOM place on QOM path component names? No '/', obviously. Anything else?
Re: [Qemu-devel] [PATCH v4 1/1] Make qemu_peek_buffer loop until it gets it's data
* (chenliang0...@icloud.com) wrote: * (chenliang0...@icloud.com) wrote: ?? 2014??4??810:29??Dr. David Alan Gilbert (git) dgilb...@redhat.com ?? From: Dr. David Alan Gilbert dgilb...@redhat.com Make qemu_peek_buffer repeatedly call fill_buffer until it gets all the data it requires, or until there is an error. At the moment, qemu_peek_buffer will try one qemu_fill_buffer if there isn't enough data waiting, however the kernel is entitled to return just a few bytes, and still leave qemu_peek_buffer with less bytes than it needed. I've seen this fail in a dev world, and I think it could theoretically fail in the peeking of the subsection headers in the current world. hmm, I also have got some errors(infrequently). Maybe It is one point. Could you show some messages about the error? I've only seen the errors in my visitor/ber world where I use the peek_buffer a lot more; but the one place it is used in the existing code is in the code to check if we have the start of a subsection; if that goes wrong I'm not sure what error would be produced. Dave In my observation, error is more likely to happen if migration thread is blocked too long. BTW, xbzrle and auto convergence are disable. What error do you see? Dave -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.
Register an I/O adapter interrupt source for when virtio-ccw devices start using adapter interrupts. Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/intc/s390_flic.c | 59 + hw/s390x/css.c| 51 ++ hw/s390x/css.h|4 hw/s390x/virtio-ccw.c |4 hw/s390x/virtio-ccw.h |1 + target-s390x/cpu.h| 33 +++ 6 files changed, 152 insertions(+) diff --git a/hw/intc/s390_flic.c b/hw/intc/s390_flic.c index b2ef3e3..c033c8a 100644 --- a/hw/intc/s390_flic.c +++ b/hw/intc/s390_flic.c @@ -21,6 +21,11 @@ #define FLIC_FAILED (-1UL) #define FLIC_SAVEVM_VERSION 1 +static KVMS390FLICState *s390_get_flic(void) +{ +return KVM_S390_FLIC(object_resolve_path(/machine/s390-flic, NULL)); +} + void s390_flic_init(void) { DeviceState *dev; @@ -148,6 +153,60 @@ static int __get_all_irqs(KVMS390FLICState *flic, return r; } +int kvm_s390_register_io_adapter(uint32_t id, uint8_t isc, bool swap, + bool is_maskable) +{ +struct kvm_s390_io_adapter adapter = { +.id = id, +.isc = isc, +.maskable = is_maskable, +.swap = swap, +}; +KVMS390FLICState *flic = s390_get_flic(); +int r, ret; +struct kvm_device_attr attr = { +.group = KVM_DEV_FLIC_ADAPTER_REGISTER, +.addr = (uint64_t)adapter, +}; + +if (!flic) { +return -ENOSYS; +} +if (!kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING)) { +return -ENOSYS; +} + +r = ioctl(flic-fd, KVM_SET_DEVICE_ATTR, attr); + +ret = r ? -errno : 0; +return ret; +} + +int kvm_s390_io_adapter_map(uint32_t id, uint64_t map_addr, bool do_map) +{ +struct kvm_s390_io_adapter_req req = { +.id = id, +.type = do_map ? KVM_S390_IO_ADAPTER_MAP : KVM_S390_IO_ADAPTER_UNMAP, +.addr = map_addr, +}; +KVMS390FLICState *flic = s390_get_flic(); +struct kvm_device_attr attr = { +.group = KVM_DEV_FLIC_ADAPTER_MODIFY, +.addr = (uint64_t)req, +}; +int r; + +if (!flic) { +return -ENOSYS; +} +if (!kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING)) { +return -ENOSYS; +} + +r = ioctl(flic-fd, KVM_SET_DEVICE_ATTR, attr); +return r ? -errno : 0; +} + /** * kvm_flic_save - Save pending floating interrupts * @f: QEMUFile containing migration state diff --git a/hw/s390x/css.c b/hw/s390x/css.c index 7074d2b..a6d173f 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -39,6 +39,13 @@ typedef struct CssImage { ChpInfo chpids[MAX_CHPID + 1]; } CssImage; +typedef struct IoAdapter { +uint32_t id; +uint8_t type; +uint8_t isc; +QTAILQ_ENTRY(IoAdapter) sibling; +} IoAdapter; + typedef struct ChannelSubSys { QTAILQ_HEAD(, CrwContainer) pending_crws; bool do_crw_mchk; @@ -49,6 +56,7 @@ typedef struct ChannelSubSys { uint64_t chnmon_area; CssImage *css[MAX_CSSID + 1]; uint8_t default_cssid; +QTAILQ_HEAD(, IoAdapter) io_adapters; } ChannelSubSys; static ChannelSubSys *channel_subsys; @@ -69,6 +77,48 @@ int css_create_css_image(uint8_t cssid, bool default_image) return 0; } +int css_register_io_adapter(uint8_t type, uint8_t isc, bool swap, +bool maskable, uint32_t *id) +{ +IoAdapter *adapter; +bool found = false; +int ret; + +*id = 0; +QTAILQ_FOREACH(adapter, channel_subsys-io_adapters, sibling) { +if ((adapter-type == type) (adapter-isc == isc)) { +*id = adapter-id; +found = true; +ret = 0; +break; +} +if (adapter-id = *id) { +*id = adapter-id + 1; +} +} +if (found) { +goto out; +} +adapter = g_new0(IoAdapter, 1); +ret = s390_register_io_adapter(*id, isc, swap, maskable); +if (ret == -ENOSYS) { +/* Keep adapter even if we didn't register it anywhere. */ +ret = 0; +} +if (ret == 0) { +adapter-id = *id; +adapter-isc = isc; +adapter-type = type; +QTAILQ_INSERT_TAIL(channel_subsys-io_adapters, adapter, sibling); +} else { +g_free(adapter); +fprintf(stderr, Unexpected error %d when registering adapter %d\n, +ret, *id); +} +out: +return ret; +} + uint16_t css_build_subchannel_id(SubchDev *sch) { if (channel_subsys-max_cssid 0) { @@ -1239,6 +1289,7 @@ static void css_init(void) channel_subsys-do_crw_mchk = true; channel_subsys-crws_lost = false; channel_subsys-chnmon_active = false; +QTAILQ_INIT(channel_subsys-io_adapters); } machine_init(css_init); diff --git a/hw/s390x/css.h b/hw/s390x/css.h index e9b4405..380e8e7 100644 --- a/hw/s390x/css.h +++ b/hw/s390x/css.h @@ -99,4 +99,8 @@ void
[Qemu-devel] [PATCH v4 4/5] s390x/virtio-ccw: reference-counted indicators
Make code using the same indicators point to a single allocated structure that is freed when the last user goes away. This will be used by the irqfd code to unmap addresses after the last user is gone. Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/s390x/virtio-ccw.c | 80 ++--- hw/s390x/virtio-ccw.h | 13 ++-- 2 files changed, 73 insertions(+), 20 deletions(-) diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c index 1193682..69efa6c 100644 --- a/hw/s390x/virtio-ccw.c +++ b/hw/s390x/virtio-ccw.c @@ -27,6 +27,38 @@ #include virtio-ccw.h #include trace.h +static QTAILQ_HEAD(, IndAddr) indicator_addresses = +QTAILQ_HEAD_INITIALIZER(indicator_addresses); + +static IndAddr *get_indicator(hwaddr ind_addr, int len) +{ +IndAddr *indicator; + +QTAILQ_FOREACH(indicator, indicator_addresses, sibling) { +if (indicator-addr == ind_addr) { +indicator-refcnt++; +return indicator; +} +} +indicator = g_new0(IndAddr, 1); +indicator-addr = ind_addr; +indicator-len = len; +indicator-refcnt = 1; +QTAILQ_INSERT_TAIL(indicator_addresses, indicator, sibling); +return indicator; +} + +static void release_indicator(IndAddr *indicator) +{ +assert(indicator-refcnt 0); +indicator-refcnt--; +if (indicator-refcnt 0) { +return; +} +QTAILQ_REMOVE(indicator_addresses, indicator, sibling); +g_free(indicator); +} + static void virtio_ccw_bus_new(VirtioBusState *bus, size_t bus_size, VirtioCcwDevice *dev); @@ -445,7 +477,7 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw) ret = -EFAULT; } else { indicators = ldq_phys(address_space_memory, ccw.cda); -dev-indicators = indicators; +dev-indicators = get_indicator(indicators, sizeof(uint64_t)); sch-curr_status.scsw.count = ccw.count - sizeof(indicators); ret = 0; } @@ -465,7 +497,7 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw) ret = -EFAULT; } else { indicators = ldq_phys(address_space_memory, ccw.cda); -dev-indicators2 = indicators; +dev-indicators2 = get_indicator(indicators, sizeof(uint64_t)); sch-curr_status.scsw.count = ccw.count - sizeof(indicators); ret = 0; } @@ -517,8 +549,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw) ret = -EFAULT; } else { len = hw_len; -dev-summary_indicator = thinint-summary_indicator; -dev-indicators = thinint-device_indicator; +dev-summary_indicator = +get_indicator(thinint-summary_indicator, sizeof(uint8_t)); +dev-indicators = get_indicator(thinint-device_indicator, +thinint-ind_bit / 8 + 1); dev-thinint_isc = thinint-isc; dev-ind_bit = thinint-ind_bit; cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len); @@ -526,8 +560,8 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw) dev-thinint_isc, true, false, dev-adapter_id); assert(ret == 0); -sch-thinint_active = ((dev-indicators != 0) - (dev-summary_indicator != 0)); +sch-thinint_active = ((dev-indicators != NULL) + (dev-summary_indicator != NULL)); sch-curr_status.scsw.count = ccw.count - len; ret = 0; } @@ -558,7 +592,7 @@ static int virtio_ccw_device_init(VirtioCcwDevice *dev, VirtIODevice *vdev) sch-driver_data = dev; dev-sch = sch; -dev-indicators = 0; +dev-indicators = NULL; /* Initialize subchannel structure. */ sch-channel_prog = 0x0; @@ -698,7 +732,10 @@ static int virtio_ccw_exit(VirtioCcwDevice *dev) css_subch_assign(sch-cssid, sch-ssid, sch-schid, sch-devno, NULL); g_free(sch); } -dev-indicators = 0; +if (dev-indicators) { +release_indicator(dev-indicators); +dev-indicators = NULL; +} return 0; } @@ -955,17 +992,17 @@ static void virtio_ccw_notify(DeviceState *d, uint16_t vector) * ind_bit indicates the start of the indicators in a big * endian notation. */ -virtio_set_ind_atomic(sch, dev-indicators + +virtio_set_ind_atomic(sch, dev-indicators-addr + (dev-ind_bit + vector) / 8, 0x80 ((dev-ind_bit + vector) % 8)); -if (!virtio_set_ind_atomic(sch, dev-summary_indicator, +if
Re: [Qemu-devel] [PATCH v4 1/1] Make qemu_peek_buffer loop until it gets it's data
* (chenliang0...@icloud.com) wrote: * (chenliang0...@icloud.com) wrote: ?? 2014??4??810:29??Dr. David Alan Gilbert (git) dgilb...@redhat.com ?? From: Dr. David Alan Gilbert dgilb...@redhat.com Make qemu_peek_buffer repeatedly call fill_buffer until it gets all the data it requires, or until there is an error. At the moment, qemu_peek_buffer will try one qemu_fill_buffer if there isn't enough data waiting, however the kernel is entitled to return just a few bytes, and still leave qemu_peek_buffer with less bytes than it needed. I've seen this fail in a dev world, and I think it could theoretically fail in the peeking of the subsection headers in the current world. hmm, I also have got some errors(infrequently). Maybe It is one point. Could you show some messages about the error? I've only seen the errors in my visitor/ber world where I use the peek_buffer a lot more; but the one place it is used in the existing code is in the code to check if we have the start of a subsection; if that goes wrong I'm not sure what error would be produced. Dave In my observation, error is more likely to happen if migration thread is blocked too long. BTW, xbzrle and auto convergence are disable. What error do you see? Dave It likes section id 1 load failed at dest side. -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[Qemu-devel] [PATCH v4 0/5] qemu: irqfds for s390x
Here's my current qemu s390x irqfd patchset. Unless there are objections, I'll send a pull request once 2.0 has been released. Changes from v3: - rebased against current master - first patch is now a proper kernel header update - in patch 5, make indicators-map a uint64_t instead of void *; this saves useless casting and fixes a 32 bit compile error Changes from v2: - rebased against current master - use object_resolve_path() to grab the flic - more explicit return code check for enabling KVM_CAP_S390_IRQCHIP Changes from v1: - rebased against current master - pick up changed capability numbers from the kvm patchset - use c99 struct initializers in patch 3 - adapter interrupts are already upstream The git branch at https://github.com/cohuck/qemu.git s390x-irqfd has been updated accordingly. Cornelia Huck (5): linux-headers: update kvm: add kvm_enable_cap_{vm,vcpu} s390x: Add I/O adapter registration. s390x/virtio-ccw: reference-counted indicators s390x/virtio-ccw: Wire up irq routing and irqfds. hw/intc/s390_flic.c | 59 +++ hw/s390x/css.c | 51 + hw/s390x/css.h |4 + hw/s390x/virtio-ccw.c| 239 +- hw/s390x/virtio-ccw.h| 16 ++- include/hw/s390x/adapter.h | 23 include/qemu/typedefs.h |1 + include/sysemu/kvm.h |6 ++ kvm-all.c| 57 +- kvm-stub.c |5 + linux-headers/asm-s390/kvm.h | 24 + linux-headers/linux/kvm.h| 17 +++ target-s390x/cpu.h | 33 ++ target-s390x/kvm.c |5 + 14 files changed, 511 insertions(+), 29 deletions(-) create mode 100644 include/hw/s390x/adapter.h -- 1.7.9.5
[Qemu-devel] [PATCH v4 1/5] linux-headers: update
Base is 7cbb39d4d4d530dff12f2ff06ed6c85c504ba91a. Gets several new interfaces: Per-vm capability enablement, adapter interrupt sources, irq routing on s390. Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- linux-headers/asm-s390/kvm.h | 24 linux-headers/linux/kvm.h| 17 + 2 files changed, 41 insertions(+) diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h index cb4c1eb..c003c6a 100644 --- a/linux-headers/asm-s390/kvm.h +++ b/linux-headers/asm-s390/kvm.h @@ -22,6 +22,8 @@ #define KVM_DEV_FLIC_CLEAR_IRQS3 #define KVM_DEV_FLIC_APF_ENABLE4 #define KVM_DEV_FLIC_APF_DISABLE_WAIT 5 +#define KVM_DEV_FLIC_ADAPTER_REGISTER 6 +#define KVM_DEV_FLIC_ADAPTER_MODIFY7 /* * We can have up to 4*64k pending subchannels + 8 adapter interrupts, * as well as up to ASYNC_PF_PER_VCPU*KVM_MAX_VCPUS pfault done interrupts. @@ -32,6 +34,26 @@ #define KVM_S390_MAX_FLOAT_IRQS266250 #define KVM_S390_FLIC_MAX_BUFFER 0x200 +struct kvm_s390_io_adapter { + __u32 id; + __u8 isc; + __u8 maskable; + __u8 swap; + __u8 pad; +}; + +#define KVM_S390_IO_ADAPTER_MASK 1 +#define KVM_S390_IO_ADAPTER_MAP 2 +#define KVM_S390_IO_ADAPTER_UNMAP 3 + +struct kvm_s390_io_adapter_req { + __u32 id; + __u8 type; + __u8 mask; + __u16 pad0; + __u64 addr; +}; + /* for KVM_GET_REGS and KVM_SET_REGS */ struct kvm_regs { /* general purpose regs for s390 */ @@ -76,4 +98,6 @@ struct kvm_sync_regs { #define KVM_REG_S390_PFTOKEN (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x5) #define KVM_REG_S390_PFCOMPARE (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x6) #define KVM_REG_S390_PFSELECT (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x7) +#define KVM_REG_S390_PP(KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x8) +#define KVM_REG_S390_GBEA (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x9) #endif diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h index e27a4b3..b278ab3 100644 --- a/linux-headers/linux/kvm.h +++ b/linux-headers/linux/kvm.h @@ -740,6 +740,9 @@ struct kvm_ppc_smmu_info { #define KVM_CAP_SPAPR_MULTITCE 94 #define KVM_CAP_EXT_EMUL_CPUID 95 #define KVM_CAP_HYPERV_TIME 96 +#define KVM_CAP_IOAPIC_POLARITY_IGNORED 97 +#define KVM_CAP_ENABLE_CAP_VM 98 +#define KVM_CAP_S390_IRQCHIP 99 #ifdef KVM_CAP_IRQ_ROUTING @@ -755,9 +758,18 @@ struct kvm_irq_routing_msi { __u32 pad; }; +struct kvm_irq_routing_s390_adapter { + __u64 ind_addr; + __u64 summary_addr; + __u64 ind_offset; + __u32 summary_offset; + __u32 adapter_id; +}; + /* gsi routing entry types */ #define KVM_IRQ_ROUTING_IRQCHIP 1 #define KVM_IRQ_ROUTING_MSI 2 +#define KVM_IRQ_ROUTING_S390_ADAPTER 3 struct kvm_irq_routing_entry { __u32 gsi; @@ -767,6 +779,7 @@ struct kvm_irq_routing_entry { union { struct kvm_irq_routing_irqchip irqchip; struct kvm_irq_routing_msi msi; + struct kvm_irq_routing_s390_adapter adapter; __u32 pad[8]; } u; }; @@ -1075,6 +1088,10 @@ struct kvm_s390_ucas_mapping { /* Available with KVM_CAP_DEBUGREGS */ #define KVM_GET_DEBUGREGS _IOR(KVMIO, 0xa1, struct kvm_debugregs) #define KVM_SET_DEBUGREGS _IOW(KVMIO, 0xa2, struct kvm_debugregs) +/* + * vcpu version available with KVM_ENABLE_CAP + * vm version available with KVM_CAP_ENABLE_CAP_VM + */ #define KVM_ENABLE_CAP_IOW(KVMIO, 0xa3, struct kvm_enable_cap) /* Available with KVM_CAP_XSAVE */ #define KVM_GET_XSAVE_IOR(KVMIO, 0xa4, struct kvm_xsave) -- 1.7.9.5
[Qemu-devel] PL022 VMSD versions bump
Hi Peter, Im looking to do some cleanup around pl022 and I would like to use it as my lead example of code using the new generic FIFO (as I think I have test cases handy for it). But the VMSD description for this device is a bit ugly as it is: VMSTATE_INT32(tx_fifo_head, PL022State), VMSTATE_INT32(rx_fifo_head, PL022State), VMSTATE_INT32(tx_fifo_len, PL022State), VMSTATE_INT32(rx_fifo_len, PL022State), VMSTATE_UINT16(tx_fifo[0], PL022State), VMSTATE_UINT16(rx_fifo[0], PL022State), VMSTATE_UINT16(tx_fifo[1], PL022State), VMSTATE_UINT16(rx_fifo[1], PL022State), VMSTATE_UINT16(tx_fifo[2], PL022State), VMSTATE_UINT16(rx_fifo[2], PL022State), VMSTATE_UINT16(tx_fifo[3], PL022State), VMSTATE_UINT16(rx_fifo[3], PL022State), VMSTATE_UINT16(tx_fifo[4], PL022State), VMSTATE_UINT16(rx_fifo[4], PL022State), VMSTATE_UINT16(tx_fifo[5], PL022State), VMSTATE_UINT16(rx_fifo[5], PL022State), VMSTATE_UINT16(tx_fifo[6], PL022State), VMSTATE_UINT16(rx_fifo[6], PL022State), VMSTATE_UINT16(tx_fifo[7], PL022State), VMSTATE_UINT16(rx_fifo[7], PL022State), There is a way to covert to Fifo while maintaining full backwards compat (in much the same way as I did for serial), although as you can imagine, it will be messy. Is it worth the VMSD version bump to get it cleaned up? Regards, Peter
Re: [Qemu-devel] [PATCH v3 21/26] tcg-aarch64: Introduce tcg_out_insn_3507
On 03.04.2014 21:56, Richard Henderson wrote: Cleaning up the implementation of REV and REV16 at the same time. Reviewed-by: Claudio Fontana claudio.font...@huawei.com Signed-off-by: Richard Henderson r...@twiddle.net --- tcg/aarch64/tcg-target.c | 22 ++ 1 file changed, 14 insertions(+), 8 deletions(-) During testing I found this patch causes a regression for big endian targets (sparc). Can you take a look? I think it might be related to the extended form of the REV instruction needing an additional 0x400. See below. diff --git a/tcg/aarch64/tcg-target.c b/tcg/aarch64/tcg-target.c index caaf8a2..de7490d 100644 --- a/tcg/aarch64/tcg-target.c +++ b/tcg/aarch64/tcg-target.c @@ -327,6 +327,10 @@ typedef enum { I3506_CSEL = 0x1a80, I3506_CSINC = 0x1a800400, +/* Data-processing (1 source) instructions. */ +I3507_REV16 = 0x5ac00400, +I3507_REV = 0x5ac00800, + /* Data-processing (2 source) instructions. */ I3508_LSLV = 0x1ac02000, I3508_LSRV = 0x1ac02400, @@ -545,6 +549,12 @@ static void tcg_out_insn_3506(TCGContext *s, AArch64Insn insn, TCGType ext, | tcg_cond_to_aarch64[c] 12); } +static void tcg_out_insn_3507(TCGContext *s, AArch64Insn insn, TCGType ext, + TCGReg rd, TCGReg rn) +{ +tcg_out32(s, insn | ext 31 | rn 5 | rd); +} + static void tcg_out_insn_3509(TCGContext *s, AArch64Insn insn, TCGType ext, TCGReg rd, TCGReg rn, TCGReg rm, TCGReg ra) { @@ -961,19 +971,15 @@ static void tcg_out_brcond(TCGContext *s, TCGMemOp ext, TCGCond c, TCGArg a, } static inline void tcg_out_rev(TCGContext *s, TCGType ext, - TCGReg rd, TCGReg rm) + TCGReg rd, TCGReg rn) { -/* using REV 0x5ac00800 */ -unsigned int base = ext ? 0xdac00c00 : 0x5ac00800; see the extended form 0xdac00c00 - -tcg_out32(s, base | rm 5 | rd); +tcg_out_insn(s, 3507, REV, ext, rd, rn); } static inline void tcg_out_rev16(TCGContext *s, TCGType ext, - TCGReg rd, TCGReg rm) + TCGReg rd, TCGReg rn) { -/* using REV16 0x5ac00400 */ -unsigned int base = ext ? 0xdac00400 : 0x5ac00400; -tcg_out32(s, base | rm 5 | rd); +tcg_out_insn(s, 3507, REV16, ext, rd, rn); while this does not have it. } static inline void tcg_out_sxt(TCGContext *s, TCGType ext, TCGMemOp s_bits, Ciao, Claudio
Re: [Qemu-devel] [PATCH v4 1/1] Make qemu_peek_buffer loop until it gets it's data
* (chenliang0...@icloud.com) wrote: * (chenliang0...@icloud.com) wrote: * (chenliang0...@icloud.com) wrote: ?? 2014??4??810:29??Dr. David Alan Gilbert (git) dgilb...@redhat.com ?? From: Dr. David Alan Gilbert dgilb...@redhat.com Make qemu_peek_buffer repeatedly call fill_buffer until it gets all the data it requires, or until there is an error. At the moment, qemu_peek_buffer will try one qemu_fill_buffer if there isn't enough data waiting, however the kernel is entitled to return just a few bytes, and still leave qemu_peek_buffer with less bytes than it needed. I've seen this fail in a dev world, and I think it could theoretically fail in the peeking of the subsection headers in the current world. hmm, I also have got some errors(infrequently). Maybe It is one point. Could you show some messages about the error? I've only seen the errors in my visitor/ber world where I use the peek_buffer a lot more; but the one place it is used in the existing code is in the code to check if we have the start of a subsection; if that goes wrong I'm not sure what error would be produced. Dave In my observation, error is more likely to happen if migration thread is blocked too long. BTW, xbzrle and auto convergence are disable. What error do you see? Dave It likes section id 1 load failed at dest side. Our migration errors aren't particularly descriptive :-( However, try it with this patch - I'd be interested if it makes your problem go away. Dave -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [libvirt] [PATCH 1/1] Enable QEMU_CAPS_PCI_MULTIBUS capability for QEMU2.0 forward.
[adding qemu] On 04/08/2014 11:36 PM, Li Zhang wrote: On 2014年04月09日 11:20, Eric Blake wrote: On 04/08/2014 08:03 PM, Li Zhang wrote: From: Li Zhang zhlci...@linux.vnet.ibm.com For QEMU2.0 forward version, it supports PCI multiBUS. Currently, libvirt still disables it which causes an error Bus 'pci' not found. Signed-off-by: Li Zhang zhlci...@linux.vnet.ibm.com --- src/qemu/qemu_capabilities.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 2c8ec10..b49398f 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3019,6 +3019,9 @@ virQEMUCapsInitQMPMonitor(virQEMUCapsPtr qemuCaps, if (qemuCaps-version = 1006000) virQEMUCapsSet(qemuCaps, QEMU_CAPS_DEVICE_VIDEO_PRIMARY); +if (qemuCaps-version = 200) +virQEMUCapsSet(qemuCaps, QEMU_CAPS_PCI_MULTIBUS); + This is a version number check, which is bad. Is there a QMP command we can issue that gives a more reliable answer of whether the feature is present? Hi Eric, I can't find any QMP command for MULTIBUS in QEMU. There is only one query-pci to list PCI bus and devices information. Maybe it's my fault for not being more vocal when the bus rename was being proposed on the qemu list, but libvirt really does want a way to probe via QMP whether the 'pci.0' (multibus) vs. 'pci' naming should be used for a given machine. Is there an existing command we can use, or is this something that we need to fix early on in qemu 2.1 and backport to 2.0-stable? -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] bochs: Fix catalog size check
On 04/09/14 13:14, Kevin Wolf wrote: The old check was off by a factor of 512 and didn't consider cases where we don't get an exact division. This could lead to an out-of-bounds array access in seek_to_sector(). Signed-off-by: Kevin Wolf kw...@redhat.com --- block/bochs.c | 14 +++--- tests/qemu-iotests/078 | 6 +- tests/qemu-iotests/078.out | 6 -- 3 files changed, 20 insertions(+), 6 deletions(-) diff --git a/block/bochs.c b/block/bochs.c index 50b84a9..eacf956 100644 --- a/block/bochs.c +++ b/block/bochs.c @@ -148,8 +148,14 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags, s-extent_blocks = 1 + (le32_to_cpu(bochs.extent) - 1) / 512; s-extent_size = le32_to_cpu(bochs.extent); -if (s-extent_size == 0) { -error_setg(errp, Extent size may not be zero); +if (s-extent_size BDRV_SECTOR_SIZE) { +/* bximage actually never creates extents smaller than 4k */ +error_setg(errp, Extent size must be at least 512); +ret = -EINVAL; +goto fail; +} else if (!is_power_of_2(s-extent_size)) { +error_setg(errp, Extent size % PRIu32 is not a power of two, + s-extent_size); ret = -EINVAL; goto fail; } else if (s-extent_size 0x80) { OK, so this is based on your other [PATCH] bochs: Fix memory leak in bochs_open() error path ; good thus far. @@ -159,7 +165,9 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } -if (s-catalog_size bs-total_sectors / s-extent_size) { +if (s-catalog_size DIV_ROUND_UP(bs-total_sectors, + s-extent_size / BDRV_SECTOR_SIZE)) +{ error_setg(errp, Catalog size is too small for this disk size); ret = -EINVAL; goto fail; Nice! If I may voice my liking. Reviewed-by: Laszlo Ersek ler...@redhat.com Thanks Laszlo diff --git a/tests/qemu-iotests/078 b/tests/qemu-iotests/078 index 872e734..d4d6da7 100755 --- a/tests/qemu-iotests/078 +++ b/tests/qemu-iotests/078 @@ -69,10 +69,14 @@ _use_sample_img empty.bochs.bz2 poke_file $TEST_IMG $disk_size_offset \x00\xc0\x0f\x00\x00\x00\x00\x7f { $QEMU_IO -c read 2T 4k $TEST_IMG; } 21 | _filter_qemu_io | _filter_testdir +_use_sample_img empty.bochs.bz2 +poke_file $TEST_IMG $catalog_size_offset \x10\x00\x00\x00 +{ $QEMU_IO -c read 0xfbe00 512 $TEST_IMG; } 21 | _filter_qemu_io | _filter_testdir + echo echo == Negative extent size == _use_sample_img empty.bochs.bz2 -poke_file $TEST_IMG $extent_size_offset \xff\xff\xff\xff +poke_file $TEST_IMG $extent_size_offset \x00\x00\x00\x80 { $QEMU_IO -c read 768k 512 $TEST_IMG; } 21 | _filter_qemu_io | _filter_testdir echo diff --git a/tests/qemu-iotests/078.out b/tests/qemu-iotests/078.out index ea95ffd..ca18d2e 100644 --- a/tests/qemu-iotests/078.out +++ b/tests/qemu-iotests/078.out @@ -15,12 +15,14 @@ no file open, try 'help open' == Too small catalog bitmap for image size == qemu-io: can't open device TEST_DIR/empty.bochs: Catalog size is too small for this disk size no file open, try 'help open' +qemu-io: can't open device TEST_DIR/empty.bochs: Catalog size is too small for this disk size +no file open, try 'help open' == Negative extent size == -qemu-io: can't open device TEST_DIR/empty.bochs: Extent size 4294967295 is too large +qemu-io: can't open device TEST_DIR/empty.bochs: Extent size 2147483648 is too large no file open, try 'help open' == Zero extent size == -qemu-io: can't open device TEST_DIR/empty.bochs: Extent size may not be zero +qemu-io: can't open device TEST_DIR/empty.bochs: Extent size must be at least 512 no file open, try 'help open' *** done
Re: [Qemu-devel] PL022 VMSD versions bump
On 9 April 2014 13:42, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: Hi Peter, Im looking to do some cleanup around pl022 and I would like to use it as my lead example of code using the new generic FIFO (as I think I have test cases handy for it). But the VMSD description for this device is a bit ugly as it is: This will just be because it's a transliteration of a previous save/load function implementation. There is a way to covert to Fifo while maintaining full backwards compat (in much the same way as I did for serial), although as you can imagine, it will be messy. Is it worth the VMSD version bump to get it cleaned up? A version bump is fine with me. thanks -- PMM
Re: [Qemu-devel] [PATCH 2/2] pci-assign: Fix memory out of bound when MSI-X table not fit in a single page
On Wed, Apr 09, 2014 at 10:56:57AM +, Gonglei (Arei) wrote: Hi, QEMU only mmap MSIX_PAGE_SIZE memory for all pci devices in assigned_dev_register_msix_mmio(), meanwhile the set the one page memmory to zero, so the rest memory will be random value (maybe etnry.data is not 0). In the assigned_dev_update_msix_mmio() maybe occur the issue of entry_nr 256, and the kmod reports the EINVAL error. Signed-off-by: Gonglei arei.gong...@huawei.com Okay so this kind of works because guest does not try to use so many vectors. Yes. It's true. But I think it's better not to give guest more entries than we can actually support. How about tweaking MSIX capability exposed to guest to limit table size? IMHO, the MSIX table entry size got form the physic NIC hardware. The software layer shouldn't tweak the capability of real hardware, neither QEMU nor KVM. Best regards, -Gonglei Should be easy to fix though. Does the following help? -- pci-assign: limit # of msix vectors Signed-off-by: Michael S. Tsirkin m...@redhat.com diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c index a825871..76aa86e 100644 --- a/hw/i386/kvm/pci-assign.c +++ b/hw/i386/kvm/pci-assign.c @@ -1258,6 +1258,7 @@ static int assigned_device_pci_cap_init(PCIDevice *pci_dev) if (pos != 0 kvm_device_msix_supported(kvm_state)) { int bar_nr; uint32_t msix_table_entry; +uint16_t msix_max; if (!check_irqchip_in_kernel()) { return -ENOTSUP; @@ -1269,9 +1270,10 @@ static int assigned_device_pci_cap_init(PCIDevice *pci_dev) } pci_dev-msix_cap = pos; -pci_set_word(pci_dev-config + pos + PCI_MSIX_FLAGS, - pci_get_word(pci_dev-config + pos + PCI_MSIX_FLAGS) - PCI_MSIX_FLAGS_QSIZE); +msix_max = (pci_get_word(pci_dev-config + pos + PCI_MSIX_FLAGS) +PCI_MSIX_FLAGS_QSIZE) + 1; +msix_max = MIN(msix_max, KVM_MAX_MSIX_PER_DEV); +pci_set_word(pci_dev-config + pos + PCI_MSIX_FLAGS, msix_max - 1); /* Only enable and function mask bits are writable */ pci_set_word(pci_dev-wmask + pos + PCI_MSIX_FLAGS, @@ -1281,9 +1283,7 @@ static int assigned_device_pci_cap_init(PCIDevice *pci_dev) bar_nr = msix_table_entry PCI_MSIX_FLAGS_BIRMASK; msix_table_entry = ~PCI_MSIX_FLAGS_BIRMASK; dev-msix_table_addr = pci_region[bar_nr].base_addr + msix_table_entry; -dev-msix_max = pci_get_word(pci_dev-config + pos + PCI_MSIX_FLAGS); -dev-msix_max = PCI_MSIX_FLAGS_QSIZE; -dev-msix_max += 1; +dev-msix_max = msix_max; } /* Minimal PM support, nothing writable, device appears to NAK changes */
Re: [Qemu-devel] [PATCH v4 2/5] kvm: add kvm_enable_cap_{vm,vcpu}
On 09.04.14 13:34, Cornelia Huck wrote: Provide helper functions for enabling capabilities (on a vcpu and on a vm). Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com I think it makes sense to convert all ENABLE_CAP callers of the code base to this helper as well. Once you do that, you'll realize that we may want to have additional argument parameters ;). Alex --- include/sysemu/kvm.h |4 kvm-all.c| 19 ++- 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index 0bee1e8..d89911c 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -294,6 +294,10 @@ bool kvm_arch_stop_on_emulation_error(CPUState *cpu); int kvm_check_extension(KVMState *s, unsigned int extension); +int kvm_enable_cap_vm(KVMState *s, unsigned int capability); + +int kvm_enable_cap_vcpu(CPUState *cpu, unsigned int capability); + uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function, uint32_t index, int reg); diff --git a/kvm-all.c b/kvm-all.c index cd4111d..c32eeff 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -501,7 +501,24 @@ int kvm_check_extension(KVMState *s, unsigned int extension) return ret; } -static int kvm_set_ioeventfd_mmio(int fd, hwaddr addr, uint32_t val, +int kvm_enable_cap_vm(KVMState *s, unsigned int capability) +{ +struct kvm_enable_cap cap = {}; + +cap.cap = capability; +return kvm_vm_ioctl(s, KVM_ENABLE_CAP, cap); +} + +int kvm_enable_cap_vcpu(CPUState *cpu, unsigned int capability) +{ +struct kvm_enable_cap cap = {}; + +cap.cap = capability; +return kvm_vcpu_ioctl(cpu, KVM_ENABLE_CAP, cap); +} + + +static int kvm_set_ioeventfd_mmio(int fd, uint32_t addr, uint32_t val, bool assign, uint32_t size, bool datamatch) { int ret;
Re: [Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.
On 09.04.14 13:34, Cornelia Huck wrote: Register an I/O adapter interrupt source for when virtio-ccw devices start using adapter interrupts. Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/intc/s390_flic.c | 59 + hw/s390x/css.c| 51 ++ hw/s390x/css.h|4 hw/s390x/virtio-ccw.c |4 hw/s390x/virtio-ccw.h |1 + target-s390x/cpu.h| 33 +++ 6 files changed, 152 insertions(+) diff --git a/hw/intc/s390_flic.c b/hw/intc/s390_flic.c index b2ef3e3..c033c8a 100644 --- a/hw/intc/s390_flic.c +++ b/hw/intc/s390_flic.c @@ -21,6 +21,11 @@ #define FLIC_FAILED (-1UL) #define FLIC_SAVEVM_VERSION 1 +static KVMS390FLICState *s390_get_flic(void) +{ +return KVM_S390_FLIC(object_resolve_path(/machine/s390-flic, NULL)); +} + void s390_flic_init(void) { DeviceState *dev; @@ -148,6 +153,60 @@ static int __get_all_irqs(KVMS390FLICState *flic, return r; } +int kvm_s390_register_io_adapter(uint32_t id, uint8_t isc, bool swap, + bool is_maskable) +{ +struct kvm_s390_io_adapter adapter = { +.id = id, +.isc = isc, +.maskable = is_maskable, +.swap = swap, +}; +KVMS390FLICState *flic = s390_get_flic(); +int r, ret; +struct kvm_device_attr attr = { +.group = KVM_DEV_FLIC_ADAPTER_REGISTER, +.addr = (uint64_t)adapter, +}; + +if (!flic) { +return -ENOSYS; +} +if (!kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING)) { +return -ENOSYS; +} + +r = ioctl(flic-fd, KVM_SET_DEVICE_ATTR, attr); + +ret = r ? -errno : 0; +return ret; +} + +int kvm_s390_io_adapter_map(uint32_t id, uint64_t map_addr, bool do_map) +{ +struct kvm_s390_io_adapter_req req = { +.id = id, +.type = do_map ? KVM_S390_IO_ADAPTER_MAP : KVM_S390_IO_ADAPTER_UNMAP, +.addr = map_addr, +}; +KVMS390FLICState *flic = s390_get_flic(); +struct kvm_device_attr attr = { +.group = KVM_DEV_FLIC_ADAPTER_MODIFY, +.addr = (uint64_t)req, +}; +int r; + +if (!flic) { +return -ENOSYS; +} +if (!kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING)) { +return -ENOSYS; +} + +r = ioctl(flic-fd, KVM_SET_DEVICE_ATTR, attr); +return r ? -errno : 0; +} + /** * kvm_flic_save - Save pending floating interrupts * @f: QEMUFile containing migration state diff --git a/hw/s390x/css.c b/hw/s390x/css.c index 7074d2b..a6d173f 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -39,6 +39,13 @@ typedef struct CssImage { ChpInfo chpids[MAX_CHPID + 1]; } CssImage; +typedef struct IoAdapter { +uint32_t id; +uint8_t type; +uint8_t isc; +QTAILQ_ENTRY(IoAdapter) sibling; +} IoAdapter; + typedef struct ChannelSubSys { QTAILQ_HEAD(, CrwContainer) pending_crws; bool do_crw_mchk; @@ -49,6 +56,7 @@ typedef struct ChannelSubSys { uint64_t chnmon_area; CssImage *css[MAX_CSSID + 1]; uint8_t default_cssid; +QTAILQ_HEAD(, IoAdapter) io_adapters; } ChannelSubSys; static ChannelSubSys *channel_subsys; @@ -69,6 +77,48 @@ int css_create_css_image(uint8_t cssid, bool default_image) return 0; } +int css_register_io_adapter(uint8_t type, uint8_t isc, bool swap, +bool maskable, uint32_t *id) +{ +IoAdapter *adapter; +bool found = false; +int ret; + +*id = 0; +QTAILQ_FOREACH(adapter, channel_subsys-io_adapters, sibling) { +if ((adapter-type == type) (adapter-isc == isc)) { +*id = adapter-id; +found = true; +ret = 0; +break; +} +if (adapter-id = *id) { +*id = adapter-id + 1; +} +} +if (found) { +goto out; +} +adapter = g_new0(IoAdapter, 1); +ret = s390_register_io_adapter(*id, isc, swap, maskable); +if (ret == -ENOSYS) { +/* Keep adapter even if we didn't register it anywhere. */ +ret = 0; +} +if (ret == 0) { +adapter-id = *id; +adapter-isc = isc; +adapter-type = type; +QTAILQ_INSERT_TAIL(channel_subsys-io_adapters, adapter, sibling); +} else { +g_free(adapter); +fprintf(stderr, Unexpected error %d when registering adapter %d\n, +ret, *id); +} +out: +return ret; +} + uint16_t css_build_subchannel_id(SubchDev *sch) { if (channel_subsys-max_cssid 0) { @@ -1239,6 +1289,7 @@ static void css_init(void) channel_subsys-do_crw_mchk = true; channel_subsys-crws_lost = false; channel_subsys-chnmon_active = false; +QTAILQ_INIT(channel_subsys-io_adapters); } machine_init(css_init); diff --git a/hw/s390x/css.h b/hw/s390x/css.h index e9b4405..380e8e7
Re: [Qemu-devel] [PATCH 2/2] pci-assign: Fix memory out of bound when MSI-X table not fit in a single page
On 04/03/14 07:18, arei.gong...@huawei.com wrote: From: Gonglei arei.gong...@huawei.com QEMU only mmap MSIX_PAGE_SIZE memory for all pci devices in assigned_dev_register_msix_mmio(), meanwhile the set the one page memmory to zero, so the rest memory will be random value (maybe etnry.data is not 0). In the assigned_dev_update_msix_mmio() maybe occur the issue of entry_nr 256, and the kmod reports the EINVAL error. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/i386/kvm/pci-assign.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c index 570333f..d25a19e 100644 --- a/hw/i386/kvm/pci-assign.c +++ b/hw/i386/kvm/pci-assign.c @@ -37,6 +37,7 @@ #include hw/pci/pci.h #include hw/pci/msi.h #include kvm_i386.h +#include qemu/osdep.h #define MSIX_PAGE_SIZE 0x1000 @@ -59,6 +60,9 @@ #define DEBUG(fmt, ...) #endif +/* the msix-table size readed from pci device config */ +static int msix_table_size; + typedef struct PCIRegion { int type; /* Memory or port I/O */ int valid; @@ -1604,7 +1608,12 @@ static void assigned_dev_msix_reset(AssignedDevice *dev) static int assigned_dev_register_msix_mmio(AssignedDevice *dev) { -dev-msix_table = mmap(NULL, MSIX_PAGE_SIZE, PROT_READ|PROT_WRITE, +msix_table_size = ROUND_UP(dev-msix_max * sizeof(struct MSIXTableEntry), +MSIX_PAGE_SIZE); + +DEBUG(msix_table_size: 0x%x\n, msix_table_size); + +dev-msix_table = mmap(NULL, msix_table_size, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, 0, 0); if (dev-msix_table == MAP_FAILED) { error_report(fail allocate msix_table! %s, strerror(errno)); @@ -1615,7 +1624,7 @@ static int assigned_dev_register_msix_mmio(AssignedDevice *dev) assigned_dev_msix_reset(dev); memory_region_init_io(dev-mmio, OBJECT(dev), assigned_dev_msix_mmio_ops, - dev, assigned-dev-msix, MSIX_PAGE_SIZE); + dev, assigned-dev-msix, msix_table_size); return 0; } @@ -1627,7 +1636,7 @@ static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev) memory_region_destroy(dev-mmio); -if (munmap(dev-msix_table, MSIX_PAGE_SIZE) == -1) { +if (munmap(dev-msix_table, msix_table_size) == -1) { error_report(error unmapping msix_table! %s, strerror(errno)); } dev-msix_table = NULL; My only interest in this patch is RHBZ 616415. Namely, I have to improve the propagation of human-readable errors in PCI device assignment (through QMP as well). This patchset has hunks that look capable of conflicting with my work (not started yet), so I think maybe I should delay my work until this patchset is hashed out and applied. Anyway, I just don't want to wait, so here are some review comments (I'm not a PCI assignment expert by any means!): (1) The patch introduces msix_table_size as an object with static storage duration (ie. as a global variable). This is wrong; different passthru devices will have different MSI-X table sizes. The patch causes breakage as soon as two devices with different (rounded up) table sizes are assigned, and then at least one of them is unassigned (because assigned_dev_unregister_msix_mmio() will unmap either too much or too little). There are two ways to fix this: - Make msix_table_size a *sibling* field of msix_table and msix_max, in struct AssignedDevice. Or, - Recompute the value of msix_table_size (to be introduced as a local variable in all relevant functions) from dev-msix_max each time you need it. (2) The other problem is that not all uses of MSIX_PAGE_SIZE are updated. Namely, assigned_dev_msix_reset() clears the first page, then overwrites the first dev-msix_table entries (masking them). However, if we've allocated at least two pages due to *inexact* division (ie. rounding up), then the trailing portion of the last page continues to contain garbage. Now, this (a) either matters to KVM (because it wants to have sensible values in *all* entries that fit into the pages that it sees), or (b) it doesn't (because KVM complies with entries_nr in assigned_dev_update_msix_mmio(), which is bounded by adev-msix_max). I don't know which one of (a) or (b) is the case. But, the code seems incorrect (or at least misleading) in both cases: In case (a), the memset() in assigned_dev_msix_reset() should be updated so that it covers the entire allocation (all full pages, including the last one). In case (b), the memset() should be dropped from assigned_dev_msix_reset(), because the loop just beneath it will populate the entire set that KVM will look at. Thanks Laszlo
Re: [Qemu-devel] [PATCH 1/2] pci-assign: Fix a bug when map MSI-X table memory failed
On Thu, Apr 03, 2014 at 01:18:23PM +0800, arei.gong...@huawei.com wrote: From: Gonglei arei.gong...@huawei.com when map MSI-X table memory failed, the dev-msix_table not be set to NULL, the assigned_dev_unregister_msix_mmio() will case a segfault when munmap the failed dev-msix_table. Signed-off-by: Gonglei arei.gong...@huawei.com Reviewed-by: Michael S. Tsirkin m...@redhat.com --- hw/i386/kvm/pci-assign.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c index a825871..570333f 100644 --- a/hw/i386/kvm/pci-assign.c +++ b/hw/i386/kvm/pci-assign.c @@ -1608,6 +1608,7 @@ static int assigned_dev_register_msix_mmio(AssignedDevice *dev) MAP_ANONYMOUS|MAP_PRIVATE, 0, 0); if (dev-msix_table == MAP_FAILED) { error_report(fail allocate msix_table! %s, strerror(errno)); +dev-msix_table = NULL; return -EFAULT; } -- 1.7.12.4
Re: [Qemu-devel] [PATCH v4 2/5] kvm: add kvm_enable_cap_{vm,vcpu}
On Wed, 09 Apr 2014 15:58:55 +0200 Alexander Graf ag...@suse.de wrote: On 09.04.14 13:34, Cornelia Huck wrote: Provide helper functions for enabling capabilities (on a vcpu and on a vm). Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com I think it makes sense to convert all ENABLE_CAP callers of the code base to this helper as well. Once you do that, you'll realize that we may want to have additional argument parameters ;). OK, I'll do the additional conversions as a follow-up, so I can add a good interface here :)
Re: [Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.
On Wed, 09 Apr 2014 16:05:00 +0200 Alexander Graf ag...@suse.de wrote: On 09.04.14 13:34, Cornelia Huck wrote: Register an I/O adapter interrupt source for when virtio-ccw devices start using adapter interrupts. Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/intc/s390_flic.c | 59 + hw/s390x/css.c| 51 ++ hw/s390x/css.h|4 hw/s390x/virtio-ccw.c |4 hw/s390x/virtio-ccw.h |1 + target-s390x/cpu.h| 33 +++ 6 files changed, 152 insertions(+) diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c index 2bf0af8..1193682 100644 --- a/hw/s390x/virtio-ccw.c +++ b/hw/s390x/virtio-ccw.c @@ -522,6 +522,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw) dev-thinint_isc = thinint-isc; dev-ind_bit = thinint-ind_bit; cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len); +ret = css_register_io_adapter(CSS_IO_ADAPTER_VIRTIO, + dev-thinint_isc, true, false, + dev-adapter_id); In all other machines the machine file is the one creating the link between a device and the interrupt controller. Can we do something similar for s390? Hm. This would imply we'd need to add a virtio I/O adapter for each isc (0-7) at startup, regardless whether the guest enables adapter interrupts on any of those iscs. Moreover, we'd need to do the same for each type (on each isc) if we add more types of I/O adapters later (should we want to support one of the other adapter-interrupt using devices). I'd prefer to add an I/O adapter only when needed.
Re: [Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.
On 09.04.14 16:24, Cornelia Huck wrote: On Wed, 09 Apr 2014 16:05:00 +0200 Alexander Graf ag...@suse.de wrote: On 09.04.14 13:34, Cornelia Huck wrote: Register an I/O adapter interrupt source for when virtio-ccw devices start using adapter interrupts. Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/intc/s390_flic.c | 59 + hw/s390x/css.c| 51 ++ hw/s390x/css.h|4 hw/s390x/virtio-ccw.c |4 hw/s390x/virtio-ccw.h |1 + target-s390x/cpu.h| 33 +++ 6 files changed, 152 insertions(+) diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c index 2bf0af8..1193682 100644 --- a/hw/s390x/virtio-ccw.c +++ b/hw/s390x/virtio-ccw.c @@ -522,6 +522,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw) dev-thinint_isc = thinint-isc; dev-ind_bit = thinint-ind_bit; cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len); +ret = css_register_io_adapter(CSS_IO_ADAPTER_VIRTIO, + dev-thinint_isc, true, false, + dev-adapter_id); In all other machines the machine file is the one creating the link between a device and the interrupt controller. Can we do something similar for s390? Hm. This would imply we'd need to add a virtio I/O adapter for each isc (0-7) at startup, regardless whether the guest enables adapter interrupts on any of those iscs. Moreover, we'd need to do the same for each type (on each isc) if we add more types of I/O adapters later (should we want to support one of the other adapter-interrupt using devices). I'd prefer to add an I/O adapter only when needed. I'm not sure I can follow you here. Instead of registering the interrupt vector on the fly, you would still register it on the fly, but after the virtio-ccw device got created, no? Alex
[Qemu-devel] [PULL for-2.0 1/1] tests/acpi: update expected DSDT files
commit f2ccc311df55ec026a8f8ea9df998f26314f22b2 dsdt: tweak ACPI ID for hotplug resource device changes the DSDT, update test expected files to match Signed-off-by: Michael S. Tsirkin m...@redhat.com Reported-by: Igor Mammedov imamm...@redhat.com --- tests/acpi-test-data/pc/DSDT | Bin 4485 - 4480 bytes tests/acpi-test-data/q35/DSDT | Bin 7383 - 7378 bytes 2 files changed, 0 insertions(+), 0 deletions(-) diff --git a/tests/acpi-test-data/pc/DSDT b/tests/acpi-test-data/pc/DSDT index 8b14a5f6f2012dc9bfd01f13931100f002e99391..d0bb3de79d2ff98736e8a54e6e8859ab4efeaf03 100644 GIT binary patch delta 45 zcmZowZcyfO33dr-5M*Fr+`Ex$AwQ$jTd;+_%#E9T!T5{Jv?1_94~OOOo~z7XT?9 B4Z8pU delta 50 zcmZorZdK-T33dr-6=Yywe7lirAwQ%0Td;+MDzlJT!T5{Jv?1_9i0O_4GauS7$z?k GxC;Pq#typx diff --git a/tests/acpi-test-data/q35/DSDT b/tests/acpi-test-data/q35/DSDT index a76ea9a418bea57c2d685b3a1b0221defd02b447..fc5b9700097d621ac76c59cf2596ead9d61c20a6 100644 GIT binary patch delta 45 zcmca^dC8K?CDk8k_-a_W8p?FHEBkt$%fJ|_%#E9T!T5{Jv?1_94~OOP1coJ1pqk4 B4IBUf delta 50 zcmca)dEJuBCDk8x(ovYHC(xYSN7ElMSU`i0B0bxdwB@dw9C=Iywh8WRuFih5v GSp@)qeGVJ| -- MST
[Qemu-devel] [PULL for-2.0 0/1] acpi test: DSDT update
The following changes since commit efcc87d9aedb590b8506cd1a7c8abe557c760f9e: Update version for v2.0.0-rc2 release (2014-04-08 18:52:06 +0100) are available in the git repository at: git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream for you to fetch changes up to 50329d3418c959ebce63c59d4c21473ac102d88f: tests/acpi: update expected DSDT files (2014-04-09 17:52:08 +0300) acpi test: DSDT update Not a must but as close to zero risk as one can get: removes an annoying warning produced when running make check. No code changes so it really can't break anything ... Signed-off-by: Michael S. Tsirkin m...@redhat.com Michael S. Tsirkin (1): tests/acpi: update expected DSDT files tests/acpi-test-data/pc/DSDT | Bin 4485 - 4480 bytes tests/acpi-test-data/q35/DSDT | Bin 7383 - 7378 bytes 2 files changed, 0 insertions(+), 0 deletions(-)
[Qemu-devel] [RFC v2 1/6] hw/arm/virt: add a xgmac device
From: Kim Phillips kim.phill...@linaro.org This is a hack and only serves as an example of what needs to be done to make the next RFC - add vfio-platform support - work for development purposes on a Calxeda Midway system. We don't want mach-virt to always create this ethernet device - DO NOT APPLY, etc. Initial attempts to convince QEMU to create a memory mapped device on the command line (e.g., -device vfio-platform,name=fff51000.ethernet) would fail with Parameter 'driver' expects pluggable device type. Any guidance as to how to overcome this apparent design limitation is welcome. RAM is reduced from 30 to 1GiB such as to not overlap the xgmac device's physical address. Not sure if the 30GiB RAM (or whatever the user sets it to with -m) could be set up above 0x1__, but there is probably extra work needed to resolve this type of conflict. note: vfio-platform interrupt support development may want interrupt property data filled; here it's omitted for the time being. Not-signed-off-by: Kim Phillips kim.phill...@linaro.org --- hw/arm/virt.c | 24 +++- 1 file changed, 23 insertions(+), 1 deletion(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 2bbc931..5d43cf0 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -65,6 +65,7 @@ enum { VIRT_GIC_CPU, VIRT_UART, VIRT_MMIO, +VIRT_ETHERNET, }; typedef struct MemMapEntry { @@ -106,7 +107,8 @@ static const MemMapEntry a15memmap[] = { [VIRT_MMIO] = { 0xa00, 0x200 }, /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */ /* 0x1000 .. 0x4000 reserved for PCI */ -[VIRT_MEM] = { 0x4000, 30ULL * 1024 * 1024 * 1024 }, +[VIRT_MEM] = { 0x4000, 1ULL * 1024 * 1024 * 1024 }, +[VIRT_ETHERNET] = { 0xfff51000, 0x1000 }, }; static const int a15irqmap[] = { @@ -291,6 +293,25 @@ static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic) g_free(nodename); } +static void create_ethernet(const VirtBoardInfo *vbi, qemu_irq *pic) +{ +char *nodename; +hwaddr base = vbi-memmap[VIRT_ETHERNET].base; +hwaddr size = vbi-memmap[VIRT_ETHERNET].size; +const char compat[] = calxeda,hb-xgmac; + +sysbus_create_simple(vfio-platform, base, NULL); + +nodename = g_strdup_printf(/ethernet@% PRIx64, base); +qemu_fdt_add_subnode(vbi-fdt, nodename); + +/* Note that we can't use setprop_string because of the embedded NUL */ +qemu_fdt_setprop(vbi-fdt, nodename, compatible, compat, sizeof(compat)); +qemu_fdt_setprop_sized_cells(vbi-fdt, nodename, reg, 2, base, 2, size); + +g_free(nodename); +} + static void create_virtio_devices(const VirtBoardInfo *vbi, qemu_irq *pic) { int i; @@ -425,6 +446,7 @@ static void machvirt_init(QEMUMachineInitArgs *args) } create_uart(vbi, pic); +create_ethernet(vbi, pic); /* Create mmio transports, so the user can create virtio backends * (which will be automatically plugged in to the transports). If -- 1.8.3.2
[Qemu-devel] [RFC v2 3/6] vfio: add vfio-platform support
From: Kim Phillips kim.phill...@linaro.org Functions for which PCI and platform device support share are moved into common.c. The common vfio_{get,put}_group() get an additional argument, a pointer to a vfio_reset_handler(), for which to pass on to qemu_register_reset, but only if it exists (the platform device code currently passes a NULL as its reset_handler). For the platform device code, we basically use SysBusDevice instead of PCIDevice. Since realize() returns void, unlike PCIDevice's initfn, error codes are moved into the error message text with %m. Currently only MMIO access is supported at this time. The perceived path for future QEMU development is: - add support for interrupts - verify and test platform dev unmap path - test existing PCI path for any regressions - add support for creating platform devices on the qemu command line - currently device address specification is hardcoded for test development on Calxeda Midway's fff51000.ethernet device - reset is not supported and registration of reset functions is bypassed for platform devices. - there is no standard means of resetting a platform device, unsure if it suffices to be handled at device--VFIO binding time Signed-off-by: Kim Phillips kim.phill...@linaro.org [1] http://www.spinics.net/lists/kvm-arm/msg08195.html --- hw/vfio/Makefile.objs | 2 + hw/vfio/common.c | 486 ++ hw/vfio/pci.c | 480 ++--- hw/vfio/platform.c| 381 +++ hw/vfio/vfio-common.h | 55 ++ 5 files changed, 937 insertions(+), 467 deletions(-) create mode 100644 hw/vfio/common.c create mode 100644 hw/vfio/platform.c create mode 100644 hw/vfio/vfio-common.h diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs index 31c7dab..c5c76fe 100644 --- a/hw/vfio/Makefile.objs +++ b/hw/vfio/Makefile.objs @@ -1,3 +1,5 @@ ifeq ($(CONFIG_LINUX), y) +obj-$(CONFIG_SOFTMMU) += common.o obj-$(CONFIG_PCI) += pci.o +obj-$(CONFIG_SOFTMMU) += platform.o endif diff --git a/hw/vfio/common.c b/hw/vfio/common.c new file mode 100644 index 000..9d1f723 --- /dev/null +++ b/hw/vfio/common.c @@ -0,0 +1,486 @@ +/* + * vfio based device assignment support + * + * Copyright Red Hat, Inc. 2012 + * + * Authors: + * Alex Williamson alex.william...@redhat.com + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + * Based on qemu-kvm device-assignment: + * Adapted for KVM by Qumranet. + * Copyright (c) 2007, Neocleus, Alex Novik (a...@neocleus.com) + * Copyright (c) 2007, Neocleus, Guy Zana (g...@neocleus.com) + * Copyright (C) 2008, Qumranet, Amit Shah (amit.s...@qumranet.com) + * Copyright (C) 2008, Red Hat, Amit Shah (amit.s...@redhat.com) + * Copyright (C) 2008, IBM, Muli Ben-Yehuda (m...@il.ibm.com) + */ + +#include dirent.h +#include linux/vfio.h +#include sys/ioctl.h +#include sys/mman.h +#include sys/stat.h +#include sys/types.h +#include unistd.h + +#include config.h +#include exec/address-spaces.h +#include exec/memory.h +#include hw/pci/msi.h +#include hw/pci/msix.h +#include hw/pci/pci.h +#include qemu-common.h +#include qemu/error-report.h +#include qemu/event_notifier.h +#include qemu/queue.h +#include qemu/range.h +#include sysemu/kvm.h +#include sysemu/sysemu.h + +#include vfio-common.h + +#define DEBUG_VFIO +#ifdef DEBUG_VFIO +#define DPRINTF(fmt, ...) \ +do { fprintf(stderr, vfio: fmt, ## __VA_ARGS__); } while (0) +#else +#define DPRINTF(fmt, ...) \ +do { } while (0) +#endif + +static QLIST_HEAD(, VFIOContainer) +container_list = QLIST_HEAD_INITIALIZER(container_list); + +QLIST_HEAD(, VFIOGroup) +group_list = QLIST_HEAD_INITIALIZER(group_list); + + +struct VFIODevice; + +#ifdef CONFIG_KVM +/* + * We have a single VFIO pseudo device per KVM VM. Once created it lives + * for the life of the VM. Closing the file descriptor only drops our + * reference to it and the device's reference to kvm. Therefore once + * initialized, this file descriptor is only released on QEMU exit and + * we'll re-use it should another vfio device be attached before then. + */ +static int vfio_kvm_device_fd = -1; +#endif + +/* + * DMA - Mapping and unmapping for the type1 IOMMU interface used on x86 + */ +static int vfio_dma_unmap(VFIOContainer *container, + hwaddr iova, ram_addr_t size) +{ +struct vfio_iommu_type1_dma_unmap unmap = { +.argsz = sizeof(unmap), +.flags = 0, +.iova = iova, +.size = size, +}; + +if (ioctl(container-fd, VFIO_IOMMU_UNMAP_DMA, unmap)) { +DPRINTF(VFIO_UNMAP_DMA: %d\n, -errno); +return -errno; +} + +return 0; +} + +static int vfio_dma_map(VFIOContainer *container, hwaddr iova, +ram_addr_t size, void *vaddr, bool readonly) +{ +struct vfio_iommu_type1_dma_map map = { +.argsz =
[Qemu-devel] [RFC v2 5/6] virt: Assign a VFIO platform device to a virt VM in QEMU command line
This patch aims at allowing the end-user to specify the device he wants to directly assign to his virt VM in the QEMU command line. The QEMU platform device becomes generic. Current choice is to reuse the -device option. For example when assigning Calxeda Midway xgmac device this option is used: -device vfio-platform,vfio_device=fff51000.ethernet,\ compat=calxeda/hb-xgmac,mmap-timeout-ms=1000 where - fff51000.ethernet is the name of the device in /sys/bus/platform/devices/ - calxeda/hb-xgma is the compatibility where the standard coma separator is replaced by / since coma is specifically used by QEMU command line parser - mmap-timeout-ms is minimal amount of time (ms) during which the IP register space stays MMIO mapped after an IRQ triggers in order to trap the end of interrupt (EOI). This is an optional parameter (default value set to 1100 ms). virt machine was modified to interpret this line and automatically - map the device at a chosen guest physical address in [0xa004000, 0x1000], - map the device IRQs after 48, - create the associated guest device tree with the provided compatibility. The -device option underlying implementation is not standard which can be argued. Indeed normaly it induces the call to the QEMU device realize function once after the virtual machine init execution. In this case QDEV mappings and device tree creation must happen. Since virt is the place where the whole memory and IRQ mapping is known and device tree is created, it was chosen to interpret the option line there. This means the realize function is called twice, once in virt.c and once after machine init return. The second call returns immediatly since the QEMU device is recognized as already existing. Another way to implement this would be to create a new option in QEMU. Acknowledgements: - a single compatibility currently is supported - IRQ properties set in the device tree should be refined - More generally devices with more complex device tree nodes must be studied and are not currently handled - cases where multiple VFIO devices are assigned could not be tested Signed-off-by: Eric Auger eric.au...@linaro.org --- hw/arm/virt.c | 178 +++-- hw/vfio/platform.c | 43 ++--- 2 files changed, 181 insertions(+), 40 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 31ae7d2..1fb66ef 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -40,6 +40,17 @@ #include exec/address-spaces.h #include qemu/bitops.h #include qemu/error-report.h +#include monitor/qdev.h +#include qemu/config-file.h + +/* + * this function is implemented in vfio/platform.c + * it returns the name, compatibility, IRQ number and register set size. + * the function only is implemented for VFIO platform devices + */ +void vfio_get_props(SysBusDevice *s, char **pname, char **pcompat, +int *pnum_irqs, size_t *psize); + #define NUM_VIRTIO_TRANSPORTS 32 @@ -65,7 +76,7 @@ enum { VIRT_GIC_CPU, VIRT_UART, VIRT_MMIO, -VIRT_ETHERNET, +VIRT_VFIO, }; typedef struct MemMapEntry { @@ -79,7 +90,10 @@ typedef struct VirtBoardInfo { const char *qdevname; const char *gic_compatible; const MemMapEntry *memmap; +qemu_irq pic[NUM_IRQS]; const int *irqmap; +hwaddr avail_vfio_base; +int avail_vfio_irq; int smp_cpus; void *fdt; int fdt_size; @@ -105,16 +119,16 @@ static const MemMapEntry a15memmap[] = { [VIRT_GIC_CPU] = { 0x8002000, 0x1000 }, [VIRT_UART] = { 0x900, 0x1000 }, [VIRT_MMIO] = { 0xa00, 0x200 }, +[VIRT_VFIO] = { 0xa004000, 0x0 }, /* size is dynamically populated */ /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */ /* 0x1000 .. 0x4000 reserved for PCI */ -[VIRT_MEM] = { 0x4000, 1ULL * 1024 * 1024 * 1024 }, -[VIRT_ETHERNET] = { 0xfff41000, 0x1000 }, +[VIRT_MEM] = { 0x4000, 30ULL * 1024 * 1024 * 1024 }, }; static const int a15irqmap[] = { [VIRT_UART] = 1, [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */ -[VIRT_ETHERNET] = 77, +[VIRT_VFIO] = 48, }; static VirtBoardInfo machines[] = { @@ -266,7 +280,7 @@ static void fdt_add_gic_node(const VirtBoardInfo *vbi) qemu_fdt_setprop_cell(vbi-fdt, /intc, phandle, gic_phandle); } -static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic) +static void create_uart(const VirtBoardInfo *vbi) { char *nodename; hwaddr base = vbi-memmap[VIRT_UART].base; @@ -275,7 +289,7 @@ static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic) const char compat[] = arm,pl011\0arm,primecell; const char clocknames[] = uartclk\0apb_pclk; -sysbus_create_simple(pl011, base, pic[irq]); +sysbus_create_simple(pl011, base, vbi-pic[irq]); nodename = g_strdup_printf(/pl011@% PRIx64, base); qemu_fdt_add_subnode(vbi-fdt, nodename); @@ -294,34 +308,133 @@ static void
[Qemu-devel] [RFC v2 4/6] vfio: Add initial IRQ support in QEMU platform device
This work is inspired of PCI INTx. The code was prepared to support multiple IRQs but this was not tested at that stage. Similarly to what is done on PCI, the device register space is RAM unmapped on IRQ hit in order to trap the end of interrupt (EOI). On mmap timer hit, if the EOI was trapped, the mmapping is restored. the physical interrupt is unmasked on the first read/write with the MMIO register space. Tested on Calxeda Midway xgmac which can be directly assigned to one virt VM. Acknowledgements: - vfio device name, guest physical address and IRQ indexes are hardcoded. This will be improved in another patch - currently tested on a single IRQ (xgmac main IRQ) - a KVM patch is required to invalidate stage2 entries on RAM memory region destruction (see [PATCH] ARM: KVM: Handle IPA unmapping on memory region deletion) Signed-off-by: Eric Auger eric.au...@linaro.org --- hw/arm/virt.c | 13 ++- hw/vfio/platform.c | 316 + 2 files changed, 304 insertions(+), 25 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 5d43cf0..31ae7d2 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -108,12 +108,13 @@ static const MemMapEntry a15memmap[] = { /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */ /* 0x1000 .. 0x4000 reserved for PCI */ [VIRT_MEM] = { 0x4000, 1ULL * 1024 * 1024 * 1024 }, -[VIRT_ETHERNET] = { 0xfff51000, 0x1000 }, +[VIRT_ETHERNET] = { 0xfff41000, 0x1000 }, }; static const int a15irqmap[] = { [VIRT_UART] = 1, [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */ +[VIRT_ETHERNET] = 77, }; static VirtBoardInfo machines[] = { @@ -299,8 +300,12 @@ static void create_ethernet(const VirtBoardInfo *vbi, qemu_irq *pic) hwaddr base = vbi-memmap[VIRT_ETHERNET].base; hwaddr size = vbi-memmap[VIRT_ETHERNET].size; const char compat[] = calxeda,hb-xgmac; +int irqm = vbi-irqmap[VIRT_ETHERNET]; +int irqp = irqm+1; +int irqlp = irqm+2; -sysbus_create_simple(vfio-platform, base, NULL); +sysbus_create_varargs(vfio-platform, base, + pic[irqm], pic[irqp], pic[irqlp], NULL); nodename = g_strdup_printf(/ethernet@% PRIx64, base); qemu_fdt_add_subnode(vbi-fdt, nodename); @@ -308,6 +313,10 @@ static void create_ethernet(const VirtBoardInfo *vbi, qemu_irq *pic) /* Note that we can't use setprop_string because of the embedded NUL */ qemu_fdt_setprop(vbi-fdt, nodename, compatible, compat, sizeof(compat)); qemu_fdt_setprop_sized_cells(vbi-fdt, nodename, reg, 2, base, 2, size); +qemu_fdt_setprop_cells(vbi-fdt, nodename, interrupts, +0x0, irqm, 0x4, +0x0, irqp, 0x4, +0x0, irqlp, 0x4); g_free(nodename); } diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c index 138fb13..f148edd 100644 --- a/hw/vfio/platform.c +++ b/hw/vfio/platform.c @@ -24,6 +24,7 @@ #include config.h #include exec/address-spaces.h #include exec/memory.h + #include qemu-common.h #include qemu/error-report.h #include qemu/event_notifier.h @@ -31,6 +32,7 @@ #include qemu/range.h #include sysemu/kvm.h #include sysemu/sysemu.h + #include hw/qdev-properties.h #include migration/vmstate.h #include hw/hw.h @@ -41,12 +43,15 @@ #define DEBUG_VFIO #ifdef DEBUG_VFIO #define DPRINTF(fmt, ...) \ -do { fprintf(stderr, vfio: fmt, ## __VA_ARGS__); } while (0) +do { fprintf(stderr, vfio: %s: fmt, __func__, ## __VA_ARGS__); } \ +while (0) #else #define DPRINTF(fmt, ...) \ do { } while (0) #endif +#define PLATFORM_NUM_REGIONS 10 + /* Extra debugging, trap acceleration paths for more logging */ #define VFIO_ALLOW_MMAP 1 @@ -63,16 +68,240 @@ typedef struct VFIORegion { uint8_t nr; /* cache the region number for debug */ } VFIORegion; + +#define VFIO_INT_INTp 4 + +typedef struct VFIOINTp { +QLIST_ENTRY(VFIOINTp) next; +EventNotifier interrupt; /* eventfd triggered on interrupt */ +EventNotifier unmask; /* eventfd for unmask on QEMU bypass */ +qemu_irq qemuirq; +struct VFIODevice *vdev; /* back pointer to device */ +bool pending; /* interrupt pending */ +bool kvm_accel; /* set when QEMU bypass through KVM enabled */ +uint8_t pin; /* index */ +uint32_t mmap_timeout; /* delay to re-enable mmaps after interrupt */ +QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */ +} VFIOINTp; + + + typedef struct VFIODevice { SysBusDevice sbdev; int fd; int num_regions; -VFIORegion *regions; +int num_irqs; +int interrupt; /* type of the interrupt, might disappear */ +char *name; +uint32_t mmap_timeout; /* mmap timeout value in ms */ +VFIORegion regions[PLATFORM_NUM_REGIONS]; QLIST_ENTRY(VFIODevice) next; struct VFIOGroup *group; -char *name; +QLIST_HEAD(, VFIOINTp) intp_list;
[Qemu-devel] [RFC v2 2/6] vfio: move hw/misc/vfio.c to hw/vfio/pci.c
From: Kim Phillips kim.phill...@linaro.org This is done in preparation for the addition of VFIO platform device support. Signed-off-by: Kim Phillips kim.phill...@linaro.org --- LICENSE| 2 +- MAINTAINERS| 2 +- hw/Makefile.objs | 1 + hw/misc/Makefile.objs | 1 - hw/vfio/Makefile.objs | 3 +++ hw/{misc/vfio.c = vfio/pci.c} | 0 6 files changed, 6 insertions(+), 3 deletions(-) create mode 100644 hw/vfio/Makefile.objs rename hw/{misc/vfio.c = vfio/pci.c} (100%) diff --git a/LICENSE b/LICENSE index da70e94..0e0b4b9 100644 --- a/LICENSE +++ b/LICENSE @@ -11,7 +11,7 @@ option) any later version. As of July 2013, contributions under version 2 of the GNU General Public License (and no later version) are only accepted for the following files -or directories: bsd-user/, linux-user/, hw/misc/vfio.c, hw/xen/xen_pt*. +or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*. 3) The Tiny Code Generator (TCG) is released under the BSD license (see license headers in files). diff --git a/MAINTAINERS b/MAINTAINERS index 7d17f83..9b5d569 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -605,7 +605,7 @@ F: hw/usb/* VFIO M: Alex Williamson alex.william...@redhat.com S: Supported -F: hw/misc/vfio.c +F: hw/vfio/* vhost M: Michael S. Tsirkin m...@redhat.com diff --git a/hw/Makefile.objs b/hw/Makefile.objs index d178b65..b16dada 100644 --- a/hw/Makefile.objs +++ b/hw/Makefile.objs @@ -26,6 +26,7 @@ devices-dirs-$(CONFIG_SOFTMMU) += ssi/ devices-dirs-$(CONFIG_SOFTMMU) += timer/ devices-dirs-$(CONFIG_TPM) += tpm/ devices-dirs-$(CONFIG_SOFTMMU) += usb/ +devices-dirs-$(CONFIG_SOFTMMU) += vfio/ devices-dirs-$(CONFIG_VIRTIO) += virtio/ devices-dirs-$(CONFIG_SOFTMMU) += watchdog/ devices-dirs-$(CONFIG_SOFTMMU) += xen/ diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs index f674365..656570c 100644 --- a/hw/misc/Makefile.objs +++ b/hw/misc/Makefile.objs @@ -21,7 +21,6 @@ common-obj-$(CONFIG_MACIO) += macio/ ifeq ($(CONFIG_PCI), y) obj-$(CONFIG_KVM) += ivshmem.o -obj-$(CONFIG_LINUX) += vfio.o endif obj-$(CONFIG_REALVIEW) += arm_sysctl.o diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs new file mode 100644 index 000..31c7dab --- /dev/null +++ b/hw/vfio/Makefile.objs @@ -0,0 +1,3 @@ +ifeq ($(CONFIG_LINUX), y) +obj-$(CONFIG_PCI) += pci.o +endif diff --git a/hw/misc/vfio.c b/hw/vfio/pci.c similarity index 100% rename from hw/misc/vfio.c rename to hw/vfio/pci.c -- 1.8.3.2
[Qemu-devel] [RFC v2 6/6] vfio: add exit function and IRQ disable functions
this patch brings misc improvements: - improve comments - remove interrupt field in VFIODevice - add IRQ disable routines used by new exitfn function Signed-off-by: Eric Auger eric.au...@linaro.org --- hw/vfio/platform.c | 167 - 1 file changed, 127 insertions(+), 40 deletions(-) diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c index 8f30d41..c4a4286 100644 --- a/hw/vfio/platform.c +++ b/hw/vfio/platform.c @@ -40,7 +40,6 @@ #include vfio-common.h -#define DEBUG_VFIO #ifdef DEBUG_VFIO #define DPRINTF(fmt, ...) \ do { fprintf(stderr, vfio: %s: fmt, __func__, ## __VA_ARGS__); } \ @@ -69,7 +68,9 @@ typedef struct VFIORegion { } VFIORegion; -#define VFIO_INT_INTp 4 +/* + * The IRQ structure inspired from PCI VFIOINTx + */ typedef struct VFIOINTp { QLIST_ENTRY(VFIOINTp) next; @@ -91,9 +92,8 @@ typedef struct VFIODevice { int fd; int num_regions; int num_irqs; -int interrupt; /* type of the interrupt, might disappear */ char *name; -char *compat; +char *compat; /* compatibility string */ uint32_t mmap_timeout; /* mmap timeout value in ms */ VFIORegion regions[PLATFORM_NUM_REGIONS]; QLIST_ENTRY(VFIODevice) next; @@ -101,9 +101,11 @@ typedef struct VFIODevice { QLIST_HEAD(, VFIOINTp) intp_list; } VFIODevice; + /* * returns properties from a QEMU VFIO device such as * name, compatibility, num IRQs, size of the register set + * currently used by virt machine */ void vfio_get_props(SysBusDevice *s, char **pname, char **pcompat, int *pnum_irqs, size_t *psize); @@ -118,6 +120,20 @@ void vfio_get_props(SysBusDevice *s, char **pname, *psize = vdev-regions[0].size; } +static void vfio_disable_irqindex(VFIODevice *vdev, int index) +{ +struct vfio_irq_set irq_set = { +.argsz = sizeof(irq_set), +.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER, +.index = index, +.start = 0, +.count = 0, +}; + +ioctl(vdev-fd, VFIO_DEVICE_SET_IRQS, irq_set); +} + + static void vfio_unmask_intp(VFIODevice *vdev, int index) { @@ -132,12 +148,17 @@ static void vfio_unmask_intp(VFIODevice *vdev, int index) ioctl(vdev-fd, VFIO_DEVICE_SET_IRQS, irq_set); } - +/* + * Checks whether the IRQ was EOI'ed. In the positive the fast path + * is restored (reg space is mmaped). In the negative the reg space + * stays as an MMIO region (ops) and the mmap timer is reprogrammed + * to check the same condition after mmap_timeout ms + */ static void vfio_intp_mmap_enable(void *opaque) { -VFIOINTp * intp = (VFIOINTp *)opaque; +VFIOINTp *intp = (VFIOINTp *)opaque; VFIODevice *vdev = intp-vdev; if (intp-pending) { @@ -153,6 +174,10 @@ static void vfio_intp_mmap_enable(void *opaque) } +/* + * The fd handler + */ + static void vfio_intp_interrupt(void *opaque) { @@ -176,11 +201,14 @@ static void vfio_intp_interrupt(void *opaque) */ VFIORegion *region = vdev-regions[0]; + +/* register space is unmapped to trap EOI */ memory_region_set_enabled(region-mmap_mem, false); +/* trigger the virtual IRQ */ qemu_set_irq(intp-qemuirq, 1); -/* schedule the mmap timer which will restote mmap path after EOI*/ +/* schedule the mmap timer which will restore mmap path after EOI*/ if (intp-mmap_timeout) { timer_mod(intp-mmap_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + intp-mmap_timeout); @@ -200,13 +228,18 @@ static void vfio_irq_eoi(VFIODevice *vdev) if (intp-pending) { if (eoi_done) { DPRINTF(several IRQ pending simultaneously: \ - this is not a supported case yet\n); + this case is not tested yet\n); } + DPRINTF(EOI IRQ #%d fd=%d\n, intp-pin, event_notifier_get_fd(intp-interrupt)); + intp-pending = false; + +/* deassert the virtual IRQ and unmask physical one */ qemu_set_irq(intp-qemuirq, 0); vfio_unmask_intp(vdev, intp-pin); + eoi_done = true; } } @@ -217,22 +250,6 @@ static void vfio_irq_eoi(VFIODevice *vdev) -#if 0 -static void vfio_list_intp(VFIODevice *vdev) -{ -VFIOINTp *intp; -int i = 0; -QLIST_FOREACH(intp, vdev-intp_list, next) { -DPRINTF(IRQ #%d\n, i); -DPRINTF(- pin = %d\n, intp-pin); -DPRINTF(- fd = %d\n, event_notifier_get_fd(intp-interrupt)); -DPRINTF(- pending = %d\n, (int)intp-pending); -DPRINTF(- kvm_accel = %d\n, (int)intp-kvm_accel); -i++; -} -} -#endif - static int vfio_enable_intp(VFIODevice *vdev, unsigned int index) { struct vfio_irq_set *irq_set; /* irq structure passed to vfio kernel */ @@ -242,8 +259,6 @@ static int vfio_enable_intp(VFIODevice *vdev, unsigned int index) int device = vdev-fd;
Re: [Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.
On Wed, 09 Apr 2014 16:30:33 +0200 Alexander Graf ag...@suse.de wrote: On 09.04.14 16:24, Cornelia Huck wrote: On Wed, 09 Apr 2014 16:05:00 +0200 Alexander Graf ag...@suse.de wrote: On 09.04.14 13:34, Cornelia Huck wrote: Register an I/O adapter interrupt source for when virtio-ccw devices start using adapter interrupts. Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/intc/s390_flic.c | 59 + hw/s390x/css.c| 51 ++ hw/s390x/css.h|4 hw/s390x/virtio-ccw.c |4 hw/s390x/virtio-ccw.h |1 + target-s390x/cpu.h| 33 +++ 6 files changed, 152 insertions(+) diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c index 2bf0af8..1193682 100644 --- a/hw/s390x/virtio-ccw.c +++ b/hw/s390x/virtio-ccw.c @@ -522,6 +522,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw) dev-thinint_isc = thinint-isc; dev-ind_bit = thinint-ind_bit; cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len); +ret = css_register_io_adapter(CSS_IO_ADAPTER_VIRTIO, + dev-thinint_isc, true, false, + dev-adapter_id); In all other machines the machine file is the one creating the link between a device and the interrupt controller. Can we do something similar for s390? Hm. This would imply we'd need to add a virtio I/O adapter for each isc (0-7) at startup, regardless whether the guest enables adapter interrupts on any of those iscs. Moreover, we'd need to do the same for each type (on each isc) if we add more types of I/O adapters later (should we want to support one of the other adapter-interrupt using devices). I'd prefer to add an I/O adapter only when needed. I'm not sure I can follow you here. Instead of registering the interrupt vector on the fly, you would still register it on the fly, but after the virtio-ccw device got created, no? You mean register-at-device-creation instead of register-while-interpreting-ccw? We'd end up with the same problem: We don't know which isc the guest wants to use for adapter interrupts at that point in time, so we would need to register for all iscs. I don't think that is what we want.
[Qemu-devel] Error propagation in generated visitors and command marshallers
I stumbled over this while trying to purge error_is_set() from the code. Here's how we commonly use the Error API: Error *err = NULL; foo(arg, err) if (err) { goto out; } bar(arg, err) if (err) { goto out; } This ensures that err is null on entry, both for foo() and for bar(). Many functions rely on that, like this: void foo(ArgType arg, Error **errp) { if (frobnicate(arg) 0) { error_setg(errp, Can't frobnicate); // This asserts errp != NULL } } Here's how some of our visitor code uses the Error API (for real code, check out generated qmp-marshal.c): Error *err = NULL; QmpInputVisitor *mi = qmp_input_visitor_new_strict(QOBJECT(args)); Visitor *v = qmp_input_get_visitor(mi); char *foo = NULL; char *bar = NULL; visit_type_str(v, foo, foo, err); visit_type_str(v, bar, bar, err); if (err) { goto out; } Unlike above, this may pass a non-null errp to the second visit_type_str(), namely when the first one fails. The visitor functions guard against that, like this: void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp) { if (!error_is_set(errp)) { v-type_str(v, obj, name, errp); } } As discussed before, error_is_set() is almost almost wrong, fragile or unclean. What if errp is null? Then we fail to stop visiting after an error. The function could be improved like this: void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp) { assert(errp); if (!*errp) { v-type_str(v, obj, name, errp); } } But: is it a good idea to have both patterns in the code? Should we perhaps use the common pattern for visiting, too? Like this: visit_type_str(v, foo, foo, err); if (err) { goto out; } visit_type_str(v, bar, bar, err); if (err) { goto out; } Then we can assume *errp is clear on function entry, like this: void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp) { v-type_str(v, obj, name, errp); } Should execute roughly the same number of conditional branches. Tedious repetition of if (err) goto out in the caller, but that's what we do elsewhere, and unlike elsewhere, these one's are generated. Opinions?
Re: [Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.
Am 09.04.2014 um 17:35 schrieb Cornelia Huck cornelia.h...@de.ibm.com: On Wed, 09 Apr 2014 16:30:33 +0200 Alexander Graf ag...@suse.de wrote: On 09.04.14 16:24, Cornelia Huck wrote: On Wed, 09 Apr 2014 16:05:00 +0200 Alexander Graf ag...@suse.de wrote: On 09.04.14 13:34, Cornelia Huck wrote: Register an I/O adapter interrupt source for when virtio-ccw devices start using adapter interrupts. Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/intc/s390_flic.c | 59 + hw/s390x/css.c| 51 ++ hw/s390x/css.h|4 hw/s390x/virtio-ccw.c |4 hw/s390x/virtio-ccw.h |1 + target-s390x/cpu.h| 33 +++ 6 files changed, 152 insertions(+) diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c index 2bf0af8..1193682 100644 --- a/hw/s390x/virtio-ccw.c +++ b/hw/s390x/virtio-ccw.c @@ -522,6 +522,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw) dev-thinint_isc = thinint-isc; dev-ind_bit = thinint-ind_bit; cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len); +ret = css_register_io_adapter(CSS_IO_ADAPTER_VIRTIO, + dev-thinint_isc, true, false, + dev-adapter_id); In all other machines the machine file is the one creating the link between a device and the interrupt controller. Can we do something similar for s390? Hm. This would imply we'd need to add a virtio I/O adapter for each isc (0-7) at startup, regardless whether the guest enables adapter interrupts on any of those iscs. Moreover, we'd need to do the same for each type (on each isc) if we add more types of I/O adapters later (should we want to support one of the other adapter-interrupt using devices). I'd prefer to add an I/O adapter only when needed. I'm not sure I can follow you here. Instead of registering the interrupt vector on the fly, you would still register it on the fly, but after the virtio-ccw device got created, no? You mean register-at-device-creation instead of register-while-interpreting-ccw? We'd end up with the same problem: We don't know which isc the guest wants to use for adapter interrupts at that point in time, so we would need to register for all iscs. I don't think that is what we want. Well, if we only assign a route to the interrupt delivery path, the guest can then configure that previously set up route to the respective isc, or am I missing something obvious? Alex
[Qemu-devel] [RFC v2 0/6] KVM platform device passthrough
This is an RFC about QEMU VFIO platform device implementation. This work aims at enabling KVM platform device passthrough, where a guest OS is directly assigned a platform device: meaning it can access the register space without trapping, and receive IRQ from the device. Also the system MMU downstream to the device is programmed properly to make sure the device can only see the VM assigned physical address space. This work was tested on Calxeda Midway where one xgmac is assigned to KVM host while the second one is assigned to the guest. The current status is the guest is able to ping the gateway meaning the basic functionality is achieved for this device. Nethertheless some key work remains to be done: - unbind/migration - multi-instantiation testing - multiple IRQ testing - management of platform devices with more complex device tree node - The 1st RFC patch works around the limitation of QEMU not being able to create platform devices from the command line, and should only be used for testing the second patch. A discussion of how to address the limitation is encouraged, however, in the context of the first patch. - The 2d patch addresses A.Williamson's comment to have the platform device code separated from the PCI device code, by firstly moving the existing VFIO PCI code into its own hw/vfio/. - the 3d RFC patch achieves MMIO direct access to the device. - the 4th RFC brings basic IRQ support - the 5th RFC enables the QEMU end-user to pass the device he wants to assign to his guest in the command line. - the last RFC brings some cleanup and paves the way for cleaner exit Here are the instructions to test on a Calxeda Midway: https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway The code is based on today's: git://git.qemu.org/qemu.git some patches need to be applied on host linux to correct issues not yet upstreamed. They can be found on vfio-dev branch of git://git.linaro.org/people/eric.auger/linux.git - commit 997691b: enables unmapping stage2 entries on memory region deletion - commit 731e308: corrects an in read/write function of vfio platform driver - iommu/arm-smmu commit serie and especially bf5a852 fixing an issue with Calxeda Midway sMMU. QEMU commits can be found on vfio-dev-integ git://git.linaro.org/people/eric.auger/qemu.git Best Regards Eric Eric Auger (3): vfio: Add initial IRQ support in QEMU platform device virt: Assign a VFIO platform device to a virt VM in QEMU command line vfio: add exit function and IRQ disable functions Kim Phillips (3): hw/arm/virt: add a xgmac device vfio: move hw/misc/vfio.c to hw/vfio/pci.c vfio: add vfio-platform support LICENSE| 2 +- MAINTAINERS| 2 +- hw/Makefile.objs | 1 + hw/arm/virt.c | 161 - hw/misc/Makefile.objs | 1 - hw/vfio/Makefile.objs | 5 + hw/vfio/common.c | 486 ++ hw/{misc/vfio.c = vfio/pci.c} | 480 +- hw/vfio/platform.c | 765 + hw/vfio/vfio-common.h | 55 +++ 10 files changed, 1480 insertions(+), 478 deletions(-) create mode 100644 hw/vfio/Makefile.objs create mode 100644 hw/vfio/common.c rename hw/{misc/vfio.c = vfio/pci.c} (89%) create mode 100644 hw/vfio/platform.c create mode 100644 hw/vfio/vfio-common.h -- 1.8.3.2
Re: [Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.
On Wed, 9 Apr 2014 17:53:40 +0200 Alexander Graf ag...@suse.de wrote: Am 09.04.2014 um 17:35 schrieb Cornelia Huck cornelia.h...@de.ibm.com: On Wed, 09 Apr 2014 16:30:33 +0200 Alexander Graf ag...@suse.de wrote: On 09.04.14 16:24, Cornelia Huck wrote: On Wed, 09 Apr 2014 16:05:00 +0200 Alexander Graf ag...@suse.de wrote: On 09.04.14 13:34, Cornelia Huck wrote: Register an I/O adapter interrupt source for when virtio-ccw devices start using adapter interrupts. Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/intc/s390_flic.c | 59 + hw/s390x/css.c| 51 ++ hw/s390x/css.h|4 hw/s390x/virtio-ccw.c |4 hw/s390x/virtio-ccw.h |1 + target-s390x/cpu.h| 33 +++ 6 files changed, 152 insertions(+) diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c index 2bf0af8..1193682 100644 --- a/hw/s390x/virtio-ccw.c +++ b/hw/s390x/virtio-ccw.c @@ -522,6 +522,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw) dev-thinint_isc = thinint-isc; dev-ind_bit = thinint-ind_bit; cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len); +ret = css_register_io_adapter(CSS_IO_ADAPTER_VIRTIO, + dev-thinint_isc, true, false, + dev-adapter_id); In all other machines the machine file is the one creating the link between a device and the interrupt controller. Can we do something similar for s390? Hm. This would imply we'd need to add a virtio I/O adapter for each isc (0-7) at startup, regardless whether the guest enables adapter interrupts on any of those iscs. Moreover, we'd need to do the same for each type (on each isc) if we add more types of I/O adapters later (should we want to support one of the other adapter-interrupt using devices). I'd prefer to add an I/O adapter only when needed. I'm not sure I can follow you here. Instead of registering the interrupt vector on the fly, you would still register it on the fly, but after the virtio-ccw device got created, no? You mean register-at-device-creation instead of register-while-interpreting-ccw? We'd end up with the same problem: We don't know which isc the guest wants to use for adapter interrupts at that point in time, so we would need to register for all iscs. I don't think that is what we want. Well, if we only assign a route to the interrupt delivery path, the guest can then configure that previously set up route to the respective isc, or am I missing something obvious? The guest might want to use different iscs for different devices. Currently, the first caller for an isc registers the adapter, further users just use it. If we had all devices register for (say) isc 0, we'd have one adapter for that. If the first guest devices uses isc 3, we could switch this one to isc 3, but it would still be one adapter. If now another guest device uses isc 7, we'd need to register a new adapter for isc 7 anyway.
Re: [Qemu-devel] Error propagation in generated visitors and command marshallers
On 04/09/2014 09:48 AM, Markus Armbruster wrote: I stumbled over this while trying to purge error_is_set() from the code. But: is it a good idea to have both patterns in the code? Should we perhaps use the common pattern for visiting, too? Like this: visit_type_str(v, foo, foo, err); if (err) { goto out; } visit_type_str(v, bar, bar, err); if (err) { goto out; } Then we can assume *errp is clear on function entry, like this: void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp) { v-type_str(v, obj, name, errp); } Should execute roughly the same number of conditional branches. Tedious repetition of if (err) goto out in the caller, but that's what we do elsewhere, and unlike elsewhere, these one's are generated. Opinions? Putting the tedium into the generated code is WHY we have generated code; so that the rest of the code that is hand-written can be concise. I like this latter idea of letting the visitors assume that errp is clean on entry with the caller responsible for checking err after each step. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] Error propagation in generated visitors and command marshallers
On Wed, Apr 9, 2014 at 8:48 AM, Markus Armbruster arm...@redhat.com wrote: I stumbled over this while trying to purge error_is_set() from the code. Here's how we commonly use the Error API: Error *err = NULL; foo(arg, err) if (err) { goto out; } bar(arg, err) if (err) { goto out; } This ensures that err is null on entry, both for foo() and for bar(). Many functions rely on that, like this: void foo(ArgType arg, Error **errp) { if (frobnicate(arg) 0) { error_setg(errp, Can't frobnicate); // This asserts errp != NULL } } Here's how some of our visitor code uses the Error API (for real code, check out generated qmp-marshal.c): Error *err = NULL; QmpInputVisitor *mi = qmp_input_visitor_new_strict(QOBJECT(args)); Visitor *v = qmp_input_get_visitor(mi); char *foo = NULL; char *bar = NULL; visit_type_str(v, foo, foo, err); visit_type_str(v, bar, bar, err); if (err) { goto out; } Unlike above, this may pass a non-null errp to the second visit_type_str(), namely when the first one fails. The visitor functions guard against that, like this: void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp) { if (!error_is_set(errp)) { v-type_str(v, obj, name, errp); } } As discussed before, error_is_set() is almost almost wrong, fragile or unclean. What if errp is null? Then we fail to stop visiting after an error. The function could be improved like this: void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp) { assert(errp); if (!*errp) { v-type_str(v, obj, name, errp); } } But: is it a good idea to have both patterns in the code? Should we perhaps use the common pattern for visiting, too? Like this: visit_type_str(v, foo, foo, err); if (err) { goto out; } visit_type_str(v, bar, bar, err); if (err) { goto out; } Then we can assume *errp is clear on function entry, like this: void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp) { v-type_str(v, obj, name, errp); } Should execute roughly the same number of conditional branches. Tedious repetition of if (err) goto out in the caller, but that's what we do elsewhere, and unlike elsewhere, these one's are generated. Opinions? The original visiting code was loosely based on ASN1 marshaling code from Samba which used the if error, bail out at the top style of error handling. As use of Error has evolved in QEMU, I agree that the paradigm of bail out as soon as you see an error and fail fast is better so I'd vote for changing the generated code to do that. Regards, Anthony Liguori
[Qemu-devel] [PATCH V4 1/1] qmp: add pmemload command
Signed-off-by: Baojun Wang wan...@gmail.com --- cpus.c | 29 + hmp-commands.hx | 13 + hmp.c| 11 +++ hmp.h| 1 + qapi-schema.json | 18 ++ qmp-commands.hx | 27 +++ 6 files changed, 99 insertions(+) diff --git a/cpus.c b/cpus.c index 1104d61..9617d43 100644 --- a/cpus.c +++ b/cpus.c @@ -1467,6 +1467,35 @@ exit: fclose(f); } +void qmp_pmemload(int64_t addr, int64_t size, const char *filename, + Error **errp) +{ +FILE *f; +uint32_t l; +uint8_t buf[1024]; + +f = fopen(filename, rb); +if (!f) { +error_setg_file_open(errp, errno, filename); +return; +} + +while (size != 0) { +l = fread(buf, 1, sizeof(buf), f); +if (l != sizeof(buf)) { +error_set(errp, QERR_IO_ERROR); +goto exit; +} +if (l size) +l = size; +cpu_physical_memory_rw(addr, buf, l, 1); +addr += l; +size -= l; +} + +exit: +fclose(f); +} void qmp_inject_nmi(Error **errp) { #if defined(TARGET_I386) diff --git a/hmp-commands.hx b/hmp-commands.hx index f3fc514..18604a6 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -809,6 +809,19 @@ save to disk physical memory dump starting at @var{addr} of size @var{size}. ETEXI { +.name = pmemload, +.args_type = val:l,size:i,filename:s, +.params = addr size file, +.help = load from disk physical memory dump starting at 'addr' of size 'size', +.mhandler.cmd = hmp_pmemload, +}, + +STEXI +@item pmemload @var{addr} @var{size} @var{file} +@findex pmemload +load from disk physical memory dump starting at @var{addr} of size @var{size}. +ETEXI +{ .name = boot_set, .args_type = bootdevice:s, .params = bootdevice, diff --git a/hmp.c b/hmp.c index 2f279c4..6e932f9 100644 --- a/hmp.c +++ b/hmp.c @@ -767,6 +767,17 @@ void hmp_pmemsave(Monitor *mon, const QDict *qdict) hmp_handle_error(mon, errp); } +void hmp_pmemload(Monitor *mon, const QDict *qdict) +{ +uint32_t size = qdict_get_int(qdict, size); +const char *filename = qdict_get_str(qdict, filename); +uint64_t addr = qdict_get_int(qdict, val); +Error *errp = NULL; + +qmp_pmemload(addr, size, filename, errp); +hmp_handle_error(mon, errp); +} + void hmp_ringbuf_write(Monitor *mon, const QDict *qdict) { const char *chardev = qdict_get_str(qdict, device); diff --git a/hmp.h b/hmp.h index ed58f0e..f5f2a16 100644 --- a/hmp.h +++ b/hmp.h @@ -44,6 +44,7 @@ void hmp_system_powerdown(Monitor *mon, const QDict *qdict); void hmp_cpu(Monitor *mon, const QDict *qdict); void hmp_memsave(Monitor *mon, const QDict *qdict); void hmp_pmemsave(Monitor *mon, const QDict *qdict); +void hmp_pmemload(Monitor *mon, const QDict *qdict); void hmp_ringbuf_write(Monitor *mon, const QDict *qdict); void hmp_ringbuf_read(Monitor *mon, const QDict *qdict); void hmp_cont(Monitor *mon, const QDict *qdict); diff --git a/qapi-schema.json b/qapi-schema.json index 391356f..60f9782 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1708,6 +1708,24 @@ 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} } ## +# @pmemload: +# +# Load a portion of guest physical memory from a file. +# +# @val: the physical address of the guest to start from +# +# @size: the size of memory region to load +# +# @filename: the file to load the memory from as binary data +# +# Returns: Nothing on success +# +# Since: 2.1 +## +{ 'command': 'pmemload', + 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} } + +## # @cont: # # Resume guest VCPU execution. diff --git a/qmp-commands.hx b/qmp-commands.hx index ed3ab92..584d6cf 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -468,6 +468,33 @@ Example: EQMP { +.name = pmemload, +.args_type = val:l,size:i,filename:s, +.mhandler.cmd_new = qmp_marshal_input_pmemload, +}, + +SQMP +pmemload + + +load from disk physical memory dump starting at 'val' of size 'size'. + +Arguments: + +- val: the starting address (json-int) +- size: the memory size, in bytes (json-int) +- filename: file path (json-string) + +Example: + +- { execute: pmemload, + arguments: { val: 10, +size: 100, +filename: /tmp/physical-mem-dump } } +- { return: {} } + +EQMP +{ .name = inject-nmi, .args_type = , .mhandler.cmd_new = qmp_marshal_input_inject_nmi, -- 1.9.1
Re: [Qemu-devel] [PATCH V4 1/1] qmp: add pmemload command
On 04/09/2014 10:54 AM, Baojun Wang wrote: Signed-off-by: Baojun Wang wan...@gmail.com You lost this part of your commit message, which gives more details about the 'why' (the subject line covers the 'what', but it is often the 'why' that is most needed when reviewing a commit later). Your earlier mail had: I found this could be useful to have qemu-softmmu as a cross debugger (launch with -s -S command line option), then if we can have a command to load guest physical memory, we can use cross gdb to do some target debug which gdb cannot do directly. +void qmp_pmemload(int64_t addr, int64_t size, const char *filename, + Error **errp) +{ +FILE *f; +uint32_t l; +uint8_t buf[1024]; + +f = fopen(filename, rb); I still think that fopen()/fread() is wrong, and that you should be using qemu_open()/read() - that way, libvirt could pass in a file descriptor and use qemu's already-existing /dev/fdset/nnn support rather than forcing qemu to open something from the filesystem. +++ b/hmp-commands.hx @@ -809,6 +809,19 @@ save to disk physical memory dump starting at @var{addr} of size @var{size}. ETEXI { +.name = pmemload, +.args_type = val:l,size:i,filename:s, +.params = addr size file, +.help = load from disk physical memory dump starting at 'addr' of size 'size', +.mhandler.cmd = hmp_pmemload, And I still think that you should list filename first, with val and size optional at the end. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v8 1/4] qapi: introduce PreallocMode and a new PreallocMode full.
On 04/09/2014 01:12 AM, Hu Tao wrote: This patch prepares for the subsequent patches. Reviewed-by: Fam Zheng f...@redhat.com Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- block/qcow2.c| 8 qapi-schema.json | 14 ++ 2 files changed, 18 insertions(+), 4 deletions(-) +++ b/qapi-schema.json @@ -4689,3 +4689,17 @@ 'btn' : 'InputBtnEvent', 'rel' : 'InputMoveEvent', 'abs' : 'InputMoveEvent' } } + +## +# @PreallocMode +# +# Preallocation mode of QEMU image file +# +# @off: no preallocation +# @metadata: preallocate only for metadata +# @full: preallocate all data, including metadata +# +# Since 2.0 You've missed 2.0. This should be 2.1. With that fixed, Reviewed-by: Eric Blake ebl...@redhat.com -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
[Qemu-devel] [PATCH] Re: snapshot: fixed bdrv_get_full_backing_filename can not get correct full_backing_filename
Thanks Eric's analysis and review firstly. As not so clear to the application context, so the first patch can not cover symlink scenarios. In this patch, will check the backing_filename is a symlink or not firstly, then return the full(absolute) path via realpath. If this patch has something not coverd, please give me more suggestions. Thx. Signed-off-by: Jun Li junm...@gmail.com --- block.c | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/block.c b/block.c index 990a754..8566b75 100644 --- a/block.c +++ b/block.c @@ -304,10 +304,26 @@ void path_combine(char *dest, int dest_size, void bdrv_get_full_backing_filename(BlockDriverState *bs, char *dest, size_t sz) { +struct stat sb; +char *linkname; + if (bs-backing_file[0] == '\0' || path_has_protocol(bs-backing_file)) { pstrcpy(dest, sz, bs-backing_file); } else { -path_combine(dest, sz, bs-filename, bs-backing_file); +if (lstat(bs-backing_file, sb) == -1) { +perror(lstat); +exit(EXIT_FAILURE); +} + +/* Check linkname is a link or not */ +if (S_ISLNK(sb.st_mode)) { +linkname = malloc(sb.st_size + 1); +readlink(bs-backing_file, linkname, sb.st_size + 1); +linkname[sb.st_size] = '\0'; +realpath(linkname, dest); +} else { +realpath(bs-backing_file, dest); +} } } -- 1.8.3.1
Re: [Qemu-devel] [PATCH v3 21/26] tcg-aarch64: Introduce tcg_out_insn_3507
On 04/09/2014 05:54 AM, Claudio Fontana wrote: During testing I found this patch causes a regression for big endian targets (sparc). Can you take a look? I think it might be related to the extended form of the REV instruction needing an additional 0x400. See below. You're right. It's disassembling as rev32 x0, x0. Bizzarely, sparc32 bios was working. I guess it only uses 64-bit load/store for ldd/std for register pair save and restore. And since we rev'ed them the same way for load/store, it worked. Uploading a full mipseb system image to test now. r~
Re: [Qemu-devel] Error propagation in generated visitors and command marshallers
* Markus Armbruster (arm...@redhat.com) wrote: I stumbled over this while trying to purge error_is_set() from the code. Here's how we commonly use the Error API: Error *err = NULL; foo(arg, err) if (err) { goto out; } bar(arg, err) if (err) { goto out; } This ensures that err is null on entry, both for foo() and for bar(). Many functions rely on that, like this: void foo(ArgType arg, Error **errp) { if (frobnicate(arg) 0) { error_setg(errp, Can't frobnicate); // This asserts errp != NULL } } Here's how some of our visitor code uses the Error API (for real code, check out generated qmp-marshal.c): Error *err = NULL; QmpInputVisitor *mi = qmp_input_visitor_new_strict(QOBJECT(args)); Visitor *v = qmp_input_get_visitor(mi); char *foo = NULL; char *bar = NULL; visit_type_str(v, foo, foo, err); visit_type_str(v, bar, bar, err); if (err) { goto out; } Unlike above, this may pass a non-null errp to the second visit_type_str(), namely when the first one fails. Right, one of the problems is you just have long strings of visit_* calls and adding a check to each one hides what you're actually doing in a sea of checks. The downside is that if one of those visit's fails then you've got no chance of figuring out which one it was. In my BER world I've got some macros along the lines of: #define LOCAL_ERR_REPORT(fallout) \ if (local_err) { \ fallout \ } and at least then I can do things like: visit_type_str(v, foo, foo, err); LOCAL_ERR_REPORT( goto out; ) visit_type_str(v, bar, bar, err); LOCAL_ERR_REPORT( goto out; ) which while not nice, means that you can actually follow the code, and I can also add a printf to the macro to record the function/line so that when one of them fails I can see which visit was the cause of the problem (something that's currently very difficult). The visitor functions guard against that, like this: void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp) { if (!error_is_set(errp)) { v-type_str(v, obj, name, errp); } } As discussed before, error_is_set() is almost almost wrong, fragile or unclean. What if errp is null? Then we fail to stop visiting after an error. The function could be improved like this: void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp) { assert(errp); if (!*errp) { v-type_str(v, obj, name, errp); } } But: is it a good idea to have both patterns in the code? Should we perhaps use the common pattern for visiting, too? Like this: visit_type_str(v, foo, foo, err); if (err) { goto out; } visit_type_str(v, bar, bar, err); if (err) { goto out; } Then we can assume *errp is clear on function entry, like this: void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp) { v-type_str(v, obj, name, errp); } Should execute roughly the same number of conditional branches. Tedious repetition of if (err) goto out in the caller, but that's what we do elsewhere, and unlike elsewhere, these one's are generated. The other problem is I had a tendency to typo some of the cases to if (*err) and it's quite hard to spot and you wonder what's going on. Dave -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[Qemu-devel] qemu projects
Hi All, I am new to qemu-kvm.But have good knowledge about virtulization and cloud computing.I would like to do some projects in qemu-kvm.Please let me know how can I start. Please suggest me some small projects which I can take up and finish in 2-3 weeks. Regards, Pradeep
[Qemu-devel] TCG x86-64 'bt' insn
Hi guys, I have to revive a rather old thread [1,2]. A quick summary of the issue: TCG emulates the BT (bit-test) instruction using a SAR to re-compute eflags. While SAR affects all flags, BT only affects the C-flag and leaves the Z-flag unaffected/unchanged [3]. According to the x86 manual, the current behavior is correct (although not perfect performance-wise), but NOT correct according to newer Intel CPU instruction sets (such as x86-64). What has caused some confusion in the past it seems is that AMD's manual mentions that all flags other than the C-flag are undefined (just like in Intel's old x86 manual). A patch has been suggested (although not applied) before [2] and I was trying to update and try the patch to the QEMU 2.0 RC2, but it seems not to be working (causes a BSOD on Windows guests). BTW: I have a program that seg-faults on Windows clients (32 as well as 64 bit guest OSes) in vanilla QEMU 2.0 RC2 (just like in previous versions), so this is not just of theoretical importance :) Can somebody say if there is something that I'm doing obviously wrong? *NON-FUNCTIONAL PATCH!* --- ../orig/qemu-2.0.0-rc2/target-i386/translate.c 2014-04-08 12:38:58.0 -0700 +++ target-i386/translate.c 2014-04-09 10:08:43.252084947 -0700 @@ -6710,8 +6710,14 @@ tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1 (3 + ot)) - 1); switch(op) { case 0: -tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]); -tcg_gen_movi_tl(cpu_cc_dst, 0); +/* whatever the last CC-op is - recompute now so we can OR-in + * updated results */ +gen_update_cc_op(s); +gen_compute_eflags(s); +tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); +tcg_gen_andi_tl(cpu_tmp4, cpu_tmp4, CC_C); +tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4); +set_cc_op(s, CC_OP_EFLAGS); break; case 1: tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); @@ -6734,8 +6740,8 @@ tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0); break; } -set_cc_op(s, CC_OP_SARB + ot); if (op != 0) { +set_cc_op(s, CC_OP_SARB + ot); if (mod != 3) { gen_op_st_v(s, ot, cpu_T[0], cpu_A0); } else { Thanks, I'd really appreciate any input you can provide. -Clemens [1] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00442.html [2] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00764.html [3] page 148 http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf -- Clemens Kolbitsch Security Researcher kolbit...@lastline.com Mobile +1 (206) 356-7745 Land +1 (805) 456-7076 Lastline, Inc. 6950 Hollister Avenue, Suite 101 Goleta, CA 93117 www.lastline.com
[Qemu-devel] [PATCH V3 4/5] machine: replace QEMUMachine by MachineClass in accelerator configuration
This minimizes QEMUMachine usage, as part of machine QOM-ification. Signed-off-by: Marcel Apfelbaum marce...@redhat.com --- include/hw/xen/xen.h| 2 +- include/qemu/typedefs.h | 1 + include/sysemu/kvm.h| 2 +- include/sysemu/qtest.h | 2 +- kvm-all.c | 6 +++--- kvm-stub.c | 2 +- qtest.c | 2 +- vl.c| 10 +- xen-all.c | 2 +- xen-stub.c | 2 +- 10 files changed, 16 insertions(+), 15 deletions(-) diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h index 9d549fc..85fda3d 100644 --- a/include/hw/xen/xen.h +++ b/include/hw/xen/xen.h @@ -36,7 +36,7 @@ void xen_cmos_set_s3_resume(void *opaque, int irq, int level); qemu_irq *xen_interrupt_controller_init(void); -int xen_init(QEMUMachine *machine); +int xen_init(MachineClass *mc); int xen_hvm_init(MemoryRegion **ram_memory); void xenstore_store_pv_console_info(int i, struct CharDriverState *chr); diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h index bf8daac..86bab12 100644 --- a/include/qemu/typedefs.h +++ b/include/qemu/typedefs.h @@ -31,6 +31,7 @@ typedef struct MemoryListener MemoryListener; typedef struct MemoryMappingList MemoryMappingList; typedef struct QEMUMachine QEMUMachine; +typedef struct MachineClass MachineClass; typedef struct NICInfo NICInfo; typedef struct HCIInfo HCIInfo; typedef struct AudioState AudioState; diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index 0bee1e8..518655c 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -152,7 +152,7 @@ extern KVMState *kvm_state; /* external API */ -int kvm_init(QEMUMachine *machine); +int kvm_init(MachineClass *mc); int kvm_has_sync_mmu(void); int kvm_has_vcpu_events(void); diff --git a/include/sysemu/qtest.h b/include/sysemu/qtest.h index 224131f..95c9ade 100644 --- a/include/sysemu/qtest.h +++ b/include/sysemu/qtest.h @@ -26,7 +26,7 @@ static inline bool qtest_enabled(void) bool qtest_driver(void); -int qtest_init_accel(QEMUMachine *machine); +int qtest_init_accel(MachineClass *mc); void qtest_init(const char *qtest_chrdev, const char *qtest_log, Error **errp); static inline int qtest_available(void) diff --git a/kvm-all.c b/kvm-all.c index cd4111d..1bb979d 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -1341,7 +1341,7 @@ static int kvm_max_vcpus(KVMState *s) return (ret) ? ret : kvm_recommended_vcpus(s); } -int kvm_init(QEMUMachine *machine) +int kvm_init(MachineClass *mc) { static const char upgrade_note[] = Please upgrade to at least kernel 2.6.29 or recent kvm-kmod\n @@ -1433,8 +1433,8 @@ int kvm_init(QEMUMachine *machine) } kvm_type = qemu_opt_get(qemu_get_machine_opts(), kvm-type); -if (machine-kvm_type) { -type = machine-kvm_type(kvm_type); +if (mc-kvm_type) { +type = mc-kvm_type(kvm_type); } else if (kvm_type) { fprintf(stderr, Invalid argument kvm-type=%s\n, kvm_type); goto err; diff --git a/kvm-stub.c b/kvm-stub.c index ccdba62..8acda86 100644 --- a/kvm-stub.c +++ b/kvm-stub.c @@ -34,7 +34,7 @@ int kvm_init_vcpu(CPUState *cpu) return -ENOSYS; } -int kvm_init(QEMUMachine *machine) +int kvm_init(MachineClass *mc) { return -ENOSYS; } diff --git a/qtest.c b/qtest.c index 0ac9f42..2aba20d 100644 --- a/qtest.c +++ b/qtest.c @@ -500,7 +500,7 @@ static void qtest_event(void *opaque, int event) } } -int qtest_init_accel(QEMUMachine *machine) +int qtest_init_accel(MachineClass *mc) { configure_icount(0); diff --git a/vl.c b/vl.c index dc8d515..5119937 100644 --- a/vl.c +++ b/vl.c @@ -2720,7 +2720,7 @@ static MachineClass *machine_parse(const char *name) exit(!name || !is_help_option(name)); } -static int tcg_init(QEMUMachine *machine) +static int tcg_init(MachineClass *mc) { tcg_exec_init(tcg_tb_size * 1024 * 1024); return 0; @@ -2730,7 +2730,7 @@ static struct { const char *opt_name; const char *name; int (*available)(void); -int (*init)(QEMUMachine *); +int (*init)(MachineClass *mc); bool *allowed; } accel_list[] = { { tcg, tcg, tcg_available, tcg_init, tcg_allowed }, @@ -2739,7 +2739,7 @@ static struct { { qtest, QTest, qtest_available, qtest_init_accel, qtest_allowed }, }; -static int configure_accelerator(QEMUMachine *machine) +static int configure_accelerator(MachineClass *mc) { const char *p; char buf[10]; @@ -2766,7 +2766,7 @@ static int configure_accelerator(QEMUMachine *machine) continue; } *(accel_list[i].allowed) = true; -ret = accel_list[i].init(machine); +ret = accel_list[i].init(mc); if (ret 0) { init_failed = true; fprintf(stderr, failed to initialize %s: %s\n, @@ -4187,7 +4187,7 @@ int main(int argc, char **argv, char **envp) exit(0); } -
[Qemu-devel] [PATCH V3 0/5] remove QEMUMachine indirection from MachineClass
Cc: Andreas Färber afaer...@suse.de V2 - V3: - Addressed Andreas's comments: - Dropped QEMUMachineInitArgs's 'next' obsoleted field in a separate patch - Revision the separation into patches: - Started using MachineClass for .machine early (3/5). - Merged hw/ppc changes with QEMUMachine indirection removal - Ensured that git bisect is not affected - Rebased to master. V1 - V2: - Addressed Paolo's comments: - replaced commas by semicolons on patch 4/5. - Rebased to master. This is a continuation of 'QEMU Machine as QOM object' effort. The scope of this series is to allow machine QOM-ification of all machines gradually, by removing the need for QEMUMachine registration through vl.c . Now we will have 2 paths: 1. Non QOM-ified machines will be converted to QOM on the fly in vl.c by qemu machine registration. 2. QOM-ified machines will behave as regular QOM classes setting MachineClass fields in class_init. - Patch 4/5 demonstrates this. Next steps: - Replace QemuOpts queries by MachineState fields. - Follow Paolo's suggestions to get rid of QEMUMachineInitArgs. Comments are appreciated, Thanks, Marcel Marcel Apfelbaum (5): hw/boards.h: remove obsoleted field from QEMUMachine vl.c: copy QEMUMachine's fields to MachineClass vl.c: Replace QEMUMachine with MachineClass in QEMUMachineInitArgs machine: replace QEMUMachine by MachineClass in accelerator configuration machine: remove QEMUMachine indirection from MachineClass device-hotplug.c| 2 +- hw/ppc/spapr.c | 26 +-- include/hw/boards.h | 30 +++-- include/hw/xen/xen.h| 2 +- include/qemu/typedefs.h | 1 + include/sysemu/kvm.h| 2 +- include/sysemu/qtest.h | 2 +- kvm-all.c | 6 +-- kvm-stub.c | 2 +- qmp.c | 4 +- qtest.c | 2 +- vl.c| 114 +++- xen-all.c | 2 +- xen-stub.c | 2 +- 14 files changed, 116 insertions(+), 81 deletions(-) -- 1.8.3.1
[Qemu-devel] [PATCH V3 1/5] hw/boards.h: remove obsoleted field from QEMUMachine
This field shouldn't be used any more since we adopted the QOM way of iterating over the types. The commit that obsoleted it is: commit 261747f176f6f2d88f8268aaebfdd1a1afe887e2 vl: Use MachineClass instead of global QEMUMachine list The machine registration flow is refactored to use the QOM functionality. Instead of linking the machines into a list, each machine has a type and the types can be traversed in the QOM way. Signed-off-by: Marcel Apfelbaum marce...@redhat.com --- include/hw/boards.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/hw/boards.h b/include/hw/boards.h index dd2c70d..871 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -46,7 +46,6 @@ struct QEMUMachine { const char *default_machine_opts; const char *default_boot_order; GlobalProperty *compat_props; -struct QEMUMachine *next; const char *hw_version; }; -- 1.8.3.1
[Qemu-devel] [PATCH V3 3/5] vl.c: Replace QEMUMachine with MachineClass in QEMUMachineInitArgs
QEMUMachine's fields are already in MachineClass. We can safely make the switch because we copy them in machine_class_init. Signed-off-by: Marcel Apfelbaum marce...@redhat.com --- include/hw/boards.h | 5 +++-- vl.c| 2 +- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/include/hw/boards.h b/include/hw/boards.h index 138346d..51211a6 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -7,8 +7,10 @@ #include hw/qdev.h #include qom/object.h +typedef struct MachineClass MachineClass; + typedef struct QEMUMachineInitArgs { -const QEMUMachine *machine; +const MachineClass *machine; ram_addr_t ram_size; const char *boot_order; const char *kernel_filename; @@ -62,7 +64,6 @@ int qemu_register_machine(QEMUMachine *m); OBJECT_CLASS_CHECK(MachineClass, (klass), TYPE_MACHINE) typedef struct MachineState MachineState; -typedef struct MachineClass MachineClass; MachineClass *find_default_machine(void); extern MachineState *current_machine; diff --git a/vl.c b/vl.c index 0b0202d..dc8d515 100644 --- a/vl.c +++ b/vl.c @@ -4394,7 +4394,7 @@ int main(int argc, char **argv, char **envp) qdev_machine_init(); -QEMUMachineInitArgs args = { .machine = machine, +QEMUMachineInitArgs args = { .machine = machine_class, .ram_size = ram_size, .boot_order = boot_order, .kernel_filename = kernel_filename, -- 1.8.3.1
[Qemu-devel] [PATCH V3 2/5] vl.c: copy QEMUMachine's fields to MachineClass
In order to eliminate the QEMUMachine indirection, add its fields directly to MachineClass. Do not remove yet qemu_machine field because it is still in use by sparpr. Signed-off-by: Marcel Apfelbaum marce...@redhat.com --- include/hw/boards.h | 23 +++ vl.c| 23 +++ 2 files changed, 46 insertions(+) diff --git a/include/hw/boards.h b/include/hw/boards.h index 871..138346d 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -77,6 +77,29 @@ struct MachineClass { /* public */ QEMUMachine *qemu_machine; +const char *name; +const char *alias; +const char *desc; + +void (*init)(QEMUMachineInitArgs *args); +void (*reset)(void); +void (*hot_add_cpu)(const int64_t id, Error **errp); +int (*kvm_type)(const char *arg); + +BlockInterfaceType block_default_type; +int max_cpus; +unsigned int no_serial:1, +no_parallel:1, +use_virtcon:1, +use_sclp:1, +no_floppy:1, +no_cdrom:1, +no_sdcard:1; +int is_default; +const char *default_machine_opts; +const char *default_boot_order; +GlobalProperty *compat_props; +const char *hw_version; }; /** diff --git a/vl.c b/vl.c index 9975e5a..0b0202d 100644 --- a/vl.c +++ b/vl.c @@ -1583,8 +1583,31 @@ MachineState *current_machine; static void machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); +QEMUMachine *qm = data; mc-qemu_machine = data; + +mc-name = qm-name; +mc-alias = qm-alias; +mc-desc = qm-desc; +mc-init = qm-init; +mc-reset = qm-reset; +mc-hot_add_cpu = qm-hot_add_cpu; +mc-kvm_type = qm-kvm_type; +mc-block_default_type = qm-block_default_type; +mc-max_cpus = qm-max_cpus; +mc-no_serial = qm-no_serial; +mc-no_parallel = qm-no_parallel; +mc-use_virtcon = qm-use_virtcon; +mc-use_sclp = qm-use_sclp; +mc-no_floppy = qm-no_floppy; +mc-no_cdrom = qm-no_cdrom; +mc-no_sdcard = qm-no_sdcard; +mc-is_default = qm-is_default; +mc-default_machine_opts = qm-default_machine_opts; +mc-default_boot_order = qm-default_boot_order; +mc-compat_props = qm-compat_props; +mc-hw_version = qm-hw_version; } int qemu_register_machine(QEMUMachine *m) -- 1.8.3.1
[Qemu-devel] [PATCH V3 5/5] machine: remove QEMUMachine indirection from MachineClass
No need to go through qemu_machine field. Use MachineClass fields directly. Signed-off-by: Marcel Apfelbaum marce...@redhat.com --- device-hotplug.c| 2 +- hw/ppc/spapr.c | 26 -- include/hw/boards.h | 1 - qmp.c | 4 +-- vl.c| 79 - 5 files changed, 50 insertions(+), 62 deletions(-) diff --git a/device-hotplug.c b/device-hotplug.c index ebfa6b1..eecb08e 100644 --- a/device-hotplug.c +++ b/device-hotplug.c @@ -40,7 +40,7 @@ DriveInfo *add_init_drive(const char *optstr) return NULL; mc = MACHINE_GET_CLASS(current_machine); -dinfo = drive_init(opts, mc-qemu_machine-block_default_type); +dinfo = drive_init(opts, mc-block_default_type); if (!dinfo) { qemu_opts_del(opts); return NULL; diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index a11e121..b4ce950 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -1419,19 +1419,6 @@ static int spapr_kvm_type(const char *vm_type) exit(1); } -static QEMUMachine spapr_machine = { -.name = pseries, -.desc = pSeries Logical Partition (PAPR compliant), -.is_default = 1, -.init = ppc_spapr_init, -.reset = ppc_spapr_reset, -.block_default_type = IF_SCSI, -.max_cpus = MAX_CPUS, -.no_parallel = 1, -.default_boot_order = NULL, -.kvm_type = spapr_kvm_type, -}; - /* * Implementation of an interface to adjust firmware patch * for the bootindex property handling. @@ -1494,7 +1481,17 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data) MachineClass *mc = MACHINE_CLASS(oc); FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(oc); -mc-qemu_machine = data; +mc-name = pseries; +mc-desc = pSeries Logical Partition (PAPR compliant); +mc-is_default = 1; +mc-init = ppc_spapr_init; +mc-reset = ppc_spapr_reset; +mc-block_default_type = IF_SCSI; +mc-max_cpus = MAX_CPUS; +mc-no_parallel = 1; +mc-default_boot_order = NULL; +mc-kvm_type = spapr_kvm_type; + fwc-get_dev_path = spapr_get_fw_dev_path; } @@ -1502,7 +1499,6 @@ static const TypeInfo spapr_machine_info = { .name = TYPE_SPAPR_MACHINE, .parent= TYPE_MACHINE, .class_init= spapr_machine_class_init, -.class_data= spapr_machine, .interfaces = (InterfaceInfo[]) { { TYPE_FW_PATH_PROVIDER }, { } diff --git a/include/hw/boards.h b/include/hw/boards.h index 51211a6..66ee98a 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -77,7 +77,6 @@ struct MachineClass { ObjectClass parent_class; /* public */ -QEMUMachine *qemu_machine; const char *name; const char *alias; const char *desc; diff --git a/qmp.c b/qmp.c index 87a28f7..26eb589 100644 --- a/qmp.c +++ b/qmp.c @@ -117,8 +117,8 @@ void qmp_cpu_add(int64_t id, Error **errp) MachineClass *mc; mc = MACHINE_GET_CLASS(current_machine); -if (mc-qemu_machine-hot_add_cpu) { -mc-qemu_machine-hot_add_cpu(id, errp); +if (mc-hot_add_cpu) { +mc-hot_add_cpu(id, errp); } else { error_setg(errp, Not supported); } diff --git a/vl.c b/vl.c index 5119937..190bd5d 100644 --- a/vl.c +++ b/vl.c @@ -1585,8 +1585,6 @@ static void machine_class_init(ObjectClass *oc, void *data) MachineClass *mc = MACHINE_CLASS(oc); QEMUMachine *qm = data; -mc-qemu_machine = data; - mc-name = qm-name; mc-alias = qm-alias; mc-desc = qm-desc; @@ -1634,12 +1632,12 @@ static MachineClass *find_machine(const char *name) for (el = machines; el; el = el-next) { MachineClass *temp = el-data; -if (!strcmp(temp-qemu_machine-name, name)) { +if (!strcmp(temp-name, name)) { mc = temp; break; } -if (temp-qemu_machine-alias -!strcmp(temp-qemu_machine-alias, name)) { +if (temp-alias +!strcmp(temp-alias, name)) { mc = temp; break; } @@ -1657,7 +1655,7 @@ MachineClass *find_default_machine(void) for (el = machines; el; el = el-next) { MachineClass *temp = el-data; -if (temp-qemu_machine-is_default) { +if (temp-is_default) { mc = temp; break; } @@ -1671,27 +1669,25 @@ MachineInfoList *qmp_query_machines(Error **errp) { GSList *el, *machines = object_class_get_list(TYPE_MACHINE, false); MachineInfoList *mach_list = NULL; -QEMUMachine *m; for (el = machines; el; el = el-next) { MachineClass *mc = el-data; MachineInfoList *entry; MachineInfo *info; -m = mc-qemu_machine; info = g_malloc0(sizeof(*info)); -if (m-is_default) { +if (mc-is_default) { info-has_is_default = true; info-is_default = true; } -if (m-alias) { +if (mc-alias) {
[Qemu-devel] [PATCH V5 1/1] qmp: add pmemload command
I found this could be useful to have qemu-softmmu as a cross debugger (launch with -s -S command line option), then if we can have a command to load guest physical memory, we can use cross gdb to do some target debug which gdb cannot do directly. Many thanks to Eric Blake for review the patch and suggestions. Signed-off-by: Baojun Wang wan...@gmail.com --- cpus.c | 29 + hmp-commands.hx | 13 + hmp.c| 11 +++ hmp.h| 1 + qapi-schema.json | 18 ++ qmp-commands.hx | 27 +++ 6 files changed, 99 insertions(+) diff --git a/cpus.c b/cpus.c index 1104d61..230664f 100644 --- a/cpus.c +++ b/cpus.c @@ -1467,6 +1467,35 @@ exit: fclose(f); } +void qmp_pmemload(const char* filename, int64_t addr, int64_t size, + Error **errp) +{ +int fd; +ssize_t l; +uint8_t buf[4096]; + +fd = qemu_open(filename, O_RDONLY | O_BINARY); +if (fd 0) { +error_setg_file_open(errp, errno, filename); +return; +} + +while (size != 0) { +l = read(fd, buf, sizeof(buf)); +if (l != sizeof(buf)) { +error_set(errp, QERR_IO_ERROR); +goto exit; +} +if (l size) +l = size; +cpu_physical_memory_rw(addr, buf, l, 1); +addr += l; +size -= l; +} + +exit: +qemu_close(fd); +} void qmp_inject_nmi(Error **errp) { #if defined(TARGET_I386) diff --git a/hmp-commands.hx b/hmp-commands.hx index f3fc514..1eae6e3 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -809,6 +809,19 @@ save to disk physical memory dump starting at @var{addr} of size @var{size}. ETEXI { +.name = pmemload, +.args_type = filename:s,val:l,size:i, +.params = file addr size, +.help = load from disk physical memory dump starting at 'addr' of size 'size', +.mhandler.cmd = hmp_pmemload, +}, + +STEXI +@item pmemload @var{file} @var{addr} @var{size} +@findex pmemload +load from disk physical memory dump starting at @var{addr} of size @var{size}. +ETEXI +{ .name = boot_set, .args_type = bootdevice:s, .params = bootdevice, diff --git a/hmp.c b/hmp.c index 2f279c4..745c087 100644 --- a/hmp.c +++ b/hmp.c @@ -767,6 +767,17 @@ void hmp_pmemsave(Monitor *mon, const QDict *qdict) hmp_handle_error(mon, errp); } +void hmp_pmemload(Monitor *mon, const QDict *qdict) +{ +uint32_t size = qdict_get_int(qdict, size); +const char *filename = qdict_get_str(qdict, filename); +uint64_t addr = qdict_get_int(qdict, val); +Error *errp = NULL; + +qmp_pmemload(filename, addr, size, errp); +hmp_handle_error(mon, errp); +} + void hmp_ringbuf_write(Monitor *mon, const QDict *qdict) { const char *chardev = qdict_get_str(qdict, device); diff --git a/hmp.h b/hmp.h index ed58f0e..f5f2a16 100644 --- a/hmp.h +++ b/hmp.h @@ -44,6 +44,7 @@ void hmp_system_powerdown(Monitor *mon, const QDict *qdict); void hmp_cpu(Monitor *mon, const QDict *qdict); void hmp_memsave(Monitor *mon, const QDict *qdict); void hmp_pmemsave(Monitor *mon, const QDict *qdict); +void hmp_pmemload(Monitor *mon, const QDict *qdict); void hmp_ringbuf_write(Monitor *mon, const QDict *qdict); void hmp_ringbuf_read(Monitor *mon, const QDict *qdict); void hmp_cont(Monitor *mon, const QDict *qdict); diff --git a/qapi-schema.json b/qapi-schema.json index 391356f..7198242 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1708,6 +1708,24 @@ 'data': {'val': 'int', 'size': 'int', 'filename': 'str'} } ## +# @pmemload: +# +# Load a portion of guest physical memory from a file. +# +# @filename: the file to load the memory from as binary data +# +# @val: the physical address of the guest to start from +# +# @size: the size of memory region to load +# +# Returns: Nothing on success +# +# Since: 2.1 +## +{ 'command': 'pmemload', + 'data': {'filename': 'str', 'val': 'int', 'size': 'int'} } + +## # @cont: # # Resume guest VCPU execution. diff --git a/qmp-commands.hx b/qmp-commands.hx index ed3ab92..2831038 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -468,6 +468,33 @@ Example: EQMP { +.name = pmemload, +.args_type = filename:s,val:l,size:i, +.mhandler.cmd_new = qmp_marshal_input_pmemload, +}, + +SQMP +pmemload + + +load from disk physical memory dump starting at 'val' of size 'size'. + +Arguments: + +- filename: file path (json-string) +- val: the starting address (json-int) +- size: the memory size, in bytes (json-int) + +Example: + +- { execute: pmemload, + arguments: { filename: /tmp/physical-mem-dump, +val: 10, +size: 100 } } +- { return: {} } + +EQMP +{ .name = inject-nmi, .args_type = , .mhandler.cmd_new = qmp_marshal_input_inject_nmi, -- 1.9.1
Re: [Qemu-devel] TCG x86-64 'bt' insn
Hi, quick follow-up. *As always* you find a problem right after asking for help :). The updated patch does not cause BSOD on Windows guests, but neither does it fix the actual problem (of the program seg-faulting) I would really appreciate any feedback on the proposed patch below - the difference to the previous patch is that we clear out undefined flags and only keep the Z-flag (and update the C-flag) --- ../orig/qemu-2.0.0-rc2/target-i386/translate.c 2014-04-08 12:38:58.0 -0700 +++ target-i386/translate.c 2014-04-09 10:48:25.264200230 -0700 @@ -6710,8 +6710,15 @@ tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1 (3 + ot)) - 1); switch(op) { case 0: -tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]); -tcg_gen_movi_tl(cpu_cc_dst, 0); +/* whatever the last CC-op is - recompute now so we can OR-in + * updated results */ +gen_update_cc_op(s); // ? needed ? +gen_compute_eflags(s); +tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); +tcg_gen_andi_tl(cpu_tmp4, cpu_tmp4, CC_C); +tcg_gen_andi_tl(cpu_cc_src, cpu_cc_src, CC_Z); +tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4); +set_cc_op(s, CC_OP_EFLAGS); break; case 1: tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); @@ -6734,8 +6741,8 @@ tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0); break; } -set_cc_op(s, CC_OP_SARB + ot); if (op != 0) { +set_cc_op(s, CC_OP_SARB + ot); if (mod != 3) { gen_op_st_v(s, ot, cpu_T[0], cpu_A0); } else { Last another general question: Does TCG make any assumptions that undefined flags are set to 0? I see that most flag-computations set undefined flags to 0 - is this just a convention or really a requirement? Thanks guys! -Clemens On Wed, Apr 9, 2014 at 10:33 AM, Clemens Kolbitsch kolbit...@lastline.comwrote: Hi guys, I have to revive a rather old thread [1,2]. A quick summary of the issue: TCG emulates the BT (bit-test) instruction using a SAR to re-compute eflags. While SAR affects all flags, BT only affects the C-flag and leaves the Z-flag unaffected/unchanged [3]. According to the x86 manual, the current behavior is correct (although not perfect performance-wise), but NOT correct according to newer Intel CPU instruction sets (such as x86-64). What has caused some confusion in the past it seems is that AMD's manual mentions that all flags other than the C-flag are undefined (just like in Intel's old x86 manual). A patch has been suggested (although not applied) before [2] and I was trying to update and try the patch to the QEMU 2.0 RC2, but it seems not to be working (causes a BSOD on Windows guests). BTW: I have a program that seg-faults on Windows clients (32 as well as 64 bit guest OSes) in vanilla QEMU 2.0 RC2 (just like in previous versions), so this is not just of theoretical importance :) Can somebody say if there is something that I'm doing obviously wrong? *NON-FUNCTIONAL PATCH!* --- ../orig/qemu-2.0.0-rc2/target-i386/translate.c 2014-04-08 12:38:58.0 -0700 +++ target-i386/translate.c 2014-04-09 10:08:43.252084947 -0700 @@ -6710,8 +6710,14 @@ tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1 (3 + ot)) - 1); switch(op) { case 0: -tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]); -tcg_gen_movi_tl(cpu_cc_dst, 0); +/* whatever the last CC-op is - recompute now so we can OR-in + * updated results */ +gen_update_cc_op(s); +gen_compute_eflags(s); +tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); +tcg_gen_andi_tl(cpu_tmp4, cpu_tmp4, CC_C); +tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4); +set_cc_op(s, CC_OP_EFLAGS); break; case 1: tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); @@ -6734,8 +6740,8 @@ tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0); break; } -set_cc_op(s, CC_OP_SARB + ot); if (op != 0) { +set_cc_op(s, CC_OP_SARB + ot); if (mod != 3) { gen_op_st_v(s, ot, cpu_T[0], cpu_A0); } else { Thanks, I'd really appreciate any input you can provide. -Clemens [1] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00442.html [2] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00764.html [3] page 148 http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf
Re: [Qemu-devel] [PATCH v17 08/14] block: Support dropping active in bdrv_drop_intermediate
On Tue, Apr 08, 2014 at 05:07:38PM +0800, Fam Zheng wrote: On Tue, 04/08 10:15, Markus Armbruster wrote: Jeff Cody jc...@redhat.com writes: On Mon, Mar 10, 2014 at 03:26:04PM +0800, Fam Zheng wrote: Dropping intermediate could be useful both for commit and stream, and BDS refcnt plus bdrv_swap could do most of the job nicely. It also needs to work with op blockers. Signed-off-by: Fam Zheng f...@redhat.com --- block.c| 139 - block/commit.c | 2 +- 2 files changed, 70 insertions(+), 71 deletions(-) diff --git a/block.c b/block.c index 05f7766..0af7c62 100644 --- a/block.c +++ b/block.c @@ -2503,115 +2503,114 @@ BlockDriverState *bdrv_find_overlay(BlockDriverState *active, return overlay; } -typedef struct BlkIntermediateStates { -BlockDriverState *bs; -QSIMPLEQ_ENTRY(BlkIntermediateStates) entry; -} BlkIntermediateStates; - - /* - * Drops images above 'base' up to and including 'top', and sets the image - * above 'top' to have base as its backing file. + * Drops images above 'base' up to and including 'top', and sets new 'base' as + * backing_hd of top's overlay (the image orignally has 'top' as backing file). + * top's overlay may be NULL if 'top' is active, no such update needed. + * Requires that the top's overlay to 'top' is opened r/w. + * + * 1) This will convert the following chain: + * + * ... - base - ... - top - overlay -... - active * - * Requires that the overlay to 'top' is opened r/w, so that the backing file - * information in 'bs' can be properly updated. + * to + * + * ... - base - overlay - active + * + * 2) It is allowed for bottom==base, in which case it converts: * - * E.g., this will convert the following chain: - * bottom - base - intermediate - top - active + * base - ... - top - overlay - ... - active * * to * - * bottom - base - active + * base - overlay - active * - * It is allowed for bottom==base, in which case it converts: + * 2) It also allows active==top, in which case it converts: * - * base - intermediate - top - active + * ... - base - ... - top (active) * * to * - * base - active + * ... - base == active == top + * + * i.e. only base and lower remains: *top == *base when return. + * + * 3) If base==NULL, it will drop all the BDS below overlay and set its + * backing_hd to NULL. I.e.: * - * Error conditions: - * if active == top, that is considered an error + * base(NULL) - ... - overlay - ... - active + * + * to + * + * overlay - ... - active * */ int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top, BlockDriverState *base) { -BlockDriverState *intermediate; -BlockDriverState *base_bs = NULL; -BlockDriverState *new_top_bs = NULL; -BlkIntermediateStates *intermediate_state, *next; -int ret = -EIO; - -QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete; -QSIMPLEQ_INIT(states_to_delete); +BlockDriverState *drop_start, *overlay, *bs; +int ret = -EINVAL; -if (!top-drv || !base-drv) { +assert(active); +assert(top); +/* Verify that top is in backing chain of active */ +bs = active; +while (bs bs != top) { +bs = bs-backing_hd; +} +if (!bs) { goto exit; } +/* Verify that base is in backing chain of top */ +if (base) { +while (bs bs != base) { +bs = bs-backing_hd; +} +if (bs != base) { +goto exit; +} +} -new_top_bs = bdrv_find_overlay(active, top); - -if (new_top_bs == NULL) { -/* we could not find the image above 'top', this is an error */ +if (!top-drv || (base !base-drv)) { goto exit; } - -/* special case of new_top_bs-backing_hd already pointing to base - nothing - * to do, no intermediate images */ -if (new_top_bs-backing_hd == base) { +if (top == base) { +ret = 0; +goto exit; +} else if (top == active) { +assert(base); +drop_start = active-backing_hd; +bdrv_swap(active, base); This will assert in block.c, in bdrv_swap, on the test for anonymity of active. (For testing, I changed the active layer commit in mirror to use bdrv_drop_intermediate()). Jeff, you're right, because bdrv_swap requires first argument to be a new BDS, while we are passing in an top BDS. But what happens if we write bdrv_swap(base, active)? That seems like it could work - I did a quick
Re: [Qemu-devel] [PATCH v17 06/14] block: Add backing_blocker in BlockDriverState
On Mon, Mar 10, 2014 at 03:26:02PM +0800, Fam Zheng wrote: This makes use of op_blocker and blocks all the operations except for commit target, on each BlockDriverState-backing_hd. The asserts for op_blocker in bdrv_swap are removed because with this change, the target of block commit has at least the backing blocker of its child, so the assertion is not true. Callers should do their check. Signed-off-by: Fam Zheng f...@redhat.com --- block.c | 24 block/mirror.c| 1 + include/block/block_int.h | 3 +++ 3 files changed, 24 insertions(+), 4 deletions(-) diff --git a/block.c b/block.c index 64738dc..95247c8 100644 --- a/block.c +++ b/block.c @@ -1050,16 +1050,33 @@ fail: void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd) { +if (bs-backing_hd) { +assert(error_is_set(bs-backing_blocker)); +bdrv_op_unblock_all(bs-backing_hd, bs-backing_blocker); +} else if (backing_hd) { +error_setg(bs-backing_blocker, + device is used as backing hd of '%s', + bs-device_name); +} + bs-backing_hd = backing_hd; if (!backing_hd) { bs-backing_file[0] = '\0'; bs-backing_format[0] = '\0'; +if (error_is_set(bs-backing_blocker)) { +error_free(bs-backing_blocker); +} goto out; } bs-open_flags = ~BDRV_O_NO_BACKING; pstrcpy(bs-backing_file, sizeof(bs-backing_file), backing_hd-filename); pstrcpy(bs-backing_format, sizeof(bs-backing_format), backing_hd-drv ? backing_hd-drv-format_name : ); + +bdrv_op_block_all(bs-backing_hd, bs-backing_blocker); +/* Otherwise we won't be able to commit due to check in bdrv_commit */ +bdrv_op_unblock(bs-backing_hd, BLOCK_OP_TYPE_COMMIT, +bs-backing_blocker); out: bdrv_refresh_limits(bs); } @@ -1699,8 +1716,9 @@ void bdrv_close(BlockDriverState *bs) if (bs-drv) { if (bs-backing_hd) { -bdrv_unref(bs-backing_hd); -bs-backing_hd = NULL; +BlockDriverState *backing_hd = bs-backing_hd; +bdrv_set_backing_hd(bs, NULL); +bdrv_unref(backing_hd); } bs-drv-bdrv_close(bs); g_free(bs-opaque); @@ -1908,7 +1926,6 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old) assert(QLIST_EMPTY(bs_new-dirty_bitmaps)); assert(bs_new-job == NULL); assert(bs_new-dev == NULL); -assert(bdrv_op_blocker_is_empty(bs_new)); assert(bs_new-io_limits_enabled == false); assert(!throttle_have_timer(bs_new-throttle_state)); @@ -1927,7 +1944,6 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old) /* Check a few fields that should remain attached to the device */ assert(bs_new-dev == NULL); assert(bs_new-job == NULL); -assert(bdrv_op_blocker_is_empty(bs_new)); assert(bs_new-io_limits_enabled == false); assert(!throttle_have_timer(bs_new-throttle_state)); diff --git a/block/mirror.c b/block/mirror.c index dd5ee05..6dc84e8 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -487,6 +487,7 @@ immediate_exit: * trigger the unref from the top one */ BlockDriverState *p = s-base-backing_hd; s-base-backing_hd = NULL; +bdrv_op_unblock_all(p, s-base-backing_blocker); bdrv_unref(p); FYI, this is what I changed in my testing, to try out the active layer case in bdrv_drop_intermediate(). Since you'll need to respin anyway, might as well clean this up to use the updated bdrv_drop_intermediate(): --- a/block/mirror.c +++ b/block/mirror.c @@ -490,15 +490,8 @@ immediate_exit: if (bdrv_get_flags(s-target) != bdrv_get_flags(s-common.bs)) { bdrv_reopen(s-target, bdrv_get_flags(s-common.bs), NULL); } -bdrv_swap(s-target, s-common.bs); -if (s-common.driver-job_type == BLOCK_JOB_TYPE_COMMIT) { -/* drop the bs loop chain formed by the swap: break the loop then - * trigger the unref from the top one */ -BlockDriverState *p = s-base-backing_hd; -s-base-backing_hd = NULL; -bdrv_op_unblock_all(p, s-base-backing_blocker); -bdrv_unref(p); -} +bdrv_op_unblock_all(s-common.bs-backing_hd, s-common.bs-backing_blocker); +bdrv_drop_intermediate(s-common.bs, s-common.bs, s-target); } bdrv_unref(s-target); block_job_completed(s-common, ret);
[Qemu-devel] [Bug 1303926] Re: qemu-system-x86_64 crashed with SIGABRT
This bug was fixed in the package qemu - 2.0.0~rc1+dfsg-0ubuntu3 --- qemu (2.0.0~rc1+dfsg-0ubuntu3) trusty; urgency=medium * d/p/ubuntu/kvm_physical_sync_dirty_bitmap-ignore-ENOENT-from-kv.patch don't abort() just because the kernel has no dirty bitmap. (LP: #1303926) -- Serge Hallyn serge.hal...@ubuntu.com Tue, 08 Apr 2014 22:32:00 -0500 ** Changed in: qemu (Ubuntu) Status: Triaged = Fix Released -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1303926 Title: qemu-system-x86_64 crashed with SIGABRT Status in QEMU: New Status in “qemu” package in Ubuntu: Fix Released Bug description: I've been getting this periodically since my upgrade to qemu 2.0 in trusty this morning. ProblemType: Crash DistroRelease: Ubuntu 14.04 Package: qemu-system-x86 2.0.0~rc1+dfsg-0ubuntu1 ProcVersionSignature: Ubuntu 3.13.0-23.45-generic 3.13.8 Uname: Linux 3.13.0-23-generic x86_64 ApportVersion: 2.14.1-0ubuntu1 Architecture: amd64 Date: Mon Apr 7 13:31:53 2014 ExecutablePath: /usr/bin/qemu-system-x86_64 InstallationDate: Installed on 2013-11-26 (131 days ago) InstallationMedia: Ubuntu 13.10 Saucy Salamander - Release amd64 (20131016.1) ProcEnviron: PATH=(custom, no user) Registers: No symbol table is loaded. Use the file command. Signal: 6 SourcePackage: qemu StacktraceTop: Title: qemu-system-x86_64 crashed with SIGABRT UpgradeStatus: Upgraded to trusty on 2014-01-17 (79 days ago) UserGroups: To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1303926/+subscriptions
Re: [Qemu-devel] [PATCH v2] PPC: Clean up DECR implementation
On 4/8/2014 2:58 PM, Alexander Graf wrote: On 04/08/2014 09:56 PM, Tom Musta wrote: On 4/6/2014 3:55 PM, Alexander Graf wrote: snip @@ -806,6 +838,10 @@ clk_setup_cb cpu_ppc_tb_init (CPUPPCState *env, uint32_t freq) tb_env = g_malloc0(sizeof(ppc_tb_t)); env-tb_env = tb_env; tb_env-flags = PPC_DECR_UNDERFLOW_TRIGGERED; +if (env-insns_flags PPC_SEGMENT_64B) { +/* All Book3S 64bit CPUs implement level based DEC logic */ +tb_env-flags |= PPC_DECR_UNDERFLOW_LEVEL; +} /* Create new timer */ tb_env-decr_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, cpu_ppc_decr_cb, cpu); if (0) { Equating Book3S with PPC_SEGMENT_64B is clever ... is it too clever? Especially since the SLB Bridge is in the phased-out category and consequently we should expect future Book3S implementations to not support this instruction category. Maybe it's too clever :). I'm very open to suggestions on how to figure this out otherwise. Or maybe we should just rework the way timers get created and make them be part of the core itself? Alex A somewhat more practical approach than redesigning timer init: The phrasing introduced into Book3S that corresponds to your UNDERFLOW_LEVEL flag has existed at least since ISA 2.03. And 2.03 introduced some new features, like SPE and Altivec. So ... if (env-insns_flags (PPC_SEGMENT_64B | PPC_SPE | PPC_ALTIVEC)) { /* All Book3S 64bit CPUs implement level based DEC logic */ tb_env-flags |= PPC_DECR_UNDERFLOW_LEVEL; } would catch a few more. I'm not sure we get into this code for Book3E machines, but if you are worried about that you could also ensure that insns_flags doesn't have PPC_BOOKE on.
[Qemu-devel] [PATCH 1/2] target-ppc: Fix target_disas
Inspect only bit 16 for the Little Endian test. Correct comment preceding the target_disas() function. Correct grammar in comment for flags processing. Signed-off-by: Tom Musta tommu...@gmail.com Reviewed-by: Peter Maydell peter.mayd...@linaro.org --- V4: Correct grammar definitions of the instructions set == definition of the instruction set disas.c |7 --- 1 files changed, 4 insertions(+), 3 deletions(-) diff --git a/disas.c b/disas.c index 79e6944..1397167 100644 --- a/disas.c +++ b/disas.c @@ -191,7 +191,8 @@ static int print_insn_od_target(bfd_vma pc, disassemble_info *info) values: i386 - 1 means 16 bit code, 2 means 64 bit code arm - bit 0 = thumb, bit 1 = reverse endian, bit 2 = A64 -ppc - nonzero means little endian +ppc - bits 0:15 specify (optionally) the machine instruction set; + bit 16 indicates little endian. other targets - unused */ void target_disas(FILE *out, CPUArchState *env, target_ulong code, @@ -251,11 +252,11 @@ void target_disas(FILE *out, CPUArchState *env, target_ulong code, s.info.mach = bfd_mach_sparc_v9b; #endif #elif defined(TARGET_PPC) -if (flags 16) { +if ((flags 16) 1) { s.info.endian = BFD_ENDIAN_LITTLE; } if (flags 0x) { -/* If we have a precise definitions of the instructions set, use it */ +/* If we have a precise definition of the instruction set, use it. */ s.info.mach = flags 0x; } else { #ifdef TARGET_PPC64 -- 1.7.1
[Qemu-devel] [PATCH 2/2] monitor: QEMU Monitor Instruction Disassembly Incorrect for PowerPC LE Mode
The monitor support for disassembling instructions does not honor the MSR[LE] bit for PowerPC processors. This change enhances the monitor_disas() routine by supporting a flag bit for Little Endian mode. Bit 16 is used since that bit was used in the analagous guest disassembly routine target_disas(). Also, to be consistent with target_disas(), the disassembler bfd_mach field can be passed in the flags argument. Reported-by: Anton Blanchard an...@samba.org Signed-off-by: Tom Musta tommu...@gmail.com Reviewed-by: Peter Maydell peter.mayd...@linaro.org --- V2: Look specifically at bit 16 for LE. Support machine configuration in flags. V3: Changed documentation of 'flags' argument to simply refer to the target_disas description (per Peter Maydell's review). V4: Corrections to comments. The bug can be easily observed by dumping the first few instructions of an interrupt vector (0x300 is the Data Storage Interrupt handler in PPC) (qemu) xp/8i 0x300 0x0300: .long 0x60 0x0304: lhzur18,-19843(r3) 0x0308: .long 0x60 0x030c: lhzur18,-20099(r2) 0x0310: lwz r0,11769(0) 0x0314: lhzur23,8317(r2) 0x0318: .long 0x7813427c 0x031c: lbz r0,19961(0) With the patch applied, the disassembly now looks correct: (qemu) xp/8i 0x300 0x0300: nop 0x0304: mtsprg 2,r13 0x0308: nop 0x030c: mfsprg r13,1 0x0310: std r9,128(r13) 0x0314: mfspr r9,896 0x0318: mr r2,r2 0x031c: std r10,136(r13) disas.c | 14 -- monitor.c |4 2 files changed, 16 insertions(+), 2 deletions(-) diff --git a/disas.c b/disas.c index 1397167..44a019a 100644 --- a/disas.c +++ b/disas.c @@ -445,6 +445,8 @@ monitor_fprintf(FILE *stream, const char *fmt, ...) return 0; } +/* Disassembler for the monitor. + See target_disas for a description of flags. */ void monitor_disas(Monitor *mon, CPUArchState *env, target_ulong pc, int nb_insn, int is_physical, int flags) { @@ -485,11 +487,19 @@ void monitor_disas(Monitor *mon, CPUArchState *env, s.info.mach = bfd_mach_sparc_v9b; #endif #elif defined(TARGET_PPC) +if (flags 0x) { +/* If we have a precise definition of the instruction set, use it. */ +s.info.mach = flags 0x; +} else { #ifdef TARGET_PPC64 -s.info.mach = bfd_mach_ppc64; +s.info.mach = bfd_mach_ppc64; #else -s.info.mach = bfd_mach_ppc; +s.info.mach = bfd_mach_ppc; #endif +} +if ((flags 16) 1) { +s.info.endian = BFD_ENDIAN_LITTLE; +} print_insn = print_insn_ppc; #elif defined(TARGET_M68K) print_insn = print_insn_m68k; diff --git a/monitor.c b/monitor.c index 342e83b..3ae561b 100644 --- a/monitor.c +++ b/monitor.c @@ -1309,6 +1309,10 @@ static void memory_dump(Monitor *mon, int count, int format, int wsize, } } #endif +#ifdef TARGET_PPC +flags = msr_le 16; +flags |= env-bfd_mach; +#endif monitor_disas(mon, env, addr, count, is_physical, flags); return; } -- 1.7.1
[Qemu-devel] [PATCH 0/2] QEMU Monitor Instruction Disassembly Incorrect for PPC LE
Fix disassembly in the QEMU monitor for Little Endian codes. Also fix the comment and tighten up a flag check in the closely related disassembler code for tracing. V2: Fixed target_disas comment and bit decoding. V3: Make monitor_disas flag documentation refer to target_disas documentation. V4: Minor corrections to comments. Tom Musta (2): target-ppc: Fix target_disas monitor: QEMU Monitor Instruction Disassembly Incorrect for PowerPC LE Mode disas.c | 21 - monitor.c |4 2 files changed, 20 insertions(+), 5 deletions(-)
[Qemu-devel] Turning off default storage devices?
Currently, -M q35 boots linux quite a bit slower than the default machine type. This seems to be because it takes a few hundred ms to determine that there's nothing attached to the AHCI controller. In virtio setups, there will probably never be anything attached to the AHCI controller. Would it be possible to add something like -machine default_storage=off to turn off default storage devices? This could include the AHCI on q35 and the cdrom and such on pc. There's precedent: -machine usb=off turns off the default USB controllers, which is great for setups that use xhci. Thanks, Andy
Re: [Qemu-devel] [PATCH v2] PPC: Clean up DECR implementation
Am 09.04.2014 um 21:33 schrieb Tom Musta tommu...@gmail.com: On 4/8/2014 2:58 PM, Alexander Graf wrote: On 04/08/2014 09:56 PM, Tom Musta wrote: On 4/6/2014 3:55 PM, Alexander Graf wrote: snip @@ -806,6 +838,10 @@ clk_setup_cb cpu_ppc_tb_init (CPUPPCState *env, uint32_t freq) tb_env = g_malloc0(sizeof(ppc_tb_t)); env-tb_env = tb_env; tb_env-flags = PPC_DECR_UNDERFLOW_TRIGGERED; +if (env-insns_flags PPC_SEGMENT_64B) { +/* All Book3S 64bit CPUs implement level based DEC logic */ +tb_env-flags |= PPC_DECR_UNDERFLOW_LEVEL; +} /* Create new timer */ tb_env-decr_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, cpu_ppc_decr_cb, cpu); if (0) { Equating Book3S with PPC_SEGMENT_64B is clever ... is it too clever? Especially since the SLB Bridge is in the phased-out category and consequently we should expect future Book3S implementations to not support this instruction category. Maybe it's too clever :). I'm very open to suggestions on how to figure this out otherwise. Or maybe we should just rework the way timers get created and make them be part of the core itself? Alex A somewhat more practical approach than redesigning timer init: The phrasing introduced into Book3S that corresponds to your UNDERFLOW_LEVEL flag has existed at least since ISA 2.03. And 2.03 introduced some new features, like SPE and Altivec. So ... e6500 supports Altivec :) if (env-insns_flags (PPC_SEGMENT_64B | PPC_SPE | PPC_ALTIVEC)) { /* All Book3S 64bit CPUs implement level based DEC logic */ tb_env-flags |= PPC_DECR_UNDERFLOW_LEVEL; } would catch a few more. I'm not sure we get into this code for Book3E machines, but if you are worried about that you could also ensure that insns_flags doesn't have PPC_BOOKE on. Phew, I'm not quite sure. We can do it as an interim solution, but really the timer is a core property rather than a machine device :). So eventually this should get reworked. Alex
Re: [Qemu-devel] [PATCH target-arm v1 1/1] net: cadence_gem: Make phy respond to broadcast
On Thu, Apr 03, 2014 at 11:55:19PM -0700, Peter Crosthwaite wrote: Phys must respond to address 0 by specification. Implement. Signed-off-by: Nathan Rossi nathan.ro...@xilinx.com Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com --- hw/net/cadence_gem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c index 92dc2f2..e34b25e 100644 --- a/hw/net/cadence_gem.c +++ b/hw/net/cadence_gem.c @@ -1093,7 +1093,7 @@ static uint64_t gem_read(void *opaque, hwaddr offset, unsigned size) uint32_t phy_addr, reg_num; phy_addr = (retval GEM_PHYMNTNC_ADDR) GEM_PHYMNTNC_ADDR_SHFT; -if (phy_addr == BOARD_PHY_ADDRESS) { +if (phy_addr == BOARD_PHY_ADDRESS || phy_addr == 0) { reg_num = (retval GEM_PHYMNTNC_REG) GEM_PHYMNTNC_REG_SHIFT; retval = 0x; retval |= gem_phy_read(s, reg_num); @@ -1193,7 +1193,7 @@ static void gem_write(void *opaque, hwaddr offset, uint64_t val, uint32_t phy_addr, reg_num; phy_addr = (val GEM_PHYMNTNC_ADDR) GEM_PHYMNTNC_ADDR_SHFT; -if (phy_addr == BOARD_PHY_ADDRESS) { +if (phy_addr == BOARD_PHY_ADDRESS || phy_addr == 0) { reg_num = (val GEM_PHYMNTNC_REG) GEM_PHYMNTNC_REG_SHIFT; gem_phy_write(s, reg_num, val); } Although 802.3 standard dictates that PHYs must always respond to address 0, AFAIK not all PHYs do this. In this case, if the Marvell PHY complies with standard: Reviewed-by: Beniamino Galvani b.galv...@gmail.com
[Qemu-devel] [PATCH] target-i386: Preserve the Z bit for bt/bts/btr/btc
Older Intel manuals (pre-2010) describe Z as undefined, but AMD and newer Intel manuals describe Z as unchanged. Signed-off-by: Richard Henderson r...@twiddle.net --- target-i386/translate.c | 40 +++- 1 file changed, 31 insertions(+), 9 deletions(-) --- Clemens, your patch fails to fix flags computation for bts/btr/btc, which should be done similarly to bt. And to answer your question, no, QEMU does not make any assumptions about undefined flags. We often set them to zero, just because that is easier than any other setting. r~ diff --git a/target-i386/translate.c b/target-i386/translate.c index 02625e3..032b0fd 100644 --- a/target-i386/translate.c +++ b/target-i386/translate.c @@ -6708,41 +6708,63 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s, } bt_op: tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1 (3 + ot)) - 1); +tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); switch(op) { case 0: -tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]); -tcg_gen_movi_tl(cpu_cc_dst, 0); break; case 1: -tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); tcg_gen_movi_tl(cpu_tmp0, 1); tcg_gen_shl_tl(cpu_tmp0, cpu_tmp0, cpu_T[1]); tcg_gen_or_tl(cpu_T[0], cpu_T[0], cpu_tmp0); break; case 2: -tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); tcg_gen_movi_tl(cpu_tmp0, 1); tcg_gen_shl_tl(cpu_tmp0, cpu_tmp0, cpu_T[1]); -tcg_gen_not_tl(cpu_tmp0, cpu_tmp0); -tcg_gen_and_tl(cpu_T[0], cpu_T[0], cpu_tmp0); +tcg_gen_andc_tl(cpu_T[0], cpu_T[0], cpu_tmp0); break; default: case 3: -tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); tcg_gen_movi_tl(cpu_tmp0, 1); tcg_gen_shl_tl(cpu_tmp0, cpu_tmp0, cpu_T[1]); tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0); break; } -set_cc_op(s, CC_OP_SARB + ot); if (op != 0) { if (mod != 3) { gen_op_st_v(s, ot, cpu_T[0], cpu_A0); } else { gen_op_mov_reg_v(ot, rm, cpu_T[0]); } +} + +/* Delay all CC updates until after the store above. Note that + C is the result of the test, Z is unchanged, and the others + are all undefined. */ +switch (s-cc_op) { +case CC_OP_MULB ... CC_OP_MULQ: +case CC_OP_ADDB ... CC_OP_ADDQ: +case CC_OP_ADCB ... CC_OP_ADCQ: +case CC_OP_SUBB ... CC_OP_SUBQ: +case CC_OP_SBBB ... CC_OP_SBBQ: +case CC_OP_LOGICB ... CC_OP_LOGICQ: +case CC_OP_INCB ... CC_OP_INCQ: +case CC_OP_DECB ... CC_OP_DECQ: +case CC_OP_SHLB ... CC_OP_SHLQ: +case CC_OP_SARB ... CC_OP_SARQ: +case CC_OP_BMILGB ... CC_OP_BMILGQ: +/* Z was going to be computed from the non-zero status of CC_DST. + We can get that same Z value (and the new C value) by leaving + CC_DST alone, setting CC_SRC, and using a CC_OP_SAR of the + same width. */ tcg_gen_mov_tl(cpu_cc_src, cpu_tmp4); -tcg_gen_movi_tl(cpu_cc_dst, 0); +set_cc_op(s, ((s-cc_op - CC_OP_MULB) 3) + CC_OP_SARB); +break; +default: +/* Otherwise, generate EFLAGS and replace the C bit. */ +gen_compute_eflags(s); +tcg_gen_deposit_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4, + ctz32(CC_C), 1); +break; } break; case 0x1bc: /* bsf / tzcnt */ -- 1.9.0
Re: [Qemu-devel] [PATCH microblaze v1 1/1] net: xilinx_axienet.c: Add phy soft reset bit clearing
On Tue, Apr 08, 2014 at 06:52:39PM -0700, Peter Crosthwaite wrote: From: Nathan Rossi nathan.ro...@xilinx.com Clear the BMCR Reset when writing to registers. Signed-off-by: Nathan Rossi nathan.ro...@xilinx.com [ PC: * Trivial style fixes to commit message ] Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com --- hw/net/xilinx_axienet.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/hw/net/xilinx_axienet.c b/hw/net/xilinx_axienet.c index 839d97c..0f485a0 100644 --- a/hw/net/xilinx_axienet.c +++ b/hw/net/xilinx_axienet.c @@ -142,6 +142,9 @@ tdk_write(struct PHY *phy, unsigned int req, unsigned int data) phy-regs[regnum] = data; break; } + +/* Unconditionally clear regs[BMCR][BMCR_RESET] */ +phy-regs[0] = ~0x8000; } static void -- Reviewed-by: Beniamino Galvani b.galv...@gmail.com Ideally we should also restore default values of registers after a reset, but probably it is not required for the guest to operate properly. Beniamino