Re: [Qemu-devel] [PATCH] fix bug of isa_bus irq
On Sun, Mar 11, 2012 at 08:46:38AM +0100, Jan Kiszka wrote: On 2012-03-11 08:04, Wanpeng Li wrote: ISA bus only use IRQ 0~15, so don't need to give an array qemu_irq 0~23, just array qemu_irq i8259 is ok. Signed-off-by: Wanpeng Li l...@linux.vnet.ibm.com --- hw/pc_piix.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/hw/pc_piix.c b/hw/pc_piix.c index 63dba4e..52f7cf8 100644 --- a/hw/pc_piix.c +++ b/hw/pc_piix.c @@ -210,7 +210,6 @@ static void pc_init1(MemoryRegion *system_memory, isa_bus = isa_bus_new(NULL, system_io); no_hpet = 1; } -isa_bus_irqs(isa_bus, gsi); if (kvm_irqchip_in_kernel()) { i8259 = kvm_i8259_init(isa_bus); @@ -221,6 +220,8 @@ static void pc_init1(MemoryRegion *system_memory, i8259 = i8259_init(isa_bus, cpu_irq[0]); } +isa_bus_irqs(isa_bus, i8259); + for (i = 0; i ISA_NUM_IRQS; i++) { gsi_state-i8259_irq[i] = i8259[i]; } This is bogus. isa_bus_irqs sets the output IRQs of the ISA bus. And those are not only delivered to the PIC on the PIIX2, but also the IOAPIC. Thus we have to pass in the GSI input lines which dispatch to both. Of those lines, only the first 16 will be used by the ISA bus (there is even an assert to ensure this). Did you see any concrete bug in the context of this logic? Jan Yes, but actually PIC is being used at present, whether passing qemu_irq 0~23 to isa_bus is not safe or not. Wanpeng Li
Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
From: Jan Kiszka jan.kis...@siemens.com Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping Date: Fri, 09 Mar 2012 14:24:41 +0100 On 2012-03-09 13:53, HATAYAMA Daisuke wrote: From: Jan Kiszka jan.kis...@siemens.com Subject: Re: [RFC][PATCH 05/16 v8] Add API to get memory mapping Date: Fri, 09 Mar 2012 11:06:30 +0100 On 2012-03-09 11:05, Jan Kiszka wrote: If crash can work both with and without paging, it should be default *on* to avoid writing cores that can later on only be analyzed with that tool. Still not sure, though, if that changes the requirement on what memory regions should be written in that mode. If this logic is not remvoed, crash can work both with and without paging. But the default value is 'off' now, because the option is '-p'. And this would be unfortunate if you do not want to use crash for analyzing (I'm working on gdb python scripts which will make gdb - one day - at least as powerful as crash). If paging mode has the same information that non-paging mode has, I would even suggest to drop it. Err, with it = non-paging mode. Jan Paging at default is not good idea. Performing paging in qemu has risk. - Guest machine is not always in the state where paging mode is enabled. Also, CR3 doesn't always refer to page table. That's detectable and means physical == linear address. CR0 itself is one of the guest's resources. There's still the issue whether to trust CR0 or not. The assumption behind my idea is the host is running in a good condition but the quest in a bad condition. So we can use qemu dump, which is the host's feature. Only checking if CR3 refers to page table correctly is considerably complicated. CR3 can have any physical address. There's no hint such as magic number to check it's invalid. So, to check if CR3 correctly points at page table, the only way is to check if we can see the data through actually performing paging for some virtual address. The virtual address would be better if it's more typical one, but it tends to be guest specific, and I think it's not suitable for qemu due to OS dependency. # Of course, this story is done out of the assumption that the data # could be corrupted. - If guest machine is in catastrophic state, its memory data could be corrupted. Then, we cannot trust such corrupted page table. OK, here I agree. # In this point, managing PT_LOAD program headers based on such # potencially corruppted data has risk. The idea of yours that performing paging in debugger side is better than doing in qemu. Another alternative is to add guest-awareness to the dump code. If we detect (or get told) that the target is a Linux kernel, at least the linear kernel mapping can be written reliably. But also the fact that there can be as many different page tables as active processors means that paging likely needs a second thought and more awareness of the debugger. Also, it seems to me that the lacking feature of gdb used to dumpfile compared with used to running guest is paging only. Paging is a feature in architecture, which is unlikely to change in the future. It's better in maintainability than introducing OS-level dependency. Thanks. HATAYAMA, Daisuke
Re: [Qemu-devel] How to trace all the guest OS instructions and the micro-ops
On Mon, Mar 12, 2012 at 5:43 AM, Mulyadi Santosa mulyadi.sant...@gmail.com wrote: Hi On Sun, Mar 11, 2012 at 10:12, Yue Chen ycyc...@gmail.com wrote: I am doing some research based on the QEMU. Does anyone know how to get (trace) all the instructions of the guest OS, and get all the intermediate micro-ops ? (Not in the 0.9.1 version) QEMU has release version 1.0.1. Why are you still using 0.9.1? I believe it's -d option you're looking for. Please read qemu manual for further clarification and info. -d can only give a static view of what instruction is translated, but can't get a dynamic instruction execution trace. Additionally, how to get the whole memory or each process' memory data of the guest OS? you wanna do that simply from Qemu's monitor? I don't think that's doable...or at least easily. Qemu sees guest RAM like your physical RAM. It doesn't differentiate which pages belongs to which process. You need to hook or go straight inside the guest OS, maybe using gdb or other tool to get the core dump of those processes. I really appreciate your help. Hope it helps... -- regards, Mulyadi Santosa Freelance Linux trainer and consultant blog: the-hydra.blogspot.com training: mulyaditraining.blogspot.com -- Best regards, Chen Yufei
Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping
From: HATAYAMA Daisuke d.hatay...@jp.fujitsu.com Subject: Re: [Qemu-devel] [RFC][PATCH 05/16 v8] Add API to get memory mapping Date: Mon, 12 Mar 2012 15:16:55 +0900 ( ) The assumption behind my idea is the host is running in a good condition but the quest in a bad condition. So we can use qemu dump, which is the host's feature. There's also the situation that a guest is running in a good condition and users can recognize that its data is obviously not corrupted. For such situation, it's natural for qemu dump to have paging mode to some amount. Thanks. HATAYAMA, Daisuke
[Qemu-devel] [PATCH v2 1/2] block: add the support to drain throttled requests
From: Zhi Yong Wu wu...@linux.vnet.ibm.com Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com --- block.c | 21 + block_int.h |1 + 2 files changed, 22 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 52ffe14..0825168 100644 --- a/block.c +++ b/block.c @@ -853,6 +853,21 @@ void bdrv_close_all(void) } } +/** + * Complete all pending requests for a block device + */ +void bdrv_drain(BlockDriverState *bs) +{ +do { +qemu_co_queue_restart_all(bs-throttled_reqs); +} while (!qemu_co_queue_empty(bs-throttled_reqs)); + +qemu_aio_flush(); + +assert(QLIST_EMPTY(bs-tracked_requests)); +assert(qemu_co_queue_empty(bs-throttled_reqs)); +} + /* * Wait for pending requests to complete across all BlockDriverStates * @@ -863,6 +878,12 @@ void bdrv_drain_all(void) { BlockDriverState *bs; +QTAILQ_FOREACH(bs, bdrv_states, list) { +do { +qemu_co_queue_restart_all(bs-throttled_reqs); +} while (!qemu_co_queue_empty(bs-throttled_reqs)); +} + qemu_aio_flush(); /* If requests are still pending there is a bug somewhere */ diff --git a/block_int.h b/block_int.h index b460c36..c624399 100644 --- a/block_int.h +++ b/block_int.h @@ -318,6 +318,7 @@ void qemu_aio_release(void *p); void bdrv_set_io_limits(BlockDriverState *bs, BlockIOLimit *io_limits); +void bdrv_drain(BlockDriverState *bs); #ifdef _WIN32 int is_windows_drive(const char *filename); -- 1.7.6
[Qemu-devel] [PATCH v2 2/2] block: drain throttled requests for one block device
From: Zhi Yong Wu wu...@linux.vnet.ibm.com Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com --- blockdev.c |4 ++-- hw/ide/macio.c |2 +- hw/ide/pci.c |3 ++- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/blockdev.c b/blockdev.c index d78aa51..1bc4667 100644 --- a/blockdev.c +++ b/blockdev.c @@ -694,7 +694,7 @@ void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file, return; } -bdrv_drain_all(); +bdrv_drain(bs); bdrv_flush(bs); bdrv_close(bs); @@ -1014,7 +1014,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data) } /* quiesce block driver; prevent further io */ -bdrv_drain_all(); +bdrv_drain(bs); bdrv_flush(bs); bdrv_close(bs); diff --git a/hw/ide/macio.c b/hw/ide/macio.c index a4df244..454a020 100644 --- a/hw/ide/macio.c +++ b/hw/ide/macio.c @@ -192,7 +192,7 @@ static void pmac_ide_flush(DBDMA_io *io) MACIOIDEState *m = io-opaque; if (m-aiocb) { -bdrv_drain_all(); +bdrv_drain(m-aiocb-bs); } } diff --git a/hw/ide/pci.c b/hw/ide/pci.c index 88c0942..205d65a 100644 --- a/hw/ide/pci.c +++ b/hw/ide/pci.c @@ -27,6 +27,7 @@ #include hw/pci.h #include hw/isa.h #include block.h +#include block_int.h #include dma.h #include hw/ide/pci.h @@ -309,7 +310,7 @@ void bmdma_cmd_writeb(BMDMAState *bm, uint32_t val) * aio operation with preadv/pwritev. */ if (bm-bus-dma-aiocb) { -bdrv_drain_all(); +bdrv_drain(bm-bus-dma-aiocb-bs); assert(bm-bus-dma-aiocb == NULL); assert((bm-status BM_STATUS_DMAING) == 0); } -- 1.7.6
Re: [Qemu-devel] How to trace all the guest OS instructions and the micro-ops
Thanks a lot. So any approach to get the dynamic or static whole memory information of the guest OS ? Not the memory of each process. Sorry for the confusion. I do use version 1.0.1. I mention not in 0.9.1 because someone has already implemented the dynamic tracing in 0.9.1, but not in the latest version. On Mon, Mar 12, 2012 at 2:20 AM, Chen Yufei cyfde...@gmail.com wrote: On Mon, Mar 12, 2012 at 5:43 AM, Mulyadi Santosa mulyadi.sant...@gmail.com wrote: Hi On Sun, Mar 11, 2012 at 10:12, Yue Chen ycyc...@gmail.com wrote: I am doing some research based on the QEMU. Does anyone know how to get (trace) all the instructions of the guest OS, and get all the intermediate micro-ops ? (Not in the 0.9.1 version) QEMU has release version 1.0.1. Why are you still using 0.9.1? I believe it's -d option you're looking for. Please read qemu manual for further clarification and info. -d can only give a static view of what instruction is translated, but can't get a dynamic instruction execution trace. Additionally, how to get the whole memory or each process' memory data of the guest OS? you wanna do that simply from Qemu's monitor? I don't think that's doable...or at least easily. Qemu sees guest RAM like your physical RAM. It doesn't differentiate which pages belongs to which process. You need to hook or go straight inside the guest OS, maybe using gdb or other tool to get the core dump of those processes. I really appreciate your help. Hope it helps... -- regards, Mulyadi Santosa Freelance Linux trainer and consultant blog: the-hydra.blogspot.com training: mulyaditraining.blogspot.com -- Best regards, Chen Yufei
Re: [Qemu-devel] [PATCH] fix bug of isa_bus irq
On 2012-03-12 07:08, Wanpeng Li wrote: On Sun, Mar 11, 2012 at 08:46:38AM +0100, Jan Kiszka wrote: On 2012-03-11 08:04, Wanpeng Li wrote: ISA bus only use IRQ 0~15, so don't need to give an array qemu_irq 0~23, just array qemu_irq i8259 is ok. Signed-off-by: Wanpeng Li l...@linux.vnet.ibm.com --- hw/pc_piix.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/hw/pc_piix.c b/hw/pc_piix.c index 63dba4e..52f7cf8 100644 --- a/hw/pc_piix.c +++ b/hw/pc_piix.c @@ -210,7 +210,6 @@ static void pc_init1(MemoryRegion *system_memory, isa_bus = isa_bus_new(NULL, system_io); no_hpet = 1; } -isa_bus_irqs(isa_bus, gsi); if (kvm_irqchip_in_kernel()) { i8259 = kvm_i8259_init(isa_bus); @@ -221,6 +220,8 @@ static void pc_init1(MemoryRegion *system_memory, i8259 = i8259_init(isa_bus, cpu_irq[0]); } +isa_bus_irqs(isa_bus, i8259); + for (i = 0; i ISA_NUM_IRQS; i++) { gsi_state-i8259_irq[i] = i8259[i]; } This is bogus. isa_bus_irqs sets the output IRQs of the ISA bus. And those are not only delivered to the PIC on the PIIX2, but also the IOAPIC. Thus we have to pass in the GSI input lines which dispatch to both. Of those lines, only the first 16 will be used by the ISA bus (there is even an assert to ensure this). Did you see any concrete bug in the context of this logic? Jan Yes, but actually PIC is being used at present, whether passing qemu_irq 0~23 to isa_bus is not safe or not. Sorry, IRQ routing to PIC and IOAPIC is actually not a property of the PIIX3 but the board we emulate. And here we follow the Multiprocessor Specification of Intel and route ISA bus IRQs to both interrupt controllers. Thus the bus must be connected to the GSIs. And, again, GSI[16..13] aren't referenced by the ISA bus at any time. Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 1/2] block: add the support to drain throttled requests
Il 12/03/2012 07:29, zwu.ker...@gmail.com ha scritto: From: Zhi Yong Wu wu...@linux.vnet.ibm.com Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com --- block.c | 21 + block_int.h |1 + 2 files changed, 22 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 52ffe14..0825168 100644 --- a/block.c +++ b/block.c @@ -853,6 +853,21 @@ void bdrv_close_all(void) } } +/** + * Complete all pending requests for a block device + */ +void bdrv_drain(BlockDriverState *bs) +{ +do { +qemu_co_queue_restart_all(bs-throttled_reqs); +} while (!qemu_co_queue_empty(bs-throttled_reqs)); + +qemu_aio_flush(); This doesn't work, qemu_aio_flush can start new I/O. Paolo
Re: [Qemu-devel] QAPI conversion status and async commands support
On Wed, Mar 07, 2012 at 10:06:31PM +0200, Alon Levy wrote: On Wed, Mar 07, 2012 at 11:36:06AM -0600, Anthony Liguori wrote: On 03/07/2012 11:29 AM, Paolo Bonzini wrote: Il 07/03/2012 17:36, Luiz Capitulino ha scritto: Hi there, In the last few weeks we've had some proposals for new QMP commands that need to be asynchronous. As we lack a standard asynchronous API today, each command ends up adding its own way to execute in the background. This multiplies the API complexity as each command has to be implemented and learned by clients separately, with their own way of doing more or less the same things. The solution for this, envisioned for us for a long time now, is to introduce an unified QMP API for asynchronous commands. But before doing this we have to: 1. Finish the commands conversion to the QAPI This is almost done, the only missing commands are: add_graphics_client, do_closefd, do_device_add, do_device_del, do_getfd, do_migrate, do_netdev_add, do_netdev_del, do_qmp_capabilities and do_screen_dump. Note that do_migrate has already been posted to the list, and I have the screendump more or less done. Also, Anthony has an old branch where most of the conversions are already done, they just need to be rebased tested. 2. Integrate the new QAPI server Implemented by Anthony, may have missing pieces. 3. Implement async command support I think the missing commands to be converted can be done in around one week, but unfortunately I've been busy at other things and will need a few days to resume this work. Then there's the new QAPI server async support, which I'm not sure how much time we'll need to integrate them, but we should have this done for 1.1. The main question is: what should we do for the already posted async commands? Should we hold them until we finish this work? I think yes, and we could even have a list of features without which 1.1 should not ship. QOM buses, drive mirroring and QAPI async command support may be them. Perhaps qtest too. Okay, let's get serious about what we can and can't do. Hard freeze for 1.1 is May 1st which is roughly 6 weeks from now. I think QOM buses can go in no problem along with qtest. I would be okay considering QOM buses a release blocker but probably not qtest. I'm not really sure about drive mirroring. Is the work already done such that we just need to talk about merging it? With QAPI async command, I don't think 1.1 is a viable target. We're not just talking about converting existing commands to QAPI, but also replacing the QMP server infrastructure. I don't think that is a change that should be made at the tail end of the development cycle. ... so, Avi, Anthony, as: 1. screendump is broken for qxl 2. there will not be QAPI async for 1.1 3. monitor async is unacceptable by maintainer Is screendump-async to be accepted? I've send a patchset that works for hmp and I intend to fix libvirt by using the hmp screendump command instead of the qmp one for now. See: http://lists.gnu.org/archive/html/qemu-devel/2012-03/msg01802.html Regards, Anthony Liguori Paolo
Re: [Qemu-devel] seamless migration with spice
Hi, What about the second part? it's independant of the async issue. Isn't this a client problem? The client has this state, no? It is state of the client - server session. Today spice client creates a new session on migration, so there is simply no need to maintain any state. Drawback is that everything needs to be resent from the server to the client. Thats why we want be able to continue the spice session, so the client caches will stay valid. Of course the spice-server on the migration target needs the session state for that, i.e. know for example which bits the client has cached which it hasn't. If the state is stored in the server, wouldn't it be marshaled as part of the server's migration state? spice-server is stateless today when it comes to migration. QXL handles all (device) state, by keeping track of some commands (such as create/destroy surface) which it needs to transfer on migration, and by asking spice-server to render all surfaces on migration, which effectively flushes the spice server state to qxl device memory. To transfer the client session state there are basically two options: (a) transfer it as part of the qemu migration data stream. I don't want have any details about the qemu migration implementation and/or protocol in the spice-server library api, which basically leaves a ugly transfer-this-blob-for-me-please style interface as only option. (b) transfer it as part of the spice protocol. As the spice client has a connection to both source and target while the migration runs we can send session state from the source host via spice client to the target host. This needs some form of synchronization, to make sure both vmstate and spice migration are completed when qemu on the source machine quits. I think (b) is the better approach. cheers, Gerd
Re: [Qemu-devel] [PATCH v2 2/3] make check: Add qemu-iotests subset
Am 09.03.2012 20:31, schrieb Anthony Liguori: On 03/09/2012 06:46 AM, Kevin Wolf wrote: Run the 'quick' group from qemu-iotests during 'make check'. Signed-off-by: Kevin Wolfkw...@redhat.com --- tests/Makefile |5 + tests/qemu-iotests-quick.sh | 17 + 2 files changed, 22 insertions(+), 0 deletions(-) create mode 100755 tests/qemu-iotests-quick.sh diff --git a/tests/Makefile b/tests/Makefile index 74b29dc..571ad42 100644 --- a/tests/Makefile +++ b/tests/Makefile @@ -1,6 +1,9 @@ +export SRC_PATH + CHECKS = check-qdict check-qfloat check-qint check-qstring check-qlist CHECKS += check-qjson test-qmp-output-visitor test-qmp-input-visitor CHECKS += test-string-input-visitor test-string-output-visitor test-coroutine +CHECKS += $(SRC_PATH)/tests/qemu-iotests-quick.sh Not to suggest that we don't start here, but I think we should look into how to make qemu-iotest use gtester in the near future. That will allow the qemu-iotest to be part of the make check-report output and will provide an easy way for other tools (like autotest) to run these tests. Yes, we need to figure out how to integrate shell scripts with gtester. If we can't, we need to rethink gtester as our framework and put something more flexible on top. Kevin
[Qemu-devel] [PATCH 1/4] add MIPS DSP helpers define
This patch is the helper define of MIPS ASE DSP. Signed-off-by: Jia Liu pro...@gmail.com --- target-mips/helper.h | 152 ++ 1 files changed, 152 insertions(+), 0 deletions(-) diff --git a/target-mips/helper.h b/target-mips/helper.h index 442f684..1abf582 100644 --- a/target-mips/helper.h +++ b/target-mips/helper.h @@ -297,4 +297,156 @@ DEF_HELPER_0(rdhwr_ccres, tl) DEF_HELPER_1(pmon, void, int) DEF_HELPER_0(wait, void) +/* MIPS32 DSP */ +DEF_HELPER_1(absqsph, i32, i32) +DEF_HELPER_1(absqsw,i32, i32) +DEF_HELPER_2(addqph,i32, i32, i32) +DEF_HELPER_2(addqsph, i32, i32, i32) +DEF_HELPER_2(addqsw,i32, i32, i32) +DEF_HELPER_2(addsc, i32, i32, i32) +DEF_HELPER_2(addwc, i32, i32, i32) +DEF_HELPER_1(bitrev,i32, i32) +DEF_HELPER_2(cmpeqph, void, i32, i32) +DEF_HELPER_2(cmpltph, void, i32, i32) +DEF_HELPER_2(cmpleph, void, i32, i32) +DEF_HELPER_2(cmpgueqqb, i32, i32, i32) +DEF_HELPER_2(cmpgultqb, i32, i32, i32) +DEF_HELPER_2(cmpguleqb, i32, i32, i32) +DEF_HELPER_2(cmpueqqb, void, i32, i32) +DEF_HELPER_2(cmpultqb, void, i32, i32) +DEF_HELPER_2(cmpuleqb, void, i32, i32) +DEF_HELPER_3(dpaqswph, void, int, i32, i32) +DEF_HELPER_3(dpaqsalw, void, int, i32, i32) +DEF_HELPER_3(dpauhqbl, void, int, i32, i32) +DEF_HELPER_3(dpauhqbr, void, int, i32, i32) +DEF_HELPER_3(dpsqswph, void, int, i32, i32) +DEF_HELPER_3(dpsqsalw, void, int, i32, i32) +DEF_HELPER_3(dpsuhqbl, void, int, i32, i32) +DEF_HELPER_3(dpsuhqbr, void, int, i32, i32) +DEF_HELPER_3(extp, void, int, int, int) +DEF_HELPER_3(extpdp,void, int, int, int) +DEF_HELPER_3(extpdpv, void, int, i32, int) +DEF_HELPER_3(extpv, void, int, i32, int) +DEF_HELPER_3(extrsh,void, int, int, int) +DEF_HELPER_3(extrw, void, int, int, int) +DEF_HELPER_3(extrrw,void, int, int, int) +DEF_HELPER_3(extrrsw, void, int, int, int) +DEF_HELPER_2(extrvsh, i32, int, i32); +DEF_HELPER_3(extrvw,void, int, i32, int) +DEF_HELPER_3(extrvrw, void, int, i32, int) +DEF_HELPER_3(extrvrsw, void, int, i32, int) +DEF_HELPER_3(insv, void, int, i32, i32) +DEF_HELPER_2(lbux, tl, tl, int) +DEF_HELPER_2(lhx, i32, i32, int) +DEF_HELPER_2(lwx, i32, i32, int) +DEF_HELPER_3(maqswphl, void, int, i32, i32) +DEF_HELPER_3(maqsawphl, void, int, i32, i32) +DEF_HELPER_3(maqswphr, void, int, i32, i32) +DEF_HELPER_3(maqsawphr, void, int, i32, i32) +DEF_HELPER_2(modsub,i32, i32, i32) +DEF_HELPER_2(mthlip,void, int, i32) +DEF_HELPER_2(muleqswphl,i32, i32, i32) +DEF_HELPER_2(muleqswphr,i32, i32, i32) +DEF_HELPER_2(muleusphqbl, i32, i32, i32) +DEF_HELPER_2(muleusphqbr, i32, i32, i32) +DEF_HELPER_2(mulqrsph, i32, i32, i32) +DEF_HELPER_3(mulsaqswph,void, int, i32, i32) +DEF_HELPER_2(packrlph, i32, i32, i32) +DEF_HELPER_2(pickqb,i32, i32, i32) +DEF_HELPER_1(preceqwphl,i32, i32) +DEF_HELPER_2(pickph,i32, i32, i32) +DEF_HELPER_1(preceqwphr,i32, i32) +DEF_HELPER_1(precequphqbl, i32, i32) +DEF_HELPER_1(precequphqbla, i32, i32) +DEF_HELPER_1(precequphqbr, i32, i32) +DEF_HELPER_1(precequphqbra, i32, i32) +DEF_HELPER_1(preceuphqbl, i32, i32) +DEF_HELPER_1(preceuphqbla, i32, i32) +DEF_HELPER_1(preceuphqbr, i32, i32) +DEF_HELPER_1(preceuphqbra, i32, i32) +DEF_HELPER_2(precrqqbph,i32, i32, i32) +DEF_HELPER_2(precrqphw, i32, i32, i32) +DEF_HELPER_2(precrqrsphw, i32, i32, i32) +DEF_HELPER_2(precrqusqbph, i32, i32, i32) +DEF_HELPER_1(radduwqb, i32, i32) +DEF_HELPER_1(rddsp, i32, i32) +DEF_HELPER_1(replph,i32, i32) +DEF_HELPER_1(replqb,i32, i32) +DEF_HELPER_1(replvph, i32, i32) +DEF_HELPER_1(replvqb, i32, i32) +DEF_HELPER_2(shilo, void, int, int) +DEF_HELPER_2(shilov,void, int, i32) +DEF_HELPER_2(shllph,i32, int, i32) +DEF_HELPER_2(shllsph, i32, int, i32) +DEF_HELPER_2(shllqb,i32, int, i32) +DEF_HELPER_2(shllsw,i32, int, i32) +DEF_HELPER_2(shllvph, i32, i32, i32) +DEF_HELPER_2(shllvsph, i32, i32, i32) +DEF_HELPER_2(shllvqb, i32, i32, i32) +DEF_HELPER_2(shllvsw, i32, i32, i32) +DEF_HELPER_2(shraph,i32, int, i32) +DEF_HELPER_2(shrarph, i32, int, i32) +DEF_HELPER_2(shrarw,i32, int, i32) +DEF_HELPER_2(shravph, i32, i32, i32) +DEF_HELPER_2(shravrph, i32, i32, i32) +DEF_HELPER_2(shravrw, i32, i32, i32) +DEF_HELPER_2(shrlqb,i32, int, i32) +DEF_HELPER_2(shrlvqb, i32, i32, i32) +DEF_HELPER_2(subqph,i32, i32, i32) +DEF_HELPER_2(subqsph, i32, i32, i32) +DEF_HELPER_2(subqsw,i32, i32, i32) +DEF_HELPER_2(subuqb,i32, i32, i32) +DEF_HELPER_2(subusqb, i32, i32, i32)
[Qemu-devel] [PATCH 0/4] MIPS ASE DSP Support for Qemu
Hi all This is the MIPS ASE DSP Support for Qemu. Jia Liu (4): add MIPS DSP helpers define add MIPS DSP helpers implement add MIPS DSP translation add MIPS DSP testcase target-mips/helper.h | 152 + target-mips/op_helper.c| 3936 target-mips/translate.c| 1114 +++- tests/tcg/mips/mips32-dsp/Makefile | 133 + tests/tcg/mips/mips32-dsp/absq_s_ph.c | 28 + tests/tcg/mips/mips32-dsp/absq_s_w.c | 35 + tests/tcg/mips/mips32-dsp/addq_ph.c| 29 + tests/tcg/mips/mips32-dsp/addq_s_ph.c | 29 + tests/tcg/mips/mips32-dsp/addsc.c | 28 + tests/tcg/mips/mips32-dsp/addu_qb.c| 28 + tests/tcg/mips/mips32-dsp/addu_s_qb.c | 28 + tests/tcg/mips/mips32-dsp/addwc.c | 28 + tests/tcg/mips/mips32-dsp/bitrev.c | 18 + tests/tcg/mips/mips32-dsp/bposge32.c | 42 + tests/tcg/mips/mips32-dsp/cmp_eq_ph.c | 33 + tests/tcg/mips/mips32-dsp/cmp_le_ph.c | 33 + tests/tcg/mips/mips32-dsp/cmp_lt_ph.c | 33 + tests/tcg/mips/mips32-dsp/cmpgu_eq_qb.c| 29 + tests/tcg/mips/mips32-dsp/cmpgu_le_qb.c| 29 + tests/tcg/mips/mips32-dsp/cmpgu_lt_qb.c| 29 + tests/tcg/mips/mips32-dsp/cmpu_eq_qb.c | 33 + tests/tcg/mips/mips32-dsp/cmpu_le_qb.c | 33 + tests/tcg/mips/mips32-dsp/cmpu_lt_qb.c | 33 + tests/tcg/mips/mips32-dsp/dpaq_s_w_ph.c| 29 + tests/tcg/mips/mips32-dsp/dpaq_sa_l_w.c| 29 + tests/tcg/mips/mips32-dsp/dpau_h_qbl.c | 25 + tests/tcg/mips/mips32-dsp/dpau_h_qbr.c | 25 + tests/tcg/mips/mips32-dsp/dpsq_s_w_ph.c| 25 + tests/tcg/mips/mips32-dsp/dpsq_sa_l_w.c| 29 + tests/tcg/mips/mips32-dsp/dpsu_h_qbl.c | 25 + tests/tcg/mips/mips32-dsp/dpsu_h_qbr.c | 25 + tests/tcg/mips/mips32-dsp/extp.c | 42 + tests/tcg/mips/mips32-dsp/extpdp.c | 44 + tests/tcg/mips/mips32-dsp/extpdpv.c| 45 + tests/tcg/mips/mips32-dsp/extpv.c | 43 + tests/tcg/mips/mips32-dsp/extr_r_w.c | 23 + tests/tcg/mips/mips32-dsp/extr_rs_w.c | 23 + tests/tcg/mips/mips32-dsp/extr_s_h.c | 23 + tests/tcg/mips/mips32-dsp/extr_w.c | 23 + tests/tcg/mips/mips32-dsp/extrv_r_w.c | 27 + tests/tcg/mips/mips32-dsp/extrv_rs_w.c | 27 + tests/tcg/mips/mips32-dsp/extrv_s_h.c | 27 + tests/tcg/mips/mips32-dsp/extrv_w.c| 27 + tests/tcg/mips/mips32-dsp/insv.c | 21 + tests/tcg/mips/mips32-dsp/lbux.c | 21 + tests/tcg/mips/mips32-dsp/lhx.c| 21 + tests/tcg/mips/mips32-dsp/lwx.c| 21 + tests/tcg/mips/mips32-dsp/madd.c | 29 + tests/tcg/mips/mips32-dsp/maddu.c | 29 + tests/tcg/mips/mips32-dsp/maq_s_w_phl.c| 29 + tests/tcg/mips/mips32-dsp/maq_s_w_phr.c| 29 + tests/tcg/mips/mips32-dsp/maq_sa_w_phl.c | 29 + tests/tcg/mips/mips32-dsp/maq_sa_w_phr.c | 29 + tests/tcg/mips/mips32-dsp/mfhi.c | 19 + tests/tcg/mips/mips32-dsp/mflo.c | 19 + tests/tcg/mips/mips32-dsp/modsub.c | 28 + tests/tcg/mips/mips32-dsp/msub.c | 28 + tests/tcg/mips/mips32-dsp/msubu.c | 28 + tests/tcg/mips/mips32-dsp/mthi.c | 19 + tests/tcg/mips/mips32-dsp/mthlip.c | 32 + tests/tcg/mips/mips32-dsp/mtlo.c | 19 + tests/tcg/mips/mips32-dsp/muleq_s_w_phr.c | 38 + tests/tcg/mips/mips32-dsp/muleu_s_ph_qbl.c | 23 + tests/tcg/mips/mips32-dsp/muleu_s_ph_qbr.c | 23 + tests/tcg/mips/mips32-dsp/mulq_rs_ph.c | 23 + tests/tcg/mips/mips32-dsp/mult.c | 22 + tests/tcg/mips/mips32-dsp/multu.c | 22 + tests/tcg/mips/mips32-dsp/packrl_ph.c | 19 + tests/tcg/mips/mips32-dsp/pick_ph.c| 21 + tests/tcg/mips/mips32-dsp/pick_qb.c| 21 + tests/tcg/mips/mips32-dsp/preceq_w_phl.c | 18 + tests/tcg/mips/mips32-dsp/preceq_w_phr.c | 18 + tests/tcg/mips/mips32-dsp/precequ_ph_qbl.c | 18 + tests/tcg/mips/mips32-dsp/precequ_ph_qbla.c| 18 + tests/tcg/mips/mips32-dsp/precequ_ph_qbr.c | 18 + tests/tcg/mips/mips32-dsp/precequ_ph_qbra.c| 18 + tests/tcg/mips/mips32-dsp/preceu_ph_qbl.c | 18 + tests/tcg/mips/mips32-dsp/preceu_ph_qbla.c | 18 + tests/tcg/mips/mips32-dsp/preceu_ph_qbr.c | 18 + tests/tcg/mips/mips32-dsp/preceu_ph_qbra.c | 18 + tests/tcg/mips/mips32-dsp/precrq_ph_w.c| 19 + tests/tcg/mips/mips32-dsp/precrq_qb_ph.c | 19 + tests/tcg/mips/mips32-dsp/precrq_rs_ph_w.c | 19 + tests/tcg/mips/mips32-dsp/precrqu_s_qb_ph.c| 19 +
[Qemu-devel] [PATCH 3/4] add MIPS DSP translation
This patch is the translation of MIPS ASE DSP. Signed-off-by: Jia Liu pro...@gmail.com --- target-mips/translate.c | 1114 +-- 1 files changed, 1088 insertions(+), 26 deletions(-) diff --git a/target-mips/translate.c b/target-mips/translate.c index 8361d88..1fa5b28 100644 --- a/target-mips/translate.c +++ b/target-mips/translate.c @@ -249,6 +249,11 @@ enum { OPC_SYNCI= (0x1F 16) | OPC_REGIMM, }; +/* REGIMM mipsdsp opcodes */ +enum { +OPC_BPOSGE32 = (0x1C 16) | OPC_REGIMM, +}; + /* Special2 opcodes */ #define MASK_SPECIAL2(op) MASK_OP_MAJOR(op) | (op 0x3F) @@ -312,6 +317,21 @@ enum { OPC_MODU_G_2E = 0x23 | OPC_SPECIAL3, OPC_DMOD_G_2E = 0x26 | OPC_SPECIAL3, OPC_DMODU_G_2E = 0x27 | OPC_SPECIAL3, + +/* MIPS DSP */ +OPC_ABSQ_S_PH_DSP = 0x12 | OPC_SPECIAL3, +OPC_ADDU_QB_DSP= 0x10 | OPC_SPECIAL3, +/* OPC_ADDUH_QB_DSP is same as OPC_MULT_G_2E. */ +/* OPC_ADDUH_QB_DSP = 0x18 | OPC_SPECIAL3, */ +OPC_APPEND_DSP = 0x31 | OPC_SPECIAL3, +OPC_CMPU_EQ_QB_DSP = 0x11 | OPC_SPECIAL3, +OPC_DPA_W_PH_DSP = 0x30 | OPC_SPECIAL3, +OPC_EXTR_W_DSP = 0x38 | OPC_SPECIAL3, +OPC_INSV_DSP = 0x0C | OPC_SPECIAL3, +OPC_LX_DSP = 0x0A | OPC_SPECIAL3, +/* OPC_MUL_PH_DSP is same as OPC_ADDUH_QB_DSP. */ +/* OPC_MUL_PH_DSP = 0x18 | OPC_SPECIAL3, */ +OPC_SHLL_QB_DSP= 0x13 | OPC_SPECIAL3, }; /* BSHFL opcodes */ @@ -331,6 +351,231 @@ enum { OPC_DSHD = (0x05 6) | OPC_DBSHFL, }; +#define MASK_ABSQ_S_PH(op) MASK_SPECIAL3(op) | (op (0x1F 6)) +/* MIPS DSP */ +enum { +OPC_ABSQ_S_PH = (0x09 6) | OPC_ABSQ_S_PH_DSP, +OPC_ABSQ_S_W= (0x11 6) | OPC_ABSQ_S_PH_DSP, +OPC_BITREV = (0x1B 6) | OPC_ABSQ_S_PH_DSP, +OPC_PRECEQ_W_PHL= (0x0C 6) | OPC_ABSQ_S_PH_DSP, +OPC_PRECEQ_W_PHR= (0x0D 6) | OPC_ABSQ_S_PH_DSP, +OPC_PRECEQU_PH_QBL = (0x04 6) | OPC_ABSQ_S_PH_DSP, +OPC_PRECEQU_PH_QBLA = (0x06 6) | OPC_ABSQ_S_PH_DSP, +OPC_PRECEQU_PH_QBR = (0x05 6) | OPC_ABSQ_S_PH_DSP, +OPC_PRECEQU_PH_QBRA = (0x07 6) | OPC_ABSQ_S_PH_DSP, +OPC_PRECEU_PH_QBL = (0x1C 6) | OPC_ABSQ_S_PH_DSP, +OPC_PRECEU_PH_QBLA = (0x1E 6) | OPC_ABSQ_S_PH_DSP, +OPC_PRECEU_PH_QBR = (0x1D 6) | OPC_ABSQ_S_PH_DSP, +OPC_PRECEU_PH_QBRA = (0x1F 6) | OPC_ABSQ_S_PH_DSP, +OPC_REPL_PH = (0x0A 6) | OPC_ABSQ_S_PH_DSP, +OPC_REPL_QB = (0x02 6) | OPC_ABSQ_S_PH_DSP, +OPC_REPLV_PH= (0x0B 6) | OPC_ABSQ_S_PH_DSP, +OPC_REPLV_QB= (0x03 6) | OPC_ABSQ_S_PH_DSP, +}; + +/* MIPS DSPR2 */ +enum { +OPC_ABSQ_S_QB = (0x01 6) | OPC_ABSQ_S_PH_DSP, +}; + +#define MASK_ADDU_QB(op) MASK_SPECIAL3(op) | (op (0x1F 6)) +/* MIPS DSP */ +enum { +OPC_ADDQ_PH= (0x0A 6) | OPC_ADDU_QB_DSP, +OPC_ADDQ_S_PH = (0x0E 6) | OPC_ADDU_QB_DSP, +OPC_ADDQ_S_W = (0x16 6) | OPC_ADDU_QB_DSP, +OPC_ADDSC = (0x10 6) | OPC_ADDU_QB_DSP, +OPC_ADDU_QB= (0x00 6) | OPC_ADDU_QB_DSP, +OPC_ADDU_S_QB = (0x04 6) | OPC_ADDU_QB_DSP, +OPC_ADDWC = (0x11 6) | OPC_ADDU_QB_DSP, +OPC_MODSUB = (0x12 6) | OPC_ADDU_QB_DSP, +OPC_MULEQ_S_W_PHL = (0x1C 6) | OPC_ADDU_QB_DSP, +OPC_MULEQ_S_W_PHR = (0x1D 6) | OPC_ADDU_QB_DSP, +OPC_MULEU_S_PH_QBL = (0x06 6) | OPC_ADDU_QB_DSP, +OPC_MULEU_S_PH_QBR = (0x07 6) | OPC_ADDU_QB_DSP, +OPC_MULQ_RS_PH = (0x1F 6) | OPC_ADDU_QB_DSP, +OPC_RADDU_W_QB = (0x14 6) | OPC_ADDU_QB_DSP, +OPC_SUBQ_PH= (0x0B 6) | OPC_ADDU_QB_DSP, +OPC_SUBQ_S_PH = (0x0F 6) | OPC_ADDU_QB_DSP, +OPC_SUBQ_S_W = (0x17 6) | OPC_ADDU_QB_DSP, +OPC_SUBU_QB= (0x01 6) | OPC_ADDU_QB_DSP, +OPC_SUBU_S_QB = (0x05 6) | OPC_ADDU_QB_DSP, +}; + +/* MIPS DSPR2 */ +enum { +OPC_ADDU_PH = (0x08 6) | OPC_ADDU_QB_DSP, +OPC_ADDU_S_PH = (0x0C 6) | OPC_ADDU_QB_DSP, +OPC_MULQ_S_PH = (0x1E 6) | OPC_ADDU_QB_DSP, +OPC_SUBU_PH = (0x09 6) | OPC_ADDU_QB_DSP, +OPC_SUBU_S_PH = (0x0D 6) | OPC_ADDU_QB_DSP, +}; + +#define OPC_ADDUH_QB_DSP OPC_MULT_G_2E +#define MASK_ADDUH_QB(op) MASK_SPECIAL3(op) | (op (0x1F 6)) +/* MIPS DSPR2 */ +enum { +OPC_ADDQH_PH = (0x08 6) | OPC_ADDUH_QB_DSP, +OPC_ADDQH_R_PH = (0x0A 6) | OPC_ADDUH_QB_DSP, +OPC_ADDQH_W= (0x10 6) | OPC_ADDUH_QB_DSP, +OPC_ADDQH_R_W = (0x12 6) | OPC_ADDUH_QB_DSP, +OPC_ADDUH_QB = (0x00 6) | OPC_ADDUH_QB_DSP, +OPC_ADDUH_R_QB = (0x02 6) | OPC_ADDUH_QB_DSP, +OPC_MUL_PH = (0x0C 6) | OPC_ADDUH_QB_DSP, +OPC_MUL_S_PH = (0x0E 6) | OPC_ADDUH_QB_DSP, +OPC_SUBQH_PH = (0x09 6) | OPC_ADDUH_QB_DSP, +OPC_SUBQH_R_PH = (0x0B 6) | OPC_ADDUH_QB_DSP, +OPC_SUBQH_W= (0x11 6) | OPC_ADDUH_QB_DSP, +OPC_SUBQH_R_W = (0x13 6) | OPC_ADDUH_QB_DSP, +OPC_SUBUH_QB = (0x01 6) | OPC_ADDUH_QB_DSP, +OPC_SUBUH_R_QB = (0x03 6) |
Re: [Qemu-devel] [Spice-devel] seamless migration with spice
Hi, On 03/11/2012 02:16 PM, Yonit Halperin wrote: Hi, We would like to implement seamless migration for Spice, i.e., keeping the currently opened spice client session valid after migration. Today, the spice client establishes the connection to the destination before migration starts, and when migration completes, the client's session is moved to the destination, but all the session data is being reset. We face 2 main challenges when coming to implement seamless migration: snip (1) (2) In order to restore the source-client spice session in the destination, we need to pass data from the source to the destination. Example for such data: in flight copy paste data, in flight usb data We want to pass the data from the source spice server to the destination, via Spice client. This introduces a possible race: after migration completes, the source qemu can be killed before the spice-server completes transferring the migration data to the client. I don't understand why we must transfer this via the client, we should transfer this info using established qemu migration technologies, and we should transfer it directly from the source to the dest. Passing this through the client, means trusting the client which is crazy (from a security pov), the data passed is not always just data buffers often it contains state info. And transferring this through the client means opening a whole window of injection vulnerabilities, which can simply be avoided by sending the data directly. I know this has been discussed before and I was not involved in that discussion due to -ENOTIME, sorry about that. But just as the solution for sending the data directly from source to dest proposed then was nacked by various qemu people, I nack the send the data through the client solution. That one simply is not acceptable from a security pov. So we must re-think how we can send this data directly from source to dest, in a way which is acceptable in upstream qemu. Regards, Hans
Re: [Qemu-devel] QAPI conversion status and async commands support
At 03/08/2012 02:12 AM, Luiz Capitulino Wrote: On Wed, 07 Mar 2012 11:36:06 -0600 Anthony Liguori anth...@codemonkey.ws wrote: On 03/07/2012 11:29 AM, Paolo Bonzini wrote: Il 07/03/2012 17:36, Luiz Capitulino ha scritto: Hi there, In the last few weeks we've had some proposals for new QMP commands that need to be asynchronous. As we lack a standard asynchronous API today, each command ends up adding its own way to execute in the background. This multiplies the API complexity as each command has to be implemented and learned by clients separately, with their own way of doing more or less the same things. The solution for this, envisioned for us for a long time now, is to introduce an unified QMP API for asynchronous commands. But before doing this we have to: 1. Finish the commands conversion to the QAPI This is almost done, the only missing commands are: add_graphics_client, do_closefd, do_device_add, do_device_del, do_getfd, do_migrate, do_netdev_add, do_netdev_del, do_qmp_capabilities and do_screen_dump. Note that do_migrate has already been posted to the list, and I have the screendump more or less done. Also, Anthony has an old branch where most of the conversions are already done, they just need to be rebased tested. 2. Integrate the new QAPI server Implemented by Anthony, may have missing pieces. 3. Implement async command support I think the missing commands to be converted can be done in around one week, but unfortunately I've been busy at other things and will need a few days to resume this work. Then there's the new QAPI server async support, which I'm not sure how much time we'll need to integrate them, but we should have this done for 1.1. The main question is: what should we do for the already posted async commands? Should we hold them until we finish this work? I think yes, and we could even have a list of features without which 1.1 should not ship. QOM buses, drive mirroring and QAPI async command support may be them. Perhaps qtest too. Okay, let's get serious about what we can and can't do. Hard freeze for 1.1 is May 1st which is roughly 6 weeks from now. I think QOM buses can go in no problem along with qtest. I would be okay considering QOM buses a release blocker but probably not qtest. I'm not really sure about drive mirroring. Is the work already done such that we just need to talk about merging it? With QAPI async command, I don't think 1.1 is a viable target. We're not just talking about converting existing commands to QAPI, but also replacing the QMP server infrastructure. I don't think that is a change that should be made at the tail end of the development cycle. I will have to agree with this, specially because I think the new server is going to generate some fun discussions because iirc it has some novel concepts. And those discussions burn a lot of time. On top of that we also have the async API to design. I have some ideas in mind, but again, design review cycles take some time. Having said that, I wonder if postponing want-to-be-async commands to 1.2 is a good thing. Two solutions other than waiting: 1. Add synchronous versions of them, if possible (I think this is possible for the dump command) I agree with this. libvirt uses migrate command to do dump now, and it cannot work when the guest uses host's device. We need to fix this bug as quickly as possible. So we need dump command recently. So I think we can add synchronous versions of dump command. Is it OK? If it is OK, I will send the synchronous versions of dump command. Thanks Wen Congyang 2. Accept the ad-hoc async mechanism they introduce Btw, probably needless to say but, I'd greatly appreciate and encourage anyone who wants to join this effort and work on the conversions, new QMP server, etc...
Re: [Qemu-devel] [PATCH v2 1/2] block: add the support to drain throttled requests
On Mon, Mar 12, 2012 at 3:27 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 12/03/2012 07:29, zwu.ker...@gmail.com ha scritto: From: Zhi Yong Wu wu...@linux.vnet.ibm.com Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com --- block.c | 21 + block_int.h | 1 + 2 files changed, 22 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 52ffe14..0825168 100644 --- a/block.c +++ b/block.c @@ -853,6 +853,21 @@ void bdrv_close_all(void) } } +/** + * Complete all pending requests for a block device + */ +void bdrv_drain(BlockDriverState *bs) +{ + do { + qemu_co_queue_restart_all(bs-throttled_reqs); + } while (!qemu_co_queue_empty(bs-throttled_reqs)); + + qemu_aio_flush(); This doesn't work, qemu_aio_flush can start new I/O. Do you mean that it will start next I/O via the current request's cb? if no, where will it start new I/O? Paolo -- Regards, Zhi Yong Wu
Re: [Qemu-devel] [PATCH v2 1/2] block: add the support to drain throttled requests
Il 12/03/2012 09:42, Zhi Yong Wu ha scritto: This doesn't work, qemu_aio_flush can start new I/O. Do you mean that it will start next I/O via the current request's cb? Yes. Paolo
Re: [Qemu-devel] [Spice-devel] seamless migration with spice
Hi, On 03/12/2012 08:57 AM, Gerd Hoffmann wrote: Hi, What about the second part? it's independant of the async issue. Isn't this a client problem? The client has this state, no? It is state of the client- server session. Today spice client creates a new session on migration, so there is simply no need to maintain any state. Drawback is that everything needs to be resent from the server to the client. Thats why we want be able to continue the spice session, so the client caches will stay valid. Of course the spice-server on the migration target needs the session state for that, i.e. know for example which bits the client has cached which it hasn't. If the state is stored in the server, wouldn't it be marshaled as part of the server's migration state? spice-server is stateless today when it comes to migration. QXL handles all (device) state, by keeping track of some commands (such as create/destroy surface) which it needs to transfer on migration, and by asking spice-server to render all surfaces on migration, which effectively flushes the spice server state to qxl device memory. To transfer the client session state there are basically two options: (a) transfer it as part of the qemu migration data stream. I don't want have any details about the qemu migration implementation and/or protocol in the spice-server library api, which basically leaves a ugly transfer-this-blob-for-me-please style interface as only option. (b) transfer it as part of the spice protocol. As the spice client has a connection to both source and target while the migration runs we can send session state from the source host via spice client to the target host. This needs some form of synchronization, to make sure both vmstate and spice migration are completed when qemu on the source machine quits. The problem with (b) is, that iirc the way b was implemented in the past was still the big blob approach, but then pass the blob through the client, which means an evil client could modify it, causing all sorts of interesting behavior inside spice-server. Since we're re-implementing this to me the send a blob through the client approach is simply not acceptable from a security pov, also see my previous mail in this thread. I think (b) is the better approach. I disagree. Note that there is more info to send over then just which surfaces / images are cached by the client. There also is things like partial complete agent channel messages, including how much bytes must be read to complete the command, etc. IMHO (b) would only be acceptable if the data send through the client stops becoming a blob. Instead the client could simply send a list of all surface ids, etc. which it has cached after it connects to / starts using the new host. Note that the old hosts needs to send nothing for this, this is info the client already has, also removing the need for synchronization. As for certain other data, such as (but not limited to) partially parsed agent messages, these should be send through the regular vmstate methods IMHO. So I see 2 options 1) Do (a), sending everything that way 2) Do (a) sending non client state that way; and let the client send state like which surfaces it has cached when the new session starts. Regards, Hans
Re: [Qemu-devel] [PATCH v1 01/12] net: Add a hub net client
On Fri, Mar 9, 2012 at 6:33 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 09/03/2012 10:00, zwu.ker...@gmail.com ha scritto: + /* TODO use qemu_send_packet() or need to call *_deliver_* directly? */ + /* TODO ignore return value? */ + qemu_send_packet(port-nc, buf, len); Did you see my message at http://permalink.gmane.org/gmane.comp.emulators.qemu/134998 Yeah, i ever saw this, but read it just now. In your comments, there're some places where i have not completely understand. This means VLANs will wait for all receivers to be ready and then do N-1 synchronous sends. Your code will queue N-1 asynchronous sends. However, then I noticed that qemu_can_send_packet is not called very much, and I do not understand why qemu_net_queue_send and qemu_net_queue_send_iov do not call qemu_can_send_packet before calling deliver/deliver_iov. If they did, hubs could then do their own flow control via can_receive. When qemu_send_packet returns zero you increment a count of in-flight packets, and a sent-packet callback would decrement the same count. sent-packet callback is sent_cb here? i noticed that sent_cb is currently NULL. When the count is non-zero, can_receive returns false (and vice versa). Can you elaborate this? since can_receive is called finally by qemu_send_packet, how can can_receive return false or true based on the count? The sent_cb also needs to call qemu_flush_queued_packets when the count drop to zero. When sent_cb is called, it means that the packet has been sent out, right? With this in place, I think the other TODO about the return value is easily solved; receive/receive_iov callbacks can simply return immediate success, and later block further sends. Due to the separate per-port send queues, when many sends are blocking many ports might linearize the same packet multiple times in qemu_net_queue_append_iov. The limit however is packet size * number of I think that when one packet need to be enqueued, we should at first determine if it has existed in send queue. ports, which is not more than a few KB; it's more of a performance problem and VLANs/hubs should only be used when performance doesn't matter (as with the dump client). BTW, an additional cleanup that you can do on top is to remove the deliver/deliver_iov function pointers in net/queue.c and qemu_new_net_queue, because they will always be qemu_deliver_packet and qemu_deliver_packet_iov. ? Paolo -- Regards, Zhi Yong Wu
Re: [Qemu-devel] [PATCH] virtio-serial-bus: use correct lengths in control_out() message
On (Sun) 11 Mar 2012 [17:52:59], Michael Tokarev wrote: In case of more than one control message, the code will use size of the largest message so far for all subsequent messages, instead of using size of current one. Fix it. Makes sense. How did you detect this? Any reproducible test-case? One comment below. diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c index e22940e..abe48ec 100644 --- a/hw/virtio-serial-bus.c +++ b/hw/virtio-serial-bus.c @@ -451,28 +451,28 @@ static void control_out(VirtIODevice *vdev, VirtQueue *vq) vser = DO_UPCAST(VirtIOSerial, vdev, vdev); len = 0; buf = NULL; while (virtqueue_pop(vq, elem)) { -size_t cur_len, copied; +size_t cur_len; cur_len = iov_size(elem.out_sg, elem.out_num); /* * Allocate a new buf only if we didn't have one previously or * if the size of the buf differs */ if (cur_len len) { g_free(buf); buf = g_malloc(cur_len); len = cur_len; } -copied = iov_to_buf(elem.out_sg, elem.out_num, buf, 0, len); +iov_to_buf(elem.out_sg, elem.out_num, buf, 0, cur_len); Why drop 'copied'? I don't think we have had a situation where copied can be less than cur_len, and in any case we don't do anything special as a recovery mechanism, but a warning message or an abort in case copied != cur_len should work, I think. -handle_control_message(vser, buf, copied); +handle_control_message(vser, buf, cur_len); virtqueue_push(vq, elem, 0); } g_free(buf); virtio_notify(vdev, vq); } Thanks, Amit
Re: [Qemu-devel] [PATCH 0/2 v3] kvm: notify host when guest panicked
At 03/09/2012 09:21 AM, Wen Congyang Wrote: At 03/08/2012 07:13 PM, Avi Kivity Wrote: On 03/08/2012 09:57 AM, Wen Congyang wrote: We can know the guest is paniced when the guest runs on xen. But we do not have such feature on kvm. Another purpose of this feature is: management app(for example: libvirt) can do auto dump when the guest is crashed. If management app does not do auto dump, the guest's user can do dump by hand if he sees the guest is paniced. I touch the hypervisor instead of using virtio-serial, because 1. it is simple 2. the virtio-serial is an optional device, and the guest may not have such device. Changes from v2 to v3: 1. correct spelling Changes from v1 to v2: 1. split up host and guest-side changes 2. introduce new request flag to avoid changing return values. I see no Documentation/ changes. Do you have any other comments about this patch? What shoude be writen into Documentation/? I read the documents under Documentation/virtual/kvm, not all KVM_EXIT_* are writen in api.txt. In this patch set, I introduce a new exit_reason: KVM_EXIT_GUEST_PANICKED, but I donot know where I should write. What about this? Thanks Wen Congyang Thanks Wen Congyang -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[Qemu-devel] qemu in domain of xen
hi there guys recently, i install a qemu in the domian U of xen , the OS of the domain U is debian squeeze . i want to make use of the qemu to install a WINDOWS XP in the domain U, but when i create it , the interface is staying in the processing of formatting the disk. the system of the domain U is likely halted, so i have to restart the domain U. can the qemu be used to create another virtual machine in the virtual machine of the domain U in xen . every one expericed it what i said in front. regards
[Qemu-devel] tuning parameters qed
Hi all i have a kvm machine and i use qed format as the disk image format I want to improve the I/O performance of my machine can anyone tell other then cluster_size and cache what are the tunining parameters for qed so that i can improve the performance of my VM ? -- *Pankaj Rawat*
Re: [Qemu-devel] [PATCH v1 01/12] net: Add a hub net client
Il 12/03/2012 09:59, Zhi Yong Wu ha scritto: However, then I noticed that qemu_can_send_packet is not called very much, and I do not understand why qemu_net_queue_send and qemu_net_queue_send_iov do not call qemu_can_send_packet before calling deliver/deliver_iov. If they did, hubs could then do their own flow control via can_receive. When qemu_send_packet returns zero you increment a count of in-flight packets, and a sent-packet callback would decrement the same count. sent-packet callback is sent_cb here? i noticed that sent_cb is currently NULL. Yes. When the count is non-zero, can_receive returns false (and vice versa). Can you elaborate this? since can_receive is called finally by qemu_send_packet, how can can_receive return false or true based on the count? Based on counts from the previous sends. Paolo
Re: [Qemu-devel] [PATCH] virtio-serial-bus: use correct lengths in control_out() message
On 12.03.2012 12:59, Amit Shah wrote: On (Sun) 11 Mar 2012 [17:52:59], Michael Tokarev wrote: In case of more than one control message, the code will use size of the largest message so far for all subsequent messages, instead of using size of current one. Fix it. Makes sense. How did you detect this? Any reproducible test-case? There's no test-case, and no detection, just reading the code. Actually, I think, there's no bug here, but a very, well, difficult to read code. One comment below. and the answer is below, too. diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c index e22940e..abe48ec 100644 --- a/hw/virtio-serial-bus.c +++ b/hw/virtio-serial-bus.c @@ -451,28 +451,28 @@ static void control_out(VirtIODevice *vdev, VirtQueue *vq) vser = DO_UPCAST(VirtIOSerial, vdev, vdev); len = 0; buf = NULL; while (virtqueue_pop(vq, elem)) { -size_t cur_len, copied; +size_t cur_len; cur_len = iov_size(elem.out_sg, elem.out_num); /* * Allocate a new buf only if we didn't have one previously or * if the size of the buf differs */ if (cur_len len) { g_free(buf); buf = g_malloc(cur_len); len = cur_len; } -copied = iov_to_buf(elem.out_sg, elem.out_num, buf, 0, len); +iov_to_buf(elem.out_sg, elem.out_num, buf, 0, cur_len); Why drop 'copied'? I don't think we have had a situation where copied can be less than cur_len, and in any case we don't do anything special as a recovery mechanism, but a warning message or an abort in case copied != cur_len should work, I think. In this case, copied was _always_ == cur_len. That's why there's actually no bug. See: cur_len = iov_size(elem.out_sg, elem.out_num); len = max(cur_len, buflen) = roughly copied = iov_to_buf(elem.out_sg, elem.out_num, buf, 0, len); iov_to_buf() will stop copying when it reaches end of buf (which is len bytes long) or end of iov, which is cur_len bytes long. Obviously in all cases it will be cur_len. But it is obvious only when you write it one near another and _think_. And the reason for this confusion is the introduction of this `copied' variable, which shouldn't be there in the first place. It is like doing, for a memcpy-like function: void *memdup(const void *src, size_t size) { char *dest = malloc(size+1); size_t copied = copybytes(dest, src, size+1); if (copied != size+1) { /* What?? */ } return dest; } The only sane thing here, I think, is to drop 'copied', to stop any possible confusion :) -handle_control_message(vser, buf, copied); +handle_control_message(vser, buf, cur_len); virtqueue_push(vq, elem, 0); } g_free(buf); virtio_notify(vdev, vq); } Please belive me, I realized that the original code is actually right only after re-reading your reply. And please note that even you, the author, don't understand what it is doing :) So I think the patch is correct still ;) Thanks! /mjt
Re: [Qemu-devel] Support for Nested Paging
On Sat, Mar 10, 2012 at 04:44:27PM -0500, Ankur Agrawal wrote: I am a graduate student at Stony Brook University and am working on design and implementation of hypervisors for OSCAR lab ( http://oscar.cs.stonybrook.edu/). Currently I am working on implementing emulation of Nested Page Tables in QEMU as present in AMD-V architectures. I would like to know if the QEMU team will be interested in having a patch which emulates the Nested Page Table and other hardware virtualization techniques supported by AMD-V or Intel-VT atchitectures. I would love to help in maintenance of my patch or any other issues in the QEMU in future as well. I would also like to know if there is any chance that this can become a part of Google Summer of Code 2012. Yes, you can propose your own project idea. The key thing is to find a mentor who is willing to work with you. You can do that on this mailing list and #qemu IRC. I don't know the state of i386 TCG but I suggest checking if there are prerequisites for vmx/svm that are missing from QEMU. You may find the i386 TCG needs some improvements before it becomes possible to work on vmx/svm itself. I don't know the answer but you'll need to check this to see if your project idea is viable. Stefan
Re: [Qemu-devel] Support for Nested Paging
On Sun, Mar 04, 2012 at 01:28:04AM +0800, 陳韋任 wrote: Also I am trying to understand the QEMU source with an objective of participating in the Google Summer of Code and contributing to QEMU. I have tried tracing through the code but seems this link http://repo.or.cz/w/qemu/stefanha.git/blob_plain/refs/heads/tracing:/docs/tracing.txt is not updated because many of the options do not work here. I would very happy if someone could provide me links to a good starting point to understand QEMU source code. The tracing you mentioned is not tend to help reading the code. Depends on which part of QEMU you're trying to play with, you have some background knowledge of it. I think the only thing missing is that you need to use: ./configure --enable-trace-backend=backend The docs still say ./configure --trace-backend=backend. I am sending a pull request that includes a patch to update this. Stefan
Re: [Qemu-devel] [PATCH 1/2] Support @documentencoding in scripts/texi2pod.pl
On Sun, Mar 11, 2012 at 01:56:41PM +0400, Michael Tokarev wrote: Ping? It's been more than a month since this patch has been posted. Maybe it is a good candidate for -trivial queue? One thing I don't understand is where you actually use @documentencoding? I don't see it in you 2 patches. Stefan
[Qemu-devel] [PATCH 1/2] trace-events: Rename 'next' argument
'next' is a systemtap keyword, so it's a bad idea to use it as an argument name. Signed-off-by: Kevin Wolf kw...@redhat.com --- trace-events |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/trace-events b/trace-events index 606d903..2a96353 100644 --- a/trace-events +++ b/trace-events @@ -530,7 +530,7 @@ qemu_coroutine_terminate(void *co) self %p # qemu-coroutine-lock.c qemu_co_queue_next_bh(void) -qemu_co_queue_next(void *next) next %p +qemu_co_queue_next(void *nxt) next %p qemu_co_mutex_lock_entry(void *mutex, void *self) mutex %p self %p qemu_co_mutex_lock_return(void *mutex, void *self) mutex %p self %p qemu_co_mutex_unlock_entry(void *mutex, void *self) mutex %p self %p -- 1.7.6.5
[Qemu-devel] [PATCH 2/2] tracetool: Forbid argument name 'next'
It has happened more than once that patches that look perfectly sane and work with simpletrace broke systemtap because they use 'next' as an argument name for a tracing function. However, 'next' is a keyword for systemtap, so we shouldn't use it. Signed-off-by: Kevin Wolf kw...@redhat.com --- scripts/tracetool |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/scripts/tracetool b/scripts/tracetool index 4c9951d..f892af4 100755 --- a/scripts/tracetool +++ b/scripts/tracetool @@ -81,6 +81,10 @@ get_args() args=${1#*\(} args=${args%%\)*} echo $args + +if (echo $args | grep [ *]next\($\|[, ]\) /dev/null 21); then +echo -e \n#error 'next' is a bad argument name (clash with systemtap keyword)\n +fi } # Get the argument name list of a trace event -- 1.7.6.5
Re: [Qemu-devel] [RFC PATCH 14/17] block: support FALLOC_FL_PUNCH_HOLE trimming
Il 09/03/2012 21:36, Richard Laager ha scritto: fallocate() only supports files. In my patch, I was fstat()ing the fd just after opening it and caching an is_device boolean. Then, when doing discards, if !is_device, call fallocate(), else (i.e. for devices) do the following (note: untested code): __uint64_t range[2]; range[0] = sector_num 9; range[1] = (__uint64_t)nb_sectors 9; if (ioctl(s-fd, BLKDISCARD, range) errno != EOPNOTSUPP) { return -errno; } return 0; You can instead put this code in a separate function hdev_discard. hdev device is used instead of file when the backend is a special file (block or character). sync requirements I'm not sure if fallocate() and/or BLKDISCARD always guarantee that the discard has made it to stable storage. If they don't, does O_DIRECT or O_DSYNC on open() cause them to make any such guarantee? O_DIRECT shouldn't have any effect; for O_DSYNC I don't think so. Thin vs. Thick Provisioning On Fri, 2012-03-09 at 00:35 -0800, Chris Wedgwood wrote: Simplest still compare the blocks allocated by the file to it's length (ie. stat.st_blocks != stat.st_size9). I thought of this as well. It covers 99% of the cases, but there's one case where it breaks down. Imagine I have a sparse file backing my virtual disk. In the guest, I fill the virtual disk 100%. Then, I restart QEMU. Now it thinks that sparse file is non-sparse and stops issuing hole punches. I would not really have any problem with that, it seems like a relatively minor case (also because newer guests will allocate the first partition at 1 MB, so you might still have a small hole at the beginning). To be completely correct, I suggest the following behavior: 1. Add a discard boolean option to the disk layer. 2. If discard is not specified: * For files, detect a true/false value by comparing stat.st_blocks != stat.st_size9. * For devices, assume a fixed value (true?). 3. If discard is true, issue discards. 4. If discard is false, do not issue discards. The problem is, who will use this interface? Paolo
[Qemu-devel] [PATCH] acpi: beginnings of piix acpi interface doc
Before we start tweaking and enhancing hardware, I think it makes sense to document what we currently have, to make sure we stay compatible. This documents the hotplug interface for piix. Stubs for cpu hotplug, PM. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- docs/acpi.txt | 32 1 files changed, 32 insertions(+), 0 deletions(-) create mode 100644 docs/acpi.txt diff --git a/docs/acpi.txt b/docs/acpi.txt new file mode 100644 index 000..4938d48 --- /dev/null +++ b/docs/acpi.txt @@ -0,0 +1,32 @@ +QEMU exposes the following registers to guests, +intended primarily for use by the ACPI interface. + +PCI Hotplug + + +Events use the standard GPE register: +GPE 0xafe0 - an ACPI GPE register + +Hotplug events set GPE bit 1 (mask 0x2) + +The following registers are used for PCI hotplug. +Each register is 32 bit (4 bytes) long, and has little endian format. +Bits 0-31 in each register correspond to slots 0-31 on the root bus, +respectively. + +UP 0xae00 - RO - Bit set by host on device insertion (note:existing implementations + trigger device check event) +DOWN 0xae04 - RO - Bit set by host on user eject request +EJ0 0xae08 - WO - Bit set by guest removes all power to device +RMV 0xae0c - RO - Bit set by host if slot supports hotplug + (can not change while guest is up) + + +Power management + +TODO + + +CPU hotplug + +TODO -- 1.7.9.111.gf3fb0
Re: [Qemu-devel] [PATCH RFC v4 44/44] qom: Introduce CPU class
On 03/10/2012 03:28 AM, Andreas Färber wrote: Reintroduce CPUState as QOM object: It's abstract and derived directly from TYPE_OBJECT for compatibility with the user emulators. The identifier CPUState avoids conflicts between CPU() and the struct. Introduce $(qom-twice-y) to build it separately for system and for user emulators. Prepare a virtual reset method, (re)introduce cpu_reset() as wrapper. Signed-off-by: Andreas Färberafaer...@suse.de Cc: Anthony Liguorianth...@codemonkey.ws --- Makefile.objs |3 ++ configure |1 + include/qemu/cpu.h | 75 qom/Makefile |1 + qom/cpu.c | 58 5 files changed, 138 insertions(+), 0 deletions(-) create mode 100644 include/qemu/cpu.h create mode 100644 qom/cpu.c diff --git a/Makefile.objs b/Makefile.objs index 431b7a1..291baf5 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -14,6 +14,7 @@ universal-obj-y += $(qobject-obj-y) # QOM include $(SRC_PATH)/qom/Makefile qom-obj-y = $(addprefix qom/, $(qom-y)) +qom-obj-twice-y = $(addprefix qom/, $(qom-twice-y)) universal-obj-y += $(qom-obj-y) @@ -89,6 +90,7 @@ fsdev-obj-$(CONFIG_VIRTFS) += $(addprefix fsdev/, $(fsdev-nested-y)) common-obj-y = $(block-obj-y) blockdev.o common-obj-y += $(net-obj-y) +common-obj-y += $(qom-obj-twice-y) common-obj-$(CONFIG_LINUX) += $(fsdev-obj-$(CONFIG_LINUX)) common-obj-y += readline.o console.o cursor.o common-obj-y += $(oslib-obj-y) @@ -194,6 +196,7 @@ user-obj-y += cutils.o cache-utils.o user-obj-y += module.o user-obj-y += qemu-user.o user-obj-y += $(trace-obj-y) +user-obj-y += $(qom-obj-twice-y) ## # libhw diff --git a/configure b/configure index 66a65d9..1826af5 100755 --- a/configure +++ b/configure @@ -3888,6 +3888,7 @@ fi d=libuser mkdir -p $d mkdir -p $d/trace +mkdir -p $d/qom symlink $source_path/Makefile.user $d/Makefile if test $docs = yes ; then diff --git a/include/qemu/cpu.h b/include/qemu/cpu.h new file mode 100644 index 000..4291279 --- /dev/null +++ b/include/qemu/cpu.h @@ -0,0 +1,75 @@ +/* + * QEMU CPU model + * + * Copyright (c) 2012 SUSE LINUX Products GmbH + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see + *http://www.gnu.org/licenses/gpl-2.0.html + */ +#ifndef QEMU_CPU_H +#define QEMU_CPU_H + +#include qemu/object.h + +/** + * SECTION:cpu + * @section_id: QEMU-cpu + * @title: CPU Class + * @short_description: Base class for all CPUs + */ + +#define TYPE_CPU cpu + +#define CPU(obj) OBJECT_CHECK(CPUState, (obj), TYPE_CPU) +#define CPU_CLASS(class) OBJECT_CLASS_CHECK(CPUClass, (class), TYPE_CPU) +#define CPU_GET_CLASS(obj) OBJECT_GET_CLASS(CPUClass, (obj), TYPE_CPU) + +typedef struct CPUState CPUState; + +/** + * CPUClass: + * @reset: Callback to reset the #CPU to its initial state. + * + * Represents a CPU family or model. + */ +typedef struct CPUClass { +/* private*/ +ObjectClass parent_class; +/* public*/ + +void (*reset)(CPUState *cpu); Why not use Object* as argument here? It will be easier to generalize later qdev code and not make special case when adding cpus. BTW how we are going to generalize qdev and make its infrastructure available to other types except of DEVICE_TYPE. Maybe we should introduce some (abstract) base class (or interface) for basic device that will define methods like reset, realize, unrealize and use it in qdev code instead of DEVICE_TYPE? +} CPUClass; + +/** + * CPUState: + * + * State of one CPU core or thread. + */ +struct CPUState { +/* private*/ +Object parent_obj; +/* public*/ + +/* TODO Move common fields from CPUState here. */ +}; + + +/** + * cpu_reset: + * @cpu: The CPU whose state is to be reset. + */ +void cpu_reset(CPUState *cpu); + + +#endif diff --git a/qom/Makefile b/qom/Makefile index 885a263..34c6de5 100644 --- a/qom/Makefile +++ b/qom/Makefile @@ -1 +1,2 @@ qom-y = object.o container.o qom-qobject.o +qom-twice-y = cpu.o diff --git a/qom/cpu.c b/qom/cpu.c new file mode 100644 index 000..5b36046 --- /dev/null +++ b/qom/cpu.c @@ -0,0 +1,58 @@ +/* + * QEMU CPU model + * + * Copyright (c) 2012 SUSE LINUX Products GmbH + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + *
Re: [Qemu-devel] KVM call agenda for tuesday 31
On 02/21/2012 07:33 PM, Peter Maydell wrote: Short summary: * switch wp groups to bitfield rather than int array * convert sd.c to use memory_region_init_ram() to allocate the wp groups (being careful to use memory_region_set_dirty() when we touch them) * we don't need variable-length fields for sd.c any more * rest of the vmstate conversion is straightforward After a conversion to bitfield wp_groups consumes 2048 bytes for 32 GB SD image, do we really need live section for this? -- Mitsyanko Igor ASWG, Moscow RD center, Samsung Electronics email: i.mitsya...@samsung.com
Re: [Qemu-devel] [Spice-devel] seamless migration with spice
Hi, The problem with (b) is, that iirc the way b was implemented in the past was still the big blob approach, but then pass the blob through the client, which means an evil client could modify it, causing all sorts of interesting behavior inside spice-server. Since we're re-implementing this to me the send a blob through the client approach is simply not acceptable from a security pov, also see my previous mail in this thread. Agree. It should be a normal spice message which goes through the spice marshaller for parsing sanity checking. I disagree. Note that there is more info to send over then just which surfaces / images are cached by the client. There also is things like partial complete agent channel messages, including how much bytes must be read to complete the command, etc. Is there a complete list of the session state we need to save? IMHO (b) would only be acceptable if the data send through the client stops becoming a blob. Using (a) to send a blob isn't better. Instead the client could simply send a list of all surface ids, etc. which it has cached after it connects to / starts using the new host. Note that the old hosts needs to send nothing for this, this is info the client already has, also removing the need for synchronization. Yes, some session state is known to the client anyway so we don't need a source - target channel for them. As for certain other data, such as (but not limited to) partially parsed agent messages, these should be send through the regular vmstate methods IMHO. That isn't easy to handle via vmstate, at least as soon as this goes beyond a fixed number of fields aka 'migrate over this struct for me'. Think multiple spice clients connected at the same time. 1) Do (a), sending everything that way 2) Do (a) sending non client state that way; and let the client send state like which surfaces it has cached when the new session starts. I think we have to look at each piece of state information needed by the target and look how to handle this best. cheers, Gerd
Re: [Qemu-devel] tuning parameters qed
On Mon, Mar 12, 2012 at 9:12 AM, PANKAJ RAWAT pankajr...@gmail.com wrote: Hi all i have a kvm machine and i use qed format as the disk image format I want to improve the I/O performance of my machine can anyone tell other then cluster_size and cache what are the tunining parameters for qed so that i can improve the performance of my VM ? Cluster and L1/L2 table size are the only QED-specific parameters. What to tune really depends on what bottleneck you are seeing. You need to do disk I/O benchmarks to find that out. If you are generally looking for good settings, I suggest leaving the QED parameters at their default and running with -drive if=virtio,cache=none,aio=native,file=path/to/image.qed with a recent QEMU and host kernel. Stefan
Re: [Qemu-devel] [Spice-devel] seamless migration with spice
On Mon, Mar 12, 2012 at 10:46:44AM +0100, Gerd Hoffmann wrote: Hi, The problem with (b) is, that iirc the way b was implemented in the past was still the big blob approach, but then pass the blob through the client, which means an evil client could modify it, causing all sorts of interesting behavior inside spice-server. Since we're re-implementing this to me the send a blob through the client approach is simply not acceptable from a security pov, also see my previous mail in this thread. Agree. It should be a normal spice message which goes through the spice marshaller for parsing sanity checking. I disagree. Note that there is more info to send over then just which surfaces / images are cached by the client. There also is things like partial complete agent channel messages, including how much bytes must be read to complete the command, etc. Is there a complete list of the session state we need to save? IMHO (b) would only be acceptable if the data send through the client stops becoming a blob. Using (a) to send a blob isn't better. Actually, we discussed this in the past and one benefit is that network between source and target qemu would be fast (otherwise migration wouldn't work in the first place), as opposed to source-client and client-dest. Additionally you save one transaction. Instead the client could simply send a list of all surface ids, etc. which it has cached after it connects to / starts using the new host. Note that the old hosts needs to send nothing for this, this is info the client already has, also removing the need for synchronization. Yes, some session state is known to the client anyway so we don't need a source - target channel for them. As for certain other data, such as (but not limited to) partially parsed agent messages, these should be send through the regular vmstate methods IMHO. That isn't easy to handle via vmstate, at least as soon as this goes beyond a fixed number of fields aka 'migrate over this struct for me'. Think multiple spice clients connected at the same time. Migrate this struct n times for me. 1) Do (a), sending everything that way 2) Do (a) sending non client state that way; and let the client send state like which surfaces it has cached when the new session starts. I think we have to look at each piece of state information needed by the target and look how to handle this best. cheers, Gerd
Re: [Qemu-devel] change the default value of timeout
On Mon, Mar 12, 2012 at 3:17 AM, Zhang, Yang Z yang.z.zh...@intel.com wrote: Currently, if not using nonblocking mode, the default timeout of select() in main_loop_wait is 1000ms. There has no problem if you run few VMs. But when running more VMs like 32 or 64, then the problem is coming. Our experience shows that when running 64 idle VMs, the pkg C6 residency is 88% by default and it goes to 90% when I change timeout to 10s. And 2% means 1 watt in my box. Since this is only a timeout value for select, so I suggest to use a more reasonable way to set it rather than using a fixed value. For example, using an argument to set it by user. But I am not sure whether something else is depend on the timeout. It's not obvious to me why we have a finite select(2) timeout in main-loop.c:main_loop_wait(). State changes should use qemu_notify_event() to leave the blocking select(2) call. A user-visible argument doesn't seem like a solution to me. How should a user choose a value? How about testing with an infinite timeout (timeout == NULL)? Try both KVM and TCG to see if it has weird effects. Stefan
Re: [Qemu-devel] [PATCH 1/2] Support @documentencoding in scripts/texi2pod.pl
On 12.03.2012 12:21, Stefan Hajnoczi wrote: On Sun, Mar 11, 2012 at 01:56:41PM +0400, Michael Tokarev wrote: Ping? It's been more than a month since this patch has been posted. Maybe it is a good candidate for -trivial queue? One thing I don't understand is where you actually use @documentencoding? I don't see it in you 2 patches. It is already used in the existing docs, that's the whole point ;) Thanks, /mjt
Re: [Qemu-devel] [PATCH] Tracing documentation changes
On Sun, Mar 11, 2012 at 4:13 PM, Peter Teoh htmldevelo...@gmail.com wrote: I was trying out the tracing feature of QEMU after checking out the git tree at git://git.qemu.org/qemu.git, and managed to generate some traces, but the following are the changes needed to the documentation, in order to successfully generate the tracing. Some comments: qemu-system-i386 was used because the present git tree does not generate any qemu binary at all. Comments? Thanks for the documentation update. Please see http://wiki.qemu.org/Contribute/SubmitAPatch. Although i386-specific command-lines are helpful to someone using QEMU for the first time, they are subject to change so I'd rather not include them. The ./configure --enable-trace-backend fix has been submitted and I will send a pull request shortly. The scripts/simpletrace.py update is useful - please submit that. Stefan
Re: [Qemu-devel] [PATCH v2 1/5] qxl: switch qxl.c to trace-events
On 03/11/12 20:26, Alon Levy wrote: dprint is still used for qxl_init_common one time prints. I think we shouldn't simply convert the dprintf's into trace-points. We should look at each dprintf and check whenever it makes sense at all, whenever it makes sense at that place before converting it over to a tracepoint. @@ -409,7 +410,7 @@ static void interface_attach_worker(QXLInstance *sin, QXLWorker *qxl_worker) { PCIQXLDevice *qxl = container_of(sin, PCIQXLDevice, ssd.qxl); -dprint(qxl, 1, %s:\n, __FUNCTION__); +trace_qxl_interface_attach_worker(); qxl-ssd.worker = qxl_worker; } For example: Do we really need that one? @@ -505,9 +506,10 @@ static int interface_get_command(QXLInstance *sin, struct QXLCommandExt *ext) QXLCommand *cmd; int notify, ret; +trace_qxl_interface_get_command_enter(qxl_mode_to_string(qxl-mode)); + Why this? -dprint(qxl, 1, %s: scheduling update_area_bh, #dirty %d\n, - __func__, qxl-num_dirty_rects); + trace_qxl_interface_update_area_complete_schedule_bh(qxl-num_dirty_rects); I think it makes more sense to have the tracepoint in the bottom half handler instead. static void qxl_hard_reset(PCIQXLDevice *d, int loadvm) { -dprint(d, 1, %s: start%s\n, __FUNCTION__, - loadvm ? (loadvm) : ); +trace_qxl_hard_reset_enter(loadvm); qxl_spice_reset_cursor(d); qxl_spice_reset_image_cache(d); @@ -934,7 +935,7 @@ static void qxl_hard_reset(PCIQXLDevice *d, int loadvm) qemu_spice_create_host_memslot(d-ssd); qxl_soft_reset(d); -dprint(d, 1, %s: done\n, __FUNCTION__); +trace_qxl_hard_reset_exit(); } Do we need the exit tracepoint? static void qxl_reset_memslots(PCIQXLDevice *d) { -dprint(d, 1, %s:\n, __FUNCTION__); +trace_qxl_reset_memslots(); qxl_spice_reset_memslots(d); memset(d-guest_slots, 0, sizeof(d-guest_slots)); } Do we need that one? qxl_hard_reset is the only caller of that function ... @@ -1216,8 +1213,8 @@ static void ioport_write(void *opaque, target_phys_addr_t addr, if (d-mode != QXL_MODE_VGA) { break; } -dprint(d, 1, %s: unexpected port 0x%x (%s) in vga mode\n, -__func__, io_port, io_port_to_string(io_port)); +trace_qxl_io_unexpected_vga_mode( +io_port, io_port_to_string(io_port)); We might want raise an error irq here, and have a tracepoint in qxl_guest_bug() of course ... case QXL_IO_SET_MODE: -dprint(d, 1, QXL_SET_MODE %d\n, (int)val); +trace_qxl_io_set_mode(val); qxl_set_mode(d, val, 0); Needed? There is a tracepoint in qxl_set_mode() ... case QXL_IO_RESET: -dprint(d, 1, QXL_IO_RESET\n); +trace_qxl_io_reset(); qxl_hard_reset(d, 0); ... likewise ... break; case QXL_IO_MEMSLOT_ADD: @@ -1337,7 +1334,7 @@ async_common: async); goto cancel_async; } -dprint(d, 1, QXL_IO_CREATE_PRIMARY async=%d\n, async); +trace_qxl_io_create_primary(async); d-guest_primary.surface = d-ram-create_surface; qxl_create_guest_primary(d, 0, async); ... here too ... We might want to have a trace_qxl_io_write(addr, val) at the start of the function, so we see all guest writes. Traces for the actual ops (if needed at all) are probably much better placed into the functions executing the op as they can trace more details (i.e. qxl_set_mode has the tracepoint *after* looking up the mode so we can stick the mode info into the trace too). cheers, Gerd
Re: [Qemu-devel] [PATCH] ide: Adds model=s option, allowing the user to override the default disk model name QEMU HARDDISK
On Sat, Mar 10, 2012 at 7:56 PM, Floris Bos b...@je-eigen-domein.nl wrote: @@ -1885,6 +1885,22 @@ int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind, snprintf(s-drive_serial_str, sizeof(s-drive_serial_str), QM%05d, s-drive_serial); } + if (model) { + strncpy(s-drive_model_str, model, sizeof(s-drive_model_str)); strncpy(3) does not NUL-terminate if the max length is reached. Either you need to use pstrcpy() or specify sizeof(s-drive_model_str) - 1 and make sure s-drive_model_str[40] = '\0'. @@ -146,6 +155,9 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind) if (!dev-serial) { dev-serial = g_strdup(s-drive_serial_str); } + if (!dev-model) { + dev-model = g_strdup(s-drive_model_str); + } Seems this will never be freed but dev-serial has the same issue, so this isn't new. Stefan
Re: [Qemu-devel] [PATCH] maintainers: Add docs/tracing.txt to Tracing
On Sat, Mar 10, 2012 at 12:37 PM, Andreas Färber afaer...@suse.de wrote: The topic of whether and by whom docs/tracing.txt is maintained was brought up. It currently does not have an official maintainer. Add it to the tracing section so that Stefan gets cc'ed on patches. Signed-off-by: Andreas Färber afaer...@suse.de Cc: Stefan Hajnoczi stefa...@linux.vnet.ibm.com Cc: Peter Maydell peter.mayd...@linaro.org --- MAINTAINERS | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Re: [Qemu-devel] [Spice-devel] seamless migration with spice
Hi, As for certain other data, such as (but not limited to) partially parsed agent messages, these should be send through the regular vmstate methods IMHO. That isn't easy to handle via vmstate, at least as soon as this goes beyond a fixed number of fields aka 'migrate over this struct for me'. Think multiple spice clients connected at the same time. Migrate this struct n times for me. I think for the agent case this isn't needed. Or is every client allowed to speak to the agent in case of multiple clients connected? I somehow doubt this can work as the agent protocol can't multicast ... cheers, Gerd
Re: [Qemu-devel] [PATCH 0/2 v3] kvm: notify host when guest panicked
On 03/09/2012 03:21 AM, Wen Congyang wrote: Changes from v2 to v3: 1. correct spelling Changes from v1 to v2: 1. split up host and guest-side changes 2. introduce new request flag to avoid changing return values. I see no Documentation/ changes. What shoude be writen into Documentation/? I read the documents under Documentation/virtual/kvm, not all KVM_EXIT_* are writen in api.txt. In this patch set, I introduce a new exit_reason: KVM_EXIT_GUEST_PANICKED, but I donot know where I should write. Describe the exit (why it happens). Also document the new hypercall. If there is no good place for it, start a new file. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 0/2 v3] kvm: notify host when guest panicked
On 03/12/2012 11:04 AM, Wen Congyang wrote: Do you have any other comments about this patch? Not really, but I'm not 100% convinced the patch is worthwhile. It's likely to only be used by Linux, which has kexec facilities, and you can put talk to management via virtio-serial and describe the crash in more details than a simple hypercall. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 1/2] Support @documentencoding in scripts/texi2pod.pl
On Mon, Mar 12, 2012 at 10:14 AM, Michael Tokarev m...@tls.msk.ru wrote: On 12.03.2012 12:21, Stefan Hajnoczi wrote: On Sun, Mar 11, 2012 at 01:56:41PM +0400, Michael Tokarev wrote: Ping? It's been more than a month since this patch has been posted. Maybe it is a good candidate for -trivial queue? One thing I don't understand is where you actually use @documentencoding? I don't see it in you 2 patches. It is already used in the existing docs, that's the whole point ;) Ah, it would have been good to explain that in the commit description :). Stefan
Re: [Qemu-devel] [PATCH v4 1/3] kvmclock: Always register type
On 03/10/2012 03:35 AM, Andreas Färber wrote: Am 05.03.2012 10:23, schrieb Avi Kivity: On 03/04/2012 10:32 PM, Andreas Färber wrote: Currently, the kvmclock type is only registered when kvm_enabled(). This breaks when moving type registration to before command line parsing (so that QOM types can be used for CPU and machine). Since the QOM classes are lazy-initialized anyway and kvmclock_create() has another kvm_enabled() check, simply drop the KVM check in kvmclock_register_types(). kvm-i8259, kvm-apic and kvm-ioapic do not suffer from such a check. Patch looks good. Ping for series. Avi, do you want to sign this patch off through uq/master? Or should I make the above a Reviewed-by (and remove the Cc:) within this series? Reviewed-by: please. -- error compiling committee.c: too many arguments to function
[Qemu-devel] Re : Re : Regression: more 0.12 regression (SeaBIOS related?)
On Wed, Mar 07, 2012 at 06:31:31AM -0800, Alain Ribière wrote: I ran qemu 1.0.1 and the latest SeaBIOS (from the git) with the following options : qemu-system-i386 -L git/bios -fda disk.img -no-fd-bootchk -boot a -m 16 Here is the log : https://docs.google.com/open?id=0B7mz0vq6Rpb7UE1ibjJDcEhTRWlNV050QnMyMWwtZw Here is the floppy disk image I used : https://docs.google.com/open?id=0B7mz0vq6Rpb7bHpYaEt2SnVUUi1KaWE3a3lBQUJpQQ The floppy disk is simply a C-DOS 720 Ko floppy created by format a: /s. So it's quite empty. Qemu doesn't crash or freeze. But I can just type a single character and the nothing else. But the system is still running (there is a clock at the bottom right of the screen). I tracked this down. Looks like the image takes over the PS2 irq and keyboard handling, but then occasionally calls into the BIOS. When it does call the BIOS irq handler (manually), it expects the irq handler to enable the keyboard. Weird. Anyway, the patch below fixes it for me. -Kevin Great ! It works for me too. Thanks a lot, Alain
Re: [Qemu-devel] [RFC PATCH 07/17] block: make high level discard operation always zero
On 03/09/2012 06:42 PM, Kevin Wolf wrote: While (3) can be worked around, the only way around the other two, unfortunately, is support in the formats and protocols. We will still provide device options to opt out of this, but with raw and qed covered (+ qcow2 without backing file, and qcow3 in the future) it should not be too bad. Can't qcow2 with a backing file also be supported? Zero out the first cluster, and remember it. The following discards can reuse this zero cluster, as long as it hasn't been overwritten. qcow2 can't handle clusters that are referenced twice from the same L1 table. This would require a reverse lookup to adjust the QCOW_O_COPIED flags in the L2 tables containing the other references. Don't follow, sorry. What adjustment are you talking about? If it's a 1-0 transition, is it mandatory to adjust the flag? That is, it it legal to have a refcount of exactly one, but have the flag clear? -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [RFC PATCH 07/17] block: make high level discard operation always zero
Am 12.03.2012 11:42, schrieb Avi Kivity: On 03/09/2012 06:42 PM, Kevin Wolf wrote: While (3) can be worked around, the only way around the other two, unfortunately, is support in the formats and protocols. We will still provide device options to opt out of this, but with raw and qed covered (+ qcow2 without backing file, and qcow3 in the future) it should not be too bad. Can't qcow2 with a backing file also be supported? Zero out the first cluster, and remember it. The following discards can reuse this zero cluster, as long as it hasn't been overwritten. qcow2 can't handle clusters that are referenced twice from the same L1 table. This would require a reverse lookup to adjust the QCOW_O_COPIED flags in the L2 tables containing the other references. Don't follow, sorry. What adjustment are you talking about? If it's a 1-0 transition, is it mandatory to adjust the flag? That is, it it legal to have a refcount of exactly one, but have the flag clear? According to the spec it's illegal and qemu-img check will complain, too. In practice, I'm not entirely sure if it will cause real corruption or just unnecessary COWs. I believe it may be the latter. But it's not worth the trouble anyway when we can have a real zero flag in qcow3. Kevin
Re: [Qemu-devel] [PATCH 2/2] tracetool: Forbid argument name 'next'
On Mon, Mar 12, 2012 at 9:34 AM, Kevin Wolf kw...@redhat.com wrote: It has happened more than once that patches that look perfectly sane and work with simpletrace broke systemtap because they use 'next' as an argument name for a tracing function. However, 'next' is a keyword for systemtap, so we shouldn't use it. Signed-off-by: Kevin Wolf kw...@redhat.com --- scripts/tracetool | 4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/scripts/tracetool b/scripts/tracetool index 4c9951d..f892af4 100755 --- a/scripts/tracetool +++ b/scripts/tracetool @@ -81,6 +81,10 @@ get_args() args=${1#*\(} args=${args%%\)*} echo $args + + if (echo $args | grep [ *]next\($\|[, ]\) /dev/null 21); then + echo -e \n#error 'next' is a bad argument name (clash with systemtap keyword)\n + fi Good idea, let's prevent it from being used. I don't think this is the way to do it because callers will parse stdout and we're not guaranteed to be generating C code where #error works. Instead, we can echo to stderr and do exit 1. Stefan
Re: [Qemu-devel] [PATCH 2/2] tracetool: Forbid argument name 'next'
Am 12.03.2012 12:01, schrieb Stefan Hajnoczi: On Mon, Mar 12, 2012 at 9:34 AM, Kevin Wolf kw...@redhat.com wrote: It has happened more than once that patches that look perfectly sane and work with simpletrace broke systemtap because they use 'next' as an argument name for a tracing function. However, 'next' is a keyword for systemtap, so we shouldn't use it. Signed-off-by: Kevin Wolf kw...@redhat.com --- scripts/tracetool |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/scripts/tracetool b/scripts/tracetool index 4c9951d..f892af4 100755 --- a/scripts/tracetool +++ b/scripts/tracetool @@ -81,6 +81,10 @@ get_args() args=${1#*\(} args=${args%%\)*} echo $args + +if (echo $args | grep [ *]next\($\|[, ]\) /dev/null 21); then +echo -e \n#error 'next' is a bad argument name (clash with systemtap keyword)\n +fi Good idea, let's prevent it from being used. I don't think this is the way to do it because callers will parse stdout and we're not guaranteed to be generating C code where #error works. Instead, we can echo to stderr and do exit 1. Yes, we may generate something else additionally. But trace.h is generated in any case and it will cause a compile error before any non-C file is used. I tried the 'exit 1' approach first and it doesn't really work. This function is called from a subshell, so you don't exit tracetool but just the one subshell and end up with a broken trace.h that will fail compilation in the same place, just with a less helpful error message. Kevin
Re: [Qemu-devel] [PATCH] virtio-serial-bus: use correct lengths in control_out() message
On (Mon) 12 Mar 2012 [13:22:22], Michael Tokarev wrote: On 12.03.2012 12:59, Amit Shah wrote: On (Sun) 11 Mar 2012 [17:52:59], Michael Tokarev wrote: In case of more than one control message, the code will use size of the largest message so far for all subsequent messages, instead of using size of current one. Fix it. Makes sense. How did you detect this? Any reproducible test-case? There's no test-case, and no detection, just reading the code. Actually, I think, there's no bug here, but a very, well, difficult to read code. Do you mean this code is difficult to read, or in general? Any ideas to make it simpler (or at least details on what's difficult?) diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c index e22940e..abe48ec 100644 --- a/hw/virtio-serial-bus.c +++ b/hw/virtio-serial-bus.c @@ -451,28 +451,28 @@ static void control_out(VirtIODevice *vdev, VirtQueue *vq) vser = DO_UPCAST(VirtIOSerial, vdev, vdev); len = 0; buf = NULL; while (virtqueue_pop(vq, elem)) { -size_t cur_len, copied; +size_t cur_len; cur_len = iov_size(elem.out_sg, elem.out_num); /* * Allocate a new buf only if we didn't have one previously or * if the size of the buf differs */ if (cur_len len) { g_free(buf); buf = g_malloc(cur_len); len = cur_len; } -copied = iov_to_buf(elem.out_sg, elem.out_num, buf, 0, len); +iov_to_buf(elem.out_sg, elem.out_num, buf, 0, cur_len); Why drop 'copied'? I don't think we have had a situation where copied can be less than cur_len, and in any case we don't do anything special as a recovery mechanism, but a warning message or an abort in case copied != cur_len should work, I think. In this case, copied was _always_ == cur_len. That's why there's actually no bug. See: cur_len = iov_size(elem.out_sg, elem.out_num); len = max(cur_len, buflen) = roughly copied = iov_to_buf(elem.out_sg, elem.out_num, buf, 0, len); iov_to_buf() will stop copying when it reaches end of buf (which is len bytes long) or end of iov, which is cur_len bytes long. Obviously in all cases it will be cur_len. But it is obvious only when you write it one near another and _think_. And the reason for this confusion is the introduction of this `copied' variable, which shouldn't be there in the first place. It is like doing, for a memcpy-like function: void *memdup(const void *src, size_t size) { char *dest = malloc(size+1); size_t copied = copybytes(dest, src, size+1); if (copied != size+1) { /* What?? */ } return dest; } The only sane thing here, I think, is to drop 'copied', to stop any possible confusion :) -handle_control_message(vser, buf, copied); +handle_control_message(vser, buf, cur_len); virtqueue_push(vq, elem, 0); } g_free(buf); virtio_notify(vdev, vq); } Please belive me, I realized that the original code is actually right only after re-reading your reply. Heh. And please note that even you, the author, don't understand what it is doing :) Of course, I can't claim to remember everything, I sometimes don't even remember stuff *while* coding it. However, I do understand this part of the code, what I meant above was iov_to_buf() could fail or copy lesser than what was asked of it. It's a memcpy() right now, it could change to something else. Ignoring return values, esp. in copy functions, is not good style, even if you know it can't fail. So I think the patch is correct still ;) No doubt about that. I never said otherwise. I just feel we shouldn't ignore return values. Amit
Re: [Qemu-devel] [PATCH] acpi: beginnings of piix acpi interface doc
On Mon, Mar 12, 2012 at 11:35:29AM +0200, Michael S. Tsirkin wrote: Before we start tweaking and enhancing hardware, I think it makes sense to document what we currently have, to make sure we stay compatible. This documents the hotplug interface for piix. Stubs for cpu hotplug, PM. We already have docs/specs/acpi_pci_hotplug.txt, no? Signed-off-by: Michael S. Tsirkin m...@redhat.com --- docs/acpi.txt | 32 1 files changed, 32 insertions(+), 0 deletions(-) create mode 100644 docs/acpi.txt diff --git a/docs/acpi.txt b/docs/acpi.txt new file mode 100644 index 000..4938d48 --- /dev/null +++ b/docs/acpi.txt @@ -0,0 +1,32 @@ +QEMU exposes the following registers to guests, +intended primarily for use by the ACPI interface. + +PCI Hotplug + + +Events use the standard GPE register: +GPE 0xafe0 - an ACPI GPE register + +Hotplug events set GPE bit 1 (mask 0x2) + +The following registers are used for PCI hotplug. +Each register is 32 bit (4 bytes) long, and has little endian format. +Bits 0-31 in each register correspond to slots 0-31 on the root bus, +respectively. + +UP 0xae00 - RO - Bit set by host on device insertion (note:existing implementations + trigger device check event) +DOWN 0xae04 - RO - Bit set by host on user eject request +EJ0 0xae08 - WO - Bit set by guest removes all power to device +RMV 0xae0c - RO - Bit set by host if slot supports hotplug + (can not change while guest is up) + + +Power management + +TODO + + +CPU hotplug + +TODO -- 1.7.9.111.gf3fb0 -- Gleb.
Re: [Qemu-devel] [Spice-devel] seamless migration with spice
Hi, On 03/12/2012 10:46 AM, Gerd Hoffmann wrote: Hi, The problem with (b) is, that iirc the way b was implemented in the past was still the big blob approach, but then pass the blob through the client, which means an evil client could modify it, causing all sorts of interesting behavior inside spice-server. Since we're re-implementing this to me the send a blob through the client approach is simply not acceptable from a security pov, also see my previous mail in this thread. Agree. It should be a normal spice message which goes through the spice marshaller for parsing sanity checking. I disagree. Note that there is more info to send over then just which surfaces / images are cached by the client. There also is things like partial complete agent channel messages, including how much bytes must be read to complete the command, etc. Is there a complete list of the session state we need to save? There is still code in spice-server for the old seamless migration, someone could go through that and use that as an initial list of session state we need to save. IMHO (b) would only be acceptable if the data send through the client stops becoming a blob. Using (a) to send a blob isn't better. It has the distinct advantage that we can trust the contents of the blob which makes life significantly easier IMHO. Instead the client could simply send a list of all surface ids, etc. which it has cached after it connects to / starts using the new host. Note that the old hosts needs to send nothing for this, this is info the client already has, also removing the need for synchronization. Yes, some session state is known to the client anyway so we don't need a source- target channel for them. As for certain other data, such as (but not limited to) partially parsed agent messages, these should be send through the regular vmstate methods IMHO. That isn't easy to handle via vmstate, at least as soon as this goes beyond a fixed number of fields aka 'migrate over this struct for me'. Think multiple spice clients connected at the same time. 1) Do (a), sending everything that way 2) Do (a) sending non client state that way; and let the client send state like which surfaces it has cached when the new session starts. I think we have to look at each piece of state information needed by the target and look how to handle this best. Agreed, so for starts someone needs to make a list of all session state we need to save. Regards, Hans
Re: [Qemu-devel] [Spice-devel] seamless migration with spice
On Mon, Mar 12, 2012 at 11:26:50AM +0100, Gerd Hoffmann wrote: Hi, As for certain other data, such as (but not limited to) partially parsed agent messages, these should be send through the regular vmstate methods IMHO. That isn't easy to handle via vmstate, at least as soon as this goes beyond a fixed number of fields aka 'migrate over this struct for me'. Think multiple spice clients connected at the same time. Migrate this struct n times for me. I think for the agent case this isn't needed. Or is every client allowed to speak to the agent in case of multiple clients connected? I somehow doubt this can work as the agent protocol can't multicast ... Actually the agent protocol does extend nicely to multiple clients - I forgot the name but there is an additional wrapper between the client/server originating message and the guest received message, that is currently used for server or client originating messages, and can be reused to have multiple in flight different client messages. We don't use it for guest generated messages, but we could as well. Multicast would be another number. cheers, Gerd
Re: [Qemu-devel] [PATCH] ide: Adds model=s option, allowing the user to override the default disk model name QEMU HARDDISK
On 03/12/2012 11:26 AM, Stefan Hajnoczi wrote: On Sat, Mar 10, 2012 at 7:56 PM, Floris Bosb...@je-eigen-domein.nl wrote: @@ -1885,6 +1885,22 @@ int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind, snprintf(s-drive_serial_str, sizeof(s-drive_serial_str), QM%05d, s-drive_serial); } +if (model) { +strncpy(s-drive_model_str, model, sizeof(s-drive_model_str)); strncpy(3) does not NUL-terminate if the max length is reached. Either you need to use pstrcpy() or specify sizeof(s-drive_model_str) - 1 and make sure s-drive_model_str[40] = '\0'. Thanks for the feedback. Will change that line (and serial that used strncpy() as well) to pstrcpy(), correct the cosmetic issues mentioned by Andreas and submit a v2 patch. -- Yours sincerely, Floris Bos
Re: [Qemu-devel] Add GSoC project ideas to the wiki!
Am 24.02.2012 10:19, schrieb Stefan Hajnoczi: This is a reminder that QEMU will apply for Google Summer of Code 2012 and we need project ideas and mentors. Libvirt and kvm.ko projects are also welcome! http://wiki.qemu.org/Google_Summer_of_Code_2012 Please add yourself to the wiki now if you want to mentor a project this summer. I will file our application next week. Natalia, I saw that you copied a few items from last year's Wiki page to the current one. But did you check that all of them are still valid? For example, we do have EHCI and xHCI in the tree for quite a while now. I don't think there's enough left for a GSoC project with these. Kevin
Re: [Qemu-devel] [Spice-devel] seamless migration with spice
On 03/12/12 12:29, Alon Levy wrote: On Mon, Mar 12, 2012 at 11:26:50AM +0100, Gerd Hoffmann wrote: Hi, Migrate this struct n times for me. I think for the agent case this isn't needed. Or is every client allowed to speak to the agent in case of multiple clients connected? I somehow doubt this can work as the agent protocol can't multicast ... Actually the agent protocol does extend nicely to multiple clients - I forgot the name but there is an additional wrapper between the client/server originating message and the guest received message, that is currently used for server or client originating messages, and can be reused to have multiple in flight different client messages. I think you'll have issues in the layer above though. Two spice clients doing cut+paste operations at the same time? Two spice clients requesting different screen resolutions? cheers, Gerd
Re: [Qemu-devel] [PATCH] ide: Adds model=s option, allowing the user to override the default disk model name QEMU HARDDISK
Am 12.03.2012 12:30, schrieb Floris Bos / Maxnet: On 03/12/2012 11:26 AM, Stefan Hajnoczi wrote: On Sat, Mar 10, 2012 at 7:56 PM, Floris Bosb...@je-eigen-domein.nl wrote: @@ -1885,6 +1885,22 @@ int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind, snprintf(s-drive_serial_str, sizeof(s-drive_serial_str), QM%05d, s-drive_serial); } +if (model) { +strncpy(s-drive_model_str, model, sizeof(s-drive_model_str)); strncpy(3) does not NUL-terminate if the max length is reached. Either you need to use pstrcpy() or specify sizeof(s-drive_model_str) - 1 and make sure s-drive_model_str[40] = '\0'. Thanks for the feedback. Will change that line (and serial that used strncpy() as well) to pstrcpy(), correct the cosmetic issues mentioned by Andreas and submit a v2 patch. Fixing serial as well is a good idea. Please submit a separate patch for that, though, as it is an independent fix. Kevin
Re: [Qemu-devel] [PATCH] qemu-io: add option to enable tracing
On Wed, Dec 21, 2011 at 3:35 PM, Stefan Hajnoczi stefa...@linux.vnet.ibm.com wrote: Ping? Do you want to take this through the block tree? It can be useful to enable QEMU tracing when trying out block layer interfaces via qemu-io. Tracing can be enabled using the new -t FILE option where the given file contains a list of trace events to enable (just like the qemu --trace events=FILE option). $ echo qemu_vfree my-events $ ./qemu-io -t my-events ... Remember to use ./configure --enable-trace-backend=BACKEND when building qemu-io. Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- qemu-io.c | 8 +++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/qemu-io.c b/qemu-io.c index ffa62fb..ad91fd6 100644 --- a/qemu-io.c +++ b/qemu-io.c @@ -17,6 +17,7 @@ #include qemu-common.h #include block_int.h #include cmd.h +#include trace/control.h #define VERSION 0.0.1 @@ -1722,6 +1723,7 @@ static void usage(const char *name) -g, --growable allow file to grow (only applies to protocols)\n -m, --misalign misalign allocations for O_DIRECT\n -k, --native-aio use kernel AIO implementation (on Linux only)\n + -t, --trace FILE enable trace events listed in the given file\n -h, --help display this help and exit\n -V, --version output version information and exit\n \n, @@ -1733,7 +1735,7 @@ int main(int argc, char **argv) { int readonly = 0; int growable = 0; - const char *sopt = hVc:rsnmgk; + const char *sopt = hVc:rsnmgkt:; const struct option lopt[] = { { help, 0, NULL, 'h' }, { version, 0, NULL, 'V' }, @@ -1745,6 +1747,7 @@ int main(int argc, char **argv) { misalign, 0, NULL, 'm' }, { growable, 0, NULL, 'g' }, { native-aio, 0, NULL, 'k' }, + { trace, 1, NULL, 't' }, { NULL, 0, NULL, 0 } }; int c; @@ -1776,6 +1779,9 @@ int main(int argc, char **argv) case 'k': flags |= BDRV_O_NATIVE_AIO; break; + case 't': + trace_backend_init(optarg, NULL); + break; case 'V': printf(%s version %s\n, progname, VERSION); exit(0); -- 1.7.7.3
Re: [Qemu-devel] [PATCH v2 1/5] qxl: switch qxl.c to trace-events
On Mon, Mar 12, 2012 at 11:20:55AM +0100, Gerd Hoffmann wrote: On 03/11/12 20:26, Alon Levy wrote: dprint is still used for qxl_init_common one time prints. I think we shouldn't simply convert the dprintf's into trace-points. We should look at each dprintf and check whenever it makes sense at all, whenever it makes sense at that place before converting it over to a tracepoint. I had two changes of heart about this. Initially I started mechanically converting, then I realized it only makes sense for recurring events, and things we want to see come out of the same output. But then I noticed a lot of existing users do use it for as verbose usage as we do (bh calls) and it is not meant as a stable interface to anyone - clearly something that should fit the developer and user, and if the subsystem changes then the events can change. Bottom line I think having most of the dprints as trace_events makes sense, and we can use consistent naming to make enabling/disabling them easy for systemtap/stderr (with monitor trace-events command) easy. I only left very few dprints. @@ -409,7 +410,7 @@ static void interface_attach_worker(QXLInstance *sin, QXLWorker *qxl_worker) { PCIQXLDevice *qxl = container_of(sin, PCIQXLDevice, ssd.qxl); -dprint(qxl, 1, %s:\n, __FUNCTION__); +trace_qxl_interface_attach_worker(); qxl-ssd.worker = qxl_worker; } For example: Do we really need that one? I look at it the other way around - can it repeat? yes, it's a callback from the spice server which we don't control. does it serve the same purpose as the dprint? yes. @@ -505,9 +506,10 @@ static int interface_get_command(QXLInstance *sin, struct QXLCommandExt *ext) QXLCommand *cmd; int notify, ret; +trace_qxl_interface_get_command_enter(qxl_mode_to_string(qxl-mode)); + Why this? Simplyfication of the dprints to avoid introducing an additional trace event. We had a dprint for level 1 for both VGA and other modes, so I moved it up. This trace is for each request from the server, as opposed to the _ret that is for each returned command (much less frequent). -dprint(qxl, 1, %s: scheduling update_area_bh, #dirty %d\n, - __func__, qxl-num_dirty_rects); + trace_qxl_interface_update_area_complete_schedule_bh(qxl-num_dirty_rects); I think it makes more sense to have the tracepoint in the bottom half handler instead. Why instead? I could add another one at the bottom half. static void qxl_hard_reset(PCIQXLDevice *d, int loadvm) { -dprint(d, 1, %s: start%s\n, __FUNCTION__, - loadvm ? (loadvm) : ); +trace_qxl_hard_reset_enter(loadvm); qxl_spice_reset_cursor(d); qxl_spice_reset_image_cache(d); @@ -934,7 +935,7 @@ static void qxl_hard_reset(PCIQXLDevice *d, int loadvm) qemu_spice_create_host_memslot(d-ssd); qxl_soft_reset(d); -dprint(d, 1, %s: done\n, __FUNCTION__); +trace_qxl_hard_reset_exit(); } Do we need the exit tracepoint? With systemtap I'd say the whole function could be traced, and that would mean both entry and exit. But you can't do that with the tracing framework, so this is the only way to have both. If having both dprints makes no sense, I guess having both trace events makes none too. static void qxl_reset_memslots(PCIQXLDevice *d) { -dprint(d, 1, %s:\n, __FUNCTION__); +trace_qxl_reset_memslots(); qxl_spice_reset_memslots(d); memset(d-guest_slots, 0, sizeof(d-guest_slots)); } Do we need that one? qxl_hard_reset is the only caller of that function ... Agree, I'll drop it. But maybe I should add trace events for all the qxl_spice_* calls? @@ -1216,8 +1213,8 @@ static void ioport_write(void *opaque, target_phys_addr_t addr, if (d-mode != QXL_MODE_VGA) { break; } -dprint(d, 1, %s: unexpected port 0x%x (%s) in vga mode\n, -__func__, io_port, io_port_to_string(io_port)); +trace_qxl_io_unexpected_vga_mode( +io_port, io_port_to_string(io_port)); We might want raise an error irq here, and have a tracepoint in qxl_guest_bug() of course ... ok, I think I can add the tracepoint for qxl_guest_bug. Raise an error irq I'll do in another patch. case QXL_IO_SET_MODE: -dprint(d, 1, QXL_SET_MODE %d\n, (int)val); +trace_qxl_io_set_mode(val); qxl_set_mode(d, val, 0); Needed? There is a tracepoint in qxl_set_mode() ... But if qxl_set_mode can be called from multiple places it isn't equivalent. case QXL_IO_RESET: -dprint(d, 1, QXL_IO_RESET\n); +trace_qxl_io_reset(); qxl_hard_reset(d, 0); ... likewise ... Same answer. break; case QXL_IO_MEMSLOT_ADD: @@ -1337,7 +1334,7 @@ async_common: async); goto cancel_async; } -dprint(d, 1,
Re: [Qemu-devel] [PATCH] virtio-serial-bus: use correct lengths in control_out() message
On 12.03.2012 15:06, Amit Shah wrote: On (Mon) 12 Mar 2012 [13:22:22], Michael Tokarev wrote: On 12.03.2012 12:59, Amit Shah wrote: On (Sun) 11 Mar 2012 [17:52:59], Michael Tokarev wrote: In case of more than one control message, the code will use size of the largest message so far for all subsequent messages, instead of using size of current one. Fix it. Makes sense. How did you detect this? Any reproducible test-case? There's no test-case, and no detection, just reading the code. Actually, I think, there's no bug here, but a very, well, difficult to read code. Do you mean this code is difficult to read, or in general? Any ideas to make it simpler (or at least details on what's difficult?) Just difficult to understand, and just this particular (very minor!) place. We got one thing, we requested to copy another, and we handle 3rd which is something else. While actually we are supposed to get, request and handle the _same_, or else we're doomed. [] It's a memcpy() right now, it could change to something else. Ignoring return values, esp. in copy functions, is not good style, even if you know it can't fail. So that's the reason why the return value should be void, and that the code should always request as many bytes as it actually needs, and there must be some assert()s to ensure we're not outside of something. So I think the patch is correct still ;) No doubt about that. I never said otherwise. I just feel we shouldn't ignore return values. By _not_ ignoring return value in something like that is not far away from checking if 1 is still equal to 1 after each instruction :) I want to change this interface a bit, to be more obvious, to stop all similar discussions and doubts. It will handle the requested amount and will abort() internally if it can't, so we can stop bothering ignoring the return value, provided that we actually request what we _want_ it to do, not what we have. In this particular case, the 'size' argument of iov_from_buf() should be 'bytes' or 'len', -- actual amount of bytes we need to process, not size of the buffer we have in our disposal. (For the iov_* family, we've another set, qemu_iovec_*, and also qemu_sendv() Co, each of which have different and not obvious at all interface :) Thanks!
Re: [Qemu-devel] [PATCH] qemu-io: add option to enable tracing
Am 12.03.2012 12:39, schrieb Stefan Hajnoczi: On Wed, Dec 21, 2011 at 3:35 PM, Stefan Hajnoczi stefa...@linux.vnet.ibm.com wrote: Ping? Do you want to take this through the block tree? Sorry, I completely missed this one. It can be useful to enable QEMU tracing when trying out block layer interfaces via qemu-io. Tracing can be enabled using the new -t FILE option where the given file contains a list of trace events to enable (just like the qemu --trace events=FILE option). $ echo qemu_vfree my-events $ ./qemu-io -t my-events ... Remember to use ./configure --enable-trace-backend=BACKEND when building qemu-io. Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- qemu-io.c |8 +++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/qemu-io.c b/qemu-io.c index ffa62fb..ad91fd6 100644 --- a/qemu-io.c +++ b/qemu-io.c @@ -17,6 +17,7 @@ #include qemu-common.h #include block_int.h #include cmd.h +#include trace/control.h #define VERSION0.0.1 @@ -1722,6 +1723,7 @@ static void usage(const char *name) -g, --growable allow file to grow (only applies to protocols)\n -m, --misalign misalign allocations for O_DIRECT\n -k, --native-aio use kernel AIO implementation (on Linux only)\n + -t, --trace FILE enable trace events listed in the given file\n -h, --help display this help and exit\n -V, --versionoutput version information and exit\n \n, @@ -1733,7 +1735,7 @@ int main(int argc, char **argv) { int readonly = 0; int growable = 0; -const char *sopt = hVc:rsnmgk; +const char *sopt = hVc:rsnmgkt:; const struct option lopt[] = { { help, 0, NULL, 'h' }, { version, 0, NULL, 'V' }, @@ -1745,6 +1747,7 @@ int main(int argc, char **argv) { misalign, 0, NULL, 'm' }, { growable, 0, NULL, 'g' }, { native-aio, 0, NULL, 'k' }, +{ trace, 1, NULL, 't' }, { NULL, 0, NULL, 0 } }; int c; @@ -1776,6 +1779,9 @@ int main(int argc, char **argv) case 'k': flags |= BDRV_O_NATIVE_AIO; break; +case 't': +trace_backend_init(optarg, NULL); +break; vl.c checks the return value of trace_backend_init. Shouldn't we do the same here? Also, I was considering adding a -t for the cache mode (option name for consistency with qemu-img). Conversely, we'll probably want to add a tracing option to qemu-img and -t isn't available any more there. Maybe we should use a different letter? Kevin
Re: [Qemu-devel] [PATCH] ide: Adds model=s option, allowing the user to override the default disk model name QEMU HARDDISK
Am 10.03.2012 20:56, schrieb Floris Bos: Some Linux distributions use the /dev/disk/by-id/scsi-SATA_name-of-disk-model_serial addressing scheme when refering to partitions in /etc/fstab and elsewhere. This causes problems when starting a disk image taken from an existing physical server under qemu, because when running under qemu name-of-disk-model is always QEMU HARDDISK This patch introduces a model=s option which in combination with the existing serial=s option can be used to fake the disk the operating system was previously on, allowing the OS to boot properly. Cc: kw...@redhat.com Signed-off-by: Floris Bos d...@noc-ps.com Oh, and now that I look at the actual patch, do we really want to add the model=... option to -drive? I think just the qdev property may be enough. When you need to set the option, you would need to use -device, but it's not that hard. Kevin
Re: [Qemu-devel] [PATCH 2/2] memory: print aliased IO ranges in info mtree
On 03/11/2012 01:00 PM, Blue Swirl wrote: Print also I/O ports behind bridges and other aliases. Signed-off-by: Blue Swirl blauwir...@gmail.com --- memory.c | 13 + 1 files changed, 13 insertions(+), 0 deletions(-) diff --git a/memory.c b/memory.c index 4c3dc49..0201392 100644 --- a/memory.c +++ b/memory.c @@ -1639,7 +1639,20 @@ void mtree_info(fprintf_function mon_printf, void *f) if (address_space_io.root !QTAILQ_EMPTY(address_space_io.root-subregions)) { QTAILQ_INIT(ml_head); + mon_printf(f, I/O\n); mtree_print_mr(mon_printf, f, address_space_io.root, 0, 0, ml_head); + +/* print aliased I/O regions */ +QTAILQ_FOREACH(ml, ml_head, queue) { +if (!ml-printed) { +mon_printf(f, %s\n, ml-mr-name); +mtree_print_mr(mon_printf, f, ml-mr, 0, 0, ml_head); +} +} +} + +QTAILQ_FOREACH_SAFE(ml, ml_head, queue, ml2) { +g_free(ml); } } This is just duplication. A little refactoring would eliminate it. Also, we might have an alias from the memory address space and an alias from the I/O address space pointing into the same region. So maybe alias printing should be consolidated for all address spaces. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [RFC PATCH 07/17] block: make high level discard operation always zero
On 03/12/2012 01:04 PM, Kevin Wolf wrote: qcow2 can't handle clusters that are referenced twice from the same L1 table. This would require a reverse lookup to adjust the QCOW_O_COPIED flags in the L2 tables containing the other references. Don't follow, sorry. What adjustment are you talking about? If it's a 1-0 transition, is it mandatory to adjust the flag? That is, it it legal to have a refcount of exactly one, but have the flag clear? According to the spec it's illegal and qemu-img check will complain, too. In practice, I'm not entirely sure if it will cause real corruption or just unnecessary COWs. I believe it may be the latter. We could retro-doc it but I'm not a huge fan of this practice. But it's not worth the trouble anyway when we can have a real zero flag in qcow3. ok. Thanks for the explanations. -- error compiling committee.c: too many arguments to function
[Qemu-devel] [PATCH v2] ide: Adds model=s option, allowing the user to override the default disk model name QEMU HARDDISK
Some Linux distributions use the /dev/disk/by-id/scsi-SATA_name-of-disk-model_serial addressing scheme when refering to partitions in /etc/fstab and elsewhere. This causes problems when starting a disk image taken from an existing physical server under qemu, because when running under qemu name-of-disk-model is always QEMU HARDDISK This patch introduces a model=s option which in combination with the existing serial=s option can be used to fake the disk the operating system was previously on, allowing the OS to boot properly. Cc: kw...@redhat.com Signed-off-by: Floris Bos d...@noc-ps.com --- blockdev.c|5 + blockdev.h|2 ++ hw/ide/core.c | 27 ++- hw/ide/internal.h |4 +++- hw/ide/qdev.c | 18 -- qemu-config.c |4 qemu-options.hx |4 +++- 7 files changed, 55 insertions(+), 9 deletions(-) diff --git a/blockdev.c b/blockdev.c index d78aa51..3df9cb9 100644 --- a/blockdev.c +++ b/blockdev.c @@ -277,6 +277,7 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) const char *file = NULL; char devname[128]; const char *serial; +const char *model; const char *mediastr = ; BlockInterfaceType type; enum { MEDIA_DISK, MEDIA_CDROM } media; @@ -313,6 +314,7 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) file = qemu_opt_get(opts, file); serial = qemu_opt_get(opts, serial); +model = qemu_opt_get(opts, model); if ((buf = qemu_opt_get(opts, if)) != NULL) { pstrcpy(devname, sizeof(devname), buf); @@ -534,6 +536,9 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) dinfo-refcount = 1; if (serial) strncpy(dinfo-serial, serial, sizeof(dinfo-serial) - 1); +if (model) { +pstrcpy(dinfo-model, sizeof(dinfo-model), model); +} QTAILQ_INSERT_TAIL(drives, dinfo, next); bdrv_set_on_error(dinfo-bdrv, on_read_error, on_write_error); diff --git a/blockdev.h b/blockdev.h index 260e16b..21eb4b5 100644 --- a/blockdev.h +++ b/blockdev.h @@ -18,6 +18,7 @@ void blockdev_mark_auto_del(BlockDriverState *bs); void blockdev_auto_del(BlockDriverState *bs); #define BLOCK_SERIAL_STRLEN 20 +#define BLOCK_MODEL_STRLEN 40 typedef enum { IF_DEFAULT = -1,/* for use with drive_add() only */ @@ -37,6 +38,7 @@ struct DriveInfo { int media_cd; QemuOpts *opts; char serial[BLOCK_SERIAL_STRLEN + 1]; +char model[BLOCK_MODEL_STRLEN + 1]; QTAILQ_ENTRY(DriveInfo) next; int refcount; }; diff --git a/hw/ide/core.c b/hw/ide/core.c index 4d568ac..7bd2810 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -101,7 +101,7 @@ static void ide_identify(IDEState *s) put_le16(p + 21, 512); /* cache size in sectors */ put_le16(p + 22, 4); /* ecc bytes */ padstr((char *)(p + 23), s-version, 8); /* firmware version */ -padstr((char *)(p + 27), QEMU HARDDISK, 40); /* model */ +padstr((char *)(p + 27), s-drive_model_str, 40); /* model */ #if MAX_MULT_SECTORS 1 put_le16(p + 47, 0x8000 | MAX_MULT_SECTORS); #endif @@ -189,7 +189,7 @@ static void ide_atapi_identify(IDEState *s) put_le16(p + 21, 512); /* cache size in sectors */ put_le16(p + 22, 4); /* ecc bytes */ padstr((char *)(p + 23), s-version, 8); /* firmware version */ -padstr((char *)(p + 27), QEMU DVD-ROM, 40); /* model */ +padstr((char *)(p + 27), s-drive_model_str, 40); /* model */ put_le16(p + 48, 1); /* dword I/O (XXX: should not be set on CDROM) */ #ifdef USE_DMA_CDROM put_le16(p + 49, 1 9 | 1 8); /* DMA and LBA supported */ @@ -246,7 +246,7 @@ static void ide_cfata_identify(IDEState *s) padstr((char *)(p + 10), s-drive_serial_str, 20); /* serial number */ put_le16(p + 22, 0x0004); /* ECC bytes */ padstr((char *) (p + 23), s-version, 8); /* Firmware Revision */ -padstr((char *) (p + 27), QEMU MICRODRIVE, 40);/* Model number */ +padstr((char *) (p + 27), s-drive_model_str, 40);/* Model number */ #if MAX_MULT_SECTORS 1 put_le16(p + 47, 0x8000 | MAX_MULT_SECTORS); #else @@ -1834,7 +1834,7 @@ static const BlockDevOps ide_cd_block_ops = { }; int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind, - const char *version, const char *serial) + const char *version, const char *serial, const char *model) { int cylinders, heads, secs; uint64_t nb_sectors; @@ -1885,6 +1885,22 @@ int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind, snprintf(s-drive_serial_str, sizeof(s-drive_serial_str), QM%05d, s-drive_serial); } +if (model) { +pstrcpy(s-drive_model_str, sizeof(s-drive_model_str), model); +} else { +switch (kind) { +case IDE_CD: +strcpy(s-drive_model_str, QEMU DVD-ROM); +break; +case IDE_CFATA: +strcpy(s-drive_model_str, QEMU
Re: [Qemu-devel] [PATCH] kvm: add flightrecorder script
On 03/09/2012 04:13 PM, Stefan Hajnoczi wrote: The kvm kernel module includes a number of trace events which can be useful when debugging system behavior. Even on production systems these trace events can be used to observe guest behavior and identify the source of problems. The kvm_flightrecorder script is a command-line wrapper for the /sys/kernel/debug/tracing interface. Kernel symbols do not need to be installed. This script captures a fixed-size buffer of KVM trace events. Recent events overwrite the oldest events when the buffer size is exceeded and it is possible to leave KVM tracing enabled for any period of time with just a fixed-size buffer. If the buffer is large enough this script is a useful tool for collecting detailed information after an issue occurs with a guest. Hence the name flight recorder. The script can also be used in 'tail' mode to simply view KVM trace events as they occur. This is handy for development and to ensure that the guest is indeed running. Have you considered updating trace-cmd or perf-something instead? While each of these tools are useful we miss out on combining capabilities. For example you could 'perf probe' a dynamic tracepoint and flightrecord it together with kvm tracepoints. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH RESEND v5 3/6] Set runstate to INMIGRATE earlier
On Wed, 7 Mar 2012, Luiz Capitulino wrote: On Tue, 28 Feb 2012 15:51:32 + Stefano Stabellini stefano.stabell...@eu.citrix.com wrote: Set runstate to RUN_STATE_INMIGRATE as soon as we can on resume. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Would be nice to explain why, but I suggested this myself so: Acked-by: Luiz Capitulino lcapitul...@redhat.com Thanks! It would be nice if somebody took the time to review patch #1 and #2 too considering that they have been sitting there for a while...
Re: [Qemu-devel] [PATCH] ide: Adds model=s option, allowing the user to override the default disk model name QEMU HARDDISK
On 03/12/2012 12:57 PM, Kevin Wolf wrote: Am 10.03.2012 20:56, schrieb Floris Bos: Some Linux distributions use the /dev/disk/by-id/scsi-SATA_name-of-disk-model_serial addressing scheme when refering to partitions in /etc/fstab and elsewhere. This causes problems when starting a disk image taken from an existing physical server under qemu, because when running under qemu name-of-disk-model is always QEMU HARDDISK This patch introduces a model=s option which in combination with the existing serial=s option can be used to fake the disk the operating system was previously on, allowing the OS to boot properly. Cc: kw...@redhat.com Signed-off-by: Floris Bosd...@noc-ps.com Oh, and now that I look at the actual patch, do we really want to add the model=... option to -drive? I think just the qdev property may be enough. When you need to set the option, you would need to use -device, but it's not that hard. Well, to me it makes sense to put it at the same places serial is, as both options have quite similar functions: faking drive attributes. -- Yours sincerely, Floris Bos
Re: [Qemu-devel] [PATCH v1 01/12] net: Add a hub net client
On Mon, Mar 12, 2012 at 5:12 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 12/03/2012 09:59, Zhi Yong Wu ha scritto: However, then I noticed that qemu_can_send_packet is not called very much, and I do not understand why qemu_net_queue_send and qemu_net_queue_send_iov do not call qemu_can_send_packet before calling deliver/deliver_iov. If they did, hubs could then do their own flow control via can_receive. When qemu_send_packet returns zero you increment a count of in-flight packets, and a sent-packet callback would decrement the same count. sent-packet callback is sent_cb here? i noticed that sent_cb is currently NULL. Yes. When the count is non-zero, can_receive returns false (and vice versa). Can you elaborate this? since can_receive is called finally by qemu_send_packet, how can can_receive return false or true based on the count? Based on counts from the previous sends. thanks. I agree with your opition, but hope to get stefan's comments. Paolo -- Regards, Zhi Yong Wu
Re: [Qemu-devel] [PATCH 2/2] tracetool: Forbid argument name 'next'
Stefan Hajnoczi writes: On Mon, Mar 12, 2012 at 9:34 AM, Kevin Wolf kw...@redhat.com wrote: It has happened more than once that patches that look perfectly sane and work with simpletrace broke systemtap because they use 'next' as an argument name for a tracing function. However, 'next' is a keyword for systemtap, so we shouldn't use it. Signed-off-by: Kevin Wolf kw...@redhat.com --- scripts/tracetool | 4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/scripts/tracetool b/scripts/tracetool index 4c9951d..f892af4 100755 --- a/scripts/tracetool +++ b/scripts/tracetool @@ -81,6 +81,10 @@ get_args() args=${1#*\(} args=${args%%\)*} echo $args + + if (echo $args | grep [ *]next\($\|[, ]\) /dev/null 21); then + echo -e \n#error 'next' is a bad argument name (clash with systemtap keyword)\n + fi Good idea, let's prevent it from being used. I don't think this is the way to do it because callers will parse stdout and we're not guaranteed to be generating C code where #error works. Instead, we can echo to stderr and do exit 1. I'd rather wait for the python version of tracetool to be integrated, so that less patches have to be rebased. In addition, there's a nice 'error' routine to handle this type of cases. Lluis -- And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer. -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth
Re: [Qemu-devel] [Spice-devel] seamless migration with spice
Hi, Is there a complete list of the session state we need to save? There is still code in spice-server for the old seamless migration, someone could go through that and use that as an initial list of session state we need to save. That doesn't help much as it is _way_ too old. Predates surfaces wan support which needs additional state. Predates smardcard and usb too. Also agent didn't have bulky stuff (cut+paste) back then, so chances are not bad that just not saving any agent state works in 99.99% of the cases just fine, so I'm not sure this is handled at all. Also some bits are are not needed any more. Transfering the ticket has been offloaded that to management. QXL device handles some bits which used to be transfered with the spice-server state (mouse pointer shape). Agreed, so for starts someone needs to make a list of all session state we need to save. Here is what I'm aware of: * session id (needed when sending state via vmstate, I think we don't need it when sending state via spice client). * surfaces known to the client (can also be negotiated between client and target directly). * surface state (lossy vs. lossless quality if wan support is enabled). Dunno whenever the client knows this. * glz compression dictionary state (not sure what exactly is transfered here and why). * vmchannel state (agent, smartcard, usb). agent is tricky because spice-server needs to maintain state there because of the message multiplexing. A fixed number of fields and maybe a VD_AGENT_MAX_DATA_SIZE-sized buffer could work for that though. smartcard + usb: This is just pass-through for spice-server, right? There shouldn't be anything to save, except maybe for stuff buffered in spice-server. Is there any? I mean really in spice-server, migrating spice-qemu-char.c buffers via vmstate is not a big issue. cheers, Gerd
Re: [Qemu-devel] [RFC PATCH 06/17] block: use bdrv_{co, aio}_discard for write_zeroes operations
Il 10/03/2012 19:02, Richard Laager ha scritto: I propose adding the following behaviors in any event: * If a QEMU block device reports a discard_granularity 0, it must be equal to 2^n (n = 0), or QEMU's block core will change it to 0. (Non-power-of-two granularities are not likely to exist in the real world, and this assumption greatly simplifies ensuring correctness.) Yeah, I was considering this to be simply a bug in the block device. * For SCSI, report an unmap_granularity to the guest as follows: max(logical_block_size, discard_granularity) / logical_block_size This is more or less already in place later in the series. As a design concept, instead of guaranteeing that 512B zero'ing discards are supported, I think the QEMU block layer should instead guarantee aligned discards to QEMU block devices, emulating any misaligned discards (or portions thereof) by writing zeroes if (and only if) discard_zeros_data is set. Yes, this can be done of course. This series does not include it yet. This leaves one remaining issue: In raw-posix.c, for files (i.e. not devices), I assume you're going to advertise discard_granularity=1 and discard_zeros_data=1 when compiled with support for fallocate(FALLOC_FL_PUNCH_HOLE). Note, I'm assuming fallocate() actually guarantees that it zeros the data when punching holes. It does, that's pretty much the definition of a hole. If the guest does a big discard (think mkfs) and fallocate() returns EOPNOTSUPP, you'll have to zero essentially the whole virtual disk, which, as you noted, will also allocate it (unless you explicitly check for holes). This is bad. It can be avoided by not advertising discard_zeros_data, but as you noted, that's unfortunate. If you have a new kernel that supports SEEK_HOLE/SEEK_DATA, it can also be done by skipping the zero write on known holes. This could even be done at the block layer level using bdrv_is_allocated. If we could probe for FALLOC_FL_PUNCH_HOLE support, then we could avoid advertising discard support based on FALLOC_FL_PUNCH_HOLE when it is not going to work. This would side step these problems. ... and introduce others when migrating if your datacenter doesn't have homogeneous kernel versions and/or filesystems. :( You said it wasn't possible to probe for FALLOC_FL_PUNCH_HOLE. Have you considered probing by extending the file by one byte and then punching that: char buf = 0; fstat(s-fd, st); pwrite(s-fd, buf, 1, st.st_size + 1); has_discard = !fallocate(s-fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, st.st_size + 1, 1); ftruncate(s-fd, st.st_size); Nice trick. :) Yes, that could work. Do you know if non-Linux operating systems have something similar to BLKDISCARDZEROES? Paolo
Re: [Qemu-devel] [Spice-devel] seamless migration with spice
On Mon, Mar 12, 2012 at 12:34:42PM +0100, Gerd Hoffmann wrote: On 03/12/12 12:29, Alon Levy wrote: On Mon, Mar 12, 2012 at 11:26:50AM +0100, Gerd Hoffmann wrote: Hi, Migrate this struct n times for me. I think for the agent case this isn't needed. Or is every client allowed to speak to the agent in case of multiple clients connected? I somehow doubt this can work as the agent protocol can't multicast ... Actually the agent protocol does extend nicely to multiple clients - I forgot the name but there is an additional wrapper between the client/server originating message and the guest received message, that is currently used for server or client originating messages, and can be reused to have multiple in flight different client messages. I think you'll have issues in the layer above though. Two spice clients doing cut+paste operations at the same time? Two spice clients requesting different screen resolutions? Yeah, you're right of course, this needs to be dealt with somehow. cut+paste: maps nicely to a number of different buffers. Would need some policy, and the session agent becomes closer to a buffer manager. resolutions: again policy, perhaps have a master client, or if none defined let the last or just the first choose. Not sure. But these issues don't need to be solved now, do they? cheers, Gerd ___ Spice-devel mailing list spice-de...@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/spice-devel
[Qemu-devel] [PATCH 7/7] vga: add trace event for ppm_save
From: Alon Levy al...@redhat.com Signed-off-by: Alon Levy al...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- hw/vga.c |2 ++ trace-events |3 +++ 2 files changed, 5 insertions(+), 0 deletions(-) diff --git a/hw/vga.c b/hw/vga.c index 5994f43..6dc98f6 100644 --- a/hw/vga.c +++ b/hw/vga.c @@ -30,6 +30,7 @@ #include pixel_ops.h #include qemu-timer.h #include xen.h +#include trace.h //#define DEBUG_VGA //#define DEBUG_VGA_MEM @@ -2372,6 +2373,7 @@ int ppm_save(const char *filename, struct DisplaySurface *ds) int ret; char *linebuf, *pbuf; +trace_ppm_save(filename, ds); f = fopen(filename, wb); if (!f) return -1; diff --git a/trace-events b/trace-events index 94c4a6f..dfe28ed 100644 --- a/trace-events +++ b/trace-events @@ -662,3 +662,6 @@ dma_map_wait(void *dbs) dbs=%p # console.h displaysurface_free(void *display_state, void *display_surface) state=%p surface=%p displaysurface_resize(void *display_state, void *display_surface, int width, int height) state=%p surface=%p %dx%d + +# vga.c +ppm_save(const char *filename, void *display_surface) %s surface=%p -- 1.7.9.1
[Qemu-devel] [PATCH 3/7] trace: make trace_thread_create() use its function arg
From: Jun Koi junkoi2...@gmail.com This patch makes trace_thread_create() to use its function arg to initialize thread. The other choice is to make this a function to use void arg, but i prefer this way. Signed-off-by: Jun Koi junkoi2...@gmail.com Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- trace/simple.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/trace/simple.c b/trace/simple.c index bbc9930..33ae486 100644 --- a/trace/simple.c +++ b/trace/simple.c @@ -363,7 +363,7 @@ static GThread *trace_thread_create(GThreadFunc fn) sigfillset(set); pthread_sigmask(SIG_SETMASK, set, oldset); #endif -thread = g_thread_create(writeout_thread, NULL, FALSE, NULL); +thread = g_thread_create(fn, NULL, FALSE, NULL); #ifndef _WIN32 pthread_sigmask(SIG_SETMASK, oldset, NULL); #endif -- 1.7.9.1
Re: [Qemu-devel] [Spice-devel] seamless migration with spice
Hans de Goede píše v Po 12. 03. 2012 v 09:51 +0100: Hi, On 03/12/2012 08:57 AM, Gerd Hoffmann wrote: Hi, What about the second part? it's independant of the async issue. Isn't this a client problem? The client has this state, no? It is state of the client- server session. Today spice client creates a new session on migration, so there is simply no need to maintain any state. Drawback is that everything needs to be resent from the server to the client. Thats why we want be able to continue the spice session, so the client caches will stay valid. Of course the spice-server on the migration target needs the session state for that, i.e. know for example which bits the client has cached which it hasn't. If the state is stored in the server, wouldn't it be marshaled as part of the server's migration state? spice-server is stateless today when it comes to migration. QXL handles all (device) state, by keeping track of some commands (such as create/destroy surface) which it needs to transfer on migration, and by asking spice-server to render all surfaces on migration, which effectively flushes the spice server state to qxl device memory. To transfer the client session state there are basically two options: (a) transfer it as part of the qemu migration data stream. I don't want have any details about the qemu migration implementation and/or protocol in the spice-server library api, which basically leaves a ugly transfer-this-blob-for-me-please style interface as only option. (b) transfer it as part of the spice protocol. As the spice client has a connection to both source and target while the migration runs we can send session state from the source host via spice client to the target host. This needs some form of synchronization, to make sure both vmstate and spice migration are completed when qemu on the source machine quits. The problem with (b) is, that iirc the way b was implemented in the past was still the big blob approach, but then pass the blob through the client, which means an evil client could modify it, causing all sorts of interesting behavior inside spice-server. Since we're re-implementing this to me the send a blob through the client approach is simply not acceptable from a security pov, also see my previous mail in this thread. In addition to security POV, it's also bad from network usage POV - while network connection among hosts is gigabit or better, client may be connected over high-latency low-bandwidth WAN. Sending any data through client makes absolutely no sense in such cases. David I think (b) is the better approach. I disagree. Note that there is more info to send over then just which surfaces / images are cached by the client. There also is things like partial complete agent channel messages, including how much bytes must be read to complete the command, etc. IMHO (b) would only be acceptable if the data send through the client stops becoming a blob. Instead the client could simply send a list of all surface ids, etc. which it has cached after it connects to / starts using the new host. Note that the old hosts needs to send nothing for this, this is info the client already has, also removing the need for synchronization. As for certain other data, such as (but not limited to) partially parsed agent messages, these should be send through the regular vmstate methods IMHO. So I see 2 options 1) Do (a), sending everything that way 2) Do (a) sending non client state that way; and let the client send state like which surfaces it has cached when the new session starts. Regards, Hans ___ Spice-devel mailing list spice-de...@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/spice-devel -- David Jaša, RHCE SPICE QE based in Brno GPG Key: 22C33E24 Fingerprint: 513A 060B D1B4 2A72 7F0D 0278 B125 CD00 22C3 3E24
[Qemu-devel] KVM call agenda for Tuesday 13
Hi Please send in any agenda items you are interested in covering. Cheers, Juan.
Re: [Qemu-devel] [PATCH RESEND v5 1/6] cirrus_vga: do not reset videoram
On 02/28/2012 05:51 PM, Stefano Stabellini wrote: There is no need to set the videoram to 0xff in cirrus_reset, because it is the BIOS' job. Reviewed-by: Avi Kivity a...@redhat.com -- error compiling committee.c: too many arguments to function
[Qemu-devel] [PATCH 4/7] docs: correct ./configure line in tracing.txt
From: Jun Koi junkoi2...@gmail.com This patch corrects the configure's trace option in docs/tracing.txt. Signed-off-by: Jun Koi junkoi2...@gmail.com Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- docs/tracing.txt |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/docs/tracing.txt b/docs/tracing.txt index a92716f..c541133 100644 --- a/docs/tracing.txt +++ b/docs/tracing.txt @@ -9,7 +9,7 @@ for debugging, profiling, and observing execution. 1. Build with the 'simple' trace backend: -./configure --trace-backend=simple +./configure --enable-trace-backend=simple make 2. Create a file with the events you want to trace: -- 1.7.9.1
Re: [Qemu-devel] [Spice-devel] seamless migration with spice
On 03/12/2012 11:46 AM, Gerd Hoffmann wrote: Hi, The problem with (b) is, that iirc the way b was implemented in the past was still the big blob approach, but then pass the blob through the client, which means an evil client could modify it, causing all sorts of interesting behavior inside spice-server. Since we're re-implementing this to me the send a blob through the client approach is simply not acceptable from a security pov, also see my previous mail in this thread. Agree. It should be a normal spice message which goes through the spice marshaller for parsing sanity checking. I disagree. Note that there is more info to send over then just which surfaces / images are cached by the client. There also is things like partial complete agent channel messages, including how much bytes must be read to complete the command, etc. Is there a complete list of the session state we need to save? IMHO (b) would only be acceptable if the data send through the client stops becoming a blob. Using (a) to send a blob isn't better. Gerd/Hans, Can you explain/exemplify, why sending data as a blob (either by (a) or (b)), that is verified only by the two ends that actually use it, is a problem? Lets say the client/qemu are completely aware of the migration data, what prevent it from harming it then? Instead the client could simply send a list of all surface ids, etc. which it has cached after it connects to / starts using the new host. Note that the old hosts needs to send nothing for this, this is info the client already has, also removing the need for synchronization. Yes, some session state is known to the client anyway so we don't need a source- target channel for them. As for certain other data, such as (but not limited to) partially parsed agent messages, these should be send through the regular vmstate methods IMHO. That isn't easy to handle via vmstate, at least as soon as this goes beyond a fixed number of fields aka 'migrate over this struct for me'. Think multiple spice clients connected at the same time. 1) Do (a), sending everything that way 2) Do (a) sending non client state that way; and let the client send state like which surfaces it has cached when the new session starts. I think we have to look at each piece of state information needed by the target and look how to handle this best. cheers, Gerd ___ Spice-devel mailing list spice-de...@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/spice-devel
[Qemu-devel] [PATCH 5/7] maintainers: Add docs/tracing.txt to Tracing
From: Andreas Färber afaer...@suse.de The topic of whether and by whom docs/tracing.txt is maintained was brought up. It currently does not have an official maintainer. Add it to the tracing section so that Stefan gets cc'ed on patches. Signed-off-by: Andreas Färber afaer...@suse.de Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com Cc: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- MAINTAINERS |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index d249947..f83d07c2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -548,6 +548,7 @@ Tracing M: Stefan Hajnoczi stefa...@linux.vnet.ibm.com S: Maintained F: trace/ +F: docs/tracing.txt T: git://github.com/stefanha/qemu.git tracing Checkpatch -- 1.7.9.1
[Qemu-devel] [PULL 0/7] Tracing patches
First round of tracing patches. I should have sent these out a long time ago. Once these are out of the way I can review and handle bigger patches from Lluís and Harsh. The following changes since commit a348f108842fb928563865c9918642900cd0d477: Add missing const attributes for MemoryRegionOps (2012-03-11 11:40:15 +) are available in the git repository at: git://github.com/stefanha/qemu.git tracing for you to fetch changes up to 727500181a2b2470a676e021205d170ede23beb7: vga: add trace event for ppm_save (2012-03-12 10:30:27 +) Alon Levy (2): console: add some trace events vga: add trace event for ppm_save Andreas Färber (1): maintainers: Add docs/tracing.txt to Tracing Jun Koi (2): trace: make trace_thread_create() use its function arg docs: correct ./configure line in tracing.txt Lluís Vilanova (1): trace: Provide a per-event status define for conditional compilation Stefan Hajnoczi (1): tracetool: Omit useless QEMU_*_ENABLED() check MAINTAINERS |1 + console.h |3 +++ docs/tracing.txt | 48 +--- hw/vga.c |2 ++ scripts/tracetool | 13 + trace-events |7 +++ trace/simple.c|2 +- 7 files changed, 64 insertions(+), 12 deletions(-) -- 1.7.9.1
[Qemu-devel] [PATCH 6/7] console: add some trace events
From: Alon Levy al...@redhat.com Signed-off-by: Alon Levy al...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- console.h|3 +++ trace-events |4 2 files changed, 7 insertions(+), 0 deletions(-) diff --git a/console.h b/console.h index a95b581..4334db5 100644 --- a/console.h +++ b/console.h @@ -5,6 +5,7 @@ #include qdict.h #include notify.h #include monitor.h +#include trace.h /* keyboard/mouse support */ @@ -202,11 +203,13 @@ static inline DisplaySurface* qemu_create_displaysurface(DisplayState *ds, int w static inline DisplaySurface* qemu_resize_displaysurface(DisplayState *ds, int width, int height) { +trace_displaysurface_resize(ds, ds-surface, width, height); return ds-allocator-resize_displaysurface(ds-surface, width, height); } static inline void qemu_free_displaysurface(DisplayState *ds) { +trace_displaysurface_free(ds, ds-surface); ds-allocator-free_displaysurface(ds-surface); } diff --git a/trace-events b/trace-events index c5d0f0f..94c4a6f 100644 --- a/trace-events +++ b/trace-events @@ -658,3 +658,7 @@ dma_aio_cancel(void *dbs) dbs=%p dma_complete(void *dbs, int ret, void *cb) dbs=%p ret=%d cb=%p dma_bdrv_cb(void *dbs, int ret) dbs=%p ret=%d dma_map_wait(void *dbs) dbs=%p + +# console.h +displaysurface_free(void *display_state, void *display_surface) state=%p surface=%p +displaysurface_resize(void *display_state, void *display_surface, int width, int height) state=%p surface=%p %dx%d -- 1.7.9.1
[Qemu-devel] [PATCH 2/7] tracetool: Omit useless QEMU_*_ENABLED() check
SystemTap provides a semaphore that can optionally be tested before executing a trace event. The purpose of this mechanism is to skip expensive tracing code when the trace event is disabled. For example, some applications may have trace events that format or convert strings for trace events. This expensive processing should only be done in the case where the trace event is enabled. Since QEMU's generated trace events never have such special-purpose code, there is no reason to add the semaphore check. Reviewed-by: Masami Hiramatsu masami.hiramatsu...@hitachi.com Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- scripts/tracetool |4 +--- 1 files changed, 1 insertions(+), 3 deletions(-) diff --git a/scripts/tracetool b/scripts/tracetool index 701b517..65bd0a1 100755 --- a/scripts/tracetool +++ b/scripts/tracetool @@ -415,9 +415,7 @@ linetoh_dtrace() # Define an empty function for the trace event cat EOF static inline void trace_$name($args) { -if (QEMU_${nameupper}_ENABLED()) { -QEMU_${nameupper}($argnames); -} +QEMU_${nameupper}($argnames); } EOF } -- 1.7.9.1
Re: [Qemu-devel] [libvirt] Modern CPU models cannot be used with libvirt
On Sun, Mar 11, 2012 at 09:12:58AM -0500, Anthony Liguori wrote: On 03/11/2012 08:27 AM, Gleb Natapov wrote: On Sat, Mar 10, 2012 at 12:24:47PM -0600, Anthony Liguori wrote: Let's step back here. Why are you writing these patches? It's probably not because you have a desire to say -cpu Westmere when you run QEMU on your laptop. I'd wager to say that no human has ever done that or that if they had, they did so by accident because they read documentation and thought they had to. No, it's because libvirt doesn't handle all the tiny small details involved in specifying a CPU. All libvirty knows about are a set of CPU flag bits, but it knows nothing about 'level', 'family', and 'xlevel', but we would like to allow it to expose a Westmere-like CPU to the guest. libvirt does know how to use the Westmere CPU model today, if it is not disabled by -nodefconfig. The interface it uses for probing has deficiencies, but it works right now. Humans probably do one of two things: 1) no cpu option or 2) -cpu host. And both are not optimal. Actually both are bad. First one because default cpu is very conservative and the second because there is no guaranty that guest will continue to work after qemu or kernel upgrade. Let me elaborate about the later. Suppose host CPU has kill_guest feature and at the time a guest was installed it was not implemented by kvm. Since it was not implemented by kvm it was not present in vcpu during installation and the guest didn't install workaround kill_guest module. Now unsuspecting user upgrades the kernel and tries to restart the guest and fails. He writes angry letter to qemu-devel and is asked to reinstall his guest and move along. -cpu best wouldn't solve this. You need a read/write configuration file where QEMU probes the available CPU and records it to be used for the lifetime of the VM. If the CPU records are used for probing, this is yet another reason they are not configuration, but defaults/templates to be used to build the actual configuration. IMHO, having to generate an opaque config file based on the results of probing is poor interface design, for humans _and_ for machines. If we have any bug on the probing, or on the data used as base for the probing, or on the config generation, it will be impossible to deploy a fix for the users. This is why machine-types exist: you have the ability to implement probing and/or sane defaults, but at the same time you can change the probing behavior or the set of defaults without breaking existing machines. This way, the config file contains only what the user really wanted to configure, not some complex and opaque result of a probing process. Tthe fact that we have a _set_ of CPU definitions to choose from (or to use as input for probing) instead of a single default CPU definition that the user can change is a sign that that the cpudefs are _not_ user configuration, but just templates/defaults. [...] This discussion isn't about whether QEMU should have a Westmere processor definition. In fact, I think I already applied that patch. It's a discussion about how we handle this up and down the stack. Agreed on this point. The question is who should define and manage CPU compatibility. Right now QEMU does to a certain degree, libvirt discards this and does it's own thing, and VDSM/ovirt-engine assume that we're providing something and has built a UI around it. libvirt doesn't discard this. If it just discarded this and properly defined its own models, I wouldn't even have (re-)started this thread. (Well, maybe I would have started a similar thread arguing that we are wasting time working on equivalent known-to-work CPU model definitions on Qemu and libvirt. Today we don't waste time doing it because libvirt currently expects -nodefconfig to not disable the existing default models). What I'm proposing we consider: have VDSM manage CPU definitions in order to provide a specific user experience in ovirt-engine. I don't disagree completely with that. The problem is defining what's CPU definitions. The current cpudef semantics is simply too low level, it impacts other features that are _already_ managed by Qemu. Let me try to enumerate: - Some CPUID leafs are defined based on -smp; - Some CPUID leafs depend on kernel capabilities; - The availability of some CPUID leafs depend on some features being enabled or not, but they are simply not exposed if a proper 'level' value is set. We could have two approaches here: we can define some details of CPU definitions as not configurable and others as must-be configurable, and force management layer to agree with us about what should be configurable or not. Or, we could simply define that a sane set of CPU definitions are part of a machine-type, and let managment to reconfigure parts of it if desired, but do not force it to configure it if not needed. We would continue to have Westmere/etc in QEMU exposed as part of the user
Re: [Qemu-devel] [PATCH 2/4] block: image fragmentation statistics for qed
On Wed, Mar 7, 2012 at 9:22 AM, Dong Xu Wang wdon...@linux.vnet.ibm.com wrote: + for (i = 0; i table_noffsets; i++) { + l2_offset = s-l1_table-offsets[i]; + if (l2_offset == 0) { if (qed_offset_is_unalloc_cluster(l2_offset)) { + continue; + } + ret = bdrv_pread(bs-file, l2_offset, table, size); + if (ret 0) { + qemu_vfree(table); + return ret; + } + for (j = 0; j size/sizeof(uint64_t); j++) { + uint64_t *offset = (uint64_t *)(table + j); No need to cast, table is already uint64_t*. Since you don't write to the offset, I would just do: uint64_t offset = table[j]; + if (*offset cluster_size) { + continue; + } if (!qed_check_cluster_offset(s, offset)) { continue; } We will skip unallocated clusters and zero clusters. Stefan
Re: [Qemu-devel] [RFC PATCH 06/17] block: use bdrv_{co, aio}_discard for write_zeroes operations
Am 12.03.2012 13:27, schrieb Paolo Bonzini: Il 10/03/2012 19:02, Richard Laager ha scritto: I propose adding the following behaviors in any event: * If a QEMU block device reports a discard_granularity 0, it must be equal to 2^n (n = 0), or QEMU's block core will change it to 0. (Non-power-of-two granularities are not likely to exist in the real world, and this assumption greatly simplifies ensuring correctness.) Yeah, I was considering this to be simply a bug in the block device. Block driver you mean? Yes, I think we can just assert() this. Kevin
Re: [Qemu-devel] [libvirt] Modern CPU models cannot be used with libvirt
On Mon, Mar 12, 2012 at 09:52:27AM -0300, Eduardo Habkost wrote: On Sun, Mar 11, 2012 at 09:12:58AM -0500, Anthony Liguori wrote: On 03/11/2012 08:27 AM, Gleb Natapov wrote: On Sat, Mar 10, 2012 at 12:24:47PM -0600, Anthony Liguori wrote: Let's step back here. Why are you writing these patches? It's probably not because you have a desire to say -cpu Westmere when you run QEMU on your laptop. I'd wager to say that no human has ever done that or that if they had, they did so by accident because they read documentation and thought they had to. No, it's because libvirt doesn't handle all the tiny small details involved in specifying a CPU. All libvirty knows about are a set of CPU flag bits, but it knows nothing about 'level', 'family', and 'xlevel', but we would like to allow it to expose a Westmere-like CPU to the guest. This is easily fixable in libvirt - so for the point of going discussion, IMHO, we can assume libvirt will support level, family, xlevel, etc. Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [PATCH 1/4] block: add image fragmentation statistics to qemu-img
On Wed, Mar 7, 2012 at 9:22 AM, Dong Xu Wang wdon...@linux.vnet.ibm.com wrote: From: Dong Xu Wang wdon...@linux.vnet.ibm.com Discussion can be found at: http://patchwork.ozlabs.org/patch/128730/ This patch add image fragmentation statistics while using qemu-img info. Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com --- block.c | 13 + block.h | 7 +++ block_int.h | 1 + qemu-img.c | 9 + 4 files changed, 30 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 52ffe14..947607b 100644 --- a/block.c +++ b/block.c @@ -2588,6 +2588,19 @@ int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi) return drv-bdrv_get_info(bs, bdi); } +int bdrv_get_fragment(BlockDriverState *bs, BlockFragInfo *bfi) bdrv_get_fraginfo() makes it clearer that this function gets a summary of fragmentation information. bdrv_get_fragment() makes me think it gets one specific fragment. +{ + BlockDriver *drv = bs-drv; + if (!drv) { + return -ENOMEDIUM; + } + if (!drv-bdrv_get_fragment) { + return -ENOTSUP; + } + memset(bfi, 0, sizeof(*bfi)); + return drv-bdrv_get_fragment(bs, bfi); For now this .drv_get_fraginfo() interface makes sense but if we merge the QCOW2-QED in-place conversion patch series in the future it will be possible to implement this in a generic way because image formats will expose their guest - host mapping. @@ -1126,6 +1127,14 @@ static int img_info(int argc, char **argv) printf(cluster_size: %d\n, bdi.cluster_size); } } + if (bdrv_get_fragment(bs, bfi) = 0) { I think we need a separate sub-command for fragmentation info: qemu-img fraginfo image-file Utilities that invoke qemu-img info want it to be fast. Reading all metadata from a large image can take several seconds. Since many qemu-img info users don't need to see the fragmentation information, it makes sense to put it in a new sub-command. + if (bfi.total_clusters != 0 bfi.allocated_clusters != 0) { + printf(%lld/%lld = %0.2f%% allocated, %0.2f%% fragmented\n, Please use the PRId64 macro instead of %lld because some compilers warn when uint64_t is not typedefed to long long int. Stefan
Re: [Qemu-devel] [libvirt] Modern CPU models cannot be used with libvirt
On Sun, Mar 11, 2012 at 10:41:32AM -0500, Anthony Liguori wrote: On 03/11/2012 10:12 AM, Gleb Natapov wrote: On Sun, Mar 11, 2012 at 09:16:49AM -0500, Anthony Liguori wrote: If libvirt assumes anything about what kvm actually supports it is working only by sheer luck. Well the simple answer for libvirt is don't use -nodefconfig and then it can reuse the CPU definitions (including any that the user adds). CPU models should be usable even with -nodefconfig. CPU model is more like device. By -cpu Nehalem I am saying I want Nehalem device in my machine. Let's say we moved CPU definitions to /usr/share/qemu/cpu-models.xml. Obviously, we'd want a command line option to be able to change that location so we'd introduce -cpu-models PATH. But we want all of our command line options to be settable by the global configuration file so we would have a cpu-model=PATH to the configuration file. But why hard code a path when we can just set the default path in the configuration file so let's avoid hard coding and just put cpu-models=/usr/share/qemu/cpu-models.xml in the default configuration file. We wouldn't do the above. -nodefconfig should disable the loading of files on /etc, but it shouldn't disable loading internal non-configurable data that we just happened to choose to store outside the qemu binary because it makes development easier. Really, the requirement of a default configuration file is a problem by itself. Qemu should not require a default configuration file to work, and it shouldn't require users to copy the default configuration file to change options from the default. Doing this would make it impossible to deploy fixes to users if we evern find out that the default configuration file had a serious bug. What if a bug in our default configuration file has a serious security implication? But now when libvirt uses -nodefconfig, those models go away. -nodefconfig means start QEMU in the most minimal state possible. You get what you pay for if you use it. We'll have the same problem with machine configuration files. At some point in time, -nodefconfig will make machine models disappear. It shouldn't. Machine-types are defaults to be used as base, they are not user-provided configuration. And the fact that we decided to store some data outside of the Qemu binary is orthogonal the design decisions in the Qemu command-line and configuration interface. As I said previously, requiring generation of opaque config files (and copy the default config file and change it is included on my definition of generation of opaque config files) is poor design, IMO. I bet this even has an entry in some design anti-pattern catalog somewhere. -- Eduardo
Re: [Qemu-devel] [PATCH 1/4] block: add image fragmentation statistics to qemu-img
Am 12.03.2012 14:07, schrieb Stefan Hajnoczi: On Wed, Mar 7, 2012 at 9:22 AM, Dong Xu Wang wdon...@linux.vnet.ibm.com wrote: From: Dong Xu Wang wdon...@linux.vnet.ibm.com Discussion can be found at: http://patchwork.ozlabs.org/patch/128730/ This patch add image fragmentation statistics while using qemu-img info. Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com --- block.c | 13 + block.h |7 +++ block_int.h |1 + qemu-img.c |9 + 4 files changed, 30 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 52ffe14..947607b 100644 --- a/block.c +++ b/block.c @@ -2588,6 +2588,19 @@ int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi) return drv-bdrv_get_info(bs, bdi); } +int bdrv_get_fragment(BlockDriverState *bs, BlockFragInfo *bfi) bdrv_get_fraginfo() makes it clearer that this function gets a summary of fragmentation information. bdrv_get_fragment() makes me think it gets one specific fragment. +{ +BlockDriver *drv = bs-drv; +if (!drv) { +return -ENOMEDIUM; +} +if (!drv-bdrv_get_fragment) { +return -ENOTSUP; +} +memset(bfi, 0, sizeof(*bfi)); +return drv-bdrv_get_fragment(bs, bfi); For now this .drv_get_fraginfo() interface makes sense but if we merge the QCOW2-QED in-place conversion patch series in the future it will be possible to implement this in a generic way because image formats will expose their guest - host mapping. We can probably resurrect Devin's patches for this part even now instead of introducing an interface that we don't really want. @@ -1126,6 +1127,14 @@ static int img_info(int argc, char **argv) printf(cluster_size: %d\n, bdi.cluster_size); } } +if (bdrv_get_fragment(bs, bfi) = 0) { I think we need a separate sub-command for fragmentation info: qemu-img fraginfo image-file Utilities that invoke qemu-img info want it to be fast. Reading all metadata from a large image can take several seconds. Since many qemu-img info users don't need to see the fragmentation information, it makes sense to put it in a new sub-command. Yes. If we wanted to merge it into an existing qemu-img subcommand, I think check would be the one, as it scans the whole image already today and fragmentation is something that could be added fairly easily. Kevin
[Qemu-devel] [PATCH/RFC 2/7] Allow a qemu_fopen_socket() to be opened for writing
--- migration-tcp.c |2 +- migration-unix.c |2 +- qemu-file.h |2 +- savevm.c | 36 4 files changed, 35 insertions(+), 7 deletions(-) diff --git a/migration-tcp.c b/migration-tcp.c index 35a5781..f567898 100644 --- a/migration-tcp.c +++ b/migration-tcp.c @@ -140,7 +140,7 @@ static void tcp_accept_incoming_migration(void *opaque) goto out2; } -f = qemu_fopen_socket(c); +f = qemu_fopen_socket(c, r); if (f == NULL) { fprintf(stderr, could not qemu_fopen socket\n); goto out; diff --git a/migration-unix.c b/migration-unix.c index 169de88..3928f4c 100644 --- a/migration-unix.c +++ b/migration-unix.c @@ -137,7 +137,7 @@ static void unix_accept_incoming_migration(void *opaque) goto out2; } -f = qemu_fopen_socket(c); +f = qemu_fopen_socket(c, r); if (f == NULL) { fprintf(stderr, could not qemu_fopen socket\n); goto out; diff --git a/qemu-file.h b/qemu-file.h index 31b83f6..efd6b1c 100644 --- a/qemu-file.h +++ b/qemu-file.h @@ -67,7 +67,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer, QEMUFileGetRateLimit *get_rate_limit); QEMUFile *qemu_fopen(const char *filename, const char *mode); QEMUFile *qemu_fdopen(int fd, const char *mode); -QEMUFile *qemu_fopen_socket(int fd); +QEMUFile *qemu_fopen_socket(int fd, const char *mode); QEMUFile *qemu_popen(FILE *popen_file, const char *mode); QEMUFile *qemu_popen_cmd(const char *command, const char *mode); int qemu_stdio_fd(QEMUFile *f); diff --git a/savevm.c b/savevm.c index 80be1ff..4171ebf 100644 --- a/savevm.c +++ b/savevm.c @@ -205,6 +205,21 @@ static int socket_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size) return len; } +static int socket_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, int size) +{ +QEMUFileSocket *s = opaque; +ssize_t len; + +do { +len = write(s-fd, buf, size); +} while (len == -1 socket_error() == EINTR); + +if (len == -1) +len = -socket_error(); + +return len; +} + static int socket_close(void *opaque) { QEMUFileSocket *s = opaque; @@ -316,7 +331,7 @@ QEMUFile *qemu_fdopen(int fd, const char *mode) if (!s-stdio_file) goto fail; -if(mode[0] == 'r') { +if (mode[0] == 'r') { s-file = qemu_fopen_ops(s, NULL, stdio_get_buffer, stdio_fclose, NULL, NULL, NULL); } else { @@ -330,13 +345,26 @@ fail: return NULL; } -QEMUFile *qemu_fopen_socket(int fd) +QEMUFile *qemu_fopen_socket(int fd, const char *mode) { QEMUFileSocket *s = g_malloc0(sizeof(QEMUFileSocket)); +if (mode == NULL || + (mode[0] != 'r' mode[0] != 'w') || + mode[1] != 'b' || mode[2] != 0) { +fprintf(stderr, qemu_fopen_socket: Argument validity check failed\n); +return NULL; +} + s-fd = fd; -s-file = qemu_fopen_ops(s, NULL, socket_get_buffer, socket_close, -NULL, NULL, NULL); +if (mode[0] == 'r') { +s-file = qemu_fopen_ops(s, NULL, socket_get_buffer, socket_close, + NULL, NULL, NULL); +} else { +s-file = qemu_fopen_ops(s, socket_put_buffer, NULL, socket_close, + NULL, NULL, NULL); +} + return s-file; } -- 1.7.7.6
Re: [Qemu-devel] [libvirt] Modern CPU models cannot be used with libvirt
On Mon, Mar 12, 2012 at 01:04:19PM +, Daniel P. Berrange wrote: On Mon, Mar 12, 2012 at 09:52:27AM -0300, Eduardo Habkost wrote: On Sun, Mar 11, 2012 at 09:12:58AM -0500, Anthony Liguori wrote: On 03/11/2012 08:27 AM, Gleb Natapov wrote: On Sat, Mar 10, 2012 at 12:24:47PM -0600, Anthony Liguori wrote: Let's step back here. Why are you writing these patches? It's probably not because you have a desire to say -cpu Westmere when you run QEMU on your laptop. I'd wager to say that no human has ever done that or that if they had, they did so by accident because they read documentation and thought they had to. No, it's because libvirt doesn't handle all the tiny small details involved in specifying a CPU. All libvirty knows about are a set of CPU flag bits, but it knows nothing about 'level', 'family', and 'xlevel', but we would like to allow it to expose a Westmere-like CPU to the guest. This is easily fixable in libvirt - so for the point of going discussion, IMHO, we can assume libvirt will support level, family, xlevel, etc. And fill in all cpuid leafs by querying /dev/kvm when needed or, if TCG is used, replicating QEMU logic? And since QEMU should be usable without libvirt the same logic should be implemented in QEMU anyway. -- Gleb.
Re: [Qemu-devel] [PATCH] ide: Adds model=s option, allowing the user to override the default disk model name QEMU HARDDISK
Am 12.03.2012 13:09, schrieb Floris Bos / Maxnet: On 03/12/2012 12:57 PM, Kevin Wolf wrote: Am 10.03.2012 20:56, schrieb Floris Bos: Some Linux distributions use the /dev/disk/by-id/scsi-SATA_name-of-disk-model_serial addressing scheme when refering to partitions in /etc/fstab and elsewhere. This causes problems when starting a disk image taken from an existing physical server under qemu, because when running under qemu name-of-disk-model is always QEMU HARDDISK This patch introduces a model=s option which in combination with the existing serial=s option can be used to fake the disk the operating system was previously on, allowing the OS to boot properly. Cc: kw...@redhat.com Signed-off-by: Floris Bosd...@noc-ps.com Oh, and now that I look at the actual patch, do we really want to add the model=... option to -drive? I think just the qdev property may be enough. When you need to set the option, you would need to use -device, but it's not that hard. Well, to me it makes sense to put it at the same places serial is, as both options have quite similar functions: faking drive attributes. -drive serial=... is older than -device, this is why it was okay back then and why we still have to maintain it for compatibility. If serial=... was a new patch today, I would probably be asking the same question there. Kevin