Re: [Qemu-devel] [PATCH v2 4/5] exec.c: refactor cpu_physical_memory_map
On 12.07.2011, at 00:17, Jan Kiszka wrote:

On 2011-05-19 19:35, stefano.stabell...@eu.citrix.com wrote:

From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Introduce qemu_ram_ptr_length that takes an address and a size as parameters rather than just an address.

Refactor cpu_physical_memory_map so that we call qemu_ram_ptr_length only once rather than calling qemu_get_ram_ptr one time per page. This is not only more efficient but also tries to simplify the logic of the function.

Currently we are relying on the fact that all the pages are mapped contiguously in qemu's address space: we have a check to make sure that the virtual address returned by qemu_get_ram_ptr from the second call on is consecutive. Now we are making this more explicit replacing all the calls to qemu_get_ram_ptr with a single call to qemu_ram_ptr_length passing a size argument.

This breaks cpu_physical_memory_map for >4G addresses on PC. Effectively, it doesn't account for the PCI gap, i.e. that the RAM block is actually mapped in two chunks into the guest physical memory. One outcome is that QEMU aborts when we try to process an address that is now outside RAM. Simple to reproduce with a virtio NIC and 5G guest memory, even without KVM.

Do you have some reliable test case? I can't seem to reproduce the issue. It works just fine for me with -m 10G, virtio nic and disk, polluting all the guest page cache.

Alex
Re: [Qemu-devel] [PATCH v2 4/5] exec.c: refactor cpu_physical_memory_map
On 12.07.2011, at 00:17, Jan Kiszka wrote:

On 2011-05-19 19:35, stefano.stabell...@eu.citrix.com wrote:

From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Introduce qemu_ram_ptr_length that takes an address and a size as parameters rather than just an address.

Refactor cpu_physical_memory_map so that we call qemu_ram_ptr_length only once rather than calling qemu_get_ram_ptr one time per page. This is not only more efficient but also tries to simplify the logic of the function.

Currently we are relying on the fact that all the pages are mapped contiguously in qemu's address space: we have a check to make sure that the virtual address returned by qemu_get_ram_ptr from the second call on is consecutive. Now we are making this more explicit replacing all the calls to qemu_get_ram_ptr with a single call to qemu_ram_ptr_length passing a size argument.

This breaks cpu_physical_memory_map for >4G addresses on PC. Effectively, it doesn't account for the PCI gap, i.e. that the RAM block is actually mapped in two chunks into the guest physical memory. One outcome is that QEMU aborts when we try to process an address that is now outside RAM. Simple to reproduce with a virtio NIC and 5G guest memory, even without KVM.

Ah, I see what you mean now. It breaks on current HEAD, but not on my last xen-next branch which already included that patch, so I'd assume it's something different that came in later.

Alex
Re: [Qemu-devel] [PATCH v2 4/5] exec.c: refactor cpu_physical_memory_map
On 12.07.2011, at 00:17, Jan Kiszka wrote:

On 2011-05-19 19:35, stefano.stabell...@eu.citrix.com wrote:

From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Introduce qemu_ram_ptr_length that takes an address and a size as parameters rather than just an address.

Refactor cpu_physical_memory_map so that we call qemu_ram_ptr_length only once rather than calling qemu_get_ram_ptr one time per page. This is not only more efficient but also tries to simplify the logic of the function.

Currently we are relying on the fact that all the pages are mapped contiguously in qemu's address space: we have a check to make sure that the virtual address returned by qemu_get_ram_ptr from the second call on is consecutive. Now we are making this more explicit replacing all the calls to qemu_get_ram_ptr with a single call to qemu_ram_ptr_length passing a size argument.

This breaks cpu_physical_memory_map for >4G addresses on PC. Effectively, it doesn't account for the PCI gap, i.e. that the RAM block is actually mapped in two chunks into the guest physical memory. One outcome is that QEMU aborts when we try to process an address that is now outside RAM. Simple to reproduce with a virtio NIC and 5G guest memory, even without KVM.

Yeah, that's what happens when you read mails too early in the morning :). The xen branch didn't get pulled yet, so upstream is missing the following patch:

commit f221e5ac378feea71d9857ddaa40f829c511742f
Author: Stefano Stabellini stefano.stabell...@eu.citrix.com
Date: Mon Jun 27 18:26:06 2011 +0100

    qemu_ram_ptr_length: take ram_addr_t as arguments

    qemu_ram_ptr_length should take ram_addr_t as argument rather than target_phys_addr_t because it is doing comparisons with RAMBlock addresses.

    cpu_physical_memory_map should create a ram_addr_t address to pass to qemu_ram_ptr_length from PhysPageDesc phys_offset.

    Remove code after abort() in qemu_ram_ptr_length.

    Changes in v2:
    - handle 0 size in qemu_ram_ptr_length;
    - rename addr1 to raddr;
    - initialize raddr to ULONG_MAX.

    Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
    Reviewed-by: Peter Maydell peter.mayd...@linaro.org
    Signed-off-by: Alexander Graf ag...@suse.de

Anthony?

Alex
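The distinction the commit message draws between a guest physical address and a RAM offset can be illustrated with a minimal sketch. It assumes a PC-like layout in which RAM below the PCI gap ends at 0xe0000000 and the remainder of the same RAM block reappears at 4 GiB; those constants, the typedefs and the function name are hypothetical stand-ins, not taken from exec.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, for illustration only: RAM below the hole ends at
 * HOLE_START and the rest of the RAM block reappears at HOLE_END (4 GiB). */
#define HOLE_START 0xe0000000ULL
#define HOLE_END   0x100000000ULL

typedef uint64_t guest_phys_t;  /* hypothetical stand-in for target_phys_addr_t */
typedef uint64_t ram_off_t;     /* hypothetical stand-in for ram_addr_t */

/* Translate a guest physical address into an offset inside one RAM block
 * of ram_size bytes. Returns false if the address is not backed by RAM. */
static bool guest_phys_to_ram_offset(guest_phys_t paddr, uint64_t ram_size,
                                     ram_off_t *offset)
{
    uint64_t below = ram_size < HOLE_START ? ram_size : HOLE_START;

    if (paddr < below) {
        /* Below the gap the two address spaces coincide. */
        *offset = paddr;
        return true;
    }
    if (paddr >= HOLE_END && paddr - HOLE_END < ram_size - below) {
        /* Above the gap the RAM offset lags behind by the size of the gap. */
        *offset = below + (paddr - HOLE_END);
        return true;
    }
    return false;
}
```

With 5G of guest memory under these assumptions, guest physical address 0x100000000 corresponds to RAM offset 0xe0000000, so comparing the untranslated guest address against the bounds of the RAM block would wrongly place the upper part of memory outside RAM.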
Re: [Qemu-devel] [RFC v3 31/56] ac97: convert to memory API
On 07/12/2011 01:03 AM, malc wrote:

Here's a new version:

This one looks acceptable[1], original submission said: fixes BAR sizing as well. what was wrong with it?

The nabm BAR, for example, was registered as 64 bytes of byte ioports, 128 bytes of word ioports, and 256 bytes of long ioports. I expect this was an error. The new patch preserves the error.

[..snip..]

P.S. Sans minor inconsistency with trailing commas.

Where I expect more fields, I leave a trailing comma. It makes further patches nicer.

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [Qemu-devel] [PATCH v2 4/5] exec.c: refactor cpu_physical_memory_map
On 07/12/2011 09:15 AM, Jan Kiszka wrote: Am I the only one under the impression that too many patches are in limbo ATM? No. :) Paolo
Re: [Qemu-devel] [PATCH v2 4/5] exec.c: refactor cpu_physical_memory_map
On 2011-07-12 08:28, Alexander Graf wrote:

On 12.07.2011, at 00:17, Jan Kiszka wrote:

On 2011-05-19 19:35, stefano.stabell...@eu.citrix.com wrote:

From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Introduce qemu_ram_ptr_length that takes an address and a size as parameters rather than just an address.

Refactor cpu_physical_memory_map so that we call qemu_ram_ptr_length only once rather than calling qemu_get_ram_ptr one time per page. This is not only more efficient but also tries to simplify the logic of the function.

Currently we are relying on the fact that all the pages are mapped contiguously in qemu's address space: we have a check to make sure that the virtual address returned by qemu_get_ram_ptr from the second call on is consecutive. Now we are making this more explicit replacing all the calls to qemu_get_ram_ptr with a single call to qemu_ram_ptr_length passing a size argument.

This breaks cpu_physical_memory_map for >4G addresses on PC. Effectively, it doesn't account for the PCI gap, i.e. that the RAM block is actually mapped in two chunks into the guest physical memory. One outcome is that QEMU aborts when we try to process an address that is now outside RAM. Simple to reproduce with a virtio NIC and 5G guest memory, even without KVM.

Yeah, that's what happens when you read mails too early in the morning :). The xen branch didn't get pulled yet, so upstream is missing the following patch:

commit f221e5ac378feea71d9857ddaa40f829c511742f
Author: Stefano Stabellini stefano.stabell...@eu.citrix.com
Date: Mon Jun 27 18:26:06 2011 +0100

    qemu_ram_ptr_length: take ram_addr_t as arguments

    qemu_ram_ptr_length should take ram_addr_t as argument rather than target_phys_addr_t because it is doing comparisons with RAMBlock addresses.

    cpu_physical_memory_map should create a ram_addr_t address to pass to qemu_ram_ptr_length from PhysPageDesc phys_offset.

    Remove code after abort() in qemu_ram_ptr_length.

    Changes in v2:
    - handle 0 size in qemu_ram_ptr_length;
    - rename addr1 to raddr;
    - initialize raddr to ULONG_MAX.

    Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
    Reviewed-by: Peter Maydell peter.mayd...@linaro.org
    Signed-off-by: Alexander Graf ag...@suse.de

Maybe subject or changelog can reflect what this all fixes?

Anthony?

Am I the only one under the impression that too many patches are in limbo ATM?

Jan
Re: [Qemu-devel] [PATCH 3/8] block: Support to keep track of I/O status
Luiz Capitulino lcapitul...@redhat.com writes:

This commit adds support to the BlockDriverState type to keep track of the last I/O status. That is, at every I/O operation we update a status field in the BlockDriverState instance. Valid statuses are: OK, FAILED and ENOSPC.

ENOSPC is distinguished from FAILED because a management application can use it to implement thin-provisioning.

This feature has to be explicitly enabled by buses/devices supporting it.

buses?

Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 block.c     | 18 ++
 block.h     |  7 +++
 block_int.h |  2 ++
 3 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 24a25d5..cc0a34e 100644
--- a/block.c
+++ b/block.c
@@ -195,6 +195,7 @@ BlockDriverState *bdrv_new(const char *device_name)
     if (device_name[0] != '\0') {
         QTAILQ_INSERT_TAIL(&bdrv_states, bs, list);
     }
+    bs->iostatus_enabled = false;
     return bs;
 }

@@ -2876,6 +2877,23 @@ int bdrv_in_use(BlockDriverState *bs)
     return bs->in_use;
 }

+void bdrv_enable_iostatus(BlockDriverState *bs)
+{
+    bs->iostatus_enabled = true;
+}
+
+void bdrv_iostatus_update(BlockDriverState *bs, int error)
+{
+    error = abs(error);
+
+    if (!error) {
+        bs->iostatus = BDRV_IOS_OK;
+    } else {
+        bs->iostatus = (error == ENOSPC) ? BDRV_IOS_ENOSPC :
+                                           BDRV_IOS_FAILED;
+    }
+}
+
 int bdrv_img_create(const char *filename, const char *fmt,
                     const char *base_filename, const char *base_fmt,
                     char *options, uint64_t img_size, int flags)
diff --git a/block.h b/block.h
index 859d1d9..0dca1bb 100644
--- a/block.h
+++ b/block.h
@@ -50,6 +50,13 @@ typedef enum {
     BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP
 } BlockMonEventAction;

+typedef enum {
+    BDRV_IOS_OK, BDRV_IOS_FAILED, BDRV_IOS_ENOSPC
+} BlockIOStatus;
+
+void bdrv_iostatus_update(BlockDriverState *bs, int error);
+void bdrv_enable_iostatus(BlockDriverState *bs);
+void bdrv_enable_io_status(BlockDriverState *bs);
 void bdrv_mon_event(const BlockDriverState *bdrv,
                     BlockMonEventAction action, int is_read);
 void bdrv_info_print(Monitor *mon, const QObject *data);
diff --git a/block_int.h b/block_int.h
index 1e265d2..09f038d 100644
--- a/block_int.h
+++ b/block_int.h
@@ -195,6 +195,8 @@ struct BlockDriverState {
        drivers. They are not used by the block driver */
     int cyls, heads, secs, translation;
     BlockErrorAction on_read_error, on_write_error;
+    bool iostatus_enabled;
+    BlockIOStatus iostatus;
     char device_name[32];
     unsigned long *dirty_bitmap;
     int64_t dirty_count;

Okay, let's see what we got here.

The block layer merely holds I/O status, device models set it.

Device I/O status is not migrated. Why?

bdrv_new() creates the BDS with I/O status tracking disabled. Devices that do tracking enable it in their qdev init method. If a device gets hot unplugged, tracking remains enabled. If the BDS then gets reused with a device that doesn't do tracking, I/O status becomes incorrect. Can't happen right now, because we automatically delete the BDS on hot unplug, but it's a trap. Suggest to disable tracking in bdrv_detach().

Actually, this is a symptom of the midlayer disease. I suspect things would be simpler if we hold the status in its rightful owner, the device model. Need a getter for it.

I'm working on a patch series that moves misplaced state out of the block layer into device models and block drivers, and an I/O status getter will fit in easily there.
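The status mapping and the suggestion to disable tracking on detach can be sketched in isolation. This is only an illustration under stated assumptions: `TrackedDrive`, `drive_iostatus_update()` and `drive_detach()` are hypothetical names standing in for BlockDriverState, bdrv_iostatus_update() and bdrv_detach(), and the early return when tracking is disabled is my addition, not something the patch does.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

typedef enum { IOS_OK, IOS_FAILED, IOS_ENOSPC } IOStatus;

/* Hypothetical stand-in for the two fields the patch adds. */
typedef struct {
    bool iostatus_enabled;
    IOStatus iostatus;
} TrackedDrive;

/* Map an errno value (positive or negative) to a status, as the patch does:
 * 0 is OK, ENOSPC is reported separately, everything else is FAILED. */
static IOStatus iostatus_from_error(int error)
{
    error = abs(error);
    if (!error) {
        return IOS_OK;
    }
    return (error == ENOSPC) ? IOS_ENOSPC : IOS_FAILED;
}

/* Assumption: ignore updates while tracking is disabled. */
static void drive_iostatus_update(TrackedDrive *d, int error)
{
    if (!d->iostatus_enabled) {
        return;
    }
    d->iostatus = iostatus_from_error(error);
}

/* The review suggestion: a detached drive must not keep tracking enabled,
 * so a later device without I/O status support cannot expose stale state. */
static void drive_detach(TrackedDrive *d)
{
    d->iostatus_enabled = false;
    d->iostatus = IOS_OK;
}
```

Resetting the state on detach closes the trap described above without relying on the drive being deleted at hot unplug.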
Re: [Qemu-devel] [PATCH 1/8] Introduce the VMStatus type
Luiz Capitulino lcapitul...@redhat.com writes: We need to track the VM status so that QMP can report it to clients. This commit adds the VMStatus type and related functions. The vm_status_set() function is used to keep track of the current VM status. The current statuses are: Nitpicking about names, bear with me. - debug: guest is running under gdb - inmigrate: guest is paused waiting for an incoming migration incoming-migration? - postmigrate: guest is paused following a successful migration post-migrate? - internal-error: Fatal internal error that prevents further guest execution - load-state-error: guest is paused following a failed 'loadvm' Less than obvious. If you like concrete, name it loadvm-failed. If you like abstract, name it restore-vm-failed. - io-error: the last IOP has failed and the device is configured to pause on I/O errors - watchdog-error: the watchdog action is configured to pause and has been triggered Sounds like the watchdog suffered an error. watchdog-fired? - paused: guest has been paused via the 'stop' command stop-command? - prelaunch: QEMU was started with -S and guest has not started unstarted? - running: guest is actively running - shutdown: guest is shut down (and -no-shutdown is in use) Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- gdbstub.c |4 hw/ide/core.c |1 + hw/scsi-disk.c |1 + hw/virtio-blk.c |1 + hw/watchdog.c |1 + kvm-all.c |1 + migration.c |3 +++ monitor.c |5 - sysemu.h| 19 +++ vl.c| 37 + 10 files changed, 72 insertions(+), 1 deletions(-) diff --git a/gdbstub.c b/gdbstub.c index c085a5a..61b700a 100644 --- a/gdbstub.c +++ b/gdbstub.c @@ -2358,6 +2358,7 @@ void gdb_do_syscall(gdb_syscall_complete_cb cb, const char *fmt, ...) 
s-state = RS_SYSCALL; #ifndef CONFIG_USER_ONLY vm_stop(VMSTOP_DEBUG); +vm_status_set(VMST_DEBUG); #endif s-state = RS_IDLE; va_start(va, fmt); @@ -2432,6 +2433,7 @@ static void gdb_read_byte(GDBState *s, int ch) /* when the CPU is running, we cannot do anything except stop it when receiving a char */ vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); } else #endif { @@ -2694,6 +2696,7 @@ static void gdb_chr_event(void *opaque, int event) switch (event) { case CHR_EVENT_OPENED: vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); gdb_has_xml = 0; break; default: Previous hunk has VMST_DEBUG with VMST_DEBUG. Odd. @@ -2735,6 +2738,7 @@ static void gdb_sigterm_handler(int signal) { if (vm_running) { vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); } } #endif diff --git a/hw/ide/core.c b/hw/ide/core.c index ca17a43..bf9df41 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -523,6 +523,7 @@ static int ide_handle_rw_error(IDEState *s, int error, int op) s-bus-error_status = op; bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { if (op BM_STATUS_DMA_RETRY) { dma_buf_commit(s, 0); diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index a8c7372..66037fd 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -216,6 +216,7 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, int type) bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { if (type == SCSI_REQ_STATUS_RETRY_READ) { scsi_req_data(r-req, 0); diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c index 91e0394..bf70200 100644 --- a/hw/virtio-blk.c +++ b/hw/virtio-blk.c @@ -79,6 +79,7 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error, s-rq = req; bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR); bdrv_mon_event(s-bs, BDRV_ACTION_REPORT, is_read); diff --git a/hw/watchdog.c 
b/hw/watchdog.c index 1c900a1..d130cbb 100644 --- a/hw/watchdog.c +++ b/hw/watchdog.c @@ -133,6 +133,7 @@ void watchdog_perform_action(void) case WDT_PAUSE: /* same as 'stop' command in monitor */ watchdog_mon_event(pause); vm_stop(VMSTOP_WATCHDOG); +vm_status_set(VMST_WATCHDOG); break; case WDT_DEBUG: diff --git a/kvm-all.c b/kvm-all.c index cbc2532..aee9e0a 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -1015,6 +1015,7 @@ int kvm_cpu_exec(CPUState *env) if (ret 0) {
[Qemu-devel] [PATCH v2] usb-hid: Fix 0/0 position for Windows in tablet mode
On 2011-07-04 20:15, andrzej zaborowski wrote:
On 26 June 2011 11:11, Jan Kiszka jan.kis...@web.de wrote:
On 2011-06-25 15:10, Andreas Färber wrote:
On 25.06.2011 at 14:55, Jan Kiszka wrote:
On 2011-06-25 14:37, Andreas Färber wrote:
On 24.06.2011 at 16:27, Jan Kiszka wrote:

For unknown reasons, Windows drivers (tested with XP and Win7) ignore usb-tablet events that move the pointer to 0/0. So always set bit 0 of the coordinates.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/usb-hid.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/usb-hid.c b/hw/usb-hid.c
index d711b5c..2b9a451 100644
--- a/hw/usb-hid.c
+++ b/hw/usb-hid.c
@@ -457,8 +457,10 @@ static void usb_pointer_event_combine(USBPointerEvent *e, int xyrel,
         e->xdx += x1;
         e->ydy += y1;
     } else {
-        e->xdx = x1;
-        e->ydy = y1;
+        /* Windows drivers do not like the 0/0 position and ignore such
+         * events. */
+        e->xdx = x1 | 1;
+        e->ydy = y1 | 1;

Doesn't this change mean we can't access any other even pixel either?

Only on 32767x32767 screens (that's the resolution of the tablet).

Well, if it's just a fix for 0/0 I would've expected something like:

e->xdx = x1 ? x1 : 1;
e->ydy = y1 ? y1 : 1;

Works as well, my version is a little bit simpler. But I don't mind, will post whatever is preferred to fix this.

Would it be enough to just do this for x or y, not both?

Yes, looks like. Is this one better?

Jan

--8<--

From: Jan Kiszka jan.kis...@siemens.com

For unknown reasons, Windows drivers (tested with XP and Win7) ignore usb-tablet events that move the pointer to 0/0. So always report 0/0 as 1/0.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/usb-hid.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/hw/usb-hid.c b/hw/usb-hid.c
index d711b5c..faf91c4 100644
--- a/hw/usb-hid.c
+++ b/hw/usb-hid.c
@@ -459,6 +459,11 @@ static void usb_pointer_event_combine(USBPointerEvent *e, int xyrel,
     } else {
         e->xdx = x1;
         e->ydy = y1;
+        /* Windows drivers do not like the 0/0 position and ignore such
+         * events. */
+        if (!(x1 | y1)) {
+            x1 = 1;
+        }
     }
     e->dz += z1;
 }
--
1.7.3.4
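The behaviour the v2 commit message describes ("report 0/0 as 1/0") can be sketched as a stand-alone function. `PointerEvent` and `pointer_event_combine()` are hypothetical stand-ins for USBPointerEvent and usb_pointer_event_combine(). Note one deliberate difference from the hunk as quoted: there `x1` is modified after it has already been copied into `e->xdx`, so this sketch adjusts `x1` before storing it. That ordering is my assumption about the intent, not a statement about what was eventually merged.

```c
/* Hypothetical stand-in for USBPointerEvent. */
typedef struct {
    int xdx, ydy;   /* relative deltas or absolute position */
    int dz;
} PointerEvent;

static void pointer_event_combine(PointerEvent *e, int xyrel,
                                  int x1, int y1, int z1)
{
    if (xyrel) {
        /* Relative mode: accumulate deltas. */
        e->xdx += x1;
        e->ydy += y1;
    } else {
        /* Absolute mode: only the exact 0/0 position is nudged to 1/0,
         * every other coordinate (including even ones) is kept as is. */
        if (!(x1 | y1)) {
            x1 = 1;
        }
        e->xdx = x1;
        e->ydy = y1;
    }
    e->dz += z1;
}
```

Compared with the `x1 | 1` variant from the first version, this leaves all positions other than 0/0 untouched, which addresses the concern about even pixels raised in the thread.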
Re: [Qemu-devel] [PATCH 7/8] QMP: query-status: Add 'io-status' key
Luiz Capitulino lcapitul...@redhat.com writes: Contains the last I/O status for the given device. Currently this is only supported by ide, scsi and virtio block devices. Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- block.c | 15 ++- block.h |2 +- qmp-commands.hx |6 ++ 3 files changed, 21 insertions(+), 2 deletions(-) diff --git a/block.c b/block.c index cc0a34e..28df3d8 100644 --- a/block.c +++ b/block.c @@ -1720,6 +1720,12 @@ void bdrv_info_print(Monitor *mon, const QObject *data) qlist_iter(qobject_to_qlist(data), bdrv_print_dict, mon); } +static const char *const io_status_name[BDRV_IOS_MAX] = { +[BDRV_IOS_OK] = ok, +[BDRV_IOS_FAILED] = failed, +[BDRV_IOS_ENOSPC] = nospace, +}; + void bdrv_info(Monitor *mon, QObject **ret_data) { QList *bs_list; @@ -1729,15 +1735,16 @@ void bdrv_info(Monitor *mon, QObject **ret_data) QTAILQ_FOREACH(bs, bdrv_states, list) { QObject *bs_obj; +QDict *bs_dict; bs_obj = qobject_from_jsonf({ 'device': %s, 'type': 'unknown', 'removable': %i, 'locked': %i }, bs-device_name, bs-removable, bs-locked); +bs_dict = qobject_to_qdict(bs_obj); if (bs-drv) { QObject *obj; -QDict *bs_dict = qobject_to_qdict(bs_obj); obj = qobject_from_jsonf({ 'file': %s, 'ro': %i, 'drv': %s, 'encrypted': %i }, @@ -1752,6 +1759,12 @@ void bdrv_info(Monitor *mon, QObject **ret_data) qdict_put_obj(bs_dict, inserted, obj); } + +if (bs-iostatus_enabled) { +qdict_put(bs_dict, io-status, + qstring_from_str(io_status_name[bs-iostatus])); +} + qlist_append_obj(bs_list, bs_obj); } diff --git a/block.h b/block.h index 0dca1bb..0141fe6 100644 --- a/block.h +++ b/block.h @@ -51,7 +51,7 @@ typedef enum { } BlockMonEventAction; typedef enum { -BDRV_IOS_OK, BDRV_IOS_FAILED, BDRV_IOS_ENOSPC +BDRV_IOS_OK, BDRV_IOS_FAILED, BDRV_IOS_ENOSPC, BDRV_IOS_MAX } BlockIOStatus; void bdrv_iostatus_update(BlockDriverState *bs, int error); diff --git a/qmp-commands.hx b/qmp-commands.hx index 6b8eb0a..1746b6d 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -1082,6 +1082,9 @@ 
Each json-object contain the following: tftp, vdi, vmdk, vpc, vvfat - backing_file: backing file name (json-string, optional) - encrypted: true if encrypted, false otherwise (json-bool) +- io-status: last executed I/O operation status, only present if the device + supports it (json_string, optional) + - Possible values: ok, failed, nospace Example: @@ -1089,6 +1092,7 @@ Example: - { return:[ { +io-status: ok, device:ide0-hd0, locked:false, removable:false, @@ -1101,12 +1105,14 @@ Example: type:unknown }, { +io-status: ok, device:ide1-cd0, locked:false, removable:true, type:unknown }, { +io-status: ok, device:floppy0, locked:false, removable:true, floppy doesn't support I/O status, yet the example shows io-status: ok. Are you sure it's correct?
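The review question above (a device without I/O status support still showing "io-status: ok") comes down to when the key is emitted. A minimal sketch, assuming the same three status names as the patch: `io_status_key()` is a hypothetical helper, not part of block.c, that returns the string to publish or NULL when the key should be left out.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { IOS_OK, IOS_FAILED, IOS_ENOSPC, IOS_MAX } IOStatus;

/* Name table indexed by status, mirroring the shape of io_status_name[]. */
static const char *const io_status_name[IOS_MAX] = {
    [IOS_OK]     = "ok",
    [IOS_FAILED] = "failed",
    [IOS_ENOSPC] = "nospace",
};

/* Return the value for the optional "io-status" key, or NULL if the key
 * must be omitted because the device does not track I/O status. */
static const char *io_status_key(bool enabled, IOStatus status)
{
    if (!enabled || (unsigned)status >= IOS_MAX) {
        return NULL;
    }
    return io_status_name[status];
}
```

Under this reading, the floppy entry in the documentation example would carry no io-status key at all unless the floppy device enables tracking.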
Re: [Qemu-devel] live block copy/stream/snapshot discussion
Am 11.07.2011 18:32, schrieb Marcelo Tosatti: On Mon, Jul 11, 2011 at 03:47:15PM +0100, Stefan Hajnoczi wrote: Kevin, Marcelo, I'd like to reach agreement on the QMP/HMP APIs for live block copy and image streaming. Libvirt has acked the image streaming APIs that Adam proposed and I think they are a good fit for the feature. I have described that API below for your review (it's exactly what the QED Image Streaming patches provide). Marcelo: Are you happy with this API for live block copy? Also please take a look at the switch command that I am proposing. Image streaming API === For leaf images with copy-on-read semantics, the stream commands allow the user to populate local blocks by manually streaming them from the backing image. Once all blocks have been streamed, the dependency on the original backing image can be removed. Therefore, stream commands can be used to implement post-copy live block migration and rapid deployment. The block_stream command can be used to stream a single cluster, to start streaming the entire device, and to cancel an active stream. It is easiest to allow the block_stream command to manage streaming for the entire device but a managent tool could use single cluster mode to throttle the I/O rate. As discussed earlier, having the management send requests for each single cluster doesn't make any sense at all. It wouldn't only throttle the I/O rate but bring it down to a level that makes it unusable. What you really want is to allow the management to give us a range (offset + length) that qemu should stream. The command synopses are as follows: block_stream Copy data from a backing file into a block device. If the optional 'all' argument is true, this operation is performed in the background until the entire backing file has been copied. The status of ongoing block_stream operations can be checked with query-block-stream. Not sure if it's a good idea to use a bool argument to turn a command into its opposite. 
I think having a separate command for stopping would be cleaner. Something for the QMP folks to decide, though. Arguments: - all:copy entire device (json-bool, optional) - stop: stop copying to device (json-bool, optional) - device: device name (json-string) It must be possible to specify backing file that will be active after streaming finishes (data from that file will not be streamed into active file, of course). Yes, I think the common base image belongs here. With all = false, where does the streaming begin? Do you have something like the current streaming offset in the state of each BlockDriverState? As I said above, I would prefer adding offset and length to the arguments. Return: - device: device name (json-string) - len:size of the device, in bytes (json-int) - offset: ending offset of the completed I/O, in bytes (json-int) So you only get the reply when the request has completed? With the current monitor, this means that QMP is blocked while we stream, doesn't it? How are you supposed to send the stop command then? Two of three examples below have an empty return value instead, so they are not compliant to this specification. Examples: - { execute: block_stream, arguments: { device: virtio0 } } - { return: { device: virtio0, len: 10737418240, offset: 512 } } - { execute: block_stream, arguments: { all: true, device: virtio0 } } - { return: {} } - { execute: block_stream, arguments: { stop: true, device: virtio0 } } - { return: {} } query-block-stream -- Show progress of ongoing block_stream operations. Return a json-array of all operations. If no operation is active then an empty array will be returned. 
Each operation is a json-object with the following data: - device: device name (json-string) - len:size of the device, in bytes (json-int) - offset: ending offset of the completed I/O, in bytes (json-int) Example: - { execute: query-block-stream } - { return:[ { device: virtio0, len: 10737418240, offset: 709632} ] } When block_stream is changed, this will have to make the same changes. Block device switching API == Extend the 'change' command to support changing the image file without media change notification. Perhaps we should take the opportunity to add a format argument for image files? change -- Change a removable medium or VNC configuration. Arguments: - device: device name (json-string) - target: filename or item (json-string) - arg: additional argument (json-string, optional) - notify: whether to notify guest, defaults to true (json-bool, optional) Examples: 1. Change a removable medium - { execute: change, arguments: { device: ide1-cd0, target: /srv/images/Fedora-12-x86_64-DVD.iso } } - { return: {} } 2. Change a disk without media change notification - { execute: change,
Re: [Qemu-devel] [PATCH 3/8] block: Support to keep track of I/O status
Am 12.07.2011 09:45, schrieb Markus Armbruster: Luiz Capitulino lcapitul...@redhat.com writes: This commit adds support to the BlockDriverState type to keep track of the last I/O status. That is, at every I/O operation we update a status field in the BlockDriverState instance. Valid statuses are: OK, FAILED and ENOSPC. ENOSPC is distinguished from FAILED because an management application can use it to implement thin-provisioning. This feature has to be explicit enabled by buses/devices supporting it. buses? Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- block.c | 18 ++ block.h |7 +++ block_int.h |2 ++ 3 files changed, 27 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 24a25d5..cc0a34e 100644 --- a/block.c +++ b/block.c @@ -195,6 +195,7 @@ BlockDriverState *bdrv_new(const char *device_name) if (device_name[0] != '\0') { QTAILQ_INSERT_TAIL(bdrv_states, bs, list); } +bs-iostatus_enabled = false; return bs; } @@ -2876,6 +2877,23 @@ int bdrv_in_use(BlockDriverState *bs) return bs-in_use; } +void bdrv_enable_iostatus(BlockDriverState *bs) +{ +bs-iostatus_enabled = true; +} + +void bdrv_iostatus_update(BlockDriverState *bs, int error) +{ +error = abs(error); + +if (!error) { +bs-iostatus = BDRV_IOS_OK; +} else { +bs-iostatus = (error == ENOSPC) ? 
BDRV_IOS_ENOSPC : + BDRV_IOS_FAILED; +} +} + int bdrv_img_create(const char *filename, const char *fmt, const char *base_filename, const char *base_fmt, char *options, uint64_t img_size, int flags) diff --git a/block.h b/block.h index 859d1d9..0dca1bb 100644 --- a/block.h +++ b/block.h @@ -50,6 +50,13 @@ typedef enum { BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP } BlockMonEventAction; +typedef enum { +BDRV_IOS_OK, BDRV_IOS_FAILED, BDRV_IOS_ENOSPC +} BlockIOStatus; + +void bdrv_iostatus_update(BlockDriverState *bs, int error); +void bdrv_enable_iostatus(BlockDriverState *bs); +void bdrv_enable_io_status(BlockDriverState *bs); void bdrv_mon_event(const BlockDriverState *bdrv, BlockMonEventAction action, int is_read); void bdrv_info_print(Monitor *mon, const QObject *data); diff --git a/block_int.h b/block_int.h index 1e265d2..09f038d 100644 --- a/block_int.h +++ b/block_int.h @@ -195,6 +195,8 @@ struct BlockDriverState { drivers. They are not used by the block driver */ int cyls, heads, secs, translation; BlockErrorAction on_read_error, on_write_error; +bool iostatus_enabled; +BlockIOStatus iostatus; char device_name[32]; unsigned long *dirty_bitmap; int64_t dirty_count; Okay, let's see what we got here. The block layer merely holds I/O status, device models set it. Device I/O status is not migrated. Why? bdrv_new() creates the BDS with I/O status tracking disabled. Devices that do tracking enable it in their qdev init method. If a device gets hot unplugged, tracking remains enabled. If the BDS then gets reused with a device that doesn't do tracking, I/O status becomes incorrect. Can't happen right now, because we automatically delete the BDS on hot unplug, but it's a trap. Suggest to disable tracking in bdrv_detach(). Actually, this is a symptom of the midlayer disease. I suspect things would be simpler if we hold the status in its rightful owner, the device model. Need a getter for it. 
I'm working on a patch series that moves misplaced state out of the block layer into device models and block drivers, and an I/O status getter will fit in easily there.

This is host state, so the device is not the rightful owner. Devices should not even be involved with enabling it.

Kevin
[Qemu-devel] [PATCH] qxl: upon reset, if spice worker is stopped, the command rings can be not empty
The Spice worker no longer processes commands when it is stopped. Otherwise, it might crash during migration when attempting to process commands while the guest is not completely loaded.

Cc: Alon Levy al...@redhat.com
---
 hw/qxl.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index 0b9a4c7..a6fb7f0 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -656,8 +656,8 @@ static void qxl_reset_state(PCIQXLDevice *d)
     QXLRam *ram = d->ram;
     QXLRom *rom = d->rom;

-    assert(SPICE_RING_IS_EMPTY(&ram->cmd_ring));
-    assert(SPICE_RING_IS_EMPTY(&ram->cursor_ring));
+    assert(!d->ssd.running || SPICE_RING_IS_EMPTY(&ram->cmd_ring));
+    assert(!d->ssd.running || SPICE_RING_IS_EMPTY(&ram->cursor_ring));
     d->shadow_rom.update_id = cpu_to_le32(0);
     *rom = d->shadow_rom;
     qxl_rom_set_dirty(d);
--
1.7.4.4
[Qemu-devel] [Bug 739785] Re: qemu-i386 user mode on ARMv5 host fails (bash: fork: Invalid argument)
I hope this can work so development for Dropbox on Drobo-FS can continue

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/739785

Title: qemu-i386 user mode on ARMv5 host fails (bash: fork: Invalid argument)

Status in QEMU: New

Bug description:

Good time of day everybody,

I have been trying to make usermode qemu on ARM with plugapps (archlinux) with archlinux i386 chroot to work.

1. I installed Arch Linux in a VirtualBox and created a chroot for it with mkarchroot. Transferred it to my pogo plug into /i386/

2. I compiled qemu-i386 static and put it into /i386/usr/bin/

   ./configure --static --disable-blobs --disable-system --target-list=i386-linux-user
   make

3. I also compiled linux kernel 2.6.38 with CONFIG_BINFMT_MISC=y and installed it.

   uname -a
   Linux Plugbox 2.6.38 #4 PREEMPT Fri Mar 18 22:19:10 CDT 2011 armv5tel Feroceon 88FR131 rev 1 (v5l) Marvell SheevaPlug Reference Board GNU/Linux

4. Added the following options into /etc/rc.local

   /sbin/modprobe binfmt_misc
   /bin/mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
   echo ':qemu-i386:M::\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x03\x00:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff\xff:/usr/bin/qemu-i386:' > /proc/sys/fs/binfmt_misc/register

5. Also copied ld-linux.so.3 (actually ld-2.13.so because ld-linux.so.3 is a link to that file) from /lib/ to /i386/lib/

6. Now I chroot into /i386 and I get this:

   [root@Plugbox i386]# chroot .
   [II aI hnve ao n@P /]# pacman -Suy
   bash: fork: Invalid argument

7. I also downloaded linux-user-test-0.3 from qemu website and ran the test:

   [root@Plugbox linux-user-test-0.3]# make
   ./qemu-linux-user.sh
   [qemu-i386]
   ../qemu-0.14.0/i386-linux-user/qemu-i386 -L ./gnemul/qemu-i386 i386/ls -l dummyfile
   BUG IN DYNAMIC LINKER ld.so: dl-version.c: 210: _dl_check_map_versions: Assertion `needed != ((void *)0)' failed!
   make: *** [test] Error 127

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/739785/+subscriptions
Re: [Qemu-devel] How to run realview-pbx-a9 image in qemu
On 12 July 2011 07:04, Xiao Jiang jgq...@gmail.com wrote: It looks like I am not in luck, qemu still can't run successfully. I recompiled the qemu from linaro qemu tree and executed below instructions in order. 1. open window A, run below cmd. xjiang@xjiang-desktop:~/work/qemu$ sudo qemu-system-arm -M realview-pbx-a9 -m 1024 -kernel zImage-cortex-a9 -serial telnet::,server -append I think you have a bad kernel image. There are some prebuilt ones (and config files) linked off: http://www.arm.com/community/software-enablement/linux.php but unfortunately the webserver that holds them isn't happy today. Dave
Re: [Qemu-devel] [PULL 0/8] Block patches
Am 06.07.2011 16:21, schrieb Kevin Wolf: The following changes since commit 9312805d33e8b106bae356d13a8071fb37d75554: pxa2xx_lcd: add proper rotation support (2011-07-04 22:12:21 +0200) are available in the git repository at: git://repo.or.cz/qemu/kevin.git for-anthony Federico Simoncelli (1): qemu-img: Add cache command line option Johannes Stezenbach (1): block/raw-posix: Linux compat-ioctl warning workaround Kevin Wolf (3): Documentation: Remove outdated host_device note ide: Ignore reads during PIO in and writes during PIO out ide: Initialise buffers with zeros Luiz Capitulino (2): block: drive_init(): Simplify interface type setting block: drive_init(): Improve CHS setting error message Markus Armbruster (1): virtio-blk: Turn drive serial into a qdev property block/raw-posix.c| 14 + blockdev.c | 14 +++- hw/ide/core.c| 50 +- hw/s390-virtio-bus.c |4 ++- hw/s390-virtio-bus.h |1 + hw/virtio-blk.c | 29 -- hw/virtio-blk.h |2 + hw/virtio-pci.c |4 ++- hw/virtio-pci.h |1 + hw/virtio.h |3 +- qemu-img-cmds.hx |6 ++-- qemu-img.c | 80 + qemu-img.texi|6 13 files changed, 161 insertions(+), 53 deletions(-) Ping?
Re: [Qemu-devel] [PATCH 3/8] block: Support to keep track of I/O status
Kevin Wolf kw...@redhat.com writes: Am 12.07.2011 09:45, schrieb Markus Armbruster: Luiz Capitulino lcapitul...@redhat.com writes: This commit adds support to the BlockDriverState type to keep track of the last I/O status. That is, at every I/O operation we update a status field in the BlockDriverState instance. Valid statuses are: OK, FAILED and ENOSPC. ENOSPC is distinguished from FAILED because an management application can use it to implement thin-provisioning. This feature has to be explicit enabled by buses/devices supporting it. buses? Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- block.c | 18 ++ block.h |7 +++ block_int.h |2 ++ 3 files changed, 27 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 24a25d5..cc0a34e 100644 --- a/block.c +++ b/block.c @@ -195,6 +195,7 @@ BlockDriverState *bdrv_new(const char *device_name) if (device_name[0] != '\0') { QTAILQ_INSERT_TAIL(bdrv_states, bs, list); } +bs-iostatus_enabled = false; return bs; } @@ -2876,6 +2877,23 @@ int bdrv_in_use(BlockDriverState *bs) return bs-in_use; } +void bdrv_enable_iostatus(BlockDriverState *bs) +{ +bs-iostatus_enabled = true; +} + +void bdrv_iostatus_update(BlockDriverState *bs, int error) +{ +error = abs(error); + +if (!error) { +bs-iostatus = BDRV_IOS_OK; +} else { +bs-iostatus = (error == ENOSPC) ? 
BDRV_IOS_ENOSPC : + BDRV_IOS_FAILED; +} +} + int bdrv_img_create(const char *filename, const char *fmt, const char *base_filename, const char *base_fmt, char *options, uint64_t img_size, int flags) diff --git a/block.h b/block.h index 859d1d9..0dca1bb 100644 --- a/block.h +++ b/block.h @@ -50,6 +50,13 @@ typedef enum { BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP } BlockMonEventAction; +typedef enum { +BDRV_IOS_OK, BDRV_IOS_FAILED, BDRV_IOS_ENOSPC +} BlockIOStatus; + +void bdrv_iostatus_update(BlockDriverState *bs, int error); +void bdrv_enable_iostatus(BlockDriverState *bs); +void bdrv_enable_io_status(BlockDriverState *bs); void bdrv_mon_event(const BlockDriverState *bdrv, BlockMonEventAction action, int is_read); void bdrv_info_print(Monitor *mon, const QObject *data); diff --git a/block_int.h b/block_int.h index 1e265d2..09f038d 100644 --- a/block_int.h +++ b/block_int.h @@ -195,6 +195,8 @@ struct BlockDriverState { drivers. They are not used by the block driver */ int cyls, heads, secs, translation; BlockErrorAction on_read_error, on_write_error; +bool iostatus_enabled; +BlockIOStatus iostatus; char device_name[32]; unsigned long *dirty_bitmap; int64_t dirty_count; Okay, let's see what we got here. The block layer merely holds I/O status, device models set it. Device I/O status is not migrated. Why? bdrv_new() creates the BDS with I/O status tracking disabled. Devices that do tracking enable it in their qdev init method. If a device gets hot unplugged, tracking remains enabled. If the BDS then gets reused with a device that doesn't do tracking, I/O status becomes incorrect. Can't happen right now, because we automatically delete the BDS on hot unplug, but it's a trap. Suggest to disable tracking in bdrv_detach(). Actually, this is a symptom of the midlayer disease. I suspect things would be simpler if we hold the status in its rightful owner, the device model. Need a getter for it. 
I'm working on a patch series that moves misplaced state out of the block layer into device models and block drivers, and a I/O status getter will fit in easily there. This is host state, so the device is not the rightful owner. Devices should not even be involved with enabling it. They are because they do the tracking, and thus the tracking only works for device models that do it. Could it be done entirely within the block layer?
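For readers following the discussion, the status mapping done by bdrv_iostatus_update() in the quoted patch can be sketched in isolation. This is only an illustration: the io_status type and the iostatus_from_error() name are made up here and are not QEMU API.

```c
#include <errno.h>
#include <stdlib.h>

/* Illustrative stand-in for the patch's BlockIOStatus enum (not QEMU API). */
typedef enum {
    IOS_OK,
    IOS_FAILED,
    IOS_ENOSPC
} io_status;

/* Map an errno-style result (0, ENOSPC, -ENOSPC, ...) to an I/O status. */
static io_status iostatus_from_error(int error)
{
    error = abs(error);     /* callers may pass either errno or -errno */
    if (error == 0) {
        return IOS_OK;
    }
    /* ENOSPC is kept apart so management tools can grow thin-provisioned storage. */
    return error == ENOSPC ? IOS_ENOSPC : IOS_FAILED;
}
```

Whoever ends up owning the state (block layer or device model), the mapping itself stays this small; the open question in the thread is only where it is stored and who enables it.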
Re: [Qemu-devel] [PATCH v3] QMP: add snapshot_blkdev command
On 07/11/11 22:35, Luiz Capitulino wrote:
>> Sorry that is no go, you just broke the hmp implementation - you cannot
>> change the hmp behavior like that.
>
> HMP uses positional arguments, so changing argument names makes no
> difference. And, apart from some exceptions, it's not a stable interface,
> anyway...

I guess you're right about the naming not affecting the hmp interface. However hmp is far more usable to end users than qmp, so yes it does matter not to change the interface at random.

Jes
Re: [Qemu-devel] How to run realview-pbx-a9 image in qemu
David Gilbert wrote:
> On 12 July 2011 07:04, Xiao Jiang jgq...@gmail.com wrote:
>> It looks like I am not in luck, qemu still can't run successfully. I
>> recompiled the qemu from linaro qemu tree and executed below instructions
>> in order.
>> 1. open window A, run below cmd.
>> xjiang@xjiang-desktop:~/work/qemu$ sudo qemu-system-arm -M realview-pbx-a9 -m 1024 -kernel zImage-cortex-a9 -serial telnet::,server -append
>
> I think you have a bad kernel image. There are some prebuilt ones (and
> config files) linked off: http://www.arm.com/community/software-enablement/linux.php
> but unfortunately the webserver that holds them isn't happy today.
>
> Dave

Hi Dave,

Thanks for the link, the server is unhappy :(. I compiled this image from mainline ..., so to support qemu, there should be some modifications in linux kernel, maybe I should try the kernel tree from linaro. But fortunately, I can run the vexpress-a9 image in linaro qemu successfully this afternoon, which is also built from mainline, :)

Regards,
Xiao
[Qemu-devel] [PATCH v2 1/1] virtio-console: Prevent abort()s in case of host chardev close
A host chardev could close just before the guest sends some data to be written. This will cause an -EPIPE error. This shouldn't be propagated to virtio-serial-bus.

Ideally we should close the port once -EPIPE is received, but since the chardev interface doesn't return such meaningful values to its users, all we get is -1 for any kind of error.

Just return 0 for now and wait for chardevs to return better error messages to act better on the return messages.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio-console.c | 20 ++--
 1 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-console.c b/hw/virtio-console.c
index b076331..9ca0dc6 100644
--- a/hw/virtio-console.c
+++ b/hw/virtio-console.c
@@ -24,8 +24,24 @@ typedef struct VirtConsole {
 static ssize_t flush_buf(VirtIOSerialPort *port, const uint8_t *buf, size_t len)
 {
     VirtConsole *vcon = DO_UPCAST(VirtConsole, port, port);
-
-    return qemu_chr_write(vcon->chr, buf, len);
+    ssize_t ret;
+
+    ret = qemu_chr_write(vcon->chr, buf, len);
+    if (ret < 0) {
+        /*
+         * Ideally we'd get a better error code than just -1, but
+         * that's what the chardev interface gives us right now.  If
+         * we had a finer-grained message, like -EPIPE, we could close
+         * this connection.  Absent such error messages, the most we
+         * can do is to return 0 here.
+         *
+         * This will prevent stray -1 values to go to
+         * virtio-serial-bus.c and cause abort()s in
+         * do_flush_queued_data().
+         */
+        ret = 0;
+    }
+    return ret;
 }

 /* Callback function that's called when the guest opens the port */
-- 
1.7.6
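The effect of the change can be tried outside QEMU with a small stand-alone sketch. The write_fn callback type and both stub backends below are hypothetical stand-ins for qemu_chr_write(); only the clamping of negative results mirrors the patch.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical backend write callback: bytes written, or -1 on any error. */
typedef ssize_t (*write_fn)(const unsigned char *buf, size_t len);

/* Report "nothing consumed" instead of passing a bare -1 up to the bus code. */
static ssize_t flush_buf_clamped(write_fn backend,
                                 const unsigned char *buf, size_t len)
{
    ssize_t ret = backend(buf, len);
    return ret < 0 ? 0 : ret;
}

/* Stub: behaves like a chardev whose host side has already gone away. */
static ssize_t closed_backend(const unsigned char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return -1;
}

/* Stub: behaves like a chardev that accepts everything. */
static ssize_t ok_backend(const unsigned char *buf, size_t len)
{
    (void)buf;
    return (ssize_t)len;
}
```

With closed_backend the caller sees 0 bytes written rather than an error, which is the behaviour the commit message describes as the stop-gap until chardevs report finer-grained errors.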
Re: [Qemu-devel] [PATCH 1/2] xen_mapcache: remove unused variable
On Mon, 11 Jul 2011, Juan Quintela wrote:
Signed-off-by: Juan Quintela quint...@redhat.com

Acked-by: Stefano Stabellini stefano.stabell...@eu.citrix.com

---
 xen-mapcache.c | 3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/xen-mapcache.c b/xen-mapcache.c
index fac47cd..e2e324d 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -232,7 +232,7 @@ uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, u

 ram_addr_t qemu_ram_addr_from_mapcache(void *ptr)
 {
-    MapCacheEntry *entry = NULL, *pentry = NULL;
+    MapCacheEntry *entry = NULL;
     MapCacheRev *reventry;
     target_phys_addr_t paddr_index;
     target_phys_addr_t size;
@@ -258,7 +258,6 @@ ram_addr_t qemu_ram_addr_from_mapcache(void *ptr)

     entry = &mapcache->entry[paddr_index % mapcache->nr_buckets];
     while (entry && (entry->paddr_index != paddr_index || entry->size != size)) {
-        pentry = entry;
         entry = entry->next;
     }
     if (!entry) {
-- 
1.7.6
Re: [Qemu-devel] [PATCH 0/3] MIPS64 user mode emulation in QEMU with Cavium specific instruction support
Hi,

We have developed MIPS64 user mode emulation. In addition, we implemented the Cavium-specific instructions along with an Octeon CPU definition. We need your support to make our contribution publicly available as open source. I tried to resolve the issues pointed out by Aurelien Jarno, Riku, Nathan and other friends, and sent the patches on Jul 5. Please review the patch series and give your feedback in the form of comments and suggestions.

Thanks

On Tue, Jul 5, 2011 at 2:19 PM, kha...@kics.edu.pk wrote:
From: Khansa Butt kha...@kics.edu.pk

This is the team work of Ehsan-ul-Haq, Abdul Qadeer, Abdul Waheed and Khansa Butt from HPCN Lab, KICS, UET Lahore. Cavium Networks' Octeon processors are based on MIPS64r2. We have implemented 27 user-mode Cavium-specific instructions.

Richard Henderson told me that QEMU does not support 64-bit address spaces in user mode on a 32-bit host, so this code will work only on a 64-bit host. We did a workaround to run MIPS64 on 32-bit x86, and it can be generalized for other architectures; we will submit that after this submission.

This development work is tested on 64-bit x86 and is working fine. All Cavium-specific instructions are also tested. Test cases can be provided if required. Octeon binaries (ELF) can be downloaded from the links below:
1) http://dl.dropbox.com/u/19530066/hw_mips
2) http://dl.dropbox.com/u/19530066/matmul

If you have any objection regarding the implementation of the Cavium instructions, please read the following notes.

Notes
* The details of some instructions are as follows.

1) seq rd,rs,rt
seq: rd = 1 if rs = rt. It is equivalent to
    xor rd,rs,rt
    sltiu rd,rd,1

2) exts rt,rs,p,lenm1
rt = sign-extend(rs<p+lenm1:p>, lenm1)
From the reference manual of Cavium Networks: bit locations p + lenm1 to p are extracted from rs and the result is written into the lowest bits of destination register rt. The remaining bits in rt are a sign-extension of the most-significant bit of the bit field (i.e. rt<63:lenm1> are all duplicates of the source-register bit rs<p+lenm1>). So we can't use any of the 8, 16 or 32 bit sign-extension tcg functions. To sign-extend according to the msb of the bit field we have our own implementation.

3) dmul rd,rs,rt
This instruction is included in gen_arith() because it is a three-operand doubleword multiply instruction.

-- 
1.7.3.4
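The semantics described in notes 1) and 2) can be written down in plain C as a reference model. This is a sketch based only on the descriptions above, not the TCG code from the patch series; the function names and the asserted operand ranges are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* seq rd,rs,rt: rd = 1 if rs == rt, else 0 (same as xor followed by sltiu rd,rd,1). */
static uint64_t seq(uint64_t rs, uint64_t rt)
{
    return (rs ^ rt) < 1;
}

/*
 * exts rt,rs,p,lenm1: extract bits p+lenm1..p of rs into the low bits of the
 * result and replicate the top bit of that field into all higher bits.
 */
static int64_t exts(uint64_t rs, unsigned p, unsigned lenm1)
{
    assert(p < 64 && lenm1 < 64 - p);       /* field must fit in 64 bits */

    uint64_t sign = UINT64_C(1) << lenm1;   /* msb of the extracted field */
    uint64_t field = (rs >> p) & ((sign << 1) - 1);

    /* Standard trick: flip the sign bit, then subtract it to sign-extend. */
    return (int64_t)((field ^ sign) - sign);
}
```

For example, exts(0xF0, 4, 3) extracts the 4-bit field 0xF, whose top bit is set, so the result is -1; exts(0x70, 4, 3) extracts 0x7 and returns 7. Because the field width is arbitrary, the fixed 8/16/32-bit sign-extension helpers are indeed not enough on their own.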
Re: [Qemu-devel] qemu-user[armel/mips] and debian-rootfs
2011/7/11 Riku Voipio riku.voi...@iki.fi: On Mon, Jul 11, 2011 at 11:10:50AM -0300, Lisandro Damián Nicanor Pérez Meyer wrote: Thanks Riku! This bug has already been solved by Wesley Terpstra: http://lists.nongnu.org/archive/html/qemu-devel/2011-07/msg00313.html Ok, I missed these patches. Will adjust the linux-user patchset to include these patches if no bugs found in them. Just a heads up---if you only use my original patch, mipseb still won't be able to run 'make'. My final patchset (sent as a series of 6 patches) resolves this in addition to the other problems.
[Qemu-devel] Best qemu ARM board supporting 1G+ ?
Greetings.

I have an arm/versatileab running debian/sid, but this only has access to 256MB of main memory. For my current project I need a minimum of 1GB. I know that the realview-pbx-a9 can do this, but am unsure how well it's supported.

1. Which qemu ARM boards support >= 1GB of memory?
2. Which of those can run a linux kernel reliably enough to host debian?

Thanks!
[Qemu-devel] [PATCH v9 00/12] Adding VMDK monolithic flat support
Changes from v8:
    09/12: remove duplicated sscanf
    10/12: change option name to 'subformat', change commit message typo, factor common parts of creating, and other small improvements

Fam Zheng (12):
  VMDK: introduce VmdkExtent
  VMDK: bugfix, align offset to cluster in get_whole_cluster
  VMDK: probe for monolithicFlat images
  VMDK: separate vmdk_open by format version
  VMDK: add field BDRVVmdkState.desc_offset
  VMDK: flush multiple extents
  VMDK: move 'static' cid_update flag to bs field
  VMDK: change get_cluster_offset return type
  VMDK: open/read/write for monolithicFlat image
  VMDK: create different subformats
  VMDK: fix coding style
  block: add bdrv_get_allocated_file_size() operation

 block.c           |   19 +
 block.h           |    1 +
 block/raw-posix.c |   21 +
 block/raw-win32.c |   29 ++
 block/vmdk.c      | 1296 -
 block_int.h       |    2 +
 qemu-img.c        |   31 +--
 7 files changed, 964 insertions(+), 435 deletions(-)
[Qemu-devel] [PATCH v9 06/12] VMDK: flush multiple extents
Flush all the files that are referenced by the image.

Signed-off-by: Fam Zheng famc...@gmail.com
---
 block/vmdk.c | 12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 529ae90..f6d2986 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1072,7 +1072,17 @@ static void vmdk_close(BlockDriverState *bs)

 static int vmdk_flush(BlockDriverState *bs)
 {
-    return bdrv_flush(bs->file);
+    int i, ret, err;
+    BDRVVmdkState *s = bs->opaque;
+
+    ret = bdrv_flush(bs->file);
+    for (i = 0; i < s->num_extents; i++) {
+        err = bdrv_flush(s->extents[i].file);
+        if (err < 0) {
+            ret = err;
+        }
+    }
+    return ret;
 }
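The error handling in the loop is worth spelling out: every extent is flushed even if an earlier flush failed, and the last failure seen is what gets returned. A stand-alone sketch of that pattern follows; flush_fn, struct extent and flush_stub are illustrative stand-ins, not the QEMU types.

```c
#include <stddef.h>

/* Hypothetical flush callback: 0 on success, negative errno on failure. */
typedef int (*flush_fn)(void *file);

/* Illustrative stand-in for VmdkExtent: only the backing file handle matters here. */
struct extent {
    void *file;
};

/* Flush the descriptor file and every extent; keep going after a failure. */
static int flush_all(flush_fn flush, void *desc_file,
                     const struct extent *extents, int num_extents)
{
    int ret = flush(desc_file);

    for (int i = 0; i < num_extents; i++) {
        int err = flush(extents[i].file);
        if (err < 0) {
            ret = err;  /* remember the failure, but still flush the rest */
        }
    }
    return ret;
}

/* Stub used for experiments: the "file" is an int holding the result to return. */
static int flush_stub(void *file)
{
    return *(int *)file;
}
```

Stopping at the first error would be simpler, but it would leave later extents unflushed; the patch's choice gives best-effort durability at the price of reporting only one of possibly several errors.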
[Qemu-devel] [PATCH v9 08/12] VMDK: change get_cluster_offset return type
The return type of get_cluster_offset was an offset that use 0 to denote 'not allocated', this will be no longer true for flat extents, as we see flat extent file as a single huge cluster whose offset is 0 and length is the whole file length. So now we use int return value, 0 means success and otherwise offset invalid. Signed-off-by: Fam Zheng famc...@gmail.com --- block/vmdk.c | 79 ++--- 1 files changed, 42 insertions(+), 37 deletions(-) diff --git a/block/vmdk.c b/block/vmdk.c index 8dc58a8..f637d98 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -665,26 +665,31 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data) return 0; } -static uint64_t get_cluster_offset(BlockDriverState *bs, +static int get_cluster_offset(BlockDriverState *bs, VmdkExtent *extent, VmdkMetaData *m_data, -uint64_t offset, int allocate) +uint64_t offset, +int allocate, +uint64_t *cluster_offset) { unsigned int l1_index, l2_offset, l2_index; int min_index, i, j; uint32_t min_count, *l2_table, tmp = 0; -uint64_t cluster_offset; if (m_data) m_data-valid = 0; +if (extent-flat) { +*cluster_offset = 0; +return 0; +} l1_index = (offset 9) / extent-l1_entry_sectors; if (l1_index = extent-l1_size) { -return 0; +return -1; } l2_offset = extent-l1_table[l1_index]; if (!l2_offset) { -return 0; +return -1; } for (i = 0; i L2_CACHE_SIZE; i++) { if (l2_offset == extent-l2_cache_offsets[i]) { @@ -714,28 +719,29 @@ static uint64_t get_cluster_offset(BlockDriverState *bs, l2_table, extent-l2_size * sizeof(uint32_t) ) != extent-l2_size * sizeof(uint32_t)) { -return 0; +return -1; } extent-l2_cache_offsets[min_index] = l2_offset; extent-l2_cache_counts[min_index] = 1; found: l2_index = ((offset 9) / extent-cluster_sectors) % extent-l2_size; -cluster_offset = le32_to_cpu(l2_table[l2_index]); +*cluster_offset = le32_to_cpu(l2_table[l2_index]); -if (!cluster_offset) { -if (!allocate) -return 0; +if (!*cluster_offset) { +if (!allocate) { +return -1; +} // Avoid the L2 tables update for the images 
that have snapshots. -cluster_offset = bdrv_getlength(extent-file); +*cluster_offset = bdrv_getlength(extent-file); bdrv_truncate( extent-file, -cluster_offset + (extent-cluster_sectors 9) +*cluster_offset + (extent-cluster_sectors 9) ); -cluster_offset = 9; -tmp = cpu_to_le32(cluster_offset); +*cluster_offset = 9; +tmp = cpu_to_le32(*cluster_offset); l2_table[l2_index] = tmp; /* First of all we write grain itself, to avoid race condition @@ -744,8 +750,8 @@ static uint64_t get_cluster_offset(BlockDriverState *bs, * or inappropriate VM shutdown. */ if (get_whole_cluster( -bs, extent, cluster_offset, offset, allocate) == -1) -return 0; +bs, extent, *cluster_offset, offset, allocate) == -1) +return -1; if (m_data) { m_data-offset = tmp; @@ -755,8 +761,8 @@ static uint64_t get_cluster_offset(BlockDriverState *bs, m_data-valid = 1; } } -cluster_offset = 9; -return cluster_offset; +*cluster_offset = 9; +return 0; } static VmdkExtent *find_extent(BDRVVmdkState *s, @@ -780,7 +786,6 @@ static int vmdk_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum) { BDRVVmdkState *s = bs-opaque; - int64_t index_in_cluster, n, ret; uint64_t offset; VmdkExtent *extent; @@ -789,15 +794,13 @@ static int vmdk_is_allocated(BlockDriverState *bs, int64_t sector_num, if (!extent) { return 0; } -if (extent-flat) { -n = extent-end_sector - sector_num; -ret = 1; -} else { -offset = get_cluster_offset(bs, extent, NULL, sector_num * 512, 0); -index_in_cluster = sector_num % extent-cluster_sectors; -n = extent-cluster_sectors - index_in_cluster; -ret = offset ? 1 : 0; -} +ret = get_cluster_offset(bs, extent, NULL, +sector_num * 512, 0, offset); +/* get_cluster_offset returning 0 means success */ +ret = !ret; + +index_in_cluster = sector_num % extent-cluster_sectors; +n = extent-cluster_sectors - index_in_cluster; if (n nb_sectors) n = nb_sectors; *pnum = n; @@ -818,14 +821,15 @@ static int
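The reason for the signature change can be shown with a deliberately simplified model: once offset 0 is a valid answer (a flat extent is one big cluster starting at 0), "0 means not allocated" no longer works, so the status and the offset have to travel separately. The toy_extent type below uses a single-level table and is purely illustrative; it is not the VMDK on-disk layout or the code from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy extent: either flat, or sparse with a single-level cluster table. */
struct toy_extent {
    bool flat;
    const uint32_t *table;      /* cluster start in sectors, 0 = unallocated */
    unsigned table_size;
    unsigned cluster_sectors;
};

/* Returns 0 and sets *cluster_offset (in bytes) on success, -1 if unallocated. */
static int toy_get_cluster_offset(const struct toy_extent *e, uint64_t offset,
                                  uint64_t *cluster_offset)
{
    if (e->flat) {
        *cluster_offset = 0;    /* valid result, so 0 cannot double as an error */
        return 0;
    }

    uint64_t index = (offset >> 9) / e->cluster_sectors;
    if (index >= e->table_size || e->table[index] == 0) {
        return -1;
    }
    *cluster_offset = (uint64_t)e->table[index] << 9;
    return 0;
}
```

A caller such as vmdk_is_allocated() then only has to test the return value, which is what the "ret = !ret" lines in the patch do.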
[Qemu-devel] [PATCH v9 01/12] VMDK: introduce VmdkExtent
Introduced VmdkExtent array into BDRVVmdkState, enable holding multiple image extents for multiple file image support. Signed-off-by: Fam Zheng famc...@gmail.com --- block/vmdk.c | 348 +- 1 files changed, 246 insertions(+), 102 deletions(-) diff --git a/block/vmdk.c b/block/vmdk.c index 922b23d..3b78583 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -60,7 +60,11 @@ typedef struct { #define L2_CACHE_SIZE 16 -typedef struct BDRVVmdkState { +typedef struct VmdkExtent { +BlockDriverState *file; +bool flat; +int64_t sectors; +int64_t end_sector; int64_t l1_table_offset; int64_t l1_backup_table_offset; uint32_t *l1_table; @@ -74,7 +78,13 @@ typedef struct BDRVVmdkState { uint32_t l2_cache_counts[L2_CACHE_SIZE]; unsigned int cluster_sectors; +} VmdkExtent; + +typedef struct BDRVVmdkState { uint32_t parent_cid; +int num_extents; +/* Extent array with num_extents entries, ascend ordered by address */ +VmdkExtent *extents; } BDRVVmdkState; typedef struct VmdkMetaData { @@ -105,6 +115,19 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename) #define DESC_SIZE 20*SECTOR_SIZE // 20 sectors of 512 bytes each #define HEADER_SIZE 512// first sector of 512 bytes +static void vmdk_free_extents(BlockDriverState *bs) +{ +int i; +BDRVVmdkState *s = bs-opaque; + +for (i = 0; i s-num_extents; i++) { +qemu_free(s-extents[i].l1_table); +qemu_free(s-extents[i].l2_cache); +qemu_free(s-extents[i].l1_backup_table); +} +qemu_free(s-extents); +} + static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent) { char desc[DESC_SIZE]; @@ -358,11 +381,50 @@ static int vmdk_parent_open(BlockDriverState *bs) return 0; } +/* Create and append extent to the extent array. Return the added VmdkExtent + * address. return NULL if allocation failed. 
*/ +static VmdkExtent *vmdk_add_extent(BlockDriverState *bs, + BlockDriverState *file, bool flat, int64_t sectors, + int64_t l1_offset, int64_t l1_backup_offset, + uint32_t l1_size, + int l2_size, unsigned int cluster_sectors) +{ +VmdkExtent *extent; +BDRVVmdkState *s = bs-opaque; + +s-extents = qemu_realloc(s-extents, + (s-num_extents + 1) * sizeof(VmdkExtent)); +extent = s-extents[s-num_extents]; +s-num_extents++; + +memset(extent, 0, sizeof(VmdkExtent)); +extent-file = file; +extent-flat = flat; +extent-sectors = sectors; +extent-l1_table_offset = l1_offset; +extent-l1_backup_table_offset = l1_backup_offset; +extent-l1_size = l1_size; +extent-l1_entry_sectors = l2_size * cluster_sectors; +extent-l2_size = l2_size; +extent-cluster_sectors = cluster_sectors; + +if (s-num_extents 1) { +extent-end_sector = (*(extent - 1)).end_sector + extent-sectors; +} else { +extent-end_sector = extent-sectors; +} +bs-total_sectors = extent-end_sector; +return extent; +} + + static int vmdk_open(BlockDriverState *bs, int flags) { BDRVVmdkState *s = bs-opaque; uint32_t magic; -int l1_size, i; +int i; +uint32_t l1_size, l1_entry_sectors; +VmdkExtent *extent = NULL; if (bdrv_pread(bs-file, 0, magic, sizeof(magic)) != sizeof(magic)) goto fail; @@ -370,32 +432,34 @@ static int vmdk_open(BlockDriverState *bs, int flags) magic = be32_to_cpu(magic); if (magic == VMDK3_MAGIC) { VMDK3Header header; - -if (bdrv_pread(bs-file, sizeof(magic), header, sizeof(header)) != sizeof(header)) +if (bdrv_pread(bs-file, sizeof(magic), header, sizeof(header)) +!= sizeof(header)) { goto fail; -s-cluster_sectors = le32_to_cpu(header.granularity); -s-l2_size = 1 9; -s-l1_size = 1 6; -bs-total_sectors = le32_to_cpu(header.disk_sectors); -s-l1_table_offset = le32_to_cpu(header.l1dir_offset) 9; -s-l1_backup_table_offset = 0; -s-l1_entry_sectors = s-l2_size * s-cluster_sectors; +} +extent = vmdk_add_extent(bs, bs-file, false, + le32_to_cpu(header.disk_sectors), + le32_to_cpu(header.l1dir_offset) 9, 0, + 1 6, 1 
9, le32_to_cpu(header.granularity)); } else if (magic == VMDK4_MAGIC) { VMDK4Header header; - -if (bdrv_pread(bs-file, sizeof(magic), header, sizeof(header)) != sizeof(header)) +if (bdrv_pread(bs-file, sizeof(magic), header, sizeof(header)) +!= sizeof(header)) { goto fail; -bs-total_sectors = le64_to_cpu(header.capacity); -s-cluster_sectors = le64_to_cpu(header.granularity); -s-l2_size =
[Qemu-devel] [PATCH v9 12/12] block: add bdrv_get_allocated_file_size() operation
qemu-img.c wants to count allocated file size of image. Previously it counts a single bs-file by 'stat' or Window API. As VMDK introduces multiple file support, the operation becomes format specific with platform specific meanwhile. The functions are moved to block/raw-{posix,win32}.c and qemu-img.c calls bdrv_get_allocated_file_size to count the bs. And also added VMDK code to count his own extents. Signed-off-by: Fam Zheng famc...@gmail.com --- block.c | 19 +++ block.h |1 + block/raw-posix.c | 21 + block/raw-win32.c | 29 + block/vmdk.c | 24 block_int.h |1 + qemu-img.c| 31 +-- 7 files changed, 96 insertions(+), 30 deletions(-) diff --git a/block.c b/block.c index 24a25d5..9549b9e 100644 --- a/block.c +++ b/block.c @@ -1147,6 +1147,25 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset) } /** + * Length of a allocated file in bytes. Sparse files are counted by actual + * allocated space. Return 0 if error or unknown. + */ +int64_t bdrv_get_allocated_file_size(BlockDriverState *bs) +{ +BlockDriver *drv = bs-drv; +if (!drv) { +return -ENOMEDIUM; +} +if (drv-bdrv_get_allocated_file_size) { +return drv-bdrv_get_allocated_file_size(bs); +} +if (bs-file) { +return bdrv_get_allocated_file_size(bs-file); +} +return -ENOTSUP; +} + +/** * Length of a file in bytes. Return 0 if error or unknown. 
*/ int64_t bdrv_getlength(BlockDriverState *bs) diff --git a/block.h b/block.h index 859d1d9..59cc410 100644 --- a/block.h +++ b/block.h @@ -89,6 +89,7 @@ int bdrv_write_sync(BlockDriverState *bs, int64_t sector_num, const uint8_t *buf, int nb_sectors); int bdrv_truncate(BlockDriverState *bs, int64_t offset); int64_t bdrv_getlength(BlockDriverState *bs); +int64_t bdrv_get_allocated_file_size(BlockDriverState *bs); void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr); void bdrv_guess_geometry(BlockDriverState *bs, int *pcyls, int *pheads, int *psecs); int bdrv_commit(BlockDriverState *bs); diff --git a/block/raw-posix.c b/block/raw-posix.c index 4cd7d7a..911cc0d 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -791,6 +791,17 @@ static int64_t raw_getlength(BlockDriverState *bs) } #endif +static int64_t raw_get_allocated_file_size(BlockDriverState *bs) +{ +struct stat st; +BDRVRawState *s = bs-opaque; + +if (fstat(s-fd, st) 0) { +return -errno; +} +return (int64_t)st.st_blocks * 512; +} + static int raw_create(const char *filename, QEMUOptionParameter *options) { int fd; @@ -886,6 +897,8 @@ static BlockDriver bdrv_file = { .bdrv_truncate = raw_truncate, .bdrv_getlength = raw_getlength, +.bdrv_get_allocated_file_size += raw_get_allocated_file_size, .create_options = raw_create_options, }; @@ -1154,6 +1167,8 @@ static BlockDriver bdrv_host_device = { .bdrv_read = raw_read, .bdrv_write = raw_write, .bdrv_getlength= raw_getlength, +.bdrv_get_allocated_file_size += raw_get_allocated_file_size, /* generic scsi device */ #ifdef __linux__ @@ -1269,6 +1284,8 @@ static BlockDriver bdrv_host_floppy = { .bdrv_read = raw_read, .bdrv_write = raw_write, .bdrv_getlength= raw_getlength, +.bdrv_get_allocated_file_size += raw_get_allocated_file_size, /* removable device support */ .bdrv_is_inserted = floppy_is_inserted, @@ -1366,6 +1383,8 @@ static BlockDriver bdrv_host_cdrom = { .bdrv_read = raw_read, .bdrv_write = raw_write, .bdrv_getlength = raw_getlength, 
+.bdrv_get_allocated_file_size += raw_get_allocated_file_size, /* removable device support */ .bdrv_is_inserted = cdrom_is_inserted, @@ -1489,6 +1508,8 @@ static BlockDriver bdrv_host_cdrom = { .bdrv_read = raw_read, .bdrv_write = raw_write, .bdrv_getlength = raw_getlength, +.bdrv_get_allocated_file_size += raw_get_allocated_file_size, /* removable device support */ .bdrv_is_inserted = cdrom_is_inserted, diff --git a/block/raw-win32.c b/block/raw-win32.c index 56bd719..91067e7 100644 --- a/block/raw-win32.c +++ b/block/raw-win32.c @@ -213,6 +213,31 @@ static int64_t raw_getlength(BlockDriverState *bs) return l.QuadPart; } +static int64_t raw_get_allocated_file_size(BlockDriverState *bs) +{ +typedef DWORD (WINAPI * get_compressed_t)(const char *filename, + DWORD * high); +get_compressed_t get_compressed; +struct _stati64 st; +const char *filename = bs-filename; +/* WinNT support GetCompressedFileSize to determine allocate size */ +
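The POSIX half of this patch boils down to one fstat() call. A minimal stand-alone version, taking a plain file descriptor instead of a BlockDriverState, might look like the following; it assumes, as the patch does, that st_blocks is counted in 512-byte units, which holds on Linux but is not guaranteed everywhere.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/stat.h>

/* Bytes actually allocated for the file behind fd, or -errno on failure. */
static int64_t allocated_file_size(int fd)
{
    struct stat st;

    if (fstat(fd, &st) < 0) {
        return -errno;
    }
    /* Assumes st_blocks is in 512-byte units; sparse holes are not counted. */
    return (int64_t)st.st_blocks * 512;
}
```

For a sparse image this value can be much smaller than st_size, which is exactly the difference qemu-img reports between virtual size and disk size.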
[Qemu-devel] [PATCH v9 02/12] VMDK: bugfix, align offset to cluster in get_whole_cluster
In get_whole_cluster, the offset is not aligned to the cluster boundary when reading from backing_hd. When the first write to the child is not at a cluster boundary, data from the wrong address in the parent is copied to the child.

Signed-off-by: Fam Zheng famc...@gmail.com
---
 block/vmdk.c | 8 +---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 3b78583..03a4619 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -514,21 +514,23 @@ static int get_whole_cluster(BlockDriverState *bs,
     /* 128 sectors * 512 bytes each = grain size 64KB */
     uint8_t whole_grain[extent->cluster_sectors * 512];

-    // we will be here if it's first write on non-exist grain(cluster).
-    // try to read from parent image, if exist
+    /* we will be here if it's first write on non-exist grain(cluster).
+     * try to read from parent image, if exist */
     if (bs->backing_hd) {
         int ret;

         if (!vmdk_is_cid_valid(bs))
             return -1;

+        /* floor offset to cluster */
+        offset -= offset % (extent->cluster_sectors * 512);
         ret = bdrv_read(bs->backing_hd, offset >> 9, whole_grain,
                 extent->cluster_sectors);
         if (ret < 0) {
             return -1;
         }

-        //Write grain only into the active image
+        /* Write grain only into the active image */
         ret = bdrv_write(extent->file, cluster_offset, whole_grain,
                 extent->cluster_sectors);
         if (ret < 0) {
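The arithmetic of the fix, taken out of context: round the byte offset down to the start of its cluster before reading the whole grain from the parent. The helper name below is made up for illustration.

```c
#include <stdint.h>

/* Round a byte offset down to the start of its cluster (512-byte sectors). */
static uint64_t floor_to_cluster(uint64_t offset, uint32_t cluster_sectors)
{
    uint64_t cluster_bytes = (uint64_t)cluster_sectors * 512;
    return offset - offset % cluster_bytes;
}
```

With the usual 128-sector (64KB) grain, a first write at byte 70000 must copy the parent's data starting at byte 65536, not at 70000; without the rounding, the grain written to the child would be shifted by 4464 bytes relative to the parent.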
Re: [Qemu-devel] Best qemu ARM board supporting 1G+ ?
On 12 July 2011 12:36, Wesley W. Terpstra wes...@terpstra.ca wrote:
> I have an arm/versatileab running debian/sid, but this only has access to
> 256MB of main memory. For my current project I need a minimum of 1GB. I
> know that the realview-pbx-a9 can do this, but am unsure how well it's
> supported.
>
> 1. Which qemu ARM boards support >= 1GB of memory?

realview-pb-a8, realview-pbx-a9 and vexpress-a9.

> 2. Which of those can run a linux kernel reliably enough to host debian?

They should all be OK to run the kernel. (You may find that you need to tweak some kernel config options, eg disable perf support, or patch qemu to support the perf regs.) The one I use most is vexpress-a9. But at a board level none of them differ all that much from each other (they're not compatible, but any bug you run into in one you'll probably hit in the others too).

Note that the 0.14.x releases all have known bugs in the ARMv7 support; current git master is much better in this regard.

Unfortunately all three of these boards have no PCI support, which means no SCSI, so disk is going to be SD card. There seems to be a nasty performance problem with the SD card emulation to the point where anything doing heavy I/O is either very slow or appears to lock up. (That's on my todo list to investigate but so are a lot of other things :-( )

-- PMM
[Qemu-devel] [PATCH v9 09/12] VMDK: open/read/write for monolithicFlat image
Parse the VMDK descriptor file and open monolithicFlat images. Read/write
the flat extent.

Signed-off-by: Fam Zheng <famc...@gmail.com>
---
 block/vmdk.c |  171
 1 files changed, 158 insertions(+), 13 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f637d98..93ac289 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -65,6 +65,7 @@ typedef struct VmdkExtent {
     bool flat;
     int64_t sectors;
     int64_t end_sector;
+    int64_t flat_start_offset;
     int64_t l1_table_offset;
     int64_t l1_backup_table_offset;
     uint32_t *l1_table;
@@ -407,9 +408,10 @@ fail:
 static int vmdk_parent_open(BlockDriverState *bs)
 {
     char *p_name;
-    char desc[DESC_SIZE];
+    char desc[DESC_SIZE + 1];
     BDRVVmdkState *s = bs->opaque;
 
+    desc[DESC_SIZE] = '\0';
     if (bdrv_pread(bs->file, s->desc_offset, desc, DESC_SIZE) != DESC_SIZE) {
         return -1;
     }
@@ -584,6 +586,144 @@ static int vmdk_open_vmdk4(BlockDriverState *bs, int flags)
     return ret;
 }
 
+/* find an option value out of descriptor file */
+static int vmdk_parse_description(const char *desc, const char *opt_name,
+        char *buf, int buf_size)
+{
+    char *opt_pos, *opt_end;
+    const char *end = desc + strlen(desc);
+
+    opt_pos = strstr(desc, opt_name);
+    if (!opt_pos) {
+        return -1;
+    }
+    /* Skip "=\"" following opt_name */
+    opt_pos += strlen(opt_name) + 2;
+    if (opt_pos >= end) {
+        return -1;
+    }
+    opt_end = opt_pos;
+    while (opt_end < end && *opt_end != '"') {
+        opt_end++;
+    }
+    if (opt_end == end || buf_size < opt_end - opt_pos + 1) {
+        return -1;
+    }
+    pstrcpy(buf, opt_end - opt_pos + 1, opt_pos);
+    return 0;
+}
+
+static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
+        const char *desc_file_path)
+{
+    int ret;
+    char access[11];
+    char type[11];
+    char fname[512];
+    const char *p = desc;
+    int64_t sectors = 0;
+    int64_t flat_offset;
+
+    while (*p) {
+        /* parse extent line:
+         * RW [size in sectors] FLAT "file-name.vmdk" OFFSET
+         * or
+         * RW [size in sectors] SPARSE "file-name.vmdk"
+         */
+        flat_offset = -1;
+        ret = sscanf(p, "%10s %lld %10s %511s %lld",
+                access, &sectors, type, fname, &flat_offset);
+        if (ret < 4 || strcmp(access, "RW")) {
+            goto next_line;
+        } else if (!strcmp(type, "FLAT")) {
+            if (ret != 5 || flat_offset < 0) {
+                return -EINVAL;
+            }
+        } else if (ret != 4) {
+            return -EINVAL;
+        }
+
+        /* trim the quotation marks around */
+        if (fname[0] == '"') {
+            memmove(fname, fname + 1, strlen(fname));
+            if (strlen(fname) <= 1 || fname[strlen(fname) - 1] != '"') {
+                return -EINVAL;
+            }
+            fname[strlen(fname) - 1] = '\0';
+        }
+        if (sectors <= 0 ||
+            (strcmp(type, "FLAT") && strcmp(type, "SPARSE")) ||
+            (strcmp(access, "RW"))) {
+            goto next_line;
+        }
+
+        /* save to extents array */
+        if (!strcmp(type, "FLAT")) {
+            /* FLAT extent */
+            char extent_path[PATH_MAX];
+            BlockDriverState *extent_file;
+            VmdkExtent *extent;
+
+            path_combine(extent_path, sizeof(extent_path),
+                    desc_file_path, fname);
+            ret = bdrv_file_open(&extent_file, extent_path, bs->open_flags);
+            if (ret) {
+                return ret;
+            }
+            extent = vmdk_add_extent(bs, extent_file, true, sectors,
+                            0, 0, 0, 0, sectors);
+            extent->flat_start_offset = flat_offset;
+        } else {
+            /* SPARSE extent, not supported for now */
+            fprintf(stderr,
+                "VMDK: Not supported extent type \"%s\".\n", type);
+            return -ENOTSUP;
+        }
+next_line:
+        /* move to next line */
+        while (*p && *p != '\n') {
+            p++;
+        }
+        p++;
+    }
+    return 0;
+}
+
+static int vmdk_open_desc_file(BlockDriverState *bs, int flags)
+{
+    int ret;
+    char buf[2048];
+    char ct[128];
+    BDRVVmdkState *s = bs->opaque;
+
+    ret = bdrv_pread(bs->file, 0, buf, sizeof(buf));
+    if (ret < 0) {
+        return ret;
+    }
+    buf[2047] = '\0';
+    if (vmdk_parse_description(buf, "createType", ct, sizeof(ct))) {
+        return -EINVAL;
+    }
+    if (strcmp(ct, "monolithicFlat")) {
+        fprintf(stderr,
+                "VMDK: Not supported image type \"%s\".\n", ct);
+        return -ENOTSUP;
+    }
+    s->desc_offset = 0;
+    ret = vmdk_parse_extents(buf, bs, bs->file->filename);
+    if (ret) {
+        return ret;
+    }
+
+    /* try to open parent images, if exist */
+    if (vmdk_parent_open(bs)) {
+        qemu_free(s->extents);
+        return -EINVAL;
+    }
+    s->parent_cid = vmdk_read_cid(bs, 1);
+    return 0;
+}
+
 static
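For readers who want to try the extent-line parsing outside QEMU, here is a minimal standalone sketch. It is not code from the patch: `ExtentDesc` and `parse_extent_line` are hypothetical names, and the sketch uses `SCNd64` where the patch uses `%lld`. Like the patch, it relies on `%511s`, so a file name containing spaces would not parse.

```c
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one parsed extent line (not part of the patch). */
typedef struct ExtentDesc {
    char access[11];
    char type[11];
    char fname[512];
    int64_t sectors;
    int64_t flat_offset;   /* stays -1 when the line has no OFFSET field */
} ExtentDesc;

/*
 * Parse one descriptor line such as:
 *   RW 4192256 FLAT "disk-flat.vmdk" 0
 *   RW 4192256 SPARSE "disk.vmdk"
 * Returns 0 on success, -1 if the line is not a usable RW extent line.
 */
static int parse_extent_line(const char *line, ExtentDesc *d)
{
    size_t len;
    int n;

    d->flat_offset = -1;
    n = sscanf(line, "%10s %" SCNd64 " %10s %511s %" SCNd64,
               d->access, &d->sectors, d->type, d->fname, &d->flat_offset);
    if (n < 4 || strcmp(d->access, "RW") != 0 || d->sectors <= 0) {
        return -1;
    }
    if (strcmp(d->type, "FLAT") == 0) {
        /* FLAT extents must carry a non-negative start offset. */
        if (n != 5 || d->flat_offset < 0) {
            return -1;
        }
    } else if (strcmp(d->type, "SPARSE") != 0 || n != 4) {
        return -1;
    }

    /* Strip the surrounding double quotes from the file name, if present. */
    len = strlen(d->fname);
    if (d->fname[0] == '"') {
        if (len < 3 || d->fname[len - 1] != '"') {
            return -1;
        }
        memmove(d->fname, d->fname + 1, len - 2);
        d->fname[len - 2] = '\0';
    }
    return 0;
}
```

A caller would walk the descriptor line by line and skip lines for which the helper returns -1, which roughly corresponds to the `next_line` path in the patch.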
[Qemu-devel] [PATCH v9 04/12] VMDK: separate vmdk_open by format version
Separate vmdk_open by subformats to:
* vmdk_open_vmdk3
* vmdk_open_vmdk4

Signed-off-by: Fam Zheng <famc...@gmail.com>
---
 block/vmdk.c |  178
 1 files changed, 112 insertions(+), 66 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f8a815c..6d7b497 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -458,67 +458,20 @@ static VmdkExtent *vmdk_add_extent(BlockDriverState *bs,
     return extent;
 }
 
-
-static int vmdk_open(BlockDriverState *bs, int flags)
+static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent)
 {
-    BDRVVmdkState *s = bs->opaque;
-    uint32_t magic;
-    int i;
-    uint32_t l1_size, l1_entry_sectors;
-    VmdkExtent *extent = NULL;
-
-    if (bdrv_pread(bs->file, 0, &magic, sizeof(magic)) != sizeof(magic))
-        goto fail;
-
-    magic = be32_to_cpu(magic);
-    if (magic == VMDK3_MAGIC) {
-        VMDK3Header header;
-        if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
-                != sizeof(header)) {
-            goto fail;
-        }
-        extent = vmdk_add_extent(bs, bs->file, false,
-                              le32_to_cpu(header.disk_sectors),
-                              le32_to_cpu(header.l1dir_offset) << 9, 0,
-                              1 << 6, 1 << 9, le32_to_cpu(header.granularity));
-    } else if (magic == VMDK4_MAGIC) {
-        VMDK4Header header;
-        if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
-                != sizeof(header)) {
-            goto fail;
-        }
-        l1_entry_sectors = le32_to_cpu(header.num_gtes_per_gte)
-                            * le64_to_cpu(header.granularity);
-        l1_size = (le64_to_cpu(header.capacity) + l1_entry_sectors - 1)
-                    / l1_entry_sectors;
-        extent = vmdk_add_extent(bs, bs->file, false,
-                              le64_to_cpu(header.capacity),
-                              le64_to_cpu(header.gd_offset) << 9,
-                              le64_to_cpu(header.rgd_offset) << 9,
-                              l1_size,
-                              le32_to_cpu(header.num_gtes_per_gte),
-                              le64_to_cpu(header.granularity));
-        if (extent->l1_entry_sectors <= 0) {
-            goto fail;
-        }
-        // try to open parent images, if exist
-        if (vmdk_parent_open(bs) != 0)
-            goto fail;
-        // write the CID once after the image creation
-        s->parent_cid = vmdk_read_cid(bs,1);
-    } else {
-        goto fail;
-    }
+    int ret;
+    int l1_size, i;
 
     /* read the L1 table */
     l1_size = extent->l1_size * sizeof(uint32_t);
     extent->l1_table = qemu_malloc(l1_size);
-    if (bdrv_pread(bs->file,
-            extent->l1_table_offset,
-            extent->l1_table,
-            l1_size)
-        != l1_size) {
-        goto fail;
+    ret = bdrv_pread(extent->file,
+                    extent->l1_table_offset,
+                    extent->l1_table,
+                    l1_size);
+    if (ret < 0) {
+        goto fail_l1;
     }
     for (i = 0; i < extent->l1_size; i++) {
         le32_to_cpus(&extent->l1_table[i]);
@@ -526,12 +479,12 @@ static int vmdk_open(BlockDriverState *bs, int flags)
 
     if (extent->l1_backup_table_offset) {
         extent->l1_backup_table = qemu_malloc(l1_size);
-        if (bdrv_pread(bs->file,
-                    extent->l1_backup_table_offset,
-                    extent->l1_backup_table,
-                    l1_size)
-            != l1_size) {
-            goto fail;
+        ret = bdrv_pread(extent->file,
+                        extent->l1_backup_table_offset,
+                        extent->l1_backup_table,
+                        l1_size);
+        if (ret < 0) {
+            goto fail_l1b;
         }
         for (i = 0; i < extent->l1_size; i++) {
             le32_to_cpus(&extent->l1_backup_table[i]);
@@ -541,9 +494,102 @@ static int vmdk_open(BlockDriverState *bs, int flags)
     extent->l2_cache =
         qemu_malloc(extent->l2_size * L2_CACHE_SIZE * sizeof(uint32_t));
     return 0;
+ fail_l1b:
+    qemu_free(extent->l1_backup_table);
+ fail_l1:
+    qemu_free(extent->l1_table);
+    return ret;
+}
+
+static int vmdk_open_vmdk3(BlockDriverState *bs, int flags)
+{
+    int ret;
+    uint32_t magic;
+    VMDK3Header header;
+    VmdkExtent *extent;
+
+    ret = bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header));
+    if (ret < 0) {
+        goto fail;
+    }
+    extent = vmdk_add_extent(bs,
+                             bs->file, false,
+                             le32_to_cpu(header.disk_sectors),
+                             le32_to_cpu(header.l1dir_offset) << 9,
+                             0, 1 << 6, 1 << 9,
+                             le32_to_cpu(header.granularity));
+    ret = vmdk_init_tables(bs, extent);
+    if (ret) {
+        /* vmdk_init_tables cleans up on fail, so only free allocation of
+         * vmdk_add_extent here. */
+        goto fail;
+    }
+    return 0;
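The staged `fail_l1b` / `fail_l1` labels in vmdk_init_tables follow the usual C pattern of unwinding allocations in reverse order. A minimal sketch of that pattern outside QEMU follows; `Tables`, `init_tables`, `load_fn` and `load_until` are hypothetical names for illustration only, and plain `malloc`/`free` stand in for `qemu_malloc`/`qemu_free`.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct Tables {
    uint32_t *l1;
    uint32_t *l1_backup;
} Tables;

/* Hypothetical loader: returns 0 on success, a negative value on failure. */
typedef int (*load_fn)(uint32_t *dst, size_t n, void *opaque);

/* Sample loader for demonstration: succeeds until the budget is used up. */
static int load_until(uint32_t *dst, size_t n, void *opaque)
{
    int *remaining = opaque;

    if (*remaining <= 0) {
        return -5;
    }
    (*remaining)--;
    memset(dst, 0, n * sizeof(*dst));
    return 0;
}

/* Allocate and load both tables; on any failure, free what was allocated. */
static int init_tables(Tables *t, size_t n, int want_backup,
                       load_fn load, void *opaque)
{
    int ret;

    t->l1 = NULL;
    t->l1_backup = NULL;

    t->l1 = malloc(n * sizeof(uint32_t));
    if (!t->l1) {
        return -1;
    }
    ret = load(t->l1, n, opaque);
    if (ret < 0) {
        goto fail_l1;
    }
    if (want_backup) {
        t->l1_backup = malloc(n * sizeof(uint32_t));
        if (!t->l1_backup) {
            ret = -1;
            goto fail_l1;
        }
        ret = load(t->l1_backup, n, opaque);
        if (ret < 0) {
            goto fail_l1b;
        }
    }
    return 0;

fail_l1b:
    /* Later allocation is released first, then control falls through. */
    free(t->l1_backup);
    t->l1_backup = NULL;
fail_l1:
    free(t->l1);
    t->l1 = NULL;
    return ret;
}
```

Resetting the pointers to NULL after freeing is a choice made in this sketch so that a caller cannot free them a second time; the patch itself does not do this.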
Re: [Qemu-devel] KVM call agenda for July 12
On 07/11/2011 08:14 AM, Alexander Graf wrote: On 11.07.2011 at 13:46, Juan Quintela quint...@redhat.com wrote: Hi Please send in any agenda items you are interested in covering. Device passthrough on non-PCI (take 2) Can we defer this to next week? I won't be able to attend today. Regards, Anthony Liguori Alex
Re: [Qemu-devel] Best qemu ARM board supporting 1G+ ?
On 12 July 2011 13:08, Peter Maydell peter.mayd...@linaro.org wrote: On 12 July 2011 12:36, Wesley W. Terpstra wes...@terpstra.ca wrote: I have an arm/versatileab running debian/sid, but this only has access to 256MB of main memory. For my current project I need a minimum of 1GB. I know that the realview-pbx-a9 can do this, but am unsure how well it's supported. 1. Which qemu ARM boards support >= 1GB of memory? realview-pb-a8, realview-pbx-a9 and vexpress-a9. I should have mentioned that these are all 1GB maximum; no board supports more than 1GB. -- PMM
[Qemu-devel] [PATCH v9 07/12] VMDK: move 'static' cid_update flag to bs field
Cid_update is the flag for updating CID on first write after opening the
image. This should be per image open rather than per program life cycle,
so change it from static var of vmdk_write to a field in BDRVVmdkState.

Signed-off-by: Fam Zheng <famc...@gmail.com>
---
 block/vmdk.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f6d2986..8dc58a8 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -82,6 +82,7 @@ typedef struct VmdkExtent {
 
 typedef struct BDRVVmdkState {
     int desc_offset;
+    bool cid_updated;
     uint32_t parent_cid;
     int num_extents;
     /* Extent array with num_extents entries, ascend ordered by address */
@@ -853,7 +854,6 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
     int n;
     int64_t index_in_cluster;
     uint64_t cluster_offset;
-    static int cid_update = 0;
     VmdkMetaData m_data;
 
     if (sector_num > bs->total_sectors) {
@@ -900,9 +900,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
         buf += n * 512;
 
         // update CID on the first write every time the virtual disk is opened
-        if (!cid_update) {
+        if (!s->cid_updated) {
             vmdk_write_cid(bs, time(NULL));
-            cid_update++;
+            s->cid_updated = true;
         }
     }
     return 0;
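To illustrate why the flag has to live in the per-image state, here is a small standalone sketch. `ImageState`, `image_write` and `image_write_static_flag` are hypothetical names, and a counter stands in for the real vmdk_write_cid call.

```c
#include <stdbool.h>

/* Hypothetical per-image state; cid_writes counts simulated CID updates. */
typedef struct ImageState {
    bool cid_updated;
    int cid_writes;
} ImageState;

/* Per-image flag: every opened image updates its CID on its first write. */
static void image_write(ImageState *s)
{
    if (!s->cid_updated) {
        s->cid_writes++;
        s->cid_updated = true;
    }
}

/* Function-local static flag, shown for contrast: the flag is shared by
 * all images in the process, so only the first image ever written gets
 * its CID updated. */
static void image_write_static_flag(ImageState *s)
{
    static bool cid_update;

    if (!cid_update) {
        s->cid_writes++;
        cid_update = true;
    }
}
```

With two images open in one process, the static variant updates the CID of whichever image is written first and silently skips the other, which is the behaviour the commit message describes as per program life cycle.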
[Qemu-devel] [PATCH v9 11/12] VMDK: fix coding style
Conform coding style in vmdk.c to pass scripts/checkpatch.pl checks. Signed-off-by: Fam Zheng famc...@gmail.com --- block/vmdk.c | 78 +++--- 1 files changed, 47 insertions(+), 31 deletions(-) diff --git a/block/vmdk.c b/block/vmdk.c index e7bea1f..aa05a3b 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -102,8 +102,9 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename) { uint32_t magic; -if (buf_size 4) +if (buf_size 4) { return 0; +} magic = be32_to_cpu(*(uint32_t *)buf); if (magic == VMDK3_MAGIC || magic == VMDK4_MAGIC) { @@ -193,9 +194,10 @@ static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent) cid_str_size = sizeof(CID); } -if ((p_name = strstr(desc,cid_str)) != NULL) { +p_name = strstr(desc, cid_str); +if (p_name != NULL) { p_name += cid_str_size; -sscanf(p_name,%x,cid); +sscanf(p_name, %x, cid); } return cid; @@ -212,9 +214,10 @@ static int vmdk_write_cid(BlockDriverState *bs, uint32_t cid) return -EIO; } -tmp_str = strstr(desc,parentCID); +tmp_str = strstr(desc, parentCID); pstrcpy(tmp_desc, sizeof(tmp_desc), tmp_str); -if ((p_name = strstr(desc,CID)) != NULL) { +p_name = strstr(desc, CID); +if (p_name != NULL) { p_name += sizeof(CID); snprintf(p_name, sizeof(desc) - (p_name - desc), %x\n, cid); pstrcat(desc, sizeof(desc), tmp_desc); @@ -234,13 +237,14 @@ static int vmdk_is_cid_valid(BlockDriverState *bs) uint32_t cur_pcid; if (p_bs) { -cur_pcid = vmdk_read_cid(p_bs,0); -if (s-parent_cid != cur_pcid) -// CID not valid +cur_pcid = vmdk_read_cid(p_bs, 0); +if (s-parent_cid != cur_pcid) { +/* CID not valid */ return 0; +} } #endif -// CID valid +/* CID valid */ return 1; } @@ -255,14 +259,18 @@ static int vmdk_parent_open(BlockDriverState *bs) return -1; } -if ((p_name = strstr(desc,parentFileNameHint)) != NULL) { +p_name = strstr(desc, parentFileNameHint); +if (p_name != NULL) { char *end_name; p_name += sizeof(parentFileNameHint) + 1; -if ((end_name = strchr(p_name,'\')) == NULL) +end_name = strchr(p_name, '\'); +if 
(end_name == NULL) { return -1; -if ((end_name - p_name) sizeof (bs-backing_file) - 1) +} +if ((end_name - p_name) sizeof(bs-backing_file) - 1) { return -1; +} pstrcpy(bs-backing_file, end_name - p_name + 1, p_name); } @@ -595,8 +603,9 @@ static int get_whole_cluster(BlockDriverState *bs, if (bs-backing_hd) { int ret; -if (!vmdk_is_cid_valid(bs)) +if (!vmdk_is_cid_valid(bs)) { return -1; +} /* floor offset to cluster */ offset -= offset % (extent-cluster_sectors * 512); @@ -655,8 +664,9 @@ static int get_cluster_offset(BlockDriverState *bs, int min_index, i, j; uint32_t min_count, *l2_table, tmp = 0; -if (m_data) +if (m_data) { m_data-valid = 0; +} if (extent-flat) { *cluster_offset = extent-flat_start_offset; return 0; @@ -712,7 +722,7 @@ static int get_cluster_offset(BlockDriverState *bs, return -1; } -// Avoid the L2 tables update for the images that have snapshots. +/* Avoid the L2 tables update for the images that have snapshots. */ *cluster_offset = bdrv_getlength(extent-file); bdrv_truncate( extent-file, @@ -729,8 +739,9 @@ static int get_cluster_offset(BlockDriverState *bs, * or inappropriate VM shutdown. 
*/ if (get_whole_cluster( -bs, extent, *cluster_offset, offset, allocate) == -1) +bs, extent, *cluster_offset, offset, allocate) == -1) { return -1; +} if (m_data) { m_data-offset = tmp; @@ -780,8 +791,9 @@ static int vmdk_is_allocated(BlockDriverState *bs, int64_t sector_num, index_in_cluster = sector_num % extent-cluster_sectors; n = extent-cluster_sectors - index_in_cluster; -if (n nb_sectors) +if (n nb_sectors) { n = nb_sectors; +} *pnum = n; return ret; } @@ -805,16 +817,19 @@ static int vmdk_read(BlockDriverState *bs, int64_t sector_num, sector_num 9, 0, cluster_offset); index_in_cluster = sector_num % extent-cluster_sectors; n = extent-cluster_sectors - index_in_cluster; -if (n nb_sectors) +if (n nb_sectors) { n = nb_sectors; +} if (ret) { /* if not allocated, try to read from parent image, if exist */ if (bs-backing_hd) { -if (!vmdk_is_cid_valid(bs)) +
Re: [Qemu-devel] [PATCH v2 1/1] virtio-console: Prevent abort()s in case of host chardev close
On (Tue) 12 Jul 2011 [14:58:57], Amit Shah wrote: A host chardev could close just before the guest sends some data to be written. This will cause an -EPIPE error. This shouldn't be propagated to virtio-serial-bus. Ideally we should close the port once -EPIPE is received, but since the chardev interface doesn't return such meaningful values to its users, all we get is -1 for any kind of error. Just return 0 for now and wait for chardevs to return better error messages to act better on the return messages. For v2, removed the check for -EAGAIN as qemu-char.c doesn't return anything other than -1 for error conditions, as Markus and Juan noted. Amit
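The approach described above (treat any backend error as "nothing written" instead of passing it up to the bus) could be sketched as follows. This is an illustration only, not the virtio-console code: `chr_write_fn`, `port_flush`, `backend_closed` and `backend_ok` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical backend write hook: returns bytes written, or -1 on any
 * error (the message notes that chardevs report only -1 on failure). */
typedef int (*chr_write_fn)(const unsigned char *buf, size_t len);

/* Sample backends for demonstration. */
static int backend_closed(const unsigned char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return -1;               /* e.g. the host side was closed */
}

static int backend_ok(const unsigned char *buf, size_t len)
{
    (void)buf;
    return (int)len;         /* pretend everything was written */
}

/* Forward guest data to the backend. A negative result is mapped to 0 so
 * that the caller never sees an error it would treat as fatal. */
static int port_flush(chr_write_fn write_fn,
                      const unsigned char *buf, size_t len)
{
    int ret = write_fn(buf, len);

    if (ret < 0) {
        return 0;
    }
    return ret;
}
```

The trade-off is the one the message already names: with only -1 available, a closed backend cannot be told apart from a transient failure, so the port is left open rather than closed.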
Re: [Qemu-devel] [PATCH v3 2/2] qemu-io: Fix if scoping bug
On 11.07.2011 17:20, Devin Nakamura wrote: Fix a bug caused by lack of braces in an if statement. Lack of braces means that if (count & 0x1ff) is never reached. Conflicts: qemu-io.c Signed-off-by: Devin Nakamura devin...@gmail.com Thanks, applied both to the block branch. Kevin
[Qemu-devel] [PATCH v9 10/12] VMDK: create different subformats
Add create option 'format', with enums: monolithicSparse monolithicFlat twoGbMaxExtentSparse twoGbMaxExtentFlat Each creates a subformat image file. The default is monolithicSparse. Signed-off-by: Fam Zheng famc...@gmail.com --- block/vmdk.c | 502 +++-- block_int.h |1 + 2 files changed, 274 insertions(+), 229 deletions(-) diff --git a/block/vmdk.c b/block/vmdk.c index 93ac289..e7bea1f 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -156,8 +156,9 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename) #define CHECK_CID 1 #define SECTOR_SIZE 512 -#define DESC_SIZE 20*SECTOR_SIZE // 20 sectors of 512 bytes each -#define HEADER_SIZE 512// first sector of 512 bytes +#define DESC_SIZE (20 * SECTOR_SIZE)/* 20 sectors of 512 bytes each */ +#define BUF_SIZE 4096 +#define HEADER_SIZE 512 /* first sector of 512 bytes */ static void vmdk_free_extents(BlockDriverState *bs) { @@ -243,168 +244,6 @@ static int vmdk_is_cid_valid(BlockDriverState *bs) return 1; } -static int vmdk_snapshot_create(const char *filename, const char *backing_file) -{ -int snp_fd, p_fd; -int ret; -uint32_t p_cid; -char *p_name, *gd_buf, *rgd_buf; -const char *real_filename, *temp_str; -VMDK4Header header; -uint32_t gde_entries, gd_size; -int64_t gd_offset, rgd_offset, capacity, gt_size; -char p_desc[DESC_SIZE], s_desc[DESC_SIZE], hdr[HEADER_SIZE]; -static const char desc_template[] = -# Disk DescriptorFile\n -version=1\n -CID=%x\n -parentCID=%x\n -createType=\monolithicSparse\\n -parentFileNameHint=\%s\\n -\n -# Extent description\n -RW %u SPARSE \%s\\n -\n -# The Disk Data Base \n -#DDB\n -\n; - -snp_fd = open(filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE, 0644); -if (snp_fd 0) -return -errno; -p_fd = open(backing_file, O_RDONLY | O_BINARY | O_LARGEFILE); -if (p_fd 0) { -close(snp_fd); -return -errno; -} - -/* read the header */ -if (lseek(p_fd, 0x0, SEEK_SET) == -1) { -ret = -errno; -goto fail; -} -if (read(p_fd, hdr, HEADER_SIZE) != HEADER_SIZE) { -ret = -errno; 
-goto fail; -} - -/* write the header */ -if (lseek(snp_fd, 0x0, SEEK_SET) == -1) { -ret = -errno; -goto fail; -} -if (write(snp_fd, hdr, HEADER_SIZE) == -1) { -ret = -errno; -goto fail; -} - -memset(header, 0, sizeof(header)); -memcpy(header,hdr[4], sizeof(header)); // skip the VMDK4_MAGIC - -if (ftruncate(snp_fd, header.grain_offset 9)) { -ret = -errno; -goto fail; -} -/* the descriptor offset = 0x200 */ -if (lseek(p_fd, 0x200, SEEK_SET) == -1) { -ret = -errno; -goto fail; -} -if (read(p_fd, p_desc, DESC_SIZE) != DESC_SIZE) { -ret = -errno; -goto fail; -} - -if ((p_name = strstr(p_desc,CID)) != NULL) { -p_name += sizeof(CID); -sscanf(p_name,%x,p_cid); -} - -real_filename = filename; -if ((temp_str = strrchr(real_filename, '\\')) != NULL) -real_filename = temp_str + 1; -if ((temp_str = strrchr(real_filename, '/')) != NULL) -real_filename = temp_str + 1; -if ((temp_str = strrchr(real_filename, ':')) != NULL) -real_filename = temp_str + 1; - -snprintf(s_desc, sizeof(s_desc), desc_template, p_cid, p_cid, backing_file, - (uint32_t)header.capacity, real_filename); - -/* write the descriptor */ -if (lseek(snp_fd, 0x200, SEEK_SET) == -1) { -ret = -errno; -goto fail; -} -if (write(snp_fd, s_desc, strlen(s_desc)) == -1) { -ret = -errno; -goto fail; -} - -gd_offset = header.gd_offset * SECTOR_SIZE; // offset of GD table -rgd_offset = header.rgd_offset * SECTOR_SIZE; // offset of RGD table -capacity = header.capacity * SECTOR_SIZE; // Extent size -/* - * Each GDE span 32M disk, means: - * 512 GTE per GT, each GTE points to grain - */ -gt_size = (int64_t)header.num_gtes_per_gte * header.granularity * SECTOR_SIZE; -if (!gt_size) { -ret = -EINVAL; -goto fail; -} -gde_entries = (uint32_t)(capacity / gt_size); // number of gde/rgde -gd_size = gde_entries * sizeof(uint32_t); - -/* write RGD */ -rgd_buf = qemu_malloc(gd_size); -if (lseek(p_fd, rgd_offset, SEEK_SET) == -1) { -ret = -errno; -goto fail_rgd; -} -if (read(p_fd, rgd_buf, gd_size) != gd_size) { -ret = -errno; -goto 
fail_rgd; -} -if (lseek(snp_fd, rgd_offset, SEEK_SET) == -1) { -ret = -errno; -goto fail_rgd; -} -if (write(snp_fd, rgd_buf, gd_size) == -1) { -ret = -errno; -goto fail_rgd; -} - -/* write GD */ -gd_buf =
Re: [Qemu-devel] qemu-user[armel/mips] and debian-rootfs
On Tue 12 Jul 2011 08:32:51 Wesley W. Terpstra wrote: [snip] Just a heads up---if you only use my original patch, mipseb still won't be able to run 'make'. My final patchset (sent as a series of 6 patches) resolves this in addition to the other problems. And there you solved my last problem :-) Thanks! -- Lisandro Damián Nicanor Pérez Meyer http://perezmeyer.com.ar/ http://perezmeyer.blogspot.com/
Re: [Qemu-devel] [PATCH v3] QMP: add snapshot_blkdev command
On Tue, 12 Jul 2011 11:26:09 +0200 Jes Sorensen jes.soren...@redhat.com wrote: On 07/11/11 22:35, Luiz Capitulino wrote: Sorry, that is a no go, you just broke the hmp implementation - you cannot change the hmp behavior like that. HMP uses positional arguments, so changing argument names makes no difference. And, apart from some exceptions, it's not a stable interface, anyway... I guess you're right about the naming not affecting the hmp interface. However hmp is far more usable to end users than qmp, so yes it does matter not to change the interface at random. Right, but we don't do it at random. Actually, it's not something that happens often and we always consider the impact. However, hmp doesn't have the stability guarantees that qmp has. In this specific case, no hmp user-visible change has been made.
Re: [Qemu-devel] [PATCH 0/4] scsi fixes
On 11.07.2011 15:02, Hannes Reinecke wrote: Hi all, these are some fixes I found during debugging my megasas HBA emulation. This time I've sent them as a separate patchset for inclusion. All of them have been acked, so please apply. Hannes Reinecke (4): iov: Update parameter usage in iov_(to|from)_buf() scsi: Add 'hba_private' to SCSIRequest scsi-disk: Fixup debugging statement scsi-disk: Mask out serial number EVPD Thanks, applied all to the block branch. Kevin
Re: [Qemu-devel] [PATCH v5 05/18] qapi: add QMP input visitor
On Mon, 11 Jul 2011 19:05:58 -0500 Michael Roth mdr...@linux.vnet.ibm.com wrote: On 07/07/2011 09:32 AM, Luiz Capitulino wrote: On Tue, 5 Jul 2011 08:02:32 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: A type of Visiter class that is used to walk a qobject's structure and assign each entry to the corresponding native C type. Command marshaling function will use this to pull out QMP command parameters recieved over the wire and pass them as native arguments to the corresponding C functions. Signed-off-by: Michael Rothmdr...@linux.vnet.ibm.com --- Makefile.objs|2 +- qapi/qmp-input-visitor.c | 264 ++ qapi/qmp-input-visitor.h | 27 + qerror.h |3 + 4 files changed, 295 insertions(+), 1 deletions(-) create mode 100644 qapi/qmp-input-visitor.c create mode 100644 qapi/qmp-input-visitor.h diff --git a/Makefile.objs b/Makefile.objs index 0077014..997ecef 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -375,7 +375,7 @@ libcacard-y = cac.o event.o vcard.o vreader.o vcard_emul_nss.o vcard_emul_type.o ## # qapi -qapi-nested-y = qapi-visit-core.o +qapi-nested-y = qapi-visit-core.o qmp-input-visitor.o qapi-obj-y = $(addprefix qapi/, $(qapi-nested-y)) vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS) diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c new file mode 100644 index 000..80912bb --- /dev/null +++ b/qapi/qmp-input-visitor.c @@ -0,0 +1,264 @@ +/* + * Input Visitor + * + * Copyright IBM, Corp. 2011 + * + * Authors: + * Anthony Liguorialigu...@us.ibm.com + * + * This work is licensed under the terms of the GNU LGPL, version 2.1 or later. + * See the COPYING.LIB file in the top-level directory. 
+ * + */ + +#include qmp-input-visitor.h +#include qemu-queue.h +#include qemu-common.h +#include qemu-objects.h +#include qerror.h + +#define QIV_STACK_SIZE 1024 + +typedef struct StackObject +{ +const QObject *obj; +const QListEntry *entry; +} StackObject; + +struct QmpInputVisitor +{ +Visitor visitor; +const QObject *obj; +StackObject stack[QIV_STACK_SIZE]; +int nb_stack; +}; + +static QmpInputVisitor *to_qiv(Visitor *v) +{ +return container_of(v, QmpInputVisitor, visitor); +} + +static const QObject *qmp_input_get_object(QmpInputVisitor *qiv, const char *name) +{ +const QObject *qobj; + +if (qiv-nb_stack == 0) { +qobj = qiv-obj; +} else { +qobj = qiv-stack[qiv-nb_stack - 1].obj; +} + +if (name qobject_type(qobj) == QTYPE_QDICT) { +return qdict_get(qobject_to_qdict(qobj), name); +} else if (qiv-nb_stack 0 qobject_type(qobj) == QTYPE_QLIST) { +return qlist_entry_obj(qiv-stack[qiv-nb_stack - 1].entry); +} + +return qobj; +} + +static void qmp_input_push(QmpInputVisitor *qiv, const QObject *obj, Error **errp) +{ +qiv-stack[qiv-nb_stack].obj = obj; +if (qobject_type(obj) == QTYPE_QLIST) { +qiv-stack[qiv-nb_stack].entry = qlist_first(qobject_to_qlist(obj)); +} +qiv-nb_stack++; + +if (qiv-nb_stack= QIV_STACK_SIZE) { +error_set(errp, QERR_BUFFER_OVERRUN); +return; +} +} + +static void qmp_input_pop(QmpInputVisitor *qiv, Error **errp) +{ +qiv-nb_stack--; +if (qiv-nb_stack 0) { +error_set(errp, QERR_BUFFER_OVERRUN); +return; +} +} + +static void qmp_input_start_struct(Visitor *v, void **obj, const char *kind, const char *name, size_t size, Error **errp) +{ +QmpInputVisitor *qiv = to_qiv(v); +const QObject *qobj = qmp_input_get_object(qiv, name); + +if (!qobj || qobject_type(qobj) != QTYPE_QDICT) { +error_set(errp, QERR_INVALID_PARAMETER_TYPE, name ? 
name : null, QDict); +return; +} + +qmp_input_push(qiv, qobj, errp); +if (error_is_set(errp)) { +return; +} + +if (obj) { +*obj = qemu_mallocz(size); +} +} + +static void qmp_input_end_struct(Visitor *v, Error **errp) +{ +QmpInputVisitor *qiv = to_qiv(v); + +qmp_input_pop(qiv, errp); +} + +static void qmp_input_start_list(Visitor *v, const char *name, Error **errp) +{ +QmpInputVisitor *qiv = to_qiv(v); +const QObject *qobj = qmp_input_get_object(qiv, name); + +if (!qobj || qobject_type(qobj) != QTYPE_QLIST) { +error_set(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : null, list); +return; +} + +qmp_input_push(qiv, qobj, errp); +} + +static GenericList *qmp_input_next_list(Visitor *v, GenericList **list, Error **errp) +{ +QmpInputVisitor *qiv = to_qiv(v); +
Re: [Qemu-devel] [PULL 0/8] Block patches
On 07/06/2011 09:21 AM, Kevin Wolf wrote:
The following changes since commit 9312805d33e8b106bae356d13a8071fb37d75554:

  pxa2xx_lcd: add proper rotation support (2011-07-04 22:12:21 +0200)

Pulled. Thanks.

Regards,
Anthony Liguori

are available in the git repository at:
  git://repo.or.cz/qemu/kevin.git for-anthony

Federico Simoncelli (1):
      qemu-img: Add cache command line option

Johannes Stezenbach (1):
      block/raw-posix: Linux compat-ioctl warning workaround

Kevin Wolf (3):
      Documentation: Remove outdated host_device note
      ide: Ignore reads during PIO in and writes during PIO out
      ide: Initialise buffers with zeros

Luiz Capitulino (2):
      block: drive_init(): Simplify interface type setting
      block: drive_init(): Improve CHS setting error message

Markus Armbruster (1):
      virtio-blk: Turn drive serial into a qdev property

 block/raw-posix.c    |   14
 blockdev.c           |   14
 hw/ide/core.c        |   50
 hw/s390-virtio-bus.c |    4
 hw/s390-virtio-bus.h |    1
 hw/virtio-blk.c      |   29
 hw/virtio-blk.h      |    2
 hw/virtio-pci.c      |    4
 hw/virtio-pci.h      |    1
 hw/virtio.h          |    3
 qemu-img-cmds.hx     |    6
 qemu-img.c           |   80
 qemu-img.texi        |    6
 13 files changed, 161 insertions(+), 53 deletions(-)
Re: [Qemu-devel] KVM call agenda for July 12
Juan Quintela quint...@redhat.com wrote: Hi Please send in any agenda items you are interested in covering. As there is a single topic, and our fearless leader can't attend and wants it moved to next week, this week's call is cancelled. Later, Juan.
Re: [Qemu-devel] [PATCH v5 05/18] qapi: add QMP input visitor
On 07/12/2011 08:16 AM, Luiz Capitulino wrote: On Mon, 11 Jul 2011 19:05:58 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: On 07/07/2011 09:32 AM, Luiz Capitulino wrote: On Tue, 5 Jul 2011 08:02:32 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: A type of Visiter class that is used to walk a qobject's structure and assign each entry to the corresponding native C type. Command marshaling function will use this to pull out QMP command parameters recieved over the wire and pass them as native arguments to the corresponding C functions. Signed-off-by: Michael Rothmdr...@linux.vnet.ibm.com --- Makefile.objs|2 +- qapi/qmp-input-visitor.c | 264 ++ qapi/qmp-input-visitor.h | 27 + qerror.h |3 + 4 files changed, 295 insertions(+), 1 deletions(-) create mode 100644 qapi/qmp-input-visitor.c create mode 100644 qapi/qmp-input-visitor.h diff --git a/Makefile.objs b/Makefile.objs index 0077014..997ecef 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -375,7 +375,7 @@ libcacard-y = cac.o event.o vcard.o vreader.o vcard_emul_nss.o vcard_emul_type.o ## # qapi -qapi-nested-y = qapi-visit-core.o +qapi-nested-y = qapi-visit-core.o qmp-input-visitor.o qapi-obj-y = $(addprefix qapi/, $(qapi-nested-y)) vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS) diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c new file mode 100644 index 000..80912bb --- /dev/null +++ b/qapi/qmp-input-visitor.c @@ -0,0 +1,264 @@ +/* + * Input Visitor + * + * Copyright IBM, Corp. 2011 + * + * Authors: + * Anthony Liguorialigu...@us.ibm.com + * + * This work is licensed under the terms of the GNU LGPL, version 2.1 or later. + * See the COPYING.LIB file in the top-level directory. 
+ * + */ + +#include qmp-input-visitor.h +#include qemu-queue.h +#include qemu-common.h +#include qemu-objects.h +#include qerror.h + +#define QIV_STACK_SIZE 1024 + +typedef struct StackObject +{ +const QObject *obj; +const QListEntry *entry; +} StackObject; + +struct QmpInputVisitor +{ +Visitor visitor; +const QObject *obj; +StackObject stack[QIV_STACK_SIZE]; +int nb_stack; +}; + +static QmpInputVisitor *to_qiv(Visitor *v) +{ +return container_of(v, QmpInputVisitor, visitor); +} + +static const QObject *qmp_input_get_object(QmpInputVisitor *qiv, const char *name) +{ +const QObject *qobj; + +if (qiv-nb_stack == 0) { +qobj = qiv-obj; +} else { +qobj = qiv-stack[qiv-nb_stack - 1].obj; +} + +if (name qobject_type(qobj) == QTYPE_QDICT) { +return qdict_get(qobject_to_qdict(qobj), name); +} else if (qiv-nb_stack 0 qobject_type(qobj) == QTYPE_QLIST) { +return qlist_entry_obj(qiv-stack[qiv-nb_stack - 1].entry); +} + +return qobj; +} + +static void qmp_input_push(QmpInputVisitor *qiv, const QObject *obj, Error **errp) +{ +qiv-stack[qiv-nb_stack].obj = obj; +if (qobject_type(obj) == QTYPE_QLIST) { +qiv-stack[qiv-nb_stack].entry = qlist_first(qobject_to_qlist(obj)); +} +qiv-nb_stack++; + +if (qiv-nb_stack= QIV_STACK_SIZE) { +error_set(errp, QERR_BUFFER_OVERRUN); +return; +} +} + +static void qmp_input_pop(QmpInputVisitor *qiv, Error **errp) +{ +qiv-nb_stack--; +if (qiv-nb_stack 0) { +error_set(errp, QERR_BUFFER_OVERRUN); +return; +} +} + +static void qmp_input_start_struct(Visitor *v, void **obj, const char *kind, const char *name, size_t size, Error **errp) +{ +QmpInputVisitor *qiv = to_qiv(v); +const QObject *qobj = qmp_input_get_object(qiv, name); + +if (!qobj || qobject_type(qobj) != QTYPE_QDICT) { +error_set(errp, QERR_INVALID_PARAMETER_TYPE, name ? 
name : null, QDict); +return; +} + +qmp_input_push(qiv, qobj, errp); +if (error_is_set(errp)) { +return; +} + +if (obj) { +*obj = qemu_mallocz(size); +} +} + +static void qmp_input_end_struct(Visitor *v, Error **errp) +{ +QmpInputVisitor *qiv = to_qiv(v); + +qmp_input_pop(qiv, errp); +} + +static void qmp_input_start_list(Visitor *v, const char *name, Error **errp) +{ +QmpInputVisitor *qiv = to_qiv(v); +const QObject *qobj = qmp_input_get_object(qiv, name); + +if (!qobj || qobject_type(qobj) != QTYPE_QLIST) { +error_set(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : null, list); +return; +} + +qmp_input_push(qiv, qobj, errp); +} + +static GenericList *qmp_input_next_list(Visitor *v, GenericList **list, Error **errp) +{ +QmpInputVisitor *qiv = to_qiv(v); +GenericList *entry; +StackObject *so =qiv-stack[qiv-nb_stack - 1]; + +if (so-entry == NULL) { +return NULL; +} + +entry = qemu_mallocz(sizeof(*entry)); +if (*list) { +so-entry = qlist_next(so-entry); +if (so-entry == NULL) { +
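The push/pop bookkeeping in the quoted visitor can be looked at in isolation. The sketch below is not the QAPI code: `VisitStack`, `visit_push`, `visit_pop` and `visit_top` are hypothetical names, and error reporting is reduced to a return value. It checks the capacity before storing the frame, which is a design choice of this sketch and differs from the ordering in the quoted qmp_input_push, where the frame is stored first and the depth is checked afterwards.

```c
#include <stddef.h>

#define STACK_MAX 4   /* deliberately small for the sketch */

typedef struct Frame {
    const void *obj;
} Frame;

typedef struct VisitStack {
    Frame frames[STACK_MAX];
    int depth;
} VisitStack;

/* Push a container being visited; returns -1 when the stack is full. */
static int visit_push(VisitStack *st, const void *obj)
{
    if (st->depth >= STACK_MAX) {
        return -1;
    }
    st->frames[st->depth++].obj = obj;
    return 0;
}

/* Pop the innermost container; returns -1 on underflow. */
static int visit_pop(VisitStack *st)
{
    if (st->depth <= 0) {
        return -1;
    }
    st->depth--;
    return 0;
}

/* Current container: the root object while the stack is empty,
 * otherwise the most recently pushed frame. */
static const void *visit_top(const VisitStack *st, const void *root)
{
    return st->depth == 0 ? root : st->frames[st->depth - 1].obj;
}
```

With this ordering a rejected push or pop leaves the depth unchanged, so a caller that ignores the error cannot drive the index out of the array.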
[Qemu-devel] [PATCHv3] async + suspend reworked
v2->v3:
 builds correctly with older and newer spice, and runs with older and newer qxl driver.
 fixed update_area_async to not use QXLRect on stack
 qxl-render updated to work with update_area_async correctly
 reverted change to update_area api - update_area still returns dirty rects array

Git trees:
 git://anongit.freedesktop.org/~alon/qemu async_and_s3.v3
 git://anongit.freedesktop.org/~alon/spice async_and_s3.v4
 git://anongit.freedesktop.org/~alon/spice-protocol s3.v2 (unchanged)
 git://anongit.freedesktop.org/~alon/qxl s3.v3.async.v3 (unchanged)

Alon Levy (12):
  qxl: add io_port_to_string
  qxl: make qxl_guest_bug take variable arguments
  qxl: use QXL_REVISION_*
  qxl: QXL_IO_UPDATE_AREA: pass ram->update_area directly to update_area
  qxl: async io support using new spice api
  qxl-render/qxl: split out qxl_save_ppm
  qxl-render: split out qxl_render_update_dirty_rectangles
  qxl-render: qxl_render_update: nop if !ssd.running
  qxl-render: use update_area_async and update_area_complete
  qxl: qxl_send_events: ignore if stopped (instead of abort)
  qxl: only disallow specific io's in vga mode
  qxl: add QXL_IO_FLUSH_{SURFACES,RELEASE} for guest S3&S4 support

Gerd Hoffmann (7):
  spice: add worker wrapper functions.
  spice: add qemu_spice_display_init_common
  qxl: remove qxl_destroy_primary()
  spice/qxl: move worker wrappers
  qxl: fix surface tracking locking
  qxl: error handling fixes and cleanups.
  qxl: bump pci rev

 hw/qxl-render.c    |   97
 hw/qxl.c           |  490
 hw/qxl.h           |   38
 ui/spice-display.c |   94
 ui/spice-display.h |   33
 5 files changed, 652 insertions(+), 100 deletions(-)

--
1.7.6
Re: [Qemu-devel] [PATCH v5 05/18] qapi: add QMP input visitor
On Tue, 12 Jul 2011 08:46:13 -0500
Michael Roth <mdr...@linux.vnet.ibm.com> wrote:

> On 07/12/2011 08:16 AM, Luiz Capitulino wrote:
>> On Mon, 11 Jul 2011 19:05:58 -0500
>> Michael Roth <mdr...@linux.vnet.ibm.com> wrote:
>>
>>> On 07/07/2011 09:32 AM, Luiz Capitulino wrote:
>>>> On Tue, 5 Jul 2011 08:02:32 -0500
>>>> Michael Roth <mdr...@linux.vnet.ibm.com> wrote:
>>>>
>>>>> A type of Visitor class that is used to walk a qobject's
>>>>> structure and assign each entry to the corresponding native C type.
>>>>> Command marshaling function will use this to pull out QMP command
>>>>> parameters received over the wire and pass them as native arguments
>>>>> to the corresponding C functions.
>>>>>
>>>>> Signed-off-by: Michael Roth <mdr...@linux.vnet.ibm.com>
>>>>> ---
>>>>>  Makefile.objs            |    2 +-
>>>>>  qapi/qmp-input-visitor.c |  264 ++
>>>>>  qapi/qmp-input-visitor.h |   27 +
>>>>>  qerror.h                 |    3 +
>>>>>  4 files changed, 295 insertions(+), 1 deletions(-)
>>>>>  create mode 100644 qapi/qmp-input-visitor.c
>>>>>  create mode 100644 qapi/qmp-input-visitor.h
>>>>>
>>>>> diff --git a/Makefile.objs b/Makefile.objs
>>>>> index 0077014..997ecef 100644
>>>>> --- a/Makefile.objs
>>>>> +++ b/Makefile.objs
>>>>> @@ -375,7 +375,7 @@ libcacard-y = cac.o event.o vcard.o vreader.o vcard_emul_nss.o vcard_emul_type.o
>>>>>  ##
>>>>>  # qapi
>>>>>
>>>>> -qapi-nested-y = qapi-visit-core.o
>>>>> +qapi-nested-y = qapi-visit-core.o qmp-input-visitor.o
>>>>>  qapi-obj-y = $(addprefix qapi/, $(qapi-nested-y))
>>>>>
>>>>>  vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
>>>>> diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c
>>>>> new file mode 100644
>>>>> index 0000000..80912bb
>>>>> --- /dev/null
>>>>> +++ b/qapi/qmp-input-visitor.c
>>>>> @@ -0,0 +1,264 @@
>>>>> +/*
>>>>> + * Input Visitor
>>>>> + *
>>>>> + * Copyright IBM, Corp. 2011
>>>>> + *
>>>>> + * Authors:
>>>>> + *  Anthony Liguori   <aligu...@us.ibm.com>
>>>>> + *
>>>>> + * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
>>>>> + * See the COPYING.LIB file in the top-level directory.
>>>>> + *
>>>>> + */
>>>>> +
>>>>> +#include "qmp-input-visitor.h"
>>>>> +#include "qemu-queue.h"
>>>>> +#include "qemu-common.h"
>>>>> +#include "qemu-objects.h"
>>>>> +#include "qerror.h"
>>>>> +
>>>>> +#define QIV_STACK_SIZE 1024
>>>>> +
>>>>> +typedef struct StackObject
>>>>> +{
>>>>> +    const QObject *obj;
>>>>> +    const QListEntry *entry;
>>>>> +} StackObject;
>>>>> +
>>>>> +struct QmpInputVisitor
>>>>> +{
>>>>> +    Visitor visitor;
>>>>> +    const QObject *obj;
>>>>> +    StackObject stack[QIV_STACK_SIZE];
>>>>> +    int nb_stack;
>>>>> +};
>>>>> +
>>>>> +static QmpInputVisitor *to_qiv(Visitor *v)
>>>>> +{
>>>>> +    return container_of(v, QmpInputVisitor, visitor);
>>>>> +}
>>>>> +
>>>>> +static const QObject *qmp_input_get_object(QmpInputVisitor *qiv,
>>>>> +                                           const char *name)
>>>>> +{
>>>>> +    const QObject *qobj;
>>>>> +
>>>>> +    if (qiv->nb_stack == 0) {
>>>>> +        qobj = qiv->obj;
>>>>> +    } else {
>>>>> +        qobj = qiv->stack[qiv->nb_stack - 1].obj;
>>>>> +    }
>>>>> +
>>>>> +    if (name && qobject_type(qobj) == QTYPE_QDICT) {
>>>>> +        return qdict_get(qobject_to_qdict(qobj), name);
>>>>> +    } else if (qiv->nb_stack > 0 && qobject_type(qobj) == QTYPE_QLIST) {
>>>>> +        return qlist_entry_obj(qiv->stack[qiv->nb_stack - 1].entry);
>>>>> +    }
>>>>> +
>>>>> +    return qobj;
>>>>> +}
>>>>> +
>>>>> +static void qmp_input_push(QmpInputVisitor *qiv, const QObject *obj, Error **errp)
>>>>> +{
>>>>> +    qiv->stack[qiv->nb_stack].obj = obj;
>>>>> +    if (qobject_type(obj) == QTYPE_QLIST) {
>>>>> +        qiv->stack[qiv->nb_stack].entry = qlist_first(qobject_to_qlist(obj));
>>>>> +    }
>>>>> +    qiv->nb_stack++;
>>>>> +
>>>>> +    if (qiv->nb_stack >= QIV_STACK_SIZE) {
>>>>> +        error_set(errp, QERR_BUFFER_OVERRUN);
>>>>> +        return;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static void qmp_input_pop(QmpInputVisitor *qiv, Error **errp)
>>>>> +{
>>>>> +    qiv->nb_stack--;
>>>>> +    if (qiv->nb_stack < 0) {
>>>>> +        error_set(errp, QERR_BUFFER_OVERRUN);
>>>>> +        return;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static void qmp_input_start_struct(Visitor *v, void **obj, const char *kind,
>>>>> +                                   const char *name, size_t size, Error **errp)
>>>>> +{
>>>>> +    QmpInputVisitor *qiv = to_qiv(v);
>>>>> +    const QObject *qobj = qmp_input_get_object(qiv, name);
>>>>> +
>>>>> +    if (!qobj || qobject_type(qobj) != QTYPE_QDICT) {
>>>>> +        error_set(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : "null",
>>>>> +                  "QDict");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    qmp_input_push(qiv, qobj, errp);
>>>>> +    if (error_is_set(errp)) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (obj) {
>>>>> +        *obj = qemu_mallocz(size);
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static void qmp_input_end_struct(Visitor *v, Error **errp)
>>>>> +{
>>>>> +    QmpInputVisitor *qiv = to_qiv(v);
>>>>> +
>>>>> +    qmp_input_pop(qiv, errp);
>>>>> +}
>>>>> +
>>>>> +static void qmp_input_start_list(Visitor *v, const char *name, Error **errp)
>>>>> +{
>>>>> +    QmpInputVisitor *qiv = to_qiv(v);
>>>>> +    const QObject *qobj = qmp_input_get_object(qiv, name);
>>>>> +
>>>>> +    if (!qobj || qobject_type(qobj) != QTYPE_QLIST) {
>>>>> +        error_set(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : "null",
>>>>> +                  "list");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    qmp_input_push(qiv, qobj, errp);
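As quoted, the push helper writes the new stack entry and increments nb_stack before comparing it against QIV_STACK_SIZE. A minimal sketch of the same fixed-size stack with the bound checked before the write might look like this. The names DemoStack, demo_push and demo_pop are hypothetical, and a plain int return stands in for QEMU's Error reporting.

```c
#include <stddef.h>

#define DEMO_STACK_SIZE 4 /* small on purpose; the quoted patch uses 1024 */

/* Hypothetical stand-in for the visitor's object stack. */
typedef struct DemoStack {
    const void *items[DEMO_STACK_SIZE];
    int depth; /* number of valid entries */
} DemoStack;

/* Returns 0 on success, -1 if the stack is already full. */
static int demo_push(DemoStack *s, const void *item)
{
    if (s->depth >= DEMO_STACK_SIZE) {
        return -1; /* refuse before touching the array */
    }
    s->items[s->depth++] = item;
    return 0;
}

/* Returns 0 on success, -1 if there is nothing to pop. */
static int demo_pop(DemoStack *s)
{
    if (s->depth <= 0) {
        return -1; /* leave depth untouched on underflow */
    }
    s->depth--;
    return 0;
}
```

Checking first means a failed push or pop leaves the stack state unchanged, so a caller that ignores the error cannot index outside the array on the next call.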
[Qemu-devel] [PATCHv3] spice: add qemu_spice_display_init_common
From: Gerd Hoffmann <kra...@redhat.com>

Factor out SimpleSpiceDisplay initialization into
qemu_spice_display_init_common() and call it from
both qxl.c (for vga mode) and spice-display.c

Signed-off-by: Gerd Hoffmann <kra...@redhat.com>
---
 hw/qxl.c           |    7 +--
 ui/spice-display.c |   17 +++--
 ui/spice-display.h |    1 +
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index 545074d..2d46814 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -1321,12 +1321,7 @@ static int qxl_init_primary(PCIDevice *dev)
     vga->ds = graphic_console_init(qxl_hw_update, qxl_hw_invalidate,
                                    qxl_hw_screen_dump, qxl_hw_text_update, qxl);
 
-    qxl->ssd.ds = vga->ds;
-    qemu_mutex_init(&qxl->ssd.lock);
-    qxl->ssd.mouse_x = -1;
-    qxl->ssd.mouse_y = -1;
-    qxl->ssd.bufsize = (16 * 1024 * 1024);
-    qxl->ssd.buf = qemu_malloc(qxl->ssd.bufsize);
+    qemu_spice_display_init_common(&qxl->ssd, vga->ds);
 
     qxl0 = qxl;
     register_displaychangelistener(vga->ds, &display_listener);
diff --git a/ui/spice-display.c b/ui/spice-display.c
index 0433ea8..fef1758 100644
--- a/ui/spice-display.c
+++ b/ui/spice-display.c
@@ -285,6 +285,16 @@ void qemu_spice_vm_change_state_handler(void *opaque, int running, int reason)
     ssd->running = running;
 }
 
+void qemu_spice_display_init_common(SimpleSpiceDisplay *ssd, DisplayState *ds)
+{
+    ssd->ds = ds;
+    qemu_mutex_init(&ssd->lock);
+    ssd->mouse_x = -1;
+    ssd->mouse_y = -1;
+    ssd->bufsize = (16 * 1024 * 1024);
+    ssd->buf = qemu_malloc(ssd->bufsize);
+}
+
 /* display listener callbacks */
 
 void qemu_spice_display_update(SimpleSpiceDisplay *ssd,
@@ -498,12 +508,7 @@ static DisplayChangeListener display_listener = {
 void qemu_spice_display_init(DisplayState *ds)
 {
     assert(sdpy.ds == NULL);
-    sdpy.ds = ds;
-    qemu_mutex_init(&sdpy.lock);
-    sdpy.mouse_x = -1;
-    sdpy.mouse_y = -1;
-    sdpy.bufsize = (16 * 1024 * 1024);
-    sdpy.buf = qemu_malloc(sdpy.bufsize);
+    qemu_spice_display_init_common(&sdpy, ds);
     register_displaychangelistener(ds, &display_listener);
 
     sdpy.qxl.base.sif = &dpy_interface.base;
diff --git a/ui/spice-display.h b/ui/spice-display.h
index 0effdfa..a39b19d 100644
--- a/ui/spice-display.h
+++ b/ui/spice-display.h
@@ -75,6 +75,7 @@ void qemu_spice_create_host_memslot(SimpleSpiceDisplay *ssd);
 void qemu_spice_create_host_primary(SimpleSpiceDisplay *ssd);
 void qemu_spice_destroy_host_primary(SimpleSpiceDisplay *ssd);
 void qemu_spice_vm_change_state_handler(void *opaque, int running, int reason);
+void qemu_spice_display_init_common(SimpleSpiceDisplay *ssd, DisplayState *ds);
 
 void qemu_spice_display_update(SimpleSpiceDisplay *ssd,
                                int x, int y, int w, int h);
-- 
1.7.6
[Qemu-devel] [PATCHv3] qxl-render: qxl_render_update: nop if !ssd.running
---
 hw/qxl-render.c |    4
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/hw/qxl-render.c b/hw/qxl-render.c
index d70373d..f440fe8 100644
--- a/hw/qxl-render.c
+++ b/hw/qxl-render.c
@@ -106,6 +106,10 @@ void qxl_render_update(PCIQXLDevice *qxl)
     QXLRect dirty[32], update;
     void *ptr;
 
+    if (!qxl->ssd.running) {
+        return;
+    }
+
     if (qxl->guest_primary.resized) {
         qxl->guest_primary.resized = 0;
 
-- 
1.7.6
[Qemu-devel] [PATCHv3] qxl: make qxl_guest_bug take variable arguments
---
 hw/qxl.c |   18 +++---
 hw/qxl.h |    2 +-
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index 91bc98d..ae1d0de 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -124,11 +124,15 @@ static void qxl_reset_memslots(PCIQXLDevice *d);
 static void qxl_reset_surfaces(PCIQXLDevice *d);
 static void qxl_ring_set_dirty(PCIQXLDevice *qxl);
 
-void qxl_guest_bug(PCIQXLDevice *qxl, const char *msg)
+void qxl_guest_bug(PCIQXLDevice *qxl, const char *msg, ...)
 {
     qxl_send_events(qxl, QXL_INTERRUPT_ERROR);
     if (qxl->guestdebug) {
-        fprintf(stderr, "qxl-%d: guest bug: %s\n", qxl->id, msg);
+        va_list ap;
+        va_start(ap, msg);
+        fprintf(stderr, "qxl-%d: guest bug: ", qxl->id);
+        vfprintf(stderr, msg, ap);
+        va_end(ap);
     }
 }
 
@@ -1120,11 +1124,11 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val)
         break;
     case QXL_IO_MEMSLOT_ADD:
         if (val >= NUM_MEMSLOTS) {
-            qxl_guest_bug(d, "QXL_IO_MEMSLOT_ADD: val out of range");
+            qxl_guest_bug(d, "QXL_IO_MEMSLOT_ADD: val out of range\n");
             break;
         }
         if (d->guest_slots[val].active) {
-            qxl_guest_bug(d, "QXL_IO_MEMSLOT_ADD: memory slot already active");
+            qxl_guest_bug(d, "QXL_IO_MEMSLOT_ADD: memory slot already active\n");
             break;
         }
         d->guest_slots[val].slot = d->ram->mem_slot;
@@ -1132,14 +1136,14 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val)
         break;
     case QXL_IO_MEMSLOT_DEL:
         if (val >= NUM_MEMSLOTS) {
-            qxl_guest_bug(d, "QXL_IO_MEMSLOT_DEL: val out of range");
+            qxl_guest_bug(d, "QXL_IO_MEMSLOT_DEL: val out of range\n");
             break;
         }
         qxl_del_memslot(d, val);
         break;
     case QXL_IO_CREATE_PRIMARY:
         if (val != 0) {
-            qxl_guest_bug(d, "QXL_IO_CREATE_PRIMARY: val != 0");
+            qxl_guest_bug(d, "QXL_IO_CREATE_PRIMARY: val != 0\n");
             break;
         }
         dprint(d, 1, "QXL_IO_CREATE_PRIMARY\n");
@@ -1148,7 +1152,7 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val)
         break;
     case QXL_IO_DESTROY_PRIMARY:
         if (val != 0) {
-            qxl_guest_bug(d, "QXL_IO_DESTROY_PRIMARY: val != 0");
+            qxl_guest_bug(d, "QXL_IO_DESTROY_PRIMARY: val != 0\n");
             break;
         }
         dprint(d, 1, "QXL_IO_DESTROY_PRIMARY (%s)\n", qxl_mode_to_string(d->mode));
diff --git a/hw/qxl.h b/hw/qxl.h
index 88393c2..e361bc6 100644
--- a/hw/qxl.h
+++ b/hw/qxl.h
@@ -99,7 +99,7 @@ typedef struct PCIQXLDevice {
 
 /* qxl.c */
 void *qxl_phys2virt(PCIQXLDevice *qxl, QXLPHYSICAL phys, int group_id);
-void qxl_guest_bug(PCIQXLDevice *qxl, const char *msg);
+void qxl_guest_bug(PCIQXLDevice *qxl, const char *msg, ...);
 
 void qxl_spice_update_area(PCIQXLDevice *qxl, uint32_t surface_id,
                            struct QXLRect *area, struct QXLRect *dirty_rects,
-- 
1.7.6
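This patch turns qxl_guest_bug() into a printf-style function: a fixed prefix is printed first, then the caller's format string is expanded through a va_list. A standalone sketch of that pattern follows. DemoDevice and demo_guest_bug are hypothetical stand-ins, and the message is formatted into a caller-supplied buffer rather than stderr so the result can be inspected.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the device state; not QEMU's PCIQXLDevice. */
typedef struct DemoDevice {
    int id;
    int guestdebug;
} DemoDevice;

/*
 * Format "qxl-<id>: guest bug: <message>" into buf.
 * buf is left as an empty string when guest debugging is disabled.
 */
static void demo_guest_bug(const DemoDevice *dev, char *buf, size_t len,
                           const char *fmt, ...)
{
    va_list ap;
    int n;

    if (len == 0) {
        return;
    }
    buf[0] = '\0';
    if (!dev->guestdebug) {
        return;
    }
    n = snprintf(buf, len, "qxl-%d: guest bug: ", dev->id);
    if (n < 0 || (size_t)n >= len) {
        return; /* prefix did not fit; keep the truncated prefix */
    }
    va_start(ap, fmt);
    vsnprintf(buf + n, len - (size_t)n, fmt, ap); /* expand caller's format */
    va_end(ap);
}
```

Because the prefix no longer appends a newline, each call site has to supply its own "\n", which is why the patch adds one to every existing message.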
[Qemu-devel] [PATCHv3] qxl: remove qxl_destroy_primary()
From: Gerd Hoffmann <kra...@redhat.com>

We'll have to move qemu_spice_destroy_primary_surface() out of
qxl_destroy_primary().  That makes the function pretty pointless, so
zap it and open code the two lines instead.

Signed-off-by: Gerd Hoffmann <kra...@redhat.com>
---
 hw/qxl.c |   28
 1 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index 2d46814..0c5ed65 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -120,7 +120,6 @@ static QXLMode qxl_modes[] = {
 static PCIQXLDevice *qxl0;
 
 static void qxl_send_events(PCIQXLDevice *d, uint32_t events);
-static void qxl_destroy_primary(PCIQXLDevice *d);
 static void qxl_reset_memslots(PCIQXLDevice *d);
 static void qxl_reset_surfaces(PCIQXLDevice *d);
 static void qxl_ring_set_dirty(PCIQXLDevice *qxl);
@@ -617,7 +616,10 @@ static void qxl_exit_vga_mode(PCIQXLDevice *d)
         return;
     }
     dprint(d, 1, "%s\n", __FUNCTION__);
-    qxl_destroy_primary(d);
+    if (d->mode != QXL_MODE_UNDEFINED) {
+        d->mode = QXL_MODE_UNDEFINED;
+        qemu_spice_destroy_primary_surface(&d->ssd, 0);
+    }
 }
 
 static void qxl_set_irq(PCIQXLDevice *d)
@@ -720,7 +722,10 @@ static void qxl_vga_ioport_write(void *opaque, uint32_t addr, uint32_t val)
 
     if (qxl->mode != QXL_MODE_VGA) {
         dprint(qxl, 1, "%s\n", __FUNCTION__);
-        qxl_destroy_primary(qxl);
+        if (qxl->mode != QXL_MODE_UNDEFINED) {
+            qxl->mode = QXL_MODE_UNDEFINED;
+            qemu_spice_destroy_primary_surface(&qxl->ssd, 0);
+        }
         qxl_soft_reset(qxl);
     }
     vga_ioport_write(opaque, addr, val);
@@ -881,18 +886,6 @@ static void qxl_create_guest_primary(PCIQXLDevice *qxl, int loadvm)
     qxl_render_resize(qxl);
 }
 
-static void qxl_destroy_primary(PCIQXLDevice *d)
-{
-    if (d->mode == QXL_MODE_UNDEFINED) {
-        return;
-    }
-
-    dprint(d, 1, "%s\n", __FUNCTION__);
-
-    d->mode = QXL_MODE_UNDEFINED;
-    qemu_spice_destroy_primary_surface(&d->ssd, 0);
-}
-
 static void qxl_set_mode(PCIQXLDevice *d, int modenr, int loadvm)
 {
     pcibus_t start = d->pci.io_regions[QXL_RAM_RANGE_INDEX].addr;
@@ -1019,7 +1012,10 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val)
     case QXL_IO_DESTROY_PRIMARY:
         PANIC_ON(val != 0);
         dprint(d, 1, "QXL_IO_DESTROY_PRIMARY (%s)\n", qxl_mode_to_string(d->mode));
-        qxl_destroy_primary(d);
+        if (d->mode != QXL_MODE_UNDEFINED) {
+            d->mode = QXL_MODE_UNDEFINED;
+            qemu_spice_destroy_primary_surface(&d->ssd, 0);
+        }
         break;
     case QXL_IO_DESTROY_SURFACE_WAIT:
         qemu_spice_destroy_surface_wait(&d->ssd, val);
-- 
1.7.6
[Qemu-devel] [PATCHv3] qxl-render: use update_area_async and update_area_complete
So now there are two implementations chosen based on QXL_INTERFACE_MINOR: * old (spice qxl minor == 0) - use update_area, no change. * new: 1. keep an array of updated rectangles (ssd.dirty_rects) 2. update it on callback (realloc) 3. render the current one before issuing a new update_area_async --- hw/qxl-render.c| 48 ++-- hw/qxl.c | 12 hw/qxl.h |2 ++ ui/spice-display.h |4 4 files changed, 64 insertions(+), 2 deletions(-) diff --git a/hw/qxl-render.c b/hw/qxl-render.c index f440fe8..4862c35 100644 --- a/hw/qxl-render.c +++ b/hw/qxl-render.c @@ -103,8 +103,11 @@ static void qxl_render_update_dirty_rectangles(PCIQXLDevice *qxl, QXLRect *dirty void qxl_render_update(PCIQXLDevice *qxl) { VGACommonState *vga = qxl-vga; -QXLRect dirty[32], update; +QXLRect update; void *ptr; +#if SPICE_INTERFACE_QXL_MINOR 1 +QXLRect dirty[32]; +#endif if (!qxl-ssd.running) { return; @@ -112,7 +115,13 @@ void qxl_render_update(PCIQXLDevice *qxl) if (qxl-guest_primary.resized) { qxl-guest_primary.resized = 0; - +qemu_mutex_lock(qxl-ssd.lock); +if (qxl-ssd.num_dirty_rects 0) { +free(qxl-ssd.dirty_rects); +qxl-ssd.dirty_rects = NULL; +qxl-ssd.num_dirty_rects = 0; +} +qemu_mutex_unlock(qxl-ssd.lock); if (qxl-guest_primary.flipped) { qemu_free(qxl-guest_primary.flipped); qxl-guest_primary.flipped = NULL; @@ -146,6 +155,19 @@ void qxl_render_update(PCIQXLDevice *qxl) dpy_resize(vga-ds); } +#if SPICE_INTERFACE_QXL_MINOR = 1 +/* render rectangles from last update_area_async */ +qemu_mutex_lock(qxl-ssd.lock); +if (qxl-ssd.num_dirty_rects 0) { +qxl_render_update_dirty_rectangles(qxl, qxl-ssd.dirty_rects, + qxl-ssd.num_dirty_rects); +free(qxl-ssd.dirty_rects); +qxl-ssd.dirty_rects = NULL; +qxl-ssd.num_dirty_rects = 0; +} +qemu_mutex_unlock(qxl-ssd.lock); +#endif + if (!qxl-guest_primary.commands) { return; } @@ -156,11 +178,33 @@ void qxl_render_update(PCIQXLDevice *qxl) update.top= 0; update.bottom = qxl-guest_primary.surface.height; +#if SPICE_INTERFACE_QXL_MINOR = 1 +/* do a new update_area */ 
+qxl_spice_update_area_async(qxl, 0, update, 1, 1); +#else memset(dirty, 0, sizeof(dirty)); qxl_spice_update_area(qxl, 0, update, dirty, ARRAY_SIZE(dirty), 1); qxl_render_update_dirty_rectangles(qxl, dirty, ARRAY_SIZE(dirty)); +#endif +} + +#if SPICE_INTERFACE_QXL_MINOR = 1 +void qxl_render_primary_updated(PCIQXLDevice *qxl, QXLRect *dirty, +uint32_t num_dirty) +{ +if (num_dirty == 0) { +return; +} +qemu_mutex_lock(qxl-ssd.lock); +qxl-ssd.num_dirty_rects += num_dirty; +qxl-ssd.dirty_rects = qemu_realloc(qxl-ssd.dirty_rects, + sizeof(QXLRect) * qxl-ssd.num_dirty_rects); +memcpy(qxl-ssd.dirty_rects[qxl-ssd.num_dirty_rects - num_dirty], + dirty, num_dirty * sizeof(QXLRect)); +qemu_mutex_unlock(qxl-ssd.lock); } +#endif /* SPICE_INTERFACE_QXL_MINOR = 1 */ static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor *cursor) { diff --git a/hw/qxl.c b/hw/qxl.c index de93efa..8a9463e 100644 --- a/hw/qxl.c +++ b/hw/qxl.c @@ -780,6 +780,17 @@ static void interface_async_complete(QXLInstance *sin, uint64_t cookie) qxl_send_events(qxl, QXL_INTERRUPT_IO_CMD); } +static void interface_update_area_complete(QXLInstance *sin, +uint32_t surface_id, struct QXLRect *updated_rects, +uint32_t num_updated_rects) +{ +PCIQXLDevice *qxl = container_of(sin, PCIQXLDevice, ssd.qxl); + +dprint(qxl, 3, %s: %d\n, __FUNCTION__, surface_id); +if (surface_id == 0) { +qxl_render_primary_updated(qxl, updated_rects, num_updated_rects); +} +} #endif static const QXLInterface qxl_interface = { @@ -803,6 +814,7 @@ static const QXLInterface qxl_interface = { .flush_resources = interface_flush_resources, #if SPICE_INTERFACE_QXL_MINOR = 1 .async_complete = interface_async_complete, +.update_area_complete= interface_update_area_complete, #endif }; diff --git a/hw/qxl.h b/hw/qxl.h index 2c7f94a..fad01c6 100644 --- a/hw/qxl.h +++ b/hw/qxl.h @@ -139,4 +139,6 @@ void qxl_spice_update_area_async(PCIQXLDevice *qxl, uint32_t surface_id, struct QXLRect *area, uint32_t clear_dirty_region, int is_vga); +void 
qxl_render_primary_updated(PCIQXLDevice *qxl, QXLRect *dirty, +uint32_t num_dirty); #endif diff --git a/ui/spice-display.h b/ui/spice-display.h index d24cca9..115aae5 100644 ---
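The update_area_complete callback in this patch grows ssd.dirty_rects with a realloc and copies the newly reported rectangles to the tail; the render path later draws and frees the whole array. A self-contained sketch of that append-and-clear step is shown here under a few assumptions: DemoRect and DemoRectList are hypothetical types, the standard realloc (which can fail) replaces QEMU's allocator, and the locking is omitted.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical rectangle type; field names are illustrative only. */
typedef struct DemoRect {
    int32_t top, left, bottom, right;
} DemoRect;

typedef struct DemoRectList {
    DemoRect *rects;
    uint32_t  num;
} DemoRectList;

/* Append n rectangles; returns 0 on success, -1 on allocation failure. */
static int demo_rects_append(DemoRectList *list, const DemoRect *src,
                             uint32_t n)
{
    DemoRect *grown;

    if (n == 0) {
        return 0; /* nothing to record */
    }
    grown = realloc(list->rects, sizeof(DemoRect) * ((size_t)list->num + n));
    if (!grown) {
        return -1; /* the old array is still valid and unchanged */
    }
    memcpy(&grown[list->num], src, sizeof(DemoRect) * n);
    list->rects = grown;
    list->num += n;
    return 0;
}

/* Drop all accumulated rectangles, as done after they have been rendered. */
static void demo_rects_clear(DemoRectList *list)
{
    free(list->rects);
    list->rects = NULL;
    list->num = 0;
}
```

In the patch both steps run under ssd.lock, since the callback arrives from the spice worker while rendering happens elsewhere; a real version of this sketch would need the same protection.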
[Qemu-devel] [PATCHv3] spice: add worker wrapper functions.
From: Gerd Hoffmann kra...@redhat.com Add wrapper functions for all spice worker calls. Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/qxl-render.c|4 +- hw/qxl.c | 32 +- ui/spice-display.c | 94 --- ui/spice-display.h | 20 +++ 4 files changed, 126 insertions(+), 24 deletions(-) diff --git a/hw/qxl-render.c b/hw/qxl-render.c index 1316066..bef5f14 100644 --- a/hw/qxl-render.c +++ b/hw/qxl-render.c @@ -124,8 +124,8 @@ void qxl_render_update(PCIQXLDevice *qxl) update.bottom = qxl-guest_primary.surface.height; memset(dirty, 0, sizeof(dirty)); -qxl-ssd.worker-update_area(qxl-ssd.worker, 0, update, - dirty, ARRAY_SIZE(dirty), 1); +qemu_spice_update_area(qxl-ssd, 0, update, + dirty, ARRAY_SIZE(dirty), 1); for (i = 0; i ARRAY_SIZE(dirty); i++) { if (qemu_spice_rect_is_empty(dirty+i)) { diff --git a/hw/qxl.c b/hw/qxl.c index 919ec91..545074d 100644 --- a/hw/qxl.c +++ b/hw/qxl.c @@ -690,8 +690,8 @@ static void qxl_hard_reset(PCIQXLDevice *d, int loadvm) dprint(d, 1, %s: start%s\n, __FUNCTION__, loadvm ? 
(loadvm) : ); -d-ssd.worker-reset_cursor(d-ssd.worker); -d-ssd.worker-reset_image_cache(d-ssd.worker); +qemu_spice_reset_cursor(d-ssd); +qemu_spice_reset_image_cache(d-ssd); qxl_reset_surfaces(d); qxl_reset_memslots(d); @@ -796,7 +796,7 @@ static void qxl_add_memslot(PCIQXLDevice *d, uint32_t slot_id, uint64_t delta) __FUNCTION__, memslot.slot_id, memslot.virt_start, memslot.virt_end); -d-ssd.worker-add_memslot(d-ssd.worker, memslot); +qemu_spice_add_memslot(d-ssd, memslot); d-guest_slots[slot_id].ptr = (void*)memslot.virt_start; d-guest_slots[slot_id].size = memslot.virt_end - memslot.virt_start; d-guest_slots[slot_id].delta = delta; @@ -806,14 +806,14 @@ static void qxl_add_memslot(PCIQXLDevice *d, uint32_t slot_id, uint64_t delta) static void qxl_del_memslot(PCIQXLDevice *d, uint32_t slot_id) { dprint(d, 1, %s: slot %d\n, __FUNCTION__, slot_id); -d-ssd.worker-del_memslot(d-ssd.worker, MEMSLOT_GROUP_HOST, slot_id); +qemu_spice_del_memslot(d-ssd, MEMSLOT_GROUP_HOST, slot_id); d-guest_slots[slot_id].active = 0; } static void qxl_reset_memslots(PCIQXLDevice *d) { dprint(d, 1, %s:\n, __FUNCTION__); -d-ssd.worker-reset_memslots(d-ssd.worker); +qemu_spice_reset_memslots(d-ssd); memset(d-guest_slots, 0, sizeof(d-guest_slots)); } @@ -821,7 +821,7 @@ static void qxl_reset_surfaces(PCIQXLDevice *d) { dprint(d, 1, %s:\n, __FUNCTION__); d-mode = QXL_MODE_UNDEFINED; -d-ssd.worker-destroy_surfaces(d-ssd.worker); +qemu_spice_destroy_surfaces(d-ssd); memset(d-guest_surfaces.cmds, 0, sizeof(d-guest_surfaces.cmds)); } @@ -875,7 +875,7 @@ static void qxl_create_guest_primary(PCIQXLDevice *qxl, int loadvm) qxl-mode = QXL_MODE_NATIVE; qxl-cmdflags = 0; -qxl-ssd.worker-create_primary_surface(qxl-ssd.worker, 0, surface); +qemu_spice_create_primary_surface(qxl-ssd, 0, surface); /* for local rendering */ qxl_render_resize(qxl); @@ -890,7 +890,7 @@ static void qxl_destroy_primary(PCIQXLDevice *d) dprint(d, 1, %s\n, __FUNCTION__); d-mode = QXL_MODE_UNDEFINED; 
-d-ssd.worker-destroy_primary_surface(d-ssd.worker, 0); +qemu_spice_destroy_primary_surface(d-ssd, 0); } static void qxl_set_mode(PCIQXLDevice *d, int modenr, int loadvm) @@ -962,15 +962,15 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val) case QXL_IO_UPDATE_AREA: { QXLRect update = d-ram-update_area; -d-ssd.worker-update_area(d-ssd.worker, d-ram-update_surface, - update, NULL, 0, 0); +qemu_spice_update_area(d-ssd, d-ram-update_surface, + update, NULL, 0, 0); break; } case QXL_IO_NOTIFY_CMD: -d-ssd.worker-wakeup(d-ssd.worker); +qemu_spice_wakeup(d-ssd); break; case QXL_IO_NOTIFY_CURSOR: -d-ssd.worker-wakeup(d-ssd.worker); +qemu_spice_wakeup(d-ssd); break; case QXL_IO_UPDATE_IRQ: qxl_set_irq(d); @@ -984,7 +984,7 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val) break; } d-oom_running = 1; -d-ssd.worker-oom(d-ssd.worker); +qemu_spice_oom(d-ssd); d-oom_running = 0; break; case QXL_IO_SET_MODE: @@ -1022,10 +1022,10 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val) qxl_destroy_primary(d); break; case QXL_IO_DESTROY_SURFACE_WAIT: -d-ssd.worker-destroy_surface_wait(d-ssd.worker, val); +qemu_spice_destroy_surface_wait(d-ssd, val); break; case QXL_IO_DESTROY_ALL_SURFACES: -
[Qemu-devel] [PATCHv3] qxl: add io_port_to_string
---
 hw/qxl.c |   62 +-
 1 files changed, 61 insertions(+), 1 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index 6862bc8..7be7ae1 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -407,6 +407,65 @@ static const char *qxl_mode_to_string(int mode)
     return "INVALID";
 }
 
+static const char *io_port_to_string(uint32_t io_port)
+{
+    if (io_port >= QXL_IO_RANGE_SIZE) {
+        return "out of range";
+    }
+    switch(io_port) {
+    case QXL_IO_NOTIFY_CMD:
+        return "QXL_IO_NOTIFY_CMD";
+    case QXL_IO_NOTIFY_CURSOR:
+        return "QXL_IO_NOTIFY_CURSOR";
+    case QXL_IO_UPDATE_AREA:
+        return "QXL_IO_UPDATE_AREA";
+    case QXL_IO_UPDATE_IRQ:
+        return "QXL_IO_UPDATE_IRQ";
+    case QXL_IO_NOTIFY_OOM:
+        return "QXL_IO_NOTIFY_OOM";
+    case QXL_IO_RESET:
+        return "QXL_IO_RESET";
+    case QXL_IO_SET_MODE:
+        return "QXL_IO_SET_MODE";
+    case QXL_IO_LOG:
+        return "QXL_IO_LOG";
+    case QXL_IO_MEMSLOT_ADD:
+        return "QXL_IO_MEMSLOT_ADD";
+    case QXL_IO_MEMSLOT_DEL:
+        return "QXL_IO_MEMSLOT_DEL";
+    case QXL_IO_DETACH_PRIMARY:
+        return "QXL_IO_DETACH_PRIMARY";
+    case QXL_IO_ATTACH_PRIMARY:
+        return "QXL_IO_ATTACH_PRIMARY";
+    case QXL_IO_CREATE_PRIMARY:
+        return "QXL_IO_CREATE_PRIMARY";
+    case QXL_IO_DESTROY_PRIMARY:
+        return "QXL_IO_DESTROY_PRIMARY";
+    case QXL_IO_DESTROY_SURFACE_WAIT:
+        return "QXL_IO_DESTROY_SURFACE_WAIT";
+    case QXL_IO_DESTROY_ALL_SURFACES:
+        return "QXL_IO_DESTROY_ALL_SURFACES";
+    case QXL_IO_UPDATE_AREA_ASYNC:
+        return "QXL_IO_UPDATE_AREA_ASYNC";
+    case QXL_IO_MEMSLOT_ADD_ASYNC:
+        return "QXL_IO_MEMSLOT_ADD_ASYNC";
+    case QXL_IO_CREATE_PRIMARY_ASYNC:
+        return "QXL_IO_CREATE_PRIMARY_ASYNC";
+    case QXL_IO_DESTROY_PRIMARY_ASYNC:
+        return "QXL_IO_DESTROY_PRIMARY_ASYNC";
+    case QXL_IO_DESTROY_SURFACE_ASYNC:
+        return "QXL_IO_DESTROY_SURFACE_ASYNC";
+    case QXL_IO_DESTROY_ALL_SURFACES_ASYNC:
+        return "QXL_IO_DESTROY_ALL_SURFACES_ASYNC";
+    case QXL_IO_FLUSH_SURFACES_ASYNC:
+        return "QXL_IO_FLUSH_SURFACES_ASYNC";
+    case QXL_IO_FLUSH_RELEASE:
+        return "QXL_IO_FLUSH_RELEASE";
+    }
+    // not reached?
+    return "error in io_port_to_string";
+}
+
 /* called from spice server thread context only */
 static int interface_get_command(QXLInstance *sin, struct QXLCommandExt *ext)
 {
@@ -1003,7 +1062,8 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val)
     default:
         if (d->mode == QXL_MODE_NATIVE || d->mode == QXL_MODE_COMPAT)
             break;
-        dprint(d, 1, "%s: unexpected port 0x%x in vga mode\n", __FUNCTION__, io_port);
+        dprint(d, 1, "%s: unexpected port 0x%x (%s) in vga mode\n",
+            __FUNCTION__, io_port, io_port_to_string(io_port));
         return;
     }
 
-- 
1.7.6
[Qemu-devel] [PATCHv3] qxl: fix surface tracking locking
From: Gerd Hoffmann <kra...@redhat.com>

Surface tracking needs proper locking since it is used from vcpu and spice
worker threads, add it.  Also reset the surface counter when zapping all
surfaces.

Signed-off-by: Gerd Hoffmann <kra...@redhat.com>
---
 hw/qxl.c |   13 -
 hw/qxl.h |    2 ++
 2 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index c1508a5..6862bc8 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -135,7 +135,12 @@ void qxl_spice_update_area(PCIQXLDevice *qxl, uint32_t surface_id,
 
 void qxl_spice_destroy_surface_wait(PCIQXLDevice *qxl, uint32_t id)
 {
+    qemu_mutex_lock(&qxl->track_lock);
+    PANIC_ON(id >= NUM_SURFACES);
     qxl->ssd.worker->destroy_surface_wait(qxl->ssd.worker, id);
+    qxl->guest_surfaces.cmds[id] = 0;
+    qxl->guest_surfaces.count--;
+    qemu_mutex_unlock(&qxl->track_lock);
 }
 
 void qxl_spice_loadvm_commands(PCIQXLDevice *qxl, struct QXLCommandExt *ext,
@@ -156,7 +161,11 @@ void qxl_spice_reset_memslots(PCIQXLDevice *qxl)
 
 void qxl_spice_destroy_surfaces(PCIQXLDevice *qxl)
 {
+    qemu_mutex_lock(&qxl->track_lock);
     qxl->ssd.worker->destroy_surfaces(qxl->ssd.worker);
+    memset(&qxl->guest_surfaces.cmds, 0, sizeof(qxl->guest_surfaces.cmds));
+    qxl->guest_surfaces.count = 0;
+    qemu_mutex_unlock(&qxl->track_lock);
 }
 
 void qxl_spice_reset_image_cache(PCIQXLDevice *qxl)
@@ -315,6 +324,7 @@ static void qxl_track_command(PCIQXLDevice *qxl, struct QXLCommandExt *ext)
         QXLSurfaceCmd *cmd = qxl_phys2virt(qxl, ext->cmd.data, ext->group_id);
         uint32_t id = le32_to_cpu(cmd->surface_id);
         PANIC_ON(id >= NUM_SURFACES);
+        qemu_mutex_lock(&qxl->track_lock);
         if (cmd->type == QXL_SURFACE_CMD_CREATE) {
             qxl->guest_surfaces.cmds[id] = ext->cmd.data;
             qxl->guest_surfaces.count++;
@@ -325,6 +335,7 @@ static void qxl_track_command(PCIQXLDevice *qxl, struct QXLCommandExt *ext)
             qxl->guest_surfaces.cmds[id] = 0;
             qxl->guest_surfaces.count--;
         }
+        qemu_mutex_unlock(&qxl->track_lock);
         break;
     }
     case QXL_CMD_CURSOR:
@@ -873,7 +884,6 @@ static void qxl_reset_surfaces(PCIQXLDevice *d)
     dprint(d, 1, "%s:\n", __FUNCTION__);
     d->mode = QXL_MODE_UNDEFINED;
     qxl_spice_destroy_surfaces(d);
-    memset(&d->guest_surfaces.cmds, 0, sizeof(d->guest_surfaces.cmds));
 }
 
 /* called from spice server thread context only */
@@ -1284,6 +1294,7 @@ static int qxl_init_common(PCIQXLDevice *qxl)
     qxl->generation = 1;
     qxl->num_memslots = NUM_MEMSLOTS;
     qxl->num_surfaces = NUM_SURFACES;
+    qemu_mutex_init(&qxl->track_lock);
 
     switch (qxl->revision) {
     case 1: /* spice 0.4 -- qxl-1 */
diff --git a/hw/qxl.h b/hw/qxl.h
index 489d518..087ef6b 100644
--- a/hw/qxl.h
+++ b/hw/qxl.h
@@ -55,6 +55,8 @@ typedef struct PCIQXLDevice {
     } guest_surfaces;
     QXLPHYSICAL        guest_cursor;
 
+    QemuMutex          track_lock;
+
     /* thread signaling */
     pthread_t          main;
     int                pipe[2];
-- 
1.7.6
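The idea of this patch is that every read-modify-write of the surface table and its counter happens with track_lock held, including the bulk reset. A rough standalone sketch of that discipline using POSIX mutexes is given below. DemoTracker and the demo_* functions are hypothetical, and unlike the patch the sketch rejects double creates and double destroys instead of asserting.

```c
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NUM_SURFACES 8 /* arbitrary size for the sketch */

/* Hypothetical stand-in for the guest_surfaces bookkeeping. */
typedef struct DemoTracker {
    pthread_mutex_t lock;
    uint64_t cmds[DEMO_NUM_SURFACES]; /* 0 means "slot unused" */
    uint32_t count;
} DemoTracker;

static void demo_tracker_init(DemoTracker *t)
{
    pthread_mutex_init(&t->lock, NULL);
    memset(t->cmds, 0, sizeof(t->cmds));
    t->count = 0;
}

/* Record a surface creation; -1 for an out-of-range or already used id. */
static int demo_track_create(DemoTracker *t, uint32_t id, uint64_t cmd)
{
    int ret = -1;

    if (id >= DEMO_NUM_SURFACES) {
        return -1; /* range check needs no lock */
    }
    pthread_mutex_lock(&t->lock);
    if (t->cmds[id] == 0) {
        t->cmds[id] = cmd;
        t->count++;
        ret = 0;
    }
    pthread_mutex_unlock(&t->lock);
    return ret;
}

/* Record a surface destruction; -1 for an out-of-range or unused id. */
static int demo_track_destroy(DemoTracker *t, uint32_t id)
{
    int ret = -1;

    if (id >= DEMO_NUM_SURFACES) {
        return -1;
    }
    pthread_mutex_lock(&t->lock);
    if (t->cmds[id] != 0) {
        t->cmds[id] = 0;
        t->count--;
        ret = 0;
    }
    pthread_mutex_unlock(&t->lock);
    return ret;
}

/* Forget all surfaces: table and counter are cleared under the same lock. */
static void demo_track_reset(DemoTracker *t)
{
    pthread_mutex_lock(&t->lock);
    memset(t->cmds, 0, sizeof(t->cmds));
    t->count = 0;
    pthread_mutex_unlock(&t->lock);
}
```

Clearing the counter together with the table is the second half of the fix: resetting only the table would leave count describing surfaces that no longer exist.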
[Qemu-devel] [PATCHv3] qxl: bump pci rev
From: Gerd Hoffmann <kra...@redhat.com>

Inform guest drivers about the new features and I/O commands we have
now (async commands, S3 support) if building with newer spice, i.e.
if SPICE_INTERFACE_QXL_MINOR >= 1.

Signed-off-by: Gerd Hoffmann <kra...@redhat.com>
---
 hw/qxl.c |   11 ---
 hw/qxl.h |    6 ++
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index ae1d0de..d3b1581 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -1389,9 +1389,14 @@ static int qxl_init_common(PCIQXLDevice *qxl)
         pci_device_rev = QXL_REVISION_STABLE_V04;
         break;
     case 2: /* spice 0.6 -- qxl-2 */
-    default:
         pci_device_rev = QXL_REVISION_STABLE_V06;
         break;
+#if SPICE_INTERFACE_QXL_MINOR >= 1
+    case 3: /* qxl-3 */
+#endif
+    default:
+        pci_device_rev = QXL_DEFAULT_REVISION;
+        break;
     }
 
     pci_set_byte(&config[PCI_REVISION_ID], pci_device_rev);
@@ -1655,7 +1660,7 @@ static PCIDeviceInfo qxl_info_primary = {
     .qdev.props = (Property[]) {
         DEFINE_PROP_UINT32("ram_size", PCIQXLDevice, vga.vram_size, 64 * 1024 * 1024),
         DEFINE_PROP_UINT32("vram_size", PCIQXLDevice, vram_size, 64 * 1024 * 1024),
-        DEFINE_PROP_UINT32("revision", PCIQXLDevice, revision, 2),
+        DEFINE_PROP_UINT32("revision", PCIQXLDevice, revision, QXL_DEFAULT_REVISION),
         DEFINE_PROP_UINT32("debug", PCIQXLDevice, debug, 0),
         DEFINE_PROP_UINT32("guestdebug", PCIQXLDevice, guestdebug, 0),
         DEFINE_PROP_UINT32("cmdlog", PCIQXLDevice, cmdlog, 0),
@@ -1676,7 +1681,7 @@ static PCIDeviceInfo qxl_info_secondary = {
     .qdev.props = (Property[]) {
         DEFINE_PROP_UINT32("ram_size", PCIQXLDevice, vga.vram_size, 64 * 1024 * 1024),
         DEFINE_PROP_UINT32("vram_size", PCIQXLDevice, vram_size, 64 * 1024 * 1024),
-        DEFINE_PROP_UINT32("revision", PCIQXLDevice, revision, 2),
+        DEFINE_PROP_UINT32("revision", PCIQXLDevice, revision, QXL_DEFAULT_REVISION),
         DEFINE_PROP_UINT32("debug", PCIQXLDevice, debug, 0),
         DEFINE_PROP_UINT32("guestdebug", PCIQXLDevice, guestdebug, 0),
         DEFINE_PROP_UINT32("cmdlog", PCIQXLDevice, cmdlog, 0),
diff --git a/hw/qxl.h b/hw/qxl.h
index e361bc6..85d37be 100644
--- a/hw/qxl.h
+++ b/hw/qxl.h
@@ -97,6 +97,12 @@ typedef struct PCIQXLDevice {
     } \
 } while (0)
 
+#if SPICE_INTERFACE_QXL_MINOR >= 1
+#define QXL_DEFAULT_REVISION QXL_REVISION_STABLE_V10
+#else
+#define QXL_DEFAULT_REVISION QXL_REVISION_STABLE_V06
+#endif
+
 /* qxl.c */
 void *qxl_phys2virt(PCIQXLDevice *qxl, QXLPHYSICAL phys, int group_id);
 void qxl_guest_bug(PCIQXLDevice *qxl, const char *msg, ...);
-- 
1.7.6
[Qemu-devel] [PATCHv3] qxl: QXL_IO_UPDATE_AREA: pass ram->update_area directly to update_area
---
 hw/qxl.c |    6 +-
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index 17b5b39..6094b38 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -136,7 +136,6 @@ void qxl_guest_bug(PCIQXLDevice *qxl, const char *msg, ...)
     }
 }
 
-
 void qxl_spice_update_area(PCIQXLDevice *qxl, uint32_t surface_id,
                            struct QXLRect *area, struct QXLRect *dirty_rects,
                            uint32_t num_dirty_rects, uint32_t clear_dirty_region)
@@ -1081,12 +1080,9 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val)
 
     switch (io_port) {
     case QXL_IO_UPDATE_AREA:
-    {
-        QXLRect update = d->ram->update_area;
         qxl_spice_update_area(d, d->ram->update_surface,
-                              &update, NULL, 0, 0);
+                              &d->ram->update_area, NULL, 0, 1);
         break;
-    }
     case QXL_IO_NOTIFY_CMD:
         qemu_spice_wakeup(&d->ssd);
         break;
-- 
1.7.6
Re: [Qemu-devel] [PATCH 3/8] block: Support to keep track of I/O status
On 05.07.2011 20:17, Luiz Capitulino wrote:
> This commit adds support to the BlockDriverState type to keep track
> of the last I/O status. That is, at every I/O operation we update
> a status field in the BlockDriverState instance. Valid statuses are:
> OK, FAILED and ENOSPC.
>
> ENOSPC is distinguished from FAILED because a management application
> can use it to implement thin-provisioning.
>
> This feature has to be explicitly enabled by buses/devices supporting it.
>
> Signed-off-by: Luiz Capitulino <lcapitul...@redhat.com>

I'm not sure how this is meant to work with devices that can have multiple
requests in flight. If a request fails, one of the things that are done
before sending a monitor event is qemu_aio_flush(), i.e. waiting for all
in-flight requests to complete. If the last one of them is successful, your
status will report BDRV_IOS_OK.

If you don't stop the VM on I/O errors, the status is useless anyway, even
if only one request is active at the same point.

I think it would make more sense if we only stored the last error (that is,
don't clear the field on success). What is the use case, would this be
enough for it?

By the way, I'm not sure how it fits in, but I'd like to have a block layer
function that format drivers can use to tell qemu that the image is
corrupted. Maybe that's another case in which we should stop the VM and
have an appropriate status for it. It should probably have precedence over
an ENOSPC happening at the same time, so maybe we'll also need a way to
tell that some status is more important and may overwrite a less important
status, but not the other way round.

Kevin
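The two suggestions in this reply (keep the last error instead of clearing it on success, and let a more important status overwrite a less important one but not the reverse) can be sketched as a small state update. Everything below is hypothetical: the DemoIOStatus names are not the BDRV_IOS_* values from the patch, the CORRUPT status does not exist in it, and the ordering FAILED < ENOSPC < CORRUPT is only an assumption about which status matters more.

```c
#include <errno.h>

/* Hypothetical statuses, ordered from least to most important. */
typedef enum DemoIOStatus {
    DEMO_IOS_OK = 0,
    DEMO_IOS_FAILED,
    DEMO_IOS_ENOSPC,
    DEMO_IOS_CORRUPT /* assumed "image is corrupted" status */
} DemoIOStatus;

/* Map the errno-style result of one request to a status. */
static DemoIOStatus demo_status_from_errno(int err)
{
    if (err == 0) {
        return DEMO_IOS_OK;
    }
    return err == ENOSPC ? DEMO_IOS_ENOSPC : DEMO_IOS_FAILED;
}

/*
 * Fold the result of one completed request into the stored status.
 * A success never clears an error, and a less important status never
 * overwrites a more important one.
 */
static void demo_iostatus_update(DemoIOStatus *cur, DemoIOStatus next)
{
    if (next > *cur) {
        *cur = next;
    }
}

/* Explicit reset, e.g. once the error has been handled (an assumption). */
static void demo_iostatus_reset(DemoIOStatus *cur)
{
    *cur = DEMO_IOS_OK;
}
```

With this rule, a successful request completing after a failed one during a flush no longer hides the failure, which is the multiple-requests-in-flight case raised above.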
[Qemu-devel] [PATCHv3] qxl: error handling fixes and cleanups.
From: Gerd Hoffmann <kra...@redhat.com>

Add qxl_guest_bug() function which is supposed to be called in case
sanity checks of guest requests fail.  It raises an error IRQ and
logs a message in case guest debugging is enabled.

Make PANIC_ON() abort instead of exit.  That macro should be used
for qemu bugs only, any guest-triggerable stuff should use the new
qxl_guest_bug() function instead.

Convert a few easy cases from PANIC_ON() to qxl_guest_bug() to
show intended usage.

Signed-off-by: Gerd Hoffmann <kra...@redhat.com>
---
 hw/qxl.c |   32
 hw/qxl.h |    3 ++-
 2 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index 7be7ae1..91bc98d 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -124,6 +124,14 @@ static void qxl_reset_memslots(PCIQXLDevice *d);
 static void qxl_reset_surfaces(PCIQXLDevice *d);
 static void qxl_ring_set_dirty(PCIQXLDevice *qxl);
 
+void qxl_guest_bug(PCIQXLDevice *qxl, const char *msg)
+{
+    qxl_send_events(qxl, QXL_INTERRUPT_ERROR);
+    if (qxl->guestdebug) {
+        fprintf(stderr, "qxl-%d: guest bug: %s\n", qxl->id, msg);
+    }
+}
+
 
 void qxl_spice_update_area(PCIQXLDevice *qxl, uint32_t surface_id,
                            struct QXLRect *area, struct QXLRect *dirty_rects,
@@ -1111,22 +1119,38 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val)
         qxl_hard_reset(d, 0);
         break;
     case QXL_IO_MEMSLOT_ADD:
-        PANIC_ON(val >= NUM_MEMSLOTS);
-        PANIC_ON(d->guest_slots[val].active);
+        if (val >= NUM_MEMSLOTS) {
+            qxl_guest_bug(d, "QXL_IO_MEMSLOT_ADD: val out of range");
+            break;
+        }
+        if (d->guest_slots[val].active) {
+            qxl_guest_bug(d, "QXL_IO_MEMSLOT_ADD: memory slot already active");
+            break;
+        }
         d->guest_slots[val].slot = d->ram->mem_slot;
         qxl_add_memslot(d, val, 0);
         break;
     case QXL_IO_MEMSLOT_DEL:
+        if (val >= NUM_MEMSLOTS) {
+            qxl_guest_bug(d, "QXL_IO_MEMSLOT_DEL: val out of range");
+            break;
+        }
         qxl_del_memslot(d, val);
         break;
     case QXL_IO_CREATE_PRIMARY:
-        PANIC_ON(val != 0);
+        if (val != 0) {
+            qxl_guest_bug(d, "QXL_IO_CREATE_PRIMARY: val != 0");
+            break;
+        }
         dprint(d, 1, "QXL_IO_CREATE_PRIMARY\n");
         d->guest_primary.surface = d->ram->create_surface;
         qxl_create_guest_primary(d, 0);
         break;
     case QXL_IO_DESTROY_PRIMARY:
-        PANIC_ON(val != 0);
+        if (val != 0) {
+            qxl_guest_bug(d, "QXL_IO_DESTROY_PRIMARY: val != 0");
+            break;
+        }
         dprint(d, 1, "QXL_IO_DESTROY_PRIMARY (%s)\n", qxl_mode_to_string(d->mode));
         if (d->mode != QXL_MODE_UNDEFINED) {
             d->mode = QXL_MODE_UNDEFINED;
diff --git a/hw/qxl.h b/hw/qxl.h
index 087ef6b..88393c2 100644
--- a/hw/qxl.h
+++ b/hw/qxl.h
@@ -86,7 +86,7 @@ typedef struct PCIQXLDevice {
 
 #define PANIC_ON(x) if ((x)) {                         \
     printf("%s: PANIC %s failed\n", __FUNCTION__, #x); \
-    exit(-1);                                          \
+    abort();                                           \
     }
 
 #define dprint(_qxl, _level, _fmt, ...)                                 \
@@ -99,6 +99,7 @@ typedef struct PCIQXLDevice {
 
 /* qxl.c */
 void *qxl_phys2virt(PCIQXLDevice *qxl, QXLPHYSICAL phys, int group_id);
+void qxl_guest_bug(PCIQXLDevice *qxl, const char *msg);
 
 void qxl_spice_update_area(PCIQXLDevice *qxl, uint32_t surface_id,
                            struct QXLRect *area, struct QXLRect *dirty_rects,
-- 
1.7.6
[Qemu-devel] [PATCHv3] qxl: use QXL_REVISION_*
---
 hw/qxl.c |   22 ++++++++++------------
 1 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index d3b1581..17b5b39 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -1375,7 +1375,6 @@ static DisplayChangeListener display_listener = {
 static int qxl_init_common(PCIQXLDevice *qxl)
 {
     uint8_t* config = qxl->pci.config;
-    uint32_t pci_device_rev;
     uint32_t io_size;
 
     qxl->mode = QXL_MODE_UNDEFINED;
@@ -1385,21 +1384,20 @@ static int qxl_init_common(PCIQXLDevice *qxl)
     qemu_mutex_init(&qxl->track_lock);
 
     switch (qxl->revision) {
-    case 1: /* spice 0.4 -- qxl-1 */
-        pci_device_rev = QXL_REVISION_STABLE_V04;
-        break;
-    case 2: /* spice 0.6 -- qxl-2 */
-        pci_device_rev = QXL_REVISION_STABLE_V06;
-        break;
+    case QXL_REVISION_STABLE_V04: /* spice 0.4 -- qxl-1 */
+    case QXL_REVISION_STABLE_V06: /* spice 0.6 -- qxl-2 */
 #if SPICE_INTERFACE_QXL_MINOR >= 1
-    case 3: /* qxl-3 */
+    case QXL_REVISION_STABLE_V10: /* spice 0.10? -- qxl-3 */
+        break;
 #endif
     default:
-        pci_device_rev = QXL_DEFAULT_REVISION;
+        fprintf(stderr, "invalid revision %d, resetting to %d\n", qxl->revision,
+                QXL_DEFAULT_REVISION);
+        qxl->revision = QXL_DEFAULT_REVISION;
         break;
     }
 
-    pci_set_byte(&config[PCI_REVISION_ID], pci_device_rev);
+    pci_set_byte(&config[PCI_REVISION_ID], qxl->revision);
     pci_set_byte(&config[PCI_INTERRUPT_PIN], 1);
 
     qxl->rom_size = qxl_rom_size();
@@ -1410,14 +1408,14 @@ static int qxl_init_common(PCIQXLDevice *qxl)
     if (qxl->vram_size < 16 * 1024 * 1024) {
         qxl->vram_size = 16 * 1024 * 1024;
     }
-    if (qxl->revision == 1) {
+    if (qxl->revision == QXL_REVISION_STABLE_V04) {
         qxl->vram_size = 4096;
     }
     qxl->vram_size = msb_mask(qxl->vram_size * 2 - 1);
     qxl->vram_offset = qemu_ram_alloc(&qxl->pci.qdev, "qxl.vram",
                                       qxl->vram_size);
 
     io_size = msb_mask(QXL_IO_RANGE_SIZE * 2 - 1);
-    if (qxl->revision == 1) {
+    if (qxl->revision == QXL_REVISION_STABLE_V04) {
         io_size = 8;
     }
-- 
1.7.6
[Qemu-devel] [PATCHv3] spice/qxl: move worker wrappers
From: Gerd Hoffmann kra...@redhat.com Move the wrapper functions which are used by qxl only to qxl.c. Rename them from qemu_spice_* to qxl_spice_*. Also pass in a qxl state pointer instead of a SimpleSpiceDisplay pointer. Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/qxl-render.c|4 +- hw/qxl.c | 66 hw/qxl.h | 12 + ui/spice-display.c | 45 --- ui/spice-display.h | 11 5 files changed, 70 insertions(+), 68 deletions(-) diff --git a/hw/qxl-render.c b/hw/qxl-render.c index bef5f14..60b822d 100644 --- a/hw/qxl-render.c +++ b/hw/qxl-render.c @@ -124,8 +124,8 @@ void qxl_render_update(PCIQXLDevice *qxl) update.bottom = qxl-guest_primary.surface.height; memset(dirty, 0, sizeof(dirty)); -qemu_spice_update_area(qxl-ssd, 0, update, - dirty, ARRAY_SIZE(dirty), 1); +qxl_spice_update_area(qxl, 0, update, + dirty, ARRAY_SIZE(dirty), 1); for (i = 0; i ARRAY_SIZE(dirty); i++) { if (qemu_spice_rect_is_empty(dirty+i)) { diff --git a/hw/qxl.c b/hw/qxl.c index 0c5ed65..c1508a5 100644 --- a/hw/qxl.c +++ b/hw/qxl.c @@ -124,6 +124,52 @@ static void qxl_reset_memslots(PCIQXLDevice *d); static void qxl_reset_surfaces(PCIQXLDevice *d); static void qxl_ring_set_dirty(PCIQXLDevice *qxl); + +void qxl_spice_update_area(PCIQXLDevice *qxl, uint32_t surface_id, + struct QXLRect *area, struct QXLRect *dirty_rects, + uint32_t num_dirty_rects, uint32_t clear_dirty_region) +{ +qxl-ssd.worker-update_area(qxl-ssd.worker, surface_id, area, dirty_rects, + num_dirty_rects, clear_dirty_region); +} + +void qxl_spice_destroy_surface_wait(PCIQXLDevice *qxl, uint32_t id) +{ +qxl-ssd.worker-destroy_surface_wait(qxl-ssd.worker, id); +} + +void qxl_spice_loadvm_commands(PCIQXLDevice *qxl, struct QXLCommandExt *ext, + uint32_t count) +{ +qxl-ssd.worker-loadvm_commands(qxl-ssd.worker, ext, count); +} + +void qxl_spice_oom(PCIQXLDevice *qxl) +{ +qxl-ssd.worker-oom(qxl-ssd.worker); +} + +void qxl_spice_reset_memslots(PCIQXLDevice *qxl) +{ +qxl-ssd.worker-reset_memslots(qxl-ssd.worker); +} + +void 
qxl_spice_destroy_surfaces(PCIQXLDevice *qxl) +{ +qxl-ssd.worker-destroy_surfaces(qxl-ssd.worker); +} + +void qxl_spice_reset_image_cache(PCIQXLDevice *qxl) +{ +qxl-ssd.worker-reset_image_cache(qxl-ssd.worker); +} + +void qxl_spice_reset_cursor(PCIQXLDevice *qxl) +{ +qxl-ssd.worker-reset_cursor(qxl-ssd.worker); +} + + static inline uint32_t msb_mask(uint32_t val) { uint32_t mask; @@ -692,8 +738,8 @@ static void qxl_hard_reset(PCIQXLDevice *d, int loadvm) dprint(d, 1, %s: start%s\n, __FUNCTION__, loadvm ? (loadvm) : ); -qemu_spice_reset_cursor(d-ssd); -qemu_spice_reset_image_cache(d-ssd); +qxl_spice_reset_cursor(d); +qxl_spice_reset_image_cache(d); qxl_reset_surfaces(d); qxl_reset_memslots(d); @@ -818,7 +864,7 @@ static void qxl_del_memslot(PCIQXLDevice *d, uint32_t slot_id) static void qxl_reset_memslots(PCIQXLDevice *d) { dprint(d, 1, %s:\n, __FUNCTION__); -qemu_spice_reset_memslots(d-ssd); +qxl_spice_reset_memslots(d); memset(d-guest_slots, 0, sizeof(d-guest_slots)); } @@ -826,7 +872,7 @@ static void qxl_reset_surfaces(PCIQXLDevice *d) { dprint(d, 1, %s:\n, __FUNCTION__); d-mode = QXL_MODE_UNDEFINED; -qemu_spice_destroy_surfaces(d-ssd); +qxl_spice_destroy_surfaces(d); memset(d-guest_surfaces.cmds, 0, sizeof(d-guest_surfaces.cmds)); } @@ -955,8 +1001,8 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val) case QXL_IO_UPDATE_AREA: { QXLRect update = d-ram-update_area; -qemu_spice_update_area(d-ssd, d-ram-update_surface, - update, NULL, 0, 0); +qxl_spice_update_area(d, d-ram-update_surface, + update, NULL, 0, 0); break; } case QXL_IO_NOTIFY_CMD: @@ -977,7 +1023,7 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val) break; } d-oom_running = 1; -qemu_spice_oom(d-ssd); +qxl_spice_oom(d); d-oom_running = 0; break; case QXL_IO_SET_MODE: @@ -1018,10 +1064,10 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val) } break; case QXL_IO_DESTROY_SURFACE_WAIT: -qemu_spice_destroy_surface_wait(d-ssd, val); 
+qxl_spice_destroy_surface_wait(d, val); break; case QXL_IO_DESTROY_ALL_SURFACES: -qemu_spice_destroy_surfaces(d-ssd); +qxl_spice_destroy_surfaces(d); break; default: fprintf(stderr, %s: ioport=0x%x, abort()\n, __FUNCTION__, io_port); @@ -1421,7 +1467,7 @@ static int
[Qemu-devel] [PATCHv3] qxl-render: split out qxl_render_update_dirty_rectangles
will later be reused from surface_updated callback when compiling against
a newer spice-server.
---
 hw/qxl-render.c |   37 ++++++++++++++++++++++---------------
 1 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/hw/qxl-render.c b/hw/qxl-render.c
index e64b646..d70373d 100644
--- a/hw/qxl-render.c
+++ b/hw/qxl-render.c
@@ -79,12 +79,32 @@ static void qxl_save_ppm(PCIQXLDevice *qxl)
     }
 }
 
+static void qxl_render_update_dirty_rectangles(PCIQXLDevice *qxl, QXLRect *dirty, uint32_t num_dirty)
+{
+    VGACommonState *vga = &qxl->vga;
+    int i;
+
+    dprint(qxl, 3, "%s: %d\n", __FUNCTION__, num_dirty);
+    for (i = 0; i < num_dirty; i++) {
+        if (qemu_spice_rect_is_empty(dirty + i)) {
+            break;
+        }
+        if (qxl->guest_primary.flipped) {
+            qxl_flip(qxl, dirty + i);
+        }
+        dpy_update(vga->ds,
+                   dirty[i].left, dirty[i].top,
+                   dirty[i].right - dirty[i].left,
+                   dirty[i].bottom - dirty[i].top);
+    }
+    qxl_save_ppm(qxl);
+}
+
 void qxl_render_update(PCIQXLDevice *qxl)
 {
     VGACommonState *vga = &qxl->vga;
     QXLRect dirty[32], update;
     void *ptr;
-    int i;
 
     if (qxl->guest_primary.resized) {
         qxl->guest_primary.resized = 0;
@@ -135,20 +155,7 @@ void qxl_render_update(PCIQXLDevice *qxl)
     memset(dirty, 0, sizeof(dirty));
     qxl_spice_update_area(qxl, 0, &update,
                           dirty, ARRAY_SIZE(dirty), 1);
-
-    for (i = 0; i < ARRAY_SIZE(dirty); i++) {
-        if (qemu_spice_rect_is_empty(dirty+i)) {
-            break;
-        }
-        if (qxl->guest_primary.flipped) {
-            qxl_flip(qxl, dirty+i);
-        }
-        dpy_update(vga->ds,
-                   dirty[i].left, dirty[i].top,
-                   dirty[i].right - dirty[i].left,
-                   dirty[i].bottom - dirty[i].top);
-    }
-    qxl_save_ppm(qxl);
+    qxl_render_update_dirty_rectangles(qxl, dirty, ARRAY_SIZE(dirty));
 }
 
 static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor *cursor)
-- 
1.7.6
[Qemu-devel] [PATCHv3] qxl: only disallow specific io's in vga mode
Since the driver is still in operation even after moving to UNDEFINED, i.e.
by destroying primary in any way.
---
 hw/qxl.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index 0585f02..1d6acce 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -1175,8 +1175,9 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val)
     case QXL_IO_LOG:
         break;
     default:
-        if (d->mode == QXL_MODE_NATIVE || d->mode == QXL_MODE_COMPAT)
+        if (d->mode != QXL_MODE_VGA) {
             break;
+        }
         dprint(d, 1, "%s: unexpected port 0x%x (%s) in vga mode\n",
             __FUNCTION__, io_port, io_port_to_string(io_port));
         /* be nice to buggy guest drivers */
-- 
1.7.6
Re: [Qemu-devel] [PATCH 7/8] QMP: query-status: Add 'io-status' key
On Tue, 12 Jul 2011 09:47:19 +0200 Markus Armbruster arm...@redhat.com wrote: Luiz Capitulino lcapitul...@redhat.com writes: Contains the last I/O status for the given device. Currently this is only supported by ide, scsi and virtio block devices. Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- block.c | 15 ++- block.h |2 +- qmp-commands.hx |6 ++ 3 files changed, 21 insertions(+), 2 deletions(-) diff --git a/block.c b/block.c index cc0a34e..28df3d8 100644 --- a/block.c +++ b/block.c @@ -1720,6 +1720,12 @@ void bdrv_info_print(Monitor *mon, const QObject *data) qlist_iter(qobject_to_qlist(data), bdrv_print_dict, mon); } +static const char *const io_status_name[BDRV_IOS_MAX] = { +[BDRV_IOS_OK] = ok, +[BDRV_IOS_FAILED] = failed, +[BDRV_IOS_ENOSPC] = nospace, +}; + void bdrv_info(Monitor *mon, QObject **ret_data) { QList *bs_list; @@ -1729,15 +1735,16 @@ void bdrv_info(Monitor *mon, QObject **ret_data) QTAILQ_FOREACH(bs, bdrv_states, list) { QObject *bs_obj; +QDict *bs_dict; bs_obj = qobject_from_jsonf({ 'device': %s, 'type': 'unknown', 'removable': %i, 'locked': %i }, bs-device_name, bs-removable, bs-locked); +bs_dict = qobject_to_qdict(bs_obj); if (bs-drv) { QObject *obj; -QDict *bs_dict = qobject_to_qdict(bs_obj); obj = qobject_from_jsonf({ 'file': %s, 'ro': %i, 'drv': %s, 'encrypted': %i }, @@ -1752,6 +1759,12 @@ void bdrv_info(Monitor *mon, QObject **ret_data) qdict_put_obj(bs_dict, inserted, obj); } + +if (bs-iostatus_enabled) { +qdict_put(bs_dict, io-status, + qstring_from_str(io_status_name[bs-iostatus])); +} + qlist_append_obj(bs_list, bs_obj); } diff --git a/block.h b/block.h index 0dca1bb..0141fe6 100644 --- a/block.h +++ b/block.h @@ -51,7 +51,7 @@ typedef enum { } BlockMonEventAction; typedef enum { -BDRV_IOS_OK, BDRV_IOS_FAILED, BDRV_IOS_ENOSPC +BDRV_IOS_OK, BDRV_IOS_FAILED, BDRV_IOS_ENOSPC, BDRV_IOS_MAX } BlockIOStatus; void bdrv_iostatus_update(BlockDriverState *bs, int error); diff --git a/qmp-commands.hx b/qmp-commands.hx index 
6b8eb0a..1746b6d 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -1082,6 +1082,9 @@ Each json-object contain the following: tftp, vdi, vmdk, vpc, vvfat - backing_file: backing file name (json-string, optional) - encrypted: true if encrypted, false otherwise (json-bool) +- io-status: last executed I/O operation status, only present if the device + supports it (json_string, optional) + - Possible values: ok, failed, nospace Example: @@ -1089,6 +1092,7 @@ Example: - { return:[ { +io-status: ok, device:ide0-hd0, locked:false, removable:false, @@ -1101,12 +1105,14 @@ Example: type:unknown }, { +io-status: ok, device:ide1-cd0, locked:false, removable:true, type:unknown }, { +io-status: ok, device:floppy0, locked:false, removable:true, floppy doesn't support I/O status, yet the example shows io-status: ok. Are you sure it's correct? Good catch, I did this by hand :-)
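The name table in the patch relies on designated initializers, with the extra BDRV_IOS_MAX enumerator sizing the array, and the key is only emitted when tracking is enabled for the device. A minimal standalone sketch of that pattern follows. The toy_ names are hypothetical stand-ins, not the block.c symbols, and returning NULL to mean "omit the key" is an assumption made for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the BlockIOStatus enum from the patch. */
typedef enum {
    TOY_IOS_OK, TOY_IOS_FAILED, TOY_IOS_ENOSPC, TOY_IOS_MAX
} toy_io_status;

/* Designated initializers keep each string next to its enum value;
 * TOY_IOS_MAX sizes the array. */
static const char *const toy_io_status_name[TOY_IOS_MAX] = {
    [TOY_IOS_OK]     = "ok",
    [TOY_IOS_FAILED] = "failed",
    [TOY_IOS_ENOSPC] = "nospace",
};

/* Returns the string to report, or NULL when the key should be omitted
 * (tracking disabled, or a value outside the table). */
static const char *toy_io_status_key(bool enabled, toy_io_status st)
{
    if (!enabled || (unsigned)st >= (unsigned)TOY_IOS_MAX) {
        return NULL;
    }
    return toy_io_status_name[st];
}
```

With this shape, a device that never enabled tracking (the floppy case raised in the review) simply yields no io-status entry.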
[Qemu-devel] [PATCHv3] qxl: async io support using new spice api
Some of the QXL port i/o commands are waiting for the spice server to complete certain actions. Add async versions for these commands, so we don't block the vcpu while the spice server processses the command. Instead the qxl device will raise an IRQ when done. The async command processing relies on an added QXLInterface::async_complete and added QXLWorker::*_async additions, in spice server qxl = 3.1 Signed-off-by: Gerd Hoffmann kra...@redhat.com Signed-off-by: Alon Levy al...@redhat.com --- hw/qxl.c | 229 --- hw/qxl.h | 15 +++- ui/spice-display.c | 48 +--- ui/spice-display.h | 25 +- 4 files changed, 270 insertions(+), 47 deletions(-) diff --git a/hw/qxl.c b/hw/qxl.c index 6094b38..bd540c0 100644 --- a/hw/qxl.c +++ b/hw/qxl.c @@ -138,22 +138,49 @@ void qxl_guest_bug(PCIQXLDevice *qxl, const char *msg, ...) void qxl_spice_update_area(PCIQXLDevice *qxl, uint32_t surface_id, struct QXLRect *area, struct QXLRect *dirty_rects, - uint32_t num_dirty_rects, uint32_t clear_dirty_region) + uint32_t num_dirty_rects, + uint32_t clear_dirty_region) { qxl-ssd.worker-update_area(qxl-ssd.worker, surface_id, area, dirty_rects, num_dirty_rects, clear_dirty_region); } -void qxl_spice_destroy_surface_wait(PCIQXLDevice *qxl, uint32_t id) +#if SPICE_INTERFACE_QXL_MINOR = 1 +void qxl_spice_update_area_async(PCIQXLDevice *qxl, uint32_t surface_id, + struct QXLRect *area, + uint32_t clear_dirty_region, int is_vga) +{ +qxl-ssd.worker-update_area_async(qxl-ssd.worker, surface_id, area, + clear_dirty_region, + is_vga ? 
QXL_COOKIE_VGA : 0); +} +#endif + +static void qxl_spice_destroy_surface_wait_complete(PCIQXLDevice *qxl, +uint32_t id) { qemu_mutex_lock(qxl-track_lock); -PANIC_ON(id = NUM_SURFACES); -qxl-ssd.worker-destroy_surface_wait(qxl-ssd.worker, id); qxl-guest_surfaces.cmds[id] = 0; qxl-guest_surfaces.count--; qemu_mutex_unlock(qxl-track_lock); } +static void qxl_spice_destroy_surface_wait(PCIQXLDevice *qxl, uint32_t id, + qxl_async_io async) +{ +if (async) { +#if SPICE_INTERFACE_QXL_MINOR 1 +abort(); +#else +qxl-ssd.worker-destroy_surface_wait_async(qxl-ssd.worker, id, +(uint64_t)id); +#endif +} else { +qxl-ssd.worker-destroy_surface_wait(qxl-ssd.worker, id); +qxl_spice_destroy_surface_wait_complete(qxl, id); +} +} + void qxl_spice_loadvm_commands(PCIQXLDevice *qxl, struct QXLCommandExt *ext, uint32_t count) { @@ -170,15 +197,28 @@ void qxl_spice_reset_memslots(PCIQXLDevice *qxl) qxl-ssd.worker-reset_memslots(qxl-ssd.worker); } -void qxl_spice_destroy_surfaces(PCIQXLDevice *qxl) +static void qxl_spice_destroy_surfaces_complete(PCIQXLDevice *qxl) { qemu_mutex_lock(qxl-track_lock); -qxl-ssd.worker-destroy_surfaces(qxl-ssd.worker); memset(qxl-guest_surfaces.cmds, 0, sizeof(qxl-guest_surfaces.cmds)); qxl-guest_surfaces.count = 0; qemu_mutex_unlock(qxl-track_lock); } +static void qxl_spice_destroy_surfaces(PCIQXLDevice *qxl, qxl_async_io async) +{ +if (async) { +#if SPICE_INTERFACE_QXL_MINOR 1 +abort(); +#else +qxl-ssd.worker-destroy_surfaces_async(qxl-ssd.worker, 0); +#endif +} else { +qxl-ssd.worker-destroy_surfaces(qxl-ssd.worker); +qxl_spice_destroy_surfaces_complete(qxl); +} +} + void qxl_spice_reset_image_cache(PCIQXLDevice *qxl) { qxl-ssd.worker-reset_image_cache(qxl-ssd.worker); @@ -705,6 +745,43 @@ static int interface_flush_resources(QXLInstance *sin) return ret; } +static void qxl_create_guest_primary_complete(PCIQXLDevice *d); + +#if SPICE_INTERFACE_QXL_MINOR = 1 + +/* called from spice server thread context only */ +static void interface_async_complete(QXLInstance 
*sin, uint64_t cookie) +{ +PCIQXLDevice *qxl = container_of(sin, PCIQXLDevice, ssd.qxl); +uint32_t current_async; + +if (cookie == QXL_COOKIE_VGA) { +dprint(qxl, 3, ignoring async from vga update\n); +return; +} + +qemu_mutex_lock(qxl-async_lock); +current_async = qxl-current_async; +qxl-current_async = QXL_UNDEFINED_IO; +qemu_mutex_unlock(qxl-async_lock); + +dprint(qxl, 2, async_complete: %d (%ld) done\n, current_async, cookie); +switch (current_async) { +case QXL_IO_CREATE_PRIMARY_ASYNC: +qxl_create_guest_primary_complete(qxl); +break; +case QXL_IO_DESTROY_ALL_SURFACES_ASYNC: +qxl_spice_destroy_surfaces_complete(qxl); +break; +
[Qemu-devel] qemu add cpu AMD Opteron 61XX
Hello!

How can I add a new CPU model, AMD Opteron 61xx, to QEMU?

My CPU:

processor : 47
vendor_id : AuthenticAMD
cpu family : 16
model : 9
model name : AMD Opteron(tm) Processor 6174
stepping : 1
cpu MHz : 2200.294
cache size : 512 KB
physical id : 3
siblings : 12
core id : 5
cpu cores : 12
apicid : 75
initial apicid : 59
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid amd_dcm pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr
bogomips : 4400.19
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

If I add a new [cpudef] section to the file target-x86_64.conf, will it just work, or do I need a newly compiled QEMU?

Thanks,
Alexej
[Qemu-devel] [PATCHv3] qxl: qxl_send_events: ignore if stopped (instead of abort)
This can happen if there is an interface_get_command issued when the
server has been stopped. Easy to trigger - do stop/cont a few times
(three seem to be enough).

The solution of ignoring the request is bad, but better than aborting,
and a real solution would probably be in spice to not call get_command
in the first place.
---
 hw/qxl.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index 8a9463e..0585f02 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -1406,7 +1406,10 @@ static void qxl_send_events(PCIQXLDevice *d, uint32_t events)
     uint32_t old_pending;
     uint32_t le_events = cpu_to_le32(events);
 
-    assert(d->ssd.running);
+    if (!d->ssd.running) {
+        fprintf(stderr, "qxl: not sending interrupt %d while stopped\n", events);
+        return;
+    }
     old_pending = __sync_fetch_and_or(&d->ram->int_pending, le_events);
     if ((old_pending & le_events) == le_events) {
         return;
-- 
1.7.6
[Qemu-devel] Fwd: [PATCH] Introduce info migrate-times monitor command
This accidentally didn't go to the list although it was sent there (using git send-email)...

Michal

-------- Original Message --------
Subject: [PATCH] Introduce info migrate-times monitor command
Date: Tue, 12 Jul 2011 15:28:27 +0200
From: Michal Novotny mig...@gmail.com
To: qemu-devel@nongnu.org
CC: Michal Novotny minov...@redhat.com, Michal Novotny mig...@gmail.com

From: Michal Novotny minov...@redhat.com

Hi,
this is the implementation of the info migrate-times command, which I wrote to get the times for each migration stage. Since migration itself is just a vmsave on the source host and a vmload on the destination host, this function can also be useful for getting the save times; however, its main purpose is measuring the migration times, therefore it's called info migrate-times.

It tracks the total memory transferred during the last migration, the total migration time, the time spent waiting for input data, and the times for the various migration stages, both in total and separately for disk (if applicable) and RAM transfer. There's also the time difference, an inaccuracy value caused by block device flushing and by the use of qemu_get_clock_ns(): subsequent calls of this function may result in minor inaccuracies (smaller than the order of milliseconds).

I also did the testing with various migration speed settings (using the migrate_set_speed monitor command) for a 7 GiB RHEL-6 i386 guest running a bonnie++ test for 14 GiB (2x RAM), and the results were as follows:

Max.speed   | Memory transferred | Time (s)
------------+--------------------+---------
32m         | 12 925 676 bytes   | 199 s
64m         |  7 745 224 bytes   |  26 s
128m        |  7 674 188 bytes   |  16 s
256m        |  7 628 988 bytes   |  16 s
512m        |  7 599 837 bytes   |  15 s
1024m (1g)  |  7 592 934 bytes   |  14 s
10g         |  7 583 824 bytes   |  13 s

This has been tested on the 1 GiB network using the remote migration.
The output of the command for last iteration (shown as an example was): (qemu) info migrate-times Total transferred memory: 7583824 kbytes Total migration time: 13.552894 s Waiting for input data: 6.942414 s Time difference (inaccuracy): 0.018257 s Times for total stage 1: 0.020247 s Times for total stage 2: 6.355092 s Times for total stage 3: 0.253398 s Times for total total: 6.628737 s Times for ram stage 1: 0.020238 s Times for ram stage 2: 6.353832 s Times for ram stage 3: 0.228953 s Times for ram total: 6.603023 s (qemu) So please review. This patch could be useful for getting the migration stage times. Thanks, Michal Signed-off-by: Michal Novotny mig...@gmail.com --- arch_init.c | 12 +- block-migration.c |5 ++ migration.c | 105 + migration.h |4 ++ monitor.c |8 +++ savevm.c | 50 ++--- sysemu.h |6 +++ vl.c | 123 + 8 files changed, 305 insertions(+), 8 deletions(-) diff --git a/arch_init.c b/arch_init.c index 484b39d..684ae3c 100644 --- a/arch_init.c +++ b/arch_init.c @@ -252,8 +252,12 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque) { ram_addr_t addr; uint64_t bytes_transferred_last; +uint64_t t_start; double bwidth = 0; uint64_t expected_time = 0; +int retval = 0; + +t_start = qemu_get_clock_ns(host_clock); if (stage 0) { cpu_physical_memory_set_dirty_tracking(0); @@ -272,6 +276,10 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque) last_offset = 0; sort_ram_list(); +time_set(ram, 1, 0); +time_set(ram, 2, 0); +time_set(ram, 3, 0); + /* Make sure all dirty bits are set */ QLIST_FOREACH(block, ram_list.blocks, next) { for (addr = block-offset; addr block-offset + block-length; @@ -331,8 +339,10 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque) qemu_put_be64(f, RAM_SAVE_FLAG_EOS); expected_time = ram_save_remaining() * TARGET_PAGE_SIZE / bwidth; +retval = (stage == 2) (expected_time = migrate_max_downtime()); -return (stage == 2) (expected_time = migrate_max_downtime()); +time_add2(ram, stage, 
qemu_get_clock_ns(host_clock), t_start); +return retval; } static inline void *host_from_stream_offset(QEMUFile *f, diff --git a/block-migration.c b/block-migration.c index 0936c7d..b53a1f4 100644 --- a/block-migration.c +++ b/block-migration.c @@ -17,6 +17,8 @@ #include qemu-queue.h #include qemu-timer.h #include monitor.h +#include qemu-timer.h +#include sysemu.h #include block-migration.h #include migration.h #include blockdev.h @@ -556,6 +558,7
Re: [Qemu-devel] [PATCH 1/8] Introduce the VMStatus type
On Tue, 12 Jul 2011 16:51:03 +0200 Kevin Wolf kw...@redhat.com wrote: Am 12.07.2011 16:25, schrieb Luiz Capitulino: On Tue, 12 Jul 2011 09:28:05 +0200 Markus Armbruster arm...@redhat.com wrote: Luiz Capitulino lcapitul...@redhat.com writes: We need to track the VM status so that QMP can report it to clients. This commit adds the VMStatus type and related functions. The vm_status_set() function is used to keep track of the current VM status. The current statuses are: Nitpicking about names, bear with me. - debug: guest is running under gdb - inmigrate: guest is paused waiting for an incoming migration incoming-migration? - postmigrate: guest is paused following a successful migration post-migrate? - internal-error: Fatal internal error that prevents further guest execution - load-state-error: guest is paused following a failed 'loadvm' Less than obvious. If you like concrete, name it loadvm-failed. If you like abstract, name it restore-vm-failed. Ok for your suggestions above. - io-error: the last IOP has failed and the device is configured to pause on I/O errors - watchdog-error: the watchdog action is configured to pause and has been triggered Sounds like the watchdog suffered an error. watchdog-fired? Maybe watchdog-paused. - paused: guest has been paused via the 'stop' command stop-command? I prefer 'paused', it communicates better the state we're in. - prelaunch: QEMU was started with -S and guest has not started unstarted? Looks the same to me. 
- running: guest is actively running - shutdown: guest is shut down (and -no-shutdown is in use) Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- gdbstub.c |4 hw/ide/core.c |1 + hw/scsi-disk.c |1 + hw/virtio-blk.c |1 + hw/watchdog.c |1 + kvm-all.c |1 + migration.c |3 +++ monitor.c |5 - sysemu.h| 19 +++ vl.c| 37 + 10 files changed, 72 insertions(+), 1 deletions(-) diff --git a/gdbstub.c b/gdbstub.c index c085a5a..61b700a 100644 --- a/gdbstub.c +++ b/gdbstub.c @@ -2358,6 +2358,7 @@ void gdb_do_syscall(gdb_syscall_complete_cb cb, const char *fmt, ...) s-state = RS_SYSCALL; #ifndef CONFIG_USER_ONLY vm_stop(VMSTOP_DEBUG); +vm_status_set(VMST_DEBUG); #endif s-state = RS_IDLE; va_start(va, fmt); @@ -2432,6 +2433,7 @@ static void gdb_read_byte(GDBState *s, int ch) /* when the CPU is running, we cannot do anything except stop it when receiving a char */ vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); } else #endif { @@ -2694,6 +2696,7 @@ static void gdb_chr_event(void *opaque, int event) switch (event) { case CHR_EVENT_OPENED: vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); gdb_has_xml = 0; break; default: Previous hunk has VMST_DEBUG with VMST_DEBUG. Odd. 
@@ -2735,6 +2738,7 @@ static void gdb_sigterm_handler(int signal) { if (vm_running) { vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); } } #endif diff --git a/hw/ide/core.c b/hw/ide/core.c index ca17a43..bf9df41 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -523,6 +523,7 @@ static int ide_handle_rw_error(IDEState *s, int error, int op) s-bus-error_status = op; bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { if (op BM_STATUS_DMA_RETRY) { dma_buf_commit(s, 0); diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index a8c7372..66037fd 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -216,6 +216,7 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, int type) bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { if (type == SCSI_REQ_STATUS_RETRY_READ) { scsi_req_data(r-req, 0); diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c index 91e0394..bf70200 100644 --- a/hw/virtio-blk.c +++ b/hw/virtio-blk.c @@ -79,6 +79,7 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error, s-rq = req; bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR); bdrv_mon_event(s-bs, BDRV_ACTION_REPORT, is_read); diff --git a/hw/watchdog.c b/hw/watchdog.c index 1c900a1..d130cbb 100644
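As a reading aid for the status list discussed in this thread, here is a minimal standalone sketch of a status enum with a name table and a setter. It is not the sysemu.h/vl.c code from the patch, which is not fully quoted here. All toy_ identifiers are hypothetical, the strings are the names as originally posted (before the renames suggested in review), and the initial value is an arbitrary choice for the example.

```c
#include <stddef.h>

/* Hypothetical enum covering the statuses listed in the commit message. */
typedef enum {
    TOY_VMST_DEBUG,
    TOY_VMST_INMIGRATE,
    TOY_VMST_POSTMIGRATE,
    TOY_VMST_INTERROR,
    TOY_VMST_LOADERROR,
    TOY_VMST_IOERROR,
    TOY_VMST_WATCHDOG,
    TOY_VMST_PAUSED,
    TOY_VMST_PRELAUNCH,
    TOY_VMST_RUNNING,
    TOY_VMST_SHUTDOWN,
    TOY_VMST_MAX
} toy_vm_status;

/* Strings a management client would see, indexed by status. */
static const char *const toy_vm_status_name[TOY_VMST_MAX] = {
    [TOY_VMST_DEBUG]       = "debug",
    [TOY_VMST_INMIGRATE]   = "inmigrate",
    [TOY_VMST_POSTMIGRATE] = "postmigrate",
    [TOY_VMST_INTERROR]    = "internal-error",
    [TOY_VMST_LOADERROR]   = "load-state-error",
    [TOY_VMST_IOERROR]     = "io-error",
    [TOY_VMST_WATCHDOG]    = "watchdog-error",
    [TOY_VMST_PAUSED]      = "paused",
    [TOY_VMST_PRELAUNCH]   = "prelaunch",
    [TOY_VMST_RUNNING]     = "running",
    [TOY_VMST_SHUTDOWN]    = "shutdown",
};

/* Arbitrary initial value for the sketch. */
static toy_vm_status toy_current_status = TOY_VMST_PRELAUNCH;

/* Record the latest status; the last caller wins. */
static void toy_vm_status_set(toy_vm_status s)
{
    toy_current_status = s;
}

/* Name of the most recently recorded status. */
static const char *toy_vm_status_get_name(void)
{
    return toy_vm_status_name[toy_current_status];
}
```

In the hunks quoted above, each vm_stop() call site is followed by a vm_status_set() call. In this sketch the same pairing would be a call such as toy_vm_status_set(TOY_VMST_IOERROR) right after the stop.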
[Qemu-devel] [PATCHv3] qxl: add QXL_IO_FLUSH_{SURFACES, RELEASE} for guest S3S4 support
Add two new IOs. QXL_IO_FLUSH_SURFACES - equivalent to update area for all surfaces, used to reduce vmexits from NumSurfaces to 1 on guest S3, S4 and resolution change (windows driver implementation is such that this is done on each of those occasions). QXL_IO_FLUSH_RELEASE - used to ensure anything on last_release is put on the release ring for the client to free. Cc: Yonit Halperin yhalp...@redhat.com --- hw/qxl.c | 32 1 files changed, 32 insertions(+), 0 deletions(-) diff --git a/hw/qxl.c b/hw/qxl.c index 1d6acce..a9cc1a3 100644 --- a/hw/qxl.c +++ b/hw/qxl.c @@ -181,6 +181,13 @@ static void qxl_spice_destroy_surface_wait(PCIQXLDevice *qxl, uint32_t id, } } +#if SPICE_INTERFACE_QXL_MINOR = 1 +static void qxl_spice_flush_surfaces_async(PCIQXLDevice *qxl) +{ +qxl-ssd.worker-flush_surfaces_async(qxl-ssd.worker, 0); +} +#endif + void qxl_spice_loadvm_commands(PCIQXLDevice *qxl, struct QXLCommandExt *ext, uint32_t count) { @@ -1195,6 +1202,7 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val) case QXL_IO_DESTROY_PRIMARY_ASYNC: case QXL_IO_DESTROY_SURFACE_ASYNC: case QXL_IO_DESTROY_ALL_SURFACES_ASYNC: +case QXL_IO_FLUSH_SURFACES_ASYNC: #if SPICE_INTERFACE_QXL_MINOR 1 fprintf(stderr, qxl: error: async not supported by libspice but guest driver used it\n); return; @@ -1322,6 +1330,30 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val) } qxl_spice_destroy_surface_wait(d, val, async); break; +case QXL_IO_FLUSH_RELEASE: { +QXLReleaseRing *ring = d-ram-release_ring; +if (ring-prod - ring-cons + 1 == ring-num_items) { +fprintf(stderr, +ERROR: no flush, full release ring [p%d,%dc]\n, +ring-prod, ring-cons); +} +qxl_push_free_res(d, 1 /* flush */); +dprint(d, 1, QXL_IO_FLUSH_RELEASE exit (%s, s#=%d, res#=%d,%p)\n, +qxl_mode_to_string(d-mode), d-guest_surfaces.count, +d-num_free_res, d-last_release); +break; +} +case QXL_IO_FLUSH_SURFACES_ASYNC: +#if SPICE_INTERFACE_QXL_MINOR = 1 +dprint(d, 1, QXL_IO_FLUSH_SURFACES_ASYNC (%d) (%s, s#=%d, 
res#=%d)\n, + val, qxl_mode_to_string(d-mode), d-guest_surfaces.count, + d-num_free_res); +qxl_spice_flush_surfaces_async(d); +#else +dprint(d, 1, QXL_IO_FLUSH_SURFACES_ASYNC (%d) ignored, too old spice server\n, + val); +#endif +break; case QXL_IO_DESTROY_ALL_SURFACES_ASYNC: case QXL_IO_DESTROY_ALL_SURFACES: d-mode = QXL_MODE_UNDEFINED; -- 1.7.6
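The QXL_IO_FLUSH_RELEASE handler above prints its "full release ring" error when prod - cons + 1 equals num_items. A small sketch of that comparison on a toy ring is below. The struct is hypothetical, not QXLReleaseRing, and it assumes free-running 32-bit producer/consumer counters, so that the unsigned subtraction stays correct across wraparound.

```c
#include <stdint.h>

/* Hypothetical ring header; field names follow the patch, types are assumed. */
struct toy_ring {
    uint32_t prod;       /* count of items ever produced (free-running) */
    uint32_t cons;       /* count of items ever consumed (free-running) */
    uint32_t num_items;  /* ring capacity */
};

/* The comparison used in the patch: non-zero when prod - cons + 1 equals
 * num_items. The cast keeps the arithmetic modulo 2^32. */
static int toy_ring_flagged_full(const struct toy_ring *r)
{
    return (uint32_t)(r->prod - r->cons + 1) == r->num_items;
}
```

Because the counters are unsigned, a producer value that has wrapped past zero still gives the right distance from the consumer.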
Re: [Qemu-devel] [PATCH 3/8] block: Support to keep track of I/O status
On Tue, 12 Jul 2011 11:12:04 +0200 Markus Armbruster arm...@redhat.com wrote: Kevin Wolf kw...@redhat.com writes: Am 12.07.2011 09:45, schrieb Markus Armbruster: Luiz Capitulino lcapitul...@redhat.com writes: This commit adds support to the BlockDriverState type to keep track of the last I/O status. That is, at every I/O operation we update a status field in the BlockDriverState instance. Valid statuses are: OK, FAILED and ENOSPC. ENOSPC is distinguished from FAILED because an management application can use it to implement thin-provisioning. This feature has to be explicit enabled by buses/devices supporting it. buses? I think I should have called it 'interface'. Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- block.c | 18 ++ block.h |7 +++ block_int.h |2 ++ 3 files changed, 27 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 24a25d5..cc0a34e 100644 --- a/block.c +++ b/block.c @@ -195,6 +195,7 @@ BlockDriverState *bdrv_new(const char *device_name) if (device_name[0] != '\0') { QTAILQ_INSERT_TAIL(bdrv_states, bs, list); } +bs-iostatus_enabled = false; return bs; } @@ -2876,6 +2877,23 @@ int bdrv_in_use(BlockDriverState *bs) return bs-in_use; } +void bdrv_enable_iostatus(BlockDriverState *bs) +{ +bs-iostatus_enabled = true; +} + +void bdrv_iostatus_update(BlockDriverState *bs, int error) +{ +error = abs(error); + +if (!error) { +bs-iostatus = BDRV_IOS_OK; +} else { +bs-iostatus = (error == ENOSPC) ? 
BDRV_IOS_ENOSPC : + BDRV_IOS_FAILED; +} +} + int bdrv_img_create(const char *filename, const char *fmt, const char *base_filename, const char *base_fmt, char *options, uint64_t img_size, int flags) diff --git a/block.h b/block.h index 859d1d9..0dca1bb 100644 --- a/block.h +++ b/block.h @@ -50,6 +50,13 @@ typedef enum { BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP } BlockMonEventAction; +typedef enum { +BDRV_IOS_OK, BDRV_IOS_FAILED, BDRV_IOS_ENOSPC +} BlockIOStatus; + +void bdrv_iostatus_update(BlockDriverState *bs, int error); +void bdrv_enable_iostatus(BlockDriverState *bs); +void bdrv_enable_io_status(BlockDriverState *bs); void bdrv_mon_event(const BlockDriverState *bdrv, BlockMonEventAction action, int is_read); void bdrv_info_print(Monitor *mon, const QObject *data); diff --git a/block_int.h b/block_int.h index 1e265d2..09f038d 100644 --- a/block_int.h +++ b/block_int.h @@ -195,6 +195,8 @@ struct BlockDriverState { drivers. They are not used by the block driver */ int cyls, heads, secs, translation; BlockErrorAction on_read_error, on_write_error; +bool iostatus_enabled; +BlockIOStatus iostatus; char device_name[32]; unsigned long *dirty_bitmap; int64_t dirty_count; Okay, let's see what we got here. The block layer merely holds I/O status, device models set it. Device I/O status is not migrated. Why? Bug. :) bdrv_new() creates the BDS with I/O status tracking disabled. Devices that do tracking enable it in their qdev init method. If a device gets hot unplugged, tracking remains enabled. If the BDS then gets reused with a device that doesn't do tracking, I/O status becomes incorrect. Can't happen right now, because we automatically delete the BDS on hot unplug, but it's a trap. Suggest to disable tracking in bdrv_detach(). Actually, this is a symptom of the midlayer disease. I suspect things would be simpler if we hold the status in its rightful owner, the device model. Need a getter for it. 
I'm working on a patch series that moves misplaced state out of the block layer into device models and block drivers, and a I/O status getter will fit in easily there. Excellent. This is host state, so the device is not the rightful owner. Devices should not even be involved with enabling it. They are because they do the tracking, and thus the tracking only works for device models that do it. Could it be done entirely within the block layer?
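Editorial sketch of the "disable tracking on detach" suggestion made in this message. It is a minimal illustration, not the patch itself: BlockDriverState is cut down to the two new fields, bdrv_disable_iostatus() is a hypothetical helper that a detach path could call, and the early return in bdrv_iostatus_update() when tracking is disabled is an assumption (in the quoted patch the device models decide when to call it).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-ins for the types touched by the patch. */
typedef enum {
    BDRV_IOS_OK, BDRV_IOS_FAILED, BDRV_IOS_ENOSPC
} BlockIOStatus;

typedef struct BlockDriverState {
    bool iostatus_enabled;
    BlockIOStatus iostatus;
} BlockDriverState;

/* Called by a device model that implements I/O status tracking. */
void bdrv_enable_iostatus(BlockDriverState *bs)
{
    bs->iostatus_enabled = true;
    bs->iostatus = BDRV_IOS_OK;
}

/* Hypothetical: what a detach path could call so that a reused BDS
 * does not carry tracking state from the previous device. */
void bdrv_disable_iostatus(BlockDriverState *bs)
{
    bs->iostatus_enabled = false;
    bs->iostatus = BDRV_IOS_OK;
}

/* error is 0 on success, or an errno value (possibly negated). */
void bdrv_iostatus_update(BlockDriverState *bs, int error)
{
    if (!bs->iostatus_enabled) {
        return; /* assumption: ignore updates while tracking is off */
    }
    error = abs(error);
    if (!error) {
        bs->iostatus = BDRV_IOS_OK;
    } else {
        bs->iostatus = (error == ENOSPC) ? BDRV_IOS_ENOSPC
                                         : BDRV_IOS_FAILED;
    }
}
```

With this shape, a BDS that is handed to a device without tracking support reports nothing rather than a stale status.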
Re: [Qemu-devel] [PATCH 1/8] Introduce the VMStatus type
On Tue, 12 Jul 2011 09:28:05 +0200 Markus Armbruster arm...@redhat.com wrote: Luiz Capitulino lcapitul...@redhat.com writes: We need to track the VM status so that QMP can report it to clients. This commit adds the VMStatus type and related functions. The vm_status_set() function is used to keep track of the current VM status. The current statuses are: Nitpicking about names, bear with me. - debug: guest is running under gdb - inmigrate: guest is paused waiting for an incoming migration incoming-migration? - postmigrate: guest is paused following a successful migration post-migrate? - internal-error: Fatal internal error that prevents further guest execution - load-state-error: guest is paused following a failed 'loadvm' Less than obvious. If you like concrete, name it loadvm-failed. If you like abstract, name it restore-vm-failed. Ok for your suggestions above. - io-error: the last IOP has failed and the device is configured to pause on I/O errors - watchdog-error: the watchdog action is configured to pause and has been triggered Sounds like the watchdog suffered an error. watchdog-fired? Maybe watchdog-paused. - paused: guest has been paused via the 'stop' command stop-command? I prefer 'paused', it communicates better the state we're in. - prelaunch: QEMU was started with -S and guest has not started unstarted? Looks the same to me. - running: guest is actively running - shutdown: guest is shut down (and -no-shutdown is in use) Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- gdbstub.c |4 hw/ide/core.c |1 + hw/scsi-disk.c |1 + hw/virtio-blk.c |1 + hw/watchdog.c |1 + kvm-all.c |1 + migration.c |3 +++ monitor.c |5 - sysemu.h| 19 +++ vl.c| 37 + 10 files changed, 72 insertions(+), 1 deletions(-) diff --git a/gdbstub.c b/gdbstub.c index c085a5a..61b700a 100644 --- a/gdbstub.c +++ b/gdbstub.c @@ -2358,6 +2358,7 @@ void gdb_do_syscall(gdb_syscall_complete_cb cb, const char *fmt, ...) 
s-state = RS_SYSCALL; #ifndef CONFIG_USER_ONLY vm_stop(VMSTOP_DEBUG); +vm_status_set(VMST_DEBUG); #endif s-state = RS_IDLE; va_start(va, fmt); @@ -2432,6 +2433,7 @@ static void gdb_read_byte(GDBState *s, int ch) /* when the CPU is running, we cannot do anything except stop it when receiving a char */ vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); } else #endif { @@ -2694,6 +2696,7 @@ static void gdb_chr_event(void *opaque, int event) switch (event) { case CHR_EVENT_OPENED: vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); gdb_has_xml = 0; break; default: Previous hunk has VMST_DEBUG with VMST_DEBUG. Odd. @@ -2735,6 +2738,7 @@ static void gdb_sigterm_handler(int signal) { if (vm_running) { vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); } } #endif diff --git a/hw/ide/core.c b/hw/ide/core.c index ca17a43..bf9df41 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -523,6 +523,7 @@ static int ide_handle_rw_error(IDEState *s, int error, int op) s-bus-error_status = op; bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { if (op BM_STATUS_DMA_RETRY) { dma_buf_commit(s, 0); diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index a8c7372..66037fd 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -216,6 +216,7 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, int type) bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { if (type == SCSI_REQ_STATUS_RETRY_READ) { scsi_req_data(r-req, 0); diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c index 91e0394..bf70200 100644 --- a/hw/virtio-blk.c +++ b/hw/virtio-blk.c @@ -79,6 +79,7 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error, s-rq = req; bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR); bdrv_mon_event(s-bs, BDRV_ACTION_REPORT, is_read); diff --git a/hw/watchdog.c 
b/hw/watchdog.c index 1c900a1..d130cbb 100644 --- a/hw/watchdog.c +++ b/hw/watchdog.c @@ -133,6 +133,7 @@ void watchdog_perform_action(void) case WDT_PAUSE: /* same as 'stop'
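Editorial sketch of how a status type like the one under discussion could map to the names QMP would report. Only VMST_DEBUG and VMST_IOERROR appear in the quoted hunks; every other enumerator, the vm_status_name() helper, and the initial "prelaunch" value are assumptions made for illustration. The strings are the ones from the original commit message, before the renames proposed in review.

```c
#include <stddef.h>

typedef enum {
    VMST_DEBUG = 0,      /* appears in the quoted patch */
    VMST_INMIGRATE,      /* hypothetical enumerator name */
    VMST_POSTMIGRATE,    /* hypothetical */
    VMST_INTERROR,       /* hypothetical */
    VMST_LOADERROR,      /* hypothetical */
    VMST_IOERROR,        /* appears in the quoted patch */
    VMST_WATCHDOG,       /* hypothetical */
    VMST_PAUSED,         /* hypothetical */
    VMST_PRELAUNCH,      /* hypothetical */
    VMST_RUNNING,        /* hypothetical */
    VMST_SHUTDOWN,       /* hypothetical */
    VMST_MAX
} VMStatus;

/* Status names as listed in the commit message. */
static const char *const vm_status_names[VMST_MAX] = {
    [VMST_DEBUG]       = "debug",
    [VMST_INMIGRATE]   = "inmigrate",
    [VMST_POSTMIGRATE] = "postmigrate",
    [VMST_INTERROR]    = "internal-error",
    [VMST_LOADERROR]   = "load-state-error",
    [VMST_IOERROR]     = "io-error",
    [VMST_WATCHDOG]    = "watchdog-error",
    [VMST_PAUSED]      = "paused",
    [VMST_PRELAUNCH]   = "prelaunch",
    [VMST_RUNNING]     = "running",
    [VMST_SHUTDOWN]    = "shutdown",
};

/* Assumption: the VM starts out in the prelaunch state. */
static VMStatus current_vm_status = VMST_PRELAUNCH;

/* Record the new status; out-of-range values are ignored. */
void vm_status_set(VMStatus s)
{
    if (s >= 0 && s < VMST_MAX) {
        current_vm_status = s;
    }
}

/* Hypothetical getter a QMP query command could use. */
const char *vm_status_name(void)
{
    return vm_status_names[current_vm_status];
}
```

Keeping the names in one table means a rename agreed on in review touches a single line.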
Re: [Qemu-devel] [PATCH v5 05/18] qapi: add QMP input visitor
On 07/12/2011 08:53 AM, Luiz Capitulino wrote: On Tue, 12 Jul 2011 08:46:13 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: On 07/12/2011 08:16 AM, Luiz Capitulino wrote: On Mon, 11 Jul 2011 19:05:58 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: On 07/07/2011 09:32 AM, Luiz Capitulino wrote: On Tue, 5 Jul 2011 08:02:32 -0500 Michael Rothmdr...@linux.vnet.ibm.comwrote: A type of Visiter class that is used to walk a qobject's structure and assign each entry to the corresponding native C type. Command marshaling function will use this to pull out QMP command parameters recieved over the wire and pass them as native arguments to the corresponding C functions. Signed-off-by: Michael Rothmdr...@linux.vnet.ibm.com --- Makefile.objs|2 +- qapi/qmp-input-visitor.c | 264 ++ qapi/qmp-input-visitor.h | 27 + qerror.h |3 + 4 files changed, 295 insertions(+), 1 deletions(-) create mode 100644 qapi/qmp-input-visitor.c create mode 100644 qapi/qmp-input-visitor.h diff --git a/Makefile.objs b/Makefile.objs index 0077014..997ecef 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -375,7 +375,7 @@ libcacard-y = cac.o event.o vcard.o vreader.o vcard_emul_nss.o vcard_emul_type.o ## # qapi -qapi-nested-y = qapi-visit-core.o +qapi-nested-y = qapi-visit-core.o qmp-input-visitor.o qapi-obj-y = $(addprefix qapi/, $(qapi-nested-y)) vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS) diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c new file mode 100644 index 000..80912bb --- /dev/null +++ b/qapi/qmp-input-visitor.c @@ -0,0 +1,264 @@ +/* + * Input Visitor + * + * Copyright IBM, Corp. 2011 + * + * Authors: + * Anthony Liguorialigu...@us.ibm.com + * + * This work is licensed under the terms of the GNU LGPL, version 2.1 or later. + * See the COPYING.LIB file in the top-level directory. 
+ * + */ + +#include qmp-input-visitor.h +#include qemu-queue.h +#include qemu-common.h +#include qemu-objects.h +#include qerror.h + +#define QIV_STACK_SIZE 1024 + +typedef struct StackObject +{ +const QObject *obj; +const QListEntry *entry; +} StackObject; + +struct QmpInputVisitor +{ +Visitor visitor; +const QObject *obj; +StackObject stack[QIV_STACK_SIZE]; +int nb_stack; +}; + +static QmpInputVisitor *to_qiv(Visitor *v) +{ +return container_of(v, QmpInputVisitor, visitor); +} + +static const QObject *qmp_input_get_object(QmpInputVisitor *qiv, const char *name) +{ +const QObject *qobj; + +if (qiv-nb_stack == 0) { +qobj = qiv-obj; +} else { +qobj = qiv-stack[qiv-nb_stack - 1].obj; +} + +if (nameqobject_type(qobj) == QTYPE_QDICT) { +return qdict_get(qobject_to_qdict(qobj), name); +} else if (qiv-nb_stack0qobject_type(qobj) == QTYPE_QLIST) { +return qlist_entry_obj(qiv-stack[qiv-nb_stack - 1].entry); +} + +return qobj; +} + +static void qmp_input_push(QmpInputVisitor *qiv, const QObject *obj, Error **errp) +{ +qiv-stack[qiv-nb_stack].obj = obj; +if (qobject_type(obj) == QTYPE_QLIST) { +qiv-stack[qiv-nb_stack].entry = qlist_first(qobject_to_qlist(obj)); +} +qiv-nb_stack++; + +if (qiv-nb_stack= QIV_STACK_SIZE) { +error_set(errp, QERR_BUFFER_OVERRUN); +return; +} +} + +static void qmp_input_pop(QmpInputVisitor *qiv, Error **errp) +{ +qiv-nb_stack--; +if (qiv-nb_stack0) { +error_set(errp, QERR_BUFFER_OVERRUN); +return; +} +} + +static void qmp_input_start_struct(Visitor *v, void **obj, const char *kind, const char *name, size_t size, Error **errp) +{ +QmpInputVisitor *qiv = to_qiv(v); +const QObject *qobj = qmp_input_get_object(qiv, name); + +if (!qobj || qobject_type(qobj) != QTYPE_QDICT) { +error_set(errp, QERR_INVALID_PARAMETER_TYPE, name ? 
name : null, QDict); +return; +} + +qmp_input_push(qiv, qobj, errp); +if (error_is_set(errp)) { +return; +} + +if (obj) { +*obj = qemu_mallocz(size); +} +} + +static void qmp_input_end_struct(Visitor *v, Error **errp) +{ +QmpInputVisitor *qiv = to_qiv(v); + +qmp_input_pop(qiv, errp); +} + +static void qmp_input_start_list(Visitor *v, const char *name, Error **errp) +{ +QmpInputVisitor *qiv = to_qiv(v); +const QObject *qobj = qmp_input_get_object(qiv, name); + +if (!qobj || qobject_type(qobj) != QTYPE_QLIST) { +error_set(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : null, list); +return; +} + +qmp_input_push(qiv, qobj, errp); +} + +static GenericList *qmp_input_next_list(Visitor *v, GenericList **list, Error **errp) +{ +QmpInputVisitor *qiv = to_qiv(v); +GenericList *entry; +StackObject *so =qiv-stack[qiv-nb_stack - 1]; + +if (so-entry == NULL) { +return NULL; +} + +
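Editorial sketch of a bounded visitor stack in the spirit of qmp_input_push()/qmp_input_pop() quoted above. The comparison operators in the quoted patch were lost in the archive, so this is not a reconstruction of it. It is one possible variant that checks the bound before touching the stack, so a failed push or pop leaves the state unchanged. VisitorStack, stack_push() and stack_pop() are hypothetical names, and a void pointer stands in for const QObject *.

```c
#include <stdbool.h>
#include <stddef.h>

#define QIV_STACK_SIZE 1024

typedef struct StackObject {
    const void *obj;  /* stands in for const QObject * */
} StackObject;

typedef struct VisitorStack {
    StackObject stack[QIV_STACK_SIZE];
    int nb_stack;     /* number of entries currently in use */
} VisitorStack;

/* Push obj; returns false and leaves the stack untouched when full. */
bool stack_push(VisitorStack *s, const void *obj)
{
    if (s->nb_stack >= QIV_STACK_SIZE) {
        return false;
    }
    s->stack[s->nb_stack].obj = obj;
    s->nb_stack++;
    return true;
}

/* Pop the top entry; returns false when the stack is already empty. */
bool stack_pop(VisitorStack *s)
{
    if (s->nb_stack <= 0) {
        return false;
    }
    s->nb_stack--;
    return true;
}
```

A caller would translate a false return into the error it wants to report (the quoted code uses QERR_BUFFER_OVERRUN for both cases).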
Re: [Qemu-devel] [PATCH 1/8] Introduce the VMStatus type
Am 12.07.2011 16:25, schrieb Luiz Capitulino: On Tue, 12 Jul 2011 09:28:05 +0200 Markus Armbruster arm...@redhat.com wrote: Luiz Capitulino lcapitul...@redhat.com writes: We need to track the VM status so that QMP can report it to clients. This commit adds the VMStatus type and related functions. The vm_status_set() function is used to keep track of the current VM status. The current statuses are: Nitpicking about names, bear with me. - debug: guest is running under gdb - inmigrate: guest is paused waiting for an incoming migration incoming-migration? - postmigrate: guest is paused following a successful migration post-migrate? - internal-error: Fatal internal error that prevents further guest execution - load-state-error: guest is paused following a failed 'loadvm' Less than obvious. If you like concrete, name it loadvm-failed. If you like abstract, name it restore-vm-failed. Ok for your suggestions above. - io-error: the last IOP has failed and the device is configured to pause on I/O errors - watchdog-error: the watchdog action is configured to pause and has been triggered Sounds like the watchdog suffered an error. watchdog-fired? Maybe watchdog-paused. - paused: guest has been paused via the 'stop' command stop-command? I prefer 'paused', it communicates better the state we're in. - prelaunch: QEMU was started with -S and guest has not started unstarted? Looks the same to me. - running: guest is actively running - shutdown: guest is shut down (and -no-shutdown is in use) Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- gdbstub.c |4 hw/ide/core.c |1 + hw/scsi-disk.c |1 + hw/virtio-blk.c |1 + hw/watchdog.c |1 + kvm-all.c |1 + migration.c |3 +++ monitor.c |5 - sysemu.h| 19 +++ vl.c| 37 + 10 files changed, 72 insertions(+), 1 deletions(-) diff --git a/gdbstub.c b/gdbstub.c index c085a5a..61b700a 100644 --- a/gdbstub.c +++ b/gdbstub.c @@ -2358,6 +2358,7 @@ void gdb_do_syscall(gdb_syscall_complete_cb cb, const char *fmt, ...) 
s-state = RS_SYSCALL; #ifndef CONFIG_USER_ONLY vm_stop(VMSTOP_DEBUG); +vm_status_set(VMST_DEBUG); #endif s-state = RS_IDLE; va_start(va, fmt); @@ -2432,6 +2433,7 @@ static void gdb_read_byte(GDBState *s, int ch) /* when the CPU is running, we cannot do anything except stop it when receiving a char */ vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); } else #endif { @@ -2694,6 +2696,7 @@ static void gdb_chr_event(void *opaque, int event) switch (event) { case CHR_EVENT_OPENED: vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); gdb_has_xml = 0; break; default: Previous hunk has VMST_DEBUG with VMST_DEBUG. Odd. @@ -2735,6 +2738,7 @@ static void gdb_sigterm_handler(int signal) { if (vm_running) { vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); } } #endif diff --git a/hw/ide/core.c b/hw/ide/core.c index ca17a43..bf9df41 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -523,6 +523,7 @@ static int ide_handle_rw_error(IDEState *s, int error, int op) s-bus-error_status = op; bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { if (op BM_STATUS_DMA_RETRY) { dma_buf_commit(s, 0); diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index a8c7372..66037fd 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -216,6 +216,7 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, int type) bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { if (type == SCSI_REQ_STATUS_RETRY_READ) { scsi_req_data(r-req, 0); diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c index 91e0394..bf70200 100644 --- a/hw/virtio-blk.c +++ b/hw/virtio-blk.c @@ -79,6 +79,7 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error, s-rq = req; bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR); bdrv_mon_event(s-bs, BDRV_ACTION_REPORT, is_read); diff --git a/hw/watchdog.c 
b/hw/watchdog.c index 1c900a1..d130cbb 100644 --- a/hw/watchdog.c +++ b/hw/watchdog.c @@ -133,6 +133,7 @@ void watchdog_perform_action(void) case WDT_PAUSE: /* same as 'stop' command in monitor */ watchdog_mon_event(pause);
Re: [Qemu-devel] [PULL] usb patch queue
On 07/08/11 11:50, Gerd Hoffmann wrote:
> Hi,
>
> Here is the current usb patch queue. Most noteworthy is the usb
> companion controller support added. There are also a bunch of bug
> fixes, some from Hans which he found while doing the companion
> controller work and some have been found in patch review.
>
> please pull,
> Gerd
>
> The following changes since commit 9312805d33e8b106bae356d13a8071fb37d75554:
>
>   pxa2xx_lcd: add proper rotation support (2011-07-04 22:12:21 +0200)
>
> are available in the git repository at:
>
>   git://git.kraxel.org/qemu usb.19

ping?

cheers,
Gerd
Re: [Qemu-devel] [PATCH v6 4/4] guest agent: add guest agent RPCs/commands
On Mon, 11 Jul 2011 18:11:21 -0500 Michael Roth mdr...@linux.vnet.ibm.com wrote: On 07/11/2011 04:12 PM, Luiz Capitulino wrote: On Mon, 11 Jul 2011 15:11:26 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: On 07/08/2011 10:14 AM, Luiz Capitulino wrote: On Tue, 5 Jul 2011 08:21:40 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: This adds the initial set of QMP/QAPI commands provided by the guest agent: guest-sync guest-ping guest-info guest-shutdown guest-file-open guest-file-read guest-file-write guest-file-seek guest-file-close guest-fsfreeze-freeze guest-fsfreeze-thaw guest-fsfreeze-status The input/output specification for these commands are documented in the schema. Example usage: host: qemu -device virtio-serial \ -chardev socket,path=/tmp/vs0.sock,server,nowait,id=qga0 \ -device virtserialport,chardev=qga0,name=qga0 ... echo {'execute':'guest-info'} | socat stdio \ unix-connect:/tmp/qga0.sock guest: qemu-ga -c virtio-serial -p /dev/virtio-ports/qga0 \ -p /var/run/qemu-guest-agent.pid -d Signed-off-by: Michael Rothmdr...@linux.vnet.ibm.com --- Makefile | 15 ++- qemu-ga.c |4 + qerror.h |3 + qga/guest-agent-commands.c | 501 qga/guest-agent-core.h |2 + 5 files changed, 523 insertions(+), 2 deletions(-) create mode 100644 qga/guest-agent-commands.c diff --git a/Makefile b/Makefile index b2e8593..7e4f722 100644 --- a/Makefile +++ b/Makefile @@ -175,15 +175,26 @@ $(qapi-dir)/test-qmp-commands.h: $(qapi-dir)/test-qmp-marshal.c $(qapi-dir)/test-qmp-marshal.c: $(SRC_PATH)/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-commands.py $(call quiet-command,python $(SRC_PATH)/scripts/qapi-commands.py -o $(qapi-dir) -p test- $, GEN $@) +$(qapi-dir)/qga-qapi-types.c: $(qapi-dir)/qga-qapi-types.h +$(qapi-dir)/qga-qapi-types.h: $(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-types.py +$(call quiet-command,python $(SRC_PATH)/scripts/qapi-types.py -o $(qapi-dir) -p qga- $, GEN $@) +$(qapi-dir)/qga-qapi-visit.c: $(qapi-dir)/qga-qapi-visit.h 
+$(qapi-dir)/qga-qapi-visit.h: $(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-visit.py +$(call quiet-command,python $(SRC_PATH)/scripts/qapi-visit.py -o $(qapi-dir) -p qga- $, GEN $@) +$(qapi-dir)/qga-qmp-marshal.c: $(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-commands.py +$(call quiet-command,python $(SRC_PATH)/scripts/qapi-commands.py -o $(qapi-dir) -p qga- $, GEN $@) + test-visitor.o: $(addprefix $(qapi-dir)/, test-qapi-types.c test-qapi-types.h test-qapi-visit.c test-qapi-visit.h) test-visitor: test-visitor.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o $(qapi-obj-y) error.o osdep.o qemu-malloc.o $(oslib-obj-y) qjson.o json-streamer.o json-lexer.o json-parser.o qerror.o qemu-error.o qemu-tool.o $(qapi-dir)/test-qapi-visit.o $(qapi-dir)/test-qapi-types.o test-qmp-commands.o: $(addprefix $(qapi-dir)/, test-qapi-types.c test-qapi-types.h test-qapi-visit.c test-qapi-visit.h test-qmp-marshal.c test-qmp-commands.h) test-qmp-commands: test-qmp-commands.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o $(qapi-obj-y) error.o osdep.o qemu-malloc.o $(oslib-obj-y) qjson.o json-streamer.o json-lexer.o json-parser.o qerror.o qemu-error.o qemu-tool.o $(qapi-dir)/test-qapi-visit.o $(qapi-dir)/test-qapi-types.o $(qapi-dir)/test-qmp-marshal.o module.o -QGALIB=qga/guest-agent-command-state.o +QGALIB=qga/guest-agent-command-state.o qga/guest-agent-commands.o + +qemu-ga.o: $(qapi-dir)/qga-qapi-types.c $(qapi-dir)/qga-qapi-types.h $(qapi-dir)/qga-qapi-visit.c $(qapi-dir)/qga-qmp-marshal.c -qemu-ga$(EXESUF): qemu-ga.o $(QGALIB) qemu-tool.o qemu-error.o error.o $(oslib-obj-y) $(trace-obj-y) $(block-obj-y) $(qobject-obj-y) $(version-obj-y) $(qapi-obj-y) qemu-timer-common.o qemu-sockets.o module.o qapi/qmp-dispatch.o qapi/qmp-registry.o +qemu-ga$(EXESUF): qemu-ga.o $(QGALIB) qemu-tool.o qemu-error.o error.o $(oslib-obj-y) $(trace-obj-y) $(block-obj-y) $(qobject-obj-y) $(version-obj-y) $(qapi-obj-y) qemu-timer-common.o qemu-sockets.o module.o 
qapi/qmp-dispatch.o qapi/qmp-registry.o $(qapi-dir)/qga-qapi-visit.o $(qapi-dir)/qga-qmp-marshal.o QEMULIBS=libhw32 libhw64 libuser libdis libdis-user diff --git a/qemu-ga.c b/qemu-ga.c index 649c16a..04ead22 100644 --- a/qemu-ga.c +++ b/qemu-ga.c @@ -637,6 +637,9 @@ int main(int argc, char **argv) g_log_set_default_handler(ga_log, s); g_log_set_fatal_mask(NULL, G_LOG_LEVEL_ERROR); s-logging_enabled = true; +
Re: [Qemu-devel] [PULL] spice patch queue
On 07/04/11 17:14, Gerd Hoffmann wrote:
> Hi,
>
> Here is the spice patch queue with a bunch of small fixes and
> improvements collected over time. No major changes.
>
> please pull,
> Gerd
>
> The following changes since commit 75ef849696830fc2ddeff8bb90eea5887ff50df6:
>
>   esp: correctly fill bus id with requested lun (2011-07-02 18:50:19 +)
>
> are available in the git repository at:
>
>   git://anongit.freedesktop.org/spice/qemu spice.v38

Ping?

cheers,
Gerd
[Qemu-devel] [Bug 807893] Re: [PATCH] os-posix: set groups properly for -runas
On Sat, Jul 9, 2011 at 10:22 AM, Stefan Hajnoczi stefa...@linux.vnet.ibm.com wrote:
> Andrew Griffiths reports that -runas does not set supplementary group
> IDs. This means that gid 0 (root) is not dropped when switching to an
> unprivileged user. Add an initgroups(3) call to use the -runas user's
> /etc/group membership to update the supplementary group IDs.
>
> Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
> ---
> Note this needs compile testing on various POSIX host platforms.
> Tested on Linux. Should work on BSD and Solaris. initgroups(3) is
> SVr4/BSD but not in POSIX.
>
>  os-posix.c | 6 ++
>  1 files changed, 6 insertions(+), 0 deletions(-)

Are you happy with this patch? Bumping because security-related.

Regarding portability, Linux, BSD, Solaris, and Mac OS X all provide initgroups(3). I think we're good.

Stefan

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/807893

Title: qemu privilege escalation
Status in QEMU: Confirmed

Bug description:
If qemu is started as root with -runas, the extra groups are not dropped correctly:

/proc/`pidof qemu`/status
..
Uid: 100 100 100 100
Gid: 100 100 100 100
FDSize: 32
Groups: 0 1 2 3 4 6 10 11 26 27
...

The fix is to add initgroups() or setgroups(1, [gid]) where appropriate to os-posix.c. The extra gids allow read or write access to other files (such as /dev etc).

Emulating the qemu code:

# python
...
import os
os.setgid(100)
os.setuid(100)
os.execve("/bin/sh", ["/bin/sh"], os.environ)
sh-4.1$ xxd /dev/sda | head -n2
000: eb48 9000 .H..
010:
sh-4.1$ ls -l /dev/sda
brw-rw 1 root disk 8, 0 Jul 8 11:54 /dev/sda
sh-4.1$ id
uid=100(qemu00) gid=100(users) groups=100(users),0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),26(tape),27(video)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/807893/+subscriptions
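Editorial sketch of the ordering the fix in this message relies on: reset the supplementary groups with initgroups(3) while still root, then drop the gid, then the uid. This is not the six-line os-posix.c patch, which is not shown in the mail. drop_privileges() and supplementary_group_count() are hypothetical helper names, and the final setuid(0) check is an added sanity test, not something the mail mentions.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes initgroups() on glibc */
#endif
#include <grp.h>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

/* Drop to the user described by pw. Returns 0 on success, -1 on error.
 * Order matters: once the uid is dropped, the process no longer has
 * the privilege needed to change its groups. */
int drop_privileges(const struct passwd *pw)
{
    /* Replace root's supplementary groups with the target user's. */
    if (initgroups(pw->pw_name, pw->pw_gid) < 0) {
        return -1;
    }
    if (setgid(pw->pw_gid) < 0) {
        return -1;
    }
    if (setuid(pw->pw_uid) < 0) {
        return -1;
    }
    /* Sanity check: regaining root must not be possible any more. */
    if (pw->pw_uid != 0 && setuid(0) == 0) {
        return -1;
    }
    return 0;
}

/* Number of supplementary groups of the calling process, or -1. */
int supplementary_group_count(void)
{
    return getgroups(0, NULL);
}
```

A caller would look the user up with getpwnam(3) and pass the result in. Without the initgroups() step, the process keeps root's group list, which is exactly the "Groups: 0 1 2 3 ..." line in the bug description.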
Re: [Qemu-devel] [PATCH 3/8] block: Support to keep track of I/O status
On Tue, 12 Jul 2011 16:25:22 +0200 Kevin Wolf kw...@redhat.com wrote:

> On 05.07.2011 20:17, Luiz Capitulino wrote:
> > This commit adds support to the BlockDriverState type to keep track of
> > the last I/O status. That is, at every I/O operation we update a status
> > field in the BlockDriverState instance. Valid statuses are: OK, FAILED
> > and ENOSPC.
> >
> > ENOSPC is distinguished from FAILED because a management application
> > can use it to implement thin-provisioning.
> >
> > This feature has to be explicitly enabled by buses/devices supporting it.
> >
> > Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
>
> I'm not sure how this is meant to work with devices that can have
> multiple requests in flight. If a request fails, one of the things that
> are done before sending a monitor event is qemu_aio_flush(), i.e. waiting
> for all in-flight requests to complete. If the last one of them is
> successful, your status will report BDRV_IOS_OK.

We're more interested in states that the device cannot recover from or that are not temporary. So, if something really bad happens I'd expect all in-flight requests to fail the same way. Am I wrong?

> If you don't stop the VM on I/O errors, the status is useless anyway,
> even if only one request is active at the same point.

Right, that's a good point. A management application can only trust that the status won't change in the next second if the VM is stopped.

> I think it would make more sense if we only stored the last error (that
> is, don't clear the field on success). What is the use case, would this
> be enough for it?

Yes, it would, but there's a problem. If the management application manages to correct the error and put the VM to run again, we need to clear the status, otherwise the management application could get confused if the status is read at a later time.

The most effective way I found to do this was to let the device report its own current status. But I see two other ways of doing this:

1. We could only report the status if the VM is paused. This doesn't change the implementation much, though.
2. We could allow the management application to clear the status.

> By the way, I'm not sure how it fits in, but I'd like to have a block
> layer function that format drivers can use to tell qemu that the image
> is corrupted. Maybe that's another case in which we should stop the VM
> and have an appropriate status for it. It should probably have
> precedence over an ENOSPC happening at the same time, so maybe we'll
> also need a way to tell that some status is more important and may
> overwrite a less important status, but not the other way round.

Yes, seems to make sense.
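Editorial sketch of the two ideas raised in this exchange: keep the last error instead of clearing it on success, and let a more important status overwrite a less important one but not the reverse. Everything here is an assumption for illustration. BDRV_IOS_CORRUPT does not exist in the patch, the ranking (OK, then ENOSPC, then FAILED, then CORRUPT) is only one plausible choice (the thread only says corruption should outrank ENOSPC), and iostatus_report()/iostatus_reset() are hypothetical names.

```c
/* Enumerators are ordered by importance, least important first.
 * The relative rank of ENOSPC and FAILED is an assumption. */
typedef enum {
    BDRV_IOS_OK = 0,
    BDRV_IOS_ENOSPC,
    BDRV_IOS_FAILED,
    BDRV_IOS_CORRUPT /* hypothetical: image reported as corrupted */
} BlockIOStatus;

/* Record new_status only if it outranks the stored one. A successful
 * request (BDRV_IOS_OK) therefore never clears an earlier error. */
void iostatus_report(BlockIOStatus *cur, BlockIOStatus new_status)
{
    if (new_status > *cur) {
        *cur = new_status;
    }
}

/* Explicit reset, e.g. when the management application has fixed the
 * problem and resumes the VM (option 2 in the mail above). */
void iostatus_reset(BlockIOStatus *cur)
{
    *cur = BDRV_IOS_OK;
}
```

This also sidesteps the in-flight request problem: a later successful completion cannot hide the failure that stopped the VM.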
Re: [Qemu-devel] live block copy/stream/snapshot discussion
Am 12.07.2011 17:45, schrieb Stefan Hajnoczi: Image streaming API === For leaf images with copy-on-read semantics, the stream commands allow the user to populate local blocks by manually streaming them from the backing image. Once all blocks have been streamed, the dependency on the original backing image can be removed. Therefore, stream commands can be used to implement post-copy live block migration and rapid deployment. The block_stream command can be used to stream a single cluster, to start streaming the entire device, and to cancel an active stream. It is easiest to allow the block_stream command to manage streaming for the entire device but a managent tool could use single cluster mode to throttle the I/O rate. As discussed earlier, having the management send requests for each single cluster doesn't make any sense at all. It wouldn't only throttle the I/O rate but bring it down to a level that makes it unusable. What you really want is to allow the management to give us a range (offset + length) that qemu should stream. I feel that an iteration interface is problematic whether the management tool or QEMU decide what to stream. Let's have just the background streaming operation. The problem with byte ranges is two-fold. The management tool doesn't know which regions of the image are allocated so it may do a lot of nop calls to already-allocated regions with no intelligence as to where the next sensible offset for streaming is. Secondly, because the progress and performance of image streaming depend largely on whether or not clusters are allocated (it is very fast when a cluster is already allocated and we have no work to do), offsets are bad indicators of progress to the user. I think it's best not to expose these details to the management tool at all. The only reason for the iteration interface was to punt I/O throttling to the management tool. I think it would be easier to just throttle inside the streaming function. 
Kevin: Are you happy with dropping the iteration interface? Adam: Is there a libvirt requirement for iteration or could we support background copy only? Okay, works for me. The command synopses are as follows: block_stream Copy data from a backing file into a block device. If the optional 'all' argument is true, this operation is performed in the background until the entire backing file has been copied. The status of ongoing block_stream operations can be checked with query-block-stream. Not sure if it's a good idea to use a bool argument to turn a command into its opposite. I think having a separate command for stopping would be cleaner. Something for the QMP folks to decide, though. git branch new_branch git branch -D new_branch Makes sense to me :) I don't think you should compare a command line option to a programming interface. Having a git_create_branch(const char *name, bool delete) would really look strange. Anyway, probably a matter of taste. A hint that separate commands would make sense is that the stop command won't need the other arguments that the start command gets ('all' and 'base'). Arguments: - all:copy entire device (json-bool, optional) - stop: stop copying to device (json-bool, optional) - device: device name (json-string) It must be possible to specify backing file that will be active after streaming finishes (data from that file will not be streamed into active file, of course). Yes, I think the common base image belongs here. Right. We need to specify it by filename: - base: filename of base file (json-string, optional) Sectors are not copied from the base file and its backing file chain. The following describes this feature: Before: base - sn1 - sn2 - sn3 - vm.img After: base - vm.img Does this imply that a rebase -u happens always after completion? With all = false, where does the streaming begin? Streaming begins at the start of the image. Do you have something like the current streaming offset in the state of each BlockDriverState? 
Yes, there is a StreamState for each block device that has an in-progress operation. The progress is saved between block_stream (without -a) invocations so the caller does not need to specify the streaming offset as an argument. Thanks for pointing out these weaknesses in the documentation. It should really be explained fully. I think we also need to describe error cases. For example, what happens if you try to start streaming while it's already in progress? Return: - device: device name (json-string) - len:size of the device, in bytes (json-int) - offset: ending offset of the completed I/O, in bytes (json-int) So you only get the reply when the request has completed? With the current monitor, this means that QMP is blocked while we stream, doesn't it? How are you supposed to send the stop command then? Incomplete documentation again,
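Editorial sketch of the "progress is saved between invocations" behaviour described for StreamState. Only the name StreamState and the len/offset fields of the reply come from the thread. The struct layout, stream_step(), and the cluster-sized step are assumptions, and the actual copy from the backing file is left out.

```c
#include <stdint.h>

/* Per-device streaming progress, kept between calls. */
typedef struct StreamState {
    int64_t offset; /* ending offset of the completed I/O, in bytes */
    int64_t len;    /* size of the device, in bytes */
} StreamState;

/* Advance the stream by at most cluster_size bytes.
 * Returns 1 if more work remains, 0 once the device is fully streamed.
 * Real code would copy [offset, offset + n) from the backing file here
 * when that range is not yet allocated in the active image. */
int stream_step(StreamState *s, int64_t cluster_size)
{
    int64_t n;

    if (s->offset >= s->len) {
        return 0;
    }
    n = s->len - s->offset;
    if (n > cluster_size) {
        n = cluster_size;
    }
    s->offset += n;
    return s->offset < s->len;
}
```

Because the offset lives in the state, the caller never passes a position, and a background job can loop on the same function with throttling applied inside the loop.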
[Qemu-devel] Loading ELF binaries with very high base addresses
Hello,

I am working on target-ia64, but am stuck during ia64 ELF loading.

Referring to function probe_guest_base() in linux-user/elfload.c around line 1350, called from around line 1484:

When the main binary is being mmap'd, the host address and guest address should ideally be the same. If they're not, a linear search is done by increasing the host_address by one page and trying the mmap again. The (positive) offset is then saved.

The problem occurs with ia64 binaries, which typically start at 0x4000 (i.e 0x464). At least on my x86_64 host machine, mmap'ing at this address fails. The real_address is of the order of 0x832. Needless to say, increasing host_address and trying again will never reach a *lower* address to map at. Further, I cannot make it relocate to a *lower* host address because the offset (guest_base) is an unsigned int and so the relocation can only happen by a positive offset.

Because of this it is not possible to load any ELF binaries which start at such high memory addresses. I can tailor an elf binary to start at a lower base address, which might work for that specific case, but I suspect most existing ia64 binaries start at 0x464 by convention. Also, the hiaddr is read from elf header which again is set to 0x464 + some value.

The existing code works fine with x86_64, for example, because the binaries are typically starting at 0x4, which is easily mmap'd at first try.

Any ideas on a workaround?

~Prashant
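Editorial sketch of the upward-only search described in this message, to show why it cannot succeed when the only mappable host addresses lie below the guest address. It does not call mmap and is not the code in linux-user/elfload.c. The can_map callback, the toy 47-bit host limit, the bounded number of tries, and the example addresses in the usage note are all assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096ULL

/* Stand-in for "does mmap at this host address succeed?". */
typedef bool (*can_map_fn)(uint64_t host_addr, uint64_t size);

/* Try host = guest_lo, guest_lo + 1 page, ... for at most max_tries
 * attempts. On success, store the non-negative offset in *guest_base. */
bool probe_guest_base_upward(uint64_t guest_lo, uint64_t size,
                             can_map_fn can_map, unsigned max_tries,
                             uint64_t *guest_base)
{
    uint64_t host = guest_lo;
    unsigned i;

    for (i = 0; i < max_tries; i++) {
        if (can_map(host, size)) {
            *guest_base = host - guest_lo; /* only ever grows */
            return true;
        }
        host += TOY_PAGE_SIZE;
    }
    return false;
}

/* Toy host: user mappings must end below 1 << 47 (an assumption that
 * resembles, but is not taken from, a typical x86_64 Linux host). */
bool toy_can_map(uint64_t host_addr, uint64_t size)
{
    const uint64_t limit = 1ULL << 47;

    return size <= limit && host_addr <= limit - size;
}
```

With the toy host, a guest image at a low address such as 0x400000 is accepted on the first try with an offset of zero, while a guest image at 1ULL << 62 is never accepted, since every candidate is at or above the starting point. Reaching a lower host address would need an offset that can be negative, or arithmetic that is allowed to wrap.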
[Qemu-devel] [Bug 807893] Re: qemu privilege escalation
Yep, that fix looks fine. RedHat should have a CVE number for this issue.
[Qemu-devel] [Bug 807893] Re: qemu privilege escalation
or any other linux vendor that has an interest in qemu :)
Re: [Qemu-devel] [PATCH v6 4/4] guest agent: add guest agent RPCs/commands
On Tue, 12 Jul 2011 10:44:14 -0500 Michael Roth mdr...@linux.vnet.ibm.com wrote: On 07/12/2011 09:15 AM, Luiz Capitulino wrote: On Mon, 11 Jul 2011 18:11:21 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: On 07/11/2011 04:12 PM, Luiz Capitulino wrote: On Mon, 11 Jul 2011 15:11:26 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: On 07/08/2011 10:14 AM, Luiz Capitulino wrote: On Tue, 5 Jul 2011 08:21:40 -0500 Michael Rothmdr...@linux.vnet.ibm.comwrote: This adds the initial set of QMP/QAPI commands provided by the guest agent: guest-sync guest-ping guest-info guest-shutdown guest-file-open guest-file-read guest-file-write guest-file-seek guest-file-close guest-fsfreeze-freeze guest-fsfreeze-thaw guest-fsfreeze-status The input/output specification for these commands are documented in the schema. Example usage: host: qemu -device virtio-serial \ -chardev socket,path=/tmp/vs0.sock,server,nowait,id=qga0 \ -device virtserialport,chardev=qga0,name=qga0 ... echo {'execute':'guest-info'} | socat stdio \ unix-connect:/tmp/qga0.sock guest: qemu-ga -c virtio-serial -p /dev/virtio-ports/qga0 \ -p /var/run/qemu-guest-agent.pid -d Signed-off-by: Michael Rothmdr...@linux.vnet.ibm.com --- Makefile | 15 ++- qemu-ga.c |4 + qerror.h |3 + qga/guest-agent-commands.c | 501 qga/guest-agent-core.h |2 + 5 files changed, 523 insertions(+), 2 deletions(-) create mode 100644 qga/guest-agent-commands.c diff --git a/Makefile b/Makefile index b2e8593..7e4f722 100644 --- a/Makefile +++ b/Makefile @@ -175,15 +175,26 @@ $(qapi-dir)/test-qmp-commands.h: $(qapi-dir)/test-qmp-marshal.c $(qapi-dir)/test-qmp-marshal.c: $(SRC_PATH)/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-commands.py $(call quiet-command,python $(SRC_PATH)/scripts/qapi-commands.py -o $(qapi-dir) -p test- $, GEN $@) +$(qapi-dir)/qga-qapi-types.c: $(qapi-dir)/qga-qapi-types.h +$(qapi-dir)/qga-qapi-types.h: $(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-types.py + $(call quiet-command,python 
$(SRC_PATH)/scripts/qapi-types.py -o $(qapi-dir) -p qga-$, GEN $@) +$(qapi-dir)/qga-qapi-visit.c: $(qapi-dir)/qga-qapi-visit.h +$(qapi-dir)/qga-qapi-visit.h: $(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-visit.py + $(call quiet-command,python $(SRC_PATH)/scripts/qapi-visit.py -o $(qapi-dir) -p qga-$, GEN $@) +$(qapi-dir)/qga-qmp-marshal.c: $(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-commands.py + $(call quiet-command,python $(SRC_PATH)/scripts/qapi-commands.py -o $(qapi-dir) -p qga- $, GEN $@) + test-visitor.o: $(addprefix $(qapi-dir)/, test-qapi-types.c test-qapi-types.h test-qapi-visit.c test-qapi-visit.h) test-visitor: test-visitor.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o $(qapi-obj-y) error.o osdep.o qemu-malloc.o $(oslib-obj-y) qjson.o json-streamer.o json-lexer.o json-parser.o qerror.o qemu-error.o qemu-tool.o $(qapi-dir)/test-qapi-visit.o $(qapi-dir)/test-qapi-types.o test-qmp-commands.o: $(addprefix $(qapi-dir)/, test-qapi-types.c test-qapi-types.h test-qapi-visit.c test-qapi-visit.h test-qmp-marshal.c test-qmp-commands.h) test-qmp-commands: test-qmp-commands.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o $(qapi-obj-y) error.o osdep.o qemu-malloc.o $(oslib-obj-y) qjson.o json-streamer.o json-lexer.o json-parser.o qerror.o qemu-error.o qemu-tool.o $(qapi-dir)/test-qapi-visit.o $(qapi-dir)/test-qapi-types.o $(qapi-dir)/test-qmp-marshal.o module.o -QGALIB=qga/guest-agent-command-state.o +QGALIB=qga/guest-agent-command-state.o qga/guest-agent-commands.o + +qemu-ga.o: $(qapi-dir)/qga-qapi-types.c $(qapi-dir)/qga-qapi-types.h $(qapi-dir)/qga-qapi-visit.c $(qapi-dir)/qga-qmp-marshal.c -qemu-ga$(EXESUF): qemu-ga.o $(QGALIB) qemu-tool.o qemu-error.o error.o $(oslib-obj-y) $(trace-obj-y) $(block-obj-y) $(qobject-obj-y) $(version-obj-y) $(qapi-obj-y) qemu-timer-common.o qemu-sockets.o module.o qapi/qmp-dispatch.o qapi/qmp-registry.o +qemu-ga$(EXESUF): qemu-ga.o $(QGALIB) qemu-tool.o qemu-error.o error.o $(oslib-obj-y) 
$(trace-obj-y) $(block-obj-y) $(qobject-obj-y) $(version-obj-y) $(qapi-obj-y) qemu-timer-common.o qemu-sockets.o module.o qapi/qmp-dispatch.o qapi/qmp-registry.o $(qapi-dir)/qga-qapi-visit.o $(qapi-dir)/qga-qmp-marshal.o QEMULIBS=libhw32 libhw64 libuser libdis libdis-user diff --git a/qemu-ga.c b/qemu-ga.c index 649c16a..04ead22 100644 --- a/qemu-ga.c +++ b/qemu-ga.c @@ -637,6 +637,9 @@ int main(int argc, char **argv)
Re: [Qemu-devel] live block copy/stream/snapshot discussion
On Tue, Jul 12, 2011 at 9:06 AM, Kevin Wolf kw...@redhat.com wrote: Am 11.07.2011 18:32, schrieb Marcelo Tosatti: On Mon, Jul 11, 2011 at 03:47:15PM +0100, Stefan Hajnoczi wrote: Kevin, Marcelo, I'd like to reach agreement on the QMP/HMP APIs for live block copy and image streaming. Libvirt has acked the image streaming APIs that Adam proposed and I think they are a good fit for the feature. I have described that API below for your review (it's exactly what the QED Image Streaming patches provide). Marcelo: Are you happy with this API for live block copy? Also please take a look at the switch command that I am proposing. Image streaming API === For leaf images with copy-on-read semantics, the stream commands allow the user to populate local blocks by manually streaming them from the backing image. Once all blocks have been streamed, the dependency on the original backing image can be removed. Therefore, stream commands can be used to implement post-copy live block migration and rapid deployment. The block_stream command can be used to stream a single cluster, to start streaming the entire device, and to cancel an active stream. It is easiest to allow the block_stream command to manage streaming for the entire device but a managent tool could use single cluster mode to throttle the I/O rate. As discussed earlier, having the management send requests for each single cluster doesn't make any sense at all. It wouldn't only throttle the I/O rate but bring it down to a level that makes it unusable. What you really want is to allow the management to give us a range (offset + length) that qemu should stream. I feel that an iteration interface is problematic whether the management tool or QEMU decide what to stream. Let's have just the background streaming operation. The problem with byte ranges is two-fold. 
The management tool doesn't know which regions of the image are allocated so it may do a lot of nop calls to already-allocated regions with no intelligence as to where the next sensible offset for streaming is. Secondly, because the progress and performance of image streaming depend largely on whether or not clusters are allocated (it is very fast when a cluster is already allocated and we have no work to do), offsets are bad indicators of progress to the user. I think it's best not to expose these details to the management tool at all. The only reason for the iteration interface was to punt I/O throttling to the management tool. I think it would be easier to just throttle inside the streaming function. Kevin: Are you happy with dropping the iteration interface? Adam: Is there a libvirt requirement for iteration or could we support background copy only? The command synopses are as follows: block_stream Copy data from a backing file into a block device. If the optional 'all' argument is true, this operation is performed in the background until the entire backing file has been copied. The status of ongoing block_stream operations can be checked with query-block-stream. Not sure if it's a good idea to use a bool argument to turn a command into its opposite. I think having a separate command for stopping would be cleaner. Something for the QMP folks to decide, though. git branch new_branch git branch -D new_branch Makes sense to me :) Arguments: - all: copy entire device (json-bool, optional) - stop: stop copying to device (json-bool, optional) - device: device name (json-string) It must be possible to specify backing file that will be active after streaming finishes (data from that file will not be streamed into active file, of course). Yes, I think the common base image belongs here. Right. We need to specify it by filename: - base: filename of base file (json-string, optional) Sectors are not copied from the base file and its backing file chain. 
The following describes this feature: Before: base - sn1 - sn2 - sn3 - vm.img After: base - vm.img With all = false, where does the streaming begin? Streaming begins at the start of the image. Do you have something like the current streaming offset in the state of each BlockDriverState? Yes, there is a StreamState for each block device that has an in-progress operation. The progress is saved between block_stream (without -a) invocations so the caller does not need to specify the streaming offset as an argument. Thanks for pointing out these weaknesses in the documentation. It should really be explained fully. Return: - device: device name (json-string) - len: size of the device, in bytes (json-int) - offset: ending offset of the completed I/O, in bytes (json-int) So you only get the reply when the request has completed? With the current monitor, this means that QMP is blocked while we stream, doesn't it? How are you supposed to send the stop command then? Incomplete documentation again, sorry. The block_stream
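To make the argument list discussed above concrete, here is a rough sketch of how a management tool might assemble the proposed block_stream request. It only follows the argument names given in this thread (`device`, `all`, `stop`, `base`) and the usual QMP `execute`/`arguments` envelope. The API was still under discussion here, so this should not be read as the interface QEMU ended up with. The helper name, the device name `virtio0` and the file name `base.img` are made up for the example.

```python
import json


def block_stream_command(device, copy_all=None, stop=None, base=None):
    """Build a QMP request for the block_stream command as proposed here.

    Only `device` is mandatory; the other arguments are left out of the
    request when they are not given, matching their "optional" marking.
    """
    arguments = {"device": device}
    if copy_all is not None:
        arguments["all"] = bool(copy_all)
    if stop is not None:
        arguments["stop"] = bool(stop)
    if base is not None:
        arguments["base"] = base
    request = {"execute": "block_stream", "arguments": arguments}
    return json.dumps(request, sort_keys=True)


# Hypothetical device and base image names, for illustration only.
start_request = block_stream_command("virtio0", copy_all=True, base="base.img")
stop_request = block_stream_command("virtio0", stop=True)
print(start_request)
print(stop_request)
```

Seeing the two requests side by side also illustrates Kevin's objection: a boolean flips the same command between starting and cancelling a stream.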
Re: [Qemu-devel] migration: new sections and backward compatibility.
Hi, Well, in case of usb hid devices breaking the guest isn't that a big issue for at least some guests because they manage to reset the device and continue nevertheless ... In a situation like this, I think our responsibility is to let the user know that there could be a problem, and provide the ability to the user to force the migration. So for instance, you could have a (qemu) migrate_ignore_section usb command or something like that. Isn't that a bit overkill? But we shouldn't enable things that may sometimes work by default. I certainly agree on that for the future, that's why there is the patch series which starts tagging devices without migration support. This is about bug compatibility with old qemu versions though. They used to migrate usb devices without saving any state, with surprisingly few issues. cheers, Gerd
Re: [Qemu-devel] [PATCH v6 4/4] guest agent: add guest agent RPCs/commands
On 07/12/2011 09:15 AM, Luiz Capitulino wrote: On Mon, 11 Jul 2011 18:11:21 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: On 07/11/2011 04:12 PM, Luiz Capitulino wrote: On Mon, 11 Jul 2011 15:11:26 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: On 07/08/2011 10:14 AM, Luiz Capitulino wrote: On Tue, 5 Jul 2011 08:21:40 -0500 Michael Rothmdr...@linux.vnet.ibm.comwrote: This adds the initial set of QMP/QAPI commands provided by the guest agent: guest-sync guest-ping guest-info guest-shutdown guest-file-open guest-file-read guest-file-write guest-file-seek guest-file-close guest-fsfreeze-freeze guest-fsfreeze-thaw guest-fsfreeze-status The input/output specification for these commands are documented in the schema. Example usage: host: qemu -device virtio-serial \ -chardev socket,path=/tmp/vs0.sock,server,nowait,id=qga0 \ -device virtserialport,chardev=qga0,name=qga0 ... echo {'execute':'guest-info'} | socat stdio \ unix-connect:/tmp/qga0.sock guest: qemu-ga -c virtio-serial -p /dev/virtio-ports/qga0 \ -p /var/run/qemu-guest-agent.pid -d Signed-off-by: Michael Rothmdr...@linux.vnet.ibm.com --- Makefile | 15 ++- qemu-ga.c |4 + qerror.h |3 + qga/guest-agent-commands.c | 501 qga/guest-agent-core.h |2 + 5 files changed, 523 insertions(+), 2 deletions(-) create mode 100644 qga/guest-agent-commands.c diff --git a/Makefile b/Makefile index b2e8593..7e4f722 100644 --- a/Makefile +++ b/Makefile @@ -175,15 +175,26 @@ $(qapi-dir)/test-qmp-commands.h: $(qapi-dir)/test-qmp-marshal.c $(qapi-dir)/test-qmp-marshal.c: $(SRC_PATH)/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-commands.py $(call quiet-command,python $(SRC_PATH)/scripts/qapi-commands.py -o $(qapi-dir) -p test-$, GEN $@) +$(qapi-dir)/qga-qapi-types.c: $(qapi-dir)/qga-qapi-types.h +$(qapi-dir)/qga-qapi-types.h: $(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-types.py + $(call quiet-command,python $(SRC_PATH)/scripts/qapi-types.py -o $(qapi-dir) -p qga-$, GEN $@) +$(qapi-dir)/qga-qapi-visit.c: 
$(qapi-dir)/qga-qapi-visit.h +$(qapi-dir)/qga-qapi-visit.h: $(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-visit.py + $(call quiet-command,python $(SRC_PATH)/scripts/qapi-visit.py -o $(qapi-dir) -p qga-$, GEN $@) +$(qapi-dir)/qga-qmp-marshal.c: $(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-commands.py + $(call quiet-command,python $(SRC_PATH)/scripts/qapi-commands.py -o $(qapi-dir) -p qga-$, GEN $@) + test-visitor.o: $(addprefix $(qapi-dir)/, test-qapi-types.c test-qapi-types.h test-qapi-visit.c test-qapi-visit.h) test-visitor: test-visitor.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o $(qapi-obj-y) error.o osdep.o qemu-malloc.o $(oslib-obj-y) qjson.o json-streamer.o json-lexer.o json-parser.o qerror.o qemu-error.o qemu-tool.o $(qapi-dir)/test-qapi-visit.o $(qapi-dir)/test-qapi-types.o test-qmp-commands.o: $(addprefix $(qapi-dir)/, test-qapi-types.c test-qapi-types.h test-qapi-visit.c test-qapi-visit.h test-qmp-marshal.c test-qmp-commands.h) test-qmp-commands: test-qmp-commands.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o $(qapi-obj-y) error.o osdep.o qemu-malloc.o $(oslib-obj-y) qjson.o json-streamer.o json-lexer.o json-parser.o qerror.o qemu-error.o qemu-tool.o $(qapi-dir)/test-qapi-visit.o $(qapi-dir)/test-qapi-types.o $(qapi-dir)/test-qmp-marshal.o module.o -QGALIB=qga/guest-agent-command-state.o +QGALIB=qga/guest-agent-command-state.o qga/guest-agent-commands.o + +qemu-ga.o: $(qapi-dir)/qga-qapi-types.c $(qapi-dir)/qga-qapi-types.h $(qapi-dir)/qga-qapi-visit.c $(qapi-dir)/qga-qmp-marshal.c -qemu-ga$(EXESUF): qemu-ga.o $(QGALIB) qemu-tool.o qemu-error.o error.o $(oslib-obj-y) $(trace-obj-y) $(block-obj-y) $(qobject-obj-y) $(version-obj-y) $(qapi-obj-y) qemu-timer-common.o qemu-sockets.o module.o qapi/qmp-dispatch.o qapi/qmp-registry.o +qemu-ga$(EXESUF): qemu-ga.o $(QGALIB) qemu-tool.o qemu-error.o error.o $(oslib-obj-y) $(trace-obj-y) $(block-obj-y) $(qobject-obj-y) $(version-obj-y) $(qapi-obj-y) qemu-timer-common.o 
qemu-sockets.o module.o qapi/qmp-dispatch.o qapi/qmp-registry.o $(qapi-dir)/qga-qapi-visit.o $(qapi-dir)/qga-qmp-marshal.o QEMULIBS=libhw32 libhw64 libuser libdis libdis-user diff --git a/qemu-ga.c b/qemu-ga.c index 649c16a..04ead22 100644 --- a/qemu-ga.c +++ b/qemu-ga.c @@ -637,6 +637,9 @@ int main(int argc, char **argv) g_log_set_default_handler(ga_log, s); g_log_set_fatal_mask(NULL, G_LOG_LEVEL_ERROR); s-logging_enabled = true; +s-command_state = ga_command_state_new(); +ga_command_state_init(s, s-command_state); +ga_command_state_init_all(s-command_state); ga_state = s;
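The host-side usage shown in the commit message (piping a JSON request into socat) could also be scripted. The following is only a sketch under assumptions: it takes the request format from the commit message's `guest-info` example, and it assumes the agent answers with a single newline-terminated JSON document, which the message does not state. `build_request` and `send_command` are hypothetical helper names, not part of the patch.

```python
import json
import socket


def build_request(command, arguments=None):
    """Encode one guest agent request as a newline-terminated JSON line."""
    request = {"execute": command}
    if arguments is not None:
        request["arguments"] = arguments
    return (json.dumps(request) + "\n").encode("utf-8")


def send_command(sock_path, command, arguments=None, timeout=5.0):
    """Send one request over the host-side unix socket and parse the reply.

    Assumption: the reply is one JSON document followed by a newline.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(sock_path)
        sock.sendall(build_request(command, arguments))
        reply = b""
        while not reply.endswith(b"\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            reply += chunk
    return json.loads(reply.decode("utf-8"))


print(build_request("guest-info"))
```

With a guest running qemu-ga, something like `send_command("/tmp/qga0.sock", "guest-info")` would be the scripted counterpart of the socat line, with the socket path taken from that example.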
Re: [Qemu-devel] [PATCH 1/8] Introduce the VMStatus type
On Tue, 12 Jul 2011 12:12:31 -0300 Luiz Capitulino lcapitul...@redhat.com wrote: On Tue, 12 Jul 2011 16:51:03 +0200 Kevin Wolf kw...@redhat.com wrote: Am 12.07.2011 16:25, schrieb Luiz Capitulino: On Tue, 12 Jul 2011 09:28:05 +0200 Markus Armbruster arm...@redhat.com wrote: Luiz Capitulino lcapitul...@redhat.com writes: We need to track the VM status so that QMP can report it to clients. This commit adds the VMStatus type and related functions. The vm_status_set() function is used to keep track of the current VM status. The current statuses are: Nitpicking about names, bear with me. - debug: guest is running under gdb - inmigrate: guest is paused waiting for an incoming migration incoming-migration? - postmigrate: guest is paused following a successful migration post-migrate? - internal-error: Fatal internal error that prevents further guest execution - load-state-error: guest is paused following a failed 'loadvm' Less than obvious. If you like concrete, name it loadvm-failed. If you like abstract, name it restore-vm-failed. Ok for your suggestions above. - io-error: the last IOP has failed and the device is configured to pause on I/O errors - watchdog-error: the watchdog action is configured to pause and has been triggered Sounds like the watchdog suffered an error. watchdog-fired? Maybe watchdog-paused. - paused: guest has been paused via the 'stop' command stop-command? I prefer 'paused', it communicates better the state we're in. - prelaunch: QEMU was started with -S and guest has not started unstarted? Looks the same to me. 
- running: guest is actively running - shutdown: guest is shut down (and -no-shutdown is in use) Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- gdbstub.c |4 hw/ide/core.c |1 + hw/scsi-disk.c |1 + hw/virtio-blk.c |1 + hw/watchdog.c |1 + kvm-all.c |1 + migration.c |3 +++ monitor.c |5 - sysemu.h| 19 +++ vl.c| 37 + 10 files changed, 72 insertions(+), 1 deletions(-) diff --git a/gdbstub.c b/gdbstub.c index c085a5a..61b700a 100644 --- a/gdbstub.c +++ b/gdbstub.c @@ -2358,6 +2358,7 @@ void gdb_do_syscall(gdb_syscall_complete_cb cb, const char *fmt, ...) s-state = RS_SYSCALL; #ifndef CONFIG_USER_ONLY vm_stop(VMSTOP_DEBUG); +vm_status_set(VMST_DEBUG); #endif s-state = RS_IDLE; va_start(va, fmt); @@ -2432,6 +2433,7 @@ static void gdb_read_byte(GDBState *s, int ch) /* when the CPU is running, we cannot do anything except stop it when receiving a char */ vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); } else #endif { @@ -2694,6 +2696,7 @@ static void gdb_chr_event(void *opaque, int event) switch (event) { case CHR_EVENT_OPENED: vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); gdb_has_xml = 0; break; default: Previous hunk has VMST_DEBUG with VMST_DEBUG. Odd. 
@@ -2735,6 +2738,7 @@ static void gdb_sigterm_handler(int signal) { if (vm_running) { vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); } } #endif diff --git a/hw/ide/core.c b/hw/ide/core.c index ca17a43..bf9df41 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -523,6 +523,7 @@ static int ide_handle_rw_error(IDEState *s, int error, int op) s-bus-error_status = op; bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { if (op BM_STATUS_DMA_RETRY) { dma_buf_commit(s, 0); diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index a8c7372..66037fd 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -216,6 +216,7 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, int type) bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { if (type == SCSI_REQ_STATUS_RETRY_READ) { scsi_req_data(r-req, 0); diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c index 91e0394..bf70200 100644 --- a/hw/virtio-blk.c +++ b/hw/virtio-blk.c @@ -79,6 +79,7 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error, s-rq = req; bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +
Re: [Qemu-devel] qdev for programmers writeup
On 07/11/2011 06:47 PM, Peter Maydell wrote: On 11 July 2011 16:29, Paolo Bonzinipbonz...@redhat.com wrote: On 07/11/2011 04:44 PM, Peter Maydell wrote: (Also if you have one bus type per board then you're still very limited in what you can do with -device because you can't plug in some random other sysbus device anyway.) I'm not talking about one bus type per board! I'm talking about as few as possible board-specific root devices, and sharing buses between boards as much as possible. Er, doesn't that just get you sysbus again? It does get you a bus that can be reused by devices. It doesn't get you a bus that is a pot-pourri of features, some of which are not even meaningful in the context of all boards (e.g. PIO), and some of which override the run-time reconfigurability mechanisms that qdev has built-in. By the way, while it's true that run-time reconfigurability does not buy you much in terms of adding devices---at least without a device tree in the guest---it can help in terms of removing devices for debugging. If a device only needs MMIO and no GPIO/IRQ pins, it probably can stay under SysBus. However, I don't believe the magic MMIO functionality of SysBus is useful, and I do think it should be replaced by properties. Also if you have a root device and it's not the CPU then something's a bit odd. (The CPU lives above the interrupt controller in the interrupt tree if you want to look at it like that.) If you consider the CPU to be hidden beyond sysbus, then yes, you do have CPU-SysBus-PIC. It is interesting that in the PC the devices below SysBus are indeed mostly managing interrupts: CPU-SysBus-LAPIC(s) IOAPIC HPET i440FX-pcihost fw_cfg I think the PC's fw_cfg device should move below the ISA bridge; and the HPET is there only because there is no single device for the northbridge chip. 
It should perhaps be more like CPU-SysBus-LAPIC(s) i440FX-nb-i440FX-pcihost IOAPIC HPET i8259 I think the real reason so many devices use sysbus is that it is basically I'm a device and I support some gpio signals and some memory mappings, which is just a very natural way to model a lot of things. I agree that sysbus is convenient sugar right now, and we need that sugar to be available at all levels (not just sysbus), but you don't need sysbus to express that. There is actually one thing that I'd save in sysbus, and that is IRQs. That is because GPIO pins provided by qdev work in one direction only. If you want to have interrupt/GPIO sources both towards the children and towards the parent, it doesn't work well. This is a nice niche that sysbus IRQs fit in; a GPIO chip can use gpio_in/gpio_out towards the children, and sysbus IRQs towards the parent, giving nice separation. And even if things tend to be tree-like, you still need to support arbitrary inter-wiring for the corner cases (like this MMC controller's 'card present' wire needs to connect to the board-register model's input). You can model trees with arbitrary interconnections, but not vice-versa. Yes, any slot/socket mechanism for run-time reconfigurability of GPIO or IRQ connections needs to take into account the possibility of connecting siblings (or even completely disconnected devices). Right now that is limited to C code. But since a GPIO/IRQ is simply a pointer, adding such a mechanism would be be just syntax to name the devices' GPIO/IRQ slots. But in any case you will need a preferred topology defined somewhere, because code needs more than a bunch of qemu_irqs. Since they know that the model is a tree, qdevified devices can exploit their parent-child relationship and you can use that to tie the parent and child in more specific ways with virtual functions. It's quite fundamental. This can stay even if you turn the preferred topology into a DAG, or into the superposition of many trees. 
(This view of the world, which I accept is not really qdev's, says that a bus is really just a conveniently named and manipulable bundle of connections.) I see qbuses as a conveniently named and pluggable set of callbacks (including qemu_irq callbacks whenever that's convenient). Alternatively, it's the point where the children's sockets are joined to the children's slots (we're forced by qdev to make all sockets meet their slots in the same place---i.e. on the same qbus). Paolo
Re: [Qemu-devel] Loading ELF binaries with very high base addresses
Hello Prashant,

first of all, your 0x4<<64 is wrong; it's 0x4<<60. In Volume 2 of the IASDM, page 2:46, you can see that the three upper bits correspond to the 8 virtual regions (here: region 2). So maybe you can just disregard these bits and use the rest as a new offset into a faked guest_base that fits your needs (e.g. somewhere in your process space)?

Regards, Marc

Hello, I am working on target-ia64, but am stuck during ia64 ELF loading. Referring to function probe_guest_base() in linux-user/elfload.c around line 1350, called from around line 1484 -- When the main binary is being mmap'd, the host address and guest address should ideally be the same. If they're not, a linear search is done by increasing the host_address by one page and trying the mmap again. The (positive) offset is then saved.

The problem occurs with ia64 binaries, which typically start at 0x4000000000000000 (i.e. 0x4<<64). At least on my x86_64 host machine, mmap'ing at this address fails. The real_address is of the order of 0x8<<32. Needless to say, increasing host_address and trying again will never reach a lower address to map at. Further, I cannot make it relocate to a lower host address because the offset (guest_base) is an unsigned int and so the relocation can only happen by a positive offset. Because of this it is not possible to load any ELF binaries which start at such high memory addresses.

I can tailor an ELF binary to start at a lower base address, which might work for that specific case, but I suspect most existing ia64 binaries start at 0x4<<64 by convention. Also, the hiaddr is read from the ELF header, which again is set to 0x4<<64 + some value. The existing code works fine with x86_64, for example, because the binaries are typically starting at 0x400000, which is easily mmap'd at first try.

Any ideas on a workaround?

~Prashant
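Marc's suggestion above (treat the three upper address bits as the region number and use the rest as an offset) can be illustrated with a few lines. This is only a sketch of the bit arithmetic, assuming a 64-bit IA-64 virtual address whose bits 63:61 select the region. It does not show how linux-user would actually remap such addresses, and `split_ia64_address` is a hypothetical helper name. The example address 0x4000000000000000 is 0x4 << 60, the load address the thread appears to refer to.

```python
REGION_SHIFT = 61                        # bits 63:61 hold the region number
OFFSET_MASK = (1 << REGION_SHIFT) - 1    # the low 61 bits are the offset


def split_ia64_address(vaddr):
    """Split a 64-bit virtual address into (region number, region offset)."""
    vaddr &= (1 << 64) - 1               # keep only 64 bits
    return vaddr >> REGION_SHIFT, vaddr & OFFSET_MASK


region, offset = split_ia64_address(0x4 << 60)
print(region, hex(offset))
```

Once the region bits are stripped, the remaining offset is small enough that a host mapping for it should be much easier to find than one at the original address.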
Re: [Qemu-devel] [PATCH 1/8] Introduce the VMStatus type
Am 12.07.2011 18:03, schrieb Luiz Capitulino: On Tue, 12 Jul 2011 12:12:31 -0300 Luiz Capitulino lcapitul...@redhat.com wrote: On Tue, 12 Jul 2011 16:51:03 +0200 Kevin Wolf kw...@redhat.com wrote: Am 12.07.2011 16:25, schrieb Luiz Capitulino: On Tue, 12 Jul 2011 09:28:05 +0200 Markus Armbruster arm...@redhat.com wrote: Luiz Capitulino lcapitul...@redhat.com writes: We need to track the VM status so that QMP can report it to clients. This commit adds the VMStatus type and related functions. The vm_status_set() function is used to keep track of the current VM status. The current statuses are: Nitpicking about names, bear with me. - debug: guest is running under gdb - inmigrate: guest is paused waiting for an incoming migration incoming-migration? - postmigrate: guest is paused following a successful migration post-migrate? - internal-error: Fatal internal error that prevents further guest execution - load-state-error: guest is paused following a failed 'loadvm' Less than obvious. If you like concrete, name it loadvm-failed. If you like abstract, name it restore-vm-failed. Ok for your suggestions above. - io-error: the last IOP has failed and the device is configured to pause on I/O errors - watchdog-error: the watchdog action is configured to pause and has been triggered Sounds like the watchdog suffered an error. watchdog-fired? Maybe watchdog-paused. - paused: guest has been paused via the 'stop' command stop-command? I prefer 'paused', it communicates better the state we're in. - prelaunch: QEMU was started with -S and guest has not started unstarted? Looks the same to me. 
- running: guest is actively running - shutdown: guest is shut down (and -no-shutdown is in use) Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- gdbstub.c |4 hw/ide/core.c |1 + hw/scsi-disk.c |1 + hw/virtio-blk.c |1 + hw/watchdog.c |1 + kvm-all.c |1 + migration.c |3 +++ monitor.c |5 - sysemu.h| 19 +++ vl.c| 37 + 10 files changed, 72 insertions(+), 1 deletions(-) diff --git a/gdbstub.c b/gdbstub.c index c085a5a..61b700a 100644 --- a/gdbstub.c +++ b/gdbstub.c @@ -2358,6 +2358,7 @@ void gdb_do_syscall(gdb_syscall_complete_cb cb, const char *fmt, ...) s-state = RS_SYSCALL; #ifndef CONFIG_USER_ONLY vm_stop(VMSTOP_DEBUG); +vm_status_set(VMST_DEBUG); #endif s-state = RS_IDLE; va_start(va, fmt); @@ -2432,6 +2433,7 @@ static void gdb_read_byte(GDBState *s, int ch) /* when the CPU is running, we cannot do anything except stop it when receiving a char */ vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); } else #endif { @@ -2694,6 +2696,7 @@ static void gdb_chr_event(void *opaque, int event) switch (event) { case CHR_EVENT_OPENED: vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); gdb_has_xml = 0; break; default: Previous hunk has VMST_DEBUG with VMST_DEBUG. Odd. 
@@ -2735,6 +2738,7 @@ static void gdb_sigterm_handler(int signal) { if (vm_running) { vm_stop(VMSTOP_USER); +vm_status_set(VMST_DEBUG); } } #endif diff --git a/hw/ide/core.c b/hw/ide/core.c index ca17a43..bf9df41 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -523,6 +523,7 @@ static int ide_handle_rw_error(IDEState *s, int error, int op) s-bus-error_status = op; bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { if (op BM_STATUS_DMA_RETRY) { dma_buf_commit(s, 0); diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index a8c7372..66037fd 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -216,6 +216,7 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, int type) bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { if (type == SCSI_REQ_STATUS_RETRY_READ) { scsi_req_data(r-req, 0); diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c index 91e0394..bf70200 100644 --- a/hw/virtio-blk.c +++ b/hw/virtio-blk.c @@ -79,6 +79,7 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error, s-rq = req; bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); vm_stop(VMSTOP_DISKFULL); +vm_status_set(VMST_IOERROR); } else { virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR); bdrv_mon_event(s-bs, BDRV_ACTION_REPORT, is_read); diff --git a/hw/watchdog.c b/hw/watchdog.c index 1c900a1..d130cbb 100644 --- a/hw/watchdog.c +++
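As a compact reference for the naming discussion above, the statuses listed in the patch description can be written down as an enumeration. This is a Python illustration only, not the C `VMStatus` type from the patch. The string values are the names from the commit message as posted, several of which the reviewers proposed renaming, so none of them should be taken as final.

```python
from enum import Enum


class VMStatus(Enum):
    """Status names as listed in the patch description (not final)."""

    DEBUG = "debug"                        # guest is running under gdb
    INMIGRATE = "inmigrate"                # paused, waiting for incoming migration
    POSTMIGRATE = "postmigrate"            # paused after a successful migration
    INTERNAL_ERROR = "internal-error"      # fatal internal error
    LOAD_STATE_ERROR = "load-state-error"  # paused after a failed 'loadvm'
    IO_ERROR = "io-error"                  # paused on an I/O error
    WATCHDOG_ERROR = "watchdog-error"      # watchdog action paused the guest
    PAUSED = "paused"                      # paused via the 'stop' command
    PRELAUNCH = "prelaunch"                # started with -S, guest not started
    RUNNING = "running"                    # guest is actively running
    SHUTDOWN = "shutdown"                  # guest shut down, -no-shutdown in use


print(VMStatus("io-error"))
```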
[Qemu-devel] [Bug 656285] Re: arm-semi mishandling SYS_HEAPINFO
The patches I mention in commit #4 (and also a fix by Cedric Vincent for some other brk related bugs) have now been committed to qemu master.

** Changed in: qemu
Status: New => Fix Committed

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/656285

Title: arm-semi mishandling SYS_HEAPINFO

Status in QEMU: Fix Committed

Bug description:
I am running qemu-arm on a 32-bit fedora-7 i386 machine:

  $ /home/bri0633/users/clarkes/qemu/build/arm-linux-user/qemu-arm --version
  qemu-arm version 0.12.3, Copyright (c) 2003-2008 Fabrice Bellard

When I try to run an arm semi-hosted executable, I sometimes get an unexpected segv and sometimes not, depending on the executable. The symptom is:

  $ /home/bri0633/users/clarkes/qemu/build/arm-linux-user/qemu-arm -cpu cortex-a9 -- a.out
  qemu: uncaught target signal 11 (Segmentation fault) - core dumped
  Segmentation fault

It appears to be because of the handling of the SYS_HEAPINFO syscall in arm-semi.c. There it tries to allocate 128M for the heap by calling do_brk(), which calls target_mmap(). This is the DEBUG_MMAP diagnostic:

  mmap: start=0x9000 len=0x08001000 prot=rw- flags=MAP_FIXED MAP_ANON MAP_PRIVATE fd=0 offset=

but this mmap is failing because there are shared libraries (and the gate page) mapped there:

  $ ldd /home/bri0633/users/clarkes/qemu/build/arm-linux-user/qemu-arm
  linux-gate.so.1 => (0x0088)
  librt.so.1 => /lib/librt.so.1 (0x03409000)
  libpthread.so.0 => /lib/libpthread.so.0 (0x00d7d000)
  libm.so.6 => /lib/libm.so.6 (0x00d4b000)
  libc.so.6 => /lib/libc.so.6 (0x00bf5000)
  /lib/ld-linux.so.2 (0x00bd6000)

However, it seems that the code in arm-semi.c does not interpret the result of do_brk() correctly, and thinks that the mapping succeeded.
The following patch appears to fix the problem:

  $ diff -u arm-semi.c.orig arm-semi.c
  --- arm-semi.c.orig 2010-09-21 13:19:15.0 +0100
  +++ arm-semi.c 2010-10-07 13:23:13.0 +0100
  @@ -475,7 +475,7 @@
               /* Try a big heap, and reduce the size if that fails. */
               for (;;) {
                   ret = do_brk(limit);
  -                if (ret != -1)
  +                if (ret == limit)
                       break;
                   limit = (ts->heap_base >> 1) + (limit >> 1);
               }

Do you think this is a genuine bug? Steve.

To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/656285/+subscriptions
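The retry loop in the patch above moves `limit` halfway towards `heap_base` on every failed attempt. The sketch below replays that loop against a fake `do_brk`. It assumes, as the bug report implies but does not spell out, that a failed do_brk() returns the unchanged break rather than -1, which is why the `!= -1` test would accept a failed mapping. `find_heap_limit` and `make_fake_brk` are hypothetical names, and the lower-bound guard is an addition that the C code does not have.

```python
def make_fake_brk(current_brk, max_brk):
    """Fake do_brk(): moving the break past max_brk fails (assumption)."""
    state = {"brk": current_brk}

    def do_brk(new_brk):
        if new_brk <= max_brk:
            state["brk"] = new_brk
        # Success returns the new break, failure the unchanged one.
        return state["brk"]

    return do_brk


def find_heap_limit(heap_base, limit, do_brk):
    """Try a big heap and halve the distance to heap_base on failure."""
    while True:
        ret = do_brk(limit)
        if ret == limit:             # the fixed check from the patch
            return limit
        limit = (heap_base >> 1) + (limit >> 1)
        if limit <= heap_base:       # guard added for this sketch only
            raise MemoryError("no heap could be mapped")


# Hypothetical numbers: only addresses up to 0x3000 can be mapped.
result = find_heap_limit(0x1000, 0x9000, make_fake_brk(0x1000, 0x3000))
print(hex(result))
```

With the old `ret != -1` check the very first attempt would have been treated as a success, even though the fake break never moved.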
[Qemu-devel] [Bug 807893] Re: qemu privilege escalation
This bug is being tracked as CVE-2011-2527 ** CVE added: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2011-2527
Re: [Qemu-devel] live block copy/stream/snapshot discussion
On 07/12/2011 10:45 AM, Stefan Hajnoczi wrote:
On Tue, Jul 12, 2011 at 9:06 AM, Kevin Wolf kw...@redhat.com wrote:
On 11.07.2011 18:32, Marcelo Tosatti wrote:
On Mon, Jul 11, 2011 at 03:47:15PM +0100, Stefan Hajnoczi wrote:

Kevin, Marcelo,
I'd like to reach agreement on the QMP/HMP APIs for live block copy and image streaming. Libvirt has acked the image streaming APIs that Adam proposed and I think they are a good fit for the feature. I have described that API below for your review (it's exactly what the QED Image Streaming patches provide).

Marcelo: Are you happy with this API for live block copy? Also please take a look at the switch command that I am proposing.

Image streaming API
===================

For leaf images with copy-on-read semantics, the stream commands allow the user to populate local blocks by manually streaming them from the backing image. Once all blocks have been streamed, the dependency on the original backing image can be removed. Therefore, stream commands can be used to implement post-copy live block migration and rapid deployment.

The block_stream command can be used to stream a single cluster, to start streaming the entire device, and to cancel an active stream. It is easiest to allow the block_stream command to manage streaming for the entire device but a management tool could use single cluster mode to throttle the I/O rate.

As discussed earlier, having the management send requests for each single cluster doesn't make any sense at all. It wouldn't only throttle the I/O rate but bring it down to a level that makes it unusable. What you really want is to allow the management to give us a range (offset + length) that qemu should stream.

I feel that an iteration interface is problematic whether the management tool or QEMU decides what to stream. Let's have just the background streaming operation.

The problem with byte ranges is two-fold. First, the management tool doesn't know which regions of the image are allocated, so it may do a lot of nop calls to already-allocated regions with no intelligence as to where the next sensible offset for streaming is. Secondly, because the progress and performance of image streaming depend largely on whether or not clusters are allocated (it is very fast when a cluster is already allocated and we have no work to do), offsets are bad indicators of progress to the user. I think it's best not to expose these details to the management tool at all.

The only reason for the iteration interface was to punt I/O throttling to the management tool. I think it would be easier to just throttle inside the streaming function.

Kevin: Are you happy with dropping the iteration interface?

Adam: Is there a libvirt requirement for iteration or could we support background copy only?

There is no hard requirement for iteration in libvirt. However, I think there is a requirement that we report some sort of progress to an end user. These operations can easily take many minutes (even hours) and such a long-running operation needs to report progress. I think the current information returned by 'query-block-stream' is appropriate for this purpose and should definitely be maintained.

--
Adam Litke
IBM Linux Technology Center
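The idea of a single background operation that throttles itself and still reports progress can be sketched roughly as follows. This is only an illustration of the design being discussed, not the QED streaming code or the 'query-block-stream' format: `stream_in_background`, `copy_cluster` and `copies_per_tick` are hypothetical names, and the per-cluster allocation list is a simplification.

```python
def stream_in_background(allocated, copy_cluster, copies_per_tick):
    """Walk every cluster once, copying only the unallocated ones.

    Yields (clusters_processed, total) each time the copy budget for the
    current interval is used up, and once more when streaming completes.
    copies_per_tick must be at least 1.
    """
    total = len(allocated)
    done = 0
    budget = copies_per_tick
    for index in range(total):
        if not allocated[index]:
            if budget == 0:
                # Throttle point: the caller waits for the next interval.
                yield done, total
                budget = copies_per_tick
            copy_cluster(index)
            allocated[index] = True
            budget -= 1
        # Already-allocated clusters cost nothing and are skipped quickly.
        done = index + 1
    yield total, total


copied = []
image = [True, False, False, True, False]
for done, total in stream_in_background(image, copied.append, copies_per_tick=1):
    print("progress: %d/%d" % (done, total))
```

The management tool only starts the operation and polls progress; it never needs to know which clusters are allocated, which is the point made against byte ranges above.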
[Qemu-devel] [Bug 807893] Re: qemu privilege escalation
Requesting CVE. Tools like libvirt deprivilege themselves before launching qemu as an unprivileged user (no use of -runas), so aren't vulnerable.
Re: [Qemu-devel] Loading ELF binaries with very high base addresses
Hi Prashant,

On 12.07.2011 at 17:29, Prashant Vaibhav q...@mercurysquad.com wrote:

Hello,
I am working on target-ia64, but am stuck during ia64 ELF loading. Referring to function probe_guest_base() in linux-user/elfload.c around line 1350, called from around line 1484 --

When the main binary is being mmap'd, the host address and guest address should ideally be the same. If they're not, a linear search is done by increasing the host_address by one page and trying the mmap again. The (positive) offset is then saved.

The problem occurs with ia64 binaries, which typically start at 0x4000 (i.e 0x464). At least on my x86_64 host machine, mmap'ing at this address fails. The real_address is of the order of 0x832. Needless to say, increasing host_address and trying again will never reach a lower address to map at. Further, I cannot make it relocate to a lower host address because the offset (guest_base) is an unsigned int and so the relocation can only happen by a positive offset.

On x86_64 the high 16 bits of the virtual address space have to be equal - either 0x0000 or 0xffff. So the IA64 addresses can't be reflected in x86_64 virtual address space.

For now, you could try to add an ifdef that wraps around to some other lower address in the loop when the va is higher than x86_64 supports.

Because of this it is not possible to load any ELF binaries which start at such high memory addresses. I can tailor an elf binary to start at a lower base address, which might work for that specific case, but I suspect most existing ia64 binaries start at 0x464 by convention. Also, the hiaddr is read from the elf header, which again is set to 0x464 + some value.

The existing code works fine with x86_64, for example, because the binaries are typically starting at 0x4, which is easily mmap'd at first try.

Any ideas on a workaround?

I guess the long-term solution here really is to use the softmmu for linux-user as well - unless we're running 32-on-64. For now, just force the mapping to somewhere mappable :)

Alex
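The x86_64 constraint mentioned above can be checked with a few lines of Python. This is a sketch assuming 48-bit virtual addresses (the common x86_64 configuration); `is_canonical_x86_64` is a hypothetical helper, and 0x4000000000000000 is used purely as an illustrative high address.

```python
def is_canonical_x86_64(addr, va_bits=48):
    """Return True if bits 63 down to (va_bits - 1) of addr are all equal.

    addr is treated as an unsigned 64-bit value. With va_bits=48 this means
    the top 16 bits must repeat bit 47, i.e. be all zeros or all ones.
    """
    top = addr >> (va_bits - 1)                    # bits 63..47 for va_bits=48
    all_ones = (1 << (64 - va_bits + 1)) - 1       # 17 one-bits for va_bits=48
    return top == 0 or top == all_ones


for candidate in (0x400000, 0x00007FFFFFFFFFFF, 0x4000000000000000):
    print(hex(candidate), is_canonical_x86_64(candidate))
```

Any guest address that fails this test cannot be mapped one-to-one on such a host, so a workaround has to move the mapping into the lower canonical range.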
Re: [Qemu-devel] [PATCH 1/8] Introduce the VMStatus type
On Tue, 12 Jul 2011 18:16:26 +0200 Kevin Wolf kw...@redhat.com wrote:
On 12.07.2011 18:03, Luiz Capitulino wrote:
On Tue, 12 Jul 2011 12:12:31 -0300 Luiz Capitulino lcapitul...@redhat.com wrote:
On Tue, 12 Jul 2011 16:51:03 +0200 Kevin Wolf kw...@redhat.com wrote:
On 12.07.2011 16:25, Luiz Capitulino wrote:
On Tue, 12 Jul 2011 09:28:05 +0200 Markus Armbruster arm...@redhat.com wrote:
Luiz Capitulino lcapitul...@redhat.com writes:

We need to track the VM status so that QMP can report it to clients. This commit adds the VMStatus type and related functions.

The vm_status_set() function is used to keep track of the current VM status.

The current statuses are:

Nitpicking about names, bear with me.

- debug: guest is running under gdb
- inmigrate: guest is paused waiting for an incoming migration

incoming-migration?

- postmigrate: guest is paused following a successful migration

post-migrate?

- internal-error: Fatal internal error that prevents further guest execution
- load-state-error: guest is paused following a failed 'loadvm'

Less than obvious. If you like concrete, name it loadvm-failed. If you like abstract, name it restore-vm-failed.

Ok for your suggestions above.

- io-error: the last IOP has failed and the device is configured to pause on I/O errors
- watchdog-error: the watchdog action is configured to pause and has been triggered

Sounds like the watchdog suffered an error. watchdog-fired?

Maybe watchdog-paused.

- paused: guest has been paused via the 'stop' command

stop-command?

I prefer 'paused', it communicates better the state we're in.

- prelaunch: QEMU was started with -S and guest has not started

unstarted?

Looks the same to me.

- running: guest is actively running
- shutdown: guest is shut down (and -no-shutdown is in use)

Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 gdbstub.c       |    4 ++++
 hw/ide/core.c   |    1 +
 hw/scsi-disk.c  |    1 +
 hw/virtio-blk.c |    1 +
 hw/watchdog.c   |    1 +
 kvm-all.c       |    1 +
 migration.c     |    3 +++
 monitor.c       |    5 ++++-
 sysemu.h        |   19 +++++++++++++++++++
 vl.c            |   37 +++++++++++++++++++++++++++++++++++++
 10 files changed, 72 insertions(+), 1 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index c085a5a..61b700a 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -2358,6 +2358,7 @@ void gdb_do_syscall(gdb_syscall_complete_cb cb, const char *fmt, ...)
     s->state = RS_SYSCALL;
 #ifndef CONFIG_USER_ONLY
     vm_stop(VMSTOP_DEBUG);
+    vm_status_set(VMST_DEBUG);
 #endif
     s->state = RS_IDLE;
     va_start(va, fmt);
@@ -2432,6 +2433,7 @@ static void gdb_read_byte(GDBState *s, int ch)
         /* when the CPU is running, we cannot do anything except stop
            it when receiving a char */
         vm_stop(VMSTOP_USER);
+        vm_status_set(VMST_DEBUG);
     } else
 #endif
     {
@@ -2694,6 +2696,7 @@ static void gdb_chr_event(void *opaque, int event)
     switch (event) {
     case CHR_EVENT_OPENED:
         vm_stop(VMSTOP_USER);
+        vm_status_set(VMST_DEBUG);
         gdb_has_xml = 0;
         break;
     default:

Previous hunk has VMST_DEBUG with VMST_DEBUG. Odd.

@@ -2735,6 +2738,7 @@ static void gdb_sigterm_handler(int signal)
 {
     if (vm_running) {
         vm_stop(VMSTOP_USER);
+        vm_status_set(VMST_DEBUG);
     }
 }
 #endif
diff --git a/hw/ide/core.c b/hw/ide/core.c
index ca17a43..bf9df41 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -523,6 +523,7 @@ static int ide_handle_rw_error(IDEState *s, int error, int op)
         s->bus->error_status = op;
         bdrv_mon_event(s->bs, BDRV_ACTION_STOP, is_read);
         vm_stop(VMSTOP_DISKFULL);
+        vm_status_set(VMST_IOERROR);
     } else {
         if (op & BM_STATUS_DMA_RETRY) {
             dma_buf_commit(s, 0);
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index a8c7372..66037fd 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -216,6 +216,7 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, int type)
         bdrv_mon_event(s->bs, BDRV_ACTION_STOP, is_read);
         vm_stop(VMSTOP_DISKFULL);
+        vm_status_set(VMST_IOERROR);
     } else {
         if (type == SCSI_REQ_STATUS_RETRY_READ) {
             scsi_req_data(r->req, 0);
diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 91e0394..bf70200 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -79,6 +79,7 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
         s->rq = req;
         bdrv_mon_event(s->bs, BDRV_ACTION_STOP, is_read);
         vm_stop(VMSTOP_DISKFULL);
+        vm_status_set(VMST_IOERROR);
     }
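The pattern in the hunks above (every vm_stop() call site also records why the VM stopped) can be modelled in a few lines. The sketch below is Python rather than the patch's C and is not the sysemu.h definition, which is not shown in this excerpt: `VMStatus` here is a hypothetical enum built from the status names listed in the commit message as originally proposed, and `VMStatusTracker` is an invented helper.

```python
from enum import Enum


class VMStatus(Enum):
    """Status names as listed in the commit message (before any renames)."""
    DEBUG = "debug"
    INMIGRATE = "inmigrate"
    POSTMIGRATE = "postmigrate"
    INTERNAL_ERROR = "internal-error"
    LOAD_STATE_ERROR = "load-state-error"
    IO_ERROR = "io-error"
    WATCHDOG_ERROR = "watchdog-error"
    PAUSED = "paused"
    PRELAUNCH = "prelaunch"
    RUNNING = "running"
    SHUTDOWN = "shutdown"


class VMStatusTracker:
    """Hypothetical analogue of vm_status_set(): remembers the last status."""

    def __init__(self, initial):
        self._status = initial

    def set(self, status):
        if not isinstance(status, VMStatus):
            raise TypeError("status must be a VMStatus member")
        self._status = status

    def get(self):
        return self._status


tracker = VMStatusTracker(VMStatus.RUNNING)
# Analogue of an I/O error handler: stop the VM, then record the reason.
tracker.set(VMStatus.IO_ERROR)
print(tracker.get().value)
```

Keeping the externally visible names as the enum values is one way to let a monitor command report the status string directly; whether the patch does this is not visible in the quoted hunks.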
[Qemu-devel] [Bug 807893] Re: [PATCH] os-posix: set groups properly for -runas
* Stefan Hajnoczi (stefa...@linux.vnet.ibm.com) wrote:

@@ -199,6 +200,11 @@ static void change_process_uid(void)
             fprintf(stderr, "Failed to setgid(%d)\n", user_pwd->pw_gid);
             exit(1);
         }
+        if (initgroups(user_pwd->pw_name, user_pwd->pw_gid) < 0) {
+            fprintf(stderr, "Failed to initgroups(\"%s\", %d)\n",
+                    user_pwd->pw_name, user_pwd->pw_gid);
+            exit(1);
+        }

Does initgroups need access to /etc/group? How does this combine with -chroot?

Added bonus... this will fail when the initial user is not privileged _and_ is the same user as the -runas user (probably not what a user intended, but would've worked before). Something like:

[doh@laptop qemu]$ qemu -runas doh