Re: [Qemu-devel] [QAPI+QGA 3/3] QEMU Guest Agent (virtagent) v7
On Sat, Jul 16, 2011 at 12:27 AM, Luiz Capitulino lcapitul...@redhat.com wrote: On Fri, 15 Jul 2011 09:15:18 +0800 Zhi Yong Wu zwu.ker...@gmail.com wrote: On Fri, Jul 15, 2011 at 4:00 AM, Michael Roth mdr...@linux.vnet.ibm.com wrote: This is Set 3/3 of the QAPI+QGA patchsets. These patches apply on top of qapi-backport-set2-v6, and can also be obtained from: git://repo.or.cz/qemu/mdroth.git qapi-backport-set3-v7 (Set1+2 are a backport of some of the QAPI-related work from Anthony's glib tree. The main goal is to get the basic code generation infrastructure in place so that it can be used by the guest agent to implement a QMP-like guest interface, and so that future work regarding the QMP conversion to QAPI can be decoupled from the infrastructure bits. Set3 is the Qemu Guest Agent (virtagent), rebased on the new code QAPI code generation infrastructure. This is the first user of QAPI, QMP will follow.) ___ CHANGES SINCE V6: - Made /dev/virtio-ports/org.qemu.guest_agent.0 default device path for agent - Consolidated uneeded fcntl() calls into qemu_open() - JSON/QMP parse errors now propagated to client - Replaced non-assertion uses of g_error() with exit() - Added guest-file-flush - Removed limit on max read size for guest-file-read - 'count' parameters to guest-file-read/guest-file-write are now optional (default to 4KB and size of provided buffer, base64-decoded, respectively) - Removed redundant 'file_' and 'shutdown_' prefixes from guest-file-*/guest-shutdown commands, switched to - in place of _ in parameter names, renamed guest-file-read's buf param to buf-b64 and guest-file-write's data_b64 param to buf-b64 for consistency. - guest-fsfreeze-freeze now returns error objects on error rather as part of it's integer return values, and on error will unfreeze previously frozen filesystems. - GUEST_FSFREEZE_STATUS_INPROGRESS removed, GUEST_FSFREEZE_STATUS_ERROR now serves the explicit purpose of noting a failure to find a previously mounted filesytem/directory after initial freeze, or failure to unfreeze 1 or more filesystems. - -c/--channel option to qemu-ga is now -m/--method CHANGES SINCE V5: - switched to using qemu malloc/list functions where possible - removed unused proxy_path field in struct GAState - pid file now opened write-only, removed lockf() in favor of O_EXCL, added SIGINT/SIGTERM signal handlers to handle cleanup - cleaned up error-handling, switched to asserts where appropriate, removed unecessary gotos and NULL checks for qemu_free()/qobject_decref() - refactored send_payload() using helper functions - fixed improper handling of pidfile fd==0 - changed guest-shutdown's shutdown_mode param to mode - switched to using kernel-generated FDs for guest-file-open rather than an autoincrement value - add maximum chunk size of guest-file-read/guest-file-write - added checks to avoid guest-file-write from writing data beyond the provided data buffer - made logging best-effort, removed handling of failures to log as errors - guest-shutdown exec errors now logged to guest syslog, clarified shutdown's asynchronous, no gauruntee nature in schema. CHANGES SINCE V4: - Removed timeout mechanism via worker thread/pthread_cancel due to potential memory leak. Will re-introduce guest-side timeout support in future version. - Fixed up fsfreeze code to use enums specified within the guest agent's qapi schema. - Fixed memory leak due to a log statement, and added missing cleanup functions for heap-allocated g_error objects. - Made mode param to guest-file-open optional, defaults to r (read-only) CHANGES SINCE V3: - Fixed error-handling issues in fsfreeze commands leading to certain mounted directories causing freeze/thaw operations to fail - Added cleanup hook to thaw filesystems on graceful guest agent exit - Removed unused enum values and added additional details to schema documentation - Fixed build issue that was missed due to deprecated files in source tree, removed unused includes CHANGES SINCE V2: - Rebased on new QAPI code generation framework - Dropped ability for QMP to act as a proxy for the guest agent, will be added when new QMP server is backported from Anthony's glib tree - Replaced negotiation/control events with a simple 2-way handshake implemented by a standard RPC (guest-sync) - Removed enforcement of pristine sessions, state is now global/persistant across multiple clients/connections - Fixed segfault in logging code - Added Jes' filesystem freeze patches - General cleanups CHANGES SINCE V1: - Added guest agent worker thread to execute RPCs in the guest. With this in place we have a reliable timeout mechanism for hung commands, currently set at 30 seconds. - Add framework for registering init/cleanup routines for stateful
Re: [Qemu-devel] [PATCH 2/2] main: switch qemu_set_fd_handler to g_io_add_watch
On 09/06/2011 05:59 PM, Anthony Liguori wrote: So it should be possible to add a new Source type that just selects on a file descriptor and avoid GIOChannels? I think you still have the problem that glib on Windows waits for HANDLEs, not file descriptors. Also, I'm not sure it's worth it though as long as slirp still does its own fill/poll. Paolo
Re: [Qemu-devel] [PATCH v3 06/27] scsi-disk: Factor out scsi_disk_emulate_start_stop()
On 09/06/2011 06:58 PM, Markus Armbruster wrote: Signed-off-by: Markus Armbrusterarm...@redhat.com --- hw/scsi-disk.c | 17 + 1 files changed, 13 insertions(+), 4 deletions(-) diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index 9724d0f..c8ad2e7 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -814,6 +814,18 @@ static int scsi_disk_emulate_read_toc(SCSIRequest *req, uint8_t *outbuf) return toclen; } +static void scsi_disk_emulate_start_stop(SCSIDiskReq *r) +{ +SCSIRequest *req =r-req; +SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req-dev); +bool start = req-cmd.buf[4] 1; +bool loej = req-cmd.buf[4] 2; /* load on start, eject on !start */ + +if (s-qdev.type == TYPE_ROM loej) { +bdrv_eject(s-bs, !start); +} +} + static int scsi_disk_emulate_command(SCSIDiskReq *r, uint8_t *outbuf) { SCSIRequest *req =r-req; @@ -859,10 +871,7 @@ static int scsi_disk_emulate_command(SCSIDiskReq *r, uint8_t *outbuf) goto illegal_request; break; case START_STOP: -if (s-qdev.type == TYPE_ROM (req-cmd.buf[4] 2)) { -/* load/eject medium */ -bdrv_eject(s-bs, !(req-cmd.buf[4] 1)); -} +scsi_disk_emulate_start_stop(r); break; case ALLOW_MEDIUM_REMOVAL: bdrv_set_locked(s-bs, req-cmd.buf[4] 1); Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Re: [Qemu-devel] [PATCH v3 07/27] scsi-disk: Track tray open/close state
On 09/06/2011 06:58 PM, Markus Armbruster wrote: We already track it in BlockDriverState since commit 4be9762a. As discussed in that commit's message, we should track it in the device device models instead, because it's device state. Signed-off-by: Markus Armbrusterarm...@redhat.com --- hw/scsi-disk.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index c8ad2e7..f18ddd7 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -72,6 +72,7 @@ struct SCSIDiskState QEMUBH *bh; char *version; char *serial; +bool tray_open; }; static int scsi_handle_rw_error(SCSIDiskReq *r, int error, int type); @@ -823,6 +824,7 @@ static void scsi_disk_emulate_start_stop(SCSIDiskReq *r) if (s-qdev.type == TYPE_ROM loej) { bdrv_eject(s-bs, !start); +s-tray_open = !start; } } Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Re: [Qemu-devel] [PATCH v3 11/27] scsi-disk: Track tray locked state
On 09/06/2011 06:58 PM, Markus Armbruster wrote: We already track it in BlockDriverState. Just like tray open/close state, we should track it in the device models instead, because it's device state. Signed-off-by: Markus Armbrusterarm...@redhat.com --- hw/scsi-disk.c |4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index f35ada4..e7358e3 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -73,6 +73,7 @@ struct SCSIDiskState char *version; char *serial; bool tray_open; +bool tray_locked; }; static int scsi_handle_rw_error(SCSIDiskReq *r, int error, int type); @@ -671,7 +672,7 @@ static int mode_sense_page(SCSIDiskState *s, int page, uint8_t **p_outbuf, p[5] = 0xff; /* CD DA, DA accurate, RW supported, RW corrected, C2 errors, ISRC, UPC, Bar code */ -p[6] = 0x2d | (bdrv_is_locked(s-bs)? 2 : 0); +p[6] = 0x2d | (s-tray_locked ? 2 : 0); /* Locking supported, jumper present, eject, tray */ p[7] = 0; /* no volume mute control, no changer */ @@ -882,6 +883,7 @@ static int scsi_disk_emulate_command(SCSIDiskReq *r, uint8_t *outbuf) scsi_disk_emulate_start_stop(r); break; case ALLOW_MEDIUM_REMOVAL: +s-tray_locked = req-cmd.buf[4] 1; bdrv_set_locked(s-bs, req-cmd.buf[4] 1); break; case READ_CAPACITY_10: Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Re: [Qemu-devel] [PATCH v3 16/27] scsi-disk: Fix START_STOP to fail when it can't eject
On 09/06/2011 06:58 PM, Markus Armbruster wrote: Don't fail when tray is already open. Signed-off-by: Markus Armbrusterarm...@redhat.com --- hw/scsi-bus.c | 10 ++ hw/scsi-disk.c | 15 +++ hw/scsi.h |4 3 files changed, 25 insertions(+), 4 deletions(-) diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c index 160eaee..79cb29d 100644 --- a/hw/scsi-bus.c +++ b/hw/scsi-bus.c @@ -772,6 +772,11 @@ const struct SCSISense sense_code_NO_MEDIUM = { .key = NOT_READY, .asc = 0x3a, .ascq = 0x00 }; +/* LUN not ready, medium removal prevented */ +const struct SCSISense sense_code_NOT_READY_REMOVAL_PREVENTED = { +.key = NOT_READY, .asc = 0x53, .ascq = 0x00 +}; + /* Hardware error, internal target failure */ const struct SCSISense sense_code_TARGET_FAILURE = { .key = HARDWARE_ERROR, .asc = 0x44, .ascq = 0x00 @@ -807,6 +812,11 @@ const struct SCSISense sense_code_INCOMPATIBLE_MEDIUM = { .key = ILLEGAL_REQUEST, .asc = 0x30, .ascq = 0x00 }; +/* Illegal request, medium removal prevented */ +const struct SCSISense sense_code_ILLEGAL_REQ_REMOVAL_PREVENTED = { +.key = ILLEGAL_REQUEST, .asc = 0x53, .ascq = 0x00 +}; + /* Command aborted, I/O process terminated */ const struct SCSISense sense_code_IO_ERROR = { .key = ABORTED_COMMAND, .asc = 0x00, .ascq = 0x06 diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index 4e89bb1..1a49217 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -822,7 +822,7 @@ static int scsi_disk_emulate_read_toc(SCSIRequest *req, uint8_t *outbuf) return toclen; } -static void scsi_disk_emulate_start_stop(SCSIDiskReq *r) +static int scsi_disk_emulate_start_stop(SCSIDiskReq *r) { SCSIRequest *req =r-req; SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req-dev); @@ -830,12 +830,17 @@ static void scsi_disk_emulate_start_stop(SCSIDiskReq *r) bool loej = req-cmd.buf[4] 2; /* load on start, eject on !start */ if (s-qdev.type == TYPE_ROM loej) { -if (!start s-tray_locked) { -return; +if (!start !s-tray_open s-tray_locked) { +scsi_check_condition(r, + bdrv_is_inserted(s-bs) + ? SENSE_CODE(ILLEGAL_REQ_REMOVAL_PREVENTED) + : SENSE_CODE(NOT_READY_REMOVAL_PREVENTED)); +return -1; } bdrv_eject(s-bs, !start); s-tray_open = !start; } +return 0; } static int scsi_disk_emulate_command(SCSIDiskReq *r, uint8_t *outbuf) @@ -883,7 +888,9 @@ static int scsi_disk_emulate_command(SCSIDiskReq *r, uint8_t *outbuf) goto illegal_request; break; case START_STOP: -scsi_disk_emulate_start_stop(r); +if (scsi_disk_emulate_start_stop(r) 0) { +return -1; +} break; case ALLOW_MEDIUM_REMOVAL: s-tray_locked = req-cmd.buf[4] 1; diff --git a/hw/scsi.h b/hw/scsi.h index 98fd689..a28cd68 100644 --- a/hw/scsi.h +++ b/hw/scsi.h @@ -136,6 +136,8 @@ extern const struct SCSISense sense_code_NO_SENSE; extern const struct SCSISense sense_code_LUN_NOT_READY; /* LUN not ready, Medium not present */ extern const struct SCSISense sense_code_NO_MEDIUM; +/* LUN not ready, medium removal prevented */ +extern const struct SCSISense sense_code_NOT_READY_REMOVAL_PREVENTED; /* Hardware error, internal target failure */ extern const struct SCSISense sense_code_TARGET_FAILURE; /* Illegal request, invalid command operation code */ @@ -150,6 +152,8 @@ extern const struct SCSISense sense_code_LUN_NOT_SUPPORTED; extern const struct SCSISense sense_code_SAVING_PARAMS_NOT_SUPPORTED; /* Illegal request, Incompatible format */ extern const struct SCSISense sense_code_INCOMPATIBLE_FORMAT; +/* Illegal request, medium removal prevented */ +extern const struct SCSISense sense_code_ILLEGAL_REQ_REMOVAL_PREVENTED; /* Command aborted, I/O process terminated */ extern const struct SCSISense sense_code_IO_ERROR; /* Command aborted, I_T Nexus loss occurred */ Matches MMC6, paragraph 6.42.3.1. Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Re: [Qemu-devel] [PATCH] Add options to disable build with debug symbols and override optimization flags.
On Wed, Sep 7, 2011 at 3:59 AM, Brad b...@comstyle.com wrote: Add --disable-debug-symbols to disable building with debug symbols and --optflags to override the optimization flags passed to the compiler. --- configure | 18 +++--- 1 files changed, 15 insertions(+), 3 deletions(-) QEMU builds with debug symbols. But during make install the binary is stripped unless you specify --disable-strip. What is the need for --disable-debug-symbols? Stefan
Re: [Qemu-devel] [PATCH v3 27/27] ide/atapi scsi-disk: Make monitor eject -f, then change work
On 09/06/2011 06:59 PM, Markus Armbruster wrote: change fails while the tray is locked by the guest. eject -f forces it open and removes any media. Unfortunately, the tray closes again instantly. Since the lock remains as it is, there is no way to insert another medium unless the guest voluntarily unlocks. Fix by leaving the tray open after monitor eject. Signed-off-by: Markus Armbrusterarm...@redhat.com --- blockdev.c |3 ++- hw/ide/core.c |1 + hw/scsi-disk.c |1 + 3 files changed, 4 insertions(+), 1 deletions(-) diff --git a/blockdev.c b/blockdev.c index 154cc84..0827bf7 100644 --- a/blockdev.c +++ b/blockdev.c @@ -635,7 +635,8 @@ static int eject_device(Monitor *mon, BlockDriverState *bs, int force) qerror_report(QERR_DEVICE_NOT_REMOVABLE, bdrv_get_device_name(bs)); return -1; } -if (!force bdrv_dev_is_medium_locked(bs)) { +if (!force !bdrv_dev_is_tray_open(bs) + bdrv_dev_is_medium_locked(bs)) { qerror_report(QERR_DEVICE_LOCKED, bdrv_get_device_name(bs)); return -1; } diff --git a/hw/ide/core.c b/hw/ide/core.c index bc83c46..db144aa 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -789,6 +789,7 @@ static void ide_cd_change_cb(void *opaque, bool load) IDEState *s = opaque; uint64_t nb_sectors; +s-tray_open = !load; bdrv_get_geometry(s-bs,nb_sectors); s-nb_sectors = nb_sectors; diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index f5f1d82..4a60820 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -1175,6 +1175,7 @@ static void scsi_destroy(SCSIDevice *dev) static void scsi_cd_change_media_cb(void *opaque, bool load) { +((SCSIDiskState *)opaque)-tray_open = !load; } static bool scsi_cd_is_tray_open(void *opaque) Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Re: [Qemu-devel] [PATCH v3 17/27] ide/atapi: Preserve tray state on migration
On 09/06/2011 06:58 PM, Markus Armbruster wrote: Use a subsection, so that migration to older version still works, provided the tray is closed and unlocked. Signed-off-by: Markus Armbrusterarm...@redhat.com --- hw/ide/core.c | 32 1 files changed, 32 insertions(+), 0 deletions(-) diff --git a/hw/ide/core.c b/hw/ide/core.c index b1a73ee..30cb7de 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -2058,6 +2058,22 @@ static bool ide_drive_pio_state_needed(void *opaque) || (s-bus-error_status BM_STATUS_PIO_RETRY); } +static int ide_tray_state_post_load(void *opaque, int version_id) +{ +IDEState *s = opaque; + +bdrv_eject(s-bs, s-tray_open); +bdrv_lock_medium(s-bs, s-tray_locked); +return 0; +} + +static bool ide_tray_state_needed(void *opaque) +{ +IDEState *s = opaque; + +return s-tray_open || s-tray_locked; I wonder if the most common case is this, or rather tray closed and locked. Perhaps it depends (for Windows it is likely unlocked, for Linux locked). In any case there's time before 1.0 to fix this, so Reviewed-by: Paolo Bonzini pbonz...@redhat.com Paolo
Re: [Qemu-devel] [PATCH v3 17/27] ide/atapi: Preserve tray state on migration
Am 07.09.2011 09:14, schrieb Paolo Bonzini: On 09/06/2011 06:58 PM, Markus Armbruster wrote: Use a subsection, so that migration to older version still works, provided the tray is closed and unlocked. Signed-off-by: Markus Armbrusterarm...@redhat.com --- hw/ide/core.c | 32 1 files changed, 32 insertions(+), 0 deletions(-) diff --git a/hw/ide/core.c b/hw/ide/core.c index b1a73ee..30cb7de 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -2058,6 +2058,22 @@ static bool ide_drive_pio_state_needed(void *opaque) || (s-bus-error_status BM_STATUS_PIO_RETRY); } +static int ide_tray_state_post_load(void *opaque, int version_id) +{ +IDEState *s = opaque; + +bdrv_eject(s-bs, s-tray_open); +bdrv_lock_medium(s-bs, s-tray_locked); +return 0; +} + +static bool ide_tray_state_needed(void *opaque) +{ +IDEState *s = opaque; + +return s-tray_open || s-tray_locked; I wonder if the most common case is this, or rather tray closed and locked. Perhaps it depends (for Windows it is likely unlocked, for Linux locked). In any case there's time before 1.0 to fix this, so I would argue that the common case even for Linux is that you don't have a CD mounted (probably the drive is empty anyway). Kevin
[Qemu-devel] [PATCH 3/6] qxl: send interrupt after migration in case ram-int_pending != 0, RHBZ #732949
From: Yonit Halperin yhalp...@redhat.com if qxl_send_events was called from spice server context, and then migration had completed before a call to pipe_read, the target guest qxl driver didn't get the interrupt. In addition, qxl_send_events ignored further interrupts of the same kind, since ram-int_pending was set. As a result, the guest driver was stacked or very slow (when the waiting for the interrupt was with timeout). Signed-off-by: Yonit Halperin yhalp...@redhat.com Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/qxl.c | 10 -- 1 files changed, 8 insertions(+), 2 deletions(-) diff --git a/hw/qxl.c b/hw/qxl.c index 1fe0b53..7bb2560 100644 --- a/hw/qxl.c +++ b/hw/qxl.c @@ -1362,7 +1362,6 @@ static void pipe_read(void *opaque) qxl_set_irq(d); } -/* called from spice server thread context only */ static void qxl_send_events(PCIQXLDevice *d, uint32_t events) { uint32_t old_pending; @@ -1459,7 +1458,14 @@ static void qxl_vm_change_state_handler(void *opaque, int running, int reason) PCIQXLDevice *qxl = opaque; qemu_spice_vm_change_state_handler(qxl-ssd, running, reason); -if (!running qxl-mode == QXL_MODE_NATIVE) { +if (running) { +/* + * if qxl_send_events was called from spice server context before + * migration ended, qxl_set_irq for these events might not have been + * called + */ + qxl_set_irq(qxl); +} else if (qxl-mode == QXL_MODE_NATIVE) { /* dirty all vram (which holds surfaces) and devram (primary surface) * to make sure they are saved */ /* FIXME #1: should go out during live stage */ -- 1.7.1
[Qemu-devel] [PATCH 4/6] qxl: s/qxl_set_irq/qxl_update_irq/
From: Yonit Halperin yhalp...@redhat.com Signed-off-by: Yonit Halperin yhalp...@redhat.com Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/qxl.c | 12 ++-- 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/hw/qxl.c b/hw/qxl.c index 7bb2560..a282d23 100644 --- a/hw/qxl.c +++ b/hw/qxl.c @@ -808,7 +808,7 @@ static void qxl_exit_vga_mode(PCIQXLDevice *d) qxl_destroy_primary(d, QXL_SYNC); } -static void qxl_set_irq(PCIQXLDevice *d) +static void qxl_update_irq(PCIQXLDevice *d) { uint32_t pending = le32_to_cpu(d-ram-int_pending); uint32_t mask= le32_to_cpu(d-ram-int_mask); @@ -1209,7 +1209,7 @@ async_common: qemu_spice_wakeup(d-ssd); break; case QXL_IO_UPDATE_IRQ: -qxl_set_irq(d); +qxl_update_irq(d); break; case QXL_IO_NOTIFY_OOM: if (!SPICE_RING_IS_EMPTY(d-ram-release_ring)) { @@ -1359,7 +1359,7 @@ static void pipe_read(void *opaque) do { len = read(d-pipe[0], dummy, sizeof(dummy)); } while (len == sizeof(dummy)); -qxl_set_irq(d); +qxl_update_irq(d); } static void qxl_send_events(PCIQXLDevice *d, uint32_t events) @@ -1373,7 +1373,7 @@ static void qxl_send_events(PCIQXLDevice *d, uint32_t events) return; } if (pthread_self() == d-main) { -qxl_set_irq(d); +qxl_update_irq(d); } else { if (write(d-pipe[1], d, 1) != 1) { dprint(d, 1, %s: write to pipe failed\n, __FUNCTION__); @@ -1461,10 +1461,10 @@ static void qxl_vm_change_state_handler(void *opaque, int running, int reason) if (running) { /* * if qxl_send_events was called from spice server context before - * migration ended, qxl_set_irq for these events might not have been + * migration ended, qxl_update_irq for these events might not have been * called */ - qxl_set_irq(qxl); + qxl_update_irq(qxl); } else if (qxl-mode == QXL_MODE_NATIVE) { /* dirty all vram (which holds surfaces) and devram (primary surface) * to make sure they are saved */ -- 1.7.1
[Qemu-devel] [PATCH 5/6] spice: set qxl-ssd.running=true before telling spice to start, RHBZ #733993
From: Yonit Halperin yhalp...@redhat.com If qxl-ssd.running=true is set after telling spice to start, the spice server thread can call qxl_send_events while qxl-ssd.running is still false. This leads to assert(d-ssd.running). Signed-off-by: Yonit Halperin yhalp...@redhat.com Signed-off-by: Gerd Hoffmann kra...@redhat.com --- ui/spice-display.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/ui/spice-display.c b/ui/spice-display.c index 4983963..e385361 100644 --- a/ui/spice-display.c +++ b/ui/spice-display.c @@ -260,11 +260,12 @@ void qemu_spice_vm_change_state_handler(void *opaque, int running, int reason) SimpleSpiceDisplay *ssd = opaque; if (running) { +ssd-running = true; qemu_spice_start(ssd); } else { qemu_spice_stop(ssd); +ssd-running = false; } -ssd-running = running; } void qemu_spice_display_init_common(SimpleSpiceDisplay *ssd, DisplayState *ds) -- 1.7.1
[Qemu-devel] [PULL] spice patch queue
Hi, Here is the spice patch queue with a collection of bugfixes. A workaround for the much discussed spice-calls-us-from-wrong-thread issue is included because it turned out to be not *that* easily fixable in spice so it will probably take some time. Also a spice server fix wouldn't cover already released spice versions. cheers, Gerd The following changes since commit 344eecf6995f4a0ad1d887cec922f6806f91a3f8: mips: Support the MT TCStatus IXMT irq disable flag (2011-09-06 11:09:39 +0200) are available in the git repository at: git://anongit.freedesktop.org/spice/qemu spice.v42 Gerd Hoffmann (1): spice: workaround a spice server bug. Peter Maydell (2): spice-qemu-char.c: Use correct printf format char for ssize_t hw/qxl: Fix format string errors Yonit Halperin (3): qxl: send interrupt after migration in case ram-int_pending != 0, RHBZ #732949 qxl: s/qxl_set_irq/qxl_update_irq/ spice: set qxl-ssd.running=true before telling spice to start, RHBZ #733993 hw/qxl-logger.c|2 +- hw/qxl.c | 26 -- spice-qemu-char.c |2 +- ui/spice-core.c| 25 - ui/spice-display.c |3 ++- 5 files changed, 44 insertions(+), 14 deletions(-)
[Qemu-devel] [PATCH 2/6] hw/qxl: Fix format string errors
From: Peter Maydell peter.mayd...@linaro.org Fix format string errors causing compile failure on 32 bit hosts when spice is enabled. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/qxl-logger.c |2 +- hw/qxl.c|8 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/hw/qxl-logger.c b/hw/qxl-logger.c index 74cadba..367aad1 100644 --- a/hw/qxl-logger.c +++ b/hw/qxl-logger.c @@ -224,7 +224,7 @@ void qxl_log_command(PCIQXLDevice *qxl, const char *ring, QXLCommandExt *ext) if (!qxl-cmdlog) { return; } -fprintf(stderr, %ld qxl-%d/%s:, qemu_get_clock_ns(vm_clock), +fprintf(stderr, % PRId64 qxl-%d/%s:, qemu_get_clock_ns(vm_clock), qxl-id, ring); fprintf(stderr, cmd @ 0x% PRIx64 %s%s, ext-cmd.data, qxl_name(qxl_type, ext-cmd.type), diff --git a/hw/qxl.c b/hw/qxl.c index 45e2401..1fe0b53 100644 --- a/hw/qxl.c +++ b/hw/qxl.c @@ -959,7 +959,7 @@ static void qxl_add_memslot(PCIQXLDevice *d, uint32_t slot_id, uint64_t delta, memslot.generation = d-rom-slot_generation = 0; qxl_rom_set_dirty(d); -dprint(d, 1, %s: slot %d: host virt 0x% PRIx64 - 0x% PRIx64 \n, +dprint(d, 1, %s: slot %d: host virt 0x%lx - 0x%lx\n, __FUNCTION__, memslot.slot_id, memslot.virt_start, memslot.virt_end); @@ -1090,8 +1090,8 @@ static void qxl_set_mode(PCIQXLDevice *d, int modenr, int loadvm) .mem= devmem + d-shadow_rom.draw_area_offset, }; -dprint(d, 1, %s: mode %d [ %d x %d @ %d bpp devmem 0x%lx ]\n, __FUNCTION__, - modenr, mode-x_res, mode-y_res, mode-bits, devmem); +dprint(d, 1, %s: mode %d [ %d x %d @ %d bpp devmem 0x% PRIx64 ]\n, + __func__, modenr, mode-x_res, mode-y_res, mode-bits, devmem); if (!loadvm) { qxl_hard_reset(d, 0); } @@ -1229,7 +1229,7 @@ async_common: break; case QXL_IO_LOG: if (d-guestdebug) { -fprintf(stderr, qxl/guest-%d: %ld: %s, d-id, +fprintf(stderr, qxl/guest-%d: % PRId64 : %s, d-id, qemu_get_clock_ns(vm_clock), d-ram-log_buf); } break; -- 1.7.1
Re: [Qemu-devel] Compilation error of coroutine-win32.c with gcc version 3.4.5 (mingw-vista special r3)
2011/8/23 Roy Tam roy...@gmail.com: 2011/8/20 Stefan Weil w...@mail.berlios.de: Am 16.08.2011 20:50, schrieb Paolo Bonzini: On 08/16/2011 05:45 AM, Stefan Hajnoczi wrote: Roy, This stack trace does not reveal much. Is there any MinGW gcc user that has successfully built and run qemu.git? I would be surprised if Stefan Weil hasn't. How did you know that? Which version/architecture of Windows and which MinGW version? QEMU currently only supports 32 bit Windows executables which work on any current Windows version version (XP and later, 32 and 64 bit). I use latest MinGW from http://www.mingw.org/ (native compilation on Windows) and Debian cross compilations. Those are available from http://qemu.weilnetz.de/w32/. Did you cross compile it to get those binaries? They run fine here, but not mine. I tried rebuilding all dependencies(gettext, libiconv, glib, zlib) under gcc 4.6.1 but I still get same error with latest git revision. I wonder if there is someone (excluding me) can build and run QEMU successfully under Windows(MinGW+MSYS) environment. I finally downgrade MinGW GCC from 4.6.1(xvidvideo.ru) to 4.3.3-tdm and recompile all dependency and QEMU works well at the end.
[Qemu-devel] [PATCH 1/6] spice-qemu-char.c: Use correct printf format char for ssize_t
From: Peter Maydell peter.mayd...@linaro.org Use the correct printf format string character (%z) for ssize_t. This fixes a compile failure on 32 bit Linux with spice enabled. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Gerd Hoffmann kra...@redhat.com --- spice-qemu-char.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/spice-qemu-char.c b/spice-qemu-char.c index a9323f3..ac52202 100644 --- a/spice-qemu-char.c +++ b/spice-qemu-char.c @@ -45,7 +45,7 @@ static int vmc_write(SpiceCharDeviceInstance *sin, const uint8_t *buf, int len) p += last_out; } -dprintf(scd, 3, %s: %lu/%zd\n, __func__, out, len + out); +dprintf(scd, 3, %s: %zu/%zd\n, __func__, out, len + out); trace_spice_vmc_write(out, len + out); return out; } -- 1.7.1
Re: [Qemu-devel] [PATCH v3 17/27] ide/atapi: Preserve tray state on migration
On 09/07/2011 09:35 AM, Kevin Wolf wrote: I wonder if the most common case is this, or rather tray closed and locked. Perhaps it depends (for Windows it is likely unlocked, for Linux locked). In any case there's time before 1.0 to fix this, so I would argue that the common case even for Linux is that you don't have a CD mounted (probably the drive is empty anyway). Indeed---I was thinking about automounting by a GUI, but that's not the typical server case. Paolo
[Qemu-devel] [PATCH 6/6] spice: workaround a spice server bug.
spice server might call the channel_event callback from spice server thread context. Detect that and aquire iothread lock if needed, --- ui/spice-core.c | 25 - 1 files changed, 24 insertions(+), 1 deletions(-) diff --git a/ui/spice-core.c b/ui/spice-core.c index dba11f0..3cbc721 100644 --- a/ui/spice-core.c +++ b/ui/spice-core.c @@ -19,6 +19,7 @@ #include spice-experimental.h #include netdb.h +#include pthread.h #include qemu-common.h #include qemu-spice.h @@ -44,6 +45,8 @@ static char *auth_passwd; static time_t auth_expires = TIME_MAX; int using_spice = 0; +static pthread_t me; + struct SpiceTimer { QEMUTimer *timer; QTAILQ_ENTRY(SpiceTimer) next; @@ -217,6 +220,20 @@ static void channel_event(int event, SpiceChannelEventInfo *info) QDict *server, *client; QObject *data; +/* + * Spice server might have called us from spice worker thread + * context (happens on display channel disconnects). Spice should + * not do that. It isn't that easy to fix it in spice and even + * when it is fixed we still should cover the already released + * spice versions. So detect that we've been called from another + * thread and grab the iothread lock if so before calling qemu + * functions. + */ +bool need_lock = !pthread_equal(me, pthread_self()); +if (need_lock) { +qemu_mutex_lock_iothread(); +} + client = qdict_new(); add_addr_info(client, info-paddr, info-plen); @@ -236,6 +253,10 @@ static void channel_event(int event, SpiceChannelEventInfo *info) QOBJECT(client), QOBJECT(server)); monitor_protocol_event(qevent[event], data); qobject_decref(data); + +if (need_lock) { +qemu_mutex_unlock_iothread(); +} } #else /* SPICE_INTERFACE_CORE_MINOR = 3 */ @@ -482,7 +503,9 @@ void qemu_spice_init(void) spice_image_compression_t compression; spice_wan_compression_t wan_compr; -if (!opts) { +me = pthread_self(); + + if (!opts) { return; } port = qemu_opt_get_number(opts, port, 0); -- 1.7.1
Re: [Qemu-devel] [FIX] X86 CPU topology broken in KVM mode
On 2011-09-07 06:21, Bharata B Rao wrote: Hi, Sometime back I posted a patch for fixing x86 CPU topology ( http://lists.gnu.org/archive/html/qemu-devel/2011-08/msg02022.html). Here is the next version of the fix which addresses all but one comment received during that post. - Fixed code style issues - Ensured that the fix doesn't break TCG mode - I am not sure what is the problem with i486 as I haven't been able to boot an i486 VM successfully, hence haven't attempted to fix this. -smp 2 -cpu i486 boots fine here (granted, I don't have some i486 SMP kernel at hand). I have tested following scenarios and found the fix to be working fine. KVM: (with --enable-kvm) -smp sockets=1,cores=4,threads=2 -smp sockets=4,cores=4,threads=2 -cpu core2duo sockets=1,cores=4,threads=2 -cpu core2duo sockets=2,cores=4,threads=2 TCG: (without --enable-kvm) -cpu core2duo sockets=1,cores=4,threads=2 -cpu core2duo sockets=2,cores=4,threads=2 Here is the updated patch which now applies against qemu.git. Fix apic id enumeration apic id returned to guest kernel in ebx for cpuid(function=1) depends on CPUX86State-cpuid_apic_id which gets populated after the cpuid information is cached in the host kernel. Fix this by setting cpuid_apic_id before cpuid information is passed to the host kernel. This is done by moving the setting of cpuid_apic_id to cpu_x86_init() where it will work for both KVM as well as TCG modes. Signed-off-by: Bharata B Rao bharata@gmail.com --- hw/pc.c |1 - target-i386/helper.c |5 + 2 files changed, 5 insertions(+), 1 deletion(-) Index: qemu/hw/pc.c === --- qemu.orig/hw/pc.c +++ qemu/hw/pc.c @@ -933,7 +933,6 @@ static CPUState *pc_new_cpu(const char * exit(1); } if ((env-cpuid_features CPUID_APIC) || smp_cpus 1) { -env-cpuid_apic_id = env-cpu_index; env-apic_state = apic_init(env, env-cpuid_apic_id); } qemu_register_reset(pc_cpu_reset, env); Index: qemu/target-i386/helper.c === --- qemu.orig/target-i386/helper.c +++ qemu/target-i386/helper.c @@ -1256,6 +1256,11 @@ CPUX86State *cpu_x86_init(const char *cp cpu_x86_close(env); return NULL; } + +if (env-cpuid_features CPUID_APIC) { || smp_cpus 1 Should be obvious when looking at the hunk you took this from. +env-cpuid_apic_id = env-cpu_index; +} + mce_init(env); qemu_init_vcpu(env); * Regards, Bharata. -- http://bharata.sulekha.com/blog/posts.htm, http://raobharata.wordpress.com/ Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 2/2] main: switch qemu_set_fd_handler to g_io_add_watch
On 2011-09-07 09:03, Paolo Bonzini wrote: On 09/06/2011 05:59 PM, Anthony Liguori wrote: So it should be possible to add a new Source type that just selects on a file descriptor and avoid GIOChannels? I think you still have the problem that glib on Windows waits for HANDLEs, not file descriptors. Also, I'm not sure it's worth it though as long as slirp still does its own fill/poll. The latter can surely be improved, just takes someone to put on some gloves and dig in the dirt... Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [FIX] X86 CPU topology broken in KVM mode
On Wed, Sep 7, 2011 at 1:37 PM, Jan Kiszka jan.kis...@siemens.com wrote: On 2011-09-07 06:21, Bharata B Rao wrote: - I am not sure what is the problem with i486 as I haven't been able to boot an i486 VM successfully, hence haven't attempted to fix this. -smp 2 -cpu i486 boots fine here (granted, I don't have some i486 SMP kernel at hand). I am getting Unable to find x86 CPU definition error with -cpu i486. Need to investigate more. + + if (env-cpuid_features CPUID_APIC) { || smp_cpus 1 Should be obvious when looking at the hunk you took this from. Yes, but I thought no harm in initializing it for uni processor case too, no ? Regards, Bharata.
Re: [Qemu-devel] [FIX] X86 CPU topology broken in KVM mode
On 2011-09-07 10:19, Bharata B Rao wrote: On Wed, Sep 7, 2011 at 1:37 PM, Jan Kiszka jan.kis...@siemens.com wrote: On 2011-09-07 06:21, Bharata B Rao wrote: - I am not sure what is the problem with i486 as I haven't been able to boot an i486 VM successfully, hence haven't attempted to fix this. -smp 2 -cpu i486 boots fine here (granted, I don't have some i486 SMP kernel at hand). I am getting Unable to find x86 CPU definition error with -cpu i486. Need to investigate more. Err, sorry: -cpu 486 + +if (env-cpuid_features CPUID_APIC) { || smp_cpus 1 Should be obvious when looking at the hunk you took this from. Yes, but I thought no harm in initializing it for uni processor case too, no ? 486 CPUs do not have the CPUID_APIC feature set as they do not include a local APIC. But those SMP systems have external APICs. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PULL] usb patch queue
Hi, are available in the git repository at: git://git.kraxel.org/qemu usb.25 Pushed new branch usb.26. Rebased to latest master, solved conflicts due to tracing merge, adapted to tracing changes (disabled not needed any more in trace-events). Squashed in a warning fix (init port variable) here: usb-host: parse port in /proc/bus/usb/devices scan Don't feel like spamming the list with these minor changes. But can do a full repost if prefered. please pull, Gerd
[Qemu-devel] [PATCH v1] domain_conf: add the support for disk I/O throttle setting
The first patch is only used to see if it is suitable for exteeding blkiotune to implement disk I/O throttling. As you have known, when blkiotune is issued without options, it will display current tuning parameters; If we exceed it, without options, what should it display? both info will? or should one new option be added to separately display them? Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com --- src/conf/domain_conf.c | 18 ++ src/conf/domain_conf.h | 11 +++ 2 files changed, 29 insertions(+), 0 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index cce9955..7dd350a 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -9065,6 +9065,24 @@ virDomainDiskDefFormat(virBufferPtr buf, virBufferAsprintf(buf, target dev='%s' bus='%s'/\n, def-dst, bus); +/*disk I/O throttling*/ +if (def-blkio.blkiothrottle) { +virBufferAsprintf(buf, blkiothrottle\n); +virBufferAsprintf(buf, bps%llu/bps\n, + def-blkiothrottle.bps); +virBufferAsprintf(buf, bps_rd%llu/bps_rd\n, + def-blkiothrottle.bps_rd); +virBufferAsprintf(buf, bps_wr%llu/bps_wr\n, + def-blkiothrottle.bps_wr); +virBufferAsprintf(buf, iops%llu/iops\n, + def-blkiothrottle.iops); +virBufferAsprintf(buf, iops_rd%llu/iops_rd\n, + def-blkiothrottle.iops_rd); +virBufferAsprintf(buf, iops_wr%llu/iops_wr\n, + def-blkiothrottle.iops_wr); +virBufferAsprintf(buf, /blkiothrottle\n); +} + if (def-bootIndex) virBufferAsprintf(buf, boot order='%d'/\n, def-bootIndex); if (def-readonly) diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index e218a30..5902377 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -258,6 +258,17 @@ struct _virDomainDiskDef { virDomainDiskHostDefPtr hosts; char *driverName; char *driverType; + +/*disk I/O throttling*/ +struct { +unsigned long long bps; +unsigned long long bps_rd; +unsigned long long bps_wr; +unsigned long long iops; +unsigned long long iops_rd; +unsigned long long iops_wr; +} blkiothrottle; + char *serial; int cachemode; int error_policy; -- 1.7.6
[Qemu-devel] sorry, pls ignore, it is not correct.Re: [PATCH v1] domain_conf: add the support for disk I/O throttle setting
On Wed, Sep 07, 2011 at 04:59:55PM +0800, Zhi Yong Wu wrote: From: Zhi Yong Wu wu...@linux.vnet.ibm.com To: qemu-devel@nongnu.org Cc: stefa...@linux.vnet.ibm.com, a...@us.ibm.com, zwu.ker...@gmail.com, Zhi Yong Wu wu...@linux.vnet.ibm.com Subject: [PATCH v1] domain_conf: add the support for disk I/O throttle setting Date: Wed, 7 Sep 2011 16:59:55 +0800 Message-Id: 1315385995-23283-1-git-send-email-wu...@linux.vnet.ibm.com X-Mailer: git-send-email 1.7.6 X-Xagent-From: wu...@linux.vnet.ibm.com X-Xagent-To: wu...@linux.vnet.ibm.com X-Xagent-Gateway: vmsdvm6.vnet.ibm.com (XAGENTU3 at VMSDVM6) The first patch is only used to see if it is suitable for exteeding blkiotune to implement disk I/O throttling. As you have known, when blkiotune is issued without options, it will display current tuning parameters; If we exceed it, without options, what should it display? both info will? or should one new option be added to separately display them? Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com --- src/conf/domain_conf.c | 18 ++ src/conf/domain_conf.h | 11 +++ 2 files changed, 29 insertions(+), 0 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index cce9955..7dd350a 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -9065,6 +9065,24 @@ virDomainDiskDefFormat(virBufferPtr buf, virBufferAsprintf(buf, target dev='%s' bus='%s'/\n, def-dst, bus); +/*disk I/O throttling*/ +if (def-blkio.blkiothrottle) { +virBufferAsprintf(buf, blkiothrottle\n); +virBufferAsprintf(buf, bps%llu/bps\n, + def-blkiothrottle.bps); +virBufferAsprintf(buf, bps_rd%llu/bps_rd\n, + def-blkiothrottle.bps_rd); +virBufferAsprintf(buf, bps_wr%llu/bps_wr\n, + def-blkiothrottle.bps_wr); +virBufferAsprintf(buf, iops%llu/iops\n, + def-blkiothrottle.iops); +virBufferAsprintf(buf, iops_rd%llu/iops_rd\n, + def-blkiothrottle.iops_rd); +virBufferAsprintf(buf, iops_wr%llu/iops_wr\n, + def-blkiothrottle.iops_wr); +virBufferAsprintf(buf, /blkiothrottle\n); +} + if (def-bootIndex) virBufferAsprintf(buf, boot order='%d'/\n, def-bootIndex); if (def-readonly) diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index e218a30..5902377 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -258,6 +258,17 @@ struct _virDomainDiskDef { virDomainDiskHostDefPtr hosts; char *driverName; char *driverType; + +/*disk I/O throttling*/ +struct { +unsigned long long bps; +unsigned long long bps_rd; +unsigned long long bps_wr; +unsigned long long iops; +unsigned long long iops_rd; +unsigned long long iops_wr; +} blkiothrottle; + char *serial; int cachemode; int error_policy; -- 1.7.6
[Qemu-devel] [PATCH] qcow2: make cache=unsafe usable
Currently cache=unsafe is unsafe to the point of unusability - the caches are never written to disk except on exit so anything except an orderly exit -- including live migration -- leaves the disk image corrupted. Fix by interpreting flush requests and doing everything except flushing the underlying file. The contents of the metadata cache are transferred to the host pagecache, so that qemu aborts keep the disk in a consistent state, and live migration (on the same host, or if using a coherent filesystem) works. Signed-off-by: Avi Kivity a...@redhat.com --- Untested - is this the right approach? block/qcow2.c |7 +++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index bfff6cd..7ecd096 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -275,6 +275,13 @@ static int qcow2_open(BlockDriverState *bs, int flags) ret = -EINVAL; goto fail; } +/* + * Request flush callbask so that we can write metadata to the host + * pagecache. Flushes to bs-file will still be ignored. This keeps + * metadata consistent in host pagecache, so we're safe wrt unexpected + * exits, but avoids slow disk flushes (and is vulnerable to host crashes) + */ +bs-open_flags = ~BDRV_O_NO_FLUSH; /* Initialise locks */ qemu_co_mutex_init(s-lock); -- 1.7.6.1
Re: [Qemu-devel] [PATCH 0/6] Device state visualization reloaded
Am 06.09.2011 19:05, schrieb Michael S. Tsirkin: On Tue, Sep 06, 2011 at 11:28:09AM -0500, Anthony Liguori wrote: On 09/06/2011 11:09 AM, Michael S. Tsirkin wrote: On Tue, Sep 06, 2011 at 10:51:26AM -0500, Anthony Liguori wrote: On 09/06/2011 10:45 AM, Jan Kiszka wrote: On 2011-09-06 16:48, Michael S. Tsirkin wrote: I'm afraid that won't be enough to stop people scripting this command - libvirt accessed HMP for years. On the other hand, no QMP command means e.g. libvirt users don't get any benefit from this. What I think will solve these problems, for both HMP and QMP, is an explicit 'debug_unstable' or 'debug_unsupported' command that will expose all kind of debugging functionality making it very explicit that it's an unsupported debugging utility. Proposed syntax: debug_unstablesubcommand options Example: debug_unstable device_show -all For HMP, this would needlessly complicate the user interface, nothing I would support. People scripting things on top of HMP are generally doing this on their own risk and cannot expect output stability. device_show is like info qtree: the output will naturally change as the emulated hardware evolves, information is added/removed, or we simply improve the layout. Recent changes on info network are an example for the latter. Yeah, I'm not worried about stability. HMP commands that aren't exposed as QMP commands are inherently unstable and should not be scripted to. They are also not accessible when using libvirt, right? $ virsh human-monitor-passthrough GuestName device_show foo Should work. So how, in the end, will user know it's unsupported? I don't agree with 'all HMP is unstable' as people will use it and will come to depend on it. Why unsupported? It's just not a stable API to script against. It's a user interface, not a programming interface. So as long as you use it manually, that's perfectly fine. Would you expect that virt-manager never changes its GUI because there might be some scripts that try to achieve things by screen scraping? The interface is just not meant for such uses, and if you use it in that way and it breaks with the next version it's your own fault. Kevin
Re: [Qemu-devel] [PATCH] qcow2: make cache=unsafe usable
On 07.09.2011, at 11:24, Avi Kivity wrote: Currently cache=unsafe is unsafe to the point of unusability - the caches are never written to disk except on exit so anything except an orderly exit -- including live migration -- leaves the disk image corrupted. Fix by interpreting flush requests and doing everything except flushing the underlying file. The contents of the metadata cache are transferred to the host pagecache, so that qemu aborts keep the disk in a consistent state, and live migration (on the same host, or if using a coherent filesystem) works. Yes, I've seen breakage with cache=unsafe and qcow2 myself. Thus semantically, the patch seems very reasonable to me. However, I'll leave it to Kevin to decide if it's a good idea to just unset random flags in open() or if we want to have something more expressive there :) Alex
Re: [Qemu-devel] [PATCH 6/6] qdev: Generate IDs for anonymous devices
On Wed, Aug 31, 2011 at 08:31:26PM +0200, Jan Kiszka wrote: On 2011-08-29 23:19, Anthony Liguori wrote: On 08/29/2011 03:56 PM, Jan Kiszka wrote: On 2011-08-29 21:23, Anthony Liguori wrote: On 08/26/2011 09:48 AM, Jan Kiszka wrote: In order to address devices for that the user forgot or is even unable (no_user) to provide an ID, assign an automatically generated one. Such IDs have the format #number, thus are outside the name space availing to users. Don't use them for bus naming to avoid any other user-visible change. I don't think this is a very nice approach. Why not eliminate anonymous devices entirely and use a parent derived name for devices that are not created by the user? This eliminates anonymous devices completely. So I guess you are asking for a different naming scheme, something likeparent-id.child#no e.g.? Well, we would end up with fairly long names when a complete hierarchy is anonymous. What would be the benefit? No, I'm saying that whenever a device is created, it should be given a non-random name. IOW, the names of these devices should be stable. I'm really just looking for some simple, temporary workaround without touching the existing fragile naming scheme. What we really need is full path addressing, but that without preserving all the legacy. Yeah, I understand, and I hesitated making any grander suggestions here, but I'm not sure how much work it would be to just remove any caller that passes NULL for ID and replace it with something more meaningful. I think that's a helpful clean up long term no matter what. That won't solve the problem of finding a unique device name. If we want to derive it from stable device properties (bus addresses etc.), we first of all have to define them for all types of devices. And that's basically were the discussion exploded last year IIRC. Why not use the OpenFirmware naming that we already have for some devices instead of inventing something new? -- Gleb.
Re: [Qemu-devel] [PATCH] qcow2: make cache=unsafe usable
Am 07.09.2011 11:24, schrieb Avi Kivity: Currently cache=unsafe is unsafe to the point of unusability - the caches are never written to disk except on exit so anything except an orderly exit -- including live migration -- leaves the disk image corrupted. Fix by interpreting flush requests and doing everything except flushing the underlying file. The contents of the metadata cache are transferred to the host pagecache, so that qemu aborts keep the disk in a consistent state, and live migration (on the same host, or if using a coherent filesystem) works. Signed-off-by: Avi Kivity a...@redhat.com --- Untested - is this the right approach? Hm, could work, even though I don't like it very much. The alternative approach would be something like this (not only untested, but won't even compile): diff --git a/block.c b/block.c index a8c789a..1aa5967 100644 --- a/block.c +++ b/block.c @@ -1723,10 +1723,6 @@ const char *bdrv_get_device_name(BlockDriverState *bs) int bdrv_flush(BlockDriverState *bs) { -if (bs-open_flags BDRV_O_NO_FLUSH) { -return 0; -} - if (bs-drv bdrv_has_async_flush(bs-drv) qemu_in_coroutine()) { return bdrv_co_flush_em(bs); } @@ -2624,10 +2620,6 @@ BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs, trace_bdrv_aio_flush(bs, opaque); -if (bs-open_flags BDRV_O_NO_FLUSH) { -return bdrv_aio_noop_em(bs, cb, opaque); -} - if (!drv) return NULL; return drv-bdrv_aio_flush(bs, cb, opaque); diff --git a/block/raw-posix.c b/block/raw-posix.c index bcf50b2..bb0c0c5 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -629,6 +629,10 @@ static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs, { BDRVRawState *s = bs-opaque; +if (bs-open_flags BDRV_O_NO_FLUSH) { +return bdrv_aio_noop_em(bs, cb, opaque); +} + if (fd_open(bs) 0) return NULL; @@ -839,6 +843,11 @@ static int raw_create(const char *filename, QEMUOptionParameter *options) static int raw_flush(BlockDriverState *bs) { BDRVRawState *s = bs-opaque; + +if (bs-open_flags BDRV_O_NO_FLUSH) { +return 0; +} + return qemu_fdatasync(s-fd); }
Re: [Qemu-devel] [PATCH] qcow2: make cache=unsafe usable
On 09/07/2011 12:56 PM, Kevin Wolf wrote: Am 07.09.2011 11:24, schrieb Avi Kivity: Currently cache=unsafe is unsafe to the point of unusability - the caches are never written to disk except on exit so anything except an orderly exit -- including live migration -- leaves the disk image corrupted. Fix by interpreting flush requests and doing everything except flushing the underlying file. The contents of the metadata cache are transferred to the host pagecache, so that qemu aborts keep the disk in a consistent state, and live migration (on the same host, or if using a coherent filesystem) works. Signed-off-by: Avi Kivitya...@redhat.com --- Untested - is this the right approach? Hm, could work, even though I don't like it very much. The alternative approach would be something like this I think that your version is better - it fixes all the layered format drivers at once (even though qcow2 is the only one that needs fixing). -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 6/6] qdev: Generate IDs for anonymous devices
On 2011-09-07 11:50, Gleb Natapov wrote: On Wed, Aug 31, 2011 at 08:31:26PM +0200, Jan Kiszka wrote: On 2011-08-29 23:19, Anthony Liguori wrote: On 08/29/2011 03:56 PM, Jan Kiszka wrote: On 2011-08-29 21:23, Anthony Liguori wrote: On 08/26/2011 09:48 AM, Jan Kiszka wrote: In order to address devices for that the user forgot or is even unable (no_user) to provide an ID, assign an automatically generated one. Such IDs have the format #number, thus are outside the name space availing to users. Don't use them for bus naming to avoid any other user-visible change. I don't think this is a very nice approach. Why not eliminate anonymous devices entirely and use a parent derived name for devices that are not created by the user? This eliminates anonymous devices completely. So I guess you are asking for a different naming scheme, something likeparent-id.child#no e.g.? Well, we would end up with fairly long names when a complete hierarchy is anonymous. What would be the benefit? No, I'm saying that whenever a device is created, it should be given a non-random name. IOW, the names of these devices should be stable. I'm really just looking for some simple, temporary workaround without touching the existing fragile naming scheme. What we really need is full path addressing, but that without preserving all the legacy. Yeah, I understand, and I hesitated making any grander suggestions here, but I'm not sure how much work it would be to just remove any caller that passes NULL for ID and replace it with something more meaningful. I think that's a helpful clean up long term no matter what. That won't solve the problem of finding a unique device name. If we want to derive it from stable device properties (bus addresses etc.), we first of all have to define them for all types of devices. And that's basically were the discussion exploded last year IIRC. Why not use the OpenFirmware naming that we already have for some devices instead of inventing something new? Because I do not want to establish any path names before QOM conversion (including potential device reorganization) has been started. Specifically as I do not need naming for some devices, but for all. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 6/6] qdev: Generate IDs for anonymous devices
On Wed, Sep 07, 2011 at 12:27:23PM +0200, Jan Kiszka wrote: On 2011-09-07 11:50, Gleb Natapov wrote: On Wed, Aug 31, 2011 at 08:31:26PM +0200, Jan Kiszka wrote: On 2011-08-29 23:19, Anthony Liguori wrote: On 08/29/2011 03:56 PM, Jan Kiszka wrote: On 2011-08-29 21:23, Anthony Liguori wrote: On 08/26/2011 09:48 AM, Jan Kiszka wrote: In order to address devices for that the user forgot or is even unable (no_user) to provide an ID, assign an automatically generated one. Such IDs have the format #number, thus are outside the name space availing to users. Don't use them for bus naming to avoid any other user-visible change. I don't think this is a very nice approach. Why not eliminate anonymous devices entirely and use a parent derived name for devices that are not created by the user? This eliminates anonymous devices completely. So I guess you are asking for a different naming scheme, something likeparent-id.child#no e.g.? Well, we would end up with fairly long names when a complete hierarchy is anonymous. What would be the benefit? No, I'm saying that whenever a device is created, it should be given a non-random name. IOW, the names of these devices should be stable. I'm really just looking for some simple, temporary workaround without touching the existing fragile naming scheme. What we really need is full path addressing, but that without preserving all the legacy. Yeah, I understand, and I hesitated making any grander suggestions here, but I'm not sure how much work it would be to just remove any caller that passes NULL for ID and replace it with something more meaningful. I think that's a helpful clean up long term no matter what. That won't solve the problem of finding a unique device name. If we want to derive it from stable device properties (bus addresses etc.), we first of all have to define them for all types of devices. And that's basically were the discussion exploded last year IIRC. Why not use the OpenFirmware naming that we already have for some devices instead of inventing something new? Because I do not want to establish any path names before QOM conversion (including potential device reorganization) has been started. In theory device paths are dictated by HW topology, not today's flavor of QEMU object model. Specifically as I do not need naming for some devices, but for all. It can be extended. We already have three types of device naming. One is used in qdev, another is used for migration and yet another one for passing device names to firmware. We should converge to a single one :) -- Gleb.
Re: [Qemu-devel] glib mainloop breaks virtfs
On Tue, 06 Sep 2011 13:22:36 +0200, Gerd Hoffmann kra...@redhat.com wrote: Hi, virtfs stopped working for me in master, the guest (fedora 15) just hangs at boot when mounting the virtfs filesystems. Bisecting points to this commit: rincewind kraxel ~/projects/qemu ((69e5bb6...)|BISECTING)# git bisect good 4d88a2ac8643265108ef1fb47ceee5d7b28e19f2 is the first bad commit commit 4d88a2ac8643265108ef1fb47ceee5d7b28e19f2 Author: Anthony Liguori aligu...@us.ibm.com Date: Mon Aug 22 08:12:53 2011 -0500 main: switch qemu_set_fd_handler to g_io_add_watch cheers, Gerd The below patch fix the problem for me. commit 52ed37a201d34a3070b4e2d0f21b5e6cca50352e Author: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com Date: Wed Sep 7 16:11:03 2011 +0530 iohandler: update qemu_fd_set_handler to work with null call back arg Many users of qemu_fd_set_handler including VirtFS use NULL call back arg. Update fd_trampoline and qemu_fd_set_handler to work with NULL call back arg Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com diff --git a/iohandler.c b/iohandler.c index 5ef66fb..b803208 100644 --- a/iohandler.c +++ b/iohandler.c @@ -93,10 +93,6 @@ static gboolean fd_trampoline(GIOChannel *chan, GIOCondition cond, gpointer opaq { IOTrampoline *tramp = opaque; -if (tramp-opaque == NULL) { -return FALSE; -} - if ((cond G_IO_IN) tramp-fd_read) { tramp-fd_read(tramp-opaque); } @@ -113,6 +109,7 @@ int qemu_set_fd_handler(int fd, IOHandler *fd_write, void *opaque) { +GIOCondition cond = 0; static IOTrampoline fd_trampolines[FD_SETSIZE]; IOTrampoline *tramp = fd_trampolines[fd]; @@ -121,25 +118,21 @@ int qemu_set_fd_handler(int fd, g_source_remove(tramp-tag); } -if (opaque) { -GIOCondition cond = 0; - -tramp-fd_read = fd_read; -tramp-fd_write = fd_write; -tramp-opaque = opaque; - -if (fd_read) { -cond |= G_IO_IN | G_IO_ERR; -} +tramp-fd_read = fd_read; +tramp-fd_write = fd_write; +tramp-opaque = opaque; -if (fd_write) { -cond |= G_IO_OUT | G_IO_ERR; -} +if (fd_read) { +cond |= G_IO_IN | G_IO_ERR; +} -tramp-chan = g_io_channel_unix_new(fd); -tramp-tag = g_io_add_watch(tramp-chan, cond, fd_trampoline, tramp); +if (fd_write) { +cond |= G_IO_OUT | G_IO_ERR; } +tramp-chan = g_io_channel_unix_new(fd); +tramp-tag = g_io_add_watch(tramp-chan, cond, fd_trampoline, tramp); + return 0; }
[Qemu-devel] [PATCH] iohandler: update qemu_fd_set_handler to work with null call back arg
Many users of qemu_fd_set_handler including VirtFS use NULL call back arg. Update fd_trampoline and qemu_fd_set_handler to work with NULL call back arg Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com --- iohandler.c | 31 --- 1 files changed, 12 insertions(+), 19 deletions(-) diff --git a/iohandler.c b/iohandler.c index 5ef66fb..b803208 100644 --- a/iohandler.c +++ b/iohandler.c @@ -93,10 +93,6 @@ static gboolean fd_trampoline(GIOChannel *chan, GIOCondition cond, gpointer opaq { IOTrampoline *tramp = opaque; -if (tramp-opaque == NULL) { -return FALSE; -} - if ((cond G_IO_IN) tramp-fd_read) { tramp-fd_read(tramp-opaque); } @@ -113,6 +109,7 @@ int qemu_set_fd_handler(int fd, IOHandler *fd_write, void *opaque) { +GIOCondition cond = 0; static IOTrampoline fd_trampolines[FD_SETSIZE]; IOTrampoline *tramp = fd_trampolines[fd]; @@ -121,25 +118,21 @@ int qemu_set_fd_handler(int fd, g_source_remove(tramp-tag); } -if (opaque) { -GIOCondition cond = 0; - -tramp-fd_read = fd_read; -tramp-fd_write = fd_write; -tramp-opaque = opaque; - -if (fd_read) { -cond |= G_IO_IN | G_IO_ERR; -} +tramp-fd_read = fd_read; +tramp-fd_write = fd_write; +tramp-opaque = opaque; -if (fd_write) { -cond |= G_IO_OUT | G_IO_ERR; -} +if (fd_read) { +cond |= G_IO_IN | G_IO_ERR; +} -tramp-chan = g_io_channel_unix_new(fd); -tramp-tag = g_io_add_watch(tramp-chan, cond, fd_trampoline, tramp); +if (fd_write) { +cond |= G_IO_OUT | G_IO_ERR; } +tramp-chan = g_io_channel_unix_new(fd); +tramp-tag = g_io_add_watch(tramp-chan, cond, fd_trampoline, tramp); + return 0; } -- 1.7.4.1
Re: [Qemu-devel] [PATCH] Add options to disable build with debug symbols and override optimization flags.
Brad b...@comstyle.com wrote: Add --disable-debug-symbols to disable building with debug symbols and --optflags to override the optimization flags passed to the compiler. # default flags for all hosts QEMU_CFLAGS=-fno-strict-aliasing $QEMU_CFLAGS -CFLAGS=-g $CFLAGS QEMU_CFLAGS=-Wall -Wundef -Wwrite-strings -Wmissing-prototypes $QEMU_CFLAGS QEMU_CFLAGS=-Wstrict-prototypes -Wredundant-decls $QEMU_CFLAGS QEMU_CFLAGS=-D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE $QEMU_CFLAGS QEMU_CFLAGS=-D_FORTIFY_SOURCE=2 $QEMU_CFLAGS QEMU_INCLUDES=-I. -I\$(SRC_PATH) -I\$(SRC_PATH)/fpu -LDFLAGS=-g $LDFLAGS I can understand this part. # make source path absolute source_path=`cd $source_path; pwd` @@ -518,6 +518,8 @@ for opt do ;; --cc=*) ;; + --optflags=*) optflags=$optarg + ;; --host-cc=*) host_cc=$optarg ;; --make=*) make=$optarg No, please. We already have --extra-cflags, --extra-ldflags, no need for another one. I haven't tested, but my understanding is that just using: --extra-cflags=-O0 to your configure line should fix the -O2 issue, no? My understanding is that: gcc -O2 -O0 . is understood as -O0, no? Whatever you pass in --extra-cflags is put at the end of the command line (otherwise, it is a bug somewhere). @@ -588,6 +590,10 @@ for opt do ;; --disable-debug-mon) debug_mon=no ;; + --enable-debug-symbols) debug_symbols=yes + ;; + --disable-debug-symbols) debug_symbols=no + ;; --enable-debug) # Enable debugging options that aren't excessively noisy debug_tcg=yes Not really sure if we should add this under the --enable-debug option. But I can agree with this option. Later, Juan.
Re: [Qemu-devel] [PATCH] Only build with -g CFLAGS/LDFLAGS if using --enable-debug and add --optflags.
Brad b...@comstyle.com wrote: - Original message - On 09/06/11 10:02, Brad wrote: Only build with -g CFLAGS/LDFLAGS if using --enable-debug. Add --optflags to allow overriding the default optimization level added to CFLAGS. This is a first draft of coming up with a patch I could potentially push upstream based on much cruder local patches to do something similar. I'm trying to eliminate having to patch the configure script. You don't have to. You can just run 'make CFLAGS=$optflags' to override the defaults. Nevertheless having optflags would be nice as you don't have to type this for each make run then. I do when its unconditionally on the commandline either way. If the configure scipt didnt put it their if CFLAGS wasnt empty it wouldnt be an issue. $(call quiet-command,$(CC) $(QEMU_INCLUDES) $(QEMU_CFLAGS) $(QEMU_DGFLAGS) $(CFLAGS) -c -o $@ $, CC$(TARGET_DIR)$@) this is the rule called in rules.make. QEMU don't use CFLAGS internally^W^W^W^W, it is only used for -g -O2, so it should be enough to use make CFLAGS= or something like that. I don't think we should mess with the -g flag. It should stay enabled by default, so you can easily get a useful stacktrace out of a core without having to rebuild with debug info first. I dont care what the default is as long as I can disable it without patching. What is the reason for that? I guess that compilation speed/memory, but just to be sure. Later, Juan.
Re: [Qemu-devel] [PATCH 6/6] qdev: Generate IDs for anonymous devices
On 2011-09-07 12:34, Gleb Natapov wrote: On Wed, Sep 07, 2011 at 12:27:23PM +0200, Jan Kiszka wrote: On 2011-09-07 11:50, Gleb Natapov wrote: On Wed, Aug 31, 2011 at 08:31:26PM +0200, Jan Kiszka wrote: On 2011-08-29 23:19, Anthony Liguori wrote: On 08/29/2011 03:56 PM, Jan Kiszka wrote: On 2011-08-29 21:23, Anthony Liguori wrote: On 08/26/2011 09:48 AM, Jan Kiszka wrote: In order to address devices for that the user forgot or is even unable (no_user) to provide an ID, assign an automatically generated one. Such IDs have the format #number, thus are outside the name space availing to users. Don't use them for bus naming to avoid any other user-visible change. I don't think this is a very nice approach. Why not eliminate anonymous devices entirely and use a parent derived name for devices that are not created by the user? This eliminates anonymous devices completely. So I guess you are asking for a different naming scheme, something likeparent-id.child#no e.g.? Well, we would end up with fairly long names when a complete hierarchy is anonymous. What would be the benefit? No, I'm saying that whenever a device is created, it should be given a non-random name. IOW, the names of these devices should be stable. I'm really just looking for some simple, temporary workaround without touching the existing fragile naming scheme. What we really need is full path addressing, but that without preserving all the legacy. Yeah, I understand, and I hesitated making any grander suggestions here, but I'm not sure how much work it would be to just remove any caller that passes NULL for ID and replace it with something more meaningful. I think that's a helpful clean up long term no matter what. That won't solve the problem of finding a unique device name. If we want to derive it from stable device properties (bus addresses etc.), we first of all have to define them for all types of devices. And that's basically were the discussion exploded last year IIRC. Why not use the OpenFirmware naming that we already have for some devices instead of inventing something new? Because I do not want to establish any path names before QOM conversion (including potential device reorganization) has been started. In theory device paths are dictated by HW topology, not today's flavor of QEMU object model. There will be changes in the object composition, but predicting them today and modeling this on top of current qdev is nothing I want to try. Specifically as I do not need naming for some devices, but for all. It can be extended. We already have three types of device naming. One is used in qdev, another is used for migration and yet another one for passing device names to firmware. We should converge to a single one :) Yes, but that's beyond what this patch set can achieve or what will happen in foreseeable time. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] glib mainloop breaks virtfs
commit 52ed37a201d34a3070b4e2d0f21b5e6cca50352e Author: Aneesh Kumar K.Vaneesh.ku...@linux.vnet.ibm.com Date: Wed Sep 7 16:11:03 2011 +0530 iohandler: update qemu_fd_set_handler to work with null call back arg Many users of qemu_fd_set_handler including VirtFS use NULL call back arg. Update fd_trampoline and qemu_fd_set_handler to work with NULL call back arg Patch applied, everything works again. thanks, Gerd
Re: [Qemu-devel] [PATCH v7 1/4] block: add the command line support
On Tue, Sep 6, 2011 at 11:48 AM, Zhi Yong Wu wu...@linux.vnet.ibm.com wrote: Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com --- block.c | 8 block_int.h | 30 ++ blockdev.c | 29 + blockdev.h | 2 ++ qemu-config.c | 24 qemu-options.hx | 1 + 6 files changed, 94 insertions(+), 0 deletions(-) [...] +#include block/blk-queue.h This patch fails to build. blk-queue.h does not exist yet. Please order changes so that each patch builds. Stefan
[Qemu-devel] [PATCH] block: allow resizing of images residing on host devices
Allow to resize images that reside on host devices up to the available space. This allows to grow images after resizing the device manually or vice versa. Reviewed-by: Christoph Hellwig h...@lst.de Index: qemu/block/raw-posix.c === --- qemu.orig/block/raw-posix.c 2011-09-01 17:37:42.579651525 +0200 +++ qemu/block/raw-posix.c 2011-09-01 17:43:28.882967337 +0200 @@ -645,10 +645,23 @@ static void raw_close(BlockDriverState * static int raw_truncate(BlockDriverState *bs, int64_t offset) { BDRVRawState *s = bs-opaque; -if (s-type != FTYPE_FILE) -return -ENOTSUP; -if (ftruncate(s-fd, offset) 0) +struct stat st; + +if (fstat(s-fd, st)) return -errno; + +if (S_ISREG(st.st_mode)) { +if (ftruncate(s-fd, offset) 0) +return -errno; +} else if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) { + if (offset raw_getlength(bs)) { + return -EINVAL; + } + return 0; + +} else { +return -ENOTSUP; +} return 0; } @@ -1167,6 +1180,7 @@ static BlockDriver bdrv_host_device = { .bdrv_read = raw_read, .bdrv_write = raw_write, +.bdrv_truncate = raw_truncate, .bdrv_getlength= raw_getlength, .bdrv_get_allocated_file_size = raw_get_allocated_file_size, @@ -1288,6 +1302,7 @@ static BlockDriver bdrv_host_floppy = { .bdrv_read = raw_read, .bdrv_write = raw_write, +.bdrv_truncate = raw_truncate, .bdrv_getlength= raw_getlength, .bdrv_get_allocated_file_size = raw_get_allocated_file_size, @@ -1389,6 +1404,7 @@ static BlockDriver bdrv_host_cdrom = { .bdrv_read = raw_read, .bdrv_write = raw_write, +.bdrv_truncate = raw_truncate, .bdrv_getlength = raw_getlength, .bdrv_get_allocated_file_size = raw_get_allocated_file_size, @@ -1510,6 +1526,7 @@ static BlockDriver bdrv_host_cdrom = { .bdrv_read = raw_read, .bdrv_write = raw_write, +.bdrv_truncate = raw_truncate, .bdrv_getlength = raw_getlength, .bdrv_get_allocated_file_size = raw_get_allocated_file_size,
Re: [Qemu-devel] [PATCH v6 2/4] block: add the block queue support
On Mon, Sep 5, 2011 at 9:34 AM, Zhi Yong Wu wu...@linux.vnet.ibm.com wrote: On Thu, Sep 01, 2011 at 02:02:41PM +0100, Stefan Hajnoczi wrote: 01 Sep 2011 06:02:41 -0700 (PDT) On Thu, Sep 1, 2011 at 12:44 PM, Zhi Yong Wu wu...@linux.vnet.ibm.com wrote: +static void qemu_block_queue_dequeue(BlockQueue *queue, BlockIORequest *request) +{ + BlockIORequest *req; + + while (!QTAILQ_EMPTY(queue-requests)) { + req = QTAILQ_FIRST(queue-requests); + if (req == request) { + QTAILQ_REMOVE(queue-requests, req, entry); + break; + } + } +} + +static void qemu_block_queue_cancel(BlockDriverAIOCB *acb) +{ + BlockQueueAIOCB *blkacb = container_of(acb, BlockQueueAIOCB, common); + if (blkacb-real_acb) { + bdrv_aio_cancel(blkacb-real_acb); + } else { + assert(blkacb-common.bs-block_queue); + qemu_block_queue_dequeue(blkacb-common.bs-block_queue, + blkacb-request); Can we use QTAILQ_REMOVE() and eliminate qemu_block_queue_dequeue()? why need to replace dequeue function with QTAILQ_REMOVE()? Because the existing QTAILQ_REMOVE() macro already does what is needed. +void qemu_del_block_queue(BlockQueue *queue) +{ + BlockIORequest *request, *next; + + QTAILQ_FOREACH_SAFE(request, queue-requests, entry, next) { + QTAILQ_REMOVE(queue-requests, request, entry); + g_free(request); What about the acbs? There needs to be a strategy for cleanly what strategy? shutting down completely. In fact we should probably use cancellation here or assert that the queue is already empty. You mean if the queue has been empty, we need to assert(queue)? This patch silently deletes queued requests, which leaks the BlockQueueAIOCBs. The queued requests will never be issued or completed. qemu_del_block_queue() must cleanly destroy the queue. This function could require the queue to be empty, then we do not need to worry about queued requests. The caller can do this by flushing it before deleting the queue. +static int qemu_block_queue_handler(BlockIORequest *request) +{ + int ret; + BlockDriverAIOCB *res; + + res = request-handler(request-bs, request-sector_num, + request-qiov, request-nb_sectors, + request-cb, request-opaque); Please pass qemu_block_queue_callback and request-acb directly here instead of using request-cb and request-opaque. Those fields aren't needed and just add indirection. If later the other guy want to reuse qemu_block_queue_handler, and use different callback function, then this function can not be used. Your way lower this function's reusability. Code can be changed so it's best to do things the simple way first and extend the code if necessary later. Trying to think ahead results in half-finished code where the reader has to guess the author's intention. Stefan
[Qemu-devel] Help - Using QEMU with lm3s811evb
Hi, I'm wondering can anyone help me with my issue: I want to start ARM system emulator on QEMU for The Luminary Micro Stellaris LM3S811EVB emulation. On my machine I have installed CentOS 6 and QEMU 0.15.0. On QEMU I have Debian ARM version running with rebuilded kernel (v 3.0.4). Till now I started QEMU with this comand: qemu-system-arm -M versatilepb -kernel vmlinuz-2.6.26-2-versatile -initrd initrd.img-2.6.26-2-versatile -hda debian_lenny_arm_standard.qcow2 -append root=/dev/sda1 Now, tried to start QEMU with command: qemu-system-arm -M lm3s811evb -kernel vmlinuz-3.0.4 -initrd initrd.img-3.0.4 -hda debian_lenny_arm_standard.qcow2 -append root=/dev/sda1 and then I get message: vmlinuz-3.0.4 : No such file or directory qemu: could not load kernel 'vmlinuz-3.0.4' This is strange because that file exist in my file system in /boot directory. I don't know ho to resolve this, so please help. Thanks in advance, Best regards, p.s. I'm new with this stuff so I don't know is this the right way to ask but I hope I'll get some help or other link/mail where can I look for solution.
Re: [Qemu-devel] [PATCH V8 08/14] Introduce file lock for the block layer
On Tue, Sep 06, 2011 at 07:55:48PM -0400, Stefan Berger wrote: On 09/04/2011 03:32 PM, Michael S. Tsirkin wrote: On Thu, Sep 01, 2011 at 09:53:40PM -0400, Stefan Berger wrote: Generally, what all other devices do is perform validation as the last step in migration when device state is restored. On failure, management can decide what to do: retry migration or restart on source. Why is TPM special and needs to be treated differently? ... More detail: Typically one starts out with an empty QCoW2 file created via qemu-img. Once Qemu starts and initializes the libtpms-based TPM, it tries to read existing state from that QCoW2 file. Since there is no state stored in the QCoW2, the TPM will start initializing itself to an initial 'blank' state. So it looks like the problem is you access the file when guest isn't running. Delaying the initialization until the guest actually starts running will solve the problem in a way that is more consistent with other qemu devices. I'd agree if there wasn't one more thing: encryption on the data inside the QCoW2 filesystem First: There are two ways to encrypt the data. One comes with the QCoW2 type of image and it comes for free. Set the encryption flag when creating the QCoW2 file and one has to provide a key to access the QCoW2. I found this mode problematic for users since it required me to go through the monitor every time I started the VM. Besides that the key is provided so late that all devices are already initialized and if the wrong key was provided the only thing the TPM can do is to go into shutdown mode since there is state on the QCoW2 but it cannot be decrypted. This also became problematic when doing migrations with libvirt for example and one was to have a wrong key/password installed on the target side -- graceful termination of the migration is impossible. So the above drove the implementation of the encryption mode added in patch 10 in this series. Here the key is provided via command line and it can be used immediately. So I am reading the state blobs from the file, decrypt them, create the CRC32 on the plain data and check against the CRC32 stored in the 'directory'. If it doesn't match the expected CRC32 either the key was wrong or the state is corrupted and I can terminate Qemu gracefully. I can also react appropriately if no key was provided but one is expected and vice-versa. Also in case of migration this now allows me to terminate Qemu gracefully so it continues running on the source. This is an improvement over the situation described above where in case the target had the wrong key the TPM went into shutdown mode and the user would be wondering why that is -- the TPM becomes inaccessible. However, particularly in the case of migration with shared storage I need to access the QCoW2 file to check whether on the target I have the right key. And this happens very early during qemu startup on the target machine. So that's where the file locking on the block layer comes in. I want to serialize access to the QCoW2 so I don't read intermediate state on the migration-target host that can occur if the source is currently writing TPM state data. This patch and the command line provided key along with the encryption mode added in patch 10 in this series add to usability and help prevent failures. Regards, Stefan Sure, it's a useful feature of validating the device early. But as I said above, it's useful for other devices besides TPM. However it introduces issues where state changes since guest keeps running (such as what you have described here). I think solving this in a way specific to TPM is a mistake. Passing key on command line does not mean you must use it immediately, this can still be done when guest starts running. Further, the patchset seems to be split in a way that introduces support headaches and makes review harder, not easier. In this case, we will have to support both a broken mode (key supplied later) and a non-broken one (key supplied on init). It would be better to implement a single mode, in a single patch. -- MST
Re: [Qemu-devel] [PATCH V8 14/14] Allow to provide inital TPM state
On Tue, Sep 06, 2011 at 10:45:34PM -0400, Stefan Berger wrote: On 09/04/2011 12:38 PM, Michael S. Tsirkin wrote: On Thu, Sep 01, 2011 at 11:00:56PM -0400, Stefan Berger wrote: initstate_fd=file descriptor initstate_base64=on/off (or base64/bin if you really expect more formats in the future) and use qemu routines to get the fd so they can be passed through the monitor later ... I suppose you mean monitor_get_fd(). That functions seems to only be used by net.c so far and if understand the chain of functions correctly that are called with the monitor as parameter it helps in hotplugging net devices ? For the TPM I would like to *not* have support for hotplugging since that device is supposed to be soldered to the motherboard and needs to be initialized through a command sequence by the (v)BIOS, so it has to be present early on during machine startup. Stefan Fine, but let's reuse common functions and save code duplication, especially parsing functions.
Re: [Qemu-devel] Help - Using QEMU with lm3s811evb
On 7 September 2011 12:16, Danilo Bojovic danilo.d.bojo...@gmail.com wrote: Now, tried to start QEMU with command: qemu-system-arm -M lm3s811evb -kernel vmlinuz-3.0.4 -initrd initrd.img-3.0.4 -hda debian_lenny_arm_standard.qcow2 -append root=/dev/sda1 and then I get message: vmlinuz-3.0.4 : No such file or directory qemu: could not load kernel 'vmlinuz-3.0.4' This is strange because that file exist in my file system in /boot directory. I don't know ho to resolve this, so please help. The first (and minor) problem is that you need to specify the exact path to the kernel file: qemu will not randomly look in /boot (why should it?) but only where you tell it to. If you specify a filename with no directory part it will look in the current working directory. The more major issue here is that you seem to be trying to run a standard Linux kernel and filesystem image on this machine. This will never work -- the lm3s811evb is a machine based on the ARM Cortex-M3 microcontroller, which does not have an MMU and is not capable of running Linux. If you want to use the lm3s811evb model you need a binary (probably an RTOS or possibly a standalone bare-metal app) which has been compiled specifically for this hardware. What are you actually trying to achieve here? -- PMM
Re: [Qemu-devel] [PATCH] ppc405: use RAM_ADDR_FMT instead of %08lx
On 05.09.2011, at 15:02, Stefan Hajnoczi wrote: The RAM_ADDR_FMT macro hides the type of ram_addr_t so that format strings can be safely used. Make sure to use RAM_ADDR_FMT so that the build works on 32-bit hosts with Xen enabled. Whether Xen should affect ppc TCG targets is questionable but a separate issue. Thanks, applied. Alex
Re: [Qemu-devel] [PATCH 7/9] openpic: avoid a warning from clang analyzer
On 05.09.2011, at 20:41, Blue Swirl wrote: On Mon, Sep 5, 2011 at 6:48 AM, Paolo Bonzini pbonz...@redhat.com wrote: On 09/04/2011 05:52 PM, Blue Swirl wrote: Avoid this warning by clang analyzer by defining a default case: /src/qemu/hw/openpic.c:477:5: warning: Undefined or garbage value returned to caller return retval; Signed-off-by: Blue Swirlblauwir...@gmail.com --- hw/openpic.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/hw/openpic.c b/hw/openpic.c index 26c96e2..4b883ac 100644 --- a/hw/openpic.c +++ b/hw/openpic.c @@ -469,6 +469,7 @@ static inline uint32_t read_IRQreg (openpic_t *opp, int n_IRQ, uint32_t reg) case IRQ_IPVP: retval = opp-src[n_IRQ].ipvp; break; +default: case IRQ_IDE: retval = opp-src[n_IRQ].ide; break; Looks wrong, perhaps it should return 0? The only possible values are IRQ_IDE and IRQ_IPVP. The function is actually baroque, it's as easy to use read_IRQreg(opp, IRQ_DBL0 + n_dbl, IRQ_IPVP); as the shorter opp-src[IRQ_DBL0 + n_dbl].ipvp; The reason seems to be that write_IRQreg is more complex. I'd replace both with {read,write}_{ide,ipvp} without the switch. I agree. Let me assemble a patch. Alex
Re: [Qemu-devel] [PATCH 3/5] tcg/s390: Only one call output register needed for 64 bit hosts
On 05.09.2011, at 11:07, Stefan Weil wrote: The second register is only needed for 32 bit hosts. Looks sane to me. Richard, mind to ack? Alex Cc: Alexander Graf ag...@suse.de Signed-off-by: Stefan Weil w...@mail.berlios.de --- tcg/s390/tcg-target.c |4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c index 2fc5646..b58df71 100644 --- a/tcg/s390/tcg-target.c +++ b/tcg/s390/tcg-target.c @@ -252,7 +252,9 @@ static const int tcg_target_call_iarg_regs[] = { static const int tcg_target_call_oarg_regs[] = { TCG_REG_R2, -TCG_REG_R3, +#if TCG_TARGET_REG_BITS == 32 +TCG_REG_R3 +#endif }; #define S390_CC_EQ 8 -- 1.7.0.4
Re: [Qemu-devel] [PATCH] pci: add standard bridge device
On Wed, Sep 07, 2011 at 12:39:09PM +0800, Wen Congyang wrote: At 09/06/2011 03:45 PM, Avi Kivity Write: On 09/06/2011 06:06 AM, Wen Congyang wrote: Use the uio driver - http://docs.blackfin.uclinux.org/kernel/generated/uio-howto/. You just mmap() the BAR from userspace and play with it. When I try to bind ivshmem to uio_pci_generic, I get the following messages: uio_pci_generic :01:01.0: No IRQ assigned to device: no support for interrupts? No idea what this means. PCI 3.0 6.2.4 For x86 based PCs, the values in this register correspond to IRQ numbers (0-15) of the standard dual 8259 configuration. The value 255 is defined as meaning unknown or no connection to the interrupt controller. Values between 15 and 254 are reserved. The register is interrupt line. I read the config of this device, the interrupt line is 0. It means that it uses the IRQ0. The following is the uio_pci_generic's code: static int __devinit probe(struct pci_dev *pdev, const struct pci_device_id *id) { struct uio_pci_generic_dev *gdev; int err; err = pci_enable_device(pdev); if (err) { dev_err(pdev-dev, %s: pci_enable_device failed: %d\n, __func__, err); return err; } if (!pdev-irq) { dev_warn(pdev-dev, No IRQ assigned to device: no support for interrupts?\n); pci_disable_device(pdev); return -ENODEV; } ... } This function will be called when we write 'domain:bus:slot.function' to /sys/bus/pci/drivers/uio_pci_generic/bind. pdev-irq is 0, it means the device uses IRQ0. But we refuse it. I do not why. To Michael S. Tsirkin This code is writen by you. Do you know why you check whether pdev-irq is 0? Thanks Wen Congyang Well I see this in linux: /* * Read interrupt line and base address registers. * The architecture-dependent code can tweak these, of course. */ static void pci_read_irq(struct pci_dev *dev) { unsigned char irq; pci_read_config_byte(dev, PCI_INTERRUPT_PIN, irq); dev-pin = irq; if (irq) pci_read_config_byte(dev, PCI_INTERRUPT_LINE, irq); dev-irq = irq; } Thus a device without an interrupt pin will get irq set to 0, and this seems the right way to detect such devices. I don't think PCI devices really use IRQ0 in practice, its probably used for PC things. More likely the system is misconfigured. Try lspci -vv to see what went wrong. -- MST
[Qemu-devel] [PATCH 1/2] openpic: Unfold read_IRQreg
The helper function read_IRQreg was always called with a specific argument on the type of register to access. Inside the function we were simply doing a switch on that constant argument again. It's a lot easier to just unfold this into two separate functions and call each individually. Reported-by: Blue Swirl blauwir...@gmail.com Signed-off-by: Alexander Graf ag...@suse.de --- hw/openpic.c | 56 +--- 1 files changed, 25 insertions(+), 31 deletions(-) diff --git a/hw/openpic.c b/hw/openpic.c index 03e442b..fbd8837 100644 --- a/hw/openpic.c +++ b/hw/openpic.c @@ -472,20 +472,14 @@ static void openpic_reset (void *opaque) opp-glbc = 0x; } -static inline uint32_t read_IRQreg (openpic_t *opp, int n_IRQ, uint32_t reg) +static inline uint32_t read_IRQreg_ide(openpic_t *opp, int n_IRQ) { -uint32_t retval; - -switch (reg) { -case IRQ_IPVP: -retval = opp-src[n_IRQ].ipvp; -break; -case IRQ_IDE: -retval = opp-src[n_IRQ].ide; -break; -} +return opp-src[n_IRQ].ide; +} -return retval; +static inline uint32_t read_IRQreg_ipvp(openpic_t *opp, int n_IRQ) +{ +return opp-src[n_IRQ].ipvp; } static inline void write_IRQreg (openpic_t *opp, int n_IRQ, @@ -523,10 +517,10 @@ static uint32_t read_doorbell_register (openpic_t *opp, switch (offset) { case DBL_IPVP_OFFSET: -retval = read_IRQreg(opp, IRQ_DBL0 + n_dbl, IRQ_IPVP); +retval = read_IRQreg_ipvp(opp, IRQ_DBL0 + n_dbl); break; case DBL_IDE_OFFSET: -retval = read_IRQreg(opp, IRQ_DBL0 + n_dbl, IRQ_IDE); +retval = read_IRQreg_ide(opp, IRQ_DBL0 + n_dbl); break; case DBL_DMR_OFFSET: retval = opp-doorbells[n_dbl].dmr; @@ -564,10 +558,10 @@ static uint32_t read_mailbox_register (openpic_t *opp, retval = opp-mailboxes[n_mbx].mbr; break; case MBX_IVPR_OFFSET: -retval = read_IRQreg(opp, IRQ_MBX0 + n_mbx, IRQ_IPVP); +retval = read_IRQreg_ipvp(opp, IRQ_MBX0 + n_mbx); break; case MBX_DMR_OFFSET: -retval = read_IRQreg(opp, IRQ_MBX0 + n_mbx, IRQ_IDE); +retval = read_IRQreg_ide(opp, IRQ_MBX0 + n_mbx); break; } @@ -695,7 +689,7 @@ static uint32_t openpic_gbl_read (void *opaque, target_phys_addr_t addr) { int idx; idx = (addr - 0x10A0) 4; -retval = read_IRQreg(opp, opp-irq_ipi0 + idx, IRQ_IPVP); +retval = read_IRQreg_ipvp(opp, opp-irq_ipi0 + idx); } break; case 0x10E0: /* SPVE */ @@ -765,10 +759,10 @@ static uint32_t openpic_timer_read (void *opaque, uint32_t addr) retval = opp-timers[idx].tibc; break; case 0x20: /* TIPV */ -retval = read_IRQreg(opp, opp-irq_tim0 + idx, IRQ_IPVP); +retval = read_IRQreg_ipvp(opp, opp-irq_tim0 + idx); break; case 0x30: /* TIDE */ -retval = read_IRQreg(opp, opp-irq_tim0 + idx, IRQ_IDE); +retval = read_IRQreg_ide(opp, opp-irq_tim0 + idx); break; } DPRINTF(%s: = %08x\n, __func__, retval); @@ -809,10 +803,10 @@ static uint32_t openpic_src_read (void *opaque, uint32_t addr) idx = addr 5; if (addr 0x10) { /* EXDE / IFEDE / IEEDE */ -retval = read_IRQreg(opp, idx, IRQ_IDE); +retval = read_IRQreg_ide(opp, idx); } else { /* EXVP / IFEVP / IEEVP */ -retval = read_IRQreg(opp, idx, IRQ_IPVP); +retval = read_IRQreg_ipvp(opp, idx); } DPRINTF(%s: = %08x\n, __func__, retval); @@ -1368,13 +1362,13 @@ static uint32_t mpic_timer_read (void *opaque, target_phys_addr_t addr) retval = mpp-timers[idx].tibc; break; case 0x20: /* TIPV */ -retval = read_IRQreg(mpp, MPIC_TMR_IRQ + idx, IRQ_IPVP); +retval = read_IRQreg_ipvp(mpp, MPIC_TMR_IRQ + idx); break; case 0x30: /* TIDR */ if ((addr 0xF0) == 0XF0) retval = mpp-dst[cpu].tfrr; else -retval = read_IRQreg(mpp, MPIC_TMR_IRQ + idx, IRQ_IDE); +retval = read_IRQreg_ide(mpp, MPIC_TMR_IRQ + idx); break; } DPRINTF(%s: = %08x\n, __func__, retval); @@ -1421,10 +1415,10 @@ static uint32_t mpic_src_ext_read (void *opaque, target_phys_addr_t addr) idx += (addr 0xFFF0) 5; if (addr 0x10) { /* EXDE / IFEDE / IEEDE */ -retval = read_IRQreg(mpp, idx, IRQ_IDE); +retval = read_IRQreg_ide(mpp, idx); } else { /* EXVP / IFEVP / IEEVP */ -retval = read_IRQreg(mpp, idx, IRQ_IPVP); +retval = read_IRQreg_ipvp(mpp, idx); } DPRINTF(%s: = %08x\n, __func__, retval); } @@ -1471,10 +1465,10 @@ static uint32_t mpic_src_int_read (void *opaque, target_phys_addr_t addr) idx += (addr 0xFFF0) 5; if (addr 0x10) { /* EXDE / IFEDE / IEEDE */
[Qemu-devel] [PATCH 2/2] openpic: Unfold write_IRQreg
The helper function write_IRQreg was always called with a specific argument on the type of register to access. Inside the function we were simply doing a switch on that constant argument again. It's a lot easier to just unfold this into two separate functions and call each individually. Reported-by: Blue Swirl blauwir...@gmail.com Signed-off-by: Alexander Graf ag...@suse.de --- hw/openpic.c | 79 +++-- 1 files changed, 37 insertions(+), 42 deletions(-) diff --git a/hw/openpic.c b/hw/openpic.c index fbd8837..43b8f27 100644 --- a/hw/openpic.c +++ b/hw/openpic.c @@ -482,30 +482,25 @@ static inline uint32_t read_IRQreg_ipvp(openpic_t *opp, int n_IRQ) return opp-src[n_IRQ].ipvp; } -static inline void write_IRQreg (openpic_t *opp, int n_IRQ, - uint32_t reg, uint32_t val) +static inline void write_IRQreg_ide(openpic_t *opp, int n_IRQ, uint32_t val) { uint32_t tmp; -switch (reg) { -case IRQ_IPVP: -/* NOTE: not fully accurate for special IRQs, but simple and - sufficient */ -/* ACTIVITY bit is read-only */ -opp-src[n_IRQ].ipvp = -(opp-src[n_IRQ].ipvp 0x4000) | -(val 0x800F00FF); -openpic_update_irq(opp, n_IRQ); -DPRINTF(Set IPVP %d to 0x%08x - 0x%08x\n, -n_IRQ, val, opp-src[n_IRQ].ipvp); -break; -case IRQ_IDE: -tmp = val 0xC000; -tmp |= val ((1ULL MAX_CPU) - 1); -opp-src[n_IRQ].ide = tmp; -DPRINTF(Set IDE %d to 0x%08x\n, n_IRQ, opp-src[n_IRQ].ide); -break; -} +tmp = val 0xC000; +tmp |= val ((1ULL MAX_CPU) - 1); +opp-src[n_IRQ].ide = tmp; +DPRINTF(Set IDE %d to 0x%08x\n, n_IRQ, opp-src[n_IRQ].ide); +} + +static inline void write_IRQreg_ipvp(openpic_t *opp, int n_IRQ, uint32_t val) +{ +/* NOTE: not fully accurate for special IRQs, but simple and sufficient */ +/* ACTIVITY bit is read-only */ +opp-src[n_IRQ].ipvp = (opp-src[n_IRQ].ipvp 0x4000) + | (val 0x800F00FF); +openpic_update_irq(opp, n_IRQ); +DPRINTF(Set IPVP %d to 0x%08x - 0x%08x\n, n_IRQ, val, +opp-src[n_IRQ].ipvp); } #if 0 // Code provision for Intel model @@ -535,10 +530,10 @@ static void write_doorbell_register (penpic_t *opp, int n_dbl, { switch (offset) { case DBL_IVPR_OFFSET: -write_IRQreg(opp, IRQ_DBL0 + n_dbl, IRQ_IPVP, value); +write_IRQreg_ipvp(opp, IRQ_DBL0 + n_dbl, value); break; case DBL_IDE_OFFSET: -write_IRQreg(opp, IRQ_DBL0 + n_dbl, IRQ_IDE, value); +write_IRQreg_ide(opp, IRQ_DBL0 + n_dbl, value); break; case DBL_DMR_OFFSET: opp-doorbells[n_dbl].dmr = value; @@ -576,10 +571,10 @@ static void write_mailbox_register (openpic_t *opp, int n_mbx, opp-mailboxes[n_mbx].mbr = value; break; case MBX_IVPR_OFFSET: -write_IRQreg(opp, IRQ_MBX0 + n_mbx, IRQ_IPVP, value); +write_IRQreg_ipvp(opp, IRQ_MBX0 + n_mbx, value); break; case MBX_DMR_OFFSET: -write_IRQreg(opp, IRQ_MBX0 + n_mbx, IRQ_IDE, value); +write_IRQreg_ide(opp, IRQ_MBX0 + n_mbx, value); break; } } @@ -636,7 +631,7 @@ static void openpic_gbl_write (void *opaque, target_phys_addr_t addr, uint32_t v { int idx; idx = (addr - 0x10A0) 4; -write_IRQreg(opp, opp-irq_ipi0 + idx, IRQ_IPVP, val); +write_IRQreg_ipvp(opp, opp-irq_ipi0 + idx, val); } break; case 0x10E0: /* SPVE */ @@ -729,10 +724,10 @@ static void openpic_timer_write (void *opaque, uint32_t addr, uint32_t val) opp-timers[idx].tibc = val; break; case 0x20: /* TIVP */ -write_IRQreg(opp, opp-irq_tim0 + idx, IRQ_IPVP, val); +write_IRQreg_ipvp(opp, opp-irq_tim0 + idx, val); break; case 0x30: /* TIDE */ -write_IRQreg(opp, opp-irq_tim0 + idx, IRQ_IDE, val); +write_IRQreg_ide(opp, opp-irq_tim0 + idx, val); break; } } @@ -782,10 +777,10 @@ static void openpic_src_write (void *opaque, uint32_t addr, uint32_t val) idx = addr 5; if (addr 0x10) { /* EXDE / IFEDE / IEEDE */ -write_IRQreg(opp, idx, IRQ_IDE, val); +write_IRQreg_ide(opp, idx, val); } else { /* EXVP / IFEVP / IEEVP */ -write_IRQreg(opp, idx, IRQ_IPVP, val); +write_IRQreg_ipvp(opp, idx, val); } } @@ -835,8 +830,8 @@ static void openpic_cpu_write_internal(void *opaque, target_phys_addr_t addr, case 0x70: idx = (addr - 0x40) 4; /* we use IDE as mask which CPUs to deliver the IPI to still. */ -write_IRQreg(opp, opp-irq_ipi0 + idx, IRQ_IDE, - opp-src[opp-irq_ipi0 + idx].ide | val); +write_IRQreg_ide(opp, opp-irq_ipi0 + idx, + opp-src[opp-irq_ipi0 + idx].ide
Re: [Qemu-devel] [PATCH V8 10/14] Encrypt state blobs using AES CBC encryption
On Tue, Sep 06, 2011 at 08:32:41PM -0400, Stefan Berger wrote: To summarize it: enc_mode=aes-cbc# redundant for now since this is the only supported encryption scheme; so could drop it and assume as default key_format=hex/binary # hex for a string hex number; binary would mean the found string is directly converted to a char[] that is then directly used as the AES key (like a password) key=128, 192, or 256 bithex key or string key_file=path to file containing 128,192 or 256 bit hex key or string Stefan OK -- MST
Re: [Qemu-devel] [PATCH] s390: remove boot image detection to fix boot with newer kernels
On 06.09.2011, at 13:41, Christian Borntraeger wrote: Alex, Newer kernels will not always have a 0dd0 (basr 13,0) at address 0x1. (e.g. current linux-next). We must not rely on specific code at certain addresses, so lets just remove this check. Is there any other sane way we can find out if the image we're loading is actually bootable? I don't want users to do qemu-system-s390x -kernel vmlinux and see it just plain fail. Alex
Re: [Qemu-devel] [PATCH] ppc: move ADB stuff from ppc_mac.h to adb.h
On 04.09.2011, at 20:41, Laurent Vivier wrote: Allow to use ADB in non-ppc macintosh What exactly do you need this for? Not saying I'm opposed to the change - it looks reasonable to have adb export its own interfaces using its own header - but I'd like to understand where you're heading here. Is this for m68k? Btw - I applied the patch nevertheless. Alex
Re: [Qemu-devel] [PATCH] ppc: move ADB stuff from ppc_mac.h to adb.h
On 07.09.2011, at 14:13, Laurent Vivier wrote: Hi, Le 7 septembre 2011 à 14:05, Alexander Graf ag...@suse.de a écrit : On 04.09.2011, at 20:41, Laurent Vivier wrote: Allow to use ADB in non-ppc macintosh What exactly do you need this for? Not saying I'm opposed to the change - it looks reasonable to have adb export its own interfaces using its own header - but I'd like to understand where you're heading here. Is this for m68k? Yes, I'm working on a quadra 800 emulation and ADB is attached to VIA. So, it seems reasonable to move it out of ppc. There will be more changes in the futur, but for the moment I'd like to merge only obvious and generic changes. I see - that makes a lot of sense. Looking forward to seeing m68k revived again! :) Alex
[Qemu-devel] [PATCH -V2] iohandler: update qemu_fd_set_handler to work with null call back arg
Many users of qemu_fd_set_handler including VirtFS use NULL call back arg. Update fd_trampoline and qemu_fd_set_handler to work with NULL call back arg Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com --- Changes from V1: Add support for dropping event source iohandler.c | 31 +-- 1 files changed, 13 insertions(+), 18 deletions(-) diff --git a/iohandler.c b/iohandler.c index 5ef66fb..783f3ac 100644 --- a/iohandler.c +++ b/iohandler.c @@ -93,9 +93,8 @@ static gboolean fd_trampoline(GIOChannel *chan, GIOCondition cond, gpointer opaq { IOTrampoline *tramp = opaque; -if (tramp-opaque == NULL) { +if (!tramp-fd_read !tramp-fd_write) return FALSE; -} if ((cond G_IO_IN) tramp-fd_read) { tramp-fd_read(tramp-opaque); @@ -113,6 +112,7 @@ int qemu_set_fd_handler(int fd, IOHandler *fd_write, void *opaque) { +GIOCondition cond = 0; static IOTrampoline fd_trampolines[FD_SETSIZE]; IOTrampoline *tramp = fd_trampolines[fd]; @@ -121,25 +121,20 @@ int qemu_set_fd_handler(int fd, g_source_remove(tramp-tag); } -if (opaque) { -GIOCondition cond = 0; - -tramp-fd_read = fd_read; -tramp-fd_write = fd_write; -tramp-opaque = opaque; - -if (fd_read) { -cond |= G_IO_IN | G_IO_ERR; -} - -if (fd_write) { -cond |= G_IO_OUT | G_IO_ERR; -} +if (fd_read) { +cond |= G_IO_IN | G_IO_ERR; +} -tramp-chan = g_io_channel_unix_new(fd); -tramp-tag = g_io_add_watch(tramp-chan, cond, fd_trampoline, tramp); +if (fd_write) { +cond |= G_IO_OUT | G_IO_ERR; } +tramp-fd_read = fd_read; +tramp-fd_write = fd_write; +tramp-opaque = opaque; +tramp-chan = g_io_channel_unix_new(fd); +tramp-tag = g_io_add_watch(tramp-chan, cond, fd_trampoline, tramp); + return 0; } -- 1.7.4.1
Re: [Qemu-devel] [PATCH] s390: remove boot image detection to fix boot with newer kernels
On 07/09/11 13:56, Alexander Graf wrote: On 06.09.2011, at 13:41, Christian Borntraeger wrote: Alex, Newer kernels will not always have a 0dd0 (basr 13,0) at address 0x1. (e.g. current linux-next). We must not rely on specific code at certain addresses, so lets just remove this check. Is there any other sane way we can find out if the image we're loading is actually bootable? I don't want users to do qemu-system-s390x -kernel vmlinux and see it just plain fail. No, in theory it could change arbitrarily. The vmlinux case is unfortunate but in the end its shoot yourself in the foot, we just have to make sure that we allow a graceful exit from a looping qemu guest. Christian
[Qemu-devel] [PATCH v8 0/4] The intro of QEMU block I/O throttling
The main goal of the patch is to effectively cap the disk I/O speed or counts of one single VM.It is only one draft, so it unavoidably has some drawbacks, if you catch them, please let me know. The patch will mainly introduce one block I/O throttling algorithm, one timer and one block queue for each I/O limits enabled drive. When a block request is coming in, the throttling algorithm will check if its I/O rate or counts exceed the limits; if yes, then it will enqueue to the block queue; The timer will handle the I/O requests in it. Some available features follow as below: (1) global bps limit. -drive bps=xxxin bytes/s (2) only read bps limit -drive bps_rd=xxx in bytes/s (3) only write bps limit -drive bps_wr=xxx in bytes/s (4) global iops limit -drive iops=xxx in ios/s (5) only read iops limit -drive iops_rd=xxxin ios/s (6) only write iops limit -drive iops_wr=xxxin ios/s (7) the combination of some limits. -drive bps=xxx,iops=xxx Known Limitations: (1) #1 can not coexist with #2, #3 (2) #4 can not coexist with #5, #6 (3) When bps/iops limits are specified to a small value such as 511 bytes/s, this VM will hang up. We are considering how to handle this senario. Changes since code V7: fix the build per patch based on stefan's comments. Zhi Yong Wu (4): block: add the command line support block: add the block queue support block: add block timer and throttling algorithm qmp/hmp: add block_set_io_throttle v7: Mainly simply the block queue. Adjust codes based on stefan's comments. v6: Mainly fix the aio callback issue for block queue. Adjust codes based on Ram Pai's comments. v5: add qmp/hmp support. Adjust the codes based on stefan's comments qmp/hmp: add block_set_io_throttle v4: fix memory leaking based on ryan's feedback. v3: Added the code for extending slice time, and modified the method to compute wait time for the timer. v2: The codes V2 for QEMU disk I/O limits. Modified the codes mainly based on stefan's comments. v1: Submit the codes for QEMU disk I/O limits. Only a code draft. Makefile.objs |2 +- block.c | 331 +++-- block.h |6 +- block/blk-queue.c | 184 + block/blk-queue.h | 59 ++ block_int.h | 30 + blockdev.c| 98 blockdev.h|2 + hmp-commands.hx | 15 +++ qemu-config.c | 24 qemu-options.hx |1 + qerror.c |4 + qerror.h |3 + qmp-commands.hx | 52 - 14 files changed, 796 insertions(+), 15 deletions(-) create mode 100644 block/blk-queue.c create mode 100644 block/blk-queue.h -- 1.7.6
Re: [Qemu-devel] [PATCH] s390: remove boot image detection to fix boot with newer kernels
On 07.09.2011, at 14:33, Christian Borntraeger wrote: On 07/09/11 13:56, Alexander Graf wrote: On 06.09.2011, at 13:41, Christian Borntraeger wrote: Alex, Newer kernels will not always have a 0dd0 (basr 13,0) at address 0x1. (e.g. current linux-next). We must not rely on specific code at certain addresses, so lets just remove this check. Is there any other sane way we can find out if the image we're loading is actually bootable? I don't want users to do qemu-system-s390x -kernel vmlinux and see it just plain fail. No, in theory it could change arbitrarily. The vmlinux case is unfortunate but in the end its shoot yourself in the foot, we just have to make sure that we allow a graceful exit from a looping qemu guest. That's not the answer I'd like to hear. Can't we put a magic constant somewhere for newer kernel versions that would identify those and keep the basr 13,0 hack around for older ones? Alex
[Qemu-devel] [PATCH v8 1/4] block: add the block queue support
Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com --- Makefile.objs |2 +- block/blk-queue.c | 184 + block/blk-queue.h | 59 + block_int.h | 27 4 files changed, 271 insertions(+), 1 deletions(-) create mode 100644 block/blk-queue.c create mode 100644 block/blk-queue.h diff --git a/Makefile.objs b/Makefile.objs index 26b885b..5dcf456 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -33,7 +33,7 @@ block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vv block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o block-nested-y += qed-check.o -block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o +block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o blk-queue.o block-nested-$(CONFIG_WIN32) += raw-win32.o block-nested-$(CONFIG_POSIX) += raw-posix.o block-nested-$(CONFIG_CURL) += curl.o diff --git a/block/blk-queue.c b/block/blk-queue.c new file mode 100644 index 000..da01fcb --- /dev/null +++ b/block/blk-queue.c @@ -0,0 +1,184 @@ +/* + * QEMU System Emulator queue definition for block layer + * + * Copyright (c) IBM, Corp. 2011 + * + * Authors: + * Zhi Yong Wu wu...@linux.vnet.ibm.com + * Stefan Hajnoczi stefa...@linux.vnet.ibm.com + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the Software), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include block_int.h +#include block/blk-queue.h +#include qemu-common.h + +/* The APIs for block request queue on qemu block layer. + */ + +struct BlockQueueAIOCB { +BlockDriverAIOCB common; +QTAILQ_ENTRY(BlockQueueAIOCB) entry; +BlockRequestHandler *handler; +BlockDriverAIOCB *real_acb; + +int64_t sector_num; +QEMUIOVector *qiov; +int nb_sectors; +}; + +typedef struct BlockQueueAIOCB BlockQueueAIOCB; + +struct BlockQueue { +QTAILQ_HEAD(requests, BlockQueueAIOCB) requests; +bool flushing; +}; + +static void qemu_block_queue_dequeue(BlockQueue *queue, + BlockQueueAIOCB *request) +{ +BlockQueueAIOCB *req; + +assert(queue); +while (!QTAILQ_EMPTY(queue-requests)) { +req = QTAILQ_FIRST(queue-requests); +if (req == request) { +QTAILQ_REMOVE(queue-requests, req, entry); +break; +} +} +} + +static void qemu_block_queue_cancel(BlockDriverAIOCB *acb) +{ +BlockQueueAIOCB *request = container_of(acb, BlockQueueAIOCB, common); +if (request-real_acb) { +bdrv_aio_cancel(request-real_acb); +} else { +assert(request-common.bs-block_queue); +qemu_block_queue_dequeue(request-common.bs-block_queue, + request); +} + +qemu_aio_release(request); +} + +static AIOPool block_queue_pool = { +.aiocb_size = sizeof(struct BlockQueueAIOCB), +.cancel = qemu_block_queue_cancel, +}; + +BlockQueue *qemu_new_block_queue(void) +{ +BlockQueue *queue; + +queue = g_malloc0(sizeof(BlockQueue)); + +QTAILQ_INIT(queue-requests); + +queue-flushing = false; + +return queue; +} + +void qemu_del_block_queue(BlockQueue *queue) +{ +BlockQueueAIOCB *request, *next; + +QTAILQ_FOREACH_SAFE(request, queue-requests, entry, next) { +QTAILQ_REMOVE(queue-requests, request, entry); +qemu_aio_release(request); +} + +g_free(queue); +} + +BlockDriverAIOCB *qemu_block_queue_enqueue(BlockQueue *queue, +BlockDriverState *bs, +BlockRequestHandler *handler, +int64_t sector_num, +QEMUIOVector *qiov, +int nb_sectors, +BlockDriverCompletionFunc *cb, +void *opaque) +{ +BlockDriverAIOCB *acb; +BlockQueueAIOCB *request; + +if
[Qemu-devel] [PATCH v8 2/4] block: add the command line support
Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com --- block.c | 59 +++ block.h |5 block_int.h |3 ++ blockdev.c | 29 +++ qemu-config.c | 24 ++ qemu-options.hx |1 + 6 files changed, 121 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 43742b7..cd75183 100644 --- a/block.c +++ b/block.c @@ -104,6 +104,57 @@ int is_windows_drive(const char *filename) } #endif +/* throttling disk I/O limits */ +void bdrv_io_limits_disable(BlockDriverState *bs) +{ +bs-io_limits_enabled = false; + +if (bs-block_queue) { +qemu_block_queue_flush(bs-block_queue); +qemu_del_block_queue(bs-block_queue); +bs-block_queue = NULL; +} + +if (bs-block_timer) { +qemu_del_timer(bs-block_timer); +qemu_free_timer(bs-block_timer); +bs-block_timer = NULL; +} + +bs-slice_start = 0; + +bs-slice_end = 0; +} + +static void bdrv_block_timer(void *opaque) +{ +BlockDriverState *bs = opaque; +BlockQueue *queue= bs-block_queue; + +qemu_block_queue_flush(queue); +} + +void bdrv_io_limits_enable(BlockDriverState *bs) +{ +bs-block_queue = qemu_new_block_queue(); +bs-block_timer = qemu_new_timer_ns(vm_clock, bdrv_block_timer, bs); + +bs-slice_start = qemu_get_clock_ns(vm_clock); + +bs-slice_end = bs-slice_start + BLOCK_IO_SLICE_TIME; +} + +bool bdrv_io_limits_enabled(BlockDriverState *bs) +{ +BlockIOLimit *io_limits = bs-io_limits; +return io_limits-bps[BLOCK_IO_LIMIT_READ] + || io_limits-bps[BLOCK_IO_LIMIT_WRITE] + || io_limits-bps[BLOCK_IO_LIMIT_TOTAL] + || io_limits-iops[BLOCK_IO_LIMIT_READ] + || io_limits-iops[BLOCK_IO_LIMIT_WRITE] + || io_limits-iops[BLOCK_IO_LIMIT_TOTAL]; +} + /* check if the path starts with protocol: */ static int path_has_protocol(const char *path) { @@ -1453,6 +1504,14 @@ void bdrv_get_geometry_hint(BlockDriverState *bs, *psecs = bs-secs; } +/* throttling disk io limits */ +void bdrv_set_io_limits(BlockDriverState *bs, +BlockIOLimit *io_limits) +{ +bs-io_limits = *io_limits; +bs-io_limits_enabled = bdrv_io_limits_enabled(bs); +} + /* Recognize floppy formats */ typedef struct FDFormat { FDriveType drive; diff --git a/block.h b/block.h index 3ac0b94..a3e69db 100644 --- a/block.h +++ b/block.h @@ -58,6 +58,11 @@ void bdrv_info(Monitor *mon, QObject **ret_data); void bdrv_stats_print(Monitor *mon, const QObject *data); void bdrv_info_stats(Monitor *mon, QObject **ret_data); +/* disk I/O throttling */ +void bdrv_io_limits_enable(BlockDriverState *bs); +void bdrv_io_limits_disable(BlockDriverState *bs); +bool bdrv_io_limits_enabled(BlockDriverState *bs); + void bdrv_init(void); void bdrv_init_with_whitelist(void); BlockDriver *bdrv_find_protocol(const char *filename); diff --git a/block_int.h b/block_int.h index 201e635..368c776 100644 --- a/block_int.h +++ b/block_int.h @@ -257,6 +257,9 @@ void qemu_aio_release(void *p); void *qemu_blockalign(BlockDriverState *bs, size_t size); +void bdrv_set_io_limits(BlockDriverState *bs, +BlockIOLimit *io_limits); + #ifdef _WIN32 int is_windows_drive(const char *filename); #endif diff --git a/blockdev.c b/blockdev.c index 2602591..619ae9f 100644 --- a/blockdev.c +++ b/blockdev.c @@ -236,6 +236,7 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) int on_read_error, on_write_error; const char *devaddr; DriveInfo *dinfo; +BlockIOLimit io_limits; int snapshot = 0; int ret; @@ -354,6 +355,31 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) } } +/* disk I/O throttling */ +io_limits.bps[BLOCK_IO_LIMIT_TOTAL] = + qemu_opt_get_number(opts, bps, 0); +io_limits.bps[BLOCK_IO_LIMIT_READ] = + qemu_opt_get_number(opts, bps_rd, 0); +io_limits.bps[BLOCK_IO_LIMIT_WRITE] = + qemu_opt_get_number(opts, bps_wr, 0); +io_limits.iops[BLOCK_IO_LIMIT_TOTAL] = + qemu_opt_get_number(opts, iops, 0); +io_limits.iops[BLOCK_IO_LIMIT_READ] = + qemu_opt_get_number(opts, iops_rd, 0); +io_limits.iops[BLOCK_IO_LIMIT_WRITE] = + qemu_opt_get_number(opts, iops_wr, 0); + +if (((io_limits.bps[BLOCK_IO_LIMIT_TOTAL] != 0) + ((io_limits.bps[BLOCK_IO_LIMIT_READ] != 0) +|| (io_limits.bps[BLOCK_IO_LIMIT_WRITE] != 0))) +|| ((io_limits.iops[BLOCK_IO_LIMIT_TOTAL] != 0) + ((io_limits.iops[BLOCK_IO_LIMIT_READ] != 0) +|| (io_limits.iops[BLOCK_IO_LIMIT_WRITE] != 0 { +error_report(bps(iops) and bps_rd/bps_wr(iops_rd/iops_wr) + cannot be used at the same time); +return
Re: [Qemu-devel] [PATCH 2/2] main: switch qemu_set_fd_handler to g_io_add_watch
On 09/07/2011 02:03 AM, Paolo Bonzini wrote: On 09/06/2011 05:59 PM, Anthony Liguori wrote: So it should be possible to add a new Source type that just selects on a file descriptor and avoid GIOChannels? I think you still have the problem that glib on Windows waits for HANDLEs, not file descriptors. Also, I'm not sure it's worth it though as long as slirp still does its own fill/poll. So how do we fix this long term? We seem to get away with using fds today and not HANDLEs, do fds on Windows not work the same with poll as they do with select? Regards, Anthony Liguori Paolo
[Qemu-devel] [PATCH v8 3/4] block: add block timer and throttling algorithm
Note: 1.) When bps/iops limits are specified to a small value such as 511 bytes/s, this VM will hang up. We are considering how to handle this senario. 2.) When dd command is issued in guest, if its option bs is set to a large value such as bs=1024K, the result speed will slightly bigger than the limits. For these problems, if you have nice thought, pls let us know.:) Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com --- block.c | 246 --- block.h |1 - 2 files changed, 236 insertions(+), 11 deletions(-) diff --git a/block.c b/block.c index cd75183..8a82273 100644 --- a/block.c +++ b/block.c @@ -30,6 +30,9 @@ #include qemu-objects.h #include qemu-coroutine.h +#include qemu-timer.h +#include block/blk-queue.h + #ifdef CONFIG_BSD #include sys/types.h #include sys/stat.h @@ -72,6 +75,13 @@ static int coroutine_fn bdrv_co_writev_em(BlockDriverState *bs, QEMUIOVector *iov); static int coroutine_fn bdrv_co_flush_em(BlockDriverState *bs); +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, +bool is_write, double elapsed_time, uint64_t *wait); +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write, +double elapsed_time, uint64_t *wait); +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors, +bool is_write, int64_t *wait); + static QTAILQ_HEAD(, BlockDriverState) bdrv_states = QTAILQ_HEAD_INITIALIZER(bdrv_states); @@ -745,6 +755,11 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags, bs-change_cb(bs-change_opaque, CHANGE_MEDIA); } +/* throttling disk I/O limits */ +if (bs-io_limits_enabled) { +bdrv_io_limits_enable(bs); +} + return 0; unlink_and_fail: @@ -783,6 +798,18 @@ void bdrv_close(BlockDriverState *bs) if (bs-change_cb) bs-change_cb(bs-change_opaque, CHANGE_MEDIA); } + +/* throttling disk I/O limits */ +if (bs-block_queue) { +qemu_del_block_queue(bs-block_queue); +bs-block_queue = NULL; +} + +if (bs-block_timer) { +qemu_del_timer(bs-block_timer); +qemu_free_timer(bs-block_timer); +bs-block_timer = NULL; +} } void bdrv_close_all(void) @@ -2341,16 +2368,40 @@ BlockDriverAIOCB *bdrv_aio_readv(BlockDriverState *bs, int64_t sector_num, BlockDriverCompletionFunc *cb, void *opaque) { BlockDriver *drv = bs-drv; +BlockDriverAIOCB *ret; +int64_t wait_time = -1; trace_bdrv_aio_readv(bs, sector_num, nb_sectors, opaque); -if (!drv) -return NULL; -if (bdrv_check_request(bs, sector_num, nb_sectors)) +if (!drv || bdrv_check_request(bs, sector_num, nb_sectors)) { return NULL; +} -return drv-bdrv_aio_readv(bs, sector_num, qiov, nb_sectors, +/* throttling disk read I/O */ +if (bs-io_limits_enabled) { +if (bdrv_exceed_io_limits(bs, nb_sectors, false, wait_time)) { +ret = qemu_block_queue_enqueue(bs-block_queue, bs, bdrv_aio_readv, + sector_num, qiov, nb_sectors, cb, opaque); +if (wait_time != -1) { +qemu_mod_timer(bs-block_timer, + wait_time + qemu_get_clock_ns(vm_clock)); +} + +return ret; +} +} + +ret = drv-bdrv_aio_readv(bs, sector_num, qiov, nb_sectors, cb, opaque); +if (ret) { +if (bs-io_limits_enabled) { +bs-io_disps.bytes[BLOCK_IO_LIMIT_READ] += + (unsigned) nb_sectors * BDRV_SECTOR_SIZE; +bs-io_disps.ios[BLOCK_IO_LIMIT_READ]++; +} +} + +return ret; } typedef struct BlockCompleteData { @@ -2396,15 +2447,14 @@ BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num, BlockDriver *drv = bs-drv; BlockDriverAIOCB *ret; BlockCompleteData *blk_cb_data; +int64_t wait_time = -1; trace_bdrv_aio_writev(bs, sector_num, nb_sectors, opaque); -if (!drv) -return NULL; -if (bs-read_only) -return NULL; -if (bdrv_check_request(bs, sector_num, nb_sectors)) +if (!drv || bs-read_only +|| bdrv_check_request(bs, sector_num, nb_sectors)) { return NULL; +} if (bs-dirty_bitmap) { blk_cb_data = blk_dirty_cb_alloc(bs, sector_num, nb_sectors, cb, @@ -2413,13 +2463,32 @@ BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num, opaque = blk_cb_data; } +/* throttling disk write I/O */ +if (bs-io_limits_enabled) { +if (bdrv_exceed_io_limits(bs, nb_sectors, true, wait_time)) { +ret = qemu_block_queue_enqueue(bs-block_queue, bs, bdrv_aio_writev, + sector_num, qiov, nb_sectors, cb, opaque); +if
[Qemu-devel] [PATCH v8 4/4] qmp/hmp: add block_set_io_throttle
The patch introduce one new command block_set_io_throttle; For its usage syntax, if you have better idea, pls let me know. Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com --- block.c | 26 +++- blockdev.c | 69 +++ blockdev.h |2 + hmp-commands.hx | 15 qerror.c|4 +++ qerror.h|3 ++ qmp-commands.hx | 52 - 7 files changed, 168 insertions(+), 3 deletions(-) diff --git a/block.c b/block.c index 8a82273..e435a79 100644 --- a/block.c +++ b/block.c @@ -1938,6 +1938,16 @@ static void bdrv_print_dict(QObject *obj, void *opaque) qdict_get_bool(qdict, ro), qdict_get_str(qdict, drv), qdict_get_bool(qdict, encrypted)); + +monitor_printf(mon, bps=% PRId64 bps_rd=% PRId64 + bps_wr=% PRId64 iops=% PRId64 + iops_rd=% PRId64 iops_wr=% PRId64, +qdict_get_int(qdict, bps), +qdict_get_int(qdict, bps_rd), +qdict_get_int(qdict, bps_wr), +qdict_get_int(qdict, iops), +qdict_get_int(qdict, iops_rd), +qdict_get_int(qdict, iops_wr)); } else { monitor_printf(mon, [not inserted]); } @@ -1970,10 +1980,22 @@ void bdrv_info(Monitor *mon, QObject **ret_data) QDict *bs_dict = qobject_to_qdict(bs_obj); obj = qobject_from_jsonf({ 'file': %s, 'ro': %i, 'drv': %s, - 'encrypted': %i }, + 'encrypted': %i, + 'bps': % PRId64 , + 'bps_rd': % PRId64 , + 'bps_wr': % PRId64 , + 'iops': % PRId64 , + 'iops_rd': % PRId64 , + 'iops_wr': % PRId64 }, bs-filename, bs-read_only, bs-drv-format_name, - bdrv_is_encrypted(bs)); + bdrv_is_encrypted(bs), + bs-io_limits.bps[BLOCK_IO_LIMIT_TOTAL], + bs-io_limits.bps[BLOCK_IO_LIMIT_READ], + bs-io_limits.bps[BLOCK_IO_LIMIT_WRITE], + bs-io_limits.iops[BLOCK_IO_LIMIT_TOTAL], + bs-io_limits.iops[BLOCK_IO_LIMIT_READ], + bs-io_limits.iops[BLOCK_IO_LIMIT_WRITE]); if (bs-backing_file[0] != '\0') { QDict *qdict = qobject_to_qdict(obj); qdict_put(qdict, backing_file, diff --git a/blockdev.c b/blockdev.c index 619ae9f..7f5c4df 100644 --- a/blockdev.c +++ b/blockdev.c @@ -747,6 +747,75 @@ int do_change_block(Monitor *mon, const char *device, return monitor_read_bdrv_key_start(mon, bs, NULL, NULL); } +/* throttling disk I/O limits */ +int do_block_set_io_throttle(Monitor *mon, + const QDict *qdict, QObject **ret_data) +{ +const char *devname = qdict_get_str(qdict, device); +uint64_t bps= qdict_get_try_int(qdict, bps, -1); +uint64_t bps_rd = qdict_get_try_int(qdict, bps_rd, -1); +uint64_t bps_wr = qdict_get_try_int(qdict, bps_wr, -1); +uint64_t iops = qdict_get_try_int(qdict, iops, -1); +uint64_t iops_rd= qdict_get_try_int(qdict, iops_rd, -1); +uint64_t iops_wr= qdict_get_try_int(qdict, iops_wr, -1); +BlockDriverState *bs; + +bs = bdrv_find(devname); +if (!bs) { +qerror_report(QERR_DEVICE_NOT_FOUND, devname); +return -1; +} + +if ((bps == -1) (bps_rd == -1) (bps_wr == -1) + (iops == -1) (iops_rd == -1) (iops_wr == -1)) { +qerror_report(QERR_MISSING_PARAMETER, + bps/bps_rd/bps_wr/iops/iops_rd/iops_wr); +return -1; +} + +if (((bps != -1) ((bps_rd != -1) || (bps_wr != -1))) +|| ((iops != -1) ((iops_rd != -1) || (iops_wr != -1 { +qerror_report(QERR_INVALID_PARAMETER_COMBINATION); +return -1; +} + +if (bps != -1) { +bs-io_limits.bps[BLOCK_IO_LIMIT_TOTAL] = bps; +bs-io_limits.bps[BLOCK_IO_LIMIT_READ] = 0; +bs-io_limits.bps[BLOCK_IO_LIMIT_WRITE] = 0; +} + +if ((bps_rd != -1) || (bps_wr != -1)) { +bs-io_limits.bps[BLOCK_IO_LIMIT_READ] = + (bps_rd == -1) ? bs-io_limits.bps[BLOCK_IO_LIMIT_READ] : bps_rd; +bs-io_limits.bps[BLOCK_IO_LIMIT_WRITE] = + (bps_wr == -1) ? bs-io_limits.bps[BLOCK_IO_LIMIT_WRITE] : bps_wr; +
Re: [Qemu-devel] [PATCH 2/2] main: switch qemu_set_fd_handler to g_io_add_watch
On 09/04/2011 06:01 PM, Anthony Liguori wrote: On 09/04/2011 09:03 AM, Avi Kivity wrote: On 08/22/2011 04:12 PM, Anthony Liguori wrote: This patch changes qemu_set_fd_handler to be implemented in terms of g_io_add_watch(). The semantics are a bit different so some glue is required. qemu_set_fd_handler2 is much harder to convert because of its use of polling. The glib main loop has the major of advantage of having a proven thread safe architecture. By using the glib main loop instead of our own, it will allow us to eventually introduce multiple I/O threads. I'm pretty sure that this will work on Win32, but I would appreciate some help testing. I think the semantics of g_io_channel_unix_new() are really just tied to the notion of a unix fd and not necessarily unix itself. 'git bisect' fingered this as responsible for breaking qcow2+cache=unsafe. I think there's an off-by-one here and the guilty patch is the one that switches the main loop, but that's just a guess. The symptoms are that a guest that is restarted (new qemu process) after install doesn't make it through grub - some image data didn't make it do disk. With qcow2 and cache=unsafe that can easily happen through exit notifiers not being run and the entire qcow2 metadata being thrown out the window. Running with raw+cache=unsafe works. Can you share your full command line? Nothing that would be in the obvious path actually uses qemu_set_fd_handler... I upgraded my autotest setup due to other issues, and now the symptoms are much worse... even before the merge that introduced this patch. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [FIX] X86 CPU topology broken in KVM mode
On 09/06/2011 11:21 PM, Bharata B Rao wrote: Hi, Sometime back I posted a patch for fixing x86 CPU topology ( http://lists.gnu.org/archive/html/qemu-devel/2011-08/msg02022.html). Here is the next version of the fix which addresses all but one comment received during that post. - Fixed code style issues - Ensured that the fix doesn't break TCG mode - I am not sure what is the problem with i486 as I haven't been able to boot an i486 VM successfully, hence haven't attempted to fix this. I have tested following scenarios and found the fix to be working fine. KVM: (with --enable-kvm) -smp sockets=1,cores=4,threads=2 -smp sockets=4,cores=4,threads=2 -cpu core2duo sockets=1,cores=4,threads=2 -cpu core2duo sockets=2,cores=4,threads=2 TCG: (without --enable-kvm) -cpu core2duo sockets=1,cores=4,threads=2 -cpu core2duo sockets=2,cores=4,threads=2 Here is the updated patch which now applies against qemu.git. Fix apic id enumeration apic id returned to guest kernel in ebx for cpuid(function=1) depends on CPUX86State-cpuid_apic_id which gets populated after the cpuid information is cached in the host kernel. Fix this by setting cpuid_apic_id before cpuid information is passed to the host kernel. This is done by moving the setting of cpuid_apic_id to cpu_x86_init() where it will work for both KVM as well as TCG modes. Signed-off-by: Bharata B Raobharata@gmail.com Please post patches as top-level threads with [PATCH] in the subject. Please use git diff or better yet, git-send-email. Regards, Anthony Liguori --- hw/pc.c |1 - target-i386/helper.c |5 + 2 files changed, 5 insertions(+), 1 deletion(-) Index: qemu/hw/pc.c === --- qemu.orig/hw/pc.c +++ qemu/hw/pc.c @@ -933,7 +933,6 @@ static CPUState *pc_new_cpu(const char * exit(1); } if ((env-cpuid_features CPUID_APIC) || smp_cpus 1) { -env-cpuid_apic_id = env-cpu_index; env-apic_state = apic_init(env, env-cpuid_apic_id); } qemu_register_reset(pc_cpu_reset, env); Index: qemu/target-i386/helper.c === --- qemu.orig/target-i386/helper.c +++ qemu/target-i386/helper.c @@ -1256,6 +1256,11 @@ CPUX86State *cpu_x86_init(const char *cp cpu_x86_close(env); return NULL; } + +if (env-cpuid_features CPUID_APIC) { +env-cpuid_apic_id = env-cpu_index; +} + mce_init(env); qemu_init_vcpu(env); * Regards, Bharata. -- http://bharata.sulekha.com/blog/posts.htm, http://raobharata.wordpress.com/
[Qemu-devel] [PATCH 2/3] Use hex instead of binary.
Older gcc versions don't understand 0bbits, use hex representation instead. Fixes build failure on RHEL-5. Signed-off-by: Gerd Hoffmann kra...@redhat.com --- target-unicore32/translate.c | 16 1 files changed, 8 insertions(+), 8 deletions(-) diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c index 4ecb0f1..4d0aa43 100644 --- a/target-unicore32/translate.c +++ b/target-unicore32/translate.c @@ -1788,7 +1788,7 @@ static void disas_uc32_insn(CPUState *env, DisasContext *s) * E: 5 */ switch (insn 29) { -case 0b000: +case 0x0: if (UCOP_SET(5) UCOP_SET(8) !UCOP_SET(28)) { do_mult(env, s, insn); break; @@ -1798,7 +1798,7 @@ static void disas_uc32_insn(CPUState *env, DisasContext *s) do_misc(env, s, insn); break; } -case 0b001: +case 0x1: if (((UCOP_OPCODES 2) == 2) !UCOP_SET_S) { do_misc(env, s, insn); break; @@ -1806,7 +1806,7 @@ static void disas_uc32_insn(CPUState *env, DisasContext *s) do_datap(env, s, insn); break; -case 0b010: +case 0x2: if (UCOP_SET(8) UCOP_SET(5)) { do_ldst_hwsb(env, s, insn); break; @@ -1814,24 +1814,24 @@ static void disas_uc32_insn(CPUState *env, DisasContext *s) if (UCOP_SET(8) || UCOP_SET(5)) { ILLEGAL; } -case 0b011: +case 0x3: do_ldst_ir(env, s, insn); break; -case 0b100: +case 0x4: if (UCOP_SET(8)) { ILLEGAL; /* extended instructions */ } do_ldst_m(env, s, insn); break; -case 0b101: +case 0x5: do_branch(env, s, insn); break; -case 0b110: +case 0x6: /* Coprocessor. */ disas_coproc_insn(env, s, insn); break; -case 0b111: +case 0x7: if (!UCOP_SET(28)) { disas_coproc_insn(env, s, insn); break; -- 1.7.1
[Qemu-devel] [PATCH 0/3] Fix build failures.
Hi, This patch series fixes a bunch of build failures. please apply, Gerd Gerd Hoffmann (3): Don't use g_thread_get_initialized. Use hex instead of binary. vns/tls: don't use depricated gnutls functions hw/9pfs/virtio-9p-coth.c |4 --- target-unicore32/translate.c | 16 +- ui/vnc-tls.c | 62 +- vl.c |1 + 4 files changed, 52 insertions(+), 31 deletions(-)
[Qemu-devel] [PATCH 1/3] Don't use g_thread_get_initialized.
Initialize glib threads unconditionally in main() instead of using g_thread_get_initialized in the 9p code. Fixes a build failure on RHEL-5, which ships glib 2.12. g_thread_get_initialized was added in 2.20. Signed-off-by: Gerd Hoffmann kra...@redhat.com --- hw/9pfs/virtio-9p-coth.c |4 vl.c |1 + 2 files changed, 1 insertions(+), 4 deletions(-) diff --git a/hw/9pfs/virtio-9p-coth.c b/hw/9pfs/virtio-9p-coth.c index ae05658..25556cc 100644 --- a/hw/9pfs/virtio-9p-coth.c +++ b/hw/9pfs/virtio-9p-coth.c @@ -67,10 +67,6 @@ int v9fs_init_worker_threads(void) /* Leave signal handling to the iothread. */ pthread_sigmask(SIG_SETMASK, set, oldset); -/* init thread system if not already initialized */ -if (!g_thread_get_initialized()) { -g_thread_init(NULL); -} if (qemu_pipe(notifier_fds) == -1) { ret = -1; goto err_out; diff --git a/vl.c b/vl.c index 5ba9b35..6e998af 100644 --- a/vl.c +++ b/vl.c @@ -2200,6 +2200,7 @@ int main(int argc, char **argv, char **envp) error_set_progname(argv[0]); g_mem_set_vtable(mem_trace); +g_thread_init(NULL); init_clocks(); -- 1.7.1
[Qemu-devel] [PATCH 3/3] vns/tls: don't use depricated gnutls functions
Avoid using depricated gnutls functions with recent gnutls versions. Fixes build failure on Fedora 16. Keep the old way for compatibility with old installations such as RHEL-5 (gnutls 1.4.x). Based on a patch from Raghavendra D Prabhu raghu.prabh...@gmail.com Signed-off-by: Gerd Hoffmann kra...@redhat.com --- ui/vnc-tls.c | 62 - 1 files changed, 43 insertions(+), 19 deletions(-) diff --git a/ui/vnc-tls.c b/ui/vnc-tls.c index 2e2456e..276e127 100644 --- a/ui/vnc-tls.c +++ b/ui/vnc-tls.c @@ -283,13 +283,51 @@ int vnc_tls_validate_certificate(struct VncState *vs) return 0; } +#if defined(GNUTLS_VERSION_NUMBER) \ +GNUTLS_VERSION_NUMBER = 0x020200 /* 2.2.0 */ + +static int vnc_set_gnutls_priority(struct VncState *vs, int needX509Creds) +{ +const char *priority = needX509Creds ? NORMAL : NORMAL:+ANON-DH; + +if (gnutls_priority_set_direct(vs-tls.session, priority, NULL) 0) { +return -1; +} +return 0; +} + +#else + +static int vnc_set_gnutls_priority(struct VncState *vs, int x509) +{ +static const int cert_types[] = { GNUTLS_CRT_X509, 0 }; +static const int protocols[] = { +GNUTLS_TLS1_1, GNUTLS_TLS1_0, GNUTLS_SSL3, 0 +}; +static const int kx_anon[] = { GNUTLS_KX_ANON_DH, 0 }; +static const int kx_x509[] = { +GNUTLS_KX_DHE_DSS, GNUTLS_KX_RSA, +GNUTLS_KX_DHE_RSA, GNUTLS_KX_SRP, 0 +}; + +if (gnutls_kx_set_priority(vs-tls.session, x509 ? kx_x509 : kx_anon) 0) { +return -1; +} + +if (gnutls_certificate_type_set_priority(vs-tls.session, cert_types) 0) { +return -1; +} + +if (gnutls_protocol_set_priority(vs-tls.session, protocols) 0) { +return -1; +} +return 0; +} + +#endif int vnc_tls_client_setup(struct VncState *vs, int needX509Creds) { -static const int cert_type_priority[] = { GNUTLS_CRT_X509, 0 }; -static const int protocol_priority[]= { GNUTLS_TLS1_1, GNUTLS_TLS1_0, GNUTLS_SSL3, 0 }; -static const int kx_anon[] = {GNUTLS_KX_ANON_DH, 0}; -static const int kx_x509[] = {GNUTLS_KX_DHE_DSS, GNUTLS_KX_RSA, GNUTLS_KX_DHE_RSA, GNUTLS_KX_SRP, 0}; VNC_DEBUG(Do TLS setup\n); if (vnc_tls_initialize() 0) { @@ -310,21 +348,7 @@ int vnc_tls_client_setup(struct VncState *vs, return -1; } -if (gnutls_kx_set_priority(vs-tls.session, needX509Creds ? kx_x509 : kx_anon) 0) { -gnutls_deinit(vs-tls.session); -vs-tls.session = NULL; -vnc_client_error(vs); -return -1; -} - -if (gnutls_certificate_type_set_priority(vs-tls.session, cert_type_priority) 0) { -gnutls_deinit(vs-tls.session); -vs-tls.session = NULL; -vnc_client_error(vs); -return -1; -} - -if (gnutls_protocol_set_priority(vs-tls.session, protocol_priority) 0) { +if (vnc_set_gnutls_priority(vs, needX509Creds) 0) { gnutls_deinit(vs-tls.session); vs-tls.session = NULL; vnc_client_error(vs); -- 1.7.1
Re: [Qemu-devel] [PATCH 0/6] Device state visualization reloaded
On Wed, Sep 07, 2011 at 11:37:20AM +0200, Kevin Wolf wrote: Am 06.09.2011 19:05, schrieb Michael S. Tsirkin: On Tue, Sep 06, 2011 at 11:28:09AM -0500, Anthony Liguori wrote: On 09/06/2011 11:09 AM, Michael S. Tsirkin wrote: On Tue, Sep 06, 2011 at 10:51:26AM -0500, Anthony Liguori wrote: On 09/06/2011 10:45 AM, Jan Kiszka wrote: On 2011-09-06 16:48, Michael S. Tsirkin wrote: I'm afraid that won't be enough to stop people scripting this command - libvirt accessed HMP for years. On the other hand, no QMP command means e.g. libvirt users don't get any benefit from this. What I think will solve these problems, for both HMP and QMP, is an explicit 'debug_unstable' or 'debug_unsupported' command that will expose all kind of debugging functionality making it very explicit that it's an unsupported debugging utility. Proposed syntax: debug_unstablesubcommand options Example: debug_unstable device_show -all For HMP, this would needlessly complicate the user interface, nothing I would support. People scripting things on top of HMP are generally doing this on their own risk and cannot expect output stability. device_show is like info qtree: the output will naturally change as the emulated hardware evolves, information is added/removed, or we simply improve the layout. Recent changes on info network are an example for the latter. Yeah, I'm not worried about stability. HMP commands that aren't exposed as QMP commands are inherently unstable and should not be scripted to. They are also not accessible when using libvirt, right? $ virsh human-monitor-passthrough GuestName device_show foo Should work. So how, in the end, will user know it's unsupported? I don't agree with 'all HMP is unstable' as people will use it and will come to depend on it. Why unsupported? It's just not a stable API to script against. It's a user interface, not a programming interface. So as long as you use it manually, that's perfectly fine. Would you expect that virt-manager never changes its GUI because there might be some scripts that try to achieve things by screen scraping? The interface is just not meant for such uses, and if you use it in that way and it breaks with the next version it's your own fault. Kevin I certainly think that radically changing a GUI is very bad usability. Screen scarping is a silly example, but documentation does appear e.g. on the web, and GUI changes invalidate it. I won't mention recent examples of user unhappiness - it takes years to live down such trauma. If we expose qemu internals in a command, that's a very bad idea to give such command to the user, because 1. users will come to depend on the command, whatever you tell them 2. the fact it's there means there is some unaddressed need, so we plaster over it with unsupported commands instead of addressing it properly But if the command is not for users at all, if it's for qemu debugging, then exposing internals is a very logical thing. Only problem is - we must make it very very clear which commands are for qemu debugging. -- MST
Re: [Qemu-devel] [PATCH V8 08/14] Introduce file lock for the block layer
On 09/07/2011 07:18 AM, Michael S. Tsirkin wrote: On Tue, Sep 06, 2011 at 07:55:48PM -0400, Stefan Berger wrote: On 09/04/2011 03:32 PM, Michael S. Tsirkin wrote: On Thu, Sep 01, 2011 at 09:53:40PM -0400, Stefan Berger wrote: Generally, what all other devices do is perform validation as the last step in migration when device state is restored. On failure, management can decide what to do: retry migration or restart on source. Why is TPM special and needs to be treated differently? ... More detail: Typically one starts out with an empty QCoW2 file created via qemu-img. Once Qemu starts and initializes the libtpms-based TPM, it tries to read existing state from that QCoW2 file. Since there is no state stored in the QCoW2, the TPM will start initializing itself to an initial 'blank' state. So it looks like the problem is you access the file when guest isn't running. Delaying the initialization until the guest actually starts running will solve the problem in a way that is more consistent with other qemu devices. I'd agree if there wasn't one more thing: encryption on the data inside the QCoW2 filesystem First: There are two ways to encrypt the data. One comes with the QCoW2 type of image and it comes for free. Set the encryption flag when creating the QCoW2 file and one has to provide a key to access the QCoW2. I found this mode problematic for users since it required me to go through the monitor every time I started the VM. Besides that the key is provided so late that all devices are already initialized and if the wrong key was provided the only thing the TPM can do is to go into shutdown mode since there is state on the QCoW2 but it cannot be decrypted. This also became problematic when doing migrations with libvirt for example and one was to have a wrong key/password installed on the target side -- graceful termination of the migration is impossible. So the above drove the implementation of the encryption mode added in patch 10 in this series. Here the key is provided via command line and it can be used immediately. So I am reading the state blobs from the file, decrypt them, create the CRC32 on the plain data and check against the CRC32 stored in the 'directory'. If it doesn't match the expected CRC32 either the key was wrong or the state is corrupted and I can terminate Qemu gracefully. I can also react appropriately if no key was provided but one is expected and vice-versa. Also in case of migration this now allows me to terminate Qemu gracefully so it continues running on the source. This is an improvement over the situation described above where in case the target had the wrong key the TPM went into shutdown mode and the user would be wondering why that is -- the TPM becomes inaccessible. However, particularly in the case of migration with shared storage I need to access the QCoW2 file to check whether on the target I have the right key. And this happens very early during qemu startup on the target machine. So that's where the file locking on the block layer comes in. I want to serialize access to the QCoW2 so I don't read intermediate state on the migration-target host that can occur if the source is currently writing TPM state data. This patch and the command line provided key along with the encryption mode added in patch 10 in this series add to usability and help prevent failures. Regards, Stefan Sure, it's a useful feature of validating the device early. But as I said above, it's useful for other devices besides TPM. However it introduces issues where state changes since guest keeps running (such as what you have described here). I think solving this in a way specific to TPM is a mistake. Passing key on command line does not mean you must use it immediately, this can still be done when guest starts running. If I was not using the key immediately I could just drop this patch and the following one introducing the blob encryption and we would have to go with the QCoW2 encryption mode. Delaying the key usage then becomes equivalent to how QCoW2 is handling its encryption mode along with the problems related to user-friendliness / handling of missing or wrong keys described earlier. Further, the patchset seems to be split in a way that introduces support headaches and makes review Patch 8 introduces file locking for bdrv. Patch 9 implements support for string TPM blobs inside a QCoW2 image and makes use of the locking. Patch 10 adds encryption support for the TPM blobs. What otherwise would be the logical split? harder, not easier. In this case, we will have to support both a broken mode (key supplied later) and a non-broken one (key supplied on init). It would be better to implement a single mode, in a single patch. If we call QCoW2 encryption support the 'broken mode' and we want to do better than this then I do have to solve it in a TPM-specific way since Qemu otherwise does not support any better method (afaics). QCoW2 encryption is there today and
Re: [Qemu-devel] [PATCH 0/6] Device state visualization reloaded
On 2011-09-07 15:06, Michael S. Tsirkin wrote: But if the command is not for users at all, if it's for qemu debugging, then exposing internals is a very logical thing. Only problem is - we must make it very very clear which commands are for qemu debugging. This command it also for users, to debug guests. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 0/6] Device state visualization reloaded
On Wed, Sep 07, 2011 at 03:13:00PM +0200, Jan Kiszka wrote: On 2011-09-07 15:06, Michael S. Tsirkin wrote: But if the command is not for users at all, if it's for qemu debugging, then exposing internals is a very logical thing. Only problem is - we must make it very very clear which commands are for qemu debugging. This command it also for users, to debug guests. Jan Hmm, guest visible state must be stable by definition. So why can't we make the interfaces stable then? -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH V8 08/14] Introduce file lock for the block layer
On Wed, Sep 07, 2011 at 09:06:05AM -0400, Stefan Berger wrote: First: There are two ways to encrypt the data. One comes with the QCoW2 type of image and it comes for free. Set the encryption flag when creating the QCoW2 file and one has to provide a key to access the QCoW2. I found this mode problematic for users since it required me to go through the monitor every time I started the VM. Besides that the key is provided so late that all devices are already initialized and if the wrong key was provided the only thing the TPM can do is to go into shutdown mode since there is state on the QCoW2 but it cannot be decrypted. This also became problematic when doing migrations with libvirt for example and one was to have a wrong key/password installed on the target side -- graceful termination of the migration is impossible. OK let's go back to this for a moment. Add a load callback, access file there. On failure, return an error. migration fails gracefully, and management can retry, or migrate to another node, or whatever. What's the problem exactly? -- MST
Re: [Qemu-devel] [PATCH 0/6] Device state visualization reloaded
On 09/07/2011 08:17 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 03:13:00PM +0200, Jan Kiszka wrote: On 2011-09-07 15:06, Michael S. Tsirkin wrote: But if the command is not for users at all, if it's for qemu debugging, then exposing internals is a very logical thing. Only problem is - we must make it very very clear which commands are for qemu debugging. This command it also for users, to debug guests. Jan Hmm, guest visible state must be stable by definition. So why can't we make the interfaces stable then? Because right now how you reference devices cannot be stable. Going back to a previous thread, what if the command took a qdev class type as a filter? That would be a stable name, not require anonymous IDs, and would have the property of usually being a single device. Regards, Anthony Liguori -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [FIX] X86 CPU topology broken in KVM mode
On Wed, Sep 7, 2011 at 6:29 PM, Anthony Liguori anth...@codemonkey.ws wrote: On 09/06/2011 11:21 PM, Bharata B Rao wrote: Hi, Please post patches as top-level threads with [PATCH] in the subject. I posted a new thread and hence it has appeared as a top-level thread. This was a fix and hence used [FIX], but if this mailing list expects [PATCH], then will use it from next time. Please use git diff or better yet, git-send-email. This was a small patch and hence used quilt. If you insist, I can use git for the next post :) Regards, Bharata.
Re: [Qemu-devel] [FIX] X86 CPU topology broken in KVM mode
On 09/07/2011 08:24 AM, Bharata B Rao wrote: On Wed, Sep 7, 2011 at 6:29 PM, Anthony Liguorianth...@codemonkey.ws wrote: On 09/06/2011 11:21 PM, Bharata B Rao wrote: Hi, Please post patches as top-level threads with [PATCH] in the subject. I posted a new thread and hence it has appeared as a top-level thread. This was a fix and hence used [FIX], but if this mailing list expects [PATCH], then will use it from next time. Please use git diff or better yet, git-send-email. This was a small patch and hence used quilt. If you insist, I can use git for the next post :) It's not strictly required, but git includes extra information in the patch it generates (base revision) which git-am can use to merge the patch via a 3way merge instead of patch fuzz. That makes life a bit easier when you're applying a lot of patches. Regards, Anthony Liguori Regards, Bharata.
Re: [Qemu-devel] [PATCH 0/6] Device state visualization reloaded
On 2011-09-07 15:23, Anthony Liguori wrote: On 09/07/2011 08:17 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 03:13:00PM +0200, Jan Kiszka wrote: On 2011-09-07 15:06, Michael S. Tsirkin wrote: But if the command is not for users at all, if it's for qemu debugging, then exposing internals is a very logical thing. Only problem is - we must make it very very clear which commands are for qemu debugging. This command it also for users, to debug guests. Jan Hmm, guest visible state must be stable by definition. So why can't we make the interfaces stable then? Because right now how you reference devices cannot be stable. Going back to a previous thread, what if the command took a qdev class type as a filter? That would be a stable name, not require anonymous IDs, and would have the property of usually being a single device. It's surely not usual that there is only a single instance of a device type. That's why my old series introduced (unstable) instance numbering. I refrained for warming those patches up again. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 0/6] Device state visualization reloaded
On Wed, Sep 07, 2011 at 08:23:10AM -0500, Anthony Liguori wrote: On 09/07/2011 08:17 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 03:13:00PM +0200, Jan Kiszka wrote: On 2011-09-07 15:06, Michael S. Tsirkin wrote: But if the command is not for users at all, if it's for qemu debugging, then exposing internals is a very logical thing. Only problem is - we must make it very very clear which commands are for qemu debugging. This command it also for users, to debug guests. Jan Hmm, guest visible state must be stable by definition. So why can't we make the interfaces stable then? Because right now how you reference devices cannot be stable. For devices that do have IDs, we can support this on QMP as well then? -- MST
Re: [Qemu-devel] [PATCH V8 14/14] Allow to provide inital TPM state
On 09/07/2011 07:23 AM, Michael S. Tsirkin wrote: On Tue, Sep 06, 2011 at 10:45:34PM -0400, Stefan Berger wrote: On 09/04/2011 12:38 PM, Michael S. Tsirkin wrote: On Thu, Sep 01, 2011 at 11:00:56PM -0400, Stefan Berger wrote: initstate_fd=file descriptor initstate_base64=on/off (or base64/bin if you really expect more formats in the future) and use qemu routines to get the fd so they can be passed through the monitor later ... I suppose you mean monitor_get_fd(). That functions seems to only be used by net.c so far and if understand the chain of functions correctly that are called with the monitor as parameter it helps in hotplugging net devices ? For the TPM I would like to *not* have support for hotplugging since that device is supposed to be soldered to the motherboard and needs to be initialized through a command sequence by the (v)BIOS, so it has to be present early on during machine startup. Stefan Fine, but let's reuse common functions and save code duplication, especially parsing functions. When parsing the command line there's no Monitor being passed around. So in case of 'net' net_handle_fd_param() (net.c) ends up not invoking monitor_get_fd() but the else branch where strtol() is used to convert the fd. Now I won't call net_handle_fd_param() but could introduce tpm_handle_fd_param() also calling strtol(). Though that would not make me call a common function but duplicating the code there... I don't know of another function handling the parsing of fd's. Is there one ? If not, I'll also just fall back to using strtol(). Stefan
Re: [Qemu-devel] [PATCH V8 08/14] Introduce file lock for the block layer
On 09/07/2011 09:16 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 09:06:05AM -0400, Stefan Berger wrote: First: There are two ways to encrypt the data. One comes with the QCoW2 type of image and it comes for free. Set the encryption flag when creating the QCoW2 file and one has to provide a key to access the QCoW2. I found this mode problematic for users since it required me to go through the monitor every time I started the VM. Besides that the key is provided so late that all devices are already initialized and if the wrong key was provided the only thing the TPM can do is to go into shutdown mode since there is state on the QCoW2 but it cannot be decrypted. This also became problematic when doing migrations with libvirt for example and one was to have a wrong key/password installed on the target side -- graceful termination of the migration is impossible. OK let's go back to this for a moment. Add a load callback, access file there. On failure, return an error. migration fails gracefully, and management can retry, or migrate to another node, or whatever. What's the problem exactly? The switch-over from source to destination already happened when the key is finally passed and you just won't be able to access the QCoW2 in case the key was wrong. Similar problems occur when you start a VM with an encrypted QCoW2 image. The monitor will prompt you for the password and then you start the VM and if the password was wrong the OS just won't be able to access the image. Stefan
Re: [Qemu-devel] [PATCH V8 14/14] Allow to provide inital TPM state
On Wed, Sep 07, 2011 at 09:51:00AM -0400, Stefan Berger wrote: On 09/07/2011 07:23 AM, Michael S. Tsirkin wrote: On Tue, Sep 06, 2011 at 10:45:34PM -0400, Stefan Berger wrote: On 09/04/2011 12:38 PM, Michael S. Tsirkin wrote: On Thu, Sep 01, 2011 at 11:00:56PM -0400, Stefan Berger wrote: initstate_fd=file descriptor initstate_base64=on/off (or base64/bin if you really expect more formats in the future) and use qemu routines to get the fd so they can be passed through the monitor later ... I suppose you mean monitor_get_fd(). That functions seems to only be used by net.c so far and if understand the chain of functions correctly that are called with the monitor as parameter it helps in hotplugging net devices ? For the TPM I would like to *not* have support for hotplugging since that device is supposed to be soldered to the motherboard and needs to be initialized through a command sequence by the (v)BIOS, so it has to be present early on during machine startup. Stefan Fine, but let's reuse common functions and save code duplication, especially parsing functions. When parsing the command line there's no Monitor being passed around. So in case of 'net' net_handle_fd_param() (net.c) ends up not invoking monitor_get_fd() but the else branch where strtol() is used to convert the fd. Now I won't call net_handle_fd_param() but could introduce tpm_handle_fd_param() also calling strtol(). Though that would not make me call a common function but duplicating the code there... I don't know of another function handling the parsing of fd's. Is there one ? If not, I'll also just fall back to using strtol(). Stefan We can create a common function and use that for net and tpm. -- MST
Re: [Qemu-devel] [PATCH V8 08/14] Introduce file lock for the block layer
On Wed, Sep 07, 2011 at 09:56:52AM -0400, Stefan Berger wrote: On 09/07/2011 09:16 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 09:06:05AM -0400, Stefan Berger wrote: First: There are two ways to encrypt the data. One comes with the QCoW2 type of image and it comes for free. Set the encryption flag when creating the QCoW2 file and one has to provide a key to access the QCoW2. I found this mode problematic for users since it required me to go through the monitor every time I started the VM. Besides that the key is provided so late that all devices are already initialized and if the wrong key was provided the only thing the TPM can do is to go into shutdown mode since there is state on the QCoW2 but it cannot be decrypted. This also became problematic when doing migrations with libvirt for example and one was to have a wrong key/password installed on the target side -- graceful termination of the migration is impossible. OK let's go back to this for a moment. Add a load callback, access file there. On failure, return an error. migration fails gracefully, and management can retry, or migrate to another node, or whatever. What's the problem exactly? The switch-over from source to destination already happened when the key is finally passed and you just won't be able to access the QCoW2 in case the key was wrong. This is exactly what happens with any kind of othe rmigration errror. So fail migration, and source can get restarted if necessary. Similar problems occur when you start a VM with an encrypted QCoW2 image. The monitor will prompt you for the password and then you start the VM and if the password was wrong the OS just won't be able to access the image. Stefan Why can't you verify the password? -- MST
[Qemu-devel] [PATCH] qcow2: removed unused depends_on field
Signed-off-by: Frediano Ziglio fredd...@gmail.com --- block/qcow2-cluster.c |3 +-- block/qcow2.h |1 - 2 files changed, 1 insertions(+), 3 deletions(-) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index e06be64..113db8b 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -694,7 +694,7 @@ err: * If the offset is not found, allocate a new cluster. * * If the cluster was already allocated, m-nb_clusters is set to 0, - * m-depends_on is set to NULL and the other fields in m are meaningless. + * other fields in m are meaningless. * * If the cluster is newly allocated, m-nb_clusters is set to the number of * contiguous clusters that have been allocated. In this case, the other @@ -736,7 +736,6 @@ again: cluster_offset = ~QCOW_OFLAG_COPIED; m-nb_clusters = 0; -m-depends_on = NULL; goto out; } diff --git a/block/qcow2.h b/block/qcow2.h index c8ca3bc..531af39 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -148,7 +148,6 @@ typedef struct QCowL2Meta int n_start; int nb_available; int nb_clusters; -struct QCowL2Meta *depends_on; CoQueue dependent_requests; QLIST_ENTRY(QCowL2Meta) next_in_flight; -- 1.7.1
Re: [Qemu-devel] [PATCH V8 08/14] Introduce file lock for the block layer
On 09/07/2011 10:10 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 09:56:52AM -0400, Stefan Berger wrote: On 09/07/2011 09:16 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 09:06:05AM -0400, Stefan Berger wrote: First: There are two ways to encrypt the data. One comes with the QCoW2 type of image and it comes for free. Set the encryption flag when creating the QCoW2 file and one has to provide a key to access the QCoW2. I found this mode problematic for users since it required me to go through the monitor every time I started the VM. Besides that the key is provided so late that all devices are already initialized and if the wrong key was provided the only thing the TPM can do is to go into shutdown mode since there is state on the QCoW2 but it cannot be decrypted. This also became problematic when doing migrations with libvirt for example and one was to have a wrong key/password installed on the target side -- graceful termination of the migration is impossible. OK let's go back to this for a moment. Add a load callback, access file there. On failure, return an error. migration fails gracefully, and management can retry, or migrate to another node, or whatever. What's the problem exactly? The switch-over from source to destination already happened when the key is finally passed and you just won't be able to access the QCoW2 in case the key was wrong. This is exactly what happens with any kind of othe rmigration errror. So fail migration, and source can get restarted if necessary. I guess I wanted to improve on this and catch user errors. If we let migration fail then all you can do is try to terminate the VM on the destination and cold-start on the source. Similar problems occur when you start a VM with an encrypted QCoW2 image. The monitor will prompt you for the password and then you start the VM and if the password was wrong the OS just won't be able to access the image. Stefan Why can't you verify the password? I do verify the key/password in the TPM driver. If the driver cannot make sense of the contents of the QCoW2 due to wrong key I simply put the driver into failure mode. That's all I can do with encrypted QCoW2. In case of a QCoW2 encrypted VM image it's different. There I guess the intelligence of what is good and bad data is only inside the OS. Stefan
Re: [Qemu-devel] [PATCH V8 08/14] Introduce file lock for the block layer
On Wed, Sep 07, 2011 at 10:25:08AM -0400, Stefan Berger wrote: On 09/07/2011 10:10 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 09:56:52AM -0400, Stefan Berger wrote: On 09/07/2011 09:16 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 09:06:05AM -0400, Stefan Berger wrote: First: There are two ways to encrypt the data. One comes with the QCoW2 type of image and it comes for free. Set the encryption flag when creating the QCoW2 file and one has to provide a key to access the QCoW2. I found this mode problematic for users since it required me to go through the monitor every time I started the VM. Besides that the key is provided so late that all devices are already initialized and if the wrong key was provided the only thing the TPM can do is to go into shutdown mode since there is state on the QCoW2 but it cannot be decrypted. This also became problematic when doing migrations with libvirt for example and one was to have a wrong key/password installed on the target side -- graceful termination of the migration is impossible. OK let's go back to this for a moment. Add a load callback, access file there. On failure, return an error. migration fails gracefully, and management can retry, or migrate to another node, or whatever. What's the problem exactly? The switch-over from source to destination already happened when the key is finally passed and you just won't be able to access the QCoW2 in case the key was wrong. This is exactly what happens with any kind of othe rmigration errror. So fail migration, and source can get restarted if necessary. I guess I wanted to improve on this and catch user errors. If we let migration fail then all you can do is try to terminate the VM on the destination and cold-start on the source. No, normally if migration fails VM is not started on destination, and it can just continue on source. Similar problems occur when you start a VM with an encrypted QCoW2 image. The monitor will prompt you for the password and then you start the VM and if the password was wrong the OS just won't be able to access the image. Stefan Why can't you verify the password? I do verify the key/password in the TPM driver. If the driver cannot make sense of the contents of the QCoW2 due to wrong key I simply put the driver into failure mode. That's all I can do with encrypted QCoW2. You can return error from init script which will make qemu exit. In case of a QCoW2 encrypted VM image it's different. There I guess the intelligence of what is good and bad data is only inside the OS. Stefan
Re: [Qemu-devel] [PATCH 2/2] main: switch qemu_set_fd_handler to g_io_add_watch
On 09/07/2011 02:42 PM, Anthony Liguori wrote: On 09/07/2011 02:03 AM, Paolo Bonzini wrote: On 09/06/2011 05:59 PM, Anthony Liguori wrote: So it should be possible to add a new Source type that just selects on a file descriptor and avoid GIOChannels? I think you still have the problem that glib on Windows waits for HANDLEs, not file descriptors. Also, I'm not sure it's worth it though as long as slirp still does its own fill/poll. So how do we fix this long term? Long term, we use GIOChannels for everything, assuming that's possible at all. More realistically, we could rewrite socket handling on Windows so that we can use g_poll instead of select (don't wait for me doing that). Another possibility, the ugliest but also the most realistic, is to separate the Windows and POSIX implementations of the main loop more sharply. This way glib's main loop can be integrated (differently) into both implementations. In the meanwhile: just do not rely on glib sources on Windows. There isn't any large benefit in this patch, and it actually complicates the straightforward code in iohandler. Just revert it and #ifdef the glib integration in patch 1/2. Since I don't see a 100%-glib main loop anytime soon, we are unlikely to lose much. If anybody introduces a feature that requires Avahi or GTK+, it won't be supported on Windows. We seem to get away with using fds today and not HANDLEs, do fds on Windows not work the same with poll as they do with select? Here is a summary table: selectsocket HANDLEs only poll does not exist WaitForMultipleObjects all other HANDLEs g_pollall other HANDLEs We only use select for Windows socket handles today. Everything else is handled separately (with WaitForMultipleObjects) by osdep-win32.c/oslib-win32.c. Paolo
Re: [Qemu-devel] [RESEND][PATCH] booke timers
On 06/09/2011 21:33, Alexander Graf wrote: Am 01.09.2011 um 10:20 schrieb Fabien Chouteau chout...@adacore.com: While working on the emulation of the freescale p2010 (e500v2) I realized that there's no implementation of booke's timers features. Currently mpc8544 uses ppc_emb (ppc_emb_timers_init) which is close but not exactly like booke (for example booke uses different SPR). Signed-off-by: Fabien Chouteau chout...@adacore.com --- Makefile.target |2 +- hw/ppc.c| 133 -- hw/ppc.h| 25 - hw/ppc4xx_devs.c|2 +- hw/ppc_booke.c | 263 +++ hw/ppce500_mpc8544ds.c |4 +- hw/virtex_ml507.c | 11 +-- target-ppc/cpu.h| 29 + target-ppc/translate_init.c | 43 +++- 9 files changed, 412 insertions(+), 100 deletions(-) create mode 100644 hw/ppc_booke.c diff --git a/Makefile.target b/Makefile.target index 07af4d4..c8704cd 100644 --- a/Makefile.target +++ b/Makefile.target @@ -234,7 +234,7 @@ obj-i386-$(CONFIG_KVM) += kvmclock.o obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o # shared objects -obj-ppc-y = ppc.o +obj-ppc-y = ppc.o ppc_booke.o obj-ppc-y += vga.o # PREP target obj-ppc-y += i8259.o mc146818rtc.o diff --git a/hw/ppc.c b/hw/ppc.c index 8870748..bcb1e91 100644 --- a/hw/ppc.c +++ b/hw/ppc.c @@ -50,7 +50,7 @@ static void cpu_ppc_tb_stop (CPUState *env); static void cpu_ppc_tb_start (CPUState *env); -static void ppc_set_irq (CPUState *env, int n_IRQ, int level) +void ppc_set_irq(CPUState *env, int n_IRQ, int level) { unsigned int old_pending = env-pending_interrupts; @@ -423,25 +423,8 @@ void ppce500_irq_init (CPUState *env) } /*/ /* PowerPC time base and decrementer emulation */ -struct ppc_tb_t { -/* Time base management */ -int64_t tb_offset;/* Compensation*/ -int64_t atb_offset; /* Compensation*/ -uint32_t tb_freq; /* TB frequency*/ -/* Decrementer management */ -uint64_t decr_next;/* Tick for next decr interrupt*/ -uint32_t decr_freq;/* decrementer frequency */ -struct QEMUTimer *decr_timer; -/* Hypervisor decrementer management */ -uint64_t hdecr_next;/* Tick for next hdecr interrupt */ -struct QEMUTimer *hdecr_timer; -uint64_t purr_load; -uint64_t purr_start; -void *opaque; -}; -static inline uint64_t cpu_ppc_get_tb(ppc_tb_t *tb_env, uint64_t vmclk, - int64_t tb_offset) +uint64_t cpu_ppc_get_tb(ppc_tb_t *tb_env, uint64_t vmclk, int64_t tb_offset) { /* TB time in tb periods */ return muldiv64(vmclk, tb_env-tb_freq, get_ticks_per_sec()) + tb_offset; @@ -611,10 +594,13 @@ static inline uint32_t _cpu_ppc_load_decr(CPUState *env, uint64_t next) int64_t diff; diff = next - qemu_get_clock_ns(vm_clock); -if (diff = 0) +if (diff = 0) { decr = muldiv64(diff, tb_env-decr_freq, get_ticks_per_sec()); -else +} else if (env-insns_flags PPC_BOOKE) { +decr = 0; I don't think we should expose instruction interpreter details into the timing code. It'd be better to have a separate flag that gets set on the booke timer init function which would be stored in tb_env. Keeps things better separated :) Good idea, but how do we set the flags? ppc_booke_timers_init doesn't know which processor is emulated. +} else { decr = -muldiv64(-diff, tb_env-decr_freq, get_ticks_per_sec()); +} LOG_TB(%s: %08 PRIx32 \n, __func__, decr); return decr; @@ -678,18 +664,23 @@ static void __cpu_ppc_store_decr (CPUState *env, uint64_t *nextp, decr, value); now = qemu_get_clock_ns(vm_clock); next = now + muldiv64(value, get_ticks_per_sec(), tb_env-decr_freq); -if (is_excp) +if (is_excp) { next += *nextp - now; -if (next == now) +} +if (next == now) { next++; +} *nextp = next; /* Adjust timer */ qemu_mod_timer(timer, next); /* If we set a negative value and the decrementer was positive, - * raise an exception. + * raise an exception (not for booke). */ -if ((value 0x8000) !(decr 0x8000)) +if (!(env-insns_flags PPC_BOOKE) + (value 0x8000) + !(decr 0x8000)) { (*raise_excp)(env); Please make this a separate flag too. IIRC this is not unified behavior on all ppc CPUs, not even all server type ones. This come from a comment by Scott (CC'd), I don't know much about it. Can you help me to find a good name for this feature? +} } static inline void _cpu_ppc_store_decr(CPUState *env, uint32_t decr, @@ -806,11 +797,11 @@ uint32_t
Re: [Qemu-devel] [PATCH 2/2] main: switch qemu_set_fd_handler to g_io_add_watch
On 09/07/2011 09:40 AM, Paolo Bonzini wrote: On 09/07/2011 02:42 PM, Anthony Liguori wrote: On 09/07/2011 02:03 AM, Paolo Bonzini wrote: On 09/06/2011 05:59 PM, Anthony Liguori wrote: So it should be possible to add a new Source type that just selects on a file descriptor and avoid GIOChannels? I think you still have the problem that glib on Windows waits for HANDLEs, not file descriptors. Also, I'm not sure it's worth it though as long as slirp still does its own fill/poll. So how do we fix this long term? Long term, we use GIOChannels for everything, assuming that's possible at all. More realistically, we could rewrite socket handling on Windows so that we can use g_poll instead of select (don't wait for me doing that). I assume switching to GIO would resolve all of these issues? Another possibility, the ugliest but also the most realistic, is to separate the Windows and POSIX implementations of the main loop more sharply. This way glib's main loop can be integrated (differently) into both implementations. In the meanwhile: just do not rely on glib sources on Windows. There isn't any large benefit in this patch, and it actually complicates the straightforward code in iohandler. Just revert it and #ifdef the glib integration in patch 1/2. Since I don't see a 100%-glib main loop anytime soon, we are unlikely to lose much. If anybody introduces a feature that requires Avahi or GTK+, it won't be supported on Windows. My main motivation is unit testing. I want to be able to have device models only rely on glib main loop primitives such that we can easily use devices in a simple glib main loop. The split main loop approach won't work for that. Regards, Anthony Liguori We seem to get away with using fds today and not HANDLEs, do fds on Windows not work the same with poll as they do with select? Here is a summary table: select socket HANDLEs only poll does not exist WaitForMultipleObjects all other HANDLEs g_poll all other HANDLEs We only use select for Windows socket handles today. Everything else is handled separately (with WaitForMultipleObjects) by osdep-win32.c/oslib-win32.c. Paolo
Re: [Qemu-devel] [PATCH V8 08/14] Introduce file lock for the block layer
On 09/07/2011 10:35 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 10:25:08AM -0400, Stefan Berger wrote: On 09/07/2011 10:10 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 09:56:52AM -0400, Stefan Berger wrote: On 09/07/2011 09:16 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 09:06:05AM -0400, Stefan Berger wrote: First: There are two ways to encrypt the data. One comes with the QCoW2 type of image and it comes for free. Set the encryption flag when creating the QCoW2 file and one has to provide a key to access the QCoW2. I found this mode problematic for users since it required me to go through the monitor every time I started the VM. Besides that the key is provided so late that all devices are already initialized and if the wrong key was provided the only thing the TPM can do is to go into shutdown mode since there is state on the QCoW2 but it cannot be decrypted. This also became problematic when doing migrations with libvirt for example and one was to have a wrong key/password installed on the target side -- graceful termination of the migration is impossible. OK let's go back to this for a moment. Add a load callback, access file there. On failure, return an error. migration fails gracefully, and management can retry, or migrate to another node, or whatever. What's the problem exactly? The switch-over from source to destination already happened when the key is finally passed and you just won't be able to access the QCoW2 in case the key was wrong. This is exactly what happens with any kind of othe rmigration errror. So fail migration, and source can get restarted if necessary. I guess I wanted to improve on this and catch user errors. If we let migration fail then all you can do is try to terminate the VM on the destination and cold-start on the source. No, normally if migration fails VM is not started on destination, and it can just continue on source. When I had tried this in conjunction with encrypted QCoW2 the switch-over already had happened and the source had died. So a wrong key on the destination was fatal. Similar problems occur when you start a VM with an encrypted QCoW2 image. The monitor will prompt you for the password and then you start the VM and if the password was wrong the OS just won't be able to access the image. Stefan Why can't you verify the password? I do verify the key/password in the TPM driver. If the driver cannot make sense of the contents of the QCoW2 due to wrong key I simply put the driver into failure mode. That's all I can do with encrypted QCoW2. You can return error from init script which will make qemu exit. I can return an error code when the front- and backend interfaces are initialized, but that happens really early and the encyrption key entered through the monitor is not available at this point. I also don't get a notification about when the key was entered. In case of QCoW2 encryption (and migration) I delay initialization until very late, basically when the VM accesses the tpm tis hardware emulation layer again (needs to be done this way I think to support block migration where I cannot even access the block device early on at all). Only then I find out that the key was wrong. I guess any other handling would require blockdev.c's invocation of monitor_read_bdrv_key_start() to be 'somehow' extended and ... do what ? loop until the correct password was entered? Stefan In case of a QCoW2 encrypted VM image it's different. There I guess the intelligence of what is good and bad data is only inside the OS. Stefan
Re: [Qemu-devel] [PATCH 3/3] vns/tls: don't use depricated gnutls functions
See inline comments below. Am 07.09.2011 15:02, schrieb Gerd Hoffmann: Avoid using depricated gnutls functions with recent gnutls versions. deprecated? Fixes build failure on Fedora 16. Keep the old way for compatibility with old installations such as RHEL-5 (gnutls 1.4.x). Based on a patch from Raghavendra D Prabhu raghu.prabh...@gmail.com Signed-off-by: Gerd Hoffmann kra...@redhat.com --- ui/vnc-tls.c | 62 - 1 files changed, 43 insertions(+), 19 deletions(-) diff --git a/ui/vnc-tls.c b/ui/vnc-tls.c index 2e2456e..276e127 100644 --- a/ui/vnc-tls.c +++ b/ui/vnc-tls.c @@ -283,13 +283,51 @@ int vnc_tls_validate_certificate(struct VncState *vs) return 0; } +#if defined(GNUTLS_VERSION_NUMBER) \ + GNUTLS_VERSION_NUMBER = 0x020200 /* 2.2.0 */ + +static int vnc_set_gnutls_priority(struct VncState *vs, int needX509Creds) Replace the first argument by gnutls_session_t /session/. This simplifies the code. +{ + const char *priority = needX509Creds ? NORMAL : NORMAL:+ANON-DH; + + if (gnutls_priority_set_direct(vs-tls.session, priority, NULL) 0) { Even if this works, testing for != GNUTLS_E_SUCCESS would be better because GNUTLS_E_SUCCESS is the return value for success according to the manual. The same applies to the other function calls below as well. + return -1; + } + return 0; +} + +#else + +static int vnc_set_gnutls_priority(struct VncState *vs, int x509) Replace the first argument by gnutls_session_t /session/. +{ + static const int cert_types[] = { GNUTLS_CRT_X509, 0 }; + static const int protocols[] = { + GNUTLS_TLS1_1, GNUTLS_TLS1_0, GNUTLS_SSL3, 0 + }; + static const int kx_anon[] = { GNUTLS_KX_ANON_DH, 0 }; + static const int kx_x509[] = { + GNUTLS_KX_DHE_DSS, GNUTLS_KX_RSA, + GNUTLS_KX_DHE_RSA, GNUTLS_KX_SRP, 0 + }; + + if (gnutls_kx_set_priority(vs-tls.session, x509 ? kx_x509 : kx_anon) 0) { + return -1; + } + + if (gnutls_certificate_type_set_priority(vs-tls.session, cert_types) 0) { + return -1; + } + + if (gnutls_protocol_set_priority(vs-tls.session, protocols) 0) { + return -1; + } + return 0; +} + +#endif int vnc_tls_client_setup(struct VncState *vs, int needX509Creds) { - static const int cert_type_priority[] = { GNUTLS_CRT_X509, 0 }; - static const int protocol_priority[]= { GNUTLS_TLS1_1, GNUTLS_TLS1_0, GNUTLS_SSL3, 0 }; - static const int kx_anon[] = {GNUTLS_KX_ANON_DH, 0}; - static const int kx_x509[] = {GNUTLS_KX_DHE_DSS, GNUTLS_KX_RSA, GNUTLS_KX_DHE_RSA, GNUTLS_KX_SRP, 0}; VNC_DEBUG(Do TLS setup\n); if (vnc_tls_initialize() 0) { @@ -310,21 +348,7 @@ int vnc_tls_client_setup(struct VncState *vs, return -1; } - if (gnutls_kx_set_priority(vs-tls.session, needX509Creds ? kx_x509 : kx_anon) 0) { - gnutls_deinit(vs-tls.session); - vs-tls.session = NULL; - vnc_client_error(vs); - return -1; - } - - if (gnutls_certificate_type_set_priority(vs-tls.session, cert_type_priority) 0) { - gnutls_deinit(vs-tls.session); - vs-tls.session = NULL; - vnc_client_error(vs); - return -1; - } - - if (gnutls_protocol_set_priority(vs-tls.session, protocol_priority) 0) { + if (vnc_set_gnutls_priority(vs, needX509Creds) 0) { Use vs-tls.session as first argument. gnutls_deinit(vs-tls.session); vs-tls.session = NULL; vnc_client_error(vs);
Re: [Qemu-devel] [PATCH V8 08/14] Introduce file lock for the block layer
On Wed, Sep 07, 2011 at 11:06:42AM -0400, Stefan Berger wrote: On 09/07/2011 10:35 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 10:25:08AM -0400, Stefan Berger wrote: On 09/07/2011 10:10 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 09:56:52AM -0400, Stefan Berger wrote: On 09/07/2011 09:16 AM, Michael S. Tsirkin wrote: On Wed, Sep 07, 2011 at 09:06:05AM -0400, Stefan Berger wrote: First: There are two ways to encrypt the data. One comes with the QCoW2 type of image and it comes for free. Set the encryption flag when creating the QCoW2 file and one has to provide a key to access the QCoW2. I found this mode problematic for users since it required me to go through the monitor every time I started the VM. Besides that the key is provided so late that all devices are already initialized and if the wrong key was provided the only thing the TPM can do is to go into shutdown mode since there is state on the QCoW2 but it cannot be decrypted. This also became problematic when doing migrations with libvirt for example and one was to have a wrong key/password installed on the target side -- graceful termination of the migration is impossible. OK let's go back to this for a moment. Add a load callback, access file there. On failure, return an error. migration fails gracefully, and management can retry, or migrate to another node, or whatever. What's the problem exactly? The switch-over from source to destination already happened when the key is finally passed and you just won't be able to access the QCoW2 in case the key was wrong. This is exactly what happens with any kind of othe rmigration errror. So fail migration, and source can get restarted if necessary. I guess I wanted to improve on this and catch user errors. If we let migration fail then all you can do is try to terminate the VM on the destination and cold-start on the source. No, normally if migration fails VM is not started on destination, and it can just continue on source. When I had tried this in conjunction with encrypted QCoW2 the switch-over already had happened and the source had died. Giving continue command should bring it back. So a wrong key on the destination was fatal. So it's a bug in the code then? Similar problems occur when you start a VM with an encrypted QCoW2 image. The monitor will prompt you for the password and then you start the VM and if the password was wrong the OS just won't be able to access the image. Stefan Why can't you verify the password? I do verify the key/password in the TPM driver. If the driver cannot make sense of the contents of the QCoW2 due to wrong key I simply put the driver into failure mode. That's all I can do with encrypted QCoW2. You can return error from init script which will make qemu exit. I can return an error code when the front- and backend interfaces are initialized, but that happens really early and the encyrption key entered through the monitor is not available at this point. I also don't get a notification about when the key was entered. In case of QCoW2 encryption (and migration) I delay initialization until very late, basically when the VM accesses the tpm tis hardware emulation layer again (needs to be done this way I think to support block migration where I cannot even access the block device early on at all). So it in the loadvm callback. This happens when guest is stopped on source, so no need for locks. On failure you return an error and migration fails before destination is started. You can Only then I find out that the key was wrong. I guess any other handling would require blockdev.c's invocation of monitor_read_bdrv_key_start() to be 'somehow' extended and ... do what ? loop until the correct password was entered? Return an error so vm start fails? Stefan In case of a QCoW2 encrypted VM image it's different. There I guess the intelligence of what is good and bad data is only inside the OS. Stefan
[Qemu-devel] [PATCH 0/5] block: preparatory patches for scatter/gather support
These patches are preparatory work for supporting scatter/gather in the SCSI subsystem. Since there would be no HBA actually using it, I am just posting the cleanups, and the fix for CVE-2011-3346 (buffer overflow in the handling of READ CAPACITY 16) that comes for free with the last patch. Paolo Bonzini (5): dma-helpers: rename is_write to to_dev dma-helpers: allow including from target-independent code dma-helpers: rewrite completion/cancellation scsi-disk: commonize iovec creation between reads and writes scsi-disk: lazily allocate bounce buffer dma-helpers.c | 47 +++ dma.h |8 - hw/scsi-disk.c | 84 +-- qemu-common.h |1 + 4 files changed, 86 insertions(+), 54 deletions(-) -- 1.7.6
[Qemu-devel] [PATCH 5/5] scsi-disk: lazily allocate bounce buffer
It will not be needed for reads and writes if the HBA provides a sglist. In addition, this lets scsi-disk refuse commands with an excessive allocation length, as well as limit memory on usual well-behaved guests. Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- hw/scsi-disk.c | 44 +--- 1 files changed, 33 insertions(+), 11 deletions(-) diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index d0ac31e..4a01e0f 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -54,6 +54,7 @@ typedef struct SCSIDiskReq { /* Both sector and sector_count are in terms of qemu 512 byte blocks. */ uint64_t sector; uint32_t sector_count; +uint32_t buflen; struct iovec iov; QEMUIOVector qiov; uint32_t status; @@ -75,13 +76,15 @@ struct SCSIDiskState }; static int scsi_handle_rw_error(SCSIDiskReq *r, int error, int type); -static int scsi_disk_emulate_command(SCSIDiskReq *r, uint8_t *outbuf); +static int scsi_disk_emulate_command(SCSIDiskReq *r); static void scsi_free_request(SCSIRequest *req) { SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req); -qemu_vfree(r-iov.iov_base); +if (r-iov.iov_base) { +qemu_vfree(r-iov.iov_base); +} } /* Helper function for command completion with sense. */ @@ -107,7 +110,13 @@ static void scsi_cancel_io(SCSIRequest *req) static uint32_t scsi_init_iovec(SCSIDiskReq *r) { -r-iov.iov_len = MIN(r-sector_count * 512, SCSI_DMA_BUF_SIZE); +SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r-req.dev); + +if (!r-iov.iov_base) { +r-buflen = SCSI_DMA_BUF_SIZE; +r-iov.iov_base = qemu_blockalign(s-bs, r-buflen); +} +r-iov.iov_len = MIN(r-sector_count * 512, r-buflen); qemu_iovec_init_external(r-qiov, r-iov, 1); return r-qiov.size / 512; } @@ -314,7 +323,7 @@ static void scsi_dma_restart_bh(void *opaque) scsi_write_data(r-req); break; case SCSI_REQ_STATUS_RETRY_FLUSH: -ret = scsi_disk_emulate_command(r, r-iov.iov_base); +ret = scsi_disk_emulate_command(r); if (ret == 0) { scsi_req_complete(r-req, GOOD); } @@ -808,13 +817,31 @@ static int scsi_disk_emulate_read_toc(SCSIRequest *req, uint8_t *outbuf) return toclen; } -static int scsi_disk_emulate_command(SCSIDiskReq *r, uint8_t *outbuf) +static int scsi_disk_emulate_command(SCSIDiskReq *r) { SCSIRequest *req = r-req; SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req-dev); uint64_t nb_sectors; +uint8_t *outbuf; int buflen = 0; +if (!r-iov.iov_base) { +/* + * FIXME: we shouldn't return anything bigger than 4k, but the code + * requires the buffer to be as big as req-cmd.xfer in several + * places. So, do not allow CDBs with a very large ALLOCATION + * LENGTH. The real fix would be to modify scsi_read_data and + * dma_buf_read, so that they return data beyond the buflen + * as all zeros. + */ +if (req-cmd.xfer 65536) { +goto illegal_request; +} +r-buflen = MAX(4096, req-cmd.xfer); +r-iov.iov_base = qemu_blockalign(s-bs, r-buflen); +} + +outbuf = r-iov.iov_base; switch (req-cmd.buf[0]) { case TEST_UNIT_READY: if (!bdrv_is_inserted(s-bs)) @@ -965,11 +992,9 @@ static int32_t scsi_send_command(SCSIRequest *req, uint8_t *buf) SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req-dev); int32_t len; uint8_t command; -uint8_t *outbuf; int rc; command = buf[0]; -outbuf = (uint8_t *)r-iov.iov_base; DPRINTF(Command: lun=%d tag=0x%x data=0x%02x, req-lun, req-tag, buf[0]); #ifdef DEBUG_SCSI @@ -998,7 +1023,7 @@ static int32_t scsi_send_command(SCSIRequest *req, uint8_t *buf) case GET_CONFIGURATION: case SERVICE_ACTION_IN_16: case VERIFY_10: -rc = scsi_disk_emulate_command(r, outbuf); +rc = scsi_disk_emulate_command(r); if (rc 0) { return 0; } @@ -1228,11 +1253,8 @@ static SCSIRequest *scsi_new_request(SCSIDevice *d, uint32_t tag, { SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, d); SCSIRequest *req; -SCSIDiskReq *r; req = scsi_req_alloc(scsi_disk_reqops, s-qdev, tag, lun, hba_private); -r = DO_UPCAST(SCSIDiskReq, req, req); -r-iov.iov_base = qemu_blockalign(s-bs, SCSI_DMA_BUF_SIZE); return req; } -- 1.7.6
[Qemu-devel] [PATCH 4/5] scsi-disk: commonize iovec creation between reads and writes
Also, consistently use qiov.size instead of iov.iov_len. Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- hw/scsi-disk.c | 42 ++ 1 files changed, 18 insertions(+), 24 deletions(-) diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index 9724d0f..d0ac31e 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -105,6 +105,13 @@ static void scsi_cancel_io(SCSIRequest *req) r-req.aiocb = NULL; } +static uint32_t scsi_init_iovec(SCSIDiskReq *r) +{ +r-iov.iov_len = MIN(r-sector_count * 512, SCSI_DMA_BUF_SIZE); +qemu_iovec_init_external(r-qiov, r-iov, 1); +return r-qiov.size / 512; +} + static void scsi_read_complete(void * opaque, int ret) { SCSIDiskReq *r = (SCSIDiskReq *)opaque; @@ -122,12 +129,12 @@ static void scsi_read_complete(void * opaque, int ret) } } -DPRINTF(Data ready tag=0x%x len=%zd\n, r-req.tag, r-iov.iov_len); +DPRINTF(Data ready tag=0x%x len=%zd\n, r-req.tag, r-qiov.size); -n = r-iov.iov_len / 512; +n = r-qiov.size / 512; r-sector += n; r-sector_count -= n; -scsi_req_data(r-req, r-iov.iov_len); +scsi_req_data(r-req, r-qiov.size); } static void scsi_flush_complete(void * opaque, int ret) @@ -178,13 +185,7 @@ static void scsi_read_data(SCSIRequest *req) return; } -n = r-sector_count; -if (n SCSI_DMA_BUF_SIZE / 512) -n = SCSI_DMA_BUF_SIZE / 512; - -r-iov.iov_len = n * 512; -qemu_iovec_init_external(r-qiov, r-iov, 1); - +n = scsi_init_iovec(r); bdrv_acct_start(s-bs, r-acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ); r-req.aiocb = bdrv_aio_readv(s-bs, r-sector, r-qiov, n, scsi_read_complete, r); @@ -233,7 +234,6 @@ static void scsi_write_complete(void * opaque, int ret) { SCSIDiskReq *r = (SCSIDiskReq *)opaque; SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r-req.dev); -uint32_t len; uint32_t n; if (r-req.aiocb != NULL) { @@ -247,19 +247,15 @@ static void scsi_write_complete(void * opaque, int ret) } } -n = r-iov.iov_len / 512; +n = r-qiov.size / 512; r-sector += n; r-sector_count -= n; if (r-sector_count == 0) { scsi_req_complete(r-req, GOOD); } else { -len = r-sector_count * 512; -if (len SCSI_DMA_BUF_SIZE) { -len = SCSI_DMA_BUF_SIZE; -} -r-iov.iov_len = len; -DPRINTF(Write complete tag=0x%x more=%d\n, r-req.tag, len); -scsi_req_data(r-req, len); +scsi_init_iovec(r); +DPRINTF(Write complete tag=0x%x more=%d\n, r-req.tag, r-qiov.size); +scsi_req_data(r-req, r-qiov.size); } } @@ -278,18 +274,16 @@ static void scsi_write_data(SCSIRequest *req) return; } -n = r-iov.iov_len / 512; +n = r-qiov.size / 512; if (n) { -qemu_iovec_init_external(r-qiov, r-iov, 1); - bdrv_acct_start(s-bs, r-acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_WRITE); r-req.aiocb = bdrv_aio_writev(s-bs, r-sector, r-qiov, n, - scsi_write_complete, r); + scsi_write_complete, r); if (r-req.aiocb == NULL) { scsi_write_complete(r, -ENOMEM); } } else { -/* Invoke completion routine to fetch data from host. */ +/* Called for the first time. Ask the driver to send us more data. */ scsi_write_complete(r, 0); } } -- 1.7.6
[Qemu-devel] [PATCH 2/5] dma-helpers: allow including from target-independent code
Target-independent code cannot construct sglists, but it can take them from the outside as a black box. Allow this. Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- dma.h |8 ++-- qemu-common.h |1 + 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/dma.h b/dma.h index a6db5ba..f7e0142 100644 --- a/dma.h +++ b/dma.h @@ -15,10 +15,13 @@ #include hw/hw.h #include block.h -typedef struct { +typedef struct ScatterGatherEntry ScatterGatherEntry; + +#if defined(TARGET_PHYS_ADDR_BITS) +struct ScatterGatherEntry { target_phys_addr_t base; target_phys_addr_t len; -} ScatterGatherEntry; +}; struct QEMUSGList { ScatterGatherEntry *sg; @@ -31,6 +34,7 @@ void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint); void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base, target_phys_addr_t len); void qemu_sglist_destroy(QEMUSGList *qsg); +#endif typedef BlockDriverAIOCB *DMAIOFunc(BlockDriverState *bs, int64_t sector_num, QEMUIOVector *iov, int nb_sectors, diff --git a/qemu-common.h b/qemu-common.h index 404c421..ef9a2bb 100644 --- a/qemu-common.h +++ b/qemu-common.h @@ -18,6 +18,7 @@ typedef struct DeviceState DeviceState; struct Monitor; typedef struct Monitor Monitor; +typedef struct QEMUSGList QEMUSGList; /* we put basic includes here to avoid repeating them in device drivers */ #include stdlib.h -- 1.7.6
Re: [Qemu-devel] [PATCH] qcow2: removed unused depends_on field
Am 07.09.2011 16:19, schrieb Frediano Ziglio: Signed-off-by: Frediano Ziglio fredd...@gmail.com --- block/qcow2-cluster.c |3 +-- block/qcow2.h |1 - 2 files changed, 1 insertions(+), 3 deletions(-) Thanks, applied to the block branch. Kevin