Re: [Qemu-block] [PATCH 12/16] ahci: ncq migration
On Mon, Jun 22, 2015 at 08:21:11PM -0400, John Snow wrote: @@ -1555,6 +1573,35 @@ static int ahci_state_post_load(void *opaque, int version_id) return -1; } +for (j = 0; j AHCI_MAX_CMDS; j++) { +ncq_tfs = ad-ncq_tfs[j]; +ncq_tfs-drive = ad; + +if (ncq_tfs-used != ncq_tfs-halt) { +return -1; +} +if (!ncq_tfs-halt) { +continue; +} +if (!is_ncq(ncq_tfs-cmd)) { +return -1; +} +if (ncq_tfs-slot != ncq_tfs-tag) { +return -1; +} +if (ncq_tfs-slot AHCI_MAX_CMDS) { +return -1; +} +ncq_tfs-cmdh = ((AHCICmdHdr *)ad-lst)[ncq_tfs-slot]; Is there a guarantee that -lst has been mapped? Maybe pr-cmd PORT_CMD_START was 0. We need to check that the HBA is in a valid state for NCQ processing before attempting this. pgp0DLIgRj7_g.pgp Description: PGP signature
Re: [Qemu-block] [PATCH] block/iscsi: restore compatiblity with libiscsi 1.9.0
On 26/06/2015 12:08, Peter Lieven wrote: RHEL7 and others are stuck with libiscsi 1.9.0 since there unfortunately was an ABI breakage after that release. Signed-off-by: Peter Lieven p...@kamp.de --- block/iscsi.c | 31 ++- configure |6 +++--- qemu-options.hx |3 ++- 3 files changed, 31 insertions(+), 9 deletions(-) diff --git a/block/iscsi.c b/block/iscsi.c index f19a56a..551d583 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -168,6 +168,19 @@ static inline unsigned exp_random(double mean) return -mean * log((double)rand() / RAND_MAX); } +/* SCSI_STATUS_TASK_SET_FULL and SCSI_STATUS_TIMEOUT were introduced + * in libiscsi 1.10.0 as part of an enum. The LIBISCSI_API_VERSION + * macro was introduced in 1.11.0. So use the API_VERSION macro as + * a hint that the macros are defined and define them ourselves + * otherwise to keep the required libiscsi version at 1.9.0 */ +#if !defined(LIBISCSI_API_VERSION) +#define _SCSI_STATUS_TASK_SET_FULL 0x28 +#define _SCSI_STATUS_TIMEOUT0x0f02 +#else +#define _SCSI_STATUS_TASK_SET_FULL SCSI_STATUS_TASK_SET_FULL +#define _SCSI_STATUS_TIMEOUTSCSI_STATUS_TIMEOUT Unfortunately symbols starting with underscore + uppercase character are reserved for the language implementation (compiler libc). Paolo +#endif + static void iscsi_co_generic_cb(struct iscsi_context *iscsi, int status, void *command_data, void *opaque) @@ -188,11 +201,11 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status, iTask-do_retry = 1; goto out; } -if (status == SCSI_STATUS_BUSY || status == SCSI_STATUS_TIMEOUT || -status == SCSI_STATUS_TASK_SET_FULL) { +if (status == SCSI_STATUS_BUSY || status == _SCSI_STATUS_TIMEOUT || +status == _SCSI_STATUS_TASK_SET_FULL) { unsigned retry_time = exp_random(iscsi_retry_times[iTask-retries - 1]); -if (status == SCSI_STATUS_TIMEOUT) { +if (status == _SCSI_STATUS_TIMEOUT) { /* make sure the request is rescheduled AFTER the * reconnect is initiated */ retry_time = EVENT_INTERVAL * 2; @@ -1358,7 +1371,7 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags, QemuOpts *opts; Error *local_err = NULL; const char *filename; -int i, ret = 0; +int i, ret = 0, timeout = 0; opts = qemu_opts_create(runtime_opts, NULL, 0, error_abort); qemu_opts_absorb_qdict(opts, options, local_err); @@ -1428,7 +1441,15 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags, goto out; } -iscsi_set_timeout(iscsi, parse_timeout(iscsi_url-target)); +/* timeout handling is broken in libiscsi before 1.15.0 */ +timeout = parse_timeout(iscsi_url-target); +#if defined(LIBISCSI_API_VERSION) LIBISCSI_API_VERSION = 20150621 +iscsi_set_timeout(iscsi, timeout); +#else +if (timeout) { +error_report(iSCSI: ignoring timeout value for libiscsi 1.15.0); +} +#endif if (iscsi_full_connect_sync(iscsi, iscsi_url-portal, iscsi_url-lun) != 0) { error_setg(errp, iSCSI: Failed to connect to LUN : %s, diff --git a/configure b/configure index 2ed9db2..222694f 100755 --- a/configure +++ b/configure @@ -3660,15 +3660,15 @@ if compile_prog ; then fi ## -# Do we have libiscsi = 1.10.0 +# Do we have libiscsi = 1.9.0 if test $libiscsi != no ; then - if $pkg_config --atleast-version=1.10.0 libiscsi; then + if $pkg_config --atleast-version=1.9.0 libiscsi; then libiscsi=yes libiscsi_cflags=$($pkg_config --cflags libiscsi) libiscsi_libs=$($pkg_config --libs libiscsi) else if test $libiscsi = yes ; then - feature_not_found libiscsi Install libiscsi = 1.10.0 + feature_not_found libiscsi Install libiscsi = 1.9.0 fi libiscsi=no fi diff --git a/qemu-options.hx b/qemu-options.hx index ca37481..2d23330 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -2293,7 +2293,8 @@ line or a configuration file. Since version Qemu 2.4 it is possible to specify a iSCSI request timeout to detect stalled requests and force a reestablishment of the session. The timeout -is specified in seconds. The default is 0 which means no timeout. +is specified in seconds. The default is 0 which means no timeout. Libiscsi +1.15.0 is required for this feature. Example (without authentication): @example
Re: [Qemu-block] [PATCH v7 0/8] block: Mirror discarded sectors
On Mon, Jun 08, 2015 at 01:56:06PM +0800, Fam Zheng wrote: v7: Fix the lost assignment of s-unmap. v6: Fix pnum in bdrv_get_block_status_above. [Paolo] v5: Rewrite patch 1. Address Eric's comments on patch 3. Add Eric's rev-by to patches 2 4. Check BDRV_BLOCK_DATA in patch 3. [Paolo] This fixes the mirror assert failure reported by wangxiaolong: https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg04458.html The direct cause is that hbitmap code couldn't handle unset of bits *after* iterator's current position. We could fix that, but the bdrv_reset_dirty() call is more questionable: Before, if guest discarded some sectors during migration, it could see different data after moving to dest side, depending on block backends of the src and the dest. This is IMO worse than mirroring the actual reading as done in this series, because we don't know what the guest is doing. For example if a guest first issues WRITE SAME to wipe out the area then issues UNMAP to discard it, just to get rid of some sensitive data completely, we may miss both operations and leave stale data on dest image. Fam Zheng (8): block: Add bdrv_get_block_status_above qmp: Add optional bool unmap to drive-mirror mirror: Do zero write on target if sectors not allocated block: Fix dirty bitmap in bdrv_co_discard block: Remove bdrv_reset_dirty qemu-iotests: Make block job methods common qemu-iotests: Add test case for mirror with unmap iotests: Use event_wait in wait_ready block.c | 12 block/io.c| 60 ++- block/mirror.c| 28 +++--- blockdev.c| 5 hmp.c | 2 +- include/block/block.h | 4 +++ include/block/block_int.h | 4 +-- qapi/block-core.json | 8 +- qmp-commands.hx | 3 ++ tests/qemu-iotests/041| 66 ++- tests/qemu-iotests/132| 59 ++ tests/qemu-iotests/132.out| 5 tests/qemu-iotests/group | 1 + tests/qemu-iotests/iotests.py | 23 +++ 14 files changed, 196 insertions(+), 84 deletions(-) create mode 100644 tests/qemu-iotests/132 create mode 100644 tests/qemu-iotests/132.out -- 2.4.2 Thanks, applied to my block tree again now that the bool/enum question has been resolved: https://github.com/stefanha/qemu/commits/block Stefan pgpVrNnmooFQz.pgp Description: PGP signature
Re: [Qemu-block] [Qemu-devel] [Qemu-stable] [PATCH v7 0/8] block: Mirror discarded sectors
Hi, There is no problem, the observasion by Andrey was just that qmp command takes a few minutes before returning, because he didn't apply https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg02511.html Is this patch already apply on the block tree ? With nfs as source storage, it's really slow currently (lseek slow + a lot of nfs ops). - Mail original - De: Fam Zheng f...@redhat.com À: pbonzini pbonz...@redhat.com Cc: Kevin Wolf kw...@redhat.com, qemu-block@nongnu.org, Jeff Cody jc...@redhat.com, qemu-devel qemu-de...@nongnu.org, qemu-stable qemu-sta...@nongnu.org, stefanha stefa...@redhat.com, js...@redhat.com, wangxiaol...@ucloud.cn Envoyé: Jeudi 25 Juin 2015 12:45:38 Objet: Re: [Qemu-devel] [Qemu-stable] [PATCH v7 0/8] block: Mirror discarded sectors On Thu, 06/25 09:02, Fam Zheng wrote: On Wed, 06/24 19:01, Paolo Bonzini wrote: On 24/06/2015 11:08, Fam Zheng wrote: Stefan, The only controversial patches are the qmp/drive-mirror ones (1-3), while patches 4-8 are still useful on their own: they fix the mentioned crash and improve iotests. Shall we merge the second half (of course none of them depend on 1-3) now that softfreeze is approaching? Stefan, would you consider applying patches 4-8? Actually why not apply all of them? Even if blockdev-mirror is a superior interface in the long run, the current behavior of drive-mirror can cause images to balloon up to the full size, which is bad. Extending drive-mirror is okay IMHO for 2.4. Before we do that, Andrey Korolyov has reported a hang issue with unmap=true, I'll take a look at it today. There is no problem, the observasion by Andrey was just that qmp command takes a few minutes before returning, because he didn't apply https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg02511.html Fam
Re: [Qemu-block] [PATCH 06/16] ahci: record ncq failures
On Mon, Jun 22, 2015 at 08:21:05PM -0400, John Snow wrote: Handle NCQ failures for cases where we want to halt the VM on IO errors. Signed-off-by: John Snow js...@redhat.com --- hw/ide/ahci.c | 17 +++-- hw/ide/ahci.h | 1 + hw/ide/internal.h | 1 + 3 files changed, 17 insertions(+), 2 deletions(-) diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 71b5085..a838317 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -959,13 +959,25 @@ static void ncq_cb(void *opaque, int ret) return; } +ncq_tfs-halt = false; Why does halt need to be cleared here? if (ret 0) { -ncq_err(ncq_tfs); +bool is_read = ncq_tfs-cmd == READ_FPDMA_QUEUED; +BlockErrorAction action = blk_get_error_action(ide_state-blk, + is_read, -ret); +if (action == BLOCK_ERROR_ACTION_STOP) { +ncq_tfs-halt = true; +ide_state-bus-error_status = IDE_RETRY_HBA; +} else if (action == BLOCK_ERROR_ACTION_REPORT) { +ncq_err(ncq_tfs); +} +blk_error_action(ide_state-blk, action, is_read, -ret); } else { ide_state-status = READY_STAT | SEEK_STAT; } -ncq_finish(ncq_tfs); +if (!ncq_tfs-halt) { +ncq_finish(ncq_tfs); +} } static int is_ncq(uint8_t ata_cmd) @@ -1042,6 +1054,7 @@ static void process_ncq_command(AHCIState *s, int port, uint8_t *cmd_fis, } ncq_tfs-used = 1; +ncq_tfs-halt = false; ncq_tfs-drive = ad; ncq_tfs-slot = slot; ncq_tfs-cmd = ncq_fis-command; diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h index 33607d7..47a3122 100644 --- a/hw/ide/ahci.h +++ b/hw/ide/ahci.h @@ -262,6 +262,7 @@ typedef struct NCQTransferState { uint8_t cmd; int slot; int used; +bool halt; } NCQTransferState; struct AHCIDevice { diff --git a/hw/ide/internal.h b/hw/ide/internal.h index 7a4a86d..5abee19 100644 --- a/hw/ide/internal.h +++ b/hw/ide/internal.h @@ -499,6 +499,7 @@ struct IDEDevice { #define IDE_RETRY_READ 0x20 #define IDE_RETRY_FLUSH 0x40 #define IDE_RETRY_TRIM 0x80 +#define IDE_RETRY_HBA 0x100 Feel free to squash this patch together with the next one. It is hard to review in isolation since IDE_RETRY_HBA and -halt aren't used yet. pgpS2Y_vq6L4V.pgp Description: PGP signature
Re: [Qemu-block] [PATCH 13/16] ahci: add get_cmd_header helper
On Mon, Jun 22, 2015 at 08:21:12PM -0400, John Snow wrote: +static AHCICmdHdr *get_cmd_header(AHCIState *s, uint8_t port, uint8_t slot) +{ +if (port s-ports || slot AHCI_MAX_CMDS) { Should these be = instead of ? Otherwise 1 element beyond the end of the array can be accessed. pgpymHmGSfpAr.pgp Description: PGP signature
Re: [Qemu-block] [PATCH] block.c: fix real cdrom detection
On Jun 26, 2015, at 5:34 AM, Stefan Hajnoczi wrote: On Thu, Jun 25, 2015 at 11:11:24AM -0400, Programmingkid wrote: On Jun 25, 2015, at 9:31 AM, Stefan Hajnoczi wrote: On Tue, Jun 23, 2015 at 02:26:51PM -0400, Programmingkid wrote: On Jun 23, 2015, at 2:06 PM, John Snow wrote: So what's the issue that this patch attempts to fix and how did you determine that the fix was needed here? It doesn't look like it respects proper abstraction at a glance. Without the patch, QEMU would just quit when the -cdrom /dev/cdrom option is given. Why does it quit? Because of a bug. This is what it prints: Could not read image for determining its format: Invalid argument. The bdrv_pread() failure is what need you need to investigate. In the other sub-thread there have been hints about adding CD-ROM passthrough support on Mac OS X by filling in the missing parts in block/raw-posix.c. That should help you get to the bottom of the problem. This message seems to indicate that QEMU thinks the real cdrom drive is an image file. If the -drive format= option is not given, QEMU will try to detect the image format. That's the purpose of the find_image_format() function. QEMU does not make a distinction between image files and block devices because there are valid use cases where a block device uses an image format. For example, a disk or partition can contain qcow2 data (there is no partition table or file system, just qcow2). So what you are saying is if the user enters -cdrom /dev/cdrom, QEMU is suppose to call find_image_format. If everything goes right, the first if statement should be skipped. Then the bdrv_pread() function should succeed and so should the bdrv_probe_all() function. Is that how it is suppose to work?
Re: [Qemu-block] [PATCH 14/16] ahci: Do not map cmd_fis to generate response
On Mon, Jun 22, 2015 at 08:21:13PM -0400, John Snow wrote: @@ -744,8 +722,8 @@ static void ahci_write_fis_pio(AHCIDevice *ad, uint16_t len) pio_fis[9] = s-hob_lcyl; pio_fis[10] = s-hob_hcyl; pio_fis[11] = 0; -pio_fis[12] = cmd_fis[12]; -pio_fis[13] = cmd_fis[13]; +pio_fis[12] = s-nsector 0xFF; +pio_fis[13] = (s-nsector 8) 0xFF; hw/ide/core.c decreases s-nsector until it reaches 0 and the request ends. Will the values reported back to the guest be correct if we use s-nsector? pgpsfw1r9x0wX.pgp Description: PGP signature
Re: [Qemu-block] [PATCH 12/16] ahci: ncq migration
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On 06/26/2015 11:48 AM, Stefan Hajnoczi wrote: On Mon, Jun 22, 2015 at 08:21:11PM -0400, John Snow wrote: @@ -1555,6 +1573,35 @@ static int ahci_state_post_load(void *opaque, int version_id) return -1; } +for (j = 0; j AHCI_MAX_CMDS; j++) { + ncq_tfs = ad-ncq_tfs[j]; +ncq_tfs-drive = ad; + + if (ncq_tfs-used != ncq_tfs-halt) { +return -1; +} +if (!ncq_tfs-halt) { + continue; +} +if (!is_ncq(ncq_tfs-cmd)) { +return -1; +} +if (ncq_tfs-slot != ncq_tfs-tag) { +return -1; + } +if (ncq_tfs-slot AHCI_MAX_CMDS) { + return -1; +} +ncq_tfs-cmdh = ((AHCICmdHdr *)ad-lst)[ncq_tfs-slot]; Is there a guarantee that -lst has been mapped? Maybe pr-cmd PORT_CMD_START was 0. We need to check that the HBA is in a valid state for NCQ processing before attempting this. Slipped my mind, there should be no way to have ncq_tfs-halt set under normal circumstances except if the engine is started (and lst is mapped.) That said, stream corruption is always possible, so I'll add another validation check. An is not null check here should be sufficient, due to the if not halt continue up above. Thanks, - --js -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQIcBAEBCAAGBQJVjYHPAAoJEH3vgQaq/DkOkOsP/2O1bJJWaSAUVGB168GLQKGY B4ffiV37g4JrrLdhVeRuJiHYZmHT31VwBYqX2ZfxFJWqOBsei3PUm6qN7OjKyUQr td/oAaT3TKou8qySfbbgXMN1ubodtXU8CHOYykfNXiRqMeuW1CZUvP6IvEqJhZrj nu5H5l+W1lzsgEt2MCbK6VN96l3LMpYp8EafbZV1tOiMW63U8Tdc2IVJD1UP8krv U5ngLUegs5FCfHXYMBoIamDXDux52iE06EboP0J+nS21jNadtm/akJ2qq7ndKLVJ vghtlI+9Teq8bKq8JjY9qQY9tAe3FCIqygquwCOcQPKsi7FxnkB396k3sY9BckFK uBNMOBo6b0+5S4NnS7KV/L3986StepoStxBsEJj89EOi6AidDJocIs67m3TkOb5T WGbFQsnzXi+MLJw4Ck583DbwZ92fkdUNKEGlZtbvZ3UxpJ0qNos8hNPenewGlmQO BrcgTrD0f+Hr9GrLA+ACXgUah1I5giSsAYyjX3REwzrhuMEYMAv9vfBszrdMW/KP VZiLNl763YcjdK4EMC72R/R7+pf25RyTA8w4yAcc8mvTR+fDIuTtgCrHo8MEhIfS 91UC72zgmVhOWq2JelVR0sPKWGC4dvWLs/r0gfTOlthjzx5twnbSG2/XhrWPuJmd MCTuNL5CiuRCdjfq9hLB =XQh5 -END PGP SIGNATURE-
Re: [Qemu-block] [PATCH v2 00/16] ahci: ncq cleanup, part 1
On Mon, Jun 22, 2015 at 07:38:12PM -0400, John Snow wrote: requires: 1434470575-21625-1-git-send-email-js...@redhat.com [PATCH v2 0/4] ahci: misc fixes/tests for 2.4 This series adds a couple of tests to exercise the NCQ pathways and establish a baseline for us. Most of these patches are fairly short and should be relatively trivial to review. this series will lay the groundwork for part 2, which adds the rerror/werror=stop and migration support for that feature as well. === v2: - Cleared out #ifdefs to prevent unneeded bitrot - Fixed migration test in patch 16 - Removed the forceful ide_state-error clear from patch 08, replacing it instead with a change to the tests to issue IDENTIFY prior to attempting any data transfer. For convenience, this branch is available at: https://github.com/jnsnow/qemu.git branch ahci-ncq-s1 https://github.com/jnsnow/qemu/tree/ahci-ncq-s1 This version is tagged ahci-ncq-s1-v2: https://github.com/jnsnow/qemu/releases/tag/ahci-ncq-s1-v2 John Snow (16): ahci: Rename NCQFIS structure fields ahci: use shorter variables ahci: add ncq_err helper ahci: check for ncq prdtl overflow ahci: separate prdtl from opts ahci: add ncq debug checks ahci: ncq sector count correction ahci/qtest: Execute IDENTIFY prior to data commands libqos/ahci: fix cmd_sanity for ncq libqos/ahci: add NCQ frame support libqos/ahci: edit wait to be ncq aware libqos/ahci: adjust expected NCQ interrupts libqos/ahci: set the NCQ tag on command_commit libqos/ahci: Force all NCQ commands to be LBA48 qtest/ahci: simple ncq data test qtest/ahci: ncq migration test hw/ide/ahci.c | 102 +++-- hw/ide/ahci.h | 38 tests/ahci-test.c | 38 ++-- tests/libqos/ahci.c | 162 +--- tests/libqos/ahci.h | 59 ++- 5 files changed, 281 insertions(+), 118 deletions(-) -- 2.1.0 Reviewed-by: Stefan Hajnoczi stefa...@redhat.com pgpE4CEtPDdWu.pgp Description: PGP signature
Re: [Qemu-block] [Qemu-devel] [Qemu-stable] [PATCH v7 0/8] block: Mirror discarded sectors
With nfs as source storage, it's really slow currently (lseek slow + a lot of nfs ops). it's blocked around 30min for 300GB, with a raw file on a netapp san array through nfs. - Mail original - De: aderumier aderum...@odiso.com À: Fam Zheng f...@redhat.com Cc: Kevin Wolf kw...@redhat.com, qemu-block@nongnu.org, Jeff Cody jc...@redhat.com, qemu-devel qemu-de...@nongnu.org, qemu-stable qemu-sta...@nongnu.org, stefanha stefa...@redhat.com, pbonzini pbonz...@redhat.com, js...@redhat.com, wangxiaol...@ucloud.cn Envoyé: Vendredi 26 Juin 2015 15:36:10 Objet: Re: [Qemu-devel] [Qemu-stable] [PATCH v7 0/8] block: Mirror discarded sectors Hi, There is no problem, the observasion by Andrey was just that qmp command takes a few minutes before returning, because he didn't apply https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg02511.html Is this patch already apply on the block tree ? With nfs as source storage, it's really slow currently (lseek slow + a lot of nfs ops). - Mail original - De: Fam Zheng f...@redhat.com À: pbonzini pbonz...@redhat.com Cc: Kevin Wolf kw...@redhat.com, qemu-block@nongnu.org, Jeff Cody jc...@redhat.com, qemu-devel qemu-de...@nongnu.org, qemu-stable qemu-sta...@nongnu.org, stefanha stefa...@redhat.com, js...@redhat.com, wangxiaol...@ucloud.cn Envoyé: Jeudi 25 Juin 2015 12:45:38 Objet: Re: [Qemu-devel] [Qemu-stable] [PATCH v7 0/8] block: Mirror discarded sectors On Thu, 06/25 09:02, Fam Zheng wrote: On Wed, 06/24 19:01, Paolo Bonzini wrote: On 24/06/2015 11:08, Fam Zheng wrote: Stefan, The only controversial patches are the qmp/drive-mirror ones (1-3), while patches 4-8 are still useful on their own: they fix the mentioned crash and improve iotests. Shall we merge the second half (of course none of them depend on 1-3) now that softfreeze is approaching? Stefan, would you consider applying patches 4-8? Actually why not apply all of them? Even if blockdev-mirror is a superior interface in the long run, the current behavior of drive-mirror can cause images to balloon up to the full size, which is bad. Extending drive-mirror is okay IMHO for 2.4. Before we do that, Andrey Korolyov has reported a hang issue with unmap=true, I'll take a look at it today. There is no problem, the observasion by Andrey was just that qmp command takes a few minutes before returning, because he didn't apply https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg02511.html Fam
Re: [Qemu-block] [PATCH 01/16] ide: add limit to .prepare_buf()
On Mon, Jun 22, 2015 at 08:21:00PM -0400, John Snow wrote: diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 95d228f..f873ab1 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -49,7 +49,7 @@ static int handle_cmd(AHCIState *s,int port,int slot); static void ahci_reset_port(AHCIState *s, int port); static void ahci_write_fis_d2h(AHCIDevice *ad, uint8_t *cmd_fis); static void ahci_init_d2h(AHCIDevice *ad); -static int ahci_dma_prepare_buf(IDEDMA *dma, int is_write); +static int ahci_dma_prepare_buf(IDEDMA *dma, int64_t limit, int is_write); Why int64_t? Types involved here are uint64_t, dma_addr_t, size_t, and int. Out of these, uint64_t seems like a good candidate but I'm not sure why it needs to be signed. diff --git a/hw/ide/pci.c b/hw/ide/pci.c index 4afd0cf..a295baa 100644 --- a/hw/ide/pci.c +++ b/hw/ide/pci.c @@ -55,8 +55,11 @@ static void bmdma_start_dma(IDEDMA *dma, IDEState *s, /** * Return the number of bytes successfully prepared. * -1 on error. + * BUG?: Does not currently heed the 'limit' parameter because + * it is not clear what the correct behavior here is, + * see tests/ide-test.c QEMU implements both short and long PRDT cases for IDE in ide_dma_cb() and the tests check them. I saw nothing indicating that existing behavior might not correspond to real hardware behavior. Why do you say the correct behavior is unclear? pgprR9RUPcOwA.pgp Description: PGP signature
Re: [Qemu-block] [Qemu-devel] [PATCH v3 2/2] virtio-blk: Use blk_drain() to drain IO requests
Markus Armbruster arm...@redhat.com writes: Just spotted this in my git-pull... Alexander Yarygin yary...@linux.vnet.ibm.com writes: Each call of the virtio_blk_reset() function calls blk_drain_all(), which works for all existing BlockDriverStates, while draining only one is needed. This patch replaces blk_drain_all() by blk_drain() in virtio_blk_reset(). virtio_blk_data_plane_stop() should be called after draining because it restores vblk-complete_request. Cc: Michael S. Tsirkin m...@redhat.com Cc: Christian Borntraeger borntrae...@de.ibm.com Cc: Cornelia Huck cornelia.h...@de.ibm.com Cc: Kevin Wolf kw...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Cc: Stefan Hajnoczi stefa...@redhat.com Signed-off-by: Alexander Yarygin yary...@linux.vnet.ibm.com --- hw/block/virtio-blk.c | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index e6afe97..d8a906f 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -651,16 +651,21 @@ static void virtio_blk_dma_restart_cb(void *opaque, int running, static void virtio_blk_reset(VirtIODevice *vdev) { VirtIOBlock *s = VIRTIO_BLK(vdev); - -if (s-dataplane) { -virtio_blk_data_plane_stop(s-dataplane); -} +AioContext *ctx; /* * This should cancel pending requests, but can't do nicely until there * are per-device request lists. */ -blk_drain_all(); +ctx = blk_get_aio_context(s-blk); +aio_context_acquire(ctx); +blk_drain(s-blk); + +if (s-dataplane) { +virtio_blk_data_plane_stop(s-dataplane); +} +aio_context_release(ctx); + blk_set_enable_write_cache(s-blk, s-original_wce); } From bdrv_drain_all()'s comment: * Note that completion of an asynchronous I/O operation can trigger any * number of other I/O operations on other devices---for example a coroutine * can be arbitrarily complex and a constant flow of I/O can come until the * coroutine is complete. Because of this, it is not possible to have a * function to drain a single device's I/O queue. From bdrv_drain()'s comment: * See the warning in bdrv_drain_all(). This function can only be called if * you are sure nothing can generate I/O because you have op blockers * installed. blk_drain() and blk_drain_all() are trivial wrappers. Ignorant questions: * Why does blk_drain() suffice here? * Is blk_drain() (created in PATCH 1) even a safe interface? * We want to drain requests from only one bdrv and blk_drain() can do that. * Ignorant answer: I was told that the bdrv_drain_all()'s comment is obsolete and we can use bdrv_drain(). Here is a link to the old thread: http://marc.info/?l=qemu-develm=143154211017926w=2. Since I don't see the full picture of this area yet, I'm just relying on other people's opinion.
Re: [Qemu-block] [Qemu-devel] [PATCH] refresh filename after the node is replaced
On 26.06.2015 16:27, Wen Congyang wrote: At 2015/6/26 21:47, Max Reitz Wrote: On 25.06.2015 08:41, Wen Congyang wrote: We can use block job mirror to repair broken quorum files. But the command 'info block' doesn't output correct filename after block job mirror finishes. Which filename? The quorum filename is broken after this patch, too. In In my test, quorum has two children, s-common.bs-drv is quorum, and s-to_replace is one of his child. Without this patch, info block will output obsolete information. With this patch, the quorum's filename is right. I don't know why quorum filename is broken. Oh, yes, you are right, I forgot to execute block-job-complete. However, this patch relies on the Quorum BDS being the mirror source, which is the intentional use case but certainly not the only one: $ x86_64-softmmu/qemu-system-x86_64 -drive if=none,id=quorum,driver=quorum,children.0.file.filename=test1.qcow2,children.1.file.filename=test2.qcow2,children.1.node-name=ch1,vote-threshold=1 -drive if=none,id=drv,file=test2.qcow2 -qmp stdio {QMP: {version: {qemu: {micro: 50, minor: 3, major: 2}, package: }, capabilities: []}} {'execute':'qmp_capabilities'} {return: {}} {'execute':'drive-mirror','arguments':{'device':'drv','target':'tmp.qcow2','format':'qcow2','node-name':'nch1','replaces':'ch1','sync':'full'}} Formatting 'tmp.qcow2', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16 {return: {}} {timestamp: {seconds: 1435331363, microseconds: 217707}, event: BLOCK_JOB_READY, data: {device: drv, len: 0, offset: 0, speed: 0, type: mirror}} {'execute':'block-job-complete','arguments':{'device':'drv'}} {return: {}} {timestamp: {seconds: 1435331373, microseconds: 152945}, event: BLOCK_JOB_COMPLETED, data: {device: drv, len: 0, offset: 0, speed: 0, type: mirror}} {'execute':'query-block'} {return: [{device: quorum, locked: false, removable: true, inserted: {iops_rd: 0, detect_zeroes: off, image: {virtual-size: 67108864, filename: json:{\children\: [{\driver\: \qcow2\, \file\: {\driver\: \file\, \filename\: \test1.qcow2\}}, {\driver\: \qcow2\, \file\: {\driver\: \file\, \filename\: \test2.qcow2\}}], \driver\: \quorum\, \blkverify\: false, \rewrite-corrupted\: false, \vote-threshold\: 1}, format: quorum}, iops_wr: 0, ro: false, backing_file_depth: 0, drv: quorum, iops: 0, bps_wr: 0, write_threshold: 0, encrypted: false, bps: 0, bps_rd: 0, cache: {no-flush: false, direct: false, writeback: true}, file: json:{\children\: [{\driver\: \qcow2\, \file\: {\driver\: \file\, \filename\: \test1.qcow2\}}, {\driver\: \qcow2\, \file\: {\driver\: \file\, \filename\: \test2.qcow2\}}], \driver\: \quorum\, \blkverify\: false, \rewrite-corrupted\: false, \vote-threshold\: 1}, encryption_key_missing: false}, tray_open: false, type: unknown}, {device: drv, locked: false, removable: true, inserted: {iops_rd: 0, detect_zeroes: off, image: {virtual-size: 67108864, filename: test2.qcow2, cluster-size: 65536, format: qcow2, actual-size: 200704, format-specific: {type: qcow2, data: {compat: 1.1, lazy-refcounts: false, refcount-bits: 16, corrupt: false}}, dirty-flag: false}, iops_wr: 0, ro: false, backing_file_depth: 0, drv: qcow2, iops: 0, bps_wr: 0, write_threshold: 0, encrypted: false, bps: 0, bps_rd: 0, cache: {no-flush: false, direct: false, writeback: true}, file: test2.qcow2, encryption_key_missing: false}, tray_open: false, type: unknown}, {io-status: ok, device: ide1-cd0, locked: false, removable: true, tray_open: false, type: unknown}, {device: floppy0, locked: false, removable: true, tray_open: false, type: unknown}, {device: sd0, locked: false, removable: true, tray_open: false, type: unknown}]} This patch makes mirror_exit() refresh the name of the block job's device (in this case drv), which is not necessarily a parent of the node being replaced. Even if it were, imagine a configuration where there are two nested quorums, with the inner one being repaired using the replaces parameter for drive-mirror. In this case, this patch would fix the inner Quorum's file name, but not the outer one's. I see two solutions to this issue: Either, we do it right and that means, if there is a change in the BDS graph (e.g. because of bdrv_swap()), we have to call bdrv_refresh_filename() on all of the changed BDS's parents. In order to be able to enumerate a BDS's parents, we need Kevin's series, as said before. Or we revive my series block: Drop BDS.filename which drops the BDS.filename field completely and always rebuilds it if it is queried. This would fix the issue as well, however, there is a reason it has been lying around for nine months now, and that reason is that we did not yet fully examine the impacts of dropping that field, especially concerning libvirt. Max Thanks Wen Congyang order to fix this, we need to call bdrv_refresh_filename() after bdrv_swap() on all BDSs which reference one of the swapped BDSs. I think
Re: [Qemu-block] [PATCH 14/16] ahci: Do not map cmd_fis to generate response
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On 06/26/2015 11:59 AM, Stefan Hajnoczi wrote: On Mon, Jun 22, 2015 at 08:21:13PM -0400, John Snow wrote: @@ -744,8 +722,8 @@ static void ahci_write_fis_pio(AHCIDevice *ad, uint16_t len) pio_fis[9] = s-hob_lcyl; pio_fis[10] = s-hob_hcyl; pio_fis[11] = 0; -pio_fis[12] = cmd_fis[12]; - pio_fis[13] = cmd_fis[13]; +pio_fis[12] = s-nsector 0xFF; +pio_fis[13] = (s-nsector 8) 0xFF; hw/ide/core.c decreases s-nsector until it reaches 0 and the request ends. Will the values reported back to the guest be correct if we use s-nsector? See the commit message for justification of this one. Ultimately, it doesn't matter what gets put in here (for data transfer commands) -- but getting RID of the cmd_fis mapping is a strong positive. The field is supposed to contain the contents of the sector count register at the conclusion of the command. The spec says that for data transfer commands, this value is useless/uninteresting (it can be zero-filled, or filled with garbage, etc) However, for some special purpose commands, the sector count register is required to hold a value that has some special meaning. I don't think we currently support any of those, but for the future -- this part of the FIS generation code will work correctly regardless. So this patch is mainly about getting rid of the cmd_fis mapping and putting something nominally correct in its place. - --js -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQIcBAEBCAAGBQJVjYxgAAoJEH3vgQaq/DkOYiMQAIUetgDGGd8QvntbT147pifE f8vLqV1TDZL2cmorplUT8K5BFPEOiSFn3GIa2K4sVmfZy9L7r/Q8SUn/WtV0AeLD /OkU7l25JIz62cSKpZZpfJcKsw9iBjRoeltxM1AfiS+GpUIMRxxORUEsgswEJQzn CPLkOOE+ME1Bh9qq6/kYj0+gPLwC2UjoW2xny1A5pizwdf4VlJl9jyM5DWMdtryW Qcd/djxK2CAXsvpMYNQnnjF7JDp+nkGaXct+hTQEUZRWbUv02+KOt9StBfBh+Dq1 njYq6pgUfvvbjOzccHTzgeaUlCCI6mP+ng0ksjG2HW9LI8QyV9whtPB/Rc2ORfhz R1zTw5JW6fsAVnbfkxBC7OFJBZB4WfmJWEdIPUmrmLgpxaBDi9jOK+bqYXeTTpXv bgdFJN2BAWXW1F4UCSjDaVUAL1FYG5vjxw+MbbWGcdHTFPT6z3OZVxe1AST7SS3j xuqIVMaxBeskQgUJiwOBFSFkTMTpWPIfRLY1MLk9yqXuwQpg7NtQncbho3wCj1cC nxpjDjWRAB3a2odjCocA9RN6uBNeKq2WbmFzU6bODgjOHRFUPfP/s3qz43PPQGxa +VQZaQahOnvVIGWnerXMyl+LT/0tLhsBUZDeRoNbR3MMMprw3yxozP1BM1bQ4bWx brUOjknDPIwWfjfIpUID =SaS/ -END PGP SIGNATURE-
Re: [Qemu-block] [Qemu-devel] [PATCH 06/16] ahci: record ncq failures
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On 06/26/2015 11:35 AM, Stefan Hajnoczi wrote: On Mon, Jun 22, 2015 at 08:21:05PM -0400, John Snow wrote: Handle NCQ failures for cases where we want to halt the VM on IO errors. Signed-off-by: John Snow js...@redhat.com --- hw/ide/ahci.c | 17 +++-- hw/ide/ahci.h | 1 + hw/ide/internal.h | 1 + 3 files changed, 17 insertions(+), 2 deletions(-) diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 71b5085..a838317 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -959,13 +959,25 @@ static void ncq_cb(void *opaque, int ret) return; } +ncq_tfs-halt = false; Why does halt need to be cleared here? Might make more sense to clear it just on the beginning of every command, in execute(). There's no strong reason here other than If there's an error and it should be set, it'll get reset again pretty soon. It's just a default state. I could move it from process to execute. if (ret 0) { -ncq_err(ncq_tfs); +bool is_read = ncq_tfs-cmd == READ_FPDMA_QUEUED; +BlockErrorAction action = blk_get_error_action(ide_state-blk, + is_read, -ret); +if (action == BLOCK_ERROR_ACTION_STOP) { +ncq_tfs-halt = true; + ide_state-bus-error_status = IDE_RETRY_HBA; +} else if (action == BLOCK_ERROR_ACTION_REPORT) { + ncq_err(ncq_tfs); +} + blk_error_action(ide_state-blk, action, is_read, -ret); } else { ide_state-status = READY_STAT | SEEK_STAT; } -ncq_finish(ncq_tfs); +if (!ncq_tfs-halt) { + ncq_finish(ncq_tfs); +} } static int is_ncq(uint8_t ata_cmd) @@ -1042,6 +1054,7 @@ static void process_ncq_command(AHCIState *s, int port, uint8_t *cmd_fis, } ncq_tfs-used = 1; +ncq_tfs-halt = false; ncq_tfs-drive = ad; ncq_tfs-slot = slot; ncq_tfs-cmd = ncq_fis-command; diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h index 33607d7..47a3122 100644 --- a/hw/ide/ahci.h +++ b/hw/ide/ahci.h @@ -262,6 +262,7 @@ typedef struct NCQTransferState { uint8_t cmd; int slot; int used; +bool halt; } NCQTransferState; struct AHCIDevice { diff --git a/hw/ide/internal.h b/hw/ide/internal.h index 7a4a86d..5abee19 100644 --- a/hw/ide/internal.h +++ b/hw/ide/internal.h @@ -499,6 +499,7 @@ struct IDEDevice { #define IDE_RETRY_READ 0x20 #define IDE_RETRY_FLUSH 0x40 #define IDE_RETRY_TRIM 0x80 +#define IDE_RETRY_HBA 0x100 Feel free to squash this patch together with the next one. It is hard to review in isolation since IDE_RETRY_HBA and -halt aren't used yet. Will do. -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQIcBAEBCAAGBQJVjZmbAAoJEH3vgQaq/DkOPUAP/26368m3XEWpWLPePLnBOvS+ tFuUnYyrWyWdq9p9lNBPcz+v+//CAdISQ8bRZX4OS+f3W+uVB04yJc+N5igiFJCe 9f8+lnLnVO6nmNPzeKPhfigBPF8fc5tnmk4P9bSYUcEmTkdPb+HD8tbNh4l9euWl 0+uqGUhn7Gj1G8aZSZ7UNv+6Ru7SBrygnn/GlMrqrVcbTSW2uj6JmpT1xvdk1iU8 mh+j76JIBYRn2dZf4ZIHJW51x5Zijvsi04Rc8EfRDqiGZ/7HK9EQwwoZ5vUImo/d vuSufuxISAGc5ECeGpb7fFyGbfl7Y2R+l+qipUsYZ704wTGdnkBMAst0E/NK5y8U IHCv/5IR9lLp1qaAL0DSmkuaz6xeiBktKmy0lRT1Yq38ZK2RMCzgRPHTbsPGfF/Z XhRkz5lgqJAdzJJNlvUDNI0jAepiREsBP4Oqi7FMaKq7ix3haSCH0k21JIJbAls5 yRNP9hUNh0Q30E/09h4/ZYRhoQTbvzZS2tLJ8zG1lB0dnIK0uML6SjO2mKbsWpN6 hPIYku04BtqDtPlRcfsOQwJ04bHHKh46PSEWaC0xCsvMVhGzWhs4FqMCxsKOvSMC rAEYzgEWaJdZUMC+qp4RS8aX2yJP/nmHivEEb2vI6196jsXdqrdmYLBNHcL+Q6kY bDrMjWdP5CbDUu7E4pAY =kgU+ -END PGP SIGNATURE-
Re: [Qemu-block] [Qemu-devel] [PATCH 13/16] ahci: add get_cmd_header helper
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On 06/26/2015 11:51 AM, Stefan Hajnoczi wrote: On Mon, Jun 22, 2015 at 08:21:12PM -0400, John Snow wrote: +static AHCICmdHdr *get_cmd_header(AHCIState *s, uint8_t port, uint8_t slot) +{ +if (port s-ports || slot AHCI_MAX_CMDS) { Should these be = instead of ? Otherwise 1 element beyond the end of the array can be accessed. Sigh. Yes, yes it should. Thank you for saving me from myself. -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQIcBAEBCAAGBQJVjZrRAAoJEH3vgQaq/DkOjcMP/0CqxJAWcgsnFH5Opwz7iNbm OyvLakvzUmU958qN3nzG2vMME3WwyVF3bxVJkoT3v1Pc6Tm9e+hq+R1oDzD0TD6X tfjQTc5JZcViKSD74Huo85gKoVp3Tp4idOaaVFn36TmP9gk5i6g/C0Nf7EzY12Er 4op6mLvBhiU6NH72UwP4HHij0KZCMCVYYH4qhAWsil/QIugAgELlM9TFkVI49lHs WN3IE+5gQijd99Z5VJaC9wrmtOdZWHblfTeWMvVQfMfbg5dJ2Aiya1tHlw6QrPzb 7nGPHTwyGLmNsvUsNqMHrWGaL9e4iEwXGYNRnIT16L8QPzGqg7v4D3lvJcGv5LrM OFPje2Fk9w8OGdgRjrQhQcJXfU+IPkvlwnW7G3NO2Ts0GK5cHClRafZVjvAhMuXs 5M89BTdi7LBGJ7EWq8ByP5mxPj+hiR3hsSG41OPFg3e0VpwtgFJ6umaZde1tMrHp IwnOZcvXUoTOAna3Y4E/fSwK1jKczTXIvkl4+HnJB40ekyoSkyl6qXV1FlMRiMnq pAeJ5jgBovuV2gx8pLYU3ipR0LJTjzz7N39FkAJJyekWWjNVrZ4Ue5IGW/pXKVkN SjeSrZTtxSmyO8BPya7/xsqjiqo3v7m2jSBLTVEsUtkFEaet4xlAH2OQZBvlTbuI n5Hc61NkB6qMU5k6We31 =Sfdd -END PGP SIGNATURE-
Re: [Qemu-block] [Qemu-devel] [PATCH 00/16] ahci: ncq cleanup, part 2
On 06/26/2015 12:11 PM, Stefan Hajnoczi wrote: On Mon, Jun 22, 2015 at 08:20:59PM -0400, John Snow wrote: requires: 1434470575-21625-1-git-send-email-js...@redhat.com 1435016308-6150-1-git-send-email-js...@redhat.com [PATCH v2 0/4] ahci: misc fixes/tests for 2.4 [PATCH v2 00/16] ahci: ncq cleanup, part 1 This chunk gets NCQ migration and and resume support working. There's still some left to do, particularly around error handling and FIS semantics, but this should get us most of the way there. There is one RFC bit in this patch, inside of Patch #1, concerning how to handle truncating PRDTs that are too large -- it looks like we have attempted to address it in the past, and I accidentally bumped up against it now. By actually trying to consume the PRDs but ignoring any extra space they describe, I would break ide-test. I'll post logs later, but judging by the test text itself, we don't seem to know what the right behavior is. For convenience, this branch is available at: https://github.com/jnsnow/qemu.git branch ahci-ncq-s2 https://github.com/jnsnow/qemu/tree/ahci-ncq-s2 This version is tagged ahci-ncq-s2-v1: https://github.com/jnsnow/qemu/releases/tag/ahci-ncq-s2-v1 John Snow (16): ide: add limit to .prepare_buf() ahci: stash ncq command ahci: assert is_ncq for process_ncq ahci: refactor process_ncq_command ahci: factor ncq_finish out of ncq_cb ahci: record ncq failures ahci: kick NCQ queue ahci: correct types in NCQTransferState ahci: correct ncq sector count qtest/ahci: halted NCQ test ahci: add cmd header to ncq transfer state ahci: ncq migration ahci: add get_cmd_header helper ahci: Do not map cmd_fis to generate response qtest/ahci: halted ncq migration test ahci: fix sdb fis semantics hw/ide/ahci.c | 330 -- hw/ide/ahci.h | 9 +- hw/ide/core.c | 12 +- hw/ide/internal.h | 4 +- hw/ide/macio.c| 2 +- hw/ide/pci.c | 5 +- tests/ahci-test.c | 38 +-- 7 files changed, 252 insertions(+), 148 deletions(-) I posted comments on a few patches. The rest looks good. OK, I'm smooshing 6/7 and 12/13, and correcting issues where appropriate. Will wait to feedback-feedback before I hit the send button. --js
Re: [Qemu-block] [PATCH 16/16] ahci: fix sdb fis semantics
On Mon, Jun 22, 2015 at 08:21:15PM -0400, John Snow wrote: @@ -682,19 +680,22 @@ static void ahci_write_fis_sdb(AHCIState *s, int port, uint32_t finished) sdb_fis-type = SATA_FIS_TYPE_SDB; /* Interrupt pending Notification bit */ -sdb_fis-flags = (ad-hba-control_regs.irqstatus ? (1 6) : 0); +sdb_fis-flags = 0x40; /* Interrupt bit, always 1 for NCQ */ ... -ahci_trigger_irq(s, ad, PORT_IRQ_SDB_FIS); +if (sdb_fis-flags 0x40) { +ahci_trigger_irq(s, ad, PORT_IRQ_SDB_FIS); +} This if statement is always true. pgpLYUB9t3Qco.pgp Description: PGP signature
Re: [Qemu-block] [PATCH 1/2] block: vpc - prevent overflow if max_table_entries = 0x40000000
On Thu, Jun 25, 2015 at 11:05:20AM -0400, Jeff Cody wrote: On Thu, Jun 25, 2015 at 03:28:35PM +0100, Stefan Hajnoczi wrote: On Wed, Jun 24, 2015 at 03:54:27PM -0400, Jeff Cody wrote: @@ -269,7 +270,9 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } -s-pagetable = qemu_try_blockalign(bs-file, s-max_table_entries * 4); +pagetable_size = (size_t) s-max_table_entries * 4; + +s-pagetable = qemu_try_blockalign(bs-file, pagetable_size); On 32-bit hosts size_t is 32-bit so the overflow hasn't been solved. Does it make sense to impose a limit on pagetable_size? Good point. Yes, it does. The VHD spec says that the Max Table Entries field should be equal to the disk size / block size. I don't know if there are images out there that treat that as = disk size / block size rather than ==, however. But if we assume max size of 2TB for a VHD disk, and a minimal block size of 512 bytes, that would give us a max_table_entries of 0x1, which exceeds 32 bits by itself. For pagetable_size to fit in a 32-bit, that means to support 2TB on a 32-bit host in the current implementation, the minimum block size is 4096. We could check during open / create that (disk_size / block_size) * 4 SIZE_MAX, and refuse to open if this is not true (and also validate max_table_entries to fit in the same). Sounds good. pgpBKUy3yog9o.pgp Description: PGP signature
Re: [Qemu-block] [PATCH for-2.4] block: keep bitmap if incremental backup job is cancelled
On Thu, Jun 25, 2015 at 12:33:25PM -0400, Jeff Cody wrote: On Thu, Jun 25, 2015 at 12:30:56PM -0400, John Snow wrote: On 06/25/2015 08:53 AM, Stefan Hajnoczi wrote: Reclaim the dirty bitmap if an incremental backup block job is cancelled. The ret variable may be 0 when the job is cancelled so it's not enough to check ret 0. Reviewed-by: John Snow js...@redhat.com Reviewed-by: Max Reitz mre...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/backup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/backup.c b/block/backup.c index 4a1af68..ddf8424 100644 --- a/block/backup.c +++ b/block/backup.c @@ -431,7 +431,7 @@ static void coroutine_fn backup_run(void *opaque) if (job-sync_bitmap) { BdrvDirtyBitmap *bm; -if (ret 0) { +if (ret 0 || block_job_is_cancelled(job-common)) { /* Merge the successor back into the parent, delete nothing. */ bm = bdrv_reclaim_dirty_bitmap(bs, job-sync_bitmap, NULL); assert(bm); Didn't Jeff Cody already stage this? Yes Thanks! pgpjeAuwaK6L7.pgp Description: PGP signature
[Qemu-block] [PATCH] block/iscsi: restore compatiblity with libiscsi 1.9.0
RHEL7 and others are stuck with libiscsi 1.9.0 since there unfortunately was an ABI breakage after that release. Signed-off-by: Peter Lieven p...@kamp.de --- block/iscsi.c | 31 ++- configure |6 +++--- qemu-options.hx |3 ++- 3 files changed, 31 insertions(+), 9 deletions(-) diff --git a/block/iscsi.c b/block/iscsi.c index f19a56a..551d583 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -168,6 +168,19 @@ static inline unsigned exp_random(double mean) return -mean * log((double)rand() / RAND_MAX); } +/* SCSI_STATUS_TASK_SET_FULL and SCSI_STATUS_TIMEOUT were introduced + * in libiscsi 1.10.0 as part of an enum. The LIBISCSI_API_VERSION + * macro was introduced in 1.11.0. So use the API_VERSION macro as + * a hint that the macros are defined and define them ourselves + * otherwise to keep the required libiscsi version at 1.9.0 */ +#if !defined(LIBISCSI_API_VERSION) +#define _SCSI_STATUS_TASK_SET_FULL 0x28 +#define _SCSI_STATUS_TIMEOUT0x0f02 +#else +#define _SCSI_STATUS_TASK_SET_FULL SCSI_STATUS_TASK_SET_FULL +#define _SCSI_STATUS_TIMEOUTSCSI_STATUS_TIMEOUT +#endif + static void iscsi_co_generic_cb(struct iscsi_context *iscsi, int status, void *command_data, void *opaque) @@ -188,11 +201,11 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status, iTask-do_retry = 1; goto out; } -if (status == SCSI_STATUS_BUSY || status == SCSI_STATUS_TIMEOUT || -status == SCSI_STATUS_TASK_SET_FULL) { +if (status == SCSI_STATUS_BUSY || status == _SCSI_STATUS_TIMEOUT || +status == _SCSI_STATUS_TASK_SET_FULL) { unsigned retry_time = exp_random(iscsi_retry_times[iTask-retries - 1]); -if (status == SCSI_STATUS_TIMEOUT) { +if (status == _SCSI_STATUS_TIMEOUT) { /* make sure the request is rescheduled AFTER the * reconnect is initiated */ retry_time = EVENT_INTERVAL * 2; @@ -1358,7 +1371,7 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags, QemuOpts *opts; Error *local_err = NULL; const char *filename; -int i, ret = 0; +int i, ret = 0, timeout = 0; opts = qemu_opts_create(runtime_opts, NULL, 0, error_abort); qemu_opts_absorb_qdict(opts, options, local_err); @@ -1428,7 +1441,15 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags, goto out; } -iscsi_set_timeout(iscsi, parse_timeout(iscsi_url-target)); +/* timeout handling is broken in libiscsi before 1.15.0 */ +timeout = parse_timeout(iscsi_url-target); +#if defined(LIBISCSI_API_VERSION) LIBISCSI_API_VERSION = 20150621 +iscsi_set_timeout(iscsi, timeout); +#else +if (timeout) { +error_report(iSCSI: ignoring timeout value for libiscsi 1.15.0); +} +#endif if (iscsi_full_connect_sync(iscsi, iscsi_url-portal, iscsi_url-lun) != 0) { error_setg(errp, iSCSI: Failed to connect to LUN : %s, diff --git a/configure b/configure index 2ed9db2..222694f 100755 --- a/configure +++ b/configure @@ -3660,15 +3660,15 @@ if compile_prog ; then fi ## -# Do we have libiscsi = 1.10.0 +# Do we have libiscsi = 1.9.0 if test $libiscsi != no ; then - if $pkg_config --atleast-version=1.10.0 libiscsi; then + if $pkg_config --atleast-version=1.9.0 libiscsi; then libiscsi=yes libiscsi_cflags=$($pkg_config --cflags libiscsi) libiscsi_libs=$($pkg_config --libs libiscsi) else if test $libiscsi = yes ; then - feature_not_found libiscsi Install libiscsi = 1.10.0 + feature_not_found libiscsi Install libiscsi = 1.9.0 fi libiscsi=no fi diff --git a/qemu-options.hx b/qemu-options.hx index ca37481..2d23330 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -2293,7 +2293,8 @@ line or a configuration file. Since version Qemu 2.4 it is possible to specify a iSCSI request timeout to detect stalled requests and force a reestablishment of the session. The timeout -is specified in seconds. The default is 0 which means no timeout. +is specified in seconds. The default is 0 which means no timeout. Libiscsi +1.15.0 is required for this feature. Example (without authentication): @example -- 1.7.9.5
Re: [Qemu-block] [PATCH 0/3] Add infrastructure to compute timed averages
On Fri, Jun 12, 2015 at 04:01:28PM +0300, Alberto Garcia wrote: This series adds a new module that can be used to compute the average of a set of values in a certain period of time. This will be used by the accounting code to obtain statistics such as the min / max / average latency of I/O commands. This is based on Benoît's code, originally written last year. Regards, Berto Alberto Garcia (3): timer: Move NANOSECONDS_PER_SECONDS to timer.h timer: Use a single definition of NSEC_PER_SEC for the whole codebase util: Infrastructure for computing recent averages hw/ppc/ppc.c | 2 - hw/ppc/spapr_rtc.c | 3 +- hw/timer/mc146818rtc.c | 1 - hw/usb/hcd-ehci.c| 2 +- include/qemu/throttle.h | 2 - include/qemu/timed-average.h | 58 include/qemu/timer.h | 2 + tests/Makefile | 4 + tests/rtl8139-test.c | 10 +-- tests/test-throttle.c| 8 +- tests/test-timed-average.c | 89 ++ tests/wdt_ib700-test.c | 15 ++-- util/Makefile.objs | 1 + util/throttle.c | 4 +- util/timed-average.c | 208 +++ 15 files changed, 382 insertions(+), 27 deletions(-) create mode 100644 include/qemu/timed-average.h create mode 100644 tests/test-timed-average.c create mode 100644 util/timed-average.c Thanks, applied patches 1 2 to my block tree: https://github.com/stefanha/qemu/commits/block Patch 3 is going to be unused in QEMU 2.4 for I'm holding off on merging it. Please resend it together with the accounting series so it can be merged for QEMU 2.5. Thanks, Stefan pgpx8GD9ORGUS.pgp Description: PGP signature
Re: [Qemu-block] [Qemu-devel] [PATCH] block.c: fix real cdrom detection
On 25/06/2015 19:19, Laurent Vivier wrote: On 25/06/2015 18:16, Paolo Bonzini wrote: On 25/06/2015 18:12, Laurent Vivier wrote: On 25/06/2015 17:48, Paolo Bonzini wrote: On 25/06/2015 17:32, Programmingkid wrote: I think we are going to have to agree to disagree. I have never used the /dev/sr(0 | 1) devices and don't see how they would be effected by this patch. Are you trying to say the /dev/sr(0 | 1) devices *should* be handled by this patch? Thinking about your question some more, I see what you mean. On Linux /dev/sr0 refers to the cdrom drive. Also on Linux, the /dev/cdrom link refers to the /dev/sr0 device file. So if you just use /dev/cdrom, you are good. Well, that's not how things work. If you do things like that, you end up with a bunch of hacks, not with a decent piece of software. There is support for CD-ROM passthrough on Linux and FreeBSD in block/raw-posix.c. Perhaps the FreeBSD support can be extended to OS X as well. In fact, programmingkid, you should fix it in hdev_open() where there is already a #if __APPLE__ . Paolo, I agree with you but : hdev_open() #if defined(__linux__) { char resolved_path[ MAXPATHLEN ], *temp; temp = realpath(filename, resolved_path); if (temp strstart(temp, /dev/sg, NULL)) { bs-sg = 1; } #endif I'm wondering who had this strange idea... :) I was very scared to type git blame here. :) But the question is also http://geek-and-poke.com/2013/11/24/simply-explained BTW, it is a legacy from 2006: 19cb373 better support of host drives coming from MacOS X (again!): 3b0d4f6 OS X: support for the built in CD-ROM drive (Mike Kronenberg) where to put the checks. Putting it at a random place in block.c is not a good idea. But yes, this is also bad. It should use stat and check the major/minor numbers. Yes, we should check if major is SCSI_GENERIC_MAJOR (21) (on linux). We can also try to send an SG command like in cdrom_probe_device(). Something like in scsi_generic_realize(): rc = blk_ioctl(s-conf.blk, SG_GET_VERSION_NUM, sg_version); if (rc 0) { error_setg(errp, cannot get SG_IO version number: %s. Is this a SCSI device?, strerror(-rc)); return; } BTW, this solution is already queued in Stefan's block tree: https://github.com/stefanha/qemu/commit/3307ed7b3fac5ba99eb3b84904b0b7cdc3592a61 Laurent
Re: [Qemu-block] [Qemu-devel] [PATCH] block.c: fix real cdrom detection
On Thu, Jun 25, 2015 at 07:19:14PM +0200, Laurent Vivier wrote: On 25/06/2015 18:16, Paolo Bonzini wrote: On 25/06/2015 18:12, Laurent Vivier wrote: On 25/06/2015 17:48, Paolo Bonzini wrote: On 25/06/2015 17:32, Programmingkid wrote: I think we are going to have to agree to disagree. I have never used the /dev/sr(0 | 1) devices and don't see how they would be effected by this patch. Are you trying to say the /dev/sr(0 | 1) devices *should* be handled by this patch? Thinking about your question some more, I see what you mean. On Linux /dev/sr0 refers to the cdrom drive. Also on Linux, the /dev/cdrom link refers to the /dev/sr0 device file. So if you just use /dev/cdrom, you are good. Well, that's not how things work. If you do things like that, you end up with a bunch of hacks, not with a decent piece of software. There is support for CD-ROM passthrough on Linux and FreeBSD in block/raw-posix.c. Perhaps the FreeBSD support can be extended to OS X as well. In fact, programmingkid, you should fix it in hdev_open() where there is already a #if __APPLE__ . Paolo, I agree with you but : hdev_open() #if defined(__linux__) { char resolved_path[ MAXPATHLEN ], *temp; temp = realpath(filename, resolved_path); if (temp strstart(temp, /dev/sg, NULL)) { bs-sg = 1; } #endif I'm wondering who had this strange idea... :) I was very scared to type git blame here. :) But the question is also http://geek-and-poke.com/2013/11/24/simply-explained BTW, it is a legacy from 2006: 19cb373 better support of host drives coming from MacOS X (again!): 3b0d4f6 OS X: support for the built in CD-ROM drive (Mike Kronenberg) where to put the checks. Putting it at a random place in block.c is not a good idea. But yes, this is also bad. It should use stat and check the major/minor numbers. Yes, we should check if major is SCSI_GENERIC_MAJOR (21) (on linux). That would be too specific since there are other drivers that support SG ioctls, like block/bsg.c. We can also try to send an SG command like in cdrom_probe_device(). Something like in scsi_generic_realize(): rc = blk_ioctl(s-conf.blk, SG_GET_VERSION_NUM, sg_version); if (rc 0) { error_setg(errp, cannot get SG_IO version number: %s. Is this a SCSI device?, strerror(-rc)); return; } That was recently done in: commit 3307ed7b3fac5ba99eb3b84904b0b7cdc3592a61 Author: Dimitris Aragiorgis dim...@arrikto.com Date: Tue Jun 23 13:45:00 2015 +0300 raw-posix: Introduce hdev_is_sg() pgpLIqnQzawNC.pgp Description: PGP signature
Re: [Qemu-block] [PATCH] block/iscsi: add support for request timeouts
On 25/06/2015 23:21, Peter Lieven wrote: I will send a v2 that is compatible with 1.9.0 and enables the timeout stuff only for libiscsi = 1.15.0 Since Stefan has already applied the patch, restoring 1.9.0 compatibility (and disabling timeouts between 1.9.0 and 1.14.x) on top of this patch would be enough. Thanks for your understanding! Paolo
Re: [Qemu-block] [PATCH] block/nfs: add support for setting debug level
Am 26.06.2015 um 11:14 schrieb Stefan Hajnoczi: On Thu, Jun 25, 2015 at 03:26:46PM +0200, Peter Lieven wrote: Am 25.06.2015 um 15:18 schrieb Stefan Hajnoczi: On Tue, Jun 23, 2015 at 10:12:15AM +0200, Peter Lieven wrote: upcoming libnfs versions will support logging debug messages. Add support for it in qemu through an URL parameter. Signed-off-by: Peter Lieven p...@kamp.de --- block/nfs.c | 4 1 file changed, 4 insertions(+) diff --git a/block/nfs.c b/block/nfs.c index ca9e24e..f7388a3 100644 --- a/block/nfs.c +++ b/block/nfs.c @@ -329,6 +329,10 @@ static int64_t nfs_client_open(NFSClient *client, const char *filename, } else if (!strcmp(qp-p[i].name, readahead)) { nfs_set_readahead(client-context, val); #endif +#ifdef LIBNFS_FEATURE_DEBUG +} else if (!strcmp(qp-p[i].name, debug)) { +nfs_set_debug(client-context, val); +#endif } else { error_setg(errp, Unknown NFS parameter name: %s, qp-p[i].name); Untrusted users may be able to set these options since they are encoded in the URI. I'm imagining a hosting or cloud scenario like OpenStack. A verbose debug level spams stderr and could consume a lot of disk space. (The uid and gid options are probably okay since the NFS server cannot trust the uid/gid coming from QEMU anyway.) I think we can merge this patch for QEMU 2.4 but I'd like to have a discussion about the security risk of encoding libnfs options in the URI. CCed Eric Blake in case libvirt is affected. Has anyone thought about this and what are the rules? Good point. In general I think there should be some kind of sanitization of the parameters before they are passed on to Qemu. In our use case the user cannot pass any kind of URIs himself, but this might be different in other backends. The readahead value is as dangerous as well if not sanitized. I had a discussion with Ronnie in the past if we should encode parameters in the URI or via environment like it is done in libiscsi. If I remember correctly we came up with the URI parameters for better usability, but hadn't attack scenarios in mind. I am also open to only allow uncritical parameters in the URI and pass others via a new -nfs cmdline option. Or limit max readahead and max debug level settable via URI. I'd feel safer if the option was in runtime_opts instead. The the management tool has to pass them explicitly and the end user cannot influence them via the URI. If an option is needed both at open and create time, then it must also be parsed from nfs_file_create() opts. Ok, I will send a patch that follows this approach. And also a second one to limit the readahead size to a reasonable value. Peter
Re: [Qemu-block] [PATCH] block/nfs: add support for setting debug level
On Thu, Jun 25, 2015 at 03:26:46PM +0200, Peter Lieven wrote: Am 25.06.2015 um 15:18 schrieb Stefan Hajnoczi: On Tue, Jun 23, 2015 at 10:12:15AM +0200, Peter Lieven wrote: upcoming libnfs versions will support logging debug messages. Add support for it in qemu through an URL parameter. Signed-off-by: Peter Lieven p...@kamp.de --- block/nfs.c | 4 1 file changed, 4 insertions(+) diff --git a/block/nfs.c b/block/nfs.c index ca9e24e..f7388a3 100644 --- a/block/nfs.c +++ b/block/nfs.c @@ -329,6 +329,10 @@ static int64_t nfs_client_open(NFSClient *client, const char *filename, } else if (!strcmp(qp-p[i].name, readahead)) { nfs_set_readahead(client-context, val); #endif +#ifdef LIBNFS_FEATURE_DEBUG +} else if (!strcmp(qp-p[i].name, debug)) { +nfs_set_debug(client-context, val); +#endif } else { error_setg(errp, Unknown NFS parameter name: %s, qp-p[i].name); Untrusted users may be able to set these options since they are encoded in the URI. I'm imagining a hosting or cloud scenario like OpenStack. A verbose debug level spams stderr and could consume a lot of disk space. (The uid and gid options are probably okay since the NFS server cannot trust the uid/gid coming from QEMU anyway.) I think we can merge this patch for QEMU 2.4 but I'd like to have a discussion about the security risk of encoding libnfs options in the URI. CCed Eric Blake in case libvirt is affected. Has anyone thought about this and what are the rules? Good point. In general I think there should be some kind of sanitization of the parameters before they are passed on to Qemu. In our use case the user cannot pass any kind of URIs himself, but this might be different in other backends. The readahead value is as dangerous as well if not sanitized. I had a discussion with Ronnie in the past if we should encode parameters in the URI or via environment like it is done in libiscsi. If I remember correctly we came up with the URI parameters for better usability, but hadn't attack scenarios in mind. I am also open to only allow uncritical parameters in the URI and pass others via a new -nfs cmdline option. Or limit max readahead and max debug level settable via URI. I'd feel safer if the option was in runtime_opts instead. The the management tool has to pass them explicitly and the end user cannot influence them via the URI. If an option is needed both at open and create time, then it must also be parsed from nfs_file_create() opts. Stefan pgpL21zBs6zAN.pgp Description: PGP signature
Re: [Qemu-block] [Qemu-devel] [PATCH v3 2/2] virtio-blk: Use blk_drain() to drain IO requests
Just spotted this in my git-pull... Alexander Yarygin yary...@linux.vnet.ibm.com writes: Each call of the virtio_blk_reset() function calls blk_drain_all(), which works for all existing BlockDriverStates, while draining only one is needed. This patch replaces blk_drain_all() by blk_drain() in virtio_blk_reset(). virtio_blk_data_plane_stop() should be called after draining because it restores vblk-complete_request. Cc: Michael S. Tsirkin m...@redhat.com Cc: Christian Borntraeger borntrae...@de.ibm.com Cc: Cornelia Huck cornelia.h...@de.ibm.com Cc: Kevin Wolf kw...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Cc: Stefan Hajnoczi stefa...@redhat.com Signed-off-by: Alexander Yarygin yary...@linux.vnet.ibm.com --- hw/block/virtio-blk.c | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index e6afe97..d8a906f 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -651,16 +651,21 @@ static void virtio_blk_dma_restart_cb(void *opaque, int running, static void virtio_blk_reset(VirtIODevice *vdev) { VirtIOBlock *s = VIRTIO_BLK(vdev); - -if (s-dataplane) { -virtio_blk_data_plane_stop(s-dataplane); -} +AioContext *ctx; /* * This should cancel pending requests, but can't do nicely until there * are per-device request lists. */ -blk_drain_all(); +ctx = blk_get_aio_context(s-blk); +aio_context_acquire(ctx); +blk_drain(s-blk); + +if (s-dataplane) { +virtio_blk_data_plane_stop(s-dataplane); +} +aio_context_release(ctx); + blk_set_enable_write_cache(s-blk, s-original_wce); } From bdrv_drain_all()'s comment: * Note that completion of an asynchronous I/O operation can trigger any * number of other I/O operations on other devices---for example a coroutine * can be arbitrarily complex and a constant flow of I/O can come until the * coroutine is complete. Because of this, it is not possible to have a * function to drain a single device's I/O queue. From bdrv_drain()'s comment: * See the warning in bdrv_drain_all(). This function can only be called if * you are sure nothing can generate I/O because you have op blockers * installed. blk_drain() and blk_drain_all() are trivial wrappers. Ignorant questions: * Why does blk_drain() suffice here? * Is blk_drain() (created in PATCH 1) even a safe interface?
Re: [Qemu-block] [PATCH] block.c: fix real cdrom detection
On Thu, Jun 25, 2015 at 11:11:24AM -0400, Programmingkid wrote: On Jun 25, 2015, at 9:31 AM, Stefan Hajnoczi wrote: On Tue, Jun 23, 2015 at 02:26:51PM -0400, Programmingkid wrote: On Jun 23, 2015, at 2:06 PM, John Snow wrote: So what's the issue that this patch attempts to fix and how did you determine that the fix was needed here? It doesn't look like it respects proper abstraction at a glance. Without the patch, QEMU would just quit when the -cdrom /dev/cdrom option is given. Why does it quit? Because of a bug. This is what it prints: Could not read image for determining its format: Invalid argument. The bdrv_pread() failure is what need you need to investigate. In the other sub-thread there have been hints about adding CD-ROM passthrough support on Mac OS X by filling in the missing parts in block/raw-posix.c. That should help you get to the bottom of the problem. This message seems to indicate that QEMU thinks the real cdrom drive is an image file. If the -drive format= option is not given, QEMU will try to detect the image format. That's the purpose of the find_image_format() function. QEMU does not make a distinction between image files and block devices because there are valid use cases where a block device uses an image format. For example, a disk or partition can contain qcow2 data (there is no partition table or file system, just qcow2). pgp06L8EHTOUG.pgp Description: PGP signature
[Qemu-block] [PATCH V2] block/iscsi: restore compatiblity with libiscsi 1.9.0
RHEL7 and others are stuck with libiscsi 1.9.0 since there unfortunately was an ABI breakage after that release. Signed-off-by: Peter Lieven p...@kamp.de --- v1-v2: - do not use reserved macro names [Paolo] - change text in qemu-options.hx to libiscsi 1.15.0 or greater block/iscsi.c | 32 +++- configure |6 +++--- qemu-options.hx |3 ++- 3 files changed, 32 insertions(+), 9 deletions(-) diff --git a/block/iscsi.c b/block/iscsi.c index f19a56a..fd4a66e 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -168,6 +168,19 @@ static inline unsigned exp_random(double mean) return -mean * log((double)rand() / RAND_MAX); } +/* SCSI_STATUS_TASK_SET_FULL and SCSI_STATUS_TIMEOUT were introduced + * in libiscsi 1.10.0 as part of an enum. The LIBISCSI_API_VERSION + * macro was introduced in 1.11.0. So use the API_VERSION macro as + * a hint that the macros are defined and define them ourselves + * otherwise to keep the required libiscsi version at 1.9.0 */ +#if !defined(LIBISCSI_API_VERSION) +#define QEMU_SCSI_STATUS_TASK_SET_FULL 0x28 +#define QEMU_SCSI_STATUS_TIMEOUT0x0f02 +#else +#define QEMU_SCSI_STATUS_TASK_SET_FULL SCSI_STATUS_TASK_SET_FULL +#define QEMU_SCSI_STATUS_TIMEOUTSCSI_STATUS_TIMEOUT +#endif + static void iscsi_co_generic_cb(struct iscsi_context *iscsi, int status, void *command_data, void *opaque) @@ -188,11 +201,12 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status, iTask-do_retry = 1; goto out; } -if (status == SCSI_STATUS_BUSY || status == SCSI_STATUS_TIMEOUT || -status == SCSI_STATUS_TASK_SET_FULL) { +if (status == SCSI_STATUS_BUSY || +status == QEMU_SCSI_STATUS_TIMEOUT || +status == QEMU_SCSI_STATUS_TASK_SET_FULL) { unsigned retry_time = exp_random(iscsi_retry_times[iTask-retries - 1]); -if (status == SCSI_STATUS_TIMEOUT) { +if (status == QEMU_SCSI_STATUS_TIMEOUT) { /* make sure the request is rescheduled AFTER the * reconnect is initiated */ retry_time = EVENT_INTERVAL * 2; @@ -1358,7 +1372,7 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags, QemuOpts *opts; Error *local_err = NULL; const char *filename; -int i, ret = 0; +int i, ret = 0, timeout = 0; opts = qemu_opts_create(runtime_opts, NULL, 0, error_abort); qemu_opts_absorb_qdict(opts, options, local_err); @@ -1428,7 +1442,15 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags, goto out; } -iscsi_set_timeout(iscsi, parse_timeout(iscsi_url-target)); +/* timeout handling is broken in libiscsi before 1.15.0 */ +timeout = parse_timeout(iscsi_url-target); +#if defined(LIBISCSI_API_VERSION) LIBISCSI_API_VERSION = 20150621 +iscsi_set_timeout(iscsi, timeout); +#else +if (timeout) { +error_report(iSCSI: ignoring timeout value for libiscsi 1.15.0); +} +#endif if (iscsi_full_connect_sync(iscsi, iscsi_url-portal, iscsi_url-lun) != 0) { error_setg(errp, iSCSI: Failed to connect to LUN : %s, diff --git a/configure b/configure index 2ed9db2..222694f 100755 --- a/configure +++ b/configure @@ -3660,15 +3660,15 @@ if compile_prog ; then fi ## -# Do we have libiscsi = 1.10.0 +# Do we have libiscsi = 1.9.0 if test $libiscsi != no ; then - if $pkg_config --atleast-version=1.10.0 libiscsi; then + if $pkg_config --atleast-version=1.9.0 libiscsi; then libiscsi=yes libiscsi_cflags=$($pkg_config --cflags libiscsi) libiscsi_libs=$($pkg_config --libs libiscsi) else if test $libiscsi = yes ; then - feature_not_found libiscsi Install libiscsi = 1.10.0 + feature_not_found libiscsi Install libiscsi = 1.9.0 fi libiscsi=no fi diff --git a/qemu-options.hx b/qemu-options.hx index ca37481..389cf64 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -2293,7 +2293,8 @@ line or a configuration file. Since version Qemu 2.4 it is possible to specify a iSCSI request timeout to detect stalled requests and force a reestablishment of the session. The timeout -is specified in seconds. The default is 0 which means no timeout. +is specified in seconds. The default is 0 which means no timeout. Libiscsi +1.15.0 or greater is required for this feature. Example (without authentication): @example -- 1.7.9.5
[Qemu-block] [PATCH V2] block/nfs: add support for setting debug level
upcoming libnfs versions will support logging debug messages. Add support for it in qemu through a cmdline parameter. Example qemu -nfs debug=99 -cdrom nfs://... Signed-off-by: Peter Lieven p...@kamp.de --- v1-v2: reworked patch to accept the debug level as a cmdline parameter instead of an URI parameter [Stefan] block/nfs.c | 40 qemu-options.hx | 21 + vl.c|8 3 files changed, 69 insertions(+) diff --git a/block/nfs.c b/block/nfs.c index ca9e24e..43d48ae 100644 --- a/block/nfs.c +++ b/block/nfs.c @@ -274,6 +274,30 @@ static void nfs_file_close(BlockDriverState *bs) nfs_client_close(client); } +static void nfs_parse_options(NFSClient *client) +{ +QemuOptsList *list; +QemuOpts *opts; +const char *debug; + +list = qemu_find_opts(nfs); +if (list) { +opts = QTAILQ_FIRST(list-head); +if (opts) { +debug = qemu_opt_get(opts, debug); +if (debug) { +#ifdef LIBNFS_FEATURE_DEBUG +nfs_set_debug(client-context, atoi(debug)); +#else +error_report(NFS Warning: The linked version of libnfs does + not support setting debug levels); +#endif +} +} +} +} + + static int64_t nfs_client_open(NFSClient *client, const char *filename, int flags, Error **errp) { @@ -336,6 +360,8 @@ static int64_t nfs_client_open(NFSClient *client, const char *filename, } } +nfs_parse_options(client); + ret = nfs_mount(client-context, uri-server, uri-path); if (ret 0) { error_setg(errp, Failed to mount nfs share: %s, @@ -501,9 +527,23 @@ static BlockDriver bdrv_nfs = { .bdrv_attach_aio_context= nfs_attach_aio_context, }; +static QemuOptsList qemu_nfs_opts = { +.name = nfs, +.head = QTAILQ_HEAD_INITIALIZER(qemu_nfs_opts.head), +.desc = { +{ +.name = debug, +.type = QEMU_OPT_NUMBER, +.help = Set libnfs debug level (default 0 = no debug), +}, +{ /* end of list */ } +}, +}; + static void nfs_block_init(void) { bdrv_register(bdrv_nfs); +qemu_add_opts(qemu_nfs_opts); } block_init(nfs_block_init); diff --git a/qemu-options.hx b/qemu-options.hx index 389cf64..63c60e7 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -2329,6 +2329,26 @@ STEXI iSCSI parameters such as username and password can also be specified via a configuration file. See qemu-doc for more information and examples. +@item NFS +NFS support allows QEMU to access NFS shares directly and use as +images for the guest storage. Both disk and cdrom images are supported. + +Syntax for specifying NFS shares is +``nfs://server-ip/export/filename'' + +Example (setting deubg level to 2 and use ISO from NFS share as CDROM): +@example +qemu-system-i386 -nfs debug=2 -cdrom nfs://127.0.0.1/isos/cdimage.iso +@end example + +NFS support is an optional feature of QEMU and only available when +compiled and linked against libnfs. +ETEXI +DEF(nfs, HAS_ARG, QEMU_OPTION_nfs, +-nfs [debug=debug]\n +NFS connection parameters\n, QEMU_ARCH_ALL) +STEXI + @item NBD QEMU supports NBD (Network Block Devices) both using TCP protocol as well as Unix Domain Sockets. @@ -2480,6 +2500,7 @@ ETEXI STEXI @end table ETEXI +DEFHEADING() DEFHEADING(Bluetooth(R) options:) STEXI diff --git a/vl.c b/vl.c index 9542095..4317572 100644 --- a/vl.c +++ b/vl.c @@ -3141,6 +3141,14 @@ int main(int argc, char **argv, char **envp) } break; #endif +#ifdef CONFIG_LIBNFS +case QEMU_OPTION_nfs: +opts = qemu_opts_parse(qemu_find_opts(nfs), optarg, 0); +if (!opts) { +exit(1); +} +break; +#endif #ifdef CONFIG_SLIRP case QEMU_OPTION_tftp: legacy_tftp_prefix = optarg; -- 1.7.9.5
[Qemu-block] [PATCH] block/nfs: limit maximum readahead size to 1MB
a malicious caller could otherwise specify a very large value via the URI and force libnfs to allocate a large amount of memory for the readahead buffer. Cc: qemu-sta...@nongnu.org Signed-off-by: Peter Lieven p...@kamp.de --- block/nfs.c |7 +++ 1 file changed, 7 insertions(+) diff --git a/block/nfs.c b/block/nfs.c index 43d48ae..4fb1937 100644 --- a/block/nfs.c +++ b/block/nfs.c @@ -35,6 +35,8 @@ #include sysemu/sysemu.h #include nfsc/libnfs.h +#define QEMU_NFS_MAX_READAHEAD_SIZE 1048576 + typedef struct NFSClient { struct nfs_context *context; struct nfsfh *fh; @@ -351,6 +353,11 @@ static int64_t nfs_client_open(NFSClient *client, const char *filename, nfs_set_tcp_syncnt(client-context, val); #ifdef LIBNFS_FEATURE_READAHEAD } else if (!strcmp(qp-p[i].name, readahead)) { +if (val QEMU_NFS_MAX_READAHEAD_SIZE) { +error_report(NFS Warning: Truncating NFS readahead + size to %d, QEMU_NFS_MAX_READAHEAD_SIZE); +val = QEMU_NFS_MAX_READAHEAD_SIZE; +} nfs_set_readahead(client-context, val); #endif } else { -- 1.7.9.5
Re: [Qemu-block] [PATCH V2] block/iscsi: restore compatiblity with libiscsi 1.9.0
On 26/06/2015 12:18, Peter Lieven wrote: RHEL7 and others are stuck with libiscsi 1.9.0 since there unfortunately was an ABI breakage after that release. Signed-off-by: Peter Lieven p...@kamp.de --- v1-v2: - do not use reserved macro names [Paolo] - change text in qemu-options.hx to libiscsi 1.15.0 or greater block/iscsi.c | 32 +++- configure |6 +++--- qemu-options.hx |3 ++- 3 files changed, 32 insertions(+), 9 deletions(-) diff --git a/block/iscsi.c b/block/iscsi.c index f19a56a..fd4a66e 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -168,6 +168,19 @@ static inline unsigned exp_random(double mean) return -mean * log((double)rand() / RAND_MAX); } +/* SCSI_STATUS_TASK_SET_FULL and SCSI_STATUS_TIMEOUT were introduced + * in libiscsi 1.10.0 as part of an enum. The LIBISCSI_API_VERSION + * macro was introduced in 1.11.0. So use the API_VERSION macro as + * a hint that the macros are defined and define them ourselves + * otherwise to keep the required libiscsi version at 1.9.0 */ +#if !defined(LIBISCSI_API_VERSION) +#define QEMU_SCSI_STATUS_TASK_SET_FULL 0x28 +#define QEMU_SCSI_STATUS_TIMEOUT0x0f02 +#else +#define QEMU_SCSI_STATUS_TASK_SET_FULL SCSI_STATUS_TASK_SET_FULL +#define QEMU_SCSI_STATUS_TIMEOUTSCSI_STATUS_TIMEOUT +#endif + static void iscsi_co_generic_cb(struct iscsi_context *iscsi, int status, void *command_data, void *opaque) @@ -188,11 +201,12 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status, iTask-do_retry = 1; goto out; } -if (status == SCSI_STATUS_BUSY || status == SCSI_STATUS_TIMEOUT || -status == SCSI_STATUS_TASK_SET_FULL) { +if (status == SCSI_STATUS_BUSY || +status == QEMU_SCSI_STATUS_TIMEOUT || +status == QEMU_SCSI_STATUS_TASK_SET_FULL) { unsigned retry_time = exp_random(iscsi_retry_times[iTask-retries - 1]); -if (status == SCSI_STATUS_TIMEOUT) { +if (status == QEMU_SCSI_STATUS_TIMEOUT) { /* make sure the request is rescheduled AFTER the * reconnect is initiated */ retry_time = EVENT_INTERVAL * 2; @@ -1358,7 +1372,7 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags, QemuOpts *opts; Error *local_err = NULL; const char *filename; -int i, ret = 0; +int i, ret = 0, timeout = 0; opts = qemu_opts_create(runtime_opts, NULL, 0, error_abort); qemu_opts_absorb_qdict(opts, options, local_err); @@ -1428,7 +1442,15 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags, goto out; } -iscsi_set_timeout(iscsi, parse_timeout(iscsi_url-target)); +/* timeout handling is broken in libiscsi before 1.15.0 */ +timeout = parse_timeout(iscsi_url-target); +#if defined(LIBISCSI_API_VERSION) LIBISCSI_API_VERSION = 20150621 +iscsi_set_timeout(iscsi, timeout); +#else +if (timeout) { +error_report(iSCSI: ignoring timeout value for libiscsi 1.15.0); +} +#endif if (iscsi_full_connect_sync(iscsi, iscsi_url-portal, iscsi_url-lun) != 0) { error_setg(errp, iSCSI: Failed to connect to LUN : %s, diff --git a/configure b/configure index 2ed9db2..222694f 100755 --- a/configure +++ b/configure @@ -3660,15 +3660,15 @@ if compile_prog ; then fi ## -# Do we have libiscsi = 1.10.0 +# Do we have libiscsi = 1.9.0 if test $libiscsi != no ; then - if $pkg_config --atleast-version=1.10.0 libiscsi; then + if $pkg_config --atleast-version=1.9.0 libiscsi; then libiscsi=yes libiscsi_cflags=$($pkg_config --cflags libiscsi) libiscsi_libs=$($pkg_config --libs libiscsi) else if test $libiscsi = yes ; then - feature_not_found libiscsi Install libiscsi = 1.10.0 + feature_not_found libiscsi Install libiscsi = 1.9.0 fi libiscsi=no fi diff --git a/qemu-options.hx b/qemu-options.hx index ca37481..389cf64 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -2293,7 +2293,8 @@ line or a configuration file. Since version Qemu 2.4 it is possible to specify a iSCSI request timeout to detect stalled requests and force a reestablishment of the session. The timeout -is specified in seconds. The default is 0 which means no timeout. +is specified in seconds. The default is 0 which means no timeout. Libiscsi +1.15.0 or greater is required for this feature. Example (without authentication): @example Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Re: [Qemu-block] [PATCH for-2.4] block: keep bitmap if incremental backup job is cancelled
On 26.06.2015 11:57, Stefan Hajnoczi wrote: On Thu, Jun 25, 2015 at 12:33:25PM -0400, Jeff Cody wrote: On Thu, Jun 25, 2015 at 12:30:56PM -0400, John Snow wrote: On 06/25/2015 08:53 AM, Stefan Hajnoczi wrote: Reclaim the dirty bitmap if an incremental backup block job is cancelled. The ret variable may be 0 when the job is cancelled so it's not enough to check ret 0. Reviewed-by: John Snow js...@redhat.com Reviewed-by: Max Reitz mre...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/backup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/backup.c b/block/backup.c index 4a1af68..ddf8424 100644 --- a/block/backup.c +++ b/block/backup.c @@ -431,7 +431,7 @@ static void coroutine_fn backup_run(void *opaque) if (job-sync_bitmap) { BdrvDirtyBitmap *bm; -if (ret 0) { +if (ret 0 || block_job_is_cancelled(job-common)) { /* Merge the successor back into the parent, delete nothing. */ bm = bdrv_reclaim_dirty_bitmap(bs, job-sync_bitmap, NULL); assert(bm); Didn't Jeff Cody already stage this? Yes Thanks! It appears like I successfully confused Stefan, thanks to John not wanting to embarrass me in public. :-)