[Qemu-block] [PATCH for-2.6 0/2] Bug fixes for gluster
Bug fixes for gluster; the second patch prevents potential data loss when
trying to recover from a recoverable error (such as ENOSPC).

Jeff Cody (2):
  block/gluster: return correct error value
  block/gluster: prevent data loss after i/o error

 block/gluster.c | 29 ++++++++++++++++++++++++++++-
 configure       |  8 ++++++++
 2 files changed, 36 insertions(+), 1 deletion(-)

-- 
1.9.3
[Qemu-block] [PATCH for-2.6 1/2] block/gluster: return correct error value
Upon error, gluster will call the aio callback function with a ret value
of -1, with errno set to the proper error value. If we set the acb->ret
value to the return value in the callback, that results in every error
being EPERM (i.e. 1). Instead, set it to the proper error result.

Signed-off-by: Jeff Cody
---
 block/gluster.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/gluster.c b/block/gluster.c
index 7bface2..30a827e 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -245,7 +245,7 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
     if (!ret || ret == acb->size) {
         acb->ret = 0; /* Success */
     } else if (ret < 0) {
-        acb->ret = ret; /* Read/Write failed */
+        acb->ret = -errno; /* Read/Write failed */
     } else {
         acb->ret = -EIO; /* Partial read/write - fail it */
     }
-- 
1.9.3
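To make the failure mode concrete: gluster reports failure as ret == -1 with
errno carrying the real cause, and since EPERM is 1, storing ret directly
always yields -1 == -EPERM. A minimal C sketch of the conversion the fix
performs (illustrative only; aio_result_to_errno is a made-up helper, not
QEMU code):

#include <errno.h>
#include <sys/types.h>

/* Convert a gluster-style result (ret == -1 with errno set) into the
 * negative-errno convention QEMU's block layer expects. */
static int aio_result_to_errno(ssize_t ret)
{
    if (ret < 0) {
        /* Storing ret itself would always yield -1, i.e. -EPERM,
         * because EPERM happens to be 1. */
        return -errno;
    }
    return 0;
}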
[Qemu-block] [PATCH for-2.6 2/2] block/gluster: prevent data loss after i/o error
Upon receiving an I/O error after an fsync, by default gluster will dump
its cache. However, QEMU will retry the fsync, which is especially useful
when encountering errors such as ENOSPC when using the werror=stop option.
When using caching with gluster, however, the last written data will be
lost upon encountering ENOSPC. Using the cache xlator option of
'resync-failed-syncs-after-fsync' should cause gluster to retain the
cached data after a failed fsync, so that ENOSPC and other transient
errors are recoverable.

Signed-off-by: Jeff Cody
---
 block/gluster.c | 27 +++++++++++++++++++++++++++
 configure       |  8 ++++++++
 2 files changed, 35 insertions(+)

diff --git a/block/gluster.c b/block/gluster.c
index 30a827e..b1cf71b 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -330,6 +330,23 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
         goto out;
     }
 
+#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
+    /* Without this, if fsync fails for a recoverable reason (for instance,
+     * ENOSPC), gluster will dump its cache, preventing retries. This means
+     * almost certain data loss. Not all gluster versions support the
+     * 'resync-failed-syncs-after-fsync' key value, but there is no way to
+     * discover during runtime if it is supported (this api returns success for
+     * unknown key/value pairs) */
+    ret = glfs_set_xlator_option(s->glfs, "*-write-behind",
+                                 "resync-failed-syncs-after-fsync",
+                                 "on");
+    if (ret < 0) {
+        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
+        ret = -errno;
+        goto out;
+    }
+#endif
+
     qemu_gluster_parse_flags(bdrv_flags, &open_flags);
 
     s->fd = glfs_open(s->glfs, gconf->image, open_flags);
@@ -386,6 +403,16 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
         goto exit;
     }
 
+#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
+    ret = glfs_set_xlator_option(reop_s->glfs, "*-write-behind",
+                                 "resync-failed-syncs-after-fsync", "on");
+    if (ret < 0) {
+        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
+        ret = -errno;
+        goto exit;
+    }
+#endif
+
     reop_s->fd = glfs_open(reop_s->glfs, gconf->image, open_flags);
     if (reop_s->fd == NULL) {
         /* reops->glfs will be cleaned up in _abort */
diff --git a/configure b/configure
index 5db29f0..3e921fe 100755
--- a/configure
+++ b/configure
@@ -298,6 +298,7 @@ coroutine=""
 coroutine_pool=""
 seccomp=""
 glusterfs=""
+glusterfs_xlator_opt="no"
 glusterfs_discard="no"
 glusterfs_zerofill="no"
 archipelago="no"
@@ -3397,6 +3398,9 @@ if test "$glusterfs" != "no" ; then
     glusterfs="yes"
     glusterfs_cflags=`$pkg_config --cflags glusterfs-api`
     glusterfs_libs=`$pkg_config --libs glusterfs-api`
+    if $pkg_config --atleast-version=4 glusterfs-api; then
+        glusterfs_xlator_opt="yes"
+    fi
     if $pkg_config --atleast-version=5 glusterfs-api; then
         glusterfs_discard="yes"
     fi
@@ -5339,6 +5343,10 @@ if test "$glusterfs" = "yes" ; then
   echo "GLUSTERFS_LIBS=$glusterfs_libs" >> $config_host_mak
 fi
 
+if test "$glusterfs_xlator_opt" = "yes" ; then
+  echo "CONFIG_GLUSTERFS_XLATOR_OPT=y" >> $config_host_mak
+fi
+
 if test "$glusterfs_discard" = "yes" ; then
   echo "CONFIG_GLUSTERFS_DISCARD=y" >> $config_host_mak
 fi
-- 
1.9.3
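The failure sequence the option guards against, as a hedged POSIX-level
sketch (plain pwrite/fsync stand in for gluster's write-behind cache here;
write_then_sync is an invented name, and this is an illustration, not QEMU
or gluster code):

#include <errno.h>
#include <unistd.h>

/* Why dumping the cache on a failed fsync loses data: after the first
 * ENOSPC, a retried flush can "succeed" even though the data it was
 * supposed to flush is already gone. */
static int write_then_sync(int fd, const void *buf, size_t len, off_t off)
{
    if (pwrite(fd, buf, len, off) < 0) {
        return -errno;              /* write never reached the cache */
    }
    if (fsync(fd) < 0 && errno == ENOSPC) {
        /* operator frees space, guest resumes, QEMU retries the flush */
        if (fsync(fd) == 0) {
            /* if the cache was dropped on the first failure, this
             * success is a lie: the guest's data never hit disk */
            return 0;
        }
        return -errno;
    }
    return 0;
}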
[Qemu-block] Possible bug: virtio-scsi + iothread (Former x-data-plane) = "Guest moved index from 0 to 61440" warning
Recently I heard that the experimental x-data-plane feature from virtio-blk
was production-ready and that virtio-scsi also got support for it, so, after
finding what the new syntax is:

http://comments.gmane.org/gmane.comp.emulators.qemu/279118

...I decided to test it. After all, it was supposed to be a free huge
performance boost.

My test setup:

1 - A Xen host running Xen 4.5 with Arch Linux as Dom0 using kernel 4.0, and
three VMs (one main everyday VM, another for some work which I want isolated,
and the nested KVM host). I'm intending to replace this host with
QEMU-KVM-VFIO after I nail down all the details.
2 - A nested KVM host running Arch Linux using kernel 4.4 and QEMU 2.5 (I'm
not using libvirt, just standalone QEMU). Since I'm using nested
virtualization, this host can create KVM-capable VMs.
3 - A test VM spawned by the nested KVM host. I just use it to boot ArchISO
(the Arch Linux LiveCD) and check if things are working.

The problem (and the reason why I'm sending this mail) is that when I launch
QEMU from the command line to create a VM using virtio-scsi-pci with the
iothread parameter, it throws a strange "Guest moved index from 0 to 61440"
warning. Some googling reveals that this error appeared for some users in
totally different circumstances (like removing non-hotpluggable devices) and
that it's actually a fatal error that makes the VM hang, crash, or otherwise
misbehave. In my case, however, it appears only at the moment I create the
VM, and everything seems to work after the warning. Still, since the device
involved is a storage controller, I'm worried about possible data corruption
or other major issues.

The bare minimum script that produces the warning should be this:

#!/bin/bash
qemu-system-x86_64 \
-m 1024M \
-enable-kvm \
-object iothread,id=iothread0 \
-device virtio-scsi-pci,iothread=iothread0 \

This produces the "Guest moved index from 0 to 61440" warning. It happens
when virtio-scsi-pci has the iothread parameter set.

Of note, not using -enable-kvm in the previous script produces this fatal
error:

virtio-scsi: VRing setup failed
qemu-system-x86_64: /build/qemu/src/qemu-2.5.0/memory.c:1735: memory_region_del_eventfd: Assertion `i != mr->ioeventfd_nb' failed.

...but since I didn't google whether having KVM enabled is required to use
virtio-scsi-pci or other VirtIO devices, I don't know if this is standard
behavior or unintended, so it may not be important at all.

You can even get multiple warnings, one for each virtio-scsi-pci device
using an iothread. For example:

#!/bin/bash
qemu-system-x86_64 \
-monitor stdio \
-m 1024M \
-enable-kvm \
-object iothread,id=iothread0 \
-object iothread,id=iothread1 \
-object iothread,id=iothread2 \
-device virtio-scsi-pci,iothread=iothread0 \
-device virtio-scsi-pci,iothread=iothread1 \
-device virtio-scsi-pci,iothread=iothread2 \
-device virtio-scsi-pci \

This one produces three identical "Guest moved index from 0 to 61440"
warnings on the terminal. Using "info iothreads" in the QEMU monitor does
show the three iothreads. Additional -object iothread,id=iothreadX objects
that aren't used, or -device virtio-scsi-pci devices that aren't using an
iothread, do not produce extra warnings.

The warning does NOT happen if I use virtio-blk-pci with an iothread instead.
This is a more complete test script booting the Arch Linux ArchISO LiveCD,
which is what I was using when I found the warning the first time:

#!/bin/bash
qemu-system-x86_64 \
-name "Test VM" \
-monitor stdio \
-m 1024M \
-M pc-q35-2.5,accel=kvm \
-nodefaults \
-drive if=none,file=archlinux-2016.03.01-dual.iso,readonly=on,format=raw,id=drive0 \
-drive if=none,file=/dev/vg0/lv0,format=raw,id=drive1 \
-drive if=none,file=filetest1.img,format=raw,id=drive2 \
-drive if=none,file=filetest2.img,format=raw,id=drive3 \
-device ide-cd,drive=drive0 \
-device qxl-vga \
-object iothread,id=iothread0 \
-object iothread,id=iothread1 \
-object iothread,id=iothread2 \
-device virtio-scsi-pci,iothread=iothread0,id=scsi0 \
-device virtio-scsi-pci,iothread=iothread1,id=scsi1 \
-device scsi-hd,bus=scsi0.0,drive=drive1 \
-device scsi-hd,bus=scsi1.0,drive=drive2 \
-device virtio-blk-pci,iothread=iothread2,drive=drive3 \

Starting the VM produces two warnings on the terminal. Arch Linux does,
however, see the drives sda and sdb with lsblk (plus vda for the VirtIO
block device), and lspci does show two VirtIO SCSI controllers (plus the
VirtIO block device). I didn't try to read from or write to them to check
if they work.

I also asked a guy on the QEMU IRC channel, who had an Arch Linux host with
a cloned test VM using a virtio-scsi-pci device, to add the -object iothread
and the iothread parameter to virtio-scsi-pci. He said that it worked and he
didn't receive any terminal warning, but that Windows CPU usage was always
100% when he launched with it. I don't know if in his case the VirtIO
Windows drivers may be the reason (sadly, I forgot to ask the Windows and
VirtIO driver versions). In my VM with ArchISO, using top, CPU usage seems
to be
Re: [Qemu-block] [PATCH v3 0/2] block: Fix hang with mirroring qcow2
On 05/04/2016 13:20, Fam Zheng wrote:
> See patch 1 for the bug analysis.
> 
> v3: Make bdrv_co_drain a public function and use it directly in
>     block/mirror.c. [Stefan]

FWIW, no need to send a v4 with bdrv_co_drained_begin.

Paolo

> 
> Fam Zheng (2):
>   block: Fix bdrv_drain in coroutine
>   mirror: Replace bdrv_drain(bs) with bdrv_co_drain(bs)
> 
>  block/io.c            | 45 +++++++++++++++++++++++++++++++++++++++++++++
>  block/mirror.c        |  2 +-
>  include/block/block.h |  1 +
>  3 files changed, 47 insertions(+), 1 deletion(-)
> 
Re: [Qemu-block] [PATCH v2] block: Fix bdrv_drain in coroutine
On 05/04/2016 13:15, Fam Zheng wrote:
> > Anyway, let's consider bdrv_drain() legacy code that can call if
> > (qemu_in_coroutine()) but please make bdrv_co_drain() public so
> > block/mirror.c can at least call it directly.
> 
> OK, will do.

Please create a bdrv_co_drained_begin() function too, even if it's not
used yet.

Paolo
[Qemu-block] [PATCH v3 1/2] block: Fix bdrv_drain in coroutine
Using the nested aio_poll() in coroutine is a bad idea. This patch
replaces the aio_poll loop in bdrv_drain with a BH, if called in
coroutine.

For example, the bdrv_drain() in mirror.c can hang when a guest issued
request is pending on it in qemu_co_mutex_lock().

Mirror coroutine in this case has just finished a request, and the
block job is about to complete. It calls bdrv_drain() which waits for
the other coroutine to complete. The other coroutine is a scsi-disk
request. The deadlock happens when the latter is in turn pending on the
former to yield/terminate, in qemu_co_mutex_lock(). The state flow is
as below (assuming a qcow2 image):

  mirror coroutine               scsi-disk coroutine
  -------------------------------------------------------------
  do last write

    qcow2:qemu_co_mutex_lock()
    ...
                                 scsi disk read

                                   tracked request begin

                                   qcow2:qemu_co_mutex_lock.enter

    qcow2:qemu_co_mutex_unlock()

  bdrv_drain
    while (has tracked request)
      aio_poll()

In the scsi-disk coroutine, the qemu_co_mutex_lock() will never return
because the mirror coroutine is blocked in the aio_poll(blocking=true).

With this patch, the added qemu_coroutine_yield() allows the scsi-disk
coroutine to make progress as expected:

  mirror coroutine               scsi-disk coroutine
  -------------------------------------------------------------
  do last write

    qcow2:qemu_co_mutex_lock()
    ...
                                 scsi disk read

                                   tracked request begin

                                   qcow2:qemu_co_mutex_lock.enter

    qcow2:qemu_co_mutex_unlock()

  bdrv_drain.enter
 >  schedule BH
 >  qemu_coroutine_yield()
 >                               qcow2:qemu_co_mutex_lock.return
 >                               ...
                                   tracked request end
    ...
    (resumed from BH callback)
  bdrv_drain.return
  ...

Reported-by: Laurent Vivier
Suggested-by: Paolo Bonzini
Signed-off-by: Fam Zheng
---
 block/io.c            | 45 +++++++++++++++++++++++++++++++++++++++++++++
 include/block/block.h |  1 +
 2 files changed, 46 insertions(+)

diff --git a/block/io.c b/block/io.c
index c4869b9..a7dbf85 100644
--- a/block/io.c
+++ b/block/io.c
@@ -253,6 +253,47 @@ static void bdrv_drain_recurse(BlockDriverState *bs)
     }
 }
 
+typedef struct {
+    Coroutine *co;
+    BlockDriverState *bs;
+    QEMUBH *bh;
+    bool done;
+} BdrvCoDrainData;
+
+static void bdrv_co_drain_bh_cb(void *opaque)
+{
+    BdrvCoDrainData *data = opaque;
+    Coroutine *co = data->co;
+
+    qemu_bh_delete(data->bh);
+    bdrv_drain(data->bs);
+    data->done = true;
+    qemu_coroutine_enter(co, NULL);
+}
+
+void coroutine_fn bdrv_co_drain(BlockDriverState *bs)
+{
+    BdrvCoDrainData data;
+
+    /* Calling bdrv_drain() from a BH ensures the current coroutine yields and
+     * other coroutines run if they were queued from
+     * qemu_co_queue_run_restart(). */
+
+    assert(qemu_in_coroutine());
+    data = (BdrvCoDrainData) {
+        .co = qemu_coroutine_self(),
+        .bs = bs,
+        .done = false,
+        .bh = aio_bh_new(bdrv_get_aio_context(bs), bdrv_co_drain_bh_cb, &data),
+    };
+    qemu_bh_schedule(data.bh);
+
+    qemu_coroutine_yield();
+    /* If we are resumed from some other event (such as an aio completion or a
+     * timer callback), it is a bug in the caller that should be fixed. */
+    assert(data.done);
+}
+
 /*
  * Wait for pending requests to complete on a single BlockDriverState subtree,
  * and suspend block driver's internal I/O until next request arrives.
@@ -269,6 +310,10 @@ void bdrv_drain(BlockDriverState *bs)
     bool busy = true;
 
     bdrv_drain_recurse(bs);
+    if (qemu_in_coroutine()) {
+        bdrv_co_drain(bs);
+        return;
+    }
     while (busy) {
         /* Keep iterating */
         bdrv_flush_io_queue(bs);
diff --git a/include/block/block.h b/include/block/block.h
index 6a39f94..3a73137 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -372,6 +372,7 @@ int bdrv_flush(BlockDriverState *bs);
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
 void bdrv_close_all(void);
 void bdrv_drain(BlockDriverState *bs);
+void coroutine_fn bdrv_co_drain(BlockDriverState *bs);
 void bdrv_drain_all(void);
 int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
-- 
2.7.4
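For context, the shape of the pre-patch bug is worth spelling out. A hedged
illustration follows (broken_drain is a made-up name, not code from the
patch; bdrv_requests_pending and aio_poll are the QEMU-internal calls the
old loop used):

#include "block/block.h"
#include "block/block_int.h"

/* A coroutine that polls instead of yielding: coroutines queued on a
 * CoMutex held by this one are only restarted when it yields or
 * terminates, so a request stuck in qemu_co_mutex_lock() can never
 * finish and the loop below never exits. */
static void coroutine_fn broken_drain(BlockDriverState *bs)
{
    while (bdrv_requests_pending(bs)) {
        aio_poll(bdrv_get_aio_context(bs), true);   /* blocks; never yields */
    }
}

The BH-plus-yield version above avoids this by letting the coroutine leave
the CPU before the drain loop runs.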
[Qemu-block] [PATCH v3 2/2] mirror: Replace bdrv_drain(bs) with bdrv_co_drain(bs)
Suggested-by: Stefan Hajnoczi
Signed-off-by: Fam Zheng
---
 block/mirror.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/mirror.c b/block/mirror.c
index f64db1a..c2cfc1a 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -650,7 +650,7 @@ static void coroutine_fn mirror_run(void *opaque)
              * mirror_populate runs.
              */
             trace_mirror_before_drain(s, cnt);
-            bdrv_drain(bs);
+            bdrv_co_drain(bs);
             cnt = bdrv_get_dirty_count(s->dirty_bitmap);
         }
-- 
2.7.4
[Qemu-block] [PATCH v3 0/2] block: Fix hang with mirroring qcow2
See patch 1 for the bug analysis.

v3: Make bdrv_co_drain a public function and use it directly in
    block/mirror.c. [Stefan]

Fam Zheng (2):
  block: Fix bdrv_drain in coroutine
  mirror: Replace bdrv_drain(bs) with bdrv_co_drain(bs)

 block/io.c            | 45 +++++++++++++++++++++++++++++++++++++++++++++
 block/mirror.c        |  2 +-
 include/block/block.h |  1 +
 3 files changed, 47 insertions(+), 1 deletion(-)

-- 
2.7.4
Re: [Qemu-block] [PATCH v2] block: Fix bdrv_drain in coroutine
On Tue, 04/05 10:39, Stefan Hajnoczi wrote:
> > > block/mirror.c should call bdrv_co_drain() explicitly and bdrv_drain()
> > > should assert(!qemu_in_coroutine()).
> > >
> > > The reason why existing bdrv_read() and friends detect coroutine context
> > > at runtime is because it is meant for legacy code that runs in both
> > > coroutine and non-coroutine contexts.
> >
> > blk_unref/bdrv_unref are called in coroutine (.bdrv_create implementations),
> > and this doesn't just work with the assertion. Should I clean up this "legacy"
> > code first, i.e. move bdrv_unref calls to BHs in the callers and
> > assert(!qemu_in_coroutine()) there too? I didn't think this because it
> > complicates the code somehow.
>
> This is a messy problem.
>
> In general I don't like introducing yields into non-coroutine_fn
> functions because it can lead to bugs when the caller didn't expect a
> yield point.
>
> For example, I myself wouldn't have expected bdrv_unref() to be a yield
> point. So maybe coroutine code I've written would be vulnerable to
> interference (I won't call it a race condition) from another coroutine
> across the bdrv_unref() call. This could mean that another coroutine
> now sees intermediate state that would never be visible without the new
> yield point.
>
> I think attempting to invoke qemu_co_queue_run_restart() directly
> instead of scheduling a BH and yielding does not improve the situation.
> It's also a layering violation since qemu_co_queue_run_restart() is just
> meant for the core coroutine code and isn't a public interface.
>
> Anyway, let's consider bdrv_drain() legacy code that can call if
> (qemu_in_coroutine()) but please make bdrv_co_drain() public so
> block/mirror.c can at least call it directly.

OK, will do.

Fam
Re: [Qemu-block] [PATCH v2] block: Fix bdrv_drain in coroutine
On Tue, Apr 05, 2016 at 09:27:39AM +0800, Fam Zheng wrote:
> On Mon, 04/04 12:57, Stefan Hajnoczi wrote:
> > On Fri, Apr 01, 2016 at 09:57:38PM +0800, Fam Zheng wrote:
> > > +    BdrvCoDrainData data;
> > > +
> > > +    assert(qemu_in_coroutine());
> > > +    data = (BdrvCoDrainData) {
> > > +        .co = qemu_coroutine_self(),
> > > +        .bs = bs,
> > > +        .done = false,
> > > +        .bh = aio_bh_new(bdrv_get_aio_context(bs), bdrv_co_drain_bh_cb,
> > > +                         &data),
> > > +    };
> > > +    qemu_bh_schedule(data.bh);
> > > +
> > > +    qemu_coroutine_yield();
> > > +    /* If we are resumed from some other event (such as an aio completion
> > > +     * or a timer callback), it is a bug in the caller that should be
> > > +     * fixed. */
> > > +    assert(data.done);
> > > +}
> > > +
> > >  /*
> > >   * Wait for pending requests to complete on a single BlockDriverState subtree,
> > >   * and suspend block driver's internal I/O until next request arrives.
> > > @@ -269,6 +306,10 @@ void bdrv_drain(BlockDriverState *bs)
> > >      bool busy = true;
> > >
> > >      bdrv_drain_recurse(bs);
> > > +    if (qemu_in_coroutine()) {
> > > +        bdrv_co_drain(bs);
> > > +        return;
> > > +    }
> > >      while (busy) {
> > >          /* Keep iterating */
> > >          bdrv_flush_io_queue(bs);
> > > --
> > > 2.7.4
> >
> > block/mirror.c should call bdrv_co_drain() explicitly and bdrv_drain()
> > should assert(!qemu_in_coroutine()).
> >
> > The reason why existing bdrv_read() and friends detect coroutine context
> > at runtime is because it is meant for legacy code that runs in both
> > coroutine and non-coroutine contexts.
> 
> blk_unref/bdrv_unref are called in coroutine (.bdrv_create implementations),
> and this doesn't just work with the assertion. Should I clean up this "legacy"
> code first, i.e. move bdrv_unref calls to BHs in the callers and
> assert(!qemu_in_coroutine()) there too? I didn't think this because it
> complicates the code somehow.

This is a messy problem.

In general I don't like introducing yields into non-coroutine_fn
functions because it can lead to bugs when the caller didn't expect a
yield point.

For example, I myself wouldn't have expected bdrv_unref() to be a yield
point. So maybe coroutine code I've written would be vulnerable to
interference (I won't call it a race condition) from another coroutine
across the bdrv_unref() call. This could mean that another coroutine
now sees intermediate state that would never be visible without the new
yield point.

I think attempting to invoke qemu_co_queue_run_restart() directly
instead of scheduling a BH and yielding does not improve the situation.
It's also a layering violation since qemu_co_queue_run_restart() is just
meant for the core coroutine code and isn't a public interface.

Anyway, let's consider bdrv_drain() legacy code that can call if
(qemu_in_coroutine()) but please make bdrv_co_drain() public so
block/mirror.c can at least call it directly.

Stefan
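To make the "unexpected yield point" hazard concrete, here is a hedged
sketch (State, in_transition, and example() are invented names for
illustration, not QEMU code): if bdrv_unref() silently gains a yield point,
another coroutine can observe the half-updated state between the two
assignments.

#include "block/block.h"

typedef struct {
    bool in_transition;
} State;

static void coroutine_fn example(State *s, BlockDriverState *bs)
{
    s->in_transition = true;
    bdrv_unref(bs);            /* if this can yield, another coroutine may
                                * run here and see in_transition == true */
    s->in_transition = false;
}

This is exactly why keeping bdrv_drain() as a legacy dual-context entry
point, while new coroutine code calls bdrv_co_drain() directly, is the
compromise settled on above.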
[Qemu-block] [PATCH] MAINTAINERS: Add Fam Zheng as a co-maintainer of block I/O path
As agreed with Stefan, I'm listing myself as a co-maintainer of the block
I/O path to assist with the maintainership.

Signed-off-by: Fam Zheng
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9277fbf..d261412 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -998,6 +998,7 @@ T: git git://repo.or.cz/qemu/kevin.git block
 Block I/O path
 M: Stefan Hajnoczi
+M: Fam Zheng
 L: qemu-block@nongnu.org
 S: Supported
 F: async.c
-- 
2.7.4
[Qemu-block] [PATCH 4/7] qemu-iotests: 148: properly skip test if quorum support is missing
qemu-iotests test case 148 already had some code for skipping the test
if quorum support is missing, but it didn't work in all cases.
TestQuorumEvents.setUp() gets run before the actual test method (which
contains the skipping code) and tries to start qemu with a drive using
the quorum driver. For some reason this works fine when using qcow2,
but fails for raw.

As the entire test case requires quorum, just check for availability
before even starting the test suite. Introduce a verify_quorum()
function in iotests.py for this purpose so future test cases can make
use of it.

Signed-off-by: Sascha Silbe
Reviewed-by: Bo Tu
---
 tests/qemu-iotests/148        | 4 +---
 tests/qemu-iotests/iotests.py | 5 +++++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/148 b/tests/qemu-iotests/148
index d066ec3..e01b061 100644
--- a/tests/qemu-iotests/148
+++ b/tests/qemu-iotests/148
@@ -79,9 +79,6 @@ sector = "%d"
         self.assert_qmp(event, 'data/sector-num', sector)
 
     def testQuorum(self):
-        if not 'quorum' in iotests.qemu_img_pipe('--help'):
-            return
-
         # Generate an error and get an event
         self.vm.hmp_qemu_io("drive0", "aio_read %d %d" %
                             (offset * sector_size, sector_size))
@@ -139,4 +136,5 @@ class TestFifoQuorumEvents(TestQuorumEvents):
     read_pattern = 'fifo'
 
 if __name__ == '__main__':
+    iotests.verify_quorum()
     iotests.main(supported_fmts=["raw"])
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index fb5c482..bf31ec8 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -437,6 +437,11 @@ def verify_platform(supported_oses=['linux']):
     if True not in [sys.platform.startswith(x) for x in supported_oses]:
         notrun('not suitable for this OS: %s' % sys.platform)
 
+def verify_quorum():
+    '''Skip test suite if quorum support is not available'''
+    if 'quorum' not in qemu_img_pipe('--help'):
+        notrun('quorum support missing')
+
 def main(supported_fmts=[], supported_oses=['linux']):
     '''Run tests'''
 
-- 
1.9.1
[Qemu-block] [PATCH 0/7] next round of qemu-iotests fixes
With these fixes, qemu-iotests complete successfully on my test systems
(s390x, x86_64) when used with the QCOW2 or raw image formats. These are
purely bug fixes for tests and most are trivial, so they should be safe
even for hard freeze.

Sascha Silbe (7):
  qemu-iotests: check: don't place files with predictable names in /tmp
  qemu-iotests: fix 051 on non-PC architectures
  qemu-iotests: iotests.VM: remove qtest socket on error
  qemu-iotests: 148: properly skip test if quorum support is missing
  qemu-iotests: 068: don't require KVM
  qemu-iotests: 141: reduce likelihood of race condition on systems with
    fast IO
  qemu-iotests: iotests.py: get rid of __all__

 tests/qemu-iotests/051.out       | 10 +-
 tests/qemu-iotests/068           |  2 +-
 tests/qemu-iotests/141           | 11 ++-
 tests/qemu-iotests/141.out       | 24
 tests/qemu-iotests/148           |  4 +---
 tests/qemu-iotests/check         | 21 +++--
 tests/qemu-iotests/common.filter |  1 +
 tests/qemu-iotests/iotests.py    | 22 +-
 8 files changed, 54 insertions(+), 41 deletions(-)

-- 
1.9.1
[Qemu-block] [PATCH 1/7] qemu-iotests: check: don't place files with predictable names in /tmp
Placing files with predictable or even hard-coded names in /tmp is a
security risk and can prevent or disturb operation on a multi-user
machine. Place them inside the "scratch" directory instead, as we
already do for most other test-related files.

Signed-off-by: Sascha Silbe
Reviewed-by: Bo Tu
---
 tests/qemu-iotests/check | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index c350f16..4cba215 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -19,7 +19,6 @@
 # Control script for QA
 #
 
-tmp=/tmp/$$
 status=0
 needwrap=true
 try=0
@@ -130,6 +129,8 @@ fi
 #exit 1
 #fi
 
+tmp="${TEST_DIR}"/$$
+
 _wallclock()
 {
     date "+%H %M %S" | $AWK_PROG '{ print $1*3600 + $2*60 + $3 }'
@@ -146,8 +147,8 @@ _wrapup()
     # for hangcheck ...
     # remove files that were used by hangcheck
     #
-    [ -f /tmp/check.pid ] && rm -rf /tmp/check.pid
-    [ -f /tmp/check.sts ] && rm -rf /tmp/check.sts
+    [ -f "${TEST_DIR}"/check.pid ] && rm -rf "${TEST_DIR}"/check.pid
+    [ -f "${TEST_DIR}"/check.sts ] && rm -rf "${TEST_DIR}"/check.sts
 
     if $showme
     then
@@ -197,8 +198,8 @@ END{ if (NR > 0) {
         needwrap=false
     fi
 
-    rm -f /tmp/*.out /tmp/*.err /tmp/*.time
-    rm -f /tmp/check.pid /tmp/check.sts
+    rm -f "${TEST_DIR}"/*.out "${TEST_DIR}"/*.err "${TEST_DIR}"/*.time
+    rm -f "${TEST_DIR}"/check.pid "${TEST_DIR}"/check.sts
     rm -f $tmp.*
 }
 
@@ -208,16 +209,16 @@ trap "_wrapup; exit \$status" 0 1 2 3 15
 # Save pid of check in a well known place, so that hangcheck can be sure it
 # has the right pid (getting the pid from ps output is not reliable enough).
 #
-rm -rf /tmp/check.pid
-echo $$ >/tmp/check.pid
+rm -rf "${TEST_DIR}"/check.pid
+echo $$ > "${TEST_DIR}"/check.pid
 
 # for hangcheck ...
 # Save the status of check in a well known place, so that hangcheck can be
 # sure to know where check is up to (getting test number from ps output is
 # not reliable enough since the trace stuff has been introduced).
 #
-rm -rf /tmp/check.sts
-echo "preamble" >/tmp/check.sts
+rm -rf "${TEST_DIR}"/check.sts
+echo "preamble" > "${TEST_DIR}"/check.sts
 
 # don't leave old full output behind on a clean run
 rm -f check.full
@@ -285,7 +286,7 @@ do
     rm -f core $seq.notrun
 
     # for hangcheck ...
-    echo "$seq" >/tmp/check.sts
+    echo "$seq" > "${TEST_DIR}"/check.sts
 
     start=`_wallclock`
     $timestamp && echo -n "["`date "+%T"`"]"
-- 
1.9.1
[Qemu-block] [PATCH 6/7] qemu-iotests: 141: reduce likelihood of race condition on systems with fast IO
On systems with fast IO, qemu-io may write more than 1 MiB before
receiving the block-job-cancel command, causing the test case to fail.
141 is inherently racy, but we can at least reduce the likelihood of
the job completing before the cancel command arrives by bumping the
size of the data getting written; we'll try 32 MiB for a start.

Once we actually move enough data around for the block job not to
complete prematurely, the test will still fail because the offset value
in the BLOCK_JOB_CANCELLED event will vary. Since this is more or less
inherent to the nature of this event, we just replace it with a fixed
value globally (in _filter_qmp), the same way we handle timestamps.

Signed-off-by: Sascha Silbe
Reviewed-by: Bo Tu
---
 tests/qemu-iotests/141           | 11 ++-
 tests/qemu-iotests/141.out       | 24 ++++++++++++------------
 tests/qemu-iotests/common.filter |  1 +
 3 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/tests/qemu-iotests/141 b/tests/qemu-iotests/141
index f7c28b4..a06dc72 100755
--- a/tests/qemu-iotests/141
+++ b/tests/qemu-iotests/141
@@ -83,9 +83,10 @@ test_blockjob()
 }
 
 
-TEST_IMG="$TEST_DIR/b.$IMGFMT" _make_test_img 1M
-TEST_IMG="$TEST_DIR/m.$IMGFMT" _make_test_img -b "$TEST_DIR/b.$IMGFMT" 1M
-_make_test_img -b "$TEST_DIR/m.$IMGFMT" 1M
+IMGSIZE=32M
+TEST_IMG="$TEST_DIR/b.$IMGFMT" _make_test_img "${IMGSIZE}"
+TEST_IMG="$TEST_DIR/m.$IMGFMT" _make_test_img -b "$TEST_DIR/b.$IMGFMT" "${IMGSIZE}"
+_make_test_img -b "$TEST_DIR/m.$IMGFMT" "${IMGSIZE}"
 
 _launch_qemu -nodefaults
 
@@ -147,7 +148,7 @@ echo
 # immediately, send a BLOCK_JOB_COMPLETED and ejecting the BDS would work just
 # fine without the block job still running.
 
-$QEMU_IO -c 'write 0 1M' "$TEST_DIR/m.$IMGFMT" | _filter_qemu_io
+$QEMU_IO -c "write 0 ${IMGSIZE}" "$TEST_DIR/m.$IMGFMT" | _filter_qemu_io
 
 test_blockjob \
     "{'execute': 'block-commit',
@@ -165,7 +166,7 @@ echo
 # immediately, send a BLOCK_JOB_COMPLETED and ejecting the BDS would work just
 # fine without the block job still running.
 
-$QEMU_IO -c 'write 0 1M' "$TEST_DIR/b.$IMGFMT" | _filter_qemu_io
+$QEMU_IO -c "write 0 ${IMGSIZE}" "$TEST_DIR/b.$IMGFMT" | _filter_qemu_io
 
 # With some data to stream (and @speed set to 1), block-stream will not complete
 # until we send the block-job-cancel command. Therefore, no event other than
diff --git a/tests/qemu-iotests/141.out b/tests/qemu-iotests/141.out
index adceac1..14e457f 100644
--- a/tests/qemu-iotests/141.out
+++ b/tests/qemu-iotests/141.out
@@ -1,23 +1,23 @@
 QA output created by 141
-Formatting 'TEST_DIR/b.IMGFMT', fmt=IMGFMT size=1048576
-Formatting 'TEST_DIR/m.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/b.IMGFMT
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/m.IMGFMT
+Formatting 'TEST_DIR/b.IMGFMT', fmt=IMGFMT size=33554432
+Formatting 'TEST_DIR/m.IMGFMT', fmt=IMGFMT size=33554432 backing_file=TEST_DIR/b.IMGFMT
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=33554432 backing_file=TEST_DIR/m.IMGFMT
 {"return": {}}
 
 === Testing drive-backup ===
 
 {"return": {}}
-Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
+Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=33554432 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
 {"return": {}}
 {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: block device is in use by block job: backup"}}
 {"return": {}}
-{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "drv0", "len": 1048576, "offset": 0, "speed": 0, "type": "backup"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "drv0", "len": 33554432, "offset": OFFSET, "speed": 0, "type": "backup"}}
 {"return": {}}
 
 === Testing drive-mirror ===
 
 {"return": {}}
-Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
+Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=33554432 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
 {"return": {}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "drv0", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
 {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: block device is in use by block job: mirror"}}
@@ -37,23 +37,23 @@ Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.
 
 === Testing non-active block-commit ===
 
-wrote 1048576/1048576 bytes at offset 0
-1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 33554432/33554432 bytes at offset 0
+32 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 {"return": {}}
 {"return": {}}
 {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: block device is in use by block job: commit"}}
 {"return":
[Qemu-block] [PATCH 2/7] qemu-iotests: fix 051 on non-PC architectures
Commit 61de4c68 [block: Remove BDRV_O_CACHE_WB] updated the reference
output for PCs, but neglected to do the same for the generic reference
output file. Fix 051 on all non-PC architectures by applying the same
change to the generic output file.

Fixes: 61de4c68 ("block: Remove BDRV_O_CACHE_WB")
Signed-off-by: Sascha Silbe
Reviewed-by: Bo Tu
---
 tests/qemu-iotests/051.out | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out
index c1291ff..408d613 100644
--- a/tests/qemu-iotests/051.out
+++ b/tests/qemu-iotests/051.out
@@ -145,7 +145,7 @@ QEMU X.Y.Z monitor - type 'help' for more information
 Testing: -drive driver=null-co,cache=invalid_value
 QEMU_PROG: -drive driver=null-co,cache=invalid_value: invalid cache option
 
-Testing: -drive file=TEST_DIR/t.qcow2,cache=writeback,backing.file.filename=TEST_DIR/t.qcow2.base,backing.cache.no-flush=on,backing.cache.writeback=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=drive0 -nodefaults
+Testing: -drive file=TEST_DIR/t.qcow2,cache=writeback,backing.file.filename=TEST_DIR/t.qcow2.base,backing.cache.no-flush=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=drive0 -nodefaults
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) i[K[Din[K[D[Dinf[K[D[D[Dinfo[K[D[D[D[Dinfo [K[D[D[D[D[Dinfo b[K[D[D[D[D[D[Dinfo bl[K[D[D[D[D[D[D[Dinfo blo[K[D[D[D[D[D[D[D[Dinfo bloc[K[D[D[D[D[D[D[D[D[Dinfo block[K
 drive0 (NODE_NAME): TEST_DIR/t.qcow2 (qcow2)
@@ -165,7 +165,7 @@ backing-file: TEST_DIR/t.qcow2.base (file, read-only)
     Cache mode:       writeback, ignore flushes
 (qemu) q[K[Dqu[K[D[Dqui[K[D[D[Dquit[K
 
-Testing: -drive file=TEST_DIR/t.qcow2,cache=writethrough,backing.file.filename=TEST_DIR/t.qcow2.base,backing.cache.no-flush=on,backing.cache.writeback=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=drive0 -nodefaults
+Testing: -drive file=TEST_DIR/t.qcow2,cache=writethrough,backing.file.filename=TEST_DIR/t.qcow2.base,backing.cache.no-flush=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=drive0 -nodefaults
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) i[K[Din[K[D[Dinf[K[D[D[Dinfo[K[D[D[D[Dinfo [K[D[D[D[D[Dinfo b[K[D[D[D[D[D[Dinfo bl[K[D[D[D[D[D[D[Dinfo blo[K[D[D[D[D[D[D[D[Dinfo bloc[K[D[D[D[D[D[D[D[D[Dinfo block[K
 drive0 (NODE_NAME): TEST_DIR/t.qcow2 (qcow2)
@@ -185,7 +185,7 @@ backing-file: TEST_DIR/t.qcow2.base (file, read-only)
     Cache mode:       writeback, ignore flushes
 (qemu) q[K[Dqu[K[D[Dqui[K[D[D[Dquit[K
 
-Testing: -drive file=TEST_DIR/t.qcow2,cache=unsafe,backing.file.filename=TEST_DIR/t.qcow2.base,backing.cache.no-flush=on,backing.cache.writeback=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=drive0 -nodefaults
+Testing: -drive file=TEST_DIR/t.qcow2,cache=unsafe,backing.file.filename=TEST_DIR/t.qcow2.base,backing.cache.no-flush=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=drive0 -nodefaults
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) i[K[Din[K[D[Dinf[K[D[D[Dinfo[K[D[D[D[Dinfo [K[D[D[D[D[Dinfo b[K[D[D[D[D[D[Dinfo bl[K[D[D[D[D[D[D[Dinfo blo[K[D[D[D[D[D[D[D[Dinfo bloc[K[D[D[D[D[D[D[D[D[Dinfo block[K
 drive0 (NODE_NAME): TEST_DIR/t.qcow2 (qcow2)
@@ -205,8 +205,8 @@ backing-file: TEST_DIR/t.qcow2.base (file, read-only)
     Cache mode:       writeback, ignore flushes
 (qemu) q[K[Dqu[K[D[Dqui[K[D[D[Dquit[K
 
-Testing: -drive file=TEST_DIR/t.qcow2,cache=invalid_value,backing.file.filename=TEST_DIR/t.qcow2.base,backing.cache.no-flush=on,backing.cache.writeback=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=drive0 -nodefaults
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,cache=invalid_value,backing.file.filename=TEST_DIR/t.qcow2.base,backing.cache.no-flush=on,backing.cache.writeback=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=drive0: invalid cache option
+Testing: -drive file=TEST_DIR/t.qcow2,cache=invalid_value,backing.file.filename=TEST_DIR/t.qcow2.base,backing.cache.no-flush=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=drive0 -nodefaults
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,cache=invalid_value,backing.file.filename=TEST_DIR/t.qcow2.base,backing.cache.no-flush=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=drive0: invalid cache option
 
 === Specifying the protocol layer ===
 
--
[Qemu-block] [PATCH 5/7] qemu-iotests: 068: don't require KVM
None of the other test cases explicitly enable KVM and there's no
obvious reason for 068 to require it. Drop this so all test cases can
be executed in environments where KVM is not available (e.g. because
the user doesn't have sufficient permissions to access /dev/kvm).

Signed-off-by: Sascha Silbe
Reviewed-by: Bo Tu
---
 tests/qemu-iotests/068 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/068 b/tests/qemu-iotests/068
index 58d1d80..7562dd7 100755
--- a/tests/qemu-iotests/068
+++ b/tests/qemu-iotests/068
@@ -53,7 +53,7 @@ _make_test_img $IMG_SIZE
 
 case "$QEMU_DEFAULT_MACHINE" in
   s390-ccw-virtio)
-      platform_parm="-no-shutdown -machine accel=kvm"
+      platform_parm="-no-shutdown"
       ;;
   *)
       platform_parm=""
-- 
1.9.1
[Qemu-block] [PATCH 7/7] qemu-iotests: iotests.py: get rid of __all__
The __all__ list contained a typo for as long as the iotests module
existed: a missing comma, which makes Python concatenate the adjacent
string literals 'test_dir' and 'qemu_img' into the single name
'test_dirqemu_img'. Since no such name exists in the module, that typo
prevented "from iotests import *" (which is the only case where
iotests.__all__ is used at all) from ever working.

The names used by iotests are highly prone to name collisions, so
importing them all unconditionally is a bad idea anyway. Since __all__
is not adding any value, let's just get rid of it.

Fixes: f345cfd0 ("qemu-iotests: add iotests Python module")
Signed-off-by: Sascha Silbe
Reviewed-by: Bo Tu
---
 tests/qemu-iotests/iotests.py | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index bf31ec8..0c0b533 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -29,10 +29,6 @@ import qmp
 import qtest
 import struct
 
-__all__ = ['imgfmt', 'imgproto', 'test_dir' 'qemu_img', 'qemu_io',
-           'VM', 'QMPTestCase', 'notrun', 'main', 'verify_image_format',
-           'verify_platform', 'filter_test_dir', 'filter_win32',
-           'filter_qemu_io', 'filter_chown', 'log']
 
 # This will not work if arguments contain spaces but is necessary if we
 # want to support the override options that ./check supports.
-- 
1.9.1