v2: Two-pass inactivation. [Kevin]
For now we only consider the bs->file chain, which is incomplete: for
example, if qcow2 is a quorum child, we don't properly invalidate or
inactivate it. This series recurses into the subtrees in both
bdrv_invalidate_cache_all and bdrv_inactivate_all. This
Now they are invalidated by the block layer, so it's not necessary to
do this in block drivers' implementations of .bdrv_invalidate_cache.
Signed-off-by: Fam Zheng
---
block/qcow2.c  | 7 ---
block/qed.c    | 6 --
block/quorum.c | 16
3 files
Currently we only inactivate the top BDS. Actually bdrv_inactivate
should be the opposite of bdrv_invalidate_cache.
Recurse into the whole subtree instead.
Because a node may have multiple parents, and because once
BDRV_O_INACTIVE is set for a node, further writes are not allowed, we
cannot
Currently we only recurse to bs->file, which will miss the children in quorum
and VMDK.
Recurse into the whole subtree to avoid that.
Signed-off-by: Fam Zheng
---
block.c | 20 ++--
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/block.c
On Tue, 05/03 16:39, Eric Blake wrote:
> I noticed some inconsistencies in FUA handling while working
> with NBD, then Kevin pointed out that my initial attempt wasn't
> quite right for iscsi which also had problems, so this has
> expanded into a series rather than a single patch.
>
> I'm not
On Wed, 05/11 08:48, Fam Zheng wrote:
> racy problem. Any suggestion how this could be fixed?
Reading into the subthread I see the answer: the file-private locks look
promising. Will take a look at that! Thanks.
Fam
On Tue, 05/10 09:57, Daniel P. Berrange wrote:
> On Tue, May 10, 2016 at 10:50:40AM +0800, Fam Zheng wrote:
> > They are wrappers of POSIX fcntl file locking, with the additional
> > interception of open/close (through qemu_open and qemu_close) to offer a
> > better semantics that preserves the
On 05/10/2016 10:33 AM, Quentin Casasnovas wrote:
> Looks like there's an easier way:
>
> $ qemu-img create -f qcow2 foo.qcow2 10G
> $ qemu-nbd --discard=on -c /dev/nbd0 foo.qcow2
> $ mkfs.ext4 /dev/nbd0
> mke2fs 1.42.13 (17-May-2015)
> Discarding device blocks: failed - Input/output error
On 10/05/2016 17:38, Alex Bligh wrote:
> > and are at the
> > mercy of however the kernel currently decides to split large requests).
>
> I am surprised TRIM doesn't get broken up the same way READ and WRITE
> do.
The payload of a TRIM request has a constant size, so it makes sense not to split it the way READ and WRITE are split.
The kernel
On Tue, May 10, 2016 at 05:54:44PM +0200, Quentin Casasnovas wrote:
> On Tue, May 10, 2016 at 09:46:36AM -0600, Eric Blake wrote:
> > On 05/10/2016 09:41 AM, Alex Bligh wrote:
> > >
> > > On 10 May 2016, at 16:29, Eric Blake wrote:
> > >
> > >> So the kernel is currently one
This adds support for testing the LUKS driver with the block
I/O test framework.
cd tests/qemu-iotests
./check -luks
A handful of test cases are modified to work with luks
- 004 - whitelist luks format
- 012 - use TEST_IMG_FILE instead of TEST_IMG for file ops
- 048 - use
The LUKS block driver tests will require the ability to specify
encryption secrets with block devices. This requires using the
--object argument to qemu-img/qemu-io to create a 'secret'
object.
When the IMGKEYSECRET env variable is set, it provides the
password to be associated with a secret
Currently all block tests use the traditional syntax for images
just specifying a filename. To support the LUKS driver without
resorting to JSON, the tests need to be able to use the new
--image-opts argument to qemu-img and qemu-io.
This introduces a new env variable IMGOPTSSYNTAX. If this is
This series contains the 3 test suite patches that had to be dropped
from the v6 series during merge with the block tree:
v6: https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg04935.html
v7: https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg06687.html
v8:
On Tue, May 10, 2016 at 04:49:57PM +0100, Alex Bligh wrote:
>
> On 10 May 2016, at 16:45, Quentin Casasnovas wrote:
>
> > I'm by no means an expert in this, but why would the kernel break up those
> > TRIM commands? After all, breaking things up makes sense
On Tue, May 10, 2016 at 09:46:36AM -0600, Eric Blake wrote:
> On 05/10/2016 09:41 AM, Alex Bligh wrote:
> >
> > On 10 May 2016, at 16:29, Eric Blake wrote:
> >
> >> So the kernel is currently one of the clients that does NOT honor block
> >> sizes, and as such, servers should
On Tue, May 10, 2016 at 04:38:29PM +0100, Alex Bligh wrote:
> Eric,
>
> On 10 May 2016, at 16:29, Eric Blake wrote:
> >>> Maybe we should revisit that in the spec, and/or advertise yet another
> >>> block size (since the maximum size for a trim and/or write_zeroes
> >>>
On 10 May 2016, at 16:45, Quentin Casasnovas wrote:
> I'm by no means an expert in this, but why would the kernel break up those
> TRIM commands? After all, breaking things up makes sense when the length
> of the request is big, not that much when it only
On 10 May 2016, at 16:46, Eric Blake wrote:
> Does anyone have an easy way to cause the kernel to request a trim
> operation that large on a > 4G export? I'm not familiar enough with
> EXT4 operation to know what file system operations you can run to
> ultimately indirectly
On 05/10/2016 09:41 AM, Alex Bligh wrote:
>
> On 10 May 2016, at 16:29, Eric Blake wrote:
>
>> So the kernel is currently one of the clients that does NOT honor block
>> sizes, and as such, servers should be prepared for ANY size up to
>> UINT_MAX (other than DoS handling).
>
On 10 May 2016, at 16:29, Eric Blake wrote:
> So the kernel is currently one of the clients that does NOT honor block
> sizes, and as such, servers should be prepared for ANY size up to
> UINT_MAX (other than DoS handling).
Interesting followup question:
If the kernel does
Eric,
On 10 May 2016, at 16:29, Eric Blake wrote:
>>> Maybe we should revisit that in the spec, and/or advertise yet another
>>> block size (since the maximum size for a trim and/or write_zeroes
>>> request may indeed be different than the maximum size for a read/write).
>>
On 05/10/2016 09:08 AM, Alex Bligh wrote:
> Eric,
>
>> Hmm. The current wording of the experimental block size additions does
>> NOT allow the client to send a NBD_CMD_TRIM with a size larger than the
>> maximum NBD_CMD_WRITE:
>>
Eric,
> Hmm. The current wording of the experimental block size additions does
> NOT allow the client to send a NBD_CMD_TRIM with a size larger than the
> maximum NBD_CMD_WRITE:
> https://github.com/yoe/nbd/blob/extension-info/doc/proto.md#block-size-constraints
Correct
> Maybe we should
Am 06.05.2016 um 18:26 hat Eric Blake geschrieben:
> 2.7 material, depends on Kevin's block-next:
> git://repo.or.cz/qemu/kevin.git block-next
>
> Previously posted as part of a larger NBD series [1] and as v6 [3].
> Mostly orthogonal to Kevin's recent work to also kill sector
> interfaces from
Am 10.05.2016 um 14:56 hat Eric Blake geschrieben:
> On 05/10/2016 02:55 AM, Kevin Wolf wrote:
> > Am 06.05.2016 um 18:26 hat Eric Blake geschrieben:
> >> Sector-based blk_aio_readv() and blk_aio_writev() should die; switch
> >> to byte-based blk_aio_preadv() and blk_aio_pwritev() instead.
> >>
>
On 10.05.2016 09:36, Changlong Xie wrote:
> ChangLog:
> v14:
> 1. Address comments from Berto and Max
> p2: introduce bdrv_drained_begin/end, rename last_index, remove redundant
> assert codes
> v13:
> 1. Rebase to the newest codes
> 2. Address comments from Berto and Max
> p1. Add R-B, fix
On 10.05.2016 09:36, Changlong Xie wrote:
> From: Wen Congyang
>
> Signed-off-by: Wen Congyang
> Signed-off-by: zhanghailiang
> Signed-off-by: Gonglei
> Signed-off-by: Changlong Xie
[adding nbd-devel, qemu-block]
On 05/06/2016 02:45 AM, Quentin Casasnovas wrote:
> When running fstrim on a filesystem mounted through qemu-nbd with
> --discard=on, fstrim would fail with I/O errors:
>
> $ fstrim /k/spl/ice/
> fstrim: /k/spl/ice/: FITRIM ioctl failed: Input/output error
>
>
On 05/10/2016 02:55 AM, Kevin Wolf wrote:
> Am 06.05.2016 um 18:26 hat Eric Blake geschrieben:
>> Sector-based blk_aio_readv() and blk_aio_writev() should die; switch
>> to byte-based blk_aio_preadv() and blk_aio_pwritev() instead.
>>
>> As part of the cleanup, scsi_init_iovec() no longer needs to
Am 10.05.2016 um 14:22 hat Daniel P. Berrange geschrieben:
> On Tue, May 10, 2016 at 01:11:30PM +0100, Richard W.M. Jones wrote:
> > At no point did I say that it was safe to use libguestfs on live VMs
> > or that you would always get consistent data out.
> >
> > But the fact that it can fail is
Am 04.05.2016 um 00:39 hat Eric Blake geschrieben:
> I noticed some inconsistencies in FUA handling while working
> with NBD, then Kevin pointed out that my initial attempt wasn't
> quite right for iscsi which also had problems, so this has
> expanded into a series rather than a single patch.
>
>
On Fri 22 Apr 2016 07:42:39 PM CEST, Kevin Wolf wrote:
> This moves the throttling related part of the BDS life cycle management
> to BlockBackend. The throttling group reference is now kept even when no
> medium is inserted.
>
> With this commit, throttling isn't disabled and then re-enabled any
On Fri 22 Apr 2016 07:42:34 PM CEST, Kevin Wolf wrote:
> typedef struct BlockBackendPublic {
> -/* I/O throttling */
> +/* I/O throttling.
> + * throttle_state tells us if this BDS has I/O limits configured.
> + * io_limits_disabled tells us if they are currently being enforced */
On Tue, May 10, 2016 at 01:11:30PM +0100, Richard W.M. Jones wrote:
> At no point did I say that it was safe to use libguestfs on live VMs
> or that you would always get consistent data out.
>
> But the fact that it can fail is understood, the chance of failure is
> really tiny (it has literally
On Fri 22 Apr 2016 07:42:42 PM CEST, Kevin Wolf wrote:
> Checking whether there are throttled requests requires going to the
> associated BlockBackend, which we want to avoid. All users of
> bdrv_requests_pending() already call bdrv_parent_drained_begin()
> first,
There's a couple of
At no point did I say that it was safe to use libguestfs on live VMs
or that you would always get consistent data out.
But the fact that it can fail is understood, the chance of failure is
really tiny (it has literally only happened twice that I've read
corrupted data, in years of daily use), and
Am 10.05.2016 um 13:46 hat Richard W.M. Jones geschrieben:
> On Tue, May 10, 2016 at 01:08:49PM +0200, Kevin Wolf wrote:
> > Are you saying that libguestfs only allows operations like df on live
> > images, but not e.g. copying files out of the VM?
> [...]
>
> virt-copy-out will let you copy out
On Tue, May 10, 2016 at 01:08:49PM +0200, Kevin Wolf wrote:
> Are you saying that libguestfs only allows operations like df on live
> images, but not e.g. copying files out of the VM?
[...]
virt-copy-out will let you copy out files from a live VM.
There's no difference between "safe" and
Am 10.05.2016 um 12:29 hat Daniel P. Berrange geschrieben:
> On Tue, May 10, 2016 at 12:07:06PM +0200, Kevin Wolf wrote:
> > Am 10.05.2016 um 11:43 hat Daniel P. Berrange geschrieben:
> > > On Tue, May 10, 2016 at 11:35:14AM +0200, Kevin Wolf wrote:
> > > > Am 10.05.2016 um 11:23 hat Daniel P.
Am 10.05.2016 um 12:16 hat Richard W.M. Jones geschrieben:
> On Tue, May 10, 2016 at 12:07:06PM +0200, Kevin Wolf wrote:
> > I'm surprised how low the standards seem to be when we're talking about
> > data integrity. If occasionally losing data is okay, the qemu block
> > layer could be quite a
On 10/05/2016 11:40, Kevin Wolf wrote:
> > Regarding performance, I'm thinking about a guest with 8 disks (queue
> > depth 32). The worst case is when the guest submits 32 requests at once
> > but the Linux AIO event limit has already been reached. Then the disk
> > is starved until other
On Tue, May 10, 2016 at 12:07:06PM +0200, Kevin Wolf wrote:
> Am 10.05.2016 um 11:43 hat Daniel P. Berrange geschrieben:
> > On Tue, May 10, 2016 at 11:35:14AM +0200, Kevin Wolf wrote:
> > > Am 10.05.2016 um 11:23 hat Daniel P. Berrange geschrieben:
> > > > On Tue, May 10, 2016 at 11:14:22AM
On Tue, May 10, 2016 at 12:07:06PM +0200, Kevin Wolf wrote:
> I'm surprised how low the standards seem to be when we're talking about
> data integrity. If occasionally losing data is okay, the qemu block
> layer could be quite a bit simpler.
I welcome this patch because it fixes a real data
Am 10.05.2016 um 11:43 hat Daniel P. Berrange geschrieben:
> On Tue, May 10, 2016 at 11:35:14AM +0200, Kevin Wolf wrote:
> > Am 10.05.2016 um 11:23 hat Daniel P. Berrange geschrieben:
> > > On Tue, May 10, 2016 at 11:14:22AM +0200, Kevin Wolf wrote:
> > > > Am 10.05.2016 um 10:50 hat Daniel P.
On Tue, May 10, 2016 at 11:14:22AM +0200, Kevin Wolf wrote:
> Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> > On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> > > On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > > > However I didn't test
On 05/06/2016 11:46 PM, Stefan Hajnoczi wrote:
On Fri, Apr 15, 2016 at 04:10:37PM +0800, Changlong Xie wrote:
+static void replication_close(BlockDriverState *bs)
+{
+    BDRVReplicationState *s = bs->opaque;
+
+    if (s->mode == REPLICATION_MODE_SECONDARY) {
+        g_free(s->top_id);
+    }
On Tue, May 10, 2016 at 11:35:14AM +0200, Kevin Wolf wrote:
> Am 10.05.2016 um 11:23 hat Daniel P. Berrange geschrieben:
> > On Tue, May 10, 2016 at 11:14:22AM +0200, Kevin Wolf wrote:
> > > Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> > > > On Tue, May 10, 2016 at 09:43:04AM
Am 10.05.2016 um 11:30 hat Stefan Hajnoczi geschrieben:
> On Mon, May 09, 2016 at 06:31:44PM +0200, Paolo Bonzini wrote:
> > On 19/04/2016 11:09, Stefan Hajnoczi wrote:
> > >> > This has better performance because it executes fewer system calls
> > >> > and does not use a bottom half per disk.
> >
Am 10.05.2016 um 11:23 hat Daniel P. Berrange geschrieben:
> On Tue, May 10, 2016 at 11:14:22AM +0200, Kevin Wolf wrote:
> > Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> > > On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> > > > On Tue, May 10, 2016 at
On Mon, May 09, 2016 at 06:31:44PM +0200, Paolo Bonzini wrote:
> On 19/04/2016 11:09, Stefan Hajnoczi wrote:
> >> > This has better performance because it executes fewer system calls
> >> > and does not use a bottom half per disk.
> > Each aio_context_t is initialized for 128 in-flight requests in
On Tue, May 10, 2016 at 11:14:22AM +0200, Kevin Wolf wrote:
> Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> > On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> > > On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > > > However I didn't test
On Tue, May 10, 2016 at 10:06:35AM +0100, Richard W.M. Jones wrote:
> On Tue, May 10, 2016 at 09:57:48AM +0100, Daniel P. Berrange wrote:
> > On Tue, May 10, 2016 at 10:50:40AM +0800, Fam Zheng wrote:
> > > They are wrappers of POSIX fcntl file locking, with the additional
> > > interception of
Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> > On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > > However I didn't test the write-shareable case (the libvirt
> > > flag which should map to
On Tue, May 10, 2016 at 09:57:48AM +0100, Daniel P. Berrange wrote:
> On Tue, May 10, 2016 at 10:50:40AM +0800, Fam Zheng wrote:
> > They are wrappers of POSIX fcntl file locking, with the additional
> > interception of open/close (through qemu_open and qemu_close) to offer a
> > better semantics
On Tue, May 10, 2016 at 10:50:40AM +0800, Fam Zheng wrote:
> They are wrappers of POSIX fcntl file locking, with the additional
> interception of open/close (through qemu_open and qemu_close) to offer a
> better semantics that preserves the locks across multiple life cycles of
> different fds on
On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > However I didn't test the write-shareable case (the libvirt
> > flag which should map to a shared lock -- is that right Dan?).
>
> To Dan (mainly): I
On Tue 10 May 2016 10:39:21 AM CEST, Kevin Wolf wrote:
>>s->children = g_renew(BdrvChild *, s->children, ++s->num_children);
>>s->children[s->num_children] = child;
>
> Without having checked the context, this code is not equivalent. You
> need to access s->children[s->num_children
On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> However I didn't test the write-shareable case (the libvirt
> flag which should map to a shared lock -- is that right Dan?).
To Dan (mainly): I think setting the flag in libvirt only
sets cache=unsafe on the qemu drive (it
Am 09.05.2016 um 17:52 hat Alberto Garcia geschrieben:
> On Wed 13 Apr 2016 10:33:08 AM CEST, Changlong Xie wrote:
>
> Sorry for the late reply!
>
> The patch looks good, I have some additional comments on top of what Max
> wrote, nothing serious though :)
>
> > @@ -67,6 +68,9 @@ typedef struct
Am 09.05.2016 um 20:23 hat Max Reitz geschrieben:
> On 08.05.2016 05:35, Eric Blake wrote:
> > On 05/07/2016 09:16 PM, Eric Blake wrote:
> >> While working on NBD, I found myself cursing the qemu-io UI for
> >> not letting me test various scenarios, particularly after fixing
> >> NBD to serve at
Am 10.05.2016 um 05:23 hat Fam Zheng geschrieben:
> On Fri, 05/06 09:49, Kevin Wolf wrote:
> > Am 05.05.2016 um 02:32 hat Fam Zheng geschrieben:
> > > On Wed, 05/04 12:12, Kevin Wolf wrote:
> > > > Am 19.04.2016 um 03:42 hat Fam Zheng geschrieben:
> > > > > Currently we only inactivate the top
On Tue, May 10, 2016 at 10:50:32AM +0800, Fam Zheng wrote:
> v4: Don't lock RO image. [Rich]
I applied this on top of HEAD and added some debugging messages to
block/raw-posix.c so I can see when files are being opened and locked,
and it does appear to do the right thing for read-only and
On Tue, May 10, 2016 at 10:50:40AM +0800, Fam Zheng wrote:
> +int qemu_lock_fd(int fd, int64_t start, int64_t len, bool readonly)
I find this new API to be very unintuitive. When I was reading the
other code in block/raw-posix.c I had to refer back to this file to
find out what all the integers
On Tue 10 May 2016 09:36:38 AM CEST, Changlong Xie wrote:
> From: Wen Congyang
>
> Signed-off-by: Wen Congyang
> Signed-off-by: zhanghailiang
> Signed-off-by: Gonglei
> Signed-off-by: Changlong
From: Wen Congyang
The new QMP command name is x-blockdev-change. For now it only supports
adding/removing a quorum child; it doesn't yet cover all kinds of children,
all kinds of operations, or all block drivers, so it is experimental.
Signed-off-by: Wen Congyang
ChangLog:
v14:
1. Address comments from Berto and Max
p2: introduce bdrv_drained_begin/end, rename last_index, remove redundant
assert codes
v13:
1. Rebase to the newest codes
2. Address comments from Berto and Max
p1. Add R-B, fix incorrect syntax
p2. Add missing "qemu/cutils.h" since 2.6, and
From: Wen Congyang
In some cases, we want to take a quorum child offline, and take
another child online.
Signed-off-by: Wen Congyang
Signed-off-by: zhanghailiang
Signed-off-by: Gonglei
From: Wen Congyang
Signed-off-by: Wen Congyang
Signed-off-by: zhanghailiang
Signed-off-by: Gonglei
Signed-off-by: Changlong Xie
---
block.c | 8
On 05/09/2016 11:52 PM, Alberto Garcia wrote:
On Wed 13 Apr 2016 10:33:08 AM CEST, Changlong Xie wrote:
Sorry for the late reply!
Never mind : )
The patch looks good, I have some additional comments on top of what Max
wrote, nothing serious though :)
@@ -67,6 +68,9 @@ typedef struct