On 2/4/20 11:32 AM, Max Reitz wrote:
On 31.01.20 18:44, Eric Blake wrote:
Based-on: <20200124103458.1525982-2-david.edmond...@oracle.com>
([PATCH v2 1/2] qemu-img: Add --target-is-zero to convert)
I'm working on adding an NBD extension that reports whether an image
is already all zero when the client first connects. I initially
thought I could write the NBD code to just call bdrv_has_zero_init(),
but that turned out to be a bad assumption that instead resulted in
this patch series. The NBD patch will come later (and cross-posted to
the NBD protocol, libnbd, nbdkit, and qemu, as it will affect all four
repositories).
We had a discussion about this on IRC, and as far as I remember I wasn’t
quite sold on the “why”. So, again, I wonder why this is needed.
I mean, it does make intuitive sense to want to know whether an image is
fully zero, but if I continue thinking about it I don’t know any case
where we would need to figure it out and where we could accept “We don’t
know” as an answer. So I’m looking for use cases, but this cover letter
doesn’t mention any. (And from a quick glance I don’t see this series
using the flag, actually.)
Patch 10/17 has:
diff --git a/qemu-img.c b/qemu-img.c
index e60217e6c382..c8519a74f738 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1985,10 +1985,12 @@ static int convert_do_copy(ImgConvertState *s)
int64_t sector_num = 0;
/* Check whether we have zero initialisation or can get it
efficiently */
- if (!s->has_zero_init && s->target_is_new && s->min_sparse &&
- !s->target_has_backing) {
- s->has_zero_init = !!(bdrv_known_zeroes(blk_bs(s->target)) &
- BDRV_ZERO_CREATE);
+ if (!s->has_zero_init && s->min_sparse && !s->target_has_backing) {
+ ret = bdrv_known_zeroes(blk_bs(s->target));
+ if (ret & BDRV_ZERO_OPEN ||
+ (s->target_is_new && ret & BDRV_ZERO_CREATE)) {
+ s->has_zero_init = true;
+ }
}
That's the use case: when copying into a destination file, it's useful
to know if the destination already reads as all zeroes, before
attempting a fallback to bdrv_make_zero(BDRV_REQ_NO_FALLBACK) or calls
to block status checking for holes.
(We have a use case with convert -n to freshly created image files, but
my position on this on IRC was that we want the --target-is-zero flag
for that anyway: Auto-detection may always break, our preferred default
behavior may always change, so if you want convert -n not to touch the
target image except to write non-zero data from the source, we need a
--target-is-zero flag and users need to use it. Well, management
layers, because I don’t think users would use convert -n anyway.
And with --target-is-zero and users effectively having to use it, I
don’t think that’s a good example of a use case.)
Yes, there will still be cases where you have to use --target-is-zero
because the image itself couldn't report that it already reads as
zeroes, but there are also enough cases where the destination is already
known to read zeroes and it's a shame to tell the user that 'you have to
add --target-is-zero to get faster copying even though we could have
inferred it on your behalf'.
I suppose there is the point of blockdev-create + blockdev-mirror: This
has exactly the same problem as convert -n. But again, if you really
want blockdev-mirror not just to force-zero the image, you probably need
to tell it so explicitly (i.e., with a --target-is-zero flag for
blockdev-mirror).
(Well, I suppose we could save us a target-is-zero for mirror if we took
this series and had a filter driver that force-reports BDRV_ZERO_OPEN.
But, well, please no.)
But maybe I’m just an idiot and there is no reason not to take this
series and make blockdev-create + blockdev-mirror do the sensible thing
by default in most cases. *shrug*
My argument for taking the series _is_ that the common case can be made
more efficient without user effort. Yes, we still need the knob for
when the common case isn't already smart enough, but the difference in
avoiding a pre-zeroing pass is noticeable when copying images around
(and more than just for qcow2 - my followup series to improve NBD is
similarly useful given how much work has already been invested in
mapping NBD into storage access over https in the upper layers like ovirt).
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org