Re: [PATCH 13/17] qcow2: Add new autoclear feature for all zero image

Eric Blake Tue, 04 Feb 2020 05:13:55 -0800

On 2/3/20 11:45 AM, Vladimir Sementsov-Ogievskiy wrote:

31.01.2020 20:44, Eric Blake wrote:

With the recent introduction of BDRV_ZERO_OPEN, we can optimize
various qemu-img operations if we know the destination starts life
with all zero content.  For an image with no cluster allocations and
no backing file, this was already trivial with BDRV_ZERO_CREATE; but
for a fully preallocated image, it does not scale to crawl through the
entire L1/L2 tree to see if every cluster is currently marked as a
zero cluster.  But it is quite easy to add an autoclear bit to the
qcow2 file itself: the bit will be set after newly creating an image
or after qcow2_make_empty, and cleared on any other modification
(including by an older qemu that doesn't recognize the bit).
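As an illustration of why an autoclear bit is safe against older writers, here is a minimal sketch of the masking rule. The constant and function names are hypothetical and chosen for this example, not taken from QEMU's source. Bits 0 and 1 are the autoclear bits the spec already defines, and bit 2 is the one proposed in this patch.

```python
# Illustrative model of the qcow2 autoclear_features bitmask.
# Names are hypothetical; only the bit positions come from the spec/patch.
AUTOCLEAR_BITMAPS = 1 << 0        # existing: bitmaps extension bit
AUTOCLEAR_DATA_FILE_RAW = 1 << 1  # existing: raw external data bit
AUTOCLEAR_ALL_ZERO = 1 << 2       # proposed in this patch


def autoclear_after_write(on_disk: int, known_mask: int) -> int:
    """Return the autoclear value a writer must store before modifying
    the image: every bit the writer does not recognize is dropped."""
    return on_disk & known_mask


# An older implementation only knows bits 0 and 1.
OLD_QEMU_MASK = AUTOCLEAR_BITMAPS | AUTOCLEAR_DATA_FILE_RAW

# A freshly created image carries only the proposed all-zero bit.
fresh_image = AUTOCLEAR_ALL_ZERO
after_old_writer = autoclear_after_write(fresh_image, OLD_QEMU_MASK)
```

Once the older writer has touched the image, the all-zero bit is gone, so a newer reader never trusts a stale claim.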


This patch documents the new bit, independently of implementing the
places in code that should set it (which means that for bisection
purposes, it is safer to still mask the bit out when opening an image
with the bit set).

A few iotests have updated output due to the larger number of named
header features.

Signed-off-by: Eric Blake <ebl...@redhat.com>

---
RFC: As defined in this patch, I defined the bit to be clear if any
cluster defers to a backing file. But the block layer would handle
things just fine if we instead allowed the bit to be set if all
clusters allocated in this image are zero, even if there are other
clusters not allocated.  Or maybe we want TWO bits: one if all
clusters allocated here are known zero, and a second if we know that
there are any clusters that defer to a backing image.

-                    Bits 2-63:  Reserved (set to 0)
+                    Bit 2:      All zero image bit
+                                If this bit is set, the entire image reads
+                                as all zeroes. This can be useful for
+                                detecting just-created images even when
+                                clusters are preallocated, which in turn
+                                can be used to optimize image copying.
+
+                                This bit should not be set if any cluster
+                                in the image defers to a backing file.


Hmm. The term "defers to a backing file" is not defined in the spec. And, as I
understand, it can't be defined, by design. A backing file may be
added/removed/changed dynamically, and the qcow2 driver will not know about
it. So, the only way to be sure that clusters do not defer to a backing file
is to make them ZERO clusters (not UNALLOCATED). But this is inefficient, as
we'll have to allocate all L2 tables.

So, I think it is better to define this flag as "all allocated clusters are
zero".


That was precisely the topic of my RFC question.

I _do_ think it is simpler to report that 'all clusters where content
comes from _this_ image read as zero', leaving unallocated clusters as
zero only if 1. there is no backing image, or 2. the backing image also
reads as all zero (recursing as needed). I'll spin v2 of these patches
along those lines, although I'm hoping for more review on the rest of
the series, first.


Hmm, interesting: in the qcow2 spec, "allocated" means allocated on disk and
having an offset. So, a ZERO cluster is actually an unallocated cluster, with
bit 0 of the L2 entry set to 1. On the other hand, the qemu block layer
considers ZERO clusters as "allocated" (from the POV of the backing chain).

I really want the definition to be 'any cluster whose contents come from
this layer' (the qemu-io definition of allocated, not necessarily the
qcow2 definition of allocated), which picks up BOTH types of qcow2 zero
clusters (those preallocated but marked 0, where the contents of the
allocated area are indeterminate but never read, and those unallocated
but marked 0 which do not defer to the backing layer). Whether or not
the cluster is allocated is less important than whether the image reads
as 0 at that cluster.
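To make the two meanings of "allocated" concrete, here is a small sketch that classifies a standard L2 entry. It assumes the usual qcow2 layout (bit 0 is the zero flag, bits 9-55 hold the host cluster offset) and deliberately ignores compressed clusters, extended L2 entries and external data files. All names are hypothetical and for illustration only.

```python
from enum import Enum

L2E_ZERO_FLAG = 1 << 0                # bit 0: cluster reads as all zeroes
L2E_OFFSET_MASK = 0x00FFFFFFFFFFFE00  # bits 9-55: host cluster offset


class Kind(Enum):
    UNALLOCATED = "no offset, no zero flag: defers to the backing file"
    ZERO_PLAIN = "zero flag set, no host offset"
    ZERO_ALLOC = "zero flag set, preallocated host offset (never read)"
    DATA = "host offset holds guest data"


def classify(l2_entry: int) -> Kind:
    """Classify an uncompressed, non-extended L2 entry."""
    has_offset = bool(l2_entry & L2E_OFFSET_MASK)
    if l2_entry & L2E_ZERO_FLAG:
        return Kind.ZERO_ALLOC if has_offset else Kind.ZERO_PLAIN
    return Kind.DATA if has_offset else Kind.UNALLOCATED


def spec_allocated(kind: Kind) -> bool:
    """'Allocated' in the qcow2 spec sense: the cluster has a host offset."""
    return kind in (Kind.ZERO_ALLOC, Kind.DATA)


def layer_allocated(kind: Kind) -> bool:
    """'Allocated' in the block-layer sense: content comes from this layer."""
    return kind is not Kind.UNALLOCATED


def reads_as_zero_here(kind: Kind) -> bool:
    """True for both flavors of zero cluster, regardless of allocation."""
    return kind in (Kind.ZERO_PLAIN, Kind.ZERO_ALLOC)
```

In this model the two definitions disagree only on `ZERO_PLAIN`: the spec calls it unallocated, while the block layer treats it as content provided by this layer.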

But I think that you are right that an alternative definition of 'all
allocated clusters are zero' will give the same results when crawling
through the backing chain to learn if the overall image reads as zero,
and that's all the more that we can expect out of this bit.


So, if we define it as "all allocated clusters are zero", we are done:
other clusters are either unallocated and MAY refer to backing, so we
can say nothing about their read-as-zero status at the level of the qcow2
spec, or unallocated with the zero bit set, which are normal ZERO clusters.

So, at the level of the qcow2 driver I think it's better to consider only
this image. Still, we can implement a generic bdrv_is_all_zeros, which will
check all layers (or at least, check that bs->backing is NULL).

The earlier parts of this series which renamed bdrv_has_zero_init() into
bdrv_known_zeroes() does just that - it already handles recursion
through the backing chain, and insists that an image is all zeroes with
respect to BDRV_ZERO_OPEN only if all layers of the backing chain agree.
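The chain rule described here can be sketched as follows. This is a toy model, not the implementation of bdrv_known_zeroes(): the `Layer` class and `chain_reads_as_zero` function are hypothetical, and the per-layer flag stands for the revised meaning discussed above (every cluster whose content comes from that layer reads as zero).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Layer:
    """One image in a backing chain (hypothetical model)."""
    all_zero_bit: bool                 # per-layer claim, revised definition
    backing: Optional["Layer"] = None  # None means no backing file


def chain_reads_as_zero(layer: Optional[Layer]) -> bool:
    """The guest-visible image is all zero only if every layer agrees.

    Clusters a layer does not provide fall through to its backing layer,
    so a single layer without the bit makes the result unknown (False).
    """
    while layer is not None:
        if not layer.all_zero_bit:
            return False
        layer = layer.backing
    return True


base_with_data = Layer(all_zero_bit=False)
overlay_on_data = Layer(all_zero_bit=True, backing=base_with_data)
overlay_on_zero = Layer(all_zero_bit=True, backing=Layer(all_zero_bit=True))
```

With this model, a fresh overlay on top of a populated base is not reported as all zero, even though the overlay itself holds no data.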


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
