Re: [Qemu-devel] [PATCH v2] specs/qcow2: Fix documentation of the compressed cluster descriptor
On Tue 20 Feb 2018 08:40:43 PM CET, Eric Blake wrote: >> Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)): > > I'm looking at how this works for different cluster sizes. If we have > 512-byte clusters, x is 61, and we DON'T have the 'number sectors' > field at all! Well, you can definitely have compressed images with 512-byte clusters. So I think he have just found one more mistake in the documentation :) (x = 62 - (cluster_bits - 8)): Bit 0 - x:Host cluster offset. x+1 - 61:Number of 512-byte sectors That's not how it works, it's rather [0, x-1], [x, 61]. For 512-byte clusters x is 61 and we have 1 bit for the number of sectors, allowing one or two sectors. If you have a compressed image with 512-byte clusters you can also see that since the compressed data is not aligned, some compressed clusters span two different sectors (as expected). That means that nb_csectors in the L2 entry is two (1+1), which is the maximum allowed in this case, so that makes sense. And since the size of our clusters is also 512 bytes, nb_csectors is twice the cluster size, so we need s->cluster_data to be (cluster_size * 2) bytes (minus one, strictly speaking). > If we ever allowed a compressed cluster to spill across two host > clusters, it would cause mayhem in trying to track refcounts and other > things. I haven't checked how this works in practice but it seems to work fine. Note that those clusters are read-only so that surely makes things easier. Berto
Re: [Qemu-devel] [PATCH v2] specs/qcow2: Fix documentation of the compressed cluster descriptor
On 02/20/2018 04:03 PM, Eric Blake wrote: boundary. Technically, it might be possible, but qemu does NOT do that (again, looking at qcow2_alloc_bytes() - we loop if free_in_cluster < size) - so we may want to be explicit about this point to prevent OTHER implementations from creating a compressed cluster that crosses host cluster boundaries (right now, I can't see qcow2_decompress_cluster() validating it, though - YIKES). Aha, I see where I went wrong. That said, a simple patch to try this: triggers failures in iotests 122: --- /home/eblake/qemu/tests/qemu-iotests/122.out 2017-10-06 13:45:25.559279136 -0500 +++ /home/eblake/qemu/tests/qemu-iotests/122.out.bad 2018-02-20 15:54:29.890221575 -0600 @@ -117,8 +117,8 @@ convert -c -S 0: read 3145728/3145728 bytes at offset 0 3 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) -read 63963136/63963136 bytes at offset 3145728 -61 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +qcow2: Marking image as corrupt: Compressed cluster at 0x5ffd2 crosses host cluster boundary; further corruption events will be suppressed +read failed: Input/output error [{ "start": 0, "length": 67108864, "depth": 0, "zero": false, "data": true}] so it looks like I'm reading qcow2_alloc_bytes() wrong and that we CAN have a compressed cluster that crosses host cluster boundaries? We DO allow crossing sector boundaries, IF the newly allocated cluster is contiguous to the cluster that has the unused tail that we are starting in. Good, because that's less wasteful of the image (suppose every compressed cluster got 49% reduction in size - since each one requires 51% of a cluster, not allowing cluster crossing would require a full cluster each, rather than the expected ~49% reduction in overall image size if we are good at contiguous allocations). So if I may suggest: x+1 - 61: Number of additional 512-byte sectors used for the compressed data, beyond the sector containing the offset in the previous field. These sectors must fit within the same host cluster. This sentence needs tweaking to match reality, given that my simple patch to flag cross-sector hosts triggered (or I need to figure out what was wrong with my patch). So change that sentence to: As needed, these additional sectors may reside in the next contiguous host cluster. Note that the compressed data does not necessarily occupy all of the bytes in the final sector; rather, decompression stops when it has produced a cluster of data. Another compressed cluster may map to the tail of the final sector used by this compressed cluster. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH v2] specs/qcow2: Fix documentation of the compressed cluster descriptor
On 02/20/2018 01:40 PM, Eric Blake wrote: On 02/20/2018 11:01 AM, Alberto Garcia wrote: tl:dr; I think we need a v3 with even more clarification. I'm also making an additional observationn: Due to the pigeonhole principle and the fact that the compression stream adds metadata, we KNOW that there are some (rare) cases where attempting to compress data will actually result in an INCREASE in size ('man gzip' backs up this claim, calling out a worst case -0.015% compression ratio, or 15 bytes added for every 1000 bytes of input, on uncompressible data). So presumably, we should state that a cluster can only be written in compressed form IF it occupies less space than the uncompressed cluster (we could also allow a compressed form that occupies the same length as the uncompressed cluster, but that's a waste of CPU cycles). Once we have that restriction stated, then it becomes obvious that a compressed cluster should never REQUIRE using more than one host cluster (and this is backed up by qcow2_alloc_bytes() asserting that size <= s->cluster_size). Where things get interesting, though, is whether we PERMIT a compressed cluster to overlap a host cluster boundary. Technically, it might be possible, but qemu does NOT do that (again, looking at qcow2_alloc_bytes() - we loop if free_in_cluster < size) - so we may want to be explicit about this point to prevent OTHER implementations from creating a compressed cluster that crosses host cluster boundaries (right now, I can't see qcow2_decompress_cluster() validating it, though - YIKES). That said, a simple patch to try this: diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 8c4b26ceaf2..85b5dbd9c16 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -1598,6 +1598,15 @@ int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset) sector_offset = coffset & 511; csize = nb_csectors * 512 - sector_offset; +/* We never write a compressed cluster that crosses host + * cluster boundaries; reject images that do that. */ +if (csize + (coffset % s->cluster_size) > s->cluster_size) { +qcow2_signal_corruption(bs, true, coffset, csize, +"Compressed cluster at %#" PRIx64 +" crosses host cluster boundary", coffset); +return -EIO; +} + /* Allocate buffers on first decompress operation, most images are * uncompressed and the memory overhead can be avoided. The buffers * are freed in .bdrv_close(). triggers failures in iotests 122: --- /home/eblake/qemu/tests/qemu-iotests/122.out 2017-10-06 13:45:25.559279136 -0500 +++ /home/eblake/qemu/tests/qemu-iotests/122.out.bad 2018-02-20 15:54:29.890221575 -0600 @@ -117,8 +117,8 @@ convert -c -S 0: read 3145728/3145728 bytes at offset 0 3 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) -read 63963136/63963136 bytes at offset 3145728 -61 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +qcow2: Marking image as corrupt: Compressed cluster at 0x5ffd2 crosses host cluster boundary; further corruption events will be suppressed +read failed: Input/output error [{ "start": 0, "length": 67108864, "depth": 0, "zero": false, "data": true}] so it looks like I'm reading qcow2_alloc_bytes() wrong and that we CAN have a compressed cluster that crosses host cluster boundaries? So if I may suggest: x+1 - 61: Number of additional 512-byte sectors used for the compressed data, beyond the sector containing the offset in the previous field. These sectors must fit within the same host cluster. This sentence needs tweaking to match reality, given that my simple patch to flag cross-sector hosts triggered (or I need to figure out what was wrong with my patch). Note that the compressed data does not necessarily occupy all of the bytes in the final sector; rather, decompression stops when it has produced a cluster of data. Another compressed cluster may map to the tail of the final sector used by this compressed cluster. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH v2] specs/qcow2: Fix documentation of the compressed cluster descriptor
On 02/20/2018 11:01 AM, Alberto Garcia wrote: tl:dr; I think we need a v3 with even more clarification. The documentation claims that the cluster descriptor contains the number of sectors used to store the compressed data, but what it actually contains is the number of sectors *minus one*. That can be easily seen in qcow2_decompress_cluster(), that adds one to the value stored in that field: nb_csectors = ((cluster_offset >> s->csize_shift) & s->csize_mask) + 1; This is misleading. It says how we take what is in the qcow2 file on reading in order to decompress, but still doesn't show how we generated that number. Let's also compare it as well to what we WRITE into the qcow2 file: nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) - (cluster_offset >> 9); I'm also making an additional observationn: Due to the pigeonhole principle and the fact that the compression stream adds metadata, we KNOW that there are some (rare) cases where attempting to compress data will actually result in an INCREASE in size ('man gzip' backs up this claim, calling out a worst case -0.015% compression ratio, or 15 bytes added for every 1000 bytes of input, on uncompressible data). So presumably, we should state that a cluster can only be written in compressed form IF it occupies less space than the uncompressed cluster (we could also allow a compressed form that occupies the same length as the uncompressed cluster, but that's a waste of CPU cycles). Once we have that restriction stated, then it becomes obvious that a compressed cluster should never REQUIRE using more than one host cluster (and this is backed up by qcow2_alloc_bytes() asserting that size <= s->cluster_size). Where things get interesting, though, is whether we PERMIT a compressed cluster to overlap a host cluster boundary. Technically, it might be possible, but qemu does NOT do that (again, looking at qcow2_alloc_bytes() - we loop if free_in_cluster < size) - so we may want to be explicit about this point to prevent OTHER implementations from creating a compressed cluster that crosses host cluster boundaries (right now, I can't see qcow2_decompress_cluster() validating it, though - YIKES). In addition to that this patch clarifies where the actual compressed data is located. Although the size of the data is specified in sectors, the offset is not necessarily aligned to a sector boundary, so the actual data goes from the specified offset until the end of the last sector, leaving the initial bytes of the first sector (if any) unused. Signed-off-by: Alberto Garcia--- v2: I realized that the documentation is not completely clear about the exact location and size of the compressed data, so I updated the patch to clarify this. --- docs/interop/qcow2.txt | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt index d7fdb1fee3..dc2b9cefb2 100644 --- a/docs/interop/qcow2.txt +++ b/docs/interop/qcow2.txt @@ -427,9 +427,17 @@ Standard Cluster Descriptor: Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)): I'm looking at how this works for different cluster sizes. If we have 512-byte clusters, x is 61, and we DON'T have the 'number sectors' field at all! But that still makes sense, provided that all consecutive host sectors used to hold a compressed guest cluster lie within a single host cluster. If we ever allowed a compressed cluster to spill across two host clusters, it would cause mayhem in trying to track refcounts and other things. So you can have two 512-byte guest clusters that manage to compress into the same host cluster, but must never have a single guest cluster that spills over a host cluster boundary. For all other cluster sizes, the value of x leaves us exactly log2(cluster_size / 512) bits for the 'number sectors' field. For example, with 64k clusters, x is 54, leaving 7 bits (64k/512 == 128, and 2^7 covers the number of sectors in a 64k host cluster). Whether all of these sectors should lie within the same host cluster should be stated (as I argued above, this is how qemu does it, so it SHOULD be part of the spec to prevent refcount and other confusion if some other implementation created images violating that). Bit 0 - x:Host cluster offset. This is usually _not_ aligned to a -cluster boundary! +cluster or sector boundary! - x+1 - 61:Compressed size of the images in sectors of 512 bytes + x+1 - 61:Number of 512-byte sectors used for the compressed data, +minus one (that is, a value of n here means n+1 sectors). + +The actual compressed data is located at the end of this +region, from the offset indicated in the previous field +until the end of the last sector. + +
[Qemu-devel] [PATCH v2] specs/qcow2: Fix documentation of the compressed cluster descriptor
The documentation claims that the cluster descriptor contains the number of sectors used to store the compressed data, but what it actually contains is the number of sectors *minus one*. That can be easily seen in qcow2_decompress_cluster(), that adds one to the value stored in that field: nb_csectors = ((cluster_offset >> s->csize_shift) & s->csize_mask) + 1; In addition to that this patch clarifies where the actual compressed data is located. Although the size of the data is specified in sectors, the offset is not necessarily aligned to a sector boundary, so the actual data goes from the specified offset until the end of the last sector, leaving the initial bytes of the first sector (if any) unused. Signed-off-by: Alberto Garcia--- v2: I realized that the documentation is not completely clear about the exact location and size of the compressed data, so I updated the patch to clarify this. --- docs/interop/qcow2.txt | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt index d7fdb1fee3..dc2b9cefb2 100644 --- a/docs/interop/qcow2.txt +++ b/docs/interop/qcow2.txt @@ -427,9 +427,17 @@ Standard Cluster Descriptor: Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)): Bit 0 - x:Host cluster offset. This is usually _not_ aligned to a -cluster boundary! +cluster or sector boundary! - x+1 - 61:Compressed size of the images in sectors of 512 bytes + x+1 - 61:Number of 512-byte sectors used for the compressed data, +minus one (that is, a value of n here means n+1 sectors). + +The actual compressed data is located at the end of this +region, from the offset indicated in the previous field +until the end of the last sector. + +The initial bytes of this region are therefore unused if +the offset is not aligned to a sector boundary. If a cluster is unallocated, read requests shall read the data from the backing file (except if bit 0 in the Standard Cluster Descriptor is set). If there is -- 2.11.0