[Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor
This patch fixes several mistakes in the documentation of the compressed cluster descriptor: 1) the documentation claims that the cluster descriptor contains the number of sectors used to store the compressed data, but what it actually contains is the number of sectors *minus one* or, in other words, the number of additional sectors after the first one. 2) the width of the fields is incorrectly specified. The number of bits used by each field is x = 62 - (cluster_bits - 8) for the offset field y = (cluster_bits - 8)for the size field So the offset field's location is [0, x-1], not [0, x] as stated. 3) the size field does not contain the size of the compressed data, but rather the number of sectors where that data is stored. The compressed data starts at the exact point specified in the offset field and ends when there's enough data to produce a cluster of decompressed data. Both points can be in the middle of a sector, allowing several compressed clusters to be stored next to one another, sharing sectors if necessary. Signed-off-by: Alberto Garcia --- v3: Fix the specification of the width of the fields, and update the explanation of how the compressed data is stored [Eric]. v2: I realized that the documentation is not completely clear about the exact location and size of the compressed data, so I updated the patch to clarify this. --- docs/interop/qcow2.txt | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt index d7fdb1fee3..feb711fb6a 100644 --- a/docs/interop/qcow2.txt +++ b/docs/interop/qcow2.txt @@ -426,10 +426,20 @@ Standard Cluster Descriptor: Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)): -Bit 0 - x:Host cluster offset. This is usually _not_ aligned to a -cluster boundary! +Bit 0 - x-1: Host cluster offset. This is usually _not_ aligned to a +cluster or sector boundary! - x+1 - 61:Compressed size of the images in sectors of 512 bytes + x - 61:Number of additional 512-byte sectors used for the +compressed data, beyond the sector containing the offset +in the previous field. Some of these sectors may reside +in the next contiguous host cluster. + +Note that the compressed data does not necessarily occupy +all of the bytes in the final sector; rather, decompression +stops when it has produced a cluster of data. + +Another compressed cluster may map to the tail of the final +sector used by this compressed cluster. If a cluster is unallocated, read requests shall read the data from the backing file (except if bit 0 in the Standard Cluster Descriptor is set). If there is -- 2.11.0
Re: [Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor
On 02/21/2018 08:08 AM, Alberto Garcia wrote: This patch fixes several mistakes in the documentation of the compressed cluster descriptor: 1) the documentation claims that the cluster descriptor contains the number of sectors used to store the compressed data, but what it actually contains is the number of sectors *minus one* or, in other words, the number of additional sectors after the first one. 2) the width of the fields is incorrectly specified. The number of bits used by each field is x = 62 - (cluster_bits - 8) for the offset field y = (cluster_bits - 8)for the size field So the offset field's location is [0, x-1], not [0, x] as stated. Yep, I figured that out independently, although our mails crossed, and I didn't notice your fix until after my other email calling out this bug in the spec. 3) the size field does not contain the size of the compressed data, but rather the number of sectors where that data is stored. The compressed data starts at the exact point specified in the offset field and ends when there's enough data to produce a cluster of decompressed data. Both points can be in the middle of a sector, allowing several compressed clusters to be stored next to one another, sharing sectors if necessary. Signed-off-by: Alberto Garcia --- v3: Fix the specification of the width of the fields, and update the explanation of how the compressed data is stored [Eric]. +++ b/docs/interop/qcow2.txt @@ -426,10 +426,20 @@ Standard Cluster Descriptor: Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)): -Bit 0 - x:Host cluster offset. This is usually _not_ aligned to a -cluster boundary! +Bit 0 - x-1: Host cluster offset. This is usually _not_ aligned to a +cluster or sector boundary! This breaks the nice alignment of the ':' character that the rest of the section preserves. You could also have done: Compressed Clusters Descriptor (x = 61 - (cluster_bits - 8)): Bit 0 - x: ... x+1 - 61: ... to preserve the alignment, by shifting the adjustment into the calculation of x. But I'm fine with either approach. - x+1 - 61:Compressed size of the images in sectors of 512 bytes + x - 61:Number of additional 512-byte sectors used for the +compressed data, beyond the sector containing the offset +in the previous field. Some of these sectors may reside +in the next contiguous host cluster. + +Note that the compressed data does not necessarily occupy +all of the bytes in the final sector; rather, decompression +stops when it has produced a cluster of data. + +Another compressed cluster may map to the tail of the final +sector used by this compressed cluster. Yes, that wording is MUCH nicer. Thanks for persisting, and for making me learn more about qcow2 compression in the process. So either way on the colon alignment issue: Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor
CC: qemu-sta...@nongnu.org Incorrect docs make for difficult interoperability. On 02/21/2018 08:08 AM, Alberto Garcia wrote: This patch fixes several mistakes in the documentation of the compressed cluster descriptor: 1) the documentation claims that the cluster descriptor contains the number of sectors used to store the compressed data, but what it actually contains is the number of sectors *minus one* or, in other words, the number of additional sectors after the first one. 2) the width of the fields is incorrectly specified. The number of bits used by each field is x = 62 - (cluster_bits - 8) for the offset field y = (cluster_bits - 8)for the size field So the offset field's location is [0, x-1], not [0, x] as stated. 3) the size field does not contain the size of the compressed data, but rather the number of sectors where that data is stored. The compressed data starts at the exact point specified in the offset field and ends when there's enough data to produce a cluster of decompressed data. Both points can be in the middle of a sector, allowing several compressed clusters to be stored next to one another, sharing sectors if necessary. Signed-off-by: Alberto Garcia --- v3: Fix the specification of the width of the fields, and update the explanation of how the compressed data is stored [Eric]. v2: I realized that the documentation is not completely clear about the exact location and size of the compressed data, so I updated the patch to clarify this. --- docs/interop/qcow2.txt | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt index d7fdb1fee3..feb711fb6a 100644 --- a/docs/interop/qcow2.txt +++ b/docs/interop/qcow2.txt @@ -426,10 +426,20 @@ Standard Cluster Descriptor: Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)): -Bit 0 - x:Host cluster offset. This is usually _not_ aligned to a -cluster boundary! +Bit 0 - x-1: Host cluster offset. This is usually _not_ aligned to a +cluster or sector boundary! - x+1 - 61:Compressed size of the images in sectors of 512 bytes + x - 61:Number of additional 512-byte sectors used for the +compressed data, beyond the sector containing the offset +in the previous field. Some of these sectors may reside +in the next contiguous host cluster. + +Note that the compressed data does not necessarily occupy +all of the bytes in the final sector; rather, decompression +stops when it has produced a cluster of data. + +Another compressed cluster may map to the tail of the final +sector used by this compressed cluster. If a cluster is unallocated, read requests shall read the data from the backing file (except if bit 0 in the Standard Cluster Descriptor is set). If there is -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor
On Wed 21 Feb 2018 05:10:30 PM CET, Eric Blake wrote: >> Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)): >> >> -Bit 0 - x:Host cluster offset. This is usually _not_ aligned to a >> -cluster boundary! >> +Bit 0 - x-1: Host cluster offset. This is usually _not_ aligned to a >> +cluster or sector boundary! > > > This breaks the nice alignment of the ':' character that the rest of the > section preserves. You could also have done: > > Compressed Clusters Descriptor (x = 61 - (cluster_bits - 8)): > Bit 0 - x: ... >x+1 - 61: ... > > to preserve the alignment, by shifting the adjustment into the > calculation of x. But I'm fine with either approach. Yeah I actually saw the same problem, in the end I chose this one because the numbers match the implementation: s->csize_shift = (62 - (s->cluster_bits - 8)); Thanks for reviewing! Berto
Re: [Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor
Am 21.02.2018 um 15:08 hat Alberto Garcia geschrieben: > This patch fixes several mistakes in the documentation of the > compressed cluster descriptor: > > 1) the documentation claims that the cluster descriptor contains the >number of sectors used to store the compressed data, but what it >actually contains is the number of sectors *minus one* or, in other >words, the number of additional sectors after the first one. > > 2) the width of the fields is incorrectly specified. The number of bits >used by each field is > > x = 62 - (cluster_bits - 8) for the offset field > y = (cluster_bits - 8)for the size field > >So the offset field's location is [0, x-1], not [0, x] as stated. > > 3) the size field does not contain the size of the compressed data, >but rather the number of sectors where that data is stored. The >compressed data starts at the exact point specified in the offset >field and ends when there's enough data to produce a cluster of >decompressed data. Both points can be in the middle of a sector, >allowing several compressed clusters to be stored next to one >another, sharing sectors if necessary. > > Signed-off-by: Alberto Garcia Thanks, applied to the block branch. Kevin
Re: [Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor
On 02/21/2018 08:08 AM, Alberto Garcia wrote: This patch fixes several mistakes in the documentation of the compressed cluster descriptor: More things to consider, as followup patches: Note that both the L1 table, and the standard L2 descriptors, have a cap on bit 55 as the maximum host offset (< 64PB). But for compressed clusters, our maximum depends on cluster_bits, as follows: 512 byte cluster: bit 61 forms 'number clusters', leaving bits 60-0 for computing the host offset. But even though this looks on the surface like you could allocate 2EB of compressed clusters, it does not play well with the rest of the L1/L2 system, so we should probably explicitly document that bits 60-56 MUST be 0, if they are assigned to the 'host offset field', making the maximum compressed cluster offset at 64PB. 2M cluster: bits 61-50 form 'number clusters', leaving bit 49 as the maximum bit in the host offset (< 512 TB). But we never validate that we don't overflow this value! I'm working on a patch. Meanwhile, the refcount table currently allows all the way up to bit 63 to form an offset to a refcount block, although capping that at 55 the way L1/L2 are capped would be reasonable (it gets weird to state that your metadata must live below 64PB but that your data pointed to by the metadata can live beyond that point). So it may also be worth considering a spec patch that points out the 64PB maximum useful size, and maybe even a comment that the maximum size may be further constrained by the protocol layer (for example, ext4 has a 16TB cap on individual file size). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor
On Wed 21 Feb 2018 11:10:07 PM CET, Eric Blake wrote: >> This patch fixes several mistakes in the documentation of the >> compressed cluster descriptor: > > More things to consider, as followup patches: > > Note that both the L1 table, and the standard L2 descriptors, have a cap > on bit 55 as the maximum host offset (< 64PB). But for compressed > clusters, our maximum depends on cluster_bits, as follows: > > 512 byte cluster: bit 61 forms 'number clusters', leaving bits 60-0 for > computing the host offset. But even though this looks on the surface > like you could allocate 2EB of compressed clusters, it does not play > well with the rest of the L1/L2 system, so we should probably explicitly > document that bits 60-56 MUST be 0, if they are assigned to the 'host > offset field', making the maximum compressed cluster offset at 64PB. > > 2M cluster: bits 61-50 form 'number clusters', leaving bit 49 as the > maximum bit in the host offset (< 512 TB). But we never validate that > we don't overflow this value! I'm working on a patch. Good catch! > Meanwhile, the refcount table currently allows all the way up to bit > 63 to form an offset to a refcount block, although capping that at 55 > the way L1/L2 are capped would be reasonable (it gets weird to state > that your metadata must live below 64PB but that your data pointed to > by the metadata can live beyond that point). So it may also be worth > considering a spec patch that points out the 64PB maximum useful size, > and maybe even a comment that the maximum size may be further > constrained by the protocol layer (for example, ext4 has a 16TB cap on > individual file size). Sounds reasonable. Berto