Re: [Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor

2018-02-22 Thread Alberto Garcia
On Wed 21 Feb 2018 11:10:07 PM CET, Eric Blake wrote:
>> This patch fixes several mistakes in the documentation of the
>> compressed cluster descriptor:
>
> More things to consider, as followup patches:
>
> Note that both the L1 table, and the standard L2 descriptors, have a cap 
> on bit 55 as the maximum host offset (< 64PB).  But for compressed 
> clusters, our maximum depends on cluster_bits, as follows:
>
> 512 byte cluster: bit 61 forms 'number clusters', leaving bits 60-0 for 
> computing the host offset.  But even though this looks on the surface 
> like you could allocate 2EB of compressed clusters, it does not play 
> well with the rest of the L1/L2 system, so we should probably explicitly 
> document that bits 60-56 MUST be 0, if they are assigned to the 'host 
> offset field', making the maximum compressed cluster offset at 64PB.
>
> 2M cluster: bits 61-50 form 'number clusters', leaving bit 49 as the 
> maximum bit in the host offset (< 512 TB).  But we never validate that 
> we don't overflow this value!  I'm working on a patch.

Good catch!

> Meanwhile, the refcount table currently allows all the way up to bit
> 63 to form an offset to a refcount block, although capping that at 55
> the way L1/L2 are capped would be reasonable (it gets weird to state
> that your metadata must live below 64PB but that your data pointed to
> by the metadata can live beyond that point).  So it may also be worth
> considering a spec patch that points out the 64PB maximum useful size,
> and maybe even a comment that the maximum size may be further
> constrained by the protocol layer (for example, ext4 has a 16TB cap on
> individual file size).

Sounds reasonable.

Berto



Re: [Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor

2018-02-21 Thread Eric Blake

On 02/21/2018 08:08 AM, Alberto Garcia wrote:

This patch fixes several mistakes in the documentation of the
compressed cluster descriptor:



More things to consider, as followup patches:

Note that both the L1 table, and the standard L2 descriptors, have a cap 
on bit 55 as the maximum host offset (< 64PB).  But for compressed 
clusters, our maximum depends on cluster_bits, as follows:


512 byte cluster: bit 61 forms 'number clusters', leaving bits 60-0 for 
computing the host offset.  But even though this looks on the surface 
like you could allocate 2EB of compressed clusters, it does not play 
well with the rest of the L1/L2 system, so we should probably explicitly 
document that bits 60-56 MUST be 0, if they are assigned to the 'host 
offset field', making the maximum compressed cluster offset at 64PB.


2M cluster: bits 61-50 form 'number clusters', leaving bit 49 as the 
maximum bit in the host offset (< 512 TB).  But we never validate that 
we don't overflow this value!  I'm working on a patch.


Meanwhile, the refcount table currently allows all the way up to bit 63 
to form an offset to a refcount block, although capping that at 55 the 
way L1/L2 are capped would be reasonable (it gets weird to state that 
your metadata must live below 64PB but that your data pointed to by the 
metadata can live beyond that point).  So it may also be worth 
considering a spec patch that points out the 64PB maximum useful size, 
and maybe even a comment that the maximum size may be further 
constrained by the protocol layer (for example, ext4 has a 16TB cap on 
individual file size).


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor

2018-02-21 Thread Kevin Wolf
Am 21.02.2018 um 15:08 hat Alberto Garcia geschrieben:
> This patch fixes several mistakes in the documentation of the
> compressed cluster descriptor:
> 
> 1) the documentation claims that the cluster descriptor contains the
>number of sectors used to store the compressed data, but what it
>actually contains is the number of sectors *minus one* or, in other
>words, the number of additional sectors after the first one.
> 
> 2) the width of the fields is incorrectly specified. The number of bits
>used by each field is
> 
>   x = 62 - (cluster_bits - 8)   for the offset field
>   y = (cluster_bits - 8)for the size field
> 
>So the offset field's location is [0, x-1], not [0, x] as stated.
> 
> 3) the size field does not contain the size of the compressed data,
>but rather the number of sectors where that data is stored. The
>compressed data starts at the exact point specified in the offset
>field and ends when there's enough data to produce a cluster of
>decompressed data. Both points can be in the middle of a sector,
>allowing several compressed clusters to be stored next to one
>another, sharing sectors if necessary.
> 
> Signed-off-by: Alberto Garcia 

Thanks, applied to the block branch.

Kevin



Re: [Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor

2018-02-21 Thread Alberto Garcia
On Wed 21 Feb 2018 05:10:30 PM CET, Eric Blake wrote:
>>   Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)):
>>   
>> -Bit  0 -  x:Host cluster offset. This is usually _not_ aligned to a
>> -cluster boundary!
>> +Bit  0 - x-1:   Host cluster offset. This is usually _not_ aligned to a
>> +cluster or sector boundary!
>
>
> This breaks the nice alignment of the ':' character that the rest of the 
> section preserves.  You could also have done:
>
> Compressed Clusters Descriptor (x = 61 - (cluster_bits - 8)):
> Bit  0 -  x:  ...
>x+1 - 61:  ...
>
> to preserve the alignment, by shifting the adjustment into the 
> calculation of x.  But I'm fine with either approach.

Yeah I actually saw the same problem, in the end I chose this one
because the numbers match the implementation:

s->csize_shift = (62 - (s->cluster_bits - 8));

Thanks for reviewing!

Berto



Re: [Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor

2018-02-21 Thread Eric Blake

CC: qemu-sta...@nongnu.org

Incorrect docs make for difficult interoperability.

On 02/21/2018 08:08 AM, Alberto Garcia wrote:

This patch fixes several mistakes in the documentation of the
compressed cluster descriptor:

1) the documentation claims that the cluster descriptor contains the
number of sectors used to store the compressed data, but what it
actually contains is the number of sectors *minus one* or, in other
words, the number of additional sectors after the first one.

2) the width of the fields is incorrectly specified. The number of bits
used by each field is

   x = 62 - (cluster_bits - 8)   for the offset field
   y = (cluster_bits - 8)for the size field

So the offset field's location is [0, x-1], not [0, x] as stated.

3) the size field does not contain the size of the compressed data,
but rather the number of sectors where that data is stored. The
compressed data starts at the exact point specified in the offset
field and ends when there's enough data to produce a cluster of
decompressed data. Both points can be in the middle of a sector,
allowing several compressed clusters to be stored next to one
another, sharing sectors if necessary.

Signed-off-by: Alberto Garcia 
---

v3: Fix the specification of the width of the fields, and update the
 explanation of how the compressed data is stored [Eric].

v2: I realized that the documentation is not completely clear about
 the exact location and size of the compressed data, so I updated
 the patch to clarify this.

---
  docs/interop/qcow2.txt | 16 +---
  1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index d7fdb1fee3..feb711fb6a 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -426,10 +426,20 @@ Standard Cluster Descriptor:
  
  Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)):
  
-Bit  0 -  x:Host cluster offset. This is usually _not_ aligned to a

-cluster boundary!
+Bit  0 - x-1:   Host cluster offset. This is usually _not_ aligned to a
+cluster or sector boundary!
  
-   x+1 - 61:Compressed size of the images in sectors of 512 bytes

+ x - 61:Number of additional 512-byte sectors used for the
+compressed data, beyond the sector containing the offset
+in the previous field. Some of these sectors may reside
+in the next contiguous host cluster.
+
+Note that the compressed data does not necessarily occupy
+all of the bytes in the final sector; rather, decompression
+stops when it has produced a cluster of data.
+
+Another compressed cluster may map to the tail of the final
+sector used by this compressed cluster.
  
  If a cluster is unallocated, read requests shall read the data from the backing

  file (except if bit 0 in the Standard Cluster Descriptor is set). If there is



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor

2018-02-21 Thread Eric Blake

On 02/21/2018 08:08 AM, Alberto Garcia wrote:

This patch fixes several mistakes in the documentation of the
compressed cluster descriptor:

1) the documentation claims that the cluster descriptor contains the
number of sectors used to store the compressed data, but what it
actually contains is the number of sectors *minus one* or, in other
words, the number of additional sectors after the first one.

2) the width of the fields is incorrectly specified. The number of bits
used by each field is

   x = 62 - (cluster_bits - 8)   for the offset field
   y = (cluster_bits - 8)for the size field

So the offset field's location is [0, x-1], not [0, x] as stated.


Yep, I figured that out independently, although our mails crossed, and I 
didn't notice your fix until after my other email calling out this bug 
in the spec.




3) the size field does not contain the size of the compressed data,
but rather the number of sectors where that data is stored. The
compressed data starts at the exact point specified in the offset
field and ends when there's enough data to produce a cluster of
decompressed data. Both points can be in the middle of a sector,
allowing several compressed clusters to be stored next to one
another, sharing sectors if necessary.

Signed-off-by: Alberto Garcia 
---

v3: Fix the specification of the width of the fields, and update the
 explanation of how the compressed data is stored [Eric].




+++ b/docs/interop/qcow2.txt
@@ -426,10 +426,20 @@ Standard Cluster Descriptor:
  
  Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)):
  
-Bit  0 -  x:Host cluster offset. This is usually _not_ aligned to a

-cluster boundary!
+Bit  0 - x-1:   Host cluster offset. This is usually _not_ aligned to a
+cluster or sector boundary!



This breaks the nice alignment of the ':' character that the rest of the 
section preserves.  You could also have done:


Compressed Clusters Descriptor (x = 61 - (cluster_bits - 8)):
   Bit  0 -  x:  ...
  x+1 - 61:  ...

to preserve the alignment, by shifting the adjustment into the 
calculation of x.  But I'm fine with either approach.


  
-   x+1 - 61:Compressed size of the images in sectors of 512 bytes

+ x - 61:Number of additional 512-byte sectors used for the
+compressed data, beyond the sector containing the offset
+in the previous field. Some of these sectors may reside
+in the next contiguous host cluster.
+
+Note that the compressed data does not necessarily occupy
+all of the bytes in the final sector; rather, decompression
+stops when it has produced a cluster of data.
+
+Another compressed cluster may map to the tail of the final
+sector used by this compressed cluster.


Yes, that wording is MUCH nicer.  Thanks for persisting, and for making 
me learn more about qcow2 compression in the process.  So either way on 
the colon alignment issue:


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



[Qemu-devel] [PATCH v3] specs/qcow2: Fix documentation of the compressed cluster descriptor

2018-02-21 Thread Alberto Garcia
This patch fixes several mistakes in the documentation of the
compressed cluster descriptor:

1) the documentation claims that the cluster descriptor contains the
   number of sectors used to store the compressed data, but what it
   actually contains is the number of sectors *minus one* or, in other
   words, the number of additional sectors after the first one.

2) the width of the fields is incorrectly specified. The number of bits
   used by each field is

  x = 62 - (cluster_bits - 8)   for the offset field
  y = (cluster_bits - 8)for the size field

   So the offset field's location is [0, x-1], not [0, x] as stated.

3) the size field does not contain the size of the compressed data,
   but rather the number of sectors where that data is stored. The
   compressed data starts at the exact point specified in the offset
   field and ends when there's enough data to produce a cluster of
   decompressed data. Both points can be in the middle of a sector,
   allowing several compressed clusters to be stored next to one
   another, sharing sectors if necessary.

Signed-off-by: Alberto Garcia 
---

v3: Fix the specification of the width of the fields, and update the
explanation of how the compressed data is stored [Eric].

v2: I realized that the documentation is not completely clear about
the exact location and size of the compressed data, so I updated
the patch to clarify this.

---
 docs/interop/qcow2.txt | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index d7fdb1fee3..feb711fb6a 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -426,10 +426,20 @@ Standard Cluster Descriptor:
 
 Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)):
 
-Bit  0 -  x:Host cluster offset. This is usually _not_ aligned to a
-cluster boundary!
+Bit  0 - x-1:   Host cluster offset. This is usually _not_ aligned to a
+cluster or sector boundary!
 
-   x+1 - 61:Compressed size of the images in sectors of 512 bytes
+ x - 61:Number of additional 512-byte sectors used for the
+compressed data, beyond the sector containing the offset
+in the previous field. Some of these sectors may reside
+in the next contiguous host cluster.
+
+Note that the compressed data does not necessarily occupy
+all of the bytes in the final sector; rather, decompression
+stops when it has produced a cluster of data.
+
+Another compressed cluster may map to the tail of the final
+sector used by this compressed cluster.
 
 If a cluster is unallocated, read requests shall read the data from the backing
 file (except if bit 0 in the Standard Cluster Descriptor is set). If there is
-- 
2.11.0