On 5/16/19 8:48 AM, Denis Plotnikov wrote:
> The patch adds some preparation parts for incompatible compression type
> feature into QCOW2 header that indicates that *all* compressed clusters
> must be (de)compressed using a certain compression type.
> 
> It is implied that the compression type is set on the image creation and
> can be changed only later by image conversion, thus the only compression
> algorithm is used for the image.
> 
> The plan is to add support for ZSTD and then may be something more effective
> in the future.
> 
> ZSTD compression algorithm consumes 3-5 times less CPU power with a
> comparable compression ratio with zlib. It would be wise to use it for
> data compression e.g. for backups.
> 
> The default compression is ZLIB.
> 
> Signed-off-by: Denis Plotnikov <dplotni...@virtuozzo.com>
> ---

> +++ b/docs/interop/qcow2.txt
> @@ -109,7 +109,11 @@ in the description of a field.
>                                  An External Data File Name header extension 
> may
>                                  be present if this bit is set.
>  
> -                    Bits 3-63:  Reserved (set to 0)
> +                    Bit 3:      Compression type bit. If the bit is set, 
> then the
> +                                type of compression the image uses is set in 
> the
> +                                header extension

I'd call out 'Compression type' header extension by name, to make it
more obvious.  Is it an error if bit 3 is set but the compression header
is not present? Is it an error if the compression header is present but
bit 3 is not set?

> +
> +                    Bits 4-63:  Reserved (set to 0)
>  
>           80 -  87:  compatible_features
>                      Bitmask of compatible features. An implementation can
> @@ -175,6 +179,7 @@ be stored. Each extension has a structure like the 
> following:
>                          0x23852875 - Bitmaps extension
>                          0x0537be77 - Full disk encryption header pointer
>                          0x44415441 - External data file name string
> +                        0x434D5052 - Compression type extension

Our earlier magic numbers were probably created as random numbers and
contain 8-bit values, to make them less likely to appear naturally in
other parts of the file and thus less likely to be misinterpreted.  But
that's not a requirement, and I see that you followed the lead of "DATA"
and created "CMPR" for yours.  Works for me :)

>                          other      - Unknown header extension, can be safely
>                                       ignored
>  
> @@ -771,3 +776,21 @@ In the image file the 'enabled' state is reflected by 
> the 'auto' flag. If this
>  flag is set, the software must consider the bitmap as 'enabled' and start
>  tracking virtual disk changes to this bitmap from the first write to the
>  virtual disk. If this flag is not set then the bitmap is disabled.
> +
> +
> +== Compression type extension ==
> +
> +The compression type extension is an optional header extension. It stores the

Could probably do a better job at describing when the header is optional
vs. mandatory.

> +ID of the compressor which has to be used to compress/decompress disk 
> clusters.
> +The compression type is used for all disk cluster. Two clusters of the image
> +couldn't be compressed with different compressors.

Wording suggestion: A single compression type is applied to all
compressed disk clusters, with no way to change compression types per
cluster.

But is that a hard requirement? Since this is already an incompatible
feature extension, we could have a compression type that states that
each compressed cluster is self-describing via a 1-byte prefix (yes, it
means compression is not quite as dense, but probably not an issue).

Something like: in the image header, we have compression type 1 = zlib,
compression type 2 = zstd, etc, each of which treat all compressed
clusters as-is with no further per-cluster headers. Or, in the image
header, we have compression type 255 = per-cluster, at which point a
compressed cluster is now represented as: [1-byte prefix] [tail], where
the one-byte prefix is 1 = zlib, 2 = zstd, etc (but not 255), and then
the tail is decoded with the appropriate algorithm. In this way, it
might even be possible to encode different clusters with an optimal
algorithm per cluster, and thus create an image that requires both zlib
and zstd to be fully read.

I'm not sure if we need that much complexity, but just throwing it out
there for thought.

> +
> +The compression type can be set on the image creation. The only way to change
> +the compression type is to convert the image explicitly.
> +
> +Available compression types:
> +    ID    0: ZLIB (gzip)
> +          1: ZSTD
> +
> +The default compression type is ZLIB. When ZLIB is used the compression type
> +header extension is not present.

Here's where we have to think about back-compat. If zlib is used, and
the compression type header is present, must incompatible bit 3 be set?
Do we want to permit images that have incompatible bit 3 set and zlib
explicitly mentioned? Or are you making a hard requirement that if zlib
is chosen, incompatible bit 3 must be absent and no compression header
should be set? Or is it okay for the compression header to be present
and incompatible bit 3 clear, but only when compression type 0 is
chosen?  Let's spell out exactly what we want, probably with a goal of
minimizing the number of situations where an incompatible bit must be
set (as that makes it harder to work with images in older software).

Does the compression type really have to be chosen at image creation, or
can the decision be deferred until the time that the first compressed
cluster is written?  You could implement things to state that if
incompatible bit 3 is set but the compression header is absent, then
there must not be any compressed clusters in the image; as soon as the
first compressed cluster is written, then the compression header must
also be written (even if it explicitly calls out zlib), to make it
easier for new software to tell at a glance if the image has ever
contained compressed clusters at least once in the past.

> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 7ccbfff9d0..8eebcc728b 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -698,6 +698,7 @@
>  { 'struct': 'BlockMeasureInfo',
>    'data': {'required': 'int', 'fully-allocated': 'int'} }
>  
> +
>  ##

Why the added blank line?

>  # @query-block:
>  #
> @@ -5257,3 +5258,16 @@
>    'data' : { 'node-name': 'str',
>               'iothread': 'StrOrNull',
>               '*force': 'bool' } }
> +
> +##
> +# @Qcow2CompressionType:
> +#
> +# Compression type used in qcow2 image file
> +#
> +# @zlib - gzip compressor
> +# @zstd - zstd compression
> +#
> +# Since: 4.0

You've missed 4.0; this should be 4.1.

> +##
> +{ 'enum': 'Qcow2CompressionType',
> +  'data': [ 'zlib', 'zstd' ] }
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to