Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

Max Reitz Mon, 04 May 2020 00:54:23 -0700

On 30.04.20 15:56, Denis Plotnikov wrote:
> 
> 
> On 30.04.2020 14:47, Max Reitz wrote:
>> On 30.04.20 11:48, Denis Plotnikov wrote:
>>>
>>> On 30.04.2020 11:26, Max Reitz wrote:
>>>> On 29.04.20 15:02, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 29.04.2020 15:17, Max Reitz wrote:
>>>>>> On 29.04.20 12:37, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>> 29.04.2020 13:24, Max Reitz wrote:
>>>>>>>> On 28.04.20 22:00, Denis Plotnikov wrote:
>>>>>>>>> zstd significantly reduces cluster compression time.
>>>>>>>>> It provides better compression performance maintaining
>>>>>>>>> the same level of the compression ratio in comparison with
>>>>>>>>> zlib, which, at the moment, is the only compression
>>>>>>>>> method available.
>>>>>>>>>
>>>>>>>>> The performance test results:
>>>>>>>>> Test compresses and decompresses qemu qcow2 image with just
>>>>>>>>> installed rhel-7.6 guest.
>>>>>>>>> Image cluster size: 64K. Image on disk size: 2.2G
>>>>>>>>>
>>>>>>>>> The test was conducted with brd disk to reduce the influence
>>>>>>>>> of disk subsystem to the test results.
>>>>>>>>> The results is given in seconds.
>>>>>>>>>
>>>>>>>>> compress cmd:
>>>>>>>>>       time ./qemu-img convert -O qcow2 -c -o
>>>>>>>>> compression_type=[zlib|zstd]
>>>>>>>>>                       src.img [zlib|zstd]_compressed.img
>>>>>>>>> decompress cmd
>>>>>>>>>       time ./qemu-img convert -O qcow2
>>>>>>>>>                       [zlib|zstd]_compressed.img uncompressed.img
>>>>>>>>>
>>>>>>>>>                compression               decompression
>>>>>>>>>              zlib       zstd           zlib         zstd
>>>>>>>>> ------------------------------------------------------------
>>>>>>>>> real     65.5       16.3 (-75 %)    1.9          1.6 (-16 %)
>>>>>>>>> user     65.0       15.8            5.3          2.5
>>>>>>>>> sys       3.3        0.2            2.0          2.0
>>>>>>>>>
>>>>>>>>> Both ZLIB and ZSTD gave the same compression ratio: 1.57
>>>>>>>>> compressed image size in both cases: 1.4G
>>>>>>>>>
>>>>>>>>> Signed-off-by: Denis Plotnikov <dplotni...@virtuozzo.com>
>>>>>>>>> QAPI part:
>>>>>>>>> Acked-by: Markus Armbruster <arm...@redhat.com>
>>>>>>>>> ---
>>>>>>>>>      docs/interop/qcow2.txt |   1 +
>>>>>>>>>      configure              |   2 +-
>>>>>>>>>      qapi/block-core.json   |   3 +-
>>>>>>>>>      block/qcow2-threads.c  | 169
>>>>>>>>> +++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>      block/qcow2.c          |   7 ++
>>>>>>>>>      slirp                  |   2 +-
>>>>>>>>>      6 files changed, 181 insertions(+), 3 deletions(-)
>>>>>>>> [...]
>>>>>>>>
>>>>>>>>> diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
>>>>>>>>> index 7dbaf53489..a0b12e1b15 100644
>>>>>>>>> --- a/block/qcow2-threads.c
>>>>>>>>> +++ b/block/qcow2-threads.c
>>>>>>>> [...]
>>>>>>>>
>>>>>>>>> +static ssize_t qcow2_zstd_decompress(void *dest, size_t
>>>>>>>>> dest_size,
>>>>>>>>> +                                     const void *src, size_t
>>>>>>>>> src_size)
>>>>>>>>> +{
>>>>>>>> [...]
>>>>>>>>
>>>>>>>>> +    /*
>>>>>>>>> +     * The compressed stream from the input buffer may consist of
>>>>>>>>> more
>>>>>>>>> +     * than one zstd frame.
>>>>>>>> Can it?
>>>>>>> If not, we must require it in the specification.
>>>>>> Actually, now that you mention it, it would make sense anyway to add
>>>>>> some note to the specification on what exactly compressed with zstd
>>>>>> means.
>>>>>>
>>>>>>> Hmm. If at some point
>>>>>>> we'll want multi-threaded compression of one big (2M) cluster..
>>>>>>> Could
>>>>>>> this be implemented with zstd lib, if multiple frames are allowed,
>>>>>>> will
>>>>>>> allowing multiple frames help? I don't know actually, but I think
>>>>>>> better
>>>>>>> not to forbid it. On the other hand, I don't see any benefit in
>>>>>>> large
>>>>>>> compressed clusters. At least, in our scenarios (for compressed
>>>>>>> backups)
>>>>>>> we use 64k compressed clusters, for good granularity of incremental
>>>>>>> backups (when for running vm we use 1M clusters).
>>>>>> Is it really that important?  Naïvely, it sounds rather
>>>>>> complicated to
>>>>>> introduce multithreading into block drivers.
>>>>> It is already here: compression and encryption already multithreaded.
>>>>> But of course, one cluster is handled in one thread.
>>>> Ah, good.  I forgot.
>>>>
>>>>>> (Also, as for compression, it can only be used in backup scenarios
>>>>>> anyway, where you write many clusters at once.  So parallelism on the
>>>>>> cluster level should sufficient to get high usage, and it would
>>>>>> benefit
>>>>>> all compression types and cluster sizes.)
>>>>>>
>>>>> Yes it works in this way already :)
>>>> Well, OK then.
>>>>
>>>>> So, we don't know do we want one frame restriction or not. Do you
>>>>> have a
>>>>> preference?
>>>> *shrug*
>>>>
>>>> Seems like it would be preferential to allow multiple frames still.  A
>>>> note in the spec would be nice (i.e., streaming format, multiple frames
>>>> per cluster possible).
>>> We don't mention anything about zlib compressing details in the spec.
>> Yep.  Which is bad enough.
>>
>>> If we mention anything about ZSTD compressing details we'll have to do
>>> it for
>>> zlib as well.
>> I personally don’t like “If you can’t make it perfect, you shouldn’t do
>> it at all”.  Mentioning it for zstd would still be an improvement.
> 
> Good approach. I like it. But I'm trying to be cautious.
>>
>> Also, I believe that “zlib compression” is pretty much unambiguous,
>> considering all the code does is call deflate()/inflate().
>>
>> I’m not saying we need to amend the spec in this series, but I don’t see
>> a good argument against doing so at some point.
>>
>>> So, I think we have two possibilities for the spec:
>>> 1. mention for both
>>> 2. don't mention at all
>>>
>>> I think the 2nd is better. It gives more degree of freedom for the
>>> future improvements.
>> No, it absolutely doesn’t.  There is a de-facto format, namely what qemu
>> accepts.  Changing that would be an incompatible change.  Just because
>> we don’t write what’s allowed into the spec doesn’t change that fact.
>>
>>> and we've already covered both cases (one frame, may frames) in the
>>> code.
>> There are still different zstd formats, namely streaming or not
>> streaming.  That’s what should be mentioned.
> 
> It looks like the ZSTD format is always the same.
> The ZSTD interface is different: for some reason the simple
> zstd_decompress()
> wants to know the size of data to decompress
> 
> From zstd doc:
> *size_t ZSTD_decompress( void* dst, size_t dstCapacity, const void* src,
> size_t compressedSize); *
> 
> `compressedSize` : must be the _exact_ size of some number of compressed
> and/or skippable frames.


That’s... interesting.

> I made a test (based on Vladimir's code) to check it:
> 
> // compile with gcc -lzstd -g test_zstd.c -o test_zstd
> 
> #include <stdio.h> #include <assert.h> #include <zstd.h> #include
> <zstd_errors.h> int main() { char buf1[] = "erbebewbwe"; char buf2[100];
> char buf3[100]; int ret; ZSTD_outBuffer output; ZSTD_inBuffer input; ret
> = ZSTD_compress(buf2, sizeof(buf2), buf1, sizeof(buf1), 5); printf("ret:
> %d\n", ret); ZSTD_DCtx *dctx = ZSTD_createDCtx(); input =
> (ZSTD_inBuffer){buf2, ret, 0}; output = (ZSTD_outBuffer){buf3,
> sizeof(buf3), 0}; ret = ZSTD_decompressStream(dctx, &output, &input);
> printf("ret: %d, input.pos: %ld, output.pos: %ld\n", ret, input.pos,
> output.pos); printf("To compress: %s\n", buf1); buf3[output.pos] = 0;
> printf("Uncompressed: %s\n", buf3); return 0; }
> 
> And it works fine.
> 
> We use streaming semantics just to be consistent with the interfaces
> (stream_compress/stream_decompress), otherwise
> 
> we could use simple_compress/stream_decompress

OK, if there’s just a single format and we accept multiple frames, then
there’s indeed no need to document it.

Max

signature.asc
Description: OpenPGP digital signature

Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

Reply via email to