On 30.04.20 15:56, Denis Plotnikov wrote: > > > On 30.04.2020 14:47, Max Reitz wrote: >> On 30.04.20 11:48, Denis Plotnikov wrote: >>> >>> On 30.04.2020 11:26, Max Reitz wrote: >>>> On 29.04.20 15:02, Vladimir Sementsov-Ogievskiy wrote: >>>>> 29.04.2020 15:17, Max Reitz wrote: >>>>>> On 29.04.20 12:37, Vladimir Sementsov-Ogievskiy wrote: >>>>>>> 29.04.2020 13:24, Max Reitz wrote: >>>>>>>> On 28.04.20 22:00, Denis Plotnikov wrote: >>>>>>>>> zstd significantly reduces cluster compression time. >>>>>>>>> It provides better compression performance maintaining >>>>>>>>> the same level of the compression ratio in comparison with >>>>>>>>> zlib, which, at the moment, is the only compression >>>>>>>>> method available. >>>>>>>>> >>>>>>>>> The performance test results: >>>>>>>>> Test compresses and decompresses qemu qcow2 image with just >>>>>>>>> installed rhel-7.6 guest. >>>>>>>>> Image cluster size: 64K. Image on disk size: 2.2G >>>>>>>>> >>>>>>>>> The test was conducted with brd disk to reduce the influence >>>>>>>>> of disk subsystem to the test results. >>>>>>>>> The results is given in seconds. >>>>>>>>> >>>>>>>>> compress cmd: >>>>>>>>> time ./qemu-img convert -O qcow2 -c -o >>>>>>>>> compression_type=[zlib|zstd] >>>>>>>>> src.img [zlib|zstd]_compressed.img >>>>>>>>> decompress cmd >>>>>>>>> time ./qemu-img convert -O qcow2 >>>>>>>>> [zlib|zstd]_compressed.img uncompressed.img >>>>>>>>> >>>>>>>>> compression decompression >>>>>>>>> zlib zstd zlib zstd >>>>>>>>> ------------------------------------------------------------ >>>>>>>>> real 65.5 16.3 (-75 %) 1.9 1.6 (-16 %) >>>>>>>>> user 65.0 15.8 5.3 2.5 >>>>>>>>> sys 3.3 0.2 2.0 2.0 >>>>>>>>> >>>>>>>>> Both ZLIB and ZSTD gave the same compression ratio: 1.57 >>>>>>>>> compressed image size in both cases: 1.4G >>>>>>>>> >>>>>>>>> Signed-off-by: Denis Plotnikov <dplotni...@virtuozzo.com> >>>>>>>>> QAPI part: >>>>>>>>> Acked-by: Markus Armbruster <arm...@redhat.com> >>>>>>>>> --- >>>>>>>>> docs/interop/qcow2.txt | 1 + >>>>>>>>> configure | 2 +- >>>>>>>>> qapi/block-core.json | 3 +- >>>>>>>>> block/qcow2-threads.c | 169 >>>>>>>>> +++++++++++++++++++++++++++++++++++++++++ >>>>>>>>> block/qcow2.c | 7 ++ >>>>>>>>> slirp | 2 +- >>>>>>>>> 6 files changed, 181 insertions(+), 3 deletions(-) >>>>>>>> [...] >>>>>>>> >>>>>>>>> diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c >>>>>>>>> index 7dbaf53489..a0b12e1b15 100644 >>>>>>>>> --- a/block/qcow2-threads.c >>>>>>>>> +++ b/block/qcow2-threads.c >>>>>>>> [...] >>>>>>>> >>>>>>>>> +static ssize_t qcow2_zstd_decompress(void *dest, size_t >>>>>>>>> dest_size, >>>>>>>>> + const void *src, size_t >>>>>>>>> src_size) >>>>>>>>> +{ >>>>>>>> [...] >>>>>>>> >>>>>>>>> + /* >>>>>>>>> + * The compressed stream from the input buffer may consist of >>>>>>>>> more >>>>>>>>> + * than one zstd frame. >>>>>>>> Can it? >>>>>>> If not, we must require it in the specification. >>>>>> Actually, now that you mention it, it would make sense anyway to add >>>>>> some note to the specification on what exactly compressed with zstd >>>>>> means. >>>>>> >>>>>>> Hmm. If at some point >>>>>>> we'll want multi-threaded compression of one big (2M) cluster.. >>>>>>> Could >>>>>>> this be implemented with zstd lib, if multiple frames are allowed, >>>>>>> will >>>>>>> allowing multiple frames help? I don't know actually, but I think >>>>>>> better >>>>>>> not to forbid it. On the other hand, I don't see any benefit in >>>>>>> large >>>>>>> compressed clusters. At least, in our scenarios (for compressed >>>>>>> backups) >>>>>>> we use 64k compressed clusters, for good granularity of incremental >>>>>>> backups (when for running vm we use 1M clusters). >>>>>> Is it really that important? Naïvely, it sounds rather >>>>>> complicated to >>>>>> introduce multithreading into block drivers. >>>>> It is already here: compression and encryption already multithreaded. >>>>> But of course, one cluster is handled in one thread. >>>> Ah, good. I forgot. >>>> >>>>>> (Also, as for compression, it can only be used in backup scenarios >>>>>> anyway, where you write many clusters at once. So parallelism on the >>>>>> cluster level should sufficient to get high usage, and it would >>>>>> benefit >>>>>> all compression types and cluster sizes.) >>>>>> >>>>> Yes it works in this way already :) >>>> Well, OK then. >>>> >>>>> So, we don't know do we want one frame restriction or not. Do you >>>>> have a >>>>> preference? >>>> *shrug* >>>> >>>> Seems like it would be preferential to allow multiple frames still. A >>>> note in the spec would be nice (i.e., streaming format, multiple frames >>>> per cluster possible). >>> We don't mention anything about zlib compressing details in the spec. >> Yep. Which is bad enough. >> >>> If we mention anything about ZSTD compressing details we'll have to do >>> it for >>> zlib as well. >> I personally don’t like “If you can’t make it perfect, you shouldn’t do >> it at all”. Mentioning it for zstd would still be an improvement. > > Good approach. I like it. But I'm trying to be cautious. >> >> Also, I believe that “zlib compression” is pretty much unambiguous, >> considering all the code does is call deflate()/inflate(). >> >> I’m not saying we need to amend the spec in this series, but I don’t see >> a good argument against doing so at some point. >> >>> So, I think we have two possibilities for the spec: >>> 1. mention for both >>> 2. don't mention at all >>> >>> I think the 2nd is better. It gives more degree of freedom for the >>> future improvements. >> No, it absolutely doesn’t. There is a de-facto format, namely what qemu >> accepts. Changing that would be an incompatible change. Just because >> we don’t write what’s allowed into the spec doesn’t change that fact. >> >>> and we've already covered both cases (one frame, may frames) in the >>> code. >> There are still different zstd formats, namely streaming or not >> streaming. That’s what should be mentioned. > > It looks like the ZSTD format is always the same. > The ZSTD interface is different: for some reason the simple > zstd_decompress() > wants to know the size of data to decompress > > From zstd doc: > *size_t ZSTD_decompress( void* dst, size_t dstCapacity, const void* src, > size_t compressedSize); * > > `compressedSize` : must be the _exact_ size of some number of compressed > and/or skippable frames.
That’s... interesting. > I made a test (based on Vladimir's code) to check it: > > // compile with gcc -lzstd -g test_zstd.c -o test_zstd > > #include <stdio.h> #include <assert.h> #include <zstd.h> #include > <zstd_errors.h> int main() { char buf1[] = "erbebewbwe"; char buf2[100]; > char buf3[100]; int ret; ZSTD_outBuffer output; ZSTD_inBuffer input; ret > = ZSTD_compress(buf2, sizeof(buf2), buf1, sizeof(buf1), 5); printf("ret: > %d\n", ret); ZSTD_DCtx *dctx = ZSTD_createDCtx(); input = > (ZSTD_inBuffer){buf2, ret, 0}; output = (ZSTD_outBuffer){buf3, > sizeof(buf3), 0}; ret = ZSTD_decompressStream(dctx, &output, &input); > printf("ret: %d, input.pos: %ld, output.pos: %ld\n", ret, input.pos, > output.pos); printf("To compress: %s\n", buf1); buf3[output.pos] = 0; > printf("Uncompressed: %s\n", buf3); return 0; } > > And it works fine. > > We use streaming semantics just to be consistent with the interfaces > (stream_compress/stream_decompress), otherwise > > we could use simple_compress/stream_decompress OK, if there’s just a single format and we accept multiple frames, then there’s indeed no need to document it. Max
signature.asc
Description: OpenPGP digital signature