Daniel P. Berrangé <berra...@redhat.com> wrote:
> On Mon, Oct 23, 2023 at 08:33:44AM +0000, Liu, Yuan1 wrote:
>> > -----Original Message-----
>> > From: Daniel P. Berrangé <berra...@redhat.com>
>> > Sent: Thursday, October 19, 2023 11:32 PM
>> > To: Peter Xu <pet...@redhat.com>
>> > Cc: Juan Quintela <quint...@redhat.com>; Liu, Yuan1
>> > <yuan1....@intel.com>; faro...@suse.de; leob...@redhat.com; qemu-
>> > de...@nongnu.org; Zou, Nanhai <nanhai....@intel.com>
>> > Subject: Re: [PATCH 0/5] Live Migration Acceleration with IAA Compression
>> > 
>> > On Thu, Oct 19, 2023 at 11:23:31AM -0400, Peter Xu wrote:
>> > > On Thu, Oct 19, 2023 at 03:52:14PM +0100, Daniel P. Berrangé wrote:
>> > > > On Thu, Oct 19, 2023 at 01:40:23PM +0200, Juan Quintela wrote:
>> > > > > Yuan Liu <yuan1....@intel.com> wrote:
>> > > > > > Hi,
>> > > > > >
>> > > > > > I am writing to submit a code change aimed at enhancing live
>> > > > > > migration acceleration by leveraging the compression capability
>> > > > > > of the Intel In-Memory Analytics Accelerator (IAA).
>> > > > > >
>> > > > > > Enabling compression functionality during the live migration
>> > > > > > process can enhance performance, thereby reducing downtime and
>> > > > > > network bandwidth requirements. However, this improvement comes
>> > > > > > at the cost of additional CPU resources, posing a challenge for
>> > > > > > cloud service providers in terms of resource allocation. To
>> > > > > > address this challenge, I have focused on offloading the 
>> > > > > > compression
>> > > > > > overhead to the IAA hardware, resulting in performance gains.
>> > > > > >
>> > > > > > The implementation of the IAA (de)compression code is based on
>> > > > > > Intel Query Processing Library (QPL), an open-source software
>> > > > > > project designed for IAA high-level software programming.
>> > > > >
>> > > > > After reviewing the patches:
>> > > > >
>> > > > > - why are you doing this on top of old compression code, that is
>> > > > >   obsolete, deprecated and buggy
>> Some users have not enabled the multifd feature yet, but they will
>> decide whether to enable the compression feature based on the load
>> situation. So I'm wondering if, without multifd, the compression
>> functionality will no longer be available?
>> 
>> > > > > - why are you not doing it on top of multifd.
>> I plan to submit the support for multifd independently because the
>> multifd compression and legacy compression code are separate.
>
> So the core question here (for migration maintainers) is whether
> contributors should be spending any time at all on non-multifd
> code, or if new features should be exclusively for multifd ?

Only for multifd.

Comparison right now:
- compression (can be done better in multifd)
- plain precopy (we can saturate faster networks with multifd)
- xbzrle: right now only non-multifd (plan to add as another multifd
          compression method)
- exec: This is a hard one.  Fabiano is about to submit a file-based
        multifd method.  Advantages over exec:
          * much less space used (each page is written at its correct
            position, with no overhead and never the same page stored on
            two streams)
          * We can give proper errors; exec is very bad when the exec'd
            process reports an error.
        Disadvantages:
          * libvirt (or any management app) needs to wait for
            compression to end and launch the exec command by hand.
            I wanted to discuss with libvirt whether it would be
            possible to remove the use of exec compression.
- rdma: This is a hard one.
        The current implementation is a mess.
        It is almost unmaintained.
        There are two- to three-year-old patches to move it on top of
        multifd.
- postcopy: Not implemented.  This is the real reason that we can't
        deprecate precopy and make multifd the default.
- snapshots: They are too coupled with qcow2.  It should be possible to
        do something more sensible with multifd + file, but we need to walk
        that path once multifd + file hits the tree.

> It doesn't make a lot of sense over the long term to have people
> spending time implementing the same features twice. IOW, should
> we be directing contributors explicitly towards multifd only,
> and even consider deprecating non-multifd code at some point?

Intel submitted something similar to this on top of QAT several months
back.  I already advised them not to spend any time on top of the old
compression code and just do things on top of multifd.

While we are here, what are the differences between QPL and QAT?
Previous submission used qatzip-devel.

Later, Juan.

>> > > > I'm not sure that is the ideal approach.  IIUC, the IAA/QPL library
>> > > > is not defining a new compression format. Rather, it is providing a
>> > > > hardware accelerator for the 'deflate' format, which can be made
>> > > > compatible with zlib:
>> > > >
>> > > >
>> > > > https://intel.github.io/qpl/documentation/dev_guide_docs/c_use_cases/deflate/c_deflate_zlib_gzip.html#zlib-and-gzip-compatibility-reference-link
>> > > >
>> > > > With multifd we already have a 'zlib' compression format, and so
>> > > > this IAA/QPL logic would effectively just be providing a second
>> > > > implementation of zlib.
>> > > >
>> > > > Given the use of a standard format, I would expect to be able to use
>> > > > software zlib on the src, mixed with IAA/QPL zlib on the target, or
>> > > > vice versa.
>> > > >
>> > > > IOW, rather than defining a new compression format for this, I think
>> > > > we could look at a new migration parameter for
>> > > >
>> > > > "compression-accelerator": ["auto", "none", "qpl"]
>> > > >
>> > > > with 'auto' the default, such that we can automatically enable
>> > > > IAA/QPL when 'zlib' format is requested, if running on a suitable
>> > > > host.
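[To make the 'auto' proposal above concrete, here is a rough sketch of how
such a parameter could resolve; the enum and helper names below are
hypothetical, not existing QEMU code.]

    /* Hypothetical sketch of a "compression-accelerator" setting.
     * 'auto' resolves to QPL only when the selected compression format
     * is 'zlib' and an IAA device is actually usable on this host. */
    typedef enum {
        COMPRESS_ACCEL_AUTO,
        COMPRESS_ACCEL_NONE,
        COMPRESS_ACCEL_QPL,
    } CompressAccelerator;

    static CompressAccelerator
    resolve_accelerator(CompressAccelerator requested,
                        bool format_is_zlib, bool iaa_usable)
    {
        if (requested == COMPRESS_ACCEL_AUTO) {
            return (format_is_zlib && iaa_usable) ? COMPRESS_ACCEL_QPL
                                                  : COMPRESS_ACCEL_NONE;
        }
        return requested; /* an explicit 'none' or 'qpl' wins */
    }

With that shape, an explicit 'qpl' request could fail hard when no IAA
device is usable, while 'auto' would silently fall back to software zlib.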
>> > >
>> > > I was also curious about the compression format compared to the
>> > > software ones when reading.
>> > >
>> > > Would there be a use case where one would prefer software compression
>> > > even if a hardware accelerator existed, no matter on src/dst?
>> > >
>> > > I'm wondering whether we can avoid that extra parameter and always
>> > > use hardware acceleration whenever possible.
>>
>> I want to add a new compression format (QPL or IAA-Deflate) here.
>> The reasons are as follows:
>>
>> 1. The QPL library already supports both software and hardware paths
>>    for compression. The software path uses a fast Deflate compression
>>    algorithm, while the hardware path uses IAA.
>
> That's not a reason to describe this as a new format in QEMU. It is
> still deflate, and so conceptually we can model this as 'zlib' and
> potentially choose to use QPL automatically.
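[For reference on point 1: QPL selects the execution path when a job is
initialised, so the same call site can run on IAA or fall back to the
software deflate path. A minimal compression sketch, going by my reading
of the public QPL C API; treat the exact flag set as illustrative and
note that error handling is trimmed.]

    #include <qpl/qpl.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* Compress one buffer with QPL.  qpl_path_auto falls back to the
     * software deflate path when no IAA device is available, while
     * qpl_path_hardware would force the accelerator. */
    static int qpl_compress_buf(const uint8_t *in, uint32_t in_len,
                                uint8_t *out, uint32_t out_cap,
                                uint32_t *out_len)
    {
        uint32_t job_size = 0;
        qpl_job *job;

        if (qpl_get_job_size(qpl_path_auto, &job_size) != QPL_STS_OK) {
            return -1;
        }
        job = malloc(job_size);
        if (!job || qpl_init_job(qpl_path_auto, job) != QPL_STS_OK) {
            free(job);
            return -1;
        }

        job->op            = qpl_op_compress;
        job->next_in_ptr   = (uint8_t *)in;
        job->available_in  = in_len;
        job->next_out_ptr  = out;
        job->available_out = out_cap;
        job->level         = qpl_default_level;
        job->flags         = QPL_FLAG_FIRST | QPL_FLAG_LAST |
                             QPL_FLAG_DYNAMIC_HUFFMAN | QPL_FLAG_OMIT_VERIFY;

        int ret = (qpl_execute_job(job) == QPL_STS_OK) ? 0 : -1;
        *out_len = job->total_out;

        qpl_fini_job(job);
        free(job);
        return ret;
    }

That path distinction is what matters for an 'auto' accelerator setting:
qpl_path_hardware fails when no device is usable, qpl_path_auto does not.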
>
>> 2. QPL's software and hardware paths are based on the Deflate algorithm,
>>    but there is a limitation: the history buffer only supports 4K. The
>>    default history buffer for zlib is 32K, which means that IAA cannot
>>    decompress zlib-compressed data. However, zlib can decompress IAA-
>>    compressed data.
>
> That's again not a reason to call it a new compression format in
> QEMU. It would mean, however, that if compression-accelerator=auto, we
> would not be able to safely enable QPL on the incoming QEMU, as we
> can't be sure the src used a 4k window.  We could still automatically
> enable QPL on the outgoing side, though.
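[The window asymmetry can be illustrated with plain zlib, nothing
QPL-specific; the function name below is just for this sketch.  Deflate
output produced with a 4 KiB window can be inflated by any zlib, while a
default 32 KiB-window stream may reference history a 4K-window
decompressor cannot reach, which is exactly why auto-enabling QPL is only
unconditionally safe on the outgoing side.]

    #include <zlib.h>
    #include <stdint.h>
    #include <string.h>

    /* Compress with a 4 KiB history window so the output stays within
     * what a 4K-window decompressor could handle; a normal zlib inflate
     * with windowBits = 15 can always decode it. */
    static int deflate_4k_window(const unsigned char *in, uInt in_len,
                                 unsigned char *out, uInt out_cap,
                                 uLong *out_len)
    {
        z_stream strm;
        memset(&strm, 0, sizeof(strm));

        /* windowBits = 12 -> 2^12 = 4096-byte window (zlib default is 15). */
        if (deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
                         12, 8, Z_DEFAULT_STRATEGY) != Z_OK) {
            return -1;
        }

        strm.next_in   = (Bytef *)in;
        strm.avail_in  = in_len;
        strm.next_out  = out;
        strm.avail_out = out_cap;

        int ret = deflate(&strm, Z_FINISH);
        *out_len = strm.total_out;
        deflateEnd(&strm);

        return ret == Z_STREAM_END ? 0 : -1;
    }

If the source side could be constrained to a 4 KiB window like this for
its software zlib stream, the incoming side could presumably use IAA
safely as well, but that would need to be negotiated or fixed via the
migration parameters.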
>
>> 3. Intel QuickAssist Technology (QAT) can accelerate both zlib and
>>    zstd.
>
> What's the difference between this and IAA/QPL?
>
> With regards,
> Daniel

