Yuan Liu <yuan1....@intel.com> writes:

> QPL compression and decompression will use IAA hardware path if the IAA
> hardware is available. Otherwise the QPL library software path is used.
>
> The hardware path will automatically fall back to QPL software path if
> the IAA queues are busy. In some scenarios, this may happen frequently,
> such as configuring 4 channels but only one IAA device is available. In
> the case of insufficient IAA hardware resources, retry and fallback can
> help optimize performance:
>
> 1. Retry + SW fallback:
>    total time: 14649 ms
>    downtime: 25 ms
>    throughput: 17666.57 mbps
>    pages-per-second: 1509647
>
> 2. No fallback, always wait for work queues to become available
>    total time: 18381 ms
>    downtime: 25 ms
>    throughput: 13698.65 mbps
>    pages-per-second: 859607
>
> If both the hardware and software paths fail, the uncompressed page is
> sent directly.
>
> Signed-off-by: Yuan Liu <yuan1....@intel.com>
> Reviewed-by: Nanhai Zou <nanhai....@intel.com>
> ---
>  migration/multifd-qpl.c | 424 +++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 420 insertions(+), 4 deletions(-)
>
> diff --git a/migration/multifd-qpl.c b/migration/multifd-qpl.c
> index 6791a204d5..9265098ee7 100644
> --- a/migration/multifd-qpl.c
> +++ b/migration/multifd-qpl.c
> @@ -13,9 +13,14 @@
>  #include "qemu/osdep.h"
>  #include "qemu/module.h"
>  #include "qapi/error.h"
> +#include "qapi/qapi-types-migration.h"
> +#include "exec/ramblock.h"
>  #include "multifd.h"
>  #include "qpl/qpl.h"
>
> +/* Maximum number of retries to resubmit a job if IAA work queues are full */
> +#define MAX_SUBMIT_RETRY_NUM (3)
> +
>  typedef struct {
>      /* the QPL hardware path job */
>      qpl_job *job;
> @@ -260,6 +265,225 @@ static void multifd_qpl_send_cleanup(MultiFDSendParams *p, Error **errp)
>      p->iov = NULL;
>  }
>
> +/**
> + * multifd_qpl_prepare_job: prepare the job
> + *
> + * Set the QPL job parameters and properties.
> + *
> + * @job: pointer to the qpl_job structure
> + * @is_compression: indicates compression and decompression
> + * @input: pointer to the input data buffer
> + * @input_len: the length of the input data
> + * @output: pointer to the output data buffer
> + * @output_len: the length of the output data
> + */
> +static void multifd_qpl_prepare_job(qpl_job *job, bool is_compression,
> +                                    uint8_t *input, uint32_t input_len,
> +                                    uint8_t *output, uint32_t output_len)
> +{
> +    job->op = is_compression ? qpl_op_compress : qpl_op_decompress;
> +    job->next_in_ptr = input;
> +    job->next_out_ptr = output;
> +    job->available_in = input_len;
> +    job->available_out = output_len;
> +    job->flags = QPL_FLAG_FIRST | QPL_FLAG_LAST | QPL_FLAG_OMIT_VERIFY;
> +    /* only supports compression level 1 */
> +    job->level = 1;
> +}
> +
> +/**
> + * multifd_qpl_prepare_comp_job: prepare the compression job
> + *
> + * Set the compression job parameters and properties.
> + *
> + * @job: pointer to the qpl_job structure
> + * @input: pointer to the input data buffer
> + * @output: pointer to the output data buffer
> + * @size: the page size
> + */
> +static void multifd_qpl_prepare_comp_job(qpl_job *job, uint8_t *input,
> +                                         uint8_t *output, uint32_t size)
> +{
> +    /*
> +     * Set output length to less than the page size to force the job to
> +     * fail in case it compresses to a larger size. We'll send that page
> +     * without compression and skip the decompression operation on the
> +     * destination.
> +     */

This is way better in here!

Reviewed-by: Fabiano Rosas <faro...@suse.de>
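
For readers following the thread, the ordering described in the commit message (retry the hardware path while the work queues are busy, then fall back to software, then send the page uncompressed) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: submit_status, page_path, submit_fn, send_page and the demo_* callbacks are hypothetical names that stand in for the real QPL calls, and treating MAX_SUBMIT_RETRY_NUM as "one initial attempt plus up to three retries" is an assumption about how the retries are counted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are QPL APIs. */
typedef enum { SUBMIT_OK, SUBMIT_QUEUES_BUSY, SUBMIT_ERROR } submit_status;
typedef enum { PATH_HARDWARE, PATH_SOFTWARE, PATH_UNCOMPRESSED } page_path;
typedef submit_status (*submit_fn)(void *opaque);

/* Same value as in the patch; the counting convention is an assumption. */
#define MAX_SUBMIT_RETRY_NUM 3

/*
 * Try the hardware path first, resubmitting only while the queues report
 * busy. Any other hardware failure, or running out of retries, falls
 * through to the software path. If software also fails, the caller is
 * told to send the page uncompressed.
 */
static page_path send_page(submit_fn hw_submit, submit_fn sw_submit,
                           void *opaque)
{
    for (int retry = 0; retry <= MAX_SUBMIT_RETRY_NUM; retry++) {
        submit_status st = hw_submit(opaque);
        if (st == SUBMIT_OK) {
            return PATH_HARDWARE;
        }
        if (st != SUBMIT_QUEUES_BUSY) {
            break; /* hard failure: retrying will not help */
        }
    }
    if (sw_submit(opaque) == SUBMIT_OK) {
        return PATH_SOFTWARE;
    }
    return PATH_UNCOMPRESSED;
}

/* Demo hardware submit: reports busy while the counter is positive. */
static submit_status demo_hw_submit(void *opaque)
{
    int *busy_left = opaque;
    if (*busy_left > 0) {
        (*busy_left)--;
        return SUBMIT_QUEUES_BUSY;
    }
    return SUBMIT_OK;
}

/* Demo software submits: one always succeeds, one always fails. */
static submit_status demo_sw_ok(void *opaque)
{
    (void)opaque;
    return SUBMIT_OK;
}

static submit_status demo_sw_fail(void *opaque)
{
    (void)opaque;
    return SUBMIT_ERROR;
}
```

With the demo callbacks, a hardware path that is busy for up to three submissions still ends on PATH_HARDWARE. A fourth busy response moves the page to PATH_SOFTWARE, and a failing software path ends on PATH_UNCOMPRESSED.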