On Wed, Jul 10, 2024 at 01:55:23PM +0000, Liu, Yuan1 wrote: [...]
> migrate_set_parameter max-bandwidth 1250M
>
> |-----------|--------|---------|----------|----------|------|------|
> |8 Channels |Total   |down     |throughput|pages per | send | recv |
> |           |time(ms)|time(ms) |(mbps)    |second    | cpu %| cpu% |
> |-----------|--------|---------|----------|----------|------|------|
> |qatzip     |   16630|       28|     10467|   2940235|   160|   360|
> |-----------|--------|---------|----------|----------|------|------|
> |zstd       |   20165|       24|      8579|   2391465|   810|   340|
> |-----------|--------|---------|----------|----------|------|------|
> |none       |   46063|       40|     10848|    330240|    45|    85|
> |-----------|--------|---------|----------|----------|------|------|
>
> QATzip's dirty page processing throughput is much higher than that of no
> compression.
> In this test the vCPUs are idle, so the migration can succeed even
> without compression.

Thanks!  Maybe good material to be put into the docs/ too, if Yichen is
going to pick up your doc patch when he reposts.

[...]

> I don't have much experience with postcopy, here are some of my thoughts:
>
> 1. For write-intensive VMs, this solution can improve the migration success
>    rate, because in a limited-bandwidth network scenario the dirty page
>    processing throughput drops significantly without compression. The
>    previous data shows this (pages_per_second): in the no-compression
>    precopy case, the dirty pages generated by the workload exceed what the
>    migration can process, resulting in migration failure.

Yes.

> 2. If the VM is read-intensive or has low vCPU utilization (for example, in
>    my current test scenario the vCPUs are all idle), I think no compression
>    + precopy + postcopy also cannot improve the migration performance, and
>    may also cause timeout failures due to a long migration time, same as
>    no-compression precopy.

I don't think postcopy will trigger timeout failures - postcopy should use
constant time to complete a migration, that is guest memsize / bw.  The
challenge is normally that the delay of page requests is higher than with
precopy, but in this case it might not be a big deal.  And I wonder if on
100G*2 cards it can also perform pretty well, as the delay might be minimal
even if bandwidth is throttled.

> 3. In my opinion, postcopy is a good solution in this scenario (low network
>    bandwidth, VM is not critical), because even with compression turned on
>    the migration may still fail (pages_per_second may still be less than
>    the new dirty pages), and it is hard to predict whether the VM memory is
>    compression-friendly.

Yes.

Thanks,

-- 
Peter Xu
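
For anyone wanting to reproduce a run like the rows in the table above, a
minimal HMP sketch could look like the following (the destination host and
port are placeholders, the qatzip value assumes this series is applied, and
the destination side needs the matching multifd capability and compression
setting before migrate_incoming):

    (qemu) migrate_set_capability multifd on
    (qemu) migrate_set_parameter multifd-channels 8
    (qemu) migrate_set_parameter multifd-compression qatzip
    (qemu) migrate_set_parameter max-bandwidth 1250M
    (qemu) migrate -d tcp:<dest-host>:<port>

Switching multifd-compression to zstd or none would cover the other two rows
of the table.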