Re: [PATCH v7 5/6] migration: Reduce time of loading non-iterable vmstate

Juan Quintela Thu, 16 Mar 2023 03:47:43 -0700

Chuang Xu <xuchuangxc...@bytedance.com> wrote:
> The duration of loading non-iterable vmstate accounts for a significant
> portion of downtime (starting with the timestamp of source qemu stop and
> ending with the timestamp of target qemu start). Most of the time is spent
> committing memory region changes repeatedly.
>
> This patch packs all the changes to memory region during the period of
> loading non-iterable vmstate in a single memory transaction. With the
> increase of devices, this patch will greatly improve the performance.
>
> Note that the following test results are based on the application of the
> next patch. Without the next patch, the improvement will be reduced.
>
> Here are the test1 results:
> test info:
> - Host
>   - Intel(R) Xeon(R) Platinum 8362 CPU
>   - Mellanox Technologies MT28841
> - VM
>   - 32 CPUs 128GB RAM VM
>   - 8 16-queue vhost-net device
>   - 16 4-queue vhost-user-blk device.
>
>           time of loading non-iterable vmstate    downtime
> before    about 112 ms                            285 ms
> after     about 20 ms                             194 ms
>
> In test2, we keep the number of the device the same as test1, reduce the
> number of queues per device:
>
> Here are the test2 results:
> test info:
> - Host
>   - Intel(R) Xeon(R) Platinum 8362 CPU
>   - Mellanox Technologies MT28841
> - VM
>   - 32 CPUs 128GB RAM VM
>   - 8 1-queue vhost-net device
>   - 16 1-queue vhost-user-blk device.
>
>           time of loading non-iterable vmstate    downtime
> before    about 65 ms                             about 151 ms
> after     about 19 ms                             about 100 ms
>
> In test3, we keep the number of queues per device the same as test1, reduce
> the number of devices:
>
> Here are the test3 results:
> test info:
> - Host
>   - Intel(R) Xeon(R) Platinum 8362 CPU
>   - Mellanox Technologies MT28841
> - VM
>   - 32 CPUs 128GB RAM VM
>   - 1 16-queue vhost-net device
>   - 1 4-queue vhost-user-blk device.
>
>           time of loading non-iterable vmstate    downtime
> before    about 24 ms                             about 51 ms
> after     about 9 ms                              about 36 ms
>
> As we can see from the test results above, both the number of queues and
> the number of devices have a great impact on the time of loading non-iterable
> vmstate. The growth of the number of devices and queues will lead to more
> mr commits, and the time consumption caused by the flatview reconstruction
> will also increase.
>
> Signed-off-by: Chuang Xu <xuchuangxc...@bytedance.com>
> ---
>  migration/savevm.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index aa54a67fda..9a7d3e40d6 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2762,6 +2762,7 @@ out:
>              goto retry;
>          }
>      }
> +
>      return ret;
>  }
>


Drop this hunk; it only adds a stray blank line.

> @@ -2787,7 +2788,25 @@ int qemu_loadvm_state(QEMUFile *f)
>  
>      cpu_synchronize_all_pre_loadvm();
>  
> +    /*
> +     * Call memory_region_transaction_begin() before loading vmstate.
> +     * This call is paired with memory_region_transaction_commit() at
> +     * the end of qemu_loadvm_state_main(), in order to pack all the
> +     * changes to memory region during the period of loading
> +     * non-iterable vmstate in a single memory transaction.
> +     * This operation will reduce time of loading non-iterable vmstate
> +     */
> +    memory_region_transaction_begin();
> +
>      ret = qemu_loadvm_state_main(f, mis);
> +
> +    /*
> +     * Call memory_region_transaction_commit() after loading vmstate.
> +     * At this point, qemu actually completes all the previous memory
> +     * region transactions.
> +     */
> +    memory_region_transaction_commit();
> +
>      qemu_event_set(&mis->main_thread_load_event);
>  
>      trace_qemu_loadvm_state_post_main(ret);
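
For readers less familiar with the memory API, here is a toy model of why
wrapping the load in one outer transaction helps. It is only a sketch: every
name below is hypothetical and none of it is QEMU code. It assumes that
transactions nest through a depth counter and that the expensive rebuild runs
only when the outermost commit finds a pending update.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not QEMU's real symbols. */
static unsigned txn_depth;      /* nesting level of open transactions */
static bool update_pending;     /* a region changed since the last rebuild */
static unsigned rebuild_count;  /* number of expensive rebuilds performed */

static void toy_transaction_begin(void)
{
    txn_depth++;
}

static void toy_transaction_commit(void)
{
    assert(txn_depth > 0);
    txn_depth--;
    /* Only the outermost commit pays for the rebuild. */
    if (txn_depth == 0 && update_pending) {
        rebuild_count++;
        update_pending = false;
    }
}

/* Stand-in for one device changing a memory region while loading. */
static void toy_load_device(void)
{
    toy_transaction_begin();
    update_pending = true;
    toy_transaction_commit();
}

/* Load n_devices and return how many rebuilds that triggered. */
unsigned toy_load_all(unsigned n_devices, bool batched)
{
    rebuild_count = 0;
    if (batched) {
        toy_transaction_begin();   /* outer transaction around the load */
    }
    for (unsigned i = 0; i < n_devices; i++) {
        toy_load_device();
    }
    if (batched) {
        toy_transaction_commit();  /* single rebuild happens here */
    }
    return rebuild_count;
}
```

In this model, toy_load_all(24, false) performs one rebuild per device, while
toy_load_all(24, true) performs a single rebuild at the outer commit.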

Reviewed-by: Juan Quintela <quint...@redhat.com>

I don't feel confident taking this series through the migration tree
unless Paolo (or someone else more familiar with the memory API)
reviews it.

So if anyone else reviews it, I will take it through the migration tree;
otherwise I am OK with having it pulled through another tree.

Not sure whether we should merge this in the middle of the freeze or
wait for 8.1 to open.
