Chuang Xu <xuchuangxc...@bytedance.com> wrote:
> The duration of loading non-iterable vmstate accounts for a significant
> portion of downtime (starting with the timestamp of source qemu stop and
> ending with the timestamp of target qemu start). Most of the time is spent
> committing memory region changes repeatedly.
>
> This patch packs all the changes to memory region during the period of
> loading non-iterable vmstate in a single memory transaction. With the
> increase of devices, this patch will greatly improve the performance.
>
> Note that the following test results are based on the application of the
> next patch. Without the next patch, the improvement will be reduced.
>
> Here are the test1 results:
> test info:
> - Host
>   - Intel(R) Xeon(R) Platinum 8362 CPU
>   - Mellanox Technologies MT28841
> - VM
>   - 32 CPUs 128GB RAM VM
>   - 8 16-queue vhost-net device
>   - 16 4-queue vhost-user-blk device.
>
>           time of loading non-iterable vmstate     downtime
> before    about 112 ms                             285 ms
> after     about 20 ms                              194 ms
>
> In test2, we keep the number of the device the same as test1, reduce the
> number of queues per device:
>
> Here are the test2 results:
> test info:
> - Host
>   - Intel(R) Xeon(R) Platinum 8362 CPU
>   - Mellanox Technologies MT28841
> - VM
>   - 32 CPUs 128GB RAM VM
>   - 8 1-queue vhost-net device
>   - 16 1-queue vhost-user-blk device.
>
>           time of loading non-iterable vmstate     downtime
> before    about 65 ms                              about 151 ms
> after     about 19 ms                              about 100 ms
>
> In test3, we keep the number of queues per device the same as test1, reduce
> the number of devices:
>
> Here are the test3 results:
> test info:
> - Host
>   - Intel(R) Xeon(R) Platinum 8362 CPU
>   - Mellanox Technologies MT28841
> - VM
>   - 32 CPUs 128GB RAM VM
>   - 1 16-queue vhost-net device
>   - 1 4-queue vhost-user-blk device.
>
>           time of loading non-iterable vmstate     downtime
> before    about 24 ms                              about 51 ms
> after     about 9 ms                               about 36 ms
>
> As we can see from the test results above, both the number of queues and
> the number of devices have a great impact on the time of loading non-iterable
> vmstate. The growth of the number of devices and queues will lead to more
> mr commits, and the time consumption caused by the flatview reconstruction
> will also increase.
>
> Signed-off-by: Chuang Xu <xuchuangxc...@bytedance.com>
> ---
>  migration/savevm.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index aa54a67fda..9a7d3e40d6 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2762,6 +2762,7 @@ out:
>              goto retry;
>          }
>      }
> +
>      return ret;
>  }
>

Drop this.

> @@ -2787,7 +2788,25 @@ int qemu_loadvm_state(QEMUFile *f)
>
>      cpu_synchronize_all_pre_loadvm();
>
> +    /*
> +     * Call memory_region_transaction_begin() before loading vmstate.
> +     * This call is paired with memory_region_transaction_commit() at
> +     * the end of qemu_loadvm_state_main(), in order to pack all the
> +     * changes to memory region during the period of loading
> +     * non-iterable vmstate in a single memory transaction.
> +     * This operation will reduce time of loading non-iterable vmstate
> +     */
> +    memory_region_transaction_begin();
> +
>      ret = qemu_loadvm_state_main(f, mis);
> +
> +    /*
> +     * Call memory_region_transaction_commit() after loading vmstate.
> +     * At this point, qemu actually completes all the previous memory
> +     * region transactions.
> +     */
> +    memory_region_transaction_commit();
> +
>      qemu_event_set(&mis->main_thread_load_event);
>
>      trace_qemu_loadvm_state_post_main(ret);

Reviewed-by: Juan Quintela <quint...@redhat.com>

I don't feel confident getting this series through the migration tree
unless Paolo (or someone else more familiar with the memory API) reviews
it. So if anyone else reviews it, I will get it through the migration
tree; otherwise I am OK with having it pulled through another tree.

I am not sure whether we should get this in during the freeze or wait
for 8.1 to open.