Right now, we don't have an interface to detect those cases and
go back to the iterative stage.
How about going back to the iterative stage when we detect that
pending_size is larger than max_size, like this:
+/* The flush here is aimed at shortening the VM downtime;
+ * bdrv_flush_all is a time consuming operation
+ * when the guest has done some file writing */
+bdrv_flush_all();
+pending_size = qemu_savevm_state_pending(s->file, max_size);
+if (pending_size && pending_size >= max_size) {
+    qemu_mutex_unlock_iothread();
+    continue;
+}
 ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
 if (ret >= 0) {
     qemu_file_set_rate_limit(s->file, INT64_MAX);
and this is quite simple.
Yes, but it is too simple. If you hold all the locks during
bdrv_flush_all(), your VM will effectively stop as soon as it
performs the next I/O access, so you don't win much. And you
still don't have a timeout for cases where the flush takes really long.
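Just to make the point concrete, a timeout on top of the snippet above
could look roughly like this (max_flush_time_ms is a made-up knob, not
an existing migration parameter; this is only a sketch of the idea, not
a proposed patch):

    int64_t start, elapsed;

    qemu_mutex_lock_iothread();
    start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
    bdrv_flush_all();
    elapsed = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - start;
    pending_size = qemu_savevm_state_pending(s->file, max_size);

    /* If the flush already ate the downtime budget, or the guest
     * dirtied too much memory meanwhile, go back to the iterative
     * stage instead of stopping the VM now. */
    if (elapsed > max_flush_time_ms ||
        (pending_size && pending_size >= max_size)) {
        qemu_mutex_unlock_iothread();
        continue;
    }

    ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);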
This is probably better than what we have now (basically we are
measuring, after bdrv_flush_all, how much the amount of dirty memory
has changed, and returning to the iterative stage if it changed too
much). A timeout would be better anyway. And an interface to start the
synchronization sooner, asynchronously, would also be good.
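For illustration, starting the synchronization sooner from the
iterative loop could look roughly like this (flush_started and the
2 * max_size threshold are invented for the example, not an existing
interface; the guest still stalls on I/O while the flush runs, but it
happens before the downtime window rather than inside it):

    /* Inside migration_thread's iterative loop, before deciding to
     * complete: do one early flush once the remaining pending data
     * gets close to max_size, so the final bdrv_flush_all() under
     * the big lock has little left to write. */
    if (!flush_started && pending_size < 2 * max_size) {
        qemu_mutex_lock_iothread();
        bdrv_flush_all();
        qemu_mutex_unlock_iothread();
        flush_started = true;
    }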
Notice that my understanding is that any proper fix for this is 2.4
material.
Then, how should we deal with this issue in 2.3: leave it as it is, or
make an incomplete fix like the one I propose above?
I think it is better to leave it as it is for 2.3. With a patch like
this one, we improve for one load and get worse for a different load
(it depends a lot on the ratio of memory dirtying vs. disk writes). I
have no data on which load is more common, so I prefer to be
conservative this late in the cycle. What do you think?
I agree, it's too late in the release cycle for such a change.
Kevin
Hi Juan, Kevin,
I have not found the related patches to fix the issue that leads to long VM
downtime. How is it going?
Liang