On Sat, Apr 02, 2022 at 03:22:21PM +0800, Hyman Huang wrote:
> On 2022/3/28 9:32, [email protected] wrote:
> > @@ -879,7 +862,9 @@ static void kvm_dirty_ring_flush(void)
> >      * First make sure to flush the hardware buffers by kicking all
> >      * vcpus out in a synchronous way.
> >      */
> > -    kvm_cpu_synchronize_kick_all();
> > +    if (!cpu_throttle_get_percentage()) {
> > +        qemu_kvm_cpu_synchronize_kick_all();
> > +    }
> 
> For the code logic itself, kvm_dirty_ring_flush() aims to flush all
> existing dirty pages, same as I reviewed in v1's first commit
> "kvm,memory: Optimize dirty page collection for dirty ring"; we
> shouldn't consider the migration logic in this path.
I agree with Yong's analysis.  I'm afraid this patch is breaking the API
here.  Say, memory_region_sync_dirty_bitmap() is supposed to be the memory
API that flushes all dirty memory; we can't simply skip the flush just
because it's slow while we're throttling the vCPUs.

Things will already start to break.  E.g., the dirty rate calculation based
on the dirty ring relies on all dirty pages being accounted after the
sync() call, and this patch will make the reported value smaller than
reality.  It's not a major deal breaker for dirty rate measurements, but
it's one example of how we're potentially breaking the memory API.

AFAIU this is probably another reason why I don't really like the old
auto-converge solution: it's kind of brutal in this case, putting the
thread into one long sleep.  One fix could be to sleep periodically in the
auto-converge code so that the vCPU threads can still handle any CPU sync
requests, though I didn't really check that code path.

It's also racy in that cpu_throttle_get_percentage() is racy itself:
imagine a case where cpu_throttle_get_percentage() always returns >0 _but_
starts to return 0 at migration_complete() - then we'll not call
qemu_kvm_cpu_synchronize_kick_all() even for completion, and that can
crash the VM.

-- 
Peter Xu
