On Wed, May 10, 2023 at 08:31:13AM +0200, Juan Quintela wrote: > Peter Xu <pet...@redhat.com> wrote: > > Hi > > [Adding Kevin to the party] > > > On Fri, May 05, 2023 at 03:46:52PM +0200, Fiona Ebner wrote: > >> To fix it, ensure that the BQL is held during setup. To avoid changing > >> the behavior for migration too, introduce conditionals for the setup > >> callbacks that need the BQL and only take the lock if it's not already > >> held. > > > > The major complexity of this patch is the "conditionally taking" part. > > Yeap. > > I don't want that bit. > > This is another case of: > - I have a problem > - I will use recursive mutexes to solve it > > Now you have two problems O:-) > > > Pure question: what is the benefit of not holding BQL always during > > save_setup(), if after all we have this coroutine issue to be solved? > > Dunno. > > I would like that paolo commented on this. I "reviewed the code" 10 > years ago. I don't remember at all why we wanted to change that. > > > I can understand that it helps us to avoid taking BQL too long, but we'll > > need to take it anyway during ramblock dirty track initializations, and so > > far IIUC it's the major time to be consumed during setup(). > > > > Commit message of 9b0950375277467 says, "Only the migration_bitmap_sync() > > call needs the iothread lock". Firstly I think it's also covering > > enablement of dirty tracking: > > > > + qemu_mutex_lock_iothread(); > > + qemu_mutex_lock_ramlist(); > > + bytes_transferred = 0; > > + reset_ram_globals(); > > + > > memory_global_dirty_log_start(); > > migration_bitmap_sync(); > > + qemu_mutex_unlock_iothread(); > > > > And I think enablement itself can be slow too, maybe even slower than > > migration_bitmap_sync() especially with KVM_DIRTY_LOG_INITIALLY_SET > > supported in the kernel. > > > > Meanwhile I always got confused on why we need to sync dirty bitmap when > > setup at all. Say, what if we drop migration_bitmap_sync() here? After > > all, shouldn't all pages be dirty from start (ram_list_init_bitmaps())? > > How do you convince KVM (or the other lists) to start doing dirty > tracking? Doing a bitmap sync.
I think memory_global_dirty_log_start() kicks off the tracking already. Take KVM as example, normally the case is KVM_MEM_LOG_DIRTY_PAGES is set there, then ioctl(KVM_SET_USER_MEMORY_REGION) will start dirty tracking for all of the guest memory slots necessary (including wr-protect all pages). KVM_DIRTY_LOG_INITIALLY_SET is slightly special, it'll skip that procedure during ioctl(KVM_SET_USER_MEMORY_REGION), but that also means the kernel assumed the userapp (QEMU) marked all pages dirty initially (always the case for QEMU, I think..). Hence in this case the sync doesn't help either because we'll simply get no new dirty bits in this shot.. > > And yeap, probably there is a better way of doing it. > > > This is slightly off-topic, but I'd like to know if someone can help > > answer. > > > > My whole point is still questioning whether we can unconditionally take bql > > during save_setup(). > > I agree with you. > > > I could have missed something, though, where we want to do in setup() but > > avoid holding BQL. Help needed on figuring this out (and if there is, IMHO > > it'll be worthwhile to put that into comment of save_setup() hook). > > I am more towards revert completely > 9b0950375277467fd74a9075624477ae43b9bb22 > > and call it a day. On migration we don't use coroutines on the sending > side (I mean the migration code, the block layer uses coroutines for > everything/anything). > > Paolo, Stefan any clues for not run setup with the BKL? > > Later, Juan. > -- Peter Xu