Peter Xu <pet...@redhat.com> writes: > On Fri, Aug 11, 2023 at 12:08:36PM -0300, Fabiano Rosas wrote: >> We currently have a pattern for cleaning up a migration QEMUFile: >> >> qemu_mutex_lock(&s->qemu_file_lock); >> file = s->file_name; >> s->file_name = NULL; >> qemu_mutex_unlock(&s->qemu_file_lock); >> >> migration_ioc_unregister_yank_from_file(file); >> qemu_file_shutdown(file); >> qemu_fclose(file); >> >> There are some considerations for this sequence: >> >> - we must clear the pointer under the lock, to avoid TOC/TOU bugs; >> - the shutdown() and close() expect be given a non-null parameter; >> - a close() in one thread should not race with a shutdown() in another; >> >> Create a wrapper function to make sure everything works correctly. >> >> Note: the return path did not used to call >> migration_ioc_unregister_yank_from_file(), but I added it >> nonetheless for uniformity. >> >> Signed-off-by: Fabiano Rosas <faro...@suse.de> > > This definitely looks cleaner. Probably can be squashed together with > previous patch? If you could double check whether we can just drop the > shutdown() all over the places when close() altogether, it'll be even > nicer (I hope I didn't miss any real reasons to explicitly do that). > >> diff --git a/util/yank.c b/util/yank.c >> index abf47c346d..4b6afbf589 100644 >> --- a/util/yank.c >> +++ b/util/yank.c >> @@ -146,8 +146,6 @@ void yank_unregister_function(const YankInstance >> *instance, >> return; >> } >> } >> - >> - abort(); > > I think we can't silently do this. This check is very strict and I guess > you removed it because you hit a crash. What's the crash? Can we just > pair the yank reg/unreg? >
Well, the abort() is the crash. It just means that we looped and didn't find the handler to unregister. It looks harmless to me. I should have mentioned this in the commit message. I could certainly add a yank handler to the rp_state.from_dst_file. But then I have no idea what will happen if we try to yank the return path at a random moment. Side note: I see that yank does a qio_channel_shutdown() without the controversial setting of -EIO. Which means it is probably succeptible to the same race described in the qemu_file_shutdown() code.