Peter Xu <pet...@redhat.com> writes: > On Wed, Sep 13, 2023 at 06:53:20PM -0300, Fabiano Rosas wrote: >> Peter Xu <pet...@redhat.com> writes: >> >> > On Mon, Sep 11, 2023 at 02:13:19PM -0300, Fabiano Rosas wrote: >> >> The core yank code is strict about balanced registering and >> >> unregistering of yank functions. >> >> >> >> This creates a difficulty because the migration code registers one >> >> yank function per QIOChannel, but each QIOChannel can be referenced by >> >> more than one QEMUFile. The yank function should not be removed until >> >> all QEMUFiles have been closed. >> >> >> >> Keep a reference count of how many QEMUFiles are using a QIOChannel >> >> that has a yank function. Only unregister the yank function when all >> >> QEMUFiles have been closed. >> >> >> >> This improves the current code by removing the need for the programmer >> >> to know which QEMUFile is the last one to be cleaned up and fixes the >> >> theoretical issue of removing the yank function while another QEMUFile >> >> could still be using the ioc and require a yank. >> >> >> >> Signed-off-by: Fabiano Rosas <faro...@suse.de> >> >> --- >> >> migration/yank_functions.c | 81 ++++++++++++++++++++++++++++++++++---- >> >> migration/yank_functions.h | 8 ++++ >> >> 2 files changed, 81 insertions(+), 8 deletions(-) >> > >> > I worry this over-complicate things. >> >> It does. We ran out of simple options. >> >> > If you prefer the cleaness that we operate always on qemufile level, can we >> > just register each yank function per-qemufile? >> >> "just" hehe >> >> we could, but: >> >> i) the yank is a per-channel operation, so this is even more unintuitive; > > I mean we can provide something like: > > void migration_yank_qemufile(void *opaque) > { > QEMUFile *file = opaque; > QIOChannel *ioc = file->ioc; > > qio_channel_shutdown(ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL); > } > > void migration_qemufile_register_yank(QEMUFile *file) > { > if (migration_ioc_yank_supported(file->ioc)) { > yank_register_function(MIGRATION_YANK_INSTANCE, > migration_yank_qemufile, > file); > } > }
Sure, this is what I was thinking as well. IMO it will be yet another operation that happens on the channel, but it performed via the file. Just like qio_channel_close() at qemu_fclose(). Not the end of the world, of course, I just find it error-prone. >> >> ii) multifd doesn't have a QEMUFile, so it will have to continue using >> the ioc; > > We can keep using migration_ioc_[un]register_yank() for them if there's no > qemufile attached. As long as the function will all be registered under > MIGRATION_YANK_INSTANCE we should be fine having different yank func. > ok >> >> iii) we'll have to add a yank to every new QEMUFile created during the >> incoming migration (colo, rdma, etc), otherwise the incoming side >> will be left using iocs while the src uses the QEMUFile; > > For RDMA, IIUC it'll simply be a noop as migration_ioc_yank_supported() > will be a noop for it for either reg/unreg. > > Currently it seems we will also unreg the ioc even for RDMA (even though we > don't reg for it). But since unreg will be a noop it seems all fine even > if not paired.. maybe we should still try to pair it, e.g. register also in > rdma_start_outgoing_migration() for the rdma ioc so at least they're paired. > > I don't see why COLO is special here, though. Maybe I missed something. For colo I was thinking we'd have to register the yank just to be sure that all paths unregistering it have something to unregister. Maybe I should move the register into qemu_file_new_impl() with a matching unregister at qemu_fclose(). >> >> iv) this is a functional change of the yank feature for which we have no >> tests. > > Having yank tested should be preferrable. Lukas is in the loop, let's see > whether he has something. We can still smoke test it before a selftest > being there. > > Taking one step back.. I doubt whether anyone is using yank for migration? > Knowing that migration already have migrate-cancel (for precopy) and > migrate-pause (for postcopy). Right, both already call qio_channel_shutdown(). > I never used it myself, and I don't think > it's supported for RHEL. How's that in suse's case? Never heard mention of it and I don't see it in our virtualization documentation. > > If no one is using it, maybe we can even avoid registering migration to > yank? > Seems reasonable to me. >> >> If that's all ok to you I'll go ahead and git it a try. >> >> > I think qmp yank will simply fail the 2nd call on the qemufile if the >> > iochannel is shared with the other one, but that's totally fine, IMHO. >> > >> > What do you think? >> > >> > In all cases, we should probably at least merge patch 1-8 if that can >> > resolve the CI issue. I think all of them are properly reviewed. >> >> I agree. Someone needs to queue this though since Juan has been busy. > > Yes, I'll see what I can do. Thanks. I could even send a pull request myself if it would make things easier. Let me know.