Peter Xu <pet...@redhat.com> wrote: > It was broken due to recent changes in two parts: > > - migration_incoming_process() will be called now even for postcopy > recovery sessions (while we shouldn't) > > - Now we don't call migrate_fd_process_incoming() any more (unless in > RDMA code), so actually the postcopy recovery logic is fully bypassed > > Fix this up to make sure we only call migration_incoming_process() when > necessary, and move the postcopy recovery logic far earlier to the entry > point of incoming migration. Renaming migration_fd_process_incoming() > into postcopy_try_recover() since it's mostly for the recovery process, > then touch up RDMA code to suite it. > > Since at it, refactor the imcoming port handling to only have one singe > entry point for incoming migration. Then we can avoid calling > migration_ioc_process_incoming() everywhere, which is really error > prone.
Perhaps splitting some of this bits? > diff --git a/migration/ram.h b/migration/ram.h > index d386f4d641..457bf54b8c 100644 > --- a/migration/ram.h > +++ b/migration/ram.h > @@ -46,7 +46,7 @@ int multifd_save_cleanup(Error **errp); > int multifd_load_setup(void); > int multifd_load_cleanup(Error **errp); > bool multifd_recv_all_channels_created(void); > -void multifd_recv_new_channel(QIOChannel *ioc); > +bool multifd_recv_new_channel(QIOChannel *ioc); I am ok with this type change. > void migration_ioc_process_incoming(QIOChannel *ioc) > { > MigrationIncomingState *mis = migration_incoming_get_current(); > + QEMUFile *f = qemu_fopen_channel_input(ioc); We are creating a QEMUFile here. > + bool start_migration; > + > + /* If it's a recovery attempt, we're done */ > + if (postcopy_try_recover(f)) { Here we use it. > + return; > + } > > if (!mis->from_src_file) { > - QEMUFile *f = qemu_fopen_channel_input(ioc); > + /* The first connection (multifd may have multiple) */ > migration_incoming_setup(f); Here we use it. > - return; > + /* > + * Common migration only needs one channel, so we can start > + * right now. Multifd needs more than one channel, we wait. > + */ > + start_migration = !migrate_use_multifd(); > + } else { > + /* Multiple connections */ > + assert(migrate_use_multifd()); > + start_migration = multifd_recv_new_channel(ioc); But here we don't use it. We are lecking it. So, we need to fix the leak. No clue about how to convince postcopy_try_recover() without the QEMUFile Later, Juan.