* Fei Li (f...@suse.com) wrote: > > > On 10/31/2018 03:49 AM, Dr. David Alan Gilbert wrote: > > * Fei Li (f...@suse.com) wrote: > > > Add error handling for qemu_ram_foreach_migratable_block() when > > > it fails. > > > > > > Always call migrate_set_error() to set the error state without relying > > > on whether multifd_save_cleanup() succeeds. As the passed &local_err > > > is never used in multifd_save_cleanup(), remove it. > > > > > > Signed-off-by: Fei Li <f...@suse.com> > > > --- > > > migration/migration.c | 5 +---- > > > migration/postcopy-ram.c | 3 +++ > > > migration/ram.c | 7 +++---- > > > migration/ram.h | 2 +- > > > 4 files changed, 8 insertions(+), 9 deletions(-) > > > > > > diff --git a/migration/migration.c b/migration/migration.c > > > index 87dfc7374f..3b8b7ab4f9 100644 > > > --- a/migration/migration.c > > > +++ b/migration/migration.c > > > @@ -1377,7 +1377,6 @@ static void migrate_fd_cleanup(void *opaque) > > > qemu_savevm_state_cleanup(); > > > if (s->to_dst_file) { > > > - Error *local_err = NULL; > > > QEMUFile *tmp; > > > trace_migrate_fd_cleanup(); > > > @@ -1388,9 +1387,7 @@ static void migrate_fd_cleanup(void *opaque) > > > } > > > qemu_mutex_lock_iothread(); > > > - if (multifd_save_cleanup(&local_err) != 0) { > > > - error_report_err(local_err); > > > - } > > > + multifd_save_cleanup(); > > > qemu_mutex_lock(&s->qemu_file_lock); > > > tmp = s->to_dst_file; > > > s->to_dst_file = NULL; > > > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c > > > index e5c02a32c5..4ca2bc02b3 100644 > > > --- a/migration/postcopy-ram.c > > > +++ b/migration/postcopy-ram.c > > > @@ -1117,6 +1117,9 @@ int > > > postcopy_ram_enable_notify(MigrationIncomingState *mis) > > > /* Mark so that we get notified of accesses to unwritten areas */ > > > if (qemu_ram_foreach_migratable_block(ram_block_enable_notify, > > > mis)) { > > > + error_report("ram_block_enable_notify failed"); > > > + close(mis->userfault_event_fd); > > > + close(mis->userfault_fd); > > I don't think these close() calls are safe. This code is just after > > starting the fault thread, and the fault thread has a poll() call on > > these fd's, so we can't close them until we've instructed that thread > > to exit. > > > > We should fall out through postcopy_ram_incoming_cleanup, and because > > the thread exists it should do a notify to the thread, a join and then > > only later do the close calls. > > > > Dave > I see the postcopy_ram_incoming_cleanup() already include the > notify & join the fault thread & close these two fds and other more cleanup, > thus that means directly replace the above two close() with > postcopy_ram_incoming_cleanup() is ok, right? :)
Yes; I think I'd do that in loadvm_postcopy_handle_listen perhaps. It's a weird case, normally if it fails before 'listen' then we call cleanup in process_incoming_migration_co, but after listen we call it at the bottom of the listen thread. Here we've got a half-state where we're trying and failing to enter listen. > BTW, does the parameter "&local_err" for multifd_save_cleanup(&local_err) is > reserved for some reason? As I notice no code is using this error parameter, > and if it is reserved for nothing I'd like to delete it just as the second > paragraph > of the commit message said. :) Yes, that's OK; it might be better as a separate patch; I suspect maybe in an earlier version it had an error case. Dave > Have a nice day, thanks > Fei > > > > > return -1; > > > } > > > diff --git a/migration/ram.c b/migration/ram.c > > > index 8f03afe228..57cb1bee3d 100644 > > > --- a/migration/ram.c > > > +++ b/migration/ram.c > > > @@ -922,7 +922,7 @@ static void multifd_send_terminate_threads(Error *err) > > > } > > > } > > > -int multifd_save_cleanup(Error **errp) > > > +int multifd_save_cleanup(void) > > > { > > > int i; > > > int ret = 0; > > > @@ -1082,9 +1082,8 @@ static void multifd_new_send_channel_async(QIOTask > > > *task, gpointer opaque) > > > if (qio_task_propagate_error(task, &local_err)) { > > > migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED); > > > - if (multifd_save_cleanup(&local_err) != 0) { > > > - migrate_set_error(migrate_get_current(), local_err); > > > - } > > > + multifd_save_cleanup(); > > > + migrate_set_error(migrate_get_current(), local_err); > > > } else { > > > p->c = QIO_CHANNEL(sioc); > > > qio_channel_set_delay(p->c, false); > > > diff --git a/migration/ram.h b/migration/ram.h > > > index 046d3074be..0d1215209e 100644 > > > --- a/migration/ram.h > > > +++ b/migration/ram.h > > > @@ -43,7 +43,7 @@ uint64_t ram_bytes_remaining(void); > > > uint64_t ram_bytes_total(void); > > > int multifd_save_setup(void); > > > -int multifd_save_cleanup(Error **errp); > > > +int multifd_save_cleanup(void); > > > int multifd_load_setup(void); > > > int multifd_load_cleanup(Error **errp); > > > bool multifd_recv_all_channels_created(void); > > > -- > > > 2.13.7 > > > > > -- > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK > > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK