On Tue, Nov 27, 2018 at 06:25:40PM +0800, Wei Wang wrote: > On 11/27/2018 03:38 PM, Peter Xu wrote: > > On Thu, Nov 15, 2018 at 06:08:01PM +0800, Wei Wang wrote: > > > +typedef enum PrecopyNotifyReason { > > > + PRECOPY_NOTIFY_ERR = 0, > > > + PRECOPY_NOTIFY_START_ITERATION = 1, > > > + PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP = 2, > > > + PRECOPY_NOTIFY_AFTER_SYNC_BITMAP = 3, > > > + PRECOPY_NOTIFY_MAX = 4, > > It would be nice to add some comments for each of the notify reason. > > E.g., from the name PRECOPY_NOTIFY_START_ITERATION seems more like a > > hook at the start of each iteration but according to [1] it should be > > at the start of migration rather than each iteration (or when > > migration restarts, though I'm not sure whether we really have this > > yet). > > OK. I think It would be better if the name itself could be straightforward. > Probably we could change PRECOPY_NOTIFY_START_ITERATION to > PRECOPY_NOTIFY_START_MIGRATION.
Sounds good. > > > > > +} PrecopyNotifyReason; > > > + > > > +void precopy_infrastructure_init(void); > > > +void precopy_add_notifier(Notifier *n); > > > +void precopy_remove_notifier(Notifier *n); > > > + > > > void ram_mig_init(void); > > > void qemu_guest_free_page_hint(void *addr, size_t len); > > > diff --git a/migration/ram.c b/migration/ram.c > > > index 229b791..65b1223 100644 > > > --- a/migration/ram.c > > > +++ b/migration/ram.c > > > @@ -292,6 +292,8 @@ struct RAMState { > > > bool ram_bulk_stage; > > > /* How many times we have dirty too many pages */ > > > int dirty_rate_high_cnt; > > > + /* ram save states used for notifiers */ > > > + int ram_save_state; > > This can be removed? > > Yes, thanks. > > > > > > /* these variables are used for bitmap sync */ > > > /* last time we did a full bitmap_sync */ > > > int64_t time_last_bitmap_sync; > > > @@ -328,6 +330,28 @@ typedef struct RAMState RAMState; > > > static RAMState *ram_state; > > > +static NotifierList precopy_notifier_list; > > > + > > > +void precopy_infrastructure_init(void) > > > +{ > > > + notifier_list_init(&precopy_notifier_list); > > > +} > > > + > > > +void precopy_add_notifier(Notifier *n) > > > +{ > > > + notifier_list_add(&precopy_notifier_list, n); > > > +} > > > + > > > +void precopy_remove_notifier(Notifier *n) > > > +{ > > > + notifier_remove(n); > > > +} > > > + > > > +static void precopy_notify(PrecopyNotifyReason reason) > > > +{ > > > + notifier_list_notify(&precopy_notifier_list, &reason); > > > +} > > > + > > > uint64_t ram_bytes_remaining(void) > > > { > > > return ram_state ? (ram_state->migration_dirty_pages * > > > TARGET_PAGE_SIZE) : > > > @@ -1642,6 +1666,8 @@ static void migration_bitmap_sync(RAMState *rs) > > > int64_t end_time; > > > uint64_t bytes_xfer_now; > > > + precopy_notify(PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP); > > > + > > > ram_counters.dirty_sync_count++; > > > if (!rs->time_last_bitmap_sync) { > > > @@ -1699,6 +1725,8 @@ static void migration_bitmap_sync(RAMState *rs) > > > if (migrate_use_events()) { > > > qapi_event_send_migration_pass(ram_counters.dirty_sync_count); > > > } > > > + > > > + precopy_notify(PRECOPY_NOTIFY_AFTER_SYNC_BITMAP); > > > } > > > /** > > > @@ -2555,6 +2583,8 @@ static void ram_state_reset(RAMState *rs) > > > rs->last_page = 0; > > > rs->last_version = ram_list.version; > > > rs->ram_bulk_stage = true; > > > + > > > + precopy_notify(PRECOPY_NOTIFY_START_ITERATION); > > [1] > > > > > } > > > #define MAX_WAIT 50 /* ms, half buffered_file limit */ > > > @@ -3324,6 +3354,7 @@ out: > > > ret = qemu_file_get_error(f); > > > if (ret < 0) { > > > + precopy_notify(PRECOPY_NOTIFY_ERR); > > Could you show me which function is this line in? > > > > Line 3324 here is ram_save_complete(), but I cannot find this exact > > place. > > Sure, it's in ram_save_iterate(): > ... > out: > qemu_put_be64(f, RAM_SAVE_FLAG_EOS); > qemu_fflush(f); > ram_counters.transferred += 8; > > ret = qemu_file_get_error(f); > if (ret < 0) { > + precopy_notify(PRECOPY_NOTIFY_ERR); > return ret; > } > > return done; > } Ok thanks. Please just make sure you will capture all the error cases, e.g., I also see path like this (a few lines below): if (pages < 0) { qemu_file_set_error(f, pages); break; } It seems that you missed that one. I would even suggest that you capture the error with higher level. E.g., in migration_iteration_run() after qemu_savevm_state_iterate(). Or we can just check the return value of qemu_savevm_state_iterate(), which we have had ignored so far. [1] > > > > > > Another thing to mention about the "reasons" (though I see it more > > like "events"): have you thought about adding a PRECOPY_NOTIFY_END? > > It might help in some cases: > > > > - then you don't need to trickily export the migrate_postcopy() > > since you'll notify that before postcopy starts > > I'm thinking probably we don't need to export migrate_postcopy even now. > It's more like a sanity check, and not needed because now we have the > notifier registered to the precopy specific callchain, which has ensured > that > it is invoked via precopy. But postcopy will always start with precopy, no? > > > - you'll have a solid point that you'll 100% guarantee that we'll > > stop the free page hinting and don't need to worry about whether > > there is chance the hinting will be running without an end [2]. > > Thanks, I think it makes sense. Plan to add PRECOPY_NOTIFY_END in > ram_save_complete. Yeah you can. Btw, if you're mostly adding the notifies only in RAM-only codes, then you can consider add the "RAM" into the names of events too to be clear. My suggestion at [1] is precopy general, but you can still capture it at the end of ram_save_iterate, then they are RAM-only again. Please feel free to choose what fits more... > > > > > > Regarding [2] above: now the series only stops the hinting when > > PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP is notified. Could there be a case > > that it's missing? E.g., what if we cancel/fail a migration during > > precopy? Have you tried it? > > > > I think it has been handled by the above PRECOPY_NOTIFY_ERR Indeed it should, as long as you're covering all the error cases. Thanks, -- Peter Xu