* Paolo Bonzini (pbonz...@redhat.com) wrote: > Il 03/10/2014 19:47, Dr. David Alan Gilbert (git) ha scritto: > > From: "Dr. David Alan Gilbert" <dgilb...@redhat.com> > > > > Rework the migration thread to setup and start postcopy. > > > > Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com> > > --- > > include/migration/migration.h | 3 + > > migration.c | 201 > > ++++++++++++++++++++++++++++++++++++++---- > > 2 files changed, 185 insertions(+), 19 deletions(-) > > > > diff --git a/include/migration/migration.h b/include/migration/migration.h > > index b01cc17..f401775 100644 > > --- a/include/migration/migration.h > > +++ b/include/migration/migration.h
<snip> > > +/* Switch from normal iteration to postcopy > > + * Returns non-0 on error > > + */ > > +static int postcopy_start(MigrationState *ms) > > +{ > > + int ret; > > + const QEMUSizedBuffer *qsb; > > + migrate_set_state(ms, MIG_STATE_ACTIVE, MIG_STATE_POSTCOPY_ACTIVE); > > + > > + DPRINTF("postcopy_start\n"); > > + qemu_mutex_lock_iothread(); > > + DPRINTF("postcopy_start: setting run state\n"); > > + ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); > > + > > + if (ret < 0) { > > + migrate_set_state(ms, MIG_STATE_POSTCOPY_ACTIVE, MIG_STATE_ERROR); > > + qemu_mutex_unlock_iothread(); > > + return -1; > > Please use "goto" for error returns, like > > fail_locked: > qemu_mutex_unlock_iothread(); > fail: > migrate_set_state(ms, MIG_STATE_POSTCOPY_ACTIVE, MIG_STATE_ERROR); > return -1; Done; they all end up unlocking, but I've got another label for a case that has to close the fb later. > > + } > > + > > + /* > > + * in Finish migrate and with the io-lock held everything should > > + * be quiet, but we've potentially still got dirty pages and we > > + * need to tell the destination to throw any pages it's already > > received > > + * that are dirty > > + */ > > + if (ram_postcopy_send_discard_bitmap(ms)) { > > + DPRINTF("postcopy send discard bitmap failed\n"); > > + migrate_set_state(ms, MIG_STATE_POSTCOPY_ACTIVE, MIG_STATE_ERROR); > > + qemu_mutex_unlock_iothread(); > > + return -1; > > + } > > + > > + DPRINTF("postcopy_start: sending req 2\n"); > > + qemu_savevm_send_reqack(ms->file, 2); > > Perhaps move it below qemu_file_set_rate_limit, and add > trace_qemu_savevm_send_reqack? Trace added, and also moved as requested - was the request to move it just to elimintate the other DPRINTF? > Also what is 2/3/4? Is this just for debugging or is it part of the > protocol? Debug; they're very useful for matching the debug streams up, especially when the timers on the two hosts are very different. (I'm up for suggestions on how to mark the 2/3/4 for debug more clearly, especially if it meant that it didn't make the ping (ne reqack) dedicated to debug). > > + /* > > + * send rest of state - note things that are doing postcopy > > + * will notice we're in MIG_STATE_POSTCOPY_ACTIVE and not actually > > + * wrap their state up here > > + */ > > + qemu_file_set_rate_limit(ms->file, INT64_MAX); > > + DPRINTF("postcopy_start: do state_complete\n"); > > + > > + /* > > + * We need to leave the fd free for page transfers during the > > + * loading of the device state, so wrap all the remaining > > + * commands and state into a package that gets sent in one go > > + */ > > The comments in the code are very nice. Thanks. This is a huge > improvement from the last version I received. > > > + QEMUFile *fb = qemu_bufopen("w", NULL); > > + if (!fb) { > > + error_report("Failed to create buffered file"); > > + migrate_set_state(ms, MIG_STATE_POSTCOPY_ACTIVE, MIG_STATE_ERROR); > > + qemu_mutex_unlock_iothread(); > > + return -1; > > + } > > + > > + qemu_savevm_state_complete(fb); > > + DPRINTF("postcopy_start: sending req 3\n"); > > + qemu_savevm_send_reqack(fb, 3); > > + > > + qemu_savevm_send_postcopy_ram_run(fb); > > + > > + /* <><> end of stuff going into the package */ > > + qsb = qemu_buf_get(fb); > > + > > + /* Now send that blob */ > > + if (qsb_get_length(qsb) > MAX_VM_CMD_PACKAGED_SIZE) { > > + DPRINTF("postcopy_start: Unreasonably large packaged state: %lu\n", > > + (unsigned long)(qsb_get_length(qsb))); > > + migrate_set_state(ms, MIG_STATE_POSTCOPY_ACTIVE, MIG_STATE_ERROR); > > + qemu_mutex_unlock_iothread(); > > + qemu_fclose(fb); > > Close fb above migrate_set_state, and use goto as above. Or just have > three labels. Done, it's a separate label. > > > + return -1; > > + } > > + qemu_savevm_send_packaged(ms->file, qsb); > > + qemu_fclose(fb); > > + > > + qemu_mutex_unlock_iothread(); > > + > > + DPRINTF("postcopy_start not finished sending ack\n"); > > + qemu_savevm_send_reqack(ms->file, 4); > > + > > + ret = qemu_file_get_error(ms->file); > > + if (ret) { > > + error_report("postcopy_start: Migration stream errored"); > > This should have been reported already. No, sorry - I don't trust qemu_file reporting errors by itself. Dave (Again, the rest of the comments on this patch can wait for another mail) -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK