2011/2/23 Juan Quintela <quint...@redhat.com>: > Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp> wrote: >> 2011/2/23 Juan Quintela <quint...@redhat.com>: >>> >>> Signed-off-by: Juan Quintela <quint...@redhat.com> >>> --- >>> migration.c | 20 +++++++++----------- >>> 1 files changed, 9 insertions(+), 11 deletions(-) >>> >>> diff --git a/migration.c b/migration.c >>> index f015e02..641df9f 100644 >>> --- a/migration.c >>> +++ b/migration.c >>> @@ -361,28 +361,26 @@ static void migrate_fd_put_ready(void *opaque) >>> >>> DPRINTF("iterate\n"); >>> if (qemu_savevm_state_iterate(s->mon, s->file) == 1) { >>> - int state; >>> int old_vm_running = vm_running; >>> >>> DPRINTF("done iterating\n"); >>> vm_stop(VMSTOP_MIGRATE); >>> >>> - if ((qemu_savevm_state_complete(s->mon, s->file)) < 0) { >>> - if (old_vm_running) { >>> - vm_start(); >>> - } >>> - state = MIG_STATE_ERROR; >>> + if (qemu_savevm_state_complete(s->mon, s->file) < 0) { >>> + migrate_fd_error(s); >>> } else { >>> - state = MIG_STATE_COMPLETED; >>> + if (migrate_fd_cleanup(s) < 0) { >>> + migrate_fd_error(s); >>> + } else { >>> + s->state = MIG_STATE_COMPLETED; >>> + notifier_list_notify(&migration_state_notifiers); >>> + } >>> } >>> - if (migrate_fd_cleanup(s) < 0) { >>> + if (s->get_status(s) != MIG_STATE_COMPLETED) { >>> if (old_vm_running) { >>> vm_start(); >>> } >> >> This part, although it's not fair to ask you, but calling >> vm_start when != MIG_STATE_COMPLETED terrifies me because just >> failing migrate_fd_cleanup (mostly calling qemu_fclose) may cause >> split brain between src/dst. Although I haven't encountered this >> situation, just having stopped VMs on both sides is safer. > > I see your pain. I am not happy at all, but this was integrated by > Anthony to fix this bug: > > commit 41ef56e61153d7bd27d34a634633bb51b1c5988d > Author: Anthony Liguori <aligu...@us.ibm.com> > Date: Wed Jun 2 14:55:25 2010 -0500 > > migration: respect exit status with exec: > > This fixes https://bugs.launchpad.net/qemu/+bug/391879 >
Thanks for the link. I don't know IIUC, why stopping the VM was a problem? The essential thing is that we need to introduce a flag that whether user wants to continue a VM when something goes wrong during live migration. Deciding only with old_vm_running is wrong. > I think that it fixes that bug, but it makes me un-easy to restart vm if > there is a failure in migrate_fd_cleanup(). As I didn't wanted to > change behaviour with this series, I left it as it was. I agree with keeping the behavior unchanged. > Next on ToDo list is to do something sensible with errors, just now we > are not very good at handling them. Yeah. If we introduce Kemari, the migration code becomes more important because it'll be part of the normal VM execution path :) Thanks, Yoshi > > Later, Juan. > >