On 6/3/2021 3:36 PM, Dr. David Alan Gilbert wrote: > * Steven Sistare (steven.sist...@oracle.com) wrote: >> On 5/24/2021 6:39 AM, Dr. David Alan Gilbert wrote: >>> * Steven Sistare (steven.sist...@oracle.com) wrote: >>>> On 5/20/2021 9:13 AM, Dr. David Alan Gilbert wrote: >>>>> On the 'restart' branch of questions; can you explain, >>>>> other than the passing of the fd's, why the outgoing side of >>>>> qemu's 'migrate exec:' doesn't work for you? >>>> >>>> I'm not sure what I should describe. Can you be more specific? >>>> Do you mean: can we add the cpr specific bits to the migrate exec code? >>> >>> Yes; if possible I'd prefer to just keep the one exec mechanism. >>> It has an advantage of letting you specify the new command line; that >>> avoids the problems I'd pointed out with needing to change the command >>> line if a hotplug had happened. It also means we only need one chunk of >>> exec code. >> >> [...] > > I'm not quite sure what I want in the incoming there, but that is > already the source execing the destination qemu - although I think we'd > probably need to check if that's actually via an intermediary. > >> We could shoehorn cpr restart into the migrate exec path by defining a new >> migration >> capability that the client would set before issuing the migrate command. >> However, the >> implementation code would be sprinkled with conditionals to suppress >> migrate-only bits >> and call cpr-only bits. IMO that would be less maintainable than having a >> separate >> cprsave function. Currently cprsave does not duplicate any migration >> functionality. >> cprsave calls qemu_save_device_state() which is used by xen. > > To me it feels like cprsave in particular is replicating more code.
In the attached file I annotated lines of code that have some overlap with migration code actions. They include vm control, global_state_store, and vmstate save, and cover 18 lines of 78 total. I did not include the body of qf_file_open because it is also called by xen. The migration code adds capabilities, parameters, state, status, info, precopy, postcopy, dirty bitmap, events, notifiers, 6 channel types, blockers, pause+resume, return path, request-reply commands, throttling, colo, blocks, phases, iteration, and threading, implemented by 20000+ lines of code. To me it seems wrong to throw cpr into that mix to avoid adding tens of lines of similar code. - Steve
void cprsave(const char *file, CprMode mode, Error **errp) { int ret = 0; ** QEMUFile *f; ** int saved_vm_running = runstate_is_running(); bool restart = (mode == CPR_MODE_RESTART); bool reboot = (mode == CPR_MODE_REBOOT); if (reboot && qemu_ram_volatile(errp)) { return; } if (restart && xen_enabled()) { error_setg(errp, "xen does not support cprsave restart"); return; } if (migrate_colo_enabled()) { error_setg(errp, "error: cprsave does not support x-colo"); return; } if (replay_mode != REPLAY_MODE_NONE) { error_setg(errp, "error: cprsave does not support replay"); return; } f = qf_file_open(file, O_CREAT | O_WRONLY | O_TRUNC, 0600, "cprsave", errp); if (!f) { return; } ** ret = global_state_store(); ** if (ret) { ** error_setg(errp, "Error saving global state"); ** qemu_fclose(f); ** return; ** } if (runstate_check(RUN_STATE_SUSPENDED)) { /* Update timers_state before saving. Suspend did not so do. */ cpu_disable_ticks(); } ** vm_stop(RUN_STATE_SAVE_VM); cpr_is_active = true; ** ret = qemu_save_device_state(f); ** qemu_fclose(f); ** if (ret < 0) { ** error_setg(errp, "Error %d while saving VM state", ret); ** goto err; ** } if (reboot) { shutdown_action = SHUTDOWN_ACTION_POWEROFF; qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN); } else if (restart) { if (!qemu_chr_cpr_capable(errp)) { goto err; } if (vfio_cprsave(errp)) { goto err; } walkenv(FD_PREFIX, preserve_fd, 0); vhost_dev_reset_all(); qemu_term_exit(); setenv("QEMU_START_FREEZE", "", 1); qemu_system_exec_request(); } goto done; err: ** if (saved_vm_running) { ** vm_start(); ** } done: cpr_is_active = false; return; }