On Thu, Jun 22, 2023 at 11:54:43AM -0400, Peter Xu wrote:
> I can try to move the todo even higher.  Trying to list the initial goals
> here:
>
> - One extra phase of handshake between src/dst (maybe the time to boost
>   QEMU_VM_FILE_VERSION) before anything else happens.
>
> - Dest shouldn't need to apply any cap/param, it should get all from src.
>   Dest still need to be setup with an URI and that should be all it needs.
>
> - Src shouldn't need to worry on the binary version of dst anymore as long
>   as dest qemu supports handshake, because src can fetch it from dest.

I'm not sure that works in general. Even if we have a handshake and
bi-directional comms for live migration, we still have the save/restore
to file codepath to deal with. The dst QEMU doesn't exist at the time
the save process is done, so we can't add logic to VMState handling that
assumes knowledge of the dst version at time of serialization.

> - Handshake can always fail gracefully if anything wrong happened, it
>   normally should mean dest qemu is not compatible with src's setup (either
>   machine, device, or migration configs) for whatever reason.  Src should
>   be able to get a solid error from dest if so.
>
> - Handshake protocol should always be self-bootstrap-able, it means when we
>   change the handshake protocol it should always works with old binaries.
>
>   - When src is newer it should be able to know what's missing on dest and
>     skip the new bits.
>
>   - When dst is newer it should all rely on src (which is older) and it
>     should always understand src's language.

I'm not convinced it can reliably self-bootstrap in a backwards
compatible manner, precisely because the current migration stream
has no handshake and only requires a unidirectional channel. I don't
think it's possible for QEMU to validate that it has a fully
bi-directional channel without adding timeouts to its detection,
which I think we should strive to avoid.

I don't think we actually need self-bootstrapping anyway.

I think the mgmt app can just indicate the new v2 bi-directional
protocol when issuing the 'migrate' and 'migrate-incoming'
commands. This becomes trivial when Het's refactoring of the
migrate address QAPI is accepted:

  https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg04851.html

eg:

    { "execute": "migrate",
      "arguments": {
          "channels": [ { "channeltype": "main",
                          "addr": { "transport": "socket", "type": "inet",
                                    "host": "10.12.34.9",
                                    "port": "1050" } } ] } }

note the 'channeltype' parameter here.

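To illustrate how a mgmt app might use that field to pick the protocol,
here is a rough Python sketch. It only builds the JSON in the shape of
the proposed QAPI and sends nothing. The helper name
build_migrate_command and the "main-v2" value are placeholders for
illustration, not names defined by QEMU or by the QAPI series.

```python
import json


def build_migrate_command(channeltype, host, port):
    """Build a QMP 'migrate' command dict in the shape of the example above.

    Hypothetical helper: it only assembles JSON, it does not talk to QEMU.
    """
    return {
        "execute": "migrate",
        "arguments": {
            "channels": [
                {
                    "channeltype": channeltype,
                    "addr": {
                        "transport": "socket",
                        "type": "inet",
                        "host": host,
                        "port": str(port),  # the example passes port as a string
                    },
                }
            ]
        },
    }


# Existing protocol, matching the example above.
legacy = build_migrate_command("main", "10.12.34.9", 1050)

# ASSUMPTION: "main-v2" is a made-up placeholder for whatever new
# channeltype value would be chosen to signal the handshake protocol.
V2_CHANNELTYPE = "main-v2"
v2 = build_migrate_command(V2_CHANNELTYPE, "10.12.34.9", 1050)

print(json.dumps(legacy))
print(json.dumps(v2))
```

The point is that the choice of protocol is a single explicit value
supplied by the caller, so nothing has to be probed on the wire.
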
If we declare that 'main' refers to the existing migration protocol,
then we merely need to define a new 'channeltype' to use as an
indicator for the v2 migration handshake protocol.

> - All !main channels need to be established later than the handshake - if
>   we're going to do this anyway we probably should do it altogether to make
>   channels named, so each channel used in migration needs to have a common
>   header.  Prepare to deprecate the old tricks of channel orderings.

Once the primary channel involves a bi-directional handshake,
we'll trivially ensure ordering - similar to how the existing
code worked fine in TLS mode, which had a bi-directional TLS
handshake.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|