On Mon, Oct 17, 2022 at 05:15:35PM -0400, Peter Xu wrote:
> On Mon, Oct 17, 2022 at 12:38:30PM +0100, Daniel P. Berrangé wrote:
> > On Mon, Oct 17, 2022 at 01:06:00PM +0530, manish.mishra wrote:
> > > Hi Daniel,
> > >
> > > I was thinking of some solutions for this so wanted to discuss them
> > > before going ahead. Also added Juan and Peter in loop.
> > >
> > > 1. Earlier I was thinking, on destination side as of now for default
> > > and multi-FD channel first data to be sent is MAGIC_NUMBER and VERSION
> > > so maybe we can decide mapping based on that. But then that does not
> > > work for newly added post copy preempt channel as it does not send
> > > any MAGIC number. Also even for multiFD just MAGIC number does not
> > > tell which multifd channel number it is, even though as per my thinking
> > > it does not matter. So MAGIC number should be good for identifying
> > > default vs multiFD channel?
> >
> > Yep, you don't need to know more than the MAGIC value.
> >
> > In migration_io_process_incoming, we need to use MSG_PEEK to look at
> > the first 4 bytes pending on the wire. If those bytes are 'QEVM' that's
> > the primary channel, if those bytes are big endian 0x11223344, that's
> > a multifd channel. Using MSG_PEEK avoids the need to modify the later
> > code that actually reads this data.
> >
> > The challenge is how long to wait with the MSG_PEEK. If we do it
> > in a blocking mode, it's fine for main channel and multifd, but
> > IIUC for the post-copy pre-empt channel we'd be waiting for
> > something that will never arrive.
> >
> > Having suggested MSG_PEEK though, this may well not work if the
> > channel has TLS present. In fact it almost definitely won't work.
> >
> > To cope with TLS migration_io_process_incoming would need to
> > actually read the data off the wire, and later methods be
> > taught to skip reading the magic.
> >
> > > 2. For post-copy preempt maybe we can initiate this channel only
> > > after we have received a request from remote e.g. remote page fault.
> > > This to me looks safest considering post-copy recovery case too.
> > > I cannot think of any dependency on post copy preempt channel which
> > > requires it to be initialised very early. Maybe Peter can confirm
> > > this.
> >
> > I guess that could work
>
> Currently all preempt code still assumes when postcopy activated it's in
> preempt mode. IIUC such a change will bring an extra phase of postcopy
> with no-preempt before preempt enabled. We may need to teach qemu to
> understand that if it's needed.
>
> Meanwhile the initial page requests will not be able to benefit from the
> new preempt channel too.
>
> >
> > > 3. Another thing we can do is to have 2-way handshake on every
> > > channel creation with some additional metadata, this to me looks
> > > like cleanest approach and durable, I understand that can break
> > > migration to/from old qemu, but then that can come as migration
> > > capability?
> >
> > The benefit of (1) is that the fix can be deployed for all existing
> > QEMU releases by backporting it. (3) will meanwhile need mgmt app
> > updates to make it work, which is much more work to deploy.
> >
> > We really should have had a more formal handshake, and I've described
> > ways to achieve this in the past, but it is quite a lot of work.
>
> I don't know whether (1) is a valid option if there are use cases that it
> cannot cover (on either tls or preempt). The handshake is definitely the
> clean approach.
>
> What's the outcome of such wrongly ordered connections? Will migration
> fail immediately and safely?
>
> For multifd, I think it should fail immediately after the connection
> established.
>
> For preempt, I'd also expect the same thing because the only wrong order to
> happen right now is having the preempt channel to be the migration channel,
> then it should also fail immediately on the first qemu_get_byte().
>
> Hopefully that's still not too bad - I mean, if we can fail constantly and
> safely (never fail during postcopy), we can always retry and as long as
> connections created successfully we can start the migration safely. But
> please correct me if it's not the case.

It should typically fail as the magic bytes are different, which will not
pass validation. The exception is the postcopy pre-empt channel, which
may well cause migration to stall as nothing will be sent initially by
the src.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
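
P.S. To make the MSG_PEEK idea quoted above a little more concrete, here is
a rough sketch in Python (not QEMU's C, purely for brevity). It peeks at the
first 4 bytes of a plain socket and compares them against the two magic
values mentioned in the thread. The function name, the return strings and
the timeout value are all made up for illustration and are not QEMU code.
The timeout stands in for the "how long to wait" problem: a channel that
never sends anything, such as the postcopy pre-empt channel, ends up in the
"silent" case. As noted above, none of this would help with TLS present.

```python
import socket
import struct

# Magic values as quoted in the thread. The constant names are hypothetical.
MAIN_MAGIC = b"QEVM"
MULTIFD_MAGIC = struct.pack(">I", 0x11223344)  # big endian on the wire


def classify_channel(sock, timeout=1.0):
    """Guess the channel type by peeking at the first 4 pending bytes.

    MSG_PEEK leaves the bytes in the socket buffer, so later code that
    reads and validates the magic still sees them.
    """
    sock.settimeout(timeout)
    try:
        head = sock.recv(4, socket.MSG_PEEK)
    except socket.timeout:
        # Nothing arrived in time, e.g. a channel whose sender stays quiet.
        return "silent"
    finally:
        # Assumption: the caller handed us a blocking socket.
        sock.settimeout(None)

    if len(head) < 4:
        # Peer closed or only part of the magic has arrived so far.
        # A real implementation would need to retry rather than give up.
        return "short"
    if head == MAIN_MAGIC:
        return "main"
    if head == MULTIFD_MAGIC:
        return "multifd"
    return "unknown"


# Usage: a connected pair stands in for source and destination.
src, dst = socket.socketpair()
src.sendall(MAIN_MAGIC + b"payload")
kind = classify_channel(dst)
first = dst.recv(4)  # the magic is still there for the normal read path
src.close()
dst.close()
```

The "short" case is deliberately left unresolved here: with MSG_PEEK a retry
loop has to be careful not to spin, since the partial data stays readable.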