Avihai Horon <avih...@nvidia.com> wrote:
> On 03/05/2023 1:49, Peter Xu wrote:

>> Only until READY msg received on src could src switchover the precopy to
>> dst.
>>
>> Then it only needs 1 more field in SaveVMHandlers rather than 3, and only 1
>> more msg (dst->src).
>>
>> This is based on the fact that right now we always set caps on both qemus
>> so I suppose it already means either both have or don't have the feature
>> (even if one has, not setting the cap means disabled on both).
>>
>> Would it work for this case and cleaner?
>
> Hi Peter, thanks for the response!
> Your approach is indeed much simpler, however I have a few concerns
> regarding compatibility.
>
> You are saying that caps are always set both in src and dest.
> But what happens if we set the cap only on one side?
> Should we care about these scenarios?

Not really.
We assume that something like libvirt has set things up correctly and
that such things are ok.  We don't try to detect that kind of thing in
the migration stream (I am not saying we shouldn't, just that we don't).

If you configure qemu with a disk that is present on the source but not
on the destination, migration will work fine until the device copy
stage, when that disk turns out to be missing.  I think this is
something like that: a misconfiguration.

> For example, if we set the cap only in src, then src will wait
> indefinitely for dest to notify that switchover is ready.
> Would you expect migration to fail instead of just keep running
> indefinitely?
> In current approach we only need to enable the cap in the source, so
> such scenario can't happen.

I see.  I have to think about whether this is a better approach.  But I
would like to know what libvirt thinks about this.

Daniel?


> Let's look at some other scenario.
> Src QEMU supports explicit-switchover for device X but *not* for
> device Y (i.e., src QEMU is some older version of QEMU that supports
> explicit-switchover for device X but not for Y).
> Dest QEMU supports explicit-switchover for device X and device Y.
> The capability is set in both src and dest.
> In the destination we will have switchover_pending=2 because both X
> and Y support explicit-switchover.
> We do migration, but switchover_pending will never reach 0 because
> only X supports it in the source, so the migration will run
> indefinitely.
> The per-device handshake solves this by making device Y not use
> explicit-switchover in this case.
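
To make the scenario above concrete, here is a minimal sketch in plain C.
It is only a toy model: struct dev_model and remaining_pending() are
hypothetical names, not QEMU code, and it assumes that the destination
counter only drops for devices that both sides support.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one device; not a QEMU structure. */
struct dev_model {
    const char *name;
    bool src_supports;   /* source QEMU handles explicit-switchover */
    bool dst_supports;   /* destination QEMU handles explicit-switchover */
};

/*
 * Value left in the destination's switchover_pending counter once every
 * device that can acknowledge has done so.  Non-zero means the source
 * would wait forever.
 */
static int remaining_pending(const struct dev_model *devs, size_t n,
                             bool per_device_handshake)
{
    int pending = 0;
    int acked = 0;

    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        bool both = devs[i].src_supports && devs[i].dst_supports;

        /* Capability only: destination counts every device it supports. */
        /* Handshake: destination counts only devices both sides support. */
        if (per_device_handshake ? both : devs[i].dst_supports) {
            pending++;
        }
        /* An ack can only complete when both sides implement it. */
        if (both) {
            acked++;
        }
    }
    return pending - acked;
}
```

With X supported on both sides and Y only on the destination, the
capability-only model leaves one pending entry, while the handshake model
reaches zero.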

You have a point here.
But I will approach this case in a different way:

Destination QEMU needs to be older, because it doesn't have the feature.
So we need to NOT be able to do the switchover for older machine
types, and have something like this in qemu/hw/core/machine.c:

GlobalProperty hw_compat_7_2[] = {
    { "our_device", "explicit-switchover", "off" },
};

Or whatever we want to call the device and the property, so that it is
not used for older machine types and migration keeps working for them.
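
As a rough illustration of what such a compat entry buys us, here is a
self-contained sketch.  struct compat_prop is a simplified stand-in for
GlobalProperty, and explicit_switchover_enabled() is a hypothetical
helper, not QEMU's actual property machinery.  It assumes the property
defaults to "on" for new machine types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for GlobalProperty: driver, property, value. */
struct compat_prop {
    const char *driver;
    const char *property;
    const char *value;
};

/* Same entry as in the mail; "our_device" is a placeholder name. */
static const struct compat_prop compat_7_2[] = {
    { "our_device", "explicit-switchover", "off" },
};

/*
 * Return whether explicit-switchover is enabled for a driver after the
 * compat entries of the chosen machine type have been applied.
 */
static bool explicit_switchover_enabled(const char *driver,
                                        const struct compat_prop *compat,
                                        size_t n)
{
    bool enabled = true;   /* assumed default on new machine types */

    assert(driver != NULL);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(compat[i].driver, driver) == 0 &&
            strcmp(compat[i].property, "explicit-switchover") == 0) {
            enabled = strcmp(compat[i].value, "off") != 0;
        }
    }
    return enabled;
}
```

An old machine type passes compat_7_2 and gets the feature switched off
on both sides; a new machine type passes no entries and keeps the
default.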

That said, this is the "ideal" world.  In general we don't enforce
this because we are not good at detecting this kind of failure.

Later, Juan.

