Avihai Horon <avih...@nvidia.com> wrote:
> On 10/05/2023 19:41, Juan Quintela wrote:

>> Does this make sense?
>
> Yes, thanks a lot for the full and detailed explanation!

Thank you.

> This indeed solves the problem in the scenario I mentioned above.
>
> However, this relies on the fact that a device's support for this
> feature depends only on the QEMU version.
> This is not the case for VFIO devices.

What a surprise :-)

Yes, I couldn't resist.

> To support explicit-switchover, a VFIO device also needs host kernel
> support for VFIO precopy, i.e., it needs to have the
> VFIO_MIGRATION_PRE_COPY flag set.
> So, theoretically we could have the following:
> - Source and destination QEMU are the same version.
> - We migrate two different VFIO devices (i.e., they don't share the
>   same kernel driver), device X and device Y.
> - Host kernel in source supports VFIO precopy for device X but not for
>   device Y.
> - Host kernel in destination supports VFIO precopy for both device X
>   and device Y.
> Without explicit-switchover, migration should work.
> But if we enable explicit-switchover and do migration, we would end up
> in the same situation where switchover_pending=2 in destination and it
> never reaches zero so migration is stuck.

I think this is too much for qemu.  You need to work at the
libvirt/management level.

> This could be solved by moving the switchover_pending counter to the
> source and sending multiple MIG_RP explicit-switchover ACK messages.
> However, I also raised a concern about this in my last mail to Peter
> [1], where this is not guaranteed to work, depending on the device
> implementation for explicit-switchover feature.

I will not try to be extra clever here.  We have taken qemu support out
of the question, as it is the same qemu on both sides.

So what we have is this configuration:

Host A
------
device X explicit_switchover=on
device Y explicit_switchover=off

Host B
------
device X explicit_switchover=on
device Y explicit_switchover=on

The configuration is different.  That is something that the qemu
migration protocol doesn't know how to handle, and it is up to the
management stack.

You need to configure this explicitly on the qemu command line on host B:
device=Y,explicit_switchover=off

Or whatever the actual syntax is to turn it off.
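
To illustrate what the management layer would have to compute, here is a
minimal sketch.  It is not libvirt or qemu code; the function names and
the dictionaries describing each host are hypothetical.  It only shows
the idea: a feature stays on for a device if both hosts support it, and
any host that supports more than that has to be told to turn it off.

```python
def common_settings(host_a, host_b):
    """Per-device setting usable on both hosts.

    Each argument maps a device name to True/False (explicit_switchover
    supported or not).  A device missing from one host counts as off.
    """
    devices = sorted(set(host_a) | set(host_b))
    return {dev: host_a.get(dev, False) and host_b.get(dev, False)
            for dev in devices}


def forced_off(host, common):
    """Devices for which this host must be explicitly configured off."""
    return sorted(dev for dev, on in host.items()
                  if on and not common.get(dev, False))


# The Host A / Host B configuration from above.
host_a = {"X": True, "Y": False}
host_b = {"X": True, "Y": True}

common = common_settings(host_a, host_b)
print(common)                      # {'X': True, 'Y': False}
print(forced_off(host_a, common))  # []
print(forced_off(host_b, common))  # ['Y']
```

With the example hosts, only device Y on host B needs an override, which
is the command line change described above.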

It is exactly the same problem as:

Host A
------

Intel CPU genX

Host B
------

Intel CPU genX-1

i.e. there are features that Host A has but Host B doesn't have.  The
only way to make this work is to launch qemu on Host A with a cpu type
that Host B is able to run (i.e. one that doesn't have any features that
Host B is missing).

What is the difference between this and yours?


> Not sure though if I'm digging too deep in some improbable future
> corner cases.

Oh, you are just starting.  Look at the compat layers that CPUs have
needed over the years.  At some point even migration between AMD and
Intel CPUs worked.

> Let's go back to the basic question, which is whether we need to send
> an "advise" message for each device that supports explicit-switchover.
> I think it gives us more flexibility and although not needed at the
> moment, might be useful in the future.

I think that is not a good idea, see my previous comment.  We have two
cases:
- both devices have the same features in both places
- they have different features in any of the places

First case, we don't care.  It always works.
Second case, we need to configure it correctly, and that means disabling
features that are not available on the other side.

> If you want I can send a v2 that addresses the comments and simplifies
> the code in other areas and we'll continue discussing the necessity of
> the "advise" message then.

Yep.  I think that is the best course of action.

Thanks, Juan.

> Thanks!
>
> [1]
> https://lore.kernel.org/qemu-devel/688acb4e-a4e6-428d-9124-7596e3666...@nvidia.com/

