On Fri, Jul 26, 2024 at 10:43:42AM -0400, Peter Xu wrote:
> On Fri, Jul 26, 2024 at 09:48:02AM +0100, Daniel P. Berrangé wrote:
> > On Fri, Jul 26, 2024 at 09:03:24AM +0200, Thomas Huth wrote:
> > > On 26/07/2024 08.08, Michael S. Tsirkin wrote:
> > > > On Thu, Jul 25, 2024 at 06:18:20PM -0400, Peter Xu wrote:
> > > > > On Tue, Aug 01, 2023 at 01:31:48AM +0300, Yuri Benditovich wrote:
> > > > > > USO features of virtio-net device depend on kernel ability
> > > > > > to support them, for backward compatibility by default the
> > > > > > features are disabled on 8.0 and earlier.
> > > > > > 
> > > > > > Signed-off-by: Yuri Benditovich <yuri.benditov...@daynix.com>
> > > > > > Signed-off-by: Andrew Melnychecnko <and...@daynix.com>
> > > > > 
> > > > > > Looks like this patch broke migration when the VM starts on a host
> > > > > > that has USO supported, to another host that doesn't..
> > > > 
> > > > This was always the case with all offloads. The answer at the moment is,
> > > > don't do this.
> > > 
> > > May I ask for my understanding:
> > > "don't do this" = don't automatically enable/disable virtio features in 
> > > QEMU
> > > depending on host kernel features, or "don't do this" = don't try to 
> > > migrate
> > > between machines that have different host kernel features?
> > > 
> > > > Long term, we need to start exposing management APIs
> > > > to discover this, and management has to disable unsupported features.
> > > 
> > > Ack, this likely needs some treatments from the libvirt side, too.
> > 
> > When QEMU automatically toggles machine type features based on host
> > kernel, relying on libvirt to then disable them again is impractical,
> > as we cannot assume that the libvirt people are using knows about
> > newly introduced features. Even if libvirt is updated to know about
> > it, people can easily be using a previous libvirt release.
> > 
> > QEMU itself needs to make the machine types do what they are there
> > to do, which is to define a stable machine ABI.
> > 
> > What QEMU is missing here is a "platform ABI" concept, to encode
> > sets of features which are tied to specific platform generations.
> > As long as we don't have that we'll keep having these broken
> > migration problems from machine types dynamically changing instead
> > of providing a stable guest ABI.
> 
> Any more elaboration on this idea?  Would it be easily feasible in
> implementation?

In terms of launching QEMU I'd imagine:

  $QEMU -machine pc-q35-9.1 -platform linux-6.9 ...args...

Any virtual machine HW features which are tied to host kernel features
would have their defaults set based on the requested -platform. The
-machine will be fully invariant wrt the host kernel.
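
As a rough sketch of what I mean (purely illustrative: -platform does not
exist, the table below is hypothetical, and the per-platform values are
made up rather than a statement about what any kernel supports), the
defaults would come from a static lookup keyed on the platform name, with
no probing of the host kernel at all:

```python
# Hypothetical sketch: neither -platform nor this table exists in QEMU today.
# The platform names and the values assigned to them are illustrative only.
PLATFORM_DEFAULTS = {
    "linux-6.1": {"host_uso": False, "guest_uso4": False, "guest_uso6": False},
    "linux-6.9": {"host_uso": True, "guest_uso4": True, "guest_uso6": True},
}


def resolve_defaults(platform, overrides=None):
    """Return property defaults for a platform, never probing the host."""
    if platform not in PLATFORM_DEFAULTS:
        raise ValueError(f"unknown platform: {platform}")
    props = dict(PLATFORM_DEFAULTS[platform])
    props.update(overrides or {})  # explicit user settings still win
    return props
```

The same platform name gives the same defaults on every host, which is
the property the machine types are missing today.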

You would have -platform help to list available platforms, and a
corresponding QMP "query-platforms" command to list which platforms
are supported on a given host OS.

Downstream distros can provide their own platform definitions
(eg "linux-rhel-9.5") if they have kernels whose feature set
diverges from upstream due to backports.

Mgmt apps won't need to be taught about every single little QEMU
setting whose default is derived from the kernel. Individual
defaults are opaque and controlled by the requested platform.

Live migration has clearly defined semantics, and a mgmt app can
use query-platforms to validate that two hosts are compatible.
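
A minimal sketch of that check on the mgmt app side, assuming the proposed
query-platforms reply can be reduced to a plain list of platform names per
host (the command is only a proposal and the function names here are
hypothetical):

```python
# Hypothetical sketch: assumes each host's "query-platforms" reply has been
# reduced to a list of platform name strings. No such QMP command exists yet.

def migration_compatible(vm_platform, dst_platforms):
    """A running VM can move if the destination supports its platform."""
    return vm_platform in set(dst_platforms)


def common_platforms(src_platforms, dst_platforms):
    """Platforms usable on both hosts, kept in the source host's order."""
    dst = set(dst_platforms)
    return [p for p in src_platforms if p in dst]
```

The mgmt app compares opaque names only. It never needs to know which
individual device properties a platform controls.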

Omitting -platform should pick the very latest platform that is
compatible with the current host (not necessarily the latest
platform built in to QEMU).
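
One way the default selection could work, as a sketch only: each built-in
platform declares the host features it needs, and QEMU walks the list from
newest to oldest. The platform names, the feature names and the idea of
modelling the host as a set of feature strings are all assumptions made for
illustration.

```python
# Hypothetical sketch: built-in platforms ordered oldest to newest, each with
# the host kernel features it requires. All names here are illustrative.
BUILTIN_PLATFORMS = [
    ("linux-6.1", frozenset()),
    ("linux-6.9", frozenset({"tun-uso"})),
    ("linux-6.12", frozenset({"tun-uso", "example-future-feature"})),
]


def pick_default_platform(host_features):
    """Return the newest built-in platform the current host can satisfy."""
    host = frozenset(host_features)
    for name, required in reversed(BUILTIN_PLATFORMS):
        if required <= host:  # every required feature is present
            return name
    raise LookupError("no built-in platform is compatible with this host")
```

With these made-up definitions, a host offering only "tun-uso" would get
linux-6.9 even though linux-6.12 is built in.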


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

