On Fri, Nov 14, 2025 at 3:32 AM Peter Xu <[email protected]> wrote:
>
> On Thu, Nov 13, 2025 at 12:46:55PM -0500, Michael S. Tsirkin wrote:
> > failing to start a perfectly good qemu which used to work
> > because you changed kernels is better than failing to migrate how?
> >
>
> I agree this is not pretty.
>
> The original proposal was to have extra features OFF by default and to
> enable them only through explicit selection, when the mgmt / user is aware
> of the possible hosts to run on.  That would guarantee:
>
> (1) explicit failure whenever some unsupported cap is chosen on boot,
>
> (2) default setup should always assume no kernel dependency hence booting
> should be all fine,
>
> (3) since all features will be by default OFF or selected by the user with
> explicit cmdlines, VM ABI is guaranteed so that migration will work.
>
> But unfortunately that proposal was rejected.
>
> >
> > graceful downgrade with old kernels is the basics of good userspace
> > behaviour and has been for decades.
> >
> >
> > sure, let's work on a solution, just erroring out is more about blaming
> > the user. what is the user supposed to do when qemu fails to start?
>
> This is indeed a good question.  With strict checks, we would at least
> want to print explicit messages that tell the user what to turn off.
>
> >
> >
> > first, formulate what exactly do you want to enable.
> >
> >
> >
> > for example, you have a set of boxes and you want a set of flags
> > to supply to guarantee qemu can migrate between them. is that it?
>
> Yes I think that's the case.
>
> That's also why I think the original proposal still makes sense (having
> all defaults OFF when dependent on the kernel): only the mgmt knows the
> details of the cluster, so it makes more sense to select from the top,
> which has the full knowledge base, and explicitly enable some sets of
> features (not only network, but also CPU feature bits and so on).  Then
> the mgmt boots the VM and also knows explicitly where it can migrate.
>
> If everything is hidden, the mgmt has almost no control over this.

+1
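
To make the mgmt-side selection concrete, here is a minimal sketch of how
a management layer might derive an explicit flag set for one migration
group. It assumes the mgmt already has a per-host capability report; the
host names, the feature names, and both helper functions are hypothetical
and are not existing QEMU properties or tools.

```python
from typing import Dict, Iterable, Set


def common_features(host_caps: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the features that every host in the group supports."""
    per_host = [set(caps) for caps in host_caps.values()]
    if not per_host:
        # No hosts known: nothing can be safely enabled.
        return set()
    return set.intersection(*per_host)


def explicit_flags(all_features: Iterable[str], enabled: Set[str]) -> Dict[str, str]:
    """Spell out every feature as on/off so nothing depends on defaults."""
    return {name: ("on" if name in enabled else "off")
            for name in sorted(all_features)}


if __name__ == "__main__":
    # Hypothetical capability reports for three hosts in one group.
    group = {
        "host-a": {"feat_x", "feat_y", "feat_z"},
        "host-b": {"feat_x", "feat_y"},
        "host-c": {"feat_x", "feat_y", "feat_z"},
    }
    known = {"feat_x", "feat_y", "feat_z"}
    print(explicit_flags(known, common_features(group)))
```

Only the intersection is turned on, so a VM booted with these flags sees
the same feature set on any host in the group, and an unsupported feature
is never requested at boot.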

>
> That was rejected because of the need to enable new features by default
> whenever possible.  In that case, IMHO Jason's solution is spot on: it
> sits in the middle ground, allowing both to happen (auto-enabling new
> features while keeping VM ABI stability).

Yes.

>
> So IIUC there will be a cluster that may contain different groups of hosts;
> each group should have similar setups so that VMs can freely migrate
> between hosts within the same group (but may not be easily migratable
> across groups?).  But I don't think I know that part well in practice.

To support this, we may need to develop tools somewhere to report TAP
capabilities. Or, as replied in another thread, we could develop software
fallbacks for new features, but that seems like a burden.

>
> Dan might be a great source of input from that level.
>
> Thanks,
>
> --
> Peter Xu
>

Thanks

