This is a very lengthy discussion, so apologies if I repeat something
from one of the various threads. I have read a lot of these
discussions and I'm still somewhat confused about what this is all
about...

On Wed, Nov 25, 2009 at 04:09:55PM +0200, Michael S. Tsirkin wrote:
> We were discussing features that are (mostly) not user-visible.
> It is clear that if you have a user-visible change you have
> a different machine, so you can not migrate.
> 
> Now if you fix a bug by changing savevm format, without user visible
> changes you *also* can not migrate, but this does not make it into
> feature or make it a good fit for machine description.

The machine definition clearly has to be separate from the savevm
format already, otherwise forward migration to a new qemu version
couldn't be guaranteed in the first place!

To migrate back, all we need is the ability of the new version of qemu
to write savevm in an old format, negotiated as the highest version
present in both oldformats[] and newformats[]. The new qemu already
has to be able to "read" the old savevm format, but it wasn't required
to write it until now; writing the old format is the only new
requirement. The machine definition is the old one because it comes
from an old qemu, and the new qemu has to handle it anyway if forward
migration was possible in the first place.
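
To make the negotiation concrete, here is a minimal sketch. It is not
qemu code: the function name and the use of plain integers as savevm
format versions are assumptions made only for illustration. It picks
the highest version that the source can write and the destination can
read.

```python
def negotiate_savevm_version(src_writable, dst_readable):
    """Return the highest savevm version both sides support, or None.

    src_writable: versions the source qemu can write (hypothetical list)
    dst_readable: versions the destination qemu can read (hypothetical list)
    """
    common = set(src_writable) & set(dst_readable)
    return max(common) if common else None


# A new qemu that can write formats 1..3 talking to an old qemu
# that only reads formats 1..2 ends up writing format 2.
print(negotiate_savevm_version([1, 2, 3], [1, 2]))
```

If the two sets share no version, the sketch returns None and the
migration would have to fail.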

Clearly migration won't be safe across the cluster until all host
nodes are upgraded, so I think the high-level GUI should print a
warning when it notices a migration from a new savevm format to an old
savevm format. (Obviously only the savevm format can change here: the
machine definition isn't changing if migration is possible at all, and
if it does differ the migration should just return an error!)

Then, orthogonally (a totally different problem), we need to ensure
all VMs are started with the same guest-visible machine definition
(that should hold even if the savevm format doesn't change). That can
be done with -M, if that's the desired API, when an update changes
qemu significantly; if the update didn't change the qemu drivers
significantly, no -M parameter is needed. And if we upgrade the
machine definition, migration will simply stop, and that's a feature,
not a bug.

Now how fine-grained we want the savevm format to be (e.g. versioned
per device), and how complex we want the negotiation protocol to be
(to be more extensible in the future), are all implementation
details.

In very short, all we can reasonably be discussing here is adding to
new qemu the ability to write the older (buggy) savevm format to allow
backwards migration, and negotiating the highest common savevm format
at the start of the connection. When the negotiation ends in a savevm
format downgrade, there should be a warning so the user knows he's
risking instability, and he should confirm once negotiation is
complete and the downgrade has been noticed. After that we can still
migrate (with a warning) from fixed pvclock to broken pvclock (the
latter will remain potentially unstable, which is why a warning is
required in my view), and users won't be forced to upgrade all hosts
at once to keep migrating across the whole cluster.
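
The warn-and-confirm step could look roughly like this. Again this is
only a sketch under assumptions: plan_migration, its string return
values and the confirm callback are hypothetical names, not an
existing qemu or management-tool interface.

```python
def plan_migration(src_native, negotiated, confirm):
    """Decide what to do once the savevm format has been negotiated.

    src_native: newest savevm version the source would normally write
    negotiated: version agreed with the destination, or None if none
    confirm:    callable taking a warning string and returning a bool
    """
    if negotiated is None:
        # No common savevm format: just return an error.
        return "error"
    if negotiated < src_native:
        # Downgrade: the user must see the warning and confirm.
        warning = ("savevm format downgrade from v%d to v%d, "
                   "the guest may become unstable" % (src_native, negotiated))
        return "proceed" if confirm(warning) else "aborted"
    return "proceed"


# The user accepts the risk of going back to the older format.
print(plan_migration(3, 2, lambda warning: True))
```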

