Am 29.05.2015 um 12:34 hat Dr. David Alan Gilbert geschrieben:
> * Kevin Wolf (kw...@redhat.com) wrote:
> > Am 29.05.2015 um 11:38 hat Dr. David Alan Gilbert geschrieben:
> > > * Kevin Wolf (kw...@redhat.com) wrote:
> > > > Am 29.05.2015 um 10:33 hat Dr. David Alan Gilbert geschrieben:
> > > > > * Markus Armbruster (arm...@redhat.com) wrote:
> > > > > > "Dr. David Alan Gilbert" <dgilb...@redhat.com> writes:
> > > > > > 
> > > > > > > * John Snow (js...@redhat.com) wrote:
> > > > > > >> 
> > > > > > >> 
> > > > > > >> On 05/21/2015 09:19 AM, Kevin Wolf wrote:
> > > > > > >> > The floppy controller spec describes three different 
> > > > > > >> > controller phases,
> > > > > > >> > which are currently not explicitly modelled in our emulation. 
> > > > > > >> > Instead,
> > > > > > >> > each phase is represented by a combination of flags in 
> > > > > > >> > registers.
> > > > > > >> > 
> > > > > > >> > This patch makes explicit in which phase the controller 
> > > > > > >> > currently is.
> > > > > > >> > 
> > > > > > >> > Signed-off-by: Kevin Wolf <kw...@redhat.com>
> > > > > 
> > > > > <snip>
> > > > > 
> > > > > > >> > +static bool fdc_phase_needed(void *opaque)
> > > > > > >> > +{
> > > > > > >> > +    FDCtrl *fdctrl = opaque;
> > > > > > >> > +
> > > > > > >> > +    return reconstruct_phase(fdctrl) != fdctrl->phase;
> > > > > > >> > +}
> > > > > > >
> > > > > > > OK, that is one way of doing it which should work, as long
> > > > > > > as most of the time that matches.
> > > > > > >
> > > > > > > My preference for subsections is normally to make them just
> > > > > > > conditional on machine type, that way backwards migration just
> > > > > > > works.  However, if the result of the backwards migration would
> > > > > > > be data corruption (which if I understand what you saying it could
> > > > > > > in this case) then it's arguably better to fail migration in those
> > > > > > > cases.
> > > > > > 
> > > > > > This isn't a matter of preference.
> > > > > > 
> > > > > > Subsections were designed to solves a problem we only have because 
> > > > > > we're
> > > > > > not perfect: device model bugs, namely forgetting to migrate a bit 
> > > > > > of
> > > > > > state, or failing to model it in the first place.
> > > > > > 
> > > > > > Subsections tied to machine types solve an entirely different 
> > > > > > problem:
> > > > > > the old version of the device is just fine, including migration, 
> > > > > > but we
> > > > > > now want a new and improved variant, which needs some more state
> > > > > > migrated.  Putting that piece of state into a subsection tied to 
> > > > > > the new
> > > > > > machine type avoids code duplication.
> > > > > > 
> > > > > > But what do you do for device model bugs serious enough to need 
> > > > > > fixing
> > > > > > even for existing machine types, when the fix needs additional state
> > > > > > migrated?  Subsections tied to machine types are *not* a solution 
> > > > > > there.
> > > > > > That's why this isn't about preference!  It's about having a bug to 
> > > > > > fix
> > > > > > or not.
> > > > > 
> > > > > My problem is that people keep adding subsections to fix minor device
> > > > > bugs because they think breaking migration is irrelevant.  If the bug
> > > > > being fixed causes data corruption then OK I agree it is probably 
> > > > > better
> > > > > to break migration, otherwise  you need to make a decision about 
> > > > > whether
> > > > > fixing the device bug during migration is better or worse than having
> > > > > the migration fail.
> > > > > Some people say 'Migration failure is ok - people just try it again';
> > > > > sorry, no, that's useless in environments where the VM has 100s of GB
> > > > > of RAM and takes hours to migrate only for it then to fail, and it
> > > > > was urgent that it was migrated off that source host.
> > > > 
> > > > If this is a problem in practice, it sounds a lot like we need to
> > > > improve our handling of that situation in general. Why do we abort
> > > > the whole migration when serialising the state fails? Can't we abort
> > > > just the completion and switch back to the live state, and then retry a
> > > > bit later?
> > > 
> > > It's not the serialisation that fails, it's the deserialisation on the 
> > > destination,
> > > and our migration stream is currently one way, so we have no way
> > > of signalling back to the source to 'just try that again'.
> > > So the migration fails, the source carries on running and we can only
> > > try again from the beginning.
> > 
> > True, forgot about that.
> > 
> > So what if the management tool queried the destination for supported
> > subsections and passed that as an (optional) argument to the migration
> > command on the source?
> > 
> > Then the source would have full control and could complete migration
> > only when no unsupported subsection are needed.
> 
> I tried doing something similar, where I had an optional parameter on
> the source which was the version of the destination qemu; people really
> hated that.

Did they hate it because version numbers are really badly suited for the
task (I would agree with that) or already because the source gets some
information about the destination?

In the mailing list thread that I could find quicky, only Paolo seemed
to object, and the two arguments that I see is that the version number
isn't a good criterion and that it might not be worth the effort for a
corner case that isn't practically relevant.

Apparently you think that it is in fact practically relevant, so the
only point that remains is replacing the version number by something
better. The name and version ID of all supported sections and
subsections should be the definite source for qemu to tell whether it
can migrate to the destination now, possibly can migrate later, or can't
migrate at all (if the version ID on the destination is too small).

> > > > Most, if not all, of the subsections you mentioned that fix minor bugs
> > > > fix some states while the device is actively used and shorty after we
> > > > should be back in a state where the subsection isn't needed.
> > > 
> > > I think once any (non-iterative) device state has been sent we can't
> > > necessarily abandon and try again; I'm not confident that the devices
> > > would work if we sent one set of state and then tried to reload it later.
> > 
> > Why would serialising the device state change any device state? Don't
> > you already continue running the source after sending non-iterative
> > device state when migration fails on the destination?
> 
> It's the destination I'm worried about here, not the source; lets say
> you have two devices, a & b.  'a' gets serialised, but then 'b' finds
> it has to wait, so we return to running the source and sending pages
> across.   However the destination has already loaded the 'a' device state;
> so that when we serialise again we send a new 'a' device state'; I'm
> worrying here that the destination 'a' state loader would get upset
> trying to load the same state twice without somehow resetting it.

If that were the case, it would be a bug that needs fixing.

For pure VMState fields, it's obviously not a problem. Anything with
pre/post_load handlers could in theory have bugs, though.

> > > I think that would need a further pass that went around all the devices 
> > > and
> > > said 'is it OK to migrate now' and that would happen before any of the
> > > non-iterative devices were migrated.
> > 
> > Yes, another pass is an option, too.

So if you're concerned about it, and we don't want to audit the
pre/post_load handlers, just do this second pass. It makes sense anyway
to fail as soon as possible and not send lots of data before we fail.

> > > > This is exactly the "fails 1 in 100 migrations" cases you talked about,
> > > > and they would be practically fixed if you made a few more attempts to
> > > > complete the migration before you let it fail (or perhaps even retry
> > > > indefinitely, as long as the user doesn't cancel migration).
> > > 
> > > This causes a problem if your belief in the safe-state to migrate
> > > doesn't happen on some new OS version, because then it will retry
> > > indefinitely.  You'd know you didn't need to wait on the newer machine
> > > type, but you could still use this trick on the older machine type.
> > > (Of course if you knew you were going to a new QEMU that could
> > > always accept the new section then you wouldn't worry about this -
> > > but people dont like sending qemu versions).
> > 
> > Is retrying indefinitely a worse problem than failing migration, though?
> > Anyway, this was just a thought. You could as well give up after a
> > number of attempts and just increase that number from 1 currently to
> > at least maybe 10 or so. (Potentially even configurable, with
> > indefinitely being one option.)
> 
> No, it's probably better than failing the migration, but what I'm worrying
> about is, as in your code, where the destination has the new version
> of qemu and could cope with the new subsections, so there is no need
> to wait.

Waiting obviously only makes sense when you know what subsections the
destination supports.

Kevin

Reply via email to