* Kevin Wolf (kw...@redhat.com) wrote:
> On 21.01.2019 at 17:05, Dr. David Alan Gilbert wrote:
> > * Kevin Wolf (kw...@redhat.com) wrote:
> > > On 18.01.2019 at 16:57, Dr. David Alan Gilbert wrote:
> > > > * Kevin Wolf (kw...@redhat.com) wrote:
> > > > > On 14.01.2019 at 11:51, Dr. David Alan Gilbert wrote:
> > > > > > * Michael Tokarev (m...@tls.msk.ru) wrote:
> > > > > > > $ qemu-system-x86_64 -monitor stdio -hda foo.img
> > > > > > > QEMU 3.1.0 monitor - type 'help' for more information
> > > > > > > (qemu) stop
> > > > > > > (qemu) migrate "exec:cat >/dev/null"
> > > > > > > (qemu) migrate "exec:cat >/dev/null"
> > > > > > > qemu-system-x86_64: /build/qemu/qemu-3.1/block.c:4647:
> > > > > > > bdrv_inactivate_recurse: Assertion `!(bs->open_flags &
> > > > > > > BDRV_O_INACTIVE)' failed.
> > > > > > > Aborted
> > > > > >
> > > > > > And on head as well; it only happens if the 1st migrate is
> > > > > > successful; if it got cancelled, the 2nd one works, so it's
> > > > > > not too bad.
> > > > > >
> > > > > > I suspect the problem here is all around locking/ownership -
> > > > > > the block devices get shut down at the end of migration,
> > > > > > since the assumption is that the other end has them open now
> > > > > > and we had better release them.
> > > > >
> > > > > Yes, only "cont" gives control back to the source VM.
> > > > >
> > > > > I think we really should limit the possible monitor commands
> > > > > in the postmigrate status, and possibly provide a way to get
> > > > > back to the regular paused state (which means getting back
> > > > > control of the resources) without resuming the VM first.
> > > >
> > > > This error is a little interesting if you'd done something like:
> > > >
> > > > src:
> > > >   stop
> > > >   migrate
> > > >
> > > > dst:
> > > >   <kill qemu for some reason>
> > > >   start a new qemu
> > > >
> > > > src:
> > > >   migrate
> > > >
> > > > Now that used to work (safely) - note we've not started a VM
> > > > successfully anywhere else.
> > > >
> > > > Now the source refuses to let that happen - with a rather nasty
> > > > abort.
> > >
> > > Essentially it's another effect of the problem that migration has
> > > always lacked a proper model of ownership transfer. And it's still
> > > treating this as a block layer problem rather than making it a
> > > core concept of migration, as it should be.
> > >
> > > We can stack another one-off fix on top and get back control of
> > > the block devices automatically on a second 'migrate'. But it
> > > feels like a hack, not like VMs having a properly designed and
> > > respected state machine.
> >
> > Hmm; I don't like to get back into this argument, because I think
> > we've got a perfectly serviceable model that's implemented at higher
> > levels outside qemu, and the real problem is that the block layer
> > added new assumptions about the semantics without checking that they
> > were really true.  qemu only has the view from a single host; it
> > takes the higher-level view from something like libvirt, which can
> > see across multiple hosts, to understand who has ownership when.
>
> Obviously the upper layer is not handling this without the help of
> QEMU, or we wouldn't have had bugs where images were accessed by two
> QEMU processes at the same time. We didn't change the assumptions; we
> only started to actually check the preconditions that have always
> been necessary to perform live migration correctly.
In this case there is behaviour that was perfectly legal before but
fails now; furthermore, the case is safe - the source hasn't accessed
the disks after the first migration and isn't trying to access them
again either.

> But if you like to think the upper layer should handle all of this,

I don't really want the upper layer to handle all of this; but I don't
think we can handle it all either - we haven't got the higher-level
view of screwups that happen outside qemu.

> then it's on libvirt to handle the ownership transfer manually. If we
> really want, we can add explicit QMP commands to activate and
> inactivate block nodes. This can be done, and requiring that the
> management layer does all of this would be a consistent interface,
> too.
>
> I just don't like this design much, for two reasons: the first is
> that you can't migrate a VM that has disks with a simple 'migrate'
> command any more. The second is that if you implement it
> consistently, this has an impact on compatibility. I think it's a
> design that could be considered if we were adding live migration as a
> new feature, but it's probably hard to switch to it now.
>
> In any case, I do think we should finally make a decision on how
> ownership of resources should work in the context of migration, and
> then implement that.

I think we're mostly OK, but what I'd like would be:

  a) I'd like things to fail gently rather than abort; so I'd either
     like the current functions to fail cleanly so I can fail the
     migration, or add a check at the start of migration to tell the
     user they did something wrong.

  b) I'd like commands that can tell me the current state, and a
     command to move it to the other state explicitly, so we've got a
     way to recover in weirder cases.

  c) I'd like to document what the states should be before/after/in
     various middle states of migration.
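Item (b) might look something like this on the QMP wire (command and field names here are purely hypothetical, sketched for illustration - no such commands exist today):

```
-> { "execute": "query-block-active" }
<- { "return": [ { "node-name": "hda", "active": false } ] }

-> { "execute": "block-activate", "arguments": { "node-name": "hda" } }
<- { "return": {} }
```

That would give the management layer, or a human at the monitor, an explicit way to hand block-device ownership back to the source after a completed-but-abandoned migration.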
I think the normal case is fine, and hence, as you say, I wouldn't
want to break a normal 'migrate' - I just want cleaner failures and
ways to do the more unusual things.

Dave

> Kevin

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK