Re: [Qemu-devel] [PATCH] migration: Cleanup during exit

Dr. David Alan Gilbert Fri, 01 Mar 2019 02:12:37 -0800

* Peter Xu (pet...@redhat.com) wrote:
> On Thu, Feb 28, 2019 at 12:28:22PM +0000, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berra...@redhat.com) wrote:
> > > On Thu, Feb 28, 2019 at 11:40:19AM +0000, Dr. David Alan Gilbert wrote:
> > > > * Peter Xu (pet...@redhat.com) wrote:
> > > > > On Wed, Feb 27, 2019 at 04:49:00PM +0000, Dr. David Alan Gilbert 
> > > > > (git) wrote:
> > > > > > From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
> > > > > > 
> > > > > > Currently we cleanup the migration object as we exit main after the
> > > > > > main_loop finishes; however if there's a migration running things
> > > > > > get messy and we can end up with the migration thread still trying
> > > > > > to access freed structures.
> > > > > > 
> > > > > > We now take a ref to the object around the migration thread itself,
> > > > > > so the act of dropping the ref during exit doesn't cause us to lose
> > > > > > the state until the thread quits.
> > > > > > 
> > > > > > Cancelling the migration during migration also tries to get the 
> > > > > > thread
> > > > > > to quit.
> > > > > > 
> > > > > > We do this a bit earlier; so hopefully migration gets out of the way
> > > > > > before all the devices etc are freed.
> > > > > 
> > > > > So does it mean that even with the patch it's still possible the
> > > > > migration thread will be accessing device structs that have already
> > > > > been freed which can still crash QEMU?
> > > > 
> > > > Possibly yes; I'm not sure how to go to the next stage and stop that
> > > > case; the consensus seems to be we don't want to explicitly block
> > > > during the exit process, so doing a join on the migration thread doesn't
> > > > seem to be wanted.
> > > 
> > > Essentially the many  *_cleanup() calls at the end of main() in vl.c
> > > are only ever safe if all background threads have stopped using the
> > > resources that are being freed. This isn't the case with migration
> > > currently. I also worry about other threads that might be running
> > > in QEMU, SPICE in particular as it touchs many device backends.
> > > 
> > > Aborting the migration & joining the migration threads is the natural
> > > way to address this. Cancelling appears to require the main loop to
> > > still be running, so would require main_loop_should_exit() to issue
> > > the cancel & return false unless it has completed. This would delay
> > > shutdown for as long as it takes migration to abort.
> > 
> > ish - cancelling performs a shutdown(2) on the fd, that should be enough
> > in most cases to kick the migration thread into falling out without
> > main loop interaction; I think...
> 
> Dave, could you hint me on when shutdown() will not suffice (say,
> we'll hang death if we do join() upon the migration thread)?


I think I mostly worry about when we hang in a device's migration code
or where that interacts with an external program (e.g. vhost-user).

> > 
> > > FWIW, even the bdrv_close_all() call during shutdown can delay things
> > > for a considerable time if draining the backends isn't a quick op,
> > > which could be the case if there are storage problems (blocked local
> > > I/O, or interrupted network - rbd/ceph/nfs clients). So I'm not sure
> > > that stopping migration is inherantly worse than what's already
> > > possible with block cleanup, in terms of delaying things.
> > > 
> > > A slightly more hacky approach would be to pthread_cancel() all the
> > > migration threads. Normally we'd never use pthread_cancel() as it
> > > is incredibly hard to ensure all memory used by the threads is
> > > freed. Since we're about to exit the process though, this cleanup
> > > does not matter. The risk, however, is that one of the threads is
> > > holding a mutex which then blocks the rest of the *cleanup functions,
> > > so this still feels dangerous.
> > > 
> > > Overall to me a clean cancel & join of the migration threads feels
> > > like the only safe option, unless we just remove all notion of
> > > cleanup from QEMU & delete all the _cleanup functions in main()
> > > and let the OS free all memory.
> > 
> > That's actually my preference; I think trying to do clean tidy ups
> > here is very racy and doesn't gain much.  However, for things like
> > storage there may be locks that have to be properly dropped and
> > bitmaps and things to be stored/sync'd - so just exiting probably
> > isn't safe either.
> 
> I'm unsure about whether I caught the whole idea but I feel like we
> can't drop all the cleanup hooks since what if we want to do something
> else than "freeing memories"?  Or anything that the OS can't do for us
> but we want to try to do before the process quits.  If that operation
> hangs we'll probably face a similar hang issue.

Right; things like where the block code wants to ensure that
bitmaps/data structtures/buffers are saved out - you need to give them a
chance.

> Regarding pthread_cancel() - it will only work if the thread reaches
> cancellation points, right?  Then does it mean that it still cannot
> guarantee the thread will quit very soon and we still have chance that
> we'll wait forever when join()?  One major reason that the thread will
> be waiting should be the migration streams but AFAIU shutdown() (as
> Dave has mentioned) should solve this problem properly, then I'm
> unsuer how pthread_cancel() can help much comparing to shutdown()...

If I understand correctly, threads are always cancelable by default?

> My understanding (probably not correct, but I'll just speak loud...)
> is that we will never have a way to guarantee a process to quit
> cleanly all the time because there're really many things that can hang
> (I'm assuming we've already solved the MigrationState refcounting
> issue by this patch, so I'm assuming we're having a "hang QEMU" rather
> than crashing issue).  Then IMHO we could simply do whatever we can do
> to cleanup everything assuming no hang will happen, and if it really
> happens we use SIGKILL which will be the last thing we can do by which
> we might lost many things (e.g., unflushed caches in block layer) but
> we've tried our best.  For migration, it'll be (1) cancel (not
> pthread_cancel, but cancel the migration to trigger shutdown() on fds
> or whatever) (2) join thread (3) finalize/cleanup data structures.

Right so the question I think is just whether it's worth adding the join
at the moment which turns us from a small-risk of crash to small-risk of
hang.

> IMHO SIGKILL is the real de-facto solution for all these issues to me.
> And even if with SIGKILL we can still hang.... but I'll assume those
> hangs will be kernel problems not QEMU since SIGKILL should be
> designed to kill a process without a question.

Dave

> Thanks,
> 
> -- 
> Peter Xu
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH] migration: Cleanup during exit

Reply via email to