On 03/03/2017 13:00, Dr. David Alan Gilbert wrote:
> Ouch that's pretty nasty; I remember Paolo explaining to me a while ago
> that there were times when run_on_cpu would have to drop the BQL and I
> worried about it, but this is the 1st time I've seen an error due to it.
>
> Do you know what the migration state was at that point? Was it
> MIGRATION_STATUS_CANCELLING?
> I'm thinking perhaps we should stop 'cont' from continuing while
> migration is in MIGRATION_STATUS_CANCELLING.  Do we send an event when
> we hit CANCELLED - so that perhaps libvirt could avoid sending the
> 'cont' until then?

No, there's no event, though I thought libvirt would poll until
"query-migrate" returns the cancelled state.  Of course that is a small
consolation, because a segfault is unacceptable.

One possibility is to suspend the monitor in qmp_migrate_cancel and
resume it (with add_migration_state_change_notifier) when we hit the
CANCELLED state.  I'm not sure what the latency would be between the end
of migrate_fd_cancel and finally reaching CANCELLED.

Paolo
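
The suspend/resume idea above could look roughly like the following. This is only a standalone sketch, not QEMU code: Monitor, MigrationState, Notifier and the three status values are simplified stand-ins, monitor_suspend/monitor_resume/migrate_set_state are stubs that merely borrow the QEMU names, and sketch_migrate_cancel, cancel_done and monitor_accepts_commands are hypothetical helpers made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for QEMU types; NOT the real definitions. */
typedef enum { STATUS_ACTIVE, STATUS_CANCELLING, STATUS_CANCELLED } MigStatus;

typedef struct Notifier Notifier;
struct Notifier {
    void (*notify)(Notifier *n, void *data);
};

typedef struct { int suspend_cnt; } Monitor;
typedef struct { MigStatus state; Notifier *notifier; } MigrationState;

static Monitor mon;
static MigrationState mig = { STATUS_ACTIVE, NULL };
static Notifier cancel_notifier;

/* Stubs that only count suspensions; the real monitor does much more. */
static void monitor_suspend(Monitor *m) { m->suspend_cnt++; }
static void monitor_resume(Monitor *m)  { m->suspend_cnt--; }

/* Hypothetical helper: would a command such as 'cont' be processed now? */
static bool monitor_accepts_commands(const Monitor *m)
{
    return m->suspend_cnt == 0;
}

/* Stub state setter that fires the (single) registered notifier. */
static void migrate_set_state(MigrationState *s, MigStatus new_state)
{
    s->state = new_state;
    if (s->notifier) {
        s->notifier->notify(s->notifier, s);
    }
}

/* Notifier callback: resume the monitor once CANCELLED is reached. */
static void cancel_done(Notifier *n, void *data)
{
    MigrationState *s = data;

    (void)n;
    if (s->state == STATUS_CANCELLED) {
        s->notifier = NULL;        /* one-shot: unregister ourselves */
        monitor_resume(&mon);
    }
}

/* Hypothetical shape of the cancel path with the monitor suspended. */
static void sketch_migrate_cancel(void)
{
    cancel_notifier.notify = cancel_done;
    mig.notifier = &cancel_notifier;   /* stands in for registration */
    monitor_suspend(&mon);             /* no 'cont' until CANCELLED */
    migrate_set_state(&mig, STATUS_CANCELLING);
}
```

In this model a 'cont' arriving between sketch_migrate_cancel() and the transition to STATUS_CANCELLED would simply wait, since monitor_accepts_commands() is false during that window. It says nothing about how long that window is in practice, which is the latency question above.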