Oops, forgot to CC qemu-devel; adding it.
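To make the "O(1) at the stopping stage" idea in my reply below a bit more concrete, here is a rough, untested sketch. It relies only on the existing memory_region_transaction_begin()/memory_region_transaction_commit() nesting from include/exec/memory.h: keep one transaction open across the whole incoming device-state load, so the deferred commits collapse into a single rebuild instead of ~310,000. Wrapping qemu_loadvm_state() is just an assumption for illustration, and load_state_body() below is a placeholder for its current body, not a real function:

/* migration/savevm.c -- illustrative sketch only, not a tested patch.
 *
 * memory_region_transaction_begin() bumps a depth counter; commits
 * issued while the depth is nonzero are deferred, and the matching
 * memory_region_transaction_commit() performs one rebuild at the end.
 */
#include "exec/memory.h"

int qemu_loadvm_state(QEMUFile *f)
{
    int ret;

    memory_region_transaction_begin();

    /*
     * ... the existing load loop goes here unchanged; every memory
     * region update triggered while device state is loaded is now
     * deferred instead of rebuilding the memory topology each time ...
     */
    ret = load_state_body(f);           /* placeholder, see above */

    memory_region_transaction_commit(); /* single rebuild happens here */

    return ret;
}

The obvious caveat is that anything inspecting the flat view during the load would see stale mappings until the final commit; the assumption is that with the CPU stopped nothing depends on those intermediate states, which is exactly the question I raise below.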
> -----Original Message-----
> From: Gonglei (Arei)
> Sent: Friday, May 19, 2017 8:17 PM
> To: 'Paolo Bonzini'; yanghongyang; m...@redhat.com
> Cc: quint...@redhat.com; Dr. David Alan Gilbert; Huangzhichao
> Subject: RE: Migration downtime more than 5s when migrating guest with
> massive disks
>
> > -----Original Message-----
> > From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> > Sent: Friday, May 19, 2017 6:19 PM
> >
> > On 19/05/2017 12:00, Yang Hongyang wrote:
> > > We found that migration downtime is unacceptable when migrating a
> > > guest with 60 disks: more than 5.5 seconds.
> > > By debugging, we found that the problem is that there are too many
> > > memory_region_transaction_commit() operations during guest load,
> > > about 310,000+ times.
> > > Any idea how to optimize the migration downtime in this scenario?
> > > Maybe reduce the number of memory_region_transaction_commit()
> > > calls, but how? Or we could optimize the time cost of each
> > > memory_region_transaction_commit() call, but I don't think that
> > > would help much.
> >
> > It would. Right now memory_region_transaction_commit() is roughly
> > O(n^2) (n devices * n BARs), and there are n of them.
> >
> > Reducing memory_region_transaction_commit() to O(n) would be a large
> > change. One idea is to share the AddressSpaceDispatch for
> > AddressSpaces that have the same root memory region (after resolving
> > aliases). The starting point would be to change
> > mem_begin/mem_commit/mem_add from a MemoryListener to a loop on the
> > FlatView, storing the AddressSpaceDispatch in the FlatView.
>
> How about doing O(1) for the stopping stage of live migration?
> Because the CPU is stopped in that phase, it wouldn't cause any
> side effects IMHO, right?
>
> Thanks,
> -Gonglei
>
> > One bandaid solution is to use virtio-scsi in the guest, with
> > multiple disks behind one controller.
> >
> > Thanks,
> >
> > Paolo
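For reference, the virtio-scsi bandaid Paolo mentions would look something like this on the command line (the file names and ids here are made up): all the disks sit behind a single controller, so the guest sees one PCI device instead of 60 and the per-BAR commit cost drops accordingly.

-device virtio-scsi-pci,id=scsi0 \
-drive file=/path/disk00.qcow2,if=none,id=drive0,format=qcow2 \
-device scsi-hd,drive=drive0,bus=scsi0.0 \
-drive file=/path/disk01.qcow2,if=none,id=drive1,format=qcow2 \
-device scsi-hd,drive=drive1,bus=scsi0.0 \
... (and so on for the remaining disks)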