On 23/09/15 14:11, Daniel P. Berrange wrote:
On Wed, Sep 23, 2015 at 01:48:17PM +0100, Paul Carlton wrote:

On 22/09/15 16:44, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 09:29:46AM -0600, Chris Friesen wrote:
There is also work on post-copy migration in QEMU. Normally with live
migration, the guest doesn't start executing on the target host until
migration has transferred all data. There are many workloads where that
doesn't work, as the guest is dirtying data too quickly. With post-copy you
can start running the guest on the target at any time, and when it faults
on a missing page that will be pulled from the source host. This is
slightly more fragile, as you risk losing the guest entirely if the source
host dies before migration finally completes. It does guarantee that
migration will succeed no matter what workload is in the guest. This is
probably Nxxxx cycle material.
It seems to me that the ideal solution would be to start doing pre-copy
migration, then if that doesn't converge with the specified downtime value
then maybe have the option to just cut over to the destination and do a
post-copy migration of the remaining data.
Yes, that is precisely what the QEMU developers working on this
feature suggest we should do. The lazy page faulting on the target
host has a performance hit on the guest, so you definitely need
to give a little time for pre-copy to start off with, and then
switch to post-copy once some threshold is reached, or if progress
info shows the transfer is not making progress.
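The switchover heuristic described here might be sketched roughly as follows. Everything in this sketch is illustrative: the function name, the fixed-interval sampling scheme, and the 5% stall threshold are assumptions, not anything QEMU or Nova actually implements. (In a real driver the switch itself would be made via something like libvirt's virDomainMigrateStartPostCopy(), with the migration started under the VIR_MIGRATE_POSTCOPY flag.)

```python
# Illustrative sketch only -- hypothetical names and thresholds, not a
# real Nova/libvirt API.

def should_switch_to_postcopy(samples, min_samples=3, min_progress_ratio=0.05):
    """Decide whether pre-copy has stalled and post-copy should begin.

    samples: remaining-bytes readings taken at a fixed interval, oldest
    first. Switch if, over the last `min_samples` readings, the amount of
    data left to transfer shrank by less than `min_progress_ratio`.
    """
    if len(samples) < min_samples:
        return False  # not enough history yet to judge progress
    recent = samples[-min_samples:]
    start, end = recent[0], recent[-1]
    if start == 0:
        return False  # pre-copy already finished on its own
    progress = (start - end) / start
    return progress < min_progress_ratio
```

A guest dirtying memory faster than the link can carry it shows up as remaining-bytes readings that barely shrink, which is exactly the case the heuristic flags: `should_switch_to_postcopy([1000, 990, 985])` is true, while a converging transfer like `[1000, 400, 100]` is left to finish in pre-copy.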

Regards,
Daniel
I'd be a bit concerned about automatically switching to the post copy
mode.  As Daniel commented previously, if something goes wrong on the
source node the customer's instance could be lost.  Many cloud operators
will want to control the use of this mode.  As per my previous message,
this could be something that is set on or off by default, with a PUT
operation on os-migration to update the setting for a specific
migration.
NB, if you are concerned about the source host going down while
migration is still taking place, you will lose the VM even with
pre-copy mode too, since the VM will of course still be running
on the source.

The new failure scenario is essentially about the network
connection between the source & target hosts - if the network
layer fails while post-copy is running, then you lose the
VM.

In some sense post-copy will reduce the window of failure,
because it should ensure that the VM migration completes
in a faster & finite amount of time. I think this is
probably particularly important for host evacuation so
the admin can guarantee to get all the VMs off a host in
a reasonable amount of time.

As such I don't think you need to expose post-copy as a concept in the
API, but I could see a nova.conf value to say whether use of post-copy
was acceptable, so those who want to have stronger resilience against
network failure can turn off post-copy.
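For illustration, such a knob might look like this in nova.conf; the option name, group, and default below are purely hypothetical, sketched here just to show the shape of the setting:

```ini
[libvirt]
# Hypothetical option: allow the driver to switch a live migration to
# post-copy mode if pre-copy is not making progress. Operators who value
# resilience against network failure over guaranteed completion time
# would leave this disabled.
live_migration_permit_post_copy = false
```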

Regards,
Daniel

If the source node fails during a pre-copy migration then when that node
is restored the instance is ok again (usually).  With the post-copy
approach the risk is that the instance will be corrupted which many
cloud operators would consider to be an unacceptable risk.

However, let's start by exposing it as a nova.conf setting and see how
that goes.

--
Paul Carlton
Software Engineer
Cloud Services
Hewlett Packard
BUK03:T242
Longdown Avenue
Stoke Gifford
Bristol BS34 8QZ

Mobile:    +44 (0)7768 994283
Email:    paul.carlt...@hpe.com


