On 23/09/15 14:11, Daniel P. Berrange wrote:
> On Wed, Sep 23, 2015 at 01:48:17PM +0100, Paul Carlton wrote:
>> On 22/09/15 16:44, Daniel P. Berrange wrote:
>>> On Tue, Sep 22, 2015 at 09:29:46AM -0600, Chris Friesen wrote:
>>>>> There is also work on post-copy migration in QEMU. Normally with live
>>>>> migration, the guest doesn't start executing on the target host until
>>>>> migration has transferred all data. There are many workloads where
>>>>> that doesn't work, as the guest is dirtying data too quickly. With
>>>>> post-copy you can start running the guest on the target at any time,
>>>>> and when it faults on a missing page that will be pulled from the
>>>>> source host. This is slightly more fragile as you risk losing the
>>>>> guest entirely if the source host dies before migration finally
>>>>> completes. It does guarantee that migration will succeed no matter
>>>>> what workload is in the guest. This is probably Nxxxx cycle material.
>>>>
>>>> It seems to me that the ideal solution would be to start doing
>>>> pre-copy migration, then if that doesn't converge within the specified
>>>> downtime value, have the option to just cut over to the destination
>>>> and do a post-copy migration of the remaining data.
>>>
>>> Yes, that is precisely what the QEMU developers working on this feature
>>> suggest we should do. The lazy page faulting on the target host has a
>>> performance hit on the guest, so you definitely need to give pre-copy a
>>> little time to start off with, and then switch to post-copy once some
>>> benchmark is reached, or if progress info shows the transfer is not
>>> making progress.
>>>
>>> Regards, Daniel
>>
>> I'd be a bit concerned about automatically switching to post-copy mode.
>> As Daniel commented previously, if something goes wrong on the source
>> node the customer's instance could be lost. Many cloud operators will
>> want to control the use of this mode.
>> As per my previous message, this could be something that is on or off by
>> default, with a PUT operation on os-migration to update the setting for
>> a specific migration.
>
> NB, if you are concerned about the source host going down while migration
> is still taking place, you will lose the VM even with pre-copy mode too,
> since the VM will of course still be running on the source. The new
> failure scenario is essentially about the network connection between the
> source & target hosts - if the network layer fails while post-copy is
> running, then you lose the VM. In some sense post-copy will reduce the
> window of failure, because it should ensure that the VM migration
> completes in a faster & finite amount of time. I think this is probably
> particularly important for host evacuation, so the admin can guarantee to
> get all the VMs off a host in a reasonable amount of time. As such I
> don't think you need to expose post-copy as a concept in the API, but I
> could see a nova.conf value to say whether use of post-copy is
> acceptable, so those who want stronger resilience against network failure
> can turn off post-copy.
>
> Regards, Daniel
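As an aside, the switch-over being discussed could be driven by watching
migration progress and triggering post-copy once the remaining data stops
shrinking. A minimal sketch of that decision policy follows; the function
name and thresholds are illustrative only. In a real deployment the samples
would come from libvirt's virDomainGetJobStats() for a migration started
with the VIR_MIGRATE_POSTCOPY flag, and the actual switch would be made
with virDomainMigrateStartPostCopy():

```python
# Sketch of the pre-copy -> post-copy decision logic discussed above.
# All names are illustrative, not an agreed-upon Nova/libvirt interface.

def should_switch_to_postcopy(remaining_samples, min_precopy_samples=3,
                              stall_threshold=0.05):
    """Return True once pre-copy has had a grace period and the amount
    of data remaining is no longer shrinking meaningfully.

    remaining_samples: 'memory remaining' readings (bytes), oldest
    first, taken at a fixed polling interval.
    """
    # Give pre-copy some time first: lazy page faults on the target
    # hurt guest performance, so post-copy should not start immediately.
    if len(remaining_samples) < min_precopy_samples:
        return False
    prev, cur = remaining_samples[-2], remaining_samples[-1]
    if prev == 0:
        return False  # nothing left to copy; pre-copy will converge
    # "Not making progress": remaining data shrank by less than
    # stall_threshold (5% by default) over the last interval, or grew
    # because the guest is dirtying pages faster than they are sent.
    return (prev - cur) / prev < stall_threshold
```

For example, a run of samples like 8 GB, 6 GB, 4 GB shows healthy
convergence and would stay in pre-copy, while 8 GB, 6 GB, 4 GB, 4.2 GB
(remaining data growing) would trigger the switch.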
If the source node fails during a pre-copy migration then the instance is
usually intact again once that node is restored. With the post-copy
approach the risk is that the instance will be corrupted, which many cloud
operators would consider an unacceptable risk. However, let's start by
exposing it as a nova.conf setting and see how that goes.

--
Paul Carlton
Software Engineer, Cloud Services
Hewlett Packard
BUK03:T242, Longdown Avenue, Stoke Gifford, Bristol BS34 8QZ
Mobile: +44 (0)7768 994283
Email: paul.carlt...@hpe.com
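For concreteness, such an operator toggle might look like the fragment
below; the option name and group are hypothetical, pending whatever the
spec review settles on:

```ini
# Hypothetical nova.conf fragment -- the option name is illustrative
# only, not an agreed-upon Nova setting.
[libvirt]
# When True, a stalled pre-copy live migration may be switched over to
# post-copy. When False (the safer choice for operators worried about a
# network failure mid-migration losing the instance), pre-copy is
# always used and a non-converging migration simply keeps running.
live_migration_permit_post_copy = False
```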
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev