Re: [Openstack] KVM live block migration: stability, future, docs
Hi Vish,

On 10 August 2012 00:27, Vishvananda Ishaya vishvana...@gmail.com wrote:
> On Aug 9, 2012, at 7:13 AM, Daniel P. Berrange berra...@redhat.com wrote:
>> With non-live migration, the migration operation is guaranteed to
>> complete. With live migration, you can get into a non-convergence
>> scenario where the guest is dirtying data faster than it can be
>> migrated. With the way Nova currently works, the live migration will
>> just run forever with no way to stop it. So if you want to enable
>> live migration by default, we'll need to do more than simply set the
>> flag. Nova will need to be able to monitor the migration, and either
>> cancel it after some time, or tune the max allowed downtime to let it
>> complete.
>
> Ah, good to know. So it sounds like we should keep the default as-is
> for now and revisit it later.

I'm not so sure. It seems to me that "nova migrate" should be the offline/paused migration and "nova live-migration" should be _live_ migration, like it says. Semantic mismatches like this exposed to operators/users are bad news. As it is, I don't even know what "nova migrate" is supposed to do...? There's at least a need to improve the docs on this.

Daniel's point about the non-convergence cases with [live|block]-migration is certainly good to know. It sounds like in practice the key settings, such as the allowable live-migration downtime, should be tuned to the deployment. Nova should probably default to a conservatively high allowable downtime. Daniel, any advice about choosing a sensible value for the allowable downtime?

--
Cheers,
~Blairo
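For concreteness, the downtime knob being discussed here is exposed by libvirt as virDomainMigrateSetMaxDowntime, callable on the source host while a migration job is in flight. A minimal Python sketch (the guest name and the 2000 ms value are purely illustrative, not a recommendation):

    import libvirt

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('instance-00000001')  # illustrative guest name
    # Downtime is given in milliseconds and must be set while the
    # migration is running on the source host.
    dom.migrateSetMaxDowntime(2000, 0)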
Re: [Openstack] KVM live block migration: stability, future, docs
Hi Daniel,

Thanks for following this up!

On 8 August 2012 19:53, Daniel P. Berrange berra...@redhat.com wrote:
> not tune this downtime setting, I don't see how you'd see 4 mins
> downtime unless it was not truly live migration, or there was

Yes, quite right. It turns out Nova is not passing/setting libvirt's VIR_MIGRATE_LIVE when it is asked to live-migrate a guest, so it is not proper live-migration. That is the default behaviour unless the flag is added to the migrate flags in nova.conf; unfortunately that flag isn't currently mentioned in the OpenStack docs either.

--
Cheers,
~Blairo
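For anyone hitting the same issue: the setting in question is the live_migration_flag option consumed by Nova's libvirt driver. Assuming an Essex-era nova.conf (the exact default flag list may differ between releases), adding VIR_MIGRATE_LIVE would look something like:

    live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE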
Re: [Openstack] KVM live block migration: stability, future, docs
Daniel,

Thanks for providing this insight, most useful. I'm interpreting this as: block migration can be used in non-critical applications, mileage will vary, and thorough testing in the particular environment is recommended. An alternative implementation will come, but the higher-level feature (live-migration without shared storage) is unlikely to disappear. Is that a reasonable appraisal?

On 8 August 2012 19:59, Daniel P. Berrange berra...@redhat.com wrote:
> Block migration is a part of KVM that none of the upstream developers
> really like, is not entirely reliable, and most distros typically do
> not want to support it due to its poor design (eg not supported in
> RHEL).

Would you mind/be able to elaborate on those reliability issues? E.g., is there anything we can do to mitigate them?

--
Cheers,
~Blairo
Re: [Openstack] KVM live block migration: stability, future, docs
On Aug 9, 2012, at 1:03 AM, Blair Bethwaite blair.bethwa...@gmail.com wrote:
> Hi Daniel,
>
> Thanks for following this up!
>
> On 8 August 2012 19:53, Daniel P. Berrange berra...@redhat.com wrote:
>> not tune this downtime setting, I don't see how you'd see 4 mins
>> downtime unless it was not truly live migration, or there was
>
> Yes, quite right. It turns out Nova is not passing/setting libvirt's
> VIR_MIGRATE_LIVE when it is asked to live-migrate a guest, so it is
> not proper live-migration. That is the default behaviour unless the
> flag is added to the migrate flags in nova.conf; unfortunately that
> flag isn't currently mentioned in the OpenStack docs either.

Can you file a bug on this to change the default? I don't see any reason why this should be off.

Vish
Re: [Openstack] KVM live block migration: stability, future, docs
On Thu, Aug 09, 2012 at 07:10:17AM -0700, Vishvananda Ishaya wrote:
> On Aug 9, 2012, at 1:03 AM, Blair Bethwaite blair.bethwa...@gmail.com wrote:
>> Yes, quite right. It turns out Nova is not passing/setting libvirt's
>> VIR_MIGRATE_LIVE when it is asked to live-migrate a guest, so it is
>> not proper live-migration. That is the default behaviour unless the
>> flag is added to the migrate flags in nova.conf; unfortunately that
>> flag isn't currently mentioned in the OpenStack docs either.
>
> Can you file a bug on this to change the default? I don't see any
> reason why this should be off.

With non-live migration, the migration operation is guaranteed to complete. With live migration, you can get into a non-convergence scenario where the guest is dirtying data faster than it can be migrated. With the way Nova currently works, the live migration will just run forever with no way to stop it. So if you want to enable live migration by default, we'll need to do more than simply set the flag. Nova will need to be able to monitor the migration, and either cancel it after some time, or tune the max allowed downtime to let it complete.

Regards,
Daniel
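To make that suggestion concrete, libvirt already exposes the pieces Nova would need: virDomainGetJobInfo to watch progress, virDomainAbortJob to cancel, and virDomainMigrateSetMaxDowntime to trade downtime for convergence. A rough Python sketch of the approach (illustrative only, not Nova code; the guest name, deadline, and downtime values are made up, and it assumes the migration has already been started elsewhere):

    import time
    import libvirt

    def watch_migration(dom, deadline_s=300, relaxed_downtime_ms=2000):
        """Poll a running migration; if it hasn't converged by the
        deadline, raise the allowed downtime so it can finish (or call
        dom.abortJob() instead to cancel it)."""
        start = time.time()
        while True:
            info = dom.jobInfo()
            # jobInfo() returns [jobType, timeElapsed, timeRemaining,
            # dataTotal, dataProcessed, dataRemaining, memTotal,
            # memProcessed, memRemaining, fileTotal, fileProcessed,
            # fileRemaining]
            if info[0] == libvirt.VIR_DOMAIN_JOB_NONE:
                return  # migration finished (or was never running)
            if time.time() - start > deadline_s:
                # Option 1: give up and cancel the migration.
                # dom.abortJob()
                # Option 2: trade longer downtime for convergence.
                dom.migrateSetMaxDowntime(relaxed_downtime_ms, 0)
                return
            time.sleep(2)

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('instance-00000001')  # illustrative name
    watch_migration(dom)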
Re: [Openstack] KVM live block migration: stability, future, docs
On Aug 9, 2012, at 7:13 AM, Daniel P. Berrange berra...@redhat.com wrote:
> With non-live migration, the migration operation is guaranteed to
> complete. With live migration, you can get into a non-convergence
> scenario where the guest is dirtying data faster than it can be
> migrated. With the way Nova currently works, the live migration will
> just run forever with no way to stop it. So if you want to enable
> live migration by default, we'll need to do more than simply set the
> flag. Nova will need to be able to monitor the migration, and either
> cancel it after some time, or tune the max allowed downtime to let it
> complete.

Ah, good to know. So it sounds like we should keep the default as-is for now and revisit it later.

Vish
Re: [Openstack] KVM live block migration: stability, future, docs
On Wed, Aug 08, 2012 at 09:50:20AM +0800, Huang Zhiteng wrote:
>> But to the contrary: I tested live-migrate (without block migrate)
>> last night using a guest with 8GB RAM (almost fully committed) and
>> lost any access/contact with the guest for over 4 minutes - it was
>> paused for the duration. Not something I'd want to do to a user's
>> web-server on a regular basis...
>
> 4 minutes of pause (down time)? That's way too long. Even if there
> was a crazy memory-intensive workload inside the VM being migrated,
> the worst case is KVM has to pause the VM and transmit all 8 GB of
> memory (all memory dirty, which is very rare). If you have a 1GbE
> link between the two hosts, that worst-case pause period (down time)
> is less than 2 minutes. My previous experience is: the down time for
> migrating one idle (almost no memory access) 8GB VM via 1GbE is less
> than 1 second; the down time for migrating an 8 GB VM whose pages get
> dirty really quickly is ~60 seconds. FYI.

KVM has a tunable setting for the maximum allowable live migration downtime, which IIRC defaults to something very small like 250ms. If the migration can't be completed within this downtime limit, KVM will simply never complete the migration.

Given that Nova does not tune this downtime setting, I don't see how you'd see 4 mins downtime unless it was not truly live migration, or there was something else broken (eg the network bridge device had a delay inserted by the STP protocol which made the VM /appear/ to be unresponsive on the network even though it was running fine).

Regards,
Daniel
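As a back-of-the-envelope check on why that default matters: at a realistic ~110 MB/s of useful 1GbE throughput, a 250 ms downtime window only allows roughly 110 MB/s x 0.25 s ≈ 28 MB of still-dirty memory to be transferred during the final pause. A guest that re-dirties more than ~28 MB per copy round can therefore never satisfy the limit, which is exactly the non-convergence scenario described elsewhere in this thread.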
Re: [Openstack] KVM live block migration: stability, future, docs
On Tue, Aug 07, 2012 at 04:13:22PM -0400, Jay Pipes wrote:
> On 08/07/2012 08:57 AM, Blair Bethwaite wrote:
>> I also feel a little concern about this statement: "It don't work so
>> well, it complicates migration code, and we are building a
>> replacement that works." I have to go further with my tests; maybe
>> we could share some ideas, use cases etc...
>>
>> I think it may be worth asking about this on the KVM lists, unless
>> anyone here has further insights...? I grabbed the KVM 1.0 source
>> from Ubuntu Precise and vanilla KVM 1.1.1 from Sourceforge; block
>> migration appears to remain in place despite those (sparse) comments
>> from the KVM meeting minutes (though I am naive to the source layout
>> and project structure, so could have easily missed something). In
>> any case, it seems unlikely Precise would see a forced update to the
>> 1.1.x series.
>
> cc'd Daniel Berrange, who seems to be keyed in on upstream KVM/Qemu
> activity. Perhaps Daniel could shed some light.

Block migration is a part of KVM that none of the upstream developers really like, is not entirely reliable, and most distros typically do not want to support it due to its poor design (eg not supported in RHEL). It is quite likely that it will be removed in favour of an alternative implementation. What that alternative impl will be, and when it will arrive, I can't say right now.

A lot of the work (possibly all) will probably be pushed up into libvirt, or even the higher level mgmt apps using libvirt. It could well involve the mgmt app having to set up an NBD or iSCSI server on the source host, and then launching QEMU on the destination host configured to stream the data across from the NBD/iSCSI server in parallel with the migration stream. But this is all just talk for now; no firm decisions have been made, beyond a general desire to kill the current block migration code.

Regards,
Daniel
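To give that idea some shape (purely illustrative - as Daniel says, nothing here is decided): the source host might export the disk image with something like 'qemu-nbd -t -p 10809 /path/to/disk.img', and the destination QEMU would then be pointed at a drive backed by 'nbd:source-host:10809', pulling disk blocks over that channel while the ordinary migration stream carries RAM. The path, port, and host name are made up for the example.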
Re: [Openstack] KVM live block migration: stability, future, docs
From memory (a fuzzy memory at that!) Nova will fall back to block migration if it believes shared storage is unavailable. This would explain the delay, but someone who's read the code recently can confirm...

Thanks,
Kiall

On Aug 8, 2012 11:08 AM, Daniel P. Berrange berra...@redhat.com wrote:
> On Wed, Aug 08, 2012 at 09:50:20AM +0800, Huang Zhiteng wrote:
>> [...]
>
> KVM has a tunable setting for the maximum allowable live migration
> downtime, which IIRC defaults to something very small like 250ms. If
> the migration can't be completed within this downtime limit, KVM will
> simply never complete the migration.
>
> Given that Nova does not tune this downtime setting, I don't see how
> you'd see 4 mins downtime unless it was not truly live migration, or
> there was something else broken (eg the network bridge device had a
> delay inserted by the STP protocol which made the VM /appear/ to be
> unresponsive on the network even though it was running fine).
Re: [Openstack] KVM live block migration: stability, future, docs
Hi!

I think it's a pretty useful feature, a good compromise. As you said, using a shared fs implies a lot of things and can dramatically decrease your performance compared to using the local fs. I tested it and I will use it for my deployment. I'll be happy to discuss more deeply with you about this feature :)

I also feel a little concern about this statement: "It don't work so well, it complicates migration code, and we are building a replacement that works." I have to go further with my tests; maybe we could share some ideas, use cases etc...

Cheers!

On Mon, Aug 6, 2012 at 3:08 PM, Blair Bethwaite blair.bethwa...@gmail.com wrote:
> Hi all,
>
> KVM block migration support in OpenStack
> (https://blueprints.launchpad.net/nova/+spec/kvm-block-migration)
> seems to be somewhat of a secret [...]
Re: [Openstack] KVM live block migration: stability, future, docs
Hi Sébastien,

Thanks for responding! By the way, I have come across your blog post regarding this and should reference it for the list: http://www.sebastien-han.fr/blog/2012/07/12/openstack-block-migration/

On 7 August 2012 17:45, Sébastien Han han.sebast...@gmail.com wrote:
> I think it's a pretty useful feature, a good compromise. As you said,
> using a shared fs implies a lot of things and can dramatically
> decrease your performance compared to using the local fs.

Agreed, scale-out distributed file-systems are hard. Consistent-hashing based systems (like Gluster and Ceph) seem like the answer to many of the existing issues with systems trying to mix scalability, performance and POSIX compliance. But the key issue is how one measures performance for these systems... throughput for large synchronous reads/writes may scale linearly (up to network saturation), but random IOPS are another thing entirely. As far as I can tell, random IOPS are the primary metric of concern in the design of nova-compute storage, whereas both capacity and throughput requirements are relatively easy to specify and simply represent hard limits that must be met to support the various instance flavours you plan to offer.

It's interesting to note that RedHat do not recommend using RHS (RedHat Storage), their RHEL-based Gluster (which they own now) appliance, for live VM storage. Additionally, operations issues are much harder to handle with a DFS (even NFS), e.g., how can I put an upper limit on disk I/O for any particular instance when its ephemeral disk files are across the network and potentially striped into opaque objects across multiple storage bricks...?

> I tested it and I will use it for my deployment. I'll be happy to
> discuss more deeply with you about this feature :)

Great. We have tested too. Compared to regular (non-block) live migrate, we don't see much difference in the guest - both scenarios involve a minute or two of interruption as the guest is moved (e.g. VNC and SSH sessions hang temporarily), which I find slightly surprising - is that your experience too?

> I also feel a little concern about this statement: "It don't work so
> well, it complicates migration code, and we are building a
> replacement that works." I have to go further with my tests; maybe we
> could share some ideas, use cases etc...

I think it may be worth asking about this on the KVM lists, unless anyone here has further insights...? I grabbed the KVM 1.0 source from Ubuntu Precise and vanilla KVM 1.1.1 from Sourceforge; block migration appears to remain in place despite those (sparse) comments from the KVM meeting minutes (though I am naive to the source layout and project structure, so could have easily missed something). In any case, it seems unlikely Precise would see a forced update to the 1.1.x series.

--
Cheers,
~Blairo
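On the I/O capping question: for instance disks on a local file-system, libvirt (0.9.8 or newer, with a sufficiently recent QEMU) can apply per-device throttles via virDomainSetBlockIoTune - which is exactly what becomes hard to enforce once blocks are striped across a DFS. A minimal sketch, with an illustrative guest name, device, and limit:

    import libvirt

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('instance-00000001')  # illustrative name
    # Cap the guest's 'vda' device at 500 total IOPS on the live
    # domain; keys such as 'total_bytes_sec' work similarly.
    dom.setBlockIoTune('vda', {'total_iops_sec': 500},
                       libvirt.VIR_DOMAIN_AFFECT_LIVE)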
Re: [Openstack] KVM live block migration: stability, future, docs
On 08/07/2012 08:57 AM, Blair Bethwaite wrote:
> [...]
>
> It's interesting to note that RedHat do not recommend using RHS
> (RedHat Storage), their RHEL-based Gluster (which they own now)
> appliance, for live VM storage. Additionally, operations issues are
> much harder to handle with a DFS (even NFS), e.g., how can I put an
> upper limit on disk I/O for any particular instance when its
> ephemeral disk files are across the network and potentially striped
> into opaque objects across multiple storage bricks...?

We at AT&T are also interested in this area, for the record, and will likely do testing in this area in the next 6-12 months. We will release any information and findings to the mailing list of course, and hopefully we can collaborate on this important area.

> Great. We have tested too. Compared to regular (non-block) live
> migrate, we don't see much difference in the guest - both scenarios
> involve a minute or two of interruption as the guest is moved (e.g.
> VNC and SSH sessions hang temporarily), which I find slightly
> surprising - is that your experience too?

Why would you find this surprising? I'm just curious...

> I think it may be worth asking about this on the KVM lists, unless
> anyone here has further insights...? [...]

cc'd Daniel Berrange, who seems to be keyed in on upstream KVM/Qemu activity. Perhaps Daniel could shed some light.

Best,
-jay
Re: [Openstack] KVM live block migration: stability, future, docs
Hi Jay,

On 8 August 2012 06:13, Jay Pipes jaypi...@gmail.com wrote:
> Why would you find this surprising? I'm just curious...

The live migration algorithm detailed here: http://www.linux-kvm.org/page/Migration seems to me to indicate that only a brief pause should be expected. Indeed, the summary says, "Almost unnoticeable guest down time."

But to the contrary: I tested live-migrate (without block migrate) last night using a guest with 8GB RAM (almost fully committed) and lost any access/contact with the guest for over 4 minutes - it was paused for the duration. Not something I'd want to do to a user's web-server on a regular basis...

> cc'd Daniel Berrange, who seems to be keyed in on upstream KVM/Qemu
> activity. Perhaps Daniel could shed some light.

That would be wonderful. Thanks!

--
Cheers,
~Blairo
Re: [Openstack] KVM live block migration: stability, future, docs
On 08/07/2012 08:23 PM, Blair Bethwaite wrote:
> Hi Jay,
>
> On 8 August 2012 06:13, Jay Pipes jaypi...@gmail.com wrote:
>> Why would you find this surprising? I'm just curious...
>
> The live migration algorithm detailed here:
> http://www.linux-kvm.org/page/Migration seems to me to indicate that
> only a brief pause should be expected. Indeed, the summary says,
> "Almost unnoticeable guest down time."
>
> But to the contrary: I tested live-migrate (without block migrate)
> last night using a guest with 8GB RAM (almost fully committed) and
> lost any access/contact with the guest for over 4 minutes - it was
> paused for the duration. Not something I'd want to do to a user's
> web-server on a regular basis...

Sorry, from your original post I didn't think you were referring to live migration, but rather just server migration. You had written "Compared to regular (non-block) live migrate", but I read that as "Compared to regular migrate" and thought you were referring to the server migration behaviour that Nova supports... sorry about that!

Best,
-jay
Re: [Openstack] KVM live block migration: stability, future, docs
On 8 August 2012 11:33, Jay Pipes jaypi...@gmail.com wrote:
> Sorry, from your original post I didn't think you were referring to
> live migration, but rather just server migration. You had written
> "Compared to regular (non-block) live migrate", but I read that as
> "Compared to regular migrate" and thought you were referring to the
> server migration behaviour that Nova supports... sorry about that!

Jay, is your use of the wording "behaviour that Nova supports" there significant? I mean, you're not trying to indicate that Nova does not support _live_ migration, are you?

Anyway, I found this relevant and stale bug: https://bugs.launchpad.net/nova/+bug/883845. VIR_MIGRATE_LIVE remains undefined in https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py. We only just discovered the lack of this as a default option, so we'll test further, this time with VIR_MIGRATE_LIVE=1 explicitly specified in nova.conf...

--
Cheers,
~Blairo
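For what it's worth, the libvirt driver consumes that option by (roughly - paraphrasing rather than quoting the Nova source) splitting the live_migration_flag string and OR-ing together the named libvirt constants, so "VIR_MIGRATE_LIVE=1" isn't quite the nova.conf syntax; the flag name just needs to appear in the comma-separated list. A sketch of the mechanism:

    import libvirt

    # Roughly what Nova's libvirt driver does with the config value;
    # the string below is an example, not a recommended default.
    flag_str = "VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER, VIR_MIGRATE_LIVE"
    flags = 0
    for name in flag_str.split(','):
        flags |= getattr(libvirt, name.strip())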
Re: [Openstack] KVM live block migration: stability, future, docs
> But to the contrary: I tested live-migrate (without block migrate)
> last night using a guest with 8GB RAM (almost fully committed) and
> lost any access/contact with the guest for over 4 minutes - it was
> paused for the duration. Not something I'd want to do to a user's
> web-server on a regular basis...

4 minutes of pause (down time)? That's way too long. Even if there was a crazy memory-intensive workload inside the VM being migrated, the worst case is KVM has to pause the VM and transmit all 8 GB of memory (all memory dirty, which is very rare). If you have a 1GbE link between the two hosts, that worst-case pause period (down time) is less than 2 minutes.

My previous experience is: the down time for migrating one idle (almost no memory access) 8GB VM via 1GbE is less than 1 second; the down time for migrating an 8 GB VM whose pages get dirty really quickly is ~60 seconds. FYI.

--
Regards
Huang Zhiteng
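The 2-minute worst-case bound is easy to sanity-check: 8 GB is about 68.7 gigabits, so even at full 1 Gbit/s line rate the transfer takes roughly 69 seconds, and at a more realistic ~110 MB/s of useful throughput it still lands around 75 seconds - comfortably under 2 minutes. A 4-minute outage therefore points to something other than a straight memory copy.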
Re: [Openstack] KVM live block migration: stability, future, docs
On 08/07/2012 09:42 PM, Blair Bethwaite wrote:
> On 8 August 2012 11:33, Jay Pipes jaypi...@gmail.com wrote:
>> Sorry, from your original post I didn't think you were referring to
>> live migration, but rather just server migration. [...]
>
> Jay, is your use of the wording "behaviour that Nova supports" there
> significant? I mean, you're not trying to indicate that Nova does not
> support _live_ migration, are you?

No, I was referring to the differentiation between server migration in Nova and live migration in Nova. In other words, the difference between:

$ nova migrate SERVER ...

and

$ nova live-migration SERVER ...

> Anyway, I found this relevant and stale bug:
> https://bugs.launchpad.net/nova/+bug/883845. VIR_MIGRATE_LIVE remains
> undefined in
> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py.
> We only just discovered the lack of this as a default option, so
> we'll test further, this time with VIR_MIGRATE_LIVE=1 explicitly
> specified in nova.conf...

OK, cheers,
-jay
[Openstack] KVM live block migration: stability, future, docs
Hi all,

KVM block migration support in OpenStack (https://blueprints.launchpad.net/nova/+spec/kvm-block-migration) seems to be somewhat of a secret - there's almost nothing in the docs/guides (which to the contrary state that live migration is only possible with shared storage) and only a couple of mentions on list, yet it's been around since Diablo. Should this be taken to mean it's considered unstable, or just that no-one interested in documenting it understands the significance of such a feature to deployment architects? After all, decent shared storage is an expensive prospect with a pile of associated design and management overhead!

I'd be happy to contribute some documentation patches (starting with the admin guide) that cover this. But first I'd like to get some confirmation that it's here to stay, which will be significant for our own large deployment.

We've tested with Essex on Ubuntu Precise and seen a bit of weird file-system behaviour, which we currently suspect might be a consequence of using ext3 in the guest. But also, there seems to be some associated lag with interactive services (e.g. an active VNC session) in the guest; not yet sure how this compares to the non-block live migration case. We'd really appreciate it if anybody actively using this feature would speak up and comment on their mileage, especially with respect to ops.

I'm slightly concerned that KVM may drop this going forward (http://www.spinics.net/lists/kvm/msg72228.html), though that would be unlikely to affect anybody deploying on Precise.

--
Cheers,
~Blairo
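At the libvirt level, the block migration discussed in this thread is a live migration with the VIR_MIGRATE_NON_SHARED_DISK flag added, so the disk image is copied alongside guest RAM instead of being assumed to sit on shared storage. A minimal Python sketch (the guest name and destination URI are placeholders):

    import libvirt

    flags = (libvirt.VIR_MIGRATE_LIVE
             | libvirt.VIR_MIGRATE_PEER2PEER
             | libvirt.VIR_MIGRATE_UNDEFINE_SOURCE
             | libvirt.VIR_MIGRATE_NON_SHARED_DISK)  # the block-migration bit

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('instance-00000001')  # placeholder name
    # Copy RAM and disk blocks to the destination host in one job.
    dom.migrateToURI('qemu+tcp://dest-host/system', flags, None, 0)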