On 2011-11-11 01:23, Dmitry Golubev wrote:
>> ManageVE has migration support using chkpt/restore since resource-agents
>> version 1.0.4. .... but if I understand the OpenVZ migration concept
>> correct ... please someone correct me if I'm wrong! ... there is no need
>> for a shared storage.
>>
>> The vzmigrate script rsyncs complete data, config and state between
>> nodes .... no shared storage needed.
>>
>> Of course you would need twice the diskspace, but this is also true for
>> DRBD replication. Extending ManageVE to use vzmigrate for live migration
>> looks quite straight forward to me.

Let me chime in here, as I originally added migrate_from and migrate_to
to ManageVE.

> My apologies - I did not notice ManageVE has migrate actions, as its usage
> help does not list them (somebody forgot to add them).

Sorry about that; I've added them now. As a general rule though, the
authoritative documentation for a resource agent is always
"crm ra info <agent>" or the RA man page -- in the ManageVE case,
"man ocf_heartbeat_ManageVE" -- both of which do mention migration. I
have, however, just updated the man page auto-generation so we get an
additional paragraph informing people that the RA supports native
migration.

> However it is not so easy as it seems. The ManageVE makes a checkpoint and
> restores the machine (not vzmigrate, as I will explain further on), but it
> also needs a shared or migratable storage to place the dumpfile on. So the
> MigrateVE does have exactly the same issue with migration as I mentioned.
> You can look at the source - it has a comment, which says exactly that.

Yes, and there is a simple reason for that: it's much faster than
vzmigrate. The checkpoint and restore can be completed in a matter of
seconds, and the incurred downtime is minimal. In HA configurations,
uptime is something people care about a lot, so it made sense to
implement it that way.

> The vzmigrate script, on the other hand, works a bit differently: it
> transfers the whole virtual machine over the network. Now this approach has
> three obvious drawbacks. First, the need to send huge amount of data over,
> so it will be very very slow (I've seen such migration take hours if the
> virtual machine is very large, say a terabyte), and, moreover, will slowdown
> the complete disk subsystem of the current active node. Second, it will not
> be live at all, since it needs to suspend the machine and synchronize what's
> left unsynchronized during the first run (all the modifications took place
> during the first rsync).
> It will also need to recalculate quota, which takes a lot of time as well
> (for a terabyte virtual machine I would estimate quota to calculate up to an
> hour, depending on the disk subsystem). And third, most importantly, there
> will be zero fault tolerance, as the copy on the second node is not being
> synchronized with the current primary. Now, I do not intend to say that
> vzmigrate is evil or incorrect: it has its purposes, and I've used it to
> migrate virtual machines to new disks (where disks can not be shared) many
> times, and I was very very happy with just how it works... but it is just
> not suitable for this particular purpose.
>
> A filesystem on an active-passive DRBD, on the other hand, provides full
> online synchronization, so not only the second node could take over once the
> primary failed, but also live migration would be just a matter of dumping
> the memory file, unmounting the filesystem, remounting it on the other node
> and reading the memory file - fast, clean and simple.

Right. So the only thing you're saying is, rather than doing vzctl stop
during stop and vzctl chkpnt during migrate_to, you just always want it
to do vzctl chkpnt during stop, too?

Well, that's something we can add to the existing RA -- again, no need to
roll your own. Let me know if that's what you want, and then we can
discuss how best to implement it.

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
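
For readers of the archive, here is a rough sketch of the command sequences
discussed in this thread. It is not the ManageVE code. It only builds the
vzctl command lines that each action would run, assuming the dump file lives
on storage that both nodes can reach (for example the DRBD-backed
filesystem). The function names, the checkpoint_on_stop switch, the dump
directory and the dump file naming are all hypothetical and chosen for
illustration only.

```python
from typing import List

Command = List[str]


def dumpfile_path(veid: int, dumpdir: str) -> str:
    # Hypothetical naming scheme; the real RA may name its dump file differently.
    return f"{dumpdir}/Dump.{veid}"


def migrate_to_cmds(veid: int, dumpdir: str) -> List[Command]:
    # Runs on the source node: checkpoint the container into a dump file.
    return [["vzctl", "chkpnt", str(veid),
             "--dumpfile", dumpfile_path(veid, dumpdir)]]


def migrate_from_cmds(veid: int, dumpdir: str) -> List[Command]:
    # Runs on the target node: restore the container from the same dump file.
    return [["vzctl", "restore", str(veid),
             "--dumpfile", dumpfile_path(veid, dumpdir)]]


def stop_cmds(veid: int, dumpdir: str,
              checkpoint_on_stop: bool = False) -> List[Command]:
    # checkpoint_on_stop is an assumed option modelling the proposal above:
    # checkpoint on stop instead of doing a plain shutdown.
    if checkpoint_on_stop:
        return migrate_to_cmds(veid, dumpdir)
    return [["vzctl", "stop", str(veid)]]


if __name__ == "__main__":
    # Dry run: print the commands rather than executing them.
    for cmd in stop_cmds(101, "/shared/dump", checkpoint_on_stop=True):
        print(" ".join(cmd))
```

With checkpoint_on_stop left at False, stop maps to a plain vzctl stop and
only the migrate actions use checkpoint and restore. Setting it to True
makes stop produce the same checkpoint command as migrate_to.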