On 25/10/2013 06:58, Lei Li wrote:
> Right now just has inaccurate numbers without the new vmsplice, which
> based on the result from info migrate, as the guest ram size increases,
> although the 'total time' is number of times less compared with the
> current live migration, but the 'downtime' performs badly.

Of course.

> For a 1GB ram guest,
>
> total time: 702 milliseconds
> downtime: 692 milliseconds
>
> And when the ram size of guest increases exponentially, those numbers are
> proportional to it.
>
> I will make a list of the performance with the new vmsplice later, I am
> sure it'd be much better than this at least.

Yes, please.  Is the memory usage still 2x without vmsplice?

I think you have a nice proof of concept, but on the other hand this
probably needs to be coupled with some kind of postcopy live migration,
that is:

* the source starts sending data

* but the destination starts running immediately

* if the machine needs a page that is missing, the destination asks the
  source to send it

* as soon as it arrives, the destination can restart

Using postcopy is problematic for reliability: if the destination fails,
the virtual machine is lost because the source doesn't have the latest
content of memory.  However, this is a much, much smaller problem for
live QEMU upgrade, where the network cannot fail.

If you do this, you can achieve pretty much instantaneous live upgrade,
well within your original 200 ms goal.  But the flipping code with
vmsplice should be needed anyway to avoid doubling memory usage, and
it's looking pretty good in this version already!  I'm relieved that the
RDMA code was designed right!

Paolo
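
For illustration only, the postcopy flow listed above (background
transfer, destination running immediately, missing pages requested on
demand) can be sketched as a toy simulation.  This is a sketch under
stated assumptions, not QEMU code: the names Source, Destination,
push_next, fetch, receive_background and read are all hypothetical, and
"pages" are just dictionary entries.

```python
class Source:
    """Hypothetical source side: owns guest RAM until every page is sent."""

    def __init__(self, pages):
        self.pages = dict(pages)      # page number -> contents
        self.unsent = sorted(pages)   # pages not yet transferred

    def push_next(self):
        """Background stream: return the next unsent page, or None when done."""
        if not self.unsent:
            return None
        n = self.unsent.pop(0)
        return n, self.pages[n]

    def fetch(self, n):
        """On-demand request from the destination for a missing page."""
        if n in self.unsent:
            self.unsent.remove(n)     # do not send it again in the background
        return self.pages[n]


class Destination:
    """Hypothetical destination side: starts running with empty RAM."""

    def __init__(self, source):
        self.source = source
        self.ram = {}
        self.faults = 0               # number of on-demand page requests

    def receive_background(self):
        """Accept one page from the background stream; False when finished."""
        item = self.source.push_next()
        if item is None:
            return False
        n, data = item
        self.ram[n] = data
        return True

    def read(self, n):
        """Guest access: stall and ask the source if the page is missing."""
        if n not in self.ram:
            self.faults += 1
            self.ram[n] = self.source.fetch(n)
        return self.ram[n]


src = Source({0: "a", 1: "b", 2: "c", 3: "d"})
dst = Destination(src)

dst.receive_background()          # page 0 arrives via the background stream
value = dst.read(3)               # page 3 is missing, so it is fetched on demand
while dst.receive_background():   # the remaining pages arrive in the background
    pass
```

The sketch also shows the reliability concern: once the destination is
running, the only complete copy of memory is split between the two
sides, so losing the destination before the transfer finishes loses the
guest.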