* Li, Liang Z (liang.z...@intel.com) wrote:
> > Subject: Re: [RFC Design Doc]Speed up live migration by skipping free pages
> >
> > * Li, Liang Z (liang.z...@intel.com) wrote:
> > > Hi Dave,
> > >
> > > I am now working on how to benefit post-copy by skipping the free
> > > pages, and I remember you have said we should let the destination know
> > > the info of free pages so as to avoid request the free pages from the
> > > source.
> > >
> > > We have two solutions:
> > >
> > > a. send the migration dirty page bitmap to destination before post
> > > copy start, so the destination can decide whether to request the pages
> > > or place zero pages by checking the migration dirty page bitmap. The
> > > advantage is that we can avoid sending the free pages. the
> > > disadvantage is that we have to send extra data to destination.
> > >
> > > b. Check the page request on the source side, if it's not a dirty
> > > page, send a zero page header to the destination.
> > >
> > > What's your opinion about them?
> >
> > (b) is certainly simpler - and requires no changes on the destination
> > side or the protocol.
> > If you then decided to add stuff to send the dirty page bit map later you
> > could do.
> >
> > However, there are some other problems to figure out:
> > 1) The source side quits when it thinks it's sent all pages; when is your
> >    source going to quit? If it quits while the destination still has
> >    unfulfilled pages then the destination will fail.
>
> The source quit as the same as before, but before quitting, tell destination
> it has already quit.
> After that, the destination don't need to request pages from the source, just
> place zero pages. works?

Yes, maybe. The destination side would somehow have to clean up once it has
all the zero pages, but it currently doesn't keep a count or map of which
pages still need to be received.

Actually, perhaps that's easy - when the destination receives the
'quit it's zero' message from the source, maybe it just turns off userfault;
any fresh accesses would get a zero page.
However, I'm not sure what happens to pages that are already blocked/waiting
for a page - that we'd need to check with Andrea/test.

> > 2) I sent a 'discard' bitmap of pages for the destination to unmap
> >    just at the change into postcopy; so I'm already sending one bitmap;
> >    this is for pages that have been changed since they were first sent
> >    but not yet resent.
> >    Be careful about how any changes you make interact with the generation
> >    of that bitmap.
>
> Thanks for your reminding.
>
> > 3) It's potentially very slow if the destination has to keep requesting
> >    blank pages.
>
> Yes, really.
>
> > Essentially what you're suggesting for (a) is a way to send a compressed set
> > of 'page is zero' messages based on a bitmap, and you're worried about the
> > time to send it - which I think is where we started the conversation about
> > time to deal with zeros :-). Two ways to think of that are:
>
> All my thoughts are in your words. :)
>
> > 4) I already send one bitmap - so you're only doubling it in theory;
> >    I originally used a sparse bitmap but the suggestion was it was
> >    more complex than needed and it turned into more of a run-length
> >    encoding.
> > 5) You're worried it would increase the downtime as you send the bitmap;
> >    however if you implement (b) as well as (a) then you can send the data
> >    for (a) after the destination is running and not increase the downtime.
>
> The downtime is main reason that I start to consider about (b), for VM with
> huge amount of RAM.
> the downtime will become a big problem. Obviously, (a) is more efficient
> then (b).

With your idea about sending a 'quit' message to tell the destination the
remaining pages are all zero, I'm not sure that's true - (b) + the quit
message sounds like a good combination.

Dave

> >
> > Dave
> >
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
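
For illustration only, the "(b) + quit message" combination discussed in this thread can be modelled as a toy Python sketch. This is not QEMU code: the `Source` and `Destination` classes, the message tuples, and `on_quit_message` are hypothetical names invented here, and the dirty bitmap is reduced to a set of page indices. The sketch also does not model the open question raised above about faults that are already blocked when the quit message arrives.

```python
PAGE_SIZE = 4096  # assumed page size for the toy model


class Source:
    """Toy migration source: answers postcopy page requests as in option (b)."""

    def __init__(self, ram, dirty):
        self.ram = ram            # page index -> page contents (bytes)
        self.dirty = set(dirty)   # pages whose data the destination still lacks

    def handle_request(self, page):
        if page in self.dirty:
            self.dirty.discard(page)
            return ("page", page, self.ram[page])
        # Not dirty: reply with a zero page header only, no payload.
        return ("zero", page, None)

    def finished(self):
        # The source may quit once no dirty pages remain.
        return not self.dirty


class Destination:
    """Toy migration destination: requests pages on fault until told to stop."""

    def __init__(self, source):
        self.source = source
        self.received = {}
        self.rest_is_zero = False
        self.requests_sent = 0

    def on_quit_message(self):
        # Hypothetical 'quit, the rest is zero' message from the source.
        self.rest_is_zero = True

    def fault(self, page):
        if page in self.received:
            return self.received[page]
        if self.rest_is_zero:
            data = bytes(PAGE_SIZE)          # place a zero page locally
        else:
            self.requests_sent += 1          # round trip to the source
            kind, _, payload = self.source.handle_request(page)
            data = payload if kind == "page" else bytes(PAGE_SIZE)
        self.received[page] = data
        return data


if __name__ == "__main__":
    src = Source({0: b"\x01" * PAGE_SIZE}, dirty=[0])
    dst = Destination(src)
    dst.fault(1)                  # free page: costs a request, gets a zero header
    dst.fault(0)                  # dirty page: real data is sent
    if src.finished():
        dst.on_quit_message()     # source announces the rest is zero, then quits
    dst.fault(2)                  # no request needed any more
    print("requests sent:", dst.requests_sent)
```

In this model every free page touched before the quit message still costs one request, which is the slowness noted in point 3; after the message, faults are resolved locally without contacting the source.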