Inline

> -----Original Message-----
> From: Isaku Yamahata [mailto:yamah...@valinux.co.jp]
> Sent: 14 November 2012 02:23
> To: Hudzia, Benoit
> Cc: quint...@redhat.com; qemu-devel qemu-devel; Orit Wasserman;
> chegu_vi...@hp.com; Michael Roth
> Subject: Re: Migration To-do list
>
> On Tue, Nov 13, 2012 at 05:46:13PM +0000, Hudzia, Benoit wrote:
> > Hi,
> >
> > One concept we have been playing around with in the context of hybrid
> > and post copy, and which might make sense if you are orienting your
> > effort toward RDMA / post copy, is to move most of the logic to the
> > destination side.
> >
> > This is one thing you might want to consider, as it can solve some of
> > the issues you currently have and allow you to maintain almost a single
> > API / protocol once integrating with the post copy approach.
> >
> > The idea is to drive the migration from the destination side, i.e. the
> > pages are pulled by the destination and not pushed from the source
> > side.
> >
> > Ex: current pre-copy:
> >
> > * extract dirty bitmap (dirty bitmap extraction can be scheduled or
> >   triggered by destination)
> > * send it to the destination side
> > * have the destination iterate over the bitmap (can do page
> >   prioritization here)
>
> IIRC last year, you mentioned page prioritization, but didn't this year.
> Is it still supported?
> Where is it implemented? in qemu or kernel?

It is in QEMU; it is too expensive and specialised to do within the kernel.
I think Orit did some work on this aspect, but I am not 100% sure it is in
the stable branch yet.

> > * depending on protocol:
> >     _ with standard socket (or RDS):
> >         . Destination: request page(s) <- can be batched
> >         . source receives request, sends back the page
> >         . destination processes it
> >     _ with RDMA:
> >         . Destination reads page from source to local page (the page
> >           has been mapped to RDMA at the bitmap extraction) (RDMA
> >           supports scatter gather)
>
> Although I'm not familiar with RDMA, RDMA requires the exchange of
> DMA-address between sender and receiver in advance and pinning down pages.
> Is it correct?

Yes, that is correct. This is why you would register the memory only when
the page is dirtied, which avoids pinning large amounts of memory for too
long (with an unpin upon RDMA read confirmation).
The address is the same as the one within the virtual memory. What you
exchange is a combination of the RDMA key (to uniquely identify the memory
region you are sharing) and the start address (offset) of the MR. Then you
can read and write at will within it.

That is why it is a little bit tricky: RDMA writes and reads typically do
not trigger any notification (CPU, OS, etc.; everything is bypassed), so
your page content can change without the process/OS knowing it.

> >     _ with post copy
> >         . pretty much the same, but the dirty bitmap reset is done in
> >           kernel during the post copy operation (provides a better
> >           dirty bit tracking granularity)
> >
> >
> > Disadvantage:
> > * adds a round trip that can be compensated with batch operation (only
> >   with standard socket)
> >
> > Advantage:
> > * most of the heavy lifting is done at the destination side, leaving
> >   the source to respond to requests in an event based format
> > * resolves a lot of the issues you have with your threading from the
> >   sender side (accounting etc.)
> > * extremely friendly to optimised solutions
> > * if the bitmap generation is expensive we can overlap their
> >   generation, creating a semi continuous delivery of them and
> >   guaranteeing an uninterrupted and optimised flow. => we decouple the
> >   bitmap generation from the send / receive operation.
> >
> >
> > Anyway, I will notify you as soon as I have the patch / library
> > available for RDMA / postcopy.
> >
> > Note on the fault tolerance part: this requires a lot more heavy code
> > optimisation and poking around to guarantee efficient checkpointing.
> > Most of the solutions we tested so far (Remus and an old version of
> > Kemari) scale poorly. Again, an RDMA / post copy solution is kind of
> > necessary when you talk about checkpointing enterprise class
> > applications.
>
> IIRC Kemari guys evaluated IB case. I'm not sure that it was with RDMA or
> IPoIB.
>
> thanks,
> >
> >
> > Regards
> > Benoit
> >
> >
> > > -----Original Message-----
> > > From: Juan Quintela [mailto:quint...@redhat.com]
> > > Sent: 13 November 2012 16:19
> > > To: qemu-devel qemu-devel; Orit Wasserman; chegu_vi...@hp.com;
> > > Hudzia, Benoit; Isaku Yamahata; Michael Roth
> > > Subject: Migration ToDo list
> > >
> > >
> > > Hi
> > >
> > > If you have anything else to put, please add.
> > >
> > > Migration Thread
> > > * Plan is to integrate it as one of the first things in December (me)
> > > * Remove copies with buffered file (me)
> > >
> > > Bitmap Optimization
> > > * Finish moving to individual bitmaps for migration/vga/code
> > > * Make sure we don't copy things around
> > > * Shared memory bitmap with kvm?
> > > * Move to 2MB pages bitmap and then fine grain?
> > >
> > > QIDL
> > > * Review the patches (me)
> > >
> > > PostCopy
> > > * Review patches?
> > > * See what we can already integrate?
> > >   I remember from last year that we could integrate the 1st third or so
> > >
> > > RDMA
> > > * Send RDMA/tcp/.... library they already have (Benoit)
> > > * This is required for postcopy
> > > * This can be used for precopy
> > >
> > > General
> > > * Change protocol to:
> > >   a) being always 16byte aligned (paolo said that is faster)
> > >   b) do scatter/gather of the pages?
> > >
> > > Fault Tolerance
> > > * That is built on top of migration code, but I have nothing to add.
> > >
> > > Any more ideas?
> > >
> > > Later, Juan.
> > >
> --
> yamahata
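
For illustration, the destination-driven pre-copy flow over a plain request/response channel (the "standard socket" case in the thread) could be modelled roughly as below. This is only a toy sketch in Python under stated assumptions, not QEMU code: `Source`, `Destination`, `extract_dirty_bitmap`, `read_pages`, `migrate` and the `BATCH` size are all hypothetical names made up for the example, and the network is replaced by direct method calls.

```python
"""Toy model of destination-driven (pull) pre-copy.

All names here are hypothetical illustrations, not QEMU APIs.
"""

PAGE_SIZE = 4096
BATCH = 4  # pages requested per round trip (batching offsets the extra latency)


class Source:
    """Source side: only answers requests, in an event-driven style."""

    def __init__(self, num_pages):
        self.memory = [bytes([i % 256]) * PAGE_SIZE for i in range(num_pages)]
        self.dirty = set(range(num_pages))  # first pass: every page is dirty

    def write(self, page, value):
        """Simulate the guest dirtying a page."""
        self.memory[page] = bytes([value]) * PAGE_SIZE
        self.dirty.add(page)

    def extract_dirty_bitmap(self):
        """Return the dirty page numbers and reset the tracking state."""
        bitmap, self.dirty = self.dirty, set()
        return sorted(bitmap)

    def read_pages(self, pages):
        """Answer one (possibly batched) page request."""
        return {p: self.memory[p] for p in pages}


class Destination:
    """Destination side: drives the migration by pulling pages."""

    def __init__(self, num_pages):
        self.memory = [None] * num_pages
        self.round_trips = 0

    def pull(self, source, priority=None):
        """Fetch one bitmap's worth of dirty pages; return how many were pulled."""
        pages = source.extract_dirty_bitmap()
        if priority is not None:
            pages.sort(key=priority)  # page prioritisation would happen here
        for i in range(0, len(pages), BATCH):
            reply = source.read_pages(pages[i:i + BATCH])
            self.round_trips += 1
            for p, data in reply.items():
                self.memory[p] = data
        return len(pages)


def migrate(source, destination, max_rounds=10):
    """Repeat pull rounds until one finds no dirty pages (or give up)."""
    for _ in range(max_rounds):
        if destination.pull(source) == 0:
            return True
    return False
```

In this model the source never decides what to send; it only extracts the bitmap and serves page reads, which is the property the proposal relies on. An RDMA variant would replace `read_pages` with one-sided reads from registered memory, which this sketch does not attempt to model.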