On 06/04/2012 03:37 PM, Anthony Liguori wrote:
> On 06/04/2012 05:57 PM, Isaku Yamahata wrote:
>> After a long time, we have v2. This is the qemu part.
>> The Linux kernel part is sent separately.
>>
>> Changes v1 ->  v2:
>> - split up patches for review
>> - buffered file refactored
>> - many bug fixes
>>    Especially, PV drivers can now work with postcopy
>> - optimization/heuristic
>>
>> Patches
>> 1 - 30: refactoring existing code and preparation
>> 31 - 37: implement postcopy itself (essential part)
>> 38 - 41: some optimization/heuristic for postcopy
>>
>> Intro
>> =====
>> This patch series implements postcopy live migration.[1]
>> As discussed at KVM Forum 2011, a dedicated character device is used for
>> distributed shared memory between the migration source and destination.
>> Now we can discuss/benchmark/compare with precopy. I believe there is
>> much room for improvement.
>>
>> [1] http://wiki.qemu.org/Features/PostCopyLiveMigration
>>
>>
>> Usage
>> =====
>> You need to load the umem character device on the host before starting migration.
>> Postcopy can be used with the tcg and kvm accelerators. The implementation depends
>> only on the Linux umem character device, and the driver-dependent code is split
>> into its own file.
>> I tested only the host page size == guest page size case, but the implementation
>> allows host page size != guest page size as well.
>>
>> The following options are added with this patch series.
>> - incoming part
>>    command line options
>>    -postcopy [-postcopy-flags <flags>]
>>    where <flags> changes behavior for benchmarking/debugging.
>>    Currently the following flags are available:
>>    0: default
>>    1: enable touching page request
>>
>>    example:
>>    qemu -postcopy -incoming tcp:0:4444 -monitor stdio -machine accel=kvm
>>
>> - outgoing part
>>    options for migrate command
>>    migrate [-p [-n] [-m]] URI [<prefault forward> [<prefault backward>]]
>>    -p: use postcopy migration
>>    -n: disable background transfer of pages (for benchmarking/debugging)
>>    -m: move background transfer of postcopy mode
>>    <prefault forward>: the number of forward pages sent along with an
>>                        on-demand page
>>    <prefault backward>: the number of backward pages sent along with an
>>                         on-demand page
>>
>>    example:
>>    migrate -p -n tcp:<dest ip address>:4444
>>    migrate -p -n -m tcp:<dest ip address>:4444 32 0
>>
>>
>> TODO
>> ====
>> - benchmark/evaluation. Especially how async page fault affects the result.
> 
> I don't mean to beat on a dead horse, but I really don't understand the point 
> of postcopy migration other than the fact that it's possible.  It's a lot of 
> code and a new ABI in an area where we already have too much difficulty 
> maintaining our ABI.
> 
> Without a compelling real world case with supporting benchmarks for why we 
> need postcopy and cannot improve precopy, I'm against merging this.
Hi Anthony,

The example is quite simple: consider a 300G guest that is dirtying 10 
percent of its memory every second (for example, SAP ...).
Even if we have a 30G/s network, we will need 1 second of downtime for this 
guest, and many workloads time out under that kind of downtime.
Guests are getting bigger and bigger, so for those big guests the only way to 
do live migration is to use post copy.
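The arithmetic behind this example can be sketched in a few lines. This is just a back-of-the-envelope model (GiB units, steady dirty rate, ignoring protocol overhead), using only the numbers given above: 300G of guest RAM, 10% dirtied per second, a 30G/s link.

```python
# Back-of-the-envelope downtime estimate for the example above.
# Assumed model: precopy converges only while the link outruns the
# dirty rate, and the final stop-and-copy must push the remaining
# dirty set in one go, which bounds the downtime from below.

guest_ram = 300.0              # GiB of guest RAM
dirty_rate = 0.10 * guest_ram  # GiB dirtied per second -> 30 GiB/s
link_bw = 30.0                 # GiB/s of migration bandwidth

# Here the link only matches the dirty rate, so the dirty set
# never shrinks and precopy cannot converge.
precopy_converges = link_bw > dirty_rate

# Minimum downtime for the final stop-and-copy of the dirty set:
downtime = dirty_rate / link_bw  # 30 / 30 = 1 second

print(precopy_converges, downtime)
```

With these numbers precopy never reaches a small enough dirty set to stop, and even the best-case stop-and-copy costs a full second of downtime, which is the point of the example.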
I agree we lose some reliability with post copy, but we can try to limit the 
risk:
- do a full copy of the guest RAM (precopy) and then switch to post copy only 
  for the updates
- the user will use a private LAN, maybe with redundancy, which is much safer
- maybe back up the memory to storage so that in case of network failure we 
  can recover

In the end it is up to the user; he can decide what he is willing to risk.
The default of course should always be precopy live migration; maybe we should 
even have a different command for post copy.
In the end I can see some users that will have no choice but to use post copy 
live migration, or to stop their guests in order to move them to another host.

Regards,
Orit
> 
> Regards,
> 
> Anthony Liguori
> 
> 

