Re: [Qemu-devel] Testing migration under stress

Orit Wasserman Tue, 06 Nov 2012 02:54:59 -0800
On 11/06/2012 07:22 AM, Alexey Kardashevskiy wrote:
> On 02/11/12 23:12, Orit Wasserman wrote:
>> On 11/02/2012 05:10 AM, David Gibson wrote:
>>> Asking for some advice on the list.
>>>
>>> I have prototype savevm and migration support ready for the pseries
>>> machine.  They seem to work under simple circumstances (idle guest).
>>> To test them more extensively I've been attempting to perform live
>>> migrations (just over tcp->localhost) while the guest is active with
>>> something.  In particular I've tried while using octave to do matrix
>>> multiply (so exercising the FP unit) and my colleague Alexey has tried
>>> during some video encoding.
>>>
>> As you are doing a local migration, one option is to set the speed higher
>> than line speed, as we don't actually send the data; another is to set a
>> high downtime.
>>
>>> However, in each of these cases, we've found that the migration only
>>> completes and the source instance only stops after the intensive
>>> workload has (just) completed.  What I surmise is happening is that
>>> the workload is touching memory pages fast enough that the ram
>>> migration code is never getting below the threshold to complete the
>>> migration until the guest is idle again.
>>>
>> The workload you chose is really bad for live migration, as all the guest
>> does is dirty its memory. I recommend looking for a workload that does some
>> networking or disk I/O.
>> Vinod succeeded in running the SwingBench and SLOB benchmarks, which
>> converged OK. I don't know if they run on pseries, but a similar workload
>> (a small database/warehouse) should be OK.
>> We found that SpecJbb, on the other hand, is hard to converge.
>> A web workload or video streaming also does the trick.
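
The convergence problem discussed above (dirty rate versus migration speed and allowed downtime) can be illustrated with a toy model. This is only a sketch under stated assumptions, not QEMU's actual algorithm: the function name, the constant dirty rate, and all the numbers are hypothetical, and real guests dirty memory far less uniformly.

```python
MIB = 1024 * 1024

def simulate_precopy(ram_bytes, dirty_rate, bandwidth, max_downtime, max_rounds=30):
    """Toy pre-copy model (hypothetical, not QEMU code).

    ram_bytes    -- guest RAM size in bytes
    dirty_rate   -- bytes the guest dirties per second (assumed constant)
    bandwidth    -- migration speed in bytes per second
    max_downtime -- allowed downtime in seconds
    Returns (converged, rounds, remaining_bytes).
    """
    remaining = float(ram_bytes)
    for rounds in range(1, max_rounds + 1):
        # Finish once the remaining dirty data fits in the allowed downtime.
        if remaining / bandwidth <= max_downtime:
            return True, rounds, remaining
        # Otherwise send what is dirty now; the guest keeps dirtying meanwhile.
        round_time = remaining / bandwidth
        remaining = min(float(ram_bytes), dirty_rate * round_time)
    return False, max_rounds, remaining

# Guest dirties memory faster than we can send it: never converges.
print(simulate_precopy(1024 * MIB, 64 * MIB, 32 * MIB, 0.03))
# Same guest, but a much higher speed (possible for a local migration).
print(simulate_precopy(1024 * MIB, 64 * MIB, 1024 * MIB, 0.03))
# Same guest, original speed, but a very high allowed downtime.
print(simulate_precopy(1024 * MIB, 64 * MIB, 32 * MIB, 40.0))
```

In this model the migration only completes while the guest is busy if the speed exceeds the dirty rate, or if the allowed downtime is large enough to cover whatever is still dirty.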
> 
> 
> My ffmpeg workload is a simple encoding from h263+ac3 to h263+ac3, 64*36
> pixels, so it should not be dirtying memory too much. Or is it?
> 
> (qemu) info migrate
> capabilities: xbzrle: off
> Migration status: completed
> total time: 14538 milliseconds
> downtime: 1273 milliseconds
> transferred ram: 389961 kbytes
> remaining ram: 0 kbytes
> total ram: 1065024 kbytes
> duplicate: 181949 pages
> normal: 97446 pages
> normal bytes: 389784 kbytes
> 
> How many bytes were actually transferred? "duplicate" * 4K = 745MB?
For a duplicate page we send one byte (these are usually zero pages) plus the
page header.
"transferred ram" is the actual number of bytes sent, so here around 389 MB was
sent.
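
As a rough cross-check, the numbers from the "info migrate" output above can be worked through as below. This is a sketch that assumes a 4 KiB page size (which matches "normal bytes" divided by "normal" pages here); the variable names are made up for illustration.

```python
PAGE_KB = 4  # assumed page size in kbytes

# Values copied from the "info migrate" output quoted above.
transferred_kb = 389961
total_ram_kb = 1065024
duplicate_pages = 181949
normal_pages = 97446

# Normal pages are sent in full: 97446 * 4 = 389784 kbytes ("normal bytes").
normal_kb = normal_pages * PAGE_KB

# What is left of "transferred ram" is attributable to duplicate pages.
duplicate_kb = transferred_kb - normal_kb                    # 177 kbytes
bytes_per_duplicate = duplicate_kb * 1024 / duplicate_pages  # roughly 1 byte

# If duplicate pages had been sent in full they would have cost this much.
duplicate_full_kb = duplicate_pages * PAGE_KB                # 727796 kbytes

# More pages were sent than the guest has, so some were sent more than once.
total_pages = total_ram_kb // PAGE_KB                        # 266256 pages
pages_sent = duplicate_pages + normal_pages                  # 279395 pages
resent_pages = pages_sent - total_pages                      # 13139 pages

print(normal_kb, duplicate_kb, round(bytes_per_duplicate, 2))
print(duplicate_full_kb, total_pages, pages_sent, resent_pages)
```

So the counter is consistent with about one byte accounted per duplicate page, and the page counts suggest that at least about 13000 page transfers were repeats of pages dirtied again during the migration, assuming "total ram" covers all migrated memory.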
> 
> Is there any tool in QEMU to see how many pages are used/dirty/etc?
sadly no.
> "info" does not seem to have any kind of such statistic.
> 
> btw the new guest did not resume (qemu still responds to commands), but this
> is probably our problem within the "pseries" platform. What is strange is that
> "info migrate" on the new guest shows nothing:
> 
> (qemu) info migrate
> (qemu)
> 
The "info migrate" command displays outgoing migration information, not
incoming.
> 
> 
> 
>> Cheers,
>> Orit
>>
>>> Does anyone have some ideas for testing this better: workloads that
>>> are less likely to trigger this behaviour, or settings to tweak in the
>>> migration itself to make it more likely to complete migration while
>>> the workload is still active.
>>>
>>
> 
>