Re: [PATCH v2 00/41] postcopy live migration

2012-06-14 Thread Juan Quintela
Isaku Yamahata  wrote:
> After a long time, we have v2. This is the qemu part.
> The linux kernel part is sent separately.
>
> Changes v1 -> v2:
> - split up patches for review
> - buffered file refactored
> - many bug fixes
>   Especially, PV drivers now work with postcopy
> - optimization/heuristic
>
> Patches
> 1 - 30: refactoring existing code and preparation
> 31 - 37: implement postcopy itself (essential part)
> 38 - 41: some optimization/heuristic for postcopy
>

After reviewing the changes, I think we can merge patches 1-30.
For the rest of them we still need another round of review/coding (at
least the error handling needs to be implemented).

IMHO, it makes no sense to add CONFIG_POSTCOPY; we can just compile the
code in unconditionally.  Furthermore, the calls into this code from the
common code are not ifdef'ed anyway.  But that is just my opinion.

Later, Juan.


Re: [PATCH v2 00/41] postcopy live migration

2012-06-14 Thread Juan Quintela
Avi Kivity  wrote:
> On 06/08/2012 01:16 PM, Juan Quintela wrote:
>> Anthony Liguori  wrote:
>>
>> Having said that, we need to measure the time of an async page
>> fault over the network.  If it is too high, post copy just doesn't work.
>>
>> And no, I haven't seen any measurement telling us that this is going
>> to be fast enough, but there is always hope.
>
> At 10Gb/sec, the time to transfer one page is 4 microseconds.  At
> 40Gb/sec this drops to a microsecond, plus the latency.  This is on par
> with the time to handle a write protection fault that precopy uses.  But
> this can *only* be achieved with RDMA, otherwise the overhead of
> messaging and copying will dominate.
>
> Note this does not mean we should postpone merging until RDMA support is
> ready.  However we need to make sure the kernel interface is RDMA friendly.

Fully agree here.  I always thought that postcopy would only work with RDMA or
something like it; anything else would just add too much latency.

Later, Juan.


Re: [PATCH v2 00/41] postcopy live migration

2012-06-08 Thread Avi Kivity
On 06/08/2012 01:16 PM, Juan Quintela wrote:
> Anthony Liguori  wrote:
> >> TODO
> >> 
> >> - benchmark/evaluation. Especially how async page fault affects the result.
> >
> > I don't mean to beat on a dead horse, but I really don't understand
> > the point of postcopy migration other than the fact that it's
> > possible.  It's a lot of code and a new ABI in an area where we
> > already have too much difficulty maintaining our ABI.
> >
> > Without a compelling real world case with supporting benchmarks for
> > why we need postcopy and cannot improve precopy, I'm against merging
> > this.
>
> I can easily understand the need/want for post-copy migration.  Another matter
> is that this didn't come with benchmarks and that post-copy is
> difficult.
>
> The basic problem with precopy is that the amount of memory used by the
> guest is not going to go down any time soon.  The same goes for the number of
> cores.  At some point (it doesn't matter whether it is 16GB, 128GB or 256GB of
> RAM in the guest, and the same for vcpus), precopy just doesn't have a chance.
> And post-copy does.
>
> Having said that, we need to measure the time of an async page
> fault over the network.  If it is too high, post copy just doesn't work.
>
> And no, I haven't seen any measurement telling us that this is going
> to be fast enough, but there is always hope.

At 10Gb/sec, the time to transfer one page is 4 microseconds.  At
40Gb/sec this drops to a microsecond, plus the latency.  This is on par
with the time to handle a write protection fault that precopy uses.  But
this can *only* be achieved with RDMA, otherwise the overhead of
messaging and copying will dominate.
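(For reference, the back-of-the-envelope version of that calculation, assuming 4KiB pages and ignoring protocol overhead:

    4096 bytes * 8         = 32768 bits per page
    32768 bits / 10 Gbps  ~= 3.3 us   (~4 us once framing/overhead is included)
    32768 bits / 40 Gbps  ~= 0.8 us   (~1 us), plus the network round-trip latency

The wire time itself is small; it is the per-fault round trip and the messaging/copying overhead that RDMA is needed to keep down.)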

Note this does not mean we should postpone merging until RDMA support is
ready.  However we need to make sure the kernel interface is RDMA friendly.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH v2 00/41] postcopy live migration

2012-06-08 Thread Juan Quintela
Anthony Liguori  wrote:
>> TODO
>> 
>> - benchmark/evaluation. Especially how async page fault affects the result.
>
> I don't mean to beat on a dead horse, but I really don't understand
> the point of postcopy migration other than the fact that it's
> possible.  It's a lot of code and a new ABI in an area where we
> already have too much difficulty maintaining our ABI.
>
> Without a compelling real world case with supporting benchmarks for
> why we need postcopy and cannot improve precopy, I'm against merging
> this.

I can easily understand the need/want for post-copy migration.  Another matter
is that this didn't come with benchmarks and that post-copy is
difficult.

The basic problem with precopy is that the amount of memory used by the
guest is not going to go down any time soon.  The same goes for the number of
cores.  At some point (it doesn't matter whether it is 16GB, 128GB or 256GB of
RAM in the guest, and the same for vcpus), precopy just doesn't have a chance.
And post-copy does.

Having said that, we need to measure the time of an async page
fault over the network.  If it is too high, post copy just doesn't work.

And no, I haven't seen any measurement telling us that this is going
to be fast enough, but there is always hope.

Later, Juan.


Re: [Qemu-devel] [PATCH v2 00/41] postcopy live migration

2012-06-07 Thread Orit Wasserman
On 06/04/2012 03:37 PM, Anthony Liguori wrote:
> On 06/04/2012 05:57 PM, Isaku Yamahata wrote:
>> After a long time, we have v2. This is the qemu part.
>> The linux kernel part is sent separately.
>>
>> Changes v1 -> v2:
>> - split up patches for review
>> - buffered file refactored
>> - many bug fixes
>>    Especially, PV drivers now work with postcopy
>> - optimization/heuristic
>>
>> Patches
>> 1 - 30: refactoring existing code and preparation
>> 31 - 37: implement postcopy itself (essential part)
>> 38 - 41: some optimization/heuristic for postcopy
>>
>> Intro
>> =
>> This patch series implements postcopy live migration.[1]
>> As discussed at KVM Forum 2011, a dedicated character device is used for
>> distributed shared memory between the migration source and destination.
>> Now we can discuss/benchmark/compare it with precopy. I believe there is
>> much room for improvement.
>>
>> [1] http://wiki.qemu.org/Features/PostCopyLiveMigration
>>
>>
>> Usage
>> =
>> You need to load the umem character device on the host before starting migration.
>> Postcopy can be used with the tcg and kvm accelerators. The implementation depends
>> only on the linux umem character device, but the driver-dependent code is split
>> into a separate file.
>> I tested only the host page size == guest page size case, but the implementation
>> allows the host page size != guest page size case.
>>
>> The following options are added with this patch series.
>> - incoming part
>>command line options
>>-postcopy [-postcopy-flags]
>>where flags is for changing behavior for benchmark/debugging
>>Currently the following flags are available
>>0: default
>>1: enable touching page request
>>
>>example:
>>qemu -postcopy -incoming tcp:0: -monitor stdio -machine accel=kvm
>>
>> - outgoing part
>>options for migrate command
>>migrate [-p [-n] [-m]] URI [  []]
>>-p: indicate postcopy migration
>>-n: disable background transferring pages: This is for benchmark/debugging
>>-m: move background transfer of postcopy mode
>>: The number of forward pages which is sent with 
>> on-demand
>>: The number of backward pages which is sent with
>> on-demand
>>
>>example:
>>migrate -p -n tcp::
>>migrate -p -n -m tcp:: 32 0
>>
>>
>> TODO
>> 
>> - benchmark/evaluation. Especially how async page fault affects the result.
> 
> I don't mean to beat on a dead horse, but I really don't understand the point 
> of postcopy migration other than the fact that it's possible.  It's a lot of 
> code and a new ABI in an area where we already have too much difficulty 
> maintaining our ABI.
> 
> Without a compelling real world case with supporting benchmarks for why we 
> need postcopy and cannot improve precopy, I'm against merging this.
Hi Anthony,

The example is quite simple: let's look at a 300G guest that is dirtying 10
percent of its memory every second (for example SAP ...).
Even if we have a 30G/s network we will need 1 second of downtime for this
guest, and many workloads time out in that kind of downtime.
Guests are getting bigger and bigger, so for those big guests the only way to
do live migration is using post copy.
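(Spelling out the arithmetic behind that example: 10 percent of 300G is 30G of
newly dirtied memory per second. With a 30G/s link, precopy can never shrink
the dirty set, so the only way to finish is to stop the guest and flush the
remaining ~30G, which at 30G/s is about 1 second of downtime, before counting
the other stop-phase costs.)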
I agree we are losing reliability with post copy, but we can try to limit the
risk:
- do a full copy of the guest ram (precopy) and then switch to post copy only
  for the updates
- the user can use a private LAN, maybe with redundancy, which is much safer
- maybe back up the memory to storage so that in case of network failure we can
  recover

In the end it is up to the user; they can decide what they are willing to risk.
The default of course should always be precopy live migration; maybe we should
even have a different command for post copy.
In the end I can see some users who will have no choice but to use post copy live
migration, or to stop their guests in order to move them to
another host.

Regards,
Orit
> 
> Regards,
> 
> Anthony Liguori
> 
> 



Re: [PATCH v2 00/41] postcopy live migration

2012-06-05 Thread Dor Laor

On 06/04/2012 04:38 PM, Isaku Yamahata wrote:

On Mon, Jun 04, 2012 at 08:37:04PM +0800, Anthony Liguori wrote:

On 06/04/2012 05:57 PM, Isaku Yamahata wrote:

After a long time, we have v2. This is the qemu part.
The linux kernel part is sent separately.

Changes v1 -> v2:
- split up patches for review
- buffered file refactored
- many bug fixes
Especially, PV drivers now work with postcopy
- optimization/heuristic

Patches
1 - 30: refactoring existing code and preparation
31 - 37: implement postcopy itself (essential part)
38 - 41: some optimization/heuristic for postcopy

Intro
=
This patch series implements postcopy live migration.[1]
As discussed at KVM Forum 2011, a dedicated character device is used for
distributed shared memory between the migration source and destination.
Now we can discuss/benchmark/compare it with precopy. I believe there is
much room for improvement.

[1] http://wiki.qemu.org/Features/PostCopyLiveMigration


Usage
=
You need to load the umem character device on the host before starting migration.
Postcopy can be used with the tcg and kvm accelerators. The implementation depends
only on the linux umem character device, but the driver-dependent code is split
into a separate file.
I tested only the host page size == guest page size case, but the implementation
allows the host page size != guest page size case.

The following options are added with this patch series.
- incoming part
command line options
-postcopy [-postcopy-flags]
where flags is for changing behavior for benchmark/debugging
Currently the following flags are available
0: default
1: enable touching page request

example:
qemu -postcopy -incoming tcp:0: -monitor stdio -machine accel=kvm

- outgoing part
options for migrate command
migrate [-p [-n] [-m]] URI [   []]
-p: indicate postcopy migration
-n: disable background transferring pages: This is for benchmark/debugging
-m: move background transfer of postcopy mode
: The number of forward pages which is sent with on-demand
: The number of backward pages which is sent with
 on-demand

example:
migrate -p -n tcp::
migrate -p -n -m tcp:: 32 0


TODO

- benchmark/evaluation. Especially how async page fault affects the result.


I don't mean to beat on a dead horse, but I really don't understand the
point of postcopy migration other than the fact that it's possible.  It's
a lot of code and a new ABI in an area where we already have too much
difficulty maintaining our ABI.

Without a compelling real world case with supporting benchmarks for why
we need postcopy and cannot improve precopy, I'm against merging this.


Some new results are available at
https://events.linuxfoundation.org/images/stories/pdf/lcjp2012_yamahata_postcopy.pdf



It does show dramatic improvement over pre copy. As stated in the docs,
async page faults may help many kinds of loads and turn post copy into
a viable alternative to today's code.


In addition, this sort of 'demand paging' approach on the destination can
help us with other use cases - for example, we can use this implementation
to live-snapshot VMs w/ RAM (post live migrate into a file while leaving
the source active) and to live-resume VMs from a file w/o reading the entire
RAM from disk.


I didn't go over the api for the live migration part, but IIUC the only
change needed to the live migration 'protocol' is w.r.t. guest pages, and
we will need that regardless, when we merge the page ordering optimization.


Cheers,
Dor


precopy assumes that the network bandwidth is wide enough and that
the number of dirty pages converges. But that doesn't always hold true.

- planned migration
   predictability of the total migration time is important

- dynamic consolidation
   In cloud use cases, the resources of a physical machine are usually
   overcommitted.
   When a physical machine becomes overloaded, some VMs are moved to another
   physical host to balance the load.
   precopy can't move VMs promptly, and compression makes things worse.

- inter data center migration
   With L2-over-L3 technology, it has become common to create a virtual
   data center which actually spans multiple physical data centers.
   It is useful to migrate VMs across physical data centers for disaster recovery.
   The network bandwidth between DCs is narrower than in the LAN case, so the
   precopy assumption wouldn't hold.

- In cases where the network bandwidth is limited by QoS,
   the precopy assumption doesn't hold.


thanks,




Re: Fwd: [Qemu-devel] [PATCH v2 00/41] postcopy live migration

2012-06-04 Thread Isaku Yamahata
On Mon, Jun 04, 2012 at 07:27:25AM -0700, Chegu Vinod wrote:
> On 6/4/2012 6:13 AM, Isaku Yamahata wrote:
>> On Mon, Jun 04, 2012 at 05:01:30AM -0700, Chegu Vinod wrote:
>>> Hello Isaku Yamahata,
>> Hi.
>>
>>> I just saw your patches. Would it be possible to email me a tar bundle of 
>>> these
>>> patches (makes it easier to apply the patches to a copy of the upstream 
>>> qemu.git)
>> I uploaded them to github for those who are interested in it.
>>
>> git://github.com/yamahata/qemu.git qemu-postcopy-june-04-2012
>> git://github.com/yamahata/linux-umem.git  linux-umem-june-04-2012
>>
>
> Thanks for the pointer...
>>> BTW, I am also curious if you have considered using any kind of RDMA 
>>> features for
>>> optimizing the page-faults during postcopy ?
>> Yes, RDMA is an interesting topic. Could you share your use cases/concerns/issues?
>
>
> Looking at large sized guests (256GB and higher)  running cpu/memory  
> intensive enterprise workloads.
> The  concerns are the same...i.e. having a predictable total migration  
> time, minimal downtime/freeze-time and of course minimal service  
> degradation to the workload(s) in the VM or the co-located VM's...
>
> How large of a guest have you tested your changes with and what kind of  
> workloads have you used so far ?

Only VMs up to several GB. Of course we'd like to benchmark with a really
huge VM (several hundred GB), but that's somewhat difficult.


>> Thus we can collaborate.
>> You may want to see Benoit's results.
>
> Yes, I have already seen some of Benoit's results.

Great.

> Hence the question about use of RDMA techniques for post copy.

So far my implementation doesn't use RDMA.

>> As far as I know, he has not published
>> his code yet.
>
> Thanks
> Vinod
>
>>
>> thanks,
>>
>>> Thanks
>>> Vinod
>>>
>>>
>>>
>>> --
>>>
>>> Message: 1
>>> Date: Mon,  4 Jun 2012 18:57:02 +0900
>>> From: Isaku Yamahata
>>> To: qemu-de...@nongnu.org, kvm@vger.kernel.org
>>> Cc: benoit.hud...@gmail.com, aarca...@redhat.com, aligu...@us.ibm.com,
>>> quint...@redhat.com, stefa...@gmail.com, t.hirofu...@aist.go.jp,
>>> dl...@redhat.com, satoshi.i...@aist.go.jp,  
>>> mdr...@linux.vnet.ibm.com,
>>> yoshikawa.tak...@oss.ntt.co.jp, owass...@redhat.com, a...@redhat.com,
>>> pbonz...@redhat.com
>>> Subject: [Qemu-devel] [PATCH v2 00/41] postcopy live migration
>>> Message-ID:
>>>
>>> After a long time, we have v2. This is the qemu part.
>>> The linux kernel part is sent separately.
>>>
>>> Changes v1 -> v2:
>>> - split up patches for review
>>> - buffered file refactored
>>> - many bug fixes
>>>    Especially, PV drivers now work with postcopy
>>> - optimization/heuristic
>>>
>>> Patches
>>> 1 - 30: refactoring existing code and preparation
>>> 31 - 37: implement postcopy itself (essential part)
>>> 38 - 41: some optimization/heuristic for postcopy
>>>
>>> Intro
>>> =
>>> This patch series implements postcopy live migration.[1]
>>> As discussed at KVM Forum 2011, a dedicated character device is used for
>>> distributed shared memory between the migration source and destination.
>>> Now we can discuss/benchmark/compare it with precopy. I believe there is
>>> much room for improvement.
>>>
>>> [1] http://wiki.qemu.org/Features/PostCopyLiveMigration
>>>
>>>
>>> Usage
>>> =
>>> You need to load the umem character device on the host before starting migration.
>>> Postcopy can be used with the tcg and kvm accelerators. The implementation depends
>>> only on the linux umem character device, but the driver-dependent code is split
>>> into a separate file.
>>> I tested only the host page size == guest page size case, but the implementation
>>> allows the host page size != guest page size case.
>>>
>>> The following options are added with this patch series.
>>> - incoming part
>>>command line options
>>>-postcopy [-postcopy-flags]
>>>where flags is for changing behavior for benchmark/debugging
>>>Currently the following flags are available
>>>0: default
>>>1: enable touching page request
>>>
>>>example:
>>>qemu -postcopy -incoming tcp:0: -monitor stdio -machine accel

Re: Fwd: [Qemu-devel] [PATCH v2 00/41] postcopy live migration

2012-06-04 Thread Chegu Vinod

On 6/4/2012 6:13 AM, Isaku Yamahata wrote:

On Mon, Jun 04, 2012 at 05:01:30AM -0700, Chegu Vinod wrote:

Hello Isaku Yamahata,

Hi.


I just saw your patches. Would it be possible to email me a tar bundle of these
patches (makes it easier to apply the patches to a copy of the upstream 
qemu.git)

I uploaded them to github for those who are interested in it.

git://github.com/yamahata/qemu.git qemu-postcopy-june-04-2012
git://github.com/yamahata/linux-umem.git  linux-umem-june-04-2012



Thanks for the pointer...

BTW, I am also curious if you have considered using any kind of RDMA features 
for
optimizing the page-faults during postcopy ?

Yes, RDMA is an interesting topic. Could you share your use cases/concerns/issues?



Looking at large sized guests (256GB and higher)  running cpu/memory 
intensive enterprise workloads.
The  concerns are the same...i.e. having a predictable total migration 
time, minimal downtime/freeze-time and of course minimal service 
degradation to the workload(s) in the VM or the co-located VM's...


How large of a guest have you tested your changes with and what kind of 
workloads have you used so far ?



Thus we can collaborate.
You may want to see Benoit's results.


Yes, I have already seen some of Benoit's results.

Hence the question about use of RDMA techniques for post copy.


As far as I know, he has not published
his code yet.


Thanks
Vinod



thanks,


Thanks
Vinod



--

Message: 1
Date: Mon,  4 Jun 2012 18:57:02 +0900
From: Isaku Yamahata
To: qemu-de...@nongnu.org, kvm@vger.kernel.org
Cc: benoit.hud...@gmail.com, aarca...@redhat.com, aligu...@us.ibm.com,
quint...@redhat.com, stefa...@gmail.com, t.hirofu...@aist.go.jp,
dl...@redhat.com, satoshi.i...@aist.go.jp,  
mdr...@linux.vnet.ibm.com,
yoshikawa.tak...@oss.ntt.co.jp, owass...@redhat.com, a...@redhat.com,
pbonz...@redhat.com
Subject: [Qemu-devel] [PATCH v2 00/41] postcopy live migration
Message-ID:

After a long time, we have v2. This is the qemu part.
The linux kernel part is sent separately.

Changes v1 -> v2:
- split up patches for review
- buffered file refactored
- many bug fixes
   Especially, PV drivers now work with postcopy
- optimization/heuristic

Patches
1 - 30: refactoring existing code and preparation
31 - 37: implement postcopy itself (essential part)
38 - 41: some optimization/heuristic for postcopy

Intro
=
This patch series implements postcopy live migration.[1]
As discussed at KVM Forum 2011, a dedicated character device is used for
distributed shared memory between the migration source and destination.
Now we can discuss/benchmark/compare it with precopy. I believe there is
much room for improvement.

[1] http://wiki.qemu.org/Features/PostCopyLiveMigration


Usage
=
You need to load the umem character device on the host before starting migration.
Postcopy can be used with the tcg and kvm accelerators. The implementation depends
only on the linux umem character device, but the driver-dependent code is split
into a separate file.
I tested only the host page size == guest page size case, but the implementation
allows the host page size != guest page size case.

The following options are added with this patch series.
- incoming part
   command line options
   -postcopy [-postcopy-flags]
   where flags is for changing behavior for benchmark/debugging
   Currently the following flags are available
   0: default
   1: enable touching page request

   example:
   qemu -postcopy -incoming tcp:0: -monitor stdio -machine accel=kvm

- outgoing part
   options for migrate command
   migrate [-p [-n] [-m]] URI [   []]
   -p: indicate postcopy migration
   -n: disable background transferring pages: This is for benchmark/debugging
   -m: move background transfer of postcopy mode
   : The number of forward pages which is sent with on-demand
   : The number of backward pages which is sent with
on-demand

   example:
   migrate -p -n tcp::
   migrate -p -n -m tcp:: 32 0


TODO

- benchmark/evaluation. Especially how async page fault affects the result.
- improve/optimization
   At the moment at least what I'm aware of is
   - making incoming socket non-blocking with thread
 As page compression is coming, it is impractical to do a non-blocking read
 and check whether the necessary data has been read.
   - touching pages in incoming qemu process by fd handler seems suboptimal.
 creating dedicated thread?
   - outgoing handler seems suboptimal causing latency.
- consider on FUSE/CUSE possibility
- don't fork umemd, but create thread?

basic postcopy work flow

 qemu on the destination
   |
   V
 open(/dev/umem)
   |
   V
 UMEM_INIT
   |
   V
 Here we have two file descriptors to
 umem device and shmem file
   |
   |  

Re: [PATCH v2 00/41] postcopy live migration

2012-06-04 Thread Isaku Yamahata
On Mon, Jun 04, 2012 at 08:37:04PM +0800, Anthony Liguori wrote:
> On 06/04/2012 05:57 PM, Isaku Yamahata wrote:
>> After a long time, we have v2. This is the qemu part.
>> The linux kernel part is sent separately.
>>
>> Changes v1 -> v2:
>> - split up patches for review
>> - buffered file refactored
>> - many bug fixes
>>    Especially, PV drivers now work with postcopy
>> - optimization/heuristic
>>
>> Patches
>> 1 - 30: refactoring existing code and preparation
>> 31 - 37: implement postcopy itself (essential part)
>> 38 - 41: some optimization/heuristic for postcopy
>>
>> Intro
>> =
>> This patch series implements postcopy live migration.[1]
>> As discussed at KVM Forum 2011, a dedicated character device is used for
>> distributed shared memory between the migration source and destination.
>> Now we can discuss/benchmark/compare it with precopy. I believe there is
>> much room for improvement.
>>
>> [1] http://wiki.qemu.org/Features/PostCopyLiveMigration
>>
>>
>> Usage
>> =
>> You need to load the umem character device on the host before starting migration.
>> Postcopy can be used with the tcg and kvm accelerators. The implementation depends
>> only on the linux umem character device, but the driver-dependent code is split
>> into a separate file.
>> I tested only the host page size == guest page size case, but the implementation
>> allows the host page size != guest page size case.
>>
>> The following options are added with this patch series.
>> - incoming part
>>command line options
>>-postcopy [-postcopy-flags]
>>where flags is for changing behavior for benchmark/debugging
>>Currently the following flags are available
>>0: default
>>1: enable touching page request
>>
>>example:
>>qemu -postcopy -incoming tcp:0: -monitor stdio -machine accel=kvm
>>
>> - outgoing part
>>options for migrate command
>>migrate [-p [-n] [-m]] URI [  []]
>>-p: indicate postcopy migration
>>-n: disable background transferring pages: This is for benchmark/debugging
>>-m: move background transfer of postcopy mode
>>: The number of forward pages which is sent with 
>> on-demand
>>: The number of backward pages which is sent with
>> on-demand
>>
>>example:
>>migrate -p -n tcp::
>>migrate -p -n -m tcp:: 32 0
>>
>>
>> TODO
>> 
>> - benchmark/evaluation. Especially how async page fault affects the result.
>
> I don't mean to beat on a dead horse, but I really don't understand the 
> point of postcopy migration other than the fact that it's possible.  It's 
> a lot of code and a new ABI in an area where we already have too much 
> difficulty maintaining our ABI.
>
> Without a compelling real world case with supporting benchmarks for why 
> we need postcopy and cannot improve precopy, I'm against merging this.

Some new results are available at 
https://events.linuxfoundation.org/images/stories/pdf/lcjp2012_yamahata_postcopy.pdf

precopy assumes that the network bandwidth is wide enough and that
the number of dirty pages converges. But that doesn't always hold true.

- planned migration
  predictability of the total migration time is important

- dynamic consolidation
  In cloud use cases, the resources of a physical machine are usually
  overcommitted.
  When a physical machine becomes overloaded, some VMs are moved to another
  physical host to balance the load.
  precopy can't move VMs promptly, and compression makes things worse.

- inter data center migration
  With L2-over-L3 technology, it has become common to create a virtual
  data center which actually spans multiple physical data centers.
  It is useful to migrate VMs across physical data centers for disaster recovery.
  The network bandwidth between DCs is narrower than in the LAN case, so the
  precopy assumption wouldn't hold.

- In cases where the network bandwidth is limited by QoS,
  the precopy assumption doesn't hold.


thanks,
-- 
yamahata


Re: Fwd: [Qemu-devel] [PATCH v2 00/41] postcopy live migration

2012-06-04 Thread Isaku Yamahata
On Mon, Jun 04, 2012 at 05:01:30AM -0700, Chegu Vinod wrote:
> Hello Isaku Yamahata,

Hi.

> I just saw your patches. Would it be possible to email me a tar bundle of 
> these
> patches (makes it easier to apply the patches to a copy of the upstream 
> qemu.git)

I uploaded them to github for those who are interested in it.

git://github.com/yamahata/qemu.git qemu-postcopy-june-04-2012
git://github.com/yamahata/linux-umem.git  linux-umem-june-04-2012 


> BTW, I am also curious if you have considered using any kind of RDMA features 
> for
> optimizing the page-faults during postcopy ?

Yes, RDMA is an interesting topic. Could you share your use cases/concerns/issues
so that we can collaborate?
You may want to see Benoit's results. As far as I know, he has not published
his code yet.

thanks,

> Thanks
> Vinod
>
>
>
> --
>
> Message: 1
> Date: Mon,  4 Jun 2012 18:57:02 +0900
> From: Isaku Yamahata
> To: qemu-de...@nongnu.org, kvm@vger.kernel.org
> Cc: benoit.hud...@gmail.com, aarca...@redhat.com, aligu...@us.ibm.com,
>   quint...@redhat.com, stefa...@gmail.com, t.hirofu...@aist.go.jp,
>   dl...@redhat.com, satoshi.i...@aist.go.jp,  
> mdr...@linux.vnet.ibm.com,
>   yoshikawa.tak...@oss.ntt.co.jp, owass...@redhat.com, a...@redhat.com,
>   pbonz...@redhat.com
> Subject: [Qemu-devel] [PATCH v2 00/41] postcopy live migration
> Message-ID:
>
> After a long time, we have v2. This is the qemu part.
> The linux kernel part is sent separately.
>
> Changes v1 -> v2:
> - split up patches for review
> - buffered file refactored
> - many bug fixes
>   Especially, PV drivers now work with postcopy
> - optimization/heuristic
>
> Patches
> 1 - 30: refactoring existing code and preparation
> 31 - 37: implement postcopy itself (essential part)
> 38 - 41: some optimization/heuristic for postcopy
>
> Intro
> =
> This patch series implements postcopy live migration.[1]
> As discussed at KVM Forum 2011, a dedicated character device is used for
> distributed shared memory between the migration source and destination.
> Now we can discuss/benchmark/compare it with precopy. I believe there is
> much room for improvement.
>
> [1] http://wiki.qemu.org/Features/PostCopyLiveMigration
>
>
> Usage
> =
> You need to load the umem character device on the host before starting migration.
> Postcopy can be used with the tcg and kvm accelerators. The implementation depends
> only on the linux umem character device, but the driver-dependent code is split
> into a separate file.
> I tested only the host page size == guest page size case, but the implementation
> allows the host page size != guest page size case.
>
> The following options are added with this patch series.
> - incoming part
>   command line options
>   -postcopy [-postcopy-flags]
>   where flags is for changing behavior for benchmark/debugging
>   Currently the following flags are available
>   0: default
>   1: enable touching page request
>
>   example:
>   qemu -postcopy -incoming tcp:0: -monitor stdio -machine accel=kvm
>
> - outgoing part
>   options for migrate command
>   migrate [-p [-n] [-m]] URI [  []]
>   -p: indicate postcopy migration
>   -n: disable background transferring pages: This is for benchmark/debugging
>   -m: move background transfer of postcopy mode
>   : The number of forward pages which is sent with on-demand
>   : The number of backward pages which is sent with
>on-demand
>
>   example:
>   migrate -p -n tcp::
>   migrate -p -n -m tcp:: 32 0
>
>
> TODO
> 
> - benchmark/evaluation. Especially how async page fault affects the result.
> - improve/optimization
>   At the moment at least what I'm aware of is
>   - making incoming socket non-blocking with thread
> As page compression is coming, it is impractical to do a non-blocking read
> and check whether the necessary data has been read.
>   - touching pages in incoming qemu process by fd handler seems suboptimal.
> creating dedicated thread?
>   - outgoing handler seems suboptimal causing latency.
> - consider on FUSE/CUSE possibility
> - don't fork umemd, but create thread?
>
> basic postcopy work flow
> 
> qemu on the destination
>   |
>   V
> open(/dev/umem)
>   |
>   V
> UMEM_INIT
>   |
>   V
> Here we have two file descriptors to
> umem device and shmem file
>   |
>   |  umemd
>   |  daemon on

Re: [PATCH v2 00/41] postcopy live migration

2012-06-04 Thread Anthony Liguori

On 06/04/2012 05:57 PM, Isaku Yamahata wrote:

After a long time, we have v2. This is the qemu part.
The linux kernel part is sent separately.

Changes v1 -> v2:
- split up patches for review
- buffered file refactored
- many bug fixes
   Especially, PV drivers now work with postcopy
- optimization/heuristic

Patches
1 - 30: refactoring existing code and preparation
31 - 37: implement postcopy itself (essential part)
38 - 41: some optimization/heuristic for postcopy

Intro
=
This patch series implements postcopy live migration.[1]
As discussed at KVM Forum 2011, a dedicated character device is used for
distributed shared memory between the migration source and destination.
Now we can discuss/benchmark/compare it with precopy. I believe there is
much room for improvement.

[1] http://wiki.qemu.org/Features/PostCopyLiveMigration


Usage
=
You need to load the umem character device on the host before starting migration.
Postcopy can be used with the tcg and kvm accelerators. The implementation depends
only on the linux umem character device, but the driver-dependent code is split
into a separate file.
I tested only the host page size == guest page size case, but the implementation
allows the host page size != guest page size case.

The following options are added with this patch series.
- incoming part
   command line options
   -postcopy [-postcopy-flags]
   where flags is for changing behavior for benchmark/debugging
   Currently the following flags are available
   0: default
   1: enable touching page request

   example:
   qemu -postcopy -incoming tcp:0: -monitor stdio -machine accel=kvm

- outgoing part
   options for migrate command
   migrate [-p [-n] [-m]] URI [  []]
   -p: indicate postcopy migration
   -n: disable background transferring pages: This is for benchmark/debugging
   -m: move background transfer of postcopy mode
   : The number of forward pages which is sent with on-demand
   : The number of backward pages which is sent with
on-demand

   example:
   migrate -p -n tcp::
   migrate -p -n -m tcp:: 32 0


TODO

- benchmark/evaluation. Especially how async page fault affects the result.


I don't mean to beat on a dead horse, but I really don't understand the point of 
postcopy migration other than the fact that it's possible.  It's a lot of code 
and a new ABI in an area where we already have too much difficulty maintaining 
our ABI.


Without a compelling real world case with supporting benchmarks for why we need 
postcopy and cannot improve precopy, I'm against merging this.


Regards,

Anthony Liguori



[PATCH v2 00/41] postcopy live migration

2012-06-04 Thread Isaku Yamahata
After a long time, we have v2. This is the qemu part.
The linux kernel part is sent separately.

Changes v1 -> v2:
- split up patches for review
- buffered file refactored
- many bug fixes
  Especially, PV drivers now work with postcopy
- optimization/heuristic

Patches
1 - 30: refactoring existing code and preparation
31 - 37: implement postcopy itself (essential part)
38 - 41: some optimization/heuristic for postcopy

Intro
=
This patch series implements postcopy live migration.[1]
As discussed at KVM Forum 2011, a dedicated character device is used for
distributed shared memory between the migration source and destination.
Now we can discuss/benchmark/compare it with precopy. I believe there is
much room for improvement.

[1] http://wiki.qemu.org/Features/PostCopyLiveMigration


Usage
=
You need to load the umem character device on the host before starting migration.
Postcopy can be used with the tcg and kvm accelerators. The implementation depends
only on the linux umem character device, but the driver-dependent code is split
into a separate file.
I tested only the host page size == guest page size case, but the implementation
allows the host page size != guest page size case.

The following options are added with this patch series.
- incoming part
  command line options
  -postcopy [-postcopy-flags ]
  where flags is for changing behavior for benchmark/debugging
  Currently the following flags are available
  0: default
  1: enable touching page request

  example:
  qemu -postcopy -incoming tcp:0: -monitor stdio -machine accel=kvm

- outgoing part
  options for migrate command 
  migrate [-p [-n] [-m]] URI [ []]
  -p: indicate postcopy migration
  -n: disable background transferring pages: This is for benchmark/debugging
  -m: move background transfer of postcopy mode
  : The number of forward pages which is sent with on-demand
  : The number of backward pages which is sent with
   on-demand

  example:
  migrate -p -n tcp:: 
  migrate -p -n -m tcp:: 32 0
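
To make the forward/backward page counts of -m concrete, here is a minimal
sketch (the helper and its parameter names are mine, not from the series) of
how the outgoing side could choose which pages to push around an on-demand
request:

#include <stdint.h>

/* Hypothetical helper: given the guest page index that faulted on the
 * destination, choose the range [*first, *first + *count) of page indexes
 * to send.  forward_pages/backward_pages correspond to the -m arguments
 * of the migrate command described above. */
void prefault_window(uint64_t faulted_page, uint64_t nr_guest_pages,
                     uint64_t forward_pages, uint64_t backward_pages,
                     uint64_t *first, uint64_t *count)
{
    uint64_t start = faulted_page > backward_pages
                     ? faulted_page - backward_pages : 0;
    uint64_t end = faulted_page + forward_pages + 1;   /* +1 for the faulted page itself */

    if (end > nr_guest_pages) {
        end = nr_guest_pages;
    }
    *first = start;
    *count = end - start;
}

With the "migrate -p -n -m ... 32 0" example above, such a helper would send
the faulted page plus the 32 pages following it, on the assumption that guest
accesses are often sequential.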


TODO

- benchmark/evaluation. Especially how async page fault affects the result.
- improve/optimization
  At the moment at least what I'm aware of is
  - making incoming socket non-blocking with thread
    As page compression is coming, it is impractical to do a non-blocking read
    and check whether the necessary data has been read.
  - touching pages in incoming qemu process by fd handler seems suboptimal.
creating dedicated thread?
  - outgoing handler seems suboptimal causing latency.
- consider on FUSE/CUSE possibility
- don't fork umemd, but create thread?

basic postcopy work flow

qemu on the destination
      |
      V
open(/dev/umem)
      |
      V
UMEM_INIT
      |
      V
Here we have two file descriptors to
umem device and shmem file
      |
      |                               umemd
      |                               daemon on the destination
      |
      V    create pipe to communicate
fork()--------------------------------------,
      |                                     |
      V                                     |
close(socket)                               V
close(shmem)                          mmap(shmem file)
      |                                     |
      V                                     V
mmap(umem device) for guest RAM       close(shmem file)
      |                                     |
close(umem device)                          |
      |                                     |
      V                                     |
wait for ready from daemon            (daemon becomes the owner of
      |                                the socket to the source)
      V                                     |
entering post copy stage                    |
start guest execution                       |
      |                                     |
      V                                     V
access guest RAM                      read() to get faulted pages
      |                                     |
      V                                     V
page fault -------------------------> page offset is returned
block                                       |
                                            V
                                      pull page from the source
                                      write the page contents
                                      to the shmem.
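
For readers who find the diagram easier to follow as code, here is a rough C
sketch of the same destination-side flow.  The ioctl request code, the RAM
size, and the page-request format are placeholders (the real definitions come
from the linux-umem driver and this series); error handling is omitted.

#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

#define PAGE_SIZE   4096
#define RAM_SIZE    (64ULL << 20)   /* example guest RAM size */
#define UMEM_INIT   0               /* placeholder for the real ioctl request code */

/* Placeholder: in the real series umemd pulls the page contents from the
 * migration source over the connection it took over from qemu. */
static void pull_page_from_source(uint64_t pgoff, void *buf)
{
    memset(buf, 0, PAGE_SIZE);
}

int main(void)
{
    int umem_fd = open("/dev/umem", O_RDWR);     /* open(/dev/umem) */
    /* UMEM_INIT: assumed here to hand back the fd of the shmem file that
     * backs guest RAM, giving us the two file descriptors in the diagram. */
    int shmem_fd = ioctl(umem_fd, UMEM_INIT);

    if (fork() == 0) {
        /* umemd daemon: maps the shmem file and serves page faults. */
        char *ram = mmap(NULL, RAM_SIZE, PROT_READ | PROT_WRITE,
                         MAP_SHARED, shmem_fd, 0);   /* mmap(shmem file) */
        close(shmem_fd);                             /* close(shmem file) */

        uint64_t pgoff;
        char buf[PAGE_SIZE];
        /* read() to get faulted pages: the page offset is returned */
        while (read(umem_fd, &pgoff, sizeof(pgoff)) == sizeof(pgoff)) {
            pull_page_from_source(pgoff, buf);
            /* write the page contents to the shmem, which unblocks the
             * guest access that faulted on this page */
            memcpy(ram + pgoff * PAGE_SIZE, buf, PAGE_SIZE);
        }
        _exit(0);
    }

    /* qemu side: guest RAM is mapped from the umem device, so touching a
     * page that has not arrived yet blocks until umemd fills it in. */
    close(shmem_fd);                                 /* close(shmem) */
    void *guest_ram = mmap(NULL, RAM_SIZE, PROT_READ | PROT_WRITE,
                           MAP_SHARED, umem_fd, 0);  /* mmap(umem device) for guest RAM */
    close(umem_fd);                                  /* close(umem device) */
    /* wait for ready from the daemon, then enter the post copy stage and
     * start guest execution using guest_ram ... */
    (void)guest_ram;
    return 0;
}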