Re: Fwd: [Qemu-devel] [PATCH v2 00/41] postcopy live migration

2012-06-04 Thread Isaku Yamahata
On Mon, Jun 04, 2012 at 05:01:30AM -0700, Chegu Vinod wrote:
 Hello Isaku Yamahata,

Hi.

 I just saw your patches... Would it be possible to email me a tar bundle of
 these patches (it makes it easier to apply them to a copy of the upstream
 qemu.git)?

I uploaded them to GitHub for those who are interested:

git://github.com/yamahata/qemu.git qemu-postcopy-june-04-2012
git://github.com/yamahata/linux-umem.git  linux-umem-june-04-2012 


 BTW, I am also curious: have you considered using any kind of RDMA features
 for optimizing the page faults during postcopy?

Yes, RDMA is an interesting topic. Could you share your use cases/concerns/issues?
Then we can collaborate.
You may want to see Benoit's results. As far as I know, he has not published
his code yet.

thanks,

 Thanks
 Vinod



 --

 Message: 1
 Date: Mon,  4 Jun 2012 18:57:02 +0900
 From: Isaku Yamahata <yamah...@valinux.co.jp>
 To: qemu-de...@nongnu.org, kvm@vger.kernel.org
 Cc: benoit.hud...@gmail.com, aarca...@redhat.com, aligu...@us.ibm.com,
   quint...@redhat.com, stefa...@gmail.com, t.hirofu...@aist.go.jp,
   dl...@redhat.com, satoshi.i...@aist.go.jp, mdr...@linux.vnet.ibm.com,
   yoshikawa.tak...@oss.ntt.co.jp, owass...@redhat.com, a...@redhat.com,
   pbonz...@redhat.com
 Subject: [Qemu-devel] [PATCH v2 00/41] postcopy live migration
 Message-ID: <cover.1338802190.git.yamah...@valinux.co.jp>

 After a long time, we have v2. This is the QEMU part.
 The Linux kernel part is sent separately.

 Changes v1 -> v2:
 - split up patches for review
 - buffered file refactored
 - many bug fixes
   Especially, PV drivers can now work with postcopy
 - optimization/heuristics

 Patches
 1 - 30: refactoring existing code and preparation
 31 - 37: implement postcopy itself (essential part)
 38 - 41: some optimizations/heuristics for postcopy

 Intro
 =====
 This patch series implements postcopy live migration. [1]
 As discussed at KVM Forum 2011, a dedicated character device is used for
 distributed shared memory between the migration source and destination.
 Now we can discuss/benchmark/compare it with precopy. I believe there is
 much room for improvement.

 [1] http://wiki.qemu.org/Features/PostCopyLiveMigration


 Usage
 =====
 You need to load the umem character device on the host before starting
 migration. Postcopy can be used with both the TCG and KVM accelerators.
 The implementation depends only on the Linux umem character device, and the
 driver-dependent code is split out into its own file.
 I tested only the host page size == guest page size case, but the
 implementation also allows host page size != guest page size.

 The following options are added by this patch series.
 - incoming part
   command line options
   -postcopy [-postcopy-flags <flags>]
   where <flags> changes behavior for benchmarking/debugging.
   Currently the following flags are available:
   0: default
   1: enable the touching-page request

   example:
   qemu -postcopy -incoming tcp:0:<port> -monitor stdio -machine accel=kvm

 - outgoing part
   options for the migrate command
   migrate [-p [-n] [-m]] <URI> [<prefault forward> [<prefault backward>]]
   -p: indicate postcopy migration
   -n: disable background page transfer (for benchmarking/debugging)
   -m: move background transfer of postcopy mode
   prefault forward: the number of pages ahead of the faulting page which are
   sent along with the on-demand page (illustrated below)
   prefault backward: the number of pages behind the faulting page which are
   sent along with the on-demand page

   example:
   migrate -p -n tcp:<dest ip address>:<port>
   migrate -p -n -m tcp:<dest ip address>:<port> 32 0
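
 To make the prefault parameters concrete, here is a minimal sketch (not
 code from the patch series; the helper name and the clamping policy are
 assumptions) of how an on-demand request for one page could be expanded
 into a window of backward and forward pages:

   /* Hypothetical sketch: expand an on-demand request for page 'pgoff'
    * into a window of 'backward' pages behind it and 'forward' pages
    * ahead of it, clamped to the number of guest RAM pages. */
   #include <stddef.h>

   struct page_range {
       size_t first;   /* first page to send */
       size_t count;   /* number of pages to send */
   };

   static struct page_range
   prefault_window(size_t pgoff, size_t nr_pages,
                   size_t forward, size_t backward)
   {
       struct page_range r;
       size_t last = pgoff + forward;

       r.first = pgoff > backward ? pgoff - backward : 0;
       if (last >= nr_pages)
           last = nr_pages - 1;
       r.count = last - r.first + 1;   /* +1 for the faulting page */
       return r;
   }

 With the second example above ("32 0"), a fault on page p would then
 request pages [p, p + 32].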


 TODO
 ====
 - benchmark/evaluation. Especially how async page fault affects the result.
 - improvements/optimizations
   At the moment, at least the following are on my radar:
   - making the incoming socket non-blocking with a thread (see the sketch
     after this list)
     As page compression is coming, it is impractical to do non-blocking
     reads and check whether the necessary data has arrived.
   - touching pages in the incoming qemu process via an fd handler seems
     suboptimal. Create a dedicated thread?
   - the outgoing handler seems suboptimal, causing latency.
 - consider the FUSE/CUSE possibility
 - don't fork umemd, but create a thread?
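
 For the first item, a minimal sketch of the fcntl() side of making the
 incoming socket non-blocking (an assumption on my part; the series may
 end up structuring this differently, e.g. around a reader thread):

   /* Hypothetical sketch: switch an already-connected incoming
    * migration socket to non-blocking mode so a dedicated thread can
    * poll it instead of stalling on read(). */
   #include <fcntl.h>

   static int set_nonblocking(int fd)
   {
       int flags = fcntl(fd, F_GETFL, 0);

       if (flags < 0)
           return -1;
       return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
   }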

 basic postcopy work flow
 ========================
 qemu on the destination
   |
   V
 open(/dev/umem)
   |
   V
 UMEM_INIT
   |
   V
 Here we have two file descriptors to
 the umem device and the shmem file
   |
   |                       umemd
   |                       daemon on the destination
   |
   V    create pipe to communicate
 fork()--------------------,
   |                       |
   V                       |
 close(socket)             V
 close(shmem)
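
 The destination-side steps in the diagram come down to a handful of system
 calls. A hypothetical sketch follows; the UMEM_INIT request number, its
 argument, and the exact descriptor handoff are guesses made for
 illustration (the real ABI is in the linux-umem tree):

   /* Hypothetical sketch of the destination-side flow shown above.
    * UMEM_INIT's request number and return convention are placeholders;
    * see the linux-umem tree for the real interface. */
   #include <fcntl.h>
   #include <stdio.h>
   #include <sys/ioctl.h>
   #include <sys/types.h>
   #include <unistd.h>

   int main(void)
   {
       int pipefd[2];
       pid_t pid;

       int umem_fd = open("/dev/umem", O_RDWR);
       if (umem_fd < 0) {
           perror("open /dev/umem");
           return 1;
       }

       /* Placeholder: assume UMEM_INIT hands back a shmem fd that
        * backs guest RAM, so we end up with two descriptors. */
       int shmem_fd = ioctl(umem_fd, 0 /* UMEM_INIT */, NULL);
       if (shmem_fd < 0) {
           perror("ioctl UMEM_INIT");
           return 1;
       }

       if (pipe(pipefd) < 0)        /* pipe to talk to umemd */
           return 1;

       pid = fork();
       if (pid == 0) {
           /* child: the umemd daemon keeps the umem fd and one pipe
            * end, and serves page requests from the source */
           close(pipefd[1]);
           _exit(0);
       }
       /* parent: qemu closes what it no longer needs, per the
        * diagram (the migration socket would be closed here too) */
       close(pipefd[0]);
       close(shmem_fd);
       return 0;
   }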

Re: Fwd: [Qemu-devel] [PATCH v2 00/41] postcopy live migration

2012-06-04 Thread Chegu Vinod

On 6/4/2012 6:13 AM, Isaku Yamahata wrote:

On Mon, Jun 04, 2012 at 05:01:30AM -0700, Chegu Vinod wrote:

Hello Isaku Yamahata,

Hi.


I just saw your patches... Would it be possible to email me a tar bundle of
these patches (it makes it easier to apply them to a copy of the upstream
qemu.git)?

I uploaded them to GitHub for those who are interested:

git://github.com/yamahata/qemu.git qemu-postcopy-june-04-2012
git://github.com/yamahata/linux-umem.git  linux-umem-june-04-2012



Thanks for the pointer...

BTW, I am also curious: have you considered using any kind of RDMA features
for optimizing the page faults during postcopy?

Yes, RDMA is an interesting topic. Could you share your use cases/concerns/issues?



We are looking at large guests (256 GB and higher) running CPU/memory
intensive enterprise workloads.
The concerns are the same, i.e. having a predictable total migration
time, minimal downtime/freeze time, and of course minimal service
degradation to the workload(s) in the VM or in co-located VMs...


How large a guest have you tested your changes with, and what kinds of
workloads have you used so far?



Then we can collaborate.
You may want to see Benoit's results.


Yes, I have already seen some of Benoit's results.

Hence the question about the use of RDMA techniques for postcopy.


As far as I know, he has not published
his code yet.


Thanks
Vinod



thanks,


Thanks
Vinod



Re: Fwd: [Qemu-devel] [PATCH v2 00/41] postcopy live migration

2012-06-04 Thread Isaku Yamahata
On Mon, Jun 04, 2012 at 07:27:25AM -0700, Chegu Vinod wrote:
 On 6/4/2012 6:13 AM, Isaku Yamahata wrote:
 On Mon, Jun 04, 2012 at 05:01:30AM -0700, Chegu Vinod wrote:
 Hello Isaku Yamahata,
 Hi.

 I just saw your patches... Would it be possible to email me a tar bundle of
 these patches (it makes it easier to apply them to a copy of the upstream
 qemu.git)?
 I uploaded them to GitHub for those who are interested:

 git://github.com/yamahata/qemu.git qemu-postcopy-june-04-2012
 git://github.com/yamahata/linux-umem.git  linux-umem-june-04-2012


 Thanks for the pointer...
 BTW, I am also curious: have you considered using any kind of RDMA features
 for optimizing the page faults during postcopy?
 Yes, RDMA is an interesting topic. Could you share your use cases/concerns/issues?


 We are looking at large guests (256 GB and higher) running CPU/memory
 intensive enterprise workloads.
 The concerns are the same, i.e. having a predictable total migration
 time, minimal downtime/freeze time, and of course minimal service
 degradation to the workload(s) in the VM or in co-located VMs...

 How large a guest have you tested your changes with, and what kinds of
 workloads have you used so far?

Only up to several-GB VMs. Of course we'd like to benchmark with a really
huge VM (several hundred GB), but that's somewhat difficult.


 Then we can collaborate.
 You may want to see Benoit's results.

 Yes, I have already seen some of Benoit's results.

Great.

 Hence the question about the use of RDMA techniques for postcopy.

So far my implementation doesn't use RDMA.

 As far as I know, he has not published
 his code yet.

 Thanks
 Vinod


 thanks,

 Thanks
 Vinod


