Re: [Qemu-devel] KVM call agenda for Tuesday, June 19th

2012-07-11 Thread Takuya Yoshikawa
On Thu, 12 Jul 2012 02:02:24 +0100
"Vinod, Chegu"  wrote:

> There have been some recent fixes (from Juan) that are supposed to honor the 
> user requested downtime. I am in the middle of redoing some of my 
> experiments...and will share when they are ready (in about 3-4 days).  
> Initial observations are that the time taken for the total migration 
> increases considerably, but there are no observed stalls or ping timeouts etc. 
> Will know more after I finish my experiments (i.e. the non-XBZRLE ones).
> 
> As expected, the 10G [back-to-back] connection is not really getting 
> saturated with the migration traffic... so there is some other layer that 
> is consuming time (possibly the overhead of tracking dirty pages).  
> 
> I haven't yet  had the time to try to quantify the performance degradation on 
> the workload during the live migration (stage 2)... need to look at that 
> next. 
> 

I recommend trying the latest kvm.git next branch as well, since
it now has Xiao's fast (lock-less) page fault handling work.

Although I am still testing that branch, it seems to be working well here.

Thanks,
Takuya

> Thanks for the pointers to the old articles. 
> 
> Thanks
> Vinod
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [Qemu-devel] KVM call agenda for Tuesday, June 19th

2012-07-11 Thread Vinod, Chegu


-----Original Message-----
From: Dor Laor [mailto:dl...@redhat.com] 
Sent: Wednesday, July 11, 2012 2:59 AM
To: Vinod, Chegu
Cc: kvm@vger.kernel.org
Subject: Re: [Qemu-devel] KVM call agenda for Tuesday, June 19th

On 06/19/2012 06:42 PM, Chegu Vinod wrote:
> Hello,
>
> Wanted to share some preliminary data from live migration experiments 
> on a setup that is perhaps one of the larger ones.
>
> We used Juan's "huge_memory" patches (without the separate migration 
> thread) and measured the total migration time and the time taken for stage 3 
> ("downtime").
> Note: We didn't change the default "downtime" (30ms?). We had a 
> private 10Gig back-to-back link between the two hosts..and we set the 
> migration speed to 10Gig.
>
> The "workloads" chosen were ones that we could easily setup. All 
> experiments were done without using virsh/virt-manager (i.e. direct 
> interaction with the qemu monitor prompt).  Pl. see the data below.
>
> As the guest size increased (and for the busier workloads) we observed 
> that network connections were getting dropped not only during the "downtime" 
> (i.e. stage 3) but also at times during the iterative pre-copy phase 
> (i.e. stage 2).  Perhaps some of this will get fixed when we have the 
> migration thread implemented.
>
> We had also briefly tried the proposed delta compression changes 
> (easier to say than XBZRLE :)) on a smaller configuration. For the 
> simple workloads (perhaps there was not much temporal locality in 
> them) it didn't seem to show improvements; instead it took much longer 
> to migrate (high cache miss penalty?). Waiting for the updated 
> version of XBZRLE for further experiments to see how well it scales on 
> this larger setup...
>
> FYI
> Vinod
>
> ---
> 10VCPUs/128G
> ---
> 1) Idle guest
> Total migration time : 124585 ms,
> Stage_3_time : 941 ms ,
> Total MB transferred : 2720
>
>
> 2) AIM7-compute (2000 users)
> Total migration time : 123540 ms,
> Stage_3_time : 726 ms ,
> Total MB transferred : 3580
>
> 3) SpecJBB (modified to run 10 warehouse threads for a long duration 
> of time) Total migration time : 165720 ms, Stage_3_time : 6851 ms , 
> Total MB transferred : 19656

6.8s downtime may be unacceptable for some applications. Does it converge with 
a maximum downtime of 1 sec?
In theory this is where post copy can shine. But what we're missing in the 
(good) performance data is how the application performs during live migration. 
This is exactly where the live migration thread and dirtybit optimization 
should help us.

Our 'friends' have some nice old analyses of live migration performance:
  - http://www.cl.cam.ac.uk/research/srg/netos/papers/2005-migration-nsdi-pre.pdf
  - http://www.vmware.com/files/pdf/techpaper/VMW_Netioc_BestPractices.pdf

Cheers,
Dor
>
>
>


There have been some recent fixes (from Juan) that are supposed to honor the 
user-requested downtime. I am in the middle of redoing some of my 
experiments...and will share when they are ready (in about 3-4 days).  Initial 
observations are that the time taken for the total migration increases 
considerably, but there are no observed stalls or ping timeouts etc. Will know 
more after I finish my experiments (i.e. the non-XBZRLE ones).

As expected, the 10G [back-to-back] connection is not really getting saturated 
with the migration traffic... so there is some other layer that is 
consuming time (possibly the overhead of tracking dirty pages).  
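
A rough cross-check of the "not saturated" observation can be made from the totals in the original post. This is only a sketch: it averages over the whole migration, it assumes "MB" means 2**20 bytes (the posts do not say), and the helper name is made up for illustration.

```python
# Average migration throughput computed from the totals posted in this thread.
# LINK_GBIT_PER_S is the 10Gig migration speed mentioned in the original post.
LINK_GBIT_PER_S = 10.0


def effective_gbit_per_s(total_mb: float, total_ms: float) -> float:
    """Average throughput over the whole migration, in Gbit/s.

    Assumption: "MB" is 2**20 bytes. The posts do not state the unit.
    """
    bytes_sent = total_mb * 1024 * 1024
    seconds = total_ms / 1000.0
    return bytes_sent * 8 / seconds / 1e9


# (Total MB transferred, Total migration time in ms), copied from the post.
samples = {
    "10VCPUs/128G idle": (2720, 124585),
    "10VCPUs/128G Google SAT": (142136, 411827),
    "40VCPUs/512G Google SAT": (493575, 2585039),
}

for name, (mb, ms) in samples.items():
    rate = effective_gbit_per_s(mb, ms)
    share = 100 * rate / LINK_GBIT_PER_S
    print(f"{name}: {rate:.2f} Gbit/s ({share:.0f}% of the link)")
```

Even the busiest runs average well below the link speed, which is consistent with time being spent somewhere other than on the wire.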

I haven't yet  had the time to try to quantify the performance degradation on 
the workload during the live migration (stage 2)... need to look at that next. 

Thanks for the pointers to the old articles. 

Thanks
Vinod



> 4) Google SAT  (-s 3600 -C 5 -i 5)
> Total migration time : 411827 ms,
> Stage_3_time : 77807 ms ,
> Total MB transferred : 142136
>
>
>
> ---
> 20VCPUs /256G
> ---
>
> 1) Idle  guest
> Total migration time : 259938 ms,
> Stage_3_time : 1998 ms ,
> Total MB transferred : 5114
>
> 2) AIM7-compute (2000 users)
> Total migration time : 261336 ms,
> Stage_3_time : 2107 ms ,
> Total MB transferred : 5473
>
> 3) SpecJBB (modified to run 20 warehouse threads for a long duration 
> of time) Total migration time : 390548 ms, Stage_3_time : 19596 ms , 
> Total MB transferred : 48109
>
> 4) Google SAT  (-s 3600 -C 10 -i 10)
> Total migration time : 780150 ms,
> Stage_3_time : 90346 ms ,
> Total MB transferred : 251287
>
> 
> 30VCPUs/384G
> ---
>
> 1) Idle guest
> (qemu) Total migration time : 501704 ms, Stage_3_time : 2835 ms , 
> Total MB transferred : 15731
>
>
> 2) AIM7-compute (2000 users)
> Total migration time : 496001 ms,
> Stage_

Re: [Qemu-devel] KVM call agenda for Tuesday, June 19th

2012-07-11 Thread Dor Laor

On 06/19/2012 08:22 PM, Michael Roth wrote:

On Tue, Jun 19, 2012 at 11:34:42PM +0900, Takuya Yoshikawa wrote:

On Tue, 19 Jun 2012 09:01:36 -0500
Anthony Liguori  wrote:


I'm not at all convinced that postcopy is a good idea.  There needs to be a
clear expression of what the value proposition is, backed by benchmarks.  Those
benchmarks need to include latency measurements of downtime, which so far I've
not seen.

I don't want to take any postcopy patches until this discussion happens.


FWIW:

I rather see postcopy as a way of migrating guests forcibly and I know
a service in which such a way is needed: emergency migration.  There is
also a product which does live migration when some hardware problems are
detected (as a semi-FT solution) -- in such cases, we cannot wait until
the guest becomes calm.


Ignoring max downtime values when we've determined that the target is no
longer converging would be another option. Essentially having a
use_strict_max_downtime that can be set on a per-migration basis, where
if not set we can "give up" on maintaining the max_downtime when it's
been determined that progress is no longer being made.


There is no need for a new parameter. Management software like 
ovirt/virt-manager can track the amount of pages-to-migrate left and, if 
the number starts rising, realize that the current max limit won't 
converge and either increase the limit or cancel the migration.
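
A minimal sketch of what such management-side tracking could look like. Everything here is hypothetical: the function names, the doubling step and the cap are invented for illustration, and the samples of remaining pages are assumed to come from whatever the monitor reports.

```python
from typing import Sequence


def is_diverging(remaining: Sequence[int], window: int = 3) -> bool:
    """True if the remaining-pages count rose in each of the last `window` samples."""
    if len(remaining) < window + 1:
        return False  # not enough history to judge yet
    recent = remaining[-(window + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


def next_action(remaining: Sequence[int], max_downtime_ms: int,
                downtime_cap_ms: int, window: int = 3) -> str:
    """Decide what the management layer does after each new sample.

    Hypothetical policy: keep waiting while migration converges; once it
    diverges, double the allowed downtime if that stays within the cap,
    otherwise cancel the migration.
    """
    if not is_diverging(remaining, window):
        return "wait"
    if max_downtime_ms * 2 <= downtime_cap_ms:
        return "raise_downtime"
    return "cancel"


print(next_action([500, 400, 300, 200], 30, 1000))
print(next_action([200, 300, 400, 500], 30, 1000))
print(next_action([200, 300, 400, 500], 600, 1000))
```

Requiring several consecutive increases avoids reacting to a single noisy sample; the window size is a tuning choice, not something taken from this thread.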






Although I am not certain whether QEMU can be used for such products,
it may be worth thinking about.

Thanks,
Takuya




Re: [Qemu-devel] KVM call agenda for Tuesday, June 19th

2012-07-11 Thread Dor Laor

On 06/19/2012 06:42 PM, Chegu Vinod wrote:

Hello,

Wanted to share some preliminary data from live migration experiments on a setup
that is perhaps one of the larger ones.

We used Juan's "huge_memory" patches (without the separate migration thread) and
measured the total migration time and the time taken for stage 3 ("downtime").
Note: We didn't change the default "downtime" (30ms?). We had a private 10Gig
back-to-back link between the two hosts..and we set the migration speed to
10Gig.

The "workloads" chosen were ones that we could easily setup. All experiments
were done without using virsh/virt-manager (i.e. direct interaction with the
qemu monitor prompt).  Pl. see the data below.

As the guest size increased (and for the busier workloads) we observed that
network connections were getting dropped not only during the "downtime" (i.e.
stage 3) but also at times during the iterative pre-copy phase (i.e. stage
2).  Perhaps some of this will get fixed when we have the migration thread
implemented.

We had also briefly tried the proposed delta compression changes (easier to say
than XBZRLE :)) on a smaller configuration. For the simple workloads (perhaps
there was not much temporal locality in them) it didn't seem to show
improvements; instead it took much longer to migrate (high cache miss
penalty?). Waiting for the updated version of XBZRLE for further experiments
to see how well it scales on this larger setup...

FYI
Vinod

---
10VCPUs/128G
---
1) Idle guest
Total migration time : 124585 ms,
Stage_3_time : 941 ms ,
Total MB transferred : 2720


2) AIM7-compute (2000 users)
Total migration time : 123540 ms,
Stage_3_time : 726 ms ,
Total MB transferred : 3580

3) SpecJBB (modified to run 10 warehouse threads for a long duration of time)
Total migration time : 165720 ms,
Stage_3_time : 6851 ms ,
Total MB transferred : 19656


6.8s downtime may be unacceptable for some applications. Does it 
converge with a maximum downtime of 1 sec?
In theory this is where post copy can shine. But what we're missing in 
the (good) performance data is how the application performs during live 
migration. This is exactly where the live migration thread and dirtybit 
optimization should help us.
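
Whether pre-copy converges under a given downtime limit can be reasoned about with a very simple model. This is a textbook-style simplification and not QEMU's actual algorithm: it assumes a constant dirty rate and constant bandwidth, and all names are made up for illustration.

```python
from typing import Optional, Tuple


def precopy_rounds(ram_bytes: int, dirty_rate: float, bandwidth: float,
                   max_downtime_s: float,
                   max_rounds: int = 30) -> Optional[Tuple[int, float]]:
    """Model iterative pre-copy with constant rates (bytes and bytes/second).

    Returns (passes_done, final_downtime_s) once the data still to send
    fits in the downtime budget, or None if that never happens within
    `max_rounds` passes.
    """
    remaining = float(ram_bytes)
    for passes_done in range(max_rounds + 1):
        send_time = remaining / bandwidth
        if send_time <= max_downtime_s:
            return passes_done, send_time  # stop the guest and copy the rest
        # While this pass is being sent, the guest keeps dirtying memory.
        remaining = min(float(ram_bytes), dirty_rate * send_time)
    return None


GIB = 1 << 30
# 128 GiB guest, 1 GiB/s link, guest dirtying 0.5 GiB/s, 1 s downtime limit.
print(precopy_rounds(128 * GIB, 0.5 * GIB, 1 * GIB, 1.0))
# Same guest dirtying memory faster than the link can send it.
print(precopy_rounds(128 * GIB, 2 * GIB, 1 * GIB, 1.0))
```

In this model each pass shrinks the remaining data by the ratio of dirty rate to bandwidth, so a workload that dirties memory as fast as the link can send it never reaches the limit. That is the case where either a larger downtime or a different scheme such as post copy is needed.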


Our 'friends' have some nice old analyses of live migration performance:
 - http://www.cl.cam.ac.uk/research/srg/netos/papers/2005-migration-nsdi-pre.pdf
 - http://www.vmware.com/files/pdf/techpaper/VMW_Netioc_BestPractices.pdf

Cheers,
Dor



4) Google SAT  (-s 3600 -C 5 -i 5)
Total migration time : 411827 ms,
Stage_3_time : 77807 ms ,
Total MB transferred : 142136



---
20VCPUs /256G
---

1) Idle  guest
Total migration time : 259938 ms,
Stage_3_time : 1998 ms ,
Total MB transferred : 5114

2) AIM7-compute (2000 users)
Total migration time : 261336 ms,
Stage_3_time : 2107 ms ,
Total MB transferred : 5473

3) SpecJBB (modified to run 20 warehouse threads for a long duration of time)
Total migration time : 390548 ms,
Stage_3_time : 19596 ms ,
Total MB transferred : 48109

4) Google SAT  (-s 3600 -C 10 -i 10)
Total migration time : 780150 ms,
Stage_3_time : 90346 ms ,
Total MB transferred : 251287


30VCPUs/384G
---

1) Idle guest
(qemu) Total migration time : 501704 ms,
Stage_3_time : 2835 ms ,
Total MB transferred : 15731


2) AIM7-compute (2000 users)
Total migration time : 496001 ms,
Stage_3_time : 3884 ms ,
Total MB transferred : 9375


3) SpecJBB (modified to run 30 warehouse threads for a long duration of time)
Total migration time : 611075 ms,
Stage_3_time : 17107 ms ,
Total MB transferred : 48862


4) Google SAT  (-s 3600 -C 15 -i 15)  (look at /tmp/kvm_30w_Goog)
Total migration time : 1348102 ms,
Stage_3_time : 128531 ms ,
Total MB transferred : 367524



---
40VCPUs/512G
---

1) Idle guest
Total migration time : 780257 ms,
Stage_3_time : 3770 ms ,
Total MB transferred : 13330


2) AIM7-compute (2000 users)
Total migration time : 720963 ms,
Stage_3_time : 3966 ms ,
Total MB transferred : 10595

3) SpecJBB (modified to run 40 warehouse threads for a long duration of time)
Total migration time : 863577 ms,
Stage_3_time : 25149 ms ,
Total MB transferred : 54685

4) Google SAT  (-s 3600 -C 20 -i 20)
Total migration time : 2585039 ms,
Stage_3_time : 177625 ms ,
Total MB transferred : 493575




Re: [Qemu-devel] KVM call agenda for Tuesday, June 19th

2012-06-19 Thread Michael Roth
On Tue, Jun 19, 2012 at 11:34:42PM +0900, Takuya Yoshikawa wrote:
> On Tue, 19 Jun 2012 09:01:36 -0500
> Anthony Liguori  wrote:
> 
> > I'm not at all convinced that postcopy is a good idea.  There needs to be a
> > clear expression of what the value proposition is, backed by benchmarks.
> > Those benchmarks need to include latency measurements of downtime, which
> > so far I've not seen.
> > 
> > I don't want to take any postcopy patches until this discussion happens.
> 
> FWIW:
> 
> I rather see postcopy as a way of migrating guests forcibly and I know
> a service in which such a way is needed: emergency migration.  There is
> also a product which does live migration when some hardware problems are
> detected (as a semi-FT solution) -- in such cases, we cannot wait until
> the guest becomes calm.

Ignoring max downtime values when we've determined that the target is no
longer converging would be another option. Essentially having a
use_strict_max_downtime that can be set on a per-migration basis, where
if not set we can "give up" on maintaining the max_downtime when it's
been determined that progress is no longer being made.
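
To make the idea concrete, here is a minimal sketch of the decision such a flag would control. Only the name use_strict_max_downtime comes from the proposal above; the class, the function and the way "progress" is passed in are assumptions for illustration, not existing QEMU code.

```python
from dataclasses import dataclass


@dataclass
class MigrationPolicy:
    """Per-migration settings (hypothetical container for this sketch)."""
    max_downtime_ms: float
    use_strict_max_downtime: bool = True


def should_stop_and_copy(policy: MigrationPolicy,
                         expected_downtime_ms: float,
                         making_progress: bool) -> bool:
    """Decide whether to leave the iterative phase and do the final copy."""
    if expected_downtime_ms <= policy.max_downtime_ms:
        return True  # the remaining data fits in the downtime budget
    if policy.use_strict_max_downtime:
        return False  # never exceed the limit, keep iterating
    # Non-strict mode: give up on the limit once progress has stalled.
    return not making_progress


lenient = MigrationPolicy(max_downtime_ms=30, use_strict_max_downtime=False)
print(should_stop_and_copy(lenient, expected_downtime_ms=6851, making_progress=False))
```

With the flag left at its strict setting the behaviour is unchanged, so a migration that does not converge keeps iterating until someone intervenes.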

> 
> Although I am not certain whether QEMU can be used for such products,
> it may be worth thinking about.
> 
> Thanks,
>   Takuya
> 


Re: [Qemu-devel] KVM call agenda for Tuesday, June 19th

2012-06-19 Thread Michael Roth
On Tue, Jun 19, 2012 at 03:54:23PM +0200, Juan Quintela wrote:
> Juan Quintela  wrote:
> > Hi
> >
> > Please send in any agenda items you are interested in covering.
> >
> > Anthony suggested for last week:
> > - multithreading vhost (and general vhost improvements)
> >
> > I suggest:
> > - status of migration: post-copy, IDL, XBZRLE, huge memory, ...
> >   Will send an email with a status before tomorrow's call.
> 
> XBZRLE: v12 is coming today or so.
> 
> 
> These three patches should be a no-brainer (just refactoring code).
> The first one is shared with postcopy.
> 
> [PATCH v11 1/9] Add MigrationParams structure
> [PATCH v11 5/9] Add uleb encoding/decoding functions
> [PATCH v11 6/9] Add save_block_hdr function
> 
> These are the ones that we can discuss.
> 
> [PATCH v11 2/9] Add migration capabilites
> [PATCH v11 3/9] Add XBZRLE documentation
> [PATCH v11 4/9] Add cache handling functions
> [PATCH v11 7/9] Add XBZRLE to ram_save_block and ram_save_live
> [PATCH v11 8/9] Add set_cachesize command
> 
> Postcopy:  This is just refactoring that can be integrated.
> 
> [PATCH v2 01/41] arch_init: export sort_ram_list() and ram_save_block()
> [PATCH v2 02/41] arch_init: export RAM_SAVE_xxx flags for postcopy
> [PATCH v2 03/41] arch_init/ram_save: introduce constant for ram save version = 4
> [PATCH v2 04/41] arch_init: refactor host_from_stream_offset()
> [PATCH v2 05/41] arch_init/ram_save_live: factor out RAM_SAVE_FLAG_MEM_SIZE case
> [PATCH v2 06/41] arch_init: refactor ram_save_block()
> [PATCH v2 07/41] arch_init/ram_save_live: factor out ram_save_limit
> [PATCH v2 08/41] arch_init/ram_load: refactor ram_load
> [PATCH v2 09/41] arch_init: introduce helper function to find ram block with id string
> [PATCH v2 10/41] arch_init: simplify a bit by ram_find_block()
> [PATCH v2 11/41] arch_init: factor out counting transferred bytes
> [PATCH v2 12/41] arch_init: factor out setting last_block, last_offset
> [PATCH v2 13/41] exec.c: factor out qemu_get_ram_ptr()
> [PATCH v2 14/41] exec.c: export last_ram_offset()
> [PATCH v2 15/41] savevm: export qemu_peek_buffer, qemu_peek_byte, qemu_file_skip
> [PATCH v2 16/41] savevm: qemu_pending_size() to return pending buffered size
> [PATCH v2 17/41] savevm, buffered_file: introduce method to drain buffer of buffered file
> [PATCH v2 18/41] QEMUFile: add qemu_file_fd() for later use
> [PATCH v2 19/41] savevm/QEMUFile: drop qemu_stdio_fd
> [PATCH v2 20/41] savevm/QEMUFileSocket: drop duplicated member fd
> [PATCH v2 21/41] savevm: rename QEMUFileSocket to QEMUFileFD, socket_close to fd_close
> [PATCH v2 22/41] savevm/QEMUFile: introduce qemu_fopen_fd
> [PATCH v2 23/41] migration.c: remove redundant line in migrate_init()
> [PATCH v2 24/41] migration: export migrate_fd_completed() and migrate_fd_cleanup()
> [PATCH v2 25/41] migration: factor out parameters into MigrationParams
> [PATCH v2 26/41] buffered_file: factor out buffer management logic
> [PATCH v2 27/41] buffered_file: Introduce QEMUFileNonblock for nonblock write
> [PATCH v2 28/41] buffered_file: add qemu_file to read/write to buffer in memory
> 
> This is postcopy proper.  For these, the things raised in the previous
> review need to be addressed, and after that there will probably be (at
> least) another review.  One thing to take into account is that the umem
> (or whatever you want to call it) should be able to work over RDMA.  Can
> anyone who knows about RDMA comment on this?
> 
> [PATCH v2 29/41] umem.h: import Linux umem.h
> [PATCH v2 30/41] update-linux-headers.sh: teach umem.h to update-linux-headers.sh
> [PATCH v2 31/41] configure: add CONFIG_POSTCOPY option
> [PATCH v2 32/41] savevm: add new section that is used by postcopy
> [PATCH v2 33/41] postcopy: introduce -postcopy and -postcopy-flags option
> [PATCH v2 34/41] postcopy outgoing: add -p and -n option to migrate command
> [PATCH v2 35/41] postcopy: introduce helper functions for postcopy
> [PATCH v2 36/41] postcopy: implement incoming part of postcopy live migration
> [PATCH v2 37/41] postcopy: implement outgoing part of postcopy live migration
> [PATCH v2 38/41] postcopy/outgoing: add forward, backward option to specify the size of prefault
> [PATCH v2 39/41] postcopy/outgoing: implement prefault
> [PATCH v2 40/41] migrate: add -m (movebg) option to migrate command
> [PATCH v2 41/41] migration/postcopy: add movebg mode
> 
> Huge memory migration.
> These should be trivial, and should be integrated.
> 
> [PATCH 1/7] Add spent time for migration
> [PATCH 2/7] Add tracepoints for savevm section start/end
> [PATCH 3/7] No need to iterate if we already are over the limit
> [PATCH 4/7] Only TCG needs TLB handling
> [PATCH 5/7] Only calculate expected_time for stage 2
> 
> This one is also trivial, but Anthony on previous reviews wanted to have
> migration-thread before we integrated this one.
> 
> [PATCH 6/7] Exit loop if we have been there too long
> 
> This one, Anthony wanted a different approach improving bitmap
> handling.  N

Re: [Qemu-devel] KVM call agenda for Tuesday, June 19th

2012-06-19 Thread Chegu Vinod
Hello,

Wanted to share some preliminary data from live migration experiments on a 
setup that is perhaps one of the larger ones.  

We used Juan's "huge_memory" patches (without the separate migration thread) 
and measured the total migration time and the time taken for stage 3 
("downtime"). Note: We didn't change the default "downtime" (30ms?). We had a 
private 10Gig back-to-back link between the two hosts, and we set the 
migration speed to 10Gig. 

The "workloads" chosen were ones that we could easily setup. All experiments 
were done without using virsh/virt-manager (i.e. direct interaction with the 
qemu monitor prompt).  Pl. see the data below. 

As the guest size increased (and for the busier workloads) we observed that 
network connections were getting dropped not only during the "downtime" (i.e. 
stage 3) but also at times during the iterative pre-copy phase (i.e. stage 
2).  Perhaps some of this will get fixed when we have the migration thread 
implemented.

We had also briefly tried the proposed delta compression changes (easier to say 
than XBZRLE :)) on a smaller configuration. For the simple workloads (perhaps 
there was not much temporal locality in them) it didn't seem to show 
improvements; instead it took much longer to migrate (high cache miss 
penalty?). Waiting for the updated version of XBZRLE for further experiments 
to see how well it scales on this larger setup... 
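
For readers unfamiliar with the idea, delta compression of this kind sends only what changed in a page relative to a cached copy of its previous contents. The sketch below illustrates the principle with a plain XOR followed by run-length encoding of the unchanged stretches. It is not the XBZRLE wire format and not taken from the patches; the function names and the list-of-tuples encoding are invented for illustration.

```python
from typing import List, Tuple

Delta = List[Tuple[int, bytes]]  # (count of unchanged bytes, changed XOR bytes)


def xor_rle_encode(old: bytes, new: bytes) -> Delta:
    """Encode `new` relative to the cached copy `old` of the same page."""
    if len(old) != len(new):
        raise ValueError("pages must have the same size")
    delta = bytes(a ^ b for a, b in zip(old, new))
    out: Delta = []
    i, n = 0, len(delta)
    while i < n:
        start = i
        while i < n and delta[i] == 0:  # skip bytes that did not change
            i += 1
        zero_run = i - start
        start = i
        while i < n and delta[i] != 0:  # collect bytes that did change
            i += 1
        out.append((zero_run, delta[start:i]))
    return out


def xor_rle_decode(old: bytes, encoded: Delta) -> bytes:
    """Rebuild the new page from the cached copy and the encoded delta."""
    page = bytearray(old)
    pos = 0
    for zero_run, literal in encoded:
        pos += zero_run
        for offset, value in enumerate(literal):
            page[pos + offset] ^= value
        pos += len(literal)
    return bytes(page)


old_page = bytes(4096)
new_page = bytearray(old_page)
new_page[5] = 7  # a single byte changed since the page was last sent
encoded = xor_rle_encode(old_page, bytes(new_page))
assert xor_rle_decode(old_page, encoded) == bytes(new_page)
print(encoded[0], len(encoded))
```

The saving only exists when the previous contents are still in the cache. If they are not, the whole page has to be sent anyway after the lookup has already been paid for, which may be what the "cache miss penalty" guess above refers to.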

FYI
Vinod

---
10VCPUs/128G
---
1) Idle guest
Total migration time : 124585 ms, 
Stage_3_time : 941 ms , 
Total MB transferred : 2720


2) AIM7-compute (2000 users)
Total migration time : 123540 ms, 
Stage_3_time : 726 ms , 
Total MB transferred : 3580

3) SpecJBB (modified to run 10 warehouse threads for a long duration of time)
Total migration time : 165720 ms, 
Stage_3_time : 6851 ms , 
Total MB transferred : 19656


4) Google SAT  (-s 3600 -C 5 -i 5)
Total migration time : 411827 ms, 
Stage_3_time : 77807 ms , 
Total MB transferred : 142136



---
20VCPUs /256G
---

1) Idle  guest
Total migration time : 259938 ms, 
Stage_3_time : 1998 ms , 
Total MB transferred : 5114

2) AIM7-compute (2000 users)
Total migration time : 261336 ms, 
Stage_3_time : 2107 ms , 
Total MB transferred : 5473

3) SpecJBB (modified to run 20 warehouse threads for a long duration of time)
Total migration time : 390548 ms, 
Stage_3_time : 19596 ms , 
Total MB transferred : 48109

4) Google SAT  (-s 3600 -C 10 -i 10)
Total migration time : 780150 ms, 
Stage_3_time : 90346 ms , 
Total MB transferred : 251287


30VCPUs/384G
---

1) Idle guest
(qemu) Total migration time : 501704 ms, 
Stage_3_time : 2835 ms , 
Total MB transferred : 15731


2) AIM7-compute (2000 users)
Total migration time : 496001 ms, 
Stage_3_time : 3884 ms , 
Total MB transferred : 9375


3) SpecJBB (modified to run 30 warehouse threads for a long duration of time)
Total migration time : 611075 ms, 
Stage_3_time : 17107 ms , 
Total MB transferred : 48862


4) Google SAT  (-s 3600 -C 15 -i 15)  (look at /tmp/kvm_30w_Goog)
Total migration time : 1348102 ms, 
Stage_3_time : 128531 ms , 
Total MB transferred : 367524



---
40VCPUs/512G
---

1) Idle guest
Total migration time : 780257 ms, 
Stage_3_time : 3770 ms , 
Total MB transferred : 13330


2) AIM7-compute (2000 users)
Total migration time : 720963 ms, 
Stage_3_time : 3966 ms , 
Total MB transferred : 10595

3) SpecJBB (modified to run 40 warehouse threads for a long duration of time)
Total migration time : 863577 ms, 
Stage_3_time : 25149 ms , 
Total MB transferred : 54685

4) Google SAT  (-s 3600 -C 20 -i 20)
Total migration time : 2585039 ms, 
Stage_3_time : 177625 ms , 
Total MB transferred : 493575




Re: [Qemu-devel] KVM call agenda for Tuesday, June 19th

2012-06-19 Thread Takuya Yoshikawa
On Tue, 19 Jun 2012 09:01:36 -0500
Anthony Liguori  wrote:

> I'm not at all convinced that postcopy is a good idea.  There needs to be a
> clear expression of what the value proposition is, backed by benchmarks.
> Those benchmarks need to include latency measurements of downtime, which
> so far I've not seen.
> 
> I don't want to take any postcopy patches until this discussion happens.

FWIW:

I see postcopy rather as a way of migrating guests forcibly, and I know
a service in which such a method is needed: emergency migration.  There is
also a product which does live migration when some hardware problems are
detected (as a semi-FT solution) -- in such cases, we cannot wait until
the guest becomes calm.

Although I am not certain whether QEMU can be used for such products,
it may be worth thinking about.

Thanks,
Takuya


Re: [Qemu-devel] KVM call agenda for Tuesday, June 19th

2012-06-19 Thread Anthony Liguori

On 06/19/2012 08:54 AM, Juan Quintela wrote:

Juan Quintela  wrote:

Hi

Please send in any agenda items you are interested in covering.

Anthony suggested for last week:
- multithreading vhost (and general vhost improvements)

I suggest:
- status of migration: post-copy, IDL, XBZRLE, huge memory, ...
   Will send an email with a status before tomorrow's call.


XBZRLE: v12 is coming today or so.


These three patches should be a no-brainer (just refactoring code).
The first one is shared with postcopy.

[PATCH v11 1/9] Add MigrationParams structure
[PATCH v11 5/9] Add uleb encoding/decoding functions
[PATCH v11 6/9] Add save_block_hdr function

These are the ones that we can discuss.

[PATCH v11 2/9] Add migration capabilites
[PATCH v11 3/9] Add XBZRLE documentation
[PATCH v11 4/9] Add cache handling functions
[PATCH v11 7/9] Add XBZRLE to ram_save_block and ram_save_live
[PATCH v11 8/9] Add set_cachesize command

Postcopy:  This is just refactoring that can be integrated.

[PATCH v2 01/41] arch_init: export sort_ram_list() and ram_save_block()
[PATCH v2 02/41] arch_init: export RAM_SAVE_xxx flags for postcopy
[PATCH v2 03/41] arch_init/ram_save: introduce constant for ram save version = 4
[PATCH v2 04/41] arch_init: refactor host_from_stream_offset()
[PATCH v2 05/41] arch_init/ram_save_live: factor out RAM_SAVE_FLAG_MEM_SIZE case
[PATCH v2 06/41] arch_init: refactor ram_save_block()
[PATCH v2 07/41] arch_init/ram_save_live: factor out ram_save_limit
[PATCH v2 08/41] arch_init/ram_load: refactor ram_load
[PATCH v2 09/41] arch_init: introduce helper function to find ram block with id string
[PATCH v2 10/41] arch_init: simplify a bit by ram_find_block()
[PATCH v2 11/41] arch_init: factor out counting transferred bytes
[PATCH v2 12/41] arch_init: factor out setting last_block, last_offset
[PATCH v2 13/41] exec.c: factor out qemu_get_ram_ptr()
[PATCH v2 14/41] exec.c: export last_ram_offset()
[PATCH v2 15/41] savevm: export qemu_peek_buffer, qemu_peek_byte, qemu_file_skip
[PATCH v2 16/41] savevm: qemu_pending_size() to return pending buffered size
[PATCH v2 17/41] savevm, buffered_file: introduce method to drain buffer of buffered file
[PATCH v2 18/41] QEMUFile: add qemu_file_fd() for later use
[PATCH v2 19/41] savevm/QEMUFile: drop qemu_stdio_fd
[PATCH v2 20/41] savevm/QEMUFileSocket: drop duplicated member fd
[PATCH v2 21/41] savevm: rename QEMUFileSocket to QEMUFileFD, socket_close to fd_close
[PATCH v2 22/41] savevm/QEMUFile: introduce qemu_fopen_fd
[PATCH v2 23/41] migration.c: remove redundant line in migrate_init()
[PATCH v2 24/41] migration: export migrate_fd_completed() and migrate_fd_cleanup()
[PATCH v2 25/41] migration: factor out parameters into MigrationParams
[PATCH v2 26/41] buffered_file: factor out buffer management logic
[PATCH v2 27/41] buffered_file: Introduce QEMUFileNonblock for nonblock write
[PATCH v2 28/41] buffered_file: add qemu_file to read/write to buffer in memory

This is postcopy proper.  For these, the things raised in the previous
review need to be addressed, and after that there will probably be (at
least) another review.  One thing to take into account is that the umem
(or whatever you want to call it) should be able to work over RDMA.  Can
anyone who knows about RDMA comment on this?

[PATCH v2 29/41] umem.h: import Linux umem.h
[PATCH v2 30/41] update-linux-headers.sh: teach umem.h to update-linux-headers.sh
[PATCH v2 31/41] configure: add CONFIG_POSTCOPY option
[PATCH v2 32/41] savevm: add new section that is used by postcopy
[PATCH v2 33/41] postcopy: introduce -postcopy and -postcopy-flags option
[PATCH v2 34/41] postcopy outgoing: add -p and -n option to migrate command
[PATCH v2 35/41] postcopy: introduce helper functions for postcopy
[PATCH v2 36/41] postcopy: implement incoming part of postcopy live migration
[PATCH v2 37/41] postcopy: implement outgoing part of postcopy live migration
[PATCH v2 38/41] postcopy/outgoing: add forward, backward option to specify the size of prefault
[PATCH v2 39/41] postcopy/outgoing: implement prefault
[PATCH v2 40/41] migrate: add -m (movebg) option to migrate command
[PATCH v2 41/41] migration/postcopy: add movebg mode


I'm not at all convinced that postcopy is a good idea.  There needs to be a 
clear expression of what the value proposition is, backed by benchmarks.  Those 
benchmarks need to include latency measurements of downtime, which so far I've 
not seen.


I don't want to take any postcopy patches until this discussion happens.

Regards,

Anthony Liguori



Huge memory migration.
These should be trivial, and should be integrated.

[PATCH 1/7] Add spent time for migration
[PATCH 2/7] Add tracepoints for savevm section start/end
[PATCH 3/7] No need to iterate if we already are over the limit
[PATCH 4/7] Only TCG needs TLB handling
[PATCH 5/7] Only calculate expected_time for stage 2

This one is also trivial, but Anthony on previous reviews wanted to have
migration-thread before we integrated this one.

[PATCH 6/7] Exit loop i