

EPT Accessed bit

2014-08-12 Thread Umesh Deshpande
Hi,

From the Intel processor manual I read that an accessed bit has been
introduced in EPT. The Red Hat 6 release notes mention that Extended
Page Table age bits enable a host to make smarter choices for
swapping memory under memory pressure. I couldn't find any related
patches on KVM or LKML. I was wondering whether using the EPT accessed bit
requires any support from KVM to allow the host kernel to determine
and swap out the inactive VM pages.
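
For reference, whether the CPU supports accessed/dirty bits for EPT can be
probed from a VMX capability MSR; the minimal, hypothetical host-side sketch
below is not from any posted patch, and only the MSR index and bit position
(taken from the Intel SDM) are meant to be authoritative.

/* Hypothetical sketch: check whether the CPU advertises accessed/dirty
 * flags for EPT.  Per the Intel SDM, bit 21 of IA32_VMX_EPT_VPID_CAP
 * (MSR 0x48c) indicates A/D-bit support. */
#include <linux/types.h>
#include <asm/msr.h>

#define MSR_IA32_VMX_EPT_VPID_CAP  0x0000048c
#define VMX_EPT_AD_BIT             (1ull << 21)

static bool ept_ad_bits_supported(void)
{
    u64 cap;

    if (rdmsrl_safe(MSR_IA32_VMX_EPT_VPID_CAP, &cap)) {
        return false;   /* MSR not readable, i.e. no VMX */
    }
    return cap & VMX_EPT_AD_BIT;
}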

Thanks,
Umesh


Re: [Qemu-devel] [PATCH 0/5] Separate thread for VM migration

2011-08-30 Thread Umesh Deshpande
2011-08-30 Thread Umesh Deshpande
On Mon, Aug 29, 2011 at 6:20 AM, Paolo Bonzini <pbonz...@redhat.com> wrote:

 On 08/27/2011 08:09 PM, Umesh Deshpande wrote:

 The following patch series deals with VCPU and iothread starvation during the
 migration of a guest. Currently the iothread is responsible for performing the
 guest migration. It holds qemu_mutex during the migration, which doesn't allow
 the VCPUs to enter qemu mode and delays their return to the guest. The guest
 migration, executed as an iohandler, also delays the execution of other
 iohandlers.

 In the following patch series, the migration has been moved to a separate
 thread to reduce the qemu_mutex contention and iohandler starvation.

 Umesh Deshpande (5):
   vm_stop from non-io threads
   MRU ram block list
   migration thread mutex
   separate migration bitmap
   separate migration thread

  arch_init.c         |   38 +
  buffered_file.c     |   76 ++---
  cpu-all.h           |   42 ++
  cpus.c              |    4 +-
  exec.c              |   97 ++--
  migration.c         |  117 ++---
  migration.h         |    9 
  qemu-common.h       |    2 +
  qemu-thread-posix.c |   10 
  qemu-thread.h       |    1 +
  10 files changed, 302 insertions(+), 94 deletions(-)


 I think this patchset is quite good.  These are the problems I found:

 1) the locking rules in patch 3 are a bit too clever, and the cleverness
 will become obsolete once RCU is in place.  The advantage of the clever
 stuff is that rwlock code looks more like RCU code; the disadvantage is that
 it is harder to read and easier to bikeshed about.

 2) it breaks the Windows build, but that's easy to fix.

 3) there are a _lot_ of cleanups possible on top of patch 5 (I would not be
 too surprised if the final version has an almost-neutral diffstat), and
 whether to prefer good or perfect is another very popular topic.

 4) I'm not sure block migration has been tested in all scenarios (I'm
 curious about coroutine-heavy ones).  This may mean that the migration
 thread is blocked on Marcelo's live streaming work, which would provide
 the ingredients to remove block migration altogether.  A round of Autotest
 testing is the minimum required to include this, and I'm not sure if this
 was done either.


 That said, I find the code to be quite good overall, and I wouldn't oppose
 inclusion with only (2) fixed---may even take care of it myself---and more
 testing results apart from the impressive performance numbers.

 About performance, I'm curious how you measured it.  Was the buffer cache
 empty?  That is, how many compressible pages were found?  I toyed with
 vectorizing is_dup_page, but I need to get some numbers before posting.


The above tests were run with an idle VM. I didn't measure the number of
compressible pages.
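
For context on the is_dup_page() question: the scalar implementation in
arch_init.c at the time was roughly the following (reproduced from memory,
so treat the exact shape as approximate); a page counts as a "dup" page when
every 32-bit word equals the replicated first byte, which is what a
vectorized variant would also have to compute.

/* Approximate shape of the scalar dup-page check under discussion. */
static int is_dup_page(uint8_t *page, uint8_t ch)
{
    uint32_t val = ch << 24 | ch << 16 | ch << 8 | ch;
    uint32_t *array = (uint32_t *)page;
    int i;

    for (i = 0; i < (TARGET_PAGE_SIZE / 4); i++) {
        if (array[i] != val) {
            return 0;
        }
    }

    return 1;
}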

- Umesh


Re: [PATCH 3/5] Migration thread mutex

2011-08-29 Thread Umesh Deshpande

On 08/29/2011 05:04 AM, Stefan Hajnoczi wrote:

On Sat, Aug 27, 2011 at 7:09 PM, Umesh Deshpande <udesh...@redhat.com> wrote:

This patch implements the migrate_ram mutex, which protects the RAMBlock list
traversal in the migration thread, during the transfer of RAM, from concurrent
block addition/removal by the iothread.

Note: The combination of the iothread mutex and the migration thread mutex
works as a rw-lock. Both mutexes are acquired while modifying the ram_list
members or the RAM block list.

Signed-off-by: Umesh Deshpande <udesh...@redhat.com>
---
  arch_init.c   |   21 +
  cpu-all.h |3 +++
  exec.c|   23 +++
  qemu-common.h |2 ++
  4 files changed, 49 insertions(+), 0 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 484b39d..9d02270 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -109,6 +109,7 @@ static int is_dup_page(uint8_t *page, uint8_t ch)

  static RAMBlock *last_block;
  static ram_addr_t last_offset;
+static uint64_t last_version;

[...]

  typedef struct RAMList {
+QemuMutex mutex;/* Protects RAM block list */
 uint8_t *phys_dirty;
+uint32_t version;   /* To detect ram block addition/removal */

Is there a reason why RAMList.version is uint32_t but last_version is uint64_t?

No, my bad, they both should be consistent.



Re: [PATCH 5/5] Separate migration thread

2011-08-29 Thread Umesh Deshpande

On 08/29/2011 05:09 AM, Stefan Hajnoczi wrote:

On Sat, Aug 27, 2011 at 7:09 PM, Umesh Deshpande <udesh...@redhat.com> wrote:

This patch creates a separate thread for the guest migration on the source side.
All exits (on completion/error) from the migration thread are handled by a
bottom-half handler, which is called from the iothread.

Signed-off-by: Umesh Deshpande <udesh...@redhat.com>
---
  buffered_file.c |   76 
  migration.c |  105 ++
  migration.h |8 
  qemu-thread-posix.c |   10 +
  qemu-thread.h   |1 +

Will this patch break Windows builds by adding a function to
qemu-thread-posix.c which is not implemented in qemu-thread-win32.c?

Yes, an equivalent function needs to be added to qemu-thread-win32.c.


[PATCH 5/5] Separate migration thread

2011-08-27 Thread Umesh Deshpande
This patch creates a separate thread for the guest migration on the source side.
All exits (on completion/error) from the migration thread are handled by a
bottom-half handler, which is called from the iothread.
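
A rough sketch of the exit-handling pattern described above, using QEMU's
bottom-half API (qemu_bh_new()/qemu_bh_schedule()); the function and field
names here are illustrative, not necessarily the ones used in the patch:

#include "qemu-common.h"
#include "migration.h"

/* Illustrative only: the migration thread signals completion or error by
 * scheduling a bottom half, and the iothread then runs the actual cleanup
 * under its usual locking rules. */
static QEMUBH *migrate_exit_bh;

static void migrate_exit_cb(void *opaque)
{
    FdMigrationState *s = opaque;

    /* Runs in the iothread, so it is safe to touch monitor and fd state. */
    migrate_fd_cleanup(s);
}

static void *migration_thread_example(void *opaque)
{
    /* ... transfer RAM until completion or error; 'opaque' would carry the
     * migration state in a real implementation ... */
    (void)opaque;

    qemu_bh_schedule(migrate_exit_bh);  /* hand exit handling to the iothread */
    return NULL;
}

/* At migration start, in the iothread:
 *     migrate_exit_bh = qemu_bh_new(migrate_exit_cb, s);
 */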

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 buffered_file.c |   76 
 migration.c |  105 ++
 migration.h |8 
 qemu-thread-posix.c |   10 +
 qemu-thread.h   |1 +
 5 files changed, 124 insertions(+), 76 deletions(-)
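
The qemu-thread-posix.c hunk itself is not reproduced in this archive. Purely
as an illustration (not the actual hunk), a join helper of the kind the
diffstat suggests would, on POSIX hosts, normally be a thin wrapper around
pthread_join(); this sketch assumes QemuThread wraps a pthread_t named
'thread' and that the file's usual error_exit() helper is available:

#include <pthread.h>
#include "qemu-thread.h"

/* Illustrative sketch only: wait for a QemuThread to finish, aborting on
 * pthread errors like the other wrappers in qemu-thread-posix.c do. */
void qemu_thread_join(QemuThread *thread)
{
    int err;

    err = pthread_join(thread->thread, NULL);
    if (err) {
        error_exit(err, __func__);
    }
}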

diff --git a/buffered_file.c b/buffered_file.c
index 41b42c3..c31852e 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -16,6 +16,8 @@
 #include "qemu-timer.h"
 #include "qemu-char.h"
 #include "buffered_file.h"
+#include "migration.h"
+#include "qemu-thread.h"
 
 //#define DEBUG_BUFFERED_FILE
 
@@ -28,13 +30,14 @@ typedef struct QEMUFileBuffered
 void *opaque;
 QEMUFile *file;
 int has_error;
+int closed;
 int freeze_output;
 size_t bytes_xfer;
 size_t xfer_limit;
 uint8_t *buffer;
 size_t buffer_size;
 size_t buffer_capacity;
-QEMUTimer *timer;
+QemuThread thread;
 } QEMUFileBuffered;
 
 #ifdef DEBUG_BUFFERED_FILE
@@ -155,14 +158,6 @@ static int buffered_put_buffer(void *opaque, const uint8_t 
*buf, int64_t pos, in
 offset = size;
 }
 
-    if (pos == 0 && size == 0) {
-        DPRINTF("file is ready\n");
-        if (s->bytes_xfer <= s->xfer_limit) {
-            DPRINTF("notifying client\n");
-            s->put_ready(s->opaque);
-        }
-    }
-
 return offset;
 }
 
@@ -173,22 +168,25 @@ static int buffered_close(void *opaque)
 
 DPRINTF(closing\n);
 
-    while (!s->has_error && s->buffer_size) {
-        buffered_flush(s);
-        if (s->freeze_output)
-            s->wait_for_unfreeze(s);
-    }
+s-closed = 1;
 
-ret = s-close(s-opaque);
+qemu_mutex_unlock_migrate_ram();
+qemu_mutex_unlock_iothread();
 
-qemu_del_timer(s-timer);
-qemu_free_timer(s-timer);
+qemu_thread_join(s-thread);
+/* Waits for the completion of the migration thread */
+
+qemu_mutex_lock_iothread();
+qemu_mutex_lock_migrate_ram();
+
+ret = s-close(s-opaque);
 qemu_free(s-buffer);
 qemu_free(s);
 
 return ret;
 }
 
+
 static int buffered_rate_limit(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
@@ -228,26 +226,37 @@ static int64_t buffered_get_rate_limit(void *opaque)
 return s-xfer_limit;
 }
 
-static void buffered_rate_tick(void *opaque)
+static void *migrate_vm(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
+int64_t current_time, expire_time = qemu_get_clock_ms(rt_clock) + 100;
+struct timeval tv = { .tv_sec = 0, .tv_usec = 10};
 
-if (s-has_error) {
-buffered_close(s);
-return;
-}
-
-qemu_mod_timer(s-timer, qemu_get_clock_ms(rt_clock) + 100);
+while (!s-has_error  (!s-closed || s-buffer_size)) {
+if (s-freeze_output) {
+s-wait_for_unfreeze(s);
+s-freeze_output = 0;
+continue;
+}
 
-if (s-freeze_output)
-return;
+current_time = qemu_get_clock_ms(rt_clock);
+if (!s-closed  (expire_time  current_time)) {
+tv.tv_usec = 1000 * (expire_time - current_time);
+select(0, NULL, NULL, NULL, tv);
+continue;
+}
 
-s-bytes_xfer = 0;
+s-bytes_xfer = 0;
 
-buffered_flush(s);
+expire_time = qemu_get_clock_ms(rt_clock) + 100;
+if (!s-closed) {
+s-put_ready(s-opaque);
+} else {
+buffered_flush(s);
+}
+}
 
-/* Add some checks around this */
-s-put_ready(s-opaque);
+return NULL;
 }
 
 QEMUFile *qemu_fopen_ops_buffered(void *opaque,
@@ -267,15 +276,14 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 s-put_ready = put_ready;
 s-wait_for_unfreeze = wait_for_unfreeze;
 s-close = close;
+s-closed = 0;
 
 s-file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
  buffered_close, buffered_rate_limit,
  buffered_set_rate_limit,
-buffered_get_rate_limit);
-
-s-timer = qemu_new_timer_ms(rt_clock, buffered_rate_tick, s);
+ buffered_get_rate_limit);
 
-qemu_mod_timer(s-timer, qemu_get_clock_ms(rt_clock) + 100);
+qemu_thread_create(s-thread, migrate_vm, s);
 
 return s-file;
 }
diff --git a/migration.c b/migration.c
index af3a1f2..5df186d 100644
--- a/migration.c
+++ b/migration.c
@@ -149,10 +149,12 @@ int do_migrate_set_speed(Monitor *mon, const QDict 
*qdict, QObject **ret_data)
 }
 max_throttle = d;
 
+qemu_mutex_lock_migrate_ram();
 s = migrate_to_fms(current_migration);
 if (s  s-file) {
 qemu_file_set_rate_limit(s-file, max_throttle);
 }
+qemu_mutex_unlock_migrate_ram();
 
 return 0;
 }
@@ -284,13 +286,13 @@ int migrate_fd_cleanup(FdMigrationState *s

[PATCH 2/5] MRU ram block list

2011-08-27 Thread Umesh Deshpande
This patch creates a new list of RAM blocks in MRU order, so that separate
locking rules can be applied to the regular RAM block list and the MRU list.
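
As a quick illustration of what the MRU order buys (a sketch only; the real
change lives in qemu_get_ram_ptr() below, and the list/field names follow
this patch): lookups of recently used addresses hit the head of blocks_mru,
while the canonical blocks list keeps its own, stricter locking rules.

/* Sketch: move-to-front lookup over the MRU list, mirroring the
 * qemu_get_ram_ptr() hunk in this patch. */
static RAMBlock *ram_block_lookup_mru(ram_addr_t addr)
{
    RAMBlock *block;

    QLIST_FOREACH(block, &ram_list.blocks_mru, next_mru) {
        if (addr - block->offset < block->length) {
            if (block != QLIST_FIRST(&ram_list.blocks_mru)) {
                QLIST_REMOVE(block, next_mru);
                QLIST_INSERT_HEAD(&ram_list.blocks_mru, block, next_mru);
            }
            return block;
        }
    }
    return NULL;
}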

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpu-all.h |2 ++
 exec.c|   17 -
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index e839100..6b217a2 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -925,6 +925,7 @@ typedef struct RAMBlock {
 uint32_t flags;
 char idstr[256];
 QLIST_ENTRY(RAMBlock) next;
+QLIST_ENTRY(RAMBlock) next_mru;
 #if defined(__linux__)  !defined(TARGET_S390X)
 int fd;
 #endif
@@ -933,6 +934,7 @@ typedef struct RAMBlock {
 typedef struct RAMList {
 uint8_t *phys_dirty;
 QLIST_HEAD(ram, RAMBlock) blocks;
+QLIST_HEAD(, RAMBlock) blocks_mru;
 } RAMList;
 extern RAMList ram_list;
 
diff --git a/exec.c b/exec.c
index 0e2ce57..c5c247c 100644
--- a/exec.c
+++ b/exec.c
@@ -113,7 +113,11 @@ static uint8_t *code_gen_ptr;
 int phys_ram_fd;
 static int in_migration;
 
-RAMList ram_list = { .blocks = QLIST_HEAD_INITIALIZER(ram_list) };
+RAMList ram_list = {
+.blocks = QLIST_HEAD_INITIALIZER(ram_list),
+.blocks_mru = QLIST_HEAD_INITIALIZER(ram_list.blocks_mru)
+};
+
 #endif
 
 CPUState *first_cpu;
@@ -2973,6 +2977,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
const char *name,
 new_block-length = size;
 
 QLIST_INSERT_HEAD(ram_list.blocks, new_block, next);
+QLIST_INSERT_HEAD(ram_list.blocks_mru, new_block, next_mru);
 
 ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty,
last_ram_offset()  TARGET_PAGE_BITS);
@@ -2997,6 +3002,7 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
 QLIST_FOREACH(block, ram_list.blocks, next) {
 if (addr == block-offset) {
 QLIST_REMOVE(block, next);
+QLIST_REMOVE(block, next_mru);
 qemu_free(block);
 return;
 }
@@ -3010,6 +3016,7 @@ void qemu_ram_free(ram_addr_t addr)
 QLIST_FOREACH(block, ram_list.blocks, next) {
 if (addr == block-offset) {
 QLIST_REMOVE(block, next);
+QLIST_REMOVE(block, next_mru);
 if (block-flags  RAM_PREALLOC_MASK) {
 ;
 } else if (mem_path) {
@@ -3113,12 +3120,12 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
 {
 RAMBlock *block;
 
-    QLIST_FOREACH(block, &ram_list.blocks, next) {
+    QLIST_FOREACH(block, &ram_list.blocks_mru, next_mru) {
         if (addr - block->offset < block->length) {
             /* Move this entry to to start of the list.  */
             if (block != QLIST_FIRST(&ram_list.blocks)) {
-                QLIST_REMOVE(block, next);
-                QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
+                QLIST_REMOVE(block, next_mru);
+                QLIST_INSERT_HEAD(&ram_list.blocks_mru, block, next_mru);
             }
 if (xen_mapcache_enabled()) {
 /* We need to check if the requested address is in the RAM
@@ -3211,7 +3218,7 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t 
*ram_addr)
 return 0;
 }
 
-QLIST_FOREACH(block, ram_list.blocks, next) {
+QLIST_FOREACH(block, ram_list.blocks_mru, next_mru) {
 /* This case append when the block is not mapped. */
 if (block-host == NULL) {
 continue;
-- 
1.7.4.1



[PATCH 3/5] Migration thread mutex

2011-08-27 Thread Umesh Deshpande
This patch implements the migrate_ram mutex, which protects the RAMBlock list
traversal in the migration thread, during the transfer of RAM, from concurrent
block addition/removal by the iothread.

Note: The combination of the iothread mutex and the migration thread mutex
works as a rw-lock. Both mutexes are acquired while modifying the ram_list
members or the RAM block list.
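
A short sketch of the rule stated in the note, using the wrappers this patch
adds (the surrounding function bodies are illustrative only): readers hold
either the iothread mutex or the migrate_ram mutex, writers hold both, which
is what makes the pair behave like a rw-lock.

/* Writer side (iothread), e.g. RAM block hot-add: the iothread mutex is
 * already held on this path, and the migrate_ram mutex is taken as well,
 * so both reader classes are excluded while the list changes. */
static void ram_list_writer_example(void)
{
    qemu_mutex_lock_migrate_ram();
    /* ... QLIST_INSERT_HEAD()/QLIST_REMOVE() on ram_list.blocks ... */
    ram_list.version++;
    qemu_mutex_unlock_migrate_ram();
}

/* Reader side (migration thread), as in ram_save_live(): take the
 * migrate_ram mutex, drop the iothread mutex while walking the list, then
 * restore the original state; the iothread mutex is always re-acquired
 * outside the migrate_ram critical section. */
static void ram_list_reader_example(void)
{
    qemu_mutex_lock_migrate_ram();
    qemu_mutex_unlock_iothread();
    /* ... QLIST_FOREACH() over ram_list.blocks, send dirty pages ... */
    qemu_mutex_unlock_migrate_ram();
    qemu_mutex_lock_iothread();
}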

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c   |   21 +
 cpu-all.h |3 +++
 exec.c|   23 +++
 qemu-common.h |2 ++
 4 files changed, 49 insertions(+), 0 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 484b39d..9d02270 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -109,6 +109,7 @@ static int is_dup_page(uint8_t *page, uint8_t ch)
 
 static RAMBlock *last_block;
 static ram_addr_t last_offset;
+static uint64_t last_version;
 
 static int ram_save_block(QEMUFile *f)
 {
@@ -170,6 +171,7 @@ static int ram_save_block(QEMUFile *f)
 
 last_block = block;
 last_offset = offset;
+last_version = ram_list.version;
 
 return bytes_sent;
 }
@@ -270,6 +272,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 bytes_transferred = 0;
 last_block = NULL;
 last_offset = 0;
+last_version = ram_list.version = 0;
 sort_ram_list();
 
 /* Make sure all dirty bits are set */
@@ -298,6 +301,17 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 bytes_transferred_last = bytes_transferred;
 bwidth = qemu_get_clock_ns(rt_clock);
 
+if (stage != 3) {
+qemu_mutex_lock_migrate_ram();
+qemu_mutex_unlock_iothread();
+}
+
+if (ram_list.version != last_version) {
+/* RAM block added or removed */
+last_block = NULL;
+last_offset = 0;
+}
+
 while (!qemu_file_rate_limit(f)) {
 int bytes_sent;
 
@@ -308,6 +322,13 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 }
 }
 
+    if (stage != 3) {
+        qemu_mutex_unlock_migrate_ram();
+        qemu_mutex_lock_iothread();
+        /* Lock ordering : iothread mutex is always acquired outside
+         * migrate_ram mutex critical section to avoid deadlock */
+    }
+
 bwidth = qemu_get_clock_ns(rt_clock) - bwidth;
 bwidth = (bytes_transferred - bytes_transferred_last) / bwidth;
 
diff --git a/cpu-all.h b/cpu-all.h
index 6b217a2..b85483f 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -21,6 +21,7 @@
 
 #include qemu-common.h
 #include cpu-common.h
+#include qemu-thread.h
 
 /* some important defines:
  *
@@ -932,7 +933,9 @@ typedef struct RAMBlock {
 } RAMBlock;
 
 typedef struct RAMList {
+QemuMutex mutex;/* Protects RAM block list */
 uint8_t *phys_dirty;
+uint32_t version;   /* To detect ram block addition/removal */
 QLIST_HEAD(ram, RAMBlock) blocks;
 QLIST_HEAD(, RAMBlock) blocks_mru;
 } RAMList;
diff --git a/exec.c b/exec.c
index c5c247c..7627483 100644
--- a/exec.c
+++ b/exec.c
@@ -582,6 +582,7 @@ void cpu_exec_init_all(unsigned long tb_size)
 code_gen_alloc(tb_size);
 code_gen_ptr = code_gen_buffer;
 page_init();
+qemu_mutex_init(ram_list.mutex);
 #if !defined(CONFIG_USER_ONLY)
 io_mem_init();
 #endif
@@ -2802,6 +2803,16 @@ static long gethugepagesize(const char *path)
 return fs.f_bsize;
 }
 
+void qemu_mutex_lock_migrate_ram(void)
+{
+qemu_mutex_lock(ram_list.mutex);
+}
+
+void qemu_mutex_unlock_migrate_ram(void)
+{
+qemu_mutex_unlock(ram_list.mutex);
+}
+
 static void *file_ram_alloc(RAMBlock *block,
 ram_addr_t memory,
 const char *path)
@@ -2976,14 +2987,20 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
const char *name,
 }
 new_block-length = size;
 
+qemu_mutex_lock_migrate_ram();
+
 QLIST_INSERT_HEAD(ram_list.blocks, new_block, next);
 QLIST_INSERT_HEAD(ram_list.blocks_mru, new_block, next_mru);
 
+ram_list.version++;
+
 ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty,
last_ram_offset()  TARGET_PAGE_BITS);
 memset(ram_list.phys_dirty + (new_block-offset  TARGET_PAGE_BITS),
0xff, size  TARGET_PAGE_BITS);
 
+qemu_mutex_unlock_migrate_ram();
+
 if (kvm_enabled())
 kvm_setup_guest_memory(new_block-host, size);
 
@@ -3001,8 +3018,11 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
 
 QLIST_FOREACH(block, ram_list.blocks, next) {
 if (addr == block-offset) {
+qemu_mutex_lock_migrate_ram();
 QLIST_REMOVE(block, next);
 QLIST_REMOVE(block, next_mru);
+ram_list.version++;
+qemu_mutex_unlock_migrate_ram();
 qemu_free(block);
 return;
 }
@@ -3015,8 +3035,11 @@ void qemu_ram_free(ram_addr_t addr)
 
 QLIST_FOREACH(block, ram_list.blocks, next) {
 if (addr == block-offset

[PATCH 4/5] Separate migration dirty bitmap

2011-08-27 Thread Umesh Deshpande
This patch creates a migration bitmap, which is periodically kept in sync with
the qemu bitmap. A separate copy of the dirty bitmap for the migration avoids
concurrent access to the qemu bitmap from the iothread and the migration thread.
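
The exec.c hunk with the actual synchronization code is cut off at the end of
this message, so the following is only a guess at the general shape of
sync_migration_bitmap() (the start/end arguments are ignored here for
simplicity): fold the MIGRATION_DIRTY_FLAG bits accumulated in the qemu
bitmap into the migration-private bitmap, so that the migration thread never
reads phys_dirty directly.

/* Rough sketch only; the real implementation is not fully visible in this
 * archive.  Walk every RAM block and transfer the MIGRATION_DIRTY_FLAG bits
 * from the qemu dirty bitmap into the migration bitmap. */
void sync_migration_bitmap(ram_addr_t start, ram_addr_t end)
{
    RAMBlock *block;
    ram_addr_t addr;

    QLIST_FOREACH(block, &ram_list.blocks, next) {
        for (addr = block->offset; addr < block->offset + block->length;
             addr += TARGET_PAGE_SIZE) {
            if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
                migration_bitmap_set_dirty(addr);
                cpu_physical_memory_reset_dirty(addr,
                                                addr + TARGET_PAGE_SIZE,
                                                MIGRATION_DIRTY_FLAG);
            }
        }
    }
}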

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c |   17 -
 cpu-all.h   |   37 +
 exec.c  |   57 +
 3 files changed, 102 insertions(+), 9 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 9d02270..b5b852b 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -124,13 +124,13 @@ static int ram_save_block(QEMUFile *f)
 current_addr = block-offset + offset;
 
 do {
-if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) 
{
+if (migration_bitmap_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
 uint8_t *p;
 int cont = (block == last_block) ? RAM_SAVE_FLAG_CONTINUE : 0;
 
-cpu_physical_memory_reset_dirty(current_addr,
-current_addr + TARGET_PAGE_SIZE,
-MIGRATION_DIRTY_FLAG);
+migration_bitmap_reset_dirty(current_addr,
+ current_addr + TARGET_PAGE_SIZE,
+ MIGRATION_DIRTY_FLAG);
 
 p = block-host + offset;
 
@@ -187,7 +187,7 @@ static ram_addr_t ram_save_remaining(void)
 ram_addr_t addr;
 for (addr = block-offset; addr  block-offset + block-length;
  addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
+if (migration_bitmap_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
 count++;
 }
 }
@@ -267,6 +267,8 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 return 0;
 }
 
+sync_migration_bitmap(0, TARGET_PHYS_ADDR_MAX);
+
 if (stage == 1) {
 RAMBlock *block;
 bytes_transferred = 0;
@@ -279,10 +281,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 QLIST_FOREACH(block, ram_list.blocks, next) {
 for (addr = block-offset; addr  block-offset + block-length;
  addr += TARGET_PAGE_SIZE) {
-if (!cpu_physical_memory_get_dirty(addr,
-   MIGRATION_DIRTY_FLAG)) {
-cpu_physical_memory_set_dirty(addr);
-}
+migration_bitmap_set_dirty(addr);
 }
 }
 
diff --git a/cpu-all.h b/cpu-all.h
index b85483f..8181f8b 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -935,6 +935,7 @@ typedef struct RAMBlock {
 typedef struct RAMList {
 QemuMutex mutex;/* Protects RAM block list */
 uint8_t *phys_dirty;
+uint8_t *migration_bitmap; /* Dedicated bitmap for migration thread */
 uint32_t version;   /* To detect ram block addition/removal */
 QLIST_HEAD(ram, RAMBlock) blocks;
 QLIST_HEAD(, RAMBlock) blocks_mru;
@@ -1009,8 +1010,44 @@ static inline void 
cpu_physical_memory_mask_dirty_range(ram_addr_t start,
 }
 }
 
+
+
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
  int dirty_flags);
+
+static inline int migration_bitmap_get_dirty(ram_addr_t addr,
+                                             int dirty_flags)
+{
+    return ram_list.migration_bitmap[addr >> TARGET_PAGE_BITS] & dirty_flags;
+}
+
+static inline void migration_bitmap_set_dirty(ram_addr_t addr)
+{
+    ram_list.migration_bitmap[addr >> TARGET_PAGE_BITS] = 0xff;
+}
+
+static inline void migration_bitmap_mask_dirty_range(ram_addr_t start,
+                                                     int length,
+                                                     int dirty_flags)
+{
+    int i, mask, len;
+    uint8_t *p;
+
+    len = length >> TARGET_PAGE_BITS;
+    mask = ~dirty_flags;
+    p = ram_list.migration_bitmap + (start >> TARGET_PAGE_BITS);
+    for (i = 0; i < len; i++) {
+        p[i] &= mask;
+    }
+}
+
+
+void migration_bitmap_reset_dirty(ram_addr_t start,
+  ram_addr_t end,
+  int dirty_flags);
+
+void sync_migration_bitmap(ram_addr_t start, ram_addr_t end);
+
 void cpu_tlb_update_dirty(CPUState *env);
 
 int cpu_physical_memory_set_dirty_tracking(int enable);
diff --git a/exec.c b/exec.c
index 7627483..8dfbdbc 100644
--- a/exec.c
+++ b/exec.c
@@ -2111,6 +2111,10 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, 
ram_addr_t end,
 abort();
 }
 
+if (kvm_enabled()) {
+return;
+}
+
 for(env = first_cpu; env != NULL; env = env-next_cpu) {
 int mmu_idx;
 for (mmu_idx = 0; mmu_idx  NB_MMU_MODES; mmu_idx++) {
@@ -2119,8 +2123,54 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start

[PATCH 1/5] Support for vm_stop from the migration thread

2011-08-27 Thread Umesh Deshpande
Currently, when any thread other than the iothread calls vm_stop, it is scheduled
to be executed later by the iothread. This patch allows the execution of vm_stop
from threads other than the iothread. This is especially helpful when the
migration is moved into a separate thread.

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 cpus.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/cpus.c b/cpus.c
index de70e02..f35f683 100644
--- a/cpus.c
+++ b/cpus.c
@@ -122,8 +122,8 @@ static void do_vm_stop(int reason)
 {
 if (vm_running) {
 cpu_disable_ticks();
-vm_running = 0;
 pause_all_vcpus();
+vm_running = 0;
 vm_state_notify(0, reason);
 qemu_aio_flush();
 bdrv_flush_all();
@@ -1027,7 +1027,7 @@ void cpu_stop_current(void)
 
 void vm_stop(int reason)
 {
-if (!qemu_thread_is_self(io_thread)) {
+if (cpu_single_env) {
 qemu_system_vmstop_request(reason);
 /*
  * FIXME: should not return to device code in case
-- 
1.7.4.1



[PATCH 0/5] Separate thread for VM migration

2011-08-27 Thread Umesh Deshpande
The following patch series deals with VCPU and iothread starvation during the
migration of a guest. Currently the iothread is responsible for performing the
guest migration. It holds qemu_mutex during the migration, which doesn't allow
the VCPUs to enter qemu mode and delays their return to the guest. The guest
migration, executed as an iohandler, also delays the execution of other
iohandlers.

In the following patch series, the migration has been moved to a separate
thread to reduce the qemu_mutex contention and iohandler starvation.

Umesh Deshpande (5):
  vm_stop from non-io threads
  MRU ram block list
  migration thread mutex
  separate migration bitmap
  separate migration thread

 arch_init.c |   38 +
 buffered_file.c |   76 ++---
 cpu-all.h   |   42 ++
 cpus.c  |4 +-
 exec.c  |   97 --
 migration.c |  117 ---
 migration.h |9 
 qemu-common.h   |2 +
 qemu-thread-posix.c |   10 
 qemu-thread.h   |1 +
 10 files changed, 302 insertions(+), 94 deletions(-)

-- 
1.7.4.1



Re: [Qemu-devel] [RFC PATCH v5 0/4] Separate thread for VM migration

2011-08-25 Thread Umesh Deshpande

Jitterd Test
I ran jitterd in a migrating VM of size 8GB with and w/o the patch series.
./jitterd -f -m 1 -p 100 -r 40
That is to report the jitter of greater than 400ms during the interval of
40 seconds.


Jitter in ms. with the migration thread.
Run    Total (Peak)
1      No chatter
2      No chatter
3      No chatter
4      409 (360)

Jitter in ms. without the migration thread.
Run    Total (Peak)
1      4663 (2413)
2      643 (423)
3      1973 (1817)
4      3908 (3772)

--
Flood ping test : ping to the migrating VM from a third machine (data over
3 runs)

Latency (ms) ping to a non-migrating VM: Avg 0.156, Max: 0.96
Latency (ms) with migration thread:      Avg 0.215, Max: 280
Latency (ms) without migration thread:   Avg 6.47,  Max: 4562

- Umesh


On 08/24/2011 01:19 PM, Anthony Liguori wrote:

On 08/23/2011 10:12 PM, Umesh Deshpande wrote:
The following patch series deals with VCPU and iothread starvation during the
migration of a guest. Currently the iothread is responsible for performing
the guest migration. It holds qemu_mutex during the migration, which doesn't
allow the VCPUs to enter qemu mode and delays their return to the guest. The
guest migration, executed as an iohandler, also delays the execution of other
iohandlers. In the following patch series,


Can you please include detailed performance data with and without this
series?


Perhaps runs of migration with jitterd running in the guest.

Regards,

Anthony Liguori



The migration has been moved to a separate thread to
reduce the qemu_mutex contention and iohandler starvation.

Umesh Deshpande (4):
   MRU ram block list
   migration thread mutex
   separate migration bitmap
   separate migration thread

  arch_init.c         |   38 
  buffered_file.c     |   75 +--
  cpu-all.h           |   42 +
  exec.c              |   97 ++--
  migration.c         |  122 +-
  migration.h         |    9 
  qemu-common.h       |    2 +
  qemu-thread-posix.c |   10 
  qemu-thread.h       |    1 +
  savevm.c            |    5 --
  10 files changed, 297 insertions(+), 104 deletions(-)





[RFC PATCH v5 0/4] Separate thread for VM migration

2011-08-23 Thread Umesh Deshpande
The following patch series deals with VCPU and iothread starvation during the
migration of a guest. Currently the iothread is responsible for performing the
guest migration. It holds qemu_mutex during the migration, which doesn't allow
the VCPUs to enter qemu mode and delays their return to the guest. The guest
migration, executed as an iohandler, also delays the execution of other
iohandlers.

In the following patch series, the migration has been moved to a separate
thread to reduce the qemu_mutex contention and iohandler starvation.

Umesh Deshpande (4):
  MRU ram block list
  migration thread mutex
  separate migration bitmap
  separate migration thread

 arch_init.c |   38 
 buffered_file.c |   75 +--
 cpu-all.h   |   42 +
 exec.c  |   97 ++--
 migration.c |  122 +-
 migration.h |9 
 qemu-common.h   |2 +
 qemu-thread-posix.c |   10 
 qemu-thread.h   |1 +
 savevm.c|5 --
 10 files changed, 297 insertions(+), 104 deletions(-)

-- 
1.7.4.1



[RFC PATCH v5 1/4] MRU ram list

2011-08-23 Thread Umesh Deshpande
This patch creates a new list of RAM blocks in MRU order, so that separate
locking rules can be applied to the regular RAM block list and the MRU list.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpu-all.h |2 ++
 exec.c|   17 -
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index e839100..6b217a2 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -925,6 +925,7 @@ typedef struct RAMBlock {
 uint32_t flags;
 char idstr[256];
 QLIST_ENTRY(RAMBlock) next;
+QLIST_ENTRY(RAMBlock) next_mru;
 #if defined(__linux__)  !defined(TARGET_S390X)
 int fd;
 #endif
@@ -933,6 +934,7 @@ typedef struct RAMBlock {
 typedef struct RAMList {
 uint8_t *phys_dirty;
 QLIST_HEAD(ram, RAMBlock) blocks;
+QLIST_HEAD(, RAMBlock) blocks_mru;
 } RAMList;
 extern RAMList ram_list;
 
diff --git a/exec.c b/exec.c
index 0e2ce57..c5c247c 100644
--- a/exec.c
+++ b/exec.c
@@ -113,7 +113,11 @@ static uint8_t *code_gen_ptr;
 int phys_ram_fd;
 static int in_migration;
 
-RAMList ram_list = { .blocks = QLIST_HEAD_INITIALIZER(ram_list) };
+RAMList ram_list = {
+.blocks = QLIST_HEAD_INITIALIZER(ram_list),
+.blocks_mru = QLIST_HEAD_INITIALIZER(ram_list.blocks_mru)
+};
+
 #endif
 
 CPUState *first_cpu;
@@ -2973,6 +2977,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
const char *name,
 new_block-length = size;
 
 QLIST_INSERT_HEAD(ram_list.blocks, new_block, next);
+QLIST_INSERT_HEAD(ram_list.blocks_mru, new_block, next_mru);
 
 ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty,
last_ram_offset()  TARGET_PAGE_BITS);
@@ -2997,6 +3002,7 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
 QLIST_FOREACH(block, ram_list.blocks, next) {
 if (addr == block-offset) {
 QLIST_REMOVE(block, next);
+QLIST_REMOVE(block, next_mru);
 qemu_free(block);
 return;
 }
@@ -3010,6 +3016,7 @@ void qemu_ram_free(ram_addr_t addr)
 QLIST_FOREACH(block, ram_list.blocks, next) {
 if (addr == block-offset) {
 QLIST_REMOVE(block, next);
+QLIST_REMOVE(block, next_mru);
 if (block-flags  RAM_PREALLOC_MASK) {
 ;
 } else if (mem_path) {
@@ -3113,12 +3120,12 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
 {
 RAMBlock *block;
 
-QLIST_FOREACH(block, ram_list.blocks, next) {
+QLIST_FOREACH(block, ram_list.blocks_mru, next_mru) {
 if (addr - block-offset  block-length) {
 /* Move this entry to to start of the list.  */
 if (block != QLIST_FIRST(ram_list.blocks)) {
-QLIST_REMOVE(block, next);
-QLIST_INSERT_HEAD(ram_list.blocks, block, next);
+QLIST_REMOVE(block, next_mru);
+QLIST_INSERT_HEAD(ram_list.blocks_mru, block, next_mru);
 }
 if (xen_mapcache_enabled()) {
 /* We need to check if the requested address is in the RAM
@@ -3211,7 +3218,7 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t 
*ram_addr)
 return 0;
 }
 
-QLIST_FOREACH(block, ram_list.blocks, next) {
+QLIST_FOREACH(block, ram_list.blocks_mru, next_mru) {
 /* This case append when the block is not mapped. */
 if (block-host == NULL) {
 continue;
-- 
1.7.4.1



[RFC PATCH v5 3/4] Separate migration bitmap

2011-08-23 Thread Umesh Deshpande
This patch creates a migration bitmap, which is periodically kept in sync with
the qemu bitmap. A separate copy of the dirty bitmap for migration avoids
concurrent access to the qemu bitmap from the iothread and the migration thread.

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c |   17 -
 cpu-all.h   |   37 +
 exec.c  |   57 +
 3 files changed, 102 insertions(+), 9 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 9d02270..b5b852b 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -124,13 +124,13 @@ static int ram_save_block(QEMUFile *f)
 current_addr = block-offset + offset;
 
 do {
-if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) 
{
+if (migration_bitmap_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
 uint8_t *p;
 int cont = (block == last_block) ? RAM_SAVE_FLAG_CONTINUE : 0;
 
-cpu_physical_memory_reset_dirty(current_addr,
-current_addr + TARGET_PAGE_SIZE,
-MIGRATION_DIRTY_FLAG);
+migration_bitmap_reset_dirty(current_addr,
+ current_addr + TARGET_PAGE_SIZE,
+ MIGRATION_DIRTY_FLAG);
 
 p = block-host + offset;
 
@@ -187,7 +187,7 @@ static ram_addr_t ram_save_remaining(void)
 ram_addr_t addr;
 for (addr = block-offset; addr  block-offset + block-length;
  addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
+if (migration_bitmap_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
 count++;
 }
 }
@@ -267,6 +267,8 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 return 0;
 }
 
+sync_migration_bitmap(0, TARGET_PHYS_ADDR_MAX);
+
 if (stage == 1) {
 RAMBlock *block;
 bytes_transferred = 0;
@@ -279,10 +281,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 QLIST_FOREACH(block, &ram_list.blocks, next) {
 for (addr = block->offset; addr < block->offset + block->length;
  addr += TARGET_PAGE_SIZE) {
-if (!cpu_physical_memory_get_dirty(addr,
-   MIGRATION_DIRTY_FLAG)) {
-cpu_physical_memory_set_dirty(addr);
-}
+migration_bitmap_set_dirty(addr);
 }
 }
 
diff --git a/cpu-all.h b/cpu-all.h
index b85483f..8181f8b 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -935,6 +935,7 @@ typedef struct RAMBlock {
 typedef struct RAMList {
 QemuMutex mutex;/* Protects RAM block list */
 uint8_t *phys_dirty;
+uint8_t *migration_bitmap; /* Dedicated bitmap for migration thread */
 uint32_t version;   /* To detect ram block addition/removal */
 QLIST_HEAD(ram, RAMBlock) blocks;
 QLIST_HEAD(, RAMBlock) blocks_mru;
@@ -1009,8 +1010,44 @@ static inline void 
cpu_physical_memory_mask_dirty_range(ram_addr_t start,
 }
 }
 
+
+
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
  int dirty_flags);
+
+static inline int migration_bitmap_get_dirty(ram_addr_t addr,
+ int dirty_flags)
+{
+return ram_list.migration_bitmap[addr >> TARGET_PAGE_BITS] & dirty_flags;
+}
+
+static inline void migration_bitmap_set_dirty(ram_addr_t addr)
+{
+ram_list.migration_bitmap[addr >> TARGET_PAGE_BITS] = 0xff;
+}
+
+static inline void migration_bitmap_mask_dirty_range(ram_addr_t start,
+ int length,
+ int dirty_flags)
+{
+int i, mask, len;
+uint8_t *p;
+
+len = length >> TARGET_PAGE_BITS;
+mask = ~dirty_flags;
+p = ram_list.migration_bitmap + (start >> TARGET_PAGE_BITS);
+for (i = 0; i < len; i++) {
+p[i] &= mask;
+}
+}
+
+
+void migration_bitmap_reset_dirty(ram_addr_t start,
+  ram_addr_t end,
+  int dirty_flags);
+
+void sync_migration_bitmap(ram_addr_t start, ram_addr_t end);
+
 void cpu_tlb_update_dirty(CPUState *env);
 
 int cpu_physical_memory_set_dirty_tracking(int enable);
diff --git a/exec.c b/exec.c
index 7627483..8dfbdbc 100644
--- a/exec.c
+++ b/exec.c
@@ -2111,6 +2111,10 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, 
ram_addr_t end,
 abort();
 }
 
+if (kvm_enabled()) {
+return;
+}
+
 for(env = first_cpu; env != NULL; env = env->next_cpu) {
 int mmu_idx;
 for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
@@ -2119,8 +2123,54 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, 
ram_addr_t end

[RFC PATCH v5 2/4] Migration thread mutex

2011-08-23 Thread Umesh Deshpande
The ramlist mutex is implemented to protect the RAMBlock list traversal in the
migration thread against block addition/removal from the iothread.

Note: The combination of the iothread mutex and the migration thread mutex works
as a rw-lock. Both mutexes are acquired while modifying the ram_list members or
the RAM block list.
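
As a rough stand-alone illustration of that rw-lock-like behaviour (plain C with
pthreads, not QEMU code): each reader holds one of the two mutexes, and a writer
takes both in a fixed order, so it excludes readers on either side.

#include <pthread.h>

static pthread_mutex_t io_lock  = PTHREAD_MUTEX_INITIALIZER; /* "iothread" mutex  */
static pthread_mutex_t mig_lock = PTHREAD_MUTEX_INITIALIZER; /* "migration" mutex */

/* Reader on the iothread side: it already runs under io_lock. */
void iothread_walk(void (*walk)(void))
{
    pthread_mutex_lock(&io_lock);
    walk();
    pthread_mutex_unlock(&io_lock);
}

/* Reader on the migration side: it only needs mig_lock. */
void migration_walk(void (*walk)(void))
{
    pthread_mutex_lock(&mig_lock);
    walk();
    pthread_mutex_unlock(&mig_lock);
}

/* Writer (block added or removed): take both locks, outer lock first. */
void modify_list(void (*modify)(void))
{
    pthread_mutex_lock(&io_lock);
    pthread_mutex_lock(&mig_lock);
    modify();
    pthread_mutex_unlock(&mig_lock);
    pthread_mutex_unlock(&io_lock);
}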

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c   |   21 +
 cpu-all.h |3 +++
 exec.c|   23 +++
 qemu-common.h |2 ++
 4 files changed, 49 insertions(+), 0 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 484b39d..9d02270 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -109,6 +109,7 @@ static int is_dup_page(uint8_t *page, uint8_t ch)
 
 static RAMBlock *last_block;
 static ram_addr_t last_offset;
+static uint64_t last_version;
 
 static int ram_save_block(QEMUFile *f)
 {
@@ -170,6 +171,7 @@ static int ram_save_block(QEMUFile *f)
 
 last_block = block;
 last_offset = offset;
+last_version = ram_list.version;
 
 return bytes_sent;
 }
@@ -270,6 +272,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 bytes_transferred = 0;
 last_block = NULL;
 last_offset = 0;
+last_version = ram_list.version = 0;
 sort_ram_list();
 
 /* Make sure all dirty bits are set */
@@ -298,6 +301,17 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 bytes_transferred_last = bytes_transferred;
 bwidth = qemu_get_clock_ns(rt_clock);
 
+if (stage != 3) {
+qemu_mutex_lock_migthread();
+qemu_mutex_unlock_iothread();
+}
+
+if (ram_list.version != last_version) {
+/* RAM block added or removed */
+last_block = NULL;
+last_offset = 0;
+}
+
 while (!qemu_file_rate_limit(f)) {
 int bytes_sent;
 
@@ -308,6 +322,13 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 }
 }
 
+if (stage != 3) {
+qemu_mutex_unlock_migthread();
+qemu_mutex_lock_iothread();
+/* Lock ordering : iothread mutex is always acquired outside migthread
+ * mutex critical section to avoid deadlock */
+}
+
 bwidth = qemu_get_clock_ns(rt_clock) - bwidth;
 bwidth = (bytes_transferred - bytes_transferred_last) / bwidth;
 
diff --git a/cpu-all.h b/cpu-all.h
index 6b217a2..b85483f 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -21,6 +21,7 @@
 
 #include "qemu-common.h"
 #include "cpu-common.h"
+#include "qemu-thread.h"
 
 /* some important defines:
  *
@@ -932,7 +933,9 @@ typedef struct RAMBlock {
 } RAMBlock;
 
 typedef struct RAMList {
+QemuMutex mutex;/* Protects RAM block list */
 uint8_t *phys_dirty;
+uint32_t version;   /* To detect ram block addition/removal */
 QLIST_HEAD(ram, RAMBlock) blocks;
 QLIST_HEAD(, RAMBlock) blocks_mru;
 } RAMList;
diff --git a/exec.c b/exec.c
index c5c247c..7627483 100644
--- a/exec.c
+++ b/exec.c
@@ -582,6 +582,7 @@ void cpu_exec_init_all(unsigned long tb_size)
 code_gen_alloc(tb_size);
 code_gen_ptr = code_gen_buffer;
 page_init();
+qemu_mutex_init(ram_list.mutex);
 #if !defined(CONFIG_USER_ONLY)
 io_mem_init();
 #endif
@@ -2802,6 +2803,16 @@ static long gethugepagesize(const char *path)
 return fs.f_bsize;
 }
 
+void qemu_mutex_lock_migthread(void)
+{
+qemu_mutex_lock(ram_list.mutex);
+}
+
+void qemu_mutex_unlock_migthread(void)
+{
+qemu_mutex_unlock(ram_list.mutex);
+}
+
 static void *file_ram_alloc(RAMBlock *block,
 ram_addr_t memory,
 const char *path)
@@ -2976,14 +2987,20 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
const char *name,
 }
 new_block->length = size;
 
+qemu_mutex_lock_migthread();
+
 QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
 QLIST_INSERT_HEAD(&ram_list.blocks_mru, new_block, next_mru);
 
+ram_list.version++;
+
 ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty,
                                    last_ram_offset() >> TARGET_PAGE_BITS);
 memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
        0xff, size >> TARGET_PAGE_BITS);
 
+qemu_mutex_unlock_migthread();
+
 if (kvm_enabled())
 kvm_setup_guest_memory(new_block->host, size);
 
@@ -3001,8 +3018,11 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
 
 QLIST_FOREACH(block, &ram_list.blocks, next) {
 if (addr == block->offset) {
+qemu_mutex_lock_migthread();
 QLIST_REMOVE(block, next);
 QLIST_REMOVE(block, next_mru);
+ram_list.version++;
+qemu_mutex_unlock_migthread();
 qemu_free(block);
 return;
 }
@@ -3015,8 +3035,11 @@ void qemu_ram_free(ram_addr_t addr)
 
 QLIST_FOREACH(block, &ram_list.blocks, next) {
 if (addr == block->offset) {
+qemu_mutex_lock_migthread();
 QLIST_REMOVE(block, next

[RFC PATCH v5 4/4] Separate thread for VM migration

2011-08-23 Thread Umesh Deshpande
This patch creates a separate thread for the guest migration on the source side.
All exits (on completion/error) from the migration thread are handled by a
bottom handler, which is called from the iothread.
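
The control flow can be sketched in plain C/pthreads as follows (not QEMU code;
the bottom-half machinery is reduced to a flag polled by the main loop): the
worker only signals completion or error, and the actual cleanup runs later in
the iothread.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int migration_finished;  /* set by the worker, consumed by the main loop */

static void *migration_thread(void *arg)
{
    (void)arg;
    /* ... iterate: send dirty pages, throttle, retry on EAGAIN ... */
    atomic_store(&migration_finished, 1);   /* "schedule the bottom half" */
    return NULL;
}

static void main_loop_iteration(void)
{
    if (atomic_exchange(&migration_finished, 0)) {
        /* runs in the main (I/O) thread: close the file, free state, report */
        printf("migration done, cleaning up in the iothread\n");
    }
}

int main(void)
{
    pthread_t th;
    pthread_create(&th, NULL, migration_thread, NULL);
    pthread_join(th, NULL);
    main_loop_iteration();
    return 0;
}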

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 buffered_file.c |   75 +--
 migration.c |  122 +-
 migration.h |9 
 qemu-thread-posix.c |   10 
 qemu-thread.h   |1 +
 savevm.c|5 --
 6 files changed, 132 insertions(+), 90 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 41b42c3..0d94baa 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -16,6 +16,8 @@
 #include "qemu-timer.h"
 #include "qemu-char.h"
 #include "buffered_file.h"
+#include "migration.h"
+#include "qemu-thread.h"
 
 //#define DEBUG_BUFFERED_FILE
 
@@ -28,13 +30,14 @@ typedef struct QEMUFileBuffered
 void *opaque;
 QEMUFile *file;
 int has_error;
+int closed;
 int freeze_output;
 size_t bytes_xfer;
 size_t xfer_limit;
 uint8_t *buffer;
 size_t buffer_size;
 size_t buffer_capacity;
-QEMUTimer *timer;
+QemuThread thread;
 } QEMUFileBuffered;
 
 #ifdef DEBUG_BUFFERED_FILE
@@ -155,14 +158,6 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
 offset = size;
 }
 
-if (pos == 0 && size == 0) {
-DPRINTF("file is ready\n");
-if (s->bytes_xfer <= s->xfer_limit) {
-DPRINTF("notifying client\n");
-s->put_ready(s->opaque);
-}
-}
-
 return offset;
 }
 
@@ -173,22 +168,25 @@ static int buffered_close(void *opaque)
 
 DPRINTF("closing\n");
 
-while (!s->has_error && s->buffer_size) {
-buffered_flush(s);
-if (s->freeze_output)
-s->wait_for_unfreeze(s);
-}
+s->closed = 1;
 
-ret = s->close(s->opaque);
+qemu_mutex_unlock_migthread();
+qemu_mutex_unlock_iothread();
+
+qemu_thread_join(s->thread);
+/* Waits for the completion of the migration thread */
 
-qemu_del_timer(s->timer);
-qemu_free_timer(s->timer);
+qemu_mutex_lock_iothread();
+qemu_mutex_lock_migthread();
+
+ret = s->close(s->opaque);
 qemu_free(s->buffer);
 qemu_free(s);
 
 return ret;
 }
 
+
 static int buffered_rate_limit(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
@@ -228,26 +226,36 @@ static int64_t buffered_get_rate_limit(void *opaque)
 return s->xfer_limit;
 }
 
-static void buffered_rate_tick(void *opaque)
+static void *migrate_vm(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
+int64_t current_time, expire_time = qemu_get_clock_ms(rt_clock) + 100;
+struct timeval tv = { .tv_sec = 0, .tv_usec = 10};
 
-if (s->has_error) {
-buffered_close(s);
-return;
-}
-
-qemu_mod_timer(s->timer, qemu_get_clock_ms(rt_clock) + 100);
+while (!s->has_error && (!s->closed || s->buffer_size)) {
+if (s->freeze_output) {
+s->wait_for_unfreeze(s);
+s->freeze_output = 0;
+continue;
+}
 
-if (s->freeze_output)
-return;
+current_time = qemu_get_clock_ms(rt_clock);
+if (!s->closed && (expire_time > current_time)) {
+tv.tv_usec = 1000 * (expire_time - current_time);
+select(0, NULL, NULL, NULL, &tv);
+continue;
+}
 
-s->bytes_xfer = 0;
+s->bytes_xfer = 0;
+buffered_flush(s);
 
-buffered_flush(s);
+expire_time = qemu_get_clock_ms(rt_clock) + 100;
+if (!s->closed) {
+s->put_ready(s->opaque);
+}
+}
 
-/* Add some checks around this */
-s->put_ready(s->opaque);
+return NULL;
 }
 
 QEMUFile *qemu_fopen_ops_buffered(void *opaque,
@@ -267,15 +275,14 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 s->put_ready = put_ready;
 s->wait_for_unfreeze = wait_for_unfreeze;
 s->close = close;
+s->closed = 0;
 
 s->file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
  buffered_close, buffered_rate_limit,
  buffered_set_rate_limit,
-buffered_get_rate_limit);
-
-s->timer = qemu_new_timer_ms(rt_clock, buffered_rate_tick, s);
+ buffered_get_rate_limit);
 
-qemu_mod_timer(s->timer, qemu_get_clock_ms(rt_clock) + 100);
+qemu_thread_create(&s->thread, migrate_vm, s);
 
 return s->file;
 }
diff --git a/migration.c b/migration.c
index af3a1f2..17d866a 100644
--- a/migration.c
+++ b/migration.c
@@ -149,10 +149,12 @@ int do_migrate_set_speed(Monitor *mon, const QDict 
*qdict, QObject **ret_data)
 }
 max_throttle = d;
 
+qemu_mutex_lock_migthread();
 s = migrate_to_fms(current_migration);
 if (s && s->file) {
 qemu_file_set_rate_limit(s->file, max_throttle);
 }
+qemu_mutex_unlock_migthread();
 
 return 0;
 }
@@ -284,8 +286,6 @@ int migrate_fd_cleanup(FdMigrationState *s

[Qemu-devel] [RFC PATCH v5 0/4] Separate thread for VM migration

2011-08-23 Thread Umesh Deshpande
The following patch series deals with VCPU and iothread starvation during the
migration of a guest. Currently the iothread is responsible for performing the
guest migration: it holds qemu_mutex during the migration, which prevents VCPUs
from entering qemu mode and delays their return to the guest. The guest
migration, executed as an iohandler, also delays the execution of other
iohandlers. In the following patch series, the migration has been moved to a
separate thread to reduce the qemu_mutex contention and iohandler starvation.
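
Schematically (plain C/pthreads, not QEMU code), the gain comes from keeping the
expensive part of each iteration outside the global lock:

#include <pthread.h>

static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;  /* stands in for qemu_mutex */

void *migration_worker(void *arg)
{
    (void)arg;
    int more = 1;
    while (more) {
        pthread_mutex_lock(&global_lock);
        more = 0;   /* ... collect a small batch of dirty pages here ... */
        pthread_mutex_unlock(&global_lock);

        /* The slow part -- writing the batch to the socket -- runs unlocked,
         * so vcpus and iohandlers that need global_lock keep making progress. */
    }
    return NULL;
}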

Umesh Deshpande (4):
  MRU ram block list
  migration thread mutex
  separate migration bitmap
  separate migration thread

 arch_init.c |   38 
 buffered_file.c |   75 +--
 cpu-all.h   |   42 +
 exec.c  |   97 ++--
 migration.c |  122 +-
 migration.h |9 
 qemu-common.h   |2 +
 qemu-thread-posix.c |   10 
 qemu-thread.h   |1 +
 savevm.c|5 --
 10 files changed, 297 insertions(+), 104 deletions(-)

-- 
1.7.4.1




Re: [Qemu-devel] [RFC PATCH v4 3/5] separate migration bitmap

2011-08-20 Thread Umesh Deshpande

On 08/19/2011 08:51 AM, Paolo Bonzini wrote:

On 08/16/2011 08:56 PM, Umesh Deshpande wrote:
@@ -2128,8 +2132,61 @@ void 
cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,

   start1, length);
 }
 }
+
 }

+void migration_bitmap_reset_dirty(ram_addr_t start, ram_addr_t end,
+  int dirty_flags)
+{
+unsigned long length, start1;
+
+start &= TARGET_PAGE_MASK;
+end = TARGET_PAGE_ALIGN(end);
+
+length = end - start;
+if (length == 0) {
+return;
+}
+
+migration_bitmap_mask_dirty_range(start, length, dirty_flags);
+
+/* we modify the TLB cache so that the dirty bit will be set again
+   when accessing the range */


The comment does not apply here, and the code below can also be safely 
deleted.



+start1 = (unsigned long)qemu_safe_ram_ptr(start);
+/* Check that we don't span multiple blocks - this breaks the
+   address comparisons below.  */
+if ((unsigned long)qemu_safe_ram_ptr(end - 1) - start1
+!= (end - 1) - start) {
+abort();
+}
+}
+
+void sync_migration_bitmap(ram_addr_t start, ram_addr_t end)
+{
+unsigned long length, len, i;
+ram_addr_t addr;
+start = TARGET_PAGE_MASK;
+end = TARGET_PAGE_ALIGN(end);
+
+length = end - start;
+if (length == 0) {
+return;
+}
+
+len = length >> TARGET_PAGE_BITS;
+for (i = 0; i < len; i++) {
+addr = i << TARGET_PAGE_BITS;
+if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
+migration_bitmap_set_dirty(addr);
+cpu_physical_memory_reset_dirty(addr, addr + TARGET_PAGE_SIZE,
+                                MIGRATION_DIRTY_FLAG);


This should be run under the iothread lock.  Pay attention to avoiding 
lock inversion: the I/O thread always takes the iothread lock outside 
and the ramlist lock within, so the migration thread must do the same.


BTW, I think this code in the migration thread patch also needs the 
iothread lock:



if (stage < 0) {
cpu_physical_memory_set_dirty_tracking(0);
return 0;
}

if (cpu_physical_sync_dirty_bitmap(0, TARGET_PHYS_ADDR_MAX) != 0) {
qemu_file_set_error(f);
return 0;
}

The callers of the above code snippets (sync_migration_bitmap etc.) are holding
the iothread mutex. It has been made sure that the original qemu dirty bitmap is
only accessed while holding the mutex.


Finally, here:


/* Make sure all dirty bits are set */
QLIST_FOREACH(block, &ram_list.blocks, next) {
    for (addr = block->offset; addr < block->offset + block->length;
         addr += TARGET_PAGE_SIZE) {
        if (!migration_bitmap_get_dirty(addr,
                                        MIGRATION_DIRTY_FLAG)) {
            migration_bitmap_set_dirty(addr);
}
}
}



... you can skip the get_dirty operation since we are not interested 
in other flags than the migration flag for the migration-specific bitmap.

okay

Thanks
Umesh
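
A stand-alone sketch of the ordering rule discussed above (plain C/pthreads, not
QEMU code): both threads acquire the outer (iothread) lock before the inner
(ramlist) lock; to switch from "outer held" to "inner only" the inner lock is
taken first and the outer dropped, and on the way back the inner lock is dropped
before the outer is re-taken, so the outer lock is never acquired while the
inner one is held.

#include <pthread.h>

static pthread_mutex_t outer = PTHREAD_MUTEX_INITIALIZER;  /* iothread lock */
static pthread_mutex_t inner = PTHREAD_MUTEX_INITIALIZER;  /* ramlist lock  */

/* Called with `outer` held: keep the RAM list stable, then let vcpus run. */
void enter_unlocked_phase(void)
{
    pthread_mutex_lock(&inner);    /* outer -> inner is the allowed order */
    pthread_mutex_unlock(&outer);  /* long-running work continues under `inner` only */
}

/* Called with `inner` held: go back to normal operation. */
void leave_unlocked_phase(void)
{
    pthread_mutex_unlock(&inner);  /* drop the inner lock first ...           */
    pthread_mutex_lock(&outer);    /* ... so `outer` is never taken while
                                    * `inner` is held (that inversion is what
                                    * can deadlock).                          */
}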




Re: [RFC PATCH v4 2/5] ramlist mutex

2011-08-19 Thread Umesh Deshpande

On 08/17/2011 02:28 AM, Paolo Bonzini wrote:

On 08/16/2011 08:56 PM, Umesh Deshpande wrote:

@@ -3001,8 +3016,10 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)

  QLIST_FOREACH(block, &ram_list.blocks, next) {
  if (addr == block->offset) {
+qemu_mutex_lock_ramlist();
  QLIST_REMOVE(block, next);
  QLIST_REMOVE(block, next_mru);
+qemu_mutex_unlock_ramlist();
  qemu_free(block);
  return;
  }
@@ -3015,8 +3032,10 @@ void qemu_ram_free(ram_addr_t addr)

  QLIST_FOREACH(block, &ram_list.blocks, next) {
  if (addr == block->offset) {
+qemu_mutex_lock_ramlist();
  QLIST_REMOVE(block, next);
  QLIST_REMOVE(block, next_mru);
+qemu_mutex_unlock_ramlist();
  if (block->flags & RAM_PREALLOC_MASK) {
  ;
  } else if (mem_path) {


You must protect the whole QLIST_FOREACH.  Otherwise looks good.
Or, would it be okay to convert all the RAM block list traversals in exec.c
(under the iothread) to MRU traversals? That probably makes sense, since the
original list was also maintained in MRU order, whereas the sequence of blocks
doesn't matter to the migration code. This way we wouldn't have to acquire the
mutex for block list traversals.


- Umesh
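
The point about protecting the whole QLIST_FOREACH can be illustrated with a
stand-alone list in plain C (not QEMU code): taking the lock only around the
unlink still lets another thread free the node the walker is standing on, so
the lock has to cover the entire traversal.

#include <pthread.h>
#include <stddef.h>

struct node { struct node *next; int key; };

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct node *head;

/* Correct version: the whole walk runs under the lock. */
struct node *unlink_key(int key)
{
    struct node *found = NULL;

    pthread_mutex_lock(&list_lock);
    for (struct node **pp = &head; *pp; pp = &(*pp)->next) {
        if ((*pp)->key == key) {
            found = *pp;
            *pp = found->next;     /* unlink; the caller frees it later */
            break;
        }
    }
    pthread_mutex_unlock(&list_lock);
    return found;
}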





[RFC PATCH v4 0/5] Separate thread for VM migration

2011-08-16 Thread Umesh Deshpande
The following patch series deals with VCPU and iothread starvation during the
migration of a guest. Currently the iothread is responsible for performing the
guest migration: it holds qemu_mutex during the migration, which prevents VCPUs
from entering qemu mode and delays their return to the guest. The guest
migration, executed as an iohandler, also delays the execution of other
iohandlers. In the following patch series, the migration has been moved to a
separate thread to reduce the qemu_mutex contention and iohandler starvation.

Umesh Deshpande (5):
  MRU ram list
  ramlist lock
  separate migration bitmap
  separate thread for VM migration
  synchronous migrate_cancel

 arch_init.c |   26 +---
 buffered_file.c |  104 ++-
 buffered_file.h |3 +
 cpu-all.h   |   41 
 exec.c  |  100 ++--
 hw/hw.h |5 ++-
 migration.c |   78 --
 migration.h |1 +
 qemu-common.h   |2 +
 qemu-thread-posix.c |   10 +
 qemu-thread.h   |1 +
 savevm.c|   30 +--
 12 files changed, 304 insertions(+), 97 deletions(-)

-- 
1.7.4.1



[RFC PATCH v4 2/5] ramlist mutex

2011-08-16 Thread Umesh Deshpande
The ramlist mutex is implemented to protect the RAMBlock list traversal in the
migration thread against block addition/removal from the iothread.

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 cpu-all.h |2 ++
 exec.c|   19 +++
 qemu-common.h |2 ++
 3 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 6b217a2..eab9803 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -21,6 +21,7 @@
 
 #include "qemu-common.h"
 #include "cpu-common.h"
+#include "qemu-thread.h"
 
 /* some important defines:
  *
@@ -932,6 +933,7 @@ typedef struct RAMBlock {
 } RAMBlock;
 
 typedef struct RAMList {
+QemuMutex mutex;
 uint8_t *phys_dirty;
 QLIST_HEAD(ram, RAMBlock) blocks;
 QLIST_HEAD(, RAMBlock) blocks_mru;
diff --git a/exec.c b/exec.c
index c5c247c..404d8ea 100644
--- a/exec.c
+++ b/exec.c
@@ -582,6 +582,7 @@ void cpu_exec_init_all(unsigned long tb_size)
 code_gen_alloc(tb_size);
 code_gen_ptr = code_gen_buffer;
 page_init();
+qemu_mutex_init(ram_list.mutex);
 #if !defined(CONFIG_USER_ONLY)
 io_mem_init();
 #endif
@@ -2802,6 +2803,16 @@ static long gethugepagesize(const char *path)
 return fs.f_bsize;
 }
 
+void qemu_mutex_lock_ramlist(void)
+{
+qemu_mutex_lock(ram_list.mutex);
+}
+
+void qemu_mutex_unlock_ramlist(void)
+{
+qemu_mutex_unlock(ram_list.mutex);
+}
+
 static void *file_ram_alloc(RAMBlock *block,
 ram_addr_t memory,
 const char *path)
@@ -2976,6 +2987,8 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
const char *name,
 }
 new_block->length = size;
 
+qemu_mutex_lock_ramlist();
+
 QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
 QLIST_INSERT_HEAD(&ram_list.blocks_mru, new_block, next_mru);
 
@@ -2984,6 +2997,8 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
 memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
        0xff, size >> TARGET_PAGE_BITS);
 
+qemu_mutex_unlock_ramlist();
+
 if (kvm_enabled())
 kvm_setup_guest_memory(new_block->host, size);
 
@@ -3001,8 +3016,10 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
 
 QLIST_FOREACH(block, &ram_list.blocks, next) {
 if (addr == block->offset) {
+qemu_mutex_lock_ramlist();
 QLIST_REMOVE(block, next);
 QLIST_REMOVE(block, next_mru);
+qemu_mutex_unlock_ramlist();
 qemu_free(block);
 return;
 }
@@ -3015,8 +3032,10 @@ void qemu_ram_free(ram_addr_t addr)
 
 QLIST_FOREACH(block, &ram_list.blocks, next) {
 if (addr == block->offset) {
+qemu_mutex_lock_ramlist();
 QLIST_REMOVE(block, next);
 QLIST_REMOVE(block, next_mru);
+qemu_mutex_unlock_ramlist();
 if (block->flags & RAM_PREALLOC_MASK) {
 ;
 } else if (mem_path) {
diff --git a/qemu-common.h b/qemu-common.h
index abd7a75..b802883 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -212,6 +212,8 @@ char *qemu_strndup(const char *str, size_t size);
 
 void qemu_mutex_lock_iothread(void);
 void qemu_mutex_unlock_iothread(void);
+void qemu_mutex_lock_ramlist(void);
+void qemu_mutex_unlock_ramlist(void);
 
 int qemu_open(const char *name, int flags, ...);
 ssize_t qemu_write_full(int fd, const void *buf, size_t count)
-- 
1.7.4.1



[RFC PATCH v4 1/5] MRU ram list

2011-08-16 Thread Umesh Deshpande
This patch creates a new list of RAM blocks in MRU order, so that separate
locking rules can be applied to the regular RAM block list and the MRU list.
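
A stand-alone sketch of the MRU idea in plain C (not QEMU code; the block layout
is simplified): a hit is moved to the front of the list, so the block that was
used last is found first the next time.

#include <stddef.h>

struct block { struct block *next; unsigned long base, len; };

struct block *mru_lookup(struct block **head, unsigned long addr)
{
    for (struct block **pp = head; *pp; pp = &(*pp)->next) {
        struct block *b = *pp;
        if (addr - b->base < b->len) {   /* addr falls inside this block */
            if (b != *head) {            /* move the hit to the front    */
                *pp = b->next;
                b->next = *head;
                *head = b;
            }
            return b;
        }
    }
    return NULL;
}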

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpu-all.h |2 ++
 exec.c|   17 -
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index e839100..6b217a2 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -925,6 +925,7 @@ typedef struct RAMBlock {
 uint32_t flags;
 char idstr[256];
 QLIST_ENTRY(RAMBlock) next;
+QLIST_ENTRY(RAMBlock) next_mru;
 #if defined(__linux__)  !defined(TARGET_S390X)
 int fd;
 #endif
@@ -933,6 +934,7 @@ typedef struct RAMBlock {
 typedef struct RAMList {
 uint8_t *phys_dirty;
 QLIST_HEAD(ram, RAMBlock) blocks;
+QLIST_HEAD(, RAMBlock) blocks_mru;
 } RAMList;
 extern RAMList ram_list;
 
diff --git a/exec.c b/exec.c
index 0e2ce57..c5c247c 100644
--- a/exec.c
+++ b/exec.c
@@ -113,7 +113,11 @@ static uint8_t *code_gen_ptr;
 int phys_ram_fd;
 static int in_migration;
 
-RAMList ram_list = { .blocks = QLIST_HEAD_INITIALIZER(ram_list) };
+RAMList ram_list = {
+.blocks = QLIST_HEAD_INITIALIZER(ram_list),
+.blocks_mru = QLIST_HEAD_INITIALIZER(ram_list.blocks_mru)
+};
+
 #endif
 
 CPUState *first_cpu;
@@ -2973,6 +2977,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
const char *name,
 new_block->length = size;
 
 QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
+QLIST_INSERT_HEAD(&ram_list.blocks_mru, new_block, next_mru);
 
 ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty,
                                    last_ram_offset() >> TARGET_PAGE_BITS);
@@ -2997,6 +3002,7 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
 QLIST_FOREACH(block, &ram_list.blocks, next) {
 if (addr == block->offset) {
 QLIST_REMOVE(block, next);
+QLIST_REMOVE(block, next_mru);
 qemu_free(block);
 return;
 }
@@ -3010,6 +3016,7 @@ void qemu_ram_free(ram_addr_t addr)
 QLIST_FOREACH(block, &ram_list.blocks, next) {
 if (addr == block->offset) {
 QLIST_REMOVE(block, next);
+QLIST_REMOVE(block, next_mru);
 if (block->flags & RAM_PREALLOC_MASK) {
 ;
 } else if (mem_path) {
@@ -3113,12 +3120,12 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
 {
 RAMBlock *block;
 
-QLIST_FOREACH(block, &ram_list.blocks, next) {
+QLIST_FOREACH(block, &ram_list.blocks_mru, next_mru) {
 if (addr - block->offset < block->length) {
 /* Move this entry to to start of the list.  */
 if (block != QLIST_FIRST(&ram_list.blocks)) {
-QLIST_REMOVE(block, next);
-QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
+QLIST_REMOVE(block, next_mru);
+QLIST_INSERT_HEAD(&ram_list.blocks_mru, block, next_mru);
 }
 if (xen_mapcache_enabled()) {
 /* We need to check if the requested address is in the RAM
@@ -3211,7 +3218,7 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t 
*ram_addr)
 return 0;
 }
 
-QLIST_FOREACH(block, &ram_list.blocks, next) {
+QLIST_FOREACH(block, &ram_list.blocks_mru, next_mru) {
 /* This case append when the block is not mapped. */
 if (block->host == NULL) {
 continue;
-- 
1.7.4.1



[RFC PATCH v3 5/5] Making iothread block for migrate_cancel

2011-08-16 Thread Umesh Deshpande
The following patch makes the iothread wait until the migration thread has
responded to the migrate_cancel request and terminated its execution.
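
In plain C/pthreads the synchronous behaviour boils down to joining the worker
after requesting cancellation (a sketch, not QEMU code; in the actual patch the
caller also drops its locks before joining, so the migration thread can finish
its current iteration).

#include <pthread.h>
#include <stdatomic.h>

static atomic_int cancel_requested;
static pthread_t migration_thread;

static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&cancel_requested)) {
        /* ... transfer the next batch of dirty pages ... */
    }
    return NULL;        /* clean up and exit */
}

void migrate_start(void)
{
    pthread_create(&migration_thread, NULL, worker, NULL);
}

void migrate_cancel_sync(void)
{
    atomic_store(&cancel_requested, 1);
    pthread_join(migration_thread, NULL);   /* returns only once the worker is gone */
}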

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 buffered_file.c |   13 -
 buffered_file.h |3 +++
 hw/hw.h |5 -
 migration.c |1 +
 qemu-thread-posix.c |   10 ++
 qemu-thread.h   |1 +
 savevm.c|   31 +--
 7 files changed, 52 insertions(+), 12 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index bdcdf42..405b17f 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -223,6 +223,16 @@ static int64_t buffered_get_rate_limit(void *opaque)
 return s->xfer_limit;
 }
 
+static void buffered_wait_for_cancel(void *opaque)
+{
+QEMUFileBuffered *s = opaque;
+QemuThread thread = s->thread;
+
+qemu_mutex_unlock_iothread();
+qemu_thread_join(thread);
+qemu_mutex_lock_iothread();
+}
+
 static void *migrate_vm(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
@@ -296,7 +306,8 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 s->file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
  buffered_close, buffered_rate_limit,
  buffered_set_rate_limit,
- buffered_get_rate_limit);
+ buffered_get_rate_limit,
+ buffered_wait_for_cancel);
 
 qemu_thread_create(&s->thread, migrate_vm, s);
 
diff --git a/buffered_file.h b/buffered_file.h
index 98d358b..413cc9f 100644
--- a/buffered_file.h
+++ b/buffered_file.h
@@ -20,6 +20,9 @@ typedef ssize_t (BufferedPutFunc)(void *opaque, const void 
*data, size_t size);
 typedef void (BufferedPutReadyFunc)(void *opaque);
 typedef void (BufferedWaitForUnfreezeFunc)(void *opaque);
 typedef int (BufferedCloseFunc)(void *opaque);
+typedef void (BufferedWaitForCancelFunc)(void *opaque);
+
+void wait_for_cancel(void *opaque);
 
 QEMUFile *qemu_fopen_ops_buffered(void *opaque, size_t xfer_limit,
   BufferedPutFunc *put_buffer,
diff --git a/hw/hw.h b/hw/hw.h
index 9dd7096..e1d5ea8 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -41,13 +41,15 @@ typedef int (QEMUFileRateLimit)(void *opaque);
  */
 typedef int64_t (QEMUFileSetRateLimit)(void *opaque, int64_t new_rate);
 typedef int64_t (QEMUFileGetRateLimit)(void *opaque);
+typedef void (QEMUFileWaitForCancel)(void *opaque);
 
 QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
  QEMUFileGetBufferFunc *get_buffer,
  QEMUFileCloseFunc *close,
  QEMUFileRateLimit *rate_limit,
  QEMUFileSetRateLimit *set_rate_limit,
-QEMUFileGetRateLimit *get_rate_limit);
+ QEMUFileGetRateLimit *get_rate_limit,
+ QEMUFileWaitForCancel *wait_for_cancel);
 QEMUFile *qemu_fopen(const char *filename, const char *mode);
 QEMUFile *qemu_fdopen(int fd, const char *mode);
 QEMUFile *qemu_fopen_socket(int fd);
@@ -56,6 +58,7 @@ QEMUFile *qemu_popen_cmd(const char *command, const char 
*mode);
 int qemu_stdio_fd(QEMUFile *f);
 void qemu_fflush(QEMUFile *f);
 int qemu_fclose(QEMUFile *f);
+void qemu_wait_for_cancel(QEMUFile *f);
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
 void qemu_put_byte(QEMUFile *f, int v);
 
diff --git a/migration.c b/migration.c
index b6ba690..0c5a484 100644
--- a/migration.c
+++ b/migration.c
@@ -423,6 +423,7 @@ void migrate_fd_cancel(MigrationState *mig_state)
 DPRINTF("cancelling migration\n");
 
 s->state = MIG_STATE_CANCELLED;
+qemu_wait_for_cancel(s->file);
 }
 
 void migrate_fd_release(MigrationState *mig_state)
diff --git a/qemu-thread-posix.c b/qemu-thread-posix.c
index 2bd02ef..0d18b35 100644
--- a/qemu-thread-posix.c
+++ b/qemu-thread-posix.c
@@ -115,6 +115,16 @@ void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex)
 error_exit(err, __func__);
 }
 
+void qemu_thread_join(QemuThread thread)
+{
+int err;
+
+err = pthread_join(thread.thread, NULL);
+if (err) {
+error_exit(err, __func__);
+}
+}
+
 void qemu_thread_create(QemuThread *thread,
void *(*start_routine)(void*),
void *arg)
diff --git a/qemu-thread.h b/qemu-thread.h
index 0a73d50..909529f 100644
--- a/qemu-thread.h
+++ b/qemu-thread.h
@@ -30,6 +30,7 @@ void qemu_cond_destroy(QemuCond *cond);
 void qemu_cond_signal(QemuCond *cond);
 void qemu_cond_broadcast(QemuCond *cond);
 void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex);
+void qemu_thread_join(QemuThread thread);
 
 void qemu_thread_create(QemuThread *thread,
void *(*start_routine)(void*),
diff --git a/savevm.c b/savevm.c
index f54f555..8003411 100644
--- a/savevm.c
+++ b/savevm.c
@@ -164,6 +164,7 @@ struct QEMUFile {
 QEMUFileRateLimit *rate_limit

[RFC PATCH v4 4/5] separate thread for VM migration

2011-08-16 Thread Umesh Deshpande
This patch creates a separate thread for the guest migration on the source side.
The migrate_cancel request from the iothread is handled asynchronously: the
iothread submits migrate_cancel to the migration thread and returns, while the
migration thread acts on the request at its next iteration and terminates its
execution.
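
The asynchronous variant can be sketched in plain C (not QEMU code): cancelling
only flips a state flag and returns immediately; the migration thread checks the
flag at the top of each iteration and terminates itself.

#include <stdatomic.h>

enum { MIG_ACTIVE, MIG_CANCELLED };

static atomic_int mig_state = MIG_ACTIVE;

/* Called from the iothread: no waiting, just record the request. */
void migrate_cancel_async(void)
{
    atomic_store(&mig_state, MIG_CANCELLED);
}

/* Body of the migration thread's loop: returns 0 when the thread should exit. */
int migration_iteration(void)
{
    if (atomic_load(&mig_state) == MIG_CANCELLED) {
        return 0;                /* clean up and leave the loop */
    }
    /* ... send the next batch of dirty pages ... */
    return 1;
}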

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 buffered_file.c |   93 ++
 migration.c |   77 +++--
 migration.h |1 +
 savevm.c|5 ---
 4 files changed, 99 insertions(+), 77 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 41b42c3..bdcdf42 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -16,6 +16,8 @@
 #include "qemu-timer.h"
 #include "qemu-char.h"
 #include "buffered_file.h"
+#include "migration.h"
+#include "qemu-thread.h"
 
 //#define DEBUG_BUFFERED_FILE
 
@@ -28,13 +30,14 @@ typedef struct QEMUFileBuffered
 void *opaque;
 QEMUFile *file;
 int has_error;
+int closed;
 int freeze_output;
 size_t bytes_xfer;
 size_t xfer_limit;
 uint8_t *buffer;
 size_t buffer_size;
 size_t buffer_capacity;
-QEMUTimer *timer;
+QemuThread thread;
 } QEMUFileBuffered;
 
 #ifdef DEBUG_BUFFERED_FILE
@@ -155,14 +158,6 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
 offset = size;
 }
 
-if (pos == 0 && size == 0) {
-DPRINTF("file is ready\n");
-if (s->bytes_xfer <= s->xfer_limit) {
-DPRINTF("notifying client\n");
-s->put_ready(s->opaque);
-}
-}
-
 return offset;
 }
 
@@ -175,20 +170,20 @@ static int buffered_close(void *opaque)
 
 while (!s->has_error && s->buffer_size) {
 buffered_flush(s);
-if (s->freeze_output)
+if (s->freeze_output) {
 s->wait_for_unfreeze(s);
+}
 }
 
-ret = s->close(s->opaque);
+s->closed = 1;
 
-qemu_del_timer(s->timer);
-qemu_free_timer(s->timer);
+ret = s->close(s->opaque);
 qemu_free(s->buffer);
-qemu_free(s);
 
 return ret;
 }
 
+
 static int buffered_rate_limit(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
@@ -228,34 +223,63 @@ static int64_t buffered_get_rate_limit(void *opaque)
 return s->xfer_limit;
 }
 
-static void buffered_rate_tick(void *opaque)
+static void *migrate_vm(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
+int64_t current_time, expire_time = qemu_get_clock_ms(rt_clock) + 100;
+struct timeval tv = { .tv_sec = 0, .tv_usec = 10};
 
-if (s->has_error) {
-buffered_close(s);
-return;
-}
+qemu_mutex_lock_iothread();
 
-qemu_mod_timer(s->timer, qemu_get_clock_ms(rt_clock) + 100);
+while (!s->closed) {
+if (s->freeze_output) {
+qemu_mutex_lock_ramlist();
+qemu_mutex_unlock_iothread();
+s->wait_for_unfreeze(s);
+qemu_mutex_lock_iothread();
+qemu_mutex_unlock_ramlist();
+s->freeze_output = 0;
+continue;
+}
 
-if (s->freeze_output)
-return;
+if (s->has_error) {
+break;
+}
+
+current_time = qemu_get_clock_ms(rt_clock);
+if (!s->closed && (expire_time > current_time)) {
+tv.tv_usec = 1000 * (expire_time - current_time);
+qemu_mutex_lock_ramlist();
+qemu_mutex_unlock_iothread();
+select(0, NULL, NULL, NULL, &tv);
+qemu_mutex_lock_iothread();
+qemu_mutex_unlock_ramlist();
+continue;
+}
 
-s->bytes_xfer = 0;
+s->bytes_xfer = 0;
+buffered_flush(s);
 
-buffered_flush(s);
+expire_time = qemu_get_clock_ms(rt_clock) + 100;
+s->put_ready(s->opaque);
+}
 
-/* Add some checks around this */
-s->put_ready(s->opaque);
+if (s->has_error) {
+buffered_close(s);
+}
+qemu_free(s);
+
+qemu_mutex_unlock_iothread();
+
+return NULL;
 }
 
 QEMUFile *qemu_fopen_ops_buffered(void *opaque,
-  size_t bytes_per_sec,
-  BufferedPutFunc *put_buffer,
-  BufferedPutReadyFunc *put_ready,
-  BufferedWaitForUnfreezeFunc 
*wait_for_unfreeze,
-  BufferedCloseFunc *close)
+size_t bytes_per_sec,
+BufferedPutFunc *put_buffer,
+BufferedPutReadyFunc *put_ready,
+BufferedWaitForUnfreezeFunc *wait_for_unfreeze,
+BufferedCloseFunc *close)
 {
 QEMUFileBuffered *s;
 
@@ -267,15 +291,14 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 s->put_ready = put_ready;
 s->wait_for_unfreeze = wait_for_unfreeze;
 s->close = close;
+s->closed = 0;
 
 s->file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
  buffered_close, buffered_rate_limit

[RFC PATCH v4 3/5] separate migration bitmap

2011-08-16 Thread Umesh Deshpande
This patch creates a migration bitmap, which is periodically kept in sync with
the qemu bitmap. A separate copy of the dirty bitmap for migration avoids
concurrent access to the qemu bitmap from the iothread and the migration thread.

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c |   26 +--
 cpu-all.h   |   37 ++
 exec.c  |   64 +++
 3 files changed, 120 insertions(+), 7 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 484b39d..296b7d6 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -123,13 +123,13 @@ static int ram_save_block(QEMUFile *f)
 current_addr = block->offset + offset;
 
 do {
-if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
+if (migration_bitmap_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
 uint8_t *p;
 int cont = (block == last_block) ? RAM_SAVE_FLAG_CONTINUE : 0;
 
-cpu_physical_memory_reset_dirty(current_addr,
-current_addr + TARGET_PAGE_SIZE,
-MIGRATION_DIRTY_FLAG);
+migration_bitmap_reset_dirty(current_addr,
+ current_addr + TARGET_PAGE_SIZE,
+ MIGRATION_DIRTY_FLAG);
 
 p = block->host + offset;
 
@@ -185,7 +185,7 @@ static ram_addr_t ram_save_remaining(void)
 ram_addr_t addr;
 for (addr = block->offset; addr < block->offset + block->length;
  addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
+if (migration_bitmap_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
 count++;
 }
 }
@@ -265,6 +265,8 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 return 0;
 }
 
+sync_migration_bitmap(0, TARGET_PHYS_ADDR_MAX);
+
 if (stage == 1) {
 RAMBlock *block;
 bytes_transferred = 0;
@@ -276,9 +278,9 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 QLIST_FOREACH(block, &ram_list.blocks, next) {
 for (addr = block->offset; addr < block->offset + block->length;
  addr += TARGET_PAGE_SIZE) {
-if (!cpu_physical_memory_get_dirty(addr,
+if (!migration_bitmap_get_dirty(addr,
MIGRATION_DIRTY_FLAG)) {
-cpu_physical_memory_set_dirty(addr);
+migration_bitmap_set_dirty(addr);
 }
 }
 }
@@ -298,6 +300,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 bytes_transferred_last = bytes_transferred;
 bwidth = qemu_get_clock_ns(rt_clock);
 
+if (stage != 3) {
+qemu_mutex_lock_ramlist();
+qemu_mutex_unlock_iothread();
+}
+
 while (!qemu_file_rate_limit(f)) {
 int bytes_sent;
 
@@ -308,6 +315,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 }
 }
 
+if (stage != 3) {
+qemu_mutex_lock_iothread();
+qemu_mutex_unlock_ramlist();
+}
+
 bwidth = qemu_get_clock_ns(rt_clock) - bwidth;
 bwidth = (bytes_transferred - bytes_transferred_last) / bwidth;
 
diff --git a/cpu-all.h b/cpu-all.h
index eab9803..e709277 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -935,6 +935,7 @@ typedef struct RAMBlock {
 typedef struct RAMList {
 QemuMutex mutex;
 uint8_t *phys_dirty;
+uint8_t *migration_bitmap;
 QLIST_HEAD(ram, RAMBlock) blocks;
 QLIST_HEAD(, RAMBlock) blocks_mru;
 } RAMList;
@@ -1008,8 +1009,44 @@ static inline void 
cpu_physical_memory_mask_dirty_range(ram_addr_t start,
 }
 }
 
+
+
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
  int dirty_flags);
+
+static inline int migration_bitmap_get_dirty(ram_addr_t addr,
+ int dirty_flags)
+{
+return ram_list.migration_bitmap[addr >> TARGET_PAGE_BITS] & dirty_flags;
+}
+
+static inline void migration_bitmap_set_dirty(ram_addr_t addr)
+{
+ram_list.migration_bitmap[addr >> TARGET_PAGE_BITS] = 0xff;
+}
+
+static inline void migration_bitmap_mask_dirty_range(ram_addr_t start,
+ int length,
+ int dirty_flags)
+{
+int i, mask, len;
+uint8_t *p;
+
+len = length >> TARGET_PAGE_BITS;
+mask = ~dirty_flags;
+p = ram_list.migration_bitmap + (start >> TARGET_PAGE_BITS);
+for (i = 0; i < len; i++) {
+p[i] &= mask;
+}
+}
+
+
+void migration_bitmap_reset_dirty(ram_addr_t start,
+  ram_addr_t end,
+  int dirty_flags);
+
+void sync_migration_bitmap

[Qemu-devel] [RFC PATCH v4 3/5] separate migration bitmap

2011-08-16 Thread Umesh Deshpande
This patch creates a migration bitmap, which is periodically kept in sync with
the qemu bitmap. A separate copy of the dirty bitmap for the migration avoids
concurrent access to the qemu bitmap from iothread and migration thread.

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c |   26 +--
 cpu-all.h   |   37 ++
 exec.c  |   64 +++
 3 files changed, 120 insertions(+), 7 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 484b39d..296b7d6 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -123,13 +123,13 @@ static int ram_save_block(QEMUFile *f)
 current_addr = block-offset + offset;
 
 do {
-if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) 
{
+if (migration_bitmap_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
 uint8_t *p;
 int cont = (block == last_block) ? RAM_SAVE_FLAG_CONTINUE : 0;
 
-cpu_physical_memory_reset_dirty(current_addr,
-current_addr + TARGET_PAGE_SIZE,
-MIGRATION_DIRTY_FLAG);
+migration_bitmap_reset_dirty(current_addr,
+ current_addr + TARGET_PAGE_SIZE,
+ MIGRATION_DIRTY_FLAG);
 
 p = block-host + offset;
 
@@ -185,7 +185,7 @@ static ram_addr_t ram_save_remaining(void)
 ram_addr_t addr;
 for (addr = block-offset; addr  block-offset + block-length;
  addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
+if (migration_bitmap_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
 count++;
 }
 }
@@ -265,6 +265,8 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 return 0;
 }
 
+sync_migration_bitmap(0, TARGET_PHYS_ADDR_MAX);
+
 if (stage == 1) {
 RAMBlock *block;
 bytes_transferred = 0;
@@ -276,9 +278,9 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 QLIST_FOREACH(block, ram_list.blocks, next) {
 for (addr = block-offset; addr  block-offset + block-length;
  addr += TARGET_PAGE_SIZE) {
-if (!cpu_physical_memory_get_dirty(addr,
+if (!migration_bitmap_get_dirty(addr,
MIGRATION_DIRTY_FLAG)) {
-cpu_physical_memory_set_dirty(addr);
+migration_bitmap_set_dirty(addr);
 }
 }
 }
@@ -298,6 +300,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 bytes_transferred_last = bytes_transferred;
 bwidth = qemu_get_clock_ns(rt_clock);
 
+if (stage != 3) {
+qemu_mutex_lock_ramlist();
+qemu_mutex_unlock_iothread();
+}
+
 while (!qemu_file_rate_limit(f)) {
 int bytes_sent;
 
@@ -308,6 +315,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 }
 }
 
+if (stage != 3) {
+qemu_mutex_lock_iothread();
+qemu_mutex_unlock_ramlist();
+}
+
 bwidth = qemu_get_clock_ns(rt_clock) - bwidth;
 bwidth = (bytes_transferred - bytes_transferred_last) / bwidth;
 
diff --git a/cpu-all.h b/cpu-all.h
index eab9803..e709277 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -935,6 +935,7 @@ typedef struct RAMBlock {
 typedef struct RAMList {
 QemuMutex mutex;
 uint8_t *phys_dirty;
+uint8_t *migration_bitmap;
 QLIST_HEAD(ram, RAMBlock) blocks;
 QLIST_HEAD(, RAMBlock) blocks_mru;
 } RAMList;
@@ -1008,8 +1009,44 @@ static inline void 
cpu_physical_memory_mask_dirty_range(ram_addr_t start,
 }
 }
 
+
+
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
  int dirty_flags);
+
+static inline int migration_bitmap_get_dirty(ram_addr_t addr,
+ int dirty_flags)
+{
+return ram_list.migration_bitmap[addr >> TARGET_PAGE_BITS] & dirty_flags;
+}
+
+static inline void migration_bitmap_set_dirty(ram_addr_t addr)
+{
+ram_list.migration_bitmap[addr >> TARGET_PAGE_BITS] = 0xff;
+}
+
+static inline void migration_bitmap_mask_dirty_range(ram_addr_t start,
+ int length,
+ int dirty_flags)
+{
+int i, mask, len;
+uint8_t *p;
+
+len = length >> TARGET_PAGE_BITS;
+mask = ~dirty_flags;
+p = ram_list.migration_bitmap + (start >> TARGET_PAGE_BITS);
+for (i = 0; i < len; i++) {
+p[i] &= mask;
+}
+}
+
+
+void migration_bitmap_reset_dirty(ram_addr_t start,
+  ram_addr_t end,
+  int dirty_flags);
+
+void sync_migration_bitmap

[Qemu-devel] [RFC PATCH v4 4/5] separate thread for VM migration

2011-08-16 Thread Umesh Deshpande
This patch creates a separate thread for the guest migration on the source side.
A migrate_cancel request from the iothread is handled asynchronously. That is,
the iothread submits migrate_cancel to the migration thread and returns, while the
migration thread picks up the request at its next iteration and terminates its
execution.
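
Roughly, the thread replaces the old 100 ms QEMUTimer with a loop of its
own. A standalone pthread sketch of that shape (illustration only, not the
QEMU code; gettimeofday() stands in for the qemu clock):

#include <pthread.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>

/* Millisecond clock helper for the sketch. */
static long long now_ms(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000LL + tv.tv_usec / 1000;
}

/* One burst of work per 100 ms tick; the rest of each tick is spent
 * sleeping in select(), which is where the real code drops its locks. */
static void *migration_thread(void *arg)
{
    int *iterations = arg;
    long long expire = now_ms() + 100;

    while (*iterations > 0) {
        long long now = now_ms();
        if (now < expire) {
            struct timeval tv = { 0, (expire - now) * 1000 };
            select(0, NULL, NULL, NULL, &tv);
            continue;
        }
        printf("tick: send data up to the rate limit\n");
        (*iterations)--;
        expire = now_ms() + 100;
    }
    return NULL;
}

int main(void)
{
    pthread_t th;
    int iterations = 3;
    pthread_create(&th, NULL, migration_thread, &iterations);
    pthread_join(th, NULL);
    return 0;
}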

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 buffered_file.c |   93 ++
 migration.c |   77 +++--
 migration.h |1 +
 savevm.c|5 ---
 4 files changed, 99 insertions(+), 77 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 41b42c3..bdcdf42 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -16,6 +16,8 @@
 #include qemu-timer.h
 #include qemu-char.h
 #include buffered_file.h
+#include migration.h
+#include qemu-thread.h
 
 //#define DEBUG_BUFFERED_FILE
 
@@ -28,13 +30,14 @@ typedef struct QEMUFileBuffered
 void *opaque;
 QEMUFile *file;
 int has_error;
+int closed;
 int freeze_output;
 size_t bytes_xfer;
 size_t xfer_limit;
 uint8_t *buffer;
 size_t buffer_size;
 size_t buffer_capacity;
-QEMUTimer *timer;
+QemuThread thread;
 } QEMUFileBuffered;
 
 #ifdef DEBUG_BUFFERED_FILE
@@ -155,14 +158,6 @@ static int buffered_put_buffer(void *opaque, const uint8_t 
*buf, int64_t pos, in
 offset = size;
 }
 
-if (pos == 0 && size == 0) {
-DPRINTF("file is ready\n");
-if (s->bytes_xfer <= s->xfer_limit) {
-DPRINTF("notifying client\n");
-s->put_ready(s->opaque);
-}
-}
-
 return offset;
 }
 
@@ -175,20 +170,20 @@ static int buffered_close(void *opaque)
 
 while (!s->has_error && s->buffer_size) {
 buffered_flush(s);
-if (s->freeze_output)
+if (s->freeze_output) {
 s->wait_for_unfreeze(s);
+}
 }
 
-ret = s->close(s->opaque);
+s->closed = 1;
 
-qemu_del_timer(s->timer);
-qemu_free_timer(s->timer);
+ret = s->close(s->opaque);
 qemu_free(s->buffer);
-qemu_free(s);
 
 return ret;
 }
 
+
 static int buffered_rate_limit(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
@@ -228,34 +223,63 @@ static int64_t buffered_get_rate_limit(void *opaque)
 return s-xfer_limit;
 }
 
-static void buffered_rate_tick(void *opaque)
+static void *migrate_vm(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
+int64_t current_time, expire_time = qemu_get_clock_ms(rt_clock) + 100;
+struct timeval tv = { .tv_sec = 0, .tv_usec = 10};
 
-if (s-has_error) {
-buffered_close(s);
-return;
-}
+qemu_mutex_lock_iothread();
 
-qemu_mod_timer(s-timer, qemu_get_clock_ms(rt_clock) + 100);
+while (!s->closed) {
+if (s->freeze_output) {
+qemu_mutex_lock_ramlist();
+qemu_mutex_unlock_iothread();
+s->wait_for_unfreeze(s);
+qemu_mutex_lock_iothread();
+qemu_mutex_unlock_ramlist();
+s->freeze_output = 0;
+continue;
+}

-if (s->freeze_output)
-return;
+if (s->has_error) {
+break;
+}
+
+current_time = qemu_get_clock_ms(rt_clock);
+if (!s->closed && (expire_time > current_time)) {
+tv.tv_usec = 1000 * (expire_time - current_time);
+qemu_mutex_lock_ramlist();
+qemu_mutex_unlock_iothread();
+select(0, NULL, NULL, NULL, &tv);
+qemu_mutex_lock_iothread();
+qemu_mutex_unlock_ramlist();
+continue;
+}

-s->bytes_xfer = 0;
+s->bytes_xfer = 0;
+buffered_flush(s);

-buffered_flush(s);
+expire_time = qemu_get_clock_ms(rt_clock) + 100;
+s->put_ready(s->opaque);
+}
 
-/* Add some checks around this */
-s-put_ready(s-opaque);
+if (s-has_error) {
+buffered_close(s);
+}
+qemu_free(s);
+
+qemu_mutex_unlock_iothread();
+
+return NULL;
 }
 
 QEMUFile *qemu_fopen_ops_buffered(void *opaque,
-  size_t bytes_per_sec,
-  BufferedPutFunc *put_buffer,
-  BufferedPutReadyFunc *put_ready,
-  BufferedWaitForUnfreezeFunc 
*wait_for_unfreeze,
-  BufferedCloseFunc *close)
+size_t bytes_per_sec,
+BufferedPutFunc *put_buffer,
+BufferedPutReadyFunc *put_ready,
+BufferedWaitForUnfreezeFunc *wait_for_unfreeze,
+BufferedCloseFunc *close)
 {
 QEMUFileBuffered *s;
 
@@ -267,15 +291,14 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 s-put_ready = put_ready;
 s-wait_for_unfreeze = wait_for_unfreeze;
 s-close = close;
+s-closed = 0;
 
 s-file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
  buffered_close, buffered_rate_limit

[Qemu-devel] [RFC PATCH v4 1/5] MRU ram list

2011-08-16 Thread Umesh Deshpande
This patch creates a new list of RAM blocks kept in MRU order, so that separate
locking rules can be applied to the regular RAM block list and the MRU list.
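
As a toy illustration of the move-to-front idea on a dedicated linkage
(plain C instead of the QLIST macros, with a hypothetical lookup() helper;
the real patch of course keeps the stable "next" linkage as well):

#include <stdio.h>
#include <string.h>

/* Toy "RAM block": a name plus the MRU link only. */
struct block {
    const char *name;
    struct block *next_mru;
};

static struct block *mru_head;

/* Find a block by name and move it to the front of the MRU list, so
 * repeated lookups of the same block stay O(1). */
static struct block *lookup(const char *name)
{
    struct block **prev = &mru_head;
    for (struct block *b = mru_head; b; prev = &b->next_mru, b = b->next_mru) {
        if (strcmp(b->name, name) == 0) {
            *prev = b->next_mru;      /* unlink */
            b->next_mru = mru_head;   /* relink at the head */
            mru_head = b;
            return b;
        }
    }
    return NULL;
}

int main(void)
{
    static struct block a = { "vga.ram", NULL }, b = { "pc.ram", NULL };
    a.next_mru = &b;
    mru_head = &a;

    lookup("pc.ram");                 /* pc.ram moves to the front */
    printf("front of MRU list: %s\n", mru_head->name);
    return 0;
}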

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpu-all.h |2 ++
 exec.c|   17 -
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index e839100..6b217a2 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -925,6 +925,7 @@ typedef struct RAMBlock {
 uint32_t flags;
 char idstr[256];
 QLIST_ENTRY(RAMBlock) next;
+QLIST_ENTRY(RAMBlock) next_mru;
 #if defined(__linux__)  !defined(TARGET_S390X)
 int fd;
 #endif
@@ -933,6 +934,7 @@ typedef struct RAMBlock {
 typedef struct RAMList {
 uint8_t *phys_dirty;
 QLIST_HEAD(ram, RAMBlock) blocks;
+QLIST_HEAD(, RAMBlock) blocks_mru;
 } RAMList;
 extern RAMList ram_list;
 
diff --git a/exec.c b/exec.c
index 0e2ce57..c5c247c 100644
--- a/exec.c
+++ b/exec.c
@@ -113,7 +113,11 @@ static uint8_t *code_gen_ptr;
 int phys_ram_fd;
 static int in_migration;
 
-RAMList ram_list = { .blocks = QLIST_HEAD_INITIALIZER(ram_list) };
+RAMList ram_list = {
+.blocks = QLIST_HEAD_INITIALIZER(ram_list),
+.blocks_mru = QLIST_HEAD_INITIALIZER(ram_list.blocks_mru)
+};
+
 #endif
 
 CPUState *first_cpu;
@@ -2973,6 +2977,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
const char *name,
 new_block-length = size;
 
 QLIST_INSERT_HEAD(ram_list.blocks, new_block, next);
+QLIST_INSERT_HEAD(ram_list.blocks_mru, new_block, next_mru);
 
 ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty,
last_ram_offset()  TARGET_PAGE_BITS);
@@ -2997,6 +3002,7 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
 QLIST_FOREACH(block, ram_list.blocks, next) {
 if (addr == block-offset) {
 QLIST_REMOVE(block, next);
+QLIST_REMOVE(block, next_mru);
 qemu_free(block);
 return;
 }
@@ -3010,6 +3016,7 @@ void qemu_ram_free(ram_addr_t addr)
 QLIST_FOREACH(block, ram_list.blocks, next) {
 if (addr == block-offset) {
 QLIST_REMOVE(block, next);
+QLIST_REMOVE(block, next_mru);
 if (block-flags  RAM_PREALLOC_MASK) {
 ;
 } else if (mem_path) {
@@ -3113,12 +3120,12 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
 {
 RAMBlock *block;
 
-QLIST_FOREACH(block, &ram_list.blocks, next) {
+QLIST_FOREACH(block, &ram_list.blocks_mru, next_mru) {
 if (addr - block->offset < block->length) {
 /* Move this entry to to start of the list.  */
 if (block != QLIST_FIRST(&ram_list.blocks)) {
-QLIST_REMOVE(block, next);
-QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
+QLIST_REMOVE(block, next_mru);
+QLIST_INSERT_HEAD(&ram_list.blocks_mru, block, next_mru);
 }
 if (xen_mapcache_enabled()) {
 /* We need to check if the requested address is in the RAM
@@ -3211,7 +3218,7 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t 
*ram_addr)
 return 0;
 }
 
-QLIST_FOREACH(block, ram_list.blocks, next) {
+QLIST_FOREACH(block, &ram_list.blocks_mru, next_mru) {
 /* This case append when the block is not mapped. */
 if (block->host == NULL) {
 continue;
-- 
1.7.4.1




[Qemu-devel] [RFC PATCH v4 2/5] ramlist mutex

2011-08-16 Thread Umesh Deshpande
The ramlist mutex is introduced to protect RAMBlock list traversal in the
migration thread against concurrent addition/removal of blocks by the iothread.
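
A minimal standalone sketch of the rule being added (plain pthreads instead
of the QEMU wrappers): the thread that walks the list and the thread that
inserts or removes blocks take the same mutex, so a block can never
disappear in the middle of a traversal.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct block { int id; struct block *next; };

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct block *blocks;

/* "iothread" side: hotplug adds a block under the lock. */
static void add_block(int id)
{
    struct block *b = malloc(sizeof(*b));
    b->id = id;
    pthread_mutex_lock(&list_lock);
    b->next = blocks;
    blocks = b;
    pthread_mutex_unlock(&list_lock);
}

/* "migration thread" side: walk the list under the same lock. */
static void *walker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&list_lock);
    for (struct block *b = blocks; b; b = b->next) {
        printf("visiting block %d\n", b->id);
    }
    pthread_mutex_unlock(&list_lock);
    return NULL;
}

int main(void)
{
    pthread_t th;
    add_block(1);
    add_block(2);
    pthread_create(&th, NULL, walker, NULL);
    add_block(3);                     /* safe even while the walker runs */
    pthread_join(th, NULL);
    return 0;
}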

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 cpu-all.h |2 ++
 exec.c|   19 +++
 qemu-common.h |2 ++
 3 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 6b217a2..eab9803 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -21,6 +21,7 @@
 
 #include qemu-common.h
 #include cpu-common.h
+#include qemu-thread.h
 
 /* some important defines:
  *
@@ -932,6 +933,7 @@ typedef struct RAMBlock {
 } RAMBlock;
 
 typedef struct RAMList {
+QemuMutex mutex;
 uint8_t *phys_dirty;
 QLIST_HEAD(ram, RAMBlock) blocks;
 QLIST_HEAD(, RAMBlock) blocks_mru;
diff --git a/exec.c b/exec.c
index c5c247c..404d8ea 100644
--- a/exec.c
+++ b/exec.c
@@ -582,6 +582,7 @@ void cpu_exec_init_all(unsigned long tb_size)
 code_gen_alloc(tb_size);
 code_gen_ptr = code_gen_buffer;
 page_init();
+qemu_mutex_init(ram_list.mutex);
 #if !defined(CONFIG_USER_ONLY)
 io_mem_init();
 #endif
@@ -2802,6 +2803,16 @@ static long gethugepagesize(const char *path)
 return fs.f_bsize;
 }
 
+void qemu_mutex_lock_ramlist(void)
+{
+qemu_mutex_lock(ram_list.mutex);
+}
+
+void qemu_mutex_unlock_ramlist(void)
+{
+qemu_mutex_unlock(ram_list.mutex);
+}
+
 static void *file_ram_alloc(RAMBlock *block,
 ram_addr_t memory,
 const char *path)
@@ -2976,6 +2987,8 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
const char *name,
 }
 new_block-length = size;
 
+qemu_mutex_lock_ramlist();
+
 QLIST_INSERT_HEAD(ram_list.blocks, new_block, next);
 QLIST_INSERT_HEAD(ram_list.blocks_mru, new_block, next_mru);
 
@@ -2984,6 +2997,8 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
const char *name,
 memset(ram_list.phys_dirty + (new_block-offset  TARGET_PAGE_BITS),
0xff, size  TARGET_PAGE_BITS);
 
+qemu_mutex_unlock_ramlist();
+
 if (kvm_enabled())
 kvm_setup_guest_memory(new_block-host, size);
 
@@ -3001,8 +3016,10 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
 
 QLIST_FOREACH(block, ram_list.blocks, next) {
 if (addr == block-offset) {
+qemu_mutex_lock_ramlist();
 QLIST_REMOVE(block, next);
 QLIST_REMOVE(block, next_mru);
+qemu_mutex_unlock_ramlist();
 qemu_free(block);
 return;
 }
@@ -3015,8 +3032,10 @@ void qemu_ram_free(ram_addr_t addr)
 
 QLIST_FOREACH(block, ram_list.blocks, next) {
 if (addr == block-offset) {
+qemu_mutex_lock_ramlist();
 QLIST_REMOVE(block, next);
 QLIST_REMOVE(block, next_mru);
+qemu_mutex_unlock_ramlist();
 if (block-flags  RAM_PREALLOC_MASK) {
 ;
 } else if (mem_path) {
diff --git a/qemu-common.h b/qemu-common.h
index abd7a75..b802883 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -212,6 +212,8 @@ char *qemu_strndup(const char *str, size_t size);
 
 void qemu_mutex_lock_iothread(void);
 void qemu_mutex_unlock_iothread(void);
+void qemu_mutex_lock_ramlist(void);
+void qemu_mutex_unlock_ramlist(void);
 
 int qemu_open(const char *name, int flags, ...);
 ssize_t qemu_write_full(int fd, const void *buf, size_t count)
-- 
1.7.4.1




[Qemu-devel] [RFC PATCH v3 5/5] Making iothread block for migrate_cancel

2011-08-16 Thread Umesh Deshpande
The following patch makes the iothread wait until the migration thread responds to the
migrate_cancel request and terminates its execution.
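
In plain pthread terms the change amounts to something like this standalone
sketch (not the actual QEMU code): after the cancelled state is set, the
caller joins the worker thread, so by the time migrate_cancel returns the
migration code has really stopped touching its buffers.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

static atomic_int cancelled;

static void *migration_thread(void *arg)
{
    (void)arg;
    while (!atomic_load(&cancelled)) {
        usleep(100 * 1000);           /* pretend to send one chunk */
    }
    return NULL;                      /* clean up buffers here */
}

/* Synchronous cancel: submit the request, then wait for the worker. */
static void cancel_migration(pthread_t th)
{
    atomic_store(&cancelled, 1);
    pthread_join(th, NULL);           /* blocks until the thread exits */
    printf("migration fully cancelled\n");
}

int main(void)
{
    pthread_t th;
    pthread_create(&th, NULL, migration_thread, NULL);
    cancel_migration(th);
    return 0;
}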

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 buffered_file.c |   13 -
 buffered_file.h |3 +++
 hw/hw.h |5 -
 migration.c |1 +
 qemu-thread-posix.c |   10 ++
 qemu-thread.h   |1 +
 savevm.c|   31 +--
 7 files changed, 52 insertions(+), 12 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index bdcdf42..405b17f 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -223,6 +223,16 @@ static int64_t buffered_get_rate_limit(void *opaque)
 return s-xfer_limit;
 }
 
+static void buffered_wait_for_cancel(void *opaque)
+{
+QEMUFileBuffered *s = opaque;
+QemuThread thread = s-thread;
+
+qemu_mutex_unlock_iothread();
+qemu_thread_join(thread);
+qemu_mutex_lock_iothread();
+}
+
 static void *migrate_vm(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
@@ -296,7 +306,8 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 s-file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
  buffered_close, buffered_rate_limit,
  buffered_set_rate_limit,
- buffered_get_rate_limit);
+ buffered_get_rate_limit,
+ buffered_wait_for_cancel);
 
 qemu_thread_create(s-thread, migrate_vm, s);
 
diff --git a/buffered_file.h b/buffered_file.h
index 98d358b..413cc9f 100644
--- a/buffered_file.h
+++ b/buffered_file.h
@@ -20,6 +20,9 @@ typedef ssize_t (BufferedPutFunc)(void *opaque, const void 
*data, size_t size);
 typedef void (BufferedPutReadyFunc)(void *opaque);
 typedef void (BufferedWaitForUnfreezeFunc)(void *opaque);
 typedef int (BufferedCloseFunc)(void *opaque);
+typedef void (BufferedWaitForCancelFunc)(void *opaque);
+
+void wait_for_cancel(void *opaque);
 
 QEMUFile *qemu_fopen_ops_buffered(void *opaque, size_t xfer_limit,
   BufferedPutFunc *put_buffer,
diff --git a/hw/hw.h b/hw/hw.h
index 9dd7096..e1d5ea8 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -41,13 +41,15 @@ typedef int (QEMUFileRateLimit)(void *opaque);
  */
 typedef int64_t (QEMUFileSetRateLimit)(void *opaque, int64_t new_rate);
 typedef int64_t (QEMUFileGetRateLimit)(void *opaque);
+typedef void (QEMUFileWaitForCancel)(void *opaque);
 
 QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
  QEMUFileGetBufferFunc *get_buffer,
  QEMUFileCloseFunc *close,
  QEMUFileRateLimit *rate_limit,
  QEMUFileSetRateLimit *set_rate_limit,
-QEMUFileGetRateLimit *get_rate_limit);
+ QEMUFileGetRateLimit *get_rate_limit,
+ QEMUFileWaitForCancel *wait_for_cancel);
 QEMUFile *qemu_fopen(const char *filename, const char *mode);
 QEMUFile *qemu_fdopen(int fd, const char *mode);
 QEMUFile *qemu_fopen_socket(int fd);
@@ -56,6 +58,7 @@ QEMUFile *qemu_popen_cmd(const char *command, const char 
*mode);
 int qemu_stdio_fd(QEMUFile *f);
 void qemu_fflush(QEMUFile *f);
 int qemu_fclose(QEMUFile *f);
+void qemu_wait_for_cancel(QEMUFile *f);
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
 void qemu_put_byte(QEMUFile *f, int v);
 
diff --git a/migration.c b/migration.c
index b6ba690..0c5a484 100644
--- a/migration.c
+++ b/migration.c
@@ -423,6 +423,7 @@ void migrate_fd_cancel(MigrationState *mig_state)
 DPRINTF(cancelling migration\n);
 
 s-state = MIG_STATE_CANCELLED;
+qemu_wait_for_cancel(s-file);
 }
 
 void migrate_fd_release(MigrationState *mig_state)
diff --git a/qemu-thread-posix.c b/qemu-thread-posix.c
index 2bd02ef..0d18b35 100644
--- a/qemu-thread-posix.c
+++ b/qemu-thread-posix.c
@@ -115,6 +115,16 @@ void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex)
 error_exit(err, __func__);
 }
 
+void qemu_thread_join(QemuThread thread)
+{
+int err;
+
+err = pthread_join(thread.thread, NULL);
+if (err) {
+error_exit(err, __func__);
+}
+}
+
 void qemu_thread_create(QemuThread *thread,
void *(*start_routine)(void*),
void *arg)
diff --git a/qemu-thread.h b/qemu-thread.h
index 0a73d50..909529f 100644
--- a/qemu-thread.h
+++ b/qemu-thread.h
@@ -30,6 +30,7 @@ void qemu_cond_destroy(QemuCond *cond);
 void qemu_cond_signal(QemuCond *cond);
 void qemu_cond_broadcast(QemuCond *cond);
 void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex);
+void qemu_thread_join(QemuThread thread);
 
 void qemu_thread_create(QemuThread *thread,
void *(*start_routine)(void*),
diff --git a/savevm.c b/savevm.c
index f54f555..8003411 100644
--- a/savevm.c
+++ b/savevm.c
@@ -164,6 +164,7 @@ struct QEMUFile {
 QEMUFileRateLimit *rate_limit

Re: [RFC PATCH v3 3/4] lock to protect memslots

2011-08-15 Thread Umesh Deshpande

On 08/12/2011 02:45 AM, Paolo Bonzini wrote:

On 08/11/2011 06:20 PM, Paolo Bonzini wrote:


+qemu_mutex_lock_ramlist();
  QLIST_REMOVE(block, next);
  QLIST_INSERT_HEAD(ram_list.blocks, block, next);
+qemu_mutex_unlock_ramlist();


Theoretically qemu_get_ram_ptr should be protected.  The problem is not
just accessing the ramlist, it is accessing the data underneath it
before anyone frees it.  Luckily we can set aside that problem for now,
because qemu_ram_free_from_ptr is only used by device assignment and
device assignment makes VMs unmigratable.


Hmm, rethinking about it, all the loops in exec.c should be protected
by the mutex.
Other loops in exec.c are just for reading the ram_list members, and the 
migration thread doesn't modify ram_list.
Also, protecting the loops in exec.c would make those functions 
un-callable from the functions that are already holding the ram_list 
mutex to protect themselves against memslot removal (migration thread in 
our case).
That's not too good because qemu_get_ram_ptr is a hot path for TCG. 
Looks like qemu_get_ram_ptr isn't called from the source side code of 
guest migration.
Perhaps you can also avoid the mutex entirely, and just disable the 
above optimization for most-recently-used-block while migration is 
running.  It's not a complete solution, but it could be good enough 
until we have RAM hot-plug/hot-unplug.


Paolo


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3 3/4] lock to protect memslots

2011-08-15 Thread Umesh Deshpande

On 08/15/2011 10:14 AM, Paolo Bonzini wrote:

On 08/15/2011 12:26 AM, Marcelo Tosatti wrote:

Actually the previous patchset does not traverse the ramlist without
qemu_mutex locked, which is safe versus the most-recently-used-block
optimization.

Actually it does:

  bytes_transferred_last = bytes_transferred;
  bwidth = qemu_get_clock_ns(rt_clock);

+if (stage != 3) {
+qemu_mutex_lock_ramlist();
+qemu_mutex_unlock_iothread();
+}
+
  while (!qemu_file_rate_limit(f)) {
  int bytes_sent;

  /* ram_save_block does traverse memory.  */
  bytes_sent = ram_save_block(f);
  bytes_transferred += bytes_sent;
  if (bytes_sent == 0) { /* no more blocks */
  break;
  }
  }

+if (stage != 3) {
+qemu_mutex_lock_iothread();
+qemu_mutex_unlock_ramlist();
+}
+
  bwidth = qemu_get_clock_ns(rt_clock) - bwidth;
  bwidth = (bytes_transferred - bytes_transferred_last) / bwidth;


What Umesh is doing is using either ramlist mutex or iothread mutex when 
reading
the ramlist, and both when writing the ramlist; similar to rwlocks done with a
regular mutex per CPU---clever!  So this:

+qemu_mutex_lock_ramlist();
  QLIST_REMOVE(block, next);
  QLIST_INSERT_HEAD(ram_list.blocks, block, next);
+qemu_mutex_unlock_ramlist();

is effectively upgrading the lock from read-side to write-side, assuming that
qemu_get_ram_ptr is never called from the migration thread (which is true).

However, I propose that you put the MRU order in a separate list.  You would 
still
need two locks: the IO thread lock to protect the new list, a new lock to 
protect
the other fields in the ram_list.  For simplicity you may skip the new lock if 
you
assume that the migration and I/O threads never modify the list concurrently,
which is true.
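
Spelled out as a standalone sketch (plain pthreads, not the QEMU code):
each reader takes one of the two mutexes, a writer takes both, so a writer
excludes every reader while the two readers never block each other.

#include <pthread.h>
#include <stdio.h>

/* Two ordinary mutexes emulating a two-reader rwlock: one reader owns
 * iothread_lock, the other owns ramlist_lock, a writer must hold both. */
static pthread_mutex_t iothread_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t ramlist_lock  = PTHREAD_MUTEX_INITIALIZER;
static int list_version;

/* Migration-side reader: only needs ramlist_lock. */
static void *migration_reader(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&ramlist_lock);
    printf("migration sees list version %d\n", list_version);
    pthread_mutex_unlock(&ramlist_lock);
    return NULL;
}

/* Writer (e.g. inserting or removing a block): takes both locks. */
static void writer(void)
{
    pthread_mutex_lock(&iothread_lock);
    pthread_mutex_lock(&ramlist_lock);
    list_version++;
    pthread_mutex_unlock(&ramlist_lock);
    pthread_mutex_unlock(&iothread_lock);
}

int main(void)
{
    pthread_t th;
    pthread_create(&th, NULL, migration_reader, NULL);
    writer();
    pthread_join(th, NULL);
    return 0;
}
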
Yes, the mru list patch would obviate the need of holding the ram_list 
mutex in qemu_get_ram_ptr.
Also, I was planning to protect the whole migration thread with iothread 
mutex, and ram_list mutex. (i.e. holding ram_list mutex while sleeping 
between two iterations, when we release iothread mutex). This will 
prevent the memslot block removal altogether during the migration. Do 
you see any problem with this?



And more importantly, the MRU and migration code absolutely do not
affect each other, because indeed the migration thread does not do MRU accesses.
See the attachment for an untested patch.

Paolo

Thanks
Umesh
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v3 0/4] Separate thread for VM migration

2011-08-11 Thread Umesh Deshpande
Following patch series deals with VCPU and iothread starvation during the
migration of
a guest. Currently the iothread is responsible for performing the guest
migration. It holds qemu_mutex during the migration and doesn't allow VCPU to
enter the qemu mode and delays its return to the guest. The guest migration,
executed as an iohandler also delays the execution of other iohandlers.
In the following patch series,

The migration has been moved to a separate thread to
reduce the qemu_mutex contention and iohandler starvation.

Umesh Deshpande (4):
  separate thread for VM migration
  synchronous migrate_cancel
  lock to protect memslots
  separate migration bitmap

 arch_init.c |   26 ++
 buffered_file.c |  100 +--
 buffered_file.h |4 ++
 cpu-all.h   |   39 
 cpus.c  |   12 ++
 exec.c  |   74 +
 hw/hw.h |5 ++-
 migration.c |   50 --
 migration.h |6 +++
 qemu-common.h   |2 +
 qemu-thread-posix.c |   10 +
 qemu-thread.h   |1 +
 savevm.c|   31 +++-
 13 files changed, 280 insertions(+), 80 deletions(-)

-- 
1.7.4.1

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v3 1/4] separate thread for VM migration

2011-08-11 Thread Umesh Deshpande
This patch creates a separate thread for the guest migration on the source side.
migrate_cancel request from the iothread is handled asynchronously. That is,
iothread submits migrate_cancel to the migration thread and returns, while the
migration thread attends this request at the next iteration to terminate its
execution.

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 buffered_file.c |   85 --
 buffered_file.h |4 ++
 migration.c |   49 ++-
 migration.h |6 
 4 files changed, 82 insertions(+), 62 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 41b42c3..19932b6 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -16,6 +16,8 @@
 #include qemu-timer.h
 #include qemu-char.h
 #include buffered_file.h
+#include migration.h
+#include qemu-thread.h
 
 //#define DEBUG_BUFFERED_FILE
 
@@ -28,13 +30,14 @@ typedef struct QEMUFileBuffered
 void *opaque;
 QEMUFile *file;
 int has_error;
+int closed;
 int freeze_output;
 size_t bytes_xfer;
 size_t xfer_limit;
 uint8_t *buffer;
 size_t buffer_size;
 size_t buffer_capacity;
-QEMUTimer *timer;
+QemuThread thread;
 } QEMUFileBuffered;
 
 #ifdef DEBUG_BUFFERED_FILE
@@ -155,14 +158,6 @@ static int buffered_put_buffer(void *opaque, const uint8_t 
*buf, int64_t pos, in
 offset = size;
 }
 
-if (pos == 0  size == 0) {
-DPRINTF(file is ready\n);
-if (s-bytes_xfer = s-xfer_limit) {
-DPRINTF(notifying client\n);
-s-put_ready(s-opaque);
-}
-}
-
 return offset;
 }
 
@@ -175,20 +170,20 @@ static int buffered_close(void *opaque)
 
 while (!s-has_error  s-buffer_size) {
 buffered_flush(s);
-if (s-freeze_output)
+if (s-freeze_output) {
 s-wait_for_unfreeze(s);
+}
 }
 
-ret = s-close(s-opaque);
+s-closed = 1;
 
-qemu_del_timer(s-timer);
-qemu_free_timer(s-timer);
+ret = s-close(s-opaque);
 qemu_free(s-buffer);
-qemu_free(s);
 
 return ret;
 }
 
+
 static int buffered_rate_limit(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
@@ -228,34 +223,55 @@ static int64_t buffered_get_rate_limit(void *opaque)
 return s-xfer_limit;
 }
 
-static void buffered_rate_tick(void *opaque)
+static void *migrate_vm(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
+int64_t current_time, expire_time = qemu_get_clock_ms(rt_clock) + 100;
+struct timeval tv = { .tv_sec = 0, .tv_usec = 10};
 
-if (s-has_error) {
-buffered_close(s);
-return;
-}
+qemu_mutex_lock_iothread();
 
-qemu_mod_timer(s-timer, qemu_get_clock_ms(rt_clock) + 100);
+while (!s-closed) {
+if (s-freeze_output) {
+s-wait_for_unfreeze(s);
+s-freeze_output = 0;
+continue;
+}
 
-if (s-freeze_output)
-return;
+if (s-has_error) {
+break;
+}
+
+current_time = qemu_get_clock_ms(rt_clock);
+if (!s-closed  (expire_time  current_time)) {
+tv.tv_usec = 1000 * (expire_time - current_time);
+select(0, NULL, NULL, NULL, tv);
+continue;
+}
 
-s-bytes_xfer = 0;
+s-bytes_xfer = 0;
+buffered_flush(s);
 
-buffered_flush(s);
+expire_time = qemu_get_clock_ms(rt_clock) + 100;
+s-put_ready(s-opaque);
+}
 
-/* Add some checks around this */
-s-put_ready(s-opaque);
+if (s-has_error) {
+buffered_close(s);
+}
+qemu_free(s);
+
+qemu_mutex_unlock_iothread();
+
+return NULL;
 }
 
 QEMUFile *qemu_fopen_ops_buffered(void *opaque,
-  size_t bytes_per_sec,
-  BufferedPutFunc *put_buffer,
-  BufferedPutReadyFunc *put_ready,
-  BufferedWaitForUnfreezeFunc 
*wait_for_unfreeze,
-  BufferedCloseFunc *close)
+size_t bytes_per_sec,
+BufferedPutFunc *put_buffer,
+BufferedPutReadyFunc *put_ready,
+BufferedWaitForUnfreezeFunc *wait_for_unfreeze,
+BufferedCloseFunc *close)
 {
 QEMUFileBuffered *s;
 
@@ -267,15 +283,14 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 s-put_ready = put_ready;
 s-wait_for_unfreeze = wait_for_unfreeze;
 s-close = close;
+s-closed = 0;
 
 s-file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
  buffered_close, buffered_rate_limit,
  buffered_set_rate_limit,
-buffered_get_rate_limit);
-
-s-timer = qemu_new_timer_ms(rt_clock, buffered_rate_tick, s);
+ buffered_get_rate_limit);
 
-qemu_mod_timer(s-timer, qemu_get_clock_ms(rt_clock) + 100);
+qemu_thread_create(s-thread, migrate_vm, s);
 
 return s-file

[RFC PATCH v3 3/4] lock to protect memslots

2011-08-11 Thread Umesh Deshpande
Following patch introduces a mutex to protect the migration thread against the
removal of memslots during the guest migration iteration.

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c |   10 ++
 buffered_file.c |4 
 cpu-all.h   |2 ++
 cpus.c  |   12 
 exec.c  |   10 ++
 qemu-common.h   |2 ++
 6 files changed, 40 insertions(+), 0 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 484b39d..f0ddda6 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -298,6 +298,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 bytes_transferred_last = bytes_transferred;
 bwidth = qemu_get_clock_ns(rt_clock);
 
+if (stage != 3) {
+qemu_mutex_lock_ramlist();
+qemu_mutex_unlock_iothread();
+}
+
 while (!qemu_file_rate_limit(f)) {
 int bytes_sent;
 
@@ -308,6 +313,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 }
 }
 
+if (stage != 3) {
+qemu_mutex_lock_iothread();
+qemu_mutex_unlock_ramlist();
+}
+
 bwidth = qemu_get_clock_ns(rt_clock) - bwidth;
 bwidth = (bytes_transferred - bytes_transferred_last) / bwidth;
 
diff --git a/buffered_file.c b/buffered_file.c
index b64ada7..5735e18 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -243,7 +243,9 @@ static void *migrate_vm(void *opaque)
 
 while (!s-closed) {
 if (s-freeze_output) {
+qemu_mutex_unlock_iothread();
 s-wait_for_unfreeze(s);
+qemu_mutex_lock_iothread();
 s-freeze_output = 0;
 continue;
 }
@@ -255,7 +257,9 @@ static void *migrate_vm(void *opaque)
 current_time = qemu_get_clock_ms(rt_clock);
 if (!s-closed  (expire_time  current_time)) {
 tv.tv_usec = 1000 * (expire_time - current_time);
+qemu_mutex_unlock_iothread();
 select(0, NULL, NULL, NULL, tv);
+qemu_mutex_lock_iothread();
 continue;
 }
 
diff --git a/cpu-all.h b/cpu-all.h
index e839100..6a5dbb3 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -21,6 +21,7 @@
 
 #include qemu-common.h
 #include cpu-common.h
+#include qemu-thread.h
 
 /* some important defines:
  *
@@ -931,6 +932,7 @@ typedef struct RAMBlock {
 } RAMBlock;
 
 typedef struct RAMList {
+QemuMutex mutex;
 uint8_t *phys_dirty;
 QLIST_HEAD(ram, RAMBlock) blocks;
 } RAMList;
diff --git a/cpus.c b/cpus.c
index de70e02..6090c44 100644
--- a/cpus.c
+++ b/cpus.c
@@ -666,6 +666,7 @@ int qemu_init_main_loop(void)
 qemu_cond_init(qemu_work_cond);
 qemu_mutex_init(qemu_fair_mutex);
 qemu_mutex_init(qemu_global_mutex);
+qemu_mutex_init(ram_list.mutex);
 qemu_mutex_lock(qemu_global_mutex);
 
 qemu_thread_get_self(io_thread);
@@ -919,6 +920,17 @@ void qemu_mutex_unlock_iothread(void)
 qemu_mutex_unlock(qemu_global_mutex);
 }
 
+void qemu_mutex_lock_ramlist(void)
+{
+qemu_mutex_lock(ram_list.mutex);
+}
+
+void qemu_mutex_unlock_ramlist(void)
+{
+qemu_mutex_unlock(ram_list.mutex);
+}
+
+
 static int all_vcpus_paused(void)
 {
 CPUState *penv = first_cpu;
diff --git a/exec.c b/exec.c
index 0e2ce57..7bfb36f 100644
--- a/exec.c
+++ b/exec.c
@@ -2972,6 +2972,8 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
const char *name,
 }
 new_block-length = size;
 
+qemu_mutex_lock_ramlist();
+
 QLIST_INSERT_HEAD(ram_list.blocks, new_block, next);
 
 ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty,
@@ -2979,6 +2981,8 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
const char *name,
 memset(ram_list.phys_dirty + (new_block-offset  TARGET_PAGE_BITS),
0xff, size  TARGET_PAGE_BITS);
 
+qemu_mutex_unlock_ramlist();
+
 if (kvm_enabled())
 kvm_setup_guest_memory(new_block-host, size);
 
@@ -2996,7 +3000,9 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
 
 QLIST_FOREACH(block, ram_list.blocks, next) {
 if (addr == block-offset) {
+qemu_mutex_lock_ramlist();
 QLIST_REMOVE(block, next);
+qemu_mutex_unlock_ramlist();
 qemu_free(block);
 return;
 }
@@ -3009,7 +3015,9 @@ void qemu_ram_free(ram_addr_t addr)
 
 QLIST_FOREACH(block, ram_list.blocks, next) {
 if (addr == block-offset) {
+qemu_mutex_lock_ramlist();
 QLIST_REMOVE(block, next);
+qemu_mutex_unlock_ramlist();
 if (block-flags  RAM_PREALLOC_MASK) {
 ;
 } else if (mem_path) {
@@ -3117,8 +3125,10 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
 if (addr - block-offset  block-length) {
 /* Move this entry to to start of the list.  */
 if (block != QLIST_FIRST(ram_list.blocks)) {
+qemu_mutex_lock_ramlist();
 QLIST_REMOVE(block, next);
 QLIST_INSERT_HEAD(ram_list.blocks, block, next

[RFC PATCH v3 2/4] Making iothread block for migrate_cancel

2011-08-11 Thread Umesh Deshpande
Following patch makes iothread wait until the migration thread responds to the
migrate_cancel request and terminates its execution.

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 buffered_file.c |   13 -
 hw/hw.h |5 -
 migration.c |1 +
 qemu-thread-posix.c |   10 ++
 qemu-thread.h   |1 +
 savevm.c|   31 +--
 6 files changed, 49 insertions(+), 12 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 19932b6..b64ada7 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -223,6 +223,16 @@ static int64_t buffered_get_rate_limit(void *opaque)
 return s-xfer_limit;
 }
 
+static void buffered_wait_for_cancel(void *opaque)
+{
+QEMUFileBuffered *s = opaque;
+QemuThread thread = s-thread;
+
+qemu_mutex_unlock_iothread();
+qemu_thread_join(thread);
+qemu_mutex_lock_iothread();
+}
+
 static void *migrate_vm(void *opaque)
 {
 QEMUFileBuffered *s = opaque;
@@ -288,7 +298,8 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 s-file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
  buffered_close, buffered_rate_limit,
  buffered_set_rate_limit,
- buffered_get_rate_limit);
+ buffered_get_rate_limit,
+ buffered_wait_for_cancel);
 
 qemu_thread_create(s-thread, migrate_vm, s);
 
diff --git a/hw/hw.h b/hw/hw.h
index 9dd7096..e1d5ea8 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -41,13 +41,15 @@ typedef int (QEMUFileRateLimit)(void *opaque);
  */
 typedef int64_t (QEMUFileSetRateLimit)(void *opaque, int64_t new_rate);
 typedef int64_t (QEMUFileGetRateLimit)(void *opaque);
+typedef void (QEMUFileWaitForCancel)(void *opaque);
 
 QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
  QEMUFileGetBufferFunc *get_buffer,
  QEMUFileCloseFunc *close,
  QEMUFileRateLimit *rate_limit,
  QEMUFileSetRateLimit *set_rate_limit,
-QEMUFileGetRateLimit *get_rate_limit);
+ QEMUFileGetRateLimit *get_rate_limit,
+ QEMUFileWaitForCancel *wait_for_cancel);
 QEMUFile *qemu_fopen(const char *filename, const char *mode);
 QEMUFile *qemu_fdopen(int fd, const char *mode);
 QEMUFile *qemu_fopen_socket(int fd);
@@ -56,6 +58,7 @@ QEMUFile *qemu_popen_cmd(const char *command, const char 
*mode);
 int qemu_stdio_fd(QEMUFile *f);
 void qemu_fflush(QEMUFile *f);
 int qemu_fclose(QEMUFile *f);
+void qemu_wait_for_cancel(QEMUFile *f);
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
 void qemu_put_byte(QEMUFile *f, int v);
 
diff --git a/migration.c b/migration.c
index d8a0abb..c19a206 100644
--- a/migration.c
+++ b/migration.c
@@ -407,6 +407,7 @@ void migrate_fd_cancel(MigrationState *mig_state)
 DPRINTF(cancelling migration\n);
 
 s-state = MIG_STATE_CANCELLED;
+qemu_wait_for_cancel(s-file);
 }
 
 void migrate_fd_terminate(FdMigrationState *s)
diff --git a/qemu-thread-posix.c b/qemu-thread-posix.c
index 2bd02ef..0d18b35 100644
--- a/qemu-thread-posix.c
+++ b/qemu-thread-posix.c
@@ -115,6 +115,16 @@ void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex)
 error_exit(err, __func__);
 }
 
+void qemu_thread_join(QemuThread thread)
+{
+int err;
+
+err = pthread_join(thread.thread, NULL);
+if (err) {
+error_exit(err, __func__);
+}
+}
+
 void qemu_thread_create(QemuThread *thread,
void *(*start_routine)(void*),
void *arg)
diff --git a/qemu-thread.h b/qemu-thread.h
index 0a73d50..909529f 100644
--- a/qemu-thread.h
+++ b/qemu-thread.h
@@ -30,6 +30,7 @@ void qemu_cond_destroy(QemuCond *cond);
 void qemu_cond_signal(QemuCond *cond);
 void qemu_cond_broadcast(QemuCond *cond);
 void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex);
+void qemu_thread_join(QemuThread thread);
 
 void qemu_thread_create(QemuThread *thread,
void *(*start_routine)(void*),
diff --git a/savevm.c b/savevm.c
index 8139bc7..6bebf7e 100644
--- a/savevm.c
+++ b/savevm.c
@@ -164,6 +164,7 @@ struct QEMUFile {
 QEMUFileRateLimit *rate_limit;
 QEMUFileSetRateLimit *set_rate_limit;
 QEMUFileGetRateLimit *get_rate_limit;
+QEMUFileWaitForCancel *wait_for_cancel;
 void *opaque;
 int is_write;
 
@@ -261,10 +262,10 @@ QEMUFile *qemu_popen(FILE *stdio_file, const char *mode)
 
 if(mode[0] == 'r') {
 s-file = qemu_fopen_ops(s, NULL, stdio_get_buffer, stdio_pclose, 
-NULL, NULL, NULL);
+NULL, NULL, NULL, NULL);
 } else {
 s-file = qemu_fopen_ops(s, stdio_put_buffer, NULL, stdio_pclose, 
-NULL, NULL, NULL);
+NULL, NULL, NULL, NULL);
 }
 return s-file

[RFC PATCH v3 4/4] Separate migration bitmap

2011-08-11 Thread Umesh Deshpande
This patch creates a migration bitmap, which is periodically kept in sync with
the qemu bitmap. A separate copy of the dirty bitmap for the migration avoids
concurrent access to the qemu bitmap from iothread and migration thread.

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c |   16 --
 cpu-all.h   |   37 ++
 exec.c  |   64 +++
 3 files changed, 110 insertions(+), 7 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index f0ddda6..296b7d6 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -123,13 +123,13 @@ static int ram_save_block(QEMUFile *f)
 current_addr = block-offset + offset;
 
 do {
-if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) 
{
+if (migration_bitmap_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
 uint8_t *p;
 int cont = (block == last_block) ? RAM_SAVE_FLAG_CONTINUE : 0;
 
-cpu_physical_memory_reset_dirty(current_addr,
-current_addr + TARGET_PAGE_SIZE,
-MIGRATION_DIRTY_FLAG);
+migration_bitmap_reset_dirty(current_addr,
+ current_addr + TARGET_PAGE_SIZE,
+ MIGRATION_DIRTY_FLAG);
 
 p = block-host + offset;
 
@@ -185,7 +185,7 @@ static ram_addr_t ram_save_remaining(void)
 ram_addr_t addr;
 for (addr = block-offset; addr  block-offset + block-length;
  addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
+if (migration_bitmap_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
 count++;
 }
 }
@@ -265,6 +265,8 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 return 0;
 }
 
+sync_migration_bitmap(0, TARGET_PHYS_ADDR_MAX);
+
 if (stage == 1) {
 RAMBlock *block;
 bytes_transferred = 0;
@@ -276,9 +278,9 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 QLIST_FOREACH(block, ram_list.blocks, next) {
 for (addr = block-offset; addr  block-offset + block-length;
  addr += TARGET_PAGE_SIZE) {
-if (!cpu_physical_memory_get_dirty(addr,
+if (!migration_bitmap_get_dirty(addr,
MIGRATION_DIRTY_FLAG)) {
-cpu_physical_memory_set_dirty(addr);
+migration_bitmap_set_dirty(addr);
 }
 }
 }
diff --git a/cpu-all.h b/cpu-all.h
index 6a5dbb3..34a225b 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -934,6 +934,7 @@ typedef struct RAMBlock {
 typedef struct RAMList {
 QemuMutex mutex;
 uint8_t *phys_dirty;
+uint8_t *migration_bitmap;
 QLIST_HEAD(ram, RAMBlock) blocks;
 } RAMList;
 extern RAMList ram_list;
@@ -1006,8 +1007,44 @@ static inline void 
cpu_physical_memory_mask_dirty_range(ram_addr_t start,
 }
 }
 
+
+
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
  int dirty_flags);
+
+static inline int migration_bitmap_get_dirty(ram_addr_t addr,
+ int dirty_flags)
+{
+return ram_list.migration_bitmap[addr >> TARGET_PAGE_BITS] & dirty_flags;
+}
+
+static inline void migration_bitmap_set_dirty(ram_addr_t addr)
+{
+ram_list.migration_bitmap[addr >> TARGET_PAGE_BITS] = 0xff;
+}
+
+static inline void migration_bitmap_mask_dirty_range(ram_addr_t start,
+ int length,
+ int dirty_flags)
+{
+int i, mask, len;
+uint8_t *p;
+
+len = length >> TARGET_PAGE_BITS;
+mask = ~dirty_flags;
+p = ram_list.migration_bitmap + (start >> TARGET_PAGE_BITS);
+for (i = 0; i < len; i++) {
+p[i] &= mask;
+}
+}
+
+
+void migration_bitmap_reset_dirty(ram_addr_t start,
+  ram_addr_t end,
+  int dirty_flags);
+
+void sync_migration_bitmap(ram_addr_t start, ram_addr_t end);
+
 void cpu_tlb_update_dirty(CPUState *env);
 
 int cpu_physical_memory_set_dirty_tracking(int enable);
diff --git a/exec.c b/exec.c
index 7bfb36f..f758c4e 100644
--- a/exec.c
+++ b/exec.c
@@ -2106,6 +2106,10 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, 
ram_addr_t end,
 abort();
 }
 
+if (kvm_enabled()) {
+return;
+}
+
 for(env = first_cpu; env != NULL; env = env-next_cpu) {
 int mmu_idx;
 for (mmu_idx = 0; mmu_idx  NB_MMU_MODES; mmu_idx++) {
@@ -2114,8 +2118,61 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, 
ram_addr_t end,
   start1, length);
 }
 }
+
+}
+
+void

Re: [RFC PATCH v3 1/4] separate thread for VM migration

2011-08-11 Thread Umesh Deshpande

On 08/11/2011 12:18 PM, Paolo Bonzini wrote:

@@ -175,20 +170,20 @@ static int buffered_close(void *opaque)

  while (!s->has_error && s->buffer_size) {
  buffered_flush(s);
-if (s->freeze_output)
+if (s->freeze_output) {
  s->wait_for_unfreeze(s);
+}
  }


This is racy; you might end up calling buffered_put_buffer twice from 
two different threads.
Now, migrate_fd_cleanup and buffered_close are only executed by the
migration thread.
I am not letting the iothread call any migration-cancellation-related
functions. Instead it just submits the request and waits for the
migration thread to terminate itself in the next iteration.
The reason is to avoid calls to qemu_fflush and
qemu_savevm_state_cancel (to carry out migrate_cancel) from the iothread
while the migration thread is transferring data without holding the locks.





-ret = s->close(s->opaque);
+s->closed = 1;

-qemu_del_timer(s->timer);
-qemu_free_timer(s->timer);
+ret = s->close(s->opaque);
  qemu_free(s->buffer);
-qemu_free(s);


... similarly, here the migration thread might end up using the
buffer.  Just set s->closed here and wait for thread completion; the
migration thread can handle the flushes, free the buffer, etc.  Let the
migration thread do as much as possible, it will simplify your life.



  return ret;
  }

+
  static int buffered_rate_limit(void *opaque)
  {
  QEMUFileBuffered *s = opaque;
@@ -228,34 +223,55 @@ static int64_t buffered_get_rate_limit(void 
*opaque)

  return s-xfer_limit;
  }

-static void buffered_rate_tick(void *opaque)
+static void *migrate_vm(void *opaque)
  {
  QEMUFileBuffered *s = opaque;
+int64_t current_time, expire_time = qemu_get_clock_ms(rt_clock) 
+ 100;

+struct timeval tv = { .tv_sec = 0, .tv_usec = 10};

-if (s->has_error) {
-buffered_close(s);
-return;
-}
+qemu_mutex_lock_iothread();

-qemu_mod_timer(s->timer, qemu_get_clock_ms(rt_clock) + 100);
+while (!s->closed) {


... This can be in fact

while (!s->closed || s->buffer_size)

and that alone will subsume the loop in buffered_close, no?
s->fd is closed in migrate_fd_cleanup (which calls buffered_close). So I
flush the buffer in buffered_close before closing the descriptor, and
then the migration thread simply exits because s->closed is set.


- Umesh
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v2 1/3] separate thread for VM migration

2011-08-01 Thread Umesh Deshpande

On 08/01/2011 05:37 AM, Paolo Bonzini wrote:

On 07/29/2011 10:57 PM, Umesh Deshpande wrote:
This patch creates a separate thread for the guest migration on the 
source side.


Signed-off-by: Umesh Deshpande <udesh...@redhat.com>


Looks pretty good!

One thing that shows is that the interface separation between
buffered_file.c and migration.c is pretty weird.  Your patch makes it
somewhat worse, but it was like this before so it's not your fault.
The good thing is that if buffered_file.c uses threads, you can fix a
large part of this and get even simpler code:


1) there is really just one way to implement migrate_fd_put_notify, 
and with your simplifications it does not belong anymore in migration.c.


2) s->callback is actually not NULL exactly if s->file->frozen_output
is true, you can remove it as well;


3) buffered_close is messy because it can be called from both the
iothread (monitor->migrate_fd_cancel->migrate_fd_cleanup->qemu_fclose)
or the migration thread (after qemu_savevm_state_complete).  But
buffered_close is actually very similar to your thread function (it
does flush+wait_for_unfreeze, basically)!  So buffered_close can be
simply:


s->closed = 1;
ret = qemu_thread_join(s->thread); /* doesn't exist yet :) */
qemu_free(...);
return ret;

Another nit is that here:


+if (migrate_fd_check_expire()) {
+buffered_rate_tick(s->file);
+}
+
+if (s->state != MIG_STATE_ACTIVE) {
+break;
+}
+
+if (s->callback) {
+migrate_fd_wait_for_unfreeze(s);
+s->callback(s);
+}


you can still have a busy wait.

Putting it all together, you can move the thread function back to 
buffered_file.c like:


while (!s->closed || (!s->has_error && s->buffer_size)) {
if (s->freeze_output) {
qemu_mutex_unlock_iothread();
s->wait_for_unfreeze(s);
qemu_mutex_lock_iothread();
/* This comes from qemu_file_put_notify (via
   buffered_put_buffer---can be simplified a lot too?).
s->freeze_output = 0;
/* Test again for cancellation.  */
continue;
}

int64_t current_time = qemu_get_clock_ms(rt_clock);
if (s->expire_time > current_time) {
struct timeval tv = { .tv_sec = 0, .tv_usec = ... };
qemu_mutex_unlock_iothread();
select (0, NULL, NULL, NULL, &tv);
qemu_mutex_lock_iothread();
s->expire_time = qemu_get_clock_ms(rt_clock) + 100;
continue;
}

/* This comes from buffered_rate_tick.  */
s->bytes_xfer = 0;
buffered_flush(s);
if (!s->closed) {
s->put_ready(s->opaque);
}
}

ret = s->close(s->opaque);
...

Does it look sane?

I kept this in migration.c to call qemu_savevm_state_begin. (The way it 
is done currently. i.e. to keep access to FdMigrationState in migration.c)
Calling it from buffered_file.c would be inconsistent in that sense. or 
we will have to call it from the iothread before spawning the migration 
thread.


Also, why is the separation between FdMigrationState and QEMUFileBuffered
required? Is QEMUFileBuffered also designed to be used for things other
than migration?


Thanks
Umesh


Paolo


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v2 0/3] separate thread for VM migration

2011-07-29 Thread Umesh Deshpande
Following patch deals with VCPU and iothread starvation during the migration of
a guest. Currently the iothread is responsible for performing the guest
migration. It holds qemu_mutex during the migration and doesn't allow VCPU to
enter the qemu mode and delays its return to the guest. The guest migration,
executed as an iohandler also delays the execution of other iohandlers.
In the following patch series,

The migration has been moved to a separate thread to
reduce the qemu_mutex contention and iohandler starvation.

Also current dirty bitmap is split into per memslot bitmap to reduce its size.

Umesh Deshpande (3):
  separate thread for VM migration
  fine grained qemu_mutex locking for migration
  per memslot dirty bitmap

 arch_init.c |   14 ++--
 buffered_file.c |   28 -
 buffered_file.h |4 +++
 cpu-all.h   |   40 ++--
 exec.c  |   38 +-
 migration.c |   60 --
 migration.h |3 ++
 savevm.c|   22 +---
 savevm.h|   29 ++
 xen-all.c   |6 +---
 10 files changed, 173 insertions(+), 71 deletions(-)
 create mode 100644 savevm.h

-- 
1.7.4.1

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v2 1/3] separate thread for VM migration

2011-07-29 Thread Umesh Deshpande
This patch creates a separate thread for the guest migration on the source side.

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 buffered_file.c |   28 -
 buffered_file.h |4 +++
 migration.c |   59 +++---
 migration.h |3 ++
 savevm.c|   22 +---
 savevm.h|   29 +++
 6 files changed, 102 insertions(+), 43 deletions(-)
 create mode 100644 savevm.h

diff --git a/buffered_file.c b/buffered_file.c
index 41b42c3..d4146bf 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -16,12 +16,16 @@
 #include qemu-timer.h
 #include qemu-char.h
 #include buffered_file.h
+#include migration.h
+#include savevm.h
+#include qemu-thread.h
 
 //#define DEBUG_BUFFERED_FILE
 
 typedef struct QEMUFileBuffered
 {
 BufferedPutFunc *put_buffer;
+BufferedBeginFunc *begin;
 BufferedPutReadyFunc *put_ready;
 BufferedWaitForUnfreezeFunc *wait_for_unfreeze;
 BufferedCloseFunc *close;
@@ -35,6 +39,7 @@ typedef struct QEMUFileBuffered
 size_t buffer_size;
 size_t buffer_capacity;
 QEMUTimer *timer;
+QemuThread thread;
 } QEMUFileBuffered;
 
 #ifdef DEBUG_BUFFERED_FILE
@@ -181,8 +186,6 @@ static int buffered_close(void *opaque)
 
 ret = s-close(s-opaque);
 
-qemu_del_timer(s-timer);
-qemu_free_timer(s-timer);
 qemu_free(s-buffer);
 qemu_free(s);
 
@@ -228,17 +231,15 @@ static int64_t buffered_get_rate_limit(void *opaque)
 return s-xfer_limit;
 }
 
-static void buffered_rate_tick(void *opaque)
+void buffered_rate_tick(QEMUFile *file)
 {
-QEMUFileBuffered *s = opaque;
+QEMUFileBuffered *s = file-opaque;
 
 if (s-has_error) {
 buffered_close(s);
 return;
 }
 
-qemu_mod_timer(s-timer, qemu_get_clock_ms(rt_clock) + 100);
-
 if (s-freeze_output)
 return;
 
@@ -250,9 +251,17 @@ static void buffered_rate_tick(void *opaque)
 s-put_ready(s-opaque);
 }
 
+static void *migrate_vm(void *opaque)
+{
+QEMUFileBuffered *s = opaque;
+s-begin(s-opaque);
+return NULL;
+}
+
 QEMUFile *qemu_fopen_ops_buffered(void *opaque,
   size_t bytes_per_sec,
   BufferedPutFunc *put_buffer,
+  BufferedBeginFunc *begin,
   BufferedPutReadyFunc *put_ready,
   BufferedWaitForUnfreezeFunc 
*wait_for_unfreeze,
   BufferedCloseFunc *close)
@@ -264,6 +273,7 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 s-opaque = opaque;
 s-xfer_limit = bytes_per_sec / 10;
 s-put_buffer = put_buffer;
+s-begin = begin;
 s-put_ready = put_ready;
 s-wait_for_unfreeze = wait_for_unfreeze;
 s-close = close;
@@ -271,11 +281,9 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 s-file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
  buffered_close, buffered_rate_limit,
  buffered_set_rate_limit,
-buffered_get_rate_limit);
-
-s-timer = qemu_new_timer_ms(rt_clock, buffered_rate_tick, s);
+ buffered_get_rate_limit);
 
-qemu_mod_timer(s-timer, qemu_get_clock_ms(rt_clock) + 100);
+qemu_thread_create(s-thread, migrate_vm, s);
 
 return s-file;
 }
diff --git a/buffered_file.h b/buffered_file.h
index 98d358b..cfe2833 100644
--- a/buffered_file.h
+++ b/buffered_file.h
@@ -17,12 +17,16 @@
 #include hw/hw.h
 
 typedef ssize_t (BufferedPutFunc)(void *opaque, const void *data, size_t size);
+typedef void (BufferedBeginFunc)(void *opaque);
 typedef void (BufferedPutReadyFunc)(void *opaque);
 typedef void (BufferedWaitForUnfreezeFunc)(void *opaque);
 typedef int (BufferedCloseFunc)(void *opaque);
 
+void buffered_rate_tick(QEMUFile *file);
+
 QEMUFile *qemu_fopen_ops_buffered(void *opaque, size_t xfer_limit,
   BufferedPutFunc *put_buffer,
+  BufferedBeginFunc *begin,
   BufferedPutReadyFunc *put_ready,
   BufferedWaitForUnfreezeFunc 
*wait_for_unfreeze,
   BufferedCloseFunc *close);
diff --git a/migration.c b/migration.c
index af3a1f2..bf86067 100644
--- a/migration.c
+++ b/migration.c
@@ -31,6 +31,8 @@
 do { } while (0)
 #endif
 
+static int64_t expire_time;
+
 /* Migration speed throttling */
 static int64_t max_throttle = (32  20);
 
@@ -284,8 +286,6 @@ int migrate_fd_cleanup(FdMigrationState *s)
 {
 int ret = 0;
 
-qemu_set_fd_handler2(s-fd, NULL, NULL, NULL, NULL);
-
 if (s-file) {
 DPRINTF(closing file\n);
 if (qemu_fclose(s-file) != 0) {
@@ -310,8 +310,7 @@ int migrate_fd_cleanup(FdMigrationState *s)
 void migrate_fd_put_notify(void *opaque)
 {
 FdMigrationState *s = opaque

[RFC PATCH v2 3/3] Per memslot dirty bitmap

2011-07-29 Thread Umesh Deshpande
This patch creates a separate dirty bitmap for each slot. Currently the dirty bitmap
is created for addresses ranging from 0 to the end address of the last memory
slot. Since the memslots are not necessarily contiguous, the current bitmap might
contain empty regions or holes that don't represent any VM pages. This patch
reduces the size of the dirty bitmap by allocating per-memslot dirty bitmaps.
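
A small standalone sketch of that bookkeeping (plain C with hypothetical
find_block()/set_dirty() helpers, not the QEMU code): an address is first
resolved to its (block, page) pair, and each block only allocates bitmap
bytes for its own pages, so holes between slots cost nothing.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SHIFT 12

/* A "memslot": guest-physical offset, length and its own dirty bitmap. */
struct block {
    uint64_t offset, length;
    uint8_t *dirty;                   /* one byte per page of this slot */
};

static struct block *find_block(struct block *blocks, int n, uint64_t addr)
{
    for (int i = 0; i < n; i++) {
        if (addr - blocks[i].offset < blocks[i].length) {
            return &blocks[i];
        }
    }
    return NULL;
}

/* Mark a page dirty; assumes addr falls inside one of the slots. */
static void set_dirty(struct block *blocks, int n, uint64_t addr)
{
    struct block *b = find_block(blocks, n, addr);
    b->dirty[(addr - b->offset) >> PAGE_SHIFT] = 0xff;
}

int main(void)
{
    /* Two 1 MB slots with a hole in between; no bits cover the hole. */
    struct block blocks[2] = {
        { 0x00000000, 1 << 20, calloc(1 << (20 - PAGE_SHIFT), 1) },
        { 0x10000000, 1 << 20, calloc(1 << (20 - PAGE_SHIFT), 1) },
    };
    set_dirty(blocks, 2, 0x10003000);
    printf("page 3 of slot 1 dirty: %d\n", blocks[1].dirty[3] == 0xff);
    return 0;
}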

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 cpu-all.h |   40 +---
 exec.c|   38 +++---
 xen-all.c |6 ++
 3 files changed, 58 insertions(+), 26 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index e839100..9517a9b 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -920,6 +920,7 @@ extern ram_addr_t ram_size;
 
 typedef struct RAMBlock {
 uint8_t *host;
+uint8_t *phys_dirty;
 ram_addr_t offset;
 ram_addr_t length;
 uint32_t flags;
@@ -931,7 +932,6 @@ typedef struct RAMBlock {
 } RAMBlock;
 
 typedef struct RAMList {
-uint8_t *phys_dirty;
 QLIST_HEAD(ram, RAMBlock) blocks;
 } RAMList;
 extern RAMList ram_list;
@@ -961,32 +961,55 @@ extern int mem_prealloc;
 #define CODE_DIRTY_FLAG  0x02
 #define MIGRATION_DIRTY_FLAG 0x08
 
+RAMBlock *qemu_addr_to_ramblock(ram_addr_t);
+
+static inline int get_page_nr(ram_addr_t addr, RAMBlock **block)
+{
+int page_nr;
+*block = qemu_addr_to_ramblock(addr);
+
+page_nr = addr - (*block)->offset;
+page_nr = page_nr >> TARGET_PAGE_BITS;
+
+return page_nr;
+}
+
 /* read dirty bit (return 0 or 1) */
 static inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
 {
-return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] == 0xff;
+RAMBlock *block;
+int page_nr = get_page_nr(addr, &block);
+return block->phys_dirty[page_nr] == 0xff;
 }
 
 static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
 {
-return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS];
+RAMBlock *block;
+int page_nr = get_page_nr(addr, &block);
+return block->phys_dirty[page_nr];
 }
 
 static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
 int dirty_flags)
 {
-return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] & dirty_flags;
+RAMBlock *block;
+int page_nr = get_page_nr(addr, &block);
+return block->phys_dirty[page_nr] & dirty_flags;
 }
 
 static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
 {
-ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] = 0xff;
+RAMBlock *block;
+int page_nr = get_page_nr(addr, &block);
+block->phys_dirty[page_nr] = 0xff;
 }
 
 static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
   int dirty_flags)
 {
-return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
+RAMBlock *block;
+int page_nr = get_page_nr(addr, &block);
+return block->phys_dirty[page_nr] |= dirty_flags;
 }
 
 static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
@@ -995,10 +1018,13 @@ static inline void 
cpu_physical_memory_mask_dirty_range(ram_addr_t start,
 {
 int i, mask, len;
 uint8_t *p;
+RAMBlock *block;
+int page_nr = get_page_nr(start, &block);
 
 len = length >> TARGET_PAGE_BITS;
 mask = ~dirty_flags;
-p = ram_list.phys_dirty + (start >> TARGET_PAGE_BITS);
+
+p = block->phys_dirty + page_nr;
 for (i = 0; i < len; i++) {
 p[i] &= mask;
 }
diff --git a/exec.c b/exec.c
index 0e2ce57..6312550 100644
--- a/exec.c
+++ b/exec.c
@@ -2106,6 +2106,10 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
         abort();
     }
 
+    if (kvm_enabled()) {
+        return;
+    }
+
     for(env = first_cpu; env != NULL; env = env->next_cpu) {
         int mmu_idx;
         for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
@@ -2894,17 +2898,6 @@ static ram_addr_t find_ram_offset(ram_addr_t size)
     return offset;
 }
 
-static ram_addr_t last_ram_offset(void)
-{
-    RAMBlock *block;
-    ram_addr_t last = 0;
-
-    QLIST_FOREACH(block, ram_list.blocks, next)
-        last = MAX(last, block->offset + block->length);
-
-    return last;
-}
-
 ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
                                    ram_addr_t size, void *host)
 {
@@ -2974,10 +2967,8 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
 
     QLIST_INSERT_HEAD(ram_list.blocks, new_block, next);
 
-    ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty,
-                                       last_ram_offset() >> TARGET_PAGE_BITS);
-    memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
-           0xff, size >> TARGET_PAGE_BITS);
+    new_block->phys_dirty = qemu_mallocz(new_block->length >> TARGET_PAGE_BITS);
+    memset(new_block->phys_dirty, 0xff, new_block->length >> TARGET_PAGE_BITS);
 
     if (kvm_enabled())
         kvm_setup_guest_memory(new_block->host, size

[RFC PATCH v2 2/3] fine grained qemu_mutex locking for migration

2011-07-29 Thread Umesh Deshpande
In the migration thread, qemu_mutex is released during the most time-consuming
parts, i.e. during is_dup_page, which identifies uniform data pages, and during
put_buffer. qemu_mutex is also released while blocking on select to wait for the
descriptor to become ready for writes.
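
As a rough sketch of the locking pattern (plain pthreads standing in for
qemu_mutex_lock_iothread()/qemu_mutex_unlock_iothread(), and a dummy scan standing
in for is_dup_page/put_buffer; illustrative only, not the patch itself):

    #include <pthread.h>
    #include <stddef.h>

    static pthread_mutex_t iothread_lock = PTHREAD_MUTEX_INITIALIZER;

    /* stand-in for the expensive per-page work (is_dup_page + put_buffer) */
    static void scan_and_send(const unsigned char *page, size_t len)
    {
        volatile unsigned char sink = 0;
        for (size_t i = 0; i < len; i++) {
            sink ^= page[i];
        }
        (void)sink;
    }

    /* called with iothread_lock held; stage 3 runs with the VM stopped,
     * so the lock is only dropped for the live stages */
    static void save_page(const unsigned char *page, size_t len, int stage)
    {
        if (stage != 3) {
            pthread_mutex_unlock(&iothread_lock);
        }
        scan_and_send(page, len);
        if (stage != 3) {
            pthread_mutex_lock(&iothread_lock);
        }
    }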

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c |   14 +++---
 migration.c |   11 +++
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 484b39d..cd545bc 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -110,7 +110,7 @@ static int is_dup_page(uint8_t *page, uint8_t ch)
 static RAMBlock *last_block;
 static ram_addr_t last_offset;
 
-static int ram_save_block(QEMUFile *f)
+static int ram_save_block(QEMUFile *f, int stage)
 {
 RAMBlock *block = last_block;
 ram_addr_t offset = last_offset;
@@ -131,6 +131,10 @@ static int ram_save_block(QEMUFile *f)
 current_addr + TARGET_PAGE_SIZE,
 MIGRATION_DIRTY_FLAG);
 
+if (stage != 3) {
+qemu_mutex_unlock_iothread();
+}
+
             p = block->host + offset;
 
 if (is_dup_page(p, *p)) {
@@ -153,6 +157,10 @@ static int ram_save_block(QEMUFile *f)
 bytes_sent = TARGET_PAGE_SIZE;
 }
 
+if (stage != 3) {
+qemu_mutex_lock_iothread();
+}
+
 break;
 }
 
@@ -301,7 +309,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
 while (!qemu_file_rate_limit(f)) {
 int bytes_sent;
 
-bytes_sent = ram_save_block(f);
+bytes_sent = ram_save_block(f, stage);
 bytes_transferred += bytes_sent;
 if (bytes_sent == 0) { /* no more blocks */
 break;
@@ -322,7 +330,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
 int bytes_sent;
 
 /* flush all remaining blocks regardless of rate limiting */
-while ((bytes_sent = ram_save_block(f)) != 0) {
+while ((bytes_sent = ram_save_block(f, stage)) != 0) {
 bytes_transferred += bytes_sent;
 }
 cpu_physical_memory_set_dirty_tracking(0);
diff --git a/migration.c b/migration.c
index bf86067..992fef5 100644
--- a/migration.c
+++ b/migration.c
@@ -375,15 +375,19 @@ void migrate_fd_begin(void *arg)
     if (ret < 0) {
         DPRINTF("failed, %d\n", ret);
         migrate_fd_error(s);
-        goto out;
+        qemu_mutex_unlock_iothread();
+        return;
     }
 
     expire_time = qemu_get_clock_ms(rt_clock) + 100;
     migrate_fd_put_ready(s);
+    qemu_mutex_unlock_iothread();
 
     while (s->state == MIG_STATE_ACTIVE) {
         if (migrate_fd_check_expire()) {
+            qemu_mutex_lock_iothread();
             buffered_rate_tick(s->file);
+            qemu_mutex_unlock_iothread();
         }
 
         if (s->state != MIG_STATE_ACTIVE) {
@@ -392,12 +396,11 @@ void migrate_fd_begin(void *arg)
 
         if (s->callback) {
             migrate_fd_wait_for_unfreeze(s);
+            qemu_mutex_lock_iothread();
             s->callback(s);
+            qemu_mutex_unlock_iothread();
         }
     }
-
-out:
-    qemu_mutex_unlock_iothread();
 }
 
 
-- 
1.7.4.1



[Qemu-devel] [RFC PATCH v2 0/3] separate thread for VM migration

2011-07-29 Thread Umesh Deshpande
The following patch series deals with VCPU and iothread starvation during the
migration of a guest. Currently the iothread is responsible for performing the
guest migration. It holds qemu_mutex during the migration, which prevents VCPUs
from entering qemu mode and delays their return to the guest. The guest migration,
executed as an iohandler, also delays the execution of other iohandlers.
In this patch series:

The migration has been moved to a separate thread to reduce qemu_mutex contention
and iohandler starvation.

Also, the current dirty bitmap is split into per-memslot bitmaps to reduce its size.

Umesh Deshpande (3):
  separate thread for VM migration
  fine grained qemu_mutex locking for migration
  per memslot dirty bitmap

 arch_init.c |   14 ++--
 buffered_file.c |   28 -
 buffered_file.h |4 +++
 cpu-all.h   |   40 ++--
 exec.c  |   38 +-
 migration.c |   60 --
 migration.h |3 ++
 savevm.c|   22 +---
 savevm.h|   29 ++
 xen-all.c   |6 +---
 10 files changed, 173 insertions(+), 71 deletions(-)
 create mode 100644 savevm.h

-- 
1.7.4.1




[Qemu-devel] [RFC PATCH v2 1/3] separate thread for VM migration

2011-07-29 Thread Umesh Deshpande
This patch creates a separate thread for the guest migration on the source side.
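
Schematically, the change replaces the 100 ms timer callback with a dedicated
thread that drives the whole outgoing migration; a generic pthreads sketch
(MigState and the begin callback are placeholders, not the QEMU names):

    #include <pthread.h>

    typedef struct MigState {
        void (*begin)(struct MigState *s);  /* runs the buffered migration loop */
    } MigState;

    static void *migration_thread(void *opaque)
    {
        MigState *s = opaque;
        s->begin(s);        /* previously driven from an rt_clock timer/iohandler */
        return NULL;
    }

    static int spawn_migration_thread(MigState *s)
    {
        pthread_t tid;
        return pthread_create(&tid, NULL, migration_thread, s) == 0 ? 0 : -1;
    }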

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 buffered_file.c |   28 -
 buffered_file.h |4 +++
 migration.c |   59 +++---
 migration.h |3 ++
 savevm.c|   22 +---
 savevm.h|   29 +++
 6 files changed, 102 insertions(+), 43 deletions(-)
 create mode 100644 savevm.h

diff --git a/buffered_file.c b/buffered_file.c
index 41b42c3..d4146bf 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -16,12 +16,16 @@
 #include qemu-timer.h
 #include qemu-char.h
 #include buffered_file.h
+#include migration.h
+#include savevm.h
+#include qemu-thread.h
 
 //#define DEBUG_BUFFERED_FILE
 
 typedef struct QEMUFileBuffered
 {
 BufferedPutFunc *put_buffer;
+BufferedBeginFunc *begin;
 BufferedPutReadyFunc *put_ready;
 BufferedWaitForUnfreezeFunc *wait_for_unfreeze;
 BufferedCloseFunc *close;
@@ -35,6 +39,7 @@ typedef struct QEMUFileBuffered
 size_t buffer_size;
 size_t buffer_capacity;
 QEMUTimer *timer;
+QemuThread thread;
 } QEMUFileBuffered;
 
 #ifdef DEBUG_BUFFERED_FILE
@@ -181,8 +186,6 @@ static int buffered_close(void *opaque)
 
     ret = s->close(s->opaque);
 
-    qemu_del_timer(s->timer);
-    qemu_free_timer(s->timer);
     qemu_free(s->buffer);
     qemu_free(s);
 
@@ -228,17 +231,15 @@ static int64_t buffered_get_rate_limit(void *opaque)
     return s->xfer_limit;
 }
 
-static void buffered_rate_tick(void *opaque)
+void buffered_rate_tick(QEMUFile *file)
 {
-    QEMUFileBuffered *s = opaque;
+    QEMUFileBuffered *s = file->opaque;
 
     if (s->has_error) {
         buffered_close(s);
         return;
     }
 
-    qemu_mod_timer(s->timer, qemu_get_clock_ms(rt_clock) + 100);
-
     if (s->freeze_output)
         return;
 
@@ -250,9 +251,17 @@ static void buffered_rate_tick(void *opaque)
     s->put_ready(s->opaque);
 }
 
+static void *migrate_vm(void *opaque)
+{
+    QEMUFileBuffered *s = opaque;
+    s->begin(s->opaque);
+    return NULL;
+}
+
 QEMUFile *qemu_fopen_ops_buffered(void *opaque,
                                   size_t bytes_per_sec,
                                   BufferedPutFunc *put_buffer,
+                                  BufferedBeginFunc *begin,
                                   BufferedPutReadyFunc *put_ready,
                                   BufferedWaitForUnfreezeFunc *wait_for_unfreeze,
                                   BufferedCloseFunc *close)
@@ -264,6 +273,7 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
     s->opaque = opaque;
     s->xfer_limit = bytes_per_sec / 10;
     s->put_buffer = put_buffer;
+    s->begin = begin;
     s->put_ready = put_ready;
     s->wait_for_unfreeze = wait_for_unfreeze;
     s->close = close;
@@ -271,11 +281,9 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
     s->file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
                              buffered_close, buffered_rate_limit,
                              buffered_set_rate_limit,
-                             buffered_get_rate_limit);
-
-    s->timer = qemu_new_timer_ms(rt_clock, buffered_rate_tick, s);
+                             buffered_get_rate_limit);
 
-    qemu_mod_timer(s->timer, qemu_get_clock_ms(rt_clock) + 100);
+    qemu_thread_create(&s->thread, migrate_vm, s);
 
     return s->file;
 }
diff --git a/buffered_file.h b/buffered_file.h
index 98d358b..cfe2833 100644
--- a/buffered_file.h
+++ b/buffered_file.h
@@ -17,12 +17,16 @@
 #include hw/hw.h
 
 typedef ssize_t (BufferedPutFunc)(void *opaque, const void *data, size_t size);
+typedef void (BufferedBeginFunc)(void *opaque);
 typedef void (BufferedPutReadyFunc)(void *opaque);
 typedef void (BufferedWaitForUnfreezeFunc)(void *opaque);
 typedef int (BufferedCloseFunc)(void *opaque);
 
+void buffered_rate_tick(QEMUFile *file);
+
 QEMUFile *qemu_fopen_ops_buffered(void *opaque, size_t xfer_limit,
   BufferedPutFunc *put_buffer,
+  BufferedBeginFunc *begin,
   BufferedPutReadyFunc *put_ready,
                                   BufferedWaitForUnfreezeFunc *wait_for_unfreeze,
   BufferedCloseFunc *close);
diff --git a/migration.c b/migration.c
index af3a1f2..bf86067 100644
--- a/migration.c
+++ b/migration.c
@@ -31,6 +31,8 @@
 do { } while (0)
 #endif
 
+static int64_t expire_time;
+
 /* Migration speed throttling */
 static int64_t max_throttle = (32  20);
 
@@ -284,8 +286,6 @@ int migrate_fd_cleanup(FdMigrationState *s)
 {
     int ret = 0;
 
-    qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
-
     if (s->file) {
         DPRINTF("closing file\n");
         if (qemu_fclose(s->file) != 0) {
@@ -310,8 +310,7 @@ int migrate_fd_cleanup(FdMigrationState *s)
 void migrate_fd_put_notify(void *opaque)
 {
     FdMigrationState *s = opaque

Re: [RFC 3/4] A separate thread for the VM migration

2011-07-21 Thread Umesh Deshpande


- Original Message -
From: Marcelo Tosatti mtosa...@redhat.com
To: Umesh Deshpande udesh...@redhat.com
Cc: kvm@vger.kernel.org, qemu-de...@nongnu.org
Sent: Wednesday, July 20, 2011 3:02:46 PM
Subject: Re: [RFC 3/4] A separate thread for the VM migration

On Wed, Jul 20, 2011 at 12:00:44AM -0400, Umesh Deshpande wrote:
 This patch creates a separate thread for the guest migration on the source 
 side. The migration routine is called from the migration clock.
 
 Signed-off-by: Umesh Deshpande udesh...@redhat.com
 ---
  arch_init.c  |8 +++
  buffered_file.c  |   10 -
  migration-tcp.c  |   18 -
  migration-unix.c |7 ++
  migration.c  |   56 
 +--
  migration.h  |4 +--
  6 files changed, 57 insertions(+), 46 deletions(-)
 
 diff --git a/arch_init.c b/arch_init.c
 index f81a729..6d44b72 100644
 --- a/arch_init.c
 +++ b/arch_init.c
 @@ -260,6 +260,10 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
 void *opaque)
  return 0;
  }
  
 +if (stage != 3) {
 +qemu_mutex_lock_iothread();
 +}
 +
  if (cpu_physical_sync_dirty_bitmap(0, TARGET_PHYS_ADDR_MAX) != 0) {
  qemu_file_set_error(f);
  return 0;
 @@ -267,6 +271,10 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
 void *opaque)
  
  sync_migration_bitmap(0, TARGET_PHYS_ADDR_MAX);
  
 +if (stage != 3) {
 +qemu_mutex_unlock_iothread();
 +}
 +

Many data structures shared by the vcpus/iothread and the migration thread are
accessed simultaneously without protection. Instead of simply moving the entire
set of migration routines to a thread, I'd suggest moving only the
time-consuming work in ram_save_block (dup_page and put_buffer), after
properly auditing for shared access. And send more than one page at a time, of
course.

The group of migration routines moved into the thread needs to be executed
sequentially, because of the way the protocol is designed.
Currently, migration is performed in sections, and we cannot proceed to the
next section until the current section has been written to the QEMUFile. A
thread for any sub-part would introduce parallelism, breaking the sequential
semantics. (Condition variables would have to be used to ensure sequentiality
across the new thread and the iothread; see the sketch below.)

Secondly, put_buffer is called from iohandlers and timers, both of which
currently run in the iothread. With a separate thread for dup_page and
put_buffer, it would also be called from inside that thread.

Another option with the current implementation could be to hold the qemu_mutex
inside the thread for most of the time and release it for the time-consuming
part in ram_save_block.

A separate lock for ram_list is probably necessary, so that it can
be accessed from the migration thread.
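
For reference, a minimal sketch of how a condition variable could serialize the
section hand-off between the iothread and a worker thread (generic pthreads,
not QEMU code):

    #include <pthread.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
    static int section_on_wire = 1;   /* previous section fully written out */

    /* called before preparing the next migration section */
    static void wait_for_previous_section(void)
    {
        pthread_mutex_lock(&lock);
        while (!section_on_wire) {
            pthread_cond_wait(&cond, &lock);
        }
        section_on_wire = 0;
        pthread_mutex_unlock(&lock);
    }

    /* called by the thread that finished writing the section to the QEMUFile */
    static void section_written(void)
    {
        pthread_mutex_lock(&lock);
        section_on_wire = 1;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
    }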




[Qemu-devel] [RFC 1/4] A separate thread for the VM migration

2011-07-19 Thread Umesh Deshpande
This patch creates a migration bitmap, which is periodically kept in sync with
the qemu bitmap. This allows us to have a separate thread for VM migration. A
separate copy of the dirty bitmap for the migration avoids concurrent access to
the qemu bitmap from the iohandlers and the migration thread.
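
The idea, in a stripped-down form (byte-per-page arrays and names chosen purely
for illustration; the real code works on ram_list and MIGRATION_DIRTY_FLAG):

    #include <stdint.h>

    #define PAGES                 1024
    #define MIGRATION_DIRTY_FLAG  0x08

    static uint8_t qemu_bitmap[PAGES];       /* written by vcpus/iothread */
    static uint8_t migration_bitmap[PAGES];  /* read by the migration code */

    /* called with the qemu mutex held: fold newly dirtied pages into the
     * migration copy, then clear the migration flag in the main bitmap */
    static void sync_migration_bitmap(void)
    {
        for (int i = 0; i < PAGES; i++) {
            if (qemu_bitmap[i] & MIGRATION_DIRTY_FLAG) {
                migration_bitmap[i] |= MIGRATION_DIRTY_FLAG;
                qemu_bitmap[i] &= (uint8_t)~MIGRATION_DIRTY_FLAG;
            }
        }
    }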

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c |   16 ---
 cpu-all.h   |   37 +++
 exec.c  |   63 

3 files changed, 109 insertions(+), 7 deletions(-)
diff --git a/arch_init.c b/arch_init.c
index 484b39d..f81a729 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -123,13 +123,13 @@ static int ram_save_block(QEMUFile *f)
     current_addr = block->offset + offset;
 
     do {
-        if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
+        if (migration_bitmap_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
             uint8_t *p;
             int cont = (block == last_block) ? RAM_SAVE_FLAG_CONTINUE : 0;
 
-            cpu_physical_memory_reset_dirty(current_addr,
-                                            current_addr + TARGET_PAGE_SIZE,
-                                            MIGRATION_DIRTY_FLAG);
+            migration_bitmap_reset_dirty(current_addr,
+                                         current_addr + TARGET_PAGE_SIZE,
+                                         MIGRATION_DIRTY_FLAG);
 
             p = block->host + offset;
 
@@ -185,7 +185,7 @@ static ram_addr_t ram_save_remaining(void)
         ram_addr_t addr;
         for (addr = block->offset; addr < block->offset + block->length;
              addr += TARGET_PAGE_SIZE) {
-            if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
+            if (migration_bitmap_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
                 count++;
             }
         }
@@ -265,6 +265,8 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 return 0;
 }
 
+sync_migration_bitmap(0, TARGET_PHYS_ADDR_MAX);
+
 if (stage == 1) {
 RAMBlock *block;
 bytes_transferred = 0;
@@ -276,9 +278,9 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 QLIST_FOREACH(block, ram_list.blocks, next) {
 for (addr = block-offset; addr  block-offset + block-length;
  addr += TARGET_PAGE_SIZE) {
-if (!cpu_physical_memory_get_dirty(addr,
+if (!migration_bitmap_get_dirty(addr,
MIGRATION_DIRTY_FLAG)) {
-cpu_physical_memory_set_dirty(addr);
+migration_bitmap_set_dirty(addr);
 }
 }
 }
diff --git a/cpu-all.h b/cpu-all.h
index e839100..80ce601 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -932,6 +932,7 @@ typedef struct RAMBlock {
 
 typedef struct RAMList {
 uint8_t *phys_dirty;
+uint8_t *migration_bitmap;
 QLIST_HEAD(ram, RAMBlock) blocks;
 } RAMList;
 extern RAMList ram_list;
@@ -1004,8 +1005,44 @@ static inline void 
cpu_physical_memory_mask_dirty_range(ram_addr_t start,
 }
 }
 
+
+
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
  int dirty_flags);
+
+static inline int migration_bitmap_get_dirty(ram_addr_t addr,
+                                             int dirty_flags)
+{
+    return ram_list.migration_bitmap[addr >> TARGET_PAGE_BITS] & dirty_flags;
+}
+
+static inline void migration_bitmap_set_dirty(ram_addr_t addr)
+{
+    ram_list.migration_bitmap[addr >> TARGET_PAGE_BITS] = 0xff;
+}
+
+static inline void migration_bitmap_mask_dirty_range(ram_addr_t start,
+                                                     int length,
+                                                     int dirty_flags)
+{
+    int i, mask, len;
+    uint8_t *p;
+
+    len = length >> TARGET_PAGE_BITS;
+    mask = ~dirty_flags;
+    p = ram_list.migration_bitmap + (start >> TARGET_PAGE_BITS);
+    for (i = 0; i < len; i++) {
+        p[i] &= mask;
+    }
+}
+
+
+void migration_bitmap_reset_dirty(ram_addr_t start,
+  ram_addr_t end,
+  int dirty_flags);
+
+void sync_migration_bitmap(ram_addr_t start, ram_addr_t end);
+
 void cpu_tlb_update_dirty(CPUState *env);
 
 int cpu_physical_memory_set_dirty_tracking(int enable);
diff --git a/exec.c b/exec.c
index 0e2ce57..9811328 100644
--- a/exec.c
+++ b/exec.c
@@ -2106,6 +2106,10 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
         abort();
     }
 
+    if (kvm_enabled()) {
+        return;
+    }
+
     for(env = first_cpu; env != NULL; env = env->next_cpu) {
         int mmu_idx;
         for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
@@ -2114,8 +2118,61 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
                                   start1

[Qemu-devel] [RFC 2/4] A separate thread for the VM migration

2011-07-19 Thread Umesh Deshpande
This patch implements a migration clock, whose implementation is similar to the
existing rt_clock. This allows the migration timer to run in parallel with the
other rt_clock timers. In the next patch, this clock is used to create a new
timer, driven from the migration thread, that calls the VM migration routine on
the source side.
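
The effect aimed for is roughly the following, shown here as a standalone 100 ms
tick loop using clock_gettime/usleep instead of the QEMU timer API (a sketch,
not the patch itself):

    #include <time.h>
    #include <unistd.h>
    #include <stdint.h>

    static int64_t now_ms(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
    }

    /* run tick() every 100 ms from the migration thread, independently of
     * the timers serviced by the iothread */
    static void migration_tick_loop(int (*active)(void), void (*tick)(void))
    {
        int64_t expire = now_ms() + 100;
        while (active()) {
            int64_t t = now_ms();
            if (t >= expire) {
                tick();
                expire = t + 100;
            } else {
                usleep((useconds_t)((expire - t) * 1000));
            }
        }
    }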

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 qemu-timer.c |   29 +++--
 qemu-timer.h |3 +++
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 72066c7..91e356f 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -144,6 +144,7 @@ void cpu_disable_ticks(void)
 #define QEMU_CLOCK_REALTIME 0
 #define QEMU_CLOCK_VIRTUAL  1
 #define QEMU_CLOCK_HOST 2
+#define QEMU_CLOCK_MIGRATE  3
 
 struct QEMUClock {
 int type;
@@ -364,9 +365,10 @@ next:
 }
 }
 
-#define QEMU_NUM_CLOCKS 3
+#define QEMU_NUM_CLOCKS 4
 
 QEMUClock *rt_clock;
+QEMUClock *migration_clock;
 QEMUClock *vm_clock;
 QEMUClock *host_clock;
 
@@ -561,12 +563,31 @@ int qemu_timer_pending(QEMUTimer *ts)
 return 0;
 }
 
+int64_t qemu_timer_difference(QEMUTimer *ts, QEMUClock *clock)
+{
+    int64_t expire_time, current_time;
+    QEMUTimer *t;
+
+    current_time = qemu_get_clock_ms(clock);
+    for (t = active_timers[clock->type]; t != NULL; t = t->next) {
+        if (t == ts) {
+            expire_time = ts->expire_time / SCALE_MS;
+            if (current_time >= expire_time) {
+                return 0;
+            } else {
+                return expire_time - current_time;
+            }
+        }
+    }
+    return 0;
+}
+
 int qemu_timer_expired(QEMUTimer *timer_head, int64_t current_time)
 {
     return qemu_timer_expired_ns(timer_head, current_time * timer_head->scale);
 }
 
-static void qemu_run_timers(QEMUClock *clock)
+void qemu_run_timers(QEMUClock *clock)
 {
     QEMUTimer **ptimer_head, *ts;
     int64_t current_time;
@@ -595,6 +616,9 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
     switch(clock->type) {
     case QEMU_CLOCK_REALTIME:
         return get_clock();
+
+    case QEMU_CLOCK_MIGRATE:
+        return get_clock();
     default:
     case QEMU_CLOCK_VIRTUAL:
         if (use_icount) {
@@ -610,6 +634,7 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
 void init_clocks(void)
 {
 rt_clock = qemu_new_clock(QEMU_CLOCK_REALTIME);
+migration_clock = qemu_new_clock(QEMU_CLOCK_MIGRATE);
 vm_clock = qemu_new_clock(QEMU_CLOCK_VIRTUAL);
 host_clock = qemu_new_clock(QEMU_CLOCK_HOST);
 
diff --git a/qemu-timer.h b/qemu-timer.h
index 06cbe20..014b70b 100644
--- a/qemu-timer.h
+++ b/qemu-timer.h
@@ -23,6 +23,7 @@ typedef void QEMUTimerCB(void *opaque);
machine is stopped. The real time clock has a frequency of 1000
Hz. */
 extern QEMUClock *rt_clock;
+extern QEMUClock *migration_clock;
 
 /* The virtual clock is only run during the emulation. It is stopped
when the virtual machine is stopped. Virtual timers use a high
@@ -45,7 +46,9 @@ QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
 void qemu_free_timer(QEMUTimer *ts);
 void qemu_del_timer(QEMUTimer *ts);
 void qemu_mod_timer(QEMUTimer *ts, int64_t expire_time);
+void qemu_run_timers(QEMUClock *clock);
 int qemu_timer_pending(QEMUTimer *ts);
+int64_t qemu_timer_difference(QEMUTimer *ts, QEMUClock *);
 int qemu_timer_expired(QEMUTimer *timer_head, int64_t current_time);
 
 void qemu_run_all_timers(void);
-- 



[Qemu-devel] [RFC 4/4] A separate thread for the VM migration

2011-07-19 Thread Umesh Deshpande
This patch creates a separate thread for the guest migration on the target side.
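
Schematically (generic pthreads; IncomingArgs is a made-up stand-in for the
MigrationArg struct introduced in the diff below):

    #include <pthread.h>
    #include <stdlib.h>

    typedef struct IncomingArgs {
        int fd;                 /* descriptor carrying the migration stream */
    } IncomingArgs;

    static void *incoming_thread(void *opaque)
    {
        IncomingArgs *a = opaque;
        /* load the VM state from a->fd, start the guest, then clean up */
        free(a);
        return NULL;
    }

    static int start_incoming(int fd)
    {
        pthread_t tid;
        IncomingArgs *a = malloc(sizeof(*a));
        if (a == NULL) {
            return -1;
        }
        a->fd = fd;
        return pthread_create(&tid, NULL, incoming_thread, a) == 0 ? 0 : -1;
    }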


Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 migration-exec.c |7 +++
 migration-fd.c   |4 ++--
 migration-tcp.c  |9 +
 migration-unix.c |   10 ++
 migration.c  |   32 ++--
 migration.h  |   16 +++-
 6 files changed, 61 insertions(+), 17 deletions(-)

diff --git a/migration-exec.c b/migration-exec.c
index 4b7aad8..04140e2 100644
--- a/migration-exec.c
+++ b/migration-exec.c
@@ -120,12 +120,11 @@ err_after_alloc:
 static void exec_accept_incoming_migration(void *opaque)
 {
 QEMUFile *f = opaque;
-
-process_incoming_migration(f);
+process_incoming_migration(f, qemu_stdio_fd(f), 0, MIG_EXEC);
 qemu_set_fd_handler2(qemu_stdio_fd(f), NULL, NULL, NULL, NULL);
-qemu_fclose(f);
 }
 
+
 int exec_start_incoming_migration(const char *command)
 {
 QEMUFile *f;
@@ -138,7 +137,7 @@ int exec_start_incoming_migration(const char *command)
 }
 
 qemu_set_fd_handler2(qemu_stdio_fd(f), NULL,
-exec_accept_incoming_migration, NULL, f);
+ exec_accept_incoming_migration, NULL, f);
 
 return 0;
 }
diff --git a/migration-fd.c b/migration-fd.c
index 66d51c1..24f195f 100644
--- a/migration-fd.c
+++ b/migration-fd.c
@@ -100,13 +100,13 @@ err_after_alloc:
 return NULL;
 }
 
+
 static void fd_accept_incoming_migration(void *opaque)
 {
 QEMUFile *f = opaque;
 
-process_incoming_migration(f);
+process_incoming_migration(f, qemu_stdio_fd(f), 0, MIG_FD);
 qemu_set_fd_handler2(qemu_stdio_fd(f), NULL, NULL, NULL, NULL);
-qemu_fclose(f);
 }
 
 int fd_start_incoming_migration(const char *infd)
diff --git a/migration-tcp.c b/migration-tcp.c
index d3d80c9..ad1b9d0 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -159,13 +159,15 @@ static void tcp_accept_incoming_migration(void *opaque)
 goto out;
 }
 
-process_incoming_migration(f);
-qemu_fclose(f);
+process_incoming_migration(f, s, c, MIG_UNIX);
+goto out3;
 out:
 close(c);
 out2:
-qemu_set_fd_handler2(s, NULL, NULL, NULL, NULL);
 close(s);
+out3:
+qemu_set_fd_handler2(s, NULL, NULL, NULL, NULL);
+return;
 }
 
 int tcp_start_incoming_migration(const char *host_port)
@@ -194,7 +196,6 @@ int tcp_start_incoming_migration(const char *host_port)
 
 qemu_set_fd_handler2(s, NULL, tcp_accept_incoming_migration, NULL,
  (void *)(intptr_t)s);
-
 return 0;
 
 err:
diff --git a/migration-unix.c b/migration-unix.c
index c8625c7..ed57d5a 100644
--- a/migration-unix.c
+++ b/migration-unix.c
@@ -167,12 +166,14 @@ static void unix_accept_incoming_migration(void *opaque)
 goto out;
 }
 
-process_incoming_migration(f);
-qemu_fclose(f);
+process_incoming_migration(f, s, c, MIG_UNIX);
+goto out2;
 out:
-qemu_set_fd_handler2(s, NULL, NULL, NULL, NULL);
 close(s);
 close(c);
+out2:
+qemu_set_fd_handler2(s, NULL, NULL, NULL, NULL);
+return;
 }
 
 int unix_start_incoming_migration(const char *path)
@@ -203,7 +204,7 @@ int unix_start_incoming_migration(const char *path)
 }
 
 qemu_set_fd_handler2(sock, NULL, unix_accept_incoming_migration, NULL,
-(void *)(intptr_t)sock);
+ (void *)(intptr_t)sock);
 
 return 0;
 
diff --git a/migration.c b/migration.c
index af3a1f2..34b1aa6 100644
--- a/migration.c
+++ b/migration.c
@@ -61,9 +64,10 @@ int qemu_start_incoming_migration(const char *uri)
 return ret;
 }
 
-void process_incoming_migration(QEMUFile *f)
+static void *incoming_migration_thread(void *arg)
 {
-    if (qemu_loadvm_state(f) < 0) {
+    MigrationArg *p = arg;
+    if (qemu_loadvm_state(p->f) < 0) {
         fprintf(stderr, "load of migration failed\n");
         exit(0);
     }
@@ -74,6 +78,33 @@ void process_incoming_migration(QEMUFile *f)
 
     if (autostart)
         vm_start();
+
+    qemu_fclose(p->f);
+
+    if (p->type == MIG_TCP || p->type == MIG_UNIX) {
+        close(p->migration_port);
+        close(p->host_port);
+    }
+
+    qemu_free(p);
+    return NULL;
+}
+
+void process_incoming_migration(QEMUFile *f, int host_port,
+                                int migration_port, int type)
+{
+    MigrationArg *arg;
+    struct QemuThread migrate_incoming_thread;
+
+    arg = qemu_mallocz(sizeof(*arg));
+
+    arg->f = f;
+    arg->host_port = host_port;
+    arg->migration_port = migration_port;
+    arg->type = type;
+
+    qemu_thread_create(&migrate_incoming_thread,
+                       incoming_migration_thread, arg);
 }
 
 int do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data)
diff --git a/migration.h b/migration.h
index 050c56c..33318d9 100644
--- a/migration.h
+++ b/migration.h
@@ -23,6 +23,11 @@
 #define MIG_STATE_CANCELLED1
 #define MIG_STATE_ACTIVE   2
 
+#define MIG_TCP 1
+#define MIG_UNIX2

[Qemu-devel] [RFC 3/4] A separate thread for the VM migration

2011-07-19 Thread Umesh Deshpande
This patch creates a separate thread for the guest migration on the source 
side. The migration routine is called from the migration clock.

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c  |8 +++
 buffered_file.c  |   10 -
 migration-tcp.c  |   18 -
 migration-unix.c |7 ++
 migration.c  |   56 +--
 migration.h  |4 +--
 6 files changed, 57 insertions(+), 46 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index f81a729..6d44b72 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -260,6 +260,10 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 return 0;
 }
 
+if (stage != 3) {
+qemu_mutex_lock_iothread();
+}
+
 if (cpu_physical_sync_dirty_bitmap(0, TARGET_PHYS_ADDR_MAX) != 0) {
 qemu_file_set_error(f);
 return 0;
@@ -267,6 +271,10 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 
 sync_migration_bitmap(0, TARGET_PHYS_ADDR_MAX);
 
+if (stage != 3) {
+qemu_mutex_unlock_iothread();
+}
+
 if (stage == 1) {
 RAMBlock *block;
 bytes_transferred = 0;
diff --git a/buffered_file.c b/buffered_file.c
index 41b42c3..e05efe8 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -237,7 +237,7 @@ static void buffered_rate_tick(void *opaque)
 return;
 }
 
-qemu_mod_timer(s-timer, qemu_get_clock_ms(rt_clock) + 100);
+qemu_mod_timer(s-timer, qemu_get_clock_ms(migration_clock) + 100);
 
 if (s-freeze_output)
 return;
@@ -246,8 +246,8 @@ static void buffered_rate_tick(void *opaque)
 
 buffered_flush(s);
 
-/* Add some checks around this */
 s-put_ready(s-opaque);
+usleep(qemu_timer_difference(s-timer, migration_clock) * 1000);
 }
 
 QEMUFile *qemu_fopen_ops_buffered(void *opaque,
@@ -271,11 +271,11 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 s-file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
  buffered_close, buffered_rate_limit,
  buffered_set_rate_limit,
-buffered_get_rate_limit);
+ buffered_get_rate_limit);
 
-s-timer = qemu_new_timer_ms(rt_clock, buffered_rate_tick, s);
+s-timer = qemu_new_timer_ms(migration_clock, buffered_rate_tick, s);
 
-qemu_mod_timer(s-timer, qemu_get_clock_ms(rt_clock) + 100);
+qemu_mod_timer(s-timer, qemu_get_clock_ms(migration_clock) + 100);
 
 return s-file;
 }
diff --git a/migration-tcp.c b/migration-tcp.c
index d3d80c9..ad1b9d0 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -65,11 +65,9 @@ static void tcp_wait_for_connect(void *opaque)
 return;
 }
 
-qemu_set_fd_handler2(s-fd, NULL, NULL, NULL, NULL);
-
-if (val == 0)
+if (val == 0) {
 migrate_fd_connect(s);
-else {
+} else {
 DPRINTF(error connecting %d\n, val);
 migrate_fd_error(s);
 }
@@ -79,8 +77,8 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon,
  const char *host_port,
  int64_t bandwidth_limit,
  int detach,
-int blk,
-int inc)
+ int blk,
+ int inc)
 {
 struct sockaddr_in addr;
 FdMigrationState *s;
@@ -121,15 +119,17 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon,
 if (ret == -1)
 ret = -(s-get_error(s));
 
-if (ret == -EINPROGRESS || ret == -EWOULDBLOCK)
-qemu_set_fd_handler2(s-fd, NULL, NULL, tcp_wait_for_connect, s);
 } while (ret == -EINTR);
 
 if (ret  0  ret != -EINPROGRESS  ret != -EWOULDBLOCK) {
 DPRINTF(connect failed\n);
 migrate_fd_error(s);
-} else if (ret = 0)
+} else if (ret = 0) {
 migrate_fd_connect(s);
+} else {
+migrate_fd_wait_for_unfreeze(s);
+tcp_wait_for_connect(s);
+}
 
 return s-mig_state;
 }
diff --git a/migration-unix.c b/migration-unix.c
index c8625c7..ed57d5a 100644
--- a/migration-unix.c
+++ b/migration-unix.c
@@ -64,8 +64,6 @@ static void unix_wait_for_connect(void *opaque)
 return;
 }
 
-qemu_set_fd_handler2(s-fd, NULL, NULL, NULL, NULL);
-
 if (val == 0)
 migrate_fd_connect(s);
 else {
@@ -116,13 +114,14 @@ MigrationState *unix_start_outgoing_migration(Monitor 
*mon,
 if (ret == -1)
ret = -(s-get_error(s));
 
-if (ret == -EINPROGRESS || ret == -EWOULDBLOCK)
-   qemu_set_fd_handler2(s-fd, NULL, NULL, unix_wait_for_connect, s);
 } while (ret == -EINTR);
 
 if (ret  0  ret != -EINPROGRESS  ret != -EWOULDBLOCK) {
 DPRINTF(connect failed\n);
 goto

Re: [RFC] New thread for the VM migration

2011-07-14 Thread Umesh Deshpande
The following patch deals with VCPU and iothread starvation during the migration
of a guest. Currently the iothread is responsible for performing the migration.
It holds qemu_mutex during the migration, which prevents VCPUs from entering
qemu mode and delays their return to the guest. The guest migration, executed as
an iohandler, also delays the execution of other iohandlers. In the following
patch, the migration has been moved to a separate thread to reduce qemu_mutex
contention and iohandler starvation.


Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c  |   19 +-
 buffered_file.c  |   10 
 cpu-all.h|   37 +
 exec.c   |   59 +++
 migration-exec.c |   17 ++---
 migration-fd.c   |   15 ++-
 migration-tcp.c  |   34 +++
 migration-unix.c |   23 ++
 migration.c  |   67 +-
 migration.h  |6 +++-
 qemu-timer.c |   28 +-
 qemu-timer.h |3 ++
 12 files changed, 245 insertions(+), 73 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 484b39d..f18dda2 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -123,13 +123,13 @@ static int ram_save_block(QEMUFile *f)
 current_addr = block-offset + offset;
 
 do {
-if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) 
{
+if (migration_bitmap_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
 uint8_t *p;
 int cont = (block == last_block) ? RAM_SAVE_FLAG_CONTINUE : 0;
 
-cpu_physical_memory_reset_dirty(current_addr,
-current_addr + TARGET_PAGE_SIZE,
-MIGRATION_DIRTY_FLAG);
+migration_bitmap_reset_dirty(current_addr,
+ current_addr + TARGET_PAGE_SIZE,
+ MIGRATION_DIRTY_FLAG);
 
 p = block-host + offset;
 
@@ -185,7 +185,7 @@ static ram_addr_t ram_save_remaining(void)
 ram_addr_t addr;
 for (addr = block-offset; addr  block-offset + block-length;
  addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
+if (migration_bitmap_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
 count++;
 }
 }
@@ -260,10 +260,15 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 return 0;
 }
 
+if (stage != 3)
+qemu_mutex_lock_iothread();
 if (cpu_physical_sync_dirty_bitmap(0, TARGET_PHYS_ADDR_MAX) != 0) {
 qemu_file_set_error(f);
 return 0;
 }
+sync_migration_bitmap(0, TARGET_PHYS_ADDR_MAX);
+if (stage != 3)
+qemu_mutex_unlock_iothread();
 
 if (stage == 1) {
 RAMBlock *block;
@@ -276,9 +281,9 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 QLIST_FOREACH(block, ram_list.blocks, next) {
 for (addr = block-offset; addr  block-offset + block-length;
  addr += TARGET_PAGE_SIZE) {
-if (!cpu_physical_memory_get_dirty(addr,
+if (!migration_bitmap_get_dirty(addr,
MIGRATION_DIRTY_FLAG)) {
-cpu_physical_memory_set_dirty(addr);
+migration_bitmap_set_dirty(addr);
 }
 }
 }
diff --git a/buffered_file.c b/buffered_file.c
index 41b42c3..e05efe8 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -237,7 +237,7 @@ static void buffered_rate_tick(void *opaque)
 return;
 }
 
-    qemu_mod_timer(s->timer, qemu_get_clock_ms(rt_clock) + 100);
+    qemu_mod_timer(s->timer, qemu_get_clock_ms(migration_clock) + 100);
 
     if (s->freeze_output)
         return;
@@ -246,8 +246,8 @@ static void buffered_rate_tick(void *opaque)
 
     buffered_flush(s);
 
-    /* Add some checks around this */
     s->put_ready(s->opaque);
+    usleep(qemu_timer_difference(s->timer, migration_clock) * 1000);
 }
 
 QEMUFile *qemu_fopen_ops_buffered(void *opaque,
@@ -271,11 +271,11 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
     s->file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
                              buffered_close, buffered_rate_limit,
                              buffered_set_rate_limit,
-                             buffered_get_rate_limit);
+                             buffered_get_rate_limit);
 
-    s->timer = qemu_new_timer_ms(rt_clock, buffered_rate_tick, s);
+    s->timer = qemu_new_timer_ms(migration_clock, buffered_rate_tick, s);
 
-    qemu_mod_timer(s->timer, qemu_get_clock_ms(rt_clock) + 100);
+    qemu_mod_timer(s->timer, qemu_get_clock_ms(migration_clock) + 100);
 
     return s->file;
 }
diff --git a/cpu

[RFC] New thread for the VM migration

2011-07-13 Thread Umesh Deshpande
The following patch deals with VCPU and iothread starvation during the migration
of a guest. Currently the iothread is responsible for performing the migration.
It holds qemu_mutex during the migration, which prevents VCPUs from entering
qemu mode and delays their return to the guest. The guest migration, executed as
an iohandler, also delays the execution of other iohandlers. In the following
patch, the migration has been moved to a separate thread to reduce qemu_mutex
contention and iohandler starvation.

Signed-off-by: Umesh Deshpande udesh...@redhat.com
---
 arch_init.c  |   19 +-
 buffered_file.c  |   10 
 cpu-all.h|   37 
 exec.c   |   70 ++
 migration-exec.c |   17 ++---
 migration-fd.c   |   15 ++-
 migration-tcp.c  |   34 ++---
 migration-unix.c |   23 ++---
 migration.c  |   67 +++
 migration.h  |6 +++-
 qemu-timer.c |   28 -
 qemu-timer.h |3 ++
 12 files changed, 256 insertions(+), 73 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 484b39d..f18dda2 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -123,13 +123,13 @@ static int ram_save_block(QEMUFile *f)
 current_addr = block-offset + offset;
 
 do {
-if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) 
{
+if (migration_bitmap_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) {
 uint8_t *p;
 int cont = (block == last_block) ? RAM_SAVE_FLAG_CONTINUE : 0;
 
-cpu_physical_memory_reset_dirty(current_addr,
-current_addr + TARGET_PAGE_SIZE,
-MIGRATION_DIRTY_FLAG);
+migration_bitmap_reset_dirty(current_addr,
+ current_addr + TARGET_PAGE_SIZE,
+ MIGRATION_DIRTY_FLAG);
 
 p = block-host + offset;
 
@@ -185,7 +185,7 @@ static ram_addr_t ram_save_remaining(void)
 ram_addr_t addr;
 for (addr = block-offset; addr  block-offset + block-length;
  addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
+if (migration_bitmap_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
 count++;
 }
 }
@@ -260,10 +260,15 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 return 0;
 }
 
+if (stage != 3)
+qemu_mutex_lock_iothread();
 if (cpu_physical_sync_dirty_bitmap(0, TARGET_PHYS_ADDR_MAX) != 0) {
 qemu_file_set_error(f);
 return 0;
 }
+sync_migration_bitmap(0, TARGET_PHYS_ADDR_MAX);
+if (stage != 3)
+qemu_mutex_unlock_iothread();
 
 if (stage == 1) {
 RAMBlock *block;
@@ -276,9 +281,9 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 QLIST_FOREACH(block, ram_list.blocks, next) {
 for (addr = block-offset; addr  block-offset + block-length;
  addr += TARGET_PAGE_SIZE) {
-if (!cpu_physical_memory_get_dirty(addr,
+if (!migration_bitmap_get_dirty(addr,
MIGRATION_DIRTY_FLAG)) {
-cpu_physical_memory_set_dirty(addr);
+migration_bitmap_set_dirty(addr);
 }
 }
 }
diff --git a/buffered_file.c b/buffered_file.c
index 41b42c3..e05efe8 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -237,7 +237,7 @@ static void buffered_rate_tick(void *opaque)
 return;
 }
 
-qemu_mod_timer(s-timer, qemu_get_clock_ms(rt_clock) + 100);
+qemu_mod_timer(s-timer, qemu_get_clock_ms(migration_clock) + 100);
 
 if (s-freeze_output)
 return;
@@ -246,8 +246,8 @@ static void buffered_rate_tick(void *opaque)
 
 buffered_flush(s);
 
-/* Add some checks around this */
 s-put_ready(s-opaque);
+usleep(qemu_timer_difference(s-timer, migration_clock) * 1000);
 }
 
 QEMUFile *qemu_fopen_ops_buffered(void *opaque,
@@ -271,11 +271,11 @@ QEMUFile *qemu_fopen_ops_buffered(void *opaque,
 s-file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
  buffered_close, buffered_rate_limit,
  buffered_set_rate_limit,
-buffered_get_rate_limit);
+ buffered_get_rate_limit);
 
-s-timer = qemu_new_timer_ms(rt_clock, buffered_rate_tick, s);
+s-timer = qemu_new_timer_ms(migration_clock, buffered_rate_tick, s);
 
-qemu_mod_timer(s-timer, qemu_get_clock_ms(rt_clock) + 100);
+qemu_mod_timer(s-timer, qemu_get_clock_ms(migration_clock) + 100);
 
 return s-file;
 }
diff --git a/cpu

[XFree86] Error

2006-03-02 Thread umesh deshpande

Log file: /var/log/XFree86.0.log.
The backup file contains the output of this file.

XFree86 Version 4.3.0 (Red Hat Linux release: 4.3.0-2)
Release Date: 27 February 2003
X Protocol Version 11, Revision 0, Release 6.6
Build Operating System: Linux 2.4.20-3bigmem i686 [ELF] 
Build Date: 27 February 2003
Build Host: porky.devel.redhat.com
 
Before reporting problems, check http://www.XFree86.Org/
to make sure that you have the latest version.
Module Loader present
OS Kernel: Linux version 2.4.20-8 ([EMAIL PROTECTED]) (gcc version 3.2.2 
20030222 (Red Hat Linux 3.2.2-5)) #1 Thu Mar 13 17:18:24 EST 2003 
Markers: (--) probed, (**) from config file, (==) default setting,
 (++) from command line, (!!) notice, (II) informational,
 (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: /var/log/XFree86.0.log, Time: Thu Mar  2 19:13:42 2006
(==) Using config file: /etc/X11/XF86Config
(==) ServerLayout Default Layout
(**) |--Screen Screen0 (0)
(**) |   |--Monitor Monitor0
(**) |   |--Device Videocard0
(**) |--Input Device Mouse0
(**) |--Input Device Keyboard0
(**) Option XkbRules xfree86
(**) XKB: rules: xfree86
(**) Option XkbModel pc105
(**) XKB: model: pc105
(**) Option XkbLayout us
(**) XKB: layout: us
(==) Keyboard: CustomKeycode disabled
(**) |--Input Device DevInputMice
(**) FontPath set to unix/:7100
(**) RgbPath set to /usr/X11R6/lib/X11/rgb
(==) ModulePath set to /usr/X11R6/lib/modules
(--) using VT number 9

(II) Open APM successful
(II) Module ABI versions:
XFree86 ANSI C Emulation: 0.2
XFree86 Video Driver: 0.6
XFree86 XInput driver : 0.4
XFree86 Server Extension : 0.2
XFree86 Font Renderer : 0.4
(II) Loader running on linux
(II) LoadModule: bitmap
(II) Loading /usr/X11R6/lib/modules/fonts/libbitmap.a
(II) Module bitmap: vendor=The XFree86 Project
compiled for 4.3.0, module version = 1.0.0
Module class: XFree86 Font Renderer
ABI class: XFree86 Font Renderer, version 0.4
(II) Loading font Bitmap
(II) LoadModule: pcidata
(II) Loading /usr/X11R6/lib/modules/libpcidata.a
(II) Module pcidata: vendor=The XFree86 Project
compiled for 4.3.0, module version = 1.0.0
ABI class: XFree86 Video Driver, version 0.6
(II) PCI: Probing config type using method 1
(II) PCI: Config type is 1
(II) PCI: stages = 0x03, oldVal1 = 0x, mode1Res1 = 0x8000
(II) PCI: PCI scan (all values are in hex)
(II) PCI: 00:00:0: chip 1106,3189 card 1106,3189 rev 00 class 06,00,00 hdr 00
(II) PCI: 00:01:0: chip 1106,b168 card , rev 00 class 06,04,00 hdr 01
(II) PCI: 00:09:0: chip 14f1,2f00 card 14f1,2004 rev 01 class 07,80,00 hdr 00
(II) PCI: 00:10:0: chip 1106,3038 card 1106,3038 rev 80 class 0c,03,00 hdr 80
(II) PCI: 00:10:1: chip 1106,3038 card 1106,3038 rev 80 class 0c,03,00 hdr 80
(II) PCI: 00:10:2: chip 1106,3038 card 1106,3038 rev 80 class 0c,03,00 hdr 80
(II) PCI: 00:10:3: chip 1106,3104 card 1106,3104 rev 82 class 0c,03,20 hdr 00
(II) PCI: 00:11:0: chip 1106,3177 card 1106,3177 rev 00 class 06,01,00 hdr 80
(II) PCI: 00:11:1: chip 1106,0571 card 1106,0571 rev 06 class 01,01,8a hdr 00
(II) PCI: 00:11:5: chip 1106,3059 card 1565,f614 rev 50 class 04,01,00 hdr 00
(II) PCI: 00:12:0: chip 1106,3065 card 1565,2200 rev 74 class 02,00,00 hdr 00
(II) PCI: 01:00:0: chip 10de,0182 card , rev a2 class 03,00,00 hdr 00
(II) PCI: End of PCI scan
(II) Host-to-PCI bridge:
(II) Bus 0: bridge is at (0:0:0), (0,0,1), BCTRL: 0x0008 (VGA_EN is set)
(II) Bus 0 I/O range:
[0] -1  0   0x - 0x (0x1) IX[B]
(II) Bus 0 non-prefetchable memory range:
[0] -1  0   0x - 0x (0x0) MX[B]
(II) Bus 0 prefetchable memory range:
[0] -1  0   0x - 0x (0x0) MX[B]
(II) PCI-to-PCI bridge:
(II) Bus 1: bridge is at (0:1:0), (0,1,1), BCTRL: 0x000c (VGA_EN is set)
(II) Bus 1 non-prefetchable memory range:
[0] -1  0   0xd400 - 0xd5ff (0x200) MX[B]
(II) Bus 1 prefetchable memory range:
[0] -1  0   0xd000 - 0xd3ff (0x400) MX[B]
(II) PCI-to-ISA bridge:
(II) Bus -1: bridge is at (0:17:0), (0,-1,-1), BCTRL: 0x0008 (VGA_EN is set)
(--) PCI:*(1:0:0) nVidia Corporation NV18 [GeForce4 MX 440SE AGP 8x] rev 162, 
Mem @ 0xd400/24, 0xd000/26
(II) Addressable bus resource ranges are
[0] -1  0   0x - 0x (0x0) MX[B]
[1] -1  0   0x - 0x (0x1) IX[B]
(II) OS-reported resource ranges:
[0] -1  0   0xffe0 - 0x (0x20) MX[B](B)
[1] -1  0   0x0010 - 0x3fff (0x3ff0) MX[B]E(B)
[2] -1  0   0x000f - 0x000f (0x1) MX[B]
[3] -1  0   0x000c - 0x000e (0x3) MX[B]
[4] -1  0   0x - 0x0009 (0xa) MX[B]
[5] -1  0   0x - 0x (0x1) IX[B]
[6] -1  0   0x - 0x00ff (0x100) IX[B]
(II) PCI Memory