Re: [Qemu-devel] [PATCH] Fix two XBZRLE corruption issues

2014-02-14 Thread Orit Wasserman
 /* Can't send this cached data async, since the cache page
+ * might get updated before it gets to the wire
+ */
+send_async = false;
  }
  }

  /* XBZRLE overflow or normal page */
  if (bytes_sent == -1) {
  bytes_sent = save_block_hdr(f, block, offset, cont, 
RAM_SAVE_FLAG_PAGE);
-qemu_put_buffer_async(f, p, TARGET_PAGE_SIZE);
+if (send_async) {
+qemu_put_buffer_async(f, p, TARGET_PAGE_SIZE);
+} else {
+qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+}
  bytes_sent += TARGET_PAGE_SIZE;
  acct_info.norm_pages++;
  }
diff --git a/include/migration/page_cache.h b/include/migration/page_cache.h
index d156f0d..2d5ce2d 100644
--- a/include/migration/page_cache.h
+++ b/include/migration/page_cache.h
@@ -66,7 +66,7 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t 
addr);
   * @addr: page address
   * @pdata: pointer to the page
   */
-int cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata);
+int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata);

  /**
   * cache_resize: resize the page cache. In case of size reduction the extra
diff --git a/page_cache.c b/page_cache.c
index 3ef6ee7..b033681 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -150,7 +150,7 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t 
addr)
  return cache_get_by_addr(cache, addr)->it_data;
  }

-int cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata)
+int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
  {

  CacheItem *it = NULL;



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [pve-devel] QEMU LIve Migration - swap_free: Bad swap file entry

2014-02-11 Thread Orit Wasserman

On 02/11/2014 03:33 PM, Stefan Priebe - Profihost AG wrote:


Am 11.02.2014 14:32, schrieb Orit Wasserman:

On 02/08/2014 09:23 PM, Stefan Priebe wrote:

i could fix it by explicitly disable xbzrle - it seems its
automatically on if i do not set the migration caps to false.

So it seems to be a xbzrle bug.



XBZRLE is disabled by default (actually all capabilities are off by
default)
What version of QEMU are you using that you need to disable it explicitly?
Maybe you run migration with XBZRLE and canceled it, so it stays on?


No real idea why this happens - but yes this seems to be a problem for me.



I checked upstream QEMU and it is still off by default (always been)


But the bug in XBZRLE is still there ;-)



We need to understand the exact scenario in order to understand the problem.

What exact version of Qemu are you using?
Can you try with the latest upstream version, there were some fixes to the
XBZRLE code?


Stefan


Orit


Stefan

Am 07.02.2014 21:10, schrieb Stefan Priebe:

Am 07.02.2014 21:02, schrieb Dr. David Alan Gilbert:

* Stefan Priebe (s.pri...@profihost.ag) wrote:

anything i could try or debug? to help to find the problem?


I think the most useful would be to see if the problem is
a new problem in the 1.7 you're using or has existed
for a while; depending on the machine type you used, it might
be possible to load that image on an earlier (or newer) qemu
and try the same test, however if the problem doesn't
repeat reliably it can be hard.


I've seen this first with Qemu 1.5 but was not able to reproduce it for
month. 1.4 was working fine.


If you have any way of simplifying the configuration of the
VM it would be good; e.g. if you could get a failure on
something without graphics (-nographic) and USB.


Sadly not ;-(


Dave



Stefan

Am 07.02.2014 14:45, schrieb Stefan Priebe - Profihost AG:

it's always the same "pattern" there are too many 0 instead of X.

only seen:

read:0x ... expected:0x

or

read:0x ... expected:0x

or

read:0xbf00bf00 ... expected:0xbfffbfff

or

read:0x ... expected:0xb5b5b5b5b5b5b5b5

no idea if this helps.

Stefan

Am 07.02.2014 14:39, schrieb Stefan Priebe - Profihost AG:

Hi,
Am 07.02.2014 14:19, schrieb Paolo Bonzini:

Il 07/02/2014 14:04, Stefan Priebe - Profihost AG ha scritto:

first of all i've now a memory image of a VM where i can
reproduce it.


You mean you start that VM with -incoming 'exec:cat
/path/to/vm.img'?
But google stress test doesn't report any error until you start
migration _and_ it finishes?


Sorry no i meant i have a VM where i saved the memory to disk - so i
don't need to wait hours until i can reproduce as it does not happen
with a fresh started VM. So it's a state file i think.


Another test:

- start the VM with -S, migrate, do errors appear on the
destination?


I started with -S and the errors appear AFTER resuming/unpause
the VM.
So it is fine until i resume it on the "new" host.

Stefan




--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK










Re: [Qemu-devel] [pve-devel] QEMU LIve Migration - swap_free: Bad swap file entry

2014-02-11 Thread Orit Wasserman

On 02/08/2014 09:23 PM, Stefan Priebe wrote:

i could fix it by explicitly disable xbzrle - it seems its automatically on if 
i do not set the migration caps to false.

So it seems to be a xbzrle bug.



XBZRLE is disabled by default (actually all capabilities are off by default)
What version of QEMU are you using that you need to disable it explicitly?
Maybe you run migration with XBZRLE and canceled it, so it stays on?

Orit


Stefan

Am 07.02.2014 21:10, schrieb Stefan Priebe:

Am 07.02.2014 21:02, schrieb Dr. David Alan Gilbert:

* Stefan Priebe (s.pri...@profihost.ag) wrote:

anything i could try or debug? to help to find the problem?


I think the most useful would be to see if the problem is
a new problem in the 1.7 you're using or has existed
for a while; depending on the machine type you used, it might
be possible to load that image on an earlier (or newer) qemu
and try the same test, however if the problem doesn't
repeat reliably it can be hard.


I've seen this first with Qemu 1.5 but was not able to reproduce it for
month. 1.4 was working fine.


If you have any way of simplifying the configuration of the
VM it would be good; e.g. if you could get a failure on
something without graphics (-nographic) and USB.


Sadly not ;-(


Dave



Stefan

Am 07.02.2014 14:45, schrieb Stefan Priebe - Profihost AG:

it's always the same "pattern" there are too many 0 instead of X.

only seen:

read:0x ... expected:0x

or

read:0x ... expected:0x

or

read:0xbf00bf00 ... expected:0xbfffbfff

or

read:0x ... expected:0xb5b5b5b5b5b5b5b5

no idea if this helps.

Stefan

Am 07.02.2014 14:39, schrieb Stefan Priebe - Profihost AG:

Hi,
Am 07.02.2014 14:19, schrieb Paolo Bonzini:

Il 07/02/2014 14:04, Stefan Priebe - Profihost AG ha scritto:

first of all i've now a memory image of a VM where i can
reproduce it.


You mean you start that VM with -incoming 'exec:cat /path/to/vm.img'?
But google stress test doesn't report any error until you start
migration _and_ it finishes?


Sorry no i meant i have a VM where i saved the memory to disk - so i
don't need to wait hours until i can reproduce as it does not happen
with a fresh started VM. So it's a state file i think.


Another test:

- start the VM with -S, migrate, do errors appear on the destination?


I started with -S and the errors appear AFTER resuming/unpause the VM.
So it is fine until i resume it on the "new" host.

Stefan




--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK








Re: [Qemu-devel] migration question: disk images on nfs server

2014-02-07 Thread Orit Wasserman

On 02/07/2014 03:44 PM, Marcin Gibuła wrote:

On 07.02.2014 14:36, Orit Wasserman wrote:

Do you know if is applies to linux O_DIRECT writes as well?



 From the man of open:

The behaviour of O_DIRECT with NFS will differ from local
filesystems.  Older kernels, or kernels configured in certain ways,
may not support this combination.  The NFS protocol does not
support
passing the flag to the server, so O_DIRECT I/O will bypass the
page
cache only on the client; the server may still cache the I/O.  The
client asks the server to make the I/O synchronous to preserve the
synchronous semantics of O_DIRECT.  Some servers will perform
poorly
under these circumstances, especially if the I/O size is small.
Some
servers may also be configured to lie to clients about the I/O
having
reached stable storage; this will avoid the performance penalty at
some risk to data integrity in the event of server power failure.
The Linux NFS client places no alignment restrictions on O_DIRECT
I/O.

To summaries it depends on your kernel (NFS client).


So, assuming new kernel (where nfs O_DIRECT translates to no cache at client 
side) and cache coherent server, is it enough or is 'sync' mount (or O_SYNC 
flag) still required for some reason?



I think is should be enough.



Re: [Qemu-devel] migration question: disk images on nfs server

2014-02-07 Thread Orit Wasserman

On 02/07/2014 02:54 PM, Marcin Gibuła wrote:

It is more a NFS issue, if you have a file in NFS that two users in
two different host are accessing (one at least write to it) you will
need to enforce the "sync" option.
Even if you flush all the data and close the file the NFS client can still
have cached data that it didn't sync to the server.


Do you know if is applies to linux O_DIRECT writes as well?



From the man of open:

   The behaviour of O_DIRECT with NFS will differ from local
   filesystems.  Older kernels, or kernels configured in certain ways,
   may not support this combination.  The NFS protocol does not support
   passing the flag to the server, so O_DIRECT I/O will bypass the page
   cache only on the client; the server may still cache the I/O.  The
   client asks the server to make the I/O synchronous to preserve the
   synchronous semantics of O_DIRECT.  Some servers will perform poorly
   under these circumstances, especially if the I/O size is small.  Some
   servers may also be configured to lie to clients about the I/O having
   reached stable storage; this will avoid the performance penalty at
   some risk to data integrity in the event of server power failure.
   The Linux NFS client places no alignment restrictions on O_DIRECT
   I/O.
 
To summaries it depends on your kernel (NFS client).




 From comment in fs/nfs/direct.c:

* When an application requests uncached I/O, all read and write requests
* are made directly to the server; data stored or fetched via these
* requests is not cached in the Linux page cache.  The client does not
* correct unaligned requests from applications.  All requested bytes are
* held on permanent storage before a direct write system call returns to
* an application.











Re: [Qemu-devel] migration question: disk images on nfs server

2014-02-07 Thread Orit Wasserman

On 02/07/2014 02:10 PM, Alexey Kardashevskiy wrote:

On 07.02.2014 18:46, Orit Wasserman wrote:

On 02/07/2014 06:35 AM, Alexey Kardashevskiy wrote:

Hi!

I have yet another problem with migration. Or NFS.

There is one NFS server and 2 test POWER8 machines. There is a shared NFS
folder on the server, mounted to both test hosts. There is an qcow2 image
(abc.qcow2) in that shared folder.

We start a guest with this abc.qcow2 on the test machine #1. And start
another guest on the test machine #2 with "-incoming ..." and same
abc.qcow2.

Now we start migration. In most cases it goes fine. But if we put some
load
on machine #1, the destination guest sometime crashes.

I blame out-of-sync NFS on the test machines. I looked a bit further in
QEMU and could not find a spot where it would fflush(abc.qcow2) or
close it
or do any other sync so it is up to the host NFS mountpoint to decide
when
to sync and it definitely does not get a clue when to do this.

I do not really understand why the abc.qcow2 image is still open, should
not it be closed after migration succeeded?

What do I miss here? Should we switch from NFS to GlusterFS (is it always
syncronized)? Or if we want NFS, should we just boot our guests with
"root=/dev/nfs ip=dhcp nfsroot=..." and avoid using disk images in
network
disks? Thanks!



For NFS you need to use the sync mount option to force the NFS client to
sync to
server on writes.


So there is no any kind of sync in QEMU after migration finished,
correct? Looks too mucn to enforce "sync" option for NFS as we really
need it for once.



It is more a NFS issue, if you have a file in NFS that two users in
two different host are accessing (one at least write to it) you will need to enforce the 
"sync" option.
Even if you flush all the data and close the file the NFS client can still
have cached data that it didn't sync to the server.








Re: [Qemu-devel] migration question: disk images on nfs server

2014-02-06 Thread Orit Wasserman

On 02/07/2014 06:35 AM, Alexey Kardashevskiy wrote:

Hi!

I have yet another problem with migration. Or NFS.

There is one NFS server and 2 test POWER8 machines. There is a shared NFS
folder on the server, mounted to both test hosts. There is an qcow2 image
(abc.qcow2) in that shared folder.

We start a guest with this abc.qcow2 on the test machine #1. And start
another guest on the test machine #2 with "-incoming ..." and same abc.qcow2.

Now we start migration. In most cases it goes fine. But if we put some load
on machine #1, the destination guest sometime crashes.

I blame out-of-sync NFS on the test machines. I looked a bit further in
QEMU and could not find a spot where it would fflush(abc.qcow2) or close it
or do any other sync so it is up to the host NFS mountpoint to decide when
to sync and it definitely does not get a clue when to do this.

I do not really understand why the abc.qcow2 image is still open, should
not it be closed after migration succeeded?

What do I miss here? Should we switch from NFS to GlusterFS (is it always
syncronized)? Or if we want NFS, should we just boot our guests with
"root=/dev/nfs ip=dhcp nfsroot=..." and avoid using disk images in network
disks? Thanks!



For NFS you need to use the sync mount option to force the NFS client to sync to
server on writes.

Orit










Re: [Qemu-devel] QEMU LIve Migration - swap_free: Bad swap file entry

2014-02-06 Thread Orit Wasserman

On 02/06/2014 09:20 AM, Stefan Priebe - Profihost AG wrote:

Am 05.02.2014 21:15, schrieb Dr. David Alan Gilbert:

* Stefan Priebe (s.pri...@profihost.ag) wrote:

Hello,

after live migrating machines with a lot of memory (32GB, 48GB, ...)
i see pretty often crashing services after migration and the guest
kernel prints:

[1707620.031806] swap_free: Bad swap file entry 00377410
[1707620.031806] swap_free: Bad swap file entry 00593c48
[1707620.031807] swap_free: Bad swap file entry 03201430
[1707620.031807] swap_free: Bad swap file entry 01bc5900
[1707620.031807] swap_free: Bad swap file entry 0173ce40
[1707620.031808] swap_free: Bad swap file entry 011c0270
[1707620.031808] swap_free: Bad swap file entry 03c58ae8
[1707660.749059] BUG: Bad rss-counter state mm:88064d09f380
idx:1 val:1536
[1707660.749937] BUG: Bad rss-counter state mm:88064d09f380
idx:2 val:-1536



Is this live migration with shared storage? what kind of shared storage?
Does this happens with smaller guests?


Qemu is 1.7

Does anybody know a fix?


I don't, but some more information about:
 1) What guest you're running


Linux guest the output is also from the guest. Kernel 3.10.26


 2) The configuration of your hosts


What do you mean by that?


 3) The command line (or XML if you're running libvirt) for
your qemu so we can see what devices you're running.


qemu -chardev
socket,id=qmp,path=/var/run/qemu-server/179.qmp,server,nowait -mon
chardev=qmp,mode=control -vnc
unix:/var/run/qemu-server/179.vnc,x509,password -pidfile
/var/run/qemu-server/179.pid -daemonize -name K31953 -smp
sockets=1,cores=16 -nodefaults -boot
menu=on,strict=on,reboot-timeout=1000 -vga cirrus -cpu
kvm64,+lahf_lm,+x2apic,+sep -k de -m 32768 -device
piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device
usb-tablet,id=tablet,bus=uhci.0,port=1 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -drive
if=none,id=drive-ide2,media=cdrom,aio=native -device
ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 -device
virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5 -drive
file=rbd:...,if=none,id=drive-scsi0,iops_rd=1000,iops_wr=500,bps_rd=314572800,bps_wr=209715200,aio=native,discard=on
-device
scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100
-netdev
type=tap,id=net0,ifname=tap179i0,script=/var/lib/qemu-server/pve-bridge,vhost=on
-device
virtio-net-pci,mac=CA:CA:23:AC:2D:C5,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300
-rtc base=localtime -machine type=pc-i440fx-1.7


Do you get any messages on either the source or destination
qemu during the migrate?


no

Stefan






[Qemu-devel] [PATCH] Remove trailing space from error message

2014-02-05 Thread Orit Wasserman
Signed-off-by: Orit Wasserman 
---
 migration.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration.c b/migration.c
index 25add6f..cc18f6c 100644
--- a/migration.c
+++ b/migration.c
@@ -482,7 +482,7 @@ void qmp_migrate_set_cache_size(int64_t value, Error **errp)
 /* Cache should not be larger than guest ram size */
 if (value > ram_bytes_total()) {
 error_set(errp, QERR_INVALID_PARAMETER_VALUE, "cache size",
-  "exceeds guest ram size ");
+  "exceeds guest ram size");
 return;
 }
 
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH 5/8] XBZRLE cache size should not be larger than guest memory size

2014-02-04 Thread Orit Wasserman

On 02/04/2014 06:26 PM, Eric Blake wrote:

On 02/04/2014 08:19 AM, Juan Quintela wrote:

From: Orit Wasserman 

Signed-off-by: Orit Wasserman 
Signed-off-by: Juan Quintela 
---
  migration.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/migration.c b/migration.c
index 46a7305..25add6f 100644
--- a/migration.c
+++ b/migration.c
@@ -479,6 +479,13 @@ void qmp_migrate_set_cache_size(int64_t value, Error 
**errp)
  return;
  }

+/* Cache should not be larger than guest ram size */
+if (value > ram_bytes_total()) {
+error_set(errp, QERR_INVALID_PARAMETER_VALUE, "cache size",
+  "exceeds guest ram size ");


Trailing space in the error message.



I will send a separate patch to fix this.

Orit



Re: [Qemu-devel] [PATCH v2 4/6] XBZRLE cache size should not be larger than guest memory size

2014-01-30 Thread Orit Wasserman

On 01/30/2014 08:48 PM, Dr. David Alan Gilbert wrote:

* Orit Wasserman (owass...@redhat.com) wrote:

On 01/30/2014 08:23 PM, Dr. David Alan Gilbert wrote:

* Orit Wasserman (owass...@redhat.com) wrote:

Signed-off-by: Orit Wasserman 
---
  migration.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/migration.c b/migration.c
index 46a7305..25add6f 100644
--- a/migration.c
+++ b/migration.c
@@ -479,6 +479,13 @@ void qmp_migrate_set_cache_size(int64_t value, Error 
**errp)
  return;
  }

+/* Cache should not be larger than guest ram size */


Why? (It's admittedly odd, but does it actually break something if it's larger?)



Because how XBZRLE works, the idea is that for workload that changes the same 
pages
frequently, we can reduce the amount of transferred data sent by sending only 
the diff.
We also compress the diff itself.

The cache is used to store the previous page so we can calculate the diff, so 
at most it will
contain all the guest pages.


It's a hash based cache though isn't it - so there will be some contention for
a cache size==ram size case?



not really because the hash function is so simple it becomes like an index into 
array.



Also this does mean that you have to be a little careful to pick a sane XBZRLE 
cache
size, since one that's too large will now fail; I can only see that being
a problem on a machine with a mix of huge and tiny VMs.



Yes if you pick the cache too small than you will have lots of missing and the 
compression
wont be useful. if you pick too large than you wasted memory for nothing.
What also problematic is picking the right workload, it will only be effective 
if between each
migration iteration the same pages are changed.


( I sent the reviewd-by tag separately.)


Thanks,
Orit


Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK






Re: [Qemu-devel] [PATCH v2 4/6] XBZRLE cache size should not be larger than guest memory size

2014-01-30 Thread Orit Wasserman

On 01/30/2014 08:23 PM, Dr. David Alan Gilbert wrote:

* Orit Wasserman (owass...@redhat.com) wrote:

Signed-off-by: Orit Wasserman 
---
  migration.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/migration.c b/migration.c
index 46a7305..25add6f 100644
--- a/migration.c
+++ b/migration.c
@@ -479,6 +479,13 @@ void qmp_migrate_set_cache_size(int64_t value, Error 
**errp)
  return;
  }

+/* Cache should not be larger than guest ram size */


Why? (It's admittedly odd, but does it actually break something if it's larger?)



Because how XBZRLE works, the idea is that for workload that changes the same 
pages
frequently, we can reduce the amount of transferred data sent by sending only 
the diff.
We also compress the diff itself.

The cache is used to store the previous page so we can calculate the diff, so 
at most it will
contain all the guest pages.

Orit


Dave


+if (value > ram_bytes_total()) {
+error_set(errp, QERR_INVALID_PARAMETER_VALUE, "cache size",
+  "exceeds guest ram size ");
+return;
+}
+
  new_size = xbzrle_cache_resize(value);
  if (new_size < 0) {
  error_set(errp, QERR_INVALID_PARAMETER_VALUE, "cache size",
--
1.8.3.1


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK






[Qemu-devel] [PATCH v2 6/6] Don't abort on memory allocation error

2014-01-30 Thread Orit Wasserman
It is better to fail migration in case of failure to
allocate new cache item

Signed-off-by: Orit Wasserman 
---
 arch_init.c|  4 +++-
 include/migration/page_cache.h |  4 +++-
 page_cache.c   | 16 +++-
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 1fa5f1f..80574a0 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -284,7 +284,9 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t 
*current_data,
 
 if (!cache_is_cached(XBZRLE.cache, current_addr)) {
 if (!last_stage) {
-cache_insert(XBZRLE.cache, current_addr, current_data);
+if (cache_insert(XBZRLE.cache, current_addr, current_data) == -1) {
+return -1;
+}
 }
 acct_info.xbzrle_cache_miss++;
 return -1;
diff --git a/include/migration/page_cache.h b/include/migration/page_cache.h
index 87894fe..d156f0d 100644
--- a/include/migration/page_cache.h
+++ b/include/migration/page_cache.h
@@ -60,11 +60,13 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t 
addr);
  * cache_insert: insert the page into the cache. the page cache
  * will dup the data on insert. the previous value will be overwritten
  *
+ * Returns -1 on error
+ *
  * @cache pointer to the PageCache struct
  * @addr: page address
  * @pdata: pointer to the page
  */
-void cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata);
+int cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata);
 
 /**
  * cache_resize: resize the page cache. In case of size reduction the extra
diff --git a/page_cache.c b/page_cache.c
index 62a53f8..3ef6ee7 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -150,7 +150,7 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t 
addr)
 return cache_get_by_addr(cache, addr)->it_data;
 }
 
-void cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata)
+int cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata)
 {
 
 CacheItem *it = NULL;
@@ -161,16 +161,22 @@ void cache_insert(PageCache *cache, uint64_t addr, 
uint8_t *pdata)
 /* actual update of entry */
 it = cache_get_by_addr(cache, addr);
 
-/* free old cached data if any */
-g_free(it->it_data);
-
+/* allocate page */
 if (!it->it_data) {
+it->it_data = g_try_malloc(cache->page_size);
+if (!it->it_data) {
+DPRINTF("Error allocating page\n");
+return -1;
+}
 cache->num_items++;
 }
 
-it->it_data = g_memdup(pdata, cache->page_size);
+memcpy(it->it_data, pdata, cache->page_size);
+
 it->it_age = ++cache->max_item_age;
 it->it_addr = addr;
+
+return 0;
 }
 
 int64_t cache_resize(PageCache *cache, int64_t new_num_pages)
-- 
1.8.3.1




[Qemu-devel] [PATCH v2 5/6] Don't abort on out of memory when creating page cache

2014-01-30 Thread Orit Wasserman
Signed-off-by: Orit Wasserman 
---
 arch_init.c  | 18 --
 page_cache.c | 18 ++
 2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 5eff80b..1fa5f1f 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -664,8 +664,22 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 DPRINTF("Error creating cache\n");
 return -1;
 }
-XBZRLE.encoded_buf = g_malloc0(TARGET_PAGE_SIZE);
-XBZRLE.current_buf = g_malloc(TARGET_PAGE_SIZE);
+
+/* We prefer not to abort if there is no memory */
+XBZRLE.encoded_buf = g_try_malloc0(TARGET_PAGE_SIZE);
+if (!XBZRLE.encoded_buf) {
+DPRINTF("Error allocating encoded_buf\n");
+return -1;
+}
+
+XBZRLE.current_buf = g_try_malloc(TARGET_PAGE_SIZE);
+if (!XBZRLE.current_buf) {
+DPRINTF("Error allocating current_buf\n");
+g_free(XBZRLE.encoded_buf);
+XBZRLE.encoded_buf = NULL;
+return -1;
+}
+
 acct_clear();
 }
 
diff --git a/page_cache.c b/page_cache.c
index a05db64..62a53f8 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -60,8 +60,12 @@ PageCache *cache_init(int64_t num_pages, unsigned int 
page_size)
 return NULL;
 }
 
-cache = g_malloc(sizeof(*cache));
-
+/* We prefer not to abort if there is no memory */
+cache = g_try_malloc(sizeof(*cache));
+if (!cache) {
+DPRINTF("Failed to allocate cache\n");
+return NULL;
+}
 /* round down to the nearest power of 2 */
 if (!is_power_of_2(num_pages)) {
 num_pages = pow2floor(num_pages);
@@ -74,8 +78,14 @@ PageCache *cache_init(int64_t num_pages, unsigned int 
page_size)
 
 DPRINTF("Setting cache buckets to %" PRId64 "\n", cache->max_num_items);
 
-cache->page_cache = g_malloc((cache->max_num_items) *
- sizeof(*cache->page_cache));
+/* We prefer not to abort if there is no memory */
+cache->page_cache = g_try_malloc((cache->max_num_items) *
+ sizeof(*cache->page_cache));
+if (!cache->page_cache) {
+DPRINTF("Failed to allocate cache->page_cache\n");
+g_free(cache);
+return NULL;
+}
 
 for (i = 0; i < cache->max_num_items; i++) {
 cache->page_cache[i].it_data = NULL;
-- 
1.8.3.1




[Qemu-devel] [PATCH v2 2/6] Add check for cache size smaller than page size

2014-01-30 Thread Orit Wasserman
Signed-off-by: Orit Wasserman 
Reviewed-by: Juan Quintela 
---
 arch_init.c |  4 
 migration.c | 10 +-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch_init.c b/arch_init.c
index 66f5e82..8edeabe 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -178,6 +178,10 @@ static struct {
 
 int64_t xbzrle_cache_resize(int64_t new_size)
 {
+if (new_size < TARGET_PAGE_SIZE) {
+return -1;
+}
+
 if (XBZRLE.cache != NULL) {
 return cache_resize(XBZRLE.cache, new_size / TARGET_PAGE_SIZE) *
 TARGET_PAGE_SIZE;
diff --git a/migration.c b/migration.c
index 7235c23..84587e9 100644
--- a/migration.c
+++ b/migration.c
@@ -469,6 +469,7 @@ void qmp_migrate_cancel(Error **errp)
 void qmp_migrate_set_cache_size(int64_t value, Error **errp)
 {
 MigrationState *s = migrate_get_current();
+int64_t new_size;
 
 /* Check for truncation */
 if (value != (size_t)value) {
@@ -477,7 +478,14 @@ void qmp_migrate_set_cache_size(int64_t value, Error 
**errp)
 return;
 }
 
-s->xbzrle_cache_size = xbzrle_cache_resize(value);
+new_size = xbzrle_cache_resize(value);
+if (new_size < 0) {
+error_set(errp, QERR_INVALID_PARAMETER_VALUE, "cache size",
+  "is smaller than page size");
+return;
+}
+
+s->xbzrle_cache_size = new_size;
 }
 
 int64_t qmp_query_migrate_cache_size(Error **errp)
-- 
1.8.3.1




[Qemu-devel] [PATCH v2 3/6] migration:fix free XBZRLE decoded_buf wrong

2014-01-30 Thread Orit Wasserman
From: "Gonglei (Arei)" 

When qemu do live migration with xbzrle, qemu malloc decoded_buf
at destination end but free it at source end. It will crash qemu
by double free error in some scenarios. Splitting the XBZRLE structure
for clear logic distinguishing src/dst side.

Signed-off-by: ChenLiang 
Reviewed-by: Peter Maydell 
Reviewed-by: Orit Wasserman 
Signed-off-by: GongLei 
---
 arch_init.c   | 22 --
 include/migration/migration.h |  1 +
 migration.c   |  1 +
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 8edeabe..5eff80b 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -164,17 +164,15 @@ static struct {
 uint8_t *encoded_buf;
 /* buffer for storing page content */
 uint8_t *current_buf;
-/* buffer used for XBZRLE decoding */
-uint8_t *decoded_buf;
 /* Cache for XBZRLE */
 PageCache *cache;
 } XBZRLE = {
 .encoded_buf = NULL,
 .current_buf = NULL,
-.decoded_buf = NULL,
 .cache = NULL,
 };
-
+/* buffer used for XBZRLE decoding */
+static uint8_t *xbzrle_decoded_buf;
 
 int64_t xbzrle_cache_resize(int64_t new_size)
 {
@@ -606,6 +604,12 @@ uint64_t ram_bytes_total(void)
 return total;
 }
 
+void free_xbzrle_decoded_buf(void)
+{
+g_free(xbzrle_decoded_buf);
+xbzrle_decoded_buf = NULL;
+}
+
 static void migration_end(void)
 {
 if (migration_bitmap) {
@@ -619,11 +623,9 @@ static void migration_end(void)
 g_free(XBZRLE.cache);
 g_free(XBZRLE.encoded_buf);
 g_free(XBZRLE.current_buf);
-g_free(XBZRLE.decoded_buf);
 XBZRLE.cache = NULL;
 XBZRLE.encoded_buf = NULL;
 XBZRLE.current_buf = NULL;
-XBZRLE.decoded_buf = NULL;
 }
 }
 
@@ -814,8 +816,8 @@ static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void 
*host)
 unsigned int xh_len;
 int xh_flags;
 
-if (!XBZRLE.decoded_buf) {
-XBZRLE.decoded_buf = g_malloc(TARGET_PAGE_SIZE);
+if (!xbzrle_decoded_buf) {
+xbzrle_decoded_buf = g_malloc(TARGET_PAGE_SIZE);
 }
 
 /* extract RLE header */
@@ -832,10 +834,10 @@ static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void 
*host)
 return -1;
 }
 /* load data and decode */
-qemu_get_buffer(f, XBZRLE.decoded_buf, xh_len);
+qemu_get_buffer(f, xbzrle_decoded_buf, xh_len);
 
 /* decode RLE */
-ret = xbzrle_decode_buffer(XBZRLE.decoded_buf, xh_len, host,
+ret = xbzrle_decode_buffer(xbzrle_decoded_buf, xh_len, host,
TARGET_PAGE_SIZE);
 if (ret == -1) {
 fprintf(stderr, "Failed to load XBZRLE page - decode error!\n");
diff --git a/include/migration/migration.h b/include/migration/migration.h
index bfa3951..3e1e6c7 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -109,6 +109,7 @@ MigrationState *migrate_get_current(void);
 uint64_t ram_bytes_remaining(void);
 uint64_t ram_bytes_transferred(void);
 uint64_t ram_bytes_total(void);
+void free_xbzrle_decoded_buf(void);
 
 void acct_update_position(QEMUFile *f, size_t size, bool zero);
 
diff --git a/migration.c b/migration.c
index 84587e9..46a7305 100644
--- a/migration.c
+++ b/migration.c
@@ -105,6 +105,7 @@ static void process_incoming_migration_co(void *opaque)
 
 ret = qemu_loadvm_state(f);
 qemu_fclose(f);
+free_xbzrle_decoded_buf();
 if (ret < 0) {
 fprintf(stderr, "load of migration failed\n");
 exit(EXIT_FAILURE);
-- 
1.8.3.1




[Qemu-devel] [PATCH v2 4/6] XBZRLE cache size should not be larger than guest memory size

2014-01-30 Thread Orit Wasserman
Signed-off-by: Orit Wasserman 
---
 migration.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/migration.c b/migration.c
index 46a7305..25add6f 100644
--- a/migration.c
+++ b/migration.c
@@ -479,6 +479,13 @@ void qmp_migrate_set_cache_size(int64_t value, Error 
**errp)
 return;
 }
 
+/* Cache should not be larger than guest ram size */
+if (value > ram_bytes_total()) {
+error_set(errp, QERR_INVALID_PARAMETER_VALUE, "cache size",
+  "exceeds guest ram size ");
+return;
+}
+
 new_size = xbzrle_cache_resize(value);
 if (new_size < 0) {
 error_set(errp, QERR_INVALID_PARAMETER_VALUE, "cache size",
-- 
1.8.3.1




[Qemu-devel] [PATCH v2 1/6] Set xbzrle buffers to NULL after freeing them to avoid double free errors

2014-01-30 Thread Orit Wasserman
Signed-off-by: Orit Wasserman 
Reviewed-by: Juan Quintela 
---
 arch_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch_init.c b/arch_init.c
index 77912e7..66f5e82 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -617,6 +617,9 @@ static void migration_end(void)
 g_free(XBZRLE.current_buf);
 g_free(XBZRLE.decoded_buf);
 XBZRLE.cache = NULL;
+XBZRLE.encoded_buf = NULL;
+XBZRLE.current_buf = NULL;
+XBZRLE.decoded_buf = NULL;
 }
 }
 
-- 
1.8.3.1




[Qemu-devel] [PATCH v2 0/6] XBZRLE Fixes

2014-01-30 Thread Orit Wasserman
Fix memory leak and missing NULLs.
Better cache size validation checks.
Fail migration instead of aborting QEMU in case there is not
enough memory for the page cache.

Changes from v1:
Free encoded_buf in case of failure to allocate 
current_buf.
Update num_items after successful allocation.

Gonglei (Arei) (1):
  migration:fix free XBZRLE decoded_buf wrong

Orit Wasserman (5):
  Set xbzrle buffers to NULL after freeing them to avoid double free
errors
  Add check for cache size smaller than page size
  XBZRLE cache size should not be larger than guest memory size
  Don't abort on out of memory when creating page cache
  Don't abort on memory allocation error

 arch_init.c| 49 +++---
 include/migration/migration.h  |  1 +
 include/migration/page_cache.h |  4 +++-
 migration.c| 18 +++-
 page_cache.c   | 34 +
 5 files changed, 83 insertions(+), 23 deletions(-)

-- 
1.8.3.1




Re: [Qemu-devel] [PATCH 6/6] Don't abort on memory allocation error

2014-01-30 Thread Orit Wasserman

On 01/30/2014 06:08 PM, Dr. David Alan Gilbert wrote:

* Orit Wasserman (owass...@redhat.com) wrote:

It is better to fail migration in case of failure to
allocate new cache item


Does this actually fail migration or just drop back to sending an uncompressed
page? (I think the latter, and that's an even better result).



It fails the migration, if the users wants to disable compression it can do it
and start migration again.


Signed-off-by: Orit Wasserman 
---
  arch_init.c|  4 +++-
  include/migration/page_cache.h |  4 +++-
  page_cache.c   | 16 +++-
  3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 806d096..0bfbc5a 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -284,7 +284,9 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t 
*current_data,

  if (!cache_is_cached(XBZRLE.cache, current_addr)) {
  if (!last_stage) {
-cache_insert(XBZRLE.cache, current_addr, current_data);
+if (cache_insert(XBZRLE.cache, current_addr, current_data) == -1) {
+return -1;
+}
  }
  acct_info.xbzrle_cache_miss++;
  return -1;
diff --git a/include/migration/page_cache.h b/include/migration/page_cache.h
index 87894fe..d156f0d 100644
--- a/include/migration/page_cache.h
+++ b/include/migration/page_cache.h
@@ -60,11 +60,13 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t 
addr);
   * cache_insert: insert the page into the cache. the page cache
   * will dup the data on insert. the previous value will be overwritten
   *
+ * Returns -1 on error
+ *
   * @cache pointer to the PageCache struct
   * @addr: page address
   * @pdata: pointer to the page
   */
-void cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata);
+int cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata);

  /**
   * cache_resize: resize the page cache. In case of size reduction the extra
diff --git a/page_cache.c b/page_cache.c
index 62a53f8..69e8329 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -150,7 +150,7 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t 
addr)
  return cache_get_by_addr(cache, addr)->it_data;
  }

-void cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata)
+int cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata)
  {

  CacheItem *it = NULL;
@@ -161,16 +161,22 @@ void cache_insert(PageCache *cache, uint64_t addr, 
uint8_t *pdata)
  /* actual update of entry */
  it = cache_get_by_addr(cache, addr);

-/* free old cached data if any */
-g_free(it->it_data);
-
+/* allocate page */
  if (!it->it_data) {
  cache->num_items++;
+it->it_data = g_try_malloc(cache->page_size);
+if (!it->it_data) {
+DPRINTF("Error allocating page\n");
+return -1;
+}


It feels like it would have been correct to do the num_items++ after the
allocation test (although since the num_items doesn't seem to be used
anywhere that's not a show-stopper, but worth fixing sometime).



You are correct, I will move it after allocation.

Orit

  }

-it->it_data = g_memdup(pdata, cache->page_size);
+memcpy(it->it_data, pdata, cache->page_size);
+
  it->it_age = ++cache->max_item_age;
  it->it_addr = addr;
+
+return 0;
  }

  int64_t cache_resize(PageCache *cache, int64_t new_num_pages)
--
1.8.3.1


Reviewed-by: Dr. David Alan Gilbert 

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK






Re: [Qemu-devel] [PATCH 5/6] Don't abort on out of memory when creating page cache

2014-01-30 Thread Orit Wasserman

On 01/30/2014 05:48 PM, Dr. David Alan Gilbert wrote:

* Orit Wasserman (owass...@redhat.com) wrote:

Signed-off-by: Orit Wasserman 
---
  arch_init.c  | 16 ++--
  page_cache.c | 18 ++
  2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 5eff80b..806d096 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -664,8 +664,20 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
  DPRINTF("Error creating cache\n");
  return -1;
  }
-XBZRLE.encoded_buf = g_malloc0(TARGET_PAGE_SIZE);
-XBZRLE.current_buf = g_malloc(TARGET_PAGE_SIZE);
+
+/* We prefer not to abort if there is no memory */
+XBZRLE.encoded_buf = g_try_malloc0(TARGET_PAGE_SIZE);
+if (!XBZRLE.encoded_buf) {
+DPRINTF("Error allocating encoded_buf\n");
+return -1;
+}
+
+XBZRLE.current_buf = g_try_malloc(TARGET_PAGE_SIZE);
+if (!XBZRLE.current_buf) {
+DPRINTF("Error allocating current_buf\n");
+return -1;
+}


Would it be best to free encoded_buf in this second exit case?



It is freed in migration_end (that is called when migration fails or canceled).

Orit


Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK






[Qemu-devel] [PATCH 6/6] Don't abort on memory allocation error

2014-01-29 Thread Orit Wasserman
It is better to fail migration in case of failure to
allocate new cache item

Signed-off-by: Orit Wasserman 
---
 arch_init.c|  4 +++-
 include/migration/page_cache.h |  4 +++-
 page_cache.c   | 16 +++-
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 806d096..0bfbc5a 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -284,7 +284,9 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t 
*current_data,
 
 if (!cache_is_cached(XBZRLE.cache, current_addr)) {
 if (!last_stage) {
-cache_insert(XBZRLE.cache, current_addr, current_data);
+if (cache_insert(XBZRLE.cache, current_addr, current_data) == -1) {
+return -1;
+}
 }
 acct_info.xbzrle_cache_miss++;
 return -1;
diff --git a/include/migration/page_cache.h b/include/migration/page_cache.h
index 87894fe..d156f0d 100644
--- a/include/migration/page_cache.h
+++ b/include/migration/page_cache.h
@@ -60,11 +60,13 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t 
addr);
  * cache_insert: insert the page into the cache. the page cache
  * will dup the data on insert. the previous value will be overwritten
  *
+ * Returns -1 on error
+ *
  * @cache pointer to the PageCache struct
  * @addr: page address
  * @pdata: pointer to the page
  */
-void cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata);
+int cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata);
 
 /**
  * cache_resize: resize the page cache. In case of size reduction the extra
diff --git a/page_cache.c b/page_cache.c
index 62a53f8..69e8329 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -150,7 +150,7 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t 
addr)
 return cache_get_by_addr(cache, addr)->it_data;
 }
 
-void cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata)
+int cache_insert(PageCache *cache, uint64_t addr, uint8_t *pdata)
 {
 
 CacheItem *it = NULL;
@@ -161,16 +161,22 @@ void cache_insert(PageCache *cache, uint64_t addr, 
uint8_t *pdata)
 /* actual update of entry */
 it = cache_get_by_addr(cache, addr);
 
-/* free old cached data if any */
-g_free(it->it_data);
-
+/* allocate page */
 if (!it->it_data) {
 cache->num_items++;
+it->it_data = g_try_malloc(cache->page_size);
+if (!it->it_data) {
+DPRINTF("Error allocating page\n");
+return -1;
+}
 }
 
-it->it_data = g_memdup(pdata, cache->page_size);
+memcpy(it->it_data, pdata, cache->page_size);
+
 it->it_age = ++cache->max_item_age;
 it->it_addr = addr;
+
+return 0;
 }
 
 int64_t cache_resize(PageCache *cache, int64_t new_num_pages)
-- 
1.8.3.1




[Qemu-devel] [PATCH 4/6] XBZRLE cache size should not be larger than guest memory size

2014-01-29 Thread Orit Wasserman
Signed-off-by: Orit Wasserman 
---
 migration.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/migration.c b/migration.c
index 46a7305..25add6f 100644
--- a/migration.c
+++ b/migration.c
@@ -479,6 +479,13 @@ void qmp_migrate_set_cache_size(int64_t value, Error 
**errp)
 return;
 }
 
+/* Cache should not be larger than guest ram size */
+if (value > ram_bytes_total()) {
+error_set(errp, QERR_INVALID_PARAMETER_VALUE, "cache size",
+  "exceeds guest ram size ");
+return;
+}
+
 new_size = xbzrle_cache_resize(value);
 if (new_size < 0) {
 error_set(errp, QERR_INVALID_PARAMETER_VALUE, "cache size",
-- 
1.8.3.1




[Qemu-devel] [PATCH 2/6] Add check for cache size smaller than page size

2014-01-29 Thread Orit Wasserman
Signed-off-by: Orit Wasserman 
Reviewed-by: Juan Quintela 
---
 arch_init.c |  4 
 migration.c | 10 +-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch_init.c b/arch_init.c
index 66f5e82..8edeabe 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -178,6 +178,10 @@ static struct {
 
 int64_t xbzrle_cache_resize(int64_t new_size)
 {
+if (new_size < TARGET_PAGE_SIZE) {
+return -1;
+}
+
 if (XBZRLE.cache != NULL) {
 return cache_resize(XBZRLE.cache, new_size / TARGET_PAGE_SIZE) *
 TARGET_PAGE_SIZE;
diff --git a/migration.c b/migration.c
index 7235c23..84587e9 100644
--- a/migration.c
+++ b/migration.c
@@ -469,6 +469,7 @@ void qmp_migrate_cancel(Error **errp)
 void qmp_migrate_set_cache_size(int64_t value, Error **errp)
 {
 MigrationState *s = migrate_get_current();
+int64_t new_size;
 
 /* Check for truncation */
 if (value != (size_t)value) {
@@ -477,7 +478,14 @@ void qmp_migrate_set_cache_size(int64_t value, Error 
**errp)
 return;
 }
 
-s->xbzrle_cache_size = xbzrle_cache_resize(value);
+new_size = xbzrle_cache_resize(value);
+if (new_size < 0) {
+error_set(errp, QERR_INVALID_PARAMETER_VALUE, "cache size",
+  "is smaller than page size");
+return;
+}
+
+s->xbzrle_cache_size = new_size;
 }
 
 int64_t qmp_query_migrate_cache_size(Error **errp)
-- 
1.8.3.1




[Qemu-devel] [PATCH 5/6] Don't abort on out of memory when creating page cache

2014-01-29 Thread Orit Wasserman
Signed-off-by: Orit Wasserman 
---
 arch_init.c  | 16 ++--
 page_cache.c | 18 ++
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 5eff80b..806d096 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -664,8 +664,20 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 DPRINTF("Error creating cache\n");
 return -1;
 }
-XBZRLE.encoded_buf = g_malloc0(TARGET_PAGE_SIZE);
-XBZRLE.current_buf = g_malloc(TARGET_PAGE_SIZE);
+
+/* We prefer not to abort if there is no memory */
+XBZRLE.encoded_buf = g_try_malloc0(TARGET_PAGE_SIZE);
+if (!XBZRLE.encoded_buf) {
+DPRINTF("Error allocating encoded_buf\n");
+return -1;
+}
+
+XBZRLE.current_buf = g_try_malloc(TARGET_PAGE_SIZE);
+if (!XBZRLE.current_buf) {
+DPRINTF("Error allocating current_buf\n");
+return -1;
+}
+
 acct_clear();
 }
 
diff --git a/page_cache.c b/page_cache.c
index a05db64..62a53f8 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -60,8 +60,12 @@ PageCache *cache_init(int64_t num_pages, unsigned int 
page_size)
 return NULL;
 }
 
-cache = g_malloc(sizeof(*cache));
-
+/* We prefer not to abort if there is no memory */
+cache = g_try_malloc(sizeof(*cache));
+if (!cache) {
+DPRINTF("Failed to allocate cache\n");
+return NULL;
+}
 /* round down to the nearest power of 2 */
 if (!is_power_of_2(num_pages)) {
 num_pages = pow2floor(num_pages);
@@ -74,8 +78,14 @@ PageCache *cache_init(int64_t num_pages, unsigned int 
page_size)
 
 DPRINTF("Setting cache buckets to %" PRId64 "\n", cache->max_num_items);
 
-cache->page_cache = g_malloc((cache->max_num_items) *
- sizeof(*cache->page_cache));
+/* We prefer not to abort if there is no memory */
+cache->page_cache = g_try_malloc((cache->max_num_items) *
+ sizeof(*cache->page_cache));
+if (!cache->page_cache) {
+DPRINTF("Failed to allocate cache->page_cache\n");
+g_free(cache);
+return NULL;
+}
 
 for (i = 0; i < cache->max_num_items; i++) {
 cache->page_cache[i].it_data = NULL;
-- 
1.8.3.1




[Qemu-devel] [PATCH 1/6] Set xbzrle buffers to NULL after freeing them to avoid double free errors

2014-01-29 Thread Orit Wasserman
Signed-off-by: Orit Wasserman 
Reviewed-by: Juan Quintela 
---
 arch_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch_init.c b/arch_init.c
index 77912e7..66f5e82 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -617,6 +617,9 @@ static void migration_end(void)
 g_free(XBZRLE.current_buf);
 g_free(XBZRLE.decoded_buf);
 XBZRLE.cache = NULL;
+XBZRLE.encoded_buf = NULL;
+XBZRLE.current_buf = NULL;
+XBZRLE.decoded_buf = NULL;
 }
 }
 
-- 
1.8.3.1




[Qemu-devel] [PATCH 3/6] migration:fix free XBZRLE decoded_buf wrong

2014-01-29 Thread Orit Wasserman
From: "Gonglei (Arei)" 

When qemu do live migration with xbzrle, qemu malloc decoded_buf
at destination end but free it at source end. It will crash qemu
by double free error in some scenarios. Splitting the XBZRLE structure
for clear logic distinguishing src/dst side.

Signed-off-by: ChenLiang 
Reviewed-by: Peter Maydell 
Reviewed-by: Orit Wasserman 
Signed-off-by: GongLei 
---
 arch_init.c   | 22 --
 include/migration/migration.h |  1 +
 migration.c   |  1 +
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 8edeabe..5eff80b 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -164,17 +164,15 @@ static struct {
 uint8_t *encoded_buf;
 /* buffer for storing page content */
 uint8_t *current_buf;
-/* buffer used for XBZRLE decoding */
-uint8_t *decoded_buf;
 /* Cache for XBZRLE */
 PageCache *cache;
 } XBZRLE = {
 .encoded_buf = NULL,
 .current_buf = NULL,
-.decoded_buf = NULL,
 .cache = NULL,
 };
-
+/* buffer used for XBZRLE decoding */
+static uint8_t *xbzrle_decoded_buf;
 
 int64_t xbzrle_cache_resize(int64_t new_size)
 {
@@ -606,6 +604,12 @@ uint64_t ram_bytes_total(void)
 return total;
 }
 
+void free_xbzrle_decoded_buf(void)
+{
+g_free(xbzrle_decoded_buf);
+xbzrle_decoded_buf = NULL;
+}
+
 static void migration_end(void)
 {
 if (migration_bitmap) {
@@ -619,11 +623,9 @@ static void migration_end(void)
 g_free(XBZRLE.cache);
 g_free(XBZRLE.encoded_buf);
 g_free(XBZRLE.current_buf);
-g_free(XBZRLE.decoded_buf);
 XBZRLE.cache = NULL;
 XBZRLE.encoded_buf = NULL;
 XBZRLE.current_buf = NULL;
-XBZRLE.decoded_buf = NULL;
 }
 }
 
@@ -814,8 +816,8 @@ static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void 
*host)
 unsigned int xh_len;
 int xh_flags;
 
-if (!XBZRLE.decoded_buf) {
-XBZRLE.decoded_buf = g_malloc(TARGET_PAGE_SIZE);
+if (!xbzrle_decoded_buf) {
+xbzrle_decoded_buf = g_malloc(TARGET_PAGE_SIZE);
 }
 
 /* extract RLE header */
@@ -832,10 +834,10 @@ static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void 
*host)
 return -1;
 }
 /* load data and decode */
-qemu_get_buffer(f, XBZRLE.decoded_buf, xh_len);
+qemu_get_buffer(f, xbzrle_decoded_buf, xh_len);
 
 /* decode RLE */
-ret = xbzrle_decode_buffer(XBZRLE.decoded_buf, xh_len, host,
+ret = xbzrle_decode_buffer(xbzrle_decoded_buf, xh_len, host,
TARGET_PAGE_SIZE);
 if (ret == -1) {
 fprintf(stderr, "Failed to load XBZRLE page - decode error!\n");
diff --git a/include/migration/migration.h b/include/migration/migration.h
index bfa3951..3e1e6c7 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -109,6 +109,7 @@ MigrationState *migrate_get_current(void);
 uint64_t ram_bytes_remaining(void);
 uint64_t ram_bytes_transferred(void);
 uint64_t ram_bytes_total(void);
+void free_xbzrle_decoded_buf(void);
 
 void acct_update_position(QEMUFile *f, size_t size, bool zero);
 
diff --git a/migration.c b/migration.c
index 84587e9..46a7305 100644
--- a/migration.c
+++ b/migration.c
@@ -105,6 +105,7 @@ static void process_incoming_migration_co(void *opaque)
 
 ret = qemu_loadvm_state(f);
 qemu_fclose(f);
+free_xbzrle_decoded_buf();
 if (ret < 0) {
 fprintf(stderr, "load of migration failed\n");
 exit(EXIT_FAILURE);
-- 
1.8.3.1




[Qemu-devel] [PATCH 0/6] XBZRLE Fixes

2014-01-29 Thread Orit Wasserman
Fix memory leak and missing NULLs.
Better cache size validation checks.
Fail migration instead of aborting QEMU in case there is not
enough memory for the page cache.

Gonglei (Arei) (1):
  migration:fix free XBZRLE decoded_buf wrong

Orit Wasserman (5):
  Set xbzrle buffers to NULL after freeing them to avoid double free
errors
  Add check for cache size smaller than page size
  XBZRLE cache size should not be larger than guest memory size
  Don't abort on out of memory when creating page cache
  Don't abort on memory allocation error

 arch_init.c| 47 +++---
 include/migration/migration.h  |  1 +
 include/migration/page_cache.h |  4 +++-
 migration.c| 18 +++-
 page_cache.c   | 34 ++
 5 files changed, 81 insertions(+), 23 deletions(-)

-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v2] migration:fix free XBZRLE decoded_buf wrong

2014-01-21 Thread Orit Wasserman

On 01/21/2014 02:58 PM, Gonglei (Arei) wrote:



-Original Message-
From: Orit Wasserman [mailto:owass...@redhat.com]
Sent: Tuesday, January 21, 2014 8:24 PM
To: Gonglei (Arei); qemu-devel@nongnu.org; qemu-sta...@nongnu.org; Peter
Maydell; anth...@codemonkey.ws; pbonz...@redhat.com
Cc: chenliang (T); Luonengjun; Huangweidong (Hardware)
Subject: Re: [Qemu-devel] [PATCH v2] migration:fix free XBZRLE decoded_buf
wrong

On 01/21/2014 02:11 PM, Gonglei (Arei) wrote:

Hi,

This is an update of my patch.
Modifications in v2:
* Removing excess check for g_free
* The structure of XBZRLE is divided into two halves.One is for
* src side, another is for dest side.



What is the benefit of splitting the structure?
decode_buf is only allocated (and freed) in the destination any way.


Yeah, you are right. Splitting the structure is not necessary.
The key to do that is just for clear logic. As Peter said:
the current arrangement looks extremely prone to bugs like
this one where somebody forgets that some of the fields are
not relevant to whichever of src/dst the code path they're
writing is used on.

Best regards,
-Gonglei



Sounds reasonable.
Thanks for finding the leak and fixing it.

Orit

Orit




Re: [Qemu-devel] [PATCH v2] migration:fix free XBZRLE decoded_buf wrong

2014-01-21 Thread Orit Wasserman

On 01/21/2014 02:11 PM, Gonglei (Arei) wrote:

Hi,

This is an update of my patch.
Modifications in v2:
* Removing excess check for g_free
* The structure of XBZRLE is divided into two halves.One is for
* src side, another is for dest side.



What is the benefit of splitting the structure?
decode_buf is only allocated (and freed) in the destination any way.

Orit


Signed-off-by: chenliang 
---
  arch_init.c   |   23 ++-
  include/migration/migration.h |1 +
  migration.c   |1 +
  3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 77912e7..9a28a43 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -164,17 +164,15 @@ static struct {
  uint8_t *encoded_buf;
  /* buffer for storing page content */
  uint8_t *current_buf;
-/* buffer used for XBZRLE decoding */
-uint8_t *decoded_buf;
  /* Cache for XBZRLE */
  PageCache *cache;
  } XBZRLE = {
  .encoded_buf = NULL,
  .current_buf = NULL,
-.decoded_buf = NULL,
  .cache = NULL,
  };
-
+/* buffer used for XBZRLE decoding */
+static uint8_t *xbzrle_decoded_buf = NULL;

  int64_t xbzrle_cache_resize(int64_t new_size)
  {
@@ -602,6 +600,12 @@ uint64_t ram_bytes_total(void)
  return total;
  }

+void free_xbzrle_decoded_buf(void)
+{
+g_free(xbzrle_decoded_buf);
+xbzrle_decoded_buf = NULL;
+}
+
  static void migration_end(void)
  {
  if (migration_bitmap) {
@@ -615,8 +619,9 @@ static void migration_end(void)
  g_free(XBZRLE.cache);
  g_free(XBZRLE.encoded_buf);
  g_free(XBZRLE.current_buf);
-g_free(XBZRLE.decoded_buf);
  XBZRLE.cache = NULL;
+XBZRLE.encoded_buf = NULL;
+XBZRLE.current_buf = NULL;
  }
  }

@@ -807,8 +812,8 @@ static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void 
*host)
  unsigned int xh_len;
  int xh_flags;

-if (!XBZRLE.decoded_buf) {
-XBZRLE.decoded_buf = g_malloc(TARGET_PAGE_SIZE);
+if (!xbzrle_decoded_buf) {
+xbzrle_decoded_buf = g_malloc(TARGET_PAGE_SIZE);
  }

  /* extract RLE header */
@@ -825,10 +830,10 @@ static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void 
*host)
  return -1;
  }
  /* load data and decode */
-qemu_get_buffer(f, XBZRLE.decoded_buf, xh_len);
+qemu_get_buffer(f, xbzrle_decoded_buf, xh_len);

  /* decode RLE */
-ret = xbzrle_decode_buffer(XBZRLE.decoded_buf, xh_len, host,
+ret = xbzrle_decode_buffer(xbzrle_decoded_buf, xh_len, host,
 TARGET_PAGE_SIZE);
  if (ret == -1) {
  fprintf(stderr, "Failed to load XBZRLE page - decode error!\n");
diff --git a/include/migration/migration.h b/include/migration/migration.h
index bfa3951..3e1e6c7 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -109,6 +109,7 @@ MigrationState *migrate_get_current(void);
  uint64_t ram_bytes_remaining(void);
  uint64_t ram_bytes_transferred(void);
  uint64_t ram_bytes_total(void);
+void free_xbzrle_decoded_buf(void);

  void acct_update_position(QEMUFile *f, size_t size, bool zero);

diff --git a/migration.c b/migration.c
index 7235c23..3d46804 100644
--- a/migration.c
+++ b/migration.c
@@ -105,6 +105,7 @@ static void process_incoming_migration_co(void *opaque)

  ret = qemu_loadvm_state(f);
  qemu_fclose(f);
+free_xbzrle_decoded_buf();
  if (ret < 0) {
  fprintf(stderr, "load of migration failed\n");
  exit(EXIT_FAILURE);






[Qemu-devel] [PATCH 1/2] Set xbzrle buffers to NULL after freeing them to avoid double free errors

2014-01-07 Thread Orit Wasserman
Signed-off-by: Orit Wasserman 
---
 arch_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch_init.c b/arch_init.c
index e0acbc5..5c55c68 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -587,6 +587,9 @@ static void migration_end(void)
 g_free(XBZRLE.current_buf);
 g_free(XBZRLE.decoded_buf);
 XBZRLE.cache = NULL;
+XBZRLE.encoded_buf = NULL;
+XBZRLE.current_buf = NULL;
+XBZRLE.decoded_buf = NULL;
 }
 }
 
-- 
1.8.3.1




[Qemu-devel] [PATCH 2/2] Add check for cache size smaller than page size

2014-01-07 Thread Orit Wasserman
Signed-off-by: Orit Wasserman 
---
 arch_init.c |  4 
 migration.c | 10 +-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch_init.c b/arch_init.c
index 5c55c68..e52c9ba 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -176,6 +176,10 @@ static struct {
 
 int64_t xbzrle_cache_resize(int64_t new_size)
 {
+if (new_size < TARGET_PAGE_SIZE) {
+return -1;
+}
+
 if (XBZRLE.cache != NULL) {
 return cache_resize(XBZRLE.cache, new_size / TARGET_PAGE_SIZE) *
 TARGET_PAGE_SIZE;
diff --git a/migration.c b/migration.c
index 2b1ab20..f28aa1d 100644
--- a/migration.c
+++ b/migration.c
@@ -455,6 +455,7 @@ void qmp_migrate_cancel(Error **errp)
 void qmp_migrate_set_cache_size(int64_t value, Error **errp)
 {
 MigrationState *s = migrate_get_current();
+int64_t new_size; 
 
 /* Check for truncation */
 if (value != (size_t)value) {
@@ -463,8 +464,14 @@ void qmp_migrate_set_cache_size(int64_t value, Error 
**errp)
 return;
 }
 
-s->xbzrle_cache_size = xbzrle_cache_resize(value);
+new_size = xbzrle_cache_resize(value);
+if (new_size < 0) {
+error_set(errp, QERR_INVALID_PARAMETER_VALUE, "cache size",
+  "is smaller than page size");
+return;
+}
+
+s->xbzrle_cache_size = new_size;
 }
 
 int64_t qmp_query_migrate_cache_size(Error **errp)
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH 01/38] bitmap: use long as index

2013-12-24 Thread Orit Wasserman
 long bits)
  {
-int k;
-int nr = BITS_TO_LONGS(bits);
+long k;
+long nr = BITS_TO_LONGS(bits);

  for (k = 0; k < nr; k++) {
  dst[k] = bitmap1[k] ^ bitmap2[k];
@@ -143,10 +143,10 @@ void slow_bitmap_xor(unsigned long *dst, const unsigned 
long *bitmap1,
  }

  int slow_bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
-   const unsigned long *bitmap2, int bits)
+   const unsigned long *bitmap2, long bits)
  {
-int k;
-int nr = BITS_TO_LONGS(bits);
+long k;
+long nr = BITS_TO_LONGS(bits);
  unsigned long result = 0;

  for (k = 0; k < nr; k++) {
@@ -157,10 +157,10 @@ int slow_bitmap_andnot(unsigned long *dst, const unsigned 
long *bitmap1,

  #define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) % BITS_PER_LONG))

-void bitmap_set(unsigned long *map, int start, int nr)
+void bitmap_set(unsigned long *map, long start, long nr)
  {
  unsigned long *p = map + BIT_WORD(start);
-const int size = start + nr;
+const long size = start + nr;
  int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
  unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);

@@ -177,10 +177,10 @@ void bitmap_set(unsigned long *map, int start, int nr)
  }
  }

-void bitmap_clear(unsigned long *map, int start, int nr)
+void bitmap_clear(unsigned long *map, long start, long nr)
  {
  unsigned long *p = map + BIT_WORD(start);
-const int size = start + nr;
+const long size = start + nr;
  int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
  unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);

@@ -212,10 +212,10 @@ void bitmap_clear(unsigned long *map, int start, int nr)
   * power of 2. A @align_mask of 0 means no alignment is required.
   */
  unsigned long bitmap_find_next_zero_area(unsigned long *map,
-unsigned long size,
-unsigned long start,
-unsigned int nr,
-unsigned long align_mask)
+ unsigned long size,
+ unsigned long start,
+ unsigned long nr,
+ unsigned long align_mask)
  {
  unsigned long index, end, i;
  again:
@@ -237,9 +237,9 @@ again:
  }

  int slow_bitmap_intersects(const unsigned long *bitmap1,
-   const unsigned long *bitmap2, int bits)
+   const unsigned long *bitmap2, long bits)
  {
-int k, lim = bits/BITS_PER_LONG;
+long k, lim = bits/BITS_PER_LONG;

  for (k = 0; k < lim; ++k) {
  if (bitmap1[k] & bitmap2[k]) {



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 37/38] migration: synchronize memory bitmap 64bits at a time

2013-12-19 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

We use the old code if the bitmaps are not aligned

Signed-off-by: Juan Quintela 
---
  arch_init.c | 38 +-
  1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 2cd3d00..77912e7 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -50,6 +50,7 @@
  #include "exec/cpu-all.h"
  #include "exec/ram_addr.h"
  #include "hw/acpi/acpi.h"
+#include "qemu/host-utils.h"

  #ifdef DEBUG_ARCH_INIT
  #define DPRINTF(fmt, ...) \
@@ -376,15 +377,34 @@ static inline bool migration_bitmap_set_dirty(ram_addr_t 
addr)
  static void migration_bitmap_sync_range(ram_addr_t start, ram_addr_t length)
  {
  ram_addr_t addr;
-
-for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(start + addr,
-  TARGET_PAGE_SIZE,
-  DIRTY_MEMORY_MIGRATION)) {
-cpu_physical_memory_reset_dirty(start + addr,
-TARGET_PAGE_SIZE,
-DIRTY_MEMORY_MIGRATION);
-migration_bitmap_set_dirty(start + addr);
+unsigned long page = BIT_WORD(start >> TARGET_PAGE_BITS);
+
+/* start address is aligned at the start of a word? */
+if (((page * BITS_PER_LONG) << TARGET_PAGE_BITS) == start) {
+int k;
+int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
+unsigned long *src = ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION];
+
+for (k = page; k < page + nr; k++) {
+if (src[k]) {
+unsigned long new_dirty;
+new_dirty = ~migration_bitmap[k];
+migration_bitmap[k] |= src[k];
+new_dirty &= src[k];
+migration_dirty_pages += ctpopl(new_dirty);
+src[k] = 0;
+}
+}
+} else {
+for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+if (cpu_physical_memory_get_dirty(start + addr,
+  TARGET_PAGE_SIZE,
+  DIRTY_MEMORY_MIGRATION)) {
+cpu_physical_memory_reset_dirty(start + addr,
+TARGET_PAGE_SIZE,
+DIRTY_MEMORY_MIGRATION);
+migration_bitmap_set_dirty(start + addr);
+}
  }
  }
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 36/38] ram: split function that synchronizes a range

2013-12-19 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

This function is the only bit where we care about speed.

Signed-off-by: Juan Quintela 
---
  arch_init.c | 34 --
  1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 0e8c8b5..2cd3d00 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -360,11 +360,10 @@ ram_addr_t 
migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
  return (next - base) << TARGET_PAGE_BITS;
  }

-static inline bool migration_bitmap_set_dirty(MemoryRegion *mr,
-  ram_addr_t offset)
+static inline bool migration_bitmap_set_dirty(ram_addr_t addr)
  {
  bool ret;
-int nr = (mr->ram_addr + offset) >> TARGET_PAGE_BITS;
+int nr = addr >> TARGET_PAGE_BITS;

  ret = test_and_set_bit(nr, migration_bitmap);

@@ -374,12 +373,28 @@ static inline bool 
migration_bitmap_set_dirty(MemoryRegion *mr,
  return ret;
  }

+static void migration_bitmap_sync_range(ram_addr_t start, ram_addr_t length)
+{
+ram_addr_t addr;
+
+for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+if (cpu_physical_memory_get_dirty(start + addr,
+  TARGET_PAGE_SIZE,
+  DIRTY_MEMORY_MIGRATION)) {
+cpu_physical_memory_reset_dirty(start + addr,
+TARGET_PAGE_SIZE,
+DIRTY_MEMORY_MIGRATION);
+migration_bitmap_set_dirty(start + addr);
+}
+}
+}
+
+
  /* Needs iothread lock! */

  static void migration_bitmap_sync(void)
  {
  RAMBlock *block;
-ram_addr_t addr;
  uint64_t num_dirty_pages_init = migration_dirty_pages;
  MigrationState *s = migrate_get_current();
  static int64_t start_time;
@@ -400,16 +415,7 @@ static void migration_bitmap_sync(void)
  address_space_sync_dirty_bitmap(&address_space_memory);

  QTAILQ_FOREACH(block, &ram_list.blocks, next) {
-for (addr = 0; addr < block->length; addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(block->mr->ram_addr + addr,
-  TARGET_PAGE_SIZE,
-  DIRTY_MEMORY_MIGRATION)) {
-cpu_physical_memory_reset_dirty(block->mr->ram_addr + addr,
-TARGET_PAGE_SIZE,
-DIRTY_MEMORY_MIGRATION);
-migration_bitmap_set_dirty(block->mr, addr);
-}
-}
+migration_bitmap_sync_range(block->mr->ram_addr, block->length);
  }
  trace_migration_bitmap_sync_end(migration_dirty_pages
      - num_dirty_pages_init);



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 35/38] memory: syncronize kvm bitmap using bitmaps operations

2013-12-19 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

If bitmaps are aligned properly, use bitmap operations.  If they are
not, just use old bit at a time code.

Signed-off-by: Juan Quintela 
---
  include/exec/ram_addr.h | 54 -
  1 file changed, 36 insertions(+), 18 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index c6736ed..33c8acc 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -83,29 +83,47 @@ static inline void 
cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
ram_addr_t start,
ram_addr_t pages)
  {
-unsigned int i, j;
+unsigned long i, j;
  unsigned long page_number, c;
  hwaddr addr;
  ram_addr_t ram_addr;
-unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
+unsigned long len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
  unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
+unsigned long page = BIT_WORD(start >> TARGET_PAGE_BITS);

-/*
- * bitmap-traveling is faster than memory-traveling (for addr...)
- * especially when most of the memory is not dirty.
- */
-for (i = 0; i < len; i++) {
-if (bitmap[i] != 0) {
-c = leul_to_cpu(bitmap[i]);
-do {
-j = ffsl(c) - 1;
-c &= ~(1ul << j);
-page_number = (i * HOST_LONG_BITS + j) * hpratio;
-addr = page_number * TARGET_PAGE_SIZE;
-ram_addr = start + addr;
-cpu_physical_memory_set_dirty_range(ram_addr,
-TARGET_PAGE_SIZE * 
hpratio);
-} while (c != 0);
+/* start address is aligned at the start of a word? */
+if (((page * BITS_PER_LONG) << TARGET_PAGE_BITS) == start) {
+long k;
+long nr = BITS_TO_LONGS(pages);
+
+for (k = 0; k < nr; k++) {
+if (bitmap[k]) {
+unsigned long temp = leul_to_cpu(bitmap[k]);
+
+ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION][page + k] |= 
temp;
+ram_list.dirty_memory[DIRTY_MEMORY_VGA][page + k] |= temp;
+ram_list.dirty_memory[DIRTY_MEMORY_CODE][page + k] |= temp;
+}
+}
+xen_modified_memory(start, pages);
+} else {
+/*
+ * bitmap-traveling is faster than memory-traveling (for addr...)
+ * especially when most of the memory is not dirty.
+ */
+for (i = 0; i < len; i++) {
+if (bitmap[i] != 0) {
+c = leul_to_cpu(bitmap[i]);
+do {
+j = ffsl(c) - 1;
+c &= ~(1ul << j);
+page_number = (i * HOST_LONG_BITS + j) * hpratio;
+addr = page_number * TARGET_PAGE_SIZE;
+ram_addr = start + addr;
+cpu_physical_memory_set_dirty_range(ram_addr,
+   TARGET_PAGE_SIZE * hpratio);
+} while (c != 0);
+}
      }
  }
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 34/38] memory: move bitmap synchronization to its own function

2013-12-19 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

We want to have all the functions that handle directly the dirty
bitmap near.  We will change it later.

Signed-off-by: Juan Quintela 
---
  include/exec/ram_addr.h | 31 +++
  kvm-all.c   | 27 ++-
  2 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index db977fb..c6736ed 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -79,6 +79,37 @@ static inline void 
cpu_physical_memory_set_dirty_range(ram_addr_t start,
  xen_modified_memory(start, length);
  }

+static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long 
*bitmap,
+  ram_addr_t start,
+  ram_addr_t pages)
+{
+unsigned int i, j;
+unsigned long page_number, c;
+hwaddr addr;
+ram_addr_t ram_addr;
+unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
+unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
+
+/*
+ * bitmap-traveling is faster than memory-traveling (for addr...)
+ * especially when most of the memory is not dirty.
+ */
+for (i = 0; i < len; i++) {
+if (bitmap[i] != 0) {
+c = leul_to_cpu(bitmap[i]);
+do {
+j = ffsl(c) - 1;
+c &= ~(1ul << j);
+page_number = (i * HOST_LONG_BITS + j) * hpratio;
+addr = page_number * TARGET_PAGE_SIZE;
+ram_addr = start + addr;
+cpu_physical_memory_set_dirty_range(ram_addr,
+TARGET_PAGE_SIZE * 
hpratio);
+} while (c != 0);
+}
+}
+}
+
  static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
   ram_addr_t length,
   unsigned client)
diff --git a/kvm-all.c b/kvm-all.c
index cb62ba4..0bfb060 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -380,33 +380,10 @@ static int kvm_set_migration_log(int enable)
  static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
   unsigned long *bitmap)
  {
-unsigned int i, j;
-unsigned long page_number, c;
-hwaddr addr;
  ram_addr_t start = section->offset_within_region + section->mr->ram_addr;
-ram_addr_t ram_addr;
-unsigned int pages = int128_get64(section->size) / getpagesize();
-unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
-unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
+ram_addr_t pages = int128_get64(section->size) / getpagesize();

-/*
- * bitmap-traveling is faster than memory-traveling (for addr...)
- * especially when most of the memory is not dirty.
- */
-for (i = 0; i < len; i++) {
-if (bitmap[i] != 0) {
-c = leul_to_cpu(bitmap[i]);
-do {
-j = ffsl(c) - 1;
-c &= ~(1ul << j);
-page_number = (i * HOST_LONG_BITS + j) * hpratio;
-addr = page_number * TARGET_PAGE_SIZE;
-ram_addr = start + addr;
-cpu_physical_memory_set_dirty_range(ram_addr,
-TARGET_PAGE_SIZE * 
hpratio);
-} while (c != 0);
-}
-}
+cpu_physical_memory_set_dirty_lebitmap(bitmap, start, pages);
  return 0;
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 33/38] kvm: refactor start address calculation

2013-12-19 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
---
  kvm-all.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 308dfba..cb62ba4 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -382,7 +382,8 @@ static int 
kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
  {
  unsigned int i, j;
  unsigned long page_number, c;
-hwaddr addr, addr1;
+hwaddr addr;
+ram_addr_t start = section->offset_within_region + section->mr->ram_addr;
  ram_addr_t ram_addr;
  unsigned int pages = int128_get64(section->size) / getpagesize();
  unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
@@ -399,9 +400,8 @@ static int 
kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
  j = ffsl(c) - 1;
  c &= ~(1ul << j);
  page_number = (i * HOST_LONG_BITS + j) * hpratio;
-addr1 = page_number * TARGET_PAGE_SIZE;
-addr = section->offset_within_region + addr1;
-ram_addr = section->mr->ram_addr + addr;
+addr = page_number * TARGET_PAGE_SIZE;
+ram_addr = start + addr;
  cpu_physical_memory_set_dirty_range(ram_addr,
  TARGET_PAGE_SIZE * 
hpratio);
  } while (c != 0);



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 32/38] kvm: use directly cpu_physical_memory_* api for tracking dirty pages

2013-12-19 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

Performance is important in this function, and we want to optimize even further.

Signed-off-by: Juan Quintela 
---
  kvm-all.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 3937754..308dfba 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -31,6 +31,7 @@
  #include "sysemu/kvm.h"
  #include "qemu/bswap.h"
  #include "exec/memory.h"
+#include "exec/ram_addr.h"
  #include "exec/address-spaces.h"
  #include "qemu/event_notifier.h"
  #include "trace.h"
@@ -382,6 +383,7 @@ static int 
kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
  unsigned int i, j;
  unsigned long page_number, c;
  hwaddr addr, addr1;
+ram_addr_t ram_addr;
  unsigned int pages = int128_get64(section->size) / getpagesize();
  unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
  unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
@@ -399,8 +401,9 @@ static int 
kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
  page_number = (i * HOST_LONG_BITS + j) * hpratio;
  addr1 = page_number * TARGET_PAGE_SIZE;
  addr = section->offset_within_region + addr1;
-memory_region_set_dirty(section->mr, addr,
-TARGET_PAGE_SIZE * hpratio);
+ram_addr = section->mr->ram_addr + addr;
+cpu_physical_memory_set_dirty_range(ram_addr,
+TARGET_PAGE_SIZE * 
hpratio);
  } while (c != 0);
  }
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 31/38] memory: unfold memory_region_test_and_clear()

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

We are going to update the bitmap directly

Signed-off-by: Juan Quintela 
---
  arch_init.c | 10 +++---
  1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index e0acbc5..0e8c8b5 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -48,6 +48,7 @@
  #include "qmp-commands.h"
  #include "trace.h"
  #include "exec/cpu-all.h"
+#include "exec/ram_addr.h"
  #include "hw/acpi/acpi.h"

  #ifdef DEBUG_ARCH_INIT
@@ -400,9 +401,12 @@ static void migration_bitmap_sync(void)

  QTAILQ_FOREACH(block, &ram_list.blocks, next) {
  for (addr = 0; addr < block->length; addr += TARGET_PAGE_SIZE) {
-if (memory_region_test_and_clear_dirty(block->mr,
-   addr, TARGET_PAGE_SIZE,
-   DIRTY_MEMORY_MIGRATION)) {
+if (cpu_physical_memory_get_dirty(block->mr->ram_addr + addr,
+  TARGET_PAGE_SIZE,
+  DIRTY_MEMORY_MIGRATION)) {
+cpu_physical_memory_reset_dirty(block->mr->ram_addr + addr,
+TARGET_PAGE_SIZE,
+DIRTY_MEMORY_MIGRATION);
  migration_bitmap_set_dirty(block->mr, addr);
  }
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 26/38] memory: cpu_physical_memory_clear_dirty_range() now uses bitmap operations

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

We were clearing a range of bits, so use bitmap_clear().

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory-internal.h | 10 --
  1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 4906cdf..e2f55ea 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -95,14 +95,12 @@ static inline void 
cpu_physical_memory_clear_dirty_range(ram_addr_t start,
   ram_addr_t length,
   unsigned client)
  {
-ram_addr_t addr, end;
+unsigned long end, page;

  assert(client < DIRTY_MEMORY_NUM);
-end = TARGET_PAGE_ALIGN(start + length);
-start &= TARGET_PAGE_MASK;
-for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-clear_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
-}
+end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+page = start >> TARGET_PAGE_BITS;
+bitmap_clear(ram_list.dirty_memory[client], page, end - page);
  }

  void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 30/38] memory: split cpu_physical_memory_* functions to its own include

2013-12-18 Thread Orit Wasserman
 COPYING file in the top-level directory.
+ *
+ */
+
+/*
+ * This header is for use by exec.c and memory.c ONLY.  Do not include it.
+ * The functions declared here will be removed soon.
+ */
+
+#ifndef RAM_ADDR_H
+#define RAM_ADDR_H
+
+#ifndef CONFIG_USER_ONLY
+#include "hw/xen/xen.h"
+
+ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
+   MemoryRegion *mr);
+ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
+void *qemu_get_ram_ptr(ram_addr_t addr);
+void qemu_ram_free(ram_addr_t addr);
+void qemu_ram_free_from_ptr(ram_addr_t addr);
+
+static inline bool cpu_physical_memory_get_dirty(ram_addr_t start,
+ ram_addr_t length,
+ unsigned client)
+{
+unsigned long end, page, next;
+
+assert(client < DIRTY_MEMORY_NUM);
+
+end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+page = start >> TARGET_PAGE_BITS;
+next = find_next_bit(ram_list.dirty_memory[client], end, page);
+
+return next < end;
+}
+
+static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
+  unsigned client)
+{
+return cpu_physical_memory_get_dirty(addr, 1, client);
+}
+
+static inline bool cpu_physical_memory_is_clean(ram_addr_t addr)
+{
+bool vga = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_VGA);
+bool code = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_CODE);
+bool migration =
+cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
+return !(vga && code && migration);
+}
+
+static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
+  unsigned client)
+{
+assert(client < DIRTY_MEMORY_NUM);
+set_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
+}
+
+static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
+   ram_addr_t length)
+{
+unsigned long end, page;
+
+end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+page = start >> TARGET_PAGE_BITS;
+bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION], page, end - 
page);
+bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_VGA], page, end - page);
+bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_CODE], page, end - page);
+xen_modified_memory(start, length);
+}
+
+static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
+ ram_addr_t length,
+ unsigned client)
+{
+unsigned long end, page;
+
+assert(client < DIRTY_MEMORY_NUM);
+end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+page = start >> TARGET_PAGE_BITS;
+bitmap_clear(ram_list.dirty_memory[client], page, end - page);
+}
+
+void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
+ unsigned client);
+
+#endif
+#endif
diff --git a/memory.c b/memory.c
index c010296..59ecc28 100644
--- a/memory.c
+++ b/memory.c
@@ -22,6 +22,7 @@
  #include 

  #include "exec/memory-internal.h"
+#include "exec/ram_addr.h"

  //#define DEBUG_UNASSIGNED



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 29/38] memory: cpu_physical_memory_set_dirty_tracking() should return void

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

Result was always 0, and not used anywhere.  Once there, use bool type
for the parameter.

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  exec.c | 10 --
  1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index ba5989c..d165fd8 100644
--- a/exec.c
+++ b/exec.c
@@ -56,7 +56,7 @@
  //#define DEBUG_SUBPAGE

  #if !defined(CONFIG_USER_ONLY)
-static int in_migration;
+static bool in_migration;

  RAMList ram_list = { .blocks = QTAILQ_HEAD_INITIALIZER(ram_list.blocks) };

@@ -751,11 +751,9 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, 
ram_addr_t length,
  }
  }

-static int cpu_physical_memory_set_dirty_tracking(int enable)
+static void cpu_physical_memory_set_dirty_tracking(bool enable)
  {
-int ret = 0;
  in_migration = enable;
-return ret;
  }

  hwaddr memory_region_section_get_iotlb(CPUArchState *env,
@@ -1797,12 +1795,12 @@ static void tcg_commit(MemoryListener *listener)

  static void core_log_global_start(MemoryListener *listener)
  {
-cpu_physical_memory_set_dirty_tracking(1);
+cpu_physical_memory_set_dirty_tracking(true);
  }

  static void core_log_global_stop(MemoryListener *listener)
  {
-cpu_physical_memory_set_dirty_tracking(0);
+cpu_physical_memory_set_dirty_tracking(false);
  }

  static MemoryListener core_memory_listener = {



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 28/38] memory: make cpu_physical_memory_reset_dirty() take a length parameter

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

We have an end parameter in all the callers, and this make it coherent
with the rest of cpu_physical_memory_* functions, that also take a
length parameter.

Once here, move the start/end calculation to
tlb_reset_dirty_range_all() as we don't need it here anymore.

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  cputlb.c   |  3 +--
  exec.c | 19 ---
  include/exec/memory-internal.h |  2 +-
  memory.c   |  8 ++--
  4 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 911d764..a5805e1 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -127,8 +127,7 @@ void tlb_flush_page(CPUArchState *env, target_ulong addr)
 can be detected */
  void tlb_protect_code(ram_addr_t ram_addr)
  {
-cpu_physical_memory_reset_dirty(ram_addr,
-ram_addr + TARGET_PAGE_SIZE,
+cpu_physical_memory_reset_dirty(ram_addr, TARGET_PAGE_SIZE,
  DIRTY_MEMORY_CODE);
  }

diff --git a/exec.c b/exec.c
index a2f89eb..ba5989c 100644
--- a/exec.c
+++ b/exec.c
@@ -723,11 +723,14 @@ found:
  return block;
  }

-static void tlb_reset_dirty_range_all(ram_addr_t start, ram_addr_t end,
-  uintptr_t length)
+static void tlb_reset_dirty_range_all(ram_addr_t start, ram_addr_t length)
  {
-RAMBlock *block;
  ram_addr_t start1;
+RAMBlock *block;
+ram_addr_t end;
+
+end = TARGET_PAGE_ALIGN(start + length);
+start &= TARGET_PAGE_MASK;

  block = qemu_get_ram_block(start);
  assert(block == qemu_get_ram_block(end - 1));
@@ -736,21 +739,15 @@ static void tlb_reset_dirty_range_all(ram_addr_t start, 
ram_addr_t end,
  }

  /* Note: start and end must be within the same ram block.  */
-void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
+void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
   unsigned client)
  {
-uintptr_t length;
-
-start &= TARGET_PAGE_MASK;
-end = TARGET_PAGE_ALIGN(end);
-
-length = end - start;
  if (length == 0)
  return;
  cpu_physical_memory_clear_dirty_range(start, length, client);

  if (tcg_enabled()) {
-tlb_reset_dirty_range_all(start, end, length);
+tlb_reset_dirty_range_all(start, length);
  }
  }

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 771b23f..cb2249f 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -102,7 +102,7 @@ static inline void 
cpu_physical_memory_clear_dirty_range(ram_addr_t start,
  bitmap_clear(ram_list.dirty_memory[client], page, end - page);
  }

-void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
+void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t length,
   unsigned client);

  #endif
diff --git a/memory.c b/memory.c
index a490cbd..c010296 100644
--- a/memory.c
+++ b/memory.c
@@ -1191,9 +1191,7 @@ bool memory_region_test_and_clear_dirty(MemoryRegion *mr, 
hwaddr addr,
  assert(mr->terminates);
  ret = cpu_physical_memory_get_dirty(mr->ram_addr + addr, size, client);
  if (ret) {
-cpu_physical_memory_reset_dirty(mr->ram_addr + addr,
-mr->ram_addr + addr + size,
-client);
+cpu_physical_memory_reset_dirty(mr->ram_addr + addr, size, client);
  }
  return ret;
  }
@@ -1239,9 +1237,7 @@ void memory_region_reset_dirty(MemoryRegion *mr, hwaddr 
addr,
 hwaddr size, unsigned client)
  {
  assert(mr->terminates);
-cpu_physical_memory_reset_dirty(mr->ram_addr + addr,
-mr->ram_addr + addr + size,
-client);
+cpu_physical_memory_reset_dirty(mr->ram_addr + addr, size, client);
  }

  void *memory_region_get_ram_ptr(MemoryRegion *mr)



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 27/38] memory: s/dirty/clean/ in cpu_physical_memory_is_dirty()

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

All uses except one really want the other meaning.

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  cputlb.c   | 3 ++-
  exec.c | 6 +++---
  include/exec/memory-internal.h | 5 ++---
  3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index dfd747a..911d764 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -299,7 +299,8 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
  /* Write access calls the I/O callback.  */
  te->addr_write = address | TLB_MMIO;
  } else if (memory_region_is_ram(section->mr)
-   && !cpu_physical_memory_is_dirty(section->mr->ram_addr + 
xlat)) {
+   && cpu_physical_memory_is_clean(section->mr->ram_addr
+   + xlat)) {
  te->addr_write = address | TLB_NOTDIRTY;
  } else {
  te->addr_write = address;
diff --git a/exec.c b/exec.c
index 6bef3e5..a2f89eb 100644
--- a/exec.c
+++ b/exec.c
@@ -1513,7 +1513,7 @@ static void notdirty_mem_write(void *opaque, hwaddr 
ram_addr,
  cpu_physical_memory_set_dirty_flag(ram_addr, DIRTY_MEMORY_VGA);
  /* we remove the notdirty callback only if the code has been
 flushed */
-if (cpu_physical_memory_is_dirty(ram_addr)) {
+if (!cpu_physical_memory_is_clean(ram_addr)) {
  CPUArchState *env = current_cpu->env_ptr;
  tlb_set_dirty(env, env->mem_io_vaddr);
  }
@@ -1916,7 +1916,7 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
  static void invalidate_and_set_dirty(hwaddr addr,
   hwaddr length)
  {
-if (!cpu_physical_memory_is_dirty(addr)) {
+if (cpu_physical_memory_is_clean(addr)) {
  /* invalidate code */
  tb_invalidate_phys_page_range(addr, addr + length, 0);
  /* set dirty bit */
@@ -2499,7 +2499,7 @@ void stl_phys_notdirty(hwaddr addr, uint32_t val)
  stl_p(ptr, val);

  if (unlikely(in_migration)) {
-if (!cpu_physical_memory_is_dirty(addr1)) {
+if (cpu_physical_memory_is_clean(addr1)) {
  /* invalidate code */
  tb_invalidate_phys_page_range(addr1, addr1 + 4, 0);
  /* set dirty bit */
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index e2f55ea..771b23f 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -61,14 +61,13 @@ static inline bool 
cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
  return cpu_physical_memory_get_dirty(addr, 1, client);
  }

-/* read dirty bit (return 0 or 1) */
-static inline bool cpu_physical_memory_is_dirty(ram_addr_t addr)
+static inline bool cpu_physical_memory_is_clean(ram_addr_t addr)
  {
  bool vga = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_VGA);
  bool code = cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_CODE);
  bool migration =
  cpu_physical_memory_get_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
-return vga && code && migration;
+return !(vga && code && migration);
  }

  static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 25/38] memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

We were setting a range of bits, so use bitmap_set().

Note: xen has always been wrong, and should have used start instead
of addr from the beginning.

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory-internal.h | 19 +++
  1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index b017c2e..4906cdf 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -81,19 +81,14 @@ static inline void 
cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
  static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
 ram_addr_t length)
  {
-ram_addr_t addr, end;
+unsigned long end, page;

-end = TARGET_PAGE_ALIGN(start + length);
-start &= TARGET_PAGE_MASK;
-for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-set_bit(addr >> TARGET_PAGE_BITS,
-ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]);
-set_bit(addr >> TARGET_PAGE_BITS,
-ram_list.dirty_memory[DIRTY_MEMORY_VGA]);
-set_bit(addr >> TARGET_PAGE_BITS,
-ram_list.dirty_memory[DIRTY_MEMORY_CODE]);
-}
-xen_modified_memory(addr, length);
+end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+page = start >> TARGET_PAGE_BITS;
+bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION], page, end - 
page);
+bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_VGA], page, end - page);
+bitmap_set(ram_list.dirty_memory[DIRTY_MEMORY_CODE], page, end - page);
+xen_modified_memory(start, length);
  }

  static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 24/38] memory: use find_next_bit() to find dirty bits

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:26 PM, Juan Quintela wrote:

This operation is way faster than doing it bit by bit.

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory-internal.h | 16 ++--
  1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index c04a92a..b017c2e 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -44,19 +44,15 @@ static inline bool cpu_physical_memory_get_dirty(ram_addr_t 
start,
   ram_addr_t length,
   unsigned client)
  {
-ram_addr_t addr, end;
+unsigned long end, page, next;

  assert(client < DIRTY_MEMORY_NUM);

-end = TARGET_PAGE_ALIGN(start + length);
-start &= TARGET_PAGE_MASK;
-for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-if (test_bit(addr >> TARGET_PAGE_BITS,
- ram_list.dirty_memory[client])) {
-return true;
-}
-}
-return false;
+end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
+page = start >> TARGET_PAGE_BITS;
+next = find_next_bit(ram_list.dirty_memory[client], end, page);
+
+return next < end;
  }

  static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,



Reviewed-by: Orit Wasserman 




Re: [Qemu-devel] [PATCHv2] block: add native support for NFS

2013-12-18 Thread Orit Wasserman

On 12/18/2013 01:11 PM, Peter Lieven wrote:




Am 18.12.2013 um 11:00 schrieb Orit Wasserman :


On 12/18/2013 01:03 AM, Peter Lieven wrote:



Am 17.12.2013 um 18:32 schrieb "Daniel P. Berrange" :

On Tue, Dec 17, 2013 at 10:15:25AM +0100, Peter Lieven wrote:
This patch adds native support for accessing images on NFS shares without
the requirement to actually mount the entire NFS share on the host.

NFS Images can simply be specified by an url of the form:
nfs:

For example:
qemu-img create -f qcow2 nfs://10.0.0.1/qemu-images/test.qcow2


Does it support other config tunables, eg specifying which
NFS version to use 2/3/4 ? If so will they be available as
URI parameters in the obvious manner ?


currently only v3 is supported by libnfs. what other tunables would you like to 
see?


For live migration we need the sync option (async ignores O_SYNC and O_DIRECT 
sadly),
will it be supported? or will it be the default?


the async you see in the libnfs calls refer to the async API. bdrv_flush will 
not return before the nfs server completes the request.



That great!

Thanks,
Orit


Peter



Orit



Daniel
--
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|







Re: [Qemu-devel] [PATCH 23/38] memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

Now all functions use the same wording that bitops/bitmap operations

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  exec.c | 2 +-
  include/exec/memory-internal.h | 6 +++---
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/exec.c b/exec.c
index ad9866c..6bef3e5 100644
--- a/exec.c
+++ b/exec.c
@@ -747,7 +747,7 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, 
ram_addr_t end,
  length = end - start;
  if (length == 0)
  return;
-cpu_physical_memory_mask_dirty_range(start, length, client);
+cpu_physical_memory_clear_dirty_range(start, length, client);

  if (tcg_enabled()) {
  tlb_reset_dirty_range_all(start, end, length);
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index fa28fc6..c04a92a 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -100,9 +100,9 @@ static inline void 
cpu_physical_memory_set_dirty_range(ram_addr_t start,
  xen_modified_memory(addr, length);
  }

-static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
-ram_addr_t length,
-unsigned client)
+static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
+ ram_addr_t length,
+ unsigned client)
  {
  ram_addr_t addr, end;



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 18/38] memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory-internal.h | 10 ++
  1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 6fb1b64..2f6610a 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -86,13 +86,6 @@ static inline void cpu_physical_memory_set_dirty(ram_addr_t 
addr)
  cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_CODE);
  }

-static inline void cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
-   unsigned client)
-{
-assert(client < DIRTY_MEMORY_NUM);
-clear_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
-}
-
  static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
 ram_addr_t length)
  {
@@ -112,10 +105,11 @@ static inline void 
cpu_physical_memory_mask_dirty_range(ram_addr_t start,
  {
  ram_addr_t addr, end;

+assert(client < DIRTY_MEMORY_NUM);
  end = TARGET_PAGE_ALIGN(start + length);
  start &= TARGET_PAGE_MASK;
  for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-cpu_physical_memory_clear_dirty_flag(addr, client);
+clear_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
      }
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 22/38] memory: cpu_physical_memory_get_dirty() is used as returning a bool

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory-internal.h | 15 ---
  1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index edca8a8..fa28fc6 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -40,11 +40,10 @@ void *qemu_get_ram_ptr(ram_addr_t addr);
  void qemu_ram_free(ram_addr_t addr);
  void qemu_ram_free_from_ptr(ram_addr_t addr);

-static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
-ram_addr_t length,
-unsigned client)
+static inline bool cpu_physical_memory_get_dirty(ram_addr_t start,
+ ram_addr_t length,
+ unsigned client)
  {
-int ret = 0;
  ram_addr_t addr, end;

  assert(client < DIRTY_MEMORY_NUM);
@@ -52,10 +51,12 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t 
start,
  end = TARGET_PAGE_ALIGN(start + length);
  start &= TARGET_PAGE_MASK;
  for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-ret |= test_bit(addr >> TARGET_PAGE_BITS,
-ram_list.dirty_memory[client]);
+if (test_bit(addr >> TARGET_PAGE_BITS,
+ ram_list.dirty_memory[client])) {
+return true;
+}
  }
-return ret;
+return false;
  }

  static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 21/38] memory: make cpu_physical_memory_get_dirty() the main function

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

And make cpu_physical_memory_get_dirty_flag() to use it.  It used to
be the other way around.

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory-internal.h | 36 +++-
  1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index b99617a..edca8a8 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -40,11 +40,28 @@ void *qemu_get_ram_ptr(ram_addr_t addr);
  void qemu_ram_free(ram_addr_t addr);
  void qemu_ram_free_from_ptr(ram_addr_t addr);

+static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
+ram_addr_t length,
+unsigned client)
+{
+int ret = 0;
+ram_addr_t addr, end;
+
+assert(client < DIRTY_MEMORY_NUM);
+
+end = TARGET_PAGE_ALIGN(start + length);
+start &= TARGET_PAGE_MASK;
+for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
+ret |= test_bit(addr >> TARGET_PAGE_BITS,
+ram_list.dirty_memory[client]);
+}
+return ret;
+}
+
  static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
unsigned client)
  {
-assert(client < DIRTY_MEMORY_NUM);
-return test_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
+return cpu_physical_memory_get_dirty(addr, 1, client);
  }

  /* read dirty bit (return 0 or 1) */
@@ -57,21 +74,6 @@ static inline bool cpu_physical_memory_is_dirty(ram_addr_t 
addr)
  return vga && code && migration;
  }

-static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
-ram_addr_t length,
-unsigned client)
-{
-int ret = 0;
-ram_addr_t addr, end;
-
-end = TARGET_PAGE_ALIGN(start + length);
-start &= TARGET_PAGE_MASK;
-for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-ret |= cpu_physical_memory_get_dirty_flag(addr, client);
-}
-return ret;
-}
-
  static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
    unsigned client)
  {



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 20/38] memory: unfold cpu_physical_memory_set_dirty_flag()

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory-internal.h | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 9d34434..b99617a 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -87,9 +87,12 @@ static inline void 
cpu_physical_memory_set_dirty_range(ram_addr_t start,
  end = TARGET_PAGE_ALIGN(start + length);
  start &= TARGET_PAGE_MASK;
  for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
-cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_VGA);
-cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_CODE);
+set_bit(addr >> TARGET_PAGE_BITS,
+ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]);
+set_bit(addr >> TARGET_PAGE_BITS,
+ram_list.dirty_memory[DIRTY_MEMORY_VGA]);
+set_bit(addr >> TARGET_PAGE_BITS,
+ram_list.dirty_memory[DIRTY_MEMORY_CODE]);
  }
  xen_modified_memory(addr, length);
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 19/38] memory: unfold cpu_physical_memory_set_dirty() in its only user

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory-internal.h | 11 +++
  1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 2f6610a..9d34434 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -79,13 +79,6 @@ static inline void 
cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
  set_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
  }

-static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
-{
-cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
-cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_VGA);
-cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_CODE);
-}
-
  static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
 ram_addr_t length)
  {
@@ -94,7 +87,9 @@ static inline void 
cpu_physical_memory_set_dirty_range(ram_addr_t start,
  end = TARGET_PAGE_ALIGN(start + length);
  start &= TARGET_PAGE_MASK;
  for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-cpu_physical_memory_set_dirty(addr);
+cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_MIGRATION);
+cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_VGA);
+cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_CODE);
  }
  xen_modified_memory(addr, length);
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 18/38] memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory-internal.h | 10 ++
  1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 6fb1b64..2f6610a 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -86,13 +86,6 @@ static inline void cpu_physical_memory_set_dirty(ram_addr_t 
addr)
  cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_CODE);
  }

-static inline void cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
-   unsigned client)
-{
-assert(client < DIRTY_MEMORY_NUM);
-clear_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
-}
-
  static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
 ram_addr_t length)
  {
@@ -112,10 +105,11 @@ static inline void 
cpu_physical_memory_mask_dirty_range(ram_addr_t start,
  {
  ram_addr_t addr, end;

+assert(client < DIRTY_MEMORY_NUM);
  end = TARGET_PAGE_ALIGN(start + length);
  start &= TARGET_PAGE_MASK;
  for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-cpu_physical_memory_clear_dirty_flag(addr, client);
+clear_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
      }
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCHv2] block: add native support for NFS

2013-12-18 Thread Orit Wasserman

On 12/18/2013 12:18 PM, Daniel P. Berrange wrote:

On Wed, Dec 18, 2013 at 12:00:03PM +0200, Orit Wasserman wrote:

On 12/18/2013 01:03 AM, Peter Lieven wrote:




Am 17.12.2013 um 18:32 schrieb "Daniel P. Berrange" :


On Tue, Dec 17, 2013 at 10:15:25AM +0100, Peter Lieven wrote:
This patch adds native support for accessing images on NFS shares without
the requirement to actually mount the entire NFS share on the host.

NFS Images can simply be specified by an url of the form:
nfs:

For example:
qemu-img create -f qcow2 nfs://10.0.0.1/qemu-images/test.qcow2


Does it support other config tunables, eg specifying which
NFS version to use 2/3/4 ? If so will they be available as
URI parameters in the obvious manner ?


currently only v3 is supported by libnfs. what other tunables would you like to 
see?



For live migration we need the sync option (async ignores O_SYNC and O_DIRECT 
sadly),
will it be supported? or will it be the default?


Since this is bypassing the client kernel FS I/O layer question around
support of things like O_SYNC/O_DIRECT are not applicable.



so no live migration support?
 

Daniel






Re: [Qemu-devel] [PATCH 17/38] memory: split dirty bitmap into three

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

After all the previous patches, spliting the bitmap gets direct.

Note: For some reason, I have to move DIRTY_MEMORY_* definitions to
the beginning of memory.h to make compilation work.

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  exec.c |  9 ++---
  include/exec/cpu-all.h |  3 ++-
  include/exec/memory-internal.h |  9 +++--
  include/exec/memory.h  | 10 +-
  4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/exec.c b/exec.c
index bed5c07..ad9866c 100644
--- a/exec.c
+++ b/exec.c
@@ -1276,9 +1276,12 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void 
*host,
  new_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;

  if (new_ram_size > old_ram_size) {
-ram_list.phys_dirty = g_realloc(ram_list.phys_dirty, new_ram_size);
-memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
-   0, size >> TARGET_PAGE_BITS);
+int i;
+for (i = 0; i < DIRTY_MEMORY_NUM; i++) {
+ram_list.dirty_memory[i] =
+bitmap_zero_extend(ram_list.dirty_memory[i],
+   old_ram_size, new_ram_size);
+   }
  }
  cpu_physical_memory_set_dirty_range(new_block->offset, size);

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index b6998f0..4cb4b4a 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -21,6 +21,7 @@

  #include "qemu-common.h"
  #include "exec/cpu-common.h"
+#include "exec/memory.h"
  #include "qemu/thread.h"
  #include "qom/cpu.h"

@@ -459,7 +460,7 @@ typedef struct RAMBlock {
  typedef struct RAMList {
  QemuMutex mutex;
  /* Protected by the iothread lock.  */
-uint8_t *phys_dirty;
+unsigned long *dirty_memory[DIRTY_MEMORY_NUM];
  RAMBlock *mru_block;
  /* Protected by the ramlist lock.  */
  QTAILQ_HEAD(, RAMBlock) blocks;
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 666490c..6fb1b64 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -44,7 +44,7 @@ static inline bool 
cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
unsigned client)
  {
  assert(client < DIRTY_MEMORY_NUM);
-return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] & (1 << client);
+return test_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
  }

  /* read dirty bit (return 0 or 1) */
@@ -76,7 +76,7 @@ static inline void 
cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
unsigned client)
  {
  assert(client < DIRTY_MEMORY_NUM);
-ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= (1 << client);
+set_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
  }

  static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
@@ -89,11 +89,8 @@ static inline void cpu_physical_memory_set_dirty(ram_addr_t 
addr)
  static inline void cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
 unsigned client)
  {
-int mask = ~(1 << client);
-
  assert(client < DIRTY_MEMORY_NUM);
-
-ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] &= mask;
+clear_bit(addr >> TARGET_PAGE_BITS, ram_list.dirty_memory[client]);
  }

  static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
diff --git a/include/exec/memory.h b/include/exec/memory.h
index d5e9d58..296d6ab 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -16,6 +16,11 @@

  #ifndef CONFIG_USER_ONLY

+#define DIRTY_MEMORY_VGA   0
+#define DIRTY_MEMORY_CODE  1
+#define DIRTY_MEMORY_MIGRATION 2
+#define DIRTY_MEMORY_NUM   3/* num of dirty bits */
+
  #include 
  #include 
  #include "qemu-common.h"
@@ -33,11 +38,6 @@
  typedef struct MemoryRegionOps MemoryRegionOps;
  typedef struct MemoryRegionMmio MemoryRegionMmio;

-#define DIRTY_MEMORY_VGA   0
-#define DIRTY_MEMORY_CODE  1
-#define DIRTY_MEMORY_MIGRATION 2
-#define DIRTY_MEMORY_NUM   3    /* num of dirty bits */
-
  struct MemoryRegionMmio {
  CPUReadMemoryFunc *read[3];
  CPUWriteMemoryFunc *write[3];



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 16/38] bitmap: Add bitmap_zero_extend operation

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/qemu/bitmap.h | 9 +
  1 file changed, 9 insertions(+)

diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index afdd257..1babd5d 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -220,4 +220,13 @@ unsigned long bitmap_find_next_zero_area(unsigned long 
*map,
   unsigned long nr,
   unsigned long align_mask);

+static inline unsigned long *bitmap_zero_extend(unsigned long *old,
+long old_nbits, long new_nbits)
+{
+long new_len = BITS_TO_LONGS(new_nbits) * sizeof(unsigned long);
+unsigned long *new = g_realloc(old, new_len);
+bitmap_clear(new, old_nbits, new_nbits - old_nbits);
+return new;
+}
+
  #endif /* BITMAP_H */



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 14/38] memory: only resize dirty bitmap when memory size increases

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  exec.c | 12 +---
  1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/exec.c b/exec.c
index 9996da2..bed5c07 100644
--- a/exec.c
+++ b/exec.c
@@ -1210,6 +1210,9 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void 
*host,
 MemoryRegion *mr)
  {
  RAMBlock *block, *new_block;
+ram_addr_t old_ram_size, new_ram_size;
+
+old_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;

  size = TARGET_PAGE_ALIGN(size);
  new_block = g_malloc0(sizeof(*new_block));
@@ -1270,10 +1273,13 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, 
void *host,
  ram_list.version++;
  qemu_mutex_unlock_ramlist();

-ram_list.phys_dirty = g_realloc(ram_list.phys_dirty,
-   last_ram_offset() >> TARGET_PAGE_BITS);
-memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
+new_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;
+
+if (new_ram_size > old_ram_size) {
+ram_list.phys_dirty = g_realloc(ram_list.phys_dirty, new_ram_size);
+memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
 0, size >> TARGET_PAGE_BITS);
+}
  cpu_physical_memory_set_dirty_range(new_block->offset, size);

  qemu_ram_setup_dump(new_block->host, size);



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 15/38] memory: cpu_physical_memory_clear_dirty_flag() result is never used

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory-internal.h | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index d09d6d8..666490c 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -86,14 +86,14 @@ static inline void cpu_physical_memory_set_dirty(ram_addr_t 
addr)
  cpu_physical_memory_set_dirty_flag(addr, DIRTY_MEMORY_CODE);
  }

-static inline int cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
+static inline void cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
 unsigned client)
  {
  int mask = ~(1 << client);

  assert(client < DIRTY_MEMORY_NUM);

-return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] &= mask;
+ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] &= mask;
  }

  static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 13/38] memory: make sure that client is always inside range

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory-internal.h | 4 
  1 file changed, 4 insertions(+)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index b58010f..d09d6d8 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -43,6 +43,7 @@ void qemu_ram_free_from_ptr(ram_addr_t addr);
  static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
unsigned client)
  {
+assert(client < DIRTY_MEMORY_NUM);
  return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] & (1 << client);
  }

@@ -74,6 +75,7 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t 
start,
  static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
unsigned client)
  {
+assert(client < DIRTY_MEMORY_NUM);
  ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= (1 << client);
  }

@@ -89,6 +91,8 @@ static inline int 
cpu_physical_memory_clear_dirty_flag(ram_addr_t addr,
  {
  int mask = ~(1 << client);

+assert(client < DIRTY_MEMORY_NUM);
+
  return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] &= mask;
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 12/38] memory: use bit 2 for migration

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

For historical reasons it was bit 3.  Once there, create a constant to
know the number of clients.

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory.h | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index b8e76f4..d5e9d58 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -35,7 +35,8 @@ typedef struct MemoryRegionMmio MemoryRegionMmio;

  #define DIRTY_MEMORY_VGA   0
  #define DIRTY_MEMORY_CODE  1
-#define DIRTY_MEMORY_MIGRATION 3
+#define DIRTY_MEMORY_MIGRATION 2
+#define DIRTY_MEMORY_NUM   3/* num of dirty bits */

  struct MemoryRegionMmio {
  CPUReadMemoryFunc *read[3];



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 11/38] memory: cpu_physical_memory_mask_dirty_range() always clears a single flag

2013-12-18 Thread Orit Wasserman
cpu_physical_memory_get_dirty(mr->ram_addr + addr, size, client);
  if (ret) {
  cpu_physical_memory_reset_dirty(mr->ram_addr + addr,
  mr->ram_addr + addr + size,
-1 << client);
+client);
  }
  return ret;
  }
@@ -1243,7 +1241,7 @@ void memory_region_reset_dirty(MemoryRegion *mr, hwaddr 
addr,
  assert(mr->terminates);
  cpu_physical_memory_reset_dirty(mr->ram_addr + addr,
  mr->ram_addr + addr + size,
-    1 << client);
+client);
  }

  void *memory_region_get_ram_ptr(MemoryRegion *mr)



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 10/38] memory: cpu_physical_memory_set_dirty_range() always dirty all flags

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

So remove the flag argument and do it directly.  After this change,
there is nothing else using cpu_physical_memory_set_dirty_flags() so
remove it.

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  exec.c |  2 +-
  include/exec/memory-internal.h | 11 ++-
  memory.c   |  2 +-
  3 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/exec.c b/exec.c
index 6981f73..45fdb84 100644
--- a/exec.c
+++ b/exec.c
@@ -1274,7 +1274,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void 
*host,
 last_ram_offset() >> TARGET_PAGE_BITS);
  memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
 0, size >> TARGET_PAGE_BITS);
-cpu_physical_memory_set_dirty_range(new_block->offset, size, 0xff);
+cpu_physical_memory_set_dirty_range(new_block->offset, size);

  qemu_ram_setup_dump(new_block->host, size);
  qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 9f4ad69..681d63b 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -75,12 +75,6 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t 
start,
  return ret;
  }

-static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
-  int dirty_flags)
-{
-ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
-}
-
  static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
int dirty_flag)
  {
@@ -103,15 +97,14 @@ static inline int 
cpu_physical_memory_clear_dirty_flags(ram_addr_t addr,
  }

  static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
-   ram_addr_t length,
-   int dirty_flags)
+   ram_addr_t length)
  {
  ram_addr_t addr, end;

  end = TARGET_PAGE_ALIGN(start + length);
  start &= TARGET_PAGE_MASK;
  for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-cpu_physical_memory_set_dirty_flags(addr, dirty_flags);
+cpu_physical_memory_set_dirty(addr);
  }
  xen_modified_memory(addr, length);
  }
diff --git a/memory.c b/memory.c
index e497f99..fb52e1c 100644
--- a/memory.c
+++ b/memory.c
@@ -1182,7 +1182,7 @@ void memory_region_set_dirty(MemoryRegion *mr, hwaddr 
addr,
   hwaddr size)
  {
  assert(mr->terminates);
-cpu_physical_memory_set_dirty_range(mr->ram_addr + addr, size, -1);
+cpu_physical_memory_set_dirty_range(mr->ram_addr + addr, size);
  }

  bool memory_region_test_and_clear_dirty(MemoryRegion *mr, hwaddr addr,



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 09/38] memory: set single dirty flags when possible

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  exec.c | 7 ---
  include/exec/memory-internal.h | 4 +++-
  2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/exec.c b/exec.c
index a1fc280..6981f73 100644
--- a/exec.c
+++ b/exec.c
@@ -1911,7 +1911,8 @@ static void invalidate_and_set_dirty(hwaddr addr,
  /* invalidate code */
  tb_invalidate_phys_page_range(addr, addr + length, 0);
  /* set dirty bit */
-cpu_physical_memory_set_dirty_flags(addr, (0xff & ~CODE_DIRTY_FLAG));
+cpu_physical_memory_set_dirty_flag(addr, VGA_DIRTY_FLAG);
+cpu_physical_memory_set_dirty_flag(addr, MIGRATION_DIRTY_FLAG);
  }
  xen_modified_memory(addr, length);
  }
@@ -2493,8 +2494,8 @@ void stl_phys_notdirty(hwaddr addr, uint32_t val)
  /* invalidate code */
  tb_invalidate_phys_page_range(addr1, addr1 + 4, 0);
  /* set dirty bit */
-cpu_physical_memory_set_dirty_flags(
-addr1, (0xff & ~CODE_DIRTY_FLAG));
+cpu_physical_memory_set_dirty_flag(addr1, 
MIGRATION_DIRTY_FLAG);
+cpu_physical_memory_set_dirty_flag(addr1, VGA_DIRTY_FLAG);
  }
  }
  }
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 53cfe83..9f4ad69 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -89,7 +89,9 @@ static inline void 
cpu_physical_memory_set_dirty_flag(ram_addr_t addr,

  static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
  {
-cpu_physical_memory_set_dirty_flags(addr, 0xff);
+cpu_physical_memory_set_dirty_flag(addr, MIGRATION_DIRTY_FLAG);
+cpu_physical_memory_set_dirty_flag(addr, VGA_DIRTY_FLAG);
+cpu_physical_memory_set_dirty_flag(addr, CODE_DIRTY_FLAG);
  }

  static inline int cpu_physical_memory_clear_dirty_flags(ram_addr_t addr,



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 08/38] memory: all users of cpu_physical_memory_get_dirty used only one flag

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

So cpu_physical_memory_get_dirty_flags is not needed anymore

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  include/exec/memory-internal.h | 9 ++---
  1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 0b25c3f..53cfe83 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -44,11 +44,6 @@ void qemu_ram_free_from_ptr(ram_addr_t addr);
  #define CODE_DIRTY_FLAG  0x02
  #define MIGRATION_DIRTY_FLAG 0x08

-static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
-{
-return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS];
-}
-
  static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
int dirty_flag)
  {
@@ -67,7 +62,7 @@ static inline bool cpu_physical_memory_is_dirty(ram_addr_t 
addr)

  static inline int cpu_physical_memory_get_dirty(ram_addr_t start,
  ram_addr_t length,
-int dirty_flags)
+int dirty_flag)
  {
  int ret = 0;
  ram_addr_t addr, end;
@@ -75,7 +70,7 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t 
start,
  end = TARGET_PAGE_ALIGN(start + length);
  start &= TARGET_PAGE_MASK;
  for (addr = start; addr < end; addr += TARGET_PAGE_SIZE) {
-ret |= cpu_physical_memory_get_dirty_flags(addr) & dirty_flags;
+ret |= cpu_physical_memory_get_dirty_flag(addr, dirty_flag);
  }
  return ret;
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 07/38] memory: make cpu_physical_memory_is_dirty return bool

2013-12-18 Thread Orit Wasserman

On 12/17/2013 05:25 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
---
  exec.c | 7 ++-
  include/exec/memory-internal.h | 8 ++--
  2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/exec.c b/exec.c
index 95bcf70..a1fc280 100644
--- a/exec.c
+++ b/exec.c
@@ -1484,11 +1484,8 @@ found:
  static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
 uint64_t val, unsigned size)
  {
-int dirty_flags;
-dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
  if (!cpu_physical_memory_get_dirty_flag(ram_addr, CODE_DIRTY_FLAG)) {
  tb_invalidate_phys_page_fast(ram_addr, size);
-dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
  }
  switch (size) {
  case 1:
@@ -1503,8 +1500,8 @@ static void notdirty_mem_write(void *opaque, hwaddr 
ram_addr,
  default:
  abort();
  }
-dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
-cpu_physical_memory_set_dirty_flags(ram_addr, dirty_flags);
+cpu_physical_memory_set_dirty_flag(ram_addr, MIGRATION_DIRTY_FLAG);
+cpu_physical_memory_set_dirty_flag(ram_addr, VGA_DIRTY_FLAG);
  /* we remove the notdirty callback only if the code has been
 flushed */
  if (cpu_physical_memory_is_dirty(ram_addr)) {
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 136198c..0b25c3f 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -56,9 +56,13 @@ static inline bool 
cpu_physical_memory_get_dirty_flag(ram_addr_t addr,
  }

  /* read dirty bit (return 0 or 1) */
-static inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
+static inline bool cpu_physical_memory_is_dirty(ram_addr_t addr)
  {
-return cpu_physical_memory_get_dirty_flags(addr) == 0xff;
+bool vga = cpu_physical_memory_get_dirty_flag(addr, VGA_DIRTY_FLAG);
+bool code = cpu_physical_memory_get_dirty_flag(addr, CODE_DIRTY_FLAG);
+bool migration =
+cpu_physical_memory_get_dirty_flag(addr, MIGRATION_DIRTY_FLAG);
+return vga && code && migration;
  }

  static inline int cpu_physical_memory_get_dirty(ram_addr_t start,



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCHv2] block: add native support for NFS

2013-12-18 Thread Orit Wasserman

On 12/18/2013 01:03 AM, Peter Lieven wrote:




Am 17.12.2013 um 18:32 schrieb "Daniel P. Berrange" :


On Tue, Dec 17, 2013 at 10:15:25AM +0100, Peter Lieven wrote:
This patch adds native support for accessing images on NFS shares without
the requirement to actually mount the entire NFS share on the host.

NFS Images can simply be specified by an url of the form:
nfs:

For example:
qemu-img create -f qcow2 nfs://10.0.0.1/qemu-images/test.qcow2


Does it support other config tunables, eg specifying which
NFS version to use 2/3/4 ? If so will they be available as
URI parameters in the obvious manner ?


currently only v3 is supported by libnfs. what other tunables would you like to 
see?



For live migration we need the sync option (async ignores O_SYNC and O_DIRECT 
sadly),
will it be supported? or will it be the default?

Orit



Daniel
--
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|







Re: [Qemu-devel] [PATCH 0/9] vmstate code split + unit tests

2013-12-05 Thread Orit Wasserman

On 11/28/2013 04:01 PM, Eduardo Habkost wrote:

This series separates the QEMUFile and VMState code from savevm.c, and adds a
few unit tests to the VMState code.

Eduardo Habkost (9):
   qemu-file: Make a few functions non-static
   migration: Move QEMU_VM_* defines to migration/migration.h
   savevm: Convert all tabs to spaces
   savevm.c: Coding style fixes
   savevm.c: Coding style fix
   vmstate: Move VMState code to vmstate.c
   qemu-file: Move QEMUFile code to qemu-file.c
   savevm: Small comment about why timer QEMUFile/VMState code is in
 savevm.c
   tests: Some unit tests for vmstate.c

  Makefile.objs |2 +
  include/migration/migration.h |   11 +
  include/migration/qemu-file.h |4 +
  qemu-file.c   |  826 +
  savevm.c  | 1590 ++---
  tests/.gitignore  |1 +
  tests/Makefile|4 +
  tests/test-vmstate.c  |  357 +
  vmstate.c |  650 +
  9 files changed, 1921 insertions(+), 1524 deletions(-)
  create mode 100644 qemu-file.c
  create mode 100644 tests/test-vmstate.c
  create mode 100644 vmstate.c



Series Reviewed-by: Orit Wasserman 
(with v2 of patch 8)



Re: [Qemu-devel] [PATCH] The calculation of bytes_xfer in qemu_put_buffer() is wrong

2013-11-20 Thread Orit Wasserman

On 11/20/2013 01:26 PM, Juan Quintela wrote:

From: "Wangting (Kathy)" 

In qemu_put_buffer(), bytes_xfer += size is wrong, it will be more
than expected, and should be bytes_xfer += l.

Signed-off-by: zhangmin 
Signed-off-by: Juan Quintela 
---
  savevm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/savevm.c b/savevm.c
index 2f631d4..3f912dd 100644
--- a/savevm.c
+++ b/savevm.c
@@ -794,7 +794,7 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int 
size)
  if (l > size)
  l = size;
  memcpy(f->buf + f->buf_index, buf, l);
-f->bytes_xfer += size;
+f->bytes_xfer += l;
  if (f->ops->writev_buffer) {
  add_to_iovec(f, f->buf + f->buf_index, l);
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 05/39] memory: create function to set a single dirty bit

2013-11-07 Thread Orit Wasserman

On 11/06/2013 03:04 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
---
  cputlb.c   | 2 +-
  include/exec/memory-internal.h | 6 ++
  2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/cputlb.c b/cputlb.c
index fff0afb..72452e5 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -137,7 +137,7 @@ void tlb_protect_code(ram_addr_t ram_addr)
  void tlb_unprotect_code_phys(CPUArchState *env, ram_addr_t ram_addr,
   target_ulong vaddr)
  {
-cpu_physical_memory_set_dirty_flags(ram_addr, CODE_DIRTY_FLAG);
+cpu_physical_memory_set_dirty_flag(ram_addr, CODE_DIRTY_FLAG);
  }

  static bool tlb_is_dirty_ram(CPUTLBEntry *tlbe)
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index c71a5e6..4ebab80 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -76,6 +76,12 @@ static inline void 
cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
  ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
  }

+static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr,
+  int dirty_flag)
+{
+ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flag;
+}
+
  static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
  {
  cpu_physical_memory_set_dirty_flags(addr, 0xff);



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 04/39] exec: use accessor function to know if memory is dirty

2013-11-07 Thread Orit Wasserman

On 11/06/2013 03:04 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
---
  exec.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index 79610ce..7cf5634 100644
--- a/exec.c
+++ b/exec.c
@@ -1397,7 +1397,7 @@ static void notdirty_mem_write(void *opaque, hwaddr 
ram_addr,
  cpu_physical_memory_set_dirty_flags(ram_addr, dirty_flags);
  /* we remove the notdirty callback only if the code has been
 flushed */
-if (dirty_flags == 0xff) {
+if (cpu_physical_memory_is_dirty(ram_addr)) {
  CPUArchState *env = current_cpu->env_ptr;
  tlb_set_dirty(env, env->mem_io_vaddr);
  }



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 03/39] memory: cpu_physical_memory_set_dirty_range() return void

2013-11-07 Thread Orit Wasserman

On 11/06/2013 03:04 PM, Juan Quintela wrote:

Signed-off-by: Juan Quintela 
---
  memory.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/memory.c b/memory.c
index 28f6449..9722f23 100644
--- a/memory.c
+++ b/memory.c
@@ -1182,7 +1182,7 @@ void memory_region_set_dirty(MemoryRegion *mr, hwaddr 
addr,
   hwaddr size)
  {
  assert(mr->terminates);
-return cpu_physical_memory_set_dirty_range(mr->ram_addr + addr, size, -1);
+cpu_physical_memory_set_dirty_range(mr->ram_addr + addr, size, -1);
  }

  bool memory_region_test_and_clear_dirty(MemoryRegion *mr, hwaddr addr,



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 02/39] memory: cpu_physical_memory_set_dirty_flags() result is never used

2013-11-07 Thread Orit Wasserman

On 11/06/2013 03:04 PM, Juan Quintela wrote:

So return void.

Signed-off-by: Juan Quintela 
---
  include/exec/memory-internal.h | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index d0e0633..c71a5e6 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -70,10 +70,10 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t 
start,
  return ret;
  }

-static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
+static inline void cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
int dirty_flags)
  {
-return ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
+ram_list.phys_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
  }

  static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH 01/39] Move prototypes to memory.h

2013-11-07 Thread Orit Wasserman

On 11/06/2013 03:04 PM, Juan Quintela wrote:

As the comment says, it should only be used on "core" memory files.

Signed-off-by: Juan Quintela 
---
  include/exec/cpu-common.h | 4 
  include/exec/memory.h | 4 +++-
  2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index e4996e1..8110ef0 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -51,10 +51,6 @@ typedef void CPUWriteMemoryFunc(void *opaque, hwaddr addr, 
uint32_t value);
  typedef uint32_t CPUReadMemoryFunc(void *opaque, hwaddr addr);

  void qemu_ram_remap(ram_addr_t addr, ram_addr_t length);
-/* This should not be used by devices.  */
-MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
-void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
-
  void cpu_physical_memory_rw(hwaddr addr, uint8_t *buf,
  int len, int is_write);
  static inline void cpu_physical_memory_read(hwaddr addr,
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 480dfbf..c7e151f 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1055,7 +1055,9 @@ void *address_space_map(AddressSpace *as, hwaddr addr,
  void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
   int is_write, hwaddr access_len);

-
+/* This should not be used by devices.  */
+MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
+void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
  #endif

  #endif



Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] KVM call for agenda for 2013-10-01

2013-09-30 Thread Orit Wasserman
On 09/24/2013 05:09 PM, Juan Quintela wrote:
> 
> Hi
> 
> Please, send any topic that you are interested in covering.
> 
> Last week I forgot to send the call for topics.  We still have a topic there.
> 
> Thanks, Juan.
> 
> Agenda so far:
> - Talk about qemu reverse executing (1st description was done this week)
>   How to handle IO when we want to do reverse execution.
>   How this relate to Kemari needs?

Hi Juan,
I'm sorry but I won't be able to attend to call tomorrow due to travel.

Regards,
Orit

>   And to icount changes?
> 
> Call details:
> 
> 10:00 AM to 11:00 AM EDT
> Every two weeks
> 
> If you need phone number details,  contact me privately.
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 




Re: [Qemu-devel] [PATCH 3/3 resend v2] arch_init: right return for ram_save_iterate

2013-09-12 Thread Orit Wasserman
On 09/11/2013 02:27 PM, Paolo Bonzini wrote:
> Il 11/09/2013 13:06, Juan Quintela ha scritto:
 And I think that the right solution is make qemu_get_rate_limit() to
 return -1 in case of error (or the error, I don't care).
>>>
>>> You might do both things, it would avoid the useless g_usleep you
>>> pointed out below.  But Lei's patch is good, because an error could
>>> happen exactly during the qemu_put_be64 that writes RAM_SAVE_FLAG_EOS.
>>
>> Caller checks also.  This is the reason I wanted qemu_file_* callers to
>> return an error.  It has some advantages and some disadvantages.  We
>> don't agree on which ones are bigger O:-)
> 
> I think the disadvantages are bigger.  It litters the code with error
> handling, hides where things actually happen, and doesn't even simplify
> QEMUFile itself.  Checking only at the toplevel is simpler, all we need
> to do is ensure that we get there every now and then (and that's what
> qemu_file_rate_limit does).
> 

I also prefer the error checking at the top level.

Orit

 savevm.c: qemu_savevm_state_iterate()

 if (qemu_file_rate_limit(f)) {
 return 0;
 }

 check is incorrect again, we should return an error if there is one
 error.
>>>
>>> Nothing cares if qemu_savevm_state_iterate returns 0 or negative, so
>>> changing qemu_savevm_state_iterate to only return 0/1 would make sense too.
>>
>> In this case, 0 means:
>>   please, call us again
>> when what we mean is:
>>   don't care about calling us again, there is an error.  Handle the error.
> 
> Or alternatively, 0 means:
> 
>we haven't finished the work
> 
> when what we mean is:
> 
>we haven't finished the work (BTW, please check if there is an error)
> 
>> Notice that qemu_save_iterate() already returns errors in other code
>> paths
> 
> Yes that's also unnecessary.
> 
>> If we change th ereturn value for qemu_file_rate_limit() the change that
>> cames with this patch is not needed, that was my point.
> 
> This is what an earlier patch from Lei did.  I told him (or her?) to
> leave qemu_file_rate_limit aside since the idea behind QEMUFile is to
> only handle the error at the top.
> 
> Paolo
> 




Re: [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance

2013-09-12 Thread Orit Wasserman
On 09/11/2013 04:54 AM, junqing.w...@cs2c.com.cn wrote:
> Hi,
> 
>>The first is that if the VM failure happen in the middle on the live 
>>migration  >the backup VM state will be inconsistent which means you can't 
>>failover to it.
> 
> Yes, I have concerned about this problem. That is why we need a prefetch 
> buffer.
> 

You are right I missed that.

>>Solving it is not simple as you need some transaction mechanism that will 
>>>change the backup VM state only when the transaction completes (the live 
>>migration completes). >Kemari has something like that. >
> 
> The backup VM state will be loaded only when the one whole migration data is 
> prefetched. Otherwise, VM state will not be loaded. So the backup VM is 
> ensured to have a consistent state like a checkpoint.
> However, how close this checkpoint to the point of the VM failure depends on 
> the workload and bandwidth.
> 

At the moment in your implementation the prefetch buffer can be very large 
(several copies of guest memory size) 
are you planning to address this issue?

>>The second is that sadly live migration doesn't always converge this means  
>>>that the backup VM won't have a consist state to failover to. >You need to 
>>detect such a case and throttle down the guest to force convergence.
> 
> Yes, that's a problem. AFAK, qemu already have an auto convergence feature.

How about activating it when you do fault tolerance automatically?

> From another perspective,  if many migrations could not converge, maybe the 
> workload is high and the bandwidth is low,  and it is not recommended to use 
> FT in general.
> 

I agree but we need some way to notify the user of such problem.

Regards,
Orit
> 
> 




Re: [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance

2013-09-10 Thread Orit Wasserman
On 09/10/2013 06:43 AM, Jules Wang wrote:
> The goal of Curling(sports) is to provide a fault tolerant mechanism for KVM,
> so that in the event of a hardware failure, the virtual machine fails over to
> the backup in a way that is completely transparent to the guest operating 
> system.
> 
> Our goal is exactly the same as the goal of Kemari, by which Curling is
> inspired. However, Curling is simpler than Kemari(too simple, I afraid):
> 
> * By leveraging live migration feature, we do endless live migrations between
> the sender and receiver, so the two virtual machines are synchronized.
> 

Hi,
There are two issues I see with your solution,
The first is that if the VM failure happen in the middle on the live migration 
the backup VM state will be inconsistent which means you can't failover to it.
Solving it is not simple as you need some transaction mechanism that will
change the backup VM state only when the transaction completes (the live 
migration completes).
Kemari has something like that.

The second is that sadly live migration doesn't always converge this means 
that the backup VM won't have a consist state to failover to.
You need to detect such a case and throttle down the guest to force convergence.

Regards,
Orit

> * The receiver does not load vm state once the migration begins, instead, it
> perfetches one whole migration data into a buffer, then loads vm state from 
> that
> buffer afterwards. This "all or nothing" approach prevents the
> broken-in-the-middle problem Kemari has.
> 
> * The sender sleeps a little while after each migration, to ease the 
> performance
> penalty entailed by vm_stop and iothread locks. This is a tradeoff between
> performance and accuracy.
> 
> Usage:
> The steps of curling are the same as the steps of live migration except the
> following:
> 1. Start the receiver vm with -incoming curling:tcp::
> 2. Start ft in the qemu monitor of sender vm by following cmdline:
>> migrate_set_speed  
>> migrate curling:tcp::
> 3. Connect to the receiver vm by vnc or spice. The screen of the vm is 
> displayed
> when curling is ready.
> 4. Now, the sender vm is protected by ft, When it encounters a failure,
> the failover kicks in.
> 
> Problems to be discussed:
> 1. When the receiver is prefectching data, how does it know where is the EOF 
> of
> one migration?
> 
> Currently, we use a magic number 0xfeedcafe to indicate the EOF.
> Any better solutions?
> 
> 2. How to reduce the overhead entailed by vm_stop and iothread locks?
> 
> Any solutions other than sleeping?
> 
> --
> 
> Jules Wang (4):
>   Curling: add doc
>   Curling: cmdline interface
>   Curling: the sender
>   Curling: the receiver
> 
>  arch_init.c   |  18 +++--
>  docs/curling.txt  |  52 ++
>  include/migration/migration.h |   2 +
>  include/migration/qemu-file.h |   1 +
>  include/sysemu/sysemu.h   |   1 +
>  migration.c   |  61 ++--
>  savevm.c  | 158 
> --
>  7 files changed, 277 insertions(+), 16 deletions(-)
>  create mode 100644 docs/curling.txt
> 




Re: [Qemu-devel] [PATCH 0/3 resend v2] Migration fix

2013-09-04 Thread Orit Wasserman
On 09/04/2013 12:02 PM, Lei Li wrote:
> This small patch series is extracted from localhost migration 
> series that Paolo reviewed positively, send it separately as 
> his request. 
> 
> It fixs wrong initialization in RDMA hook ram_control_load_hook() 
> and gets rid of the negative check for qemu_file_rate_limit() with 
> bytes_transferred and return value adjusted correspondingly in 
> ram_save_iterate.
> 
> 
> Lei Li (3):
>   savevm: add comments for qemu_file_get_error()
>   savevm: fix wrong initialization by ram_control_load_hook
>   arch_init: right return for ram_save_iterate
> 
>  arch_init.c |   15 ++-
>  savevm.c|9 -
>  2 files changed, 18 insertions(+), 6 deletions(-)
> 
> 

Looks good,

Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH v2] Fix query-migrate documentation in qmp-commands.hx

2013-08-20 Thread Orit Wasserman
On 08/19/2013 09:41 PM, Luiz Capitulino wrote:
> On Mon, 12 Aug 2013 10:19:52 -0400
> Luiz Capitulino  wrote:
> 
>> On Thu,  8 Aug 2013 20:05:48 +0300
>> Orit Wasserman  wrote:
>>
>>> "ram" is present also when migration completes.
>>> expected-downtime, total-time and downtime are no longer part of "ram" data.
>>>
>>> Signed-off-by: Orit Wasserman 
>>
>> Applied to the qmp branch, thanks.
> 
> This one missed the deadline for 1.6. I'm my updating my tree for 1.7,
> but this patch conflicts with 8f3067, which repeated for the setup-time
> key the mistake this patch is fixing for other keys.
> 
> We need a respin for this patch, which should also move setup-time
> to the main dict. Orit, I can do it myself if you want.
> 

That will be great,
Thanks,
Orit
>>
>>> ---
>>>  qmp-commands.hx | 20 ++--
>>>  1 file changed, 10 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/qmp-commands.hx b/qmp-commands.hx
>>> index 2e59b0d..a22a841 100644
>>> --- a/qmp-commands.hx
>>> +++ b/qmp-commands.hx
>>> @@ -2626,8 +2626,8 @@ The main json-object contains the following:
>>>  - "expected-downtime": only present while migration is active
>>>  total amount in ms for downtime that was calculated on
>>>  the last bitmap round (json-int)
>>> -- "ram": only present if "status" is "active", it is a json-object with the
>>> -  following RAM information:
>>> +- "ram": only present if "status" is "active" or "complete", it is a
>>> + json-object with the following RAM information:
>>>   - "transferred": amount transferred in bytes (json-int)
>>>   - "remaining": amount remaining to transfer in bytes (json-int)
>>>   - "total": total amount of memory in bytes (json-int)
>>> @@ -2669,12 +2669,12 @@ Examples:
>>>  -> { "execute": "query-migrate" }
>>>  <- { "return": {
>>>  "status": "completed",
>>> +"total-time":12345,
>>> +"downtime":12345,
>>>  "ram":{
>>>"transferred":123,
>>>"remaining":123,
>>>"total":246,
>>> -  "total-time":12345,
>>> -  "downtime":12345,
>>>"duplicate":123,
>>>"normal":123,
>>>"normal-bytes":123456
>>> @@ -2693,12 +2693,12 @@ Examples:
>>>  <- {
>>>"return":{
>>>   "status":"active",
>>> + "total-time":12345,
>>> + "expected-downtime":12345,
>>>   "ram":{
>>>  "transferred":123,
>>>  "remaining":123,
>>>  "total":246,
>>> -"total-time":12345,
>>> -"expected-downtime":12345,
>>>  "duplicate":123,
>>>  "normal":123,
>>>  "normal-bytes":123456
>>> @@ -2712,12 +2712,12 @@ Examples:
>>>  <- {
>>>"return":{
>>>   "status":"active",
>>> + "total-time":12345,
>>> + "expected-downtime":12345,
>>>   "ram":{
>>>  "total":1057024,
>>>  "remaining":1053304,
>>>  "transferred":3720,
>>> -"total-time":12345,
>>> -"expected-downtime":12345,
>>>  "duplicate":123,
>>>  "normal":123,
>>>  "normal-bytes":123456
>>> @@ -2736,13 +2736,13 @@ Examples:
>>>  <- {
>>>"return":{
>>>   "status":"active",
>>> + "total-time":12345,
>>> + "expected-downtime":12345,
>>>   "capabilities" : [ { "capability": "xbzrle", "state" : true } ],
>>>   "ram":{
>>>  "total":1057024,
>>>  "remaining":1053304,
>>>  "transferred":3720,
>>> -"total-time":12345,
>>> -"expected-downtime":12345,
>>>  "duplicate":10,
>>>  "normal":,
>>>  "normal-bytes":3412992
>>
> 




[Qemu-devel] [PATCH v2] Fix query-migrate documentation in qmp-commands.hx

2013-08-08 Thread Orit Wasserman
"ram" is present also when migration completes.
expected-downtime, total-time and downtime are no longer part of "ram" data.

Signed-off-by: Orit Wasserman 
---
 qmp-commands.hx | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 2e59b0d..a22a841 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2626,8 +2626,8 @@ The main json-object contains the following:
 - "expected-downtime": only present while migration is active
 total amount in ms for downtime that was calculated on
 the last bitmap round (json-int)
-- "ram": only present if "status" is "active", it is a json-object with the
-  following RAM information:
+- "ram": only present if "status" is "active" or "complete", it is a
+ json-object with the following RAM information:
  - "transferred": amount transferred in bytes (json-int)
  - "remaining": amount remaining to transfer in bytes (json-int)
  - "total": total amount of memory in bytes (json-int)
@@ -2669,12 +2669,12 @@ Examples:
 -> { "execute": "query-migrate" }
 <- { "return": {
 "status": "completed",
+"total-time":12345,
+"downtime":12345,
 "ram":{
   "transferred":123,
   "remaining":123,
   "total":246,
-  "total-time":12345,
-  "downtime":12345,
   "duplicate":123,
   "normal":123,
   "normal-bytes":123456
@@ -2693,12 +2693,12 @@ Examples:
 <- {
   "return":{
  "status":"active",
+ "total-time":12345,
+ "expected-downtime":12345,
  "ram":{
 "transferred":123,
 "remaining":123,
 "total":246,
-"total-time":12345,
-"expected-downtime":12345,
 "duplicate":123,
 "normal":123,
 "normal-bytes":123456
@@ -2712,12 +2712,12 @@ Examples:
 <- {
   "return":{
  "status":"active",
+ "total-time":12345,
+ "expected-downtime":12345,
  "ram":{
 "total":1057024,
 "remaining":1053304,
 "transferred":3720,
-"total-time":12345,
-"expected-downtime":12345,
 "duplicate":123,
 "normal":123,
 "normal-bytes":123456
@@ -2736,13 +2736,13 @@ Examples:
 <- {
   "return":{
  "status":"active",
+ "total-time":12345,
+ "expected-downtime":12345,
  "capabilities" : [ { "capability": "xbzrle", "state" : true } ],
  "ram":{
 "total":1057024,
 "remaining":1053304,
 "transferred":3720,
-"total-time":12345,
-"expected-downtime":12345,
 "duplicate":10,
 "normal":,
 "normal-bytes":3412992
-- 
1.8.1.4




Re: [Qemu-devel] [PATCH] Fix query-migrate documentation in qmp-commands.hx

2013-08-08 Thread Orit Wasserman
On 08/08/2013 05:50 PM, Luiz Capitulino wrote:
> On Thu,  8 Aug 2013 11:46:14 +0300
> Orit Wasserman  wrote:
> 
>> "ram" is present also when migration completes.
>> total-time and downtime are no longer part of "ram" data.
> 
> expected-downtime has to be moved too, looks good otherwise.

I will fix it in a separate patch.

Orit
> 
>>
>> Signed-off-by: Orit Wasserman 
>> ---
>>  qmp-commands.hx | 14 +++---
>>  1 file changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/qmp-commands.hx b/qmp-commands.hx
>> index 2e59b0d..1d43f4c 100644
>> --- a/qmp-commands.hx
>> +++ b/qmp-commands.hx
>> @@ -2626,8 +2626,8 @@ The main json-object contains the following:
>>  - "expected-downtime": only present while migration is active
>>  total amount in ms for downtime that was calculated on
>>  the last bitmap round (json-int)
>> -- "ram": only present if "status" is "active", it is a json-object with the
>> -  following RAM information:
>> +- "ram": only present if "status" is "active" or "complete", it is a
>> + json-object with the following RAM information:
>>   - "transferred": amount transferred in bytes (json-int)
>>   - "remaining": amount remaining to transfer in bytes (json-int)
>>   - "total": total amount of memory in bytes (json-int)
>> @@ -2669,12 +2669,12 @@ Examples:
>>  -> { "execute": "query-migrate" }
>>  <- { "return": {
>>  "status": "completed",
>> +"total-time":12345,
>> +"downtime":12345,
>>  "ram":{
>>"transferred":123,
>>"remaining":123,
>>"total":246,
>> -  "total-time":12345,
>> -  "downtime":12345,
>>"duplicate":123,
>>"normal":123,
>>"normal-bytes":123456
>> @@ -2693,11 +2693,11 @@ Examples:
>>  <- {
>>"return":{
>>   "status":"active",
>> + "total-time":12345,
>>   "ram":{
>>  "transferred":123,
>>  "remaining":123,
>>  "total":246,
>> -"total-time":12345,
>>  "expected-downtime":12345,
>>  "duplicate":123,
>>  "normal":123,
>> @@ -2712,11 +2712,11 @@ Examples:
>>  <- {
>>"return":{
>>   "status":"active",
>> + "total-time":12345,
>>   "ram":{
>>  "total":1057024,
>>  "remaining":1053304,
>>  "transferred":3720,
>> -"total-time":12345,
>>  "expected-downtime":12345,
>>  "duplicate":123,
>>  "normal":123,
>> @@ -2736,12 +2736,12 @@ Examples:
>>  <- {
>>"return":{
>>   "status":"active",
>> + "total-time":12345,
>>   "capabilities" : [ { "capability": "xbzrle", "state" : true } ],
>>   "ram":{
>>  "total":1057024,
>>  "remaining":1053304,
>>  "transferred":3720,
>> -"total-time":12345,
>>  "expected-downtime":12345,
>>  "duplicate":10,
>>  "normal":,
> 




[Qemu-devel] [PATCH] Fix query-migrate documentation in qmp-commands.hx

2013-08-08 Thread Orit Wasserman
"ram" is present also when migration completes.
total-time and downtime are no longer part of "ram" data.

Signed-off-by: Orit Wasserman 
---
 qmp-commands.hx | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 2e59b0d..1d43f4c 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2626,8 +2626,8 @@ The main json-object contains the following:
 - "expected-downtime": only present while migration is active
 total amount in ms for downtime that was calculated on
 the last bitmap round (json-int)
-- "ram": only present if "status" is "active", it is a json-object with the
-  following RAM information:
+- "ram": only present if "status" is "active" or "complete", it is a
+ json-object with the following RAM information:
  - "transferred": amount transferred in bytes (json-int)
  - "remaining": amount remaining to transfer in bytes (json-int)
  - "total": total amount of memory in bytes (json-int)
@@ -2669,12 +2669,12 @@ Examples:
 -> { "execute": "query-migrate" }
 <- { "return": {
 "status": "completed",
+"total-time":12345,
+"downtime":12345,
 "ram":{
   "transferred":123,
   "remaining":123,
   "total":246,
-  "total-time":12345,
-  "downtime":12345,
   "duplicate":123,
   "normal":123,
   "normal-bytes":123456
@@ -2693,11 +2693,11 @@ Examples:
 <- {
   "return":{
  "status":"active",
+ "total-time":12345,
  "ram":{
 "transferred":123,
 "remaining":123,
 "total":246,
-"total-time":12345,
 "expected-downtime":12345,
 "duplicate":123,
 "normal":123,
@@ -2712,11 +2712,11 @@ Examples:
 <- {
   "return":{
  "status":"active",
+ "total-time":12345,
  "ram":{
 "total":1057024,
 "remaining":1053304,
 "transferred":3720,
-"total-time":12345,
 "expected-downtime":12345,
 "duplicate":123,
 "normal":123,
@@ -2736,12 +2736,12 @@ Examples:
 <- {
   "return":{
  "status":"active",
+ "total-time":12345,
  "capabilities" : [ { "capability": "xbzrle", "state" : true } ],
  "ram":{
 "total":1057024,
 "remaining":1053304,
 "transferred":3720,
-"total-time":12345,
 "expected-downtime":12345,
 "duplicate":10,
 "normal":,
-- 
1.8.1.4




Re: [Qemu-devel] [PATCH for-1.6 4/4] rdma: proper getaddrinfo() handling

2013-08-07 Thread Orit Wasserman
On 08/07/2013 07:05 PM, mrhi...@linux.vnet.ibm.com wrote:
> From: "Michael R. Hines" 
> 
> getaddrinfo() already knows what it's doing,
> wqand can potentially return multiple addresses.
> 
> Signed-off-by: Michael R. Hines 
> ---
>  migration-rdma.c |   56 
> --
>  1 file changed, 29 insertions(+), 27 deletions(-)
> 
> diff --git a/migration-rdma.c b/migration-rdma.c
> index 30e08cd..d71cca5 100644
> --- a/migration-rdma.c
> +++ b/migration-rdma.c
> @@ -392,7 +392,6 @@ typedef struct RDMAContext {
>  uint64_t unregistrations[RDMA_SIGNALED_SEND_MAX];
>  
>  GHashTable *blockmap;
> -bool ipv6;
>  } RDMAContext;
>  
>  /*
> @@ -745,7 +744,7 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
> Error **errp)
>  char port_str[16];
>  struct rdma_cm_event *cm_event;
>  char ip[40] = "unknown";
> -int af = rdma->ipv6 ? PF_INET6 : PF_INET;
> +struct addrinfo *e;
>  
>  if (rdma->host == NULL || !strcmp(rdma->host, "")) {
>  ERROR(errp, "RDMA hostname has not been set");
> @@ -775,18 +774,23 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
> Error **errp)
>  goto err_resolve_get_addr;
>  }
>  
> -inet_ntop(af, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
> -ip, sizeof ip);
> -DPRINTF("%s => %s\n", rdma->host, ip);
> +for (e = res; e != NULL; e = e->ai_next) {
> +inet_ntop(e->ai_family,
> +&((struct sockaddr_in *) e->ai_addr)->sin_addr, ip, sizeof ip);
> +DPRINTF("Trying %s => %s\n", rdma->host, ip);
>  
> -/* resolve the first address */
> -ret = rdma_resolve_addr(rdma->cm_id, NULL, res->ai_addr,
> -RDMA_RESOLVE_TIMEOUT_MS);
> -if (ret) {
> -ERROR(errp, "could not resolve address %s", rdma->host);
> -goto err_resolve_get_addr;
> +/* resolve the first address */
> +ret = rdma_resolve_addr(rdma->cm_id, NULL, e->ai_addr,
> +RDMA_RESOLVE_TIMEOUT_MS);
> +if (!ret) {
> +goto route;
> +}
>  }
>  
> +ERROR(errp, "could not resolve address %s", rdma->host);
> +goto err_resolve_get_addr;
> +
> +route:
>  qemu_rdma_dump_gid("source_resolve_addr", rdma->cm_id);
>  
>  ret = rdma_get_cm_event(rdma->channel, &cm_event);
> @@ -2260,8 +2264,6 @@ err_rdma_source_connect:
>  static int qemu_rdma_dest_init(RDMAContext *rdma, Error **errp)
>  {
>  int ret = -EINVAL, idx;
> -int af = rdma->ipv6 ? PF_INET6 : PF_INET;
> -struct sockaddr_in sin;
>  struct rdma_cm_id *listen_id;
>  char ip[40] = "unknown";
>  struct addrinfo *res;
> @@ -2292,35 +2294,36 @@ static int qemu_rdma_dest_init(RDMAContext *rdma, 
> Error **errp)
>  goto err_dest_init_create_listen_id;
>  }
>  
> -memset(&sin, 0, sizeof(sin));
> -sin.sin_family = af;
> -sin.sin_port = htons(rdma->port);
>  snprintf(port_str, 16, "%d", rdma->port);
>  port_str[15] = '\0';
>  
>  if (rdma->host && strcmp("", rdma->host)) {
> +struct addrinfo *e;
> +
>  ret = getaddrinfo(rdma->host, port_str, NULL, &res);
>  if (ret < 0) {
>  ERROR(errp, "could not getaddrinfo address %s", rdma->host);
>  goto err_dest_init_bind_addr;
>  }
>  
> +for (e = res; e != NULL; e = e->ai_next) {
> +inet_ntop(e->ai_family,
> +&((struct sockaddr_in *) e->ai_addr)->sin_addr, ip, sizeof 
> ip);
> +DPRINTF("Trying %s => %s\n", rdma->host, ip);
> +ret = rdma_bind_addr(listen_id, e->ai_addr);
> +if (!ret) {
> +goto listen;
> +}
> +}
>  
> -inet_ntop(af, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
> -ip, sizeof ip);
> +ERROR(errp, "Error: could not rdma_bind_addr!");
> +goto err_dest_init_bind_addr;
>  } else {
>  ERROR(errp, "migration host and port not specified!");
>  ret = -EINVAL;
>  goto err_dest_init_bind_addr;
>  }
> -
> -DPRINTF("%s => %s\n", rdma->host, ip);
> -
> -    ret = rdma_bind_addr(listen_id, res->ai_addr);
> -if (ret) {
> -ERROR(errp, "Error: could not rdma_bind_addr!");
> -goto err_dest_init_bind_addr;
> -}
> +listen:
>  
>  rdma->listen_id = listen_id;
>  qemu_rdma_dump_gid("dest_init", listen_id);
> @@ -2351,7 +2354,6 @@ static void *qemu_rdma_data_init(const char *host_port, 
> Error **errp)
>  if (addr != NULL) {
>  rdma->port = atoi(addr->port);
>  rdma->host = g_strdup(addr->host);
> -rdma->ipv6 = addr->ipv6;
>  } else {
>  ERROR(errp, "bad RDMA migration address '%s'", host_port);
>  g_free(rdma);
> 

Looks good,
Reviewed-by: Orit Wasserman 




Re: [Qemu-devel] [PATCH 0/3] rdma: validate remote provided RDMAControlHeader::len

2013-08-07 Thread Orit Wasserman
On 08/07/2013 05:26 AM, Isaku Yamahata wrote:
> RDMAControlHeader::len is remote-provided. So validate the value before use.
> 
> Isaku Yamahata (3):
>   rdma: use resp.len after validation in qemu_rdma_registration_stop
>   rdma: validate RDMAControlHeader::len
>   rdma: check if RDMAControlHeader::len match transferred byte
> 
>  migration-rdma.c |   44 ++--
>  1 file changed, 30 insertions(+), 14 deletions(-)
> 

Series 
Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH v3 For-1.6 3/7] rdma: correct newlines in error statements

2013-08-07 Thread Orit Wasserman
 goto err_dest_init_create_listen_id;
>  }
>  
> @@ -2279,7 +2279,7 @@ static int qemu_rdma_dest_init(RDMAContext *rdma, Error 
> **errp)
>  if (rdma->host && strcmp("", rdma->host)) {
>  ret = getaddrinfo(rdma->host, port_str, NULL, &res);
>  if (ret < 0) {
> -ERROR(errp, "could not getaddrinfo address %s\n", rdma->host);
> +ERROR(errp, "could not getaddrinfo address %s", rdma->host);
>  goto err_dest_init_bind_addr;
>  }
>  
> @@ -2287,7 +2287,7 @@ static int qemu_rdma_dest_init(RDMAContext *rdma, Error 
> **errp)
>  inet_ntop(af, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
>  ip, sizeof ip);
>  } else {
> -ERROR(errp, "migration host and port not specified!\n");
> +ERROR(errp, "migration host and port not specified!");
>  ret = -EINVAL;
>  goto err_dest_init_bind_addr;
>  }
> @@ -2296,7 +2296,7 @@ static int qemu_rdma_dest_init(RDMAContext *rdma, Error 
> **errp)
>  
>  ret = rdma_bind_addr(listen_id, res->ai_addr);
>  if (ret) {
> -ERROR(errp, "Error: could not rdma_bind_addr!\n");
> +ERROR(errp, "Error: could not rdma_bind_addr!");
>  goto err_dest_init_bind_addr;
>  }
>  
> @@ -3036,7 +3036,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, 
> void *opaque,
>  ®_result_idx, rdma->pin_all ?
>  qemu_rdma_reg_whole_ram_blocks : NULL);
>  if (ret < 0) {
> -ERROR(errp, "receiving remote info!\n");
> +ERROR(errp, "receiving remote info!");
>  return ret;
>  }
>  
> @@ -3061,7 +3061,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, 
> void *opaque,
>  if (local->nb_blocks != nb_remote_blocks) {
>  ERROR(errp, "ram blocks mismatch #1! "
>  "Your QEMU command line parameters are probably "
> -"not identical on both the source and 
> destination.\n");
> +"not identical on both the source and destination.");
>  return -EINVAL;
>  }
>  
> @@ -3077,7 +3077,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, 
> void *opaque,
>  if (rdma->block[i].length != local->block[j].length) {
>  ERROR(errp, "ram blocks mismatch #2! "
>  "Your QEMU command line parameters are probably "
> -"not identical on both the source and 
> destination.\n");
> +"not identical on both the source and destination.");
>  return -EINVAL;
>      }
>  local->block[j].remote_host_addr =
> @@ -3089,7 +3089,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, 
> void *opaque,
>  if (j >= local->nb_blocks) {
>  ERROR(errp, "ram blocks mismatch #3! "
>  "Your QEMU command line parameters are probably "
> -"not identical on both the source and 
> destination.\n");
> +"not identical on both the source and destination.");
>  return -EINVAL;
>  }
>  }
> @@ -3163,7 +3163,7 @@ static void rdma_accept_incoming_migration(void *opaque)
>  ret = qemu_rdma_accept(rdma);
>  
>  if (ret) {
> -ERROR(errp, "RDMA Migration initialization failed!\n");
> +ERROR(errp, "RDMA Migration initialization failed!");
>  return;
>  }
>  
> @@ -3171,7 +3171,7 @@ static void rdma_accept_incoming_migration(void *opaque)
>  
>  f = qemu_fopen_rdma(rdma, "rb");
>  if (f == NULL) {
> -ERROR(errp, "could not qemu_fopen_rdma!\n");
> +ERROR(errp, "could not qemu_fopen_rdma!");
>  qemu_rdma_cleanup(rdma);
>  return;
>  }
> @@ -3204,7 +3204,7 @@ void rdma_start_incoming_migration(const char 
> *host_port, Error **errp)
>  ret = rdma_listen(rdma->listen_id, 5);
>  
>  if (ret) {
> -ERROR(errp, "listening on socket!\n");
> +ERROR(errp, "listening on socket!");
>  goto err;
>  }
>  
> @@ -3228,7 +3228,7 @@ void rdma_start_outgoing_migration(void *opaque,
>  int ret = 0;
>  
>  if (rdma == NULL) {
> -ERROR(temp, "Failed to initialize RDMA data structures! %d\n", ret);
> +ERROR(temp, "Failed to initialize RDMA data structures! %d", ret);
>  goto err;
>  }
>  
> 

Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH v3 For-1.6 2/7] rdma: forgot to turn off the debugging flag

2013-08-07 Thread Orit Wasserman
On 08/04/2013 05:54 AM, mrhi...@linux.vnet.ibm.com wrote:
> From: "Michael R. Hines" 
> 
> Ooops. We forgot to turn off the flag.
> 
> Signed-off-by: Michael R. Hines 
> ---
>  migration-rdma.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/migration-rdma.c b/migration-rdma.c
> index 9cf73e3..fe6118d 100644
> --- a/migration-rdma.c
> +++ b/migration-rdma.c
> @@ -27,7 +27,7 @@
>  #include 
>  #include 
>  
> -#define DEBUG_RDMA
> +//#define DEBUG_RDMA
>  //#define DEBUG_RDMA_VERBOSE
>  //#define DEBUG_RDMA_REALLY_VERBOSE
>  
> 

Reviewed-by: Orit Wasserman 



Re: [Qemu-devel] [PATCH v3 For-1.6 1/7] rdma: bugfix: make IPv6 support work

2013-08-07 Thread Orit Wasserman
On 08/04/2013 05:54 AM, mrhi...@linux.vnet.ibm.com wrote:
> From: "Michael R. Hines" 
> 
> RDMA does not use sockets, so we cannot use many of the socket
> helper functions, but we *do* use inet_parse() which gives
> RDMA all the necessary details of the connection parameters.
> 
> However, when testing with libvirt, a simple IPv6 migration test failed
> because we were not using getaddrinfo() properly.
> 
> This makes IPv6 migration over RDMA work.
> 
> Signed-off-by: Michael R. Hines 
> ---
>  migration-rdma.c |   33 +
>  1 file changed, 21 insertions(+), 12 deletions(-)
> 
> diff --git a/migration-rdma.c b/migration-rdma.c
> index d044830..9cf73e3 100644
> --- a/migration-rdma.c
> +++ b/migration-rdma.c
> @@ -392,6 +392,7 @@ typedef struct RDMAContext {
>  uint64_t unregistrations[RDMA_SIGNALED_SEND_MAX];
>  
>  GHashTable *blockmap;
> +bool ipv6;
>  } RDMAContext;
>  
>  /*
> @@ -744,6 +745,7 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
> Error **errp)
>  char port_str[16];
>  struct rdma_cm_event *cm_event;
>  char ip[40] = "unknown";
> +int af = rdma->ipv6 ? PF_INET6 : PF_INET;
>  
>  if (rdma->host == NULL || !strcmp(rdma->host, "")) {
>  ERROR(errp, "RDMA hostname has not been set\n");
> @@ -773,7 +775,7 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
> Error **errp)
>  goto err_resolve_get_addr;
>  }
>  
> -inet_ntop(AF_INET, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
> +inet_ntop(af, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
>  ip, sizeof ip);
>  DPRINTF("%s => %s\n", rdma->host, ip);
>  
> @@ -2236,9 +2238,12 @@ err_rdma_source_connect:
>  static int qemu_rdma_dest_init(RDMAContext *rdma, Error **errp)
>  {
>  int ret = -EINVAL, idx;
> +int af = rdma->ipv6 ? PF_INET6 : PF_INET;
>  struct sockaddr_in sin;
>  struct rdma_cm_id *listen_id;
>  char ip[40] = "unknown";
> +struct addrinfo *res;
> +char port_str[16];
>  
>  for (idx = 0; idx <= RDMA_WRID_MAX; idx++) {
>  rdma->wr_data[idx].control_len = 0;
> @@ -2266,27 +2271,30 @@ static int qemu_rdma_dest_init(RDMAContext *rdma, 
> Error **errp)
>  }
>  
>  memset(&sin, 0, sizeof(sin));
> -sin.sin_family = AF_INET;
> +sin.sin_family = af;
>  sin.sin_port = htons(rdma->port);
> +snprintf(port_str, 16, "%d", rdma->port);
> +port_str[15] = '\0';
>  
>  if (rdma->host && strcmp("", rdma->host)) {
> -struct hostent *dest_addr;
> -dest_addr = gethostbyname(rdma->host);
> -if (!dest_addr) {
> -ERROR(errp, "migration could not gethostbyname!\n");
> -ret = -EINVAL;
> +ret = getaddrinfo(rdma->host, port_str, NULL, &res);

Hi Michael,

getaddrinfo can return a list of addresses, you need to handle it.
Look at qemu-sockets.c for example.

Regards,
Orit
> +if (ret < 0) {
> +ERROR(errp, "could not getaddrinfo address %s\n", rdma->host);
>  goto err_dest_init_bind_addr;
>  }
> -memcpy(&sin.sin_addr.s_addr, dest_addr->h_addr,
> -dest_addr->h_length);
> -inet_ntop(AF_INET, dest_addr->h_addr, ip, sizeof ip);
> +
> +
> +inet_ntop(af, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
> +ip, sizeof ip);
>  } else {
> -sin.sin_addr.s_addr = INADDR_ANY;
> +ERROR(errp, "migration host and port not specified!\n");
> +ret = -EINVAL;
> +goto err_dest_init_bind_addr;
>  }
>  
>  DPRINTF("%s => %s\n", rdma->host, ip);
>  
> -ret = rdma_bind_addr(listen_id, (struct sockaddr *)&sin);
> +ret = rdma_bind_addr(listen_id, res->ai_addr);
>  if (ret) {
>  ERROR(errp, "Error: could not rdma_bind_addr!\n");
>  goto err_dest_init_bind_addr;
> @@ -2321,6 +2329,7 @@ static void *qemu_rdma_data_init(const char *host_port, 
> Error **errp)
>  if (addr != NULL) {
>  rdma->port = atoi(addr->port);
>  rdma->host = g_strdup(addr->host);
> +rdma->ipv6 = addr->ipv6;
>  } else {
>  ERROR(errp, "bad RDMA migration address '%s'", host_port);
>  g_free(rdma);
> 




Re: [Qemu-devel] [PATCH] rdma: bugfix: make IPv6 support work

2013-07-31 Thread Orit Wasserman
On 07/31/2013 04:39 PM, Michael R. Hines wrote:
> On 07/30/2013 11:31 AM, Orit Wasserman wrote:
>> On 07/30/2013 05:57 PM, Michael R. Hines wrote:
>>> On 07/30/2013 04:14 AM, Orit Wasserman wrote:
>>>> On 07/27/2013 05:23 AM, mrhi...@linux.vnet.ibm.com wrote:
>>>>> From: "Michael R. Hines" 
>>>>>
>>>>> When testing with libvirt, a simple IPv6 migration test failed
>>>>> because we were not using getaddrinfo() properly.
>>>>> This makes IPv6 migration over RDMA work.
>>>>>
>>>>> Also, we forgot to turn the DPRINTF flag off =).
>>>>>
>>>>> Signed-off-by: Michael R. Hines 
>>>>> ---
>>>>>migration-rdma.c |   35 ++-
>>>>>1 file changed, 22 insertions(+), 13 deletions(-)
>>>>>
>>>>> diff --git a/migration-rdma.c b/migration-rdma.c
>>>>> index d044830..3256c9b 100644
>>>>> --- a/migration-rdma.c
>>>>> +++ b/migration-rdma.c
>>>>> @@ -27,7 +27,7 @@
>>>>>#include 
>>>>>#include 
>>>>>-#define DEBUG_RDMA
>>>>> +//#define DEBUG_RDMA
>>>> Can you put this in a separate patch?
>>> Acknowledged.
>>>
>>>>>//#define DEBUG_RDMA_VERBOSE
>>>>>//#define DEBUG_RDMA_REALLY_VERBOSE
>>>>>@@ -392,6 +392,7 @@ typedef struct RDMAContext {
>>>>>uint64_t unregistrations[RDMA_SIGNALED_SEND_MAX];
>>>>>  GHashTable *blockmap;
>>>>> +bool ipv6;
>>>>>} RDMAContext;
>>>>>  /*
>>>>> @@ -744,6 +745,7 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
>>>>> Error **errp)
>>>>>char port_str[16];
>>>>>struct rdma_cm_event *cm_event;
>>>>>char ip[40] = "unknown";
>>>>> +int af = rdma->ipv6 ? PF_INET6 : PF_INET;
>>>> We have code that handles ipv6 in utils/qemu-sockets.c, it also handle host
>>>> and port parsing please take a look at inet_parse_connect_opts.
>>>> I think it can be reused here and for the destination.
>>> RDMA cannot use that function - it creates a socket and RDMA does not use 
>>> sockets.
>>>
>>> RDMA is *already*, however, using inet_parse() which does exactly what you 
>>> said.
>>>
>> You can update the function so it can be used for RDMA also.
> 
> The inet_parse() function does everything that we need -
> it already understands IPv6 without opening an actual socket.
> 
> Any more than that would require CONFIG_RDMA to be
> conditionalized inside of the utils/ source code,
> and that seems like a lot of work for such a small reward.
> 

Agreed,
Orit

> - Michael
> 
>> Orit
>>
>>>> Regards,
>>>> Orit
>>>>>  if (rdma->host == NULL || !strcmp(rdma->host, "")) {
>>>>>ERROR(errp, "RDMA hostname has not been set\n");
>>>>> @@ -773,7 +775,7 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
>>>>> Error **errp)
>>>>>goto err_resolve_get_addr;
>>>>>}
>>>>>-inet_ntop(AF_INET, &((struct sockaddr_in *) 
>>>>> res->ai_addr)->sin_addr,
>>>>> +inet_ntop(af, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
>>>>>ip, sizeof ip);
>>>>>DPRINTF("%s => %s\n", rdma->host, ip);
>>>>>@@ -2236,9 +2238,12 @@ err_rdma_source_connect:
>>>>>static int qemu_rdma_dest_init(RDMAContext *rdma, Error **errp)
>>>>>{
>>>>>int ret = -EINVAL, idx;
>>>>> +int af = rdma->ipv6 ? PF_INET6 : PF_INET;
>>>>>struct sockaddr_in sin;
>>>>>struct rdma_cm_id *listen_id;
>>>>>char ip[40] = "unknown";
>>>>> +struct addrinfo *res;
>>>>> +char port_str[16];
>>>>>  for (idx = 0; idx <= RDMA_WRID_MAX; idx++) {
>>>>>rdma->wr_data[idx].control_len = 0;
>>>>> @@ -2266,27 +2271,30 @@ static int qemu_rdma_dest_init(RDMAContext *rdma, 
>>>>> Error **errp)
>>>>>}
>>>>>  memset(&sin, 0, sizeof(sin));

Re: [Qemu-devel] [PATCH] rdma: bugfix: make IPv6 support work

2013-07-30 Thread Orit Wasserman
On 07/30/2013 05:57 PM, Michael R. Hines wrote:
> On 07/30/2013 04:14 AM, Orit Wasserman wrote:
>> On 07/27/2013 05:23 AM, mrhi...@linux.vnet.ibm.com wrote:
>>> From: "Michael R. Hines" 
>>>
>>> When testing with libvirt, a simple IPv6 migration test failed
>>> because we were not using getaddrinfo() properly.
>>> This makes IPv6 migration over RDMA work.
>>>
>>> Also, we forgot to turn the DPRINTF flag off =).
>>>
>>> Signed-off-by: Michael R. Hines 
>>> ---
>>>   migration-rdma.c |   35 ++-
>>>   1 file changed, 22 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/migration-rdma.c b/migration-rdma.c
>>> index d044830..3256c9b 100644
>>> --- a/migration-rdma.c
>>> +++ b/migration-rdma.c
>>> @@ -27,7 +27,7 @@
>>>   #include 
>>>   #include 
>>>   -#define DEBUG_RDMA
>>> +//#define DEBUG_RDMA
>> Can you put this in a separate patch?
> 
> Acknowledged.
> 
>>>   //#define DEBUG_RDMA_VERBOSE
>>>   //#define DEBUG_RDMA_REALLY_VERBOSE
>>>   @@ -392,6 +392,7 @@ typedef struct RDMAContext {
>>>   uint64_t unregistrations[RDMA_SIGNALED_SEND_MAX];
>>> GHashTable *blockmap;
>>> +bool ipv6;
>>>   } RDMAContext;
>>> /*
>>> @@ -744,6 +745,7 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
>>> Error **errp)
>>>   char port_str[16];
>>>   struct rdma_cm_event *cm_event;
>>>   char ip[40] = "unknown";
>>> +int af = rdma->ipv6 ? PF_INET6 : PF_INET;
>> We have code that handles ipv6 in utils/qemu-sockets.c, it also handle host
>> and port parsing please take a look at inet_parse_connect_opts.
>> I think it can be reused here and for the destination.
> 
> RDMA cannot use that function - it creates a socket and RDMA does not use 
> sockets.
> 
> RDMA is *already*, however, using inet_parse() which does exactly what you 
> said.
> 

You can update the function so it can be used for RDMA also.

Orit

>> Regards,
>> Orit
>>> if (rdma->host == NULL || !strcmp(rdma->host, "")) {
>>>   ERROR(errp, "RDMA hostname has not been set\n");
>>> @@ -773,7 +775,7 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
>>> Error **errp)
>>>   goto err_resolve_get_addr;
>>>   }
>>>   -inet_ntop(AF_INET, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
>>> +inet_ntop(af, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
>>>   ip, sizeof ip);
>>>   DPRINTF("%s => %s\n", rdma->host, ip);
>>>   @@ -2236,9 +2238,12 @@ err_rdma_source_connect:
>>>   static int qemu_rdma_dest_init(RDMAContext *rdma, Error **errp)
>>>   {
>>>   int ret = -EINVAL, idx;
>>> +int af = rdma->ipv6 ? PF_INET6 : PF_INET;
>>>   struct sockaddr_in sin;
>>>   struct rdma_cm_id *listen_id;
>>>   char ip[40] = "unknown";
>>> +struct addrinfo *res;
>>> +char port_str[16];
>>> for (idx = 0; idx <= RDMA_WRID_MAX; idx++) {
>>>   rdma->wr_data[idx].control_len = 0;
>>> @@ -2266,27 +2271,30 @@ static int qemu_rdma_dest_init(RDMAContext *rdma, 
>>> Error **errp)
>>>   }
>>> memset(&sin, 0, sizeof(sin));
>>> -sin.sin_family = AF_INET;
>>> +sin.sin_family = af;
>>>   sin.sin_port = htons(rdma->port);
>>> +snprintf(port_str, 16, "%d", rdma->port);
>>> +port_str[15] = '\0';
>>> if (rdma->host && strcmp("", rdma->host)) {
>>> -struct hostent *dest_addr;
>>> -dest_addr = gethostbyname(rdma->host);
>>> -if (!dest_addr) {
>>> -ERROR(errp, "migration could not gethostbyname!\n");
>>> -ret = -EINVAL;
>>> +ret = getaddrinfo(rdma->host, port_str, NULL, &res);
>>> +if (ret < 0) {
>>> +ERROR(errp, "could not getaddrinfo address %s\n", rdma->host);
>>>   goto err_dest_init_bind_addr;
>>>   }
>>> -memcpy(&sin.sin_addr.s_addr, dest_addr->h_addr,
>>> -dest_addr->h_length);
>>> -inet_ntop(AF_INET, dest_addr->h_addr, ip, sizeof ip);
>>> +
>>

Re: [Qemu-devel] [PATCH] rdma: bugfix: make IPv6 support work

2013-07-30 Thread Orit Wasserman
On 07/27/2013 05:23 AM, mrhi...@linux.vnet.ibm.com wrote:
> From: "Michael R. Hines" 
> 
> When testing with libvirt, a simple IPv6 migration test failed
> because we were not using getaddrinfo() properly.
> This makes IPv6 migration over RDMA work.
> 
> Also, we forgot to turn the DPRINTF flag off =).
> 
> Signed-off-by: Michael R. Hines 
> ---
>  migration-rdma.c |   35 ++-
>  1 file changed, 22 insertions(+), 13 deletions(-)
> 
> diff --git a/migration-rdma.c b/migration-rdma.c
> index d044830..3256c9b 100644
> --- a/migration-rdma.c
> +++ b/migration-rdma.c
> @@ -27,7 +27,7 @@
>  #include 
>  #include 
>  
> -#define DEBUG_RDMA
> +//#define DEBUG_RDMA

Can you put this in a separate patch?

>  //#define DEBUG_RDMA_VERBOSE
>  //#define DEBUG_RDMA_REALLY_VERBOSE
>  
> @@ -392,6 +392,7 @@ typedef struct RDMAContext {
>  uint64_t unregistrations[RDMA_SIGNALED_SEND_MAX];
>  
>  GHashTable *blockmap;
> +bool ipv6;
>  } RDMAContext;
>  
>  /*
> @@ -744,6 +745,7 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
> Error **errp)
>  char port_str[16];
>  struct rdma_cm_event *cm_event;
>  char ip[40] = "unknown";
> +int af = rdma->ipv6 ? PF_INET6 : PF_INET;

We have code that handles ipv6 in utils/qemu-sockets.c, it also handle host 
and port parsing please take a look at inet_parse_connect_opts.
I think it can be reused here and for the destination.

Regards,
Orit
>  
>  if (rdma->host == NULL || !strcmp(rdma->host, "")) {
>  ERROR(errp, "RDMA hostname has not been set\n");
> @@ -773,7 +775,7 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
> Error **errp)
>  goto err_resolve_get_addr;
>  }
>  
> -inet_ntop(AF_INET, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
> +inet_ntop(af, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
>  ip, sizeof ip);
>  DPRINTF("%s => %s\n", rdma->host, ip);
>  
> @@ -2236,9 +2238,12 @@ err_rdma_source_connect:
>  static int qemu_rdma_dest_init(RDMAContext *rdma, Error **errp)
>  {
>  int ret = -EINVAL, idx;
> +int af = rdma->ipv6 ? PF_INET6 : PF_INET;
>  struct sockaddr_in sin;
>  struct rdma_cm_id *listen_id;
>  char ip[40] = "unknown";
> +struct addrinfo *res;
> +char port_str[16];
>  
>  for (idx = 0; idx <= RDMA_WRID_MAX; idx++) {
>  rdma->wr_data[idx].control_len = 0;
> @@ -2266,27 +2271,30 @@ static int qemu_rdma_dest_init(RDMAContext *rdma, 
> Error **errp)
>  }
>  
>  memset(&sin, 0, sizeof(sin));
> -sin.sin_family = AF_INET;
> +sin.sin_family = af; 
>  sin.sin_port = htons(rdma->port);
> +snprintf(port_str, 16, "%d", rdma->port);
> +port_str[15] = '\0';
>  
>  if (rdma->host && strcmp("", rdma->host)) {
> -struct hostent *dest_addr;
> -dest_addr = gethostbyname(rdma->host);
> -if (!dest_addr) {
> -ERROR(errp, "migration could not gethostbyname!\n");
> -ret = -EINVAL;
> +ret = getaddrinfo(rdma->host, port_str, NULL, &res);
> +if (ret < 0) {
> +ERROR(errp, "could not getaddrinfo address %s\n", rdma->host);
>  goto err_dest_init_bind_addr;
>  }
> -memcpy(&sin.sin_addr.s_addr, dest_addr->h_addr,
> -dest_addr->h_length);
> -inet_ntop(AF_INET, dest_addr->h_addr, ip, sizeof ip);
> +
> +
> +inet_ntop(af, &((struct sockaddr_in *) res->ai_addr)->sin_addr,
> +ip, sizeof ip);
>  } else {
> -sin.sin_addr.s_addr = INADDR_ANY;
> +ERROR(errp, "migration host and port not specified!\n");
> +ret = -EINVAL;
> +goto err_dest_init_bind_addr;
>  }
>  
>  DPRINTF("%s => %s\n", rdma->host, ip);
>  
> -ret = rdma_bind_addr(listen_id, (struct sockaddr *)&sin);
> +ret = rdma_bind_addr(listen_id, res->ai_addr);
>  if (ret) {
>  ERROR(errp, "Error: could not rdma_bind_addr!\n");
>  goto err_dest_init_bind_addr;
> @@ -2321,6 +2329,7 @@ static void *qemu_rdma_data_init(const char *host_port, 
> Error **errp)
>  if (addr != NULL) {
>  rdma->port = atoi(addr->port);
>  rdma->host = g_strdup(addr->host);
> +rdma->ipv6 = addr->ipv6;
>  } else {
>  ERROR(errp, "bad RDMA migration address '%s'", host_port);
>  g_free(rdma);
> 




[Qemu-devel] [PATCH v3 2/2] Fix real mode guest segments dpl value in savevm

2013-07-22 Thread Orit Wasserman
Older KVM version put invalid value in the segments registers dpl field for
real mode guests (0x3).
This breaks migration from those hosts to hosts with unrestricted guest support.
We detect it by checking CS dpl value for real mode guest and fix the dpl values
of all the segment registers.

Signed-off-by: Orit Wasserman 
Reviewed-by: Juan Quintela 
---
 target-i386/machine.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/target-i386/machine.c b/target-i386/machine.c
index af4c167..f9ec581 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -252,6 +252,24 @@ static void cpu_pre_save(void *opaque)
 }
 
 env->fpregs_format_vmstate = 0;
+
+/*
+ * Real mode guest segments register DPL should be zero.
+ * Older KVM version were setting it wrongly.
+ * Fixing it will allow live migration to host with unrestricted guest
+ * support (otherwise the migration will fail with invalid guest state
+ * error).
+ */
+if (!(env->cr[0] & CR0_PE_MASK) &&
+(env->segs[R_CS].flags >> DESC_DPL_SHIFT & 3) != 0) {
+env->segs[R_CS].flags &= ~(env->segs[R_CS].flags & DESC_DPL_MASK);
+env->segs[R_DS].flags &= ~(env->segs[R_DS].flags & DESC_DPL_MASK);
+env->segs[R_ES].flags &= ~(env->segs[R_ES].flags & DESC_DPL_MASK);
+env->segs[R_FS].flags &= ~(env->segs[R_FS].flags & DESC_DPL_MASK);
+env->segs[R_GS].flags &= ~(env->segs[R_GS].flags & DESC_DPL_MASK);
+env->segs[R_SS].flags &= ~(env->segs[R_SS].flags & DESC_DPL_MASK);
+}
+
 }
 
 static int cpu_post_load(void *opaque, int version_id)
-- 
1.8.1.4




[Qemu-devel] [PATCH v3 1/2] Fix real mode guest migration

2013-07-22 Thread Orit Wasserman
Older KVM versions save CS dpl value to an invalid value for real mode guests
(0x3). This patch detect this situation when loading CPU state and set all the
segments dpl to zero.
This will allow migration from older KVM on host without unrestricted guest
to hosts with restricted guest support.
For example migration from a Penryn host (with kernel 2.6.32) to
a Westmere host (for real mode guest) will fail with "kvm: unhandled exit 
8021".

Signed-off-by: Orit Wasserman 
Reviewed-by: Juan Quintela 
---
 target-i386/machine.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/target-i386/machine.c b/target-i386/machine.c
index 3659db9..af4c167 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -260,6 +260,24 @@ static int cpu_post_load(void *opaque, int version_id)
 CPUX86State *env = &cpu->env;
 int i;
 
+/*
+ * Real mode guest segments register DPL should be zero.
+ * Older KVM version were setting it wrongly.
+ * Fixing it will allow live migration from such host that don't have
+ * restricted guest support to a host with unrestricted guest support
+ * (otherwise the migration will fail with invalid guest state
+ * error).
+ */
+if (!(env->cr[0] & CR0_PE_MASK) &&
+(env->segs[R_CS].flags >> DESC_DPL_SHIFT & 3) != 0) {
+env->segs[R_CS].flags &= ~(env->segs[R_CS].flags & DESC_DPL_MASK);
+env->segs[R_DS].flags &= ~(env->segs[R_DS].flags & DESC_DPL_MASK);
+env->segs[R_ES].flags &= ~(env->segs[R_ES].flags & DESC_DPL_MASK);
+env->segs[R_FS].flags &= ~(env->segs[R_FS].flags & DESC_DPL_MASK);
+env->segs[R_GS].flags &= ~(env->segs[R_GS].flags & DESC_DPL_MASK);
+env->segs[R_SS].flags &= ~(env->segs[R_SS].flags & DESC_DPL_MASK);
+}
+
 /* XXX: restore FPU round state */
 env->fpstt = (env->fpus_vmstate >> 11) & 7;
 env->fpus = env->fpus_vmstate & ~0x3800;
-- 
1.8.1.4




[Qemu-devel] [PATCH v3 0/2] Fix real mode guest live migration

2013-07-22 Thread Orit Wasserman
Older KVM versions save CS dpl value to an invalid value for real mode guests
(0x3). This caused migration to fail from older KVM on host without 
unrestricted guest to hosts with restricted guest support with "kvm: unhandled 
exit 8021".
For example migration from a Penryn host (with kernel 2.6.32) to
a Westmere host (for real mode guest) 

This series fixing it both in the destination and source QEMU.

Changes from v2:
  Fix more typos

Changes from v1:
  Fix typos and style

Orit Wasserman (2):
  Fix real mode guest migration
  Fix real mode guest segments dpl value in savevm

 target-i386/machine.c | 36 
 1 file changed, 36 insertions(+)

-- 
1.8.1.4




[Qemu-devel] [PATCH v2 1/2] Fix real mode guest migration

2013-07-22 Thread Orit Wasserman
Older KVM versions save CS dpl value to an invalid value for real mode guests
(0x3). This patch detect this situation when loading CPU state and set all the
segments dpl to zero.
This will allow migration from older KVM on host without unrestricted guest
to hosts with restricted guest support.
For example migration from a Penryn host (with kernel 2.6.32) to
a Westmere host (for real mode guest) will fail with "kvm: unhandled exit 
8021".

Signed-off-by: Orit Wasserman 
---
 target-i386/machine.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/target-i386/machine.c b/target-i386/machine.c
index 3659db9..83ea6ac 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -260,6 +260,24 @@ static int cpu_post_load(void *opaque, int version_id)
 CPUX86State *env = &cpu->env;
 int i;
 
+/*
+ * Real mode guest segments register DPL should be zero.
+ * Older KVM version were setting it wrongly.
+ * Fixing it will allow live migration from such host that don't have
+ * restricted guest support to an host with unrestricted guest support
+ * (otherwise the migration will fail with invalid guest state
+ * error).
+ */
+if (!(env->cr[0] & CR0_PE_MASK) &&
+(env->segs[R_CS].flags >> DESC_DPL_SHIFT & 3) != 0) {
+env->segs[R_CS].flags &= ~(env->segs[R_CS].flags & DESC_DPL_MASK);
+env->segs[R_DS].flags &= ~(env->segs[R_DS].flags & DESC_DPL_MASK);
+env->segs[R_ES].flags &= ~(env->segs[R_ES].flags & DESC_DPL_MASK);
+env->segs[R_FS].flags &= ~(env->segs[R_FS].flags & DESC_DPL_MASK);
+env->segs[R_GS].flags &= ~(env->segs[R_GS].flags & DESC_DPL_MASK);
+env->segs[R_SS].flags &= ~(env->segs[R_SS].flags & DESC_DPL_MASK);
+}
+
 /* XXX: restore FPU round state */
 env->fpstt = (env->fpus_vmstate >> 11) & 7;
 env->fpus = env->fpus_vmstate & ~0x3800;
-- 
1.8.1.4




[Qemu-devel] [PATCH v2 0/2] Fix real mode guest live migration

2013-07-22 Thread Orit Wasserman
Older KVM versions save CS dpl value to an invalid value for real mode guests
(0x3). This caused migration to fail from older KVM on host without 
unrestricted guest to hosts with restricted guest support with "kvm: unhandled 
exit 8021".
For example migration from a Penryn host (with kernel 2.6.32) to
a Westmere host (for real mode guest) 

This series fixing it both in the destination and source QEMU.

Changes from v1:
  Fix typos and style

Orit Wasserman (2):
  Fix real mode guest migration
  Fix real mode guest segments dpl value in savevm

 target-i386/machine.c | 36 
 1 file changed, 36 insertions(+)

-- 
1.8.1.4




[Qemu-devel] [PATCH v2 2/2] Fix real mode guest segments dpl value in savevm

2013-07-22 Thread Orit Wasserman
Older KVM version put invalid value in the segments registers dpl field for
real mode guests (0x3).
This breaks migration from those hosts to hosts with unrestricted guest support.
We detect it by checking CS dpl value for real mode guest and fix the dpl values
of all the segment registers.

Signed-off-by: Orit Wasserman 
---
 target-i386/machine.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/target-i386/machine.c b/target-i386/machine.c
index 83ea6ac..c98f1c9 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -252,6 +252,24 @@ static void cpu_pre_save(void *opaque)
 }
 
 env->fpregs_format_vmstate = 0;
+
+/*
+ * Real mode guest segments register DPL should be zero.
+ * Older KVM version were setting it wrongly.
+ * Fixing it will allow live migration to host with unrestricted guest
+ * support (otherwise the migration will fail with invalid guest state
+ * error).
+ */
+if (!(env->cr[0] & CR0_PE_MASK) &&
+(env->segs[R_CS].flags >> DESC_DPL_SHIFT & 3) != 0) {
+env->segs[R_CS].flags &= ~(env->segs[R_CS].flags & DESC_DPL_MASK);
+env->segs[R_DS].flags &= ~(env->segs[R_DS].flags & DESC_DPL_MASK);
+env->segs[R_ES].flags &= ~(env->segs[R_ES].flags & DESC_DPL_MASK);
+env->segs[R_FS].flags &= ~(env->segs[R_FS].flags & DESC_DPL_MASK);
+env->segs[R_GS].flags &= ~(env->segs[R_GS].flags & DESC_DPL_MASK);
+env->segs[R_SS].flags &= ~(env->segs[R_SS].flags & DESC_DPL_MASK);
+}
+
 }
 
 static int cpu_post_load(void *opaque, int version_id)
-- 
1.8.1.4




Re: [Qemu-devel] [PATCH 1/2] Fix real mode guest migration

2013-07-22 Thread Orit Wasserman
On 07/22/2013 01:33 PM, Andreas Färber wrote:
> Am 22.07.2013 11:49, schrieb Paolo Bonzini:
>> Il 22/07/2013 08:49, Orit Wasserman ha scritto:
>>> Older KVM versions save CS dpl value to an invalid value for real mode 
>>> guests
>>> (0x3). This patch detect this situation when loading CPU state and set all 
>>> the
>>> segments dpl to zero.
>>> This will allow migration from older KVM on host without unrestricted guest
>>> to hosts with restricted guest support.
>>> For example migration from a Penryn host (with kernel 2.6.32) to
>>> a Westmere host.
>>>
>>> Signed-off-by: Orit Wasserman 
>>> ---
>>>  target-i386/machine.c | 18 ++
>>>  1 file changed, 18 insertions(+)
>>>
>>> diff --git a/target-i386/machine.c b/target-i386/machine.c
>>> index 3659db9..7e95829 100644
>>> --- a/target-i386/machine.c
>>> +++ b/target-i386/machine.c
>>> @@ -260,6 +260,24 @@ static int cpu_post_load(void *opaque, int version_id)
>>>  CPUX86State *env = &cpu->env;
>>>  int i;
>>>  
>>> +/*
>>> +  Real mode guest segments register DPL should be zero.
>>> +  Older KVM version were setting it worngly.
>>> +  Fixing it will allow live migration from such host that don't have
>>> +  restricted guest support to an host with unrestricted guest support
>>> +  (otherwise the migration will fail with invalid guest state
>>> +  error).
>>> +*/
>>
>> Coding standard asks for *s on every line.
>>
>> As discussed offlist, I would prefer to have this in the kernel since
>> that's where the bug is.  Gleb disagrees.
>>
>> We need to find a third person who mediates...  Anthony, Eduardo, what
>> do you think?
> 
> Having the code here does not look wrong to me, to enforce a consistent
> state inside QEMU.
> 
> However I wonder what happens without this patch on Westmere? Might it
> make sense to sanitize or at least "assert" (whatever the kernel
> equivalent is ;)) in the ioctl setting X86CPU state to the vCPU that the
> incoming values will be valid for the host CPU? And optionally in QEMU's
> KVM code for the reverse direction, cpu_synchronize_state(), to cope
> with older kernels?
> 

Without the patch we get "kvm: unhandled exit 8021" error in incoming
migration or loadvm. This is a KVM error (kernel) which translates to invalid
guest state.This happens only in migration of a real mode guest.

The problem in fixing the values in cpu_synchronize_state is that the function
is called in many places in the code. 
As real mode code is very complex (Gleb can attest to that) I prefer a fix that
has a very limited scope like fixing it in the cpu_post_load and cpu_pre_save
function that are only used in savevm and live migration.

Orit

> Regards,
> Andreas
> 
>>> +if (!(env->cr[0] & CR0_PE_MASK) &&
>>> + (env->segs[R_CS].flags >> DESC_DPL_SHIFT & 3) != 0) {
>>> +env->segs[R_CS].flags &= ~(env->segs[R_CS].flags & DESC_DPL_MASK);
>>> +env->segs[R_DS].flags &= ~(env->segs[R_DS].flags & DESC_DPL_MASK);
>>> +env->segs[R_ES].flags &= ~(env->segs[R_ES].flags & DESC_DPL_MASK);
>>> +env->segs[R_FS].flags &= ~(env->segs[R_FS].flags & DESC_DPL_MASK);
>>> +env->segs[R_GS].flags &= ~(env->segs[R_GS].flags & DESC_DPL_MASK);
>>> +env->segs[R_SS].flags &= ~(env->segs[R_SS].flags & DESC_DPL_MASK);
>>> +}
>>> +
>>>  /* XXX: restore FPU round state */
>>>  env->fpstt = (env->fpus_vmstate >> 11) & 7;
>>>  env->fpus = env->fpus_vmstate & ~0x3800;
> 




Re: [Qemu-devel] [PATCH 1/2] Fix real mode guest migration

2013-07-22 Thread Orit Wasserman
On 07/22/2013 12:49 PM, Paolo Bonzini wrote:
> Il 22/07/2013 08:49, Orit Wasserman ha scritto:
>> Older KVM versions save CS dpl value to an invalid value for real mode guests
>> (0x3). This patch detect this situation when loading CPU state and set all 
>> the
>> segments dpl to zero.
>> This will allow migration from older KVM on host without unrestricted guest
>> to hosts with restricted guest support.
>> For example migration from a Penryn host (with kernel 2.6.32) to
>> a Westmere host.
>>
>> Signed-off-by: Orit Wasserman 
>> ---
>>  target-i386/machine.c | 18 ++
>>  1 file changed, 18 insertions(+)
>>
>> diff --git a/target-i386/machine.c b/target-i386/machine.c
>> index 3659db9..7e95829 100644
>> --- a/target-i386/machine.c
>> +++ b/target-i386/machine.c
>> @@ -260,6 +260,24 @@ static int cpu_post_load(void *opaque, int version_id)
>>  CPUX86State *env = &cpu->env;
>>  int i;
>>  
>> +/*
>> +  Real mode guest segments register DPL should be zero.
>> +  Older KVM version were setting it worngly.
>> +  Fixing it will allow live migration from such host that don't have
>> +  restricted guest support to an host with unrestricted guest support
>> +  (otherwise the migration will fail with invalid guest state
>> +  error).
>> +*/
> 
> Coding standard asks for *s on every line.
I will fix it in v2.
By the way checkpatch.pl didn't send a warning about this it will be nice to 
add.

> 
> As discussed offlist, I would prefer to have this in the kernel since
> that's where the bug is.  Gleb disagrees.
> 

As this is a migration bug I prefer to fix it in the migration code only.
Fixing it in the kernel will effect all scenarios that load the segments,
in userspace it is only in load VM code.

Orit
> We need to find a third person who mediates...  Anthony, Eduardo, what
> do you think?
> 
> Paolo
> 
>> +if (!(env->cr[0] & CR0_PE_MASK) &&
>> + (env->segs[R_CS].flags >> DESC_DPL_SHIFT & 3) != 0) {
>> +env->segs[R_CS].flags &= ~(env->segs[R_CS].flags & DESC_DPL_MASK);
>> +env->segs[R_DS].flags &= ~(env->segs[R_DS].flags & DESC_DPL_MASK);
>> +env->segs[R_ES].flags &= ~(env->segs[R_ES].flags & DESC_DPL_MASK);
>> +env->segs[R_FS].flags &= ~(env->segs[R_FS].flags & DESC_DPL_MASK);
>> +env->segs[R_GS].flags &= ~(env->segs[R_GS].flags & DESC_DPL_MASK);
>> +env->segs[R_SS].flags &= ~(env->segs[R_SS].flags & DESC_DPL_MASK);
>> +}
>> +
>>  /* XXX: restore FPU round state */
>>  env->fpstt = (env->fpus_vmstate >> 11) & 7;
>>  env->fpus = env->fpus_vmstate & ~0x3800;
>>
> 




  1   2   3   4   5   6   7   8   >