Re: [Qemu-block] [PATCH v3 1/1] migration: disallow migrate_add_blocker during migration

2016-12-09 Thread John Snow


On 12/09/2016 12:39 PM, Dr. David Alan Gilbert wrote:
> * John Snow (js...@redhat.com) wrote:
>> If a migration is already in progress and somebody attempts
>> to add a migration blocker, this should rightly fail.
>>
>> Add an errp parameter and a retcode return value to migrate_add_blocker.
>>
>> Signed-off-by: John Snow 
>> ---
>>  block/qcow.c  |  6 +-
>>  block/vdi.c   |  6 +-
>>  block/vhdx.c  | 24 ++--
>>  block/vmdk.c  |  7 ++-
>>  block/vpc.c   | 10 +++---
>>  block/vvfat.c | 20 
>>  hw/9pfs/9p.c  | 16 
>>  hw/display/virtio-gpu.c   | 29 -
>>  hw/misc/ivshmem.c | 11 +++
>>  hw/scsi/vhost-scsi.c  | 25 +++--
>>  hw/virtio/vhost.c |  8 +++-
>>  include/migration/migration.h |  6 +-
>>  migration/migration.c | 35 +--
>>  stubs/migr-blocker.c  |  3 ++-
>>  target-i386/kvm.c | 16 +---
>>  15 files changed, 163 insertions(+), 59 deletions(-)
>>
>> diff --git a/block/qcow.c b/block/qcow.c
>> index 7540f43..11526a1 100644
>> --- a/block/qcow.c
>> +++ b/block/qcow.c
>> @@ -252,7 +252,11 @@ static int qcow_open(BlockDriverState *bs, QDict 
>> *options, int flags,
>>  error_setg(&s->migration_blocker, "The qcow format used by node '%s' "
>> "does not support live migration",
>> bdrv_get_device_or_node_name(bs));
>> -migrate_add_blocker(s->migration_blocker);
>> +ret = migrate_add_blocker(s->migration_blocker, errp);
>> +if (ret < 0) {
>> +error_free(s->migration_blocker);
>> +goto fail;
>> +}
>>  
>>  qemu_co_mutex_init(&s->lock);
>>  return 0;
>> diff --git a/block/vdi.c b/block/vdi.c
>> index 96b78d5..2b56f52 100644
>> --- a/block/vdi.c
>> +++ b/block/vdi.c
>> @@ -471,7 +471,11 @@ static int vdi_open(BlockDriverState *bs, QDict 
>> *options, int flags,
>>  error_setg(&s->migration_blocker, "The vdi format used by node '%s' "
>> "does not support live migration",
>> bdrv_get_device_or_node_name(bs));
>> -migrate_add_blocker(s->migration_blocker);
>> +ret = migrate_add_blocker(s->migration_blocker, errp);
>> +if (ret < 0) {
>> +error_free(s->migration_blocker);
>> +goto fail_free_bmap;
>> +}
>>  
>>  qemu_co_mutex_init(&s->write_lock);
>>  
>> diff --git a/block/vhdx.c b/block/vhdx.c
>> index 0ba2f0a..d81941b 100644
>> --- a/block/vhdx.c
>> +++ b/block/vhdx.c
>> @@ -991,20 +991,24 @@ static int vhdx_open(BlockDriverState *bs, QDict 
>> *options, int flags,
>>  }
>>  }
>>  
>> -if (flags & BDRV_O_RDWR) {
>> -ret = vhdx_update_headers(bs, s, false, NULL);
>> -if (ret < 0) {
>> -goto fail;
>> -}
>> -}
>> -
>> -/* TODO: differencing files */
>> -
>>  /* Disable migration when VHDX images are used */
>>  error_setg(&s->migration_blocker, "The vhdx format used by node '%s' "
>> "does not support live migration",
>> bdrv_get_device_or_node_name(bs));
>> -migrate_add_blocker(s->migration_blocker);
>> +ret = migrate_add_blocker(s->migration_blocker, errp);
>> +if (ret < 0) {
>> +error_free(s->migration_blocker);
>> +goto fail;
>> +}
>> +
>> +if (flags & BDRV_O_RDWR) {
>> +ret = vhdx_update_headers(bs, s, false, NULL);
>> +if (ret < 0) {
>> +goto fail;
>> +}
>> +}
>> +
>> +/* TODO: differencing files */
>>  
>>  return 0;
>>  fail:
>> diff --git a/block/vmdk.c b/block/vmdk.c
>> index a11c27a..4a45a83 100644
>> --- a/block/vmdk.c
>> +++ b/block/vmdk.c
>> @@ -976,7 +976,12 @@ static int vmdk_open(BlockDriverState *bs, QDict 
>> *options, int flags,
>>  error_setg(&s->migration_blocker, "The vmdk format used by node '%s' "
>> "does not support live migration",
>> bdrv_get_device_or_node_name(bs));
>> -migrate_add_blocker(s->migration_blocker);
>> +ret = migrate_add_blocker(s->migration_blocker, errp);
>> +if (ret < 0) {
>> +error_free(s->migration_blocker);
>> +goto fail;
>> +}
>> +
>>  g_free(buf);
>>  return 0;
>>  
>> diff --git a/block/vpc.c b/block/vpc.c
>> index 8d5886f..299a8c8 100644
>> --- a/block/vpc.c
>> +++ b/block/vpc.c
>> @@ -422,13 +422,17 @@ static int vpc_open(BlockDriverState *bs, QDict 
>> *options, int flags,
>>  #endif
>>  }
>>  
>> -qemu_co_mutex_init(&s->lock);
>> -
>>  /* Disable migration when VHD images are used */
>>  error_setg(&s->migration_blocker, "The vpc format used by node '%s' "
>> "does not support live migration",
>> bdrv_get_device_or_node_name(bs));
>> -

Re: [Qemu-block] [PATCH 15/21] qcow2: add .bdrv_can_store_dirty_bitmap

2016-12-09 Thread Vladimir Sementsov-Ogievskiy

09.12.2016 20:28, Max Reitz wrote:

On 22.11.2016 18:26, Vladimir Sementsov-Ogievskiy wrote:

Realize .bdrv_can_store_dirty_bitmap interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/qcow2-bitmap.c | 40 
  block/qcow2.c|  1 +
  block/qcow2.h|  4 
  3 files changed, 45 insertions(+)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index a975388..55b1112 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -1115,3 +1115,43 @@ fail:
  
  bitmap_list_free(bm_list);

  }
+
+bool qcow2_can_store_dirty_bitmap(BlockDriverState *bs,
+  const char *name,
+  uint32_t granularity,
+  Error **errp)
+{
+BDRVQcow2State *s = bs->opaque;
+const char *reason = NULL;
+bool found;
+Qcow2BitmapList *bm_list;
+
+if (check_constraints_on_bitmap(bs, name, granularity) != 0) {
+reason = "it doesn't satisfy the constraints";
+goto common_errp;
+}
+
+if (s->nb_bitmaps == 0) {
+return true;
+}
+
+bm_list = bitmap_list_load(bs, s->bitmap_directory_offset,
+   s->bitmap_directory_size, errp);

Maybe it would make sense to keep the bitmap list in memory...


+if (bm_list == NULL) {
+return false;
+}
+
+found = !!find_bitmap_by_name(bm_list, name);

You can omit the !!, storing it in a bool will do that for you.


+bitmap_list_free(bm_list);
+if (found) {
+reason = "bitmap with the same name is already stored";
+goto common_errp;
+}
+
+return true;
+
+common_errp:
+error_setg(errp, "Can't make bitmap '%s' persistent in '%s', as %s.",
+   name, bdrv_get_device_or_node_name(bs), reason);

Hm, so this function isn't for checking whether a bitmap can be stored,
but whether it can be made persistent. That's a difference; you can
store a bitmap of the same name in the file if you want to overwrite it
(i.e. it's an auto-load bitmap), but you cannot create a new bitmap with
the same name.

Maybe rename it from bdrv_can_store_dirty_bitmap to
bdrv_can_store_new_dirty_bitmap?


Yes, I'll rename it.



Max


+return false;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 66c7f74..cb9c2a2 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3516,6 +3516,7 @@ BlockDriver bdrv_qcow2 = {
  
  .bdrv_load_autoloading_dirty_bitmaps = qcow2_load_autoloading_dirty_bitmaps,

  .bdrv_store_persistent_dirty_bitmaps = 
qcow2_store_persistent_dirty_bitmaps,
+.bdrv_can_store_dirty_bitmap = qcow2_can_store_dirty_bitmap,
  };
  
  static void bdrv_qcow2_init(void)

diff --git a/block/qcow2.h b/block/qcow2.h
index d9a7643..e7a44a1 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -616,5 +616,9 @@ void qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, 
void **table);
  /* qcow2-bitmap.c functions */
  void qcow2_load_autoloading_dirty_bitmaps(BlockDriverState *bs, Error **errp);
  void qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs, Error **errp);
+bool qcow2_can_store_dirty_bitmap(BlockDriverState *bs,
+  const char *name,
+  uint32_t granularity,
+  Error **errp);
  
  #endif







--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH 13/21] qcow2: add .bdrv_store_persistent_dirty_bitmaps()

2016-12-09 Thread Vladimir Sementsov-Ogievskiy

09.12.2016 20:05, Max Reitz wrote:

On 22.11.2016 18:26, Vladimir Sementsov-Ogievskiy wrote:

Realize block bitmap storing interface, to allow qcow2 images store
persistent bitmaps.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/qcow2-bitmap.c | 451 +++
  block/qcow2.c|   1 +
  block/qcow2.h|   1 +
  3 files changed, 453 insertions(+)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 81be1ca..a975388 100644
--- a/block/qcow2-bitmap.c


[...]


+return;
+}
+}
+
+/* check constraints and names */
+for (bitmap = bdrv_dirty_bitmap_next(bs, NULL); bitmap != NULL;
+bitmap = bdrv_dirty_bitmap_next(bs, bitmap)) {

Alignment to the opening parenthesis, please.


Hmm... without the alignment it is not so simple to distinguish the for-loop
header from its body.



[...]


[1] What about bitmaps that have BME_FLAG_IN_USE set but do not have a
corresponding BDS bitmap?

If such a bitmap does not have BME_FLAG_AUTO set, we didn't set the
flag, so we should keep it unchanged. That's what this function is
currently doing.

However, if such a bitmap does have BME_FLAG_AUTO set, it was definitely
us who set the IN_USE flag (because otherwise we would have aborted
loading the bitmaps, and thus also aborted bdrv_open_common()).
Therefore, the only explanation is that the bitmap was deleted in the
meantime, and that means we should also delete it in the qcow2 file.


Right. Or, alternatively, these bitmaps may be deleted on corresponding 
BdrvDirtyBitmap deletion.
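(Purely as an aside, not part of the patch: the rule being discussed above can be sketched as a small standalone model. The names `BitmapAction` and `decide_stale_bitmap` are hypothetical, invented here only to make the IN_USE/AUTO reasoning concrete.)

```c
#include <assert.h>
#include <stdbool.h>

#define BME_FLAG_IN_USE 1
#define BME_FLAG_AUTO   2

typedef enum { KEEP_UNCHANGED, DELETE_FROM_FILE } BitmapAction;

/* Decision for a stored bitmap whose IN_USE flag is set but which has
 * no corresponding in-memory BDS bitmap, following the reasoning in
 * the discussion: without AUTO we did not set IN_USE ourselves, so the
 * entry is kept unchanged; with AUTO, it was definitely us who set
 * IN_USE, so the bitmap must have been deleted in the meantime and
 * should be dropped from the qcow2 file as well. */
static BitmapAction decide_stale_bitmap(unsigned flags, bool has_bds_bitmap)
{
    if (!(flags & BME_FLAG_IN_USE) || has_bds_bitmap) {
        return KEEP_UNCHANGED; /* not the case under discussion */
    }
    if (flags & BME_FLAG_AUTO) {
        return DELETE_FROM_FILE;
    }
    return KEEP_UNCHANGED;
}
```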



--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH 17/21] qmp: add autoload parameter to block-dirty-bitmap-add

2016-12-09 Thread Max Reitz
On 22.11.2016 18:26, Vladimir Sementsov-Ogievskiy wrote:
> Optional. Default is false.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Signed-off-by: Denis V. Lunev 
> ---
>  blockdev.c| 18 --
>  docs/qmp-commands.txt |  4 
>  qapi/block-core.json  |  6 +-
>  3 files changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 3876d1d..3891d86 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1968,6 +1968,7 @@ static void 
> block_dirty_bitmap_add_prepare(BlkActionState *common,
>  qmp_block_dirty_bitmap_add(action->node, action->name,
> action->has_granularity, action->granularity,
> action->has_persistent, action->persistent,
> +   action->has_autoload, action->autoload,
> _err);
>  
>  if (!local_err) {
> @@ -2698,6 +2699,7 @@ out:
>  void qmp_block_dirty_bitmap_add(const char *node, const char *name,
>  bool has_granularity, uint32_t granularity,
>  bool has_persistent, bool persistent,
> +bool has_autoload, bool autoload,
>  Error **errp)
>  {
>  AioContext *aio_context;
> @@ -2731,6 +2733,15 @@ void qmp_block_dirty_bitmap_add(const char *node, 
> const char *name,
>  if (!has_persistent) {
>  persistent = false;
>  }
> +if (!has_autoload) {
> +autoload = false;
> +}
> +
> +if (has_autoload && !persistent) {
> +error_setg(errp, "Autoload flag must be used only for persistent "
> + "bitmaps");
> +goto out;
> +}
>  
>  if (persistent &&
>  !bdrv_can_store_dirty_bitmap(bs, name, granularity, errp)) {
> @@ -2738,10 +2749,13 @@ void qmp_block_dirty_bitmap_add(const char *node, 
> const char *name,
>  }
>  
>  bitmap = bdrv_create_dirty_bitmap(bs, granularity, name, errp);
> -if (bitmap != NULL) {
> -bdrv_dirty_bitmap_set_persistance(bitmap, persistent);
> +if (bitmap == NULL) {
> +goto out;
>  }
>  
> +bdrv_dirty_bitmap_set_persistance(bitmap, persistent);
> +bdrv_dirty_bitmap_set_autoload(bitmap, autoload);
> +
>   out:
>  aio_context_release(aio_context);
>  }
> diff --git a/docs/qmp-commands.txt b/docs/qmp-commands.txt
> index c4ad1e4..dda2911 100644
> --- a/docs/qmp-commands.txt
> +++ b/docs/qmp-commands.txt
> @@ -1018,6 +1018,10 @@ Arguments:
>  - "persistent": bitmap will be saved to the corresponding block device image
>  file on its close. For now only Qcow2 disks support 
> persistent
>  bitmaps. (json-bool, optional, default false) (Since 2.8)
> +- "autoload": the bitmap will be automatically loaded when the image it is
> +  stored in is opened. This flag may only be specified for
> +  persistent bitmaps (json-bool, optional, default false)
> +  (Since 2.8)

*2.9

>  
>  Example:
>  
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index cec312c..648f94a 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1235,11 +1235,15 @@
>  #  corresponding block device image file on its close. Default is
>  #  false. (Since 2.8)
>  #
> +# @autoload: #optional the bitmap will be automatically loaded when the image
> +#it is stored in is opened. This flag may only be specified for
> +#persistent bitmaps. Default is false. (Since 2.8)

*2.9

With both fixed:

Reviewed-by: Max Reitz 

> +#
>  # Since 2.4
>  ##
>  { 'struct': 'BlockDirtyBitmapAdd',
>'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32',
> -'*persistent': 'bool' } }
> +'*persistent': 'bool', '*autoload': 'bool' } }
>  
>  ##
>  # @block-dirty-bitmap-add
> 






Re: [Qemu-block] [Qemu-devel] [PATCH v3 0/1] migration: disallow migrate_add_blocker during migration

2016-12-09 Thread no-reply
Hi,

Your series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH v3 0/1] migration: disallow migrate_add_blocker 
during migration
Type: series
Message-id: 20161209172547.31550-1-js...@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
bebd3d0 migration: disallow migrate_add_blocker during migration

=== OUTPUT BEGIN ===
Checking PATCH 1/1: migration: disallow migrate_add_blocker during migration...
ERROR: do not use C99 // comments
#158: FILE: block/vvfat.c:1205:
+//assert(is_consistent(s));

total: 1 errors, 0 warnings, 394 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-block] [PATCH v3 1/1] migration: disallow migrate_add_blocker during migration

2016-12-09 Thread Dr. David Alan Gilbert
* John Snow (js...@redhat.com) wrote:
> If a migration is already in progress and somebody attempts
> to add a migration blocker, this should rightly fail.
> 
> Add an errp parameter and a retcode return value to migrate_add_blocker.
> 
> Signed-off-by: John Snow 
> ---
>  block/qcow.c  |  6 +-
>  block/vdi.c   |  6 +-
>  block/vhdx.c  | 24 ++--
>  block/vmdk.c  |  7 ++-
>  block/vpc.c   | 10 +++---
>  block/vvfat.c | 20 
>  hw/9pfs/9p.c  | 16 
>  hw/display/virtio-gpu.c   | 29 -
>  hw/misc/ivshmem.c | 11 +++
>  hw/scsi/vhost-scsi.c  | 25 +++--
>  hw/virtio/vhost.c |  8 +++-
>  include/migration/migration.h |  6 +-
>  migration/migration.c | 35 +--
>  stubs/migr-blocker.c  |  3 ++-
>  target-i386/kvm.c | 16 +---
>  15 files changed, 163 insertions(+), 59 deletions(-)
> 
> diff --git a/block/qcow.c b/block/qcow.c
> index 7540f43..11526a1 100644
> --- a/block/qcow.c
> +++ b/block/qcow.c
> @@ -252,7 +252,11 @@ static int qcow_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  error_setg(&s->migration_blocker, "The qcow format used by node '%s' "
> "does not support live migration",
> bdrv_get_device_or_node_name(bs));
> -migrate_add_blocker(s->migration_blocker);
> +ret = migrate_add_blocker(s->migration_blocker, errp);
> +if (ret < 0) {
> +error_free(s->migration_blocker);
> +goto fail;
> +}
>  
>  qemu_co_mutex_init(&s->lock);
>  return 0;
> diff --git a/block/vdi.c b/block/vdi.c
> index 96b78d5..2b56f52 100644
> --- a/block/vdi.c
> +++ b/block/vdi.c
> @@ -471,7 +471,11 @@ static int vdi_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  error_setg(&s->migration_blocker, "The vdi format used by node '%s' "
> "does not support live migration",
> bdrv_get_device_or_node_name(bs));
> -migrate_add_blocker(s->migration_blocker);
> +ret = migrate_add_blocker(s->migration_blocker, errp);
> +if (ret < 0) {
> +error_free(s->migration_blocker);
> +goto fail_free_bmap;
> +}
>  
>  qemu_co_mutex_init(&s->write_lock);
>  
> diff --git a/block/vhdx.c b/block/vhdx.c
> index 0ba2f0a..d81941b 100644
> --- a/block/vhdx.c
> +++ b/block/vhdx.c
> @@ -991,20 +991,24 @@ static int vhdx_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  }
>  }
>  
> -if (flags & BDRV_O_RDWR) {
> -ret = vhdx_update_headers(bs, s, false, NULL);
> -if (ret < 0) {
> -goto fail;
> -}
> -}
> -
> -/* TODO: differencing files */
> -
>  /* Disable migration when VHDX images are used */
>  error_setg(&s->migration_blocker, "The vhdx format used by node '%s' "
> "does not support live migration",
> bdrv_get_device_or_node_name(bs));
> -migrate_add_blocker(s->migration_blocker);
> +ret = migrate_add_blocker(s->migration_blocker, errp);
> +if (ret < 0) {
> +error_free(s->migration_blocker);
> +goto fail;
> +}
> +
> +if (flags & BDRV_O_RDWR) {
> +ret = vhdx_update_headers(bs, s, false, NULL);
> +if (ret < 0) {
> +goto fail;
> +}
> +}
> +
> +/* TODO: differencing files */
>  
>  return 0;
>  fail:
> diff --git a/block/vmdk.c b/block/vmdk.c
> index a11c27a..4a45a83 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -976,7 +976,12 @@ static int vmdk_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  error_setg(&s->migration_blocker, "The vmdk format used by node '%s' "
> "does not support live migration",
> bdrv_get_device_or_node_name(bs));
> -migrate_add_blocker(s->migration_blocker);
> +ret = migrate_add_blocker(s->migration_blocker, errp);
> +if (ret < 0) {
> +error_free(s->migration_blocker);
> +goto fail;
> +}
> +
>  g_free(buf);
>  return 0;
>  
> diff --git a/block/vpc.c b/block/vpc.c
> index 8d5886f..299a8c8 100644
> --- a/block/vpc.c
> +++ b/block/vpc.c
> @@ -422,13 +422,17 @@ static int vpc_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  #endif
>  }
>  
> -qemu_co_mutex_init(&s->lock);
> -
>  /* Disable migration when VHD images are used */
>  error_setg(&s->migration_blocker, "The vpc format used by node '%s' "
> "does not support live migration",
> bdrv_get_device_or_node_name(bs));
> -migrate_add_blocker(s->migration_blocker);
> +ret = migrate_add_blocker(s->migration_blocker, errp);
> +if (ret < 0) {
> +error_free(s->migration_blocker);
> +goto fail;
> +}
> +
> +

Re: [Qemu-block] [PATCH 16/21] qmp: add persistent flag to block-dirty-bitmap-add

2016-12-09 Thread Max Reitz
On 22.11.2016 18:26, Vladimir Sementsov-Ogievskiy wrote:
> Add optional 'persistent' flag to qmp command block-dirty-bitmap-add.
> Default is false.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Signed-off-by: Denis V. Lunev 
> ---
>  blockdev.c| 17 -
>  docs/qmp-commands.txt |  3 +++
>  qapi/block-core.json  |  7 ++-
>  3 files changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 245e1e1..3876d1d 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1967,6 +1967,7 @@ static void 
> block_dirty_bitmap_add_prepare(BlkActionState *common,
>  /* AIO context taken and released within qmp_block_dirty_bitmap_add */
>  qmp_block_dirty_bitmap_add(action->node, action->name,
> action->has_granularity, action->granularity,
> +   action->has_persistent, action->persistent,
> _err);
>  
>  if (!local_err) {
> @@ -2696,10 +2697,12 @@ out:
>  
>  void qmp_block_dirty_bitmap_add(const char *node, const char *name,
>  bool has_granularity, uint32_t granularity,
> +bool has_persistent, bool persistent,
>  Error **errp)
>  {
>  AioContext *aio_context;
>  BlockDriverState *bs;
> +BdrvDirtyBitmap *bitmap;
>  
>  if (!name || name[0] == '\0') {
>  error_setg(errp, "Bitmap name cannot be empty");
> @@ -2725,7 +2728,19 @@ void qmp_block_dirty_bitmap_add(const char *node, 
> const char *name,
>  granularity = bdrv_get_default_bitmap_granularity(bs);
>  }
>  
> -bdrv_create_dirty_bitmap(bs, granularity, name, errp);
> +if (!has_persistent) {
> +persistent = false;
> +}
> +
> +if (persistent &&
> +!bdrv_can_store_dirty_bitmap(bs, name, granularity, errp)) {

Alignment to the opening parenthesis, please.

(Also, this may need a change depending on whether you want to rename it
to bdrv_can_store_new_dirty_bitmap().)

With that and the s/2\.8/2.9/ below fixed:

Reviewed-by: Max Reitz 

> +goto out;
> +}
> +
> +bitmap = bdrv_create_dirty_bitmap(bs, granularity, name, errp);
> +if (bitmap != NULL) {
> +bdrv_dirty_bitmap_set_persistance(bitmap, persistent);
> +}
>  
>   out:
>  aio_context_release(aio_context);
> diff --git a/docs/qmp-commands.txt b/docs/qmp-commands.txt
> index abf210a..c4ad1e4 100644
> --- a/docs/qmp-commands.txt
> +++ b/docs/qmp-commands.txt
> @@ -1015,6 +1015,9 @@ Arguments:
>  - "node": device/node on which to create dirty bitmap (json-string)
>  - "name": name of the new dirty bitmap (json-string)
>  - "granularity": granularity to track writes with (int, optional)
> +- "persistent": bitmap will be saved to the corresponding block device image
> +file on its close. For now only Qcow2 disks support 
> persistent
> +bitmaps. (json-bool, optional, default false) (Since 2.8)
>  
>  Example:
>  
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index c29bef7..cec312c 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1231,10 +1231,15 @@
>  # @granularity: #optional the bitmap granularity, default is 64k for
>  #   block-dirty-bitmap-add
>  #
> +# @persistent: #optional the bitmap is persistent, i.e. it will be saved to 
> the
> +#  corresponding block device image file on its close. Default is
> +#  false. (Since 2.8)
> +#
>  # Since 2.4
>  ##
>  { 'struct': 'BlockDirtyBitmapAdd',
> -  'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32' } }
> +  'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32',
> +'*persistent': 'bool' } }
>  
>  ##
>  # @block-dirty-bitmap-add
> 






Re: [Qemu-block] [PATCH 15/21] qcow2: add .bdrv_can_store_dirty_bitmap

2016-12-09 Thread Max Reitz
On 22.11.2016 18:26, Vladimir Sementsov-Ogievskiy wrote:
> Realize .bdrv_can_store_dirty_bitmap interface.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/qcow2-bitmap.c | 40 
>  block/qcow2.c|  1 +
>  block/qcow2.h|  4 
>  3 files changed, 45 insertions(+)
> 
> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
> index a975388..55b1112 100644
> --- a/block/qcow2-bitmap.c
> +++ b/block/qcow2-bitmap.c
> @@ -1115,3 +1115,43 @@ fail:
>  
>  bitmap_list_free(bm_list);
>  }
> +
> +bool qcow2_can_store_dirty_bitmap(BlockDriverState *bs,
> +  const char *name,
> +  uint32_t granularity,
> +  Error **errp)
> +{
> +BDRVQcow2State *s = bs->opaque;
> +const char *reason = NULL;
> +bool found;
> +Qcow2BitmapList *bm_list;
> +
> +if (check_constraints_on_bitmap(bs, name, granularity) != 0) {
> +reason = "it doesn't satisfy the constraints";
> +goto common_errp;
> +}
> +
> +if (s->nb_bitmaps == 0) {
> +return true;
> +}
> +
> +bm_list = bitmap_list_load(bs, s->bitmap_directory_offset,
> +   s->bitmap_directory_size, errp);

Maybe it would make sense to keep the bitmap list in memory...

> +if (bm_list == NULL) {
> +return false;
> +}
> +
> +found = !!find_bitmap_by_name(bm_list, name);

You can omit the !!, storing it in a bool will do that for you.
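(For illustration only, not part of the patch: in C99, storing any scalar into a `bool` already normalizes it to `true`/`false`, which is why the explicit `!!` is redundant. The helper name `is_found` below is made up for this sketch.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assigning a pointer to a bool performs the same normalization that
 * !!ptr would: any non-NULL pointer becomes true, NULL becomes false. */
static bool is_found(const void *ptr)
{
    bool found = ptr; /* equivalent to: bool found = !!ptr; */
    return found;
}
```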

> +bitmap_list_free(bm_list);
> +if (found) {
> +reason = "bitmap with the same name is already stored";
> +goto common_errp;
> +}
> +
> +return true;
> +
> +common_errp:
> +error_setg(errp, "Can't make bitmap '%s' persistent in '%s', as %s.",
> +   name, bdrv_get_device_or_node_name(bs), reason);

Hm, so this function isn't for checking whether a bitmap can be stored,
but whether it can be made persistent. That's a difference; you can
store a bitmap of the same name in the file if you want to overwrite it
(i.e. it's an auto-load bitmap), but you cannot create a new bitmap with
the same name.

Maybe rename it from bdrv_can_store_dirty_bitmap to
bdrv_can_store_new_dirty_bitmap?

Max

> +return false;
> +}
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 66c7f74..cb9c2a2 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -3516,6 +3516,7 @@ BlockDriver bdrv_qcow2 = {
>  
>  .bdrv_load_autoloading_dirty_bitmaps = 
> qcow2_load_autoloading_dirty_bitmaps,
>  .bdrv_store_persistent_dirty_bitmaps = 
> qcow2_store_persistent_dirty_bitmaps,
> +.bdrv_can_store_dirty_bitmap = qcow2_can_store_dirty_bitmap,
>  };
>  
>  static void bdrv_qcow2_init(void)
> diff --git a/block/qcow2.h b/block/qcow2.h
> index d9a7643..e7a44a1 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -616,5 +616,9 @@ void qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, 
> void **table);
>  /* qcow2-bitmap.c functions */
>  void qcow2_load_autoloading_dirty_bitmaps(BlockDriverState *bs, Error 
> **errp);
>  void qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs, Error 
> **errp);
> +bool qcow2_can_store_dirty_bitmap(BlockDriverState *bs,
> +  const char *name,
> +  uint32_t granularity,
> +  Error **errp);
>  
>  #endif
> 






[Qemu-block] [PATCH v3 0/1] migration: disallow migrate_add_blocker during migration

2016-12-09 Thread John Snow
As discussed on-list, this may be a necessary building block for
a run-time flag that prevents us from getting into any situations
where we configure a non-migratable QEMU.

I'd like extra attention to be paid to the virtio-gpu, vhost, and
ivshmem changes as I am not perfectly confident in those.

Sorry for the wide CC distributions.

V3: Rebased from a version I sent about a year ago.
Changed the migration function names a bit,
Changed new calls to migrate_add_blocker.

John Snow (1):
  migration: disallow migrate_add_blocker during migration

 block/qcow.c  |  6 +-
 block/vdi.c   |  6 +-
 block/vhdx.c  | 24 ++--
 block/vmdk.c  |  7 ++-
 block/vpc.c   | 10 +++---
 block/vvfat.c | 20 
 hw/9pfs/9p.c  | 16 
 hw/display/virtio-gpu.c   | 29 -
 hw/misc/ivshmem.c | 11 +++
 hw/scsi/vhost-scsi.c  | 25 +++--
 hw/virtio/vhost.c |  8 +++-
 include/migration/migration.h |  6 +-
 migration/migration.c | 35 +--
 stubs/migr-blocker.c  |  3 ++-
 target-i386/kvm.c | 16 +---
 15 files changed, 163 insertions(+), 59 deletions(-)

-- 
2.9.3
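(The contract this series establishes can be modeled with a small standalone sketch. This is hypothetical illustration code, not the actual migration/migration.c implementation; in particular, the real function also takes an `Error **errp` so it can report why the blocker was refused.)

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal model of the new behavior: once a migration is already in
 * progress, adding a blocker fails with a negative errno instead of
 * silently succeeding. */
static bool migration_in_progress;
static int num_blockers;

static int migrate_add_blocker(const char *reason)
{
    if (migration_in_progress) {
        /* the real code would also error_setg(errp, ...) here */
        (void)reason;
        return -EBUSY;
    }
    num_blockers++;
    return 0;
}
```

Callers such as the block drivers in this patch check the return value and, on failure, free the Error they built for the blocker and bail out of open().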




[Qemu-block] [PATCH v3 1/1] migration: disallow migrate_add_blocker during migration

2016-12-09 Thread John Snow
If a migration is already in progress and somebody attempts
to add a migration blocker, this should rightly fail.

Add an errp parameter and a retcode return value to migrate_add_blocker.

Signed-off-by: John Snow 
---
 block/qcow.c  |  6 +-
 block/vdi.c   |  6 +-
 block/vhdx.c  | 24 ++--
 block/vmdk.c  |  7 ++-
 block/vpc.c   | 10 +++---
 block/vvfat.c | 20 
 hw/9pfs/9p.c  | 16 
 hw/display/virtio-gpu.c   | 29 -
 hw/misc/ivshmem.c | 11 +++
 hw/scsi/vhost-scsi.c  | 25 +++--
 hw/virtio/vhost.c |  8 +++-
 include/migration/migration.h |  6 +-
 migration/migration.c | 35 +--
 stubs/migr-blocker.c  |  3 ++-
 target-i386/kvm.c | 16 +---
 15 files changed, 163 insertions(+), 59 deletions(-)

diff --git a/block/qcow.c b/block/qcow.c
index 7540f43..11526a1 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -252,7 +252,11 @@ static int qcow_open(BlockDriverState *bs, QDict *options, 
int flags,
error_setg(&s->migration_blocker, "The qcow format used by node '%s' "
"does not support live migration",
bdrv_get_device_or_node_name(bs));
-migrate_add_blocker(s->migration_blocker);
+ret = migrate_add_blocker(s->migration_blocker, errp);
+if (ret < 0) {
+error_free(s->migration_blocker);
+goto fail;
+}
 
qemu_co_mutex_init(&s->lock);
 return 0;
diff --git a/block/vdi.c b/block/vdi.c
index 96b78d5..2b56f52 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -471,7 +471,11 @@ static int vdi_open(BlockDriverState *bs, QDict *options, 
int flags,
error_setg(&s->migration_blocker, "The vdi format used by node '%s' "
"does not support live migration",
bdrv_get_device_or_node_name(bs));
-migrate_add_blocker(s->migration_blocker);
+ret = migrate_add_blocker(s->migration_blocker, errp);
+if (ret < 0) {
+error_free(s->migration_blocker);
+goto fail_free_bmap;
+}
 
qemu_co_mutex_init(&s->write_lock);
 
diff --git a/block/vhdx.c b/block/vhdx.c
index 0ba2f0a..d81941b 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -991,20 +991,24 @@ static int vhdx_open(BlockDriverState *bs, QDict 
*options, int flags,
 }
 }
 
-if (flags & BDRV_O_RDWR) {
-ret = vhdx_update_headers(bs, s, false, NULL);
-if (ret < 0) {
-goto fail;
-}
-}
-
-/* TODO: differencing files */
-
 /* Disable migration when VHDX images are used */
error_setg(&s->migration_blocker, "The vhdx format used by node '%s' "
"does not support live migration",
bdrv_get_device_or_node_name(bs));
-migrate_add_blocker(s->migration_blocker);
+ret = migrate_add_blocker(s->migration_blocker, errp);
+if (ret < 0) {
+error_free(s->migration_blocker);
+goto fail;
+}
+
+if (flags & BDRV_O_RDWR) {
+ret = vhdx_update_headers(bs, s, false, NULL);
+if (ret < 0) {
+goto fail;
+}
+}
+
+/* TODO: differencing files */
 
 return 0;
 fail:
diff --git a/block/vmdk.c b/block/vmdk.c
index a11c27a..4a45a83 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -976,7 +976,12 @@ static int vmdk_open(BlockDriverState *bs, QDict *options, 
int flags,
error_setg(&s->migration_blocker, "The vmdk format used by node '%s' "
"does not support live migration",
bdrv_get_device_or_node_name(bs));
-migrate_add_blocker(s->migration_blocker);
+ret = migrate_add_blocker(s->migration_blocker, errp);
+if (ret < 0) {
+error_free(s->migration_blocker);
+goto fail;
+}
+
 g_free(buf);
 return 0;
 
diff --git a/block/vpc.c b/block/vpc.c
index 8d5886f..299a8c8 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -422,13 +422,17 @@ static int vpc_open(BlockDriverState *bs, QDict *options, 
int flags,
 #endif
 }
 
-qemu_co_mutex_init(&s->lock);
-
 /* Disable migration when VHD images are used */
error_setg(&s->migration_blocker, "The vpc format used by node '%s' "
"does not support live migration",
bdrv_get_device_or_node_name(bs));
-migrate_add_blocker(s->migration_blocker);
+ret = migrate_add_blocker(s->migration_blocker, errp);
+if (ret < 0) {
+error_free(s->migration_blocker);
+goto fail;
+}
+
+qemu_co_mutex_init(&s->lock);
 
 return 0;
 
diff --git a/block/vvfat.c b/block/vvfat.c
index ded2109..3de3320 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1185,22 +1185,26 @@ static int vvfat_open(BlockDriverState *bs, QDict 
*options, int flags,
 
 s->sector_count = s->faked_sectors + 

Re: [Qemu-block] [PATCH 14/21] block: add bdrv_can_store_dirty_bitmap

2016-12-09 Thread Max Reitz
On 22.11.2016 18:26, Vladimir Sementsov-Ogievskiy wrote:
> This will be needed to check some restrictions before making bitmap
> persistent in qmp-block-dirty-bitmap-add (this functionality will be
> added by future patch)
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block.c   | 22 ++
>  include/block/block.h |  2 ++
>  include/block/block_int.h |  2 ++
>  3 files changed, 26 insertions(+)

Reviewed-by: Max Reitz 





Re: [Qemu-block] [PATCH 13/21] qcow2: add .bdrv_store_persistent_dirty_bitmaps()

2016-12-09 Thread Max Reitz
On 22.11.2016 18:26, Vladimir Sementsov-Ogievskiy wrote:
> Realize block bitmap storing interface, to allow qcow2 images store
> persistent bitmaps.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/qcow2-bitmap.c | 451 
> +++
>  block/qcow2.c|   1 +
>  block/qcow2.h|   1 +
>  3 files changed, 453 insertions(+)
> 
> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
> index 81be1ca..a975388 100644
> --- a/block/qcow2-bitmap.c
> +++ b/block/qcow2-bitmap.c
> @@ -28,6 +28,7 @@
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
>  #include "exec/log.h"
> +#include "qemu/cutils.h"
>  
>  #include "block/block_int.h"
>  #include "block/qcow2.h"
> @@ -43,6 +44,10 @@
>  #define BME_MIN_GRANULARITY_BITS 9
>  #define BME_MAX_NAME_SIZE 1023
>  
> +#if BME_MAX_TABLE_SIZE * 8ULL > INT_MAX
> +#error In the code bitmap table physical size assumed to fit into int
> +#endif
> +
>  /* Bitmap directory entry flags */
>  #define BME_RESERVED_FLAGS 0xfffcU
>  #define BME_FLAG_IN_USE 1
> @@ -74,6 +79,8 @@ typedef struct Qcow2Bitmap {
>  uint8_t granularity_bits;
>  char *name;
>  
> +BdrvDirtyBitmap *dirty_bitmap;

(I'm not quite happy with the asymmetry of this field (it isn't set by
load_bitmap(), but it is required by store_bitmap()), but making it
symmetric either by making load_bitmap() set it or by store_bitmap() not
reading it (but getting the value through an explicit parameter) makes
the code needlessly more complicated, so I guess I'll have to stay
not-quite-happy.)

> +
>  QSIMPLEQ_ENTRY(Qcow2Bitmap) entry;
>  } Qcow2Bitmap;
>  typedef QSIMPLEQ_HEAD(Qcow2BitmapList, Qcow2Bitmap) Qcow2BitmapList;
> @@ -87,6 +94,27 @@ static inline bool can_write(BlockDriverState *bs)
>  return !bdrv_is_read_only(bs) && !(bdrv_get_flags(bs) & BDRV_O_INACTIVE);
>  }
>  
> +static int update_header_sync(BlockDriverState *bs)
> +{
> +int ret;
> +
> +ret = qcow2_update_header(bs);
> +if (ret < 0) {
> +return ret;
> +}
> +
> +/* We doesn't return  bdrv_flush error code. Even if it fails, write was

s/doesn't/don't/

(Also, there's a double space after return.)

> + * successful and it is more logical to consider that header is in the 
> new
> + * state than in the old.
> + */
> +ret = bdrv_flush(bs);
> +if (ret < 0) {
> +fprintf(stderr, "Failed to flush qcow2 header");
> +}
> +
> +return 0;
> +}
> +
>  static inline void bitmap_table_to_cpu(uint64_t *bitmap_table, size_t size)
>  {
>  size_t i;
> @@ -96,6 +124,15 @@ static inline void bitmap_table_to_cpu(uint64_t *bitmap_table, size_t size)
>  }
>  }
>  
> +static inline void bitmap_table_to_be(uint64_t *bitmap_table, size_t size)
> +{
> +size_t i;
> +
> +for (i = 0; i < size; ++i) {
> +cpu_to_be64s(&bitmap_table[i]);
> +}
> +}
> +
>  /* Check table entry specification constraints. If cluster_size is 0, offset
>   * alignment is not checked. */
>  static int check_table_entry(uint64_t entry, int cluster_size)
> @@ -121,6 +158,51 @@ static int check_table_entry(uint64_t entry, int cluster_size)
>  return 0;
>  }
>  
> +static int check_constraints_on_bitmap(BlockDriverState *bs,
> +   const char *name,
> +   uint32_t granularity)
> +{
> +BDRVQcow2State *s = bs->opaque;
> +int granularity_bits = ctz32(granularity);
> +
> +int64_t nb_sectors = bdrv_nb_sectors(bs);
> +
> +if (nb_sectors < 0) {
> +return nb_sectors;
> +}
> +
> +uint64_t phys_bitmap_bytes = (nb_sectors << BDRV_SECTOR_BITS) >>

Using bdrv_getlength() would be simpler.

> + granularity_bits;
> +uint64_t bitmap_table_size = phys_bitmap_bytes / s->cluster_size;

Should be a DIV_ROUND_UP().

> +size_t name_size = strlen(name);
> +
> +int fail =

I'd personally like a bool more.

> +(bitmap_table_size > BME_MAX_TABLE_SIZE) ||
> +(phys_bitmap_bytes > BME_MAX_PHYS_SIZE) ||
> +(granularity_bits > BME_MAX_GRANULARITY_BITS) ||
> +(granularity_bits < BME_MIN_GRANULARITY_BITS) ||
> +(name_size > BME_MAX_NAME_SIZE);
> +
> +return fail ? -EINVAL : 0;
> +}
> +
> +static void clear_bitmap_table(BlockDriverState *bs, uint64_t *bitmap_table,
> +   uint32_t bitmap_table_size)
> +{
> +BDRVQcow2State *s = bs->opaque;
> +int i;
> +
> +for (i = 0; i < bitmap_table_size; ++i) {
> +uint64_t addr = bitmap_table[i] & BME_TABLE_ENTRY_OFFSET_MASK;
> +if (!addr) {
> +continue;
> +}
> +
> +qcow2_free_clusters(bs, addr, s->cluster_size, QCOW2_DISCARD_OTHER);
> +bitmap_table[i] = 0;
> +}
> +}
> +
>  static int bitmap_table_load(BlockDriverState *bs, Qcow2Bitmap *bm,
>   uint64_t **bitmap_table)
>  {
> @@ 

Re: [Qemu-block] [PATCH RFC 0/1] Allow storing the qcow2 L2 cache in disk

2016-12-09 Thread Alberto Garcia
On Fri 09 Dec 2016 03:18:23 PM CET, Kevin Wolf wrote:
>> I have been making some tests with exactly that scenario and the
>> results look good: storing the cache in disk gives roughly the same
>> performance as storing it in memory.
>> 
>> |---------------------+------------+------+------------+--------|
>> |                     | Random 4k reads   | Sequential 4k reads |
>> |                     | Throughput | IOPS | Throughput |  IOPS  |
>> |---------------------+------------+------+------------+--------|
>> | Cache in memory/SSD | 406 KB/s   |   99 | 84 MB/s    |  21000 |
>> | Default cache (1MB) | 200 KB/s   |   60 | 83 MB/s    |  21000 |
>> | No cache            | 200 KB/s   |   49 | 56 MB/s    |  14000 |
>> |---------------------+------------+------+------------+--------|
>> 
>> I'm including the patch that I used to get these results. This is the
>> simplest approach that I could think of.
>> 
>> Opinions, questions?
>
> I suppose you used the fact that the cache is now on disk to increase
> the cache size so that it covers the whole image?

Right, the wording on the table is not clear, but that's what I did. I
also don't think this makes much sense if the cache is not big enough to
cover the whole image.

> If so, are you sure that you aren't just testing that accessing memory
> in the kernel page cache is just as fast as accessing memory in qemu's
> own cache? It seems this would just bypass the cache size limit given
> to qemu by instead leaving things cached in the kernel where the limit
> doesn't apply.

Fair question: what I checked is that the PSS/RSS values match the
expected values (i.e., they don't grow as you read from the disk
image). Whether the kernel keeps those pages cached, so that accessing
them after MADV_DONTNEED does not require going to disk again, is a
possibility that I haven't ruled out.

Berto



Re: [Qemu-block] [PATCH RFC 0/1] Allow storing the qcow2 L2 cache in disk

2016-12-09 Thread Kevin Wolf
Am 09.12.2016 um 14:47 hat Alberto Garcia geschrieben:
> Hi all,
> 
> as we all know, one of the main things that can make the qcow2 format
> slow is the need to load entries from the L2 table in order to map a
> guest offset (on the virtual disk) to a host offset (on the qcow2
> image).
> 
> We have an L2 cache to deal with this, and as long as the cache is big
> enough then the performance is comparable to that of a raw image.
> 
> For large qcow2 images the amount of RAM we need in order to cache all
> L2 tables can be big (128MB per TB of disk image if we're using the
> default cluster size of 64KB). In order to solve this problem we have
> a setting that allows the user to clean unused cache entries after a
> certain interval of time. This works fine most of the time, although
> we can still have peaks of RAM usage if there's a lot of I/O going on
> in one or more VMs.
> 
> In some scenarios, however, there's a different alternative: if the
> qcow2 image is stored in a slow backend (e.g. HDD), we could save
> memory by putting the L2 cache in a faster one (SSD) instead of in
> RAM.
> 
> I have been making some tests with exactly that scenario and the
> results look good: storing the cache in disk gives roughly the same
> performance as storing it in memory.
> 
> |---------------------+------------+------+------------+--------|
> |                     | Random 4k reads   | Sequential 4k reads |
> |                     | Throughput | IOPS | Throughput |  IOPS  |
> |---------------------+------------+------+------------+--------|
> | Cache in memory/SSD | 406 KB/s   |   99 | 84 MB/s    |  21000 |
> | Default cache (1MB) | 200 KB/s   |   60 | 83 MB/s    |  21000 |
> | No cache            | 200 KB/s   |   49 | 56 MB/s    |  14000 |
> |---------------------+------------+------+------------+--------|
> 
> I'm including the patch that I used to get these results. This is the
> simplest approach that I could think of.
> 
> Opinions, questions?

I suppose you used the fact that the cache is now on disk to increase
the cache size so that it covers the whole image?

If so, are you sure that you aren't just testing that accessing memory
in the kernel page cache is just as fast as accessing memory in qemu's
own cache? It seems this would just bypass the cache size limit given to
qemu by instead leaving things cached in the kernel where the limit
doesn't apply.

Kevin



Re: [Qemu-block] [PATCH 02/36] qdict: Add convenience helpers for wrapped puts

2016-12-09 Thread Alberto Garcia
On Wed 30 Nov 2016 08:44:20 PM CET, Eric Blake wrote:
> Quite a few users of qdict_put() were manually wrapping a
> non-QObject. We can make such call-sites shorter, by providing
> common macros to do the tedious work.  Also shorten nearby
> qdict_put_obj(,,QOBJECT()) sequences.
>
> Signed-off-by: Eric Blake 

Thanks, the code looks much better now :)

Reviewed-by: Alberto Garcia 

Berto



[Qemu-block] [PATCH RFC 1/1] qcow2: Allow storing the qcow2 L2 cache in disk

2016-12-09 Thread Alberto Garcia
In scenarios where we have a large qcow2 file stored in a slow storage
backend, we can save memory by storing its L2 cache in a faster drive
rather than keeping it in RAM.

Signed-off-by: Alberto Garcia 
---
 block/qcow2-cache.c | 56 +
 block/qcow2.c   | 11 +--
 block/qcow2.h   |  3 ++-
 3 files changed, 59 insertions(+), 11 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 1d25147..8ba8ed9 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -23,6 +23,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/mmap-alloc.h"
 #include "block/block_int.h"
 #include "qemu-common.h"
 #include "qcow2.h"
@@ -36,6 +37,7 @@ typedef struct Qcow2CachedTable {
 } Qcow2CachedTable;
 
 struct Qcow2Cache {
+int fd;
 Qcow2CachedTable   *entries;
 struct Qcow2Cache  *depends;
 int size;
@@ -63,7 +65,7 @@ static inline int qcow2_cache_get_table_idx(BlockDriverState *bs,
 }
 
 static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
-  int i, int num_tables)
+  int i, int num_tables, bool flush)
 {
 /* Using MADV_DONTNEED to discard memory is a Linux-specific feature */
 #ifdef CONFIG_LINUX
@@ -74,6 +76,10 @@ static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
 size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
 size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);
 if (length > 0) {
+if (flush) {
+assert(c->fd > 0);
+msync((uint8_t *) t + offset, length, MS_SYNC);
+}
 madvise((uint8_t *) t + offset, length, MADV_DONTNEED);
 }
 #endif
@@ -106,26 +112,50 @@ void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c)
 }
 
 if (to_clean > 0) {
-qcow2_cache_table_release(bs, c, i - to_clean, to_clean);
+qcow2_cache_table_release(bs, c, i - to_clean, to_clean, false);
 }
 }
 
 c->cache_clean_lru_counter = c->lru_counter;
 }
 
-Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
+Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables,
+   const char *cache_file_path)
 {
 BDRVQcow2State *s = bs->opaque;
 Qcow2Cache *c;
+size_t total_size = (size_t) num_tables * s->cluster_size;
 
 c = g_new0(Qcow2Cache, 1);
 c->size = num_tables;
 c->entries = g_try_new0(Qcow2CachedTable, num_tables);
-c->table_array = qemu_try_blockalign(bs->file->bs,
- (size_t) num_tables * s->cluster_size);
+if (cache_file_path) {
+size_t align = MAX(bdrv_opt_mem_align(bs->file->bs), getpagesize());
+char *filename = g_strdup_printf("%s/qemu-qcow2-cache.XXXXXX",
+ cache_file_path);
+c->fd = mkstemp(filename);
+if (c->fd > 0) {
+unlink(filename);
+}
+g_free(filename);
+if (c->fd == -1 || ftruncate(c->fd, total_size)) {
+goto out;
+}
+c->table_array = qemu_ram_mmap(c->fd, total_size, align, true);
+} else {
+c->table_array = qemu_try_blockalign(bs->file->bs, total_size);
+}
 
+out:
 if (!c->entries || !c->table_array) {
-qemu_vfree(c->table_array);
+if (cache_file_path) {
+qemu_ram_munmap(c->table_array, total_size);
+if (c->fd > 0) {
+close(c->fd);
+}
+} else {
+qemu_vfree(c->table_array);
+}
 g_free(c->entries);
 g_free(c);
 c = NULL;
@@ -142,7 +172,14 @@ int qcow2_cache_destroy(BlockDriverState *bs, Qcow2Cache *c)
 assert(c->entries[i].ref == 0);
 }
 
-qemu_vfree(c->table_array);
+if (c->fd) {
+BDRVQcow2State *s = bs->opaque;
+size_t total_size = (size_t) c->size * s->cluster_size;
+qemu_ram_munmap(c->table_array, total_size);
+close(c->fd);
+} else {
+qemu_vfree(c->table_array);
+}
 g_free(c->entries);
 g_free(c);
 
@@ -297,7 +334,7 @@ int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c)
 c->entries[i].lru_counter = 0;
 }
 
-qcow2_cache_table_release(bs, c, 0, c->size);
+qcow2_cache_table_release(bs, c, 0, c->size, false);
 
 c->lru_counter = 0;
 
@@ -399,6 +436,9 @@ void qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table)
 
 if (c->entries[i].ref == 0) {
 c->entries[i].lru_counter = ++c->lru_counter;
+if (c->fd) {
+qcow2_cache_table_release(bs, c, i, 1, true);
+}
 }
 
 assert(c->entries[i].ref >= 0);
diff --git a/block/qcow2.c b/block/qcow2.c
index 96fb8a8..4d5d092 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -461,6 +461,11 @@ static QemuOptsList 

[Qemu-block] [PATCH RFC 0/1] Allow storing the qcow2 L2 cache in disk

2016-12-09 Thread Alberto Garcia
Hi all,

as we all know, one of the main things that can make the qcow2 format
slow is the need to load entries from the L2 table in order to map a
guest offset (on the virtual disk) to a host offset (on the qcow2
image).

We have an L2 cache to deal with this, and as long as the cache is big
enough then the performance is comparable to that of a raw image.

For large qcow2 images the amount of RAM we need in order to cache all
L2 tables can be big (128MB per TB of disk image if we're using the
default cluster size of 64KB). In order to solve this problem we have
a setting that allows the user to clean unused cache entries after a
certain interval of time. This works fine most of the time, although
we can still have peaks of RAM usage if there's a lot of I/O going on
in one or more VMs.

In some scenarios, however, there's a different alternative: if the
qcow2 image is stored in a slow backend (e.g. HDD), we could save
memory by putting the L2 cache in a faster one (SSD) instead of in
RAM.

I have been making some tests with exactly that scenario and the
results look good: storing the cache in disk gives roughly the same
performance as storing it in memory.

|---------------------+------------+------+------------+--------|
|                     | Random 4k reads   | Sequential 4k reads |
|                     | Throughput | IOPS | Throughput |  IOPS  |
|---------------------+------------+------+------------+--------|
| Cache in memory/SSD | 406 KB/s   |   99 | 84 MB/s    |  21000 |
| Default cache (1MB) | 200 KB/s   |   60 | 83 MB/s    |  21000 |
| No cache            | 200 KB/s   |   49 | 56 MB/s    |  14000 |
|---------------------+------------+------+------------+--------|

I'm including the patch that I used to get these results. This is the
simplest approach that I could think of.

Opinions, questions?

Thanks,

Berto

Alberto Garcia (1):
  qcow2: Allow storing the qcow2 L2 cache in disk

 block/qcow2-cache.c | 56 +
 block/qcow2.c   | 11 +--
 block/qcow2.h   |  3 ++-
 3 files changed, 59 insertions(+), 11 deletions(-)

-- 
2.10.2