Re: [Qemu-block] [PATCH v3 1/3] block: add bdrv_get_format_alloc_stat format interface

2017-06-28 Thread Vladimir Sementsov-Ogievskiy

29.06.2017 03:15, John Snow wrote:


On 06/28/2017 11:59 AM, Vladimir Sementsov-Ogievskiy wrote:

27.06.2017 02:19, John Snow wrote:

On 06/06/2017 12:26 PM, Vladimir Sementsov-Ogievskiy wrote:

The function should collect statistics, about used/unused by top-level
format driver space (in its .file) and allocation status
(data/zero/discarded/after-eof) of corresponding areas in this .file.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
   block.c   | 16 ++
   include/block/block.h |  3 +++
   include/block/block_int.h |  2 ++
   qapi/block-core.json  | 55
+++
   4 files changed, 76 insertions(+)

diff --git a/block.c b/block.c
index 50ba264143..7d720ae0c2 100644
--- a/block.c
+++ b/block.c
@@ -3407,6 +3407,22 @@ int64_t
bdrv_get_allocated_file_size(BlockDriverState *bs)
   }
 /**
+ * Collect format allocation info. See BlockFormatAllocInfo
definition in
+ * qapi/block-core.json.
+ */
+int bdrv_get_format_alloc_stat(BlockDriverState *bs,
BlockFormatAllocInfo *bfai)
+{
+BlockDriver *drv = bs->drv;
+if (!drv) {
+return -ENOMEDIUM;
+}
+if (drv->bdrv_get_format_alloc_stat) {
+return drv->bdrv_get_format_alloc_stat(bs, bfai);
+}
+return -ENOTSUP;
+}
+
+/**
* Return number of sectors on success, -errno on error.
*/
   int64_t bdrv_nb_sectors(BlockDriverState *bs)
diff --git a/include/block/block.h b/include/block/block.h
index 9b355e92d8..646376a772 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -335,6 +335,9 @@ typedef enum {
 int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix);
   +int bdrv_get_format_alloc_stat(BlockDriverState *bs,
+   BlockFormatAllocInfo *bfai);
+
   /* The units of offset and total_work_size may be chosen
arbitrarily by the
* block driver; total_work_size may change during the course of
the amendment
* operation */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8d3724cce6..458c715e99 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -208,6 +208,8 @@ struct BlockDriver {
   int64_t (*bdrv_getlength)(BlockDriverState *bs);
   bool has_variable_length;
   int64_t (*bdrv_get_allocated_file_size)(BlockDriverState *bs);
+int (*bdrv_get_format_alloc_stat)(BlockDriverState *bs,
+  BlockFormatAllocInfo *bfai);
 int coroutine_fn
(*bdrv_co_pwritev_compressed)(BlockDriverState *bs,
   uint64_t offset, uint64_t bytes, QEMUIOVector *qiov);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ea0b3e8b13..fd7b52bd69 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -139,6 +139,61 @@
  '*format-specific': 'ImageInfoSpecific' } }
 ##
+# @BlockFormatAllocInfo:
+#

I apologize in advance, but I don't understand this patch very well. Let
me ask some questions to get patch review rolling again, since you've
been waiting a bit.


+#
+# Allocation relations between format file and underlying protocol
file.
+# All fields are in bytes.
+#

The format file in this case would be ... what, the virtual file
represented by the qcow2? and the underlying protocol file is the raw
file that is the qcow2 itself?

yes


+# There are two types of the format file portions: 'used' and
'unused'. It's up
+# to the format how to interpret these types. For now the only
format supporting
+# the feature is Qcow2 and for this case 'used' are clusters with
positive
+# refcount and unused a clusters with zero refcount. Described
portions include
+# all format file allocations, not only virtual disk data (metadata,
internal
+# snapshots, etc. are included).

I guess the semantic differentiation between "used" and "unused" is left
to the individual fields, below.

hmm, I don't understand. differentiation is up to the format, and for
qcow2 it is described above


+#
+# For the underlying file there are native block-status types of the
portions:
+#  - data: allocated data
+#  - zero: read-as-zero holes
+#  - discarded: not allocated
+# 4th additional type is 'overrun', which is for the format file
portions beyond
+# the end of the underlying file.
+#
+# So, the fields are:
+#
+# @used-data: used by the format file and backed by data in the
underlying file
+#

I assume this is "defined and addressable data".


+# @used-zero: used by the format file and backed by a hole in the
underlying
+# file
+#

By a hole? Can you give me an example? Do you mean like a filesystem
hole ala falloc()?

-zero, -data and -discarded are the block status of corresponding area
in underlying file.

so, if underlying file is raw, yes, it should be a filesystem hole.
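
To make the used/unused versus data/zero/discarded/overrun split concrete, here is a minimal sketch of the accounting, not the patch's implementation. All `toy_*` names are hypothetical, and the rule that the part of a range lying past the end of the underlying file is counted as overrun is an assumption drawn from the field description quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* How the format driver treats a portion of its file (hypothetical names). */
enum toy_usage { TOY_USED, TOY_UNUSED, TOY_USAGE_MAX };

/* Block status of the same portion in the underlying file. */
enum toy_status { TOY_DATA, TOY_ZERO, TOY_DISCARDED, TOY_OVERRUN,
                  TOY_STATUS_MAX };

/* One byte counter per (usage, status) pair, e.g. used-data, used-zero. */
struct toy_alloc_info {
    uint64_t bytes[TOY_USAGE_MAX][TOY_STATUS_MAX];
};

/*
 * Account for the range [offset, offset + len) of the format file.
 * Bytes below file_len are counted under the given status; bytes at or
 * beyond file_len are counted as overrun instead.
 */
void toy_account(struct toy_alloc_info *info, enum toy_usage usage,
                 enum toy_status status, uint64_t offset, uint64_t len,
                 uint64_t file_len)
{
    uint64_t inside = 0;

    assert(info != 0);
    if (offset < file_len) {
        inside = file_len - offset;
        if (inside > len) {
            inside = len;
        }
    }
    info->bytes[usage][status] += inside;
    info->bytes[usage][TOY_OVERRUN] += len - inside;
}
```

With a 96 KiB underlying file, a used cluster at offset 64 KiB with zero status would add 32 KiB to the used-zero counter and 32 KiB to the used-overrun counter.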

example:
-
# ./qemu-img create -f qcow2 x 1G
Formatting 'x', fmt=qcow2 size=1073741824 encryption=off
cluster_size=65536 lazy_refcounts=off refcount_bits=16
# ./qemu-img check x
No errors were fo

[Qemu-block] [PATCH] block: fix bs->file leak in bdrv_new_open_driver()

2017-06-28 Thread Manos Pitsidianakis
bdrv_open_driver() is called in two places, bdrv_new_open_driver() and
bdrv_open_common(). In the latter, failure cleanup is in its caller,
bdrv_open_inherit(), which unrefs the bs->file of the failed driver open
if it exists. Let's check for this in bdrv_new_open_driver() as well.

Signed-off-by: Manos Pitsidianakis 
---
 block.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block.c b/block.c
index 694396281b..aeacd520e0 100644
--- a/block.c
+++ b/block.c
@@ -1165,6 +1165,9 @@ BlockDriverState *bdrv_new_open_driver(BlockDriver *drv, const char *node_name,
 
 ret = bdrv_open_driver(bs, drv, node_name, bs->options, flags, errp);
 if (ret < 0) {
+if (bs->file != NULL) {
+bdrv_unref_child(bs, bs->file);
+}
 QDECREF(bs->explicit_options);
 QDECREF(bs->options);
 bdrv_unref(bs);
-- 
2.11.0
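
The cleanup pattern this patch describes can be sketched outside QEMU as follows. This is a toy model under stated assumptions: every `toy_*` type and function is a hypothetical stand-in, not a QEMU API. It only shows that a failed open which has already attached a child must drop that child reference before the parent is released.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a child node and its parent. */
struct toy_child { int refcnt; };
struct toy_bs { struct toy_child *file; };

/* Stand-in for a driver open that attaches a child and may then fail. */
int toy_open_driver(struct toy_bs *bs, struct toy_child *c, int fail)
{
    c->refcnt++;
    bs->file = c;
    return fail ? -1 : 0;
}

/* Drop the parent's reference to its child. */
void toy_unref_child(struct toy_bs *bs)
{
    bs->file->refcnt--;
    bs->file = NULL;
}

/* Returns NULL on failure, after dropping any child reference taken. */
struct toy_bs *toy_new_open_driver(struct toy_child *c, int fail)
{
    struct toy_bs *bs;

    assert(c != NULL);
    bs = calloc(1, sizeof(*bs));
    if (bs == NULL) {
        return NULL;
    }
    if (toy_open_driver(bs, c, fail) < 0) {
        if (bs->file != NULL) {
            toy_unref_child(bs);    /* without this the reference leaks */
        }
        free(bs);
        return NULL;
    }
    return bs;
}
```

Without the `bs->file != NULL` branch, the child's reference count would stay at 1 after the failed open even though nothing points at it any more.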




Re: [Qemu-block] [PATCH] mirror: fix the inconsistent AioContext problem in the backing BDSs during mirroring.

2017-06-28 Thread sochin.jiang
Oh, I got it, thanks.


Sochin


On 2017/6/29 6:33, Max Reitz wrote:
> On 2017-06-26 13:04, sochin.jiang wrote:
>> From: "sochin.jiang" 
>>
>>  mirror_complete opens the backings, BDSs of the new open backings should 
>> have a
>>  same AioContext with the top when using iothreads, fix the code to 
>> guarantee this,
>>  also avoiding unexpected qemu exit(assert fails in bdrv_attach_child).
> Thanks! The functional change looks good to me. I'll see to add an
> iotest to cover this case.
>
> Some notes: There shouldn't be any spaces at the front of these lines,
> and every line should be limited to 72 characters.
>
> Also, the commit message should not end in a full stop (and it's good to
> keep it short).
>
> I've reworded a bit of the commit message (and shortened the title [1]) and
> applied it to my block branch:
>
> https://github.com/XanClic/qemu/commits/block
>
> Max
>
> [1] I've shortened it to "mirror: Fix inconsistent backing AioContext
> for after mirroring". I usually try to fit the title into 50 characters
> (because that's what the default vim highlighting is proposing...), so
> for reference, I would have made it something like "block: Align backing
> BDS's AioContext with overlay" -- if it had been my patch. Since it was
> yours, I'm fine with commit titles longer than 50 characters (although
> they really should not exceed 72).
>
>> Signed-off-by: sochin.jiang 
>> ---
>>  block.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/block.c b/block.c
>> index 6943962..b312fe6 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -2185,6 +2185,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
>> *parent_options,
>>  ret = -EINVAL;
>>  goto free_exit;
>>  }
>> +bdrv_set_aio_context(backing_hd, bdrv_get_aio_context(bs));
>>  
>>  /* Hook up the backing file link; drop our reference, bs owns the
>>   * backing_hd reference now */
>>
>





Re: [Qemu-block] [PATCH v3 1/3] block: add bdrv_get_format_alloc_stat format interface

2017-06-28 Thread John Snow


On 06/28/2017 11:59 AM, Vladimir Sementsov-Ogievskiy wrote:
> 27.06.2017 02:19, John Snow wrote:
>>
>> On 06/06/2017 12:26 PM, Vladimir Sementsov-Ogievskiy wrote:
>>> The function should collect statistics, about used/unused by top-level
>>> format driver space (in its .file) and allocation status
>>> (data/zero/discarded/after-eof) of corresponding areas in this .file.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
>>>   block.c   | 16 ++
>>>   include/block/block.h |  3 +++
>>>   include/block/block_int.h |  2 ++
>>>   qapi/block-core.json  | 55
>>> +++
>>>   4 files changed, 76 insertions(+)
>>>
>>> diff --git a/block.c b/block.c
>>> index 50ba264143..7d720ae0c2 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -3407,6 +3407,22 @@ int64_t
>>> bdrv_get_allocated_file_size(BlockDriverState *bs)
>>>   }
>>> /**
>>> + * Collect format allocation info. See BlockFormatAllocInfo
>>> definition in
>>> + * qapi/block-core.json.
>>> + */
>>> +int bdrv_get_format_alloc_stat(BlockDriverState *bs,
>>> BlockFormatAllocInfo *bfai)
>>> +{
>>> +BlockDriver *drv = bs->drv;
>>> +if (!drv) {
>>> +return -ENOMEDIUM;
>>> +}
>>> +if (drv->bdrv_get_format_alloc_stat) {
>>> +return drv->bdrv_get_format_alloc_stat(bs, bfai);
>>> +}
>>> +return -ENOTSUP;
>>> +}
>>> +
>>> +/**
>>>* Return number of sectors on success, -errno on error.
>>>*/
>>>   int64_t bdrv_nb_sectors(BlockDriverState *bs)
>>> diff --git a/include/block/block.h b/include/block/block.h
>>> index 9b355e92d8..646376a772 100644
>>> --- a/include/block/block.h
>>> +++ b/include/block/block.h
>>> @@ -335,6 +335,9 @@ typedef enum {
>>> int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res,
>>> BdrvCheckMode fix);
>>>   +int bdrv_get_format_alloc_stat(BlockDriverState *bs,
>>> +   BlockFormatAllocInfo *bfai);
>>> +
>>>   /* The units of offset and total_work_size may be chosen
>>> arbitrarily by the
>>>* block driver; total_work_size may change during the course of
>>> the amendment
>>>* operation */
>>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>>> index 8d3724cce6..458c715e99 100644
>>> --- a/include/block/block_int.h
>>> +++ b/include/block/block_int.h
>>> @@ -208,6 +208,8 @@ struct BlockDriver {
>>>   int64_t (*bdrv_getlength)(BlockDriverState *bs);
>>>   bool has_variable_length;
>>>   int64_t (*bdrv_get_allocated_file_size)(BlockDriverState *bs);
>>> +int (*bdrv_get_format_alloc_stat)(BlockDriverState *bs,
>>> +  BlockFormatAllocInfo *bfai);
>>> int coroutine_fn
>>> (*bdrv_co_pwritev_compressed)(BlockDriverState *bs,
>>>   uint64_t offset, uint64_t bytes, QEMUIOVector *qiov);
>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>> index ea0b3e8b13..fd7b52bd69 100644
>>> --- a/qapi/block-core.json
>>> +++ b/qapi/block-core.json
>>> @@ -139,6 +139,61 @@
>>>  '*format-specific': 'ImageInfoSpecific' } }
>>> ##
>>> +# @BlockFormatAllocInfo:
>>> +#
>> I apologize in advance, but I don't understand this patch very well. Let
>> me ask some questions to get patch review rolling again, since you've
>> been waiting a bit.
>>
>>> +#
>>> +# Allocation relations between format file and underlying protocol
>>> file.
>>> +# All fields are in bytes.
>>> +#
>> The format file in this case would be ... what, the virtual file
>> represented by the qcow2? and the underlying protocol file is the raw
>> file that is the qcow2 itself?
> 
> yes
> 
>>
>>> +# There are two types of the format file portions: 'used' and
>>> 'unused'. It's up
>>> +# to the format how to interpret these types. For now the only
>>> format supporting
>>> +# the feature is Qcow2 and for this case 'used' are clusters with
>>> positive
>>> +# refcount and unused a clusters with zero refcount. Described
>>> portions include
>>> +# all format file allocations, not only virtual disk data (metadata,
>>> internal
>>> +# snapshots, etc. are included).
>> I guess the semantic differentiation between "used" and "unused" is left
>> to the individual fields, below.
> 
> hmm, I don't understand. differentiation is up to the format, and for
> qcow2 it is described above
> 
>>
>>> +#
>>> +# For the underlying file there are native block-status types of the
>>> portions:
>>> +#  - data: allocated data
>>> +#  - zero: read-as-zero holes
>>> +#  - discarded: not allocated
>>> +# 4th additional type is 'overrun', which is for the format file
>>> portions beyond
>>> +# the end of the underlying file.
>>> +#
>>> +# So, the fields are:
>>> +#
>>> +# @used-data: used by the format file and backed by data in the
>>> underlying file
>>> +#
>> I assume this is "defined and addressable data".
>>
>>> +# @used-zero: used by the format file and backed by a hole in the
>>> underlying
>>> +# file
>>> +#
>> By a hole? Can you give

Re: [Qemu-block] [PATCH v2 3/4] qcow2: add shrink image support

2017-06-28 Thread Max Reitz
On 2017-06-28 17:31, Pavel Butsykin wrote:
> On 28.06.2017 16:59, Max Reitz wrote:
>> On 2017-06-27 17:06, Pavel Butsykin wrote:
>>> On 26.06.2017 20:47, Max Reitz wrote:
>>>> On 2017-06-26 17:23, Pavel Butsykin wrote:
>>> [...]
>>>>>
>>>>> Is there any guarantee that in the future this will not change?
>>>>> Because in this case it can be a potential danger.
>>>>
>>>> Since this behavior is not documented anywhere, there is no guarantee.
>>>>
>>>>> I can add a comment... Or add a new variable with the size of
>>>>> reftable_tmp, and every time count min(s->refcount_table_size,
>>>>> reftable_tmp_size)
>>>>> before accessing to s->refcount_table[]/reftable_tmp[]
>>>>
>>>> Or (1) you add an assertion that refcount_table_size doesn't change
>>>> along with a comment why that is the case, which also explains in detail
>>>> why the call to qcow2_free_clusters() should be safe: The on-disk
>>>> reftable differs from the one in memory. qcow2_free_clusters() and
>>>> update_refcount() themselves do not access the reftable, so they are
>>>> safe. However, update_refcount() calls alloc_refcount_block(), and that
>>>> function does access the reftable: Now, as long as
>>>> s->refcount_table_size does not shrink (which I can't see why it would),
>>>> refcount_table_index should always be smaller. Now we're accessing
>>>> s->refcount_table: This will always return an existing refblock because
>>>> this will either be the refblock itself (for self-referencing refblocks)
>>>> or another one that is not going to be freed by qcow2_shrink_reftable()
>>>> because this function will not free refblocks which cover other clusters
>>>> than themselves.
>>>> We will then proceed to update the refblock which is either right (if it
>>>> is not the refblock to be freed) or won't do anything (if it is the one
>>>> to be freed).
>>>> In any case, we will never write to the reftable and reading from the
>>>> basically outdated cached version will never do anything bad.
>>>
>>> OK, SGTM.
>>>
>>>> Or (2) you copy reftable_tmp into s->refcount_table[] *before* any call
>>>> to qcow2_free_clusters(). To make this work, you would need to also
>>>> discard all refblocks from the cache in this function here (and not in
>>>> update_refcount()) and then only call qcow2_free_clusters() on refblocks
>>>> which were not self-referencing. An alternative hack would be to simply
>>>> mark the image dirty and just not do any qcow2_free_clusters() call...
>>>
>>> The main purpose of qcow2_reftable_shrink() function is discard all
>>> unnecessary refblocks from the file. If we do only rewrite
>>> refcount_table and discard non-self-referencing refblocks (which are
>>> actually very rare), then the meaning of the function is lost.
>>
>> It would do exactly the same. The idea is that you do not need to call
>> qcow2_free_clusters() on self-referencing refblocks at all, since they
>> are freed automatically when their reftable entry is overwritten with 0.
> 
> Not sure.. For self-referencing refblocks, we also need to do:
> 1. check if refcount > 1

Yes, if that wasn't an error flagged by qemu-img check. :-)

(http://git.qemu.org/?p=qemu.git;a=blob;f=block/qcow2-refcount.c;h=7c06061aae90eb4f091f51df995a9e099178c0ed;hb=HEAD#l1787)

> 2. update s->free_cluster_index
> 3. call update_refcount_discard() (to in the end the fallocate
> PUNCH_HOLE was called on refblock offset)

These, yes, you'd have to do here.

> It will be practically a copy-paste from qcow2_free_clusters(), so it is
> better to avoid it. I think that if it makes sense to do
> qcow2_reftable_shrink(), it is only because we can slightly reduce image
> size.

But it would be a small copy-paste (although I may very well be wrong)
and it would help me sleep better because I could actually understand it.

Max

>>>> Or (3) of course it would be possible to not clean up refcount
>>>> structures at all...
>>>
>>> Nice solution :)
>>
>> It is, because as I said refcount structures only have a small overhead.
> 
> Yes, I agree.
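
Option (1) from the quoted discussion, an assertion that the in-memory table size does not change while clusters are being freed, could look roughly like the sketch below. It is a toy model: `toy_state`, `toy_shrink_reftable()` and the callback are hypothetical, and the real qcow2 code paths (`qcow2_free_clusters()`, `update_refcount()`) are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the in-memory refcount table state. */
struct toy_state {
    uint64_t *refcount_table;
    size_t refcount_table_size;
    uint64_t last_freed;          /* set by the example callback below */
};

typedef void (*toy_free_fn)(struct toy_state *s, uint64_t offset);

/* Example callback: records the freed offset and leaves the table alone. */
void toy_record_free(struct toy_state *s, uint64_t offset)
{
    s->last_freed = offset;
}

/*
 * Free every refblock whose entry was zeroed in reftable_tmp, then copy
 * reftable_tmp over the in-memory table. Returns the number of frees.
 */
size_t toy_shrink_reftable(struct toy_state *s, const uint64_t *reftable_tmp,
                           toy_free_fn free_cluster)
{
    const size_t old_size = s->refcount_table_size;
    size_t freed = 0;
    size_t i;

    for (i = 0; i < old_size; i++) {
        if (s->refcount_table[i] != 0 && reftable_tmp[i] == 0) {
            free_cluster(s, s->refcount_table[i]);
            /*
             * The free path may read the (outdated) in-memory table but
             * must never resize it, otherwise index i could run past
             * reftable_tmp.
             */
            assert(s->refcount_table_size == old_size);
            freed++;
        }
    }
    for (i = 0; i < old_size; i++) {
        s->refcount_table[i] = reftable_tmp[i];
    }
    return freed;
}
```

The assertion documents the invariant the loop relies on; it does not make a resizing free path safe, it only makes a violation fail loudly.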





[Qemu-block] [PATCH] iotests: Add test for dataplane mirroring

2017-06-28 Thread Max Reitz
Signed-off-by: Max Reitz 
---
Depends on Stefan's "virtio: use ioeventfd in TCG and qtest mode" series
to work at all, and on "mirror: Fix inconsistent backing AioContext for
after mirroring" (in my block branch) so it does not fail.
---
 tests/qemu-iotests/106 | 97 ++
 tests/qemu-iotests/106.out | 14 +++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 112 insertions(+)
 create mode 100755 tests/qemu-iotests/106
 create mode 100644 tests/qemu-iotests/106.out

diff --git a/tests/qemu-iotests/106 b/tests/qemu-iotests/106
new file mode 100755
index 000..ad438b5
--- /dev/null
+++ b/tests/qemu-iotests/106
@@ -0,0 +1,97 @@
+#!/bin/bash
+#
+# Test case for mirroring with dataplane
+#
+# Copyright (C) 2017 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=mre...@redhat.com
+
+seq=$(basename $0)
+echo "QA output created by $seq"
+
+here=$PWD
+status=1 # failure is the default!
+
+_cleanup()
+{
+_cleanup_qemu
+_cleanup_test_img
+_rm_test_img "$TEST_IMG.overlay0"
+_rm_test_img "$TEST_IMG.overlay1"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and qemu instance handling
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+IMG_SIZE=64K
+
+_make_test_img $IMG_SIZE
+TEST_IMG="$TEST_IMG.overlay0" _make_test_img -b "$TEST_IMG" $IMG_SIZE
+TEST_IMG="$TEST_IMG.overlay1" _make_test_img -b "$TEST_IMG" $IMG_SIZE
+
+# So that we actually have something to mirror and the job does not return
+# immediately (which may be bad because then we cannot know whether the
+# 'return' or the 'BLOCK_JOB_READY' comes first).
+$QEMU_IO -c 'write 0 64' "$TEST_IMG.overlay0" | _filter_qemu_io
+
+# We cannot use virtio-blk here because that does not actually set the attached
+# BB's AioContext in qtest mode
+_launch_qemu \
+-object iothread,id=iothr \
+-blockdev node-name=source,driver=$IMGFMT,file.driver=file,file.filename="$TEST_IMG.overlay0" \
+-device virtio-scsi,id=scsi-bus,iothread=iothr \
+-device scsi-hd,bus=scsi-bus.0,drive=source
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'qmp_capabilities' }" \
+'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'drive-mirror',
+   'arguments': {
+   'job-id': 'mirror',
+   'device': 'source',
+   'target': '$TEST_IMG.overlay1',
+   'mode':   'existing',
+   'sync':   'top'
+   } }" \
+'BLOCK_JOB_READY'
+
+# The backing BDS should be assigned the overlay's AioContext
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'block-job-complete',
+   'arguments': { 'device': 'mirror' } }" \
+'BLOCK_JOB_COMPLETED'
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'quit' }" \
+'return'
+
+wait=yes _cleanup_qemu
+
+# success, all done
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/106.out b/tests/qemu-iotests/106.out
new file mode 100644
index 000..d1b83f1
--- /dev/null
+++ b/tests/qemu-iotests/106.out
@@ -0,0 +1,14 @@
+QA output created by 106
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=65536
+Formatting 'TEST_DIR/t.IMGFMT.overlay0', fmt=IMGFMT size=65536 backing_file=TEST_DIR/t.IMGFMT
+Formatting 'TEST_DIR/t.IMGFMT.overlay1', fmt=IMGFMT size=65536 backing_file=TEST_DIR/t.IMGFMT
+wrote 64/64 bytes at offset 0
+64 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"return": {}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "mirror", "len": 65536, "offset": 65536, "speed": 0, "type": "mirror"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "mirror", "len": 65536, "offset": 65536, "speed": 0, "type": "mirror"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 613d596..8da3e2b 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -112,6 +112,7 @@
 103 rw auto quick
 104 rw auto
 105 rw auto quick
+106 rw auto backing
 107 rw auto quick
 108 rw auto quick
 109 rw auto
-- 
2.9.4




Re: [Qemu-block] [PATCH] mirror: fix the inconsistent AioContext problem in the backing BDSs during mirroring.

2017-06-28 Thread Max Reitz
On 2017-06-26 13:04, sochin.jiang wrote:
> From: "sochin.jiang" 
> 
>  mirror_complete opens the backings, BDSs of the new open backings should 
> have a
>  same AioContext with the top when using iothreads, fix the code to guarantee 
> this,
>  also avoiding unexpected qemu exit(assert fails in bdrv_attach_child).

Thanks! The functional change looks good to me. I'll see to add an
iotest to cover this case.

Some notes: There shouldn't be any spaces at the front of these lines,
and every line should be limited to 72 characters.

Also, the commit message should not end in a full stop (and it's good to
keep it short).

I've reworded a bit of the commit message (and shortened the title [1]) and
applied it to my block branch:

https://github.com/XanClic/qemu/commits/block

Max

[1] I've shortened it to "mirror: Fix inconsistent backing AioContext
for after mirroring". I usually try to fit the title into 50 characters
(because that's what the default vim highlighting is proposing...), so
for reference, I would have made it something like "block: Align backing
BDS's AioContext with overlay" -- if it had been my patch. Since it was
yours, I'm fine with commit titles longer than 50 characters (although
they really should not exceed 72).

> Signed-off-by: sochin.jiang 
> ---
>  block.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/block.c b/block.c
> index 6943962..b312fe6 100644
> --- a/block.c
> +++ b/block.c
> @@ -2185,6 +2185,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
> *parent_options,
>  ret = -EINVAL;
>  goto free_exit;
>  }
> +bdrv_set_aio_context(backing_hd, bdrv_get_aio_context(bs));
>  
>  /* Hook up the backing file link; drop our reference, bs owns the
>   * backing_hd reference now */
> 






Re: [Qemu-block] [Qemu-devel] [PATCH v4 1/2] live-block-ops.txt: Rename, rewrite, and improve it

2017-06-28 Thread Eric Blake
On 06/28/2017 03:15 PM, Alberto Garcia wrote:
> On Wed 28 Jun 2017 04:58:00 PM CEST, Kashyap Chamarthy wrote:
>> This patch documents (including their QMP invocations) all the four
>> major kinds of live block operations:
>>
>>   - `block-stream`
>>   - `block-commit`
>>   - `drive-mirror` (& `blockdev-mirror`)
>>   - `drive-backup` (& `blockdev-backup`)
> 
> This is excellent work, thanks for doing this!
> 
> I haven't had the time to review the complete document yet, but here are
> my comments from the first half:
> 
>> +Disk image backing chain notation
>> +---------------------------------
>   [...]
>> +.. important::
>> +The base disk image can be raw format; however, all the overlay
>> +files must be of QCOW2 format.
> 
> This is not quite like that: overlay files must be in a format that
> supports backing files. QCOW2 is the most common one, but there are
> others (qed). Grep for 'supports_backing' in the source code.

At the same time, other image formats are not as frequently tested, or
may be read-only.  Maybe a compromise of "The overlay files can
generally be any format that supports a backing file, although qcow2 is
the preferred format and the one used in this document".

> 
>> +(1) ``block-stream``: Live copy of data from backing files into overlay
>> +files (with the optional goal of removing the backing file from the
>> +chain).
> 
> optional? The backing file is removed from the chain as soon as the
> operation finishes, although the image file is not removed from the
> disk. Maybe you meant that?

Hmm, you're right. In this case, qemu ALWAYS rewrites the backing chain
to get rid of the (now-streamed) backing image.
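
As a rough illustration of that chain rewrite, here is a toy model of streaming one backing image into its overlay. Everything in it is an assumption made for the sketch: images are four-cluster arrays, `TOY_UNALLOC` marks a cluster that falls through to the backing file, and `toy_stream()` handles only the immediate backing image rather than a whole chain.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CLUSTERS 4
#define TOY_UNALLOC (-1)

/* Toy image: per-cluster content plus a backing pointer. */
struct toy_image {
    int cluster[TOY_CLUSTERS];    /* TOY_UNALLOC means "look in backing" */
    struct toy_image *backing;
};

/* Guest-visible content of cluster i when img is the top of the chain. */
int toy_read(const struct toy_image *img, size_t i)
{
    assert(i < TOY_CLUSTERS);
    for (; img != NULL; img = img->backing) {
        if (img->cluster[i] != TOY_UNALLOC) {
            return img->cluster[i];
        }
    }
    return 0;                     /* unallocated everywhere reads as zero */
}

/*
 * Stream the immediate backing image into the overlay, then drop it from
 * the chain. The old backing image itself is left untouched.
 */
void toy_stream(struct toy_image *overlay)
{
    struct toy_image *old = overlay->backing;
    size_t i;

    if (old == NULL) {
        return;
    }
    for (i = 0; i < TOY_CLUSTERS; i++) {
        if (overlay->cluster[i] == TOY_UNALLOC &&
            old->cluster[i] != TOY_UNALLOC) {
            overlay->cluster[i] = old->cluster[i];
        }
    }
    overlay->backing = old->backing;   /* the chain is always shortened */
}
```

In this model the overlay reads the same before and after the stream, and the streamed image still reads correctly through its own backing pointer; deleting it from disk is a separate step left to the user.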

> 
>> +(2) ``block-commit``: Live merge of data from overlay files into backing
>> +files (with the optional goal of removing the overlay file from the
>> +chain).  Since QEMU 2.0, this includes "active ``block-commit``"
>> +(i.e.  merge the current active layer into the base image).
> 
> Same question about the 'optional' here.

Here, optional is a bit more correct. With non-active (intermediate)
commit, qemu ALWAYS rewrites the backing chain to be shorter; but with
live commit, you can choose whether to pivot to the now-shorter chain
(job-complete) or whether to keep the active image intact (starting to
collect a new delta from the point-in-time of the just-completed commit,
by job-cancel).

> 
>> +writing to it.  (The rule of thumb is: live QEMU will always be pointing
>> +to the right-most image in a disk image chain.)
> 
> I think it's 'rightmost', without the hyphen.

Sadly, I think this is one case where both spellings work to a native
reader, and where I don't know of a specific style-guide preference.  I
probably would have written with the hyphen.

> 
>> +(3) Intermediate streaming (available since QEMU 2.8): Starting afresh
>> +with the original example disk image chain, with a total of four
>> +images, it is possible to copy contents from image [B] into image
>> +[C].  Once the copy is finished, image [B] can now be (optionally)
>> +discarded; and the backing file pointer of image [C] will be
>> +adjusted to point to [A].
> 
> The 'optional' usage again. [B] will be removed from the chain and can
> be (optionally) removed from the disk, but that you have to do yourself,
> QEMU won't do that.

Indeed, we may need to be specifically clear of the cases where qemu
shortens the chain, but where disk images that are no longer used by the
chain (whether they are still viable [as in stream], or invalidated [as
in commit crossing more than one element of the chain]) are still left
on the disk for the user to discard separately from qemu.

> 
>> +The ``block-commit`` command lets you to live merge data from overlay
>> +images into backing file(s).
> 
> I don't think "lets you to live merge data" is correct English.

Probably better as "lets you merge live data from overlay..."

> 
>> +The disk image chain can be shortened in one of the following ways:
>> +
>> +.. _`block-commit_Case-1`:
>> +
>> +(1) Commit content from only image [B] into image [A].  The resulting
>> +chain is the following, where image [C] is adjusted to point at [A]
>> +as its new backing file::
>> +
>> +[A] <-- [C] <-- [D]
>> +
>> +(2) Commit content from images [B] and [C] into image [A].  The
>> +resulting chain, where image [D] is adjusted to point to image [A]
>> +as its new backing file::
>> +
>> +[A] <-- [D]
> 
> Aren't these two just different variants of the same case?

Almost. But in case 1, image [B] is still viable (from a guest point of
view, the contents of [B] have not changed); in case 2, image [B] is
most likely corrupted (any changes propagated from [C] into [A] that
were not already overridden in [B] now read differently, making image
[B] no longer match anything the guest ever saw at any point in past time).
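
The difference between the two cases can be checked with a small toy model. This is only a sketch under assumptions: images are four-cluster arrays, `TOY_UNALLOC` means the cluster falls through to the backing file, and `toy_commit()` is a hypothetical helper that writes the topmost allocated data between `top` and `base` into `base` without relinking the chain.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CLUSTERS 4
#define TOY_UNALLOC (-1)

/* Toy image: per-cluster content plus a backing pointer. */
struct toy_image {
    int cluster[TOY_CLUSTERS];    /* TOY_UNALLOC means "look in backing" */
    struct toy_image *backing;
};

/* Guest-visible content of cluster i when img is the top of the chain. */
int toy_read(const struct toy_image *img, size_t i)
{
    assert(i < TOY_CLUSTERS);
    for (; img != NULL; img = img->backing) {
        if (img->cluster[i] != TOY_UNALLOC) {
            return img->cluster[i];
        }
    }
    return 0;                     /* unallocated everywhere reads as zero */
}

/*
 * Commit everything from top down to (but excluding) base into base.
 * For each cluster the topmost allocated copy wins. base must be in the
 * backing chain of top.
 */
void toy_commit(struct toy_image *top, struct toy_image *base)
{
    size_t i;

    for (i = 0; i < TOY_CLUSTERS; i++) {
        struct toy_image *img;

        for (img = top; img != base; img = img->backing) {
            assert(img != NULL);
            if (img->cluster[i] != TOY_UNALLOC) {
                base->cluster[i] = img->cluster[i];
                break;
            }
        }
    }
}
```

With [A] = {1,1,1,1}, [B] = {-,2,-,2} and [C] = {-,-,3,3}, committing only [B] into [A] leaves [B] reading 1,2,1,2 as before. Committing [B] and [C] into [A] makes [B] read 3 in its third cluster, a value that image never showed, which is the corruption described above.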

> 
>> +
>> +.. _`block-commit_Case-3`:
>> +
>> +(3) Commit content from images [B], [C], and the ac

Re: [Qemu-block] [PATCH v4 1/2] live-block-ops.txt: Rename, rewrite, and improve it

2017-06-28 Thread Alberto Garcia
On Wed 28 Jun 2017 04:58:00 PM CEST, Kashyap Chamarthy wrote:
> This patch documents (including their QMP invocations) all the four
> major kinds of live block operations:
>
>   - `block-stream`
>   - `block-commit`
>   - `drive-mirror` (& `blockdev-mirror`)
>   - `drive-backup` (& `blockdev-backup`)

This is excellent work, thanks for doing this!

I haven't had the time to review the complete document yet, but here are
my comments from the first half:

> +Disk image backing chain notation
> +---------------------------------
  [...]
> +.. important::
> +The base disk image can be raw format; however, all the overlay
> +files must be of QCOW2 format.

This is not quite like that: overlay files must be in a format that
supports backing files. QCOW2 is the most common one, but there are
others (qed). Grep for 'supports_backing' in the source code.

> +(1) ``block-stream``: Live copy of data from backing files into overlay
> +files (with the optional goal of removing the backing file from the
> +chain).

optional? The backing file is removed from the chain as soon as the
operation finishes, although the image file is not removed from the
disk. Maybe you meant that?

> +(2) ``block-commit``: Live merge of data from overlay files into backing
> +files (with the optional goal of removing the overlay file from the
> +chain).  Since QEMU 2.0, this includes "active ``block-commit``"
> +(i.e.  merge the current active layer into the base image).

Same question about the 'optional' here.

> +writing to it.  (The rule of thumb is: live QEMU will always be pointing
> +to the right-most image in a disk image chain.)

I think it's 'rightmost', without the hyphen.

> +(3) Intermediate streaming (available since QEMU 2.8): Starting afresh
> +with the original example disk image chain, with a total of four
> +images, it is possible to copy contents from image [B] into image
> +[C].  Once the copy is finished, image [B] can now be (optionally)
> +discarded; and the backing file pointer of image [C] will be
> +adjusted to point to [A].

The 'optional' usage again. [B] will be removed from the chain and can
be (optionally) removed from the disk, but that you have to do yourself,
QEMU won't do that.

> +The ``block-commit`` command lets you to live merge data from overlay
> +images into backing file(s).

I don't think "lets you to live merge data" is correct English.

> +The disk image chain can be shortened in one of the following ways:
> +
> +.. _`block-commit_Case-1`:
> +
> +(1) Commit content from only image [B] into image [A].  The resulting
> +chain is the following, where image [C] is adjusted to point at [A]
> +as its new backing file::
> +
> +[A] <-- [C] <-- [D]
> +
> +(2) Commit content from images [B] and [C] into image [A].  The
> +resulting chain, where image [D] is adjusted to point to image [A]
> +as its new backing file::
> +
> +[A] <-- [D]

Aren't these two just different variants of the same case?

> +
> +.. _`block-commit_Case-3`:
> +
> +(3) Commit content from images [B], [C], and the active layer [D] into
> +image [A].  The resulting chain (in this case, a consolidated single
> +image)::
> +
> +[A]
> +
> +(4) Commit content from image only image [C] into image [B].  The
> +resulting chain::
> +
> + [A] <-- [B] <-- [D]
> +
> +(5) Commit content from image [C] and the active layer [D] into image
> +[B].  The resulting chain::
> +
> + [A] <-- [B]

Same here.

I mean, it's fine to have several different examples, but I think
there's really two main cases here (as you correctly explain later).

> +(QEMU) block-commit device=node-D base=a.qcow2 top=d.qcow2 job-id=job0
> +{
> +"execute": "block-commit",
> +"arguments": {
> +"device": "node-D",
> +"job-id": "job0",
> +"top": "d.qcow2",
> +"base": "a.qcow2"
> +}
> +}

This is correct, but I don't know if it's worth mentioning that if you
omit the 'top' parameter it defaults to the active layer (node-D in this
example).

Same with 'base'.

Otherwise this part looks good! I'll check the rest of the document tomorrow
and give you my feedback.

Berto



Re: [Qemu-block] [Qemu-devel] [PATCH 0/6] virtio: use ioeventfd in TCG and qtest mode

2017-06-28 Thread Eric Blake
On 06/28/2017 01:47 PM, Stefan Hajnoczi wrote:
> This patch series fixes qemu-iotests 068.  Since commit
> ea4f3cebc4e0224605ab9dd9724aa4e7768fe372 ("qemu-iotests: 068: test iothread
> mode") the test case has attempted to use dataplane without -M accel=kvm.
> Although QEMU is capable of running TCG or qtest with emulated ioeventfd/irqfd
> we haven't enabled it yet.
> 
> Unfortunately the virtio test cases fail when ioeventfd is enabled in qtest
> mode.  This is because they make assumptions about virtqueue ISR signalling.
> They assume that a request is completed when ISR becomes 1.  However, the ISR
> can be set to 1 even though no new request has completed since commit
> 83d768b5640946b7da55ce8335509df297e2c7cd "virtio: set ISR on dataplane
> notifications".
> 
> This issue is solved by introducing a proper qvirtqueue_get_buf() API (similar
> to the Linux guest drivers) instead of making assumptions about the ISR.  Most
> of the patches update the test cases to use the new API.
> 
> Stefan Hajnoczi (6):
>   libqos: fix typo in virtio.h QVirtQueue->used comment
>   libqos: add virtio used ring support
>   tests: fix virtio-scsi-test ISR dependence
>   tests: fix virtio-blk-test ISR dependence
>   tests: fix virtio-net-test ISR dependence
>   virtio-pci: use ioeventfd even when KVM is disabled

I'm less familiar with the code in question, so I'll let others review,
but it did fix the failure of 068 for me.

Tested-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





[Qemu-block] [PATCH 5/6] tests: fix virtio-net-test ISR dependence

2017-06-28 Thread Stefan Hajnoczi
Use the new used ring APIs instead of assuming ISR being set means the
request has completed.

Signed-off-by: Stefan Hajnoczi 
---
 tests/virtio-net-test.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/virtio-net-test.c b/tests/virtio-net-test.c
index 8f94360..635b942 100644
--- a/tests/virtio-net-test.c
+++ b/tests/virtio-net-test.c
@@ -108,7 +108,7 @@ static void rx_test(QVirtioDevice *dev,
 ret = iov_send(socket, iov, 2, 0, sizeof(len) + sizeof(test));
 g_assert_cmpint(ret, ==, sizeof(test) + sizeof(len));
 
-qvirtio_wait_queue_isr(dev, vq, QVIRTIO_NET_TIMEOUT_US);
+qvirtio_wait_used_elem(dev, vq, free_head, QVIRTIO_NET_TIMEOUT_US);
 memread(req_addr + VNET_HDR_SIZE, buffer, sizeof(test));
 g_assert_cmpstr(buffer, ==, "TEST");
 
@@ -131,7 +131,7 @@ static void tx_test(QVirtioDevice *dev,
 free_head = qvirtqueue_add(vq, req_addr, 64, false, false);
 qvirtqueue_kick(dev, vq, free_head);
 
-qvirtio_wait_queue_isr(dev, vq, QVIRTIO_NET_TIMEOUT_US);
+qvirtio_wait_used_elem(dev, vq, free_head, QVIRTIO_NET_TIMEOUT_US);
 guest_free(alloc, req_addr);
 
 ret = qemu_recv(socket, &len, sizeof(len), 0);
@@ -182,7 +182,7 @@ static void rx_stop_cont_test(QVirtioDevice *dev,
 rsp = qmp("{ 'execute' : 'cont'}");
 QDECREF(rsp);
 
-qvirtio_wait_queue_isr(dev, vq, QVIRTIO_NET_TIMEOUT_US);
+qvirtio_wait_used_elem(dev, vq, free_head, QVIRTIO_NET_TIMEOUT_US);
 memread(req_addr + VNET_HDR_SIZE, buffer, sizeof(test));
 g_assert_cmpstr(buffer, ==, "TEST");
 
-- 
2.9.4




[Qemu-block] [PATCH 6/6] virtio-pci: use ioeventfd even when KVM is disabled

2017-06-28 Thread Stefan Hajnoczi
Old kvm.ko versions only supported a tiny number of ioeventfds so
virtio-pci avoids ioeventfds when kvm_has_many_ioeventfds() returns 0.

Do not check kvm_has_many_ioeventfds() when KVM is disabled since it
always returns 0.  Since commit 8c56c1a592b5092d91da8d8943c1d6462a6f
("memory: emulate ioeventfd") it has been possible to use ioeventfds in
qtest or TCG mode.

This patch makes -device virtio-blk-pci,iothread=iothread0 work even
when KVM is disabled.

I have tested that virtio-blk-pci works under TCG both with and without
iothread.

This patch fixes qemu-iotests 068, which was accidentally merged early
despite the dependency on ioeventfd.

Cc: Michael S. Tsirkin 
Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Message-id: 20170615163813.7255-2-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 hw/virtio/virtio-pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 20d6a08..301920e 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1740,7 +1740,7 @@ static void virtio_pci_realize(PCIDevice *pci_dev, Error **errp)
 bool pcie_port = pci_bus_is_express(pci_dev->bus) &&
  !pci_bus_is_root(pci_dev->bus);
 
-if (!kvm_has_many_ioeventfds()) {
+if (kvm_enabled() && !kvm_has_many_ioeventfds()) {
 proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
 }
 
-- 
2.9.4
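
Note on the one-line change above: a minimal sketch of the truth table, assuming the ioeventfd flag starts out set. toy_keep_ioeventfd() and its two parameters are hypothetical stand-ins for the results of kvm_enabled() and kvm_has_many_ioeventfds(); this is not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the check in virtio_pci_realize(): the two
 * parameters model the results of kvm_enabled() and
 * kvm_has_many_ioeventfds(). Returns whether the ioeventfd flag survives.
 */
static bool toy_keep_ioeventfd(bool kvm_on, bool many_ioeventfds)
{
    bool use_ioeventfd = true;    /* assumption: flag starts out set */

    if (kvm_on && !many_ioeventfds) {
        use_ioeventfd = false;    /* old kvm.ko: too few ioeventfds */
    }
    return use_ioeventfd;
}
```

With KVM disabled the second parameter no longer matters, which is the case the patch is after: toy_keep_ioeventfd(false, false) keeps the flag, while toy_keep_ioeventfd(true, false) still clears it.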




[Qemu-block] [PATCH 4/6] tests: fix virtio-blk-test ISR dependence

2017-06-28 Thread Stefan Hajnoczi
Use the new used ring APIs instead of assuming ISR being set means the
request has completed.

Signed-off-by: Stefan Hajnoczi 
---
 tests/virtio-blk-test.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
index fd2078c..0576cb1 100644
--- a/tests/virtio-blk-test.c
+++ b/tests/virtio-blk-test.c
@@ -196,7 +196,7 @@ static void test_basic(QVirtioDevice *dev, QGuestAllocator *alloc,
 
 qvirtqueue_kick(dev, vq, free_head);
 
-qvirtio_wait_queue_isr(dev, vq, QVIRTIO_BLK_TIMEOUT_US);
+qvirtio_wait_used_elem(dev, vq, free_head, QVIRTIO_BLK_TIMEOUT_US);
 status = readb(req_addr + 528);
 g_assert_cmpint(status, ==, 0);
 
@@ -218,7 +218,7 @@ static void test_basic(QVirtioDevice *dev, QGuestAllocator *alloc,
 
 qvirtqueue_kick(dev, vq, free_head);
 
-qvirtio_wait_queue_isr(dev, vq, QVIRTIO_BLK_TIMEOUT_US);
+qvirtio_wait_used_elem(dev, vq, free_head, QVIRTIO_BLK_TIMEOUT_US);
 status = readb(req_addr + 528);
 g_assert_cmpint(status, ==, 0);
 
@@ -246,7 +246,7 @@ static void test_basic(QVirtioDevice *dev, QGuestAllocator *alloc,
 qvirtqueue_add(vq, req_addr + 528, 1, true, false);
 qvirtqueue_kick(dev, vq, free_head);
 
-qvirtio_wait_queue_isr(dev, vq, QVIRTIO_BLK_TIMEOUT_US);
+qvirtio_wait_used_elem(dev, vq, free_head, QVIRTIO_BLK_TIMEOUT_US);
 status = readb(req_addr + 528);
 g_assert_cmpint(status, ==, 0);
 
@@ -267,7 +267,7 @@ static void test_basic(QVirtioDevice *dev, QGuestAllocator *alloc,
 
 qvirtqueue_kick(dev, vq, free_head);
 
-qvirtio_wait_queue_isr(dev, vq, QVIRTIO_BLK_TIMEOUT_US);
+qvirtio_wait_used_elem(dev, vq, free_head, QVIRTIO_BLK_TIMEOUT_US);
 status = readb(req_addr + 528);
 g_assert_cmpint(status, ==, 0);
 
@@ -348,7 +348,7 @@ static void pci_indirect(void)
 free_head = qvirtqueue_add_indirect(&vqpci->vq, indirect);
 qvirtqueue_kick(&dev->vdev, &vqpci->vq, free_head);
 
-qvirtio_wait_queue_isr(&dev->vdev, &vqpci->vq,
+qvirtio_wait_used_elem(&dev->vdev, &vqpci->vq, free_head,
QVIRTIO_BLK_TIMEOUT_US);
 status = readb(req_addr + 528);
 g_assert_cmpint(status, ==, 0);
@@ -373,7 +373,7 @@ static void pci_indirect(void)
 free_head = qvirtqueue_add_indirect(&vqpci->vq, indirect);
 qvirtqueue_kick(&dev->vdev, &vqpci->vq, free_head);
 
-qvirtio_wait_queue_isr(&dev->vdev, &vqpci->vq,
+qvirtio_wait_used_elem(&dev->vdev, &vqpci->vq, free_head,
QVIRTIO_BLK_TIMEOUT_US);
 status = readb(req_addr + 528);
 g_assert_cmpint(status, ==, 0);
@@ -484,7 +484,7 @@ static void pci_msix(void)
 qvirtqueue_add(&vqpci->vq, req_addr + 528, 1, true, false);
 qvirtqueue_kick(&dev->vdev, &vqpci->vq, free_head);
 
-qvirtio_wait_queue_isr(&dev->vdev, &vqpci->vq,
+qvirtio_wait_used_elem(&dev->vdev, &vqpci->vq, free_head,
QVIRTIO_BLK_TIMEOUT_US);
 
 status = readb(req_addr + 528);
@@ -509,7 +509,7 @@ static void pci_msix(void)
 qvirtqueue_kick(&dev->vdev, &vqpci->vq, free_head);
 
 
-qvirtio_wait_queue_isr(&dev->vdev, &vqpci->vq,
+qvirtio_wait_used_elem(&dev->vdev, &vqpci->vq, free_head,
QVIRTIO_BLK_TIMEOUT_US);
 
 status = readb(req_addr + 528);
@@ -540,6 +540,8 @@ static void pci_idx(void)
 uint64_t capacity;
 uint32_t features;
 uint32_t free_head;
+uint32_t write_head;
+uint32_t desc_idx;
 uint8_t status;
 char *data;
 
@@ -581,7 +583,8 @@ static void pci_idx(void)
 qvirtqueue_add(&vqpci->vq, req_addr + 528, 1, true, false);
 qvirtqueue_kick(&dev->vdev, &vqpci->vq, free_head);
 
-qvirtio_wait_queue_isr(&dev->vdev, &vqpci->vq, QVIRTIO_BLK_TIMEOUT_US);
+qvirtio_wait_used_elem(&dev->vdev, &vqpci->vq, free_head,
+   QVIRTIO_BLK_TIMEOUT_US);
 
 /* Write request */
 req.type = VIRTIO_BLK_T_OUT;
@@ -600,6 +603,7 @@ static void pci_idx(void)
 qvirtqueue_add(&vqpci->vq, req_addr + 16, 512, false, true);
 qvirtqueue_add(&vqpci->vq, req_addr + 528, 1, true, false);
 qvirtqueue_kick(&dev->vdev, &vqpci->vq, free_head);
+write_head = free_head;
 
 /* No notification expected */
 status = qvirtio_wait_status_byte_no_isr(&dev->vdev,
@@ -625,8 +629,11 @@ static void pci_idx(void)
 
 qvirtqueue_kick(&dev->vdev, &vqpci->vq, free_head);
 
-qvirtio_wait_queue_isr(&dev->vdev, &vqpci->vq,
+/* We get just one notification for both requests */
+qvirtio_wait_used_elem(&dev->vdev, &vqpci->vq, write_head,
QVIRTIO_BLK_TIMEOUT_US);
+g_assert(qvirtqueue_get_buf(&vqpci->vq, &desc_idx));
+g_assert_cmpint(desc_idx, ==, free_head);
 
 status = readb(req_addr + 528);
 g_assert_cmpint(status, ==, 0);
-- 
2.9.4




[Qemu-block] [PATCH 3/6] tests: fix virtio-scsi-test ISR dependence

2017-06-28 Thread Stefan Hajnoczi
Use the new used ring APIs instead of assuming ISR being set means the
request has completed.

Signed-off-by: Stefan Hajnoczi 
---
 tests/virtio-scsi-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/virtio-scsi-test.c b/tests/virtio-scsi-test.c
index eff71df..87a3b6e 100644
--- a/tests/virtio-scsi-test.c
+++ b/tests/virtio-scsi-test.c
@@ -121,7 +121,7 @@ static uint8_t virtio_scsi_do_command(QVirtIOSCSI *vs, const uint8_t *cdb,
 }
 
 qvirtqueue_kick(vs->dev, vq, free_head);
-qvirtio_wait_queue_isr(vs->dev, vq, QVIRTIO_SCSI_TIMEOUT_US);
+qvirtio_wait_used_elem(vs->dev, vq, free_head, QVIRTIO_SCSI_TIMEOUT_US);
 
 response = readb(resp_addr +
  offsetof(struct virtio_scsi_cmd_resp, response));
-- 
2.9.4




[Qemu-block] [PATCH 2/6] libqos: add virtio used ring support

2017-06-28 Thread Stefan Hajnoczi
Existing tests do not touch the virtqueue used ring.  Instead they poll
the virtqueue ISR register and peek into their request's device-specific
status field.

It turns out that the virtqueue ISR register can be set to 1 more than
once for a single notification (see commit
83d768b5640946b7da55ce8335509df297e2c7cd "virtio: set ISR on dataplane
notifications").  This causes problems for tests that assume a 1:1
correspondence between the ISR being 1 and request completion.

Peeking at device-specific status fields is also problematic if the
device has no field that can be abused for EINPROGRESS polling
semantics.  This is the case if all the field's values may be set by the
device; there's no magic constant left for polling.

It's time to process the used ring for completed requests, just like a
real virtio guest driver.  This patch adds the necessary APIs.

Signed-off-by: Stefan Hajnoczi 
---
 tests/libqos/virtio.h |  6 ++
 tests/libqos/virtio.c | 60 +++
 2 files changed, 66 insertions(+)

diff --git a/tests/libqos/virtio.h b/tests/libqos/virtio.h
index 829de5e..8fbcd18 100644
--- a/tests/libqos/virtio.h
+++ b/tests/libqos/virtio.h
@@ -32,6 +32,7 @@ typedef struct QVirtQueue {
 uint32_t free_head;
 uint32_t num_free;
 uint32_t align;
+uint16_t last_used_idx;
 bool indirect;
 bool event;
 } QVirtQueue;
@@ -120,6 +121,10 @@ uint8_t qvirtio_wait_status_byte_no_isr(QVirtioDevice *d,
 QVirtQueue *vq,
 uint64_t addr,
 gint64 timeout_us);
+void qvirtio_wait_used_elem(QVirtioDevice *d,
+QVirtQueue *vq,
+uint32_t desc_idx,
+gint64 timeout_us);
 void qvirtio_wait_config_isr(QVirtioDevice *d, gint64 timeout_us);
 QVirtQueue *qvirtqueue_setup(QVirtioDevice *d,
  QGuestAllocator *alloc, uint16_t index);
@@ -135,6 +140,7 @@ uint32_t qvirtqueue_add(QVirtQueue *vq, uint64_t data, uint32_t len, bool write,
 bool next);
 uint32_t qvirtqueue_add_indirect(QVirtQueue *vq, QVRingIndirectDesc *indirect);
 void qvirtqueue_kick(QVirtioDevice *d, QVirtQueue *vq, uint32_t free_head);
+bool qvirtqueue_get_buf(QVirtQueue *vq, uint32_t *desc_idx);
 
 void qvirtqueue_set_used_event(QVirtQueue *vq, uint16_t idx);
 #endif
diff --git a/tests/libqos/virtio.c b/tests/libqos/virtio.c
index ec30cb9..9880a69 100644
--- a/tests/libqos/virtio.c
+++ b/tests/libqos/virtio.c
@@ -116,6 +116,35 @@ uint8_t qvirtio_wait_status_byte_no_isr(QVirtioDevice *d,
 return val;
 }
 
+/*
+ * qvirtio_wait_used_elem:
+ * @desc_idx: The next expected vq->desc[] index in the used ring
+ * @timeout_us: How many microseconds to wait before failing
+ *
+ * This function waits for the next completed request on the used ring.
+ */
+void qvirtio_wait_used_elem(QVirtioDevice *d,
+QVirtQueue *vq,
+uint32_t desc_idx,
+gint64 timeout_us)
+{
+gint64 start_time = g_get_monotonic_time();
+
+for (;;) {
+uint32_t got_desc_idx;
+
+clock_step(100);
+
+if (d->bus->get_queue_isr_status(d, vq) &&
+qvirtqueue_get_buf(vq, &got_desc_idx)) {
+g_assert_cmpint(got_desc_idx, ==, desc_idx);
+return;
+}
+
+g_assert(g_get_monotonic_time() - start_time <= timeout_us);
+}
+}
+
 void qvirtio_wait_config_isr(QVirtioDevice *d, gint64 timeout_us)
 {
 gint64 start_time = g_get_monotonic_time();
@@ -272,6 +301,37 @@ void qvirtqueue_kick(QVirtioDevice *d, QVirtQueue *vq, uint32_t free_head)
 }
 }
 
+/*
+ * qvirtqueue_get_buf:
+ * @desc_idx: A pointer that is filled with the vq->desc[] index, may be NULL
+ *
+ * This function gets the next used element if there is one ready.
+ *
+ * Returns: true if an element was ready, false otherwise
+ */
+bool qvirtqueue_get_buf(QVirtQueue *vq, uint32_t *desc_idx)
+{
+uint16_t idx;
+
+idx = readw(vq->used + offsetof(struct vring_used, idx));
+if (idx == vq->last_used_idx) {
+return false;
+}
+
+if (desc_idx) {
+uint64_t elem_addr;
+
+elem_addr = vq->used +
+offsetof(struct vring_used, ring) +
+(vq->last_used_idx % vq->size) *
+sizeof(struct vring_used_elem);
+*desc_idx = readl(elem_addr + offsetof(struct vring_used_elem, id));
+}
+
+vq->last_used_idx++;
+return true;
+}
+
 void qvirtqueue_set_used_event(QVirtQueue *vq, uint16_t idx)
 {
 g_assert(vq->event);
-- 
2.9.4
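
Note on the used ring handling above: a minimal sketch of the same consume logic against an in-memory toy ring, to show why a free-running uint16_t last_used_idx combined with a modulo on the ring size stays correct across wraparound. All toy_* names and the ring size are hypothetical; the real code reads guest memory with readw()/readl() and uses the vring_used layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RING_SIZE 4u   /* hypothetical ring size; a power of two */

/* Toy stand-in for the guest-memory used ring (not the libqos layout). */
struct toy_used_ring {
    uint16_t idx;                 /* free-running count written by the device */
    uint32_t id[TOY_RING_SIZE];   /* head descriptor index of each completion */
};

struct toy_vq {
    struct toy_used_ring used;
    uint16_t last_used_idx;       /* free-running count of consumed entries */
};

/* Device side: publish one completed request. */
static void toy_device_complete(struct toy_vq *vq, uint32_t desc_idx)
{
    vq->used.id[vq->used.idx % TOY_RING_SIZE] = desc_idx;
    vq->used.idx++;
}

/* Driver side: same shape as qvirtqueue_get_buf() in the patch. */
static bool toy_get_buf(struct toy_vq *vq, uint32_t *desc_idx)
{
    if (vq->used.idx == vq->last_used_idx) {
        return false;             /* nothing new on the ring */
    }
    if (desc_idx) {
        *desc_idx = vq->used.id[vq->last_used_idx % TOY_RING_SIZE];
    }
    vq->last_used_idx++;          /* uint16_t wraps, matching used.idx */
    return true;
}

/* Complete two requests, then drain; returns 0 if the order is preserved. */
static int toy_used_ring_demo(void)
{
    struct toy_vq vq = { .used = { .idx = 65535 }, .last_used_idx = 65535 };
    uint32_t got;

    toy_device_complete(&vq, 7);  /* used.idx wraps from 65535 to 0 */
    toy_device_complete(&vq, 3);
    if (!toy_get_buf(&vq, &got) || got != 7) return 1;
    if (!toy_get_buf(&vq, &got) || got != 3) return 2;
    return toy_get_buf(&vq, &got) ? 3 : 0;   /* ring must now be empty */
}
```

Draining two completions in order is also the pattern the pci_idx test relies on, where a single notification covers two requests.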




[Qemu-block] [PATCH 0/6] virtio: use ioeventfd in TCG and qtest mode

2017-06-28 Thread Stefan Hajnoczi
This patch series fixes qemu-iotests 068.  Since commit
ea4f3cebc4e0224605ab9dd9724aa4e7768fe372 ("qemu-iotests: 068: test iothread
mode") the test case has attempted to use dataplane without -M accel=kvm.
Although QEMU is capable of running TCG or qtest with emulated ioeventfd/irqfd
we haven't enabled it yet.

Unfortunately the virtio test cases fail when ioeventfd is enabled in qtest
mode.  This is because they make assumptions about virtqueue ISR signalling.
They assume that a request is completed when ISR becomes 1.  However, the ISR
can be set to 1 even though no new request has completed since commit
83d768b5640946b7da55ce8335509df297e2c7cd "virtio: set ISR on dataplane
notifications".

This issue is solved by introducing a proper qvirtqueue_get_buf() API (similar
to the Linux guest drivers) instead of making assumptions about the ISR.  Most
of the patches update the test cases to use the new API.

Stefan Hajnoczi (6):
  libqos: fix typo in virtio.h QVirtQueue->used comment
  libqos: add virtio used ring support
  tests: fix virtio-scsi-test ISR dependence
  tests: fix virtio-blk-test ISR dependence
  tests: fix virtio-net-test ISR dependence
  virtio-pci: use ioeventfd even when KVM is disabled

 tests/libqos/virtio.h|  8 ++-
 hw/virtio/virtio-pci.c   |  2 +-
 tests/libqos/virtio.c| 60 
 tests/virtio-blk-test.c  | 27 ++
 tests/virtio-net-test.c  |  6 ++---
 tests/virtio-scsi-test.c |  2 +-
 6 files changed, 89 insertions(+), 16 deletions(-)

-- 
2.9.4
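
Note on the ISR assumption described above: a minimal sketch of why "ISR became 1" and "a request completed" are different conditions. The toy device below is a hypothetical model, assuming a read-to-clear ISR flag and a used index that only advances on completion; none of the toy_* names exist in QEMU or libqos.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy device state: an ISR flag plus a used-ring index (hypothetical model). */
struct toy_dev {
    bool isr;            /* set on every notification, cleared on read */
    uint16_t used_idx;   /* incremented only when a request completes */
};

/* Notification without a new completion, as in the dataplane case. */
static void toy_notify_only(struct toy_dev *d) { d->isr = true; }

/* A real completion: advance the used index and notify. */
static void toy_complete(struct toy_dev *d) { d->used_idx++; d->isr = true; }

/* Assumption of this model: reading the ISR clears it. */
static bool toy_read_isr(struct toy_dev *d)
{
    bool was_set = d->isr;
    d->isr = false;
    return was_set;
}

/* Old style: treat "ISR set" as "request done". */
static bool toy_done_by_isr(struct toy_dev *d)
{
    return toy_read_isr(d);
}

/* New style: additionally require a new used-ring entry. */
static bool toy_done_by_used_ring(struct toy_dev *d, uint16_t *last_used_idx)
{
    if (toy_read_isr(d) && d->used_idx != *last_used_idx) {
        (*last_used_idx)++;
        return true;
    }
    return false;
}

/* Returns 0 when only the used-ring check resists a spurious notification. */
static int toy_isr_demo(void)
{
    struct toy_dev d = { false, 0 };
    uint16_t last = 0;

    toy_notify_only(&d);                              /* nothing completed */
    if (!toy_done_by_isr(&d)) return 1;               /* ISR check is fooled */
    toy_notify_only(&d);
    if (toy_done_by_used_ring(&d, &last)) return 2;   /* used ring check is not */
    toy_complete(&d);
    return toy_done_by_used_ring(&d, &last) ? 0 : 3;
}
```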




[Qemu-block] [PATCH 1/6] libqos: fix typo in virtio.h QVirtQueue->used comment

2017-06-28 Thread Stefan Hajnoczi
Signed-off-by: Stefan Hajnoczi 
---
 tests/libqos/virtio.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/libqos/virtio.h b/tests/libqos/virtio.h
index 3397a08..829de5e 100644
--- a/tests/libqos/virtio.h
+++ b/tests/libqos/virtio.h
@@ -26,7 +26,7 @@ typedef struct QVirtioDevice {
 typedef struct QVirtQueue {
 uint64_t desc; /* This points to an array of struct vring_desc */
 uint64_t avail; /* This points to a struct vring_avail */
-uint64_t used; /* This points to a struct vring_desc */
+uint64_t used; /* This points to a struct vring_used */
 uint16_t index;
 uint32_t size;
 uint32_t free_head;
-- 
2.9.4




[Qemu-block] [PATCH v3 10/11] dirty-bitmap: Switch bdrv_set_dirty() to bytes

2017-06-28 Thread Eric Blake
Both callers already had bytes available, but were scaling to
sectors.  Move the scaling to internal code.  In the case of
bdrv_aligned_pwritev(), we are now passing the exact offset
rather than a rounded sector-aligned value, but that's okay
as long as dirty bitmap widens start/bytes to granularity
boundaries.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v3: rebase to lock context changes, R-b kept
v2: no change
---
 include/block/block_int.h | 2 +-
 block/dirty-bitmap.c  | 7 ---
 block/io.c| 6 ++
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 15fa602..36b2153 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -953,7 +953,7 @@ void blk_dev_eject_request(BlockBackend *blk, bool force);
 bool blk_dev_is_tray_open(BlockBackend *blk);
 bool blk_dev_is_medium_locked(BlockBackend *blk);

-void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector, int64_t nr_sect);
+void bdrv_set_dirty(BlockDriverState *bs, int64_t offset, int64_t bytes);
 bool bdrv_requests_pending(BlockDriverState *bs);

 void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out);
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 95716be..e353b69 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -550,10 +550,10 @@ void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap)
 hbitmap_deserialize_finish(bitmap->bitmap);
 }

-void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
-int64_t nr_sectors)
+void bdrv_set_dirty(BlockDriverState *bs, int64_t offset, int64_t bytes)
 {
 BdrvDirtyBitmap *bitmap;
+int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);

 if (QLIST_EMPTY(&bs->dirty_bitmaps)) {
 return;
@@ -564,7 +564,8 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
 if (!bdrv_dirty_bitmap_enabled(bitmap)) {
 continue;
 }
-hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
+hbitmap_set(bitmap->bitmap, offset >> BDRV_SECTOR_BITS,
+end_sector - (offset >> BDRV_SECTOR_BITS));
 }
 bdrv_dirty_bitmaps_unlock(bs);
 }
diff --git a/block/io.c b/block/io.c
index 061a162..c7ffa95 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1310,7 +1310,6 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
 bool waited;
 int ret;

-int64_t start_sector = offset >> BDRV_SECTOR_BITS;
 int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
 uint64_t bytes_remaining = bytes;
 int max_transfer;
@@ -1381,7 +1380,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
 bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);

 atomic_inc(&bs->write_gen);
-bdrv_set_dirty(bs, start_sector, end_sector - start_sector);
+bdrv_set_dirty(bs, offset, bytes);

 stat64_max(&bs->wr_highest_offset, offset + bytes);

@@ -2362,8 +2361,7 @@ int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
 ret = 0;
 out:
 atomic_inc(&bs->write_gen);
-bdrv_set_dirty(bs, req.offset >> BDRV_SECTOR_BITS,
-   req.bytes >> BDRV_SECTOR_BITS);
+bdrv_set_dirty(bs, req.offset, req.bytes);
 tracked_request_end(&req);
 bdrv_dec_in_flight(bs);
 return ret;
-- 
2.9.4
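
Note on the scaling moved into bdrv_set_dirty() above: a minimal sketch of how a byte range is widened to the sectors it touches, using the same shift and round-up as the patch. The TOY_* macros and toy_bytes_to_sectors() are hypothetical stand-ins for BDRV_SECTOR_BITS, BDRV_SECTOR_SIZE, DIV_ROUND_UP and the inline arithmetic; a 512-byte sector is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SECTOR_BITS 9
#define TOY_SECTOR_SIZE (1LL << TOY_SECTOR_BITS)
#define TOY_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

struct toy_sector_range {
    int64_t start;   /* first sector touched */
    int64_t count;   /* number of sectors touched */
};

/* Widen a byte range to the sectors it touches, as the patch does. */
static struct toy_sector_range toy_bytes_to_sectors(int64_t offset, int64_t bytes)
{
    int64_t end_sector = TOY_DIV_ROUND_UP(offset + bytes, TOY_SECTOR_SIZE);
    struct toy_sector_range r;

    r.start = offset >> TOY_SECTOR_BITS;   /* round the start down */
    r.count = end_sector - r.start;        /* round the end up */
    return r;
}
```

For an unaligned request at offset 1000 with 100 bytes, this yields sector 1 and a count of 2, since the range straddles the boundary at byte 1024. Shifting the byte count on its own (100 >> 9) would give 0 sectors, which is why rounding the end up matters for sub-sector requests.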




[Qemu-block] [PATCH v3 08/11] dirty-bitmap: Change bdrv_[re]set_dirty_bitmap() to use bytes

2017-06-28 Thread Eric Blake
Some of the callers were already scaling bytes to sectors; others
can be easily converted to pass byte offsets, all in our shift
towards a consistent byte interface everywhere.  Making the change
will also make it easier to write the hold-out callers to use bytes
rather than sectors for their iterations; it also makes it easier
for a future dirty-bitmap patch to offload scaling over to the
internal hbitmap.  Although all callers happen to pass
sector-aligned values, make the internal scaling robust to any
sub-sector requests.

Signed-off-by: Eric Blake 

---
v3: rebase to addition of _locked interfaces; complex enough that I
dropped R-b
v2: no change
---
 include/block/dirty-bitmap.h |  8 
 block/dirty-bitmap.c | 22 ++
 block/mirror.c   | 16 
 migration/block.c|  7 +--
 4 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 792544b..9662f6f 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -35,9 +35,9 @@ bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
 const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
 DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap);
 void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
-   int64_t cur_sector, int64_t nr_sectors);
+   int64_t offset, int64_t bytes);
 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
- int64_t cur_sector, int64_t nr_sectors);
+ int64_t offset, int64_t bytes);
 BdrvDirtyBitmapIter *bdrv_dirty_meta_iter_new(BdrvDirtyBitmap *bitmap);
 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter);
@@ -62,9 +62,9 @@ void bdrv_dirty_bitmap_unlock(BdrvDirtyBitmap *bitmap);
 bool bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
int64_t offset);
 void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
-  int64_t cur_sector, int64_t nr_sectors);
+  int64_t offset, int64_t bytes);
 void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
-int64_t cur_sector, int64_t nr_sectors);
+int64_t offset, int64_t bytes);
 int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter);
 void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *hbi, int64_t offset);
 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 84c6102..95716be 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -454,33 +454,39 @@ int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter)

 /* Called within bdrv_dirty_bitmap_lock..unlock */
 void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
-  int64_t cur_sector, int64_t nr_sectors)
+  int64_t offset, int64_t bytes)
 {
+int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
+
 assert(bdrv_dirty_bitmap_enabled(bitmap));
-hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
+hbitmap_set(bitmap->bitmap, offset >> BDRV_SECTOR_BITS,
+end_sector - (offset >> BDRV_SECTOR_BITS));
 }

 void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
-   int64_t cur_sector, int64_t nr_sectors)
+   int64_t offset, int64_t bytes)
 {
 bdrv_dirty_bitmap_lock(bitmap);
-bdrv_set_dirty_bitmap_locked(bitmap, cur_sector, nr_sectors);
+bdrv_set_dirty_bitmap_locked(bitmap, offset, bytes);
 bdrv_dirty_bitmap_unlock(bitmap);
 }

 /* Called within bdrv_dirty_bitmap_lock..unlock */
 void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
-int64_t cur_sector, int64_t nr_sectors)
+int64_t offset, int64_t bytes)
 {
+int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
+
 assert(bdrv_dirty_bitmap_enabled(bitmap));
-hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
+hbitmap_reset(bitmap->bitmap, offset >> BDRV_SECTOR_BITS,
+  end_sector - (offset >> BDRV_SECTOR_BITS));
 }

 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
- int64_t cur_sector, int64_t nr_sectors)
+ int64_t offset, int64_t bytes)
 {
 bdrv_dirty_bitmap_lock(bitmap);
-bdrv_reset_dirty_bitmap_locked(bitmap, cur_sector, nr_sectors);
+bdrv_reset_dirty_bitmap_locked(bitmap, offset, bytes);
 bdrv_dirty_bitmap_unlock(bitmap);
 }

diff --git a/block/mirror.c b/block/mirror.c
index 19331c0..b74d6e0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -141,8 +141,7 @@ static void mirror_write_complete(void *opaque, int ret)
 if

[Qemu-block] [PATCH v3 11/11] dirty-bitmap: Convert internal hbitmap size/granularity

2017-06-28 Thread Eric Blake
Now that all callers are using byte-based interfaces, there's no
reason for our internal hbitmap to remain with sector-based
granularity.  It also simplifies our internal scaling, since we
already know that hbitmap widens requests out to granularity
boundaries.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v2: no change
---
 block/dirty-bitmap.c | 37 -
 1 file changed, 12 insertions(+), 25 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index e353b69..b0af91f 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -38,7 +38,7 @@
  */
 struct BdrvDirtyBitmap {
 QemuMutex *mutex;
-HBitmap *bitmap;/* Dirty sector bitmap implementation */
+HBitmap *bitmap;/* Dirty bitmap implementation */
 HBitmap *meta;  /* Meta dirty bitmap */
 BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
 char *name; /* Optional non-empty unique ID */
@@ -118,12 +118,7 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 }
 bitmap = g_new0(BdrvDirtyBitmap, 1);
 bitmap->mutex = &bs->dirty_bitmap_mutex;
-/*
- * TODO - let hbitmap track full granularity. For now, it is tracking
- * only sector granularity, as a shortcut for our iterators.
- */
-bitmap->bitmap = hbitmap_alloc(bitmap_size >> BDRV_SECTOR_BITS,
-   ctz32(granularity) - BDRV_SECTOR_BITS);
+bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(granularity));
 bitmap->size = bitmap_size;
 bitmap->name = g_strdup(name);
 bitmap->disabled = false;
@@ -293,7 +288,7 @@ void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
 QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
 assert(!bdrv_dirty_bitmap_frozen(bitmap));
 assert(!bitmap->active_iterators);
-hbitmap_truncate(bitmap->bitmap, size >> BDRV_SECTOR_BITS);
+hbitmap_truncate(bitmap->bitmap, size);
 bitmap->size = size;
 }
 bdrv_dirty_bitmaps_unlock(bs);
@@ -388,7 +383,7 @@ bool bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
int64_t offset)
 {
 if (bitmap) {
-return hbitmap_get(bitmap->bitmap, offset >> BDRV_SECTOR_BITS);
+return hbitmap_get(bitmap->bitmap, offset);
 } else {
 return false;
 }
@@ -416,7 +411,7 @@ uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs)

 uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
 {
-return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
+return 1U << hbitmap_granularity(bitmap->bitmap);
 }

 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap)
@@ -449,18 +444,15 @@ void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter)

 int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter)
 {
-return hbitmap_iter_next(&iter->hbi) * BDRV_SECTOR_SIZE;
+return hbitmap_iter_next(&iter->hbi);
 }

 /* Called within bdrv_dirty_bitmap_lock..unlock */
 void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
   int64_t offset, int64_t bytes)
 {
-int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
-
 assert(bdrv_dirty_bitmap_enabled(bitmap));
-hbitmap_set(bitmap->bitmap, offset >> BDRV_SECTOR_BITS,
-end_sector - (offset >> BDRV_SECTOR_BITS));
+hbitmap_set(bitmap->bitmap, offset, bytes);
 }

 void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
@@ -475,11 +467,8 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
 void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
 int64_t offset, int64_t bytes)
 {
-int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
-
 assert(bdrv_dirty_bitmap_enabled(bitmap));
-hbitmap_reset(bitmap->bitmap, offset >> BDRV_SECTOR_BITS,
-  end_sector - (offset >> BDRV_SECTOR_BITS));
+hbitmap_reset(bitmap->bitmap, offset, bytes);
 }

 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
@@ -498,7 +487,7 @@ void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
 hbitmap_reset_all(bitmap->bitmap);
 } else {
 HBitmap *backup = bitmap->bitmap;
-bitmap->bitmap = hbitmap_alloc(bitmap->size >> BDRV_SECTOR_BITS,
+bitmap->bitmap = hbitmap_alloc(bitmap->size,
hbitmap_granularity(backup));
 *out = backup;
 }
@@ -553,7 +542,6 @@ void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap)
 void bdrv_set_dirty(BlockDriverState *bs, int64_t offset, int64_t bytes)
 {
 BdrvDirtyBitmap *bitmap;
-int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);

 if (QLIST_EMPTY(&bs->dirty_bitmaps)) {
 return;
@@ -564,8 +552,7 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t offset, int64_t bytes)
 if (!bdrv_dirty_bitmap_enabled(bitmap

[Qemu-block] [PATCH v3 06/11] dirty-bitmap: Change bdrv_get_dirty_count() to report bytes

2017-06-28 Thread Eric Blake
Thanks to recent cleanups, all callers were scaling a return value
of sectors into bytes; do the scaling internally instead.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 
Reviewed-by: Juan Quintela 

---
v3: no change, add R-b
v2: no change
---
 block/dirty-bitmap.c |  4 ++--
 block/mirror.c   | 13 +
 migration/block.c|  2 +-
 3 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 0029303..41cd41f 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -369,7 +369,7 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
 BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
 BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
-info->count = bdrv_get_dirty_count(bm) << BDRV_SECTOR_BITS;
+info->count = bdrv_get_dirty_count(bm);
 info->granularity = bdrv_dirty_bitmap_granularity(bm);
 info->has_name = !!bm->name;
 info->name = g_strdup(bm->name);
@@ -573,7 +573,7 @@ void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *iter, int64_t offset)

 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap)
 {
-return hbitmap_count(bitmap->bitmap);
+return hbitmap_count(bitmap->bitmap) << BDRV_SECTOR_BITS;
 }

 int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap)
diff --git a/block/mirror.c b/block/mirror.c
index 0cde201..6ea6a27 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -806,11 +806,10 @@ static void coroutine_fn mirror_run(void *opaque)

 cnt = bdrv_get_dirty_count(s->dirty_bitmap);
 /* s->common.offset contains the number of bytes already processed so
- * far, cnt is the number of dirty sectors remaining and
+ * far, cnt is the number of dirty bytes remaining and
  * s->bytes_in_flight is the number of bytes currently being
  * processed; together those are the current total operation length */
-s->common.len = s->common.offset + s->bytes_in_flight +
-cnt * BDRV_SECTOR_SIZE;
+s->common.len = s->common.offset + s->bytes_in_flight + cnt;

 /* Note that even when no rate limit is applied we need to yield
  * periodically with no pending I/O so that bdrv_drain_all() returns.
@@ -822,8 +821,7 @@ static void coroutine_fn mirror_run(void *opaque)
 s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
 if (s->in_flight >= MAX_IN_FLIGHT || s->buf_free_count == 0 ||
 (cnt == 0 && s->in_flight > 0)) {
-trace_mirror_yield(s, cnt * BDRV_SECTOR_SIZE,
-   s->buf_free_count, s->in_flight);
+trace_mirror_yield(s, cnt, s->buf_free_count, s->in_flight);
 mirror_wait_for_io(s);
 continue;
 } else if (cnt != 0) {
@@ -864,7 +862,7 @@ static void coroutine_fn mirror_run(void *opaque)
  * whether to switch to target check one last time if I/O has
  * come in the meanwhile, and if not flush the data to disk.
  */
-trace_mirror_before_drain(s, cnt * BDRV_SECTOR_SIZE);
+trace_mirror_before_drain(s, cnt);

 bdrv_drained_begin(bs);
 cnt = bdrv_get_dirty_count(s->dirty_bitmap);
@@ -883,8 +881,7 @@ static void coroutine_fn mirror_run(void *opaque)
 }

 ret = 0;
-trace_mirror_before_sleep(s, cnt * BDRV_SECTOR_SIZE,
-  s->synced, delay_ns);
+trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
 if (!s->synced) {
 block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
 if (block_job_is_cancelled(&s->common)) {
diff --git a/migration/block.c b/migration/block.c
index 4a48e5c..d14f745 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -664,7 +664,7 @@ static int64_t get_remaining_dirty(void)
 aio_context_release(blk_get_aio_context(bmds->blk));
 }

-return dirty << BDRV_SECTOR_BITS;
+return dirty;
 }


-- 
2.9.4




[Qemu-block] [PATCH v3 07/11] dirty-bitmap: Change bdrv_get_dirty_locked() to take bytes

2017-06-28 Thread Eric Blake
Half the callers were already scaling bytes to sectors; the other
half can eventually be simplified to use byte iteration.  Both
callers were already using the result as a bool, so make that
explicit.  Making the change also makes it easier for a future
dirty-bitmap patch to offload scaling over to the internal hbitmap.

Remember, asking whether a byte is dirty is effectively asking
whether the entire granularity containing the byte is dirty, since
we only track dirtiness by granularity.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 
Reviewed-by: Juan Quintela 

---
v3: rebase to _locked rename was straightforward enough that R-b kept
v2: tweak commit message, no code change
---
 include/block/dirty-bitmap.h | 4 ++--
 block/dirty-bitmap.c | 8 
 block/mirror.c   | 3 +--
 migration/block.c| 3 ++-
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index bc34832..792544b 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -59,8 +59,8 @@ void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap);
 /* Functions that require manual locking.  */
 void bdrv_dirty_bitmap_lock(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_unlock(BdrvDirtyBitmap *bitmap);
-int bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
-  int64_t sector);
+bool bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
+   int64_t offset);
 void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
   int64_t cur_sector, int64_t nr_sectors);
 void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 41cd41f..84c6102 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -384,13 +384,13 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 }

 /* Called within bdrv_dirty_bitmap_lock..unlock */
-int bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
-  int64_t sector)
+bool bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
+   int64_t offset)
 {
 if (bitmap) {
-return hbitmap_get(bitmap->bitmap, sector);
+return hbitmap_get(bitmap->bitmap, offset >> BDRV_SECTOR_BITS);
 } else {
-return 0;
+return false;
 }
 }

diff --git a/block/mirror.c b/block/mirror.c
index 6ea6a27..19331c0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -362,8 +362,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 int64_t next_offset = offset + nb_chunks * s->granularity;
 int64_t next_chunk = next_offset / s->granularity;
 if (next_offset >= s->bdev_length ||
-!bdrv_get_dirty_locked(source, s->dirty_bitmap,
-   next_offset >> BDRV_SECTOR_BITS)) {
+!bdrv_get_dirty_locked(source, s->dirty_bitmap, next_offset)) {
 break;
 }
 if (test_bit(next_chunk, s->in_flight_bitmap)) {
diff --git a/migration/block.c b/migration/block.c
index d14f745..d6557a1 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -527,7 +527,8 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
 blk_mig_unlock();
 }
 bdrv_dirty_bitmap_lock(bmds->dirty_bitmap);
-if (bdrv_get_dirty_locked(bs, bmds->dirty_bitmap, sector)) {
+if (bdrv_get_dirty_locked(bs, bmds->dirty_bitmap,
+  sector * BDRV_SECTOR_SIZE)) {
 if (total_sectors - sector < BDRV_SECTORS_PER_DIRTY_CHUNK) {
 nr_sectors = total_sectors - sector;
 } else {
-- 
2.9.4
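
A note on the granularity remark in the commit message above: what follows is a minimal standalone sketch, not QEMU code, of why a byte-based query effectively answers for the whole chunk that contains the byte. SketchBitmap, sketch_set_dirty() and sketch_get_dirty() are hypothetical names invented for illustration, and the 64-chunk limit is an assumption made only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a dirty bitmap: one bit per granularity chunk. */
typedef struct {
    uint64_t bits;         /* bit i set => chunk i is dirty (at most 64 chunks) */
    uint32_t granularity;  /* bytes per chunk, assumed to be a power of two */
} SketchBitmap;

/* Mark every chunk touched by [offset, offset + bytes) as dirty. */
static void sketch_set_dirty(SketchBitmap *bm, int64_t offset, int64_t bytes)
{
    assert(offset >= 0 && bytes > 0);
    int64_t first = offset / bm->granularity;
    int64_t last = (offset + bytes - 1) / bm->granularity;
    assert(last < 64);
    for (int64_t i = first; i <= last; i++) {
        bm->bits |= UINT64_C(1) << i;
    }
}

/* Byte-based query: the answer covers the whole chunk containing offset. */
static bool sketch_get_dirty(const SketchBitmap *bm, int64_t offset)
{
    assert(offset >= 0);
    int64_t chunk = offset / bm->granularity;
    assert(chunk < 64);
    return (bm->bits >> chunk) & 1;
}
```

With a 64 KiB granularity, dirtying the single byte at offset 70000 makes every offset in [65536, 131072) report dirty, while offsets outside that chunk stay clean.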




[Qemu-block] [PATCH v3 09/11] mirror: Switch mirror_dirty_init() to byte-based iteration

2017-06-28 Thread Eric Blake
Now that we have adjusted the majority of the calls this function
makes to be byte-based, it is easier to read the code if it makes
passes over the image using bytes rather than sectors.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v2: no change
---
 block/mirror.c | 35 ++-
 1 file changed, 14 insertions(+), 21 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index b74d6e0..6f54dcc 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -613,15 +613,13 @@ static void mirror_throttle(MirrorBlockJob *s)

 static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 {
-int64_t sector_num, end;
+int64_t offset;
 BlockDriverState *base = s->base;
 BlockDriverState *bs = s->source;
 BlockDriverState *target_bs = blk_bs(s->target);
-int ret, n;
+int ret;
 int64_t count;

-end = s->bdev_length / BDRV_SECTOR_SIZE;
-
 if (base == NULL && !bdrv_has_zero_init(target_bs)) {
 if (!bdrv_can_write_zeroes_with_unmap(target_bs)) {
 bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, s->bdev_length);
@@ -629,9 +627,9 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 }

 s->initial_zeroing_ongoing = true;
-for (sector_num = 0; sector_num < end; ) {
-int nb_sectors = MIN(end - sector_num,
-QEMU_ALIGN_DOWN(INT_MAX, s->granularity) >> BDRV_SECTOR_BITS);
+for (offset = 0; offset < s->bdev_length; ) {
+int bytes = MIN(s->bdev_length - offset,
+QEMU_ALIGN_DOWN(INT_MAX, s->granularity));

 mirror_throttle(s);

@@ -647,9 +645,8 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 continue;
 }

-mirror_do_zero_or_discard(s, sector_num * BDRV_SECTOR_SIZE,
-  nb_sectors * BDRV_SECTOR_SIZE, false);
-sector_num += nb_sectors;
+mirror_do_zero_or_discard(s, offset, bytes, false);
+offset += bytes;
 }

 mirror_wait_for_all_io(s);
@@ -657,10 +654,10 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 }

 /* First part, loop on the sectors and initialize the dirty bitmap.  */
-for (sector_num = 0; sector_num < end; ) {
+for (offset = 0; offset < s->bdev_length; ) {
 /* Just to make sure we are not exceeding int limit. */
-int nb_sectors = MIN(INT_MAX >> BDRV_SECTOR_BITS,
- end - sector_num);
+int bytes = MIN(s->bdev_length - offset,
+QEMU_ALIGN_DOWN(INT_MAX, s->granularity));

 mirror_throttle(s);

@@ -668,20 +665,16 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 return 0;
 }

-ret = bdrv_is_allocated_above(bs, base, sector_num * BDRV_SECTOR_SIZE,
-  nb_sectors * BDRV_SECTOR_SIZE, &count);
+ret = bdrv_is_allocated_above(bs, base, offset, bytes, &count);
 if (ret < 0) {
 return ret;
 }

-n = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
-assert(n > 0);
+count = QEMU_ALIGN_UP(count, BDRV_SECTOR_SIZE);
 if (ret == 1) {
-bdrv_set_dirty_bitmap(s->dirty_bitmap,
-  sector_num * BDRV_SECTOR_SIZE,
-  n * BDRV_SECTOR_SIZE);
+bdrv_set_dirty_bitmap(s->dirty_bitmap, offset, count);
 }
-sector_num += n;
+offset += count;
 }
 return 0;
 }
-- 
2.9.4
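
The byte-based loop shape used in the patch above can be tried in isolation. This is a rough sketch under stated assumptions, not the QEMU implementation: SKETCH_ALIGN_DOWN, SKETCH_MIN and sketch_walk() are hypothetical stand-ins, and the body of the loop only records the chunk size where a real caller would issue its request.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Round n down to a multiple of m (same idea as QEMU_ALIGN_DOWN). */
#define SKETCH_ALIGN_DOWN(n, m) ((n) / (m) * (m))
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Walk [0, length) in chunks that each fit in an int and are a multiple
 * of the granularity (except possibly the final, shorter chunk).
 * Returns the number of chunks visited; *last_chunk receives the size of
 * the final chunk and is left untouched when length is 0.
 */
static int64_t sketch_walk(int64_t length, int64_t granularity,
                           int64_t *last_chunk)
{
    int64_t offset;
    int64_t chunks = 0;
    const int64_t max_chunk = SKETCH_ALIGN_DOWN((int64_t)INT_MAX, granularity);

    assert(length >= 0 && granularity > 0 && max_chunk > 0);
    for (offset = 0; offset < length; ) {
        int64_t bytes = SKETCH_MIN(length - offset, max_chunk);

        /* A real caller would handle [offset, offset + bytes) here. */
        *last_chunk = bytes;
        offset += bytes;
        chunks++;
    }
    return chunks;
}
```

Assuming a 32-bit int and a 64 KiB granularity, the per-iteration cap is 2147418112 bytes, so a 5 GiB image is covered in three iterations without any sector arithmetic.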




[Qemu-block] [PATCH v3 04/11] dirty-bitmap: Set iterator start by offset, not sector

2017-06-28 Thread Eric Blake
All callers to bdrv_dirty_iter_new() passed 0 for their initial
starting point, drop that parameter.

All callers to bdrv_set_dirty_iter() were scaling an offset to
a sector number; move the scaling to occur internally to dirty
bitmap code instead.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v2: no change
---
 include/block/dirty-bitmap.h | 5 ++---
 block/backup.c   | 5 ++---
 block/dirty-bitmap.c | 9 -
 block/mirror.c   | 4 ++--
 4 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 35a0a83..bc34832 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -39,8 +39,7 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
  int64_t cur_sector, int64_t nr_sectors);
 BdrvDirtyBitmapIter *bdrv_dirty_meta_iter_new(BdrvDirtyBitmap *bitmap);
-BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
- uint64_t first_sector);
+BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter);

 uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
@@ -67,7 +66,7 @@ void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
 void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
 int64_t cur_sector, int64_t nr_sectors);
 int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter);
-void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *hbi, int64_t sector_num);
+void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *hbi, int64_t offset);
 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
 int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
diff --git a/block/backup.c b/block/backup.c
index b2048bf..2a94e8b 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -372,7 +372,7 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)

 granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
 clusters_per_iter = MAX((granularity / job->cluster_size), 1);
-dbi = bdrv_dirty_iter_new(job->sync_bitmap, 0);
+dbi = bdrv_dirty_iter_new(job->sync_bitmap);

 /* Find the next dirty sector(s) */
 while ((offset = bdrv_dirty_iter_next(dbi) * BDRV_SECTOR_SIZE) >= 0) {
@@ -403,8 +403,7 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
 /* If the bitmap granularity is smaller than the backup granularity,
  * we need to advance the iterator pointer to the next cluster. */
 if (granularity < job->cluster_size) {
-bdrv_set_dirty_iter(dbi,
-cluster * job->cluster_size / BDRV_SECTOR_SIZE);
+bdrv_set_dirty_iter(dbi, cluster * job->cluster_size);
 }

 last_cluster = cluster - 1;
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index b2b9342..faf5a4c 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -419,11 +419,10 @@ uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
 return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
 }

-BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
- uint64_t first_sector)
+BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap)
 {
 BdrvDirtyBitmapIter *iter = g_new(BdrvDirtyBitmapIter, 1);
-hbitmap_iter_init(&iter->hbi, bitmap->bitmap, first_sector);
+hbitmap_iter_init(&iter->hbi, bitmap->bitmap, 0);
 iter->bitmap = bitmap;
 bitmap->active_iterators++;
 return iter;
@@ -567,9 +566,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
 /**
  * Advance a BdrvDirtyBitmapIter to an arbitrary offset.
  */
-void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *iter, int64_t sector_num)
+void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *iter, int64_t offset)
 {
-hbitmap_iter_init(&iter->hbi, iter->hbi.hb, sector_num);
+hbitmap_iter_init(&iter->hbi, iter->hbi.hb, offset >> BDRV_SECTOR_BITS);
 }

 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap)
diff --git a/block/mirror.c b/block/mirror.c
index 8c3752b..3869450 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -373,7 +373,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 next_dirty = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
 if (next_dirty > next_offset || next_dirty < 0) {
 /* The bitmap iterator's cache is stale, refresh it */
-bdrv_set_dirty_iter(s->dbi, next_offset >> BDRV_SECTOR_BITS);
+bdrv_set_dirty_iter(s->dbi, next_offset);
 next_dirty = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
 }
 assert(next_dirty == next_offset);
@@ -791,7 +791,7 @@ static void coroutine_fn mirror_run(void *opaque)
 }

[Qemu-block] [PATCH v3 02/11] dirty-bitmap: Drop unused functions

2017-06-28 Thread Eric Blake
We had several functions that no one is currently using, and which
use sector-based interfaces.  I'm trying to convert towards byte-based
interfaces, so it's easier to just drop the unused functions:

bdrv_dirty_bitmap_size
bdrv_dirty_bitmap_get_meta
bdrv_dirty_bitmap_get_meta_locked
bdrv_dirty_bitmap_reset_meta
bdrv_dirty_bitmap_meta_granularity

Vladimir may re-add bdrv_dirty_bitmap_size() for persistent
bitmaps, but has agreed to do so with byte rather than sector
access at the point where it is needed.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v3: rebase to upstream changes (bdrv_dirty_bitmap_get_meta_locked was
added in b64bd51e with no clients), kept R-b
v2: tweak commit message based on review, no code change
---
 include/block/dirty-bitmap.h | 11 --
 block/dirty-bitmap.c | 49 
 2 files changed, 60 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index ad6558a..35a0a83 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -30,25 +30,14 @@ void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
 uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
-uint32_t bdrv_dirty_bitmap_meta_granularity(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
 const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
-int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap);
 DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap);
 void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int64_t nr_sectors);
 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
  int64_t cur_sector, int64_t nr_sectors);
-int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
-   BdrvDirtyBitmap *bitmap, int64_t sector,
-   int nb_sectors);
-int bdrv_dirty_bitmap_get_meta_locked(BlockDriverState *bs,
-  BdrvDirtyBitmap *bitmap, int64_t sector,
-  int nb_sectors);
-void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
-  BdrvDirtyBitmap *bitmap, int64_t sector,
-  int nb_sectors);
 BdrvDirtyBitmapIter *bdrv_dirty_meta_iter_new(BdrvDirtyBitmap *bitmap);
 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
  uint64_t first_sector);
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index f17fc14..13febf2 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -161,50 +161,6 @@ void bdrv_release_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap)
 qemu_mutex_unlock(bitmap->mutex);
 }

-int bdrv_dirty_bitmap_get_meta_locked(BlockDriverState *bs,
-  BdrvDirtyBitmap *bitmap, int64_t sector,
-  int nb_sectors)
-{
-uint64_t i;
-int sectors_per_bit = 1 << hbitmap_granularity(bitmap->meta);
-
-/* To optimize: we can make hbitmap to internally check the range in a
- * coarse level, or at least do it word by word. */
-for (i = sector; i < sector + nb_sectors; i += sectors_per_bit) {
-if (hbitmap_get(bitmap->meta, i)) {
-return true;
-}
-}
-return false;
-}
-
-int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
-   BdrvDirtyBitmap *bitmap, int64_t sector,
-   int nb_sectors)
-{
-bool dirty;
-
-qemu_mutex_lock(bitmap->mutex);
-dirty = bdrv_dirty_bitmap_get_meta_locked(bs, bitmap, sector, nb_sectors);
-qemu_mutex_unlock(bitmap->mutex);
-
-return dirty;
-}
-
-void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
-  BdrvDirtyBitmap *bitmap, int64_t sector,
-  int nb_sectors)
-{
-qemu_mutex_lock(bitmap->mutex);
-hbitmap_reset(bitmap->meta, sector, nb_sectors);
-qemu_mutex_unlock(bitmap->mutex);
-}
-
-int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap)
-{
-return bitmap->size;
-}
-
 const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap)
 {
 return bitmap->name;
@@ -460,11 +416,6 @@ uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
 return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
 }

-uint32_t bdrv_dirty_bitmap_meta_granularity(BdrvDirtyBitmap *bitmap)
-{
-return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->meta);
-}
-
 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
  uint64_t first_sector)
 {
-- 
2.9.4




[Qemu-block] [PATCH v3 05/11] dirty-bitmap: Change bdrv_dirty_iter_next() to report byte offset

2017-06-28 Thread Eric Blake
Thanks to recent cleanups, all callers were scaling a return value
of sectors into bytes; do the scaling internally instead.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v2: no change
---
 block/backup.c   | 2 +-
 block/dirty-bitmap.c | 2 +-
 block/mirror.c   | 8 
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 2a94e8b..18389cd 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -375,7 +375,7 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
 dbi = bdrv_dirty_iter_new(job->sync_bitmap);

 /* Find the next dirty sector(s) */
-while ((offset = bdrv_dirty_iter_next(dbi) * BDRV_SECTOR_SIZE) >= 0) {
+while ((offset = bdrv_dirty_iter_next(dbi)) >= 0) {
 cluster = offset / job->cluster_size;

 /* Fake progress updates for any clusters we skipped */
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index faf5a4c..0029303 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -449,7 +449,7 @@ void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter)

 int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter)
 {
-return hbitmap_iter_next(&iter->hbi);
+return hbitmap_iter_next(&iter->hbi) * BDRV_SECTOR_SIZE;
 }

 /* Called within bdrv_dirty_bitmap_lock..unlock */
diff --git a/block/mirror.c b/block/mirror.c
index 3869450..0cde201 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -336,10 +336,10 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 int max_io_bytes = MAX(s->buf_size / MAX_IN_FLIGHT, MAX_IO_BYTES);

 bdrv_dirty_bitmap_lock(s->dirty_bitmap);
-offset = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
+offset = bdrv_dirty_iter_next(s->dbi);
 if (offset < 0) {
 bdrv_set_dirty_iter(s->dbi, 0);
-offset = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
+offset = bdrv_dirty_iter_next(s->dbi);
 trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap) *
   BDRV_SECTOR_SIZE);
 assert(offset >= 0);
@@ -370,11 +370,11 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 break;
 }

-next_dirty = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
+next_dirty = bdrv_dirty_iter_next(s->dbi);
 if (next_dirty > next_offset || next_dirty < 0) {
 /* The bitmap iterator's cache is stale, refresh it */
 bdrv_set_dirty_iter(s->dbi, next_offset);
-next_dirty = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
+next_dirty = bdrv_dirty_iter_next(s->dbi);
 }
 assert(next_dirty == next_offset);
 nb_chunks++;
-- 
2.9.4
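
One detail of the patch above is easy to miss: callers keep testing the scaled result with `< 0`. The sketch below illustrates why that still works, assuming the underlying sector iterator signals exhaustion with -1, as the callers' `< 0` checks suggest. SketchIter and both sketch_iter_* functions are hypothetical and are not part of QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512

/* Hypothetical iterator over an ascending list of dirty sector numbers. */
typedef struct {
    const int64_t *sectors;
    size_t count;
    size_t pos;
} SketchIter;

/* Sector-based step: next dirty sector, or -1 once the list is exhausted. */
static int64_t sketch_iter_next_sector(SketchIter *it)
{
    return it->pos < it->count ? it->sectors[it->pos++] : -1;
}

/*
 * Byte-based wrapper: scaling is done once, internally. The end marker
 * stays negative after scaling (-1 * 512 == -512), so a caller's
 * "offset < 0" test keeps working unchanged.
 */
static int64_t sketch_iter_next(SketchIter *it)
{
    return sketch_iter_next_sector(it) * SKETCH_SECTOR_SIZE;
}
```

A caller that needs to distinguish the end marker from a valid offset should therefore compare against zero rather than against -1.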




[Qemu-block] [PATCH v3 03/11] dirty-bitmap: Track size in bytes

2017-06-28 Thread Eric Blake
We are still using an internal hbitmap that tracks a size in sectors,
with the granularity scaled down accordingly, because it lets us
use a shortcut for our iterators which are currently sector-based.
But there's no reason we can't track the dirty bitmap size in bytes,
since it is an internal-only variable.

Use is_power_of_2() while at it, instead of open-coding that.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v2: tweak commit message, no code change
---
 block/dirty-bitmap.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 13febf2..b2b9342 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -42,7 +42,7 @@ struct BdrvDirtyBitmap {
 HBitmap *meta;  /* Meta dirty bitmap */
 BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
 char *name; /* Optional non-empty unique ID */
-int64_t size;   /* Size of the bitmap (Number of sectors) */
+int64_t size;   /* Size of the bitmap, in bytes */
 bool disabled;  /* Bitmap is read-only */
 int active_iterators;   /* How many iterators are active */
 QLIST_ENTRY(BdrvDirtyBitmap) list;
@@ -103,17 +103,14 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 {
 int64_t bitmap_size;
 BdrvDirtyBitmap *bitmap;
-uint32_t sector_granularity;

-assert((granularity & (granularity - 1)) == 0);
+assert(is_power_of_2(granularity) && granularity >= BDRV_SECTOR_SIZE);

 if (name && bdrv_find_dirty_bitmap(bs, name)) {
 error_setg(errp, "Bitmap already exists: %s", name);
 return NULL;
 }
-sector_granularity = granularity >> BDRV_SECTOR_BITS;
-assert(sector_granularity);
-bitmap_size = bdrv_nb_sectors(bs);
+bitmap_size = bdrv_getlength(bs);
 if (bitmap_size < 0) {
 error_setg_errno(errp, -bitmap_size, "could not get length of device");
 errno = -bitmap_size;
@@ -121,7 +118,12 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 }
 bitmap = g_new0(BdrvDirtyBitmap, 1);
 bitmap->mutex = &bs->dirty_bitmap_mutex;
-bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(sector_granularity));
+/*
+ * TODO - let hbitmap track full granularity. For now, it is tracking
+ * only sector granularity, as a shortcut for our iterators.
+ */
+bitmap->bitmap = hbitmap_alloc(bitmap_size >> BDRV_SECTOR_BITS,
+   ctz32(granularity) - BDRV_SECTOR_BITS);
 bitmap->size = bitmap_size;
 bitmap->name = g_strdup(name);
 bitmap->disabled = false;
@@ -284,13 +286,14 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
 void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
 {
 BdrvDirtyBitmap *bitmap;
-uint64_t size = bdrv_nb_sectors(bs);
+int64_t size = bdrv_getlength(bs);

+assert(size >= 0);
 bdrv_dirty_bitmaps_lock(bs);
 QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
 assert(!bdrv_dirty_bitmap_frozen(bitmap));
 assert(!bitmap->active_iterators);
-hbitmap_truncate(bitmap->bitmap, size);
+hbitmap_truncate(bitmap->bitmap, size >> BDRV_SECTOR_BITS);
 bitmap->size = size;
 }
 bdrv_dirty_bitmaps_unlock(bs);
@@ -490,7 +493,7 @@ void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
 hbitmap_reset_all(bitmap->bitmap);
 } else {
 HBitmap *backup = bitmap->bitmap;
-bitmap->bitmap = hbitmap_alloc(bitmap->size,
+bitmap->bitmap = hbitmap_alloc(bitmap->size >> BDRV_SECTOR_BITS,
hbitmap_granularity(backup));
 *out = backup;
 }
-- 
2.9.4
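
The arithmetic behind `ctz32(granularity) - BDRV_SECTOR_BITS` in the patch above can be checked with a small standalone sketch. The sketch_* helpers below are hypothetical reimplementations written for illustration (QEMU has its own ctz32() and is_power_of_2()), and the sector size of 512 bytes is taken from the BDRV_SECTOR_BITS usage in the diff.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECTOR_BITS 9  /* 512-byte sectors */

/* Count trailing zero bits of a nonzero value (portable loop version). */
static int sketch_ctz32(uint32_t v)
{
    int n = 0;

    assert(v != 0);
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}

static int sketch_is_power_of_2(uint32_t v)
{
    return v && !(v & (v - 1));
}

/*
 * Convert a byte granularity into the shift a sector-based bitmap needs:
 * log2(granularity in bytes) minus log2(sector size in bytes).
 */
static int sketch_sector_shift(uint32_t granularity)
{
    assert(sketch_is_power_of_2(granularity) &&
           granularity >= (1u << SKETCH_SECTOR_BITS));
    return sketch_ctz32(granularity) - SKETCH_SECTOR_BITS;
}

/* Byte size converted to the sector count handed to the internal bitmap. */
static int64_t sketch_size_in_sectors(int64_t bytes)
{
    return bytes >> SKETCH_SECTOR_BITS;
}
```

For a 64 KiB granularity the shift is 7 (128 sectors per bit), and shifting the sector size left by that amount gives back the byte granularity, matching what bdrv_dirty_bitmap_granularity() computes in the diff context.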




[Qemu-block] [PATCH v3 01/11] dirty-bitmap: Report BlockDirtyInfo.count in bytes, as documented

2017-06-28 Thread Eric Blake
We've been documenting the value in bytes since its introduction
in commit b9a9b3a4 (v1.3), where it was actually reported in bytes.

Commit e4654d2 (v2.0) then removed things from block/qapi.c, in
preparation for a rewrite to a list of dirty sectors in the next
commit 21b5683 in block.c, but the new code mistakenly started
reporting in sectors.

Fixes: https://bugzilla.redhat.com/1441460

CC: qemu-sta...@nongnu.org
Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
Too late for 2.9, since the regression has been unnoticed for
nine releases. But worth putting in 2.9.1.

v2: no change
---
 block/dirty-bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index a04c6e4..f17fc14 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -410,7 +410,7 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
 BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
 BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
-info->count = bdrv_get_dirty_count(bm);
+info->count = bdrv_get_dirty_count(bm) << BDRV_SECTOR_BITS;
 info->granularity = bdrv_dirty_bitmap_granularity(bm);
 info->has_name = !!bm->name;
 info->name = g_strdup(bm->name);
-- 
2.9.4




[Qemu-block] [PATCH v3 00/11] make dirty-bitmap byte-based

2017-06-28 Thread Eric Blake
There are patches floating around to add NBD_CMD_BLOCK_STATUS,
but NBD wants to report status on byte granularity (even if the
reporting will probably be naturally aligned to sectors or even
much higher levels).  I've therefore started the task of
converting our block status code to report at a byte granularity
rather than sectors.

This is part two of that conversion: dirty-bitmap. Other parts
include bdrv_is_allocated (at v3 [1]) and replacing
bdrv_get_block_status with a byte based callback in all the
drivers (at v1, needs a rebase [3]).

Available as a tag at:
git fetch git://repo.or.cz/qemu/ericb.git nbd-byte-dirty-v3

Depends on Kevin's block branch and my v3 bdrv_is_allocated [1]

Since v2 [2], I had to rebase on top of Paolo's locking fixes;
patch v2 2/12 is gone, and many of the others had a lot of
context conflicts. But I felt the resolution was simple enough
that I kept R-b on all but patch 8.

[1] https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg06077.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg03859.html
[3] https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg02642.html

(git backport-diff doesn't like the rename in 7/11)

001/11:[] [--] 'dirty-bitmap: Report BlockDirtyInfo.count in bytes, as documented'
002/11:[0024] [FC] 'dirty-bitmap: Drop unused functions'
003/11:[] [-C] 'dirty-bitmap: Track size in bytes'
004/11:[] [-C] 'dirty-bitmap: Set iterator start by offset, not sector'
005/11:[] [-C] 'dirty-bitmap: Change bdrv_dirty_iter_next() to report byte offset'
006/11:[] [-C] 'dirty-bitmap: Change bdrv_get_dirty_count() to report bytes'
007/11:[down] 'dirty-bitmap: Change bdrv_get_dirty_locked() to take bytes'
008/11:[0036] [FC] 'dirty-bitmap: Change bdrv_[re]set_dirty_bitmap() to use bytes'
009/11:[] [--] 'mirror: Switch mirror_dirty_init() to byte-based iteration'
010/11:[0001] [FC] 'dirty-bitmap: Switch bdrv_set_dirty() to bytes'
011/11:[] [-C] 'dirty-bitmap: Convert internal hbitmap size/granularity'

Eric Blake (11):
  dirty-bitmap: Report BlockDirtyInfo.count in bytes, as documented
  dirty-bitmap: Drop unused functions
  dirty-bitmap: Track size in bytes
  dirty-bitmap: Set iterator start by offset, not sector
  dirty-bitmap: Change bdrv_dirty_iter_next() to report byte offset
  dirty-bitmap: Change bdrv_get_dirty_count() to report bytes
  dirty-bitmap: Change bdrv_get_dirty_locked() to take bytes
  dirty-bitmap: Change bdrv_[re]set_dirty_bitmap() to use bytes
  mirror: Switch mirror_dirty_init() to byte-based iteration
  dirty-bitmap: Switch bdrv_set_dirty() to bytes
  dirty-bitmap: Convert internal hbitmap size/granularity

 include/block/block_int.h|   2 +-
 include/block/dirty-bitmap.h |  28 
 block/backup.c   |   7 ++-
 block/dirty-bitmap.c | 105 +++
 block/io.c   |   6 +--
 block/mirror.c   |  73 +-
 migration/block.c|  12 +++--
 7 files changed, 79 insertions(+), 154 deletions(-)

-- 
2.9.4




Re: [Qemu-block] [Qemu-devel] [PATCH RFC v3 5/8] block: add BlockDevOptionsThrottle to QAPI

2017-06-28 Thread Kevin Wolf
On 28.06.2017 at 18:02, Eric Blake wrote:
> On 06/28/2017 10:50 AM, Kevin Wolf wrote:
> > On 23.06.2017 at 14:46, Manos Pitsidianakis wrote:
> >> This is needed to configure throttle filter driver nodes with QAPI.
> >>
> >> Signed-off-by: Manos Pitsidianakis 
> >> ---
> >>  qapi/block-core.json | 19 ++-
> >>  1 file changed, 18 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/qapi/block-core.json b/qapi/block-core.json
> >> index f85c2235c7..1d4afafe8c 100644
> >> --- a/qapi/block-core.json
> >> +++ b/qapi/block-core.json
> >> @@ -2119,7 +2119,7 @@
> >>  'host_device', 'http', 'https', 'iscsi', 'luks', 'nbd', 'nfs',
> >>  'null-aio', 'null-co', 'parallels', 'qcow', 'qcow2', 'qed',
> >>  'quorum', 'raw', 'rbd', 'replication', 'sheepdog', 'ssh',
> >> -'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
> >> +'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
> >>  
> >>  ##
> >>  # @BlockdevOptionsFile:
> >> @@ -2984,6 +2984,7 @@
> >>'replication':'BlockdevOptionsReplication',
> >>'sheepdog':   'BlockdevOptionsSheepdog',
> >>'ssh':'BlockdevOptionsSsh',
> >> +  'throttle':   'BlockdevOptionsThrottle',
> >>'vdi':'BlockdevOptionsGenericFormat',
> >>'vhdx':   'BlockdevOptionsGenericFormat',
> >>'vmdk':   'BlockdevOptionsGenericCOWFormat',
> >> @@ -3723,3 +3724,19 @@
> >>'data' : { 'parent': 'str',
> >>   '*child': 'str',
> >>   '*node': 'str' } }
> >> +
> >> +##
> >> +# @BlockdevOptionsThrottle:
> >> +#
> >> +# Driver specific block device options for Throttle
> >> +#
> >> +# @throttling-group: the name of the throttling group to use
> >> +#
> >> +# @options:BlockIOThrottle options
> > 
> > Missing #optional marker.
> 
> The marker is now auto-generated based solely on the '*options' below,
> so we don't need a redundant thing here.

Oh nice, progress!

Kevin




Re: [Qemu-block] [Qemu-devel] [RFC] QMP design: Fixing query-block and friends

2017-06-28 Thread John Snow


On 06/28/2017 03:15 AM, Markus Armbruster wrote:
> John Snow  writes:
> 
>> On 06/27/2017 12:31 PM, Kevin Wolf wrote:
>>> Hi,
>>>
>>> I haven't really liked query-block for a long time, but now that
>>> blockdev-add and -blockdev have settled, it might finally be the time to
>>> actually do something about it. In fact, if used together with these
>>> modern interfaces, our query commands are simply broken, so we have to
>>> fix something.
>>>
>>
>> [...words...]
>>
>>>
>>> So how do we go forward from here?
>>>
>>> I guess we could add a few hacks to fix the really urgent things, and
>>> just adding more information is always possible (at the cost of even
>>> more duplication).
>>>
>>
>> I think you've included this suggestion so that you can summarily
>> dismiss it as foolish.
>>
>>> However, it appears to me that I dislike so many things about our current
>>> query commands that I'm tempted to say: Throw it all away and start
>>> over.
>>>
>>
>> Inclined to agree. The structure of the block layer has changed so much
>> in the past few years and this is easily seen by the gap you've outlined
>> here.
>>
>> We have to keep the old query commands around for a while as Eric says,
>> but I worry that they are so wrong and misleading as to be actively harmful.
>>
>> Maybe there's some hair trigger somewhere that if $NEW_FEATURE_X is used
>> to configure QEMU in some way, that the old commands can be deprecated
>> at runtime, such that we can more aggressively force their retirement.
> 
> We warn on use of deprecated command line and HMP features.  I think we
> want the same for QMP, within QMP.
> 
> [...]
> 

I was thinking of something even stronger than a warning in this case.
Warn if you use it anyway, but if you use $SOME_2.10_FEATURE, it
actually disables it.

"Hi, I know that you have seen the 2.10 API. I'm removing access to this
feature, because you REALLY ought not use it."

Could be as simple as actually disabling the old query command if the
new query command is utilized.

--js



Re: [Qemu-block] [Qemu-devel] [PATCH RFC v3 5/8] block: add BlockDevOptionsThrottle to QAPI

2017-06-28 Thread Eric Blake
On 06/28/2017 10:50 AM, Kevin Wolf wrote:
> On 23.06.2017 at 14:46, Manos Pitsidianakis wrote:
>> This is needed to configure throttle filter driver nodes with QAPI.
>>
>> Signed-off-by: Manos Pitsidianakis 
>> ---
>>  qapi/block-core.json | 19 ++-
>>  1 file changed, 18 insertions(+), 1 deletion(-)
>>
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index f85c2235c7..1d4afafe8c 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -2119,7 +2119,7 @@
>>  'host_device', 'http', 'https', 'iscsi', 'luks', 'nbd', 'nfs',
>>  'null-aio', 'null-co', 'parallels', 'qcow', 'qcow2', 'qed',
>>  'quorum', 'raw', 'rbd', 'replication', 'sheepdog', 'ssh',
>> -'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
>> +'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
>>  
>>  ##
>>  # @BlockdevOptionsFile:
>> @@ -2984,6 +2984,7 @@
>>'replication':'BlockdevOptionsReplication',
>>'sheepdog':   'BlockdevOptionsSheepdog',
>>'ssh':'BlockdevOptionsSsh',
>> +  'throttle':   'BlockdevOptionsThrottle',
>>'vdi':'BlockdevOptionsGenericFormat',
>>'vhdx':   'BlockdevOptionsGenericFormat',
>>'vmdk':   'BlockdevOptionsGenericCOWFormat',
>> @@ -3723,3 +3724,19 @@
>>'data' : { 'parent': 'str',
>>   '*child': 'str',
>>   '*node': 'str' } }
>> +
>> +##
>> +# @BlockdevOptionsThrottle:
>> +#
>> +# Driver specific block device options for Throttle
>> +#
>> +# @throttling-group: the name of the throttling group to use
>> +#
>> +# @options:BlockIOThrottle options
> 
> Missing #optional marker.

The marker is now auto-generated based solely on the '*options' below,
so we don't need a redundant thing here.

> 
>> +# Since: 2.9
>> +##
>> +{ 'struct': 'BlockdevOptionsThrottle',
>> +  'data': { 'throttling-group': 'str',
>> +'file' : 'BlockdevRef',
>> +'*options' : 'BlockIOThrottle'
>> + } }
> 
> Didn't we intend to make 'throttling-group' optional, too?
> 
> If we don't, then the question of anonymous ThrottleGroup objects is
> kind of moot (not completely because -drive isn't bound to the schema,
> but in that case we should just error out there too if it's missing).
> 
> Kevin
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-block] [PATCH v3 1/3] block: add bdrv_get_format_alloc_stat format interface

2017-06-28 Thread Vladimir Sementsov-Ogievskiy

27.06.2017 02:19, John Snow wrote:


On 06/06/2017 12:26 PM, Vladimir Sementsov-Ogievskiy wrote:

The function should collect statistics, about used/unused by top-level
format driver space (in its .file) and allocation status
(data/zero/discarded/after-eof) of corresponding areas in this .file.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block.c   | 16 ++
  include/block/block.h |  3 +++
  include/block/block_int.h |  2 ++
  qapi/block-core.json  | 55 +++
  4 files changed, 76 insertions(+)

diff --git a/block.c b/block.c
index 50ba264143..7d720ae0c2 100644
--- a/block.c
+++ b/block.c
@@ -3407,6 +3407,22 @@ int64_t bdrv_get_allocated_file_size(BlockDriverState *bs)
  }
  
  /**
+ * Collect format allocation info. See BlockFormatAllocInfo definition in
+ * qapi/block-core.json.
+ */
+int bdrv_get_format_alloc_stat(BlockDriverState *bs, BlockFormatAllocInfo *bfai)
+{
+BlockDriver *drv = bs->drv;
+if (!drv) {
+return -ENOMEDIUM;
+}
+if (drv->bdrv_get_format_alloc_stat) {
+return drv->bdrv_get_format_alloc_stat(bs, bfai);
+}
+return -ENOTSUP;
+}
+
+/**
   * Return number of sectors on success, -errno on error.
   */
  int64_t bdrv_nb_sectors(BlockDriverState *bs)
diff --git a/include/block/block.h b/include/block/block.h
index 9b355e92d8..646376a772 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -335,6 +335,9 @@ typedef enum {
  
  int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix);
  
+int bdrv_get_format_alloc_stat(BlockDriverState *bs,
+   BlockFormatAllocInfo *bfai);
+
  /* The units of offset and total_work_size may be chosen arbitrarily by the
   * block driver; total_work_size may change during the course of the amendment
   * operation */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8d3724cce6..458c715e99 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -208,6 +208,8 @@ struct BlockDriver {
  int64_t (*bdrv_getlength)(BlockDriverState *bs);
  bool has_variable_length;
  int64_t (*bdrv_get_allocated_file_size)(BlockDriverState *bs);
+int (*bdrv_get_format_alloc_stat)(BlockDriverState *bs,
+  BlockFormatAllocInfo *bfai);
  
  int coroutine_fn (*bdrv_co_pwritev_compressed)(BlockDriverState *bs,
  uint64_t offset, uint64_t bytes, QEMUIOVector *qiov);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ea0b3e8b13..fd7b52bd69 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -139,6 +139,61 @@
 '*format-specific': 'ImageInfoSpecific' } }
  
  ##
+# @BlockFormatAllocInfo:
+#

I apologize in advance, but I don't understand this patch very well. Let
me ask some questions to get patch review rolling again, since you've
been waiting a bit.


+#
+# Allocation relations between format file and underlying protocol file.
+# All fields are in bytes.
+#

The format file in this case would be ... what, the virtual file
represented by the qcow2? and the underlying protocol file is the raw
file that is the qcow2 itself?


yes




+# There are two types of the format file portions: 'used' and 'unused'. It's up
+# to the format how to interpret these types. For now the only format supporting
+# the feature is Qcow2 and for this case 'used' are clusters with positive
+# refcount and 'unused' are clusters with zero refcount. Described portions include
+# all format file allocations, not only virtual disk data (metadata, internal
+# snapshots, etc. are included).

I guess the semantic differentiation between "used" and "unused" is left
to the individual fields, below.


hmm, I don't understand. differentiation is up to the format, and for 
qcow2 it is described above





+#
+# For the underlying file there are native block-status types of the portions:
+#  - data: allocated data
+#  - zero: read-as-zero holes
+#  - discarded: not allocated
+# 4th additional type is 'overrun', which is for the format file portions beyond
+# the end of the underlying file.
+#
+# So, the fields are:
+#
+# @used-data: used by the format file and backed by data in the underlying file
+#

I assume this is "defined and addressable data".


+# @used-zero: used by the format file and backed by a hole in the underlying
+# file
+#

By a hole? Can you give me an example? Do you mean like a filesystem
hole ala falloc()?


-zero, -data and -discarded are the block status of corresponding area 
in underlying file.


so, if underlying file is raw, yes, it should be a filesystem hole.

example:
-
# ./qemu-img create -f qcow2 x 1G
Formatting 'x', fmt=qcow2 size=1073741824 encryption=off 
cluster_size=65536 lazy_refcounts=off refcount_bits=16

# ./qemu-img check x
No errors were found on the image.
Image end offset: 262144
Format allocation info (including metadata):
 

Re: [Qemu-block] [PATCH RFC v3 6/8] block: add options parameter to bdrv_new_open_driver()

2017-06-28 Thread Kevin Wolf
Am 26.06.2017 um 17:11 hat Stefan Hajnoczi geschrieben:
> On Fri, Jun 23, 2017 at 03:46:58PM +0300, Manos Pitsidianakis wrote:
> > diff --git a/block.c b/block.c
> > index 694396281b..c7d9f8959a 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -1150,20 +1150,25 @@ free_and_fail:
> >  }
> >  
> >  BlockDriverState *bdrv_new_open_driver(BlockDriver *drv, const char *node_name,
> > -   int flags, Error **errp)
> > +   int flags, QDict *options, Error **errp)
> 
> Please add a doc comment that explains the QDict ownership when options
> != NULL.  Users need to understand whether the options QDict still
> belongs to them after the call or bdrv_new_open_driver() takes over
> ownership.
> 
> See bdrv_open_inherit() for an example.

I think we might not only want to document it, but probably also change
this function. bdrv_open_inherit() and friends take ownership of the
QDict, so doing the same here would probably avoid bugs in the future.

It also seems more practical, because the only user that is added in
patch 8 already does QDECREF() after calling bdrv_new_open_driver().
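
A minimal sketch of the "callee takes ownership" convention, using a stand-in reference-counted type rather than QEMU's real QDict. Dict, dict_new, dict_ref, dict_unref and open_node are all hypothetical names introduced only for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a reference-counted options dictionary; not QEMU's QDict. */
typedef struct Dict {
    int refcnt;
} Dict;

/* Allocate a dictionary holding a single reference owned by the caller. */
static Dict *dict_new(void)
{
    Dict *d = calloc(1, sizeof(*d));
    assert(d != NULL);
    d->refcnt = 1;
    return d;
}

/* Take an additional reference. */
static void dict_ref(Dict *d)
{
    d->refcnt++;
}

/* Drop one reference, freeing on the last one; returns the remaining count. */
static int dict_unref(Dict *d)
{
    int left = --d->refcnt;
    if (left == 0) {
        free(d);
    }
    return left;
}

/*
 * Hypothetical open function that takes ownership of @options: the
 * reference passed in is consumed on success AND on failure, so the
 * caller must not unref it again after the call.
 */
static int open_node(Dict *options, int simulate_failure)
{
    int ret = simulate_failure ? -1 : 0;

    if (options) {
        dict_unref(options);
    }
    return ret;
}
```

With this convention a caller that still needs the dictionary after the call takes its own extra reference first; a caller that does not simply passes it in and forgets about it, which removes the explicit unref on every exit path.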

Kevin




Re: [Qemu-block] [PATCH RFC v3 5/8] block: add BlockDevOptionsThrottle to QAPI

2017-06-28 Thread Kevin Wolf
Am 23.06.2017 um 14:46 hat Manos Pitsidianakis geschrieben:
> This is needed to configure throttle filter driver nodes with QAPI.
> 
> Signed-off-by: Manos Pitsidianakis 
> ---
>  qapi/block-core.json | 19 ++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index f85c2235c7..1d4afafe8c 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -2119,7 +2119,7 @@
>  'host_device', 'http', 'https', 'iscsi', 'luks', 'nbd', 'nfs',
>  'null-aio', 'null-co', 'parallels', 'qcow', 'qcow2', 'qed',
>  'quorum', 'raw', 'rbd', 'replication', 'sheepdog', 'ssh',
> -'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
> +'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
>  
>  ##
>  # @BlockdevOptionsFile:
> @@ -2984,6 +2984,7 @@
>'replication':'BlockdevOptionsReplication',
>'sheepdog':   'BlockdevOptionsSheepdog',
>'ssh':'BlockdevOptionsSsh',
> +  'throttle':   'BlockdevOptionsThrottle',
>'vdi':'BlockdevOptionsGenericFormat',
>'vhdx':   'BlockdevOptionsGenericFormat',
>'vmdk':   'BlockdevOptionsGenericCOWFormat',
> @@ -3723,3 +3724,19 @@
>'data' : { 'parent': 'str',
>   '*child': 'str',
>   '*node': 'str' } }
> +
> +##
> +# @BlockdevOptionsThrottle:
> +#
> +# Driver specific block device options for Throttle
> +#
> +# @throttling-group: the name of the throttling group to use
> +#
> +# @options:BlockIOThrottle options

Missing #optional marker.

> +# Since: 2.9
> +##
> +{ 'struct': 'BlockdevOptionsThrottle',
> +  'data': { 'throttling-group': 'str',
> +'file' : 'BlockdevRef',
> +'*options' : 'BlockIOThrottle'
> + } }

Didn't we intend to make 'throttling-group' optional, too?

If we don't, then the question of anonymous ThrottleGroup objects is
kind of moot (not completely because -drive isn't bound to the schema,
but in that case we should just error out there too if it's missing).

Kevin



Re: [Qemu-block] [PATCH RFC v3 3/8] block: add throttle block filter driver

2017-06-28 Thread Manos Pitsidianakis

On Wed, Jun 28, 2017 at 05:36:54PM +0200, Kevin Wolf wrote:

Am 28.06.2017 um 17:22 hat Manos Pitsidianakis geschrieben:

Since we're moving groups to QOM we will need ids for each group.
Can objects be anonymous?


Hm, that's a good question. But object_new() doesn't take an ID, so I
think they can be anonymous.

Looking a bit closer, struct Object doesn't even have a field for the
ID. It seems that what the ID really is is the name of a property in a
parent object that points to the new object. So as long as you don't
want to have another QOM object point to it, there is no such thing as
an ID.

Anyway, I followed the call chain from throttle_group_register_tgm() and
it ends in throttle_group_incref() where we simply have this:

   tg = g_new0(ThrottleGroup, 1);
   tg->name = g_strdup(name);

Shouldn't this be using something a little more QOMy now?


Yes, as was mentioned in the QOM patch's thread,
block/throttle-groups.c should use QOM internally. I will change this
for the next revision and also look into anonymity.




Re: [Qemu-block] [PATCH] tests: Avoid non-portable 'echo -ARG'

2017-06-28 Thread Edgar E. Iglesias
On Wed, Jun 28, 2017 at 09:21:37AM -0500, Eric Blake wrote:
> POSIX says that backslashes in the arguments to 'echo', as well as
> any use of 'echo -n' and 'echo -e', are non-portable; it recommends
> people should favor 'printf' instead.  This is definitely true where
> we do not control which shell is running (such as in makefile snippets
> or in documentation examples).  But even for scripts where we
> require bash (and therefore, where echo does what we want by default),
> it is still possible to use 'shopt -s xpg_echo' to change bash's
> behavior of echo.  And setting a good example never hurts when we are
> not sure if a snippet will be copied from a bash-only script to a
> general shell script (although I don't change the use of non-portable
> \e for ESC when we know the running shell is bash).
> 
> Replace 'echo -n "..."' with 'printf "..."', and 'echo -e "..."'
> with 'printf "...\n"'.
> 
> In the qemu-iotests check script, also fix unusual shell quoting
> that would result in word-splitting if 'date' outputs a space.

Reviewed-by: Edgar E. Iglesias 



> 
> Signed-off-by: Eric Blake 
> ---
> 
> Of course, Stefan's pending patch:
> [PATCH 3/5] qemu-iotests: 068: extract _qemu() function
> also touches 068, so there may be some (obvious) merge conflicts
> to resolve there depending on what goes in first.
> 
>  qemu-options.hx |  4 ++--
>  tests/multiboot/run_test.sh | 10 +-
>  tests/qemu-iotests/051  |  7 ---
>  tests/qemu-iotests/068  |  2 +-
>  tests/qemu-iotests/142  | 48 ++---
>  tests/qemu-iotests/171  | 14 ++---
>  tests/qemu-iotests/check| 18 -
>  tests/rocker/all| 10 +-
>  tests/tcg/cris/Makefile |  8 
>  9 files changed, 61 insertions(+), 60 deletions(-)
> 
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 896ff17..c8205bb 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -4351,7 +4351,7 @@ The simplest (insecure) usage is to provide the secret inline
> 
>  The simplest secure usage is to provide the secret via a file
> 
> - # echo -n "letmein" > mypasswd.txt
> + # printf "letmein" > mypasswd.txt
>   # $QEMU -object secret,id=sec0,file=mypasswd.txt,format=raw
> 
>  For greater security, AES-256-CBC should be used. To illustrate usage,
> @@ -4379,7 +4379,7 @@ telling openssl to base64 encode the result, but it could be left
>  as raw bytes if desired.
> 
>  @example
> - # SECRET=$(echo -n "letmein" |
> + # SECRET=$(printf "letmein" |
>  openssl enc -aes-256-cbc -a -K $KEY -iv $IV)
>  @end example
> 
> diff --git a/tests/multiboot/run_test.sh b/tests/multiboot/run_test.sh
> index 78d7edf..35bfe0e 100755
> --- a/tests/multiboot/run_test.sh
> +++ b/tests/multiboot/run_test.sh
> @@ -26,7 +26,7 @@ run_qemu() {
>  local kernel=$1
>  shift
> 
> -echo -e "\n\n=== Running test case: $kernel $@ ===\n" >> test.log
> +printf "\n\n=== Running test case: $kernel $@ ===\n\n" >> test.log
> 
>  $QEMU \
>  -kernel $kernel \
> @@ -68,21 +68,21 @@ for t in mmap modules; do
>  pass=1
> 
>  if [ $debugexit != 1 ]; then
> -echo -e "\e[31m ?? \e[0m $t (no debugexit used, exit code $ret)"
> +printf "\e[31m ?? \e[0m $t (no debugexit used, exit code $ret)\n"
>  pass=0
>  elif [ $ret != 0 ]; then
> -echo -e "\e[31mFAIL\e[0m $t (exit code $ret)"
> +printf "\e[31mFAIL\e[0m $t (exit code $ret)\n"
>  pass=0
>  fi
> 
>  if ! diff $t.out test.log > /dev/null 2>&1; then
> -echo -e "\e[31mFAIL\e[0m $t (output difference)"
> +printf "\e[31mFAIL\e[0m $t (output difference)\n"
>  diff -u $t.out test.log
>  pass=0
>  fi
> 
>  if [ $pass == 1 ]; then
> -echo -e "\e[32mPASS\e[0m $t"
> +printf "\e[32mPASS\e[0m $t\n"
>  fi
> 
>  done
> diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
> index 26c29de..322c4a8 100755
> --- a/tests/qemu-iotests/051
> +++ b/tests/qemu-iotests/051
> @@ -217,7 +217,7 @@ run_qemu -drive driver=null-co,cache=invalid_value
>  # Test 142 checks the direct=on cases
> 
>  for cache in writeback writethrough unsafe invalid_value; do
> -echo -e "info block\ninfo block file\ninfo block backing\ninfo block backing-file" | \
> +printf "info block\ninfo block file\ninfo block backing\ninfo block backing-file\n" | \
>  run_qemu -drive file="$TEST_IMG",cache=$cache,backing.file.filename="$TEST_IMG.base",backing.cache.no-flush=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=$device_id -nodefaults
>  done
> 
> @@ -325,8 +325,9 @@ echo "qemu-io $device_id \"write -P 0x22 0 4k\"" | run_qemu -drive file="$TEST_I
> 
>  $QEMU_IO -c "read -P 0x22 0 4k" "$TEST_IMG" | _filter_qemu_io
> 
> -echo -e "qemu-io $device_id \"write -P 0x33 0 4k\"\ncommit $device_id" | run_qemu -drive file="$TEST_IMG",snapshot=on,if=none,

Re: [Qemu-block] [PATCH RFC v3 3/8] block: add throttle block filter driver

2017-06-28 Thread Kevin Wolf
Am 28.06.2017 um 17:22 hat Manos Pitsidianakis geschrieben:
> On Wed, Jun 28, 2017 at 04:40:12PM +0200, Kevin Wolf wrote:
> >Am 23.06.2017 um 14:46 hat Manos Pitsidianakis geschrieben:
> >>block/throttle.c uses existing I/O throttle infrastructure inside a
> >>block filter driver. I/O operations are intercepted in the filter's
> >>read/write coroutines, and referred to block/throttle-groups.c
> >>
> >>The driver can be used with the command
> >>-drive driver=throttle,file.filename=foo.qcow2,iops-total=...
> >>The configuration flags and semantics are identical to the hardcoded
> >>throttling ones.
> >>
> >>Signed-off-by: Manos Pitsidianakis 
> >>---
> >> block/Makefile.objs |   1 +
> >> block/throttle.c| 427 
> >> include/qemu/throttle-options.h |  60 --
> >> 3 files changed, 469 insertions(+), 19 deletions(-)
> >> create mode 100644 block/throttle.c
> >>
> >>diff --git a/block/Makefile.objs b/block/Makefile.objs
> >>index ea955302c8..bb811a4d01 100644
> >>--- a/block/Makefile.objs
> >>+++ b/block/Makefile.objs
> >>@@ -25,6 +25,7 @@ block-obj-y += accounting.o dirty-bitmap.o
> >> block-obj-y += write-threshold.o
> >> block-obj-y += backup.o
> >> block-obj-$(CONFIG_REPLICATION) += replication.o
> >>+block-obj-y += throttle.o
> >>
> >> block-obj-y += crypto.o
> >>
> >>diff --git a/block/throttle.c b/block/throttle.c
> >>new file mode 100644
> >>index 00..0c17051161
> >>--- /dev/null
> >>+++ b/block/throttle.c
> >>@@ -0,0 +1,427 @@
> >>+/*
> >>+ * QEMU block throttling filter driver infrastructure
> >>+ *
> >>+ * Copyright (c) 2017 Manos Pitsidianakis
> >>+ *
> >>+ * This program is free software; you can redistribute it and/or
> >>+ * modify it under the terms of the GNU General Public License as
> >>+ * published by the Free Software Foundation; either version 2 or
> >>+ * (at your option) version 3 of the License.
> >>+ *
> >>+ * This program is distributed in the hope that it will be useful,
> >>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >>+ * GNU General Public License for more details.
> >>+ *
> >>+ * You should have received a copy of the GNU General Public License
> >>+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
> >>+ */
> >
> >Please consider using the LGPL. We're still hoping to turn the block
> >layer into a library one day, and almost all code in it is licensed
> >liberally (MIT or LGPL).
> >
> >>+#include "qemu/osdep.h"
> >>+#include "block/throttle-groups.h"
> >>+#include "qemu/throttle-options.h"
> >>+#include "qapi/error.h"
> >>+
> >>+
> >>+static QemuOptsList throttle_opts = {
> >>+.name = "throttle",
> >>+.head = QTAILQ_HEAD_INITIALIZER(throttle_opts.head),
> >>+.desc = {
> >>+{
> >>+.name = QEMU_OPT_IOPS_TOTAL,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "limit total I/O operations per second",
> >>+},{
> >>+.name = QEMU_OPT_IOPS_READ,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "limit read operations per second",
> >>+},{
> >>+.name = QEMU_OPT_IOPS_WRITE,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "limit write operations per second",
> >>+},{
> >>+.name = QEMU_OPT_BPS_TOTAL,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "limit total bytes per second",
> >>+},{
> >>+.name = QEMU_OPT_BPS_READ,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "limit read bytes per second",
> >>+},{
> >>+.name = QEMU_OPT_BPS_WRITE,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "limit write bytes per second",
> >>+},{
> >>+.name = QEMU_OPT_IOPS_TOTAL_MAX,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "I/O operations burst",
> >>+},{
> >>+.name = QEMU_OPT_IOPS_READ_MAX,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "I/O operations read burst",
> >>+},{
> >>+.name = QEMU_OPT_IOPS_WRITE_MAX,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "I/O operations write burst",
> >>+},{
> >>+.name = QEMU_OPT_BPS_TOTAL_MAX,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "total bytes burst",
> >>+},{
> >>+.name = QEMU_OPT_BPS_READ_MAX,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "total bytes read burst",
> >>+},{
> >>+.name = QEMU_OPT_BPS_WRITE_MAX,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "total bytes write burst",
> >>+},{
> >>+.name = QEMU_OPT_IOPS_TOTAL_MAX_LENGTH,
> >>+.type = QEMU_OPT_NUMBER,
> >>+.help = "length of the iopstotalmax burst period, in seconds",
> >>+

Re: [Qemu-block] [PATCH v2 3/4] qcow2: add shrink image support

2017-06-28 Thread Pavel Butsykin

On 28.06.2017 16:59, Max Reitz wrote:

On 2017-06-27 17:06, Pavel Butsykin wrote:

On 26.06.2017 20:47, Max Reitz wrote:

On 2017-06-26 17:23, Pavel Butsykin wrote:

[]


Is there any guarantee that in the future this will not change? Because
in this case it can be a potential danger.


Since this behavior is not documented anywhere, there is no guarantee.


I can add a comment... Or add a new variable with the size of
reftable_tmp, and every time compute min(s->refcount_table_size,
reftable_tmp_size)
before accessing s->refcount_table[]/reftable_tmp[].
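
The min()-bounded access could look roughly like the sketch below. It is only an illustration under stated assumptions: common_entries and copy_reftable are hypothetical helpers, and plain arrays stand in for s->refcount_table and reftable_tmp:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of entries that may be indexed safely in both tables. */
static size_t common_entries(size_t table_size, size_t tmp_size)
{
    return table_size < tmp_size ? table_size : tmp_size;
}

/*
 * Copy the temporary reftable into the in-memory one, touching only the
 * prefix that exists in both. Returns the number of entries copied.
 */
static size_t copy_reftable(uint64_t *table, size_t table_size,
                            const uint64_t *tmp, size_t tmp_size)
{
    size_t n = common_entries(table_size, tmp_size);

    assert(table != NULL && tmp != NULL);
    for (size_t i = 0; i < n; i++) {
        table[i] = tmp[i];
    }
    return n;
}
```

This stays in bounds even if one of the two sizes changes between allocation and use, at the cost of silently ignoring the tail of the larger table.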


Or (1) you add an assertion that refcount_table_size doesn't change
along with a comment why that is the case, which also explains in detail
why the call to qcow2_free_clusters() should be safe: The on-disk
reftable differs from the one in memory. qcow2_free_clusters() and
update_refcount() themselves do not access the reftable, so they are
safe. However, update_refcount() calls alloc_refcount_block(), and that
function does access the reftable: Now, as long as
s->refcount_table_size does not shrink (which I can't see why it would),
refcount_table_index should always be smaller. Now we're accessing
s->refcount_table: This will always return an existing refblock because
this will either be the refblock itself (for self-referencing refblocks)
or another one that is not going to be freed by qcow2_shrink_reftable()
because this function will not free refblocks which cover other clusters
than themselves.
We will then proceed to update the refblock which is either right (if it
is not the refblock to be freed) or won't do anything (if it is the one
to be freed).
In any case, we will never write to the reftable and reading from the
basically outdated cached version will never do anything bad.


OK, SGTM.


Or (2) you copy reftable_tmp into s->refcount_table[] *before* any call
to qcow2_free_clusters(). To make this work, you would need to also
discard all refblocks from the cache in this function here (and not in
update_refcount()) and then only call qcow2_free_clusters() on refblocks
which were not self-referencing. An alternative hack would be to simply
mark the image dirty and just not do any qcow2_free_clusters() call...


The main purpose of the qcow2_reftable_shrink() function is to discard all
unnecessary refblocks from the file. If we only rewrite the
refcount_table and discard non-self-referencing refblocks (which are
actually very rare), then the meaning of the function is lost.


It would do exactly the same. The idea is that you do not need to call
qcow2_free_clusters() on self-referencing refblocks at all, since they
are freed automatically when their reftable entry is overwritten with 0.


Not sure.. For self-referencing refblocks, we also need to do:
1. check if refcount > 1
2. update s->free_cluster_index
3. call update_refcount_discard() (so that fallocate PUNCH_HOLE is
eventually called on the refblock offset)

It will be practically a copy-paste from qcow2_free_clusters(), so it is
better to avoid it. I think that if it makes sense to do
qcow2_reftable_shrink(), it is only because we can slightly reduce image
size.


Or (3) of course it would be possible to not clean up refcount
structures at all...


Nice solution :)


It is, because as I said refcount structures only have a small overhead.


Yes, I agree.


Max





Re: [Qemu-block] [PATCH RFC v3 3/8] block: add throttle block filter driver

2017-06-28 Thread Manos Pitsidianakis

On Wed, Jun 28, 2017 at 04:40:12PM +0200, Kevin Wolf wrote:

Am 23.06.2017 um 14:46 hat Manos Pitsidianakis geschrieben:

block/throttle.c uses existing I/O throttle infrastructure inside a
block filter driver. I/O operations are intercepted in the filter's
read/write coroutines, and referred to block/throttle-groups.c

The driver can be used with the command
-drive driver=throttle,file.filename=foo.qcow2,iops-total=...
The configuration flags and semantics are identical to the hardcoded
throttling ones.

Signed-off-by: Manos Pitsidianakis 
---
 block/Makefile.objs |   1 +
 block/throttle.c| 427 
 include/qemu/throttle-options.h |  60 --
 3 files changed, 469 insertions(+), 19 deletions(-)
 create mode 100644 block/throttle.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index ea955302c8..bb811a4d01 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -25,6 +25,7 @@ block-obj-y += accounting.o dirty-bitmap.o
 block-obj-y += write-threshold.o
 block-obj-y += backup.o
 block-obj-$(CONFIG_REPLICATION) += replication.o
+block-obj-y += throttle.o

 block-obj-y += crypto.o

diff --git a/block/throttle.c b/block/throttle.c
new file mode 100644
index 00..0c17051161
--- /dev/null
+++ b/block/throttle.c
@@ -0,0 +1,427 @@
+/*
+ * QEMU block throttling filter driver infrastructure
+ *
+ * Copyright (c) 2017 Manos Pitsidianakis
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 or
+ * (at your option) version 3 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */


Please consider using the LGPL. We're still hoping to turn the block
layer into a library one day, and almost all code in it is licensed
liberally (MIT or LGPL).


+#include "qemu/osdep.h"
+#include "block/throttle-groups.h"
+#include "qemu/throttle-options.h"
+#include "qapi/error.h"
+
+
+static QemuOptsList throttle_opts = {
+.name = "throttle",
+.head = QTAILQ_HEAD_INITIALIZER(throttle_opts.head),
+.desc = {
+{
+.name = QEMU_OPT_IOPS_TOTAL,
+.type = QEMU_OPT_NUMBER,
+.help = "limit total I/O operations per second",
+},{
+.name = QEMU_OPT_IOPS_READ,
+.type = QEMU_OPT_NUMBER,
+.help = "limit read operations per second",
+},{
+.name = QEMU_OPT_IOPS_WRITE,
+.type = QEMU_OPT_NUMBER,
+.help = "limit write operations per second",
+},{
+.name = QEMU_OPT_BPS_TOTAL,
+.type = QEMU_OPT_NUMBER,
+.help = "limit total bytes per second",
+},{
+.name = QEMU_OPT_BPS_READ,
+.type = QEMU_OPT_NUMBER,
+.help = "limit read bytes per second",
+},{
+.name = QEMU_OPT_BPS_WRITE,
+.type = QEMU_OPT_NUMBER,
+.help = "limit write bytes per second",
+},{
+.name = QEMU_OPT_IOPS_TOTAL_MAX,
+.type = QEMU_OPT_NUMBER,
+.help = "I/O operations burst",
+},{
+.name = QEMU_OPT_IOPS_READ_MAX,
+.type = QEMU_OPT_NUMBER,
+.help = "I/O operations read burst",
+},{
+.name = QEMU_OPT_IOPS_WRITE_MAX,
+.type = QEMU_OPT_NUMBER,
+.help = "I/O operations write burst",
+},{
+.name = QEMU_OPT_BPS_TOTAL_MAX,
+.type = QEMU_OPT_NUMBER,
+.help = "total bytes burst",
+},{
+.name = QEMU_OPT_BPS_READ_MAX,
+.type = QEMU_OPT_NUMBER,
+.help = "total bytes read burst",
+},{
+.name = QEMU_OPT_BPS_WRITE_MAX,
+.type = QEMU_OPT_NUMBER,
+.help = "total bytes write burst",
+},{
+.name = QEMU_OPT_IOPS_TOTAL_MAX_LENGTH,
+.type = QEMU_OPT_NUMBER,
+.help = "length of the iopstotalmax burst period, in seconds",
+},{
+.name = QEMU_OPT_IOPS_READ_MAX_LENGTH,
+.type = QEMU_OPT_NUMBER,
+.help = "length of the iopsreadmax burst period, in seconds",
+},{
+.name = QEMU_OPT_IOPS_WRITE_MAX_LENGTH,
+.type = QEMU_OPT_NUMBER,
+.help = "length of the iopswritemax burst period, in seconds",
+},{
+.name = QEMU_OPT_BPS_TOTAL_MAX_LENGTH,
+.type = QEMU_OPT_NUMBER,
+.help = "length of the bpstotalmax burst period, in seconds",
+},{
+

[Qemu-block] [PATCH v4 2/2] bitmaps.md: Convert to rST; move it into 'interop' dir

2017-06-28 Thread Kashyap Chamarthy
This is part of the on-going effort to convert QEMU upstream
documentation syntax to reStructuredText (rST).

The conversion to rST was done using:

$ pandoc -f markdown -t rst bitmaps.md -o bitmaps.rst

Then, make a couple of small syntactical adjustments.  While at it,
reword a statement to avoid ambiguity.  Addressing the feedback from
this thread:

https://lists.nongnu.org/archive/html/qemu-devel/2017-06/msg05428.html

Signed-off-by: Kashyap Chamarthy 
---
* A Sphinx-rendered HTML version is here:
  https://kashyapc.fedorapeople.org/v4-QEMU-Docs/_build/html/docs/bitmaps.html

* The patch has "v4" subject prefix because I rolled this in along with
  the other document (live-block-operations.rst) that is in review,
  which is actually at v4.
---
 docs/devel/bitmaps.md| 505 --
 docs/interop/bitmaps.rst | 555 +++
 2 files changed, 555 insertions(+), 505 deletions(-)
 delete mode 100644 docs/devel/bitmaps.md
 create mode 100644 docs/interop/bitmaps.rst

diff --git a/docs/devel/bitmaps.md b/docs/devel/bitmaps.md
deleted file mode 100644
index a2e8d51..000
--- a/docs/devel/bitmaps.md
+++ /dev/null
@@ -1,505 +0,0 @@
-
-
-# Dirty Bitmaps and Incremental Backup
-
-* Dirty Bitmaps are objects that track which data needs to be backed up for the
-  next incremental backup.
-
-* Dirty bitmaps can be created at any time and attached to any node
-  (not just complete drives.)
-
-## Dirty Bitmap Names
-
-* A dirty bitmap's name is unique to the node, but bitmaps attached to different
-  nodes can share the same name.
-
-* Dirty bitmaps created for internal use by QEMU may be anonymous and have no
-  name, but any user-created bitmaps may not be. There can be any number of
-  anonymous bitmaps per node.
-
-* The name of a user-created bitmap must not be empty ("").
-
-## Bitmap Modes
-
-* A Bitmap can be "frozen," which means that it is currently in-use by a backup
-  operation and cannot be deleted, renamed, written to, reset,
-  etc.
-
-* The normal operating mode for a bitmap is "active."
-
-## Basic QMP Usage
-
-### Supported Commands ###
-
-* block-dirty-bitmap-add
-* block-dirty-bitmap-remove
-* block-dirty-bitmap-clear
-
-### Creation
-
-* To create a new bitmap, enabled, on the drive with id=drive0:
-
-```json
-{ "execute": "block-dirty-bitmap-add",
-  "arguments": {
-"node": "drive0",
-"name": "bitmap0"
-  }
-}
-```
-
-* This bitmap will have a default granularity that matches the cluster size of
-  its associated drive, if available, clamped to between [4KiB, 64KiB].
-  The current default for qcow2 is 64KiB.
-
-* To create a new bitmap that tracks changes in 32KiB segments:
-
-```json
-{ "execute": "block-dirty-bitmap-add",
-  "arguments": {
-"node": "drive0",
-"name": "bitmap0",
-"granularity": 32768
-  }
-}
-```
-
-### Deletion
-
-* Bitmaps that are frozen cannot be deleted.
-
-* Deleting the bitmap does not impact any other bitmaps attached to the same
-  node, nor does it affect any backups already created from this node.
-
-* Because bitmaps are only unique to the node to which they are attached,
-  you must specify the node/drive name here, too.
-
-```json
-{ "execute": "block-dirty-bitmap-remove",
-  "arguments": {
-"node": "drive0",
-"name": "bitmap0"
-  }
-}
-```
-
-### Resetting
-
-* Resetting a bitmap will clear all information it holds.
-
-* An incremental backup created from an empty bitmap will copy no data,
-  as if nothing has changed.
-
-```json
-{ "execute": "block-dirty-bitmap-clear",
-  "arguments": {
-"node": "drive0",
-"name": "bitmap0"
-  }
-}
-```
-
-## Transactions
-
-### Justification
-
-Bitmaps can be safely modified when the VM is paused or halted by using
-the basic QMP commands. For instance, you might perform the following actions:
-
-1. Boot the VM in a paused state.
-2. Create a full drive backup of drive0.
-3. Create a new bitmap attached to drive0.
-4. Resume execution of the VM.
-5. Incremental backups are ready to be created.
-
-At this point, the bitmap and drive backup would be correctly in sync,
-and incremental backups made from this point forward would be correctly aligned
-to the full drive backup.
-
-This is not particularly useful if we decide we want to start incremental
-backups after the VM has been running for a while, for which we will need to
-perform actions such as the following:
-
-1. Boot the VM and begin execution.
-2. Using a single transaction, perform the following operations:
-* Create bitmap0.
-* Create a full drive backup of drive0.
-3. Incremental backups are now ready to be created.
-
-### Supported Bitmap Transactions
-
-* block-dirty-bitmap-add
-* block-dirty-bitmap-clear
-
-The usages are identical to their respective QMP commands, but see below
-for examples.
-
-### Example: New Incremental Backup
-
-As outlined in the justification, perhaps we want to create a new incremental
-backup chain attached 

[Qemu-block] [PATCH v4 1/2] live-block-ops.txt: Rename, rewrite, and improve it

2017-06-28 Thread Kashyap Chamarthy
This patch documents (including their QMP invocations) all the four
major kinds of live block operations:

  - `block-stream`
  - `block-commit`
  - `drive-mirror` (& `blockdev-mirror`)
  - `drive-backup` (& `blockdev-backup`)

Things considered while writing this document:

  - Use reStructuredText as markup language (with the goal of generating
the HTML output using the Sphinx Documentation Generator).  It is
gentler on the eye, and can be trivially converted to different
formats.  (Another reason: upstream QEMU is considering switching to
Sphinx, which uses reStructuredText as its markup language.)

  - Raw QMP JSON output vs. 'qmp-shell'.  I debated with myself whether
to only show raw QMP JSON output (as that is the canonical
representation), or use 'qmp-shell', which takes key-value pairs.  I
settled on the approach of: for the first occurrence of a command,
use raw JSON; for subsequent occurrences, use 'qmp-shell', with an
occasional exception.

  - Usage of `-blockdev` command-line.

  - Usage of 'node-name' vs. file path to refer to disks.  While we have
`blockdev-{mirror, backup}` as 'node-name'-alternatives for
`drive-{mirror, backup}`, the `block-commit` command still operates
on file names for parameters 'base' and 'top'.  So I added a caveat
at the beginning to that effect.

Refer to this related thread that I started (where I learnt
`block-stream` was recently reworked to accept 'node-name' for 'top'
and 'base' parameters):
https://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg06466.html
"[RFC] Making 'block-stream', and 'block-commit' accept node-name"

All commands shown in this document were tested while documenting.

Thanks: Eric Blake for the section: "A note on points-in-time vs file
names".  This useful bit was originally articulated by Eric in his
KVMForum 2015 presentation, so I included that specific bit in this
document.

Signed-off-by: Kashyap Chamarthy 
---
* A Sphinx-rendered HTML version is here:
  
https://kashyapc.fedorapeople.org/v4-QEMU-Docs/_build/html/docs/live-block-operations.html

* Changes in v4
   - As per Paolo's suggestion on IRC, move the document from docs/ to
 docs/interop/.  (Where "interop" means: management and
 interoperability with QEMU)
   - Mention synchronization modes as part of the "Live disk backup ---
 ``drive-backup`` and ``blockdev-backup``" section.
   - And add a note mentioning the docs/interop/bitmaps.rst for detailed
 workings of the 'incremental' synchronization mode.

* Change in v3 [address feedback from SFinucane]:
   - Make the Copyright header only part of the source, but not the
 rendered version
   - Use the ".. note::", ".. important::" admonitions where appropriate
   - Add a TODO note to remove the ".. contents::" directive when Sphinx
 is integrated
   - Make effective use of "::", resulting in removing lots of needless
 blank lines
   - Use anchors where applicable
   - Reword content in some places; fix some spelling mistakes

* Changes in v2 [address content feedback from Eric; styling changes
  from Stephen Finucane]:
   - [Styling] Remove the ToC, as the Sphinx ".. contents::" directive will
 auto-generate it as part of the rendered version
   - [Styling] Replace ".. code-block::" with "::" as it depends on the
 external 'pygments' library and the syntaxes available vary between
 different versions. [Thanks to Stephen Finucane, who gave this tip on
 IRC, from experience of doing Sphinx documentation for the Open
 vSwitch project]
   - [Styling] Remove all needless hyperlinks, since ToC will take care
 of them
   - Fix commit message typos
   - Add Copyright / License boilerplate text at the top
   - Reword sentences in "Disk image backing chain notation" section
   - Fix descriptions of `block-{stream, commit}`
   - Rework `block-stream` QMP invocations to take its 'node-name'
 parameter 'base-node'
   - Add 'file.node-name=file' to the '-blockdev' command-line
   - s/shall/will/g
   - Clarify throughout the document, where appropriate,
 that we're starting afresh with the original disk image chain
   - Address mistakes in "Live block commit (`block-commit`)" and
 "QMP invocation for `block-commit`" sections
   - Describe the case of "shallow mirroring" (synchronize only the
 contents of the *top*-most disk image -- "sync": "top") for
 `drive-mirror`, as it's part of an important use case: live storage
 migration without shared storage setup.  (Add a new section: "QMP
 invocation for live storage migration with `drive-mirror` + NBD" as
 part of this)
   - Add QMP invocation example for `blockdev-{mirror, backup}`
---
 docs/interop/live-block-operations.rst | 1037 
 docs/live-block-ops.txt                |   72 ---
 2 files changed, 1037 insertions(+), 72 deletions(-)
 create mode 100644 docs/interop/live-block-operations.rst
 delete mode 100644 docs/

[Qemu-block] [PATCH v4 0/2] Rewrite 'live-block-ops.txt'; convert 'bitmaps.md' to rST

2017-06-28 Thread Kashyap Chamarthy
Rewrite the 'live-block-ops.txt' document (after renaming it to
'live-block-operations.rst') in reStructuredText (rST) format.  Given
upstream QEMU's desire[*] to take advantage of the Sphinx + rST
framework to generate its documentation:

"Based on experience from the Linux kernel, QEMU's docs pipeline is
going to be based on Sphinx [...] Sphinx is extensible and it is
easy to add new input formats and input directives."

And as part of review, John Snow suggested[+] to link to the
'bitmaps.md' document.  So while at it, convert the 'bitmaps.md'
document also into rST syntax.

Then, move both documents ('live-block-operations.rst' and
'bitmaps.rst') to the 'qemu/docs/interop' directory.  (Paolo explained the
term "interop" as: "management & interoperability with QEMU".)

That's the result of this series:

(1) Rewrite the 'live-block-ops.txt' document (for details, refer to the
commit message of the patch) in rST.

Sphinx-generated HTML rendering:


https://kashyapc.fedorapeople.org/v4-QEMU-Docs/_build/html/docs/live-block-operations.html

(2) Convert the 'bitmaps.md' document to rST, as discussed on this[+]
thread.

Sphinx-generated HTML rendering:


https://kashyapc.fedorapeople.org/v4-QEMU-Docs/_build/html/docs/bitmaps.html

NB: Since I rolled up the bitmaps.rst into this submission, I just
stuck a v4 for this patch, too.

[*] http://wiki.qemu.org/Features/Documentation
[+] https://lists.nongnu.org/archive/html/qemu-devel/2017-06/msg05428.html

Kashyap Chamarthy (2):
  live-block-ops.txt: Rename, rewrite, and improve it
  bitmaps.md: Convert to rST; move it into 'interop' dir

 docs/devel/bitmaps.md  |  505 
 docs/interop/bitmaps.rst   |  555 +
 docs/interop/live-block-operations.rst | 1037 
 docs/live-block-ops.txt                |   72 ---
 4 files changed, 1592 insertions(+), 577 deletions(-)
 delete mode 100644 docs/devel/bitmaps.md
 create mode 100644 docs/interop/bitmaps.rst
 create mode 100644 docs/interop/live-block-operations.rst
 delete mode 100644 docs/live-block-ops.txt

-- 
2.7.5




Re: [Qemu-block] [PATCH 1/4] block/qcow2: add compression_algorithm create option

2017-06-28 Thread Denis V. Lunev
On 06/27/2017 03:34 PM, Peter Lieven wrote:
> this patch adds a new compression_algorithm option when creating qcow2 images.
> The current default for the compression algorithm is zlib and zlib will be
> used when this option is omitted (like before).
>
> If the option is specified e.g. with:
>
>  qemu-img create -f qcow2 -o compression_algorithm=zlib image.qcow2 1G
>
> then a new compression algorithm header extension is added and an incompatible
> feature bit is set. This means that if the header is present it must be parsed
> by Qemu on qcow2_open and it must be validated if the specified compression
> algorithm is supported by the current build of Qemu.
>
> This means if the compression_algorithm option is specified Qemu prior to this
> commit will not be able to open the created image.
>
> Signed-off-by: Peter Lieven 

In general, it is odd to mix formatting changes, spec changes, and
real code changes in one patch.

[skipped]

> diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
> index 80cdfd0..1f165d6 100644
> --- a/docs/interop/qcow2.txt
> +++ b/docs/interop/qcow2.txt
> @@ -85,7 +85,11 @@ in the description of a field.
>  be written to (unless for regaining
>  consistency).
>  
> -Bits 2-63:  Reserved (set to 0)
> +Bit 2:  Compress algorithm bit.  If this bit is set then
> +the compress algorithm extension must be parsed
> +and checked for compatiblity.
Eric is correct here. We should add a note that the compression algorithm
extension must be present when the bit is set and must be absent otherwise.
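
A minimal sketch of that rule, assuming the reader has already walked the
header extensions and knows whether one with magic 0xC0318300 was found.
The macro and function names below are hypothetical and not taken from the
patch; only the bit number comes from the proposed spec change:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; bit 2 of incompatible_features as proposed in the patch. */
#define INCOMPAT_COMPRESSION_BIT (UINT64_C(1) << 2)

/*
 * Return 0 if the feature bit and the header extension agree,
 * -EINVAL if one is present without the other.
 */
int check_compression_consistency(uint64_t incompatible_features,
                                  bool have_compression_ext)
{
    bool bit_set = (incompatible_features & INCOMPAT_COMPRESSION_BIT) != 0;

    if (bit_set != have_compression_ext) {
        return -EINVAL;
    }
    return 0;
}
```

Rejecting both mismatches keeps the bit and the extension from drifting
apart, which is the point of stating the rule in the spec.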

> +
> +Bits 3-63:  Reserved (set to 0)
>  
>   80 -  87:  compatible_features
>  Bitmask of compatible features. An implementation can
> @@ -135,6 +139,8 @@ be stored. Each extension has a structure like the 
> following:
>  0xE2792ACA - Backing file format name
>  0x6803f857 - Feature name table
>  0x23852875 - Bitmaps extension
> +0xC0318300 - Compression Algorithm
> +0xC03183xx - Reserved for compression algorithm params
>  other  - Unknown header extension, can be safely
>   ignored

I think there is no need to reserve 255 magics once we
add an opaque container to the extension.

>  
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 15fa602..03a4b8f 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -40,23 +40,24 @@
>  #define BLOCK_FLAG_ENCRYPT  1
>  #define BLOCK_FLAG_LAZY_REFCOUNTS   8
>  
> -#define BLOCK_OPT_SIZE  "size"
> -#define BLOCK_OPT_ENCRYPT   "encryption"
> -#define BLOCK_OPT_COMPAT6   "compat6"
> -#define BLOCK_OPT_HWVERSION "hwversion"
> -#define BLOCK_OPT_BACKING_FILE  "backing_file"
> -#define BLOCK_OPT_BACKING_FMT   "backing_fmt"
> -#define BLOCK_OPT_CLUSTER_SIZE  "cluster_size"
> -#define BLOCK_OPT_TABLE_SIZE"table_size"
> -#define BLOCK_OPT_PREALLOC  "preallocation"
> -#define BLOCK_OPT_SUBFMT"subformat"
> -#define BLOCK_OPT_COMPAT_LEVEL  "compat"
> -#define BLOCK_OPT_LAZY_REFCOUNTS"lazy_refcounts"
> -#define BLOCK_OPT_ADAPTER_TYPE  "adapter_type"
> -#define BLOCK_OPT_REDUNDANCY"redundancy"
> -#define BLOCK_OPT_NOCOW "nocow"
> -#define BLOCK_OPT_OBJECT_SIZE   "object_size"
> -#define BLOCK_OPT_REFCOUNT_BITS "refcount_bits"
> +#define BLOCK_OPT_SIZE  "size"
> +#define BLOCK_OPT_ENCRYPT   "encryption"
> +#define BLOCK_OPT_COMPAT6   "compat6"
> +#define BLOCK_OPT_HWVERSION "hwversion"
> +#define BLOCK_OPT_BACKING_FILE  "backing_file"
> +#define BLOCK_OPT_BACKING_FMT   "backing_fmt"
> +#define BLOCK_OPT_CLUSTER_SIZE  "cluster_size"
> +#define BLOCK_OPT_TABLE_SIZE"table_size"
> +#define BLOCK_OPT_PREALLOC  "preallocation"
> +#define BLOCK_OPT_SUBFMT"subformat"
> +#define BLOCK_OPT_COMPAT_LEVEL  "compat"
> +#define BLOCK_OPT_LAZY_REFCOUNTS"lazy_refcounts"
> +#define BLOCK_OPT_ADAPTER_TYPE  "adapter_type"
> +#define BLOCK_OPT_REDUNDANCY"redundancy"
> +#define BLOCK_OPT_NOCOW "nocow"
> +#define BLOCK_OPT_OBJECT_SIZE   "object_size"
> +#define BLOCK_OPT_REFCOUNT_BITS "refcount_bits"
> +#define BLOCK_OPT_COMPRESSION_ALGORITHM "compression_algorithm"
>  
>  #define BLOCK_PROBE_BUF_SIZE512
>  
> diff --git a/qemu-img.texi b/qemu-img.texi
> index 5b925ec..c0d1bec 100644
> --- a/qemu-img.texi
> +++ b/qemu-img.texi
> @@ -621,6 +621,16 @@ file which is COW and has data blocks already, it 
> couldn't be c

David Hildenbrand (2):
  target/s390x: Improve heuristic for ipte
  target/s390x: Implement idte instruction

 target/s390x/cpu_models.c  |  1 +
 target/s390x/helper.h  |  1 +
 target/s390x/insn-data.def |  2 ++
 target/s390x/mem_helper.c  | 76 --
 target/s390x/translate.c   | 15 +
 5 files changed, 86 insertions(+), 9 deletions(-)

-- 
2.7.5




Re: [Qemu-block] [Qemu-devel] [PATCH 1/4] block/qcow2: add compression_algorithm create option

2017-06-28 Thread Denis V. Lunev
On 06/27/2017 04:27 PM, Peter Lieven wrote:
> On 27.06.2017 at 15:20, Daniel P. Berrange wrote:
>> On Tue, Jun 27, 2017 at 02:34:07PM +0200, Peter Lieven wrote:
>>> this patch adds a new compression_algorithm option when creating
>>> qcow2 images.
>>> The current default for the compression algorithm is zlib and zlib
>>> will be
>>> used when this option is omitted (like before).
>>>
>>> If the option is specified e.g. with:
>>>
>>>   qemu-img create -f qcow2 -o compression_algorithm=zlib image.qcow2 1G
>> IMHO we should introduce a nested "compress" struct to hold the format
>> name, and any other format-specific arguments, in a way that maps nicely
>> to any future QAPI representation of create options, e.g.
>>
>> { 'enum': 'BlockdevQcow2CompressFormat',
>>'data': [ 'zlib', 'lzo' ] }
>>
>> { 'union': 'BlockdevQcow2Compress',
>>'base': { 'format': 'BlockdevQcow2CompressFormat' },
>>'discriminator': 'format',
>>'data': { 'zlib': 'BlockdevQcow2CompressZLib',
>>  'lzo': 'BlockdevQcow2CompressLZO'} }
>>
>> so it would map to
>>
>>   qemu-img create -f qcow2 -o compress.format=zlib image.qcow2 1G
>>
>> and let us have other compress. options specific to each format
>
> Or would it be possible to start with just a compress.level (int)
> parameter? In fact, that would be sufficient for almost all formats
> (or is "algorithms" the better term?).
> The windowBits can default to -15 in the future. It seems the old choice
> of -12 was just not optimal. We just have to keep using it for backwards
> compatibility if the compress options are not specified.
>
> Peter
>
We can put generic parameters on top (in a generic header) and
put an algorithm-dependent container inside. This could help
us avoid incompatible format changes in the future.
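
One possible shape for such an extension, as a rough sketch only. Every
name and field width here is an assumption made for illustration; nothing
below is taken from the patch, and a real on-disk layout would store the
fields big-endian like the rest of the qcow2 header:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: fixed generic fields, then an opaque blob. */
typedef struct CompressExtHeader {
    uint8_t  algorithm;   /* generic: which algorithm, e.g. 0 = zlib */
    uint8_t  level;       /* generic: compression level */
    uint16_t reserved;    /* generic: must be zero */
    uint32_t params_len;  /* length in bytes of the opaque parameters */
    /* followed by params_len bytes of algorithm-specific parameters */
} CompressExtHeader;

/*
 * Return the total payload size (generic header plus opaque container),
 * or 0 if it does not fit into the ext_len bytes of the extension.
 */
size_t compress_ext_size(const CompressExtHeader *h, size_t ext_len)
{
    size_t total = sizeof(*h) + (size_t)h->params_len;

    if (total < sizeof(*h) || total > ext_len) {
        return 0;
    }
    return total;
}
```

With this split, a reader that does not know an algorithm's parameters can
still parse the generic part and skip the container by its length.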

Den



Re: [Qemu-block] [PATCH RFC v3 3/8] block: add throttle block filter driver

2017-06-28 Thread Kevin Wolf
On 23.06.2017 at 14:46, Manos Pitsidianakis wrote:
> block/throttle.c uses existing I/O throttle infrastructure inside a
> block filter driver. I/O operations are intercepted in the filter's
> read/write coroutines, and referred to block/throttle-groups.c
> 
> The driver can be used with the command
> -drive driver=throttle,file.filename=foo.qcow2,iops-total=...
> The configuration flags and semantics are identical to the hardcoded
> throttling ones.
> 
> Signed-off-by: Manos Pitsidianakis 
> ---
>  block/Makefile.objs |   1 +
>  block/throttle.c| 427 
> 
>  include/qemu/throttle-options.h |  60 --
>  3 files changed, 469 insertions(+), 19 deletions(-)
>  create mode 100644 block/throttle.c
> 
> diff --git a/block/Makefile.objs b/block/Makefile.objs
> index ea955302c8..bb811a4d01 100644
> --- a/block/Makefile.objs
> +++ b/block/Makefile.objs
> @@ -25,6 +25,7 @@ block-obj-y += accounting.o dirty-bitmap.o
>  block-obj-y += write-threshold.o
>  block-obj-y += backup.o
>  block-obj-$(CONFIG_REPLICATION) += replication.o
> +block-obj-y += throttle.o
>  
>  block-obj-y += crypto.o
>  
> diff --git a/block/throttle.c b/block/throttle.c
> new file mode 100644
> index 00..0c17051161
> --- /dev/null
> +++ b/block/throttle.c
> @@ -0,0 +1,427 @@
> +/*
> + * QEMU block throttling filter driver infrastructure
> + *
> + * Copyright (c) 2017 Manos Pitsidianakis
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation; either version 2 or
> + * (at your option) version 3 of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see .
> + */

Please consider using the LGPL. We're still hoping to turn the block
layer into a library one day, and almost all code in it is licensed
liberally (MIT or LGPL).

> +#include "qemu/osdep.h"
> +#include "block/throttle-groups.h"
> +#include "qemu/throttle-options.h"
> +#include "qapi/error.h"
> +
> +
> +static QemuOptsList throttle_opts = {
> +.name = "throttle",
> +.head = QTAILQ_HEAD_INITIALIZER(throttle_opts.head),
> +.desc = {
> +{
> +.name = QEMU_OPT_IOPS_TOTAL,
> +.type = QEMU_OPT_NUMBER,
> +.help = "limit total I/O operations per second",
> +},{
> +.name = QEMU_OPT_IOPS_READ,
> +.type = QEMU_OPT_NUMBER,
> +.help = "limit read operations per second",
> +},{
> +.name = QEMU_OPT_IOPS_WRITE,
> +.type = QEMU_OPT_NUMBER,
> +.help = "limit write operations per second",
> +},{
> +.name = QEMU_OPT_BPS_TOTAL,
> +.type = QEMU_OPT_NUMBER,
> +.help = "limit total bytes per second",
> +},{
> +.name = QEMU_OPT_BPS_READ,
> +.type = QEMU_OPT_NUMBER,
> +.help = "limit read bytes per second",
> +},{
> +.name = QEMU_OPT_BPS_WRITE,
> +.type = QEMU_OPT_NUMBER,
> +.help = "limit write bytes per second",
> +},{
> +.name = QEMU_OPT_IOPS_TOTAL_MAX,
> +.type = QEMU_OPT_NUMBER,
> +.help = "I/O operations burst",
> +},{
> +.name = QEMU_OPT_IOPS_READ_MAX,
> +.type = QEMU_OPT_NUMBER,
> +.help = "I/O operations read burst",
> +},{
> +.name = QEMU_OPT_IOPS_WRITE_MAX,
> +.type = QEMU_OPT_NUMBER,
> +.help = "I/O operations write burst",
> +},{
> +.name = QEMU_OPT_BPS_TOTAL_MAX,
> +.type = QEMU_OPT_NUMBER,
> +.help = "total bytes burst",
> +},{
> +.name = QEMU_OPT_BPS_READ_MAX,
> +.type = QEMU_OPT_NUMBER,
> +.help = "total bytes read burst",
> +},{
> +.name = QEMU_OPT_BPS_WRITE_MAX,
> +.type = QEMU_OPT_NUMBER,
> +.help = "total bytes write burst",
> +},{
> +.name = QEMU_OPT_IOPS_TOTAL_MAX_LENGTH,
> +.type = QEMU_OPT_NUMBER,
> +.help = "length of the iopstotalmax burst period, in seconds",
> +},{
> +.name = QEMU_OPT_IOPS_READ_MAX_LENGTH,
> +.type = QEMU_OPT_NUMBER,
> +.help = "length of the iopsreadmax burst period, in seconds",
> +},{
> +.name = QEMU_OPT_IOPS_WRITE_MAX_LENGTH,
> +.type = QEMU_OPT_NUMBER,
> +.help = "length of the iopswritemax burst period, in seconds",
> +   

Re: [Qemu-block] [PATCH v22 13/30] block: new bdrv_reopen_bitmaps_rw interface

2017-06-28 Thread Vladimir Sementsov-Ogievskiy

28.06.2017 16:31, Paolo Bonzini wrote:


On 28/06/2017 15:02, Vladimir Sementsov-Ogievskiy wrote:

Interesting: I see this problem only in your replies. In my own
messages the whitespace is in its place.

That's the good old Thunderbird "format=flowed" bug.

Vladimir, download
http://people.redhat.com/pbonzini/format-flawed.tar.gz and place it into
~/.thunderbird//extensions.  It works around the bug.


Unfortunately, with this installed, the 'Reply' and 'Reply to all' buttons do nothing.



Paolo


28.06.2017 15:36, Eric Blake wrote:

[meta-comment]

On 06/28/2017 07:10 AM, Vladimir Sementsov-Ogievskiy wrote:

28.06.2017 15:05, Vladimir Sementsov-Ogievskiy wrote:

Add format driver handler, which should mark loaded read-only
bitmaps as 'IN_USE' in the image and unset read_only field in
corresponding BdrvDirtyBitmap's.

Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: John Snow

Your original message had spaces before '<' in the email addresses, but
it got lost here...



--
Best regards,
Vladimir




[Qemu-block] [PATCH] tests: Avoid non-portable 'echo -ARG'

2017-06-28 Thread Eric Blake
POSIX says that backslashes in the arguments to 'echo', as well as
any use of 'echo -n' and 'echo -e', are non-portable; it recommends
people should favor 'printf' instead.  This is definitely true where
we do not control which shell is running (such as in makefile snippets
or in documentation examples).  But even for scripts where we
require bash (and therefore, where echo does what we want by default),
it is still possible to use 'shopt -s xpg_echo' to change bash's
behavior of echo.  And setting a good example never hurts when we are
not sure if a snippet will be copied from a bash-only script to a
general shell script (although I don't change the use of non-portable
\e for ESC when we know the running shell is bash).

Replace 'echo -n "..."' with 'printf "..."', and 'echo -e "..."'
with 'printf "...\n"'.

In the qemu-iotests check script, also fix unusual shell quoting
that would result in word-splitting if 'date' outputs a space.

Signed-off-by: Eric Blake 
---

Of course, Stefan's pending patch:
[PATCH 3/5] qemu-iotests: 068: extract _qemu() function
also touches 068, so there may be some (obvious) merge conflicts
to resolve there depending on what goes in first.

 qemu-options.hx |  4 ++--
 tests/multiboot/run_test.sh | 10 +-
 tests/qemu-iotests/051  |  7 ---
 tests/qemu-iotests/068  |  2 +-
 tests/qemu-iotests/142  | 48 ++---
 tests/qemu-iotests/171  | 14 ++---
 tests/qemu-iotests/check| 18 -
 tests/rocker/all| 10 +-
 tests/tcg/cris/Makefile |  8 
 9 files changed, 61 insertions(+), 60 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 896ff17..c8205bb 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4351,7 +4351,7 @@ The simplest (insecure) usage is to provide the secret 
inline

 The simplest secure usage is to provide the secret via a file

- # echo -n "letmein" > mypasswd.txt
+ # printf "letmein" > mypasswd.txt
  # $QEMU -object secret,id=sec0,file=mypasswd.txt,format=raw

 For greater security, AES-256-CBC should be used. To illustrate usage,
@@ -4379,7 +4379,7 @@ telling openssl to base64 encode the result, but it could 
be left
 as raw bytes if desired.

 @example
- # SECRET=$(echo -n "letmein" |
+ # SECRET=$(printf "letmein" |
 openssl enc -aes-256-cbc -a -K $KEY -iv $IV)
 @end example

diff --git a/tests/multiboot/run_test.sh b/tests/multiboot/run_test.sh
index 78d7edf..35bfe0e 100755
--- a/tests/multiboot/run_test.sh
+++ b/tests/multiboot/run_test.sh
@@ -26,7 +26,7 @@ run_qemu() {
 local kernel=$1
 shift

-echo -e "\n\n=== Running test case: $kernel $@ ===\n" >> test.log
+printf "\n\n=== Running test case: $kernel $@ ===\n\n" >> test.log

 $QEMU \
 -kernel $kernel \
@@ -68,21 +68,21 @@ for t in mmap modules; do
 pass=1

 if [ $debugexit != 1 ]; then
-echo -e "\e[31m ?? \e[0m $t (no debugexit used, exit code $ret)"
+printf "\e[31m ?? \e[0m $t (no debugexit used, exit code $ret)\n"
 pass=0
 elif [ $ret != 0 ]; then
-echo -e "\e[31mFAIL\e[0m $t (exit code $ret)"
+printf "\e[31mFAIL\e[0m $t (exit code $ret)\n"
 pass=0
 fi

 if ! diff $t.out test.log > /dev/null 2>&1; then
-echo -e "\e[31mFAIL\e[0m $t (output difference)"
+printf "\e[31mFAIL\e[0m $t (output difference)\n"
 diff -u $t.out test.log
 pass=0
 fi

 if [ $pass == 1 ]; then
-echo -e "\e[32mPASS\e[0m $t"
+printf "\e[32mPASS\e[0m $t\n"
 fi

 done
diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index 26c29de..322c4a8 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -217,7 +217,7 @@ run_qemu -drive driver=null-co,cache=invalid_value
 # Test 142 checks the direct=on cases

 for cache in writeback writethrough unsafe invalid_value; do
-echo -e "info block\ninfo block file\ninfo block backing\ninfo block 
backing-file" | \
+printf "info block\ninfo block file\ninfo block backing\ninfo block 
backing-file\n" | \
 run_qemu -drive 
file="$TEST_IMG",cache=$cache,backing.file.filename="$TEST_IMG.base",backing.cache.no-flush=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=$device_id
 -nodefaults
 done

@@ -325,8 +325,9 @@ echo "qemu-io $device_id \"write -P 0x22 0 4k\"" | run_qemu 
-drive file="$TEST_I

 $QEMU_IO -c "read -P 0x22 0 4k" "$TEST_IMG" | _filter_qemu_io

-echo -e "qemu-io $device_id \"write -P 0x33 0 4k\"\ncommit $device_id" | 
run_qemu -drive file="$TEST_IMG",snapshot=on,if=none,id=$device_id\
-   | 
_filter_qemu_io
+printf "qemu-io $device_id \"write -P 0x33 0 4k\"\ncommit $device_id\n" |
+run_qemu -drive file="$TEST_IMG",snapshot=on,if=none,id=$device_id |
+_filter_qemu_io

 $QEMU_IO -c "read -P 0x33 0 4k" "$TEST_IMG" | _filter_qemu_io

diff -

Re: [Qemu-block] [PATCH v2 3/4] qcow2: add shrink image support

2017-06-28 Thread Max Reitz
On 2017-06-27 17:06, Pavel Butsykin wrote:
> On 26.06.2017 20:47, Max Reitz wrote:
>> On 2017-06-26 17:23, Pavel Butsykin wrote:
> []
>>>
>>> Is there any guarantee that in the future this will not change? Because
>>> in this case it can be a potential danger.
>>
>> Since this behavior is not documented anywhere, there is no guarantee.
>>
>>> I can add a comment... Or add a new variable with the size of
>>> reftable_tmp, and every time count min(s->refcount_table_size,
>>> reftable_tmp_size)
>>> before accessing s->refcount_table[]/reftable_tmp[]
>>
>> Or (1) you add an assertion that refcount_table_size doesn't change
>> along with a comment why that is the case, which also explains in detail
>> why the call to qcow2_free_clusters() should be safe: The on-disk
>> reftable differs from the one in memory. qcow2_free_clusters() and
>> update_refcount() themselves do not access the reftable, so they are
>> safe. However, update_refcount() calls alloc_refcount_block(), and that
>> function does access the reftable: Now, as long as
>> s->refcount_table_size does not shrink (which I can't see why it would),
>> refcount_table_index should always be smaller. Now we're accessing
>> s->refcount_table: This will always return an existing refblock because
>> this will either be the refblock itself (for self-referencing refblocks)
>> or another one that is not going to be freed by qcow2_shrink_reftable()
>> because this function will not free refblocks which cover other clusters
>> than themselves.
>> We will then proceed to update the refblock which is either right (if it
>> is not the refblock to be freed) or won't do anything (if it is the one
>> to be freed).
>> In any case, we will never write to the reftable and reading from the
>> basically outdated cached version will never do anything bad.
> 
> OK, SGTM.
> 
>> Or (2) you copy reftable_tmp into s->refcount_table[] *before* any call
>> to qcow2_free_clusters(). To make this work, you would need to also
>> discard all refblocks from the cache in this function here (and not in
>> update_refcount()) and then only call qcow2_free_clusters() on refblocks
>> which were not self-referencing. An alternative hack would be to simply
>> mark the image dirty and just not do any qcow2_free_clusters() call...
> 
> The main purpose of the qcow2_reftable_shrink() function is to discard all
> unnecessary refblocks from the file. If we only rewrite
> refcount_table and discard non-self-referencing refblocks (which are
> actually very rare), then the meaning of the function is lost.

It would do exactly the same. The idea is that you do not need to call
qcow2_free_clusters() on self-referencing refblocks at all, since they
are freed automatically when their reftable entry is overwritten with 0.

>> Or (3) of course it would be possible to not clean up refcount
>> structures at all...
> 
> Nice solution :)

It is, because as I said refcount structures only have a small overhead.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v22 00/30] qcow2: persistent dirty bitmaps

2017-06-28 Thread Vladimir Sementsov-Ogievskiy

28.06.2017 16:01, Paolo Bonzini wrote:



On 28/06/2017 14:05, Vladimir Sementsov-Ogievskiy wrote:

Rebase on master, so changes, mostly related to new dirty bitmaps mutex:

10: - asserts now in bdrv_{re,}set_dirty_bitmap_locked functions.
 - also add assert into bdrv_undo_clear_dirty_bitmap (the only change not
   related to rebase)
 - add mutex lock into bdrv_dirty_bitmap_set_readonly (as it changes the
   bitmap list, so the lock should be taken)
 - return instead of go-to in qmp_block_dirty_bitmap_clear
 - in dirty-bitmaps.h, move bdrv_dirty_bitmap_set_readonly before block
   "Functions that require manual locking", move
   bdrv_dirty_bitmap_readonly and bdrv_has_readonly_bitmaps into this block
15: - add mutex lock/unlock into bdrv_dirty_bitmap_set_autoload
 - in dirty-bitmaps.h, move bdrv_dirty_bitmap_set_autoload before block
   "Functions that require manual locking", move
   bdrv_dirty_bitmap_get_autoload into this block
17: - add mutex lock/unlock into bdrv_dirty_bitmap_set_persistance
 - in dirty-bitmaps.h, move bdrv_dirty_bitmap_set_persistance before block
   "Functions that require manual locking", move
   bdrv_dirty_bitmap_get_persistance and
   bdrv_has_changed_persistent_bitmaps into this block
18: in dirty-bitmaps.h, move bdrv_dirty_bitmap_next into block
 "Functions that require manual locking". (do not remove r-b, as it is
 just one empty line removed before function declaration)
23: return instead of go-to in qmp_block_dirty_bitmap_add
24: return instead of go-to in qmp_block_dirty_bitmap_add
25: - return instead of go-to
 - remove aio_context_acquire/release calls
 - no aio_context parameter for block_dirty_bitmap_lookup
 - in dirty-bitmaps.h, move bdrv_dirty_bitmap_sha256 into block
 "Functions that require manual locking".
29: - return instead of go-to in qmp_block_dirty_bitmap_remove


All looks good, thanks.  I'll rebase my own fixes on top of these
patches, no need to have you respin them.

Paolo



Thank you! And for the Thunderbird workaround, too!

--
Best regards,
Vladimir



Re: [Qemu-block] [PATCH RFC v3 6/8] block: add options parameter to bdrv_new_open_driver()

2017-06-28 Thread Manos Pitsidianakis

On Wed, Jun 28, 2017 at 03:42:41PM +0200, Alberto Garcia wrote:

On Fri 23 Jun 2017 02:46:58 PM CEST, Manos Pitsidianakis wrote:

 BlockDriverState *bdrv_new_open_driver(BlockDriver *drv, const char *node_name,
-   int flags, Error **errp)
+   int flags, QDict *options, Error **errp)
 {
 BlockDriverState *bs;
 int ret;

 bs = bdrv_new();
 bs->open_flags = flags;
-bs->explicit_options = qdict_new();
-bs->options = qdict_new();
+if (options) {
+bs->explicit_options = qdict_clone_shallow(options);
+bs->options = qdict_clone_shallow(options);
+} else {
+bs->explicit_options = qdict_new();
+bs->options = qdict_new();
+}
 bs->opaque = NULL;

 update_options_from_flags(bs->options, flags);

-ret = bdrv_open_driver(bs, drv, node_name, bs->options, flags, errp);
+ret = bdrv_open_driver(bs, drv, node_name, options, flags, errp);


Why this last change? In the default case you're now passing NULL
instead of the QDict created with qdict_new().


Duh, nice catch!




Re: [Qemu-block] [PATCH RFC v3 6/8] block: add options parameter to bdrv_new_open_driver()

2017-06-28 Thread Alberto Garcia
On Fri 23 Jun 2017 02:46:58 PM CEST, Manos Pitsidianakis wrote:
>  BlockDriverState *bdrv_new_open_driver(BlockDriver *drv, const char *node_name,
> -   int flags, Error **errp)
> +   int flags, QDict *options, Error **errp)
>  {
>  BlockDriverState *bs;
>  int ret;
>  
>  bs = bdrv_new();
>  bs->open_flags = flags;
> -bs->explicit_options = qdict_new();
> -bs->options = qdict_new();
> +if (options) {
> +bs->explicit_options = qdict_clone_shallow(options);
> +bs->options = qdict_clone_shallow(options);
> +} else {
> +bs->explicit_options = qdict_new();
> +bs->options = qdict_new();
> +}
>  bs->opaque = NULL;
>  
>  update_options_from_flags(bs->options, flags);
>  
> -ret = bdrv_open_driver(bs, drv, node_name, bs->options, flags, errp);
> +ret = bdrv_open_driver(bs, drv, node_name, options, flags, errp);

Why this last change? In the default case you're now passing NULL
instead of the QDict created with qdict_new().

Berto
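
[Editorial illustration of the question above.] A minimal sketch, not QEMU code: Dict, Node and the helper functions below are hypothetical stand-ins for QDict, BlockDriverState and the calls in the quoted hunk. The stand-in for the driver open call simply reports what it was handed, so the difference between forwarding the caller's pointer and forwarding the node's own dictionary becomes visible. How the real bdrv_open_driver() reacts to NULL is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for QDict: only the number of entries is tracked. */
typedef struct Dict {
    int n_entries;
} Dict;

static Dict *dict_new(void)
{
    Dict *d = calloc(1, sizeof(*d));
    assert(d != NULL);
    return d;
}

static Dict *dict_clone(const Dict *src)
{
    Dict *d = dict_new();
    d->n_entries = src->n_entries;
    return d;
}

/* Hypothetical stand-in for BlockDriverState. */
typedef struct Node {
    Dict *options;
} Node;

/* Stand-in for update_options_from_flags(): adds one flag-derived entry. */
static void add_flag_options(Dict *options)
{
    options->n_entries++;
}

/*
 * Stand-in for the driver open call. It reports how many option entries
 * the driver can see, or -1 if it was handed NULL.
 */
static int driver_visible_entries(const Dict *options)
{
    return options ? options->n_entries : -1;
}

static void node_init_options(Node *node, const Dict *caller_options)
{
    node->options = caller_options ? dict_clone(caller_options) : dict_new();
    add_flag_options(node->options);
}

/* Questioned variant: forwards the caller's pointer, which may be NULL. */
int node_open_forward_caller(Node *node, const Dict *caller_options)
{
    node_init_options(node, caller_options);
    return driver_visible_entries(caller_options);
}

/* Variant that forwards the node's own dictionary, as before the patch. */
int node_open_forward_own(Node *node, const Dict *caller_options)
{
    node_init_options(node, caller_options);
    return driver_visible_entries(node->options);
}
```

In this model the first variant hands NULL to the driver in the default case, and even with caller options it hides the flag-derived entry that was added to the node's copy.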



Re: [Qemu-block] [PATCH RFC v3 5/8] block: add BlockDevOptionsThrottle to QAPI

2017-06-28 Thread Manos Pitsidianakis

On Wed, Jun 28, 2017 at 03:35:48PM +0200, Alberto Garcia wrote:

On Fri 23 Jun 2017 02:46:57 PM CEST, Manos Pitsidianakis wrote:


+# @BlockdevOptionsThrottle:
+#
+# Driver specific block device options for Throttle
+#


I would put this earlier in the json file, together with the rest of the
BlockdevOptions* structs.


+# @throttling-group: the name of the throttling group to use


Why not call it simply "group" ?


Sure! :)



+#
+# @options:BlockIOThrottle options
+# Since: 2.9
+##
+{ 'struct': 'BlockdevOptionsThrottle',
+  'data': { 'throttling-group': 'str',
+'file' : 'BlockdevRef',
+'*options' : 'BlockIOThrottle'
+ } }


Not sure if 'file' is the best name for the field ("child"?), but I'm
fine with it.


There doesn't seem to be a consistent naming scheme in block-core.json
("file" and "image" are candidates), so I named it 'file' after bs->file.




Re: [Qemu-block] [PATCH RFC v3 5/8] block: add BlockDevOptionsThrottle to QAPI

2017-06-28 Thread Alberto Garcia
On Fri 23 Jun 2017 02:46:57 PM CEST, Manos Pitsidianakis wrote:

> +# @BlockdevOptionsThrottle:
> +#
> +# Driver specific block device options for Throttle
> +#

I would put this earlier in the json file, together with the rest of the
BlockdevOptions* structs.

> +# @throttling-group: the name of the throttling group to use

Why not call it simply "group" ?

> +#
> +# @options:BlockIOThrottle options
> +# Since: 2.9
> +##
> +{ 'struct': 'BlockdevOptionsThrottle',
> +  'data': { 'throttling-group': 'str',
> +'file' : 'BlockdevRef',
> +'*options' : 'BlockIOThrottle'
> + } }

Not sure if 'file' is the best name for the field ("child"?), but I'm
fine with it.

Berto



Re: [Qemu-block] [PATCH v22 13/30] block: new bdrv_reopen_bitmaps_rw interface

2017-06-28 Thread Vladimir Sementsov-Ogievskiy

In the end, it looks like a Thunderbird bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=1160880

[sorry for so much off-topic]

28.06.2017 16:13, Vladimir Sementsov-Ogievskiy wrote:

28.06.2017 16:02, Vladimir Sementsov-Ogievskiy wrote:
Interesting, but I see this problem only in your replies; in my own
messages the whitespace is in place.


In the outgoing message I see this whitespace, but in the copy received
from the mailing list it is absent.




28.06.2017 15:36, Eric Blake wrote:

[meta-comment]

On 06/28/2017 07:10 AM, Vladimir Sementsov-Ogievskiy wrote:

28.06.2017 15:05, Vladimir Sementsov-Ogievskiy wrote:

Add format driver handler, which should mark loaded read-only
bitmaps as 'IN_USE' in the image and unset read_only field in
corresponding BdrvDirtyBitmap's.

Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: John Snow

Your original message had spaces before '<' in the email addresses, but
it got lost here...



Forgot to add:

Reviewed-by: Max Reitz

...this one also lacks the space.  I'm not sure if git cares, but it may
be worth investigating why your mailer eats the space when you reply
manually rather than sending via git; and for consistency, it is worth
keeping the space (for example, we like to grep 'git log' for learning
how active various contributors are, and having a consistent usage of
space before < in an email address can make the task easier).








--
Best regards,
Vladimir



Re: [Qemu-block] [PATCH v22 13/30] block: new bdrv_reopen_bitmaps_rw interface

2017-06-28 Thread Paolo Bonzini


On 28/06/2017 15:02, Vladimir Sementsov-Ogievskiy wrote:
> Interesting, but I see this problem only in your replies; in my own
> messages the whitespace is in place.

That's the good old Thunderbird "format=flowed" bug.

Vladimir, download
http://people.redhat.com/pbonzini/format-flawed.tar.gz and place it into
~/.thunderbird//extensions.  It works around the bug.

Paolo

> 28.06.2017 15:36, Eric Blake wrote:
>> [meta-comment]
>>
>> On 06/28/2017 07:10 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> 28.06.2017 15:05, Vladimir Sementsov-Ogievskiy wrote:
 Add format driver handler, which should mark loaded read-only
 bitmaps as 'IN_USE' in the image and unset read_only field in
 corresponding BdrvDirtyBitmap's.

 Signed-off-by: Vladimir Sementsov-Ogievskiy
 Reviewed-by: John Snow
>> Your original message had spaces before '<' in the email addresses, but
>> it got lost here...
>>



Re: [Qemu-block] [PATCH v22 13/30] block: new bdrv_reopen_bitmaps_rw interface

2017-06-28 Thread Vladimir Sementsov-Ogievskiy

28.06.2017 16:02, Vladimir Sementsov-Ogievskiy wrote:
Interesting, but I see this problem only in your replies; in my own
messages the whitespace is in place.


In the outgoing message I see this whitespace, but in the copy received
from the mailing list it is absent.




28.06.2017 15:36, Eric Blake wrote:

[meta-comment]

On 06/28/2017 07:10 AM, Vladimir Sementsov-Ogievskiy wrote:

28.06.2017 15:05, Vladimir Sementsov-Ogievskiy wrote:

Add format driver handler, which should mark loaded read-only
bitmaps as 'IN_USE' in the image and unset read_only field in
corresponding BdrvDirtyBitmap's.

Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: John Snow

Your original message had spaces before '<' in the email addresses, but
it got lost here...



Forgot to add:

Reviewed-by: Max Reitz

...this one also lacks the space.  I'm not sure if git cares, but it may
be worth investigating why your mailer eats the space when you reply
manually rather than sending via git; and for consistency, it is worth
keeping the space (for example, we like to grep 'git log' for learning
how active various contributors are, and having a consistent usage of
space before < in an email address can make the task easier).





--
Best regards,
Vladimir




Re: [Qemu-block] [Qemu-devel] [PULL 11/61] virtio-pci: use ioeventfd even when KVM is disabled

2017-06-28 Thread QingFeng Hao



On 2017/6/28 18:22, Kevin Wolf wrote:

On 28.06.2017 at 12:11, QingFeng Hao wrote:

On 2017/6/24 0:21, Kevin Wolf wrote:

From: Stefan Hajnoczi 

Old kvm.ko versions only supported a tiny number of ioeventfds so
virtio-pci avoids ioeventfds when kvm_has_many_ioeventfds() returns 0.

Do not check kvm_has_many_ioeventfds() when KVM is disabled since it
always returns 0.  Since commit 8c56c1a592b5092d91da8d8943c1d6462a6f
("memory: emulate ioeventfd") it has been possible to use ioeventfds in
qtest or TCG mode.

This patch makes -device virtio-blk-pci,iothread=iothread0 work even
when KVM is disabled.

I have tested that virtio-blk-pci works under TCG both with and without
iothread.

Cc: Michael S. Tsirkin 
Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Kevin Wolf 
---
  hw/virtio/virtio-pci.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 20d6a08..301920e 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1740,7 +1740,7 @@ static void virtio_pci_realize(PCIDevice *pci_dev, Error **errp)
  bool pcie_port = pci_bus_is_express(pci_dev->bus) &&
   !pci_bus_is_root(pci_dev->bus);

-if (!kvm_has_many_ioeventfds()) {
+if (kvm_enabled() && !kvm_has_many_ioeventfds()) {
  proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
  }

This response is actually for mail thread "Re: [Qemu-devel] [PATCH
1/5] virtio-pci: use ioeventfd even when KVM is disabled"
which I didn't receive, sorry.
Like Fam, I also saw test case 068 fail on s390x and x86, for the
same reason.
With this patch applied, no failure found. Further investigation
shows that the error is in
virtio_scsi_dataplane_setup:
  if (!virtio_device_ioeventfd_enabled(vdev)) {
 error_setg(errp, "ioeventfd is required for iothread");
 return;
  }
call flow is:
virtio_device_ioeventfd_enabled-->virtio_bus_ioeventfd_enabled
-->k->ioeventfd_enabled-->virtio_pci_ioeventfd_enabled
virtio_pci_ioeventfd_enabled checks flag
VIRTIO_PCI_FLAG_USE_IOEVENTFD which was
cleared in virtio_pci_realize if this patch isn't applied.

Yes, we know all of this. However, this patch is not correct and causes
'make check' failures on some platforms. The open question is where that
failure comes from. Before this is solved, the patch can't be applied.

Thanks, Kevin. Maybe I am lucky: I didn't encounter the failure when running
'make check' with this patch applied. Thanks.

Kevin



--
Regards
QingFeng Hao
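
[Editorial illustration of the call flow described in this message.] A minimal sketch under stated assumptions: the flag value, the Env struct and all model_* names are hypothetical and only mirror the two conditions from the quoted hunk and the check quoted from virtio_scsi_dataplane_setup. It models why the iothread setup fails under TCG without the patch, and says nothing about the 'make check' failures Kevin mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit value; the real VIRTIO_PCI_FLAG_USE_IOEVENTFD is not shown here. */
#define MODEL_FLAG_USE_IOEVENTFD (1u << 0)

/* Hypothetical description of the accelerator state. */
typedef struct Env {
    bool kvm_enabled;
    bool kvm_has_many_ioeventfds; /* per the commit message, 0 when KVM is disabled */
} Env;

/* Condition before the patch: the flag is cleared whenever the KVM check fails. */
uint32_t model_realize_before(uint32_t flags, Env env)
{
    if (!env.kvm_has_many_ioeventfds) {
        flags &= ~MODEL_FLAG_USE_IOEVENTFD;
    }
    return flags;
}

/* Condition after the patch: the KVM check only matters when KVM is enabled. */
uint32_t model_realize_after(uint32_t flags, Env env)
{
    if (env.kvm_enabled && !env.kvm_has_many_ioeventfds) {
        flags &= ~MODEL_FLAG_USE_IOEVENTFD;
    }
    return flags;
}

/* Models the dataplane setup check: 0 on success, -1 for
 * "ioeventfd is required for iothread". */
int model_dataplane_setup(uint32_t flags)
{
    return (flags & MODEL_FLAG_USE_IOEVENTFD) ? 0 : -1;
}
```

With KVM disabled, the first variant always clears the flag and the setup check fails; the second variant leaves the flag alone in that case.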




Re: [Qemu-block] [PATCH v22 13/30] block: new bdrv_reopen_bitmaps_rw interface

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Interesting, but I see this problem only in your replies; in my own
messages the whitespace is in place.


28.06.2017 15:36, Eric Blake wrote:

[meta-comment]

On 06/28/2017 07:10 AM, Vladimir Sementsov-Ogievskiy wrote:

28.06.2017 15:05, Vladimir Sementsov-Ogievskiy wrote:

Add format driver handler, which should mark loaded read-only
bitmaps as 'IN_USE' in the image and unset read_only field in
corresponding BdrvDirtyBitmap's.

Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: John Snow

Your original message had spaces before '<' in the email addresses, but
it got lost here...



Forgot to add:

Reviewed-by: Max Reitz

...this one also lacks the space.  I'm not sure if git cares, but it may
be worth investigating why your mailer eats the space when you reply
manually rather than sending via git; and for consistency, it is worth
keeping the space (for example, we like to grep 'git log' for learning
how active various contributors are, and having a consistent usage of
space before < in an email address can make the task easier).



--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH v22 00/30] qcow2: persistent dirty bitmaps

2017-06-28 Thread Paolo Bonzini


On 28/06/2017 14:05, Vladimir Sementsov-Ogievskiy wrote:
> Rebased on master; the changes are mostly related to the new dirty bitmaps mutex:
> 
> 10: - asserts now in bdrv_{re,}set_dirty_bitmap_locked functions.
> - also add assert into bdrv_undo_clear_dirty_bitmap (the only change, not 
> related to rebase)
> - add mutex lock into bdrv_dirty_bitmap_set_readonly (as it changes 
> bitmap list,
>   so the lock should be taken)
> - return instead of go-to in qmp_block_dirty_bitmap_clear
> - in dirty-bitmaps.h, move bdrv_dirty_bitmap_set_readonly before block
>   "Functions that require manual locking", move
>   bdrv_dirty_bitmap_readonly and bdrv_has_readonly_bitmaps into this block
> 15: - add mutex lock/unlock into bdrv_dirty_bitmap_set_autoload
> - in dirty-bitmaps.h, move bdrv_dirty_bitmap_set_autoload before block
>   "Functions that require manual locking", move
>   bdrv_dirty_bitmap_get_autoload into this block
> 17: - add mutex lock/unlock into bdrv_dirty_bitmap_set_persistance
> - in dirty-bitmaps.h, move bdrv_dirty_bitmap_set_persistance before block
>   "Functions that require manual locking", move 
>   bdrv_dirty_bitmap_get_persistance and
>   bdrv_has_changed_persistent_bitmaps into this block
> 18: in dirty-bitmaps.h, move bdrv_dirty_bitmap_next into block
> "Functions that require manual locking". (do not remove r-b, as it is 
> just one empty line removed before function declaration)
> 23: return instead of go-to in qmp_block_dirty_bitmap_add
> 24: return instead of go-to in qmp_block_dirty_bitmap_add
> 25: - return instead of go-to
> - remove aio_context_acquire/release calls
> - no aio_context parameter for block_dirty_bitmap_lookup
> - in dirty-bitmaps.h, move bdrv_dirty_bitmap_sha256 into block
> "Functions that require manual locking".
> 29: - return instead of go-to in qmp_block_dirty_bitmap_remove

All looks good, thanks.  I'll rebase my own fixes on top of these
patches, no need to have you respin them.

Paolo



Re: [Qemu-block] [Xen-devel] [PATCH v2 0/3] xen-disk: performance improvements

2017-06-28 Thread Paul Durrant
> -Original Message-
> From: Stefano Stabellini [mailto:sstabell...@kernel.org]
> Sent: 27 June 2017 23:07
> To: Paul Durrant 
> Cc: xen-de...@lists.xenproject.org; qemu-de...@nongnu.org; qemu-
> bl...@nongnu.org
> Subject: Re: [Xen-devel] [PATCH v2 0/3] xen-disk: performance
> improvements
> 
> On Wed, 21 Jun 2017, Paul Durrant wrote:
> > Paul Durrant (3):
> >   xen-disk: only advertize feature-persistent if grant copy is not
> > available
> >   xen-disk: add support for multi-page shared rings
> >   xen-disk: use an IOThread per instance
> >
> >  hw/block/trace-events |   7 ++
> >  hw/block/xen_disk.c   | 228
> +++---
> >  2 files changed, 188 insertions(+), 47 deletions(-)
> 
> While waiting for an answer on patch #3, I sent a pull request for the
> first 2 patches

Cool. Thanks. Hopefully we won't have to wait too long for review on patch #3.

  Cheers,

Paul



Re: [Qemu-block] [PATCH RFC v3 2/8] block: Add aio_context field in ThrottleGroupMember

2017-06-28 Thread Kevin Wolf
On 28.06.2017 at 14:15, Manos Pitsidianakis wrote:
> On Wed, Jun 28, 2017 at 01:27:36PM +0200, Kevin Wolf wrote:
> >On 23.06.2017 at 14:46, Manos Pitsidianakis wrote:
> >>timer_cb() needs to know about the current Aio context of the throttle
> >>request that is woken up. In order to make ThrottleGroupMember backend
> >>agnostic, this information is stored in an aio_context field instead of
> >>accessing it from BlockBackend.
> >>
> >>Signed-off-by: Manos Pitsidianakis 
> >
> >You're copying the AioContext when the BlockBackend is registered for
> >the throttle group, but what keeps both sides in sync when the context
> >is changed later on? Don't we need to update the ThrottleGroupMember in
> >blk_set_aio_context?
> 
> blk_set_aio_context calls throttle_timers_attach_aio_context which
> updates this. Though as Alberto said util/throttle.c should not know
> about ThrottleGroupMember. This is not needed in the later patches
> because the ThrottleGroupMember's aio_context gets updated as a node
> in the driver's bdrv_attach_aio_context
> 
> We can add a new function in block/throttle.c that updates a
> member's aio context but I'm not sure if it's really needed if
> members are only used in throttle nodes.

Oh, I looked at the final state after the series instead of this very
commit, so I missed the existing calls in blk_set_aio_context().

My bad, sorry for the noise.

Kevin




Re: [Qemu-block] [PATCH v22 13/30] block: new bdrv_reopen_bitmaps_rw interface

2017-06-28 Thread Eric Blake
[meta-comment]

On 06/28/2017 07:10 AM, Vladimir Sementsov-Ogievskiy wrote:
> 28.06.2017 15:05, Vladimir Sementsov-Ogievskiy wrote:
>> Add format driver handler, which should mark loaded read-only
>> bitmaps as 'IN_USE' in the image and unset read_only field in
>> corresponding BdrvDirtyBitmap's.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy
>> Reviewed-by: John Snow

Your original message had spaces before '<' in the email addresses, but
it got lost here...

> 
> 
> Forgot to add:
> 
> Reviewed-by: Max Reitz

...this one also lacks the space.  I'm not sure if git cares, but it may
be worth investigating why your mailer eats the space when you reply
manually rather than sending via git; and for consistency, it is worth
keeping the space (for example, we like to grep 'git log' for learning
how active various contributors are, and having a consistent usage of
space before < in an email address can make the task easier).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-block] [PATCH v22 13/30] block: new bdrv_reopen_bitmaps_rw interface

2017-06-28 Thread Vladimir Sementsov-Ogievskiy

28.06.2017 15:05, Vladimir Sementsov-Ogievskiy wrote:

Add format driver handler, which should mark loaded read-only
bitmaps as 'IN_USE' in the image and unset read_only field in
corresponding BdrvDirtyBitmap's.

Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: John Snow



Forgot to add:

Reviewed-by: Max Reitz


--
Best regards,
Vladimir



[Qemu-block] [PATCH v22 22/30] qcow2: add .bdrv_can_store_new_dirty_bitmap

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Implement the .bdrv_can_store_new_dirty_bitmap interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: John Snow 
Reviewed-by: Max Reitz 
---
 block/qcow2-bitmap.c | 51 +++
 block/qcow2.c|  1 +
 block/qcow2.h|  4 
 3 files changed, 56 insertions(+)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 7912a82c8c..f45324e584 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -1387,3 +1387,54 @@ int qcow2_reopen_bitmaps_ro(BlockDriverState *bs, Error **errp)
 
 return 0;
 }
+
+bool qcow2_can_store_new_dirty_bitmap(BlockDriverState *bs,
+  const char *name,
+  uint32_t granularity,
+  Error **errp)
+{
+BDRVQcow2State *s = bs->opaque;
+bool found;
+Qcow2BitmapList *bm_list;
+
+if (check_constraints_on_bitmap(bs, name, granularity, errp) != 0) {
+goto fail;
+}
+
+if (s->nb_bitmaps == 0) {
+return true;
+}
+
+if (s->nb_bitmaps >= QCOW2_MAX_BITMAPS) {
+error_setg(errp,
+   "Maximum number of persistent bitmaps is already reached");
+goto fail;
+}
+
+if (s->bitmap_directory_size + calc_dir_entry_size(strlen(name), 0) >
+QCOW2_MAX_BITMAP_DIRECTORY_SIZE)
+{
+error_setg(errp, "Not enough space in the bitmap directory");
+goto fail;
+}
+
+bm_list = bitmap_list_load(bs, s->bitmap_directory_offset,
+   s->bitmap_directory_size, errp);
+if (bm_list == NULL) {
+goto fail;
+}
+
+found = find_bitmap_by_name(bm_list, name);
+bitmap_list_free(bm_list);
+if (found) {
+error_setg(errp, "Bitmap with the same name is already stored");
+goto fail;
+}
+
+return true;
+
+fail:
+error_prepend(errp, "Can't make bitmap '%s' persistent in '%s': ",
+  name, bdrv_get_device_or_node_name(bs));
+return false;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index b68e04766f..fc1f69cead 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3611,6 +3611,7 @@ BlockDriver bdrv_qcow2 = {
 .bdrv_attach_aio_context  = qcow2_attach_aio_context,
 
 .bdrv_reopen_bitmaps_rw = qcow2_reopen_bitmaps_rw,
+.bdrv_can_store_new_dirty_bitmap = qcow2_can_store_new_dirty_bitmap,
 };
 
 static void bdrv_qcow2_init(void)
diff --git a/block/qcow2.h b/block/qcow2.h
index 7d0a20c053..8b2f66f8b6 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -633,5 +633,9 @@ bool qcow2_load_autoloading_dirty_bitmaps(BlockDriverState *bs, Error **errp);
 int qcow2_reopen_bitmaps_rw(BlockDriverState *bs, Error **errp);
 void qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs, Error **errp);
 int qcow2_reopen_bitmaps_ro(BlockDriverState *bs, Error **errp);
+bool qcow2_can_store_new_dirty_bitmap(BlockDriverState *bs,
+  const char *name,
+  uint32_t granularity,
+  Error **errp);
 
 #endif
-- 
2.11.1
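
[Editorial illustration of the checks in the patch above.] A minimal sketch, assuming hypothetical limits and a hypothetical directory model: MODEL_MAX_BITMAPS, MODEL_MAX_DIR_SIZE, MODEL_ENTRY_HEADER, DirModel and the model_* functions are invented for illustration and are not the qcow2 constants or helpers. It follows the order of checks in qcow2_can_store_new_dirty_bitmap() after the initial constraint check, which is omitted here: an empty directory accepts immediately, then the bitmap count, the directory size and finally the name are checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative limits only; the real qcow2 limits are not shown in this patch. */
enum {
    MODEL_MAX_BITMAPS = 3,
    MODEL_MAX_DIR_SIZE = 128,
    MODEL_ENTRY_HEADER = 24,
};

typedef enum {
    STORE_OK,
    STORE_TOO_MANY,
    STORE_DIR_FULL,
    STORE_DUPLICATE,
} StoreCheck;

/* Hypothetical in-memory view of the bitmap directory. */
typedef struct DirModel {
    const char *names[MODEL_MAX_BITMAPS];
    size_t nb_bitmaps;
    size_t dir_size;
} DirModel;

/* Assumed entry layout: fixed header plus name, rounded up to 8 bytes. */
size_t model_entry_size(const char *name)
{
    size_t raw = MODEL_ENTRY_HEADER + strlen(name);
    return (raw + 7) & ~(size_t)7;
}

StoreCheck model_can_store(const DirModel *dir, const char *name)
{
    size_t i;

    assert(dir != NULL && name != NULL);

    if (dir->nb_bitmaps == 0) {
        return STORE_OK; /* nothing stored yet, so nothing can conflict */
    }
    if (dir->nb_bitmaps >= MODEL_MAX_BITMAPS) {
        return STORE_TOO_MANY;
    }
    if (dir->dir_size + model_entry_size(name) > MODEL_MAX_DIR_SIZE) {
        return STORE_DIR_FULL;
    }
    for (i = 0; i < dir->nb_bitmaps; i++) {
        if (strcmp(dir->names[i], name) == 0) {
            return STORE_DUPLICATE;
        }
    }
    return STORE_OK;
}
```

The cheap counter checks come first, so the directory only has to be walked when the new entry could fit at all.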




[Qemu-block] [PATCH v22 00/30] qcow2: persistent dirty bitmaps

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Hi all!

There is a new update of qcow2-bitmap series - v22.

web: 
https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=qcow2-bitmap-v22
git: https://src.openvz.org/scm/~vsementsov/qemu.git (tag qcow2-bitmap-v22)

v22:

Rebased on master; the changes are mostly related to the new dirty bitmaps mutex:

10: - asserts now in bdrv_{re,}set_dirty_bitmap_locked functions.
- also add assert into bdrv_undo_clear_dirty_bitmap (the only change, not 
related to rebase)
- add mutex lock into bdrv_dirty_bitmap_set_readonly (as it changes bitmap 
list,
  so the lock should be taken)
- return instead of go-to in qmp_block_dirty_bitmap_clear
- in dirty-bitmaps.h, move bdrv_dirty_bitmap_set_readonly before block
  "Functions that require manual locking", move
  bdrv_dirty_bitmap_readonly and bdrv_has_readonly_bitmaps into this block
15: - add mutex lock/unlock into bdrv_dirty_bitmap_set_autoload
- in dirty-bitmaps.h, move bdrv_dirty_bitmap_set_autoload before block
  "Functions that require manual locking", move
  bdrv_dirty_bitmap_get_autoload into this block
17: - add mutex lock/unlock into bdrv_dirty_bitmap_set_persistance
- in dirty-bitmaps.h, move bdrv_dirty_bitmap_set_persistance before block
  "Functions that require manual locking", move 
  bdrv_dirty_bitmap_get_persistance and
  bdrv_has_changed_persistent_bitmaps into this block
18: in dirty-bitmaps.h, move bdrv_dirty_bitmap_next into block
"Functions that require manual locking". (do not remove r-b, as it is 
just one empty line removed before function declaration)
23: return instead of go-to in qmp_block_dirty_bitmap_add
24: return instead of go-to in qmp_block_dirty_bitmap_add
25: - return instead of go-to
- remove aio_context_acquire/release calls
- no aio_context parameter for block_dirty_bitmap_lookup
- in dirty-bitmaps.h, move bdrv_dirty_bitmap_sha256 into block
"Functions that require manual locking".
29: - return instead of go-to in qmp_block_dirty_bitmap_remove

r-b's are dropped from 10,15,17,25.

v21:

09,10: improve comment, add r-b's by Max and John
10: improve comment
11,12: add r-b by John
13: prepend local_err with additional info (Max), add r-b by John
14: add r-b's by Max and John
20,30: add r-b by Max


v20:

handle reopening images ro and rw.

On reopening ro: store bitmaps (storing sets 'IN_USE'=0 in the image)
and mark them readonly (set readonly flag in BlockDirtyBitmap)

After reopening rw: mark bitmaps IN_USE in the image
and unset readonly flag in BlockDirtyBitmap

09: new
10: improve comment
add parameter 'value' to bdrv_dirty_bitmap_set_readonly
11: use new parameter of bdrv_dirty_bitmap_set_readonly
12-14, 20: new
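
[Editorial illustration of the v20 reopen handling described above.] A minimal sketch, assuming a hypothetical two-flag model: BitmapModel and the model_* functions are invented names, not QEMU code. It pairs the on-disk 'IN_USE' flag with the in-memory readonly flag as the v20 notes describe, and uses the EPERM behaviour from the v19 note on patch 09 for writes while a bitmap is readonly.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one persistent bitmap. */
typedef struct BitmapModel {
    bool in_use_in_image;    /* models the 'IN_USE' flag stored in the image */
    bool readonly_in_memory; /* models the readonly flag of the in-memory bitmap */
} BitmapModel;

/* Reopen read-only: storing the bitmap clears IN_USE, then the bitmap is frozen. */
void model_reopen_ro(BitmapModel *bm)
{
    assert(bm != NULL);
    bm->in_use_in_image = false;
    bm->readonly_in_memory = true;
}

/* Reopen read-write: mark IN_USE in the image and unfreeze the bitmap. */
void model_reopen_rw(BitmapModel *bm)
{
    assert(bm != NULL);
    bm->in_use_in_image = true;
    bm->readonly_in_memory = false;
}

/* A write is refused with -EPERM while the bitmap is readonly. */
int model_guest_write(const BitmapModel *bm)
{
    assert(bm != NULL);
    return bm->readonly_in_memory ? -EPERM : 0;
}
```

In this model the two flags always move together, so a bitmap that is writable in memory is never left without 'IN_USE' set in the image.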

v19:

rebased on master

05: move 'sign-off' over 'reviewed-by's
08: error_report -> error_setg in qcow2_truncate (because of rebase)
09: return EPERM in bdrv_aligned_pwritev and bdrv_co_pdiscard if there
are readonly bitmaps. EPERM is chosen because it is already used for
readonly image in bdrv_co_pdiscard.
Also handle readonly bitmap in block_dirty_bitmap_clear_prepare and
qmp_block_dirty_bitmap_clear
Max's r-b is not added
10: fix grammar in comment
add Max's r-b
12, 13, 15, 21: add Max's r-b
24: fix grammar in comment
25: fix grammar and wording in comment
Also, I see contextual changes in the inactivate mechanism. I hope they do
not affect this series.

v18:

rebased on master (sorry for v17)

08: contextual: qcow2_do_open is changed instead of qcow2_open
rename s/need_update_header/update_header/ in qcow2_do_open, to not do it 
in 10
save r-b's by Max and John
09: new patch
10: load_bitmap_data: do not clear the bitmap parameter - it should already
be clear
(it is actually created just before the single load_bitmap_data() call)
if some bitmaps are loaded but we can't write to the image (it is readonly
or inactive), we can't mark the bitmaps "in use" in the image, so mark the
corresponding BdrvDirtyBitmap read-only.
change error_setg to error_setg_errno for "Can't update bitmap directory"
no need to rename s/need_update_header/update_header/ here, as it is done
in 08
13: function bdrv_has_persistent_bitmaps becomes 
bdrv_has_changed_persistent_bitmaps,
to handle readonly field.
14: declaration moved to the bottom of .h, save r-b's
15: firstly check bdrv_has_changed_persistent_bitmaps and only then fail on 
!can_write, and then QSIMPLEQ_INIT(&drop_tables)
skip readonly bitmaps in saving loop
18: remove '#optional', 2.9 -> 2.10, save r-b's
19: remove '#optional', 2.9 -> 2.10, save r-b's
20: 2.9 -> 2.10, save r-b's
21: add check of read-only image open, drop r-b's
24: add comment to qapi/block-core.json, that block-dirty-bitmap-add removes 
bitmap
from storage. r-b's by Max and John saved


v17:
08: add r-b's by Max and John
09: clear unknown autoclear features from BDRVQcow2State before calling
qcow2_load_autoloading_dirty_bitmaps(), and also do not extra update
header if it 

[Qemu-block] [PATCH v22 23/30] qmp: add persistent flag to block-dirty-bitmap-add

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Add optional 'persistent' flag to qmp command block-dirty-bitmap-add.
Default is false.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Denis V. Lunev 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 blockdev.c   | 18 +-
 qapi/block-core.json |  8 +++-
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 64e03c0caf..125acabc07 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1973,6 +1973,7 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
 /* AIO context taken and released within qmp_block_dirty_bitmap_add */
 qmp_block_dirty_bitmap_add(action->node, action->name,
action->has_granularity, action->granularity,
+   action->has_persistent, action->persistent,
&local_err);
 
 if (!local_err) {
@@ -2720,9 +2721,11 @@ out:
 
 void qmp_block_dirty_bitmap_add(const char *node, const char *name,
 bool has_granularity, uint32_t granularity,
+bool has_persistent, bool persistent,
 Error **errp)
 {
 BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
 
 if (!name || name[0] == '\0') {
 error_setg(errp, "Bitmap name cannot be empty");
@@ -2745,7 +2748,20 @@ void qmp_block_dirty_bitmap_add(const char *node, const char *name,
 granularity = bdrv_get_default_bitmap_granularity(bs);
 }
 
-bdrv_create_dirty_bitmap(bs, granularity, name, errp);
+if (!has_persistent) {
+persistent = false;
+}
+
+if (persistent &&
+!bdrv_can_store_new_dirty_bitmap(bs, name, granularity, errp))
+{
+return;
+}
+
+bitmap = bdrv_create_dirty_bitmap(bs, granularity, name, errp);
+if (bitmap != NULL) {
+bdrv_dirty_bitmap_set_persistance(bitmap, persistent);
+}
 }
 
 void qmp_block_dirty_bitmap_remove(const char *node, const char *name,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index f85c2235c7..13f98ec146 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1561,10 +1561,16 @@
 # @granularity: the bitmap granularity, default is 64k for
 #   block-dirty-bitmap-add
 #
+# @persistent: the bitmap is persistent, i.e. it will be saved to the
+#  corresponding block device image file on its close. For now only
+#  Qcow2 disks support persistent bitmaps. Default is false for
+#  block-dirty-bitmap-add. (Since: 2.10)
+#
 # Since: 2.4
 ##
 { 'struct': 'BlockDirtyBitmapAdd',
-  'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32' } }
+  'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32',
+'*persistent': 'bool' } }
 
 ##
 # @block-dirty-bitmap-add:
-- 
2.11.1
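
[Editorial illustration of the optional flag handling in the patch above.] A minimal sketch, assuming hypothetical names: BitmapModel, CanStoreFn, the two sample callbacks and model_bitmap_add() are invented for illustration. It mirrors the order used in qmp_block_dirty_bitmap_add(): an absent optional argument defaults to false, the storage check runs only for persistent bitmaps, and the bitmap is created only if that check passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical storage check, standing in for bdrv_can_store_new_dirty_bitmap(). */
typedef bool (*CanStoreFn)(const char *name);

/* Hypothetical result of the add operation. */
typedef struct BitmapModel {
    const char *name;
    bool persistent;
    bool created;
} BitmapModel;

/* Sample callbacks for the usage below. */
bool model_store_allowed(const char *name)
{
    (void)name;
    return true;
}

bool model_store_refused(const char *name)
{
    (void)name;
    return false;
}

/* Returns 0 if the bitmap was created, -1 if the storage check refused it. */
int model_bitmap_add(BitmapModel *out, const char *name,
                     bool has_persistent, bool persistent,
                     CanStoreFn can_store)
{
    assert(out != NULL && name != NULL && can_store != NULL);
    out->created = false;

    if (!has_persistent) {
        persistent = false; /* optional argument absent: use the default */
    }
    if (persistent && !can_store(name)) {
        return -1; /* refuse before anything is created */
    }

    out->name = name;
    out->persistent = persistent;
    out->created = true;
    return 0;
}
```

Because the check happens before creation, a refused request leaves no half-created bitmap behind, and non-persistent bitmaps never consult the storage check.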




[Qemu-block] [PATCH v22 13/30] block: new bdrv_reopen_bitmaps_rw interface

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Add format driver handler, which should mark loaded read-only
bitmaps as 'IN_USE' in the image and unset read_only field in
corresponding BdrvDirtyBitmap's.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: John Snow 
---
 block.c   | 19 +++
 include/block/block_int.h |  7 +++
 2 files changed, 26 insertions(+)

diff --git a/block.c b/block.c
index 37d68e3276..3f83da178d 100644
--- a/block.c
+++ b/block.c
@@ -2990,12 +2990,16 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
 {
 BlockDriver *drv;
 BlockDriverState *bs;
+bool old_can_write, new_can_write;
 
 assert(reopen_state != NULL);
 bs = reopen_state->bs;
 drv = bs->drv;
 assert(drv != NULL);
 
+old_can_write =
+!bdrv_is_read_only(bs) && !(bdrv_get_flags(bs) & BDRV_O_INACTIVE);
+
 /* If there are any driver level actions to take */
 if (drv->bdrv_reopen_commit) {
 drv->bdrv_reopen_commit(reopen_state);
@@ -3009,6 +3013,21 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
 bs->read_only = !(reopen_state->flags & BDRV_O_RDWR);
 
 bdrv_refresh_limits(bs, NULL);
+
+new_can_write =
+!bdrv_is_read_only(bs) && !(bdrv_get_flags(bs) & BDRV_O_INACTIVE);
+if (!old_can_write && new_can_write && drv->bdrv_reopen_bitmaps_rw) {
+Error *local_err = NULL;
+if (drv->bdrv_reopen_bitmaps_rw(bs, &local_err) < 0) {
+/* This is not fatal, bitmaps just left read-only, so all following
+ * writes will fail. User can remove read-only bitmaps to unblock
+ * writes.
+ */
+error_reportf_err(local_err,
+  "%s: Failed to make dirty bitmaps writable: ",
+  bdrv_get_node_name(bs));
+}
+}
 }
 
 /*
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 748970055e..4ad8eec2dd 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -381,6 +381,13 @@ struct BlockDriver {
  uint64_t parent_perm, uint64_t parent_shared,
  uint64_t *nperm, uint64_t *nshared);
 
+/**
+ * Bitmaps should be marked as 'IN_USE' in the image on reopening image
+ * as rw. This handler should realize it. It also should unset readonly
+ * field of BlockDirtyBitmap's in case of success.
+ */
+int (*bdrv_reopen_bitmaps_rw)(BlockDriverState *bs, Error **errp);
+
 QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
2.11.1
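
[Editorial illustration of the transition check in the patch above.] A minimal sketch, assuming hypothetical flag values and a hypothetical node model: MODEL_O_RDWR, MODEL_O_INACTIVE, NodeModel and the model_* functions are invented names. It shows the edge-triggered condition from bdrv_reopen_commit(): the rw handler runs only when the node goes from "cannot write" to "can write", where "can write" means neither read-only nor inactive. The call counter stands in for invoking drv->bdrv_reopen_bitmaps_rw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for illustration only. */
#define MODEL_O_RDWR     (1u << 0)
#define MODEL_O_INACTIVE (1u << 1)

/* Hypothetical node state. */
typedef struct NodeModel {
    unsigned flags;
    bool read_only;
    int rw_handler_calls; /* counts how often the rw handler would run */
} NodeModel;

static bool model_can_write(const NodeModel *n)
{
    return !n->read_only && !(n->flags & MODEL_O_INACTIVE);
}

void model_reopen_commit(NodeModel *n, unsigned new_flags)
{
    bool old_can_write, new_can_write;

    assert(n != NULL);

    /* Sample the state before the new flags take effect. */
    old_can_write = model_can_write(n);

    n->flags = new_flags;
    n->read_only = !(new_flags & MODEL_O_RDWR);

    /* Sample again afterwards and act only on the false -> true edge. */
    new_can_write = model_can_write(n);
    if (!old_can_write && new_can_write) {
        n->rw_handler_calls++;
    }
}
```

Sampling before and after the commit means a reopen that keeps the node writable, or keeps it inactive, does not trigger the handler again.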




[Qemu-block] [PATCH v22 14/30] qcow2: support .bdrv_reopen_bitmaps_rw

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Implement the bdrv_reopen_bitmaps_rw interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: John Snow 
Reviewed-by: Max Reitz 
---
 block/qcow2-bitmap.c | 61 
 block/qcow2.c|  2 ++
 block/qcow2.h|  1 +
 3 files changed, 64 insertions(+)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 2c7b057e21..a21fab8ce8 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -826,3 +826,64 @@ fail:
 
 return false;
 }
+
+int qcow2_reopen_bitmaps_rw(BlockDriverState *bs, Error **errp)
+{
+BDRVQcow2State *s = bs->opaque;
+Qcow2BitmapList *bm_list;
+Qcow2Bitmap *bm;
+GSList *ro_dirty_bitmaps = NULL;
+int ret = 0;
+
+if (s->nb_bitmaps == 0) {
+/* No bitmaps - nothing to do */
+return 0;
+}
+
+if (!can_write(bs)) {
+error_setg(errp, "Can't write to the image on reopening bitmaps rw");
+return -EINVAL;
+}
+
+bm_list = bitmap_list_load(bs, s->bitmap_directory_offset,
+   s->bitmap_directory_size, errp);
+if (bm_list == NULL) {
+return -EINVAL;
+}
+
+QSIMPLEQ_FOREACH(bm, bm_list, entry) {
+if (!(bm->flags & BME_FLAG_IN_USE)) {
+BdrvDirtyBitmap *bitmap = bdrv_find_dirty_bitmap(bs, bm->name);
+if (bitmap == NULL) {
+continue;
+}
+
+if (!bdrv_dirty_bitmap_readonly(bitmap)) {
+error_setg(errp, "Bitmap %s is not readonly but not marked "
+ "'IN_USE' in the image. Something went wrong, "
+ "all the bitmaps may be corrupted", bm->name);
+ret = -EINVAL;
+goto out;
+}
+
+bm->flags |= BME_FLAG_IN_USE;
+ro_dirty_bitmaps = g_slist_append(ro_dirty_bitmaps, bitmap);
+}
+}
+
+if (ro_dirty_bitmaps != NULL) {
+/* in_use flags must be updated */
+ret = update_ext_header_and_dir_in_place(bs, bm_list);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Can't update bitmap directory");
+goto out;
+}
+g_slist_foreach(ro_dirty_bitmaps, set_readonly_helper, false);
+}
+
+out:
+g_slist_free(ro_dirty_bitmaps);
+bitmap_list_free(bm_list);
+
+return ret;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 92e8ff064d..8f070b12a2 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3595,6 +3595,8 @@ BlockDriver bdrv_qcow2 = {
 
 .bdrv_detach_aio_context  = qcow2_detach_aio_context,
 .bdrv_attach_aio_context  = qcow2_attach_aio_context,
+
+.bdrv_reopen_bitmaps_rw = qcow2_reopen_bitmaps_rw,
 };
 
 static void bdrv_qcow2_init(void)
diff --git a/block/qcow2.h b/block/qcow2.h
index 67c61de008..3e23bb7361 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -630,5 +630,6 @@ int qcow2_check_bitmaps_refcounts(BlockDriverState *bs, 
BdrvCheckResult *res,
   void **refcount_table,
   int64_t *refcount_table_size);
 bool qcow2_load_autoloading_dirty_bitmaps(BlockDriverState *bs, Error **errp);
+int qcow2_reopen_bitmaps_rw(BlockDriverState *bs, Error **errp);
 
 #endif
-- 
2.11.1




[Qemu-block] [PATCH v22 10/30] block/dirty-bitmap: add readonly field to BdrvDirtyBitmap

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
This will be needed in the following commits for persistent bitmaps.
If a bitmap is loaded from read-only storage (so we can't mark it
"in use" in that storage), the corresponding BdrvDirtyBitmap should be
read-only.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
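Note for readers: the effect of the new flag on the I/O path can be sketched as follows. While any bitmap attached to a node is read-only, writes and discards are refused with -EPERM, so neither the image nor the bitmap changes. SketchBitmap, sketch_has_readonly_bitmaps() and sketch_check_write() are hypothetical names for illustration; the real checks live in bdrv_aligned_pwritev() and bdrv_co_pdiscard().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BdrvDirtyBitmap, reduced to the new field. */
typedef struct SketchBitmap {
    bool readonly;
} SketchBitmap;

/* True if at least one of the n bitmaps is read-only. */
static bool sketch_has_readonly_bitmaps(const SketchBitmap *bms, size_t n)
{
    size_t i;

    assert(bms != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (bms[i].readonly) {
            return true;
        }
    }

    return false;
}

/* Returns 0 if a write or discard may proceed, -EPERM otherwise. */
static int sketch_check_write(const SketchBitmap *bms, size_t n)
{
    return sketch_has_readonly_bitmaps(bms, n) ? -EPERM : 0;
}
```

A single read-only bitmap is enough to block modification of the whole node, which is why the check scans the full list rather than one bitmap.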
 block/dirty-bitmap.c | 36 
 block/io.c   |  8 
 blockdev.c   |  6 ++
 include/block/dirty-bitmap.h |  4 
 4 files changed, 54 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index a8fe149c4a..17d3068336 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -46,6 +46,12 @@ struct BdrvDirtyBitmap {
 bool disabled;  /* Bitmap is disabled. It ignores all writes to
the device */
 int active_iterators;   /* How many iterators are active */
+bool readonly;  /* Bitmap is read-only. This field also
+   prevents the respective image from being
+   modified (i.e. blocks writes and discards).
+   Such operations must fail and both the image
+   and this bitmap must remain unchanged while
+   this flag is set. */
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
@@ -505,6 +511,7 @@ void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
   int64_t cur_sector, int64_t nr_sectors)
 {
 assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
 hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
@@ -521,6 +528,7 @@ void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
 int64_t cur_sector, int64_t nr_sectors)
 {
 assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
 hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
@@ -535,6 +543,7 @@ void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
 void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
 {
 assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
 bdrv_dirty_bitmap_lock(bitmap);
 if (!out) {
 hbitmap_reset_all(bitmap->bitmap);
@@ -551,6 +560,7 @@ void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, 
HBitmap *in)
 {
 HBitmap *tmp = bitmap->bitmap;
 assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
 bitmap->bitmap = in;
 hbitmap_free(tmp);
 }
@@ -613,6 +623,7 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t 
cur_sector,
 if (!bdrv_dirty_bitmap_enabled(bitmap)) {
 continue;
 }
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
 hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
 }
 bdrv_dirty_bitmaps_unlock(bs);
@@ -635,3 +646,28 @@ int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap)
 {
 return hbitmap_count(bitmap->meta);
 }
+
+bool bdrv_dirty_bitmap_readonly(const BdrvDirtyBitmap *bitmap)
+{
+return bitmap->readonly;
+}
+
+/* Called with BQL taken. */
+void bdrv_dirty_bitmap_set_readonly(BdrvDirtyBitmap *bitmap, bool value)
+{
+qemu_mutex_lock(bitmap->mutex);
+bitmap->readonly = value;
+qemu_mutex_unlock(bitmap->mutex);
+}
+
+bool bdrv_has_readonly_bitmaps(BlockDriverState *bs)
+{
+BdrvDirtyBitmap *bm;
+QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
+if (bm->readonly) {
+return true;
+}
+}
+
+return false;
+}
diff --git a/block/io.c b/block/io.c
index 91611ffb2a..49057f19af 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1343,6 +1343,10 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild 
*child,
 uint64_t bytes_remaining = bytes;
 int max_transfer;
 
+if (bdrv_has_readonly_bitmaps(bs)) {
+return -EPERM;
+}
+
 assert(is_power_of_2(align));
 assert((offset & (align - 1)) == 0);
 assert((bytes & (align - 1)) == 0);
@@ -2435,6 +2439,10 @@ int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, 
int64_t offset,
 return -ENOMEDIUM;
 }
 
+if (bdrv_has_readonly_bitmaps(bs)) {
+return -EPERM;
+}
+
 ret = bdrv_check_byte_request(bs, offset, count);
 if (ret < 0) {
 return ret;
diff --git a/blockdev.c b/blockdev.c
index f92dcf24bf..64e03c0caf 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2023,6 +2023,9 @@ static void 
block_dirty_bitmap_clear_prepare(BlkActionState *common,
 } else if (!bdrv_dirty_bitmap_enabled(state->bitmap)) {
 error_setg(errp, "Cannot clear a disabled bitmap");
 return;
+} else if (bdrv_dirty_bitmap_readonly(state->bitmap)) {
+error_setg(errp, "Cannot clear a readonly bitmap");
+return;
 }
 
 bdrv_clear_dirty_bitmap(state->bitmap, &state->backup);
@@ -2791,6 +2794,9 @@ void qmp_block_dirty_bitm

[Qemu-block] [PATCH v22 01/30] specs/qcow2: fix bitmap granularity qemu-specific note

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: John Snow 
---
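Note for readers: the range in the updated note and the formula granularity = 1 << granularity_bits can be checked with a small sketch. The names below are hypothetical; the limits 9 and 31 are the ones stated in the note.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MIN_GRANULARITY_BITS 9
#define SKETCH_MAX_GRANULARITY_BITS 31

/* True if granularity_bits lies in the range the note says Qemu supports. */
static bool sketch_granularity_bits_valid(uint8_t granularity_bits)
{
    return granularity_bits >= SKETCH_MIN_GRANULARITY_BITS &&
           granularity_bits <= SKETCH_MAX_GRANULARITY_BITS;
}

/* granularity = 1 << granularity_bits; the caller must validate first. */
static uint64_t sketch_granularity(uint8_t granularity_bits)
{
    assert(sketch_granularity_bits_valid(granularity_bits));
    return UINT64_C(1) << granularity_bits;
}
```

With these limits the smallest supported granularity is 512 bytes and the largest is 2 GiB.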
 docs/interop/qcow2.txt | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index 80cdfd0e91..dda53dd2a3 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -472,8 +472,7 @@ Structure of a bitmap directory entry:
  17:granularity_bits
 Granularity bits. Valid values: 0 - 63.
 
-Note: Qemu currently doesn't support granularity_bits
-greater than 31.
+Note: Qemu currently supports only values 9 - 31.
 
 Granularity is calculated as
 granularity = 1 << granularity_bits
-- 
2.11.1




[Qemu-block] [PATCH v22 25/30] qmp: add x-debug-block-dirty-bitmap-sha256

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/dirty-bitmap.c |  5 +
 blockdev.c   | 25 +
 include/block/dirty-bitmap.h |  1 +
 include/qemu/hbitmap.h   |  8 
 qapi/block-core.json | 27 +++
 tests/Makefile.include   |  2 +-
 util/hbitmap.c   | 11 +++
 7 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index d1469418e6..5fcf917707 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -725,3 +725,8 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_next(BlockDriverState 
*bs,
 return bitmap == NULL ? QLIST_FIRST(&bs->dirty_bitmaps) :
 QLIST_NEXT(bitmap, list);
 }
+
+char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp)
+{
+return hbitmap_sha256(bitmap->bitmap, errp);
+}
diff --git a/blockdev.c b/blockdev.c
index 4bb7033994..3c8fb75208 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2832,6 +2832,31 @@ void qmp_block_dirty_bitmap_clear(const char *node, 
const char *name,
 bdrv_clear_dirty_bitmap(bitmap, NULL);
 }
 
+BlockDirtyBitmapSha256 *qmp_x_debug_block_dirty_bitmap_sha256(const char *node,
+  const char *name,
+  Error **errp)
+{
+BdrvDirtyBitmap *bitmap;
+BlockDriverState *bs;
+BlockDirtyBitmapSha256 *ret = NULL;
+char *sha256;
+
+bitmap = block_dirty_bitmap_lookup(node, name, &bs, errp);
+if (!bitmap || !bs) {
+return NULL;
+}
+
+sha256 = bdrv_dirty_bitmap_sha256(bitmap, errp);
+if (sha256 == NULL) {
+return NULL;
+}
+
+ret = g_new(BlockDirtyBitmapSha256, 1);
+ret->sha256 = sha256;
+
+return ret;
+}
+
 void hmp_drive_del(Monitor *mon, const QDict *qdict)
 {
 const char *id = qdict_get_str(qdict, "id");
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index ccf2f81640..744479bc76 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -97,5 +97,6 @@ bool bdrv_dirty_bitmap_get_persistance(BdrvDirtyBitmap 
*bitmap);
 bool bdrv_has_changed_persistent_bitmaps(BlockDriverState *bs);
 BdrvDirtyBitmap *bdrv_dirty_bitmap_next(BlockDriverState *bs,
 BdrvDirtyBitmap *bitmap);
+char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp);
 
 #endif
diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index b52304ac29..d3a74a21fc 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -253,6 +253,14 @@ void hbitmap_deserialize_ones(HBitmap *hb, uint64_t start, 
uint64_t count,
 void hbitmap_deserialize_finish(HBitmap *hb);
 
 /**
+ * hbitmap_sha256:
+ * @bitmap: HBitmap to operate on.
+ *
+ * Returns SHA256 hash of the last level.
+ */
+char *hbitmap_sha256(const HBitmap *bitmap, Error **errp);
+
+/**
  * hbitmap_free:
  * @hb: HBitmap to operate on.
  *
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 5c42cc7790..6ad8585400 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1644,6 +1644,33 @@
   'data': 'BlockDirtyBitmap' }
 
 ##
+# @BlockDirtyBitmapSha256:
+#
+# SHA256 hash of dirty bitmap data
+#
+# @sha256: ASCII representation of SHA256 bitmap hash
+#
+# Since: 2.10
+##
+  { 'struct': 'BlockDirtyBitmapSha256',
+'data': {'sha256': 'str'} }
+
+##
+# @x-debug-block-dirty-bitmap-sha256:
+#
+# Get bitmap SHA256
+#
+# Returns: BlockDirtyBitmapSha256 on success
+#  If @node is not a valid block device, DeviceNotFound
+#  If @name is not found or if hashing has failed, GenericError with an
+#  explanation
+#
+# Since: 2.10
+##
+  { 'command': 'x-debug-block-dirty-bitmap-sha256',
+'data': 'BlockDirtyBitmap', 'returns': 'BlockDirtyBitmapSha256' }
+
+##
 # @blockdev-mirror:
 #
 # Start mirroring a block device's writes to a new destination.
diff --git a/tests/Makefile.include b/tests/Makefile.include
index ae889cae02..c738e92673 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -553,7 +553,7 @@ tests/test-blockjob$(EXESUF): tests/test-blockjob.o 
$(test-block-obj-y) $(test-u
 tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o 
$(test-block-obj-y) $(test-util-obj-y)
 tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(test-block-obj-y)
 tests/test-iov$(EXESUF): tests/test-iov.o $(test-util-obj-y)
-tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o $(test-util-obj-y)
+tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o $(test-util-obj-y) 
$(test-crypto-obj-y)
 tests/test-x86-cpuid$(EXESUF): tests/test-x86-cpuid.o
 tests/test-xbzrle$(EXESUF): tests/test-xbzrle.o migration/xbzrle.o 
migration/page_cache.o $(test-util-obj-y)
 tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 0c1591a594..21535cc90b 100644
--- a/util/hbitmap.c
+++

[Qemu-block] [PATCH v22 07/30] qcow2-refcount: rename inc_refcounts() and make it public

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
This is needed for the following patch, which will introduce refcount
checking for qcow2 bitmaps.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 block/qcow2-refcount.c | 53 ++
 block/qcow2.h  |  4 
 2 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 7c06061aae..d7066c875b 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1323,11 +1323,10 @@ static int realloc_refcount_array(BDRVQcow2State *s, 
void **array,
  *
  * Modifies the number of errors in res.
  */
-static int inc_refcounts(BlockDriverState *bs,
- BdrvCheckResult *res,
- void **refcount_table,
- int64_t *refcount_table_size,
- int64_t offset, int64_t size)
+int qcow2_inc_refcounts_imrt(BlockDriverState *bs, BdrvCheckResult *res,
+ void **refcount_table,
+ int64_t *refcount_table_size,
+ int64_t offset, int64_t size)
 {
 BDRVQcow2State *s = bs->opaque;
 uint64_t start, last, cluster_offset, k, refcount;
@@ -1420,8 +1419,9 @@ static int check_refcounts_l2(BlockDriverState *bs, 
BdrvCheckResult *res,
 nb_csectors = ((l2_entry >> s->csize_shift) &
s->csize_mask) + 1;
 l2_entry &= s->cluster_offset_mask;
-ret = inc_refcounts(bs, res, refcount_table, refcount_table_size,
-l2_entry & ~511, nb_csectors * 512);
+ret = qcow2_inc_refcounts_imrt(bs, res,
+   refcount_table, refcount_table_size,
+   l2_entry & ~511, nb_csectors * 512);
 if (ret < 0) {
 goto fail;
 }
@@ -1454,8 +1454,9 @@ static int check_refcounts_l2(BlockDriverState *bs, 
BdrvCheckResult *res,
 }
 
 /* Mark cluster as used */
-ret = inc_refcounts(bs, res, refcount_table, refcount_table_size,
-offset, s->cluster_size);
+ret = qcow2_inc_refcounts_imrt(bs, res,
+   refcount_table, refcount_table_size,
+   offset, s->cluster_size);
 if (ret < 0) {
 goto fail;
 }
@@ -1508,8 +1509,8 @@ static int check_refcounts_l1(BlockDriverState *bs,
 l1_size2 = l1_size * sizeof(uint64_t);
 
 /* Mark L1 table as used */
-ret = inc_refcounts(bs, res, refcount_table, refcount_table_size,
-l1_table_offset, l1_size2);
+ret = qcow2_inc_refcounts_imrt(bs, res, refcount_table, 
refcount_table_size,
+   l1_table_offset, l1_size2);
 if (ret < 0) {
 goto fail;
 }
@@ -1538,8 +1539,9 @@ static int check_refcounts_l1(BlockDriverState *bs,
 if (l2_offset) {
 /* Mark L2 table as used */
 l2_offset &= L1E_OFFSET_MASK;
-ret = inc_refcounts(bs, res, refcount_table, refcount_table_size,
-l2_offset, s->cluster_size);
+ret = qcow2_inc_refcounts_imrt(bs, res,
+   refcount_table, refcount_table_size,
+   l2_offset, s->cluster_size);
 if (ret < 0) {
 goto fail;
 }
@@ -1757,14 +1759,15 @@ static int check_refblocks(BlockDriverState *bs, 
BdrvCheckResult *res,
 }
 
 res->corruptions_fixed++;
-ret = inc_refcounts(bs, res, refcount_table, nb_clusters,
-offset, s->cluster_size);
+ret = qcow2_inc_refcounts_imrt(bs, res,
+   refcount_table, nb_clusters,
+   offset, s->cluster_size);
 if (ret < 0) {
 return ret;
 }
 /* No need to check whether the refcount is now greater than 1:
  * This area was just allocated and zeroed, so it can only be
- * exactly 1 after inc_refcounts() */
+ * exactly 1 after qcow2_inc_refcounts_imrt() */
 continue;
 
 resize_fail:
@@ -1779,8 +1782,8 @@ resize_fail:
 }
 
 if (offset != 0) {
-ret = inc_refcounts(bs, res, refcount_table, nb_clusters,
-offset, s->cluster_size);
+ret = qcow2_inc_refcounts_imrt(bs, res, refcount_table, 
nb_clusters,
+   offset, s->cluster_size);
 if (ret < 0) {
 return ret;
 }
@@ -1820,8 +1823,8 @@ static int calculate_refcounts(BlockDrive

[Qemu-block] [PATCH v22 09/30] block/dirty-bitmap: fix comment for BlockDirtyBitmap.disabled field

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: John Snow 
Reviewed-by: Max Reitz 
---
 block/dirty-bitmap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index f502c45a70..a8fe149c4a 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -43,7 +43,8 @@ struct BdrvDirtyBitmap {
 BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
 char *name; /* Optional non-empty unique ID */
 int64_t size;   /* Size of the bitmap (Number of sectors) */
-bool disabled;  /* Bitmap is read-only */
+bool disabled;  /* Bitmap is disabled. It ignores all writes to
+   the device */
 int active_iterators;   /* How many iterators are active */
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
-- 
2.11.1




[Qemu-block] [PATCH v22 20/30] qcow2: store bitmaps on reopening image as read-only

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Store the bitmaps and mark them read-only when the image is reopened read-only.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/qcow2-bitmap.c | 22 ++
 block/qcow2.c|  5 +
 block/qcow2.h|  1 +
 3 files changed, 28 insertions(+)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 5f53486b22..7912a82c8c 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -1365,3 +1365,25 @@ fail:
 
 bitmap_list_free(bm_list);
 }
+
+int qcow2_reopen_bitmaps_ro(BlockDriverState *bs, Error **errp)
+{
+BdrvDirtyBitmap *bitmap;
+Error *local_err = NULL;
+
+qcow2_store_persistent_dirty_bitmaps(bs, &local_err);
+if (local_err != NULL) {
+error_propagate(errp, local_err);
+return -EINVAL;
+}
+
+for (bitmap = bdrv_dirty_bitmap_next(bs, NULL); bitmap != NULL;
+ bitmap = bdrv_dirty_bitmap_next(bs, bitmap))
+{
+if (bdrv_dirty_bitmap_get_persistance(bitmap)) {
+bdrv_dirty_bitmap_set_readonly(bitmap, true);
+}
+}
+
+return 0;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 365298d4ce..b68e04766f 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1382,6 +1382,11 @@ static int qcow2_reopen_prepare(BDRVReopenState *state,
 
 /* We need to write out any unwritten data if we reopen read-only. */
 if ((state->flags & BDRV_O_RDWR) == 0) {
+ret = qcow2_reopen_bitmaps_ro(state->bs, errp);
+if (ret < 0) {
+goto fail;
+}
+
 ret = bdrv_flush(state->bs);
 if (ret < 0) {
 goto fail;
diff --git a/block/qcow2.h b/block/qcow2.h
index 0594551237..7d0a20c053 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -632,5 +632,6 @@ int qcow2_check_bitmaps_refcounts(BlockDriverState *bs, 
BdrvCheckResult *res,
 bool qcow2_load_autoloading_dirty_bitmaps(BlockDriverState *bs, Error **errp);
 int qcow2_reopen_bitmaps_rw(BlockDriverState *bs, Error **errp);
 void qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs, Error **errp);
+int qcow2_reopen_bitmaps_ro(BlockDriverState *bs, Error **errp);
 
 #endif
-- 
2.11.1




[Qemu-block] [PATCH v22 08/30] qcow2: add bitmaps extension

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Add the bitmaps extension as specified in docs/specs/qcow2.txt.
For now, just mirror the extension header into the Qcow2 state and check
constraints. Also calculate refcounts for qcow2 bitmaps, so as not to break
qemu-img check.

For now, disable image resize if the image has bitmaps. This will be fixed
later.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
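Note for readers: the bitmap table entry validation in this patch can be sketched standalone as below. The mask values are an assumption derived from the comment "bits [1, 8] U [56, 63] are reserved" (offset in bits 9 to 55, all-ones flag in bit 0). The sketch_ names are hypothetical, and the final "return 0" is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed layout: bit 0 = all-ones flag, bits 1-8 and 56-63 reserved,
 * bits 9-55 = cluster offset. */
#define SKETCH_ENTRY_RESERVED_MASK 0xff000000000001feULL
#define SKETCH_ENTRY_OFFSET_MASK   0x00fffffffffffe00ULL
#define SKETCH_ENTRY_FLAG_ALL_ONES (1ULL << 0)

/* Returns 0 if the entry is well-formed, -EINVAL otherwise. */
static int sketch_check_table_entry(uint64_t entry, uint64_t cluster_size)
{
    uint64_t offset;

    assert(cluster_size != 0);

    if (entry & SKETCH_ENTRY_RESERVED_MASK) {
        return -EINVAL;
    }

    offset = entry & SKETCH_ENTRY_OFFSET_MASK;
    if (offset != 0) {
        /* If an offset is specified, bit 0 is reserved. */
        if (entry & SKETCH_ENTRY_FLAG_ALL_ONES) {
            return -EINVAL;
        }

        /* The offset must point to the start of a cluster. */
        if (offset % cluster_size != 0) {
            return -EINVAL;
        }
    }

    return 0;
}
```

The three masks together cover all 64 bits exactly once, so every bit of an entry is either interpreted or required to be zero.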
 block/Makefile.objs|   2 +-
 block/qcow2-bitmap.c   | 439 +
 block/qcow2-refcount.c |   6 +
 block/qcow2.c  | 124 +-
 block/qcow2.h  |  27 +++
 5 files changed, 592 insertions(+), 6 deletions(-)
 create mode 100644 block/qcow2-bitmap.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index ea955302c8..9efc6c49ea 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -1,5 +1,5 @@
 block-obj-y += raw-format.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o 
dmg.o
-block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o 
qcow2-cache.o
+block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o 
qcow2-cache.o qcow2-bitmap.o
 block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
 block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
new file mode 100644
index 00..b8e472b3e8
--- /dev/null
+++ b/block/qcow2-bitmap.c
@@ -0,0 +1,439 @@
+/*
+ * Bitmaps for the QCOW version 2 format
+ *
+ * Copyright (c) 2014-2017 Vladimir Sementsov-Ogievskiy
+ *
+ * This file is derived from qcow2-snapshot.c, original copyright:
+ * Copyright (c) 2004-2006 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+
+#include "block/block_int.h"
+#include "block/qcow2.h"
+
+/* NOTICE: BME here means Bitmaps Extension and is used as a namespace for
+ * _internal_ constants. Please do not use this _internal_ abbreviation for
+ * other needs and/or outside of this file. */
+
+/* Bitmap directory entry constraints */
+#define BME_MAX_TABLE_SIZE 0x800
+#define BME_MAX_PHYS_SIZE 0x2000 /* restrict BdrvDirtyBitmap size in RAM */
+#define BME_MAX_GRANULARITY_BITS 31
+#define BME_MIN_GRANULARITY_BITS 9
+#define BME_MAX_NAME_SIZE 1023
+
+/* Bitmap directory entry flags */
+#define BME_RESERVED_FLAGS 0xfffcU
+
+/* bits [1, 8] U [56, 63] are reserved */
+#define BME_TABLE_ENTRY_RESERVED_MASK 0xff000000000001feULL
+#define BME_TABLE_ENTRY_OFFSET_MASK 0x00fffffffffffe00ULL
+#define BME_TABLE_ENTRY_FLAG_ALL_ONES (1ULL << 0)
+
+typedef struct QEMU_PACKED Qcow2BitmapDirEntry {
+/* header is 8 byte aligned */
+uint64_t bitmap_table_offset;
+
+uint32_t bitmap_table_size;
+uint32_t flags;
+
+uint8_t type;
+uint8_t granularity_bits;
+uint16_t name_size;
+uint32_t extra_data_size;
+/* extra data follows  */
+/* name follows  */
+} Qcow2BitmapDirEntry;
+
+typedef struct Qcow2BitmapTable {
+uint64_t offset;
+uint32_t size; /* number of 64bit entries */
+QSIMPLEQ_ENTRY(Qcow2BitmapTable) entry;
+} Qcow2BitmapTable;
+
+typedef struct Qcow2Bitmap {
+Qcow2BitmapTable table;
+uint32_t flags;
+uint8_t granularity_bits;
+char *name;
+
+QSIMPLEQ_ENTRY(Qcow2Bitmap) entry;
+} Qcow2Bitmap;
+typedef QSIMPLEQ_HEAD(Qcow2BitmapList, Qcow2Bitmap) Qcow2BitmapList;
+
+typedef enum BitmapType {
+BT_DIRTY_TRACKING_BITMAP = 1
+} BitmapType;
+
+static int check_table_entry(uint64_t entry, int cluster_size)
+{
+uint64_t offset;
+
+if (entry & BME_TABLE_ENTRY_RESERVED_MASK) {
+return -EINVAL;
+}
+
+offset = entry & BME_TABLE_ENTRY_OFFSET_MASK;
+if (offset != 0) {
+/* if offset specified, bit 0 is reserved */
+if (entry & BME_TABLE_ENTRY_FLAG_ALL_ONES) {
+return -EINVAL;
+}
+
+if (offset % cluster_size != 0) {
+return -EINV

[Qemu-block] [PATCH v22 17/30] block: introduce persistent dirty bitmaps

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
The new field BdrvDirtyBitmap.persistent means that the bitmap should be
saved by the format driver in .bdrv_close and .bdrv_inactivate. No format
driver supports it for now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
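Note for readers: a sketch of the predicate a format driver could use to decide whether anything needs saving. Only bitmaps that are persistent and not read-only can have changed, because read-only bitmaps reject all updates. The sketch_ names are hypothetical; the real helper walks bs->dirty_bitmaps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BdrvDirtyBitmap, reduced to two flags. */
typedef struct SketchBitmap {
    bool persistent; /* must be saved to the owning disk image */
    bool readonly;   /* cannot have been modified since loading */
} SketchBitmap;

/* True if at least one bitmap is persistent and may have been modified. */
static bool sketch_has_changed_persistent_bitmaps(const SketchBitmap *bms,
                                                  size_t n)
{
    size_t i;

    assert(bms != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (bms[i].persistent && !bms[i].readonly) {
            return true;
        }
    }

    return false;
}
```

A driver could skip the store step entirely when this returns false, for example when all persistent bitmaps were loaded from a read-only image.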
 block/dirty-bitmap.c | 29 +
 block/qcow2-bitmap.c |  1 +
 include/block/dirty-bitmap.h |  4 
 3 files changed, 34 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 06dc7a3ac9..3c17c452ae 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -54,6 +54,7 @@ struct BdrvDirtyBitmap {
this flag is set. */
 bool autoload;  /* For persistent bitmaps: bitmap must be
autoloaded on image opening */
+bool persistent;/* bitmap must be saved to owner disk image */
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
@@ -102,6 +103,7 @@ void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
 assert(!bdrv_dirty_bitmap_frozen(bitmap));
 g_free(bitmap->name);
 bitmap->name = NULL;
+bitmap->persistent = false;
 bitmap->autoload = false;
 }
 
@@ -299,6 +301,8 @@ BdrvDirtyBitmap 
*bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
 bitmap->name = NULL;
 successor->name = name;
 bitmap->successor = NULL;
+successor->persistent = bitmap->persistent;
+bitmap->persistent = false;
 successor->autoload = bitmap->autoload;
 bitmap->autoload = false;
 bdrv_release_dirty_bitmap(bs, bitmap);
@@ -689,3 +693,28 @@ bool bdrv_dirty_bitmap_get_autoload(const BdrvDirtyBitmap 
*bitmap)
 {
 return bitmap->autoload;
 }
+
+/* Called with BQL taken. */
+void bdrv_dirty_bitmap_set_persistance(BdrvDirtyBitmap *bitmap, bool 
persistent)
+{
+qemu_mutex_lock(bitmap->mutex);
+bitmap->persistent = persistent;
+qemu_mutex_unlock(bitmap->mutex);
+}
+
+bool bdrv_dirty_bitmap_get_persistance(BdrvDirtyBitmap *bitmap)
+{
+return bitmap->persistent;
+}
+
+bool bdrv_has_changed_persistent_bitmaps(BlockDriverState *bs)
+{
+BdrvDirtyBitmap *bm;
+QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
+if (bm->persistent && !bm->readonly) {
+return true;
+}
+}
+
+return false;
+}
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index ee6d8f75a9..52e4616b8c 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -794,6 +794,7 @@ bool qcow2_load_autoloading_dirty_bitmaps(BlockDriverState 
*bs, Error **errp)
 goto fail;
 }
 
+bdrv_dirty_bitmap_set_persistance(bitmap, true);
 bdrv_dirty_bitmap_set_autoload(bitmap, true);
 bm->flags |= BME_FLAG_IN_USE;
 created_dirty_bitmaps =
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index e2fea12b94..3995789218 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -73,6 +73,8 @@ void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap 
*bitmap);
 
 void bdrv_dirty_bitmap_set_readonly(BdrvDirtyBitmap *bitmap, bool value);
 void bdrv_dirty_bitmap_set_autoload(BdrvDirtyBitmap *bitmap, bool autoload);
+void bdrv_dirty_bitmap_set_persistance(BdrvDirtyBitmap *bitmap,
+bool persistent);
 
 /* Functions that require manual locking.  */
 void bdrv_dirty_bitmap_lock(BdrvDirtyBitmap *bitmap);
@@ -91,5 +93,7 @@ void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
 bool bdrv_dirty_bitmap_readonly(const BdrvDirtyBitmap *bitmap);
 bool bdrv_has_readonly_bitmaps(BlockDriverState *bs);
 bool bdrv_dirty_bitmap_get_autoload(const BdrvDirtyBitmap *bitmap);
+bool bdrv_dirty_bitmap_get_persistance(BdrvDirtyBitmap *bitmap);
+bool bdrv_has_changed_persistent_bitmaps(BlockDriverState *bs);
 
 #endif
-- 
2.11.1




[Qemu-block] [PATCH v22 15/30] block/dirty-bitmap: add autoload field to BdrvDirtyBitmap

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Mirror the AUTO flag from the Qcow2 bitmap in BdrvDirtyBitmap. This will be
needed in the future to save the flag back to Qcow2 for persistent
bitmaps.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/dirty-bitmap.c | 18 ++
 block/qcow2-bitmap.c |  2 ++
 include/block/dirty-bitmap.h |  2 ++
 3 files changed, 22 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 17d3068336..06dc7a3ac9 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -52,6 +52,8 @@ struct BdrvDirtyBitmap {
Such operations must fail and both the image
and this bitmap must remain unchanged while
this flag is set. */
+bool autoload;  /* For persistent bitmaps: bitmap must be
+   autoloaded on image opening */
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
@@ -100,6 +102,7 @@ void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
 assert(!bdrv_dirty_bitmap_frozen(bitmap));
 g_free(bitmap->name);
 bitmap->name = NULL;
+bitmap->autoload = false;
 }
 
 /* Called with BQL taken.  */
@@ -296,6 +299,8 @@ BdrvDirtyBitmap 
*bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
 bitmap->name = NULL;
 successor->name = name;
 bitmap->successor = NULL;
+successor->autoload = bitmap->autoload;
+bitmap->autoload = false;
 bdrv_release_dirty_bitmap(bs, bitmap);
 
 return successor;
@@ -671,3 +676,16 @@ bool bdrv_has_readonly_bitmaps(BlockDriverState *bs)
 
 return false;
 }
+
+/* Called with BQL taken. */
+void bdrv_dirty_bitmap_set_autoload(BdrvDirtyBitmap *bitmap, bool autoload)
+{
+qemu_mutex_lock(bitmap->mutex);
+bitmap->autoload = autoload;
+qemu_mutex_unlock(bitmap->mutex);
+}
+
+bool bdrv_dirty_bitmap_get_autoload(const BdrvDirtyBitmap *bitmap)
+{
+return bitmap->autoload;
+}
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index a21fab8ce8..ee6d8f75a9 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -793,6 +793,8 @@ bool qcow2_load_autoloading_dirty_bitmaps(BlockDriverState 
*bs, Error **errp)
 if (bitmap == NULL) {
 goto fail;
 }
+
+bdrv_dirty_bitmap_set_autoload(bitmap, true);
 bm->flags |= BME_FLAG_IN_USE;
 created_dirty_bitmaps =
 g_slist_append(created_dirty_bitmaps, bitmap);
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index cb43fa37e2..e2fea12b94 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -72,6 +72,7 @@ void bdrv_dirty_bitmap_deserialize_ones(BdrvDirtyBitmap 
*bitmap,
 void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap);
 
 void bdrv_dirty_bitmap_set_readonly(BdrvDirtyBitmap *bitmap, bool value);
+void bdrv_dirty_bitmap_set_autoload(BdrvDirtyBitmap *bitmap, bool autoload);
 
 /* Functions that require manual locking.  */
 void bdrv_dirty_bitmap_lock(BdrvDirtyBitmap *bitmap);
@@ -89,5 +90,6 @@ int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
 bool bdrv_dirty_bitmap_readonly(const BdrvDirtyBitmap *bitmap);
 bool bdrv_has_readonly_bitmaps(BlockDriverState *bs);
+bool bdrv_dirty_bitmap_get_autoload(const BdrvDirtyBitmap *bitmap);
 
 #endif
-- 
2.11.1




[Qemu-block] [PATCH v22 26/30] iotests: test qcow2 persistent dirty bitmap

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/165 | 105 +
 tests/qemu-iotests/165.out |   5 +++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 111 insertions(+)
 create mode 100755 tests/qemu-iotests/165
 create mode 100644 tests/qemu-iotests/165.out

diff --git a/tests/qemu-iotests/165 b/tests/qemu-iotests/165
new file mode 100755
index 00..74d7b79a0b
--- /dev/null
+++ b/tests/qemu-iotests/165
@@ -0,0 +1,105 @@
+#!/usr/bin/env python
+#
+# Tests for persistent dirty bitmaps.
+#
+# Copyright: Vladimir Sementsov-Ogievskiy 2015-2017
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os
+import re
+import iotests
+from iotests import qemu_img
+
+disk = os.path.join(iotests.test_dir, 'disk')
+disk_size = 0x40000000 # 1G
+
+# regions for qemu_io: (start, count) in bytes
+regions1 = ((0,0x10),
+(0x20, 0x10))
+
+regions2 = ((0x1000, 0x2),
+(0x3fff, 0x1))
+
+class TestPersistentDirtyBitmap(iotests.QMPTestCase):
+
+def setUp(self):
+qemu_img('create', '-f', iotests.imgfmt, disk, str(disk_size))
+
+def tearDown(self):
+os.remove(disk)
+
+def mkVm(self):
+return iotests.VM().add_drive(disk)
+
+def mkVmRo(self):
+return iotests.VM().add_drive(disk, opts='readonly=on')
+
+def getSha256(self):
+result = self.vm.qmp('x-debug-block-dirty-bitmap-sha256',
+ node='drive0', name='bitmap0')
+return result['return']['sha256']
+
+def checkBitmap(self, sha256):
+result = self.vm.qmp('x-debug-block-dirty-bitmap-sha256',
+ node='drive0', name='bitmap0')
+self.assert_qmp(result, 'return/sha256', sha256)
+
+def writeRegions(self, regions):
+for r in regions:
+self.vm.hmp_qemu_io('drive0',
+'write %d %d' % r)
+
+def qmpAddBitmap(self):
+self.vm.qmp('block-dirty-bitmap-add', node='drive0',
+name='bitmap0', persistent=True, autoload=True)
+
+def test_persistent(self):
+self.vm = self.mkVm()
+self.vm.launch()
+self.qmpAddBitmap()
+
+self.writeRegions(regions1)
+sha256 = self.getSha256()
+
+self.vm.shutdown()
+
+self.vm = self.mkVmRo()
+self.vm.launch()
+self.vm.shutdown()
+
+#catch 'Persistent bitmaps are lost' possible error
+log = self.vm.get_log()
+log = re.sub(r'^\[I \d+\.\d+\] OPENED\n', '', log)
+log = re.sub(r'\[I \+\d+\.\d+\] CLOSED\n?$', '', log)
+if log:
+print log
+
+self.vm = self.mkVm()
+self.vm.launch()
+
+self.checkBitmap(sha256)
+self.writeRegions(regions2)
+sha256 = self.getSha256()
+
+self.vm.shutdown()
+self.vm.launch()
+
+self.checkBitmap(sha256)
+
+self.vm.shutdown()
+
+if __name__ == '__main__':
+iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/165.out b/tests/qemu-iotests/165.out
new file mode 100644
index 0000000000..ae1213e6f8
--- /dev/null
+++ b/tests/qemu-iotests/165.out
@@ -0,0 +1,5 @@
+.
+--
+Ran 1 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index a6acafffd7..ad09b8005b 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -163,6 +163,7 @@
 159 rw auto quick
 160 rw auto quick
 162 auto quick
+165 rw auto quick
 170 rw auto quick
 171 rw auto quick
 172 auto
-- 
2.11.1
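
For readers following the test above: the two re.sub() calls in test_persistent strip the OPENED and CLOSED marker lines from the VM log, so that anything left over (for example a 'Persistent bitmaps are lost' message) gets printed and breaks the expected output. A minimal standalone sketch of that filtering, in Python 3; the helper name strip_open_close and the sample log contents are made up for illustration, only the two regular expressions come from the patch:

```python
import re

def strip_open_close(log):
    """Drop the leading OPENED and trailing CLOSED marker lines,
    using the same two patterns as test_persistent."""
    log = re.sub(r'^\[I \d+\.\d+\] OPENED\n', '', log)
    log = re.sub(r'\[I \+\d+\.\d+\] CLOSED\n?$', '', log)
    return log

# Hypothetical log contents; timestamps and the warning text are invented.
quiet = "[I 1498650000.123456] OPENED\n[I +0.250000] CLOSED\n"
noisy = ("[I 1498650000.123456] OPENED\n"
         "example warning line\n"
         "[I +0.250000] CLOSED\n")

print(repr(strip_open_close(quiet)))   # nothing left: the test stays silent
print(repr(strip_open_close(noisy)))   # the unexpected line survives
```

Only an empty remainder keeps the test output equal to 165.out.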




[Qemu-block] [PATCH v22 05/30] block: fix bdrv_dirty_bitmap_granularity signature

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Make the getter's signature const-correct. This allows other functions that
take a const dirty bitmap parameter to call bdrv_dirty_bitmap_granularity().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Reviewed-by: John Snow 
Reviewed-by: Kevin Wolf 
---
 block/dirty-bitmap.c | 2 +-
 include/block/dirty-bitmap.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index a04c6e4154..df0110cf9f 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -455,7 +455,7 @@ uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs)
 return granularity;
 }
 
-uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
+uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap)
 {
 return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
 }
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index ad6558af56..ab89f08e3d 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -29,7 +29,7 @@ void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
-uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
+uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap);
 uint32_t bdrv_dirty_bitmap_meta_granularity(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
-- 
2.11.1




Re: [Qemu-block] [PATCH RFC v3 2/8] block: Add aio_context field in ThrottleGroupMember

2017-06-28 Thread Manos Pitsidianakis

On Wed, Jun 28, 2017 at 01:27:36PM +0200, Kevin Wolf wrote:
> On 23.06.2017 at 14:46, Manos Pitsidianakis wrote:
> > timer_cb() needs to know about the current Aio context of the throttle
> > request that is woken up. In order to make ThrottleGroupMember backend
> > agnostic, this information is stored in an aio_context field instead of
> > accessing it from BlockBackend.
> >
> > Signed-off-by: Manos Pitsidianakis 
>
> You're copying the AioContext when the BlockBackend is registered for
> the throttle group, but what keeps both sides in sync when the context
> is changed later on? Don't we need to update the ThrottleGroupMember in
> blk_set_aio_context?

blk_set_aio_context calls throttle_timers_attach_aio_context, which
updates this. Though, as Alberto said, util/throttle.c should not know
about ThrottleGroupMember. This is not needed in the later patches
because the ThrottleGroupMember's aio_context gets updated as a node in
the driver's bdrv_attach_aio_context.

We can add a new function in block/throttle.c that updates a member's
aio context, but I'm not sure it's really needed if members are only
used in throttle nodes.
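
To make the sync question concrete, here is a toy model in Python (not QEMU code). The class names mirror the ones discussed, but every method and the update_member switch are hypothetical and exist only to show what happens when the copied field is or is not refreshed on a context change:

```python
class ThrottleGroupMember:
    """Toy stand-in: keeps its own copy of the context."""
    def __init__(self, aio_context):
        self.aio_context = aio_context

class BlockBackend:
    """Toy stand-in for the owner of the member."""
    def __init__(self, aio_context):
        self.aio_context = aio_context
        self.member = None

    def register_throttle_member(self):
        # The context is copied once, at registration time.
        self.member = ThrottleGroupMember(self.aio_context)

    def set_aio_context(self, new_context, update_member=True):
        self.aio_context = new_context
        # Without this step the member keeps pointing at the old context.
        if update_member and self.member is not None:
            self.member.aio_context = new_context

blk = BlockBackend("ctx-main")
blk.register_throttle_member()

blk.set_aio_context("ctx-iothread0", update_member=False)
stale = blk.member.aio_context      # the copy was not refreshed

blk.set_aio_context("ctx-iothread0")
synced = blk.member.aio_context     # the copy follows the owner
```

Whether the refresh lives in the owner's setter or in a per-node attach callback is exactly the design choice under discussion; the sketch takes no position on it.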




[Qemu-block] [PATCH v22 21/30] block: add bdrv_can_store_new_dirty_bitmap

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
This will be needed to check some restrictions before making a bitmap
persistent in qmp-block-dirty-bitmap-add (this functionality will be
added by a future patch).

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 block.c   | 22 ++
 include/block/block.h |  3 +++
 include/block/block_int.h |  4 
 3 files changed, 29 insertions(+)

diff --git a/block.c b/block.c
index c649afec91..b2719bceff 100644
--- a/block.c
+++ b/block.c
@@ -4954,3 +4954,25 @@ void bdrv_del_child(BlockDriverState *parent_bs, BdrvChild *child, Error **errp)
 
 parent_bs->drv->bdrv_del_child(parent_bs, child, errp);
 }
+
+bool bdrv_can_store_new_dirty_bitmap(BlockDriverState *bs, const char *name,
+ uint32_t granularity, Error **errp)
+{
+BlockDriver *drv = bs->drv;
+
+if (!drv) {
+error_setg_errno(errp, ENOMEDIUM,
+ "Can't store persistent bitmaps to %s",
+ bdrv_get_device_or_node_name(bs));
+return false;
+}
+
+if (!drv->bdrv_can_store_new_dirty_bitmap) {
+error_setg_errno(errp, ENOTSUP,
+ "Can't store persistent bitmaps to %s",
+ bdrv_get_device_or_node_name(bs));
+return false;
+}
+
+return drv->bdrv_can_store_new_dirty_bitmap(bs, name, granularity, errp);
+}
diff --git a/include/block/block.h b/include/block/block.h
index a4f09df95a..1daf9a0882 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -630,4 +630,7 @@ void bdrv_add_child(BlockDriverState *parent, BlockDriverState *child,
 Error **errp);
 void bdrv_del_child(BlockDriverState *parent, BdrvChild *child, Error **errp);
 
+bool bdrv_can_store_new_dirty_bitmap(BlockDriverState *bs, const char *name,
+ uint32_t granularity, Error **errp);
+
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 4ad8eec2dd..009b4d41df 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -387,6 +387,10 @@ struct BlockDriver {
  * field of BlockDirtyBitmap's in case of success.
  */
 int (*bdrv_reopen_bitmaps_rw)(BlockDriverState *bs, Error **errp);
+bool (*bdrv_can_store_new_dirty_bitmap)(BlockDriverState *bs,
+const char *name,
+uint32_t granularity,
+Error **errp);
 
 QLIST_ENTRY(BlockDriver) list;
 };
-- 
2.11.1
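
The wrapper in this patch follows the usual optional-callback dispatch: no driver at all is reported as ENOMEDIUM, a driver without the callback as ENOTSUP, and otherwise the driver decides. A rough Python model of that control flow; the two driver classes and the (ok, error) return convention are invented for the sketch, and the errno names are kept as plain strings:

```python
class NoBitmapDriver:
    """Hypothetical driver that does not implement the callback."""

class BitmapDriver:
    """Hypothetical driver that implements the callback."""
    def can_store_new_dirty_bitmap(self, name, granularity):
        # Placeholder policy, not the qcow2 rules.
        return granularity >= 512

def can_store_new_dirty_bitmap(drv, name, granularity):
    """Return (ok, error). error is None when the driver was consulted."""
    if drv is None:
        return False, "ENOMEDIUM"      # no medium / no driver attached
    callback = getattr(drv, "can_store_new_dirty_bitmap", None)
    if callback is None:
        return False, "ENOTSUP"        # format cannot store bitmaps
    return callback(name, granularity), None

print(can_store_new_dirty_bitmap(None, "bitmap0", 65536))
print(can_store_new_dirty_bitmap(NoBitmapDriver(), "bitmap0", 65536))
print(can_store_new_dirty_bitmap(BitmapDriver(), "bitmap0", 65536))
```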




Re: [Qemu-block] [PATCH RFC v3 3/8] block: add throttle block filter driver

2017-06-28 Thread Stefan Hajnoczi
On Tue, Jun 27, 2017 at 04:34:22PM +0300, Manos Pitsidianakis wrote:
> On Tue, Jun 27, 2017 at 01:45:40PM +0100, Stefan Hajnoczi wrote:
> > On Mon, Jun 26, 2017 at 07:26:41PM +0300, Manos Pitsidianakis wrote:
> > > On Mon, Jun 26, 2017 at 03:30:55PM +0100, Stefan Hajnoczi wrote:
> > > > > +bs->file = bdrv_open_child(NULL, options, "file",
> > > > > +bs, &child_file, false, &local_err);
> > > > > +
> > > > > +if (local_err) {
> > > > > +error_propagate(errp, local_err);
> > > > > +return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +qdict_flatten(options);
> > > > > +return throttle_configure_tgm(bs, tgm, options, errp);
> > > >
> > > > Who destroys bs->file on error?
> > > 
> > > It is reaped by bdrv_open_inherit() on failure, if I'm not mistaken.
> > > That's how other drivers handle this as well. Some (eg block/qcow2.c)
> > > check if bs->file is NULL instead of the error pointer they pass, so
> > > this is not very consistent.
> > 
> > Maybe I'm missing it but I don't see relevant bs->file cleanup in
> > bdrv_open_inherit() or bdrv_open_common().
> > 
> > Please post the exact line where it happens.
> > 
> > Stefan
> 
> Relevant commit: de234897b60e034ba94b307fc289e2dc692c9251 block: Do not
> unref bs->file on error in BD's open
> 
> bdrv_open_inherit() does this on failure:
> 
> fail:
>blk_unref(file);
>if (bs->file != NULL) {
>bdrv_unref_child(bs, bs->file);
>}

Thanks, you are right.  I missed it.

> While looking into this I noticed bdrv_new_open_driver() doesn't handle
> bs->file on failure. It simply unrefs the bs but because its child's ref
> still remains, it is leaked.

That's a good candidate for a separate bug fix patch.




Re: [Qemu-block] [PATCH RFC v3 4/8] block: convert ThrottleGroup to object with QOM

2017-06-28 Thread Stefan Hajnoczi
On Tue, Jun 27, 2017 at 06:05:55PM +0200, Alberto Garcia wrote:
> On Mon 26 Jun 2017 06:58:32 PM CEST, Manos Pitsidianakis wrote:
> > On Mon, Jun 26, 2017 at 03:52:34PM +0100, Stefan Hajnoczi wrote:
> >>On Fri, Jun 23, 2017 at 03:46:56PM +0300, Manos Pitsidianakis wrote:
> >>> +static bool throttle_group_exists(const char *name)
> >>> +{
> >>> +ThrottleGroup *iter;
> >>> +bool ret = false;
> >>> +
> >>> +qemu_mutex_lock(&throttle_groups_lock);
> >>
> >>Not sure if this lock or the throttle_groups list are necessary.
> 
> As Manos says accesses to the throttle_groups list need to be locked.

Explicit locking is only necessary if the list is accessed outside the
QEMU global mutex.  If the monitor is the only thing that accesses the
list then a lock is not necessary.

Anyway, this point might be moot if every ThrottleGroup is a QOM object
and we drop this code in favor of using QOM APIs to find and iterate
over objects.

Stefan




[Qemu-block] [PATCH v22 19/30] qcow2: add persistent dirty bitmaps support

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Store persistent dirty bitmaps in qcow2 image.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/qcow2-bitmap.c | 475 +++
 block/qcow2.c|   9 +
 block/qcow2.h|   1 +
 3 files changed, 485 insertions(+)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 52e4616b8c..5f53486b22 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -27,6 +27,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qemu/cutils.h"
 
 #include "block/block_int.h"
 #include "block/qcow2.h"
@@ -42,6 +43,10 @@
 #define BME_MIN_GRANULARITY_BITS 9
 #define BME_MAX_NAME_SIZE 1023
 
+#if BME_MAX_TABLE_SIZE * 8ULL > INT_MAX
+#error In the code bitmap table physical size assumed to fit into int
+#endif
+
 /* Bitmap directory entry flags */
 #define BME_RESERVED_FLAGS 0xfffcU
 #define BME_FLAG_IN_USE (1U << 0)
@@ -72,6 +77,8 @@ typedef struct Qcow2BitmapTable {
 uint32_t size; /* number of 64bit entries */
 QSIMPLEQ_ENTRY(Qcow2BitmapTable) entry;
 } Qcow2BitmapTable;
+typedef QSIMPLEQ_HEAD(Qcow2BitmapTableList, Qcow2BitmapTable)
+Qcow2BitmapTableList;
 
 typedef struct Qcow2Bitmap {
 Qcow2BitmapTable table;
@@ -79,6 +86,8 @@ typedef struct Qcow2Bitmap {
 uint8_t granularity_bits;
 char *name;
 
+BdrvDirtyBitmap *dirty_bitmap;
+
 QSIMPLEQ_ENTRY(Qcow2Bitmap) entry;
 } Qcow2Bitmap;
 typedef QSIMPLEQ_HEAD(Qcow2BitmapList, Qcow2Bitmap) Qcow2BitmapList;
@@ -104,6 +113,15 @@ static int update_header_sync(BlockDriverState *bs)
 return bdrv_flush(bs);
 }
 
+static inline void bitmap_table_to_be(uint64_t *bitmap_table, size_t size)
+{
+size_t i;
+
+for (i = 0; i < size; ++i) {
+cpu_to_be64s(&bitmap_table[i]);
+}
+}
+
 static int check_table_entry(uint64_t entry, int cluster_size)
 {
 uint64_t offset;
@@ -127,6 +145,70 @@ static int check_table_entry(uint64_t entry, int cluster_size)
 return 0;
 }
 
+static int check_constraints_on_bitmap(BlockDriverState *bs,
+   const char *name,
+   uint32_t granularity,
+   Error **errp)
+{
+BDRVQcow2State *s = bs->opaque;
+int granularity_bits = ctz32(granularity);
+int64_t len = bdrv_getlength(bs);
+
+assert(granularity > 0);
+assert((granularity & (granularity - 1)) == 0);
+
+if (len < 0) {
+error_setg_errno(errp, -len, "Failed to get size of '%s'",
+ bdrv_get_device_or_node_name(bs));
+return len;
+}
+
+if (granularity_bits > BME_MAX_GRANULARITY_BITS) {
+error_setg(errp, "Granularity exceeds maximum (%llu bytes)",
+   1ULL << BME_MAX_GRANULARITY_BITS);
+return -EINVAL;
+}
+if (granularity_bits < BME_MIN_GRANULARITY_BITS) {
+error_setg(errp, "Granularity is under minimum (%llu bytes)",
+   1ULL << BME_MIN_GRANULARITY_BITS);
+return -EINVAL;
+}
+
+if ((len > (uint64_t)BME_MAX_PHYS_SIZE << granularity_bits) ||
+(len > (uint64_t)BME_MAX_TABLE_SIZE * s->cluster_size <<
+   granularity_bits))
+{
+error_setg(errp, "Too much space will be occupied by the bitmap. "
+   "Use larger granularity");
+return -EINVAL;
+}
+
+if (strlen(name) > BME_MAX_NAME_SIZE) {
+error_setg(errp, "Name length exceeds maximum (%u characters)",
+   BME_MAX_NAME_SIZE);
+return -EINVAL;
+}
+
+return 0;
+}
+
+static void clear_bitmap_table(BlockDriverState *bs, uint64_t *bitmap_table,
+   uint32_t bitmap_table_size)
+{
+BDRVQcow2State *s = bs->opaque;
+int i;
+
+for (i = 0; i < bitmap_table_size; ++i) {
+uint64_t addr = bitmap_table[i] & BME_TABLE_ENTRY_OFFSET_MASK;
+if (!addr) {
+continue;
+}
+
+qcow2_free_clusters(bs, addr, s->cluster_size, QCOW2_DISCARD_OTHER);
+bitmap_table[i] = 0;
+}
+}
+
 static int bitmap_table_load(BlockDriverState *bs, Qcow2BitmapTable *tb,
  uint64_t **bitmap_table)
 {
@@ -165,6 +247,28 @@ fail:
 return ret;
 }
 
+static int free_bitmap_clusters(BlockDriverState *bs, Qcow2BitmapTable *tb)
+{
+int ret;
+uint64_t *bitmap_table;
+
+ret = bitmap_table_load(bs, tb, &bitmap_table);
+if (ret < 0) {
+assert(bitmap_table == NULL);
+return ret;
+}
+
+clear_bitmap_table(bs, bitmap_table, tb->size);
+qcow2_free_clusters(bs, tb->offset, tb->size * sizeof(uint64_t),
+QCOW2_DISCARD_OTHER);
+g_free(bitmap_table);
+
+tb->offset = 0;
+tb->size = 0;
+
+return 0;
+}
+
 /* This function returns the number of disk sectors covered by a single qcow2
  * cluster of bitmap data. */
 static uint64_t sectors_covered_by_bitmap_cluster(const BDRVQcow2State *s,
@@ -748,6 +852,69 @@ static 
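
The checks in check_constraints_on_bitmap() above can be tried out with a small Python transcription. This is a sketch under assumptions: BME_MIN_GRANULARITY_BITS and BME_MAX_NAME_SIZE are taken from the patch context, while the values used for BME_MAX_GRANULARITY_BITS, BME_MAX_TABLE_SIZE and BME_MAX_PHYS_SIZE do not appear in this excerpt and are assumed here. Python integers do not overflow, so the 64-bit wraparound behaviour of the C expression is not modelled.

```python
BME_MIN_GRANULARITY_BITS = 9       # from the patch context
BME_MAX_NAME_SIZE = 1023           # from the patch context
BME_MAX_GRANULARITY_BITS = 31      # assumed, not shown in this excerpt
BME_MAX_TABLE_SIZE = 0x8000000     # assumed, not shown in this excerpt
BME_MAX_PHYS_SIZE = 0x20000000     # assumed, not shown in this excerpt

def check_constraints(length, cluster_size, name, granularity):
    """Return None if the bitmap may be stored, else an error string."""
    assert granularity > 0
    assert granularity & (granularity - 1) == 0      # power of two
    bits = granularity.bit_length() - 1              # ctz32() of a power of two

    if bits > BME_MAX_GRANULARITY_BITS:
        return "Granularity exceeds maximum"
    if bits < BME_MIN_GRANULARITY_BITS:
        return "Granularity is under minimum"
    # '*' binds tighter than '<<' in both C and Python.
    if (length > BME_MAX_PHYS_SIZE << bits or
            length > BME_MAX_TABLE_SIZE * cluster_size << bits):
        return "Too much space will be occupied by the bitmap"
    if len(name) > BME_MAX_NAME_SIZE:
        return "Name length exceeds maximum"
    return None

GiB = 1 << 30
print(check_constraints(1 * GiB, 65536, "bitmap0", 65536))    # accepted
print(check_constraints(512 * GiB, 65536, "bitmap0", 512))    # too fine for this size
print(check_constraints(512 * GiB, 65536, "bitmap0", 65536))  # accepted
```

With the assumed limits, a 512 GiB image is rejected at 512-byte granularity but accepted at 64 KiB, which is the "Use larger granularity" case from the error message.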

[Qemu-block] [PATCH v22 18/30] block/dirty-bitmap: add bdrv_dirty_bitmap_next()

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 block/dirty-bitmap.c | 7 +++
 include/block/dirty-bitmap.h | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 3c17c452ae..d1469418e6 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -718,3 +718,10 @@ bool bdrv_has_changed_persistent_bitmaps(BlockDriverState *bs)
 
 return false;
 }
+
+BdrvDirtyBitmap *bdrv_dirty_bitmap_next(BlockDriverState *bs,
+BdrvDirtyBitmap *bitmap)
+{
+return bitmap == NULL ? QLIST_FIRST(&bs->dirty_bitmaps) :
+QLIST_NEXT(bitmap, list);
+}
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 3995789218..ccf2f81640 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -95,5 +95,7 @@ bool bdrv_has_readonly_bitmaps(BlockDriverState *bs);
 bool bdrv_dirty_bitmap_get_autoload(const BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_get_persistance(BdrvDirtyBitmap *bitmap);
 bool bdrv_has_changed_persistent_bitmaps(BlockDriverState *bs);
+BdrvDirtyBitmap *bdrv_dirty_bitmap_next(BlockDriverState *bs,
+BdrvDirtyBitmap *bitmap);
 
 #endif
-- 
2.11.1
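
The new helper gives callers a cursor-style iteration: pass NULL to get the first bitmap, pass the previous result to get the next one, and stop on NULL. A small Python model of that contract, with a plain list standing in for bs->dirty_bitmaps; the function name and the sample bitmap names are illustrative only:

```python
def dirty_bitmap_next(bitmaps, bitmap=None):
    """Return the first element when `bitmap` is None, otherwise the
    element following `bitmap`; None once the list is exhausted."""
    if bitmap is None:
        return bitmaps[0] if bitmaps else None
    i = bitmaps.index(bitmap) + 1
    return bitmaps[i] if i < len(bitmaps) else None

bitmaps = ["bitmap0", "bitmap1", "bitmap2"]   # made-up names, must be unique

visited = []
bm = dirty_bitmap_next(bitmaps)
while bm is not None:
    visited.append(bm)
    bm = dirty_bitmap_next(bitmaps, bm)
```

In C the same loop would presumably read for (bm = bdrv_dirty_bitmap_next(bs, NULL); bm != NULL; bm = bdrv_dirty_bitmap_next(bs, bm)), which lets code outside block/dirty-bitmap.c walk the list without touching the QLIST directly.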




[Qemu-block] [PATCH v22 27/30] block/dirty-bitmap: add bdrv_remove_persistent_dirty_bitmap

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Interface for removing a persistent bitmap from its storage.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 block/dirty-bitmap.c | 18 ++
 include/block/block_int.h|  3 +++
 include/block/dirty-bitmap.h |  3 +++
 3 files changed, 24 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 5fcf917707..b2ca78b4d0 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -395,6 +395,7 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 /**
  * Release all named dirty bitmaps attached to a BDS (for use in bdrv_close()).
  * There must not be any frozen bitmaps attached.
+ * This function does not remove persistent bitmaps from the storage.
  * Called with BQL taken.
  */
 void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs)
@@ -402,6 +403,23 @@ void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs)
 bdrv_do_release_matching_dirty_bitmap(bs, NULL, true);
 }
 
+/**
+ * Remove persistent dirty bitmap from the storage if it exists.
+ * Absence of bitmap is not an error, because we have the following scenario:
+ * BdrvDirtyBitmap can have .persistent = true but not yet saved and have no
+ * stored version. For such bitmap bdrv_remove_persistent_dirty_bitmap() should
+ * not fail.
+ * This function doesn't release corresponding BdrvDirtyBitmap.
+ */
+void bdrv_remove_persistent_dirty_bitmap(BlockDriverState *bs,
+ const char *name,
+ Error **errp)
+{
+if (bs->drv && bs->drv->bdrv_remove_persistent_dirty_bitmap) {
+bs->drv->bdrv_remove_persistent_dirty_bitmap(bs, name, errp);
+}
+}
+
 /* Called with BQL taken.  */
 void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
 {
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 009b4d41df..b3be797a96 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -391,6 +391,9 @@ struct BlockDriver {
 const char *name,
 uint32_t granularity,
 Error **errp);
+void (*bdrv_remove_persistent_dirty_bitmap)(BlockDriverState *bs,
+const char *name,
+Error **errp);
 
 QLIST_ENTRY(BlockDriver) list;
 };
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 744479bc76..d38233efd8 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -25,6 +25,9 @@ BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
 void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
 void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs);
+void bdrv_remove_persistent_dirty_bitmap(BlockDriverState *bs,
+ const char *name,
+ Error **errp);
 void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
-- 
2.11.1




[Qemu-block] [PATCH v22 11/30] qcow2: autoloading dirty bitmaps

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Autoloading bitmaps are bitmaps stored in qcow2 with the AUTO flag set.
They are loaded when the image is opened and become BdrvDirtyBitmaps for
the corresponding drive.

Extra data in bitmaps is not supported for now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 block/qcow2-bitmap.c | 389 +++
 block/qcow2.c|  17 ++-
 block/qcow2.h|   2 +
 3 files changed, 406 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index b8e472b3e8..2c7b057e21 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -44,6 +44,8 @@
 
 /* Bitmap directory entry flags */
 #define BME_RESERVED_FLAGS 0xfffcU
+#define BME_FLAG_IN_USE (1U << 0)
+#define BME_FLAG_AUTO   (1U << 1)
 
 /* bits [1, 8] U [56, 63] are reserved */
 #define BME_TABLE_ENTRY_RESERVED_MASK 0xff000000000001feULL
@@ -85,6 +87,23 @@ typedef enum BitmapType {
 BT_DIRTY_TRACKING_BITMAP = 1
 } BitmapType;
 
+static inline bool can_write(BlockDriverState *bs)
+{
+return !bdrv_is_read_only(bs) && !(bdrv_get_flags(bs) & BDRV_O_INACTIVE);
+}
+
+static int update_header_sync(BlockDriverState *bs)
+{
+int ret;
+
+ret = qcow2_update_header(bs);
+if (ret < 0) {
+return ret;
+}
+
+return bdrv_flush(bs);
+}
+
 static int check_table_entry(uint64_t entry, int cluster_size)
 {
 uint64_t offset;
@@ -146,6 +165,120 @@ fail:
 return ret;
 }
 
+/* This function returns the number of disk sectors covered by a single qcow2
+ * cluster of bitmap data. */
+static uint64_t sectors_covered_by_bitmap_cluster(const BDRVQcow2State *s,
+  const BdrvDirtyBitmap *bitmap)
+{
+uint32_t sector_granularity =
+bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS;
+
+return (uint64_t)sector_granularity * (s->cluster_size << 3);
+}
+
+/* load_bitmap_data
+ * @bitmap_table entries must satisfy specification constraints.
+ * @bitmap must be cleared */
+static int load_bitmap_data(BlockDriverState *bs,
+const uint64_t *bitmap_table,
+uint32_t bitmap_table_size,
+BdrvDirtyBitmap *bitmap)
+{
+int ret = 0;
+BDRVQcow2State *s = bs->opaque;
+uint64_t sector, sbc;
+uint64_t bm_size = bdrv_dirty_bitmap_size(bitmap);
+uint8_t *buf = NULL;
+uint64_t i, tab_size =
+size_to_clusters(s,
+bdrv_dirty_bitmap_serialization_size(bitmap, 0, bm_size));
+
+if (tab_size != bitmap_table_size || tab_size > BME_MAX_TABLE_SIZE) {
+return -EINVAL;
+}
+
+buf = g_malloc(s->cluster_size);
+sbc = sectors_covered_by_bitmap_cluster(s, bitmap);
+for (i = 0, sector = 0; i < tab_size; ++i, sector += sbc) {
+uint64_t count = MIN(bm_size - sector, sbc);
+uint64_t entry = bitmap_table[i];
+uint64_t offset = entry & BME_TABLE_ENTRY_OFFSET_MASK;
+
+assert(check_table_entry(entry, s->cluster_size) == 0);
+
+if (offset == 0) {
+if (entry & BME_TABLE_ENTRY_FLAG_ALL_ONES) {
+bdrv_dirty_bitmap_deserialize_ones(bitmap, sector, count,
+   false);
+} else {
+/* No need to deserialize zeros because the dirty bitmap is
+ * already cleared */
+}
+} else {
+ret = bdrv_pread(bs->file, offset, buf, s->cluster_size);
+if (ret < 0) {
+goto finish;
+}
+bdrv_dirty_bitmap_deserialize_part(bitmap, buf, sector, count,
+   false);
+}
+}
+ret = 0;
+
+bdrv_dirty_bitmap_deserialize_finish(bitmap);
+
+finish:
+g_free(buf);
+
+return ret;
+}
+
+static BdrvDirtyBitmap *load_bitmap(BlockDriverState *bs,
+Qcow2Bitmap *bm, Error **errp)
+{
+int ret;
+uint64_t *bitmap_table = NULL;
+uint32_t granularity;
+BdrvDirtyBitmap *bitmap = NULL;
+
+if (bm->flags & BME_FLAG_IN_USE) {
+error_setg(errp, "Bitmap '%s' is in use", bm->name);
+goto fail;
+}
+
+ret = bitmap_table_load(bs, &bm->table, &bitmap_table);
+if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Could not read bitmap_table table from image for "
+ "bitmap '%s'", bm->name);
+goto fail;
+}
+
+granularity = 1U << bm->granularity_bits;
+bitmap = bdrv_create_dirty_bitmap(bs, granularity, bm->name, errp);
+if (bitmap == NULL) {
+goto fail;
+}
+
+ret = load_bitmap_data(bs, bitmap_table, bm->table.size, bitmap);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Could not read bitmap '%s' from image",
+ bm->name);
+goto fail;
+}
+
+g_free(bitmap_table);
+return
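
As a reading aid for load_bitmap_data() above, the following Python sketch walks a bitmap table the same way and records what would be done for each entry instead of doing I/O. Assumptions are marked in the comments: the flag bit and offset mask values are inferred from the "bits [1, 8] U [56, 63] are reserved" comment rather than quoted from the patch, the bitmap size is taken to be in sectors as the loop suggests, and plan_load is a made-up name.

```python
BDRV_SECTOR_BITS = 9
BME_TABLE_ENTRY_FLAG_ALL_ONES = 1 << 0             # assumed: bit 0
BME_TABLE_ENTRY_OFFSET_MASK = 0x00fffffffffffe00   # assumed: bits 9..55

def sectors_covered_by_bitmap_cluster(cluster_size, granularity):
    """Disk sectors described by one cluster of bitmap data:
    each of the cluster_size * 8 bits covers `granularity` bytes."""
    sector_granularity = granularity >> BDRV_SECTOR_BITS
    return sector_granularity * (cluster_size << 3)

def plan_load(bitmap_table, bm_size, cluster_size, granularity):
    """Return the list of actions the loader would take per table entry."""
    sbc = sectors_covered_by_bitmap_cluster(cluster_size, granularity)
    plan = []
    sector = 0
    for entry in bitmap_table:
        count = min(bm_size - sector, sbc)
        offset = entry & BME_TABLE_ENTRY_OFFSET_MASK
        if offset == 0:
            if entry & BME_TABLE_ENTRY_FLAG_ALL_ONES:
                plan.append(("ones", sector, count))
            # else: all zeroes, nothing to do on a cleared bitmap
        else:
            plan.append(("read", offset, sector, count))
        sector += sbc
    return plan

# Deliberately tiny, unrealistic sizes so the numbers stay readable.
table = [0, BME_TABLE_ENTRY_FLAG_ALL_ONES, 0x10000]
plan = plan_load(table, bm_size=10000, cluster_size=512, granularity=512)
print(plan)
```

With these toy sizes one bitmap cluster covers 4096 sectors, so the third entry only has 10000 - 8192 = 1808 sectors left to describe.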

[Qemu-block] [PATCH v22 24/30] qmp: add autoload parameter to block-dirty-bitmap-add

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Optional. Default is false.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Denis V. Lunev 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 blockdev.c   | 18 --
 qapi/block-core.json |  6 +-
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 125acabc07..4bb7033994 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1974,6 +1974,7 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
 qmp_block_dirty_bitmap_add(action->node, action->name,
action->has_granularity, action->granularity,
action->has_persistent, action->persistent,
+   action->has_autoload, action->autoload,
&local_err);
 
 if (!local_err) {
@@ -2722,6 +2723,7 @@ out:
 void qmp_block_dirty_bitmap_add(const char *node, const char *name,
 bool has_granularity, uint32_t granularity,
 bool has_persistent, bool persistent,
+bool has_autoload, bool autoload,
 Error **errp)
 {
 BlockDriverState *bs;
@@ -2751,6 +2753,15 @@ void qmp_block_dirty_bitmap_add(const char *node, const char *name,
 if (!has_persistent) {
 persistent = false;
 }
+if (!has_autoload) {
+autoload = false;
+}
+
+if (has_autoload && !persistent) {
+error_setg(errp, "Autoload flag must be used only for persistent "
+ "bitmaps");
+return;
+}
 
 if (persistent &&
 !bdrv_can_store_new_dirty_bitmap(bs, name, granularity, errp))
@@ -2759,9 +2770,12 @@ void qmp_block_dirty_bitmap_add(const char *node, const char *name,
 }
 
 bitmap = bdrv_create_dirty_bitmap(bs, granularity, name, errp);
-if (bitmap != NULL) {
-bdrv_dirty_bitmap_set_persistance(bitmap, persistent);
+if (bitmap == NULL) {
+return;
 }
+
+bdrv_dirty_bitmap_set_persistance(bitmap, persistent);
+bdrv_dirty_bitmap_set_autoload(bitmap, autoload);
 }
 
 void qmp_block_dirty_bitmap_remove(const char *node, const char *name,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 13f98ec146..5c42cc7790 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1566,11 +1566,15 @@
 #  Qcow2 disks support persistent bitmaps. Default is false for
 #  block-dirty-bitmap-add. (Since: 2.10)
 #
+# @autoload: the bitmap will be automatically loaded when the image it is stored
+#            in is opened. This flag may only be specified for persistent
+#            bitmaps. Default is false for block-dirty-bitmap-add. (Since: 2.10)
+#
 # Since: 2.4
 ##
 { 'struct': 'BlockDirtyBitmapAdd',
   'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32',
-'*persistent': 'bool' } }
+'*persistent': 'bool', '*autoload': 'bool' } }
 
 ##
 # @block-dirty-bitmap-add:
-- 
2.11.1
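
One detail of the check in qmp_block_dirty_bitmap_add() is easy to miss: it tests has_autoload, not autoload, so explicitly passing autoload=false for a non-persistent bitmap is rejected as well. A minimal Python sketch of the defaulting and validation order, with None standing in for "argument not supplied"; the function name and the use of ValueError are choices made for this sketch, not part of the QMP API:

```python
def resolve_bitmap_flags(persistent=None, autoload=None):
    """Mirror the defaulting and validation order of the patch.
    None means the optional argument was not given."""
    has_persistent = persistent is not None
    has_autoload = autoload is not None

    if not has_persistent:
        persistent = False
    if not has_autoload:
        autoload = False

    # Note: keyed on has_autoload, so autoload=False is rejected too.
    if has_autoload and not persistent:
        raise ValueError("Autoload flag must be used only for persistent "
                         "bitmaps")
    return persistent, autoload

print(resolve_bitmap_flags())                                  # both default
print(resolve_bitmap_flags(persistent=True, autoload=True))    # accepted
```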




[Qemu-block] [PATCH v22 02/30] specs/qcow2: do not use wording 'bitmap header'

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
A bitmap directory entry is sometimes called a 'bitmap header'. This
patch leaves only one name - 'bitmap directory entry'. The name 'bitmap
header' creates confusion with 'qcow2 header' and 'qcow2 bitmap
header extension' (which is an extension of the qcow2 header).

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Reviewed-by: John Snow 
---
 docs/interop/qcow2.txt | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index dda53dd2a3..8874e8c774 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -201,7 +201,7 @@ The fields of the bitmaps extension are:
 
   8 - 15:  bitmap_directory_size
Size of the bitmap directory in bytes. It is the cumulative
-   size of all (nb_bitmaps) bitmap headers.
+   size of all (nb_bitmaps) bitmap directory entries.
 
  16 - 23:  bitmap_directory_offset
Offset into the image file at which the bitmap directory
@@ -426,8 +426,7 @@ Each bitmap saved in the image is described in a bitmap directory entry. The
 bitmap directory is a contiguous area in the image file, whose starting offset
 and length are given by the header extension fields bitmap_directory_offset and
 bitmap_directory_size. The entries of the bitmap directory have variable
-length, depending on the lengths of the bitmap name and extra data. These
-entries are also called bitmap headers.
+length, depending on the lengths of the bitmap name and extra data.
 
 Structure of a bitmap directory entry:
 
-- 
2.11.1




[Qemu-block] [PATCH v22 28/30] qcow2: add .bdrv_remove_persistent_dirty_bitmap

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Implement the .bdrv_remove_persistent_dirty_bitmap interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 block/qcow2-bitmap.c | 41 +
 block/qcow2.c|  1 +
 block/qcow2.h|  3 +++
 3 files changed, 45 insertions(+)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index f45324e584..8448bec46d 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -1236,6 +1236,47 @@ static Qcow2Bitmap *find_bitmap_by_name(Qcow2BitmapList *bm_list,
 return NULL;
 }
 
+void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs,
+  const char *name,
+  Error **errp)
+{
+int ret;
+BDRVQcow2State *s = bs->opaque;
+Qcow2Bitmap *bm;
+Qcow2BitmapList *bm_list;
+
+if (s->nb_bitmaps == 0) {
+/* Absence of the bitmap is not an error: see explanation above
+ * bdrv_remove_persistent_dirty_bitmap() definition. */
+return;
+}
+
+bm_list = bitmap_list_load(bs, s->bitmap_directory_offset,
+   s->bitmap_directory_size, errp);
+if (bm_list == NULL) {
+return;
+}
+
+bm = find_bitmap_by_name(bm_list, name);
+if (bm == NULL) {
+goto fail;
+}
+
+QSIMPLEQ_REMOVE(bm_list, bm, Qcow2Bitmap, entry);
+
+ret = update_ext_header_and_dir(bs, bm_list);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Failed to update bitmap extension");
+goto fail;
+}
+
+free_bitmap_clusters(bs, &bm->table);
+
+fail:
+bitmap_free(bm);
+bitmap_list_free(bm_list);
+}
+
 void qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs, Error **errp)
 {
 BdrvDirtyBitmap *bitmap;
diff --git a/block/qcow2.c b/block/qcow2.c
index fc1f69cead..b836b8c831 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3612,6 +3612,7 @@ BlockDriver bdrv_qcow2 = {
 
 .bdrv_reopen_bitmaps_rw = qcow2_reopen_bitmaps_rw,
 .bdrv_can_store_new_dirty_bitmap = qcow2_can_store_new_dirty_bitmap,
+.bdrv_remove_persistent_dirty_bitmap = qcow2_remove_persistent_dirty_bitmap,
 };
 
 static void bdrv_qcow2_init(void)
diff --git a/block/qcow2.h b/block/qcow2.h
index 8b2f66f8b6..ffb951df52 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -637,5 +637,8 @@ bool qcow2_can_store_new_dirty_bitmap(BlockDriverState *bs,
   const char *name,
   uint32_t granularity,
   Error **errp);
+void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs,
+  const char *name,
+  Error **errp);
 
 #endif
-- 
2.11.1
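
Two properties of qcow2_remove_persistent_dirty_bitmap() are worth spelling out: a missing bitmap is silently accepted, and the bitmap's clusters are freed only after the updated directory has been written. The Python toy below models just that ordering. The directory is a plain dict and the two callbacks are hypothetical stand-ins for update_ext_header_and_dir() and free_bitmap_clusters(); none of this is QEMU API.

```python
def remove_persistent_bitmap(directory, name, write_directory, free_clusters):
    """directory: dict mapping bitmap name -> list of cluster offsets.
    Returns True if a stored bitmap was removed, False if there was none."""
    if not directory:
        return False              # no bitmaps stored: not an error
    if name not in directory:
        return False              # unknown name: not an error either

    remaining = {k: v for k, v in directory.items() if k != name}
    write_directory(remaining)    # may raise; the clusters are untouched then
    free_clusters(directory[name])
    directory.pop(name)
    return True

freed, written = [], []
stored = {"bitmap0": [0x30000, 0x40000]}   # made-up cluster offsets

missing = remove_persistent_bitmap(stored, "nope", written.append, freed.extend)
removed = remove_persistent_bitmap(stored, "bitmap0", written.append, freed.extend)
```

If the directory write fails, the sketch leaves both the directory and the clusters as they were, which matches the order of operations in the patch.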




[Qemu-block] [PATCH v22 12/30] block: refactor bdrv_reopen_commit

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Add bs local variable to simplify code.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: John Snow 
---
 block.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index 694396281b..37d68e3276 100644
--- a/block.c
+++ b/block.c
@@ -2989,9 +2989,11 @@ error:
 void bdrv_reopen_commit(BDRVReopenState *reopen_state)
 {
 BlockDriver *drv;
+BlockDriverState *bs;
 
 assert(reopen_state != NULL);
-drv = reopen_state->bs->drv;
+bs = reopen_state->bs;
+drv = bs->drv;
 assert(drv != NULL);
 
 /* If there are any driver level actions to take */
@@ -3000,13 +3002,13 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
 }
 
 /* set BDS specific flags now */
-QDECREF(reopen_state->bs->explicit_options);
+QDECREF(bs->explicit_options);
 
-reopen_state->bs->explicit_options   = reopen_state->explicit_options;
-reopen_state->bs->open_flags = reopen_state->flags;
-reopen_state->bs->read_only = !(reopen_state->flags & BDRV_O_RDWR);
+bs->explicit_options   = reopen_state->explicit_options;
+bs->open_flags = reopen_state->flags;
+bs->read_only = !(reopen_state->flags & BDRV_O_RDWR);
 
-bdrv_refresh_limits(reopen_state->bs, NULL);
+bdrv_refresh_limits(bs, NULL);
 }
 
 /*
-- 
2.11.1




[Qemu-block] [PATCH v22 29/30] qmp: block-dirty-bitmap-remove: remove persistent

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Remove persistent bitmap from the storage on block-dirty-bitmap-remove.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 blockdev.c   | 10 ++
 qapi/block-core.json |  3 ++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 3c8fb75208..122a936719 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2783,6 +2783,7 @@ void qmp_block_dirty_bitmap_remove(const char *node, const char *name,
 {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
+Error *local_err = NULL;
 
 bitmap = block_dirty_bitmap_lookup(node, name, &bs, errp);
 if (!bitmap || !bs) {
@@ -2795,6 +2796,15 @@ void qmp_block_dirty_bitmap_remove(const char *node, const char *name,
name);
 return;
 }
+
+if (bdrv_dirty_bitmap_get_persistance(bitmap)) {
+bdrv_remove_persistent_dirty_bitmap(bs, name, &local_err);
+if (local_err != NULL) {
+error_propagate(errp, local_err);
+return;
+}
+}
+
 bdrv_dirty_bitmap_make_anon(bitmap);
 bdrv_release_dirty_bitmap(bs, bitmap);
 }
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6ad8585400..e471efa1b4 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1601,7 +1601,8 @@
 # @block-dirty-bitmap-remove:
 #
 # Stop write tracking and remove the dirty bitmap that was created
-# with block-dirty-bitmap-add.
+# with block-dirty-bitmap-add. If the bitmap is persistent, remove it from its
+# storage too.
 #
 # Returns: nothing on success
 #  If @node is not a valid block device or node, DeviceNotFound
-- 
2.11.1




[Qemu-block] [PATCH v22 06/30] block/dirty-bitmap: add deserialize_ones func

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Add bdrv_dirty_bitmap_deserialize_ones() function, which is needed for
qcow2 bitmap loading, to handle unallocated bitmap parts, marked as
all-ones.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Reviewed-by: John Snow 
---
 block/dirty-bitmap.c |  7 +++
 include/block/dirty-bitmap.h |  3 +++
 include/qemu/hbitmap.h   | 15 +++
 util/hbitmap.c   | 17 +
 4 files changed, 42 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index df0110cf9f..f502c45a70 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -586,6 +586,13 @@ void bdrv_dirty_bitmap_deserialize_zeroes(BdrvDirtyBitmap *bitmap,
 hbitmap_deserialize_zeroes(bitmap->bitmap, start, count, finish);
 }
 
+void bdrv_dirty_bitmap_deserialize_ones(BdrvDirtyBitmap *bitmap,
+uint64_t start, uint64_t count,
+bool finish)
+{
+hbitmap_deserialize_ones(bitmap->bitmap, start, count, finish);
+}
+
 void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap)
 {
 hbitmap_deserialize_finish(bitmap->bitmap);
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index ab89f08e3d..05451c727d 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -66,6 +66,9 @@ void bdrv_dirty_bitmap_deserialize_part(BdrvDirtyBitmap *bitmap,
 void bdrv_dirty_bitmap_deserialize_zeroes(BdrvDirtyBitmap *bitmap,
   uint64_t start, uint64_t count,
   bool finish);
+void bdrv_dirty_bitmap_deserialize_ones(BdrvDirtyBitmap *bitmap,
+uint64_t start, uint64_t count,
+bool finish);
 void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap);
 
 /* Functions that require manual locking.  */
diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 6b04391266..b52304ac29 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -229,6 +229,21 @@ void hbitmap_deserialize_zeroes(HBitmap *hb, uint64_t start, uint64_t count,
 bool finish);
 
 /**
+ * hbitmap_deserialize_ones
+ * @hb: HBitmap to operate on.
+ * @start: First bit to restore.
+ * @count: Number of bits to restore.
+ * @finish: Whether to call hbitmap_deserialize_finish automatically.
+ *
+ * Fills the bitmap with ones.
+ *
+ * If @finish is false, caller must call hbitmap_deserialize_finish before using
+ * the bitmap.
+ */
+void hbitmap_deserialize_ones(HBitmap *hb, uint64_t start, uint64_t count,
+  bool finish);
+
+/**
  * hbitmap_deserialize_finish
  * @hb: HBitmap to operate on.
  *
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 0b38817505..0c1591a594 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -551,6 +551,23 @@ void hbitmap_deserialize_zeroes(HBitmap *hb, uint64_t start, uint64_t count,
 }
 }
 
+void hbitmap_deserialize_ones(HBitmap *hb, uint64_t start, uint64_t count,
+  bool finish)
+{
+uint64_t el_count;
+unsigned long *first;
+
+if (!count) {
+return;
+}
+serialization_chunk(hb, start, count, &first, &el_count);
+
+memset(first, 0xff, el_count * sizeof(unsigned long));
+if (finish) {
+hbitmap_deserialize_finish(hb);
+}
+}
+
 void hbitmap_deserialize_finish(HBitmap *bitmap)
 {
 int64_t i, size, prev_size;
-- 
2.11.1




[Qemu-block] [PATCH v22 03/30] hbitmap: improve dirty iter

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Make the dirty iterator resistant to resetting of bits in the corresponding
HBitmap.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 include/qemu/hbitmap.h | 26 --
 util/hbitmap.c | 23 ++-
 2 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 9239fe515e..6b04391266 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -256,10 +256,9 @@ void hbitmap_free(HBitmap *hb);
  * the lowest-numbered bit that is set in @hb, starting at @first.
  *
  * Concurrent setting of bits is acceptable, and will at worst cause the
- * iteration to miss some of those bits.  Resetting bits before the current
- * position of the iterator is also okay.  However, concurrent resetting of
- * bits can lead to unexpected behavior if the iterator has not yet reached
- * those bits.
+ * iteration to miss some of those bits.
+ *
+ * The concurrent resetting of bits is OK.
  */
 void hbitmap_iter_init(HBitmapIter *hbi, const HBitmap *hb, uint64_t first);
 
@@ -298,24 +297,7 @@ void hbitmap_free_meta(HBitmap *hb);
  * Return the next bit that is set in @hbi's associated HBitmap,
  * or -1 if all remaining bits are zero.
  */
-static inline int64_t hbitmap_iter_next(HBitmapIter *hbi)
-{
-unsigned long cur = hbi->cur[HBITMAP_LEVELS - 1];
-int64_t item;
-
-if (cur == 0) {
-cur = hbitmap_iter_skip_words(hbi);
-if (cur == 0) {
-return -1;
-}
-}
-
-/* The next call will resume work from the next bit.  */
-hbi->cur[HBITMAP_LEVELS - 1] = cur & (cur - 1);
-item = ((uint64_t)hbi->pos << BITS_PER_LEVEL) + ctzl(cur);
-
-return item << hbi->granularity;
-}
+int64_t hbitmap_iter_next(HBitmapIter *hbi);
 
 /**
  * hbitmap_iter_next_word:
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 35088e19c4..0b38817505 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -106,8 +106,9 @@ unsigned long hbitmap_iter_skip_words(HBitmapIter *hbi)
 
 unsigned long cur;
 do {
-cur = hbi->cur[--i];
+i--;
 pos >>= BITS_PER_LEVEL;
+cur = hbi->cur[i] & hb->levels[i][pos];
 } while (cur == 0);
 
 /* Check for end of iteration.  We always use fewer than BITS_PER_LONG
@@ -139,6 +140,26 @@ unsigned long hbitmap_iter_skip_words(HBitmapIter *hbi)
 return cur;
 }
 
+int64_t hbitmap_iter_next(HBitmapIter *hbi)
+{
+unsigned long cur = hbi->cur[HBITMAP_LEVELS - 1] &
+hbi->hb->levels[HBITMAP_LEVELS - 1][hbi->pos];
+int64_t item;
+
+if (cur == 0) {
+cur = hbitmap_iter_skip_words(hbi);
+if (cur == 0) {
+return -1;
+}
+}
+
+/* The next call will resume work from the next bit.  */
+hbi->cur[HBITMAP_LEVELS - 1] = cur & (cur - 1);
+item = ((uint64_t)hbi->pos << BITS_PER_LEVEL) + ctzl(cur);
+
+return item << hbi->granularity;
+}
+
 void hbitmap_iter_init(HBitmapIter *hbi, const HBitmap *hb, uint64_t first)
 {
 unsigned i, bit;
-- 
2.11.1




[Qemu-block] [PATCH v22 16/30] block: bdrv_close: release bitmaps after drv->bdrv_close

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Release bitmaps after the 'if (bs->drv) { ... }' block. This will allow the
format driver to save persistent bitmaps, which will appear in the following
commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index 3f83da178d..c649afec91 100644
--- a/block.c
+++ b/block.c
@@ -3061,9 +3061,6 @@ static void bdrv_close(BlockDriverState *bs)
 bdrv_flush(bs);
 bdrv_drain(bs); /* in case flush left pending I/O */
 
-bdrv_release_named_dirty_bitmaps(bs);
-assert(QLIST_EMPTY(&bs->dirty_bitmaps));
-
 if (bs->drv) {
 BdrvChild *child, *next;
 
@@ -3102,6 +3099,9 @@ static void bdrv_close(BlockDriverState *bs)
 bs->full_open_options = NULL;
 }
 
+bdrv_release_named_dirty_bitmaps(bs);
+assert(QLIST_EMPTY(&bs->dirty_bitmaps));
+
 QLIST_FOREACH_SAFE(ban, &bs->aio_notifiers, list, ban_next) {
 g_free(ban);
 }
-- 
2.11.1




[Qemu-block] [PATCH v22 30/30] block: release persistent bitmaps on inactivate

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
We should release them here so that they are reloaded on invalidate cache.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block.c  |  4 
 block/dirty-bitmap.c | 29 +++--
 include/block/dirty-bitmap.h |  1 +
 3 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index b2719bceff..acc6e4de1c 100644
--- a/block.c
+++ b/block.c
@@ -4156,6 +4156,10 @@ static int bdrv_inactivate_recurse(BlockDriverState *bs,
 }
 }
 
+/* At this point persistent bitmaps should be already stored by the format
+ * driver */
+bdrv_release_persistent_dirty_bitmaps(bs);
+
 return 0;
 }
 
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index b2ca78b4d0..543bddb9b5 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -356,15 +356,20 @@ void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
 bdrv_dirty_bitmaps_unlock(bs);
 }
 
+static bool bdrv_dirty_bitmap_has_name(BdrvDirtyBitmap *bitmap)
+{
+return !!bdrv_dirty_bitmap_name(bitmap);
+}
+
 /* Called with BQL taken.  */
-static void bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
-  BdrvDirtyBitmap *bitmap,
-  bool only_named)
+static void bdrv_do_release_matching_dirty_bitmap(
+BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
+bool (*cond)(BdrvDirtyBitmap *bitmap))
 {
 BdrvDirtyBitmap *bm, *next;
 bdrv_dirty_bitmaps_lock(bs);
 QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
-if ((!bitmap || bm == bitmap) && (!only_named || bm->name)) {
+if ((!bitmap || bm == bitmap) && (!cond || cond(bm))) {
 assert(!bm->active_iterators);
 assert(!bdrv_dirty_bitmap_frozen(bm));
 assert(!bm->meta);
@@ -389,7 +394,7 @@ out:
 /* Called with BQL taken.  */
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 {
-bdrv_do_release_matching_dirty_bitmap(bs, bitmap, false);
+bdrv_do_release_matching_dirty_bitmap(bs, bitmap, NULL);
 }
 
 /**
@@ -400,7 +405,19 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
  */
 void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs)
 {
-bdrv_do_release_matching_dirty_bitmap(bs, NULL, true);
+bdrv_do_release_matching_dirty_bitmap(bs, NULL, bdrv_dirty_bitmap_has_name);
+}
+
+/**
+ * Release all persistent dirty bitmaps attached to a BDS (for use in
+ * bdrv_inactivate_recurse()).
+ * There must not be any frozen bitmaps attached.
+ * This function does not remove persistent bitmaps from the storage.
+ */
+void bdrv_release_persistent_dirty_bitmaps(BlockDriverState *bs)
+{
+bdrv_do_release_matching_dirty_bitmap(bs, NULL,
+  bdrv_dirty_bitmap_get_persistance);
 }
 
 /**
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index d38233efd8..cbd9704e6a 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -25,6 +25,7 @@ BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
 void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
 void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs);
+void bdrv_release_persistent_dirty_bitmaps(BlockDriverState *bs);
 void bdrv_remove_persistent_dirty_bitmap(BlockDriverState *bs,
  const char *name,
  Error **errp);
-- 
2.11.1




[Qemu-block] [PATCH v22 04/30] tests: add hbitmap iter test

2017-06-28 Thread Vladimir Sementsov-Ogievskiy
Test that hbitmap iter is resistant to bitmap resetting.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Denis V. Lunev 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 tests/test-hbitmap.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/tests/test-hbitmap.c b/tests/test-hbitmap.c
index 23773d2051..1acb353889 100644
--- a/tests/test-hbitmap.c
+++ b/tests/test-hbitmap.c
@@ -909,6 +909,22 @@ static void hbitmap_test_add(const char *testpath,
hbitmap_test_teardown);
 }
 
+static void test_hbitmap_iter_and_reset(TestHBitmapData *data,
+const void *unused)
+{
+HBitmapIter hbi;
+
+hbitmap_test_init(data, L1 * 2, 0);
+hbitmap_set(data->hb, 0, data->size);
+
+hbitmap_iter_init(&hbi, data->hb, BITS_PER_LONG - 1);
+
+hbitmap_iter_next(&hbi);
+
+hbitmap_reset_all(data->hb);
+hbitmap_iter_next(&hbi);
+}
+
 int main(int argc, char **argv)
 {
 g_test_init(&argc, &argv, NULL);
@@ -966,6 +982,9 @@ int main(int argc, char **argv)
  test_hbitmap_serialize_part);
 hbitmap_test_add("/hbitmap/serialize/zeroes",
  test_hbitmap_serialize_zeroes);
+
+hbitmap_test_add("/hbitmap/iter/iter_and_reset",
+ test_hbitmap_iter_and_reset);
 g_test_run();
 
 return 0;
-- 
2.11.1




Re: [Qemu-block] [Qemu-devel] [RFC] QMP design: Fixing query-block and friends

2017-06-28 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 28.06.2017 um 09:10 hat Markus Armbruster geschrieben:
>> Eric Blake  writes:
>> > On 06/27/2017 11:31 AM, Kevin Wolf wrote:
>> >> If that's what we're going to do, I think I can figure out something
>> >> nice for block nodes. That shouldn't be too hard. The only question
>> >> would be whether we want a command to query one node or whether we would
>> >> keep returning all of them.
>> >
>> > The age-old filtering question. It's also plausible to have a single
>> > command, with an optional argument, and which always returns an array:
>> > the full array if the argument was omitted, or an array of one matching
>> > the argument when one was provided.  Adding filtering is an easy patch
>> > on top once it is proven to make life easier for a client, and I'm okay
>> > with a first approach that does not filter.
>> 
>> The graph may change.  Querying node by node would have to cope with
>> changes somehow, which I'd expect to be complex and fragile.  I think we
>> really need a way to get a complete, consistent graph.  So let's
>> implement that first.  If we still want server-side filtering once
>> that's done, we can add some on top.
>> 
>> As usual, I doubt we really need server-side filtering, and I dislike
>> the interface complexity it brings.
>
> I didn't really think of it as filtering. Every other operation is done
> on a single object, so querying a single object would be the natural
> extension. I mean, we also don't consider it "filtering" that we have
> many separate query commands instead of a 'query-qemu-state'.

Let me try to be more precise.

Once you have a command to return "full" data, I doubt the need to add
filtering to it so it can optionally return partial data.

I put "full" in quotes, because it's a design decision.  If you design a
command to query information about a node, then "full" is information
about just that node.  If you design one for the entire node graph, then
"full" is about all nodes that exist.  If you design one for the
sub-graph rooted at a certain node, then "full" is about all the nodes
in that sub-graph.

The design will depend on considerations like the desire to let clients
gain a consistent view more easily.  That's not what I meant by
"filtering".

If the chosen design returns information on multiple nodes, then adding
optional parameters to make it return less is "filtering".  This kind of
filtering can easily be done in the client.  Doing it in the server
increases interface complexity, and that needs justification.

Here's a justification I could accept: we can show certain clients need
partial information frequently enough to make saving bandwidth
worthwhile.

Here's a justification I refuse to accept: because we can.

Is my stance clearer now?

> But you have a good point with the necessary atomicity, so we'll return
> everything at once.
>
>> >> I am, however, a bit less confident about BBs. As I said above, I
>> >> consider them part of the qdev devices.
>> 
>> They weren't meant to be when I created them, but I guess it's what they
>> evolved to be.
>> 
>> >> As far as I know, there is no
>> >> high-level infrastructure to query runtime state of devices and qdev
>> >> properties are supposed to be read-only. Maybe non-qdev QOM properties
>> >> can be used somehow?
>> 
>> Since qdev properties are implemented as QOM properties, there are no
>> non-qdev QOM properties.
>
> You got the logic wrong here: "All qdev properties are QOM properties"
> doesn't imply "All QOM properties are qdev properties". Now I don't say
> that it's not true anyway, I don't know enough about QOM and qdev to say
> much about it.

I got it backwards.  There are no non-QOM qdev properties.  Sorry for
the confusion.

QOM isn't the only way to interact with QEMU objects (objects in the
widest possible sense).  But it's a generic way that already exists.
Let's consider whether we can use it before we invent new ways.

Read/write QOM properties is how we configure and control QOM objects,
including devices.  Read-only QOM properties is how we inspect them.

> I always had the impression that qdev provided some wrappers around QOM
> that add magic that makes properties configurable in -device and things
> like that, which you wouldn't want for these properties. I also know
> that devices aren't supposed to change qdev properties at runtime
> (whereas I remember QOM not to have trouble with that), but I'm not sure
> if there is a technical reason for this.

Historically, qdev properties are for configuring devices.  But nothing
stops you from using (abusing?) qdev properties for something else.  You
could, for instance, ignore a property's initial value (making it *not*
configuration), then have its value track some interesting bit of device
state, so the user can inspect it with info qtree.

However, arguing about this has become rather pointless, because qdev is
less and less separate from QOM.  We've acquired non-qdev QOM
