date:20180313

[Qemu-block] [PULL v2 14/17] nbd: BLOCK_STATUS for standard get_block_status function: client part

2018-03-13 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Minimal implementation: only one extent in the server's reply is supported.
The NBD_CMD_FLAG_REQ_ONE flag is used to force this behavior.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20180312152126.286890-6-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
[eblake: grammar tweaks, fix min_block check and 32-bit cap, use -1
instead of errno on failure in nbd_negotiate_simple_meta_context,
ensure that block status makes progress on success]
Signed-off-by: Eric Blake 
---
 block/nbd-client.h  |   6 +++
 include/block/nbd.h |   3 ++
 block/nbd-client.c  | 150
 block/nbd.c         |   3 ++
 nbd/client.c        | 117
 5 files changed, 279 insertions(+)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index 612c4c21a0c..0ece76e5aff 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -61,4 +61,10 @@ void nbd_client_detach_aio_context(BlockDriverState *bs);
 void nbd_client_attach_aio_context(BlockDriverState *bs,
AioContext *new_context);

+int coroutine_fn nbd_client_co_block_status(BlockDriverState *bs,
+bool want_zero,
+int64_t offset, int64_t bytes,
+int64_t *pnum, int64_t *map,
+BlockDriverState **file);
+
 #endif /* NBD_CLIENT_H */
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 2285637e673..fcdcd545023 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -260,6 +260,7 @@ struct NBDExportInfo {
 /* In-out fields, set by client before nbd_receive_negotiate() and
  * updated by server results during nbd_receive_negotiate() */
 bool structured_reply;
+bool base_allocation; /* base:allocation context for NBD_CMD_BLOCK_STATUS */

 /* Set by server results during nbd_receive_negotiate() */
 uint64_t size;
@@ -267,6 +268,8 @@ struct NBDExportInfo {
 uint32_t min_block;
 uint32_t opt_block;
 uint32_t max_block;
+
+uint32_t meta_base_allocation_id;
 };
 typedef struct NBDExportInfo NBDExportInfo;

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 0d9f73a137f..e64e346d690 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -228,6 +228,48 @@ static int nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk,
 return 0;
 }

+/* nbd_parse_blockstatus_payload
+ * support only one extent in reply and only for
+ * base:allocation context
+ */
+static int nbd_parse_blockstatus_payload(NBDClientSession *client,
+ NBDStructuredReplyChunk *chunk,
+ uint8_t *payload, uint64_t orig_length,
+ NBDExtent *extent, Error **errp)
+{
+uint32_t context_id;
+
+if (chunk->length != sizeof(context_id) + sizeof(*extent)) {
+error_setg(errp, "Protocol error: invalid payload for "
+ "NBD_REPLY_TYPE_BLOCK_STATUS");
+return -EINVAL;
+}
+
+context_id = payload_advance32(&payload);
+if (client->info.meta_base_allocation_id != context_id) {
+error_setg(errp, "Protocol error: unexpected context id %d for "
+ "NBD_REPLY_TYPE_BLOCK_STATUS, when negotiated context "
+ "id is %d", context_id,
+ client->info.meta_base_allocation_id);
+return -EINVAL;
+}
+
+extent->length = payload_advance32(&payload);
+extent->flags = payload_advance32(&payload);
+
+if (extent->length == 0 ||
+(client->info.min_block && !QEMU_IS_ALIGNED(extent->length,
+client->info.min_block)) ||
+extent->length > orig_length)
+{
+error_setg(errp, "Protocol error: server sent status chunk with "
+   "invalid length");
+return -EINVAL;
+}
+
+return 0;
+}
+
 /* nbd_parse_error_payload
  * on success @errp contains message describing nbd error reply
  */
@@ -642,6 +684,68 @@ static int nbd_co_receive_cmdread_reply(NBDClientSession *s, uint64_t handle,
 return iter.ret;
 }

+static int nbd_co_receive_blockstatus_reply(NBDClientSession *s,
+uint64_t handle, uint64_t length,
+NBDExtent *extent, Error **errp)
+{
+NBDReplyChunkIter iter;
+NBDReply reply;
+void *payload = NULL;
+Error *local_err = NULL;
+bool received = false;
+
+assert(!extent->length);
+NBD_FOREACH_REPLY_CHUNK(s, iter, handle, s->info.structured_reply,
+NULL, &reply, &payload)
+{
+int ret;
+NBDStructuredReplyChunk *chunk = 
+
+assert(nbd_reply_is_structured());
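
The checks performed by nbd_parse_blockstatus_payload() in the hunk above can be tried in isolation. The following is only a sketch under stated assumptions: Extent, advance_be32() and parse_one_extent() are hypothetical stand-ins for illustration, not the QEMU types or helpers, and error reporting is reduced to a plain -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the extent descriptor: two 32-bit fields. */
typedef struct {
    uint32_t length;
    uint32_t flags;
} Extent;

/* Read a big-endian 32-bit value and advance the cursor past it. */
static uint32_t advance_be32(const uint8_t **p)
{
    const uint8_t *b = *p;

    *p += 4;
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/*
 * Parse a payload that must hold exactly one extent:
 * context id, extent length, extent flags (each 32 bits, big-endian).
 * min_block == 0 means no alignment requirement is known.
 * Returns 0 on success and -1 on any protocol violation.
 */
static int parse_one_extent(const uint8_t *payload, size_t payload_len,
                            uint32_t expected_ctx, uint32_t min_block,
                            uint64_t requested_len, Extent *out)
{
    if (payload_len != sizeof(uint32_t) + sizeof(*out)) {
        return -1;              /* not exactly one extent */
    }
    if (advance_be32(&payload) != expected_ctx) {
        return -1;              /* context that was not negotiated */
    }
    out->length = advance_be32(&payload);
    out->flags = advance_be32(&payload);
    if (out->length == 0 ||
        (min_block && out->length % min_block != 0) ||
        out->length > requested_len) {
        return -1;              /* no progress, misaligned, or too long */
    }
    return 0;
}
```

Note that the size check uses sizeof(*out), the size of the structure, rather than the size of the pointer.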
+
+

Re: [Qemu-block] [Qemu-devel] [PATCH v2 0/8] nbd block status base:allocation

2018-03-13 Thread Eric Blake


On 03/12/2018 10:11 PM, Eric Blake wrote:






   CC  block/nbd-client.o
   CC  block/sheepdog.o
/var/tmp/patchew-tester-tmp-erqpie2w/src/block/nbd-client.c: In function ‘nbd_client_co_block_status’:
/var/tmp/patchew-tester-tmp-erqpie2w/src/block/nbd-client.c:890:15: error: ‘extent.flags’ may be used uninitialized in this function [-Werror=maybe-uninitialized]

  NBDExtent extent;
    ^~
/var/tmp/patchew-tester-tmp-erqpie2w/src/block/nbd-client.c:925:19: error: ‘extent.length’ may be used uninitialized in this function [-Werror=maybe-uninitialized]

  *pnum = extent.length;
  ~~^~~


May be a false positive where the compiler merely can't see through the 
logic, or it might be a real bug; but I suspect either way that the 
solution is to initialize this (I'm guessing patch 5), as in:


NBDExtent extent = { 0 };


Not a sufficient fix - it's probably a real bug in that extent is never 
initialized if the NBD_FOREACH_REPLY_CHUNK() loop in 
nbd_co_receive_blockstatus_reply() never encounters an 
NBD_REPLY_TYPE_BLOCK_STATUS chunk.  A malicious server could just reply 
with NBD_REPLY_TYPE_NONE (no status reported after all), but the block 
layer contract requires us to make progress or return an error.  So, if 
the server didn't give us any extents, we need to turn it into an error 
even if the server did not.
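
The rule described above (a successful reply that carries no extent must still become an error) can be sketched on its own. This is a minimal illustration, not the QEMU code: finish_block_status() is a hypothetical helper, and the choice of -EIO for the no-progress case is an assumption made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Combine the outcome of the reply loop with the "must make progress" rule.
 * loop_ret is 0 or a negative errno reported while reading chunks;
 * extent_length is the length of the extent collected, 0 if none arrived.
 */
static int finish_block_status(int loop_ret, uint32_t extent_length)
{
    if (loop_ret < 0) {
        return loop_ret;        /* keep the earlier, more specific error */
    }
    if (extent_length == 0) {
        return -EIO;            /* success without progress is an error */
    }
    return 0;
}
```

With this shape, the caller can rely on a non-zero extent length whenever the function returns 0.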




Will squash that in when I get to that part of the review.



And I forgot to do it before the pull request v1, oops.  Here's what I'm 
squashing in for v2:


diff --git i/block/nbd-client.c w/block/nbd-client.c
index be160052cb1..486d73f1c63 100644
--- i/block/nbd-client.c
+++ w/block/nbd-client.c
@@ -694,6 +694,7 @@ static int nbd_co_receive_blockstatus_reply(NBDClientSession *s,

 Error *local_err = NULL;
 bool received = false;

+extent->length = 0;
 NBD_FOREACH_REPLY_CHUNK(s, iter, handle, s->info.structured_reply,
 NULL, &reply, &payload)
 {
@@ -734,6 +735,13 @@ static int nbd_co_receive_blockstatus_reply(NBDClientSession *s,

 payload = NULL;
 }

+if (!extent->length && !iter.err) {
+error_setg(&iter.err,
+   "Server did not reply with any status extents");
+if (!iter.ret) {
+iter.ret = -EIO;
+}
+}
 error_propagate(errp, iter.err);
 return iter.ret;
 }
@@ -919,6 +927,7 @@ int coroutine_fn nbd_client_co_block_status(BlockDriverState *bs,

 return ret;
 }

+assert(extent.length);
 *pnum = extent.length;
 return (extent.flags & NBD_STATE_HOLE ? 0 : BDRV_BLOCK_DATA) |
(extent.flags & NBD_STATE_ZERO ? BDRV_BLOCK_ZERO : 0);
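
The return expression at the end of that hunk maps the two base:allocation bits onto block-layer status bits. A standalone sketch of the same mapping follows; the NBD_STATE_* values are the ones the NBD protocol defines for base:allocation, while BLK_DATA, BLK_ZERO and extent_flags_to_block_status() are stand-in names with assumed values, used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bits of the base:allocation context as defined by the NBD protocol. */
#define NBD_STATE_HOLE (1u << 0)
#define NBD_STATE_ZERO (1u << 1)

/* Stand-ins for the block layer's status bits (assumed values). */
#define BLK_DATA 0x01
#define BLK_ZERO 0x02

/* A hole means "no data allocated"; the zero bit is passed through. */
static int extent_flags_to_block_status(uint32_t flags)
{
    return ((flags & NBD_STATE_HOLE) ? 0 : BLK_DATA) |
           ((flags & NBD_STATE_ZERO) ? BLK_ZERO : 0);
}
```

The four possible inputs cover allocated data, a hole, allocated zeroes, and a hole that reads as zeroes.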



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-block] [Qemu-devel] [PATCH v11 00/13] Dirty bitmaps postcopy migration

2018-03-13 Thread Dr. David Alan Gilbert

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> It looks like a bug in a recent commit to checkpatch. It doesn't support
> do { } while.

Yes, adding Su Hang and Eric in and trimming some others out.
So yes, ignore this patchew failure for this case, but we need to fix
that separately.

Dave

> 
> Best regards,
> 
> Vladimir.
> 
> 
> From: no-re...@patchew.org 
> Sent: 13 March 2018, 22:03:29
> To: Vladimir Sementsov-Ogievskiy
> Cc: f...@redhat.com; qemu-block@nongnu.org; qemu-de...@nongnu.org; 
> kw...@redhat.com; peter.mayd...@linaro.org; Vladimir Sementsov-Ogievskiy; 
> f...@redhat.com; lir...@il.ibm.com; quint...@redhat.com; js...@redhat.com; 
> arm...@redhat.com; mre...@redhat.com; stefa...@redhat.com; Denis Lunev; 
> amit.s...@redhat.com; pbonz...@redhat.com; dgilb...@redhat.com
> Subject: Re: [Qemu-devel] [PATCH v11 00/13] Dirty bitmaps postcopy migration
> 
> Hi,
> 
> This series seems to have some coding style problems. See output below for
> more information:
> 
> Type: series
> Message-id: 20180313180320.339796-1-vsement...@virtuozzo.com
> Subject: [Qemu-devel] [PATCH v11 00/13] Dirty bitmaps postcopy migration
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> 
> BASE=base
> n=1
> total=$(git log --oneline $BASE.. | wc -l)
> failed=0
> 
> git config --local diff.renamelimit 0
> git config --local diff.renames True
> git config --local diff.algorithm histogram
> 
> commits="$(git log --format=%H --reverse $BASE..)"
> for c in $commits; do
> echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
> if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; 
> then
> failed=1
> echo
> fi
> n=$((n+1))
> done
> 
> exit $failed
> === TEST SCRIPT END ===
> 
> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> From https://github.com/patchew-project/qemu
>  * [new tag]   
> patchew/20180313180320.339796-1-vsement...@virtuozzo.com -> 
> patchew/20180313180320.339796-1-vsement...@virtuozzo.com
> Auto packing the repository in background for optimum performance.
> See "git help gc" for manual housekeeping.
> Switched to a new branch 'test'
> 71e03c4ecc iotests: add dirty bitmap postcopy test
> daa548f79f iotests: add dirty bitmap migration test
> 353c5fdae1 migration: add postcopy migration of dirty bitmaps
> 1da07d4ba2 migration: allow qmp command migrate-start-postcopy for any 
> postcopy
> b789a2887e migration: add is_active_iterate handler
> 48eb14f856 migration/qemu-file: add qemu_put_counted_string()
> 1d6549dae1 migration: include migrate_dirty_bitmaps in migrate_postcopy
> e9e40af39a qapi: add dirty-bitmaps migration capability
> c575185038 migration: introduce postcopy-only pending
> 7cae35cd7c dirty-bitmap: add locked state
> 47bbd2a70c block/dirty-bitmap: add _locked version of 
> bdrv_reclaim_dirty_bitmap
> 870ff1d916 block/dirty-bitmap: fix locking in bdrv_reclaim_dirty_bitmap
> 5dca3ae226 block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor()
> 
> === OUTPUT BEGIN ===
> Checking PATCH 1/13: block/dirty-bitmap: add 
> bdrv_dirty_bitmap_enable_successor()...
> Checking PATCH 2/13: block/dirty-bitmap: fix locking in 
> bdrv_reclaim_dirty_bitmap...
> Checking PATCH 3/13: block/dirty-bitmap: add _locked version of 
> bdrv_reclaim_dirty_bitmap...
> Checking PATCH 4/13: dirty-bitmap: add locked state...
> Checking PATCH 5/13: migration: introduce postcopy-only pending...
> Checking PATCH 6/13: qapi: add dirty-bitmaps migration capability...
> Checking PATCH 7/13: migration: include migrate_dirty_bitmaps in 
> migrate_postcopy...
> Checking PATCH 8/13: migration/qemu-file: add qemu_put_counted_string()...
> Checking PATCH 9/13: migration: add is_active_iterate handler...
> Checking PATCH 10/13: migration: allow qmp command migrate-start-postcopy for 
> any postcopy...
> Checking PATCH 11/13: migration: add postcopy migration of dirty bitmaps...
> ERROR: braces {} are necessary for all arms of this statement
> #737: FILE: migration/block-dirty-bitmap.c:690:
> +} while (!(s.flags & DIRTY_BITMAP_MIG_FLAG_EOS));
> [...]
> 
> total: 1 errors, 0 warnings, 816 lines checked
> 
> Your patch has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> Checking PATCH 12/13: iotests: add dirty bitmap migration test...
> Checking PATCH 13/13: iotests: add dirty bitmap postcopy test...
> === OUTPUT END ===
> 
> Test command exited with code: 1
> 
> 
> ---
> Email generated automatically by Patchew [http://patchew.org/].
> Please send your feedback to patchew-de...@freelists.org
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-block] [Qemu-devel] [PATCH v11 00/13] Dirty bitmaps postcopy migration

2018-03-13 Thread Eric Blake


On 03/13/2018 03:01 PM, Vladimir Sementsov-Ogievskiy wrote:

> It looks like a bug in a recent commit to checkpatch. It doesn't support
> do { } while.

>> Checking PATCH 11/13: migration: add postcopy migration of dirty bitmaps...
>> ERROR: braces {} are necessary for all arms of this statement
>> #737: FILE: migration/block-dirty-bitmap.c:690:
>> +} while (!(s.flags & DIRTY_BITMAP_MIG_FLAG_EOS));
>> [...]



Indeed, this is a regression in Su Hang's commit 2b9aef6f.
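
For reference, the construct that triggered the false positive is an ordinary braced do/while loop, where the closing line "} while (...);" terminates the loop rather than opening an unbraced arm. A minimal sketch, with FLAG_EOS and count_chunks() as hypothetical names that only mimic the shape of the flagged code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_EOS 0x01   /* hypothetical end-of-stream marker */

/* Count chunks up to and including the first one that carries FLAG_EOS. */
static size_t count_chunks(const uint8_t *flags, size_t n)
{
    size_t i = 0;

    if (n == 0) {
        return 0;
    }
    do {
        i++;
    } while (i < n && !(flags[i - 1] & FLAG_EOS));
    return i;
}
```

The loop body is fully braced, so a style checker that demands braces on every arm has nothing to complain about here.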


Re: [Qemu-block] [Qemu-devel] [PATCH v11 00/13] Dirty bitmaps postcopy migration

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

It looks like a bug in a recent commit to checkpatch. It doesn't support
do { } while.


Best regards,

Vladimir.



Re: [Qemu-block] [Qemu-devel] [PATCH v11 11/13] migration: add postcopy migration of dirty bitmaps

2018-03-13 Thread John Snow



On 03/13/2018 02:29 PM, Vladimir Sementsov-Ogievskiy wrote:
> 13.03.2018 21:22, Dr. David Alan Gilbert wrote:
>> * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
>>> Postcopy migration of dirty bitmaps. Only named dirty bitmaps are
>>> migrated.
>>>
>>> If destination qemu is already containing a dirty bitmap with the
>>> same name
>>> as a migrated bitmap (for the same node), then, if their
>>> granularities are
>>> the same the migration will be done, otherwise the error will be
>>> generated.
>>>
>>> If destination qemu doesn't contain such bitmap it will be created.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
> 
> [...]
> 
>>> +
>>> +static int dirty_bitmap_load_bits(QEMUFile *f, DirtyBitmapLoadState *s)
>>> +{
>>> +    uint64_t first_byte = qemu_get_be64(f) << BDRV_SECTOR_BITS;
>>> +    uint64_t nr_bytes = (uint64_t)qemu_get_be32(f) << BDRV_SECTOR_BITS;
>>> +    trace_dirty_bitmap_load_bits_enter(first_byte >> BDRV_SECTOR_BITS,
>>> +   nr_bytes >> BDRV_SECTOR_BITS);
>>> +
>>> +    if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
>>> +    trace_dirty_bitmap_load_bits_zeroes();
>>> +    bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte,
>>> nr_bytes,
>>> + false);
>>> +    } else {
>>> +    size_t ret;
>>> +    uint8_t *buf;
>>> +    uint64_t buf_size = qemu_get_be64(f);
>>> +    uint64_t needed_size =
>>> +    bdrv_dirty_bitmap_serialization_size(s->bitmap,
>>> + first_byte, nr_bytes);
>>> +
>>> +    if (needed_size > buf_size ||
>>> +    buf_size > QEMU_ALIGN_UP(needed_size, 4 + sizeof(long))
>> I think you meant '4 * sizeof(long)';  other than that, from the
>> migration side I'm OK, so with that fixed, and someone from the block
>> side checking the block code:
>>
>>
>> Reviewed-by: Dr. David Alan Gilbert 
>>
> 
> Ohh, yes, 4 * sizeof(long).
> Who will finally pull it? Should I respin, or you fix it inflight?
> 

I'm testing and staging it right now. David gave his blessing for me to
send a Pull Request.

--js

Re: [Qemu-block] [Qemu-devel] [PATCH v11 00/13] Dirty bitmaps postcopy migration

2018-03-13 Thread no-reply

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180313180320.339796-1-vsement...@virtuozzo.com
Subject: [Qemu-devel] [PATCH v11 00/13] Dirty bitmaps postcopy migration

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20180313180320.339796-1-vsement...@virtuozzo.com -> patchew/20180313180320.339796-1-vsement...@virtuozzo.com
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
Switched to a new branch 'test'
71e03c4ecc iotests: add dirty bitmap postcopy test
daa548f79f iotests: add dirty bitmap migration test
353c5fdae1 migration: add postcopy migration of dirty bitmaps
1da07d4ba2 migration: allow qmp command migrate-start-postcopy for any postcopy
b789a2887e migration: add is_active_iterate handler
48eb14f856 migration/qemu-file: add qemu_put_counted_string()
1d6549dae1 migration: include migrate_dirty_bitmaps in migrate_postcopy
e9e40af39a qapi: add dirty-bitmaps migration capability
c575185038 migration: introduce postcopy-only pending
7cae35cd7c dirty-bitmap: add locked state
47bbd2a70c block/dirty-bitmap: add _locked version of bdrv_reclaim_dirty_bitmap
870ff1d916 block/dirty-bitmap: fix locking in bdrv_reclaim_dirty_bitmap
5dca3ae226 block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor()

=== OUTPUT BEGIN ===
Checking PATCH 1/13: block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor()...
Checking PATCH 2/13: block/dirty-bitmap: fix locking in bdrv_reclaim_dirty_bitmap...
Checking PATCH 3/13: block/dirty-bitmap: add _locked version of bdrv_reclaim_dirty_bitmap...
Checking PATCH 4/13: dirty-bitmap: add locked state...
Checking PATCH 5/13: migration: introduce postcopy-only pending...
Checking PATCH 6/13: qapi: add dirty-bitmaps migration capability...
Checking PATCH 7/13: migration: include migrate_dirty_bitmaps in migrate_postcopy...
Checking PATCH 8/13: migration/qemu-file: add qemu_put_counted_string()...
Checking PATCH 9/13: migration: add is_active_iterate handler...
Checking PATCH 10/13: migration: allow qmp command migrate-start-postcopy for any postcopy...
Checking PATCH 11/13: migration: add postcopy migration of dirty bitmaps...
ERROR: braces {} are necessary for all arms of this statement
#737: FILE: migration/block-dirty-bitmap.c:690:
+} while (!(s.flags & DIRTY_BITMAP_MIG_FLAG_EOS));
[...]

total: 1 errors, 0 warnings, 816 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 12/13: iotests: add dirty bitmap migration test...
Checking PATCH 13/13: iotests: add dirty bitmap postcopy test...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-block] [PATCH 1/2] qcow2: Give the refcount cache the minimum possible size by default

2018-03-13 Thread Alberto Garcia

On Tue 13 Mar 2018 07:23:36 PM CET, Eric Blake wrote:
>> +*refcount_cache_size =
>> +MIN(combined_cache_size, min_refcount_cache);
>
> but here, if combined_cache_size is smaller than min_refcount_cache,
>
>> +*l2_cache_size = combined_cache_size - *refcount_cache_size;
>
> then l2_cache_size is set to a negative value.

No, it's set to 0.

If combined == 4k and min_refcount == 256k, then

   refcount_cache_size = MIN(4k, 256k) // 4k
   l2_cache_size = 4k - 4k; // 0

Then the caller ensures that it's always set to the minimum (as it did
with the previous code).
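
The arithmetic can be checked in isolation. This is only a sketch of the two quoted assignments: split_cache() and its parameter names are illustrative, and the clamping to minimum sizes that the caller performs is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Give the refcount cache at most min_refcount bytes and the rest to L2.
 * Because *refcount never exceeds combined, the subtraction cannot wrap.
 */
static void split_cache(uint64_t combined, uint64_t min_refcount,
                        uint64_t *l2, uint64_t *refcount)
{
    *refcount = MIN(combined, min_refcount);
    *l2 = combined - *refcount;
}
```

With combined = 4k and min_refcount = 256k this yields a refcount cache of 4k and an L2 cache of 0, matching the walk-through above.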

Berto

Re: [Qemu-block] [PATCH 1/2] qcow2: Give the refcount cache the minimum possible size by default

2018-03-13 Thread Eric Blake


On 03/13/2018 01:48 PM, Alberto Garcia wrote:

> On Tue 13 Mar 2018 07:23:36 PM CET, Eric Blake wrote:
>>> +*refcount_cache_size =
>>> +MIN(combined_cache_size, min_refcount_cache);
>>
>> but here, if combined_cache_size is smaller than min_refcount_cache,
>>
>>> +*l2_cache_size = combined_cache_size - *refcount_cache_size;
>>
>> then l2_cache_size is set to a negative value.
>
> No, it's set to 0.
>
> If combined == 4k and min_refcount == 256k, then
>
>    refcount_cache_size = MIN(4k, 256k) // 4k
>    l2_cache_size = 4k - 4k; // 0

Ah. Mental breakdown on my part in trying to compute (x - MIN()).

> Then the caller ensures that it's always set to the minimum (as it did
> with the previous code).


So the caller will use values larger than the requested limits if the
requested limits are too small, and we are okay with calculations
resulting in 0 here.  All right, thanks for stepping me through my
error; you're good to go with:


Reviewed-by: Eric Blake 


Re: [Qemu-block] [PATCH v11 11/13] migration: add postcopy migration of dirty bitmaps

2018-03-13 Thread Dr. David Alan Gilbert

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> Postcopy migration of dirty bitmaps. Only named dirty bitmaps are migrated.
> 
> If destination qemu is already containing a dirty bitmap with the same name
> as a migrated bitmap (for the same node), then, if their granularities are
> the same the migration will be done, otherwise the error will be generated.
> 
> If destination qemu doesn't contain such bitmap it will be created.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/migration/misc.h   |   3 +
>  migration/migration.h  |   3 +
>  migration/block-dirty-bitmap.c | 746 
> +
>  migration/migration.c  |   5 +
>  migration/savevm.c |   2 +
>  vl.c   |   1 +
>  migration/Makefile.objs|   1 +
>  migration/trace-events |  14 +
>  8 files changed, 775 insertions(+)
>  create mode 100644 migration/block-dirty-bitmap.c
> 
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 77fd4f587c..4ebf24c6c2 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -56,4 +56,7 @@ bool migration_has_failed(MigrationState *);
>  bool migration_in_postcopy_after_devices(MigrationState *);
>  void migration_global_dump(Monitor *mon);
>  
> +/* migration/block-dirty-bitmap.c */
> +void dirty_bitmap_mig_init(void);
> +
>  #endif
> diff --git a/migration/migration.h b/migration/migration.h
> index da6bc37de8..a79540b99c 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -235,4 +235,7 @@ void migrate_send_rp_pong(MigrationIncomingState *mis,
>  int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* 
> rbname,
>ram_addr_t start, size_t len);
>  
> +void dirty_bitmap_mig_before_vm_start(void);
> +void init_dirty_bitmap_incoming_migration(void);
> +
>  #endif
> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
> new file mode 100644
> index 00..98ba4589e3
> --- /dev/null
> +++ b/migration/block-dirty-bitmap.c
> @@ -0,0 +1,746 @@
> +/*
> + * Block dirty bitmap postcopy migration
> + *
> + * Copyright IBM, Corp. 2009
> + * Copyright (c) 2016-2017 Virtuozzo International GmbH. All rights reserved.
> + *
> + * Authors:
> + *  Liran Schour   
> + *  Vladimir Sementsov-Ogievskiy 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + * This file is derived from migration/block.c, so it's author and IBM 
> copyright
> + * are here, although content is quite different.
> + *
> + * Contributions after 2012-01-13 are licensed under the terms of the
> + * GNU GPL, version 2 or (at your option) any later version.
> + *
> + ****
> + *
> + * Here postcopy migration of dirty bitmaps is realized. Only QMP-addressable
> + * bitmaps are migrated.
> + *
> + * Bitmap migration implies creating bitmap with the same name and 
> granularity
> + * in destination QEMU. If the bitmap with the same name (for the same node)
> + * already exists on destination an error will be generated.
> + *
> + * format of migration:
> + *
> + * # Header (shared for different chunk types)
> + * 1, 2 or 4 bytes: flags (see qemu_{put,put}_flags)
> + * [ 1 byte: node name size ] \  flags & DEVICE_NAME
> + * [ n bytes: node name ] /
> + * [ 1 byte: bitmap name size ] \  flags & BITMAP_NAME
> + * [ n bytes: bitmap name ] /
> + *
> + * # Start of bitmap migration (flags & START)
> + * header
> + * be64: granularity
> + * 1 byte: bitmap flags (corresponds to BdrvDirtyBitmap)
> + *   bit 0-  bitmap is enabled
> + *   bit 1-  bitmap is persistent
> + *   bit 2-  bitmap is autoloading
> + *   bits 3-7 - reserved, must be zero
> + *
> + * # Complete of bitmap migration (flags & COMPLETE)
> + * header
> + *
> + * # Data chunk of bitmap migration
> + * header
> + * be64: start sector
> + * be32: number of sectors
> + * [ be64: buffer size  ] \ ! (flags & ZEROES)
> + * [ n bytes: buffer] /
> + *
> + * The last chunk in stream should contain flags & EOS. The chunk may skip
> + * device and/or bitmap names, assuming them to be the same with the previous
> + * chunk.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "block/block.h"
> +#include "block/block_int.h"
> +#include "sysemu/block-backend.h"
> +#include "qemu/main-loop.h"
> +#include "qemu/error-report.h"
> +#include "migration/misc.h"
> +#include "migration/migration.h"
> +#include "migration/qemu-file.h"
> +#include "migration/vmstate.h"
> +#include "migration/register.h"
> +#include "qemu/hbitmap.h"
> +#include "sysemu/sysemu.h"
> +#include "qemu/cutils.h"
> +#include "qapi/error.h"
> +#include "trace.h"
> +
> +#define CHUNK_SIZE (1 << 10)
> +
> +/* Flags occupy one, two or four bytes (Big Endian). The size is determined 
> as
>

Re: [Qemu-block] [PATCH v11 11/13] migration: add postcopy migration of dirty bitmaps

2018-03-13 Thread Vladimir Sementsov-Ogievskiy


13.03.2018 21:22, Dr. David Alan Gilbert wrote:

> * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
>> Postcopy migration of dirty bitmaps. Only named dirty bitmaps are migrated.
>>
>> If destination qemu is already containing a dirty bitmap with the same name
>> as a migrated bitmap (for the same node), then, if their granularities are
>> the same the migration will be done, otherwise the error will be generated.
>>
>> If destination qemu doesn't contain such bitmap it will be created.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> ---

[...]

>> +
>> +static int dirty_bitmap_load_bits(QEMUFile *f, DirtyBitmapLoadState *s)
>> +{
>> +    uint64_t first_byte = qemu_get_be64(f) << BDRV_SECTOR_BITS;
>> +    uint64_t nr_bytes = (uint64_t)qemu_get_be32(f) << BDRV_SECTOR_BITS;
>> +    trace_dirty_bitmap_load_bits_enter(first_byte >> BDRV_SECTOR_BITS,
>> +                                       nr_bytes >> BDRV_SECTOR_BITS);
>> +
>> +    if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
>> +        trace_dirty_bitmap_load_bits_zeroes();
>> +        bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
>> +                                             false);
>> +    } else {
>> +        size_t ret;
>> +        uint8_t *buf;
>> +        uint64_t buf_size = qemu_get_be64(f);
>> +        uint64_t needed_size =
>> +            bdrv_dirty_bitmap_serialization_size(s->bitmap,
>> +                                                 first_byte, nr_bytes);
>> +
>> +        if (needed_size > buf_size ||
>> +            buf_size > QEMU_ALIGN_UP(needed_size, 4 + sizeof(long))
>
> I think you meant '4 * sizeof(long)';  other than that, from the
> migration side I'm OK, so with that fixed, and someone from the block
> side checking the block code:
>
>
> Reviewed-by: Dr. David Alan Gilbert 



Ohh, yes, 4 * sizeof(long).
Who will finally pull it? Should I respin, or you fix it inflight?
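
To see why the operator matters, the bounds check can be reproduced with a plain round-up helper. This is a sketch under assumptions: align_up() and buf_size_ok() are hypothetical stand-ins (not the QEMU macro), and the word size is passed in explicitly so the result does not depend on the platform's sizeof(long).

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the next multiple of m (m must be non-zero). */
static uint64_t align_up(uint64_t n, uint64_t m)
{
    return (n + m - 1) / m * m;
}

/*
 * Accept a buffer that is at least the needed size and at most the
 * needed size rounded up to the given granule.
 */
static int buf_size_ok(uint64_t needed, uint64_t buf_size, uint64_t granule)
{
    return needed <= buf_size && buf_size <= align_up(needed, granule);
}
```

Assuming an 8-byte long, the granule is 32 with '4 * sizeof(long)' but only 12 with '4 + sizeof(long)', so a 64-byte buffer for 40 needed bytes passes the first check and fails the second.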

--
Best regards,
Vladimir

Re: [Qemu-block] [PATCH 2/2] docs: Document the new default sizes of the qcow2 caches

2018-03-13 Thread Eric Blake


On 03/13/2018 10:02 AM, Alberto Garcia wrote:

> We have just reduced the refcount cache size to the minimum unless
> the user explicitly requests a larger one, so we have to update the
> documentation to reflect this change.
>
> Signed-off-by: Alberto Garcia 
> ---
>   docs/qcow2-cache.txt | 31 ++-
>   1 file changed, 14 insertions(+), 17 deletions(-)

> +Before QEMU 2.12 the refcount cache had a default size of 1/4 of the
> +L2 cache size. This resulted in unnecessarily large caches, so now the
> +refcount cache is as small as possible unless overriden by the user.


s/overriden/overridden/

Reviewed-by: Eric Blake 


Re: [Qemu-block] [Qemu-devel] [PULL 18/41] blockjobs: add block-job-finalize

2018-03-13 Thread Eric Blake


On 03/13/2018 11:17 AM, Kevin Wolf wrote:

> From: John Snow 
>
> Instead of automatically transitioning from PENDING to CONCLUDED, gate
> the .prepare() and .commit() phases behind an explicit acknowledgement
> provided by the QMP monitor if auto_finalize = false has been requested.

>   ##
> +# @block-job-finalize:
> +#
> +# Once a job that has manual=true reaches the pending state, it can be


Is this wording stale, given that you add two separate auto-* bool flags 
in 19/41?  You may want to prepare a followup patch (doc bug fixes are 
safe during softfreeze, so it need not hold up this pull request) that 
tweaks this and any similar stale wording.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-block] [PATCH 1/2] qcow2: Give the refcount cache the minimum possible size by default

2018-03-13 Thread Eric Blake


On 03/13/2018 10:02 AM, Alberto Garcia wrote:

The L2 and refcount caches have default sizes that can be overridden
using the l2-cache-size and refcount-cache-size parameters (an additional
parameter named cache-size sets the combined size of both caches).

Unless forced by one of the aforementioned parameters, QEMU will set
the unspecified sizes so that the L2 cache is 4 times larger than the
refcount cache.





However this patch takes a completely different approach and instead
of keeping a ratio between both cache sizes it assigns as much as
possible to the L2 cache and the remainder to the refcount cache.

The reason is that L2 tables are used for every single I/O request
from the guest and the effect of increasing the cache is significant
and clearly measurable. Refcount blocks are however only used for
cluster allocation and internal snapshots and in practice are accessed
sequentially in most cases, so the effect of increasing the cache is
negligible (even when doing random writes from the guest).

So, make the refcount cache as small as possible unless the user
explicitly asks for a larger one.


I like the reasoning given here.

I'd count this as a bugfix, safe even during freeze (but it's ultimately 
the maintainer's call)




Signed-off-by: Alberto Garcia 
---
  block/qcow2.c  | 31 +++
  block/qcow2.h  |  4 
  tests/qemu-iotests/137.out |  2 +-
  3 files changed, 20 insertions(+), 17 deletions(-)



+++ b/block/qcow2.c
@@ -802,23 +802,30 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
  } else if (refcount_cache_size_set) {
  *l2_cache_size = combined_cache_size - *refcount_cache_size;
  } else {
-*refcount_cache_size = combined_cache_size
- / (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
-*l2_cache_size = combined_cache_size - *refcount_cache_size;


In the old code, refcount_cache_size and l2_cache_size are both set to 
fractions of the combined size, so both are positive (even if 
combined_cache_size is too small for the minimums required)



+uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
+uint64_t min_refcount_cache =
+(uint64_t) MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+
+/* Assign as much memory as possible to the L2 cache, and
+ * use the remainder for the refcount cache */
+if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
+*l2_cache_size = max_l2_cache;
+*refcount_cache_size = combined_cache_size - *l2_cache_size;
+} else {
+*refcount_cache_size =
+MIN(combined_cache_size, min_refcount_cache);


but here, if combined_cache_size is smaller than min_refcount_cache,


+*l2_cache_size = combined_cache_size - *refcount_cache_size;


then l2_cache_size ends up as 0 (refcount_cache_size is clamped to
combined_cache_size, so the subtraction leaves nothing for the L2 cache).

I think you're missing bounds validations that the combined cache size 
is large enough for the minimums required.  Or maybe a slight tweak, if 
it is okay for one of the two caches to be sized at 0 (that is, if 
combined_cache_size is too small for refcount, can it instead be given 
100% to the l2 cache and let refcount be uncached)?
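
To make the edge case concrete, here is a standalone sketch of the new branch. split_combined_cache is a hypothetical name, and the two limits are passed in as plain numbers instead of being derived from the image, so this is an illustration of the arithmetic rather than the patch itself. Assuming the MIN() clamp behaves as quoted, a combined size below the refcount minimum leaves the L2 cache with 0 bytes:

```c
#include <stdint.h>

#ifndef MIN
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#endif

/*
 * Sketch of the "only the combined size is given" branch.
 * max_l2_cache and min_refcount_cache are inputs here to keep the
 * example self-contained.
 */
static void split_combined_cache(uint64_t combined_cache_size,
                                 uint64_t max_l2_cache,
                                 uint64_t min_refcount_cache,
                                 uint64_t *l2_cache_size,
                                 uint64_t *refcount_cache_size)
{
    if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
        /* Enough room: L2 gets its maximum, refcount gets the rest. */
        *l2_cache_size = max_l2_cache;
        *refcount_cache_size = combined_cache_size - *l2_cache_size;
    } else {
        /* Refcount gets its minimum, clamped to what is available. */
        *refcount_cache_size = MIN(combined_cache_size, min_refcount_cache);
        *l2_cache_size = combined_cache_size - *refcount_cache_size;
    }
}
```

Whether a zero-sized L2 cache is acceptable, or whether the combined size should be rejected up front, is the part that still needs a decision.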



+}
  }
  } else {
-if (!l2_cache_size_set && !refcount_cache_size_set) {
+if (!l2_cache_size_set) {
  *l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
   (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
   * s->cluster_size);
-*refcount_cache_size = *l2_cache_size
- / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-} else if (!l2_cache_size_set) {
-*l2_cache_size = *refcount_cache_size
-   * DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-} else if (!refcount_cache_size_set) {
-*refcount_cache_size = *l2_cache_size
- / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
+}
+if (!refcount_cache_size_set) {
+*refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
  }
  }
  

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

[Qemu-block] [PATCH v11 07/13] migration: include migrate_dirty_bitmaps in migrate_postcopy

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Enable postcopy if dirty bitmap migration is enabled.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Juan Quintela 
Reviewed-by: John Snow 
Reviewed-by: Fam Zheng 
---
 migration/migration.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index e0aff5c814..094196c236 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1508,7 +1508,7 @@ bool migrate_postcopy_ram(void)
 
 bool migrate_postcopy(void)
 {
-return migrate_postcopy_ram();
+return migrate_postcopy_ram() || migrate_dirty_bitmaps();
 }
 
 bool migrate_auto_converge(void)
-- 
2.11.1

[Qemu-block] [PATCH v11 01/13] block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor()

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Enabling bitmap successor is necessary to enable successors of bitmaps
being migrated before target vm start.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: John Snow 
Reviewed-by: Fam Zheng 
Message-id: 20180207155837.92351-2-vsement...@virtuozzo.com
Signed-off-by: John Snow 
---
 include/block/dirty-bitmap.h | 1 +
 block/dirty-bitmap.c | 8 
 2 files changed, 9 insertions(+)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 09efec609f..e0ebc96b01 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -21,6 +21,7 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
 BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
Error **errp);
+void bdrv_dirty_bitmap_enable_successor(BdrvDirtyBitmap *bitmap);
 BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
 const char *name);
 void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap);
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 909f0517f8..0d0e807216 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -234,6 +234,14 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
 return 0;
 }
 
+/* Called with BQL taken. */
+void bdrv_dirty_bitmap_enable_successor(BdrvDirtyBitmap *bitmap)
+{
+qemu_mutex_lock(bitmap->mutex);
+bdrv_enable_dirty_bitmap(bitmap->successor);
+qemu_mutex_unlock(bitmap->mutex);
+}
+
 /**
  * For a bitmap with a successor, yield our name to the successor,
  * delete the old bitmap, and return a handle to the new bitmap.
-- 
2.11.1

Re: [Qemu-block] [PATCH v11 05/13] migration: introduce postcopy-only pending

2018-03-13 Thread Dr. David Alan Gilbert

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> There would be savevm states (dirty-bitmap) which can migrate only in
> postcopy stage. The corresponding pending is introduced here.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  include/migration/register.h | 17 +++--
>  migration/savevm.h   |  5 +++--
>  hw/s390x/s390-stattrib.c |  7 ---
>  migration/block.c|  7 ---
>  migration/migration.c| 12 ++--
>  migration/ram.c  |  9 +
>  migration/savevm.c   | 13 -
>  migration/trace-events   |  2 +-
>  8 files changed, 46 insertions(+), 26 deletions(-)
> 
> diff --git a/include/migration/register.h b/include/migration/register.h
> index f4f7bdc177..9436a87678 100644
> --- a/include/migration/register.h
> +++ b/include/migration/register.h
> @@ -37,8 +37,21 @@ typedef struct SaveVMHandlers {
>  int (*save_setup)(QEMUFile *f, void *opaque);
>  void (*save_live_pending)(QEMUFile *f, void *opaque,
>uint64_t threshold_size,
> -  uint64_t *non_postcopiable_pending,
> -  uint64_t *postcopiable_pending);
> +  uint64_t *res_precopy_only,
> +  uint64_t *res_compatible,
> +  uint64_t *res_postcopy_only);
> +/* Note for save_live_pending:
> + * - res_precopy_only is for data which must be migrated in precopy phase
> + * or in stopped state, in other words - before target vm start
> + * - res_compatible is for data which may be migrated in any phase
> + * - res_postcopy_only is for data which must be migrated in postcopy phase
> + * or in stopped state, in other words - after source vm stop
> + *
> + * Sum of res_precopy_only, res_compatible and res_postcopy_only is the
> + * whole amount of pending data.
> + */
> +
> +
>  LoadStateHandler *load_state;
>  int (*load_setup)(QEMUFile *f, void *opaque);
>  int (*load_cleanup)(void *opaque);
> diff --git a/migration/savevm.h b/migration/savevm.h
> index 295c4a1f2c..cf4f0d37ca 100644
> --- a/migration/savevm.h
> +++ b/migration/savevm.h
> @@ -38,8 +38,9 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f);
>  int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
> bool inactivate_disks);
>  void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size,
> -   uint64_t *res_non_postcopiable,
> -   uint64_t *res_postcopiable);
> +   uint64_t *res_precopy_only,
> +   uint64_t *res_compatible,
> +   uint64_t *res_postcopy_only);
>  void qemu_savevm_send_ping(QEMUFile *f, uint32_t value);
>  void qemu_savevm_send_open_return_path(QEMUFile *f);
>  int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len);
> diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
> index adf07ef312..70b95550a8 100644
> --- a/hw/s390x/s390-stattrib.c
> +++ b/hw/s390x/s390-stattrib.c
> @@ -183,15 +183,16 @@ static int cmma_save_setup(QEMUFile *f, void *opaque)
>  }
>  
>  static void cmma_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
> - uint64_t *non_postcopiable_pending,
> - uint64_t *postcopiable_pending)
> +  uint64_t *res_precopy_only,
> +  uint64_t *res_compatible,
> +  uint64_t *res_postcopy_only)
>  {
>  S390StAttribState *sas = S390_STATTRIB(opaque);
>  S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
>  long long res = sac->get_dirtycount(sas);
>  
>  if (res >= 0) {
> -*non_postcopiable_pending += res;
> +*res_precopy_only += res;
>  }
>  }
>  
> diff --git a/migration/block.c b/migration/block.c
> index 41b95d1dd8..5c03632257 100644
> --- a/migration/block.c
> +++ b/migration/block.c
> @@ -864,8 +864,9 @@ static int block_save_complete(QEMUFile *f, void *opaque)
>  }
>  
>  static void block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
> -   uint64_t *non_postcopiable_pending,
> -   uint64_t *postcopiable_pending)
> +   uint64_t *res_precopy_only,
> +   uint64_t *res_compatible,
> +   uint64_t *res_postcopy_only)
>  {
>  /* Estimate pending number of bytes to send */
>  uint64_t pending;
> @@ -886,7 +887,7 @@ static void block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
>  
>  DPRINTF("Enter save live pending  %" PRIu64 "\n", pending);
>
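
The three-way split of pending data described in the save_live_pending note can be sketched as follows. PendingSizes and the two handler functions are hypothetical names chosen for this illustration; the real handlers take separate uint64_t pointers as shown in the quoted diff.

```c
#include <stdint.h>

/* The three pending counters a save_live_pending handler may add to. */
typedef struct PendingSizes {
    uint64_t precopy_only;   /* must be sent before the target VM starts */
    uint64_t compatible;     /* may be sent in any phase */
    uint64_t postcopy_only;  /* must be sent after the source VM stops */
} PendingSizes;

/* Hypothetical handler for a device that only supports precopy. */
static void precopy_device_pending(uint64_t dirty_bytes, PendingSizes *p)
{
    p->precopy_only += dirty_bytes;
}

/* Hypothetical handler for state that is only migrated in postcopy. */
static void postcopy_device_pending(uint64_t dirty_bytes, PendingSizes *p)
{
    p->postcopy_only += dirty_bytes;
}

/* Whole amount of pending data: the sum of all three counters. */
static uint64_t pending_total(const PendingSizes *p)
{
    return p->precopy_only + p->compatible + p->postcopy_only;
}
```

Each handler only adds to the counter that matches its constraints, so the caller can decide when to switch phases from the individual counters and still report a single total.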

[Qemu-block] [PATCH v11 10/13] migration: allow qmp command migrate-start-postcopy for any postcopy

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Allow migrate-start-postcopy for any postcopy type

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/migration.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 094196c236..59b4fe6090 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1022,7 +1022,7 @@ void qmp_migrate_start_postcopy(Error **errp)
 {
 MigrationState *s = migrate_get_current();
 
-if (!migrate_postcopy_ram()) {
+if (!migrate_postcopy()) {
 error_setg(errp, "Enable postcopy with migrate_set_capability before"
  " the start of migration");
 return;
-- 
2.11.1

[Qemu-block] [PATCH v11 11/13] migration: add postcopy migration of dirty bitmaps

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Postcopy migration of dirty bitmaps. Only named dirty bitmaps are migrated.

If the destination QEMU already has a dirty bitmap with the same name as a
migrated bitmap (for the same node), the migration proceeds only if their
granularities match; otherwise an error is generated.

If the destination QEMU has no such bitmap, it is created.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/migration/misc.h   |   3 +
 migration/migration.h  |   3 +
 migration/block-dirty-bitmap.c | 746 +
 migration/migration.c  |   5 +
 migration/savevm.c |   2 +
 vl.c   |   1 +
 migration/Makefile.objs|   1 +
 migration/trace-events |  14 +
 8 files changed, 775 insertions(+)
 create mode 100644 migration/block-dirty-bitmap.c

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 77fd4f587c..4ebf24c6c2 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -56,4 +56,7 @@ bool migration_has_failed(MigrationState *);
 bool migration_in_postcopy_after_devices(MigrationState *);
 void migration_global_dump(Monitor *mon);
 
+/* migration/block-dirty-bitmap.c */
+void dirty_bitmap_mig_init(void);
+
 #endif
diff --git a/migration/migration.h b/migration/migration.h
index da6bc37de8..a79540b99c 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -235,4 +235,7 @@ void migrate_send_rp_pong(MigrationIncomingState *mis,
 int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* rbname,
   ram_addr_t start, size_t len);
 
+void dirty_bitmap_mig_before_vm_start(void);
+void init_dirty_bitmap_incoming_migration(void);
+
 #endif
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
new file mode 100644
index 00..98ba4589e3
--- /dev/null
+++ b/migration/block-dirty-bitmap.c
@@ -0,0 +1,746 @@
+/*
+ * Block dirty bitmap postcopy migration
+ *
+ * Copyright IBM, Corp. 2009
+ * Copyright (c) 2016-2017 Virtuozzo International GmbH. All rights reserved.
+ *
+ * Authors:
+ *  Liran Schour   
+ *  Vladimir Sementsov-Ogievskiy 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ * This file is derived from migration/block.c, so its author and IBM copyright
+ * are here, although content is quite different.
+ *
+ * Contributions after 2012-01-13 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
+ *
+ ****
+ *
+ * Here postcopy migration of dirty bitmaps is realized. Only QMP-addressable
+ * bitmaps are migrated.
+ *
+ * Bitmap migration implies creating bitmap with the same name and granularity
+ * in destination QEMU. If the bitmap with the same name (for the same node)
+ * already exists on destination an error will be generated.
+ *
+ * format of migration:
+ *
+ * # Header (shared for different chunk types)
+ * 1, 2 or 4 bytes: flags (see qemu_{put,put}_flags)
+ * [ 1 byte: node name size ] \  flags & DEVICE_NAME
+ * [ n bytes: node name ] /
+ * [ 1 byte: bitmap name size ] \  flags & BITMAP_NAME
+ * [ n bytes: bitmap name ] /
+ *
+ * # Start of bitmap migration (flags & START)
+ * header
+ * be64: granularity
+ * 1 byte: bitmap flags (corresponds to BdrvDirtyBitmap)
+ *   bit 0-  bitmap is enabled
+ *   bit 1-  bitmap is persistent
+ *   bit 2-  bitmap is autoloading
+ *   bits 3-7 - reserved, must be zero
+ *
+ * # Complete of bitmap migration (flags & COMPLETE)
+ * header
+ *
+ * # Data chunk of bitmap migration
+ * header
+ * be64: start sector
+ * be32: number of sectors
+ * [ be64: buffer size  ] \ ! (flags & ZEROES)
+ * [ n bytes: buffer] /
+ *
+ * The last chunk in stream should contain flags & EOS. The chunk may skip
+ * device and/or bitmap names, assuming them to be the same with the previous
+ * chunk.
+ */
+
+#include "qemu/osdep.h"
+#include "block/block.h"
+#include "block/block_int.h"
+#include "sysemu/block-backend.h"
+#include "qemu/main-loop.h"
+#include "qemu/error-report.h"
+#include "migration/misc.h"
+#include "migration/migration.h"
+#include "migration/qemu-file.h"
+#include "migration/vmstate.h"
+#include "migration/register.h"
+#include "qemu/hbitmap.h"
+#include "sysemu/sysemu.h"
+#include "qemu/cutils.h"
+#include "qapi/error.h"
+#include "trace.h"
+
+#define CHUNK_SIZE (1 << 10)
+
+/* Flags occupy one, two or four bytes (Big Endian). The size is determined as
+ * follows:
+ * in first (most significant) byte bit 8 is clear  -->  one byte
+ * in first byte bit 8 is set-->  two or four bytes, depending on second
+ *byte:
+ *| in second byte bit 8 is clear  -->  two bytes
+ *| in second byte bit 8 is set-->  four bytes
+ */
+#define
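
The flag-width rule described in the comment above can be sketched as a small helper. This is an illustration only: flags_field_size and EXTRA_FLAGS_BIT are hypothetical names, and the sketch assumes that "bit 8" in the comment refers to the most significant bit of a byte (0x80).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: "bit 8" in the comment means the top bit of a byte. */
#define EXTRA_FLAGS_BIT 0x80

/*
 * Hypothetical helper: given the first bytes of an encoded flags field,
 * return how many bytes the field occupies (1, 2 or 4), or 0 if len is
 * too short to decide.
 */
static size_t flags_field_size(const uint8_t *buf, size_t len)
{
    if (len < 1) {
        return 0;
    }
    if (!(buf[0] & EXTRA_FLAGS_BIT)) {
        return 1;
    }
    if (len < 2) {
        return 0;
    }
    return (buf[1] & EXTRA_FLAGS_BIT) ? 4 : 2;
}
```

The scheme keeps the common case at one byte while leaving room to grow the flag set without changing the stream layout for existing flags.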

[Qemu-block] [PATCH v11 06/13] qapi: add dirty-bitmaps migration capability

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Juan Quintela 
Reviewed-by: Fam Zheng 
---
 qapi/migration.json   | 6 +-
 migration/migration.h | 1 +
 migration/migration.c | 9 +
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 7f465a1902..9d0bf82cf4 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -354,12 +354,16 @@
 #
 # @x-multifd: Use more than one fd for migration (since 2.11)
 #
+# @dirty-bitmaps: If enabled, QEMU will migrate named dirty bitmaps.
+# (since 2.12)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
-   'block', 'return-path', 'pause-before-switchover', 'x-multifd' ] }
+   'block', 'return-path', 'pause-before-switchover', 'x-multifd',
+   'dirty-bitmaps' ] }
 
 ##
 # @MigrationCapabilityStatus:
diff --git a/migration/migration.h b/migration/migration.h
index 08c5d2ded1..da6bc37de8 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -205,6 +205,7 @@ bool migrate_postcopy(void);
 bool migrate_release_ram(void);
 bool migrate_postcopy_ram(void);
 bool migrate_zero_blocks(void);
+bool migrate_dirty_bitmaps(void);
 
 bool migrate_auto_converge(void);
 bool migrate_use_multifd(void);
diff --git a/migration/migration.c b/migration/migration.c
index 90307f8ab5..e0aff5c814 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1565,6 +1565,15 @@ int migrate_decompress_threads(void)
 return s->parameters.decompress_threads;
 }
 
+bool migrate_dirty_bitmaps(void)
+{
+MigrationState *s;
+
+s = migrate_get_current();
+
+return s->enabled_capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS];
+}
+
 bool migrate_use_events(void)
 {
 MigrationState *s;
-- 
2.11.1

[Qemu-block] [PATCH v11 04/13] dirty-bitmap: add locked state

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Add a special state in which QMP operations on the bitmap are disabled.
It is needed during bitmap migration. The "frozen" state is not
appropriate here, because it implies that the bitmap is unchanged.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: John Snow 
Message-id: 20180207155837.92351-5-vsement...@virtuozzo.com
[Adjusted comment and spacing. --js]
Signed-off-by: John Snow 
---
 qapi/block-core.json |  5 -
 include/block/dirty-bitmap.h |  3 +++
 block/dirty-bitmap.c | 16 
 blockdev.c   | 19 +++
 4 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 524d51567a..2b378f510a 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -426,10 +426,13 @@
 # @active: The bitmap is actively monitoring for new writes, and can be cleared,
 #  deleted, or used for backup operations.
 #
+# @locked: The bitmap is currently in-use by some operation and can not be
+#  cleared, deleted, or used for backup operations. (Since 2.12)
+#
 # Since: 2.4
 ##
 { 'enum': 'DirtyBitmapStatus',
-  'data': ['active', 'disabled', 'frozen'] }
+  'data': ['active', 'disabled', 'frozen', 'locked'] }
 
 ##
 # @BlockDirtyInfo:
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 5c239be74d..1ff8949b1b 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -69,6 +69,8 @@ void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_set_readonly(BdrvDirtyBitmap *bitmap, bool value);
 void bdrv_dirty_bitmap_set_persistance(BdrvDirtyBitmap *bitmap,
bool persistent);
+void bdrv_dirty_bitmap_set_qmp_locked(BdrvDirtyBitmap *bitmap, bool qmp_locked);
+
 
 /* Functions that require manual locking.  */
 void bdrv_dirty_bitmap_lock(BdrvDirtyBitmap *bitmap);
@@ -88,6 +90,7 @@ bool bdrv_dirty_bitmap_readonly(const BdrvDirtyBitmap *bitmap);
 bool bdrv_has_readonly_bitmaps(BlockDriverState *bs);
 bool bdrv_dirty_bitmap_get_autoload(const BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_get_persistance(BdrvDirtyBitmap *bitmap);
+bool bdrv_dirty_bitmap_qmp_locked(BdrvDirtyBitmap *bitmap);
 bool bdrv_has_changed_persistent_bitmaps(BlockDriverState *bs);
 BdrvDirtyBitmap *bdrv_dirty_bitmap_next(BlockDriverState *bs,
 BdrvDirtyBitmap *bitmap);
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index ce00ff3474..967159479d 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -40,6 +40,8 @@ struct BdrvDirtyBitmap {
 QemuMutex *mutex;
 HBitmap *bitmap;/* Dirty bitmap implementation */
 HBitmap *meta;  /* Meta dirty bitmap */
+bool qmp_locked;/* Bitmap is locked, it can't be modified
+   through QMP */
 BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
 char *name; /* Optional non-empty unique ID */
 int64_t size;   /* Size of the bitmap, in bytes */
@@ -183,6 +185,18 @@ bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
 return bitmap->successor;
 }
 
+void bdrv_dirty_bitmap_set_qmp_locked(BdrvDirtyBitmap *bitmap, bool qmp_locked)
+{
+qemu_mutex_lock(bitmap->mutex);
+bitmap->qmp_locked = qmp_locked;
+qemu_mutex_unlock(bitmap->mutex);
+}
+
+bool bdrv_dirty_bitmap_qmp_locked(BdrvDirtyBitmap *bitmap)
+{
+return bitmap->qmp_locked;
+}
+
 /* Called with BQL taken.  */
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
 {
@@ -194,6 +208,8 @@ DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
 {
 if (bdrv_dirty_bitmap_frozen(bitmap)) {
 return DIRTY_BITMAP_STATUS_FROZEN;
+} else if (bdrv_dirty_bitmap_qmp_locked(bitmap)) {
+return DIRTY_BITMAP_STATUS_LOCKED;
 } else if (!bdrv_dirty_bitmap_enabled(bitmap)) {
 return DIRTY_BITMAP_STATUS_DISABLED;
 } else {
diff --git a/blockdev.c b/blockdev.c
index 1fbfd3a2c4..b9de18f3b2 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2118,6 +2118,9 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
 if (bdrv_dirty_bitmap_frozen(state->bitmap)) {
 error_setg(errp, "Cannot modify a frozen bitmap");
 return;
+} else if (bdrv_dirty_bitmap_qmp_locked(state->bitmap)) {
+error_setg(errp, "Cannot modify a locked bitmap");
+return;
 } else if (!bdrv_dirty_bitmap_enabled(state->bitmap)) {
 error_setg(errp, "Cannot clear a disabled bitmap");
 return;
@@ -2862,6 +2865,11 @@ void qmp_block_dirty_bitmap_remove(const char *node, const char *name,
"Bitmap '%s' is currently frozen and cannot be removed",
name);
 return;
+} else if (bdrv_dirty_bitmap_qmp_locked(bitmap)) {
+error_setg(errp,
+
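
The precedence that the status function gives the new state (frozen first, then locked, then disabled, else active) can be sketched in isolation. BitmapState and bitmap_status are simplified, hypothetical stand-ins for BdrvDirtyBitmap and bdrv_dirty_bitmap_status(); only the ordering is taken from the diff.

```c
#include <stdbool.h>

/* Mirrors the DirtyBitmapStatus values listed in the QAPI enum. */
typedef enum {
    STATUS_ACTIVE,
    STATUS_DISABLED,
    STATUS_FROZEN,
    STATUS_LOCKED,
} BitmapStatus;

/* Simplified stand-in for the fields the status function consults. */
typedef struct {
    bool has_successor;  /* a successor implies the frozen status */
    bool qmp_locked;
    bool enabled;
} BitmapState;

static BitmapStatus bitmap_status(const BitmapState *b)
{
    if (b->has_successor) {
        return STATUS_FROZEN;
    } else if (b->qmp_locked) {
        return STATUS_LOCKED;
    } else if (!b->enabled) {
        return STATUS_DISABLED;
    }
    return STATUS_ACTIVE;
}
```

Because the locked check comes before the enabled check, a bitmap that is both locked and disabled reports "locked".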

[Qemu-block] [PATCH v11 12/13] iotests: add dirty bitmap migration test

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

The test starts two VMs (vm_a, vm_b), creates a dirty bitmap in
the first one, does several writes to the corresponding device and
then migrates vm_a to vm_b.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/169 | 156 +
 tests/qemu-iotests/169.out |   5 ++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 162 insertions(+)
 create mode 100755 tests/qemu-iotests/169
 create mode 100644 tests/qemu-iotests/169.out

diff --git a/tests/qemu-iotests/169 b/tests/qemu-iotests/169
new file mode 100755
index 00..3a8db91f6f
--- /dev/null
+++ b/tests/qemu-iotests/169
@@ -0,0 +1,156 @@
+#!/usr/bin/env python
+#
+# Tests for dirty bitmaps migration.
+#
+# Copyright (c) 2016-2017 Virtuozzo International GmbH. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+import iotests
+import time
+import itertools
+import operator
+import new
+from iotests import qemu_img
+
+
+disk_a = os.path.join(iotests.test_dir, 'disk_a')
+disk_b = os.path.join(iotests.test_dir, 'disk_b')
+size = '1M'
+mig_file = os.path.join(iotests.test_dir, 'mig_file')
+
+
+class TestDirtyBitmapMigration(iotests.QMPTestCase):
+def tearDown(self):
+self.vm_a.shutdown()
+self.vm_b.shutdown()
+os.remove(disk_a)
+os.remove(disk_b)
+os.remove(mig_file)
+
+def setUp(self):
+qemu_img('create', '-f', iotests.imgfmt, disk_a, size)
+qemu_img('create', '-f', iotests.imgfmt, disk_b, size)
+
+self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a)
+self.vm_a.launch()
+
+self.vm_b = iotests.VM(path_suffix='b')
+self.vm_b.add_incoming("exec: cat '" + mig_file + "'")
+
+def add_bitmap(self, vm, granularity, persistent):
+params = {'node': 'drive0',
+  'name': 'bitmap0',
+  'granularity': granularity}
+if persistent:
+params['persistent'] = True
+params['autoload'] = True
+
+result = vm.qmp('block-dirty-bitmap-add', **params)
+self.assert_qmp(result, 'return', {});
+
+def get_bitmap_hash(self, vm):
+result = vm.qmp('x-debug-block-dirty-bitmap-sha256',
+node='drive0', name='bitmap0')
+return result['return']['sha256']
+
+def check_bitmap(self, vm, sha256):
+result = vm.qmp('x-debug-block-dirty-bitmap-sha256',
+node='drive0', name='bitmap0')
+if sha256:
+self.assert_qmp(result, 'return/sha256', sha256);
+else:
+self.assert_qmp(result, 'error/desc',
+"Dirty bitmap 'bitmap0' not found");
+
+def do_test_migration(self, persistent, migrate_bitmaps, online,
+  shared_storage):
+granularity = 512
+
+# regions = ((start, count), ...)
+regions = ((0, 0x1),
+   (0xf, 0x1),
+   (0xa0201, 0x1000))
+
+should_migrate = migrate_bitmaps or persistent and shared_storage
+
+self.vm_b.add_drive(disk_a if shared_storage else disk_b)
+
+if online:
+os.mkfifo(mig_file)
+self.vm_b.launch()
+
+self.add_bitmap(self.vm_a, granularity, persistent)
+for r in regions:
+self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % r)
+sha256 = self.get_bitmap_hash(self.vm_a)
+
+if migrate_bitmaps:
+capabilities = [{'capability': 'dirty-bitmaps', 'state': True}]
+
+result = self.vm_a.qmp('migrate-set-capabilities',
+   capabilities=capabilities)
+self.assert_qmp(result, 'return', {})
+
+if online:
+result = self.vm_b.qmp('migrate-set-capabilities',
+   capabilities=capabilities)
+self.assert_qmp(result, 'return', {})
+
+result = self.vm_a.qmp('migrate-set-capabilities',
+   capabilities=[{'capability': 'events',
+  'state': True}])
+self.assert_qmp(result, 'return', {})
+
+result = self.vm_a.qmp('migrate', uri='exec:cat>' + mig_file)
+while True:
+event = self.vm_a.event_wait('MIGRATION')
+if

[Qemu-block] [PATCH v11 09/13] migration: add is_active_iterate handler

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Postcopy-only savevm states (dirty-bitmap) don't need live iteration, so
add a new savevm handler that disables iteration for them and stops
empty sections from being transported.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Juan Quintela 
Reviewed-by: John Snow 
Reviewed-by: Fam Zheng 
---
 include/migration/register.h | 9 +
 migration/savevm.c   | 5 +
 2 files changed, 14 insertions(+)

diff --git a/include/migration/register.h b/include/migration/register.h
index 9436a87678..f6f12f9b1a 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -26,6 +26,15 @@ typedef struct SaveVMHandlers {
 bool (*is_active)(void *opaque);
 bool (*has_postcopy)(void *opaque);
 
+/* is_active_iterate
+ * If it is not NULL then qemu_savevm_state_iterate will skip iteration if
+ * it returns false. For example, it is needed for only-postcopy-states,
+ * which need to be handled by qemu_savevm_state_setup and
+ * qemu_savevm_state_pending, but do not need iterations until the
+ * postcopy stage is reached.
+ */
+bool (*is_active_iterate)(void *opaque);
+
 /* This runs outside the iothread lock in the migration case, and
  * within the lock in the savevm case.  The callback had better only
  * use data that is local to the migration thread or protected
diff --git a/migration/savevm.c b/migration/savevm.c
index cd5944b81f..a60819ec2e 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1028,6 +1028,11 @@ int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
 continue;
 }
 }
+if (se->ops && se->ops->is_active_iterate) {
+if (!se->ops->is_active_iterate(se->opaque)) {
+continue;
+}
+}
 /*
  * In the postcopy phase, any device that doesn't know how to
  * do postcopy should have saved it's state in the _complete
-- 
2.11.1
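
The effect of the new hook on the iteration loop can be sketched with a reduced model. Handlers, Entry and iterate_once are hypothetical simplifications of SaveVMHandlers, SaveStateEntry and qemu_savevm_state_iterate(); the other checks the real loop performs are omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-in for SaveVMHandlers: only the hooks used here. */
typedef struct {
    bool (*is_active_iterate)(void *opaque);  /* optional, may be NULL */
    void (*save_live_iterate)(void *opaque);
} Handlers;

typedef struct {
    const Handlers *ops;
    void *opaque;
} Entry;

/* Run one iteration pass; return how many entries were iterated. */
static size_t iterate_once(Entry *entries, size_t n)
{
    size_t iterated = 0;

    for (size_t i = 0; i < n; i++) {
        Entry *se = &entries[i];

        if (!se->ops || !se->ops->save_live_iterate) {
            continue;
        }
        /* New hook: a handler can opt out of this pass. */
        if (se->ops->is_active_iterate &&
            !se->ops->is_active_iterate(se->opaque)) {
            continue;
        }
        se->ops->save_live_iterate(se->opaque);
        iterated++;
    }
    return iterated;
}

/* Example callbacks: a postcopy-only state declines live iteration. */
static bool never_iterate(void *opaque) { (void)opaque; return false; }
static void count_call(void *opaque) { ++*(int *)opaque; }
```

Handlers that leave is_active_iterate as NULL keep their current behaviour, so only states that set the hook are affected.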

[Qemu-block] [PATCH v11 13/13] iotests: add dirty bitmap postcopy test

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Test
- start two vms (vm_a, vm_b)

- in a
    - do writes from set A
    - do writes from set B
    - record bitmap sha256
    - clear bitmap
    - do writes from set A
    - start migration
- then, in b
    - wait vm start (postcopy should start)
    - do writes from set B
    - check bitmap sha256

The test should verify postcopy migration and then merging with delta
(changes in target, during postcopy process).

Reduce supported cache modes to only 'none', because with caching enabled the
time from source.STOP to target.RESUME is unpredictable and we can fail with a
timeout while waiting for target.RESUME.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/199| 118 ++
 tests/qemu-iotests/199.out|   5 ++
 tests/qemu-iotests/group  |   1 +
 tests/qemu-iotests/iotests.py |   7 ++-
 4 files changed, 130 insertions(+), 1 deletion(-)
 create mode 100755 tests/qemu-iotests/199
 create mode 100644 tests/qemu-iotests/199.out

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
new file mode 100755
index 00..651e8df5d9
--- /dev/null
+++ b/tests/qemu-iotests/199
@@ -0,0 +1,118 @@
+#!/usr/bin/env python
+#
+# Tests for dirty bitmaps postcopy migration.
+#
+# Copyright (c) 2016-2017 Virtuozzo International GmbH. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+import iotests
+import time
+from iotests import qemu_img
+
+disk_a = os.path.join(iotests.test_dir, 'disk_a')
+disk_b = os.path.join(iotests.test_dir, 'disk_b')
+size = '256G'
+fifo = os.path.join(iotests.test_dir, 'mig_fifo')
+
+class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
+
+    def tearDown(self):
+        self.vm_a.shutdown()
+        self.vm_b.shutdown()
+        os.remove(disk_a)
+        os.remove(disk_b)
+        os.remove(fifo)
+
+    def setUp(self):
+        os.mkfifo(fifo)
+        qemu_img('create', '-f', iotests.imgfmt, disk_a, size)
+        qemu_img('create', '-f', iotests.imgfmt, disk_b, size)
+        self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a)
+        self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+        self.vm_b.add_incoming("exec: cat '" + fifo + "'")
+        self.vm_a.launch()
+        self.vm_b.launch()
+
+    def test_postcopy(self):
+        write_size = 0x4000
+        granularity = 512
+        chunk = 4096
+
+        result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
+                               name='bitmap', granularity=granularity)
+        self.assert_qmp(result, 'return', {});
+
+        s = 0
+        while s < write_size:
+            self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+            s += 0x1
+        s = 0x8000
+        while s < write_size:
+            self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+            s += 0x1
+
+        result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+                               node='drive0', name='bitmap')
+        sha256 = result['return']['sha256']
+
+        result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
+                               name='bitmap')
+        self.assert_qmp(result, 'return', {});
+        s = 0
+        while s < write_size:
+            self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+            s += 0x1
+
+        bitmaps_cap = {'capability': 'dirty-bitmaps', 'state': True}
+        events_cap = {'capability': 'events', 'state': True}
+
+        result = self.vm_a.qmp('migrate-set-capabilities',
+                               capabilities=[bitmaps_cap, events_cap])
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm_b.qmp('migrate-set-capabilities',
+                               capabilities=[bitmaps_cap])
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm_a.qmp('migrate', uri='exec:cat>' + fifo)
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm_a.qmp('migrate-start-postcopy')
+        self.assert_qmp(result, 'return', {})
+
+        while True:
+            event = self.vm_a.event_wait('MIGRATION')
+            if event['data']['status'] == 'completed':
+                break
+
+        s = 0x8000
+        while s < write_size:
+            self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+

Re: [Qemu-block] [PATCH v10 10/12] migration: add postcopy migration of dirty bitmaps

2018-03-13 Thread Dr. David Alan Gilbert

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> 12.03.2018 19:09, Dr. David Alan Gilbert wrote:
> > * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> > > Postcopy migration of dirty bitmaps. Only named dirty bitmaps are 
> > > migrated.
> > > 
> > > +
> > > +init_dirty_bitmap_incoming_migration();
> > > +
> > You might want to consider if that's better in vl.c near where
> > ram_mig_init() is, OR whether there should be a call in
> > migration_incoming_state_destroy to clean it up.
> > (Although I doubt the cases where the destroy happens are interesting
> > for postcopy bitmaps).
> 
> If you don't mind, let's leave it as is for now

Yep, that's OK.

Dave

> > 
> > >   once = true;
> > >   }
> > >   return &mis_current;
> > > @@ -297,6 +300,8 @@ static void process_incoming_migration_bh(void 
> > > *opaque)
> > >  state, we need to obey autostart. Any other state is set with
> > >  runstate_set. */
> > > +dirty_bitmap_mig_before_vm_start();
> > > +
> > >   if (!global_state_received() ||
> > >   global_state_get_runstate() == RUN_STATE_RUNNING) {
> > >   if (autostart) {
> > > diff --git a/migration/savevm.c b/migration/savevm.c
> > > index e5d557458e..93b339646b 100644
> > > --- a/migration/savevm.c
> > > +++ b/migration/savevm.c
> > > @@ -1673,6 +1673,8 @@ static void loadvm_postcopy_handle_run_bh(void 
> > > *opaque)
> > >   trace_loadvm_postcopy_handle_run_vmstart();
> > > +dirty_bitmap_mig_before_vm_start();
> > > +
> > >   if (autostart) {
> > >   /* Hold onto your hats, starting the CPU */
> > >   vm_start();
> > > diff --git a/vl.c b/vl.c
> > > index e517a8d995..0ef3f2b5a2 100644
> > > --- a/vl.c
> > > +++ b/vl.c
> > > @@ -4514,6 +4514,7 @@ int main(int argc, char **argv, char **envp)
> > >   blk_mig_init();
> > >   ram_mig_init();
> > > +dirty_bitmap_mig_init();
> > >   /* If the currently selected machine wishes to override the 
> > > units-per-bus
> > >* property of its default HBA interface type, do so now. */
> > > diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> > > index 99e038024d..c83ec47ba8 100644
> > > --- a/migration/Makefile.objs
> > > +++ b/migration/Makefile.objs
> > > @@ -6,6 +6,7 @@ common-obj-y += qemu-file.o global_state.o
> > >   common-obj-y += qemu-file-channel.o
> > >   common-obj-y += xbzrle.o postcopy-ram.o
> > >   common-obj-y += qjson.o
> > > +common-obj-y += block-dirty-bitmap.o
> > >   common-obj-$(CONFIG_RDMA) += rdma.o
> > > diff --git a/migration/trace-events b/migration/trace-events
> > > index a04fffb877..e9eb8078d4 100644
> > > --- a/migration/trace-events
> > > +++ b/migration/trace-events
> > > @@ -227,3 +227,17 @@ colo_vm_state_change(const char *old, const char 
> > > *new) "Change '%s' => '%s'"
> > >   colo_send_message(const char *msg) "Send '%s' message"
> > >   colo_receive_message(const char *msg) "Receive '%s' message"
> > >   colo_failover_set_state(const char *new_state) "new state %s"
> > > +
> > > +# migration/block-dirty-bitmap.c
> > > +send_bitmap_header_enter(void) ""
> > > +send_bitmap_bits(uint32_t flags, uint64_t start_sector, uint32_t 
> > > nr_sectors, uint64_t data_size) "\n   flags:0x%x\n   
> > > start_sector: %" PRIu64 "\n   nr_sectors:   %" PRIu32 "\n   data_size:
> > > %" PRIu64 "\n"
> > Tracing doesn't have \n's in
> 
> will fix.
> 
> > 
> > > +dirty_bitmap_save_iterate(int in_postcopy) "in postcopy: %d"
> > > +dirty_bitmap_save_complete_enter(void) ""
> > > +dirty_bitmap_save_complete_finish(void) ""
> > > +dirty_bitmap_save_pending(uint64_t pending, uint64_t max_size) "pending 
> > > %" PRIu64 " max: %" PRIu64
> > > +dirty_bitmap_load_complete(void) ""
> > > +dirty_bitmap_load_bits_enter(uint64_t first_sector, uint32_t nr_sectors) 
> > > "chunk: %" PRIu64 " %" PRIu32
> > > +dirty_bitmap_load_bits_zeroes(void) ""
> > > +dirty_bitmap_load_header(uint32_t flags) "flags 0x%x"
> > > +dirty_bitmap_load_enter(void) ""
> > > +dirty_bitmap_load_success(void) ""
> > So other than minor bits, this one looks OK from a migration side; I
> > can't say I've followed the block side of the patch though.
> > 
> > Dave
> > 
> > > -- 
> > > 2.11.1
> > > 
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
> 
> -- 
> Best regards,
> Vladimir
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-block] [PATCH v11 10/13] migration: allow qmp command migrate-start-postcopy for any postcopy

2018-03-13 Thread Dr. David Alan Gilbert

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> Allow migrate-start-postcopy for any postcopy type
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/migration.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 094196c236..59b4fe6090 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1022,7 +1022,7 @@ void qmp_migrate_start_postcopy(Error **errp)
>  {
>  MigrationState *s = migrate_get_current();
>  
> -if (!migrate_postcopy_ram()) {
> +if (!migrate_postcopy()) {
>  error_setg(errp, "Enable postcopy with migrate_set_capability before"
>   " the start of migration");
>  return;
> -- 
> 2.11.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-block] [PATCH v11 08/13] migration/qemu-file: add qemu_put_counted_string()

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Add a function that is the inverse of qemu_get_counted_string().
qemu_put_counted_string() writes a one-byte length (so the string must
not be longer than 255 characters), followed by the string itself
without the terminating zero byte.
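
The wire format described above can be illustrated with a minimal Python sketch. This is not QEMU code: the names put_counted_string and get_counted_string below are hypothetical stand-ins that operate on an in-memory byte stream rather than a QEMUFile, and UTF-8 encoding is an assumption made only for the illustration.

```python
import io


def put_counted_string(stream, text):
    """Write a one-byte length followed by the raw bytes, with no trailing NUL."""
    data = text.encode("utf-8")
    if len(data) >= 256:
        # Mirrors the assert(len < 256) in the patch: the length must fit in one byte.
        raise ValueError("counted string must be shorter than 256 bytes")
    stream.write(bytes([len(data)]))
    stream.write(data)


def get_counted_string(stream):
    """Read back a string written by put_counted_string()."""
    header = stream.read(1)
    if len(header) != 1:
        raise EOFError("missing length byte")
    length = header[0]
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("truncated counted string")
    return data.decode("utf-8")


buf = io.BytesIO()
put_counted_string(buf, "bitmap0")
encoded = buf.getvalue()
buf.seek(0)
decoded = get_counted_string(buf)
```

Because the length is stored explicitly, the reader never has to scan for a terminator, which is why the zero byte is left off the wire.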

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: John Snow 
Reviewed-by: Juan Quintela 
Reviewed-by: Fam Zheng 
---
 migration/qemu-file.h |  2 ++
 migration/qemu-file.c | 13 +
 2 files changed, 15 insertions(+)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index aae4e5ed36..f4f356ab12 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -174,4 +174,6 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t 
block_offset,
  ram_addr_t offset, size_t size,
  uint64_t *bytes_sent);
 
+void qemu_put_counted_string(QEMUFile *f, const char *name);
+
 #endif
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 2ab2bf362d..e85f501f86 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -734,6 +734,19 @@ size_t qemu_get_counted_string(QEMUFile *f, char buf[256])
 }
 
 /*
+ * Put a string with one preceding byte containing its length. The length of
+ * the string should be less than 256.
+ */
+void qemu_put_counted_string(QEMUFile *f, const char *str)
+{
+    size_t len = strlen(str);
+
+    assert(len < 256);
+    qemu_put_byte(f, len);
+    qemu_put_buffer(f, (const uint8_t *)str, len);
+}
+
+/*
  * Set the blocking state of the QEMUFile.
  * Note: On some transports the OS only keeps a single blocking state for
  *   both directions, and thus changing the blocking on the main
-- 
2.11.1

[Qemu-block] [PATCH v11 05/13] migration: introduce postcopy-only pending

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Some savevm states (dirty-bitmap) can be migrated only in the postcopy
stage. Introduce the corresponding pending counter here.
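
As a rough illustration of the three-way split, here is a small Python sketch. It is a model of the accounting only, not the QEMU implementation: the Pending class, collect_pending() and the three handler functions are hypothetical, and the byte counts are made up. Classifying RAM as "compatible" is an assumption for the example.

```python
from dataclasses import dataclass


@dataclass
class Pending:
    precopy_only: int = 0   # must be sent before the target VM starts
    compatible: int = 0     # may be sent in any phase
    postcopy_only: int = 0  # must be sent after the source VM stops

    @property
    def total(self):
        # The whole amount of pending data is the sum of all three counters.
        return self.precopy_only + self.compatible + self.postcopy_only


def collect_pending(handlers):
    """Let every handler add its share to the counter it belongs to."""
    pending = Pending()
    for handler in handlers:
        handler(pending)
    return pending


def block_pending(p):
    p.precopy_only += 4096      # block migration does not do postcopy


def ram_pending(p):
    p.compatible += 8192        # assumed: RAM can go in either phase


def bitmap_pending(p):
    p.postcopy_only += 512      # dirty bitmaps migrate only in postcopy


result = collect_pending([block_pending, ram_pending, bitmap_pending])
```

Each handler only increments the counter that matches its own constraints, so the caller can decide when to switch phases by comparing the individual counters rather than a single total.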

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/migration/register.h | 17 +++--
 migration/savevm.h   |  5 +++--
 hw/s390x/s390-stattrib.c |  7 ---
 migration/block.c|  7 ---
 migration/migration.c| 12 ++--
 migration/ram.c  |  9 +
 migration/savevm.c   | 13 -
 migration/trace-events   |  2 +-
 8 files changed, 46 insertions(+), 26 deletions(-)

diff --git a/include/migration/register.h b/include/migration/register.h
index f4f7bdc177..9436a87678 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -37,8 +37,21 @@ typedef struct SaveVMHandlers {
 int (*save_setup)(QEMUFile *f, void *opaque);
 void (*save_live_pending)(QEMUFile *f, void *opaque,
   uint64_t threshold_size,
-  uint64_t *non_postcopiable_pending,
-  uint64_t *postcopiable_pending);
+  uint64_t *res_precopy_only,
+  uint64_t *res_compatible,
+  uint64_t *res_postcopy_only);
+/* Note for save_live_pending:
+ * - res_precopy_only is for data which must be migrated in precopy phase
+ * or in stopped state, in other words - before target vm start
+ * - res_compatible is for data which may be migrated in any phase
+ * - res_postcopy_only is for data which must be migrated in postcopy phase
+ * or in stopped state, in other words - after source vm stop
+ *
+ * Sum of res_precopy_only, res_compatible and res_postcopy_only is the
+ * whole amount of pending data.
+ */
+
+
 LoadStateHandler *load_state;
 int (*load_setup)(QEMUFile *f, void *opaque);
 int (*load_cleanup)(void *opaque);
diff --git a/migration/savevm.h b/migration/savevm.h
index 295c4a1f2c..cf4f0d37ca 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -38,8 +38,9 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f);
 int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
bool inactivate_disks);
 void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size,
-   uint64_t *res_non_postcopiable,
-   uint64_t *res_postcopiable);
+   uint64_t *res_precopy_only,
+   uint64_t *res_compatible,
+   uint64_t *res_postcopy_only);
 void qemu_savevm_send_ping(QEMUFile *f, uint32_t value);
 void qemu_savevm_send_open_return_path(QEMUFile *f);
 int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len);
diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
index adf07ef312..70b95550a8 100644
--- a/hw/s390x/s390-stattrib.c
+++ b/hw/s390x/s390-stattrib.c
@@ -183,15 +183,16 @@ static int cmma_save_setup(QEMUFile *f, void *opaque)
 }
 
 static void cmma_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
- uint64_t *non_postcopiable_pending,
- uint64_t *postcopiable_pending)
+  uint64_t *res_precopy_only,
+  uint64_t *res_compatible,
+  uint64_t *res_postcopy_only)
 {
 S390StAttribState *sas = S390_STATTRIB(opaque);
 S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
 long long res = sac->get_dirtycount(sas);
 
 if (res >= 0) {
-*non_postcopiable_pending += res;
+*res_precopy_only += res;
 }
 }
 
diff --git a/migration/block.c b/migration/block.c
index 41b95d1dd8..5c03632257 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -864,8 +864,9 @@ static int block_save_complete(QEMUFile *f, void *opaque)
 }
 
 static void block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
-   uint64_t *non_postcopiable_pending,
-   uint64_t *postcopiable_pending)
+   uint64_t *res_precopy_only,
+   uint64_t *res_compatible,
+   uint64_t *res_postcopy_only)
 {
 /* Estimate pending number of bytes to send */
 uint64_t pending;
@@ -886,7 +887,7 @@ static void block_save_pending(QEMUFile *f, void *opaque, 
uint64_t max_size,
 
 DPRINTF("Enter save live pending  %" PRIu64 "\n", pending);
 /* We don't do postcopy */
-*non_postcopiable_pending += pending;
+*res_precopy_only += pending;
 }
 
 static int block_load(QEMUFile *f, void *opaque, int version_id)
diff --git a/migration/migration.c b/migration/migration.c
index 6a4780ef6f..90307f8ab5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@

[Qemu-block] [PATCH v11 02/13] block/dirty-bitmap: fix locking in bdrv_reclaim_dirty_bitmap

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Like other setters here, these functions should take a lock.
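
The shape of the change (a public function that takes the lock, plus a "_locked" helper for callers that already hold it) can be sketched in Python. This is an analogy under stated assumptions, not the QEMU code: BitmapRegistry and its methods are hypothetical, and a non-reentrant threading.Lock stands in for the bitmap mutex.

```python
import threading


class BitmapRegistry:
    def __init__(self):
        self._lock = threading.Lock()   # non-reentrant, like a plain mutex
        self._bitmaps = set()
        self._successor = {}

    def add(self, name, successor_of=None):
        with self._lock:
            self._bitmaps.add(name)
            if successor_of is not None:
                self._successor[successor_of] = name

    def _release_locked(self, name):
        """Caller must already hold self._lock."""
        self._bitmaps.remove(name)

    def release(self, name):
        with self._lock:
            self._release_locked(name)

    def reclaim(self, parent):
        with self._lock:
            successor = self._successor.pop(parent, None)
            if successor is None:
                raise LookupError("no successor to reclaim")
            # Calling self.release() here would try to take the lock twice
            # and deadlock, so the _locked variant is used instead.
            self._release_locked(successor)
            return parent

    def names(self):
        with self._lock:
            return sorted(self._bitmaps)


registry = BitmapRegistry()
registry.add("parent")
registry.add("child", successor_of="parent")
reclaimed = registry.reclaim("parent")
```

Note that the `with` statement drops the lock on every exit path, including the error path; in C each early return has to unlock explicitly.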

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Fam Zheng 
Reviewed-by: John Snow 
Message-id: 20180207155837.92351-3-vsement...@virtuozzo.com
Signed-off-by: John Snow 
---
 block/dirty-bitmap.c | 85 
 1 file changed, 53 insertions(+), 32 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 0d0e807216..75435f6c2f 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -242,6 +242,51 @@ void bdrv_dirty_bitmap_enable_successor(BdrvDirtyBitmap 
*bitmap)
 qemu_mutex_unlock(bitmap->mutex);
 }
 
+/* Called within bdrv_dirty_bitmap_lock..unlock */
+static void bdrv_do_release_matching_dirty_bitmap_locked(
+BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
+bool (*cond)(BdrvDirtyBitmap *bitmap))
+{
+BdrvDirtyBitmap *bm, *next;
+
+QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
+if ((!bitmap || bm == bitmap) && (!cond || cond(bm))) {
+assert(!bm->active_iterators);
+assert(!bdrv_dirty_bitmap_frozen(bm));
+assert(!bm->meta);
+QLIST_REMOVE(bm, list);
+hbitmap_free(bm->bitmap);
+g_free(bm->name);
+g_free(bm);
+
+if (bitmap) {
+return;
+}
+}
+}
+
+if (bitmap) {
+abort();
+}
+}
+
+/* Called with BQL taken.  */
+static void bdrv_do_release_matching_dirty_bitmap(
+BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
+bool (*cond)(BdrvDirtyBitmap *bitmap))
+{
+bdrv_dirty_bitmaps_lock(bs);
+bdrv_do_release_matching_dirty_bitmap_locked(bs, bitmap, cond);
+bdrv_dirty_bitmaps_unlock(bs);
+}
+
+/* Called within bdrv_dirty_bitmap_lock..unlock */
+static void bdrv_release_dirty_bitmap_locked(BlockDriverState *bs,
+ BdrvDirtyBitmap *bitmap)
+{
+bdrv_do_release_matching_dirty_bitmap_locked(bs, bitmap, NULL);
+}
+
 /**
  * For a bitmap with a successor, yield our name to the successor,
  * delete the old bitmap, and return a handle to the new bitmap.
@@ -281,7 +326,11 @@ BdrvDirtyBitmap 
*bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
Error **errp)
 {
-BdrvDirtyBitmap *successor = parent->successor;
+BdrvDirtyBitmap *successor;
+
+qemu_mutex_lock(parent->mutex);
+
+successor = parent->successor;
 
 if (!successor) {
 error_setg(errp, "Cannot reclaim a successor when none is present");
@@ -292,9 +341,11 @@ BdrvDirtyBitmap 
*bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
 error_setg(errp, "Merging of parent and successor bitmap failed");
 return NULL;
 }
-bdrv_release_dirty_bitmap(bs, successor);
+bdrv_release_dirty_bitmap_locked(bs, successor);
 parent->successor = NULL;
 
+qemu_mutex_unlock(parent->mutex);
+
 return parent;
 }
 
@@ -322,36 +373,6 @@ static bool bdrv_dirty_bitmap_has_name(BdrvDirtyBitmap 
*bitmap)
 }
 
 /* Called with BQL taken.  */
-static void bdrv_do_release_matching_dirty_bitmap(
-BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
-bool (*cond)(BdrvDirtyBitmap *bitmap))
-{
-BdrvDirtyBitmap *bm, *next;
-bdrv_dirty_bitmaps_lock(bs);
-QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
-if ((!bitmap || bm == bitmap) && (!cond || cond(bm))) {
-assert(!bm->active_iterators);
-assert(!bdrv_dirty_bitmap_frozen(bm));
-assert(!bm->meta);
-QLIST_REMOVE(bm, list);
-hbitmap_free(bm->bitmap);
-g_free(bm->name);
-g_free(bm);
-
-if (bitmap) {
-goto out;
-}
-}
-}
-if (bitmap) {
-abort();
-}
-
-out:
-bdrv_dirty_bitmaps_unlock(bs);
-}
-
-/* Called with BQL taken.  */
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 {
 bdrv_do_release_matching_dirty_bitmap(bs, bitmap, NULL);
-- 
2.11.1

[Qemu-block] [PATCH v11 03/13] block/dirty-bitmap: add _locked version of bdrv_reclaim_dirty_bitmap

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: John Snow 
Message-id: 20180207155837.92351-4-vsement...@virtuozzo.com
Signed-off-by: John Snow 
---
 include/block/dirty-bitmap.h |  3 +++
 block/dirty-bitmap.c | 28 ++--
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index e0ebc96b01..5c239be74d 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -93,5 +93,8 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_next(BlockDriverState *bs,
 BdrvDirtyBitmap *bitmap);
 char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp);
 int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap *bitmap, uint64_t start);
+BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap_locked(BlockDriverState *bs,
+  BdrvDirtyBitmap *bitmap,
+  Error **errp);
 
 #endif
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 75435f6c2f..ce00ff3474 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -320,17 +320,13 @@ BdrvDirtyBitmap 
*bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
  * In cases of failure where we can no longer safely delete the parent,
  * we may wish to re-join the parent and child/successor.
  * The merged parent will be un-frozen, but not explicitly re-enabled.
- * Called with BQL taken.
+ * Called within bdrv_dirty_bitmap_lock..unlock and with BQL taken.
  */
-BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
-   BdrvDirtyBitmap *parent,
-   Error **errp)
+BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap_locked(BlockDriverState *bs,
+  BdrvDirtyBitmap *parent,
+  Error **errp)
 {
-BdrvDirtyBitmap *successor;
-
-qemu_mutex_lock(parent->mutex);
-
-successor = parent->successor;
+BdrvDirtyBitmap *successor = parent->successor;
 
 if (!successor) {
 error_setg(errp, "Cannot reclaim a successor when none is present");
@@ -344,9 +340,21 @@ BdrvDirtyBitmap 
*bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
 bdrv_release_dirty_bitmap_locked(bs, successor);
 parent->successor = NULL;
 
+return parent;
+}
+
+/* Called with BQL taken. */
+BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
+   BdrvDirtyBitmap *parent,
+   Error **errp)
+{
+BdrvDirtyBitmap *ret;
+
+qemu_mutex_lock(parent->mutex);
+ret = bdrv_reclaim_dirty_bitmap_locked(bs, parent, errp);
 qemu_mutex_unlock(parent->mutex);
 
-return parent;
+return ret;
 }
 
 /**
-- 
2.11.1

[Qemu-block] [PATCH v11 00/13] Dirty bitmaps postcopy migration

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

Hi all!

Here is a new version of the dirty bitmap postcopy migration series.

Patches 01-04 are taken directly from John's branch
  https://github.com/jnsnow/qemu/tree/bitmaps
and are included only for patchew.

v11
clone: tag postcopy-v11 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: 
https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=postcopy-v11

05: drop inconsistent behavior change, keeping necessity of setting 
s->start_postcopy
10: new patch. it is needed because of 05 change, we should allow
migrate-start-postcopy for dirty-bitmaps too.
11: in dirty_bitmap_load_bits():
- check too large buffer size
- check return value of qemu_get_buffer
drop "\n" from trace-event
12: - set dirty-bitmap capability for target (only for online case and left a 
TODO for
offline).
- move from STOP to MIGRATION event like in 203
13: - drop Fam's r-b
- set dirty-bitmap capability for target
- move from STOP to MIGRATION event like in 203
- add missed self.assert_qmp after migrate cmd
- add call of migrate-start-postcopy (see 05 changes and patch 10)

v10

clone: tag postcopy-v10 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: 
https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=postcopy-v10

01,02: r-b Fam
03: adjust comments about locking
04: fixed 124 iotest (was broken because of small mistake in 
block/dirty-bitmap.c)
05: rebased on master; stuff from migration_thread has moved to
migration_iteration_run, so drop r-b by John and Juan
06: 2.11->2.12, r-b Fam
07,08,09,: r-b Fam

10: move to device names instead of node names; it looks like libvirt doesn't
care about same node-names.
flag AUTOLOAD is ignored for now
use QEMU_ALIGN_UP and DIV_ROUND_UP
skip automatically inserted nodes, when search for dirty bitmaps
allow migration of no bitmaps (see in dirty_bitmap_load_header new logic
   with nothing variable, which avoids extra 
errors)
handle return code of dirty_bitmap_load_header
avoid iteration if there are no bitmaps (see new .no_bitmaps field of 
 dirty_bitmap_mig_state)
call dirty_bitmap_mig_before_vm_start from process_incoming_migration_bh 
too,
to enable bitmaps in case of postcopy not actually started.
11: not add r-b Fam
tiny reorganisation of do_test_migration parameters: remove useless default
values and make shared_storage to be the last
disable shared storage test for now, until it is fixed (that will be a
separate series, more related to qcow2 than to migration)
12: r-b Fam

also, "iotests: add default node-name" is dropped, as it is no longer needed.


v9

clone: tag postcopy-v9 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=postcopy-v9

01: r-b John
02: was incomplete, now add here bdrv_reclaim_dirty_bitmap fix
03: new
04: new
05: r-b John
07: fix typo in commit message, r-b John
09: add comment about is_active_iterate, r-b Snow and keep Juan's r-b, hope 
comment is ok
10: change copyright to Virtuozzo
reword comment at the top of the file
rewrite init_dirty_bitmap_migration, to not do same things twice (John)
  and skip _only_ unnamed bitmaps, error out for unnamed nodes (John)
use new "locked" state of bitmaps instead of frozen on source vm
do not support migrating bitmap to existent one with the same name,
  keep only create-new-bitmap way
break loop in dirty_bitmap_load_complete when bitmap is found
use bitmap locking instead of context acquire
12: rewrite, to add more cases. (note, that 169 iotest is also in my
"[PATCH v2 0/3] fix bitmaps migration through shared storage", which 
probably should
go to qemu-stable. So this patch should rewrite it, but here I make it like 
new patch,
to simplify review. When "[PATCH v2..." merged I'll rebase this on it), 
drop r-b
13: move to separate test, drop r-b


v8.1

clone: tag postcopy-v8.1 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: 
https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=postcopy-v8.1

05: fix compilation, add new version for cmma_save_pending too.


v8

clone: tag postcopy-v8 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=postcopy-v8

- rebased on master
- patches 01-03 from v7 are already merged to master
- patch order is changed to make it possible to merge block/dirty-bitmap patches
  separately if needed
01: new patch
03: fixed to use _locked version of bdrv_release_dirty_bitmap
06: qapi-schema.json -> qapi/migration.json
2.9 -> 2.11
10: protocol changed a bit:
  instead of 1 byte "bitmap enabled flag" this byte becomes just "flags"
  and have "enabled", "persistent" and "autoloading" flags inside.
  also, make all migrated bitmaps to be not persistent (to prevent their
  storing on source vm)
14: new patch

[Qemu-block] [PULL 11/17] nbd/server: add nbd_read_opt_name helper

2018-03-13 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Add helper to read name in format:

  uint32 len   (<= NBD_MAX_NAME_SIZE)
  len bytes string (not 0-terminated)

The helper will be reused in the following patch.
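
The name format above (a 32-bit length in network byte order, then that many bytes with no terminator) can be sketched in Python. This is an illustration only, not the server code: read_opt_name() is a hypothetical helper that parses an in-memory buffer, and MAX_NAME_SIZE = 256 is an assumed stand-in for NBD_MAX_NAME_SIZE.

```python
import struct

MAX_NAME_SIZE = 256  # assumption: stand-in for NBD_MAX_NAME_SIZE


def read_opt_name(payload, offset=0):
    """Parse one length-prefixed name; return (name, offset after the name)."""
    if len(payload) - offset < 4:
        raise ValueError("truncated length field")
    # NBD integers are big-endian (network byte order).
    (length,) = struct.unpack_from(">I", payload, offset)
    if length > MAX_NAME_SIZE:
        raise ValueError("invalid name length: %d" % length)
    start = offset + 4
    end = start + length
    if end > len(payload):
        raise ValueError("truncated name")
    return payload[start:end].decode("utf-8"), end


# An export name "exp" followed by a 16-bit request count of 1.
payload = struct.pack(">I", 3) + b"exp" + struct.pack(">H", 1)
name, next_offset = read_opt_name(payload)
```

Checking the length against the maximum before reading the bytes is what lets the caller use a fixed-size buffer of the maximum size plus one byte for the terminator.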

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20180312152126.286890-3-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
[eblake: grammar fixes, actually check error]
Signed-off-by: Eric Blake 
---
 nbd/server.c | 53 +++--
 1 file changed, 43 insertions(+), 10 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 01ea97afe98..280bdbb1040 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -273,6 +273,48 @@ static int nbd_opt_read(NBDClient *client, void *buffer, 
size_t size,
 return qio_channel_read_all(client->ioc, buffer, size, errp) < 0 ? -EIO : 
1;
 }

+/* nbd_opt_read_name
+ *
+ * Read a string with the format:
+ *   uint32_t len (<= NBD_MAX_NAME_SIZE)
+ *   len bytes string (not 0-terminated)
+ *
+ * @name should be enough to store NBD_MAX_NAME_SIZE+1.
+ * If @length is non-null, it will be set to the actual string length.
+ *
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success.
+ */
+static int nbd_opt_read_name(NBDClient *client, char *name, uint32_t *length,
+                             Error **errp)
+{
+    int ret;
+    uint32_t len;
+
+    ret = nbd_opt_read(client, &len, sizeof(len), errp);
+    if (ret <= 0) {
+        return ret;
+    }
+    cpu_to_be32s(&len);
+
+    if (len > NBD_MAX_NAME_SIZE) {
+        return nbd_opt_invalid(client, errp,
+                               "Invalid name length: %" PRIu32, len);
+    }
+
+    ret = nbd_opt_read(client, name, len, errp);
+    if (ret <= 0) {
+        return ret;
+    }
+    name[len] = '\0';
+
+    if (length) {
+        *length = len;
+    }
+
+    return 1;
+}
+
 /* Send a single NBD_REP_SERVER reply to NBD_OPT_LIST, including payload.
  * Return -errno on error, 0 on success. */
 static int nbd_negotiate_send_rep_list(NBDClient *client, NBDExport *exp,
@@ -455,19 +497,10 @@ static int nbd_negotiate_handle_info(NBDClient *client, 
uint16_t myflags,
 2 bytes: N, number of requests (can be 0)
 N * 2 bytes: N requests
 */
-rc = nbd_opt_read(client, &namelen, sizeof(namelen), errp);
+rc = nbd_opt_read_name(client, name, &namelen, errp);
 if (rc <= 0) {
 return rc;
 }
-be32_to_cpus(&namelen);
-if (namelen >= sizeof(name)) {
-return nbd_opt_invalid(client, errp, "name too long for qemu");
-}
-rc = nbd_opt_read(client, name, namelen, errp);
-if (rc <= 0) {
-return rc;
-}
-name[namelen] = '\0';
 trace_nbd_negotiate_handle_export_name_request(name);

 rc = nbd_opt_read(client, &requests, sizeof(requests), errp);
-- 
2.14.3

[Qemu-block] [PULL 17/17] iotests: new test 209 for NBD BLOCK_STATUS

2018-03-13 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20180312152126.286890-9-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/209 | 34 ++
 tests/qemu-iotests/209.out |  2 ++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 37 insertions(+)
 create mode 100755 tests/qemu-iotests/209
 create mode 100644 tests/qemu-iotests/209.out

diff --git a/tests/qemu-iotests/209 b/tests/qemu-iotests/209
new file mode 100755
index 000..259e991ec6e
--- /dev/null
+++ b/tests/qemu-iotests/209
@@ -0,0 +1,34 @@
+#!/usr/bin/env python
+#
+# Tests for NBD BLOCK_STATUS extension
+#
+# Copyright (c) 2018 Virtuozzo International GmbH
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import iotests
+from iotests import qemu_img_create, qemu_io, qemu_img_verbose, qemu_nbd, \
+file_path
+
+iotests.verify_image_format(supported_fmts=['qcow2'])
+
+disk, nbd_sock = file_path('disk', 'nbd-sock')
+nbd_uri = 'nbd+unix:///exp?socket=' + nbd_sock
+
+qemu_img_create('-f', iotests.imgfmt, disk, '1M')
+qemu_io('-f', iotests.imgfmt, '-c', 'write 0 512K', disk)
+
+qemu_nbd('-k', nbd_sock, '-x', 'exp', '-f', iotests.imgfmt, disk)
+qemu_img_verbose('map', '-f', 'raw', '--output=json', nbd_uri)
diff --git a/tests/qemu-iotests/209.out b/tests/qemu-iotests/209.out
new file mode 100644
index 000..0d29724e84a
--- /dev/null
+++ b/tests/qemu-iotests/209.out
@@ -0,0 +1,2 @@
+[{ "start": 0, "length": 524288, "depth": 0, "zero": false, "data": true},
+{ "start": 524288, "length": 524288, "depth": 0, "zero": true, "data": false}]
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 890fe91f2b1..624e1fbd4fe 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -205,3 +205,4 @@
 206 rw auto
 207 rw auto
 208 rw auto quick
+209 rw auto quick
-- 
2.14.3

[Qemu-block] [PULL 15/17] iotests.py: tiny refactor: move system imports up

2018-03-13 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Message-Id: <20180312152126.286890-7-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/iotests.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 1bcc9ca57dc..c1302a2f9b1 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -23,13 +23,14 @@ import subprocess
 import string
 import unittest
 import sys
-sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'scripts'))
-import qtest
 import struct
 import json
 import signal
 import logging

+sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'scripts'))
+import qtest
+

 # This will not work if arguments contain spaces but is necessary if we
 # want to support the override options that ./check supports.
-- 
2.14.3

[Qemu-block] [PULL 16/17] iotests: add file_path helper

2018-03-13 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Simple way to have auto generated filenames with auto cleanup. Like
FilePath but without using 'with' statement and without additional
indentation of the whole test.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20180312152126.286890-8-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
[eblake: grammar tweak]
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/iotests.py | 32 
 1 file changed, 32 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index c1302a2f9b1..90cd751e2af 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -27,6 +27,7 @@ import struct
 import json
 import signal
 import logging
+import atexit

 sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'scripts'))
 import qtest
@@ -250,6 +251,37 @@ class FilePath(object):
 return False


+def file_path_remover():
+    for path in reversed(file_path_remover.paths):
+        try:
+            os.remove(path)
+        except OSError:
+            pass
+
+
+def file_path(*names):
+    ''' Another way to get auto-generated filename that cleans itself up.
+
+    Use is as simple as:
+
+    img_a, img_b = file_path('a.img', 'b.img')
+    sock = file_path('socket')
+    '''
+
+    if not hasattr(file_path_remover, 'paths'):
+        file_path_remover.paths = []
+        atexit.register(file_path_remover)
+
+    paths = []
+    for name in names:
+        filename = '{0}-{1}'.format(os.getpid(), name)
+        path = os.path.join(test_dir, filename)
+        file_path_remover.paths.append(path)
+        paths.append(path)
+
+    return paths[0] if len(paths) == 1 else paths
+
+
 class VM(qtest.QEMUQtestMachine):
 '''A QEMU VM'''

-- 
2.14.3

[Qemu-block] [PULL 09/17] iotests: add 208 nbd-server + blockdev-snapshot-sync test case

2018-03-13 Thread Eric Blake

From: Stefan Hajnoczi 

This test case adds an NBD server export and then invokes
blockdev-snapshot-sync, which changes the BlockDriverState node that the
NBD server's BlockBackend points to.  This is an interesting scenario to
test and exercises the code path fixed by the previous commit.

Signed-off-by: Stefan Hajnoczi 
Message-Id: <20180306204819.11266-3-stefa...@redhat.com>
Reviewed-by: Max Reitz 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/208 | 55 ++
 tests/qemu-iotests/208.out |  9 
 tests/qemu-iotests/group   |  1 +
 3 files changed, 65 insertions(+)
 create mode 100755 tests/qemu-iotests/208
 create mode 100644 tests/qemu-iotests/208.out

diff --git a/tests/qemu-iotests/208 b/tests/qemu-iotests/208
new file mode 100755
index 000..4e82b96c829
--- /dev/null
+++ b/tests/qemu-iotests/208
@@ -0,0 +1,55 @@
+#!/usr/bin/env python
+#
+# Copyright (C) 2018 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+# Creator/Owner: Stefan Hajnoczi 
+#
+# Check that the runtime NBD server does not crash when stopped after
+# blockdev-snapshot-sync.
+
+import iotests
+
+with iotests.FilePath('disk.img') as disk_img_path, \
+     iotests.FilePath('disk-snapshot.img') as disk_snapshot_img_path, \
+     iotests.FilePath('nbd.sock') as nbd_sock_path, \
+     iotests.VM() as vm:
+
+    img_size = '10M'
+    iotests.qemu_img_pipe('create', '-f', iotests.imgfmt, disk_img_path, img_size)
+
+    iotests.log('Launching VM...')
+    (vm.add_drive(disk_img_path, 'node-name=drive0-node', interface='none')
+       .launch())
+
+    iotests.log('Starting NBD server...')
+    iotests.log(vm.qmp('nbd-server-start', addr={
+            "type": "unix",
+            "data": {
+                "path": nbd_sock_path,
+            }
+        }))
+
+    iotests.log('Adding NBD export...')
+    iotests.log(vm.qmp('nbd-server-add', device='drive0-node', writable=True))
+
+    iotests.log('Creating external snapshot...')
+    iotests.log(vm.qmp('blockdev-snapshot-sync',
+                       node_name='drive0-node',
+                       snapshot_node_name='drive0-snapshot-node',
+                       snapshot_file=disk_snapshot_img_path))
+
+    iotests.log('Stopping NBD server...')
+    iotests.log(vm.qmp('nbd-server-stop'))
diff --git a/tests/qemu-iotests/208.out b/tests/qemu-iotests/208.out
new file mode 100644
index 000..3687e9d0dd4
--- /dev/null
+++ b/tests/qemu-iotests/208.out
@@ -0,0 +1,9 @@
+Launching VM...
+Starting NBD server...
+{u'return': {}}
+Adding NBD export...
+{u'return': {}}
+Creating external snapshot...
+{u'return': {}}
+Stopping NBD server...
+{u'return': {}}
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index c401791fcdb..890fe91f2b1 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -204,3 +204,4 @@
 205 rw auto quick
 206 rw auto
 207 rw auto
+208 rw auto quick
-- 
2.14.3

[Qemu-block] [PULL 02/17] nbd/server: move nbd_co_send_structured_error up

2018-03-13 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

To be reused in nbd_co_send_sparse_read() in the following patch.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20180308184636.178534-2-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 nbd/server.c | 48 
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index e714bfe6a17..3d0f024193c 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1342,6 +1342,30 @@ static int coroutine_fn nbd_co_send_structured_read(NBDClient *client,
     return nbd_co_send_iov(client, iov, 2, errp);
 }

+static int coroutine_fn nbd_co_send_structured_error(NBDClient *client,
+                                                     uint64_t handle,
+                                                     uint32_t error,
+                                                     const char *msg,
+                                                     Error **errp)
+{
+    NBDStructuredError chunk;
+    int nbd_err = system_errno_to_nbd_errno(error);
+    struct iovec iov[] = {
+        {.iov_base = &chunk, .iov_len = sizeof(chunk)},
+        {.iov_base = (char *)msg, .iov_len = msg ? strlen(msg) : 0},
+    };
+
+    assert(nbd_err);
+    trace_nbd_co_send_structured_error(handle, nbd_err,
+                                       nbd_err_lookup(nbd_err), msg ? msg : "");
+    set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_ERROR, handle,
+                 sizeof(chunk) - sizeof(chunk.h) + iov[1].iov_len);
+    stl_be_p(&chunk.error, nbd_err);
+    stw_be_p(&chunk.message_length, iov[1].iov_len);
+
+    return nbd_co_send_iov(client, iov, 1 + !!iov[1].iov_len, errp);
+}
+
 static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
                                                 uint64_t handle,
                                                 uint64_t offset,
@@ -1401,30 +1425,6 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
     return ret;
 }

-static int coroutine_fn nbd_co_send_structured_error(NBDClient *client,
-                                                     uint64_t handle,
-                                                     uint32_t error,
-                                                     const char *msg,
-                                                     Error **errp)
-{
-    NBDStructuredError chunk;
-    int nbd_err = system_errno_to_nbd_errno(error);
-    struct iovec iov[] = {
-        {.iov_base = &chunk, .iov_len = sizeof(chunk)},
-        {.iov_base = (char *)msg, .iov_len = msg ? strlen(msg) : 0},
-    };
-
-    assert(nbd_err);
-    trace_nbd_co_send_structured_error(handle, nbd_err,
-                                       nbd_err_lookup(nbd_err), msg ? msg : "");
-    set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_ERROR, handle,
-                 sizeof(chunk) - sizeof(chunk.h) + iov[1].iov_len);
-    stl_be_p(&chunk.error, nbd_err);
-    stw_be_p(&chunk.message_length, iov[1].iov_len);
-
-    return nbd_co_send_iov(client, iov, 1 + !!iov[1].iov_len, errp);
-}
-
 /* nbd_co_receive_request
  * Collect a client request. Return 0 if request looks valid, -EIO to drop
  * connection right away, and any other negative value to report an error to
-- 
2.14.3
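
As a rough illustration of the wire layout that nbd_co_send_structured_error produces, the sketch below packs the same fields with Python's struct module: a 20-byte structured reply header followed by a 32-bit error code, a 16-bit message length, and the optional message. The constant values are taken from the NBD protocol document as I recall them and should be checked against it; the function name `build_structured_error` is hypothetical and is not part of QEMU.

```python
import struct

# Assumed values from the NBD protocol specification.
NBD_STRUCTURED_REPLY_MAGIC = 0x668e33ef
NBD_REPLY_FLAG_DONE = 1 << 0
NBD_REPLY_TYPE_ERROR = (1 << 15) + 1


def build_structured_error(handle, nbd_err, msg=None):
    """Pack one NBD_REPLY_TYPE_ERROR chunk, all fields big-endian."""
    message = msg.encode('utf-8') if msg else b''
    # Payload: 32-bit error, 16-bit message length, then the message bytes.
    payload = struct.pack('>IH', nbd_err, len(message)) + message
    # Header: magic, flags, type, handle, payload length.
    header = struct.pack('>IHHQI', NBD_STRUCTURED_REPLY_MAGIC,
                         NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_ERROR,
                         handle, len(payload))
    return header + payload


chunk = build_structured_error(handle=7, nbd_err=5, msg='I/O error')
```

The header length field counts only the payload, which mirrors the `sizeof(chunk) - sizeof(chunk.h) + iov[1].iov_len` expression in the C code.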

[Qemu-block] [PULL 01/17] iotests: Fix stuck NBD process on 33

2018-03-13 Thread Eric Blake

Commit afe35cde6 added additional actions to test 33, but forgot
to reset the image between tests.  As a result, './check -nbd 33'
fails because the qemu-nbd process from the first half is still
occupying the port, preventing the second half from starting a
new qemu-nbd process.  Worse, the failure leaves a rogue qemu-nbd
process behind even after the test fails, which causes knock-on
failures to later tests that also want to start qemu-nbd.

Reported-by: Max Reitz 
Signed-off-by: Eric Blake 
Message-Id: <20180312211156.452139-1-ebl...@redhat.com>
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/033 | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/qemu-iotests/033 b/tests/qemu-iotests/033
index a1d8357331d..ee8a1338bbd 100755
--- a/tests/qemu-iotests/033
+++ b/tests/qemu-iotests/033
@@ -105,6 +105,7 @@ for align in 512 4k; do
 done
 done

+_cleanup_test_img

 # Trigger truncate that would shrink qcow2 L1 table, which is done by
 #   clearing one entry (8 bytes) with bdrv_co_pwrite_zeroes()
-- 
2.14.3

Re: [Qemu-block] [Qemu-devel] [PULL 00/41] Block layer patches

2018-03-13 Thread no-reply

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180313161803.1814-1-kw...@redhat.com
Subject: [Qemu-devel] [PULL 00/41] Block layer patches

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 t [tag update]            patchew/20180307082512.14203-1-be...@igalia.com -> patchew/20180307082512.14203-1-be...@igalia.com
 t [tag update]            patchew/20180313153458.26822-1-peter.mayd...@linaro.org -> patchew/20180313153458.26822-1-peter.mayd...@linaro.org
 * [new tag]               patchew/20180313161803.1814-1-kw...@redhat.com -> patchew/20180313161803.1814-1-kw...@redhat.com
Switched to a new branch 'test'
73ceee78e6 block/mirror: change the semantic of 'force' of block-job-cancel
7363482c0a vpc: Require aligned size in .bdrv_co_create
2882d5ed9b vpc: Support .bdrv_co_create
fc6a997d9c vhdx: Support .bdrv_co_create
20ef05f192 vdi: Make comments consistent with other drivers
459ee653e4 qed: Support .bdrv_co_create
8bba4791b7 qcow: Support .bdrv_co_create
f64c119db2 qemu-iotests: Enable write tests for parallels
2080c0a1ab parallels: Support .bdrv_co_create
9faa105c59 iotests: Add regression test for commit base locking
6a296d9cfe block: Fix flags in reopen queue
781f48c549 vdi: Implement .bdrv_co_create
4cab0e18bb vdi: Move file creation to vdi_co_create_opts
891969da22 vdi: Pull option parsing from vdi_co_create
49280fb721 qemu-iotests: Test luks QMP image creation
0a4d72fa21 luks: Catch integer overflow for huge sizes
6fda8b9a38 luks: Turn invalid assertion into check
c88dc7ac6e luks: Support .bdrv_co_create
681d5dff50 luks: Create block_crypto_co_create_generic()
d641340c5a luks: Separate image file creation from formatting
8cae2fd8e8 tests/test-blockjob: test cancellations
1390b7c37d iotests: test manual job dismissal
1ad1823194 blockjobs: Expose manual property
55441ac858 blockjobs: add block-job-finalize
4a6e1bfbb0 blockjobs: add PENDING status and event
ca394bb9c1 blockjobs: add waiting status
92017bd151 blockjobs: add prepare callback
ef89eb33ad blockjobs: add block_job_txn_apply function
225d9d25ba blockjobs: add commit, abort, clean helpers
2de3034128 blockjobs: ensure abort is called for cancelled jobs
37ef0263ce blockjobs: add block_job_dismiss
b59095a50b blockjobs: add NULL state
4c49f9fa27 blockjobs: add CONCLUDED state
7a4f169154 blockjobs: add ABORTING state
ac30b7288c blockjobs: add block_job_verb permission table
068f7c2061 iotests: add pause_wait
8c386101bf blockjobs: add state transition table
d8d8ffcb3d blockjobs: add status enum
99b5fa3cf0 Blockjobs: documentation touchup
318dc73f7e blockjobs: model single jobs as transactions
020497053e blockjobs: fix set-speed kick

=== OUTPUT BEGIN ===
Checking PATCH 1/41: blockjobs: fix set-speed kick...
Checking PATCH 2/41: blockjobs: model single jobs as transactions...
Checking PATCH 3/41: Blockjobs: documentation touchup...
Checking PATCH 4/41: blockjobs: add status enum...
Checking PATCH 5/41: blockjobs: add state transition table...
ERROR: space prohibited before open square bracket '['
#82: FILE: blockjob.c:48:
+/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#83: FILE: blockjob.c:49:
+/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#84: FILE: blockjob.c:50:
+/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0},

ERROR: space prohibited before open square bracket '['
#85: FILE: blockjob.c:51:
+/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#86: FILE: blockjob.c:52:
+/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1},

ERROR: space prohibited before open square bracket '['
#87: FILE: blockjob.c:53:
+/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0},

total: 6 errors, 0 warnings, 88 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 6/41: iotests: add pause_wait...
Checking PATCH 7/41: blockjobs: add block_job_verb permission table...
Checking PATCH 8/41: blockjobs: add ABORTING state...
ERROR: space prohibited before open square bracket '['
#65: FILE: blockjob.c:48:
+/* U: */

Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread John Snow



On 03/13/2018 12:33 PM, Vladimir Sementsov-Ogievskiy wrote:
> 13.03.2018 19:16, John Snow wrote:
>>
>> On 03/13/2018 12:14 PM, Vladimir Sementsov-Ogievskiy wrote:
>>> Hmm, I agree, it is the simplest thing we can do for now, and I'll
>>> rethink later,
>>> how (and is it worth doing) to go to postcopy automatically in case of
>>> only-dirty-bitmaps.
>>> Should I respin?
>> Please do. I already staged patches 1-4 in my branch, so if you'd like,
>> you can respin just 5+.
>>
>> https://github.com/jnsnow/qemu/tree/bitmaps
>>
>> --js
> 
> Ok, I'll base on your branch. How should I write Based-on: for patchew
> in this case?
> 

You know, that's actually a good question... just send out the whole
series for the sake of patchew. I Added an R-B to patches 1-4 and edited
a single comment in #4.

--js

[Qemu-block] [PULL 14/17] nbd: BLOCK_STATUS for standard get_block_status function: client part

2018-03-13 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Minimal implementation: only one extent per server reply is supported.
The NBD_CMD_FLAG_REQ_ONE flag is used to force this behavior.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20180312152126.286890-6-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
[eblake: grammar tweaks, fix min_block check and 32-bit cap, use -1
instead of errno on failure in nbd_negotiate_simple_meta_context]
Signed-off-by: Eric Blake 
---
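
For illustration only, the payload validation that this patch adds in nbd_parse_blockstatus_payload can be sketched in Python as below. This assumes the single-extent layout described above: a 32-bit context id followed by one extent made of a 32-bit length and 32-bit flags, all big-endian. The function name and its ValueError-based error reporting are hypothetical; the C code reports through Error **errp and returns -EINVAL.

```python
import struct


def parse_blockstatus_payload(payload, expected_context_id, orig_length,
                              min_block=0):
    """Validate a single-extent block status payload.

    Returns (extent_length, extent_flags) or raises ValueError.
    """
    # Exactly one context id (4 bytes) plus one extent (4 + 4 bytes).
    if len(payload) != 12:
        raise ValueError('invalid payload for NBD_REPLY_TYPE_BLOCK_STATUS')

    context_id, length, flags = struct.unpack('>III', payload)

    if context_id != expected_context_id:
        raise ValueError('unexpected context id %d, negotiated id is %d'
                         % (context_id, expected_context_id))

    # The extent must make progress, respect the minimum block size when
    # one was negotiated, and stay within the requested range.
    if (length == 0 or
            (min_block and length % min_block != 0) or
            length > orig_length):
        raise ValueError('server sent status chunk with invalid length')

    return length, flags


extent = parse_blockstatus_payload(struct.pack('>III', 0, 4096, 3),
                                   expected_context_id=0,
                                   orig_length=65536, min_block=512)
```

Rejecting a zero-length extent matters because a caller that loops over block status results would otherwise never advance.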
 block/nbd-client.h  |   6 +++
 include/block/nbd.h |   3 ++
 block/nbd-client.c  | 141 
 block/nbd.c |   3 ++
 nbd/client.c| 117 +++
 5 files changed, 270 insertions(+)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index 612c4c21a0c..0ece76e5aff 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -61,4 +61,10 @@ void nbd_client_detach_aio_context(BlockDriverState *bs);
 void nbd_client_attach_aio_context(BlockDriverState *bs,
AioContext *new_context);

+int coroutine_fn nbd_client_co_block_status(BlockDriverState *bs,
+bool want_zero,
+int64_t offset, int64_t bytes,
+int64_t *pnum, int64_t *map,
+BlockDriverState **file);
+
 #endif /* NBD_CLIENT_H */
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 2285637e673..fcdcd545023 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -260,6 +260,7 @@ struct NBDExportInfo {
 /* In-out fields, set by client before nbd_receive_negotiate() and
  * updated by server results during nbd_receive_negotiate() */
 bool structured_reply;
+bool base_allocation; /* base:allocation context for NBD_CMD_BLOCK_STATUS */

 /* Set by server results during nbd_receive_negotiate() */
 uint64_t size;
@@ -267,6 +268,8 @@ struct NBDExportInfo {
 uint32_t min_block;
 uint32_t opt_block;
 uint32_t max_block;
+
+uint32_t meta_base_allocation_id;
 };
 typedef struct NBDExportInfo NBDExportInfo;

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 0d9f73a137f..be160052cb1 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -228,6 +228,48 @@ static int nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk,
 return 0;
 }

+/* nbd_parse_blockstatus_payload
+ * support only one extent in reply and only for
+ * base:allocation context
+ */
+static int nbd_parse_blockstatus_payload(NBDClientSession *client,
+                                         NBDStructuredReplyChunk *chunk,
+                                         uint8_t *payload, uint64_t orig_length,
+                                         NBDExtent *extent, Error **errp)
+{
+    uint32_t context_id;
+
+    if (chunk->length != sizeof(context_id) + sizeof(extent)) {
+        error_setg(errp, "Protocol error: invalid payload for "
+                         "NBD_REPLY_TYPE_BLOCK_STATUS");
+        return -EINVAL;
+    }
+
+    context_id = payload_advance32(&payload);
+    if (client->info.meta_base_allocation_id != context_id) {
+        error_setg(errp, "Protocol error: unexpected context id %d for "
+                         "NBD_REPLY_TYPE_BLOCK_STATUS, when negotiated context "
+                         "id is %d", context_id,
+                         client->info.meta_base_allocation_id);
+        return -EINVAL;
+    }
+
+    extent->length = payload_advance32(&payload);
+    extent->flags = payload_advance32(&payload);
+
+    if (extent->length == 0 ||
+        (client->info.min_block && !QEMU_IS_ALIGNED(extent->length,
+                                                    client->info.min_block)) ||
+        extent->length > orig_length)
+    {
+        error_setg(errp, "Protocol error: server sent status chunk with "
+                   "invalid length");
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
 /* nbd_parse_error_payload
  * on success @errp contains message describing nbd error reply
  */
@@ -642,6 +684,60 @@ static int nbd_co_receive_cmdread_reply(NBDClientSession *s, uint64_t handle,
     return iter.ret;
 }

+static int nbd_co_receive_blockstatus_reply(NBDClientSession *s,
+                                            uint64_t handle, uint64_t length,
+                                            NBDExtent *extent, Error **errp)
+{
+    NBDReplyChunkIter iter;
+    NBDReply reply;
+    void *payload = NULL;
+    Error *local_err = NULL;
+    bool received = false;
+
+    NBD_FOREACH_REPLY_CHUNK(s, iter, handle, s->info.structured_reply,
+                            NULL, &reply, &payload)
+    {
+        int ret;
+        NBDStructuredReplyChunk *chunk = &reply.structured;
+
+        assert(nbd_reply_is_structured(&reply));
+
+        switch (chunk->type) {
+        case NBD_REPLY_TYPE_BLOCK_STATUS:
+

[Qemu-block] [PULL 12/17] nbd: BLOCK_STATUS for standard get_block_status function: server part

2018-03-13 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Minimal implementation: only one extent per server reply is supported.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20180312152126.286890-4-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
[eblake: tweak whitespace, move constant from .h to .c, improve
logic of check_meta_export_name, simplify nbd_negotiate_options
by doing more in nbd_negotiate_meta_queries]
Signed-off-by: Eric Blake 
---
 nbd/server.c | 311 +++
 1 file changed, 311 insertions(+)

diff --git a/nbd/server.c b/nbd/server.c
index 280bdbb1040..cea158913ba 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -22,6 +22,8 @@
 #include "trace.h"
 #include "nbd-internal.h"

+#define NBD_META_ID_BASE_ALLOCATION 0
+
 static int system_errno_to_nbd_errno(int err)
 {
 switch (err) {
@@ -82,6 +84,16 @@ struct NBDExport {

 static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);

+/* NBDExportMetaContexts represents a list of contexts to be exported,
+ * as selected by NBD_OPT_SET_META_CONTEXT. Also used for
+ * NBD_OPT_LIST_META_CONTEXT. */
+typedef struct NBDExportMetaContexts {
+char export_name[NBD_MAX_NAME_SIZE + 1];
+bool valid; /* means that negotiation of the option finished without
+   errors */
+bool base_allocation; /* export base:allocation context (block status) */
+} NBDExportMetaContexts;
+
 struct NBDClient {
 int refcount;
 void (*close_fn)(NBDClient *client, bool negotiated);
@@ -102,6 +114,7 @@ struct NBDClient {
 bool closing;

 bool structured_reply;
+NBDExportMetaContexts export_meta;

 uint32_t opt; /* Current option being negotiated */
 uint32_t optlen; /* remaining length of data in ioc for the option being
@@ -273,6 +286,20 @@ static int nbd_opt_read(NBDClient *client, void *buffer, size_t size,
 return qio_channel_read_all(client->ioc, buffer, size, errp) < 0 ? -EIO : 1;
 }

+/* Drop size bytes from the unparsed payload of the current option.
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_opt_skip(NBDClient *client, size_t size, Error **errp)
+{
+if (size > client->optlen) {
+return nbd_opt_invalid(client, errp,
+   "Inconsistent lengths in option %s",
+   nbd_opt_lookup(client->opt));
+}
+client->optlen -= size;
+return nbd_drop(client->ioc, size, errp) < 0 ? -EIO : 1;
+}
+
 /* nbd_opt_read_name
  *
  * Read a string with the format:
@@ -372,6 +399,12 @@ static int nbd_negotiate_handle_list(NBDClient *client, Error **errp)
 return nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
 }

+static void nbd_check_meta_export_name(NBDClient *client)
+{
+client->export_meta.valid &= !strcmp(client->exp->name,
+ client->export_meta.export_name);
+}
+
 /* Send a reply to NBD_OPT_EXPORT_NAME.
  * Return -errno on error, 0 on success. */
 static int nbd_negotiate_handle_export_name(NBDClient *client,
@@ -423,6 +456,7 @@ static int nbd_negotiate_handle_export_name(NBDClient *client,

 QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
 nbd_export_get(client->exp);
+nbd_check_meta_export_name(client);

 return 0;
 }
@@ -616,6 +650,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
 client->exp = exp;
 QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
 nbd_export_get(client->exp);
+nbd_check_meta_export_name(client);
 rc = 1;
 }
 return rc;
@@ -670,6 +705,189 @@ static QIOChannel *nbd_negotiate_handle_starttls(NBDClient *client,
 return QIO_CHANNEL(tioc);
 }

+/* nbd_negotiate_send_meta_context
+ *
+ * Send one chunk of reply to NBD_OPT_{LIST,SET}_META_CONTEXT
+ *
+ * For NBD_OPT_LIST_META_CONTEXT @context_id is ignored, 0 is used instead.
+ */
+static int nbd_negotiate_send_meta_context(NBDClient *client,
+                                           const char *context,
+                                           uint32_t context_id,
+                                           Error **errp)
+{
+    NBDOptionReplyMetaContext opt;
+    struct iovec iov[] = {
+        {.iov_base = &opt, .iov_len = sizeof(opt)},
+        {.iov_base = (void *)context, .iov_len = strlen(context)}
+    };
+
+    if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
+        context_id = 0;
+    }
+
+    set_be_option_rep(&opt.h, client->opt, NBD_REP_META_CONTEXT,
+                      sizeof(opt) - sizeof(opt.h) + iov[1].iov_len);
+    stl_be_p(&opt.context_id, context_id);
+
+    return qio_channel_writev_all(client->ioc, iov, 2, errp) < 0 ? -EIO : 0;
+}
+
+/* nbd_meta_base_query
+ *
+ * Handle query to 'base' namespace. For now, only base:allocation context is
+ * available in it.  'len' is

[Qemu-block] [PULL 13/17] block/nbd-client: save first fatal error in nbd_iter_error

2018-03-13 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

It is fine for a fatal error to hide a previous non-fatal one, but
hiding the first fatal error is a bad feature.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Message-Id: <20180312152126.286890-5-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 block/nbd-client.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 7b68499b76a..0d9f73a137f 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -481,6 +481,7 @@ static coroutine_fn int nbd_co_receive_one_chunk(

 typedef struct NBDReplyChunkIter {
 int ret;
+bool fatal;
 Error *err;
 bool done, only_structured;
 } NBDReplyChunkIter;
@@ -490,11 +491,12 @@ static void nbd_iter_error(NBDReplyChunkIter *iter, bool fatal,
 {
 assert(ret < 0);

-if (fatal || iter->ret == 0) {
+if ((fatal && !iter->fatal) || iter->ret == 0) {
 if (iter->ret != 0) {
 error_free(iter->err);
 iter->err = NULL;
 }
+iter->fatal = fatal;
 iter->ret = ret;
 error_propagate(&iter->err, *local_err);
 } else {
-- 
2.14.3
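
The rule this patch introduces (a fatal error may replace a non-fatal one, but the first fatal error is never replaced) can be sketched in a few lines of Python. The class and function names below are hypothetical stand-ins for NBDReplyChunkIter and nbd_iter_error, and plain strings stand in for QEMU Error objects.

```python
class ReplyChunkIter:
    """Minimal stand-in for the fields of NBDReplyChunkIter used here."""

    def __init__(self):
        self.ret = 0
        self.fatal = False
        self.err = None


def iter_error(it, fatal, ret, err):
    """Record an error, keeping the first fatal one if several arrive."""
    assert ret < 0
    if (fatal and not it.fatal) or it.ret == 0:
        # Either nothing is stored yet, or a fatal error supersedes a
        # previously stored non-fatal one.
        it.fatal = fatal
        it.ret = ret
        it.err = err
    # Otherwise the new error is dropped and the stored one is kept.


it = ReplyChunkIter()
iter_error(it, fatal=False, ret=-5, err='non-fatal read error')
iter_error(it, fatal=True, ret=-71, err='first fatal error')
iter_error(it, fatal=True, ret=-22, err='second fatal error')
```

With the pre-patch condition (`fatal or it.ret == 0`), the third call would have overwritten the stored error with the second fatal one.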

[Qemu-block] [PULL 04/17] nbd/server: fix: check client->closing before sending reply

2018-03-13 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Since the unchanged code has just set client->recv_coroutine to
NULL before calling nbd_client_receive_next_request(), we are
spawning a new coroutine unconditionally, but the first thing
that coroutine will do is check for client->closing, making it
a no-op if we have already detected that the client is going
away.  Furthermore, for any error other than EIO (where we
disconnect, which itself sets client->closing), if the client
has already gone away, we'll probably encounter EIO later
in the function and attempt disconnect at that point.  Logically,
as soon as we know the connection is closing, there is no need
to try a likely-to-fail response or spawn a no-op coroutine.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20180308184636.178534-4-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
[eblake: squash in further reordering: hoist check before spawning
next coroutine, and document rationale in commit message]
Signed-off-by: Eric Blake 
---
 nbd/server.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 5f292064af0..b230ecb4fb8 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1543,14 +1543,6 @@ static coroutine_fn void nbd_trip(void *opaque)
 req = nbd_request_get(client);
 ret = nbd_co_receive_request(req, &request, &local_err);
 client->recv_coroutine = NULL;
-nbd_client_receive_next_request(client);
-if (ret == -EIO) {
-goto disconnect;
-}
-
-if (ret < 0) {
-goto reply;
-}

 if (client->closing) {
 /*
@@ -1560,6 +1552,15 @@ static coroutine_fn void nbd_trip(void *opaque)
 goto done;
 }

+nbd_client_receive_next_request(client);
+if (ret == -EIO) {
+goto disconnect;
+}
+
+if (ret < 0) {
+goto reply;
+}
+
 switch (request.type) {
 case NBD_CMD_READ:
 /* XXX: NBD Protocol only documents use of FUA with WRITE */
-- 
2.14.3

Re: [Qemu-block] [PATCH v10 10/12] migration: add postcopy migration of dirty bitmaps

2018-03-13 Thread Vladimir Sementsov-Ogievskiy


12.03.2018 19:09, Dr. David Alan Gilbert wrote:

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:

Postcopy migration of dirty bitmaps. Only named dirty bitmaps are migrated.

If the destination QEMU already contains a dirty bitmap with the same name
as a migrated bitmap (for the same node), the migration proceeds if their
granularities match; otherwise an error is generated.

If the destination QEMU doesn't contain such a bitmap, it will be created.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/migration/misc.h   |   3 +
  migration/migration.h  |   3 +
  migration/block-dirty-bitmap.c | 737 +
  migration/migration.c  |   5 +
  migration/savevm.c |   2 +
  vl.c   |   1 +
  migration/Makefile.objs|   1 +
  migration/trace-events |  14 +
  8 files changed, 766 insertions(+)
  create mode 100644 migration/block-dirty-bitmap.c

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 77fd4f587c..4ebf24c6c2 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -56,4 +56,7 @@ bool migration_has_failed(MigrationState *);
  bool migration_in_postcopy_after_devices(MigrationState *);
  void migration_global_dump(Monitor *mon);
  
+/* migration/block-dirty-bitmap.c */

+void dirty_bitmap_mig_init(void);
+
  #endif
diff --git a/migration/migration.h b/migration/migration.h
index 861cdfaa96..79f72b7e50 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -233,4 +233,7 @@ void migrate_send_rp_pong(MigrationIncomingState *mis,
  void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* rbname,
ram_addr_t start, size_t len);
  
+void dirty_bitmap_mig_before_vm_start(void);

+void init_dirty_bitmap_incoming_migration(void);
+
  #endif
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
new file mode 100644
index 00..5b41f7140d
--- /dev/null
+++ b/migration/block-dirty-bitmap.c
@@ -0,0 +1,737 @@
+/*
+ * Block dirty bitmap postcopy migration
+ *
+ * Copyright IBM, Corp. 2009
+ * Copyright (c) 2016-2017 Virtuozzo International GmbH. All rights reserved.
+ *
+ * Authors:
+ *  Liran Schour   
+ *  Vladimir Sementsov-Ogievskiy 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ * This file is derived from migration/block.c, so its author and IBM copyright
+ * are here, although content is quite different.
+ *
+ * Contributions after 2012-01-13 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
+ *
+ ****
+ *
+ * Here postcopy migration of dirty bitmaps is realized. Only QMP-addressable
+ * bitmaps are migrated.
+ *
+ * Bitmap migration implies creating bitmap with the same name and granularity
+ * in destination QEMU. If the bitmap with the same name (for the same node)
+ * already exists on destination an error will be generated.
+ *
+ * format of migration:
+ *
+ * # Header (shared for different chunk types)
+ * 1, 2 or 4 bytes: flags (see qemu_{get,put}_flags)
+ * [ 1 byte: node name size ] \  flags & DEVICE_NAME
+ * [ n bytes: node name ] /
+ * [ 1 byte: bitmap name size ] \  flags & BITMAP_NAME
+ * [ n bytes: bitmap name ] /
+ *
+ * # Start of bitmap migration (flags & START)
+ * header
+ * be64: granularity
+ * 1 byte: bitmap flags (corresponds to BdrvDirtyBitmap)
+ *   bit 0-  bitmap is enabled
+ *   bit 1-  bitmap is persistent
+ *   bit 2-  bitmap is autoloading
+ *   bits 3-7 - reserved, must be zero
+ *
+ * # Complete of bitmap migration (flags & COMPLETE)
+ * header
+ *
+ * # Data chunk of bitmap migration
+ * header
+ * be64: start sector
+ * be32: number of sectors
+ * [ be64: buffer size  ] \ ! (flags & ZEROES)
+ * [ n bytes: buffer] /
+ *
+ * The last chunk in stream should contain flags & EOS. The chunk may skip
+ * device and/or bitmap names, assuming them to be the same with the previous
+ * chunk.
+ */
+
+#include "qemu/osdep.h"
+#include "block/block.h"
+#include "block/block_int.h"
+#include "sysemu/block-backend.h"
+#include "qemu/main-loop.h"
+#include "qemu/error-report.h"
+#include "migration/misc.h"
+#include "migration/migration.h"
+#include "migration/qemu-file.h"
+#include "migration/vmstate.h"
+#include "migration/register.h"
+#include "qemu/hbitmap.h"
+#include "sysemu/sysemu.h"
+#include "qemu/cutils.h"
+#include "qapi/error.h"
+#include "trace.h"
+
+#define CHUNK_SIZE (1 << 10)
+
+/* Flags occupy one, two or four bytes (Big Endian). The size is determined as
+ * follows:
+ * in first (most significant) byte bit 8 is clear  -->  one byte
+ * in first byte bit 8 is set-->  two or four bytes, depending on second
+ *byte:
+

[Qemu-block] [PULL 41/41] block/mirror: change the semantic of 'force' of block-job-cancel

2018-03-13 Thread Kevin Wolf

From: Liang Li 

When mirroring a drive to slow shared storage, a heavy block I/O write
workload in the VM after the 'ready' event prevents the drive mirror block
job from being canceled immediately: the job keeps running until the heavy
write workload in the VM stops.

Libvirt depends on the current block-job-cancel semantics, which is that
when used without a flag after the 'ready' event, the command blocks
until data is in sync.  However, these semantics are awkward in other
situations, for example, people may use drive mirror for realtime
backups while still wanting to use block live migration.  Libvirt cannot
start a block live migration while another drive mirror is in progress,
but the user would rather abandon the backup attempt as broken and
proceed with the live migration than be stuck waiting for the current
drive mirror backup to finish.

The drive-mirror command already includes a 'force' flag, which libvirt
does not use, although it documented the flag as only being useful to
quit a job which is paused.  However, since quitting a paused job has
the same effect as abandoning a backup in a non-paused job (namely, the
destination file is not in sync, and the command completes immediately),
we can just improve the documentation to make the force flag obviously
useful.

Cc: Paolo Bonzini 
Cc: Jeff Cody 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Eric Blake 
Cc: John Snow 
Reported-by: Huaitong Han 
Signed-off-by: Huaitong Han 
Signed-off-by: Liang Li 
Signed-off-by: Jeff Cody 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json  |  5 +++--
 include/block/blockjob.h  | 12 ++--
 block/mirror.c| 10 --
 blockdev.c|  4 ++--
 blockjob.c| 16 +---
 tests/test-blockjob-txn.c |  8 
 hmp-commands.hx   |  3 ++-
 7 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 47ff5f8ce5..00ef614c03 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2204,8 +2204,9 @@
 #  the name of the parameter), but since QEMU 2.7 it can have
 #  other values.
 #
-# @force: whether to allow cancellation of a paused job (default
-# false).  Since 1.3.
+# @force: If true, and the job has already emitted the event BLOCK_JOB_READY,
+# abandon the job immediately (even if it is paused) instead of waiting
+# for the destination to complete its final synchronization (since 1.3)
 #
 # Returns: Nothing on success
 #  If no background operation is active on this device, DeviceNotActive
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 978274ed2b..fc645dac68 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -63,6 +63,12 @@ typedef struct BlockJob {
 bool cancelled;
 
 /**
+ * Set to true if the job should abort immediately without waiting
+ * for data to be in sync.
+ */
+bool force;
+
+/**
  * Counter for pause request. If non-zero, the block job is either paused,
  * or if busy == true will pause itself as soon as possible.
  */
@@ -230,10 +236,11 @@ void block_job_start(BlockJob *job);
 /**
  * block_job_cancel:
  * @job: The job to be canceled.
+ * @force: Quit a job without waiting for data to be in sync.
  *
  * Asynchronously cancel the specified job.
  */
-void block_job_cancel(BlockJob *job);
+void block_job_cancel(BlockJob *job, bool force);
 
 /**
  * block_job_complete:
@@ -307,11 +314,12 @@ void block_job_user_resume(BlockJob *job, Error **errp);
 /**
  * block_job_user_cancel:
  * @job: The job to be cancelled.
+ * @force: Quit a job without waiting for data to be in sync.
  *
  * Cancels the specified job, but may refuse to do so if the
  * operation isn't currently meaningful.
  */
-void block_job_user_cancel(BlockJob *job, Error **errp);
+void block_job_user_cancel(BlockJob *job, bool force, Error **errp);
 
 /**
  * block_job_cancel_sync:
diff --git a/block/mirror.c b/block/mirror.c
index 76fddb3838..820f512c7b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -869,11 +869,8 @@ static void coroutine_fn mirror_run(void *opaque)
 
 ret = 0;
 trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
-if (!s->synced) {
-block_job_sleep_ns(&s->common, delay_ns);
-if (block_job_is_cancelled(&s->common)) {
-break;
-}
+if (block_job_is_cancelled(&s->common) && s->common.force) {
+break;
 } else if (!should_complete) {
 delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
 block_job_sleep_ns(&s->common, delay_ns);
@@ -887,7 +884,8 @@ immediate_exit:
  * or

[Qemu-block] [PULL 33/41] parallels: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf

This adds the .bdrv_co_create driver callback to parallels, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Jeff Cody 
---
 qapi/block-core.json |  18 -
 block/parallels.c| 199 ++-
 2 files changed, 168 insertions(+), 49 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6211b8222c..e0ab01d92d 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3625,6 +3625,22 @@
 'size': 'size' } }
 
 ##
+# @BlockdevCreateOptionsParallels:
+#
+# Driver specific image creation options for parallels.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @cluster-size Cluster size in bytes (default: 1 MB)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsParallels',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*cluster-size':'size' } }
+
+##
 # @BlockdevQcow2Version:
 #
 # @v2:  The original QCOW2 format as introduced in qemu 0.10 (version 2)
@@ -3826,7 +3842,7 @@
   'null-aio':   'BlockdevCreateNotSupported',
   'null-co':'BlockdevCreateNotSupported',
   'nvme':   'BlockdevCreateNotSupported',
-  'parallels':  'BlockdevCreateNotSupported',
+  'parallels':  'BlockdevCreateOptionsParallels',
   'qcow2':  'BlockdevCreateOptionsQcow2',
   'qcow':   'BlockdevCreateNotSupported',
   'qed':'BlockdevCreateNotSupported',
diff --git a/block/parallels.c b/block/parallels.c
index c13cb619e6..2da5e56a9d 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -34,6 +34,9 @@
 #include "sysemu/block-backend.h"
 #include "qemu/module.h"
 #include "qemu/option.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 #include "qemu/bswap.h"
 #include "qemu/bitmap.h"
 #include "migration/blocker.h"
@@ -79,6 +82,25 @@ static QemuOptsList parallels_runtime_opts = {
 },
 };
 
+static QemuOptsList parallels_create_opts = {
+.name = "parallels-create-opts",
+.head = QTAILQ_HEAD_INITIALIZER(parallels_create_opts.head),
+.desc = {
+{
+.name = BLOCK_OPT_SIZE,
+.type = QEMU_OPT_SIZE,
+.help = "Virtual disk size",
+},
+{
+.name = BLOCK_OPT_CLUSTER_SIZE,
+.type = QEMU_OPT_SIZE,
+.help = "Parallels image cluster size",
+.def_value_str = stringify(DEFAULT_CLUSTER_SIZE),
+},
+{ /* end of list */ }
+}
+};
+
 
 static int64_t bat2sect(BDRVParallelsState *s, uint32_t idx)
 {
@@ -480,46 +502,62 @@ out:
 }
 
 
-static int coroutine_fn parallels_co_create_opts(const char *filename,
- QemuOpts *opts,
- Error **errp)
+static int coroutine_fn parallels_co_create(BlockdevCreateOptions* opts,
+Error **errp)
 {
+BlockdevCreateOptionsParallels *parallels_opts;
+BlockDriverState *bs;
+BlockBackend *blk;
 int64_t total_size, cl_size;
-uint8_t tmp[BDRV_SECTOR_SIZE];
-Error *local_err = NULL;
-BlockBackend *file;
 uint32_t bat_entries, bat_sectors;
 ParallelsHeader header;
+uint8_t tmp[BDRV_SECTOR_SIZE];
 int ret;
 
-total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
-cl_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
-  DEFAULT_CLUSTER_SIZE), BDRV_SECTOR_SIZE);
+assert(opts->driver == BLOCKDEV_DRIVER_PARALLELS);
+parallels_opts = &opts->u.parallels;
+
+/* Sanity checks */
+total_size = parallels_opts->size;
+
+if (parallels_opts->has_cluster_size) {
+cl_size = parallels_opts->cluster_size;
+} else {
+cl_size = DEFAULT_CLUSTER_SIZE;
+}
+
 if (total_size >= MAX_PARALLELS_IMAGE_FACTOR * cl_size) {
-error_propagate(errp, local_err);
+error_setg(errp, "Image size is too large for this cluster size");
 return -E2BIG;
 }
 
-ret = bdrv_create_file(filename, opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
-return ret;
+if (!QEMU_IS_ALIGNED(total_size, BDRV_SECTOR_SIZE)) {
+error_setg(errp, "Image size must be a multiple of 512 bytes");
+return -EINVAL;
 }
 
-file = blk_new_open(filename, NULL, NULL,
-BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
-&local_err);
-if (file == NULL) {
-error_propagate(errp, local_err);
+if (!QEMU_IS_ALIGNED(cl_size, BDRV_SECTOR_SIZE)) {
+error_setg(errp, "Cluster size must be a multiple of 512 bytes");
+

Re: [Qemu-block] [Qemu-devel] [PATCH] iotests: Update output of 051 and 186 after commit 1454509726719e0933c

2018-03-13 Thread Kevin Wolf

On 13.03.2018 at 17:22, Thomas Huth wrote:
> On 13.03.2018 17:08, Kevin Wolf wrote:
> > On 06.03.2018 at 17:52, Thomas Huth wrote:
> >> On 06.03.2018 17:45, Alberto Garcia wrote:
> >>> Signed-off-by: Alberto Garcia 
> >>> ---
> >>>  tests/qemu-iotests/051.pc.out | 20 
> >>>  tests/qemu-iotests/186.out| 22 +++---
> >>>  2 files changed, 3 insertions(+), 39 deletions(-)
> >>>
> >>> diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
> >>> index 830c11880a..b01f9a90d7 100644
> >>> --- a/tests/qemu-iotests/051.pc.out
> >>> +++ b/tests/qemu-iotests/051.pc.out
> >>> @@ -117,20 +117,10 @@ Testing: -drive if=ide,media=cdrom
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>>  (qemu) quit
> >>>  
> >>> -Testing: -drive if=scsi,media=cdrom
> >>> -QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
> >>> deprecated with this machine type
> >>> -quit
> >>> -
> >>>  Testing: -drive if=ide
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>>  (qemu) QEMU_PROG: Initialization of device ide-hd failed: Device needs 
> >>> media, but drive is empty
> >>>  
> >>> -Testing: -drive if=scsi
> >>> -QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive if=scsi: warning: bus=0,unit=0 is deprecated 
> >>> with this machine type
> >>> -QEMU_PROG: -drive if=scsi: Device needs media, but drive is empty
> >>> -
> >>>  Testing: -drive if=virtio
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>>  (qemu) QEMU_PROG: -drive if=virtio: Device needs media, but drive is 
> >>> empty
> >>> @@ -170,20 +160,10 @@ Testing: -drive 
> >>> file=TEST_DIR/t.qcow2,if=ide,media=cdrom,readonly=on
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>>  (qemu) quit
> >>>  
> >>> -Testing: -drive file=TEST_DIR/t.qcow2,if=scsi,media=cdrom,readonly=on
> >>> -QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive 
> >>> file=TEST_DIR/t.qcow2,if=scsi,media=cdrom,readonly=on: warning: 
> >>> bus=0,unit=0 is deprecated with this machine type
> >>> -quit
> >>> -
> >>>  Testing: -drive file=TEST_DIR/t.qcow2,if=ide,readonly=on
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>>  (qemu) QEMU_PROG: Initialization of device ide-hd failed: Block node is 
> >>> read-only
> >>>  
> >>> -Testing: -drive file=TEST_DIR/t.qcow2,if=scsi,readonly=on
> >>> -QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive file=TEST_DIR/t.qcow2,if=scsi,readonly=on: 
> >>> warning: bus=0,unit=0 is deprecated with this machine type
> >>> -quit
> >>> -
> >>>  Testing: -drive file=TEST_DIR/t.qcow2,if=virtio,readonly=on
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>>  (qemu) quit
> >>
> >> Ack for that part.
> >>
> >>> diff --git a/tests/qemu-iotests/186.out b/tests/qemu-iotests/186.out
> >>> index c8377fe146..d83bba1a88 100644
> >>> --- a/tests/qemu-iotests/186.out
> >>> +++ b/tests/qemu-iotests/186.out
> >>> @@ -444,31 +444,15 @@ ide0-cd0 (NODE_NAME): null-co:// (null-co, 
> >>> read-only)
> >>>  
> >>>  Testing: -drive if=scsi,driver=null-co
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive if=scsi,driver=null-co: warning: bus=0,unit=0 
> >>> is deprecated with this machine type
> >>> -info block
> >>> -scsi0-hd0 (NODE_NAME): null-co:// (null-co)
> >>> -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
> >>> -Cache mode:   writeback
> >>> -(qemu) quit
> >>> +(qemu) QEMU_PROG: -drive if=scsi,driver=null-co: machine type does not 
> >>> support if=scsi,bus=0,unit=0
> >>>  
> >>>  Testing: -drive if=scsi,media=cdrom
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
> >>> deprecated with this machine type
> >>> -info block
> >>> -scsi0-cd0: [not inserted]
> >>> -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
> >>> -Removable device: not locked, tray closed
> >>> -(qemu) quit
> >>> +(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: machine type does not 
> >>> support if=scsi,bus=0,unit=0
> >>>  
> >>>  Testing: -drive if=scsi,driver=null-co,media=cdrom
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive if=scsi,driver=null-co,media=cdrom: warning: 
> >>> bus=0,unit=0 is deprecated with this machine type
> >>> -info block
> >>> -scsi0-cd0 (NODE_NAME): null-co:// (null-co, read-only)
> >>> -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
> >>> -Removable device: not locked, tray closed
> >>> -Cache mode:   writeback
> >>> -(qemu) quit
> >>> +(qemu) QEMU_PROG: -drive if=scsi,driver=null-co,media=cdrom: machine 
> >>> type does not support if=scsi,bus=0,unit=0
> >>
> >> That rather sounds

[Qemu-block] [PULL 28/41] vdi: Pull option parsing from vdi_co_create

2018-03-13 Thread Kevin Wolf

From: Max Reitz 

In preparation for QAPI-fying VDI image creation, we have to create a
BlockdevCreateOptionsVdi type which is received by a (future)
vdi_co_create().

vdi_co_create_opts() now converts the QemuOpts object into such a
BlockdevCreateOptionsVdi object.  The protocol-layer file is still
created in vdi_co_do_create() (and BlockdevCreateOptionsVdi.file is set
to an empty string), but that will be addressed by a follow-up patch.

Note that cluster-size is not part of the QAPI schema because it is not
supported by default.
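
The way the patch handles build-time restrictions can be sketched as follows:
the options are always read, and a build that lacks the feature rejects them
with -ENOTSUP.  This is a sketch under assumptions, not the driver code:
VdiCreateSketch and vdi_check_build_sketch() are hypothetical names, and the
1 MiB default is an assumed value for DEFAULT_CLUSTER_SIZE.  Only the two
CONFIG_VDI_* macro names come from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_DEFAULT_CLUSTER_SIZE ((size_t)1 << 20)  /* assumed: 1 MiB */

/* Hypothetical stand-in for the parsed creation options. */
typedef struct {
    bool want_static;   /* request a statically allocated image */
    size_t block_size;  /* requested cluster size in bytes */
} VdiCreateSketch;

/* Returns 0 if this build can honour the options, -ENOTSUP otherwise. */
static int vdi_check_build_sketch(const VdiCreateSketch *opts)
{
    assert(opts != NULL);

#ifndef CONFIG_VDI_STATIC_IMAGE
    if (opts->want_static) {
        return -ENOTSUP;   /* static images are compiled out */
    }
#endif
#ifndef CONFIG_VDI_BLOCK_SIZE
    if (opts->block_size != SKETCH_DEFAULT_CLUSTER_SIZE) {
        return -ENOTSUP;   /* only the default cluster size is compiled in */
    }
#endif
    return 0;
}
```

Compiled with neither macro defined, the sketch accepts only a dynamic image
with the default cluster size.  Defining a macro (for example with
-DCONFIG_VDI_STATIC_IMAGE) removes the corresponding restriction.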

Signed-off-by: Max Reitz 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 18 +++
 block/vdi.c  | 91 
 2 files changed, 95 insertions(+), 14 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index ba2d10d13a..c69d70d7a8 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3766,6 +3766,24 @@
 'size': 'size' } }
 
 ##
+# @BlockdevCreateOptionsVdi:
+#
+# Driver specific image creation options for VDI.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @static   Whether to create a statically (true) or
+#   dynamically (false) allocated image
+#   (default: false, i.e. dynamic)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsVdi',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*static':  'bool' } }
+
+##
 # @BlockdevCreateNotSupported:
 #
 # This is used for all drivers that don't support creating images.
diff --git a/block/vdi.c b/block/vdi.c
index 2b5ddd0666..0c8f8204ce 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -51,6 +51,9 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 #include "block/block_int.h"
 #include "sysemu/block-backend.h"
 #include "qemu/module.h"
@@ -140,6 +143,8 @@
 #define VDI_DISK_SIZE_MAX((uint64_t)VDI_BLOCKS_IN_IMAGE_MAX * \
   (uint64_t)DEFAULT_CLUSTER_SIZE)
 
+static QemuOptsList vdi_create_opts;
+
 typedef struct {
 char text[0x40];
 uint32_t signature;
@@ -716,13 +721,14 @@ nonallocating_write:
 return ret;
 }
 
-static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
-   Error **errp)
+static int coroutine_fn vdi_co_do_create(const char *filename,
+ QemuOpts *file_opts,
+ BlockdevCreateOptionsVdi *vdi_opts,
+ size_t block_size, Error **errp)
 {
 int ret = 0;
 uint64_t bytes = 0;
 uint32_t blocks;
-size_t block_size = DEFAULT_CLUSTER_SIZE;
 uint32_t image_type = VDI_TYPE_DYNAMIC;
 VdiHeader header;
 size_t i;
@@ -735,18 +741,25 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 logout("\n");
 
 /* Read out options. */
-bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
- BDRV_SECTOR_SIZE);
-#if defined(CONFIG_VDI_BLOCK_SIZE)
-/* TODO: Additional checks (SECTOR_SIZE * 2^n, ...). */
-block_size = qemu_opt_get_size_del(opts,
-   BLOCK_OPT_CLUSTER_SIZE,
-   DEFAULT_CLUSTER_SIZE);
-#endif
-#if defined(CONFIG_VDI_STATIC_IMAGE)
-if (qemu_opt_get_bool_del(opts, BLOCK_OPT_STATIC, false)) {
+bytes = vdi_opts->size;
+if (vdi_opts->q_static) {
 image_type = VDI_TYPE_STATIC;
 }
+#ifndef CONFIG_VDI_STATIC_IMAGE
+if (image_type == VDI_TYPE_STATIC) {
+ret = -ENOTSUP;
+error_setg(errp, "Statically allocated images cannot be created in "
+   "this build");
+goto exit;
+}
+#endif
+#ifndef CONFIG_VDI_BLOCK_SIZE
+if (block_size != DEFAULT_CLUSTER_SIZE) {
+ret = -ENOTSUP;
+error_setg(errp,
+   "A non-default cluster size is not supported in this build");
+goto exit;
+}
 #endif
 
 if (bytes > VDI_DISK_SIZE_MAX) {
@@ -757,7 +770,7 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 goto exit;
 }
 
-ret = bdrv_create_file(filename, opts, &local_err);
+ret = bdrv_create_file(filename, file_opts, &local_err);
 if (ret < 0) {
 error_propagate(errp, local_err);
 goto exit;
@@ -847,6 +860,56 @@ exit:
 return ret;
 }
 
+static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
+   Error **errp)
+{
+QDict *qdict = NULL;
+BlockdevCreateOptionsVdi *create_options = NULL;
+uint64_t block_size = DEFAULT_CLUSTER_SIZE;
+Visitor *v;

Re: [Qemu-block] [Qemu-devel] [PATCH] iotests: Update output of 051 and 186 after commit 1454509726719e0933c

2018-03-13 Thread Thomas Huth

On 13.03.2018 17:08, Kevin Wolf wrote:
> On 06.03.2018 at 17:52, Thomas Huth wrote:
>> On 06.03.2018 17:45, Alberto Garcia wrote:
>>> Signed-off-by: Alberto Garcia 
>>> ---
>>>  tests/qemu-iotests/051.pc.out | 20 
>>>  tests/qemu-iotests/186.out| 22 +++---
>>>  2 files changed, 3 insertions(+), 39 deletions(-)
>>>
>>> diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
>>> index 830c11880a..b01f9a90d7 100644
>>> --- a/tests/qemu-iotests/051.pc.out
>>> +++ b/tests/qemu-iotests/051.pc.out
>>> @@ -117,20 +117,10 @@ Testing: -drive if=ide,media=cdrom
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) quit
>>>  
>>> -Testing: -drive if=scsi,media=cdrom
>>> -QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
>>> deprecated with this machine type
>>> -quit
>>> -
>>>  Testing: -drive if=ide
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) QEMU_PROG: Initialization of device ide-hd failed: Device needs 
>>> media, but drive is empty
>>>  
>>> -Testing: -drive if=scsi
>>> -QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive if=scsi: warning: bus=0,unit=0 is deprecated with 
>>> this machine type
>>> -QEMU_PROG: -drive if=scsi: Device needs media, but drive is empty
>>> -
>>>  Testing: -drive if=virtio
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) QEMU_PROG: -drive if=virtio: Device needs media, but drive is empty
>>> @@ -170,20 +160,10 @@ Testing: -drive 
>>> file=TEST_DIR/t.qcow2,if=ide,media=cdrom,readonly=on
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) quit
>>>  
>>> -Testing: -drive file=TEST_DIR/t.qcow2,if=scsi,media=cdrom,readonly=on
>>> -QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive 
>>> file=TEST_DIR/t.qcow2,if=scsi,media=cdrom,readonly=on: warning: 
>>> bus=0,unit=0 is deprecated with this machine type
>>> -quit
>>> -
>>>  Testing: -drive file=TEST_DIR/t.qcow2,if=ide,readonly=on
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) QEMU_PROG: Initialization of device ide-hd failed: Block node is 
>>> read-only
>>>  
>>> -Testing: -drive file=TEST_DIR/t.qcow2,if=scsi,readonly=on
>>> -QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive file=TEST_DIR/t.qcow2,if=scsi,readonly=on: 
>>> warning: bus=0,unit=0 is deprecated with this machine type
>>> -quit
>>> -
>>>  Testing: -drive file=TEST_DIR/t.qcow2,if=virtio,readonly=on
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) quit
>>
>> Ack for that part.
>>
>>> diff --git a/tests/qemu-iotests/186.out b/tests/qemu-iotests/186.out
>>> index c8377fe146..d83bba1a88 100644
>>> --- a/tests/qemu-iotests/186.out
>>> +++ b/tests/qemu-iotests/186.out
>>> @@ -444,31 +444,15 @@ ide0-cd0 (NODE_NAME): null-co:// (null-co, read-only)
>>>  
>>>  Testing: -drive if=scsi,driver=null-co
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive if=scsi,driver=null-co: warning: bus=0,unit=0 is 
>>> deprecated with this machine type
>>> -info block
>>> -scsi0-hd0 (NODE_NAME): null-co:// (null-co)
>>> -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
>>> -Cache mode:   writeback
>>> -(qemu) quit
>>> +(qemu) QEMU_PROG: -drive if=scsi,driver=null-co: machine type does not 
>>> support if=scsi,bus=0,unit=0
>>>  
>>>  Testing: -drive if=scsi,media=cdrom
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
>>> deprecated with this machine type
>>> -info block
>>> -scsi0-cd0: [not inserted]
>>> -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
>>> -Removable device: not locked, tray closed
>>> -(qemu) quit
>>> +(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: machine type does not 
>>> support if=scsi,bus=0,unit=0
>>>  
>>>  Testing: -drive if=scsi,driver=null-co,media=cdrom
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive if=scsi,driver=null-co,media=cdrom: warning: 
>>> bus=0,unit=0 is deprecated with this machine type
>>> -info block
>>> -scsi0-cd0 (NODE_NAME): null-co:// (null-co, read-only)
>>> -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
>>> -Removable device: not locked, tray closed
>>> -Cache mode:   writeback
>>> -(qemu) quit
>>> +(qemu) QEMU_PROG: -drive if=scsi,driver=null-co,media=cdrom: machine type 
>>> does not support if=scsi,bus=0,unit=0
>>
>> That rather sounds like this "if=scsi" test should be removed now?
> 
> I think, it actually sounds like a SCSI adapter should be added manually
> now.

The "-drive if=scsi" syntax was deprecated for x86 and has now been
completely removed. It also does not work there anymore if you configure
a SCSI

[Qemu-block] [PULL 23/41] luks: Create block_crypto_co_create_generic()

2018-03-13 Thread Kevin Wolf

Everything that refers to the protocol layer or QemuOpts is moved out of
block_crypto_create_generic(), so that the remaining function is
suitable to be called by a .bdrv_co_create implementation.

LUKS is the only driver that actually implements the old interface, and
we don't intend to use it in any new drivers, so put the moved out code
directly into a LUKS function rather than creating a generic
intermediate one.

Signed-off-by: Kevin Wolf 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Eric Blake 
---
 block/crypto.c | 95 +-
 1 file changed, 61 insertions(+), 34 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 77871640cc..b0a4cb3388 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -306,43 +306,29 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
 }
 
 
-static int block_crypto_create_generic(QCryptoBlockFormat format,
-   const char *filename,
-   QemuOpts *opts,
-   Error **errp)
+static int block_crypto_co_create_generic(BlockDriverState *bs,
+  int64_t size,
+  QCryptoBlockCreateOptions *opts,
+  Error **errp)
 {
-int ret = -EINVAL;
-QCryptoBlockCreateOptions *create_opts = NULL;
+int ret;
+BlockBackend *blk;
 QCryptoBlock *crypto = NULL;
-struct BlockCryptoCreateData data = {
-.size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
- BDRV_SECTOR_SIZE),
-};
-QDict *cryptoopts;
-
-/* Parse options */
-cryptoopts = qemu_opts_to_qdict(opts, NULL);
+struct BlockCryptoCreateData data;
 
-create_opts = block_crypto_create_opts_init(format, cryptoopts, errp);
-if (!create_opts) {
-return -1;
-}
+blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
 
-/* Create protocol layer */
-ret = bdrv_create_file(filename, opts, errp);
+ret = blk_insert_bs(blk, bs, errp);
 if (ret < 0) {
-return ret;
+goto cleanup;
 }
 
-data.blk = blk_new_open(filename, NULL, NULL,
-BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
-errp);
-if (!data.blk) {
-return -EINVAL;
-}
+data = (struct BlockCryptoCreateData) {
+.blk = blk,
+.size = size,
+};
 
-/* Create format layer */
-crypto = qcrypto_block_create(create_opts, NULL,
+crypto = qcrypto_block_create(opts, NULL,
   block_crypto_init_func,
   block_crypto_write_func,
   &data,
@@ -355,10 +341,8 @@ static int block_crypto_create_generic(QCryptoBlockFormat format,
 
 ret = 0;
  cleanup:
-QDECREF(cryptoopts);
 qcrypto_block_free(crypto);
-blk_unref(data.blk);
-qapi_free_QCryptoBlockCreateOptions(create_opts);
+blk_unref(blk);
 return ret;
 }
 
@@ -563,8 +547,51 @@ static int coroutine_fn block_crypto_co_create_opts_luks(const char *filename,
  QemuOpts *opts,
  Error **errp)
 {
-return block_crypto_create_generic(Q_CRYPTO_BLOCK_FORMAT_LUKS,
-   filename, opts, errp);
+QCryptoBlockCreateOptions *create_opts = NULL;
+BlockDriverState *bs = NULL;
+QDict *cryptoopts;
+int64_t size;
+int ret;
+
+/* Parse options */
+size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
+
+cryptoopts = qemu_opts_to_qdict_filtered(opts, NULL,
+ &block_crypto_create_opts_luks,
+ true);
+
+create_opts = block_crypto_create_opts_init(Q_CRYPTO_BLOCK_FORMAT_LUKS,
+cryptoopts, errp);
+if (!create_opts) {
+ret = -EINVAL;
+goto fail;
+}
+
+/* Create protocol layer */
+ret = bdrv_create_file(filename, opts, errp);
+if (ret < 0) {
+return ret;
+}
+
+bs = bdrv_open(filename, NULL, NULL,
+   BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL, errp);
+if (!bs) {
+ret = -EINVAL;
+goto fail;
+}
+
+/* Create format layer */
+ret = block_crypto_co_create_generic(bs, size, create_opts, errp);
+if (ret < 0) {
+goto fail;
+}
+
+ret = 0;
+fail:
+bdrv_unref(bs);
+qapi_free_QCryptoBlockCreateOptions(create_opts);
+QDECREF(cryptoopts);
+return ret;
 }
 
 static int block_crypto_get_info_luks(BlockDriverState *bs,
-- 
2.13.6

[Qemu-block] [PULL 40/41] vpc: Require aligned size in .bdrv_co_create

2018-03-13 Thread Kevin Wolf

Perform the rounding to match a CHS geometry only in the legacy code
path in .bdrv_co_create_opts. QMP now requires that the user already
passes a CHS aligned image size, unless force-size=true is given.

CHS alignment is required to make the image compatible with Virtual PC,
but not for use with newer Microsoft hypervisors.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 block/vpc.c | 113 +++-
 1 file changed, 82 insertions(+), 31 deletions(-)

diff --git a/block/vpc.c b/block/vpc.c
index 8824211713..28ffa0d2f8 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -902,6 +902,62 @@ static int create_fixed_disk(BlockBackend *blk, uint8_t *buf,
 return ret;
 }
 
+static int calculate_rounded_image_size(BlockdevCreateOptionsVpc *vpc_opts,
+uint16_t *out_cyls,
+uint8_t *out_heads,
+uint8_t *out_secs_per_cyl,
+int64_t *out_total_sectors,
+Error **errp)
+{
+int64_t total_size = vpc_opts->size;
+uint16_t cyls = 0;
+uint8_t heads = 0;
+uint8_t secs_per_cyl = 0;
+int64_t total_sectors;
+int i;
+
+/*
+ * Calculate matching total_size and geometry. Increase the number of
+ * sectors requested until we get enough (or fail). This ensures that
+ * qemu-img convert doesn't truncate images, but rather rounds up.
+ *
+ * If the image size can't be represented by a spec conformant CHS geometry,
+ * we set the geometry to 65535 x 16 x 255 (CxHxS) sectors and use
+ * the image size from the VHD footer to calculate total_sectors.
+ */
+if (vpc_opts->force_size) {
+/* This will force the use of total_size for sector count, below */
+cyls = VHD_CHS_MAX_C;
+heads= VHD_CHS_MAX_H;
+secs_per_cyl = VHD_CHS_MAX_S;
+} else {
+total_sectors = MIN(VHD_MAX_GEOMETRY, total_size / BDRV_SECTOR_SIZE);
+for (i = 0; total_sectors > (int64_t)cyls * heads * secs_per_cyl; i++) {
+calculate_geometry(total_sectors + i, &cyls, &heads, &secs_per_cyl);
+}
+}
+
+if ((int64_t)cyls * heads * secs_per_cyl == VHD_MAX_GEOMETRY) {
+total_sectors = total_size / BDRV_SECTOR_SIZE;
+/* Allow a maximum disk size of 2040 GiB */
+if (total_sectors > VHD_MAX_SECTORS) {
+error_setg(errp, "Disk size is too large, max size is 2040 GiB");
+return -EFBIG;
+}
+} else {
+total_sectors = (int64_t) cyls * heads * secs_per_cyl;
+}
+
+*out_total_sectors = total_sectors;
+if (out_cyls) {
+*out_cyls = cyls;
+*out_heads = heads;
+*out_secs_per_cyl = secs_per_cyl;
+}
+
+return 0;
+}
+
 static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
   Error **errp)
 {
@@ -911,7 +967,6 @@ static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
 
 uint8_t buf[1024];
 VHDFooter *footer = (VHDFooter *) buf;
-int i;
 uint16_t cyls = 0;
 uint8_t heads = 0;
 uint8_t secs_per_cyl = 0;
@@ -953,38 +1008,22 @@ static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
 }
 blk_set_allow_write_beyond_eof(blk, true);
 
-/*
- * Calculate matching total_size and geometry. Increase the number of
- * sectors requested until we get enough (or fail). This ensures that
- * qemu-img convert doesn't truncate images, but rather rounds up.
- *
- * If the image size can't be represented by a spec conformant CHS geometry,
- * we set the geometry to 65535 x 16 x 255 (CxHxS) sectors and use
- * the image size from the VHD footer to calculate total_sectors.
- */
-if (vpc_opts->force_size) {
-/* This will force the use of total_size for sector count, below */
-cyls = VHD_CHS_MAX_C;
-heads= VHD_CHS_MAX_H;
-secs_per_cyl = VHD_CHS_MAX_S;
-} else {
-total_sectors = MIN(VHD_MAX_GEOMETRY, total_size / BDRV_SECTOR_SIZE);
-for (i = 0; total_sectors > (int64_t)cyls * heads * secs_per_cyl; i++) {
-calculate_geometry(total_sectors + i, &cyls, &heads, &secs_per_cyl);
-}
+/* Get geometry and check that it matches the image size*/
+ret = calculate_rounded_image_size(vpc_opts, &cyls, &heads, &secs_per_cyl,
+   &total_sectors, errp);
+if (ret < 0) {
+goto out;
 }
 
-if ((int64_t)cyls * heads * secs_per_cyl == VHD_MAX_GEOMETRY) {
-total_sectors = total_size / BDRV_SECTOR_SIZE;
-/* Allow a maximum disk size of 2040 GiB */
-if (total_sectors > VHD_MAX_SECTORS) {
-error_setg(errp, "Disk size is too large, max size is 2040 GiB");
-ret = -EFBIG;
-goto out;
-}
-} else {

[Qemu-block] [PULL 34/41] qemu-iotests: Enable write tests for parallels

2018-03-13 Thread Kevin Wolf

Originally we added parallels as a read-only format to qemu-iotests
where we did just some tests with a binary image. Since then, write and
image creation support has been added to the driver, so we can now
enable it in _supported_fmt generic.

The driver doesn't support migration yet, though, so we need to add it
to the list of exceptions in 181.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Jeff Cody 
---
 tests/qemu-iotests/181   | 2 +-
 tests/qemu-iotests/check | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/181 b/tests/qemu-iotests/181
index 0c91e8f9de..5e767c6195 100755
--- a/tests/qemu-iotests/181
+++ b/tests/qemu-iotests/181
@@ -44,7 +44,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt generic
 # Formats that do not support live migration
-_unsupported_fmt qcow vdi vhdx vmdk vpc vvfat
+_unsupported_fmt qcow vdi vhdx vmdk vpc vvfat parallels
 _supported_proto generic
 _supported_os Linux
 
diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index e6b6ff7a04..469142cd58 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -284,7 +284,6 @@ testlist options
 
 -parallels)
 IMGFMT=parallels
-IMGFMT_GENERIC=false
 xpand=false
 ;;
 
-- 
2.13.6

[Qemu-block] [PULL 35/41] qcow: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf

This adds the .bdrv_co_create driver callback to qcow, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Jeff Cody 
---
 qapi/block-core.json |  21 +-
 block/qcow.c | 196 ++-
 2 files changed, 150 insertions(+), 67 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index e0ab01d92d..7b7d5a01fd 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3641,6 +3641,25 @@
 '*cluster-size':'size' } }
 
 ##
+# @BlockdevCreateOptionsQcow:
+#
+# Driver specific image creation options for qcow.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @backing-file File name of the backing file if a backing file
+#   should be used
+# @encrypt  Encryption options if the image should be encrypted
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsQcow',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*backing-file':'str',
+'*encrypt': 'QCryptoBlockCreateOptions' } }
+
+##
 # @BlockdevQcow2Version:
 #
 # @v2:  The original QCOW2 format as introduced in qemu 0.10 (version 2)
@@ -3843,8 +3862,8 @@
   'null-co':'BlockdevCreateNotSupported',
   'nvme':   'BlockdevCreateNotSupported',
   'parallels':  'BlockdevCreateOptionsParallels',
+  'qcow':   'BlockdevCreateOptionsQcow',
   'qcow2':  'BlockdevCreateOptionsQcow2',
-  'qcow':   'BlockdevCreateNotSupported',
   'qed':'BlockdevCreateNotSupported',
   'quorum': 'BlockdevCreateNotSupported',
   'raw':'BlockdevCreateNotSupported',
diff --git a/block/qcow.c b/block/qcow.c
index 47a18d9a3a..2e3770ca63 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -33,6 +33,8 @@
 #include <zlib.h>
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qstring.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 #include "crypto/block.h"
 #include "migration/blocker.h"
 #include "block/crypto.h"
@@ -86,6 +88,8 @@ typedef struct BDRVQcowState {
 Error *migration_blocker;
 } BDRVQcowState;
 
+static QemuOptsList qcow_create_opts;
+
 static int decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
 
 static int qcow_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -810,62 +814,50 @@ static void qcow_close(BlockDriverState *bs)
 error_free(s->migration_blocker);
 }
 
-static int coroutine_fn qcow_co_create_opts(const char *filename, QemuOpts *opts,
-Error **errp)
+static int coroutine_fn qcow_co_create(BlockdevCreateOptions *opts,
+   Error **errp)
 {
+BlockdevCreateOptionsQcow *qcow_opts;
 int header_size, backing_filename_len, l1_size, shift, i;
 QCowHeader header;
 uint8_t *tmp;
 int64_t total_size = 0;
-char *backing_file = NULL;
-Error *local_err = NULL;
 int ret;
+BlockDriverState *bs;
 BlockBackend *qcow_blk;
-char *encryptfmt = NULL;
-QDict *options;
-QDict *encryptopts = NULL;
-QCryptoBlockCreateOptions *crypto_opts = NULL;
 QCryptoBlock *crypto = NULL;
 
-/* Read out options */
-total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
+assert(opts->driver == BLOCKDEV_DRIVER_QCOW);
+qcow_opts = &opts->u.qcow;
+
+/* Sanity checks */
+total_size = qcow_opts->size;
 if (total_size == 0) {
 error_setg(errp, "Image size is too small, cannot be zero length");
-ret = -EINVAL;
-goto cleanup;
+return -EINVAL;
 }
 
-backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
-encryptfmt = qemu_opt_get_del(opts, BLOCK_OPT_ENCRYPT_FORMAT);
-if (encryptfmt) {
-if (qemu_opt_get(opts, BLOCK_OPT_ENCRYPT)) {
-error_setg(errp, "Options " BLOCK_OPT_ENCRYPT " and "
-   BLOCK_OPT_ENCRYPT_FORMAT " are mutually exclusive");
-ret = -EINVAL;
-goto cleanup;
-}
-} else if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
-encryptfmt = g_strdup("aes");
+if (qcow_opts->has_encrypt &&
+qcow_opts->encrypt->format != Q_CRYPTO_BLOCK_FORMAT_QCOW)
+{
+error_setg(errp, "Unsupported encryption format");
+return -EINVAL;
 }
 
-ret = bdrv_create_file(filename, opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
-goto cleanup;
+/* Create BlockBackend to write to the image */
+bs = bdrv_open_blockdev_ref(qcow_opts->file, errp);
+if (bs == NULL) {
+return -EIO;
 }
 
-qcow_blk = blk_new_open(filename, NULL, NULL,
-

Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread Vladimir Sementsov-Ogievskiy


13.03.2018 19:16, John Snow wrote:


On 03/13/2018 12:14 PM, Vladimir Sementsov-Ogievskiy wrote:

Hmm, I agree, it is the simplest thing we can do for now. I'll rethink
later how (and whether it is worth it) to switch to postcopy automatically
in the only-dirty-bitmaps case.
Should I respin?

Please do. I already staged patches 1-4 in my branch, so if you'd like,
you can respin just 5+.

https://github.com/jnsnow/qemu/tree/bitmaps

--js


Ok, I'll base on your branch. How should I write Based-on: for patchew 
in this case?


--
Best regards,
Vladimir

[Qemu-block] [PULL 12/41] blockjobs: ensure abort is called for cancelled jobs

2018-03-13 Thread Kevin Wolf

From: John Snow 

Presently, even if a job is canceled post-completion as a result of
a failing peer in a transaction, it will still call .commit because
nothing has updated or changed its return code.

This does not cause problems currently because backup's implementation
of .commit checks for cancellation itself.

I'd like to simplify this contract:

(1) Abort is called if the job/transaction fails
(2) Commit is called if the job/transaction succeeds

To this end: A job's return code, if 0, will be forcibly set as
-ECANCELED if that job has already concluded. Remove the now
redundant check in the backup job implementation.

We need to check for cancellation in both block_job_completed
AND block_job_completed_single, because jobs may be cancelled between
those two calls; for instance in transactions. This also necessitates
an ABORTING -> ABORTING transition to be allowed.

The check in block_job_completed could be removed, but there's no
point in starting to attempt to succeed a transaction that we know
in advance will fail.

This does NOT affect mirror jobs that are "canceled" during their
synchronous phase. The mirror job itself forcibly sets the canceled
property to false prior to ceding control, so such cases will invoke
the "commit" callback.
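
The simplified contract can be sketched as a small stand-alone model. This is only an illustration under assumptions: the `sketch_job` struct and its counters are hypothetical and replace the real BlockJob and driver callbacks; the return-code correction mirrors what the commit message describes.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified job model (not QEMU's BlockJob). */
struct sketch_job {
    int ret;          /* 0 on success, negative errno on failure */
    bool cancelled;   /* set if the job was cancelled at any point */
    int commits;      /* how often the commit path ran */
    int aborts;       /* how often the abort path ran */
};

/* A successful return code is overridden if the job was cancelled. */
static void sketch_update_rc(struct sketch_job *job)
{
    if (job->ret == 0 && job->cancelled) {
        job->ret = -ECANCELED;
    }
}

/* After the correction, the return code alone selects commit or abort. */
static void sketch_complete(struct sketch_job *job)
{
    sketch_update_rc(job);
    if (job->ret == 0) {
        job->commits++;
    } else {
        job->aborts++;
    }
}
```

With the return code corrected in one place, callers no longer need a separate cancellation test before choosing between commit and abort.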

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 block/backup.c |  2 +-
 blockjob.c | 21 -
 block/trace-events |  1 +
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 7e254dabff..453cd62c24 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -206,7 +206,7 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
 BdrvDirtyBitmap *bm;
 BlockDriverState *bs = blk_bs(job->common.blk);
 
-if (ret < 0 || block_job_is_cancelled(&job->common)) {
+if (ret < 0) {
 /* Merge the successor back into the parent, delete nothing. */
 bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
 assert(bm);
diff --git a/blockjob.c b/blockjob.c
index 59ac4a13c7..61af628376 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -51,7 +51,7 @@ bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
 /* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0, 0, 0},
 /* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 1, 0},
 /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0},
-/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 0},
+/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 1, 1, 0},
 /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 1},
 /* N: */ [BLOCK_JOB_STATUS_NULL]  = {0, 0, 0, 0, 0, 0, 0, 0, 0},
 };
@@ -405,13 +405,22 @@ static void block_job_conclude(BlockJob *job)
 }
 }
 
+static void block_job_update_rc(BlockJob *job)
+{
+if (!job->ret && block_job_is_cancelled(job)) {
+job->ret = -ECANCELED;
+}
+if (job->ret) {
+block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
+}
+}
+
 static void block_job_completed_single(BlockJob *job)
 {
 assert(job->completed);
 
-if (job->ret || block_job_is_cancelled(job)) {
-block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
-}
+/* Ensure abort is called for late-transactional failures */
+block_job_update_rc(job);
 
 if (!job->ret) {
 if (job->driver->commit) {
@@ -896,7 +905,9 @@ void block_job_completed(BlockJob *job, int ret)
 assert(blk_bs(job->blk)->job == job);
 job->completed = true;
 job->ret = ret;
-if (ret < 0 || block_job_is_cancelled(job)) {
+block_job_update_rc(job);
+trace_block_job_completed(job, ret, job->ret);
+if (job->ret) {
 block_job_completed_txn_abort(job);
 } else {
 block_job_completed_txn_success(job);
diff --git a/block/trace-events b/block/trace-events
index 266afd9e99..5e531e0310 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -5,6 +5,7 @@ bdrv_open_common(void *bs, const char *filename, int flags, const char *format_n
 bdrv_lock_medium(void *bs, bool locked) "bs %p locked %d"
 
 # blockjob.c
+block_job_completed(void *job, int ret, int jret) "job %p ret %d corrected ret %d"
 block_job_state_transition(void *job,  int ret, const char *legal, const char *s0, const char *s1) "job %p (ret: %d) attempting %s transition (%s-->%s)"
 block_job_apply_verb(void *job, const char *state, const char *verb, const char *legal) "job %p in state %s; applying verb %s (%s)"
 
-- 
2.13.6

[Qemu-block] [PULL 26/41] luks: Catch integer overflow for huge sizes

2018-03-13 Thread Kevin Wolf

When you request an image size close to UINT64_MAX, the addition of the
crypto header may cause an integer overflow. Catch it instead of
silently truncating the image size.
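
The overflow guard can be tried in isolation. The following is a minimal sketch, assuming a hypothetical helper name; the comparison itself is the one the patch adds, written so that the sum is never computed before it is known to fit.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 0 if size + headerlen fits in an int64_t,
 * -EFBIG otherwise. The subtraction is only evaluated when
 * size <= INT64_MAX, so it cannot wrap around.
 */
static int sketch_check_total_size(uint64_t size, size_t headerlen)
{
    if (size > INT64_MAX || headerlen > INT64_MAX - size) {
        return -EFBIG;
    }
    return 0;
}
```

Rearranging `size + headerlen > INT64_MAX` into `headerlen > INT64_MAX - size` is what keeps the check itself free of overflow.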

Signed-off-by: Kevin Wolf 
Reviewed-by: Daniel P. Berrangé 
---
 block/crypto.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/block/crypto.c b/block/crypto.c
index 00fb40c631..e0b8856f74 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -102,6 +102,11 @@ static ssize_t block_crypto_init_func(QCryptoBlock *block,
 {
 struct BlockCryptoCreateData *data = opaque;
 
+if (data->size > INT64_MAX || headerlen > INT64_MAX - data->size) {
+error_setg(errp, "The requested file size is too large");
+return -EFBIG;
+}
+
 /* User provided size should reflect amount of space made
  * available to the guest, so we must take account of that
  * which will be used by the crypto header
-- 
2.13.6

[Qemu-block] [PULL 31/41] block: Fix flags in reopen queue

2018-03-13 Thread Kevin Wolf

From: Fam Zheng 

Reopen flags are not synchronized according to the
bdrv_reopen_queue_child precedence until bdrv_reopen_prepare. It is a
bit too late: we already check the consistency in bdrv_check_perm before
that.

This fixes the bug that when bdrv_reopen a RO node as RW, the flags for
backing child are wrong. Before, we could recurse with flags.rw=1; now,
role->inherit_options + update_flags_from_options will make sure to
clear the bit when necessary.  Note that this will not clear an
explicitly set bit, as in the case of parallel block jobs (e.g.
test_stream_parallel in 030), because the explicit options include
'read-only=false' (for an intermediate node used by a different job).

Signed-off-by: Fam Zheng 
Reviewed-by: Max Reitz 
Signed-off-by: Kevin Wolf 
---
 block.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/block.c b/block.c
index 75a9fd49de..e02d83b027 100644
--- a/block.c
+++ b/block.c
@@ -2883,8 +2883,16 @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 
 /* Inherit from parent node */
 if (parent_options) {
+QemuOpts *opts;
+QDict *options_copy;
 assert(!flags);
 role->inherit_options(&flags, options, parent_flags, parent_options);
+options_copy = qdict_clone_shallow(options);
+opts = qemu_opts_create(&bdrv_runtime_opts, NULL, 0, &error_abort);
+qemu_opts_absorb_qdict(opts, options_copy, NULL);
+update_flags_from_options(&flags, opts);
+qemu_opts_del(opts);
+QDECREF(options_copy);
 }
 
 /* Old values are used for options that aren't set yet */
-- 
2.13.6

[Qemu-block] [PULL 27/41] qemu-iotests: Test luks QMP image creation

2018-03-13 Thread Kevin Wolf

Signed-off-by: Kevin Wolf 
Reviewed-by: Daniel P. Berrangé 
---
 tests/qemu-iotests/209   | 210 +++
 tests/qemu-iotests/209.out   | 136 
 tests/qemu-iotests/common.rc |   2 +-
 tests/qemu-iotests/group |   1 +
 4 files changed, 348 insertions(+), 1 deletion(-)
 create mode 100755 tests/qemu-iotests/209
 create mode 100644 tests/qemu-iotests/209.out

diff --git a/tests/qemu-iotests/209 b/tests/qemu-iotests/209
new file mode 100755
index 00..96a5213e77
--- /dev/null
+++ b/tests/qemu-iotests/209
@@ -0,0 +1,210 @@
+#!/bin/bash
+#
+# Test luks and file image creation
+#
+# Copyright (C) 2018 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=kw...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1   # failure is the default!
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt luks
+_supported_proto file
+_supported_os Linux
+
+function do_run_qemu()
+{
+echo Testing: "$@"
+$QEMU -nographic -qmp stdio -serial none "$@"
+echo
+}
+
+function run_qemu()
+{
+do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_qmp \
+  | _filter_qemu | _filter_imgfmt \
+  | _filter_actual_image_size
+}
+
+echo
+echo "=== Successful image creation (defaults) ==="
+echo
+
+size=$((128 * 1024 * 1024))
+
+run_qemu -object secret,id=keysec0,data="foo" <&1 | \
+$QEMU_IMG info $QEMU_IMG_EXTRA_ARGS "$@" "$TEST_IMG" 2>&1 | \
 sed -e "s#$IMGPROTO:$TEST_DIR#TEST_DIR#g" \
 -e "s#$TEST_DIR#TEST_DIR#g" \
 -e "s#$IMGFMT#IMGFMT#g" \
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index c401791fcd..b8d0fd6177 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -204,3 +204,4 @@
 205 rw auto quick
 206 rw auto
 207 rw auto
+209 rw auto
-- 
2.13.6

[Qemu-block] [PULL 22/41] luks: Separate image file creation from formatting

2018-03-13 Thread Kevin Wolf

The crypto driver used to create the image file in a callback from the
crypto subsystem. If we want to implement .bdrv_co_create, this needs to
go away because that callback will get a reference to an already
existing block node.

Move the image file creation to block_crypto_create_generic().

Signed-off-by: Kevin Wolf 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Eric Blake 
---
 block/crypto.c | 37 +
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index e6095e7807..77871640cc 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -71,8 +71,6 @@ static ssize_t block_crypto_read_func(QCryptoBlock *block,
 
 
 struct BlockCryptoCreateData {
-const char *filename;
-QemuOpts *opts;
 BlockBackend *blk;
 uint64_t size;
 };
@@ -103,27 +101,13 @@ static ssize_t block_crypto_init_func(QCryptoBlock *block,
   Error **errp)
 {
 struct BlockCryptoCreateData *data = opaque;
-int ret;
 
 /* User provided size should reflect amount of space made
  * available to the guest, so we must take account of that
  * which will be used by the crypto header
  */
-data->size += headerlen;
-
-qemu_opt_set_number(data->opts, BLOCK_OPT_SIZE, data->size, &error_abort);
-ret = bdrv_create_file(data->filename, data->opts, errp);
-if (ret < 0) {
-return -1;
-}
-
-data->blk = blk_new_open(data->filename, NULL, NULL,
- BDRV_O_RDWR | BDRV_O_PROTOCOL, errp);
-if (!data->blk) {
-return -1;
-}
-
-return 0;
+return blk_truncate(data->blk, data->size + headerlen, PREALLOC_MODE_OFF,
+errp);
 }
 
 
@@ -333,11 +317,10 @@ static int block_crypto_create_generic(QCryptoBlockFormat format,
 struct BlockCryptoCreateData data = {
 .size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
  BDRV_SECTOR_SIZE),
-.opts = opts,
-.filename = filename,
 };
 QDict *cryptoopts;
 
+/* Parse options */
 cryptoopts = qemu_opts_to_qdict(opts, NULL);
 
 create_opts = block_crypto_create_opts_init(format, cryptoopts, errp);
@@ -345,6 +328,20 @@ static int block_crypto_create_generic(QCryptoBlockFormat format,
 return -1;
 }
 
+/* Create protocol layer */
+ret = bdrv_create_file(filename, opts, errp);
+if (ret < 0) {
+return ret;
+}
+
+data.blk = blk_new_open(filename, NULL, NULL,
+BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
+errp);
+if (!data.blk) {
+return -EINVAL;
+}
+
+/* Create format layer */
 crypto = qcrypto_block_create(create_opts, NULL,
   block_crypto_init_func,
   block_crypto_write_func,
-- 
2.13.6

[Qemu-block] [PULL 20/41] iotests: test manual job dismissal

2018-03-13 Thread Kevin Wolf

From: John Snow 

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/056 | 187 +
 tests/qemu-iotests/056.out |   4 +-
 2 files changed, 189 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/056 b/tests/qemu-iotests/056
index 04f2c3c841..223292175a 100755
--- a/tests/qemu-iotests/056
+++ b/tests/qemu-iotests/056
@@ -29,6 +29,26 @@ backing_img = os.path.join(iotests.test_dir, 'backing.img')
 test_img = os.path.join(iotests.test_dir, 'test.img')
 target_img = os.path.join(iotests.test_dir, 'target.img')
 
+def img_create(img, fmt=iotests.imgfmt, size='64M', **kwargs):
+fullname = os.path.join(iotests.test_dir, '%s.%s' % (img, fmt))
+optargs = []
+for k,v in kwargs.iteritems():
+optargs = optargs + ['-o', '%s=%s' % (k,v)]
+args = ['create', '-f', fmt] + optargs + [fullname, size]
+iotests.qemu_img(*args)
+return fullname
+
+def try_remove(img):
+try:
+os.remove(img)
+except OSError:
+pass
+
+def io_write_patterns(img, patterns):
+for pattern in patterns:
+iotests.qemu_io('-c', 'write -P%s %s %s' % pattern, img)
+
+
 class TestSyncModesNoneAndTop(iotests.QMPTestCase):
 image_len = 64 * 1024 * 1024 # MB
 
@@ -108,5 +128,172 @@ class TestBeforeWriteNotifier(iotests.QMPTestCase):
 event = self.cancel_and_wait()
 self.assert_qmp(event, 'data/type', 'backup')
 
+class BackupTest(iotests.QMPTestCase):
+def setUp(self):
+self.vm = iotests.VM()
+self.test_img = img_create('test')
+self.dest_img = img_create('dest')
+self.vm.add_drive(self.test_img)
+self.vm.launch()
+
+def tearDown(self):
+self.vm.shutdown()
+try_remove(self.test_img)
+try_remove(self.dest_img)
+
+def hmp_io_writes(self, drive, patterns):
+for pattern in patterns:
+self.vm.hmp_qemu_io(drive, 'write -P%s %s %s' % pattern)
+self.vm.hmp_qemu_io(drive, 'flush')
+
+def qmp_backup_and_wait(self, cmd='drive-backup', serror=None,
+aerror=None, **kwargs):
+if not self.qmp_backup(cmd, serror, **kwargs):
+return False
+return self.qmp_backup_wait(kwargs['device'], aerror)
+
+def qmp_backup(self, cmd='drive-backup',
+   error=None, **kwargs):
+self.assertTrue('device' in kwargs)
+res = self.vm.qmp(cmd, **kwargs)
+if error:
+self.assert_qmp(res, 'error/desc', error)
+return False
+self.assert_qmp(res, 'return', {})
+return True
+
+def qmp_backup_wait(self, device, error=None):
+event = self.vm.event_wait(name="BLOCK_JOB_COMPLETED",
+   match={'data': {'device': device}})
+self.assertNotEqual(event, None)
+try:
+failure = self.dictpath(event, 'data/error')
+except AssertionError:
+# Backup succeeded.
+self.assert_qmp(event, 'data/offset', event['data']['len'])
+return True
+else:
+# Failure.
+self.assert_qmp(event, 'data/error', error)
+return False
+
+def test_dismiss_false(self):
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+ sync='full', target=self.dest_img,
+ auto_dismiss=True)
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+
+def test_dismiss_true(self):
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+ sync='full', target=self.dest_img,
+ auto_dismiss=False)
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return[0]/status', 'concluded')
+res = self.vm.qmp('block-job-dismiss', id='drive0')
+self.assert_qmp(res, 'return', {})
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+
+def test_dismiss_bad_id(self):
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+res = self.vm.qmp('block-job-dismiss', id='foobar')
+self.assert_qmp(res, 'error/class', 'DeviceNotActive')
+
+def test_dismiss_collision(self):
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+ sync='full', target=self.dest_img,
+ auto_dismiss=False)
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res,

[Qemu-block] [PULL 25/41] luks: Turn invalid assertion into check

2018-03-13 Thread Kevin Wolf

The .bdrv_getlength implementation of the crypto block driver asserted
that the payload offset isn't after EOF. This is an invalid assertion to
make as the image file could be corrupted. Instead, check it and return
-EIO if the file is too small for the payload offset.

Zero length images are fine, so trigger -EIO only on offset > len, not
on offset >= len as the assertion did before.
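
A minimal sketch of the boundary behaviour described above, assuming a hypothetical helper that receives the file length and the payload offset directly (the real driver obtains both from the block layer and the crypto layer):

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the guest-visible length, or a negative
 * errno. offset == file_len is valid and yields a zero-length image.
 */
static int64_t sketch_payload_length(int64_t file_len, uint64_t payload_offset)
{
    if (file_len < 0) {
        return file_len;                /* pass through an earlier error */
    }
    if (payload_offset > (uint64_t)file_len) {
        return -EIO;                    /* file too small for the header */
    }
    return file_len - (int64_t)payload_offset;
}
```

The equal case is the one the old assertion wrongly rejected: a file that holds only the header is a valid zero-length image.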

Signed-off-by: Kevin Wolf 
Reviewed-by: Daniel P. Berrangé 
---
 block/crypto.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/crypto.c b/block/crypto.c
index a1139b6f09..00fb40c631 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -518,7 +518,10 @@ static int64_t block_crypto_getlength(BlockDriverState *bs)
 
 uint64_t offset = qcrypto_block_get_payload_offset(crypto->block);
 assert(offset < INT64_MAX);
-assert(offset < len);
+
+if (offset > len) {
+return -EIO;
+}
 
 len -= offset;
 
-- 
2.13.6

[Qemu-block] [PULL 13/41] blockjobs: add commit, abort, clean helpers

2018-03-13 Thread Kevin Wolf

From: John Snow 

The completed_single function is getting a little mucked up with
checking to see which callbacks exist, so let's factor them out.
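
The shape of this refactoring can be shown with a stand-alone model. This is a sketch under assumptions: the `sketch_*` types, the `on_*` member names and the `ran` bit mask are hypothetical; only the pattern (one small helper per optional callback, each hiding the NULL check) comes from the patch.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_job;

/* Hypothetical driver vtable: any callback may be NULL. */
struct sketch_driver {
    void (*on_commit)(struct sketch_job *job);
    void (*on_abort)(struct sketch_job *job);
    void (*on_clean)(struct sketch_job *job);
};

struct sketch_job {
    const struct sketch_driver *driver;
    int ret;        /* 0 on success, negative errno on failure */
    unsigned ran;   /* bit mask recording which callbacks ran */
};

enum { RAN_COMMIT = 1, RAN_ABORT = 2, RAN_CLEAN = 4 };

/* Sample callbacks that only record that they were invoked. */
static void sketch_mark_commit(struct sketch_job *job) { job->ran |= RAN_COMMIT; }
static void sketch_mark_abort(struct sketch_job *job) { job->ran |= RAN_ABORT; }
static void sketch_mark_clean(struct sketch_job *job) { job->ran |= RAN_CLEAN; }

static void sketch_job_commit(struct sketch_job *job)
{
    assert(!job->ret);                  /* commit only after success */
    if (job->driver->on_commit) {
        job->driver->on_commit(job);
    }
}

static void sketch_job_abort(struct sketch_job *job)
{
    assert(job->ret);                   /* abort only after failure */
    if (job->driver->on_abort) {
        job->driver->on_abort(job);
    }
}

static void sketch_job_clean(struct sketch_job *job)
{
    if (job->driver->on_clean) {
        job->driver->on_clean(job);
    }
}

/* The caller now reads as plain control flow without NULL checks. */
static void sketch_job_finish(struct sketch_job *job)
{
    if (!job->ret) {
        sketch_job_commit(job);
    } else {
        sketch_job_abort(job);
    }
    sketch_job_clean(job);
}
```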

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 blockjob.c | 35 ++-
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 61af628376..0c64fadc6d 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -415,6 +415,29 @@ static void block_job_update_rc(BlockJob *job)
 }
 }
 
+static void block_job_commit(BlockJob *job)
+{
+assert(!job->ret);
+if (job->driver->commit) {
+job->driver->commit(job);
+}
+}
+
+static void block_job_abort(BlockJob *job)
+{
+assert(job->ret);
+if (job->driver->abort) {
+job->driver->abort(job);
+}
+}
+
+static void block_job_clean(BlockJob *job)
+{
+if (job->driver->clean) {
+job->driver->clean(job);
+}
+}
+
 static void block_job_completed_single(BlockJob *job)
 {
 assert(job->completed);
@@ -423,17 +446,11 @@ static void block_job_completed_single(BlockJob *job)
 block_job_update_rc(job);
 
 if (!job->ret) {
-if (job->driver->commit) {
-job->driver->commit(job);
-}
+block_job_commit(job);
 } else {
-if (job->driver->abort) {
-job->driver->abort(job);
-}
-}
-if (job->driver->clean) {
-job->driver->clean(job);
+block_job_abort(job);
 }
+block_job_clean(job);
 
 if (job->cb) {
 job->cb(job->opaque, job->ret);
-- 
2.13.6

[Qemu-block] [PULL 11/41] blockjobs: add block_job_dismiss

2018-03-13 Thread Kevin Wolf

From: John Snow 

For jobs that have reached their CONCLUDED state, prior to having their
last reference put down (meaning jobs that have completed successfully,
unsuccessfully, or have been canceled), allow the user to dismiss the
job's lingering status report via block-job-dismiss.

This gives management APIs the chance to conclusively determine if a job
failed or succeeded, even if the event broadcast was missed.

Note: block_job_do_dismiss and block_job_decommission happen to do
exactly the same thing, but they're called from different semantic
contexts, so both aliases are kept to improve readability.

Note 2: Don't worry about the 0x04 flag definition for AUTO_DISMISS, she
has a friend coming in a future patch to fill the hole where 0x02 is.

Verbs:
Dismiss: operates on CONCLUDED jobs only.
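
The verb rule can be sketched as a one-row permission table. The enum below is a hypothetical stand-in, and the order of its states is inferred from the transition-table comments elsewhere in this series rather than from the real BlockJobStatus definition.

```c
#include <stdbool.h>

/* Hypothetical status enum; order inferred, not copied from QEMU. */
enum sketch_status {
    SKETCH_UNDEFINED, SKETCH_CREATED, SKETCH_RUNNING, SKETCH_PAUSED,
    SKETCH_READY, SKETCH_STANDBY, SKETCH_ABORTING, SKETCH_CONCLUDED,
    SKETCH_NULL, SKETCH_STATUS_MAX
};

/* One row of a verb table: dismiss is permitted only when CONCLUDED. */
static const bool sketch_dismiss_allowed[SKETCH_STATUS_MAX] = {
    [SKETCH_CONCLUDED] = true,
};

static bool sketch_can_dismiss(enum sketch_status s)
{
    return sketch_dismiss_allowed[s];
}
```
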

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 24 +++-
 include/block/blockjob.h | 14 ++
 blockdev.c   | 14 ++
 blockjob.c   | 26 --
 block/trace-events   |  1 +
 5 files changed, 76 insertions(+), 3 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 4b777fc46f..fb577d45f8 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -970,10 +970,12 @@
 #
 # @complete: see @block-job-complete
 #
+# @dismiss: see @block-job-dismiss
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobVerb',
-  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete' ] }
+  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete', 'dismiss' ] }
 
 ##
 # @BlockJobStatus:
@@ -2244,6 +2246,26 @@
 { 'command': 'block-job-complete', 'data': { 'device': 'str' } }
 
 ##
+# @block-job-dismiss:
+#
+# For jobs that have already concluded, remove them from the query-block-jobs
+# list. This command only needs to be run for jobs which were started with
+# QEMU 2.12+ job lifetime management semantics.
+#
+# This command will refuse to operate on any job that has not yet reached
+# its terminal state, BLOCK_JOB_STATUS_CONCLUDED. For jobs that make use of
+# BLOCK_JOB_READY event, block-job-cancel or block-job-complete will still need
+# to be used as appropriate.
+#
+# @id: The job identifier.
+#
+# Returns: Nothing on success
+#
+# Since: 2.12
+##
+{ 'command': 'block-job-dismiss', 'data': { 'id': 'str' } }
+
+##
 # @BlockdevDiscardOptions:
 #
 # Determines how to handle discard requests.
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index df0a9773d1..c535829b46 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -142,6 +142,9 @@ typedef struct BlockJob {
 /** Current state; See @BlockJobStatus for details. */
 BlockJobStatus status;
 
+/** True if this job should automatically dismiss itself */
+bool auto_dismiss;
+
 BlockJobTxn *txn;
 QLIST_ENTRY(BlockJob) txn_list;
 } BlockJob;
@@ -151,6 +154,8 @@ typedef enum BlockJobCreateFlags {
 BLOCK_JOB_DEFAULT = 0x00,
 /* BlockJob is not QMP-created and should not send QMP events */
 BLOCK_JOB_INTERNAL = 0x01,
+/* BlockJob requires manual dismiss step */
+BLOCK_JOB_MANUAL_DISMISS = 0x04,
 } BlockJobCreateFlags;
 
 /**
@@ -235,6 +240,15 @@ void block_job_cancel(BlockJob *job);
 void block_job_complete(BlockJob *job, Error **errp);
 
 /**
+ * block_job_dismiss:
+ * @job: The job to be dismissed.
+ * @errp: Error object.
+ *
+ * Remove a concluded job from the query list.
+ */
+void block_job_dismiss(BlockJob **job, Error **errp);
+
+/**
  * block_job_query:
  * @job: The job to get information about.
  *
diff --git a/blockdev.c b/blockdev.c
index f70a783803..9900cbc7dd 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3853,6 +3853,20 @@ void qmp_block_job_complete(const char *device, Error **errp)
 aio_context_release(aio_context);
 }
 
+void qmp_block_job_dismiss(const char *id, Error **errp)
+{
+AioContext *aio_context;
+BlockJob *job = find_block_job(id, &aio_context, errp);
+
+if (!job) {
+return;
+}
+
+trace_qmp_block_job_dismiss(job);
+block_job_dismiss(&job, errp);
+aio_context_release(aio_context);
+}
+
 void qmp_change_backing_file(const char *device,
  const char *image_node_name,
  const char *backing_file,
diff --git a/blockjob.c b/blockjob.c
index 2ef48075b0..59ac4a13c7 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -63,6 +63,7 @@ bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
 [BLOCK_JOB_VERB_RESUME]   = {0, 1, 1, 1, 1, 1, 0, 0, 0},
 [BLOCK_JOB_VERB_SET_SPEED]= {0, 1, 1, 1, 1, 1, 0, 0, 0},
 [BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0, 0, 0, 0},
+[BLOCK_JOB_VERB_DISMISS]  = {0, 0, 0, 0, 0, 0, 0, 1, 0},
 };
 
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
@@ -391,9 +392,17 @@ static void

[Qemu-block] [PULL 39/41] vpc: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf

This adds the .bdrv_co_create driver callback to vpc, which
enables image creation over QMP.
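
One detail of the new callback is how the optional subformat is resolved to a disk type. A minimal sketch, assuming hypothetical enum and function names in place of the QAPI-generated and driver-internal ones:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the QAPI enum and the driver's disk types. */
enum sketch_subformat { SKETCH_SUBFORMAT_DYNAMIC, SKETCH_SUBFORMAT_FIXED };
enum sketch_disk_type { SKETCH_DISK_DYNAMIC, SKETCH_DISK_FIXED };

/* An absent subformat falls back to the documented default, dynamic. */
static enum sketch_disk_type sketch_vpc_disk_type(bool has_subformat,
                                                  enum sketch_subformat sub)
{
    if (!has_subformat) {
        sub = SKETCH_SUBFORMAT_DYNAMIC;
    }
    switch (sub) {
    case SKETCH_SUBFORMAT_FIXED:
        return SKETCH_DISK_FIXED;
    case SKETCH_SUBFORMAT_DYNAMIC:
    default:
        return SKETCH_DISK_DYNAMIC;
    }
}
```

Because the input is an enum rather than a string, the "invalid disk type" error path of the old option parser is no longer needed.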

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 qapi/block-core.json |  33 ++-
 block/vpc.c  | 152 ++-
 2 files changed, 147 insertions(+), 38 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 350094f46a..47ff5f8ce5 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3880,6 +3880,37 @@
 '*block-state-zero':'bool' } }
 
 ##
+# @BlockdevVpcSubformat:
+#
+# @dynamic: Growing image file
+# @fixed:   Preallocated fixed-size image file
+#
+# Since: 2.12
+##
+{ 'enum': 'BlockdevVpcSubformat',
+  'data': [ 'dynamic', 'fixed' ] }
+
+##
+# @BlockdevCreateOptionsVpc:
+#
+# Driver specific image creation options for vpc (VHD).
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @subformat vpc subformat (default: dynamic)
+# @force-size   Force use of the exact byte size instead of rounding to the
+#   next size that can be represented in CHS geometry
+#   (default: false)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsVpc',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*subformat':   'BlockdevVpcSubformat',
+'*force-size':  'bool' } }
+
+##
 # @BlockdevCreateNotSupported:
 #
 # This is used for all drivers that don't support creating images.
@@ -3936,7 +3967,7 @@
   'vdi':'BlockdevCreateOptionsVdi',
   'vhdx':   'BlockdevCreateOptionsVhdx',
   'vmdk':   'BlockdevCreateNotSupported',
-  'vpc':'BlockdevCreateNotSupported',
+  'vpc':'BlockdevCreateOptionsVpc',
   'vvfat':  'BlockdevCreateNotSupported',
   'vxhs':   'BlockdevCreateNotSupported'
   } }
diff --git a/block/vpc.c b/block/vpc.c
index b2e2b9ebd4..8824211713 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -32,6 +32,9 @@
 #include "migration/blocker.h"
 #include "qemu/bswap.h"
 #include "qemu/uuid.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 
 /**/
 
@@ -166,6 +169,8 @@ static QemuOptsList vpc_runtime_opts = {
 }
 };
 
+static QemuOptsList vpc_create_opts;
+
 static uint32_t vpc_checksum(uint8_t* buf, size_t size)
 {
 uint32_t res = 0;
@@ -897,12 +902,15 @@ static int create_fixed_disk(BlockBackend *blk, uint8_t *buf,
 return ret;
 }
 
-static int coroutine_fn vpc_co_create_opts(const char *filename, QemuOpts *opts,
-   Error **errp)
+static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
+  Error **errp)
 {
+BlockdevCreateOptionsVpc *vpc_opts;
+BlockBackend *blk = NULL;
+BlockDriverState *bs = NULL;
+
 uint8_t buf[1024];
 VHDFooter *footer = (VHDFooter *) buf;
-char *disk_type_param;
 int i;
 uint16_t cyls = 0;
 uint8_t heads = 0;
@@ -911,45 +919,38 @@ static int coroutine_fn vpc_co_create_opts(const char *filename, QemuOpts *opts,
 int64_t total_size;
 int disk_type;
 int ret = -EIO;
-bool force_size;
-Error *local_err = NULL;
-BlockBackend *blk = NULL;
 
-/* Read out options */
-total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
-disk_type_param = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
-if (disk_type_param) {
-if (!strcmp(disk_type_param, "dynamic")) {
-disk_type = VHD_DYNAMIC;
-} else if (!strcmp(disk_type_param, "fixed")) {
-disk_type = VHD_FIXED;
-} else {
-error_setg(errp, "Invalid disk type, %s", disk_type_param);
-ret = -EINVAL;
-goto out;
-}
-} else {
+assert(opts->driver == BLOCKDEV_DRIVER_VPC);
+vpc_opts = &opts->u.vpc;
+
+/* Validate options and set default values */
+total_size = vpc_opts->size;
+
+if (!vpc_opts->has_subformat) {
+vpc_opts->subformat = BLOCKDEV_VPC_SUBFORMAT_DYNAMIC;
+}
+switch (vpc_opts->subformat) {
+case BLOCKDEV_VPC_SUBFORMAT_DYNAMIC:
 disk_type = VHD_DYNAMIC;
+break;
+case BLOCKDEV_VPC_SUBFORMAT_FIXED:
+disk_type = VHD_FIXED;
+break;
+default:
+g_assert_not_reached();
 }
 
-force_size = qemu_opt_get_bool_del(opts, VPC_OPT_FORCE_SIZE, false);
-
-ret = bdrv_create_file(filename, opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
-goto out;
+/* Create BlockBackend to write to the image */
+bs = bdrv_open_blockdev_ref(vpc_opts->file, errp);
+if (bs == NULL) {
+return

[Qemu-block] [PULL 17/41] blockjobs: add PENDING status and event

2018-03-13 Thread Kevin Wolf

From: John Snow 

For jobs utilizing the new manual workflow, we intend to prohibit
them from modifying the block graph until the management layer provides
an explicit ACK via block-job-finalize to move the process forward.

To distinguish this runstate from "ready" or "waiting," we add a new
"pending" event and status.

For now, the transition from PENDING to CONCLUDED/ABORTING is automatic,
but a future commit will add the explicit block-job-finalize step.

Transitions:
Waiting -> Pending:   Normal transition.
Pending -> Concluded: Normal transition.
Pending -> Aborting:  Late transactional failures and cancellations.

Removed Transitions:
Waiting -> Concluded: Jobs must go to PENDING first.

Verbs:
Cancel: Can be applied to a pending job.

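                 +---------+
                 |UNDEFINED|
                 +--+------+
                    |
                 +--v----+
       +---------+CREATED+-----------------+
       |         +--+----+                 |
       |            |                      |
       |         +--+----+     +------+    |
       +---------+RUNNING<----->PAUSED|    |
       |         +--+-+--+     +------+    |
       |            | |                    |
       |            | +------------------+ |
       |            |                    | |
       |         +--v--+       +-------+ | |
       +---------+READY<------->STANDBY| | |
       |         +--+--+       +-------+ | |
       |            |                    | |
       |         +--v----+               | |
       +---------+WAITING<---------------+ |
       |         +--+----+                 |
       |            |                      |
       |         +--v----+                 |
       +---------+PENDING|                 |
       |         +--+----+                 |
       |            |                      |
    +--v-----+   +--v------+               |
    |ABORTING+--->CONCLUDED|               |
    +--------+   +--+------+               |
                    |                      |
                 +--v-+                    |
                 |NULL<--------------------+
                 +----+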
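
The transitions named in this message can be expressed as a small predicate. This is only a partial sketch: it covers the WAITING, PENDING and ABORTING rows, the enum is a hypothetical stand-in for BlockJobStatus, and the WAITING -> ABORTING and ABORTING -> ABORTING edges are read from the state diagram and from the earlier abort patch in this series rather than from this commit message.

```c
#include <stdbool.h>

/* Hypothetical subset of the job states (assumption, not QEMU's enum). */
enum sketch_state {
    SKETCH_WAITING, SKETCH_PENDING, SKETCH_ABORTING, SKETCH_CONCLUDED
};

/* Returns true if the transition is allowed in this partial model. */
static bool sketch_transition_ok(enum sketch_state from, enum sketch_state to)
{
    switch (from) {
    case SKETCH_WAITING:
        /* WAITING -> CONCLUDED is removed: jobs must pass through PENDING */
        return to == SKETCH_PENDING || to == SKETCH_ABORTING;
    case SKETCH_PENDING:
        return to == SKETCH_CONCLUDED || to == SKETCH_ABORTING;
    case SKETCH_ABORTING:
        return to == SKETCH_ABORTING || to == SKETCH_CONCLUDED;
    default:
        return false;   /* other rows are outside this sketch */
    }
}
```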

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 31 +-
 include/block/blockjob.h |  5 
 blockjob.c   | 67 +++-
 3 files changed, 78 insertions(+), 25 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6631614d0b..0ae12272ff 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1002,6 +1002,11 @@
 #   to the waiting state. This status will likely not be visible for
 #   the last job in a transaction.
 #
+# @pending: The job has finished its work, but has finalization steps that it
+#   needs to make prior to completing. These changes may require
+#   manual intervention by the management process if manual was set
+#   to true. These changes may still fail.
+#
 # @aborting: The job is in the process of being aborted, and will finish with
 #an error. The job will afterwards report that it is @concluded.
 #This status may not be visible to the management process.
@@ -1016,7 +1021,7 @@
 ##
 { 'enum': 'BlockJobStatus',
   'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
-   'waiting', 'aborting', 'concluded', 'null' ] }
+   'waiting', 'pending', 'aborting', 'concluded', 'null' ] }
 
 ##
 # @BlockJobInfo:
@@ -4263,6 +4268,30 @@
 'speed' : 'int' } }
 
 ##
+# @BLOCK_JOB_PENDING:
+#
+# Emitted when a block job is awaiting explicit authorization to finalize graph
+# changes via @block-job-finalize. If this job is part of a transaction, it will
+# not emit this event until the transaction has converged first.
+#
+# @type: job type
+#
+# @id: The job identifier.
+#
+# Since: 2.12
+#
+# Example:
+#
+# <- { "event": "BLOCK_JOB_PENDING",
+#  "data": { "type": "mirror", "id": "drive0" },
+#  "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
+#
+##
+{ 'event': 'BLOCK_JOB_PENDING',
+  'data': { 'type'  : 'BlockJobType',
+'id': 'str' } }
+
+##
 # @PreallocMode:
 #
 # Preallocation mode of QEMU image file
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index c535829b46..7c8d51effa 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -142,6 +142,9 @@ typedef struct BlockJob {
 /** Current state; See @BlockJobStatus for details. */
 BlockJobStatus status;
 
+/** True if this job should automatically finalize itself */
+bool auto_finalize;
+
 /** True if this job should automatically dismiss itself */
 bool auto_dismiss;
 
@@ -154,6 +157,8 @@ typedef enum BlockJobCreateFlags {
 BLOCK_JOB_DEFAULT = 0x00,
 /* BlockJob is not QMP-created and should not send QMP events */
 BLOCK_JOB_INTERNAL = 0x01,
+/* BlockJob requires manual finalize step */
+BLOCK_JOB_MANUAL_FINALIZE = 0x02,
 /* BlockJob requires manual dismiss step */

[Qemu-block] [PULL 19/41] blockjobs: Expose manual property

2018-03-13 Thread Kevin Wolf

From: John Snow 

Expose the "manual" property via QAPI for the backup-related jobs.
As of this commit, this allows the management API to request the
"concluded" and "dismiss" semantics for backup jobs.

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json   | 48 ++
 blockdev.c | 31 +++---
 blockjob.c |  2 ++
 tests/qemu-iotests/109.out | 24 +++
 4 files changed, 82 insertions(+), 23 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 2c32fc69f9..3e52d248eb 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1054,13 +1054,20 @@
 #
 # @status: Current job state/status (since 2.12)
 #
+# @auto-finalize: Job will finalize itself when PENDING, moving to
+# the CONCLUDED state. (since 2.12)
+#
+# @auto-dismiss: Job will dismiss itself when CONCLUDED, moving to the NULL
+#state and disappearing from the query list. (since 2.12)
+#
 # Since: 1.1
 ##
 { 'struct': 'BlockJobInfo',
   'data': {'type': 'str', 'device': 'str', 'len': 'int',
'offset': 'int', 'busy': 'bool', 'paused': 'bool', 'speed': 'int',
'io-status': 'BlockDeviceIoStatus', 'ready': 'bool',
-   'status': 'BlockJobStatus' } }
+   'status': 'BlockJobStatus',
+   'auto-finalize': 'bool', 'auto-dismiss': 'bool' } }
 
 ##
 # @query-block-jobs:
@@ -1210,6 +1217,18 @@
 #   default 'report' (no limitations, since this applies to
 #   a different block device than @device).
 #
+# @auto-finalize: When false, this job will wait in a PENDING state after it has
+# finished its work, waiting for @block-job-finalize.
+# When true, this job will automatically perform its abort or
+# commit actions.
+# Defaults to true. (Since 2.12)
+#
+# @auto-dismiss: When false, this job will wait in a CONCLUDED state after it
+# has ceased all work, and wait for @block-job-dismiss.
+# When true, this job will automatically disappear from the query
+# list without user intervention.
+# Defaults to true. (Since 2.12)
+#
 # Note: @on-source-error and @on-target-error only affect background
 # I/O.  If an error occurs during a guest write request, the device's
 # rerror/werror actions will be used.
@@ -1218,10 +1237,12 @@
 ##
 { 'struct': 'DriveBackup',
   'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
-'*format': 'str', 'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
-'*speed': 'int', '*bitmap': 'str', '*compress': 'bool',
+'*format': 'str', 'sync': 'MirrorSyncMode',
+'*mode': 'NewImageMode', '*speed': 'int',
+'*bitmap': 'str', '*compress': 'bool',
 '*on-source-error': 'BlockdevOnError',
-'*on-target-error': 'BlockdevOnError' } }
+'*on-target-error': 'BlockdevOnError',
+'*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
 
 ##
 # @BlockdevBackup:
@@ -1251,6 +1272,18 @@
 #   default 'report' (no limitations, since this applies to
 #   a different block device than @device).
 #
+# @auto-finalize: When false, this job will wait in a PENDING state after it has
+# finished its work, waiting for @block-job-finalize.
+# When true, this job will automatically perform its abort or
+# commit actions.
+# Defaults to true. (Since 2.12)
+#
+# @auto-dismiss: When false, this job will wait in a CONCLUDED state after it
+# has ceased all work, and wait for @block-job-dismiss.
+# When true, this job will automatically disappear from the query
+# list without user intervention.
+# Defaults to true. (Since 2.12)
+#
 # Note: @on-source-error and @on-target-error only affect background
 # I/O.  If an error occurs during a guest write request, the device's
 # rerror/werror actions will be used.
@@ -1259,11 +1292,10 @@
 ##
 { 'struct': 'BlockdevBackup',
   'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
-'sync': 'MirrorSyncMode',
-'*speed': 'int',
-'*compress': 'bool',
+'sync': 'MirrorSyncMode', '*speed': 'int', '*compress': 'bool',
 '*on-source-error': 'BlockdevOnError',
-'*on-target-error': 'BlockdevOnError' } }
+'*on-target-error': 'BlockdevOnError',
+'*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
 
 ##
 # @blockdev-snapshot-sync:
diff --git a/blockdev.c b/blockdev.c
index efd3ab2e99..809adbe7f9 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3261,7 +3261,7 @@ static BlockJob *do_drive_backup(DriveBackup *backup,

[Qemu-block] [PULL 38/41] vhdx: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf

This adds the .bdrv_co_create driver callback to vhdx, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 qapi/block-core.json |  40 +-
 block/vhdx.c | 216 ++-
 2 files changed, 203 insertions(+), 53 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index d091817855..350094f46a 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3842,6 +3842,44 @@
 '*static':  'bool' } }
 
 ##
+# @BlockdevVhdxSubformat:
+#
+# @dynamic: Growing image file
+# @fixed:   Preallocated fixed-size image file
+#
+# Since: 2.12
+##
+{ 'enum': 'BlockdevVhdxSubformat',
+  'data': [ 'dynamic', 'fixed' ] }
+
+##
+# @BlockdevCreateOptionsVhdx:
+#
+# Driver specific image creation options for vhdx.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @log-size Log size in bytes, must be a multiple of 1 MB
+#   (default: 1 MB)
+# @block-size   Block size in bytes, must be a multiple of 1 MB and not
+#   larger than 256 MB (default: automatically choose a block
+#   size depending on the image size)
+# @subformatvhdx subformat (default: dynamic)
+# @block-state-zero Force use of payload blocks of type 'ZERO'. Non-standard,
+#   but default.  Do not set to 'off' when using 'qemu-img
+#   convert' with subformat=dynamic.
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsVhdx',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*log-size':'size',
+'*block-size':  'size',
+'*subformat':   'BlockdevVhdxSubformat',
+'*block-state-zero':'bool' } }
+
+##
 # @BlockdevCreateNotSupported:
 #
 # This is used for all drivers that don't support creating images.
@@ -3896,7 +3934,7 @@
   'ssh':'BlockdevCreateOptionsSsh',
   'throttle':   'BlockdevCreateNotSupported',
   'vdi':'BlockdevCreateOptionsVdi',
-  'vhdx':   'BlockdevCreateNotSupported',
+  'vhdx':   'BlockdevCreateOptionsVhdx',
   'vmdk':   'BlockdevCreateNotSupported',
   'vpc':'BlockdevCreateNotSupported',
   'vvfat':  'BlockdevCreateNotSupported',
diff --git a/block/vhdx.c b/block/vhdx.c
index d82350d07c..f1b97f4b49 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -26,6 +26,9 @@
 #include "block/vhdx.h"
 #include "migration/blocker.h"
 #include "qemu/uuid.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 
 /* Options for VHDX creation */
 
@@ -39,6 +42,8 @@ typedef enum VHDXImageType {
 VHDX_TYPE_DIFFERENCING,   /* Currently unsupported */
 } VHDXImageType;
 
+static QemuOptsList vhdx_create_opts;
+
 /* Several metadata and region table data entries are identified by
  * guids in  a MS-specific GUID format. */
 
@@ -1792,59 +1797,71 @@ exit:
  *. ~ --- ~  ~  ~ ---.
  *   1MB
  */
-static int coroutine_fn vhdx_co_create_opts(const char *filename, QemuOpts *opts,
-Error **errp)
+static int coroutine_fn vhdx_co_create(BlockdevCreateOptions *opts,
+   Error **errp)
 {
+BlockdevCreateOptionsVhdx *vhdx_opts;
+BlockBackend *blk = NULL;
+BlockDriverState *bs = NULL;
+
 int ret = 0;
-uint64_t image_size = (uint64_t) 2 * GiB;
-uint32_t log_size   = 1 * MiB;
-uint32_t block_size = 0;
+uint64_t image_size;
+uint32_t log_size;
+uint32_t block_size;
 uint64_t signature;
 uint64_t metadata_offset;
 bool use_zero_blocks = false;
 
 gunichar2 *creator = NULL;
 glong creator_items;
-BlockBackend *blk;
-char *type = NULL;
 VHDXImageType image_type;
-Error *local_err = NULL;
 
-image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
-log_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_LOG_SIZE, 0);
-block_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_BLOCK_SIZE, 0);
-type = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
-use_zero_blocks = qemu_opt_get_bool_del(opts, VHDX_BLOCK_OPT_ZERO, true);
+assert(opts->driver == BLOCKDEV_DRIVER_VHDX);
+vhdx_opts = &opts->u.vhdx;
 
+/* Validate options and set default values */
+image_size = vhdx_opts->size;
 if (image_size > VHDX_MAX_IMAGE_SIZE) {
 error_setg_errno(errp, EINVAL, "Image size too large; max of 64TB");
-ret = -EINVAL;
-goto exit;
+return -EINVAL;
 }
 
-if (type == NULL) {
-type = g_strdup("dynamic");
+if (!vhdx_opts->has_log_size) {
+log_size =

[Qemu-block] [PULL 16/41] blockjobs: add waiting status

2018-03-13 Thread Kevin Wolf

From: John Snow 

For jobs that are stuck waiting on others in a transaction, it would
be nice to know that they are no longer "running" in that sense, but
instead are waiting on other jobs in the transaction.

Jobs that are "waiting" in this sense cannot be meaningfully altered
any longer as they have left their running loop. The only meaningful
user verb for jobs in this state is "cancel," which will cancel the
whole transaction, too.

Transitions:
Running -> Waiting:   Normal transition.
Ready   -> Waiting:   Normal transition.
Waiting -> Aborting:  Transactional cancellation.
Waiting -> Concluded: Normal transition.

Removed Transitions:
Running -> Concluded: Jobs must go to WAITING first.
Ready   -> Concluded: Jobs must go to WAITING first.

Verbs:
Cancel: Can be applied to WAITING jobs.

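             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED+-----------------+
   |         +--+----+                 |
   |            |                      |
   |         +--v----+     +------+    |
   +---------+RUNNING<----->PAUSED|    |
   |         +--+-+--+     +------+    |
   |            | |                    |
   |            | +------------------+ |
   |            |                    | |
   |         +--v--+       +-------+ | |
   +---------+READY<------->STANDBY| | |
   |         +--+--+       +-------+ | |
   |            |                    | |
   |         +--v----+               | |
   +---------+WAITING<---------------+ |
   |         +--+----+                 |
   |            |                      |
+--v-----+   +--v------+               |
|ABORTING+--->CONCLUDED|               |
+--------+   +--+------+               |
                |                      |
             +--v-+                    |
             |NULL<--------------------+
             +----+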
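
The transition rules listed above amount to a lookup in a two-dimensional table. The following standalone sketch shows the idea; the names SketchStatus, sketch_stt and sketch_can_transition are hypothetical, the rows are copied from the table in this patch, and the all-zero NULL row is an assumption (NULL is treated as a terminal state).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the job states after this commit. */
typedef enum {
    ST_UNDEFINED, ST_CREATED, ST_RUNNING, ST_PAUSED, ST_READY,
    ST_STANDBY, ST_WAITING, ST_ABORTING, ST_CONCLUDED, ST_NULL,
    ST__MAX
} SketchStatus;

/* Rows are the current state, columns the requested next state. */
static const bool sketch_stt[ST__MAX][ST__MAX] = {
    /*                U, C, R, P, Y, S, W, X, E, N */
    [ST_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0, 0},
    [ST_CREATED]   = {0, 0, 1, 0, 0, 0, 0, 1, 0, 1},
    [ST_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0, 0},
    [ST_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
    [ST_READY]     = {0, 0, 0, 0, 0, 1, 1, 1, 0, 0},
    [ST_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0},
    [ST_WAITING]   = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
    [ST_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
    [ST_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1},
    /* Assumption: NULL is terminal, so no outgoing transitions. */
    [ST_NULL]      = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
};

/* Return true if moving from 'from' to 'to' is a permitted transition. */
static bool sketch_can_transition(SketchStatus from, SketchStatus to)
{
    assert(from < ST__MAX && to < ST__MAX);
    return sketch_stt[from][to];
}
```

In this sketch a running or ready job can no longer reach CONCLUDED directly; it has to pass through WAITING first, as the "Removed Transitions" list states.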

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  6 +-
 blockjob.c   | 37 -
 2 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index fb577d45f8..6631614d0b 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -998,6 +998,10 @@
 # @standby: The job is ready, but paused. This is nearly identical to @paused.
 #   The job may return to @ready or otherwise be canceled.
 #
+# @waiting: The job is waiting for other jobs in the transaction to converge
+#   to the waiting state. This status will likely not be visible for
+#   the last job in a transaction.
+#
 # @aborting: The job is in the process of being aborted, and will finish with
 #an error. The job will afterwards report that it is @concluded.
 #This status may not be visible to the management process.
@@ -1012,7 +1016,7 @@
 ##
 { 'enum': 'BlockJobStatus',
   'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
-   'aborting', 'concluded', 'null' ] }
+   'waiting', 'aborting', 'concluded', 'null' ] }
 
 ##
 # @BlockJobInfo:
diff --git a/blockjob.c b/blockjob.c
index 1395d8eed1..996278ed9c 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,26 +44,27 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S, X, E, N */
-/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0},
-/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0, 1},
-/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0},
-/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0, 0, 0},
-/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 1, 0},
-/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0},
-/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 1, 1, 0},
-/* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 1},
-/* N: */ [BLOCK_JOB_STATUS_NULL]  = {0, 0, 0, 0, 0, 0, 0, 0, 0},
+  /* U, C, R, P, Y, S, W, X, E, N */
+/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0, 0},
+/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 0, 1, 0, 1},
+/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0, 0},
+/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
+/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 1, 0, 0},
+/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0},
+/* W: */ [BLOCK_JOB_STATUS_WAITING]   = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
+/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
+/* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1},
+/* N: */ [BLOCK_JOB_STATUS_NULL]  = {0, 0, 0,

[Qemu-block] [PULL 15/41] blockjobs: add prepare callback

2018-03-13 Thread Kevin Wolf

From: John Snow 

Some jobs upon finalization may need to perform some work that can
still fail. If these jobs are part of a transaction, it's important
that these callbacks fail the entire transaction.

We allow for a new callback in addition to commit/abort/clean that
allows us the opportunity to have fairly late-breaking failures
in the transactional process.

The expected flow is:

- All jobs in a transaction converge to the PENDING state,
  added in a forthcoming commit.
- Upon being finalized, either automatically or explicitly
  by the user, jobs prepare to complete.
- If any job fails preparation, all jobs call .abort.
- Otherwise, they succeed and call .commit.
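
The flow above can be sketched in isolation as a two-phase loop. This is a simplified model, not the blockjob.c implementation: TxnSketchJob, sketch_prepare and sketch_txn_finalize are hypothetical names, jobs sit in a plain array rather than a QLIST, and commit/abort are reduced to flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical job record; commit and abort are modelled as flags. */
typedef struct TxnSketchJob {
    int ret;                                   /* 0 while the job is healthy */
    int (*prepare)(struct TxnSketchJob *job);  /* optional, may be NULL */
    int committed;
    int aborted;
} TxnSketchJob;

/* Run the optional prepare step, but only for jobs that have not failed. */
static int sketch_prepare(TxnSketchJob *job)
{
    if (job->ret == 0 && job->prepare) {
        job->ret = job->prepare(job);
    }
    return job->ret;
}

/*
 * Phase 1: prepare every job, stopping at the first failure.
 * Phase 2: abort all jobs if anything failed, otherwise commit all.
 * Returns 0 on commit, or the first non-zero prepare result.
 */
static int sketch_txn_finalize(TxnSketchJob *jobs, size_t n)
{
    int rc = 0;
    size_t i;

    assert(jobs != NULL || n == 0);

    for (i = 0; i < n; i++) {
        rc = sketch_prepare(&jobs[i]);
        if (rc) {
            break;
        }
    }
    for (i = 0; i < n; i++) {
        if (rc) {
            jobs[i].aborted = 1;
        } else {
            jobs[i].committed = 1;
        }
    }
    return rc;
}

/* Two trivial prepare callbacks for trying the sketch out. */
static int sketch_prepare_ok(TxnSketchJob *job)   { (void)job; return 0; }
static int sketch_prepare_fail(TxnSketchJob *job) { (void)job; return -5; }
```

Initialising rc to 0 before the loop keeps the return value defined even for an empty transaction.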

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 include/block/blockjob_int.h | 10 ++
 blockjob.c   | 30 +++---
 2 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index 259d49b32a..642adce68b 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -54,6 +54,16 @@ struct BlockJobDriver {
 void (*complete)(BlockJob *job, Error **errp);
 
 /**
+ * If the callback is not NULL, prepare will be invoked when all the jobs
+ * belonging to the same transaction complete; or upon this job's completion
+ * if it is not in a transaction.
+ *
+ * This callback will not be invoked if the job has already failed.
+ * If it fails, abort and then clean will be called.
+ */
+int (*prepare)(BlockJob *job);
+
+/**
  * If the callback is not NULL, it will be invoked when all the jobs
  * belonging to the same transaction complete; or upon this job's
  * completion if it is not in a transaction. Skipped if NULL.
diff --git a/blockjob.c b/blockjob.c
index 7e03824751..1395d8eed1 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -415,6 +415,14 @@ static void block_job_update_rc(BlockJob *job)
 }
 }
 
+static int block_job_prepare(BlockJob *job)
+{
+if (job->ret == 0 && job->driver->prepare) {
+job->ret = job->driver->prepare(job);
+}
+return job->ret;
+}
+
 static void block_job_commit(BlockJob *job)
 {
 assert(!job->ret);
@@ -438,7 +446,7 @@ static void block_job_clean(BlockJob *job)
 }
 }
 
-static void block_job_completed_single(BlockJob *job)
+static int block_job_completed_single(BlockJob *job)
 {
 assert(job->completed);
 
@@ -472,6 +480,7 @@ static void block_job_completed_single(BlockJob *job)
 QLIST_REMOVE(job, txn_list);
 block_job_txn_unref(job->txn);
 block_job_conclude(job);
+return 0;
 }
 
 static void block_job_cancel_async(BlockJob *job)
@@ -487,17 +496,22 @@ static void block_job_cancel_async(BlockJob *job)
 job->cancelled = true;
 }
 
-static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))
+static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *))
 {
 AioContext *ctx;
 BlockJob *job, *next;
+int rc;
 
 QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
 ctx = blk_get_aio_context(job->blk);
 aio_context_acquire(ctx);
-fn(job);
+rc = fn(job);
 aio_context_release(ctx);
+if (rc) {
+break;
+}
 }
+return rc;
 }
 
 static int block_job_finish_sync(BlockJob *job,
@@ -580,6 +594,8 @@ static void block_job_completed_txn_success(BlockJob *job)
 {
 BlockJobTxn *txn = job->txn;
 BlockJob *other_job;
+int rc = 0;
+
 /*
  * Successful completion, see if there are other running jobs in this
  * txn.
@@ -590,6 +606,14 @@ static void block_job_completed_txn_success(BlockJob *job)
 }
 assert(other_job->ret == 0);
 }
+
+/* Jobs may require some prep-work to complete without failure */
+rc = block_job_txn_apply(txn, block_job_prepare);
+if (rc) {
+block_job_completed_txn_abort(job);
+return;
+}
+
 /* We are the last completed job, commit the transaction. */
 block_job_txn_apply(txn, block_job_completed_single);
 }
-- 
2.13.6

[Qemu-block] [PULL 36/41] qed: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf

This adds the .bdrv_co_create driver callback to qed, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 qapi/block-core.json |  25 ++-
 block/qed.c  | 204 ++-
 2 files changed, 162 insertions(+), 67 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 7b7d5a01fd..d091817855 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3703,6 +3703,29 @@
 '*refcount-bits':   'int' } }
 
 ##
+# @BlockdevCreateOptionsQed:
+#
+# Driver specific image creation options for qed.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @backing-file File name of the backing file if a backing file
+#   should be used
+# @backing-fmt  Name of the block driver to use for the backing file
+# @cluster-size Cluster size in bytes (default: 65536)
+# @table-size   L1/L2 table size (in clusters)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsQed',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*backing-file':'str',
+'*backing-fmt': 'BlockdevDriver',
+'*cluster-size':'size',
+'*table-size':  'int' } }
+
+##
 # @BlockdevCreateOptionsRbd:
 #
 # Driver specific image creation options for rbd/Ceph.
@@ -3864,7 +3887,7 @@
   'parallels':  'BlockdevCreateOptionsParallels',
   'qcow':   'BlockdevCreateOptionsQcow',
   'qcow2':  'BlockdevCreateOptionsQcow2',
-  'qed':'BlockdevCreateNotSupported',
+  'qed':'BlockdevCreateOptionsQed',
   'quorum': 'BlockdevCreateNotSupported',
   'raw':'BlockdevCreateNotSupported',
   'rbd':'BlockdevCreateOptionsRbd',
diff --git a/block/qed.c b/block/qed.c
index 5e6a6bfaa0..46a84beeed 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -20,6 +20,11 @@
 #include "trace.h"
 #include "qed.h"
 #include "sysemu/block-backend.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
+
+static QemuOptsList qed_create_opts;
 
 static int bdrv_qed_probe(const uint8_t *buf, int buf_size,
   const char *filename)
@@ -594,57 +599,95 @@ static void bdrv_qed_close(BlockDriverState *bs)
 qemu_vfree(s->l1_table);
 }
 
-static int qed_create(const char *filename, uint32_t cluster_size,
-  uint64_t image_size, uint32_t table_size,
-  const char *backing_file, const char *backing_fmt,
-  QemuOpts *opts, Error **errp)
+static int coroutine_fn bdrv_qed_co_create(BlockdevCreateOptions *opts,
+   Error **errp)
 {
-QEDHeader header = {
-.magic = QED_MAGIC,
-.cluster_size = cluster_size,
-.table_size = table_size,
-.header_size = 1,
-.features = 0,
-.compat_features = 0,
-.l1_table_offset = cluster_size,
-.image_size = image_size,
-};
+BlockdevCreateOptionsQed *qed_opts;
+BlockBackend *blk = NULL;
+BlockDriverState *bs = NULL;
+
+QEDHeader header;
 QEDHeader le_header;
 uint8_t *l1_table = NULL;
-size_t l1_size = header.cluster_size * header.table_size;
-Error *local_err = NULL;
+size_t l1_size;
 int ret = 0;
-BlockBackend *blk;
 
-ret = bdrv_create_file(filename, opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
-return ret;
+assert(opts->driver == BLOCKDEV_DRIVER_QED);
+qed_opts = &opts->u.qed;
+
+/* Validate options and set default values */
+if (!qed_opts->has_cluster_size) {
+qed_opts->cluster_size = QED_DEFAULT_CLUSTER_SIZE;
+}
+if (!qed_opts->has_table_size) {
+qed_opts->table_size = QED_DEFAULT_TABLE_SIZE;
 }
 
-blk = blk_new_open(filename, NULL, NULL,
-   BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
-   &local_err);
-if (blk == NULL) {
-error_propagate(errp, local_err);
+if (!qed_is_cluster_size_valid(qed_opts->cluster_size)) {
+error_setg(errp, "QED cluster size must be within range [%u, %u] "
+ "and power of 2",
+   QED_MIN_CLUSTER_SIZE, QED_MAX_CLUSTER_SIZE);
+return -EINVAL;
+}
+if (!qed_is_table_size_valid(qed_opts->table_size)) {
+error_setg(errp, "QED table size must be within range [%u, %u] "
+ "and power of 2",
+   QED_MIN_TABLE_SIZE, QED_MAX_TABLE_SIZE);
+return -EINVAL;
+}
+if (!qed_is_image_size_valid(qed_opts->size, qed_opts->cluster_size,
+ qed_opts->table_size))
+{
+error_setg(errp, "QED image size must be a non-zero

[Qemu-block] [PULL 32/41] iotests: Add regression test for commit base locking

2018-03-13 Thread Kevin Wolf

From: Fam Zheng 

Signed-off-by: Fam Zheng 
Reviewed-by: Max Reitz 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/153 | 12 
 tests/qemu-iotests/153.out |  5 +
 2 files changed, 17 insertions(+)

diff --git a/tests/qemu-iotests/153 b/tests/qemu-iotests/153
index adfd02695b..a0fd815483 100755
--- a/tests/qemu-iotests/153
+++ b/tests/qemu-iotests/153
@@ -178,6 +178,18 @@ rm -f "${TEST_IMG}.lnk" &>/dev/null
 ln -s ${TEST_IMG} "${TEST_IMG}.lnk" || echo "Failed to create link"
 _run_qemu_with_images "${TEST_IMG}.lnk" "${TEST_IMG}"
 
+echo
+echo "== Active commit to intermediate layer should work when base in use =="
+_launch_qemu -drive format=$IMGFMT,file="${TEST_IMG}.a",id=drive0,if=none \
+ -device virtio-blk,drive=drive0
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'qmp_capabilities' }" \
+'return'
+_run_cmd $QEMU_IMG commit -b "${TEST_IMG}.b" "${TEST_IMG}.c"
+
+_cleanup_qemu
+
 _launch_qemu
 
 _send_qemu_cmd $QEMU_HANDLE \
diff --git a/tests/qemu-iotests/153.out b/tests/qemu-iotests/153.out
index 34309cfb20..bb721cb747 100644
--- a/tests/qemu-iotests/153.out
+++ b/tests/qemu-iotests/153.out
@@ -372,6 +372,11 @@ Is another process using the image?
 == Symbolic link ==
 QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2: Failed to get "write" lock
 Is another process using the image?
+
+== Active commit to intermediate layer should work when base in use ==
+{"return": {}}
+
+_qemu_img_wrapper commit -b TEST_DIR/t.qcow2.b TEST_DIR/t.qcow2.c
 {"return": {}}
 Adding drive
 
-- 
2.13.6

[Qemu-block] [PULL 29/41] vdi: Move file creation to vdi_co_create_opts

2018-03-13 Thread Kevin Wolf

From: Max Reitz 

Signed-off-by: Max Reitz 
Signed-off-by: Kevin Wolf 
---
 block/vdi.c | 46 --
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 0c8f8204ce..2a39b0ac98 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -721,9 +721,7 @@ nonallocating_write:
 return ret;
 }
 
-static int coroutine_fn vdi_co_do_create(const char *filename,
- QemuOpts *file_opts,
- BlockdevCreateOptionsVdi *vdi_opts,
+static int coroutine_fn vdi_co_do_create(BlockdevCreateOptionsVdi *vdi_opts,
  size_t block_size, Error **errp)
 {
 int ret = 0;
@@ -734,7 +732,7 @@ static int coroutine_fn vdi_co_do_create(const char *filename,
 size_t i;
 size_t bmap_size;
 int64_t offset = 0;
-Error *local_err = NULL;
+BlockDriverState *bs_file = NULL;
 BlockBackend *blk = NULL;
 uint32_t *bmap = NULL;
 
@@ -770,18 +768,15 @@ static int coroutine_fn vdi_co_do_create(const char *filename,
 goto exit;
 }
 
-ret = bdrv_create_file(filename, file_opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
+bs_file = bdrv_open_blockdev_ref(vdi_opts->file, errp);
+if (!bs_file) {
+ret = -EIO;
 goto exit;
 }
 
-blk = blk_new_open(filename, NULL, NULL,
-   BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
-   &local_err);
-if (blk == NULL) {
-error_propagate(errp, local_err);
-ret = -EIO;
+blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
+ret = blk_insert_bs(blk, bs_file, errp);
+if (ret < 0) {
 goto exit;
 }
 
@@ -818,7 +813,7 @@ static int coroutine_fn vdi_co_do_create(const char *filename,
 vdi_header_to_le(&header);
 ret = blk_pwrite(blk, offset, &header, sizeof(header), 0);
 if (ret < 0) {
-error_setg(errp, "Error writing header to %s", filename);
+error_setg(errp, "Error writing header");
 goto exit;
 }
 offset += sizeof(header);
@@ -839,7 +834,7 @@ static int coroutine_fn vdi_co_do_create(const char *filename,
 }
 ret = blk_pwrite(blk, offset, bmap, bmap_size, 0);
 if (ret < 0) {
-error_setg(errp, "Error writing bmap to %s", filename);
+error_setg(errp, "Error writing bmap");
 goto exit;
 }
 offset += bmap_size;
@@ -849,13 +844,14 @@ static int coroutine_fn vdi_co_do_create(const char *filename,
 ret = blk_truncate(blk, offset + blocks * block_size,
PREALLOC_MODE_OFF, errp);
 if (ret < 0) {
-error_prepend(errp, "Failed to statically allocate %s", filename);
+error_prepend(errp, "Failed to statically allocate file");
 goto exit;
 }
 }
 
 exit:
 blk_unref(blk);
+bdrv_unref(bs_file);
 g_free(bmap);
 return ret;
 }
@@ -865,6 +861,7 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 {
 QDict *qdict = NULL;
 BlockdevCreateOptionsVdi *create_options = NULL;
+BlockDriverState *bs_file = NULL;
 uint64_t block_size = DEFAULT_CLUSTER_SIZE;
 Visitor *v;
 Error *local_err = NULL;
@@ -888,7 +885,19 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 
 qdict = qemu_opts_to_qdict_filtered(opts, NULL, &vdi_create_opts, true);
 
-qdict_put_str(qdict, "file", ""); /* FIXME */
+ret = bdrv_create_file(filename, opts, errp);
+if (ret < 0) {
+goto done;
+}
+
+bs_file = bdrv_open(filename, NULL, NULL,
+BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL, errp);
+if (!bs_file) {
+ret = -EIO;
+goto done;
+}
+
+qdict_put_str(qdict, "file", bs_file->node_name);
 
 /* Get the QAPI object */
 v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
@@ -903,10 +912,11 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 
 create_options->size = ROUND_UP(create_options->size, BDRV_SECTOR_SIZE);
 
-ret = vdi_co_do_create(filename, opts, create_options, block_size, errp);
+ret = vdi_co_do_create(create_options, block_size, errp);
 done:
 QDECREF(qdict);
 qapi_free_BlockdevCreateOptionsVdi(create_options);
+bdrv_unref(bs_file);
 return ret;
 }
 
-- 
2.13.6

[Qemu-block] [PULL 30/41] vdi: Implement .bdrv_co_create

2018-03-13 Thread Kevin Wolf

From: Max Reitz 

Signed-off-by: Max Reitz 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  2 +-
 block/vdi.c  | 24 +++-
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index c69d70d7a8..6211b8222c 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3837,7 +3837,7 @@
   'sheepdog':   'BlockdevCreateOptionsSheepdog',
   'ssh':'BlockdevCreateOptionsSsh',
   'throttle':   'BlockdevCreateNotSupported',
-  'vdi':'BlockdevCreateNotSupported',
+  'vdi':'BlockdevCreateOptionsVdi',
   'vhdx':   'BlockdevCreateNotSupported',
   'vmdk':   'BlockdevCreateNotSupported',
   'vpc':'BlockdevCreateNotSupported',
diff --git a/block/vdi.c b/block/vdi.c
index 2a39b0ac98..8132e3adfe 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -721,9 +721,10 @@ nonallocating_write:
 return ret;
 }
 
-static int coroutine_fn vdi_co_do_create(BlockdevCreateOptionsVdi *vdi_opts,
+static int coroutine_fn vdi_co_do_create(BlockdevCreateOptions *create_options,
  size_t block_size, Error **errp)
 {
+BlockdevCreateOptionsVdi *vdi_opts;
 int ret = 0;
 uint64_t bytes = 0;
 uint32_t blocks;
@@ -736,6 +737,9 @@ static int coroutine_fn vdi_co_do_create(BlockdevCreateOptionsVdi *vdi_opts,
 BlockBackend *blk = NULL;
 uint32_t *bmap = NULL;
 
+assert(create_options->driver == BLOCKDEV_DRIVER_VDI);
+vdi_opts = &create_options->u.vdi;
+
 logout("\n");
 
 /* Read out options. */
@@ -856,11 +860,17 @@ exit:
 return ret;
 }
 
+static int coroutine_fn vdi_co_create(BlockdevCreateOptions *create_options,
+  Error **errp)
+{
+return vdi_co_do_create(create_options, DEFAULT_CLUSTER_SIZE, errp);
+}
+
 static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
 {
 QDict *qdict = NULL;
-BlockdevCreateOptionsVdi *create_options = NULL;
+BlockdevCreateOptions *create_options = NULL;
 BlockDriverState *bs_file = NULL;
 uint64_t block_size = DEFAULT_CLUSTER_SIZE;
 Visitor *v;
@@ -897,11 +907,12 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 goto done;
 }
 
+qdict_put_str(qdict, "driver", "vdi");
 qdict_put_str(qdict, "file", bs_file->node_name);
 
 /* Get the QAPI object */
 v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
-visit_type_BlockdevCreateOptionsVdi(v, NULL, &create_options, &local_err);
+visit_type_BlockdevCreateOptions(v, NULL, &create_options, &local_err);
 visit_free(v);
 
 if (local_err) {
@@ -910,12 +921,14 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 goto done;
 }
 
-create_options->size = ROUND_UP(create_options->size, BDRV_SECTOR_SIZE);
+assert(create_options->driver == BLOCKDEV_DRIVER_VDI);
+create_options->u.vdi.size = ROUND_UP(create_options->u.vdi.size,
+  BDRV_SECTOR_SIZE);
 
 ret = vdi_co_do_create(create_options, block_size, errp);
 done:
 QDECREF(qdict);
-qapi_free_BlockdevCreateOptionsVdi(create_options);
+qapi_free_BlockdevCreateOptions(create_options);
 bdrv_unref(bs_file);
 return ret;
 }
@@ -969,6 +982,7 @@ static BlockDriver bdrv_vdi = {
 .bdrv_reopen_prepare = vdi_reopen_prepare,
 .bdrv_child_perm  = bdrv_format_default_perms,
 .bdrv_co_create_opts = vdi_co_create_opts,
+.bdrv_co_create  = vdi_co_create,
 .bdrv_has_zero_init = bdrv_has_zero_init_1,
 .bdrv_co_block_status = vdi_co_block_status,
 .bdrv_make_empty = vdi_make_empty,
-- 
2.13.6

Re: [Qemu-block] [PATCH v2 0/8] nbd block status base:allocation

2018-03-13 Thread Vladimir Sementsov-Ogievskiy


13.03.2018 18:55, Eric Blake wrote:

On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

Hi all.

Here is a minimal implementation of the base:allocation context of the NBD
block-status extension, which allows block status to be queried through
NBD.

v2 changes are in each patch after the "---" line.

Vladimir Sementsov-Ogievskiy (8):
   nbd/server: add nbd_opt_invalid helper
   nbd/server: add nbd_read_opt_name helper
   nbd: BLOCK_STATUS for standard get_block_status function: server part
   block/nbd-client: save first fatal error in nbd_iter_error
   nbd: BLOCK_STATUS for standard get_block_status function: client part
   iotests.py: tiny refactor: move system imports up
   iotests: add file_path helper
   iotests: new test 209 for NBD BLOCK_STATUS


I've staged this on my NBD queue, pull request to come later today 
(still this morning for me) so that it makes 2.12 softfreeze.




So, I'm happy, thank you!

--
Best regards,
Vladimir

[Qemu-block] [PULL 24/41] luks: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf

This adds the .bdrv_co_create driver callback to luks, which enables
image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Daniel P. Berrangé 
---
 qapi/block-core.json | 17 -
 block/crypto.c   | 34 ++
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 3e52d248eb..ba2d10d13a 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3596,6 +3596,21 @@
 '*preallocation':   'PreallocMode' } }
 
 ##
+# @BlockdevCreateOptionsLUKS:
+#
+# Driver specific image creation options for LUKS.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsLUKS',
+  'base': 'QCryptoBlockCreateOptionsLUKS',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size' } }
+
+##
 # @BlockdevCreateOptionsNfs:
 #
 # Driver specific image creation options for NFS.
@@ -3787,7 +3802,7 @@
   'http':   'BlockdevCreateNotSupported',
   'https':  'BlockdevCreateNotSupported',
   'iscsi':  'BlockdevCreateNotSupported',
-  'luks':   'BlockdevCreateNotSupported',
+  'luks':   'BlockdevCreateOptionsLUKS',
   'nbd':'BlockdevCreateNotSupported',
   'nfs':'BlockdevCreateOptionsNfs',
   'null-aio':   'BlockdevCreateNotSupported',
diff --git a/block/crypto.c b/block/crypto.c
index b0a4cb3388..a1139b6f09 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -543,6 +543,39 @@ static int block_crypto_open_luks(BlockDriverState *bs,
  bs, options, flags, errp);
 }
 
+static int coroutine_fn
+block_crypto_co_create_luks(BlockdevCreateOptions *create_options, Error 
**errp)
+{
+BlockdevCreateOptionsLUKS *luks_opts;
+BlockDriverState *bs = NULL;
+QCryptoBlockCreateOptions create_opts;
+int ret;
+
+assert(create_options->driver == BLOCKDEV_DRIVER_LUKS);
+luks_opts = &create_options->u.luks;
+
+bs = bdrv_open_blockdev_ref(luks_opts->file, errp);
+if (bs == NULL) {
+return -EIO;
+}
+
+create_opts = (QCryptoBlockCreateOptions) {
+.format = Q_CRYPTO_BLOCK_FORMAT_LUKS,
+.u.luks = *qapi_BlockdevCreateOptionsLUKS_base(luks_opts),
+};
+
+ret = block_crypto_co_create_generic(bs, luks_opts->size, &create_opts,
+ errp);
+if (ret < 0) {
+goto fail;
+}
+
+ret = 0;
+fail:
+bdrv_unref(bs);
+return ret;
+}
+
 static int coroutine_fn block_crypto_co_create_opts_luks(const char *filename,
  QemuOpts *opts,
  Error **errp)
@@ -647,6 +680,7 @@ BlockDriver bdrv_crypto_luks = {
 .bdrv_open  = block_crypto_open_luks,
 .bdrv_close = block_crypto_close,
 .bdrv_child_perm= bdrv_format_default_perms,
+.bdrv_co_create = block_crypto_co_create_luks,
 .bdrv_co_create_opts = block_crypto_co_create_opts_luks,
 .bdrv_truncate  = block_crypto_truncate,
 .create_opts= &block_crypto_create_opts_luks,
-- 
2.13.6

[Qemu-block] [PULL 10/41] blockjobs: add NULL state

2018-03-13 Thread Kevin Wolf

From: John Snow 

Add a new state that specifically demarcates when we begin to permanently
demolish a job after it has performed all work. This makes the transition
explicit in the STM table and highlights conditions under which a job may
be demolished.

Alongside this state, add a new helper command "block_job_decommission",
which transitions to the NULL state and puts down our implicit reference.
This separates instances in the code for "block_job_unref" which merely
undo a matching "block_job_ref" with instances intended to initiate the
full destruction of the object.

This decommission action also sets a number of fields to make sure that
block internals or external users that are holding a reference to a job
to see when it "finishes" are convinced that the job object is "done."
This is necessary, for instance, to do a block_job_cancel_sync on a
created object which will not make any progress.

Now, all jobs must go through block_job_decommission prior to being
freed, giving us start-to-finish state machine coverage for jobs.
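
The difference between dropping an ordinary reference and decommissioning
a job can be pictured with a toy model. This is only a sketch: ToyJob and
the toy_job_* functions are hypothetical names, not QEMU's API, and the
single "completed" flag stands in for whichever fields the real helper
sets to convince reference holders that the job is done.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not QEMU's API. */
typedef enum { TOY_CREATED, TOY_CONCLUDED, TOY_NULL } ToyStatus;

typedef struct ToyJob {
    int refcnt;        /* starts at 1: the implicit reference */
    ToyStatus status;
    bool completed;    /* assumed "done" marker for reference holders */
    bool freed;        /* stands in for actually freeing the object */
} ToyJob;

static void toy_job_init(ToyJob *job)
{
    job->refcnt = 1;
    job->status = TOY_CREATED;
    job->completed = false;
    job->freed = false;
}

/* Plain ref/unref: these only balance each other. */
static void toy_job_ref(ToyJob *job)
{
    job->refcnt++;
}

static void toy_job_unref(ToyJob *job)
{
    assert(job->refcnt > 0);
    if (--job->refcnt == 0) {
        /* Every job must have been decommissioned before it is freed. */
        assert(job->status == TOY_NULL);
        job->freed = true;
    }
}

/* Decommission: mark the job done, enter NULL, drop the implicit ref. */
static void toy_job_decommission(ToyJob *job)
{
    assert(job->status == TOY_CREATED || job->status == TOY_CONCLUDED);
    job->completed = true;
    job->status = TOY_NULL;
    toy_job_unref(job);
}
```

In this model a caller that took its own reference with toy_job_ref()
still sees a valid object in the NULL state after decommission, and the
object is only released by that caller's matching toy_job_unref().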

Transitions:
Created   -> Null: Early failure event before the job is started
Concluded -> Null: Standard transition.

Verbs:
None. This should not ever be visible to the monitor.

 +-+
 |UNDEFINED|
 +--+--+
|
 +--v+
   +-+CREATED+--+
   | +--++  |
   ||   |
   | +--v+ +--+ |
   +-+RUNNING<->PAUSED| |
   | +--+-+--+ +--+ |
   || | |
   || +--+  |
   |||  |
   | +--v--+   +---+ |  |
   +-+READY<--->STANDBY| |  |
   | +--+--+   +---+ |  |
   |||  |
+--v-+   +--v--+ |  |
|ABORTING+--->CONCLUDED<-+  |
++   +--+--+|
|   |
 +--v-+ |
 |NULL<-+
 ++

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  5 -
 blockjob.c   | 50 --
 2 files changed, 36 insertions(+), 19 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 2edfd194e3..4b777fc46f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1003,11 +1003,14 @@
 # @concluded: The job has finished all work. If manual was set to true, the job
 # will remain in the query list until it is dismissed.
 #
+# @null: The job is in the process of being dismantled. This state should not
+#ever be visible externally.
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobStatus',
   'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
-   'aborting', 'concluded' ] }
+   'aborting', 'concluded', 'null' ] }
 
 ##
 # @BlockJobInfo:
diff --git a/blockjob.c b/blockjob.c
index 3f730967b3..2ef48075b0 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,24 +44,25 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S, X, E */
-/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0},
-/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0},
-/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1},
-/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0, 0},
-/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 1},
-/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0},
-/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1},
-/* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0},
+  /* U, C, R, P, Y, S, X, E, N */
+/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0},
+/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0, 1},
+/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0},
+/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0, 0, 0},
+/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 1, 0},
+/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0},
+/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 0},
+/* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 1},
+/* N: */ [BLOCK_JOB_STATUS_NULL]  = {0, 0, 0, 0, 0, 0, 0, 0, 0},
 };
 
 bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S, X, E */
-[BLOCK_JOB_VERB_CANCEL]   = {0, 1, 1, 1, 1, 1, 0, 0},
-

[Qemu-block] [PULL 21/41] tests/test-blockjob: test cancellations

2018-03-13 Thread Kevin Wolf

From: John Snow 

Whatever the state a blockjob is in, it should be able to be canceled
by the block layer.

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 tests/test-blockjob.c | 233 +-
 1 file changed, 229 insertions(+), 4 deletions(-)

diff --git a/tests/test-blockjob.c b/tests/test-blockjob.c
index 599e28d732..8946bfd37b 100644
--- a/tests/test-blockjob.c
+++ b/tests/test-blockjob.c
@@ -24,14 +24,15 @@ static void block_job_cb(void *opaque, int ret)
 {
 }
 
-static BlockJob *do_test_id(BlockBackend *blk, const char *id,
-bool should_succeed)
+static BlockJob *mk_job(BlockBackend *blk, const char *id,
+const BlockJobDriver *drv, bool should_succeed,
+int flags)
 {
 BlockJob *job;
 Error *errp = NULL;
 
-job = block_job_create(id, &test_block_job_driver, NULL, blk_bs(blk),
-   0, BLK_PERM_ALL, 0, BLOCK_JOB_DEFAULT, block_job_cb,
+job = block_job_create(id, drv, NULL, blk_bs(blk),
+   0, BLK_PERM_ALL, 0, flags, block_job_cb,
 NULL, &errp);
 if (should_succeed) {
 g_assert_null(errp);
@@ -50,6 +51,13 @@ static BlockJob *do_test_id(BlockBackend *blk, const char 
*id,
 return job;
 }
 
+static BlockJob *do_test_id(BlockBackend *blk, const char *id,
+bool should_succeed)
+{
+return mk_job(blk, id, &test_block_job_driver,
+  should_succeed, BLOCK_JOB_DEFAULT);
+}
+
 /* This creates a BlockBackend (optionally with a name) with a
  * BlockDriverState inserted. */
 static BlockBackend *create_blk(const char *name)
@@ -142,6 +150,216 @@ static void test_job_ids(void)
 destroy_blk(blk[2]);
 }
 
+typedef struct CancelJob {
+BlockJob common;
+BlockBackend *blk;
+bool should_converge;
+bool should_complete;
+bool completed;
+} CancelJob;
+
+static void cancel_job_completed(BlockJob *job, void *opaque)
+{
+CancelJob *s = opaque;
+s->completed = true;
+block_job_completed(job, 0);
+}
+
+static void cancel_job_complete(BlockJob *job, Error **errp)
+{
+CancelJob *s = container_of(job, CancelJob, common);
+s->should_complete = true;
+}
+
+static void coroutine_fn cancel_job_start(void *opaque)
+{
+CancelJob *s = opaque;
+
+while (!s->should_complete) {
+if (block_job_is_cancelled(&s->common)) {
+goto defer;
+}
+
+if (!s->common.ready && s->should_converge) {
+block_job_event_ready(&s->common);
+}
+
+block_job_sleep_ns(&s->common, 10);
+}
+
+ defer:
+block_job_defer_to_main_loop(&s->common, cancel_job_completed, s);
+}
+
+static const BlockJobDriver test_cancel_driver = {
+.instance_size = sizeof(CancelJob),
+.start = cancel_job_start,
+.complete  = cancel_job_complete,
+};
+
+static CancelJob *create_common(BlockJob **pjob)
+{
+BlockBackend *blk;
+BlockJob *job;
+CancelJob *s;
+
+blk = create_blk(NULL);
+job = mk_job(blk, "Steve", &test_cancel_driver, true,
+ BLOCK_JOB_MANUAL_FINALIZE | BLOCK_JOB_MANUAL_DISMISS);
+block_job_ref(job);
+assert(job->status == BLOCK_JOB_STATUS_CREATED);
+s = container_of(job, CancelJob, common);
+s->blk = blk;
+
+*pjob = job;
+return s;
+}
+
+static void cancel_common(CancelJob *s)
+{
+BlockJob *job = &s->common;
+BlockBackend *blk = s->blk;
+BlockJobStatus sts = job->status;
+
+block_job_cancel_sync(job);
+if ((sts != BLOCK_JOB_STATUS_CREATED) &&
+(sts != BLOCK_JOB_STATUS_CONCLUDED)) {
+BlockJob *dummy = job;
+block_job_dismiss(&dummy, &error_abort);
+}
+assert(job->status == BLOCK_JOB_STATUS_NULL);
+block_job_unref(job);
+destroy_blk(blk);
+}
+
+static void test_cancel_created(void)
+{
+BlockJob *job;
+CancelJob *s;
+
+s = create_common(&job);
+cancel_common(s);
+}
+
+static void test_cancel_running(void)
+{
+BlockJob *job;
+CancelJob *s;
+
+s = create_common(&job);
+
+block_job_start(job);
+assert(job->status == BLOCK_JOB_STATUS_RUNNING);
+
+cancel_common(s);
+}
+
+static void test_cancel_paused(void)
+{
+BlockJob *job;
+CancelJob *s;
+
+s = create_common(&job);
+
+block_job_start(job);
+assert(job->status == BLOCK_JOB_STATUS_RUNNING);
+
+block_job_user_pause(job, &error_abort);
+block_job_enter(job);
+assert(job->status == BLOCK_JOB_STATUS_PAUSED);
+
+cancel_common(s);
+}
+
+static void test_cancel_ready(void)
+{
+BlockJob *job;
+CancelJob *s;
+
+s = create_common(&job);
+
+block_job_start(job);
+assert(job->status == BLOCK_JOB_STATUS_RUNNING);
+
+s->should_converge = true;
+block_job_enter(job);
+assert(job->status == BLOCK_JOB_STATUS_READY);
+
+cancel_common(s);
+}
+
+static void test_cancel_standby(void)
+{
+BlockJob *job;
+

[Qemu-block] [PULL 07/41] blockjobs: add block_job_verb permission table

2018-03-13 Thread Kevin Wolf

From: John Snow 

Which commands ("verbs") are appropriate for jobs in which state is
also somewhat burdensome to keep track of.

As of this commit, it looks rather useless, but begins to look more
interesting the more states we add to the STM table.

A recurring theme is that no verb will apply to an 'undefined' job.

Further, it's not presently possible to restrict the "pause" or "resume"
verbs any more than they are in this commit because of the asynchronous
nature of how jobs enter the PAUSED state; justifications for some
seemingly erroneous applications are given below.

=
Verbs
=

Cancel:Any state except undefined.
Pause: Any state except undefined;
   'created': Requests that the job pauses as it starts.
   'running': Normal usage. (PAUSED)
   'paused':  The job may be paused for internal reasons,
  but the user may wish to force an indefinite
  user-pause, so this is allowed.
   'ready':   Normal usage. (STANDBY)
   'standby': Same logic as above.
Resume:Any state except undefined;
   'created': Will lift a user's pause-on-start request.
   'running': Will lift a pause request before it takes effect.
   'paused':  Normal usage.
   'ready':   Will lift a pause request before it takes effect.
   'standby': Normal usage.
Set-speed: Any state except undefined, though ready may not be meaningful.
Complete:  Only a 'ready' job may accept a complete request.
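
The verb list above maps naturally onto a two-dimensional permission
table. The following is a minimal sketch under that assumption, using the
six states that exist as of this commit; the identifiers (Status, Verb,
verb_table, apply_verb) and the -1 error value are hypothetical, chosen
for illustration rather than copied from blockjob.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the layout mirrors the verb list above. */
typedef enum {
    ST_UNDEFINED, ST_CREATED, ST_RUNNING, ST_PAUSED, ST_READY, ST_STANDBY,
    ST__MAX
} Status;

typedef enum {
    VERB_CANCEL, VERB_PAUSE, VERB_RESUME, VERB_SET_SPEED, VERB_COMPLETE,
    VERB__MAX
} Verb;

/* Rows are verbs, columns are states: 1 means the verb is allowed. */
static const bool verb_table[VERB__MAX][ST__MAX] = {
                     /* U, C, R, P, Y, S */
    [VERB_CANCEL]    = {0, 1, 1, 1, 1, 1},
    [VERB_PAUSE]     = {0, 1, 1, 1, 1, 1},
    [VERB_RESUME]    = {0, 1, 1, 1, 1, 1},
    [VERB_SET_SPEED] = {0, 1, 1, 1, 1, 1},
    [VERB_COMPLETE]  = {0, 0, 0, 0, 1, 0},
};

/* Returns 0 if the verb may be applied, -1 (an assumed error code) if not. */
static int apply_verb(Status status, Verb verb)
{
    assert(status < ST__MAX);
    assert(verb < VERB__MAX);
    return verb_table[verb][status] ? 0 : -1;
}
```

A caller would check apply_verb() before acting and report an error
instead of silently doing nothing when the lookup fails.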

===
Changes
===

(1)

To facilitate "nice" error checking, all five major block-job verb
interfaces in blockjob.c now support an errp parameter:

- block_job_user_cancel is added as a new interface.
- block_job_user_pause gains an errp parameter
- block_job_user_resume gains an errp parameter
- block_job_set_speed already had an errp parameter.
- block_job_complete already had an errp parameter.

(2)

block-job-pause and block-job-resume will no longer no-op when trying
to pause an already paused job, or trying to resume a job that isn't
paused. These functions will now report that they did not perform the
action requested because it was not possible.

iotests have been adjusted to address this new behavior.

(3)

block-job-complete doesn't worry about checking !block_job_started,
because the permission table guards against this.

(4)

test-bdrv-drain's job implementation needs to announce that it is
'ready' now, in order to be completed.

Signed-off-by: John Snow 
Reviewed-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 20 ++
 include/block/blockjob.h | 13 +++--
 blockdev.c   | 10 +++
 blockjob.c   | 71 ++--
 tests/test-bdrv-drain.c  |  1 +
 block/trace-events   |  1 +
 6 files changed, 100 insertions(+), 16 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index f8c19a9a2b..217a31385f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -956,6 +956,26 @@
   'data': ['commit', 'stream', 'mirror', 'backup'] }
 
 ##
+# @BlockJobVerb:
+#
+# Represents command verbs that can be applied to a blockjob.
+#
+# @cancel: see @block-job-cancel
+#
+# @pause: see @block-job-pause
+#
+# @resume: see @block-job-resume
+#
+# @set-speed: see @block-job-set-speed
+#
+# @complete: see @block-job-complete
+#
+# Since: 2.12
+##
+{ 'enum': 'BlockJobVerb',
+  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete' ] }
+
+##
 # @BlockJobStatus:
 #
 # Indicates the present state of a given blockjob in its lifetime.
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index b39a2f9521..df0a9773d1 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -249,7 +249,7 @@ BlockJobInfo *block_job_query(BlockJob *job, Error **errp);
  * Asynchronously pause the specified job.
  * Do not allow a resume until a matching call to block_job_user_resume.
  */
-void block_job_user_pause(BlockJob *job);
+void block_job_user_pause(BlockJob *job, Error **errp);
 
 /**
  * block_job_paused:
@@ -266,7 +266,16 @@ bool block_job_user_paused(BlockJob *job);
  * Resume the specified job.
  * Must be paired with a preceding block_job_user_pause.
  */
-void block_job_user_resume(BlockJob *job);
+void block_job_user_resume(BlockJob *job, Error **errp);
+
+/**
+ * block_job_user_cancel:
+ * @job: The job to be cancelled.
+ *
+ * Cancels the specified job, but may refuse to do so if the
+ * operation isn't currently meaningful.
+ */
+void block_job_user_cancel(BlockJob *job, Error **errp);
 
 /**
  * block_job_cancel_sync:
diff --git a/blockdev.c b/blockdev.c
index 1fbfd3a2c4..f70a783803 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3806,7 +3806,7 @@ void qmp_block_job_cancel(const char *device,
 }
 
 trace_qmp_block_job_cancel(job);
-

[Qemu-block] [PULL 04/41] blockjobs: add status enum

2018-03-13 Thread Kevin Wolf

From: John Snow 

We're about to add several new states, and booleans are becoming
unwieldy and difficult to reason about. It would help to have a
more explicit bookkeeping of the state of blockjobs. To this end,
add a new "status" field and add our existing states in a redundant
manner alongside the bools they are replacing:

UNDEFINED: Placeholder, default state. Not currently visible to QMP
   unless changes occur in the future to allow creating jobs
   without starting them via QMP.
CREATED:   replaces !!job->co && paused && !busy
RUNNING:   replaces effectively (!paused && busy)
PAUSED:Nearly redundant with info->paused, which shows pause_count.
   This reports the actual status of the job, which almost always
   matches the paused request status. It differs in that it is
   strictly only true when the job has actually gone dormant.
READY: replaces job->ready.
STANDBY:   Paused, but job->ready is true.

New state additions in coming commits will not be quite so redundant:

WAITING:   Waiting on transaction. This job has finished all the work
   it can until the transaction converges, fails, or is canceled.
PENDING:   Pending authorization from user. This job has finished all the
   work it can until the job or transaction is finalized via
   block_job_finalize. This implies the transaction has converged
   and left the WAITING phase.
ABORTING:  Job has encountered an error condition and is in the process
   of aborting.
CONCLUDED: Job has ceased all operations and has a return code available
   for query and may be dismissed via block_job_dismiss.
NULL:  Job has been dismissed and (should) be destroyed. Should never
   be visible to QMP.

Some of these states appear somewhat superfluous, but it helps define the
expected flow of a job; so some of the states wind up being synchronous
empty transitions. Importantly, jobs can be in only one of these states
at any given time, which helps code and external users alike reason about
the current condition of a job unambiguously.

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json   | 31 ++-
 include/block/blockjob.h   |  3 +++
 blockjob.c |  9 +
 tests/qemu-iotests/109.out | 24 
 4 files changed, 54 insertions(+), 13 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 524d51567a..f8c19a9a2b 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -956,6 +956,32 @@
   'data': ['commit', 'stream', 'mirror', 'backup'] }
 
 ##
+# @BlockJobStatus:
+#
+# Indicates the present state of a given blockjob in its lifetime.
+#
+# @undefined: Erroneous, default state. Should not ever be visible.
+#
+# @created: The job has been created, but not yet started.
+#
+# @running: The job is currently running.
+#
+# @paused: The job is running, but paused. The pause may be requested by
+#  either the QMP user or by internal processes.
+#
+# @ready: The job is running, but is ready for the user to signal completion.
+# This is used for long-running jobs like mirror that are designed to
+# run indefinitely.
+#
+# @standby: The job is ready, but paused. This is nearly identical to @paused.
+#   The job may return to @ready or otherwise be canceled.
+#
+# Since: 2.12
+##
+{ 'enum': 'BlockJobStatus',
+  'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby'] }
+
+##
 # @BlockJobInfo:
 #
 # Information about a long-running block device operation.
@@ -981,12 +1007,15 @@
 #
 # @ready: true if the job may be completed (since 2.2)
 #
+# @status: Current job state/status (since 2.12)
+#
 # Since: 1.1
 ##
 { 'struct': 'BlockJobInfo',
   'data': {'type': 'str', 'device': 'str', 'len': 'int',
'offset': 'int', 'busy': 'bool', 'paused': 'bool', 'speed': 'int',
-   'io-status': 'BlockDeviceIoStatus', 'ready': 'bool'} }
+   'io-status': 'BlockDeviceIoStatus', 'ready': 'bool',
+   'status': 'BlockJobStatus' } }
 
 ##
 # @query-block-jobs:
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index b77fac118d..b39a2f9521 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -139,6 +139,9 @@ typedef struct BlockJob {
  */
 QEMUTimer sleep_timer;
 
+/** Current state; See @BlockJobStatus for details. */
+BlockJobStatus status;
+
 BlockJobTxn *txn;
 QLIST_ENTRY(BlockJob) txn_list;
 } BlockJob;
diff --git a/blockjob.c b/blockjob.c
index ecc5fcbdf8..719169cccd 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -320,6 +320,7 @@ void block_job_start(BlockJob *job)
 job->pause_count--;
 job->busy = true;
 job->paused = false;
+job->status = BLOCK_JOB_STATUS_RUNNING;
 bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
@@ -598,6 +599,7 @@ BlockJobInfo

[Qemu-block] [PULL 18/41] blockjobs: add block-job-finalize

2018-03-13 Thread Kevin Wolf

From: John Snow 

Instead of automatically transitioning from PENDING to CONCLUDED, gate
the .prepare() and .commit() phases behind an explicit acknowledgement
provided by the QMP monitor if auto_finalize = false has been requested.

This allows us to perform graph changes in prepare and/or commit so that
graph changes do not occur autonomously without knowledge of the
controlling management layer.

Transactions that have reached the "PENDING" state together can all be
moved to invoke their finalization methods by issuing block_job_finalize
to any one job in the transaction.

Jobs in a transaction with mixed job->auto_finalize settings will all
remain stuck in the "PENDING" state, as if the entire transaction was
specified with auto_finalize = false. Jobs that specified
auto_finalize = true, however, will still not emit the PENDING event.
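
The rule for mixed auto_finalize settings can be sketched as follows.
This is an illustrative model only: TxnJob, txn_all_pending and the other
names are hypothetical, and the real prepare/commit work is reduced to a
single status change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a job inside a transaction. */
typedef enum { TJ_PENDING, TJ_CONCLUDED } TjStatus;

typedef struct TxnJob {
    bool auto_finalize;
    bool pending_event_sent;
    TjStatus status;
} TxnJob;

/* A transaction needs a manual finalize if any member asked for it. */
static bool txn_needs_manual_finalize(const TxnJob *jobs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!jobs[i].auto_finalize) {
            return true;
        }
    }
    return false;
}

/* Finalizing any one job finalizes every job in the transaction. */
static void txn_finalize_all(TxnJob *jobs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        jobs[i].status = TJ_CONCLUDED;
    }
}

/* Called once all jobs in the transaction have reached PENDING together. */
static void txn_all_pending(TxnJob *jobs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        jobs[i].status = TJ_PENDING;
        /* Only jobs that asked for manual finalization announce PENDING. */
        jobs[i].pending_event_sent = !jobs[i].auto_finalize;
    }
    if (!txn_needs_manual_finalize(jobs, n)) {
        txn_finalize_all(jobs, n);
    }
}
```

With one auto_finalize = false member, txn_all_pending() leaves every job
in PENDING until something calls txn_finalize_all(), which corresponds to
issuing the finalize verb on any single job.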

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 23 ++-
 include/block/blockjob.h | 17 ++
 blockdev.c   | 14 +++
 blockjob.c   | 60 +++-
 block/trace-events   |  1 +
 5 files changed, 98 insertions(+), 17 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0ae12272ff..2c32fc69f9 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -972,10 +972,13 @@
 #
 # @dismiss: see @block-job-dismiss
 #
+# @finalize: see @block-job-finalize
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobVerb',
-  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete', 'dismiss' ] }
+  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete', 'dismiss',
+   'finalize' ] }
 
 ##
 # @BlockJobStatus:
@@ -2275,6 +2278,24 @@
 { 'command': 'block-job-dismiss', 'data': { 'id': 'str' } }
 
 ##
+# @block-job-finalize:
+#
+# Once a job that has manual=true reaches the pending state, it can be
+# instructed to finalize any graph changes and do any necessary cleanup
+# via this command.
+# For jobs in a transaction, instructing one job to finalize will force
+# ALL jobs in the transaction to finalize, so it is only necessary to instruct
+# a single member job to finalize.
+#
+# @id: The job identifier.
+#
+# Returns: Nothing on success
+#
+# Since: 2.12
+##
+{ 'command': 'block-job-finalize', 'data': { 'id': 'str' } }
+
+##
 # @BlockdevDiscardOptions:
 #
 # Determines how to handle discard requests.
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 7c8d51effa..978274ed2b 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -244,6 +244,23 @@ void block_job_cancel(BlockJob *job);
  */
 void block_job_complete(BlockJob *job, Error **errp);
 
+
+/**
+ * block_job_finalize:
+ * @job: The job to fully commit and finish.
+ * @errp: Error object.
+ *
+ * For jobs that have finished their work and are pending
+ * awaiting explicit acknowledgement to commit their work,
+ * This will commit that work.
+ *
+ * FIXME: Make the below statement universally true:
+ * For jobs that support the manual workflow mode, all graph
+ * changes that occur as a result will occur after this command
+ * and before a successful reply.
+ */
+void block_job_finalize(BlockJob *job, Error **errp);
+
 /**
  * block_job_dismiss:
  * @job: The job to be dismissed.
diff --git a/blockdev.c b/blockdev.c
index 9900cbc7dd..efd3ab2e99 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3853,6 +3853,20 @@ void qmp_block_job_complete(const char *device, Error 
**errp)
 aio_context_release(aio_context);
 }
 
+void qmp_block_job_finalize(const char *id, Error **errp)
+{
+AioContext *aio_context;
+BlockJob *job = find_block_job(id, &aio_context, errp);
+
+if (!job) {
+return;
+}
+
+trace_qmp_block_job_finalize(job);
+block_job_finalize(job, errp);
+aio_context_release(aio_context);
+}
+
 void qmp_block_job_dismiss(const char *id, Error **errp)
 {
 AioContext *aio_context;
diff --git a/blockjob.c b/blockjob.c
index 3880a89678..4b73cb0263 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -65,6 +65,7 @@ bool 
BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
 [BLOCK_JOB_VERB_RESUME]   = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
 [BLOCK_JOB_VERB_SET_SPEED]= {0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
 [BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0},
+[BLOCK_JOB_VERB_FINALIZE] = {0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0},
 [BLOCK_JOB_VERB_DISMISS]  = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
 };
 
@@ -449,7 +450,7 @@ static void block_job_clean(BlockJob *job)
 }
 }
 
-static int block_job_completed_single(BlockJob *job)
+static int block_job_finalize_single(BlockJob *job)
 {
 assert(job->completed);
 
@@ -590,18 +591,36 @@ static void block_job_completed_txn_abort(BlockJob *job)
 assert(other_job->cancelled);
 block_job_finish_sync(other_job,

[Qemu-block] [PULL 14/41] blockjobs: add block_job_txn_apply function

2018-03-13 Thread Kevin Wolf

From: John Snow 

Simply apply a function transaction-wide.
A few more uses of this in forthcoming patches.
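
The shape of such a helper, stripped of the AioContext locking that the
real function performs around each call, might look like the sketch
below. MiniJob, MiniTxn and mini_txn_apply are hypothetical names, and a
hand-rolled singly linked list stands in for QLIST.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for BlockJob and BlockJobTxn. */
typedef struct MiniJob {
    int visited;
    struct MiniJob *next;
} MiniJob;

typedef struct MiniTxn {
    MiniJob *jobs;
} MiniTxn;

/* Apply fn to every job in the transaction. */
static void mini_txn_apply(MiniTxn *txn, void (*fn)(MiniJob *))
{
    MiniJob *job, *next;

    for (job = txn->jobs; job != NULL; job = next) {
        next = job->next;   /* saved first, so fn may unlink the job */
        fn(job);
    }
}

/* Example callback that also unlinks the job it was given. */
static void mini_job_mark(MiniJob *job)
{
    job->visited++;
    job->next = NULL;
}
```

Saving the next pointer before calling fn is what makes the iteration
safe when the callback removes the job from the list, which is the same
reason the patch uses the _SAFE variant of the list macro.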

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 blockjob.c | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 0c64fadc6d..7e03824751 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -487,6 +487,19 @@ static void block_job_cancel_async(BlockJob *job)
 job->cancelled = true;
 }
 
+static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))
+{
+AioContext *ctx;
+BlockJob *job, *next;
+
+QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
+ctx = blk_get_aio_context(job->blk);
+aio_context_acquire(ctx);
+fn(job);
+aio_context_release(ctx);
+}
+}
+
 static int block_job_finish_sync(BlockJob *job,
  void (*finish)(BlockJob *, Error **errp),
  Error **errp)
@@ -565,9 +578,8 @@ static void block_job_completed_txn_abort(BlockJob *job)
 
 static void block_job_completed_txn_success(BlockJob *job)
 {
-AioContext *ctx;
 BlockJobTxn *txn = job->txn;
-BlockJob *other_job, *next;
+BlockJob *other_job;
 /*
  * Successful completion, see if there are other running jobs in this
  * txn.
@@ -576,15 +588,10 @@ static void block_job_completed_txn_success(BlockJob *job)
 if (!other_job->completed) {
 return;
 }
-}
-/* We are the last completed job, commit the transaction. */
-QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
-ctx = blk_get_aio_context(other_job->blk);
-aio_context_acquire(ctx);
 assert(other_job->ret == 0);
-block_job_completed_single(other_job);
-aio_context_release(ctx);
 }
+/* We are the last completed job, commit the transaction. */
+block_job_txn_apply(txn, block_job_completed_single);
 }
 
 /* Assumes the block_job_mutex is held */
-- 
2.13.6

[Qemu-block] [PULL 05/41] blockjobs: add state transition table

2018-03-13 Thread Kevin Wolf

From: John Snow 

The state transition table has mostly been implied. We're about to make
it a bit more complex, so let's make the STM explicit instead.

Perform state transitions with a function that for now just asserts the
transition is appropriate.

Transitions:
Undefined -> Created: During job initialization.
Created   -> Running: Once the job is started.
  Jobs cannot transition from "Created" to "Paused"
  directly, but will instead synchronously transition
  to running and then to paused immediately.
Running   -> Paused:  Normal workflow for pauses.
Running   -> Ready:   Normal workflow for jobs reaching their sync point.
  (e.g. mirror)
Ready -> Standby: Normal workflow for pausing ready jobs.
Paused-> Running: Normal resume.
Standby   -> Ready:   Resume of a Standby job.

+-+
|UNDEFINED|
+--+--+
   |
+--v+
|CREATED|
+--++
   |
+--v+ +--+
|RUNNING<->PAUSED|
+--++ +--+
   |
+--v--+   +---+
|READY<--->STANDBY|
+-+   +---+

Notably, there is no state presently defined as of this commit that
deals with a job after the "running" or "ready" states, so this table
will be adjusted alongside the commits that introduce those states.
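
As a standalone illustration of the table-driven check, here is a minimal
sketch of the six-state machine drawn above. The names (State, stt,
try_transition) are hypothetical, and unlike the patch, which asserts on
an illegal transition, this version returns false so the behaviour is
easy to exercise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the rows follow the transition list above. */
typedef enum {
    S_UNDEFINED, S_CREATED, S_RUNNING, S_PAUSED, S_READY, S_STANDBY,
    S__MAX
} State;

/* stt[from][to] is 1 when the transition from -> to is allowed. */
static const bool stt[S__MAX][S__MAX] = {
                  /* U, C, R, P, Y, S */
    [S_UNDEFINED] = {0, 1, 0, 0, 0, 0},
    [S_CREATED]   = {0, 0, 1, 0, 0, 0},
    [S_RUNNING]   = {0, 0, 0, 1, 1, 0},
    [S_PAUSED]    = {0, 0, 1, 0, 0, 0},
    [S_READY]     = {0, 0, 0, 0, 0, 1},
    [S_STANDBY]   = {0, 0, 0, 0, 1, 0},
};

/* Updates *cur and returns true if allowed; otherwise leaves it alone. */
static bool try_transition(State *cur, State next)
{
    if (next >= S__MAX || !stt[*cur][next]) {
        return false;
    }
    *cur = next;
    return true;
}
```

Note that the bound check here uses a strict comparison against S__MAX,
since S__MAX itself is one past the last valid index of the table.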

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 blockjob.c | 40 +---
 block/trace-events |  3 +++
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 719169cccd..442426e27b 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -28,6 +28,7 @@
 #include "block/block.h"
 #include "block/blockjob_int.h"
 #include "block/block_int.h"
+#include "block/trace.h"
 #include "sysemu/block-backend.h"
 #include "qapi/error.h"
 #include "qapi/qapi-events-block-core.h"
@@ -41,6 +42,31 @@
  * block_job_enter. */
 static QemuMutex block_job_mutex;
 
+/* BlockJob State Transition Table */
+bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
+  /* U, C, R, P, Y, S */
+/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0},
+/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0},
+/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0},
+/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0},
+/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1},
+/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0},
+};
+
+static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
+{
+BlockJobStatus s0 = job->status;
+assert(s1 >= 0 && s1 <= BLOCK_JOB_STATUS__MAX);
+trace_block_job_state_transition(job, job->ret, BlockJobSTT[s0][s1] ?
+ "allowed" : "disallowed",
+ qapi_enum_lookup(&BlockJobStatus_lookup,
+  s0),
+ qapi_enum_lookup(&BlockJobStatus_lookup,
+  s1));
+assert(BlockJobSTT[s0][s1]);
+job->status = s1;
+}
+
 static void block_job_lock(void)
 {
 qemu_mutex_lock(&block_job_mutex);
@@ -320,7 +346,7 @@ void block_job_start(BlockJob *job)
 job->pause_count--;
 job->busy = true;
 job->paused = false;
-job->status = BLOCK_JOB_STATUS_RUNNING;
+block_job_state_transition(job, BLOCK_JOB_STATUS_RUNNING);
 bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
@@ -702,7 +728,7 @@ void *block_job_create(const char *job_id, const 
BlockJobDriver *driver,
 job->paused= true;
 job->pause_count   = 1;
 job->refcnt= 1;
-job->status= BLOCK_JOB_STATUS_CREATED;
+block_job_state_transition(job, BLOCK_JOB_STATUS_CREATED);
 aio_timer_init(qemu_get_aio_context(), &job->sleep_timer,
QEMU_CLOCK_REALTIME, SCALE_NS,
block_job_sleep_timer_cb, job);
@@ -817,13 +843,13 @@ void coroutine_fn block_job_pause_point(BlockJob *job)
 
 if (block_job_should_pause(job) && !block_job_is_cancelled(job)) {
 BlockJobStatus status = job->status;
-job->status = status == BLOCK_JOB_STATUS_READY ? \
-BLOCK_JOB_STATUS_STANDBY : \
-BLOCK_JOB_STATUS_PAUSED;
+block_job_state_transition(job, status == BLOCK_JOB_STATUS_READY ? \
+   BLOCK_JOB_STATUS_STANDBY :   \
+   BLOCK_JOB_STATUS_PAUSED);
 job->paused = true;
 block_job_do_yield(job, -1);
 job->paused = false;
-job->status = status;
+block_job_state_transition(job, status);
 }
 
 if (job->driver->resume) {
@@ -929,7 +955,7 @@ void block_job_iostatus_reset(BlockJob *job)
 
 void block_job_event_ready(BlockJob *job)
 {
-job->status = BLOCK_JOB_STATUS_READY;
+

[Qemu-block] [PULL 09/41] blockjobs: add CONCLUDED state

2018-03-13 Thread Kevin Wolf

From: John Snow 

Add a new state "CONCLUDED" that identifies a job that has ceased all
operations. The wording was chosen to avoid any phrasing that might
imply success, error, or cancellation. The task has simply ceased all
operation and can never again perform any work.

("finished", "done", and "completed" might all imply success.)

Transitions:
Running  -> Concluded: normal completion
Ready    -> Concluded: normal completion
Aborting -> Concluded: error and cancellations

Verbs:
None as of this commit. (a future commit adds 'dismiss')

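             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED|
   |         +--+----+
   |            |
   |         +--v----+     +------+
   +---------+RUNNING<----->PAUSED|
   |         +--+-+--+     +------+
   |            | |
   |            | +------------------+
   |            |                    |
   |         +--v--+       +-------+ |
   +---------+READY<------->STANDBY| |
   |         +--+--+       +-------+ |
   |            |                    |
+--v-----+   +--v------+             |
|ABORTING+--->CONCLUDED<-------------+
+--------+   +---------+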
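
The "Verbs: None" rule above can be pictured as a lookup in a permission table indexed by verb and current state. The following is only a minimal sketch: the enum and function names (`Status`, `Verb`, `verb_allowed`) are hypothetical stand-ins for the QAPI-generated types, and the rows are copied from the BlockJobVerbTable hunk in this patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the QAPI-generated enums. */
typedef enum {
    ST_UNDEFINED, ST_CREATED, ST_RUNNING, ST_PAUSED, ST_READY,
    ST_STANDBY, ST_ABORTING, ST_CONCLUDED, ST__MAX
} Status;

typedef enum {
    VERB_CANCEL, VERB_PAUSE, VERB_RESUME, VERB_SET_SPEED,
    VERB_COMPLETE, VERB__MAX
} Verb;

/* Rows are verbs, columns are the job's current state. */
static const bool verb_table[VERB__MAX][ST__MAX] = {
                       /* U, C, R, P, Y, S, X, E */
    [VERB_CANCEL]    = {0, 1, 1, 1, 1, 1, 0, 0},
    [VERB_PAUSE]     = {0, 1, 1, 1, 1, 1, 0, 0},
    [VERB_RESUME]    = {0, 1, 1, 1, 1, 1, 0, 0},
    [VERB_SET_SPEED] = {0, 1, 1, 1, 1, 1, 0, 0},
    [VERB_COMPLETE]  = {0, 0, 0, 0, 1, 0, 0, 0},
};

/* Returns true if verb v may be applied to a job in state s.
 * Out-of-range arguments are rejected instead of indexing the table. */
static bool verb_allowed(Verb v, Status s)
{
    if ((int)v < 0 || v >= VERB__MAX || (int)s < 0 || s >= ST__MAX) {
        return false;
    }
    return verb_table[v][s];
}
```

With these rows, every verb is refused for a job in the concluded (E) column, which matches the commit message; a later verb such as 'dismiss' would presumably add a row with a 1 in that column.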

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  7 +--
 blockjob.c   | 39 ---
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index c33a9e91a7..2edfd194e3 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -997,14 +997,17 @@
 #   The job may return to @ready or otherwise be canceled.
 #
 # @aborting: The job is in the process of being aborted, and will finish with
-#an error.
+#an error. The job will afterwards report that it is @concluded.
 #This status may not be visible to the management process.
 #
+# @concluded: The job has finished all work. If manual was set to true, the job
+# will remain in the query list until it is dismissed.
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobStatus',
   'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
-   'aborting' ] }
+   'aborting', 'concluded' ] }
 
 ##
 # @BlockJobInfo:
diff --git a/blockjob.c b/blockjob.c
index fe5b0041f7..3f730967b3 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,23 +44,24 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S, X */
-/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0},
-/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1},
-/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1},
-/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0},
-/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1},
-/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0},
-/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0},
+  /* U, C, R, P, Y, S, X, E */
+/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0},
+/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0},
+/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1},
+/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0, 0},
+/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 1},
+/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0},
+/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1},
+/* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0},
 };
 
 bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S, X */
-[BLOCK_JOB_VERB_CANCEL]   = {0, 1, 1, 1, 1, 1, 0},
-[BLOCK_JOB_VERB_PAUSE]= {0, 1, 1, 1, 1, 1, 0},
-[BLOCK_JOB_VERB_RESUME]   = {0, 1, 1, 1, 1, 1, 0},
-[BLOCK_JOB_VERB_SET_SPEED]= {0, 1, 1, 1, 1, 1, 0},
-[BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0, 0},
+  /* U, C, R, P, Y, S, X, E */
+[BLOCK_JOB_VERB_CANCEL]   = {0, 1, 1, 1, 1, 1, 0, 0},
+[BLOCK_JOB_VERB_PAUSE]= {0, 1, 1, 1, 1, 1, 0, 0},
+[BLOCK_JOB_VERB_RESUME]   = {0, 1, 1, 1, 1, 1, 0, 0},
+[BLOCK_JOB_VERB_SET_SPEED]= {0, 1, 1, 1, 1, 1, 0, 0},
+[BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0, 0, 0},
 };
 
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
@@ -377,6 +378,11 @@ void block_job_start(BlockJob *job)
 bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
+static void block_job_conclude(BlockJob *job)
+{
+block_job_state_transition(job,

[Qemu-block] [PULL 00/41] Block layer patches

2018-03-13 Thread Kevin Wolf

The following changes since commit 22ef7ba8e8ce7fef297549b3defcac333742b804:

  Merge remote-tracking branch 'remotes/famz/tags/staging-pull-request' into 
staging (2018-03-13 11:42:45 +)

are available in the git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to be6c885842efded81a20f4ca24f0d4e123a80c00:

  block/mirror: change the semantic of 'force' of block-job-cancel (2018-03-13 
16:54:47 +0100)


Block layer patches


Fam Zheng (2):
  block: Fix flags in reopen queue
  iotests: Add regression test for commit base locking

John Snow (21):
  blockjobs: fix set-speed kick
  blockjobs: model single jobs as transactions
  Blockjobs: documentation touchup
  blockjobs: add status enum
  blockjobs: add state transition table
  iotests: add pause_wait
  blockjobs: add block_job_verb permission table
  blockjobs: add ABORTING state
  blockjobs: add CONCLUDED state
  blockjobs: add NULL state
  blockjobs: add block_job_dismiss
  blockjobs: ensure abort is called for cancelled jobs
  blockjobs: add commit, abort, clean helpers
  blockjobs: add block_job_txn_apply function
  blockjobs: add prepare callback
  blockjobs: add waiting status
  blockjobs: add PENDING status and event
  blockjobs: add block-job-finalize
  blockjobs: Expose manual property
  iotests: test manual job dismissal
  tests/test-blockjob: test cancellations

Kevin Wolf (14):
  luks: Separate image file creation from formatting
  luks: Create block_crypto_co_create_generic()
  luks: Support .bdrv_co_create
  luks: Turn invalid assertion into check
  luks: Catch integer overflow for huge sizes
  qemu-iotests: Test luks QMP image creation
  parallels: Support .bdrv_co_create
  qemu-iotests: Enable write tests for parallels
  qcow: Support .bdrv_co_create
  qed: Support .bdrv_co_create
  vdi: Make comments consistent with other drivers
  vhdx: Support .bdrv_co_create
  vpc: Support .bdrv_co_create
  vpc: Require aligned size in .bdrv_co_create

Liang Li (1):
  block/mirror: change the semantic of 'force' of block-job-cancel

Max Reitz (3):
  vdi: Pull option parsing from vdi_co_create
  vdi: Move file creation to vdi_co_create_opts
  vdi: Implement .bdrv_co_create

 qapi/block-core.json  | 363 --
 include/block/blockjob.h  |  71 -
 include/block/blockjob_int.h  |  17 +-
 block.c   |   8 +
 block/backup.c|   5 +-
 block/commit.c|   2 +-
 block/crypto.c| 150 -
 block/mirror.c|  12 +-
 block/parallels.c | 199 +--
 block/qcow.c  | 196 +++
 block/qed.c   | 204 
 block/stream.c|   2 +-
 block/vdi.c   | 147 +
 block/vhdx.c  | 216 +++--
 block/vpc.c   | 241 +---
 blockdev.c|  71 +++--
 blockjob.c| 358 +++--
 tests/test-bdrv-drain.c   |   5 +-
 tests/test-blockjob-txn.c |  27 ++--
 tests/test-blockjob.c | 233 ++-
 block/trace-events|   7 +
 hmp-commands.hx   |   3 +-
 tests/qemu-iotests/030|   6 +-
 tests/qemu-iotests/055|  17 +-
 tests/qemu-iotests/056| 187 ++
 tests/qemu-iotests/056.out|   4 +-
 tests/qemu-iotests/109.out|  24 +--
 tests/qemu-iotests/153|  12 ++
 tests/qemu-iotests/153.out|   5 +
 tests/qemu-iotests/181|   2 +-
 tests/qemu-iotests/209| 210 
 tests/qemu-iotests/209.out| 136 
 tests/qemu-iotests/check  |   1 -
 tests/qemu-iotests/common.rc  |   2 +-
 tests/qemu-iotests/group  |   1 +
 tests/qemu-iotests/iotests.py |  12 +-
 36 files changed, 2642 insertions(+), 514 deletions(-)
 create mode 100755 tests/qemu-iotests/209
 create mode 100644 tests/qemu-iotests/209.out

[Qemu-block] [PULL 08/41] blockjobs: add ABORTING state

2018-03-13 Thread Kevin Wolf

From: John Snow 

Add a new state ABORTING.

This makes transitions from normative states to error states explicit
in the STM, and serves as a disambiguation for which states may complete
normally when normal end-states (CONCLUDED) are added in future commits.

Notably, Paused/Standby jobs do not transition directly to aborting,
as they must wake up first and cooperate in their cancellation.

Transitions:
Created -> Aborting: can be cancelled (by the system)
Running -> Aborting: can be cancelled or encounter an error
Ready   -> Aborting: can be cancelled or encounter an error

Verbs:
None. The job must finish cleaning itself up and report its final status.

             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED|
   |         +--+----+
   |            |
   |         +--v----+     +------+
   +---------+RUNNING<----->PAUSED|
   |         +--+----+     +------+
   |            |
   |         +--v--+       +-------+
   +---------+READY<------->STANDBY|
   |         +-----+       +-------+
   |
+--v-----+
|ABORTING|
+--------+

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  7 ++-
 blockjob.c   | 31 ++-
 2 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 217a31385f..c33a9e91a7 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -996,10 +996,15 @@
 # @standby: The job is ready, but paused. This is nearly identical to @paused.
 #   The job may return to @ready or otherwise be canceled.
 #
+# @aborting: The job is in the process of being aborted, and will finish with
+#an error.
+#This status may not be visible to the management process.
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobStatus',
-  'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby'] }
+  'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
+   'aborting' ] }
 
 ##
 # @BlockJobInfo:
diff --git a/blockjob.c b/blockjob.c
index d369c0cb4d..fe5b0041f7 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,22 +44,23 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S */
-/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0},
-/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0},
-/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0},
-/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0},
-/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1},
-/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0},
+  /* U, C, R, P, Y, S, X */
+/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0},
+/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1},
+/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1},
+/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0},
+/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1},
+/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0},
+/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0},
 };
 
 bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S */
-[BLOCK_JOB_VERB_CANCEL]   = {0, 1, 1, 1, 1, 1},
-[BLOCK_JOB_VERB_PAUSE]= {0, 1, 1, 1, 1, 1},
-[BLOCK_JOB_VERB_RESUME]   = {0, 1, 1, 1, 1, 1},
-[BLOCK_JOB_VERB_SET_SPEED]= {0, 1, 1, 1, 1, 1},
-[BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0},
+  /* U, C, R, P, Y, S, X */
+[BLOCK_JOB_VERB_CANCEL]   = {0, 1, 1, 1, 1, 1, 0},
+[BLOCK_JOB_VERB_PAUSE]= {0, 1, 1, 1, 1, 1, 0},
+[BLOCK_JOB_VERB_RESUME]   = {0, 1, 1, 1, 1, 1, 0},
+[BLOCK_JOB_VERB_SET_SPEED]= {0, 1, 1, 1, 1, 1, 0},
+[BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0, 0},
 };
 
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
@@ -380,6 +381,10 @@ static void block_job_completed_single(BlockJob *job)
 {
 assert(job->completed);
 
+if (job->ret || block_job_is_cancelled(job)) {
+block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
+}
+
 if (!job->ret) {
 if (job->driver->commit) {
 job->driver->commit(job);
-- 
2.13.6
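
The transition rules in this patch can be exercised outside QEMU with a small standalone model. This is a sketch under assumptions: `JobStatus` and `job_transition_allowed` are hypothetical names, the rows are copied from the BlockJobSTT hunk above, and the helper returns false for out-of-range values (using a strict upper bound) rather than asserting.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the QAPI-generated BlockJobStatus enum. */
typedef enum {
    JOB_UNDEFINED, JOB_CREATED, JOB_RUNNING, JOB_PAUSED,
    JOB_READY, JOB_STANDBY, JOB_ABORTING, JOB__MAX
} JobStatus;

/* Rows are the current state, columns the requested next state. */
static const bool job_stt[JOB__MAX][JOB__MAX] = {
                      /* U, C, R, P, Y, S, X */
    [JOB_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0},
    [JOB_CREATED]   = {0, 0, 1, 0, 0, 0, 1},
    [JOB_RUNNING]   = {0, 0, 0, 1, 1, 0, 1},
    [JOB_PAUSED]    = {0, 0, 1, 0, 0, 0, 0},
    [JOB_READY]     = {0, 0, 0, 0, 0, 1, 1},
    [JOB_STANDBY]   = {0, 0, 0, 0, 1, 0, 0},
    [JOB_ABORTING]  = {0, 0, 0, 0, 0, 0, 0},
};

/* Returns true if the transition s0 -> s1 is permitted by the table.
 * Values outside [0, JOB__MAX) are rejected before indexing. */
static bool job_transition_allowed(JobStatus s0, JobStatus s1)
{
    if ((int)s0 < 0 || s0 >= JOB__MAX || (int)s1 < 0 || s1 >= JOB__MAX) {
        return false;
    }
    return job_stt[s0][s1];
}
```

For example, `job_transition_allowed(JOB_PAUSED, JOB_ABORTING)` is false in this model, reflecting the note that paused and standby jobs must wake up before they can be aborted.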

[Qemu-block] [PULL 03/41] Blockjobs: documentation touchup

2018-03-13 Thread Kevin Wolf

From: John Snow 

Trivial; Document what the job creation flags do,
and some general tidying.

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 include/block/blockjob.h | 8 
 include/block/blockjob_int.h | 4 +++-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 29cde3ffe3..b77fac118d 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -127,12 +127,10 @@ typedef struct BlockJob {
 /** Reference count of the block job */
 int refcnt;
 
-/* True if this job has reported completion by calling block_job_completed.
- */
+/** True when job has reported completion by calling block_job_completed. */
 bool completed;
 
-/* ret code passed to block_job_completed.
- */
+/** ret code passed to block_job_completed. */
 int ret;
 
 /**
@@ -146,7 +144,9 @@ typedef struct BlockJob {
 } BlockJob;
 
 typedef enum BlockJobCreateFlags {
+/* Default behavior */
 BLOCK_JOB_DEFAULT = 0x00,
+/* BlockJob is not QMP-created and should not send QMP events */
 BLOCK_JOB_INTERNAL = 0x01,
 } BlockJobCreateFlags;
 
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index becaae74c2..259d49b32a 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -114,11 +114,13 @@ struct BlockJobDriver {
  * block_job_create:
  * @job_id: The id of the newly-created job, or %NULL to have one
  * generated automatically.
- * @job_type: The class object for the newly-created job.
+ * @driver: The class object for the newly-created job.
  * @txn: The transaction this job belongs to, if any. %NULL otherwise.
  * @bs: The block
  * @perm, @shared_perm: Permissions to request for @bs
  * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
+ * @flags: Creation flags for the Block Job.
+ * See @BlockJobCreateFlags
  * @cb: Completion function for the job.
  * @opaque: Opaque pointer value passed to @cb.
  * @errp: Error object.
-- 
2.13.6

[Qemu-block] [PULL 06/41] iotests: add pause_wait

2018-03-13 Thread Kevin Wolf

From: John Snow 

Split out the pause command into the actual pause and the wait.
Not every usage presently needs to resubmit a pause request.

The intent with the next commit will be to explicitly disallow
redundant or meaningless pause/resume requests, so the tests
need to become more judicious to reflect that.

Signed-off-by: John Snow 
Reviewed-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/030|  6 ++
 tests/qemu-iotests/055| 17 ++---
 tests/qemu-iotests/iotests.py | 12 
 3 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
index b5f88959aa..640a6dfd10 100755
--- a/tests/qemu-iotests/030
+++ b/tests/qemu-iotests/030
@@ -86,11 +86,9 @@ class TestSingleDrive(iotests.QMPTestCase):
 result = self.vm.qmp('block-stream', device='drive0')
 self.assert_qmp(result, 'return', {})
 
-result = self.vm.qmp('block-job-pause', device='drive0')
-self.assert_qmp(result, 'return', {})
-
+self.pause_job('drive0', wait=False)
 self.vm.resume_drive('drive0')
-self.pause_job('drive0')
+self.pause_wait('drive0')
 
 result = self.vm.qmp('query-block-jobs')
 offset = self.dictpath(result, 'return[0]/offset')
diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055
index 8a5d9fd269..3437c11507 100755
--- a/tests/qemu-iotests/055
+++ b/tests/qemu-iotests/055
@@ -86,11 +86,9 @@ class TestSingleDrive(iotests.QMPTestCase):
  target=target, sync='full')
 self.assert_qmp(result, 'return', {})
 
-result = self.vm.qmp('block-job-pause', device='drive0')
-self.assert_qmp(result, 'return', {})
-
+self.pause_job('drive0', wait=False)
 self.vm.resume_drive('drive0')
-self.pause_job('drive0')
+self.pause_wait('drive0')
 
 result = self.vm.qmp('query-block-jobs')
 offset = self.dictpath(result, 'return[0]/offset')
@@ -303,13 +301,12 @@ class TestSingleTransaction(iotests.QMPTestCase):
 ])
 self.assert_qmp(result, 'return', {})
 
-result = self.vm.qmp('block-job-pause', device='drive0')
-self.assert_qmp(result, 'return', {})
+self.pause_job('drive0', wait=False)
 
 result = self.vm.qmp('block-job-set-speed', device='drive0', speed=0)
 self.assert_qmp(result, 'return', {})
 
-self.pause_job('drive0')
+self.pause_wait('drive0')
 
 result = self.vm.qmp('query-block-jobs')
 offset = self.dictpath(result, 'return[0]/offset')
@@ -534,11 +531,9 @@ class TestDriveCompression(iotests.QMPTestCase):
 result = self.vm.qmp(cmd, device='drive0', sync='full', compress=True, **args)
 self.assert_qmp(result, 'return', {})
 
-result = self.vm.qmp('block-job-pause', device='drive0')
-self.assert_qmp(result, 'return', {})
-
+self.pause_job('drive0', wait=False)
 self.vm.resume_drive('drive0')
-self.pause_job('drive0')
+self.pause_wait('drive0')
 
 result = self.vm.qmp('query-block-jobs')
 offset = self.dictpath(result, 'return[0]/offset')
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 1bcc9ca57d..5303bbc8e2 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -473,10 +473,7 @@ class QMPTestCase(unittest.TestCase):
 event = self.wait_until_completed(drive=drive)
 self.assert_qmp(event, 'data/type', 'mirror')
 
-def pause_job(self, job_id='job0'):
-result = self.vm.qmp('block-job-pause', device=job_id)
-self.assert_qmp(result, 'return', {})
-
+def pause_wait(self, job_id='job0'):
 with Timeout(1, "Timeout waiting for job to pause"):
 while True:
 result = self.vm.qmp('query-block-jobs')
@@ -484,6 +481,13 @@ class QMPTestCase(unittest.TestCase):
 if job['device'] == job_id and job['paused'] == True and job['busy'] == False:
 return job
 
+def pause_job(self, job_id='job0', wait=True):
+result = self.vm.qmp('block-job-pause', device=job_id)
+self.assert_qmp(result, 'return', {})
+if wait:
+return self.pause_wait(job_id)
+return result
+
 
 def notrun(reason):
 '''Skip this test suite'''
-- 
2.13.6

[Qemu-block] [PULL 02/41] blockjobs: model single jobs as transactions

2018-03-13 Thread Kevin Wolf

From: John Snow 

Model all independent jobs as single-job transactions.

It's one less case we have to worry about when we add more states to the
transition machine. This way, we can just treat all job lifetimes exactly
the same. This helps tighten assertions of the STM graph and removes some
conditionals that would have been needed in the coming commits adding a
more explicit job lifetime management API.

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 include/block/blockjob.h |  1 -
 include/block/blockjob_int.h |  3 ++-
 block/backup.c   |  3 +--
 block/commit.c   |  2 +-
 block/mirror.c   |  2 +-
 block/stream.c   |  2 +-
 blockjob.c   | 25 -
 tests/test-bdrv-drain.c  |  4 ++--
 tests/test-blockjob-txn.c| 19 +++
 tests/test-blockjob.c|  2 +-
 10 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 00403d9482..29cde3ffe3 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -141,7 +141,6 @@ typedef struct BlockJob {
  */
 QEMUTimer sleep_timer;
 
-/** Non-NULL if this job is part of a transaction */
 BlockJobTxn *txn;
 QLIST_ENTRY(BlockJob) txn_list;
 } BlockJob;
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index c9b23b0cc9..becaae74c2 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -115,6 +115,7 @@ struct BlockJobDriver {
  * @job_id: The id of the newly-created job, or %NULL to have one
  * generated automatically.
  * @job_type: The class object for the newly-created job.
+ * @txn: The transaction this job belongs to, if any. %NULL otherwise.
  * @bs: The block
  * @perm, @shared_perm: Permissions to request for @bs
  * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
@@ -132,7 +133,7 @@ struct BlockJobDriver {
  * called from a wrapper that is specific to the job type.
  */
 void *block_job_create(const char *job_id, const BlockJobDriver *driver,
-   BlockDriverState *bs, uint64_t perm,
+   BlockJobTxn *txn, BlockDriverState *bs, uint64_t perm,
uint64_t shared_perm, int64_t speed, int flags,
BlockCompletionFunc *cb, void *opaque, Error **errp);
 
diff --git a/block/backup.c b/block/backup.c
index 4a16a37229..7e254dabff 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -621,7 +621,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
 }
 
 /* job->common.len is fixed, so we can't allow resize */
-job = block_job_create(job_id, &backup_job_driver, bs,
+job = block_job_create(job_id, &backup_job_driver, txn, bs,
BLK_PERM_CONSISTENT_READ,
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD,
@@ -677,7 +677,6 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
 block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
 &error_abort);
 job->common.len = len;
-block_job_txn_add_job(txn, &job->common);
 
 return &job->common;
 
diff --git a/block/commit.c b/block/commit.c
index 1943c9c3e1..ab4fa3c3cf 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -289,7 +289,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 return;
 }
 
-s = block_job_create(job_id, &commit_job_driver, bs, 0, BLK_PERM_ALL,
+s = block_job_create(job_id, &commit_job_driver, NULL, bs, 0, BLK_PERM_ALL,
  speed, BLOCK_JOB_DEFAULT, NULL, NULL, errp);
 if (!s) {
 return;
diff --git a/block/mirror.c b/block/mirror.c
index f5bf620942..76fddb3838 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1166,7 +1166,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
 }
 
 /* Make sure that the source is not resized while the job is running */
-s = block_job_create(job_id, driver, mirror_top_bs,
+s = block_job_create(job_id, driver, NULL, mirror_top_bs,
  BLK_PERM_CONSISTENT_READ,
  BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
  BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD, speed,
diff --git a/block/stream.c b/block/stream.c
index 499cdacdb0..f3b53f49e2 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -244,7 +244,7 @@ void stream_start(const char *job_id, BlockDriverState *bs,
 /* Prevent concurrent jobs trying to modify the graph structure here, we
  * already have our own plans. Also don't allow resize as the image size is
  * queried only at the job start and then cached. */
-s = block_job_create(job_id, &stream_job_driver, bs,
+s

Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread John Snow



On 03/13/2018 12:14 PM, Vladimir Sementsov-Ogievskiy wrote:
> 
> Hmm, I agree, it is the simplest thing we can do for now, and I'll
> rethink later,
> how (and is it worth doing) to go to postcopy automatically in case of
> only-dirty-bitmaps.
> Should I respin?

Please do. I already staged patches 1-4 in my branch, so if you'd like,
you can respin just 5+.

https://github.com/jnsnow/qemu/tree/bitmaps

--js

[Qemu-block] [PULL 01/41] blockjobs: fix set-speed kick

2018-03-13 Thread Kevin Wolf

From: John Snow 

If speed is '0' it's not actually "less than" the previous speed.
Kick the job in this case too.

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 blockjob.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/blockjob.c b/blockjob.c
index 801d29d849..afd92db01f 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -499,7 +499,7 @@ void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
 }
 
 job->speed = speed;
-if (speed <= old_speed) {
+if (speed && speed <= old_speed) {
 return;
 }
 
-- 
2.13.6
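
The one-line condition change can be isolated as a pure predicate. The sketch below is an illustration only: `speed_change_needs_kick` is a hypothetical helper, not a QEMU function, and it assumes (as the commit message implies) that a speed of 0 means "unlimited".

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a sleeping job should be kicked after its rate limit
 * changes from old_speed to new_speed (bytes per second, 0 = unlimited).
 * Mirrors the patched condition: skip the kick only when the new limit
 * is non-zero and not higher than the old one. */
static bool speed_change_needs_kick(int64_t old_speed, int64_t new_speed)
{
    if (new_speed && new_speed <= old_speed) {
        return false;   /* limit unchanged or tightened */
    }
    return true;        /* limit raised, or removed entirely (0) */
}
```

Before the patch, the equivalent test was `new_speed <= old_speed`, so switching from a finite limit to 0 compared as "not faster" and the job was left asleep until its timer expired.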

Re: [Qemu-block] [Qemu-devel] [PATCH] iotests: Update output of 051 and 186 after commit 1454509726719e0933c

2018-03-13 Thread Kevin Wolf

Am 06.03.2018 um 17:52 hat Thomas Huth geschrieben:
> On 06.03.2018 17:45, Alberto Garcia wrote:
> > Signed-off-by: Alberto Garcia 
> > ---
> >  tests/qemu-iotests/051.pc.out | 20 
> >  tests/qemu-iotests/186.out| 22 +++---
> >  2 files changed, 3 insertions(+), 39 deletions(-)
> > 
> > diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
> > index 830c11880a..b01f9a90d7 100644
> > --- a/tests/qemu-iotests/051.pc.out
> > +++ b/tests/qemu-iotests/051.pc.out
> > @@ -117,20 +117,10 @@ Testing: -drive if=ide,media=cdrom
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) quit
> >  
> > -Testing: -drive if=scsi,media=cdrom
> > -QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
> > deprecated with this machine type
> > -quit
> > -
> >  Testing: -drive if=ide
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) QEMU_PROG: Initialization of device ide-hd failed: Device needs 
> > media, but drive is empty
> >  
> > -Testing: -drive if=scsi
> > -QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive if=scsi: warning: bus=0,unit=0 is deprecated with 
> > this machine type
> > -QEMU_PROG: -drive if=scsi: Device needs media, but drive is empty
> > -
> >  Testing: -drive if=virtio
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) QEMU_PROG: -drive if=virtio: Device needs media, but drive is empty
> > @@ -170,20 +160,10 @@ Testing: -drive 
> > file=TEST_DIR/t.qcow2,if=ide,media=cdrom,readonly=on
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) quit
> >  
> > -Testing: -drive file=TEST_DIR/t.qcow2,if=scsi,media=cdrom,readonly=on
> > -QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive 
> > file=TEST_DIR/t.qcow2,if=scsi,media=cdrom,readonly=on: warning: 
> > bus=0,unit=0 is deprecated with this machine type
> > -quit
> > -
> >  Testing: -drive file=TEST_DIR/t.qcow2,if=ide,readonly=on
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) QEMU_PROG: Initialization of device ide-hd failed: Block node is 
> > read-only
> >  
> > -Testing: -drive file=TEST_DIR/t.qcow2,if=scsi,readonly=on
> > -QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive file=TEST_DIR/t.qcow2,if=scsi,readonly=on: 
> > warning: bus=0,unit=0 is deprecated with this machine type
> > -quit
> > -
> >  Testing: -drive file=TEST_DIR/t.qcow2,if=virtio,readonly=on
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) quit
> 
> Ack for that part.
> 
> > diff --git a/tests/qemu-iotests/186.out b/tests/qemu-iotests/186.out
> > index c8377fe146..d83bba1a88 100644
> > --- a/tests/qemu-iotests/186.out
> > +++ b/tests/qemu-iotests/186.out
> > @@ -444,31 +444,15 @@ ide0-cd0 (NODE_NAME): null-co:// (null-co, read-only)
> >  
> >  Testing: -drive if=scsi,driver=null-co
> >  QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive if=scsi,driver=null-co: warning: bus=0,unit=0 is 
> > deprecated with this machine type
> > -info block
> > -scsi0-hd0 (NODE_NAME): null-co:// (null-co)
> > -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
> > -Cache mode:   writeback
> > -(qemu) quit
> > +(qemu) QEMU_PROG: -drive if=scsi,driver=null-co: machine type does not 
> > support if=scsi,bus=0,unit=0
> >  
> >  Testing: -drive if=scsi,media=cdrom
> >  QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
> > deprecated with this machine type
> > -info block
> > -scsi0-cd0: [not inserted]
> > -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
> > -Removable device: not locked, tray closed
> > -(qemu) quit
> > +(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: machine type does not 
> > support if=scsi,bus=0,unit=0
> >  
> >  Testing: -drive if=scsi,driver=null-co,media=cdrom
> >  QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive if=scsi,driver=null-co,media=cdrom: warning: 
> > bus=0,unit=0 is deprecated with this machine type
> > -info block
> > -scsi0-cd0 (NODE_NAME): null-co:// (null-co, read-only)
> > -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
> > -Removable device: not locked, tray closed
> > -Cache mode:   writeback
> > -(qemu) quit
> > +(qemu) QEMU_PROG: -drive if=scsi,driver=null-co,media=cdrom: machine type 
> > does not support if=scsi,bus=0,unit=0
> 
> That rather sounds like this "if=scsi" test should be removed now?

I think it actually sounds like a SCSI adapter should be added manually
now.

Kevin

Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread Vladimir Sementsov-Ogievskiy


13.03.2018 18:35, Dr. David Alan Gilbert wrote:

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:

13.03.2018 13:30, Dr. David Alan Gilbert wrote:

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:

12.03.2018 18:30, Dr. David Alan Gilbert wrote:

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:

There would be savevm states (dirty-bitmap) which can migrate only in
postcopy stage. The corresponding pending is introduced here.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

[...]


static MigIterateState migration_iteration_run(MigrationState *s)
{
-uint64_t pending_size, pend_post, pend_nonpost;
+uint64_t pending_size, pend_pre, pend_compat, pend_post;
bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
-qemu_savevm_state_pending(s->to_dst_file, s->threshold_size,
-  &pend_nonpost, &pend_post);
-pending_size = pend_nonpost + pend_post;
+qemu_savevm_state_pending(s->to_dst_file, s->threshold_size, &pend_pre,
+  &pend_compat, &pend_post);
+pending_size = pend_pre + pend_compat + pend_post;
trace_migrate_pending(pending_size, s->threshold_size,
-  pend_post, pend_nonpost);
+  pend_pre, pend_compat, pend_post);
if (pending_size && pending_size >= s->threshold_size) {
/* Still a significant amount to transfer */
if (migrate_postcopy() && !in_postcopy &&
-pend_nonpost <= s->threshold_size &&
-atomic_read(&s->start_postcopy)) {
+pend_pre <= s->threshold_size &&
+(atomic_read(&s->start_postcopy) ||
+ (pend_pre + pend_compat <= s->threshold_size)))

This change does something different from the description;
it causes a postcopy_start even if the user never ran the postcopy-start
command; so sorry, we can't do that; because postcopy for RAM is
something that users can enable but only switch into when they've given
up on it completing normally.

However, I guess that leaves you with a problem; which is what happens
to the system when you've run out of pend_pre+pend_compat but can't
complete because pend_post is non-0; so I don't know the answer to that.



Hmm. Here, we go to postcopy only if "pend_pre + pend_compat <=
s->threshold_size". Pre-patch, in this case we will go to
migration_completion(). So, precopy stage is finishing anyway.

Right.


So, we want
in this case to finish ram migration like it was finished by
migration_completion(), and then, run postcopy, which will handle only dirty
bitmaps, yes?

It's a bit tricky; the first important thing is that we can't change the
semantics of the migration without the 'dirty bitmaps'.

So then there's the question of how a migration with both
postcopy-ram+dirty bitmaps should work; again I don't think we should
enter the postcopy-ram phase until start-postcopy is issued.

Then there's the 3rd case; dirty-bitmaps but no postcopy-ram; in that
case I worry less about the semantics of how you want to do it.

I have an idea:

in postcopy_start(), in ram_has_postcopy() (and maybe some other places?),
check atomic_read(&s->start_postcopy) instead of migrate_postcopy_ram()

We've got to use migrate_postcopy_ram() to decide whether we should do
ram specific things, e.g. send the ram discard data.
I'm wanting to make sure that if we have another full postcopy device
(like RAM, maybe storage say) that we'll just add that in with
migrate_postcopy_whatever().


then:

1. behavior without dirty-bitmaps is not changed, as currently we can't go
into postcopy_start and ram_has_postcopy without s->start_postcopy
2. dirty-bitmaps+ram: if the user doesn't set s->start_postcopy, postcopy_start()
will operate as if the migration capability was not enabled, so ram should
complete its migration
3. only dirty-bitmaps: again, postcopy_start() will operate as if the migration
capability was not enabled, so ram should complete its migration

Why can't we just remove the change to the trigger condition in this
patch?  Then I think everything works as long as the management layer
does eventually call migrate-start-postcopy?
(It might waste some bandwidth at the point where there's otherwise
nothing left to send).


Hmm, I agree, it is the simplest thing we can do for now, and I'll
rethink later how (and whether it is worth it) to go to postcopy
automatically in the only-dirty-bitmaps case.

Should I respin?



Even with the use of migrate-start-postcopy, you're going to need to be
careful about the higher level story; you need to document when to do it
and what the higher levels should do after a migration failure - at the
moment they know that once postcopy starts migration is irrecoverable if
it fails; I suspect that's not true with your dirty bitmaps.

IMHO this still comes back to my original observation from ~18 months ago
that in many ways this isn't very postcopy like; in the sense that all
the semantics

Re: [Qemu-block] [PATCH v3 1/1] block/mirror: change the semantic of 'force' of block-job-cancel

2018-03-13 Thread Kevin Wolf

Am 13.03.2018 um 13:12 hat Jeff Cody geschrieben:
> From: Liang Li 
> 
> When mirroring a drive to slow shared storage, a heavy block I/O write
> workload in the VM after the 'ready' event prevents the drive-mirror block
> job from being canceled immediately; the job keeps running until the heavy
> block I/O workload in the VM stops.
> 
> Libvirt depends on the current block-job-cancel semantics, which is that
> when used without a flag after the 'ready' event, the command blocks
> until data is in sync.  However, these semantics are awkward in other
> situations, for example, people may use drive mirror for realtime
> backups while still wanting to use block live migration.  Libvirt cannot
> start a block live migration while another drive mirror is in progress,
> but the user would rather abandon the backup attempt as broken and
> proceed with the live migration than be stuck waiting for the current
> drive mirror backup to finish.
> 
> The block-job-cancel command already includes a 'force' flag, which libvirt
> does not use, although it documented the flag as only being useful to
> quit a job which is paused.  However, since quitting a paused job has
> the same effect as abandoning a backup in a non-paused job (namely, the
> destination file is not in sync, and the command completes immediately),
> we can just improve the documentation to make the force flag obviously
> useful.
> 
> Cc: Paolo Bonzini 
> Cc: Jeff Cody 
> Cc: Kevin Wolf 
> Cc: Max Reitz 
> Cc: Eric Blake 
> Cc: John Snow 
> Reported-by: Huaitong Han 
> Signed-off-by: Huaitong Han 
> Signed-off-by: Liang Li 
> Signed-off-by: Jeff Cody 
> ---
> 
> N.B.: This was rebased on top of Kevin's block branch,
>   and the 'force' flag added to block_job_user_cancel

Thanks, applied to the block branch.

Kevin
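
For illustration, a forced cancel sent over QMP might look like the following sketch. The device name `drive0` and the helper name are placeholders; the snippet only builds the JSON message and does not talk to a running QEMU.

```python
import json


def block_job_cancel(device, force=False):
    """Build a QMP block-job-cancel message as a JSON string."""
    args = {"device": device}
    if force:
        # With force, the job is abandoned immediately and the
        # destination is not guaranteed to be in sync.
        args["force"] = True
    return json.dumps({"execute": "block-job-cancel", "arguments": args},
                      sort_keys=True)


msg = block_job_cancel("drive0", force=True)
print(msg)
```

Without the flag, the message carries only the device name, and the command keeps the semantics libvirt relies on.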

Re: [Qemu-block] [PATCH v2 0/8] nbd block status base:allocation

2018-03-13 Thread Eric Blake


On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

Hi all.

Here is a minimal implementation of the base:allocation context of the NBD
block-status extension, which allows getting block status through
NBD.

v2 changes are in each patch after "---" line.

Vladimir Sementsov-Ogievskiy (8):
   nbd/server: add nbd_opt_invalid helper
   nbd/server: add nbd_read_opt_name helper
   nbd: BLOCK_STATUS for standard get_block_status function: server part
   block/nbd-client: save first fatal error in nbd_iter_error
   nbd: BLOCK_STATUS for standard get_block_status function: client part
   iotests.py: tiny refactor: move system imports up
   iotests: add file_path helper
   iotests: new test 209 for NBD BLOCK_STATUS


I've staged this on my NBD queue, pull request to come later today 
(still this morning for me) so that it makes 2.12 softfreeze.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-block] [PATCH v2 8/8] iotests: new test 209 for NBD BLOCK_STATUS

2018-03-13 Thread Eric Blake


On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---





+
+iotests.verify_image_format(supported_fmts=['qcow2'])


Interesting that './check -nbd' doesn't run 209 (because that defaults 
to format -raw, but we need format -qcow2), but './check -qcow2' and 
'./check -qcow2 -nbd' do run it, so I've tested that it passes, and is 
quick.



+
+disk, nbd_sock = file_path('disk', 'nbd-sock')
+nbd_uri = 'nbd+unix:///exp?socket=' + nbd_sock
+
+qemu_img_create('-f', iotests.imgfmt, disk, '1M')
+qemu_io('-f', iotests.imgfmt, '-c', 'write 0 512K', disk)
+
+qemu_nbd('-k', nbd_sock, '-x', 'exp', '-f', iotests.imgfmt, disk)
+qemu_img_verbose('map', '-f', 'raw', '--output=json', nbd_uri)


And this one is easy enough to reproduce, whether I use shell or python. 
 (Better than some of the python iotests that just have a line of 
'.' where you have to scratch your head at how to reproduce failures).


Reviewed-by: Eric Blake 


+++ b/tests/qemu-iotests/group
@@ -202,3 +202,4 @@
  203 rw auto
  204 rw auto quick
  205 rw auto quick
+209 rw auto quick


There is the obvious context conflict as other tests land in master, but I 
don't mind that ;)


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread Dr. David Alan Gilbert

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> 13.03.2018 13:30, Dr. David Alan Gilbert wrote:
> > * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> > > 12.03.2018 18:30, Dr. David Alan Gilbert wrote:
> > > > * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> > > > > There would be savevm states (dirty-bitmap) which can migrate only in
> > > > > postcopy stage. The corresponding pending is introduced here.
> > > > > 
> > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > > > > ---
> > > [...]
> > > 
> > > > >static MigIterateState migration_iteration_run(MigrationState *s)
> > > > >{
> > > > > -uint64_t pending_size, pend_post, pend_nonpost;
> > > > > +uint64_t pending_size, pend_pre, pend_compat, pend_post;
> > > > >bool in_postcopy = s->state == 
> > > > > MIGRATION_STATUS_POSTCOPY_ACTIVE;
> > > > > > -qemu_savevm_state_pending(s->to_dst_file, s->threshold_size,
> > > > > > -  &pend_nonpost, &pend_post);
> > > > > > -pending_size = pend_nonpost + pend_post;
> > > > > > +qemu_savevm_state_pending(s->to_dst_file, s->threshold_size, &pend_pre,
> > > > > > +  &pend_compat, &pend_post);
> > > > > +pending_size = pend_pre + pend_compat + pend_post;
> > > > >trace_migrate_pending(pending_size, s->threshold_size,
> > > > > -  pend_post, pend_nonpost);
> > > > > +  pend_pre, pend_compat, pend_post);
> > > > >if (pending_size && pending_size >= s->threshold_size) {
> > > > >/* Still a significant amount to transfer */
> > > > >if (migrate_postcopy() && !in_postcopy &&
> > > > > -pend_nonpost <= s->threshold_size &&
> > > > > > -atomic_read(&s->start_postcopy)) {
> > > > > > +pend_pre <= s->threshold_size &&
> > > > > > +(atomic_read(&s->start_postcopy) ||
> > > > > > + (pend_pre + pend_compat <= s->threshold_size)))
> > > > This change does something different from the description;
> > > > it causes a postcopy_start even if the user never ran the postcopy-start
> > > > command; so sorry, we can't do that; because postcopy for RAM is
> > > > something that users can enable but only switch into when they've given
> > > > up on it completing normally.
> > > > 
> > > > However, I guess that leaves you with a problem; which is what happens
> > > > to the system when you've run out of pend_pre+pend_compat but can't
> > > > complete because pend_post is non-0; so I don't know the answer to that.
> > > > 
> > > > 
> > > Hmm. Here, we go to postcopy only if "pend_pre + pend_compat <=
> > > s->threshold_size". Pre-patch, in this case we will go to
> > > migration_completion(). So, precopy stage is finishing anyway.
> > Right.
> > 
> > > So, we want
> > > in this case to finish ram migration like it was finished by
> > > migration_completion(), and then, run postcopy, which will handle only 
> > > dirty
> > > bitmaps, yes?
> > It's a bit tricky; the first important thing is that we can't change the
> > semantics of the migration without the 'dirty bitmaps'.
> > 
> > So then there's the question of how  a migration with both
> > postcopy-ram+dirty bitmaps should work;  again I don't think we should
> > enter the postcopy-ram phase until start-postcopy is issued.
> > 
> > Then there's the 3rd case; dirty-bitmaps but no postcopy-ram; in that
> > case I worry less about the semantics of how you want to do it.
> 
> I have an idea:
> 
> in postcopy_start(), in ram_has_postcopy() (and maybe some other places?),
> check atomic_read(&s->start_postcopy) instead of migrate_postcopy_ram()

We've got to use migrate_postcopy_ram() to decide whether we should do
ram specific things, e.g. send the ram discard data.
I'm wanting to make sure that if we have another full postcopy device
(like RAM, maybe storage say) that we'll just add that in with
migrate_postcopy_whatever().

> then:
> 
> 1. behavior without dirty-bitmaps is not changed, as currently we can't go
> into postcopy_start and ram_has_postcopy without s->start_postcopy
> 2. dirty-bitmaps+ram: if the user doesn't set s->start_postcopy, postcopy_start()
> will operate as if the migration capability was not enabled, so ram should
> complete its migration
> 3. only dirty-bitmaps: again, postcopy_start() will operate as if the migration
> capability was not enabled, so ram should complete its migration

Why can't we just remove the change to the trigger condition in this
patch?  Then I think everything works as long as the management layer
does eventually call migrate-start-postcopy?
(It might waste some bandwidth at the point where there's otherwise
nothing left to send).

Even with the use of migrate-start-postcopy, you're going to need to be
careful about the higher level story; you need to document when to do it
and what the higher levels should do after a migration failure - at the
moment they know that once

Re: [Qemu-block] [PATCH v2 5/8] nbd: BLOCK_STATUS for standard get_block_status function: client part

2018-03-13 Thread Eric Blake


On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

Minimal realization: only one extent in server answer is supported.
Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

v2: - drop iotests changes, as server is fixed in 03
 - rebase to byte-based block status
 - use payload_advance32
 - check extent->length for zero and for alignment (not sure about
   zero, but, we do not send block status with zero-length, so
   reply should not be zero-length too)


The NBD spec needs to be clarified that a zero-length request is bogus; 
once that is done, then the server can be required to make progress (if 
it succeeds, at least one non-zero extent was reported per namespace), 
as that is the most useful behavior (if a server replies with 0 extents 
or 0-length extents, the client could go into an inf-loop re-requesting 
the same status).
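
The infinite-loop risk can be sketched as follows. This is a minimal Python model, not the QEMU client: `query_status(offset, length)` is a hypothetical callback returning the length and flags of the first extent at `offset`, and the two fake servers exist only for the demonstration.

```python
def map_extents(query_status, start, end):
    """Collect (offset, length, flags) tuples, one extent per request.

    Refuses to continue if the server reports no progress, since
    re-requesting the same offset would loop forever.
    """
    extents = []
    offset = start
    while offset < end:
        length, flags = query_status(offset, end - offset)
        if length <= 0 or length > end - offset:
            raise ValueError("server made no progress or overran request")
        extents.append((offset, length, flags))
        offset += length
    return extents


def good_server(offset, length):
    # First 512 KiB is data (flags 0); the rest is a zeroed hole
    # (NBD_STATE_HOLE | NBD_STATE_ZERO == 3).
    boundary = 512 * 1024
    if offset < boundary:
        return min(length, boundary - offset), 0
    return length, 3


def bad_server(offset, length):
    # Always answers with a zero-length extent.
    return 0, 0


extents = map_extents(good_server, 0, 1024 * 1024)
print(extents)
```

Calling `map_extents(bad_server, 0, 4096)` raises instead of spinning, which is the property a required-progress rule in the spec would let clients rely on.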



 - handle max_block in nbd_client_co_block_status
 - handle zero-length request in nbd_client_co_block_status
 - do not use magic numbers in nbd_negotiate_simple_meta_context

 ? Hm, don't remember, what we decided about DATA/HOLE flags mapping..


At this point, it's still up in the air for me to fix the complaints 
Kevin had, but those are bug fixes on top of this series (and thus okay 
during soft freeze), so your initial implementation is adequate for a 
first commit.



+++ b/block/nbd-client.c
@@ -228,6 +228,47 @@ static int 
nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk,
  return 0;
  }
  
+/* nbd_parse_blockstatus_payload

+ * support only one extent in reply and only for
+ * base:allocation context
+ */
+static int nbd_parse_blockstatus_payload(NBDClientSession *client,
+ NBDStructuredReplyChunk *chunk,
+ uint8_t *payload, uint64_t orig_length,
+ NBDExtent *extent, Error **errp)
+{
+uint32_t context_id;
+
+if (chunk->length != sizeof(context_id) + sizeof(extent)) {
+error_setg(errp, "Protocol error: invalid payload for "
+ "NBD_REPLY_TYPE_BLOCK_STATUS");
+return -EINVAL;
+}
+
+context_id = payload_advance32(&payload);
+if (client->info.meta_base_allocation_id != context_id) {
+error_setg(errp, "Protocol error: unexpected context id: %d for "


s/id:/id/


+ "NBD_REPLY_TYPE_BLOCK_STATUS, when negotiated context "
+ "id is %d", context_id,
+ client->info.meta_base_allocation_id);
+return -EINVAL;
+}
+
+extent->length = payload_advance32(&payload);
+extent->flags = payload_advance32(&payload);
+
+if (extent->length == 0 ||
+extent->length % client->info.min_block != 0 ||
+extent->length > orig_length)
+{
+/* TODO: clarify in NBD spec the second requirement about min_block */


Yeah, the spec wording can be tightened, but the intent is obvious: the 
server better not be reporting status on anything smaller than what you 
can address with read or write.  But I think we can address that on the 
NBD list without a TODO here.


However, you do have a bug: the server doesn't have to report min_block, 
so the value can still be zero (see nbd_refresh_limits, for example) - 
and %0 is a bad idea.  I'll do the obvious cleanup of checking for a 
non-zero min_block.
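
A minimal Python sketch of the guarded check, assuming the same three conditions as the hunk above; the function name is hypothetical and the actual cleanup in the tree may be structured differently.

```python
def extent_is_valid(length, orig_length, min_block):
    """Validate a reported extent length.

    min_block may be 0 when the server did not advertise a minimum
    block size, so the alignment test is skipped in that case instead
    of computing length % 0.
    """
    if length == 0 or length > orig_length:
        return False
    if min_block and length % min_block != 0:
        return False
    return True


print(extent_is_valid(4096, 8192, 0))    # unadvertised min_block
print(extent_is_valid(4096, 8192, 512))  # aligned extent
print(extent_is_valid(100, 8192, 512))   # unaligned extent
```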



+error_setg(errp, "Protocol error: server sent chunk of invalid length");


Maybe insert 'status' in there?

  
+static int nbd_co_receive_blockstatus_reply(NBDClientSession *s,

+uint64_t handle, uint64_t length,
+NBDExtent *extent, Error **errp)
+{
+NBDReplyChunkIter iter;
+NBDReply reply;
+void *payload = NULL;
+Error *local_err = NULL;
+bool received = false;
+
+NBD_FOREACH_REPLY_CHUNK(s, iter, handle, s->info.structured_reply,
+NULL, &reply, &payload)
+{
+int ret;
+NBDStructuredReplyChunk *chunk = &reply.structured;
+
+assert(nbd_reply_is_structured(&reply));
+
+switch (chunk->type) {
+case NBD_REPLY_TYPE_BLOCK_STATUS:
+if (received) {
+s->quit = true;
+error_setg(&local_err, "Several BLOCK_STATUS chunks in reply");


Not necessarily an error later on when we request more than one 
namespace, but fine for the initial implementation where we really do 
expect exactly one status.



+nbd_iter_error(&iter, true, -EINVAL, &local_err);
+}
+received = true;
+
+ret = nbd_parse_blockstatus_payload(s, &reply.structured,
+payload, length, extent,
+&local_err);
+if (ret < 0) {
+s->quit = true;
+nbd_iter_error(&iter, true, ret, &local_err);
+}
+

Re: [Qemu-block] [PATCH v2 7/8] iotests: add file_path helper

2018-03-13 Thread Eric Blake


On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

A simple way to get auto-generated filenames with automatic cleanup. Like
FilePath, but without the 'with' statement and without additional
indentation of the whole test.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---




+def file_path(*names):
+''' Another way to get auto-generated filename that cleans itself up.
+
+Use it as simple as:


s/it/is/


+
+img_a, img_b = file_path('a.img', 'b.img')
+sock = file_path('socket')
+'''
+
+if not hasattr(file_path_remover, 'paths'):
+file_path_remover.paths = []
+atexit.register(file_path_remover)
+
+paths = []
+for name in names:
+filename = '{0}-{1}'.format(os.getpid(), name)
+path = os.path.join(test_dir, filename)
+file_path_remover.paths.append(path)
+paths.append(path)
+
+return paths[0] if len(paths) == 1 else paths
+
+
  class VM(qtest.QEMUQtestMachine):
  '''A QEMU VM'''
  


Reviewed-by: Eric Blake 
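
For anyone wanting to try the helper outside the iotests tree, here is a standalone sketch. Two parts are assumptions: `test_dir` is replaced by a temporary directory, and `file_path_remover` is a guess at the cleanup hook, since its body is not part of the quoted hunk.

```python
import atexit
import os
import tempfile

# Assumption: in iotests, test_dir comes from the test environment;
# a temporary directory stands in for it here.
test_dir = tempfile.mkdtemp()


def file_path_remover():
    """Hypothetical cleanup hook: delete every registered path."""
    for path in file_path_remover.paths:
        try:
            os.remove(path)
        except OSError:
            pass  # file was never created or is already gone


def file_path(*names):
    """Return PID-prefixed paths under test_dir, removed at exit."""
    if not hasattr(file_path_remover, 'paths'):
        # First call: create the registry and hook it into atexit once.
        file_path_remover.paths = []
        atexit.register(file_path_remover)

    paths = []
    for name in names:
        filename = '{0}-{1}'.format(os.getpid(), name)
        path = os.path.join(test_dir, filename)
        file_path_remover.paths.append(path)
        paths.append(path)

    # A single name yields a plain string, several names yield a list.
    return paths[0] if len(paths) == 1 else paths


img_a, img_b = file_path('a.img', 'b.img')
sock = file_path('socket')
print(img_a, img_b, sock)
```

Storing the path list as an attribute on the remover function keeps the registration idempotent without a module-level global, at the cost of being a little unusual to read.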

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-block] [PATCH v3 1/2] block: Fix flags in reopen queue

2018-03-13 Thread Kevin Wolf

Am 13.03.2018 um 15:20 hat Fam Zheng geschrieben:
> Reopen flags are not synchronized according to the
> bdrv_reopen_queue_child precedence until bdrv_reopen_prepare. It is a
> bit too late: we already check the consistency in bdrv_check_perm before
> that.
> 
> This fixes the bug where, when bdrv_reopen() reopens a RO node as RW, the
> flags for the backing child are wrong. Before, we could recurse with
> flags.rw=1; now,
> role->inherit_options + update_flags_from_options will make sure to
> clear the bit when necessary.  Note that this will not clear an
> explicitly set bit, as in the case of parallel block jobs (e.g.
> test_stream_parallel in 030), because the explicit options include
> 'read-only=false' (for an intermediate node used by a different job).
> 
> Signed-off-by: Fam Zheng 
> ---
>  block.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/block.c b/block.c
> index 75a9fd49de..a121d2ebcc 100644
> --- a/block.c
> +++ b/block.c
> @@ -2883,8 +2883,15 @@ static BlockReopenQueue 
> *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
>  
>  /* Inherit from parent node */
>  if (parent_options) {
> +QemuOpts *opts;
> +QDict *options_copy;
>  assert(!flags);
>  role->inherit_options(&flags, options, parent_flags, parent_options);
> +options_copy = qdict_clone_shallow(options);
> +opts = qemu_opts_create(&bdrv_runtime_opts, NULL, 0, &error_abort);
> +qemu_opts_absorb_qdict(opts, options_copy, NULL);
> +update_flags_from_options(&flags, opts);
> +qemu_opts_del(opts);

Squashed in a line here after Fam and Max agreed on IRC:

+QDECREF(options_copy);

>  }
>  
>  /* Old values are used for options that aren't set yet */

Kevin
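
The precedence described in the commit message can be modeled roughly as below. This is a toy Python sketch, not the block layer: `RDWR`, `inherit_backing` and the dict-based options are illustrative stand-ins for the flag bit, the backing role's inherit_options callback, and the options QDict.

```python
RDWR = 0x2  # illustrative bit standing in for the read-write flag


def inherit_backing(parent_flags, explicit_options):
    """Hypothetical backing-role inheritance: default the child to
    read-only unless the option was set explicitly."""
    options = dict(explicit_options)
    options.setdefault('read-only', True)
    return parent_flags & ~RDWR, options


def update_flags_from_options(flags, options):
    """Let the read-only option decide the RDWR bit."""
    flags &= ~RDWR
    if not options['read-only']:
        flags |= RDWR
    return flags


def child_reopen_flags(parent_flags, explicit_options):
    # Inherit first, then synchronize flags with the options right
    # away rather than waiting for a later prepare step.
    flags, options = inherit_backing(parent_flags, explicit_options)
    return update_flags_from_options(flags, options)


inherited = child_reopen_flags(RDWR, {})
explicit = child_reopen_flags(RDWR, {'read-only': False})
print(bool(inherited & RDWR), bool(explicit & RDWR))
```

The first call shows the bit being cleared for a backing child of a read-write parent; the second shows that an explicit 'read-only=false' survives, as in the parallel block job case mentioned above.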
