date:20190710

Re: [Qemu-block] [Qemu-devel] [PATCH v4 17/18] iotests: add test 257 for bitmap-mode backups

2019-07-10 Thread John Snow




On 7/9/19 7:25 PM, John Snow wrote:
> Signed-off-by: John Snow 
> ---
>  tests/qemu-iotests/257 |  416 +++
>  tests/qemu-iotests/257.out | 2247 
>  tests/qemu-iotests/group   |1 +
>  3 files changed, 2664 insertions(+)
>  create mode 100755 tests/qemu-iotests/257
>  create mode 100644 tests/qemu-iotests/257.out
> 
> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
> new file mode 100755
> index 00..75a651c7c3
> --- /dev/null
> +++ b/tests/qemu-iotests/257
> @@ -0,0 +1,416 @@
> +#!/usr/bin/env python
> +#
> +# Test bitmap-sync backups (incremental, differential, and partials)
> +#
> +# Copyright (c) 2019 John Snow for Red Hat, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see .
> +#
> +# owner=js...@redhat.com
> +
> +from collections import namedtuple
> +import math
> +import os
> +
> +import iotests
> +from iotests import log, qemu_img
> +
> +SIZE = 64 * 1024 * 1024
> +GRANULARITY = 64 * 1024
> +
> +Pattern = namedtuple('Pattern', ['byte', 'offset', 'size'])
> +def mkpattern(byte, offset, size=GRANULARITY):
> +"""Constructor for Pattern() with default size"""
> +return Pattern(byte, offset, size)
> +
> +class PatternGroup:
> +"""Grouping of Pattern objects. Initialize with an iterable of 
> Patterns."""
> +def __init__(self, patterns):
> +self.patterns = patterns
> +
> +def bits(self, granularity):
> +"""Calculate the unique bits dirtied by this pattern grouping"""
> +res = set()
> +for pattern in self.patterns:
> +lower = math.floor(pattern.offset / granularity)
> +upper = math.floor((pattern.offset + pattern.size - 1) / 
> granularity)
> +res = res | set(range(lower, upper + 1))
> +return res
> +

Squashing in:

-lower = math.floor(pattern.offset / granularity)
-upper = math.floor((pattern.offset + pattern.size - 1) /
granularity)
+lower = pattern.offset // granularity
+upper = (pattern.offset + pattern.size - 1) // granularity

To quiet the Python2 beast.


--js

Re: [Qemu-block] [PATCH 0/4] virtio: handle zoned backing devices

2019-07-10 Thread Dmitry Fomichev

On Wed, 2019-07-10 at 23:09 +0200, Kevin Wolf wrote:
> Am 10.07.2019 um 13:33 hat Paolo Bonzini geschrieben:
> > On 10/07/19 13:02, Kevin Wolf wrote:
> > > Hm... Actually, file-posix implements .bdrv_check_perm and could just
> > > refuse attaching a parent there if it doesn't request a specific
> > > permission like BLK_PERM_SUPPORT_ZONED. That should give us the
> > > whitelist semantics through existing infrastructure.
> > 
> > I'd like Dmitry to have something more precise to base his work on.  The
> > permissions system is really complicated and I never really wrapped my
> > head around it, so I need your help.
> > 
> > IIUC, blkconf_apply_backend_options would grow a new argument (like
> > "resizable") and that argument would add BLK_PERM_SUPPORT_ZONED to the
> > perm that blkconf_apply_backend_options passes to blk_set_perm.  On the
> > other side raw_check_perm would say something like
> > 
> > if (is_zoned(s) && !(perm & BLK_PERM_SUPPORT_ZONED)) {
> > error_setg();
> > return -ENOTSUP;
> > }
> > 
> > Is this correct?
> 
> Yes, I think this is how you'd best implement it.
> 
> > In addition, BLK_PERM_SUPPORT_ZONED would have to be a shared
> > permission, since it's possible to assign the same block device to
> > multiple scsi-block devices.  So BLK_PERM_SUPPORT_ZONED would be added
> > unconditionally to shared_perm.
> 
> Right, this part shows that we're kind of abusing the permission system
> here. I think unconditionally adding BLK_PERM_SUPPORT_ZONED to the set
> of shared permissions could probably happen centrally in
> bdrv_child_perm().
> 
> > ps: I have always thought that shared_perm is expressed the wrong way
> > and should have been "denied_perm".  How hard would it be to change that
> > now?
> 
> I'm not sure it would be better. It is shared_perm because that means
> that the default is restrictive (error mode: operation refused, clearly
> reported, easy to fix) rather than permissive (error mode: image
> corruption, hard to figure out where we were too permissive). Basically,
> whitelist instead of blacklist, once again.
> 
> But if we did indeed decide to change it, the only way to find out how
> hard it is would be doing it. I suspect not too hard in principle, but
> making sure that we converted all callers and don't introduce wrong
> callers later through (silent) merge conflicts is the more concerning
> part.
> 
> Kevin

Thanks for the feedback, the permissions based approach indeed looks
cleaner. I am looking into modifying the patchset to do the check in
bdrv_check_perm and will send a V2.

Paolo,
WRT to Host Aware drives, these MAY work, but we don't have any of these
available for testing and are not able to verify which drivers do work
with them and which do not. This is the reason for not letting them pass
thru. If you prefer, I can enable passing HA drives in V2.

Dmitry

Re: [Qemu-block] [PATCH] LUKS: support preallocation in qemu-img

2019-07-10 Thread Max Reitz

On 10.07.19 23:24, Max Reitz wrote:
> On 10.07.19 19:03, Maxim Levitsky wrote:
>> preallocation=off and preallocation=metadata
>> both allocate luks header only, and preallocation=falloc/full
>> is passed to underlying file, with the given image size.
>>
>> Note that the actual preallocated size is a bit smaller due
>> to luks header.
> 
> Couldn’t you just preallocate it after creating the crypto header so
> qcrypto_block_get_payload_offset(crypto->block) + size is the actual
> file size?
> 
>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1534951
>>
>> Signed-off-by: Maxim Levitsky 
>> ---
>>  block/crypto.c | 28 ++--
>>  1 file changed, 26 insertions(+), 2 deletions(-)
> 
> Hm.  I would expect a preallocated image to read 0.  But if you just
> pass this through to the protocol layer, it won’t read 0.
> 
> (In fact, I don’t even quite see the point of having LUKS as an own
> format still.  It was useful when qcow2 didn’t have LUKS support, but
> now it does, so...  I suppose everyone using the LUKS format should
> actually be using qcow2 with LUKS?)

Kevin just pointed out to me that our LUKS format is compatible to the
actual layout cryptsetup uses.  OK, that is an important use case.

Hm.  Unfortunately, that doesn’t really necessitate preallocation.

Well, whatever.  If it’s simple enough, that shouldn’t stop us from
implementing preallocation anyway.

Now I found that qapi/block-core.json defines PreallocMode’s falloc and
full values as follows:

> # @falloc: like @full preallocation but allocate disk space by
> #  posix_fallocate() rather than writing zeros.
> # @full: preallocate all data by writing zeros to device to ensure disk
> #space is really available. @full preallocation also sets up
> #metadata correctly.

So it isn’t just me who expects these to pre-initialize the image to 0.
 Hm, although...  I suppose @falloc technically does not specify whether
the data reads as zeroes.  I kind of find it to be implied, but, well...

Max

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH v2 00/12] block: Fixes for concurrent block jobs

2019-07-10 Thread Max Reitz

Ping

I suppose I’ll apply the patches I had in v1 if I don’t get reviews for
the new patches, because some fixes are better than none.

(And probably patch 2, because it’s obvious.)

Max

On 03.07.19 19:28, Max Reitz wrote:
> This is a v2 to “block: Add BDS.never_freeze”.
> 
> It depends on my “block: Delay poll when ending drained sections”
> series:
> 
> Depends-on: <20190619152603.5937-1-mre...@redhat.com>
> 
> 
> It turned out that if you run 030 (or just the new test_overlapping_5
> case) sufficiently often, it breaks; which is why I’m hesitant to just
> merge the “add never_freeze” series as it is.
> 
> There are several reasons for why this test case breaks, I hope patches
> 3 to 6 fix them.  Patch 12 adds a test that is much more reliable than
> test_overlapping_5 at detecting the problems fixed by at least patches 4
> to 6.  (I think that 3 doesn’t really need a test.)
> 
> I’m sure there are other ways to see these problems, but well, coming
> from 030, concurrent commit/stream jobs are how I reproduced them.
> Hence the same of this series.
> 
> Patch 2 is for something I encountered on the way.  Patch 11 tests it.
> 
> 
> v2:
> - Added a bunch of more patches.
> 
> 
> git backport-diff against v1:
> 
> Key:
> [] : patches are identical
> [] : number of functional differences between upstream/downstream patch
> [down] : patch is downstream-only
> The flags [FC] indicate (F)unctional and (C)ontextual differences, 
> respectively
> 
> 001/12:[] [--] 'block: Add BDS.never_freeze'
> 002/12:[down] 'block/stream: Fix error path'
> 003/12:[down] 'block/stream: Swap backing file change order'
> 004/12:[down] 'block: Keep subtree drained in drop_intermediate'
> 005/12:[down] 'block: Reduce (un)drains when replacing a child'
> 006/12:[down] 'block: Deep-clear inherits_from'
> 007/12:[] [--] 'iotests: Fix throttling in 030'
> 008/12:[] [--] 'iotests: Compare error messages in 030'
> 009/12:[] [--] 'iotests: Add @use_log to VM.run_job()'
> 010/12:[] [--] 'iotests: Add new case to 030'
> 011/12:[down] 'iotests: Add read-only test case to 030'
> 012/12:[down] 'iotests: Add test for concurrent stream/commit'
> 
> 
> 
> Max Reitz (12):
>   block: Add BDS.never_freeze
>   block/stream: Fix error path
>   block/stream: Swap backing file change order
>   block: Keep subtree drained in drop_intermediate
>   block: Reduce (un)drains when replacing a child
>   block: Deep-clear inherits_from
>   iotests: Fix throttling in 030
>   iotests: Compare error messages in 030
>   iotests: Add @use_log to VM.run_job()
>   iotests: Add new case to 030
>   iotests: Add read-only test case to 030
>   iotests: Add test for concurrent stream/commit
> 
>  include/block/block_int.h |   3 +
>  block.c   |  93 +--
>  block/commit.c|   4 +
>  block/mirror.c|   4 +
>  block/stream.c|   4 +-
>  tests/qemu-iotests/030| 150 +--
>  tests/qemu-iotests/030.out|   4 +-
>  tests/qemu-iotests/258| 163 ++
>  tests/qemu-iotests/258.out|  33 +++
>  tests/qemu-iotests/group  |   1 +
>  tests/qemu-iotests/iotests.py |  18 ++--
>  11 files changed, 413 insertions(+), 64 deletions(-)
>  create mode 100755 tests/qemu-iotests/258
>  create mode 100644 tests/qemu-iotests/258.out
> 




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH] LUKS: support preallocation in qemu-img

2019-07-10 Thread Max Reitz

On 10.07.19 19:03, Maxim Levitsky wrote:
> preallocation=off and preallocation=metadata
> both allocate luks header only, and preallocation=falloc/full
> is passed to underlying file, with the given image size.
> 
> Note that the actual preallocated size is a bit smaller due
> to luks header.

Couldn’t you just preallocate it after creating the crypto header so
qcrypto_block_get_payload_offset(crypto->block) + size is the actual
file size?

> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1534951
> 
> Signed-off-by: Maxim Levitsky 
> ---
>  block/crypto.c | 28 ++--
>  1 file changed, 26 insertions(+), 2 deletions(-)

Hm.  I would expect a preallocated image to read 0.  But if you just
pass this through to the protocol layer, it won’t read 0.

(In fact, I don’t even quite see the point of having LUKS as an own
format still.  It was useful when qcow2 didn’t have LUKS support, but
now it does, so...  I suppose everyone using the LUKS format should
actually be using qcow2 with LUKS?)

Max

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 0/4] virtio: handle zoned backing devices

2019-07-10 Thread Kevin Wolf

Am 10.07.2019 um 13:33 hat Paolo Bonzini geschrieben:
> On 10/07/19 13:02, Kevin Wolf wrote:
> > Hm... Actually, file-posix implements .bdrv_check_perm and could just
> > refuse attaching a parent there if it doesn't request a specific
> > permission like BLK_PERM_SUPPORT_ZONED. That should give us the
> > whitelist semantics through existing infrastructure.
> 
> I'd like Dmitry to have something more precise to base his work on.  The
> permissions system is really complicated and I never really wrapped my
> head around it, so I need your help.
> 
> IIUC, blkconf_apply_backend_options would grow a new argument (like
> "resizable") and that argument would add BLK_PERM_SUPPORT_ZONED to the
> perm that blkconf_apply_backend_options passes to blk_set_perm.  On the
> other side raw_check_perm would say something like
> 
> if (is_zoned(s) && !(perm & BLK_PERM_SUPPORT_ZONED)) {
> error_setg();
> return -ENOTSUP;
> }
> 
> Is this correct?

Yes, I think this is how you'd best implement it.

> In addition, BLK_PERM_SUPPORT_ZONED would have to be a shared
> permission, since it's possible to assign the same block device to
> multiple scsi-block devices.  So BLK_PERM_SUPPORT_ZONED would be added
> unconditionally to shared_perm.

Right, this part shows that we're kind of abusing the permission system
here. I think unconditionally adding BLK_PERM_SUPPORT_ZONED to the set
of shared permissions could probably happen centrally in
bdrv_child_perm().

> ps: I have always thought that shared_perm is expressed the wrong way
> and should have been "denied_perm".  How hard would it be to change that
> now?

I'm not sure it would be better. It is shared_perm because that means
that the default is restrictive (error mode: operation refused, clearly
reported, easy to fix) rather than permissive (error mode: image
corruption, hard to figure out where we were too permissive). Basically,
whitelist instead of blacklist, once again.

But if we did indeed decide to change it, the only way to find out how
hard it is would be doing it. I suspect not too hard in principle, but
making sure that we converted all callers and don't introduce wrong
callers later through (silent) merge conflicts is the more concerning
part.

Kevin

Re: [Qemu-block] [PATCH] nvme: Set number of queues later in nvme_init()

2019-07-10 Thread Max Reitz

On 10.07.19 16:57, Michal Privoznik wrote:
> When creating the admin queue in nvme_init() the variable that
> holds the number of queues created is modified before actual
> queue creation. This is a problem because if creating the queue
> fails then the variable is left in inconsistent state. This was
> actually observed when I tried to hotplug a nvme disk. The
> control got to nvme_file_open() which called nvme_init() which
> failed and thus nvme_close() was called which in turn called
> nvme_free_queue_pair() with queue being NULL. This lead to an
> instant crash:
> 
>   #0  0x55d9507ec211 in nvme_free_queue_pair (bs=0x55d952ddb880, q=0x0) 
> at block/nvme.c:164
>   #1  0x55d9507ee180 in nvme_close (bs=0x55d952ddb880) at block/nvme.c:729
>   #2  0x55d9507ee3d5 in nvme_file_open (bs=0x55d952ddb880, 
> options=0x55d952bb1410, flags=147456, errp=0x7ffd8e19e200) at block/nvme.c:781
>   #3  0x55d9507629f3 in bdrv_open_driver (bs=0x55d952ddb880, 
> drv=0x55d95109c1e0 , node_name=0x0, options=0x55d952bb1410, 
> open_flags=147456, errp=0x7ffd8e19e310) at block.c:1291
>   #4  0x55d9507633d6 in bdrv_open_common (bs=0x55d952ddb880, file=0x0, 
> options=0x55d952bb1410, errp=0x7ffd8e19e310) at block.c:1551
>   #5  0x55d950766881 in bdrv_open_inherit (filename=0x0, reference=0x0, 
> options=0x55d952bb1410, flags=32768, parent=0x55d9538ce420, 
> child_role=0x55d950eaade0 , errp=0x7ffd8e19e510) at block.c:3063
>   #6  0x55d950765ae4 in bdrv_open_child_bs (filename=0x0, 
> options=0x55d9541cdff0, bdref_key=0x55d950af33aa "file", 
> parent=0x55d9538ce420, child_role=0x55d950eaade0 , 
> allow_none=true, errp=0x7ffd8e19e510) at block.c:2712
>   #7  0x55d950766633 in bdrv_open_inherit (filename=0x0, reference=0x0, 
> options=0x55d9541cdff0, flags=0, parent=0x0, child_role=0x0, 
> errp=0x7ffd8e19e908) at block.c:3011
>   #8  0x55d950766dba in bdrv_open (filename=0x0, reference=0x0, 
> options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at block.c:3156
>   #9  0x55d9507cb635 in blk_new_open (filename=0x0, reference=0x0, 
> options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at 
> block/block-backend.c:389
>   #10 0x55d950465ec5 in blockdev_init (file=0x0, bs_opts=0x55d953d00390, 
> errp=0x7ffd8e19e908) at blockdev.c:602
> 
> Signed-off-by: Michal Privoznik 
> ---
>  block/nvme.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks, applied to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 6/8] block/backup: issue progress updates for skipped regions

2019-07-10 Thread Max Reitz

On 10.07.19 22:47, John Snow wrote:
> 
> 
> On 7/10/19 4:30 PM, Max Reitz wrote:
>> On 10.07.19 20:20, John Snow wrote:
>>>
>>>
>>> On 7/10/19 12:36 PM, Max Reitz wrote:
 On 10.07.19 03:05, John Snow wrote:
> The way bitmap backups work is by starting at 75% if it needs
> to copy just 25% of the disk.

 Although there is this comment:

> /* TODO job_progress_set_remaining() would make more sense */

 So...

> The way sync=top currently works, however, is to start at 0% and then
> never update the progress if it doesn't copy a region. If it needs to
> copy 25% of the disk, we'll finish at 25%.
>
> Update the progress when we skip regions.

 Wouldn’t it be more correct to decrease the job length?

 Max

>>>
>>> Admittedly I have precisely no idea. Maybe? As far as I understand it,
>>> we guarantee only:
>>>
>>> (1) Progress monotonically increases
>>> (2) Upon completion, progress will equal the total work estimate.
>>> [Trying to fix that to be true here.]
>>>
>>> This means we can do stuff like:
>>>
>>> - Total work estimate can increase or decrease arbitrarily
>>> - Neither value has to mean anything in particular
>>>
>>>
>>> Bitmap sync works by artificially increasing progress for NOP regions,
>>> seen in init_copy_bitmap.
>>
>> Yes, and it has a TODO comment that says it should be done differently.
>>
>>> Full sync also tends to increase progress regardless of it actually did
>>> a copy or not; offload support also counts as progress here. So if you
>>> full sync an empty image, you'll see it increasing the progress as it
>>> doesn't actually do anything.
>>>
>>> Top sync is the odd one out, which just omits progress for regions it skips.
>>>
>>> My only motivation here was to make them consistent. Can I do it the
>>> other way? Yeah, probably. Is one way better than the other? I
>>> legitimately have no idea. I guess whoever wrote the last comment felt
>>> that it should all be the other way instead. Why'd they not do that?
>>
>> If you look at the commit (05df8a6a2b4), I suppose it was because that
>> commit simply did not intend to change behavior.  It just touched that
>> piece of code and noted that maybe there should be a follow-up commit to
>> change it.
>>
>> But yeah, whatever.
>>
>> Reviewed-by: Max Reitz 
>>
> 
> I mean. I'll make it consistent either way, but I actually don't know
> which way I should make it go. I just think that all the modes should
> work the same if we can help it.
> 
> Flip a coin?

If you’d flip a coin, I can say that I’d find it a bit more meaningful
to reduce the length.  Especially if you change sync=top to calculate
beforehand how much we need to copy, so the length isn’t even reduced
while the job is running.

And it’s easier to remove the TODO comment this way.  If we decide
against it, you’d have to remove the TODO comment aso (because we
decided against it), but you’d need to justify it in the commit message.
 And we all know that writing explanations and documentation is the
hardest thing of all.

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 8/8] iotests/257: test traditional sync modes

2019-07-10 Thread Max Reitz

On 10.07.19 21:00, John Snow wrote:
> 
> 
> On 7/10/19 1:14 PM, Max Reitz wrote:
>> On 10.07.19 03:05, John Snow wrote:
>>> Signed-off-by: John Snow 
>>> ---
>>>  tests/qemu-iotests/257 |   31 +
>>>  tests/qemu-iotests/257.out | 3089 
>>>  2 files changed, 3120 insertions(+)
>>
>> Oof.
>>
> 
> Yeah, it's... a lot of test output. We probably shouldn't count the
> reference test output against any kind of SLOC metrics.
> 
>>> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
>>> index de8707cb19..8de1c4da19 100755
>>> --- a/tests/qemu-iotests/257
>>> +++ b/tests/qemu-iotests/257
>>
>> [...]
>>
>>> @@ -410,6 +416,11 @@ def test_bitmap_sync(bsync_mode, msync_mode='bitmap', 
>>> failure=None):
>>>  if bsync_mode == 'always' and failure == 'intermediate':
>>>  # We manage to copy one sector (one bit) before the error.
>>>  ebitmap.clear_bit(ebitmap.first_bit)
>>> +if msync_mode in ('full', 'top'):
>>> +# These modes return all bits set except what was 
>>> copied/skipped
>>
>> Hm.  How useful is bitmap support for 'top' then, anyway?  That means
>> that if you want to resume a top backup, you always have to resume it
>> like it was a full backup.  Which sounds kind of useless.
>>
>> Max
>>
> 
> Good point!
> 
> I think this can be fixed by doing an initialization pass of the
> copy_bitmap when sync=top to set only the allocated regions in the bitmap.
> 
> This means that the write notifier won't copy out regions that are
> written to that weren't already in the top layer. I believe this is
> actually a bugfix; the data we'd copy out in such cases is actually in
> the backing layer and shouldn't be copied with sync=top.

Now that you mention it...  I didn’t realize that.  Yes, you’re right.

> So this would have two effects:
> (1) sync=top gets a little more judicious about what it copies out on
> sync=top, and
> (2) the bitmap return value is more meaningful again.
> 
> This doesn't touch sync=none at all, which needs more invasive fixes if
> we wanted it to have useful bitmap return values (it needs to
> differentiate the idea between must-copy and can-copy, and I still don't
> know if this is worthwhile to do, so until I hear otherwise, I'm not gonna.)

No, I’m with you on that one.

Max

>>> +fail_bit = ebitmap.first_bit
>>> +ebitmap.clear()
>>> +ebitmap.dirty_bits(range(fail_bit, SIZE // GRANULARITY))
>>>  ebitmap.compare(get_bitmap(bitmaps, drive0.device, 'bitmap0'))
>>>  
>>>  # 2 - Writes and Reference Backup
>> [...]
>>




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 6/8] block/backup: issue progress updates for skipped regions

2019-07-10 Thread John Snow




On 7/10/19 4:30 PM, Max Reitz wrote:
> On 10.07.19 20:20, John Snow wrote:
>>
>>
>> On 7/10/19 12:36 PM, Max Reitz wrote:
>>> On 10.07.19 03:05, John Snow wrote:
 The way bitmap backups work is by starting at 75% if it needs
 to copy just 25% of the disk.
>>>
>>> Although there is this comment:
>>>
 /* TODO job_progress_set_remaining() would make more sense */
>>>
>>> So...
>>>
 The way sync=top currently works, however, is to start at 0% and then
 never update the progress if it doesn't copy a region. If it needs to
 copy 25% of the disk, we'll finish at 25%.

 Update the progress when we skip regions.
>>>
>>> Wouldn’t it be more correct to decrease the job length?
>>>
>>> Max
>>>
>>
>> Admittedly I have precisely no idea. Maybe? As far as I understand it,
>> we guarantee only:
>>
>> (1) Progress monotonically increases
>> (2) Upon completion, progress will equal the total work estimate.
>> [Trying to fix that to be true here.]
>>
>> This means we can do stuff like:
>>
>> - Total work estimate can increase or decrease arbitrarily
>> - Neither value has to mean anything in particular
>>
>>
>> Bitmap sync works by artificially increasing progress for NOP regions,
>> seen in init_copy_bitmap.
> 
> Yes, and it has a TODO comment that says it should be done differently.
> 
>> Full sync also tends to increase progress regardless of it actually did
>> a copy or not; offload support also counts as progress here. So if you
>> full sync an empty image, you'll see it increasing the progress as it
>> doesn't actually do anything.
>>
>> Top sync is the odd one out, which just omits progress for regions it skips.
>>
>> My only motivation here was to make them consistent. Can I do it the
>> other way? Yeah, probably. Is one way better than the other? I
>> legitimately have no idea. I guess whoever wrote the last comment felt
>> that it should all be the other way instead. Why'd they not do that?
> 
> If you look at the commit (05df8a6a2b4), I suppose it was because that
> commit simply did not intend to change behavior.  It just touched that
> piece of code and noted that maybe there should be a follow-up commit to
> change it.
> 
> But yeah, whatever.
> 
> Reviewed-by: Max Reitz 
> 

I mean. I'll make it consistent either way, but I actually don't know
which way I should make it go. I just think that all the modes should
work the same if we can help it.

Flip a coin?

--js

Re: [Qemu-block] [PATCH 7/8] block/backup: support bitmap sync modes for non-bitmap backups

2019-07-10 Thread Max Reitz

On 10.07.19 20:32, John Snow wrote:
> 
> 
> On 7/10/19 12:48 PM, Max Reitz wrote:
>> On 10.07.19 03:05, John Snow wrote:
>>> Accept bitmaps and sync policies for the other backup modes.
>>> This allows us to do things like create a bitmap synced to a full backup
>>> without a transaction, or start a resumable backup process.
>>>
>>> Some combinations don't make sense, though:
>>>
>>> - NEVER policy combined with any non-BITMAP mode doesn't do anything,
>>>   because the bitmap isn't used for input or output.
>>>   It's harmless, but is almost certainly never what the user wanted.
>>>
>>> - sync=NONE is more questionable. It can't use on-success because this
>>>   job never completes with success anyway, and the resulting artifact
>>>   of 'always' is suspect: because we start with a full bitmap and only
>>>   copy out segments that get written to, the final output bitmap will
>>>   always be ... a fully set bitmap.
>>>
>>>   Maybe there's contexts in which bitmaps make sense for sync=none,
>>>   but not without more severe changes to the current job, and omitting
>>>   it here doesn't prevent us from adding it later.
>>>
>>> Signed-off-by: John Snow 
>>> ---
>>>  block/backup.c   |  8 +---
>>>  blockdev.c   | 22 ++
>>>  qapi/block-core.json |  6 --
>>>  3 files changed, 27 insertions(+), 9 deletions(-)
>>
>> [...]
>>
>>> diff --git a/blockdev.c b/blockdev.c
>>> index f0b7da53b0..bc152f8e0d 100644
>>> --- a/blockdev.c
>>> +++ b/blockdev.c
>>
>> [...]
>>
>>> +if (!backup->has_bitmap && backup->has_bitmap_mode) {
>>> +error_setg(errp, "Cannot specify Bitmap sync mode without a 
>>> bitmap");
>>
>> Any reason for capitalizing the first “Bitmap”?
>>
>> With a reason or it fixed:
>>
>> Reviewed-by: Max Reitz 
>>
> 
> Hanging around Germans too much?

You should know then that the korrekt way to write it would be:

„Specifying Binarydigitmapsynchronizationmode without a Binarydigitmap
is absolutely verboten!“

> Actually, I can explain why: because a "bitmap" is a generic term, but
> whenever I capitalize it as "Bitmap" I am referring to a Block Dirty
> Bitmap which is a specific sort of thing. I do this unconsciously.
> 
> But in that case, I should actually be consistent in the interface (and
> docstrings and docs and error strings) and always call it that specific
> thing, which I don't.
> 
> "bitmap" is probably more correct for now, but I ought to go through all
> the interface and make it consistent in some way or another.
> 
> 
> (Actually: I'd like to see if I can't rename the "BdrvDirtyBitmap"
> object to something more generic and shorter so I can rename many of the
> functions we have something shorter.

Well, BDB is free.  That path has worked fine for BlockDriverStates.

Or what I said on IRC, but you know.

> Because the structure is "BdrvDirtyBitmap", there's some confusion when
> we name functions like bdrv_dirty_bitmap_{verb} because it's not clear
> if this is a bdrv function that does something with dirty bitmaps, or if
> it's a "BdrvDirtyBitmap" function that does something with that object.
> 
> I guess that seems like a subtle point, but it's why the naming
> conventions in dirty-bitmap.c are all over the place. I think at one
> point, the idea was that:
> 
> bdrv_{verb}_dirty_bitmap was an action applied to a BDS that happened to
> do something with dirty bitmaps. bdrv_dirty_bitmap_{verb} was an action
> that applied to a "BdrvDirtyBitmap". Crystal clear and not confusing at
> all, right?

I just thought people named their functions whatever they felt like at
the time.

> It'd be nice to have functions that operate on a node be named
> bdrv_dbitmap_foo() and functions that operate on the bitmap structure
> itself named just dbitmap_bar().
> 
> Would it be okay if I named them such a thing, I wonder?
> 
> we have "bitmap" and "hbitmap" already, I could do something like
> "dbitmap" for "dirty bitmap" or some such. Kind of an arbitrary change I
> admit, but I'm itching to do a big spring cleaning in dirty-bitmap.c
> right after this series is done.)

HBitmaps are generally used to track dirty areas, so I’d find this a
misnomer.  BDBitmap would be OK.  The “block” part should be in there
somewhere.

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 6/8] block/backup: issue progress updates for skipped regions

2019-07-10 Thread Max Reitz

On 10.07.19 20:20, John Snow wrote:
> 
> 
> On 7/10/19 12:36 PM, Max Reitz wrote:
>> On 10.07.19 03:05, John Snow wrote:
>>> The way bitmap backups work is by starting at 75% if it needs
>>> to copy just 25% of the disk.
>>
>> Although there is this comment:
>>
>>> /* TODO job_progress_set_remaining() would make more sense */
>>
>> So...
>>
>>> The way sync=top currently works, however, is to start at 0% and then
>>> never update the progress if it doesn't copy a region. If it needs to
>>> copy 25% of the disk, we'll finish at 25%.
>>>
>>> Update the progress when we skip regions.
>>
>> Wouldn’t it be more correct to decrease the job length?
>>
>> Max
>>
> 
> Admittedly I have precisely no idea. Maybe? As far as I understand it,
> we guarantee only:
> 
> (1) Progress monotonically increases
> (2) Upon completion, progress will equal the total work estimate.
> [Trying to fix that to be true here.]
> 
> This means we can do stuff like:
> 
> - Total work estimate can increase or decrease arbitrarily
> - Neither value has to mean anything in particular
> 
> 
> Bitmap sync works by artificially increasing progress for NOP regions,
> seen in init_copy_bitmap.

Yes, and it has a TODO comment that says it should be done differently.

> Full sync also tends to increase progress regardless of it actually did
> a copy or not; offload support also counts as progress here. So if you
> full sync an empty image, you'll see it increasing the progress as it
> doesn't actually do anything.
> 
> Top sync is the odd one out, which just omits progress for regions it skips.
> 
> My only motivation here was to make them consistent. Can I do it the
> other way? Yeah, probably. Is one way better than the other? I
> legitimately have no idea. I guess whoever wrote the last comment felt
> that it should all be the other way instead. Why'd they not do that?

If you look at the commit (05df8a6a2b4), I suppose it was because that
commit simply did not intend to change behavior.  It just touched that
piece of code and noted that maybe there should be a follow-up commit to
change it.

But yeah, whatever.

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 4/8] block/backup: hoist bitmap check into QMP interface

2019-07-10 Thread Max Reitz

On 10.07.19 19:57, John Snow wrote:
> 
> 
> On 7/10/19 12:11 PM, Max Reitz wrote:
>> On 10.07.19 03:05, John Snow wrote:
>>> This is nicer to do in the unified QMP interface that we have now,
>>> because it lets us use the right terminology back at the user.
>>>
>>> Signed-off-by: John Snow 
>>> ---
>>>  block/backup.c | 13 -
>>>  blockdev.c | 10 ++
>>>  2 files changed, 14 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/block/backup.c b/block/backup.c
>>> index e2729cf6fa..a64b768e24 100644
>>> --- a/block/backup.c
>>> +++ b/block/backup.c
>>> @@ -566,6 +566,10 @@ BlockJob *backup_job_create(const char *job_id, 
>>> BlockDriverState *bs,
>>>  assert(bs);
>>>  assert(target);
>>>  
>>> +/* QMP interface protects us from these cases */
>>> +assert(sync_mode != MIRROR_SYNC_MODE_INCREMENTAL);
>>> +assert(sync_bitmap || sync_mode != MIRROR_SYNC_MODE_BITMAP);
>>
>> Implication would be a nice operator sometimes.
>>
>> ("assert(sync_mode == MIRROR_SYNC_MODE_BITMAP -> sync_bitmap)")
>>
>> (Can you do that in C++?  No, you can’t overload bool’s operators, right?)
>>
>> Reviewed-by: Max Reitz 
>>
> 
> Yes, I also find this assertion kind of hard to read personally, but it
> feels somewhat clunky to write:
> 
> if (antecedent) {
> assert(condition);
> }
> 
> I suppose we can also phrase this as:
> 
> assert(sync_mode == MIRROR_SYNC_MODE_BITMAP ? sync_bitmap : true);
> 
> Which might honestly be pretty good. Mind if I change it to this?

Looks weird (mostly unfamiliar), but I do not.

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 3/8] iotests/257: Refactor backup helpers

2019-07-10 Thread Max Reitz

On 10.07.19 19:52, John Snow wrote:
> 
> 
> On 7/10/19 12:04 PM, Max Reitz wrote:
>> On 10.07.19 03:05, John Snow wrote:
>>> This test needs support for non-bitmap backups and missing or
>>> unspecified bitmap sync modes, so rewrite the helpers to be a little
>>> more generic.
>>>
>>> Signed-off-by: John Snow 
>>> ---
>>>  tests/qemu-iotests/257 |  46 +
>>>  tests/qemu-iotests/257.out | 192 ++---
>>>  2 files changed, 124 insertions(+), 114 deletions(-)
>>>
>>> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
>>> index 2ff4aa8695..2eb4f26c28 100755
>>> --- a/tests/qemu-iotests/257
>>> +++ b/tests/qemu-iotests/257
>>
>> [...]
>>
>>> -def bitmap_backup(drive, n, filepath, bitmap, bitmap_mode):
>>> -log("--- Bitmap Backup #{:d} ---\n".format(n))
>>> -target_id = "bitmap_target_{:d}".format(n)
>>> -job_id = "bitmap_backup_{:d}".format(n)
>>> +def backup(drive, n, filepath, bitmap, bitmap_mode, sync='bitmap'):
>>> +log("--- Test Backup #{:d} ---\n".format(n))
>>> +target_id = "backup_target_{:d}".format(n)
>>> +job_id = "backup_{:d}".format(n)
>>>  target_drive = Drive(filepath, vm=drive.vm)
>>>  
>>>  target_drive.create_target(target_id, drive.fmt, drive.size)
>>> -drive.vm.qmp_log("blockdev-backup", job_id=job_id, device=drive.name,
>>> - target=target_id, sync="bitmap",
>>> - bitmap_mode=bitmap_mode,
>>> - bitmap=bitmap,
>>> - auto_finalize=False)
>>> +
>>> +kwargs = {
>>> +'job_id': job_id,
>>> +'auto_finalize': False,
>>> +'bitmap': bitmap,
>>> +'bitmap_mode': bitmap_mode,
>>> +}
>>> +kwargs = {key: val for key, val in kwargs.items() if val is not None}
>>
>> I suppose this is to remove items that are None?
>>
>> Very cute, but why not just
>>
>>   kwargs = {
>> 'job_id': job_id,
>> 'auto_finalize': False,
>>   }
>>   if bitmap is not None:
>> kwargs['bitmap'] = bitmap
>> kwargs['bitmap_mode'] = bitmap_mode
>>
>> Exactly the same number of lines, but immediately makes it clear what’s
>> going on.  Not as cute, I admit.
>>
>> (Yes, I am indeed actively trying to train you not to write cute code.)
>>
> 
> It sneaks in. I genuinely struggle with understanding what other people
> will find readable; I have an authentically hard time reviewing other
> people's patches too. I'm earnestly not sure how I can help improve
> this, but I would like to.
> 
> I wasn't sure what the easiest way to avoid sending the "None" over the
> wire was, so I went with a general thing, but yes: it's because bitmap
> and bitmap_mode are set to None sometimes and I need to omit such keys.
> 
> In this case, though, I do test bitmap and bitmap_mode separately, so
> for the purposes of testing intentionally bad combinations you do need:
> 
> if bitmap is not None:
> kwargs['bitmap'] = bitmap
> if bitmap_mode is not None:
> kwargs['bitmap_mode'] = bitmap_mode
> 
> And I just looked at this and it did not spark joy; so I went with a
> generic filter to remove nulled keys. I admit it's /slightly/ cute and
> not immediately obvious why it needs to be done.
> 
> 
> This is even cuter, so maybe I am traveling in the wrong direction:
> 
> def backup(drive, n, filepath, sync, **kwargs):
> log("--- Test Backup #{:d} ---\n".format(n))
> target_id = "backup_target_{:d}".format(n)
> job_id = "backup_{:d}".format(n)
> target_drive = Drive(filepath, vm=drive.vm)
> 
> target_drive.create_target(target_id, drive.fmt, drive.size)
> kwargs.setdefault('auto_finalize', False)
> # Strip any arguments explicitly nulled by the caller:
> kwargs = {key: val for key, val in kwargs.items()
>   if val is not None}
> blockdev_backup(drive.vm, drive.name, target_id, sync, **kwargs)
> return job_id
> 
> It's quite a bit shorter and also makes backup() more flexible by
> omitting the bitmap and bitmap_mode arguments entirely, allowing the
> caller to override the auto_finalize default, etc. In this permutation,
> we don't know the full extent of kwargs so it makes sense to generically
> filter it.
> 
> Manually conditionally setting arguments is probably also fine.
> Do you still have a preference for the more static approach?

It’s OK with the comment.
Although it needs a kwargs.setdefault('job_id', job_id), too.

(Hm.  Shouldn’t 'auto-finalize' and 'job-id' work just fine?  It’s not
like the QMP scripts swap - and _.  They just replace _  by -.)

Max

>> The rest looks good to me:
>>
>> Reviewed-by: Max Reitz 
>>
> 
> Thanks for reviewing, as always!
> 




signature.asc
Description: OpenPGP digital signature

[Qemu-block] [PULL 2/3] sphinx: add qmp_lexer

2019-07-10 Thread John Snow

Sphinx, through Pygments, does not like annotated json examples very
much. In some versions of Sphinx (1.7), it will render the non-json
portions of code blocks in red, but in newer versions (2.0) it will
throw an exception and not highlight the block at all. Though we can
suppress this warning, it doesn't bring back highlighting on non-strict
json blocks.

We can alleviate this by creating a custom lexer for QMP examples that
allows us to properly highlight these examples in a robust way, keeping
our directionality and elision notations.

Signed-off-by: Eduardo Habkost 
Signed-off-by: John Snow 
Reported-by: Aarushi Mehta 
Reviewed-by: Peter Maydell 
Message-id: 20190603214653.29369-3-js...@redhat.com
Signed-off-by: John Snow 
---
 docs/conf.py |  4 ++--
 docs/sphinx/qmp_lexer.py | 43 
 2 files changed, 45 insertions(+), 2 deletions(-)
 create mode 100644 docs/sphinx/qmp_lexer.py

diff --git a/docs/conf.py b/docs/conf.py
index befbcc6c3e..e46b299b71 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -41,7 +41,7 @@ except NameError:
 # add these directories to sys.path here. If the directory is relative to the
 # documentation root, use an absolute path starting from qemu_docdir.
 #
-# sys.path.insert(0, os.path.join(qemu_docdir, "my_subdir"))
+sys.path.insert(0, os.path.join(qemu_docdir, "sphinx"))
 
 
 # -- General configuration 
@@ -54,7 +54,7 @@ needs_sphinx = '1.3'
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
-extensions = []
+extensions = ['qmp_lexer']
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
diff --git a/docs/sphinx/qmp_lexer.py b/docs/sphinx/qmp_lexer.py
new file mode 100644
index 00..f7e4c0e198
--- /dev/null
+++ b/docs/sphinx/qmp_lexer.py
@@ -0,0 +1,43 @@
+# QEMU Monitor Protocol Lexer Extension
+#
+# Copyright (C) 2019, Red Hat Inc.
+#
+# Authors:
+#  Eduardo Habkost 
+#  John Snow 
+#
+# This work is licensed under the terms of the GNU GPLv2 or later.
+# See the COPYING file in the top-level directory.
+"""qmp_lexer is a Sphinx extension that provides a QMP lexer for code 
blocks."""
+
+from pygments.lexer import RegexLexer, DelegatingLexer
+from pygments.lexers.data import JsonLexer
+from pygments import token
+from sphinx import errors
+
+class QMPExampleMarkersLexer(RegexLexer):
+"""
+QMPExampleMarkersLexer lexes QMP example annotations.
+This lexer adds support for directionality flow and elision indicators.
+"""
+tokens = {
+'root': [
+(r'-> ', token.Generic.Prompt),
+(r'<- ', token.Generic.Prompt),
+(r' ?\.{3} ?', token.Generic.Prompt),
+]
+}
+
+class QMPExampleLexer(DelegatingLexer):
+"""QMPExampleLexer lexes annotated QMP examples."""
+def __init__(self, **options):
+super(QMPExampleLexer, self).__init__(JsonLexer, 
QMPExampleMarkersLexer,
+  token.Error, **options)
+
+def setup(sphinx):
+"""For use by the Sphinx extensions API."""
+try:
+sphinx.require_sphinx('2.1')
+sphinx.add_lexer('QMP', QMPExampleLexer)
+except errors.VersionRequirementError:
+sphinx.add_lexer('QMP', QMPExampleLexer())
-- 
2.21.0

[Qemu-block] [PULL 1/3] docs/interop/bitmaps.rst: Fix typos

2019-07-10 Thread John Snow

Pygments and Sphinx get pickier all the time; Sphinx 2.1+ now catches
these errors.

Signed-off-by: John Snow 
Reported-by: Aarushi Mehta 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Message-id: 20190603214653.29369-2-js...@redhat.com
Signed-off-by: John Snow 
---
 docs/interop/bitmaps.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/interop/bitmaps.rst b/docs/interop/bitmaps.rst
index 510e8809a9..c29ac4a854 100644
--- a/docs/interop/bitmaps.rst
+++ b/docs/interop/bitmaps.rst
@@ -399,7 +399,7 @@ in any one source bitmap, the target bitmap will mark that 
segment dirty.
"arguments": {
  "node": "drive0",
  "target": "new_bitmap",
- "bitmaps: [ "bitmap0" ]
+ "bitmaps": [ "bitmap0" ]
}
  }
 
@@ -1437,7 +1437,7 @@ applied:
.. code:: json
 
 <- {
- "timestamp": {...}
+ "timestamp": {...},
  "data": {
"device": "drive0",
"type": "backup",
-- 
2.21.0

[Qemu-block] [PULL 0/3] Bitmaps patches

2019-07-10 Thread John Snow

The following changes since commit 6df2cdf44a82426f7a59dcb03f0dd2181ed7fdfa:

  Update version for v4.1.0-rc0 release (2019-07-09 17:21:53 +0100)

are available in the Git repository at:

  https://github.com/jnsnow/qemu.git tags/bitmaps-pull-request

for you to fetch changes up to a7786bfb0effe0b4b0fc61d8a8cd307c0b739ed7:

  docs/bitmaps: use QMP lexer instead of json (2019-07-10 15:08:07 -0400)


Pull request:
  This is a build fix.



John Snow (3):
  docs/interop/bitmaps.rst: Fix typos
  sphinx: add qmp_lexer
  docs/bitmaps: use QMP lexer instead of json

 docs/conf.py |  4 +--
 docs/interop/bitmaps.rst | 58 
 docs/sphinx/qmp_lexer.py | 43 +
 3 files changed, 74 insertions(+), 31 deletions(-)
 create mode 100644 docs/sphinx/qmp_lexer.py

-- 
2.21.0

[Qemu-block] [PULL 3/3] docs/bitmaps: use QMP lexer instead of json

2019-07-10 Thread John Snow

The annotated style json we use in QMP documentation is not strict json
and depending on the version of Sphinx (2.0+) or Pygments installed,
might cause the build to fail.

Use the new QMP lexer.

Further, some versions of Sphinx can not apply custom lexers to "code"
directives and require the use of "code-block" directives instead, so
make that change at this time as well.

Tested under:
- Sphinx 1.3.6 and Pygments 2.4
- Sphinx 1.7.6 and Pygments 2.2 (Fedora 29 packages)
- Sphinx 2.0.1 and Pygments 2.4
- Sphinx 3.0.0+/f396b3a783 and Pygments 2.4 (From Sphinx git c4f44bdd)

Reported-by: Aarushi Mehta 
Signed-off-by: John Snow 
Reviewed-by: Eduardo Habkost 
Message-id: 20190603214653.29369-4-js...@redhat.com
Signed-off-by: John Snow 
---
 docs/interop/bitmaps.rst | 54 
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/docs/interop/bitmaps.rst b/docs/interop/bitmaps.rst
index c29ac4a854..c20bd37a79 100644
--- a/docs/interop/bitmaps.rst
+++ b/docs/interop/bitmaps.rst
@@ -199,7 +199,7 @@ persistence, and recording state can be adjusted at 
creation time.
 
  to create a new, actively recording persistent bitmap:
 
- .. code:: json
+ .. code-block:: QMP
 
   -> { "execute": "block-dirty-bitmap-add",
"arguments": {
@@ -220,7 +220,7 @@ persistence, and recording state can be adjusted at 
creation time.
  To create a new, disabled (``-recording``), transient bitmap that tracks
  changes in 32KiB segments:
 
- .. code:: json
+ .. code-block:: QMP
 
   -> { "execute": "block-dirty-bitmap-add",
"arguments": {
@@ -254,7 +254,7 @@ Deletes a bitmap. Bitmaps that are ``+busy`` cannot be 
removed.
 
  Remove a bitmap named ``bitmap0`` from node ``drive0``:
 
- .. code:: json
+ .. code-block:: QMP
 
   -> { "execute": "block-dirty-bitmap-remove",
"arguments": {
@@ -280,7 +280,7 @@ Clears all dirty bits from a bitmap. ``+busy`` bitmaps 
cannot be cleared.
 
  Clear all dirty bits from bitmap ``bitmap0`` on node ``drive0``:
 
- .. code:: json
+ .. code-block:: QMP
 
   -> { "execute": "block-dirty-bitmap-clear",
"arguments": {
@@ -309,7 +309,7 @@ begin being recorded. ``+busy`` bitmaps cannot be enabled.
 
  To set ``+recording`` on bitmap ``bitmap0`` on node ``drive0``:
 
- .. code:: json
+ .. code-block:: QMP
 
   -> { "execute": "block-dirty-bitmap-enable",
"arguments": {
@@ -347,7 +347,7 @@ writes to begin being ignored. ``+busy`` bitmaps cannot be 
disabled.
 
  To set ``-recording`` on bitmap ``bitmap0`` on node ``drive0``:
 
- .. code:: json
+ .. code-block:: QMP
 
   -> { "execute": "block-dirty-bitmap-disable",
"arguments": {
@@ -393,7 +393,7 @@ in any one source bitmap, the target bitmap will mark that 
segment dirty.
  ``drive0``. If ``new_bitmap`` was empty prior to this command, this achieves
  a copy.
 
- .. code:: json
+ .. code-block:: QMP
 
   -> { "execute": "block-dirty-bitmap-merge",
"arguments": {
@@ -424,7 +424,7 @@ attached to nodes serving as the root for guest devices.
  API. This result highlights a bitmap ``bitmap0`` attached to the root node of
  device ``drive0``.
 
- .. code:: json
+ .. code-block:: QMP
 
   -> {
"execute": "query-block",
@@ -562,7 +562,7 @@ new, empty bitmap that records writes from this point in 
time forward.
   destination. These writes will be recorded in the bitmap
   accordingly.
 
-.. code:: json
+.. code-block:: QMP
 
   -> {
"execute": "transaction",
@@ -650,7 +650,7 @@ Example: Resetting an Incremental Backup Anchor Point
 If we want to start a new backup chain with an existing bitmap, we can also
 use a transaction to reset the bitmap while making a new full backup:
 
-.. code:: json
+.. code-block:: QMP
 
   -> {
"execute": "transaction",
@@ -730,7 +730,7 @@ Example: First Incremental Backup
 
 #. Issue an incremental backup command:
 
-   .. code:: json
+   .. code-block:: QMP
 
 -> {
  "execute": "drive-backup",
@@ -788,7 +788,7 @@ Example: Second Incremental Backup
 #. Issue a new incremental backup command. The only difference here is that we
have changed the target image below.
 
-   .. code:: json
+   .. code-block:: QMP
 
 -> {
  "execute": "drive-backup",
@@ -869,7 +869,7 @@ image:
 #. Issue a new incremental backup command. Apart from the new destination
image, there is no difference from the last two examples.
 
-   .. code:: json
+   .. code-block:: QMP
 
 -> {
  "execute": "drive-backup",
@@ -932,7 +932,7 @@ point in time.
 
 #. Create a full (anchor) backup for each drive, with accompanying bitmaps:
 
-   .. code:: json
+   .. code-block:: QMP
 
 -> {
  "execute": "transaction",
@@ -1018,7 +1018,7 @@ point in time.
 
 #. Issue a multi-drive incremental push backup transaction:
 
-   .. code:: json
+   .. code-block:: QMP
 
 -> {
  "execute": "transaction",
@@ -1121,7 +1121,7 @@ described above. This example demonstrates the single-job 
fa

Re: [Qemu-block] [PATCH 8/8] iotests/257: test traditional sync modes

2019-07-10 Thread John Snow

On 7/10/19 1:14 PM, Max Reitz wrote:
> On 10.07.19 03:05, John Snow wrote:
>> Signed-off-by: John Snow 
>> ---
>>  tests/qemu-iotests/257 |   31 +
>>  tests/qemu-iotests/257.out | 3089 
>>  2 files changed, 3120 insertions(+)
> 
> Oof.
> 

Yeah, it's... a lot of test output. We probably shouldn't count the
reference test output against any kind of SLOC metrics.

>> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
>> index de8707cb19..8de1c4da19 100755
>> --- a/tests/qemu-iotests/257
>> +++ b/tests/qemu-iotests/257
> 
> [...]
> 
>> @@ -410,6 +416,11 @@ def test_bitmap_sync(bsync_mode, msync_mode='bitmap', 
>> failure=None):
>>  if bsync_mode == 'always' and failure == 'intermediate':
>>  # We manage to copy one sector (one bit) before the error.
>>  ebitmap.clear_bit(ebitmap.first_bit)
>> +if msync_mode in ('full', 'top'):
>> +# These modes return all bits set except what was 
>> copied/skipped
> 
> Hm.  How useful is bitmap support for 'top' then, anyway?  That means
> that if you want to resume a top backup, you always have to resume it
> like it was a full backup.  Which sounds kind of useless.
> 
> Max
> 

Good point!

I think this can be fixed by doing an initialization pass of the
copy_bitmap when sync=top to set only the allocated regions in the bitmap.

This means that the write notifier won't copy out regions that are
written to that weren't already in the top layer. I believe this is
actually a bugfix; the data we'd copy out in such cases is actually in
the backing layer and shouldn't be copied with sync=top.

So this would have two effects:
(1) sync=top gets a little more judicious about what it copies out on
sync=top, and
(2) the bitmap return value is more meaningful again.

This doesn't touch sync=none at all, which needs more invasive fixes if
we wanted it to have useful bitmap return values (it needs to
differentiate the idea between must-copy and can-copy, and I still don't
know if this is worthwhile to do, so until I hear otherwise, I'm not gonna.)

>> +fail_bit = ebitmap.first_bit
>> +ebitmap.clear()
>> +ebitmap.dirty_bits(range(fail_bit, SIZE // GRANULARITY))
>>  ebitmap.compare(get_bitmap(bitmaps, drive0.device, 'bitmap0'))
>>  
>>  # 2 - Writes and Reference Backup
> [...]
>

Re: [Qemu-block] [PATCH 7/8] block/backup: support bitmap sync modes for non-bitmap backups

2019-07-10 Thread John Snow

On 7/10/19 12:48 PM, Max Reitz wrote:
> On 10.07.19 03:05, John Snow wrote:
>> Accept bitmaps and sync policies for the other backup modes.
>> This allows us to do things like create a bitmap synced to a full backup
>> without a transaction, or start a resumable backup process.
>>
>> Some combinations don't make sense, though:
>>
>> - NEVER policy combined with any non-BITMAP mode doesn't do anything,
>>   because the bitmap isn't used for input or output.
>>   It's harmless, but is almost certainly never what the user wanted.
>>
>> - sync=NONE is more questionable. It can't use on-success because this
>>   job never completes with success anyway, and the resulting artifact
>>   of 'always' is suspect: because we start with a full bitmap and only
>>   copy out segments that get written to, the final output bitmap will
>>   always be ... a fully set bitmap.
>>
>>   Maybe there's contexts in which bitmaps make sense for sync=none,
>>   but not without more severe changes to the current job, and omitting
>>   it here doesn't prevent us from adding it later.
>>
>> Signed-off-by: John Snow 
>> ---
>>  block/backup.c   |  8 +---
>>  blockdev.c   | 22 ++
>>  qapi/block-core.json |  6 --
>>  3 files changed, 27 insertions(+), 9 deletions(-)
> 
> [...]
> 
>> diff --git a/blockdev.c b/blockdev.c
>> index f0b7da53b0..bc152f8e0d 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
> 
> [...]
> 
>> +if (!backup->has_bitmap && backup->has_bitmap_mode) {
>> +error_setg(errp, "Cannot specify Bitmap sync mode without a 
>> bitmap");
> 
> Any reason for capitalizing the first “Bitmap”?
> 
> With a reason or it fixed:
> 
> Reviewed-by: Max Reitz 
> 

Hanging around Germans too much?

Actually, I can explain why: because a "bitmap" is a generic term, but
whenever I capitalize it as "Bitmap" I am referring to a Block Dirty
Bitmap which is a specific sort of thing. I do this unconsciously.

But in that case, I should actually be consistent in the interface (and
docstrings and docs and error strings) and always call it that specific
thing, which I don't.

"bitmap" is probably more correct for now, but I ought to go through all
the interface and make it consistent in some way or another.

(Actually: I'd like to see if I can't rename the "BdrvDirtyBitmap"
object to something more generic and shorter so I can rename many of the
functions we have something shorter.

Because the structure is "BdrvDirtyBitmap", there's some confusion when
we name functions like bdrv_dirty_bitmap_{verb} because it's not clear
if this is a bdrv function that does something with dirty bitmaps, or if
it's a "BdrvDirtyBitmap" function that does something with that object.

I guess that seems like a subtle point, but it's why the naming
conventions in dirty-bitmap.c are all over the place. I think at one
point, the idea was that:

bdrv_{verb}_dirty_bitmap was an action applied to a BDS that happened to
do something with dirty bitmaps. bdrv_dirty_bitmap_{verb} was an action
that applied to a "BdrvDirtyBitmap". Crystal clear and not confusing at
all, right?

It'd be nice to have functions that operate on a node be named
bdrv_dbitmap_foo() and functions that operate on the bitmap structure
itself named just dbitmap_bar().

Would it be okay if I named them such a thing, I wonder?

we have "bitmap" and "hbitmap" already, I could do something like
"dbitmap" for "dirty bitmap" or some such. Kind of an arbitrary change I
admit, but I'm itching to do a big spring cleaning in dirty-bitmap.c
right after this series is done.)

Re: [Qemu-block] [PATCH 6/8] block/backup: issue progress updates for skipped regions

2019-07-10 Thread John Snow

On 7/10/19 12:36 PM, Max Reitz wrote:
> On 10.07.19 03:05, John Snow wrote:
>> The way bitmap backups work is by starting at 75% if it needs
>> to copy just 25% of the disk.
> 
> Although there is this comment:
> 
>> /* TODO job_progress_set_remaining() would make more sense */
> 
> So...
> 
>> The way sync=top currently works, however, is to start at 0% and then
>> never update the progress if it doesn't copy a region. If it needs to
>> copy 25% of the disk, we'll finish at 25%.
>>
>> Update the progress when we skip regions.
> 
> Wouldn’t it be more correct to decrease the job length?
> 
> Max
> 

Admittedly I have precisely no idea. Maybe? As far as I understand it,
we guarantee only:

(1) Progress monotonically increases
(2) Upon completion, progress will equal the total work estimate.
[Trying to fix that to be true here.]

This means we can do stuff like:

- Total work estimate can increase or decrease arbitrarily
- Neither value has to mean anything in particular

Bitmap sync works by artificially increasing progress for NOP regions,
seen in init_copy_bitmap.

Full sync also tends to increase progress regardless of it actually did
a copy or not; offload support also counts as progress here. So if you
full sync an empty image, you'll see it increasing the progress as it
doesn't actually do anything.

Top sync is the odd one out, which just omits progress for regions it skips.

My only motivation here was to make them consistent. Can I do it the
other way? Yeah, probably. Is one way better than the other? I
legitimately have no idea. I guess whoever wrote the last comment felt
that it should all be the other way instead. Why'd they not do that?

¯\_(ツ)_/¯

>> Signed-off-by: John Snow 
>> ---
>>  block/backup.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/block/backup.c b/block/backup.c
>> index a64b768e24..38c4a688c6 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -417,6 +417,7 @@ static int coroutine_fn backup_loop(BackupBlockJob *job)
>>  if (job->sync_mode == MIRROR_SYNC_MODE_TOP &&
>>  bdrv_is_unallocated_range(bs, offset, job->cluster_size))
>>  {
>> +job_progress_update(&job->common.job, job->cluster_size);
>>  bdrv_reset_dirty_bitmap(job->copy_bitmap, offset,
>>  job->cluster_size);
>>  continue;
>>
> 
>

Re: [Qemu-block] [PATCH 5/8] iotests/257: test API failures

2019-07-10 Thread John Snow




On 7/10/19 12:22 PM, Max Reitz wrote:
> On 10.07.19 03:05, John Snow wrote:
>> Signed-off-by: John Snow 
>> ---
>>  tests/qemu-iotests/257 | 69 +++
>>  tests/qemu-iotests/257.out | 85 ++
>>  2 files changed, 154 insertions(+)
>>
>> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
>> index 2eb4f26c28..de8707cb19 100755
>> --- a/tests/qemu-iotests/257
>> +++ b/tests/qemu-iotests/257
>> @@ -450,10 +450,79 @@ def test_bitmap_sync(bsync_mode, msync_mode='bitmap', 
>> failure=None):
>>  compare_images(img_path, fbackup2)
>>  log('')
>>  
>> +def test_backup_api():
>> +"""
>> +"""
> 
> Er, OK?
> 
> [...]
> 

Whops.

>> +for sync_mode, config in error_cases.items():
>> +log("-- Sync mode {:s} tests --\n".format(sync_mode))
>> +for bitmap, policies in config.items():
> 
> You might be interested in the fact that the iteration order is
> different for Python2.  Or maybe you aren’t.
> 

asdf. This is undoubtedly the worst thing about Python. I'll have to fix
this, because the ordering isn't guaranteed until 3.5 or some such and
we're only EOLing Python2.

--js

Re: [Qemu-block] [PATCH 4/8] block/backup: hoist bitmap check into QMP interface

2019-07-10 Thread John Snow




On 7/10/19 12:11 PM, Max Reitz wrote:
> On 10.07.19 03:05, John Snow wrote:
>> This is nicer to do in the unified QMP interface that we have now,
>> because it lets us use the right terminology back at the user.
>>
>> Signed-off-by: John Snow 
>> ---
>>  block/backup.c | 13 -
>>  blockdev.c | 10 ++
>>  2 files changed, 14 insertions(+), 9 deletions(-)
>>
>> diff --git a/block/backup.c b/block/backup.c
>> index e2729cf6fa..a64b768e24 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -566,6 +566,10 @@ BlockJob *backup_job_create(const char *job_id, 
>> BlockDriverState *bs,
>>  assert(bs);
>>  assert(target);
>>  
>> +/* QMP interface protects us from these cases */
>> +assert(sync_mode != MIRROR_SYNC_MODE_INCREMENTAL);
>> +assert(sync_bitmap || sync_mode != MIRROR_SYNC_MODE_BITMAP);
> 
> Implication would be a nice operator sometimes.
> 
> ("assert(sync_mode == MIRROR_SYNC_MODE_BITMAP -> sync_bitmap)")
> 
> (Can you do that in C++?  No, you can’t overload bool’s operators, right?)
> 
> Reviewed-by: Max Reitz 
> 

Yes, I also find this assertion kind of hard to read personally, but it
feels somewhat clunky to write:

if (antecedent) {
assert(condition);
}

I suppose we can also phrase this as:

assert(sync_mode == MIRROR_SYNC_MODE_BITMAP ? sync_bitmap : true);

Which might honestly be pretty good. Mind if I change it to this?

--js

Re: [Qemu-block] [PATCH 3/8] iotests/257: Refactor backup helpers

2019-07-10 Thread John Snow

On 7/10/19 12:04 PM, Max Reitz wrote:
> On 10.07.19 03:05, John Snow wrote:
>> This test needs support for non-bitmap backups and missing or
>> unspecified bitmap sync modes, so rewrite the helpers to be a little
>> more generic.
>>
>> Signed-off-by: John Snow 
>> ---
>>  tests/qemu-iotests/257 |  46 +
>>  tests/qemu-iotests/257.out | 192 ++---
>>  2 files changed, 124 insertions(+), 114 deletions(-)
>>
>> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
>> index 2ff4aa8695..2eb4f26c28 100755
>> --- a/tests/qemu-iotests/257
>> +++ b/tests/qemu-iotests/257
> 
> [...]
> 
>> -def bitmap_backup(drive, n, filepath, bitmap, bitmap_mode):
>> -log("--- Bitmap Backup #{:d} ---\n".format(n))
>> -target_id = "bitmap_target_{:d}".format(n)
>> -job_id = "bitmap_backup_{:d}".format(n)
>> +def backup(drive, n, filepath, bitmap, bitmap_mode, sync='bitmap'):
>> +log("--- Test Backup #{:d} ---\n".format(n))
>> +target_id = "backup_target_{:d}".format(n)
>> +job_id = "backup_{:d}".format(n)
>>  target_drive = Drive(filepath, vm=drive.vm)
>>  
>>  target_drive.create_target(target_id, drive.fmt, drive.size)
>> -drive.vm.qmp_log("blockdev-backup", job_id=job_id, device=drive.name,
>> - target=target_id, sync="bitmap",
>> - bitmap_mode=bitmap_mode,
>> - bitmap=bitmap,
>> - auto_finalize=False)
>> +
>> +kwargs = {
>> +'job_id': job_id,
>> +'auto_finalize': False,
>> +'bitmap': bitmap,
>> +'bitmap_mode': bitmap_mode,
>> +}
>> +kwargs = {key: val for key, val in kwargs.items() if val is not None}
> 
> I suppose this is to remove items that are None?
> 
> Very cute, but why not just
> 
>   kwargs = {
> 'job_id': job_id,
> 'auto_finalize': False,
>   }
>   if bitmap is not None:
> kwargs['bitmap'] = bitmap
> kwargs['bitmap_mode'] = bitmap_mode
> 
> Exactly the same number of lines, but immediately makes it clear what’s
> going on.  Not as cute, I admit.
> 
> (Yes, I am indeed actively trying to train you not to write cute code.)
> 

It sneaks in. I genuinely struggle with understanding what other people
will find readable; I have an authentically hard time reviewing other
people's patches too. I'm earnestly not sure how I can help improve
this, but I would like to.

I wasn't sure what the easiest way to avoid sending the "None" over the
wire was, so I went with a general thing, but yes: it's because bitmap
and bitmap_mode are set to None sometimes and I need to omit such keys.

In this case, though, I do test bitmap and bitmap_mode separately, so
for the purposes of testing intentionally bad combinations you do need:

if bitmap is not None:
kwargs['bitmap'] = bitmap
if bitmap_mode is not None:
kwargs['bitmap_mode'] = bitmap_mode

And I just looked at this and it did not spark joy; so I went with a
generic filter to remove nulled keys. I admit it's /slightly/ cute and
not immediately obvious why it needs to be done.

This is even cuter, so maybe I am traveling in the wrong direction:

def backup(drive, n, filepath, sync, **kwargs):
log("--- Test Backup #{:d} ---\n".format(n))
target_id = "backup_target_{:d}".format(n)
job_id = "backup_{:d}".format(n)
target_drive = Drive(filepath, vm=drive.vm)

target_drive.create_target(target_id, drive.fmt, drive.size)
kwargs.setdefault('auto_finalize', False)
# Strip any arguments explicitly nulled by the caller:
kwargs = {key: val for key, val in kwargs.items()
  if val is not None}
blockdev_backup(drive.vm, drive.name, target_id, sync, **kwargs)
return job_id

It's quite a bit shorter and also makes backup() more flexible by
omitting the bitmap and bitmap_mode arguments entirely, allowing the
caller to override the auto_finalize default, etc. In this permutation,
we don't know the full extent of kwargs so it makes sense to generically
filter it.

Manually conditionally setting arguments is probably also fine.
Do you still have a preference for the more static approach?

> The rest looks good to me:
> 
> Reviewed-by: Max Reitz 
> 

Thanks for reviewing, as always!

Re: [Qemu-block] [PATCH 2/8] iotests/257: add EmulatedBitmap class

2019-07-10 Thread John Snow




On 7/10/19 11:47 AM, Max Reitz wrote:
> On 10.07.19 03:05, John Snow wrote:
>> Represent a bitmap with an object that we can mark and clear bits in.
>> This makes it easier to manage partial writes when we don't write a
>> full group's worth of patterns before an error.
>>
>> Signed-off-by: John Snow 
>> ---
>>  tests/qemu-iotests/257 | 125 +
>>  1 file changed, 76 insertions(+), 49 deletions(-)
>>
>> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
>> index f576a35a5c..2ff4aa8695 100755
>> --- a/tests/qemu-iotests/257
>> +++ b/tests/qemu-iotests/257
>> @@ -85,6 +85,60 @@ GROUPS = [
>>  Pattern('0xdd', 0x3fc)]), # New; leaving a gap to the right
>>  ]
>>  
>> +
>> +class EmulatedBitmap:
>> +def __init__(self, granularity=GRANULARITY):
>> +self._bits = set()
>> +self.groups = set()
>> +self.granularity = granularity
>> +
>> +def dirty_bits(self, bits):
>> +self._bits |= set(bits)
>> +
>> +def dirty_group(self, n):
>> +self.dirty_bits(GROUPS[n].bits(self.granularity))
>> +
>> +def clear(self):
>> +self._bits = set()
>> +
>> +def clear_bits(self, bits):
>> +self._bits = self._bits - set(bits)
> 
> Does -= not work here?
> 
> No real complaints.  Sorry.
> 
> Reviewed-by: Max Reitz 
> 

It's okay. I forget which shorthand operators Python has for which types
sometimes.

>>> a = {'a', 'b', 'c'}
>>> a -= {'c'}
>>> a
{'b', 'a'}

Well, apparently it does. I'll change it.

--js

Re: [Qemu-block] [PATCH 1/8] iotests/257: add Pattern class

2019-07-10 Thread John Snow




On 7/10/19 12:26 PM, Max Reitz wrote:
> On 10.07.19 03:05, John Snow wrote:
>> Just kidding, this is easier to manage with a full class instead of a
>> namedtuple.
>>
>> Signed-off-by: John Snow 
>> ---
>>  tests/qemu-iotests/257 | 58 +++---
>>  1 file changed, 32 insertions(+), 26 deletions(-)
>>
>> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
>> index 75a651c7c3..f576a35a5c 100755
>> --- a/tests/qemu-iotests/257
>> +++ b/tests/qemu-iotests/257
>> @@ -19,7 +19,6 @@
>>  #
>>  # owner=js...@redhat.com
>>  
>> -from collections import namedtuple
>>  import math
>>  import os
>>  
>> @@ -29,10 +28,18 @@ from iotests import log, qemu_img
>>  SIZE = 64 * 1024 * 1024
>>  GRANULARITY = 64 * 1024
>>  
>> -Pattern = namedtuple('Pattern', ['byte', 'offset', 'size'])
>> -def mkpattern(byte, offset, size=GRANULARITY):
>> -"""Constructor for Pattern() with default size"""
>> -return Pattern(byte, offset, size)
>> +
>> +class Pattern:
>> +def __init__(self, byte, offset, size=GRANULARITY):
>> +self.byte = byte
>> +self.offset = offset
>> +self.size = size
>> +
>> +def bits(self, granularity):
>> +lower = math.floor(self.offset / granularity)
>> +upper = math.floor((self.offset + self.size - 1) / granularity)
>> +return set(range(lower, upper + 1))
> 
> By the way, this doesn’t work with Python2 (pre-existing in your other
> series).  It complains that these are floats.
> 
> Now I don’t know whether you care but there is the fact that the
> expressions would be shorter if they were of the form x // y instead of
> math.floor(x / y).
> 
> Max
> 

Ah, crud; OK -- I'll play nice with python2 for a while longer. Thanks
for pointing this out.

Re: [Qemu-block] [PATCH v4 00/18] bitmaps: introduce 'bitmap' sync mode

2019-07-10 Thread John Snow

On 7/10/19 10:48 AM, Max Reitz wrote:
> On 10.07.19 01:25, John Snow wrote:
>> This series adds a new "BITMAP" sync mode that is meant to replace the
>> existing "INCREMENTAL" sync mode.
> 
> So who’s going to take it? :-)
> 
> get_maintainer.pl says I’m responsible for 14/18 patches, and you are
> for 9/18.  But you did prefix the title with “bitmaps”, which
> MAINTAINERS clearly states is your territory. ;-)
> 
> Max
> 

I'll do it, since there's more to come and it'll be easier for me to
stage it all in one giant chunk.

Thank you for your patience and reviews! This series has helped me
scratch a major itch that I've had ever since I implemented incremental
backups.

--js

Re: [Qemu-block] [PATCH 8/8] iotests/257: test traditional sync modes

2019-07-10 Thread Max Reitz

On 10.07.19 03:05, John Snow wrote:
> Signed-off-by: John Snow 
> ---
>  tests/qemu-iotests/257 |   31 +
>  tests/qemu-iotests/257.out | 3089 
>  2 files changed, 3120 insertions(+)

Oof.

> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
> index de8707cb19..8de1c4da19 100755
> --- a/tests/qemu-iotests/257
> +++ b/tests/qemu-iotests/257

[...]

> @@ -410,6 +416,11 @@ def test_bitmap_sync(bsync_mode, msync_mode='bitmap', 
> failure=None):
>  if bsync_mode == 'always' and failure == 'intermediate':
>  # We manage to copy one sector (one bit) before the error.
>  ebitmap.clear_bit(ebitmap.first_bit)
> +if msync_mode in ('full', 'top'):
> +# These modes return all bits set except what was 
> copied/skipped

Hm.  How useful is bitmap support for 'top' then, anyway?  That means
that if you want to resume a top backup, you always have to resume it
like it was a full backup.  Which sounds kind of useless.

Max

> +fail_bit = ebitmap.first_bit
> +ebitmap.clear()
> +ebitmap.dirty_bits(range(fail_bit, SIZE // GRANULARITY))
>  ebitmap.compare(get_bitmap(bitmaps, drive0.device, 'bitmap0'))
>  
>  # 2 - Writes and Reference Backup
[...]



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [Qemu-devel] [PATCH] nvme: Set number of queues later in nvme_init()

2019-07-10 Thread Maxim Levitsky

On Wed, 2019-07-10 at 17:34 +0200, Philippe Mathieu-Daudé wrote:
> On 7/10/19 4:57 PM, Michal Privoznik wrote:
> > When creating the admin queue in nvme_init() the variable that
> > holds the number of queues created is modified before actual
> > queue creation. This is a problem because if creating the queue
> > fails then the variable is left in inconsistent state. This was
> > actually observed when I tried to hotplug a nvme disk. The
> > control got to nvme_file_open() which called nvme_init() which
> > failed and thus nvme_close() was called which in turn called
> > nvme_free_queue_pair() with queue being NULL. This lead to an
> > instant crash:
> > 
> >   #0  0x55d9507ec211 in nvme_free_queue_pair (bs=0x55d952ddb880, q=0x0) 
> > at block/nvme.c:164
> >   #1  0x55d9507ee180 in nvme_close (bs=0x55d952ddb880) at 
> > block/nvme.c:729
> >   #2  0x55d9507ee3d5 in nvme_file_open (bs=0x55d952ddb880, 
> > options=0x55d952bb1410, flags=147456, errp=0x7ffd8e19e200) at 
> > block/nvme.c:781
> >   #3  0x55d9507629f3 in bdrv_open_driver (bs=0x55d952ddb880, 
> > drv=0x55d95109c1e0 , node_name=0x0, options=0x55d952bb1410, 
> > open_flags=147456, errp=0x7ffd8e19e310) at block.c:1291
> >   #4  0x55d9507633d6 in bdrv_open_common (bs=0x55d952ddb880, file=0x0, 
> > options=0x55d952bb1410, errp=0x7ffd8e19e310) at block.c:1551
> >   #5  0x55d950766881 in bdrv_open_inherit (filename=0x0, reference=0x0, 
> > options=0x55d952bb1410, flags=32768, parent=0x55d9538ce420, 
> > child_role=0x55d950eaade0 , errp=0x7ffd8e19e510)
> > at block.c:3063
> >   #6  0x55d950765ae4 in bdrv_open_child_bs (filename=0x0, 
> > options=0x55d9541cdff0, bdref_key=0x55d950af33aa "file", 
> > parent=0x55d9538ce420, child_role=0x55d950eaade0 ,
> > allow_none=true, errp=0x7ffd8e19e510) at block.c:2712
> >   #7  0x55d950766633 in bdrv_open_inherit (filename=0x0, reference=0x0, 
> > options=0x55d9541cdff0, flags=0, parent=0x0, child_role=0x0, 
> > errp=0x7ffd8e19e908) at block.c:3011
> >   #8  0x55d950766dba in bdrv_open (filename=0x0, reference=0x0, 
> > options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at block.c:3156
> >   #9  0x55d9507cb635 in blk_new_open (filename=0x0, reference=0x0, 
> > options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at 
> > block/block-backend.c:389
> >   #10 0x55d950465ec5 in blockdev_init (file=0x0, 
> > bs_opts=0x55d953d00390, errp=0x7ffd8e19e908) at blockdev.c:602
> > 
> 
> Fixes: bdd6a90a9e5
> 
> > Signed-off-by: Michal Privoznik 
> 
> Reviewed-by: Philippe Mathieu-Daudé 
> Tested-by: Philippe Mathieu-Daudé 
> 
> > ---
> >  block/nvme.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/block/nvme.c b/block/nvme.c
> > index 73ed5fa75f..9896b7f7c6 100644
> > --- a/block/nvme.c
> > +++ b/block/nvme.c
> > @@ -613,12 +613,12 @@ static int nvme_init(BlockDriverState *bs, const char 
> > *device, int namespace,
> >  
> >  /* Set up admin queue. */
> >  s->queues = g_new(NVMeQueuePair *, 1);
> > -s->nr_queues = 1;
> >  s->queues[0] = nvme_create_queue_pair(bs, 0, NVME_QUEUE_SIZE, errp);
> >  if (!s->queues[0]) {
> >  ret = -EINVAL;
> >  goto out;
> >  }
> > +s->nr_queues = 1;
> >  QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
> >  s->regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
> >  s->regs->asq = cpu_to_le64(s->queues[0]->sq.iova);
> > 
> 
> 


Reviewed-by: Maxim Levitsky 
Best regards,
Maxim Levitsky

[Qemu-block] [PATCH] LUKS: support preallocation in qemu-img

2019-07-10 Thread Maxim Levitsky

preallocation=off and preallocation=metadata
both allocate luks header only, and preallocation=falloc/full
is passed to underlying file, with the given image size.

Note that the actual preallocated size is a bit smaller due
to luks header.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1534951

Signed-off-by: Maxim Levitsky 
---
 block/crypto.c | 28 ++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 8237424ae6..74b789d278 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -251,6 +251,7 @@ static int block_crypto_open_generic(QCryptoBlockFormat 
format,
 static int block_crypto_co_create_generic(BlockDriverState *bs,
   int64_t size,
   QCryptoBlockCreateOptions *opts,
+  PreallocMode prealloc,
   Error **errp)
 {
 int ret;
@@ -266,6 +267,13 @@ static int block_crypto_co_create_generic(BlockDriverState 
*bs,
 goto cleanup;
 }
 
+if (prealloc != PREALLOC_MODE_OFF) {
+ret = blk_truncate(blk, size, prealloc, errp);
+if (ret < 0) {
+goto cleanup;
+}
+}
+
 data = (struct BlockCryptoCreateData) {
 .blk = blk,
 .size = size,
@@ -516,7 +524,7 @@ block_crypto_co_create_luks(BlockdevCreateOptions 
*create_options, Error **errp)
 };
 
 ret = block_crypto_co_create_generic(bs, luks_opts->size, &create_opts,
- errp);
+ PREALLOC_MODE_OFF, errp);
 if (ret < 0) {
 goto fail;
 }
@@ -534,12 +542,28 @@ static int coroutine_fn 
block_crypto_co_create_opts_luks(const char *filename,
 QCryptoBlockCreateOptions *create_opts = NULL;
 BlockDriverState *bs = NULL;
 QDict *cryptoopts;
+PreallocMode prealloc;
+char *buf = NULL;
 int64_t size;
 int ret;
+Error *local_err = NULL;
 
 /* Parse options */
 size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
 
+buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
+prealloc = qapi_enum_parse(&PreallocMode_lookup, buf,
+   PREALLOC_MODE_OFF, &local_err);
+g_free(buf);
+if (local_err) {
+error_propagate(errp, local_err);
+return -EINVAL;
+}
+
+if (prealloc == PREALLOC_MODE_METADATA) {
+prealloc  = PREALLOC_MODE_OFF;
+}
+
 cryptoopts = qemu_opts_to_qdict_filtered(opts, NULL,
  &block_crypto_create_opts_luks,
  true);
@@ -565,7 +589,7 @@ static int coroutine_fn 
block_crypto_co_create_opts_luks(const char *filename,
 }
 
 /* Create format layer */
-ret = block_crypto_co_create_generic(bs, size, create_opts, errp);
+ret = block_crypto_co_create_generic(bs, size, create_opts, prealloc, 
errp);
 if (ret < 0) {
 goto fail;
 }
-- 
2.17.2

Re: [Qemu-block] [PATCH 7/8] block/backup: support bitmap sync modes for non-bitmap backups

2019-07-10 Thread Max Reitz

On 10.07.19 03:05, John Snow wrote:
> Accept bitmaps and sync policies for the other backup modes.
> This allows us to do things like create a bitmap synced to a full backup
> without a transaction, or start a resumable backup process.
> 
> Some combinations don't make sense, though:
> 
> - NEVER policy combined with any non-BITMAP mode doesn't do anything,
>   because the bitmap isn't used for input or output.
>   It's harmless, but is almost certainly never what the user wanted.
> 
> - sync=NONE is more questionable. It can't use on-success because this
>   job never completes with success anyway, and the resulting artifact
>   of 'always' is suspect: because we start with a full bitmap and only
>   copy out segments that get written to, the final output bitmap will
>   always be ... a fully set bitmap.
> 
>   Maybe there's contexts in which bitmaps make sense for sync=none,
>   but not without more severe changes to the current job, and omitting
>   it here doesn't prevent us from adding it later.
> 
> Signed-off-by: John Snow 
> ---
>  block/backup.c   |  8 +---
>  blockdev.c   | 22 ++
>  qapi/block-core.json |  6 --
>  3 files changed, 27 insertions(+), 9 deletions(-)

[...]

> diff --git a/blockdev.c b/blockdev.c
> index f0b7da53b0..bc152f8e0d 100644
> --- a/blockdev.c
> +++ b/blockdev.c

[...]

> +if (!backup->has_bitmap && backup->has_bitmap_mode) {
> +error_setg(errp, "Cannot specify Bitmap sync mode without a bitmap");

Any reason for capitalizing the first “Bitmap”?

With a reason or it fixed:

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 6/8] block/backup: issue progress updates for skipped regions

2019-07-10 Thread Max Reitz

On 10.07.19 03:05, John Snow wrote:
> The way bitmap backups work is by starting at 75% if it needs
> to copy just 25% of the disk.

Although there is this comment:

> /* TODO job_progress_set_remaining() would make more sense */

So...

> The way sync=top currently works, however, is to start at 0% and then
> never update the progress if it doesn't copy a region. If it needs to
> copy 25% of the disk, we'll finish at 25%.
> 
> Update the progress when we skip regions.

Wouldn’t it be more correct to decrease the job length?

Max

> Signed-off-by: John Snow 
> ---
>  block/backup.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/block/backup.c b/block/backup.c
> index a64b768e24..38c4a688c6 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -417,6 +417,7 @@ static int coroutine_fn backup_loop(BackupBlockJob *job)
>  if (job->sync_mode == MIRROR_SYNC_MODE_TOP &&
>  bdrv_is_unallocated_range(bs, offset, job->cluster_size))
>  {
> +job_progress_update(&job->common.job, job->cluster_size);
>  bdrv_reset_dirty_bitmap(job->copy_bitmap, offset,
>  job->cluster_size);
>  continue;
> 




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 5/8] iotests/257: test API failures

2019-07-10 Thread Max Reitz

On 10.07.19 03:05, John Snow wrote:
> Signed-off-by: John Snow 
> ---
>  tests/qemu-iotests/257 | 69 +++
>  tests/qemu-iotests/257.out | 85 ++
>  2 files changed, 154 insertions(+)
> 
> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
> index 2eb4f26c28..de8707cb19 100755
> --- a/tests/qemu-iotests/257
> +++ b/tests/qemu-iotests/257
> @@ -450,10 +450,79 @@ def test_bitmap_sync(bsync_mode, msync_mode='bitmap', 
> failure=None):
>  compare_images(img_path, fbackup2)
>  log('')
>  
> +def test_backup_api():
> +"""
> +"""

Er, OK?

[...]

> +for sync_mode, config in error_cases.items():
> +log("-- Sync mode {:s} tests --\n".format(sync_mode))
> +for bitmap, policies in config.items():

You might be interested in the fact that the iteration order is
different for Python2.  Or maybe you aren’t.

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 1/8] iotests/257: add Pattern class

2019-07-10 Thread Max Reitz

On 10.07.19 03:05, John Snow wrote:
> Just kidding, this is easier to manage with a full class instead of a
> namedtuple.
> 
> Signed-off-by: John Snow 
> ---
>  tests/qemu-iotests/257 | 58 +++---
>  1 file changed, 32 insertions(+), 26 deletions(-)
> 
> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
> index 75a651c7c3..f576a35a5c 100755
> --- a/tests/qemu-iotests/257
> +++ b/tests/qemu-iotests/257
> @@ -19,7 +19,6 @@
>  #
>  # owner=js...@redhat.com
>  
> -from collections import namedtuple
>  import math
>  import os
>  
> @@ -29,10 +28,18 @@ from iotests import log, qemu_img
>  SIZE = 64 * 1024 * 1024
>  GRANULARITY = 64 * 1024
>  
> -Pattern = namedtuple('Pattern', ['byte', 'offset', 'size'])
> -def mkpattern(byte, offset, size=GRANULARITY):
> -"""Constructor for Pattern() with default size"""
> -return Pattern(byte, offset, size)
> +
> +class Pattern:
> +def __init__(self, byte, offset, size=GRANULARITY):
> +self.byte = byte
> +self.offset = offset
> +self.size = size
> +
> +def bits(self, granularity):
> +lower = math.floor(self.offset / granularity)
> +upper = math.floor((self.offset + self.size - 1) / granularity)
> +return set(range(lower, upper + 1))

By the way, this doesn’t work with Python2 (pre-existing in your other
series).  It complains that these are floats.

Now I don’t know whether you care but there is the fact that the
expressions would be shorter if they were of the form x // y instead of
math.floor(x / y).

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 4/8] block/backup: hoist bitmap check into QMP interface

2019-07-10 Thread Max Reitz

On 10.07.19 03:05, John Snow wrote:
> This is nicer to do in the unified QMP interface that we have now,
> because it lets us use the right terminology back at the user.
> 
> Signed-off-by: John Snow 
> ---
>  block/backup.c | 13 -
>  blockdev.c | 10 ++
>  2 files changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index e2729cf6fa..a64b768e24 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -566,6 +566,10 @@ BlockJob *backup_job_create(const char *job_id, 
> BlockDriverState *bs,
>  assert(bs);
>  assert(target);
>  
> +/* QMP interface protects us from these cases */
> +assert(sync_mode != MIRROR_SYNC_MODE_INCREMENTAL);
> +assert(sync_bitmap || sync_mode != MIRROR_SYNC_MODE_BITMAP);

Implication would be a nice operator sometimes.

("assert(sync_mode == MIRROR_SYNC_MODE_BITMAP -> sync_bitmap)")

(Can you do that in C++?  No, you can’t overload bool’s operators, right?)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 3/8] iotests/257: Refactor backup helpers

2019-07-10 Thread Max Reitz

On 10.07.19 03:05, John Snow wrote:
> This test needs support for non-bitmap backups and missing or
> unspecified bitmap sync modes, so rewrite the helpers to be a little
> more generic.
> 
> Signed-off-by: John Snow 
> ---
>  tests/qemu-iotests/257 |  46 +
>  tests/qemu-iotests/257.out | 192 ++---
>  2 files changed, 124 insertions(+), 114 deletions(-)
> 
> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
> index 2ff4aa8695..2eb4f26c28 100755
> --- a/tests/qemu-iotests/257
> +++ b/tests/qemu-iotests/257

[...]

> -def bitmap_backup(drive, n, filepath, bitmap, bitmap_mode):
> -log("--- Bitmap Backup #{:d} ---\n".format(n))
> -target_id = "bitmap_target_{:d}".format(n)
> -job_id = "bitmap_backup_{:d}".format(n)
> +def backup(drive, n, filepath, bitmap, bitmap_mode, sync='bitmap'):
> +log("--- Test Backup #{:d} ---\n".format(n))
> +target_id = "backup_target_{:d}".format(n)
> +job_id = "backup_{:d}".format(n)
>  target_drive = Drive(filepath, vm=drive.vm)
>  
>  target_drive.create_target(target_id, drive.fmt, drive.size)
> -drive.vm.qmp_log("blockdev-backup", job_id=job_id, device=drive.name,
> - target=target_id, sync="bitmap",
> - bitmap_mode=bitmap_mode,
> - bitmap=bitmap,
> - auto_finalize=False)
> +
> +kwargs = {
> +'job_id': job_id,
> +'auto_finalize': False,
> +'bitmap': bitmap,
> +'bitmap_mode': bitmap_mode,
> +}
> +kwargs = {key: val for key, val in kwargs.items() if val is not None}

I suppose this is to remove items that are None?

Very cute, but why not just

  kwargs = {
'job_id': job_id,
'auto_finalize': False,
  }
  if bitmap is not None:
kwargs['bitmap'] = bitmap
kwargs['bitmap_mode'] = bitmap_mode

Exactly the same number of lines, but immediately makes it clear what’s
going on.  Not as cute, I admit.

(Yes, I am indeed actively trying to train you not to write cute code.)

The rest looks good to me:

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH 2/8] iotests/257: add EmulatedBitmap class

2019-07-10 Thread Max Reitz

On 10.07.19 03:05, John Snow wrote:
> Represent a bitmap with an object that we can mark and clear bits in.
> This makes it easier to manage partial writes when we don't write a
> full group's worth of patterns before an error.
> 
> Signed-off-by: John Snow 
> ---
>  tests/qemu-iotests/257 | 125 +
>  1 file changed, 76 insertions(+), 49 deletions(-)
> 
> diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
> index f576a35a5c..2ff4aa8695 100755
> --- a/tests/qemu-iotests/257
> +++ b/tests/qemu-iotests/257
> @@ -85,6 +85,60 @@ GROUPS = [
>  Pattern('0xdd', 0x3fc)]), # New; leaving a gap to the right
>  ]
>  
> +
> +class EmulatedBitmap:
> +def __init__(self, granularity=GRANULARITY):
> +self._bits = set()
> +self.groups = set()
> +self.granularity = granularity
> +
> +def dirty_bits(self, bits):
> +self._bits |= set(bits)
> +
> +def dirty_group(self, n):
> +self.dirty_bits(GROUPS[n].bits(self.granularity))
> +
> +def clear(self):
> +self._bits = set()
> +
> +def clear_bits(self, bits):
> +self._bits = self._bits - set(bits)

Does -= not work here?

No real complaints.  Sorry.

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [Qemu-devel] [PATCH] nvme: Set number of queues later in nvme_init()

2019-07-10 Thread Philippe Mathieu-Daudé

On 7/10/19 4:57 PM, Michal Privoznik wrote:
> When creating the admin queue in nvme_init() the variable that
> holds the number of queues created is modified before actual
> queue creation. This is a problem because if creating the queue
> fails then the variable is left in inconsistent state. This was
> actually observed when I tried to hotplug a nvme disk. The
> control got to nvme_file_open() which called nvme_init() which
> failed and thus nvme_close() was called which in turn called
> nvme_free_queue_pair() with queue being NULL. This lead to an
> instant crash:
> 
>   #0  0x55d9507ec211 in nvme_free_queue_pair (bs=0x55d952ddb880, q=0x0) 
> at block/nvme.c:164
>   #1  0x55d9507ee180 in nvme_close (bs=0x55d952ddb880) at block/nvme.c:729
>   #2  0x55d9507ee3d5 in nvme_file_open (bs=0x55d952ddb880, 
> options=0x55d952bb1410, flags=147456, errp=0x7ffd8e19e200) at block/nvme.c:781
>   #3  0x55d9507629f3 in bdrv_open_driver (bs=0x55d952ddb880, 
> drv=0x55d95109c1e0 , node_name=0x0, options=0x55d952bb1410, 
> open_flags=147456, errp=0x7ffd8e19e310) at block.c:1291
>   #4  0x55d9507633d6 in bdrv_open_common (bs=0x55d952ddb880, file=0x0, 
> options=0x55d952bb1410, errp=0x7ffd8e19e310) at block.c:1551
>   #5  0x55d950766881 in bdrv_open_inherit (filename=0x0, reference=0x0, 
> options=0x55d952bb1410, flags=32768, parent=0x55d9538ce420, 
> child_role=0x55d950eaade0 , errp=0x7ffd8e19e510) at block.c:3063
>   #6  0x55d950765ae4 in bdrv_open_child_bs (filename=0x0, 
> options=0x55d9541cdff0, bdref_key=0x55d950af33aa "file", 
> parent=0x55d9538ce420, child_role=0x55d950eaade0 , 
> allow_none=true, errp=0x7ffd8e19e510) at block.c:2712
>   #7  0x55d950766633 in bdrv_open_inherit (filename=0x0, reference=0x0, 
> options=0x55d9541cdff0, flags=0, parent=0x0, child_role=0x0, 
> errp=0x7ffd8e19e908) at block.c:3011
>   #8  0x55d950766dba in bdrv_open (filename=0x0, reference=0x0, 
> options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at block.c:3156
>   #9  0x55d9507cb635 in blk_new_open (filename=0x0, reference=0x0, 
> options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at 
> block/block-backend.c:389
>   #10 0x55d950465ec5 in blockdev_init (file=0x0, bs_opts=0x55d953d00390, 
> errp=0x7ffd8e19e908) at blockdev.c:602
> 

Fixes: bdd6a90a9e5

> Signed-off-by: Michal Privoznik 

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 

> ---
>  block/nvme.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/nvme.c b/block/nvme.c
> index 73ed5fa75f..9896b7f7c6 100644
> --- a/block/nvme.c
> +++ b/block/nvme.c
> @@ -613,12 +613,12 @@ static int nvme_init(BlockDriverState *bs, const char 
> *device, int namespace,
>  
>  /* Set up admin queue. */
>  s->queues = g_new(NVMeQueuePair *, 1);
> -s->nr_queues = 1;
>  s->queues[0] = nvme_create_queue_pair(bs, 0, NVME_QUEUE_SIZE, errp);
>  if (!s->queues[0]) {
>  ret = -EINVAL;
>  goto out;
>  }
> +s->nr_queues = 1;
>  QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
>  s->regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
>  s->regs->asq = cpu_to_le64(s->queues[0]->sq.iova);
>

Re: [Qemu-block] [PATCH v3] block/rbd: implement .bdrv_get_allocated_file_size callback

2019-07-10 Thread Stefano Garzarella

On Tue, Jul 09, 2019 at 09:42:14PM -0400, Jason Dillaman wrote:
> On Tue, Jul 9, 2019 at 11:32 AM Max Reitz  wrote:
> > On 09.07.19 15:09, Stefano Garzarella wrote:
> > >
> > > Max, Jason, thanks for the details!
> > >
> > > If you agree, I'll try to implement something similar, iterating on all
> > > snapshots.
> >
> > Yes, that should work.
> >
> > > What about metadata?
> > > I saw some APIs (e.g. rbd_metadata_list()) that allow us to get the 
> > > metadata.
> > > An idea could be to iterate over them and sum the keys-values size 
> > > returned,
> > > but I'm not sure they are returned in the same format as they are stored.
> >
> > I wouldn’t mind ignoring metadata too much.  If you can get something to
> > work, even better, but I don’t consider that too important.
> 
> I don't think it's a good idea to attempt to estimate it. If there is
> enough desire, we can always add a supported "disk-usage" API method
> that takes into account things like the image metadata, object-map,
> journaling, etc overhead. However, I wouldn't expect it to be anywhere
> near the actual block usage (unless something has gone terribly wrong
> w/ journaling and it fails to trim your log).

Okay, so for now, I'll skip the metadata and if you think can be useful
to add "disk-usage" API in RBD, I'll use it later.

Thanks,
Stefano

Re: [Qemu-block] [PATCH 1/8] iotests/257: add Pattern class

2019-07-10 Thread Max Reitz

On 10.07.19 03:05, John Snow wrote:
> Just kidding, this is easier to manage with a full class instead of a
> namedtuple.
> 
> Signed-off-by: John Snow 
> ---
>  tests/qemu-iotests/257 | 58 +++---
>  1 file changed, 32 insertions(+), 26 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

[Qemu-block] [PATCH] nvme: Set number of queues later in nvme_init()

2019-07-10 Thread Michal Privoznik

When creating the admin queue in nvme_init() the variable that
holds the number of queues created is modified before actual
queue creation. This is a problem because if creating the queue
fails then the variable is left in inconsistent state. This was
actually observed when I tried to hotplug a nvme disk. The
control got to nvme_file_open() which called nvme_init() which
failed and thus nvme_close() was called which in turn called
nvme_free_queue_pair() with queue being NULL. This lead to an
instant crash:

  #0  0x55d9507ec211 in nvme_free_queue_pair (bs=0x55d952ddb880, q=0x0) at 
block/nvme.c:164
  #1  0x55d9507ee180 in nvme_close (bs=0x55d952ddb880) at block/nvme.c:729
  #2  0x55d9507ee3d5 in nvme_file_open (bs=0x55d952ddb880, 
options=0x55d952bb1410, flags=147456, errp=0x7ffd8e19e200) at block/nvme.c:781
  #3  0x55d9507629f3 in bdrv_open_driver (bs=0x55d952ddb880, 
drv=0x55d95109c1e0 , node_name=0x0, options=0x55d952bb1410, 
open_flags=147456, errp=0x7ffd8e19e310) at block.c:1291
  #4  0x55d9507633d6 in bdrv_open_common (bs=0x55d952ddb880, file=0x0, 
options=0x55d952bb1410, errp=0x7ffd8e19e310) at block.c:1551
  #5  0x55d950766881 in bdrv_open_inherit (filename=0x0, reference=0x0, 
options=0x55d952bb1410, flags=32768, parent=0x55d9538ce420, 
child_role=0x55d950eaade0 , errp=0x7ffd8e19e510) at block.c:3063
  #6  0x55d950765ae4 in bdrv_open_child_bs (filename=0x0, 
options=0x55d9541cdff0, bdref_key=0x55d950af33aa "file", parent=0x55d9538ce420, 
child_role=0x55d950eaade0 , allow_none=true, errp=0x7ffd8e19e510) 
at block.c:2712
  #7  0x55d950766633 in bdrv_open_inherit (filename=0x0, reference=0x0, 
options=0x55d9541cdff0, flags=0, parent=0x0, child_role=0x0, 
errp=0x7ffd8e19e908) at block.c:3011
  #8  0x55d950766dba in bdrv_open (filename=0x0, reference=0x0, 
options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at block.c:3156
  #9  0x55d9507cb635 in blk_new_open (filename=0x0, reference=0x0, 
options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at 
block/block-backend.c:389
  #10 0x55d950465ec5 in blockdev_init (file=0x0, bs_opts=0x55d953d00390, 
errp=0x7ffd8e19e908) at blockdev.c:602

Signed-off-by: Michal Privoznik 
---
 block/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/nvme.c b/block/nvme.c
index 73ed5fa75f..9896b7f7c6 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -613,12 +613,12 @@ static int nvme_init(BlockDriverState *bs, const char 
*device, int namespace,
 
 /* Set up admin queue. */
 s->queues = g_new(NVMeQueuePair *, 1);
-s->nr_queues = 1;
 s->queues[0] = nvme_create_queue_pair(bs, 0, NVME_QUEUE_SIZE, errp);
 if (!s->queues[0]) {
 ret = -EINVAL;
 goto out;
 }
+s->nr_queues = 1;
 QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
 s->regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
 s->regs->asq = cpu_to_le64(s->queues[0]->sq.iova);
-- 
2.21.0

Re: [Qemu-block] [PATCH v4 00/18] bitmaps: introduce 'bitmap' sync mode

2019-07-10 Thread Max Reitz

On 10.07.19 01:25, John Snow wrote:
> This series adds a new "BITMAP" sync mode that is meant to replace the
> existing "INCREMENTAL" sync mode.

So who’s going to take it? :-)

get_maintainer.pl says I’m responsible for 14/18 patches, and you are
for 9/18.  But you did prefix the title with “bitmaps”, which
MAINTAINERS clearly states is your territory. ;-)

Max

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH v4 17/18] iotests: add test 257 for bitmap-mode backups

2019-07-10 Thread Max Reitz

On 10.07.19 01:25, John Snow wrote:
> Signed-off-by: John Snow 
> ---
>  tests/qemu-iotests/257 |  416 +++
>  tests/qemu-iotests/257.out | 2247 
>  tests/qemu-iotests/group   |1 +
>  3 files changed, 2664 insertions(+)
>  create mode 100755 tests/qemu-iotests/257
>  create mode 100644 tests/qemu-iotests/257.out

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH v4 11/18] block/backup: upgrade copy_bitmap to BdrvDirtyBitmap

2019-07-10 Thread Max Reitz

On 10.07.19 01:25, John Snow wrote:
> This simplifies some interface matters; namely the initialization and
> (later) merging the manifest back into the sync_bitmap if it was
> provided.
> 
> Signed-off-by: John Snow 
> ---
>  block/backup.c | 81 ++
>  1 file changed, 42 insertions(+), 39 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH v4 09/18] block/dirty-bitmap: add bdrv_dirty_bitmap_merge_internal

2019-07-10 Thread Max Reitz

On 10.07.19 01:25, John Snow wrote:
> I'm surprised it didn't come up sooner, but sometimes we have a +busy
> bitmap as a source. This is dangerous from the QMP API, but if we are
> the owner that marked the bitmap busy, it's safe to merge it using it as
> a read only source.
> 
> It is not safe in the general case to allow users to read from in-use
> bitmaps, so create an internal variant that foregoes the safety
> checking.
> 
> Signed-off-by: John Snow 
> ---
>  block/dirty-bitmap.c  | 54 +++
>  include/block/block_int.h |  3 +++
>  2 files changed, 52 insertions(+), 5 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/1] raw-posix.c - use max transfer length / max segement count only for SCSI passthrough

2019-07-10 Thread Maxim Levitsky

On Thu, 2019-07-04 at 15:43 +0300, Maxim Levitsky wrote:
> Regular kernel block devices (/dev/sda*, /dev/nvme*, etc) don't have
> max segment size/max segment count hardware requirements exposed
> to the userspace, but rather the kernel block layer
> takes care to split the incoming requests that
> violate these requirements.
> 
> Allowing the kernel to do the splitting allows qemu to avoid
> various overheads that arise otherwise from this.
> 
> This is especially visible in nbd server,
> exposing as a raw file, a mostly empty qcow2 image over the net.
> In this case most of the reads by the remote user
> won't even hit the underlying kernel block device,
> and therefore most of the  overhead will be in the
> nbd traffic which increases significantly with lower max transfer size.
> 
> In addition to that even for local block device
> access the peformance improves a bit due to less
> traffic between qemu and the kernel when large
> transfer sizes are used (e.g for image conversion)
> 
> More info can be found at:
> https://bugzilla.redhat.com/show_bug.cgi?id=1647104
> 
> Signed-off-by: Maxim Levitsky 
> Reviewed-by: Stefan Hajnoczi 
> Reviewed-by: Eric Blake 
> ---
>  block/file-posix.c | 54 --
>  1 file changed, 28 insertions(+), 26 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index ab05b51a66..4479cc7ab4 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1038,15 +1038,13 @@ static void raw_reopen_abort(BDRVReopenState *state)
>  s->reopen_state = NULL;
>  }
>  
> -static int hdev_get_max_transfer_length(BlockDriverState *bs, int fd)
> +static int sg_get_max_transfer_length(int fd)
>  {
>  #ifdef BLKSECTGET
>  int max_bytes = 0;
> -short max_sectors = 0;
> -if (bs->sg && ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
> +
> +if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
>  return max_bytes;
> -} else if (!bs->sg && ioctl(fd, BLKSECTGET, &max_sectors) == 0) {
> -return max_sectors << BDRV_SECTOR_BITS;
>  } else {
>  return -errno;
>  }
> @@ -1055,25 +1053,31 @@ static int 
> hdev_get_max_transfer_length(BlockDriverState *bs, int fd)
>  #endif
>  }
>  
> -static int hdev_get_max_segments(const struct stat *st)
> +static int sg_get_max_segments(int fd)
>  {
>  #ifdef CONFIG_LINUX
>  char buf[32];
>  const char *end;
> -char *sysfspath;
> +char *sysfspath = NULL;
>  int ret;
> -int fd = -1;
> +int sysfd = -1;
>  long max_segments;
> +struct stat st;
> +
> +if (fstat(fd, &st)) {
> +ret = -errno;
> +goto out;
> +}
>  
>  sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
> -major(st->st_rdev), minor(st->st_rdev));
> -fd = open(sysfspath, O_RDONLY);
> -if (fd == -1) {
> +major(st.st_rdev), minor(st.st_rdev));
> +sysfd = open(sysfspath, O_RDONLY);
> +if (sysfd == -1) {
>  ret = -errno;
>  goto out;
>  }
>  do {
> -ret = read(fd, buf, sizeof(buf) - 1);
> +ret = read(sysfd, buf, sizeof(buf) - 1);
>  } while (ret == -1 && errno == EINTR);
>  if (ret < 0) {
>  ret = -errno;
> @@ -1090,8 +1094,8 @@ static int hdev_get_max_segments(const struct stat *st)
>  }
>  
>  out:
> -if (fd != -1) {
> -close(fd);
> +if (sysfd != -1) {
> +close(sysfd);
>  }
>  g_free(sysfspath);
>  return ret;
> @@ -1103,19 +1107,17 @@ out:
>  static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
>  BDRVRawState *s = bs->opaque;
> -struct stat st;
>  
> -if (!fstat(s->fd, &st)) {
> -if (S_ISBLK(st.st_mode) || S_ISCHR(st.st_mode)) {
> -int ret = hdev_get_max_transfer_length(bs, s->fd);
> -if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
> -bs->bl.max_transfer = pow2floor(ret);
> -}
> -ret = hdev_get_max_segments(&st);
> -if (ret > 0) {
> -bs->bl.max_transfer = MIN(bs->bl.max_transfer,
> -  ret * getpagesize());
> -}
> +if (bs->sg) {
> +int ret = sg_get_max_transfer_length(s->fd);
> +
> +if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
> +bs->bl.max_transfer = pow2floor(ret);
> +}
> +
> +ret = sg_get_max_segments(s->fd);
> +if (ret > 0) {
> +bs->bl.max_transfer = MIN(bs->bl.max_transfer, ret * 
> getpagesize());
>  }
>  }
>  


Ping.

Best regards,
Maxim Levitsky

Re: [Qemu-block] [PATCH 0/4] virtio: handle zoned backing devices

2019-07-10 Thread Paolo Bonzini

On 10/07/19 13:02, Kevin Wolf wrote:
> Hm... Actually, file-posix implements .bdrv_check_perm and could just
> refuse attaching a parent there if it doesn't request a specific
> permission like BLK_PERM_SUPPORT_ZONED. That should give us the
> whitelist semantics through existing infrastructure.

I'd like Dmitry to have something more precise to base his work on.  The
permissions system is really complicated and I never really wrapped my
head around it, so I need your help.

IIUC, blkconf_apply_backend_options would grow a new argument (like
"resizable") and that argument would add BLK_PERM_SUPPORT_ZONED to the
perm that blkconf_apply_backend_options passes to blk_set_perm.  On the
other side raw_check_perm would say something like

if (is_zoned(s) && !(perm & BLK_PERM_SUPPORT_ZONED)) {
error_setg();
return -ENOTSUP;
}

Is this correct?

In addition, BLK_PERM_SUPPORT_ZONED would have to be a shared
permission, since it's possible to assign the same block device to
multiple scsi-block devices.  So BLK_PERM_SUPPORT_ZONED would be added
unconditionally to shared_perm.

Paolo

ps: I have always thought that shared_perm is expressed the wrong way
and should have been "denied_perm".  How hard would it be to change that
now?

Re: [Qemu-block] [PATCH 0/4] virtio: handle zoned backing devices

2019-07-10 Thread Kevin Wolf

Am 10.07.2019 um 12:09 hat Paolo Bonzini geschrieben:
> On 09/07/19 22:38, Dmitry Fomichev wrote:
> > Currently, attaching zoned block devices (i.e. storage devices
> > compliant to ZAC/ZBC standards) using several virtio methods doesn't
> > work - the zoned devices appear as regular block devices at the guest.
> > This may cause unexpected i/o errors and, potentially, some data
> > corruption.
> > 
> > To be more precise, attaching a zoned device via virtio-pci-blk,
> > virtio-scsi-pci/scsi-disk or virtio-scsi-pci/scsi-hd demonstrates the
> > above behavior. A simple fix is needed to make
> > virtio-scsi-pci/scsi-block work and this is covered by a different
> > patch. The virtio-scsi-pci/scsi-generic method appears to handle zoned
> > devices without problems.
> 
> The problem with this approach is that other devices (e.g. ide-hd or sd
> card) also break with zoned devices and the only way to fix it would be
> to add code denying zoned block devices to all of them.
> 
> The question then becomes how to define a whitelist.  One possiblity is
> to add a QOM interface (for example TYPE_ZONED_BLOCK_SUPPORT) to
> scsi-block and scsi-generic.  In do_parse_drive you can query the
> BlockBackend with bdrv_get_zoned_info, and return an error if the
> backend is a zoned block device and the device does not implement
> TYPE_ZONED_BLOCK_SUPPORT.  (Commit 6b1566c is an example of adding a new
> QOM interface; in your case, it would be simpler as the interface would
> not have any method).  Kevin, what do you think?

What about non-device users such as block jobs or (NBD) exports? Won't
they have to special-case such devices, too? In fact, what about image
format drivers or even filters?

I feel that this needs to be managed at the BDS level somehow. Not sure
which mechanism to use, though. Permissions would be suitable for a
blacklist approach, but I agree with you that we need a whitelist
instead.

Hm... Actually, file-posix implements .bdrv_check_perm and could just
refuse attaching a parent there if it doesn't request a specific
permission like BLK_PERM_SUPPORT_ZONED. That should give us the
whitelist semantics through existing infrastructure.

Kevin

Re: [Qemu-block] [PATCH 0/4] virtio: handle zoned backing devices

2019-07-10 Thread Paolo Bonzini

On 09/07/19 22:38, Dmitry Fomichev wrote:
> Currently, attaching zoned block devices (i.e. storage devices
> compliant to ZAC/ZBC standards) using several virtio methods doesn't
> work - the zoned devices appear as regular block devices at the guest.
> This may cause unexpected i/o errors and, potentially, some data
> corruption.
> 
> To be more precise, attaching a zoned device via virtio-pci-blk,
> virtio-scsi-pci/scsi-disk or virtio-scsi-pci/scsi-hd demonstrates the
> above behavior. A simple fix is needed to make
> virtio-scsi-pci/scsi-block work and this is covered by a different
> patch. The virtio-scsi-pci/scsi-generic method appears to handle zoned
> devices without problems.

The problem with this approach is that other devices (e.g. ide-hd or sd
card) also break with zoned devices and the only way to fix it would be
to add code denying zoned block devices to all of them.

The question then becomes how to define a whitelist.  One possiblity is
to add a QOM interface (for example TYPE_ZONED_BLOCK_SUPPORT) to
scsi-block and scsi-generic.  In do_parse_drive you can query the
BlockBackend with bdrv_get_zoned_info, and return an error if the
backend is a zoned block device and the device does not implement
TYPE_ZONED_BLOCK_SUPPORT.  (Commit 6b1566c is an example of adding a new
QOM interface; in your case, it would be simpler as the interface would
not have any method).  Kevin, what do you think?

Also, why deny passing Host Aware devices?

Paolo

Re: [Qemu-block] ANNOUNCE: Tool for diffing and backing up images and block devices

2019-07-10 Thread Kevin Wolf

Am 09.07.2019 um 18:23 hat Dan Shearer geschrieben:
> ddb is now available at https://github.com/gladserv/ddb and is intended for
> comparing block devices. The definition of "device" is very broad and can be
> custom-defined.
> 
> ddb also knows about sparse files.

How does this compare to qemu-img?

If I understand correctly, the main feature in comparison is that it
offers a mode that kind of combines qemu-img's 'convert -B' and
'rebase', i.e. convert with a target backing file, but making no
assumptions about the content of the target backing file. Is my
understanding correct?

Kevin

50 matches

Mail list logo