date:20190709

Re: [Qemu-block] [Qemu-devel] [PATCH v4 04/18] qapi: add BitmapSyncMode enum

2019-07-09 Thread Markus Armbruster

John Snow  writes:

> Depending on what a user is trying to accomplish, there might be a few
> bitmap cleanup actions that occur when an operation is finished that
> could be useful.
>
> I am proposing three:
> - NEVER: The bitmap is never synchronized against what was copied.
> - ALWAYS: The bitmap is always synchronized, even on failures.
> - ON-SUCCESS: The bitmap is synchronized only on success.
>
> The existing incremental backup modes use 'on-success' semantics,
> so add just that one for right now.
>
> Signed-off-by: John Snow 
> Reviewed-by: Max Reitz 
> ---
>  qapi/block-core.json | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 0af3866015..0c853d00bd 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1134,6 +1134,20 @@
>  { 'enum': 'MirrorSyncMode',
>'data': ['top', 'full', 'none', 'incremental'] }
>  
> +##
> +# @BitmapSyncMode:
> +#
> +# An enumeration of possible behaviors for the synchronization of a bitmap
> +# when used for data copy operations.
> +#
> +# @on-success: The bitmap is only synced when the operation is successful.
> +#  This is the behavior always used for 'INCREMENTAL' backups.
> +#
> +# Since: 4.2
> +##
> +{ 'enum': 'BitmapSyncMode',
> +  'data': ['on-success'] }
> +
>  ##
>  # @MirrorCopyMode:
>  #

Reviewed-by: Markus Armbruster

Re: [Qemu-block] [Qemu-devel] [PATCH v4 01/18] qapi/block-core: Introduce BackupCommon

2019-07-09 Thread Markus Armbruster

John Snow  writes:

> On 7/9/19 7:25 PM, John Snow wrote:
>> drive-backup and blockdev-backup have an awful lot of things in common
>> that are the same. Let's fix that.
>> 
>> I don't deduplicate 'target', because the semantics actually did change
>> between each structure. Leave that one alone so it can be documented
>> separately.
>
> Sorry Markus, I forgot to adjust this. Pretend I wrote:
>
> Where documentation was not identical, use the most up-to-date version.
> For "speed", use Blockdev-Backup's version. For "sync", use
> Drive-Backup's version.

With that:
Reviewed-by: Markus Armbruster 

[...]

Re: [Qemu-block] [PATCH v3] block/rbd: implement .bdrv_get_allocated_file_size callback

2019-07-09 Thread Jason Dillaman

On Tue, Jul 9, 2019 at 11:32 AM Max Reitz  wrote:
>
> On 09.07.19 15:09, Stefano Garzarella wrote:
> > On Tue, Jul 09, 2019 at 08:55:19AM -0400, Jason Dillaman wrote:
> >> On Tue, Jul 9, 2019 at 5:45 AM Max Reitz  wrote:
> >>>
> >>> On 09.07.19 10:55, Max Reitz wrote:
>  On 09.07.19 05:08, Jason Dillaman wrote:
> > On Fri, Jul 5, 2019 at 6:43 AM Stefano Garzarella  
> > wrote:
> >>
> >> On Fri, Jul 05, 2019 at 11:58:43AM +0200, Max Reitz wrote:
> >>> On 05.07.19 11:32, Stefano Garzarella wrote:
>  This patch allows 'qemu-img info' to show the 'disk size' for
>  the RBD images that have the fast-diff feature enabled.
> 
>  If this feature is enabled, we use the rbd_diff_iterate2() API
>  to calculate the allocated size for the image.
> 
>  Signed-off-by: Stefano Garzarella 
>  ---
>  v3:
>    - return -ENOTSUP instead of -1 when fast-diff is not available
>  [John, Jason]
>  v2:
>    - calculate the actual usage only if the fast-diff feature is
>  enabled [Jason]
>  ---
>   block/rbd.c | 54 
>  +
>   1 file changed, 54 insertions(+)
> >>>
> >>> Well, the librbd documentation is non-existing as always, but while
> >>> googling, I at least found that libvirt has exactly the same code.  
> >>> So I
> >>> suppose it must be quite correct, then.
> >>>
> >>
> >> While I wrote this code I took a look at libvirt implementation and 
> >> also
> >> at the "rbd" tool in the ceph repository: compute_image_disk_usage() in
> >> src/tools/rbd/action/DiskUsage.cc
> >>
>  diff --git a/block/rbd.c b/block/rbd.c
>  index 59757b3120..b6bed683e5 100644
>  --- a/block/rbd.c
>  +++ b/block/rbd.c
>  @@ -1084,6 +1084,59 @@ static int64_t 
>  qemu_rbd_getlength(BlockDriverState *bs)
>   return info.size;
>   }
> 
>  +static int rbd_allocated_size_cb(uint64_t offset, size_t len, int 
>  exists,
>  + void *arg)
>  +{
>  +int64_t *alloc_size = (int64_t *) arg;
>  +
>  +if (exists) {
>  +(*alloc_size) += len;
>  +}
>  +
>  +return 0;
>  +}
>  +
>  +static int64_t qemu_rbd_get_allocated_file_size(BlockDriverState 
>  *bs)
>  +{
>  +BDRVRBDState *s = bs->opaque;
>  +uint64_t flags, features;
>  +int64_t alloc_size = 0;
>  +int r;
>  +
>  +r = rbd_get_flags(s->image, &flags);
>  +if (r < 0) {
>  +return r;
>  +}
>  +
>  +r = rbd_get_features(s->image, &features);
>  +if (r < 0) {
>  +return r;
>  +}
>  +
>  +/*
>  + * We use rbd_diff_iterate2() only if the RBD image have 
>  fast-diff
>  + * feature enabled. If it is disabled, rbd_diff_iterate2() 
>  could be
>  + * very slow on a big image.
>  + */
>  +if (!(features & RBD_FEATURE_FAST_DIFF) ||
>  +(flags & RBD_FLAG_FAST_DIFF_INVALID)) {
>  +return -ENOTSUP;
>  +}
>  +
>  +/*
>  + * rbd_diff_iterate2(), if the source snapshot name is NULL, 
>  invokes
>  + * the callback on all allocated regions of the image.
>  + */
>  +r = rbd_diff_iterate2(s->image, NULL, 0,
>  +  bs->total_sectors * BDRV_SECTOR_SIZE, 0, 
>  1,
>  +  &rbd_allocated_size_cb, &alloc_size);
> >>>
> >>> But I have a question.  This is basically block_status, right?  So it
> >>> gives us information on which areas are allocated and which are not.
> >>> The result thus gives us a lower bound on the allocation size, but is 
> >>> it
> >>> really exactly the allocation size?
> >>>
> >>> There are two things I’m concerned about:
> >>>
> >>> 1. What about metadata?
> >>
> >> Good question, I don't think it includes the size used by metadata and 
> >> I
> >> don't know if there is a way to know it. I'll check better.
> >
> > It does not include the size of metadata, the "rbd_diff_iterate2"
> > function is literally just looking for touched data blocks within the
> > RBD image.
> >
> >>>
> >>> 2. If you have multiple snapshots, this will only report the overall
> >>> allocation information, right?  So say there is something like this:
> >>>
> >>> (“A” means an allocated MB, “-” is an unallocated MB)
> >>>
> >>> Snapshot 1: ---
> >>> Snapshot 2: --A

Re: [Qemu-block] [Qemu-devel] [PATCH v4 01/18] qapi/block-core: Introduce BackupCommon

2019-07-09 Thread John Snow




On 7/9/19 7:25 PM, John Snow wrote:
> drive-backup and blockdev-backup have an awful lot of things in common
> that are the same. Let's fix that.
> 
> I don't deduplicate 'target', because the semantics actually did change
> between each structure. Leave that one alone so it can be documented
> separately.

Sorry Markus, I forgot to adjust this. Pretend I wrote:

Where documentation was not identical, use the most up-to-date version.
For "speed", use Blockdev-Backup's version. For "sync", use
Drive-Backup's version.

> 
> Signed-off-by: John Snow 
> Reviewed-by: Max Reitz 
> ---
>  qapi/block-core.json | 103 ++-
>  1 file changed, 33 insertions(+), 70 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 0d43d4f37c..0af3866015 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1315,32 +1315,23 @@
>'data': { 'node': 'str', 'overlay': 'str' } }
>  
>  ##
> -# @DriveBackup:
> +# @BackupCommon:
>  #
>  # @job-id: identifier for the newly-created block job. If
>  #  omitted, the device name will be used. (Since 2.7)
>  #
>  # @device: the device name or node-name of a root node which should be 
> copied.
>  #
> -# @target: the target of the new image. If the file exists, or if it
> -#  is a device, the existing file/device will be used as the new
> -#  destination.  If it does not exist, a new file will be created.
> -#
> -# @format: the format of the new destination, default is to
> -#  probe if @mode is 'existing', else the format of the source
> -#
>  # @sync: what parts of the disk image should be copied to the destination
>  #(all the disk, only the sectors allocated in the topmost image, 
> from a
>  #dirty bitmap, or only new I/O).
>  #
> -# @mode: whether and how QEMU should create a new image, default is
> -#'absolute-paths'.
> -#
> -# @speed: the maximum speed, in bytes per second
> +# @speed: the maximum speed, in bytes per second. The default is 0,
> +# for unlimited.
>  #
>  # @bitmap: the name of dirty bitmap if sync is "incremental".
>  #  Must be present if sync is "incremental", must NOT be present
> -#  otherwise. (Since 2.4)
> +#  otherwise. (Since 2.4 (drive-backup), 3.1 (blockdev-backup))
>  #
>  # @compress: true to compress data, if the target format supports it.
>  #(default: false) (since 2.8)
> @@ -1370,75 +1361,47 @@
>  # I/O.  If an error occurs during a guest write request, the device's
>  # rerror/werror actions will be used.
>  #
> +# Since: 4.2
> +##
> +{ 'struct': 'BackupCommon',
> +  'data': { '*job-id': 'str', 'device': 'str',
> +'sync': 'MirrorSyncMode', '*speed': 'int',
> +'*bitmap': 'str', '*compress': 'bool',
> +'*on-source-error': 'BlockdevOnError',
> +'*on-target-error': 'BlockdevOnError',
> +'*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
> +
> +##
> +# @DriveBackup:
> +#
> +# @target: the target of the new image. If the file exists, or if it
> +#  is a device, the existing file/device will be used as the new
> +#  destination.  If it does not exist, a new file will be created.
> +#
> +# @format: the format of the new destination, default is to
> +#  probe if @mode is 'existing', else the format of the source
> +#
> +# @mode: whether and how QEMU should create a new image, default is
> +#'absolute-paths'.
> +#
>  # Since: 1.6
>  ##
>  { 'struct': 'DriveBackup',
> -  'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
> -'*format': 'str', 'sync': 'MirrorSyncMode',
> -'*mode': 'NewImageMode', '*speed': 'int',
> -'*bitmap': 'str', '*compress': 'bool',
> -'*on-source-error': 'BlockdevOnError',
> -'*on-target-error': 'BlockdevOnError',
> -'*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
> +  'base': 'BackupCommon',
> +  'data': { 'target': 'str',
> +'*format': 'str',
> +'*mode': 'NewImageMode' } }
>  
>  ##
>  # @BlockdevBackup:
>  #
> -# @job-id: identifier for the newly-created block job. If
> -#  omitted, the device name will be used. (Since 2.7)
> -#
> -# @device: the device name or node-name of a root node which should be 
> copied.
> -#
>  # @target: the device name or node-name of the backup target node.
>  #
> -# @sync: what parts of the disk image should be copied to the destination
> -#(all the disk, only the sectors allocated in the topmost image, or
> -#only new I/O).
> -#
> -# @speed: the maximum speed, in bytes per second. The default is 0,
> -# for unlimited.
> -#
> -# @bitmap: the name of dirty bitmap if sync is "incremental".
> -#  Must be present if sync is "incremental", must NOT be present
> -#  otherwise. (Since 3.1)
> -#
> -# @compress: true to compress data, if the target format supports it.
> -#

[Qemu-block] [PATCH 8/8] iotests/257: test traditional sync modes

2019-07-09 Thread John Snow

Signed-off-by: John Snow 
---
 tests/qemu-iotests/257 |   31 +
 tests/qemu-iotests/257.out | 3089 
 2 files changed, 3120 insertions(+)

diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
index de8707cb19..8de1c4da19 100755
--- a/tests/qemu-iotests/257
+++ b/tests/qemu-iotests/257
@@ -288,6 +288,12 @@ def test_bitmap_sync(bsync_mode, msync_mode='bitmap', 
failure=None):
   Bitmaps are always synchronized, regardless of failure.
   (Partial images must be kept.)
 
+:param msync_mode: The mirror sync mode to use for the first backup.
+   Can be any one of:
+- bitmap: Backups based on bitmap manifest.
+- full:   Full backups.
+- top:Full backups of the top layer only.
+
 :param failure: Is the (optional) failure mode, and can be any of:
 - None: No failure. Test the normative path. Default.
 - simulated:Cancel the job right before it completes.
@@ -410,6 +416,11 @@ def test_bitmap_sync(bsync_mode, msync_mode='bitmap', 
failure=None):
 if bsync_mode == 'always' and failure == 'intermediate':
 # We manage to copy one sector (one bit) before the error.
 ebitmap.clear_bit(ebitmap.first_bit)
+if msync_mode in ('full', 'top'):
+# These modes return all bits set except what was 
copied/skipped
+fail_bit = ebitmap.first_bit
+ebitmap.clear()
+ebitmap.dirty_bits(range(fail_bit, SIZE // GRANULARITY))
 ebitmap.compare(get_bitmap(bitmaps, drive0.device, 'bitmap0'))
 
 # 2 - Writes and Reference Backup
@@ -501,6 +512,21 @@ def test_backup_api():
 'bitmap404': ['on-success', 'always', 'never', None],
 'bitmap0':   [None],
 },
+'full': {
+None:['on-success', 'always', 'never'],
+'bitmap404': ['on-success', 'always', 'never', None],
+'bitmap0':   ['never', None],
+},
+'top': {
+None:['on-success', 'always', 'never'],
+'bitmap404': ['on-success', 'always', 'never', None],
+'bitmap0':   ['never', None],
+},
+'none': {
+None:['on-success', 'always', 'never'],
+'bitmap404': ['on-success', 'always', 'never', None],
+'bitmap0':   ['on-success', 'always', 'never', None],
+}
 }
 
 for sync_mode, config in error_cases.items():
@@ -522,6 +548,11 @@ def main():
 for failure in ("simulated", "intermediate", None):
 test_bitmap_sync(bsync_mode, "bitmap", failure)
 
+for sync_mode in ('full', 'top'):
+for bsync_mode in ('on-success', 'always'):
+for failure in ('simulated', 'intermediate', None):
+test_bitmap_sync(bsync_mode, sync_mode, failure)
+
 test_backup_api()
 
 if __name__ == '__main__':
diff --git a/tests/qemu-iotests/257.out b/tests/qemu-iotests/257.out
index 43f2e0f9c9..304e33ab73 100644
--- a/tests/qemu-iotests/257.out
+++ b/tests/qemu-iotests/257.out
@@ -2246,6 +2246,3002 @@ qemu_img compare "TEST_DIR/PID-bsync2" 
"TEST_DIR/PID-fbackup2" ==> Identical, OK
 qemu_img compare "TEST_DIR/PID-img" "TEST_DIR/PID-fbackup2" ==> Identical, OK!
 
 
+=== Mode full; Bitmap Sync on-success with simulated failure ===
+
+--- Preparing image & VM ---
+
+{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": 
{"driver": "file", "filename": "TEST_DIR/PID-img"}, "node-name": "drive0"}}
+{"return": {}}
+{"execute": "device_add", "arguments": {"drive": "drive0", "driver": 
"scsi-hd", "id": "device0", "share-rw": true}}
+{"return": {}}
+
+--- Write #0 ---
+
+write -P0x49 0x000 0x1
+{"return": ""}
+write -P0x6c 0x010 0x1
+{"return": ""}
+write -P0x6f 0x200 0x1
+{"return": ""}
+write -P0x76 0x3ff 0x1
+{"return": ""}
+{
+  "bitmaps": {
+"device0": []
+  }
+}
+
+--- Reference Backup #0 ---
+
+{}
+{"execute": "job-dismiss", "arguments": {"id": "bdc-file-job"}}
+{"return": {}}
+{}
+{}
+{"execute": "job-dismiss", "arguments": {"id": "bdc-fmt-job"}}
+{"return": {}}
+{}
+{"execute": "blockdev-backup", "arguments": {"device": "drive0", "job-id": 
"ref_backup_0", "sync": "full", "target": "ref_target_0"}}
+{"return": {}}
+{"data": {"device": "ref_backup_0", "len": 67108864, "offset": 67108864, 
"speed": 0, "type": "backup"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": 
{"microseconds": "USECS", "seconds": "SECS"}}
+
+--- Add Bitmap ---
+
+{"execute": "block-dirty-bitmap-add", "arguments": {"granularity": 65536, 
"name": "bitmap0", "node": "drive0"}}
+{"return": {}}
+
+--- Write #1 ---
+
+write -P0x65 0x000 0x1
+{"return": ""}
+write -P0x77 0x00f8000 0x1
+{"return": ""}
+write -P0x72 0x2008000 0x1
+{"return": ""}
+write -P0x69 0x3fe 0x1
+

[Qemu-block] [PATCH 6/8] block/backup: issue progress updates for skipped regions

2019-07-09 Thread John Snow

The way bitmap backups work is by starting at 75% if it needs
to copy just 25% of the disk.

The way sync=top currently works, however, is to start at 0% and then
never update the progress if it doesn't copy a region. If it needs to
copy 25% of the disk, we'll finish at 25%.

Update the progress when we skip regions.

Signed-off-by: John Snow 
---
 block/backup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/backup.c b/block/backup.c
index a64b768e24..38c4a688c6 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -417,6 +417,7 @@ static int coroutine_fn backup_loop(BackupBlockJob *job)
 if (job->sync_mode == MIRROR_SYNC_MODE_TOP &&
 bdrv_is_unallocated_range(bs, offset, job->cluster_size))
 {
+job_progress_update(&job->common.job, job->cluster_size);
 bdrv_reset_dirty_bitmap(job->copy_bitmap, offset,
 job->cluster_size);
 continue;
-- 
2.21.0

[Qemu-block] [PATCH 3/8] iotests/257: Refactor backup helpers

2019-07-09 Thread John Snow

This test needs support for non-bitmap backups and missing or
unspecified bitmap sync modes, so rewrite the helpers to be a little
more generic.

Signed-off-by: John Snow 
---
 tests/qemu-iotests/257 |  46 +
 tests/qemu-iotests/257.out | 192 ++---
 2 files changed, 124 insertions(+), 114 deletions(-)

diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
index 2ff4aa8695..2eb4f26c28 100755
--- a/tests/qemu-iotests/257
+++ b/tests/qemu-iotests/257
@@ -208,6 +208,14 @@ def get_bitmap(bitmaps, drivename, name, recording=None):
 return bitmap
 return None
 
+def blockdev_backup(vm, device, target, sync, **kwargs):
+result = vm.qmp_log('blockdev-backup',
+device=device,
+target=target,
+sync=sync,
+**kwargs)
+return result
+
 def reference_backup(drive, n, filepath):
 log("--- Reference Backup #{:d} ---\n".format(n))
 target_id = "ref_target_{:d}".format(n)
@@ -215,24 +223,26 @@ def reference_backup(drive, n, filepath):
 target_drive = Drive(filepath, vm=drive.vm)
 
 target_drive.create_target(target_id, drive.fmt, drive.size)
-drive.vm.qmp_log("blockdev-backup",
- job_id=job_id, device=drive.name,
- target=target_id, sync="full")
+blockdev_backup(drive.vm, drive.name, target_id, "full", job_id=job_id)
 drive.vm.run_job(job_id, auto_dismiss=True)
 log('')
 
-def bitmap_backup(drive, n, filepath, bitmap, bitmap_mode):
-log("--- Bitmap Backup #{:d} ---\n".format(n))
-target_id = "bitmap_target_{:d}".format(n)
-job_id = "bitmap_backup_{:d}".format(n)
+def backup(drive, n, filepath, bitmap, bitmap_mode, sync='bitmap'):
+log("--- Test Backup #{:d} ---\n".format(n))
+target_id = "backup_target_{:d}".format(n)
+job_id = "backup_{:d}".format(n)
 target_drive = Drive(filepath, vm=drive.vm)
 
 target_drive.create_target(target_id, drive.fmt, drive.size)
-drive.vm.qmp_log("blockdev-backup", job_id=job_id, device=drive.name,
- target=target_id, sync="bitmap",
- bitmap_mode=bitmap_mode,
- bitmap=bitmap,
- auto_finalize=False)
+
+kwargs = {
+'job_id': job_id,
+'auto_finalize': False,
+'bitmap': bitmap,
+'bitmap_mode': bitmap_mode,
+}
+kwargs = {key: val for key, val in kwargs.items() if val is not None}
+blockdev_backup(drive.vm, drive.name, target_id, sync, **kwargs)
 return job_id
 
 def perform_writes(drive, n):
@@ -264,7 +274,7 @@ def compare_images(image, reference, baseimg=None, 
expected_match=True):
 "OK!" if ret == expected_ret else "ERROR!"),
 filters=[iotests.filter_testfiles])
 
-def test_bitmap_sync(bsync_mode, failure=None):
+def test_bitmap_sync(bsync_mode, msync_mode='bitmap', failure=None):
 """
 Test bitmap backup routines.
 
@@ -292,7 +302,7 @@ def test_bitmap_sync(bsync_mode, failure=None):
  fbackup0, fbackup1, fbackup2), \
  iotests.VM() as vm:
 
-mode = "Bitmap Sync Mode {:s}".format(bsync_mode)
+mode = "Mode {:s}; Bitmap Sync {:s}".format(msync_mode, bsync_mode)
 preposition = "with" if failure else "without"
 cond = "{:s} {:s}".format(preposition,
   "{:s} failure".format(failure) if failure
@@ -363,12 +373,12 @@ def test_bitmap_sync(bsync_mode, failure=None):
 ebitmap.compare(bitmap)
 reference_backup(drive0, 1, fbackup1)
 
-# 1 - Bitmap Backup (Optional induced failure)
+# 1 - Test Backup (w/ Optional induced failure)
 if failure == 'intermediate':
 # Activate blkdebug induced failure for second-to-next read
 log(vm.hmp_qemu_io(drive0.name, 'flush'))
 log('')
-job = bitmap_backup(drive0, 1, bsync1, "bitmap0", bsync_mode)
+job = backup(drive0, 1, bsync1, "bitmap0", bsync_mode, sync=msync_mode)
 
 def _callback():
 """Issue writes while the job is open to test bitmap divergence."""
@@ -409,7 +419,7 @@ def test_bitmap_sync(bsync_mode, failure=None):
 reference_backup(drive0, 2, fbackup2)
 
 # 2 - Bitmap Backup (In failure modes, this is a recovery.)
-job = bitmap_backup(drive0, 2, bsync2, "bitmap0", bsync_mode)
+job = backup(drive0, 2, bsync2, "bitmap0", bsync_mode)
 vm.run_job(job, auto_dismiss=True, auto_finalize=False)
 bitmaps = query_bitmaps(vm)
 log(bitmaps, indent=2)
@@ -443,7 +453,7 @@ def test_bitmap_sync(bsync_mode, failure=None):
 def main():
 for bsync_mode in ("never", "on-success", "always"):
 for failure in ("simulated", "intermediate", None):
-test_bitmap_sync(bsync_mode, failure)
+test_bitmap_sync(bsync_mode, "bitmap", failure)
 
 if __name__ == '__main__'

[Qemu-block] [PATCH 5/8] iotests/257: test API failures

2019-07-09 Thread John Snow

Signed-off-by: John Snow 
---
 tests/qemu-iotests/257 | 69 +++
 tests/qemu-iotests/257.out | 85 ++
 2 files changed, 154 insertions(+)

diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
index 2eb4f26c28..de8707cb19 100755
--- a/tests/qemu-iotests/257
+++ b/tests/qemu-iotests/257
@@ -450,10 +450,79 @@ def test_bitmap_sync(bsync_mode, msync_mode='bitmap', 
failure=None):
 compare_images(img_path, fbackup2)
 log('')
 
+def test_backup_api():
+"""
+"""
+with iotests.FilePaths(['img', 'bsync1']) as \
+ (img_path, backup_path), \
+ iotests.VM() as vm:
+
+log("\n=== API failure tests ===\n")
+log('--- Preparing image & VM ---\n')
+drive0 = Drive(img_path, vm=vm)
+drive0.img_create(iotests.imgfmt, SIZE)
+vm.add_device("{},id=scsi0".format(iotests.get_virtio_scsi_device()))
+vm.launch()
+
+file_config = {
+'driver': 'file',
+'filename': drive0.path
+}
+
+vm.qmp_log('blockdev-add',
+   filters=[iotests.filter_qmp_testfiles],
+   node_name="drive0",
+   driver=drive0.fmt,
+   file=file_config)
+drive0.node = 'drive0'
+drive0.device = 'device0'
+vm.qmp_log("device_add", id=drive0.device,
+   drive=drive0.name, driver="scsi-hd")
+log('')
+
+target0 = Drive(backup_path, vm=vm)
+target0.create_target("backup_target", drive0.fmt, drive0.size)
+log('')
+
+vm.qmp_log("block-dirty-bitmap-add", node=drive0.name,
+   name="bitmap0", granularity=GRANULARITY)
+log('')
+
+log('-- Testing invalid QMP commands --\n')
+
+error_cases = {
+'incremental': {
+None:['on-success', 'always', 'never', None],
+'bitmap404': ['on-success', 'always', 'never', None],
+'bitmap0':   ['always', 'never']
+},
+'bitmap': {
+None:['on-success', 'always', 'never', None],
+'bitmap404': ['on-success', 'always', 'never', None],
+'bitmap0':   [None],
+},
+}
+
+for sync_mode, config in error_cases.items():
+log("-- Sync mode {:s} tests --\n".format(sync_mode))
+for bitmap, policies in config.items():
+for policy in policies:
+kwargs = { 'job_id': 'api_job' }
+if bitmap is not None:
+kwargs['bitmap'] = bitmap
+if policy is not None:
+kwargs['bitmap-mode'] = policy
+blockdev_backup(drive0.vm, drive0.name, "backup_target",
+sync_mode, **kwargs)
+log('')
+
+
 def main():
 for bsync_mode in ("never", "on-success", "always"):
 for failure in ("simulated", "intermediate", None):
 test_bitmap_sync(bsync_mode, "bitmap", failure)
 
+test_backup_api()
+
 if __name__ == '__main__':
 iotests.script_main(main, supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/257.out b/tests/qemu-iotests/257.out
index 0abc96acd3..43f2e0f9c9 100644
--- a/tests/qemu-iotests/257.out
+++ b/tests/qemu-iotests/257.out
@@ -2245,3 +2245,88 @@ qemu_img compare "TEST_DIR/PID-bsync1" 
"TEST_DIR/PID-fbackup1" ==> Identical, OK
 qemu_img compare "TEST_DIR/PID-bsync2" "TEST_DIR/PID-fbackup2" ==> Identical, 
OK!
 qemu_img compare "TEST_DIR/PID-img" "TEST_DIR/PID-fbackup2" ==> Identical, OK!
 
+
+=== API failure tests ===
+
+--- Preparing image & VM ---
+
+{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": 
{"driver": "file", "filename": "TEST_DIR/PID-img"}, "node-name": "drive0"}}
+{"return": {}}
+{"execute": "device_add", "arguments": {"drive": "drive0", "driver": 
"scsi-hd", "id": "device0"}}
+{"return": {}}
+
+{}
+{"execute": "job-dismiss", "arguments": {"id": "bdc-file-job"}}
+{"return": {}}
+{}
+{}
+{"execute": "job-dismiss", "arguments": {"id": "bdc-fmt-job"}}
+{"return": {}}
+{}
+
+{"execute": "block-dirty-bitmap-add", "arguments": {"granularity": 65536, 
"name": "bitmap0", "node": "drive0"}}
+{"return": {}}
+
+-- Testing invalid QMP commands --
+
+-- Sync mode incremental tests --
+
+{"execute": "blockdev-backup", "arguments": {"bitmap-mode": "on-success", 
"device": "drive0", "job-id": "api_job", "sync": "incremental", "target": 
"backup_target"}}
+{"error": {"class": "GenericError", "desc": "must provide a valid bitmap name 
for 'incremental' sync mode"}}
+
+{"execute": "blockdev-backup", "arguments": {"bitmap-mode": "always", 
"device": "drive0", "job-id": "api_job", "sync": "incremental", "target": 
"backup_target"}}
+{"error": {"class": "GenericError", "desc": "must provide a valid bitmap name 
for 'incremental' sync mode"}}
+
+{"execute": "block

[Qemu-block] [PATCH 4/8] block/backup: hoist bitmap check into QMP interface

2019-07-09 Thread John Snow

This is nicer to do in the unified QMP interface that we have now,
because it lets us use the right terminology back at the user.

Signed-off-by: John Snow 
---
 block/backup.c | 13 -
 blockdev.c | 10 ++
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index e2729cf6fa..a64b768e24 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -566,6 +566,10 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 assert(bs);
 assert(target);
 
+/* QMP interface protects us from these cases */
+assert(sync_mode != MIRROR_SYNC_MODE_INCREMENTAL);
+assert(sync_bitmap || sync_mode != MIRROR_SYNC_MODE_BITMAP);
+
 if (bs == target) {
 error_setg(errp, "Source and target cannot be the same");
 return NULL;
@@ -597,16 +601,7 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 return NULL;
 }
 
-/* QMP interface should have handled translating this to bitmap mode */
-assert(sync_mode != MIRROR_SYNC_MODE_INCREMENTAL);
-
 if (sync_mode == MIRROR_SYNC_MODE_BITMAP) {
-if (!sync_bitmap) {
-error_setg(errp, "must provide a valid bitmap name for "
-   "'%s' sync mode", MirrorSyncMode_str(sync_mode));
-return NULL;
-}
-
 /* If we need to write to this bitmap, check that we can: */
 if (bitmap_mode != BITMAP_SYNC_MODE_NEVER &&
 bdrv_dirty_bitmap_check(sync_bitmap, BDRV_BITMAP_DEFAULT, errp)) {
diff --git a/blockdev.c b/blockdev.c
index 3e30bc2ca7..f0b7da53b0 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3464,6 +3464,16 @@ static BlockJob *do_backup_common(BackupCommon *backup,
 return NULL;
 }
 
+if ((backup->sync == MIRROR_SYNC_MODE_BITMAP) ||
+(backup->sync == MIRROR_SYNC_MODE_INCREMENTAL)) {
+/* done before desugaring 'incremental' to print the right message */
+if (!backup->has_bitmap) {
+error_setg(errp, "must provide a valid bitmap name for "
+   "'%s' sync mode", MirrorSyncMode_str(backup->sync));
+return NULL;
+}
+}
+
 if (backup->sync == MIRROR_SYNC_MODE_INCREMENTAL) {
 if (backup->has_bitmap_mode &&
 backup->bitmap_mode != BITMAP_SYNC_MODE_ON_SUCCESS) {
-- 
2.21.0

[Qemu-block] [PATCH 7/8] block/backup: support bitmap sync modes for non-bitmap backups

2019-07-09 Thread John Snow

Accept bitmaps and sync policies for the other backup modes.
This allows us to do things like create a bitmap synced to a full backup
without a transaction, or start a resumable backup process.

Some combinations don't make sense, though:

- NEVER policy combined with any non-BITMAP mode doesn't do anything,
  because the bitmap isn't used for input or output.
  It's harmless, but is almost certainly never what the user wanted.

- sync=NONE is more questionable. It can't use on-success because this
  job never completes with success anyway, and the resulting artifact
  of 'always' is suspect: because we start with a full bitmap and only
  copy out segments that get written to, the final output bitmap will
  always be ... a fully set bitmap.

  Maybe there's contexts in which bitmaps make sense for sync=none,
  but not without more severe changes to the current job, and omitting
  it here doesn't prevent us from adding it later.

Signed-off-by: John Snow 
---
 block/backup.c   |  8 +---
 blockdev.c   | 22 ++
 qapi/block-core.json |  6 --
 3 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 38c4a688c6..b94ed595ba 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -602,7 +602,7 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 return NULL;
 }
 
-if (sync_mode == MIRROR_SYNC_MODE_BITMAP) {
+if (sync_bitmap) {
 /* If we need to write to this bitmap, check that we can: */
 if (bitmap_mode != BITMAP_SYNC_MODE_NEVER &&
 bdrv_dirty_bitmap_check(sync_bitmap, BDRV_BITMAP_DEFAULT, errp)) {
@@ -613,12 +613,6 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap, errp) < 0) {
 return NULL;
 }
-} else if (sync_bitmap) {
-error_setg(errp,
-   "a bitmap was given to backup_job_create, "
-   "but it received an incompatible sync_mode (%s)",
-   MirrorSyncMode_str(sync_mode));
-return NULL;
 }
 
 len = bdrv_getlength(bs);
diff --git a/blockdev.c b/blockdev.c
index f0b7da53b0..bc152f8e0d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3502,6 +3502,28 @@ static BlockJob *do_backup_common(BackupCommon *backup,
 if (bdrv_dirty_bitmap_check(bmap, BDRV_BITMAP_ALLOW_RO, errp)) {
 return NULL;
 }
+
+/* This does not produce a useful bitmap artifact: */
+if (backup->sync == MIRROR_SYNC_MODE_NONE) {
+error_setg(errp, "sync mode '%s' does not produce meaningful 
bitmap"
+   " outputs", MirrorSyncMode_str(backup->sync));
+return NULL;
+}
+
+/* If the bitmap isn't used for input or output, this is useless: */
+if (backup->bitmap_mode == BITMAP_SYNC_MODE_NEVER &&
+backup->sync != MIRROR_SYNC_MODE_BITMAP) {
+error_setg(errp, "Bitmap sync mode '%s' has no meaningful effect"
+   " when combined with sync mode '%s'",
+   BitmapSyncMode_str(backup->bitmap_mode),
+   MirrorSyncMode_str(backup->sync));
+return NULL;
+}
+}
+
+if (!backup->has_bitmap && backup->has_bitmap_mode) {
+error_setg(errp, "Cannot specify Bitmap sync mode without a bitmap");
+return NULL;
 }
 
 if (!backup->auto_finalize) {
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 5a578806c5..099e4f37b2 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1352,13 +1352,15 @@
 # @speed: the maximum speed, in bytes per second. The default is 0,
 # for unlimited.
 #
-# @bitmap: the name of a dirty bitmap if sync is "bitmap" or "incremental".
+# @bitmap: The name of a dirty bitmap to use.
 #  Must be present if sync is "bitmap" or "incremental".
+#  Can be present if sync is "full" or "top".
 #  Must not be present otherwise.
 #  (Since 2.4 (drive-backup), 3.1 (blockdev-backup))
 #
 # @bitmap-mode: Specifies the type of data the bitmap should contain after
-#   the operation concludes. Must be present if sync is "bitmap".
+#   the operation concludes.
+#   Must be present if a bitmap was provided,
 #   Must NOT be present otherwise. (Since 4.2)
 #
 # @compress: true to compress data, if the target format supports it.
-- 
2.21.0

[Qemu-block] [PATCH 0/8] bitmaps: allow bitmaps to be used with full and top

2019-07-09 Thread John Snow

Requires: <20190709232550.10724-1-js...@redhat.com>
[PATCH v4 00/18] bitmaps: introduce 'bitmap' sync mode

This follows the previous series which adds the 'bitmap' sync mode
and uses it to add interactions with bitmaps to the 'full' and 'top'
modes to blockdev-backup and drive-backup.

Why?
 on-success: Can conveniently synchronize a bitmap to a full backup.
 Allows for transactionless anchor backups.
 Allows us to attempt an anchor backup without damaging
   our bitmap until the backup is successful.
 Allows for transactional, ungrouped anchor backups.
 always: Allows us to resume full/top style backups with a later
 invocation to sync=bitmap. Neat!

Summary:
1-3: Refactor iotest 257 to accommodate this;
4-5: Augment 257 to test trivial failure cases
6-7: Implement feature
8: Test new feature

John Snow (8):
  iotests/257: add Pattern class
  iotests/257: add EmulatedBitmap class
  iotests/257: Refactor backup helpers
  block/backup: hoist bitmap check into QMP interface
  iotests/257: test API failures
  block/backup: issue progress updates for skipped regions
  block/backup: support bitmap sync modes for non-bitmap backups
  iotests/257: test traditional sync modes

 block/backup.c |   22 +-
 blockdev.c |   32 +
 qapi/block-core.json   |6 +-
 tests/qemu-iotests/257 |  329 +++-
 tests/qemu-iotests/257.out | 3366 +++-
 5 files changed, 3548 insertions(+), 207 deletions(-)

-- 
2.21.0

[Qemu-block] [PATCH 2/8] iotests/257: add EmulatedBitmap class

2019-07-09 Thread John Snow

Represent a bitmap with an object that we can mark and clear bits in.
This makes it easier to manage partial writes when we don't write a
full group's worth of patterns before an error.

Signed-off-by: John Snow 
---
 tests/qemu-iotests/257 | 125 +
 1 file changed, 76 insertions(+), 49 deletions(-)

diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
index f576a35a5c..2ff4aa8695 100755
--- a/tests/qemu-iotests/257
+++ b/tests/qemu-iotests/257
@@ -85,6 +85,60 @@ GROUPS = [
 Pattern('0xdd', 0x3fc)]), # New; leaving a gap to the right
 ]
 
+
+class EmulatedBitmap:
+def __init__(self, granularity=GRANULARITY):
+self._bits = set()
+self.groups = set()
+self.granularity = granularity
+
+def dirty_bits(self, bits):
+self._bits |= set(bits)
+
+def dirty_group(self, n):
+self.dirty_bits(GROUPS[n].bits(self.granularity))
+
+def clear(self):
+self._bits = set()
+
+def clear_bits(self, bits):
+self._bits = self._bits - set(bits)
+
+def clear_bit(self, bit):
+self.clear_bits({bit})
+
+def clear_group(self, n):
+self.clear_bits(GROUPS[n].bits(self.granularity))
+
+@property
+def first_bit(self):
+return sorted(self.bits)[0]
+
+@property
+def bits(self):
+return self._bits
+
+@property
+def count(self):
+return len(self.bits)
+
+def compare(self, qmp_bitmap):
+"""
+Print a nice human-readable message checking that a bitmap as reported
+by the QMP interface has as many bits set as we expect it to.
+"""
+
+name = qmp_bitmap.get('name', '(anonymous)')
+log("= Checking Bitmap {:s} =".format(name))
+
+want = self.count
+have = qmp_bitmap['count'] // qmp_bitmap['granularity']
+
+log("expecting {:d} dirty sectors; have {:d}. {:s}".format(
+want, have, "OK!" if want == have else "ERROR!"))
+log('')
+
+
 class Drive:
 """Represents, vaguely, a drive attached to a VM.
 Includes format, graph, and device information."""
@@ -195,27 +249,6 @@ def perform_writes(drive, n):
 log('')
 return bitmaps
 
-def calculate_bits(groups=None):
-"""Calculate how many bits we expect to see dirtied."""
-if groups:
-bits = set.union(*(GROUPS[group].bits(GRANULARITY) for group in 
groups))
-return len(bits)
-return 0
-
-def bitmap_comparison(bitmap, groups=None, want=0):
-"""
-Print a nice human-readable message checking that this bitmap has as
-many bits set as we expect it to.
-"""
-log("= Checking Bitmap {:s} =".format(bitmap.get('name', '(anonymous)')))
-
-if groups:
-want = calculate_bits(groups)
-have = bitmap['count'] // bitmap['granularity']
-
-log("expecting {:d} dirty sectors; have {:d}. {:s}".format(
-want, have, "OK!" if want == have else "ERROR!"))
-log('')
 
 def compare_images(image, reference, baseimg=None, expected_match=True):
 """
@@ -321,12 +354,13 @@ def test_bitmap_sync(bsync_mode, failure=None):
 vm.qmp_log("block-dirty-bitmap-add", node=drive0.name,
name="bitmap0", granularity=GRANULARITY)
 log('')
+ebitmap = EmulatedBitmap()
 
 # 1 - Writes and Reference Backup
 bitmaps = perform_writes(drive0, 1)
-dirty_groups = {1}
+ebitmap.dirty_group(1)
 bitmap = get_bitmap(bitmaps, drive0.device, 'bitmap0')
-bitmap_comparison(bitmap, groups=dirty_groups)
+ebitmap.compare(bitmap)
 reference_backup(drive0, 1, fbackup1)
 
 # 1 - Bitmap Backup (Optional induced failure)
@@ -342,54 +376,47 @@ def test_bitmap_sync(bsync_mode, failure=None):
 log('')
 bitmaps = perform_writes(drive0, 2)
 # Named bitmap (static, should be unchanged)
-bitmap_comparison(get_bitmap(bitmaps, drive0.device, 'bitmap0'),
-  groups=dirty_groups)
+ebitmap.compare(get_bitmap(bitmaps, drive0.device, 'bitmap0'))
 # Anonymous bitmap (dynamic, shows new writes)
-bitmap_comparison(get_bitmap(bitmaps, drive0.device, '',
- recording=True), groups={2})
-dirty_groups.add(2)
+anonymous = EmulatedBitmap()
+anonymous.dirty_group(2)
+anonymous.compare(get_bitmap(bitmaps, drive0.device, '',
+ recording=True))
+
+# Simulate the order in which this will happen:
+# group 1 gets cleared first, then group two gets written.
+if ((bsync_mode == 'on-success' and not failure) or
+(bsync_mode == 'always')):
+ebitmap.clear_group(1)
+ebitmap.dirty_group(2)
 
 vm.run_job(job, auto_dismiss=True, auto_finalize=False,
pre_finalize=_callback,

[Qemu-block] [PATCH 1/8] iotests/257: add Pattern class

2019-07-09 Thread John Snow

Just kidding, this is easier to manage with a full class instead of a
namedtuple.

Signed-off-by: John Snow 
---
 tests/qemu-iotests/257 | 58 +++---
 1 file changed, 32 insertions(+), 26 deletions(-)

diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
index 75a651c7c3..f576a35a5c 100755
--- a/tests/qemu-iotests/257
+++ b/tests/qemu-iotests/257
@@ -19,7 +19,6 @@
 #
 # owner=js...@redhat.com
 
-from collections import namedtuple
 import math
 import os
 
@@ -29,10 +28,18 @@ from iotests import log, qemu_img
 SIZE = 64 * 1024 * 1024
 GRANULARITY = 64 * 1024
 
-Pattern = namedtuple('Pattern', ['byte', 'offset', 'size'])
-def mkpattern(byte, offset, size=GRANULARITY):
-"""Constructor for Pattern() with default size"""
-return Pattern(byte, offset, size)
+
+class Pattern:
+def __init__(self, byte, offset, size=GRANULARITY):
+self.byte = byte
+self.offset = offset
+self.size = size
+
+def bits(self, granularity):
+lower = math.floor(self.offset / granularity)
+upper = math.floor((self.offset + self.size - 1) / granularity)
+return set(range(lower, upper + 1))
+
 
 class PatternGroup:
 """Grouping of Pattern objects. Initialize with an iterable of Patterns."""
@@ -43,40 +50,39 @@ class PatternGroup:
 """Calculate the unique bits dirtied by this pattern grouping"""
 res = set()
 for pattern in self.patterns:
-lower = math.floor(pattern.offset / granularity)
-upper = math.floor((pattern.offset + pattern.size - 1) / 
granularity)
-res = res | set(range(lower, upper + 1))
+res = res | pattern.bits(granularity)
 return res
 
+
 GROUPS = [
 PatternGroup([
 # Batch 0: 4 clusters
-mkpattern('0x49', 0x000),
-mkpattern('0x6c', 0x010),   # 1M
-mkpattern('0x6f', 0x200),   # 32M
-mkpattern('0x76', 0x3ff)]), # 64M - 64K
+Pattern('0x49', 0x000),
+Pattern('0x6c', 0x010),   # 1M
+Pattern('0x6f', 0x200),   # 32M
+Pattern('0x76', 0x3ff)]), # 64M - 64K
 PatternGroup([
 # Batch 1: 6 clusters (3 new)
-mkpattern('0x65', 0x000),   # Full overwrite
-mkpattern('0x77', 0x00f8000),   # Partial-left (1M-32K)
-mkpattern('0x72', 0x2008000),   # Partial-right (32M+32K)
-mkpattern('0x69', 0x3fe)]), # Adjacent-left (64M - 128K)
+Pattern('0x65', 0x000),   # Full overwrite
+Pattern('0x77', 0x00f8000),   # Partial-left (1M-32K)
+Pattern('0x72', 0x2008000),   # Partial-right (32M+32K)
+Pattern('0x69', 0x3fe)]), # Adjacent-left (64M - 128K)
 PatternGroup([
 # Batch 2: 7 clusters (3 new)
-mkpattern('0x74', 0x001),   # Adjacent-right
-mkpattern('0x69', 0x00e8000),   # Partial-left  (1M-96K)
-mkpattern('0x6e', 0x2018000),   # Partial-right (32M+96K)
-mkpattern('0x67', 0x3fe,
-  2*GRANULARITY)]), # Overwrite [(64M-128K)-64M)
+Pattern('0x74', 0x001),   # Adjacent-right
+Pattern('0x69', 0x00e8000),   # Partial-left  (1M-96K)
+Pattern('0x6e', 0x2018000),   # Partial-right (32M+96K)
+Pattern('0x67', 0x3fe,
+2*GRANULARITY)]), # Overwrite [(64M-128K)-64M)
 PatternGroup([
 # Batch 3: 8 clusters (5 new)
 # Carefully chosen such that nothing re-dirties the one cluster
 # that copies out successfully before failure in Group #1.
-mkpattern('0xaa', 0x001,
-  3*GRANULARITY),   # Overwrite and 2x Adjacent-right
-mkpattern('0xbb', 0x00d8000),   # Partial-left (1M-160K)
-mkpattern('0xcc', 0x2028000),   # Partial-right (32M+160K)
-mkpattern('0xdd', 0x3fc)]), # New; leaving a gap to the right
+Pattern('0xaa', 0x001,
+3*GRANULARITY),   # Overwrite and 2x Adjacent-right
+Pattern('0xbb', 0x00d8000),   # Partial-left (1M-160K)
+Pattern('0xcc', 0x2028000),   # Partial-right (32M+160K)
+Pattern('0xdd', 0x3fc)]), # New; leaving a gap to the right
 ]
 
 class Drive:
-- 
2.21.0

[Qemu-block] [PATCH v4 09/18] block/dirty-bitmap: add bdrv_dirty_bitmap_merge_internal

2019-07-09 Thread John Snow

I'm surprised it didn't come up sooner, but sometimes we have a +busy
bitmap as a source. This is dangerous from the QMP API, but if we are
the owner that marked the bitmap busy, it's safe to merge it using it as
a read only source.

It is not safe in the general case to allow users to read from in-use
bitmaps, so create an internal variant that foregoes the safety
checking.

Signed-off-by: John Snow 
---
 block/dirty-bitmap.c  | 54 +++
 include/block/block_int.h |  3 +++
 2 files changed, 52 insertions(+), 5 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 95a9c2a5d8..7881fea684 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -810,6 +810,12 @@ bool bdrv_dirty_bitmap_next_dirty_area(BdrvDirtyBitmap 
*bitmap,
 return hbitmap_next_dirty_area(bitmap->bitmap, offset, bytes);
 }
 
+/**
+ * bdrv_merge_dirty_bitmap: merge src into dest.
+ * Ensures permissions on bitmaps are reasonable; use for public API.
+ *
+ * @backup: If provided, make a copy of dest here prior to merge.
+ */
 void bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const BdrvDirtyBitmap *src,
  HBitmap **backup, Error **errp)
 {
@@ -833,6 +839,42 @@ void bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const 
BdrvDirtyBitmap *src,
 goto out;
 }
 
+ret = bdrv_dirty_bitmap_merge_internal(dest, src, backup, false);
+assert(ret);
+
+out:
+qemu_mutex_unlock(dest->mutex);
+if (src->mutex != dest->mutex) {
+qemu_mutex_unlock(src->mutex);
+}
+}
+
+/**
+ * bdrv_dirty_bitmap_merge_internal: merge src into dest.
+ * Does NOT check bitmap permissions; not suitable for use as public API.
+ *
+ * @backup: If provided, make a copy of dest here prior to merge.
+ * @lock: If true, lock and unlock bitmaps on the way in/out.
+ * returns true if the merge succeeded; false if unattempted.
+ */
+bool bdrv_dirty_bitmap_merge_internal(BdrvDirtyBitmap *dest,
+  const BdrvDirtyBitmap *src,
+  HBitmap **backup,
+  bool lock)
+{
+bool ret;
+
+assert(!bdrv_dirty_bitmap_readonly(dest));
+assert(!bdrv_dirty_bitmap_inconsistent(dest));
+assert(!bdrv_dirty_bitmap_inconsistent(src));
+
+if (lock) {
+qemu_mutex_lock(dest->mutex);
+if (src->mutex != dest->mutex) {
+qemu_mutex_lock(src->mutex);
+}
+}
+
 if (backup) {
 *backup = dest->bitmap;
 dest->bitmap = hbitmap_alloc(dest->size, hbitmap_granularity(*backup));
@@ -840,11 +882,13 @@ void bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const 
BdrvDirtyBitmap *src,
 } else {
 ret = hbitmap_merge(dest->bitmap, src->bitmap, dest->bitmap);
 }
-assert(ret);
 
-out:
-qemu_mutex_unlock(dest->mutex);
-if (src->mutex != dest->mutex) {
-qemu_mutex_unlock(src->mutex);
+if (lock) {
+qemu_mutex_unlock(dest->mutex);
+if (src->mutex != dest->mutex) {
+qemu_mutex_unlock(src->mutex);
+}
 }
+
+return ret;
 }
diff --git a/include/block/block_int.h b/include/block/block_int.h
index e1f2aa627e..83ffdf4950 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1238,6 +1238,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t offset, 
int64_t bytes);
 
 void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out);
 void bdrv_restore_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *backup);
+bool bdrv_dirty_bitmap_merge_internal(BdrvDirtyBitmap *dest,
+  const BdrvDirtyBitmap *src,
+  HBitmap **backup, bool lock);
 
 void bdrv_inc_in_flight(BlockDriverState *bs);
 void bdrv_dec_in_flight(BlockDriverState *bs);
-- 
2.21.0

[Qemu-block] [PATCH v4 18/18] block/backup: loosen restriction on readonly bitmaps

2019-07-09 Thread John Snow

With the "never" sync policy, we actually can utilize readonly bitmaps
now. Loosen the check at the QMP level, and tighten it based on
provided arguments down at the job creation level instead.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block/backup.c | 6 ++
 blockdev.c | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/block/backup.c b/block/backup.c
index acfe87b756..e2729cf6fa 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -607,6 +607,12 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 return NULL;
 }
 
+/* If we need to write to this bitmap, check that we can: */
+if (bitmap_mode != BITMAP_SYNC_MODE_NEVER &&
+bdrv_dirty_bitmap_check(sync_bitmap, BDRV_BITMAP_DEFAULT, errp)) {
+return NULL;
+}
+
 /* Create a new bitmap, and freeze/disable this one. */
 if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap, errp) < 0) {
 return NULL;
diff --git a/blockdev.c b/blockdev.c
index 5dfaa976c9..3e30bc2ca7 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3489,7 +3489,7 @@ static BlockJob *do_backup_common(BackupCommon *backup,
"when providing a bitmap");
 return NULL;
 }
-if (bdrv_dirty_bitmap_check(bmap, BDRV_BITMAP_DEFAULT, errp)) {
+if (bdrv_dirty_bitmap_check(bmap, BDRV_BITMAP_ALLOW_RO, errp)) {
 return NULL;
 }
 }
-- 
2.21.0

[Qemu-block] [PATCH v4 17/18] iotests: add test 257 for bitmap-mode backups

2019-07-09 Thread John Snow

Signed-off-by: John Snow 
---
 tests/qemu-iotests/257 |  416 +++
 tests/qemu-iotests/257.out | 2247 
 tests/qemu-iotests/group   |1 +
 3 files changed, 2664 insertions(+)
 create mode 100755 tests/qemu-iotests/257
 create mode 100644 tests/qemu-iotests/257.out

diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
new file mode 100755
index 00..75a651c7c3
--- /dev/null
+++ b/tests/qemu-iotests/257
@@ -0,0 +1,416 @@
+#!/usr/bin/env python
+#
+# Test bitmap-sync backups (incremental, differential, and partials)
+#
+# Copyright (c) 2019 John Snow for Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+# owner=js...@redhat.com
+
+from collections import namedtuple
+import math
+import os
+
+import iotests
+from iotests import log, qemu_img
+
+SIZE = 64 * 1024 * 1024
+GRANULARITY = 64 * 1024
+
+Pattern = namedtuple('Pattern', ['byte', 'offset', 'size'])
+def mkpattern(byte, offset, size=GRANULARITY):
+"""Constructor for Pattern() with default size"""
+return Pattern(byte, offset, size)
+
+class PatternGroup:
+"""Grouping of Pattern objects. Initialize with an iterable of Patterns."""
+def __init__(self, patterns):
+self.patterns = patterns
+
+def bits(self, granularity):
+"""Calculate the unique bits dirtied by this pattern grouping"""
+res = set()
+for pattern in self.patterns:
+lower = math.floor(pattern.offset / granularity)
+upper = math.floor((pattern.offset + pattern.size - 1) / 
granularity)
+res = res | set(range(lower, upper + 1))
+return res
+
+GROUPS = [
+PatternGroup([
+# Batch 0: 4 clusters
+mkpattern('0x49', 0x000),
+mkpattern('0x6c', 0x010),   # 1M
+mkpattern('0x6f', 0x200),   # 32M
+mkpattern('0x76', 0x3ff)]), # 64M - 64K
+PatternGroup([
+# Batch 1: 6 clusters (3 new)
+mkpattern('0x65', 0x000),   # Full overwrite
+mkpattern('0x77', 0x00f8000),   # Partial-left (1M-32K)
+mkpattern('0x72', 0x2008000),   # Partial-right (32M+32K)
+mkpattern('0x69', 0x3fe)]), # Adjacent-left (64M - 128K)
+PatternGroup([
+# Batch 2: 7 clusters (3 new)
+mkpattern('0x74', 0x001),   # Adjacent-right
+mkpattern('0x69', 0x00e8000),   # Partial-left  (1M-96K)
+mkpattern('0x6e', 0x2018000),   # Partial-right (32M+96K)
+mkpattern('0x67', 0x3fe,
+  2*GRANULARITY)]), # Overwrite [(64M-128K)-64M)
+PatternGroup([
+# Batch 3: 8 clusters (5 new)
+# Carefully chosen such that nothing re-dirties the one cluster
+# that copies out successfully before failure in Group #1.
+mkpattern('0xaa', 0x001,
+  3*GRANULARITY),   # Overwrite and 2x Adjacent-right
+mkpattern('0xbb', 0x00d8000),   # Partial-left (1M-160K)
+mkpattern('0xcc', 0x2028000),   # Partial-right (32M+160K)
+mkpattern('0xdd', 0x3fc)]), # New; leaving a gap to the right
+]
+
+class Drive:
+"""Represents, vaguely, a drive attached to a VM.
+Includes format, graph, and device information."""
+
+def __init__(self, path, vm=None):
+self.path = path
+self.vm = vm
+self.fmt = None
+self.size = None
+self.node = None
+self.device = None
+
+@property
+def name(self):
+return self.node or self.device
+
+def img_create(self, fmt, size):
+self.fmt = fmt
+self.size = size
+iotests.qemu_img_create('-f', self.fmt, self.path, str(self.size))
+
+def create_target(self, name, fmt, size):
+basename = os.path.basename(self.path)
+file_node_name = "file_{}".format(basename)
+vm = self.vm
+
+log(vm.command('blockdev-create', job_id='bdc-file-job',
+   options={
+   'driver': 'file',
+   'filename': self.path,
+   'size': 0,
+   }))
+vm.run_job('bdc-file-job')
+log(vm.command('blockdev-add', driver='file',
+   node_name=file_node_name, filename=self.path))
+
+log(vm.command('blockdev-create', job_id='bdc-fmt-job',
+   options={
+   'drive

[Qemu-block] [PATCH v4 16/18] iotests: Add virtio-scsi device helper

2019-07-09 Thread John Snow

Seems that it comes up enough.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/040| 6 +-
 tests/qemu-iotests/093| 6 ++
 tests/qemu-iotests/139| 7 ++-
 tests/qemu-iotests/238| 5 +
 tests/qemu-iotests/iotests.py | 4 
 5 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
index b81133a474..657b37103c 100755
--- a/tests/qemu-iotests/040
+++ b/tests/qemu-iotests/040
@@ -85,11 +85,7 @@ class TestSingleDrive(ImageCommitTestCase):
 qemu_io('-f', 'raw', '-c', 'write -P 0xab 0 524288', backing_img)
 qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0xef 524288 524288', 
mid_img)
 self.vm = iotests.VM().add_drive(test_img, 
"node-name=top,backing.node-name=mid,backing.backing.node-name=base", 
interface="none")
-if iotests.qemu_default_machine == 's390-ccw-virtio':
-self.vm.add_device("virtio-scsi-ccw")
-else:
-self.vm.add_device("virtio-scsi-pci")
-
+self.vm.add_device(iotests.get_virtio_scsi_device())
 self.vm.add_device("scsi-hd,id=scsi0,drive=drive0")
 self.vm.launch()
 
diff --git a/tests/qemu-iotests/093 b/tests/qemu-iotests/093
index d88fbc182e..46153220f8 100755
--- a/tests/qemu-iotests/093
+++ b/tests/qemu-iotests/093
@@ -366,10 +366,8 @@ class ThrottleTestGroupNames(iotests.QMPTestCase):
 class ThrottleTestRemovableMedia(iotests.QMPTestCase):
 def setUp(self):
 self.vm = iotests.VM()
-if iotests.qemu_default_machine == 's390-ccw-virtio':
-self.vm.add_device("virtio-scsi-ccw,id=virtio-scsi")
-else:
-self.vm.add_device("virtio-scsi-pci,id=virtio-scsi")
+self.vm.add_device("{},id=virtio-scsi".format(
+iotests.get_virtio_scsi_device()))
 self.vm.launch()
 
 def tearDown(self):
diff --git a/tests/qemu-iotests/139 b/tests/qemu-iotests/139
index 933b45121a..2176ea51ba 100755
--- a/tests/qemu-iotests/139
+++ b/tests/qemu-iotests/139
@@ -35,11 +35,8 @@ class TestBlockdevDel(iotests.QMPTestCase):
 def setUp(self):
 iotests.qemu_img('create', '-f', iotests.imgfmt, base_img, '1M')
 self.vm = iotests.VM()
-if iotests.qemu_default_machine == 's390-ccw-virtio':
-self.vm.add_device("virtio-scsi-ccw,id=virtio-scsi")
-else:
-self.vm.add_device("virtio-scsi-pci,id=virtio-scsi")
-
+self.vm.add_device("{},id=virtio-scsi".format(
+iotests.get_virtio_scsi_device()))
 self.vm.launch()
 
 def tearDown(self):
diff --git a/tests/qemu-iotests/238 b/tests/qemu-iotests/238
index 1c0a46fa90..387a77b2cd 100755
--- a/tests/qemu-iotests/238
+++ b/tests/qemu-iotests/238
@@ -23,10 +23,7 @@ import os
 import iotests
 from iotests import log
 
-if iotests.qemu_default_machine == 's390-ccw-virtio':
-virtio_scsi_device = 'virtio-scsi-ccw'
-else:
-virtio_scsi_device = 'virtio-scsi-pci'
+virtio_scsi_device = iotests.get_virtio_scsi_device()
 
 vm = iotests.VM()
 vm.launch()
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 6135c9663d..8ae7bc353e 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -164,6 +164,10 @@ def qemu_io_silent(*args):
  (-exitcode, ' '.join(args)))
 return exitcode
 
+def get_virtio_scsi_device():
+if qemu_default_machine == 's390-ccw-virtio':
+return 'virtio-scsi-ccw'
+return 'virtio-scsi-pci'
 
 class QemuIoInteractive:
 def __init__(self, *args):
-- 
2.21.0

[Qemu-block] [PATCH v4 05/18] block/backup: Add mirror sync mode 'bitmap'

2019-07-09 Thread John Snow

We don't need or want a new sync mode for simple differences in
semantics.  Create a new mode simply named "BITMAP" that is designed to
make use of the new Bitmap Sync Mode field.

Because the only bitmap sync mode is 'on-success', this adds no new
functionality to the backup job (yet). The old incremental backup mode
is maintained as a syntactic sugar for sync=bitmap, mode=on-success.

Add all of the plumbing necessary to support this new instruction.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block/backup.c| 20 
 block/mirror.c|  6 --
 block/replication.c   |  2 +-
 blockdev.c| 25 +++--
 include/block/block_int.h |  4 +++-
 qapi/block-core.json  | 21 +++--
 6 files changed, 58 insertions(+), 20 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 715e1d3be8..996941fa61 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -38,9 +38,9 @@ typedef struct CowRequest {
 typedef struct BackupBlockJob {
 BlockJob common;
 BlockBackend *target;
-/* bitmap for sync=incremental */
 BdrvDirtyBitmap *sync_bitmap;
 MirrorSyncMode sync_mode;
+BitmapSyncMode bitmap_mode;
 BlockdevOnError on_source_error;
 BlockdevOnError on_target_error;
 CoRwlock flush_rwlock;
@@ -452,7 +452,7 @@ static int coroutine_fn backup_run(Job *job, Error **errp)
 
 job_progress_set_remaining(job, s->len);
 
-if (s->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+if (s->sync_mode == MIRROR_SYNC_MODE_BITMAP) {
 backup_incremental_init_copy_bitmap(s);
 } else {
 hbitmap_set(s->copy_bitmap, 0, s->len);
@@ -536,6 +536,7 @@ static int64_t 
backup_calculate_cluster_size(BlockDriverState *target,
 BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
   BlockDriverState *target, int64_t speed,
   MirrorSyncMode sync_mode, BdrvDirtyBitmap *sync_bitmap,
+  BitmapSyncMode bitmap_mode,
   bool compress,
   BlockdevOnError on_source_error,
   BlockdevOnError on_target_error,
@@ -583,10 +584,13 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 return NULL;
 }
 
-if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+/* QMP interface should have handled translating this to bitmap mode */
+assert(sync_mode != MIRROR_SYNC_MODE_INCREMENTAL);
+
+if (sync_mode == MIRROR_SYNC_MODE_BITMAP) {
 if (!sync_bitmap) {
 error_setg(errp, "must provide a valid bitmap name for "
- "\"incremental\" sync mode");
+   "'%s' sync mode", MirrorSyncMode_str(sync_mode));
 return NULL;
 }
 
@@ -596,8 +600,8 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 }
 } else if (sync_bitmap) {
 error_setg(errp,
-   "a sync_bitmap was provided to backup_run, "
-   "but received an incompatible sync_mode (%s)",
+   "a bitmap was given to backup_job_create, "
+   "but it received an incompatible sync_mode (%s)",
MirrorSyncMode_str(sync_mode));
 return NULL;
 }
@@ -639,8 +643,8 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 job->on_source_error = on_source_error;
 job->on_target_error = on_target_error;
 job->sync_mode = sync_mode;
-job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
-   sync_bitmap : NULL;
+job->sync_bitmap = sync_bitmap;
+job->bitmap_mode = bitmap_mode;
 job->compress = compress;
 
 /* Detect image-fleecing (and similar) schemes */
diff --git a/block/mirror.c b/block/mirror.c
index 2fcec70e35..75c8f38c6a 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1717,8 +1717,10 @@ void mirror_start(const char *job_id, BlockDriverState 
*bs,
 bool is_none_mode;
 BlockDriverState *base;
 
-if (mode == MIRROR_SYNC_MODE_INCREMENTAL) {
-error_setg(errp, "Sync mode 'incremental' not supported");
+if ((mode == MIRROR_SYNC_MODE_INCREMENTAL) ||
+(mode == MIRROR_SYNC_MODE_BITMAP)) {
+error_setg(errp, "Sync mode '%s' not supported",
+   MirrorSyncMode_str(mode));
 return;
 }
 is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
diff --git a/block/replication.c b/block/replication.c
index 23b2993d74..936b2f8b5a 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -543,7 +543,7 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 
 s->backup_job = backup_job_create(
 NULL, s->secondary_disk->bs, 
s->hidden_disk->bs,
-0, MIRROR_SYNC_MODE_NONE, NULL, false,
+0, MIRROR_SYNC_MODE_NONE, NULL, 0, false,
 B

[Qemu-block] [PATCH v4 12/18] block/backup: add 'always' bitmap sync policy

2019-07-09 Thread John Snow

This adds an "always" policy for bitmap synchronization. Regardless of if
the job succeeds or fails, the bitmap is *always* synchronized. This means
that for backups that fail part-way through, the bitmap retains a record of
which sectors need to be copied out to accomplish a new backup using the
old, partial result.

In effect, this allows us to "resume" a failed backup; however the new backup
will be from the new point in time, so it isn't a "resume" as much as it is
an "incremental retry." This can be useful in the case of extremely large
backups that fail considerably through the operation and we'd like to not waste
the work that was already performed.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block/backup.c   | 27 +++
 qapi/block-core.json |  5 -
 2 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index fb1b39c44e..acfe87b756 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -268,18 +268,29 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob 
*job, int ret)
 {
 BdrvDirtyBitmap *bm;
 BlockDriverState *bs = blk_bs(job->common.blk);
+bool sync = (((ret == 0) || (job->bitmap_mode == BITMAP_SYNC_MODE_ALWAYS)) 
\
+ && (job->bitmap_mode != BITMAP_SYNC_MODE_NEVER));
 
-if (ret < 0 || job->bitmap_mode == BITMAP_SYNC_MODE_NEVER) {
+if (sync) {
 /*
- * Failure, or we don't want to synchronize the bitmap.
- * Merge the successor back into the parent, delete nothing.
+ * We succeeded, or we always intended to sync the bitmap.
+ * Delete this bitmap and install the child.
  */
-bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
-assert(bm);
-} else {
-/* Everything is fine, delete this bitmap and install the backup. */
 bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
-assert(bm);
+} else {
+/*
+ * We failed, or we never intended to sync the bitmap anyway.
+ * Merge the successor back into the parent, keeping all data.
+ */
+bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
+}
+
+assert(bm);
+
+if (ret < 0 && job->bitmap_mode == BITMAP_SYNC_MODE_ALWAYS) {
+/* If we failed and synced, merge in the bits we didn't copy: */
+bdrv_dirty_bitmap_merge_internal(bm, job->copy_bitmap,
+ NULL, true);
 }
 }
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index b1a98e..5a578806c5 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1149,10 +1149,13 @@
 # @never: The bitmap is never synchronized with the operation, and is
 # treated solely as a read-only manifest of blocks to copy.
 #
+# @always: The bitmap is always synchronized with the operation,
+#  regardless of whether or not the operation was successful.
+#
 # Since: 4.2
 ##
 { 'enum': 'BitmapSyncMode',
-  'data': ['on-success', 'never'] }
+  'data': ['on-success', 'never', 'always'] }
 
 ##
 # @MirrorCopyMode:
-- 
2.21.0

[Qemu-block] [PATCH v4 02/18] drive-backup: create do_backup_common

2019-07-09 Thread John Snow

Create a common core that comprises the actual meat of what the backup API
boundary needs to do, and then switch drive-backup to use it.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 blockdev.c | 122 +
 1 file changed, 67 insertions(+), 55 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 4d141e9a1f..5bc8ecd087 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3425,6 +3425,70 @@ out:
 aio_context_release(aio_context);
 }
 
+/* Common QMP interface for drive-backup and blockdev-backup */
+static BlockJob *do_backup_common(BackupCommon *backup,
+  BlockDriverState *bs,
+  BlockDriverState *target_bs,
+  AioContext *aio_context,
+  JobTxn *txn, Error **errp)
+{
+BlockJob *job = NULL;
+BdrvDirtyBitmap *bmap = NULL;
+int job_flags = JOB_DEFAULT;
+int ret;
+
+if (!backup->has_speed) {
+backup->speed = 0;
+}
+if (!backup->has_on_source_error) {
+backup->on_source_error = BLOCKDEV_ON_ERROR_REPORT;
+}
+if (!backup->has_on_target_error) {
+backup->on_target_error = BLOCKDEV_ON_ERROR_REPORT;
+}
+if (!backup->has_job_id) {
+backup->job_id = NULL;
+}
+if (!backup->has_auto_finalize) {
+backup->auto_finalize = true;
+}
+if (!backup->has_auto_dismiss) {
+backup->auto_dismiss = true;
+}
+if (!backup->has_compress) {
+backup->compress = false;
+}
+
+ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
+if (ret < 0) {
+return NULL;
+}
+
+if (backup->has_bitmap) {
+bmap = bdrv_find_dirty_bitmap(bs, backup->bitmap);
+if (!bmap) {
+error_setg(errp, "Bitmap '%s' could not be found", backup->bitmap);
+return NULL;
+}
+if (bdrv_dirty_bitmap_check(bmap, BDRV_BITMAP_DEFAULT, errp)) {
+return NULL;
+}
+}
+
+if (!backup->auto_finalize) {
+job_flags |= JOB_MANUAL_FINALIZE;
+}
+if (!backup->auto_dismiss) {
+job_flags |= JOB_MANUAL_DISMISS;
+}
+
+job = backup_job_create(backup->job_id, bs, target_bs, backup->speed,
+backup->sync, bmap, backup->compress,
+backup->on_source_error, backup->on_target_error,
+job_flags, NULL, NULL, txn, errp);
+return job;
+}
+
 static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
  Error **errp)
 {
@@ -3432,39 +3496,16 @@ static BlockJob *do_drive_backup(DriveBackup *backup, 
JobTxn *txn,
 BlockDriverState *target_bs;
 BlockDriverState *source = NULL;
 BlockJob *job = NULL;
-BdrvDirtyBitmap *bmap = NULL;
 AioContext *aio_context;
 QDict *options = NULL;
 Error *local_err = NULL;
-int flags, job_flags = JOB_DEFAULT;
+int flags;
 int64_t size;
 bool set_backing_hd = false;
-int ret;
 
-if (!backup->has_speed) {
-backup->speed = 0;
-}
-if (!backup->has_on_source_error) {
-backup->on_source_error = BLOCKDEV_ON_ERROR_REPORT;
-}
-if (!backup->has_on_target_error) {
-backup->on_target_error = BLOCKDEV_ON_ERROR_REPORT;
-}
 if (!backup->has_mode) {
 backup->mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
 }
-if (!backup->has_job_id) {
-backup->job_id = NULL;
-}
-if (!backup->has_auto_finalize) {
-backup->auto_finalize = true;
-}
-if (!backup->has_auto_dismiss) {
-backup->auto_dismiss = true;
-}
-if (!backup->has_compress) {
-backup->compress = false;
-}
 
 bs = bdrv_lookup_bs(backup->device, backup->device, errp);
 if (!bs) {
@@ -3541,12 +3582,6 @@ static BlockJob *do_drive_backup(DriveBackup *backup, 
JobTxn *txn,
 goto out;
 }
 
-ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
-if (ret < 0) {
-bdrv_unref(target_bs);
-goto out;
-}
-
 if (set_backing_hd) {
 bdrv_set_backing_hd(target_bs, source, &local_err);
 if (local_err) {
@@ -3554,31 +3589,8 @@ static BlockJob *do_drive_backup(DriveBackup *backup, 
JobTxn *txn,
 }
 }
 
-if (backup->has_bitmap) {
-bmap = bdrv_find_dirty_bitmap(bs, backup->bitmap);
-if (!bmap) {
-error_setg(errp, "Bitmap '%s' could not be found", backup->bitmap);
-goto unref;
-}
-if (bdrv_dirty_bitmap_check(bmap, BDRV_BITMAP_DEFAULT, errp)) {
-goto unref;
-}
-}
-if (!backup->auto_finalize) {
-job_flags |= JOB_MANUAL_FINALIZE;
-}
-if (!backup->auto_dismiss) {
-job_flags |= JOB_MANUAL_DISMISS;
-}
-
-job = backup_job_create(backup->job_id, bs, target_bs, backup->speed,
-backup->sync, bmap, backup->compres

[Qemu-block] [PATCH v4 13/18] iotests: add testing shim for script-style python tests

2019-07-09 Thread John Snow

Because the new-style python tests don't use the iotests.main() test
launcher, we don't turn on the debugger logging for these scripts
when invoked via ./check -d.

Refactor the launcher shim into new and old style shims so that they
share environmental configuration.

Two cleanup notes: debug was not actually used as a global, and there
was no reason to create a class in an inner scope just to achieve
default variables; we can simply create an instance of the runner with
the values we want instead.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/iotests.py | 40 +++
 1 file changed, 26 insertions(+), 14 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 3ecef5bc90..fcad957d63 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -61,7 +61,6 @@ cachemode = os.environ.get('CACHEMODE')
 qemu_default_machine = os.environ.get('QEMU_DEFAULT_MACHINE')
 
 socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
-debug = False
 
 luks_default_secret_object = 'secret,id=keysec0,data=' + \
  os.environ.get('IMGKEYSECRET', '')
@@ -834,11 +833,22 @@ def skip_if_unsupported(required_formats=[], 
read_only=False):
 return func_wrapper
 return skip_test_decorator
 
-def main(supported_fmts=[], supported_oses=['linux'], supported_cache_modes=[],
- unsupported_fmts=[]):
-'''Run tests'''
+def execute_unittest(output, verbosity, debug):
+runner = unittest.TextTestRunner(stream=output, descriptions=True,
+ verbosity=verbosity)
+try:
+# unittest.main() will use sys.exit(); so expect a SystemExit
+# exception
+unittest.main(testRunner=runner)
+finally:
+if not debug:
+sys.stderr.write(re.sub(r'Ran (\d+) tests? in [\d.]+s',
+r'Ran \1 tests', output.getvalue()))
 
-global debug
+def execute_test(test_function=None,
+ supported_fmts=[], supported_oses=['linux'],
+ supported_cache_modes=[], unsupported_fmts=[]):
+"""Run either unittest or script-style tests."""
 
 # We are using TEST_DIR and QEMU_DEFAULT_MACHINE as proxies to
 # indicate that we're not being run via "check". There may be
@@ -870,13 +880,15 @@ def main(supported_fmts=[], supported_oses=['linux'], 
supported_cache_modes=[],
 
 logging.basicConfig(level=(logging.DEBUG if debug else logging.WARN))
 
-class MyTestRunner(unittest.TextTestRunner):
-def __init__(self, stream=output, descriptions=True, 
verbosity=verbosity):
-unittest.TextTestRunner.__init__(self, stream, descriptions, 
verbosity)
+if not test_function:
+execute_unittest(output, verbosity, debug)
+else:
+test_function()
 
-# unittest.main() will use sys.exit() so expect a SystemExit exception
-try:
-unittest.main(testRunner=MyTestRunner)
-finally:
-if not debug:
-sys.stderr.write(re.sub(r'Ran (\d+) tests? in [\d.]+s', r'Ran \1 
tests', output.getvalue()))
+def script_main(test_function, *args, **kwargs):
+"""Run script-style tests outside of the unittest framework"""
+execute_test(test_function, *args, **kwargs)
+
+def main(*args, **kwargs):
+"""Run tests using the unittest framework"""
+execute_test(None, *args, **kwargs)
-- 
2.21.0

[Qemu-block] [PATCH v4 03/18] blockdev-backup: utilize do_backup_common

2019-07-09 Thread John Snow

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 blockdev.c | 65 +-
 1 file changed, 6 insertions(+), 59 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 5bc8ecd087..77365d8166 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3624,78 +3624,25 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, 
JobTxn *txn,
 {
 BlockDriverState *bs;
 BlockDriverState *target_bs;
-Error *local_err = NULL;
-BdrvDirtyBitmap *bmap = NULL;
 AioContext *aio_context;
-BlockJob *job = NULL;
-int job_flags = JOB_DEFAULT;
-int ret;
-
-if (!backup->has_speed) {
-backup->speed = 0;
-}
-if (!backup->has_on_source_error) {
-backup->on_source_error = BLOCKDEV_ON_ERROR_REPORT;
-}
-if (!backup->has_on_target_error) {
-backup->on_target_error = BLOCKDEV_ON_ERROR_REPORT;
-}
-if (!backup->has_job_id) {
-backup->job_id = NULL;
-}
-if (!backup->has_auto_finalize) {
-backup->auto_finalize = true;
-}
-if (!backup->has_auto_dismiss) {
-backup->auto_dismiss = true;
-}
-if (!backup->has_compress) {
-backup->compress = false;
-}
+BlockJob *job;
 
 bs = bdrv_lookup_bs(backup->device, backup->device, errp);
 if (!bs) {
 return NULL;
 }
 
-aio_context = bdrv_get_aio_context(bs);
-aio_context_acquire(aio_context);
-
 target_bs = bdrv_lookup_bs(backup->target, backup->target, errp);
 if (!target_bs) {
-goto out;
+return NULL;
 }
 
-ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
-if (ret < 0) {
-goto out;
-}
+aio_context = bdrv_get_aio_context(bs);
+aio_context_acquire(aio_context);
 
-if (backup->has_bitmap) {
-bmap = bdrv_find_dirty_bitmap(bs, backup->bitmap);
-if (!bmap) {
-error_setg(errp, "Bitmap '%s' could not be found", backup->bitmap);
-goto out;
-}
-if (bdrv_dirty_bitmap_check(bmap, BDRV_BITMAP_DEFAULT, errp)) {
-goto out;
-}
-}
+job = do_backup_common(qapi_BlockdevBackup_base(backup),
+   bs, target_bs, aio_context, txn, errp);
 
-if (!backup->auto_finalize) {
-job_flags |= JOB_MANUAL_FINALIZE;
-}
-if (!backup->auto_dismiss) {
-job_flags |= JOB_MANUAL_DISMISS;
-}
-job = backup_job_create(backup->job_id, bs, target_bs, backup->speed,
-backup->sync, bmap, backup->compress,
-backup->on_source_error, backup->on_target_error,
-job_flags, NULL, NULL, txn, &local_err);
-if (local_err != NULL) {
-error_propagate(errp, local_err);
-}
-out:
 aio_context_release(aio_context);
 return job;
 }
-- 
2.21.0

[Qemu-block] [PATCH v4 15/18] iotests: teach FilePath to produce multiple paths

2019-07-09 Thread John Snow

Use "FilePaths" instead of "FilePath" to request multiple files be
cleaned up after we leave that object's scope.

This is not crucial; but it saves a little typing.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/iotests.py | 34 --
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index c544659ecb..6135c9663d 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -358,31 +358,45 @@ class Timeout:
 def timeout(self, signum, frame):
 raise Exception(self.errmsg)
 
+def file_pattern(name):
+return "{0}-{1}".format(os.getpid(), name)
 
-class FilePath(object):
-'''An auto-generated filename that cleans itself up.
+class FilePaths(object):
+"""
+FilePaths is an auto-generated filename that cleans itself up.
 
 Use this context manager to generate filenames and ensure that the file
 gets deleted::
 
-with TestFilePath('test.img') as img_path:
+with FilePaths(['test.img']) as img_path:
 qemu_img('create', img_path, '1G')
 # migration_sock_path is automatically deleted
-'''
-def __init__(self, name):
-filename = '{0}-{1}'.format(os.getpid(), name)
-self.path = os.path.join(test_dir, filename)
+"""
+def __init__(self, names):
+self.paths = []
+for name in names:
+self.paths.append(os.path.join(test_dir, file_pattern(name)))
 
 def __enter__(self):
-return self.path
+return self.paths
 
 def __exit__(self, exc_type, exc_val, exc_tb):
 try:
-os.remove(self.path)
+for path in self.paths:
+os.remove(path)
 except OSError:
 pass
 return False
 
+class FilePath(FilePaths):
+"""
+FilePath is a specialization of FilePaths that takes a single filename.
+"""
+def __init__(self, name):
+super(FilePath, self).__init__([name])
+
+def __enter__(self):
+return self.paths[0]
 
 def file_path_remover():
 for path in reversed(file_path_remover.paths):
@@ -407,7 +421,7 @@ def file_path(*names):
 
 paths = []
 for name in names:
-filename = '{0}-{1}'.format(os.getpid(), name)
+filename = file_pattern(name)
 path = os.path.join(test_dir, filename)
 file_path_remover.paths.append(path)
 paths.append(path)
-- 
2.21.0

[Qemu-block] [PATCH v4 11/18] block/backup: upgrade copy_bitmap to BdrvDirtyBitmap

2019-07-09 Thread John Snow

This simplifies some interface matters; namely the initialization and
(later) merging the manifest back into the sync_bitmap if it was
provided.

Signed-off-by: John Snow 
---
 block/backup.c | 81 ++
 1 file changed, 42 insertions(+), 39 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index efd0dcd2e7..fb1b39c44e 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -38,7 +38,10 @@ typedef struct CowRequest {
 typedef struct BackupBlockJob {
 BlockJob common;
 BlockBackend *target;
+
 BdrvDirtyBitmap *sync_bitmap;
+BdrvDirtyBitmap *copy_bitmap;
+
 MirrorSyncMode sync_mode;
 BitmapSyncMode bitmap_mode;
 BlockdevOnError on_source_error;
@@ -51,7 +54,6 @@ typedef struct BackupBlockJob {
 NotifierWithReturn before_write;
 QLIST_HEAD(, CowRequest) inflight_reqs;
 
-HBitmap *copy_bitmap;
 bool use_copy_range;
 int64_t copy_range_size;
 
@@ -113,7 +115,7 @@ static int coroutine_fn 
backup_cow_with_bounce_buffer(BackupBlockJob *job,
 int write_flags = job->serialize_target_writes ? BDRV_REQ_SERIALISING : 0;
 
 assert(QEMU_IS_ALIGNED(start, job->cluster_size));
-hbitmap_reset(job->copy_bitmap, start, job->cluster_size);
+bdrv_reset_dirty_bitmap(job->copy_bitmap, start, job->cluster_size);
 nbytes = MIN(job->cluster_size, job->len - start);
 if (!*bounce_buffer) {
 *bounce_buffer = blk_blockalign(blk, job->cluster_size);
@@ -146,7 +148,7 @@ static int coroutine_fn 
backup_cow_with_bounce_buffer(BackupBlockJob *job,
 
 return nbytes;
 fail:
-hbitmap_set(job->copy_bitmap, start, job->cluster_size);
+bdrv_set_dirty_bitmap(job->copy_bitmap, start, job->cluster_size);
 return ret;
 
 }
@@ -169,12 +171,14 @@ static int coroutine_fn 
backup_cow_with_offload(BackupBlockJob *job,
 assert(QEMU_IS_ALIGNED(start, job->cluster_size));
 nbytes = MIN(job->copy_range_size, end - start);
 nr_clusters = DIV_ROUND_UP(nbytes, job->cluster_size);
-hbitmap_reset(job->copy_bitmap, start, job->cluster_size * nr_clusters);
+bdrv_reset_dirty_bitmap(job->copy_bitmap, start,
+job->cluster_size * nr_clusters);
 ret = blk_co_copy_range(blk, start, job->target, start, nbytes,
 read_flags, write_flags);
 if (ret < 0) {
 trace_backup_do_cow_copy_range_fail(job, start, ret);
-hbitmap_set(job->copy_bitmap, start, job->cluster_size * nr_clusters);
+bdrv_set_dirty_bitmap(job->copy_bitmap, start,
+  job->cluster_size * nr_clusters);
 return ret;
 }
 
@@ -202,7 +206,7 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
 cow_request_begin(&cow_request, job, start, end);
 
 while (start < end) {
-if (!hbitmap_get(job->copy_bitmap, start)) {
+if (!bdrv_dirty_bitmap_get(job->copy_bitmap, start)) {
 trace_backup_do_cow_skip(job, start);
 start += job->cluster_size;
 continue; /* already copied */
@@ -298,14 +302,16 @@ static void backup_abort(Job *job)
 static void backup_clean(Job *job)
 {
 BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
+BlockDriverState *bs = blk_bs(s->common.blk);
+
+if (s->copy_bitmap) {
+bdrv_release_dirty_bitmap(bs, s->copy_bitmap);
+s->copy_bitmap = NULL;
+}
+
 assert(s->target);
 blk_unref(s->target);
 s->target = NULL;
-
-if (s->copy_bitmap) {
-hbitmap_free(s->copy_bitmap);
-s->copy_bitmap = NULL;
-}
 }
 
 void backup_do_checkpoint(BlockJob *job, Error **errp)
@@ -320,7 +326,7 @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
 return;
 }
 
-hbitmap_set(backup_job->copy_bitmap, 0, backup_job->len);
+bdrv_set_dirty_bitmap(backup_job->copy_bitmap, 0, backup_job->len);
 }
 
 static void backup_drain(BlockJob *job)
@@ -389,59 +395,52 @@ static bool bdrv_is_unallocated_range(BlockDriverState 
*bs,
 
 static int coroutine_fn backup_loop(BackupBlockJob *job)
 {
-int ret;
 bool error_is_read;
 int64_t offset;
-HBitmapIter hbi;
+BdrvDirtyBitmapIter *bdbi;
 BlockDriverState *bs = blk_bs(job->common.blk);
+int ret = 0;
 
-hbitmap_iter_init(&hbi, job->copy_bitmap, 0);
-while ((offset = hbitmap_iter_next(&hbi)) != -1) {
+bdbi = bdrv_dirty_iter_new(job->copy_bitmap);
+while ((offset = bdrv_dirty_iter_next(bdbi)) != -1) {
 if (job->sync_mode == MIRROR_SYNC_MODE_TOP &&
 bdrv_is_unallocated_range(bs, offset, job->cluster_size))
 {
-hbitmap_reset(job->copy_bitmap, offset, job->cluster_size);
+bdrv_reset_dirty_bitmap(job->copy_bitmap, offset,
+job->cluster_size);
 continue;
 }
 
 do {
 if (yield_and_check(job)) {
-return 0;
+goto out;
 }
 ret = backup_

[Qemu-block] [PATCH v4 14/18] iotests: teach run_job to cancel pending jobs

2019-07-09 Thread John Snow

run_job can cancel pending jobs to simulate failure. This lets us use
the pending callback to issue test commands while the job is open, but
then still have the job fail in the end.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/iotests.py | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index fcad957d63..c544659ecb 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -541,7 +541,22 @@ class VM(qtest.QEMUQtestMachine):
 
 # Returns None on success, and an error string on failure
 def run_job(self, job, auto_finalize=True, auto_dismiss=False,
-pre_finalize=None, wait=60.0):
+pre_finalize=None, cancel=False, wait=60.0):
+"""
+run_job moves a job from creation through to dismissal.
+
+:param job: String. ID of recently-launched job
+:param auto_finalize: Bool. True if the job was launched with
+  auto_finalize. Defaults to True.
+:param auto_dismiss: Bool. True if the job was launched with
+ auto_dismiss=True. Defaults to False.
+:param pre_finalize: Callback. A callable that takes no arguments to be
+ invoked prior to issuing job-finalize, if any.
+:param cancel: Bool. When true, cancels the job after the pre_finalize
+   callback.
+:param wait: Float. Timeout value specifying how long to wait for any
+ event, in seconds. Defaults to 60.0.
+"""
 match_device = {'data': {'device': job}}
 match_id = {'data': {'id': job}}
 events = [
@@ -568,7 +583,10 @@ class VM(qtest.QEMUQtestMachine):
 elif status == 'pending' and not auto_finalize:
 if pre_finalize:
 pre_finalize()
-self.qmp_log('job-finalize', id=job)
+if cancel:
+self.qmp_log('job-cancel', id=job)
+else:
+self.qmp_log('job-finalize', id=job)
 elif status == 'concluded' and not auto_dismiss:
 self.qmp_log('job-dismiss', id=job)
 elif status == 'null':
-- 
2.21.0

[Qemu-block] [PATCH v4 01/18] qapi/block-core: Introduce BackupCommon

2019-07-09 Thread John Snow

drive-backup and blockdev-backup have an awful lot of things in common
that are the same. Let's fix that.

I don't deduplicate 'target', because the semantics actually did change
between each structure. Leave that one alone so it can be documented
separately.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 qapi/block-core.json | 103 ++-
 1 file changed, 33 insertions(+), 70 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0d43d4f37c..0af3866015 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1315,32 +1315,23 @@
   'data': { 'node': 'str', 'overlay': 'str' } }
 
 ##
-# @DriveBackup:
+# @BackupCommon:
 #
 # @job-id: identifier for the newly-created block job. If
 #  omitted, the device name will be used. (Since 2.7)
 #
 # @device: the device name or node-name of a root node which should be copied.
 #
-# @target: the target of the new image. If the file exists, or if it
-#  is a device, the existing file/device will be used as the new
-#  destination.  If it does not exist, a new file will be created.
-#
-# @format: the format of the new destination, default is to
-#  probe if @mode is 'existing', else the format of the source
-#
 # @sync: what parts of the disk image should be copied to the destination
 #(all the disk, only the sectors allocated in the topmost image, from a
 #dirty bitmap, or only new I/O).
 #
-# @mode: whether and how QEMU should create a new image, default is
-#'absolute-paths'.
-#
-# @speed: the maximum speed, in bytes per second
+# @speed: the maximum speed, in bytes per second. The default is 0,
+# for unlimited.
 #
 # @bitmap: the name of dirty bitmap if sync is "incremental".
 #  Must be present if sync is "incremental", must NOT be present
-#  otherwise. (Since 2.4)
+#  otherwise. (Since 2.4 (drive-backup), 3.1 (blockdev-backup))
 #
 # @compress: true to compress data, if the target format supports it.
 #(default: false) (since 2.8)
@@ -1370,75 +1361,47 @@
 # I/O.  If an error occurs during a guest write request, the device's
 # rerror/werror actions will be used.
 #
+# Since: 4.2
+##
+{ 'struct': 'BackupCommon',
+  'data': { '*job-id': 'str', 'device': 'str',
+'sync': 'MirrorSyncMode', '*speed': 'int',
+'*bitmap': 'str', '*compress': 'bool',
+'*on-source-error': 'BlockdevOnError',
+'*on-target-error': 'BlockdevOnError',
+'*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
+
+##
+# @DriveBackup:
+#
+# @target: the target of the new image. If the file exists, or if it
+#  is a device, the existing file/device will be used as the new
+#  destination.  If it does not exist, a new file will be created.
+#
+# @format: the format of the new destination, default is to
+#  probe if @mode is 'existing', else the format of the source
+#
+# @mode: whether and how QEMU should create a new image, default is
+#'absolute-paths'.
+#
 # Since: 1.6
 ##
 { 'struct': 'DriveBackup',
-  'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
-'*format': 'str', 'sync': 'MirrorSyncMode',
-'*mode': 'NewImageMode', '*speed': 'int',
-'*bitmap': 'str', '*compress': 'bool',
-'*on-source-error': 'BlockdevOnError',
-'*on-target-error': 'BlockdevOnError',
-'*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
+  'base': 'BackupCommon',
+  'data': { 'target': 'str',
+'*format': 'str',
+'*mode': 'NewImageMode' } }
 
 ##
 # @BlockdevBackup:
 #
-# @job-id: identifier for the newly-created block job. If
-#  omitted, the device name will be used. (Since 2.7)
-#
-# @device: the device name or node-name of a root node which should be copied.
-#
 # @target: the device name or node-name of the backup target node.
 #
-# @sync: what parts of the disk image should be copied to the destination
-#(all the disk, only the sectors allocated in the topmost image, or
-#only new I/O).
-#
-# @speed: the maximum speed, in bytes per second. The default is 0,
-# for unlimited.
-#
-# @bitmap: the name of dirty bitmap if sync is "incremental".
-#  Must be present if sync is "incremental", must NOT be present
-#  otherwise. (Since 3.1)
-#
-# @compress: true to compress data, if the target format supports it.
-#(default: false) (since 2.8)
-#
-# @on-source-error: the action to take on an error on the source,
-#   default 'report'.  'stop' and 'enospc' can only be used
-#   if the block device supports io-status (see BlockInfo).
-#
-# @on-target-error: the action to take on an error on the target,
-#   default 'report' (no limitations, since this applies to
-#   a different block device than @device).
-#
-# @auto-finalize: When false, this j

[Qemu-block] [PATCH v4 10/18] block/dirty-bitmap: add bdrv_dirty_bitmap_get

2019-07-09 Thread John Snow

Add a public interface for get. While we're at it,
rename "bdrv_get_dirty_bitmap_locked" to "bdrv_dirty_bitmap_get_locked".

(There are more functions to rename to the bdrv_dirty_bitmap_VERB form,
but they will wait until the conclusion of this series.)

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block/dirty-bitmap.c | 19 ---
 block/mirror.c   |  2 +-
 include/block/dirty-bitmap.h |  4 ++--
 migration/block.c|  5 ++---
 nbd/server.c |  2 +-
 5 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 7881fea684..75a5daf116 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -509,14 +509,19 @@ BlockDirtyInfoList 
*bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 }
 
 /* Called within bdrv_dirty_bitmap_lock..unlock */
-bool bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
-   int64_t offset)
+bool bdrv_dirty_bitmap_get_locked(BdrvDirtyBitmap *bitmap, int64_t offset)
 {
-if (bitmap) {
-return hbitmap_get(bitmap->bitmap, offset);
-} else {
-return false;
-}
+return hbitmap_get(bitmap->bitmap, offset);
+}
+
+bool bdrv_dirty_bitmap_get(BdrvDirtyBitmap *bitmap, int64_t offset)
+{
+bool ret;
+bdrv_dirty_bitmap_lock(bitmap);
+ret = bdrv_dirty_bitmap_get_locked(bitmap, offset);
+bdrv_dirty_bitmap_unlock(bitmap);
+
+return ret;
 }
 
 /**
diff --git a/block/mirror.c b/block/mirror.c
index 75c8f38c6a..63c3ead094 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -476,7 +476,7 @@ static uint64_t coroutine_fn 
mirror_iteration(MirrorBlockJob *s)
 int64_t next_offset = offset + nb_chunks * s->granularity;
 int64_t next_chunk = next_offset / s->granularity;
 if (next_offset >= s->bdev_length ||
-!bdrv_get_dirty_locked(source, s->dirty_bitmap, next_offset)) {
+!bdrv_dirty_bitmap_get_locked(s->dirty_bitmap, next_offset)) {
 break;
 }
 if (test_bit(next_chunk, s->in_flight_bitmap)) {
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 62682eb865..0120ef3f05 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -84,12 +84,12 @@ void bdrv_dirty_bitmap_set_busy(BdrvDirtyBitmap *bitmap, 
bool busy);
 void bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const BdrvDirtyBitmap *src,
  HBitmap **backup, Error **errp);
 void bdrv_dirty_bitmap_set_migration(BdrvDirtyBitmap *bitmap, bool migration);
+bool bdrv_dirty_bitmap_get(BdrvDirtyBitmap *bitmap, int64_t offset);
 
 /* Functions that require manual locking.  */
 void bdrv_dirty_bitmap_lock(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_unlock(BdrvDirtyBitmap *bitmap);
-bool bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
-   int64_t offset);
+bool bdrv_dirty_bitmap_get_locked(BdrvDirtyBitmap *bitmap, int64_t offset);
 void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
   int64_t offset, int64_t bytes);
 void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
diff --git a/migration/block.c b/migration/block.c
index 91f98ef44a..a5b60456ae 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -520,7 +520,6 @@ static int mig_save_device_dirty(QEMUFile *f, 
BlkMigDevState *bmds,
  int is_async)
 {
 BlkMigBlock *blk;
-BlockDriverState *bs = blk_bs(bmds->blk);
 int64_t total_sectors = bmds->total_sectors;
 int64_t sector;
 int nr_sectors;
@@ -535,8 +534,8 @@ static int mig_save_device_dirty(QEMUFile *f, 
BlkMigDevState *bmds,
 blk_mig_unlock();
 }
 bdrv_dirty_bitmap_lock(bmds->dirty_bitmap);
-if (bdrv_get_dirty_locked(bs, bmds->dirty_bitmap,
-  sector * BDRV_SECTOR_SIZE)) {
+if (bdrv_dirty_bitmap_get_locked(bmds->dirty_bitmap,
+ sector * BDRV_SECTOR_SIZE)) {
 if (total_sectors - sector < BDRV_SECTORS_PER_DIRTY_CHUNK) {
 nr_sectors = total_sectors - sector;
 } else {
diff --git a/nbd/server.c b/nbd/server.c
index 10faedcfc5..fbd51b48a7 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2003,7 +2003,7 @@ static unsigned int bitmap_to_extents(BdrvDirtyBitmap 
*bitmap, uint64_t offset,
 bdrv_dirty_bitmap_lock(bitmap);
 
 it = bdrv_dirty_iter_new(bitmap);
-dirty = bdrv_get_dirty_locked(NULL, bitmap, offset);
+dirty = bdrv_dirty_bitmap_get_locked(bitmap, offset);
 
 assert(begin < overall_end && nb_extents);
 while (begin < overall_end && i < nb_extents) {
-- 
2.21.0

[Qemu-block] [PATCH v4 00/18] bitmaps: introduce 'bitmap' sync mode

2019-07-09 Thread John Snow

This series adds a new "BITMAP" sync mode that is meant to replace the
existing "INCREMENTAL" sync mode.

This mode can have its behavior modified by issuing any of three bitmap sync
modes, passed as arguments to the job.

The three bitmap sync modes are:
- ON-SUCCESS: This is an alias for the old incremental mode. The bitmap is
  conditionally synchronized based on the return code of the job
  upon completion.
- NEVER: This is, effectively, the differential backup mode. It never clears
 the bitmap, as the name suggests.
- ALWAYS: Here is the new, exciting thing. The bitmap is always synchronized,
  even on failure. On success, this is identical to incremental, but
  on failure it clears only the bits that were copied successfully.
  This can be used to "resume" incremental backups from later points
  in times.

I wrote this series by accident on my way to implement incremental mode
for mirror, but this happened first -- the problem is that Mirror mode
uses its existing modes in a very particular way; and this was the best
way to add bitmap support into the mirror job properly.

Summary:
- 01-03: refactor blockdev-backup and drive-backup to share more interface code
- 04-05: add the new 'bitmap' sync mode with sync policy 'conditional',
 which is functionally identical to 'incremental' sync mode.
- 06:add sync policy 'never' ("Differential" backups.)
- 07-11: rework some merging code to facilite patch 12;
- 12:add sync policy 'always' ("Resumable" backups)
- 13-16: test infrastructure changes to support patch 16:
- 17:new iotest!
- 18:minor policy loosening as a QOL improvement

Future work:
 - Update bitmaps.rst to explain these. (WIP, it's hard, sorry!)
 - Add these modes to Mirror. (Done*, but needs tests.)
 - Allow the use of bitmaps and bitmap sync modes with non-BITMAP modes;
   This will allow for resumable/re-tryable full backups.

===
V4:
===

[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/18:[] [--] 'qapi/block-core: Introduce BackupCommon'
002/18:[] [--] 'drive-backup: create do_backup_common'
003/18:[] [--] 'blockdev-backup: utilize do_backup_common'
004/18:[] [--] 'qapi: add BitmapSyncMode enum'
005/18:[] [--] 'block/backup: Add mirror sync mode 'bitmap''
006/18:[] [--] 'block/backup: add 'never' policy to bitmap sync mode'
007/18:[] [--] 'hbitmap: Fix merge when b is empty, and result is not an 
alias of a'
008/18:[] [--] 'hbitmap: enable merging across granularities'
009/18:[0004] [FC] 'block/dirty-bitmap: add bdrv_dirty_bitmap_merge_internal'
010/18:[] [--] 'block/dirty-bitmap: add bdrv_dirty_bitmap_get'
011/18:[0008] [FC] 'block/backup: upgrade copy_bitmap to BdrvDirtyBitmap'
012/18:[] [--] 'block/backup: add 'always' bitmap sync policy'
013/18:[] [--] 'iotests: add testing shim for script-style python tests'
014/18:[] [--] 'iotests: teach run_job to cancel pending jobs'
015/18:[] [--] 'iotests: teach FilePath to produce multiple paths'
016/18:[] [--] 'iotests: Add virtio-scsi device helper'
017/18:[0063] [FC] 'iotests: add test 257 for bitmap-mode backups'
018/18:[] [--] 'block/backup: loosen restriction on readonly bitmaps'

Changes:
009: Added assertions.
011: Moved copy bitmap to source node.
017: Rework get_bitmap to tolerate multiple anonymous bitmaps
 Update test output to accommodate the same.

===
V3:
===

Changes:
001: Made suggested doc fixes.
 Changed 'since' to 4.2.
002: Added bds and aio_context to backup_common
 Removed accidental extraneous unref on target_bs
 Removed local_err propagation
003: Fallout from #002; hoist aio_context acquisition up into do_blockdev_backup
004: 'conditional' --> 'on-success'
005: Rediscover the lost stanza that ensures a bitmap mode was given
 Fallout from 2, 3, 4.
006: Block comment fix for patchew
 Fallout from #4
009: Fix assert() style issue. Why'd they let a macro be lowercase like that?
 Probably to make specifically my life difficult.
010: Fix style issue {
011: Fix long lines
 rename "bs" --> "target_bs" where appropriate
 Free copy_bitmap from the right node
012: Multiline comment changes for patchew
 Fallout from #4
015: Fix long line for patchew
 Reinstate that second newline that Max likes
017: Fallout from #4.

===
V2:
===

Changes:
004: Fixed typo
 Change @conditional docstring
005: Moved desugaring code into blockdev.c, facilitated by patches 1-3.
006: Change @never docstring slightly.
007: Merge will clear the target bitmap when both components bitmaps are empty,
 and the target bitmap is not an alias of either component bitmap.
008: Check orig_size (logical size) instead of size (actual size) to enable
 cross-granularity merging.
 Fix the spars

[Qemu-block] [PATCH v4 08/18] hbitmap: enable merging across granularities

2019-07-09 Thread John Snow

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 util/hbitmap.c | 36 +++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/util/hbitmap.c b/util/hbitmap.c
index 3b6acae42b..306bc4876d 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -777,7 +777,27 @@ void hbitmap_truncate(HBitmap *hb, uint64_t size)
 
 bool hbitmap_can_merge(const HBitmap *a, const HBitmap *b)
 {
-return (a->size == b->size) && (a->granularity == b->granularity);
+return (a->orig_size == b->orig_size);
+}
+
+/**
+ * hbitmap_sparse_merge: performs dst = dst | src
+ * works with differing granularities.
+ * best used when src is sparsely populated.
+ */
+static void hbitmap_sparse_merge(HBitmap *dst, const HBitmap *src)
+{
+uint64_t offset = 0;
+uint64_t count = src->orig_size;
+
+while (hbitmap_next_dirty_area(src, &offset, &count)) {
+hbitmap_set(dst, offset, count);
+offset += count;
+if (offset >= src->orig_size) {
+break;
+}
+count = src->orig_size - offset;
+}
 }
 
 /**
@@ -808,10 +828,24 @@ bool hbitmap_merge(const HBitmap *a, const HBitmap *b, 
HBitmap *result)
 return true;
 }
 
+if (a->granularity != b->granularity) {
+if ((a != result) && (b != result)) {
+hbitmap_reset_all(result);
+}
+if (a != result) {
+hbitmap_sparse_merge(result, a);
+}
+if (b != result) {
+hbitmap_sparse_merge(result, b);
+}
+return true;
+}
+
 /* This merge is O(size), as BITS_PER_LONG and HBITMAP_LEVELS are constant.
  * It may be possible to improve running times for sparsely populated maps
  * by using hbitmap_iter_next, but this is suboptimal for dense maps.
  */
+assert(a->size == b->size);
 for (i = HBITMAP_LEVELS - 1; i >= 0; i--) {
 for (j = 0; j < a->sizes[i]; j++) {
 result->levels[i][j] = a->levels[i][j] | b->levels[i][j];
-- 
2.21.0

[Qemu-block] [PATCH v4 04/18] qapi: add BitmapSyncMode enum

2019-07-09 Thread John Snow

Depending on what a user is trying to accomplish, there might be a few
bitmap cleanup actions that occur when an operation is finished that
could be useful.

I am proposing three:
- NEVER: The bitmap is never synchronized against what was copied.
- ALWAYS: The bitmap is always synchronized, even on failures.
- ON-SUCCESS: The bitmap is synchronized only on success.

The existing incremental backup modes use 'on-success' semantics,
so add just that one for right now.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 qapi/block-core.json | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0af3866015..0c853d00bd 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1134,6 +1134,20 @@
 { 'enum': 'MirrorSyncMode',
   'data': ['top', 'full', 'none', 'incremental'] }
 
+##
+# @BitmapSyncMode:
+#
+# An enumeration of possible behaviors for the synchronization of a bitmap
+# when used for data copy operations.
+#
+# @on-success: The bitmap is only synced when the operation is successful.
+#  This is the behavior always used for 'INCREMENTAL' backups.
+#
+# Since: 4.2
+##
+{ 'enum': 'BitmapSyncMode',
+  'data': ['on-success'] }
+
 ##
 # @MirrorCopyMode:
 #
-- 
2.21.0

[Qemu-block] [PATCH v4 06/18] block/backup: add 'never' policy to bitmap sync mode

2019-07-09 Thread John Snow

This adds a "never" policy for bitmap synchronization. Regardless of if
the job succeeds or fails, we never update the bitmap. This can be used
to perform differential backups, or simply to avoid the job modifying a
bitmap.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block/backup.c   | 7 +--
 qapi/block-core.json | 5 -
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 996941fa61..efd0dcd2e7 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -265,8 +265,11 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob 
*job, int ret)
 BdrvDirtyBitmap *bm;
 BlockDriverState *bs = blk_bs(job->common.blk);
 
-if (ret < 0) {
-/* Merge the successor back into the parent, delete nothing. */
+if (ret < 0 || job->bitmap_mode == BITMAP_SYNC_MODE_NEVER) {
+/*
+ * Failure, or we don't want to synchronize the bitmap.
+ * Merge the successor back into the parent, delete nothing.
+ */
 bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
 assert(bm);
 } else {
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 99dcd5f099..b1a98e 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1146,10 +1146,13 @@
 # @on-success: The bitmap is only synced when the operation is successful.
 #  This is the behavior always used for 'INCREMENTAL' backups.
 #
+# @never: The bitmap is never synchronized with the operation, and is
+# treated solely as a read-only manifest of blocks to copy.
+#
 # Since: 4.2
 ##
 { 'enum': 'BitmapSyncMode',
-  'data': ['on-success'] }
+  'data': ['on-success', 'never'] }
 
 ##
 # @MirrorCopyMode:
-- 
2.21.0

[Qemu-block] [PATCH v4 07/18] hbitmap: Fix merge when b is empty, and result is not an alias of a

2019-07-09 Thread John Snow

Nobody calls the function like this currently, but we neither prohibit
or cope with this behavior. I decided to make the function cope with it.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 util/hbitmap.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/util/hbitmap.c b/util/hbitmap.c
index 7905212a8b..3b6acae42b 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -781,8 +781,9 @@ bool hbitmap_can_merge(const HBitmap *a, const HBitmap *b)
 }
 
 /**
- * Given HBitmaps A and B, let A := A (BITOR) B.
- * Bitmap B will not be modified.
+ * Given HBitmaps A and B, let R := A (BITOR) B.
+ * Bitmaps A and B will not be modified,
+ * except when bitmap R is an alias of A or B.
  *
  * @return true if the merge was successful,
  * false if it was not attempted.
@@ -797,7 +798,13 @@ bool hbitmap_merge(const HBitmap *a, const HBitmap *b, 
HBitmap *result)
 }
 assert(hbitmap_can_merge(b, result));
 
-if (hbitmap_count(b) == 0) {
+if ((!hbitmap_count(a) && result == b) ||
+(!hbitmap_count(b) && result == a)) {
+return true;
+}
+
+if (!hbitmap_count(a) && !hbitmap_count(b)) {
+hbitmap_reset_all(result);
 return true;
 }
 
-- 
2.21.0

Re: [Qemu-block] [PATCH v3 17/18] iotests: add test 257 for bitmap-mode backups

2019-07-09 Thread John Snow




On 7/9/19 2:49 PM, Max Reitz wrote:
> On 05.07.19 22:16, John Snow wrote:
>> Signed-off-by: John Snow 
>> ---
>>  tests/qemu-iotests/257 |  409 +++
>>  tests/qemu-iotests/257.out | 2199 
>>  tests/qemu-iotests/group   |1 +
>>  3 files changed, 2609 insertions(+)
>>  create mode 100755 tests/qemu-iotests/257
>>  create mode 100644 tests/qemu-iotests/257.out
> 
> Reviewed-by: Max Reitz 
> 

Thanks for all the reviews so far :)

Despite my grumpiness I do rather like the way this turned out.

--js

[Qemu-block] [PATCH 4/4] hw/scsi: Don't realize zoned block devices for virtio-scsi legacy drivers

2019-07-09 Thread Dmitry Fomichev

From: Shin'ichiro Kawasaki 

Prevent scsi-hd and scsi-disk drivers from attaching a zoned block device
because it will appear as a regular block device at the guest and will
most certainly cause problems.

The functionality to support ZBDs is not planned for scsi-hd and
scsi-disk legacy drivers. It is supported via scsi-generic driver already
and discussion is ongoing for scsi-block driver.

Signed-off-by: Shin'ichiro Kawasaki 
---
 hw/scsi/scsi-disk.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index ed7295bfd7..80682a61fb 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2401,6 +2401,11 @@ static void scsi_hd_realize(SCSIDevice *dev, Error 
**errp)
  * backend will be issued in scsi_realize
  */
 if (s->qdev.conf.blk) {
+if (blk_is_zoned(s->qdev.conf.blk)) {
+error_setg(errp, "zoned block devices are not supported");
+return;
+}
+
 ctx = blk_get_aio_context(s->qdev.conf.blk);
 aio_context_acquire(ctx);
 blkconf_blocksizes(&s->qdev.conf);
-- 
2.21.0

[Qemu-block] [PATCH 1/4] block: Add zoned device model property

2019-07-09 Thread Dmitry Fomichev

This commit adds Zoned Device Model as defined in T10 ZBC and
T13 ZAC standards, along with some useful accessor functions.

Since the default value of zero in the zoned model field corresponds
to a regular (non-zoned) block device, there is no functionality
change with this commit.

Signed-off-by: Dmitry Fomichev 
---
 block.c| 14 ++
 block/block-backend.c  | 20 
 include/block/block.h  |  9 +
 include/block/block_int.h  |  4 
 include/sysemu/block-backend.h |  2 ++
 5 files changed, 49 insertions(+)

diff --git a/block.c b/block.c
index c139540f2b..396d999f00 100644
--- a/block.c
+++ b/block.c
@@ -4649,6 +4649,20 @@ void bdrv_get_geometry(BlockDriverState *bs, uint64_t 
*nb_sectors_ptr)
 *nb_sectors_ptr = nb_sectors < 0 ? 0 : nb_sectors;
 }
 
+uint8_t bdrv_get_zoned_model(BlockDriverState *bs)
+{
+if (bs->drv->bdrv_get_zoned_info) {
+bs->drv->bdrv_get_zoned_info(bs);
+}
+
+return bs->bl.zoned_model;
+}
+
+uint8_t bdrv_is_zoned(BlockDriverState *bs)
+{
+return bdrv_get_zoned_model(bs) != BLK_ZONED_NONE;
+}
+
 bool bdrv_is_sg(BlockDriverState *bs)
 {
 return bs->sg;
diff --git a/block/block-backend.c b/block/block-backend.c
index a8d160fd5d..34723f3655 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1438,6 +1438,15 @@ int64_t blk_nb_sectors(BlockBackend *blk)
 return bdrv_nb_sectors(blk_bs(blk));
 }
 
+uint8_t blk_get_zoned_model(BlockBackend *blk)
+{
+if (!blk_is_available(blk)) {
+return BLK_ZONED_NONE;
+}
+
+return bdrv_get_zoned_model(blk_bs(blk));
+}
+
 BlockAIOCB *blk_aio_preadv(BlockBackend *blk, int64_t offset,
QEMUIOVector *qiov, BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque)
@@ -1705,6 +1714,17 @@ bool blk_is_sg(BlockBackend *blk)
 return bdrv_is_sg(bs);
 }
 
+bool blk_is_zoned(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+
+if (!bs) {
+return false;
+}
+
+return bdrv_is_zoned(bs);
+}
+
 bool blk_enable_write_cache(BlockBackend *blk)
 {
 return blk->enable_write_cache;
diff --git a/include/block/block.h b/include/block/block.h
index 734c9d2f76..7aa096afcc 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -278,6 +278,13 @@ enum {
 
 char *bdrv_perm_names(uint64_t perm);
 
+/* Known zoned device models */
+enum blk_zoned_model {
+BLK_ZONED_NONE, /* Regular block device */
+BLK_ZONED_HA,   /* Host-aware zoned block device */
+BLK_ZONED_HM,   /* Host-managed zoned block device */
+};
+
 /* disk I/O throttling */
 void bdrv_init(void);
 void bdrv_init_with_whitelist(void);
@@ -354,6 +361,8 @@ int64_t bdrv_get_allocated_file_size(BlockDriverState *bs);
 BlockMeasureInfo *bdrv_measure(BlockDriver *drv, QemuOpts *opts,
BlockDriverState *in_bs, Error **errp);
 void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
+uint8_t bdrv_get_zoned_model(BlockDriverState *bs);
+uint8_t bdrv_is_zoned(BlockDriverState *bs);
 void bdrv_refresh_limits(BlockDriverState *bs, Error **errp);
 int bdrv_commit(BlockDriverState *bs);
 int bdrv_change_backing_file(BlockDriverState *bs,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index d6415b53c1..8f35591803 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -416,6 +416,7 @@ struct BlockDriver {
 bool (*bdrv_debug_is_suspended)(BlockDriverState *bs, const char *tag);
 
 void (*bdrv_refresh_limits)(BlockDriverState *bs, Error **errp);
+void (*bdrv_get_zoned_info)(BlockDriverState *bs);
 
 /*
  * Returns 1 if newly created images are guaranteed to contain only
@@ -614,6 +615,9 @@ typedef struct BlockLimits {
 
 /* maximum number of iovec elements */
 int max_iov;
+
+/* Zoned device model. Zero value indicates a regular block device */
+uint8_t zoned_model;
 } BlockLimits;
 
 typedef struct BdrvOpBlocker BdrvOpBlocker;
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 733c4957eb..65b6341218 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -156,6 +156,7 @@ int blk_pwrite(BlockBackend *blk, int64_t offset, const 
void *buf, int bytes,
 int64_t blk_getlength(BlockBackend *blk);
 void blk_get_geometry(BlockBackend *blk, uint64_t *nb_sectors_ptr);
 int64_t blk_nb_sectors(BlockBackend *blk);
+uint8_t blk_get_zoned_model(BlockBackend *blk);
 BlockAIOCB *blk_aio_preadv(BlockBackend *blk, int64_t offset,
QEMUIOVector *qiov, BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque);
@@ -189,6 +190,7 @@ void blk_error_action(BlockBackend *blk, BlockErrorAction 
action,
   bool is_read, int error);
 bool blk_is_read_only(BlockBackend *blk);
 bool blk_is_sg(BlockBackend *blk);
+bool blk_is_zoned(BlockBackend *blk);

[Qemu-block] [PATCH 3/4] virtio-blk: Don't realize zoned block devices

2019-07-09 Thread Dmitry Fomichev

Prevent virtio-blk code from attaching a zoned block device because
it will otherwise appear as a reqular block device at the guest and
that will most certainly cause problems.

The functionality to support ZBDs via virtio-blk should be pretty
useful and there are some attempts underway to get it implemented,
but such work will inevitably lead to some modifications in virtio
protocol spec. Therefore, this activity is considered a more
long-term effort.

So for now, we just don't allow zoned block devices to work via
virtio-blk.

Signed-off-by: Dmitry Fomichev 
---
 hw/block/virtio-blk.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index cbb3729158..c11e028308 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1140,6 +1140,11 @@ static void virtio_blk_device_realize(DeviceState *dev, 
Error **errp)
 return;
 }
 
+if (blk_is_zoned(conf->conf.blk)) {
+error_setg(errp, "zoned block devices are not supported");
+return;
+}
+
 if (virtio_has_feature(s->host_features, VIRTIO_BLK_F_DISCARD) &&
 (!conf->max_discard_sectors ||
  conf->max_discard_sectors > BDRV_REQUEST_MAX_SECTORS)) {
-- 
2.21.0

[Qemu-block] [PATCH 2/4] raw: Recognize zoned backing devices

2019-07-09 Thread Dmitry Fomichev

The purpose of this patch is to recognize a zoned block device (ZBD)
when it is opened as a raw file. The new code initializes the zoned
model propery introduced by the previous commit.

This commit is Linux-specific and in essence it checks the Zoned
Block Device Model (None/Host-managed/Host-aware) value in sysfs on
the host.

In order to avoid code duplication in file-posix.c, a common helper
function is added to read value of a sysfs entry under
/sys/block//queue. Now, the existing function that reads the
"max_segments" value and the the new function that reads "zoned"
value share the same helper code.

Signed-off-by: Dmitry Fomichev 
---
 block/file-posix.c | 69 ++
 block/raw-format.c |  8 ++
 2 files changed, 66 insertions(+), 11 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index ab05b51a66..dd81dc3301 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1055,33 +1055,53 @@ static int 
hdev_get_max_transfer_length(BlockDriverState *bs, int fd)
 #endif
 }
 
-static int hdev_get_max_segments(const struct stat *st)
+static int hdev_read_blk_queue_entry(const struct stat *st, const char *key,
+char *buf, int buf_len)
 {
 #ifdef CONFIG_LINUX
-char buf[32];
-const char *end;
 char *sysfspath;
 int ret;
 int fd = -1;
-long max_segments;
 
-sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
-major(st->st_rdev), minor(st->st_rdev));
+sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/%s",
+major(st->st_rdev), minor(st->st_rdev), key);
 fd = open(sysfspath, O_RDONLY);
 if (fd == -1) {
 ret = -errno;
 goto out;
 }
 do {
-ret = read(fd, buf, sizeof(buf) - 1);
+ret = read(fd, buf, buf_len - 1);
 } while (ret == -1 && errno == EINTR);
 if (ret < 0) {
 ret = -errno;
-goto out;
 } else if (ret == 0) {
 ret = -EIO;
+}
+out:
+if (fd != -1) {
+close(fd);
+}
+g_free(sysfspath);
+return ret;
+#else
+return -ENOTSUP;
+#endif
+}
+
+static int hdev_get_max_segments(const struct stat *st)
+{
+#ifdef CONFIG_LINUX
+char buf[32];
+const char *end;
+int ret;
+long max_segments;
+
+ret = hdev_read_blk_queue_entry(st, "max_segments", buf, sizeof(buf));
+if (ret < 0) {
 goto out;
 }
+
 buf[ret] = 0;
 /* The file is ended with '\n', pass 'end' to accept that. */
 ret = qemu_strtol(buf, &end, 10, &max_segments);
@@ -1090,10 +1110,33 @@ static int hdev_get_max_segments(const struct stat *st)
 }
 
 out:
-if (fd != -1) {
-close(fd);
+return ret;
+#else
+return -ENOTSUP;
+#endif
+}
+
+static int hdev_get_zoned_model(const struct stat *st)
+{
+#ifdef CONFIG_LINUX
+char buf[32];
+int ret;
+
+ret = hdev_read_blk_queue_entry(st, "zoned", buf, sizeof(buf));
+if (ret < 0) {
+ret = BLK_ZONED_NONE;
+goto out;
 }
-g_free(sysfspath);
+
+buf[ret - 1] = 0;
+ret = BLK_ZONED_NONE;
+if (strcmp(buf, "host-managed") == 0) {
+ret = BLK_ZONED_HM;
+} else if (strcmp(buf, "host-aware") == 0) {
+ret = BLK_ZONED_HA;
+}
+
+out:
 return ret;
 #else
 return -ENOTSUP;
@@ -1116,6 +1159,10 @@ static void raw_refresh_limits(BlockDriverState *bs, 
Error **errp)
 bs->bl.max_transfer = MIN(bs->bl.max_transfer,
   ret * getpagesize());
 }
+ret = hdev_get_zoned_model(&st);
+if (ret >= 0) {
+bs->bl.zoned_model = ret;
+}
 }
 }
 
diff --git a/block/raw-format.c b/block/raw-format.c
index bffd424dd0..12c2a3f95d 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -369,6 +369,13 @@ static void raw_refresh_limits(BlockDriverState *bs, Error 
**errp)
 }
 }
 
+static void raw_get_zoned_info(BlockDriverState *bs)
+{
+if (!bs->probed) {
+bs->bl.zoned_model = bs->file->bs->bl.zoned_model;
+}
+}
+
 static int coroutine_fn raw_co_truncate(BlockDriverState *bs, int64_t offset,
 PreallocMode prealloc, Error **errp)
 {
@@ -572,6 +579,7 @@ BlockDriver bdrv_raw = {
 .bdrv_co_ioctl= &raw_co_ioctl,
 .create_opts  = &raw_create_opts,
 .bdrv_has_zero_init   = &raw_has_zero_init,
+.bdrv_get_zoned_info  = &raw_get_zoned_info,
 .strong_runtime_opts  = raw_strong_runtime_opts,
 .mutable_opts = mutable_opts,
 };
-- 
2.21.0

[Qemu-block] [PATCH 0/4] virtio: handle zoned backing devices

2019-07-09 Thread Dmitry Fomichev

Currently, attaching zoned block devices (i.e. storage devices
compliant to ZAC/ZBC standards) using several virtio methods doesn't
work - the zoned devices appear as regular block devices at the guest.
This may cause unexpected i/o errors and, potentially, some data
corruption.

To be more precise, attaching a zoned device via virtio-pci-blk,
virtio-scsi-pci/scsi-disk or virtio-scsi-pci/scsi-hd demonstrates the
above behavior. A simple fix is needed to make
virtio-scsi-pci/scsi-block work and this is covered by a different
patch. The virtio-scsi-pci/scsi-generic method appears to handle zoned
devices without problems.

This patchset adds code to check if the backing device that is being
opened is zoned. If this is the case, the new code prohibits
virtualization of the device for all use cases lacking proper zoned
support.

Considering that there is still a couple of different working ways
to virtualize a ZBD, this patchset provides a reasonable short-term
solution for this problem. What about long term?

It appears to be beneficial to add proper ZBD support to virtio-blk.
In order to support this use case properly, some virtio-blk protocol
changes will be necessary. They are needed to allow the host code to
propagate some ZBD properties that are required for virtio guest
driver to configure the guest block device as ZBD, such as zoned
device model, zone size and the total number of zones. Further, some
support needs to be added for REPORT ZONES command as well as for zone
operations, such as OPEN ZONE, CLOSE ZONE, FINISH ZONE and RESET ZONE.

These additions to the protocol are relatively straightforward, but
they need to be approved by the virtio TC and the whole process may
take some time.

ZBD support for virtio-scsi-pci/scsi-disk and virtio-scsi-pci/scsi-hd
does not seem as necessary. Users will be expected to attach zoned
block devices via virtio-scsi-pci/scsi-block instead.

This patchset contains some Linux-specific code. This code is
necessary to obtain Zoned Block Device model value from Linux sysfs.
Dmitry Fomichev (3):
  block: Add zoned device model property
  raw: Recognize zoned backing devices
  virtio-blk: Don't realize zoned block devices

Shin'ichiro Kawasaki (1):
  hw/scsi: Don't realize zoned block devices for virtio-scsi legacy
drivers

 block.c| 14 +++
 block/block-backend.c  | 20 ++
 block/file-posix.c | 69 --
 block/raw-format.c |  8 
 hw/block/virtio-blk.c  |  5 +++
 hw/scsi/scsi-disk.c|  5 +++
 include/block/block.h  |  9 +
 include/block/block_int.h  |  4 ++
 include/sysemu/block-backend.h |  2 +
 9 files changed, 125 insertions(+), 11 deletions(-)

-- 
2.21.0

Re: [Qemu-block] [PATCH v3 17/18] iotests: add test 257 for bitmap-mode backups

2019-07-09 Thread Max Reitz

On 05.07.19 22:16, John Snow wrote:
> Signed-off-by: John Snow 
> ---
>  tests/qemu-iotests/257 |  409 +++
>  tests/qemu-iotests/257.out | 2199 
>  tests/qemu-iotests/group   |1 +
>  3 files changed, 2609 insertions(+)
>  create mode 100755 tests/qemu-iotests/257
>  create mode 100644 tests/qemu-iotests/257.out

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH v3 12/18] block/backup: add 'always' bitmap sync policy

2019-07-09 Thread Max Reitz

On 05.07.19 22:16, John Snow wrote:
> This adds an "always" policy for bitmap synchronization. Regardless of if
> the job succeeds or fails, the bitmap is *always* synchronized. This means
> that for backups that fail part-way through, the bitmap retains a record of
> which sectors need to be copied out to accomplish a new backup using the
> old, partial result.
> 
> In effect, this allows us to "resume" a failed backup; however the new backup
> will be from the new point in time, so it isn't a "resume" as much as it is
> an "incremental retry." This can be useful in the case of extremely large
> backups that fail considerably through the operation and we'd like to not 
> waste
> the work that was already performed.
> 
> Signed-off-by: John Snow 
> ---
>  block/backup.c   | 27 +++
>  qapi/block-core.json |  5 -
>  2 files changed, 23 insertions(+), 9 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH v3 06/18] block/backup: add 'never' policy to bitmap sync mode

2019-07-09 Thread Max Reitz

On 05.07.19 22:16, John Snow wrote:
> This adds a "never" policy for bitmap synchronization. Regardless of if
> the job succeeds or fails, we never update the bitmap. This can be used
> to perform differential backups, or simply to avoid the job modifying a
> bitmap.
> 
> Signed-off-by: John Snow 
> ---
>  block/backup.c   | 7 +--
>  qapi/block-core.json | 5 -
>  2 files changed, 9 insertions(+), 3 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH v3 04/18] qapi: add BitmapSyncMode enum

2019-07-09 Thread Max Reitz

On 05.07.19 22:16, John Snow wrote:
> Depending on what a user is trying to accomplish, there might be a few
> bitmap cleanup actions that occur when an operation is finished that
> could be useful.
> 
> I am proposing three:
> - NEVER: The bitmap is never synchronized against what was copied.
> - ALWAYS: The bitmap is always synchronized, even on failures.
> - ON-SUCCESS: The bitmap is synchronized only on success.
> 
> The existing incremental backup modes use 'on-success' semantics,
> so add just that one for right now.
> 
> Signed-off-by: John Snow 
> ---
>  qapi/block-core.json | 14 ++
>  1 file changed, 14 insertions(+)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH v3 05/18] block/backup: Add mirror sync mode 'bitmap'

2019-07-09 Thread Max Reitz

On 05.07.19 22:16, John Snow wrote:
> We don't need or want a new sync mode for simple differences in
> semantics.  Create a new mode simply named "BITMAP" that is designed to
> make use of the new Bitmap Sync Mode field.
> 
> Because the only bitmap sync mode is 'on-success', this adds no new
> functionality to the backup job (yet). The old incremental backup mode
> is maintained as a syntactic sugar for sync=bitmap, mode=on-success.
> 
> Add all of the plumbing necessary to support this new instruction.
> 
> Signed-off-by: John Snow 
> ---
>  block/backup.c| 20 
>  block/mirror.c|  6 --
>  block/replication.c   |  2 +-
>  blockdev.c| 25 +++--
>  include/block/block_int.h |  4 +++-
>  qapi/block-core.json  | 21 +++--
>  6 files changed, 58 insertions(+), 20 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [Qemu-devel] [PATCH v3 2/9] hw/block/pflash_cfi01: Use the correct READ_ARRAY value

2019-07-09 Thread Philippe Mathieu-Daudé

On 7/9/19 7:10 PM, Dr. David Alan Gilbert wrote:
> * Philippe Mathieu-Daudé (phi...@redhat.com) wrote:
>> Hi David,
>>
>> On 7/9/19 12:30 PM, Dr. David Alan Gilbert wrote:
>>> * Philippe Mathieu-Daudé (phi...@redhat.com) wrote:
 In the "Read Array Flowchart" the command has a value of 0xFF.

 In the document [*] the "Read Array Flowchart", the READ_ARRAY
 command has a value of 0xff.

 Use the correct value in the pflash model.

 There is no change of behavior in the guest, because:
 - when the guest were sending 0xFF, the reset_flash label
   was setting the command value as 0x00
 - 0x00 was used internally for READ_ARRAY

 To keep migration behaving correctly, we have to increase
 the VMState version. When migrating from an older version,
 we use the correct command value.
>>>
>>> The problem is that incrementing the version will break backwards
>>> compatibility; so you won't be able to migrate this back to an older
>>> QEMU version; so for example a q35/uefi with this won't be able
>>> to migrate backwards to a 4.0.0 or older qemu.
>>>
>>> So instead of bumping the version_id you probably need to wire
>>> the behaviour to a machine type and then on your new type
>>> wire a subsection containing a flag; the reception of that subsection
>>> tells you to use the new/correct semantics.
>>
>> I'm starting to understand VMState subsections, but it might be overkill
>> for this change...
>>
>>   Subsections
>>   ---
>>
>>   The most common structure change is adding new data, e.g. when adding
>>   a newer form of device, or adding that state that you previously
>>   forgot to migrate.  This is best solved using a subsection.
>>
>> This is not the case here, the field is already present and migrated.
>>
>> It seems I can use a simple pre_save hook, always migrating the
>> READ_ARRAY using the incorrect value:
>>
>> -- >8 --
>> --- a/hw/block/pflash_cfi01.c
>> +++ b/hw/block/pflash_cfi01.c
>> @@ -97,12 +97,29 @@ struct PFlashCFI01 {
>>  bool incorrect_read_array_command;
>>  };
>>
>> +static int pflash_pre_save(void *opaque)
>> +{
>> +PFlashCFI01 *s = opaque;
>> +
>> +/*
>> + * Previous to QEMU v4.1 an incorrect value of 0x00 was used for the
>> + * READ_ARRAY command. To preserve migrating to these older version,
>> + * always migrate the READ_ARRAY command as 0x00.
>> + */
>> +if (s->cmd == 0xff) {
>> +s->cmd = 0x00;
>> +}
>> +
>> +return 0;
>> +}
> 
> Be careful what happens if migration fails and you continue on the
> source - is that OK - or are you going to have to flip that back somehow
> (in a post_save).

Hmm OK...

> 
> Another way to do the same is to have a dummy field; tmp_cmd, and the
> tmp_cmd is the thing that's actually migrated and filled by pre_save
> (or use VMSTATE_WITH_TMP )
> 
> 
>>  static int pflash_post_load(void *opaque, int version_id);
>>
>>  static const VMStateDescription vmstate_pflash = {
>>  .name = "pflash_cfi01",
>>  .version_id = 1,
>>  .minimum_version_id = 1,
>> +.pre_save = pflash_pre_save,
>>  .post_load = pflash_post_load,
>>  .fields = (VMStateField[]) {
>>  VMSTATE_UINT8(wcycle, PFlashCFI01),
>> @@ -1001,5 +1018,14 @@ static int pflash_post_load(void *opaque, int
>> version_id)
>>  pfl->vmstate = qemu_add_vm_change_state_handler(postload_update_cb,
>>  pfl);
>>  }
>> +
>> +/*
>> + * Previous to QEMU v4.1 an incorrect value of 0x00 was used for the
>> + * READ_ARRAY command.
>> + */
>> +if (pfl->cmd == 0x00) {
>> +pfl->cmd = 0xff;
>> +}
>> +
>>  return 0;
>>  }
>> ---
>>
>> Being simpler and less intrusive (no new property in hw/core/machine.c),
>> is this acceptable?
> 
> From the migration point of view yes; I don't know enough about pflash
> to say if it makes sense;  for example could there ever be a 00 command
> really used and then you'd have to distinguish that somehow?

Well, I think this change is simpler than it looks.

Currently the code is (what will run on older guest):

static uint32_t pflash_read(PFlashCFI01 *pfl, hwaddr offset,
int width, int be)
{
switch (pfl->cmd) {
default:
/* This should never happen : reset state & treat it as a read*/
DPRINTF("%s: unknown command state: %x\n", __func__, pfl->cmd);
pfl->wcycle = 0;
pfl->cmd = 0;
/* fall through to read code */
case 0x00:
/* Flash area read */
ret = pflash_data_read(pfl, offset, width, be);
break;

and:

static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
 uint32_t value, int width, int be)
{
switch (pfl->wcycle) {
case 0:
/* read mode */
switch (cmd) {
case 0x00: /* ??? */
goto reset_flash;
case 0xff: /* Read array mode */
DPRINTF("%s: Read array mode\n", _

Re: [Qemu-block] [PATCH v3 03/18] blockdev-backup: utilize do_backup_common

2019-07-09 Thread Max Reitz

On 05.07.19 22:16, John Snow wrote:
> Signed-off-by: John Snow 
> ---
>  blockdev.c | 65 +-
>  1 file changed, 6 insertions(+), 59 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH v3 02/18] drive-backup: create do_backup_common

2019-07-09 Thread Max Reitz

On 05.07.19 22:16, John Snow wrote:
> Create a common core that comprises the actual meat of what the backup API
> boundary needs to do, and then switch drive-backup to use it.
> 
> Signed-off-by: John Snow 
> ---
>  blockdev.c | 122 +
>  1 file changed, 67 insertions(+), 55 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [Qemu-devel] [PATCH v3 2/9] hw/block/pflash_cfi01: Use the correct READ_ARRAY value

2019-07-09 Thread Dr. David Alan Gilbert

* Philippe Mathieu-Daudé (phi...@redhat.com) wrote:
> Hi David,
> 
> On 7/9/19 12:30 PM, Dr. David Alan Gilbert wrote:
> > * Philippe Mathieu-Daudé (phi...@redhat.com) wrote:
> >> In the "Read Array Flowchart" the command has a value of 0xFF.
> >>
> >> In the document [*] the "Read Array Flowchart", the READ_ARRAY
> >> command has a value of 0xff.
> >>
> >> Use the correct value in the pflash model.
> >>
> >> There is no change of behavior in the guest, because:
> >> - when the guest were sending 0xFF, the reset_flash label
> >>   was setting the command value as 0x00
> >> - 0x00 was used internally for READ_ARRAY
> >>
> >> To keep migration behaving correctly, we have to increase
> >> the VMState version. When migrating from an older version,
> >> we use the correct command value.
> > 
> > The problem is that incrementing the version will break backwards
> > compatibility; so you won't be able to migrate this back to an older
> > QEMU version; so for example a q35/uefi with this won't be able
> > to migrate backwards to a 4.0.0 or older qemu.
> > 
> > So instead of bumping the version_id you probably need to wire
> > the behaviour to a machine type and then on your new type
> > wire a subsection containing a flag; the reception of that subsection
> > tells you to use the new/correct semantics.
> 
> I'm starting to understand VMState subsections, but it might be overkill
> for this change...
> 
>   Subsections
>   ---
> 
>   The most common structure change is adding new data, e.g. when adding
>   a newer form of device, or adding that state that you previously
>   forgot to migrate.  This is best solved using a subsection.
> 
> This is not the case here, the field is already present and migrated.
> 
> It seems I can use a simple pre_save hook, always migrating the
> READ_ARRAY using the incorrect value:
> 
> -- >8 --
> --- a/hw/block/pflash_cfi01.c
> +++ b/hw/block/pflash_cfi01.c
> @@ -97,12 +97,29 @@ struct PFlashCFI01 {
>  bool incorrect_read_array_command;
>  };
> 
> +static int pflash_pre_save(void *opaque)
> +{
> +PFlashCFI01 *s = opaque;
> +
> +/*
> + * Previous to QEMU v4.1 an incorrect value of 0x00 was used for the
> + * READ_ARRAY command. To preserve migrating to these older version,
> + * always migrate the READ_ARRAY command as 0x00.
> + */
> +if (s->cmd == 0xff) {
> +s->cmd = 0x00;
> +}
> +
> +return 0;
> +}

Be careful what happens if migration fails and you continue on the
source - is that OK - or are you going to have to flip that back somehow
(in a post_save).

Another way to do the same is to have a dummy field; tmp_cmd, and the
tmp_cmd is the thing that's actually migrated and filled by pre_save
(or use VMSTATE_WITH_TMP )


>  static int pflash_post_load(void *opaque, int version_id);
> 
>  static const VMStateDescription vmstate_pflash = {
>  .name = "pflash_cfi01",
>  .version_id = 1,
>  .minimum_version_id = 1,
> +.pre_save = pflash_pre_save,
>  .post_load = pflash_post_load,
>  .fields = (VMStateField[]) {
>  VMSTATE_UINT8(wcycle, PFlashCFI01),
> @@ -1001,5 +1018,14 @@ static int pflash_post_load(void *opaque, int
> version_id)
>  pfl->vmstate = qemu_add_vm_change_state_handler(postload_update_cb,
>  pfl);
>  }
> +
> +/*
> + * Previous to QEMU v4.1 an incorrect value of 0x00 was used for the
> + * READ_ARRAY command.
> + */
> +if (pfl->cmd == 0x00) {
> +pfl->cmd = 0xff;
> +}
> +
>  return 0;
>  }
> ---
> 
> Being simpler and less intrusive (no new property in hw/core/machine.c),
> is this acceptable?

From the migration point of view yes; I don't know enough about pflash
to say if it makes sense;  for example could there ever be a 00 command
really used and then you'd have to distinguish that somehow?

> Thanks,
> 
> Phil.
> 
> [...]
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-block] ANNOUNCE: Tool for diffing and backing up images and block devices

2019-07-09 Thread Dan Shearer

ddb is now available at https://github.com/gladserv/ddb and is intended for
comparing block devices. The definition of "device" is very broad and can be
custom-defined.

ddb also knows about sparse files.

--
Dan Shearer
d...@shearer.org

Re: [Qemu-block] [PULL 0/1] pflash-next patches for hard features freeze

2019-07-09 Thread Peter Maydell

On Tue, 9 Jul 2019 at 16:21, Philippe Mathieu-Daudé  wrote:
>
> The following changes since commit a538626aff7c8934ec47bc6ed41cac5bd1b7723c:
>
>   Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20190709' into 
> staging (2019-07-09 11:49:26 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/philmd/qemu.git tags/pflash-next-20190709
>
> for you to fetch changes up to 51500d37700904a0ee1ef775a585d871b36f7060:
>
>   Revert "hw/block/pflash_cfi02: Reduce I/O accesses to 16-bit" (2019-07-09 
> 17:14:39 +0200)
>
> 
> Restore 32-bit I/O accesses on AMD flashes
> (precautionary revert).
>
> 
>
> Philippe Mathieu-Daudé (1):
>   Revert "hw/block/pflash_cfi02: Reduce I/O accesses to 16-bit"
>
>  hw/block/pflash_cfi02.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.1
for any user-visible changes.

-- PMM

Re: [Qemu-block] [PATCH v3] block/rbd: implement .bdrv_get_allocated_file_size callback

2019-07-09 Thread Max Reitz

On 09.07.19 15:09, Stefano Garzarella wrote:
> On Tue, Jul 09, 2019 at 08:55:19AM -0400, Jason Dillaman wrote:
>> On Tue, Jul 9, 2019 at 5:45 AM Max Reitz  wrote:
>>>
>>> On 09.07.19 10:55, Max Reitz wrote:
 On 09.07.19 05:08, Jason Dillaman wrote:
> On Fri, Jul 5, 2019 at 6:43 AM Stefano Garzarella  
> wrote:
>>
>> On Fri, Jul 05, 2019 at 11:58:43AM +0200, Max Reitz wrote:
>>> On 05.07.19 11:32, Stefano Garzarella wrote:
 This patch allows 'qemu-img info' to show the 'disk size' for
 the RBD images that have the fast-diff feature enabled.

 If this feature is enabled, we use the rbd_diff_iterate2() API
 to calculate the allocated size for the image.

 Signed-off-by: Stefano Garzarella 
 ---
 v3:
   - return -ENOTSUP instead of -1 when fast-diff is not available
 [John, Jason]
 v2:
   - calculate the actual usage only if the fast-diff feature is
 enabled [Jason]
 ---
  block/rbd.c | 54 +
  1 file changed, 54 insertions(+)
>>>
>>> Well, the librbd documentation is non-existing as always, but while
>>> googling, I at least found that libvirt has exactly the same code.  So I
>>> suppose it must be quite correct, then.
>>>
>>
>> While I wrote this code I took a look at libvirt implementation and also
>> at the "rbd" tool in the ceph repository: compute_image_disk_usage() in
>> src/tools/rbd/action/DiskUsage.cc
>>
 diff --git a/block/rbd.c b/block/rbd.c
 index 59757b3120..b6bed683e5 100644
 --- a/block/rbd.c
 +++ b/block/rbd.c
 @@ -1084,6 +1084,59 @@ static int64_t 
 qemu_rbd_getlength(BlockDriverState *bs)
  return info.size;
  }

 +static int rbd_allocated_size_cb(uint64_t offset, size_t len, int 
 exists,
 + void *arg)
 +{
 +int64_t *alloc_size = (int64_t *) arg;
 +
 +if (exists) {
 +(*alloc_size) += len;
 +}
 +
 +return 0;
 +}
 +
 +static int64_t qemu_rbd_get_allocated_file_size(BlockDriverState *bs)
 +{
 +BDRVRBDState *s = bs->opaque;
 +uint64_t flags, features;
 +int64_t alloc_size = 0;
 +int r;
 +
 +r = rbd_get_flags(s->image, &flags);
 +if (r < 0) {
 +return r;
 +}
 +
 +r = rbd_get_features(s->image, &features);
 +if (r < 0) {
 +return r;
 +}
 +
 +/*
 + * We use rbd_diff_iterate2() only if the RBD image have fast-diff
 + * feature enabled. If it is disabled, rbd_diff_iterate2() could 
 be
 + * very slow on a big image.
 + */
 +if (!(features & RBD_FEATURE_FAST_DIFF) ||
 +(flags & RBD_FLAG_FAST_DIFF_INVALID)) {
 +return -ENOTSUP;
 +}
 +
 +/*
 + * rbd_diff_iterate2(), if the source snapshot name is NULL, 
 invokes
 + * the callback on all allocated regions of the image.
 + */
 +r = rbd_diff_iterate2(s->image, NULL, 0,
 +  bs->total_sectors * BDRV_SECTOR_SIZE, 0, 1,
 +  &rbd_allocated_size_cb, &alloc_size);
>>>
>>> But I have a question.  This is basically block_status, right?  So it
>>> gives us information on which areas are allocated and which are not.
>>> The result thus gives us a lower bound on the allocation size, but is it
>>> really exactly the allocation size?
>>>
>>> There are two things I’m concerned about:
>>>
>>> 1. What about metadata?
>>
>> Good question, I don't think it includes the size used by metadata and I
>> don't know if there is a way to know it. I'll check better.
>
> It does not include the size of metadata, the "rbd_diff_iterate2"
> function is literally just looking for touched data blocks within the
> RBD image.
>
>>>
>>> 2. If you have multiple snapshots, this will only report the overall
>>> allocation information, right?  So say there is something like this:
>>>
>>> (“A” means an allocated MB, “-” is an unallocated MB)
>>>
>>> Snapshot 1: ---
>>> Snapshot 2: --A
>>> Snapshot 3: ---
>>>
>>> I think the allocated data size is the number of As in total (13 MB).
>>> But I suppose this API will just return 7 MB, because it looks on
>>> everything an it sees the whole image range (7 MB) to be allocated.  It
>>> doesn’t report in how many snapshots some region is allocated.
>
> It should return 13 dirty da

[Qemu-block] [PULL 0/1] pflash-next patches for hard features freeze

2019-07-09 Thread Philippe Mathieu-Daudé

The following changes since commit a538626aff7c8934ec47bc6ed41cac5bd1b7723c:

  Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20190709' into 
staging (2019-07-09 11:49:26 +0100)

are available in the Git repository at:

  https://gitlab.com/philmd/qemu.git tags/pflash-next-20190709

for you to fetch changes up to 51500d37700904a0ee1ef775a585d871b36f7060:

  Revert "hw/block/pflash_cfi02: Reduce I/O accesses to 16-bit" (2019-07-09 
17:14:39 +0200)


Restore 32-bit I/O accesses on AMD flashes
(precautionary revert).



Philippe Mathieu-Daudé (1):
  Revert "hw/block/pflash_cfi02: Reduce I/O accesses to 16-bit"

 hw/block/pflash_cfi02.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

-- 
2.20.1

[Qemu-block] [PULL 1/1] Revert "hw/block/pflash_cfi02: Reduce I/O accesses to 16-bit"

2019-07-09 Thread Philippe Mathieu-Daudé

This reverts commit 3ae0343db69c379beb5750b4ed70794bbed51b85.

Stephen Checkoway noticed commit 3ae0343db69 is incorrect.
This commit state all parallel flashes are limited to 16-bit
accesses, however the x32 configuration exists in some models,
such the Cypress S29CL032J, which CFI Device Geometry Definition
announces:

  CFI ADDR DATA
  0x28,0x29 = 0x0003 (x32-only asynchronous interface)

Guests should not be affected by the previous change, because
QEMU does not announce itself as x32 capable:

/* Flash device interface (8 & 16 bits) */
pfl->cfi_table[0x28] = 0x02;
pfl->cfi_table[0x29] = 0x00;

Commit 3ae0343db69 does not restrict the bus to 16-bit accesses,
but restrict the implementation as 16-bit access max, so a guest
32-bit access will result in 2x 16-bit calls.

Now, we have 2 boards that register the flash device in 32-bit
access:

- PPC: taihu_405ep

  The CFI id matches the S29AL008J that is a 1MB in x16, while
  the code QEMU forces it to be 2MB, and checking Linux it expects
  a 4MB flash.

- ARM: Digic4

  While the comment says "Samsung K8P3215UQB 64M Bit (4Mx16)",
  this flash is 32Mb (2MB). Also note the CFI id does not match
  the comment.

To avoid unexpected side effect, we revert commit 3ae0343db69,
and will clean the board code later.

Reported-by: Stephen Checkoway 
Reviewed-by: Markus Armbruster 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/block/pflash_cfi02.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/block/pflash_cfi02.c b/hw/block/pflash_cfi02.c
index 5392290c72..83084b9d72 100644
--- a/hw/block/pflash_cfi02.c
+++ b/hw/block/pflash_cfi02.c
@@ -317,6 +317,8 @@ static uint64_t pflash_read(void *opaque, hwaddr offset, 
unsigned int width)
 boff = offset & 0xFF;
 if (pfl->width == 2) {
 boff = boff >> 1;
+} else if (pfl->width == 4) {
+boff = boff >> 2;
 }
 switch (pfl->cmd) {
 default:
@@ -447,6 +449,8 @@ static void pflash_write(void *opaque, hwaddr offset, 
uint64_t value,
 boff = offset;
 if (pfl->width == 2) {
 boff = boff >> 1;
+} else if (pfl->width == 4) {
+boff = boff >> 2;
 }
 /* Only the least-significant 11 bits are used in most cases. */
 boff &= 0x7FF;
@@ -706,7 +710,6 @@ static void pflash_write(void *opaque, hwaddr offset, 
uint64_t value,
 static const MemoryRegionOps pflash_cfi02_ops = {
 .read = pflash_read,
 .write = pflash_write,
-.impl.max_access_size = 2,
 .valid.min_access_size = 1,
 .valid.max_access_size = 4,
 .endianness = DEVICE_NATIVE_ENDIAN,
-- 
2.20.1

Re: [Qemu-block] [Qemu-devel] [PATCH-for-4.1] Revert "hw/block/pflash_cfi02: Reduce I/O accesses to 16-bit"

2019-07-09 Thread Markus Armbruster

Philippe Mathieu-Daudé  writes:

> On 7/9/19 1:34 PM, Markus Armbruster wrote:
>> Your subject matches the one git-revert creates (good), but you don't
>
> I was wondering whether rewrite it as "Restore 32-bit I/O accesses",
> keeping the original subject is better?

I recommend to start with git-revert, resolve the conflicts if any, and
append appropriate detail (rationale!) to git-revert's commit message.
That way, people immediately see it's a revert, and of what.

[...]

Re: [Qemu-block] [Qemu-devel] [PATCH v3 2/9] hw/block/pflash_cfi01: Use the correct READ_ARRAY value

2019-07-09 Thread Philippe Mathieu-Daudé

Hi David,

On 7/9/19 12:30 PM, Dr. David Alan Gilbert wrote:
> * Philippe Mathieu-Daudé (phi...@redhat.com) wrote:
>> In the "Read Array Flowchart" the command has a value of 0xFF.
>>
>> In the document [*] the "Read Array Flowchart", the READ_ARRAY
>> command has a value of 0xff.
>>
>> Use the correct value in the pflash model.
>>
>> There is no change of behavior in the guest, because:
>> - when the guest were sending 0xFF, the reset_flash label
>>   was setting the command value as 0x00
>> - 0x00 was used internally for READ_ARRAY
>>
>> To keep migration behaving correctly, we have to increase
>> the VMState version. When migrating from an older version,
>> we use the correct command value.
> 
> The problem is that incrementing the version will break backwards
> compatibility; so you won't be able to migrate this back to an older
> QEMU version; so for example a q35/uefi with this won't be able
> to migrate backwards to a 4.0.0 or older qemu.
> 
> So instead of bumping the version_id you probably need to wire
> the behaviour to a machine type and then on your new type
> wire a subsection containing a flag; the reception of that subsection
> tells you to use the new/correct semantics.

I'm starting to understand VMState subsections, but it might be overkill
for this change...

  Subsections
  ---

  The most common structure change is adding new data, e.g. when adding
  a newer form of device, or adding that state that you previously
  forgot to migrate.  This is best solved using a subsection.

This is not the case here, the field is already present and migrated.

It seems I can use a simple pre_save hook, always migrating the
READ_ARRAY using the incorrect value:

-- >8 --
--- a/hw/block/pflash_cfi01.c
+++ b/hw/block/pflash_cfi01.c
@@ -97,12 +97,29 @@ struct PFlashCFI01 {
 bool incorrect_read_array_command;
 };

+static int pflash_pre_save(void *opaque)
+{
+PFlashCFI01 *s = opaque;
+
+/*
+ * Previous to QEMU v4.1 an incorrect value of 0x00 was used for the
+ * READ_ARRAY command. To preserve migrating to these older version,
+ * always migrate the READ_ARRAY command as 0x00.
+ */
+if (s->cmd == 0xff) {
+s->cmd = 0x00;
+}
+
+return 0;
+}
+
 static int pflash_post_load(void *opaque, int version_id);

 static const VMStateDescription vmstate_pflash = {
 .name = "pflash_cfi01",
 .version_id = 1,
 .minimum_version_id = 1,
+.pre_save = pflash_pre_save,
 .post_load = pflash_post_load,
 .fields = (VMStateField[]) {
 VMSTATE_UINT8(wcycle, PFlashCFI01),
@@ -1001,5 +1018,14 @@ static int pflash_post_load(void *opaque, int
version_id)
 pfl->vmstate = qemu_add_vm_change_state_handler(postload_update_cb,
 pfl);
 }
+
+/*
+ * Previous to QEMU v4.1 an incorrect value of 0x00 was used for the
+ * READ_ARRAY command.
+ */
+if (pfl->cmd == 0x00) {
+pfl->cmd = 0xff;
+}
+
 return 0;
 }
---

Being simpler and less intrusive (no new property in hw/core/machine.c),
is this acceptable?

Thanks,

Phil.

[...]

Re: [Qemu-block] [PATCH v3] block/rbd: implement .bdrv_get_allocated_file_size callback

2019-07-09 Thread Stefano Garzarella

On Tue, Jul 09, 2019 at 08:55:19AM -0400, Jason Dillaman wrote:
> On Tue, Jul 9, 2019 at 5:45 AM Max Reitz  wrote:
> >
> > On 09.07.19 10:55, Max Reitz wrote:
> > > On 09.07.19 05:08, Jason Dillaman wrote:
> > >> On Fri, Jul 5, 2019 at 6:43 AM Stefano Garzarella  
> > >> wrote:
> > >>>
> > >>> On Fri, Jul 05, 2019 at 11:58:43AM +0200, Max Reitz wrote:
> >  On 05.07.19 11:32, Stefano Garzarella wrote:
> > > This patch allows 'qemu-img info' to show the 'disk size' for
> > > the RBD images that have the fast-diff feature enabled.
> > >
> > > If this feature is enabled, we use the rbd_diff_iterate2() API
> > > to calculate the allocated size for the image.
> > >
> > > Signed-off-by: Stefano Garzarella 
> > > ---
> > > v3:
> > >   - return -ENOTSUP instead of -1 when fast-diff is not available
> > > [John, Jason]
> > > v2:
> > >   - calculate the actual usage only if the fast-diff feature is
> > > enabled [Jason]
> > > ---
> > >  block/rbd.c | 54 
> > > +
> > >  1 file changed, 54 insertions(+)
> > 
> >  Well, the librbd documentation is non-existing as always, but while
> >  googling, I at least found that libvirt has exactly the same code.  So 
> >  I
> >  suppose it must be quite correct, then.
> > 
> > >>>
> > >>> While I wrote this code I took a look at libvirt implementation and also
> > >>> at the "rbd" tool in the ceph repository: compute_image_disk_usage() in
> > >>> src/tools/rbd/action/DiskUsage.cc
> > >>>
> > > diff --git a/block/rbd.c b/block/rbd.c
> > > index 59757b3120..b6bed683e5 100644
> > > --- a/block/rbd.c
> > > +++ b/block/rbd.c
> > > @@ -1084,6 +1084,59 @@ static int64_t 
> > > qemu_rbd_getlength(BlockDriverState *bs)
> > >  return info.size;
> > >  }
> > >
> > > +static int rbd_allocated_size_cb(uint64_t offset, size_t len, int 
> > > exists,
> > > + void *arg)
> > > +{
> > > +int64_t *alloc_size = (int64_t *) arg;
> > > +
> > > +if (exists) {
> > > +(*alloc_size) += len;
> > > +}
> > > +
> > > +return 0;
> > > +}
> > > +
> > > +static int64_t qemu_rbd_get_allocated_file_size(BlockDriverState *bs)
> > > +{
> > > +BDRVRBDState *s = bs->opaque;
> > > +uint64_t flags, features;
> > > +int64_t alloc_size = 0;
> > > +int r;
> > > +
> > > +r = rbd_get_flags(s->image, &flags);
> > > +if (r < 0) {
> > > +return r;
> > > +}
> > > +
> > > +r = rbd_get_features(s->image, &features);
> > > +if (r < 0) {
> > > +return r;
> > > +}
> > > +
> > > +/*
> > > + * We use rbd_diff_iterate2() only if the RBD image have 
> > > fast-diff
> > > + * feature enabled. If it is disabled, rbd_diff_iterate2() could 
> > > be
> > > + * very slow on a big image.
> > > + */
> > > +if (!(features & RBD_FEATURE_FAST_DIFF) ||
> > > +(flags & RBD_FLAG_FAST_DIFF_INVALID)) {
> > > +return -ENOTSUP;
> > > +}
> > > +
> > > +/*
> > > + * rbd_diff_iterate2(), if the source snapshot name is NULL, 
> > > invokes
> > > + * the callback on all allocated regions of the image.
> > > + */
> > > +r = rbd_diff_iterate2(s->image, NULL, 0,
> > > +  bs->total_sectors * BDRV_SECTOR_SIZE, 0, 1,
> > > +  &rbd_allocated_size_cb, &alloc_size);
> > 
> >  But I have a question.  This is basically block_status, right?  So it
> >  gives us information on which areas are allocated and which are not.
> >  The result thus gives us a lower bound on the allocation size, but is 
> >  it
> >  really exactly the allocation size?
> > 
> >  There are two things I’m concerned about:
> > 
> >  1. What about metadata?
> > >>>
> > >>> Good question, I don't think it includes the size used by metadata and I
> > >>> don't know if there is a way to know it. I'll check better.
> > >>
> > >> It does not include the size of metadata, the "rbd_diff_iterate2"
> > >> function is literally just looking for touched data blocks within the
> > >> RBD image.
> > >>
> > 
> >  2. If you have multiple snapshots, this will only report the overall
> >  allocation information, right?  So say there is something like this:
> > 
> >  (“A” means an allocated MB, “-” is an unallocated MB)
> > 
> >  Snapshot 1: ---
> >  Snapshot 2: --A
> >  Snapshot 3: ---
> > 
> >  I think the allocated data size is the number of As in total (13 MB).
> >  But I suppose this API will just return 7 MB, because it looks on
> >  everything an it sees the whole image range (7 MB) to be allocated.

Re: [Qemu-block] [PATCH v3] block/rbd: implement .bdrv_get_allocated_file_size callback

2019-07-09 Thread Jason Dillaman

On Tue, Jul 9, 2019 at 5:45 AM Max Reitz  wrote:
>
> On 09.07.19 10:55, Max Reitz wrote:
> > On 09.07.19 05:08, Jason Dillaman wrote:
> >> On Fri, Jul 5, 2019 at 6:43 AM Stefano Garzarella  
> >> wrote:
> >>>
> >>> On Fri, Jul 05, 2019 at 11:58:43AM +0200, Max Reitz wrote:
>  On 05.07.19 11:32, Stefano Garzarella wrote:
> > This patch allows 'qemu-img info' to show the 'disk size' for
> > the RBD images that have the fast-diff feature enabled.
> >
> > If this feature is enabled, we use the rbd_diff_iterate2() API
> > to calculate the allocated size for the image.
> >
> > Signed-off-by: Stefano Garzarella 
> > ---
> > v3:
> >   - return -ENOTSUP instead of -1 when fast-diff is not available
> > [John, Jason]
> > v2:
> >   - calculate the actual usage only if the fast-diff feature is
> > enabled [Jason]
> > ---
> >  block/rbd.c | 54 +
> >  1 file changed, 54 insertions(+)
> 
>  Well, the librbd documentation is non-existing as always, but while
>  googling, I at least found that libvirt has exactly the same code.  So I
>  suppose it must be quite correct, then.
> 
> >>>
> >>> While I wrote this code I took a look at libvirt implementation and also
> >>> at the "rbd" tool in the ceph repository: compute_image_disk_usage() in
> >>> src/tools/rbd/action/DiskUsage.cc
> >>>
> > diff --git a/block/rbd.c b/block/rbd.c
> > index 59757b3120..b6bed683e5 100644
> > --- a/block/rbd.c
> > +++ b/block/rbd.c
> > @@ -1084,6 +1084,59 @@ static int64_t 
> > qemu_rbd_getlength(BlockDriverState *bs)
> >  return info.size;
> >  }
> >
> > +static int rbd_allocated_size_cb(uint64_t offset, size_t len, int 
> > exists,
> > + void *arg)
> > +{
> > +int64_t *alloc_size = (int64_t *) arg;
> > +
> > +if (exists) {
> > +(*alloc_size) += len;
> > +}
> > +
> > +return 0;
> > +}
> > +
> > +static int64_t qemu_rbd_get_allocated_file_size(BlockDriverState *bs)
> > +{
> > +BDRVRBDState *s = bs->opaque;
> > +uint64_t flags, features;
> > +int64_t alloc_size = 0;
> > +int r;
> > +
> > +r = rbd_get_flags(s->image, &flags);
> > +if (r < 0) {
> > +return r;
> > +}
> > +
> > +r = rbd_get_features(s->image, &features);
> > +if (r < 0) {
> > +return r;
> > +}
> > +
> > +/*
> > + * We use rbd_diff_iterate2() only if the RBD image have fast-diff
> > + * feature enabled. If it is disabled, rbd_diff_iterate2() could be
> > + * very slow on a big image.
> > + */
> > +if (!(features & RBD_FEATURE_FAST_DIFF) ||
> > +(flags & RBD_FLAG_FAST_DIFF_INVALID)) {
> > +return -ENOTSUP;
> > +}
> > +
> > +/*
> > + * rbd_diff_iterate2(), if the source snapshot name is NULL, 
> > invokes
> > + * the callback on all allocated regions of the image.
> > + */
> > +r = rbd_diff_iterate2(s->image, NULL, 0,
> > +  bs->total_sectors * BDRV_SECTOR_SIZE, 0, 1,
> > +  &rbd_allocated_size_cb, &alloc_size);
> 
>  But I have a question.  This is basically block_status, right?  So it
>  gives us information on which areas are allocated and which are not.
>  The result thus gives us a lower bound on the allocation size, but is it
>  really exactly the allocation size?
> 
>  There are two things I’m concerned about:
> 
>  1. What about metadata?
> >>>
> >>> Good question, I don't think it includes the size used by metadata and I
> >>> don't know if there is a way to know it. I'll check better.
> >>
> >> It does not include the size of metadata, the "rbd_diff_iterate2"
> >> function is literally just looking for touched data blocks within the
> >> RBD image.
> >>
> 
>  2. If you have multiple snapshots, this will only report the overall
>  allocation information, right?  So say there is something like this:
> 
>  (“A” means an allocated MB, “-” is an unallocated MB)
> 
>  Snapshot 1: ---
>  Snapshot 2: --A
>  Snapshot 3: ---
> 
>  I think the allocated data size is the number of As in total (13 MB).
>  But I suppose this API will just return 7 MB, because it looks on
>  everything an it sees the whole image range (7 MB) to be allocated.  It
>  doesn’t report in how many snapshots some region is allocated.
> >>
> >> It should return 13 dirty data blocks (multipled by the size of the
> >> data block) since when you don't provide a "from snapshot" name, it
> >> will iterate from the first snapshot to the HEAD revision.
> >
> > Have you tested that?
> >
> > I‘m so skeptical because the callbac

Re: [Qemu-block] [Qemu-devel] [PATCH-for-4.1] Revert "hw/block/pflash_cfi02: Reduce I/O accesses to 16-bit"

2019-07-09 Thread Philippe Mathieu-Daudé

On 7/9/19 1:34 PM, Markus Armbruster wrote:
> Your subject matches the one git-revert creates (good), but you don't

I was wondering whether rewrite it as "Restore 32-bit I/O accesses",
keeping the original subject is better?

> have the rest of its commit message:
> 
>   This reverts commit 3ae0343db69c379beb5750b4ed70794bbed51b85.
> 
> Please add that line.

OK.

> The patch is a clean revert.
> 
> Reviewed-by: Markus Armbruster 
> 

Thanks!

Re: [Qemu-block] [Qemu-devel] [PATCH-for-4.1] Revert "hw/block/pflash_cfi02: Reduce I/O accesses to 16-bit"

2019-07-09 Thread Markus Armbruster

Your subject matches the one git-revert creates (good), but you don't
have the rest of its commit message:

  This reverts commit 3ae0343db69c379beb5750b4ed70794bbed51b85.

Please add that line.

The patch is a clean revert.

Reviewed-by: Markus Armbruster

Re: [Qemu-block] [Qemu-devel] [PATCH v1 1/3] qcow2: introduce compression type feature

2019-07-09 Thread Markus Armbruster

I'd like to recommend a few commas.

Denis Plotnikov  writes:

> The patch adds some preparation parts for incompatible compression type
> feature to QCOW2 header that indicates that *all* compressed clusters
> must be (de)compressed using a certain compression type.
>
> It is implied that the compression type is set on the image creation and
> can be changed only later by image conversion, thus compression type
> defines the only compression algorithm used for the image.
>
> The goal of the feature is to add support of other compression algorithms
> to qcow2. For example, ZSTD which is more effective on compression than ZLIB.
> It works roughly x2 faster than ZLIB providing a comparable compression ratio
> and therefore provide a performance advantage in backup scenarios.
>
> The default compression is ZLIB. Images created with ZLIB compression type
> is backward compatible with older qemu versions.
>
> Signed-off-by: Denis Plotnikov 
[...]
> diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
> index af5711e533..5b8a8d15fe 100644
> --- a/docs/interop/qcow2.txt
> +++ b/docs/interop/qcow2.txt
> @@ -109,7 +109,12 @@ in the description of a field.
>  An External Data File Name header extension 
> may
>  be present if this bit is set.
>  
> -Bits 3-63:  Reserved (set to 0)
> +Bit 3:  Compression type bit. The bit must be set if
> +the compression type differs from default: 
> zlib.
> +If the compression type is default the bit 
> must
> +be unset.
> +
> +Bits 4-63:  Reserved (set to 0)
>  
>   80 -  87:  compatible_features
>  Bitmask of compatible features. An implementation can
> @@ -165,6 +170,21 @@ in the description of a field.
>  Length of the header structure in bytes. For version 2
>  images, the length is always assumed to be 72 bytes.
>  
> +104 - 107:  compression_type
> +Defines the compression method used for compressed 
> clusters.
> +A single compression type is applied to all compressed 
> image
> +clusters.
> +The compression type is set on image creation only.
> +The default compression type is zlib.
> +When the deafult compression type is used the compression

Comma after used, please.

> +type bit (incompatible feature bit 3) must be unset.
> +When any other compression type is used the compression

Likewise.

> +type bit must be set.
> +Qemu versions older than 4.1 can use images created with
> +the default compression type without any additional
> +preparations and cannot use images created with any other

Comma after preparations, please.

> +compression type.
> +
>  Directly after the image header, optional sections called header extensions 
> can
>  be stored. Each extension has a structure like the following:
>  
[...]

Re: [Qemu-block] [Qemu-devel] [PATCH v3 2/9] hw/block/pflash_cfi01: Use the correct READ_ARRAY value

2019-07-09 Thread Dr. David Alan Gilbert

* Philippe Mathieu-Daudé (phi...@redhat.com) wrote:
> In the "Read Array Flowchart" the command has a value of 0xFF.
> 
> In the document [*] the "Read Array Flowchart", the READ_ARRAY
> command has a value of 0xff.
> 
> Use the correct value in the pflash model.
> 
> There is no change of behavior in the guest, because:
> - when the guest were sending 0xFF, the reset_flash label
>   was setting the command value as 0x00
> - 0x00 was used internally for READ_ARRAY
> 
> To keep migration behaving correctly, we have to increase
> the VMState version. When migrating from an older version,
> we use the correct command value.

The problem is that incrementing the version will break backwards
compatibility; so you won't be able to migrate this back to an older
QEMU version; so for example a q35/uefi with this won't be able
to migrate backwards to a 4.0.0 or older qemu.

So instead of bumping the version_id you probably need to wire
the behaviour to a machine type and then on your new type
wire a subsection containing a flag; the reception of that subsection
tells you to use the new/correct semantics.


Dave


> [*] "Common Flash Interface (CFI) and Command Sets"
> (Intel Application Note 646)
> Appendix B "Basic Command Set"
> 
> Reviewed-by: John Snow 
> Reviewed-by: Alistair Francis 
> Regression-tested-by: Laszlo Ersek 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> v3: Handle migrating the 'cmd' field.
> 
> Since Laszlo stated he did not test migration [*], I'm keeping his
> test tag, because the change with v2 has no impact in the tests
> he ran.
> 
> Likewise I'm keeping John and Alistair tags, but I'd like an extra
> review for the migration change, thanks!
> 
> [*] https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg00679.html
> ---
>  hw/block/pflash_cfi01.c | 23 +--
>  1 file changed, 13 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
> index 9e34fd4e82..58cbef0588 100644
> --- a/hw/block/pflash_cfi01.c
> +++ b/hw/block/pflash_cfi01.c
> @@ -100,7 +100,7 @@ static int pflash_post_load(void *opaque, int version_id);
>  
>  static const VMStateDescription vmstate_pflash = {
>  .name = "pflash_cfi01",
> -.version_id = 1,
> +.version_id = 2,
>  .minimum_version_id = 1,
>  .post_load = pflash_post_load,
>  .fields = (VMStateField[]) {
> @@ -277,10 +277,9 @@ static uint32_t pflash_read(PFlashCFI01 *pfl, hwaddr 
> offset,
>  /* This should never happen : reset state & treat it as a read */
>  DPRINTF("%s: unknown command state: %x\n", __func__, pfl->cmd);
>  pfl->wcycle = 0;
> -pfl->cmd = 0;
> +pfl->cmd = 0xff;
>  /* fall through to read code */
> -case 0x00:
> -/* Flash area read */
> +case 0xff: /* Read Array */
>  ret = pflash_data_read(pfl, offset, width, be);
>  break;
>  case 0x10: /* Single byte program */
> @@ -448,8 +447,6 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
>  case 0:
>  /* read mode */
>  switch (cmd) {
> -case 0x00: /* ??? */
> -goto reset_flash;
>  case 0x10: /* Single Byte Program */
>  case 0x40: /* Single Byte Program */
>  DPRINTF("%s: Single Byte Program\n", __func__);
> @@ -526,7 +523,7 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
>  if (cmd == 0xd0) { /* confirm */
>  pfl->wcycle = 0;
>  pfl->status |= 0x80;
> -} else if (cmd == 0xff) { /* read array mode */
> +} else if (cmd == 0xff) { /* Read Array */
>  goto reset_flash;
>  } else
>  goto error_flash;
> @@ -553,7 +550,7 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
>  } else if (cmd == 0x01) {
>  pfl->wcycle = 0;
>  pfl->status |= 0x80;
> -} else if (cmd == 0xff) {
> +} else if (cmd == 0xff) { /* read array mode */
>  goto reset_flash;
>  } else {
>  DPRINTF("%s: Unknown (un)locking command\n", __func__);
> @@ -645,7 +642,7 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
>  trace_pflash_reset();
>  memory_region_rom_device_set_romd(&pfl->mem, true);
>  pfl->wcycle = 0;
> -pfl->cmd = 0;
> +pfl->cmd = 0xff;
>  }
>  
>  
> @@ -761,7 +758,7 @@ static void pflash_cfi01_realize(DeviceState *dev, Error 
> **errp)
>  }
>  
>  pfl->wcycle = 0;
> -pfl->cmd = 0;
> +pfl->cmd = 0xff;
>  pfl->status = 0;
>  /* Hardcoded CFI table */
>  /* Standard "QRY" string */
> @@ -1001,5 +998,11 @@ static int pflash_post_load(void *opaque, int 
> version_id)
>  pfl->vmstate = qemu_add_vm_change_state_handler(postload_update_cb,
>  pfl);
>  }
> +if (version_id < 2) {
> +/* v1 used incorrect val

Re: [Qemu-block] [PATCH v3] block/rbd: implement .bdrv_get_allocated_file_size callback

2019-07-09 Thread Max Reitz

On 09.07.19 10:55, Max Reitz wrote:
> On 09.07.19 05:08, Jason Dillaman wrote:
>> On Fri, Jul 5, 2019 at 6:43 AM Stefano Garzarella  
>> wrote:
>>>
>>> On Fri, Jul 05, 2019 at 11:58:43AM +0200, Max Reitz wrote:
 On 05.07.19 11:32, Stefano Garzarella wrote:
> This patch allows 'qemu-img info' to show the 'disk size' for
> the RBD images that have the fast-diff feature enabled.
>
> If this feature is enabled, we use the rbd_diff_iterate2() API
> to calculate the allocated size for the image.
>
> Signed-off-by: Stefano Garzarella 
> ---
> v3:
>   - return -ENOTSUP instead of -1 when fast-diff is not available
> [John, Jason]
> v2:
>   - calculate the actual usage only if the fast-diff feature is
> enabled [Jason]
> ---
>  block/rbd.c | 54 +
>  1 file changed, 54 insertions(+)

 Well, the librbd documentation is non-existing as always, but while
 googling, I at least found that libvirt has exactly the same code.  So I
 suppose it must be quite correct, then.

>>>
>>> While I wrote this code I took a look at libvirt implementation and also
>>> at the "rbd" tool in the ceph repository: compute_image_disk_usage() in
>>> src/tools/rbd/action/DiskUsage.cc
>>>
> diff --git a/block/rbd.c b/block/rbd.c
> index 59757b3120..b6bed683e5 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -1084,6 +1084,59 @@ static int64_t qemu_rbd_getlength(BlockDriverState 
> *bs)
>  return info.size;
>  }
>
> +static int rbd_allocated_size_cb(uint64_t offset, size_t len, int exists,
> + void *arg)
> +{
> +int64_t *alloc_size = (int64_t *) arg;
> +
> +if (exists) {
> +(*alloc_size) += len;
> +}
> +
> +return 0;
> +}
> +
> +static int64_t qemu_rbd_get_allocated_file_size(BlockDriverState *bs)
> +{
> +BDRVRBDState *s = bs->opaque;
> +uint64_t flags, features;
> +int64_t alloc_size = 0;
> +int r;
> +
> +r = rbd_get_flags(s->image, &flags);
> +if (r < 0) {
> +return r;
> +}
> +
> +r = rbd_get_features(s->image, &features);
> +if (r < 0) {
> +return r;
> +}
> +
> +/*
> + * We use rbd_diff_iterate2() only if the RBD image have fast-diff
> + * feature enabled. If it is disabled, rbd_diff_iterate2() could be
> + * very slow on a big image.
> + */
> +if (!(features & RBD_FEATURE_FAST_DIFF) ||
> +(flags & RBD_FLAG_FAST_DIFF_INVALID)) {
> +return -ENOTSUP;
> +}
> +
> +/*
> + * rbd_diff_iterate2(), if the source snapshot name is NULL, invokes
> + * the callback on all allocated regions of the image.
> + */
> +r = rbd_diff_iterate2(s->image, NULL, 0,
> +  bs->total_sectors * BDRV_SECTOR_SIZE, 0, 1,
> +  &rbd_allocated_size_cb, &alloc_size);

 But I have a question.  This is basically block_status, right?  So it
 gives us information on which areas are allocated and which are not.
 The result thus gives us a lower bound on the allocation size, but is it
 really exactly the allocation size?

 There are two things I’m concerned about:

 1. What about metadata?
>>>
>>> Good question, I don't think it includes the size used by metadata and I
>>> don't know if there is a way to know it. I'll check better.
>>
>> It does not include the size of metadata, the "rbd_diff_iterate2"
>> function is literally just looking for touched data blocks within the
>> RBD image.
>>

 2. If you have multiple snapshots, this will only report the overall
 allocation information, right?  So say there is something like this:

 (“A” means an allocated MB, “-” is an unallocated MB)

 Snapshot 1: ---
 Snapshot 2: --A
 Snapshot 3: ---

 I think the allocated data size is the number of As in total (13 MB).
 But I suppose this API will just return 7 MB, because it looks on
 everything an it sees the whole image range (7 MB) to be allocated.  It
 doesn’t report in how many snapshots some region is allocated.
>>
>> It should return 13 dirty data blocks (multipled by the size of the
>> data block) since when you don't provide a "from snapshot" name, it
>> will iterate from the first snapshot to the HEAD revision.
> 
> Have you tested that?
> 
> I‘m so skeptical because the callback function interface has no way of
> distinguishing between different layers of snapshots.
> 
> And also because we have the bdrv_block_status_above() function which
> just looks strikingly similar (with the difference that it does not
> invoke a callback but just returns the next allocated range).  If you
> pass base=NUL

Re: [Qemu-block] [PATCH v3] block/rbd: implement .bdrv_get_allocated_file_size callback

2019-07-09 Thread Max Reitz

On 09.07.19 05:08, Jason Dillaman wrote:
> On Fri, Jul 5, 2019 at 6:43 AM Stefano Garzarella  wrote:
>>
>> On Fri, Jul 05, 2019 at 11:58:43AM +0200, Max Reitz wrote:
>>> On 05.07.19 11:32, Stefano Garzarella wrote:
 This patch allows 'qemu-img info' to show the 'disk size' for
 the RBD images that have the fast-diff feature enabled.

 If this feature is enabled, we use the rbd_diff_iterate2() API
 to calculate the allocated size for the image.

 Signed-off-by: Stefano Garzarella 
 ---
 v3:
   - return -ENOTSUP instead of -1 when fast-diff is not available
 [John, Jason]
 v2:
   - calculate the actual usage only if the fast-diff feature is
 enabled [Jason]
 ---
  block/rbd.c | 54 +
  1 file changed, 54 insertions(+)
>>>
>>> Well, the librbd documentation is non-existing as always, but while
>>> googling, I at least found that libvirt has exactly the same code.  So I
>>> suppose it must be quite correct, then.
>>>
>>
>> While I wrote this code I took a look at libvirt implementation and also
>> at the "rbd" tool in the ceph repository: compute_image_disk_usage() in
>> src/tools/rbd/action/DiskUsage.cc
>>
 diff --git a/block/rbd.c b/block/rbd.c
 index 59757b3120..b6bed683e5 100644
 --- a/block/rbd.c
 +++ b/block/rbd.c
 @@ -1084,6 +1084,59 @@ static int64_t qemu_rbd_getlength(BlockDriverState 
 *bs)
  return info.size;
  }

 +static int rbd_allocated_size_cb(uint64_t offset, size_t len, int exists,
 + void *arg)
 +{
 +int64_t *alloc_size = (int64_t *) arg;
 +
 +if (exists) {
 +(*alloc_size) += len;
 +}
 +
 +return 0;
 +}
 +
 +static int64_t qemu_rbd_get_allocated_file_size(BlockDriverState *bs)
 +{
 +BDRVRBDState *s = bs->opaque;
 +uint64_t flags, features;
 +int64_t alloc_size = 0;
 +int r;
 +
 +r = rbd_get_flags(s->image, &flags);
 +if (r < 0) {
 +return r;
 +}
 +
 +r = rbd_get_features(s->image, &features);
 +if (r < 0) {
 +return r;
 +}
 +
 +/*
 + * We use rbd_diff_iterate2() only if the RBD image have fast-diff
 + * feature enabled. If it is disabled, rbd_diff_iterate2() could be
 + * very slow on a big image.
 + */
 +if (!(features & RBD_FEATURE_FAST_DIFF) ||
 +(flags & RBD_FLAG_FAST_DIFF_INVALID)) {
 +return -ENOTSUP;
 +}
 +
 +/*
 + * rbd_diff_iterate2(), if the source snapshot name is NULL, invokes
 + * the callback on all allocated regions of the image.
 + */
 +r = rbd_diff_iterate2(s->image, NULL, 0,
 +  bs->total_sectors * BDRV_SECTOR_SIZE, 0, 1,
 +  &rbd_allocated_size_cb, &alloc_size);
>>>
>>> But I have a question.  This is basically block_status, right?  So it
>>> gives us information on which areas are allocated and which are not.
>>> The result thus gives us a lower bound on the allocation size, but is it
>>> really exactly the allocation size?
>>>
>>> There are two things I’m concerned about:
>>>
>>> 1. What about metadata?
>>
>> Good question, I don't think it includes the size used by metadata and I
>> don't know if there is a way to know it. I'll check better.
> 
> It does not include the size of metadata, the "rbd_diff_iterate2"
> function is literally just looking for touched data blocks within the
> RBD image.
> 
>>>
>>> 2. If you have multiple snapshots, this will only report the overall
>>> allocation information, right?  So say there is something like this:
>>>
>>> (“A” means an allocated MB, “-” is an unallocated MB)
>>>
>>> Snapshot 1: ---
>>> Snapshot 2: --A
>>> Snapshot 3: ---
>>>
>>> I think the allocated data size is the number of As in total (13 MB).
>>> But I suppose this API will just return 7 MB, because it looks on
>>> everything an it sees the whole image range (7 MB) to be allocated.  It
>>> doesn’t report in how many snapshots some region is allocated.
> 
> It should return 13 dirty data blocks (multipled by the size of the
> data block) since when you don't provide a "from snapshot" name, it
> will iterate from the first snapshot to the HEAD revision.

Have you tested that?

I‘m so skeptical because the callback function interface has no way of
distinguishing between different layers of snapshots.

And also because we have the bdrv_block_status_above() function which
just looks strikingly similar (with the difference that it does not
invoke a callback but just returns the next allocated range).  If you
pass base=NULL to it, it will also “interpret that as the beginning of
time and return all allocated regions of the image” (or rather image
chain, in our case).  But it would just return 7 MB a

Re: [Qemu-block] [PATCH for-4.1] iotests: Update 082 expected output

2019-07-09 Thread Kevin Wolf

Am 08.07.2019 um 20:47 hat Eric Blake geschrieben:
> A recent tweak to the '-o help' output for qemu-img needs to be
> reflected into the iotests expected outputs.
> 
> Fixes: f7077c98
> Reported-by: Kevin Wolf 
> Signed-off-by: Eric Blake 

Thanks, applied to the block branch.

Kevin

63 matches

Mail list logo