Re: [PATCH v4 3/5] mirror: allow specifying working bitmap

2024-06-24 Thread Vladimir Sementsov-Ogievskiy

On 21.05.24 15:20, Fiona Ebner wrote:

From: John Snow 

for the mirror job. The bitmap's granularity is used as the job's
granularity.

The new @bitmap parameter is marked unstable in the QAPI and can
currently only be used for @sync=full mode.

Clusters initially dirty in the bitmap as well as new writes are
copied to the target.

Using block-dirty-bitmap-clear and block-dirty-bitmap-merge API,
callers can simulate the three kinds of @BitmapSyncMode (which is used
by backup):
1. always: default, just pass bitmap as working bitmap.
2. never: copy bitmap and pass copy to the mirror job.
3. on-success: copy bitmap and pass copy to the mirror job and if
successful, merge bitmap into original afterwards.

When the target image is a non-COW "diff image", i.e. one that was not
used as the target of a previous mirror and the target image's cluster
size is larger than the bitmap's granularity, or when
@copy-mode=write-blocking is used, there is a pitfall, because the
cluster in the target image will be allocated, but not contain all the
data corresponding to the same region in the source image.

An idea to avoid the limitation would be to mark clusters which are
affected by unaligned writes and are not allocated in the target image
dirty, so they would be copied fully later. However, for migration,
the invariant that an actively synced mirror stays actively synced
(unless an error happens) is useful, because without that invariant,
migration might inactivate block devices when mirror still got work
to do and run into an assertion failure [0].

Another approach would be to read the missing data from the source
upon unaligned writes to be able to write the full target cluster
instead.

But certain targets like NBD do not allow querying the cluster size.
To avoid limiting/breaking the use case of syncing to an existing
target, which is arguably more common than the diff image use case,
document the limitation in QAPI.

This patch was originally based on one by Ma Haocong, but it has since
been modified pretty heavily, first by John and then again by Fiona.

[0]: 
https://lore.kernel.org/qemu-devel/1db7f571-cb7f-c293-04cc-cd856e060...@proxmox.com/

Suggested-by: Ma Haocong 
Signed-off-by: Ma Haocong 
Signed-off-by: John Snow 
[FG: switch to bdrv_dirty_bitmap_merge_internal]
Signed-off-by: Fabian Grünbichler 
Signed-off-by: Thomas Lamprecht 
[FE: rebase for 9.1
  get rid of bitmap mode parameter
  use caller-provided bitmap as working bitmap
  turn bitmap parameter experimental]
Signed-off-by: Fiona Ebner 
Acked-by: Markus Armbruster 
---
  block/mirror.c | 80 +-
  blockdev.c | 44 +++---
  include/block/block_int-global-state.h |  5 +-
  qapi/block-core.json   | 35 ++-
  tests/unit/test-block-iothread.c   |  2 +-
  5 files changed, 141 insertions(+), 25 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index ca23d6ef65..d3d0698116 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -73,6 +73,11 @@ typedef struct MirrorBlockJob {
  size_t buf_size;
  int64_t bdev_length;
  unsigned long *cow_bitmap;
+/*
+ * Whether the bitmap is created locally or provided by the caller (for
+ * incremental sync).
+ */
+bool dirty_bitmap_is_local;
  BdrvDirtyBitmap *dirty_bitmap;
  BdrvDirtyBitmapIter *dbi;
  uint8_t *buf;
@@ -691,7 +696,11 @@ static int mirror_exit_common(Job *job)
  bdrv_unfreeze_backing_chain(mirror_top_bs, target_bs);
  }
  
-bdrv_release_dirty_bitmap(s->dirty_bitmap);

+if (s->dirty_bitmap_is_local) {
+bdrv_release_dirty_bitmap(s->dirty_bitmap);
+} else {
+bdrv_enable_dirty_bitmap(s->dirty_bitmap);
+}
  
  /* Make sure that the source BDS doesn't go away during bdrv_replace_node,

   * before we can call bdrv_drained_end */
@@ -820,6 +829,16 @@ static void mirror_abort(Job *job)
  assert(ret == 0);
  }
  
+/* Always called after commit/abort. */

+static void mirror_clean(Job *job)
+{
+MirrorBlockJob *s = container_of(job, MirrorBlockJob, common.job);
+
+if (!s->dirty_bitmap_is_local && s->dirty_bitmap) {
+bdrv_dirty_bitmap_set_busy(s->dirty_bitmap, false);
+}


why not do that in existing mirror_exit_common, where we already do 
release/enable?


+}
+



--
Best regards,
Vladimir




Re: [PATCH v4 3/5] mirror: allow specifying working bitmap

2024-06-24 Thread Vladimir Sementsov-Ogievskiy

On 21.05.24 15:20, Fiona Ebner wrote:

From: John Snow 

for the mirror job. The bitmap's granularity is used as the job's
granularity.

The new @bitmap parameter is marked unstable in the QAPI and can
currently only be used for @sync=full mode.

Clusters initially dirty in the bitmap as well as new writes are
copied to the target.

Using block-dirty-bitmap-clear and block-dirty-bitmap-merge API,
callers can simulate the three kinds of @BitmapSyncMode (which is used


[..]


  /*
   * The dirty bitmap is set by bdrv_mirror_top_do_write() when not in 
active
- * mode.
+ * mode. For external/caller-provided bitmap, need to wait until
+ * bdrv_mirror_top_do_write() can actually access it before disabling.


hmm, isn't that true at least for non-mode? for other modes we rely on 
mirror_dirty_init(), but with none mode we risk to miss some writes.. That's 
preexisting, but probably it's better to move disabling the bitmap to the 
moment where we sure that we do dirty it by hand on any not-immediatelly-synced 
write. And than have _disable_() call in same place for all scenarios.


   */
-bdrv_disable_dirty_bitmap(s->dirty_bitmap);
+if (s->dirty_bitmap_is_local) {
+bdrv_disable_dirty_bitmap(s->dirty_bitmap);
+}
  




--
Best regards,
Vladimir




[PATCH v4 3/5] mirror: allow specifying working bitmap

2024-05-21 Thread Fiona Ebner
From: John Snow 

for the mirror job. The bitmap's granularity is used as the job's
granularity.

The new @bitmap parameter is marked unstable in the QAPI and can
currently only be used for @sync=full mode.

Clusters initially dirty in the bitmap as well as new writes are
copied to the target.

Using block-dirty-bitmap-clear and block-dirty-bitmap-merge API,
callers can simulate the three kinds of @BitmapSyncMode (which is used
by backup):
1. always: default, just pass bitmap as working bitmap.
2. never: copy bitmap and pass copy to the mirror job.
3. on-success: copy bitmap and pass copy to the mirror job and if
   successful, merge bitmap into original afterwards.

When the target image is a non-COW "diff image", i.e. one that was not
used as the target of a previous mirror and the target image's cluster
size is larger than the bitmap's granularity, or when
@copy-mode=write-blocking is used, there is a pitfall, because the
cluster in the target image will be allocated, but not contain all the
data corresponding to the same region in the source image.

An idea to avoid the limitation would be to mark clusters which are
affected by unaligned writes and are not allocated in the target image
dirty, so they would be copied fully later. However, for migration,
the invariant that an actively synced mirror stays actively synced
(unless an error happens) is useful, because without that invariant,
migration might inactivate block devices when mirror still got work
to do and run into an assertion failure [0].

Another approach would be to read the missing data from the source
upon unaligned writes to be able to write the full target cluster
instead.

But certain targets like NBD do not allow querying the cluster size.
To avoid limiting/breaking the use case of syncing to an existing
target, which is arguably more common than the diff image use case,
document the limitation in QAPI.

This patch was originally based on one by Ma Haocong, but it has since
been modified pretty heavily, first by John and then again by Fiona.

[0]: 
https://lore.kernel.org/qemu-devel/1db7f571-cb7f-c293-04cc-cd856e060...@proxmox.com/

Suggested-by: Ma Haocong 
Signed-off-by: Ma Haocong 
Signed-off-by: John Snow 
[FG: switch to bdrv_dirty_bitmap_merge_internal]
Signed-off-by: Fabian Grünbichler 
Signed-off-by: Thomas Lamprecht 
[FE: rebase for 9.1
 get rid of bitmap mode parameter
 use caller-provided bitmap as working bitmap
 turn bitmap parameter experimental]
Signed-off-by: Fiona Ebner 
Acked-by: Markus Armbruster 
---
 block/mirror.c | 80 +-
 blockdev.c | 44 +++---
 include/block/block_int-global-state.h |  5 +-
 qapi/block-core.json   | 35 ++-
 tests/unit/test-block-iothread.c   |  2 +-
 5 files changed, 141 insertions(+), 25 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index ca23d6ef65..d3d0698116 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -73,6 +73,11 @@ typedef struct MirrorBlockJob {
 size_t buf_size;
 int64_t bdev_length;
 unsigned long *cow_bitmap;
+/*
+ * Whether the bitmap is created locally or provided by the caller (for
+ * incremental sync).
+ */
+bool dirty_bitmap_is_local;
 BdrvDirtyBitmap *dirty_bitmap;
 BdrvDirtyBitmapIter *dbi;
 uint8_t *buf;
@@ -691,7 +696,11 @@ static int mirror_exit_common(Job *job)
 bdrv_unfreeze_backing_chain(mirror_top_bs, target_bs);
 }
 
-bdrv_release_dirty_bitmap(s->dirty_bitmap);
+if (s->dirty_bitmap_is_local) {
+bdrv_release_dirty_bitmap(s->dirty_bitmap);
+} else {
+bdrv_enable_dirty_bitmap(s->dirty_bitmap);
+}
 
 /* Make sure that the source BDS doesn't go away during bdrv_replace_node,
  * before we can call bdrv_drained_end */
@@ -820,6 +829,16 @@ static void mirror_abort(Job *job)
 assert(ret == 0);
 }
 
+/* Always called after commit/abort. */
+static void mirror_clean(Job *job)
+{
+MirrorBlockJob *s = container_of(job, MirrorBlockJob, common.job);
+
+if (!s->dirty_bitmap_is_local && s->dirty_bitmap) {
+bdrv_dirty_bitmap_set_busy(s->dirty_bitmap, false);
+}
+}
+
 static void coroutine_fn mirror_throttle(MirrorBlockJob *s)
 {
 int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -1016,7 +1035,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
 mirror_free_init(s);
 
 s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
-if (s->sync_mode != MIRROR_SYNC_MODE_NONE) {
+if (s->sync_mode != MIRROR_SYNC_MODE_NONE && s->dirty_bitmap_is_local) {
 ret = mirror_dirty_init(s);
 if (ret < 0 || job_is_cancelled(&s->common.job)) {
 goto immediate_exit;
@@ -1029,6 +1048,14 @@ static int coroutine_fn mirror_run(Job *job, Error 
**errp)
  */
 mirror_top_opaque->job = s;
 
+/*
+ * External/caller-provided bitmap can only be disabled now that
+ * bdrv_mirr