On Mon, Jan 12, 2026 at 04:23:51PM +0100, Fiona Ebner wrote:
> Some Proxmox users reported an occasional assertion failure [0][1] in
> busy VMs when using drive mirror with active mode. In particular, the
> failure may occur for zero writes shorter than the job granularity:
> 
> > #0  0x00007b421154b507 in abort ()
> > #1  0x00007b421154b420 in ?? ()
> > #2  0x0000641c582e061f in bitmap_set (map=0x7b4204014e00, start=14, nr=-1)
> > #3  0x0000641c58062824 in do_sync_target_write (job=0x641c7e73d1e0,
> >       method=MIRROR_METHOD_ZERO, offset=852480, bytes=4096, qiov=0x0, 
> > flags=0)
> > #4  0x0000641c58062250 in bdrv_mirror_top_do_write (bs=0x641c7e62e1f0,
>         method=MIRROR_METHOD_ZERO, copy_to_target=true, offset=852480,
>         bytes=4096, qiov=0x0, flags=0)
> > #5  0x0000641c58061f31 in bdrv_mirror_top_pwrite_zeroes (bs=0x641c7e62e1f0,
>         offset=852480, bytes=4096, flags=0)
> 
> The range for the dirty bitmap described by dirty_bitmap_offset and
> dirty_bitmap_end is narrower than the original range and in fact,
> dirty_bitmap_end might be smaller than dirty_bitmap_offset. There
> already is a check for 'dirty_bitmap_offset < dirty_bitmap_end' before
> resetting the dirty bitmap. Add such a check for setting the zero
> bitmap too, which uses the same narrower range.
> 
> [0]: https://forum.proxmox.com/threads/177981/
> [1]: https://bugzilla.proxmox.com/show_bug.cgi?id=7222
> 
> Cc: [email protected]
> Fixes: 7e277545b9 ("mirror: Skip writing zeroes when target is already zero")
> Signed-off-by: Fiona Ebner <[email protected]>
> ---
>  block/mirror.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index b344182c74..bc982cb99a 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -1514,9 +1514,12 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod 
> method,
>          assert(!qiov);
>          ret = blk_co_pwrite_zeroes(job->target, offset, bytes, flags);
>          if (job->zero_bitmap && ret >= 0) {
> -            bitmap_set(job->zero_bitmap, dirty_bitmap_offset / 
> job->granularity,
> -                       (dirty_bitmap_end - dirty_bitmap_offset) /
> -                       job->granularity);
> +            if (dirty_bitmap_offset < dirty_bitmap_end) {
> +                bitmap_set(job->zero_bitmap,
> +                           dirty_bitmap_offset / job->granularity,
> +                           (dirty_bitmap_end - dirty_bitmap_offset) /
> +                           job->granularity);
> +            }

Why does this case clause use dirty_bitmap_offset and dirty_bitmap_end
instead of zero_bitmap_offset and zero_bitmap_end like the other
zero_bitmap operations in this switch statement?

   if (job->zero_bitmap && ret >= 0) {
-      bitmap_set(job->zero_bitmap, dirty_bitmap_offset / job->granularity,
-                 (dirty_bitmap_end - dirty_bitmap_offset) /
-                 job->granularity);
+      bitmap_set(job->zero_bitmap, zero_bitmap_offset,
+                 zero_bitmap_end - zero_bitmap_offset);
   }

I'm probably missing something, but it's not obvious to me :).

Stefan

Attachment: signature.asc
Description: PGP signature

Reply via email to