On Tue, Jan 20, 2026 at 09:57:42AM +0100, Fiona Ebner wrote: > Am 19.01.26 um 9:10 PM schrieb Stefan Hajnoczi: > > On Mon, Jan 12, 2026 at 04:23:51PM +0100, Fiona Ebner wrote: > >> Some Proxmox users reported an occasional assertion failure [0][1] in > >> busy VMs when using drive mirror with active mode. In particular, the > >> failure may occur for zero writes shorter than the job granularity: > >> > >>> #0 0x00007b421154b507 in abort () > >>> #1 0x00007b421154b420 in ?? () > >>> #2 0x0000641c582e061f in bitmap_set (map=0x7b4204014e00, start=14, nr=-1) > >>> #3 0x0000641c58062824 in do_sync_target_write (job=0x641c7e73d1e0, > >>> method=MIRROR_METHOD_ZERO, offset=852480, bytes=4096, qiov=0x0, > >>> flags=0) > >>> #4 0x0000641c58062250 in bdrv_mirror_top_do_write (bs=0x641c7e62e1f0, > >> method=MIRROR_METHOD_ZERO, copy_to_target=true, offset=852480, > >> bytes=4096, qiov=0x0, flags=0) > >>> #5 0x0000641c58061f31 in bdrv_mirror_top_pwrite_zeroes > >>> (bs=0x641c7e62e1f0, > >> offset=852480, bytes=4096, flags=0) > >> > >> The range for the dirty bitmap described by dirty_bitmap_offset and > >> dirty_bitmap_end is narrower than the original range and in fact, > >> dirty_bitmap_end might be smaller than dirty_bitmap_offset. There > >> already is a check for 'dirty_bitmap_offset < dirty_bitmap_end' before > >> resetting the dirty bitmap. Add such a check for setting the zero > >> bitmap too, which uses the same narrower range. > >> > >> [0]: https://forum.proxmox.com/threads/177981/ > >> [1]: https://bugzilla.proxmox.com/show_bug.cgi?id=7222 > >> > >> Cc: [email protected] > >> Fixes: 7e277545b9 ("mirror: Skip writing zeroes when target is already > >> zero") > >> Signed-off-by: Fiona Ebner <[email protected]> > >> --- > >> block/mirror.c | 9 ++++++--- > >> 1 file changed, 6 insertions(+), 3 deletions(-) > >> > >> diff --git a/block/mirror.c b/block/mirror.c > >> index b344182c74..bc982cb99a 100644 > >> --- a/block/mirror.c > >> +++ b/block/mirror.c > >> @@ -1514,9 +1514,12 @@ do_sync_target_write(MirrorBlockJob *job, > >> MirrorMethod method, > >> assert(!qiov); > >> ret = blk_co_pwrite_zeroes(job->target, offset, bytes, flags); > >> if (job->zero_bitmap && ret >= 0) { > >> - bitmap_set(job->zero_bitmap, dirty_bitmap_offset / > >> job->granularity, > >> - (dirty_bitmap_end - dirty_bitmap_offset) / > >> - job->granularity); > >> + if (dirty_bitmap_offset < dirty_bitmap_end) { > >> + bitmap_set(job->zero_bitmap, > >> + dirty_bitmap_offset / job->granularity, > >> + (dirty_bitmap_end - dirty_bitmap_offset) / > >> + job->granularity); > >> + } > > > > Why does this case clause use dirty_bitmap_offset and dirty_bitmap_end > > instead of zero_bitmap_offset and zero_bitmap_end like the other > > zero_bitmap operations in this switch statement? > > > > if (job->zero_bitmap && ret >= 0) { > > - bitmap_set(job->zero_bitmap, dirty_bitmap_offset / job->granularity, > > - (dirty_bitmap_end - dirty_bitmap_offset) / > > - job->granularity); > > + bitmap_set(job->zero_bitmap, zero_bitmap_offset, > > + zero_bitmap_end - zero_bitmap_offset); > > } > > > > I'm probably missing something, but it's not obvious to me :). > > Sorry, I could've mentioned this in the commit message. There is a > comment explaining it further up in the function: > > > /* > > * Tails are either clean or shrunk, so for dirty bitmap resetting > > * we safely align the range narrower. But for zero bitmap, round > > * range wider for checking or clearing, and narrower for setting. > > */
Thanks! Stefan
signature.asc
Description: PGP signature
