Am 16.02.26 um 5:38 PM schrieb Kevin Wolf:
> Am 16.02.2026 um 17:12 hat Fiona Ebner geschrieben:
>> Am 16.02.26 um 4:31 PM schrieb Kevin Wolf:
>>> Am 16.02.2026 um 14:01 hat Fiona Ebner geschrieben:
>>>> Am 16.02.26 um 1:48 PM schrieb Fiona Ebner:
>>>>> Am 16.02.26 um 11:25 AM schrieb Kevin Wolf:
>>>>>> Am 12.02.2026 um 13:02 hat Fiona Ebner geschrieben:
>>>>>>> Currently, the dirty bitmap is disabled too early and the following
>>>>>>> bad scenario is possible:
>>>>>>>
>>>>>>> 1. Dirty bitmap is disabled in mirror_start_job()
>>>>>>> 2. Some request are started in mirror_top_bs while s->job == NULL
>>>>>>> 3. mirror_dirty_init() -> bdrv_co_is_allocated_above() runs and because
>>>>>>>    the request hasn't completed yet, the block isn't allocated
>>>>>>> 4. The request completes, still sees s->job == NULL and skips the
>>>>>>>    bitmap, and nothing else will mark it dirty either
>>>>>>>
>>>>>>> One ingredient is that mirror_top_opaque->job is only set after the
>>>>>>> job is fully initialized. For the rationale, see commit 32125b1460
>>>>>>> ("mirror: Fix access of uninitialised fields during start").
>>>>>>>
>>>>>>> Disabling the dirty bitmap is safe once bdrv_mirror_top_do_write()
>>>>>>> sees that the job is set, because then:
>>>>>>>
>>>>>>> 1. When not using MIRROR_COPY_MODE_WRITE_BLOCKING, the dirty bitmap
>>>>>>>    will be set by bdrv_mirror_top_do_write().
>>>>>>>
>>>>>>> 2. When using MIRROR_COPY_MODE_WRITE_BLOCKING, writes will be done
>>>>>>>    synchronously to the target.
>>>>>>>
>>>>>>> At least with virtio-blk using iothread-vq-mapping, mirror_run() and
>>>>>>> bdrv_mirror_top_do_write() might be called in different threads.
>>>>>>> bdrv_disable_dirty_bitmap() acquires and releases the dirty bitmap
>>>>>>> mutex, so the memory is synchronized between threads.
>>>>>>>
>>>>>>> Many thanks to Kevin Wolf for discussing the issue and solutions with
>>>>>>> me! [0]
>>>>>>>
>>>>>>> [0]: 
>>>>>>> https://lore.kernel.org/qemu-devel/[email protected]/T/
>>>>>>>
>>>>>>> Cc: [email protected]
>>>>>>> Closes: https://gitlab.com/qemu-project/qemu/-/issues/3273
>>>>>>> Signed-off-by: Fiona Ebner <[email protected]>
>>>>>>> ---
>>>>>>>  block/mirror.c | 21 +++++++++++++++------
>>>>>>>  1 file changed, 15 insertions(+), 6 deletions(-)
>>>>>>>
>>>>>>> diff --git a/block/mirror.c b/block/mirror.c
>>>>>>> index b344182c74..eadd4501e8 100644
>>>>>>> --- a/block/mirror.c
>>>>>>> +++ b/block/mirror.c
>>>>>>> @@ -1123,6 +1123,21 @@ static int coroutine_fn mirror_run(Job *job, 
>>>>>>> Error **errp)
>>>>>>>       */
>>>>>>>      mirror_top_opaque->job = s;
>>>>>>>  
>>>>>>> +    /*
>>>>>>> +     * Disabling the dirty bitmap is safe once 
>>>>>>> bdrv_mirror_top_do_write() sees
>>>>>>> +     * that the job is set, because then:
>>>>>>> +     *
>>>>>>> +     * 1. When not using MIRROR_COPY_MODE_WRITE_BLOCKING, the dirty 
>>>>>>> bitmap will
>>>>>>> +     * be set by bdrv_mirror_top_do_write().
>>>>>>> +     *
>>>>>>> +     * 2. When using MIRROR_COPY_MODE_WRITE_BLOCKING, writes will be 
>>>>>>> done
>>>>>>> +     * synchronously to the target.
>>>>>>> +     *
>>>>>>> +     * bdrv_disable_dirty_bitmap() acquires and releases the dirty 
>>>>>>> bitmap mutex,
>>>>>>> +     * so the memory is synchronized between threads.
>>>>>>> +     */
>>>>>>> +    bdrv_disable_dirty_bitmap(s->dirty_bitmap);
>>>>>>> +
>>>>>>>      assert(!s->dbi);
>>>>>>>      s->dbi = bdrv_dirty_iter_new(s->dirty_bitmap);
>>>>>>>      for (;;) {
>>>>>>> @@ -2014,12 +2029,6 @@ static BlockJob *mirror_start_job(
>>>>>>>          goto fail;
>>>>>>>      }
>>>>>>>  
>>>>>>> -    /*
>>>>>>> -     * The dirty bitmap is set by bdrv_mirror_top_do_write() when not 
>>>>>>> in active
>>>>>>> -     * mode.
>>>>>>> -     */
>>>>>>> -    bdrv_disable_dirty_bitmap(s->dirty_bitmap);
>>>>>>> -
>>>>>>>      bdrv_graph_wrlock_drained();
>>>>>>>      ret = block_job_add_bdrv(&s->common, "source", bs, 0,
>>>>>>>                               BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE 
>>>>>>> |
>>>>>>
>>>>>> The thing I meant in the other thread is if we don't need something like
>>>>>> this additionally:
>>>>>>
>>>>>> diff --git a/block/mirror.c b/block/mirror.c
>>>>>> index b344182c747..159954158ba 100644
>>>>>> --- a/block/mirror.c
>>>>>> +++ b/block/mirror.c
>>>>>> @@ -1672,9 +1672,17 @@ bdrv_mirror_top_do_write(BlockDriverState *bs, 
>>>>>> MirrorMethod method,
>>>>>>          abort();
>>>>>>      }
>>>>>>  
>>>>>> -    if (!copy_to_target && s->job && s->job->dirty_bitmap) {
>>>>>> +    if (!copy_to_target) {
>>>>>>          qatomic_set(&s->job->actively_synced, false);
>>>>>> -        bdrv_set_dirty_bitmap(s->job->dirty_bitmap, offset, bytes);
>>>>>> +        if (s->job && s->job->dirty_bitmap) {
>>>>>> +            bdrv_set_dirty_bitmap(s->job->dirty_bitmap, offset, bytes);
>>>>>> +        } else {
>>>>>> +            /*
>>>>>> +             * Avoid race in the case that mirror_run() disables the 
>>>>>> bitmap
>>>>>> +             * between here and bdrv_co_write_req_finish().
>>>>>> +             */
>>>>>> +            bdrv_set_dirty(bs, offset, bytes);
>>>>>> +        }
>>>>>>      }
>>>>>>  
>>>>>>      if (ret < 0) {
>>>>>
>>>>> You're right! It's not enough to ensure that the job is set before the
>>>>> bitmap is disabled, because both could happen after the check for job 
>>>>> here.
>>>>>
>>>>> Isn't this change enough by itself? After the change,
>>>>> bdrv_mirror_top_do_write() ensures that:
>>>>>
>>>>> 1. When copy_to_target == true, the write is done synchronously to the
>>>>> target, no need to set the dirty bitmap.
>>>>>
>>>>> 2. When copy_to_target == false, the dirty bitmap is set.
>>>>>
>>>>> So it doesn't really matter at which point the dirty bitmap is disabled?
>>>>> Or am I missing something again?
>>>>
>>>> Ah right, bdrv_set_dirty() only sets it if still enabled :)
>>>
>>> Yes, exactly.
>>>
>>> If that's still too confusing, how about just giving mirror_top_bs
>>> access to its dirty bitmap right from the start while it's drained? I
>>> didn't try to reproduce the bug with it yet, though.
>>
>> You beat me to it :) Was thinking about this too, inspired by your
>> initial suggestion for the alternate approach to avoid relying on the
>> block layer for dirty tracking, but mine ended up needlessly involved,
>> because I removed the dirty_bitmap from the MirrorBlockJob object and
>> had to add a helper function for getting the dirty bitmap back from the
>> job, which is less than ideal :P
>>>
>>> Kevin
>>>
>>> diff --git a/block/mirror.c b/block/mirror.c
>>> index b344182c747..c11aca1366c 100644
>>> --- a/block/mirror.c
>>> +++ b/block/mirror.c
>>> @@ -99,6 +99,7 @@ typedef struct MirrorBlockJob {
>>>  
>>>  typedef struct MirrorBDSOpaque {
>>>      MirrorBlockJob *job;
>>> +    BdrvDirtyBitmap *dirty_bitmap;
>>>      bool stop;
>>>      bool is_commit;
>>>  } MirrorBDSOpaque;
>>> @@ -1672,9 +1673,9 @@ bdrv_mirror_top_do_write(BlockDriverState *bs, 
>>> MirrorMethod method,
>>>          abort();
>>>      }
>>>  
>>> -    if (!copy_to_target && s->job && s->job->dirty_bitmap) {
>>> +    if (!copy_to_target) {
>>
>> I put an assert(s->dirty_bitmap) here, but maybe it's not worth it?
> 
> The difference is SIGABRT here vs. SIGSEGV two lines later. I don't think
> it matters much, either way is fine with me.

Right.

>>>          qatomic_set(&s->job->actively_synced, false);
>>> -        bdrv_set_dirty_bitmap(s->job->dirty_bitmap, offset, bytes);
>>> +        bdrv_set_dirty_bitmap(s->dirty_bitmap, offset, bytes);
>>>      }
>>>  
>>>      if (ret < 0) {
>>> @@ -1901,13 +1902,28 @@ static BlockJob *mirror_start_job(
>>
>> I created and disabled the dirty bitmap here already before the drained
>> section. It seems slightly more readable. Do you see any downside to that?
> 
> Hm... I think it's okay in practice because a drain comes after it, so
> while there may be a race initially, we don't care when exactly we start
> tracking as long as it's before checking the block status. But that's a
> bit subtle.
> 
> We could keep creating the bitmap outside of the drain if that improves
> things, but I would leave bdrv_disable_dirty_bitmap() inside the drained
> section just to be very clear that there can't be a race with a request.

Okay, agreed.

>>>  
>>>      bdrv_drained_begin(bs);
>>>      ret = bdrv_append(mirror_top_bs, bs, errp);
>>> -    bdrv_drained_end(bs);
>>> -
>>>      if (ret < 0) {
>>> +        bdrv_drained_end(bs);
>>>          bdrv_unref(mirror_top_bs);
>>>          return NULL;
>>>      }
>>>  
>>> +    bs_opaque->dirty_bitmap = bdrv_create_dirty_bitmap(mirror_top_bs,
>>> +                                                       granularity,
>>> +                                                       NULL, errp);
>>> +    if (!bs_opaque->dirty_bitmap) {
>>> +        bdrv_drained_end(bs);
>>> +        bdrv_unref(mirror_top_bs);
>>> +        return NULL;
>>> +    }
>>> +
>>> +    /*
>>> +     * The dirty bitmap is set by bdrv_mirror_top_do_write() when not in 
>>> active
>>> +     * mode.
>>> +     */
>>
>> This comment could become:
>>
>> +    /*
>> +     * Disabling the dirty bitmap is safe, because:
>> +     *
>> +     * 1. If should_copy_to_target() == true, writes will be done
>> synchronously
>> +     * to the target, no need to set the bitmap.
>> +     *
>> +     * 2. If should_copy_to_target() == false, the dirty bitmap will be
>> set by
>> +     * bdrv_mirror_top_do_write().
>> +     */
> 
> Feels verbose to me because the implications of the existing comment
> seem obvious to me, but if you think it's useful, it probably isn't as
> obvious as I thought.

The thing that might not be immediately obvious is that "active mode"
only starts once the job reference is set in the opaque object.

>>> +    bdrv_disable_dirty_bitmap(bs_opaque->dirty_bitmap);
>>> +    bdrv_drained_end(bs);
>>> +
>>>      /* Make sure that the source is not resized while the job is running */
>>>      s = block_job_create(job_id, driver, NULL, mirror_top_bs,
>>>                           BLK_PERM_CONSISTENT_READ,
>>> @@ -2002,24 +2018,13 @@ static BlockJob *mirror_start_job(
>>>      s->base_overlay = bdrv_find_overlay(bs, base);
>>>      s->granularity = granularity;
>>>      s->buf_size = ROUND_UP(buf_size, granularity);
>>> +    s->dirty_bitmap = bs_opaque->dirty_bitmap;
>>>      s->unmap = unmap;
>>>      if (auto_complete) {
>>>          s->should_complete = true;
>>>      }
>>>      bdrv_graph_rdunlock_main_loop();
>>>  
>>> -    s->dirty_bitmap = bdrv_create_dirty_bitmap(s->mirror_top_bs, 
>>> granularity,
>>> -                                               NULL, errp);
>>> -    if (!s->dirty_bitmap) {
>>> -        goto fail;
>>> -    }
>>> -
>>> -    /*
>>> -     * The dirty bitmap is set by bdrv_mirror_top_do_write() when not in 
>>> active
>>> -     * mode.
>>> -     */
>>> -    bdrv_disable_dirty_bitmap(s->dirty_bitmap);
>>> -
>>>      bdrv_graph_wrlock_drained();
>>>      ret = block_job_add_bdrv(&s->common, "source", bs, 0,
>>>                               BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE |
>>> @@ -2099,9 +2104,6 @@ fail:
>>>          g_free(s->replaces);
>>>          blk_unref(s->target);
>>>          bs_opaque->job = NULL;
>>> -        if (s->dirty_bitmap) {
>>> -            bdrv_release_dirty_bitmap(s->dirty_bitmap);
>>> -        }
>>>          job_early_fail(&s->common.job);
>>>      }
>>>  
>>> @@ -2115,6 +2117,7 @@ fail:
>>>      bdrv_graph_wrunlock();
>>>      bdrv_drained_end(bs);
>>>  
>>> +    bdrv_release_dirty_bitmap(bs_opaque->dirty_bitmap);
>>>      bdrv_unref(mirror_top_bs);
>>>  
>>>      return NULL;

Are you going to send this as a proper patch?

Best Regards,
Fiona


Reply via email to