Hi Kevin,

Am 12.05.26 um 2:14 PM schrieb Kevin Wolf:
> Am 29.04.2026 um 17:06 hat Fiona Ebner geschrieben:
>> Am 27.04.26 um 7:43 PM schrieb Kevin Wolf:
>>> The whole implementation of bdrv_commit() is only correct if no new
>>> writes come in while it's running: It has only a single loop checking
>>> the allocation status for each block and finally calls bdrv_make_empty()
>>> without checking if that throws away any new changes.
>>>
>>> We already have to drain while taking the graph write lock. Just extend
>>> the drained section to all of bdrv_commit() to make sure that we don't
>>> get any inconsistencies.
>>>
>>> Signed-off-by: Kevin Wolf <[email protected]>
>>> ---
>>>  block/commit.c | 10 ++++++++--
>>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/block/commit.c b/block/commit.c
>>> index 0d9e1a16d7a..c5e3ef03a21 100644
>>> --- a/block/commit.c
>>> +++ b/block/commit.c
>>> @@ -518,6 +518,7 @@ int bdrv_commit(BlockDriverState *bs)
>>>      if (!drv)
>>>          return -ENOMEDIUM;
>>>  
>>> +    bdrv_drain_all_begin();
>>
>> I suppose we could instead drain only the affected BDSs? Blocking all
>> for the whole duration of the blk_pread+blk_pwrite loop seems a bit much.
> 
> Possible, but I'm not completely sure. Basically what I did here is just
> moving the drain part of bdrv_graph_wrlock_drained() earlier, which also
> drains all nodes.
> 
> I think when you introduced it, the idea was that just draining
> everything is acceptable and easier to verify. Which means that it might
> not be strictly necessary, but I don't want to prove that now either.
> 
> Ultimately this drain_all goes back to commit 91ba0e1, in which you
> stated:
> 
>     More granular draining is not trivially possible, because
>     bdrv_change_aio_context() can recursively call itself e.g. via
>     bdrv_child_change_aio_context().
> 

oh, right. I had forgotten about that recursion.

>> Independently of that, I wonder if blk_commit_all() should drain all?
>> I'm not familiar with it, but it seems that the intent is to have a
>> point-in-time state which is consistent between different BDSs? That
>> intent could be made explicit by draining all.
> 
> Hm, I suppose that would make sense, yes. Do you want to send a patch on
> top of this? Bonus points for a test case that shows the inconsistency.

I'll add it to my TODO pile, but it'll be a few weeks, as I'm quite busy
with other work at the moment.

Best Regards,
Fiona


Reply via email to