Am 29.04.2026 um 17:06 hat Fiona Ebner geschrieben:
> Am 27.04.26 um 7:43 PM schrieb Kevin Wolf:
> > The whole implementation of bdrv_commit() is only correct if no new
> > writes come in while it's running: It has only a single loop checking
> > the allocation status for each block and finally calls bdrv_make_empty()
> > without checking if that throws away any new changes.
> >
> > We already have to drain while taking the graph write lock. Just extend
> > the drained section to all of bdrv_commit() to make sure that we don't
> > get any inconsistencies.
> >
> > Signed-off-by: Kevin Wolf <[email protected]>
> > ---
> > block/commit.c | 10 ++++++++--
> > 1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/block/commit.c b/block/commit.c
> > index 0d9e1a16d7a..c5e3ef03a21 100644
> > --- a/block/commit.c
> > +++ b/block/commit.c
> > @@ -518,6 +518,7 @@ int bdrv_commit(BlockDriverState *bs)
> > if (!drv)
> > return -ENOMEDIUM;
> >
> > + bdrv_drain_all_begin();
>
> I suppose we could instead drain only the affected BDSs? Blocking all
> for the whole duration of the blk_pread+blk_pwrite loop seems a bit much.
Possible, but I'm not completely sure. Basically what I did here is just
moving the drain part of bdrv_graph_wrlock_drained() earlier, which also
drains all nodes.
I think when you introduced it, the idea was that just draining
everything is acceptable and easier to verify. Which means that it might
not be strictly necessary, but I don't want to prove that now either.
Ultimately this drain_all goes back to commit 91ba0e1, in which you
stated:
More granular draining is not trivially possible, because
bdrv_change_aio_context() can recursively call itself e.g. via
bdrv_child_change_aio_context().
> Independently of that, I wonder if blk_commit_all() should drain all?
> I'm not familiar with it, but it seems that the intent is to have a
> point-in-time state which is consistent between different BDSs? That
> intent could be made explicit by draining all.
Hm, I suppose that would make sense, yes. Do you want to send a patch on
top of this? Bonus points for a test case that shows the inconsistency.
Kevin