On 26.07.2019 at 11:52, Max Reitz wrote:
> On 25.07.19 18:27, Kevin Wolf wrote:
> > Calling bdrv_drained_end() for target_bs can restart requests too
> > early, so that they would execute on mirror_top_bs, which however has
> > already dropped all permissions.
> > 
> > Keep the target node drained until all graph changes have completed.
> > 
> > Signed-off-by: Kevin Wolf <kw...@redhat.com>
> > ---
> >  block/mirror.c | 14 ++++++++------
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> > 
> > diff --git a/block/mirror.c b/block/mirror.c
> > index 8cb75fb409..7483051f8d 100644
> > --- a/block/mirror.c
> > +++ b/block/mirror.c
> > @@ -644,6 +644,11 @@ static int mirror_exit_common(Job *job)
> >      bdrv_ref(mirror_top_bs);
> >      bdrv_ref(target_bs);
> >  
> > +    /* The mirror job has no requests in flight any more, but we need to
> > +     * drain potential other users of the BDS before changing the graph. */
> > +    assert(s->in_drain);
> > +    bdrv_drained_begin(target_bs);
> > +
> 
> In contrast to what Eric said, I think it is a problem that this is just
> code motion.
> 
> The comment doesn’t tell the reason why the target needs to be drained
> here.  Other users of the BDS have their own BdrvChild and thus their
> own permissions; their requests do not go through mirror.
> 
> So in addition to why the target needs to be drained around
> bdrv_replace_node(), the comment should tell why we need to drain it
> here, like the commit message does.
> 
> Now, the thing is, I don’t quite understand the connection between the
> target and mirror_top_bs that the commit message wants to establish.
> 
> I see the following problem:
> (1) We drain src (at the end of mirror_run()).
> (2) This implicitly drains mirror_top_bs.
> (3) We drain target.
> (4) bdrv_replace_node() replaces src by target, thus replacing the drain
>     on mirror_top_bs from src by the one from target.
> (5) We undrain target, thus also undraining mirror_top_bs.

(5.5) Remove mirror_top_bs from the target chain

> (6) After all is done, we undrain src, which has no effect on
>     mirror_top_bs, because they haven’t been connected since (4).
> 
> I suppose (5) is the problem.  This patch moves it down to (6), so
> mirror_top_bs is drained as long as src is drained.

The problem is that (5) happens before (5.5), so we can start requests
on a node that we're about to remove (without draining it again first).
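
The ordering issue in steps (1) to (6) can be illustrated with a small toy
model. This is only a sketch under simplifying assumptions, not QEMU code:
ToyNode, toy_drained_begin(), toy_drained_end(), toy_is_drained() and
toy_mirror_exit() are hypothetical names, each node has at most one child,
and a node counts as drained if it or its child holds a drain. The function
reports whether mirror_top_bs is still drained at the moment it is removed
in (5.5).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block node; NOT the real BlockDriverState. */
typedef struct ToyNode {
    const char *name;
    int drain_count;        /* explicit toy_drained_begin() calls on this node */
    struct ToyNode *child;  /* single child, enough for this scenario */
} ToyNode;

static void toy_drained_begin(ToyNode *n)
{
    n->drain_count++;
}

static void toy_drained_end(ToyNode *n)
{
    assert(n->drain_count > 0);
    n->drain_count--;
}

/* Assumption of the model: a drain on a child also quiesces its parent. */
static bool toy_is_drained(const ToyNode *n)
{
    return n->drain_count > 0 || (n->child && toy_is_drained(n->child));
}

/*
 * Walk through steps (1) to (6). If keep_target_drained is false, the
 * target is undrained at (5), before mirror_top_bs is removed; if true,
 * the undrain is deferred until after all graph changes.
 */
static bool toy_mirror_exit(bool keep_target_drained)
{
    ToyNode src = { "src", 0, NULL };
    ToyNode target = { "target", 0, NULL };
    ToyNode top = { "mirror_top_bs", 0, &src };
    bool drained_at_removal;

    toy_drained_begin(&src);        /* (1), which implies (2) */
    toy_drained_begin(&target);     /* (3) */
    top.child = &target;            /* (4) src replaced by target */
    if (!keep_target_drained) {
        toy_drained_end(&target);   /* (5) in the old order */
    }

    drained_at_removal = toy_is_drained(&top);
    top.child = NULL;               /* (5.5) remove mirror_top_bs */

    if (keep_target_drained) {
        toy_drained_end(&target);   /* undrain after the graph changes */
    }
    toy_drained_end(&src);          /* (6) */
    return drained_at_removal;
}
```

In this model, toy_mirror_exit(false) returns false (mirror_top_bs is no
longer drained when it is removed) and toy_mirror_exit(true) returns true.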

> (If to_replace is not src, then src will stay attached, which keeps
> mirror_top_bs drained, too.)
> 
> This makes it seem to me like the actually important thing is to drain
> mirror_top_bs, not target.  If so, it would seem more obvious to me to
> just add a drain on mirror_top_bs than to move the existing target drain.

Do you really think having a third drained section makes things easier
to understand? Draining both source and target while we're modifying the
graph seems pretty intuitive to me - which is also why I moved the
bdrv_drained_begin() to the very start instead of looking for the first
operation that actually strictly needs it.

> >      /* Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
> >       * inserting target_bs at s->to_replace, where we might not be able to get
> >       * these permissions.
> > @@ -684,12 +689,7 @@ static int mirror_exit_common(Job *job)
> >              bdrv_reopen_set_read_only(target_bs, ro, NULL);
> >          }
> >  
> > -        /* The mirror job has no requests in flight any more, but we need to
> > -         * drain potential other users of the BDS before changing the graph. */
> > -        assert(s->in_drain);
> > -        bdrv_drained_begin(target_bs);
> 
> By the way, don’t we need to drain to_replace also?  In case it isn’t src?

I think to_replace is required to be in the subtree of src, no?

Though maybe it could have another parent, so you might be right.

Kevin
