On 26.07.2019 at 11:52, Max Reitz wrote:
> On 25.07.19 18:27, Kevin Wolf wrote:
> > Calling bdrv_drained_end() for target_bs can restart requests too
> > early, so that they would execute on mirror_top_bs, which however has
> > already dropped all permissions.
> >
> > Keep the target node drained until all graph changes have completed.
> >
> > Signed-off-by: Kevin Wolf <kw...@redhat.com>
> > ---
> >  block/mirror.c | 14 ++++++++------
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/block/mirror.c b/block/mirror.c
> > index 8cb75fb409..7483051f8d 100644
> > --- a/block/mirror.c
> > +++ b/block/mirror.c
> > @@ -644,6 +644,11 @@ static int mirror_exit_common(Job *job)
> >      bdrv_ref(mirror_top_bs);
> >      bdrv_ref(target_bs);
> >
> > +    /* The mirror job has no requests in flight any more, but we need to
> > +     * drain potential other users of the BDS before changing the graph. */
> > +    assert(s->in_drain);
> > +    bdrv_drained_begin(target_bs);
> > +
>
> In contrast to what Eric said, I think it is a problem that this is just
> code motion.
>
> The comment doesn’t tell the reason why the target needs to be drained
> here.  Other users of the BDS have their own BdrvChild and thus their
> own permissions, their requests do not go through mirror.
>
> So in addition to why the target needs to be drained around
> bdrv_replace_node(), the comment should tell why we need to drain it
> here, like the commit message does.
>
> Now, the thing is, I don’t quite understand the connection between the
> target and mirror_top_bs that the commit message wants to establish.
>
> I see the following problem:
> (1) We drain src (at the end of mirror_run()).
> (2) This implicitly drains mirror_top_bs.
> (3) We drain target.
> (4) bdrv_replace_node() replaces src by target, thus replacing the drain
>     on mirror_top_bs from src by the one from target.
> (5) We undrain target, thus also undraining mirror_top_bs.

(5.5) Remove mirror_top_bs from the target chain

> (6) After all is done, we undrain src, which has no effect on
>     mirror_top_bs, because they haven’t been connected since (4).
>
> I suppose (5) is the problem.  This patch moves it down to (6), so
> mirror_top_bs is drained as long as src is drained.

The problem is that (5) happens before (5.5), so we can start requests
on a node that we're about to remove (without draining it again first).

> (If to_replace is not src, then src will stay attached, which keeps
> mirror_top_bs drained, too.)
>
> This makes it seem to me like the actually important thing is to drain
> mirror_top_bs, not target.  If so, it would seem more obvious to me to
> just add a drain on mirror_top_bs than to move the existing target drain.

Do you really think having a third drained section makes things easier
to understand? Draining both source and target while we're modifying the
graph seems pretty intuitive to me - which is also why I moved the
bdrv_drained_begin() to the very start instead of looking for the first
operation that actually strictly needs it.

> >      /* Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
> >       * inserting target_bs at s->to_replace, where we might not be able to get
> >       * these permissions.
> > @@ -684,12 +689,7 @@ static int mirror_exit_common(Job *job)
> >          bdrv_reopen_set_read_only(target_bs, ro, NULL);
> >      }
> >
> > -    /* The mirror job has no requests in flight any more, but we need to
> > -     * drain potential other users of the BDS before changing the graph. */
> > -    assert(s->in_drain);
> > -    bdrv_drained_begin(target_bs);
>
> By the way, don’t we need to drain to_replace also?  In case it isn’t src?

I think to_replace is required to be in the subtree of src, no? Though
maybe it could have another parent, so you might be right.

Kevin
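
For illustration only, the ordering in steps (1) to (6), including (5.5), can be sketched as a toy model in C. This is not QEMU code: the Node type, drain_begin(), drain_end(), can_issue_requests() and window_before_removal() are hypothetical names, and the model assumes the simplification that a parent counts as quiesced whenever it or its single child has an open drained section. Under those assumptions it shows that undraining target at (5) leaves mirror_top_bs able to issue requests just before (5.5), while delaying the undrain to (6) does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types and helpers, not QEMU's block API. */
typedef struct Node {
    int drain_count;     /* > 0 while a drained section is open on this node */
    struct Node *child;  /* the single child this node sends requests to */
} Node;

static void drain_begin(Node *n)
{
    n->drain_count++;
}

static void drain_end(Node *n)
{
    assert(n->drain_count > 0);
    n->drain_count--;
}

/* Assumption: a parent is quiesced while it or its current child is drained. */
static bool can_issue_requests(const Node *n)
{
    if (n->drain_count > 0) {
        return false;
    }
    return n->child == NULL || n->child->drain_count == 0;
}

/*
 * Replays steps (1) to (6) and reports whether mirror_top could issue
 * requests immediately before it is removed in step (5.5).
 */
static bool window_before_removal(bool undrain_target_early)
{
    Node src = { 0, NULL };
    Node target = { 0, NULL };
    Node mirror_top = { 0, &src };
    bool window;

    drain_begin(&src);               /* (1), which implies (2) */
    drain_begin(&target);            /* (3) */
    mirror_top.child = &target;      /* (4) replace src by target */
    if (undrain_target_early) {
        drain_end(&target);          /* (5) in the old ordering */
    }
    window = can_issue_requests(&mirror_top);
    mirror_top.child = NULL;         /* (5.5) remove mirror_top */
    if (!undrain_target_early) {
        drain_end(&target);          /* (5) moved down next to (6) */
    }
    drain_end(&src);                 /* (6) */
    return window;
}
```

Calling window_before_removal(true) models the ordering before the patch, and window_before_removal(false) models the ordering after it.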