On 25.07.19 11:18, Vladimir Sementsov-Ogievskiy wrote:
> On reopen to rw parent may need rw access to child in .prepare, for
> example qcow2 needs to write IN_USE flags into stored bitmaps
> (currently it is done in a hacky way after commit and don't work).
> So, let's introduce such logic.
> 
> The drawback is that in worst case bdrv_reopen_set_read_only may finish
> with error and in some intermediate state: some nodes reopened RW and
> some are not. But this is a way to fix bug around reopening qcow2
> bitmaps in the following commits.

This commit message doesn’t really explain what this patch does.

> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
> ---
>  include/block/block_int.h |   2 +
>  block.c                   | 144 ++++++++++++++++++++++++++++++++++----
>  2 files changed, 133 insertions(+), 13 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 3aa1e832a8..7bd6fd68dd 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -531,6 +531,8 @@ struct BlockDriver {
>                               uint64_t parent_perm, uint64_t parent_shared,
>                               uint64_t *nperm, uint64_t *nshared);
>  
> +     bool (*bdrv_need_rw_file_child_during_reopen_rw)(BlockDriverState *bs);
> +
>      /**
>       * Bitmaps should be marked as 'IN_USE' in the image on reopening image
>       * as rw. This handler should realize it. It also should unset readonly
> diff --git a/block.c b/block.c
> index cbd8da5f3b..3c8e1c59b4 100644
> --- a/block.c
> +++ b/block.c
> @@ -1715,10 +1715,12 @@ static void bdrv_get_cumulative_perm(BlockDriverState 
> *bs, uint64_t *perm,
>                                       uint64_t *shared_perm);
>  
>  typedef struct BlockReopenQueueEntry {
> -     bool prepared;
> -     bool perms_checked;
> -     BDRVReopenState state;
> -     QSIMPLEQ_ENTRY(BlockReopenQueueEntry) entry;
> +    bool reopened_file_child_rw;
> +    bool changed_file_child_perm_rw;
> +    bool prepared;
> +    bool perms_checked;
> +    BDRVReopenState state;
> +    QSIMPLEQ_ENTRY(BlockReopenQueueEntry) entry;
>  } BlockReopenQueueEntry;
>  
>  /*
> @@ -3421,6 +3423,105 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue 
> *bs_queue,
>                                     keep_old_opts);
>  }
>  
> +static int bdrv_reopen_set_read_only_drained(BlockDriverState *bs,
> +                                             bool read_only,
> +                                             Error **errp)
> +{
> +    BlockReopenQueue *queue;
> +    QDict *opts = qdict_new();
> +
> +    qdict_put_bool(opts, BDRV_OPT_READ_ONLY, read_only);
> +
> +    queue = bdrv_reopen_queue(NULL, bs, opts, true);
> +
> +    return bdrv_reopen_multiple(queue, errp);
> +}
> +
> +/*
> + * handle_recursive_reopens
> + *
> + * On fail it needs rollback_recursive_reopens to be called.

It would be nice if this description actually said anything about what
the function is supposed to do.

> + */
> +static int handle_recursive_reopens(BlockReopenQueueEntry *current,
> +                                    Error **errp)
> +{
> +    int ret;
> +    BlockDriverState *bs = current->state.bs;
> +
> +    /*
> +     * We use the fact that in reopen-queue children are always following
> +     * parents.
> +     * TODO: Switch BlockReopenQueue to be QTAILQ and use
> +     *       QTAILQ_FOREACH_REVERSE.

Why don’t you do that first?  It would make the code more obvious at
least to me.

> +     */
> +    if (QSIMPLEQ_NEXT(current, entry)) {
> +        ret = handle_recursive_reopens(QSIMPLEQ_NEXT(current, entry), errp);
> +        if (ret < 0) {
> +            return ret;
> +        }
> +    }
> +
> +    if ((current->state.flags & BDRV_O_RDWR) && bs->file && bs->drv &&
> +        bs->drv->bdrv_need_rw_file_child_during_reopen_rw &&
> +        bs->drv->bdrv_need_rw_file_child_during_reopen_rw(bs))
> +    {
> +        if (!bdrv_is_writable(bs->file->bs)) {
> +            ret = bdrv_reopen_set_read_only_drained(bs->file->bs, false, 
> errp);

Hm.  Sorry, I find this all a bit hard to understand.  (No comments and
all.)

I understand that this is for an RO -> RW transition?  Everything is
still RO, but the parent will need an RW child before it transitions to
RW itself.


I’m going to be honest up front, I don’t like this very much.  But I
think it may be a reasonable solution for now.

As I remember, the problem was that when reopening a qcow2 node from RO
to RW, we need to write something in .prepare() (because it can fail),
but naturally no .prepare() is called after any .commit(), so no matter
the order of nodes in the ReopenQueue, the child node will never be RW
by this point.

Hm.  To me that mostly means that making the whole reopen process a
transaction was just a dream that turns out not to work.

OK, so what would be the real, proper, all-encompassing fix?  I suppose
we’d need a way to express reopen dependency relationships.  So if a
node depends on one or more of its children to be reopened before/after
it can be reopened itself, we’d need to pull them out of the ReopenQueue
(along with their dependencies) and do a separate bdrv_reopen_multiple()
on them.  So we’d want to split the ReopenQueue into as few subqueues as
possible, so that all dependencies are satisfied.

One such dependency is if you change a node from RO to RW, and that
change requires an RW child.

The reverse dependency occurs is if you change a node from RW to RO, and
the nodes wants to write something to its child, so it needs to remain
RW until then.

Currently, the former is just broken (hence this patch).  The latter is
kind of addressed by virtue of “Writing happens in .prepare(), and
parents come before children in the ReopenQueue”.


OK, so you address one specific case of a dependency, namely of a node
on its bs->file when it comes to writableness.  Not too bad, supporting
bs->file as the only dependency makes sense.  Everything else would
become complicated.


Besides being more specific than the general solution I tried to sketch
above, there is one more difference, though: In that description, I said
we should remove the node and its dependencies from the ReopenQueue, and
commit it earlier.  That has three implicatons:

First, this patch reopens the file if it is not writable, but the parent
needs it to be writable.  I think that’s wrong.  We should take the
file’s entry of the ReopenQueue and reopen it accordingly, not blindly
reopen it RW.  (If the user didn’t specify the file to be reopened RW,
that should be an error.)

Second, you need to take the dependencies into account.  (I don’t know
whether this one is a problem in practice.)  If the file node itself has
a child that needs to be RW, then you need to take that from the
ReopenQueue, too, and repoen it RW, too.

Third, the reopen may require some other options on the file.
Temporarily reopening it RW without changing those options may not be
what the user wants.  (So another reason to take the existing
ReopenQueue entry.)


So -- without having tried, of course -- I think a better design would
be to look for bs->file->bs in the ReopenQueue, recursively all of its
children, and move all of those entries into a new queue, and then
invoke bdrv_reopen_multiple() on that one first.

The question then becomes how to roll back those changes...  I don’t
know whether just having bdrv_reopen() partially succeed is so bad.
Otherwise, we’d need a function to translate an existing node's state
into a BdrvReopenQueueEntry so we can thus return to the old state.

> +            if (ret < 0) {
> +                return ret;
> +            }
> +            current->reopened_file_child_rw = true;
> +        }
> +
> +        if (!(bs->file->perm & BLK_PERM_WRITE)) {
> +            ret = bdrv_child_try_set_perm(bs->file,
> +                                          bs->file->perm | BLK_PERM_WRITE,
> +                                          bs->file->shared_perm, errp);

bdrv_child_try_set_perm() is dangerous, because its effect will be
undone whenever something happens that causes the permissions to be
refreshed.  (Hence the comment in block_int.h which says to avoid it.)
Generally, bdrv_child_refresh_perms() should be enough.  If it isn’t,
I’d say something’s off.

Max

> +            if (ret < 0) {
> +                return ret;
> +            }
> +            current->changed_file_child_perm_rw = true;
> +        }
> +    }
> +
> +    return 0;
> +}

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to