02.08.2019 19:23, Vladimir Sementsov-Ogievskiy wrote: > 02.08.2019 18:42, Kevin Wolf wrote: >> Am 31.07.2019 um 14:09 hat Max Reitz geschrieben: >>> On 25.07.19 11:18, Vladimir Sementsov-Ogievskiy wrote: >>>> On reopen to rw parent may need rw access to child in .prepare, for >>>> example qcow2 needs to write IN_USE flags into stored bitmaps >>>> (currently it is done in a hacky way after commit and don't work). >>>> So, let's introduce such logic. >>>> >>>> The drawback is that in worst case bdrv_reopen_set_read_only may finish >>>> with error and in some intermediate state: some nodes reopened RW and >>>> some are not. But this is a way to fix bug around reopening qcow2 >>>> bitmaps in the following commits. >>> >>> This commit message doesn’t really explain what this patch does. >>> >>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> >>>> --- >>>> include/block/block_int.h | 2 + >>>> block.c | 144 ++++++++++++++++++++++++++++++++++---- >>>> 2 files changed, 133 insertions(+), 13 deletions(-) >>>> >>>> diff --git a/include/block/block_int.h b/include/block/block_int.h >>>> index 3aa1e832a8..7bd6fd68dd 100644 >>>> --- a/include/block/block_int.h >>>> +++ b/include/block/block_int.h >>>> @@ -531,6 +531,8 @@ struct BlockDriver { >>>> uint64_t parent_perm, uint64_t >>>> parent_shared, >>>> uint64_t *nperm, uint64_t *nshared); >>>> + bool (*bdrv_need_rw_file_child_during_reopen_rw)(BlockDriverState >>>> *bs); >>>> + >>>> /** >>>> * Bitmaps should be marked as 'IN_USE' in the image on reopening >>>> image >>>> * as rw. This handler should realize it. It also should unset >>>> readonly >>>> diff --git a/block.c b/block.c >>>> index cbd8da5f3b..3c8e1c59b4 100644 >>>> --- a/block.c >>>> +++ b/block.c >>>> @@ -1715,10 +1715,12 @@ static void >>>> bdrv_get_cumulative_perm(BlockDriverState *bs, uint64_t *perm, >>>> uint64_t *shared_perm); >>>> typedef struct BlockReopenQueueEntry { >>>> - bool prepared; >>>> - bool perms_checked; >>>> - BDRVReopenState state; >>>> - QSIMPLEQ_ENTRY(BlockReopenQueueEntry) entry; >>>> + bool reopened_file_child_rw; >>>> + bool changed_file_child_perm_rw; >>>> + bool prepared; >>>> + bool perms_checked; >>>> + BDRVReopenState state; >>>> + QSIMPLEQ_ENTRY(BlockReopenQueueEntry) entry; >>>> } BlockReopenQueueEntry; >>>> /* >>>> @@ -3421,6 +3423,105 @@ BlockReopenQueue >>>> *bdrv_reopen_queue(BlockReopenQueue *bs_queue, >>>> keep_old_opts); >>>> } >>>> +static int bdrv_reopen_set_read_only_drained(BlockDriverState *bs, >>>> + bool read_only, >>>> + Error **errp) >>>> +{ >>>> + BlockReopenQueue *queue; >>>> + QDict *opts = qdict_new(); >>>> + >>>> + qdict_put_bool(opts, BDRV_OPT_READ_ONLY, read_only); >>>> + >>>> + queue = bdrv_reopen_queue(NULL, bs, opts, true); >>>> + >>>> + return bdrv_reopen_multiple(queue, errp); >>>> +} >>>> + >>>> +/* >>>> + * handle_recursive_reopens >>>> + * >>>> + * On fail it needs rollback_recursive_reopens to be called. >>> >>> It would be nice if this description actually said anything about what >>> the function is supposed to do. >>> >>>> + */ >>>> +static int handle_recursive_reopens(BlockReopenQueueEntry *current, >>>> + Error **errp) >>>> +{ >>>> + int ret; >>>> + BlockDriverState *bs = current->state.bs; >>>> + >>>> + /* >>>> + * We use the fact that in reopen-queue children are always following >>>> + * parents. >>>> + * TODO: Switch BlockReopenQueue to be QTAILQ and use >>>> + * QTAILQ_FOREACH_REVERSE. >>> >>> Why don’t you do that first? It would make the code more obvious at >>> least to me. >>> >>>> + */ >>>> + if (QSIMPLEQ_NEXT(current, entry)) { >>>> + ret = handle_recursive_reopens(QSIMPLEQ_NEXT(current, entry), >>>> errp); >>>> + if (ret < 0) { >>>> + return ret; >>>> + } >>>> + } >>>> + >>>> + if ((current->state.flags & BDRV_O_RDWR) && bs->file && bs->drv && >>>> + bs->drv->bdrv_need_rw_file_child_during_reopen_rw && >>>> + bs->drv->bdrv_need_rw_file_child_during_reopen_rw(bs)) >>>> + { >>>> + if (!bdrv_is_writable(bs->file->bs)) { >>>> + ret = bdrv_reopen_set_read_only_drained(bs->file->bs, false, >>>> errp); >>> >>> Hm. Sorry, I find this all a bit hard to understand. (No comments and >>> all.) >>> >>> I understand that this is for an RO -> RW transition? Everything is >>> still RO, but the parent will need an RW child before it transitions to >>> RW itself. >>> >>> >>> I’m going to be honest up front, I don’t like this very much. But I >>> think it may be a reasonable solution for now. >>> >>> As I remember, the problem was that when reopening a qcow2 node from RO >>> to RW, we need to write something in .prepare() (because it can fail), >>> but naturally no .prepare() is called after any .commit(), so no matter >>> the order of nodes in the ReopenQueue, the child node will never be RW >>> by this point. >>> >>> Hm. To me that mostly means that making the whole reopen process a >>> transaction was just a dream that turns out not to work. >> >> This patch already looks somewhat complicated to me, and what you're >> proposing below sounds another order of magnitude more complex. > > Agree :) However, at this point I've almost implemented it (it's not a reason > to chose more complex way, of course). > >> >> But I think the important point is your last sentence above. Once we >> acknowledge that we can't possibly maintain full transaction semantics, >> I'll ask this naive question: What prevents us from just keeping the >> additional write in .commit? > > Hmm, it's what we've started with. The only thing to do is to reverse order > of commits, to commit file child before parent (and this way it works in > Virtuozzo now). > And I proposed it long ago: > https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg04528.html > > >> >> It is expected to work in the common case, so we're only talking about >> the behaviour in error cases anyway. If something fails and we can't do >> things in a transactionable way, we need to decide what the surprise >> result will look like, because we can neither guarantee a rollback nor >> successful completion. >> >> If the write fails unexpectedly, and we end up with a qcow2 image that >> is opened r/w, but has read-only bitmaps - wouldn't that be a reasonable >> result? It seems much easier to explain than some dependency subchain >> already being committed and the rest rolled back. > > And this is how it works now (except it doesn't work because of commit order). > In our long ago conversation (link above) you pointed that the problem here > is that we don't return an error from actually failed commit and it is > ignored.. > >> >> So, I guess my question is, what is the specifc scenario you're trying >> to fix with this series (why isn't the final patch a test case that >> would answer this question?), and are we sure that the cure isn't worse >> than the disease? >> > > Test appears at 03 and tests what already works, and lacks test-cases which > are broken, and they are added in 08 when all is fixed. > > And here are two things to fix: > First is that we lose bitmaps on reopening to RO and it is described and > fixed in 06. > Second is that we cant set IN_USE flag when reopening to RW and it is fixed > finally in 08. > > > ===== > > So, if we decide to keep things simple, what to do? Just reorder commits to > satisfy dependencies, if any? > > Should we add return code to commit, which should always be 0 except very rare > case? > >
And another idea is to postpone IN_USE setting until we really need to modify the bitmap.. Not sure that it is simpler. -- Best regards, Vladimir