On Tue, Aug 9, 2011 at 11:35 AM, Kevin Wolf <kw...@redhat.com> wrote:
> Am 09.08.2011 12:25, schrieb Stefan Hajnoczi:
>> On Mon, Aug 8, 2011 at 4:16 PM, Kevin Wolf <kw...@redhat.com> wrote:
>>> Am 08.08.2011 16:49, schrieb Stefan Hajnoczi:
>>>> On Fri, Aug 5, 2011 at 10:48 AM, Kevin Wolf <kw...@redhat.com> wrote:
>>>>> Am 05.08.2011 11:29, schrieb Stefan Hajnoczi:
>>>>>> On Fri, Aug 5, 2011 at 10:07 AM, Kevin Wolf <kw...@redhat.com> wrote:
>>>>>>> Am 05.08.2011 10:40, schrieb Stefan Hajnoczi:
>>>>>>>> We've discussed safe methods for reopening image files (e.g.
>>>>>>>> useful for changing the hostcache parameter). The problem is
>>>>>>>> that closing the file first and then opening it again exposes
>>>>>>>> us to the error case where the open fails. At that point we
>>>>>>>> cannot get to the file anymore and our options are to
>>>>>>>> terminate QEMU, pause the VM, or offline the block device.
>>>>>>>>
>>>>>>>> This window of vulnerability can be eliminated by keeping the
>>>>>>>> file descriptor around and falling back to it should the open
>>>>>>>> fail.
>>>>>>>>
>>>>>>>> The challenge for the file descriptor approach is that image
>>>>>>>> formats, like VMDK, can span multiple files. Therefore the
>>>>>>>> solution is not as simple as stashing a single file descriptor
>>>>>>>> and reopening from it.
>>>>>>>
>>>>>>> So far I agree. The rest I believe is wrong because you can't assume
>>>>>>> that every backend uses file descriptors. The qemu block layer is based
>>>>>>> on BlockDriverStates, not fds. They are a concept that should be hidden
>>>>>>> in raw-posix.
>>>>>>>
>>>>>>> I think something like this could do:
>>>>>>>
>>>>>>> struct BDRVReopenState {
>>>>>>>     BlockDriverState *bs;
>>>>>>>     /* can be extended by block drivers */
>>>>>>> };
>>>>>>>
>>>>>>> .bdrv_reopen(BlockDriverState *bs, BDRVReopenState **reopen_state,
>>>>>>>              int flags);
>>>>>>> .bdrv_reopen_commit(BDRVReopenState *reopen_state);
>>>>>>> .bdrv_reopen_abort(BDRVReopenState *reopen_state);
>>>>>>>
>>>>>>> raw-posix would store the old file descriptor in its reopen_state. On
>>>>>>> commit, it closes the old descriptors, on abort it reverts to the old
>>>>>>> one and closes the newly opened one.
>>>>>>>
>>>>>>> Makes things a bit more complicated than the simple bdrv_reopen I had in
>>>>>>> mind before, but it allows VMDK to get an all-or-nothing semantics.
>>>>>>
>>>>>> Can you show how bdrv_reopen() would use these new interfaces? I'm
>>>>>> not 100% clear on the idea.
>>>>>
>>>>> Well, you wouldn't only call bdrv_reopen, but also either
>>>>> bdrv_reopen_commit/abort (for the top-level caller we can have a wrapper
>>>>> function that does both, but that's syntactic sugar).
>>>>>
>>>>> For example we would have:
>>>>>
>>>>> int vmdk_reopen()
>>>>
>>>> .bdrv_reopen() is a confusing name for this operation because it does
>>>> not reopen anything. bdrv_prepare_reopen() might be clearer.
>>>
>>> Makes sense.
>>>
>>>>
>>>>> {
>>>>>     *((VMDKReopenState**) rs) = malloc();
>>>>>
>>>>>     foreach (extent in s->extents) {
>>>>>         ret = bdrv_reopen(extent->file, &extent->reopen_state)
>>>>>         if (ret < 0)
>>>>>             goto fail;
>>>>>     }
>>>>>     return 0;
>>>>>
>>>>> fail:
>>>>>     foreach (extent in rs->already_reopened) {
>>>>>         bdrv_reopen_abort(extent->reopen_state);
>>>>>     }
>>>>>     return ret;
>>>>> }
>>>>
>>>>> void vmdk_reopen_commit()
>>>>> {
>>>>>     foreach (extent in s->extents) {
>>>>>         bdrv_reopen_commit(extent->reopen_state);
>>>>>     }
>>>>>     free(rs);
>>>>> }
>>>>>
>>>>> void vmdk_reopen_abort()
>>>>> {
>>>>>     foreach (extent in s->extents) {
>>>>>         bdrv_reopen_abort(extent->reopen_state);
>>>>>     }
>>>>>     free(rs);
>>>>> }
>>>>
>>>> Does the caller invoke bdrv_close(bs) after bdrv_prepare_reopen(bs,
>>>> &rs)?
>>>
>>> No. Closing the old backend would be part of bdrv_reopen_commit.
>>>
>>> Do you have a use case where it would be helpful if the caller invoked
>>> bdrv_close?
>>
>> When the caller does bdrv_close() two BlockDriverStates are never open
>> for the same image file. I thought this was a property we wanted.
>>
>> Also, in the block_set_hostcache case we need to reopen without
>> switching to a new BlockDriverState instance. That means the reopen
>> needs to be in-place with respect to the BlockDriverState *bs pointer.
>> We cannot create a new instance.
>
> Yes, but where do you even get the second BlockDriverState from?
>
> My prototype only returns an int, not a new BlockDriverState. Until
> bdrv_reopen_commit() it would refer to the old file descriptors etc. and
> after bdrv_reopen_commit() the very same BlockDriverState would refer to
> the new ones.

It seems I don't understand the API. I thought it was:

do_block_set_hostcache()
{
    bdrv_prepare_reopen(bs, &rs);
    ...open new file and check everything is okay...
    if (ret == 0) {
        bdrv_reopen_commit(rs);
    } else {
        bdrv_reopen_abort(rs);
    }
    return ret;
}

If the caller isn't opening the new file then what's the point of
giving the caller control over prepare, commit, and abort?

Stefan
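
For reference, here is a minimal sketch of the other reading of the
proposal, where the prepare step itself opens the new descriptor and the
caller only decides between commit and abort. This is not QEMU code:
RawState, RawReopenState and the raw_reopen_* functions are hypothetical
names made up for illustration, and a plain POSIX fd stands in for
whatever raw-posix would actually keep.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for a raw-posix style driver state (not QEMU's). */
typedef struct {
    const char *filename;
    int fd;            /* descriptor currently used for I/O */
    int open_flags;    /* flags the current descriptor was opened with */
} RawState;

/* Hypothetical per-reopen state: holds the new fd until commit or abort. */
typedef struct {
    RawState *s;
    int new_fd;
    int new_flags;
} RawReopenState;

/* Prepare: open the file again with the new flags. The old fd stays in
 * use, so a failure here leaves the RawState fully usable. */
int raw_reopen_prepare(RawState *s, RawReopenState *rs, int flags)
{
    int fd = open(s->filename, flags);
    if (fd < 0) {
        return -errno;
    }
    rs->s = s;
    rs->new_fd = fd;
    rs->new_flags = flags;
    return 0;
}

/* Commit: switch the very same RawState to the new fd, close the old one. */
void raw_reopen_commit(RawReopenState *rs)
{
    int old_fd = rs->s->fd;

    assert(rs->new_fd >= 0);
    rs->s->fd = rs->new_fd;
    rs->s->open_flags = rs->new_flags;
    close(old_fd);
    rs->new_fd = -1;
}

/* Abort: drop the new fd. The RawState was never modified. */
void raw_reopen_abort(RawReopenState *rs)
{
    if (rs->new_fd >= 0) {
        close(rs->new_fd);
        rs->new_fd = -1;
    }
}
```

Under this reading the split only pays off for a format driver with
several children, which can prepare all of them and then commit or
abort them as a group.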