On 09.08.2011 12:25, Stefan Hajnoczi wrote:
> On Mon, Aug 8, 2011 at 4:16 PM, Kevin Wolf <kw...@redhat.com> wrote:
>> On 08.08.2011 16:49, Stefan Hajnoczi wrote:
>>> On Fri, Aug 5, 2011 at 10:48 AM, Kevin Wolf <kw...@redhat.com> wrote:
>>>> On 05.08.2011 11:29, Stefan Hajnoczi wrote:
>>>>> On Fri, Aug 5, 2011 at 10:07 AM, Kevin Wolf <kw...@redhat.com> wrote:
>>>>>> On 05.08.2011 10:40, Stefan Hajnoczi wrote:
>>>>>>> We've discussed safe methods for reopening image files (e.g. useful
>>>>>>> for changing the hostcache parameter). The problem is that closing
>>>>>>> the file first and then opening it again exposes us to the error
>>>>>>> case where the open fails. At that point we cannot get to the file
>>>>>>> anymore and our options are to terminate QEMU, pause the VM, or
>>>>>>> offline the block device.
>>>>>>>
>>>>>>> This window of vulnerability can be eliminated by keeping the file
>>>>>>> descriptor around and falling back to it should the open fail.
>>>>>>>
>>>>>>> The challenge for the file descriptor approach is that image
>>>>>>> formats, like VMDK, can span multiple files. Therefore the solution
>>>>>>> is not as simple as stashing a single file descriptor and reopening
>>>>>>> from it.
>>>>>>
>>>>>> So far I agree. The rest I believe is wrong because you can't assume
>>>>>> that every backend uses file descriptors. The qemu block layer is based
>>>>>> on BlockDriverStates, not fds. They are a concept that should be hidden
>>>>>> in raw-posix.
>>>>>>
>>>>>> I think something like this could do:
>>>>>>
>>>>>> struct BDRVReopenState {
>>>>>>     BlockDriverState *bs;
>>>>>>     /* can be extended by block drivers */
>>>>>> };
>>>>>>
>>>>>> .bdrv_reopen(BlockDriverState *bs, BDRVReopenState **reopen_state,
>>>>>>              int flags);
>>>>>> .bdrv_reopen_commit(BDRVReopenState *reopen_state);
>>>>>> .bdrv_reopen_abort(BDRVReopenState *reopen_state);
>>>>>>
>>>>>> raw-posix would store the old file descriptor in its reopen_state.
>>>>>> On commit, it closes the old descriptors, on abort it reverts to the
>>>>>> old one and closes the newly opened one.
>>>>>>
>>>>>> Makes things a bit more complicated than the simple bdrv_reopen I had
>>>>>> in mind before, but it allows VMDK to get an all-or-nothing semantics.
>>>>>
>>>>> Can you show how bdrv_reopen() would use these new interfaces? I'm
>>>>> not 100% clear on the idea.
>>>>
>>>> Well, you wouldn't only call bdrv_reopen, but also either
>>>> bdrv_reopen_commit/abort (for the top-level caller we can have a wrapper
>>>> function that does both, but that's syntactic sugar).
>>>>
>>>> For example we would have:
>>>>
>>>> int vmdk_reopen()
>>>
>>> .bdrv_reopen() is a confusing name for this operation because it does
>>> not reopen anything. bdrv_prepare_reopen() might be clearer.
>>
>> Makes sense.
>>
>>>
>>>> {
>>>>     *((VMDKReopenState**) rs) = malloc();
>>>>
>>>>     foreach (extent in s->extents) {
>>>>         ret = bdrv_reopen(extent->file, &extent->reopen_state)
>>>>         if (ret < 0)
>>>>             goto fail;
>>>>     }
>>>>     return 0;
>>>>
>>>> fail:
>>>>     foreach (extent in rs->already_reopened) {
>>>>         bdrv_reopen_abort(extent->reopen_state);
>>>>     }
>>>>     return ret;
>>>> }
>>>
>>>> void vmdk_reopen_commit()
>>>> {
>>>>     foreach (extent in s->extents) {
>>>>         bdrv_reopen_commit(extent->reopen_state);
>>>>     }
>>>>     free(rs);
>>>> }
>>>>
>>>> void vmdk_reopen_abort()
>>>> {
>>>>     foreach (extent in s->extents) {
>>>>         bdrv_reopen_abort(extent->reopen_state);
>>>>     }
>>>>     free(rs);
>>>> }
>>>
>>> Does the caller invoke bdrv_close(bs) after bdrv_prepare_reopen(bs,
>>> &rs)?
>>
>> No. Closing the old backend would be part of bdrv_reopen_commit.
>>
>> Do you have a use case where it would be helpful if the caller invoked
>> bdrv_close?
>
> When the caller does bdrv_close() two BlockDriverStates are never open
> for the same image file. I thought this was a property we wanted.
>
> Also, in the block_set_hostcache case we need to reopen without
> switching to a new BlockDriverState instance. That means the reopen
> needs to be in-place with respect to the BlockDriverState *bs pointer.
> We cannot create a new instance.

Yes, but where do you even get the second BlockDriverState from? My
prototype only returns an int, not a new BlockDriverState. Until
bdrv_reopen_commit() is called, the existing BlockDriverState keeps
referring to the old file descriptors etc.; after bdrv_reopen_commit()
the very same BlockDriverState refers to the new ones.

Kevin
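
To make the in-place semantics above concrete, here is a minimal sketch of
the prepare/commit/abort pattern for a single file-descriptor backend. It
is not QEMU code: ToyBDS, ToyReopenState and the toy_reopen_* functions
are hypothetical stand-ins for BlockDriverState, BDRVReopenState and the
proposed callbacks, and the sketch assumes the new flags can be passed
straight to open(2). It follows the variant where the state keeps its old
descriptor until commit.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for BlockDriverState (raw-posix-like backend). */
typedef struct ToyBDS {
    const char *filename;
    int fd;          /* descriptor currently used for I/O */
    int open_flags;  /* flags the current descriptor was opened with */
} ToyBDS;

/* Hypothetical stand-in for BDRVReopenState, extended by the "driver". */
typedef struct ToyReopenState {
    ToyBDS *bs;
    int new_fd;      /* opened in prepare, not yet in use */
    int new_flags;
} ToyReopenState;

/* Prepare: open the file with the new flags; bs itself is left untouched. */
int toy_reopen_prepare(ToyBDS *bs, ToyReopenState **prs, int flags)
{
    assert(bs != NULL && prs != NULL);

    ToyReopenState *rs = malloc(sizeof(*rs));
    if (rs == NULL) {
        return -1;
    }
    rs->new_fd = open(bs->filename, flags);
    if (rs->new_fd < 0) {
        free(rs);            /* nothing changed, the old fd is still valid */
        return -1;
    }
    rs->bs = bs;
    rs->new_flags = flags;
    *prs = rs;
    return 0;
}

/* Commit: the very same ToyBDS switches to the new descriptor. */
void toy_reopen_commit(ToyReopenState *rs)
{
    close(rs->bs->fd);
    rs->bs->fd = rs->new_fd;
    rs->bs->open_flags = rs->new_flags;
    free(rs);
}

/* Abort: drop the new descriptor, the ToyBDS keeps using the old one. */
void toy_reopen_abort(ToyReopenState *rs)
{
    close(rs->new_fd);
    free(rs);
}
```

A caller would run toy_reopen_prepare() and then exactly one of
toy_reopen_commit() or toy_reopen_abort(). The ToyBDS pointer never
changes, and a failed prepare leaves the old descriptor usable, which is
the property block_set_hostcache needs.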