Re: [Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS
On Thu, Jun 9, 2016 at 11:58 AM, Kevin Wolf wrote:
> On 08.06.2016 at 17:39, Nir Soffer wrote:
>> On Wed, Jun 8, 2016 at 12:32 PM, Kevin Wolf wrote:
>> > On 06.06.2016 at 16:42, Max Reitz wrote:
>> >> Currently, we are trying to move the backing BDS from the source to the
>> >> target in bdrv_replace_in_backing_chain() which is called from
>> >> mirror_exit(). However, mirror_complete() already tries to open the
>> >> target's backing chain with a call to bdrv_open_backing_file().
>> >>
>> >> First, we should only set the target's backing BDS once. Second, the
>> >> mirroring block job has a better idea of what to set it to than the
>> >> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>> >> conditions on when to move the backing BDS from source to target are not
>> >> really correct).
>> >>
>> >> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>> >> leave it to mirror_complete().
>> >>
>> >> However, mirror_complete() in turn pursues a questionable strategy by
>> >> employing bdrv_open_backing_file(): On the one hand, because this may
>> >> open the wrong backing file with drive-mirror in "existing" mode, or
>> >> because it will not override a possibly wrong backing file in the
>> >> blockdev-mirror case.
>> >>
>> >> On the other hand, we want to reuse the existing backing chain of the
>> >> source instead of opening everything anew, because the latter results in
>> >> having multiple BDSs for a single physical file and thus potentially
>> >> concurrent access which we should try to avoid.
>> >
>> > Careful, this "wrong" backing file might actually be intended!
>> >
>> > Consider a case where you want to move an image with its whole backing
>> > chain to different storage. In that case, you would copy all of the
>> > backing files (cp is good enough, they are read-only), create the
>> > destination image which already points at the copied backing chain, and
>> > then mirror in "existing" mode.
>> >
>> > The intention is obviously that after the job completion the new backing
>> > chain is used and not the old one.
>> >
>> > I know that such cases were discussed when mirroring was introduced, I'm
>> > not sure whether it's actually used. We need some input there:
>> >
>> > Eric, can you tell us whether libvirt makes use of such a setup?
>> >
>> > Nir, I'm not sure who is the right person in oVirt these days, but do
>> > you either know yourself whether oVirt requires this to work, or do you
>> > know who else would know?
>>
>> I'm the right person, thanks for keeping me in the loop.
>>
>> What you describe is how we migrate a disk from one storage to another:
>>
>> 1. Create a vm snapshot
>> 2. Create a volume on the destination storage for the snapshot
>> 3. Start mirroring from the source snapshot to the destination snapshot
>>    using libvirt virDomainBlockCopy:
>>    https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockCopy
>
> With VIR_DOMAIN_BLOCK_COPY_SHALLOW set, right? (That is, sync=top in QMP
> speak.)

Yes, actually we use:

    VIR_DOMAIN_BLOCK_COPY_SHALLOW | VIR_DOMAIN_BLOCK_COPY_REUSE_EXT

>> 4. Copy the rest of the chain from source to destination using qemu-img
>>    convert
>> 5. Pivot to the new chain using libvirt virDomainBlockJobAbort
>>    https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockJobAbort
>> 6. Remove the old chain
>>
>> source and target can be files or block devices, and we also plan to
>> support rbd and gluster volumes as target, maybe also as source.
>
> Thanks, Nir, we should then do our best not to break it.
>
> Max, maybe we can add a qemu-iotests case that does the exact same thing
> as oVirt does?
>
> Kevin
Re: [Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS
On Wed, Jun 8, 2016 at 12:32 PM, Kevin Wolf wrote:
> On 06.06.2016 at 16:42, Max Reitz wrote:
>> Currently, we are trying to move the backing BDS from the source to the
>> target in bdrv_replace_in_backing_chain() which is called from
>> mirror_exit(). However, mirror_complete() already tries to open the
>> target's backing chain with a call to bdrv_open_backing_file().
>>
>> First, we should only set the target's backing BDS once. Second, the
>> mirroring block job has a better idea of what to set it to than the
>> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>> conditions on when to move the backing BDS from source to target are not
>> really correct).
>>
>> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>> leave it to mirror_complete().
>>
>> However, mirror_complete() in turn pursues a questionable strategy by
>> employing bdrv_open_backing_file(): On the one hand, because this may
>> open the wrong backing file with drive-mirror in "existing" mode, or
>> because it will not override a possibly wrong backing file in the
>> blockdev-mirror case.
>>
>> On the other hand, we want to reuse the existing backing chain of the
>> source instead of opening everything anew, because the latter results in
>> having multiple BDSs for a single physical file and thus potentially
>> concurrent access which we should try to avoid.
>
> Careful, this "wrong" backing file might actually be intended!
>
> Consider a case where you want to move an image with its whole backing
> chain to different storage. In that case, you would copy all of the
> backing files (cp is good enough, they are read-only), create the
> destination image which already points at the copied backing chain, and
> then mirror in "existing" mode.
>
> The intention is obviously that after the job completion the new backing
> chain is used and not the old one.
>
> I know that such cases were discussed when mirroring was introduced, I'm
> not sure whether it's actually used. We need some input there:
>
> Eric, can you tell us whether libvirt makes use of such a setup?
>
> Nir, I'm not sure who is the right person in oVirt these days, but do
> you either know yourself whether oVirt requires this to work, or do you
> know who else would know?

I'm the right person, thanks for keeping me in the loop.

What you describe is how we migrate a disk from one storage to another:

1. Create a vm snapshot
2. Create a volume on the destination storage for the snapshot
3. Start mirroring from the source snapshot to the destination snapshot
   using libvirt virDomainBlockCopy:
   https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockCopy
4. Copy the rest of the chain from source to destination using qemu-img
   convert
5. Pivot to the new chain using libvirt virDomainBlockJobAbort
   https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockJobAbort
6. Remove the old chain

source and target can be files or block devices, and we also plan to
support rbd and gluster volumes as target, maybe also as source.

Nir

>
>> Thus, instead of invoking bdrv_open_backing_file(), just set the correct
>> backing BDS directly via bdrv_set_backing_hd(). Also, do so only when
>> mirror_complete() is certain to succeed.
>>
>> In contrast to what bdrv_replace_in_backing_chain() did so far, we do
>> not need to drop the source's backing file.
>>
>> Signed-off-by: Max Reitz
>
> Leaving the actual code review for later when we have decided what
> semantics we even want.
>
> Kevin
Re: [Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS
On 08.06.2016 13:28, Paolo Bonzini wrote:
>
>
> - Original Message -
>> From: "Kevin Wolf"
>> To: "Max Reitz"
>> Cc: qemu-block@nongnu.org, qemu-de...@nongnu.org, "Fam Zheng",
>> nsof...@redhat.com, ebl...@redhat.com, pbonz...@redhat.com
>> Sent: Wednesday, June 8, 2016 11:32:29 AM
>> Subject: Re: [PATCH v2 2/3] block/mirror: Fix target backing BDS
>>
>> On 06.06.2016 at 16:42, Max Reitz wrote:
>>> Currently, we are trying to move the backing BDS from the source to the
>>> target in bdrv_replace_in_backing_chain() which is called from
>>> mirror_exit(). However, mirror_complete() already tries to open the
>>> target's backing chain with a call to bdrv_open_backing_file().
>>>
>>> First, we should only set the target's backing BDS once. Second, the
>>> mirroring block job has a better idea of what to set it to than the
>>> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>>> conditions on when to move the backing BDS from source to target are not
>>> really correct).
>>>
>>> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>>> leave it to mirror_complete().
>>>
>>> However, mirror_complete() in turn pursues a questionable strategy by
>>> employing bdrv_open_backing_file(): On the one hand, because this may
>>> open the wrong backing file with drive-mirror in "existing" mode, or
>>> because it will not override a possibly wrong backing file in the
>>> blockdev-mirror case.
>>
>> Careful, this "wrong" backing file might actually be intended!
>>
>> Consider a case where you want to move an image with its whole backing
>> chain to different storage. In that case, you would copy all of the
>> backing files (cp is good enough, they are read-only), create the
>> destination image which already points at the copied backing chain, and
>> then mirror in "existing" mode.
>>
>> The intention is obviously that after the job completion the new backing
>> chain is used and not the old one.
>
> Yes, this is the intention and it should not be changed. In addition
> to what Kevin said, you can use drive-mirror to collapse the image to a
> single file; in this case, QEMU should not be using the backing files of
> the source.

That is an issue that we have right now. If you do drive-mirror in
absolute-paths mode with sync=full, the target will have the backing
chain of the source. This is something that this patch fixes.

In fact, I think if you do drive-mirror in existing mode or
blockdev-mirror and the target image does not have a backing file
(whatever sync mode you have used), the same will happen.

Max

> bdrv_open_backing_file() is used because what we want to do is to
> "undo" the BDRV_O_NO_BACKING flag used by qmp_drive_mirror.
>
> If the contents change under the guest's feet, it's the layers above
> QEMU that have screwed up.
>
> Paolo
>
Re: [Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS
On 08.06.2016 16:40, Max Reitz wrote:
> On 08.06.2016 13:28, Paolo Bonzini wrote:
>>
>>
>> - Original Message -
>>> From: "Kevin Wolf"
>>> To: "Max Reitz"
>>> Cc: qemu-block@nongnu.org, qemu-de...@nongnu.org, "Fam Zheng",
>>> nsof...@redhat.com, ebl...@redhat.com, pbonz...@redhat.com
>>> Sent: Wednesday, June 8, 2016 11:32:29 AM
>>> Subject: Re: [PATCH v2 2/3] block/mirror: Fix target backing BDS
>>>
>>> On 06.06.2016 at 16:42, Max Reitz wrote:
>>>> Currently, we are trying to move the backing BDS from the source to the
>>>> target in bdrv_replace_in_backing_chain() which is called from
>>>> mirror_exit(). However, mirror_complete() already tries to open the
>>>> target's backing chain with a call to bdrv_open_backing_file().
>>>>
>>>> First, we should only set the target's backing BDS once. Second, the
>>>> mirroring block job has a better idea of what to set it to than the
>>>> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>>>> conditions on when to move the backing BDS from source to target are not
>>>> really correct).
>>>>
>>>> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>>>> leave it to mirror_complete().
>>>>
>>>> However, mirror_complete() in turn pursues a questionable strategy by
>>>> employing bdrv_open_backing_file(): On the one hand, because this may
>>>> open the wrong backing file with drive-mirror in "existing" mode, or
>>>> because it will not override a possibly wrong backing file in the
>>>> blockdev-mirror case.
>>>
>>> Careful, this "wrong" backing file might actually be intended!
>>>
>>> Consider a case where you want to move an image with its whole backing
>>> chain to different storage. In that case, you would copy all of the
>>> backing files (cp is good enough, they are read-only), create the
>>> destination image which already points at the copied backing chain, and
>>> then mirror in "existing" mode.
>>>
>>> The intention is obviously that after the job completion the new backing
>>> chain is used and not the old one.
>>
>> Yes, this is the intention and it should not be changed. In addition
>> to what Kevin said, you can use drive-mirror to collapse the image to a
>> single file; in this case, QEMU should not be using the backing files of
>> the source.
>
> That is an issue that we have right now. If you do drive-mirror in
> absolute-paths mode with sync=full, the target will have the backing
> chain of the source. This is something that this patch fixes.

As a clarification: I mean the backing chain inside QEMU (in the BDS
graph), not the on-disk backing chain, i.e. how the physical image files
link to each other.

Max

> In fact, I think if you do drive-mirror in existing mode or
> blockdev-mirror and the target image does not have a backing file
> (whatever sync mode you have used), the same will happen.
>
> Max
>
>> bdrv_open_backing_file() is used because what we want to do is to
>> "undo" the BDRV_O_NO_BACKING flag used by qmp_drive_mirror.
>>
>> If the contents change under the guest's feet, it's the layers above
>> QEMU that have screwed up.
>>
>> Paolo
>>
>
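To make the in-graph effect concrete, here is a minimal sketch of the condition that the patch removes from bdrv_replace_in_backing_chain(), applied to a toy graph. `Node`, `chain_contains()`, `old_replace_fixup()` and `full_sync_target_backing()` are hypothetical stand-ins, not QEMU types or functions; only the shape of the `if` condition is taken from the patch. The sketch assumes the target node has no backing link in the graph when the job completes, which is the sync=full case described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a BlockDriverState: only the in-graph backing link. */
typedef struct Node {
    const char *name;
    struct Node *backing;
} Node;

/* True if 'needle' is 'top' itself or reachable from it via backing links. */
static bool chain_contains(const Node *top, const Node *needle)
{
    for (; top != NULL; top = top->backing) {
        if (top == needle) {
            return true;
        }
    }
    return false;
}

/* Model of the block removed from bdrv_replace_in_backing_chain(): a
 * previously independent node without a backing link takes over the old
 * node's backing link. */
static void old_replace_fixup(Node *old, Node *new_node)
{
    if (new_node->backing == NULL &&
        !chain_contains(old->backing, new_node)) {
        new_node->backing = old->backing;
        old->backing = NULL;
    }
}

/* sync=full example: the target already holds all data, yet in this model
 * it still ends up on top of the source's base. Returns the target's
 * backing node after the fixup. */
const Node *full_sync_target_backing(void)
{
    static Node base, source, target;

    base = (Node){ "base", NULL };
    source = (Node){ "source", &base };
    target = (Node){ "target", NULL };

    old_replace_fixup(&source, &target);
    assert(source.backing == NULL); /* the source lost its backing link */
    return target.backing;
}
```

In this model the fully copied target is attached to the source's base node, while an active commit (where the new node is already below the old one) is left alone.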
Re: [Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS
On 08.06.2016 at 13:28, Paolo Bonzini wrote:
>
>
> - Original Message -
> > From: "Kevin Wolf"
> > To: "Max Reitz"
> > Cc: qemu-block@nongnu.org, qemu-de...@nongnu.org, "Fam Zheng",
> > nsof...@redhat.com, ebl...@redhat.com, pbonz...@redhat.com
> > Sent: Wednesday, June 8, 2016 11:32:29 AM
> > Subject: Re: [PATCH v2 2/3] block/mirror: Fix target backing BDS
> >
> > On 06.06.2016 at 16:42, Max Reitz wrote:
> > > Currently, we are trying to move the backing BDS from the source to the
> > > target in bdrv_replace_in_backing_chain() which is called from
> > > mirror_exit(). However, mirror_complete() already tries to open the
> > > target's backing chain with a call to bdrv_open_backing_file().
> > >
> > > First, we should only set the target's backing BDS once. Second, the
> > > mirroring block job has a better idea of what to set it to than the
> > > generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
> > > conditions on when to move the backing BDS from source to target are not
> > > really correct).
> > >
> > > Therefore, remove that code from bdrv_replace_in_backing_chain() and
> > > leave it to mirror_complete().
> > >
> > > However, mirror_complete() in turn pursues a questionable strategy by
> > > employing bdrv_open_backing_file(): On the one hand, because this may
> > > open the wrong backing file with drive-mirror in "existing" mode, or
> > > because it will not override a possibly wrong backing file in the
> > > blockdev-mirror case.
> >
> > Careful, this "wrong" backing file might actually be intended!
> >
> > Consider a case where you want to move an image with its whole backing
> > chain to different storage. In that case, you would copy all of the
> > backing files (cp is good enough, they are read-only), create the
> > destination image which already points at the copied backing chain, and
> > then mirror in "existing" mode.
> >
> > The intention is obviously that after the job completion the new backing
> > chain is used and not the old one.
>
> Yes, this is the intention and it should not be changed. In addition
> to what Kevin said, you can use drive-mirror to collapse the image to a
> single file; in this case, QEMU should not be using the backing files of
> the source.
>
> bdrv_open_backing_file() is used because what we want to do is to
> "undo" the BDRV_O_NO_BACKING flag used by qmp_drive_mirror.
>
> If the contents change under the guest's feet, it's the layers above
> QEMU that have screwed up.

We should probably have test cases for both scenarios. They would make
it obvious that changing this behaviour is not okay. Actually, I'm
surprised that our existing cases don't seem to cover this.

Kevin
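As a rough illustration of what such test cases would assert, the two scenarios can be written down against a toy model. This is plain C, not a qemu-iotests case, and every name in it is hypothetical: `Image`, `header_backing` (the backing file the image itself names), `backing` (the link attached in the graph) and `reopen_backing()` (a stand-in for re-attaching the backing file that was left out when the target was opened with BDRV_O_NO_BACKING).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy image: what its header names as backing file vs. what is attached. */
typedef struct Image {
    const char *name;
    struct Image *header_backing; /* backing file named by the image itself */
    struct Image *backing;        /* backing link attached in the graph */
} Image;

/* Model of undoing "opened without backing": attach whatever the image
 * header names, which may be nothing at all. */
static void reopen_backing(Image *img)
{
    img->backing = img->header_backing;
}

/* Scenario 1: "existing" mode, target pre-created on top of a copied
 * backing chain. After completion the copied chain must be in use. */
bool scenario_existing_uses_copied_chain(void)
{
    Image old_base = { "old-base", NULL, NULL };
    Image new_base = { "new-base", NULL, NULL };
    Image source = { "source", &old_base, &old_base };
    Image target = { "target", &new_base, NULL };

    reopen_backing(&target);
    return target.backing == &new_base && target.backing != source.backing;
}

/* Scenario 2: a full sync collapses the chain into a single file, so the
 * target must not pick up any backing file of the source. */
bool scenario_full_sync_is_standalone(void)
{
    Image old_base = { "old-base", NULL, NULL };
    Image source = { "source", &old_base, &old_base };
    Image target = { "target", NULL, NULL };

    reopen_backing(&target);
    return target.backing == NULL && source.backing == &old_base;
}

/* Runs both scenarios; aborts via assert() if either expectation fails. */
void check_mirror_scenarios(void)
{
    assert(scenario_existing_uses_copied_chain());
    assert(scenario_full_sync_is_standalone());
}
```

The model only captures the expected end state of the graph; a real test would have to create the images and drive the job through QMP.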
Re: [Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS
- Original Message -
> From: "Kevin Wolf"
> To: "Max Reitz"
> Cc: qemu-block@nongnu.org, qemu-de...@nongnu.org, "Fam Zheng",
> nsof...@redhat.com, ebl...@redhat.com, pbonz...@redhat.com
> Sent: Wednesday, June 8, 2016 11:32:29 AM
> Subject: Re: [PATCH v2 2/3] block/mirror: Fix target backing BDS
>
> On 06.06.2016 at 16:42, Max Reitz wrote:
> > Currently, we are trying to move the backing BDS from the source to the
> > target in bdrv_replace_in_backing_chain() which is called from
> > mirror_exit(). However, mirror_complete() already tries to open the
> > target's backing chain with a call to bdrv_open_backing_file().
> >
> > First, we should only set the target's backing BDS once. Second, the
> > mirroring block job has a better idea of what to set it to than the
> > generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
> > conditions on when to move the backing BDS from source to target are not
> > really correct).
> >
> > Therefore, remove that code from bdrv_replace_in_backing_chain() and
> > leave it to mirror_complete().
> >
> > However, mirror_complete() in turn pursues a questionable strategy by
> > employing bdrv_open_backing_file(): On the one hand, because this may
> > open the wrong backing file with drive-mirror in "existing" mode, or
> > because it will not override a possibly wrong backing file in the
> > blockdev-mirror case.
>
> Careful, this "wrong" backing file might actually be intended!
>
> Consider a case where you want to move an image with its whole backing
> chain to different storage. In that case, you would copy all of the
> backing files (cp is good enough, they are read-only), create the
> destination image which already points at the copied backing chain, and
> then mirror in "existing" mode.
>
> The intention is obviously that after the job completion the new backing
> chain is used and not the old one.

Yes, this is the intention and it should not be changed. In addition
to what Kevin said, you can use drive-mirror to collapse the image to a
single file; in this case, QEMU should not be using the backing files of
the source.

bdrv_open_backing_file() is used because what we want to do is to
"undo" the BDRV_O_NO_BACKING flag used by qmp_drive_mirror.

If the contents change under the guest's feet, it's the layers above
QEMU that have screwed up.

Paolo
[Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS
Currently, we are trying to move the backing BDS from the source to the
target in bdrv_replace_in_backing_chain() which is called from
mirror_exit(). However, mirror_complete() already tries to open the
target's backing chain with a call to bdrv_open_backing_file().

First, we should only set the target's backing BDS once. Second, the
mirroring block job has a better idea of what to set it to than the
generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
conditions on when to move the backing BDS from source to target are not
really correct).

Therefore, remove that code from bdrv_replace_in_backing_chain() and
leave it to mirror_complete().

However, mirror_complete() in turn pursues a questionable strategy by
employing bdrv_open_backing_file(): On the one hand, because this may
open the wrong backing file with drive-mirror in "existing" mode, or
because it will not override a possibly wrong backing file in the
blockdev-mirror case.

On the other hand, we want to reuse the existing backing chain of the
source instead of opening everything anew, because the latter results in
having multiple BDSs for a single physical file and thus potentially
concurrent access which we should try to avoid.

Thus, instead of invoking bdrv_open_backing_file(), just set the correct
backing BDS directly via bdrv_set_backing_hd(). Also, do so only when
mirror_complete() is certain to succeed.

In contrast to what bdrv_replace_in_backing_chain() did so far, we do
not need to drop the source's backing file.

Signed-off-by: Max Reitz
---
 block.c        |  8 --------
 block/mirror.c | 21 +++++++++++++--------
 2 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index 16463aa..792f5dd 100644
--- a/block.c
+++ b/block.c
@@ -2288,14 +2288,6 @@ void bdrv_replace_in_backing_chain(BlockDriverState *old, BlockDriverState *new)
 
     change_parent_backing_link(old, new);
 
-    /* Change backing files if a previously independent node is added to the
-     * chain. For active commit, we replace top by its own (indirect) backing
-     * file and don't do anything here so we don't build a loop. */
-    if (new->backing == NULL && !bdrv_chain_contains(backing_bs(old), new)) {
-        bdrv_set_backing_hd(new, backing_bs(old));
-        bdrv_set_backing_hd(old, NULL);
-    }
-
     bdrv_unref(old);
 }
 
diff --git a/block/mirror.c b/block/mirror.c
index 80fd3c7..217475b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -742,15 +742,11 @@ static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
 static void mirror_complete(BlockJob *job, Error **errp)
 {
     MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
-    Error *local_err = NULL;
-    int ret;
+    BlockDriverState *src, *target;
+
+    src = blk_bs(job->blk);
+    target = blk_bs(s->target);
 
-    ret = bdrv_open_backing_file(blk_bs(s->target), NULL, "backing",
-                                 &local_err);
-    if (ret < 0) {
-        error_propagate(errp, local_err);
-        return;
-    }
     if (!s->synced) {
         error_setg(errp, QERR_BLOCK_JOB_NOT_READY, job->id);
         return;
@@ -777,6 +773,15 @@ static void mirror_complete(BlockJob *job, Error **errp)
         aio_context_release(replace_aio_context);
     }
 
+    /* Now we need to adjust the target's backing BDS. This is not necessary
+     * when having performed a commit operation. */
+    if (!bdrv_chain_contains(backing_bs(src), target)) {
+        BlockDriverState *backing = s->is_none_mode ? src : s->base;
+        if (backing_bs(target) != backing) {
+            bdrv_set_backing_hd(target, backing);
+        }
+    }
+
     s->should_complete = true;
     block_job_enter(&s->common);
 }
-- 
2.8.3
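The decision added to mirror_complete() can be tried out in isolation with a toy graph. The following is a minimal sketch, not QEMU code: `Node`, `chain_contains()` and `adjust_target_backing()` are hypothetical stand-ins for BlockDriverState, bdrv_chain_contains() and the new block in mirror_complete(). It also assumes that `base` (s->base in the patch) is NULL for sync=full and the source's backing node for sync=top; the diff itself does not show where s->base is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a BlockDriverState: only the in-graph backing link. */
typedef struct Node {
    const char *name;
    struct Node *backing;
} Node;

/* True if 'needle' is 'top' itself or reachable from it via backing links. */
static bool chain_contains(const Node *top, const Node *needle)
{
    for (; top != NULL; top = top->backing) {
        if (top == needle) {
            return true;
        }
    }
    return false;
}

/* Model of the block added to mirror_complete(). 'base' plays the role of
 * s->base and 'is_none_mode' that of s->is_none_mode. */
void adjust_target_backing(Node *src, Node *target, Node *base,
                           bool is_none_mode)
{
    /* Skip the commit case, where the target already sits below the source. */
    if (!chain_contains(src->backing, target)) {
        Node *backing = is_none_mode ? src : base;

        /* Only touch the link if it actually has to change. */
        if (target->backing != backing) {
            target->backing = backing;
        }
    }
}
```

Under these assumptions a sync=top target is placed on the source's own backing node, a sync=full target ends up without a backing node, a sync=none target is placed on top of the source itself, and a commit target is left untouched. The sync=top outcome is the one the replies question for "existing" mode, where the target image may deliberately name a different, copied backing chain.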