Re: about mirror cancel
16.04.2021 10:11, Max Reitz wrote: On 16.04.21 09:05, Max Reitz wrote: On 15.04.21 20:46, Vladimir Sementsov-Ogievskiy wrote: [...] Note, that if cancelling all in-flight requests on target is wrong on mirror cancel, we still don't have real bug, as the only implementation of .bdrv_cancel_in_flight is stopping reconnect waiting in nbd driver. So, we'll cancel requests only if connection is already lost anyway. But that probably means, that correct name of the handler would be .bdrv_cancel_in_fligth_requests_that_will_most_probably_fail_anyway().. It’s documentation states that it should cancel all in-flight requests, so it’s more likely it just isn’t implemented where it could be. Oh, I now see you added it in the same series. Well, then I suppose you’re free to change the semantics as you see fit. But be aware that even cancelling those requests means that you abandon the target. So it must then fail instead of emitting the COMPLETED event (AFAIR the mirror job emits COMPLETED when cancelled in READY with force=false). Note that BLOCK_JOB_COMPLETED can indicate failure, it has error field.. Interesting, does libvirt aware of it.. If the user wants the mirror job to create a consistent copy and so cancels it after READY (with force=false), I don’t know whether cancelling those hanging requests is what we want. If the cancel hangs and the user sees this, they are still free to decide to cancel again with force=true, no? Right. All right, I'll remake the feature to cancel requests only with force=true. Thanks for explanations, and great that I decided to ask. -- Best regards, Vladimir
Re: about mirror cancel
On 16.04.21 09:05, Max Reitz wrote: On 15.04.21 20:46, Vladimir Sementsov-Ogievskiy wrote: [...] Note, that if cancelling all in-flight requests on target is wrong on mirror cancel, we still don't have real bug, as the only implementation of .bdrv_cancel_in_flight is stopping reconnect waiting in nbd driver. So, we'll cancel requests only if connection is already lost anyway. But that probably means, that correct name of the handler would be .bdrv_cancel_in_fligth_requests_that_will_most_probably_fail_anyway().. It’s documentation states that it should cancel all in-flight requests, so it’s more likely it just isn’t implemented where it could be. Oh, I now see you added it in the same series. Well, then I suppose you’re free to change the semantics as you see fit. But be aware that even cancelling those requests means that you abandon the target. So it must then fail instead of emitting the COMPLETED event (AFAIR the mirror job emits COMPLETED when cancelled in READY with force=false). If the user wants the mirror job to create a consistent copy and so cancels it after READY (with force=false), I don’t know whether cancelling those hanging requests is what we want. If the cancel hangs and the user sees this, they are still free to decide to cancel again with force=true, no? Max
Re: about mirror cancel
On 15.04.21 20:46, Vladimir Sementsov-Ogievskiy wrote: Hi all! Recently I've implemented fast-cancelling of mirror job: do bdrv_cancel_in_flight() in mirror_cancel(). Now I'm in doubt: is it a correct thing? I heard, that mirror-cancel is a kind of valid mirror completion.. Looking at documentation: # Note that if you issue 'block-job-cancel' after 'drive-mirror' has indicated # (via the event BLOCK_JOB_READY) that the source and destination are # synchronized, then the event triggered by this command changes to # BLOCK_JOB_COMPLETED, to indicate that the mirroring has ended and the # destination now has a point-in-time copy tied to the time of the cancellation. So, in other words, do we guarantee something to the user, if it calls mirror-cancel in ready state? Does this abuse exist in libvirt? How is it abuse it if it’s documented? AFAIR it does exist, because libvirt’s blockcopy always uses mirror (with --pivot it’s allowed to complete, without it is cancelled). (And the point of course is that if you want mirror to create a copy without pivoting, you need this behavior, because otherwise the target may be in an inconsistent state.) Note, that if cancelling all in-flight requests on target is wrong on mirror cancel, we still don't have real bug, as the only implementation of .bdrv_cancel_in_flight is stopping reconnect waiting in nbd driver. So, we'll cancel requests only if connection is already lost anyway. But that probably means, that correct name of the handler would be .bdrv_cancel_in_fligth_requests_that_will_most_probably_fail_anyway().. It’s documentation states that it should cancel all in-flight requests, so it’s more likely it just isn’t implemented where it could be. And it also means, that abuse of mirror-cancel as valid completion is a bad idea, as we can't distinguish the cases when user calls job-cancel to have a kind of point-in-time copy, or just to cancel job (and being not interested in the final state of target). So, probably we need an option boolean argument for blockjob-cancel, like "hard", that will cancel in-flight writes on target node.. There is @force. See commit b76e4458b1eb3c3. Max
about mirror cancel
Hi all! Recently I've implemented fast-cancelling of mirror job: do bdrv_cancel_in_flight() in mirror_cancel(). Now I'm in doubt: is it a correct thing? I heard, that mirror-cancel is a kind of valid mirror completion.. Looking at documentation: # Note that if you issue 'block-job-cancel' after 'drive-mirror' has indicated # (via the event BLOCK_JOB_READY) that the source and destination are # synchronized, then the event triggered by this command changes to # BLOCK_JOB_COMPLETED, to indicate that the mirroring has ended and the # destination now has a point-in-time copy tied to the time of the cancellation. So, in other words, do we guarantee something to the user, if it calls mirror-cancel in ready state? Does this abuse exist in libvirt? Note, that if cancelling all in-flight requests on target is wrong on mirror cancel, we still don't have real bug, as the only implementation of .bdrv_cancel_in_flight is stopping reconnect waiting in nbd driver. So, we'll cancel requests only if connection is already lost anyway. But that probably means, that correct name of the handler would be .bdrv_cancel_in_fligth_requests_that_will_most_probably_fail_anyway().. And it also means, that abuse of mirror-cancel as valid completion is a bad idea, as we can't distinguish the cases when user calls job-cancel to have a kind of point-in-time copy, or just to cancel job (and being not interested in the final state of target). So, probably we need an option boolean argument for blockjob-cancel, like "hard", that will cancel in-flight writes on target node.. -- Best regards, Vladimir