On 22.07.21 19:58, Vladimir Sementsov-Ogievskiy wrote:
22.07.2021 15:26, Max Reitz wrote:
Most callers of job_is_cancelled() actually want to know whether the job
is on its way to immediate termination. For example, we refuse to pause
jobs that are cancelled; but this only makes sense for jobs that are
really actually cancelled.
A mirror job that is cancelled during READY with force=false should
absolutely be allowed to pause. This "cancellation" (which is actually
a kind of completion)
You have to keep repeating that this "cancel" is not a "cancel".
So the whole problem is this feature of mirror: a cancel in the READY
state does not cancel but performs a specific kind of completion.
You are trying to handle this correctly in the generic layer.
Did you consider just dropping the feature from the generic layer
instead, so that all *cancel* functions always force-cancel? Then the
internal implementation would become a lot clearer.
Yes, I considered that, and I’ve decided against it (for now), because
such a change would obviously be an incompatible change. It would
require a deprecation period, and so we would need to fix this bug now
anyway.
But we have to support the QMP block-job-cancel of a READY mirror (and
commit) job with force=false.
We can do that as an exception in qmp_block_job_cancel, something like:
    if (job is mirror or commit AND it's ready AND force = false)
        mirror_soft_cancel(...);
    else
        job_cancel(...);
I didn’t consider such a hack, though. I don’t like it. If we think
that we should change our approach because mirror’s soft cancel is
actually a completion mode, and the current situation is too confusing,
such a change should be user-visible, too. (I think there was this idea
of having job-specific flags or parameters you could change at runtime,
and so you’d just change the “pivot” parameter between true or false.)
Also, I don’t know whether this would really make anything “a lot”
easier. After this series job_is_cancelled() already tells the true
story, so all we could really change is to drop force_cancel and unify
the “s->should_complete || job_cancel_requested()” conditions in
block/mirror.c into one variable. So when I considered making cancel
exclusively force-cancel jobs, I thought it wouldn’t actually be worth
it in practice.
may take an indefinite amount of time, and so
should behave like any job during normal operation. For example, with
on-target-error=stop, the job should stop on write errors. (In
contrast, force-cancelled jobs should not get write errors, as they
should just terminate and not do further I/O.)
Therefore, redefine job_is_cancelled() to only return true for jobs that
are force-cancelled (which as of HEAD^ means any job that interprets the
cancellation request as a request for immediate termination), and add
job_cancel_requested() as the general variant, which returns true for any
jobs which have been requested to be cancelled, whether it be
immediately or after an arbitrarily long completion phase.
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz <mre...@redhat.com>
---
[..]
--- a/job.c
+++ b/job.c
@@ -216,6 +216,11 @@ const char *job_type_str(const Job *job)
}
bool job_is_cancelled(Job *job)
+{
+ return job->cancelled && job->force_cancel;
+}
+
+bool job_cancel_requested(Job *job)
{
return job->cancelled;
}
@@ -650,7 +655,7 @@ static void job_conclude(Job *job)
static void job_update_rc(Job *job)
{
- if (!job->ret && job_is_cancelled(job)) {
+ if (!job->ret && job_cancel_requested(job)) {
Why not job_is_cancelled() here?
So for mirror's other kind of completion, we set ret to -ECANCELED?
I thought the return value is a user-visible thing, so I left it as-is.
Seems I was wrong, more below.
job->ret = -ECANCELED;
}
if (job->ret) {
@@ -704,7 +709,7 @@ static int job_finalize_single(Job *job)
/* Emit events only if we actually started */
if (job_started(job)) {
- if (job_is_cancelled(job)) {
+ if (job_cancel_requested(job)) {
job_event_cancelled(job);
Same question here. Shouldn't mirror report a COMPLETED event when it
is cancelled without force in the READY state?
Same here, I thought this is user-visible, nothing internal, so I should
leave it as-is.
Now I see that cancelling mirror post-READY indeed should result in a
COMPLETED event. So I’m actually not exactly sure how mirror does that,
despite this code here (which functionally isn’t changed by this patch),
but it’s absolutely true that job_is_cancelled() would be more
appropriate here.
(No iotest failed, so I thought this change was right. Well.)
} else {
job_event_completed(job);
@@ -1015,7 +1020,7 @@ void job_complete(Job *job, Error **errp)
if (job_apply_verb(job, JOB_VERB_COMPLETE, errp)) {
return;
}
- if (job_is_cancelled(job) || !job->driver->complete) {
+ if (job_cancel_requested(job) || !job->driver->complete) {
error_setg(errp, "The active block job '%s' cannot be completed",
job->id);
return;
@@ -1043,7 +1048,7 @@ int job_finish_sync(Job *job, void (*finish)(Job *, Error **errp), Error **errp)
AIO_WAIT_WHILE(job->aio_context,
(job_enter(job), !job_is_completed(job)));
- ret = (job_is_cancelled(job) && job->ret == 0) ? -ECANCELED : job->ret;
+ ret = (job_cancel_requested(job) && job->ret == 0) ? -ECANCELED : job->ret;
So here it should probably stay a job_is_cancelled(), too.
Max
job_unref(job);
return ret;
}
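
To summarize where the review comments above seem to point, here is a
minimal, self-contained sketch. It is not the QEMU code: SketchJob,
sketch_final_ret() and sketch_may_complete() are hypothetical names made
up for illustration, and only the two predicates mirror the hunk in
job.c. It assumes the follow-up keeps job_is_cancelled() for the return
value (as suggested for job_update_rc() and job_finish_sync()) and keeps
job_cancel_requested() in job_complete(). The driver->complete check is
left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the Job fields this patch reads. */
typedef struct SketchJob {
    bool cancelled;     /* a cancel was requested (soft or forced) */
    bool force_cancel;  /* the request means immediate termination */
    int ret;            /* 0 on success, negative errno on failure */
} SketchJob;

/* As in the patch: true only for force-cancelled jobs. */
static bool job_is_cancelled(const SketchJob *job)
{
    return job->cancelled && job->force_cancel;
}

/* As in the patch: true for any cancel request, soft or forced. */
static bool job_cancel_requested(const SketchJob *job)
{
    return job->cancelled;
}

/*
 * Assumption drawn from the review: only a force-cancelled job that has
 * no error of its own reports -ECANCELED; a soft-cancelled mirror job
 * keeps whatever return value it produced.
 */
static int sketch_final_ret(const SketchJob *job)
{
    return (job_is_cancelled(job) && job->ret == 0) ? -ECANCELED : job->ret;
}

/*
 * Assumption: completing stays forbidden once any cancel was requested,
 * which is what the job_complete() hunk does.
 */
static bool sketch_may_complete(const SketchJob *job)
{
    return !job_cancel_requested(job);
}
```

With these definitions, a mirror job that was cancelled with force=false
in READY (cancelled set, force_cancel clear) ends with its own return
value, while a force-cancelled job ends with -ECANCELED. Both refuse a
later job-complete.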