On 2018-01-19 21:58, John Snow wrote: > The mirror job makes a semi-inaccurate record of the last time we yielded > by recording the last time we left a "pause", but this doesn't always > correlate to the time we actually last successfully ceded control. > > Record the time we last *exited* a yield centrally. In other words, record > the time we began execution of this job to know how long we have been > selfish for. > > Signed-off-by: John Snow <js...@redhat.com> > --- > block/mirror.c | 8 ++------ > blockjob.c | 2 ++ > include/block/blockjob.h | 5 +++++ > 3 files changed, 9 insertions(+), 6 deletions(-) > > diff --git a/block/mirror.c b/block/mirror.c > index c9badc1203..88f4e8964d 100644 > --- a/block/mirror.c > +++ b/block/mirror.c > @@ -63,7 +63,6 @@ typedef struct MirrorBlockJob { > QSIMPLEQ_HEAD(, MirrorBuffer) buf_free; > int buf_free_count; > > - uint64_t last_pause_ns; > unsigned long *in_flight_bitmap; > int in_flight; > int64_t bytes_in_flight; > @@ -596,8 +595,7 @@ static void mirror_throttle(MirrorBlockJob *s) > { > int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); > > - if (now - s->last_pause_ns > SLICE_TIME) { > - s->last_pause_ns = now; > + if (now - s->common.last_enter_ns > SLICE_TIME) { > block_job_sleep_ns(&s->common, 0); > } else { > block_job_pause_point(&s->common); > @@ -769,7 +767,6 @@ static void coroutine_fn mirror_run(void *opaque) > > mirror_free_init(s); > > - s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); > if (!s->is_none_mode) { > ret = mirror_dirty_init(s); > if (ret < 0 || block_job_is_cancelled(&s->common)) { > @@ -803,7 +800,7 @@ static void coroutine_fn mirror_run(void *opaque) > * We do so every SLICE_TIME nanoseconds, or when there is an error, > * or when the source is clean, whichever comes first. > */ > - delta = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - s->last_pause_ns; > + delta = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - > s->common.last_enter_ns;
The horror. > if (delta < SLICE_TIME && > s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) { > if (s->in_flight >= MAX_IN_FLIGHT || s->buf_free_count == 0 || > @@ -878,7 +875,6 @@ static void coroutine_fn mirror_run(void *opaque) > delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0); > block_job_sleep_ns(&s->common, delay_ns); > } > - s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); > } Hmmm. So you're now relying on block_job_sleep_ns() updating last_enter_ns. But it will only do so if we actually do sleep, which we will not if the block job has been cancelled. Then, last_enter_ns will stay at its current value indefinitely because neither block_job_sleep_ns() nor block_job_pause_point() will yield. OK, so at this point we should leave the mirror loop. There are three points where this is done: (1) If s->ret < 0. OK, let's hope it isn't. (2) If cnt == 0 && should_complete. We'll come back to that. (3) If !s->synced && block_job_is_cancelled(...). So basically now, if the job has not emitted a READY event yet. But if we have emitted the READY, we have to wait for cnt == 0 && should_complete (note that should_complete in turn will only be set if s->in_flight == 0 && cnt == 0). But unless delta < SLICE_TIME, we will never do another mirror_iteration(), so unless we have already started the necessary requests, they will never be started and we will loop forever. So try this on master (I prefixed the QMP lines with in/out depending on the direction -- note that it's important to have the block-job-cancel on the same line the human-monitor-command finishes on so they are executed right after each other!): $ ./qemu-img create -f qcow2 foo.qcow2 64M $ x86_64-softmmu/qemu-system-x86_64 -qmp stdio In: {"QMP": {"version": {"qemu": {"micro": 50, "minor": 11, "major": 2}, "package": " (v2.9.0-632-g4a52d43-dirty)"}, "capabilities": []}} Out: {"execute":"qmp_capabilities"} In: {"return": {}} Out: {"execute":"object-add","arguments": {"qom-type":"throttle-group","id":"tg0","props":{"limits": {"bps-total":16777216}}}} In: {"return": {}} Out: {"execute":"blockdev-add","arguments": {"node-name":"source","driver":"qcow2","file": {"driver":"file","filename":"foo.qcow2"}}} In: {"return": {}} Out: {"execute":"blockdev-add","arguments": {"node-name":"target","driver":"throttle", "throttle-group":"tg0","file": {"driver":"null-co","size":67108864}}} In: {"return": {}} Out: {"execute":"blockdev-mirror","arguments": {"job-id":"mirror","device":"source","target":"target", "sync":"full"}} In: {"return": {}} In: {"timestamp": {"seconds": 1518040566, "microseconds": 658111}, "event": "BLOCK_JOB_READY", "data": {"device": "mirror", "len": 67108864, "offset": 67108864, "speed": 0, "type": "mirror"}} Out: {"execute":"human-monitor-command","arguments": {"command-line":"qemu-io source \"write 0 64M\""} }{"execute":"block-job-cancel","arguments":{"device":"mirror"}} In: wrote 67108864/67108864 bytes at offset 0 In: 64 MiB, 1 ops; 0.0310 sec (2.011 GiB/sec and 32.1823 ops/sec) In: {"return": ""} In: {"return": {}} In: {"timestamp": {"seconds": 1518040578, "microseconds": 626669}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "mirror", "len": 134217728, "offset": 134217728, "speed": 0, "type": "mirror"}} But with your patch, the BLOCK_JOB_COMPLETED never appears (it should take four seconds). > > immediate_exit: > diff --git a/blockjob.c b/blockjob.c > index f5cea84e73..2a9ff66b95 100644 > --- a/blockjob.c > +++ b/blockjob.c > @@ -321,6 +321,7 @@ void block_job_start(BlockJob *job) > job->pause_count--; > job->busy = true; > job->paused = false; > + job->last_enter_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); > bdrv_coroutine_enter(blk_bs(job->blk), job->co); > } > > @@ -786,6 +787,7 @@ static void block_job_do_yield(BlockJob *job, uint64_t ns) > job->busy = false; > block_job_unlock(); > qemu_coroutine_yield(); > + job->last_enter_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); > > /* Set by block_job_enter before re-entering the coroutine. */ > assert(job->busy); > diff --git a/include/block/blockjob.h b/include/block/blockjob.h > index 00403d9482..e965845c94 100644 > --- a/include/block/blockjob.h > +++ b/include/block/blockjob.h > @@ -141,6 +141,11 @@ typedef struct BlockJob { > */ > QEMUTimer sleep_timer; > > + /** > + * Timestamp of the last yield A yield has two timestamps, one pre-yield, one post-yield. Maybe it should be noted that this records the post-yield one because that's not what I'd assume just based on the comment, but OTOH the name of the variable should be enough to clear that up. Max > + */ > + uint64_t last_enter_ns; > + > /** Non-NULL if this job is part of a transaction */ > BlockJobTxn *txn; > QLIST_ENTRY(BlockJob) txn_list; >
signature.asc
Description: OpenPGP digital signature