When block_job_sleep_ns() is called, the coroutine is scheduled for future execution. If we allow the job to be re-entered prior to the scheduled time, we introduce a race condition in which a coroutine can be entered recursively, or even entered after the coroutine is deleted.
The job->busy flag is used by blockjobs when a coroutine is busy executing. The function 'block_job_enter()' obeys the busy flag, and will not enter a coroutine if set. If we sleep a job, we need to leave the busy flag set, so that subsequent calls to block_job_enter() are prevented.

This fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1508708

Also, in block_job_start(), set the relevant job flags (.busy, .paused) before creating the coroutine, not just before executing it.

Signed-off-by: Jeff Cody <jc...@redhat.com>
---
 blockjob.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 3a0c491..e181295 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -291,10 +291,10 @@ void block_job_start(BlockJob *job)
 {
     assert(job && !block_job_started(job) && job->paused &&
            job->driver && job->driver->start);
-    job->co = qemu_coroutine_create(block_job_co_entry, job);
     job->pause_count--;
     job->busy = true;
     job->paused = false;
+    job->co = qemu_coroutine_create(block_job_co_entry, job);
     bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
@@ -797,11 +797,14 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
         return;
     }
 
-    job->busy = false;
+    /* We need to leave job->busy set here, because when we have
+     * put a coroutine to 'sleep', we have scheduled it to run in
+     * the future.  We cannot enter that same coroutine again before
+     * it wakes and runs, otherwise we risk double-entry or entry after
+     * completion. */
     if (!block_job_should_pause(job)) {
         co_aio_sleep_ns(blk_get_aio_context(job->blk), type, ns);
     }
-    job->busy = true;
 
     block_job_pause_point(job);
 }
-- 
2.9.5
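
Illustration only, not part of the patch: the double-entry hazard described in the commit message can be modelled with a small standalone sketch. Everything here (ToyJob, toy_job_enter(), toy_job_sleep(), toy_timer_fire(), toy_run()) is a hypothetical stand-in invented for this note; none of it is QEMU code, and the real scheduling goes through coroutines and AioContext timers rather than a counter. The sketch assumes only what the message states: the enter path refuses to run while the busy flag is set, and a sleeping job already has a wakeup scheduled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for BlockJob; hypothetical, not the QEMU structure. */
typedef struct ToyJob {
    bool busy;            /* plays the role of job->busy */
    bool wakeup_pending;  /* a wakeup has been scheduled for the sleeping job */
    int entries;          /* how many times the job body has been entered */
} ToyJob;

/* Models block_job_enter(): refuses to enter while busy is set. */
static bool toy_job_enter(ToyJob *job)
{
    if (job->busy) {
        return false;
    }
    job->entries++;
    return true;
}

/* Models block_job_sleep_ns(). keep_busy = false mimics the old code,
 * which cleared the flag for the duration of the sleep. */
static void toy_job_sleep(ToyJob *job, bool keep_busy)
{
    if (!keep_busy) {
        job->busy = false;
    }
    job->wakeup_pending = true;
}

/* Models the scheduled wakeup resuming the job. */
static void toy_timer_fire(ToyJob *job)
{
    if (job->wakeup_pending) {
        job->wakeup_pending = false;
        job->busy = true;
        job->entries++;
    }
}

/* One sleep, one external enter attempt during the sleep, then the
 * scheduled wakeup. Returns how many times the job was entered. */
int toy_run(bool keep_busy)
{
    ToyJob job = { .busy = true, .wakeup_pending = false, .entries = 0 };

    toy_job_sleep(&job, keep_busy);
    toy_job_enter(&job);      /* external caller racing with the wakeup */
    toy_timer_fire(&job);

    assert(job.busy);         /* the job is running again after the wakeup */
    return job.entries;
}
```

In this model, toy_run(false) counts two entries for a single sleep, which corresponds to the double entry the patch is guarding against, while toy_run(true) counts exactly one because the enter attempt is rejected while the flag stays set.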