On Tue, Feb 08, 2022 at 09:35:01AM -0500, Emanuele Giuseppe Esposito wrote:
> diff --git a/block/replication.c b/block/replication.c
> index 55c8f894aa..a03b28726e 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -149,7 +149,9 @@ static void replication_close(BlockDriverState *bs)
>      if (s->stage == BLOCK_REPLICATION_FAILOVER) {
>          commit_job = &s->commit_job->job;
> +        assert(commit_job->aio_context == qemu_get_current_aio_context());
Is it safe to access commit_job->aio_context outside job_mutex?

> @@ -1838,7 +1840,9 @@ static void drive_backup_abort(BlkActionState *common)
>          aio_context = bdrv_get_aio_context(state->bs);
>          aio_context_acquire(aio_context);
>
> -        job_cancel_sync(&state->job->job, true);
> +        WITH_JOB_LOCK_GUARD() {
> +            job_cancel_sync(&state->job->job, true);
> +        }

Maybe job_cancel_sync() should take the lock internally, since all callers
in this patch seem to need the lock?

I noticed this patch does not add WITH_JOB_LOCK_GUARD() to
tests/unit/test-blockjob.c:cancel_common(). Was that an oversight, or is
there a reason why job_mutex is not needed around the job_cancel_sync()
call there?

> @@ -252,7 +258,13 @@ int block_job_add_bdrv(BlockJob *job, const char *name, BlockDriverState *bs,
>
>  static void block_job_on_idle(Notifier *n, void *opaque)
>  {
> +    /*
> +     * we can't kick with job_mutex held, but we also want
> +     * to protect the notifier list.
> +     */
> +    job_unlock();
>      aio_wait_kick();
> +    job_lock();

I don't understand this. aio_wait_kick() looks safe to call with a mutex
held?

> @@ -292,7 +304,9 @@ bool block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
>      job->speed = speed;
>
>      if (drv->set_speed) {
> +        job_unlock();
>          drv->set_speed(job, speed);
> +        job_lock();

What guarantees that job stays alive during drv->set_speed(job)? We don't
hold a ref here. Maybe the assumption is that block_job_set_speed() only
gets called from the main loop thread and nothing else will modify the
jobs list while we're in drv->set_speed()?

> @@ -545,10 +566,15 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockdevOnError on_err,
>                           action);
>      }
>      if (action == BLOCK_ERROR_ACTION_STOP) {
> -        if (!job->job.user_paused) {
> -            job_pause(&job->job);
> -            /* make the pause user visible, which will be resumed from QMP.
> -             */
> -            job->job.user_paused = true;
> +        WITH_JOB_LOCK_GUARD() {
> +            if (!job->job.user_paused) {
> +                job_pause(&job->job);
> +                /*
> +                 * make the pause user visible, which will be
> +                 * resumed from QMP.
> +                 */
> +                job->job.user_paused = true;
> +            }
>          }
>          block_job_iostatus_set_err(job, error);

Does this need the lock? If not, why is block_job_iostatus_reset() called
with the lock held?