On Wed, May 10, 2023 at 10:36:00PM +0200, Kevin Wolf wrote: > > When jobs are sleeping, for example to enforce a given rate limit, they > can be reentered early, in particular in order to get paused, to update > the rate limit or to get cancelled. > > Before this patch, they behave in this case as if they had fully > completed their rate limiting delay. This means that requests are sped > up beyond their limit, violating the constraints that the user gave us.
Aha - our tests ARE non-deterministic! Good find. > > Change the block jobs to sleep in a loop until the necessary delay is > completed, while still allowing cancelling them immediately as well > pausing (handled by the pause point in job_sleep_ns()) and updating the > rate limit. > > This change is also motivated by iotests cases being prone to fail > because drain operations pause and unpause them so often that block jobs > complete earlier than they are supposed to. In particular, the next > commit would fail iotests 030 without this change. > > Signed-off-by: Kevin Wolf <kw...@redhat.com> > --- > include/block/blockjob_int.h | 14 ++++++++++---- > block/commit.c | 7 ++----- > block/mirror.c | 23 ++++++++++------------- > block/stream.c | 7 ++----- > blockjob.c | 22 ++++++++++++++++++++-- > 5 files changed, 44 insertions(+), 29 deletions(-) > > +++ b/blockjob.c > @@ -319,10 +319,28 @@ static bool block_job_set_speed(BlockJob *job, int64_t > speed, Error **errp) > return block_job_set_speed_locked(job, speed, errp); > } > > -int64_t block_job_ratelimit_get_delay(BlockJob *job, uint64_t n) > +void block_job_ratelimit_processed_bytes(BlockJob *job, uint64_t n) > { > IO_CODE(); > - return ratelimit_calculate_delay(&job->limit, n); > + ratelimit_calculate_delay(&job->limit, n); > +} > + > +void block_job_ratelimit_sleep(BlockJob *job) > +{ > + uint64_t delay_ns; > + > + /* > + * Sleep at least once. If the job is reentered early, keep waiting until > + * we've waited for the full time that is necessary to keep the job at > the > + * right speed. > + * > + * Make sure to recalculate the delay after each (possibly interrupted) > + * sleep because the speed can change while the job has yielded. > + */ > + do { > + delay_ns = ratelimit_calculate_delay(&job->limit, 0); > + job_sleep_ns(&job->job, delay_ns); > + } while (delay_ns && !job_is_cancelled(&job->job)); > } Yes, that looks more robust. Reviewed-by: Eric Blake <ebl...@redhat.com> -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org