Re: [PATCH] ratelimit: restrict the delay time to a non-negative value

2022-09-23 Thread Markus Armbruster
Alberto Garcia  writes:

> On Wed 21 Sep 2022 09:47:32 AM +08, Wang Liang wrote:
>>> > -return limit->slice_end_time - now;
>>> > +return MAX(limit->slice_end_time - now, 0);
>>> 
>>> How can this be negative? slice_end_time is guaranteed to be larger
>>> than now:
>>> 
>>>     if (limit->slice_end_time < now) {
>>>         /* Previous, possibly extended, time slice finished; reset the
>>>          * accounting. */
>>>         limit->slice_start_time = now;
>>>         limit->slice_end_time = now + limit->slice_ns;
>>>         limit->dispatched = 0;
>>>     }
>>> 
>> This is just a guarantee. 
>>
>> If slice_end_time is assigned later by
>> limit->slice_end_time = limit->slice_start_time +
>> (uint64_t)(delay_slices * limit->slice_ns);
>> There may be precision issues at that time.
>
> Ok, on a closer look, if at the start of the function
>
>limit->slice_start_time < now, and
>limit->slice_end_time >= now
>
> it seems that in principle what you say can happen.

How?  Let's see.

static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
{
    int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);

What kind of clock is QEMU_CLOCK_REALTIME?  See below.

    double delay_slices;

    QEMU_LOCK_GUARD(&limit->lock);
    if (!limit->slice_quota) {
        /* Throttling disabled.  */
        return 0;
    }
    assert(limit->slice_ns);

    if (limit->slice_end_time < now) {

In the case described above, this is false.

        /* Previous, possibly extended, time slice finished; reset the
         * accounting. */
        limit->slice_start_time = now;
        limit->slice_end_time = now + limit->slice_ns;
        limit->dispatched = 0;
    }

    limit->dispatched += n;

This is in theory vulnerable to wrap-around.

    if (limit->dispatched < limit->slice_quota) {

This must be false (or else we return 0, which isn't negative).

        /* We may send further data within the current time slice, no
         * need to delay the next request. */
        return 0;
    }

    /* Quota exceeded. Wait based on the excess amount and then start a new
     * slice. */
    delay_slices = (double)limit->dispatched / limit->slice_quota;

Both @dispatched and @slice_quota are uint64_t.  Conversion to double
may lose precision, but can't change the sign.  Therefore,
@delay_slices is non-negative.

    limit->slice_end_time = limit->slice_start_time +
        (uint64_t)(delay_slices * limit->slice_ns);

Conversion from double to uint64_t has undefined behavior when the value
is not representable after truncation towards zero.  So, if the
multiplication's result truncated towards zero exceeds UINT64_MAX, we're
theoretically toast.

To return a negative value, @slice_end_time must become less than @now
here.

    return limit->slice_end_time - now;
}
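
The undefined behavior noted for the double to uint64_t conversion could
be avoided with a range check before the cast.  The following is only a
sketch: double_to_u64_checked() is a hypothetical helper, not something
that exists in QEMU, and how a caller should react to a failed
conversion is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of QEMU: store @d converted to uint64_t
 * in *@out only when the value truncated towards zero is representable.
 * Returns false (and leaves *@out alone) otherwise.
 */
static bool double_to_u64_checked(double d, uint64_t *out)
{
    /* 2^64 is exactly representable as a double. */
    const double two_pow_64 = 18446744073709551616.0;

    assert(out);

    /*
     * Values in (-1, 2^64) truncate to [0, UINT64_MAX].  Writing the
     * test in negated form also rejects NaN, since every comparison
     * involving NaN is false.
     */
    if (!(d > -1.0 && d < two_pow_64)) {
        return false;
    }

    *out = (uint64_t)d;
    return true;
}
```

The upper bound is compared against 2^64 rather than (double)UINT64_MAX
on purpose: UINT64_MAX is not exactly representable as a double and
rounds up to 2^64 under the default rounding mode, so a "<=" test
against it would let an out-of-range value through.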

This is how far I get without (laboriously!) reconstructing what the
members of struct RateLimit actually mean, and what its invariants are,
if any.  We could write down such things in comments, but we prefer to
keep things fresh and spicy, and developers confused.

Can you elaborate on the "precision issues"?
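
One way to probe the question is a standalone model of the function.
The sketch below is not QEMU code: Model and model_calculate_delay()
are hypothetical names, the lock is dropped, and @now is passed in
instead of being read from a clock.  It assumes that slice_quota can be
changed while a slice is still open; whether that is what happened in
the reported case is not established anywhere in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for RateLimit; the lock is omitted. */
typedef struct {
    int64_t slice_start_time;
    int64_t slice_end_time;
    uint64_t slice_quota;
    uint64_t slice_ns;
    uint64_t dispatched;
} Model;

/*
 * Same steps as the function quoted above, with @now supplied by the
 * caller so that a scenario can be replayed deterministically.
 */
static int64_t model_calculate_delay(Model *m, int64_t now, uint64_t n)
{
    double delay_slices;

    if (!m->slice_quota) {
        /* Throttling disabled. */
        return 0;
    }
    assert(m->slice_ns);

    if (m->slice_end_time < now) {
        /* Previous slice finished; reset the accounting. */
        m->slice_start_time = now;
        m->slice_end_time = now + m->slice_ns;
        m->dispatched = 0;
    }

    m->dispatched += n;
    if (m->dispatched < m->slice_quota) {
        return 0;
    }

    /* Quota exceeded: recompute the end of the (extended) slice. */
    delay_slices = (double)m->dispatched / m->slice_quota;
    m->slice_end_time = m->slice_start_time +
        (uint64_t)(delay_slices * m->slice_ns);
    return m->slice_end_time - now;
}
```

With slice_ns = 100 and slice_quota = 10, a call at now = 1000 with
n = 50 extends the slice to 1500 and returns 500.  If slice_quota is
then raised to 100 and the next call arrives at now = 1400 with n = 50,
the slice is still open, delay_slices becomes 1.0, slice_end_time is
recomputed as 1100, and the function returns -300.  In this model the
negative value comes from the quota changing under an open slice, not
from floating-point precision.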

> If it's so, it would be good to know under what conditions this happens,
> because this hasn't changed in years.
>
> Berto




Re: [PATCH] ratelimit: restrict the delay time to a non-negative value

2022-09-21 Thread Alberto Garcia
On Wed 21 Sep 2022 09:47:32 AM +08, Wang Liang wrote:
>> > -return limit->slice_end_time - now;
>> > +return MAX(limit->slice_end_time - now, 0);
>> 
>> How can this be negative? slice_end_time is guaranteed to be larger
>> than now:
>> 
>>     if (limit->slice_end_time < now) {
>>         /* Previous, possibly extended, time slice finished; reset the
>>          * accounting. */
>>         limit->slice_start_time = now;
>>         limit->slice_end_time = now + limit->slice_ns;
>>         limit->dispatched = 0;
>>     }
>> 
> This is just a guarantee. 
>
> If slice_end_time is assigned later by
> limit->slice_end_time = limit->slice_start_time +
> (uint64_t)(delay_slices * limit->slice_ns);

Ok, on a closer look, if at the start of the function

   limit->slice_start_time < now, and
   limit->slice_end_time >= now

it seems that in principle what you say can happen.

If it's so, it would be good to know under what conditions this happens,
because this hasn't changed in years.

Berto



Re: [PATCH] ratelimit: restrict the delay time to a non-negative value

2022-09-21 Thread Wang Liang
On Wed, 2022-09-21 at 06:53 +0200, Markus Armbruster wrote:
> Wang Liang  writes:
> 
> > On Tue, 2022-09-20 at 13:18 +, Alberto Garcia wrote:
> > > On Tue 20 Sep 2022 08:33:50 PM +08, wanglian...@126.com wrote:
> > > > From: Wang Liang 
> > > > 
> > > > The delay time should never be a negative value.
> > > > 
> > > > -return limit->slice_end_time - now;
> > > > +return MAX(limit->slice_end_time - now, 0);
> > > 
> > > > How can this be negative? slice_end_time is guaranteed to be
> > > > larger than now:
> > > > 
> > > >     if (limit->slice_end_time < now) {
> > > >         /* Previous, possibly extended, time slice finished; reset
> > > >          * the accounting. */
> > > >         limit->slice_start_time = now;
> > > >         limit->slice_end_time = now + limit->slice_ns;
> > > >         limit->dispatched = 0;
> > > >     }
> > > 
> > This is just a guarantee. 
> 
> Smells like an invariant to me.
> 
> > If slice_end_time is assigned later by
> > limit->slice_end_time = limit->slice_start_time +
> > (uint64_t)(delay_slices * limit->slice_ns);
> > There may be precision issues at that time.
> 
> What are the issues exactly?  What misbehavior are you observing?
> 
> Your commit message should show how delay time can become negative,
> and
> why that's bad.

It was observed in a production environment based on qemu v2.12.1.

The block-stream job was delayed for a very long time and made no
progress, because ratelimit_calculate_delay returned a negative value.

Sorry, I don't have an environment to reproduce it in the mainline
version now.
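
The thread does not say how the caller in v2.12.1 consumes the return
value.  As a sketch only, and assuming purely for illustration that the
result ends up in an unsigned 64-bit variable, the following shows why
a small negative delay could look like a very long one, and what the
clamp in the patch changes.  Both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical caller-side view: the value an unsigned 64-bit variable
 * holds after being assigned a (possibly negative) int64_t delay.  The
 * conversion is well defined in C: the value is reduced modulo 2^64.
 */
static uint64_t delay_as_unsigned(int64_t delay_ns)
{
    return (uint64_t)delay_ns;
}

/* Equivalent of MAX(delay_ns, 0) from the proposed patch. */
static int64_t clamp_delay(int64_t delay_ns)
{
    return delay_ns < 0 ? 0 : delay_ns;
}
```

Under that assumption, a delay of -300 ns turns into 2^64 - 300 ns,
which is several centuries, while the clamped value is 0 and the job
proceeds immediately.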





Re: [PATCH] ratelimit: restrict the delay time to a non-negative value

2022-09-20 Thread Markus Armbruster
Wang Liang  writes:

> On Tue, 2022-09-20 at 13:18 +, Alberto Garcia wrote:
>> On Tue 20 Sep 2022 08:33:50 PM +08, wanglian...@126.com wrote:
>> > From: Wang Liang 
>> > 
>> > The delay time should never be a negative value.
>> > 
>> > -return limit->slice_end_time - now;
>> > +return MAX(limit->slice_end_time - now, 0);
>> 
>> How can this be negative? slice_end_time is guaranteed to be larger
>> than now:
>> 
>>     if (limit->slice_end_time < now) {
>>         /* Previous, possibly extended, time slice finished; reset the
>>          * accounting. */
>>         limit->slice_start_time = now;
>>         limit->slice_end_time = now + limit->slice_ns;
>>         limit->dispatched = 0;
>>     }
>> 
> This is just a guarantee. 

Smells like an invariant to me.

> If slice_end_time is assigned later by
> limit->slice_end_time = limit->slice_start_time +
> (uint64_t)(delay_slices * limit->slice_ns);
> There may be precision issues at that time.

What are the issues exactly?  What misbehavior are you observing?

Your commit message should show how delay time can become negative, and
why that's bad.




Re: [PATCH] ratelimit: restrict the delay time to a non-negative value

2022-09-20 Thread Wang Liang
On Tue, 2022-09-20 at 13:18 +, Alberto Garcia wrote:
> On Tue 20 Sep 2022 08:33:50 PM +08, wanglian...@126.com wrote:
> > From: Wang Liang 
> > 
> > The delay time should never be a negative value.
> > 
> > -return limit->slice_end_time - now;
> > +return MAX(limit->slice_end_time - now, 0);
> 
> How can this be negative? slice_end_time is guaranteed to be larger
> than now:
> 
>     if (limit->slice_end_time < now) {
>         /* Previous, possibly extended, time slice finished; reset the
>          * accounting. */
>         limit->slice_start_time = now;
>         limit->slice_end_time = now + limit->slice_ns;
>         limit->dispatched = 0;
>     }
> 
This is just a guarantee. 

If slice_end_time is assigned later by
limit->slice_end_time = limit->slice_start_time +
(uint64_t)(delay_slices * limit->slice_ns);
There may be precision issues at that time.

> Berto




Re: [PATCH] ratelimit: restrict the delay time to a non-negative value

2022-09-20 Thread Alberto Garcia
On Tue 20 Sep 2022 08:33:50 PM +08, wanglian...@126.com wrote:
> From: Wang Liang 
>
> The delay time should never be a negative value.
>
> -return limit->slice_end_time - now;
> +return MAX(limit->slice_end_time - now, 0);

How can this be negative? slice_end_time is guaranteed to be larger than
now:

    if (limit->slice_end_time < now) {
        /* Previous, possibly extended, time slice finished; reset the
         * accounting. */
        limit->slice_start_time = now;
        limit->slice_end_time = now + limit->slice_ns;
        limit->dispatched = 0;
    }

Berto



[PATCH] ratelimit: restrict the delay time to a non-negative value

2022-09-20 Thread wangliangzz
From: Wang Liang 

The delay time should never be a negative value.

Signed-off-by: Wang Liang 
---
 include/qemu/ratelimit.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/qemu/ratelimit.h b/include/qemu/ratelimit.h
index 48bf59e857..c8ea855fc1 100644
--- a/include/qemu/ratelimit.h
+++ b/include/qemu/ratelimit.h
@@ -69,7 +69,7 @@ static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
     delay_slices = (double)limit->dispatched / limit->slice_quota;
     limit->slice_end_time = limit->slice_start_time +
         (uint64_t)(delay_slices * limit->slice_ns);
-    return limit->slice_end_time - now;
+    return MAX(limit->slice_end_time - now, 0);
 }
 
 static inline void ratelimit_init(RateLimit *limit)
-- 
2.31.1