Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Ladislav Michl
On Thu, Jan 10, 2019 at 08:50:14AM +0100, Vincent Guittot wrote:
> On Thu, 10 Jan 2019 at 08:46, Ladislav Michl  wrote:
> >
> > On Wed, Jan 09, 2019 at 11:06:34PM +0100, Rafael J. Wysocki wrote:
> > > On Wed, Jan 9, 2019 at 7:05 PM Vincent Guittot
> > >  wrote:
> > > >
> > > > On Wed, 9 Jan 2019 at 18:26, Ladislav Michl  
> > > > wrote:
> > > > >
> > > > > On Wed, Jan 09, 2019 at 05:32:31PM +0100, Vincent Guittot wrote:
> > > > > > On Wed, 9 Jan 2019 at 17:07, Ladislav Michl  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Wed, Jan 09, 2019 at 03:12:25PM +0100, Vincent Guittot wrote:
> > > > > > > > Please keep all thread list when replying :-)
> > > > > > >
> > > > > > > Ahh, sorry for that...
> > > > > > > [snip]
> > > > > > > > On Wed, 9 Jan 2019 at 14:33, Ladislav Michl 
> > > > > > > >  wrote:
> > > > > > > > > I agree, but it doea all the magic correctly, so you won't 
> > > > > > > > > need
> > > > > > > > > to touch that code is ktime_t changes (I know, unlikely..)
> > > > > > > >
> > > > > > > > But the current implementation doesn't care of any changes in 
> > > > > > > > ktime_t
> > > > > > > > as it uses raw ns
> > > > > > >
> > > > > > > Fair enough, so let's go back to:
> > > > > >
> > > > > > This one looks good for me
> > > > >
> > > > > Lets split is for 'comments fix' patch, which was just send and 'the 
> > > > > rest'.
> > > > > Now, 'the rest' can either be v2 of your "PM/runtime: Fix 
> > > > > autosuspend_delay on
> > > > > 32bits arch" or will wait for 5.1. You decide :)
> > > >
> > > > I don't really mind.
> > > >
> > > > Rafael,
> > > > Do you prefer to only take the fix for u64 casting problem or do you
> > > > prefer to also take the optimization below in one single patch ?
> > >
> > > The casting problem is a bug and the optimization is next release
> > > material in my view.  Please send the optimization separately.
> >
> > Ok, will send that separately in a few moments...
> > Btw, u64 casting problem seems to be still present in rpm_suspend while
> > computing slack for autosuspend delay greater than 25% of 2^31 (2^33) ns.
> 
> Good point. I will add it to the fix as well

Another nit then, for (u64)(autosuspend_delay) is (u64)autosuspend_delay
enough :)

I'll wait for your v2 before sending optimization patch.

> > Not sure if that triggers any real bug.


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Vincent Guittot
On Thu, 10 Jan 2019 at 08:46, Ladislav Michl  wrote:
>
> On Wed, Jan 09, 2019 at 11:06:34PM +0100, Rafael J. Wysocki wrote:
> > On Wed, Jan 9, 2019 at 7:05 PM Vincent Guittot
> >  wrote:
> > >
> > > On Wed, 9 Jan 2019 at 18:26, Ladislav Michl  wrote:
> > > >
> > > > On Wed, Jan 09, 2019 at 05:32:31PM +0100, Vincent Guittot wrote:
> > > > > On Wed, 9 Jan 2019 at 17:07, Ladislav Michl  
> > > > > wrote:
> > > > > >
> > > > > > On Wed, Jan 09, 2019 at 03:12:25PM +0100, Vincent Guittot wrote:
> > > > > > > Please keep all thread list when replying :-)
> > > > > >
> > > > > > Ahh, sorry for that...
> > > > > > [snip]
> > > > > > > On Wed, 9 Jan 2019 at 14:33, Ladislav Michl 
> > > > > > >  wrote:
> > > > > > > > I agree, but it doea all the magic correctly, so you won't need
> > > > > > > > to touch that code is ktime_t changes (I know, unlikely..)
> > > > > > >
> > > > > > > But the current implementation doesn't care of any changes in 
> > > > > > > ktime_t
> > > > > > > as it uses raw ns
> > > > > >
> > > > > > Fair enough, so let's go back to:
> > > > >
> > > > > This one looks good for me
> > > >
> > > > Lets split is for 'comments fix' patch, which was just send and 'the 
> > > > rest'.
> > > > Now, 'the rest' can either be v2 of your "PM/runtime: Fix 
> > > > autosuspend_delay on
> > > > 32bits arch" or will wait for 5.1. You decide :)
> > >
> > > I don't really mind.
> > >
> > > Rafael,
> > > Do you prefer to only take the fix for u64 casting problem or do you
> > > prefer to also take the optimization below in one single patch ?
> >
> > The casting problem is a bug and the optimization is next release
> > material in my view.  Please send the optimization separately.
>
> Ok, will send that separately in a few moments...
> Btw, u64 casting problem seems to be still present in rpm_suspend while
> computing slack for autosuspend delay greater than 25% of 2^31 (2^33) ns.

Good point. I will add it to the fix as well

> Not sure if that triggers any real bug.


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Ladislav Michl
On Wed, Jan 09, 2019 at 11:06:34PM +0100, Rafael J. Wysocki wrote:
> On Wed, Jan 9, 2019 at 7:05 PM Vincent Guittot
>  wrote:
> >
> > On Wed, 9 Jan 2019 at 18:26, Ladislav Michl  wrote:
> > >
> > > On Wed, Jan 09, 2019 at 05:32:31PM +0100, Vincent Guittot wrote:
> > > > On Wed, 9 Jan 2019 at 17:07, Ladislav Michl  
> > > > wrote:
> > > > >
> > > > > On Wed, Jan 09, 2019 at 03:12:25PM +0100, Vincent Guittot wrote:
> > > > > > Please keep all thread list when replying :-)
> > > > >
> > > > > Ahh, sorry for that...
> > > > > [snip]
> > > > > > On Wed, 9 Jan 2019 at 14:33, Ladislav Michl  
> > > > > > wrote:
> > > > > > > I agree, but it doea all the magic correctly, so you won't need
> > > > > > > to touch that code is ktime_t changes (I know, unlikely..)
> > > > > >
> > > > > > But the current implementation doesn't care of any changes in 
> > > > > > ktime_t
> > > > > > as it uses raw ns
> > > > >
> > > > > Fair enough, so let's go back to:
> > > >
> > > > This one looks good for me
> > >
> > > Lets split is for 'comments fix' patch, which was just send and 'the 
> > > rest'.
> > > Now, 'the rest' can either be v2 of your "PM/runtime: Fix 
> > > autosuspend_delay on
> > > 32bits arch" or will wait for 5.1. You decide :)
> >
> > I don't really mind.
> >
> > Rafael,
> > Do you prefer to only take the fix for u64 casting problem or do you
> > prefer to also take the optimization below in one single patch ?
> 
> The casting problem is a bug and the optimization is next release
> material in my view.  Please send the optimization separately.

Ok, will send that separately in a few moments...
Btw, u64 casting problem seems to be still present in rpm_suspend while
computing slack for autosuspend delay greater than 25% of 2^31 (2^33) ns.
Not sure if that triggers any real bug.


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Rafael J. Wysocki
On Wed, Jan 9, 2019 at 7:05 PM Vincent Guittot
 wrote:
>
> On Wed, 9 Jan 2019 at 18:26, Ladislav Michl  wrote:
> >
> > On Wed, Jan 09, 2019 at 05:32:31PM +0100, Vincent Guittot wrote:
> > > On Wed, 9 Jan 2019 at 17:07, Ladislav Michl  wrote:
> > > >
> > > > On Wed, Jan 09, 2019 at 03:12:25PM +0100, Vincent Guittot wrote:
> > > > > Please keep all thread list when replying :-)
> > > >
> > > > Ahh, sorry for that...
> > > > [snip]
> > > > > On Wed, 9 Jan 2019 at 14:33, Ladislav Michl  
> > > > > wrote:
> > > > > > I agree, but it doea all the magic correctly, so you won't need
> > > > > > to touch that code is ktime_t changes (I know, unlikely..)
> > > > >
> > > > > But the current implementation doesn't care of any changes in ktime_t
> > > > > as it uses raw ns
> > > >
> > > > Fair enough, so let's go back to:
> > >
> > > This one looks good for me
> >
> > Lets split is for 'comments fix' patch, which was just send and 'the rest'.
> > Now, 'the rest' can either be v2 of your "PM/runtime: Fix autosuspend_delay 
> > on
> > 32bits arch" or will wait for 5.1. You decide :)
>
> I don't really mind.
>
> Rafael,
> Do you prefer to only take the fix for u64 casting problem or do you
> prefer to also take the optimization below in one single patch ?

The casting problem is a bug and the optimization is next release
material in my view.  Please send the optimization separately.


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Vincent Guittot
On Wed, 9 Jan 2019 at 18:26, Ladislav Michl  wrote:
>
> On Wed, Jan 09, 2019 at 05:32:31PM +0100, Vincent Guittot wrote:
> > On Wed, 9 Jan 2019 at 17:07, Ladislav Michl  wrote:
> > >
> > > On Wed, Jan 09, 2019 at 03:12:25PM +0100, Vincent Guittot wrote:
> > > > Please keep all thread list when replying :-)
> > >
> > > Ahh, sorry for that...
> > > [snip]
> > > > On Wed, 9 Jan 2019 at 14:33, Ladislav Michl  
> > > > wrote:
> > > > > I agree, but it doea all the magic correctly, so you won't need
> > > > > to touch that code is ktime_t changes (I know, unlikely..)
> > > >
> > > > But the current implementation doesn't care of any changes in ktime_t
> > > > as it uses raw ns
> > >
> > > Fair enough, so let's go back to:
> >
> > This one looks good for me
>
> Lets split is for 'comments fix' patch, which was just send and 'the rest'.
> Now, 'the rest' can either be v2 of your "PM/runtime: Fix autosuspend_delay on
> 32bits arch" or will wait for 5.1. You decide :)

I don't really mind.

Rafael,
Do you prefer to only take the fix for u64 casting problem or do you
prefer to also take the optimization below in one single patch ?


>
> > > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> > > index 70624695b6d5..e61ee9b47a26 100644
> > > --- a/drivers/base/power/runtime.c
> > > +++ b/drivers/base/power/runtime.c
> > > @@ -121,7 +121,7 @@ static void pm_runtime_cancel_pending(struct device 
> > > *dev)
> > >   * Compute the autosuspend-delay expiration time based on the device's
> > >   * power.last_busy time.  If the delay has already expired or is disabled
> > >   * (negative) or the power.use_autosuspend flag isn't set, return 0.
> > > - * Otherwise return the expiration time in jiffies (adjusted to be 
> > > nonzero).
> > > + * Otherwise return the expiration time in nanoseconds (adjusted to be 
> > > nonzero).
> > >   *
> > >   * This function may be called either with or without dev->power.lock 
> > > held.
> > >   * Either way it can be racy, since power.last_busy may be updated at 
> > > any time.
> > > @@ -129,24 +129,21 @@ static void pm_runtime_cancel_pending(struct device 
> > > *dev)
> > >  u64 pm_runtime_autosuspend_expiration(struct device *dev)
> > >  {
> > > int autosuspend_delay;
> > > -   u64 last_busy, expires = 0;
> > > -   u64 now = ktime_to_ns(ktime_get());
> > > +   u64 expires;
> > >
> > > if (!dev->power.use_autosuspend)
> > > -   goto out;
> > > +   return 0;
> > >
> > > autosuspend_delay = READ_ONCE(dev->power.autosuspend_delay);
> > > if (autosuspend_delay < 0)
> > > -   goto out;
> > > -
> > > -   last_busy = READ_ONCE(dev->power.last_busy);
> > > +   return 0;
> > >
> > > -   expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
> > > -   if (expires <= now)
> > > -   expires = 0;/* Already expired. */
> > > +   expires  = READ_ONCE(dev->power.last_busy);
> > > +   expires += NSEC_PER_MSEC * (u64)autosuspend_delay;
> > > +   if (expires > ktime_to_ns(ktime_get()))
> > > +   return expires; /* Expires in the future */
> > >
> > > - out:
> > > -   return expires;
> > > +   return 0;
> > >  }
> > >  EXPORT_SYMBOL_GPL(pm_runtime_autosuspend_expiration);
> > >
> > > Which gives for arm:
> > > 0480 :
> > >  480:   e92d41f0push{r4, r5, r6, r7, r8, lr}
> > >  484:   e5d030d1ldrbr3, [r0, #209]  ; 0xd1
> > >  488:   e3130004tst r3, #4
> > >  48c:   0a0cbeq 4c4 
> > > 
> > >  490:   e59030e4ldr r3, [r0, #228]  ; 0xe4
> > >  494:   e353cmp r3, #0
> > >  498:   ba09blt 4c4 
> > > 
> > >  49c:   e28050e8add r5, r0, #232; 0xe8
> > >  4a0:   e8950030ldm r5, {r4, r5}
> > >  4a4:   e59f2030ldr r2, [pc, #48]   ; 4dc 
> > > 
> > >  4a8:   e0e54392smlal   r4, r5, r2, r3
> > >  4ac:   ebfebl  0 
> > >  4b0:   e1550001cmp r5, r1
> > >  4b4:   0154cmpeq   r4, r0
> > >  4b8:   e1a06004mov r6, r4
> > >  4bc:   e1a07005mov r7, r5
> > >  4c0:   8a01bhi 4cc 
> > > 
> > >  4c4:   e3a06000mov r6, #0
> > >  4c8:   e3a07000mov r7, #0
> > >  4cc:   e1a6mov r0, r6
> > >  4d0:   e1a01007mov r1, r7
> > >  4d4:   e8bd41f0pop {r4, r5, r6, r7, r8, lr}
> > >  4d8:   e12fff1ebx  lr
> > >  4dc:   000f4240.word   0x000f4240
> > >
> > > And x86:
> > > 0230 :
> > >  230:   8b 87 a4 01 00 00   mov0x1a4(%rdi),%eax
> > >  236:   53  push   %rbx
> > >  237:   85 c0   test   %eax,%eax
> 

Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Ladislav Michl
On Wed, Jan 09, 2019 at 05:32:31PM +0100, Vincent Guittot wrote:
> On Wed, 9 Jan 2019 at 17:07, Ladislav Michl  wrote:
> >
> > On Wed, Jan 09, 2019 at 03:12:25PM +0100, Vincent Guittot wrote:
> > > Please keep all thread list when replying :-)
> >
> > Ahh, sorry for that...
> > [snip]
> > > On Wed, 9 Jan 2019 at 14:33, Ladislav Michl  wrote:
> > > > I agree, but it doea all the magic correctly, so you won't need
> > > > to touch that code is ktime_t changes (I know, unlikely..)
> > >
> > > But the current implementation doesn't care of any changes in ktime_t
> > > as it uses raw ns
> >
> > Fair enough, so let's go back to:
> 
> This one looks good for me

Lets split is for 'comments fix' patch, which was just send and 'the rest'.
Now, 'the rest' can either be v2 of your "PM/runtime: Fix autosuspend_delay on
32bits arch" or will wait for 5.1. You decide :)

> > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> > index 70624695b6d5..e61ee9b47a26 100644
> > --- a/drivers/base/power/runtime.c
> > +++ b/drivers/base/power/runtime.c
> > @@ -121,7 +121,7 @@ static void pm_runtime_cancel_pending(struct device 
> > *dev)
> >   * Compute the autosuspend-delay expiration time based on the device's
> >   * power.last_busy time.  If the delay has already expired or is disabled
> >   * (negative) or the power.use_autosuspend flag isn't set, return 0.
> > - * Otherwise return the expiration time in jiffies (adjusted to be 
> > nonzero).
> > + * Otherwise return the expiration time in nanoseconds (adjusted to be 
> > nonzero).
> >   *
> >   * This function may be called either with or without dev->power.lock held.
> >   * Either way it can be racy, since power.last_busy may be updated at any 
> > time.
> > @@ -129,24 +129,21 @@ static void pm_runtime_cancel_pending(struct device 
> > *dev)
> >  u64 pm_runtime_autosuspend_expiration(struct device *dev)
> >  {
> > int autosuspend_delay;
> > -   u64 last_busy, expires = 0;
> > -   u64 now = ktime_to_ns(ktime_get());
> > +   u64 expires;
> >
> > if (!dev->power.use_autosuspend)
> > -   goto out;
> > +   return 0;
> >
> > autosuspend_delay = READ_ONCE(dev->power.autosuspend_delay);
> > if (autosuspend_delay < 0)
> > -   goto out;
> > -
> > -   last_busy = READ_ONCE(dev->power.last_busy);
> > +   return 0;
> >
> > -   expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
> > -   if (expires <= now)
> > -   expires = 0;/* Already expired. */
> > +   expires  = READ_ONCE(dev->power.last_busy);
> > +   expires += NSEC_PER_MSEC * (u64)autosuspend_delay;
> > +   if (expires > ktime_to_ns(ktime_get()))
> > +   return expires; /* Expires in the future */
> >
> > - out:
> > -   return expires;
> > +   return 0;
> >  }
> >  EXPORT_SYMBOL_GPL(pm_runtime_autosuspend_expiration);
> >
> > Which gives for arm:
> > 0480 :
> >  480:   e92d41f0push{r4, r5, r6, r7, r8, lr}
> >  484:   e5d030d1ldrbr3, [r0, #209]  ; 0xd1
> >  488:   e3130004tst r3, #4
> >  48c:   0a0cbeq 4c4 
> > 
> >  490:   e59030e4ldr r3, [r0, #228]  ; 0xe4
> >  494:   e353cmp r3, #0
> >  498:   ba09blt 4c4 
> > 
> >  49c:   e28050e8add r5, r0, #232; 0xe8
> >  4a0:   e8950030ldm r5, {r4, r5}
> >  4a4:   e59f2030ldr r2, [pc, #48]   ; 4dc 
> > 
> >  4a8:   e0e54392smlal   r4, r5, r2, r3
> >  4ac:   ebfebl  0 
> >  4b0:   e1550001cmp r5, r1
> >  4b4:   0154cmpeq   r4, r0
> >  4b8:   e1a06004mov r6, r4
> >  4bc:   e1a07005mov r7, r5
> >  4c0:   8a01bhi 4cc 
> > 
> >  4c4:   e3a06000mov r6, #0
> >  4c8:   e3a07000mov r7, #0
> >  4cc:   e1a6mov r0, r6
> >  4d0:   e1a01007mov r1, r7
> >  4d4:   e8bd41f0pop {r4, r5, r6, r7, r8, lr}
> >  4d8:   e12fff1ebx  lr
> >  4dc:   000f4240.word   0x000f4240
> >
> > And x86:
> > 0230 :
> >  230:   8b 87 a4 01 00 00   mov0x1a4(%rdi),%eax
> >  236:   53  push   %rbx
> >  237:   85 c0   test   %eax,%eax
> >  239:   78 1e   js 259 
> > 
> >  23b:   48 63 d8movslq %eax,%rbx
> >  23e:   48 8b 97 a8 01 00 00mov0x1a8(%rdi),%rdx
> >  245:   48 69 db 40 42 0f 00imul   $0xf4240,%rbx,%rbx
> >  24c:   48 01 d3add%rdx,%rbx
> >  24f:   e8 00 00 00 00  callq  254 
> > 
> >  254:   48 39 c3cmp%rax,%rbx
> >  257:

Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Tony Lindgren
* Vincent Guittot  [190109 16:56]:
> On Wed, 9 Jan 2019 at 17:50, Tony Lindgren  wrote:
> >
> > * Vincent Guittot  [190109 16:48]:
> > > On Wed, 9 Jan 2019 at 17:28, Tony Lindgren  wrote:
> > > >
> > > > * Rafael J. Wysocki  [190109 09:44]:
> > > > > On Wed, Jan 9, 2019 at 2:51 AM Tony Lindgren  wrote:
> > > > > >
> > > > > > * Vincent Guittot  [190109 01:42]:
> > > > > > > Le Tuesday 08 Jan 2019 à 13:37:43 (-0800), Tony Lindgren a écrit :
> > > > > > > > Lowering the autosuspend_delay_ms to 2100 ms makes things work 
> > > > > > > > again.
> > > > > > > > Anything higher than 2200 ms seems to somehow time out 
> > > > > > > > immediately
> > > > > > > > now :)
> > > > > > >
> > > > > > > This is quite close to the max ns of an int on arm 32bits
> > > > > > >
> > > > > > > Could you try the patch below ?
> > > > > >
> > > > > > Yup great thanks, that's it:
> > > > > >
> > > > > > Tested-by: Tony Lindgren 
> > > > >
> > > > > Cool.  Thanks for getting to the bottom of this!
> > > >
> > > > No problem.
> > > >
> > > > One more thing I noticed: The 25% slack can get noticeable
> > > > for larger values. For things like a 3 second uart console
> > > > timeout slack of 750 ms is quite large variation.
> > > >
> > > > Should we have a limit of max 100 ms for the slack?
> > >
> > > Keep in mind that when jiffies were used, expires was rounded to a
> > > full second when delay was greater than a second. So you could already
> > > have difference of up 990ms on arm before this patch
> > > And i don't take into account the rework of timer infra which add
> > > another level of variation, something like up to 640 ms more when the
> > > timer is greater than 2880 ms for arm IIRC
> >
> > I think it was rounded up earlier.
> >
> > Don't we get rounded down now also?
> 
> We still round up. In hrtimer we have :
> timer->_softexpires = time;
> timer->node.expires = ktime_add_safe(time, delta);
> so the hrtimer will expire between "time" and "time+delta"

OK thanks for checking it. In that case we should be good to go :)

Tony


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Vincent Guittot
On Wed, 9 Jan 2019 at 17:50, Tony Lindgren  wrote:
>
> * Vincent Guittot  [190109 16:48]:
> > On Wed, 9 Jan 2019 at 17:28, Tony Lindgren  wrote:
> > >
> > > * Rafael J. Wysocki  [190109 09:44]:
> > > > On Wed, Jan 9, 2019 at 2:51 AM Tony Lindgren  wrote:
> > > > >
> > > > > * Vincent Guittot  [190109 01:42]:
> > > > > > Le Tuesday 08 Jan 2019 à 13:37:43 (-0800), Tony Lindgren a écrit :
> > > > > > > Lowering the autosuspend_delay_ms to 2100 ms makes things work 
> > > > > > > again.
> > > > > > > Anything higher than 2200 ms seems to somehow time out immediately
> > > > > > > now :)
> > > > > >
> > > > > > This is quite close to the max ns of an int on arm 32bits
> > > > > >
> > > > > > Could you try the patch below ?
> > > > >
> > > > > Yup great thanks, that's it:
> > > > >
> > > > > Tested-by: Tony Lindgren 
> > > >
> > > > Cool.  Thanks for getting to the bottom of this!
> > >
> > > No problem.
> > >
> > > One more thing I noticed: The 25% slack can get noticeable
> > > for larger values. For things like a 3 second uart console
> > > timeout slack of 750 ms is quite large variation.
> > >
> > > Should we have a limit of max 100 ms for the slack?
> >
> > Keep in mind that when jiffies were used, expires was rounded to a
> > full second when delay was greater than a second. So you could already
> > have difference of up 990ms on arm before this patch
> > And i don't take into account the rework of timer infra which add
> > another level of variation, something like up to 640 ms more when the
> > timer is greater than 2880 ms for arm IIRC
>
> I think it was rounded up earlier.
>
> Don't we get rounded down now also?

We still round up. In hrtimer we have :
timer->_softexpires = time;
timer->node.expires = ktime_add_safe(time, delta);
so the hrtimer will expire between "time" and "time+delta"

>
> Regards,
>
> Tony


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Tony Lindgren
* Vincent Guittot  [190109 16:48]:
> On Wed, 9 Jan 2019 at 17:28, Tony Lindgren  wrote:
> >
> > * Rafael J. Wysocki  [190109 09:44]:
> > > On Wed, Jan 9, 2019 at 2:51 AM Tony Lindgren  wrote:
> > > >
> > > > * Vincent Guittot  [190109 01:42]:
> > > > > Le Tuesday 08 Jan 2019 à 13:37:43 (-0800), Tony Lindgren a écrit :
> > > > > > Lowering the autosuspend_delay_ms to 2100 ms makes things work 
> > > > > > again.
> > > > > > Anything higher than 2200 ms seems to somehow time out immediately
> > > > > > now :)
> > > > >
> > > > > This is quite close to the max ns of an int on arm 32bits
> > > > >
> > > > > Could you try the patch below ?
> > > >
> > > > Yup great thanks, that's it:
> > > >
> > > > Tested-by: Tony Lindgren 
> > >
> > > Cool.  Thanks for getting to the bottom of this!
> >
> > No problem.
> >
> > One more thing I noticed: The 25% slack can get noticeable
> > for larger values. For things like a 3 second uart console
> > timeout slack of 750 ms is quite large variation.
> >
> > Should we have a limit of max 100 ms for the slack?
> 
> Keep in mind that when jiffies were used, expires was rounded to a
> full second when delay was greater than a second. So you could already
> have difference of up 990ms on arm before this patch
> And i don't take into account the rework of timer infra which add
> another level of variation, something like up to 640 ms more when the
> timer is greater than 2880 ms for arm IIRC

I think it was rounded up earlier.

Don't we get rounded down now also?

Regards,

Tony


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Vincent Guittot
On Wed, 9 Jan 2019 at 17:28, Tony Lindgren  wrote:
>
> * Rafael J. Wysocki  [190109 09:44]:
> > On Wed, Jan 9, 2019 at 2:51 AM Tony Lindgren  wrote:
> > >
> > > * Vincent Guittot  [190109 01:42]:
> > > > Le Tuesday 08 Jan 2019 à 13:37:43 (-0800), Tony Lindgren a écrit :
> > > > > Lowering the autosuspend_delay_ms to 2100 ms makes things work again.
> > > > > Anything higher than 2200 ms seems to somehow time out immediately
> > > > > now :)
> > > >
> > > > This is quite close to the max ns of an int on arm 32bits
> > > >
> > > > Could you try the patch below ?
> > >
> > > Yup great thanks, that's it:
> > >
> > > Tested-by: Tony Lindgren 
> >
> > Cool.  Thanks for getting to the bottom of this!
>
> No problem.
>
> One more thing I noticed: The 25% slack can get noticeable
> for larger values. For things like a 3 second uart console
> timeout slack of 750 ms is quite large variation.
>
> Should we have a limit of max 100 ms for the slack?

Keep in mind that when jiffies were used, expires was rounded to a
full second when delay was greater than a second. So you could already
have difference of up 990ms on arm before this patch
And i don't take into account the rework of timer infra which add
another level of variation, something like up to 640 ms more when the
timer is greater than 2880 ms for arm IIRC
>
> Regards,
>
> Tony


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Vincent Guittot
On Wed, 9 Jan 2019 at 17:07, Ladislav Michl  wrote:
>
> On Wed, Jan 09, 2019 at 03:12:25PM +0100, Vincent Guittot wrote:
> > Please keep all thread list when replying :-)
>
> Ahh, sorry for that...
> [snip]
> > On Wed, 9 Jan 2019 at 14:33, Ladislav Michl  wrote:
> > > I agree, but it doea all the magic correctly, so you won't need
> > > to touch that code is ktime_t changes (I know, unlikely..)
> >
> > But the current implementation doesn't care of any changes in ktime_t
> > as it uses raw ns
>
> Fair enough, so let's go back to:

This one looks good for me

> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 70624695b6d5..e61ee9b47a26 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -121,7 +121,7 @@ static void pm_runtime_cancel_pending(struct device *dev)
>   * Compute the autosuspend-delay expiration time based on the device's
>   * power.last_busy time.  If the delay has already expired or is disabled
>   * (negative) or the power.use_autosuspend flag isn't set, return 0.
> - * Otherwise return the expiration time in jiffies (adjusted to be nonzero).
> + * Otherwise return the expiration time in nanoseconds (adjusted to be 
> nonzero).
>   *
>   * This function may be called either with or without dev->power.lock held.
>   * Either way it can be racy, since power.last_busy may be updated at any 
> time.
> @@ -129,24 +129,21 @@ static void pm_runtime_cancel_pending(struct device 
> *dev)
>  u64 pm_runtime_autosuspend_expiration(struct device *dev)
>  {
> int autosuspend_delay;
> -   u64 last_busy, expires = 0;
> -   u64 now = ktime_to_ns(ktime_get());
> +   u64 expires;
>
> if (!dev->power.use_autosuspend)
> -   goto out;
> +   return 0;
>
> autosuspend_delay = READ_ONCE(dev->power.autosuspend_delay);
> if (autosuspend_delay < 0)
> -   goto out;
> -
> -   last_busy = READ_ONCE(dev->power.last_busy);
> +   return 0;
>
> -   expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
> -   if (expires <= now)
> -   expires = 0;/* Already expired. */
> +   expires  = READ_ONCE(dev->power.last_busy);
> +   expires += NSEC_PER_MSEC * (u64)autosuspend_delay;
> +   if (expires > ktime_to_ns(ktime_get()))
> +   return expires; /* Expires in the future */
>
> - out:
> -   return expires;
> +   return 0;
>  }
>  EXPORT_SYMBOL_GPL(pm_runtime_autosuspend_expiration);
>
> Which gives for arm:
> 0480 :
>  480:   e92d41f0push{r4, r5, r6, r7, r8, lr}
>  484:   e5d030d1ldrbr3, [r0, #209]  ; 0xd1
>  488:   e3130004tst r3, #4
>  48c:   0a0cbeq 4c4 
> 
>  490:   e59030e4ldr r3, [r0, #228]  ; 0xe4
>  494:   e353cmp r3, #0
>  498:   ba09blt 4c4 
> 
>  49c:   e28050e8add r5, r0, #232; 0xe8
>  4a0:   e8950030ldm r5, {r4, r5}
>  4a4:   e59f2030ldr r2, [pc, #48]   ; 4dc 
> 
>  4a8:   e0e54392smlal   r4, r5, r2, r3
>  4ac:   ebfebl  0 
>  4b0:   e1550001cmp r5, r1
>  4b4:   0154cmpeq   r4, r0
>  4b8:   e1a06004mov r6, r4
>  4bc:   e1a07005mov r7, r5
>  4c0:   8a01bhi 4cc 
> 
>  4c4:   e3a06000mov r6, #0
>  4c8:   e3a07000mov r7, #0
>  4cc:   e1a6mov r0, r6
>  4d0:   e1a01007mov r1, r7
>  4d4:   e8bd41f0pop {r4, r5, r6, r7, r8, lr}
>  4d8:   e12fff1ebx  lr
>  4dc:   000f4240.word   0x000f4240
>
> And x86:
> 0230 :
>  230:   8b 87 a4 01 00 00   mov0x1a4(%rdi),%eax
>  236:   53  push   %rbx
>  237:   85 c0   test   %eax,%eax
>  239:   78 1e   js 259 
> 
>  23b:   48 63 d8movslq %eax,%rbx
>  23e:   48 8b 97 a8 01 00 00mov0x1a8(%rdi),%rdx
>  245:   48 69 db 40 42 0f 00imul   $0xf4240,%rbx,%rbx
>  24c:   48 01 d3add%rdx,%rbx
>  24f:   e8 00 00 00 00  callq  254 
> 
>  254:   48 39 c3cmp%rax,%rbx
>  257:   77 02   ja 25b 
> 
>  259:   31 db   xor%ebx,%ebx
>  25b:   48 89 d8mov%rbx,%rax
>  25e:   5b  pop%rbx
>  25f:   c3  retq
>
> 02a0 :
>  2a0:   f6 87 91 01 00 00 04testb  $0x4,0x191(%rdi)
>  2a7:   74 02   je 2ab 
> 
>  2a9:   eb 85   jmp230 
> 
>  2ab:   31 c0 

Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Tony Lindgren
* Rafael J. Wysocki  [190109 09:44]:
> On Wed, Jan 9, 2019 at 2:51 AM Tony Lindgren  wrote:
> >
> > * Vincent Guittot  [190109 01:42]:
> > > Le Tuesday 08 Jan 2019 à 13:37:43 (-0800), Tony Lindgren a écrit :
> > > > Lowering the autosuspend_delay_ms to 2100 ms makes things work again.
> > > > Anything higher than 2200 ms seems to somehow time out immediately
> > > > now :)
> > >
> > > This is quite close to the max ns of an int on arm 32bits
> > >
> > > Could you try the patch below ?
> >
> > Yup great thanks, that's it:
> >
> > Tested-by: Tony Lindgren 
> 
> Cool.  Thanks for getting to the bottom of this!

No problem.

One more thing I noticed: The 25% slack can get noticeable
for larger values. For things like a 3 second uart console
timeout slack of 750 ms is quite large variation.

Should we have a limit of max 100 ms for the slack?

Regards,

Tony


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Ladislav Michl
On Wed, Jan 09, 2019 at 03:12:25PM +0100, Vincent Guittot wrote:
> Please keep all thread list when replying :-)

Ahh, sorry for that...
[snip]
> On Wed, 9 Jan 2019 at 14:33, Ladislav Michl  wrote:
> > I agree, but it doea all the magic correctly, so you won't need
> > to touch that code is ktime_t changes (I know, unlikely..)
> 
> But the current implementation doesn't care of any changes in ktime_t
> as it uses raw ns

Fair enough, so let's go back to:
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 70624695b6d5..e61ee9b47a26 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -121,7 +121,7 @@ static void pm_runtime_cancel_pending(struct device *dev)
  * Compute the autosuspend-delay expiration time based on the device's
  * power.last_busy time.  If the delay has already expired or is disabled
  * (negative) or the power.use_autosuspend flag isn't set, return 0.
- * Otherwise return the expiration time in jiffies (adjusted to be nonzero).
+ * Otherwise return the expiration time in nanoseconds (adjusted to be 
nonzero).
  *
  * This function may be called either with or without dev->power.lock held.
  * Either way it can be racy, since power.last_busy may be updated at any time.
@@ -129,24 +129,21 @@ static void pm_runtime_cancel_pending(struct device *dev)
 u64 pm_runtime_autosuspend_expiration(struct device *dev)
 {
int autosuspend_delay;
-   u64 last_busy, expires = 0;
-   u64 now = ktime_to_ns(ktime_get());
+   u64 expires;
 
if (!dev->power.use_autosuspend)
-   goto out;
+   return 0;
 
autosuspend_delay = READ_ONCE(dev->power.autosuspend_delay);
if (autosuspend_delay < 0)
-   goto out;
-
-   last_busy = READ_ONCE(dev->power.last_busy);
+   return 0;
 
-   expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
-   if (expires <= now)
-   expires = 0;/* Already expired. */
+   expires  = READ_ONCE(dev->power.last_busy);
+   expires += NSEC_PER_MSEC * (u64)autosuspend_delay;
+   if (expires > ktime_to_ns(ktime_get()))
+   return expires; /* Expires in the future */
 
- out:
-   return expires;
+   return 0;
 }
 EXPORT_SYMBOL_GPL(pm_runtime_autosuspend_expiration);

Which gives for arm: 
0480 :
 480:   e92d41f0push{r4, r5, r6, r7, r8, lr}
 484:   e5d030d1ldrbr3, [r0, #209]  ; 0xd1
 488:   e3130004tst r3, #4
 48c:   0a0cbeq 4c4 

 490:   e59030e4ldr r3, [r0, #228]  ; 0xe4
 494:   e353cmp r3, #0
 498:   ba09blt 4c4 

 49c:   e28050e8add r5, r0, #232; 0xe8
 4a0:   e8950030ldm r5, {r4, r5}
 4a4:   e59f2030ldr r2, [pc, #48]   ; 4dc 

 4a8:   e0e54392smlal   r4, r5, r2, r3
 4ac:   ebfebl  0 
 4b0:   e1550001cmp r5, r1
 4b4:   0154cmpeq   r4, r0
 4b8:   e1a06004mov r6, r4
 4bc:   e1a07005mov r7, r5
 4c0:   8a01bhi 4cc 

 4c4:   e3a06000mov r6, #0
 4c8:   e3a07000mov r7, #0
 4cc:   e1a6mov r0, r6
 4d0:   e1a01007mov r1, r7
 4d4:   e8bd41f0pop {r4, r5, r6, r7, r8, lr}
 4d8:   e12fff1ebx  lr
 4dc:   000f4240.word   0x000f4240

And x86:
0230 :
 230:   8b 87 a4 01 00 00   mov0x1a4(%rdi),%eax
 236:   53  push   %rbx
 237:   85 c0   test   %eax,%eax
 239:   78 1e   js 259 

 23b:   48 63 d8movslq %eax,%rbx
 23e:   48 8b 97 a8 01 00 00mov0x1a8(%rdi),%rdx
 245:   48 69 db 40 42 0f 00imul   $0xf4240,%rbx,%rbx
 24c:   48 01 d3add%rdx,%rbx
 24f:   e8 00 00 00 00  callq  254 

 254:   48 39 c3cmp%rax,%rbx
 257:   77 02   ja 25b 

 259:   31 db   xor%ebx,%ebx
 25b:   48 89 d8mov%rbx,%rax
 25e:   5b  pop%rbx
 25f:   c3  retq   

02a0 :
 2a0:   f6 87 91 01 00 00 04testb  $0x4,0x191(%rdi)
 2a7:   74 02   je 2ab 

 2a9:   eb 85   jmp230 

 2ab:   31 c0   xor%eax,%eax
 2ad:   c3  retq   
 2ae:   66 90   xchg   %ax,%ax


[snip]
> > Well, main concern was not to call ktime_get at the beginning of function
> > as it is not too cheap.
> 
> Doesn't the compiler optimize that to just before the test ?

N

Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Vincent Guittot
Please keep all thread list when replying :-)

On Wed, 9 Jan 2019 at 14:33, Ladislav Michl  wrote:
>
> On Wed, Jan 09, 2019 at 02:24:37PM +0100, Vincent Guittot wrote:
> > On Wed, 9 Jan 2019 at 12:58, Ladislav Michl  wrote:
> > >
> > > On Wed, Jan 09, 2019 at 12:27:57PM +0100, Vincent Guittot wrote:
> > > > On Wed, 9 Jan 2019 at 12:17, Ladislav Michl  
> > > > wrote:
> > > > >
> > > > > On Wed, Jan 09, 2019 at 02:42:18AM +0100, Vincent Guittot wrote:
> > > > > > Le Tuesday 08 Jan 2019 à 13:37:43 (-0800), Tony Lindgren a écrit :
> > > > > > > * Vincent Guittot  [190108 16:42]:
> > > > > > > > On Tue, 8 Jan 2019 at 16:53, Tony Lindgren  
> > > > > > > > wrote:
> > > > > > > > > Hmm so could it be that we now rely on timers that that may
> > > > > > > > > not be capable of waking up the system from idle states with
> > > > > > > > > hrtimer?
> > > > > > > >
> > > > > > > > With nohz and hrtimer enabled,  timer relies on hrtimer to 
> > > > > > > > generate
> > > > > > > > the tick so you should use the same interrupt.
> > > > > > >
> > > > > > > OK yeah looks like that part is working just fine.
> > > > > > >
> > > > > > > Adding some printks and debugging over ssh, looks like
> > > > > > > omap8250_runtime_resume() gets called just fine based on a 
> > > > > > > wakeirq,
> > > > > > > but then omap8250_runtime_suspend() runs immediately instead of
> > > > > > > waiting for the three second timeout.
> > > > > > >
> > > > > > > Lowering the autosuspend_delay_ms to 2100 ms makes things work 
> > > > > > > again.
> > > > > > > Anything higher than 2200 ms seems to somehow time out immediately
> > > > > > > now :)
> > > > > >
> > > > > > This is quite close to the max ns of an int on arm 32bits
> > > > > >
> > > > > > Could you try the patch below ?
> > > > > >
> > > > > > ---
> > > > > >  drivers/base/power/runtime.c | 2 +-
> > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/base/power/runtime.c 
> > > > > > b/drivers/base/power/runtime.c
> > > > > > index 7062469..44c5c76 100644
> > > > > > --- a/drivers/base/power/runtime.c
> > > > > > +++ b/drivers/base/power/runtime.c
> > > > > > @@ -141,7 +141,7 @@ u64 pm_runtime_autosuspend_expiration(struct 
> > > > > > device *dev)
> > > > > >
> > > > > >   last_busy = READ_ONCE(dev->power.last_busy);
> > > > > >
> > > > > > - expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
> > > > > > + expires = last_busy + (u64)(autosuspend_delay) * 
> > > > > > NSEC_PER_MSEC;
> > > > > >   if (expires <= now)
> > > > > >   expires = 0;/* Already expired. */
> > > > >
> > > > > Hmm, comment above function states it returns "the expiration time in 
> > > > > jiffies
> > > > > (adjusted to be nonzero)", so there's probably more to fix...
> > > >
> > > > The comment is wrong and should be updated as commit 8234f6734c5d has
> > > > moved on hrtimer and expires is now in raw ns unit
> > >
> > > Yup. Who'll send a patch? Is it worth optimizing as bellow? I'm fine with 
> > > doing
> >
> > You can send a patch for updating the comment.
> >
> > > both.
> >
> > Regarding proposal below:
> > - pm_runtime_autosuspend_expiration() returns u64 and not ktime_t
>
> Well, that's matter of adding ktime_to_ns (which is noop).
>
> > - use helper ktime_before/after to compare ktime_t value
> >
> > Using ktime helper function makes the code less readable and the
>
> That why I avoided it.

But you must use helper function if you use ktime_t
That's one main reason for using u64 instead of ktime_t

>
> > proposal make the function more difficult to read IMHO when you
> > compare
> >   expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
> > with
> >   expires = ktime_add_ns(ms_to_ktime(autosuspend_delay),
> >   READ_ONCE(dev->power.last_busy));
>
> I agree, but it doea all the magic correctly, so you won't need
> to touch that code is ktime_t changes (I know, unlikely..)

But the current implementation doesn't care of any changes in ktime_t
as it uses raw ns

>
> > or when you compare
> >  if (expires <= now)
> > with
> >  if (ktime_after(now, expires))
> >
> > And  I'm not sure that this will optimize the code at the end
>
> Checked generated code on ARM and x86 and gcc does pretty good job here.
>
> > Only the replacement of the "goto out" by return 0 would make sense IMO
>
> Well, main concern was not to call ktime_get at the beginning of function
> as it is not too cheap.

Doesn't the compiler optimize that to just before the test ?

>
> > Regards,
> > Vincent
> >
> > >
> > > > > You can also consider change like this (still does not return 
> > > > > jiffies):
> > > > > diff --git a/drivers/base/power/runtime.c 
> > > > > b/drivers/base/power/runtime.c
> > > > > index 70624695b6d5..c72eaf21a61c 100644
> > > > > --- a/drivers/base/power/runtime.c
> > > > > +++ b/drivers/base/power/runtime.c
> > > > > @@ -129,23 +129,20 @@ static void pm_runtime_cancel_pending(struct 
> > > > > dev

Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Vincent Guittot
On Wed, 9 Jan 2019 at 12:58, Ladislav Michl  wrote:
>
> On Wed, Jan 09, 2019 at 12:27:57PM +0100, Vincent Guittot wrote:
> > On Wed, 9 Jan 2019 at 12:17, Ladislav Michl  wrote:
> > >
> > > On Wed, Jan 09, 2019 at 02:42:18AM +0100, Vincent Guittot wrote:
> > > > Le Tuesday 08 Jan 2019 à 13:37:43 (-0800), Tony Lindgren a écrit :
> > > > > * Vincent Guittot  [190108 16:42]:
> > > > > > On Tue, 8 Jan 2019 at 16:53, Tony Lindgren  wrote:
> > > > > > > Hmm so could it be that we now rely on timers that that may
> > > > > > > not be capable of waking up the system from idle states with
> > > > > > > hrtimer?
> > > > > >
> > > > > > With nohz and hrtimer enabled,  timer relies on hrtimer to generate
> > > > > > the tick so you should use the same interrupt.
> > > > >
> > > > > OK yeah looks like that part is working just fine.
> > > > >
> > > > > Adding some printks and debugging over ssh, looks like
> > > > > omap8250_runtime_resume() gets called just fine based on a wakeirq,
> > > > > but then omap8250_runtime_suspend() runs immediately instead of
> > > > > waiting for the three second timeout.
> > > > >
> > > > > Lowering the autosuspend_delay_ms to 2100 ms makes things work again.
> > > > > Anything higher than 2200 ms seems to somehow time out immediately
> > > > > now :)
> > > >
> > > > This is quite close to the max ns of an int on arm 32bits
> > > >
> > > > Could you try the patch below ?
> > > >
> > > > ---
> > > >  drivers/base/power/runtime.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> > > > index 7062469..44c5c76 100644
> > > > --- a/drivers/base/power/runtime.c
> > > > +++ b/drivers/base/power/runtime.c
> > > > @@ -141,7 +141,7 @@ u64 pm_runtime_autosuspend_expiration(struct device 
> > > > *dev)
> > > >
> > > >   last_busy = READ_ONCE(dev->power.last_busy);
> > > >
> > > > - expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
> > > > + expires = last_busy + (u64)(autosuspend_delay) * NSEC_PER_MSEC;
> > > >   if (expires <= now)
> > > >   expires = 0;/* Already expired. */
> > >
> > > Hmm, comment above function states it returns "the expiration time in 
> > > jiffies
> > > (adjusted to be nonzero)", so there's probably more to fix...
> >
> > The comment is wrong and should be updated as commit 8234f6734c5d has
> > moved on hrtimer and expires is now in raw ns unit
>
> Yup. Who'll send a patch? Is it worth optimizing as bellow? I'm fine with 
> doing

You can send a patch for updating the comment.

> both.

Regarding proposal below:
- pm_runtime_autosuspend_expiration() returns u64 and not ktime_t
- use helper ktime_before/after to compare ktime_t value

Using ktime helper function makes the code less readable and the
proposal make the function more difficult to read IMHO when you
compare
  expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
with
  expires = ktime_add_ns(ms_to_ktime(autosuspend_delay),
  READ_ONCE(dev->power.last_busy));
or when you compare
 if (expires <= now)
with
 if (ktime_after(now, expires))

And  I'm not sure that this will optimize the code at the end

Only the replacement of the "goto out" by return 0 would make sense IMO

Regards,
Vincent

>
> > > You can also consider change like this (still does not return jiffies):
> > > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> > > index 70624695b6d5..c72eaf21a61c 100644
> > > --- a/drivers/base/power/runtime.c
> > > +++ b/drivers/base/power/runtime.c
> > > @@ -129,23 +129,20 @@ static void pm_runtime_cancel_pending(struct device 
> > > *dev)
> > >  u64 pm_runtime_autosuspend_expiration(struct device *dev)
> > >  {
> > > int autosuspend_delay;
> > > -   u64 last_busy, expires = 0;
> > > -   u64 now = ktime_to_ns(ktime_get());
> > > +   ktime_t expires;
> > >
> > > if (!dev->power.use_autosuspend)
> > > -   goto out;
> > > +   return 0;
> > >
> > > autosuspend_delay = READ_ONCE(dev->power.autosuspend_delay);
> > > if (autosuspend_delay < 0)
> > > -   goto out;
> > > -
> > > -   last_busy = READ_ONCE(dev->power.last_busy);
> > > +   return 0;
> > >
> > > -   expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
> > > -   if (expires <= now)
> > > -   expires = 0;/* Already expired. */
> > > +   expires = ktime_add_ns(ms_to_ktime(autosuspend_delay),
> > > +  READ_ONCE(dev->power.last_busy));
> > > +   if (expires <= ktime_get())
> > > +   return 0;   /* Already expired. */
> > >
> > > - out:
> > > return expires;
> > >  }
> > >  EXPORT_SYMBOL_GPL(pm_runtime_autosuspend_expiration);
> > >
> > > Regards,
> > > ladis


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Vincent Guittot
On Wed, 9 Jan 2019 at 12:17, Ladislav Michl  wrote:
>
> On Wed, Jan 09, 2019 at 02:42:18AM +0100, Vincent Guittot wrote:
> > Le Tuesday 08 Jan 2019 à 13:37:43 (-0800), Tony Lindgren a écrit :
> > > * Vincent Guittot  [190108 16:42]:
> > > > On Tue, 8 Jan 2019 at 16:53, Tony Lindgren  wrote:
> > > > > Hmm so could it be that we now rely on timers that that may
> > > > > not be capable of waking up the system from idle states with
> > > > > hrtimer?
> > > >
> > > > With nohz and hrtimer enabled,  timer relies on hrtimer to generate
> > > > the tick so you should use the same interrupt.
> > >
> > > OK yeah looks like that part is working just fine.
> > >
> > > Adding some printks and debugging over ssh, looks like
> > > omap8250_runtime_resume() gets called just fine based on a wakeirq,
> > > but then omap8250_runtime_suspend() runs immediately instead of
> > > waiting for the three second timeout.
> > >
> > > Lowering the autosuspend_delay_ms to 2100 ms makes things work again.
> > > Anything higher than 2200 ms seems to somehow time out immediately
> > > now :)
> >
> > This is quite close to the max ns of an int on arm 32bits
> >
> > Could you try the patch below ?
> >
> > ---
> >  drivers/base/power/runtime.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> > index 7062469..44c5c76 100644
> > --- a/drivers/base/power/runtime.c
> > +++ b/drivers/base/power/runtime.c
> > @@ -141,7 +141,7 @@ u64 pm_runtime_autosuspend_expiration(struct device 
> > *dev)
> >
> >   last_busy = READ_ONCE(dev->power.last_busy);
> >
> > - expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
> > + expires = last_busy + (u64)(autosuspend_delay) * NSEC_PER_MSEC;
> >   if (expires <= now)
> >   expires = 0;/* Already expired. */
>
> Hmm, comment above function states it returns "the expiration time in jiffies
> (adjusted to be nonzero)", so there's probably more to fix...

The comment is wrong and should be updated as commit 8234f6734c5d has
moved on hrtimer and expires is now in raw ns unit

>
> You can also consider change like this (still does not return jiffies):
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 70624695b6d5..c72eaf21a61c 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -129,23 +129,20 @@ static void pm_runtime_cancel_pending(struct device 
> *dev)
>  u64 pm_runtime_autosuspend_expiration(struct device *dev)
>  {
> int autosuspend_delay;
> -   u64 last_busy, expires = 0;
> -   u64 now = ktime_to_ns(ktime_get());
> +   ktime_t expires;
>
> if (!dev->power.use_autosuspend)
> -   goto out;
> +   return 0;
>
> autosuspend_delay = READ_ONCE(dev->power.autosuspend_delay);
> if (autosuspend_delay < 0)
> -   goto out;
> -
> -   last_busy = READ_ONCE(dev->power.last_busy);
> +   return 0;
>
> -   expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
> -   if (expires <= now)
> -   expires = 0;/* Already expired. */
> +   expires = ktime_add_ns(ms_to_ktime(autosuspend_delay),
> +  READ_ONCE(dev->power.last_busy));
> +   if (expires <= ktime_get())
> +   return 0;   /* Already expired. */
>
> - out:
> return expires;
>  }
>  EXPORT_SYMBOL_GPL(pm_runtime_autosuspend_expiration);
>
> Regards,
> ladis


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Ladislav Michl
On Wed, Jan 09, 2019 at 02:42:18AM +0100, Vincent Guittot wrote:
> Le Tuesday 08 Jan 2019 à 13:37:43 (-0800), Tony Lindgren a écrit :
> > * Vincent Guittot  [190108 16:42]:
> > > On Tue, 8 Jan 2019 at 16:53, Tony Lindgren  wrote:
> > > > Hmm so could it be that we now rely on timers that that may
> > > > not be capable of waking up the system from idle states with
> > > > hrtimer?
> > > 
> > > With nohz and hrtimer enabled,  timer relies on hrtimer to generate
> > > the tick so you should use the same interrupt.
> > 
> > OK yeah looks like that part is working just fine.
> > 
> > Adding some printks and debugging over ssh, looks like
> > omap8250_runtime_resume() gets called just fine based on a wakeirq,
> > but then omap8250_runtime_suspend() runs immediately instead of
> > waiting for the three second timeout.
> > 
> > Lowering the autosuspend_delay_ms to 2100 ms makes things work again.
> > Anything higher than 2200 ms seems to somehow time out immediately
> > now :)
> 
> This is quite close to the max ns of an int on arm 32bits
> 
> Could you try the patch below ?
> 
> ---
>  drivers/base/power/runtime.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 7062469..44c5c76 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -141,7 +141,7 @@ u64 pm_runtime_autosuspend_expiration(struct device *dev)
>  
>   last_busy = READ_ONCE(dev->power.last_busy);
>  
> - expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
> + expires = last_busy + (u64)(autosuspend_delay) * NSEC_PER_MSEC;
>   if (expires <= now)
>   expires = 0;/* Already expired. */

Hmm, comment above function states it returns "the expiration time in jiffies
(adjusted to be nonzero)", so there's probably more to fix...

You can also consider change like this (still does not return jiffies):
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 70624695b6d5..c72eaf21a61c 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -129,23 +129,20 @@ static void pm_runtime_cancel_pending(struct device *dev)
 u64 pm_runtime_autosuspend_expiration(struct device *dev)
 {
int autosuspend_delay;
-   u64 last_busy, expires = 0;
-   u64 now = ktime_to_ns(ktime_get());
+   ktime_t expires;
 
if (!dev->power.use_autosuspend)
-   goto out;
+   return 0;
 
autosuspend_delay = READ_ONCE(dev->power.autosuspend_delay);
if (autosuspend_delay < 0)
-   goto out;
-
-   last_busy = READ_ONCE(dev->power.last_busy);
+   return 0;
 
-   expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
-   if (expires <= now)
-   expires = 0;/* Already expired. */
+   expires = ktime_add_ns(ms_to_ktime(autosuspend_delay),
+  READ_ONCE(dev->power.last_busy));
+   if (expires <= ktime_get())
+   return 0;   /* Already expired. */
 
- out:
return expires;
 }
 EXPORT_SYMBOL_GPL(pm_runtime_autosuspend_expiration);

Regards,
ladis


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-09 Thread Rafael J. Wysocki
On Wed, Jan 9, 2019 at 2:51 AM Tony Lindgren  wrote:
>
> * Vincent Guittot  [190109 01:42]:
> > Le Tuesday 08 Jan 2019 à 13:37:43 (-0800), Tony Lindgren a écrit :
> > > Lowering the autosuspend_delay_ms to 2100 ms makes things work again.
> > > Anything higher than 2200 ms seems to somehow time out immediately
> > > now :)
> >
> > This is quite close to the max ns of an int on arm 32bits
> >
> > Could you try the patch below ?
>
> Yup great thanks, that's it:
>
> Tested-by: Tony Lindgren 

Cool.  Thanks for getting to the bottom of this!


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-08 Thread Tony Lindgren
* Vincent Guittot  [190109 01:42]:
> Le Tuesday 08 Jan 2019 à 13:37:43 (-0800), Tony Lindgren a écrit :
> > Lowering the autosuspend_delay_ms to 2100 ms makes things work again.
> > Anything higher than 2200 ms seems to somehow time out immediately
> > now :)
> 
> This is quite close to the max ns of an int on arm 32bits
> 
> Could you try the patch below ?

Yup great thanks, that's it:

Tested-by: Tony Lindgren 

> ---
>  drivers/base/power/runtime.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 7062469..44c5c76 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -141,7 +141,7 @@ u64 pm_runtime_autosuspend_expiration(struct device *dev)
>  
>   last_busy = READ_ONCE(dev->power.last_busy);
>  
> - expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
> + expires = last_busy + (u64)(autosuspend_delay) * NSEC_PER_MSEC;
>   if (expires <= now)
>   expires = 0;/* Already expired. */
>  
> -- 
> 2.7.4
> 
> 
> > 
> > Regards,
> > 
> > Tony


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-08 Thread Vincent Guittot
Le Tuesday 08 Jan 2019 à 13:37:43 (-0800), Tony Lindgren a écrit :
> * Vincent Guittot  [190108 16:42]:
> > On Tue, 8 Jan 2019 at 16:53, Tony Lindgren  wrote:
> > > Hmm so could it be that we now rely on timers that that may
> > > not be capable of waking up the system from idle states with
> > > hrtimer?
> > 
> > With nohz and hrtimer enabled,  timer relies on hrtimer to generate
> > the tick so you should use the same interrupt.
> 
> OK yeah looks like that part is working just fine.
> 
> Adding some printks and debugging over ssh, looks like
> omap8250_runtime_resume() gets called just fine based on a wakeirq,
> but then omap8250_runtime_suspend() runs immediately instead of
> waiting for the three second timeout.
> 
> Lowering the autosuspend_delay_ms to 2100 ms makes things work again.
> Anything higher than 2200 ms seems to somehow time out immediately
> now :)

This is quite close to the max ns of an int on arm 32bits

Could you try the patch below ?

---
 drivers/base/power/runtime.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 7062469..44c5c76 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -141,7 +141,7 @@ u64 pm_runtime_autosuspend_expiration(struct device *dev)
 
last_busy = READ_ONCE(dev->power.last_busy);
 
-   expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
+   expires = last_busy + (u64)(autosuspend_delay) * NSEC_PER_MSEC;
if (expires <= now)
expires = 0;/* Already expired. */
 
-- 
2.7.4


> 
> Regards,
> 
> Tony


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-08 Thread Tony Lindgren
* Vincent Guittot  [190108 16:42]:
> On Tue, 8 Jan 2019 at 16:53, Tony Lindgren  wrote:
> > Hmm so could it be that we now rely on timers that that may
> > not be capable of waking up the system from idle states with
> > hrtimer?
> 
> With nohz and hrtimer enabled,  timer relies on hrtimer to generate
> the tick so you should use the same interrupt.

OK yeah looks like that part is working just fine.

Adding some printks and debugging over ssh, looks like
omap8250_runtime_resume() gets called just fine based on a wakeirq,
but then omap8250_runtime_suspend() runs immediately instead of
waiting for the three second timeout.

Lowering the autosuspend_delay_ms to 2100 ms makes things work again.
Anything higher than 2200 ms seems to somehow time out immediately
now :)

Regards,

Tony


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-08 Thread Vincent Guittot
On Tue, 8 Jan 2019 at 16:53, Tony Lindgren  wrote:
>
> * Vincent Guittot  [190108 08:00]:
> > Hi Tony,
> >
> > On Tue, 8 Jan 2019 at 00:38, Tony Lindgren  wrote:
> > >
> > > Hi all,
> > >
> > > Looks like commit 8234f6734c5d ("PM-runtime: Switch autosuspend
> > > over to using hrtimers") caused a regression on at least
> > > omap5-uevm where 8250 UART rx wake no longer works. I have not
> > > noticed this happening on others so far.
> > >
> > > The devices I've tested all are using 8250 with dedicated
> > > wakeirqs configured for the rx pin. I can see the interrupt
> > > increase on omap5-uevm after some one or more keypresses,
> > > but then nothing. It seems that the uart just falls back
> > > asleep right away or something.
> > >
> > > Any ideas what might be going wrong?
> >
> > What is the autosuspend value ? Can it be that the autosuspend is set
> > to a short value but was finally greater than 10-20ms on arm32. And
> > now the autosuspend happens before and this has changed the sequence ?
>
> It's set to 3 seconds. The difference between let's say
> C-A9 pandaboard (that is working) compared to C-A15 omap5-uevm
> is that the C-A15 has arch_timer in use. Other than that things
> should behave more or less the same way.
>
> Hmm so could it be that we now rely on timers that that may
> not be capable of waking up the system from idle states with
> hrtimer?

With nohz and hrtimer enabled,  timer relies on hrtimer to generate
the tick so you should use the same interrupt.

>
> Regards,
>
> Tony
>


Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-08 Thread Tony Lindgren
* Vincent Guittot  [190108 08:00]:
> Hi Tony,
> 
> On Tue, 8 Jan 2019 at 00:38, Tony Lindgren  wrote:
> >
> > Hi all,
> >
> > Looks like commit 8234f6734c5d ("PM-runtime: Switch autosuspend
> > over to using hrtimers") caused a regression on at least
> > omap5-uevm where 8250 UART rx wake no longer works. I have not
> > noticed this happening on others so far.
> >
> > The devices I've tested all are using 8250 with dedicated
> > wakeirqs configured for the rx pin. I can see the interrupt
> > increase on omap5-uevm after some one or more keypresses,
> > but then nothing. It seems that the uart just falls back
> > asleep right away or something.
> >
> > Any ideas what might be going wrong?
> 
> What is the autosuspend value ? Can it be that the autosuspend is set
> to a short value but was finally greater than 10-20ms on arm32. And
> now the autosuspend happens before and this has changed the sequence ?

It's set to 3 seconds. The difference between let's say
C-A9 pandaboard (that is working) compared to C-A15 omap5-uevm
is that the C-A15 has arch_timer in use. Other than that things
should behave more or less the same way.

Hmm so could it be that we now rely on timers that that may
not be capable of waking up the system from idle states with
hrtimer?

Regards,

Tony



Re: Regression in v5.0-rc1 with autosuspend hrtimers

2019-01-08 Thread Vincent Guittot
Hi Tony,

On Tue, 8 Jan 2019 at 00:38, Tony Lindgren  wrote:
>
> Hi all,
>
> Looks like commit 8234f6734c5d ("PM-runtime: Switch autosuspend
> over to using hrtimers") caused a regression on at least
> omap5-uevm where 8250 UART rx wake no longer works. I have not
> noticed this happening on others so far.
>
> The devices I've tested all are using 8250 with dedicated
> wakeirqs configured for the rx pin. I can see the interrupt
> increase on omap5-uevm after some one or more keypresses,
> but then nothing. It seems that the uart just falls back
> asleep right away or something.
>
> Any ideas what might be going wrong?

What is the autosuspend value ? Can it be that the autosuspend is set
to a short value but was finally greater than 10-20ms on arm32. And
now the autosuspend happens before and this has changed the sequence ?

Regards,
Vincent
>
> Regards,
>
> Tony
>