Re: [PATCH 0/6] cpuidle: menu: Fixes, optimizations and cleanups

2018-10-10 Thread Rafael J. Wysocki
On Wed, Oct 10, 2018 at 2:02 AM Doug Smythies  wrote:
>
> On 2018.10.09 03:43 Rafael J. Wysocki wrote:
>
> ...[snip]...
>
> > While at it, could you test the appended patch
> > (on top of the previous 8) for me please?
> >
> > I think that this code can be simplified now.
> >
> > ---
> > drivers/cpuidle/governors/menu.c |    8 ++++----
> > 1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > Index: linux-pm/drivers/cpuidle/governors/menu.c
> > ===================================================================
> > --- linux-pm.orig/drivers/cpuidle/governors/menu.c
> > +++ linux-pm/drivers/cpuidle/governors/menu.c
> > @@ -371,12 +371,12 @@ static int menu_select(struct cpuidle_dr
> >   if (s->target_residency > predicted_us) {
> >   /*
> >* Use a physical idle state, not busy polling, unless
> > -  * a timer is going to trigger really really soon.
> > +  * a timer is going to trigger soon enough.
> >*/
> >   if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) &&
> > - i == idx + 1 && latency_req > s->exit_latency &&
> > - data->next_timer_us > max_t(unsigned int, 20,
> > - s->target_residency)) {
> > + s->exit_latency <= latency_req &&
> > + s->target_residency <= data->next_timer_us) {
> > + predicted_us = s->target_residency;
> >   idx = i;
> >   break;
> >   }
>
> It seems to work fine.
> I was unable to detect any difference between the 8-patch set alone and
> the 8-patch set with this additional patch, for any of the tests that I
> ran (at least beyond noise and/or experimental error).

Great, thank you!

Cheers,
Rafael


RE: [PATCH 0/6] cpuidle: menu: Fixes, optimizations and cleanups

2018-10-09 Thread Doug Smythies
On 2018.10.09 03:43 Rafael J. Wysocki wrote:

...[snip]...

> While at it, could you test the appended patch
> (on top of the previous 8) for me please?
>
> I think that this code can be simplified now.
>
> ---
> drivers/cpuidle/governors/menu.c |    8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> Index: linux-pm/drivers/cpuidle/governors/menu.c
> ===================================================================
> --- linux-pm.orig/drivers/cpuidle/governors/menu.c
> +++ linux-pm/drivers/cpuidle/governors/menu.c
> @@ -371,12 +371,12 @@ static int menu_select(struct cpuidle_dr
>   if (s->target_residency > predicted_us) {
>   /*
>* Use a physical idle state, not busy polling, unless
> -  * a timer is going to trigger really really soon.
> +  * a timer is going to trigger soon enough.
>*/
>   if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) &&
> - i == idx + 1 && latency_req > s->exit_latency &&
> - data->next_timer_us > max_t(unsigned int, 20,
> - s->target_residency)) {
> + s->exit_latency <= latency_req &&
> + s->target_residency <= data->next_timer_us) {
> + predicted_us = s->target_residency;
>   idx = i;
>   break;
>   }

It seems to work fine.
I was unable to detect any difference between the 8-patch set alone and
the 8-patch set with this additional patch, for any of the tests that I
ran (at least beyond noise and/or experimental error).

Note: I didn't publish any of the pretty graphs.

... Doug




Re: [PATCH 0/6] cpuidle: menu: Fixes, optimizations and cleanups

2018-10-09 Thread Rafael J. Wysocki
On Tuesday, October 9, 2018 12:26:48 AM CEST Rafael J. Wysocki wrote:
> On Tue, Oct 9, 2018 at 12:14 AM Doug Smythies  wrote:
> >
> > On 2018.10.08 00:51 Rafael J. Wysocki wrote:
> > > On Mon, Oct 8, 2018 at 8:02 AM Doug Smythies  wrote:
> > >>
> > >> On 2018.10.03 23:56 Rafael J. Wysocki wrote:
> > >>> On Tue, Oct 2, 2018 at 11:51 PM Rafael J. Wysocki  
> > >>> wrote:
> 
> [cut]
> 
> > >> Test 2: pipe test 2 CPUs, one core. CPU test:
> > >>
> > >> The average loop times graph is here:
> > >> http://fast.smythies.com/linux-pm/k419/k419-rjw-pipe-1core.png
> > >>
> > >> The power and idle statistics graphs are here:
> > >> http://fast.smythies.com/linux-pm/k419/k419-rjw-pipe-1core.htm
> > >>
> > >> Conclusions:
> > >>
> > >> Better performance at the cost of more power with
> > >> the patch set, but late August had both better performance
> > >> and less power.
> > >>
> > >> Overall idle entries and exits are about the same, but way
> > >> way more idle state 0 entries and exits with the patch set.
> > >
> > >Same as above (and expected too).
> >
> > I disagree. The significant transfer of idle entries from
> > idle state 1 with kernel 4.19-rc6 to idle state 0 with the
> > additional 8-patch set is virtually entirely due to this patch:
> >
> > "[PATCH 2/6] cpuidle: menu: Compute first_idx when latency_req is known"
> 
> OK
> 
> > As far as I can determine from all of this data, in particular the
> > histogram data below, it seems to me that selecting idle state 0 now,
> > where it previously selected idle state 1, is the correct decision for
> > those very short duration idle periods (well, for my processor (older
> > i7-2600K) at least).
> 
> At least, that's a matter of consistency IMO.
> 
> State 1 should not be selected if the final latency limit is below its
> exit latency and that's what happens in that situation.
> 
> > Note: I did test my above assertion with kernels compiled with only
> > the first 2 and then 3 of the 8 patch set.
> 
> I see.

While at it, could you test the appended patch (on top of the previous 8)
for me please?

I think that this code can be simplified now.

---
 drivers/cpuidle/governors/menu.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/cpuidle/governors/menu.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/governors/menu.c
+++ linux-pm/drivers/cpuidle/governors/menu.c
@@ -371,12 +371,12 @@ static int menu_select(struct cpuidle_dr
 		if (s->target_residency > predicted_us) {
 			/*
 			 * Use a physical idle state, not busy polling, unless
-			 * a timer is going to trigger really really soon.
+			 * a timer is going to trigger soon enough.
 			 */
 			if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) &&
-			    i == idx + 1 && latency_req > s->exit_latency &&
-			    data->next_timer_us > max_t(unsigned int, 20,
-							s->target_residency)) {
+			    s->exit_latency <= latency_req &&
+			    s->target_residency <= data->next_timer_us) {
+				predicted_us = s->target_residency;
 				idx = i;
 				break;
 			}
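
For readers following along, here is a stand-alone sketch of the rule the
simplified condition encodes; the struct and function names below are
illustrative assumptions, not the kernel's menu_select() internals:

#include <stdbool.h>
#include <stdio.h>

struct idle_state {
	unsigned int exit_latency;     /* worst-case exit latency, us */
	unsigned int target_residency; /* break-even residency, us */
	bool polling;                  /* stand-in for CPUIDLE_FLAG_POLLING */
};

/*
 * Should physical state s be chosen even though its target residency
 * exceeds the predicted idle time? Yes, if the current pick is the
 * polling state, s honors the latency limit, and the next timer is at
 * least s's target residency away. The old rule additionally required
 * s to be the state right after the polling one (i == idx + 1) and put
 * a 20 us floor on next_timer_us; the patch drops both extra checks.
 */
static bool use_physical_over_polling(const struct idle_state *cur,
				      const struct idle_state *s,
				      unsigned int latency_req,
				      unsigned int next_timer_us)
{
	return cur->polling &&
	       s->exit_latency <= latency_req &&
	       s->target_residency <= next_timer_us;
}

int main(void)
{
	struct idle_state poll = { 0, 0, true };
	struct idle_state c1 = { 2, 2, false };

	/* Latency limit 10 us, next timer 5 us away: pick C1 over polling. */
	printf("%d\n", use_physical_over_polling(&poll, &c1, 10, 5));
	return 0;
}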



Re: [PATCH 0/6] cpuidle: menu: Fixes, optimizations and cleanups

2018-10-08 Thread Rafael J. Wysocki
On Tue, Oct 9, 2018 at 12:14 AM Doug Smythies  wrote:
>
> On 2018.10.08 00:51 Rafael J. Wysocki wrote:
> > On Mon, Oct 8, 2018 at 8:02 AM Doug Smythies  wrote:
> >>
> >> On 2018.10.03 23:56 Rafael J. Wysocki wrote:
> >>> On Tue, Oct 2, 2018 at 11:51 PM Rafael J. Wysocki  
> >>> wrote:

[cut]

> >> Test 2: pipe test 2 CPUs, one core. CPU test:
> >>
> >> The average loop times graph is here:
> >> http://fast.smythies.com/linux-pm/k419/k419-rjw-pipe-1core.png
> >>
> >> The power and idle statistics graphs are here:
> >> http://fast.smythies.com/linux-pm/k419/k419-rjw-pipe-1core.htm
> >>
> >> Conclusions:
> >>
> >> Better performance at the cost of more power with
> >> the patch set, but late August had both better performance
> >> and less power.
> >>
> >> Overall idle entries and exits are about the same, but way
> >> way more idle state 0 entries and exits with the patch set.
> >
> >Same as above (and expected too).
>
> I disagree. The significant transfer of idle entries from
> idle state 1 with kernel 4.19-rc6 to idle state 0 with the
> additional 8-patch set is virtually entirely due to this patch:
>
> "[PATCH 2/6] cpuidle: menu: Compute first_idx when latency_req is known"

OK

> As far as I can determine from all of this data, in particular the
> histogram data below, it seems to me that selecting idle state 0 now,
> where it previously selected idle state 1, is the correct decision for
> those very short duration idle periods (well, for my processor (older
> i7-2600K) at least).

At least, that's a matter of consistency IMO.

State 1 should not be selected if the final latency limit is below its
exit latency and that's what happens in that situation.
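
In code terms, a tiny sketch of that constraint (a made-up state table and
illustrative names, not the governor's actual internals):

#include <stdio.h>

struct idle_state {
	unsigned int exit_latency; /* us */
};

/* Deepest state whose exit latency fits the limit; 0 if none deeper does. */
static int deepest_within_latency(const struct idle_state *states,
				  int count, unsigned int latency_req)
{
	for (int i = count - 1; i > 0; i--)
		if (states[i].exit_latency <= latency_req)
			return i;
	return 0;
}

int main(void)
{
	/* Made-up exit latencies shaped like a small C-state table. */
	struct idle_state s[] = { { 0 }, { 2 }, { 10 }, { 87 } };

	/* A 1 us latency limit rules out state 1 (2 us exit latency),
	 * leaving state 0, consistent with the selection shift Doug saw. */
	printf("%d\n", deepest_within_latency(s, 4, 1));
	return 0;
}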

> Note: I did test my above assertion with kernels compiled with only
> the first 2 and then 3 of the 8 patch set.

I see.

Thanks,
Rafael


RE: [PATCH 0/6] cpuidle: menu: Fixes, optimizations and cleanups

2018-10-08 Thread Doug Smythies
On 2018.10.08 00:51 Rafael J. Wysocki wrote:
> On Mon, Oct 8, 2018 at 8:02 AM Doug Smythies  wrote:
>>
>> On 2018.10.03 23:56 Rafael J. Wysocki wrote:
>>> On Tue, Oct 2, 2018 at 11:51 PM Rafael J. Wysocki  
>>> wrote:

 Hi All,

 This series fixes a couple of issues with the menu governor, optimizes it
 somewhat and makes a couple of cleanups in it.  Please refer to the
 patch changelogs for details.

 All of the changes in the series are straightforward in my view.  The
 first two patches are fixes, the rest is optimizations and cleanups.
>>>
>>> I'm inclined to take this stuff in for 4.20 if nobody has problems
>>> with it, so please have a look if you care (and you should, because
>>> the code in question is run on all tickless systems out there).
>>
>> Hi Rafael,
>>
>> I did tests with kernel 4.19-rc6 as a baseline reference and then
>> with 8 of your patches (&8patches in the graphs legend):
>>
>> cpuidle: menu: Replace data->predicted_us with local variable
>>   (required for this set of 6 to apply)
>> This set of 6 patches.
>> cpuidle: poll_state: Revise loop termination condition
>>
>> Recall I also did some testing in late August [1], with
>> a kernel that was just a few hundred commits before 4.19-rc1.
>> The baseline is now way different. While I don't know why,
>> I bisected the kernel and either made a mistake, or it was:
>>
>> first bad commit: [06e386a1db54ab6a671e103e929b590f7a88f0e3]
>> Merge tag 'fbdev-v4.19' of https://github.com/bzolnier/linux
>>
>> Anyway, and for reference, included on some of the graphs
>> is the old data from late August (legend name "4.18+3rjw
>> (Aug test)")
>>
>> Test 1: A Thomas Ilsche type "powernightmare" test:
>> (forever ((10 times - variable usec sleep) 0.999 seconds sleep) X 40 staggered threads.
>> Where the "variable" was from 0.05 to 5 in steps of 0.05, for the first ~200 minutes of the test.
>> (note: overheads mean that actual loop times are quite different.)
>> And then from 5 to 50 in steps of 1, for the remaining 100 minutes of the test.
>> (Shortened by 900 minutes from the way the test was done in August.)
>> Each step ran for 2 minutes. The system was idle for 1 minute at the start, and a few minutes at the end of the graphs.
>>
>> The power and idle statistics graphs are here:
>> http://fast.smythies.com/linux-pm/k419/k419-pn-sweep-rjw.htm
>>
>> Observations:
>>
>> While the graphs are pretty and such, the only significant
>> difference is the idle state 0 percentages go down a lot
>> with the 8 patches. However the number of idle state 0
>> entries per minute goes up. To present the same information
>> in a different way a trace was done (at 9 Gigabytes in
>> 2 minutes):
>
> The difference in the idle state 0 usage is a consequence of the "poll
> idle" patch and is expected.
>
>> &8patches
>> Idle State 0: Total Entries: 10091412 : time (seconds): 49.447025
>> Idle State 1: Total Entries: 49332297 : time (seconds): 375.943064
>> Idle State 2: Total Entries: 311810 : time (seconds): 2.626403
>>
>> k4.19-rc6
>> Idle State 0: Total Entries: 9162465 : time (seconds): 70.650566
>> Idle State 1: Total Entries: 47592671 : time (seconds): 373.625083
>> Idle State 2: Total Entries: 266212 : time (seconds): 2.278159
>>
>> Conclusions: Behaves as expected.
>
> Right. :-)

>> Test 2: pipe test 2 CPUs, one core. CPU test:
>>
>> The average loop times graph is here:
>> http://fast.smythies.com/linux-pm/k419/k419-rjw-pipe-1core.png
>>
>> The power and idle statistics graphs are here:
>> http://fast.smythies.com/linux-pm/k419/k419-rjw-pipe-1core.htm
>>
>> Conclusions:
>>
>> Better performance at the cost of more power with
>> the patch set, but late August had both better performance
>> and less power.
>>
>> Overall idle entries and exits are about the same, but way
>> way more idle state 0 entries and exits with the patch set.
>
>Same as above (and expected too).

I disagree. The significant transfer of idle entries from
idle state 1 with kernel 4.19-rc6 to idle state 0 with the
additional 8-patch set is virtually entirely due to this patch:

"[PATCH 2/6] cpuidle: menu: Compute first_idx when latency_req is known"

As far as I can determine from all of this data, in particular the
histogram data below, it seems to me that selecting idle state 0 now,
where it previously selected idle state 1, is the correct decision for
those very short duration idle periods (well, for my processor (older
i7-2600K) at least).

Note: I did test my above assertion with kernels compiled with only
the first 2 and then 3 of the 8 patch set.

>
>> Supporting: trace summary (note: such a heavy load on the trace
>> system (~6 gigabytes in 2 minutes) costs about 25% in performance):
>>
>> k4.19-rc6 pipe
>> Idle State 0: Total Entries: 76638 : time (seconds): 0.193166
>> Idle State 1: Total Entries: 37825999 : time (seconds): 23.886772
>> Idle State 2: Total Entries: 49 : time (seconds): 0.007908
>>
>> &8patches


Re: [PATCH 0/6] cpuidle: menu: Fixes, optimizations and cleanups

2018-10-08 Thread Rafael J. Wysocki
Hi Doug,

On Mon, Oct 8, 2018 at 8:02 AM Doug Smythies  wrote:
>
> On 2018.10.03 23:56 Rafael J. Wysocki wrote:
> > On Tue, Oct 2, 2018 at 11:51 PM Rafael J. Wysocki  
> > wrote:
> >>
> >> Hi All,
> >>
> >> This series fixes a couple of issues with the menu governor, optimizes it
> >> somewhat and makes a couple of cleanups in it.  Please refer to the
> >> patch changelogs for details.
> >>
> >> All of the changes in the series are straightforward in my view.  The
> >> first two patches are fixes, the rest is optimizations and cleanups.
> >
> > I'm inclined to take this stuff in for 4.20 if nobody has problems
> > with it, so please have a look if you care (and you should, because
> > the code in question is run on all tickless systems out there).
>
> Hi Rafael,
>
> I did tests with kernel 4.19-rc6 as a baseline reference and then
> with 8 of your patches (&8patches in the graphs legend):
>
> cpuidle: menu: Replace data->predicted_us with local variable
>   (required for this set of 6 to apply)
> This set of 6 patches.
> cpuidle: poll_state: Revise loop termination condition
>
> Recall I also did some testing in late August [1], with
> a kernel that was just a few hundred commits before 4.19-rc1.
> The baseline is now way different. While I don't know why,
> I bisected the kernel and either made a mistake, or it was:
>
> first bad commit: [06e386a1db54ab6a671e103e929b590f7a88f0e3]
> Merge tag 'fbdev-v4.19' of https://github.com/bzolnier/linux
>
> Anyway, and for reference, included on some of the graphs
> is the old data from late August (legend name "4.18+3rjw
> (Aug test)")
>
> Test 1: A Thomas Ilsche type "powernightmare" test:
> (forever ((10 times - variable usec sleep) 0.999 seconds sleep) X 40 staggered threads.
> Where the "variable" was from 0.05 to 5 in steps of 0.05, for the first ~200 minutes of the test.
> (note: overheads mean that actual loop times are quite different.)
> And then from 5 to 50 in steps of 1, for the remaining 100 minutes of the test.
> (Shortened by 900 minutes from the way the test was done in August.)
> Each step ran for 2 minutes. The system was idle for 1 minute at the start, and a few minutes at the end of the graphs.
>
> The power and idle statistics graphs are here:
> http://fast.smythies.com/linux-pm/k419/k419-pn-sweep-rjw.htm
>
> Observations:
>
> While the graphs are pretty and such, the only significant
> difference is the idle state 0 percentages go down a lot
> with the 8 patches. However the number of idle state 0
> entries per minute goes up. To present the same information
> in a different way a trace was done (at 9 Gigabytes in
> 2 minutes):

The difference in the idle state 0 usage is a consequence of the "poll
idle" patch and is expected.

> &8patches
> Idle State 0: Total Entries: 10091412 : time (seconds): 49.447025
> Idle State 1: Total Entries: 49332297 : time (seconds): 375.943064
> Idle State 2: Total Entries: 311810 : time (seconds): 2.626403
>
> k4.19-rc6
> Idle State 0: Total Entries: 9162465 : time (seconds): 70.650566
> Idle State 1: Total Entries: 47592671 : time (seconds): 373.625083
> Idle State 2: Total Entries: 266212 : time (seconds): 2.278159
>
> Conclusions: Behaves as expected.

Right. :-)

> Test 2: pipe test 2 CPUs, one core. CPU test:
>
> The average loop times graph is here:
> http://fast.smythies.com/linux-pm/k419/k419-rjw-pipe-1core.png
>
> The power and idle statistics graphs are here:
> http://fast.smythies.com/linux-pm/k419/k419-rjw-pipe-1core.htm
>
> Conclusions:
>
> Better performance at the cost of more power with
> the patch set, but late August had both better performance
> and less power.
>
> Overall idle entries and exits are about the same, but way
> way more idle state 0 entries and exits with the patch set.

Same as above (and expected too).

> Supporting: trace summary (note: such a heavy load on the trace
> system (~6 gigabytes in 2 minutes) costs about 25% in performance):
>
> k4.16-rc6 pipe
> Idle State 0: Total Entries: 76638 : time (seconds): 0.193166
> Idle State 1: Total Entries: 37825999 : time (seconds): 23.886772
> Idle State 2: Total Entries: 49 : time (seconds): 0.007908
>
> &8patches
> Idle State 0: Total Entries: 37632104 : time (seconds): 26.097220
> Idle State 1: Total Entries: 397 : time (seconds): 0.020021
> Idle State 2: Total Entries: 208 : time (seconds): 0.031052
>
> With rjw 8 patch set (1st col is usecs duration, 2nd col
> is number of occurrences in 2 minutes):
>
> Idle State: 0  Summary:
> 0 24401500
> 1 13153259
> 2 19807
> 3 32731
> 4 802
> 5 346
> 6 1554
> 7 20087
> 8 1849
> 9 150
> 10 9
> 11 10
>
> Idle State: 1  Summary:
> 0 29
> 1 44
> 2 15
> 3 45
> 4 5
> 5 26
> 6 2
> 7 24
> 8 4
> 9 21
> 10 6
> 11 39
> 12 15
> 13 38
> 14 14
> 15 27
> 16 10
> 17 12
> 18 1
> 35 1
> 89 1
> 135 1
> 678 1
> 991 2
> 995 3
> 996 1
> 997 8
> 998 1
> 999 1
>
> Kernel 4.19-rc6 reference:
>
> Idle State: 0  Summary:
> 0 17212
> 1 7516
> 2 34737
> 3 14763
> 4 2312
> 5 74
> 6 3
> 7 3
> 8 


RE: [PATCH 0/6] cpuidle: menu: Fixes, optimizations and cleanups

2018-10-08 Thread Doug Smythies
On 2018.10.03 23:56 Rafael J. Wysocki wrote:
> On Tue, Oct 2, 2018 at 11:51 PM Rafael J. Wysocki  wrote:
>>
>> Hi All,
>>
>> This series fixes a couple of issues with the menu governor, optimizes it
>> somewhat and makes a couple of cleanups in it.  Please refer to the
>> patch changelogs for details.
>>
>> All of the changes in the series are straightforward in my view.  The
>> first two patches are fixes, the rest is optimizations and cleanups.
>
> I'm inclined to take this stuff in for 4.20 if nobody has problems
> with it, so please have a look if you care (and you should, because
> the code in question is run on all tickless systems out there).

Hi Rafael,

I did tests with kernel 4.19-rc6 as a baseline reference and then
with 8 of your patches (&8patches in the graphs legend):

cpuidle: menu: Replace data->predicted_us with local variable
  (required for this set of 6 to apply)
This set of 6 patches.
cpuidle: poll_state: Revise loop termination condition

Recall I also did some testing in late August [1], with
a kernel that was just a few hundred commits before 4.19-rc1.
The baseline is now way different. While I don't know why,
I bisected the kernel and either made a mistake, or it was:

first bad commit: [06e386a1db54ab6a671e103e929b590f7a88f0e3]
Merge tag 'fbdev-v4.19' of https://github.com/bzolnier/linux 

Anyway, and for reference, included on some of the graphs
is the old data from late August (legend name "4.18+3rjw
(Aug test)")

Test 1: A Thomas Ilsche type "powernightmare" test:
(forever ((10 times - variable usec sleep) 0.999 seconds sleep) X 40 staggered threads.
Where the "variable" was from 0.05 to 5 in steps of 0.05, for the first ~200 minutes of the test.
(note: overheads mean that actual loop times are quite different.)
And then from 5 to 50 in steps of 1, for the remaining 100 minutes of the test.
(Shortened by 900 minutes from the way the test was done in August.)
Each step ran for 2 minutes. The system was idle for 1 minute at the start, and a few minutes at the end of the graphs.
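
As a rough illustration, one such load thread could look like the sketch
below; the loop shape follows the description above, while the names and
the sample sleep length are assumptions (the actual harness is not
published here):

#define _POSIX_C_SOURCE 199309L
#include <time.h>

static void sleep_ns(long ns)
{
	struct timespec ts = { ns / 1000000000L, ns % 1000000000L };
	nanosleep(&ts, NULL);
}

int main(void)
{
	/* The "variable" sleep, swept from 0.05 us to 50 us over the run;
	 * 5 us is just a sample point. 40 staggered copies of this loop
	 * ran at once during the test. */
	long short_ns = 5000;

	for (;;) {
		for (int i = 0; i < 10; i++)
			sleep_ns(short_ns);   /* ten short variable sleeps */
		sleep_ns(999000000L);         /* 0.999 second sleep */
	}
}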

The power and idle statistics graphs are here:
http://fast.smythies.com/linux-pm/k419/k419-pn-sweep-rjw.htm

Observations:

While the graphs are pretty and such, the only significant
difference is the idle state 0 percentages go down a lot
with the 8 patches. However the number of idle state 0
entries per minute goes up. To present the same information
in a different way a trace was done (at 9 Gigabytes in
2 minutes):

&8patches
Idle State 0: Total Entries: 10091412 : time (seconds): 49.447025
Idle State 1: Total Entries: 49332297 : time (seconds): 375.943064
Idle State 2: Total Entries: 311810 : time (seconds): 2.626403

k4.19-rc6
Idle State 0: Total Entries: 9162465 : time (seconds): 70.650566
Idle State 1: Total Entries: 47592671 : time (seconds): 373.625083
Idle State 2: Total Entries: 266212 : time (seconds): 2.278159

Conclusions: Behaves as expected.

Test 2: pipe test 2 CPUs, one core. CPU test:

The average loop times graph is here:
http://fast.smythies.com/linux-pm/k419/k419-rjw-pipe-1core.png

The power and idle statistics graphs are here:
http://fast.smythies.com/linux-pm/k419/k419-rjw-pipe-1core.htm
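
Doug's pipe-test harness itself is not included in the thread; a minimal
pipe ping-pong benchmark in that spirit (the CPU numbers and iteration
count are assumptions) could look like:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static void pin_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	sched_setaffinity(0, sizeof(set), &set);
}

int main(void)
{
	int ab[2], ba[2];
	char token = 'x';
	const long iters = 1000000;
	struct timespec t0, t1;

	if (pipe(ab) || pipe(ba))
		return 1;

	if (fork() == 0) {              /* child: echo the token back */
		pin_to_cpu(1);          /* second CPU of one core (assumed) */
		for (long i = 0; i < iters; i++) {
			if (read(ab[0], &token, 1) != 1 ||
			    write(ba[1], &token, 1) != 1)
				break;
		}
		_exit(0);
	}

	pin_to_cpu(0);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (long i = 0; i < iters; i++) {
		write(ab[1], &token, 1);
		read(ba[0], &token, 1);
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
	printf("average loop time: %.1f ns\n", ns / iters);
	return 0;
}

Each round trip forces both CPUs through very short idle periods, which is
exactly the regime where the governor's state 0 versus state 1 choice shows
up in the graphs.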

Conclusions:

Better performance at the cost of more power with
the patch set, but late August had both better performance
and less power.

Overall idle entries and exits are about the same, but way
way more idle state 0 entries and exits with the patch set.

Supporting: trace summary (note: such a heavy load on the trace
system (~6 gigabytes in 2 minutes) costs about 25% in performance):

k4.19-rc6 pipe
Idle State 0: Total Entries: 76638 : time (seconds): 0.193166
Idle State 1: Total Entries: 37825999 : time (seconds): 23.886772
Idle State 2: Total Entries: 49 : time (seconds): 0.007908

&8patches
Idle State 0: Total Entries: 37632104 : time (seconds): 26.097220
Idle State 1: Total Entries: 397 : time (seconds): 0.020021
Idle State 2: Total Entries: 208 : time (seconds): 0.031052

With rjw 8 patch set (1st col is usecs duration, 2nd col
is number of occurrences in 2 minutes):

Idle State: 0  Summary:
0 24401500
1 13153259
2 19807
3 32731
4 802
5 346
6 1554
7 20087
8 1849
9 150
10 9
11 10

Idle State: 1  Summary:
0 29
1 44
2 15
3 45
4 5
5 26
6 2
7 24
8 4
9 21
10 6
11 39
12 15
13 38
14 14
15 27
16 10
17 12
18 1
35 1
89 1
135 1
678 1
991 2
995 3
996 1
997 8
998 1
999 1

Kernel 4.19-rc6 reference:

Idle State: 0  Summary:
0 17212
1 7516
2 34737
3 14763
4 2312
5 74
6 3
7 3
8 3
9 4
10 5
11 5
40 1

Idle State: 1  Summary:
0 36073601
1 1662728
2 67985
3 106
4 22
5 8
6 2214
7 11037
8 7110
9 1156
10 1
11 1
13 2
23 1
29 1
99 1
554 1
620 1
846 1
870 1
936 1
944 1
963 1
972 1
989 1
991 1
993 1
994 1
995 2
996 2
997 6
998 3

Test 3: iperf test:

Method: Be an iperf client to 3 servers at once.
Packets are small on purpose; we want the highest frequency of packets, not the fastest payload delivery.

Performance:

Kernel 4.19: 79.9 + 23.5 + 32.8 = 136.2 Mbits/Sec.
&8patches:   78.6 + 23.2 + 33.0 = 134.8 Mbits/Sec.


Re: [PATCH 0/6] cpuidle: menu: Fixes, optimizations and cleanups

2018-10-04 Thread Peter Zijlstra
On Thu, Oct 04, 2018 at 08:55:45AM +0200, Rafael J. Wysocki wrote:
> On Tue, Oct 2, 2018 at 11:51 PM Rafael J. Wysocki  wrote:
> >
> > Hi All,
> >
> > This series fixes a couple of issues with the menu governor, optimizes it
> > somewhat and makes a couple of cleanups in it.  Please refer to the
> > patch changelogs for details.
> >
> > All of the changes in the series are straightforward in my view.  The
> > first two patches are fixes, the rest is optimizations and cleanups.
> 
> I'm inclined to take this stuff in for 4.20 if nobody has problems
> with it, so please have a look if you care (and you should, because
> the code in question is run on all tickless systems out there).

Looks ok to me,

Acked-by: Peter Zijlstra (Intel) 



Re: [PATCH 0/6] cpuidle: menu: Fixes, optimizations and cleanups

2018-10-04 Thread Rafael J. Wysocki
On Tue, Oct 2, 2018 at 11:51 PM Rafael J. Wysocki  wrote:
>
> Hi All,
>
> This series fixes a couple of issues with the menu governor, optimizes it
> somewhat and makes a couple of cleanups in it.  Please refer to the
> patch changelogs for details.
>
> All of the changes in the series are straightforward in my view.  The
> first two patches are fixes, the rest is optimizations and cleanups.

I'm inclined to take this stuff in for 4.20 if nobody has problems
with it, so please have a look if you care (and you should, because
the code in question is run on all tickless systems out there).

Thanks,
Rafael



[PATCH 0/6] cpuidle: menu: Fixes, optimizations and cleanups

2018-10-02 Thread Rafael J. Wysocki
Hi All,

This series fixes a couple of issues with the menu governor, optimizes it
somewhat and makes a couple of cleanups in it.  Please refer to the
patch changelogs for details.

All of the changes in the series are straightforward in my view.  The
first two patches are fixes, the rest is optimizations and cleanups.

Thanks,
Rafael


