Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-06-01 Thread Viresh Kumar
On 30-05-16, 09:35, Steve Muckle wrote:
> The text above (the second sentence) seems okay to me in that it
> mentions remote updates are not currently supported. Let me know if
> there is specific text you would like included.

Perhaps I got confused between cover-letter and this log. Leave that.
Its fine.

-- 
viresh


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-06-01 Thread Viresh Kumar
On 30-05-16, 09:35, Steve Muckle wrote:
> The text above (the second sentence) seems okay to me in that it
> mentions remote updates are not currently supported. Let me know if
> there is specific text you would like included.

Perhaps I got confused between cover-letter and this log. Leave that.
Its fine.

-- 
viresh


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-31 Thread Peter Zijlstra
On Mon, May 30, 2016 at 09:02:33PM +0530, Viresh Kumar wrote:
> But this is something that is fundamentally broken for now. The user writes
> updates the policy->max/min, we return the call to the user thinks that it has

IMO the presence of these user min/max files is the broken part.


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-31 Thread Peter Zijlstra
On Mon, May 30, 2016 at 09:02:33PM +0530, Viresh Kumar wrote:
> But this is something that is fundamentally broken for now. The user writes
> updates the policy->max/min, we return the call to the user thinks that it has

IMO the presence of these user min/max files is the broken part.


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Wanpeng Li
2016-05-30 22:25 GMT+08:00 Rafael J. Wysocki :
> On Mon, May 30, 2016 at 12:18 PM, Viresh Kumar  
> wrote:
>> On 29-05-16, 02:40, Rafael J. Wysocki wrote:
>>> I can't really parse the above question, so I'm not going to try to
>>> answer it. :-)
>>
>> Sorry about that :(
>>
>> IOW, I think that we should make this change into the sched-governor (I will
>> send a patch separately if you agree to this):
>
> I don't.
>
>> diff --git a/kernel/sched/cpufreq_schedutil.c 
>> b/kernel/sched/cpufreq_schedutil.c
>> index 14c4aa25cc45..5934b14aa21c 100644
>> --- a/kernel/sched/cpufreq_schedutil.c
>> +++ b/kernel/sched/cpufreq_schedutil.c
>> @@ -66,11 +66,6 @@ static bool sugov_should_update_freq(struct sugov_policy 
>> *sg_policy, u64 time)
>>
>> if (unlikely(sg_policy->need_freq_update)) {
>> sg_policy->need_freq_update = false;
>> -   /*
>> -* This happens when limits change, so forget the previous
>> -* next_freq value and force an update.
>> -*/
>> -   sg_policy->next_freq = UINT_MAX;
>> return true;
>> }
>>
>> And here is my reasoning behind this.
>>
>> Can you show me any case, where the above code (as present in mainline
>> today) leads to a freq-change? I couldn't find any.
>>
>> Let me demonstrate.
>>
>> Following only talks about the fast-switch path, the other path is
>> same as well.
>>
>> Suppose this is the current range of frequencies supported by a
>> driver: 200, 400, 600, 800, 1000 (in MHz).
>>
>> And policy->cur = next_freq = 400 MHz.
>>
>> A.) Suppose that we change policy->min to 400 MHz from userspace.
>> -> sugov_limits()
>>This will find everything in order and simply set
>>need_freq_update, without updating the frequency.
>>
>> On next util-callback, we will forcefully return true from
>> sugov_should_update_freq() and reach sugov_update_commit().
>>
>> We calculate next_freq and that comes to 400 MHz again (that's the
>> case we are trying to target with the above code).
>>
>> With the current code, we will forcefully end up calling
>> cpufreq_driver_fast_switch().
>>
>> Because the new and current frequencies are same,
>> cpufreq_driver->fast_switch() will simply return.
>>
>> NOTE: I also think that cpufreq_driver_fast_switch() should have a
>> check like (policy->cur == target_freq). I will add that too, in
>> case you agree.
>>
>> So, forcefully updating next_freq to UINT_MAX will end up wasting
>> some cycles, but wouldn't do any useful stuff.
>
> It will, but there's no way to distinguish this case from B in the
> governor with the current min/max synchronization mechanism.  That is,
> it only knows that something has changed, but checking what exactly
> has changed would be racy.
>
>> B.) Suppose that we change policy->min to 600 MHz from userspace.
>> -> sugov_limits()
>>This will find that policy->cur is less than 600 and will set
>>that to 600 MHz by calling __cpufreq_driver_target(). We will
>>also set need_freq_update.
>>
>>Note that next_freq and policy->cur are not in sync anymore and
>>perhaps this is the most important case for the above code.
>
> It is.
>
> Moreover, please note that __cpufreq_driver_target() is only called in
> sugov_limits() when policy->fast_switch_enabled is unset.
>
>> On next util-callback, we will forcefully return true from
>> sugov_should_update_freq() and reach sugov_update_commit().
>>
>> We calculate next_freq and lets say that comes to 400 MHz again
>> (as that's the case we are trying to target with the above code).
>>
>> With the current code, we will forcefully end up calling
>> cpufreq_driver_fast_switch().
>>
>> Because next_freq() is not part of the new range, we will clamp it
>> and set it to 600 MHz eventually. Again, next and current
>> frequencies are same, cpufreq_driver->fast_switch() will simply
>> return.
>
> Not really (as per the above).
>
> And even in the !fast_switch_enabled case, if next_freq stays at 400
> MHz (which is different from policy->cur), it may lead to suboptimal
> decisions going forward (eg. if it goes to 600 MHz next time and the
> governor will think that the frequency has changed, although in fact
> it hasn't).

Does set next_freq = UNIT_MAX has same effect as next_freq stays at
400MHz since both means that frequency has changed?

Regards,
Wanpeng Li


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Wanpeng Li
2016-05-30 22:25 GMT+08:00 Rafael J. Wysocki :
> On Mon, May 30, 2016 at 12:18 PM, Viresh Kumar  
> wrote:
>> On 29-05-16, 02:40, Rafael J. Wysocki wrote:
>>> I can't really parse the above question, so I'm not going to try to
>>> answer it. :-)
>>
>> Sorry about that :(
>>
>> IOW, I think that we should make this change into the sched-governor (I will
>> send a patch separately if you agree to this):
>
> I don't.
>
>> diff --git a/kernel/sched/cpufreq_schedutil.c 
>> b/kernel/sched/cpufreq_schedutil.c
>> index 14c4aa25cc45..5934b14aa21c 100644
>> --- a/kernel/sched/cpufreq_schedutil.c
>> +++ b/kernel/sched/cpufreq_schedutil.c
>> @@ -66,11 +66,6 @@ static bool sugov_should_update_freq(struct sugov_policy 
>> *sg_policy, u64 time)
>>
>> if (unlikely(sg_policy->need_freq_update)) {
>> sg_policy->need_freq_update = false;
>> -   /*
>> -* This happens when limits change, so forget the previous
>> -* next_freq value and force an update.
>> -*/
>> -   sg_policy->next_freq = UINT_MAX;
>> return true;
>> }
>>
>> And here is my reasoning behind this.
>>
>> Can you show me any case, where the above code (as present in mainline
>> today) leads to a freq-change? I couldn't find any.
>>
>> Let me demonstrate.
>>
>> Following only talks about the fast-switch path, the other path is
>> same as well.
>>
>> Suppose this is the current range of frequencies supported by a
>> driver: 200, 400, 600, 800, 1000 (in MHz).
>>
>> And policy->cur = next_freq = 400 MHz.
>>
>> A.) Suppose that we change policy->min to 400 MHz from userspace.
>> -> sugov_limits()
>>This will find everything in order and simply set
>>need_freq_update, without updating the frequency.
>>
>> On next util-callback, we will forcefully return true from
>> sugov_should_update_freq() and reach sugov_update_commit().
>>
>> We calculate next_freq and that comes to 400 MHz again (that's the
>> case we are trying to target with the above code).
>>
>> With the current code, we will forcefully end up calling
>> cpufreq_driver_fast_switch().
>>
>> Because the new and current frequencies are same,
>> cpufreq_driver->fast_switch() will simply return.
>>
>> NOTE: I also think that cpufreq_driver_fast_switch() should have a
>> check like (policy->cur == target_freq). I will add that too, in
>> case you agree.
>>
>> So, forcefully updating next_freq to UINT_MAX will end up wasting
>> some cycles, but wouldn't do any useful stuff.
>
> It will, but there's no way to distinguish this case from B in the
> governor with the current min/max synchronization mechanism.  That is,
> it only knows that something has changed, but checking what exactly
> has changed would be racy.
>
>> B.) Suppose that we change policy->min to 600 MHz from userspace.
>> -> sugov_limits()
>>This will find that policy->cur is less than 600 and will set
>>that to 600 MHz by calling __cpufreq_driver_target(). We will
>>also set need_freq_update.
>>
>>Note that next_freq and policy->cur are not in sync anymore and
>>perhaps this is the most important case for the above code.
>
> It is.
>
> Moreover, please note that __cpufreq_driver_target() is only called in
> sugov_limits() when policy->fast_switch_enabled is unset.
>
>> On next util-callback, we will forcefully return true from
>> sugov_should_update_freq() and reach sugov_update_commit().
>>
>> We calculate next_freq and lets say that comes to 400 MHz again
>> (as that's the case we are trying to target with the above code).
>>
>> With the current code, we will forcefully end up calling
>> cpufreq_driver_fast_switch().
>>
>> Because next_freq() is not part of the new range, we will clamp it
>> and set it to 600 MHz eventually. Again, next and current
>> frequencies are same, cpufreq_driver->fast_switch() will simply
>> return.
>
> Not really (as per the above).
>
> And even in the !fast_switch_enabled case, if next_freq stays at 400
> MHz (which is different from policy->cur), it may lead to suboptimal
> decisions going forward (eg. if it goes to 600 MHz next time and the
> governor will think that the frequency has changed, although in fact
> it hasn't).

Does set next_freq = UNIT_MAX has same effect as next_freq stays at
400MHz since both means that frequency has changed?

Regards,
Wanpeng Li


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Rafael J. Wysocki
On Mon, May 30, 2016 at 5:32 PM, Viresh Kumar  wrote:
> I clearly missed the !policy->fast_switch_enabled check in sugov_limit() and 
> so
> the confusion.
>
> On 30-05-16, 16:25, Rafael J. Wysocki wrote:
>> On Mon, May 30, 2016 at 12:18 PM, Viresh Kumar  
>> wrote:
>> > Suppose this is the current range of frequencies supported by a
>> > driver: 200, 400, 600, 800, 1000 (in MHz).

[cut]

> I am not sure how harmful it can be, but we are returning from sugov_limits()
> without making sure that policy->cur is in valid range currently. I also know
> that you left it out because of the possible races with the util handler.
>
> But this is something that is fundamentally broken for now.

To be clear, this is your opinion that I disagree with.

> The user writes updates the policy->max/min, we return the call to the user 
> thinks that it has
> successfully written to the file and everything is aligned. But we may be
> running at an frequency from invalid range. Yes, that will happen very soon, 
> but
> its broken.

Does it really matter that the call may return before the new limit
takes effect?  I don't think so.

Moreover, if the limit is updated cross-CPU (ie. the CPU running the
call is not the one whose limit is updated) and the target CPU is
idle, waking it up just in order to make sure that the frequency limit
took effect is not particularly smart.

What really matters is how soon the limit can take effect and that in
both "fast switch" and "traditional" cases is about the same time, so
I don't see anything problematic here.

Actually, making sugov_limits() wait for the limit to take effect
would be a matter of adding three lines of code to it, but that would
change a relatively lightweight call into a really heavyweight one, so
I didn't add them, because I didn't see a point in doing that.

I still don't see a point in doing that for that matter.


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Rafael J. Wysocki
On Mon, May 30, 2016 at 5:32 PM, Viresh Kumar  wrote:
> I clearly missed the !policy->fast_switch_enabled check in sugov_limit() and 
> so
> the confusion.
>
> On 30-05-16, 16:25, Rafael J. Wysocki wrote:
>> On Mon, May 30, 2016 at 12:18 PM, Viresh Kumar  
>> wrote:
>> > Suppose this is the current range of frequencies supported by a
>> > driver: 200, 400, 600, 800, 1000 (in MHz).

[cut]

> I am not sure how harmful it can be, but we are returning from sugov_limits()
> without making sure that policy->cur is in valid range currently. I also know
> that you left it out because of the possible races with the util handler.
>
> But this is something that is fundamentally broken for now.

To be clear, this is your opinion that I disagree with.

> The user writes updates the policy->max/min, we return the call to the user 
> thinks that it has
> successfully written to the file and everything is aligned. But we may be
> running at an frequency from invalid range. Yes, that will happen very soon, 
> but
> its broken.

Does it really matter that the call may return before the new limit
takes effect?  I don't think so.

Moreover, if the limit is updated cross-CPU (ie. the CPU running the
call is not the one whose limit is updated) and the target CPU is
idle, waking it up just in order to make sure that the frequency limit
took effect is not particularly smart.

What really matters is how soon the limit can take effect and that in
both "fast switch" and "traditional" cases is about the same time, so
I don't see anything problematic here.

Actually, making sugov_limits() wait for the limit to take effect
would be a matter of adding three lines of code to it, but that would
change a relatively lightweight call into a really heavyweight one, so
I didn't add them, because I didn't see a point in doing that.

I still don't see a point in doing that for that matter.


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Steve Muckle
On Fri, May 27, 2016 at 01:41:02PM +0800, Wanpeng Li wrote:
> 2016-05-26 10:53 GMT+08:00 Steve Muckle :
> > The slow-path frequency transition path is relatively expensive as it
> > requires waking up a thread to do work. Should support be added for
> > remote CPU cpufreq updates that is also expensive since it requires an
> > IPI. These activities should be avoided if they are not necessary.
> >
> > To that end, calculate the actual driver-supported frequency required by
> > the new utilization value in schedutil by using the recently added
> > cpufreq_driver_resolve_freq callback. If it is the same as the
> > previously requested driver frequency then there is no need to continue
> > with the update assuming the cpu frequency limits have not changed. This
> > will have additional benefits should the semantics of the rate limit be
> > changed to apply solely to frequency transitions rather than to
> > frequency calculations in schedutil.
> 
> sugov_should_update_freq() still be called before get_nex_freq() after
> the patch applied, so rate limit still apply to both frequency
> transitions and frequency calculations, right?

Yes this series does not change the semantics of the rate limit. It
only includes the changes required for resolving raw target frequencies
to driver-supported frequencies so redundant operations can be avoided.

FWIW the approach I took to change the rate_limit semantics [0] just
moves where last_freq_update_time is set. I was going to return to that
once these changes are complete.

[0]: https://lkml.org/lkml/2016/5/9/857

thanks,
Steve


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Steve Muckle
On Fri, May 27, 2016 at 01:41:02PM +0800, Wanpeng Li wrote:
> 2016-05-26 10:53 GMT+08:00 Steve Muckle :
> > The slow-path frequency transition path is relatively expensive as it
> > requires waking up a thread to do work. Should support be added for
> > remote CPU cpufreq updates that is also expensive since it requires an
> > IPI. These activities should be avoided if they are not necessary.
> >
> > To that end, calculate the actual driver-supported frequency required by
> > the new utilization value in schedutil by using the recently added
> > cpufreq_driver_resolve_freq callback. If it is the same as the
> > previously requested driver frequency then there is no need to continue
> > with the update assuming the cpu frequency limits have not changed. This
> > will have additional benefits should the semantics of the rate limit be
> > changed to apply solely to frequency transitions rather than to
> > frequency calculations in schedutil.
> 
> sugov_should_update_freq() still be called before get_nex_freq() after
> the patch applied, so rate limit still apply to both frequency
> transitions and frequency calculations, right?

Yes this series does not change the semantics of the rate limit. It
only includes the changes required for resolving raw target frequencies
to driver-supported frequencies so redundant operations can be avoided.

FWIW the approach I took to change the rate_limit semantics [0] just
moves where last_freq_update_time is set. I was going to return to that
once these changes are complete.

[0]: https://lkml.org/lkml/2016/5/9/857

thanks,
Steve


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Steve Muckle
On Thu, May 26, 2016 at 12:46:29PM +0530, Viresh Kumar wrote:
> On 25-05-16, 19:53, Steve Muckle wrote:
> > The slow-path frequency transition path is relatively expensive as it
> > requires waking up a thread to do work. Should support be added for
> > remote CPU cpufreq updates that is also expensive since it requires an
> > IPI. These activities should be avoided if they are not necessary.
> > 
> > To that end, calculate the actual driver-supported frequency required by
> > the new utilization value in schedutil by using the recently added
> > cpufreq_driver_resolve_freq callback. If it is the same as the
> > previously requested driver frequency then there is no need to continue
> > with the update assuming the cpu frequency limits have not changed. This
> > will have additional benefits should the semantics of the rate limit be
> > changed to apply solely to frequency transitions rather than to
> > frequency calculations in schedutil.
> 
> Maybe mention here that this patch isn't avoiding those IPIs yet, I
> got an impression earlier that they are avoided with it.

Perhaps the problem is that my cover letter should have more clearly
specified that the earlier referenced series was an unaccepted draft?
I'll be more careful to note that in the future.

The text above (the second sentence) seems okay to me in that it
mentions remote updates are not currently supported. Let me know if
there is specific text you would like included.

thanks,
Steve


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Steve Muckle
On Thu, May 26, 2016 at 12:46:29PM +0530, Viresh Kumar wrote:
> On 25-05-16, 19:53, Steve Muckle wrote:
> > The slow-path frequency transition path is relatively expensive as it
> > requires waking up a thread to do work. Should support be added for
> > remote CPU cpufreq updates that is also expensive since it requires an
> > IPI. These activities should be avoided if they are not necessary.
> > 
> > To that end, calculate the actual driver-supported frequency required by
> > the new utilization value in schedutil by using the recently added
> > cpufreq_driver_resolve_freq callback. If it is the same as the
> > previously requested driver frequency then there is no need to continue
> > with the update assuming the cpu frequency limits have not changed. This
> > will have additional benefits should the semantics of the rate limit be
> > changed to apply solely to frequency transitions rather than to
> > frequency calculations in schedutil.
> 
> Maybe mention here that this patch isn't avoiding those IPIs yet, I
> got an impression earlier that they are avoided with it.

Perhaps the problem is that my cover letter should have more clearly
specified that the earlier referenced series was an unaccepted draft?
I'll be more careful to note that in the future.

The text above (the second sentence) seems okay to me in that it
mentions remote updates are not currently supported. Let me know if
there is specific text you would like included.

thanks,
Steve


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Viresh Kumar
I clearly missed the !policy->fast_switch_enabled check in sugov_limit() and so
the confusion.

On 30-05-16, 16:25, Rafael J. Wysocki wrote:
> On Mon, May 30, 2016 at 12:18 PM, Viresh Kumar  
> wrote:
> > Suppose this is the current range of frequencies supported by a
> > driver: 200, 400, 600, 800, 1000 (in MHz).
> >
> > And policy->cur = next_freq = 400 MHz.
> >
> > A.) Suppose that we change policy->min to 400 MHz from userspace.
> > -> sugov_limits()
> >This will find everything in order and simply set
> >need_freq_update, without updating the frequency.
> >
> > On next util-callback, we will forcefully return true from
> > sugov_should_update_freq() and reach sugov_update_commit().
> >
> > We calculate next_freq and that comes to 400 MHz again (that's the
> > case we are trying to target with the above code).
> >
> > With the current code, we will forcefully end up calling
> > cpufreq_driver_fast_switch().
> >
> > Because the new and current frequencies are same,
> > cpufreq_driver->fast_switch() will simply return.
> >
> > NOTE: I also think that cpufreq_driver_fast_switch() should have a
> > check like (policy->cur == target_freq). I will add that too, in
> > case you agree.
> >
> > So, forcefully updating next_freq to UINT_MAX will end up wasting
> > some cycles, but wouldn't do any useful stuff.
> 
> It will, but there's no way to distinguish this case from B in the
> governor with the current min/max synchronization mechanism.  That is,
> it only knows that something has changed, but checking what exactly
> has changed would be racy.
> 
> > B.) Suppose that we change policy->min to 600 MHz from userspace.
> > -> sugov_limits()
> >This will find that policy->cur is less than 600 and will set
> >that to 600 MHz by calling __cpufreq_driver_target(). We will
> >also set need_freq_update.
> >
> >Note that next_freq and policy->cur are not in sync anymore and
> >perhaps this is the most important case for the above code.
> 
> It is.
> 
> Moreover, please note that __cpufreq_driver_target() is only called in
> sugov_limits() when policy->fast_switch_enabled is unset.

Yep, I missed it.

I am not sure how harmful it can be, but we are returning from sugov_limits()
without making sure that policy->cur is in valid range currently. I also know
that you left it out because of the possible races with the util handler.

But this is something that is fundamentally broken for now. The user writes
updates the policy->max/min, we return the call to the user thinks that it has
successfully written to the file and everything is aligned. But we may be
running at an frequency from invalid range. Yes, that will happen very soon, but
its broken.

-- 
viresh


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Viresh Kumar
I clearly missed the !policy->fast_switch_enabled check in sugov_limit() and so
the confusion.

On 30-05-16, 16:25, Rafael J. Wysocki wrote:
> On Mon, May 30, 2016 at 12:18 PM, Viresh Kumar  
> wrote:
> > Suppose this is the current range of frequencies supported by a
> > driver: 200, 400, 600, 800, 1000 (in MHz).
> >
> > And policy->cur = next_freq = 400 MHz.
> >
> > A.) Suppose that we change policy->min to 400 MHz from userspace.
> > -> sugov_limits()
> >This will find everything in order and simply set
> >need_freq_update, without updating the frequency.
> >
> > On next util-callback, we will forcefully return true from
> > sugov_should_update_freq() and reach sugov_update_commit().
> >
> > We calculate next_freq and that comes to 400 MHz again (that's the
> > case we are trying to target with the above code).
> >
> > With the current code, we will forcefully end up calling
> > cpufreq_driver_fast_switch().
> >
> > Because the new and current frequencies are same,
> > cpufreq_driver->fast_switch() will simply return.
> >
> > NOTE: I also think that cpufreq_driver_fast_switch() should have a
> > check like (policy->cur == target_freq). I will add that too, in
> > case you agree.
> >
> > So, forcefully updating next_freq to UINT_MAX will end up wasting
> > some cycles, but wouldn't do any useful stuff.
> 
> It will, but there's no way to distinguish this case from B in the
> governor with the current min/max synchronization mechanism.  That is,
> it only knows that something has changed, but checking what exactly
> has changed would be racy.
> 
> > B.) Suppose that we change policy->min to 600 MHz from userspace.
> > -> sugov_limits()
> >This will find that policy->cur is less than 600 and will set
> >that to 600 MHz by calling __cpufreq_driver_target(). We will
> >also set need_freq_update.
> >
> >Note that next_freq and policy->cur are not in sync anymore and
> >perhaps this is the most important case for the above code.
> 
> It is.
> 
> Moreover, please note that __cpufreq_driver_target() is only called in
> sugov_limits() when policy->fast_switch_enabled is unset.

Yep, I missed it.

I am not sure how harmful it can be, but we are returning from sugov_limits()
without making sure that policy->cur is in valid range currently. I also know
that you left it out because of the possible races with the util handler.

But this is something that is fundamentally broken for now. The user writes
updates the policy->max/min, we return the call to the user thinks that it has
successfully written to the file and everything is aligned. But we may be
running at an frequency from invalid range. Yes, that will happen very soon, but
its broken.

-- 
viresh


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Rafael J. Wysocki
On Mon, May 30, 2016 at 12:18 PM, Viresh Kumar  wrote:
> On 29-05-16, 02:40, Rafael J. Wysocki wrote:
>> I can't really parse the above question, so I'm not going to try to
>> answer it. :-)
>
> Sorry about that :(
>
> IOW, I think that we should make this change into the sched-governor (I will
> send a patch separately if you agree to this):

I don't.

> diff --git a/kernel/sched/cpufreq_schedutil.c 
> b/kernel/sched/cpufreq_schedutil.c
> index 14c4aa25cc45..5934b14aa21c 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -66,11 +66,6 @@ static bool sugov_should_update_freq(struct sugov_policy 
> *sg_policy, u64 time)
>
> if (unlikely(sg_policy->need_freq_update)) {
> sg_policy->need_freq_update = false;
> -   /*
> -* This happens when limits change, so forget the previous
> -* next_freq value and force an update.
> -*/
> -   sg_policy->next_freq = UINT_MAX;
> return true;
> }
>
> And here is my reasoning behind this.
>
> Can you show me any case, where the above code (as present in mainline
> today) leads to a freq-change? I couldn't find any.
>
> Let me demonstrate.
>
> Following only talks about the fast-switch path, the other path is
> same as well.
>
> Suppose this is the current range of frequencies supported by a
> driver: 200, 400, 600, 800, 1000 (in MHz).
>
> And policy->cur = next_freq = 400 MHz.
>
> A.) Suppose that we change policy->min to 400 MHz from userspace.
> -> sugov_limits()
>This will find everything in order and simply set
>need_freq_update, without updating the frequency.
>
> On next util-callback, we will forcefully return true from
> sugov_should_update_freq() and reach sugov_update_commit().
>
> We calculate next_freq and that comes to 400 MHz again (that's the
> case we are trying to target with the above code).
>
> With the current code, we will forcefully end up calling
> cpufreq_driver_fast_switch().
>
> Because the new and current frequencies are same,
> cpufreq_driver->fast_switch() will simply return.
>
> NOTE: I also think that cpufreq_driver_fast_switch() should have a
> check like (policy->cur == target_freq). I will add that too, in
> case you agree.
>
> So, forcefully updating next_freq to UINT_MAX will end up wasting
> some cycles, but wouldn't do any useful stuff.

It will, but there's no way to distinguish this case from B in the
governor with the current min/max synchronization mechanism.  That is,
it only knows that something has changed, but checking what exactly
has changed would be racy.

> B.) Suppose that we change policy->min to 600 MHz from userspace.
> -> sugov_limits()
>This will find that policy->cur is less than 600 and will set
>that to 600 MHz by calling __cpufreq_driver_target(). We will
>also set need_freq_update.
>
>Note that next_freq and policy->cur are not in sync anymore and
>perhaps this is the most important case for the above code.

It is.

Moreover, please note that __cpufreq_driver_target() is only called in
sugov_limits() when policy->fast_switch_enabled is unset.

> On next util-callback, we will forcefully return true from
> sugov_should_update_freq() and reach sugov_update_commit().
>
> We calculate next_freq and lets say that comes to 400 MHz again
> (as that's the case we are trying to target with the above code).
>
> With the current code, we will forcefully end up calling
> cpufreq_driver_fast_switch().
>
> Because next_freq() is not part of the new range, we will clamp it
> and set it to 600 MHz eventually. Again, next and current
> frequencies are same, cpufreq_driver->fast_switch() will simply
> return.

Not really (as per the above).

And even in the !fast_switch_enabled case, if next_freq stays at 400
MHz (which is different from policy->cur), it may lead to suboptimal
decisions going forward (eg. if it goes to 600 MHz next time and the
governor will think that the frequency has changed, although in fact
it hasn't).

I guess the !fast_switch_enabled case might be optimized by comparing
next_freq with policy->cur at one point, but next_freq should be
updated regardless IMO.


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Rafael J. Wysocki
On Mon, May 30, 2016 at 12:18 PM, Viresh Kumar  wrote:
> On 29-05-16, 02:40, Rafael J. Wysocki wrote:
>> I can't really parse the above question, so I'm not going to try to
>> answer it. :-)
>
> Sorry about that :(
>
> IOW, I think that we should make this change into the sched-governor (I will
> send a patch separately if you agree to this):

I don't.

> diff --git a/kernel/sched/cpufreq_schedutil.c 
> b/kernel/sched/cpufreq_schedutil.c
> index 14c4aa25cc45..5934b14aa21c 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -66,11 +66,6 @@ static bool sugov_should_update_freq(struct sugov_policy 
> *sg_policy, u64 time)
>
> if (unlikely(sg_policy->need_freq_update)) {
> sg_policy->need_freq_update = false;
> -   /*
> -* This happens when limits change, so forget the previous
> -* next_freq value and force an update.
> -*/
> -   sg_policy->next_freq = UINT_MAX;
> return true;
> }
>
> And here is my reasoning behind this.
>
> Can you show me any case, where the above code (as present in mainline
> today) leads to a freq-change? I couldn't find any.
>
> Let me demonstrate.
>
> Following only talks about the fast-switch path, the other path is
> same as well.
>
> Suppose this is the current range of frequencies supported by a
> driver: 200, 400, 600, 800, 1000 (in MHz).
>
> And policy->cur = next_freq = 400 MHz.
>
> A.) Suppose that we change policy->min to 400 MHz from userspace.
> -> sugov_limits()
>This will find everything in order and simply set
>need_freq_update, without updating the frequency.
>
> On next util-callback, we will forcefully return true from
> sugov_should_update_freq() and reach sugov_update_commit().
>
> We calculate next_freq and that comes to 400 MHz again (that's the
> case we are trying to target with the above code).
>
> With the current code, we will forcefully end up calling
> cpufreq_driver_fast_switch().
>
> Because the new and current frequencies are same,
> cpufreq_driver->fast_switch() will simply return.
>
> NOTE: I also think that cpufreq_driver_fast_switch() should have a
> check like (policy->cur == target_freq). I will add that too, in
> case you agree.
>
> So, forcefully updating next_freq to UINT_MAX will end up wasting
> some cycles, but wouldn't do any useful stuff.

It will, but there's no way to distinguish this case from B in the
governor with the current min/max synchronization mechanism.  That is,
it only knows that something has changed, but checking what exactly
has changed would be racy.

> B.) Suppose that we change policy->min to 600 MHz from userspace.
> -> sugov_limits()
>This will find that policy->cur is less than 600 and will set
>that to 600 MHz by calling __cpufreq_driver_target(). We will
>also set need_freq_update.
>
>Note that next_freq and policy->cur are not in sync anymore and
>perhaps this is the most important case for the above code.

It is.

Moreover, please note that __cpufreq_driver_target() is only called in
sugov_limits() when policy->fast_switch_enabled is unset.

> On next util-callback, we will forcefully return true from
> sugov_should_update_freq() and reach sugov_update_commit().
>
> We calculate next_freq and lets say that comes to 400 MHz again
> (as that's the case we are trying to target with the above code).
>
> With the current code, we will forcefully end up calling
> cpufreq_driver_fast_switch().
>
> Because next_freq() is not part of the new range, we will clamp it
> and set it to 600 MHz eventually. Again, next and current
> frequencies are same, cpufreq_driver->fast_switch() will simply
> return.

Not really (as per the above).

And even in the !fast_switch_enabled case, if next_freq stays at 400
MHz (which is different from policy->cur), it may lead to suboptimal
decisions going forward (eg. if it goes to 600 MHz next time and the
governor will think that the frequency has changed, although in fact
it hasn't).

I guess the !fast_switch_enabled case might be optimized by comparing
next_freq with policy->cur at one point, but next_freq should be
updated regardless IMO.


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Viresh Kumar
On 29-05-16, 02:40, Rafael J. Wysocki wrote:
> I can't really parse the above question, so I'm not going to try to
> answer it. :-)

Sorry about that :(

IOW, I think that we should make this change into the sched-governor (I will
send a patch separately if you agree to this):

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 14c4aa25cc45..5934b14aa21c 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -66,11 +66,6 @@ static bool sugov_should_update_freq(struct sugov_policy 
*sg_policy, u64 time)
 
if (unlikely(sg_policy->need_freq_update)) {
sg_policy->need_freq_update = false;
-   /*
-* This happens when limits change, so forget the previous
-* next_freq value and force an update.
-*/
-   sg_policy->next_freq = UINT_MAX;
return true;
}

And here is my reasoning behind this.

Can you show me any case, where the above code (as present in mainline
today) leads to a freq-change? I couldn't find any.

Let me demonstrate.

Following only talks about the fast-switch path, the other path is
same as well.

Suppose this is the current range of frequencies supported by a
driver: 200, 400, 600, 800, 1000 (in MHz).

And policy->cur = next_freq = 400 MHz.

A.) Suppose that we change policy->min to 400 MHz from userspace.
-> sugov_limits()
   This will find everything in order and simply set
   need_freq_update, without updating the frequency.

On next util-callback, we will forcefully return true from
sugov_should_update_freq() and reach sugov_update_commit().

We calculate next_freq and that comes to 400 MHz again (that's the
case we are trying to target with the above code).

With the current code, we will forcefully end up calling
cpufreq_driver_fast_switch().

Because the new and current frequencies are same,
cpufreq_driver->fast_switch() will simply return.

NOTE: I also think that cpufreq_driver_fast_switch() should have a
check like (policy->cur == target_freq). I will add that too, in
case you agree.

So, forcefully updating next_freq to UINT_MAX will end up wasting
some cycles, but wouldn't do any useful stuff.

B.) Suppose that we change policy->min to 600 MHz from userspace.
-> sugov_limits()
   This will find that policy->cur is less than 600 and will set
   that to 600 MHz by calling __cpufreq_driver_target(). We will
   also set need_freq_update.

   Note that next_freq and policy->cur are not in sync anymore and
   perhaps this is the most important case for the above code.

On next util-callback, we will forcefully return true from
sugov_should_update_freq() and reach sugov_update_commit().

We calculate next_freq and lets say that comes to 400 MHz again
(as that's the case we are trying to target with the above code).

With the current code, we will forcefully end up calling
cpufreq_driver_fast_switch().

Because next_freq() is not part of the new range, we will clamp it
and set it to 600 MHz eventually. Again, next and current
frequencies are same, cpufreq_driver->fast_switch() will simply
return.

So, forcefully updating next_freq to UINT_MAX will end up wasting
some cycles here as well, but would do any useful stuff.


What am I missing ?

-- 
viresh


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-30 Thread Viresh Kumar
On 29-05-16, 02:40, Rafael J. Wysocki wrote:
> I can't really parse the above question, so I'm not going to try to
> answer it. :-)

Sorry about that :(

IOW, I think that we should make this change into the sched-governor (I will
send a patch separately if you agree to this):

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 14c4aa25cc45..5934b14aa21c 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -66,11 +66,6 @@ static bool sugov_should_update_freq(struct sugov_policy 
*sg_policy, u64 time)
 
if (unlikely(sg_policy->need_freq_update)) {
sg_policy->need_freq_update = false;
-   /*
-* This happens when limits change, so forget the previous
-* next_freq value and force an update.
-*/
-   sg_policy->next_freq = UINT_MAX;
return true;
}

And here is my reasoning behind this.

Can you show me any case, where the above code (as present in mainline
today) leads to a freq-change? I couldn't find any.

Let me demonstrate.

Following only talks about the fast-switch path, the other path is
same as well.

Suppose this is the current range of frequencies supported by a
driver: 200, 400, 600, 800, 1000 (in MHz).

And policy->cur = next_freq = 400 MHz.

A.) Suppose that we change policy->min to 400 MHz from userspace.
-> sugov_limits()
   This will find everything in order and simply set
   need_freq_update, without updating the frequency.

On next util-callback, we will forcefully return true from
sugov_should_update_freq() and reach sugov_update_commit().

We calculate next_freq and that comes to 400 MHz again (that's the
case we are trying to target with the above code).

With the current code, we will forcefully end up calling
cpufreq_driver_fast_switch().

Because the new and current frequencies are same,
cpufreq_driver->fast_switch() will simply return.

NOTE: I also think that cpufreq_driver_fast_switch() should have a
check like (policy->cur == target_freq). I will add that too, in
case you agree.

So, forcefully updating next_freq to UINT_MAX will end up wasting
some cycles, but wouldn't do any useful stuff.

B.) Suppose that we change policy->min to 600 MHz from userspace.
-> sugov_limits()
   This will find that policy->cur is less than 600 and will set
   that to 600 MHz by calling __cpufreq_driver_target(). We will
   also set need_freq_update.

   Note that next_freq and policy->cur are not in sync anymore and
   perhaps this is the most important case for the above code.

On next util-callback, we will forcefully return true from
sugov_should_update_freq() and reach sugov_update_commit().

We calculate next_freq and lets say that comes to 400 MHz again
(as that's the case we are trying to target with the above code).

With the current code, we will forcefully end up calling
cpufreq_driver_fast_switch().

Because next_freq() is not part of the new range, we will clamp it
and set it to 600 MHz eventually. Again, next and current
frequencies are same, cpufreq_driver->fast_switch() will simply
return.

So, forcefully updating next_freq to UINT_MAX will end up wasting
some cycles here as well, but would do any useful stuff.


What am I missing ?

-- 
viresh


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-28 Thread Rafael J. Wysocki
On Thu, May 26, 2016 at 9:16 AM, Viresh Kumar  wrote:
> On 25-05-16, 19:53, Steve Muckle wrote:
>> The slow-path frequency transition path is relatively expensive as it
>> requires waking up a thread to do work. Should support be added for
>> remote CPU cpufreq updates that is also expensive since it requires an
>> IPI. These activities should be avoided if they are not necessary.
>>
>> To that end, calculate the actual driver-supported frequency required by
>> the new utilization value in schedutil by using the recently added
>> cpufreq_driver_resolve_freq callback. If it is the same as the
>> previously requested driver frequency then there is no need to continue
>> with the update assuming the cpu frequency limits have not changed. This
>> will have additional benefits should the semantics of the rate limit be
>> changed to apply solely to frequency transitions rather than to
>> frequency calculations in schedutil.

[cut]

> I also have a doubt (I am quite sure Rafael will have a reason for
> that, which I am failing to understand now), on why we are doing
> next_freq == UINT_MAX in sugov_should_update_freq().
>
> I understand that because the limits might have changed,
> need_freq_update would have been set to true. We should evaluate
> next-freq again without worrying about the load or the time since last
> evaluation.

This is in response to the "limits" event (or to the ->limits call
after my recent patches).  That event basically means "something has
changed, so if you have cached anything, invalidate it" to the
governor.  Accordingly, it invalidates next_freq, because that's a
cached value.

> But what will happen by forcefully calling the cpufreq routines to
> change the frequency, if next_freq hasn't changed even after limits
> updates?

I can't really parse the above question, so I'm not going to try to
answer it. :-)

> Wouldn't that call always return early because the new freq
> and the current freq are going to be same ?
>
> @Rafael: Sorry for asking this so late :(

It is not too late.  If there's a problem somewhere, it needs to be
fixed, but at this point I have no idea what you are asking about.


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-28 Thread Rafael J. Wysocki
On Thu, May 26, 2016 at 9:16 AM, Viresh Kumar  wrote:
> On 25-05-16, 19:53, Steve Muckle wrote:
>> The slow-path frequency transition path is relatively expensive as it
>> requires waking up a thread to do work. Should support be added for
>> remote CPU cpufreq updates that is also expensive since it requires an
>> IPI. These activities should be avoided if they are not necessary.
>>
>> To that end, calculate the actual driver-supported frequency required by
>> the new utilization value in schedutil by using the recently added
>> cpufreq_driver_resolve_freq callback. If it is the same as the
>> previously requested driver frequency then there is no need to continue
>> with the update assuming the cpu frequency limits have not changed. This
>> will have additional benefits should the semantics of the rate limit be
>> changed to apply solely to frequency transitions rather than to
>> frequency calculations in schedutil.

[cut]

> I also have a doubt (I am quite sure Rafael will have a reason for
> that, which I am failing to understand now), on why we are doing
> next_freq == UINT_MAX in sugov_should_update_freq().
>
> I understand that because the limits might have changed,
> need_freq_update would have been set to true. We should evaluate
> next-freq again without worrying about the load or the time since last
> evaluation.

This is in response to the "limits" event (or to the ->limits call
after my recent patches).  That event basically means "something has
changed, so if you have cached anything, invalidate it" to the
governor.  Accordingly, it invalidates next_freq, because that's a
cached value.

> But what will happen by forcefully calling the cpufreq routines to
> change the frequency, if next_freq hasn't changed even after limits
> updates?

I can't really parse the above question, so I'm not going to try to
answer it. :-)

> Wouldn't that call always return early because the new freq
> and the current freq are going to be same ?
>
> @Rafael: Sorry for asking this so late :(

It is not too late.  If there's a problem somewhere, it needs to be
fixed, but at this point I have no idea what you are asking about.


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-26 Thread Wanpeng Li
2016-05-26 10:53 GMT+08:00 Steve Muckle :
> The slow-path frequency transition path is relatively expensive as it
> requires waking up a thread to do work. Should support be added for
> remote CPU cpufreq updates that is also expensive since it requires an
> IPI. These activities should be avoided if they are not necessary.
>
> To that end, calculate the actual driver-supported frequency required by
> the new utilization value in schedutil by using the recently added
> cpufreq_driver_resolve_freq callback. If it is the same as the
> previously requested driver frequency then there is no need to continue
> with the update assuming the cpu frequency limits have not changed. This
> will have additional benefits should the semantics of the rate limit be
> changed to apply solely to frequency transitions rather than to
> frequency calculations in schedutil.

sugov_should_update_freq() still be called before get_nex_freq() after
the patch applied, so rate limit still apply to both frequency
transitions and frequency calculations, right?

Regards,
Wanpeng Li


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-26 Thread Wanpeng Li
2016-05-26 10:53 GMT+08:00 Steve Muckle :
> The slow-path frequency transition path is relatively expensive as it
> requires waking up a thread to do work. Should support be added for
> remote CPU cpufreq updates that is also expensive since it requires an
> IPI. These activities should be avoided if they are not necessary.
>
> To that end, calculate the actual driver-supported frequency required by
> the new utilization value in schedutil by using the recently added
> cpufreq_driver_resolve_freq callback. If it is the same as the
> previously requested driver frequency then there is no need to continue
> with the update assuming the cpu frequency limits have not changed. This
> will have additional benefits should the semantics of the rate limit be
> changed to apply solely to frequency transitions rather than to
> frequency calculations in schedutil.

sugov_should_update_freq() still be called before get_nex_freq() after
the patch applied, so rate limit still apply to both frequency
transitions and frequency calculations, right?

Regards,
Wanpeng Li


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-26 Thread Viresh Kumar
On 25-05-16, 19:53, Steve Muckle wrote:
> The slow-path frequency transition path is relatively expensive as it
> requires waking up a thread to do work. Should support be added for
> remote CPU cpufreq updates that is also expensive since it requires an
> IPI. These activities should be avoided if they are not necessary.
> 
> To that end, calculate the actual driver-supported frequency required by
> the new utilization value in schedutil by using the recently added
> cpufreq_driver_resolve_freq callback. If it is the same as the
> previously requested driver frequency then there is no need to continue
> with the update assuming the cpu frequency limits have not changed. This
> will have additional benefits should the semantics of the rate limit be
> changed to apply solely to frequency transitions rather than to
> frequency calculations in schedutil.

Maybe mention here that this patch isn't avoiding those IPIs yet, I
got an impression earlier that they are avoided with it.

> The last raw required frequency is cached. This allows the driver
> frequency lookup to be skipped in the event that the new raw required
> frequency matches the last one, assuming a frequency update has not been
> forced due to limits changing (indicated by a next_freq value of
> UINT_MAX, see sugov_should_update_freq).

I am not sure this is required, see below..

> Signed-off-by: Steve Muckle 
> ---
>  kernel/sched/cpufreq_schedutil.c | 30 ++
>  1 file changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c 
> b/kernel/sched/cpufreq_schedutil.c
> index 154ae3a51e86..ef73ca4d8d78 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -45,6 +45,8 @@ struct sugov_cpu {
>   struct update_util_data update_util;
>   struct sugov_policy *sg_policy;
>  
> + unsigned int cached_raw_freq;
> +
>   /* The fields below are only needed when sharing a policy. */
>   unsigned long util;
>   unsigned long max;
> @@ -104,7 +106,7 @@ static void sugov_update_commit(struct sugov_policy 
> *sg_policy, u64 time,
>  
>  /**
>   * get_next_freq - Compute a new frequency for a given cpufreq policy.
> - * @policy: cpufreq policy object to compute the new frequency for.
> + * @sg_cpu: schedutil cpu object to compute the new frequency for.
>   * @util: Current CPU utilization.
>   * @max: CPU capacity.
>   *
> @@ -119,14 +121,24 @@ static void sugov_update_commit(struct sugov_policy 
> *sg_policy, u64 time,
>   * next_freq = C * curr_freq * util_raw / max
>   *
>   * Take C = 1.25 for the frequency tipping point at (util / max) = 0.8.
> + *
> + * The lowest driver-supported frequency which is equal or greater than the 
> raw
> + * next_freq (as calculated above) is returned.
>   */
> -static unsigned int get_next_freq(struct cpufreq_policy *policy,
> -   unsigned long util, unsigned long max)
> +static unsigned int get_next_freq(struct sugov_cpu *sg_cpu, unsigned long 
> util,
> +   unsigned long max)
>  {
> + struct sugov_policy *sg_policy = sg_cpu->sg_policy;
> + struct cpufreq_policy *policy = sg_policy->policy;
>   unsigned int freq = arch_scale_freq_invariant() ?
>   policy->cpuinfo.max_freq : policy->cur;
>  
> - return (freq + (freq >> 2)) * util / max;
> + freq = (freq + (freq >> 2)) * util / max;
> +
> + if (freq == sg_cpu->cached_raw_freq && sg_policy->next_freq != UINT_MAX)
> + return sg_policy->next_freq;

I am not sure what the benefit of the second comparison to UINT_MAX
is. The output of below code will be same if the freq was ==
cached_raw_freq, no matter what.

I also have a doubt (I am quite sure Rafael will have a reason for
that, which I am failing to understand now), on why we are doing
next_freq == UINT_MAX in sugov_should_update_freq().

I understand that because the limits might have changed,
need_freq_update would have been set to true. We should evaluate
next-freq again without worrying about the load or the time since last
evaluation.

But what will happen by forcefully calling the cpufreq routines to
change the frequency, if next_freq hasn't changed even after limits
updates? Wouldn't that call always return early because the new freq
and the current freq are going to be same ?

@Rafael: Sorry for asking this so late :(

-- 
viresh


Re: [PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-26 Thread Viresh Kumar
On 25-05-16, 19:53, Steve Muckle wrote:
> The slow-path frequency transition path is relatively expensive as it
> requires waking up a thread to do work. Should support be added for
> remote CPU cpufreq updates that is also expensive since it requires an
> IPI. These activities should be avoided if they are not necessary.
> 
> To that end, calculate the actual driver-supported frequency required by
> the new utilization value in schedutil by using the recently added
> cpufreq_driver_resolve_freq callback. If it is the same as the
> previously requested driver frequency then there is no need to continue
> with the update assuming the cpu frequency limits have not changed. This
> will have additional benefits should the semantics of the rate limit be
> changed to apply solely to frequency transitions rather than to
> frequency calculations in schedutil.

Maybe mention here that this patch isn't avoiding those IPIs yet, I
got an impression earlier that they are avoided with it.

> The last raw required frequency is cached. This allows the driver
> frequency lookup to be skipped in the event that the new raw required
> frequency matches the last one, assuming a frequency update has not been
> forced due to limits changing (indicated by a next_freq value of
> UINT_MAX, see sugov_should_update_freq).

I am not sure this is required, see below..

> Signed-off-by: Steve Muckle 
> ---
>  kernel/sched/cpufreq_schedutil.c | 30 ++
>  1 file changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c 
> b/kernel/sched/cpufreq_schedutil.c
> index 154ae3a51e86..ef73ca4d8d78 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -45,6 +45,8 @@ struct sugov_cpu {
>   struct update_util_data update_util;
>   struct sugov_policy *sg_policy;
>  
> + unsigned int cached_raw_freq;
> +
>   /* The fields below are only needed when sharing a policy. */
>   unsigned long util;
>   unsigned long max;
> @@ -104,7 +106,7 @@ static void sugov_update_commit(struct sugov_policy 
> *sg_policy, u64 time,
>  
>  /**
>   * get_next_freq - Compute a new frequency for a given cpufreq policy.
> - * @policy: cpufreq policy object to compute the new frequency for.
> + * @sg_cpu: schedutil cpu object to compute the new frequency for.
>   * @util: Current CPU utilization.
>   * @max: CPU capacity.
>   *
> @@ -119,14 +121,24 @@ static void sugov_update_commit(struct sugov_policy 
> *sg_policy, u64 time,
>   * next_freq = C * curr_freq * util_raw / max
>   *
>   * Take C = 1.25 for the frequency tipping point at (util / max) = 0.8.
> + *
> + * The lowest driver-supported frequency which is equal or greater than the 
> raw
> + * next_freq (as calculated above) is returned.
>   */
> -static unsigned int get_next_freq(struct cpufreq_policy *policy,
> -   unsigned long util, unsigned long max)
> +static unsigned int get_next_freq(struct sugov_cpu *sg_cpu, unsigned long 
> util,
> +   unsigned long max)
>  {
> + struct sugov_policy *sg_policy = sg_cpu->sg_policy;
> + struct cpufreq_policy *policy = sg_policy->policy;
>   unsigned int freq = arch_scale_freq_invariant() ?
>   policy->cpuinfo.max_freq : policy->cur;
>  
> - return (freq + (freq >> 2)) * util / max;
> + freq = (freq + (freq >> 2)) * util / max;
> +
> + if (freq == sg_cpu->cached_raw_freq && sg_policy->next_freq != UINT_MAX)
> + return sg_policy->next_freq;

I am not sure what the benefit of the second comparison to UINT_MAX
is. The output of below code will be same if the freq was ==
cached_raw_freq, no matter what.

I also have a doubt (I am quite sure Rafael will have a reason for
that, which I am failing to understand now), on why we are doing
next_freq == UINT_MAX in sugov_should_update_freq().

I understand that because the limits might have changed,
need_freq_update would have been set to true. We should evaluate
next-freq again without worrying about the load or the time since last
evaluation.

But what will happen by forcefully calling the cpufreq routines to
change the frequency, if next_freq hasn't changed even after limits
updates? Wouldn't that call always return early because the new freq
and the current freq are going to be same ?

@Rafael: Sorry for asking this so late :(

-- 
viresh


[PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-25 Thread Steve Muckle
The slow-path frequency transition path is relatively expensive as it
requires waking up a thread to do work. Should support be added for
remote CPU cpufreq updates that is also expensive since it requires an
IPI. These activities should be avoided if they are not necessary.

To that end, calculate the actual driver-supported frequency required by
the new utilization value in schedutil by using the recently added
cpufreq_driver_resolve_freq callback. If it is the same as the
previously requested driver frequency then there is no need to continue
with the update assuming the cpu frequency limits have not changed. This
will have additional benefits should the semantics of the rate limit be
changed to apply solely to frequency transitions rather than to
frequency calculations in schedutil.

The last raw required frequency is cached. This allows the driver
frequency lookup to be skipped in the event that the new raw required
frequency matches the last one, assuming a frequency update has not been
forced due to limits changing (indicated by a next_freq value of
UINT_MAX, see sugov_should_update_freq).

Signed-off-by: Steve Muckle 
---
 kernel/sched/cpufreq_schedutil.c | 30 ++
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 154ae3a51e86..ef73ca4d8d78 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -45,6 +45,8 @@ struct sugov_cpu {
struct update_util_data update_util;
struct sugov_policy *sg_policy;
 
+   unsigned int cached_raw_freq;
+
/* The fields below are only needed when sharing a policy. */
unsigned long util;
unsigned long max;
@@ -104,7 +106,7 @@ static void sugov_update_commit(struct sugov_policy 
*sg_policy, u64 time,
 
 /**
  * get_next_freq - Compute a new frequency for a given cpufreq policy.
- * @policy: cpufreq policy object to compute the new frequency for.
+ * @sg_cpu: schedutil cpu object to compute the new frequency for.
  * @util: Current CPU utilization.
  * @max: CPU capacity.
  *
@@ -119,14 +121,24 @@ static void sugov_update_commit(struct sugov_policy 
*sg_policy, u64 time,
  * next_freq = C * curr_freq * util_raw / max
  *
  * Take C = 1.25 for the frequency tipping point at (util / max) = 0.8.
+ *
+ * The lowest driver-supported frequency which is equal or greater than the raw
+ * next_freq (as calculated above) is returned.
  */
-static unsigned int get_next_freq(struct cpufreq_policy *policy,
- unsigned long util, unsigned long max)
+static unsigned int get_next_freq(struct sugov_cpu *sg_cpu, unsigned long util,
+ unsigned long max)
 {
+   struct sugov_policy *sg_policy = sg_cpu->sg_policy;
+   struct cpufreq_policy *policy = sg_policy->policy;
unsigned int freq = arch_scale_freq_invariant() ?
policy->cpuinfo.max_freq : policy->cur;
 
-   return (freq + (freq >> 2)) * util / max;
+   freq = (freq + (freq >> 2)) * util / max;
+
+   if (freq == sg_cpu->cached_raw_freq && sg_policy->next_freq != UINT_MAX)
+   return sg_policy->next_freq;
+   sg_cpu->cached_raw_freq = freq;
+   return cpufreq_driver_resolve_freq(policy, freq);
 }
 
 static void sugov_update_single(struct update_util_data *hook, u64 time,
@@ -141,13 +153,14 @@ static void sugov_update_single(struct update_util_data 
*hook, u64 time,
return;
 
next_f = util == ULONG_MAX ? policy->cpuinfo.max_freq :
-   get_next_freq(policy, util, max);
+   get_next_freq(sg_cpu, util, max);
sugov_update_commit(sg_policy, time, next_f);
 }
 
-static unsigned int sugov_next_freq_shared(struct sugov_policy *sg_policy,
+static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu,
   unsigned long util, unsigned long 
max)
 {
+   struct sugov_policy *sg_policy = sg_cpu->sg_policy;
struct cpufreq_policy *policy = sg_policy->policy;
unsigned int max_f = policy->cpuinfo.max_freq;
u64 last_freq_update_time = sg_policy->last_freq_update_time;
@@ -187,7 +200,7 @@ static unsigned int sugov_next_freq_shared(struct 
sugov_policy *sg_policy,
}
}
 
-   return get_next_freq(policy, util, max);
+   return get_next_freq(sg_cpu, util, max);
 }
 
 static void sugov_update_shared(struct update_util_data *hook, u64 time,
@@ -204,7 +217,7 @@ static void sugov_update_shared(struct update_util_data 
*hook, u64 time,
sg_cpu->last_update = time;
 
if (sugov_should_update_freq(sg_policy, time)) {
-   next_f = sugov_next_freq_shared(sg_policy, util, max);
+   next_f = sugov_next_freq_shared(sg_cpu, util, max);
sugov_update_commit(sg_policy, time, next_f);
}
 
@@ -432,6 

[PATCH v2 3/3] cpufreq: schedutil: map raw required frequency to driver frequency

2016-05-25 Thread Steve Muckle
The slow-path frequency transition path is relatively expensive as it
requires waking up a thread to do work. Should support be added for
remote CPU cpufreq updates that is also expensive since it requires an
IPI. These activities should be avoided if they are not necessary.

To that end, calculate the actual driver-supported frequency required by
the new utilization value in schedutil by using the recently added
cpufreq_driver_resolve_freq callback. If it is the same as the
previously requested driver frequency then there is no need to continue
with the update assuming the cpu frequency limits have not changed. This
will have additional benefits should the semantics of the rate limit be
changed to apply solely to frequency transitions rather than to
frequency calculations in schedutil.

The last raw required frequency is cached. This allows the driver
frequency lookup to be skipped in the event that the new raw required
frequency matches the last one, assuming a frequency update has not been
forced due to limits changing (indicated by a next_freq value of
UINT_MAX, see sugov_should_update_freq).

Signed-off-by: Steve Muckle 
---
 kernel/sched/cpufreq_schedutil.c | 30 ++
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 154ae3a51e86..ef73ca4d8d78 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -45,6 +45,8 @@ struct sugov_cpu {
struct update_util_data update_util;
struct sugov_policy *sg_policy;
 
+   unsigned int cached_raw_freq;
+
/* The fields below are only needed when sharing a policy. */
unsigned long util;
unsigned long max;
@@ -104,7 +106,7 @@ static void sugov_update_commit(struct sugov_policy 
*sg_policy, u64 time,
 
 /**
  * get_next_freq - Compute a new frequency for a given cpufreq policy.
- * @policy: cpufreq policy object to compute the new frequency for.
+ * @sg_cpu: schedutil cpu object to compute the new frequency for.
  * @util: Current CPU utilization.
  * @max: CPU capacity.
  *
@@ -119,14 +121,24 @@ static void sugov_update_commit(struct sugov_policy 
*sg_policy, u64 time,
  * next_freq = C * curr_freq * util_raw / max
  *
  * Take C = 1.25 for the frequency tipping point at (util / max) = 0.8.
+ *
+ * The lowest driver-supported frequency which is equal or greater than the raw
+ * next_freq (as calculated above) is returned.
  */
-static unsigned int get_next_freq(struct cpufreq_policy *policy,
- unsigned long util, unsigned long max)
+static unsigned int get_next_freq(struct sugov_cpu *sg_cpu, unsigned long util,
+ unsigned long max)
 {
+   struct sugov_policy *sg_policy = sg_cpu->sg_policy;
+   struct cpufreq_policy *policy = sg_policy->policy;
unsigned int freq = arch_scale_freq_invariant() ?
policy->cpuinfo.max_freq : policy->cur;
 
-   return (freq + (freq >> 2)) * util / max;
+   freq = (freq + (freq >> 2)) * util / max;
+
+   if (freq == sg_cpu->cached_raw_freq && sg_policy->next_freq != UINT_MAX)
+   return sg_policy->next_freq;
+   sg_cpu->cached_raw_freq = freq;
+   return cpufreq_driver_resolve_freq(policy, freq);
 }
 
 static void sugov_update_single(struct update_util_data *hook, u64 time,
@@ -141,13 +153,14 @@ static void sugov_update_single(struct update_util_data 
*hook, u64 time,
return;
 
next_f = util == ULONG_MAX ? policy->cpuinfo.max_freq :
-   get_next_freq(policy, util, max);
+   get_next_freq(sg_cpu, util, max);
sugov_update_commit(sg_policy, time, next_f);
 }
 
-static unsigned int sugov_next_freq_shared(struct sugov_policy *sg_policy,
+static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu,
   unsigned long util, unsigned long 
max)
 {
+   struct sugov_policy *sg_policy = sg_cpu->sg_policy;
struct cpufreq_policy *policy = sg_policy->policy;
unsigned int max_f = policy->cpuinfo.max_freq;
u64 last_freq_update_time = sg_policy->last_freq_update_time;
@@ -187,7 +200,7 @@ static unsigned int sugov_next_freq_shared(struct 
sugov_policy *sg_policy,
}
}
 
-   return get_next_freq(policy, util, max);
+   return get_next_freq(sg_cpu, util, max);
 }
 
 static void sugov_update_shared(struct update_util_data *hook, u64 time,
@@ -204,7 +217,7 @@ static void sugov_update_shared(struct update_util_data 
*hook, u64 time,
sg_cpu->last_update = time;
 
if (sugov_should_update_freq(sg_policy, time)) {
-   next_f = sugov_next_freq_shared(sg_policy, util, max);
+   next_f = sugov_next_freq_shared(sg_cpu, util, max);
sugov_update_commit(sg_policy, time, next_f);
}
 
@@ -432,6 +445,7 @@ static int