Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-09-04 Thread Rafael J. Wysocki
On Thursday, September 04, 2014 04:07:28 PM Viresh Kumar wrote:
> On 4 September 2014 15:33, Preeti U Murthy  wrote:
> > I think Rafael's point was that since no driver that had implemented the
> > target_index callback was using it at the time that this patch was
> > proposed, it was be best to couple the check on existence of stop_cpu
> > callback with the the check on the kind of cpufreq driver. However
> > powerpc is also in need of this today and we implement the target_index
> > callback and find it convenient to use the stop CPU callback.
> 
> No, this is what he said..
> 
> "
> So to me, (1) the new ->stop() should *only* be called for ->setpolicy 
> drivers,
> because the purpose of it should be to "allow ->setpolicy drivers to do what 
> the
> GOV_STOP will do for regular drivers"
> "
> 
> > Rafael, in which case would it not make sense to remove the check on
> > driver->setpolicy above?
> >
> > Besides, I don't understand very well why we had this double check in
> > the first place. Only if the drivers are in need of the functionality
> > like stop_cpu, would they have implemented this callback right? If we
> > are to assume that the drivers which have implemented the target_index
> > callback should never be needing it, they would not have implemented the
> > stop CPU callback either. So what was that, which was blatantly wrong
> > with just having a check on stop_cpu? I did go through the discussion
> > but did not find a convincing answer to this.
> 
> The idea was to get something similar to GOV_STOP for setpolicy drivers.
> But in the end we didn't get to that. What we do in GOV_STOP is stop
> changing CPUs frequency, but here in stop_cpu() we can stop changing
> CPUs frequency OR take it to minimum, whatever we want..
> 
> As I said earlier, probably we should just do what you did in your patch +
> some documentation changes.

OK, if that works for everybody.  For one, I wouldn't like to end up with a
callback used for different things in every drvier implementing it.

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-09-04 Thread Viresh Kumar
On 4 September 2014 15:33, Preeti U Murthy  wrote:
> I think Rafael's point was that since no driver that had implemented the
> target_index callback was using it at the time that this patch was
> proposed, it was be best to couple the check on existence of stop_cpu
> callback with the the check on the kind of cpufreq driver. However
> powerpc is also in need of this today and we implement the target_index
> callback and find it convenient to use the stop CPU callback.

No, this is what he said..

"
So to me, (1) the new ->stop() should *only* be called for ->setpolicy drivers,
because the purpose of it should be to "allow ->setpolicy drivers to do what the
GOV_STOP will do for regular drivers"
"

> Rafael, in which case would it not make sense to remove the check on
> driver->setpolicy above?
>
> Besides, I don't understand very well why we had this double check in
> the first place. Only if the drivers are in need of the functionality
> like stop_cpu, would they have implemented this callback right? If we
> are to assume that the drivers which have implemented the target_index
> callback should never be needing it, they would not have implemented the
> stop CPU callback either. So what was that, which was blatantly wrong
> with just having a check on stop_cpu? I did go through the discussion
> but did not find a convincing answer to this.

The idea was to get something similar to GOV_STOP for setpolicy drivers.
But in the end we didn't get to that. What we do in GOV_STOP is stop
changing CPUs frequency, but here in stop_cpu() we can stop changing
CPUs frequency OR take it to minimum, whatever we want..

As I said earlier, probably we should just do what you did in your patch +
some documentation changes.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-09-04 Thread Preeti U Murthy
On 09/04/2014 02:46 PM, Viresh Kumar wrote:
> On 4 September 2014 14:40, Preeti U Murthy  wrote:
>> cpufreq: Allow stop CPU callback to be used by all cpufreq drivers
>>
>> Commit 367dc4aa introduced the stop CPU callback for intel_pstate
>> drivers. During the CPU_DOWN_PREPARE stage, this callback is invoked
>> so that drivers can take some action on the pstate of the cpu
>> before it is taken offline. This callback was assumed to be useful
>> only for those drivers which have implemented the set_policy CPU
>> callback because they have no other way to take action about the
>> cpufreq of a CPU which is being hotplugged out except in the exit
>> callback which is called very late in the offline process.
>>
>> The drivers which implement the target/target_index callbacks were
>> expected to take care of requirements like the ones that commit
>> 367dc4aa addresses in the GOV_STOP notification event. But there
>> are disadvantages to restricting the usage of stop CPU callback
>> to cpufreq drivers that implement the set_policy callbacks and who
>> want to take explicit action on the setting the cpufreq during a
>> hotplug operation.
>>
>> 1.GOV_STOP gets called for every CPU offline and drivers would usually
>>   want to take action when the last cpu in the policy->cpus mask
>>   is taken offline. As long as there is more than one cpu in the
>>   policy->cpus mask, cpufreq core itself makes sure that the freq
>>   for the other cpus in this mask is set according to the maximum load.
>>   This is sensible and drivers which implement the target_index callback
>>   would mostly not want to modify that. However the cpufreq core leaves a
>>   loose end when the cpu in the policy->cpus mask is the last one to go 
>> offline;
>>   it does nothing explicit to the frequency of the core. Drivers may need
>>   a way to take some action here and stop CPU callback mechanism is the
>>   best way to do it today.
>>
>> 2.We cannot implement driver specific actions in the GOV_STOP mechanism.
>>   So we will need another driver callback which is invoked from here which is
>>   unnecessary.
>>
>>   Therefore this patch extends the usage of stop CPU callback to be used
>>   by all cpufreq drivers as long as they have this callback implemented
>>   and irrespective of whether they are set_policy/target_index drivers.
>>   The assumption is if the drivers find the GOV_STOP path to be a suitable
>>   way of implementing what they want to do with the freq of the cpu
>>   going offine,they will not implement the stop CPU callback at all.
>>
>>   Signed-off-by: Preeti U Murthy 
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index d9fdedd..6463f35 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1380,7 +1380,7 @@ static int __cpufreq_remove_dev_prepare(struct device 
>> *dev,
>> if (!cpufreq_suspended)
>> pr_debug("%s: policy Kobject moved to cpu: %d from: 
>> %d\n",
>>  __func__, new_cpu, cpu);
>> -   } else if (cpufreq_driver->stop_cpu && cpufreq_driver->setpolicy) {
>> +   } else if (cpufreq_driver->stop_cpu) {
>> cpufreq_driver->stop_cpu(policy);
>> }
> 
> Rafael explicitly said earlier that he want to see a separate callback for
> ->target() drivers, don't know why..

I think Rafael's point was that since no driver that had implemented the
target_index callback was using it at the time that this patch was
proposed, it was be best to couple the check on existence of stop_cpu
callback with the the check on the kind of cpufreq driver. However
powerpc is also in need of this today and we implement the target_index
callback and find it convenient to use the stop CPU callback.

Rafael, in which case would it not make sense to remove the check on
driver->setpolicy above?

Besides, I don't understand very well why we had this double check in
the first place. Only if the drivers are in need of the functionality
like stop_cpu, would they have implemented this callback right? If we
are to assume that the drivers which have implemented the target_index
callback should never be needing it, they would not have implemented the
stop CPU callback either. So what was that, which was blatantly wrong
with just having a check on stop_cpu? I did go through the discussion
but did not find a convincing answer to this.

Rafael?

Regards
Preeti U Murthy

> 
> It looks fine to me though.
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-09-04 Thread Viresh Kumar
On 4 September 2014 14:40, Preeti U Murthy  wrote:
> cpufreq: Allow stop CPU callback to be used by all cpufreq drivers
>
> Commit 367dc4aa introduced the stop CPU callback for intel_pstate
> drivers. During the CPU_DOWN_PREPARE stage, this callback is invoked
> so that drivers can take some action on the pstate of the cpu
> before it is taken offline. This callback was assumed to be useful
> only for those drivers which have implemented the set_policy CPU
> callback because they have no other way to take action about the
> cpufreq of a CPU which is being hotplugged out except in the exit
> callback which is called very late in the offline process.
>
> The drivers which implement the target/target_index callbacks were
> expected to take care of requirements like the ones that commit
> 367dc4aa addresses in the GOV_STOP notification event. But there
> are disadvantages to restricting the usage of stop CPU callback
> to cpufreq drivers that implement the set_policy callbacks and who
> want to take explicit action on the setting the cpufreq during a
> hotplug operation.
>
> 1.GOV_STOP gets called for every CPU offline and drivers would usually
>   want to take action when the last cpu in the policy->cpus mask
>   is taken offline. As long as there is more than one cpu in the
>   policy->cpus mask, cpufreq core itself makes sure that the freq
>   for the other cpus in this mask is set according to the maximum load.
>   This is sensible and drivers which implement the target_index callback
>   would mostly not want to modify that. However the cpufreq core leaves a
>   loose end when the cpu in the policy->cpus mask is the last one to go 
> offline;
>   it does nothing explicit to the frequency of the core. Drivers may need
>   a way to take some action here and stop CPU callback mechanism is the
>   best way to do it today.
>
> 2.We cannot implement driver specific actions in the GOV_STOP mechanism.
>   So we will need another driver callback which is invoked from here which is
>   unnecessary.
>
>   Therefore this patch extends the usage of stop CPU callback to be used
>   by all cpufreq drivers as long as they have this callback implemented
>   and irrespective of whether they are set_policy/target_index drivers.
>   The assumption is if the drivers find the GOV_STOP path to be a suitable
>   way of implementing what they want to do with the freq of the cpu
>   going offine,they will not implement the stop CPU callback at all.
>
>   Signed-off-by: Preeti U Murthy 
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index d9fdedd..6463f35 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1380,7 +1380,7 @@ static int __cpufreq_remove_dev_prepare(struct device 
> *dev,
> if (!cpufreq_suspended)
> pr_debug("%s: policy Kobject moved to cpu: %d from: 
> %d\n",
>  __func__, new_cpu, cpu);
> -   } else if (cpufreq_driver->stop_cpu && cpufreq_driver->setpolicy) {
> +   } else if (cpufreq_driver->stop_cpu) {
> cpufreq_driver->stop_cpu(policy);
> }

Rafael explicitly said earlier that he want to see a separate callback for
->target() drivers, don't know why..

It looks fine to me though.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-09-04 Thread Preeti U Murthy
Hi,

I went through the cpufreq code and the previous discussion in the
context of this patch and I propose the below patch. Please let me know
if it misses something that you all had discussed.

On 09/04/2014 11:38 AM, Viresh Kumar wrote:
> On 19 March 2014 19:49, Rafael J. Wysocki  wrote:
> 
>> That said, for the intel_pstate case ->stop() as proposed by Dirk is 
>> demonstrably
>> sufficient and there are no other ->setpolicy drivers in sight wanting or 
>> needing
>> anything else.
>>
>> So to me, (1) the new ->stop() should *only* be called for ->setpolicy 
>> drivers,
>> because the purpose of it should be to "allow ->setpolicy drivers to do what 
>> the
>> GOV_STOP will do for regular drivers" as you put it above, and (2) some code 
>> in
>> the original intel_pstate's ->exit() may/should stay in there (instead of 
>> being
>> moved to the new ->stop()), which is the only possibly remaining issue here.
>>
>> The whole discussion about possibly re-using ->stop() for ->target drivers 
>> goes
>> in a totally wrong direction, because *if* ->target drivers need a new 
>> callback
>> to be executed around where ->stop() is called for ->setpolicy drivers, 
>> *then*
>> that has to be a *different* callback.
>>
>> And by the way, ->get() in fact has a different meaning for ->setpolicy 
>> drivers,
>> so it would be good to consider logical separation of ->setpolicy and 
>> ->target
>> drivers so that each kind has its own separate set of callbacks with no 
>> overlaps.
>> That would make it easier to avoid breakage resulting from changes made with
>> ->setpolicy drivers that also affect ->target drivers in unpredictable ways 
>> and
>> the other way around.
> 
> Okay, I have picked up a very old thread but it looks more sensible to start
> replying here..
> 
> Preeti (Cc'd) wants to do something similar, i.e. reduce freq of a
> core before it
> goes down. And the driver is probably: drivers/cpufreq/powernv-cpufreq.c, 
> which
> is ->target() type.
> 
> Now should we reuse the same callback ->stop_cpu() or implement a new one?
> I don't know if adding a new callback would be a good idea here..
> 
> --
> viresh
> 

cpufreq: Allow stop CPU callback to be used by all cpufreq drivers

Commit 367dc4aa introduced the stop CPU callback for intel_pstate
drivers. During the CPU_DOWN_PREPARE stage, this callback is invoked
so that drivers can take some action on the pstate of the cpu
before it is taken offline. This callback was assumed to be useful
only for those drivers which have implemented the set_policy CPU
callback because they have no other way to take action about the
cpufreq of a CPU which is being hotplugged out except in the exit
callback which is called very late in the offline process.

The drivers which implement the target/target_index callbacks were
expected to take care of requirements like the ones that commit
367dc4aa addresses in the GOV_STOP notification event. But there
are disadvantages to restricting the usage of stop CPU callback
to cpufreq drivers that implement the set_policy callbacks and who
want to take explicit action on the setting the cpufreq during a
hotplug operation.
   
1.GOV_STOP gets called for every CPU offline and drivers would usually
  want to take action when the last cpu in the policy->cpus mask
  is taken offline. As long as there is more than one cpu in the
  policy->cpus mask, cpufreq core itself makes sure that the freq
  for the other cpus in this mask is set according to the maximum load.
  This is sensible and drivers which implement the target_index callback
  would mostly not want to modify that. However the cpufreq core leaves a
  loose end when the cpu in the policy->cpus mask is the last one to go offline;
  it does nothing explicit to the frequency of the core. Drivers may need
  a way to take some action here and stop CPU callback mechanism is the
  best way to do it today.

2.We cannot implement driver specific actions in the GOV_STOP mechanism.
  So we will need another driver callback which is invoked from here which is
  unnecessary.

  Therefore this patch extends the usage of stop CPU callback to be used
  by all cpufreq drivers as long as they have this callback implemented
  and irrespective of whether they are set_policy/target_index drivers.
  The assumption is if the drivers find the GOV_STOP path to be a suitable
  way of implementing what they want to do with the freq of the cpu
  going offine,they will not implement the stop CPU callback at all.

  Signed-off-by: Preeti U Murthy 

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d9fdedd..6463f35 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1380,7 +1380,7 @@ static int __cpufreq_remove_dev_prepare(struct device 
*dev,
if (!cpufreq_suspended)
pr_debug("%s: policy Kobject moved to cpu: %d from: 
%d\n",
 __func__, new_cpu, cpu);
-   } else if 

Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-09-04 Thread Viresh Kumar
On 19 March 2014 19:49, Rafael J. Wysocki  wrote:

> That said, for the intel_pstate case ->stop() as proposed by Dirk is 
> demonstrably
> sufficient and there are no other ->setpolicy drivers in sight wanting or 
> needing
> anything else.
>
> So to me, (1) the new ->stop() should *only* be called for ->setpolicy 
> drivers,
> because the purpose of it should be to "allow ->setpolicy drivers to do what 
> the
> GOV_STOP will do for regular drivers" as you put it above, and (2) some code 
> in
> the original intel_pstate's ->exit() may/should stay in there (instead of 
> being
> moved to the new ->stop()), which is the only possibly remaining issue here.
>
> The whole discussion about possibly re-using ->stop() for ->target drivers 
> goes
> in a totally wrong direction, because *if* ->target drivers need a new 
> callback
> to be executed around where ->stop() is called for ->setpolicy drivers, *then*
> that has to be a *different* callback.
>
> And by the way, ->get() in fact has a different meaning for ->setpolicy 
> drivers,
> so it would be good to consider logical separation of ->setpolicy and ->target
> drivers so that each kind has its own separate set of callbacks with no 
> overlaps.
> That would make it easier to avoid breakage resulting from changes made with
> ->setpolicy drivers that also affect ->target drivers in unpredictable ways 
> and
> the other way around.

Okay, I have picked up a very old thread but it looks more sensible to start
replying here..

Preeti (Cc'd) wants to do something similar, i.e. reduce freq of a
core before it
goes down. And the driver is probably: drivers/cpufreq/powernv-cpufreq.c, which
is ->target() type.

Now should we reuse the same callback ->stop_cpu() or implement a new one?
I don't know if adding a new callback would be a good idea here..

--
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-09-04 Thread Rafael J. Wysocki
On Thursday, September 04, 2014 04:07:28 PM Viresh Kumar wrote:
 On 4 September 2014 15:33, Preeti U Murthy pre...@linux.vnet.ibm.com wrote:
  I think Rafael's point was that since no driver that had implemented the
  target_index callback was using it at the time that this patch was
  proposed, it was be best to couple the check on existence of stop_cpu
  callback with the the check on the kind of cpufreq driver. However
  powerpc is also in need of this today and we implement the target_index
  callback and find it convenient to use the stop CPU callback.
 
 No, this is what he said..
 
 
 So to me, (1) the new -stop() should *only* be called for -setpolicy 
 drivers,
 because the purpose of it should be to allow -setpolicy drivers to do what 
 the
 GOV_STOP will do for regular drivers
 
 
  Rafael, in which case would it not make sense to remove the check on
  driver-setpolicy above?
 
  Besides, I don't understand very well why we had this double check in
  the first place. Only if the drivers are in need of the functionality
  like stop_cpu, would they have implemented this callback right? If we
  are to assume that the drivers which have implemented the target_index
  callback should never be needing it, they would not have implemented the
  stop CPU callback either. So what was that, which was blatantly wrong
  with just having a check on stop_cpu? I did go through the discussion
  but did not find a convincing answer to this.
 
 The idea was to get something similar to GOV_STOP for setpolicy drivers.
 But in the end we didn't get to that. What we do in GOV_STOP is stop
 changing CPUs frequency, but here in stop_cpu() we can stop changing
 CPUs frequency OR take it to minimum, whatever we want..
 
 As I said earlier, probably we should just do what you did in your patch +
 some documentation changes.

OK, if that works for everybody.  For one, I wouldn't like to end up with a
callback used for different things in every drvier implementing it.

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-09-04 Thread Viresh Kumar
On 19 March 2014 19:49, Rafael J. Wysocki r...@rjwysocki.net wrote:

 That said, for the intel_pstate case -stop() as proposed by Dirk is 
 demonstrably
 sufficient and there are no other -setpolicy drivers in sight wanting or 
 needing
 anything else.

 So to me, (1) the new -stop() should *only* be called for -setpolicy 
 drivers,
 because the purpose of it should be to allow -setpolicy drivers to do what 
 the
 GOV_STOP will do for regular drivers as you put it above, and (2) some code 
 in
 the original intel_pstate's -exit() may/should stay in there (instead of 
 being
 moved to the new -stop()), which is the only possibly remaining issue here.

 The whole discussion about possibly re-using -stop() for -target drivers 
 goes
 in a totally wrong direction, because *if* -target drivers need a new 
 callback
 to be executed around where -stop() is called for -setpolicy drivers, *then*
 that has to be a *different* callback.

 And by the way, -get() in fact has a different meaning for -setpolicy 
 drivers,
 so it would be good to consider logical separation of -setpolicy and -target
 drivers so that each kind has its own separate set of callbacks with no 
 overlaps.
 That would make it easier to avoid breakage resulting from changes made with
 -setpolicy drivers that also affect -target drivers in unpredictable ways 
 and
 the other way around.

Okay, I have picked up a very old thread but it looks more sensible to start
replying here..

Preeti (Cc'd) wants to do something similar, i.e. reduce freq of a
core before it
goes down. And the driver is probably: drivers/cpufreq/powernv-cpufreq.c, which
is -target() type.

Now should we reuse the same callback -stop_cpu() or implement a new one?
I don't know if adding a new callback would be a good idea here..

--
viresh
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-09-04 Thread Preeti U Murthy
Hi,

I went through the cpufreq code and the previous discussion in the
context of this patch and I propose the below patch. Please let me know
if it misses something that you all had discussed.

On 09/04/2014 11:38 AM, Viresh Kumar wrote:
 On 19 March 2014 19:49, Rafael J. Wysocki r...@rjwysocki.net wrote:
 
 That said, for the intel_pstate case -stop() as proposed by Dirk is 
 demonstrably
 sufficient and there are no other -setpolicy drivers in sight wanting or 
 needing
 anything else.

 So to me, (1) the new -stop() should *only* be called for -setpolicy 
 drivers,
 because the purpose of it should be to allow -setpolicy drivers to do what 
 the
 GOV_STOP will do for regular drivers as you put it above, and (2) some code 
 in
 the original intel_pstate's -exit() may/should stay in there (instead of 
 being
 moved to the new -stop()), which is the only possibly remaining issue here.

 The whole discussion about possibly re-using -stop() for -target drivers 
 goes
 in a totally wrong direction, because *if* -target drivers need a new 
 callback
 to be executed around where -stop() is called for -setpolicy drivers, 
 *then*
 that has to be a *different* callback.

 And by the way, -get() in fact has a different meaning for -setpolicy 
 drivers,
 so it would be good to consider logical separation of -setpolicy and 
 -target
 drivers so that each kind has its own separate set of callbacks with no 
 overlaps.
 That would make it easier to avoid breakage resulting from changes made with
 -setpolicy drivers that also affect -target drivers in unpredictable ways 
 and
 the other way around.
 
 Okay, I have picked up a very old thread but it looks more sensible to start
 replying here..
 
 Preeti (Cc'd) wants to do something similar, i.e. reduce freq of a
 core before it
 goes down. And the driver is probably: drivers/cpufreq/powernv-cpufreq.c, 
 which
 is -target() type.
 
 Now should we reuse the same callback -stop_cpu() or implement a new one?
 I don't know if adding a new callback would be a good idea here..
 
 --
 viresh
 

cpufreq: Allow stop CPU callback to be used by all cpufreq drivers

Commit 367dc4aa introduced the stop CPU callback for intel_pstate
drivers. During the CPU_DOWN_PREPARE stage, this callback is invoked
so that drivers can take some action on the pstate of the cpu
before it is taken offline. This callback was assumed to be useful
only for those drivers which have implemented the set_policy CPU
callback because they have no other way to take action about the
cpufreq of a CPU which is being hotplugged out except in the exit
callback which is called very late in the offline process.

The drivers which implement the target/target_index callbacks were
expected to take care of requirements like the ones that commit
367dc4aa addresses in the GOV_STOP notification event. But there
are disadvantages to restricting the usage of stop CPU callback
to cpufreq drivers that implement the set_policy callbacks and who
want to take explicit action on the setting the cpufreq during a
hotplug operation.
   
1.GOV_STOP gets called for every CPU offline and drivers would usually
  want to take action when the last cpu in the policy-cpus mask
  is taken offline. As long as there is more than one cpu in the
  policy-cpus mask, cpufreq core itself makes sure that the freq
  for the other cpus in this mask is set according to the maximum load.
  This is sensible and drivers which implement the target_index callback
  would mostly not want to modify that. However the cpufreq core leaves a
  loose end when the cpu in the policy-cpus mask is the last one to go offline;
  it does nothing explicit to the frequency of the core. Drivers may need
  a way to take some action here and stop CPU callback mechanism is the
  best way to do it today.

2.We cannot implement driver specific actions in the GOV_STOP mechanism.
  So we will need another driver callback which is invoked from here which is
  unnecessary.

  Therefore this patch extends the usage of stop CPU callback to be used
  by all cpufreq drivers as long as they have this callback implemented
  and irrespective of whether they are set_policy/target_index drivers.
  The assumption is if the drivers find the GOV_STOP path to be a suitable
  way of implementing what they want to do with the freq of the cpu
  going offine,they will not implement the stop CPU callback at all.

  Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d9fdedd..6463f35 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1380,7 +1380,7 @@ static int __cpufreq_remove_dev_prepare(struct device 
*dev,
if (!cpufreq_suspended)
pr_debug(%s: policy Kobject moved to cpu: %d from: 
%d\n,
 __func__, new_cpu, cpu);
-   } else if (cpufreq_driver-stop_cpu  cpufreq_driver-setpolicy) {
+   } else if 

Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-09-04 Thread Viresh Kumar
On 4 September 2014 14:40, Preeti U Murthy pre...@linux.vnet.ibm.com wrote:
 cpufreq: Allow stop CPU callback to be used by all cpufreq drivers

 Commit 367dc4aa introduced the stop CPU callback for intel_pstate
 drivers. During the CPU_DOWN_PREPARE stage, this callback is invoked
 so that drivers can take some action on the pstate of the cpu
 before it is taken offline. This callback was assumed to be useful
 only for those drivers which have implemented the set_policy CPU
 callback because they have no other way to take action about the
 cpufreq of a CPU which is being hotplugged out except in the exit
 callback which is called very late in the offline process.

 The drivers which implement the target/target_index callbacks were
 expected to take care of requirements like the ones that commit
 367dc4aa addresses in the GOV_STOP notification event. But there
 are disadvantages to restricting the usage of stop CPU callback
 to cpufreq drivers that implement the set_policy callbacks and who
 want to take explicit action on the setting the cpufreq during a
 hotplug operation.

 1.GOV_STOP gets called for every CPU offline and drivers would usually
   want to take action when the last cpu in the policy-cpus mask
   is taken offline. As long as there is more than one cpu in the
   policy-cpus mask, cpufreq core itself makes sure that the freq
   for the other cpus in this mask is set according to the maximum load.
   This is sensible and drivers which implement the target_index callback
   would mostly not want to modify that. However the cpufreq core leaves a
   loose end when the cpu in the policy-cpus mask is the last one to go 
 offline;
   it does nothing explicit to the frequency of the core. Drivers may need
   a way to take some action here and stop CPU callback mechanism is the
   best way to do it today.

 2.We cannot implement driver specific actions in the GOV_STOP mechanism.
   So we will need another driver callback which is invoked from here which is
   unnecessary.

   Therefore this patch extends the usage of stop CPU callback to be used
   by all cpufreq drivers as long as they have this callback implemented
   and irrespective of whether they are set_policy/target_index drivers.
   The assumption is if the drivers find the GOV_STOP path to be a suitable
   way of implementing what they want to do with the freq of the cpu
   going offine,they will not implement the stop CPU callback at all.

   Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com

 diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
 index d9fdedd..6463f35 100644
 --- a/drivers/cpufreq/cpufreq.c
 +++ b/drivers/cpufreq/cpufreq.c
 @@ -1380,7 +1380,7 @@ static int __cpufreq_remove_dev_prepare(struct device 
 *dev,
 if (!cpufreq_suspended)
 pr_debug(%s: policy Kobject moved to cpu: %d from: 
 %d\n,
  __func__, new_cpu, cpu);
 -   } else if (cpufreq_driver-stop_cpu  cpufreq_driver-setpolicy) {
 +   } else if (cpufreq_driver-stop_cpu) {
 cpufreq_driver-stop_cpu(policy);
 }

Rafael explicitly said earlier that he want to see a separate callback for
-target() drivers, don't know why..

It looks fine to me though.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-09-04 Thread Preeti U Murthy
On 09/04/2014 02:46 PM, Viresh Kumar wrote:
 On 4 September 2014 14:40, Preeti U Murthy pre...@linux.vnet.ibm.com wrote:
 cpufreq: Allow stop CPU callback to be used by all cpufreq drivers

 Commit 367dc4aa introduced the stop CPU callback for intel_pstate
 drivers. During the CPU_DOWN_PREPARE stage, this callback is invoked
 so that drivers can take some action on the pstate of the cpu
 before it is taken offline. This callback was assumed to be useful
 only for those drivers which have implemented the set_policy CPU
 callback because they have no other way to take action about the
 cpufreq of a CPU which is being hotplugged out except in the exit
 callback which is called very late in the offline process.

 The drivers which implement the target/target_index callbacks were
 expected to take care of requirements like the ones that commit
 367dc4aa addresses in the GOV_STOP notification event. But there
 are disadvantages to restricting the usage of stop CPU callback
 to cpufreq drivers that implement the set_policy callbacks and who
 want to take explicit action on the setting the cpufreq during a
 hotplug operation.

 1.GOV_STOP gets called for every CPU offline and drivers would usually
   want to take action when the last cpu in the policy-cpus mask
   is taken offline. As long as there is more than one cpu in the
   policy-cpus mask, cpufreq core itself makes sure that the freq
   for the other cpus in this mask is set according to the maximum load.
   This is sensible and drivers which implement the target_index callback
   would mostly not want to modify that. However the cpufreq core leaves a
   loose end when the cpu in the policy-cpus mask is the last one to go 
 offline;
   it does nothing explicit to the frequency of the core. Drivers may need
   a way to take some action here and stop CPU callback mechanism is the
   best way to do it today.

 2.We cannot implement driver specific actions in the GOV_STOP mechanism.
   So we will need another driver callback which is invoked from here which is
   unnecessary.

   Therefore this patch extends the usage of stop CPU callback to be used
   by all cpufreq drivers as long as they have this callback implemented
   and irrespective of whether they are set_policy/target_index drivers.
   The assumption is if the drivers find the GOV_STOP path to be a suitable
   way of implementing what they want to do with the freq of the cpu
   going offine,they will not implement the stop CPU callback at all.

   Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com

 diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
 index d9fdedd..6463f35 100644
 --- a/drivers/cpufreq/cpufreq.c
 +++ b/drivers/cpufreq/cpufreq.c
 @@ -1380,7 +1380,7 @@ static int __cpufreq_remove_dev_prepare(struct device 
 *dev,
 if (!cpufreq_suspended)
 pr_debug(%s: policy Kobject moved to cpu: %d from: 
 %d\n,
  __func__, new_cpu, cpu);
 -   } else if (cpufreq_driver-stop_cpu  cpufreq_driver-setpolicy) {
 +   } else if (cpufreq_driver-stop_cpu) {
 cpufreq_driver-stop_cpu(policy);
 }
 
 Rafael explicitly said earlier that he want to see a separate callback for
 -target() drivers, don't know why..

I think Rafael's point was that since no driver that had implemented the
target_index callback was using it at the time that this patch was
proposed, it was be best to couple the check on existence of stop_cpu
callback with the the check on the kind of cpufreq driver. However
powerpc is also in need of this today and we implement the target_index
callback and find it convenient to use the stop CPU callback.

Rafael, in which case would it not make sense to remove the check on
driver-setpolicy above?

Besides, I don't understand very well why we had this double check in
the first place. Only if the drivers are in need of the functionality
like stop_cpu, would they have implemented this callback right? If we
are to assume that the drivers which have implemented the target_index
callback should never be needing it, they would not have implemented the
stop CPU callback either. So what was that, which was blatantly wrong
with just having a check on stop_cpu? I did go through the discussion
but did not find a convincing answer to this.

Rafael?

Regards
Preeti U Murthy

 
 It looks fine to me though.
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-09-04 Thread Viresh Kumar
On 4 September 2014 15:33, Preeti U Murthy pre...@linux.vnet.ibm.com wrote:
 I think Rafael's point was that since no driver that had implemented the
 target_index callback was using it at the time that this patch was
 proposed, it was be best to couple the check on existence of stop_cpu
 callback with the the check on the kind of cpufreq driver. However
 powerpc is also in need of this today and we implement the target_index
 callback and find it convenient to use the stop CPU callback.

No, this is what he said..


So to me, (1) the new -stop() should *only* be called for -setpolicy drivers,
because the purpose of it should be to allow -setpolicy drivers to do what the
GOV_STOP will do for regular drivers


 Rafael, in which case would it not make sense to remove the check on
 driver-setpolicy above?

 Besides, I don't understand very well why we had this double check in
 the first place. Only if the drivers are in need of the functionality
 like stop_cpu, would they have implemented this callback right? If we
 are to assume that the drivers which have implemented the target_index
 callback should never be needing it, they would not have implemented the
 stop CPU callback either. So what was that, which was blatantly wrong
 with just having a check on stop_cpu? I did go through the discussion
 but did not find a convincing answer to this.

The idea was to get something similar to GOV_STOP for setpolicy drivers.
But in the end we didn't get to that. What we do in GOV_STOP is stop
changing CPUs frequency, but here in stop_cpu() we can stop changing
CPUs frequency OR take it to minimum, whatever we want..

As I said earlier, probably we should just do what you did in your patch +
some documentation changes.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-19 Thread Rafael J. Wysocki
On Wednesday, March 19, 2014 03:30:48 PM Srivatsa S. Bhat wrote:
> On 03/19/2014 10:33 AM, Viresh Kumar wrote:
> > On 18 March 2014 17:46, Srivatsa S. Bhat
> >  wrote:
> >> Agreed. As far as I understand, for ->target drivers, today we use GOV_STOP
> >> to stop managing the CPU going offline. And for ->setpolicy drivers, we 
> >> will
> >> use this new callback to achieve the same goal.
> > 
> > So a better question would be: What's the purpose of ->stop() call for a 
> > policy?
> 
> Ideally, it should remove the outgoing CPU from the policy and "stop managing
> that CPU", whatever that means to the driver (for intel_pstate, it means
> setting it to min P state and destroying the timer).
> 
> > Stop managing CPUs of that policy?
> 
> Stop managing only the particular CPU going offline. IOW, we should somehow
> communicate to the ->stop() callback that we are taking CPU 'x' offline.
> 
> If adding a ->stop() callback in the cpufreq_driver is not the best way to
> achieve it, then lets think of an alternative. The way I look at it, this
> new mechanism what we want, should allow ->setpolicy drivers to do what the
> GOV_STOP will do for regular drivers. That is, allow it to "shutdown the
> CPU from a cpufreq perspective", whatever that means to the driver.
> We can think of a completely different way of achieving it, if ->stop()
> is not suitable for that purpose.

I agree.

That said, for the intel_pstate case ->stop() as proposed by Dirk is 
demonstrably
sufficient and there are no other ->setpolicy drivers in sight wanting or 
needing
anything else.

So to me, (1) the new ->stop() should *only* be called for ->setpolicy drivers,
because the purpose of it should be to "allow ->setpolicy drivers to do what the
GOV_STOP will do for regular drivers" as you put it above, and (2) some code in
the original intel_pstate's ->exit() may/should stay in there (instead of being
moved to the new ->stop()), which is the only possibly remaining issue here.

The whole discussion about possibly re-using ->stop() for ->target drivers goes
in a totally wrong direction, because *if* ->target drivers need a new callback
to be executed around where ->stop() is called for ->setpolicy drivers, *then*
that has to be a *different* callback.

And by the way, ->get() in fact has a different meaning for ->setpolicy drivers,
so it would be good to consider logical separation of ->setpolicy and ->target
drivers so that each kind has its own separate set of callbacks with no 
overlaps.
That would make it easier to avoid breakage resulting from changes made with
->setpolicy drivers that also affect ->target drivers in unpredictable ways and
the other way around.

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-19 Thread Srivatsa S. Bhat
On 03/19/2014 10:33 AM, Viresh Kumar wrote:
> On 18 March 2014 17:46, Srivatsa S. Bhat
>  wrote:
>> Agreed. As far as I understand, for ->target drivers, today we use GOV_STOP
>> to stop managing the CPU going offline. And for ->setpolicy drivers, we will
>> use this new callback to achieve the same goal.
> 
> So a better question would be: What's the purpose of ->stop() call for a 
> policy?

Ideally, it should remove the outgoing CPU from the policy and "stop managing
that CPU", whatever that means to the driver (for intel_pstate, it means
setting it to min P state and destroying the timer).

> Stop managing CPUs of that policy?

Stop managing only the particular CPU going offline. IOW, we should somehow
communicate to the ->stop() callback that we are taking CPU 'x' offline.

If adding a ->stop() callback in the cpufreq_driver is not the best way to
achieve it, then lets think of an alternative. The way I look at it, this
new mechanism what we want, should allow ->setpolicy drivers to do what the
GOV_STOP will do for regular drivers. That is, allow it to "shutdown the
CPU from a cpufreq perspective", whatever that means to the driver.
We can think of a completely different way of achieving it, if ->stop()
is not suitable for that purpose.

> Or even do something on CPUs of a policy
> before CPUs are offlined?
> 
> Probably in the current solution Dirk is doing both these things..
> 
> And so I thought maybe its better not to restrict ->stop() to just
> setpolicy() drivers.


Regards,
Srivatsa S. Bhat

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-19 Thread Srivatsa S. Bhat
On 03/19/2014 10:33 AM, Viresh Kumar wrote:
 On 18 March 2014 17:46, Srivatsa S. Bhat
 srivatsa.b...@linux.vnet.ibm.com wrote:
 Agreed. As far as I understand, for -target drivers, today we use GOV_STOP
 to stop managing the CPU going offline. And for -setpolicy drivers, we will
 use this new callback to achieve the same goal.
 
 So a better question would be: What's the purpose of -stop() call for a 
 policy?

Ideally, it should remove the outgoing CPU from the policy and stop managing
that CPU, whatever that means to the driver (for intel_pstate, it means
setting it to min P state and destroying the timer).

 Stop managing CPUs of that policy?

Stop managing only the particular CPU going offline. IOW, we should somehow
communicate to the -stop() callback that we are taking CPU 'x' offline.

If adding a -stop() callback in the cpufreq_driver is not the best way to
achieve it, then lets think of an alternative. The way I look at it, this
new mechanism what we want, should allow -setpolicy drivers to do what the
GOV_STOP will do for regular drivers. That is, allow it to shutdown the
CPU from a cpufreq perspective, whatever that means to the driver.
We can think of a completely different way of achieving it, if -stop()
is not suitable for that purpose.

 Or even do something on CPUs of a policy
 before CPUs are offlined?
 
 Probably in the current solution Dirk is doing both these things..
 
 And so I thought maybe its better not to restrict -stop() to just
 setpolicy() drivers.


Regards,
Srivatsa S. Bhat

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-19 Thread Rafael J. Wysocki
On Wednesday, March 19, 2014 03:30:48 PM Srivatsa S. Bhat wrote:
 On 03/19/2014 10:33 AM, Viresh Kumar wrote:
  On 18 March 2014 17:46, Srivatsa S. Bhat
  srivatsa.b...@linux.vnet.ibm.com wrote:
  Agreed. As far as I understand, for -target drivers, today we use GOV_STOP
  to stop managing the CPU going offline. And for -setpolicy drivers, we 
  will
  use this new callback to achieve the same goal.
  
  So a better question would be: What's the purpose of -stop() call for a 
  policy?
 
 Ideally, it should remove the outgoing CPU from the policy and stop managing
 that CPU, whatever that means to the driver (for intel_pstate, it means
 setting it to min P state and destroying the timer).
 
  Stop managing CPUs of that policy?
 
 Stop managing only the particular CPU going offline. IOW, we should somehow
 communicate to the -stop() callback that we are taking CPU 'x' offline.
 
 If adding a -stop() callback in the cpufreq_driver is not the best way to
 achieve it, then lets think of an alternative. The way I look at it, this
 new mechanism what we want, should allow -setpolicy drivers to do what the
 GOV_STOP will do for regular drivers. That is, allow it to shutdown the
 CPU from a cpufreq perspective, whatever that means to the driver.
 We can think of a completely different way of achieving it, if -stop()
 is not suitable for that purpose.

I agree.

That said, for the intel_pstate case -stop() as proposed by Dirk is 
demonstrably
sufficient and there are no other -setpolicy drivers in sight wanting or 
needing
anything else.

So to me, (1) the new -stop() should *only* be called for -setpolicy drivers,
because the purpose of it should be to allow -setpolicy drivers to do what the
GOV_STOP will do for regular drivers as you put it above, and (2) some code in
the original intel_pstate's -exit() may/should stay in there (instead of being
moved to the new -stop()), which is the only possibly remaining issue here.

The whole discussion about possibly re-using -stop() for -target drivers goes
in a totally wrong direction, because *if* -target drivers need a new callback
to be executed around where -stop() is called for -setpolicy drivers, *then*
that has to be a *different* callback.

And by the way, -get() in fact has a different meaning for -setpolicy drivers,
so it would be good to consider logical separation of -setpolicy and -target
drivers so that each kind has its own separate set of callbacks with no 
overlaps.
That would make it easier to avoid breakage resulting from changes made with
-setpolicy drivers that also affect -target drivers in unpredictable ways and
the other way around.

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-18 Thread Viresh Kumar
On 18 March 2014 17:46, Srivatsa S. Bhat
 wrote:
> Agreed. As far as I understand, for ->target drivers, today we use GOV_STOP
> to stop managing the CPU going offline. And for ->setpolicy drivers, we will
> use this new callback to achieve the same goal.

So a better question would be: What's the purpose of ->stop() call for a policy?
Stop managing CPUs of that policy? Or even do something on CPUs of a policy
before CPUs are offlined?

Probably in the current solution Dirk is doing both these things..

And so I thought maybe its better not to restrict ->stop() to just
setpolicy() drivers.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-18 Thread Srivatsa S. Bhat
On 03/15/2014 07:29 AM, Rafael J. Wysocki wrote:
> On Friday, March 14, 2014 11:29:04 AM Dirk Brandewie wrote:
>> On 03/14/2014 10:07 AM, Viresh Kumar wrote:
>>> On 14 March 2014 20:40, Dirk Brandewie  wrote:
 Are you proposing adding cpufreq_generic_suspend() to the core I can not
 find
 it in the mainline code.
>>>
>>> Its already there in linux-next. I am suggesting to reuse that
>>> infrastructure with
>>> some necessary modification to support both suspend and hotplug.
>>
>> Suspend and hotplug are two very different things and if we start
>> crossing those wires bad things are going to happen IMHO.
>>
>> In "normal" operation using the suspend path to do this work could
>> work in principal but doesn't handle the case where the user does
>> echo 0 | sudo tee /sys/devices/system/cpu/cpuX/online
>>
>> Trying force hotplug and suspend into a common mechanism would
>> lead to a bunch of special case code or a significant rework of the
>> core code IMHO.
>>
>>
>>>
> Over that I don't think Dirk's solution is going to work if we twist
> the systems a bit.

 Could you explain what "twist the systems a bit" means?
>>>
>>> The one I explained in the below paragraph.
>>>
> For example, Dirk probably wanted to set P-STATE of every core to MIN
> when it goes down. But his solution probably doesn't do that right now.
>

 No, I wanted to set the core that was being off-lined to min P state.
>>>
>>> Sorry, probably my words were a bit confusing. I meant exactly what
>>> you just wrote. Core going down will set its freq to min.
>>>
> As exit() is called only when the policy expires or all the CPUs of that
> policy
> are down. Suppose only one CPU out of 4 goes down from a policy, then
> pstate driver would never know that happened. And that core wouldn't go
> to min state.

 My patch does not change the semantics of exit() or when it is called.  For
 intel_pstate their is one cpu per policy so I moved all the cleanup to
>>>
>>> I didn't knew that its guaranteed by pstate driver. I thought it would 
>>> still be
>>> hardware dependent as some cores might share clock line.
>>
>> This is guaranteed by the hardware.  Each core has its own MSR for P state
>> request.  Any coordination that is required between cores to select the
>> package P state is handled by the hardware.
>>
>>>
 exit_prepare() if there were work I needed to do at CPU_POST_DEAD I would
 have
 continued to export the *optional* exit callback.

 The callback name exit_prepare() was probably unfortunate and might be
 causing
 some confusion.  I will be changing the name per Rafael's feedback.
>>>
>>> Don't know.. There is another problem here that exit_prepare() would be 
>>> called
>>> for each CPU whereas exit() would be called for each policy.
>>
>> Granted but I don't see this as a problem in this case there is a 1:1
>> relationship.  If a driver chooses to use the *optional* exit_prepare() 
>> callback
>> and knows that there is a many:1 relationship between the policy and CPUs
>> then it would have to deal with it.
> 
> Actually, I think we should make it clear that the new callback is for
> ->setpolicy drivers only, which will make things a bit clearer.
>

Agreed. As far as I understand, for ->target drivers, today we use GOV_STOP
to stop managing the CPU going offline. And for ->setpolicy drivers, we will
use this new callback to achieve the same goal. 

Regards,
Srivatsa S. Bhat
 
> We seem to get caught by the difference between ->setpolicy and ->target
> drivers on a regular basis, so it might be a good idea to make the distinction
> more clear in the code.  I have an idea how to do that, but need some time
> to prototype it.
>  
>>> And I strongly feel that we shouldn't give another callback here but instead
>>> just set core to a specific freq as mentioned by driver with some other 
>>> field.
>>>
> I think we have two solutions here:
> - If its just about setting core a particular freq when it goes down, I
> think it
> looks a generic enough problem and so better fix core for that. Probably
> with
> help of flags field/suspend-freq (maybe renamed) and without calling
> drivers
> exit() at all..


 ATM the only thing that needs to be done in this case is to allow
 intel_pstate
 to set the P state on the core when it is going done.  My solution from the
 cores point of view is more generic, it allows any driver that needs to do
 work
 during CPU_DOWN_PREPARE to do it without adding any new logic to the core.
>>>
>>> Yeah, do we really need to give that freedom right now? Would be better
>>> to get this into core as that would be more generic and people looking to 
>>> set
>>> core to some freq at shutdown wouldn't be replicating that code.
> 
> Question is if it needs to be more generic.
> 
> I honestly don't think that ->target drivers will ever do anything like it,
> 

Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-18 Thread Srivatsa S. Bhat
On 03/15/2014 07:29 AM, Rafael J. Wysocki wrote:
 On Friday, March 14, 2014 11:29:04 AM Dirk Brandewie wrote:
 On 03/14/2014 10:07 AM, Viresh Kumar wrote:
 On 14 March 2014 20:40, Dirk Brandewie dirk.brande...@gmail.com wrote:
 Are you proposing adding cpufreq_generic_suspend() to the core I can not
 find
 it in the mainline code.

 Its already there in linux-next. I am suggesting to reuse that
 infrastructure with
 some necessary modification to support both suspend and hotplug.

 Suspend and hotplug are two very different things and if we start
 crossing those wires bad things are going to happen IMHO.

 In normal operation using the suspend path to do this work could
 work in principal but doesn't handle the case where the user does
 echo 0 | sudo tee /sys/devices/system/cpu/cpuX/online

 Trying force hotplug and suspend into a common mechanism would
 lead to a bunch of special case code or a significant rework of the
 core code IMHO.



 Over that I don't think Dirk's solution is going to work if we twist
 the systems a bit.

 Could you explain what twist the systems a bit means?

 The one I explained in the below paragraph.

 For example, Dirk probably wanted to set P-STATE of every core to MIN
 when it goes down. But his solution probably doesn't do that right now.


 No, I wanted to set the core that was being off-lined to min P state.

 Sorry, probably my words were a bit confusing. I meant exactly what
 you just wrote. Core going down will set its freq to min.

 As exit() is called only when the policy expires or all the CPUs of that
 policy
 are down. Suppose only one CPU out of 4 goes down from a policy, then
 pstate driver would never know that happened. And that core wouldn't go
 to min state.

 My patch does not change the semantics of exit() or when it is called.  For
 intel_pstate their is one cpu per policy so I moved all the cleanup to

 I didn't knew that its guaranteed by pstate driver. I thought it would 
 still be
 hardware dependent as some cores might share clock line.

 This is guaranteed by the hardware.  Each core has its own MSR for P state
 request.  Any coordination that is required between cores to select the
 package P state is handled by the hardware.


 exit_prepare() if there were work I needed to do at CPU_POST_DEAD I would
 have
 continued to export the *optional* exit callback.

 The callback name exit_prepare() was probably unfortunate and might be
 causing
 some confusion.  I will be changing the name per Rafael's feedback.

 Don't know.. There is another problem here that exit_prepare() would be 
 called
 for each CPU whereas exit() would be called for each policy.

 Granted but I don't see this as a problem in this case there is a 1:1
 relationship.  If a driver chooses to use the *optional* exit_prepare() 
 callback
 and knows that there is a many:1 relationship between the policy and CPUs
 then it would have to deal with it.
 
 Actually, I think we should make it clear that the new callback is for
 -setpolicy drivers only, which will make things a bit clearer.


Agreed. As far as I understand, for -target drivers, today we use GOV_STOP
to stop managing the CPU going offline. And for -setpolicy drivers, we will
use this new callback to achieve the same goal. 

Regards,
Srivatsa S. Bhat
 
 We seem to get caught by the difference between -setpolicy and -target
 drivers on a regular basis, so it might be a good idea to make the distinction
 more clear in the code.  I have an idea how to do that, but need some time
 to prototype it.
  
 And I strongly feel that we shouldn't give another callback here but instead
 just set core to a specific freq as mentioned by driver with some other 
 field.

 I think we have two solutions here:
 - If its just about setting core a particular freq when it goes down, I
 think it
 looks a generic enough problem and so better fix core for that. Probably
 with
 help of flags field/suspend-freq (maybe renamed) and without calling
 drivers
 exit() at all..


 ATM the only thing that needs to be done in this case is to allow
 intel_pstate
 to set the P state on the core when it is going done.  My solution from the
 cores point of view is more generic, it allows any driver that needs to do
 work
 during CPU_DOWN_PREPARE to do it without adding any new logic to the core.

 Yeah, do we really need to give that freedom right now? Would be better
 to get this into core as that would be more generic and people looking to 
 set
 core to some freq at shutdown wouldn't be replicating that code.
 
 Question is if it needs to be more generic.
 
 I honestly don't think that -target drivers will ever do anything like it,
 because they need the governor to exit before.  So we are talking about the
 only two -setpolicy drivers in the tree here.
 
 IMHO yes and it would be hard to be more generic, if your platform needs to
 do architecture specific during the PREPARE phase of cpu hotplug use this
 callback or not.

 BTW now that you have added a path 

Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-18 Thread Viresh Kumar
On 18 March 2014 17:46, Srivatsa S. Bhat
srivatsa.b...@linux.vnet.ibm.com wrote:
 Agreed. As far as I understand, for -target drivers, today we use GOV_STOP
 to stop managing the CPU going offline. And for -setpolicy drivers, we will
 use this new callback to achieve the same goal.

So a better question would be: What's the purpose of -stop() call for a policy?
Stop managing CPUs of that policy? Or even do something on CPUs of a policy
before CPUs are offlined?

Probably in the current solution Dirk is doing both these things..

And so I thought maybe its better not to restrict -stop() to just
setpolicy() drivers.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-17 Thread Viresh Kumar
On Sat, Mar 15, 2014 at 7:29 AM, Rafael J. Wysocki  wrote:
> I honestly don't think that ->target drivers will ever do anything like it,
> because they need the governor to "exit" before.  So we are talking about the
> only two ->setpolicy drivers in the tree here.

I don't have a example platform which would need it but as I can see there are
platforms which want to do this when CPUs goes down from disable_nonboot_cpus(),
I would be in favor to keep this option available for them as well..
They might want
to do this when last CPU of a policy goes down, they might be able to save some
power over there.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-17 Thread Viresh Kumar
Hi,

It was a long weekend in India due to some holidays and so couldn't reply.

On Fri, Mar 14, 2014 at 11:59 PM, Dirk Brandewie
 wrote:
> On 03/14/2014 10:07 AM, Viresh Kumar wrote:
> Suspend and hotplug are two very different things and if we start
> crossing those wires bad things are going to happen IMHO.
>
> In "normal" operation using the suspend path to do this work could
> work in principal but doesn't handle the case where the user does
>echo 0 | sudo tee /sys/devices/system/cpu/cpuX/online
>
> Trying force hotplug and suspend into a common mechanism would
> lead to a bunch of special case code or a significant rework of the
> core code IMHO.

What you said is correct, we shouldn't do it. But what I am asking for
is a bit different. The stuff we are doing in core on system suspend
isn't actually related to suspend but only CPU online/offline.

There are platforms which want to set CPUs to a particular frequency
before they are taken out by disable_nonboot_cpus. And then there
are platforms which want to do similar thing when CPUs are taken
down with help of sysfs files. But there is a common baseline there:
Set CPUs to a particular P-state before they are taken down.

And so I wanted to keep a common solution for both these requirements.

> This is guaranteed by the hardware.  Each core has its own MSR for P state
> request.  Any coordination that is required between cores to select the
> package P state is handled by the hardware.

I see.. Let me send some patches which I have in my mind and then we can
decide which set looks more reasonable :)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-17 Thread Viresh Kumar
Hi,

It was a long weekend in India due to some holidays and so couldn't reply.

On Fri, Mar 14, 2014 at 11:59 PM, Dirk Brandewie
dirk.brande...@gmail.com wrote:
 On 03/14/2014 10:07 AM, Viresh Kumar wrote:
 Suspend and hotplug are two very different things and if we start
 crossing those wires bad things are going to happen IMHO.

 In normal operation using the suspend path to do this work could
 work in principal but doesn't handle the case where the user does
echo 0 | sudo tee /sys/devices/system/cpu/cpuX/online

 Trying force hotplug and suspend into a common mechanism would
 lead to a bunch of special case code or a significant rework of the
 core code IMHO.

What you said is correct, we shouldn't do it. But what I am asking for
is a bit different. The stuff we are doing in core on system suspend
isn't actually related to suspend but only CPU online/offline.

There are platforms which want to set CPUs to a particular frequency
before they are taken out by disable_nonboot_cpus. And then there
are platforms which want to do similar thing when CPUs are taken
down with help of sysfs files. But there is a common baseline there:
Set CPUs to a particular P-state before they are taken down.

And so I wanted to keep a common solution for both these requirements.

 This is guaranteed by the hardware.  Each core has its own MSR for P state
 request.  Any coordination that is required between cores to select the
 package P state is handled by the hardware.

I see.. Let me send some patches which I have in my mind and then we can
decide which set looks more reasonable :)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-17 Thread Viresh Kumar
On Sat, Mar 15, 2014 at 7:29 AM, Rafael J. Wysocki r...@rjwysocki.net wrote:
 I honestly don't think that -target drivers will ever do anything like it,
 because they need the governor to exit before.  So we are talking about the
 only two -setpolicy drivers in the tree here.

I don't have a example platform which would need it but as I can see there are
platforms which want to do this when CPUs goes down from disable_nonboot_cpus(),
I would be in favor to keep this option available for them as well..
They might want
to do this when last CPU of a policy goes down, they might be able to save some
power over there.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-14 Thread Rafael J. Wysocki
On Friday, March 14, 2014 11:29:04 AM Dirk Brandewie wrote:
> On 03/14/2014 10:07 AM, Viresh Kumar wrote:
> > On 14 March 2014 20:40, Dirk Brandewie  wrote:
> >> Are you proposing adding cpufreq_generic_suspend() to the core I can not
> >> find
> >> it in the mainline code.
> >
> > Its already there in linux-next. I am suggesting to reuse that
> > infrastructure with
> > some necessary modification to support both suspend and hotplug.
> 
> Suspend and hotplug are two very different things and if we start
> crossing those wires bad things are going to happen IMHO.
> 
> In "normal" operation using the suspend path to do this work could
> work in principal but doesn't handle the case where the user does
> echo 0 | sudo tee /sys/devices/system/cpu/cpuX/online
> 
> Trying force hotplug and suspend into a common mechanism would
> lead to a bunch of special case code or a significant rework of the
> core code IMHO.
> 
> 
> >
> >>> Over that I don't think Dirk's solution is going to work if we twist
> >>> the systems a bit.
> >>
> >> Could you explain what "twist the systems a bit" means?
> >
> > The one I explained in the below paragraph.
> >
> >>> For example, Dirk probably wanted to set P-STATE of every core to MIN
> >>> when it goes down. But his solution probably doesn't do that right now.
> >>>
> >>
> >> No, I wanted to set the core that was being off-lined to min P state.
> >
> > Sorry, probably my words were a bit confusing. I meant exactly what
> > you just wrote. Core going down will set its freq to min.
> >
> >>> As exit() is called only when the policy expires or all the CPUs of that
> >>> policy
> >>> are down. Suppose only one CPU out of 4 goes down from a policy, then
> >>> pstate driver would never know that happened. And that core wouldn't go
> >>> to min state.
> >>
> >> My patch does not change the semantics of exit() or when it is called.  For
> >> intel_pstate their is one cpu per policy so I moved all the cleanup to
> >
> > I didn't knew that its guaranteed by pstate driver. I thought it would 
> > still be
> > hardware dependent as some cores might share clock line.
> 
> This is guaranteed by the hardware.  Each core has its own MSR for P state
> request.  Any coordination that is required between cores to select the
> package P state is handled by the hardware.
> 
> >
> >> exit_prepare() if there were work I needed to do at CPU_POST_DEAD I would
> >> have
> >> continued to export the *optional* exit callback.
> >>
> >> The callback name exit_prepare() was probably unfortunate and might be
> >> causing
> >> some confusion.  I will be changing the name per Rafael's feedback.
> >
> > Don't know.. There is another problem here that exit_prepare() would be 
> > called
> > for each CPU whereas exit() would be called for each policy.
> 
> Granted but I don't see this as a problem in this case there is a 1:1
> relationship.  If a driver chooses to use the *optional* exit_prepare() 
> callback
> and knows that there is a many:1 relationship between the policy and CPUs
> then it would have to deal with it.

Actually, I think we should make it clear that the new callback is for
->setpolicy drivers only, which will make things a bit clearer.

We seem to get caught by the difference between ->setpolicy and ->target
drivers on a regular basis, so it might be a good idea to make the distinction
more clear in the code.  I have an idea how to do that, but need some time
to prototype it.

> > And I strongly feel that we shouldn't give another callback here but instead
> > just set core to a specific freq as mentioned by driver with some other 
> > field.
> >
> >>> I think we have two solutions here:
> >>> - If its just about setting core a particular freq when it goes down, I
> >>> think it
> >>> looks a generic enough problem and so better fix core for that. Probably
> >>> with
> >>> help of flags field/suspend-freq (maybe renamed) and without calling
> >>> drivers
> >>> exit() at all..
> >>
> >>
> >> ATM the only thing that needs to be done in this case is to allow
> >> intel_pstate
> >> to set the P state on the core when it is going done.  My solution from the
> >> cores point of view is more generic, it allows any driver that needs to do
> >> work
> >> during CPU_DOWN_PREPARE to do it without adding any new logic to the core.
> >
> > Yeah, do we really need to give that freedom right now? Would be better
> > to get this into core as that would be more generic and people looking to 
> > set
> > core to some freq at shutdown wouldn't be replicating that code.

Question is if it needs to be more generic.

I honestly don't think that ->target drivers will ever do anything like it,
because they need the governor to "exit" before.  So we are talking about the
only two ->setpolicy drivers in the tree here.

> IMHO yes and it would be hard to be more generic, if your platform needs to
> do architecture specific during the PREPARE phase of cpu hotplug use this
> callback or not.
> 
> BTW now that you have 

Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-14 Thread Dirk Brandewie

On 03/14/2014 10:07 AM, Viresh Kumar wrote:

On 14 March 2014 20:40, Dirk Brandewie  wrote:

Are you proposing adding cpufreq_generic_suspend() to the core I can not
find
it in the mainline code.


Its already there in linux-next. I am suggesting to reuse that
infrastructure with
some necessary modification to support both suspend and hotplug.


Suspend and hotplug are two very different things and if we start
crossing those wires bad things are going to happen IMHO.

In "normal" operation using the suspend path to do this work could
work in principal but doesn't handle the case where the user does
   echo 0 | sudo tee /sys/devices/system/cpu/cpuX/online

Trying force hotplug and suspend into a common mechanism would
lead to a bunch of special case code or a significant rework of the
core code IMHO.





Over that I don't think Dirk's solution is going to work if we twist
the systems a bit.


Could you explain what "twist the systems a bit" means?


The one I explained in the below paragraph.


For example, Dirk probably wanted to set P-STATE of every core to MIN
when it goes down. But his solution probably doesn't do that right now.



No, I wanted to set the core that was being off-lined to min P state.


Sorry, probably my words were a bit confusing. I meant exactly what
you just wrote. Core going down will set its freq to min.


As exit() is called only when the policy expires or all the CPUs of that
policy
are down. Suppose only one CPU out of 4 goes down from a policy, then
pstate driver would never know that happened. And that core wouldn't go
to min state.


My patch does not change the semantics of exit() or when it is called.  For
intel_pstate their is one cpu per policy so I moved all the cleanup to


I didn't knew that its guaranteed by pstate driver. I thought it would still be
hardware dependent as some cores might share clock line.


This is guaranteed by the hardware.  Each core has its own MSR for P state
request.  Any coordination that is required between cores to select the
package P state is handled by the hardware.




exit_prepare() if there were work I needed to do at CPU_POST_DEAD I would
have
continued to export the *optional* exit callback.

The callback name exit_prepare() was probably unfortunate and might be
causing
some confusion.  I will be changing the name per Rafael's feedback.


Don't know.. There is another problem here that exit_prepare() would be called
for each CPU whereas exit() would be called for each policy.


Granted but I don't see this as a problem in this case there is a 1:1
relationship.  If a driver chooses to use the *optional* exit_prepare() callback
and knows that there is a many:1 relationship between the policy and CPUs
then it would have to deal with it.



And I strongly feel that we shouldn't give another callback here but instead
just set core to a specific freq as mentioned by driver with some other field.


I think we have two solutions here:
- If its just about setting core a particular freq when it goes down, I
think it
looks a generic enough problem and so better fix core for that. Probably
with
help of flags field/suspend-freq (maybe renamed) and without calling
drivers
exit() at all..



ATM the only thing that needs to be done in this case is to allow
intel_pstate
to set the P state on the core when it is going done.  My solution from the
cores point of view is more generic, it allows any driver that needs to do
work
during CPU_DOWN_PREPARE to do it without adding any new logic to the core.


Yeah, do we really need to give that freedom right now? Would be better
to get this into core as that would be more generic and people looking to set
core to some freq at shutdown wouldn't be replicating that code.


IMHO yes and it would be hard to be more generic, if your platform needs to
do architecture specific during the PREPARE phase of cpu hotplug use this
callback or not.

BTW now that you have added a path where the cpufreq_suspend() could fail
it return a value and be checked in dpm_suspend() instead of printing an
error and just continuing.





--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-14 Thread Viresh Kumar
On 14 March 2014 20:40, Dirk Brandewie  wrote:
> Are you proposing adding cpufreq_generic_suspend() to the core I can not
> find
> it in the mainline code.

Its already there in linux-next. I am suggesting to reuse that
infrastructure with
some necessary modification to support both suspend and hotplug.

>> Over that I don't think Dirk's solution is going to work if we twist
>> the systems a bit.
>
> Could you explain what "twist the systems a bit" means?

The one I explained in the below paragraph.

>> For example, Dirk probably wanted to set P-STATE of every core to MIN
>> when it goes down. But his solution probably doesn't do that right now.
>>
>
> No, I wanted to set the core that was being off-lined to min P state.

Sorry, probably my words were a bit confusing. I meant exactly what
you just wrote. Core going down will set its freq to min.

>> As exit() is called only when the policy expires or all the CPUs of that
>> policy
>> are down. Suppose only one CPU out of 4 goes down from a policy, then
>> pstate driver would never know that happened. And that core wouldn't go
>> to min state.
>
> My patch does not change the semantics of exit() or when it is called.  For
> intel_pstate their is one cpu per policy so I moved all the cleanup to

I didn't knew that its guaranteed by pstate driver. I thought it would still be
hardware dependent as some cores might share clock line.

> exit_prepare() if there were work I needed to do at CPU_POST_DEAD I would
> have
> continued to export the *optional* exit callback.
>
> The callback name exit_prepare() was probably unfortunate and might be
> causing
> some confusion.  I will be changing the name per Rafael's feedback.

Don't know.. There is another problem here that exit_prepare() would be called
for each CPU whereas exit() would be called for each policy.

And I strongly feel that we shouldn't give another callback here but instead
just set core to a specific freq as mentioned by driver with some other field.

>> I think we have two solutions here:
>> - If its just about setting core a particular freq when it goes down, I
>> think it
>> looks a generic enough problem and so better fix core for that. Probably
>> with
>> help of flags field/suspend-freq (maybe renamed) and without calling
>> drivers
>> exit() at all..
>
>
> ATM the only thing that needs to be done in this case is to allow
> intel_pstate
> to set the P state on the core when it is going done.  My solution from the
> cores point of view is more generic, it allows any driver that needs to do
> work
> during CPU_DOWN_PREPARE to do it without adding any new logic to the core.

Yeah, do we really need to give that freedom right now? Would be better
to get this into core as that would be more generic and people looking to set
core to some freq at shutdown wouldn't be replicating that code.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-14 Thread Dirk Brandewie

On 03/13/2014 09:59 PM, Viresh Kumar wrote:

On Thu, Mar 13, 2014 at 11:06 PM,   wrote:

From: Dirk Brandewie 

Some drivers (intel_pstate) need to modify state on a core before it
is completely offline.  The ->exit() callback is executed during the
CPU_POST_DEAD phase of the cpu offline process which is too late to
change the state of the core.

Patch 1 adds an optional callback function to the cpufreq_driver
interface which is called by the core during the CPU_DOWN_PREPARE
phase of cpu offline in __cpufreq_remove_dev_prepare().

Patch 2 changes intel_pstate to use the ->exit_prepare callback to do
its cleanup during cpu offline.


Copying stuff from other mail thread here so that we can discuss on a
single mail chain.

On 14 March 2014 03:09, Rafael J. Wysocki  wrote:

On Thursday, March 13, 2014 12:56:02 PM Viresh Kumar wrote:

On 13 March 2014 08:07, Rafael J. Wysocki  wrote:

On Wednesday, March 12, 2014 02:27:07 PM Dirk Brandewie wrote:



I see two possibilities:
   1. Move the exit() callback to __cpufreq_remove_dev_prepare().  I don't
  have a good understanding of what carnage this would cause in the core
  or other scaling drivers.

   2. Add another callback to the cpufreq_driver interface that would be call
  from __cpufreq_remove_dev_prepare() if the callback was set.


I prefer 2, the reason being that it pretty much is guaranteed not to break
anything.  For the record, I'm not a fan of adding new cpufreq driver callbacks,
but in this particular case it seems we can't really do anything better.


I haven't thought a lot about which one of these two looks better, probably
Rafael might be correct. But I can see another way out here as this is very
much driver specific. Why can't we do a register_hotcpu_notifier() from
pstate driver alone?


Why would that be better than adding a new callback?


Because its becoming more and more confusing. Probably we got the problem
right but have wrong solutions for it.

But having considered this issue in detail now, I have more inputs. All Dirk and
Patrick wanted is to set core to min P-state before it gets offlined. We already
have some infrastructure in core which is called this today:
cpufreq_generic_suspend(). We can rework on that to get a more ideal solution
for both the problems.


Are you proposing adding cpufreq_generic_suspend() to the core I can not find
it in the mainline code.



Over that I don't think Dirk's solution is going to work if we twist
the systems a bit.


Could you explain what "twist the systems a bit" means?


For example, Dirk probably wanted to set P-STATE of every core to MIN
when it goes down. But his solution probably doesn't do that right now.



No, I wanted to set the core that was being off-lined to min P state.


As exit() is called only when the policy expires or all the CPUs of that policy
are down. Suppose only one CPU out of 4 goes down from a policy, then
pstate driver would never know that happened. And that core wouldn't go
to min state.


My patch does not change the semantics of exit() or when it is called.  For
intel_pstate their is one cpu per policy so I moved all the cleanup to
exit_prepare() if there were work I needed to do at CPU_POST_DEAD I would have
continued to export the *optional* exit callback.

The callback name exit_prepare() was probably unfortunate and might be causing
some confusion.  I will be changing the name per Rafael's feedback.


I think we have two solutions here:
- If its just about setting core a particular freq when it goes down, I think it
looks a generic enough problem and so better fix core for that. Probably with
help of flags field/suspend-freq (maybe renamed) and without calling drivers
exit() at all..


ATM the only thing that needs to be done in this case is to allow intel_pstate
to set the P state on the core when it is going done.  My solution from the
cores point of view is more generic, it allows any driver that needs to do work
during CPU_DOWN_PREPARE to do it without adding any new logic to the core.



- If this is highly driver specific (which doesn't look like if all we
have to do is setting freq to MIN),


That is why intel_pstate is the *only* driver using exit_prepare().  If
other drivers need to do work during this phase of the cpu hotplug process
then they can export the callback otherwise it has no affect on existing
or future scaling drivers.


then better have something like
register_hotcpu_notifier() with priority set to -1, so that it gets called after
cpufreq.


intel_pstate or any scaling driver having its own hotcpu_notifier makes possible
for the core and the driver to disagree on the state of the driver.  Having
the driver take hotplug notifications from two directions is a bad idea IMHO.



--
viresh



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  

Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-14 Thread Dirk Brandewie

On 03/13/2014 09:59 PM, Viresh Kumar wrote:

On Thu, Mar 13, 2014 at 11:06 PM,  dirk.brande...@gmail.com wrote:

From: Dirk Brandewie dirk.j.brande...@intel.com

Some drivers (intel_pstate) need to modify state on a core before it
is completely offline.  The -exit() callback is executed during the
CPU_POST_DEAD phase of the cpu offline process which is too late to
change the state of the core.

Patch 1 adds an optional callback function to the cpufreq_driver
interface which is called by the core during the CPU_DOWN_PREPARE
phase of cpu offline in __cpufreq_remove_dev_prepare().

Patch 2 changes intel_pstate to use the -exit_prepare callback to do
its cleanup during cpu offline.


Copying stuff from other mail thread here so that we can discuss on a
single mail chain.

On 14 March 2014 03:09, Rafael J. Wysocki r...@rjwysocki.net wrote:

On Thursday, March 13, 2014 12:56:02 PM Viresh Kumar wrote:

On 13 March 2014 08:07, Rafael J. Wysocki r...@rjwysocki.net wrote:

On Wednesday, March 12, 2014 02:27:07 PM Dirk Brandewie wrote:



I see two possibilities:
   1. Move the exit() callback to __cpufreq_remove_dev_prepare().  I don't
  have a good understanding of what carnage this would cause in the core
  or other scaling drivers.

   2. Add another callback to the cpufreq_driver interface that would be call
  from __cpufreq_remove_dev_prepare() if the callback was set.


I prefer 2, the reason being that it pretty much is guaranteed not to break
anything.  For the record, I'm not a fan of adding new cpufreq driver callbacks,
but in this particular case it seems we can't really do anything better.


I haven't thought a lot about which one of these two looks better, probably
Rafael might be correct. But I can see another way out here as this is very
much driver specific. Why can't we do a register_hotcpu_notifier() from
pstate driver alone?


Why would that be better than adding a new callback?


Because its becoming more and more confusing. Probably we got the problem
right but have wrong solutions for it.

But having considered this issue in detail now, I have more inputs. All Dirk and
Patrick wanted is to set core to min P-state before it gets offlined. We already
have some infrastructure in core which is called this today:
cpufreq_generic_suspend(). We can rework on that to get a more ideal solution
for both the problems.


Are you proposing adding cpufreq_generic_suspend() to the core I can not find
it in the mainline code.



Over that I don't think Dirk's solution is going to work if we twist
the systems a bit.


Could you explain what twist the systems a bit means?


For example, Dirk probably wanted to set P-STATE of every core to MIN
when it goes down. But his solution probably doesn't do that right now.



No, I wanted to set the core that was being off-lined to min P state.


As exit() is called only when the policy expires or all the CPUs of that policy
are down. Suppose only one CPU out of 4 goes down from a policy, then
pstate driver would never know that happened. And that core wouldn't go
to min state.


My patch does not change the semantics of exit() or when it is called.  For
intel_pstate their is one cpu per policy so I moved all the cleanup to
exit_prepare() if there were work I needed to do at CPU_POST_DEAD I would have
continued to export the *optional* exit callback.

The callback name exit_prepare() was probably unfortunate and might be causing
some confusion.  I will be changing the name per Rafael's feedback.


I think we have two solutions here:
- If its just about setting core a particular freq when it goes down, I think it
looks a generic enough problem and so better fix core for that. Probably with
help of flags field/suspend-freq (maybe renamed) and without calling drivers
exit() at all..


ATM the only thing that needs to be done in this case is to allow intel_pstate
to set the P state on the core when it is going done.  My solution from the
cores point of view is more generic, it allows any driver that needs to do work
during CPU_DOWN_PREPARE to do it without adding any new logic to the core.



- If this is highly driver specific (which doesn't look like if all we
have to do is setting freq to MIN),


That is why intel_pstate is the *only* driver using exit_prepare().  If
other drivers need to do work during this phase of the cpu hotplug process
then they can export the callback otherwise it has no affect on existing
or future scaling drivers.


then better have something like
register_hotcpu_notifier() with priority set to -1, so that it gets called after
cpufreq.


intel_pstate or any scaling driver having its own hotcpu_notifier makes possible
for the core and the driver to disagree on the state of the driver.  Having
the driver take hotplug notifications from two directions is a bad idea IMHO.



--
viresh



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  

Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-14 Thread Viresh Kumar
On 14 March 2014 20:40, Dirk Brandewie dirk.brande...@gmail.com wrote:
 Are you proposing adding cpufreq_generic_suspend() to the core I can not
 find
 it in the mainline code.

Its already there in linux-next. I am suggesting to reuse that
infrastructure with
some necessary modification to support both suspend and hotplug.

 Over that I don't think Dirk's solution is going to work if we twist
 the systems a bit.

 Could you explain what twist the systems a bit means?

The one I explained in the below paragraph.

 For example, Dirk probably wanted to set P-STATE of every core to MIN
 when it goes down. But his solution probably doesn't do that right now.


 No, I wanted to set the core that was being off-lined to min P state.

Sorry, probably my words were a bit confusing. I meant exactly what
you just wrote. Core going down will set its freq to min.

 As exit() is called only when the policy expires or all the CPUs of that
 policy
 are down. Suppose only one CPU out of 4 goes down from a policy, then
 pstate driver would never know that happened. And that core wouldn't go
 to min state.

 My patch does not change the semantics of exit() or when it is called.  For
 intel_pstate their is one cpu per policy so I moved all the cleanup to

I didn't knew that its guaranteed by pstate driver. I thought it would still be
hardware dependent as some cores might share clock line.

 exit_prepare() if there were work I needed to do at CPU_POST_DEAD I would
 have
 continued to export the *optional* exit callback.

 The callback name exit_prepare() was probably unfortunate and might be
 causing
 some confusion.  I will be changing the name per Rafael's feedback.

Don't know.. There is another problem here that exit_prepare() would be called
for each CPU whereas exit() would be called for each policy.

And I strongly feel that we shouldn't give another callback here but instead
just set core to a specific freq as mentioned by driver with some other field.

 I think we have two solutions here:
 - If its just about setting core a particular freq when it goes down, I
 think it
 looks a generic enough problem and so better fix core for that. Probably
 with
 help of flags field/suspend-freq (maybe renamed) and without calling
 drivers
 exit() at all..


 ATM the only thing that needs to be done in this case is to allow
 intel_pstate
 to set the P state on the core when it is going done.  My solution from the
 cores point of view is more generic, it allows any driver that needs to do
 work
 during CPU_DOWN_PREPARE to do it without adding any new logic to the core.

Yeah, do we really need to give that freedom right now? Would be better
to get this into core as that would be more generic and people looking to set
core to some freq at shutdown wouldn't be replicating that code.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-14 Thread Dirk Brandewie

On 03/14/2014 10:07 AM, Viresh Kumar wrote:

On 14 March 2014 20:40, Dirk Brandewie dirk.brande...@gmail.com wrote:

Are you proposing adding cpufreq_generic_suspend() to the core I can not
find
it in the mainline code.


Its already there in linux-next. I am suggesting to reuse that
infrastructure with
some necessary modification to support both suspend and hotplug.


Suspend and hotplug are two very different things and if we start
crossing those wires bad things are going to happen IMHO.

In normal operation using the suspend path to do this work could
work in principal but doesn't handle the case where the user does
   echo 0 | sudo tee /sys/devices/system/cpu/cpuX/online

Trying force hotplug and suspend into a common mechanism would
lead to a bunch of special case code or a significant rework of the
core code IMHO.





Over that I don't think Dirk's solution is going to work if we twist
the systems a bit.


Could you explain what twist the systems a bit means?


The one I explained in the below paragraph.


For example, Dirk probably wanted to set P-STATE of every core to MIN
when it goes down. But his solution probably doesn't do that right now.



No, I wanted to set the core that was being off-lined to min P state.


Sorry, probably my words were a bit confusing. I meant exactly what
you just wrote. Core going down will set its freq to min.


As exit() is called only when the policy expires or all the CPUs of that
policy
are down. Suppose only one CPU out of 4 goes down from a policy, then
pstate driver would never know that happened. And that core wouldn't go
to min state.


My patch does not change the semantics of exit() or when it is called.  For
intel_pstate their is one cpu per policy so I moved all the cleanup to


I didn't knew that its guaranteed by pstate driver. I thought it would still be
hardware dependent as some cores might share clock line.


This is guaranteed by the hardware.  Each core has its own MSR for P state
request.  Any coordination that is required between cores to select the
package P state is handled by the hardware.




exit_prepare() if there were work I needed to do at CPU_POST_DEAD I would
have
continued to export the *optional* exit callback.

The callback name exit_prepare() was probably unfortunate and might be
causing
some confusion.  I will be changing the name per Rafael's feedback.


Don't know.. There is another problem here that exit_prepare() would be called
for each CPU whereas exit() would be called for each policy.


Granted but I don't see this as a problem in this case there is a 1:1
relationship.  If a driver chooses to use the *optional* exit_prepare() callback
and knows that there is a many:1 relationship between the policy and CPUs
then it would have to deal with it.



And I strongly feel that we shouldn't give another callback here but instead
just set core to a specific freq as mentioned by driver with some other field.


I think we have two solutions here:
- If its just about setting core a particular freq when it goes down, I
think it
looks a generic enough problem and so better fix core for that. Probably
with
help of flags field/suspend-freq (maybe renamed) and without calling
drivers
exit() at all..



ATM the only thing that needs to be done in this case is to allow
intel_pstate
to set the P state on the core when it is going done.  My solution from the
cores point of view is more generic, it allows any driver that needs to do
work
during CPU_DOWN_PREPARE to do it without adding any new logic to the core.


Yeah, do we really need to give that freedom right now? Would be better
to get this into core as that would be more generic and people looking to set
core to some freq at shutdown wouldn't be replicating that code.


IMHO yes and it would be hard to be more generic, if your platform needs to
do architecture specific during the PREPARE phase of cpu hotplug use this
callback or not.

BTW now that you have added a path where the cpufreq_suspend() could fail
it return a value and be checked in dpm_suspend() instead of printing an
error and just continuing.





--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-14 Thread Rafael J. Wysocki
On Friday, March 14, 2014 11:29:04 AM Dirk Brandewie wrote:
 On 03/14/2014 10:07 AM, Viresh Kumar wrote:
  On 14 March 2014 20:40, Dirk Brandewie dirk.brande...@gmail.com wrote:
  Are you proposing adding cpufreq_generic_suspend() to the core I can not
  find
  it in the mainline code.
 
  Its already there in linux-next. I am suggesting to reuse that
  infrastructure with
  some necessary modification to support both suspend and hotplug.
 
 Suspend and hotplug are two very different things and if we start
 crossing those wires bad things are going to happen IMHO.
 
 In normal operation using the suspend path to do this work could
 work in principal but doesn't handle the case where the user does
 echo 0 | sudo tee /sys/devices/system/cpu/cpuX/online
 
 Trying force hotplug and suspend into a common mechanism would
 lead to a bunch of special case code or a significant rework of the
 core code IMHO.
 
 
 
  Over that I don't think Dirk's solution is going to work if we twist
  the systems a bit.
 
  Could you explain what twist the systems a bit means?
 
  The one I explained in the below paragraph.
 
  For example, Dirk probably wanted to set P-STATE of every core to MIN
  when it goes down. But his solution probably doesn't do that right now.
 
 
  No, I wanted to set the core that was being off-lined to min P state.
 
  Sorry, probably my words were a bit confusing. I meant exactly what
  you just wrote. Core going down will set its freq to min.
 
  As exit() is called only when the policy expires or all the CPUs of that
  policy
  are down. Suppose only one CPU out of 4 goes down from a policy, then
  pstate driver would never know that happened. And that core wouldn't go
  to min state.
 
  My patch does not change the semantics of exit() or when it is called.  For
  intel_pstate their is one cpu per policy so I moved all the cleanup to
 
  I didn't knew that its guaranteed by pstate driver. I thought it would 
  still be
  hardware dependent as some cores might share clock line.
 
 This is guaranteed by the hardware.  Each core has its own MSR for P state
 request.  Any coordination that is required between cores to select the
 package P state is handled by the hardware.
 
 
  exit_prepare() if there were work I needed to do at CPU_POST_DEAD I would
  have
  continued to export the *optional* exit callback.
 
  The callback name exit_prepare() was probably unfortunate and might be
  causing
  some confusion.  I will be changing the name per Rafael's feedback.
 
  Don't know.. There is another problem here that exit_prepare() would be 
  called
  for each CPU whereas exit() would be called for each policy.
 
 Granted but I don't see this as a problem in this case there is a 1:1
 relationship.  If a driver chooses to use the *optional* exit_prepare() 
 callback
 and knows that there is a many:1 relationship between the policy and CPUs
 then it would have to deal with it.

Actually, I think we should make it clear that the new callback is for
-setpolicy drivers only, which will make things a bit clearer.

We seem to get caught by the difference between -setpolicy and -target
drivers on a regular basis, so it might be a good idea to make the distinction
more clear in the code.  I have an idea how to do that, but need some time
to prototype it.

  And I strongly feel that we shouldn't give another callback here but instead
  just set core to a specific freq as mentioned by driver with some other 
  field.
 
  I think we have two solutions here:
  - If its just about setting core a particular freq when it goes down, I
  think it
  looks a generic enough problem and so better fix core for that. Probably
  with
  help of flags field/suspend-freq (maybe renamed) and without calling
  drivers
  exit() at all..
 
 
  ATM the only thing that needs to be done in this case is to allow
  intel_pstate
  to set the P state on the core when it is going done.  My solution from the
  cores point of view is more generic, it allows any driver that needs to do
  work
  during CPU_DOWN_PREPARE to do it without adding any new logic to the core.
 
  Yeah, do we really need to give that freedom right now? Would be better
  to get this into core as that would be more generic and people looking to 
  set
  core to some freq at shutdown wouldn't be replicating that code.

Question is if it needs to be more generic.

I honestly don't think that -target drivers will ever do anything like it,
because they need the governor to exit before.  So we are talking about the
only two -setpolicy drivers in the tree here.

 IMHO yes and it would be hard to be more generic, if your platform needs to
 do architecture specific during the PREPARE phase of cpu hotplug use this
 callback or not.
 
 BTW now that you have added a path where the cpufreq_suspend() could fail
 it return a value and be checked in dpm_suspend() instead of printing an
 error and just continuing.

I'm not sure what you mean?  Are you saying that it might be a good idea

Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-13 Thread Viresh Kumar
On Thu, Mar 13, 2014 at 11:06 PM,   wrote:
> From: Dirk Brandewie 
>
> Some drivers (intel_pstate) need to modify state on a core before it
> is completely offline.  The ->exit() callback is executed during the
> CPU_POST_DEAD phase of the cpu offline process which is too late to
> change the state of the core.
>
> Patch 1 adds an optional callback function to the cpufreq_driver
> interface which is called by the core during the CPU_DOWN_PREPARE
> phase of cpu offline in __cpufreq_remove_dev_prepare().
>
> Patch 2 changes intel_pstate to use the ->exit_prepare callback to do
> its cleanup during cpu offline.

Copying stuff from other mail thread here so that we can discuss on a
single mail chain.

On 14 March 2014 03:09, Rafael J. Wysocki  wrote:
> On Thursday, March 13, 2014 12:56:02 PM Viresh Kumar wrote:
>> On 13 March 2014 08:07, Rafael J. Wysocki  wrote:
>> > On Wednesday, March 12, 2014 02:27:07 PM Dirk Brandewie wrote:
>>
>> >> > I see two possibilities:
>> >> >   1. Move the exit() callback to __cpufreq_remove_dev_prepare().  I 
>> >> > don't
>> >> >  have a good understanding of what carnage this would cause in the 
>> >> > core
>> >> >  or other scaling drivers.
>> >> >
>> >> >   2. Add another callback to the cpufreq_driver interface that would be 
>> >> > call
>> >> >  from __cpufreq_remove_dev_prepare() if the callback was set.
>> >
>> > I prefer 2, the reason being that it pretty much is guaranteed not to break
>> > anything.  For the record, I'm not a fan of adding new cpufreq driver 
>> > callbacks,
>> > but in this particular case it seems we can't really do anything better.
>>
>> I haven't thought a lot about which one of these two looks better, probably
>> Rafael might be correct. But I can see another way out here as this is very
>> much driver specific. Why can't we do a register_hotcpu_notifier() from
>> pstate driver alone?
>
> Why would that be better than adding a new callback?

Because its becoming more and more confusing. Probably we got the problem
right but have wrong solutions for it.

But having considered this issue in detail now, I have more inputs. All Dirk and
Patrick wanted is to set core to min P-state before it gets offlined. We already
have some infrastructure in core which is called this today:
cpufreq_generic_suspend(). We can rework on that to get a more ideal solution
for both the problems.

Over that I don't think Dirk's solution is going to work if we twist
the systems a
bit. For example, Dirk probably wanted to set P-STATE of every core to MIN
when it goes down. But his solution probably doesn't do that right now.

As exit() is called only when the policy expires or all the CPUs of that policy
are down. Suppose only one CPU out of 4 goes down from a policy, then
pstate driver would never know that happened. And that core wouldn't go
to min state.

I think we have two solutions here:
- If its just about setting core a particular freq when it goes down, I think it
looks a generic enough problem and so better fix core for that. Probably with
help of flags field/suspend-freq (maybe renamed) and without calling drivers
exit() at all..

- If this is highly driver specific (which doesn't look like if all we
have to do
is setting freq to MIN), then better have something like
register_hotcpu_notifier() with priority set to -1, so that it gets called after
cpufreq.

--
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-13 Thread dirk . brandewie
From: Dirk Brandewie 

Some drivers (intel_pstate) need to modify state on a core before it
is completely offline.  The ->exit() callback is executed during the
CPU_POST_DEAD phase of the cpu offline process which is too late to
change the state of the core.

Patch 1 adds an optional callback function to the cpufreq_driver
interface which is called by the core during the CPU_DOWN_PREPARE
phase of cpu offline in __cpufreq_remove_dev_prepare().

Patch 2 changes intel_pstate to use the ->exit_prepare callback to do
its cleanup during cpu offline. 

Dirk Brandewie (2):
  cpufreq: Add exit_prepare callback to cpufreq_driver interface
  intel_pstate: Set core to min P state during core offline

 Documentation/cpu-freq/cpu-drivers.txt |  8 +++-
 drivers/cpufreq/cpufreq.c  |  3 +++
 drivers/cpufreq/intel_pstate.c | 20 +---
 include/linux/cpufreq.h|  1 +
 4 files changed, 24 insertions(+), 8 deletions(-)

-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-13 Thread dirk . brandewie
From: Dirk Brandewie dirk.j.brande...@intel.com

Some drivers (intel_pstate) need to modify state on a core before it
is completely offline.  The -exit() callback is executed during the
CPU_POST_DEAD phase of the cpu offline process which is too late to
change the state of the core.

Patch 1 adds an optional callback function to the cpufreq_driver
interface which is called by the core during the CPU_DOWN_PREPARE
phase of cpu offline in __cpufreq_remove_dev_prepare().

Patch 2 changes intel_pstate to use the -exit_prepare callback to do
its cleanup during cpu offline. 

Dirk Brandewie (2):
  cpufreq: Add exit_prepare callback to cpufreq_driver interface
  intel_pstate: Set core to min P state during core offline

 Documentation/cpu-freq/cpu-drivers.txt |  8 +++-
 drivers/cpufreq/cpufreq.c  |  3 +++
 drivers/cpufreq/intel_pstate.c | 20 +---
 include/linux/cpufreq.h|  1 +
 4 files changed, 24 insertions(+), 8 deletions(-)

-- 
1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Add exit_prepare callback to the cpufreq_driver interface.

2014-03-13 Thread Viresh Kumar
On Thu, Mar 13, 2014 at 11:06 PM,  dirk.brande...@gmail.com wrote:
 From: Dirk Brandewie dirk.j.brande...@intel.com

 Some drivers (intel_pstate) need to modify state on a core before it
 is completely offline.  The -exit() callback is executed during the
 CPU_POST_DEAD phase of the cpu offline process which is too late to
 change the state of the core.

 Patch 1 adds an optional callback function to the cpufreq_driver
 interface which is called by the core during the CPU_DOWN_PREPARE
 phase of cpu offline in __cpufreq_remove_dev_prepare().

 Patch 2 changes intel_pstate to use the -exit_prepare callback to do
 its cleanup during cpu offline.

Copying stuff from other mail thread here so that we can discuss on a
single mail chain.

On 14 March 2014 03:09, Rafael J. Wysocki r...@rjwysocki.net wrote:
 On Thursday, March 13, 2014 12:56:02 PM Viresh Kumar wrote:
 On 13 March 2014 08:07, Rafael J. Wysocki r...@rjwysocki.net wrote:
  On Wednesday, March 12, 2014 02:27:07 PM Dirk Brandewie wrote:

   I see two possibilities:
 1. Move the exit() callback to __cpufreq_remove_dev_prepare().  I 
   don't
have a good understanding of what carnage this would cause in the 
   core
or other scaling drivers.
  
 2. Add another callback to the cpufreq_driver interface that would be 
   call
from __cpufreq_remove_dev_prepare() if the callback was set.
 
  I prefer 2, the reason being that it pretty much is guaranteed not to break
  anything.  For the record, I'm not a fan of adding new cpufreq driver 
  callbacks,
  but in this particular case it seems we can't really do anything better.

 I haven't thought a lot about which one of these two looks better, probably
 Rafael might be correct. But I can see another way out here as this is very
 much driver specific. Why can't we do a register_hotcpu_notifier() from
 pstate driver alone?

 Why would that be better than adding a new callback?

Because its becoming more and more confusing. Probably we got the problem
right but have wrong solutions for it.

But having considered this issue in detail now, I have more inputs. All Dirk and
Patrick wanted is to set core to min P-state before it gets offlined. We already
have some infrastructure in core which is called this today:
cpufreq_generic_suspend(). We can rework on that to get a more ideal solution
for both the problems.

Over that I don't think Dirk's solution is going to work if we twist
the systems a
bit. For example, Dirk probably wanted to set P-STATE of every core to MIN
when it goes down. But his solution probably doesn't do that right now.

As exit() is called only when the policy expires or all the CPUs of that policy
are down. Suppose only one CPU out of 4 goes down from a policy, then
pstate driver would never know that happened. And that core wouldn't go
to min state.

I think we have two solutions here:
- If its just about setting core a particular freq when it goes down, I think it
looks a generic enough problem and so better fix core for that. Probably with
help of flags field/suspend-freq (maybe renamed) and without calling drivers
exit() at all..

- If this is highly driver specific (which doesn't look like if all we
have to do
is setting freq to MIN), then better have something like
register_hotcpu_notifier() with priority set to -1, so that it gets called after
cpufreq.

--
viresh
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/