Re: [PATCH] cpufreq: powernv: Add support of frequency domain

2017-12-20 Thread Gautham R Shenoy
On Tue, Dec 19, 2017 at 09:21:52PM +1100, Balbir Singh wrote:
> On Tue, Dec 19, 2017 at 8:20 PM, Gautham R Shenoy wrote:
> > Hi Viresh,
> > On Mon, Dec 18, 2017 at 01:59:35PM +0530, Viresh Kumar wrote:
> >> On 18-12-17, 10:41, Abhishek wrote:
> >> > We need to do it in this way as the current implementation takes the
> >> > max of the PMCRs of the cores. Thus, when the frequency needs to be
> >> > ramped up, it suffices to write to just the local PMCR, but when the
> >> > frequency is to be ramped down, if we don't send the IPI it breaks
> >> > compatibility with P8.
> >>
> >> Looks strange really that you have to program this differently for
> >> speeding up or down. These CPUs are part of one cpufreq policy and so I
> >> would normally expect changes to any CPU should reflect for other CPUs
> >> as well.
> >>
> >> @Gautham: Do you know why it is so ?
> >>
> >
> > This is due to an implementation quirk: the platform provides a
> > per-core PMCR to be backward compatible with POWER8, but controls the
> > frequency at the quad level, taking the maximum of the four PMCR
> > values instead of the latest one. So a change made on any core is
> > reflected on all the cores of the quad if the requested frequency is
> > higher than the current frequency, but not necessarily if it is
> > lower.
> >
> > Without the extra IPIs we would break the ABI: if we select the
> > userspace governor and lower the frequency of one core, the change
> > will not be reflected on the CPUs of the other cores in the quad.
> 
> 
> What about cpufreq_policy->cpus/related_cpus? Am I missing something?

The frequency-domain indicator passed via the device tree is used to
derive the mask of CPUs that share the same frequency. It is this mask
that is assigned to cpufreq_policy->cpus/related_cpus.
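
To illustrate the mask derivation described above, here is a minimal
user-space sketch. It assumes, purely for illustration, that two CPUs
share a frequency domain when their hardware ids agree on the bits
selected by the indicator. The function name freq_domain_mask(), the
hw_ids array and the 64-CPU limit are hypothetical and are not taken
from the patch or the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: CPU i and CPU `cpu` are in the same frequency
 * domain when (hw_id & indicator) is equal for both. Bit i of the
 * returned mask is set for every CPU in the domain of `cpu`.
 */
static uint64_t freq_domain_mask(const uint32_t *hw_ids, size_t ncpus,
                                 uint32_t indicator, size_t cpu)
{
    uint64_t mask = 0;

    assert(ncpus <= 64 && cpu < ncpus);

    for (size_t i = 0; i < ncpus; i++) {
        if ((hw_ids[i] & indicator) == (hw_ids[cpu] & indicator))
            mask |= UINT64_C(1) << i;
    }
    return mask;
}
```

With eight CPUs whose ids are 0..7 and an indicator that ignores the low
two bits, CPUs 4..7 end up in one domain. In the driver, a mask of this
kind would be what is copied into policy->cpus.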


> 
> >
> > Abhishek,
> > I think we can rework this by sending the extra IPIs only in the
> > presence of the quirk, which can be indicated through a device-tree
> > parameter. If a future implementation fixes this, then we won't need
> > the extra IPIs.
> 
> Balbir Singh.
> 



Re: [PATCH] cpufreq: powernv: Add support of frequency domain

2017-12-19 Thread Balbir Singh
On Tue, Dec 19, 2017 at 8:20 PM, Gautham R Shenoy wrote:
> Hi Viresh,
> On Mon, Dec 18, 2017 at 01:59:35PM +0530, Viresh Kumar wrote:
>> On 18-12-17, 10:41, Abhishek wrote:
>> > We need to do it in this way as the current implementation takes the
>> > max of the PMCRs of the cores. Thus, when the frequency needs to be
>> > ramped up, it suffices to write to just the local PMCR, but when the
>> > frequency is to be ramped down, if we don't send the IPI it breaks
>> > compatibility with P8.
>>
>> Looks strange really that you have to program this differently for
>> speeding up or down. These CPUs are part of one cpufreq policy and so I
>> would normally expect changes to any CPU should reflect for other CPUs
>> as well.
>>
>> @Gautham: Do you know why it is so ?
>>
>
> This is due to an implementation quirk: the platform provides a
> per-core PMCR to be backward compatible with POWER8, but controls the
> frequency at the quad level, taking the maximum of the four PMCR
> values instead of the latest one. So a change made on any core is
> reflected on all the cores of the quad if the requested frequency is
> higher than the current frequency, but not necessarily if it is
> lower.
>
> Without the extra IPIs we would break the ABI: if we select the
> userspace governor and lower the frequency of one core, the change
> will not be reflected on the CPUs of the other cores in the quad.


What about cpufreq_policy->cpus/related_cpus? Am I missing something?

>
> Abhishek,
> I think we can rework this by sending the extra IPIs only in the
> presence of the quirk, which can be indicated through a device-tree
> parameter. If a future implementation fixes this, then we won't need
> the extra IPIs.

Balbir Singh.


Re: [PATCH] cpufreq: powernv: Add support of frequency domain

2017-12-19 Thread Gautham R Shenoy
Hi Viresh,
On Mon, Dec 18, 2017 at 01:59:35PM +0530, Viresh Kumar wrote:
> On 18-12-17, 10:41, Abhishek wrote:
> > We need to do it in this way as the current implementation takes the
> > max of the PMCRs of the cores. Thus, when the frequency needs to be
> > ramped up, it suffices to write to just the local PMCR, but when the
> > frequency is to be ramped down, if we don't send the IPI it breaks
> > compatibility with P8.
> 
> Looks strange really that you have to program this differently for speeding up
> or down. These CPUs are part of one cpufreq policy and so I would normally
> expect changes to any CPU should reflect for other CPUs as well.
> 
> @Gautham: Do you know why it is so ?
> 

This is due to an implementation quirk: the platform provides a
per-core PMCR to be backward compatible with POWER8, but controls the
frequency at the quad level, taking the maximum of the four PMCR
values instead of the latest one. So a change made on any core is
reflected on all the cores of the quad if the requested frequency is
higher than the current frequency, but not necessarily if it is
lower.

Without the extra IPIs we would break the ABI: if we select the
userspace governor and lower the frequency of one core, the change
will not be reflected on the CPUs of the other cores in the quad.
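
A small user-space model may make the asymmetry concrete. This is only
a sketch under stated assumptions: the struct and function names are
hypothetical, requests are modelled as plain MHz values rather than
pstate ids, and a quad is taken to have four cores as described above.

```c
#include <assert.h>
#include <stddef.h>

#define CORES_PER_QUAD 4

/* Hypothetical model: one requested frequency per core's PMCR. */
struct quad_model {
    unsigned int pmcr_mhz[CORES_PER_QUAD];
};

/* The quad runs at the maximum of the four per-core requests. */
static unsigned int quad_effective_mhz(const struct quad_model *q)
{
    unsigned int max = q->pmcr_mhz[0];

    for (size_t i = 1; i < CORES_PER_QUAD; i++) {
        if (q->pmcr_mhz[i] > max)
            max = q->pmcr_mhz[i];
    }
    return max;
}

/* Write only the local core's PMCR (no IPIs to the other cores). */
static void set_local(struct quad_model *q, size_t core, unsigned int mhz)
{
    assert(core < CORES_PER_QUAD);
    q->pmcr_mhz[core] = mhz;
}

/* Write every core's PMCR, which is what the extra IPIs achieve. */
static void set_all(struct quad_model *q, unsigned int mhz)
{
    for (size_t i = 0; i < CORES_PER_QUAD; i++)
        q->pmcr_mhz[i] = mhz;
}
```

Starting from four cores at 2000, set_local() to 3000 raises the quad
to 3000, but set_local() back down to 1500 leaves the quad at 2000
because the other three requests still win. Only set_all() brings the
quad down to 1500.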

Abhishek,
I think we can rework this by sending the extra IPIs only in the
presence of the quirk, which can be indicated through a device-tree
parameter. If a future implementation fixes this, then we won't need
the extra IPIs.
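
Roughly, the selection could look like the sketch below. It is not
driver code: cores_to_program(), the 32-bit core bitmap and the
max_of_pmcr_quirk flag are hypothetical names, and how the flag would be
read from the device tree is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns a bitmap of the cores whose PMCR must
 * be written. Bit i of policy_cores is set when core i belongs to the
 * policy; local_core is the core the request is issued from.
 */
static uint32_t cores_to_program(uint32_t policy_cores,
                                 unsigned int local_core,
                                 bool max_of_pmcr_quirk)
{
    assert(local_core < 32);
    assert(policy_cores & (UINT32_C(1) << local_core));

    if (max_of_pmcr_quirk)
        return policy_cores;            /* extra IPIs: every core */

    return UINT32_C(1) << local_core;   /* local write is enough */
}
```

With the quirk present, a request from core 2 of a four-core policy
yields 0xF (all cores); without it, only bit 2 is set.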

> -- 
> viresh
> 



Re: [PATCH] cpufreq: powernv: Add support of frequency domain

2017-12-18 Thread Viresh Kumar
On 18-12-17, 10:41, Abhishek wrote:
> We need to do it in this way as the current implementation takes the
> max of the PMCRs of the cores. Thus, when the frequency needs to be
> ramped up, it suffices to write to just the local PMCR, but when the
> frequency is to be ramped down, if we don't send the IPI it breaks
> compatibility with P8.

Looks strange really that you have to program this differently for speeding up
or down. These CPUs are part of one cpufreq policy and so I would normally
expect changes to any CPU should reflect for other CPUs as well.

@Gautham: Do you know why it is so ?

-- 
viresh


Re: [PATCH] cpufreq: powernv: Add support of frequency domain

2017-12-17 Thread Abhishek

On 12/14/2017 10:12 AM, Viresh Kumar wrote:
> + Gautham,
>
> @Gautham: Can you please help reviewing this one ?
>
> On 13-12-17, 13:49, Abhishek Goel wrote:
>> @@ -693,6 +746,8 @@ static int powernv_cpufreq_target_index(struct cpufreq_policy *policy,
>>  {
>>  	struct powernv_smp_call_data freq_data;
>>  	unsigned int cur_msec, gpstate_idx;
>> +	cpumask_t temp;
>> +	u32 cpu;
>>  	struct global_pstate_info *gpstates = policy->driver_data;
>>
>>  	if (unlikely(rebooting) && new_index != get_nominal_index())
>> @@ -761,24 +816,48 @@ static int powernv_cpufreq_target_index(struct cpufreq_policy *policy,
>>  	spin_unlock(&gpstates->gpstate_lock);
>>
>>  	/*
>> -	 * Use smp_call_function to send IPI and execute the
>> -	 * mtspr on target CPU.  We could do that without IPI
>> -	 * if current CPU is within policy->cpus (core)
>> +	 * Use smp_call_function to send IPI and execute the mtspr on CPU.
>> +	 * This needs to be done on every core of the policy
>
> Why on each CPU ?

We need to do it in this way as the current implementation takes the
max of the PMCRs of the cores. Thus, when the frequency needs to be
ramped up, it suffices to write to just the local PMCR, but when the
frequency is to be ramped down, if we don't send the IPI it breaks
compatibility with P8.

>>  	 */
>> -	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
>> +	cpumask_copy(&temp, policy->cpus);
>> +
>> +	while (!cpumask_empty(&temp)) {
>> +		cpu = cpumask_first(&temp);
>> +		smp_call_function_any(cpu_sibling_mask(cpu),
>> +					set_pstate, &freq_data, 1);
>> +		cpumask_andnot(&temp, &temp, cpu_sibling_mask(cpu));
>> +	}
>> +
>>  	return 0;
>>  }
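
The loop in the hunk visits one CPU per core of the policy. The same
walk can be modelled in user space as a sketch, with a 64-bit word
standing in for cpumask_t. The names sibling_mask(), first_cpu() and
visit_each_core() are hypothetical, and the sketch assumes the threads
of a core have contiguous CPU numbers, which the kernel's
cpu_sibling_mask() does not require.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for cpu_sibling_mask(): contiguous threads. */
static uint64_t sibling_mask(unsigned int cpu, unsigned int threads_per_core)
{
    unsigned int first = cpu - (cpu % threads_per_core);
    uint64_t width = (threads_per_core == 64)
        ? UINT64_MAX
        : (UINT64_C(1) << threads_per_core) - 1;

    return width << first;
}

/* Stand-in for cpumask_first(); the mask must be non-empty. */
static unsigned int first_cpu(uint64_t mask)
{
    unsigned int cpu = 0;

    assert(mask != 0);
    while (!(mask & 1)) {
        mask >>= 1;
        cpu++;
    }
    return cpu;
}

/*
 * Records the first remaining CPU of each core in visited[] and
 * returns the number of cores visited. In the patch, this is the
 * point where the IPI is sent to one CPU of the core.
 */
static unsigned int visit_each_core(uint64_t policy_cpus,
                                    unsigned int threads_per_core,
                                    unsigned int *visited,
                                    unsigned int max_visited)
{
    uint64_t temp = policy_cpus;    /* like cpumask_copy() */
    unsigned int n = 0;

    assert(threads_per_core >= 1 && threads_per_core <= 64);
    assert(64 % threads_per_core == 0);

    while (temp != 0) {             /* like !cpumask_empty() */
        unsigned int cpu = first_cpu(temp);

        assert(n < max_visited);
        visited[n++] = cpu;
        /* like cpumask_andnot(): drop this core's threads */
        temp &= ~sibling_mask(cpu, threads_per_core);
    }
    return n;
}
```

For a policy covering CPUs 0..15 with four threads per core, the walk
visits CPUs 0, 4, 8 and 12, one per core.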




Re: [PATCH] cpufreq: powernv: Add support of frequency domain

2017-12-13 Thread Viresh Kumar
+ Gautham,

@Gautham: Can you please help reviewing this one ?

On 13-12-17, 13:49, Abhishek Goel wrote:
> @@ -693,6 +746,8 @@ static int powernv_cpufreq_target_index(struct cpufreq_policy *policy,
>  {
>  	struct powernv_smp_call_data freq_data;
>  	unsigned int cur_msec, gpstate_idx;
> +	cpumask_t temp;
> +	u32 cpu;
>  	struct global_pstate_info *gpstates = policy->driver_data;
>
>  	if (unlikely(rebooting) && new_index != get_nominal_index())
> @@ -761,24 +816,48 @@ static int powernv_cpufreq_target_index(struct cpufreq_policy *policy,
>  	spin_unlock(&gpstates->gpstate_lock);
>
>  	/*
> -	 * Use smp_call_function to send IPI and execute the
> -	 * mtspr on target CPU.  We could do that without IPI
> -	 * if current CPU is within policy->cpus (core)
> +	 * Use smp_call_function to send IPI and execute the mtspr on CPU.
> +	 * This needs to be done on every core of the policy

Why on each CPU ?

>  	 */
> -	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
> +	cpumask_copy(&temp, policy->cpus);
> +
> +	while (!cpumask_empty(&temp)) {
> +		cpu = cpumask_first(&temp);
> +		smp_call_function_any(cpu_sibling_mask(cpu),
> +					set_pstate, &freq_data, 1);
> +		cpumask_andnot(&temp, &temp, cpu_sibling_mask(cpu));
> +	}
> +
>  	return 0;
>  }

-- 
viresh