[PATCH 2/3] opp: Allow opp-supported-hw to contain multiple versions

2020-08-26 Thread Viresh Kumar
The bindings allow multiple versions to be passed to "opp-supported-hw"
property, either of which can result in enabling of the OPP.

Update code to allow that.

Signed-off-by: Viresh Kumar 
---
 drivers/opp/of.c | 47 +--
 1 file changed, 33 insertions(+), 14 deletions(-)

diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index e39ddcc779af..5dac8bffd68c 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -434,9 +434,9 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_of_find_icc_paths);
 static bool _opp_is_supported(struct device *dev, struct opp_table *opp_table,
  struct device_node *np)
 {
-   unsigned int count = opp_table->supported_hw_count;
-   u32 version;
-   int ret;
+   unsigned int levels = opp_table->supported_hw_count;
+   int count, versions, ret, i, j;
+   u32 val;
 
if (!opp_table->supported_hw) {
/*
@@ -451,21 +451,40 @@ static bool _opp_is_supported(struct device *dev, struct 
opp_table *opp_table,
return true;
}
 
-   while (count--) {
-   ret = of_property_read_u32_index(np, "opp-supported-hw", count,
-);
-   if (ret) {
-   dev_warn(dev, "%s: failed to read opp-supported-hw 
property at index %d: %d\n",
-__func__, count, ret);
-   return false;
+   count = of_property_count_u32_elems(np, "opp-supported-hw");
+   if (count <= 0 || count % levels) {
+   dev_err(dev, "%s: Invalid opp-supported-hw property (%d)\n",
+   __func__, count);
+   return false;
+   }
+
+   versions = count / levels;
+
+   /* All levels in at least one of the versions should match */
+   for (i = 0; i < versions; i++) {
+   bool supported = true;
+
+   for (j = 0; j < levels; j++) {
+   ret = of_property_read_u32_index(np, "opp-supported-hw",
+i * levels + j, );
+   if (ret) {
+   dev_warn(dev, "%s: failed to read 
opp-supported-hw property at index %d: %d\n",
+__func__, i * levels + j, ret);
+   return false;
+   }
+
+   /* Check if the level is supported */
+   if (!(val & opp_table->supported_hw[j])) {
+   supported = false;
+   break;
+   }
}
 
-   /* Both of these are bitwise masks of the versions */
-   if (!(version & opp_table->supported_hw[count]))
-   return false;
+   if (supported)
+   return true;
}
 
-   return true;
+   return false;
 }
 
 static int opp_parse_supplies(struct dev_pm_opp *opp, struct device *dev,
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH 0/3] opp: Allow opp-supported-hw to contain multiple versions

2020-08-26 Thread Viresh Kumar
Stephan and Dmitry,

Here is an attempt to solve the problem you guys faced, I have tested it
locally and works with my expectations. Please see if they solve your
problems.

Dmitry: I sent another message for you in patch 3's comments section.

--
viresh

Viresh Kumar (3):
  dt-bindings: opp: Allow opp-supported-hw to contain multiple versions
  opp: Allow opp-supported-hw to contain multiple versions
  ARM: tegra: Pass multiple versions in opp-supported-hw property

 Documentation/devicetree/bindings/opp/opp.txt |  53 +-
 .../boot/dts/tegra20-cpu-opp-microvolt.dtsi   |  36 -
 arch/arm/boot/dts/tegra20-cpu-opp.dtsi|  67 +-
 .../boot/dts/tegra30-cpu-opp-microvolt.dtsi   | 512 -
 arch/arm/boot/dts/tegra30-cpu-opp.dtsi| 986 +++---
 drivers/opp/of.c  |  47 +-
 6 files changed, 214 insertions(+), 1487 deletions(-)

-- 
2.25.0.rc1.19.g042ed3e048af



Re: [PATCH v3 5/5] arch_topology, arm, arm64: define arch_scale_freq_invariant()

2020-08-25 Thread Viresh Kumar
On 24-08-20, 22:02, Ionela Voinescu wrote:
> From: Valentin Schneider 
> 
> arch_scale_freq_invariant() is used by schedutil to determine whether
> the scheduler's load-tracking signals are frequency invariant. Its
> definition is overridable, though by default it is hardcoded to 'true'
> if arch_scale_freq_capacity() is defined ('false' otherwise).
> 
> This behaviour is not overridden on arm, arm64 and other users of the
> generic arch topology driver, which is somewhat precarious:
> arch_scale_freq_capacity() will always be defined, yet not all cpufreq
> drivers are guaranteed to drive the frequency invariance scale factor
> setting. In other words, the load-tracking signals may very well *not*
> be frequency invariant.
> 
> Now that cpufreq can be queried on whether the current driver is driving
> the Frequency Invariance (FI) scale setting, the current situation can
> be improved. This combines the query of whether cpufreq supports the
> setting of the frequency scale factor, with whether all online CPUs are
> counter-based FI enabled.
> 
> While cpufreq FI enablement applies at system level, for all CPUs,
> counter-based FI support could also be used for only a subset of CPUs to
> set the invariance scale factor. Therefore, if cpufreq-based FI support
> is present, we consider the system to be invariant. If missing, we
> require all online CPUs to be counter-based FI enabled in order for the
> full system to be considered invariant.
> 
> If the system ends up not being invariant, a new condition is needed in
> the counter initialization code that disables all scale factor setting
> based on counters.
> 
> Precedence of counters over cpufreq use is not important here. The
> invariant status is only given to the system if all CPUs have at least
> one method of setting the frequency scale factor.
> 
> Signed-off-by: Valentin Schneider 
> Signed-off-by: Ionela Voinescu 
> Acked-by: Catalin Marinas 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Sudeep Holla 
> ---
>  arch/arm/include/asm/topology.h   | 1 +
>  arch/arm64/include/asm/topology.h | 1 +
>  arch/arm64/kernel/topology.c  | 7 +++
>  drivers/base/arch_topology.c  | 6 ++
>  include/linux/arch_topology.h | 2 ++
>  5 files changed, 17 insertions(+)
> 
> diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
> index e0593cf095d0..9219e67befbe 100644
> --- a/arch/arm/include/asm/topology.h
> +++ b/arch/arm/include/asm/topology.h
> @@ -9,6 +9,7 @@
>  
>  /* Replace task scheduler's default frequency-invariant accounting */
>  #define arch_scale_freq_capacity topology_get_freq_scale
> +#define arch_scale_freq_invariant topology_scale_freq_invariant

Maybe this macro should have been named arch_is_freq_invariant as all other ones
are actually getting us a scaled number and this one is just a flag. But yeah,
that is out of this series's scope, but maybe you should name
topology_scale_freq_invariant() to topology_is_freq_invariant() or something
else on those lines ? Anyway:

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH v3 4/5] arch_topology, cpufreq: constify arch_* cpumasks

2020-08-25 Thread Viresh Kumar
On 24-08-20, 22:02, Ionela Voinescu wrote:
> From: Valentin Schneider 
> 
> The passed cpumask arguments to arch_set_freq_scale() and
> arch_freq_counters_available() are only iterated over, so reflect this
> in the prototype. This also allows to pass system cpumasks like
> cpu_online_mask without getting a warning.
> 
> Signed-off-by: Valentin Schneider 
> Signed-off-by: Ionela Voinescu 
> Acked-by: Catalin Marinas 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Sudeep Holla 
> Cc: Rafael J. Wysocki 
> Cc: Viresh Kumar 
> ---
>  arch/arm64/kernel/topology.c  | 2 +-
>  drivers/base/arch_topology.c  | 4 ++--
>  drivers/cpufreq/cpufreq.c | 5 +++--
>  include/linux/arch_topology.h | 2 +-
>  include/linux/cpufreq.h   | 3 ++-
>  5 files changed, 9 insertions(+), 7 deletions(-)

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH v3 3/5] cpufreq: report whether cpufreq supports Frequency Invariance (FI)

2020-08-25 Thread Viresh Kumar
On 24-08-20, 22:02, Ionela Voinescu wrote:
> Now that the update of the FI scale factor is done in cpufreq core for
> selected functions - target(), target_index() and fast_switch(),
> we can provide feedback to the task scheduler and architecture code
> on whether cpufreq supports FI.
> 
> For this purpose provide an external function to expose whether the
> cpufreq drivers support FI, by using a static key.
> 
> The logic behind the enablement of cpufreq-based invariance is as
> follows:
>  - cpufreq-based invariance is disabled by default
>  - cpufreq-based invariance is enabled if any of the callbacks
>above is implemented while the unsupported setpolicy() is not
> 
> The cpufreq_supports_freq_invariance() function only returns whether
> cpufreq is instrumented with the arch_set_freq_scale() calls that
> result in support for frequency invariance. Due to the lack of knowledge
> on whether the implementation of arch_set_freq_scale() actually results
> in the setting of a scale factor based on cpufreq information, it is up
> to the architecture code to ensure the setting and provision of the
> scale factor to the scheduler.
> 
> Signed-off-by: Ionela Voinescu 
> Cc: Rafael J. Wysocki 
> Cc: Viresh Kumar 
> ---
>  drivers/cpufreq/cpufreq.c | 20 
>  include/linux/cpufreq.h   |  5 +
>  2 files changed, 25 insertions(+)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 34533c6b3bd0..b3a7d1cc5e92 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -61,6 +61,9 @@ static struct cpufreq_driver *cpufreq_driver;
>  static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
>  static DEFINE_RWLOCK(cpufreq_driver_lock);
>  
> +/* Mark support for the scheduler's frequency invariance engine */
> +static DEFINE_STATIC_KEY_FALSE(cpufreq_freq_invariance);
> +
>  /* Flag to suspend/resume CPUFreq governors */
>  static bool cpufreq_suspended;
>  
> @@ -69,6 +72,20 @@ static inline bool has_target(void)
>   return cpufreq_driver->target_index || cpufreq_driver->target;
>  }
>  
> +static inline
> +void enable_cpufreq_freq_invariance(struct cpufreq_driver *driver)
> +{
> + if (!driver->setpolicy) {
> + static_branch_enable_cpuslocked(_freq_invariance);
> + pr_debug("supports frequency invariance");
> + }
> +}
> +

I would rather open-code this int the cpufreq_register_driver() routine as
that's what is done in cpufreq_unregister_driver() as well.

> +bool cpufreq_supports_freq_invariance(void)
> +{
> + return static_branch_likely(_freq_invariance);
> +}
> +

And would keep the definition of the static key with this routine at a single
place.

>  /* internal prototypes */
>  static unsigned int __cpufreq_get(struct cpufreq_policy *policy);
>  static int cpufreq_init_governor(struct cpufreq_policy *policy);
> @@ -2721,6 +2738,8 @@ int cpufreq_register_driver(struct cpufreq_driver 
> *driver_data)
>   cpufreq_driver = driver_data;
>   write_unlock_irqrestore(_driver_lock, flags);
>  
> + enable_cpufreq_freq_invariance(cpufreq_driver);
> +
>   if (driver_data->setpolicy)
>   driver_data->flags |= CPUFREQ_CONST_LOOPS;
>  
> @@ -2790,6 +2809,7 @@ int cpufreq_unregister_driver(struct cpufreq_driver 
> *driver)
>   cpus_read_lock();
>   subsys_interface_unregister(_interface);
>   remove_boost_sysfs_file();
> + static_branch_disable_cpuslocked(_freq_invariance);
>   cpuhp_remove_state_nocalls_cpuslocked(hp_online);
>  
>   write_lock_irqsave(_driver_lock, flags);
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 8f141d4c859c..091df9968945 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -217,6 +217,7 @@ void refresh_frequency_limits(struct cpufreq_policy 
> *policy);
>  void cpufreq_update_policy(unsigned int cpu);
>  void cpufreq_update_limits(unsigned int cpu);
>  bool have_governor_per_policy(void);
> +bool cpufreq_supports_freq_invariance(void);
>  struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy);
>  void cpufreq_enable_fast_switch(struct cpufreq_policy *policy);
>  void cpufreq_disable_fast_switch(struct cpufreq_policy *policy);
> @@ -237,6 +238,10 @@ static inline unsigned int 
> cpufreq_get_hw_max_freq(unsigned int cpu)
>  {
>   return 0;
>  }
> +static inline bool cpufreq_supports_freq_invariance(void)
> +{
> + return false;
> +}
>  static inline void disable_cpufreq(void) { }
>  #endif
>  
> -- 
> 2.17.1

-- 
viresh


Re: [PATCH v3 2/5] cpufreq: move invariance setter calls in cpufreq core

2020-08-25 Thread Viresh Kumar
On 24-08-20, 22:02, Ionela Voinescu wrote:
> To properly scale its per-entity load-tracking signals, the task scheduler
> needs to be given a frequency scale factor, i.e. some image of the current
> frequency the CPU is running at. Currently, this scale can be computed
> either by using counters (APERF/MPERF on x86, AMU on arm64), or by
> piggy-backing on the frequency selection done by cpufreq.
> 
> For the latter, drivers have to explicitly set the scale factor
> themselves, despite it being purely boiler-plate code: the required
> information depends entirely on the kind of frequency switch callback
> implemented by the driver, i.e. either of: target_index(), target(),
> fast_switch() and setpolicy().
> 
> The fitness of those callbacks with regard to driving the Frequency
> Invariance Engine (FIE) is studied below:
> 
> target_index()
> ==
> Documentation states that the chosen frequency "must be determined by
> freq_table[index].frequency". It isn't clear if it *has* to be that
> frequency, or if it can use that frequency value to do some computation
> that ultimately leads to a different frequency selection. All drivers
> go for the former, while the vexpress-spc-cpufreq has an atypical
> implementation which is handled separately.
> 
> Therefore, the hook works on the assumption the core can use
> freq_table[index].frequency.
> 
> target()
> ===
> This has been flagged as deprecated since:
> 
>   commit 9c0ebcf78fde ("cpufreq: Implement light weight ->target_index() 
> routine")
> 
> It also doesn't have that many users:
> 
>   gx-suspmod.c:439:   .target = cpufreq_gx_target,
>   s3c24xx-cpufreq.c:428:  .target = s3c_cpufreq_target,
>   intel_pstate.c:2528:.target = intel_cpufreq_target,
>   cppc_cpufreq.c:401: .target = cppc_cpufreq_set_target,
>   cpufreq-nforce2.c:371:  .target = nforce2_target,
>   sh-cpufreq.c:163:   .target = sh_cpufreq_target,
>   pcc-cpufreq.c:573:  .target = pcc_cpufreq_target,
> 
> Similarly to the path taken for target_index() calls in the cpufreq core
> during a frequency change, all of the drivers above will mark the end of a
> frequency change by a call to cpufreq_freq_transition_end().
> 
> Therefore, cpufreq_freq_transition_end() can be used as the location for
> the arch_set_freq_scale() call to potentially inform the scheduler of the
> frequency change.
> 
> This change maintains the previous functionality for the drivers that
> implement the target_index() callback, while also adding support for the
> few drivers that implement the deprecated target() callback.
> 
> fast_switch()
> =
> This callback *has* to return the frequency that was selected.
> 
> setpolicy()
> ===
> This callback does not have any designated way of informing what was the
> end choice. But there are only two drivers using setpolicy(), and none
> of them have current FIE support:
> 
>   drivers/cpufreq/longrun.c:281:  .setpolicy  = longrun_set_policy,
>   drivers/cpufreq/intel_pstate.c:2215:.setpolicy  = 
> intel_pstate_set_policy,
> 
> The intel_pstate is known to use counter-driven frequency invariance.
> 
> Conclusion
> ==
> 
> Given that the significant majority of current FIE enabled drivers use
> callbacks that lend themselves to triggering the setting of the FIE scale
> factor in a generic way, move the invariance setter calls to cpufreq core.
> 
> As a result of setting the frequency scale factor in cpufreq core, after
> callbacks that lend themselves to trigger it, remove this functionality
> from the driver side.
> 
> To be noted that despite marking a successful frequency change, many
> cpufreq drivers will consider the new frequency as the requested
> frequency, although this is might not be the one granted by the hardware.
> 
> Therefore, the call to arch_set_freq_scale() is a "best effort" one, and
> it is up to the architecture if the new frequency is used in the new
> frequency scale factor setting (determined by the implementation of
> arch_set_freq_scale()) or eventually used by the scheduler (determined
> by the implementation of arch_scale_freq_capacity()). The architecture
> is in a better position to decide if it has better methods to obtain
> more accurate information regarding the current frequency and use that
> information instead (for example, the use of counters).
> 
> Also, the implementation to arch_set_freq_scale() will now have to handle
> error conditions (current frequency == 0) in order to prevent the
> overhead in cpufreq core when the default arch_set_freq_scale()
> implementation is used.
> 
> Signed-off-by: Ionela Voinescu 
> S

Re: [PATCH v3 1/5] arch_topology: validate input frequencies to arch_set_freq_scale()

2020-08-24 Thread Viresh Kumar
On 24-08-20, 22:02, Ionela Voinescu wrote:
> The current frequency passed to arch_set_freq_scale() could end up
> being 0, signaling an error in setting a new frequency. Also, if the
> maximum frequency in 0, this will result in a division by 0 error.
> 
> Therefore, validate these input values before using them for the
> setting of the frequency scale factor.
> 
> Signed-off-by: Ionela Voinescu 
> Cc: Sudeep Holla 
> Cc: Rafael J. Wysocki 
> ---
>  drivers/base/arch_topology.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 75f72d684294..1aca82fcceb8 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -33,6 +33,9 @@ void arch_set_freq_scale(struct cpumask *cpus, unsigned 
> long cur_freq,
>   unsigned long scale;
>   int i;
>  
> + if (!cur_freq || !max_freq)

We should probably use unlikely() here.

Rafael: Shouldn't this have a WARN_ON_ONCE() as well ?

> + return;
> +
>   /*
>* If the use of counters for FIE is enabled, just return as we don't
>* want to update the scale factor with information from CPUFREQ.
> -- 
> 2.17.1

-- 
viresh


Re: [PATCH V2] cpufreq: tegra186: Fix initial frequency

2020-08-24 Thread Viresh Kumar
On 24-08-20, 15:59, Jon Hunter wrote:
> Commit 6cc3d0e9a097 ("cpufreq: tegra186: add
> CPUFREQ_NEED_INITIAL_FREQ_CHECK flag") fixed CPUFREQ support for
> Tegra186 but as a consequence the following warnings are now seen on
> boot ...
> 
>  cpufreq: cpufreq_online: CPU0: Running at unlisted freq: 0 KHz
>  cpufreq: cpufreq_online: CPU0: Unlisted initial frequency changed to: 
> 2035200 KHz
>  cpufreq: cpufreq_online: CPU1: Running at unlisted freq: 0 KHz
>  cpufreq: cpufreq_online: CPU1: Unlisted initial frequency changed to: 
> 2035200 KHz
>  cpufreq: cpufreq_online: CPU2: Running at unlisted freq: 0 KHz
>  cpufreq: cpufreq_online: CPU2: Unlisted initial frequency changed to: 
> 2035200 KHz
>  cpufreq: cpufreq_online: CPU3: Running at unlisted freq: 0 KHz
>  cpufreq: cpufreq_online: CPU3: Unlisted initial frequency changed to: 
> 2035200 KHz
>  cpufreq: cpufreq_online: CPU4: Running at unlisted freq: 0 KHz
>  cpufreq: cpufreq_online: CPU4: Unlisted initial frequency changed to: 
> 2035200 KHz
>  cpufreq: cpufreq_online: CPU5: Running at unlisted freq: 0 KHz
>  cpufreq: cpufreq_online: CPU5: Unlisted initial frequency changed to: 
> 2035200 KHz
> 
> Fix this by adding a 'get' callback for the Tegra186 CPUFREQ driver to
> retrieve the current operating frequency for a given CPU. The 'get'
> callback uses the current 'ndiv' value that is programmed to determine
> that current operating frequency.
> 
> Signed-off-by: Jon Hunter 
> ---
> Changes since V1:
> - Moved code into a 'get' callback
> 
>  drivers/cpufreq/tegra186-cpufreq.c | 30 ++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/drivers/cpufreq/tegra186-cpufreq.c 
> b/drivers/cpufreq/tegra186-cpufreq.c
> index 01e1f58ba422..0d0fcff60765 100644
> --- a/drivers/cpufreq/tegra186-cpufreq.c
> +++ b/drivers/cpufreq/tegra186-cpufreq.c
> @@ -14,6 +14,7 @@
>  
>  #define EDVD_CORE_VOLT_FREQ(core)(0x20 + (core) * 0x4)
>  #define EDVD_CORE_VOLT_FREQ_F_SHIFT  0
> +#define EDVD_CORE_VOLT_FREQ_F_MASK   0x
>  #define EDVD_CORE_VOLT_FREQ_V_SHIFT  16
>  
>  struct tegra186_cpufreq_cluster_info {
> @@ -91,10 +92,39 @@ static int tegra186_cpufreq_set_target(struct 
> cpufreq_policy *policy,
>   return 0;
>  }
>  
> +static unsigned int tegra186_cpufreq_get(unsigned int cpu)
> +{
> + struct cpufreq_frequency_table *tbl;
> + struct cpufreq_policy *policy;
> + void __iomem *edvd_reg;
> + unsigned int i, freq = 0;
> + u32 ndiv;
> +
> + policy = cpufreq_cpu_get(cpu);
> + if (!policy)
> + return -EINVAL;

This should be return 0;

Applied with that change. Thanks.

-- 
viresh


Re: [RFC PATCH 3/3] opp: Power on (virtual) power domains managed by the OPP core

2020-08-24 Thread Viresh Kumar
On 24-08-20, 17:08, Stephan Gerhold wrote:
> On Mon, Aug 24, 2020 at 04:36:57PM +0200, Ulf Hansson wrote:
> > That said, perhaps should rely on the consumer to deploy runtime PM
> > support, but let the OPP core to set up the device links for the genpd
> > virtual devices!?
> > 
> 
> Yes, that would be the alternative option.

That is the right option IMO.

> I would be fine with it as long as it also works for the CPUfreq case.
> 
> I don't think anything manages runtime PM for the CPU device, just
> like no-one calls dev_pm_opp_set_rate(cpu_dev, 0). So with my patch the
> power domain is essentially kept always-on (except for system suspend).
> At least in my case this is intended.
> 
> If device links also keep the power domains on if the consumer device
> does not make use of runtime PM it should work fine for my case.

With device link, you only need to do rpm enable/disable on the consumer device
and it will get propagated by itself.

> Personally, I think my original patch (without device links) fits better
> into the OPP API, for the following two reasons.
> 
> With device links:
> 
>   1. Unlike regulators/interconnects, attached power domains would be
>  controlled by runtime PM instead of dev_pm_opp_set_rate(opp_dev, 0).
> 
>   2. ... some driver using OPP tables might not make use of runtime PM.
>  In that case, the power domains would stay on the whole time,
>  even if dev_pm_opp_set_rate(opp_dev, 0) was called.
> 
> With my patch, the power domain state is directly related to the
> dev_pm_opp_set_rate(opp_dev, 0) call, which is more intuitive than
> relying on the runtime PM state in my opinion.

So opp-set-rate isn't in the best of shape TBH, some things are left for the
drivers while other are done by it. Regulator-enable/disable was moved to it
some time back as people needed something like that. While on the other hand,
clk_enable/disable doesn't happen there, nor does rpm enable/disable.

Maybe one day we may want to do that, but lets make sure someone wants to do
that first.

Anyway, even in that case both of the changes would be required. We must make
device links nevertheless first. And later on if required, we may want to do rpm
enable/disable on the consumer device itself.

-- 
viresh


Re: [PATCH - stable v5.4 and v5.7] opp: Enable resources again if they were disabled earlier

2020-08-24 Thread Viresh Kumar
On 24-08-20, 18:10, Greg KH wrote:
> On Mon, Aug 24, 2020 at 02:52:23PM +0530, Viresh Kumar wrote:
> > From: Rajendra Nayak 
> > 
> > commit a4501bac0e553bed117b7e1b166d49731caf7260 upstream.
> > 
> > dev_pm_opp_set_rate() can now be called with freq = 0 in order
> > to either drop performance or bandwidth votes or to disable
> > regulators on platforms which support them.
> > 
> > In such cases, a subsequent call to dev_pm_opp_set_rate() with
> > the same frequency ends up returning early because 'old_freq == freq'
> > 
> > Instead make it fall through and put back the dropped performance
> > and bandwidth votes and/or enable back the regulators.
> > 
> > Cc: v5.3+  # v5.3+
> > Fixes: cd7ea582866f ("opp: Make dev_pm_opp_set_rate() handle freq = 0 to 
> > drop performance votes")
> > Reported-by: Sajida Bhanu 
> > Reviewed-by: Sibi Sankar 
> > Reported-by: Matthias Kaehlcke 
> > Tested-by: Matthias Kaehlcke 
> > Reviewed-by: Stephen Boyd 
> > Signed-off-by: Rajendra Nayak 
> > [ Viresh: Don't skip clk_set_rate() and massaged changelog ]
> > Signed-off-by: Viresh Kumar 
> > [ Viresh: Updated the patch to apply to v5.4 ]
> > Signed-off-by: Viresh Kumar 
> > ---
> >  drivers/opp/core.c | 10 ++
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> This too is already in the 5.7 and 5.4 queues, why add it again?

Same here, your bot reported that patch failed to apply for 5.4 and 5.7, again
rightly so as I was required to modify the patch a little bit and so I have
added another signed-off and details.

-- 
viresh


Re: [PATCH - for v5.7 stable] opp: Put opp table in dev_pm_opp_set_rate() for empty tables

2020-08-24 Thread Viresh Kumar
On 24-08-20, 18:10, Greg KH wrote:
> On Mon, Aug 24, 2020 at 03:00:03PM +0530, Viresh Kumar wrote:
> > From: Stephen Boyd 
> > 
> > commit 8979ef70850eb469e1094279259d1ef393ffe85f upstream.
> > 
> > We get the opp_table pointer at the top of the function and so we should
> > put the pointer at the end of the function like all other exit paths
> > from this function do.
> > 
> > Cc: v5.7+  # v5.7+
> > Fixes: aca48b61f963 ("opp: Manage empty OPP tables with clk handle")
> > Reviewed-by: Rajendra Nayak 
> > Signed-off-by: Stephen Boyd 
> > [ Viresh: Split the patch into two ]
> > Signed-off-by: Viresh Kumar 
> > [ Viresh: Update the code for v5.7-stable ]
> > Signed-off-by: Viresh Kumar 
> > ---
> >  drivers/opp/core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> This is already in the 5.7-stable queue, why add it again?
> 
> confused,

Because I received an email from your bot that it failed to apply to 5.7 stable,
rightly so as the patch was required to be modified.

-- 
viresh


Re: [RFC PATCH 1/3] opp: Reduce code duplication in _set_required_opps()

2020-08-24 Thread Viresh Kumar
On 24-08-20, 13:30, Stephan Gerhold wrote:
> You're right. Not sure why I removed it.
> 
> I suspect I had it in _set_required_opp() at some point, but I moved
> code around several times until I was happy with the result.
> 
> We should just add it back.
> Should I send a v2 with it fixed or would you like to handle it?

I have applied the first two patches to linux-next branch in my tree,
please have a look.

-- 
viresh


Re: [RFC PATCH 2/3] opp: Set required OPPs in reverse order when scaling down

2020-08-24 Thread Viresh Kumar
On 21-08-20, 18:31, Stephan Gerhold wrote:
> This patch does not apply anymore after the cleanup you pushed to
> opp/linux-next. I would be happy to send a v2 with that fixed.
> 
> On my other OPP patch set you mentioned that you might apply these
> directly with some of your own changes - would you also prefer to do it
> yourself in this case or should I send a v2?

I will pick the first 2 myself, that's fine. Lets see where we go with
the third one :)

> Still looking for your feedback on both patch sets by the way! :)

Sorry about the delay, I was on vacation for over a week in between and
this and the other patchset was a bit tricky (which you may have not
realized, not sure, as I wondered if something will not work within
the OPP core for v1 binding, but it did finally I believe) :)

-- 
viresh


Re: [RFC PATCH 3/3] opp: Power on (virtual) power domains managed by the OPP core

2020-08-24 Thread Viresh Kumar
On 30-07-20, 10:01, Stephan Gerhold wrote:
> dev_pm_opp_attach_genpd() allows attaching an arbitrary number of
> power domains to an OPP table. In that case, the genpd core will
> create a virtual device for each of the power domains.
> 
> At the moment, the OPP core only calls
> dev_pm_genpd_set_performance_state() on these virtual devices.
> It does not attempt to power on the power domains. Therefore
> the required power domain might never get turned on.
> 
> So far, dev_pm_opp_attach_genpd() is only used in qcom-cpufreq-nvmem.c
> to attach the CPR power domain to the CPU OPP table. The CPR driver
> does not check if it was actually powered on so this did not cause
> any problems. However, other drivers (e.g. rpmpd) might ignore the
> performance state until the power domain is actually powered on.
> 
> Since these virtual devices are managed exclusively by the OPP core,
> I would say that it should also be responsible to ensure they are
> enabled. A similar approach is already used for regulators, see
> commit 8d45719caaf5 ("opp: core: add regulators enable and disable").
> 
> This commit implements similar functionality for the virtual genpd
> devices managed by the OPP core. The power domains are turned on
> the first time dev_pm_opp_set_rate() is called. They are turned off
> again when dev_pm_opp_set_rate(dev, 0) is called.
> 
> Signed-off-by: Stephan Gerhold 
> ---
> Related discussion: 
> https://lore.kernel.org/linux-arm-msm/20200426123140.ga190...@gerhold.net/
> 
> There would be also other ways to implement this, e.g. device links,
> assuming that the device using the OPP table also makes use of runtime PM.
> My first thought was that it would be most consistent to handle this like
> regulators, bandwidth votes etc. RFC :)

This stuff was done ages back and I am starting to forget almost
everything now :)

Ulf, why doesn't pm_runtime_get(dev) take care of enabling multiple
power domain case ? RFP (request for patience) :)

-- 
viresh


Re: [RFC PATCH 1/3] opp: Reduce code duplication in _set_required_opps()

2020-08-24 Thread Viresh Kumar
On 30-07-20, 10:01, Stephan Gerhold wrote:
> Move call to dev_pm_genpd_set_performance_state() to a separate
> function so we can avoid duplicating the code for the single and
> multiple genpd case.
> 
> Signed-off-by: Stephan Gerhold 
> ---
>  drivers/opp/core.c | 40 +---
>  1 file changed, 21 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 9d7fb45b1786..f7a476b55069 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -781,6 +781,21 @@ static int _set_opp_custom(const struct opp_table 
> *opp_table,
>   return opp_table->set_opp(data);
>  }
>  
> +static int _set_required_opp(struct device *dev, struct device *pd_dev,
> +  struct dev_pm_opp *opp, int i)
> +{
> + unsigned int pstate = likely(opp) ? opp->required_opps[i]->pstate : 0;
> + int ret;
> +
> + ret = dev_pm_genpd_set_performance_state(pd_dev, pstate);
> + if (ret) {
> + dev_err(dev, "Failed to set performance rate of %s: %d (%d)\n",
> + dev_name(pd_dev), pstate, ret);
> + }
> +
> + return ret;
> +}
> +
>  /* This is only called for PM domain for now */
>  static int _set_required_opps(struct device *dev,
> struct opp_table *opp_table,
> @@ -788,22 +803,15 @@ static int _set_required_opps(struct device *dev,
>  {
>   struct opp_table **required_opp_tables = opp_table->required_opp_tables;
>   struct device **genpd_virt_devs = opp_table->genpd_virt_devs;
> - unsigned int pstate;
> + struct device *pd_dev;
>   int i, ret = 0;
>  
>   if (!required_opp_tables)
>   return 0;
>  
>   /* Single genpd case */
> - if (!genpd_virt_devs) {
> - pstate = likely(opp) ? opp->required_opps[0]->pstate : 0;
> - ret = dev_pm_genpd_set_performance_state(dev, pstate);
> - if (ret) {
> - dev_err(dev, "Failed to set performance state of %s: %d 
> (%d)\n",
> - dev_name(dev), pstate, ret);
> - }
> - return ret;
> - }
> + if (!genpd_virt_devs)
> + return _set_required_opp(dev, dev, opp, 0);
>  
>   /* Multiple genpd case */
>  
> @@ -814,17 +822,11 @@ static int _set_required_opps(struct device *dev,
>   mutex_lock(_table->genpd_virt_dev_lock);
>  
>   for (i = 0; i < opp_table->required_opp_count; i++) {
> - pstate = likely(opp) ? opp->required_opps[i]->pstate : 0;
> -
> - if (!genpd_virt_devs[i])
> - continue;

Don't we need this check anymore ?

> + pd_dev = genpd_virt_devs[i];
>  
> - ret = dev_pm_genpd_set_performance_state(genpd_virt_devs[i], 
> pstate);
> - if (ret) {
> - dev_err(dev, "Failed to set performance rate of %s: %d 
> (%d)\n",
> - dev_name(genpd_virt_devs[i]), pstate, ret);
> + ret = _set_required_opp(dev, pd_dev, opp, i);
> + if (ret)
>   break;
> - }
>   }
>   mutex_unlock(_table->genpd_virt_dev_lock);
>  
> -- 
> 2.27.0

-- 
viresh


Re: [RFC 0/3] cpufreq: cppc: Add support for frequency invariance

2020-08-24 Thread Viresh Kumar
On 24-07-20, 11:38, Vincent Guittot wrote:
> Yeah it's exactly the same behavior as x86 and re using the same
> mechanism seems the  best solution
> 
> The main problem is that AMU currently assumes that it will be the
> only to support such tick based mechanism whereas others like cppc can
> provides similar information

Ionela do you have some feedback to Vincent's email?

We can't handle this apart from the tick handler, i.e. capture this
data at every tick like it is done for x86 or AMU.

-- 
viresh


Re: [RFC PATCH v3 0/2] Add Krait Cache Scaling support

2020-08-24 Thread Viresh Kumar
+Vincent/Saravana/Sibi

On 21-08-20, 16:00, Ansuel Smith wrote:
> This adds Krait Cache scaling support using the cpufreq notifier.
> I have some doubt about where this should be actually placed (clk or cpufreq)?
> Also the original idea was to create a dedicated cpufreq driver (like it's 
> done in
> the codeaurora qcom repo) by copying the cpufreq-dt driver and adding the 
> cache
> scaling logic but i still don't know what is better. Have a very similar 
> driver or
> add a dedicated driver only for the cache using the cpufreq notifier and do 
> the
> scale on every freq transition.
> Thanks to everyone who will review or answer these questions.

Saravana was doing something with devfreq to solve such issues if I
wasn't mistaken.

Sibi ?

-- 
viresh


Re: [PATCH v2 1/2] cpufreq: mediatek-hw: Add support for Mediatek cpufreq HW driver

2020-08-24 Thread Viresh Kumar
On 13-08-20, 15:07, Hector Yuan wrote:
> From: "Hector.Yuan" 
> 
> Add MT6779 cpufreq HW support.
> 
> Signed-off-by: Hector.Yuan 
> ---
>  arch/arm64/configs/defconfig  |1 +
>  drivers/cpufreq/Kconfig.arm   |   11 ++
>  drivers/cpufreq/Makefile  |1 +
>  drivers/cpufreq/mediatek-cpufreq-hw.c |  255 
> +
>  4 files changed, 268 insertions(+)
>  create mode 100644 drivers/cpufreq/mediatek-cpufreq-hw.c
> 
> diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
> index 883e8ba..866a1bf 100644
> --- a/arch/arm64/configs/defconfig
> +++ b/arch/arm64/configs/defconfig
> @@ -86,6 +86,7 @@ CONFIG_CPUFREQ_DT=y
>  CONFIG_ACPI_CPPC_CPUFREQ=m
>  CONFIG_ARM_ALLWINNER_SUN50I_CPUFREQ_NVMEM=m
>  CONFIG_ARM_ARMADA_37XX_CPUFREQ=y
> +CONFIG_ARM_MEDIATEK_CPUFREQ_HW=m

What about a 'default m' in Kconfig itself ?

>  CONFIG_ARM_SCPI_CPUFREQ=y
>  CONFIG_ARM_IMX_CPUFREQ_DT=m
>  CONFIG_ARM_QCOM_CPUFREQ_NVMEM=y
> diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
> index c6cbfc8..81f1cc1 100644
> --- a/drivers/cpufreq/Kconfig.arm
> +++ b/drivers/cpufreq/Kconfig.arm
> @@ -121,6 +121,17 @@ config ARM_MEDIATEK_CPUFREQ
>   help
> This adds the CPUFreq driver support for MediaTek SoCs.
>  
> +config ARM_MEDIATEK_CPUFREQ_HW
> + tristate "MediaTek CPUFreq HW driver"
> + depends on ARCH_MEDIATEK || COMPILE_TEST
> + help
> +   Support for the CPUFreq HW driver.
> +   Some MediaTek chipsets have a HW engine to offload the steps
> +   necessary for changing the frequency of the CPUs. Firmware loaded
> +   in this engine exposes a programming interface to the OS.
> +   The driver implements the cpufreq interface for this HW engine.
> +   Say Y if you want to support CPUFreq HW.
> +
>  config ARM_OMAP2PLUS_CPUFREQ
>   bool "TI OMAP2+"
>   depends on ARCH_OMAP2PLUS
> diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
> index f6670c4..dc1f371 100644
> --- a/drivers/cpufreq/Makefile
> +++ b/drivers/cpufreq/Makefile
> @@ -57,6 +57,7 @@ obj-$(CONFIG_ARM_IMX6Q_CPUFREQ) += 
> imx6q-cpufreq.o
>  obj-$(CONFIG_ARM_IMX_CPUFREQ_DT) += imx-cpufreq-dt.o
>  obj-$(CONFIG_ARM_KIRKWOOD_CPUFREQ)   += kirkwood-cpufreq.o
>  obj-$(CONFIG_ARM_MEDIATEK_CPUFREQ)   += mediatek-cpufreq.o
> +obj-$(CONFIG_ARM_MEDIATEK_CPUFREQ_HW)+= mediatek-cpufreq-hw.o
>  obj-$(CONFIG_MACH_MVEBU_V7)  += mvebu-cpufreq.o
>  obj-$(CONFIG_ARM_OMAP2PLUS_CPUFREQ)  += omap-cpufreq.o
>  obj-$(CONFIG_ARM_PXA2xx_CPUFREQ) += pxa2xx-cpufreq.o
> diff --git a/drivers/cpufreq/mediatek-cpufreq-hw.c 
> b/drivers/cpufreq/mediatek-cpufreq-hw.c
> new file mode 100644
> index 000..6752db9
> --- /dev/null
> +++ b/drivers/cpufreq/mediatek-cpufreq-hw.c
> @@ -0,0 +1,255 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2020 MediaTek Inc.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define LUT_MAX_ENTRIES  32U
> +#define LUT_FREQ GENMASK(11, 0)
> +#define LUT_VOLT GENMASK(28, 12)
> +#define LUT_ROW_SIZE 0x4
> +
> +/* Register offsets */
> +#define REG_ENABLE   0x84
> +#define REG_PERF_STATE   0x88
> +
> +static struct platform_device *global_pdev;

Use cpufreq_driver->driver_data for this, it is already used in other
drivers for similar use.

> +static int mtk_cpufreq_hw_target_index(struct cpufreq_policy *policy,
> +unsigned int index)
> +{
> + void __iomem *perf_state_reg = policy->driver_data;
> + unsigned long freq = policy->freq_table[index].frequency;
> +
> + writel_relaxed(index, perf_state_reg);
> + arch_set_freq_scale(policy->related_cpus, freq,
> + policy->cpuinfo.max_freq);
> + return 0;
> +}
> +
> +static unsigned int mtk_cpufreq_hw_get(unsigned int cpu)
> +{
> + void __iomem *perf_state_reg;
> + struct cpufreq_policy *policy;
> + unsigned int index;
> +
> + policy = cpufreq_cpu_get_raw(cpu);
> + if (!policy)
> + return 0;
> +
> + perf_state_reg = policy->driver_data;
> +
> + index = readl_relaxed(perf_state_reg);
> + index = min(index, LUT_MAX_ENTRIES - 1);
> +
> + return policy->freq_table[index].frequency;
> +}
> +
> +static int mtk_cpufreq_hw_read_lut(struct device *cpu_dev,

This routine needs to be named better, it is creating the cpufreq
table after all.

> +struct cpufreq_policy *policy,
> +void __iomem *base)

Please make sure checkpatch --strict doesn't give any errors.

> +{
> + u32 data;
> + u32 freq, volt, prev_freq = 0;

Merge these two..

> + int i = 0;
> + struct cpufreq_frequency_table  *table;
> +
> + table = kcalloc(LUT_MAX_ENTRIES + 1, sizeof(*table), 

[PATCH - for v5.7 stable] opp: Put opp table in dev_pm_opp_set_rate() for empty tables

2020-08-24 Thread Viresh Kumar
From: Stephen Boyd 

commit 8979ef70850eb469e1094279259d1ef393ffe85f upstream.

We get the opp_table pointer at the top of the function and so we should
put the pointer at the end of the function like all other exit paths
from this function do.

Cc: v5.7+  # v5.7+
Fixes: aca48b61f963 ("opp: Manage empty OPP tables with clk handle")
Reviewed-by: Rajendra Nayak 
Signed-off-by: Stephen Boyd 
[ Viresh: Split the patch into two ]
Signed-off-by: Viresh Kumar 
[ Viresh: Update the code for v5.7-stable ]
Signed-off-by: Viresh Kumar 
---
 drivers/opp/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index bb7060d52eec..c94e725e6522 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -820,7 +820,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
if (opp_table->required_opp_tables) {
ret = _set_required_opps(dev, opp_table, NULL);
} else if (!_get_opp_count(opp_table)) {
-   return 0;
+   ret = 0;
} else {
dev_err(dev, "target frequency can't be 0\n");
ret = -EINVAL;
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH - stable v5.4 and v5.7] opp: Enable resources again if they were disabled earlier

2020-08-24 Thread Viresh Kumar
From: Rajendra Nayak 

commit a4501bac0e553bed117b7e1b166d49731caf7260 upstream.

dev_pm_opp_set_rate() can now be called with freq = 0 in order
to either drop performance or bandwidth votes or to disable
regulators on platforms which support them.

In such cases, a subsequent call to dev_pm_opp_set_rate() with
the same frequency ends up returning early because 'old_freq == freq'

Instead make it fall through and put back the dropped performance
and bandwidth votes and/or enable back the regulators.

Cc: v5.3+  # v5.3+
Fixes: cd7ea582866f ("opp: Make dev_pm_opp_set_rate() handle freq = 0 to drop 
performance votes")
Reported-by: Sajida Bhanu 
Reviewed-by: Sibi Sankar 
Reported-by: Matthias Kaehlcke 
Tested-by: Matthias Kaehlcke 
Reviewed-by: Stephen Boyd 
Signed-off-by: Rajendra Nayak 
[ Viresh: Don't skip clk_set_rate() and massaged changelog ]
Signed-off-by: Viresh Kumar 
[ Viresh: Updated the patch to apply to v5.4 ]
Signed-off-by: Viresh Kumar 
---
 drivers/opp/core.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 9ff0538ee83a..518442be638d 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -843,10 +843,12 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
 
/* Return early if nothing to do */
if (old_freq == freq) {
-   dev_dbg(dev, "%s: old/new frequencies (%lu Hz) are same, 
nothing to do\n",
-   __func__, freq);
-   ret = 0;
-   goto put_opp_table;
+   if (!opp_table->required_opp_tables) {
+   dev_dbg(dev, "%s: old/new frequencies (%lu Hz) are 
same, nothing to do\n",
+   __func__, freq);
+   ret = 0;
+   goto put_opp_table;
+   }
}
 
temp_freq = old_freq;
-- 
2.25.0.rc1.19.g042ed3e048af



Re: [RFC PATCH 1/2] opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER

2020-08-24 Thread Viresh Kumar
On 12-08-20, 11:10, Ulf Hansson wrote:
> On Mon, 27 Jul 2020 at 11:31, Stephan Gerhold  wrote:
> > I wasn't sure if the changes in drivers/base/power/domain.c
> > should be made in a separate commit, but they need to be made together
> > with the other changes.
> 
> I would suggest to move the changes in drivers/base/power/domain.c
> into a separate patch, still part of the series, but let it preceed
> $subject patch.

That can't be done as the return type of a routine was changing and that needs
to be sorted out in the same patch.

-- 
viresh


[PATCH V2 1/2] opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER

2020-08-24 Thread Viresh Kumar
From: Stephan Gerhold 

The OPP core manages various resources, e.g. clocks or interconnect paths.
These resources are looked up when the OPP table is allocated once
dev_pm_opp_get_opp_table() is called the first time (either directly
or indirectly through one of the many helper functions).

At this point, the resources may not be available yet, i.e. looking them
up will result in -EPROBE_DEFER. Unfortunately, dev_pm_opp_get_opp_table()
is currently unable to propagate this error code since it only returns
the allocated OPP table or NULL.

This means that all consumers of the OPP core are required to make sure
that all necessary resources are available. Usually this happens by
requesting them, checking the result and releasing them immediately after.

For example, we have added "dev_pm_opp_of_find_icc_paths(dev, NULL)" to
several drivers now just to make sure the interconnect providers are
ready before the OPP table is allocated. If this call is missing,
the OPP core will only warn about this and then attempt to continue
without interconnect. This will eventually fail horribly, e.g.:

cpu cpu0: _allocate_opp_table: Error finding interconnect paths: -517
... later ...
of: _read_bw: Mismatch between opp-peak-kBps and paths (1 0)
cpu cpu0: _opp_add_static_v2: opp key field not found
cpu cpu0: _of_add_opp_table_v2: Failed to add OPP, -22

This example happens when trying to use interconnects for a CPU OPP
table together with qcom-cpufreq-nvmem.c. qcom-cpufreq-nvmem calls
dev_pm_opp_set_supported_hw(), which ends up allocating the OPP table
early. To fix the problem with the current approach we would need to add
yet another call to dev_pm_opp_of_find_icc_paths(dev, NULL).
But actually qcom-cpufreq-nvmem.c has nothing to do with interconnects...

This commit attempts to make this more robust by allowing
dev_pm_opp_get_opp_table() to return an error pointer. Fixing all
the usages is trivial because the function is usually used indirectly
through another helper (e.g. dev_pm_opp_set_supported_hw() above).
These other helpers already return an error pointer.

The example above then works correctly because set_supported_hw() will
return -EPROBE_DEFER, and qcom-cpufreq-nvmem.c already propagates that
error. It should also be possible to remove the remaining usages of
"dev_pm_opp_of_find_icc_paths(dev, NULL)" from other drivers as well.

Note that this commit currently only handles -EPROBE_DEFER for the
clock/interconnects within _allocate_opp_table(). Other errors are just
ignored as before. Eventually those should be propagated as well.

Signed-off-by: Stephan Gerhold 
[ Viresh: skip checking return value of dev_pm_opp_get_opp_table() for
  EPROBE_DEFER in domain.c, fix NULL return value and reorder
  code a bit in core.c, and update exynos-asv.c ]
Signed-off-by: Viresh Kumar 
---
Stephan, I have made some changes to the code. Please try it again and
lemme know if it works fine.

 drivers/base/power/domain.c  | 14 +
 drivers/opp/core.c   | 53 +++-
 drivers/opp/of.c |  8 ++---
 drivers/soc/samsung/exynos-asv.c |  2 +-
 4 files changed, 44 insertions(+), 33 deletions(-)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 2cb5e04cf86c..b92bb61550d3 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -2044,8 +2044,9 @@ int of_genpd_add_provider_simple(struct device_node *np,
if (genpd->set_performance_state) {
ret = dev_pm_opp_of_add_table(>dev);
if (ret) {
-   dev_err(>dev, "Failed to add OPP table: %d\n",
-   ret);
+   if (ret != -EPROBE_DEFER)
+   dev_err(>dev, "Failed to add OPP table: 
%d\n",
+   ret);
goto unlock;
}
 
@@ -2054,7 +2055,7 @@ int of_genpd_add_provider_simple(struct device_node *np,
 * state.
 */
genpd->opp_table = dev_pm_opp_get_opp_table(>dev);
-   WARN_ON(!genpd->opp_table);
+   WARN_ON(IS_ERR(genpd->opp_table));
}
 
ret = genpd_add_provider(np, genpd_xlate_simple, genpd);
@@ -2111,8 +2112,9 @@ int of_genpd_add_provider_onecell(struct device_node *np,
if (genpd->set_performance_state) {
ret = dev_pm_opp_of_add_table_indexed(>dev, i);
if (ret) {
-   dev_err(>dev, "Failed to add OPP table 
for index %d: %d\n",
-   i, ret);
+   if (ret != -EPROBE_DEFER)
+   dev_err(>dev, "Failed to add OPP 
table for index %d: %d\n",
+   i, 

[PATCH V2 2/2] cpufreq: dt: Refactor initialization to handle probe deferral properly

2020-08-24 Thread Viresh Kumar
From: Stephan Gerhold 

cpufreq-dt is currently unable to handle -EPROBE_DEFER properly
because the error code is not propagated for the cpufreq_driver->init()
callback. Instead, it attempts to avoid the situation by temporarily
requesting all resources within resources_available() and releasing them
again immediately after. This has several disadvantages:

  - Whenever we add something like interconnect handling to the OPP core
we need to patch cpufreq-dt to request these resources early.

  - resources_available() is only run for CPU0, but other clusters may
eventually depend on other resources that are not available yet.
(See FIXME comment removed by this commit...)

  - All resources need to be looked up several times.

Now that the OPP core can propagate -EPROBE_DEFER during initialization,
it would be nice to avoid all that trouble and just propagate its error
code when necessary.

This commit refactors the cpufreq-dt driver to initialize private_data
before registering the cpufreq driver. We do this by iterating over
all possible CPUs and ensure that all resources are initialized:

  1. dev_pm_opp_get_opp_table() ensures the OPP table is allocated
 and initialized with clock and interconnects.

  2. dev_pm_opp_set_regulators() requests the regulators and assigns
 them to the OPP table.

  3. We call dev_pm_opp_of_get_sharing_cpus() early so that we only
 initialize the OPP table once for each shared policy.

With these changes, we actually end up saving a few lines of code,
the resources are no longer looked up multiple times and everything
should be much more robust.

Signed-off-by: Stephan Gerhold 
[ Viresh: Use list_head structure for maintaining the list and minor
  changes ]
Signed-off-by: Viresh Kumar 
---
 drivers/cpufreq/cpufreq-dt.c | 286 +--
 1 file changed, 143 insertions(+), 143 deletions(-)

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index 944d7b45afe9..d29cd93045fa 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -24,18 +25,35 @@
 #include "cpufreq-dt.h"
 
 struct private_data {
-   struct opp_table *opp_table;
+   struct list_head node;
+
+   cpumask_var_t cpus;
struct device *cpu_dev;
-   const char *reg_name;
+   struct opp_table *opp_table;
+   struct opp_table *reg_opp_table;
bool have_static_opps;
 };
 
+static LIST_HEAD(priv_list);
+
 static struct freq_attr *cpufreq_dt_attr[] = {
_freq_attr_scaling_available_freqs,
NULL,   /* Extra space for boost-attr if required */
NULL,
 };
 
+static struct private_data *cpufreq_dt_find_data(int cpu)
+{
+   struct private_data *priv;
+
+   list_for_each_entry(priv, _list, node) {
+   if (cpumask_test_cpu(cpu, priv->cpus))
+   return priv;
+   }
+
+   return NULL;
+}
+
 static int set_target(struct cpufreq_policy *policy, unsigned int index)
 {
struct private_data *priv = policy->driver_data;
@@ -90,83 +108,24 @@ static const char *find_supply_name(struct device *dev)
return name;
 }
 
-static int resources_available(void)
-{
-   struct device *cpu_dev;
-   struct regulator *cpu_reg;
-   struct clk *cpu_clk;
-   int ret = 0;
-   const char *name;
-
-   cpu_dev = get_cpu_device(0);
-   if (!cpu_dev) {
-   pr_err("failed to get cpu0 device\n");
-   return -ENODEV;
-   }
-
-   cpu_clk = clk_get(cpu_dev, NULL);
-   ret = PTR_ERR_OR_ZERO(cpu_clk);
-   if (ret) {
-   /*
-* If cpu's clk node is present, but clock is not yet
-* registered, we should try defering probe.
-*/
-   if (ret == -EPROBE_DEFER)
-   dev_dbg(cpu_dev, "clock not ready, retry\n");
-   else
-   dev_err(cpu_dev, "failed to get clock: %d\n", ret);
-
-   return ret;
-   }
-
-   clk_put(cpu_clk);
-
-   ret = dev_pm_opp_of_find_icc_paths(cpu_dev, NULL);
-   if (ret)
-   return ret;
-
-   name = find_supply_name(cpu_dev);
-   /* Platform doesn't require regulator */
-   if (!name)
-   return 0;
-
-   cpu_reg = regulator_get_optional(cpu_dev, name);
-   ret = PTR_ERR_OR_ZERO(cpu_reg);
-   if (ret) {
-   /*
-* If cpu's regulator supply node is present, but regulator is
-* not yet registered, we should try defering probe.
-*/
-   if (ret == -EPROBE_DEFER)
-   dev_dbg(cpu_dev, "cpu0 regulator not ready, retry\n");
-   else
-   dev_dbg(cpu_dev, "no regulator for cpu0: %d\n", ret);
-

Re: [Patch] cpufreq: replace cpu_logical_map with read_cpuid_mpir

2020-08-20 Thread Viresh Kumar
On 20-08-20, 13:37, Sudeep Holla wrote:
> On Thu, Aug 20, 2020 at 11:09:45AM +0530, Viresh Kumar wrote:
> > On 12-08-20, 01:13, Sumit Gupta wrote:
> > > Commit eaecca9e7710 ("arm64: Fix __cpu_logical_map undefined issue")
> > > fixes the issue with building tegra194 cpufreq driver as module. But
> > > the fix might cause problem while supporting physical cpu hotplug[1].
> > > 
> > > This patch fixes the original problem by avoiding use of 
> > > cpu_logical_map().
> > > Instead calling read_cpuid_mpidr() to get MPIDR on target cpu.
> > > 
> > > [1] https://lore.kernel.org/linux-arm-kernel/20200724131059.GB6521@bogus/
> > > 
> > > Reviewed-by: Sudeep Holla 
> > > Signed-off-by: Sumit Gupta 
> > > ---
> > >  drivers/cpufreq/tegra194-cpufreq.c | 10 +++---
> > >  1 file changed, 7 insertions(+), 3 deletions(-)
> > 
> > Applied. Thanks.
> 
> Just to confirm, is this going as a fix ? We want to drop exporting
> cpu_logical_map in v5.9 so this needs to go as fix. I missed it earlier,
> actually,
> 
> Fixes: df320f89359c ("cpufreq: Add Tegra194 cpufreq driver")
> is appropriate here so that we can drop export symbol which was part of
> Commit eaecca9e7710 ("arm64: Fix __cpu_logical_map undefined issue")
> as a workaround to  fix the build.

Okay.

Rafael: Please pick this patch directly for next rc with 

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH] thermal: sysfs: Fall back to vmalloc() for cooling device's statistics

2020-08-20 Thread Viresh Kumar
On 21-08-20, 10:44, Yue Hu wrote:
> From: Yue Hu 
> 
> We observed warning about kzalloc() when register thermal cooling device
> in backlight_device_register(). backlight display can be a cooling device
> since reducing screen brightness will can help reduce temperature.
> 
> However, ->get_max_state of backlight will assign max brightness of 1024
> to states. The memory size can be getting 1MB+ due to states * states.
> That is so large to trigger kmalloc() warning.
> 
> So, let's use kvzalloc() to avoid the issue, also change kfree -> kvfree.
> 
> Suggested-by: Amit Kucheria 
> Signed-off-by: Yue Hu 
> ---
>  drivers/thermal/thermal_sysfs.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

Acked-by: Viresh Kumar 

-- 
viresh


[PATCH 4/8] mmc: sdhci-msm: Unconditionally call dev_pm_opp_of_remove_table()

2020-08-20 Thread Viresh Kumar
dev_pm_opp_of_remove_table() doesn't report any errors when it fails to
find the OPP table with error -ENODEV (i.e. OPP table not present for
the device). And we can call dev_pm_opp_of_remove_table()
unconditionally here.

Signed-off-by: Viresh Kumar 
---
 drivers/mmc/host/sdhci-msm.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
index 5a33389037cd..b7e47107a31a 100644
--- a/drivers/mmc/host/sdhci-msm.c
+++ b/drivers/mmc/host/sdhci-msm.c
@@ -263,7 +263,6 @@ struct sdhci_msm_host {
unsigned long clk_rate;
struct mmc_host *mmc;
struct opp_table *opp_table;
-   bool has_opp_table;
bool use_14lpp_dll_reset;
bool tuning_done;
bool calibration_done;
@@ -2285,9 +2284,7 @@ static int sdhci_msm_probe(struct platform_device *pdev)
 
/* OPP table is optional */
ret = dev_pm_opp_of_add_table(>dev);
-   if (!ret) {
-   msm_host->has_opp_table = true;
-   } else if (ret != -ENODEV) {
+   if (ret != -ENODEV) {
dev_err(>dev, "Invalid OPP table in Device tree\n");
goto opp_cleanup;
}
@@ -2453,8 +2450,7 @@ static int sdhci_msm_probe(struct platform_device *pdev)
clk_bulk_disable_unprepare(ARRAY_SIZE(msm_host->bulk_clks),
   msm_host->bulk_clks);
 opp_cleanup:
-   if (msm_host->has_opp_table)
-   dev_pm_opp_of_remove_table(>dev);
+   dev_pm_opp_of_remove_table(>dev);
dev_pm_opp_put_clkname(msm_host->opp_table);
 bus_clk_disable:
if (!IS_ERR(msm_host->bus_clk))
@@ -2474,8 +2470,7 @@ static int sdhci_msm_remove(struct platform_device *pdev)
 
sdhci_remove_host(host, dead);
 
-   if (msm_host->has_opp_table)
-   dev_pm_opp_of_remove_table(>dev);
+   dev_pm_opp_of_remove_table(>dev);
dev_pm_opp_put_clkname(msm_host->opp_table);
pm_runtime_get_sync(>dev);
pm_runtime_disable(>dev);
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH 1/8] cpufreq: imx6q: Unconditionally call dev_pm_opp_of_remove_table()

2020-08-20 Thread Viresh Kumar
dev_pm_opp_of_remove_table() doesn't report any errors when it fails to
find the OPP table with error -ENODEV (i.e. OPP table not present for
the device). And we can call dev_pm_opp_of_remove_table()
unconditionally here.

Signed-off-by: Viresh Kumar 
---
 drivers/cpufreq/imx6q-cpufreq.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
index ef7b34c1fd2b..5bf5fc759881 100644
--- a/drivers/cpufreq/imx6q-cpufreq.c
+++ b/drivers/cpufreq/imx6q-cpufreq.c
@@ -48,7 +48,6 @@ static struct clk_bulk_data clks[] = {
 };
 
 static struct device *cpu_dev;
-static bool free_opp;
 static struct cpufreq_frequency_table *freq_table;
 static unsigned int max_freq;
 static unsigned int transition_latency;
@@ -390,9 +389,6 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
goto put_reg;
}
 
-   /* Because we have added the OPPs here, we must free them */
-   free_opp = true;
-
if (of_machine_is_compatible("fsl,imx6ul") ||
of_machine_is_compatible("fsl,imx6ull")) {
ret = imx6ul_opp_check_speed_grading(cpu_dev);
@@ -507,8 +503,7 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
 free_freq_table:
dev_pm_opp_free_cpufreq_table(cpu_dev, _table);
 out_free_opp:
-   if (free_opp)
-   dev_pm_opp_of_remove_table(cpu_dev);
+   dev_pm_opp_of_remove_table(cpu_dev);
 put_reg:
if (!IS_ERR(arm_reg))
regulator_put(arm_reg);
@@ -528,8 +523,7 @@ static int imx6q_cpufreq_remove(struct platform_device 
*pdev)
 {
cpufreq_unregister_driver(_cpufreq_driver);
dev_pm_opp_free_cpufreq_table(cpu_dev, _table);
-   if (free_opp)
-   dev_pm_opp_of_remove_table(cpu_dev);
+   dev_pm_opp_of_remove_table(cpu_dev);
regulator_put(arm_reg);
if (!IS_ERR(pu_reg))
regulator_put(pu_reg);
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH 5/8] spi: spi-geni-qcom: Unconditionally call dev_pm_opp_of_remove_table()

2020-08-20 Thread Viresh Kumar
dev_pm_opp_of_remove_table() doesn't report any errors when it fails to
find the OPP table with error -ENODEV (i.e. OPP table not present for
the device). And we can call dev_pm_opp_of_remove_table()
unconditionally here.

Signed-off-by: Viresh Kumar 
---
 drivers/spi/spi-geni-qcom.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index 80cea5cd3612..5d3904a0aff8 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -613,9 +613,7 @@ static int spi_geni_probe(struct platform_device *pdev)
return PTR_ERR(mas->se.opp_table);
/* OPP table is optional */
ret = dev_pm_opp_of_add_table(>dev);
-   if (!ret) {
-   mas->se.has_opp_table = true;
-   } else if (ret != -ENODEV) {
+   if (ret != -ENODEV) {
dev_err(>dev, "invalid OPP table in device tree\n");
return ret;
}
@@ -669,8 +667,7 @@ static int spi_geni_probe(struct platform_device *pdev)
 spi_geni_probe_runtime_disable:
pm_runtime_disable(dev);
spi_master_put(spi);
-   if (mas->se.has_opp_table)
-   dev_pm_opp_of_remove_table(>dev);
+   dev_pm_opp_of_remove_table(>dev);
dev_pm_opp_put_clkname(mas->se.opp_table);
return ret;
 }
@@ -685,8 +682,7 @@ static int spi_geni_remove(struct platform_device *pdev)
 
free_irq(mas->irq, spi);
pm_runtime_disable(>dev);
-   if (mas->se.has_opp_table)
-   dev_pm_opp_of_remove_table(>dev);
+   dev_pm_opp_of_remove_table(>dev);
dev_pm_opp_put_clkname(mas->se.opp_table);
return 0;
 }
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH 0/8] opp: Unconditionally call dev_pm_opp_of_remove_table()

2020-08-20 Thread Viresh Kumar
Hello,

This cleans up some of the user code around calls to
dev_pm_opp_of_remove_table().

All the patches can be picked by respective maintainers directly except
for the last patch, which needs the previous two to get merged first.

These are based for 5.9-rc1.

Rajendra, Since most of these changes are related to qcom stuff, it
would be great if you can give them a try. I wasn't able to test them
due to lack of hardware.

Viresh Kumar (8):
  cpufreq: imx6q: Unconditionally call dev_pm_opp_of_remove_table()
  drm/lima: Unconditionally call dev_pm_opp_of_remove_table()
  drm/msm: Unconditionally call dev_pm_opp_of_remove_table()
  mmc: sdhci-msm: Unconditionally call dev_pm_opp_of_remove_table()
  spi: spi-geni-qcom: Unconditionally call dev_pm_opp_of_remove_table()
  spi: spi-qcom-qspi: Unconditionally call dev_pm_opp_of_remove_table()
  tty: serial: qcom_geni_serial: Unconditionally call
dev_pm_opp_of_remove_table()
  qcom-geni-se: remove has_opp_table

 drivers/cpufreq/imx6q-cpufreq.c | 10 ++
 drivers/gpu/drm/lima/lima_devfreq.c |  6 +-
 drivers/gpu/drm/lima/lima_devfreq.h |  1 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 10 +++---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  1 -
 drivers/gpu/drm/msm/dsi/dsi_host.c  |  8 ++--
 drivers/mmc/host/sdhci-msm.c| 11 +++
 drivers/spi/spi-geni-qcom.c | 10 +++---
 drivers/spi/spi-qcom-qspi.c | 11 +++
 drivers/tty/serial/qcom_geni_serial.c   | 10 +++---
 include/linux/qcom-geni-se.h|  2 --
 11 files changed, 20 insertions(+), 60 deletions(-)

-- 
2.25.0.rc1.19.g042ed3e048af



Re: [PATCH v6 1/6] tty: serial: qcom_geni_serial: Use OPP API to set clk/perf state

2020-08-20 Thread Viresh Kumar
On 22-07-20, 10:54, Viresh Kumar wrote:
> On 21-07-20, 01:43, Stephen Boyd wrote:
> > It seems that dev_pm_opp_set_rate() calls _find_opp_table() and finds
> > something that isn't an error pointer but then dev_pm_opp_of_add_table()
> > returns an error value because there isn't an operating-points property
> > in DT. We're getting saved because this driver also happens to call
> > dev_pm_opp_set_clkname() which allocates the OPP table a second time
> > (because the first time it got freed when dev_pm_opp_of_add_table()
> > return -ENODEV because the property was missing).
> > 
> > Why do we need 'has_opp_table' logic? It seems that we have to keep
> > track of the fact that dev_pm_opp_of_add_table() failed so that we don't
> > put the table again, but then dev_pm_opp_set_clkname() can be called
> > to allocate the table regardless.

I have sent a patchset to clean this stuff up a bit now.

-- 
viresh


[PATCH 6/8] spi: spi-qcom-qspi: Unconditionally call dev_pm_opp_of_remove_table()

2020-08-20 Thread Viresh Kumar
dev_pm_opp_of_remove_table() doesn't report any errors when it fails to
find the OPP table with error -ENODEV (i.e. OPP table not present for
the device). And we can call dev_pm_opp_of_remove_table()
unconditionally here.

Signed-off-by: Viresh Kumar 
---
 drivers/spi/spi-qcom-qspi.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/spi/spi-qcom-qspi.c b/drivers/spi/spi-qcom-qspi.c
index b8857a97f40a..7fd6f631dc17 100644
--- a/drivers/spi/spi-qcom-qspi.c
+++ b/drivers/spi/spi-qcom-qspi.c
@@ -143,7 +143,6 @@ struct qcom_qspi {
struct qspi_xfer xfer;
struct icc_path *icc_path_cpu_to_qspi;
struct opp_table *opp_table;
-   bool has_opp_table;
unsigned long last_speed;
/* Lock to protect data accessed by IRQs */
spinlock_t lock;
@@ -546,9 +545,7 @@ static int qcom_qspi_probe(struct platform_device *pdev)
}
/* OPP table is optional */
ret = dev_pm_opp_of_add_table(>dev);
-   if (!ret) {
-   ctrl->has_opp_table = true;
-   } else if (ret != -ENODEV) {
+   if (ret != -ENODEV) {
dev_err(>dev, "invalid OPP table in device tree\n");
goto exit_probe_master_put;
}
@@ -562,8 +559,7 @@ static int qcom_qspi_probe(struct platform_device *pdev)
return 0;
 
pm_runtime_disable(dev);
-   if (ctrl->has_opp_table)
-   dev_pm_opp_of_remove_table(>dev);
+   dev_pm_opp_of_remove_table(>dev);
dev_pm_opp_put_clkname(ctrl->opp_table);
 
 exit_probe_master_put:
@@ -581,8 +577,7 @@ static int qcom_qspi_remove(struct platform_device *pdev)
spi_unregister_master(master);
 
pm_runtime_disable(>dev);
-   if (ctrl->has_opp_table)
-   dev_pm_opp_of_remove_table(>dev);
+   dev_pm_opp_of_remove_table(>dev);
dev_pm_opp_put_clkname(ctrl->opp_table);
 
return 0;
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH 8/8] qcom-geni-se: remove has_opp_table

2020-08-20 Thread Viresh Kumar
has_opp_table isn't used anymore, remove it.

Signed-off-by: Viresh Kumar 
---
 include/linux/qcom-geni-se.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/linux/qcom-geni-se.h b/include/linux/qcom-geni-se.h
index 8f385fbe5a0e..02d1417c8ecf 100644
--- a/include/linux/qcom-geni-se.h
+++ b/include/linux/qcom-geni-se.h
@@ -48,7 +48,6 @@ struct geni_icc_path {
  * @clk_perf_tbl:  Table of clock frequency input to serial engine clock
  * @icc_paths: Array of ICC paths for SE
  * @opp_table: Pointer to the OPP table
- * @has_opp_table: Specifies if the SE has an OPP table
  */
 struct geni_se {
void __iomem *base;
@@ -59,7 +58,6 @@ struct geni_se {
unsigned long *clk_perf_tbl;
struct geni_icc_path icc_paths[3];
struct opp_table *opp_table;
-   bool has_opp_table;
 };
 
 /* Common SE registers */
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH 2/8] drm/lima: Unconditionally call dev_pm_opp_of_remove_table()

2020-08-20 Thread Viresh Kumar
dev_pm_opp_of_remove_table() doesn't report any errors when it fails to
find the OPP table with error -ENODEV (i.e. OPP table not present for
the device). And we can call dev_pm_opp_of_remove_table()
unconditionally here.

Signed-off-by: Viresh Kumar 
---
 drivers/gpu/drm/lima/lima_devfreq.c | 6 +-
 drivers/gpu/drm/lima/lima_devfreq.h | 1 -
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_devfreq.c 
b/drivers/gpu/drm/lima/lima_devfreq.c
index bbe02817721b..cd290d866a04 100644
--- a/drivers/gpu/drm/lima/lima_devfreq.c
+++ b/drivers/gpu/drm/lima/lima_devfreq.c
@@ -105,10 +105,7 @@ void lima_devfreq_fini(struct lima_device *ldev)
devfreq->devfreq = NULL;
}
 
-   if (devfreq->opp_of_table_added) {
-   dev_pm_opp_of_remove_table(ldev->dev);
-   devfreq->opp_of_table_added = false;
-   }
+   dev_pm_opp_of_remove_table(ldev->dev);
 
if (devfreq->regulators_opp_table) {
dev_pm_opp_put_regulators(devfreq->regulators_opp_table);
@@ -162,7 +159,6 @@ int lima_devfreq_init(struct lima_device *ldev)
ret = dev_pm_opp_of_add_table(dev);
if (ret)
goto err_fini;
-   ldevfreq->opp_of_table_added = true;
 
lima_devfreq_reset(ldevfreq);
 
diff --git a/drivers/gpu/drm/lima/lima_devfreq.h 
b/drivers/gpu/drm/lima/lima_devfreq.h
index 5eed2975a375..2d9b3008ce77 100644
--- a/drivers/gpu/drm/lima/lima_devfreq.h
+++ b/drivers/gpu/drm/lima/lima_devfreq.h
@@ -18,7 +18,6 @@ struct lima_devfreq {
struct opp_table *clkname_opp_table;
struct opp_table *regulators_opp_table;
struct thermal_cooling_device *cooling;
-   bool opp_of_table_added;
 
ktime_t busy_time;
ktime_t idle_time;
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH 3/8] drm/msm: Unconditionally call dev_pm_opp_of_remove_table()

2020-08-20 Thread Viresh Kumar
dev_pm_opp_of_remove_table() doesn't report any errors when it fails to
find the OPP table with error -ENODEV (i.e. OPP table not present for
the device). And we can call dev_pm_opp_of_remove_table()
unconditionally here.

Signed-off-by: Viresh Kumar 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 10 +++---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  1 -
 drivers/gpu/drm/msm/dsi/dsi_host.c  |  8 ++--
 3 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index c0a4d4e16d82..1bd67ba1bf1f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -1010,9 +1010,7 @@ static int dpu_bind(struct device *dev, struct device 
*master, void *data)
return PTR_ERR(dpu_kms->opp_table);
/* OPP table is optional */
ret = dev_pm_opp_of_add_table(dev);
-   if (!ret) {
-   dpu_kms->has_opp_table = true;
-   } else if (ret != -ENODEV) {
+   if (ret != -ENODEV) {
dev_err(dev, "invalid OPP table in device tree\n");
dev_pm_opp_put_clkname(dpu_kms->opp_table);
return ret;
@@ -1037,8 +1035,7 @@ static int dpu_bind(struct device *dev, struct device 
*master, void *data)
priv->kms = _kms->base;
return ret;
 err:
-   if (dpu_kms->has_opp_table)
-   dev_pm_opp_of_remove_table(dev);
+   dev_pm_opp_of_remove_table(dev);
dev_pm_opp_put_clkname(dpu_kms->opp_table);
return ret;
 }
@@ -1056,8 +1053,7 @@ static void dpu_unbind(struct device *dev, struct device 
*master, void *data)
if (dpu_kms->rpm_enabled)
pm_runtime_disable(>dev);
 
-   if (dpu_kms->has_opp_table)
-   dev_pm_opp_of_remove_table(dev);
+   dev_pm_opp_of_remove_table(dev);
dev_pm_opp_put_clkname(dpu_kms->opp_table);
 }
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
index e140cd633071..8295979a7165 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
@@ -129,7 +129,6 @@ struct dpu_kms {
bool rpm_enabled;
 
struct opp_table *opp_table;
-   bool has_opp_table;
 
struct dss_module_power mp;
 
diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c 
b/drivers/gpu/drm/msm/dsi/dsi_host.c
index b17ac6c27554..288f9df06ea2 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -113,7 +113,6 @@ struct msm_dsi_host {
struct clk *byte_intf_clk;
 
struct opp_table *opp_table;
-   bool has_opp_table;
 
u32 byte_clk_rate;
u32 pixel_clk_rate;
@@ -1891,9 +1890,7 @@ int msm_dsi_host_init(struct msm_dsi *msm_dsi)
return PTR_ERR(msm_host->opp_table);
/* OPP table is optional */
ret = dev_pm_opp_of_add_table(>dev);
-   if (!ret) {
-   msm_host->has_opp_table = true;
-   } else if (ret != -ENODEV) {
+   if (ret != -ENODEV) {
dev_err(>dev, "invalid OPP table in device tree\n");
dev_pm_opp_put_clkname(msm_host->opp_table);
return ret;
@@ -1934,8 +1931,7 @@ void msm_dsi_host_destroy(struct mipi_dsi_host *host)
mutex_destroy(_host->cmd_mutex);
mutex_destroy(_host->dev_mutex);
 
-   if (msm_host->has_opp_table)
-   dev_pm_opp_of_remove_table(_host->pdev->dev);
+   dev_pm_opp_of_remove_table(_host->pdev->dev);
dev_pm_opp_put_clkname(msm_host->opp_table);
pm_runtime_disable(_host->pdev->dev);
 }
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH 7/8] tty: serial: qcom_geni_serial: Unconditionally call dev_pm_opp_of_remove_table()

2020-08-20 Thread Viresh Kumar
dev_pm_opp_of_remove_table() doesn't report any errors when it fails to
find the OPP table with error -ENODEV (i.e. OPP table not present for
the device). And we can call dev_pm_opp_of_remove_table()
unconditionally here.

Signed-off-by: Viresh Kumar 
---
 drivers/tty/serial/qcom_geni_serial.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/tty/serial/qcom_geni_serial.c 
b/drivers/tty/serial/qcom_geni_serial.c
index 3aa29d201f54..e4c90a76e6ac 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1433,9 +1433,7 @@ static int qcom_geni_serial_probe(struct platform_device 
*pdev)
return PTR_ERR(port->se.opp_table);
/* OPP table is optional */
ret = dev_pm_opp_of_add_table(>dev);
-   if (!ret) {
-   port->se.has_opp_table = true;
-   } else if (ret != -ENODEV) {
+   if (ret != -ENODEV) {
dev_err(>dev, "invalid OPP table in device tree\n");
return ret;
}
@@ -1478,8 +1476,7 @@ static int qcom_geni_serial_probe(struct platform_device 
*pdev)
 
return 0;
 err:
-   if (port->se.has_opp_table)
-   dev_pm_opp_of_remove_table(>dev);
+   dev_pm_opp_of_remove_table(>dev);
dev_pm_opp_put_clkname(port->se.opp_table);
return ret;
 }
@@ -1489,8 +1486,7 @@ static int qcom_geni_serial_remove(struct platform_device 
*pdev)
struct qcom_geni_serial_port *port = platform_get_drvdata(pdev);
struct uart_driver *drv = port->private_data.drv;
 
-   if (port->se.has_opp_table)
-   dev_pm_opp_of_remove_table(>dev);
+   dev_pm_opp_of_remove_table(>dev);
dev_pm_opp_put_clkname(port->se.opp_table);
dev_pm_clear_wake_irq(>dev);
device_init_wakeup(>dev, false);
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH V3 1/4] opp: Rename regulator_enabled and use it as status of all resources

2020-08-20 Thread Viresh Kumar
Expand the scope of the regulator_enabled flag and use it to track
status of all the resources. This will be used for other stuff in the
next patch.

Signed-off-by: Viresh Kumar 
---
 drivers/opp/core.c | 19 +--
 drivers/opp/opp.h  |  4 ++--
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 9668ea04cc80..6f43ef4945b7 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -703,12 +703,10 @@ static int _generic_set_opp_regulator(struct opp_table 
*opp_table,
 * Enable the regulator after setting its voltages, otherwise it breaks
 * some boot-enabled regulators.
 */
-   if (unlikely(!opp_table->regulator_enabled)) {
+   if (unlikely(!opp_table->enabled)) {
ret = regulator_enable(reg);
if (ret < 0)
dev_warn(dev, "Failed to enable regulator: %d", ret);
-   else
-   opp_table->regulator_enabled = true;
}
 
return 0;
@@ -909,12 +907,12 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
if (ret)
goto put_opp_table;
 
-   if (opp_table->regulator_enabled) {
+   if (opp_table->regulators)
regulator_disable(opp_table->regulators[0]);
-   opp_table->regulator_enabled = false;
-   }
 
ret = _set_required_opps(dev, opp_table, NULL);
+
+   opp_table->enabled = false;
goto put_opp_table;
}
 
@@ -1001,8 +999,11 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
dev_err(dev, "Failed to set required opps: %d\n", ret);
}
 
-   if (!ret)
+   if (!ret) {
ret = _set_opp_bw(opp_table, opp, dev, false);
+   if (!ret)
+   opp_table->enabled = true;
+   }
 
 put_opp:
dev_pm_opp_put(opp);
@@ -1796,11 +1797,9 @@ void dev_pm_opp_put_regulators(struct opp_table 
*opp_table)
/* Make sure there are no concurrent readers while updating opp_table */
WARN_ON(!list_empty(_table->opp_list));
 
-   if (opp_table->regulator_enabled) {
+   if (opp_table->enabled) {
for (i = opp_table->regulator_count - 1; i >= 0; i--)
regulator_disable(opp_table->regulators[i]);
-
-   opp_table->regulator_enabled = false;
}
 
for (i = opp_table->regulator_count - 1; i >= 0; i--)
diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h
index e51646ff279e..0c3de3f6db5c 100644
--- a/drivers/opp/opp.h
+++ b/drivers/opp/opp.h
@@ -147,11 +147,11 @@ enum opp_table_access {
  * @clk: Device's clock handle
  * @regulators: Supply regulators
  * @regulator_count: Number of power supply regulators. Its value can be -1
- * @regulator_enabled: Set to true if regulators were previously enabled.
  * (uninitialized), 0 (no opp-microvolt property) or > 0 (has opp-microvolt
  * property).
  * @paths: Interconnect path handles
  * @path_count: Number of interconnect paths
+ * @enabled: Set to true if the device's resources are enabled/configured.
  * @genpd_performance_state: Device's power domain support performance state.
  * @is_genpd: Marks if the OPP table belongs to a genpd.
  * @set_opp: Platform specific set_opp callback
@@ -195,9 +195,9 @@ struct opp_table {
struct clk *clk;
struct regulator **regulators;
int regulator_count;
-   bool regulator_enabled;
struct icc_path **paths;
unsigned int path_count;
+   bool enabled;
bool genpd_performance_state;
bool is_genpd;
 
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH V3 2/4] opp: Reuse the enabled flag in !target_freq path

2020-08-20 Thread Viresh Kumar
The OPP core needs to track if the resources of devices are
enabled/configured or not, as it disables the resources when target_freq
is set to 0.

Handle that with the new enabled flag and remove otherwise complex
conditional statements.

Signed-off-by: Viresh Kumar 
---
 drivers/opp/core.c | 29 +++--
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 6f43ef4945b7..0b437d483b75 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -886,22 +886,18 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
}
 
if (unlikely(!target_freq)) {
+   ret = 0;
+
+   if (!opp_table->enabled)
+   goto put_opp_table;
+
/*
 * Some drivers need to support cases where some platforms may
 * have OPP table for the device, while others don't and
 * opp_set_rate() just needs to behave like clk_set_rate().
 */
-   if (!_get_opp_count(opp_table)) {
-   ret = 0;
-   goto put_opp_table;
-   }
-
-   if (!opp_table->required_opp_tables && !opp_table->regulators &&
-   !opp_table->paths) {
-   dev_err(dev, "target frequency can't be 0\n");
-   ret = -EINVAL;
+   if (!_get_opp_count(opp_table))
goto put_opp_table;
-   }
 
ret = _set_opp_bw(opp_table, NULL, dev, true);
if (ret)
@@ -931,14 +927,11 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
old_freq = clk_get_rate(clk);
 
/* Return early if nothing to do */
-   if (old_freq == freq) {
-   if (!opp_table->required_opp_tables && !opp_table->regulators &&
-   !opp_table->paths) {
-   dev_dbg(dev, "%s: old/new frequencies (%lu Hz) are 
same, nothing to do\n",
-   __func__, freq);
-   ret = 0;
-   goto put_opp_table;
-   }
+   if (opp_table->enabled && old_freq == freq) {
+   dev_dbg(dev, "%s: old/new frequencies (%lu Hz) are same, 
nothing to do\n",
+   __func__, freq);
+   ret = 0;
+   goto put_opp_table;
}
 
/*
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH V3 4/4] opp: Remove _dev_pm_opp_find_and_remove_table() wrapper

2020-08-20 Thread Viresh Kumar
Remove the unnecessary wrapper and merge
_dev_pm_opp_find_and_remove_table() with dev_pm_opp_remove_table().

Signed-off-by: Viresh Kumar 
---
 drivers/opp/core.c | 21 -
 drivers/opp/cpu.c  |  2 +-
 drivers/opp/of.c   |  2 +-
 drivers/opp/opp.h  |  1 -
 4 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 4edd2c3d6d91..6978b9218c6e 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -2395,7 +2395,14 @@ int dev_pm_opp_unregister_notifier(struct device *dev,
 }
 EXPORT_SYMBOL(dev_pm_opp_unregister_notifier);
 
-void _dev_pm_opp_find_and_remove_table(struct device *dev)
+/**
+ * dev_pm_opp_remove_table() - Free all OPPs associated with the device
+ * @dev:   device pointer used to lookup OPP table.
+ *
+ * Free both OPPs created using static entries present in DT and the
+ * dynamically added entries.
+ */
+void dev_pm_opp_remove_table(struct device *dev)
 {
struct opp_table *opp_table;
 
@@ -2420,16 +2427,4 @@ void _dev_pm_opp_find_and_remove_table(struct device 
*dev)
/* Drop reference taken while the OPP table was added */
dev_pm_opp_put_opp_table(opp_table);
 }
-
-/**
- * dev_pm_opp_remove_table() - Free all OPPs associated with the device
- * @dev:   device pointer used to lookup OPP table.
- *
- * Free both OPPs created using static entries present in DT and the
- * dynamically added entries.
- */
-void dev_pm_opp_remove_table(struct device *dev)
-{
-   _dev_pm_opp_find_and_remove_table(dev);
-}
 EXPORT_SYMBOL_GPL(dev_pm_opp_remove_table);
diff --git a/drivers/opp/cpu.c b/drivers/opp/cpu.c
index b5055cc886ef..5004335cf0de 100644
--- a/drivers/opp/cpu.c
+++ b/drivers/opp/cpu.c
@@ -124,7 +124,7 @@ void _dev_pm_opp_cpumask_remove_table(const struct cpumask 
*cpumask,
continue;
}
 
-   _dev_pm_opp_find_and_remove_table(cpu_dev);
+   dev_pm_opp_remove_table(cpu_dev);
}
 }
 
diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index 0430290670ab..7d9d4455a59e 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -616,7 +616,7 @@ static int opp_parse_supplies(struct dev_pm_opp *opp, 
struct device *dev,
  */
 void dev_pm_opp_of_remove_table(struct device *dev)
 {
-   _dev_pm_opp_find_and_remove_table(dev);
+   dev_pm_opp_remove_table(dev);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_of_remove_table);
 
diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h
index 0c3de3f6db5c..78e876ec803e 100644
--- a/drivers/opp/opp.h
+++ b/drivers/opp/opp.h
@@ -217,7 +217,6 @@ void _get_opp_table_kref(struct opp_table *opp_table);
 int _get_opp_count(struct opp_table *opp_table);
 struct opp_table *_find_opp_table(struct device *dev);
 struct opp_device *_add_opp_dev(const struct device *dev, struct opp_table 
*opp_table);
-void _dev_pm_opp_find_and_remove_table(struct device *dev);
 struct dev_pm_opp *_opp_allocate(struct opp_table *opp_table);
 void _opp_free(struct dev_pm_opp *opp);
 int _opp_compare_key(struct dev_pm_opp *opp1, struct dev_pm_opp *opp2);
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH V3 0/4] opp: general cleanups

2020-08-20 Thread Viresh Kumar
Hi,

Here is another version of the cleanups I sent earlier.

Rajendra: Please see if these work fine now.

V3:
- Dropped v2 1/4 as it is already merged.
- New patch 4/4 added.
- Reordered the first two patches here (Stephen)
- disable regulator only if present

Viresh Kumar (4):
  opp: Rename regulator_enabled and use it as status of all resources
  opp: Track device's resources configuration status
  opp: Split out _opp_set_rate_zero()
  opp: Remove _dev_pm_opp_find_and_remove_table() wrapper

 drivers/opp/core.c | 103 +
 drivers/opp/cpu.c  |   2 +-
 drivers/opp/of.c   |   2 +-
 drivers/opp/opp.h  |   5 +--
 4 files changed, 52 insertions(+), 60 deletions(-)

-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH V3 3/4] opp: Split out _opp_set_rate_zero()

2020-08-20 Thread Viresh Kumar
Create separate routine _opp_set_rate_zero() to handle !target_freq
case.

Signed-off-by: Viresh Kumar 
---
 drivers/opp/core.c | 52 ++
 1 file changed, 29 insertions(+), 23 deletions(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 0b437d483b75..4edd2c3d6d91 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -860,6 +860,34 @@ int dev_pm_opp_set_bw(struct device *dev, struct 
dev_pm_opp *opp)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_set_bw);
 
+static int _opp_set_rate_zero(struct device *dev, struct opp_table *opp_table)
+{
+   int ret;
+
+   if (!opp_table->enabled)
+   return 0;
+
+   /*
+* Some drivers need to support cases where some platforms may
+* have OPP table for the device, while others don't and
+* opp_set_rate() just needs to behave like clk_set_rate().
+*/
+   if (!_get_opp_count(opp_table))
+   return 0;
+
+   ret = _set_opp_bw(opp_table, NULL, dev, true);
+   if (ret)
+   return ret;
+
+   if (opp_table->regulators)
+   regulator_disable(opp_table->regulators[0]);
+
+   ret = _set_required_opps(dev, opp_table, NULL);
+
+   opp_table->enabled = false;
+   return ret;
+}
+
 /**
  * dev_pm_opp_set_rate() - Configure new OPP based on frequency
  * @dev:device for which we do this operation
@@ -886,29 +914,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
}
 
if (unlikely(!target_freq)) {
-   ret = 0;
-
-   if (!opp_table->enabled)
-   goto put_opp_table;
-
-   /*
-* Some drivers need to support cases where some platforms may
-* have OPP table for the device, while others don't and
-* opp_set_rate() just needs to behave like clk_set_rate().
-*/
-   if (!_get_opp_count(opp_table))
-   goto put_opp_table;
-
-   ret = _set_opp_bw(opp_table, NULL, dev, true);
-   if (ret)
-   goto put_opp_table;
-
-   if (opp_table->regulators)
-   regulator_disable(opp_table->regulators[0]);
-
-   ret = _set_required_opps(dev, opp_table, NULL);
-
-   opp_table->enabled = false;
+   ret = _opp_set_rate_zero(dev, opp_table);
goto put_opp_table;
}
 
-- 
2.25.0.rc1.19.g042ed3e048af



Re: [PATCH 3/4] opp: Reused enabled flag and remove regulator_enabled

2020-08-20 Thread Viresh Kumar
On 15-08-20, 00:55, Stephen Boyd wrote:
> Quoting Viresh Kumar (2020-08-12 21:29:00)
> > diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> > index e8882e7fd8a5..5f5da257f58a 100644
> > --- a/drivers/opp/core.c
> > +++ b/drivers/opp/core.c
> > @@ -703,12 +703,10 @@ static int _generic_set_opp_regulator(struct 
> > opp_table *opp_table,
> >  * Enable the regulator after setting its voltages, otherwise it 
> > breaks
> >  * some boot-enabled regulators.
> >  */
> > -   if (unlikely(!opp_table->regulator_enabled)) {
> > +   if (unlikely(!opp_table->enabled)) {
> > ret = regulator_enable(reg);
> > if (ret < 0)
> > dev_warn(dev, "Failed to enable regulator: %d", 
> > ret);
> > -   else
> > -   opp_table->regulator_enabled = true;
> 
> A quick glance makes this look unsafe now because we're only checking
> 'enabled' and not actually setting it when this function is called. I
> have to go back to the previous patch to understand where enabled is now
> set to confirm that it is OK. If it was all one patch all the context
> would be here.

The only case where things can go crazy are the cases where (for
example) clk_set_rate() fails, or something like that which would be a
bug and it shouldn't bother in the normal working of this code.

-- 
viresh


Re: [Patch] cpufreq: replace cpu_logical_map with read_cpuid_mpir

2020-08-19 Thread Viresh Kumar
On 12-08-20, 01:13, Sumit Gupta wrote:
> Commit eaecca9e7710 ("arm64: Fix __cpu_logical_map undefined issue")
> fixes the issue with building tegra194 cpufreq driver as module. But
> the fix might cause problem while supporting physical cpu hotplug[1].
> 
> This patch fixes the original problem by avoiding use of cpu_logical_map().
> Instead calling read_cpuid_mpidr() to get MPIDR on target cpu.
> 
> [1] https://lore.kernel.org/linux-arm-kernel/20200724131059.GB6521@bogus/
> 
> Reviewed-by: Sudeep Holla 
> Signed-off-by: Sumit Gupta 
> ---
>  drivers/cpufreq/tegra194-cpufreq.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)

Applied. Thanks.

-- 
viresh


Re: linux-next: Fixes tag needs some work in the pm tree

2020-08-19 Thread Viresh Kumar
On 20-08-20, 08:01, Stephen Rothwell wrote:
> Hi all,
> 
> In commit
> 
>   ceac7fc18ac7 ("opp: Enable resources again if they were disabled earlier")
> 
> Fixes tag
> 
>   Fixes: cd7ea582 ("opp: Make dev_pm_opp_set_rate() handle freq = 0 to drop 
> performance votes")
> 
> has these problem(s):
> 
>   - SHA1 should be at least 12 digits long
> Can be fixed by setting core.abbrev to 12 (or more) or (for git v2.11
> or later) just making sure it is not set (or set to "auto").

It came from the author and I failed to verify it :(

Rafael: Do you want me to resend the pull request with this fixed ? Or
something else ?

-- 
viresh


Re: [RESEND PATCH 3/5] ARM: dts: spear: Align L2 cache-controller nodename with dtschema

2020-08-19 Thread Viresh Kumar
On 19-08-20, 19:58, Krzysztof Kozlowski wrote:
> Fix dtschema validator warnings like:
> l2-cache: $nodename:0: 'l2-cache' does not match 
> '^(cache-controller|cpu)(@[0-9a-f,]+)*$'
> 
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  arch/arm/boot/dts/spear13xx.dtsi | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm/boot/dts/spear13xx.dtsi 
> b/arch/arm/boot/dts/spear13xx.dtsi
> index f187da4485f4..c87b881b2c8b 100644
> --- a/arch/arm/boot/dts/spear13xx.dtsi
> +++ b/arch/arm/boot/dts/spear13xx.dtsi
> @@ -43,7 +43,7 @@
> 0 7 0x04>;
>   };
>  
> - L2: l2-cache {
> + L2: cache-controller {
>   compatible = "arm,pl310-cache";
>   reg = <0xed00 0x1000>;
>   cache-unified;

Acked-by: Viresh Kumar 

-- 
viresh


Re: [greybus-dev] [PATCH] drivers/greybus: Use kobj_to_dev() instead

2020-08-17 Thread Viresh Kumar
On 13-08-20, 11:34, Wang Qing wrote:
> Use kobj_to_dev() instead of container_of()
> 
> Signed-off-by: Wang Qing 
> ---
>  drivers/greybus/interface.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH 2/4] opp: Track device's resources configuration status

2020-08-16 Thread Viresh Kumar
On 15-08-20, 01:03, Stephen Boyd wrote:
> Quoting Viresh Kumar (2020-08-12 21:28:59)
> > The OPP core needs to track if the resources of devices are enabled or
> > configured or not, as it disables the resources when target_freq is set
> > to 0.
> > 
> > Handle that with a separate variable to make it easy to maintain.
> > 
> > Also note that we will unconditionally call clk_set_rate() in the case
> > where the resources are getting enabled again. This shouldn't have any
> > side effects anyway.
> 
> Any reason to want to do that?

To avoid more flags, code paths and simplicity of the code. And this
should normally happen in a corner case as well, like calling
set-rate(0) from suspend and then reinitializing things again in
resume.

> We'll have to grab the prepare lock in
> the clk framework to figure out that there's nothing to do sometimes. If
> anything, a comment may help to indicate that we call clk_set_rate()
> again, but don't expect it to matter much.

Ok.

-- 
viresh


Re: [PATCH] opp: Fix dev_pm_opp_set_rate() to not return early

2020-08-12 Thread Viresh Kumar
On 11-08-20, 14:09, Stephen Boyd wrote:
> This is a goto maze! Any chance we can clean this up?

I have sent a short series in reply to this series, please have a
look. It should look better now.

-- 
viresh


[PATCH V2 1/4] opp: Enable resources again if they were disabled earlier

2020-08-12 Thread Viresh Kumar
From: Rajendra Nayak 

dev_pm_opp_set_rate() can now be called with freq = 0 in order
to either drop performance or bandwidth votes or to disable
regulators on platforms which support them.

In such cases, a subsequent call to dev_pm_opp_set_rate() with
the same frequency ends up returning early because 'old_freq == freq'

Instead make it fall through and put back the dropped performance
and bandwidth votes and/or enable back the regulators.

Cc: v5.3+  # v5.3+
Fixes: cd7ea582 ("opp: Make dev_pm_opp_set_rate() handle freq = 0 to drop 
performance votes")
Reported-by: Sajida Bhanu 
Reviewed-by: Sibi Sankar 
Reported-by: Matthias Kaehlcke 
Tested-by: Matthias Kaehlcke 
Signed-off-by: Rajendra Nayak 
[ Viresh: Don't skip clk_set_rate() and massaged changelog ]
Signed-off-by: Viresh Kumar 
---
Hi Rajendra,

I wasn't able to test this stuff, please give it a try. I have
simplified your patch and cleaned up a bunch of stuff as well.

 drivers/opp/core.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index bdb028c7793d..9668ea04cc80 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -934,10 +934,13 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
 
/* Return early if nothing to do */
if (old_freq == freq) {
-   dev_dbg(dev, "%s: old/new frequencies (%lu Hz) are same, 
nothing to do\n",
-   __func__, freq);
-   ret = 0;
-   goto put_opp_table;
+   if (!opp_table->required_opp_tables && !opp_table->regulators &&
+   !opp_table->paths) {
+   dev_dbg(dev, "%s: old/new frequencies (%lu Hz) are 
same, nothing to do\n",
+   __func__, freq);
+   ret = 0;
+   goto put_opp_table;
+   }
}
 
/*
-- 
2.14.1



[PATCH 2/4] opp: Track device's resources configuration status

2020-08-12 Thread Viresh Kumar
The OPP core needs to track if the resources of devices are enabled or
configured or not, as it disables the resources when target_freq is set
to 0.

Handle that with a separate variable to make it easy to maintain.

Also note that we will unconditionally call clk_set_rate() in the case
where the resources are getting enabled again. This shouldn't have any
side effects anyway.

Signed-off-by: Viresh Kumar 
---
 drivers/opp/core.c | 37 ++---
 drivers/opp/opp.h  |  2 ++
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 9668ea04cc80..e8882e7fd8a5 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -888,22 +888,18 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
}
 
if (unlikely(!target_freq)) {
+   ret = 0;
+
+   if (!opp_table->enabled)
+   goto put_opp_table;
+
/*
 * Some drivers need to support cases where some platforms may
 * have OPP table for the device, while others don't and
 * opp_set_rate() just needs to behave like clk_set_rate().
 */
-   if (!_get_opp_count(opp_table)) {
-   ret = 0;
-   goto put_opp_table;
-   }
-
-   if (!opp_table->required_opp_tables && !opp_table->regulators &&
-   !opp_table->paths) {
-   dev_err(dev, "target frequency can't be 0\n");
-   ret = -EINVAL;
+   if (!_get_opp_count(opp_table))
goto put_opp_table;
-   }
 
ret = _set_opp_bw(opp_table, NULL, dev, true);
if (ret)
@@ -915,6 +911,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
}
 
ret = _set_required_opps(dev, opp_table, NULL);
+   if (!ret)
+   opp_table->enabled = false;
+
goto put_opp_table;
}
 
@@ -933,14 +932,11 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
old_freq = clk_get_rate(clk);
 
/* Return early if nothing to do */
-   if (old_freq == freq) {
-   if (!opp_table->required_opp_tables && !opp_table->regulators &&
-   !opp_table->paths) {
-   dev_dbg(dev, "%s: old/new frequencies (%lu Hz) are 
same, nothing to do\n",
-   __func__, freq);
-   ret = 0;
-   goto put_opp_table;
-   }
+   if (opp_table->enabled && old_freq == freq) {
+   dev_dbg(dev, "%s: old/new frequencies (%lu Hz) are same, 
nothing to do\n",
+   __func__, freq);
+   ret = 0;
+   goto put_opp_table;
}
 
/*
@@ -1001,8 +997,11 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
dev_err(dev, "Failed to set required opps: %d\n", ret);
}
 
-   if (!ret)
+   if (!ret) {
ret = _set_opp_bw(opp_table, opp, dev, false);
+   if (!ret)
+   opp_table->enabled = true;
+   }
 
 put_opp:
dev_pm_opp_put(opp);
diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h
index e51646ff279e..bd35802acc6e 100644
--- a/drivers/opp/opp.h
+++ b/drivers/opp/opp.h
@@ -152,6 +152,7 @@ enum opp_table_access {
  * property).
  * @paths: Interconnect path handles
  * @path_count: Number of interconnect paths
+ * @enabled: Set to true if the device's resources are enabled/configured.
  * @genpd_performance_state: Device's power domain support performance state.
  * @is_genpd: Marks if the OPP table belongs to a genpd.
  * @set_opp: Platform specific set_opp callback
@@ -198,6 +199,7 @@ struct opp_table {
bool regulator_enabled;
struct icc_path **paths;
unsigned int path_count;
+   bool enabled;
bool genpd_performance_state;
bool is_genpd;
 
-- 
2.14.1



[PATCH 3/4] opp: Reused enabled flag and remove regulator_enabled

2020-08-12 Thread Viresh Kumar
The common "enabled" flag can be used here instead of
"regulator_enabled" now.

Signed-off-by: Viresh Kumar 
---
 drivers/opp/core.c | 13 +++--
 drivers/opp/opp.h  |  2 --
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index e8882e7fd8a5..5f5da257f58a 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -703,12 +703,10 @@ static int _generic_set_opp_regulator(struct opp_table 
*opp_table,
 * Enable the regulator after setting its voltages, otherwise it breaks
 * some boot-enabled regulators.
 */
-   if (unlikely(!opp_table->regulator_enabled)) {
+   if (unlikely(!opp_table->enabled)) {
ret = regulator_enable(reg);
if (ret < 0)
dev_warn(dev, "Failed to enable regulator: %d", ret);
-   else
-   opp_table->regulator_enabled = true;
}
 
return 0;
@@ -905,10 +903,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
if (ret)
goto put_opp_table;
 
-   if (opp_table->regulator_enabled) {
-   regulator_disable(opp_table->regulators[0]);
-   opp_table->regulator_enabled = false;
-   }
+   regulator_disable(opp_table->regulators[0]);
 
ret = _set_required_opps(dev, opp_table, NULL);
if (!ret)
@@ -1795,11 +1790,9 @@ void dev_pm_opp_put_regulators(struct opp_table 
*opp_table)
/* Make sure there are no concurrent readers while updating opp_table */
WARN_ON(!list_empty(_table->opp_list));
 
-   if (opp_table->regulator_enabled) {
+   if (opp_table->enabled) {
for (i = opp_table->regulator_count - 1; i >= 0; i--)
regulator_disable(opp_table->regulators[i]);
-
-   opp_table->regulator_enabled = false;
}
 
for (i = opp_table->regulator_count - 1; i >= 0; i--)
diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h
index bd35802acc6e..0c3de3f6db5c 100644
--- a/drivers/opp/opp.h
+++ b/drivers/opp/opp.h
@@ -147,7 +147,6 @@ enum opp_table_access {
  * @clk: Device's clock handle
  * @regulators: Supply regulators
  * @regulator_count: Number of power supply regulators. Its value can be -1
- * @regulator_enabled: Set to true if regulators were previously enabled.
  * (uninitialized), 0 (no opp-microvolt property) or > 0 (has opp-microvolt
  * property).
  * @paths: Interconnect path handles
@@ -196,7 +195,6 @@ struct opp_table {
struct clk *clk;
struct regulator **regulators;
int regulator_count;
-   bool regulator_enabled;
struct icc_path **paths;
unsigned int path_count;
bool enabled;
-- 
2.14.1



[PATCH 4/4] opp: Split out _opp_set_rate_zero()

2020-08-12 Thread Viresh Kumar
Create separate routine _opp_set_rate_zero() to handle !target_freq
case.

Signed-off-by: Viresh Kumar 
---
 drivers/opp/core.c | 52 +---
 1 file changed, 29 insertions(+), 23 deletions(-)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 5f5da257f58a..e198b57efcf8 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -860,6 +860,34 @@ int dev_pm_opp_set_bw(struct device *dev, struct 
dev_pm_opp *opp)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_set_bw);
 
+static int _opp_set_rate_zero(struct device *dev, struct opp_table *opp_table)
+{
+   int ret;
+
+   if (!opp_table->enabled)
+   return 0;
+
+   /*
+* Some drivers need to support cases where some platforms may
+* have OPP table for the device, while others don't and
+* opp_set_rate() just needs to behave like clk_set_rate().
+*/
+   if (!_get_opp_count(opp_table))
+   return 0;
+
+   ret = _set_opp_bw(opp_table, NULL, dev, true);
+   if (ret)
+   return ret;
+
+   regulator_disable(opp_table->regulators[0]);
+
+   ret = _set_required_opps(dev, opp_table, NULL);
+   if (!ret)
+   opp_table->enabled = false;
+
+   return ret;
+}
+
 /**
  * dev_pm_opp_set_rate() - Configure new OPP based on frequency
  * @dev:device for which we do this operation
@@ -886,29 +914,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
target_freq)
}
 
if (unlikely(!target_freq)) {
-   ret = 0;
-
-   if (!opp_table->enabled)
-   goto put_opp_table;
-
-   /*
-* Some drivers need to support cases where some platforms may
-* have OPP table for the device, while others don't and
-* opp_set_rate() just needs to behave like clk_set_rate().
-*/
-   if (!_get_opp_count(opp_table))
-   goto put_opp_table;
-
-   ret = _set_opp_bw(opp_table, NULL, dev, true);
-   if (ret)
-   goto put_opp_table;
-
-   regulator_disable(opp_table->regulators[0]);
-
-   ret = _set_required_opps(dev, opp_table, NULL);
-   if (!ret)
-   opp_table->enabled = false;
-
+   ret = _opp_set_rate_zero(dev, opp_table);
goto put_opp_table;
}
 
-- 
2.14.1



Re: [RFC PATCH 1/2] opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER

2020-08-12 Thread Viresh Kumar
On 12-08-20, 12:53, Stephan Gerhold wrote:
> On Wed, Aug 12, 2020 at 11:10:38AM +0200, Ulf Hansson wrote:
> > > I wasn't sure if the changes in drivers/base/power/domain.c
> > > should be made in a separate commit, but they need to be made together
> > > with the other changes.
> > 
> > I would suggest to move the changes in drivers/base/power/domain.c
> > into a separate patch, still part of the series, but let it preceed
> > $subject patch.
> > 
> 
> OK, will do that in v2 - thank you!
> 
> I have another small build fix reported by the kernel test robot,
> but will wait with sending that out until Viresh had a chance to give
> some feedback on the basic idea. :)

What was the issue that was reported ? I may end up applying V1 only
with some of my changes.

-- 
viresh


Re: [PATCH] OPP: Put opp table in dev_pm_opp_set_rate() all the time

2020-08-12 Thread Viresh Kumar
On 11-08-20, 14:28, Stephen Boyd wrote:
> @@ -905,7 +907,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long 
> target_freq)
>  
>   ret = _set_opp_bw(opp_table, NULL, dev, true);
>   if (ret)
> - return ret;
> + goto put_opp_table;

This was broken by a different patch.

Fixes: b00e667a6d8b ("opp: Remove bandwidth votes when target_freq is zero")

I did split the patch into two and applied the correct tags (not yet
pushed though).

-- 
viresh


Re: [PATCH v2 37/41] cpufreq: s3c24xx: move low-level clk reg access into platform code

2020-08-06 Thread Viresh Kumar
On 06-08-20, 20:20, Krzysztof Kozlowski wrote:
> From: Arnd Bergmann 
> 
> Rather than have the cpufreq drivers touch include the
> common headers to get the constants, add a small indirection.
> This is still not the proper way that would do this through
> the common clk API, but it lets us kill off the header file
> usage.
> 
> Signed-off-by: Arnd Bergmann 
> [krzk: Rebase and fix -Wold-style-definition]
> Signed-off-by: Krzysztof Kozlowski 

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH v2 36/41] cpufreq: s3c2412: use global s3c2412_cpufreq_setrefresh

2020-08-06 Thread Viresh Kumar
On 06-08-20, 20:20, Krzysztof Kozlowski wrote:
> From: Arnd Bergmann 
> 
> There are two identical copies of the s3c2412_cpufreq_setrefresh
> function: a static one in the cpufreq driver and a global
> version in iotiming-s3c2412.c.
> 
> As the function requires the use of a hardcoded register address
> from a header that we want to not be visible to drivers, just
> move the existing global function and add a declaration in
> one of the cpufreq header files.
> 
> Signed-off-by: Arnd Bergmann 
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  drivers/cpufreq/s3c2412-cpufreq.c| 23 
>  include/linux/soc/samsung/s3c-cpufreq-core.h |  1 +
>  2 files changed, 1 insertion(+), 23 deletions(-)

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH v2 35/41] ARM: s3c: remove cpufreq header dependencies

2020-08-06 Thread Viresh Kumar
On 06-08-20, 20:20, Krzysztof Kozlowski wrote:
> From: Arnd Bergmann 
> 
> The cpufreq drivers are split between the machine directory
> and the drivers/cpufreq directory. In order to share header
> files after we convert s3c to multiplatform, those headers
> have to live in a different global location.
> 
> Move them to linux/soc/samsung/ in lack of a better place.
> 
> Signed-off-by: Arnd Bergmann 
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  arch/arm/mach-s3c24xx/common.c |  1 -
>  arch/arm/mach-s3c24xx/cpufreq-utils.c  |  2 +-
>  arch/arm/mach-s3c24xx/iotiming-s3c2410.c   |  2 +-
>  arch/arm/mach-s3c24xx/iotiming-s3c2412.c   |  2 +-
>  arch/arm/mach-s3c24xx/mach-bast.c  |  2 +-
>  arch/arm/mach-s3c24xx/mach-osiris-dvs.c|  2 +-
>  arch/arm/mach-s3c24xx/mach-osiris.c|  2 +-
>  arch/arm/mach-s3c24xx/pll-s3c2410.c|  4 ++--
>  arch/arm/mach-s3c24xx/pll-s3c2440-1200.c   |  4 ++--
>  arch/arm/mach-s3c24xx/pll-s3c2440-16934400.c   |  4 ++--
>  arch/arm/mach-s3c24xx/s3c2410.c|  1 -
>  arch/arm/mach-s3c24xx/s3c2412.c|  1 -
>  arch/arm/mach-s3c24xx/s3c244x.c|  2 --
>  arch/arm/mach-s3c64xx/s3c6400.c|  1 -
>  arch/arm/mach-s3c64xx/s3c6410.c|  2 +-
>  arch/arm/plat-samsung/include/plat/cpu.h   |  9 -
>  drivers/cpufreq/s3c2410-cpufreq.c  |  5 ++---
>  drivers/cpufreq/s3c2412-cpufreq.c  |  5 ++---
>  drivers/cpufreq/s3c2440-cpufreq.c  |  5 ++---
>  drivers/cpufreq/s3c24xx-cpufreq-debugfs.c  |  2 +-
>  drivers/cpufreq/s3c24xx-cpufreq.c  |  5 ++---
>  .../linux/soc/samsung/s3c-cpu-freq.h   |  4 
>  .../linux/soc/samsung/s3c-cpufreq-core.h   |  6 +-
>  include/linux/soc/samsung/s3c-pm.h | 10 ++
>  24 files changed, 41 insertions(+), 42 deletions(-)
>  rename arch/arm/plat-samsung/include/plat/cpu-freq.h => 
> include/linux/soc/samsung/s3c-cpu-freq.h (97%)
>  rename arch/arm/plat-samsung/include/plat/cpu-freq-core.h => 
> include/linux/soc/samsung/s3c-cpufreq-core.h (98%)

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH v2 34/41] cpufreq: s3c24xx: split out registers

2020-08-06 Thread Viresh Kumar
On 06-08-20, 20:20, Krzysztof Kozlowski wrote:
> From: Arnd Bergmann 
> 
> Each of the cpufreq drivers uses a fixed set of register
> bits, copy those definitions into the drivers to avoid
> including mach/regs-clock.h.
> 
> Signed-off-by: Arnd Bergmann 
> [krzk: Fix build by copying also S3C2410_LOCKTIME]
> Signed-off-by: Krzysztof Kozlowski 
> 
> ---

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH 0/4] CPUFreq statistics retrieved by drivers

2020-08-05 Thread Viresh Kumar
On 05-08-20, 12:04, Lukasz Luba wrote:
> I know that Viresh is going to develop patches and improve these
> cpufreq stats framework. Maybe he also had this 'aggregation' in mind.
> I will leave it him.

I am only going to look at cpufreq's view of stats independently from
the firmware.

-- 
viresh


Re: [PATCH 0/4] CPUFreq statistics retrieved by drivers

2020-08-04 Thread Viresh Kumar
On 04-08-20, 11:29, Lukasz Luba wrote:
> On 8/4/20 6:35 AM, Viresh Kumar wrote:
> > IIUC, the only concern right now is to capture stats with fast switch ? 
> > Maybe we
> > can do something else in that case and brainstorm a bit..
> 
> Correct, the fast switch is the only concern right now and not tracked. We
> could fill in that information with statistics data from firmware
> with a cpufreq driver help.
> 
> I could make the if from patch 1/4 covering narrowed case, when
> fast switch is present, check for drivers stats.
> Something like:
> ---8<
> if (policy->fast_switch_enabled)
>   if (policy->has_driver_stats)
>   return cpufreq_stats_present_driver_data(policy, buf);
>   else
>   return 0;
> -->8--

I don't think doing it with help of firmware is the right thing to do
here then. For another platform we may not have a firmware which can
help us, we need something in the opp core itself for that. Lemme see
if I can do something about it.

> > Why is firmware the governor here ? Aren't you talking about the simple fast
> > switch case only ?
> 
> I used a term 'governor' for the firmware because it makes the final
> set for the frequency. It (FW) should respect the frequency value
> set using the fast switch. I don't know how other firmware (e.g. Intel)
> treats this fast switch value or if they even expose FW stats, though.

For Intel I think, Linux is one of the entities that vote for deciding
the frequency of the CPUs and the firmware (after taking all such
factors into account) chooses a frequency by its own, which must be >=
the frequency requested by Linux.

> You can read about this statistics region in [1] at:
> 4.5.5 Performance domain statistics shared memory region
> 
> > 
> > Over that, I think this cpufreq stats information isn't parsed by any tool 
> > right
> > now and tweaking it a bit won't hurt anyone (like if we start capturing 
> > things a
> > bit differently). So we may not want to worry about breaking userspace ABI 
> > here,
> > if what we are looking to do is the right thing to do.
> 
> So, there is some hope... IMHO it would be better to have this cpufreq
> stats in normal location, rather then in scmi debugfs.

I agree.

> > I am not sure what notifications are we talking about here.
> 
> There is a notification mechanism described in the SCMI spec [1] at
> 4.5.4 Notifications.
> We were referring to that mechanism.

Ahh, I see. All I was thinking was about the cpufreq specific
notifiers :)

-- 
viresh


Re: [PATCH v2 4/7] cpufreq: report whether cpufreq supports Frequency Invariance (FI)

2020-08-04 Thread Viresh Kumar
On 03-08-20, 16:24, Ionela Voinescu wrote:
> Right, cpufreq_register_driver() should check that at least one of them
> is present

> (although currently cpufreq_register_driver() will return
> -EINVAL if .fast_switch() alone is present - something to be fixed).

I think it is fine as there is no guarantee from cpufreq core if
.fast_switch() will get called and so target/target_index must be
present. We can't do fast-switch today without schedutil (as only that
enables it) and if a notifier gets registered before the driver, then
we are gone again.

> Will do, on both accounts.
> 
> 
> > > + static_branch_enable_cpuslocked(_set_freq_scale);
> > > + pr_debug("%s: Driver %s can provide frequency invariance.",
> > > +  __func__, driver->name);
> > 
> > I think a simpler print will work well too.
> > 
> > pr_debug("Freq invariance enabled");
> > 
> 
> I think the right way of reporting this support is important here.

Yeah, we can't say it is enabled as you explained, though I meant
something else here then, i.e. getting rid of driver name and
unimportant stuff. What about this now:

pr_debug("supports frequency invariance");

This shall get printed as this finally:

cpufreq: supports frequency invariance

-- 
viresh


Re: [PATCH v2 3/7] arch_topology: disable frequency invariance for CONFIG_BL_SWITCHER

2020-08-04 Thread Viresh Kumar
On 30-07-20, 12:29, Dietmar Eggemann wrote:
> On 30/07/2020 06:24, Viresh Kumar wrote:
> > On 22-07-20, 10:37, Ionela Voinescu wrote:
> >> +++ b/drivers/base/arch_topology.c
> >> @@ -27,6 +27,7 @@ __weak bool arch_freq_counters_available(struct cpumask 
> >> *cpus)
> >>  }
> >>  DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
> >>  
> >> +#ifndef CONFIG_BL_SWITCHER
> >>  void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq,
> >> unsigned long max_freq)
> >>  {
> >> @@ -46,6 +47,7 @@ void arch_set_freq_scale(struct cpumask *cpus, unsigned 
> >> long cur_freq,
> >>for_each_cpu(i, cpus)
> >>per_cpu(freq_scale, i) = scale;
> >>  }
> >> +#endif
> > 
> > I don't really like this change, the ifdef hackery is disgusting and
> > then we are putting that in a completely different part of the kernel.
> > 
> > There are at least these two ways of solving this, maybe more:
> > 
> > - Fix the bl switcher driver and add the complexity in it (which you
> >   tried to do earlier).
> > 
> > - Add a cpufreq flag to skip arch-set-freq-scale call.
> 
> I agree it's not nice but IMHO the cpufreq flag is worse since we would
> introduce new infrastructure only for a deprecated feature. I'm assuming
> that BL SWITCHER is the only feature needing this CPUfreq flag extension.
> 
> #ifdef CONFIG_BL_SWITCHER is already in drivers/irqchip/irq-gic.c so
> it's ugly already.
> 
> Runtime detecting (via bL_switching_enabled) of BL SWITCHER is right now
> also only handled inside vexpress-spc-cpufreq.c via a
> bL_switcher_notifier. A mechanism which also sits behind a #ifdef
> CONFIG_BL_SWITCHER.

Vexpress one is a driver and so ugliness could be ignored here :)

So here is option number 3 (in continuation of the earlier two
options):
- Don't do anything for bL switcher, just add a TODO/NOTE in the
  driver that FIE is broken for switcher. And I don't think anyone
  will care about FIE for the switcher anyway :)

-- 
viresh


Re: [PATCH v2 2/7] cpufreq: set invariance scale factor on transition end

2020-08-04 Thread Viresh Kumar
On 03-08-20, 14:58, Ionela Voinescu wrote:
> Hi Viresh,
> 
> On Thursday 30 Jul 2020 at 09:43:34 (+0530), Viresh Kumar wrote:
> > On 22-07-20, 10:37, Ionela Voinescu wrote:
> > > While the move of the invariance setter calls (arch_set_freq_scale())
> > > from cpufreq drivers to cpufreq core maintained the previous
> > > functionality for existing drivers that use target_index() and
> > > fast_switch() for frequency switching, it also gives the possibility
> > > of adding support for users of the target() callback, which is exploited
> > > here.
> > > 
> > > To be noted that the target() callback has been flagged as deprecated
> > > since:
> > > 
> > > commit 9c0ebcf78fde ("cpufreq: Implement light weight ->target_index() 
> > > routine")
> > > 
> > > It also doesn't have that many users:
> > > 
> > >   cpufreq-nforce2.c:371:2:  .target = nforce2_target,
> > >   cppc_cpufreq.c:416:2: .target = cppc_cpufreq_set_target,
> > >   gx-suspmod.c:439:2:   .target = cpufreq_gx_target,
> > >   pcc-cpufreq.c:573:2:  .target = pcc_cpufreq_target,
> > > 
> > > Similarly to the path taken for target_index() calls in the cpufreq core
> > > during a frequency change, all of the drivers above will mark the end of a
> > > frequency change by a call to cpufreq_freq_transition_end().
> > > 
> > > Therefore, cpufreq_freq_transition_end() can be used as the location for
> > > the arch_set_freq_scale() call to potentially inform the scheduler of the
> > > frequency change.
> > > 
> > > This change maintains the previous functionality for the drivers that
> > > implement the target_index() callback, while also adding support for the
> > > few drivers that implement the deprecated target() callback.
> > > 
> > > Two notes are worthwhile here:
> > >  - In __target_index(), cpufreq_freq_transition_end() is called only for
> > >drivers that have synchronous notifications enabled. There is only one
> > >driver that disables them,
> > > 
> > >drivers/cpufreq/powernow-k8.c:1142: .flags = 
> > > CPUFREQ_ASYNC_NOTIFICATION,
> > > 
> > >which is deprecated.
> > 
> > I don't think this is deprecated.

Heh, maybe I misunderstood. I thought you are talking about the flag,
while you were talking about the driver.

> Sorry, possibly 'deprecated' is a strong word.
> 
> As far as I knew acpi_cpufreq was recommended more recently for K8/K10
> CPUs so that's why I decided not to create a special case for it, also
> considering that it was not supporting cpufreq-based frequency
> invariance to begin with.
> 
> We could support this as well by having a call to arch_set_freq_scale()
> on the else path in __target_index(). But given that there was only this
> one user of CPUFREQ_ASYNC_NOTIFICATION, I thought I'd propose this simpler
> version first.
> 
> Let me know if my reasoning is wrong.

Nevertheless, I don't think you need to mention this detail in
changelog for powernow-k8 as cpufreq_freq_transition_end() does get
called for it as well, by the driver instead of the core.

-- 
viresh


Re: [PATCH v3] Provide USF for the portable equipment.

2020-08-03 Thread Viresh Kumar
On 03-08-20, 22:31, Dongdong Yang wrote:
> From: Dongdong Yang 
> 
> This patch provides USF(User Sensitive Feedback factor) auxiliary
> cpufreq governor to support high level layer sysfs inodes setting
> for utils adjustment purpose from the identified scenario on portable
> equipment. Because the power consumption and UI response are more cared
> for by portable equipment users. And the "screen off" status stands for
> no request from the user, however, the kernel is still expected to
> notify the user in time on modem, network or powerkey events occur. USF
> provides "sched_usf_non_ux_r" sysfs inode to cut down the utils from
> user space tasks according to high level scenario. In addition, it
> usually hints more cpufreq demand that the preemptive counts of the
> tasks on the cpu burst and over the user expecting completed time such
> as the ratio sysctl_sched_latency to sysctl_sched_min_granularity on
> "screen on" status, which more likely with more UI. The sysfs inodes
> "sched_usf_up_l0_r" and "sched_usf_down_r" have been provided to adjust
> the utils according to high level identified scenario to alloc the
> cpufreq in time.
> 
> Changes in v3
>   - Move usf.c to kernel/sched.
>   - Remove trace_printk and debugfs.
>   - Add document draft.
>   - Update comments.
> 
> Changes in v2
>   - Add adjust_task_pred_set switch.
>   - Move adjust_task_pred_demand declaration into sched.h
>   - Update comments.

Sending updated patchset for this isn't going to help you my friend. You need
people (maintainers) to agree on the idea here first. The patch can be
beautified later if required once the idea is agreed upon. I saw Peter already
gave his NAK to it during V1. You need to discuss with people here to see why
they don't like it first and as Greg said earlier, this should not go to staging
at all if it ever makes it mainline.

The more versions you send now (without proper discussions first), the harder it
will be for this stuff to get merged upstream.

-- 
viresh


Re: [PATCH 0/4] CPUFreq statistics retrieved by drivers

2020-08-03 Thread Viresh Kumar
On 30-07-20, 10:36, Lukasz Luba wrote:
> On 7/30/20 10:10 AM, Sudeep Holla wrote:
> > On Thu, Jul 30, 2020 at 02:23:33PM +0530, Viresh Kumar wrote:
> > > On 29-07-20, 16:12, Lukasz Luba wrote:
> > > > The existing CPUFreq framework does not tracks the statistics when the
> > > > 'fast switch' is used or when firmware changes the frequency 
> > > > independently
> > > > due to e.g. thermal reasons. However, the firmware might track the 
> > > > frequency
> > > > changes and expose this to the kernel.
> > > > 
> > > > This patch set aims to introduce CPUfreq statistics gathered by firmware
> > > > and retrieved by CPUFreq driver. It would require a new API functions
> > > > in the CPUFreq, which allows to poke drivers to get these stats.
> > > > 
> > > > The needed CPUFreq infrastructure is in patch 1/4, patch 2/4 extends
> > > > ARM SCMI protocol layer, patches 3/4, 4/4  modify ARM SCMI CPUFreq 
> > > > driver.
> > > 
> > > Are you doing this for the fast switch case or because your platform
> > > actually runs at frequencies which may be different from what cpufreq
> > > core has requested ?
> > > 
> > 
> > I think so.
> 
> For both cases, but fast switch is major and present. Thermal is not
> currently implemented in SCP FW, but might be in future.

Okay, lets simplify things a bit and merge things slowly upstream and merge only
what is required right now.

IIUC, the only concern right now is to capture stats with fast switch ? Maybe we
can do something else in that case and brainstorm a bit..

> > > I am also not sure what these tables should represent, what the
> > > cpufreq core has decided for the CPUs or the frequencies we actually
> > > run at, as these two can be very different for example if the hardware
> > > runs at frequencies which don't match exactly to what is there in the
> > > freq table. I believe these are rather to show what cpufreq and its
> > > governors are doing with the CPUs.
> > > 
> > 
> > Exactly, I raised similar point in internal discussion and asked Lukasz
> > to take up the same on the list. I assume it was always what cpufreq
> > requested rather than what was delivered. So will we break the userspace
> > ABI if we change that is the main question.
> 
> Thank you for confirmation. If that is the mechanism for tracking what
> cpufreq governors are doing with the CPUs, then is clashes with
> presented data in FW memory, because firmware is the governor.

Why is firmware the governor here ? Aren't you talking about the simple fast
switch case only ?

Over that, I think this cpufreq stats information isn't parsed by any tool right
now and tweaking it a bit won't hurt anyone (like if we start capturing things a
bit differently). So we may not want to worry about breaking userspace ABI here,
if what we are looking to do is the right thing to do.

> > > Over that I would like the userspace stats to work exactly as the way
> > > they work right now, i.e. capture all transitions from one freq to
> > > other, not just time-in-state. Also resetting of the stats from
> > > userspace for example. All allocation and printing of the data must be
> > > done from stats core, the only thing which the driver would do at the
> > > end is updating the stats structure and nothing more. Instead of
> > > reading all stats from the firmware, it will be much easier if you can
> > > just get the information from the firmware whenever there is a
> > > frequency switch and then we can update the stats the way it is done
> > > right now. And that would be simple.
> > > 
> > 
> > Good point, but notifications may not be lightweight. If that is no good,
> > alternatively, I suggested to keep these firmware stats in a separate
> > debugfs. Thoughts ?
> 
> I agree that notifications might not be lightweight.

I am not sure what notifications are we talking about here.

> Furthermore I think
> this still clashes with the assumption that cpufreq governor decisions
> are tracked in these statistics, not the firmware decision.
> 
> In this case I think we would have to create debugfs.
> Sudeep do you think these debugfs should be exposed from the protocol
> layer:
> drivers/firmware/arm_scmi/perf.c
> or maybe from the cpufreq scmi driver? I would probably be safer to have
> it in the cpufreq driver because we have scmi_handle there.

For the CPUs it would be better if we can keep things in cpufreq only, lets see
how we go about it.

-- 
viresh


Re: [PATCH 1/2] cpufreq: tegra186: Fix initial frequency

2020-08-03 Thread Viresh Kumar
On 31-07-20, 13:14, Jon Hunter wrote:
> I have been doing some more testing on Tegra, I noticed that when
> reading the current CPU frequency via the sysfs scaling_cur_freq entry,
> this always returns the cached value (at least for Tegra). Looking at
> the implementation of scaling_cur_freq I see ...
> 
> static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
> {
> ssize_t ret; 
> unsigned int freq;
> 
> freq = arch_freq_get_on_cpu(policy->cpu);
> if (freq)
> ret = sprintf(buf, "%u\n", freq);
> else if (cpufreq_driver && cpufreq_driver->setpolicy &&
> cpufreq_driver->get)
> ret = sprintf(buf, "%u\n", cpufreq_driver->get(policy->cpu));
> else
> ret = sprintf(buf, "%u\n", policy->cur);
> return ret; 
> }
> 
> The various Tegra CPU frequency drivers do not implement the
> set_policy callback and hence why we always get the cached value. I
> see the following commit added this and before it simply return the
> cached value ...
> 
> commit c034b02e213d271b98c45c4a7b54af8f69aaac1e
> Author: Dirk Brandewie 
> Date:   Mon Oct 13 08:37:40 2014 -0700
> 
> cpufreq: expose scaling_cur_freq sysfs file for set_policy() drivers
> 
> Is this intentional? 

Yes, it is.

There are two sysfs files to read the current frequency.

- scaling_cur_freq: as you noticed it returns cached value unless it is for
  setpolicy drivers in whose case cpufreq core doesn't control the frequency and
  so doesn't cache any values.

- cpuinfo_cur_freq: This will return the value as read from hardware using
  ->get() callback.

-- 
viresh


Re: [PATCH v3] cpufreq: CPPC: simply the code access 'highest_perf' value in cppc_perf_caps struct

2020-08-03 Thread Viresh Kumar
On 04-08-20, 10:37, Xin Hao wrote:
> Hi everyone:
> 
> I want to know why my patch didn't merge into upstream ?

I have sent a pull request earlier today to Rafael and this will get
merged in the next pull request Rafael will send to Linus.

-- 
viresh


Re: [PATCH v5 0/6] Add support for GPU DDR BW scaling

2020-07-30 Thread Viresh Kumar
On 30-07-20, 08:27, Rob Clark wrote:
> Hmm, I've already sent my pull request to Dave, dropping the patch
> would require force-push and sending a new PR.  Which I can do if Dave
> prefers.  OTOH I guess it isn't the end of the world if the patch is
> merged via two different trees.

I don't think a patch can go via two trees, as that would have two sha
keys for the same code.

Though it is fine for a patch to go via two different trees if we make
sure the same sha key is used for both.

Will it be possible for you to provide a branch/tag of your branch
that I can base stuff of ?

-- 
viresh


Re: [PATCH 0/4] CPUFreq statistics retrieved by drivers

2020-07-30 Thread Viresh Kumar
On 29-07-20, 16:12, Lukasz Luba wrote:
> The existing CPUFreq framework does not tracks the statistics when the
> 'fast switch' is used or when firmware changes the frequency independently
> due to e.g. thermal reasons. However, the firmware might track the frequency
> changes and expose this to the kernel.
> 
> This patch set aims to introduce CPUfreq statistics gathered by firmware
> and retrieved by CPUFreq driver. It would require a new API functions
> in the CPUFreq, which allows to poke drivers to get these stats.
> 
> The needed CPUFreq infrastructure is in patch 1/4, patch 2/4 extends
> ARM SCMI protocol layer, patches 3/4, 4/4  modify ARM SCMI CPUFreq driver.

Are you doing this for the fast switch case or because your platform
actually runs at frequencies which may be different from what cpufreq
core has requested ?

I am also not sure what these tables should represent, what the
cpufreq core has decided for the CPUs or the frequencies we actually
run at, as these two can be very different for example if the hardware
runs at frequencies which don't match exactly to what is there in the
freq table. I believe these are rather to show what cpufreq and its
governors are doing with the CPUs.

Over that I would like the userspace stats to work exactly as the way
they work right now, i.e. capture all transitions from one freq to
other, not just time-in-state. Also resetting of the stats from
userspace for example. All allocation and printing of the data must be
done from stats core, the only thing which the driver would do at the
end is updating the stats structure and nothing more. Instead of
reading all stats from the firmware, it will be much easier if you can
just get the information from the firmware whenever there is a
frequency switch and then we can update the stats the way it is done
right now. And that would be simple.

-- 
viresh


Re: [PATCH] cpufreq: cached_resolved_idx can not be negative

2020-07-30 Thread Viresh Kumar
On 30-07-20, 12:02, Amit Kucheria wrote:
> Looking at this more closely, I found another call site for
> cpufreq_frequency_table_target() in cpufreq.c that needs the index to
> be unsigned int.
> 
> But then cpufreq_frequency_table_target() returns -EINVAL, so we

It returns -EINVAL only in the case where the relation is not valid,
which will never happen. Maybe that should be marked with WARN or BUG
and we should drop return value of -EINVAL.

Rafael ?

> should be able to handle int values.

And so no.

> I think you will need to fix the unconditional assignment of
> policy->cached_resolved_idx = idx
> in cpufreq_driver_resolve_freq(). It doesn't check for -EINVAL, so the
> qcom driver is write in checking for a negative value.

Right, I don't want it to have that check for the reason stated above.

The point is I don't want code that verifies cached-idx at all, it is
useless.

-- 
viresh


Re: [PATCH 2/2] thermal: cpufreq_cooling: Reuse effective_cpu_util()

2020-07-30 Thread Viresh Kumar
On 17-07-20, 11:46, Vincent Guittot wrote:
> On Thu, 16 Jul 2020 at 16:24, Lukasz Luba  wrote:
> > On 7/16/20 12:56 PM, Peter Zijlstra wrote:
> > > Currently cpufreq_cooling appears to estimate the CPU energy usage by
> > > calculating the percentage of idle time using the per-cpu cpustat stuff,
> > > which is pretty horrific.
> >
> > Even worse, it then *samples* the *current* CPU frequency at that
> > particular point in time and assumes that when the CPU wasn't idle
> > during that period - it had *this* frequency...
> 
> So there is 2 problems in the power calculation of cpufreq cooling device :
> - How to get an accurate utilization level of the cpu which is what
> this patch is trying to fix because using idle time is just wrong
> whereas scheduler utilization is frequency invariant

Since this patch is targeted only towards fixing this particular
problem, should I change something in the patch to make it acceptable
?

> - How to get power estimate from this utilization level. And as you
> pointed out, using the current freq which is not accurate.

This should be tackled separately I believe.

-- 
viresh


Re: [PATCH] cpufreq: cached_resolved_idx can not be negative

2020-07-30 Thread Viresh Kumar
On 30-07-20, 11:29, Amit Kucheria wrote:
> On Thu, Jul 30, 2020 at 9:38 AM Viresh Kumar  wrote:
> >
> > It is not possible for cached_resolved_idx to be invalid here as the
> > cpufreq core always sets index to a positive value.
> >
> > Change its type to unsigned int and fix qcom usage a bit.
> 
> Shouldn't you fix up idx in cpufreq_driver_resolve_freq() to be
> unsigned int too?

Yes, merged this into the patch.

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 0128de3603df..053d72e52a31 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -538,7 +538,7 @@ unsigned int cpufreq_driver_resolve_freq(struct 
cpufreq_policy *policy,
policy->cached_target_freq = target_freq;
 
if (cpufreq_driver->target_index) {
-   int idx;
+   unsigned int idx;
 
idx = cpufreq_frequency_table_target(policy, target_freq,
 CPUFREQ_RELATION_L);

-- 
viresh


Re: [PATCH v5 0/6] Add support for GPU DDR BW scaling

2020-07-29 Thread Viresh Kumar
On 22-07-20, 11:00, Viresh Kumar wrote:
> On 21-07-20, 07:28, Rob Clark wrote:
> > With your ack, I can add the patch the dev_pm_opp_set_bw patch to my
> > tree and merge it via msm-next -> drm-next -> linus
> 
> I wanted to send it via my tree, but its okay. Pick this patch from
> linux-next and add my Ack, I will drop it after that.
> 
> a8351c12c6c7 OPP: Add and export helper to set bandwidth

Oops, sorry for the trouble but this needs to go via my tree only :(

I maintain two different branches, one for OPP and another one for
cpufreq. There was no dependency within the OPP branch and so I
dropped it that day and asked you to take it.

But when I tried to send a pull request today I realised that one of
the qcom patches in the cpufreq branch is dependent on it and I need
to keep this patch in my tree.

-- 
viresh


Re: [PATCH v2 7/7] cpufreq: make schedutil the default for arm and arm64

2020-07-29 Thread Viresh Kumar
On 22-07-20, 10:37, Ionela Voinescu wrote:
> From: Valentin Schneider 
> 
> schedutil is already a hard-requirement for EAS, which has lead to making
> it default on arm (when CONFIG_BIG_LITTLE), see:
> 
>   commit 8fdcca8e254a ("cpufreq: Select schedutil when using big.LITTLE")
> 
> One thing worth pointing out is that schedutil isn't only relevant for
> asymmetric CPU capacity systems; for instance, schedutil is the only
> governor that honours util-clamp performance requests. Another good example
> of this is x86 switching to using it by default in:
> 
>   commit a00ec3874e7d ("cpufreq: intel_pstate: Select schedutil as the 
> default governor")
> 
> Arguably it should be made the default for all architectures, but it seems
> better to wait for them to also gain frequency invariance powers. Make it
> the default for arm && arm64 for now.
> 
> Signed-off-by: Valentin Schneider 
> Signed-off-by: Ionela Voinescu 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Russell King 
> Cc: Rafael J. Wysocki 
> Cc: Viresh Kumar 
> ---
>  drivers/cpufreq/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
> index e91750132552..2c7171e0b001 100644
> --- a/drivers/cpufreq/Kconfig
> +++ b/drivers/cpufreq/Kconfig
> @@ -37,7 +37,7 @@ config CPU_FREQ_STAT
>  choice
>   prompt "Default CPUFreq governor"
>   default CPU_FREQ_DEFAULT_GOV_USERSPACE if ARM_SA1100_CPUFREQ || 
> ARM_SA1110_CPUFREQ
> - default CPU_FREQ_DEFAULT_GOV_SCHEDUTIL if BIG_LITTLE
> + default CPU_FREQ_DEFAULT_GOV_SCHEDUTIL if ARM64 || ARM
>   default CPU_FREQ_DEFAULT_GOV_SCHEDUTIL if X86_INTEL_PSTATE && SMP
>   default CPU_FREQ_DEFAULT_GOV_PERFORMANCE
>   help

Applied. Thanks.

-- 
viresh


Re: [PATCH v2 4/7] cpufreq: report whether cpufreq supports Frequency Invariance (FI)

2020-07-29 Thread Viresh Kumar
On 22-07-20, 10:37, Ionela Voinescu wrote:
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 3497c1cd6818..1d0b046fe8e9 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -61,6 +61,9 @@ static struct cpufreq_driver *cpufreq_driver;
>  static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
>  static DEFINE_RWLOCK(cpufreq_driver_lock);
>  
> +/* Mark support for the scheduler's frequency invariance engine */
> +static DEFINE_STATIC_KEY_FALSE(cpufreq_set_freq_scale);
> +
>  /* Flag to suspend/resume CPUFreq governors */
>  static bool cpufreq_suspended;
>  
> @@ -69,6 +72,25 @@ static inline bool has_target(void)
>   return cpufreq_driver->target_index || cpufreq_driver->target;
>  }
>  
> +static inline
> +void enable_cpufreq_freq_invariance(struct cpufreq_driver *driver)
> +{
> + if ((driver->target || driver->target_index || driver->fast_switch) &&
> + !driver->setpolicy) {

Just checking for !driver->setpolicy should be enough here.

> + static_branch_enable_cpuslocked(_set_freq_scale);
> + pr_debug("%s: Driver %s can provide frequency invariance.",
> +  __func__, driver->name);

I think a simpler print will work well too.

pr_debug("Freq invariance enabled");

__func__ isn't really required as this is the only print with that
kind of info in cpufreq.c.

> + } else
> + pr_err("%s: Driver %s cannot provide frequency invariance.",
> + __func__, driver->name);

Why not supporting freq-invariance an error ? I will just drop this
message completely.

-- 
viresh


Re: [PATCH v2 3/7] arch_topology: disable frequency invariance for CONFIG_BL_SWITCHER

2020-07-29 Thread Viresh Kumar
On 22-07-20, 10:37, Ionela Voinescu wrote:
> +++ b/drivers/base/arch_topology.c
> @@ -27,6 +27,7 @@ __weak bool arch_freq_counters_available(struct cpumask 
> *cpus)
>  }
>  DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
>  
> +#ifndef CONFIG_BL_SWITCHER
>  void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq,
>unsigned long max_freq)
>  {
> @@ -46,6 +47,7 @@ void arch_set_freq_scale(struct cpumask *cpus, unsigned 
> long cur_freq,
>   for_each_cpu(i, cpus)
>   per_cpu(freq_scale, i) = scale;
>  }
> +#endif

I don't really like this change, the ifdef hackery is disgusting and
then we are putting that in a completely different part of the kernel.

There are at least these two ways of solving this, maybe more:

- Fix the bl switcher driver and add the complexity in it (which you
  tried to do earlier).

- Add a cpufreq flag to skip arch-set-freq-scale call.

Rafael ?

-- 
viresh


Re: [PATCH v2 2/7] cpufreq: set invariance scale factor on transition end

2020-07-29 Thread Viresh Kumar
On 22-07-20, 10:37, Ionela Voinescu wrote:
> While the move of the invariance setter calls (arch_set_freq_scale())
> from cpufreq drivers to cpufreq core maintained the previous
> functionality for existing drivers that use target_index() and
> fast_switch() for frequency switching, it also gives the possibility
> of adding support for users of the target() callback, which is exploited
> here.
> 
> To be noted that the target() callback has been flagged as deprecated
> since:
> 
> commit 9c0ebcf78fde ("cpufreq: Implement light weight ->target_index() 
> routine")
> 
> It also doesn't have that many users:
> 
>   cpufreq-nforce2.c:371:2:  .target = nforce2_target,
>   cppc_cpufreq.c:416:2: .target = cppc_cpufreq_set_target,
>   gx-suspmod.c:439:2:   .target = cpufreq_gx_target,
>   pcc-cpufreq.c:573:2:  .target = pcc_cpufreq_target,
> 
> Similarly to the path taken for target_index() calls in the cpufreq core
> during a frequency change, all of the drivers above will mark the end of a
> frequency change by a call to cpufreq_freq_transition_end().
> 
> Therefore, cpufreq_freq_transition_end() can be used as the location for
> the arch_set_freq_scale() call to potentially inform the scheduler of the
> frequency change.
> 
> This change maintains the previous functionality for the drivers that
> implement the target_index() callback, while also adding support for the
> few drivers that implement the deprecated target() callback.
> 
> Two notes are worthwhile here:
>  - In __target_index(), cpufreq_freq_transition_end() is called only for
>drivers that have synchronous notifications enabled. There is only one
>driver that disables them,
> 
>drivers/cpufreq/powernow-k8.c:1142: .flags = CPUFREQ_ASYNC_NOTIFICATION,
> 
>which is deprecated.

I don't think this is deprecated.

-- 
viresh


[PATCH] cpufreq: cached_resolved_idx can not be negative

2020-07-29 Thread Viresh Kumar
It is not possible for cached_resolved_idx to be invalid here as the
cpufreq core always sets index to a positive value.

Change its type to unsigned int and fix qcom usage a bit.

Signed-off-by: Viresh Kumar 
---
 drivers/cpufreq/qcom-cpufreq-hw.c | 5 +
 include/linux/cpufreq.h   | 2 +-
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c 
b/drivers/cpufreq/qcom-cpufreq-hw.c
index 0a04b6f03b9a..8c0842bd6c5a 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -66,13 +66,10 @@ static unsigned int qcom_cpufreq_hw_fast_switch(struct 
cpufreq_policy *policy,
unsigned int target_freq)
 {
void __iomem *perf_state_reg = policy->driver_data;
-   int index;
+   unsigned int index;
unsigned long freq;
 
index = policy->cached_resolved_idx;
-   if (index < 0)
-   return 0;
-
writel_relaxed(index, perf_state_reg);
 
freq = policy->freq_table[index].frequency;
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index e62b022cb07e..58687a5bf9c8 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -127,7 +127,7 @@ struct cpufreq_policy {
 
 /* Cached frequency lookup from cpufreq_driver_resolve_freq. */
unsigned int cached_target_freq;
-   int cached_resolved_idx;
+   unsigned int cached_resolved_idx;
 
/* Synchronization for frequency transitions */
booltransition_ongoing; /* Tracks transition status 
*/
-- 
2.14.1



Re: [PATCH v2 1/7] cpufreq: move invariance setter calls in cpufreq core

2020-07-29 Thread Viresh Kumar
On 27-07-20, 15:48, Rafael J. Wysocki wrote:
> On Wed, Jul 22, 2020 at 11:38 AM Ionela Voinescu
>  wrote:
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 036f4cc42ede..bac4101546db 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -2058,9 +2058,16 @@ EXPORT_SYMBOL(cpufreq_unregister_notifier);
> >  unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy,
> > unsigned int target_freq)
> >  {
> > +   unsigned int freq;
> > +
> > target_freq = clamp_val(target_freq, policy->min, policy->max);
> > +   freq = cpufreq_driver->fast_switch(policy, target_freq);
> > +
> > +   if (freq)
> > +   arch_set_freq_scale(policy->related_cpus, freq,
> > +   policy->cpuinfo.max_freq);
> 
> Why can't arch_set_freq_scale() handle freq == 0?

Actually there is no need to. AFAIU the freq returned by fast_switch
can never be 0 (yeah qcom driver does it right now and I am fixing
it). And so we can drop this check altogether.

-- 
viresh


Re: [greybus-dev] [PATCH][next] greybus: Use fallthrough pseudo-keyword

2020-07-29 Thread Viresh Kumar
On 28-07-20, 17:37, Alex Elder wrote:
> On 7/27/20 1:32 PM, Gustavo A. R. Silva wrote:
> > Replace the existing /* fall through */ comments and its variants with
> > the new pseudo-keyword macro fallthrough[1].
> > 
> > [1] 
> > https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
> 
> Thanks for the patch.  It looks good, but it raises
> another question I'd like discussion on.
> 
> It seems that Johan likes default (or final) cases in
> switch statements without a "break" statement.  Viresh
> and Bryan appear to be fond of this too.
> 
> It's pedantic, but I don't like that.  Am I wrong?
>   --> Johan/Viresh/Bryan would you please comment?

I am not fond of them as they aren't required for the working of the code. It is
a bit like using an empty return statement for a routine with void return type,
though it surely adds some consistency to the switch case.

But if people really feel it must be there, then its fine :)

-- 
viresh


Re: [PATCH V2 1/4] gpio: mxc: Support module build

2020-07-28 Thread Viresh Kumar
On 28-07-20, 09:59, Linus Walleij wrote:
> On Mon, Jul 27, 2020 at 12:44 PM Arnd Bergmann  wrote:
> > On Mon, Jul 27, 2020 at 10:18 AM Anson Huang  wrote:
> > drivers/pinctrl/spear/pinctrl-plgpio.c:static
> > SIMPLE_DEV_PM_OPS(plgpio_dev_pm_ops, plgpio_suspend, plgpio_resume);
> 
> This one is affected by the same problem, I don't know if anyone
> really has this hardware anymore, but there are some
> SPEAr products deployed so the users should be notified
> that they may need to move this to syscore ops.
> 
> Viresh?

Yeah, I can't test the patches but Ack them if someone can send them to me or if
someone can tell me exactly what to do here.

-- 
viresh


Re: [PATCH v4 4/5] arm64: dts: sdm845: Add OPP tables and power-domains for venus

2020-07-27 Thread Viresh Kumar
On 27-07-20, 17:38, Rajendra Nayak wrote:
> 
> On 7/27/2020 11:23 AM, Rajendra Nayak wrote:
> > 
> > 
> > On 7/24/2020 7:39 PM, Stanimir Varbanov wrote:
> > > Hi,
> > > 
> > > On 7/23/20 9:06 PM, Stanimir Varbanov wrote:
> > > > Hi Rajendra,
> > > > 
> > > > After applying 2,3 and 4/5 patches on linaro-integration v5.8-rc2 I see
> > > > below messages on db845:
> > > > 
> > > > qcom-venus aa0.video-codec: dev_pm_opp_set_rate: failed to find
> > > > current OPP for freq 53397 (-34)
> > > > 
> > > > ^^^ This one is new.
> > > > 
> > > > qcom_rpmh TCS Busy, retrying RPMH message send: addr=0x3
> > > > 
> > > > ^^^ and this message is annoying, can we make it pr_debug in rpmh?
> > > > 
> > > > On 7/23/20 2:26 PM, Rajendra Nayak wrote:
> > > > > Add the OPP tables in order to be able to vote on the performance 
> > > > > state of
> > > > > a power-domain.
> > > > > 
> > > > > Signed-off-by: Rajendra Nayak 
> > > > > ---
> > > > >   arch/arm64/boot/dts/qcom/sdm845.dtsi | 40 
> > > > > ++--
> > > > >   1 file changed, 38 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
> > > > > b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > > > index e506793..5ca2265 100644
> > > > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > > > @@ -3631,8 +3631,10 @@
> > > > >   interrupts = ;
> > > > >   power-domains = < VENUS_GDSC>,
> > > > >   < VCODEC0_GDSC>,
> > > > > -    < VCODEC1_GDSC>;
> > > > > -    power-domain-names = "venus", "vcodec0", "vcodec1";
> > > > > +    < VCODEC1_GDSC>,
> > > > > +    < SDM845_CX>;
> > > > > +    power-domain-names = "venus", "vcodec0", "vcodec1", "cx";
> > > > > +    operating-points-v2 = <_opp_table>;
> > > > >   clocks = < VIDEO_CC_VENUS_CTL_CORE_CLK>,
> > > > >    < VIDEO_CC_VENUS_AHB_CLK>,
> > > > >    < VIDEO_CC_VENUS_CTL_AXI_CLK>,
> > > > > @@ -3654,6 +3656,40 @@
> > > > >   video-core1 {
> > > > >   compatible = "venus-encoder";
> > > > >   };
> > > > > +
> > > > > +    venus_opp_table: venus-opp-table {
> > > > > +    compatible = "operating-points-v2";
> > > > > +
> > > > > +    opp-1 {
> > > > > +    opp-hz = /bits/ 64 <1>;
> > > > > +    required-opps = <_opp_min_svs>;
> > > > > +    };
> > > > > +
> > > > > +    opp-2 {
> > > > > +    opp-hz = /bits/ 64 <2>;
> > > > > +    required-opps = <_opp_low_svs>;
> > > > > +    };
> > > > > +
> > > > > +    opp-32000 {
> > > > > +    opp-hz = /bits/ 64 <32000>;
> > > > > +    required-opps = <_opp_svs>;
> > > > > +    };
> > > > > +
> > > > > +    opp-38000 {
> > > > > +    opp-hz = /bits/ 64 <38000>;
> > > > > +    required-opps = <_opp_svs_l1>;
> > > > > +    };
> > > > > +
> > > > > +    opp-44400 {
> > > > > +    opp-hz = /bits/ 64 <44400>;
> > > > > +    required-opps = <_opp_nom>;
> > > > > +    };
> > > > > +
> > > > > +    opp-53300 {
> > > > > +    opp-hz = /bits/ 64 <53300>;

Is this the highest OPP in table ?

> > > Actually it comes from videocc, where ftbl_video_cc_venus_clk_src
> > > defines 53300 but the real calculated freq is 53397.
> > 
> > I still don't quite understand why the videocc driver returns this
> > frequency despite this not being in the freq table.
> 
> Ok, so I see the same issue on sc7180 also. clk_round_rate() does seem to
> return whats in the freq table, but clk_set_rate() goes ahead and sets it
> to 53397. Subsequently when we try to set a different OPP, it fails to
> find the 'current' OPP entry for 53397. This sounds like an issue with 
> the OPP
> framework? Should we not fall back to the highest OPP as the current OPP?
> 
> Stephen/Viresh, any thoughts?

I think we (in all frameworks generally) try to set a frequency <=
target frequency and so there may be a problem if the frequency is
larger than highest supported. IOW, you need to fix tables a bit.

-- 
viresh


Re: [PATCH 2/2] thermal: cpufreq_cooling: Reuse effective_cpu_util()

2020-07-22 Thread Viresh Kumar
On 17-07-20, 11:43, Quentin Perret wrote:
> On Friday 17 Jul 2020 at 11:33:05 (+0100), Quentin Perret wrote:
> > On Friday 17 Jul 2020 at 11:14:38 (+0100), Quentin Perret wrote:
> > > On Tuesday 14 Jul 2020 at 12:06:53 (+0530), Viresh Kumar wrote:
> > > >  /**
> > > > - * get_load() - get load for a cpu since last updated
> > > > + * get_load() - get current load for a cpu
> > > >   * @cpufreq_cdev:   cpufreq_cooling_device for this cpu
> > > >   * @cpu:   cpu number
> > > > - * @cpu_idx:   index of the cpu in time_in_idle*
> > > > + * @cpu_idx:   index of the cpu
> > > >   *
> > > > - * Return: The average load of cpu @cpu in percentage since this
> > > > - * function was last called.
> > > > + * Return: The current load of cpu @cpu in percentage.
> > > >   */
> > > >  static u32 get_load(struct cpufreq_cooling_device *cpufreq_cdev, int 
> > > > cpu,
> > > > int cpu_idx)
> > > >  {
> > > > -   u32 load;
> > > > -   u64 now, now_idle, delta_time, delta_idle;
> > > > -   struct time_in_idle *idle_time = 
> > > > _cdev->idle_time[cpu_idx];
> > > > -
> > > > -   now_idle = get_cpu_idle_time(cpu, , 0);
> > > > -   delta_idle = now_idle - idle_time->time;
> > > > -   delta_time = now - idle_time->timestamp;
> > > > +   unsigned long util = cpu_util_cfs(cpu_rq(cpu));
> > > > +   unsigned long max = arch_scale_cpu_capacity(cpu);
> > > 
> > > Should we subtract the thermal PELT signal from max?

Makes sense to me, but I am not really good with this util and load
stuff and so would leave that to you guys to decide :)

-- 
viresh


Re: [PATCH 4.19 123/133] thermal/drivers/cpufreq_cooling: Fix wrong frequency converted from power

2020-07-22 Thread Viresh Kumar
On 22-07-20, 09:43, Pavel Machek wrote:
> OTOH the changelog is extremely confusing, because code would not work
> on the table presented there as an example.

Yeah, maybe I should have updated it too, just missed it completely :(

-- 
viresh


Re: [PATCH 4.19 123/133] thermal/drivers/cpufreq_cooling: Fix wrong frequency converted from power

2020-07-21 Thread Viresh Kumar
On 21-07-20, 13:43, Pavel Machek wrote:
> On Mon 2020-07-20 17:37:50, Greg Kroah-Hartman wrote:
> > From: Finley Xiao 
> > 
> > commit 371a3bc79c11b707d7a1b7a2c938dc3cc042fffb upstream.
> > 
> > The function cpu_power_to_freq is used to find a frequency and set the
> > cooling device to consume at most the power to be converted. For example,
> > if the power to be converted is 80mW, and the em table is as follow.
> > struct em_cap_state table[] = {
> > /* KHz mW */
> > { 1008000, 36, 0 },
> > { 120, 49, 0 },
> > { 1296000, 59, 0 },
> > { 1416000, 72, 0 },
> > { 1512000, 86, 0 },
> > };
> > The target frequency should be 1416000KHz, not 1512000KHz.
> > 
> > Fixes: 349d39dc5739 ("thermal: cpu_cooling: merge frequency and power 
> > tables")
> 
> Wow, this is completely different from the upstream patch.

Right, I have mentioned this in the patch I sent for stable.

https://lore.kernel.org/lkml/bc3978d0b7472c140e4d87f61138168a2a7b995c.1594194577.git.viresh.ku...@linaro.org/

> There the
> loops goes down, not up. The code does not match the changelog here.

Yes, the order is different in earlier kernels but I would say that
the changelog still matches as it doesn't necessarily talks about any
ordering here.

> > --- a/drivers/thermal/cpu_cooling.c
> > +++ b/drivers/thermal/cpu_cooling.c
> > @@ -278,11 +278,11 @@ static u32 cpu_power_to_freq(struct cpuf
> > int i;
> > struct freq_table *freq_table = cpufreq_cdev->freq_table;
> >  
> > -   for (i = 1; i <= cpufreq_cdev->max_level; i++)
> > -   if (power > freq_table[i].power)
> > +   for (i = 0; i < cpufreq_cdev->max_level; i++)
> > +   if (power >= freq_table[i].power)
> > break;
> >  
> > -   return freq_table[i - 1].frequency;
> > +   return freq_table[i].frequency;
> >  }
> 
> 
> Something is very wrong here, if table is sorted like described in the
> changelog, it will always break at i==0 or i==1... not working at all
> in the old or the new version.

As I understand from the other email you sent, this works fine now.
Right ?
-- 
viresh


Re: [PATCH v5 0/6] Add support for GPU DDR BW scaling

2020-07-21 Thread Viresh Kumar
On 21-07-20, 07:28, Rob Clark wrote:
> With your ack, I can add the patch the dev_pm_opp_set_bw patch to my
> tree and merge it via msm-next -> drm-next -> linus

I wanted to send it via my tree, but its okay. Pick this patch from
linux-next and add my Ack, I will drop it after that.

a8351c12c6c7 OPP: Add and export helper to set bandwidth

> Otherwise I can send a second later pull req that adds the final patch
> after has rebased to 5.9-rc1 (by which point the opp next tree will
> have presumably been merged

The PM stuff gets pushed fairly early and so I was asking you to
rebase just on my tree, so you could have sent the pull request right
after the PM tree landed there instead of waiting for rc1.

But its fine now.

-- 
viresh


Re: [PATCH v6 1/6] tty: serial: qcom_geni_serial: Use OPP API to set clk/perf state

2020-07-21 Thread Viresh Kumar
On 21-07-20, 01:43, Stephen Boyd wrote:
> It seems that dev_pm_opp_set_rate() calls _find_opp_table() and finds
> something that isn't an error pointer but then dev_pm_opp_of_add_table()
> returns an error value because there isn't an operating-points property
> in DT. We're getting saved because this driver also happens to call
> dev_pm_opp_set_clkname() which allocates the OPP table a second time
> (because the first time it got freed when dev_pm_opp_of_add_table()
> return -ENODEV because the property was missing).
> 
> Why do we need 'has_opp_table' logic? It seems that we have to keep
> track of the fact that dev_pm_opp_of_add_table() failed so that we don't
> put the table again, but then dev_pm_opp_set_clkname() can be called
> to allocate the table regardless.
> 
> This maintainer is paying very close attention

:)

> to super confusing code like
> this:
> 
>   if (drv->has_opp_table)
>   dev_pm_opp_of_remove_table(dev);
>   dev_pm_opp_put_clkname(drv->opp_table);
> 
> which reads as "if I have an opp table remove it and oh by the way
> remove the clk name for this opp table pointer I also happen to always
> have".
> 
> Maybe I would be happier if dev_pm_opp_of_table() went away and we just
> had dev_pm_opp_add_table(const struct opp_config *config) that did all
> the things for us like set a clk name, set the supported hw, set the
> prop name, etc. based on the single config struct pointer and also
> parsed out the OPP table from DT or just ignored that if there isn't any
> operating-points property. Then the caller wouldn't need to keep track
> of 'if has_opp_table' because it doesn't seem to actually care and the
> core is happy to allocate a table for the device anyway so long as it
> sets a clk name.

The config style wouldn't work as well as we don't really want to
allocate an OPP table if the property isn't found in DT.

All the mess is coming from the fact that I wanted to make it easy for
the generic drivers to have code which can do opp-set-rate or
clk-set-rate depending on how the platform is configured. While the
intention was fine, the end result is still not great as you figured
out.

Because we need to keep a flag to make the right decision anyway, I
wonder if doing this is the best solution we have at hand.

if (opp-table-present)
opp_set_rate();
else
clk_set_rate();

Or maybe stop printing errors from dev_pm_opp_of_remove_table() if the
OPP table isn't found. And so we can get rid of the flag.

-- 
viresh


Re: opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it is disabled

2020-07-21 Thread Viresh Kumar
On 21-07-20, 01:15, Stephen Boyd wrote:
> Quoting Andrew-sh.Cheng (2020-07-20 01:55:26)
> > From: "Andrew-sh.Cheng" 
> > 
> > Modify dev_pm_opp_get_freq() to return freqeuncy
> > even this opp item is not available.
> > So that we can get the information of disable opp items.
> > 
> > Signed-off-by: Andrew-sh.Cheng 
> > ---
> >  drivers/opp/core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> > index eed42d6b2e6b..5213e0462382 100644
> > --- a/drivers/opp/core.c
> > +++ b/drivers/opp/core.c
> > @@ -118,7 +118,7 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
> >   */
> >  unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
> >  {
> > -   if (IS_ERR_OR_NULL(opp) || !opp->available) {
> > +   if (IS_ERR_OR_NULL(opp)) {
> 
> I wonder why we even have this check. Seems like the caller deserves an
> oops in this case instead of a small pr_err().

I think the reason is same as to why multiple subsystems do similar checks.
While many of them don't do anything if they get a NULL pointer and simply
return, which is fine to support cases where NULL is passed.

But I do agree that maybe we may want to make sure opp-table or opp pointers
passed are all valid all the time and so just remove these checks and let them
crash.

-- 
viresh


Re: [PATCH v5 0/6] Add support for GPU DDR BW scaling

2020-07-20 Thread Viresh Kumar
On 20-07-20, 08:03, Rob Clark wrote:
> On Mon, Jul 20, 2020 at 3:01 AM Viresh Kumar  wrote:
> >
> > On 15-07-20, 08:36, Rob Clark wrote:
> > > I can take the first two into msm-next, the 3rd will need to wait
> > > until dev_pm_opp_set_bw() lands
> >
> > You can base that on a8351c12c6c7 in linux-next, I will make sure not to 
> > rebase
> > it anymore.

This was 5.8-rc1 + 2 patches for OPP. That's all.

> >
> 
> I can't really base on something newer than drm-next

But you need the OPP dependency, isn't it ?

-- 
viresh


Re: [PATCH v2] dt-bindings: thermal: Get rid of thermal.txt and replace references

2020-07-20 Thread Viresh Kumar
On 20-07-20, 17:23, Amit Kucheria wrote:
> Now that we have yaml bindings for the thermal subsystem, get rid of the
> old bindings (thermal.txt).
> 
> Replace all references to thermal.txt in the Documentation with a link
> to the appropriate YAML bindings using the following search and replace
> pattern:
>  - If the reference is specific to the thermal-sensor-cells property,
>  replace with a pointer to thermal-sensor.yaml
>  - If the reference is to the cooling-cells property, replace with a
>  pointer to thermal-cooling-devices.yaml
>  - If the reference is generic thermal bindings, replace with a
>  reference to thermal*.yaml.
> 
> Signed-off-by: Amit Kucheria 
> Reviewed-by: Rob Herring 
> 
> ---
> Changes since v1:
>  - Rebase onto v.5.8-rc6 to make it apply again
>  - Fix cpufreq/nvidia,tegra20-cpufreq.txt
>  - Fix bindings/arm/freescale/fsl,scu.txt
> 
> 
>  .../bindings/cpufreq/cpufreq-dt.txt   |   3 +-
>  .../bindings/cpufreq/cpufreq-mediatek.txt |   4 +-
>  .../cpufreq/nvidia,tegra20-cpufreq.txt|   2 +-

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH v5 0/6] Add support for GPU DDR BW scaling

2020-07-20 Thread Viresh Kumar
On 15-07-20, 08:36, Rob Clark wrote:
> I can take the first two into msm-next, the 3rd will need to wait
> until dev_pm_opp_set_bw() lands

You can base that on a8351c12c6c7 in linux-next, I will make sure not to rebase
it anymore.

-- 
viresh


Re: opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it is disabled

2020-07-20 Thread Viresh Kumar
On 20-07-20, 16:55, Andrew-sh.Cheng wrote:
> From: "Andrew-sh.Cheng" 
> 
> Modify dev_pm_opp_get_freq() to return freqeuncy
> even this opp item is not available.
> So that we can get the information of disable opp items.
> 
> Signed-off-by: Andrew-sh.Cheng 
> ---
>  drivers/opp/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index eed42d6b2e6b..5213e0462382 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -118,7 +118,7 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
>   */
>  unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
>  {
> - if (IS_ERR_OR_NULL(opp) || !opp->available) {
> + if (IS_ERR_OR_NULL(opp)) {
>   pr_err("%s: Invalid parameters\n", __func__);
>   return 0;
>   }

Applied. Thanks.

-- 
viresh


Re: [TEGRA194_CPUFREQ PATCH v6 2/3] arm64: tegra: Add t194 ccplex compatible and bpmp property

2020-07-18 Thread Viresh Kumar
On 16-07-20, 14:37, Thierry Reding wrote:
> On Wed, Jul 15, 2020 at 07:01:24PM +0530, Sumit Gupta wrote:
> > On Tegra194, data on valid operating points for the CPUs needs to be
> > queried from BPMP. In T194, there is no node representing CPU complex.
> > So, add compatible string to the 'cpus' node instead of using dummy
> > node to bind cpufreq driver. Also, add reference to the BPMP instance
> > for the CPU complex.
> > 
> > Signed-off-by: Sumit Gupta 
> > ---
> >  arch/arm64/boot/dts/nvidia/tegra194.dtsi | 2 ++
> >  1 file changed, 2 insertions(+)
> 
> Looks like the DT bindings are now done so I've applied this for v5.9.
> 
> Viresh, are you going to pick up the other patches, or do you want me
> to pick them up and send you a pull request?

Applied the other two patches, Thanks.

-- 
viresh


Re: [TEGRA194_CPUFREQ PATCH v6 3/3] cpufreq: Add Tegra194 cpufreq driver

2020-07-15 Thread Viresh Kumar
On 15-07-20, 20:57, Sumit Gupta wrote:
> Sorry, missed to remove this. Will wait if any other comments before
> re-spin.

I don't have any further comments, maybe just send a new version of
this patch alone and name it v6.1.

-- 
viresh


Re: [PATCH] opp: Increase parsed_static_opps on _of_add_opp_table_v1

2020-07-15 Thread Viresh Kumar
On 15-07-20, 23:54, Walter Lozano wrote:
> Currently, when using _of_add_opp_table_v2 parsed_static_opps is
> increased and this value is used on _opp_remove_all_static to
> check if there are static opps entries that need to be freed.
> Unfortunately this does not happens when using _of_add_opp_table_v1,
> which leads to warnings.
> 
> This patch increases parsed_static_opps on _of_add_opp_table_v1 in a
> similar way as in _of_add_opp_table_v2.
> 
> Signed-off-by: Walter Lozano 
> ---
> 
>  drivers/opp/of.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> index 9a5873591a40..b2bc82bf8b42 100644
> --- a/drivers/opp/of.c
> +++ b/drivers/opp/of.c
> @@ -917,6 +917,8 @@ static int _of_add_opp_table_v1(struct device *dev, 
> struct opp_table *opp_table)
>   nr -= 2;
>   }
>  
> + opp_table->parsed_static_opps++;
> +
>   return ret;
>  }

Merged with this and added relevant Fixes and stable tags.

diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index b2bc82bf8b42..314f306140a1 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -902,6 +902,10 @@ static int _of_add_opp_table_v1(struct device *dev, struct 
opp_table *opp_table)
return -EINVAL;
}
 
+   mutex_lock(_table->lock);
+   opp_table->parsed_static_opps = 1;
+   mutex_unlock(_table->lock);
+
val = prop->value;
while (nr) {
unsigned long freq = be32_to_cpup(val++) * 1000;
@@ -917,8 +921,6 @@ static int _of_add_opp_table_v1(struct device *dev, struct 
opp_table *opp_table)
nr -= 2;
}
 
-   opp_table->parsed_static_opps++;
-
return ret;
 }
 
-- 
viresh


Re: [TEGRA194_CPUFREQ PATCH v5 3/4] cpufreq: Add Tegra194 cpufreq driver

2020-07-15 Thread Viresh Kumar
On 13-07-20, 19:36, Sumit Gupta wrote:
> Add support for CPU frequency scaling on Tegra194. The frequency
> of each core can be adjusted by writing a clock divisor value to
> a MSR on the core. The range of valid divisors is queried from
> the BPMP.
> 
> Signed-off-by: Mikko Perttunen 
> Signed-off-by: Sumit Gupta 
> ---
>  drivers/cpufreq/Kconfig.arm|   6 +
>  drivers/cpufreq/Makefile   |   1 +
>  drivers/cpufreq/tegra194-cpufreq.c | 397 
> +
>  3 files changed, 404 insertions(+)
>  create mode 100644 drivers/cpufreq/tegra194-cpufreq.c
> 
> diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
> index 15c1a12..f3d8f09 100644
> --- a/drivers/cpufreq/Kconfig.arm
> +++ b/drivers/cpufreq/Kconfig.arm
> @@ -314,6 +314,12 @@ config ARM_TEGRA186_CPUFREQ
>   help
> This adds the CPUFreq driver support for Tegra186 SOCs.
>  
> +config ARM_TEGRA194_CPUFREQ
> + tristate "Tegra194 CPUFreq support"
> + depends on ARCH_TEGRA && TEGRA_BPMP

Shouldn't this depend on ARCH_TEGRA_194_SOC instead ? And I asked you
to add a default y here itself instead of patch 4/4.

> + help
> +   This adds CPU frequency driver support for Tegra194 SOCs.
> +
>  config ARM_TI_CPUFREQ
>   bool "Texas Instruments CPUFreq support"
>   depends on ARCH_OMAP2PLUS
> diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
> index f6670c4..66b5563 100644
> --- a/drivers/cpufreq/Makefile
> +++ b/drivers/cpufreq/Makefile
> @@ -83,6 +83,7 @@ obj-$(CONFIG_ARM_TANGO_CPUFREQ) += 
> tango-cpufreq.o
>  obj-$(CONFIG_ARM_TEGRA20_CPUFREQ)+= tegra20-cpufreq.o
>  obj-$(CONFIG_ARM_TEGRA124_CPUFREQ)   += tegra124-cpufreq.o
>  obj-$(CONFIG_ARM_TEGRA186_CPUFREQ)   += tegra186-cpufreq.o
> +obj-$(CONFIG_ARM_TEGRA194_CPUFREQ)   += tegra194-cpufreq.o
>  obj-$(CONFIG_ARM_TI_CPUFREQ) += ti-cpufreq.o
>  obj-$(CONFIG_ARM_VEXPRESS_SPC_CPUFREQ)   += vexpress-spc-cpufreq.o
>  
> diff --git a/drivers/cpufreq/tegra194-cpufreq.c 
> b/drivers/cpufreq/tegra194-cpufreq.c
> +static struct cpufreq_frequency_table *
> +init_freq_table(struct platform_device *pdev, struct tegra_bpmp *bpmp,
> + unsigned int cluster_id)
> +{
> + struct cpufreq_frequency_table *freq_table;
> + struct mrq_cpu_ndiv_limits_response resp;
> + unsigned int num_freqs, ndiv, delta_ndiv;
> + struct mrq_cpu_ndiv_limits_request req;
> + struct tegra_bpmp_message msg;
> + u16 freq_table_step_size;
> + int err, index;
> +
> + memset(, 0, sizeof(req));
> + req.cluster_id = cluster_id;
> +
> + memset(, 0, sizeof(msg));
> + msg.mrq = MRQ_CPU_NDIV_LIMITS;
> + msg.tx.data = 
> + msg.tx.size = sizeof(req);
> + msg.rx.data = 
> + msg.rx.size = sizeof(resp);
> +
> + err = tegra_bpmp_transfer(bpmp, );
> + if (err)
> + return ERR_PTR(err);
> +
> + /*
> +  * Make sure frequency table step is a multiple of mdiv to match
> +  * vhint table granularity.
> +  */
> + freq_table_step_size = resp.mdiv *
> + DIV_ROUND_UP(CPUFREQ_TBL_STEP_HZ, resp.ref_clk_hz);
> +
> + dev_dbg(>dev, "cluster %d: frequency table step size: %d\n",
> + cluster_id, freq_table_step_size);
> +
> + delta_ndiv = resp.ndiv_max - resp.ndiv_min;
> +
> + if (unlikely(delta_ndiv == 0))
> + num_freqs = 1;
> + else
> + /* We store both ndiv_min and ndiv_max hence the +1 */
> + num_freqs = delta_ndiv / freq_table_step_size + 1;

You need {} in the if else blocks here because of the comment here.

> +
> + num_freqs += (delta_ndiv % freq_table_step_size) ? 1 : 0;
> +
> + freq_table = devm_kcalloc(>dev, num_freqs + 1,
> +   sizeof(*freq_table), GFP_KERNEL);
> + if (!freq_table)
> + return ERR_PTR(-ENOMEM);
> +
> + for (index = 0, ndiv = resp.ndiv_min;
> + ndiv < resp.ndiv_max;
> + index++, ndiv += freq_table_step_size) {
> + freq_table[index].driver_data = ndiv;
> + freq_table[index].frequency = map_ndiv_to_freq(, ndiv);
> + }
> +
> + freq_table[index].driver_data = resp.ndiv_max;
> + freq_table[index++].frequency = map_ndiv_to_freq(, resp.ndiv_max);
> + freq_table[index].frequency = CPUFREQ_TABLE_END;
> +
> + return freq_table;
> +}
> +
> +static int tegra194_cpufreq_probe(struct platform_device *pdev)
> +{
> + struct tegra194_cpufreq_data *data;
> + struct tegra_bpmp *bpmp;
> + int err, i;
> +
> + data = devm_kzalloc(>dev, sizeof(*data), GFP_KERNEL);
> + if (!data)
> + return -ENOMEM;
> +
> + data->num_clusters = MAX_CLUSTERS;
> + data->tables = devm_kcalloc(>dev, data->num_clusters,
> + sizeof(*data->tables), GFP_KERNEL);
> + if (!data->tables)
> + return -ENOMEM;
> +
> + platform_set_drvdata(pdev, data);
> +
> + 

Re: [PATCH v2 00/13] Rid W=1 warnings in CPUFreq

2020-07-15 Thread Viresh Kumar
On 15-07-20, 09:26, Lee Jones wrote:
> This set is part of a larger effort attempting to clean-up W=1
> kernel builds, which are currently overwhelmingly riddled with
> niggly little warnings.
> 
> After these patches are applied, the build system no longer
> complains about any W=0 nor W=1 level warnings in drivers/cpufreq.
> 
> Hurrah!
> 
> Changelog
> 
> v1 => v2:
>  - Collect *-bys
>  - Use __maybe_unused instead of removing device IDs
>  - Use __always_unused instead of using unused variables
>  - Include architecture header instead of creating new include file

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH v2 04/13] cpufreq: sti-cpufreq: Fix some formatting and misspelling issues

2020-07-15 Thread Viresh Kumar
On 15-07-20, 09:26, Lee Jones wrote:
> Kerneldoc format for attribute descriptions should be '@.*: '.
> 
> Fixes the following W=1 kernel build warning(s):
> 
>  drivers/cpufreq/sti-cpufreq.c:49: warning: cannot understand function 
> prototype: 'struct sti_cpufreq_ddata '
> 
> Cc: Patrice Chotard 
> Cc: Pal Singh 
> Signed-off-by: Lee Jones 
> ---
>  drivers/cpufreq/sti-cpufreq.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cpufreq/sti-cpufreq.c b/drivers/cpufreq/sti-cpufreq.c
> index 8f16bbb164b84..a5ad96d29adca 100644
> --- a/drivers/cpufreq/sti-cpufreq.c
> +++ b/drivers/cpufreq/sti-cpufreq.c
> @@ -40,11 +40,11 @@ enum {
>  };
>  
>  /**
> - * ST CPUFreq Driver Data
> + * struct sti_cpufreq_ddata - ST CPUFreq Driver Data
>   *
> - * @cpu_node CPU's OF node
> - * @syscfg_eng   Engineering Syscon register map
> - * @regmap   Syscon register map
> + * @cpu: CPU's OF node
> + * @syscfg_eng:  Engineering Syscon register map
> + * @syscfg:  Syscon register map
>   */
>  static struct sti_cpufreq_ddata {
>   struct device *cpu;

I already applied the one from V1 earlier this morning.

-- 
viresh


Re: [PATCH v2 06/13] cpufreq: powernv-cpufreq: Functions only used in call-backs should be static

2020-07-15 Thread Viresh Kumar
On 15-07-20, 09:26, Lee Jones wrote:
> Fixes the following W=1 kernel build warning(s):
> 
>  drivers/cpufreq/powernv-cpufreq.c:669:6: warning: no previous prototype for 
> ‘gpstate_timer_handler’ [-Wmissing-prototypes]
>  drivers/cpufreq/powernv-cpufreq.c:902:6: warning: no previous prototype for 
> ‘powernv_cpufreq_work_fn’ [-Wmissing-prototypes]
> 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: linuxppc-...@lists.ozlabs.org
> Signed-off-by: Lee Jones 
> Acked-by: Viresh Kumar 
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c 
> b/drivers/cpufreq/powernv-cpufreq.c
> index 8646eb197cd96..068cc53abe320 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -666,7 +666,7 @@ static inline void  queue_gpstate_timer(struct 
> global_pstate_info *gpstates)
>   * according quadratic equation. Queues a new timer if it is still not equal
>   * to local pstate
>   */
> -void gpstate_timer_handler(struct timer_list *t)
> +static void gpstate_timer_handler(struct timer_list *t)
>  {
>   struct global_pstate_info *gpstates = from_timer(gpstates, t, timer);
>   struct cpufreq_policy *policy = gpstates->policy;
> @@ -899,7 +899,7 @@ static struct notifier_block powernv_cpufreq_reboot_nb = {
>   .notifier_call = powernv_cpufreq_reboot_notifier,
>  };
>  
> -void powernv_cpufreq_work_fn(struct work_struct *work)
> +static void powernv_cpufreq_work_fn(struct work_struct *work)
>  {
>   struct chip *chip = container_of(work, struct chip, throttle);
>   struct cpufreq_policy *policy;

Don't you want to drop this patch now ? As you already reviewed the
other one on the list ?

-- 
viresh


Re: [PATCH 03/13] cpufreq: cpufreq_governor: Demote store_sampling_rate() header to standard comment block

2020-07-15 Thread Viresh Kumar
On 15-07-20, 08:31, Lee Jones wrote:
> I'm not sure what you mean.  Kerneldoc headers are designed to be
> extracted and converted into mediums which are easy to read/browse.
> For example, see the online documentation for 'debug_object_init':
> 
>  
> https://www.kernel.org/doc/html/latest/core-api/debug-objects.html?highlight=debug_object_init#c.debug_object_init
> 
> They are generally meant to be referenced/consumed.  There is even a
> script provided inside the kernel to find offending instances where
> kerneldoc headers are provided, but not *yet* referenced:
> 
>  `scripts/find-unused-docs.sh`
> 
> HINT: There are many.
> 
> There *could* be and argument to use kerneldoc *just* so you can use
> the kerneldoc checker `scripts/kernel-doc` (which is invoked by W=1
> builds), in order to ensure the parameter descriptions are kept in
> check.
> 
> However, in this case, there are no descriptions provided.  So, in
> reference to my previous question, what are your reasons for wanting
> to keep kerneldoc here?

I think the code did the right thing by keeping them as kernel doc
type comments. What we missed then is getting them used in the *.rst
documentation.

A simple way of doing that could be just adding this to the cpu-freq
rst file, like:

-8<-
Here are the bits from the in-source documentation:

.. kernel-doc:: include/linux/cpufreq.h
.. kernel-doc:: drivers/cpufreq/cpufreq.c
.. kernel-doc:: drivers/cpufreq/freq_table.c
.. kernel-doc:: drivers/cpufreq/cpufreq_governor.c
-8<-

This will make the script stop complaining about these. But the layout
of things wont' be very nice right now.

-- 
viresh


Re: [PATCH 2/2] thermal: cpufreq_cooling: Reuse effective_cpu_util()

2020-07-15 Thread Viresh Kumar
On 14-07-20, 15:05, Rafael J. Wysocki wrote:
> On Tue, Jul 14, 2020 at 8:37 AM Viresh Kumar  wrote:
> >  static u32 get_load(struct cpufreq_cooling_device *cpufreq_cdev, int cpu,
> > int cpu_idx)
> >  {
> > -   u32 load;
> > -   u64 now, now_idle, delta_time, delta_idle;
> > -   struct time_in_idle *idle_time = _cdev->idle_time[cpu_idx];
> > -
> > -   now_idle = get_cpu_idle_time(cpu, , 0);
> > -   delta_idle = now_idle - idle_time->time;
> > -   delta_time = now - idle_time->timestamp;
> > +   unsigned long util = cpu_util_cfs(cpu_rq(cpu));
> > +   unsigned long max = arch_scale_cpu_capacity(cpu);
> >
> > -   if (delta_time <= delta_idle)
> > -   load = 0;
> > -   else
> > -   load = div64_u64(100 * (delta_time - delta_idle), 
> > delta_time);
> > -
> > -   idle_time->time = now_idle;
> > -   idle_time->timestamp = now;
> > -
> > -   return load;
> > +   util = effective_cpu_util(cpu, util, max, ENERGY_UTIL, NULL);
> 
> Hmm.
> 
> It doesn't look like cpufreq_cdev and cpu_idx are needed any more in
> this function, so maybe drop them from the arg list?

Right.

> And then there
> won't be anything specific to CPU cooling in this function, so maybe
> move it to sched and export it from there properly?

There isn't a lot happening in this routine right now TBH and am not
sure if it is really worth it to have a separate routine for this
(unless we can get rid of something for all the callers, like avoiding
a call to arch_scale_cpu_capacity() and then naming it
effective_cpu_load().

> Also it looks like max could be passed to it along with the CPU number
> instead of being always taken as arch_scale_cpu_capacity(cpu).

I am not sure what you are suggesting here. What will be the value of
max if not arch_scale_cpu_capacity() ?

-- 
viresh


<    5   6   7   8   9   10   11   12   13   14   >