Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-28 Thread Mark Brown

On Wed, Apr 27, 2011 at 01:48:52PM -0700, Colin Cross wrote:

> OPP currently has opp_enable and opp_disable functions.  I don't
> understand why these are needed, they are only used at init time to
> determine available voltages, which could be handled by never passing
> unavailable voltages to the dvfs implementation.

I queried this when OPP was originally added.  The motivation which was
given (which seemed fairly reasonable) was to reduce the number of data
tables for similar parts and board designs.  That did seem like
something which it was reasonable to factor out in some way, though
possibly with a different mechanism.
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-28 Thread MyungJoo Ham

On Thu, Apr 28, 2011 at 4:06 PM, Colin Cross wrote:
> On Wed, Apr 27, 2011 at 11:50 PM, MyungJoo Ham wrote:
>> On Thu, Apr 28, 2011 at 3:44 PM, Colin Cross wrote:
>>> On Wed, Apr 27, 2011 at 11:12 PM, MyungJoo Ham wrote:
>>>> On Thu, Apr 28, 2011 at 5:48 AM, Colin Cross wrote:
>>>>> OPP currently has opp_enable and opp_disable functions.  I don't
>>>>> understand why these are needed, they are only used at init time to
>>>>> determine available voltages, which could be handled by never passing
>>>>> unavailable voltages to the dvfs implementation.
>>>>
>>>> We need them in runtime.
>>>>
>>>> A device "a" may want to guarantee that a device "b" to be at least
>>>> "200MHz" or faster while it does some operations. Then, "a" will
>>>> opp_disable("b", 100MHz and others); and opp_enable("b", them) later
>>>> on. We have similar issues with multimedia blocks (MFC, Camera, FB,
>>>> GPU) and CPU/Memory Bus. Ondemand governor of CPUFREQ has some delay
>>>> on catching up a workload (1.5x the sampling rate in average, <2.0x
>>>> the sampling rate in worst cases), which may incur flickering/tearing
>>>> issues with multimedia streams. On the other hand, a general thermal
>>>> monitor or battery manager might want to limit energy usage by
>>>> disabling top performance clocks if it is too hot or the battery level
>>>> is low.
>>>
>>> That sounds like a very strange api, when what you really mean is
>>> clk_set_min_rate or clk_set_max_rate.
>>
>> Essentially, that's what needed.
>> However, with clk_set_min/max_rate, don't we need to let another
>> device to be consumer of other devices' clocks? Not just introducing a
>> device to other devices?
>
> Yes, but that's effectively what you're doing through a backwards api
> anyways.  The question is, for these complicated clock scenarios where
> the final frequency of a clock depends on so many factors, should that
> control go through the clock framework, or through some sort of global
> clock governor (which is where OPP would reappear).
>

In the use cases of runtime clock setting by devfreq or other devices
mentioned above, we are controlling the device's performance with the
representative clock of the device, not a specific clock among the
clocks that the device has. For a device "A" with clock "a1" and "a2",
another device "B" would not control both "a1" and "a2" directly to
get the guaranteed performance from "A". Besides, "B" should not do so
if there are specific orders, delays, and other controls for "A" to
properly change performance.

Therefore, my answer is that control should go through some wrapper or
interface attached to the device that owns the clocks (letting the
device's own callback control its clocks), not through the clock
framework directly. In this version of devfreq+OPP, this is handled by
the "target" callback.
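
To illustrate the point about going through the device rather than its
individual clocks, here is a minimal userspace sketch. All names
(struct dev_a, struct freq_profile, request_performance) are
hypothetical and are not the devfreq API; the rule that a2 runs at half
of a1 and the resulting write order are assumptions made only for the
example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device "A" with two clocks; a1 is the representative clock. */
struct dev_a {
	unsigned long a1_hz;
	unsigned long a2_hz;
};

/* Hypothetical per-device profile: callers only ever see one frequency. */
struct freq_profile {
	void *dev;
	int (*target)(void *dev, unsigned long freq_hz);
};

/*
 * Device A's own callback. Assumption for illustration: a2 must never
 * exceed half of a1, so the order of the two writes depends on the
 * direction of the change.
 */
static int dev_a_target(void *data, unsigned long freq_hz)
{
	struct dev_a *a = data;

	if (freq_hz == 0)
		return -1;
	if (freq_hz > a->a1_hz) {
		a->a1_hz = freq_hz;      /* raise a1 first ... */
		a->a2_hz = freq_hz / 2;  /* ... then let a2 follow */
	} else {
		a->a2_hz = freq_hz / 2;  /* lower a2 first ... */
		a->a1_hz = freq_hz;      /* ... then a1 */
	}
	return 0;
}

/* What device "B" (or a governor) calls; it never touches a1 or a2. */
static int request_performance(struct freq_profile *p, unsigned long freq_hz)
{
	assert(p != NULL && p->target != NULL);
	return p->target(p->dev, freq_hz);
}
```

The caller only knows the representative frequency; the ordering rule
stays private to the device that owns the clocks.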


Cheers!
- MyungJoo
-- 
MyungJoo Ham, Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-28 Thread MyungJoo Ham

On Thu, Apr 28, 2011 at 3:43 PM, Colin Cross  wrote:
> I understand the need for some sort of governor that can use device
> state to determine the necessary clock frequencies.  Where I disagree
> is the connection to voltages.  The governor should ONLY determine the
> frequencies desired, and the voltage required to meet those
> frequencies should be determined by the clock framework, based only on
> the clock and the frequency.

Yes, as long as AVS (Adaptive Voltage Scaling) is not involved, devfreq
does not need to care about voltages and can let the device driver (such
as the target callback or its callee) take care of them. Besides, my
impression of AVS, at least from some AVS tests on S5PC110, is that it
does not depend on the software DVFS scheme. So I'd say it's safe to let
the devfreq framework handle frequency only and let the target callback
handle everything else apart from choosing the representative clock
frequency.

However, if we are going to detach devfreq from OPP, we only need to
provide the frequency list at init and either an interface to control
the max/min frequency or an interface to look up the max/min frequency
of the corresponding representative clock.

> ___
> linux-pm mailing list
> linux...@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
>

ps. In our AVS test, the device drivers had nothing to do with voltage
scaling except for initializing devices. The H/W did everything about
voltage scaling dynamically.

Thanks,

MyungJoo.
-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-28 Thread Colin Cross

On Wed, Apr 27, 2011 at 11:50 PM, MyungJoo Ham wrote:
> On Thu, Apr 28, 2011 at 3:44 PM, Colin Cross wrote:
>> On Wed, Apr 27, 2011 at 11:12 PM, MyungJoo Ham wrote:
>>> On Thu, Apr 28, 2011 at 5:48 AM, Colin Cross wrote:
>>>> OPP currently has opp_enable and opp_disable functions.  I don't
>>>> understand why these are needed, they are only used at init time to
>>>> determine available voltages, which could be handled by never passing
>>>> unavailable voltages to the dvfs implementation.
>>>
>>> We need them in runtime.
>>>
>>> A device "a" may want to guarantee that a device "b" to be at least
>>> "200MHz" or faster while it does some operations. Then, "a" will
>>> opp_disable("b", 100MHz and others); and opp_enable("b", them) later
>>> on. We have similar issues with multimedia blocks (MFC, Camera, FB,
>>> GPU) and CPU/Memory Bus. Ondemand governor of CPUFREQ has some delay
>>> on catching up a workload (1.5x the sampling rate in average, <2.0x
>>> the sampling rate in worst cases), which may incur flickering/tearing
>>> issues with multimedia streams. On the other hand, a general thermal
>>> monitor or battery manager might want to limit energy usage by
>>> disabling top performance clocks if it is too hot or the battery level
>>> is low.
>>
>> That sounds like a very strange api, when what you really mean is
>> clk_set_min_rate or clk_set_max_rate.
>
> Essentially, that's what needed.
> However, with clk_set_min/max_rate, don't we need to let another
> device to be consumer of other devices' clocks? Not just introducing a
> device to other devices?

Yes, but that's effectively what you're doing through a backwards api
anyways.  The question is, for these complicated clock scenarios where
the final frequency of a clock depends on so many factors, should that
control go through the clock framework, or through some sort of global
clock governor (which is where OPP would reappear).
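
As a rough illustration of what a min/max rate interface could look
like, here is a userspace sketch. The functions model_set_min_rate,
model_set_max_rate and model_resolve_rate are hypothetical names, not
clock framework calls, and the rule that the ceiling wins over the
floor is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CONSUMERS 4

/* Hypothetical per-clock constraint store; 0 means "no constraint". */
struct rate_limits {
	unsigned long min_hz[MAX_CONSUMERS];
	unsigned long max_hz[MAX_CONSUMERS];
};

static void model_set_min_rate(struct rate_limits *l, size_t consumer,
			       unsigned long hz)
{
	assert(l != NULL && consumer < MAX_CONSUMERS);
	l->min_hz[consumer] = hz;
}

static void model_set_max_rate(struct rate_limits *l, size_t consumer,
			       unsigned long hz)
{
	assert(l != NULL && consumer < MAX_CONSUMERS);
	l->max_hz[consumer] = hz;
}

/* Apply the highest floor and the lowest ceiling; the ceiling wins. */
static unsigned long model_resolve_rate(const struct rate_limits *l,
					unsigned long requested_hz)
{
	unsigned long lo = 0, hi = 0, rate = requested_hz;

	assert(l != NULL);
	for (size_t i = 0; i < MAX_CONSUMERS; i++) {
		if (l->min_hz[i] > lo)
			lo = l->min_hz[i];
		if (l->max_hz[i] != 0 && (hi == 0 || l->max_hz[i] < hi))
			hi = l->max_hz[i];
	}
	if (rate < lo)
		rate = lo;
	if (hi != 0 && rate > hi)
		rate = hi;
	return rate;
}
```

A multimedia block would set a floor while it streams and clear it
afterwards; a thermal monitor would set a ceiling.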

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread MyungJoo Ham

On Thu, Apr 28, 2011 at 3:44 PM, Colin Cross wrote:
> On Wed, Apr 27, 2011 at 11:12 PM, MyungJoo Ham wrote:
>> On Thu, Apr 28, 2011 at 5:48 AM, Colin Cross wrote:
>>> OPP currently has opp_enable and opp_disable functions.  I don't
>>> understand why these are needed, they are only used at init time to
>>> determine available voltages, which could be handled by never passing
>>> unavailable voltages to the dvfs implementation.
>>
>> We need them in runtime.
>>
>> A device "a" may want to guarantee that a device "b" to be at least
>> "200MHz" or faster while it does some operations. Then, "a" will
>> opp_disable("b", 100MHz and others); and opp_enable("b", them) later
>> on. We have similar issues with multimedia blocks (MFC, Camera, FB,
>> GPU) and CPU/Memory Bus. Ondemand governor of CPUFREQ has some delay
>> on catching up a workload (1.5x the sampling rate in average, <2.0x
>> the sampling rate in worst cases), which may incur flickering/tearing
>> issues with multimedia streams. On the other hand, a general thermal
>> monitor or battery manager might want to limit energy usage by
>> disabling top performance clocks if it is too hot or the battery level
>> is low.
>
> That sounds like a very strange api, when what you really mean is
> clk_set_min_rate or clk_set_max_rate.

Essentially, that's what's needed.
However, with clk_set_min/max_rate, don't we need to make one device a
consumer of other devices' clocks, rather than just introducing one
device to another?




-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread Colin Cross

On Wed, Apr 27, 2011 at 11:12 PM, MyungJoo Ham  wrote:
> On Thu, Apr 28, 2011 at 5:48 AM, Colin Cross  wrote:
>> OPP currently has opp_enable and opp_disable functions.  I don't
>> understand why these are needed, they are only used at init time to
>> determine available voltages, which could be handled by never passing
>> unavailable voltages to the dvfs implementation.
>
> We need them in runtime.
>
> A device "a" may want to guarantee that a device "b" to be at least
> "200MHz" or faster while it does some operations. Then, "a" will
> opp_disable("b", 100MHz and others); and opp_enable("b", them) later
> on. We have similar issues with multimedia blocks (MFC, Camera, FB,
> GPU) and CPU/Memory Bus. Ondemand governor of CPUFREQ has some delay
> on catching up a workload (1.5x the sampling rate in average, <2.0x
> the sampling rate in worst cases), which may incur flickering/tearing
> issues with multimedia streams. On the other hand, a general thermal
> monitor or battery manager might want to limit energy usage by
> disabling top performance clocks if it is too hot or the battery level
> is low.

That sounds like a very strange api, when what you really mean is
clk_set_min_rate or clk_set_max_rate.

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread Colin Cross

On Wed, Apr 27, 2011 at 10:59 PM, MyungJoo Ham  wrote:
> What one instance of DVFS (devfreq) controls are clocks and
> regulators. (a device may have multiple regulators as well as multiple
> clocks)
> What one instance of DVFS (devfreq) monitors (device load and/or
> temperature) is a device that uses the clocks and regulators.
>
> If we focus on the things that are controlled by DVFS, connecting DVFS
> with clock seems fine; however, DVFS's decision is based on the status
> of the device and the decision (monitoring result) configures a set of
> clocks and regulators. The clocks are not configured independently
> from others if the clocks are used by a DVFS-capable device. The
> frequency/voltage pair (OPP in this patch) associated with a device
> becomes a representative value of a specific configuration that
> configures the set of clocks and regulators.
>
> This is quite similar with CPUFREQ. CPUFREQ provides a single
> frequency value as a result of monitoring; however the machine's
> cpufreq driver may set multiple clocks and multiple voltage regulators
> based on the representative value (which is usually the core clock)
> although the cpufreq driver may need to control many more clocks with
> different frequencies.
>
> With multiple clocks of a device, if there is a clock that is required
> to be set independently from the "representative" clock with DVFS, it
> means that the DVFS monitoring result (load/temperature) is not a
> scalar value but a vector (multi-dimensional value). That implies that
> we need to monitor different and independent values, which in turn,
> implies that we need separated devices. Note that the DVFS monitor
> result from load and temperature combined is not a multi-dimensional
> value because the temperature limits "maximum possible frequency or
> voltage" and the load gives "preferred lower bound of frequency" that
> can be overridden by the limit set by temperature.
>
> Therefore, having one DVFS per clock where multiple clocks are
> attached to a device will create multiple monitors that monitor the
> same object(device behavior) with same metrics (load and temperature).
>
> Besides, the reason I've started with "target" callback, not clk and
> regulator names or pointers is that a device may have multiple clks
> and regulators and the OPP may only show the representative
> clock/regulators as CPUFREQ does. Especially when the order of
> transitions of those multiple clocks and regulators matter (if they
> are in a single device, it sometimes does), running a DVFS per clock,
> not per device, will be bothersome if not disastrous.

I understand the need for some sort of governor that can use device
state to determine the necessary clock frequencies.  Where I disagree
is the connection to voltages.  The governor should ONLY determine the
frequencies desired, and the voltage required to meet those
frequencies should be determined by the clock framework, based only on
the clock and the frequency.

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread MyungJoo Ham

On Thu, Apr 28, 2011 at 5:48 AM, Colin Cross  wrote:
> On Wed, Apr 27, 2011 at 12:26 PM, Thomas Gleixner  wrote:
>> Forget OMAP implementation details for a while, sit back and look at
>> the big picture.
>
> Here's my proposal for DVFS:
> - DVFS is implemented in drivers/clk/dvfs.c, and is called by the
> common clock implementation to adjust the voltages, if necessary, on
> regular clk_* calls.
> - Platform code provides mappings in the form (clk, regulator, max
> frequency, min voltage) to the dvfs code.
> - Everything that is in OPP today gets converted to helper functions
> inside the dvfs implementation, and is never called from SoC code
> (except to pass tables at init), or from drivers.
> - OPP can be recreated in the future as a upper level policy manager
> for clocks that need to move together, if that is ever necessary.  It
> would not know anything about voltages.
> - A few common policy implementations need to be added to the common
> clock implementation, like temperature limits.

I hope that my previous reply answered this.

>
> For Tegra:
> - DVFS continues to be accessed by calling clk_* functions
>
> For OMAP:
> - DVFS is triggered by hwmod through clk_* functions.  Any cross-arch
> driver can continue to call clk_* functions.
>
> OPP currently has opp_enable and opp_disable functions.  I don't
> understand why these are needed, they are only used at init time to
> determine available voltages, which could be handled by never passing
> unavailable voltages to the dvfs implementation.

We need them at runtime.

A device "a" may want to guarantee that a device "b" runs at "200MHz"
or faster while "a" does some operations. Then "a" will call
opp_disable("b", 100MHz and others) and later opp_enable("b", them).
We have similar issues with multimedia blocks (MFC, Camera, FB, GPU)
and the CPU/memory bus. The ondemand governor of CPUFREQ takes some
time to catch up with a workload (1.5x the sampling rate on average,
<2.0x the sampling rate in the worst case), which may cause
flickering/tearing with multimedia streams. On the other hand, a
general thermal monitor or battery manager might want to limit energy
usage by disabling the top-performance clocks if it is too hot or the
battery level is low.
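
As a minimal userspace sketch of this usage (the struct and the helpers
opp_set_available and opp_find_ceil are hypothetical stand-ins, not the
kernel OPP functions, and the table values are made up), disabling the
low entries raises the floor that any later lookup can return:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one OPP table entry; not the kernel's struct opp. */
struct opp_entry {
	unsigned long freq_hz;  /* representative clock frequency */
	unsigned long volt_uv;  /* voltage paired with that frequency */
	bool available;         /* cleared by "disable", set by "enable" */
};

/* Mark the entry with exactly this frequency as available or not.
 * Returns 0 on success, -1 if no entry matches. */
static int opp_set_available(struct opp_entry *tbl, size_t n,
			     unsigned long freq_hz, bool available)
{
	assert(tbl != NULL);
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].freq_hz == freq_hz) {
			tbl[i].available = available;
			return 0;
		}
	}
	return -1;
}

/* Lowest available entry with freq >= *freq_hz (table sorted ascending).
 * On success, *freq_hz is updated to the chosen frequency. */
static const struct opp_entry *opp_find_ceil(const struct opp_entry *tbl,
					     size_t n, unsigned long *freq_hz)
{
	assert(tbl != NULL && freq_hz != NULL);
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].available && tbl[i].freq_hz >= *freq_hz) {
			*freq_hz = tbl[i].freq_hz;
			return &tbl[i];
		}
	}
	return NULL;
}
```

With the 100MHz entry disabled, a governor asking for 100MHz or more
gets the 200MHz entry; re-enabling the entry restores the old result.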




-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread MyungJoo Ham

On Thu, Apr 28, 2011 at 3:37 AM, Colin Cross  wrote:
> (sorry, missent the earlier one)
>
> On Wed, Apr 27, 2011 at 11:07 AM, Menon, Nishanth  wrote:
>> On Wed, Apr 27, 2011 at 12:49, Colin Cross  wrote:
>> +l-o
>>
>>> I'm a little confused about the design for this, and OPP as well.  OPP
>>> matches a struct device * and a frequency to a voltage, which is not a
>>> generically useful pairing, as far as I can tell.  On Tegra, it is
>>> quite possible for a single device to have multiple clocks that each
>>> have different voltage requirements, for example the display block can
>>> have an interface clock as well as a pixel clock.  Simplifying this to
>>> dev + freq = voltage seems very OMAP specific, and will be difficult
>>> or impossible to adapt to Tegra.
>> We have the same requirements as well(iclk,fclk,pixclk etc..)! We
>> group them under voltage domains in OMAP ;). if your issue was a
>> ability to have a single freq to a OPP, it is upto SoC to do the
>> proper mapping. Concept of an OPP still remains consistent - which is
>> for a voltage, there is only so much freq you can drive that specific
>> module to.
> No, that is still wrong.  You don't drive a module at a frequency, you
> drive a clock.  You can't map struct device * 1-1 to a clock.  Look at
> omap2_set_init_voltage:
> static int __init omap2_set_init_voltage(char *vdd_name, char *clk_name,
>                                                struct device *dev) {
>        ...
>         clk =  clk_get(NULL, clk_name);
>         freq = clk->rate;
>         opp = opp_find_freq_ceil(dev, &freq);
>         ...
> }
>
> What happens if I have a dev with two frequencies?  I can only pass a
> dev into opp.  It makes infinitely more sense to pass in a clock:
> opp_find_freq_ceil(clk, &freq).

What one instance of DVFS (devfreq) controls are clocks and
regulators. (a device may have multiple regulators as well as multiple
clocks)
What one instance of DVFS (devfreq) monitors (device load and/or
temperature) is a device that uses the clocks and regulators.

If we focus on the things that are controlled by DVFS, connecting DVFS
with clock seems fine; however, DVFS's decision is based on the status
of the device and the decision (monitoring result) configures a set of
clocks and regulators. The clocks are not configured independently
from others if the clocks are used by a DVFS-capable device. The
frequency/voltage pair (OPP in this patch) associated with a device
becomes a representative value of a specific configuration that
configures the set of clocks and regulators.

This is quite similar to CPUFREQ. CPUFREQ provides a single
frequency value as a result of monitoring; however, the machine's
cpufreq driver may set multiple clocks and multiple voltage regulators
based on the representative value (which is usually the core clock),
even though the cpufreq driver may need to control many more clocks
with different frequencies.

With multiple clocks of a device, if there is a clock that is required
to be set independently from the "representative" clock with DVFS, it
means that the DVFS monitoring result (load/temperature) is not a
scalar value but a vector (multi-dimensional value). That implies that
we need to monitor different and independent values, which in turn,
implies that we need separated devices. Note that the DVFS monitor
result from load and temperature combined is not a multi-dimensional
value because the temperature limits "maximum possible frequency or
voltage" and the load gives "preferred lower bound of frequency" that
can be overridden by the limit set by temperature.
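
The way load and temperature collapse into one scalar can be sketched
as follows. This is a userspace illustration only; pick_freq and its
parameters are hypothetical names, and the frequency table is assumed
to be sorted in ascending order.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pick one frequency from an ascending table: the lowest entry that
 * satisfies the load's preferred lower bound, but never above the
 * thermal cap. Returns 0 if even the lowest entry exceeds the cap.
 */
static unsigned long pick_freq(const unsigned long *freqs, size_t n,
			       unsigned long load_floor_hz,
			       unsigned long thermal_cap_hz)
{
	unsigned long best = 0;

	assert(freqs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (freqs[i] > thermal_cap_hz)
			break;          /* the cap overrides the load's wish */
		best = freqs[i];
		if (freqs[i] >= load_floor_hz)
			break;          /* lowest entry meeting the load */
	}
	return best;
}
```

Both inputs constrain the same single value, which is why the result
stays one-dimensional.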

Therefore, having one DVFS per clock where multiple clocks are
attached to a device will create multiple monitors that monitor the
same object(device behavior) with same metrics (load and temperature).

Besides, the reason I've started with a "target" callback, not clk and
regulator names or pointers, is that a device may have multiple clks
and regulators and the OPP may only show the representative
clock/regulators, as CPUFREQ does. Especially when the order of
transitions of those multiple clocks and regulators matters (if they
are in a single device, it sometimes does), running a DVFS per clock,
not per device, will be bothersome if not disastrous.

>
>> It is upto SoC frameworks to implement the transitions. E.g. lets look
>> at scalability: How'd the mechanism proposed work with temperature
>> variances: Example: I dont want to hit 1.5GHz if temp >70C - wont it
>> be an SoC specific hack I'd need to introduce?
> No, because you're putting it in the wrong place, that is a policy
> decision.  Handle it in the clock framework, or handle it in the
> device driver.  That's a bad example either way - what happens if you
> are already at 1.5GHz when the temperature crosses 70C?  You need an
> interrupt that tells you the temperature is too high, and that needs
> to affect a policy decision at a much higher level than dvfs.
>
>>
>> All OPP framework does is store that maps, and leave

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread Colin Cross

On Wed, Apr 27, 2011 at 12:26 PM, Thomas Gleixner  wrote:
> Forget OMAP implementation details for a while, sit back and look at
> the big picture.

Here's my proposal for DVFS:
- DVFS is implemented in drivers/clk/dvfs.c, and is called by the
common clock implementation to adjust the voltages, if necessary, on
regular clk_* calls.
- Platform code provides mappings in the form (clk, regulator, max
frequency, min voltage) to the dvfs code.
- Everything that is in OPP today gets converted to helper functions
inside the dvfs implementation, and is never called from SoC code
(except to pass tables at init), or from drivers.
- OPP can be recreated in the future as an upper level policy manager
for clocks that need to move together, if that is ever necessary.  It
would not know anything about voltages.
- A few common policy implementations need to be added to the common
clock implementation, like temperature limits.
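
The mapping tuple in the list above could be modelled roughly as below.
This is a userspace sketch: struct dvfs_row and dvfs_min_uv are
hypothetical names, not code from drivers/clk, and clocks and
regulators are identified by plain strings only to keep the example
self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical row following the (clk, regulator, max frequency,
 * min voltage) tuple; names and values are illustrative. */
struct dvfs_row {
	const char *clk;
	const char *regulator;
	unsigned long max_hz;  /* highest rate this voltage can sustain */
	int min_uv;            /* minimum voltage needed up to max_hz */
};

/* Return the minimum voltage (in microvolts) for running clk at
 * rate_hz, or -1 if no row for that clock covers the rate. */
static int dvfs_min_uv(const struct dvfs_row *rows, size_t n,
		       const char *clk, unsigned long rate_hz)
{
	int best_uv = -1;
	unsigned long best_max = 0;

	assert(rows != NULL && clk != NULL);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(rows[i].clk, clk) != 0 || rows[i].max_hz < rate_hz)
			continue;
		/* keep the tightest row that still covers the rate */
		if (best_uv < 0 || rows[i].max_hz < best_max) {
			best_uv = rows[i].min_uv;
			best_max = rows[i].max_hz;
		}
	}
	return best_uv;
}
```

Platform code would pass such a table at init; the lookup is keyed on
the clock and the rate only, with no struct device involved.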

For Tegra:
- DVFS continues to be accessed by calling clk_* functions

For OMAP:
- DVFS is triggered by hwmod through clk_* functions.  Any cross-arch
driver can continue to call clk_* functions.

OPP currently has opp_enable and opp_disable functions.  I don't
understand why these are needed, they are only used at init time to
determine available voltages, which could be handled by never passing
unavailable voltages to the dvfs implementation.

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread Thomas Gleixner

On Wed, 27 Apr 2011, Menon, Nishanth wrote:

> OPP table is just a storage and retrieval mechanism, it is upto SoC
> frameworks to choose the most adequate of solutions - e.g. OMAP has
> omap_device, hwmod and a clock framework for more intricate control to
> work in conjunction with cpuidle frameworks as well.

Can you please stop thinking about OMAP for a minute?

A clock framework is nothing SoC specific. A framework is an
abstraction of common HW functionality, which implements general
functionality and relies on the HW specific part to configure it and
to provide access to the hardware itself.

Clocks are ordered as trees in HW, simply because you cannot have a
clock consumer be driven by more than one active clock at the same
time. A clock consumer may select a different clock producer, but that
merely changes the tree structure, nothing else. So why should every
SoC implement its own (differently buggy) version of tree handling and
call it a framework?

Yes, I know you might argue that some devices need two clocks enabled
to be functional. That's correct, but coupling those clocks at the
framework level is the wrong thing to do. If a device needs both an
interface clock and a separate interconnect clock to work, then it
needs to enable both clocks and become a consumer of them.

> There is cross domain dependency which OMAP (yet to be pushed to
> mainline) has - example: when OMAP4's MPUs are at a certain OPP, L3
> (OMAP's SoC bus) needs to be at least a certain OPP - these are
> framework which may be very custom to OMAP itself.

Wrong again. That's not a framework when you hack SoC specific
decision functions into it. It's the OMAP internal hackery to make
stuff work, but that's far from a framework.

What you are describing is a restriction which can be expressed in
tables or rules which are fed into a general framework.

Look at generic irqs, generic timekeeping, generic clockevents and
tons of other real frameworks in the kernel. They abstract out
concepts and provide generic interfaces rather than claiming that the
problem is unique to a particular piece of silicon.

Forget OMAP implementation details for a while, sit back and look at
the big picture.

Thanks,

tglx


Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread Thomas Gleixner

On Wed, 27 Apr 2011, Menon, Nishanth wrote:
> On Wed, Apr 27, 2011 at 12:49, Colin Cross  wrote:
> > I proposed in a different thread on LKML that DVFS be handled within
> > the generic clock implementation.  Platforms would register a
> > regulator and a table of voltages for each struct clock that required
> > DVFS, and the voltages would be changed on normal clk_* requests.
> > This maintains compatibility with existing clk_* calls.
> 
> It is up to SoC frameworks to implement the transitions. E.g. let's look
> at scalability: how would the proposed mechanism work with temperature
> variances? Example: I don't want to hit 1.5GHz if temp >70C - won't it
> be an SoC-specific hack I'd need to introduce?

Why is limiting the max core frequency depending on temperature an
SoC-specific problem?

Everyone wants to do that. x86 does it in hardware / SMM, other
architectures want the kernel to take care of it.

So the decision is simple. Something wants to set core freq to 1.5
GHz, so it calls clk_set_rate() and there we consult the DVFS code
first to validate that setting. If it can be set, fine, then DVFS will
set the voltages _before_ we change the frequency or it will simply
veto the change because one of the preliminaries for such a change is
not given.
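
A minimal userspace sketch of that flow follows. The names
(struct core_state, model_required_uv, model_clk_set_rate) and the
voltage table are hypothetical and this is not the real clk_set_rate();
lowering the voltage only after the frequency has dropped is an added
assumption for the downward direction.

```c
#include <assert.h>
#include <stddef.h>

struct core_state {
	unsigned long rate_hz;
	int volt_uv;
	int temp_c;
};

/* Hypothetical voltage table: -1 means no voltage supports the rate. */
static int model_required_uv(unsigned long rate_hz)
{
	if (rate_hz <= 800000000UL)
		return 1000000;
	if (rate_hz <= 1500000000UL)
		return 1250000;
	return -1;
}

/* Returns 0 on success, -1 if the DVFS check vetoes the change. */
static int model_clk_set_rate(struct core_state *s, unsigned long rate_hz)
{
	int uv;

	assert(s != NULL);
	uv = model_required_uv(rate_hz);
	if (uv < 0)
		return -1;          /* no valid voltage: veto */
	if (rate_hz >= 1500000000UL && s->temp_c > 70)
		return -1;          /* too hot for 1.5 GHz: veto */
	if (uv > s->volt_uv)
		s->volt_uv = uv;    /* raise voltage before frequency */
	s->rate_hz = rate_hz;
	if (uv < s->volt_uv)
		s->volt_uv = uv;    /* lower voltage after frequency */
	return 0;
}
```

A vetoed request leaves both the rate and the voltage untouched, so the
caller can fall back to a lower rate.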

Please stop thinking that your SoC is sooo special. It's NOT.

The HW concepts are quite similar all over the place, they are just
named differently and use different IP blocks with slightly different
functionality, but the problems are not unique to a particular SoC at
all.

> All OPP framework does is store that maps, and leaves it to users to
> choose regulators, clock framework variances, SoC temperature sensors
> or what ever mechanisms they choose to allow through a transition.

That's how it's implemented, but that does not say that the design is
correct and usable for more than the usecase it was modeled after.

We are looking into a common clock framework, which abstracts out the
duplicated functionality of the various implementations and reduces
them to the real thing: hardware drivers. So we really need to look
into that DVFS problem as well, simply because it is tightly coupled
and not a complete separate entity.

And looking at the struct clk disaster we really don't want another
incarnation in terms of DVFS where we end up with the same decision
functions in various SoCs over and over.

Thanks,

tglx

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread Colin Cross

On Wed, Apr 27, 2011 at 11:48 AM, Menon, Nishanth  wrote:
> On Wed, Apr 27, 2011 at 13:29, Colin Cross  wrote:
>> On Wed, Apr 27, 2011 at 11:07 AM, Menon, Nishanth  wrote:
>>> On Wed, Apr 27, 2011 at 12:49, Colin Cross  wrote:
>>> +l-o
>>>
>>>> I'm a little confused about the design for this, and OPP as well.  OPP
>>>> matches a struct device * and a frequency to a voltage, which is not a
>>>> generically useful pairing, as far as I can tell.  On Tegra, it is
>>>> quite possible for a single device to have multiple clocks that each
>>>> have different voltage requirements, for example the display block can
>>>> have an interface clock as well as a pixel clock.  Simplifying this to
>>>> dev + freq = voltage seems very OMAP specific, and will be difficult
>>>> or impossible to adapt to Tegra.
>>> We have the same requirements as well(iclk,fclk,pixclk etc..)! We
>>> group them under voltage domains in OMAP ;). if your issue was a
>>> ability to have a single freq to a OPP, it is upto SoC to do the
>>> proper mapping. Concept of an OPP still remains consistent - which is
>>> for a voltage, there is only so much freq you can drive that specific
>>> module to.
>> No, that is still wrong.  You don't drive a module at a frequency, you
>> drive a clock.  You can't map struct device * 1-1 to a clock.  Look at
> Agreed, module runs on clocks - Lets say n clocks provide a module
> it's functionality.
>
>> omap2_set_init_voltage:
>> static int __init omap2_set_init_voltage(char *vdd_name, char *clk_name,
>>                                                struct device *dev) {
>>
>>        clk =  clk_get(NULL, clk_name);
>>        freq = clk->rate;
>>        opp = opp_find_freq_ceil(dev, &freq);
>>        ...
>> }
>>
>> Now what happens if I have a dev with two frequencies,
> we do have it - it depends on what the OPP table represents. we do
> have modules which have both interface and functional clocks on OMAP
> as well. for a module(represented by struct device *) which has n
> clocks, choose the scheme of representation of clock that depends on
> voltage for the module.
> in the example you provided "the display block can have an interface
> clock as well as a pixel clock" - I suppose you mean:
> {.pclk = x, .iclk = y, .v = z}
> The question I'd ask is this : for a voltage z, is the dependency on
> pclk or iclk? I can expect a dependency of pclk to iclk requirement
> (considering pixel clock drives an external display for example). the
> table reduces to just
> {.iclk = y, .v = z} and a different table that has divisor for .iclk
> to pclk which is SoC based.

No, there can be voltage requirements on both, and the higher voltage
requirement of the two must be used.
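
A minimal sketch of that rule, written as plain userspace C rather than kernel code. The struct names, the table layout and the microvolt units are assumptions made for illustration; nothing here is an existing OPP or clk API. Each clock carries its own frequency-to-voltage table, and the shared rail is set to the highest of the per-clock minimums.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: one frequency/voltage step for a single clock. */
struct clk_volt_step {
	unsigned long max_hz;	/* highest rate usable at this voltage */
	unsigned long min_uv;	/* minimum voltage in microvolts */
};

/* Hypothetical per-clock table, sorted by ascending max_hz. */
struct clk_volt_table {
	const struct clk_volt_step *steps;
	size_t nsteps;
};

/* Minimum voltage for one clock at a given rate, or 0 if the rate is too high. */
static unsigned long clk_min_uv(const struct clk_volt_table *t, unsigned long hz)
{
	size_t i;

	assert(t != NULL);
	for (i = 0; i < t->nsteps; i++)
		if (hz <= t->steps[i].max_hz)
			return t->steps[i].min_uv;
	return 0;
}

/*
 * The rail has to satisfy the most demanding clock, so take the maximum
 * of the per-clock minimums. Returns 0 if any clock cannot run at its rate.
 */
static unsigned long rail_required_uv(const struct clk_volt_table *tables,
				      const unsigned long *rates_hz,
				      size_t nclks)
{
	unsigned long need = 0;
	size_t i;

	for (i = 0; i < nclks; i++) {
		unsigned long uv = clk_min_uv(&tables[i], rates_hz[i]);

		if (uv == 0)
			return 0;
		if (uv > need)
			need = uv;
	}
	return need;
}
```

With one table for the interface clock and one for the pixel clock, rail_required_uv() returns whichever requirement is higher for the current pair of rates, which a single dev + freq lookup cannot express.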

> OPP table is just a storage and retrieval mechanism, it is upto SoC
> frameworks to choose the most adequate of solutions - e.g. OMAP has
> omap_device, hwmod and a clock framework for more intricate control to
> work in conjunction with cpuidle frameworks as well.
>
> There is cross domain dependency which OMAP (yet to be pushed to
> mainline) has - example: when OMAP4's MPUs are at a certain OPP, L3
> (OMAP's SoC bus) needs to be at least a certain OPP - these are
> framework which may be very custom to OMAP itself.
>
> ---
> Regards,
> Nishanth Menon
>
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread Menon, Nishanth

On Wed, Apr 27, 2011 at 13:29, Colin Cross  wrote:
> On Wed, Apr 27, 2011 at 11:07 AM, Menon, Nishanth  wrote:
>> On Wed, Apr 27, 2011 at 12:49, Colin Cross  wrote:
>> +l-o
>>
>>> I'm a little confused about the design for this, and OPP as well.  OPP
>>> matches a struct device * and a frequency to a voltage, which is not a
>>> generically useful pairing, as far as I can tell.  On Tegra, it is
>>> quite possible for a single device to have multiple clocks that each
>>> have different voltage requirements, for example the display block can
>>> have an interface clock as well as a pixel clock.  Simplifying this to
>>> dev + freq = voltage seems very OMAP specific, and will be difficult
>>> or impossible to adapt to Tegra.
>> We have the same requirements as well(iclk,fclk,pixclk etc..)! We
>> group them under voltage domains in OMAP ;). if your issue was a
>> ability to have a single freq to a OPP, it is upto SoC to do the
>> proper mapping. Concept of an OPP still remains consistent - which is
>> for a voltage, there is only so much freq you can drive that specific
>> module to.
> No, that is still wrong.  You don't drive a module at a frequency, you
> drive a clock.  You can't map struct device * 1-1 to a clock.  Look at
Agreed, a module runs on clocks. Let's say n clocks provide a module
its functionality.

> omap2_set_init_voltage:
> static int __init omap2_set_init_voltage(char *vdd_name, char *clk_name,
>                                                struct device *dev) {
>
>        clk =  clk_get(NULL, clk_name);
>        freq = clk->rate;
>        opp = opp_find_freq_ceil(dev, &freq);
>        ...
> }
>
> Now what happens if I have a dev with two frequencies,
We do have it; it depends on what the OPP table represents. We do
have modules which have both interface and functional clocks on OMAP
as well. For a module (represented by struct device *) which has n
clocks, choose the scheme of representation of the clock that depends on
voltage for the module.
In the example you provided, "the display block can have an interface
clock as well as a pixel clock", I suppose you mean:
{.pclk = x, .iclk = y, .v = z}
The question I'd ask is this: for a voltage z, is the dependency on
pclk or iclk? I can expect a dependency of pclk on the iclk requirement
(considering the pixel clock drives an external display, for example). The
table reduces to just
{.iclk = y, .v = z} and a different table that has the divisor for .iclk
to pclk, which is SoC based.

The OPP table is just a storage and retrieval mechanism; it is up to SoC
frameworks to choose the most adequate of solutions. E.g. OMAP has
omap_device, hwmod and a clock framework for more intricate control to
work in conjunction with cpuidle frameworks as well.

There is a cross-domain dependency which OMAP (yet to be pushed to
mainline) has. Example: when OMAP4's MPUs are at a certain OPP, L3
(OMAP's SoC bus) needs to be at least at a certain OPP. These are
frameworks which may be very custom to OMAP itself.
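
One way to picture such a cross-domain dependency, as a hedged userspace C sketch. The OPP indices, the struct and both function names are hypothetical and do not reflect the unmerged OMAP code mentioned above; the only idea taken from the text is that the MPU OPP imposes a floor on the L3 OPP.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical constraint: while the MPU runs at OPP index mpu_opp or
 * higher, the L3 bus must be at OPP index l3_min_opp or higher.
 */
struct opp_dependency {
	unsigned int mpu_opp;
	unsigned int l3_min_opp;
};

/* Lowest L3 OPP allowed for a given MPU OPP. Table sorted by ascending mpu_opp. */
static unsigned int l3_floor_for_mpu(const struct opp_dependency *deps,
				     size_t ndeps, unsigned int mpu_opp)
{
	unsigned int floor = 0;
	size_t i;

	assert(deps != NULL || ndeps == 0);
	for (i = 0; i < ndeps && deps[i].mpu_opp <= mpu_opp; i++)
		floor = deps[i].l3_min_opp;
	return floor;
}

/* Resolve the L3 OPP: what L3's own users asked for, raised to the floor. */
static unsigned int l3_resolve_opp(const struct opp_dependency *deps,
				   size_t ndeps, unsigned int mpu_opp,
				   unsigned int l3_requested)
{
	unsigned int floor = l3_floor_for_mpu(deps, ndeps, mpu_opp);

	return l3_requested > floor ? l3_requested : floor;
}
```

The table only stores the constraint; deciding when to re-evaluate it (on every MPU transition, for instance) is left to whatever framework owns the transitions.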

---
Regards,
Nishanth Menon

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread Colin Cross

(sorry, missent the earlier one)

On Wed, Apr 27, 2011 at 11:07 AM, Menon, Nishanth  wrote:
> On Wed, Apr 27, 2011 at 12:49, Colin Cross  wrote:
> +l-o
>
>> I'm a little confused about the design for this, and OPP as well.  OPP
>> matches a struct device * and a frequency to a voltage, which is not a
>> generically useful pairing, as far as I can tell.  On Tegra, it is
>> quite possible for a single device to have multiple clocks that each
>> have different voltage requirements, for example the display block can
>> have an interface clock as well as a pixel clock.  Simplifying this to
>> dev + freq = voltage seems very OMAP specific, and will be difficult
>> or impossible to adapt to Tegra.
> We have the same requirements as well(iclk,fclk,pixclk etc..)! We
> group them under voltage domains in OMAP ;). if your issue was a
> ability to have a single freq to a OPP, it is upto SoC to do the
> proper mapping. Concept of an OPP still remains consistent - which is
> for a voltage, there is only so much freq you can drive that specific
> module to.
No, that is still wrong.  You don't drive a module at a frequency, you
drive a clock.  You can't map struct device * 1-1 to a clock.  Look at
omap2_set_init_voltage:
static int __init omap2_set_init_voltage(char *vdd_name, char *clk_name,
                                               struct device *dev) {
...
        clk =  clk_get(NULL, clk_name);
        freq = clk->rate;
        opp = opp_find_freq_ceil(dev, &freq);
        ...
}

What happens if I have a dev with two frequencies?  I can only pass a
dev into opp.  It makes infinitely more sense to pass in a clock:
opp_find_freq_ceil(clk, &freq).

> It is upto SoC frameworks to implement the transitions. E.g. lets look
> at scalability: How'd the mechanism proposed work with temperature
> variances: Example: I dont want to hit 1.5GHz if temp >70C - wont it
> be an SoC specific hack I'd need to introduce?
No, because you're putting it in the wrong place; that is a policy
decision.  Handle it in the clock framework, or handle it in the
device driver.  That's a bad example either way: what happens if you
are already at 1.5GHz when the temperature crosses 70C?  You need an
interrupt that tells you the temperature is too high, and that needs
to affect a policy decision at a much higher level than dvfs.

>
> All OPP framework does is store that maps, and leaves it to users to
> choose regulators, clock framework variances, SoC temperature sensors
> or what ever mechanisms they choose to allow through a transition.
I understand it's just a map, but it's a map between two things that
don't have a direct mapping in many SoCs.  I think if you changed
every usage of struct device * in opp to struct clk *, it would make much
more sense.  There is already a mapping from struct device * to struct
clk *: it's called clk_get, and it takes a second parameter to allow
devices to have multiple clocks.
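
To make the suggestion concrete, here is a small userspace C sketch of a ceil lookup keyed by clock instead of by device. All sketch_* names and struct layouts are hypothetical stand-ins, not the kernel's clk or OPP types; only the roles of clk_get(dev, con_id) and opp_find_freq_ceil() are borrowed from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical frequency/voltage pair attached to one clock. */
struct clk_opp {
	unsigned long hz;
	unsigned long uv;
};

/* Hypothetical stand-in for one named clock of a device. */
struct sketch_clk {
	const char *con_id;		/* connection id, e.g. "iclk" */
	const struct clk_opp *opps;	/* sorted by ascending hz */
	size_t nopps;
};

/* Hypothetical stand-in for a device that owns several clocks. */
struct sketch_dev {
	const struct sketch_clk *clks;
	size_t nclks;
};

/* Plays the role of clk_get(dev, con_id): pick one of the device's clocks. */
static const struct sketch_clk *sketch_clk_get(const struct sketch_dev *dev,
					       const char *con_id)
{
	size_t i;

	assert(dev != NULL && con_id != NULL);
	for (i = 0; i < dev->nclks; i++)
		if (strcmp(dev->clks[i].con_id, con_id) == 0)
			return &dev->clks[i];
	return NULL;
}

/*
 * Ceil lookup keyed by clock: return the lowest entry with hz >= *freq
 * and update *freq to that entry's rate; NULL if no entry is high enough.
 */
static const struct clk_opp *
sketch_opp_find_freq_ceil(const struct sketch_clk *clk, unsigned long *freq)
{
	size_t i;

	assert(clk != NULL && freq != NULL);
	for (i = 0; i < clk->nopps; i++) {
		if (clk->opps[i].hz >= *freq) {
			*freq = clk->opps[i].hz;
			return &clk->opps[i];
		}
	}
	return NULL;
}
```

A display device with "iclk" and "pclk" entries can then be asked about each clock separately, and the same requested rate can resolve to a different voltage on each.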

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread Colin Cross

On Wed, Apr 27, 2011 at 11:07 AM, Menon, Nishanth  wrote:
> On Wed, Apr 27, 2011 at 12:49, Colin Cross  wrote:
> +l-o
>
>> I'm a little confused about the design for this, and OPP as well.  OPP
>> matches a struct device * and a frequency to a voltage, which is not a
>> generically useful pairing, as far as I can tell.  On Tegra, it is
>> quite possible for a single device to have multiple clocks that each
>> have different voltage requirements, for example the display block can
>> have an interface clock as well as a pixel clock.  Simplifying this to
>> dev + freq = voltage seems very OMAP specific, and will be difficult
>> or impossible to adapt to Tegra.
> We have the same requirements as well(iclk,fclk,pixclk etc..)! We
> group them under voltage domains in OMAP ;). if your issue was a
> ability to have a single freq to a OPP, it is upto SoC to do the
> proper mapping. Concept of an OPP still remains consistent - which is
> for a voltage, there is only so much freq you can drive that specific
> module to.
No, that is still wrong.  You don't drive a module at a frequency, you
drive a clock.  You can't map struct device * 1-1 to a clock.  Look at
omap2_set_init_voltage:
static int __init omap2_set_init_voltage(char *vdd_name, char *clk_name,
                                               struct device *dev) {

        clk =  clk_get(NULL, clk_name);
        freq = clk->rate;
        opp = opp_find_freq_ceil(dev, &freq);
        ...
}

Now what happens if I have a dev with two frequencies,

>> Moreover, from a silicon perspective, there is always a simple link
>> from a single frequency to a minimum voltage for a given circuit.
>> There is no need to group them into OPPs, which seem to have a group
>> of clocks and their frequencies that map to a single voltage.  That is
>> an artifact of the way TI specifies voltages.
>>
>> I don't think DVFS is even the right place for any sort of governor.
>> DVFS is very simple - to increase to a specific clock speed, the
>> voltage must be immediately be raised, with minimum or no delay, to a
>> specified value that is specific to that clock.  When the frequency is
>> lowered, the voltage should be decreased.  There is a tiny bit of
>> policy to determine when to delay dropping the voltage in case the
>> frequency will immediately be raised again, but nowhere near the
>> complexity of what is shown here.
>>
>> I proposed in a different thread on LKML that DVFS be handled within
>> the generic clock implementation.  Platforms would register a
>> regulator and a table of voltages for each struct clock that required
>> DVFS, and the voltages would be changed on normal clk_* requests.
>> This maintains compatibility with existing clk_* calls.
>
> It is upto SoC frameworks to implement the transitions. E.g. lets look
> at scalability: How'd the mechanism proposed work with temperature
> variances: Example: I dont want to hit 1.5GHz if temp >70C - wont it
> be an SoC specific hack I'd need to introduce?
>
> All OPP framework does is store that maps, and leaves it to users to
> choose regulators, clock framework variances, SoC temperature sensors
> or what ever mechanisms they choose to allow through a transition.
>
>> There is a place for a GPU, etc., frequency governor, but it is a
>> completely separate issue from DVFS, and should not be mixed in.  I
>> could have a GPU that is not voltage scalable, but could still benefit
>> from lowering the frequency when it is not in use.  A devfreq
>> interface sounds perfect for this, as long as it only ends up calling
>> clk_* functions, and those functions handle getting the voltage
>> correct.
>
> Regards,
> Nishanth Menon
> PS:
> https://lists.linux-foundation.org/pipermail/linux-pm/2011-April/031113.html
> for start of thread
>

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread Menon, Nishanth

On Wed, Apr 27, 2011 at 12:49, Colin Cross  wrote:
+l-o

> I'm a little confused about the design for this, and OPP as well.  OPP
> matches a struct device * and a frequency to a voltage, which is not a
> generically useful pairing, as far as I can tell.  On Tegra, it is
> quite possible for a single device to have multiple clocks that each
> have different voltage requirements, for example the display block can
> have an interface clock as well as a pixel clock.  Simplifying this to
> dev + freq = voltage seems very OMAP specific, and will be difficult
> or impossible to adapt to Tegra.
We have the same requirements as well (iclk, fclk, pixclk etc.)! We
group them under voltage domains in OMAP ;). If your issue was an
ability to have a single freq to an OPP, it is up to the SoC to do the
proper mapping. The concept of an OPP still remains consistent: for a
voltage, there is only so much freq you can drive that specific
module to.

> Moreover, from a silicon perspective, there is always a simple link
> from a single frequency to a minimum voltage for a given circuit.
> There is no need to group them into OPPs, which seem to have a group
> of clocks and their frequencies that map to a single voltage.  That is
> an artifact of the way TI specifies voltages.
>
> I don't think DVFS is even the right place for any sort of governor.
> DVFS is very simple - to increase to a specific clock speed, the
> voltage must be immediately be raised, with minimum or no delay, to a
> specified value that is specific to that clock.  When the frequency is
> lowered, the voltage should be decreased.  There is a tiny bit of
> policy to determine when to delay dropping the voltage in case the
> frequency will immediately be raised again, but nowhere near the
> complexity of what is shown here.
>
> I proposed in a different thread on LKML that DVFS be handled within
> the generic clock implementation.  Platforms would register a
> regulator and a table of voltages for each struct clock that required
> DVFS, and the voltages would be changed on normal clk_* requests.
> This maintains compatibility with existing clk_* calls.

It is up to SoC frameworks to implement the transitions. E.g. let's look
at scalability: how would the proposed mechanism work with temperature
variances? Example: I don't want to hit 1.5GHz if temp > 70C. Won't it
be an SoC-specific hack I'd need to introduce?

All the OPP framework does is store the maps, and it leaves it to users to
choose regulators, clock framework variances, SoC temperature sensors
or whatever mechanisms they choose to allow through a transition.

> There is a place for a GPU, etc., frequency governor, but it is a
> completely separate issue from DVFS, and should not be mixed in.  I
> could have a GPU that is not voltage scalable, but could still benefit
> from lowering the frequency when it is not in use.  A devfreq
> interface sounds perfect for this, as long as it only ends up calling
> clk_* functions, and those functions handle getting the voltage
> correct.

Regards,
Nishanth Menon
PS:
https://lists.linux-foundation.org/pipermail/linux-pm/2011-April/031113.html
for start of thread
