Re: [RFC v2 0/6] Managing cluser-level c-states with generic power domains

2015-10-20 Thread Marc Titinger

On 19/10/2015 22:58, Lina Iyer wrote:

Hi Marc,

I am trying to apply this on top of Axel's patches on linux-next (after
fixing issues I saw with his v9), and running to issues applying your
patches. Could you rebase on top of his v10 (he said he would send to
the ML soon) ?



Hi Lina,

I want to replay this with Juno this afternoon first, I'll post ASAP.

Also, based on Kevin's comment I was wondering if I should drop this 
path already and try the other way as discussed (hook l2 devices to 
runtime-pm, through the CPU device), but I still need to think about 
this first.


Cheers,
Marc.




Thanks,
Lina

On Tue, Oct 06 2015 at 08:27 -0600, Marc Titinger wrote:

v2:
- rebase on Lina Iyer's latest series
- remove unnecessary dependency on perf-state patches from Axel Haslam

---

Summary

1) DESCRIPTION
2) DEPENDENCIES
3) URL



1) DESCRIPTION


This patch set's underlying idea is that cluster-level c-states
can be managed
by the power domain, building upon Lina Iyers recent work on
CPU-domain, and Axel Haslam's
genpd multiple states. The power domain may contain CPU devices and
non-CPU devices.

Non-CPU Devices may expose latency constraints by registering
intermediate power-states upon
probing, for instance shallower states than the deepest cluster-off
state. The generic
power domain governor may chose a device retention state in place of
the cluster-sleep
state demanded by the menu governor, and call the platform specific
handling to enter/leave
that retention state.


power-states
---


The proposed way how cluster-level c-states are declared as manageable
by the
power domain, rather than through the cpuidle-ops, relies on the
introduction of
"power-states", consistent with c-states. Here is an example of the DT
bindings,
the c-state CLUSTER_SLEEP_0 is exposed as a power-state in the
compatible property:

juno.dts:   idle-states {
   entry-method = "arm,psci";

   CPU_SLEEP_0: cpu-sleep-0 {
   compatible = "arm,idle-state";
   arm,psci-suspend-param = <0x001>;
   local-timer-stop;
   entry-latency-us = <100>;
   exit-latency-us = <250>;
   min-residency-us = <2000>;
   };

   CLUSTER_SLEEP_0: cluster-sleep-0 {
   compatible = "arm,power-state";
   arm,psci-suspend-param = <0x101>;
   local-timer-stop;
   entry-latency-us = <800>;
   exit-latency-us = <700>;
   min-residency-us = <2500>;
   };
}

This will tell cpuidle runtime_put/get the CPU devices for this
c-state. Eventually, the
actual platform handlers may be called from the genpd platform ops (in
place of cpuidle_ops).

"drivers/cpuidle/cpuidle-arm.c":

static const struct of_device_id arm_idle_state_match[] __initconst = {
   {.compatible = "arm,idle-state",
.data = arm_enter_idle_state},
   {.compatible = "arm,power-state",
.data = arm_enter_power_state},
};


In case of a power-state, arm_enter_power_state will only call
pm_runtime_put/get_sync
The power doamin will handle the power off, currently this patch set
lacks the final
call to the psci interface to have a fully fonctionnal setup
(and there are some genpd_lock'ing issues if put/get actually suspend
the CPU device.)

Ultimately, we would like the Power Domain's simple governor to being
able to chose
the cluster power-state based on the c-states defered to it
(power-states) and constraints
added by the devices. Consequently, we need to "soak" those
power-states into the
power-domain intermediate states from Axel. Since power-states are
declared and handled
the same manner than c-states (idle-states in DT), these patches add a
soaking used when
attaching to a genpd, where power-states are parsed from the DT into
the genpd states:


"drivers/base/power/domain.c":

static const struct of_device_id power_state_match[] = {
   {.compatible = "arm,power-state",
},
};

int of_genpd_device_parse_states(struct device_node *np,
struct generic_pm_domain *genpd)

debugfs addition
---

To easy debug, this patch set adds a seq-file names "states" to the
pm_genpd debugfs:

   cat /sys/kernel/debug/pm_genpd/*

 Domain State nameEnter (ns) / Exit (ns)
   -
   a53_pd   cluster-sleep-0  150 / 80
   a57_pd   cluster-sleep-0  150 / 80

And also a seq-file "timings", to help visualize the constrains of the
non-CPU
devices in a cluster PD.

   Domain Devices, Timings in ns
  

Re: [RFC v2 0/6] Managing cluser-level c-states with generic power domains

2015-10-19 Thread Lina Iyer

Hi Marc,

I am trying to apply this on top of Axel's patches on linux-next (after
fixing issues I saw with his v9), and running to issues applying your
patches. Could you rebase on top of his v10 (he said he would send to
the ML soon) ?

Thanks,
Lina

On Tue, Oct 06 2015 at 08:27 -0600, Marc Titinger wrote:

v2:
- rebase on Lina Iyer's latest series
- remove unnecessary dependency on perf-state patches from Axel Haslam

---

Summary

1) DESCRIPTION
2) DEPENDENCIES
3) URL



1) DESCRIPTION


This patch set's underlying idea is that cluster-level c-states can be 
managed
by the power domain, building upon Lina Iyers recent work on CPU-domain, and 
Axel Haslam's
genpd multiple states. The power domain may contain CPU devices and non-CPU 
devices.

Non-CPU Devices may expose latency constraints by registering intermediate 
power-states upon
probing, for instance shallower states than the deepest cluster-off state. The 
generic
power domain governor may chose a device retention state in place of the 
cluster-sleep
state demanded by the menu governor, and call the platform specific handling to 
enter/leave
that retention state.


power-states
---


The proposed way how cluster-level c-states are declared as manageable by the
power domain, rather than through the cpuidle-ops, relies on the introduction of
"power-states", consistent with c-states. Here is an example of the DT bindings,
the c-state CLUSTER_SLEEP_0 is exposed as a power-state in the compatible 
property:

juno.dts:   idle-states {
   entry-method = "arm,psci";

   CPU_SLEEP_0: cpu-sleep-0 {
   compatible = "arm,idle-state";
   arm,psci-suspend-param = <0x001>;
   local-timer-stop;
   entry-latency-us = <100>;
   exit-latency-us = <250>;
   min-residency-us = <2000>;
   };

   CLUSTER_SLEEP_0: cluster-sleep-0 {
   compatible = "arm,power-state";
   arm,psci-suspend-param = <0x101>;
   local-timer-stop;
   entry-latency-us = <800>;
   exit-latency-us = <700>;
   min-residency-us = <2500>;
   };
}

This will tell cpuidle runtime_put/get the CPU devices for this c-state. 
Eventually, the
actual platform handlers may be called from the genpd platform ops (in place of 
cpuidle_ops).

"drivers/cpuidle/cpuidle-arm.c":

static const struct of_device_id arm_idle_state_match[] __initconst = {
   {.compatible = "arm,idle-state",
.data = arm_enter_idle_state},
   {.compatible = "arm,power-state",
.data = arm_enter_power_state},
};


In case of a power-state, arm_enter_power_state will only call 
pm_runtime_put/get_sync
The power doamin will handle the power off, currently this patch set lacks the 
final
call to the psci interface to have a fully fonctionnal setup
(and there are some genpd_lock'ing issues if put/get actually suspend the CPU 
device.)

Ultimately, we would like the Power Domain's simple governor to being able to 
chose
the cluster power-state based on the c-states defered to it (power-states) and 
constraints
added by the devices. Consequently, we need to "soak" those power-states into 
the
power-domain intermediate states from Axel. Since power-states are declared and 
handled
the same manner than c-states (idle-states in DT), these patches add a soaking 
used when
attaching to a genpd, where power-states are parsed from the DT into the genpd 
states:


"drivers/base/power/domain.c":

static const struct of_device_id power_state_match[] = {
   {.compatible = "arm,power-state",
},
};

int of_genpd_device_parse_states(struct device_node *np,
struct generic_pm_domain *genpd)

debugfs addition
---

To easy debug, this patch set adds a seq-file names "states" to the pm_genpd 
debugfs:

   cat /sys/kernel/debug/pm_genpd/*

 Domain State nameEnter (ns) / Exit (ns)
   -
   a53_pd   cluster-sleep-0  150 / 80
   a57_pd   cluster-sleep-0  150 / 80

And also a seq-file "timings", to help visualize the constrains of the non-CPU
devices in a cluster PD.

   Domain Devices, Timings in ns
  Stop/Start Save/Restore, Effective
  ---
a57_pd
   /cpus/cpu@0  800   /7401320  /1720  ,0 (cached stop)
   /cpus/cpu@1  800   /7401420  /1780  ,0 (cached stop)
   /D1  660   /58016560 /6080  ,2199420 (cached stop)


Device power-states

Re: [RFC v2 0/6] Managing cluser-level c-states with generic power domains

2015-10-14 Thread Axel Haslam
> @Axel: any plans to repost your series now that we have a little more
> momentum here?
>

sure, no problem, ill rebase adding Marc's patch for the debugfs file
and re-post them.

Regards,
Axel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC v2 0/6] Managing cluser-level c-states with generic power domains

2015-10-13 Thread Kevin Hilman
Hi Marc,

Sorry for the lag in reviewing this.  I've been on the road for a couple
weeks of conference related travel.  I did have a look through this on
the plane, but didn't have time to reply until now, and then had to
rework my thoughts a bit since Lina already said some of the things I
was thinking.   Anyways, some overall thoughts below...

Marc Titinger  writes:

> v2: 
>  - rebase on Lina Iyer's latest series
>  - remove unnecessary dependency on perf-state patches from Axel Haslam
>
> ---
>
> Summary
>
> 1) DESCRIPTION
> 2) DEPENDENCIES
> 3) URL
> 
>
>
> 1) DESCRIPTION
>
>
>   This patch set's underlying idea is that cluster-level c-states can be 
> managed

nit: can we drop the use of C-states throughout and just refer to idle
states.  C-states is technically an ACPI specific term.

> by the power domain, building upon Lina Iyers recent work on CPU-domain, and 
> Axel Haslam's
> genpd multiple states. The power domain may contain CPU devices and non-CPU 
> devices.

@Axel: any plans to repost your series now that we have a little more
momentum here?

> Non-CPU Devices may expose latency constraints by registering intermediate 
> power-states upon
> probing, for instance shallower states than the deepest cluster-off state. 
> The generic
> power domain governor may chose a device retention state in place of the 
> cluster-sleep
> state demanded by the menu governor, and call the platform specific handling 
> to enter/leave
> that retention state.
>
>
> power-states
> ---
>
>
> The proposed way how cluster-level c-states are declared as manageable by the
> power domain, rather than through the cpuidle-ops, relies on the introduction 
> of
> "power-states", consistent with c-states. Here is an example of the DT 
> bindings,
> the c-state CLUSTER_SLEEP_0 is exposed as a power-state in the compatible 
> property:
>
> juno.dts:   idle-states {
> entry-method = "arm,psci";
>
> CPU_SLEEP_0: cpu-sleep-0 {
> compatible = "arm,idle-state";
> arm,psci-suspend-param = <0x001>;
> local-timer-stop;
> entry-latency-us = <100>;
> exit-latency-us = <250>;
> min-residency-us = <2000>;
> };
>
> CLUSTER_SLEEP_0: cluster-sleep-0 {
> compatible = "arm,power-state";
> arm,psci-suspend-param = <0x101>;
> local-timer-stop;
> entry-latency-us = <800>;
> exit-latency-us = <700>;
> min-residency-us = <2500>;
> };
>   }
>
> This will tell cpuidle runtime_put/get the CPU devices for this c-state. 
> Eventually, the
> actual platform handlers may be called from the genpd platform ops (in place 
> of cpuidle_ops).
>
> "drivers/cpuidle/cpuidle-arm.c":
>
> static const struct of_device_id arm_idle_state_match[] __initconst = {
> {.compatible = "arm,idle-state",
>  .data = arm_enter_idle_state},
> {.compatible = "arm,power-state",
>  .data = arm_enter_power_state},
> };
>
>
> In case of a power-state, arm_enter_power_state will only call 
> pm_runtime_put/get_sync
> The power doamin will handle the power off, 

I'm not crazy about this part of the design.  Essentially it delegates
the cluster level decision making to the CPUidle driver.  IMO, all of
the decision making should be at the domain level.

> currently this patch set lacks the final
> call to the psci interface to have a fully fonctionnal setup
> (and there are some genpd_lock'ing issues if put/get actually suspend the CPU 
> device.)

> Ultimately, we would like the Power Domain's simple governor to being able to 
> chose
> the cluster power-state based on the c-states defered to it (power-states) 
> and constraints
> added by the devices. Consequently, we need to "soak" those power-states into 
> the
> power-domain intermediate states from Axel. Since power-states are declared 
> and handled
> the same manner than c-states (idle-states in DT), these patches add a 
> soaking used when
> attaching to a genpd, where power-states are parsed from the DT into the 
> genpd states:
>
>
> "drivers/base/power/domain.c":
>
> static const struct of_device_id power_state_match[] = {
> {.compatible = "arm,power-state",
>  },
> };
>
> int of_genpd_device_parse_states(struct device_node *np,
>  struct generic_pm_domain *genpd)

This approach results in some strange (IMO) sharing of the
responsibility, as can be seen because both the genpd and the cpuidle
driver end up parsing the new arm,power-state states.

IMO, the CPUidle drive

[RFC v2 0/6] Managing cluser-level c-states with generic power domains

2015-10-06 Thread Marc Titinger
v2: 
 - rebase on Lina Iyer's latest series
 - remove unnecessary dependency on perf-state patches from Axel Haslam

---

Summary

1) DESCRIPTION
2) DEPENDENCIES
3) URL



1) DESCRIPTION


This patch set's underlying idea is that cluster-level c-states can be 
managed
by the power domain, building upon Lina Iyers recent work on CPU-domain, and 
Axel Haslam's
genpd multiple states. The power domain may contain CPU devices and non-CPU 
devices.

Non-CPU Devices may expose latency constraints by registering intermediate 
power-states upon
probing, for instance shallower states than the deepest cluster-off state. The 
generic
power domain governor may chose a device retention state in place of the 
cluster-sleep
state demanded by the menu governor, and call the platform specific handling to 
enter/leave
that retention state.


power-states
---


The proposed way how cluster-level c-states are declared as manageable by the
power domain, rather than through the cpuidle-ops, relies on the introduction of
"power-states", consistent with c-states. Here is an example of the DT bindings,
the c-state CLUSTER_SLEEP_0 is exposed as a power-state in the compatible 
property:

juno.dts:   idle-states {
entry-method = "arm,psci";

CPU_SLEEP_0: cpu-sleep-0 {
compatible = "arm,idle-state";
arm,psci-suspend-param = <0x001>;
local-timer-stop;
entry-latency-us = <100>;
exit-latency-us = <250>;
min-residency-us = <2000>;
};

CLUSTER_SLEEP_0: cluster-sleep-0 {
compatible = "arm,power-state";
arm,psci-suspend-param = <0x101>;
local-timer-stop;
entry-latency-us = <800>;
exit-latency-us = <700>;
min-residency-us = <2500>;
};
}

This will tell cpuidle runtime_put/get the CPU devices for this c-state. 
Eventually, the
actual platform handlers may be called from the genpd platform ops (in place of 
cpuidle_ops).

"drivers/cpuidle/cpuidle-arm.c":

static const struct of_device_id arm_idle_state_match[] __initconst = {
{.compatible = "arm,idle-state",
 .data = arm_enter_idle_state},
{.compatible = "arm,power-state",
 .data = arm_enter_power_state},
};


In case of a power-state, arm_enter_power_state will only call 
pm_runtime_put/get_sync
The power doamin will handle the power off, currently this patch set lacks the 
final
call to the psci interface to have a fully fonctionnal setup
(and there are some genpd_lock'ing issues if put/get actually suspend the CPU 
device.)

Ultimately, we would like the Power Domain's simple governor to being able to 
chose
the cluster power-state based on the c-states defered to it (power-states) and 
constraints
added by the devices. Consequently, we need to "soak" those power-states into 
the
power-domain intermediate states from Axel. Since power-states are declared and 
handled
the same manner than c-states (idle-states in DT), these patches add a soaking 
used when
attaching to a genpd, where power-states are parsed from the DT into the genpd 
states:


"drivers/base/power/domain.c":

static const struct of_device_id power_state_match[] = {
{.compatible = "arm,power-state",
 },
};

int of_genpd_device_parse_states(struct device_node *np,
 struct generic_pm_domain *genpd)

debugfs addition
---

To easy debug, this patch set adds a seq-file names "states" to the pm_genpd 
debugfs:

cat /sys/kernel/debug/pm_genpd/*

  Domain State nameEnter (ns) / Exit (ns)
-
a53_pd   cluster-sleep-0  150 / 80
a57_pd   cluster-sleep-0  150 / 80

And also a seq-file "timings", to help visualize the constrains of the non-CPU
devices in a cluster PD.

Domain Devices, Timings in ns
   Stop/Start Save/Restore, Effective
  ---
a57_pd
/cpus/cpu@0  800   /7401320  /1720  ,0 (cached stop)
/cpus/cpu@1  800   /7401420  /1780  ,0 (cached stop)
/D1  660   /58016560 /6080  ,2199420 (cached stop)


Device power-states
---

some devices, like L2 caches, may feature a shallower retention mode, between 
CPU_SLEEP_0
and CLUSTER_SLEEP_0, in which mode the L2 memory is not powered off, leading to 
faster
resume than CLUSTER_SLEEP_0.

One way to handle device constrains and ret