[PATCH 2/2] Thermal: Deactivate cooling device when unbinding it.

2012-09-21 Thread zhanghongbo
From: hongbo.zhang <hongbo.zh...@linaro.com>

A cooling device should be set to state zero when it is unbound.

Signed-off-by: hongbo.zhang <hongbo.zh...@linaro.com>
---
 drivers/thermal/thermal_sys.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 2c28c85..efc5c56 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -885,6 +885,8 @@ int thermal_zone_unbind_cooling_device(struct thermal_zone_device *tz,
 	mutex_lock(&tz->lock);
 	list_for_each_entry_safe(pos, next, &tz->cooling_devices, node) {
 		if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) {
+			if (cdev->ops->set_cur_state)
+				cdev->ops->set_cur_state(cdev, 0);
 			list_del(&pos->node);
 			mutex_unlock(&tz->lock);
 			goto unbind;
-- 
1.7.11.3
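
For illustration, a minimal sketch (not part of the patch) of a driver tear-down path that benefits from this change; the helper and its arguments are hypothetical, only the thermal_zone_unbind_cooling_device() call is the real API:

/*
 * Hypothetical driver code: unbind a cooling device from every trip
 * point of a zone. With the patch above, the unbind also forces the
 * cooling device back to state 0, so e.g. a cpufreq cooling device no
 * longer caps the CPU frequency after the zone is torn down.
 */
static void example_unbind_all(struct thermal_zone_device *tz,
			       struct thermal_cooling_device *cdev,
			       int num_trips)
{
	int trip;

	for (trip = 0; trip < num_trips; trip++)
		thermal_zone_unbind_cooling_device(tz, trip, cdev);
}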




[PATCH 1/2] Thermal: Add interface to deactivate cooling devices.

2012-09-21 Thread zhanghongbo
From: hongbo.zhang <hongbo.zh...@linaro.com>

If the thermal zone mode is disabled, all the referenced cooling
devices should be put into state zero.
Without this patch the thermal driver cannot deactivate all of its
cooling devices in its .set_mode callback, because the cooling device
list is maintained in the generic thermal layer.
This interface is introduced to fix that.

Signed-off-by: hongbo.zhang <hongbo.zh...@linaro.com>
---
 drivers/thermal/thermal_sys.c | 22 ++
 include/linux/thermal.h   |  1 +
 2 files changed, 23 insertions(+)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 2ab31e4..2c28c85 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -1130,6 +1130,28 @@ leave:
 EXPORT_SYMBOL(thermal_zone_device_update);
 
 /**
+ * thermal_zone_device_deactive - deactivate the cooling devices of a thermal zone
+ * @tz:	thermal zone device
+ *
+ * This function should be called in the thermal zone device .set_mode
+ * callback when the thermal zone is disabled.
+ */
+void thermal_zone_device_deactive(struct thermal_zone_device *tz)
+{
+	struct thermal_cooling_device_instance *instance;
+	struct thermal_cooling_device *cdev;
+
+	mutex_lock(&tz->lock);
+	list_for_each_entry(instance, &tz->cooling_devices, node) {
+		cdev = instance->cdev;
+		if (cdev->ops->set_cur_state)
+			cdev->ops->set_cur_state(cdev, 0);
+	}
+	mutex_unlock(&tz->lock);
+}
+EXPORT_SYMBOL(thermal_zone_device_deactive);
+
+/**
  * create_trip_attrs - create attributes for trip points
  * @tz:	the thermal zone device
  * @mask:  Writeable trip point bitmap.
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 4b94a61..5e915a3 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -161,6 +161,7 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *, int,
 int thermal_zone_unbind_cooling_device(struct thermal_zone_device *, int,
 				       struct thermal_cooling_device *);
 void thermal_zone_device_update(struct thermal_zone_device *);
+void thermal_zone_device_deactive(struct thermal_zone_device *);
 struct thermal_cooling_device *thermal_cooling_device_register(char *, void *,
const struct thermal_cooling_device_ops *);
 void thermal_cooling_device_unregister(struct thermal_cooling_device *);
-- 
1.7.11.3
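
To show the intended use of the new interface, here is a hedged sketch of a thermal driver's .set_mode callback; the example_zone structure and its mode bookkeeping are hypothetical, only thermal_zone_device_deactive() comes from this patch:

/*
 * Hypothetical driver code: disable a zone and put all of its bound
 * cooling devices back into state 0 via the new interface.
 */
static int example_set_mode(struct thermal_zone_device *tz,
			    enum thermal_device_mode mode)
{
	struct example_zone *zone = tz->devdata;	/* hypothetical */

	zone->mode = mode;
	if (mode == THERMAL_DEVICE_DISABLED)
		thermal_zone_device_deactive(tz);

	return 0;
}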




[PATCH 0/2] Thermal patches to deactivate cooling devices when needed

2012-09-21 Thread zhanghongbo
From: hongbo.zhang <hongbo.zh...@linaro.com>

This patch set contains two patches.

[PATCH 1/2]
A new interface is introduced to deactivate all the referenced cooling devices
when a thermal zone is disabled. Because the cooling device list is maintained
in the generic thermal layer, the thermal driver cannot walk through the list
and therefore cannot deactivate its cooling devices itself.
This interface is needed in the .set_mode callback when the thermal zone mode
is set to disabled.

[PATCH 2/2]
When a cooling device is unbound it should be deactivated; otherwise the cooling
device will stay active after unbinding, which isn't what we expect.

hongbo.zhang (2):
  Thermal: Add interface to deactivate cooling devices.
  Thermal: Deactivate cooling device when unbinding it.

 drivers/thermal/thermal_sys.c | 24 
 include/linux/thermal.h   |  1 +
 2 files changed, 25 insertions(+)

-- 
1.7.11.3




Re: [PATCH 0/2] Thermal patches to deactivate cooling devices when needed

2012-09-21 Thread Zhang Rui
On Fri, 2012-09-21 at 14:57 +0800, zhanghongbo wrote:
> From: hongbo.zhang <hongbo.zh...@linaro.com>
>
> This patch set contains two patches.
>
> [PATCH 1/2]
> A new interface is introduced to deactivate all the referenced cooling devices
> when a thermal zone is disabled.

We cannot deactivate a cooling device directly.
We should deactivate all the thermal_instances for this thermal zone,
because a cooling device may be referenced by multiple thermal zones.

> Because the cooling device list is maintained
> in the generic thermal layer, the thermal driver cannot walk through the list
> and therefore cannot deactivate its cooling devices itself.
> This interface is needed in the .set_mode callback when the thermal zone mode
> is set to disabled.
>
Durga is introducing cooling policies for the generic thermal layer,
and one of them is userspace.
If we set the policy to userspace, the generic thermal layer will do
nothing but take input from userspace.
If we have an API to change the cooling policy for a thermal zone, can
that be used instead?

> [PATCH 2/2]
> When a cooling device is unbound it should be deactivated; otherwise the cooling
> device will stay active after unbinding, which isn't what we expect.
>
As I said, it should delete the thermal_instance and then update the
cooling device.

thanks,
rui
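
A rough sketch of the instance-based approach Rui describes, using a hypothetical per-instance target field (later kernels added exactly this): a zone clears only its own cooling requests, and the cooling device state is then recomputed across all zones that reference it.

/*
 * Sketch only: the "target" field and the recompute step are
 * hypothetical; they do not exist in the 3.6 thermal core.
 */
static void deactivate_zone_requests(struct thermal_zone_device *tz)
{
	struct thermal_cooling_device_instance *instance;

	mutex_lock(&tz->lock);
	list_for_each_entry(instance, &tz->cooling_devices, node)
		instance->target = 0;	/* clear this zone's request only */
	mutex_unlock(&tz->lock);
	/*
	 * A follow-up step would set each referenced cooling device to
	 * max(target) over all of its instances, preserving requests
	 * made by other thermal zones.
	 */
}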




Re: [PATCH 0/2] Thermal patches to deactivate cooling devices when needed

2012-09-21 Thread Hongbo Zhang
On 21 September 2012 15:21, Zhang Rui <rui.zh...@intel.com> wrote:
> On Fri, 2012-09-21 at 14:57 +0800, zhanghongbo wrote:
>> From: hongbo.zhang <hongbo.zh...@linaro.com>
>>
>> This patch set contains two patches.
>>
>> [PATCH 1/2]
>> A new interface is introduced to deactivate all the referenced cooling devices
>> when a thermal zone is disabled.
>
> We cannot deactivate a cooling device directly.
> We should deactivate all the thermal_instances for this thermal zone,
> because a cooling device may be referenced by multiple thermal zones.
Understood.

>> Because the cooling device list is maintained
>> in the generic thermal layer, the thermal driver cannot walk through the list
>> and therefore cannot deactivate its cooling devices itself.
>> This interface is needed in the .set_mode callback when the thermal zone mode
>> is set to disabled.
>
> Durga is introducing cooling policies for the generic thermal layer,
> and one of them is userspace.
> If we set the policy to userspace, the generic thermal layer will do
> nothing but take input from userspace.
> If we have an API to change the cooling policy for a thermal zone, can
> that be used instead?
The reason I sent out these patches is that I found that deactivation
of cooling devices is necessary, and I hadn't seen any update of the
thermal framework recently. Since you are still working on this
framework, can you consider this deactivation function in your next
version, or do I need to resend? I think it is better if you do it.

Another proposal: letting the thermal driver walk through its cooling
device list is also important, I think. If userspace mode is
introduced, the thermal driver needs to manipulate its cooling
devices, and it cannot do that without knowing its cooling device
list. (One method to achieve this is in my patch from several weeks
ago.)

Another question: when will you update your framework, in v3.7?
Thanks.

>> [PATCH 2/2]
>> When a cooling device is unbound it should be deactivated; otherwise the cooling
>> device will stay active after unbinding, which isn't what we expect.
>
> As I said, it should delete the thermal_instance and then update the
> cooling device.
Understood.

> thanks,
> rui




Re: [PATCH 0/2] Thermal patches to deactivate cooling devices when needed

2012-09-21 Thread Zhang Rui
On Fri, 2012-09-21 at 15:50 +0800, Hongbo Zhang wrote:
> On 21 September 2012 15:21, Zhang Rui <rui.zh...@intel.com> wrote:
>> On Fri, 2012-09-21 at 14:57 +0800, zhanghongbo wrote:
>>> From: hongbo.zhang <hongbo.zh...@linaro.com>
>>>
>>> This patch set contains two patches.
>>>
>>> [PATCH 1/2]
>>> A new interface is introduced to deactivate all the referenced cooling
>>> devices
>>> when a thermal zone is disabled.
>>
>> We cannot deactivate a cooling device directly.
>> We should deactivate all the thermal_instances for this thermal zone,
>> because a cooling device may be referenced by multiple thermal zones.
> Understood.
>
>>> Because the cooling device list is maintained
>>> in the generic thermal layer, the thermal driver cannot walk through the
>>> list
>>> and therefore cannot deactivate its cooling devices itself.
>>> This interface is needed in the .set_mode callback when the thermal zone
>>> mode
>>> is set to disabled.
>>
>> Durga is introducing cooling policies for the generic thermal layer,
>> and one of them is userspace.
>> If we set the policy to userspace, the generic thermal layer will do
>> nothing but take input from userspace.
>> If we have an API to change the cooling policy for a thermal zone, can
>> that be used instead?
> The reason I sent out these patches is that I found that deactivation
> of cooling devices is necessary, and I hadn't seen any update of the
> thermal framework recently. Since you are still working on this
> framework, can you consider this deactivation function in your next
> version, or do I need to resend? I think it is better if you do it.

yes, I'll do it.

> Another proposal: letting the thermal driver walk through its cooling
> device list is also important, I think. If userspace mode is
> introduced, the thermal driver needs to manipulate its cooling
> devices, and it cannot do that without knowing its cooling device
> list. (One method to achieve this is in my patch from several weeks
> ago.)

The thermal_instance list is enough for this. Because a cooling device may
be referenced by multiple trip points for one thermal zone, the thermal
zone device should just deactivate all the thermal instances when using
the userspace governor.

> Another question: when will you update your framework, in v3.7?

As this change was introduced recently, I'm not sure if we can catch
the 3.7 merge window.

thanks,
rui




Re: [PATCH 0/2] Thermal patches to deactivate cooling devices when needed

2012-09-21 Thread Hongbo Zhang
On 21 September 2012 16:02, Zhang Rui <rui.zh...@intel.com> wrote:
> On Fri, 2012-09-21 at 15:50 +0800, Hongbo Zhang wrote:
>> On 21 September 2012 15:21, Zhang Rui <rui.zh...@intel.com> wrote:
>>> On Fri, 2012-09-21 at 14:57 +0800, zhanghongbo wrote:
>>>> From: hongbo.zhang <hongbo.zh...@linaro.com>
>>>>
>>>> This patch set contains two patches.
>>>>
>>>> [PATCH 1/2]
>>>> A new interface is introduced to deactivate all the referenced cooling
>>>> devices
>>>> when a thermal zone is disabled.
>>>
>>> We cannot deactivate a cooling device directly.
>>> We should deactivate all the thermal_instances for this thermal zone,
>>> because a cooling device may be referenced by multiple thermal zones.
>> Understood.
>>
>>>> Because the cooling device list is maintained
>>>> in the generic thermal layer, the thermal driver cannot walk through the
>>>> list
>>>> and therefore cannot deactivate its cooling devices itself.
>>>> This interface is needed in the .set_mode callback when the thermal zone
>>>> mode
>>>> is set to disabled.
>>>
>>> Durga is introducing cooling policies for the generic thermal layer,
>>> and one of them is userspace.
>>> If we set the policy to userspace, the generic thermal layer will do
>>> nothing but take input from userspace.
>>> If we have an API to change the cooling policy for a thermal zone, can
>>> that be used instead?
>> The reason I sent out these patches is that I found that deactivation
>> of cooling devices is necessary, and I hadn't seen any update of the
>> thermal framework recently. Since you are still working on this
>> framework, can you consider this deactivation function in your next
>> version, or do I need to resend? I think it is better if you do it.
>
> yes, I'll do it.
OK, thanks.

>> Another proposal: letting the thermal driver walk through its cooling
>> device list is also important, I think. If userspace mode is
>> introduced, the thermal driver needs to manipulate its cooling
>> devices, and it cannot do that without knowing its cooling device
>> list. (One method to achieve this is in my patch from several weeks
>> ago.)
>
> The thermal_instance list is enough for this. Because a cooling device may
> be referenced by multiple trip points for one thermal zone, the thermal
> zone device should just deactivate all the thermal instances when using
> the userspace governor.
Got it.

>> Another question: when will you update your framework, in v3.7?
>
> As this change was introduced recently, I'm not sure if we can catch
> the 3.7 merge window.
I will rebase the ST-Ericsson thermal driver against your new framework
and try to upstream it then.

> thanks,
> rui




Re: [Gumstix-users] Linaro, Gumstix, and illegal instructions

2012-09-21 Thread Ash Charles
Like Jeff mentioned, I also saw some illegal instructions on early
Linaro builds but didn't pursue it at the time.  I just did a little
digging online, and there was some mention of ARM errata causing issues
(https://bugs.launchpad.net/ubuntu/+source/fakeroot/+bug/495536).

Based on
https://github.com/gumstix/Gumstix-Overo-Kernel/blob/master/arch/arm/configs/overo_linaro_defconfig
several of the errata workarounds are not set.  To be honest, I've not found a
description of the errata other than the high-level detail mentioned
in the kernel config, so I'd love to know which are the correct ones to
set for the Overo.
-Ash
On Thu, Sep 20, 2012 at 11:00 AM, Jonathan Kunkee
<jonathan.kun...@gmail.com> wrote:
> Hello,
>
> I recently acquired a Gumstix Overo Water with a Tobi and decided to put
> Node.js on it to turn it into a web server of sorts.
>
> The build dies with an "Illegal instruction" error, and I am thoroughly
> confused.
>
> In order to set up the build, I did the following:
> 1. imaged a microSD card with the Linaro binaries indicated by
> http://wiki.gumstix.org/index.php?title=Installing_Linaro_Image
> 2. booted
> 3. dd'd/enabled 200MB of swap on the SD card
> 4. apt-get update/upgrade everything
> 5. installed various development dependencies (git, libc6-dev, g++, and many
> others)
> 6. cloned Joyent's repo.
> 7. modified `wscript` in the repo with V8-specific changes for ARM
> (including enabling hard-float) (modified the scons invocation in v0.8)
> 8. ./configure
> 9. make
> 10. observed 'illegal instruction' death in the make process:
>
> /root/node/deps/v8/src/arguments.h: In constructor
> 'v8::internal::CustomArguments::CustomArguments(v8::internal::Isolate*,
> v8::internal::Object*, v8::internal::Object*, v8::internal::JSObject*)':
> /root/node/deps/v8/src/arguments.h:93:65: internal compiler error: Illegal
> instruction
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See file:///usr/share/doc/gcc-4.6/README.Bugs for instructions.
> The bug is not reproducible, so it is likely a hardware or OS problem.
> scons: *** [obj/release/runtime-profiler.o] Error 1
> scons: building terminated because of errors.
>
>
> If I restart the build, it works fine for a while (including rebuilding the
> object it died building) then dies again. A different portion of the build
> is running and it usually receives a segfault:
>
> [ 9/35] cxx: src/node_javascript.cc -> out/Release/src/node_javascript_5.o
> /usr/bin/g++ -pthread -g -O3 -DHAVE_OPENSSL=1 -D_LARGEFILE_SOURCE
> -D_FILE_OFFSET_BITS=64 -DHAVE_FDATASYNC=1 -DARCH=arm -DPLATFORM=linux
> -D__POSIX__=1 -Wno-unused-parameter -D_FORTIFY_SOURCE=2 -IRelease/src
> -I../src -IRelease/deps/http_parser -I../deps/http_parser
> -IRelease/deps/uv/include -I../deps/uv/include -IRelease/deps/uv/src/ares
> -I../deps/uv/src/ares -IRelease/deps/v8/include -I../deps/v8/include
> -Ideps/v8/include ../src/node_javascript.cc -c -o
> Release/src/node_javascript_5.o
> g++: internal compiler error: Segmentation fault (program as)
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See file:///usr/share/doc/gcc-4.6/README.Bugs for instructions.
> Waf: Leaving directory `/root/node/out'
>
>
> I am confused. I've made a few observations about my setup:
>
> * Thinking that it might be a thermal problem because it was under heavy
> load at 600MHz in open air, I dug out a small heat sink, attached it, and
> pointed a small fan at it. It now runs cool to the touch. I still get the
> same 'illegal instruction' deaths under load.
>
> * Usually three invocations of 'make' are required before it actually
> finishes, but it does finish. I can run the command-line interpreter and it
> seems to behave inasmuch as I've used it.
> * 'make test' fails, but I think that's a Node.js/ARM problem.
> * I run htop to monitor the build, and sometimes the build system works fine
> while htop dies with an 'illegal instruction' error.
> * Node.js on ARM is presently somewhat sticky, so I checked out the v0.6.6
> tag. v0.6.18 encounters the same problem. v0.8.9 simply segfaults instead.
> * I have an Overo Water (GS3503W-R2889) on a Tobi.
> * MLO version: I haven't the foggiest, but I get the U-Boot version string
> twice, once being before u-boot.bin is loaded.
> * U-Boot 2012.04.01 (Jul 19 2012 - 17:31:34)
> * I moved the SD card to a different Water/Tobi pair and I got the same
> error (illegal instruction, Node.js v0.8.9)
>
> It feels like a task swapping bug in the kernel. Perhaps it's related to
> errata on the OMAP3530. It could also be read timeouts from the swap file on
> the SD card, though dmesg doesn't say anything of the sort. I found a kernel
> patch regarding inconsistent caches on ARM, but it hit in 2.6.33 and I'm
> running 3.2.1.
>
> Has anyone seen something like this?
> Would anyone be willing to replicate the problem? I'd be happy to share my
> tweaks to Node.js' build system to get it to build.
>
> Thanks for reading!
> Jon Kunkee
>
> --
> Everyone

[PATCH v3 3/3] devfreq: Add current freq callback in device profile

2012-09-21 Thread Rajagopal Venkat
Devfreq returns the governor-predicted frequency as the current
frequency via its sysfs interface. But the device may not support
all frequencies that the governor predicts. So add a callback in
the device profile to get the current frequency from the driver.
Also add a new sysfs node to expose the governor-predicted next
target frequency.

Signed-off-by: Rajagopal Venkat <rajagopal.ven...@linaro.org>
Signed-off-by: MyungJoo Ham <myungjoo@samsung.com>
---
 Documentation/ABI/testing/sysfs-class-devfreq | 11 ++-
 drivers/devfreq/devfreq.c | 14 ++
 include/linux/devfreq.h   |  3 +++
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-class-devfreq b/Documentation/ABI/testing/sysfs-class-devfreq
index 89283b1..e6cf08e 100644
--- a/Documentation/ABI/testing/sysfs-class-devfreq
+++ b/Documentation/ABI/testing/sysfs-class-devfreq
@@ -19,7 +19,16 @@ Date:		September 2011
 Contact:	MyungJoo Ham <myungjoo@samsung.com>
 Description:
 		The /sys/class/devfreq/.../cur_freq shows the current
-		frequency of the corresponding devfreq object.
+		frequency of the corresponding devfreq object. Same as
+		target_freq when get_cur_freq() is not implemented by
+		devfreq driver.
+
+What:		/sys/class/devfreq/.../target_freq
+Date:		September 2012
+Contact:	Rajagopal Venkat <rajagopal.ven...@linaro.org>
+Description:
+		The /sys/class/devfreq/.../target_freq shows the next governor
+		predicted target frequency of the corresponding devfreq object.
 
 What:		/sys/class/devfreq/.../polling_interval
 Date:		September 2011
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index edddb9e..33b0a33 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -453,6 +453,19 @@ static ssize_t show_governor(struct device *dev,
 static ssize_t show_freq(struct device *dev,
 struct device_attribute *attr, char *buf)
 {
+	unsigned long freq;
+	struct devfreq *devfreq = to_devfreq(dev);
+
+	if (devfreq->profile->get_cur_freq &&
+	    !devfreq->profile->get_cur_freq(devfreq->dev.parent, &freq))
+		return sprintf(buf, "%lu\n", freq);
+
+	return sprintf(buf, "%lu\n", devfreq->previous_freq);
+}
+
+static ssize_t show_target_freq(struct device *dev,
+			struct device_attribute *attr, char *buf)
+{
 	return sprintf(buf, "%lu\n", to_devfreq(dev)->previous_freq);
 }
 
@@ -552,6 +565,7 @@ static ssize_t show_max_freq(struct device *dev, struct device_attribute *attr,
 static struct device_attribute devfreq_attrs[] = {
 	__ATTR(governor, S_IRUGO, show_governor, NULL),
 	__ATTR(cur_freq, S_IRUGO, show_freq, NULL),
+	__ATTR(target_freq, S_IRUGO, show_target_freq, NULL),
 	__ATTR(polling_interval, S_IRUGO | S_IWUSR, show_polling_interval,
 	       store_polling_interval),
 	__ATTR(min_freq, S_IRUGO | S_IWUSR, show_min_freq, store_min_freq),
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 18bd3b7..f49c5d3 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -66,6 +66,8 @@ struct devfreq_dev_status {
  *			explained above with "DEVFREQ_FLAG_*" macros.
  * @get_dev_status	The device should provide the current performance
  *			status to devfreq, which is used by governors.
+ * @get_cur_freq	The device should provide the current frequency
+ *			at which it is operating.
  * @exit	An optional callback that is called when devfreq
  *		is removing the devfreq object due to error or
  *		from devfreq_remove_device() call. If the user
@@ -79,6 +81,7 @@ struct devfreq_dev_profile {
 	int (*target)(struct device *dev, unsigned long *freq, u32 flags);
 	int (*get_dev_status)(struct device *dev,
 			      struct devfreq_dev_status *stat);
+	int (*get_cur_freq)(struct device *dev, unsigned long *freq);
 	void (*exit)(struct device *dev);
 };
 
-- 
1.7.11.3
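
To make the driver side concrete, here is a hedged sketch of a get_cur_freq() implementation; the clk-based driver below is hypothetical, only the callback signature and the fall-back behaviour come from this patch:

#include <linux/clk.h>
#include <linux/devfreq.h>

/*
 * Hypothetical driver callback: report the rate the hardware actually
 * runs at, which can differ from the governor's last target when the
 * requested frequency was rounded to a supported OPP.
 */
static int example_get_cur_freq(struct device *dev, unsigned long *freq)
{
	struct clk *clk = dev_get_drvdata(dev);	/* hypothetical storage */

	*freq = clk_get_rate(clk);
	return 0;	/* non-zero makes cur_freq fall back to previous_freq */
}

It would then be hooked up through the profile, e.g. .get_cur_freq = example_get_cur_freq in the driver's struct devfreq_dev_profile.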




[PATCH v3 2/3] devfreq: Add suspend and resume apis

2012-09-21 Thread Rajagopal Venkat
Add devfreq suspend/resume APIs for devfreq users. This patch
supports suspend and resume of devfreq load monitoring, which is
required for devices that can idle.

Signed-off-by: Rajagopal Venkat <rajagopal.ven...@linaro.org>
---
 drivers/devfreq/devfreq.c | 28 
 drivers/devfreq/governor.h|  2 ++
 drivers/devfreq/governor_simpleondemand.c |  9 +
 include/linux/devfreq.h   | 12 
 4 files changed, 51 insertions(+)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 8e9b5aa..edddb9e 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -416,6 +416,34 @@ int devfreq_remove_device(struct devfreq *devfreq)
 }
 EXPORT_SYMBOL(devfreq_remove_device);
 
+/**
+ * devfreq_suspend_device() - Suspend devfreq of a device.
+ * @devfreq:	the devfreq instance to be suspended
+ */
+int devfreq_suspend_device(struct devfreq *devfreq)
+{
+	if (!devfreq)
+		return -EINVAL;
+
+	return devfreq->governor->event_handler(devfreq,
+				DEVFREQ_GOV_SUSPEND, NULL);
+}
+EXPORT_SYMBOL(devfreq_suspend_device);
+
+/**
+ * devfreq_resume_device() - Resume devfreq of a device.
+ * @devfreq:	the devfreq instance to be resumed
+ */
+int devfreq_resume_device(struct devfreq *devfreq)
+{
+	if (!devfreq)
+		return -EINVAL;
+
+	return devfreq->governor->event_handler(devfreq,
+				DEVFREQ_GOV_RESUME, NULL);
+}
+EXPORT_SYMBOL(devfreq_resume_device);
+
 static ssize_t show_governor(struct device *dev,
 struct device_attribute *attr, char *buf)
 {
diff --git a/drivers/devfreq/governor.h b/drivers/devfreq/governor.h
index bb3aff3..26432ac 100644
--- a/drivers/devfreq/governor.h
+++ b/drivers/devfreq/governor.h
@@ -22,6 +22,8 @@
 #define DEVFREQ_GOV_START  0x1
 #define DEVFREQ_GOV_STOP   0x2
 #define DEVFREQ_GOV_INTERVAL   0x3
+#define DEVFREQ_GOV_SUSPEND0x4
+#define DEVFREQ_GOV_RESUME 0x5
 
 /* Caution: devfreq->lock must be locked before calling update_devfreq */
 extern int update_devfreq(struct devfreq *devfreq);
diff --git a/drivers/devfreq/governor_simpleondemand.c b/drivers/devfreq/governor_simpleondemand.c
index cf94218..a8ba78c 100644
--- a/drivers/devfreq/governor_simpleondemand.c
+++ b/drivers/devfreq/governor_simpleondemand.c
@@ -104,6 +104,15 @@ int devfreq_simple_ondemand_handler(struct devfreq *devfreq,
case DEVFREQ_GOV_INTERVAL:
devfreq_interval_update(devfreq, (unsigned int *)data);
break;
+
+   case DEVFREQ_GOV_SUSPEND:
+   devfreq_monitor_suspend(devfreq);
+   break;
+
+   case DEVFREQ_GOV_RESUME:
+   devfreq_monitor_resume(devfreq);
+   break;
+
default:
break;
}
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 2ab70e3..18bd3b7 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -158,6 +158,8 @@ extern struct devfreq *devfreq_add_device(struct device *dev,
  const struct devfreq_governor *governor,
  void *data);
 extern int devfreq_remove_device(struct devfreq *devfreq);
+extern int devfreq_suspend_device(struct devfreq *devfreq);
+extern int devfreq_resume_device(struct devfreq *devfreq);
 
 /* Helper functions for devfreq user device driver with OPP. */
 extern struct opp *devfreq_recommended_opp(struct device *dev,
@@ -211,6 +213,16 @@ static int devfreq_remove_device(struct devfreq *devfreq)
return 0;
 }
 
+static int devfreq_suspend_device(struct devfreq *devfreq)
+{
+   return 0;
+}
+
+static int devfreq_resume_device(struct devfreq *devfreq)
+{
+   return 0;
+}
+
 static struct opp *devfreq_recommended_opp(struct device *dev,
   unsigned long *freq, u32 flags)
 {
-- 
1.7.11.3
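
A hedged sketch of the intended call sites: a driver pausing and restarting load monitoring from its runtime PM callbacks. The driver state below is hypothetical; devfreq_suspend_device() and devfreq_resume_device() are the new API:

#include <linux/devfreq.h>
#include <linux/pm_runtime.h>

struct example_drvdata {		/* hypothetical driver state */
	struct devfreq *devfreq;
};

static int example_runtime_suspend(struct device *dev)
{
	struct example_drvdata *d = dev_get_drvdata(dev);

	/* stop polling while the device is idle */
	return devfreq_suspend_device(d->devfreq);
}

static int example_runtime_resume(struct device *dev)
{
	struct example_drvdata *d = dev_get_drvdata(dev);

	/* restart load monitoring now that the device is active again */
	return devfreq_resume_device(d->devfreq);
}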




[PATCH v3 1/3] devfreq: Core updates to support devices which can idle

2012-09-21 Thread Rajagopal Venkat
Prepare the devfreq core framework to support devices which
can idle. When device idleness is detected, perhaps through
runtime PM, some mechanism is needed to suspend devfreq load
monitoring and resume it when the device comes back online.
The present code continues monitoring unless the device is
removed from the devfreq core.

This patch introduces the following design changes:

- Use per-device work instead of global work to monitor device
  load. This enables suspend/resume of device devfreq and
  reduces monitoring code complexity.
- Decouple the delayed-work-based load monitoring logic from the
  core by introducing helper functions to be used by governors.
  This gives governors the flexibility either to use the
  delayed-work-based monitoring functions or to implement their
  own mechanism.
- The devfreq core interacts with governors via events to perform
  specific actions. These events include start/stop devfreq. This
  sets the ground for adding suspend/resume events.

The devfreq APIs are not modified and are kept intact.

Signed-off-by: Rajagopal Venkat <rajagopal.ven...@linaro.org>
---
 Documentation/ABI/testing/sysfs-class-devfreq |   8 -
 drivers/devfreq/devfreq.c | 431 +++---
 drivers/devfreq/governor.h|  11 +
 drivers/devfreq/governor_performance.c|  16 +-
 drivers/devfreq/governor_powersave.c  |  16 +-
 drivers/devfreq/governor_simpleondemand.c |  24 ++
 drivers/devfreq/governor_userspace.c  |  23 +-
 include/linux/devfreq.h   |  34 +-
 8 files changed, 267 insertions(+), 296 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-devfreq b/Documentation/ABI/testing/sysfs-class-devfreq
index 23d78b5..89283b1 100644
--- a/Documentation/ABI/testing/sysfs-class-devfreq
+++ b/Documentation/ABI/testing/sysfs-class-devfreq
@@ -21,14 +21,6 @@ Description:
The /sys/class/devfreq/.../cur_freq shows the current
frequency of the corresponding devfreq object.
 
-What:  /sys/class/devfreq/.../central_polling
-Date:  September 2011
-Contact:   MyungJoo Ham myungjoo@samsung.com
-Description:
-   The /sys/class/devfreq/.../central_polling shows whether
-   the devfreq ojbect is using devfreq-provided central
-   polling mechanism or not.
-
 What:  /sys/class/devfreq/.../polling_interval
 Date:  September 2011
 Contact:   MyungJoo Ham myungjoo@samsung.com
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index b146d76..8e9b5aa 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -30,17 +30,11 @@
 struct class *devfreq_class;
 
 /*
- * devfreq_work periodically monitors every registered device.
- * The minimum polling interval is one jiffy. The polling interval is
- * determined by the minimum polling period among all polling devfreq
- * devices. The resolution of polling interval is one jiffy.
+ * devfreq core provides delayed work based load monitoring helper
+ * functions. Governors can use these or can implement their own
+ * monitoring mechanism.
  */
-static bool polling;
 static struct workqueue_struct *devfreq_wq;
-static struct delayed_work devfreq_work;
-
-/* wait removing if this is to be removed */
-static struct devfreq *wait_remove_device;
 
 /* The list of all device-devfreq */
 static LIST_HEAD(devfreq_list);
@@ -72,6 +66,8 @@ static struct devfreq *find_device_devfreq(struct device *dev)
return ERR_PTR(-ENODEV);
 }
 
+/* Load monitoring helper functions for governors use */
+
 /**
  * update_devfreq() - Reevaluate the device and configure frequency.
  * @devfreq:   the devfreq instance.
@@ -121,6 +117,140 @@ int update_devfreq(struct devfreq *devfreq)
 }
 
 /**
+ * devfreq_monitor() - Periodically poll devfreq objects.
+ * @work:  the work struct used to run devfreq_monitor periodically.
+ *
+ */
+static void devfreq_monitor(struct work_struct *work)
+{
+	int err;
+	struct devfreq *devfreq = container_of(work,
+					struct devfreq, work.work);
+
+	mutex_lock(&devfreq->lock);
+	err = update_devfreq(devfreq);
+	if (err)
+		dev_err(&devfreq->dev, "dvfs failed with (%d) error\n", err);
+
+	queue_delayed_work(devfreq_wq, &devfreq->work,
+				msecs_to_jiffies(devfreq->profile->polling_ms));
+	mutex_unlock(&devfreq->lock);
+}
+
+/**
+ * devfreq_monitor_start() - Start load monitoring of devfreq instance
+ * @devfreq:	the devfreq instance.
+ *
+ * Helper function for starting devfreq device load monitoring. By
+ * default delayed work based monitoring is supported. Function
+ * to be called from governor in response to DEVFREQ_GOV_START
+ * event when device is added to devfreq framework.
+ */
+void devfreq_monitor_start(struct devfreq *devfreq)
+{
+	INIT_DEFERRABLE_WORK(&devfreq->work, devfreq_monitor);
+	queue_delayed_work(devfreq_wq, &devfreq->work,
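
To show how a governor consumes the new helpers, here is a hedged sketch of an event handler modelled on the simple_ondemand changes in patch 2/3; devfreq_monitor_stop() is assumed to be the counterpart of devfreq_monitor_start() from the part of this patch not quoted above:

/* Hypothetical governor glue built on the new monitoring helpers. */
static int example_governor_event_handler(struct devfreq *devfreq,
					  unsigned int event, void *data)
{
	switch (event) {
	case DEVFREQ_GOV_START:
		devfreq_monitor_start(devfreq);	/* queue per-device work */
		break;
	case DEVFREQ_GOV_STOP:
		devfreq_monitor_stop(devfreq);	/* assumed counterpart helper */
		break;
	case DEVFREQ_GOV_INTERVAL:
		devfreq_interval_update(devfreq, (unsigned int *)data);
		break;
	default:
		break;
	}
	return 0;
}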

[RFC PATCH 09/10] sched: Add HMP task migration ftrace event

2012-09-21 Thread morten . rasmussen
From: Morten Rasmussen <morten.rasmus...@arm.com>

Adds an ftrace event for tracing task migrations using HMP
optimized scheduling.

Signed-off-by: Morten Rasmussen <morten.rasmus...@arm.com>
---
 include/trace/events/sched.h |   28 
 kernel/sched/fair.c  |   15 +++
 2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 847eb76..501aa32 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -555,6 +555,34 @@ TRACE_EVENT(sched_task_usage_ratio,
 		__entry->comm, __entry->pid,
 		__entry->ratio)
 );
+
+/*
+ * Tracepoint for HMP (CONFIG_SCHED_HMP) task migrations.
+ */
+TRACE_EVENT(sched_hmp_migrate,
+
+	TP_PROTO(struct task_struct *tsk, int dest, int force),
+
+	TP_ARGS(tsk, dest, force),
+
+	TP_STRUCT__entry(
+		__array(char, comm, TASK_COMM_LEN)
+		__field(pid_t, pid)
+		__field(int,  dest)
+		__field(int,  force)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
+		__entry->pid   = tsk->pid;
+		__entry->dest  = dest;
+		__entry->force = force;
+	),
+
+	TP_printk("comm=%s pid=%d dest=%d force=%d",
+		__entry->comm, __entry->pid,
+		__entry->dest, __entry->force)
+);
 #endif /* _TRACE_SCHED_H */
 
 /* This part must be outside protection */
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0be53be..811b2b9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -,10 +,16 @@ unlock:
 	rcu_read_unlock();
 
 #ifdef CONFIG_SCHED_HMP
-	if (hmp_up_migration(prev_cpu, &p->se))
-		return hmp_select_faster_cpu(p, prev_cpu);
-	if (hmp_down_migration(prev_cpu, &p->se))
-		return hmp_select_slower_cpu(p, prev_cpu);
+	if (hmp_up_migration(prev_cpu, &p->se)) {
+		new_cpu = hmp_select_faster_cpu(p, prev_cpu);
+		trace_sched_hmp_migrate(p, new_cpu, 0);
+		return new_cpu;
+	}
+	if (hmp_down_migration(prev_cpu, &p->se)) {
+		new_cpu = hmp_select_slower_cpu(p, prev_cpu);
+		trace_sched_hmp_migrate(p, new_cpu, 0);
+		return new_cpu;
+	}
 	/* Make sure that the task stays in its previous hmp domain */
 	if (!cpumask_test_cpu(new_cpu, &hmp_cpu_domain(prev_cpu)->cpus))
 		return prev_cpu;
@@ -5718,6 +5724,7 @@ static void hmp_force_up_migration(int this_cpu)
 				target->push_cpu = hmp_select_faster_cpu(p, cpu);
 				target->migrate_task = p;
 				force = 1;
+				trace_sched_hmp_migrate(p, target->push_cpu, 1);
 			}
 		}
 		raw_spin_unlock_irqrestore(&target->lock, flags);
-- 
1.7.9.5





[RFC PATCH 05/10] ARM: Add HMP scheduling support for ARM architecture

2012-09-21 Thread morten . rasmussen
From: Morten Rasmussen <morten.rasmus...@arm.com>

Adds Kconfig entries to enable HMP scheduling on ARM platforms.
Currently, it disables CPU level sched_domain load-balancing in order
to simplify things. This needs fixing in a later revision. HMP
scheduling will do the load-balancing at this level instead.

Signed-off-by: Morten Rasmussen <morten.rasmus...@arm.com>
---
 arch/arm/Kconfig|   14 ++
 arch/arm/include/asm/topology.h |   32 
 2 files changed, 46 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 05de193..cb80846 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1584,6 +1584,20 @@ config SCHED_HMP_PRIO_FILTER_VAL
 	default 5
 	depends on SCHED_HMP_PRIO_FILTER
 
+config HMP_FAST_CPU_MASK
+	string "HMP scheduler fast CPU mask"
+	depends on SCHED_HMP
+	help
+	  Specify the cpuids of the fast CPUs in the system as a list string,
+	  e.g. cpuid 0+1 should be specified as 0-1.
+
+config HMP_SLOW_CPU_MASK
+	string "HMP scheduler slow CPU mask"
+	depends on SCHED_HMP
+	help
+	  Specify the cpuids of the slow CPUs in the system as a list string,
+	  e.g. cpuid 0+1 should be specified as 0-1.
+
 config HAVE_ARM_SCU
 	bool
 	help
diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
index 58b8b84..13a03de 100644
--- a/arch/arm/include/asm/topology.h
+++ b/arch/arm/include/asm/topology.h
@@ -27,6 +27,38 @@ void init_cpu_topology(void);
 void store_cpu_topology(unsigned int cpuid);
 const struct cpumask *cpu_coregroup_mask(int cpu);
 
+#ifdef CONFIG_DISABLE_CPU_SCHED_DOMAIN_BALANCE
+/* Common values for CPUs */
+#ifndef SD_CPU_INIT
+#define SD_CPU_INIT (struct sched_domain) {\
+   .min_interval   = 1,\
+   .max_interval   = 4,\
+   .busy_factor= 64,   \
+   .imbalance_pct  = 125,  \
+   .cache_nice_tries   = 1,\
+   .busy_idx   = 2,\
+   .idle_idx   = 1,\
+   .newidle_idx= 0,\
+   .wake_idx   = 0,\
+   .forkexec_idx   = 0,\
+   \
+   .flags  = 0*SD_LOAD_BALANCE \
+   | 1*SD_BALANCE_NEWIDLE  \
+   | 1*SD_BALANCE_EXEC \
+   | 1*SD_BALANCE_FORK \
+   | 0*SD_BALANCE_WAKE \
+   | 1*SD_WAKE_AFFINE  \
+   | 0*SD_PREFER_LOCAL \
+   | 0*SD_SHARE_CPUPOWER   \
+   | 0*SD_SHARE_PKG_RESOURCES  \
+   | 0*SD_SERIALIZE\
+   ,   \
+   .last_balance= jiffies, \
+   .balance_interval   = 1,\
+}
+#endif
+#endif /* CONFIG_DISABLE_CPU_SCHED_DOMAIN_BALANCE */
+
 #else
 
 static inline void init_cpu_topology(void) { }
-- 
1.7.9.5





[RFC PATCH 08/10] sched: Add ftrace events for entity load-tracking

2012-09-21 Thread morten . rasmussen
From: Morten Rasmussen <morten.rasmus...@arm.com>

Adds ftrace events for key variables related to the entity
load-tracking to help with debugging scheduler behaviour. Allows
tracing of load contribution and runqueue residency ratio for both
entities and runqueues, as well as entity CPU usage ratio.

Signed-off-by: Morten Rasmussen <morten.rasmus...@arm.com>
---
 include/trace/events/sched.h |  125 ++
 kernel/sched/fair.c  |7 +++
 2 files changed, 132 insertions(+)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 5a8671e..847eb76 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -430,6 +430,131 @@ TRACE_EVENT(sched_pi_setprio,
 		__entry->oldprio, __entry->newprio)
 );
 
+/*
+ * Tracepoint for showing tracked load contribution.
+ */
+TRACE_EVENT(sched_task_load_contrib,
+
+	TP_PROTO(struct task_struct *tsk, unsigned long load_contrib),
+
+	TP_ARGS(tsk, load_contrib),
+
+	TP_STRUCT__entry(
+		__array(char, comm, TASK_COMM_LEN)
+		__field(pid_t, pid)
+		__field(unsigned long, load_contrib)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
+		__entry->pid		= tsk->pid;
+		__entry->load_contrib	= load_contrib;
+	),
+
+	TP_printk("comm=%s pid=%d load_contrib=%lu",
+		__entry->comm, __entry->pid,
+		__entry->load_contrib)
+);
+
+/*
+ * Tracepoint for showing tracked task runnable ratio [0..1023].
+ */
+TRACE_EVENT(sched_task_runnable_ratio,
+
+	TP_PROTO(struct task_struct *tsk, unsigned long ratio),
+
+	TP_ARGS(tsk, ratio),
+
+	TP_STRUCT__entry(
+		__array(char, comm, TASK_COMM_LEN)
+		__field(pid_t, pid)
+		__field(unsigned long, ratio)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
+		__entry->pid   = tsk->pid;
+		__entry->ratio = ratio;
+	),
+
+	TP_printk("comm=%s pid=%d ratio=%lu",
+		__entry->comm, __entry->pid,
+		__entry->ratio)
+);
+
+/*
+ * Tracepoint for showing tracked rq runnable ratio [0..1023].
+ */
+TRACE_EVENT(sched_rq_runnable_ratio,
+
+	TP_PROTO(int cpu, unsigned long ratio),
+
+	TP_ARGS(cpu, ratio),
+
+	TP_STRUCT__entry(
+		__field(int, cpu)
+		__field(unsigned long, ratio)
+	),
+
+	TP_fast_assign(
+		__entry->cpu   = cpu;
+		__entry->ratio = ratio;
+	),
+
+	TP_printk("cpu=%d ratio=%lu",
+		__entry->cpu,
+		__entry->ratio)
+);
+
+/*
+ * Tracepoint for showing tracked rq runnable load.
+ */
+TRACE_EVENT(sched_rq_runnable_load,
+
+	TP_PROTO(int cpu, u64 load),
+
+	TP_ARGS(cpu, load),
+
+	TP_STRUCT__entry(
+		__field(int, cpu)
+		__field(u64, load)
+	),
+
+	TP_fast_assign(
+		__entry->cpu  = cpu;
+		__entry->load = load;
+	),
+
+	TP_printk("cpu=%d load=%llu",
+		__entry->cpu,
+		__entry->load)
+);
+
+/*
+ * Tracepoint for showing tracked task cpu usage ratio [0..1023].
+ */
+TRACE_EVENT(sched_task_usage_ratio,
+
+	TP_PROTO(struct task_struct *tsk, unsigned long ratio),
+
+	TP_ARGS(tsk, ratio),
+
+	TP_STRUCT__entry(
+		__array(char, comm, TASK_COMM_LEN)
+		__field(pid_t, pid)
+		__field(unsigned long, ratio)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
+		__entry->pid   = tsk->pid;
+		__entry->ratio = ratio;
+	),
+
+	TP_printk("comm=%s pid=%d ratio=%lu",
+		__entry->comm, __entry->pid,
+		__entry->ratio)
+);
 #endif /* _TRACE_SCHED_H */
 
 /* This part must be outside protection */
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8f0f3b9..0be53be 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1192,9 +1192,11 @@ static inline void __update_task_entity_contrib(struct sched_entity *se)
 	contrib = se->avg.runnable_avg_sum * scale_load_down(se->load.weight);
 	contrib /= (se->avg.runnable_avg_period + 1);
 	se->avg.load_avg_contrib = scale_load(contrib);
+	trace_sched_task_load_contrib(task_of(se), se->avg.load_avg_contrib);
 	contrib = se->avg.runnable_avg_sum * scale_load_down(NICE_0_LOAD);
 	contrib /= (se->avg.runnable_avg_period + 1);
 	se->avg.load_avg_ratio = scale_load(contrib);
+	trace_sched_task_runnable_ratio(task_of(se), se->avg.load_avg_ratio);
 }
 
 /* Compute the current contribution to load_avg by se, return any delta */
@@ -1286,9 +1288,14 @@ static void update_cfs_rq_blocked_load(struct cfs_rq *cfs_rq, int

[RFC PATCH 00/10] sched: Task placement for heterogeneous MP systems

2012-09-21 Thread morten . rasmussen
From: Morten Rasmussen <morten.rasmus...@arm.com>

Hi Paul, Paul, Peter, Suresh, linaro-sched-sig, and LKML,

As a follow-up on my Linux Plumbers Conference talk about my experiments with
scheduling on heterogeneous systems, I'm posting a proof-of-concept patch set
with my modifications. The intention behind the modifications is to tweak
scheduling behaviour to only use fast (and power-hungry) cores when it is
necessary, and also to improve performance consistency. Without the
modifications it is more or less random where tasks are scheduled, and so is
the execution time.

I'm seeing good improvements in performance consistency for web browsing on
Android using Bbench (http://www.gem5.org/Bbench) on the ARM big.LITTLE TC2
chip, which has two fast cores (Cortex-A15) and three power-efficient cores
(Cortex-A7). The total execution time numbers below are for Android's
SurfaceFlinger process, which is key for page rendering performance. The
average execution time is lower with the patches enabled and the standard
deviation is much smaller. Similar improvements can be seen for the
Android.Browser and WebViewCoreThread processes.

Total execution time statistics based on 50 runs.

SurfaceFlinger  SMP kernel [s]  HMP modifications [s]
--
Average 14.617  11.012
St. Dev. 4.577   0.902
10% Pctl.9.343  10.783
90% Pctl.   18.743  11.695

Unfortunately, I cannot share power-efficiency numbers at this stage.

This patch set introduces proof-of-concept scheduler modifications which
attempt to improve scheduling decisions on heterogeneous multi-processor
systems (HMP), such as ARM big.LITTLE systems. The patch set relies on the
entity load-tracking re-work patch set by Paul Turner:

https://lkml.org/lkml/2012/8/23/267

The modifications attempt to migrate tasks between cores with different
compute capacity depending on the tracked load and priority. The aim is
to only use fast cores for tasks which really need the extra performance,
and thereby improve power consumption by running everything else on the
slow cores.

The patch set introduces hmp_domains to represent the different types of cores
that are available on the given platform. Multiple (>2) hmp_domains are
supported but not tested. hmp_domains must be set up by platform code, and
the patch set includes patches for ARM platforms using device-tree.

The patches intentionally try to avoid modifying the existing code paths
as much as possible. The aim is to experiment with HMP scheduling and get
the overall policy right before integrating it properly with the existing
load-balancer.

Morten

Morten Rasmussen (10):
  sched: entity load-tracking load_avg_ratio
  sched: Task placement for heterogeneous systems based on task
load-tracking
  sched: Forced task migration on heterogeneous systems
  sched: Introduce priority-based task migration filter
  ARM: Add HMP scheduling support for ARM architecture
  ARM: sched: Use device-tree to provide fast/slow CPU list for HMP
  ARM: sched: Setup SCHED_HMP domains
  sched: Add ftrace events for entity load-tracking
  sched: Add HMP task migration ftrace event
  sched: SCHED_HMP multi-domain task migration control

 arch/arm/Kconfig|   46 +
 arch/arm/include/asm/topology.h |   32 +++
 arch/arm/kernel/topology.c  |   91 
 include/linux/sched.h   |   11 +
 include/trace/events/sched.h|  153 ++
 kernel/sched/core.c |4 +
 kernel/sched/fair.c |  434 ++-
 kernel/sched/sched.h|9 +
 8 files changed, 779 insertions(+), 1 deletion(-)

-- 
1.7.9.5





[RFC PATCH 07/10] ARM: sched: Setup SCHED_HMP domains

2012-09-21 Thread morten . rasmussen
From: Morten Rasmussen <morten.rasmus...@arm.com>

SCHED_HMP requires the different cpu types to be represented by an
ordered list of hmp_domains. Each hmp_domain represents all cpus of
a particular type using a cpumask.

The list is platform specific and therefore must be generated by
platform code by implementing arch_get_hmp_domains().

Signed-off-by: Morten Rasmussen <morten.rasmus...@arm.com>
---
 arch/arm/kernel/topology.c |   22 ++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 7682e12..ec8ad5c 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -383,6 +383,28 @@ void __init arch_get_fast_and_slow_cpus(struct cpumask *fast,
 	cpumask_clear(slow);
 }
 
+void __init arch_get_hmp_domains(struct list_head *hmp_domains_list)
+{
+	struct cpumask hmp_fast_cpu_mask;
+	struct cpumask hmp_slow_cpu_mask;
+	struct hmp_domain *domain;
+
+	arch_get_fast_and_slow_cpus(&hmp_fast_cpu_mask, &hmp_slow_cpu_mask);
+
+	/*
+	 * Initialize hmp_domains
+	 * Must be ordered with respect to compute capacity.
+	 * Fastest domain at head of list.
+	 */
+	domain = (struct hmp_domain *)
+			kmalloc(sizeof(struct hmp_domain), GFP_KERNEL);
+	cpumask_copy(&domain->cpus, &hmp_slow_cpu_mask);
+	list_add(&domain->hmp_domains, hmp_domains_list);
+	domain = (struct hmp_domain *)
+			kmalloc(sizeof(struct hmp_domain), GFP_KERNEL);
+	cpumask_copy(&domain->cpus, &hmp_fast_cpu_mask);
+	list_add(&domain->hmp_domains, hmp_domains_list);
+}
 #endif /* CONFIG_SCHED_HMP */
 
 
-- 
1.7.9.5





[RFC PATCH 06/10] ARM: sched: Use device-tree to provide fast/slow CPU list for HMP

2012-09-21 Thread morten . rasmussen
From: Morten Rasmussen <morten.rasmus...@arm.com>

We can't rely on Kconfig options to set the fast and slow CPU lists for
HMP scheduling if we want a single kernel binary to support multiple
devices with different CPU topologies, e.g. TC2 (ARM's Test-Chip-2
big.LITTLE system), Fast Models, or even non-big.LITTLE devices.

This patch adds the function arch_get_fast_and_slow_cpus() to generate
the lists at run-time by parsing the CPU nodes in the device-tree; it
assumes slow cores are A7s and everything else is fast. The function
still supports the old Kconfig options, as these are useful for testing
the HMP scheduler on devices without big.LITTLE.

This patch is a reuse of a patch by Jon Medhurst <t...@linaro.org> with a
few bits left out.

Signed-off-by: Morten Rasmussen <morten.rasmus...@arm.com>
---
 arch/arm/Kconfig   |4 ++-
 arch/arm/kernel/topology.c |   69 
 2 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index cb80846..f1271bc 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1588,13 +1588,15 @@ config HMP_FAST_CPU_MASK
 	string "HMP scheduler fast CPU mask"
 	depends on SCHED_HMP
 	help
-	  Specify the cpuids of the fast CPUs in the system as a list string,
+	  Leave empty to use device tree information.
+	  Specify the cpuids of the fast CPUs in the system as a list string,
 	  e.g. cpuid 0+1 should be specified as 0-1.
 
 config HMP_SLOW_CPU_MASK
 	string "HMP scheduler slow CPU mask"
 	depends on SCHED_HMP
 	help
+	  Leave empty to use device tree information.
 	  Specify the cpuids of the slow CPUs in the system as a list string,
 	  e.g. cpuid 0+1 should be specified as 0-1.
 
diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 26c12c6..7682e12 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -317,6 +317,75 @@ void store_cpu_topology(unsigned int cpuid)
cpu_topology[cpuid].socket_id, mpidr);
 }
 
+
+#ifdef CONFIG_SCHED_HMP
+
+static const char * const little_cores[] = {
+	"arm,cortex-a7",
+	NULL,
+};
+
+static bool is_little_cpu(struct device_node *cn)
+{
+	const char * const *lc;
+	for (lc = little_cores; *lc; lc++)
+		if (of_device_is_compatible(cn, *lc))
+			return true;
+	return false;
+}
+
+void __init arch_get_fast_and_slow_cpus(struct cpumask *fast,
+					struct cpumask *slow)
+{
+	struct device_node *cn = NULL;
+	int cpu = 0;
+
+	cpumask_clear(fast);
+	cpumask_clear(slow);
+
+	/*
+	 * Use the config options if they are given. This helps testing
+	 * HMP scheduling on systems without a big.LITTLE architecture.
+	 */
+	if (strlen(CONFIG_HMP_FAST_CPU_MASK) && strlen(CONFIG_HMP_SLOW_CPU_MASK)) {
+		if (cpulist_parse(CONFIG_HMP_FAST_CPU_MASK, fast))
+			WARN(1, "Failed to parse HMP fast cpu mask!\n");
+		if (cpulist_parse(CONFIG_HMP_SLOW_CPU_MASK, slow))
+			WARN(1, "Failed to parse HMP slow cpu mask!\n");
+		return;
+	}
+
+	/*
+	 * Else, parse device tree for little cores.
+	 */
+	while ((cn = of_find_node_by_type(cn, "cpu"))) {
+
+		if (cpu >= num_possible_cpus())
+			break;
+
+		if (is_little_cpu(cn))
+			cpumask_set_cpu(cpu, slow);
+		else
+			cpumask_set_cpu(cpu, fast);
+
+		cpu++;
+	}
+
+	if (!cpumask_empty(fast) && !cpumask_empty(slow))
+		return;
+
+	/*
+	 * We didn't find both big and little cores so let's call all cores
+	 * fast as this will keep the system running, with all cores being
+	 * treated equal.
+	 */
+	cpumask_setall(fast);
+	cpumask_clear(slow);
+}
+
+#endif /* CONFIG_SCHED_HMP */
+
+
 /*
  * init_cpu_topology is called at boot when only one cpu is running
  * which prevent simultaneous write access to cpu_topology array
-- 
1.7.9.5





[RFC PATCH 04/10] sched: Introduce priority-based task migration filter

2012-09-21 Thread morten . rasmussen
From: Morten Rasmussen <morten.rasmus...@arm.com>

Introduces a priority threshold which prevents low-priority tasks
from migrating to faster hmp_domains (cpus). This is useful for
user-space software which assigns lower task priority to background
tasks.

Signed-off-by: Morten Rasmussen <morten.rasmus...@arm.com>
---
 arch/arm/Kconfig|   13 +
 kernel/sched/fair.c |   15 +++
 2 files changed, 28 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 5b09684..05de193 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1571,6 +1571,19 @@ config SCHED_HMP
 	  !SCHED_AUTOGROUP. Furthermore, normal load-balancing must be disabled
 	  between cpus of different type (DISABLE_CPU_SCHED_DOMAIN_BALANCE).
 
+config SCHED_HMP_PRIO_FILTER
+	bool "(EXPERIMENTAL) Filter HMP migrations by task priority"
+	depends on SCHED_HMP
+	help
+	  Enables task priority based HMP migration filter. Any task with
+	  a NICE value above the threshold will always be on low-power cpus
+	  with less compute capacity.
+
+config SCHED_HMP_PRIO_FILTER_VAL
+	int "NICE priority threshold"
+	default 5
+	depends on SCHED_HMP_PRIO_FILTER
+
 config HAVE_ARM_SCU
 	bool
 	help
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 490f1f0..8f0f3b9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3129,9 +3129,12 @@ static int __init hmp_cpu_mask_setup(void)
  * hmp_down_threshold: max. load allowed for tasks migrating to a slower cpu
  * The default values (512, 256) offer good responsiveness, but may need
  * tweaking to suit particular needs.
+ *
+ * hmp_up_prio: only up-migrate tasks with a high priority (<hmp_up_prio)
  */
 unsigned int hmp_up_threshold = 512;
 unsigned int hmp_down_threshold = 256;
+unsigned int hmp_up_prio = NICE_TO_PRIO(CONFIG_SCHED_HMP_PRIO_FILTER_VAL);
 
 static unsigned int hmp_up_migration(int cpu, struct sched_entity *se);
 static unsigned int hmp_down_migration(int cpu, struct sched_entity *se);
@@ -5491,6 +5494,12 @@ static unsigned int hmp_up_migration(int cpu, struct sched_entity *se)
 	if (hmp_cpu_is_fastest(cpu))
 		return 0;
 
+#ifdef CONFIG_SCHED_HMP_PRIO_FILTER
+	/* Filter by task priority */
+	if (p->prio >= hmp_up_prio)
+		return 0;
+#endif
+
 	if (cpumask_intersects(&hmp_faster_domain(cpu)->cpus,
 			       tsk_cpus_allowed(p))
 	    && se->avg.load_avg_ratio > hmp_up_threshold) {
@@ -5507,6 +5516,12 @@ static unsigned int hmp_down_migration(int cpu, struct sched_entity *se)
 	if (hmp_cpu_is_slowest(cpu))
 		return 0;
 
+#ifdef CONFIG_SCHED_HMP_PRIO_FILTER
+	/* Filter by task priority */
+	if (p->prio >= hmp_up_prio)
+		return 1;
+#endif
+
 	if (cpumask_intersects(&hmp_slower_domain(cpu)->cpus,
 			       tsk_cpus_allowed(p))
 	    && se->avg.load_avg_ratio < hmp_down_threshold) {
-- 
1.7.9.5
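
A worked example of the threshold, assuming the kernel's standard NICE_TO_PRIO() mapping (MAX_RT_PRIO + nice + 20, with MAX_RT_PRIO = 100):

/*
 * With the default CONFIG_SCHED_HMP_PRIO_FILTER_VAL = 5:
 *
 *	hmp_up_prio = NICE_TO_PRIO(5) = 100 + 5 + 20 = 125
 *
 * A normal task's p->prio is 120 + nice, so the filter keeps every
 * task with nice >= 5 (p->prio >= 125) on the low-power cpus and
 * down-migrates it immediately, regardless of its tracked load.
 */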





[RFC PATCH 03/10] sched: Forced task migration on heterogeneous systems

2012-09-21 Thread morten . rasmussen
From: Morten Rasmussen <morten.rasmus...@arm.com>

This patch introduces forced task migration for moving suitable
currently running tasks between hmp_domains. Task behaviour is likely
to change over time. Tasks running in a less capable hmp_domain may
become more demanding and should therefore be migrated up. They are
unlikely to go through the select_task_rq_fair() path anytime
soon and therefore need special attention.

This patch introduces a periodic check (SCHED_TICK) of the currently
running task on all runqueues and sets up a forced migration using
stop_machine_no_wait() if the task needs to be migrated.

Ideally, this should not be implemented by polling all runqueues.

Signed-off-by: Morten Rasmussen <morten.rasmus...@arm.com>
---
 kernel/sched/fair.c  |  196 +-
 kernel/sched/sched.h |3 +
 2 files changed, 198 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d80de46..490f1f0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3744,7 +3744,6 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	 * 1) task is cache cold, or
 	 * 2) too many balance attempts have failed.
 	 */
-
 	tsk_cache_hot = task_hot(p, env->src_rq->clock_task, env->sd);
 	if (!tsk_cache_hot ||
 		env->sd->nr_balance_failed > env->sd->cache_nice_tries) {
@@ -5516,6 +5515,199 @@ static unsigned int hmp_down_migration(int cpu, struct 
sched_entity *se)
return 0;
 }
 
+/*
+ * hmp_can_migrate_task - may task p from runqueue rq be migrated to this_cpu?
+ * Ideally this function should be merged with can_migrate_task() to avoid
+ * redundant code.
+ */
+static int hmp_can_migrate_task(struct task_struct *p, struct lb_env *env)
+{
+   int tsk_cache_hot = 0;
+
+   /*
+* We do not migrate tasks that are:
+* 1) running (obviously), or
+* 2) cannot be migrated to this CPU due to cpus_allowed
+*/
+   if (!cpumask_test_cpu(env-dst_cpu, tsk_cpus_allowed(p))) {
+   schedstat_inc(p, se.statistics.nr_failed_migrations_affine);
+   return 0;
+   }
+   env-flags = ~LBF_ALL_PINNED;
+
+   if (task_running(env-src_rq, p)) {
+   schedstat_inc(p, se.statistics.nr_failed_migrations_running);
+   return 0;
+   }
+
+   /*
+* Aggressive migration if:
+* 1) task is cache cold, or
+* 2) too many balance attempts have failed.
+*/
+
+   tsk_cache_hot = task_hot(p, env-src_rq-clock_task, env-sd);
+   if (!tsk_cache_hot ||
+   env-sd-nr_balance_failed  env-sd-cache_nice_tries) {
+#ifdef CONFIG_SCHEDSTATS
+   if (tsk_cache_hot) {
+   schedstat_inc(env-sd, lb_hot_gained[env-idle]);
+   schedstat_inc(p, se.statistics.nr_forced_migrations);
+   }
+#endif
+   return 1;
+   }
+
+   return 1;
+}
+
+/*
+ * move_specific_task tries to move a specific task.
+ * Returns 1 if successful and 0 otherwise.
+ * Called with both runqueues locked.
+ */
+static int move_specific_task(struct lb_env *env, struct task_struct *pm)
+{
+   struct task_struct *p, *n;
+
+   list_for_each_entry_safe(p, n, env-src_rq-cfs_tasks, se.group_node) {
+   if (throttled_lb_pair(task_group(p), env-src_rq-cpu,
+   env-dst_cpu))
+   continue;
+
+   if (!hmp_can_migrate_task(p, env))
+   continue;
+   /* Check if we found the right task */
+   if (p != pm)
+   continue;
+
+   move_task(p, env);
+   /*
+* Right now, this is only the third place move_task()
+* is called, so we can safely collect move_task()
+* stats here rather than inside move_task().
+*/
+   schedstat_inc(env-sd, lb_gained[env-idle]);
+   return 1;
+   }
+   return 0;
+}
+
+/*
+ * hmp_active_task_migration_cpu_stop is run by cpu stopper and used to
+ * migrate a specific task from one runqueue to another.
+ * hmp_force_up_migration uses this to push a currently running task
+ * off a runqueue.
+ * Based on active_load_balance_stop_cpu and can potentially be merged.
+ */
+static int hmp_active_task_migration_cpu_stop(void *data)
+{
+   struct rq *busiest_rq = data;
+   struct task_struct *p = busiest_rq-migrate_task;
+   int busiest_cpu = cpu_of(busiest_rq);
+   int target_cpu = busiest_rq-push_cpu;
+   struct rq *target_rq = cpu_rq(target_cpu);
+   struct sched_domain *sd;
+
+   raw_spin_lock_irq(&busiest_rq->lock);
+   /* make sure the requested cpu hasn't gone down in the meantime */
+   if (unlikely(busiest_cpu != smp_processor_id() ||
+   !busiest_rq->active_balance)) {
+   goto out_unlock;
+   }
+   /* Is there any 

[RFC PATCH 02/10] sched: Task placement for heterogeneous systems based on task load-tracking

2012-09-21 Thread morten.rasmussen
From: Morten Rasmussen morten.rasmus...@arm.com

This patch introduces the basic SCHED_HMP infrastructure. Each class of
cpus is represented by a hmp_domain and tasks will only be moved between
these domains when their load profiles suggest it is beneficial.

SCHED_HMP relies heavily on the task load-tracking introduced in Paul
Turner's fair group scheduling patch set:

https://lkml.org/lkml/2012/8/23/267

SCHED_HMP requires that the platform implements arch_get_hmp_domains()
which should set up the platform specific list of hmp_domains. It is
also assumed that the platform disables SD_LOAD_BALANCE for the
appropriate sched_domains.
Task placement takes place every time a task is to be inserted into
a runqueue based on its load history. The task placement decision is
based on load thresholds.
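
For illustration only, a hypothetical platform implementation of
arch_get_hmp_domains() could look like the sketch below (the cpu
numbering is made up; the list must end up ordered fastest domain
first, and list_add() inserts at the head, so the fast domain is
added last):

	/* Hypothetical platform code: CPUs 0-1 fast, CPUs 2-3 slow. */
	static struct hmp_domain hmp_fast_domain, hmp_slow_domain;

	void __init arch_get_hmp_domains(struct list_head *hmp_domains_list)
	{
		cpumask_clear(&hmp_fast_domain.cpus);
		cpumask_set_cpu(0, &hmp_fast_domain.cpus);
		cpumask_set_cpu(1, &hmp_fast_domain.cpus);

		cpumask_clear(&hmp_slow_domain.cpus);
		cpumask_set_cpu(2, &hmp_slow_domain.cpus);
		cpumask_set_cpu(3, &hmp_slow_domain.cpus);

		/* Add the slow domain first so the fast one ends up at
		 * the list head. */
		list_add(&hmp_slow_domain.hmp_domains, hmp_domains_list);
		list_add(&hmp_fast_domain.hmp_domains, hmp_domains_list);
	}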

There are no restrictions on the number of hmp_domains; however,
multiple (>2) has not been tested and the up/down migration policy is
rather simple.

Signed-off-by: Morten Rasmussen morten.rasmus...@arm.com
---
 arch/arm/Kconfig  |   17 +
 include/linux/sched.h |6 ++
 kernel/sched/fair.c   |  168 +
 kernel/sched/sched.h  |6 ++
 4 files changed, 197 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index f4a5d58..5b09684 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1554,6 +1554,23 @@ config SCHED_SMT
  MultiThreading at a cost of slightly increased overhead in some
  places. If unsure say N here.
 
+config DISABLE_CPU_SCHED_DOMAIN_BALANCE
+   bool "(EXPERIMENTAL) Disable CPU level scheduler load-balancing"
+   help
+ Disables scheduler load-balancing at CPU sched domain level.
+
+config SCHED_HMP
+   bool "(EXPERIMENTAL) Heterogenous multiprocessor scheduling"
+   depends on DISABLE_CPU_SCHED_DOMAIN_BALANCE && SCHED_MC && FAIR_GROUP_SCHED && !SCHED_AUTOGROUP
+   help
+ Experimental scheduler optimizations for heterogeneous platforms.
+ Attempts to introspectively select task affinity to optimize power
+ and performance. Basic support for multiple (>2) cpu types is in place,
+ but it has only been tested with two types of cpus.
+ There is currently no support for migration of task groups, hence
+ !SCHED_AUTOGROUP. Furthermore, normal load-balancing must be disabled
+ between cpus of different type (DISABLE_CPU_SCHED_DOMAIN_BALANCE).
+
 config HAVE_ARM_SCU
bool
help
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 81e4e82..df971a3 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1039,6 +1039,12 @@ unsigned long default_scale_smt_power(struct sched_domain *sd, int cpu);
 
 bool cpus_share_cache(int this_cpu, int that_cpu);
 
+#ifdef CONFIG_SCHED_HMP
+struct hmp_domain {
+   struct cpumask cpus;
+   struct list_head hmp_domains;
+};
+#endif /* CONFIG_SCHED_HMP */
 #else /* CONFIG_SMP */
 
 struct sched_domain_attr;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3e17dd5..d80de46 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3077,6 +3077,125 @@ static int select_idle_sibling(struct task_struct *p, int target)
return target;
 }
 
+#ifdef CONFIG_SCHED_HMP
+/*
+ * Heterogenous multiprocessor (HMP) optimizations
+ *
+ * The cpu types are distinguished using a list of hmp_domains
+ * which each represent one cpu type using a cpumask.
+ * The list is assumed ordered by compute capacity with the
+ * fastest domain first.
+ */
+DEFINE_PER_CPU(struct hmp_domain *, hmp_cpu_domain);
+
+extern void __init arch_get_hmp_domains(struct list_head *hmp_domains_list);
+
+/* Setup hmp_domains */
+static int __init hmp_cpu_mask_setup(void)
+{
+   char buf[64];
+   struct hmp_domain *domain;
+   struct list_head *pos;
+   int dc, cpu;
+
+   pr_debug("Initializing HMP scheduler:\n");
+
+   /* Initialize hmp_domains using platform code */
+   arch_get_hmp_domains(&hmp_domains);
+   if (list_empty(&hmp_domains)) {
+   pr_debug("HMP domain list is empty!\n");
+   return 0;
+   }
+
+   /* Print hmp_domains */
+   dc = 0;
+   list_for_each(pos, &hmp_domains) {
+   domain = list_entry(pos, struct hmp_domain, hmp_domains);
+   cpulist_scnprintf(buf, 64, &domain->cpus);
+   pr_debug("  HMP domain %d: %s\n", dc, buf);
+
+   for_each_cpu_mask(cpu, domain->cpus) {
+   per_cpu(hmp_cpu_domain, cpu) = domain;
+   }
+   dc++;
+   }
+
+   return 1;
+}
+
+/*
+ * Migration thresholds should be in the range [0..1023]
+ * hmp_up_threshold: min. load required for migrating tasks to a faster cpu
+ * hmp_down_threshold: max. load allowed for tasks migrating to a slower cpu
+ * The default values (512, 256) offer good responsiveness, but may need
+ * tweaking to suit particular needs.
+ */
+unsigned int hmp_up_threshold = 512;
+unsigned 

[RFC PATCH 10/10] sched: SCHED_HMP multi-domain task migration control

2012-09-21 Thread morten.rasmussen
From: Morten Rasmussen morten.rasmus...@arm.com

We need a way to prevent tasks that are migrating up and down the
hmp_domains from migrating straight on through before the load has
adapted to the new compute capacity of the CPU on the new hmp_domain.
This patch adds a next up/down migration delay that prevents the task
from doing another migration in the same direction until the delay
has expired.
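
To put the units in perspective (using the comment in the hunk below
that 1024 units ~= 1 ms, and doing the arithmetic here): with
hmp_next_up_threshold = 4096, the ns clock delta is shifted right by
10 bits before the compare, so an up migration attempted 3 ms after
the previous one gives 3000000 >> 10 = 2929 < 4096 and is blocked,
while one attempted after 5 ms gives 5000000 >> 10 = 4882 >= 4096 and
is allowed.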

Signed-off-by: Morten Rasmussen morten.rasmus...@arm.com
---
 include/linux/sched.h |4 
 kernel/sched/core.c   |4 
 kernel/sched/fair.c   |   38 ++
 3 files changed, 46 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index df971a3..ca3890a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1158,6 +1158,10 @@ struct sched_avg {
s64 decay_count;
unsigned long load_avg_contrib;
unsigned long load_avg_ratio;
+#ifdef CONFIG_SCHED_HMP
+   u64 hmp_last_up_migration;
+   u64 hmp_last_down_migration;
+#endif
u32 usage_avg_sum;
 };
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 652b86b..a3b1ff6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1723,6 +1723,10 @@ static void __sched_fork(struct task_struct *p)
 #if defined(CONFIG_SMP)  defined(CONFIG_FAIR_GROUP_SCHED)
	p->se.avg.runnable_avg_period = 0;
	p->se.avg.runnable_avg_sum = 0;
+#ifdef CONFIG_SCHED_HMP
+   p->se.avg.hmp_last_up_migration = 0;
+   p->se.avg.hmp_last_down_migration = 0;
+#endif
 #endif
 #ifdef CONFIG_SCHEDSTATS
	memset(&p->se.statistics, 0, sizeof(p->se.statistics));
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 811b2b9..56cbda1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3138,10 +3138,14 @@ static int __init hmp_cpu_mask_setup(void)
  * tweaking to suit particular needs.
  *
+ * hmp_up_prio: Only up migrate task with high priority (<hmp_up_prio)
+ * hmp_next_up_threshold: Delay before next up migration (1024 ~= 1 ms)
+ * hmp_next_down_threshold: Delay before next down migration (1024 ~= 1 ms)
  */
 unsigned int hmp_up_threshold = 512;
 unsigned int hmp_down_threshold = 256;
 unsigned int hmp_up_prio = NICE_TO_PRIO(CONFIG_SCHED_HMP_PRIO_FILTER_VAL);
+unsigned int hmp_next_up_threshold = 4096;
+unsigned int hmp_next_down_threshold = 4096;
 
 static unsigned int hmp_up_migration(int cpu, struct sched_entity *se);
 static unsigned int hmp_down_migration(int cpu, struct sched_entity *se);
@@ -3204,6 +3208,21 @@ static inline unsigned int hmp_select_slower_cpu(struct task_struct *tsk,
tsk_cpus_allowed(tsk));
 }
 
+static inline void hmp_next_up_delay(struct sched_entity *se, int cpu)
+{
+   struct cfs_rq *cfs_rq = &cpu_rq(cpu)->cfs;
+
+   se->avg.hmp_last_up_migration = cfs_rq_clock_task(cfs_rq);
+   se->avg.hmp_last_down_migration = 0;
+}
+
+static inline void hmp_next_down_delay(struct sched_entity *se, int cpu)
+{
+   struct cfs_rq *cfs_rq = &cpu_rq(cpu)->cfs;
+
+   se->avg.hmp_last_down_migration = cfs_rq_clock_task(cfs_rq);
+   se->avg.hmp_last_up_migration = 0;
+}
 #endif /* CONFIG_SCHED_HMP */
 
 /*
@@ -3335,11 +3354,13 @@ unlock:
 #ifdef CONFIG_SCHED_HMP
	if (hmp_up_migration(prev_cpu, &p->se)) {
new_cpu = hmp_select_faster_cpu(p, prev_cpu);
+   hmp_next_up_delay(&p->se, new_cpu);
trace_sched_hmp_migrate(p, new_cpu, 0);
return new_cpu;
}
	if (hmp_down_migration(prev_cpu, &p->se)) {
new_cpu = hmp_select_slower_cpu(p, prev_cpu);
+   hmp_next_down_delay(&p->se, new_cpu);
trace_sched_hmp_migrate(p, new_cpu, 0);
return new_cpu;
}
@@ -5503,6 +5524,8 @@ static void nohz_idle_balance(int this_cpu, enum cpu_idle_type idle) { }
 static unsigned int hmp_up_migration(int cpu, struct sched_entity *se)
 {
struct task_struct *p = task_of(se);
+   struct cfs_rq *cfs_rq = &cpu_rq(cpu)->cfs;
+   u64 now;
 
if (hmp_cpu_is_fastest(cpu))
return 0;
@@ -5513,6 +5536,12 @@ static unsigned int hmp_up_migration(int cpu, struct sched_entity *se)
return 0;
 #endif
 
+   /* Let the task load settle before doing another up migration */
+   now = cfs_rq_clock_task(cfs_rq);
+   if (((now - se->avg.hmp_last_up_migration) >> 10)
+   < hmp_next_up_threshold)
+   return 0;
+
+   if (cpumask_intersects(&hmp_faster_domain(cpu)->cpus,
+   tsk_cpus_allowed(p))
+   && se->avg.load_avg_ratio > hmp_up_threshold) {
@@ -5525,6 +5554,8 @@ static unsigned int hmp_up_migration(int cpu, struct sched_entity *se)
 static unsigned int hmp_down_migration(int cpu, struct sched_entity *se)
 {
struct task_struct *p = task_of(se);
+   struct cfs_rq *cfs_rq = &cpu_rq(cpu)->cfs;
+   u64 now;
 
if (hmp_cpu_is_slowest(cpu))
  

[RFC PATCH 01/10] sched: entity load-tracking load_avg_ratio

2012-09-21 Thread morten.rasmussen
From: Morten Rasmussen morten.rasmus...@arm.com

This patch adds load_avg_ratio to each task. The load_avg_ratio is a
variant of load_avg_contrib which is not scaled by the task priority. It
is calculated like this:

runnable_avg_sum * NICE_0_LOAD / (runnable_avg_period + 1).
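
As a worked example (taking scale_load_down(NICE_0_LOAD) == 1024): a
task that has been runnable for roughly half of its tracked time has
runnable_avg_sum ~= (runnable_avg_period + 1) / 2, so load_avg_ratio
~= 1024 / 2 = 512 regardless of its nice level -- which matches the
default hmp_up_threshold of 512 used in the SCHED_HMP patches.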

Signed-off-by: Morten Rasmussen morten.rasmus...@arm.com
---
 include/linux/sched.h |1 +
 kernel/sched/fair.c   |3 +++
 2 files changed, 4 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4dc4990..81e4e82 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1151,6 +1151,7 @@ struct sched_avg {
u64 last_runnable_update;
s64 decay_count;
unsigned long load_avg_contrib;
+   unsigned long load_avg_ratio;
u32 usage_avg_sum;
 };
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 095d86c..3e17dd5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1192,6 +1192,9 @@ static inline void __update_task_entity_contrib(struct sched_entity *se)
	contrib = se->avg.runnable_avg_sum * scale_load_down(se->load.weight);
	contrib /= (se->avg.runnable_avg_period + 1);
	se->avg.load_avg_contrib = scale_load(contrib);
+   contrib = se->avg.runnable_avg_sum * scale_load_down(NICE_0_LOAD);
+   contrib /= (se->avg.runnable_avg_period + 1);
+   se->avg.load_avg_ratio = scale_load(contrib);
 }
 
 /* Compute the current contribution to load_avg by se, return any delta */
-- 
1.7.9.5



___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


omapconf tool publicly released

2012-09-21 Thread Mike Turquette
Hi all,

Some of you may have heard of a tool we use inside TI for debugging on
OMAP.  It's a nice userspace tool called omapconf which can inspect
many aspects of hardware state.  The tool has just been open sourced
and can be found at:

https://github.com/omapconf/omapconf

Regards,
Mike

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: omapconf tool publicly released

2012-09-21 Thread Zach Pfeffer
On 21 September 2012 15:07, Mike Turquette mturque...@linaro.org wrote:
 Hi all,

 Some of you may have heard of a tool we use inside TI for debugging on
 OMAP.  It's a nice userspace tool which can inspect many aspects of
 hardware state called omapconf.  The tool has just been open sourced
 and can be found at:

 https://github.com/omapconf/omapconf

Is there an Android.mk? Looking at
https://github.com/omapconf/omapconf it says it works on Android, but
I don't see an Android.mk to compile it with.

 Regards,
 Mike

 ___
 linaro-dev mailing list
 linaro-dev@lists.linaro.org
 http://lists.linaro.org/mailman/listinfo/linaro-dev



-- 
Zach Pfeffer
Android Platform Team Lead, Linaro Platform Teams
Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: [Gumstix-users] Linaro, Gumstix, and illegal instructions

2012-09-21 Thread Robert Nelson
On Thu, Sep 20, 2012 at 7:33 PM, Ash Charles ashchar...@gmail.com wrote:
 Like Jeff mentioned, I also saw some illegal instructions on early
 linaro builds but didn't pursue it at the time.  I just did a little
 digging online and there was some mention of ARM errata causing issues
 ( https://bugs.launchpad.net/ubuntu/+source/fakeroot/+bug/495536)

 Based on,
 https://github.com/gumstix/Gumstix-Overo-Kernel/blob/master/arch/arm/configs/overo_linaro_defconfig
 several of the errata are not set.  To be honest, I've not found a
 description of the errata other than the high-level detail mentioned
 in the kernel config so I'd love to know which are the correct ones to
 be setting for the Overo.

For omap34/35xx class hardware definitely turn on
CONFIG_ARM_ERRATA_430973=y

When using a mix of arm/thumb application binaries on this core.

Regards,

-- 
Robert Nelson
http://www.rcn-ee.com/

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: [PATCH][V2] cpuidle : rename function name __cpuidle_register_driver

2012-09-21 Thread Rafael J. Wysocki
On Thursday, September 20, 2012, Daniel Lezcano wrote:
 The function name __cpuidle_register_driver is confusing because it
 suggests, conforming to the kernel coding style, that it registers
 the driver without taking a lock. Actually, it just fills the different
 power field states with a decreasing value if the power has not been
 specified.
 
 Clarify the purpose of the function by changing its name and
 move the condition out of this function.
 
 This patch fixes nothing and does not change the behavior of the
 function. It is just for the sake of clarity.
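 
 (For example, with CONFIG_ARCH_HAS_CPU_RELAX, where
 CPUIDLE_DRIVER_STATE_START is 1: a driver with state_count == 4 and
 !drv->power_specified ends up with power_usage set to -2, -3 and -4
 for states 1..3, below the -1 already assigned to C0.)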
 
 IMHO, reading in the code:
 
 +   if (!drv->power_specified)
 +   set_power_states(drv);
 
 is much more explicit than:
 
 -   __cpuidle_register_driver(drv);
 
 Signed-off-by: Daniel Lezcano daniel.lezc...@linaro.org

Applied to the linux-next branch of the linux-pm.git tree as v3.7 material.

Thanks,
Rafael


 ---
  drivers/cpuidle/driver.c |   15 +--
  1 files changed, 9 insertions(+), 6 deletions(-)
 
 diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c
 index 424bc81..87db387 100644
 --- a/drivers/cpuidle/driver.c
 +++ b/drivers/cpuidle/driver.c
 @@ -18,9 +18,10 @@ static struct cpuidle_driver *cpuidle_curr_driver;
  DEFINE_SPINLOCK(cpuidle_driver_lock);
  int cpuidle_driver_refcount;
  
 -static void __cpuidle_register_driver(struct cpuidle_driver *drv)
 +static void set_power_states(struct cpuidle_driver *drv)
  {
   int i;
 +
   /*
    * cpuidle driver should set the drv->power_specified bit
* before registering if the driver provides
  @@ -35,10 +36,8 @@ static void __cpuidle_register_driver(struct cpuidle_driver *drv)
* an power value of -1.  So we use -2, -3, etc, for other
* c-states.
*/
  - if (!drv->power_specified) {
  - for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++)
  - drv->states[i].power_usage = -1 - i;
  - }
  + for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++)
  + drv->states[i].power_usage = -1 - i;
  }
  
  /**
 @@ -58,8 +57,12 @@ int cpuidle_register_driver(struct cpuidle_driver *drv)
    spin_unlock(&cpuidle_driver_lock);
   return -EBUSY;
   }
 - __cpuidle_register_driver(drv);
 +
  + if (!drv->power_specified)
 + set_power_states(drv);
 +
   cpuidle_curr_driver = drv;
 +
    spin_unlock(&cpuidle_driver_lock);
  
   return 0;
 


___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


NI Power Meter Results

2012-09-21 Thread Zach Pfeffer
Just wanted to share this with everyone.

I've attached the output folder that the NI instrument creates for
each test session. In the results file you'll see a text doc called
results.txt that lists the comma-delimited parameters that get
measured, followed by the measurements themselves:

Current Cycle Average,Current Cycle RMS,Current Mean (DC),Current
Negative Peak,Current Peak to Peak,Current Positive Peak,Current
RMS,Volt Cycle Average,Volt Cycle RMS,Volt Mean (DC),Volt Negative
Peak,Volt Peak to Peak,Volt Positive Peak,Volt RMS

See:

https://docs.google.com/a/linaro.org/file/d/0B3pUtxWjZbP9bFhqNGZfYzNSMWs/edit

Included in each record is a Record Number that indexes into the
report directory. Each directory is marked with an index and under
that directory is the graph associated with the data for example:

https://docs.google.com/a/linaro.org/file/d/0B3pUtxWjZbP9VnVQS3M4WWx1OVk/edit

In addition, controlling the instrument is super easy. You connect to
the box over TCP/IP, then you can send 5 single-character commands in
any order: 1,0,s,e,r

1 turns the power on
0 turns it off
s starts a measurement
e ends a measurement
r records

r is destructive, so if you send an r it erases the previous data
record. The data record does survive instrument restarts (as opposed
to having an implicit r at the start of the measurement).

At any point the existing data set can simply be uploaded.
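
For anyone scripting the box, here is a minimal client sketch in C
(the IP address and port below are placeholders -- neither is given
in this mail):

	#include <stdio.h>
	#include <unistd.h>
	#include <arpa/inet.h>
	#include <netinet/in.h>
	#include <sys/socket.h>

	static void cmd(int fd, char c)
	{
		/* each command is a single character, sent in any order */
		write(fd, &c, 1);
	}

	int main(void)
	{
		struct sockaddr_in addr = { .sin_family = AF_INET };
		int fd = socket(AF_INET, SOCK_STREAM, 0);

		addr.sin_port = htons(5025);	/* placeholder port */
		inet_pton(AF_INET, "192.168.1.10", &addr.sin_addr); /* placeholder IP */
		if (fd < 0 || connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
			perror("connect");
			return 1;
		}

		cmd(fd, '1');	/* power on */
		cmd(fd, 's');	/* start measurement */
		sleep(10);	/* run the workload */
		cmd(fd, 'e');	/* end measurement */
		cmd(fd, 'r');	/* record -- erases the previous record! */
		cmd(fd, '0');	/* power off */

		close(fd);
		return 0;
	}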

One minor point. This instrument produces a lot of data; instead of
moving all of it around, the instrument can be configured to do all
the measurement analysis itself, making the analyzed data set easier
to understand and faster to upload.

Comments and questions welcome.

See it in action at:
https://plus.google.com/u/0/104422661029399872488/posts/NU4pZ36L13U

-- 
Zach Pfeffer
Android Platform Team Lead, Linaro Platform Teams
Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: omapconf tool publicly released

2012-09-21 Thread Zach Pfeffer
On 21 September 2012 16:00, Marcin Juszkiewicz
marcin.juszkiew...@linaro.org wrote:
 On 21.09.2012 22:24, Zach Pfeffer wrote:
 On 21 September 2012 15:07, Mike Turquette mturque...@linaro.org wrote:

 https://github.com/omapconf/omapconf

 Is there an Android.mk? Looking at
 https://github.com/omapconf/omapconf it says it works on Android, but
 I don't see an Android.mk to compile it with.

 It is on the page:

 Build instructions and installation via ADB (Android):

 Make sure your Android device is connected to host via ADB:
 # adb kill-server
 # adb devices
 * daemon not running. starting it now *
 * daemon started successfully *
 List of devices attached
 emulator-5554   device
 # adb root

 To build and install omapconf for Android via ADB:

 make CROSS_COMPILE=arm-none-linux-gnueabi- install_android

 The OMAPCONF binary will be copied to the /data directory (a known
 writable directory) on your Android device. You may get it copied to
 a different directory by updating the Makefile at your convenience.

Thanks Marcin. I'll check it out.


 ___
 linaro-dev mailing list
 linaro-dev@lists.linaro.org
 http://lists.linaro.org/mailman/listinfo/linaro-dev



-- 
Zach Pfeffer
Android Platform Team Lead, Linaro Platform Teams
Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev