Re: [PATCH v3] PM / devfreq: Restart previous governor if new governor fails to start

2016-11-11 Thread Saravana Kannan

On 11/09/2016 04:10 PM, Chanwoo Choi wrote:

Hi,

On 2016년 11월 10일 05:34, Saravana Kannan wrote:

On 11/08/2016 06:38 PM, Chanwoo Choi wrote:

On 2016년 11월 09일 11:36, Chanwoo Choi wrote:

Hi,

On 2016년 11월 09일 10:33, Chanwoo Choi wrote:

On 2016년 11월 09일 05:52, Saravana Kannan wrote:

On 11/08/2016 01:02 AM, Chanwoo Choi wrote:

Hi,

On 2016년 11월 08일 03:57, Saravana Kannan wrote:

On 10/26/2016 05:06 PM, Chanwoo Choi wrote:

Hi,

On 2016년 10월 27일 04:17, Saravana Kannan wrote:

If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan <skan...@codeaurora.org>
---
* Fix NULL deref for real this time.
* Addressed some style preferences.

 drivers/devfreq/devfreq.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..77c41a5 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
 struct devfreq *df = to_devfreq(dev);
 int ret;
 char str_governor[DEVFREQ_NAME_LEN + 1];
-struct devfreq_governor *governor;
+const struct devfreq_governor *governor, *prev_governor;

 ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
 if (ret != 1)
@@ -944,12 +944,21 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
 goto out;
 }
 }
+prev_governor = df->governor;
 df->governor = governor;
 strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
 ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-if (ret)
+if (ret) {
 dev_warn(dev, "%s: Governor %s not started(%d)\n",
  __func__, df->governor->name, ret);
+if (prev_governor) {


I think that prev_governor is always set. You don't need to check it.
Why do check it?


Not true. Even higher up in this function, we check if df->governor != NULL. 
Simple example would be that the default governor passed in while adding the 
device isn't compiled in.


I don't understand. If device is not registered by devfreq_add_device(),
you don't change the governor by using governor_store().

If you can change the governor through governor_store(),
it means that the devfreq instance was added without any problem.
The added devfreq instance must have the sepecific governor
on df->governor variable.

So, I think that df->governor is always set and then prev_governor is always 
set.


Read the code more closely. add_device doesn't (and shouldn't) fail if the 
governor isn't present. After that, the governor could be changed from user 
space.


If governor is not present during devfreq_add_device(),
the devfreq instance is not able to register the devfreq framework.
So, the user never touch the sysfs[1] to change the governor
because the sysfs[1] is not created by devfreq framework.
[1] /sys/class/devfreq/[device name]/governor

In summary, if governor is not present during devfreq_add_device(),
the devfreq framework doesn't create tye sysfs[1] for governor.

The user-space never change the governor through sysfs[1]
because there is no any sysfs entry[1].


I checked the code of devfreq_add_device().
As you mentioned, if there is no governor,
devfreq_add_device() don't pass wihtout calling the 
devfreq->governor->even_handler().


Wrong expression / don't pass -> pass



I think it's correct as is. Just because the governor isn't there (or hasn't 
registered yet) doesn't mean the device add should fail.

That aside, do you care to ack this patch for the way the code is currently?


Reviewed-by: Chanwoo Choi <cw00.c...@samsung.com>


Thanks


After fixing the bug of devfreq_add_device(),
I'll remove the unneeded 'if statement' of prev_governor in governor_store.



As mentioned earlier, I don't think it's a bug. It's fine as is.

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v3] PM / devfreq: Restart previous governor if new governor fails to start

2016-11-11 Thread Saravana Kannan

On 11/09/2016 04:10 PM, Chanwoo Choi wrote:

Hi,

On 2016년 11월 10일 05:34, Saravana Kannan wrote:

On 11/08/2016 06:38 PM, Chanwoo Choi wrote:

On 2016년 11월 09일 11:36, Chanwoo Choi wrote:

Hi,

On 2016년 11월 09일 10:33, Chanwoo Choi wrote:

On 2016년 11월 09일 05:52, Saravana Kannan wrote:

On 11/08/2016 01:02 AM, Chanwoo Choi wrote:

Hi,

On 2016년 11월 08일 03:57, Saravana Kannan wrote:

On 10/26/2016 05:06 PM, Chanwoo Choi wrote:

Hi,

On 2016년 10월 27일 04:17, Saravana Kannan wrote:

If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan 
---
* Fix NULL deref for real this time.
* Addressed some style preferences.

 drivers/devfreq/devfreq.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..77c41a5 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
 struct devfreq *df = to_devfreq(dev);
 int ret;
 char str_governor[DEVFREQ_NAME_LEN + 1];
-struct devfreq_governor *governor;
+const struct devfreq_governor *governor, *prev_governor;

 ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
 if (ret != 1)
@@ -944,12 +944,21 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
 goto out;
 }
 }
+prev_governor = df->governor;
 df->governor = governor;
 strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
 ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-if (ret)
+if (ret) {
 dev_warn(dev, "%s: Governor %s not started(%d)\n",
  __func__, df->governor->name, ret);
+if (prev_governor) {


I think that prev_governor is always set. You don't need to check it.
Why do check it?


Not true. Even higher up in this function, we check if df->governor != NULL. 
Simple example would be that the default governor passed in while adding the 
device isn't compiled in.


I don't understand. If device is not registered by devfreq_add_device(),
you don't change the governor by using governor_store().

If you can change the governor through governor_store(),
it means that the devfreq instance was added without any problem.
The added devfreq instance must have the sepecific governor
on df->governor variable.

So, I think that df->governor is always set and then prev_governor is always 
set.


Read the code more closely. add_device doesn't (and shouldn't) fail if the 
governor isn't present. After that, the governor could be changed from user 
space.


If governor is not present during devfreq_add_device(),
the devfreq instance is not able to register the devfreq framework.
So, the user never touch the sysfs[1] to change the governor
because the sysfs[1] is not created by devfreq framework.
[1] /sys/class/devfreq/[device name]/governor

In summary, if governor is not present during devfreq_add_device(),
the devfreq framework doesn't create tye sysfs[1] for governor.

The user-space never change the governor through sysfs[1]
because there is no any sysfs entry[1].


I checked the code of devfreq_add_device().
As you mentioned, if there is no governor,
devfreq_add_device() don't pass wihtout calling the 
devfreq->governor->even_handler().


Wrong expression / don't pass -> pass



I think it's correct as is. Just because the governor isn't there (or hasn't 
registered yet) doesn't mean the device add should fail.

That aside, do you care to ack this patch for the way the code is currently?


Reviewed-by: Chanwoo Choi 


Thanks


After fixing the bug of devfreq_add_device(),
I'll remove the unneeded 'if statement' of prev_governor in governor_store.



As mentioned earlier, I don't think it's a bug. It's fine as is.

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v3] PM / devfreq: Restart previous governor if new governor fails to start

2016-11-09 Thread Saravana Kannan

On 11/08/2016 06:38 PM, Chanwoo Choi wrote:

On 2016년 11월 09일 11:36, Chanwoo Choi wrote:

Hi,

On 2016년 11월 09일 10:33, Chanwoo Choi wrote:

On 2016년 11월 09일 05:52, Saravana Kannan wrote:

On 11/08/2016 01:02 AM, Chanwoo Choi wrote:

Hi,

On 2016년 11월 08일 03:57, Saravana Kannan wrote:

On 10/26/2016 05:06 PM, Chanwoo Choi wrote:

Hi,

On 2016년 10월 27일 04:17, Saravana Kannan wrote:

If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan <skan...@codeaurora.org>
---
* Fix NULL deref for real this time.
* Addressed some style preferences.

drivers/devfreq/devfreq.c | 13 +++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..77c41a5 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
-struct devfreq_governor *governor;
+const struct devfreq_governor *governor, *prev_governor;

ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
if (ret != 1)
@@ -944,12 +944,21 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
}
+prev_governor = df->governor;
df->governor = governor;
strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-if (ret)
+if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, df->governor->name, ret);
+if (prev_governor) {


I think that prev_governor is always set. You don't need to check it.
Why do check it?


Not true. Even higher up in this function, we check if df->governor != NULL. 
Simple example would be that the default governor passed in while adding the 
device isn't compiled in.


I don't understand. If device is not registered by devfreq_add_device(),
you don't change the governor by using governor_store().

If you can change the governor through governor_store(),
it means that the devfreq instance was added without any problem.
The added devfreq instance must have the sepecific governor
on df->governor variable.

So, I think that df->governor is always set and then prev_governor is always 
set.


Read the code more closely. add_device doesn't (and shouldn't) fail if the 
governor isn't present. After that, the governor could be changed from user 
space.


If governor is not present during devfreq_add_device(),
the devfreq instance is not able to register the devfreq framework.
So, the user never touch the sysfs[1] to change the governor
because the sysfs[1] is not created by devfreq framework.
[1] /sys/class/devfreq/[device name]/governor

In summary, if governor is not present during devfreq_add_device(),
the devfreq framework doesn't create tye sysfs[1] for governor.

The user-space never change the governor through sysfs[1]
because there is no any sysfs entry[1].


I checked the code of devfreq_add_device().
As you mentioned, if there is no governor,
devfreq_add_device() don't pass wihtout calling the 
devfreq->governor->even_handler().


Wrong expression / don't pass -> pass



I think it's correct as is. Just because the governor isn't there (or 
hasn't registered yet) doesn't mean the device add should fail.


That aside, do you care to ack this patch for the way the code is currently?

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v3] PM / devfreq: Restart previous governor if new governor fails to start

2016-11-09 Thread Saravana Kannan

On 11/08/2016 06:38 PM, Chanwoo Choi wrote:

On 2016년 11월 09일 11:36, Chanwoo Choi wrote:

Hi,

On 2016년 11월 09일 10:33, Chanwoo Choi wrote:

On 2016년 11월 09일 05:52, Saravana Kannan wrote:

On 11/08/2016 01:02 AM, Chanwoo Choi wrote:

Hi,

On 2016년 11월 08일 03:57, Saravana Kannan wrote:

On 10/26/2016 05:06 PM, Chanwoo Choi wrote:

Hi,

On 2016년 10월 27일 04:17, Saravana Kannan wrote:

If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan 
---
* Fix NULL deref for real this time.
* Addressed some style preferences.

drivers/devfreq/devfreq.c | 13 +++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..77c41a5 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
-struct devfreq_governor *governor;
+const struct devfreq_governor *governor, *prev_governor;

ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
if (ret != 1)
@@ -944,12 +944,21 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
}
+prev_governor = df->governor;
df->governor = governor;
strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-if (ret)
+if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, df->governor->name, ret);
+if (prev_governor) {


I think that prev_governor is always set. You don't need to check it.
Why do check it?


Not true. Even higher up in this function, we check if df->governor != NULL. 
Simple example would be that the default governor passed in while adding the 
device isn't compiled in.


I don't understand. If device is not registered by devfreq_add_device(),
you don't change the governor by using governor_store().

If you can change the governor through governor_store(),
it means that the devfreq instance was added without any problem.
The added devfreq instance must have the sepecific governor
on df->governor variable.

So, I think that df->governor is always set and then prev_governor is always 
set.


Read the code more closely. add_device doesn't (and shouldn't) fail if the 
governor isn't present. After that, the governor could be changed from user 
space.


If governor is not present during devfreq_add_device(),
the devfreq instance is not able to register the devfreq framework.
So, the user never touch the sysfs[1] to change the governor
because the sysfs[1] is not created by devfreq framework.
[1] /sys/class/devfreq/[device name]/governor

In summary, if governor is not present during devfreq_add_device(),
the devfreq framework doesn't create tye sysfs[1] for governor.

The user-space never change the governor through sysfs[1]
because there is no any sysfs entry[1].


I checked the code of devfreq_add_device().
As you mentioned, if there is no governor,
devfreq_add_device() don't pass wihtout calling the 
devfreq->governor->even_handler().


Wrong expression / don't pass -> pass



I think it's correct as is. Just because the governor isn't there (or 
hasn't registered yet) doesn't mean the device add should fail.


That aside, do you care to ack this patch for the way the code is currently?

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v3] PM / devfreq: Restart previous governor if new governor fails to start

2016-11-08 Thread Saravana Kannan

On 11/08/2016 01:02 AM, Chanwoo Choi wrote:

Hi,

On 2016년 11월 08일 03:57, Saravana Kannan wrote:

On 10/26/2016 05:06 PM, Chanwoo Choi wrote:

Hi,

On 2016년 10월 27일 04:17, Saravana Kannan wrote:

If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan <skan...@codeaurora.org>
---
* Fix NULL deref for real this time.
* Addressed some style preferences.

   drivers/devfreq/devfreq.c | 13 +++--
   1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..77c41a5 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
   struct devfreq *df = to_devfreq(dev);
   int ret;
   char str_governor[DEVFREQ_NAME_LEN + 1];
-struct devfreq_governor *governor;
+const struct devfreq_governor *governor, *prev_governor;

   ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
   if (ret != 1)
@@ -944,12 +944,21 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
   goto out;
   }
   }
+prev_governor = df->governor;
   df->governor = governor;
   strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
   ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-if (ret)
+if (ret) {
   dev_warn(dev, "%s: Governor %s not started(%d)\n",
__func__, df->governor->name, ret);
+if (prev_governor) {


I think that prev_governor is always set. You don't need to check it.
Why do check it?


Not true. Even higher up in this function, we check if df->governor != NULL. 
Simple example would be that the default governor passed in while adding the 
device isn't compiled in.


I don't understand. If device is not registered by devfreq_add_device(),
you don't change the governor by using governor_store().

If you can change the governor through governor_store(),
it means that the devfreq instance was added without any problem.
The added devfreq instance must have the sepecific governor
on df->governor variable.

So, I think that df->governor is always set and then prev_governor is always 
set.


Read the code more closely. add_device doesn't (and shouldn't) fail if 
the governor isn't present. After that, the governor could be changed 
from user space.


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v3] PM / devfreq: Restart previous governor if new governor fails to start

2016-11-08 Thread Saravana Kannan

On 11/08/2016 01:02 AM, Chanwoo Choi wrote:

Hi,

On 2016년 11월 08일 03:57, Saravana Kannan wrote:

On 10/26/2016 05:06 PM, Chanwoo Choi wrote:

Hi,

On 2016년 10월 27일 04:17, Saravana Kannan wrote:

If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan 
---
* Fix NULL deref for real this time.
* Addressed some style preferences.

   drivers/devfreq/devfreq.c | 13 +++--
   1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..77c41a5 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
   struct devfreq *df = to_devfreq(dev);
   int ret;
   char str_governor[DEVFREQ_NAME_LEN + 1];
-struct devfreq_governor *governor;
+const struct devfreq_governor *governor, *prev_governor;

   ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
   if (ret != 1)
@@ -944,12 +944,21 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
   goto out;
   }
   }
+prev_governor = df->governor;
   df->governor = governor;
   strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
   ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-if (ret)
+if (ret) {
   dev_warn(dev, "%s: Governor %s not started(%d)\n",
__func__, df->governor->name, ret);
+if (prev_governor) {


I think that prev_governor is always set. You don't need to check it.
Why do check it?


Not true. Even higher up in this function, we check if df->governor != NULL. 
Simple example would be that the default governor passed in while adding the 
device isn't compiled in.


I don't understand. If device is not registered by devfreq_add_device(),
you don't change the governor by using governor_store().

If you can change the governor through governor_store(),
it means that the devfreq instance was added without any problem.
The added devfreq instance must have the sepecific governor
on df->governor variable.

So, I think that df->governor is always set and then prev_governor is always 
set.


Read the code more closely. add_device doesn't (and shouldn't) fail if 
the governor isn't present. After that, the governor could be changed 
from user space.


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v3] PM / devfreq: Restart previous governor if new governor fails to start

2016-11-07 Thread Saravana Kannan

On 10/26/2016 05:06 PM, Chanwoo Choi wrote:

Hi,

On 2016년 10월 27일 04:17, Saravana Kannan wrote:

If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan <skan...@codeaurora.org>
---
* Fix NULL deref for real this time.
* Addressed some style preferences.

  drivers/devfreq/devfreq.c | 13 +++--
  1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..77c41a5 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
-   struct devfreq_governor *governor;
+   const struct devfreq_governor *governor, *prev_governor;

ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
if (ret != 1)
@@ -944,12 +944,21 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
}
+   prev_governor = df->governor;
df->governor = governor;
strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-   if (ret)
+   if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, df->governor->name, ret);
+   if (prev_governor) {


I think that prev_governor is always set. You don't need to check it.
Why do check it?


Not true. Even higher up in this function, we check if df->governor != 
NULL. Simple example would be that the default governor passed in while 
adding the device isn't compiled in.



Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v3] PM / devfreq: Restart previous governor if new governor fails to start

2016-11-07 Thread Saravana Kannan

On 10/26/2016 05:06 PM, Chanwoo Choi wrote:

Hi,

On 2016년 10월 27일 04:17, Saravana Kannan wrote:

If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan 
---
* Fix NULL deref for real this time.
* Addressed some style preferences.

  drivers/devfreq/devfreq.c | 13 +++--
  1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..77c41a5 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
-   struct devfreq_governor *governor;
+   const struct devfreq_governor *governor, *prev_governor;

ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
if (ret != 1)
@@ -944,12 +944,21 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
}
+   prev_governor = df->governor;
df->governor = governor;
strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-   if (ret)
+   if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, df->governor->name, ret);
+   if (prev_governor) {


I think that prev_governor is always set. You don't need to check it.
Why do check it?


Not true. Even higher up in this function, we check if df->governor != 
NULL. Simple example would be that the default governor passed in while 
adding the device isn't compiled in.



Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v2] PM / devfreq: Restart previous governor if new governor fails to start

2016-10-26 Thread Saravana Kannan

On 10/26/2016 12:22 AM, Chanwoo Choi wrote:

Hi,

On 2016년 10월 26일 10:25, Saravana Kannan wrote:

If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan <skan...@codeaurora.org>
---
* Fixed typo in commit text
* Fixed potential NULL deref

  drivers/devfreq/devfreq.c | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..243253e 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
-   struct devfreq_governor *governor;
+   const struct devfreq_governor *governor, *prev_gov;

ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
if (ret != 1)
@@ -944,12 +944,17 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
}
+   prev_gov = df->governor;
df->governor = governor;
strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-   if (ret)
+   if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, df->governor->name, ret);
+   df->governor = prev_gov;
+   strncpy(df->governor_name, prev_gov->name, DEVFREQ_NAME_LEN);
+   df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
+   }


Sigh... looks like I added the changes to the index but forgot to commit 
it before I sent the patch. Will send v3.



  out:
mutex_unlock(_list_lock);




Looks good to me.
But, how about using the 'prev_governor' instead of 'prev_gov'?
because the variable name of current governor is 'governor' instead of 'gov'.


I was trying to avoid wrapping lines too much. But this doesn't cause 
any additional wraps. Done.




Reviewed-by: Chanwoo Choi <cw00.c...@samsung.com>



Can you please review v3?

Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v2] PM / devfreq: Restart previous governor if new governor fails to start

2016-10-26 Thread Saravana Kannan

On 10/26/2016 12:22 AM, Chanwoo Choi wrote:

Hi,

On 2016년 10월 26일 10:25, Saravana Kannan wrote:

If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan 
---
* Fixed typo in commit text
* Fixed potential NULL deref

  drivers/devfreq/devfreq.c | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..243253e 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
-   struct devfreq_governor *governor;
+   const struct devfreq_governor *governor, *prev_gov;

ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
if (ret != 1)
@@ -944,12 +944,17 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
}
+   prev_gov = df->governor;
df->governor = governor;
strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-   if (ret)
+   if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, df->governor->name, ret);
+   df->governor = prev_gov;
+   strncpy(df->governor_name, prev_gov->name, DEVFREQ_NAME_LEN);
+   df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
+   }


Sigh... looks like I added the changes to the index but forgot to commit 
it before I sent the patch. Will send v3.



  out:
mutex_unlock(_list_lock);




Looks good to me.
But, how about using the 'prev_governor' instead of 'prev_gov'?
because the variable name of current governor is 'governor' instead of 'gov'.


I was trying to avoid wrapping lines too much. But this doesn't cause 
any additional wraps. Done.




Reviewed-by: Chanwoo Choi 



Can you please review v3?

Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


[PATCH v3] PM / devfreq: Restart previous governor if new governor fails to start

2016-10-26 Thread Saravana Kannan
If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan <skan...@codeaurora.org>
---
* Fix NULL deref for real this time.
* Addressed some style preferences.

 drivers/devfreq/devfreq.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..77c41a5 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
-   struct devfreq_governor *governor;
+   const struct devfreq_governor *governor, *prev_governor;
 
ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
if (ret != 1)
@@ -944,12 +944,21 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
}
+   prev_governor = df->governor;
df->governor = governor;
strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-   if (ret)
+   if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, df->governor->name, ret);
+   if (prev_governor) {
+   df->governor = prev_governor;
+   strncpy(df->governor_name, prev_governor->name,
+   DEVFREQ_NAME_LEN);
+   df->governor->event_handler(df, DEVFREQ_GOV_START,
+   NULL);
+   }
+   }
 out:
mutex_unlock(_list_lock);
 
-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v3] PM / devfreq: Restart previous governor if new governor fails to start

2016-10-26 Thread Saravana Kannan
If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan 
---
* Fix NULL deref for real this time.
* Addressed some style preferences.

 drivers/devfreq/devfreq.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..77c41a5 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
-   struct devfreq_governor *governor;
+   const struct devfreq_governor *governor, *prev_governor;
 
ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
if (ret != 1)
@@ -944,12 +944,21 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
}
+   prev_governor = df->governor;
df->governor = governor;
strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-   if (ret)
+   if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, df->governor->name, ret);
+   if (prev_governor) {
+   df->governor = prev_governor;
+   strncpy(df->governor_name, prev_governor->name,
+   DEVFREQ_NAME_LEN);
+   df->governor->event_handler(df, DEVFREQ_GOV_START,
+   NULL);
+   }
+   }
 out:
mutex_unlock(_list_lock);
 
-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2] PM / devfreq: Restart previous governor if new governor fails to start

2016-10-25 Thread Saravana Kannan
If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan <skan...@codeaurora.org>
---
* Fixed typo in commit text
* Fixed potential NULL deref

 drivers/devfreq/devfreq.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..243253e 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
-   struct devfreq_governor *governor;
+   const struct devfreq_governor *governor, *prev_gov;
 
ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
if (ret != 1)
@@ -944,12 +944,17 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
}
+   prev_gov = df->governor;
df->governor = governor;
strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-   if (ret)
+   if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, df->governor->name, ret);
+   df->governor = prev_gov;
+   strncpy(df->governor_name, prev_gov->name, DEVFREQ_NAME_LEN);
+   df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
+   }
 out:
mutex_unlock(_list_lock);
 
-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2] PM / devfreq: Restart previous governor if new governor fails to start

2016-10-25 Thread Saravana Kannan
If the new governor fails to start, switch back to old governor so that the
devfreq state is not left in some weird limbo.

Signed-off-by: Saravana Kannan 
---
* Fixed typo in commit text
* Fixed potential NULL deref

 drivers/devfreq/devfreq.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..243253e 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
-   struct devfreq_governor *governor;
+   const struct devfreq_governor *governor, *prev_gov;
 
ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
if (ret != 1)
@@ -944,12 +944,17 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
}
+   prev_gov = df->governor;
df->governor = governor;
strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-   if (ret)
+   if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, df->governor->name, ret);
+   df->governor = prev_gov;
+   strncpy(df->governor_name, prev_gov->name, DEVFREQ_NAME_LEN);
+   df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
+   }
 out:
mutex_unlock(_list_lock);
 
-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH] PM / devfreq: Restart previous governor if new governor fails to start

2016-10-24 Thread Saravana Kannan
If the new governor fails to start, switch back to old governor so that the
devfreq state is not left if some weird limbo.

Signed-off-by: Saravana Kannan <skan...@codeaurora.org>
---
 drivers/devfreq/devfreq.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..243253e 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
-   struct devfreq_governor *governor;
+   const struct devfreq_governor *governor, *prev_gov;
 
ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
if (ret != 1)
@@ -944,12 +944,17 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
}
+   prev_gov = df->governor;
df->governor = governor;
strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-   if (ret)
+   if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, df->governor->name, ret);
+   df->governor = prev_gov;
+   strncpy(df->governor_name, prev_gov->name, DEVFREQ_NAME_LEN);
+   df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
+   }
 out:
mutex_unlock(_list_lock);
 
-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH] PM / devfreq: Restart previous governor if new governor fails to start

2016-10-24 Thread Saravana Kannan
If the new governor fails to start, switch back to old governor so that the
devfreq state is not left if some weird limbo.

Signed-off-by: Saravana Kannan 
---
 drivers/devfreq/devfreq.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index bf3ea76..243253e 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -919,7 +919,7 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
-   struct devfreq_governor *governor;
+   const struct devfreq_governor *governor, *prev_gov;
 
ret = sscanf(buf, "%" __stringify(DEVFREQ_NAME_LEN) "s", str_governor);
if (ret != 1)
@@ -944,12 +944,17 @@ static ssize_t governor_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
}
+   prev_gov = df->governor;
df->governor = governor;
strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
-   if (ret)
+   if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, df->governor->name, ret);
+   df->governor = prev_gov;
+   strncpy(df->governor_name, prev_gov->name, DEVFREQ_NAME_LEN);
+   df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
+   }
 out:
mutex_unlock(_list_lock);
 
-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [PATCH 1/7] cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event

2016-04-06 Thread Saravana Kannan

On 04/06/2016 02:45 PM, Rafael J. Wysocki wrote:

On Wed, Apr 6, 2016 at 11:29 PM, Saravana Kannan <skan...@codeaurora.org> wrote:

On 04/06/2016 02:21 PM, Rafael J. Wysocki wrote:


On Wed, Apr 6, 2016 at 10:30 PM, Saravana Kannan <skan...@codeaurora.org>
wrote:


On 09/09/2015 05:53 PM, Rafael J. Wysocki wrote:




[cut]



Well, nobody was using that event.



True, but that's more of a bug in drivers/thermal/cpu-cooling.c and
drivers/acpi/processor_thermal.c. We should revert this patch and fix those
drivers. Does that seem acceptable to you?


I'd rather see a patch series adding the event back along with some
users.  One user at least.



Ok, I'll make those two drivers use them and send it out. It's very 
clearly a bug in those drivers.


-Saravana`

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 1/7] cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event

2016-04-06 Thread Saravana Kannan

On 04/06/2016 02:45 PM, Rafael J. Wysocki wrote:

On Wed, Apr 6, 2016 at 11:29 PM, Saravana Kannan  wrote:

On 04/06/2016 02:21 PM, Rafael J. Wysocki wrote:


On Wed, Apr 6, 2016 at 10:30 PM, Saravana Kannan 
wrote:


On 09/09/2015 05:53 PM, Rafael J. Wysocki wrote:




[cut]



Well, nobody was using that event.



True, but that's more of a bug in drivers/thermal/cpu-cooling.c and
drivers/acpi/processor_thermal.c. We should revert this patch and fix those
drivers. Does that seem acceptable to you?


I'd rather see a patch series adding the event back along with some
users.  One user at least.



Ok, I'll make those two drivers use them and send it out. It's very 
clearly a bug in those drivers.


-Saravana`

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 1/7] cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event

2016-04-06 Thread Saravana Kannan

On 04/06/2016 02:21 PM, Rafael J. Wysocki wrote:

On Wed, Apr 6, 2016 at 10:30 PM, Saravana Kannan <skan...@codeaurora.org> wrote:

On 09/09/2015 05:53 PM, Rafael J. Wysocki wrote:


Hi,

On Thu, Sep 10, 2015 at 2:39 AM, Viresh Kumar <viresh.ku...@linaro.org>
wrote:


On 10-09-15, 01:26, Rafael J. Wysocki wrote:


On Monday, August 03, 2015 08:36:14 AM Viresh Kumar wrote:


What's being done from CPUFREQ_INCOMPATIBLE, can also be done with
CPUFREQ_ADJUST. There is nothing special with CPUFREQ_INCOMPATIBLE
notifier.



The above part of the changelog is a disaster to me. :-(

It not only doesn't explain what really goes on, but it's actively
confusing.

What really happens is that the core sends CPUFREQ_INCOMPATIBLE
notifications
unconditionally right after sending the CPUFREQ_ADJUST ones, so the
former is
just redundant and it's more efficient to merge the two into one.



Undoubtedly this looks far better :)

But, isn't this series already applied some time back ?



Right, never mind.  For some reason that patch was left in the "New"
state.

The code is OK.




I guess I didn't notice this change when it was sent out.

The comment that was deleted in this patch clearly states why the
INCOMPATIBLE notifier is needed. Some client might want to boost the CPU min
freq for performance or other reasons, but thermal might want to limit it.
So, by having thermal register for INCOMPATIBLE notifiers to enforce the
limits, we provide a way to guarantee it gets the final say.

The real fix should have been to change drivers/thermal/cpu_cooling.c to use
CPUFREQ_INCOMPATIBLE instead of CPUFREQ_ADJUST.

Is there something I'm missing? If not, can we please revert this patch?


Well, nobody was using that event.



True, but that's more of a bug in drivers/thermal/cpu-cooling.c and 
drivers/acpi/processor_thermal.c. We should revert this patch and fix 
those drivers. Does that seem acceptable to you?


-Saravana


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 1/7] cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event

2016-04-06 Thread Saravana Kannan

On 04/06/2016 02:21 PM, Rafael J. Wysocki wrote:

On Wed, Apr 6, 2016 at 10:30 PM, Saravana Kannan  wrote:

On 09/09/2015 05:53 PM, Rafael J. Wysocki wrote:


Hi,

On Thu, Sep 10, 2015 at 2:39 AM, Viresh Kumar 
wrote:


On 10-09-15, 01:26, Rafael J. Wysocki wrote:


On Monday, August 03, 2015 08:36:14 AM Viresh Kumar wrote:


What's being done from CPUFREQ_INCOMPATIBLE, can also be done with
CPUFREQ_ADJUST. There is nothing special with CPUFREQ_INCOMPATIBLE
notifier.



The above part of the changelog is a disaster to me. :-(

It not only doesn't explain what really goes on, but it's actively
confusing.

What really happens is that the core sends CPUFREQ_INCOMPATIBLE
notifications
unconditionally right after sending the CPUFREQ_ADJUST ones, so the
former is
just redundant and it's more efficient to merge the two into one.



Undoubtedly this looks far better :)

But, isn't this series already applied some time back ?



Right, never mind.  For some reason that patch was left in the "New"
state.

The code is OK.




I guess I didn't notice this change when it was sent out.

The comment that was deleted in this patch clearly states why the
INCOMPATIBLE notifier is needed. Some client might want to boost the CPU min
freq for performance or other reasons, but thermal might want to limit it.
So, by having thermal register for INCOMPATIBLE notifiers to enforce the
limits, we provide a way to guarantee it gets the final say.

The real fix should have been to change drivers/thermal/cpu_cooling.c to use
CPUFREQ_INCOMPATIBLE instead of CPUFREQ_ADJUST.

Is there something I'm missing? If not, can we please revert this patch?


Well, nobody was using that event.



True, but that's more of a bug in drivers/thermal/cpu-cooling.c and 
drivers/acpi/processor_thermal.c. We should revert this patch and fix 
those drivers. Does that seem acceptable to you?


-Saravana


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH V2 0/7] cpufreq: governors: Fix ABBA lockups

2016-02-09 Thread Saravana Kannan

On 02/07/2016 06:28 PM, Rafael J. Wysocki wrote:

On Friday, February 05, 2016 06:22:35 PM Saravana Kannan wrote:

On 02/04/2016 07:54 PM, Rafael J. Wysocki wrote:

On Thursday, February 04, 2016 07:18:32 PM Rafael J. Wysocki wrote:

On Thu, Feb 4, 2016 at 6:44 PM, Saravana Kannan  wrote:

On 02/04/2016 09:43 AM, Saravana Kannan wrote:


On 02/04/2016 03:09 AM, Viresh Kumar wrote:


On 04-02-16, 00:50, Rafael J. Wysocki wrote:


This is exactly right.  We've avoided one deadlock only to trip into
another one.

This happens because update_sampling_rate() acquires
od_dbs_cdata.mutex which is held around cpufreq_governor_exit() by
cpufreq_governor_dbs().

Worse yet, a deadlock can still happen without (the new)
dbs_data->mutex, just between s_active and od_dbs_cdata.mutex if
update_sampling_rate() runs in parallel with
cpufreq_governor_dbs()->cpufreq_governor_exit() and the latter wins
the race.

It looks like we need to drop the governor mutex before putting the
kobject in cpufreq_governor_exit().




[cut]



No no no no! Let's not open up this can of worms of queuing up the work
to handle a write to a sysfs file. It *MIGHT* work for this specific
tunable (I haven't bothered to analyze), but this makes it impossible to
return a useful/proper error value.



Sent too soon. Not only that, but it can also cause the writes to the sysfs
files to get processed in a different order and I don't know what other
issues/races THAT will open up.


Well, I don't like this too.

I actually do have an idea about how to fix these deadlocks, but it is
on top of my cleanup series.

I'll write more about it later today.


Having actually posted that series again after cleaning it up I can say
what I'm thinking about hopefully without confusing anyone too much.  So
please bear in mind that I'm going to refer to this series below:

http://marc.info/?l=linux-pm=145463901630950=4

Also this is more of a brain dump rather than actual design description,
so there may be holes etc in it.  Please let me know if you can see any.

The problem at hand is that policy->rwsem needs to be held around *all*
operations in cpufreq_set_policy().  In particular, it cannot be dropped
around invocations of __cpufreq_governor() with the event arg equal to
_EXIT as that leads to interesting races.

Unfortunately, we know that holding policy->rwsem in those places leads
to a deadlock with governor sysfs attributes removal in cpufreq_governor_exit().

Viresh attempted to fix this by avoiding to acquire policy->rwsem for governor
attributes access (as holding it is not necessary for them in principle).  That
was a nice try, but it turned out to be insufficient because of another deadlock
scenario uncovered by it.  Namely, since the ondemand governor's 
update_sampling_rate()
acquires the governor mutex (called dbs_data_mutex after my patches mentioned
above), it may deadlock with exactly the same piece of code in 
cpufreq_governor_exit()
in almost exactly the same way.

To avoid that other deadlock, we'd either need to drop dbs_data_mutex from
update_sampling_rate(), or drop it for the removal of the governor sysfs
attributes in cpufreq_governor_exit().  I don't think the former is an option
at least at this point, so it looks like we pretty much have to do the latter.

With that in mind, I'd start with the changes made by Viresh (maybe without the
first patch which really isn't essential here).  That is, introduce a separate
kobject type for the governor attributes kobject and register that in
cpufreq_governor_init().  The show/store callbacks for that kobject type won't
acquire policy->rwsem so the first deadlock will be avoided.

But in addition to that, I'd drop dbs_data_mutex before the removal of governor
sysfs attributes.  That actually happens in two places, in 
cpufreq_governor_exit()
and in the error path of cpufreq_governor_init().

To that end, I'd move the locking from cpufreq_governor_dbs() to the functions
called by it.  That should be readily doable and they can do all of the
necessary checks themselves.  cpufreq_governor_dbs() would become a pure mux 
then,
but that's not such a big deal.

With that, cpufreq_governor_exit() may just drop the lock before it does the
final kobject_put().  The danger here is that the sysfs show/store callbacks of
the governor attributes kobject may see invalid dbs_data for a while, after the
lock has been dropped and before the kobject is deleted.  That may be addressed
by checking, for example, the presence of the dbs_data's "tuners" pointer in 
those
callbacks.  If it is NULL, they can simply return -EAGAIN or similar.

Now, that means, though, that they need to acquire the same lock as
cpufreq_governor_exit(), or they may see things go away while they are running.
The simplest approach here would be to take dbs_data_mutex in them too, although
that's a bit of a sledgehammer.  It might be better to have a per-policy lock
in struct policy_dbs_info for that, for example, but

Re: [PATCH V2 0/7] cpufreq: governors: Fix ABBA lockups

2016-02-09 Thread Saravana Kannan

On 02/07/2016 06:28 PM, Rafael J. Wysocki wrote:

On Friday, February 05, 2016 06:22:35 PM Saravana Kannan wrote:

On 02/04/2016 07:54 PM, Rafael J. Wysocki wrote:

On Thursday, February 04, 2016 07:18:32 PM Rafael J. Wysocki wrote:

On Thu, Feb 4, 2016 at 6:44 PM, Saravana Kannan <skan...@codeaurora.org> wrote:

On 02/04/2016 09:43 AM, Saravana Kannan wrote:


On 02/04/2016 03:09 AM, Viresh Kumar wrote:


On 04-02-16, 00:50, Rafael J. Wysocki wrote:


This is exactly right.  We've avoided one deadlock only to trip into
another one.

This happens because update_sampling_rate() acquires
od_dbs_cdata.mutex which is held around cpufreq_governor_exit() by
cpufreq_governor_dbs().

Worse yet, a deadlock can still happen without (the new)
dbs_data->mutex, just between s_active and od_dbs_cdata.mutex if
update_sampling_rate() runs in parallel with
cpufreq_governor_dbs()->cpufreq_governor_exit() and the latter wins
the race.

It looks like we need to drop the governor mutex before putting the
kobject in cpufreq_governor_exit().




[cut]



No no no no! Let's not open up this can of worms of queuing up the work
to handle a write to a sysfs file. It *MIGHT* work for this specific
tunable (I haven't bothered to analyze), but this makes it impossible to
return a useful/proper error value.



Sent too soon. Not only that, but it can also cause the writes to the sysfs
files to get processed in a different order and I don't know what other
issues/races THAT will open up.


Well, I don't like this too.

I actually do have an idea about how to fix these deadlocks, but it is
on top of my cleanup series.

I'll write more about it later today.


Having actually posted that series again after cleaning it up I can say
what I'm thinking about hopefully without confusing anyone too much.  So
please bear in mind that I'm going to refer to this series below:

http://marc.info/?l=linux-pm=145463901630950=4

Also this is more of a brain dump rather than actual design description,
so there may be holes etc in it.  Please let me know if you can see any.

The problem at hand is that policy->rwsem needs to be held around *all*
operations in cpufreq_set_policy().  In particular, it cannot be dropped
around invocations of __cpufreq_governor() with the event arg equal to
_EXIT as that leads to interesting races.

Unfortunately, we know that holding policy->rwsem in those places leads
to a deadlock with governor sysfs attributes removal in cpufreq_governor_exit().

Viresh attempted to fix this by avoiding to acquire policy->rwsem for governor
attributes access (as holding it is not necessary for them in principle).  That
was a nice try, but it turned out to be insufficient because of another deadlock
scenario uncovered by it.  Namely, since the ondemand governor's 
update_sampling_rate()
acquires the governor mutex (called dbs_data_mutex after my patches mentioned
above), it may deadlock with exactly the same piece of code in 
cpufreq_governor_exit()
in almost exactly the same way.

To avoid that other deadlock, we'd either need to drop dbs_data_mutex from
update_sampling_rate(), or drop it for the removal of the governor sysfs
attributes in cpufreq_governor_exit().  I don't think the former is an option
at least at this point, so it looks like we pretty much have to do the latter.

With that in mind, I'd start with the changes made by Viresh (maybe without the
first patch which really isn't essential here).  That is, introduce a separate
kobject type for the governor attributes kobject and register that in
cpufreq_governor_init().  The show/store callbacks for that kobject type won't
acquire policy->rwsem so the first deadlock will be avoided.

But in addition to that, I'd drop dbs_data_mutex before the removal of governor
sysfs attributes.  That actually happens in two places, in 
cpufreq_governor_exit()
and in the error path of cpufreq_governor_init().

To that end, I'd move the locking from cpufreq_governor_dbs() to the functions
called by it.  That should be readily doable and they can do all of the
necessary checks themselves.  cpufreq_governor_dbs() would become a pure mux 
then,
but that's not such a big deal.

With that, cpufreq_governor_exit() may just drop the lock before it does the
final kobject_put().  The danger here is that the sysfs show/store callbacks of
the governor attributes kobject may see invalid dbs_data for a while, after the
lock has been dropped and before the kobject is deleted.  That may be addressed
by checking, for example, the presence of the dbs_data's "tuners" pointer in 
those
callbacks.  If it is NULL, they can simply return -EAGAIN or similar.

Now, that means, though, that they need to acquire the same lock as
cpufreq_governor_exit(), or they may see things go away while they are running.
The simplest approach here would be to take dbs_data_mutex in them too, although
that's a bit of a sledgehammer.  It might be better to have a per-policy lock
in struct pol

Re: [PATCH V2 0/7] cpufreq: governors: Fix ABBA lockups

2016-02-05 Thread Saravana Kannan

On 02/04/2016 07:54 PM, Rafael J. Wysocki wrote:

On Thursday, February 04, 2016 07:18:32 PM Rafael J. Wysocki wrote:

On Thu, Feb 4, 2016 at 6:44 PM, Saravana Kannan  wrote:

On 02/04/2016 09:43 AM, Saravana Kannan wrote:


On 02/04/2016 03:09 AM, Viresh Kumar wrote:


On 04-02-16, 00:50, Rafael J. Wysocki wrote:


This is exactly right.  We've avoided one deadlock only to trip into
another one.

This happens because update_sampling_rate() acquires
od_dbs_cdata.mutex which is held around cpufreq_governor_exit() by
cpufreq_governor_dbs().

Worse yet, a deadlock can still happen without (the new)
dbs_data->mutex, just between s_active and od_dbs_cdata.mutex if
update_sampling_rate() runs in parallel with
cpufreq_governor_dbs()->cpufreq_governor_exit() and the latter wins
the race.

It looks like we need to drop the governor mutex before putting the
kobject in cpufreq_governor_exit().




[cut]



No no no no! Let's not open up this can of worms of queuing up the work
to handle a write to a sysfs file. It *MIGHT* work for this specific
tunable (I haven't bothered to analyze), but this makes it impossible to
return a useful/proper error value.



Sent too soon. Not only that, but it can also cause the writes to the sysfs
files to get processed in a different order and I don't know what other
issues/races THAT will open up.


Well, I don't like this too.

I actually do have an idea about how to fix these deadlocks, but it is
on top of my cleanup series.

I'll write more about it later today.


Having actually posted that series again after cleaning it up I can say
what I'm thinking about hopefully without confusing anyone too much.  So
please bear in mind that I'm going to refer to this series below:

http://marc.info/?l=linux-pm=145463901630950=4

Also this is more of a brain dump rather than actual design description,
so there may be holes etc in it.  Please let me know if you can see any.

The problem at hand is that policy->rwsem needs to be held around *all*
operations in cpufreq_set_policy().  In particular, it cannot be dropped
around invocations of __cpufreq_governor() with the event arg equal to
_EXIT as that leads to interesting races.

Unfortunately, we know that holding policy->rwsem in those places leads
to a deadlock with governor sysfs attributes removal in cpufreq_governor_exit().

Viresh attempted to fix this by avoiding to acquire policy->rwsem for governor
attributes access (as holding it is not necessary for them in principle).  That
was a nice try, but it turned out to be insufficient because of another deadlock
scenario uncovered by it.  Namely, since the ondemand governor's 
update_sampling_rate()
acquires the governor mutex (called dbs_data_mutex after my patches mentioned
above), it may deadlock with exactly the same piece of code in 
cpufreq_governor_exit()
in almost exactly the same way.

To avoid that other deadlock, we'd either need to drop dbs_data_mutex from
update_sampling_rate(), or drop it for the removal of the governor sysfs
attributes in cpufreq_governor_exit().  I don't think the former is an option
at least at this point, so it looks like we pretty much have to do the latter.

With that in mind, I'd start with the changes made by Viresh (maybe without the
first patch which really isn't essential here).  That is, introduce a separate
kobject type for the governor attributes kobject and register that in
cpufreq_governor_init().  The show/store callbacks for that kobject type won't
acquire policy->rwsem so the first deadlock will be avoided.

But in addition to that, I'd drop dbs_data_mutex before the removal of governor
sysfs attributes.  That actually happens in two places, in 
cpufreq_governor_exit()
and in the error path of cpufreq_governor_init().

To that end, I'd move the locking from cpufreq_governor_dbs() to the functions
called by it.  That should be readily doable and they can do all of the
necessary checks themselves.  cpufreq_governor_dbs() would become a pure mux 
then,
but that's not such a big deal.

With that, cpufreq_governor_exit() may just drop the lock before it does the
final kobject_put().  The danger here is that the sysfs show/store callbacks of
the governor attributes kobject may see invalid dbs_data for a while, after the
lock has been dropped and before the kobject is deleted.  That may be addressed
by checking, for example, the presence of the dbs_data's "tuners" pointer in 
those
callbacks.  If it is NULL, they can simply return -EAGAIN or similar.

Now, that means, though, that they need to acquire the same lock as
cpufreq_governor_exit(), or they may see things go away while they are running.
The simplest approach here would be to take dbs_data_mutex in them too, although
that's a bit of a sledgehammer.  It might be better to have a per-policy lock
in struct policy_dbs_info for that, for example, but then the governor attribute
sysfs callbacks would need to get that object instead of dbs_data.

On the fli

Re: [PATCH V2 0/7] cpufreq: governors: Fix ABBA lockups

2016-02-05 Thread Saravana Kannan

On 02/04/2016 07:54 PM, Rafael J. Wysocki wrote:

On Thursday, February 04, 2016 07:18:32 PM Rafael J. Wysocki wrote:

On Thu, Feb 4, 2016 at 6:44 PM, Saravana Kannan <skan...@codeaurora.org> wrote:

On 02/04/2016 09:43 AM, Saravana Kannan wrote:


On 02/04/2016 03:09 AM, Viresh Kumar wrote:


On 04-02-16, 00:50, Rafael J. Wysocki wrote:


This is exactly right.  We've avoided one deadlock only to trip into
another one.

This happens because update_sampling_rate() acquires
od_dbs_cdata.mutex which is held around cpufreq_governor_exit() by
cpufreq_governor_dbs().

Worse yet, a deadlock can still happen without (the new)
dbs_data->mutex, just between s_active and od_dbs_cdata.mutex if
update_sampling_rate() runs in parallel with
cpufreq_governor_dbs()->cpufreq_governor_exit() and the latter wins
the race.

It looks like we need to drop the governor mutex before putting the
kobject in cpufreq_governor_exit().




[cut]



No no no no! Let's not open up this can of worms of queuing up the work
to handle a write to a sysfs file. It *MIGHT* work for this specific
tunable (I haven't bothered to analyze), but this makes it impossible to
return a useful/proper error value.



Sent too soon. Not only that, but it can also cause the writes to the sysfs
files to get processed in a different order and I don't know what other
issues/races THAT will open up.


Well, I don't like this too.

I actually do have an idea about how to fix these deadlocks, but it is
on top of my cleanup series.

I'll write more about it later today.


Having actually posted that series again after cleaning it up I can say
what I'm thinking about hopefully without confusing anyone too much.  So
please bear in mind that I'm going to refer to this series below:

http://marc.info/?l=linux-pm=145463901630950=4

Also this is more of a brain dump rather than actual design description,
so there may be holes etc in it.  Please let me know if you can see any.

The problem at hand is that policy->rwsem needs to be held around *all*
operations in cpufreq_set_policy().  In particular, it cannot be dropped
around invocations of __cpufreq_governor() with the event arg equal to
_EXIT as that leads to interesting races.

Unfortunately, we know that holding policy->rwsem in those places leads
to a deadlock with governor sysfs attributes removal in cpufreq_governor_exit().

Viresh attempted to fix this by avoiding to acquire policy->rwsem for governor
attributes access (as holding it is not necessary for them in principle).  That
was a nice try, but it turned out to be insufficient because of another deadlock
scenario uncovered by it.  Namely, since the ondemand governor's 
update_sampling_rate()
acquires the governor mutex (called dbs_data_mutex after my patches mentioned
above), it may deadlock with exactly the same piece of code in 
cpufreq_governor_exit()
in almost exactly the same way.

To avoid that other deadlock, we'd either need to drop dbs_data_mutex from
update_sampling_rate(), or drop it for the removal of the governor sysfs
attributes in cpufreq_governor_exit().  I don't think the former is an option
at least at this point, so it looks like we pretty much have to do the latter.

With that in mind, I'd start with the changes made by Viresh (maybe without the
first patch which really isn't essential here).  That is, introduce a separate
kobject type for the governor attributes kobject and register that in
cpufreq_governor_init().  The show/store callbacks for that kobject type won't
acquire policy->rwsem so the first deadlock will be avoided.

But in addition to that, I'd drop dbs_data_mutex before the removal of governor
sysfs attributes.  That actually happens in two places, in 
cpufreq_governor_exit()
and in the error path of cpufreq_governor_init().

To that end, I'd move the locking from cpufreq_governor_dbs() to the functions
called by it.  That should be readily doable and they can do all of the
necessary checks themselves.  cpufreq_governor_dbs() would become a pure mux 
then,
but that's not such a big deal.

With that, cpufreq_governor_exit() may just drop the lock before it does the
final kobject_put().  The danger here is that the sysfs show/store callbacks of
the governor attributes kobject may see invalid dbs_data for a while, after the
lock has been dropped and before the kobject is deleted.  That may be addressed
by checking, for example, the presence of the dbs_data's "tuners" pointer in 
those
callbacks.  If it is NULL, they can simply return -EAGAIN or similar.

Now, that means, though, that they need to acquire the same lock as
cpufreq_governor_exit(), or they may see things go away while they are running.
The simplest approach here would be to take dbs_data_mutex in them too, although
that's a bit of a sledgehammer.  It might be better to have a per-policy lock
in struct policy_dbs_info for that, for example, but then the governor attribute
sysfs callbacks would need to get that object inst

Re: [PATCH V2 0/7] cpufreq: governors: Fix ABBA lockups

2016-02-04 Thread Saravana Kannan

On 02/04/2016 09:43 AM, Saravana Kannan wrote:

On 02/04/2016 03:09 AM, Viresh Kumar wrote:

On 04-02-16, 00:50, Rafael J. Wysocki wrote:

This is exactly right.  We've avoided one deadlock only to trip into
another one.

This happens because update_sampling_rate() acquires
od_dbs_cdata.mutex which is held around cpufreq_governor_exit() by
cpufreq_governor_dbs().

Worse yet, a deadlock can still happen without (the new)
dbs_data->mutex, just between s_active and od_dbs_cdata.mutex if
update_sampling_rate() runs in parallel with
cpufreq_governor_dbs()->cpufreq_governor_exit() and the latter wins
the race.

It looks like we need to drop the governor mutex before putting the
kobject in cpufreq_governor_exit().


I have tried to explore all possible ways of fixing this, and every
other way looked to be racy in some way.

Does anyone else have a better idea (untested):

-8<-

Subject: [PATCH] cpufreq: ondemand: Shoot update_sampling_rate with a
separate
  work

Signed-off-by: Viresh Kumar 
---
  drivers/cpufreq/cpufreq_governor.h |  2 ++
  drivers/cpufreq/cpufreq_ondemand.c | 39
+-
  2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.h
b/drivers/cpufreq/cpufreq_governor.h
index 7bed63e14e7d..97e604356b20 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -141,6 +141,8 @@ struct od_dbs_tuners {
  unsigned int powersave_bias;
  unsigned int io_is_busy;
  unsigned int min_sampling_rate;
+struct work_struct work;
+struct dbs_data *dbs_data;
  };

  struct cs_dbs_tuners {
diff --git a/drivers/cpufreq/cpufreq_ondemand.c
b/drivers/cpufreq/cpufreq_ondemand.c
index 82ed490f7de0..93ad7a226aee 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -242,20 +242,27 @@ static struct common_dbs_data od_dbs_cdata;
   * reducing the sampling rate, we need to make the new value effective
   * immediately.
   */
-static void update_sampling_rate(struct dbs_data *dbs_data,
-unsigned int new_rate)
+static void update_sampling_rate(struct work_struct *work)
  {
-struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+struct od_dbs_tuners *od_tuners = container_of(work, struct
+   od_dbs_tuners, work);
+unsigned int new_rate = od_tuners->sampling_rate;
+struct dbs_data *dbs_data = od_tuners->dbs_data;
  struct cpumask cpumask;
  int cpu;

-od_tuners->sampling_rate = new_rate = max(new_rate,
-od_tuners->min_sampling_rate);
-
  /*
   * Lock governor so that governor start/stop can't execute in
parallel.
+ *
+ * We can't do a regular mutex_lock() here, as that may deadlock
against
+ * another thread performing CPUFREQ_GOV_POLICY_EXIT event on the
+ * governor, which might have already taken od_dbs_cdata.mutex
and is
+ * waiting for this work to finish.
   */
-mutex_lock(_dbs_cdata.mutex);
+if (!mutex_trylock(_dbs_cdata.mutex)) {
+queue_work(system_wq, _tuners->work);
+return;
+}

  cpumask_copy(, cpu_online_mask);

@@ -311,13 +318,22 @@ static void update_sampling_rate(struct dbs_data
*dbs_data,
  static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const
char *buf,
  size_t count)
  {
+struct od_dbs_tuners *od_tuners = dbs_data->tuners;
  unsigned int input;
  int ret;
  ret = sscanf(buf, "%u", );
  if (ret != 1)
  return -EINVAL;

-update_sampling_rate(dbs_data, input);
+od_tuners->sampling_rate = max(input, od_tuners->min_sampling_rate);
+
+/*
+ * update_sampling_rate() requires to hold od_dbs_cdata.mutex,
but we
+ * can't take that from this thread, otherwise it results in ABBA
+ * lockdep between s_active and od_dbs_cdata.mutex locks.
+ */
+queue_work(system_wq, _tuners->work);
+
  return count;
  }

@@ -501,6 +517,8 @@ static int od_init(struct dbs_data *dbs_data, bool
notify)
  tuners->ignore_nice_load = 0;
  tuners->powersave_bias = default_powersave_bias;
  tuners->io_is_busy = should_io_be_busy();
+INIT_WORK(>work, update_sampling_rate);
+tuners->dbs_data = dbs_data;

  dbs_data->tuners = tuners;
  return 0;
@@ -508,7 +526,10 @@ static int od_init(struct dbs_data *dbs_data,
bool notify)

  static void od_exit(struct dbs_data *dbs_data, bool notify)
  {
-kfree(dbs_data->tuners);
+struct od_dbs_tuners *tuners = dbs_data->tuners;
+
+cancel_work_sync(>work);
+kfree(tuners);
  }

  define_get_cpu_dbs_routines(od_cpu_dbs_info);



No no no no! Let's not open up this can of worms of queuing up the work
to handle a write to a sysfs file. It *MIGHT* work for this specific
tunable (I haven't bothered to analyze), but this makes it impossible to
return a useful/proper e

Re: [PATCH V2 0/7] cpufreq: governors: Fix ABBA lockups

2016-02-04 Thread Saravana Kannan

On 02/04/2016 03:09 AM, Viresh Kumar wrote:

On 04-02-16, 00:50, Rafael J. Wysocki wrote:

This is exactly right.  We've avoided one deadlock only to trip into
another one.

This happens because update_sampling_rate() acquires
od_dbs_cdata.mutex which is held around cpufreq_governor_exit() by
cpufreq_governor_dbs().

Worse yet, a deadlock can still happen without (the new)
dbs_data->mutex, just between s_active and od_dbs_cdata.mutex if
update_sampling_rate() runs in parallel with
cpufreq_governor_dbs()->cpufreq_governor_exit() and the latter wins
the race.

It looks like we need to drop the governor mutex before putting the
kobject in cpufreq_governor_exit().


I have tried to explore all possible ways of fixing this, and every
other way looked to be racy in some way.

Does anyone else have a better idea (untested):

-8<-

Subject: [PATCH] cpufreq: ondemand: Shoot update_sampling_rate with a separate
  work

Signed-off-by: Viresh Kumar 
---
  drivers/cpufreq/cpufreq_governor.h |  2 ++
  drivers/cpufreq/cpufreq_ondemand.c | 39 +-
  2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.h 
b/drivers/cpufreq/cpufreq_governor.h
index 7bed63e14e7d..97e604356b20 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -141,6 +141,8 @@ struct od_dbs_tuners {
unsigned int powersave_bias;
unsigned int io_is_busy;
unsigned int min_sampling_rate;
+   struct work_struct work;
+   struct dbs_data *dbs_data;
  };

  struct cs_dbs_tuners {
diff --git a/drivers/cpufreq/cpufreq_ondemand.c 
b/drivers/cpufreq/cpufreq_ondemand.c
index 82ed490f7de0..93ad7a226aee 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -242,20 +242,27 @@ static struct common_dbs_data od_dbs_cdata;
   * reducing the sampling rate, we need to make the new value effective
   * immediately.
   */
-static void update_sampling_rate(struct dbs_data *dbs_data,
-   unsigned int new_rate)
+static void update_sampling_rate(struct work_struct *work)
  {
-   struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+   struct od_dbs_tuners *od_tuners = container_of(work, struct
+  od_dbs_tuners, work);
+   unsigned int new_rate = od_tuners->sampling_rate;
+   struct dbs_data *dbs_data = od_tuners->dbs_data;
struct cpumask cpumask;
int cpu;

-   od_tuners->sampling_rate = new_rate = max(new_rate,
-   od_tuners->min_sampling_rate);
-
/*
 * Lock governor so that governor start/stop can't execute in parallel.
+*
+* We can't do a regular mutex_lock() here, as that may deadlock against
+* another thread performing CPUFREQ_GOV_POLICY_EXIT event on the
+* governor, which might have already taken od_dbs_cdata.mutex and is
+* waiting for this work to finish.
 */
-   mutex_lock(_dbs_cdata.mutex);
+   if (!mutex_trylock(_dbs_cdata.mutex)) {
+   queue_work(system_wq, _tuners->work);
+   return;
+   }

cpumask_copy(, cpu_online_mask);

@@ -311,13 +318,22 @@ static void update_sampling_rate(struct dbs_data 
*dbs_data,
  static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
size_t count)
  {
+   struct od_dbs_tuners *od_tuners = dbs_data->tuners;
unsigned int input;
int ret;
ret = sscanf(buf, "%u", );
if (ret != 1)
return -EINVAL;

-   update_sampling_rate(dbs_data, input);
+   od_tuners->sampling_rate = max(input, od_tuners->min_sampling_rate);
+
+   /*
+* update_sampling_rate() requires to hold od_dbs_cdata.mutex, but we
+* can't take that from this thread, otherwise it results in ABBA
+* lockdep between s_active and od_dbs_cdata.mutex locks.
+*/
+   queue_work(system_wq, _tuners->work);
+
return count;
  }

@@ -501,6 +517,8 @@ static int od_init(struct dbs_data *dbs_data, bool notify)
tuners->ignore_nice_load = 0;
tuners->powersave_bias = default_powersave_bias;
tuners->io_is_busy = should_io_be_busy();
+   INIT_WORK(>work, update_sampling_rate);
+   tuners->dbs_data = dbs_data;

dbs_data->tuners = tuners;
return 0;
@@ -508,7 +526,10 @@ static int od_init(struct dbs_data *dbs_data, bool notify)

  static void od_exit(struct dbs_data *dbs_data, bool notify)
  {
-   kfree(dbs_data->tuners);
+   struct od_dbs_tuners *tuners = dbs_data->tuners;
+
+   cancel_work_sync(>work);
+   kfree(tuners);
  }

  define_get_cpu_dbs_routines(od_cpu_dbs_info);



No no no no! Let's not open up this can of worms of queuing up the work 
to handle a write to a sysfs file. It *MIGHT* work for this specific 
tunable (I haven't bothered to 

Re: [PATCH V2 0/7] cpufreq: governors: Fix ABBA lockups

2016-02-04 Thread Saravana Kannan

On 02/04/2016 09:43 AM, Saravana Kannan wrote:

On 02/04/2016 03:09 AM, Viresh Kumar wrote:

On 04-02-16, 00:50, Rafael J. Wysocki wrote:

This is exactly right.  We've avoided one deadlock only to trip into
another one.

This happens because update_sampling_rate() acquires
od_dbs_cdata.mutex which is held around cpufreq_governor_exit() by
cpufreq_governor_dbs().

Worse yet, a deadlock can still happen without (the new)
dbs_data->mutex, just between s_active and od_dbs_cdata.mutex if
update_sampling_rate() runs in parallel with
cpufreq_governor_dbs()->cpufreq_governor_exit() and the latter wins
the race.

It looks like we need to drop the governor mutex before putting the
kobject in cpufreq_governor_exit().


I have tried to explore all possible ways of fixing this, and every
other way looked to be racy in some way.

Does anyone else have a better idea (untested):

-8<-

Subject: [PATCH] cpufreq: ondemand: Shoot update_sampling_rate with a
separate
  work

Signed-off-by: Viresh Kumar <viresh.ku...@linaro.org>
---
  drivers/cpufreq/cpufreq_governor.h |  2 ++
  drivers/cpufreq/cpufreq_ondemand.c | 39
+-
  2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.h
b/drivers/cpufreq/cpufreq_governor.h
index 7bed63e14e7d..97e604356b20 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -141,6 +141,8 @@ struct od_dbs_tuners {
  unsigned int powersave_bias;
  unsigned int io_is_busy;
  unsigned int min_sampling_rate;
+struct work_struct work;
+struct dbs_data *dbs_data;
  };

  struct cs_dbs_tuners {
diff --git a/drivers/cpufreq/cpufreq_ondemand.c
b/drivers/cpufreq/cpufreq_ondemand.c
index 82ed490f7de0..93ad7a226aee 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -242,20 +242,27 @@ static struct common_dbs_data od_dbs_cdata;
   * reducing the sampling rate, we need to make the new value effective
   * immediately.
   */
-static void update_sampling_rate(struct dbs_data *dbs_data,
-unsigned int new_rate)
+static void update_sampling_rate(struct work_struct *work)
  {
-struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+struct od_dbs_tuners *od_tuners = container_of(work, struct
+   od_dbs_tuners, work);
+unsigned int new_rate = od_tuners->sampling_rate;
+struct dbs_data *dbs_data = od_tuners->dbs_data;
  struct cpumask cpumask;
  int cpu;

-od_tuners->sampling_rate = new_rate = max(new_rate,
-od_tuners->min_sampling_rate);
-
  /*
   * Lock governor so that governor start/stop can't execute in
parallel.
+ *
+ * We can't do a regular mutex_lock() here, as that may deadlock
against
+ * another thread performing CPUFREQ_GOV_POLICY_EXIT event on the
+ * governor, which might have already taken od_dbs_cdata.mutex
and is
+ * waiting for this work to finish.
   */
-mutex_lock(_dbs_cdata.mutex);
+if (!mutex_trylock(_dbs_cdata.mutex)) {
+queue_work(system_wq, _tuners->work);
+return;
+}

  cpumask_copy(, cpu_online_mask);

@@ -311,13 +318,22 @@ static void update_sampling_rate(struct dbs_data
*dbs_data,
  static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const
char *buf,
  size_t count)
  {
+struct od_dbs_tuners *od_tuners = dbs_data->tuners;
  unsigned int input;
  int ret;
  ret = sscanf(buf, "%u", );
  if (ret != 1)
  return -EINVAL;

-update_sampling_rate(dbs_data, input);
+od_tuners->sampling_rate = max(input, od_tuners->min_sampling_rate);
+
+/*
+ * update_sampling_rate() requires to hold od_dbs_cdata.mutex,
but we
+ * can't take that from this thread, otherwise it results in ABBA
+ * lockdep between s_active and od_dbs_cdata.mutex locks.
+ */
+queue_work(system_wq, _tuners->work);
+
  return count;
  }

@@ -501,6 +517,8 @@ static int od_init(struct dbs_data *dbs_data, bool
notify)
  tuners->ignore_nice_load = 0;
  tuners->powersave_bias = default_powersave_bias;
  tuners->io_is_busy = should_io_be_busy();
+INIT_WORK(>work, update_sampling_rate);
+tuners->dbs_data = dbs_data;

  dbs_data->tuners = tuners;
  return 0;
@@ -508,7 +526,10 @@ static int od_init(struct dbs_data *dbs_data,
bool notify)

  static void od_exit(struct dbs_data *dbs_data, bool notify)
  {
-kfree(dbs_data->tuners);
+struct od_dbs_tuners *tuners = dbs_data->tuners;
+
+cancel_work_sync(>work);
+kfree(tuners);
  }

  define_get_cpu_dbs_routines(od_cpu_dbs_info);



No no no no! Let's not open up this can of worms of queuing up the work
to handle a write to a sysfs file. It *MIGHT* work for this specific
tunable (I haven't bothered to analyze), but this makes it impossible to
r

Re: [PATCH V2 0/7] cpufreq: governors: Fix ABBA lockups

2016-02-04 Thread Saravana Kannan

On 02/04/2016 03:09 AM, Viresh Kumar wrote:

On 04-02-16, 00:50, Rafael J. Wysocki wrote:

This is exactly right.  We've avoided one deadlock only to trip into
another one.

This happens because update_sampling_rate() acquires
od_dbs_cdata.mutex which is held around cpufreq_governor_exit() by
cpufreq_governor_dbs().

Worse yet, a deadlock can still happen without (the new)
dbs_data->mutex, just between s_active and od_dbs_cdata.mutex if
update_sampling_rate() runs in parallel with
cpufreq_governor_dbs()->cpufreq_governor_exit() and the latter wins
the race.

It looks like we need to drop the governor mutex before putting the
kobject in cpufreq_governor_exit().


I have tried to explore all possible ways of fixing this, and every
other way looked to be racy in some way.

Does anyone else have a better idea (untested):

-8<-

Subject: [PATCH] cpufreq: ondemand: Shoot update_sampling_rate with a separate
  work

Signed-off-by: Viresh Kumar 
---
  drivers/cpufreq/cpufreq_governor.h |  2 ++
  drivers/cpufreq/cpufreq_ondemand.c | 39 +-
  2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.h 
b/drivers/cpufreq/cpufreq_governor.h
index 7bed63e14e7d..97e604356b20 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -141,6 +141,8 @@ struct od_dbs_tuners {
unsigned int powersave_bias;
unsigned int io_is_busy;
unsigned int min_sampling_rate;
+   struct work_struct work;
+   struct dbs_data *dbs_data;
  };

  struct cs_dbs_tuners {
diff --git a/drivers/cpufreq/cpufreq_ondemand.c 
b/drivers/cpufreq/cpufreq_ondemand.c
index 82ed490f7de0..93ad7a226aee 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -242,20 +242,27 @@ static struct common_dbs_data od_dbs_cdata;
   * reducing the sampling rate, we need to make the new value effective
   * immediately.
   */
-static void update_sampling_rate(struct dbs_data *dbs_data,
-   unsigned int new_rate)
+static void update_sampling_rate(struct work_struct *work)
  {
-   struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+   struct od_dbs_tuners *od_tuners = container_of(work, struct
+  od_dbs_tuners, work);
+   unsigned int new_rate = od_tuners->sampling_rate;
+   struct dbs_data *dbs_data = od_tuners->dbs_data;
struct cpumask cpumask;
int cpu;

-   od_tuners->sampling_rate = new_rate = max(new_rate,
-   od_tuners->min_sampling_rate);
-
/*
 * Lock governor so that governor start/stop can't execute in parallel.
+*
+* We can't do a regular mutex_lock() here, as that may deadlock against
+* another thread performing CPUFREQ_GOV_POLICY_EXIT event on the
+* governor, which might have already taken od_dbs_cdata.mutex and is
+* waiting for this work to finish.
 */
-   mutex_lock(_dbs_cdata.mutex);
+   if (!mutex_trylock(_dbs_cdata.mutex)) {
+   queue_work(system_wq, _tuners->work);
+   return;
+   }

cpumask_copy(, cpu_online_mask);

@@ -311,13 +318,22 @@ static void update_sampling_rate(struct dbs_data 
*dbs_data,
  static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
size_t count)
  {
+   struct od_dbs_tuners *od_tuners = dbs_data->tuners;
unsigned int input;
int ret;
ret = sscanf(buf, "%u", );
if (ret != 1)
return -EINVAL;

-   update_sampling_rate(dbs_data, input);
+   od_tuners->sampling_rate = max(input, od_tuners->min_sampling_rate);
+
+   /*
+* update_sampling_rate() requires to hold od_dbs_cdata.mutex, but we
+* can't take that from this thread, otherwise it results in ABBA
+* lockdep between s_active and od_dbs_cdata.mutex locks.
+*/
+   queue_work(system_wq, _tuners->work);
+
return count;
  }

@@ -501,6 +517,8 @@ static int od_init(struct dbs_data *dbs_data, bool notify)
tuners->ignore_nice_load = 0;
tuners->powersave_bias = default_powersave_bias;
tuners->io_is_busy = should_io_be_busy();
+   INIT_WORK(>work, update_sampling_rate);
+   tuners->dbs_data = dbs_data;

dbs_data->tuners = tuners;
return 0;
@@ -508,7 +526,10 @@ static int od_init(struct dbs_data *dbs_data, bool notify)

  static void od_exit(struct dbs_data *dbs_data, bool notify)
  {
-   kfree(dbs_data->tuners);
+   struct od_dbs_tuners *tuners = dbs_data->tuners;
+
+   cancel_work_sync(>work);
+   kfree(tuners);
  }

  define_get_cpu_dbs_routines(od_cpu_dbs_info);



No no no no! Let's not open up this can of worms of queuing up the work 
to handle a write to a sysfs file. It *MIGHT* work for this specific 
tunable 

Re: [PATCH 7/11] cpufreq: governor: Rework cpufreq_governor_dbs()

2016-02-03 Thread Saravana Kannan
ic struct dbs_governor od_dbs_gov = {
.gov = {
.name = "ondemand",
-   .governor = od_cpufreq_governor_dbs,
+   .governor = cpufreq_governor_dbs,
.max_transition_latency = TRANSITION_LATENCY_LIMIT,
.owner = THIS_MODULE,
},
@@ -563,12 +560,6 @@ static struct dbs_governor od_dbs_gov =

  #define CPU_FREQ_GOV_ONDEMAND (_dbs_gov.gov)

-static int od_cpufreq_governor_dbs(struct cpufreq_policy *policy,
-   unsigned int event)
-{
-   return cpufreq_governor_dbs(policy, _dbs_gov, event);
-}
-
  static void od_set_powersave_bias(unsigned int powersave_bias)
  {
struct cpufreq_policy *policy;

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



Nice!

Acked-by: Saravana Kannan 

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 5/11] cpufreq: governor: Put governor structure into common_dbs_data

2016-02-03 Thread Saravana Kannan
cpu_get_raw(freq->cpu);
-
-   if (!policy)
-   return 0;
-
-   /* policy isn't governed by conservative governor */
-   if (policy->governor != _gov_conservative)
-   return 0;
-
-   /*
-* we only care if our internally tracked freq moves outside the 'valid'
-* ranges of frequency available to us otherwise we do not change it
-   */
-   if (dbs_info->requested_freq > policy->max
-   || dbs_info->requested_freq < policy->min)
-   dbs_info->requested_freq = freq->new;
-
-   return 0;
-}
+   void *data);

  static struct notifier_block cs_cpufreq_notifier_block = {
.notifier_call = dbs_cpufreq_notifier,
@@ -358,7 +325,16 @@ static void cs_exit(struct dbs_data *dbs

  define_get_cpu_dbs_routines(cs_cpu_dbs_info);

+static int cs_cpufreq_governor_dbs(struct cpufreq_policy *policy,
+  unsigned int event);
+
  static struct common_dbs_data cs_dbs_cdata = {
+   .gov = {
+   .name = "conservative",
+   .governor = cs_cpufreq_governor_dbs,
+   .max_transition_latency = TRANSITION_LATENCY_LIMIT,
+   .owner = THIS_MODULE,
+   },
.governor = GOV_CONSERVATIVE,
.attr_group_gov_sys = _attr_group_gov_sys,
.attr_group_gov_pol = _attr_group_gov_pol,
@@ -370,20 +346,48 @@ static struct common_dbs_data cs_dbs_cda
.exit = cs_exit,
  };

+#define CPU_FREQ_GOV_CONSERVATIVE  (_dbs_cdata.gov)
+
  static int cs_cpufreq_governor_dbs(struct cpufreq_policy *policy,
   unsigned int event)
  {
return cpufreq_governor_dbs(policy, _dbs_cdata, event);
  }

+static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
+   void *data)
+{
+   struct cpufreq_freqs *freq = data;
+   struct cs_cpu_dbs_info_s *dbs_info =
+   _cpu(cs_cpu_dbs_info, freq->cpu);
+   struct cpufreq_policy *policy = cpufreq_cpu_get_raw(freq->cpu);
+
+   if (!policy)
+   return 0;
+
+   /* policy isn't governed by conservative governor */
+   if (policy->governor != CPU_FREQ_GOV_CONSERVATIVE)
+   return 0;
+
+   /*
+* we only care if our internally tracked freq moves outside the 'valid'
+* ranges of frequency available to us otherwise we do not change it
+   */
+   if (dbs_info->requested_freq > policy->max
+   || dbs_info->requested_freq < policy->min)
+   dbs_info->requested_freq = freq->new;
+
+   return 0;
+}
+
  static int __init cpufreq_gov_dbs_init(void)
  {
-   return cpufreq_register_governor(_gov_conservative);
+   return cpufreq_register_governor(CPU_FREQ_GOV_CONSERVATIVE);
  }

  static void __exit cpufreq_gov_dbs_exit(void)
  {
-   cpufreq_unregister_governor(_gov_conservative);
+   cpufreq_unregister_governor(CPU_FREQ_GOV_CONSERVATIVE);
  }

  MODULE_AUTHOR("Alexander Clouter ");
@@ -395,7 +399,7 @@ MODULE_LICENSE("GPL");
  #ifdef CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE
  struct cpufreq_governor *cpufreq_default_governor(void)
  {
-   return _gov_conservative;
+   return CPU_FREQ_GOV_CONSERVATIVE;
  }

  fs_initcall(cpufreq_gov_dbs_init);



I'm not sold on the macros/#defines for the , but not a strong 
opinion.


Acked-by: Saravana Kannan 

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 3/11] cpufreq: governor: Use common global_dbs_data pointer

2016-02-03 Thread Saravana Kannan

On 02/03/2016 05:11 PM, Saravana Kannan wrote:

On 02/03/2016 03:22 PM, Rafael J. Wysocki wrote:

From: Rafael J. Wysocki 

If the ondemand and conservative governors cannot use per-policy
tunables (CPUFREQ_HAVE_GOVERNOR_PER_POLICY is not set in the cpufreq
driver), all policy objects point to the same single dbs_data object.
Additionally, that object is pointed to by a global pointer hidden in
the governor's data structures.

There is no reason for that pointer to be buried in those
data structures, though, so make it explicitly global.

Signed-off-by: Rafael J. Wysocki 
---
  drivers/cpufreq/cpufreq_governor.c |   20 ++--
  drivers/cpufreq/cpufreq_governor.h |   20 ++--
  2 files changed, 20 insertions(+), 20 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -78,7 +78,7 @@ __ATTR(_name, 0644, show_##_name##_gov_p
  static ssize_t show_##file_name##_gov_sys\
  (struct kobject *kobj, struct attribute *attr, char *buf)\
  {\
-struct _gov##_dbs_tuners *tuners =
_gov##_dbs_cdata.gdbs_data->tuners; \
+struct _gov##_dbs_tuners *tuners = global_dbs_data->tuners; \
  return sprintf(buf, "%u\n", tuners->file_name);\
  }\
  \
@@ -94,7 +94,7 @@ static ssize_t show_##file_name##_gov_po
  static ssize_t store_##file_name##_gov_sys\
  (struct kobject *kobj, struct attribute *attr, const char *buf,
size_t count) \
  {\
-struct dbs_data *dbs_data = _gov##_dbs_cdata.gdbs_data;\
+struct dbs_data *dbs_data = global_dbs_data;\
  return store_##file_name(dbs_data, buf, count);\
  }\
  \
@@ -201,19 +201,14 @@ struct cs_dbs_tuners {
  /* Common Governor data across policies */
  struct dbs_data;
  struct common_dbs_data {
-/* Common across governors */
+struct cpufreq_governor gov;
+


Actually, this line is completely unrelated to this patch. Should go on 
Patch 5?




Cautiously Acked-by: Saravana Kannan 


Good call on the "cautiously" I guess!

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 3/11] cpufreq: governor: Use common global_dbs_data pointer

2016-02-03 Thread Saravana Kannan

On 02/03/2016 05:25 PM, Rafael J. Wysocki wrote:

On Thu, Feb 4, 2016 at 2:11 AM, Saravana Kannan  wrote:

On 02/03/2016 03:22 PM, Rafael J. Wysocki wrote:


From: Rafael J. Wysocki 

If the ondemand and conservative governors cannot use per-policy
tunables (CPUFREQ_HAVE_GOVERNOR_PER_POLICY is not set in the cpufreq
driver), all policy objects point to the same single dbs_data object.
Additionally, that object is pointed to by a global pointer hidden in
the governor's data structures.

There is no reason for that pointer to be buried in those
data structures, though, so make it explicitly global.

Signed-off-by: Rafael J. Wysocki 
---
   drivers/cpufreq/cpufreq_governor.c |   20 ++--
   drivers/cpufreq/cpufreq_governor.h |   20 ++--
   2 files changed, 20 insertions(+), 20 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -78,7 +78,7 @@ __ATTR(_name, 0644, show_##_name##_gov_p
   static ssize_t show_##file_name##_gov_sys \
   (struct kobject *kobj, struct attribute *attr, char *buf) \
   { \
-   struct _gov##_dbs_tuners *tuners =
_gov##_dbs_cdata.gdbs_data->tuners; \
+   struct _gov##_dbs_tuners *tuners = global_dbs_data->tuners; \
 return sprintf(buf, "%u\n", tuners->file_name); \
   } \
 \
@@ -94,7 +94,7 @@ static ssize_t show_##file_name##_gov_po
   static ssize_t store_##file_name##_gov_sys\
   (struct kobject *kobj, struct attribute *attr, const char *buf, size_t
count) \
   { \
-   struct dbs_data *dbs_data = _gov##_dbs_cdata.gdbs_data; \
+   struct dbs_data *dbs_data = global_dbs_data;\
 return store_##file_name(dbs_data, buf, count); \
   } \
 \
@@ -201,19 +201,14 @@ struct cs_dbs_tuners {
   /* Common Governor data across policies */
   struct dbs_data;
   struct common_dbs_data {
-   /* Common across governors */
+   struct cpufreq_governor gov;
+
 #define GOV_ONDEMAND0
 #define GOV_CONSERVATIVE1
 int governor;
 struct attribute_group *attr_group_gov_sys; /* one governor -
system */
 struct attribute_group *attr_group_gov_pol; /* one governor -
policy */

-   /*
-* Common data for platforms that don't set
-* CPUFREQ_HAVE_GOVERNOR_PER_POLICY
-*/
-   struct dbs_data *gdbs_data;
-
 struct cpu_dbs_info *(*get_cpu_cdbs)(int cpu);
 void *(*get_cpu_dbs_info_s)(int cpu);
 unsigned int (*gov_dbs_timer)(struct cpufreq_policy *policy);
@@ -233,6 +228,11 @@ struct dbs_data {
 void *tuners;
   };

+/*
+ * Common governor data for platforms without
CPUFREQ_HAVE_GOVERNOR_PER_POLICY.
+ */
+extern struct dbs_data *global_dbs_data;
+
   /* Governor specific ops, will be passed to dbs_data->gov_ops */
   struct od_ops {
 void (*powersave_bias_init_cpu)(int cpu);
@@ -256,7 +256,7 @@ static inline int delay_for_sampling_rat
   static ssize_t show_sampling_rate_min_gov_sys \
   (struct kobject *kobj, struct attribute *attr, char *buf) \
   { \
-   struct dbs_data *dbs_data = _gov##_dbs_cdata.gdbs_data; \
+   struct dbs_data *dbs_data = global_dbs_data;\
 return sprintf(buf, "%u\n", dbs_data->min_sampling_rate);   \
   } \
 \
Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -22,6 +22,9 @@

   #include "cpufreq_governor.h"

+struct dbs_data *global_dbs_data;
+EXPORT_SYMBOL_GPL(global_dbs_data);
+
   DEFINE_MUTEX(dbs_data_mutex);
   EXPORT_SYMBOL_GPL(dbs_data_mutex);

@@ -377,22 +380,19 @@ static int cpufreq_governor_init(struct
 latency * LATENCY_MULTIPLIER));

 if (!have_governor_per_policy())
-   cdata->gdbs_data = dbs_data;
+   global_dbs_data = dbs_data;

 policy->governor_data = dbs_data;

 ret =

Re: [PATCH 4/11] cpufreq: governor: Avoid passing dbs_data pointers around unnecessarily

2016-02-03 Thread Saravana Kannan
ck governor to block concurrent initialization of governor */
mutex_lock(_data_mutex);

-   if (have_governor_per_policy())
-   dbs_data = policy->governor_data;
-   else
-   dbs_data = global_dbs_data;
-
-   if (!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT)) {
-   ret = -EINVAL;
-   goto unlock;
+   if (event == CPUFREQ_GOV_POLICY_INIT) {
+   ret = cpufreq_governor_init(policy, cdata);
+   } else if (policy->governor_data) {
+   switch (event) {
+   case CPUFREQ_GOV_POLICY_EXIT:
+   ret = cpufreq_governor_exit(policy);
+   break;
+   case CPUFREQ_GOV_START:
+   ret = cpufreq_governor_start(policy);
+   break;
+   case CPUFREQ_GOV_STOP:
+   ret = cpufreq_governor_stop(policy);
+   break;
+   case CPUFREQ_GOV_LIMITS:
+   ret = cpufreq_governor_limits(policy);
+   break;
+   }
}

-   switch (event) {
-   case CPUFREQ_GOV_POLICY_INIT:
-   ret = cpufreq_governor_init(policy, dbs_data, cdata);
-   break;
-   case CPUFREQ_GOV_POLICY_EXIT:
-   ret = cpufreq_governor_exit(policy, dbs_data);
-   break;
-   case CPUFREQ_GOV_START:
-   ret = cpufreq_governor_start(policy, dbs_data);
-   break;
-   case CPUFREQ_GOV_STOP:
-   ret = cpufreq_governor_stop(policy, dbs_data);
-   break;
-   case CPUFREQ_GOV_LIMITS:
-   ret = cpufreq_governor_limits(policy, dbs_data);
-   break;
-   default:
-   ret = -EINVAL;
-   }
-
-unlock:
mutex_unlock(_data_mutex);
-


Po-tay-to, Po-tah-to.


return ret;
  }
  EXPORT_SYMBOL_GPL(cpufreq_governor_dbs);



Agree with the general idea of the patch though.

Conditional on the comment above being resolve amongst the others:
Acked-by: Saravana Kannan 

-Saravana
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 3/11] cpufreq: governor: Use common global_dbs_data pointer

2016-02-03 Thread Saravana Kannan
set_gdbs_data;
-
-   return 0;
+   if (!ret)
+   return 0;


I think the previous method of a handling the error is easier to read 
and more in line with the typical kernel coding style. The successful 
path ends in an unconditional return statement and the error paths are 
handled with a goto.


This also doesn't seem relevant to what the patch is trying to do. So, 
I'd prefer that it be left as is.




-reset_gdbs_data:
policy->governor_data = NULL;
-
if (!have_governor_per_policy())
-   cdata->gdbs_data = NULL;
+   global_dbs_data = NULL;
+
cdata->exit(dbs_data, !policy->governor->initialized);
  free_common_dbs_info:
free_common_dbs_info(policy, cdata);
@@ -418,7 +418,7 @@ static int cpufreq_governor_exit(struct
policy->governor_data = NULL;

if (!have_governor_per_policy())
-   cdata->gdbs_data = NULL;
+   global_dbs_data = NULL;

cdata->exit(dbs_data, policy->governor->initialized == 1);
kfree(dbs_data);
@@ -550,7 +550,7 @@ int cpufreq_governor_dbs(struct cpufreq_
if (have_governor_per_policy())
dbs_data = policy->governor_data;
else
-   dbs_data = cdata->gdbs_data;
+   dbs_data = global_dbs_data;

if (!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT)) {
ret = -EINVAL;

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



If the minor comment is addressed, this looks okay to me.

Cautiously Acked-by: Saravana Kannan 

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/11] cpufreq: governor: Use common mutex for dbs_data protection

2016-02-03 Thread Saravana Kannan

On 02/03/2016 03:16 PM, Rafael J. Wysocki wrote:

From: Rafael J. Wysocki 

Every governor relying on the common code in cpufreq_governor.c
has to provide its own mutex in struct common_dbs_data.  However,
those mutexes are never used at the same time and doing it this
way makes it rather difficult to follow the code.  Moreover,
if two governor modules are loaded we end up with two mutexes
used for the same purpose one at a time.

Introduce a single common mutex for that instead and drop the
mutex field from struct common_dbs_data.  That at least will
ensure that the mutex is always present and initialized regardless
of what the particular governors do.

Signed-off-by: Rafael J. Wysocki 
---
  drivers/cpufreq/cpufreq_conservative.c |1 -
  drivers/cpufreq/cpufreq_governor.c |7 +--
  drivers/cpufreq/cpufreq_governor.h |6 +-
  drivers/cpufreq/cpufreq_ondemand.c |5 ++---
  4 files changed, 8 insertions(+), 11 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -22,6 +22,9 @@

  #include "cpufreq_governor.h"

+DEFINE_MUTEX(dbs_data_mutex);
+EXPORT_SYMBOL_GPL(dbs_data_mutex);
+
  static struct attribute_group *get_sysfs_attr(struct dbs_data *dbs_data)
  {
if (have_governor_per_policy())
@@ -542,7 +545,7 @@ int cpufreq_governor_dbs(struct cpufreq_
int ret;

/* Lock governor to block concurrent initialization of governor */
-   mutex_lock(>mutex);
+   mutex_lock(_data_mutex);

if (have_governor_per_policy())
dbs_data = policy->governor_data;
@@ -575,7 +578,7 @@ int cpufreq_governor_dbs(struct cpufreq_
}

  unlock:
-   mutex_unlock(>mutex);
+   mutex_unlock(_data_mutex);

return ret;
  }
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -223,11 +223,6 @@ struct common_dbs_data {

/* Governor specific ops, see below */
void *gov_ops;
-
-   /*
-* Protects governor's data (struct dbs_data and struct common_dbs_data)
-*/
-   struct mutex mutex;
  };

  /* Governor Per policy data */
@@ -272,6 +267,7 @@ static ssize_t show_sampling_rate_min_go
return sprintf(buf, "%u\n", dbs_data->min_sampling_rate);  \
  }

+extern struct mutex dbs_data_mutex;
  extern struct mutex cpufreq_governor_lock;
  void gov_set_update_util(struct cpu_common_dbs_info *shared,
 unsigned int delay_us);
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
@@ -251,7 +251,7 @@ static void update_sampling_rate(struct
/*
 * Lock governor so that governor start/stop can't execute in parallel.
 */
-   mutex_lock(_dbs_cdata.mutex);
+   mutex_lock(_data_mutex);

cpumask_copy(, cpu_online_mask);

@@ -304,7 +304,7 @@ static void update_sampling_rate(struct
}
}

-   mutex_unlock(_dbs_cdata.mutex);
+   mutex_unlock(_data_mutex);
  }

  static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
@@ -550,7 +550,6 @@ static struct common_dbs_data od_dbs_cda
.gov_ops = _ops,
.init = od_init,
.exit = od_exit,
-   .mutex = __MUTEX_INITIALIZER(od_dbs_cdata.mutex),
  };

  static int od_cpufreq_governor_dbs(struct cpufreq_policy *policy,
Index: linux-pm/drivers/cpufreq/cpufreq_conservative.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_conservative.c
+++ linux-pm/drivers/cpufreq/cpufreq_conservative.c
@@ -368,7 +368,6 @@ static struct common_dbs_data cs_dbs_cda
.gov_check_cpu = cs_check_cpu,
.init = cs_init,
.exit = cs_exit,
-   .mutex = __MUTEX_INITIALIZER(cs_dbs_cdata.mutex),
  };

  static int cs_cpufreq_governor_dbs(struct cpufreq_policy *policy,

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html




Acked-by: Saravana Kannan 


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 1/11] cpufreq: Clean up default and fallback governor setup

2016-02-03 Thread Saravana Kannan
_CPU_FREQ_DEFAULT_GOV_POWERSAVE
-static
-#endif
-struct cpufreq_governor cpufreq_gov_powersave = {
+static struct cpufreq_governor cpufreq_gov_powersave = {
.name   = "powersave",
.governor   = cpufreq_governor_powersave,
.owner  = THIS_MODULE,
@@ -57,6 +54,11 @@ MODULE_DESCRIPTION("CPUfreq policy gover
  MODULE_LICENSE("GPL");

  #ifdef CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE
+struct cpufreq_governor *cpufreq_default_governor(void)
+{
+   return _gov_powersave;
+}
+
  fs_initcall(cpufreq_gov_powersave_init);
  #else
  module_init(cpufreq_gov_powersave_init);
Index: linux-pm/drivers/cpufreq/cpufreq_userspace.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_userspace.c
+++ linux-pm/drivers/cpufreq/cpufreq_userspace.c
@@ -89,10 +89,7 @@ static int cpufreq_governor_userspace(st
return rc;
  }

-#ifndef CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE
-static
-#endif
-struct cpufreq_governor cpufreq_gov_userspace = {
+static struct cpufreq_governor cpufreq_gov_userspace = {
.name   = "userspace",
.governor   = cpufreq_governor_userspace,
.store_setspeed = cpufreq_set,
@@ -116,6 +113,11 @@ MODULE_DESCRIPTION("CPUfreq policy gover
  MODULE_LICENSE("GPL");

  #ifdef CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE
+struct cpufreq_governor *cpufreq_default_governor(void)
+{
+   return _gov_userspace;
+}
+
  fs_initcall(cpufreq_gov_userspace_init);
  #else
  module_init(cpufreq_gov_userspace_init);

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



Ah, that's so much more cleaner!

Acked-by: Saravana Kannan 


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH V2 5/7] cpufreq: Merge cpufreq_offline_prepare/finish routines

2016-02-03 Thread Saravana Kannan

On 02/03/2016 06:02 AM, Viresh Kumar wrote:

The offline routine was separated into two halves earlier by
'commit 1aee40ac9c86 ("cpufreq: Invoke __cpufreq_remove_dev_finish()
after releasing cpu_hotplug.lock");.

And the reasons cited were, race issues between accessing policy's sysfs
files and policy kobject's cleanup.

That race isn't valid anymore, as we don't remove the policy & its
kobject completely on hotplugs, but do that from ->remove() callback of
subsys framework.

These two routines can be merged back now.

This is a preparatory step for the next patch, that will enforce
policy->rwsem lock around __cpufreq_governor() routines STOP/EXIT
sequence.


Is this stale text? Seems like this is now done in the *previous* patch?

-Saravana


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/5] cpufreq: governor: Create separate sysfs-ops

2016-02-03 Thread Saravana Kannan

On 02/02/2016 10:54 PM, Viresh Kumar wrote:

On 02-02-16, 17:32, Saravana Kannan wrote:

But if we are expecting sched dvfs to come in, why make it worse for it. It
would be completely pointless to try and shoehorn sched dvfs to use
cpufreq_governor.c


We can move the common part to cpufreq core and not make sched-dvfs
reuse cpufreq_governor.c



Let's do this please.

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/5] cpufreq: governor: Create separate sysfs-ops

2016-02-03 Thread Saravana Kannan

On 02/02/2016 10:57 PM, Viresh Kumar wrote:

On 02-02-16, 20:03, Saravana Kannan wrote:

This is the hotplug case I mentioned. The sysfs file removals will happen
only for the last CPU in that policy (we thankfully optimized that part last
year). We also know that multiple CPUs can't be hotplugged at the same time.
So, in the start of cpufreq_offline_prepare, we just need to check if this
is the last CPU in the policy and if that's the case, do the gov sysfs
remove and then grab the policy lock and do all our crap. If a read is going
on, that's going to finish before the sysfs attr remove can go ahead and it
can grab the policy lock if it needs to and that still won't cause a
deadlock because we haven't yet grabbed the policy lock in
cpufreq_offline_prepare(). If the read comes after the sysfs remove, then
the read is obviously going to fail (we can depend on the sysfs framework on
doing its job there).


IMHO, these are all dirty hacks we should stay away from. Adding such
hunks in code is considered a band-aid kind of solution and hurts
readability badly. The new solution (new governor show/store)
implement this in a very clean and proper way I feel..

Others are free to disagree though :)



I think it looks clean since we haven't sorted out the race conditions 
that Juri pointed out. So, it's early to call this series clean :)


Also, I don't see it as a dirty hack at all. What's so hacky about it? 
We are just identifying conditions when we'll have to remove the sysfs 
files and removing them before grabbing the policy lock.


The unlock/lock that we have now is what is a dirty hack.

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH V2 5/7] cpufreq: Merge cpufreq_offline_prepare/finish routines

2016-02-03 Thread Saravana Kannan

On 02/03/2016 06:02 AM, Viresh Kumar wrote:

The offline routine was separated into two halves earlier by
'commit 1aee40ac9c86 ("cpufreq: Invoke __cpufreq_remove_dev_finish()
after releasing cpu_hotplug.lock");.

And the reasons cited were, race issues between accessing policy's sysfs
files and policy kobject's cleanup.

That race isn't valid anymore, as we don't remove the policy & its
kobject completely on hotplugs, but do that from ->remove() callback of
subsys framework.

These two routines can be merged back now.

This is a preparatory step for the next patch, that will enforce
policy->rwsem lock around __cpufreq_governor() routines STOP/EXIT
sequence.


Is this stale text? Seems like this is now done in the *previous* patch?

-Saravana


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 1/11] cpufreq: Clean up default and fallback governor setup

2016-02-03 Thread Saravana Kannan
@@ static int cpufreq_governor_powersave(st
return 0;
  }

-#ifndef CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE
-static
-#endif
-struct cpufreq_governor cpufreq_gov_powersave = {
+static struct cpufreq_governor cpufreq_gov_powersave = {
.name   = "powersave",
.governor   = cpufreq_governor_powersave,
.owner  = THIS_MODULE,
@@ -57,6 +54,11 @@ MODULE_DESCRIPTION("CPUfreq policy gover
  MODULE_LICENSE("GPL");

  #ifdef CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE
+struct cpufreq_governor *cpufreq_default_governor(void)
+{
+   return _gov_powersave;
+}
+
  fs_initcall(cpufreq_gov_powersave_init);
  #else
  module_init(cpufreq_gov_powersave_init);
Index: linux-pm/drivers/cpufreq/cpufreq_userspace.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_userspace.c
+++ linux-pm/drivers/cpufreq/cpufreq_userspace.c
@@ -89,10 +89,7 @@ static int cpufreq_governor_userspace(st
return rc;
  }

-#ifndef CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE
-static
-#endif
-struct cpufreq_governor cpufreq_gov_userspace = {
+static struct cpufreq_governor cpufreq_gov_userspace = {
.name   = "userspace",
.governor   = cpufreq_governor_userspace,
.store_setspeed = cpufreq_set,
@@ -116,6 +113,11 @@ MODULE_DESCRIPTION("CPUfreq policy gover
  MODULE_LICENSE("GPL");

  #ifdef CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE
+struct cpufreq_governor *cpufreq_default_governor(void)
+{
+   return _gov_userspace;
+}
+
  fs_initcall(cpufreq_gov_userspace_init);
  #else
  module_init(cpufreq_gov_userspace_init);

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



Ah, that's so much more cleaner!

Acked-by: Saravana Kannan <skan...@codeaurora.org>


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/11] cpufreq: governor: Use common mutex for dbs_data protection

2016-02-03 Thread Saravana Kannan

On 02/03/2016 03:16 PM, Rafael J. Wysocki wrote:

From: Rafael J. Wysocki <rafael.j.wyso...@intel.com>

Every governor relying on the common code in cpufreq_governor.c
has to provide its own mutex in struct common_dbs_data.  However,
those mutexes are never used at the same time and doing it this
way makes it rather difficult to follow the code.  Moreover,
if two governor modules are loaded we end up with two mutexes
used for the same purpose one at a time.

Introduce a single common mutex for that instead and drop the
mutex field from struct common_dbs_data.  That at least will
ensure that the mutex is always present and initialized regardless
of what the particular governors do.

Signed-off-by: Rafael J. Wysocki <rafael.j.wyso...@intel.com>
---
  drivers/cpufreq/cpufreq_conservative.c |1 -
  drivers/cpufreq/cpufreq_governor.c |7 +--
  drivers/cpufreq/cpufreq_governor.h |6 +-
  drivers/cpufreq/cpufreq_ondemand.c |5 ++---
  4 files changed, 8 insertions(+), 11 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -22,6 +22,9 @@

  #include "cpufreq_governor.h"

+DEFINE_MUTEX(dbs_data_mutex);
+EXPORT_SYMBOL_GPL(dbs_data_mutex);
+
  static struct attribute_group *get_sysfs_attr(struct dbs_data *dbs_data)
  {
if (have_governor_per_policy())
@@ -542,7 +545,7 @@ int cpufreq_governor_dbs(struct cpufreq_
int ret;

/* Lock governor to block concurrent initialization of governor */
-   mutex_lock(>mutex);
+   mutex_lock(_data_mutex);

if (have_governor_per_policy())
dbs_data = policy->governor_data;
@@ -575,7 +578,7 @@ int cpufreq_governor_dbs(struct cpufreq_
}

  unlock:
-   mutex_unlock(>mutex);
+   mutex_unlock(_data_mutex);

return ret;
  }
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -223,11 +223,6 @@ struct common_dbs_data {

/* Governor specific ops, see below */
void *gov_ops;
-
-   /*
-* Protects governor's data (struct dbs_data and struct common_dbs_data)
-*/
-   struct mutex mutex;
  };

  /* Governor Per policy data */
@@ -272,6 +267,7 @@ static ssize_t show_sampling_rate_min_go
return sprintf(buf, "%u\n", dbs_data->min_sampling_rate);  \
  }

+extern struct mutex dbs_data_mutex;
  extern struct mutex cpufreq_governor_lock;
  void gov_set_update_util(struct cpu_common_dbs_info *shared,
 unsigned int delay_us);
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
@@ -251,7 +251,7 @@ static void update_sampling_rate(struct
/*
 * Lock governor so that governor start/stop can't execute in parallel.
 */
-   mutex_lock(_dbs_cdata.mutex);
+   mutex_lock(_data_mutex);

cpumask_copy(, cpu_online_mask);

@@ -304,7 +304,7 @@ static void update_sampling_rate(struct
}
}

-   mutex_unlock(_dbs_cdata.mutex);
+   mutex_unlock(_data_mutex);
  }

  static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
@@ -550,7 +550,6 @@ static struct common_dbs_data od_dbs_cda
.gov_ops = _ops,
.init = od_init,
.exit = od_exit,
-   .mutex = __MUTEX_INITIALIZER(od_dbs_cdata.mutex),
  };

  static int od_cpufreq_governor_dbs(struct cpufreq_policy *policy,
Index: linux-pm/drivers/cpufreq/cpufreq_conservative.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_conservative.c
+++ linux-pm/drivers/cpufreq/cpufreq_conservative.c
@@ -368,7 +368,6 @@ static struct common_dbs_data cs_dbs_cda
.gov_check_cpu = cs_check_cpu,
.init = cs_init,
.exit = cs_exit,
-   .mutex = __MUTEX_INITIALIZER(cs_dbs_cdata.mutex),
  };

  static int cs_cpufreq_governor_dbs(struct cpufreq_policy *policy,

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html




Acked-by: Saravana Kannan <skan...@codeaurora.org>


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 4/11] cpufreq: governor: Avoid passing dbs_data pointers around unnecessarily

2016-02-03 Thread Saravana Kannan
ata *dbs_data;
-   int ret;
+   int ret = -EINVAL;

/* Lock governor to block concurrent initialization of governor */
mutex_lock(_data_mutex);

-   if (have_governor_per_policy())
-   dbs_data = policy->governor_data;
-   else
-   dbs_data = global_dbs_data;
-
-   if (!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT)) {
-   ret = -EINVAL;
-   goto unlock;
+   if (event == CPUFREQ_GOV_POLICY_INIT) {
+   ret = cpufreq_governor_init(policy, cdata);
+   } else if (policy->governor_data) {
+   switch (event) {
+   case CPUFREQ_GOV_POLICY_EXIT:
+   ret = cpufreq_governor_exit(policy);
+   break;
+   case CPUFREQ_GOV_START:
+   ret = cpufreq_governor_start(policy);
+   break;
+   case CPUFREQ_GOV_STOP:
+   ret = cpufreq_governor_stop(policy);
+   break;
+   case CPUFREQ_GOV_LIMITS:
+   ret = cpufreq_governor_limits(policy);
+   break;
+   }
}

-   switch (event) {
-   case CPUFREQ_GOV_POLICY_INIT:
-   ret = cpufreq_governor_init(policy, dbs_data, cdata);
-   break;
-   case CPUFREQ_GOV_POLICY_EXIT:
-   ret = cpufreq_governor_exit(policy, dbs_data);
-   break;
-   case CPUFREQ_GOV_START:
-   ret = cpufreq_governor_start(policy, dbs_data);
-   break;
-   case CPUFREQ_GOV_STOP:
-   ret = cpufreq_governor_stop(policy, dbs_data);
-   break;
-   case CPUFREQ_GOV_LIMITS:
-   ret = cpufreq_governor_limits(policy, dbs_data);
-   break;
-   default:
-   ret = -EINVAL;
-   }
-
-unlock:
mutex_unlock(_data_mutex);
-


Po-tay-to, Po-tah-to.


return ret;
  }
  EXPORT_SYMBOL_GPL(cpufreq_governor_dbs);



Agree with the general idea of the patch though.

Conditional on the comment above being resolve amongst the others:
Acked-by: Saravana Kannan <skan...@codeaurora.org>

-Saravana
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 3/11] cpufreq: governor: Use common global_dbs_data pointer

2016-02-03 Thread Saravana Kannan

On 02/03/2016 05:11 PM, Saravana Kannan wrote:

On 02/03/2016 03:22 PM, Rafael J. Wysocki wrote:

From: Rafael J. Wysocki <rafael.j.wyso...@intel.com>

If the ondemand and conservative governors cannot use per-policy
tunables (CPUFREQ_HAVE_GOVERNOR_PER_POLICY is not set in the cpufreq
driver), all policy objects point to the same single dbs_data object.
Additionally, that object is pointed to by a global pointer hidden in
the governor's data structures.

There is no reason for that pointer to be buried in those
data structures, though, so make it explicitly global.

Signed-off-by: Rafael J. Wysocki <rafael.j.wyso...@intel.com>
---
  drivers/cpufreq/cpufreq_governor.c |   20 ++--
  drivers/cpufreq/cpufreq_governor.h |   20 ++--
  2 files changed, 20 insertions(+), 20 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -78,7 +78,7 @@ __ATTR(_name, 0644, show_##_name##_gov_p
  static ssize_t show_##file_name##_gov_sys\
  (struct kobject *kobj, struct attribute *attr, char *buf)\
  {\
-struct _gov##_dbs_tuners *tuners =
_gov##_dbs_cdata.gdbs_data->tuners; \
+struct _gov##_dbs_tuners *tuners = global_dbs_data->tuners; \
  return sprintf(buf, "%u\n", tuners->file_name);\
  }\
  \
@@ -94,7 +94,7 @@ static ssize_t show_##file_name##_gov_po
  static ssize_t store_##file_name##_gov_sys\
  (struct kobject *kobj, struct attribute *attr, const char *buf,
size_t count) \
  {\
-struct dbs_data *dbs_data = _gov##_dbs_cdata.gdbs_data;\
+struct dbs_data *dbs_data = global_dbs_data;\
  return store_##file_name(dbs_data, buf, count);\
  }\
  \
@@ -201,19 +201,14 @@ struct cs_dbs_tuners {
  /* Common Governor data across policies */
  struct dbs_data;
  struct common_dbs_data {
-/* Common across governors */
+struct cpufreq_governor gov;
+


Actually, this line is completely unrelated to this patch. Should go on 
Patch 5?




Cautiously Acked-by: Saravana Kannan <skan...@codeaurora.org>


Good call on the "cautiously" I guess!

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 3/11] cpufreq: governor: Use common global_dbs_data pointer

2016-02-03 Thread Saravana Kannan
et_sysfs_attr(dbs_data));
-   if (ret)
-   goto reset_gdbs_data;
-
-   return 0;
+   if (!ret)
+   return 0;


I think the previous method of a handling the error is easier to read 
and more in line with the typical kernel coding style. The successful 
path ends in an unconditional return statement and the error paths are 
handled with a goto.


This also doesn't seem relevant to what the patch is trying to do. So, 
I'd prefer that it be left as is.




-reset_gdbs_data:
policy->governor_data = NULL;
-
if (!have_governor_per_policy())
-   cdata->gdbs_data = NULL;
+   global_dbs_data = NULL;
+
cdata->exit(dbs_data, !policy->governor->initialized);
  free_common_dbs_info:
free_common_dbs_info(policy, cdata);
@@ -418,7 +418,7 @@ static int cpufreq_governor_exit(struct
policy->governor_data = NULL;

if (!have_governor_per_policy())
-   cdata->gdbs_data = NULL;
+   global_dbs_data = NULL;

cdata->exit(dbs_data, policy->governor->initialized == 1);
kfree(dbs_data);
@@ -550,7 +550,7 @@ int cpufreq_governor_dbs(struct cpufreq_
if (have_governor_per_policy())
dbs_data = policy->governor_data;
else
-   dbs_data = cdata->gdbs_data;
+   dbs_data = global_dbs_data;

if (!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT)) {
ret = -EINVAL;

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



If the minor comment is addressed, this looks okay to me.

Cautiously Acked-by: Saravana Kannan <skan...@codeaurora.org>

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 3/11] cpufreq: governor: Use common global_dbs_data pointer

2016-02-03 Thread Saravana Kannan

On 02/03/2016 05:25 PM, Rafael J. Wysocki wrote:

On Thu, Feb 4, 2016 at 2:11 AM, Saravana Kannan <skan...@codeaurora.org> wrote:

On 02/03/2016 03:22 PM, Rafael J. Wysocki wrote:


From: Rafael J. Wysocki <rafael.j.wyso...@intel.com>

If the ondemand and conservative governors cannot use per-policy
tunables (CPUFREQ_HAVE_GOVERNOR_PER_POLICY is not set in the cpufreq
driver), all policy objects point to the same single dbs_data object.
Additionally, that object is pointed to by a global pointer hidden in
the governor's data structures.

There is no reason for that pointer to be buried in those
data structures, though, so make it explicitly global.

Signed-off-by: Rafael J. Wysocki <rafael.j.wyso...@intel.com>
---
   drivers/cpufreq/cpufreq_governor.c |   20 ++--
   drivers/cpufreq/cpufreq_governor.h |   20 ++--
   2 files changed, 20 insertions(+), 20 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -78,7 +78,7 @@ __ATTR(_name, 0644, show_##_name##_gov_p
   static ssize_t show_##file_name##_gov_sys \
   (struct kobject *kobj, struct attribute *attr, char *buf) \
   { \
-   struct _gov##_dbs_tuners *tuners =
_gov##_dbs_cdata.gdbs_data->tuners; \
+   struct _gov##_dbs_tuners *tuners = global_dbs_data->tuners; \
 return sprintf(buf, "%u\n", tuners->file_name); \
   } \
 \
@@ -94,7 +94,7 @@ static ssize_t show_##file_name##_gov_po
   static ssize_t store_##file_name##_gov_sys\
   (struct kobject *kobj, struct attribute *attr, const char *buf, size_t
count) \
   { \
-   struct dbs_data *dbs_data = _gov##_dbs_cdata.gdbs_data; \
+   struct dbs_data *dbs_data = global_dbs_data;\
 return store_##file_name(dbs_data, buf, count); \
   } \
 \
@@ -201,19 +201,14 @@ struct cs_dbs_tuners {
   /* Common Governor data across policies */
   struct dbs_data;
   struct common_dbs_data {
-   /* Common across governors */
+   struct cpufreq_governor gov;
+
 #define GOV_ONDEMAND0
 #define GOV_CONSERVATIVE1
 int governor;
 struct attribute_group *attr_group_gov_sys; /* one governor -
system */
 struct attribute_group *attr_group_gov_pol; /* one governor -
policy */

-   /*
-* Common data for platforms that don't set
-* CPUFREQ_HAVE_GOVERNOR_PER_POLICY
-*/
-   struct dbs_data *gdbs_data;
-
 struct cpu_dbs_info *(*get_cpu_cdbs)(int cpu);
 void *(*get_cpu_dbs_info_s)(int cpu);
 unsigned int (*gov_dbs_timer)(struct cpufreq_policy *policy);
@@ -233,6 +228,11 @@ struct dbs_data {
 void *tuners;
   };

+/*
+ * Common governor data for platforms without
CPUFREQ_HAVE_GOVERNOR_PER_POLICY.
+ */
+extern struct dbs_data *global_dbs_data;
+
   /* Governor specific ops, will be passed to dbs_data->gov_ops */
   struct od_ops {
 void (*powersave_bias_init_cpu)(int cpu);
@@ -256,7 +256,7 @@ static inline int delay_for_sampling_rat
   static ssize_t show_sampling_rate_min_gov_sys \
   (struct kobject *kobj, struct attribute *attr, char *buf) \
   { \
-   struct dbs_data *dbs_data = _gov##_dbs_cdata.gdbs_data; \
+   struct dbs_data *dbs_data = global_dbs_data;\
 return sprintf(buf, "%u\n", dbs_data->min_sampling_rate);   \
   } \
 \
Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -22,6 +22,9 @@

   #include "cpufreq_governor.h"

+struct dbs_data *global_dbs_data;
+EXPORT_SYMBOL_GPL(global_dbs_data);
+
   DEFINE_MUTEX(dbs_data_mutex);
   EXPORT_SYMBOL_GPL(dbs_data_mutex);

@@ -377,22 +380,19 @@ static int cpufreq_governor_init(struct
 latency * LATENCY_MULTIPLIER));

 if (!have_governor_per_policy())
-   cdata->gdbs_data = dbs_data;
+   glob

Re: [PATCH 7/11] cpufreq: governor: Rework cpufreq_governor_dbs()

2016-02-03 Thread Saravana Kannan
_dbs(struct cpufreq_policy *policy,
-  unsigned int event);
-
  static struct dbs_governor od_dbs_gov = {
.gov = {
.name = "ondemand",
-   .governor = od_cpufreq_governor_dbs,
+   .governor = cpufreq_governor_dbs,
.max_transition_latency = TRANSITION_LATENCY_LIMIT,
.owner = THIS_MODULE,
},
@@ -563,12 +560,6 @@ static struct dbs_governor od_dbs_gov =

  #define CPU_FREQ_GOV_ONDEMAND (_dbs_gov.gov)

-static int od_cpufreq_governor_dbs(struct cpufreq_policy *policy,
-   unsigned int event)
-{
-   return cpufreq_governor_dbs(policy, _dbs_gov, event);
-}
-
  static void od_set_powersave_bias(unsigned int powersave_bias)
  {
struct cpufreq_policy *policy;

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



Nice!

Acked-by: Saravana Kannan <skan...@codeaurora.org>

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 5/11] cpufreq: governor: Put governor structure into common_dbs_data

2016-02-03 Thread Saravana Kannan
   _cpu(cs_cpu_dbs_info, freq->cpu);
-   struct cpufreq_policy *policy = cpufreq_cpu_get_raw(freq->cpu);
-
-   if (!policy)
-   return 0;
-
-   /* policy isn't governed by conservative governor */
-   if (policy->governor != _gov_conservative)
-   return 0;
-
-   /*
-* we only care if our internally tracked freq moves outside the 'valid'
-* ranges of frequency available to us otherwise we do not change it
-   */
-   if (dbs_info->requested_freq > policy->max
-   || dbs_info->requested_freq < policy->min)
-   dbs_info->requested_freq = freq->new;
-
-   return 0;
-}
+   void *data);

  static struct notifier_block cs_cpufreq_notifier_block = {
.notifier_call = dbs_cpufreq_notifier,
@@ -358,7 +325,16 @@ static void cs_exit(struct dbs_data *dbs

  define_get_cpu_dbs_routines(cs_cpu_dbs_info);

+static int cs_cpufreq_governor_dbs(struct cpufreq_policy *policy,
+  unsigned int event);
+
  static struct common_dbs_data cs_dbs_cdata = {
+   .gov = {
+   .name = "conservative",
+   .governor = cs_cpufreq_governor_dbs,
+   .max_transition_latency = TRANSITION_LATENCY_LIMIT,
+   .owner = THIS_MODULE,
+   },
.governor = GOV_CONSERVATIVE,
.attr_group_gov_sys = _attr_group_gov_sys,
.attr_group_gov_pol = _attr_group_gov_pol,
@@ -370,20 +346,48 @@ static struct common_dbs_data cs_dbs_cda
.exit = cs_exit,
  };

+#define CPU_FREQ_GOV_CONSERVATIVE  (_dbs_cdata.gov)
+
  static int cs_cpufreq_governor_dbs(struct cpufreq_policy *policy,
   unsigned int event)
  {
return cpufreq_governor_dbs(policy, _dbs_cdata, event);
  }

+static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
+   void *data)
+{
+   struct cpufreq_freqs *freq = data;
+   struct cs_cpu_dbs_info_s *dbs_info =
+   _cpu(cs_cpu_dbs_info, freq->cpu);
+   struct cpufreq_policy *policy = cpufreq_cpu_get_raw(freq->cpu);
+
+   if (!policy)
+   return 0;
+
+   /* policy isn't governed by conservative governor */
+   if (policy->governor != CPU_FREQ_GOV_CONSERVATIVE)
+   return 0;
+
+   /*
+* we only care if our internally tracked freq moves outside the 'valid'
+* ranges of frequency available to us otherwise we do not change it
+   */
+   if (dbs_info->requested_freq > policy->max
+   || dbs_info->requested_freq < policy->min)
+   dbs_info->requested_freq = freq->new;
+
+   return 0;
+}
+
  static int __init cpufreq_gov_dbs_init(void)
  {
-   return cpufreq_register_governor(_gov_conservative);
+   return cpufreq_register_governor(CPU_FREQ_GOV_CONSERVATIVE);
  }

  static void __exit cpufreq_gov_dbs_exit(void)
  {
-   cpufreq_unregister_governor(_gov_conservative);
+   cpufreq_unregister_governor(CPU_FREQ_GOV_CONSERVATIVE);
  }

  MODULE_AUTHOR("Alexander Clouter <a...@digriz.org.uk>");
@@ -395,7 +399,7 @@ MODULE_LICENSE("GPL");
  #ifdef CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE
  struct cpufreq_governor *cpufreq_default_governor(void)
  {
-   return _gov_conservative;
+   return CPU_FREQ_GOV_CONSERVATIVE;
  }

  fs_initcall(cpufreq_gov_dbs_init);



I'm not sold on the macros/#defines for the , but not a strong 
opinion.


Acked-by: Saravana Kannan <skan...@codeaurora.org>

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/5] cpufreq: governor: Create separate sysfs-ops

2016-02-03 Thread Saravana Kannan

On 02/02/2016 10:57 PM, Viresh Kumar wrote:

On 02-02-16, 20:03, Saravana Kannan wrote:

This is the hotplug case I mentioned. The sysfs file removals will happen
only for the last CPU in that policy (we thankfully optimized that part last
year). We also know that multiple CPUs can't be hotplugged at the same time.
So, in the start of cpufreq_offline_prepare, we just need to check if this
is the last CPU in the policy and if that's the case, do the gov sysfs
remove and then grab the policy lock and do all our crap. If a read is going
on, that's going to finish before the sysfs attr remove can go ahead and it
can grab the policy lock if it needs to and that still won't cause a
deadlock because we haven't yet grabbed the policy lock in
cpufreq_offline_prepare(). If the read comes after the sysfs remove, then
the read is obviously going to fail (we can depend on the sysfs framework on
doing its job there).


IMHO, these are all dirty hacks we should stay away from. Adding such
hunks in code is considered a band-aid kind of solution and hurts
readability badly. The new solution (new governor show/store)
implement this in a very clean and proper way I feel..

Others are free to disagree though :)



I think it looks clean since we haven't sorted out the race conditions 
that Juri pointed out. So, it's early to call this series clean :)


Also, I don't see it as a dirty hack at all. What's so hacky about it? 
We are just identifying conditions when we'll have to remove the sysfs 
files and removing them before grabbing the policy lock.


The unlock/lock that we have now is what is a dirty hack.

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/5] cpufreq: governor: Create separate sysfs-ops

2016-02-03 Thread Saravana Kannan

On 02/02/2016 10:54 PM, Viresh Kumar wrote:

On 02-02-16, 17:32, Saravana Kannan wrote:

But if we are expecting sched dvfs to come in, why make it worse for it. It
would be completely pointless to try and shoehorn sched dvfs to use
cpufreq_governor.c


We can move the common part to cpufreq core and not make sched-dvfs
reuse cpufreq_governor.c



Let's do this please.

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor

2016-02-02 Thread Saravana Kannan

On 02/02/2016 09:02 PM, Viresh Kumar wrote:

On 02-02-16, 20:04, Saravana Kannan wrote:

What's the s_active lock in CPU1 coming from?


That's taken by sysfs core while removing the files.


The only reason it's there
today is because of the sysfs dir remove. If you move it before the
policy->rwsem, you won't have it after the policy->rwsem too. So, I think it
will fix the issue.


Its complex and we will end up making ugly..


I disagree. I think it's way better and simpler than this patch set. It 
also doesn't tie into cpufreq_governor.* which is a good thing IMO since 
it keeps things simpler for sched-dvfs too.



For example, EXIT can be called while switching governors. The
policy->rwsem is taken at the beginning cpufreq_set_policy(). To
decide if we should remove the governor sysfs directory so early (i.e.
before taking rwsem) in the call, is going to be difficult.


Just check if the governor is changing. And if it is, you just need to 
remove the policy specific stuff.



Over that the same directory might be shared across multiple policies,
and all that information is present only with the governor-core.


That's why I said the gov needs to register the per pol and system wide 
attrs list separately.


This will also remove the need for ever governor to do this crap.

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor

2016-02-02 Thread Saravana Kannan

On 02/02/2016 06:13 PM, Viresh Kumar wrote:

On 02-02-16, 13:37, Saravana Kannan wrote:

On 02/01/2016 10:34 PM, Viresh Kumar wrote:

What will that solve? It will stay exactly same then as well, as we
would be adding/removing these attributes from within the same
policy->rwsem ..


The problem isn't that you are holding the policy rwsem. The problem is that
we are trying to grab the same locks in different order. This is trying to
fix that.


That's exactly what I was trying to say, sorry for not being very
clear.

Even if you would move the sysfs file creation thing into the cpufreq
core, instead of governor, we will have locks this way:

CPU0CPU1
(sysfs read)(sysfs dir remove)
s_active lock   policy->rwsem
policy->rwsem
 s_active lock (hang)


And so I said, nothing will change.



What's the s_active lock in CPU1 coming from? The only reason it's there 
today is because of the sysfs dir remove. If you move it before the 
policy->rwsem, you won't have it after the policy->rwsem too. So, I 
think it will fix the issue.


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/5] cpufreq: governor: Create separate sysfs-ops

2016-02-02 Thread Saravana Kannan

On 02/02/2016 05:52 PM, Rafael J. Wysocki wrote:

On Wed, Feb 3, 2016 at 2:32 AM, Saravana Kannan  wrote:

On 02/02/2016 05:07 PM, Rafael J. Wysocki wrote:


On Wed, Feb 3, 2016 at 12:42 AM, Rafael J. Wysocki 
wrote:


On Tue, Feb 2, 2016 at 11:21 PM, Saravana Kannan 
wrote:


On 02/02/2016 11:40 AM, Rafael J. Wysocki wrote:



On Tue, Feb 2, 2016 at 6:01 PM, Juri Lelli  wrote:





[cut]



I also don't like this patch because it forces governors to either
implement
their own macros and management of their attributes or force them to use
the
governor structs that come with cpufreq_governor.h. cpufreq_governor.h
IMHO
is very ondemand and conservative governor specific and is very
irrelevant
for sched-dvfs or any other governors (hint hint).

The only time this ABBA locking is an issue is when governor are
changing
and trying to add/remove attributes. That can easily be checked in
store_governor and dealt with without holding the policy rwsem if the
governors can provide their per sys and per policy attribute arrays as
part
of registering themselves.

I'm sorry that I just keep talking about the idea and not sending out
the
patches.



I think you have a point, though.

The deadlock really is specific to the governors using the code in
cpufreq_governor.c.



That said no other governors in the tree use any sysfs attributes for
tunables AFAICS and the out-of-the tree ones are out of interest here.



But if we are expecting sched dvfs to come in, why make it worse for it. It
would be completely pointless to try and shoehorn sched dvfs to use
cpufreq_governor.c


Well, do you honestly think that using the existing stuff in it would
be a good idea?

If not, then why it matters at all?


Also the deadlock happens if one of the tunable attributes is accessed
while we're trying to remove it which very well may happen on read
access too.


Isn't this THE deadlock we are talking about? The removal of the attributes
only happen when governors are changes and we send a POLICY_EXIT and or all
the cores are hotplugged out.


It generally happens when the "old" governor is going away, whatever the reason.


And my suggestion would work just as well there.

Why are you prefixing your sentence with "Also"? Is there some other case
I'm not considering?


Say someone is reading sampling_rate for a policy with 1 CPU in it and
someone else is taking the CPU offline.  The governor EXIT code path
(that will trigger as a result) will try to remove the sampling_rate
attribute and (if it does that under policy->rwsem) it'll wait for the
read access to finish.  Where exactly would you put the deadlock
prevention in this case?


This is the hotplug case I mentioned. The sysfs file removals will 
happen only for the last CPU in that policy (we thankfully optimized 
that part last year). We also know that multiple CPUs can't be 
hotplugged at the same time. So, in the start of 
cpufreq_offline_prepare, we just need to check if this is the last CPU 
in the policy and if that's the case, do the gov sysfs remove and then 
grab the policy lock and do all our crap. If a read is going on, that's 
going to finish before the sysfs attr remove can go ahead and it can 
grab the policy lock if it needs to and that still won't cause a 
deadlock because we haven't yet grabbed the policy lock in 
cpufreq_offline_prepare(). If the read comes after the sysfs remove, 
then the read is obviously going to fail (we can depend on the sysfs 
framework on doing its job there).


Will that still leave any race conditions in?

-Saravana
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/5] cpufreq: governor: Create separate sysfs-ops

2016-02-02 Thread Saravana Kannan

On 02/02/2016 05:07 PM, Rafael J. Wysocki wrote:

On Wed, Feb 3, 2016 at 12:42 AM, Rafael J. Wysocki  wrote:

On Tue, Feb 2, 2016 at 11:21 PM, Saravana Kannan  wrote:

On 02/02/2016 11:40 AM, Rafael J. Wysocki wrote:


On Tue, Feb 2, 2016 at 6:01 PM, Juri Lelli  wrote:




[cut]



I also don't like this patch because it forces governors to either implement
their own macros and management of their attributes or force them to use the
governor structs that come with cpufreq_governor.h. cpufreq_governor.h IMHO
is very ondemand and conservative governor specific and is very irrelevant
for sched-dvfs or any other governors (hint hint).

The only time this ABBA locking is an issue is when governor are changing
and trying to add/remove attributes. That can easily be checked in
store_governor and dealt with without holding the policy rwsem if the
governors can provide their per sys and per policy attribute arrays as part
of registering themselves.

I'm sorry that I just keep talking about the idea and not sending out the
patches.


I think you have a point, though.

The deadlock really is specific to the governors using the code in
cpufreq_governor.c.


That said no other governors in the tree use any sysfs attributes for
tunables AFAICS and the out-of-the tree ones are out of interest here.


But if we are expecting sched dvfs to come in, why make it worse for it. 
It would be completely pointless to try and shoehorn sched dvfs to use 
cpufreq_governor.c



Also the deadlock happens if one of the tunable attributes is accessed
while we're trying to remove it which very well may happen on read
access too.


Isn't this THE deadlock we are talking about? The removal of the 
attributes only happen when governors are changes and we send a 
POLICY_EXIT and or all the cores are hotplugged out. And my suggestion 
would work just as well there.


Why are you prefixing your sentence with "Also"? Is there some other 
case I'm not considering?


-Saravana
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/5] cpufreq: governor: Create separate sysfs-ops

2016-02-02 Thread Saravana Kannan

On 02/02/2016 11:40 AM, Rafael J. Wysocki wrote:

On Tue, Feb 2, 2016 at 6:01 PM, Juri Lelli  wrote:

Hi Rafael,

On 02/02/16 17:35, Rafael J. Wysocki wrote:

On Tue, Feb 2, 2016 at 4:47 PM, Juri Lelli  wrote:

Hi Viresh,

On 02/02/16 16:27, Viresh Kumar wrote:

Until now, governors (ondemand/conservative) were using the
'global-attr' or 'freq-attr', depending on the sysfs location where we
want to create governor's directory.

The problem is that, in case of 'freq-attr', we are forced to use
show()/store() present in cpufreq.c, which always take policy->rwsem.

And because of that we were facing some ABBA lockups during governor
callback event CPUFREQ_GOV_POLICY_EXIT. And so we were dropping the
rwsem right before calling governor callback for CPUFREQ_GOV_POLICY_EXIT
event.

That caused further problems and it never worked perfectly.

This patch attempts to fix that by creating separate sysfs-ops for
cpufreq governors.

Because things got much simplified now, we don't need separate
show/store callbacks for governor-for-system and governor-per-policy
cases.

Signed-off-by: Viresh Kumar 


This patch cleans things up a lot, that's good.

One thing I'm still concerned about, though: don't we need some locking
in place for some of the store operations on governors attributes? Are
store_{ignore_nice_load, sampling_down_fact, etc} safe without locking?


That would require some investigation I suppose.


It seems that we can call them from different cpus concurrently.


Yes, we can.

One quick-and-dirty way of dealing with that might be to introduce a
"sysfs lock" into struct dbs_data and hold that around the invocation
of gattr->store() in the sysfs_ops's ->store callback.



There is value in trying to solve this issue by using some of the
existing locks, IMHO.


Some value - maybe.  I'm not sure how much of it, though.

Finer-grained locking is generally easier to follow, because the locks
tend to be used for specific purposes only.


Can't we actually try to use the policy->rwsem (or one of the core
locks) + wait_for_completion approach as we do in cpufreq core?


No.  Too many things depend on that lock already and some of them work
by accident rather than by design.


Also, wait_for_completion() and complete() is just another way to 
implement a lock. So, it won't necessarily solve any deadlock issues.


I also don't like this patch because it forces governors to either 
implement their own macros and management of their attributes or force 
them to use the governor structs that come with cpufreq_governor.h. 
cpufreq_governor.h IMHO is very ondemand and conservative governor 
specific and is very irrelevant for sched-dvfs or any other governors 
(hint hint).


The only time this ABBA locking is an issue is when governor are 
changing and trying to add/remove attributes. That can easily be checked 
in store_governor and dealt with without holding the policy rwsem if the 
governors can provide their per sys and per policy attribute arrays as 
part of registering themselves.


I'm sorry that I just keep talking about the idea and not sending out 
the patches.


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor

2016-02-02 Thread Saravana Kannan

On 02/01/2016 10:36 PM, Viresh Kumar wrote:

On 01-02-16, 22:00, Rafael J. Wysocki wrote:

I'm not sure what you mean by "the sysfs lock" here?  The policy rwsem
or something else?


He perhaps referred to the s_active.lock that we see in traces.



Yeah, that's what I mean. I generally don't use the exact name of the 
lock in emails (lazy to look it up) if there isn't a lot of chance for 
mistaking it for another lock.


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor

2016-02-02 Thread Saravana Kannan

On 02/01/2016 10:34 PM, Viresh Kumar wrote:

On 01-02-16, 12:24, Saravana Kannan wrote:

On 02/01/2016 02:22 AM, Rafael J. Wysocki wrote:
I'm not sure whose idea you are referring to. Viresh's (I don't think I saw
his proposal) or mine.


http://git.linaro.org/people/viresh.kumar/linux.git/commit/57714d5b1778f2f610bcc5c74d85b29ba1cc1995


Anyway, to explain my suggestion better, I'm proposing to make it so that we
don't have a need for the AB BA locking. The only reason the governor needs
to even grab the sysfs lock is to add/remove the sysfs attribute files.

That can be easily achieved if the policy struct has some "gov_attrs"
field(s) that each governor populates. Then the framework just has to create
them after POLICY_INIT is processed by the governor and remove them before
POILICY_EXIT is sent to the governor.


What will that solve? It will stay exactly same then as well, as we
would be adding/removing these attributes from within the same
policy->rwsem ..


The problem isn't that you are holding the policy rwsem. The problem is 
that we are trying to grab the same locks in different order. This is 
trying to fix that.



That way, we also avoid having to worry about the gov attributes accessed by
the show/store disappearing while the files are being accessed.


It can't happen. S_active lock should be taking care of that, isn't
it?


You are right. That can't happen because we have the s_active lock. I 
meant to say that in general we don't have to worry about the races 
between a show/store needing some policy specific data within the 
governor to be valid but racing with governor change where it ends up 
being invalid. The releasing of the policy rwsem across POLICY_EXIT 
allows this to happen today.


-Saravana


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor

2016-02-02 Thread Saravana Kannan

On 02/01/2016 10:36 PM, Viresh Kumar wrote:

On 01-02-16, 22:00, Rafael J. Wysocki wrote:

I'm not sure what you mean by "the sysfs lock" here?  The policy rwsem
or something else?


He perhaps referred to the s_active.lock that we see in traces.



Yeah, that's what I mean. I generally don't use the exact name of the 
lock in emails (lazy to look it up) if there isn't a lot of chance for 
mistaking it for another lock.


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor

2016-02-02 Thread Saravana Kannan

On 02/01/2016 10:34 PM, Viresh Kumar wrote:

On 01-02-16, 12:24, Saravana Kannan wrote:

On 02/01/2016 02:22 AM, Rafael J. Wysocki wrote:
I'm not sure whose idea you are referring to. Viresh's (I don't think I saw
his proposal) or mine.


http://git.linaro.org/people/viresh.kumar/linux.git/commit/57714d5b1778f2f610bcc5c74d85b29ba1cc1995


Anyway, to explain my suggestion better, I'm proposing to make it so that we
don't have a need for the AB BA locking. The only reason the governor needs
to even grab the sysfs lock is to add/remove the sysfs attribute files.

That can be easily achieved if the policy struct has some "gov_attrs"
field(s) that each governor populates. Then the framework just has to create
them after POLICY_INIT is processed by the governor and remove them before
POILICY_EXIT is sent to the governor.


What will that solve? It will stay exactly same then as well, as we
would be adding/removing these attributes from within the same
policy->rwsem ..


The problem isn't that you are holding the policy rwsem. The problem is 
that we are trying to grab the same locks in different order. This is 
trying to fix that.



That way, we also avoid having to worry about the gov attributes accessed by
the show/store disappearing while the files are being accessed.


It can't happen. S_active lock should be taking care of that, isn't
it?


You are right. That can't happen because we have the s_active lock. I 
meant to say that in general we don't have to worry about the races 
between a show/store needing some policy specific data within the 
governor to be valid but racing with governor change where it ends up 
being invalid. The releasing of the policy rwsem across POLICY_EXIT 
allows this to happen today.


-Saravana


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/5] cpufreq: governor: Create separate sysfs-ops

2016-02-02 Thread Saravana Kannan

On 02/02/2016 11:40 AM, Rafael J. Wysocki wrote:

On Tue, Feb 2, 2016 at 6:01 PM, Juri Lelli  wrote:

Hi Rafael,

On 02/02/16 17:35, Rafael J. Wysocki wrote:

On Tue, Feb 2, 2016 at 4:47 PM, Juri Lelli  wrote:

Hi Viresh,

On 02/02/16 16:27, Viresh Kumar wrote:

Until now, governors (ondemand/conservative) were using the
'global-attr' or 'freq-attr', depending on the sysfs location where we
want to create governor's directory.

The problem is that, in case of 'freq-attr', we are forced to use
show()/store() present in cpufreq.c, which always take policy->rwsem.

And because of that we were facing some ABBA lockups during governor
callback event CPUFREQ_GOV_POLICY_EXIT. And so we were dropping the
rwsem right before calling governor callback for CPUFREQ_GOV_POLICY_EXIT
event.

That caused further problems and it never worked perfectly.

This patch attempts to fix that by creating separate sysfs-ops for
cpufreq governors.

Because things got much simplified now, we don't need separate
show/store callbacks for governor-for-system and governor-per-policy
cases.

Signed-off-by: Viresh Kumar 


This patch cleans things up a lot, that's good.

One thing I'm still concerned about, though: don't we need some locking
in place for some of the store operations on governors attributes? Are
store_{ignore_nice_load, sampling_down_fact, etc} safe without locking?


That would require some investigation I suppose.


It seems that we can call them from different cpus concurrently.


Yes, we can.

One quick-and-dirty way of dealing with that might be to introduce a
"sysfs lock" into struct dbs_data and hold that around the invocation
of gattr->store() in the sysfs_ops's ->store callback.



There is value in trying to solve this issue by using some of the
existing locks, IMHO.


Some value - maybe.  I'm not sure how much of it, though.

Finer-grained locking is generally easier to follow, because the locks
tend to be used for specific purposes only.


Can't we actually try to use the policy->rwsem (or one of the core
locks) + wait_for_completion approach as we do in cpufreq core?


No.  Too many things depend on that lock already and some of them work
by accident rather than by design.


Also, wait_for_completion() and complete() is just another way to 
implement a lock. So, it won't necessarily solve any deadlock issues.


I also don't like this patch because it forces governors to either 
implement their own macros and management of their attributes or force 
them to use the governor structs that come with cpufreq_governor.h. 
cpufreq_governor.h IMHO is very ondemand and conservative governor 
specific and is very irrelevant for sched-dvfs or any other governors 
(hint hint).


The only time this ABBA locking is an issue is when governor are 
changing and trying to add/remove attributes. That can easily be checked 
in store_governor and dealt with without holding the policy rwsem if the 
governors can provide their per sys and per policy attribute arrays as 
part of registering themselves.


I'm sorry that I just keep talking about the idea and not sending out 
the patches.


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/5] cpufreq: governor: Create separate sysfs-ops

2016-02-02 Thread Saravana Kannan

On 02/02/2016 05:07 PM, Rafael J. Wysocki wrote:

On Wed, Feb 3, 2016 at 12:42 AM, Rafael J. Wysocki <raf...@kernel.org> wrote:

On Tue, Feb 2, 2016 at 11:21 PM, Saravana Kannan <skan...@codeaurora.org> wrote:

On 02/02/2016 11:40 AM, Rafael J. Wysocki wrote:


On Tue, Feb 2, 2016 at 6:01 PM, Juri Lelli <juri.le...@arm.com> wrote:




[cut]



I also don't like this patch because it forces governors to either implement
their own macros and management of their attributes or force them to use the
governor structs that come with cpufreq_governor.h. cpufreq_governor.h IMHO
is very ondemand and conservative governor specific and is very irrelevant
for sched-dvfs or any other governors (hint hint).

The only time this ABBA locking is an issue is when governor are changing
and trying to add/remove attributes. That can easily be checked in
store_governor and dealt with without holding the policy rwsem if the
governors can provide their per sys and per policy attribute arrays as part
of registering themselves.

I'm sorry that I just keep talking about the idea and not sending out the
patches.


I think you have a point, though.

The deadlock really is specific to the governors using the code in
cpufreq_governor.c.


That said no other governors in the tree use any sysfs attributes for
tunables AFAICS and the out-of-the tree ones are out of interest here.


But if we are expecting sched dvfs to come in, why make it worse for it. 
It would be completely pointless to try and shoehorn sched dvfs to use 
cpufreq_governor.c



Also the deadlock happens if one of the tunable attributes is accessed
while we're trying to remove it which very well may happen on read
access too.


Isn't this THE deadlock we are talking about? The removal of the 
attributes only happen when governors are changes and we send a 
POLICY_EXIT and or all the cores are hotplugged out. And my suggestion 
would work just as well there.


Why are you prefixing your sentence with "Also"? Is there some other 
case I'm not considering?


-Saravana
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor

2016-02-02 Thread Saravana Kannan

On 02/02/2016 09:02 PM, Viresh Kumar wrote:

On 02-02-16, 20:04, Saravana Kannan wrote:

What's the s_active lock in CPU1 coming from?


That's taken by sysfs core while removing the files.


The only reason it's there
today is because of the sysfs dir remove. If you move it before the
policy->rwsem, you won't have it after the policy->rwsem too. So, I think it
will fix the issue.


Its complex and we will end up making ugly..


I disagree. I think it's way better and simpler than this patch set. It 
also doesn't tie into cpufreq_governor.* which is a good thing IMO since 
it keeps things simpler for sched-dvfs too.



For example, EXIT can be called while switching governors. The
policy->rwsem is taken at the beginning cpufreq_set_policy(). To
decide if we should remove the governor sysfs directory so early (i.e.
before taking rwsem) in the call, is going to be difficult.


Just check if the governor is changing. And if it is, you just need to 
remove the policy specific stuff.



Over that the same directory might be shared across multiple policies,
and all that information is present only with the governor-core.


That's why I said the gov needs to register the per pol and system wide 
attrs list separately.


This will also remove the need for ever governor to do this crap.

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/5] cpufreq: governor: Create separate sysfs-ops

2016-02-02 Thread Saravana Kannan

On 02/02/2016 05:52 PM, Rafael J. Wysocki wrote:

On Wed, Feb 3, 2016 at 2:32 AM, Saravana Kannan <skan...@codeaurora.org> wrote:

On 02/02/2016 05:07 PM, Rafael J. Wysocki wrote:


On Wed, Feb 3, 2016 at 12:42 AM, Rafael J. Wysocki <raf...@kernel.org>
wrote:


On Tue, Feb 2, 2016 at 11:21 PM, Saravana Kannan <skan...@codeaurora.org>
wrote:


On 02/02/2016 11:40 AM, Rafael J. Wysocki wrote:



On Tue, Feb 2, 2016 at 6:01 PM, Juri Lelli <juri.le...@arm.com> wrote:





[cut]



I also don't like this patch because it forces governors to either
implement
their own macros and management of their attributes or force them to use
the
governor structs that come with cpufreq_governor.h. cpufreq_governor.h
IMHO
is very ondemand and conservative governor specific and is very
irrelevant
for sched-dvfs or any other governors (hint hint).

The only time this ABBA locking is an issue is when governor are
changing
and trying to add/remove attributes. That can easily be checked in
store_governor and dealt with without holding the policy rwsem if the
governors can provide their per sys and per policy attribute arrays as
part
of registering themselves.

I'm sorry that I just keep talking about the idea and not sending out
the
patches.



I think you have a point, though.

The deadlock really is specific to the governors using the code in
cpufreq_governor.c.



That said no other governors in the tree use any sysfs attributes for
tunables AFAICS and the out-of-the tree ones are out of interest here.



But if we are expecting sched dvfs to come in, why make it worse for it. It
would be completely pointless to try and shoehorn sched dvfs to use
cpufreq_governor.c


Well, do you honestly think that using the existing stuff in it would
be a good idea?

If not, then why it matters at all?


Also the deadlock happens if one of the tunable attributes is accessed
while we're trying to remove it which very well may happen on read
access too.


Isn't this THE deadlock we are talking about? The removal of the attributes
only happen when governors are changes and we send a POLICY_EXIT and or all
the cores are hotplugged out.


It generally happens when the "old" governor is going away, whatever the reason.


And my suggestion would work just as well there.

Why are you prefixing your sentence with "Also"? Is there some other case
I'm not considering?


Say someone is reading sampling_rate for a policy with 1 CPU in it and
someone else is taking the CPU offline.  The governor EXIT code path
(that will trigger as a result) will try to remove the sampling_rate
attribute and (if it does that under policy->rwsem) it'll wait for the
read access to finish.  Where exactly would you put the deadlock
prevention in this case?


This is the hotplug case I mentioned. The sysfs file removals will 
happen only for the last CPU in that policy (we thankfully optimized 
that part last year). We also know that multiple CPUs can't be 
hotplugged at the same time. So, in the start of 
cpufreq_offline_prepare, we just need to check if this is the last CPU 
in the policy and if that's the case, do the gov sysfs remove and then 
grab the policy lock and do all our crap. If a read is going on, that's 
going to finish before the sysfs attr remove can go ahead and it can 
grab the policy lock if it needs to and that still won't cause a 
deadlock because we haven't yet grabbed the policy lock in 
cpufreq_offline_prepare(). If the read comes after the sysfs remove, 
then the read is obviously going to fail (we can depend on the sysfs 
framework on doing its job there).


Will that still leave any race conditions in?

-Saravana
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor

2016-02-02 Thread Saravana Kannan

On 02/02/2016 06:13 PM, Viresh Kumar wrote:

On 02-02-16, 13:37, Saravana Kannan wrote:

On 02/01/2016 10:34 PM, Viresh Kumar wrote:

What will that solve? It will stay exactly same then as well, as we
would be adding/removing these attributes from within the same
policy->rwsem ..


The problem isn't that you are holding the policy rwsem. The problem is that
we are trying to grab the same locks in different order. This is trying to
fix that.


That's exactly what I was trying to say, sorry for not being very
clear.

Even if you would move the sysfs file creation thing into the cpufreq
core, instead of governor, we will have locks this way:

CPU0CPU1
(sysfs read)(sysfs dir remove)
s_active lock   policy->rwsem
policy->rwsem
 s_active lock (hang)


And so I said, nothing will change.



What's the s_active lock in CPU1 coming from? The only reason it's there 
today is because of the sysfs dir remove. If you move it before the 
policy->rwsem, you won't have it after the policy->rwsem too. So, I 
think it will fix the issue.


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor

2016-02-01 Thread Saravana Kannan

On 02/01/2016 02:22 AM, Rafael J. Wysocki wrote:

On Mon, Feb 1, 2016 at 7:09 AM, Viresh Kumar  wrote:

On 30-01-16, 12:49, Rafael J. Wysocki wrote:

On Friday, January 29, 2016 04:33:39 PM Saravana Kannan wrote:

AFAIR, the ABBA issue was between the sysfs lock and the policy lock.


Yeah, to be precise here it is:

CPU0 (sysfs read)   CPU1 (exit governor)

sysfs-read  set_policy()-> lock policy->rwsem
sysfs-active lock   Remove sysfs files
lock policy->rwsem  sysfs-active lock
Actual read


The fix for that issue should not be dropping the lock around
POLICY_EXIT.


Right.  Dropping the lock is a mistake (which I overlooked, sadly).


I joined the party at around time of 3.10, and we had this problem and
hacky solution then as well. We tried to get rid of it multiple times,
but sadly failed.


I kind of like your idea of accessing governor attributes without
holding the policy rwsem.


I'm not sure whose idea you are referring to. Viresh's (I don't think I 
saw his proposal) or mine.



I looked at that code and it seems doable to me.  The problem to solve
there would be to ensure that the dbs_data pointer is valid when
show/store runs for those attributes.

The fact that we make the distinction between global and policy
governors in there doesn't really help, but it looks like getting rid
of that bit wouldn't be too much effort.  Let me take a deeper look at
that.



Anyway, to explain my suggestion better, I'm proposing to make it so 
that we don't have a need for the AB BA locking. The only reason the 
governor needs to even grab the sysfs lock is to add/remove the sysfs 
attribute files.


That can be easily achieved if the policy struct has some "gov_attrs" 
field(s) that each governor populates. Then the framework just has to 
create them after POLICY_INIT is processed by the governor and remove 
them before POILICY_EXIT is sent to the governor.


That way, we also avoid having to worry about the gov attributes 
accessed by the show/store disappearing while the files are being 
accessed. Since we remove those files before we even ask the gov to 
clean up, that situation can never happen.


The current problem is that there is no good place for the governor to 
populate this "gov_attrs" field(s). Maybe the governor register might be 
one place for it to provide the data to the framework and the framework 
can later fill it up itself when switching governors.


Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor

2016-02-01 Thread Saravana Kannan

On 02/01/2016 02:22 AM, Rafael J. Wysocki wrote:

On Mon, Feb 1, 2016 at 7:09 AM, Viresh Kumar <viresh.ku...@linaro.org> wrote:

On 30-01-16, 12:49, Rafael J. Wysocki wrote:

On Friday, January 29, 2016 04:33:39 PM Saravana Kannan wrote:

AFAIR, the ABBA issue was between the sysfs lock and the policy lock.


Yeah, to be precise here it is:

CPU0 (sysfs read)   CPU1 (exit governor)

sysfs-read  set_policy()-> lock policy->rwsem
sysfs-active lock   Remove sysfs files
lock policy->rwsem  sysfs-active lock
Actual read


The fix for that issue should not be dropping the lock around
POLICY_EXIT.


Right.  Dropping the lock is a mistake (which I overlooked, sadly).


I joined the party at around time of 3.10, and we had this problem and
hacky solution then as well. We tried to get rid of it multiple times,
but sadly failed.


I kind of like your idea of accessing governor attributes without
holding the policy rwsem.


I'm not sure whose idea you are referring to. Viresh's (I don't think I 
saw his proposal) or mine.



I looked at that code and it seems doable to me.  The problem to solve
there would be to ensure that the dbs_data pointer is valid when
show/store runs for those attributes.

The fact that we make the distinction between global and policy
governors in there doesn't really help, but it looks like getting rid
of that bit wouldn't be too much effort.  Let me take a deeper look at
that.



Anyway, to explain my suggestion better, I'm proposing to make it so 
that we don't have a need for the AB BA locking. The only reason the 
governor needs to even grab the sysfs lock is to add/remove the sysfs 
attribute files.


That can be easily achieved if the policy struct has some "gov_attrs" 
field(s) that each governor populates. Then the framework just has to 
create them after POLICY_INIT is processed by the governor and remove 
them before POILICY_EXIT is sent to the governor.


That way, we also avoid having to worry about the gov attributes 
accessed by the show/store disappearing while the files are being 
accessed. Since we remove those files before we even ask the gov to 
clean up, that situation can never happen.


The current problem is that there is no good place for the governor to 
populate this "gov_attrs" field(s). Maybe the governor register might be 
one place for it to provide the data to the framework and the framework 
can later fill it up itself when switching governors.


Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 00/19] cpufreq locking cleanups and documentation

2016-01-29 Thread Saravana Kannan

On 01/11/2016 09:35 AM, Juri Lelli wrote:

Hi all,

In the context of the ongoing discussion about introducing a simple platform
energy model to guide scheduling decisions (Energy Aware Scheduling [1])
concerns have been expressed by Peter about the component in charge of driving
clock frequency selection (Steve recently posted an update of such component
[2]): https://lkml.org/lkml/2015/8/15/141.

The problem is that, with this new approach, cpufreq core functions need to be
accessed from scheduler hot-paths and the overhead associated with the current
locking scheme might result to be unsustainable.

Peter's proposed approach of using RCU logic to reduce locking overhead seems
reasonable, but things may not be so straightforward as originally thought. The
very first thing I actually realized when I started looking into this is that
it was hard for me to understand which locking mechanism was protecting which
data structure. As mostly a way to build a better understanding of the current
cpufreq locking scheme and also as preparatory work for implementing RCU logic,
I came up with this set of patches. In fact, at this stage, I would like each
patch to be considered as a question I'm asking rather than a proposed change,
thus the RFC tag for the series; with the intent of documenting current locking
scheme and modifying it a bit in order to make RCU logic implementation easier.
Actually, as you'll soon notice, I didn't really start from scratch. Mike
shared with me some patches he has been developing while looking at the same
problem. I've given Mike attribution for the patches that I took unchanged from
him, with thanks for sharing his findings with me.

High level description of patches:

  o [01-04] cleanup and move code around to make things (hopefully) cleaner
  o [05-14] insert lockdep assertions and fix uncovered erroneous situations
  o [15-18] remove overkill usage of locking mechanism
  o 19  adds documentation for the cleaned up locking scheme

With Viresh' tests [3] on both arm TC2 and arm64 Juno boards I'm not seeing
anything bad happening. However, coverage is really small (as is my personal
confidence of not breaking things for other confs :-)).

This set is based on top of linux-pm/linux-next as of today and it is also
available from here:

  git://linux-arm.org/linux-jl.git upstream/cpufreq_cleanups

Comments, concerns and rants are the primary goal of this posting; I'm thus
looking forward to them.

Best,

- Juri

[1] https://lkml.org/lkml/2015/7/7/754
[2] https://lkml.org/lkml/2015/12/9/35
[3] https://git.linaro.org/people/viresh.kumar/cpufreq-tests.git

Juri Lelli (16):
   cpufreq: kill for_each_policy
   cpufreq: bring data structures close to their locks
   cpufreq: assert locking when accessing cpufreq_policy_list
   cpufreq: always access cpufreq_policy_list while holding
 cpufreq_driver_lock
   cpufreq: assert locking when accessing cpufreq_governor_list
   cpufreq: fix warning for cpufreq_init_policy unlocked access to
 cpufreq_governor_list
   cpufreq: fix warning for show_scaling_available_governors unlocked
 access to cpufreq_governor_list
   cpufreq: assert policy->rwsem is held in cpufreq_set_policy
   cpufreq: assert policy->rwsem is held in __cpufreq_governor
   cpufreq: fix locking of policy->rwsem in cpufreq_init_policy
   cpufreq: fix locking of policy->rwsem in cpufreq_offline_prepare
   cpufreq: fix locking of policy->rwsem in cpufreq_offline_finish
   cpufreq: remove useless usage of cpufreq_governor_mutex in
 __cpufreq_governor
   cpufreq: hold policy->rwsem across CPUFREQ_GOV_POLICY_EXIT
   cpufreq: stop checking for cpufreq_driver being present in
 cpufreq_cpu_get
   cpufreq: documentation: document locking scheme

Michael Turquette (3):
   cpufreq: do not expose cpufreq_governor_lock
   cpufreq: merge governor lock and mutex
   cpufreq: remove transition_lock

  Documentation/cpu-freq/core.txt|  44 +
  drivers/cpufreq/cpufreq.c  | 132 +++--
  drivers/cpufreq/cpufreq_governor.h |   2 -
  include/linux/cpufreq.h|   5 --
  4 files changed, 125 insertions(+), 58 deletions(-)



Juri,

I haven't looked at the cpufreq-tests, but I doubt they do hotplug 
testing where they remove all the CPUs of a policy (to trigger a policy 
exit).


Can you please add that to your testing? I wouldn't be surprised if some 
of your clean ups would cause a dead lock. This clean up series is 
definitely appreciated, but I think the patch series might still be 
missing some patches that are needed to make things work without 
deadlocking.


I'll try to do a deeper analysis/review/testing, but kinda hard pressed 
on time here.


Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor

2016-01-29 Thread Saravana Kannan

On 01/12/2016 02:20 AM, Viresh Kumar wrote:

On 11-01-16, 17:35, Juri Lelli wrote:

__cpufreq_governor works on policy, so policy->rwsem has to be held.
Add assertion for such condition.

Cc: "Rafael J. Wysocki" 
Cc: Viresh Kumar 
Signed-off-by: Juri Lelli 
---
  drivers/cpufreq/cpufreq.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index f1f9fbc..e7fc5c9 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1950,6 +1950,9 @@ static int __cpufreq_governor(struct cpufreq_policy 
*policy,
/* Don't start any governor operations if we are entering suspend */
if (cpufreq_suspended)
return 0;
+
+   lockdep_assert_held(>rwsem);
+


We had an ABBA problem with the EXIT governor callback and so this
rwsem is dropped just before that from set_policy()..

commit 955ef4833574 ("cpufreq: Drop rwsem lock around
CPUFREQ_GOV_POLICY_EXIT")



AFAIR, the ABBA issue was between the sysfs lock and the policy lock. 
The fix for that issue should not be dropping the lock around 
POLICY_EXIT. The proper fix is to have the governor "export" the 
attributes it wants to add/remove and have the cpufreq framework do the 
adding/removing of the attributes from sysfs for the governor.


Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 00/19] cpufreq locking cleanups and documentation

2016-01-29 Thread Saravana Kannan

On 01/11/2016 09:35 AM, Juri Lelli wrote:

Hi all,

In the context of the ongoing discussion about introducing a simple platform
energy model to guide scheduling decisions (Energy Aware Scheduling [1])
concerns have been expressed by Peter about the component in charge of driving
clock frequency selection (Steve recently posted an update of such component
[2]): https://lkml.org/lkml/2015/8/15/141.

The problem is that, with this new approach, cpufreq core functions need to be
accessed from scheduler hot-paths and the overhead associated with the current
locking scheme might result to be unsustainable.

Peter's proposed approach of using RCU logic to reduce locking overhead seems
reasonable, but things may not be so straightforward as originally thought. The
very first thing I actually realized when I started looking into this is that
it was hard for me to understand which locking mechanism was protecting which
data structure. As mostly a way to build a better understanding of the current
cpufreq locking scheme and also as preparatory work for implementing RCU logic,
I came up with this set of patches. In fact, at this stage, I would like each
patch to be considered as a question I'm asking rather than a proposed change,
thus the RFC tag for the series; with the intent of documenting current locking
scheme and modifying it a bit in order to make RCU logic implementation easier.
Actually, as you'll soon notice, I didn't really start from scratch. Mike
shared with me some patches he has been developing while looking at the same
problem. I've given Mike attribution for the patches that I took unchanged from
him, with thanks for sharing his findings with me.

High level description of patches:

  o [01-04] cleanup and move code around to make things (hopefully) cleaner
  o [05-14] insert lockdep assertions and fix uncovered erroneous situations
  o [15-18] remove overkill usage of locking mechanism
  o 19  adds documentation for the cleaned up locking scheme

With Viresh' tests [3] on both arm TC2 and arm64 Juno boards I'm not seeing
anything bad happening. However, coverage is really small (as is my personal
confidence of not breaking things for other confs :-)).

This set is based on top of linux-pm/linux-next as of today and it is also
available from here:

  git://linux-arm.org/linux-jl.git upstream/cpufreq_cleanups

Comments, concerns and rants are the primary goal of this posting; I'm thus
looking forward to them.

Best,

- Juri

[1] https://lkml.org/lkml/2015/7/7/754
[2] https://lkml.org/lkml/2015/12/9/35
[3] https://git.linaro.org/people/viresh.kumar/cpufreq-tests.git

Juri Lelli (16):
   cpufreq: kill for_each_policy
   cpufreq: bring data structures close to their locks
   cpufreq: assert locking when accessing cpufreq_policy_list
   cpufreq: always access cpufreq_policy_list while holding
 cpufreq_driver_lock
   cpufreq: assert locking when accessing cpufreq_governor_list
   cpufreq: fix warning for cpufreq_init_policy unlocked access to
 cpufreq_governor_list
   cpufreq: fix warning for show_scaling_available_governors unlocked
 access to cpufreq_governor_list
   cpufreq: assert policy->rwsem is held in cpufreq_set_policy
   cpufreq: assert policy->rwsem is held in __cpufreq_governor
   cpufreq: fix locking of policy->rwsem in cpufreq_init_policy
   cpufreq: fix locking of policy->rwsem in cpufreq_offline_prepare
   cpufreq: fix locking of policy->rwsem in cpufreq_offline_finish
   cpufreq: remove useless usage of cpufreq_governor_mutex in
 __cpufreq_governor
   cpufreq: hold policy->rwsem across CPUFREQ_GOV_POLICY_EXIT
   cpufreq: stop checking for cpufreq_driver being present in
 cpufreq_cpu_get
   cpufreq: documentation: document locking scheme

Michael Turquette (3):
   cpufreq: do not expose cpufreq_governor_lock
   cpufreq: merge governor lock and mutex
   cpufreq: remove transition_lock

  Documentation/cpu-freq/core.txt|  44 +
  drivers/cpufreq/cpufreq.c  | 132 +++--
  drivers/cpufreq/cpufreq_governor.h |   2 -
  include/linux/cpufreq.h|   5 --
  4 files changed, 125 insertions(+), 58 deletions(-)



Juri,

I haven't looked at the cpufreq-tests, but I doubt they do hotplug 
testing where they remove all the CPUs of a policy (to trigger a policy 
exit).


Can you please add that to your testing? I wouldn't be surprised if some 
of your clean ups would cause a dead lock. This clean up series is 
definitely appreciated, but I think the patch series might still be 
missing some patches that are needed to make things work without 
deadlocking.


I'll try to do a deeper analysis/review/testing, but kinda hard pressed 
on time here.


Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor

2016-01-29 Thread Saravana Kannan

On 01/12/2016 02:20 AM, Viresh Kumar wrote:

On 11-01-16, 17:35, Juri Lelli wrote:

__cpufreq_governor works on policy, so policy->rwsem has to be held.
Add assertion for such condition.

Cc: "Rafael J. Wysocki" 
Cc: Viresh Kumar 
Signed-off-by: Juri Lelli 
---
  drivers/cpufreq/cpufreq.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index f1f9fbc..e7fc5c9 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1950,6 +1950,9 @@ static int __cpufreq_governor(struct cpufreq_policy 
*policy,
/* Don't start any governor operations if we are entering suspend */
if (cpufreq_suspended)
return 0;
+
+   lockdep_assert_held(>rwsem);
+


We had an ABBA problem with the EXIT governor callback and so this
rwsem is dropped just before that from set_policy()..

commit 955ef4833574 ("cpufreq: Drop rwsem lock around
CPUFREQ_GOV_POLICY_EXIT")



AFAIR, the ABBA issue was between the sysfs lock and the policy lock. 
The fix for that issue should not be dropping the lock around 
POLICY_EXIT. The proper fix is to have the governor "export" the 
attributes it wants to add/remove and have the cpufreq framework do the 
adding/removing of the attributes from sysfs for the governor.


Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH V3 5/5] cpufreq: postfix policy directory with the first CPU in related_cpus

2015-10-16 Thread Saravana Kannan

On 10/16/2015 12:11 AM, Viresh Kumar wrote:

The sysfs policy directory is postfixed currently with the CPU number
for which the policy was created, which isn't necessarily the first CPU
in related_cpus mask.

To make it more consistent and predictable, lets postfix the policy with
the first cpu in related-cpus mask.

Suggested-by: Saravana Kannan 
Signed-off-by: Viresh Kumar 
---
V2->V3:
- Fix error path where we may try to put an uninitialized kobject.
- Break kobject_init_and_add() to kobject_init() and kobject_add().

  drivers/cpufreq/cpufreq.c | 21 +++--
  1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 4fa2215cc6ec..7c48e7316d91 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1022,7 +1022,6 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
  {
struct device *dev = get_cpu_device(cpu);
struct cpufreq_policy *policy;
-   int ret;

if (WARN_ON(!dev))
return NULL;
@@ -1040,13 +1039,7 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
if (!zalloc_cpumask_var(>real_cpus, GFP_KERNEL))
goto err_free_rcpumask;

-   ret = kobject_init_and_add(>kobj, _cpufreq,
-  cpufreq_global_kobject, "policy%u", cpu);
-   if (ret) {
-   pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret);
-   goto err_free_real_cpus;
-   }
-
+   kobject_init(>kobj, _cpufreq);


Oh yeah, this works better. I forgot kobject has a separate init and add 
fuctions.



INIT_LIST_HEAD(>policy_list);
init_rwsem(>rwsem);
spin_lock_init(>transition_lock);
@@ -1057,8 +1050,6 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
policy->cpu = cpu;
return policy;

-err_free_real_cpus:
-   free_cpumask_var(policy->real_cpus);
  err_free_rcpumask:
free_cpumask_var(policy->related_cpus);
  err_free_cpumask:
@@ -1163,6 +1154,16 @@ static int cpufreq_online(unsigned int cpu)
cpumask_copy(policy->related_cpus, policy->cpus);
/* Remember CPUs present at the policy creation time. */
cpumask_and(policy->real_cpus, policy->cpus, cpu_present_mask);
+
+   /* Name and add the kobject */
+   ret = kobject_add(>kobj, cpufreq_global_kobject,
+ "policy%u",
+ cpumask_first(policy->related_cpus));
+   if (ret) {
+   pr_err("%s: failed to add policy->kobj: %d\n", __func__,
+  ret);
+   goto out_exit_policy;
+   }


Another out of patch issue that I see while reviewing this patch:

I think the existing error handling gotos aren't really cleaning things 
up well.


In the lines that follow this code we set the per_cpu(cpufreq_cpu_data) 
to point to the new policy. But if the subsequent cpu->get() fails, we 
goto out_exit_policy. But that label doesn't clean up the 
per_cpu(cpufreq_cpu_data). So, I think we need another label to jump to 
if ->get() fails



}

/*



Reviewed-by: Saravana Kannan 

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH V3 5/5] cpufreq: postfix policy directory with the first CPU in related_cpus

2015-10-16 Thread Saravana Kannan

On 10/16/2015 12:11 AM, Viresh Kumar wrote:

The sysfs policy directory is postfixed currently with the CPU number
for which the policy was created, which isn't necessarily the first CPU
in related_cpus mask.

To make it more consistent and predictable, lets postfix the policy with
the first cpu in related-cpus mask.

Suggested-by: Saravana Kannan <skan...@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.ku...@linaro.org>
---
V2->V3:
- Fix error path where we may try to put an uninitialized kobject.
- Break kobject_init_and_add() to kobject_init() and kobject_add().

  drivers/cpufreq/cpufreq.c | 21 +++--
  1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 4fa2215cc6ec..7c48e7316d91 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1022,7 +1022,6 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
  {
struct device *dev = get_cpu_device(cpu);
struct cpufreq_policy *policy;
-   int ret;

if (WARN_ON(!dev))
return NULL;
@@ -1040,13 +1039,7 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
if (!zalloc_cpumask_var(>real_cpus, GFP_KERNEL))
goto err_free_rcpumask;

-   ret = kobject_init_and_add(>kobj, _cpufreq,
-  cpufreq_global_kobject, "policy%u", cpu);
-   if (ret) {
-   pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret);
-   goto err_free_real_cpus;
-   }
-
+   kobject_init(>kobj, _cpufreq);


Oh yeah, this works better. I forgot kobject has a separate init and add 
fuctions.



INIT_LIST_HEAD(>policy_list);
init_rwsem(>rwsem);
spin_lock_init(>transition_lock);
@@ -1057,8 +1050,6 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
policy->cpu = cpu;
return policy;

-err_free_real_cpus:
-   free_cpumask_var(policy->real_cpus);
  err_free_rcpumask:
free_cpumask_var(policy->related_cpus);
  err_free_cpumask:
@@ -1163,6 +1154,16 @@ static int cpufreq_online(unsigned int cpu)
cpumask_copy(policy->related_cpus, policy->cpus);
/* Remember CPUs present at the policy creation time. */
cpumask_and(policy->real_cpus, policy->cpus, cpu_present_mask);
+
+   /* Name and add the kobject */
+   ret = kobject_add(>kobj, cpufreq_global_kobject,
+ "policy%u",
+ cpumask_first(policy->related_cpus));
+   if (ret) {
+   pr_err("%s: failed to add policy->kobj: %d\n", __func__,
+  ret);
+   goto out_exit_policy;
+   }


Another out of patch issue that I see while reviewing this patch:

I think the existing error handling gotos aren't really cleaning things 
up well.


In the lines that follow this code we set the per_cpu(cpufreq_cpu_data) 
to point to the new policy. But if the subsequent cpu->get() fails, we 
goto out_exit_policy. But that label doesn't clean up the 
per_cpu(cpufreq_cpu_data). So, I think we need another label to jump to 
if ->get() fails



}

/*



Reviewed-by: Saravana Kannan <skan...@codeaurora.org>

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/5] cpufreq: create cpu/cpufreq/policyX directories

2015-10-15 Thread Saravana Kannan

On 10/14/2015 11:55 PM, Viresh Kumar wrote:

On 13-10-15, 12:29, Saravana Kannan wrote:

But we don't need to track track of "present-cpus" separately
though. We could do the for_each_cpu_and() when we create the
symlinks for the first time. And after that, we can just use the
subsystem interface callbacks (cpufreq_add_dev() and
cpufreq_remove_dev()) to keep the symlinks updated.

I don't see any place where keeping track of this separately is more
efficient. This would save some memory savings when the number of
CPUs is large and also simplify the code because we won't have to
keep another field up to date.


It is still required to track when can we free the policy.



Ok, I'm not very familiar with this new field's uses. I'll take a closer 
look later and respond if I have other thoughts.


Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH V2 4/5] cpufreq: create cpu/cpufreq/policyX directories

2015-10-15 Thread Saravana Kannan

On 10/15/2015 09:05 AM, Viresh Kumar wrote:

The cpufreq sysfs interface had been a bit inconsistent as one of the
CPUs for a policy had a real directory within its sysfs 'cpuX' directory
and all other CPUs had links to it. That also made the code a bit
complex as we need to take care of moving the sysfs directory if the CPU
containing the real directory is getting physically hot-unplugged.

Solve this by creating 'policyX' directories (per-policy) in
/sys/devices/system/cpu/cpufreq/ directory, where X is the CPU for which
the policy was first created.

This also removes the need of keeping kobj_cpu and we can remove it now.

Suggested-by: Saravana Kannan 
Signed-off-by: Viresh Kumar 



Since you've added a separate patch for making policyX more consistent:
Reviewed-by: Saravana Kannan 

Btw, does a Review-by have an implicit Acked-by?


---
  drivers/cpufreq/cpufreq.c | 34 --
  include/linux/cpufreq.h   |  1 -
  2 files changed, 4 insertions(+), 31 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 04222e7bbc73..4fa2215cc6ec 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -910,9 +910,6 @@ static int cpufreq_add_dev_symlink(struct cpufreq_policy 
*policy)

/* Some related CPUs might not be present (physically hotplugged) */
for_each_cpu(j, policy->real_cpus) {
-   if (j == policy->kobj_cpu)
-   continue;
-
ret = add_cpu_dev_symlink(policy, j);
if (ret)
break;


Kinda unrelated to this patch, but shouldn't this function undo the 
symlinks is has created so far before returning? Otherwise, we'd be 
leaving around broken symlinks.


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH V2 5/5] cpufreq: postfix policy directory with the first CPU in related_cpus

2015-10-15 Thread Saravana Kannan

On 10/15/2015 09:05 AM, Viresh Kumar wrote:

The sysfs policy directory is postfixed currently with the CPU number
for which the policy was created, which isn't necessarily the first CPU
in related_cpus mask.

To make it more consistent and predictable, lets postfix the policy with
the first cpu in related-cpus mask.

Suggested-by: Saravana Kannan 
Signed-off-by: Viresh Kumar 
---
  drivers/cpufreq/cpufreq.c | 19 ++-
  1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 4fa2215cc6ec..3fe13875565d 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1022,7 +1022,6 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
  {
struct device *dev = get_cpu_device(cpu);
struct cpufreq_policy *policy;
-   int ret;

if (WARN_ON(!dev))
return NULL;
@@ -1040,13 +1039,6 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
if (!zalloc_cpumask_var(>real_cpus, GFP_KERNEL))
goto err_free_rcpumask;

-   ret = kobject_init_and_add(>kobj, _cpufreq,
-  cpufreq_global_kobject, "policy%u", cpu);
-   if (ret) {
-   pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret);
-   goto err_free_real_cpus;
-   }
-
INIT_LIST_HEAD(>policy_list);
init_rwsem(>rwsem);
spin_lock_init(>transition_lock);
@@ -1057,7 +1049,6 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
policy->cpu = cpu;
return policy;

-err_free_real_cpus:
free_cpumask_var(policy->real_cpus);


Delete this line too? Does GCC not complain about unreachable code?


  err_free_rcpumask:
free_cpumask_var(policy->related_cpus);
@@ -1163,6 +1154,16 @@ static int cpufreq_online(unsigned int cpu)
cpumask_copy(policy->related_cpus, policy->cpus);
/* Remember CPUs present at the policy creation time. */
cpumask_and(policy->real_cpus, policy->cpus, cpu_present_mask);
+
+   /* Initialize the kobject */
+   ret = kobject_init_and_add(>kobj, _cpufreq,
+  cpufreq_global_kobject, "policy%u",
+  cpumask_first(policy->related_cpus));
+   if (ret) {
+   pr_err("%s: failed to init policy->kobj: %d\n",
+  __func__, ret);
+   goto out_exit_policy;


out_exit_policy label includes a call to cpufreq_policy_free(). That 
function needs to be changed to not call cpufreq_policy_put_kobj() in 
this case so that we don't try to kobject_put() an unallocated kobj.


Maybe you an call cpufreq_policy_put_kobj() in the error handling path 
of this function? Basically split out kojb alloc and free from policy 
alloc and free and alloc/free them around the same time 
(cpufreq_remove_Dev() will have to also call cpufreq_policy_put_kobj() 
when real_cpus is empty().


The refactor is just a suggestion. I'm looking at the latest code in a 
gitweb and making comments. So, I might have missed some corner cases in 
the refactor.


Also, it might be better to move the notifier from within 
cpufreq_policy_put_kobj() to cpufreq_policy_free()? Seems more appropriate.


Thanks,
Saravana


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH V2 4/5] cpufreq: create cpu/cpufreq/policyX directories

2015-10-15 Thread Saravana Kannan

On 10/15/2015 09:05 AM, Viresh Kumar wrote:

The cpufreq sysfs interface had been a bit inconsistent as one of the
CPUs for a policy had a real directory within its sysfs 'cpuX' directory
and all other CPUs had links to it. That also made the code a bit
complex as we need to take care of moving the sysfs directory if the CPU
containing the real directory is getting physically hot-unplugged.

Solve this by creating 'policyX' directories (per-policy) in
/sys/devices/system/cpu/cpufreq/ directory, where X is the CPU for which
the policy was first created.

This also removes the need of keeping kobj_cpu and we can remove it now.

Suggested-by: Saravana Kannan <skan...@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.ku...@linaro.org>



Since you've added a separate patch for making policyX more consistent:
Reviewed-by: Saravana Kannan <skan...@codeaurora.org>

Btw, does a Review-by have an implicit Acked-by?


---
  drivers/cpufreq/cpufreq.c | 34 --
  include/linux/cpufreq.h   |  1 -
  2 files changed, 4 insertions(+), 31 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 04222e7bbc73..4fa2215cc6ec 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -910,9 +910,6 @@ static int cpufreq_add_dev_symlink(struct cpufreq_policy 
*policy)

/* Some related CPUs might not be present (physically hotplugged) */
for_each_cpu(j, policy->real_cpus) {
-   if (j == policy->kobj_cpu)
-   continue;
-
ret = add_cpu_dev_symlink(policy, j);
if (ret)
break;


Kinda unrelated to this patch, but shouldn't this function undo the 
symlinks is has created so far before returning? Otherwise, we'd be 
leaving around broken symlinks.


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/5] cpufreq: create cpu/cpufreq/policyX directories

2015-10-15 Thread Saravana Kannan

On 10/14/2015 11:55 PM, Viresh Kumar wrote:

On 13-10-15, 12:29, Saravana Kannan wrote:

But we don't need to track track of "present-cpus" separately
though. We could do the for_each_cpu_and() when we create the
symlinks for the first time. And after that, we can just use the
subsystem interface callbacks (cpufreq_add_dev() and
cpufreq_remove_dev()) to keep the symlinks updated.

I don't see any place where keeping track of this separately is more
efficient. This would save some memory savings when the number of
CPUs is large and also simplify the code because we won't have to
keep another field up to date.


It is still required to track when can we free the policy.



Ok, I'm not very familiar with this new field's uses. I'll take a closer 
look later and respond if I have other thoughts.


Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH V2 5/5] cpufreq: postfix policy directory with the first CPU in related_cpus

2015-10-15 Thread Saravana Kannan

On 10/15/2015 09:05 AM, Viresh Kumar wrote:

The sysfs policy directory is postfixed currently with the CPU number
for which the policy was created, which isn't necessarily the first CPU
in related_cpus mask.

To make it more consistent and predictable, lets postfix the policy with
the first cpu in related-cpus mask.

Suggested-by: Saravana Kannan <skan...@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.ku...@linaro.org>
---
  drivers/cpufreq/cpufreq.c | 19 ++-
  1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 4fa2215cc6ec..3fe13875565d 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1022,7 +1022,6 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
  {
struct device *dev = get_cpu_device(cpu);
struct cpufreq_policy *policy;
-   int ret;

if (WARN_ON(!dev))
return NULL;
@@ -1040,13 +1039,6 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
if (!zalloc_cpumask_var(>real_cpus, GFP_KERNEL))
goto err_free_rcpumask;

-   ret = kobject_init_and_add(>kobj, _cpufreq,
-  cpufreq_global_kobject, "policy%u", cpu);
-   if (ret) {
-   pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret);
-   goto err_free_real_cpus;
-   }
-
INIT_LIST_HEAD(>policy_list);
init_rwsem(>rwsem);
spin_lock_init(>transition_lock);
@@ -1057,7 +1049,6 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
policy->cpu = cpu;
return policy;

-err_free_real_cpus:
free_cpumask_var(policy->real_cpus);


Delete this line too? Does GCC not complain about unreachable code?


  err_free_rcpumask:
free_cpumask_var(policy->related_cpus);
@@ -1163,6 +1154,16 @@ static int cpufreq_online(unsigned int cpu)
cpumask_copy(policy->related_cpus, policy->cpus);
/* Remember CPUs present at the policy creation time. */
cpumask_and(policy->real_cpus, policy->cpus, cpu_present_mask);
+
+   /* Initialize the kobject */
+   ret = kobject_init_and_add(>kobj, _cpufreq,
+  cpufreq_global_kobject, "policy%u",
+  cpumask_first(policy->related_cpus));
+   if (ret) {
+   pr_err("%s: failed to init policy->kobj: %d\n",
+  __func__, ret);
+   goto out_exit_policy;


out_exit_policy label includes a call to cpufreq_policy_free(). That 
function needs to be changed to not call cpufreq_policy_put_kobj() in 
this case so that we don't try to kobject_put() an unallocated kobj.


Maybe you an call cpufreq_policy_put_kobj() in the error handling path 
of this function? Basically split out kojb alloc and free from policy 
alloc and free and alloc/free them around the same time 
(cpufreq_remove_Dev() will have to also call cpufreq_policy_put_kobj() 
when real_cpus is empty().


The refactor is just a suggestion. I'm looking at the latest code in a 
gitweb and making comments. So, I might have missed some corner cases in 
the refactor.


Also, it might be better to move the notifier from within 
cpufreq_policy_put_kobj() to cpufreq_policy_free()? Seems more appropriate.


Thanks,
Saravana


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH Resend] cpufreq: Drop redundant check for inactive policies

2015-10-14 Thread Saravana Kannan

On 10/14/2015 05:35 PM, Rafael J. Wysocki wrote:

On Tuesday, October 13, 2015 10:57:13 AM Viresh Kumar wrote:

We just made sure policy->cpu is online and this check will always fail
as the policy is active. Drop it.

Signed-off-by: Viresh Kumar 
Acked-by: Saravana Kannan 


Applied, thanks!

Rafael



I didn't give a clear ack/review for the series. So, to clarify my 
ack/review


For all patches except 4/5, I'm okay with either/all of this:
Reviewed-by:  Saravana Kannan 
Acked-by: Saravana Kannan 

For 4/5, I would still like us to move the sysfs creating after init. 
That part shouldn't be too hard. We don't need to create the sysfs file 
before init.


Once that's done, I wouldn't mind giving an Ack.

Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH Resend] cpufreq: Drop redundant check for inactive policies

2015-10-14 Thread Saravana Kannan

On 10/14/2015 05:35 PM, Rafael J. Wysocki wrote:

On Tuesday, October 13, 2015 10:57:13 AM Viresh Kumar wrote:

We just made sure policy->cpu is online and this check will always fail
as the policy is active. Drop it.

Signed-off-by: Viresh Kumar <viresh.ku...@linaro.org>
Acked-by: Saravana Kannan <skan...@codeaurora.org>


Applied, thanks!

Rafael



I didn't give a clear ack/review for the series. So, to clarify my 
ack/review


For all patches except 4/5, I'm okay with either/all of this:
Reviewed-by:  Saravana Kannan <skan...@codeaurora.org>
Acked-by: Saravana Kannan <skan...@codeaurora.org>

For 4/5, I would still like us to move the sysfs creating after init. 
That part shouldn't be too hard. We don't need to create the sysfs file 
before init.


Once that's done, I wouldn't mind giving an Ack.

Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/5] cpufreq: create cpu/cpufreq/policyX directories

2015-10-13 Thread Saravana Kannan

On 10/12/2015 08:39 PM, Viresh Kumar wrote:

On 12-10-15, 12:31, Saravana Kannan wrote:

Can we use the first CPU in the related CPUs mask? Instead of the
first CPU that the policy got created on? The policyX numbering
would be a bit more consistent that way.


Okay..


Suggested-by: ?


Will add. Though me/Rafael thought about it long back, but then
dropped the idea :)


Didn't notice when this got added. Do we really need this anymore if
we don't care about moving the directory and creating symlinks? I
don't think we need it anymore. And if we really need to know
related - offline, we can use for_each_cpu_and(related,
online/present mask)


Its about tracking present-cpus, for which the link is present. So, it
is still required.


But we don't need to track track of "present-cpus" separately though. We 
could do the for_each_cpu_and() when we create the symlinks for the 
first time. And after that, we can just use the subsystem interface 
callbacks (cpufreq_add_dev() and cpufreq_remove_dev()) to keep the 
symlinks updated.


I don't see any place where keeping track of this separately is more 
efficient. This would save some memory savings when the number of CPUs 
is large and also simplify the code because we won't have to keep 
another field up to date.


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/5] cpufreq: create cpu/cpufreq/policyX directories

2015-10-13 Thread Saravana Kannan

On 10/12/2015 11:15 PM, Viresh Kumar wrote:

On 12-10-15, 12:31, Saravana Kannan wrote:

Can we use the first CPU in the related CPUs mask? Instead of the
first CPU that the policy got created on? The policyX numbering
would be a bit more consistent that way.


Okay, checked this again. The problem is that ->init() isn't called
yet and we are very early in the initialization sequence. So, we can't
really know related_cpus yet. So I will keep it unchanged for now.



Can we move the sysfs add to the end so that by the time we add sysfs, 
we'll have all the details?


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/5] cpufreq: Use cpumask_copy instead of cpumask_or to copy a mask

2015-10-13 Thread Saravana Kannan

On 10/12/2015 08:23 PM, Viresh Kumar wrote:

On 12-10-15, 12:12, Saravana Kannan wrote:

if (new_policy) {
/* related_cpus should at least include policy->cpus. */
-   cpumask_or(policy->related_cpus, policy->related_cpus, 
policy->cpus);
+   cpumask_copy(policy->related_cpus, policy->cpus);


Again, why? It actually seems wrong. A 4 core cluster could come up
with just 2 cores when the policy is added. But the related CPUs
would be 4 CPUs.


Firstly, the patch hasn't changed anything at all. related_cpus was
empty until this point, and orring or setting it with ->cpus will
result in the same output.


I was under the impression that the CPUfreq drivers were expected to 
fill in related_cpus and the or-ing was a safety net. If that's not the 
case, then this change is fine.



Secondly, this is what we always wanted. related_cpus should contain
the mask of all possible CPUs for that cluster.


I think the confusion was that I thought the drivers are supposed to do 
this. If this doesn't mess up other CPUfreq drivers that I'm not 
familiar with, then I don't have concerns.


Can you still explain the why in the commit text though? If it's just 
that related_cpus is always empty and copying is more efficient than 
or-ing, then mention that?


Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/5] cpufreq: create cpu/cpufreq/policyX directories

2015-10-13 Thread Saravana Kannan

On 10/12/2015 08:39 PM, Viresh Kumar wrote:

On 12-10-15, 12:31, Saravana Kannan wrote:

Can we use the first CPU in the related CPUs mask? Instead of the
first CPU that the policy got created on? The policyX numbering
would be a bit more consistent that way.


Okay..


Suggested-by: ?


Will add. Though me/Rafael thought about it long back, but then
dropped the idea :)


Didn't notice when this got added. Do we really need this anymore if
we don't care about moving the directory and creating symlinks? I
don't think we need it anymore. And if we really need to know
related - offline, we can use for_each_cpu_and(related,
online/present mask)


Its about tracking present-cpus, for which the link is present. So, it
is still required.


But we don't need to track track of "present-cpus" separately though. We 
could do the for_each_cpu_and() when we create the symlinks for the 
first time. And after that, we can just use the subsystem interface 
callbacks (cpufreq_add_dev() and cpufreq_remove_dev()) to keep the 
symlinks updated.


I don't see any place where keeping track of this separately is more 
efficient. This would save some memory savings when the number of CPUs 
is large and also simplify the code because we won't have to keep 
another field up to date.


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/5] cpufreq: create cpu/cpufreq/policyX directories

2015-10-13 Thread Saravana Kannan

On 10/12/2015 11:15 PM, Viresh Kumar wrote:

On 12-10-15, 12:31, Saravana Kannan wrote:

Can we use the first CPU in the related CPUs mask? Instead of the
first CPU that the policy got created on? The policyX numbering
would be a bit more consistent that way.


Okay, checked this again. The problem is that ->init() isn't called
yet and we are very early in the initialization sequence. So, we can't
really know related_cpus yet. So I will keep it unchanged for now.



Can we move the sysfs add to the end so that by the time we add sysfs, 
we'll have all the details?


-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/5] cpufreq: Use cpumask_copy instead of cpumask_or to copy a mask

2015-10-13 Thread Saravana Kannan

On 10/12/2015 08:23 PM, Viresh Kumar wrote:

On 12-10-15, 12:12, Saravana Kannan wrote:

if (new_policy) {
/* related_cpus should at least include policy->cpus. */
-   cpumask_or(policy->related_cpus, policy->related_cpus, 
policy->cpus);
+   cpumask_copy(policy->related_cpus, policy->cpus);


Again, why? It actually seems wrong. A 4 core cluster could come up
with just 2 cores when the policy is added. But the related CPUs
would be 4 CPUs.


Firstly, the patch hasn't changed anything at all. related_cpus was
empty until this point, and orring or setting it with ->cpus will
result in the same output.


I was under the impression that the CPUfreq drivers were expected to 
fill in related_cpus and the or-ing was a safety net. If that's not the 
case, then this change is fine.



Secondly, this is what we always wanted. related_cpus should contain
the mask of all possible CPUs for that cluster.


I think the confusion was that I thought the drivers are supposed to do 
this. If this doesn't mess up other CPUfreq drivers that I'm not 
familiar with, then I don't have concerns.


Can you still explain the why in the commit text though? If it's just 
that related_cpus is always empty and copying is more efficient than 
or-ing, then mention that?


Thanks,
Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 5/5] cpufreq: Drop redundant check for inactive policies

2015-10-12 Thread Saravana Kannan

On 10/11/2015 10:21 AM, Viresh Kumar wrote:

We just made sure policy->cpu is online and this check will always fail
as the policy is active. Drop it.

Signed-off-by: Viresh Kumar 
---
  drivers/cpufreq/cpufreq.c | 7 ---
  1 file changed, 7 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 58aabe0f2d2c..4fa2215cc6ec 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -843,18 +843,11 @@ static ssize_t store(struct kobject *kobj, struct 
attribute *attr,

down_write(>rwsem);

-   /* Updating inactive policies is invalid, so avoid doing that. */
-   if (unlikely(policy_is_inactive(policy))) {
-   ret = -EBUSY;
-   goto unlock_policy_rwsem;
-   }
-
if (fattr->store)
ret = fattr->store(policy, buf, count);
else
ret = -EIO;

-unlock_policy_rwsem:
up_write(>rwsem);
  unlock:
put_online_cpus();



Doesn't really seem related to the sysfs reorg/clean up. Should it be a 
separate patch outside of this series?


Acked-by: Saravana Kannan 

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/5] cpufreq: create cpu/cpufreq/policyX directories

2015-10-12 Thread Saravana Kannan

On 10/11/2015 10:21 AM, Viresh Kumar wrote:

The cpufreq sysfs interface had been a bit inconsistent as one of the
CPUs for a policy had a real directory within its sysfs 'cpuX' directory
and all other CPUs had links to it. That also made the code a bit
complex as we need to take care of moving the sysfs directory if the CPU
containing the real directory is getting physically hot-unplugged.


This should actually make hotplug a bit faster too.


Solve this by creating 'policyX' directories (per-policy) in
/sys/devices/system/cpu/cpufreq/ directory, where X is the CPU for which
the policy was first created.
Can we use the first CPU in the related CPUs mask? Instead of the first 
CPU that the policy got created on? The policyX numbering would be a bit 
more consistent that way.




This also removes the need of keeping kobj_cpu and we can remove it now.

Signed-off-by: Viresh Kumar 

Suggested-by: ?


---
  drivers/cpufreq/cpufreq.c | 34 --
  include/linux/cpufreq.h   |  1 -
  2 files changed, 4 insertions(+), 31 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 2cb0e3b8170e..58aabe0f2d2c 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -917,9 +917,6 @@ static int cpufreq_add_dev_symlink(struct cpufreq_policy 
*policy)

/* Some related CPUs might not be present (physically hotplugged) */
for_each_cpu(j, policy->real_cpus) {
Didn't notice when this got added. Do we really need this anymore if we 
don't care about moving the directory and creating symlinks? I don't 
think we need it anymore. And if we really need to know related - 
offline, we can use for_each_cpu_and(related, online/present mask)



-   if (j == policy->kobj_cpu)
-   continue;
-
ret = add_cpu_dev_symlink(policy, j);
if (ret)
break;
@@ -933,12 +930,8 @@ static void cpufreq_remove_dev_symlink(struct 
cpufreq_policy *policy)
unsigned int j;

/* Some related CPUs might not be present (physically hotplugged) */
-   for_each_cpu(j, policy->real_cpus) {
-   if (j == policy->kobj_cpu)
-   continue;
-
+   for_each_cpu(j, policy->real_cpus)
remove_cpu_dev_symlink(policy, j);
-   }
  }

  static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
@@ -1054,8 +1047,8 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
if (!zalloc_cpumask_var(>real_cpus, GFP_KERNEL))
goto err_free_rcpumask;

-   ret = kobject_init_and_add(>kobj, _cpufreq, >kobj,
-  "cpufreq");
+   ret = kobject_init_and_add(>kobj, _cpufreq,
+  cpufreq_global_kobject, "policy%u", cpu);
if (ret) {
pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret);
goto err_free_real_cpus;
@@ -1069,10 +1062,6 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
INIT_WORK(>update, handle_update);

policy->cpu = cpu;
-
-   /* Set this once on allocation */
-   policy->kobj_cpu = cpu;
-
return policy;

  err_free_real_cpus:
@@ -1424,22 +1413,7 @@ static void cpufreq_remove_dev(struct device *dev, 
struct subsys_interface *sif)
return;
}

-   if (cpu != policy->kobj_cpu) {
-   remove_cpu_dev_symlink(policy, cpu);
-   } else {
-   /*
-* The CPU owning the policy object is going away.  Move it to
-* another suitable CPU.
-*/
-   unsigned int new_cpu = cpumask_first(policy->real_cpus);
-   struct device *new_dev = get_cpu_device(new_cpu);
-
-   dev_dbg(dev, "%s: Moving policy object to CPU%u\n", __func__, 
new_cpu);
-
-   sysfs_remove_link(_dev->kobj, "cpufreq");
-   policy->kobj_cpu = new_cpu;
-   WARN_ON(kobject_move(>kobj, _dev->kobj));
-   }
+   remove_cpu_dev_symlink(policy, cpu);
  }

  static void handle_update(struct work_struct *work)
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 9623218d996a..ef4c5b1a860f 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -65,7 +65,6 @@ struct cpufreq_policy {
unsigned intshared_type; /* ACPI: ANY or ALL affected CPUs
should set cpufreq */
unsigned intcpu;/* cpu managing this policy, must be 
online */
-   unsigned intkobj_cpu; /* cpu managing sysfs files, can be 
offline */

struct clk  *clk;
struct cpufreq_cpuinfo  cpuinfo;/* see above */



-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe 

Re: [PATCH 1/5] cpufreq: Use cpumask_copy instead of cpumask_or to copy a mask

2015-10-12 Thread Saravana Kannan

On 10/11/2015 10:21 AM, Viresh Kumar wrote:

Signed-off-by: Viresh Kumar 


The commit text should explain the why you are doing this.


---
  drivers/cpufreq/cpufreq.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 25c4c15103a0..b32521432db4 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1221,7 +1221,7 @@ static int cpufreq_online(unsigned int cpu)

if (new_policy) {
/* related_cpus should at least include policy->cpus. */
-   cpumask_or(policy->related_cpus, policy->related_cpus, 
policy->cpus);
+   cpumask_copy(policy->related_cpus, policy->cpus);


Again, why? It actually seems wrong. A 4 core cluster could come up with 
just 2 cores when the policy is added. But the related CPUs would be 4 CPUs.



/* Remember CPUs present at the policy creation time. */
cpumask_and(policy->real_cpus, policy->cpus, cpu_present_mask);
}



-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/5] cpufreq: Use cpumask_copy instead of cpumask_or to copy a mask

2015-10-12 Thread Saravana Kannan

On 10/11/2015 10:21 AM, Viresh Kumar wrote:

Signed-off-by: Viresh Kumar 


The commit text should explain the why you are doing this.


---
  drivers/cpufreq/cpufreq.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 25c4c15103a0..b32521432db4 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1221,7 +1221,7 @@ static int cpufreq_online(unsigned int cpu)

if (new_policy) {
/* related_cpus should at least include policy->cpus. */
-   cpumask_or(policy->related_cpus, policy->related_cpus, 
policy->cpus);
+   cpumask_copy(policy->related_cpus, policy->cpus);


Again, why? It actually seems wrong. A 4 core cluster could come up with 
just 2 cores when the policy is added. But the related CPUs would be 4 CPUs.



/* Remember CPUs present at the policy creation time. */
cpumask_and(policy->real_cpus, policy->cpus, cpu_present_mask);
}



-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 5/5] cpufreq: Drop redundant check for inactive policies

2015-10-12 Thread Saravana Kannan

On 10/11/2015 10:21 AM, Viresh Kumar wrote:

We just made sure policy->cpu is online and this check will always fail
as the policy is active. Drop it.

Signed-off-by: Viresh Kumar <viresh.ku...@linaro.org>
---
  drivers/cpufreq/cpufreq.c | 7 ---
  1 file changed, 7 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 58aabe0f2d2c..4fa2215cc6ec 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -843,18 +843,11 @@ static ssize_t store(struct kobject *kobj, struct 
attribute *attr,

down_write(>rwsem);

-   /* Updating inactive policies is invalid, so avoid doing that. */
-   if (unlikely(policy_is_inactive(policy))) {
-   ret = -EBUSY;
-   goto unlock_policy_rwsem;
-   }
-
if (fattr->store)
ret = fattr->store(policy, buf, count);
else
ret = -EIO;

-unlock_policy_rwsem:
up_write(>rwsem);
  unlock:
put_online_cpus();



Doesn't really seem related to the sysfs reorg/clean up. Should it be a 
separate patch outside of this series?


Acked-by: Saravana Kannan <skan...@codeaurora.org>

-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/5] cpufreq: create cpu/cpufreq/policyX directories

2015-10-12 Thread Saravana Kannan

On 10/11/2015 10:21 AM, Viresh Kumar wrote:

The cpufreq sysfs interface had been a bit inconsistent as one of the
CPUs for a policy had a real directory within its sysfs 'cpuX' directory
and all other CPUs had links to it. That also made the code a bit
complex as we need to take care of moving the sysfs directory if the CPU
containing the real directory is getting physically hot-unplugged.


This should actually make hotplug a bit faster too.


Solve this by creating 'policyX' directories (per-policy) in
/sys/devices/system/cpu/cpufreq/ directory, where X is the CPU for which
the policy was first created.
Can we use the first CPU in the related CPUs mask? Instead of the first 
CPU that the policy got created on? The policyX numbering would be a bit 
more consistent that way.




This also removes the need of keeping kobj_cpu and we can remove it now.

Signed-off-by: Viresh Kumar 

Suggested-by: ?


---
  drivers/cpufreq/cpufreq.c | 34 --
  include/linux/cpufreq.h   |  1 -
  2 files changed, 4 insertions(+), 31 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 2cb0e3b8170e..58aabe0f2d2c 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -917,9 +917,6 @@ static int cpufreq_add_dev_symlink(struct cpufreq_policy 
*policy)

/* Some related CPUs might not be present (physically hotplugged) */
for_each_cpu(j, policy->real_cpus) {
Didn't notice when this got added. Do we really need this anymore if we 
don't care about moving the directory and creating symlinks? I don't 
think we need it anymore. And if we really need to know related - 
offline, we can use for_each_cpu_and(related, online/present mask)



-   if (j == policy->kobj_cpu)
-   continue;
-
ret = add_cpu_dev_symlink(policy, j);
if (ret)
break;
@@ -933,12 +930,8 @@ static void cpufreq_remove_dev_symlink(struct 
cpufreq_policy *policy)
unsigned int j;

/* Some related CPUs might not be present (physically hotplugged) */
-   for_each_cpu(j, policy->real_cpus) {
-   if (j == policy->kobj_cpu)
-   continue;
-
+   for_each_cpu(j, policy->real_cpus)
remove_cpu_dev_symlink(policy, j);
-   }
  }

  static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
@@ -1054,8 +1047,8 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
if (!zalloc_cpumask_var(>real_cpus, GFP_KERNEL))
goto err_free_rcpumask;

-   ret = kobject_init_and_add(>kobj, _cpufreq, >kobj,
-  "cpufreq");
+   ret = kobject_init_and_add(>kobj, _cpufreq,
+  cpufreq_global_kobject, "policy%u", cpu);
if (ret) {
pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret);
goto err_free_real_cpus;
@@ -1069,10 +1062,6 @@ static struct cpufreq_policy 
*cpufreq_policy_alloc(unsigned int cpu)
INIT_WORK(>update, handle_update);

policy->cpu = cpu;
-
-   /* Set this once on allocation */
-   policy->kobj_cpu = cpu;
-
return policy;

  err_free_real_cpus:
@@ -1424,22 +1413,7 @@ static void cpufreq_remove_dev(struct device *dev, 
struct subsys_interface *sif)
return;
}

-   if (cpu != policy->kobj_cpu) {
-   remove_cpu_dev_symlink(policy, cpu);
-   } else {
-   /*
-* The CPU owning the policy object is going away.  Move it to
-* another suitable CPU.
-*/
-   unsigned int new_cpu = cpumask_first(policy->real_cpus);
-   struct device *new_dev = get_cpu_device(new_cpu);
-
-   dev_dbg(dev, "%s: Moving policy object to CPU%u\n", __func__, 
new_cpu);
-
-   sysfs_remove_link(_dev->kobj, "cpufreq");
-   policy->kobj_cpu = new_cpu;
-   WARN_ON(kobject_move(>kobj, _dev->kobj));
-   }
+   remove_cpu_dev_symlink(policy, cpu);
  }

  static void handle_update(struct work_struct *work)
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 9623218d996a..ef4c5b1a860f 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -65,7 +65,6 @@ struct cpufreq_policy {
unsigned intshared_type; /* ACPI: ANY or ALL affected CPUs
should set cpufreq */
unsigned intcpu;/* cpu managing this policy, must be 
online */
-   unsigned intkobj_cpu; /* cpu managing sysfs files, can be 
offline */

struct clk  *clk;
struct cpufreq_cpuinfo  cpuinfo;/* see above */



-Saravana

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative 

Re: [PATCH v2 RESEND/RFC] timer: make deferrable cpu unbound timers really not bound to a cpu

2015-03-24 Thread Saravana Kannan

On 03/19/2015 05:10 PM, Saravana Kannan wrote:

On 09/23/2014 11:33 AM, Thomas Gleixner wrote:

On Mon, 15 Sep 2014, Joonwoo Park wrote:

+#ifdef CONFIG_SMP
+static struct tvec_base *tvec_base_deferral = _tvec_bases;
+#endif


In principle I like the idea of a deferrable wheel, but this
implementation is going to go nowhere.


Hi Thomas,

To give some more context:

This bug is a serious pain in the a** for anyone using deferrable timers
or deferrable workqueues today for some periodic work and don't care for
which CPU the code runs in.

Couple of examples of such issues in existing code:

1) In a SMP system, CPUfreq governors (ondemand and conservative) end up
queueing a deferrable work on every single CPU and the first one to run
the deferrable workqueue cancels the work on all other CPUS, runs the
work and then sets up a workqueue on all the CPUs again for the next
sampling point.

2) Devfreq actually doesn't work well today because it doesn't do this
nasty hack like CPUfreq. So, if the devfreq work happens to run on a CPU
that's typically idle, then it doesn't run for a long time. I've
actually seen logs where the devfreq polling interval is set to 20ms,
but ends up running 800ms later.

Don't know how many other drivers are suffering from this bug. Yes,
IMHO, this is a bug. When the timer is CPU unbound, anyone who hasn't
looked at the actual timer code would assume that it'll run as long as
any CPU is busy.


First of all making it SMP only is silly. The deferrable stuff is a
pain in other places as well.

But whats way worse is:


+static inline void __run_timers(struct tvec_base *base, bool try)
  {
  struct timer_list *timer;

-spin_lock_irq(>lock);
+if (!try)
+spin_lock_irq(>lock);
+else if (!spin_trylock_irq(>lock))
+return;


Yuck. All cpus fighting about a single spinlock roughly at the same
time? You just created a proper thundering herd cacheline bouncing
issue.


I agree, This is not good. I think a simple way to fix this is to have
only the CPU that's the jiffy owner to run through this deferrable timer
list.

That should address any unnecessary cache bouncing issues. Would that be
an acceptable solution to this?



No way. We have already mechanisms in place to deal with such
problems, you just have to use them.


I'm not sure which problem you are referring to here. Or what the
already existing solutions are.

I don't think you were referring to the "deferrable timer getting
delayed for long periods despite CPUs being active" problem, because I
don't think we have a better solution than this patch (with the jiffy
owner CPU fix). Asking every driver that doesn't care which CPU the work
runs on to queue a work or set up a timer on every CPU is definitely not
a good solution -- that's spreading the complexity to every other driver
instead of fixing it in one place. And that causes unnecessary cache
pollution too.

Thoughts? I would really like to see a fix for this go in soon so that
we can simplify cpufreq and have devfreq and other drivers work correctly.

Thanks,
Saravana



Thomas,

Bump.

-Saravana

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 RESEND/RFC] timer: make deferrable cpu unbound timers really not bound to a cpu

2015-03-24 Thread Saravana Kannan

On 03/19/2015 05:10 PM, Saravana Kannan wrote:

On 09/23/2014 11:33 AM, Thomas Gleixner wrote:

On Mon, 15 Sep 2014, Joonwoo Park wrote:

+#ifdef CONFIG_SMP
+static struct tvec_base *tvec_base_deferral = boot_tvec_bases;
+#endif


In principle I like the idea of a deferrable wheel, but this
implementation is going to go nowhere.


Hi Thomas,

To give some more context:

This bug is a serious pain in the a** for anyone using deferrable timers
or deferrable workqueues today for some periodic work and don't care for
which CPU the code runs in.

Couple of examples of such issues in existing code:

1) In a SMP system, CPUfreq governors (ondemand and conservative) end up
queueing a deferrable work on every single CPU and the first one to run
the deferrable workqueue cancels the work on all other CPUS, runs the
work and then sets up a workqueue on all the CPUs again for the next
sampling point.

2) Devfreq actually doesn't work well today because it doesn't do this
nasty hack like CPUfreq. So, if the devfreq work happens to run on a CPU
that's typically idle, then it doesn't run for a long time. I've
actually seen logs where the devfreq polling interval is set to 20ms,
but ends up running 800ms later.

Don't know how many other drivers are suffering from this bug. Yes,
IMHO, this is a bug. When the timer is CPU unbound, anyone who hasn't
looked at the actual timer code would assume that it'll run as long as
any CPU is busy.


First of all making it SMP only is silly. The deferrable stuff is a
pain in other places as well.

But whats way worse is:


+static inline void __run_timers(struct tvec_base *base, bool try)
  {
  struct timer_list *timer;

-spin_lock_irq(base-lock);
+if (!try)
+spin_lock_irq(base-lock);
+else if (!spin_trylock_irq(base-lock))
+return;


Yuck. All cpus fighting about a single spinlock roughly at the same
time? You just created a proper thundering herd cacheline bouncing
issue.


I agree, This is not good. I think a simple way to fix this is to have
only the CPU that's the jiffy owner to run through this deferrable timer
list.

That should address any unnecessary cache bouncing issues. Would that be
an acceptable solution to this?



No way. We have already mechanisms in place to deal with such
problems, you just have to use them.


I'm not sure which problem you are referring to here. Or what the
already existing solutions are.

I don't think you were referring to the deferrable timer getting
delayed for long periods despite CPUs being active problem, because I
don't think we have a better solution than this patch (with the jiffy
owner CPU fix). Asking every driver that doesn't care which CPU the work
runs on to queue a work or set up a timer on every CPU is definitely not
a good solution -- that's spreading the complexity to every other driver
instead of fixing it in one place. And that causes unnecessary cache
pollution too.

Thoughts? I would really like to see a fix for this go in soon so that
we can simplify cpufreq and have devfreq and other drivers work correctly.

Thanks,
Saravana



Thomas,

Bump.

-Saravana

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Re: [PATCH v2 RESEND/RFC] timer: make deferrable cpu unbound timers really not bound to a cpu

2015-03-19 Thread Saravana Kannan

On 09/23/2014 11:33 AM, Thomas Gleixner wrote:

On Mon, 15 Sep 2014, Joonwoo Park wrote:

+#ifdef CONFIG_SMP
+static struct tvec_base *tvec_base_deferral = _tvec_bases;
+#endif


In principle I like the idea of a deferrable wheel, but this
implementation is going to go nowhere.


Hi Thomas,

To give some more context:

This bug is a serious pain in the a** for anyone using deferrable timers 
or deferrable workqueues today for some periodic work and don't care for 
which CPU the code runs in.


Couple of examples of such issues in existing code:

1) In a SMP system, CPUfreq governors (ondemand and conservative) end up 
queueing a deferrable work on every single CPU and the first one to run 
the deferrable workqueue cancels the work on all other CPUS, runs the 
work and then sets up a workqueue on all the CPUs again for the next 
sampling point.


2) Devfreq actually doesn't work well today because it doesn't do this 
nasty hack like CPUfreq. So, if the devfreq work happens to run on a CPU 
that's typically idle, then it doesn't run for a long time. I've 
actually seen logs where the devfreq polling interval is set to 20ms, 
but ends up running 800ms later.


Don't know how many other drivers are suffering from this bug. Yes, 
IMHO, this is a bug. When the timer is CPU unbound, anyone who hasn't 
looked at the actual timer code would assume that it'll run as long as 
any CPU is busy.



First of all making it SMP only is silly. The deferrable stuff is a
pain in other places as well.

But whats way worse is:


+static inline void __run_timers(struct tvec_base *base, bool try)
  {
struct timer_list *timer;

-   spin_lock_irq(>lock);
+   if (!try)
+   spin_lock_irq(>lock);
+   else if (!spin_trylock_irq(>lock))
+   return;


Yuck. All cpus fighting about a single spinlock roughly at the same
time? You just created a proper thundering herd cacheline bouncing
issue.


I agree, This is not good. I think a simple way to fix this is to have 
only the CPU that's the jiffy owner to run through this deferrable timer 
list.


That should address any unnecessary cache bouncing issues. Would that be 
an acceptable solution to this?




No way. We have already mechanisms in place to deal with such
problems, you just have to use them.


I'm not sure which problem you are referring to here. Or what the 
already existing solutions are.


I don't think you were referring to the "deferrable timer getting 
delayed for long periods despite CPUs being active" problem, because I 
don't think we have a better solution than this patch (with the jiffy 
owner CPU fix). Asking every driver that doesn't care which CPU the work 
runs on to queue a work or set up a timer on every CPU is definitely not 
a good solution -- that's spreading the complexity to every other driver 
instead of fixing it in one place. And that causes unnecessary cache 
pollution too.


Thoughts? I would really like to see a fix for this go in soon so that 
we can simplify cpufreq and have devfreq and other drivers work correctly.


Thanks,
Saravana

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Re: [PATCH v2 RESEND/RFC] timer: make deferrable cpu unbound timers really not bound to a cpu

2015-03-19 Thread Saravana Kannan

On 09/23/2014 11:33 AM, Thomas Gleixner wrote:

On Mon, 15 Sep 2014, Joonwoo Park wrote:

+#ifdef CONFIG_SMP
+static struct tvec_base *tvec_base_deferral = boot_tvec_bases;
+#endif


In principle I like the idea of a deferrable wheel, but this
implementation is going to go nowhere.


Hi Thomas,

To give some more context:

This bug is a serious pain in the a** for anyone using deferrable timers 
or deferrable workqueues today for some periodic work and don't care for 
which CPU the code runs in.


Couple of examples of such issues in existing code:

1) In a SMP system, CPUfreq governors (ondemand and conservative) end up 
queueing a deferrable work on every single CPU and the first one to run 
the deferrable workqueue cancels the work on all other CPUS, runs the 
work and then sets up a workqueue on all the CPUs again for the next 
sampling point.


2) Devfreq actually doesn't work well today because it doesn't do this 
nasty hack like CPUfreq. So, if the devfreq work happens to run on a CPU 
that's typically idle, then it doesn't run for a long time. I've 
actually seen logs where the devfreq polling interval is set to 20ms, 
but ends up running 800ms later.


Don't know how many other drivers are suffering from this bug. Yes, 
IMHO, this is a bug. When the timer is CPU unbound, anyone who hasn't 
looked at the actual timer code would assume that it'll run as long as 
any CPU is busy.



First of all making it SMP only is silly. The deferrable stuff is a
pain in other places as well.

But whats way worse is:


+static inline void __run_timers(struct tvec_base *base, bool try)
  {
struct timer_list *timer;

-   spin_lock_irq(base-lock);
+   if (!try)
+   spin_lock_irq(base-lock);
+   else if (!spin_trylock_irq(base-lock))
+   return;


Yuck. All cpus fighting about a single spinlock roughly at the same
time? You just created a proper thundering herd cacheline bouncing
issue.


I agree, This is not good. I think a simple way to fix this is to have 
only the CPU that's the jiffy owner to run through this deferrable timer 
list.


That should address any unnecessary cache bouncing issues. Would that be 
an acceptable solution to this?




No way. We have already mechanisms in place to deal with such
problems, you just have to use them.


I'm not sure which problem you are referring to here. Or what the 
already existing solutions are.


I don't think you were referring to the deferrable timer getting 
delayed for long periods despite CPUs being active problem, because I 
don't think we have a better solution than this patch (with the jiffy 
owner CPU fix). Asking every driver that doesn't care which CPU the work 
runs on to queue a work or set up a timer on every CPU is definitely not 
a good solution -- that's spreading the complexity to every other driver 
instead of fixing it in one place. And that causes unnecessary cache 
pollution too.


Thoughts? I would really like to see a fix for this go in soon so that 
we can simplify cpufreq and have devfreq and other drivers work correctly.


Thanks,
Saravana

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/5] cpufreq, fix locking around CPUFREQ_GOV_POLICY_EXIT calls

2014-11-13 Thread Saravana Kannan

On 11/11/2014 05:07 AM, Viresh Kumar wrote:

On 11 November 2014 17:45, Prarit Bhargava  wrote:

the deadlock in commit 955ef4833574636819cd269cfbae12f79cbde63a

[   75.471265]CPU0CPU1
[   75.476327]
[   75.481385]   lock(>rwsem);
[   75.485307]lock(s_active#219);
[   75.491857]lock(>rwsem);
[   75.498592]   lock(s_active#219);
[   75.502331]
[   75.502331]  *** DEADLOCK ***


I wanted to understand how this deadlock is prevented by a simple change
to trylock..


And also your changelog talks about accessing invalid pointers
without the trylock change, how can that be possible? After the read
lock is taken,
all the pointers should be valid.


consider the following very simple case:

the governor is ondemand.  cpu 0 reads cpuinfo_cur_freq. cpu0 expects to get the
current cpu freq for the ondemand governor.


Name it A.



simultaneously, cpu1 changes the governor from ondemand to userspace.


Name it B.



the two threads will race for the policy->mutex

suppose cpu0 gets it first.  then there is no problem.  the userspace program
for cpu0 gets exactly the data it is expecting.

Now suppose cpu1 gets the lock and starts to write ... cpu0 is blocked.

cpu1 completes the governor change, and cpu0 gets the mutex ... and returns
bogus data at this point.


What do you mean by bogus here? That userspace wouldn't be able to know if
the value is for which governor?

If that's the case than it can still happen. Issue both above commands at almost
the same time. You will never be able to differentiate if the sequence is:

- A followed by B
- B followed by A
- A waited for B and so returned -EBUSY (Only this will be clear)

And the value read can still be bogus. So, we haven't solved the problem at all.


Ah, we are on this topic again I see. I didn't read the patch/thread 
fully, but I can guess where this is going by reading the partial set of 
patches.


Prarit,

You can't just try lock to avoid the deadlock. If you do, then the 
userspace API becomes a mess. Writes to scaling_governor (or anything 
else) will no longer by guaranteed to work. Userspace will have to read 
back, check and retry. That would break a ton of existing userpace scripts.


Viresh,

The deadlock scenario is read. That's why the code is what it is today.

All,

IMO, the right way to fix this is to have the governor have over it's 
list of attributes it want to expose thru sysfs to the cpufreq 
framework. Then the framework can add/remove this in the right order 
when the governors are changed. The framework can do this outside of the 
policy lock being held when the governors are switched. This would allow 
avoid the original deadlock between sysfs locks and the policy lock 
without just ever having to fail userspace writes to scaling_governor.


-Saravana


--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/5] cpufreq, fix locking around CPUFREQ_GOV_POLICY_EXIT calls

2014-11-13 Thread Saravana Kannan

On 11/11/2014 05:07 AM, Viresh Kumar wrote:

On 11 November 2014 17:45, Prarit Bhargava pra...@redhat.com wrote:

the deadlock in commit 955ef4833574636819cd269cfbae12f79cbde63a

[   75.471265]CPU0CPU1
[   75.476327]
[   75.481385]   lock(policy-rwsem);
[   75.485307]lock(s_active#219);
[   75.491857]lock(policy-rwsem);
[   75.498592]   lock(s_active#219);
[   75.502331]
[   75.502331]  *** DEADLOCK ***


I wanted to understand how this deadlock is prevented by a simple change
to trylock..


And also your changelog talks about accessing invalid pointers
without the trylock change, how can that be possible? After the read
lock is taken,
all the pointers should be valid.


consider the following very simple case:

the governor is ondemand.  cpu 0 reads cpuinfo_cur_freq. cpu0 expects to get the
current cpu freq for the ondemand governor.


Name it A.



simultaneously, cpu1 changes the governor from ondemand to userspace.


Name it B.



the two threads will race for the policy-mutex

suppose cpu0 gets it first.  then there is no problem.  the userspace program
for cpu0 gets exactly the data it is expecting.

Now suppose cpu1 gets the lock and starts to write ... cpu0 is blocked.

cpu1 completes the governor change, and cpu0 gets the mutex ... and returns
bogus data at this point.


What do you mean by bogus here? That userspace wouldn't be able to know if
the value is for which governor?

If that's the case than it can still happen. Issue both above commands at almost
the same time. You will never be able to differentiate if the sequence is:

- A followed by B
- B followed by A
- A waited for B and so returned -EBUSY (Only this will be clear)

And the value read can still be bogus. So, we haven't solved the problem at all.


Ah, we are on this topic again I see. I didn't read the patch/thread 
fully, but I can guess where this is going by reading the partial set of 
patches.


Prarit,

You can't just try lock to avoid the deadlock. If you do, then the 
userspace API becomes a mess. Writes to scaling_governor (or anything 
else) will no longer by guaranteed to work. Userspace will have to read 
back, check and retry. That would break a ton of existing userpace scripts.


Viresh,

The deadlock scenario is read. That's why the code is what it is today.

All,

IMO, the right way to fix this is to have the governor have over it's 
list of attributes it want to expose thru sysfs to the cpufreq 
framework. Then the framework can add/remove this in the right order 
when the governors are changed. The framework can do this outside of the 
policy lock being held when the governors are switched. This would allow 
avoid the original deadlock between sysfs locks and the policy lock 
without just ever having to fail userspace writes to scaling_governor.


-Saravana


--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 0/5] Simplify hotplug/suspend handling

2014-10-23 Thread Saravana Kannan

On 10/16/2014 01:53 AM, Viresh Kumar wrote:

On 25 July 2014 06:37, Saravana Kannan  wrote:

Series of patchs to simplify policy/sysfs/kobj/locking handling across
suspend/resume

The following have been tested so far on a 2x2 cluster environment:
- Boot with 2 cpus and no cpufreq driver.
- mod probe driver and see cpufreq sysfs files show up only for the 1st cluster.
- Online the rest of the 2 CPUs and have files show up correctly.
- rmmod the driver and see the files go away.
- modprobe again (or back and forth multiples times) and see it work.
- suspend/resume works as expected.
- When a cluster is offline, all read/writes to its sysfs files return an error

v4
- Split it up into smaller patches
- Will handle physical CPU removal correctly
- Fixed earlier mistake of deleting code under !recover_policy
- Dropped some code refactor that reuses a lot of code between add/remove
- Dropped fix for exiting hotplug race with cpufreq driver probe/rmmod
- Dropped changes will come later once this series is acked.


Hi Saravana,

Any  updates on this? We might need some of this soon or should somebody
else start working on this ?



Hey,

Sorry for the delay. Got side tracked with some commercial stuff. I'm 
still invested in finishing this up. I'll try to send out something 
within a week.


I did notice (didn't read mych) the "Locking issues with cpufreq and 
sysfs" thread. I think my patches should side step most of it.


Thanks,
Saravana

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 0/5] Simplify hotplug/suspend handling

2014-10-23 Thread Saravana Kannan

On 10/16/2014 01:53 AM, Viresh Kumar wrote:

On 25 July 2014 06:37, Saravana Kannan skan...@codeaurora.org wrote:

Series of patchs to simplify policy/sysfs/kobj/locking handling across
suspend/resume

The following have been tested so far on a 2x2 cluster environment:
- Boot with 2 cpus and no cpufreq driver.
- mod probe driver and see cpufreq sysfs files show up only for the 1st cluster.
- Online the rest of the 2 CPUs and have files show up correctly.
- rmmod the driver and see the files go away.
- modprobe again (or back and forth multiples times) and see it work.
- suspend/resume works as expected.
- When a cluster is offline, all read/writes to its sysfs files return an error

v4
- Split it up into smaller patches
- Will handle physical CPU removal correctly
- Fixed earlier mistake of deleting code under !recover_policy
- Dropped some code refactor that reuses a lot of code between add/remove
- Dropped fix for exiting hotplug race with cpufreq driver probe/rmmod
- Dropped changes will come later once this series is acked.


Hi Saravana,

Any  updates on this? We might need some of this soon or should somebody
else start working on this ?



Hey,

Sorry for the delay. Got side tracked with some commercial stuff. I'm 
still invested in finishing this up. I'll try to send out something 
within a week.


I did notice (didn't read mych) the Locking issues with cpufreq and 
sysfs thread. I think my patches should side step most of it.


Thanks,
Saravana

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


<    5   6   7   8   9   10   11   12   13   >