Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-21 Thread Petr Mladek
On Tue 2016-11-15 08:40:57, Jacob Pan wrote:
> On Tue, 15 Nov 2016 19:36:26 +0800
> Zhang Rui  wrote:
> 
> > On Mon, 2016-11-14 at 11:12 -0800, Jacob Pan wrote:
> > > On Fri, 11 Nov 2016 18:34:10 +0100
> > > Petr Mladek  wrote:
> > >   
> > > > 
> > > > Please, let me known if I should do this as a separate patch
> > > > and/or resend the patchset.  
> > > Rui/Eduardo,
> > > There are four independent posted changes made to powerclamp driver:
> > > Should I roll them into one set such that you can easily apply them?
> > > 1. add back module device table https://lkml.org/lkml/2016/11/14/760
> > > 2. kworker, this patchset
> > > 3. cpu hotplug state, this patch.
> > > 4. PF_IDLE https://lkml.org/lkml/2016/11/14/747
> > >   
> > hmm, I will apply patch 1 and queue up for next -rc and 4.8 stable.
> > 
> > please resend the others that are targeted for 4.10 and later in one
> > patch set.
> OK, will do. I guess Petr and Sebastian are OK with me putting your
> powerclamp changes in one patchset too?

Sure. I am fine with it.

Best Regards,
Petr


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-21 Thread Petr Mladek
On Tue 2016-11-15 08:40:57, Jacob Pan wrote:
> On Tue, 15 Nov 2016 19:36:26 +0800
> Zhang Rui  wrote:
> 
> > On Mon, 2016-11-14 at 11:12 -0800, Jacob Pan wrote:
> > > On Fri, 11 Nov 2016 18:34:10 +0100
> > > Petr Mladek  wrote:
> > >   
> > > > 
> > > > Please, let me known if I should do this as a separate patch
> > > > and/or resend the patchset.  
> > > Rui/Eduardo,
> > > There are four independent posted changes made to powerclamp driver:
> > > Should I roll them into one set such that you can easily apply them?
> > > 1. add back module device table https://lkml.org/lkml/2016/11/14/760
> > > 2. kworker, this patchset
> > > 3. cpu hotplug state, this patch.
> > > 4. PF_IDLE https://lkml.org/lkml/2016/11/14/747
> > >   
> > hmm, I will apply patch 1 and queue up for next -rc and 4.8 stable.
> > 
> > please resend the others that are targeted for 4.10 and later in one
> > patch set.
> OK, will do. I guess Petr and Sebastian are OK with me putting your
> powerclamp changes in one patchset too?

Sure. I am fine with it.

Best Regards,
Petr


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-15 Thread Jacob Pan
On Tue, 15 Nov 2016 19:36:26 +0800
Zhang Rui  wrote:

> On Mon, 2016-11-14 at 11:12 -0800, Jacob Pan wrote:
> > On Fri, 11 Nov 2016 18:34:10 +0100
> > Petr Mladek  wrote:
> >   
> > > 
> > > Please, let me known if I should do this as a separate patch
> > > and/or resend the patchset.  
> > Rui/Eduardo,
> > There are four independent posted changes made to powerclamp driver:
> > Should I roll them into one set such that you can easily apply them?
> > 1. add back module device table https://lkml.org/lkml/2016/11/14/760
> > 2. kworker, this patchset
> > 3. cpu hotplug state, this patch.
> > 4. PF_IDLE https://lkml.org/lkml/2016/11/14/747
> >   
> hmm, I will apply patch 1 and queue up for next -rc and 4.8 stable.
> 
> please resend the others that are targeted for 4.10 and later in one
> patch set.
OK, will do. I guess Petr and Sebastian are OK with me putting your
powerclamp changes in one patchset too?


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-15 Thread Jacob Pan
On Tue, 15 Nov 2016 19:36:26 +0800
Zhang Rui  wrote:

> On Mon, 2016-11-14 at 11:12 -0800, Jacob Pan wrote:
> > On Fri, 11 Nov 2016 18:34:10 +0100
> > Petr Mladek  wrote:
> >   
> > > 
> > > Please, let me known if I should do this as a separate patch
> > > and/or resend the patchset.  
> > Rui/Eduardo,
> > There are four independent posted changes made to powerclamp driver:
> > Should I roll them into one set such that you can easily apply them?
> > 1. add back module device table https://lkml.org/lkml/2016/11/14/760
> > 2. kworker, this patchset
> > 3. cpu hotplug state, this patch.
> > 4. PF_IDLE https://lkml.org/lkml/2016/11/14/747
> >   
> hmm, I will apply patch 1 and queue up for next -rc and 4.8 stable.
> 
> please resend the others that are targeted for 4.10 and later in one
> patch set.
OK, will do. I guess Petr and Sebastian are OK with me putting your
powerclamp changes in one patchset too?


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-15 Thread Zhang Rui
On Mon, 2016-11-14 at 11:12 -0800, Jacob Pan wrote:
> On Fri, 11 Nov 2016 18:34:10 +0100
> Petr Mladek  wrote:
> 
> > 
> > Please, let me known if I should do this as a separate patch
> > and/or resend the patchset.
> Rui/Eduardo,
> There are four independent posted changes made to powerclamp driver:
> Should I roll them into one set such that you can easily apply them?
> 1. add back module device table https://lkml.org/lkml/2016/11/14/760
> 2. kworker, this patchset
> 3. cpu hotplug state, this patch.
> 4. PF_IDLE https://lkml.org/lkml/2016/11/14/747
> 
hmm, I will apply patch 1 and queue up for next -rc and 4.8 stable.

please resend the others that are targeted for 4.10 and later in one
patch set.

thanks,
rui


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-15 Thread Zhang Rui
On Mon, 2016-11-14 at 11:12 -0800, Jacob Pan wrote:
> On Fri, 11 Nov 2016 18:34:10 +0100
> Petr Mladek  wrote:
> 
> > 
> > Please, let me known if I should do this as a separate patch
> > and/or resend the patchset.
> Rui/Eduardo,
> There are four independent posted changes made to powerclamp driver:
> Should I roll them into one set such that you can easily apply them?
> 1. add back module device table https://lkml.org/lkml/2016/11/14/760
> 2. kworker, this patchset
> 3. cpu hotplug state, this patch.
> 4. PF_IDLE https://lkml.org/lkml/2016/11/14/747
> 
hmm, I will apply patch 1 and queue up for next -rc and 4.8 stable.

please resend the others that are targeted for 4.10 and later in one
patch set.

thanks,
rui


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-14 Thread Jacob Pan
On Fri, 11 Nov 2016 18:34:10 +0100
Petr Mladek  wrote:

> Please, let me known if I should do this as a separate patch
> and/or resend the patchset.

Rui/Eduardo,
There are four independent posted changes made to powerclamp driver:
Should I roll them into one set such that you can easily apply them?
1. add back module device table https://lkml.org/lkml/2016/11/14/760
2. kworker, this patchset
3. cpu hotplug state, this patch.
4. PF_IDLE https://lkml.org/lkml/2016/11/14/747

Thanks,

Jacob


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-14 Thread Jacob Pan
On Fri, 11 Nov 2016 18:34:10 +0100
Petr Mladek  wrote:

> Please, let me known if I should do this as a separate patch
> and/or resend the patchset.

Rui/Eduardo,
There are four independent posted changes made to powerclamp driver:
Should I roll them into one set such that you can easily apply them?
1. add back module device table https://lkml.org/lkml/2016/11/14/760
2. kworker, this patchset
3. cpu hotplug state, this patch.
4. PF_IDLE https://lkml.org/lkml/2016/11/14/747

Thanks,

Jacob


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-11 Thread Petr Mladek
On Fri 2016-11-11 11:07:13, Petr Mladek wrote:
> I am going to do some more tests and will send a fix. It should
> be enough to remove the KTW_FREEZABLE flag from the
> kthread_create_worker_on_cpu() call.

Please, find below an updated patch that fixes the suspend with
kidle_inject kthreads running. The kthread worker must not
get catched by the freezer.

Please, let me known if I should do this as a separate patch
and/or resend the patchset.


>From ffdabb3d61d4ac4fe48bc170a47d2be93cf58e48 Mon Sep 17 00:00:00 2001
From: Sebastian Andrzej Siewior 
Date: Thu, 1 Sep 2016 11:23:57 +0200
Subject: [PATCH] thermal/intel_powerclamp: Convert to CPU hotplug state

This is a conversation to the new hotplug state machine with
the difference that CPU_DEAD becomes CPU_PREDOWN.

At the same time it makes the handling of the two states symmetrical.
stop_power_clamp_worker() is called unconditionally and the controversial
error message is removed.

The kthread worker must not be freezable. Otherwise it could _not_ be
destroyed in the cpu predown callback. In particular, kthread_stop()
or kthread_flush_worker() would hang. These calls wait for the kthread
to do some job and it would never happen when it was frozen.

Finally, the hotplug state callbacks are removed after the powerclamping
is stopped to avoid a potential race.

Signed-off-by: Sebastian Andrzej Siewior 
[pmla...@suse.com: Fixed freezer and a possible race in powerclamp_exit().]
Signed-off-by: Petr Mladek 
---
 drivers/thermal/intel_powerclamp.c | 73 +++---
 1 file changed, 36 insertions(+), 37 deletions(-)

diff --git a/drivers/thermal/intel_powerclamp.c 
b/drivers/thermal/intel_powerclamp.c
index 2b99c0627043..d9b9e0fb4e48 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -43,7 +43,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -530,8 +529,7 @@ static void start_power_clamp_worker(unsigned long cpu)
struct powerclamp_worker_data *w_data = per_cpu_ptr(worker_data, cpu);
struct kthread_worker *worker;
 
-   worker = kthread_create_worker_on_cpu(cpu, KTW_FREEZABLE,
- "kidle_inject/%ld", cpu);
+   worker = kthread_create_worker_on_cpu(cpu, 0, "kidle_inject/%ld", cpu);
if (IS_ERR(worker))
return;
 
@@ -622,43 +620,35 @@ static void end_power_clamp(void)
}
 }
 
-static int powerclamp_cpu_callback(struct notifier_block *nfb,
-   unsigned long action, void *hcpu)
+static int powerclamp_cpu_online(unsigned int cpu)
 {
-   unsigned long cpu = (unsigned long)hcpu;
+   if (clamping == false)
+   return 0;
+   start_power_clamp_worker(cpu);
+   /* prefer BSP as controlling CPU */
+   if (cpu == 0) {
+   control_cpu = 0;
+   smp_mb();
+   }
+   return 0;
+}
 
-   if (false == clamping)
-   goto exit_ok;
+static int powerclamp_cpu_predown(unsigned int cpu)
+{
+   if (clamping == false)
+   return 0;
 
-   switch (action) {
-   case CPU_ONLINE:
-   start_power_clamp_worker(cpu);
-   /* prefer BSP as controlling CPU */
-   if (cpu == 0) {
-   control_cpu = 0;
-   smp_mb();
-   }
-   break;
-   case CPU_DEAD:
-   if (test_bit(cpu, cpu_clamping_mask)) {
-   pr_err("cpu %lu dead but powerclamping thread is not\n",
-   cpu);
-   stop_power_clamp_worker(cpu);
-   }
-   if (cpu == control_cpu) {
-   control_cpu = smp_processor_id();
-   smp_mb();
-   }
-   }
+   stop_power_clamp_worker(cpu);
+   if (cpu != control_cpu)
+   return 0;
 
-exit_ok:
-   return NOTIFY_OK;
+   control_cpu = cpumask_first(cpu_online_mask);
+   if (control_cpu == cpu)
+   control_cpu = cpumask_next(cpu, cpu_online_mask);
+   smp_mb();
+   return 0;
 }
 
-static struct notifier_block powerclamp_cpu_notifier = {
-   .notifier_call = powerclamp_cpu_callback,
-};
-
 static int powerclamp_get_max_state(struct thermal_cooling_device *cdev,
 unsigned long *state)
 {
@@ -778,6 +768,8 @@ static inline void powerclamp_create_debug_files(void)
debugfs_remove_recursive(debug_dir);
 }
 
+static enum cpuhp_state hp_state;
+
 static int __init powerclamp_init(void)
 {
int retval;
@@ -795,7 +787,14 @@ static int __init powerclamp_init(void)
 
/* set default limit, maybe adjusted during runtime based on feedback */
window_size = 2;
-   register_hotcpu_notifier(_cpu_notifier);
+   retval = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+ 

Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-11 Thread Petr Mladek
On Fri 2016-11-11 11:07:13, Petr Mladek wrote:
> I am going to do some more tests and will send a fix. It should
> be enough to remove the KTW_FREEZABLE flag from the
> kthread_create_worker_on_cpu() call.

Please, find below an updated patch that fixes the suspend with
kidle_inject kthreads running. The kthread worker must not
get catched by the freezer.

Please, let me known if I should do this as a separate patch
and/or resend the patchset.


>From ffdabb3d61d4ac4fe48bc170a47d2be93cf58e48 Mon Sep 17 00:00:00 2001
From: Sebastian Andrzej Siewior 
Date: Thu, 1 Sep 2016 11:23:57 +0200
Subject: [PATCH] thermal/intel_powerclamp: Convert to CPU hotplug state

This is a conversation to the new hotplug state machine with
the difference that CPU_DEAD becomes CPU_PREDOWN.

At the same time it makes the handling of the two states symmetrical.
stop_power_clamp_worker() is called unconditionally and the controversial
error message is removed.

The kthread worker must not be freezable. Otherwise it could _not_ be
destroyed in the cpu predown callback. In particular, kthread_stop()
or kthread_flush_worker() would hang. These calls wait for the kthread
to do some job and it would never happen when it was frozen.

Finally, the hotplug state callbacks are removed after the powerclamping
is stopped to avoid a potential race.

Signed-off-by: Sebastian Andrzej Siewior 
[pmla...@suse.com: Fixed freezer and a possible race in powerclamp_exit().]
Signed-off-by: Petr Mladek 
---
 drivers/thermal/intel_powerclamp.c | 73 +++---
 1 file changed, 36 insertions(+), 37 deletions(-)

diff --git a/drivers/thermal/intel_powerclamp.c 
b/drivers/thermal/intel_powerclamp.c
index 2b99c0627043..d9b9e0fb4e48 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -43,7 +43,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -530,8 +529,7 @@ static void start_power_clamp_worker(unsigned long cpu)
struct powerclamp_worker_data *w_data = per_cpu_ptr(worker_data, cpu);
struct kthread_worker *worker;
 
-   worker = kthread_create_worker_on_cpu(cpu, KTW_FREEZABLE,
- "kidle_inject/%ld", cpu);
+   worker = kthread_create_worker_on_cpu(cpu, 0, "kidle_inject/%ld", cpu);
if (IS_ERR(worker))
return;
 
@@ -622,43 +620,35 @@ static void end_power_clamp(void)
}
 }
 
-static int powerclamp_cpu_callback(struct notifier_block *nfb,
-   unsigned long action, void *hcpu)
+static int powerclamp_cpu_online(unsigned int cpu)
 {
-   unsigned long cpu = (unsigned long)hcpu;
+   if (clamping == false)
+   return 0;
+   start_power_clamp_worker(cpu);
+   /* prefer BSP as controlling CPU */
+   if (cpu == 0) {
+   control_cpu = 0;
+   smp_mb();
+   }
+   return 0;
+}
 
-   if (false == clamping)
-   goto exit_ok;
+static int powerclamp_cpu_predown(unsigned int cpu)
+{
+   if (clamping == false)
+   return 0;
 
-   switch (action) {
-   case CPU_ONLINE:
-   start_power_clamp_worker(cpu);
-   /* prefer BSP as controlling CPU */
-   if (cpu == 0) {
-   control_cpu = 0;
-   smp_mb();
-   }
-   break;
-   case CPU_DEAD:
-   if (test_bit(cpu, cpu_clamping_mask)) {
-   pr_err("cpu %lu dead but powerclamping thread is not\n",
-   cpu);
-   stop_power_clamp_worker(cpu);
-   }
-   if (cpu == control_cpu) {
-   control_cpu = smp_processor_id();
-   smp_mb();
-   }
-   }
+   stop_power_clamp_worker(cpu);
+   if (cpu != control_cpu)
+   return 0;
 
-exit_ok:
-   return NOTIFY_OK;
+   control_cpu = cpumask_first(cpu_online_mask);
+   if (control_cpu == cpu)
+   control_cpu = cpumask_next(cpu, cpu_online_mask);
+   smp_mb();
+   return 0;
 }
 
-static struct notifier_block powerclamp_cpu_notifier = {
-   .notifier_call = powerclamp_cpu_callback,
-};
-
 static int powerclamp_get_max_state(struct thermal_cooling_device *cdev,
 unsigned long *state)
 {
@@ -778,6 +768,8 @@ static inline void powerclamp_create_debug_files(void)
debugfs_remove_recursive(debug_dir);
 }
 
+static enum cpuhp_state hp_state;
+
 static int __init powerclamp_init(void)
 {
int retval;
@@ -795,7 +787,14 @@ static int __init powerclamp_init(void)
 
/* set default limit, maybe adjusted during runtime based on feedback */
window_size = 2;
-   register_hotcpu_notifier(_cpu_notifier);
+   retval = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+  

Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-11 Thread Petr Mladek
On Fri 2016-11-11 10:33:30, Petr Mladek wrote:
> Then I tried to revert the conversion to the kthread worker
> API (2nd patch from this patchset), see below. And it still
> hangs during the suspend inside
> 
>   powerclamp_cpu_predown()
> kthread_stop(*percpu_thread);
> 
> 
> Note that both kthread_flush_worker() and kthread_stop()
> waits until the kthread gets scheduled and do some job.
> Also note that the kthread is bound to the given CPU.
> 
> My guess is that the kthread cannot be scheduled at this stage.
> I wonder if the CPU is already partially down or that tasks
> are freezed so that "normal" tasks are not scheduled at
> this point. I am still trying to understand the code
> related to suspend, cpu hotplug, and scheduler.

And yes, the problem seems to be that the kthread is freezed
so that it could not run. The suspend works when I disable:

clamp_thread()
//set_freezable();
//try_to_freeze();

In fact, we should not need these calls. They are needed only
when we want to stop the kthread on exact location so that
it does not produce I/O that would block/break suspend.
But this is not the case of intel_powerclamp.

I am going to do some more tests and will send a fix. It should
be enough to remove the KTW_FREEZABLE flag from the
kthread_create_worker_on_cpu() call.

Best Regards,
Petr


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-11 Thread Petr Mladek
On Fri 2016-11-11 10:33:30, Petr Mladek wrote:
> Then I tried to revert the conversion to the kthread worker
> API (2nd patch from this patchset), see below. And it still
> hangs during the suspend inside
> 
>   powerclamp_cpu_predown()
> kthread_stop(*percpu_thread);
> 
> 
> Note that both kthread_flush_worker() and kthread_stop()
> waits until the kthread gets scheduled and do some job.
> Also note that the kthread is bound to the given CPU.
> 
> My guess is that the kthread cannot be scheduled at this stage.
> I wonder if the CPU is already partially down or that tasks
> are freezed so that "normal" tasks are not scheduled at
> this point. I am still trying to understand the code
> related to suspend, cpu hotplug, and scheduler.

And yes, the problem seems to be that the kthread is freezed
so that it could not run. The suspend works when I disable:

clamp_thread()
//set_freezable();
//try_to_freeze();

In fact, we should not need these calls. They are needed only
when we want to stop the kthread on exact location so that
it does not produce I/O that would block/break suspend.
But this is not the case of intel_powerclamp.

I am going to do some more tests and will send a fix. It should
be enough to remove the KTW_FREEZABLE flag from the
kthread_create_worker_on_cpu() call.

Best Regards,
Petr


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-11 Thread Petr Mladek
On Thu 2016-10-27 13:27:36, Jacob Pan wrote:
> On Thu, 27 Oct 2016 17:17:26 +0200
> Sebastian Andrzej Siewior  wrote:
> 
> > On 2016-10-27 16:53:48 [+0200], Petr Mladek wrote:
> > > 
> > > In each case, I wonder if the problem is caused by the conversion
> > > to the kthread worker or by the CPU hotplug state conversion.  
> > 
> > drop the hotplug patch and you will see.
> > 
> Petr,
> 
> I dropped hp patch no long see the hang during suspend to s3. However,
> the problem seems to be this line,
> 
> diff --git a/drivers/thermal/intel_powerclamp.c
> b/drivers/thermal/intel_powerclamp.c index 390e50b..b61da57 100644
> --- a/drivers/thermal/intel_powerclamp.c
> +++ b/drivers/thermal/intel_powerclamp.c
> @@ -574,7 +574,7 @@ static void stop_power_clamp_worker(unsigned long
> cpu) */
> del_timer_sync(_data->wakeup_timer);
> clear_bit(w_data->cpu, cpu_clamping_mask);
> -   kthread_destroy_worker(w_data->worker);
> +// kthread_destroy_worker(w_data->worker);
>  
> w_data->worker = NULL;
>  }
> 
> If I do the above, everything works with S3 and CPU HP patch.
> 
> Inside kthread_destroy_worker()
>   kthread_flush_worker(worker);
> never completes then blocks s3 entry!

The kthread_flush_worker() was not needed because the queue was empty
in this case (runtime checked). But it still hangs on

  kthread_destroy_worker()
kthread_stop(task);


Then I tried to revert the conversion to the kthread worker
API (2nd patch from this patchset), see below. And it still
hangs during the suspend inside

powerclamp_cpu_predown()
  kthread_stop(*percpu_thread);


Note that both kthread_flush_worker() and kthread_stop()
waits until the kthread gets scheduled and do some job.
Also note that the kthread is bound to the given CPU.

My guess is that the kthread cannot be scheduled at this stage.
I wonder if the CPU is already partially down or that tasks
are freezed so that "normal" tasks are not scheduled at
this point. I am still trying to understand the code
related to suspend, cpu hotplug, and scheduler.


Just for record. Below is the conversion to the new
CPU hotplug state that can be applied on top
of the 1st patch from this patchset ("thermal/intel_powerclamp:
 Remove duplicated code that starts the kthread").
It allows to rule out the kthread worker API for the moment.


>From 928b066ea07532d82c802912e17b44bd01ec5665 Mon Sep 17 00:00:00 2001
From: Sebastian Andrzej Siewior 
Date: Thu, 1 Sep 2016 11:23:57 +0200
Subject: [PATCH] thermal/intel_powerclamp: Convert to CPU hotplug state

This is a conversation to the new hotplug state machine with
the difference that CPU_DEAD becomes CPU_PREDOWN.

At the same time it makes the handling of the two states symmetrical.
stop_power_clamp_worker() is called unconditionally and the controversial
error message is removed.

Finally, the hotplug state callbacks are removed after the powerclamping
is stopped to avoid a potential race.

Signed-off-by: Sebastian Andrzej Siewior 
[pmla...@suse.com: Fixed the possible race in powerclamp_exit()]
Signed-off-by: Petr Mladek 
---
 drivers/thermal/intel_powerclamp.c | 70 --
 1 file changed, 36 insertions(+), 34 deletions(-)

diff --git a/drivers/thermal/intel_powerclamp.c 
b/drivers/thermal/intel_powerclamp.c
index 0244805b7b86..4ebfea3f0f1e 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -572,45 +572,38 @@ static void end_power_clamp(void)
}
 }
 
-static int powerclamp_cpu_callback(struct notifier_block *nfb,
-   unsigned long action, void *hcpu)
+static int powerclamp_cpu_online(unsigned int cpu)
+{
+   if (clamping == false)
+   return 0;
+   start_power_clamp_thread(cpu);
+   /* prefer BSP as controlling CPU */
+   if (cpu == 0) {
+   control_cpu = 0;
+   smp_mb();
+   }
+   return 0;
+}
+
+static int powerclamp_cpu_predown(unsigned int cpu)
 {
-   unsigned long cpu = (unsigned long)hcpu;
struct task_struct **percpu_thread =
per_cpu_ptr(powerclamp_thread, cpu);
 
-   if (false == clamping)
-   goto exit_ok;
+   if (clamping == false)
+   return 0;
 
-   switch (action) {
-   case CPU_ONLINE:
-   start_power_clamp_thread(cpu);
-   /* prefer BSP as controlling CPU */
-   if (cpu == 0) {
-   control_cpu = 0;
-   smp_mb();
-   }
-   break;
-   case CPU_DEAD:
-   if (test_bit(cpu, cpu_clamping_mask)) {
-   pr_err("cpu %lu dead but powerclamping thread is not\n",
-   cpu);
-   kthread_stop(*percpu_thread);
-   }
-   if (cpu == control_cpu) {
-   

Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-11-11 Thread Petr Mladek
On Thu 2016-10-27 13:27:36, Jacob Pan wrote:
> On Thu, 27 Oct 2016 17:17:26 +0200
> Sebastian Andrzej Siewior  wrote:
> 
> > On 2016-10-27 16:53:48 [+0200], Petr Mladek wrote:
> > > 
> > > In each case, I wonder if the problem is caused by the conversion
> > > to the kthread worker or by the CPU hotplug state conversion.  
> > 
> > drop the hotplug patch and you will see.
> > 
> Petr,
> 
> I dropped hp patch no long see the hang during suspend to s3. However,
> the problem seems to be this line,
> 
> diff --git a/drivers/thermal/intel_powerclamp.c
> b/drivers/thermal/intel_powerclamp.c index 390e50b..b61da57 100644
> --- a/drivers/thermal/intel_powerclamp.c
> +++ b/drivers/thermal/intel_powerclamp.c
> @@ -574,7 +574,7 @@ static void stop_power_clamp_worker(unsigned long
> cpu) */
> del_timer_sync(_data->wakeup_timer);
> clear_bit(w_data->cpu, cpu_clamping_mask);
> -   kthread_destroy_worker(w_data->worker);
> +// kthread_destroy_worker(w_data->worker);
>  
> w_data->worker = NULL;
>  }
> 
> If I do the above, everything works with S3 and CPU HP patch.
> 
> Inside kthread_destroy_worker()
>   kthread_flush_worker(worker);
> never completes then blocks s3 entry!

The kthread_flush_worker() was not needed because the queue was empty
in this case (runtime checked). But it still hangs on

  kthread_destroy_worker()
kthread_stop(task);


Then I tried to revert the conversion to the kthread worker
API (2nd patch from this patchset), see below. And it still
hangs during the suspend inside

powerclamp_cpu_predown()
  kthread_stop(*percpu_thread);


Note that both kthread_flush_worker() and kthread_stop()
waits until the kthread gets scheduled and do some job.
Also note that the kthread is bound to the given CPU.

My guess is that the kthread cannot be scheduled at this stage.
I wonder if the CPU is already partially down or that tasks
are freezed so that "normal" tasks are not scheduled at
this point. I am still trying to understand the code
related to suspend, cpu hotplug, and scheduler.


Just for record. Below is the conversion to the new
CPU hotplug state that can be applied on top
of the 1st patch from this patchset ("thermal/intel_powerclamp:
 Remove duplicated code that starts the kthread").
It allows to rule out the kthread worker API for the moment.


>From 928b066ea07532d82c802912e17b44bd01ec5665 Mon Sep 17 00:00:00 2001
From: Sebastian Andrzej Siewior 
Date: Thu, 1 Sep 2016 11:23:57 +0200
Subject: [PATCH] thermal/intel_powerclamp: Convert to CPU hotplug state

This is a conversation to the new hotplug state machine with
the difference that CPU_DEAD becomes CPU_PREDOWN.

At the same time it makes the handling of the two states symmetrical.
stop_power_clamp_worker() is called unconditionally and the controversial
error message is removed.

Finally, the hotplug state callbacks are removed after the powerclamping
is stopped to avoid a potential race.

Signed-off-by: Sebastian Andrzej Siewior 
[pmla...@suse.com: Fixed the possible race in powerclamp_exit()]
Signed-off-by: Petr Mladek 
---
 drivers/thermal/intel_powerclamp.c | 70 --
 1 file changed, 36 insertions(+), 34 deletions(-)

diff --git a/drivers/thermal/intel_powerclamp.c 
b/drivers/thermal/intel_powerclamp.c
index 0244805b7b86..4ebfea3f0f1e 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -572,45 +572,38 @@ static void end_power_clamp(void)
}
 }
 
-static int powerclamp_cpu_callback(struct notifier_block *nfb,
-   unsigned long action, void *hcpu)
+static int powerclamp_cpu_online(unsigned int cpu)
+{
+   if (clamping == false)
+   return 0;
+   start_power_clamp_thread(cpu);
+   /* prefer BSP as controlling CPU */
+   if (cpu == 0) {
+   control_cpu = 0;
+   smp_mb();
+   }
+   return 0;
+}
+
+static int powerclamp_cpu_predown(unsigned int cpu)
 {
-   unsigned long cpu = (unsigned long)hcpu;
struct task_struct **percpu_thread =
per_cpu_ptr(powerclamp_thread, cpu);
 
-   if (false == clamping)
-   goto exit_ok;
+   if (clamping == false)
+   return 0;
 
-   switch (action) {
-   case CPU_ONLINE:
-   start_power_clamp_thread(cpu);
-   /* prefer BSP as controlling CPU */
-   if (cpu == 0) {
-   control_cpu = 0;
-   smp_mb();
-   }
-   break;
-   case CPU_DEAD:
-   if (test_bit(cpu, cpu_clamping_mask)) {
-   pr_err("cpu %lu dead but powerclamping thread is not\n",
-   cpu);
-   kthread_stop(*percpu_thread);
-   }
-   if (cpu == control_cpu) {
-   control_cpu = smp_processor_id();
-   smp_mb();
-   }
-

Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-27 Thread Jacob Pan
On Thu, 27 Oct 2016 17:17:26 +0200
Sebastian Andrzej Siewior  wrote:

> On 2016-10-27 16:53:48 [+0200], Petr Mladek wrote:
> > 
> > In each case, I wonder if the problem is caused by the conversion
> > to the kthread worker or by the CPU hotplug state conversion.  
> 
> drop the hotplug patch and you will see.
> 
Petr,

I dropped hp patch no long see the hang during suspend to s3. However,
the problem seems to be this line,

diff --git a/drivers/thermal/intel_powerclamp.c
b/drivers/thermal/intel_powerclamp.c index 390e50b..b61da57 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -574,7 +574,7 @@ static void stop_power_clamp_worker(unsigned long
cpu) */
del_timer_sync(_data->wakeup_timer);
clear_bit(w_data->cpu, cpu_clamping_mask);
-   kthread_destroy_worker(w_data->worker);
+// kthread_destroy_worker(w_data->worker);
 
w_data->worker = NULL;
 }

If I do the above, everything works with S3 and CPU HP patch.

Inside kthread_destroy_worker()
kthread_flush_worker(worker);
never completes then blocks s3 entry!

I will be at LPC next week also, we can chat about that.


Thanks,

Jacob


> > Best Regards,
> > Petr  
> 
> Sebastian

[Jacob Pan]


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-27 Thread Jacob Pan
On Thu, 27 Oct 2016 17:17:26 +0200
Sebastian Andrzej Siewior  wrote:

> On 2016-10-27 16:53:48 [+0200], Petr Mladek wrote:
> > 
> > In each case, I wonder if the problem is caused by the conversion
> > to the kthread worker or by the CPU hotplug state conversion.  
> 
> drop the hotplug patch and you will see.
> 
Petr,

I dropped hp patch no long see the hang during suspend to s3. However,
the problem seems to be this line,

diff --git a/drivers/thermal/intel_powerclamp.c
b/drivers/thermal/intel_powerclamp.c index 390e50b..b61da57 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -574,7 +574,7 @@ static void stop_power_clamp_worker(unsigned long
cpu) */
del_timer_sync(_data->wakeup_timer);
clear_bit(w_data->cpu, cpu_clamping_mask);
-   kthread_destroy_worker(w_data->worker);
+// kthread_destroy_worker(w_data->worker);
 
w_data->worker = NULL;
 }

If I do the above, everything works with S3 and CPU HP patch.

Inside kthread_destroy_worker()
kthread_flush_worker(worker);
never completes then blocks s3 entry!

I will be at LPC next week also, we can chat about that.


Thanks,

Jacob


> > Best Regards,
> > Petr  
> 
> Sebastian

[Jacob Pan]


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-27 Thread Sebastian Andrzej Siewior
On 2016-10-27 16:53:48 [+0200], Petr Mladek wrote:
> 
> In each case, I wonder if the problem is caused by the conversion
> to the kthread worker or by the CPU hotplug state conversion.

drop the hotplug patch and you will see.

> Best Regards,
> Petr

Sebastian


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-27 Thread Sebastian Andrzej Siewior
On 2016-10-27 16:53:48 [+0200], Petr Mladek wrote:
> 
> In each case, I wonder if the problem is caused by the conversion
> to the kthread worker or by the CPU hotplug state conversion.

drop the hotplug patch and you will see.

> Best Regards,
> Petr

Sebastian


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-27 Thread Petr Mladek
On Mon 2016-10-24 09:55:29, Jacob Pan wrote:
> On Mon, 24 Oct 2016 17:48:07 +0200
> Petr Mladek  wrote:
> 
> > On Fri 2016-10-21 13:21:18, Jacob Pan wrote:
> > > On Mon, 17 Oct 2016 14:32:52 +0200
> > > Petr Mladek  wrote:
> > > 
> > > > From: Sebastian Andrzej Siewior 
> > > > 
> > > Hi Sebastian,
> > > I applied this patchset on 4.9-rc1 and run some cpu online/offline
> > > loops test while injecting idle, e.g. 25%. I got system hang after a
> > > few cycles. Still looking into root cause.
> > 
> > You might need the patch
> > ("sched/fair: Fix sched domains NULL deference in
> > select_idle_sibling()") from linux-tip, see
> > https://lkml.kernel.org/r/tip-9cfb38a7ba5a9c27c1af8093fb1af4b699c0a...@git.kernel.org
> > 
> > I have mentioned it in the cover letter. I am sorry if it was not
> > visible enough.
> Thanks for point it out again, I missed it.
> 
> Now rc2 has this fix so I tried again cpuhp seems to work well now.
> 
> However, I found suspend to ram does not work with this patchset. Have
> you  tested it with "echo mem > /sys/power/state" while doing idle
> injection?
> It fails to enter s3 on my system.

Hmm, I haven't tested the system suspend. I am sorry but I am snowed
under some other tasks this week and will be on the Plumbers
conference the next week. I hope that I will have some time to look
at it then.

In each case, I wonder if the problem is caused by the conversion
to the kthread worker or by the CPU hotplug state conversion.

Best Regards,
Petr


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-27 Thread Petr Mladek
On Mon 2016-10-24 09:55:29, Jacob Pan wrote:
> On Mon, 24 Oct 2016 17:48:07 +0200
> Petr Mladek  wrote:
> 
> > On Fri 2016-10-21 13:21:18, Jacob Pan wrote:
> > > On Mon, 17 Oct 2016 14:32:52 +0200
> > > Petr Mladek  wrote:
> > > 
> > > > From: Sebastian Andrzej Siewior 
> > > > 
> > > Hi Sebastian,
> > > I applied this patchset on 4.9-rc1 and run some cpu online/offline
> > > loops test while injecting idle, e.g. 25%. I got system hang after a
> > > few cycles. Still looking into root cause.
> > 
> > You might need the patch
> > ("sched/fair: Fix sched domains NULL deference in
> > select_idle_sibling()") from linux-tip, see
> > https://lkml.kernel.org/r/tip-9cfb38a7ba5a9c27c1af8093fb1af4b699c0a...@git.kernel.org
> > 
> > I have mentioned it in the cover letter. I am sorry if it was not
> > visible enough.
> Thanks for point it out again, I missed it.
> 
> Now rc2 has this fix so I tried again cpuhp seems to work well now.
> 
> However, I found suspend to ram does not work with this patchset. Have
> you  tested it with "echo mem > /sys/power/state" while doing idle
> injection?
> It fails to enter s3 on my system.

Hmm, I haven't tested the system suspend. I am sorry but I am snowed
under some other tasks this week and will be on the Plumbers
conference the next week. I hope that I will have some time to look
at it then.

In each case, I wonder if the problem is caused by the conversion
to the kthread worker or by the CPU hotplug state conversion.

Best Regards,
Petr


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-24 Thread Jacob Pan
On Mon, 24 Oct 2016 17:48:07 +0200
Petr Mladek  wrote:

> On Fri 2016-10-21 13:21:18, Jacob Pan wrote:
> > On Mon, 17 Oct 2016 14:32:52 +0200
> > Petr Mladek  wrote:
> > 
> > > From: Sebastian Andrzej Siewior 
> > > 
> > Hi Sebastian,
> > I applied this patchset on 4.9-rc1 and run some cpu online/offline
> > loops test while injecting idle, e.g. 25%. I got system hang after a
> > few cycles. Still looking into root cause.
> 
> You might need the patch
> ("sched/fair: Fix sched domains NULL deference in
> select_idle_sibling()") from linux-tip, see
> https://lkml.kernel.org/r/tip-9cfb38a7ba5a9c27c1af8093fb1af4b699c0a...@git.kernel.org
> 
> I have mentioned it in the cover letter. I am sorry if it was not
> visible enough.
Thanks for point it out again, I missed it.

Now rc2 has this fix so I tried again cpuhp seems to work well now.

However, I found suspend to ram does not work with this patchset. Have
you  tested it with "echo mem > /sys/power/state" while doing idle
injection?
It fails to enter s3 on my system.


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-24 Thread Jacob Pan
On Mon, 24 Oct 2016 17:48:07 +0200
Petr Mladek  wrote:

> On Fri 2016-10-21 13:21:18, Jacob Pan wrote:
> > On Mon, 17 Oct 2016 14:32:52 +0200
> > Petr Mladek  wrote:
> > 
> > > From: Sebastian Andrzej Siewior 
> > > 
> > Hi Sebastian,
> > I applied this patchset on 4.9-rc1 and run some cpu online/offline
> > loops test while injecting idle, e.g. 25%. I got system hang after a
> > few cycles. Still looking into root cause.
> 
> You might need the patch
> ("sched/fair: Fix sched domains NULL deference in
> select_idle_sibling()") from linux-tip, see
> https://lkml.kernel.org/r/tip-9cfb38a7ba5a9c27c1af8093fb1af4b699c0a...@git.kernel.org
> 
> I have mentioned it in the cover letter. I am sorry if it was not
> visible enough.
Thanks for point it out again, I missed it.

Now rc2 has this fix so I tried again cpuhp seems to work well now.

However, I found suspend to ram does not work with this patchset. Have
you  tested it with "echo mem > /sys/power/state" while doing idle
injection?
It fails to enter s3 on my system.


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-24 Thread Petr Mladek
On Fri 2016-10-21 13:21:18, Jacob Pan wrote:
> On Mon, 17 Oct 2016 14:32:52 +0200
> Petr Mladek  wrote:
> 
> > From: Sebastian Andrzej Siewior 
> > 
> Hi Sebastian,
> I applied this patchset on 4.9-rc1 and run some cpu online/offline
> loops test while injecting idle, e.g. 25%. I got system hang after a
> few cycles. Still looking into root cause.

You might need the patch
("sched/fair: Fix sched domains NULL deference in select_idle_sibling()")
from linux-tip, see
https://lkml.kernel.org/r/tip-9cfb38a7ba5a9c27c1af8093fb1af4b699c0a...@git.kernel.org

I have mentioned it in the cover letter. I am sorry if it was not
visible enough.

Best Regards,
Petr


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-24 Thread Petr Mladek
On Fri 2016-10-21 13:21:18, Jacob Pan wrote:
> On Mon, 17 Oct 2016 14:32:52 +0200
> Petr Mladek  wrote:
> 
> > From: Sebastian Andrzej Siewior 
> > 
> Hi Sebastian,
> I applied this patchset on 4.9-rc1 and run some cpu online/offline
> loops test while injecting idle, e.g. 25%. I got system hang after a
> few cycles. Still looking into root cause.

You might need the patch
("sched/fair: Fix sched domains NULL deference in select_idle_sibling()")
from linux-tip, see
https://lkml.kernel.org/r/tip-9cfb38a7ba5a9c27c1af8093fb1af4b699c0a...@git.kernel.org

I have mentioned it in the cover letter. I am sorry if it was not
visible enough.

Best Regards,
Petr


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-21 Thread Jacob Pan
On Mon, 17 Oct 2016 14:32:52 +0200
Petr Mladek  wrote:

> From: Sebastian Andrzej Siewior 
> 
Hi Sebastian,
I applied this patchset on 4.9-rc1 and run some cpu online/offline
loops test while injecting idle, e.g. 25%. I got system hang after a
few cycles. Still looking into root cause.

Thanks,

Jacob
> This is a conversation to the new hotplug state machine with
> the difference that CPU_DEAD becomes CPU_PREDOWN.
> 
> At the same time it makes the handling of the two states symmetrical.
> stop_power_clamp_worker() is called unconditionally and the
> controversial error message is removed.
> 
> Finally, the hotplug state callbacks are removed after the
> powerclamping is stopped to avoid a potential race.
> 
> Signed-off-by: Sebastian Andrzej Siewior 
> [pmla...@suse.com: Fixed the possible race in powerclamp_exit()]
> Signed-off-by: Petr Mladek 
> ---
>  drivers/thermal/intel_powerclamp.c | 69
> +++--- 1 file changed, 35
> insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/thermal/intel_powerclamp.c
> b/drivers/thermal/intel_powerclamp.c index a94f7c849a4e..390e50b97324
> 100644 --- a/drivers/thermal/intel_powerclamp.c
> +++ b/drivers/thermal/intel_powerclamp.c
> @@ -622,43 +622,35 @@ static void end_power_clamp(void)
>   }
>  }
>  
> -static int powerclamp_cpu_callback(struct notifier_block *nfb,
> - unsigned long action, void *hcpu)
> +static int powerclamp_cpu_online(unsigned int cpu)
>  {
> - unsigned long cpu = (unsigned long)hcpu;
> + if (clamping == false)
> + return 0;
> + start_power_clamp_worker(cpu);
> + /* prefer BSP as controlling CPU */
> + if (cpu == 0) {
> + control_cpu = 0;
> + smp_mb();
> + }
> + return 0;
> +}
>  
> - if (false == clamping)
> - goto exit_ok;
> +static int powerclamp_cpu_predown(unsigned int cpu)
> +{
> + if (clamping == false)
> + return 0;
>  
> - switch (action) {
> - case CPU_ONLINE:
> - start_power_clamp_worker(cpu);
> - /* prefer BSP as controlling CPU */
> - if (cpu == 0) {
> - control_cpu = 0;
> - smp_mb();
> - }
> - break;
> - case CPU_DEAD:
> - if (test_bit(cpu, cpu_clamping_mask)) {
> - pr_err("cpu %lu dead but powerclamping
> thread is not\n",
> - cpu);
> - stop_power_clamp_worker(cpu);
> - }
> - if (cpu == control_cpu) {
> - control_cpu = smp_processor_id();
> - smp_mb();
> - }
> - }
> + stop_power_clamp_worker(cpu);
> + if (cpu != control_cpu)
> + return 0;
>  
> -exit_ok:
> - return NOTIFY_OK;
> + control_cpu = cpumask_first(cpu_online_mask);
> + if (control_cpu == cpu)
> + control_cpu = cpumask_next(cpu, cpu_online_mask);
> + smp_mb();
> + return 0;
>  }
>  
> -static struct notifier_block powerclamp_cpu_notifier = {
> - .notifier_call = powerclamp_cpu_callback,
> -};
> -
>  static int powerclamp_get_max_state(struct thermal_cooling_device
> *cdev, unsigned long *state)
>  {
> @@ -788,6 +780,8 @@ static inline void
> powerclamp_create_debug_files(void)
> debugfs_remove_recursive(debug_dir); }
>  
> +static enum cpuhp_state hp_state;
> +
>  static int __init powerclamp_init(void)
>  {
>   int retval;
> @@ -805,7 +799,14 @@ static int __init powerclamp_init(void)
>  
>   /* set default limit, maybe adjusted during runtime based on
> feedback */ window_size = 2;
> - register_hotcpu_notifier(_cpu_notifier);
> + retval = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
> +
> "thermal/intel_powerclamp:online",
> +powerclamp_cpu_online,
> +powerclamp_cpu_predown);
> + if (retval < 0)
> + goto exit_free;
> +
> + hp_state = retval;
>  
>   worker_data = alloc_percpu(struct powerclamp_worker_data);
>   if (!worker_data) {
> @@ -830,7 +831,7 @@ static int __init powerclamp_init(void)
>  exit_free_thread:
>   free_percpu(worker_data);
>  exit_unregister:
> - unregister_hotcpu_notifier(_cpu_notifier);
> + cpuhp_remove_state_nocalls(hp_state);
>  exit_free:
>   kfree(cpu_clamping_mask);
>   return retval;
> @@ -839,8 +840,8 @@ static int __init powerclamp_init(void)
>  
>  static void __exit powerclamp_exit(void)
>  {
> - unregister_hotcpu_notifier(_cpu_notifier);
>   end_power_clamp();
> + cpuhp_remove_state_nocalls(hp_state);
>   free_percpu(worker_data);
>   thermal_cooling_device_unregister(cooling_dev);
>   kfree(cpu_clamping_mask);

[Jacob Pan]


Re: [PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-21 Thread Jacob Pan
On Mon, 17 Oct 2016 14:32:52 +0200
Petr Mladek  wrote:

> From: Sebastian Andrzej Siewior 
> 
Hi Sebastian,
I applied this patchset on 4.9-rc1 and run some cpu online/offline
loops test while injecting idle, e.g. 25%. I got system hang after a
few cycles. Still looking into root cause.

Thanks,

Jacob
> This is a conversation to the new hotplug state machine with
> the difference that CPU_DEAD becomes CPU_PREDOWN.
> 
> At the same time it makes the handling of the two states symmetrical.
> stop_power_clamp_worker() is called unconditionally and the
> controversial error message is removed.
> 
> Finally, the hotplug state callbacks are removed after the
> powerclamping is stopped to avoid a potential race.
> 
> Signed-off-by: Sebastian Andrzej Siewior 
> [pmla...@suse.com: Fixed the possible race in powerclamp_exit()]
> Signed-off-by: Petr Mladek 
> ---
>  drivers/thermal/intel_powerclamp.c | 69
> +++--- 1 file changed, 35
> insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/thermal/intel_powerclamp.c
> b/drivers/thermal/intel_powerclamp.c index a94f7c849a4e..390e50b97324
> 100644 --- a/drivers/thermal/intel_powerclamp.c
> +++ b/drivers/thermal/intel_powerclamp.c
> @@ -622,43 +622,35 @@ static void end_power_clamp(void)
>   }
>  }
>  
> -static int powerclamp_cpu_callback(struct notifier_block *nfb,
> - unsigned long action, void *hcpu)
> +static int powerclamp_cpu_online(unsigned int cpu)
>  {
> - unsigned long cpu = (unsigned long)hcpu;
> + if (clamping == false)
> + return 0;
> + start_power_clamp_worker(cpu);
> + /* prefer BSP as controlling CPU */
> + if (cpu == 0) {
> + control_cpu = 0;
> + smp_mb();
> + }
> + return 0;
> +}
>  
> - if (false == clamping)
> - goto exit_ok;
> +static int powerclamp_cpu_predown(unsigned int cpu)
> +{
> + if (clamping == false)
> + return 0;
>  
> - switch (action) {
> - case CPU_ONLINE:
> - start_power_clamp_worker(cpu);
> - /* prefer BSP as controlling CPU */
> - if (cpu == 0) {
> - control_cpu = 0;
> - smp_mb();
> - }
> - break;
> - case CPU_DEAD:
> - if (test_bit(cpu, cpu_clamping_mask)) {
> - pr_err("cpu %lu dead but powerclamping
> thread is not\n",
> - cpu);
> - stop_power_clamp_worker(cpu);
> - }
> - if (cpu == control_cpu) {
> - control_cpu = smp_processor_id();
> - smp_mb();
> - }
> - }
> + stop_power_clamp_worker(cpu);
> + if (cpu != control_cpu)
> + return 0;
>  
> -exit_ok:
> - return NOTIFY_OK;
> + control_cpu = cpumask_first(cpu_online_mask);
> + if (control_cpu == cpu)
> + control_cpu = cpumask_next(cpu, cpu_online_mask);
> + smp_mb();
> + return 0;
>  }
>  
> -static struct notifier_block powerclamp_cpu_notifier = {
> - .notifier_call = powerclamp_cpu_callback,
> -};
> -
>  static int powerclamp_get_max_state(struct thermal_cooling_device
> *cdev, unsigned long *state)
>  {
> @@ -788,6 +780,8 @@ static inline void
> powerclamp_create_debug_files(void)
> debugfs_remove_recursive(debug_dir); }
>  
> +static enum cpuhp_state hp_state;
> +
>  static int __init powerclamp_init(void)
>  {
>   int retval;
> @@ -805,7 +799,14 @@ static int __init powerclamp_init(void)
>  
>   /* set default limit, maybe adjusted during runtime based on
> feedback */ window_size = 2;
> - register_hotcpu_notifier(_cpu_notifier);
> + retval = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
> +
> "thermal/intel_powerclamp:online",
> +powerclamp_cpu_online,
> +powerclamp_cpu_predown);
> + if (retval < 0)
> + goto exit_free;
> +
> + hp_state = retval;
>  
>   worker_data = alloc_percpu(struct powerclamp_worker_data);
>   if (!worker_data) {
> @@ -830,7 +831,7 @@ static int __init powerclamp_init(void)
>  exit_free_thread:
>   free_percpu(worker_data);
>  exit_unregister:
> - unregister_hotcpu_notifier(_cpu_notifier);
> + cpuhp_remove_state_nocalls(hp_state);
>  exit_free:
>   kfree(cpu_clamping_mask);
>   return retval;
> @@ -839,8 +840,8 @@ static int __init powerclamp_init(void)
>  
>  static void __exit powerclamp_exit(void)
>  {
> - unregister_hotcpu_notifier(_cpu_notifier);
>   end_power_clamp();
> + cpuhp_remove_state_nocalls(hp_state);
>   free_percpu(worker_data);
>   thermal_cooling_device_unregister(cooling_dev);
>   kfree(cpu_clamping_mask);

[Jacob Pan]


[PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-17 Thread Petr Mladek
From: Sebastian Andrzej Siewior 

This is a conversation to the new hotplug state machine with
the difference that CPU_DEAD becomes CPU_PREDOWN.

At the same time it makes the handling of the two states symmetrical.
stop_power_clamp_worker() is called unconditionally and the controversial
error message is removed.

Finally, the hotplug state callbacks are removed after the powerclamping
is stopped to avoid a potential race.

Signed-off-by: Sebastian Andrzej Siewior 
[pmla...@suse.com: Fixed the possible race in powerclamp_exit()]
Signed-off-by: Petr Mladek 
---
 drivers/thermal/intel_powerclamp.c | 69 +++---
 1 file changed, 35 insertions(+), 34 deletions(-)

diff --git a/drivers/thermal/intel_powerclamp.c 
b/drivers/thermal/intel_powerclamp.c
index a94f7c849a4e..390e50b97324 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -622,43 +622,35 @@ static void end_power_clamp(void)
}
 }
 
-static int powerclamp_cpu_callback(struct notifier_block *nfb,
-   unsigned long action, void *hcpu)
+static int powerclamp_cpu_online(unsigned int cpu)
 {
-   unsigned long cpu = (unsigned long)hcpu;
+   if (clamping == false)
+   return 0;
+   start_power_clamp_worker(cpu);
+   /* prefer BSP as controlling CPU */
+   if (cpu == 0) {
+   control_cpu = 0;
+   smp_mb();
+   }
+   return 0;
+}
 
-   if (false == clamping)
-   goto exit_ok;
+static int powerclamp_cpu_predown(unsigned int cpu)
+{
+   if (clamping == false)
+   return 0;
 
-   switch (action) {
-   case CPU_ONLINE:
-   start_power_clamp_worker(cpu);
-   /* prefer BSP as controlling CPU */
-   if (cpu == 0) {
-   control_cpu = 0;
-   smp_mb();
-   }
-   break;
-   case CPU_DEAD:
-   if (test_bit(cpu, cpu_clamping_mask)) {
-   pr_err("cpu %lu dead but powerclamping thread is not\n",
-   cpu);
-   stop_power_clamp_worker(cpu);
-   }
-   if (cpu == control_cpu) {
-   control_cpu = smp_processor_id();
-   smp_mb();
-   }
-   }
+   stop_power_clamp_worker(cpu);
+   if (cpu != control_cpu)
+   return 0;
 
-exit_ok:
-   return NOTIFY_OK;
+   control_cpu = cpumask_first(cpu_online_mask);
+   if (control_cpu == cpu)
+   control_cpu = cpumask_next(cpu, cpu_online_mask);
+   smp_mb();
+   return 0;
 }
 
-static struct notifier_block powerclamp_cpu_notifier = {
-   .notifier_call = powerclamp_cpu_callback,
-};
-
 static int powerclamp_get_max_state(struct thermal_cooling_device *cdev,
 unsigned long *state)
 {
@@ -788,6 +780,8 @@ static inline void powerclamp_create_debug_files(void)
debugfs_remove_recursive(debug_dir);
 }
 
+static enum cpuhp_state hp_state;
+
 static int __init powerclamp_init(void)
 {
int retval;
@@ -805,7 +799,14 @@ static int __init powerclamp_init(void)
 
/* set default limit, maybe adjusted during runtime based on feedback */
window_size = 2;
-   register_hotcpu_notifier(_cpu_notifier);
+   retval = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+  "thermal/intel_powerclamp:online",
+  powerclamp_cpu_online,
+  powerclamp_cpu_predown);
+   if (retval < 0)
+   goto exit_free;
+
+   hp_state = retval;
 
worker_data = alloc_percpu(struct powerclamp_worker_data);
if (!worker_data) {
@@ -830,7 +831,7 @@ static int __init powerclamp_init(void)
 exit_free_thread:
free_percpu(worker_data);
 exit_unregister:
-   unregister_hotcpu_notifier(_cpu_notifier);
+   cpuhp_remove_state_nocalls(hp_state);
 exit_free:
kfree(cpu_clamping_mask);
return retval;
@@ -839,8 +840,8 @@ static int __init powerclamp_init(void)
 
 static void __exit powerclamp_exit(void)
 {
-   unregister_hotcpu_notifier(_cpu_notifier);
end_power_clamp();
+   cpuhp_remove_state_nocalls(hp_state);
free_percpu(worker_data);
thermal_cooling_device_unregister(cooling_dev);
kfree(cpu_clamping_mask);
-- 
1.8.5.6



[PATCH 3/3] thermal/intel_powerclamp: Convert to CPU hotplug state

2016-10-17 Thread Petr Mladek
From: Sebastian Andrzej Siewior 

This is a conversation to the new hotplug state machine with
the difference that CPU_DEAD becomes CPU_PREDOWN.

At the same time it makes the handling of the two states symmetrical.
stop_power_clamp_worker() is called unconditionally and the controversial
error message is removed.

Finally, the hotplug state callbacks are removed after the powerclamping
is stopped to avoid a potential race.

Signed-off-by: Sebastian Andrzej Siewior 
[pmla...@suse.com: Fixed the possible race in powerclamp_exit()]
Signed-off-by: Petr Mladek 
---
 drivers/thermal/intel_powerclamp.c | 69 +++---
 1 file changed, 35 insertions(+), 34 deletions(-)

diff --git a/drivers/thermal/intel_powerclamp.c 
b/drivers/thermal/intel_powerclamp.c
index a94f7c849a4e..390e50b97324 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -622,43 +622,35 @@ static void end_power_clamp(void)
}
 }
 
-static int powerclamp_cpu_callback(struct notifier_block *nfb,
-   unsigned long action, void *hcpu)
+static int powerclamp_cpu_online(unsigned int cpu)
 {
-   unsigned long cpu = (unsigned long)hcpu;
+   if (clamping == false)
+   return 0;
+   start_power_clamp_worker(cpu);
+   /* prefer BSP as controlling CPU */
+   if (cpu == 0) {
+   control_cpu = 0;
+   smp_mb();
+   }
+   return 0;
+}
 
-   if (false == clamping)
-   goto exit_ok;
+static int powerclamp_cpu_predown(unsigned int cpu)
+{
+   if (clamping == false)
+   return 0;
 
-   switch (action) {
-   case CPU_ONLINE:
-   start_power_clamp_worker(cpu);
-   /* prefer BSP as controlling CPU */
-   if (cpu == 0) {
-   control_cpu = 0;
-   smp_mb();
-   }
-   break;
-   case CPU_DEAD:
-   if (test_bit(cpu, cpu_clamping_mask)) {
-   pr_err("cpu %lu dead but powerclamping thread is not\n",
-   cpu);
-   stop_power_clamp_worker(cpu);
-   }
-   if (cpu == control_cpu) {
-   control_cpu = smp_processor_id();
-   smp_mb();
-   }
-   }
+   stop_power_clamp_worker(cpu);
+   if (cpu != control_cpu)
+   return 0;
 
-exit_ok:
-   return NOTIFY_OK;
+   control_cpu = cpumask_first(cpu_online_mask);
+   if (control_cpu == cpu)
+   control_cpu = cpumask_next(cpu, cpu_online_mask);
+   smp_mb();
+   return 0;
 }
 
-static struct notifier_block powerclamp_cpu_notifier = {
-   .notifier_call = powerclamp_cpu_callback,
-};
-
 static int powerclamp_get_max_state(struct thermal_cooling_device *cdev,
 unsigned long *state)
 {
@@ -788,6 +780,8 @@ static inline void powerclamp_create_debug_files(void)
debugfs_remove_recursive(debug_dir);
 }
 
+static enum cpuhp_state hp_state;
+
 static int __init powerclamp_init(void)
 {
int retval;
@@ -805,7 +799,14 @@ static int __init powerclamp_init(void)
 
/* set default limit, maybe adjusted during runtime based on feedback */
window_size = 2;
-   register_hotcpu_notifier(_cpu_notifier);
+   retval = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+  "thermal/intel_powerclamp:online",
+  powerclamp_cpu_online,
+  powerclamp_cpu_predown);
+   if (retval < 0)
+   goto exit_free;
+
+   hp_state = retval;
 
worker_data = alloc_percpu(struct powerclamp_worker_data);
if (!worker_data) {
@@ -830,7 +831,7 @@ static int __init powerclamp_init(void)
 exit_free_thread:
free_percpu(worker_data);
 exit_unregister:
-   unregister_hotcpu_notifier(_cpu_notifier);
+   cpuhp_remove_state_nocalls(hp_state);
 exit_free:
kfree(cpu_clamping_mask);
return retval;
@@ -839,8 +840,8 @@ static int __init powerclamp_init(void)
 
 static void __exit powerclamp_exit(void)
 {
-   unregister_hotcpu_notifier(_cpu_notifier);
end_power_clamp();
+   cpuhp_remove_state_nocalls(hp_state);
free_percpu(worker_data);
thermal_cooling_device_unregister(cooling_dev);
kfree(cpu_clamping_mask);
-- 
1.8.5.6