Re: [PATCH 2/3] cpuidle: coupled: abort idle if pokes are pending

2013-08-26 Thread Joseph Lo
On Sat, 2013-08-24 at 03:45 +0800, Colin Cross wrote:
> Joseph Lo <jose...@nvidia.com> reported a lockup on Tegra3 caused
> by a race condition in coupled cpuidle.  When two or more cpus
Actually, this issue can be reproduced on both Tegra20 and Tegra30
platforms. I suggest replacing Tegra3 with Tegra20 here, since Tegra20 is
the only platform using the coupled CPU idle function in mainline right now.
> enter idle at the same time, the first cpus to arrive may go to the
> ready loop without processing pending pokes from the last cpu to
> arrive.
> 
> This patch adds a check for pending pokes once all cpus have been
> synchronized in the ready loop and resets the coupled state and
> retries if any cpus failed to handle their pending poke.
> 
> Retrying on all cpus may trigger the same issue again, so this patch
> also adds a check to ensure that each cpu has received at least one
> poke between when it enters the waiting loop and when it moves on to
> the ready loop.
> 
> Reported-by: Joseph Lo <jose...@nvidia.com>
> CC: sta...@vger.kernel.org
> Signed-off-by: Colin Cross <ccr...@android.com>
> ---
>  drivers/cpuidle/coupled.c | 107 +++---
>  1 file changed, 82 insertions(+), 25 deletions(-)
> 
[snip]
> +/*
> + * The cpuidle_coupled_poke_pending mask is used to ensure that each cpu has
s/cpuidle_coupled_poke_pending/cpuidle_coupled_poked/? :)
> + * been poked once to minimize entering the ready loop with a poke pending,
> + * which would require aborting and retrying.
> + */
> +static cpumask_t cpuidle_coupled_poked;
> 
I had fixed this issue on Tegra20 by checking whether an SGI is pending
and aborting the coupled state if so. The lockup can still be reproduced
easily if I remove that checking code, so I tested that case with this
patch applied, and the result is good. This patch does indeed fix the issue.
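
For reference, a minimal sketch of the kind of check I mean, assuming a
GIC-based platform like Tegra20. The helper name and the gic_dist_base
parameter are made up here for illustration; only GIC_DIST_PENDING_SET
(from linux/irqchip/arm-gic.h) and readl_relaxed are real kernel symbols:

static bool pending_sgi(void __iomem *gic_dist_base)
{
	/*
	 * SGIs are interrupt IDs 0-15, so they occupy the low 16 bits
	 * of the first set-pending register in the GIC distributor.
	 */
	u32 pending = readl_relaxed(gic_dist_base + GIC_DIST_PENDING_SET);

	return pending & 0xffff;
}

The platform idle code would call something like this just before
committing to the coupled state, and fall back to a shallower state if
it returns true.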

I also tested with the other two patches in the series applied; they
didn't cause any regression.

So this series:
Tested-by: Joseph Lo <jose...@nvidia.com>


[PATCH 2/3] cpuidle: coupled: abort idle if pokes are pending

2013-08-23 Thread Colin Cross
Joseph Lo <jose...@nvidia.com> reported a lockup on Tegra3 caused
by a race condition in coupled cpuidle.  When two or more cpus
enter idle at the same time, the first cpus to arrive may go to the
ready loop without processing pending pokes from the last cpu to
arrive.

This patch adds a check for pending pokes once all cpus have been
synchronized in the ready loop and resets the coupled state and
retries if any cpus failed to handle their pending poke.

Retrying on all cpus may trigger the same issue again, so this patch
also adds a check to ensure that each cpu has received at least one
poke between when it enters the waiting loop and when it moves on to
the ready loop.

Reported-by: Joseph Lo <jose...@nvidia.com>
CC: sta...@vger.kernel.org
Signed-off-by: Colin Cross <ccr...@android.com>
---
 drivers/cpuidle/coupled.c | 107 +++---
 1 file changed, 82 insertions(+), 25 deletions(-)

diff --git a/drivers/cpuidle/coupled.c b/drivers/cpuidle/coupled.c
index db92bcb..bbc4ba5 100644
--- a/drivers/cpuidle/coupled.c
+++ b/drivers/cpuidle/coupled.c
@@ -106,6 +106,7 @@ struct cpuidle_coupled {
cpumask_t coupled_cpus;
int requested_state[NR_CPUS];
atomic_t ready_waiting_counts;
+   atomic_t abort_barrier;
int online_count;
int refcnt;
int prevent;
@@ -122,12 +123,19 @@ static DEFINE_MUTEX(cpuidle_coupled_lock);
 static DEFINE_PER_CPU(struct call_single_data, cpuidle_coupled_poke_cb);
 
 /*
- * The cpuidle_coupled_poked_mask mask is used to avoid calling
+ * The cpuidle_coupled_poke_pending mask is used to avoid calling
  * __smp_call_function_single with the per cpu call_single_data struct already
  * in use.  This prevents a deadlock where two cpus are waiting for each others
  * call_single_data struct to be available
  */
-static cpumask_t cpuidle_coupled_poked_mask;
+static cpumask_t cpuidle_coupled_poke_pending;
+
+/*
+ * The cpuidle_coupled_poke_pending mask is used to ensure that each cpu has
+ * been poked once to minimize entering the ready loop with a poke pending,
+ * which would require aborting and retrying.
+ */
+static cpumask_t cpuidle_coupled_poked;
 
 /**
  * cpuidle_coupled_parallel_barrier - synchronize all online coupled cpus
@@ -291,10 +299,11 @@ static inline int cpuidle_coupled_get_state(struct cpuidle_device *dev,
return state;
 }
 
-static void cpuidle_coupled_poked(void *info)
+static void cpuidle_coupled_handle_poke(void *info)
 {
int cpu = (unsigned long)info;
-   cpumask_clear_cpu(cpu, &cpuidle_coupled_poked_mask);
+   cpumask_set_cpu(cpu, &cpuidle_coupled_poked);
+   cpumask_clear_cpu(cpu, &cpuidle_coupled_poke_pending);
 }
 
 /**
@@ -313,7 +322,7 @@ static void cpuidle_coupled_poke(int cpu)
 {
struct call_single_data *csd = &per_cpu(cpuidle_coupled_poke_cb, cpu);
 
-   if (!cpumask_test_and_set_cpu(cpu, &cpuidle_coupled_poked_mask))
+   if (!cpumask_test_and_set_cpu(cpu, &cpuidle_coupled_poke_pending))
__smp_call_function_single(cpu, csd, 0);
 }
 
@@ -340,30 +349,19 @@ static void cpuidle_coupled_poke_others(int this_cpu,
  * @coupled: the struct coupled that contains the current cpu
  * @next_state: the index in drv->states of the requested state for this cpu
  *
- * Updates the requested idle state for the specified cpuidle device,
- * poking all coupled cpus out of idle if necessary to let them see the new
- * state.
+ * Updates the requested idle state for the specified cpuidle device.
+ * Returns the number of waiting cpus.
  */
-static void cpuidle_coupled_set_waiting(int cpu,
+static int cpuidle_coupled_set_waiting(int cpu,
struct cpuidle_coupled *coupled, int next_state)
 {
-   int w;
-
coupled->requested_state[cpu] = next_state;
 
/*
-* If this is the last cpu to enter the waiting state, poke
-* all the other cpus out of their waiting state so they can
-* enter a deeper state.  This can race with one of the cpus
-* exiting the waiting state due to an interrupt and
-* decrementing waiting_count, see comment below.
-*
 * The atomic_inc_return provides a write barrier to order the write
 * to requested_state with the later write that increments ready_count.
 */
-   w = atomic_inc_return(&coupled->ready_waiting_counts) & WAITING_MASK;
-   if (w == coupled->online_count)
-   cpuidle_coupled_poke_others(cpu, coupled);
+   return atomic_inc_return(&coupled->ready_waiting_counts) & WAITING_MASK;
 }
 
 /**
@@ -418,13 +416,24 @@ static void cpuidle_coupled_set_done(int cpu, struct cpuidle_coupled *coupled)
 static int cpuidle_coupled_clear_pokes(int cpu)
 {
local_irq_enable();
-   while (cpumask_test_cpu(cpu, &cpuidle_coupled_poked_mask))
+   while (cpumask_test_cpu(cpu, &cpuidle_coupled_poke_pending))
cpu_relax();
local_irq_disable();
 
return need_resched() ? -EINTR : 0;
 }
 
+static bool cpuidle_coupled_any_pokes_pending(struct cpuidle_coupled *coupled)
+{
+   cpumask_t cpus;
+   
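
(The archive truncates the diff at this point.) For readers following the
thread, here is a hedged sketch of how the rest plausibly fits together,
reconstructed only from identifiers already visible in this patch; the
helper body, the label name, and exact placement are assumptions, not the
literal remainder of the diff:

static bool cpuidle_coupled_any_pokes_pending(struct cpuidle_coupled *coupled)
{
	cpumask_t cpus;

	/* only cpus that are online and part of this coupled set matter */
	cpumask_and(&cpus, cpu_online_mask, &coupled->coupled_cpus);

	/* true if any of those cpus still has an unhandled poke */
	return cpumask_intersects(&cpus, &cpuidle_coupled_poke_pending);
}

and, in the ready loop once every cpu has checked in:

	/*
	 * If any coupled cpu still has a poke pending, don't enter the
	 * deep state with a wakeup outstanding: mark this cpu done, wait
	 * for all cpus to agree on the abort, then reset and retry.
	 */
	if (cpuidle_coupled_any_pokes_pending(coupled)) {
		cpuidle_coupled_set_done(dev->cpu, coupled);
		cpuidle_coupled_parallel_barrier(dev, &coupled->abort_barrier);
		goto retry;	/* assumed label: back to the waiting loop */
	}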
