Re: BUG: sleeping function called from ras_epow_interrupt context

2015-07-15 Thread Thomas Huth
On 07/15/2015 09:58 PM, Nathan Fontenot wrote:
> On 07/15/2015 09:35 AM, Thomas Huth wrote:
>> On 07/14/2015 11:22 PM, Benjamin Herrenschmidt wrote:
>>> On Tue, 2015-07-14 at 20:43 +0200, Thomas Huth wrote:
 Any suggestions how to fix this? Simply revert 587f83e8dd50d? Use
 mdelay() instead of msleep() in rtas_busy_delay()? Something more
 fancy?
>>>
>>> A proper fix would be more fancy, the get_sensor should happen in a
>>> kernel thread instead.
>>
>> I'm not very familiar with this stuff, but isn't the EPOW interrupt
>> something that is very time-critical? Moving parts of the handler into a
>> kernel thread then does not sound like a very good idea to me...
>>
>> Another question: Can it happen at all that this get-sensor call results
>> in a sleep condition? Looking at commit ID
>> 81b73dd92b97423b8f5324a59044da478c04f4c4 ("Fix might-sleep warning on
>> removing cpus"), which apparently fixed a similar issue for CPU
>> hot-plugging, indicates that at least some of the rtas calls are never
>> returning the busy code? In that case we could fix this by introducing a
>> similar rtas_get_sensor_fast() function? (or simply revert 587f83e8dd50d
>> which would be quite similar, I think)
>>
> 
> Looking at the PAPR, the get-sensor-state rtas call for the EPOW sensor
> is listed as a fast call and should not return a busy indication.

Great, good to know, thanks for looking that up! So IMHO we should
either introduce a rtas_get_sensor_fast() function or revert
587f83e8dd50d ... any preferences? Shall I come up with a patch?

> I'm curious as to why we're getting a busy return indication when
> making this call.

Looking at the code again, rtas_busy_delay() likely never slept ... it's
likely just the "might_sleep()" annotation in that function that causes
the BUG.

 Thomas




signature.asc
Description: OpenPGP digital signature


Re: BUG: sleeping function called from ras_epow_interrupt context

2015-07-15 Thread Nathan Fontenot
On 07/15/2015 09:35 AM, Thomas Huth wrote:
> On 07/14/2015 11:22 PM, Benjamin Herrenschmidt wrote:
>> On Tue, 2015-07-14 at 20:43 +0200, Thomas Huth wrote:
>>> Any suggestions how to fix this? Simply revert 587f83e8dd50d? Use
>>> mdelay() instead of msleep() in rtas_busy_delay()? Something more
>>> fancy?
>>
>> A proper fix would be more fancy, the get_sensor should happen in a
>> kernel thread instead.
> 
> I'm not very familiar with this stuff, but isn't the EPOW interrupt
> something that is very time-critical? Moving parts of the handler into a
> kernel thread then does not sound like a very good idea to me...
> 
> Another question: Can it happen at all that this get-sensor call results
> in a sleep condition? Looking at commit ID
> 81b73dd92b97423b8f5324a59044da478c04f4c4 ("Fix might-sleep warning on
> removing cpus"), which apparently fixed a similar issue for CPU
> hot-plugging, indicates that at least some of the rtas calls are never
> returning the busy code? In that case we could fix this by introducing a
> similar rtas_get_sensor_fast() function? (or simply revert 587f83e8dd50d
> which would be quite similar, I think)
> 

Looking at the PAPR, the get-sensor-state rtas call for the EPOW sensor
is listed as a fast call and should not return a busy indication.

I'm curious as to why we're getting a busy return indication when
making this call.

-Nathan

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BUG: sleeping function called from ras_epow_interrupt context

2015-07-15 Thread Thomas Huth
On 07/14/2015 11:22 PM, Benjamin Herrenschmidt wrote:
> On Tue, 2015-07-14 at 20:43 +0200, Thomas Huth wrote:
>> Any suggestions how to fix this? Simply revert 587f83e8dd50d? Use
>> mdelay() instead of msleep() in rtas_busy_delay()? Something more
>> fancy?
> 
> A proper fix would be more fancy, the get_sensor should happen in a
> kernel thread instead.

I'm not very familiar with this stuff, but isn't the EPOW interrupt
something that is very time-critical? Moving parts of the handler into a
kernel thread then does not sound like a very good idea to me...

Another question: Can it happen at all that this get-sensor call results
in a sleep condition? Looking at commit ID
81b73dd92b97423b8f5324a59044da478c04f4c4 ("Fix might-sleep warning on
removing cpus"), which apparently fixed a similar issue for CPU
hot-plugging, indicates that at least some of the rtas calls are never
returning the busy code? In that case we could fix this by introducing a
similar rtas_get_sensor_fast() function? (or simply revert 587f83e8dd50d
which would be quite similar, I think)

 Thomas




signature.asc
Description: OpenPGP digital signature


Re: [PATCH v2 3/3] sched/preempt: fix cond_resched_lock() and cond_resched_softirq()

2015-07-15 Thread Peter Zijlstra
On Wed, Jul 15, 2015 at 03:52:34PM +0300, Konstantin Khlebnikov wrote:
> On 15.07.2015 15:16, Eric Dumazet wrote:
> >On Wed, 2015-07-15 at 12:52 +0300, Konstantin Khlebnikov wrote:
> >>These functions check should_resched() before unlocking spinlock/bh-enable:
> >>preempt_count always non-zero => should_resched() always returns false.
> >>cond_resched_lock() worked iff spin_needbreak is set.
> >
> >Interesting, this definitely used to work (linux-3.11)
> >
> >Any idea of which commit broke things ?
> >
> 
> Searching... done
> 
> This one: bdb43806589096ac4272fe1307e789846ac08d7c in v3.13
> 
> before
> 
> -static inline int should_resched(void)
> -{
> -   return need_resched() && !(preempt_count() & PREEMPT_ACTIVE);
> -}
> 
> after
> 
> +static __always_inline bool should_resched(void)
> +{
> +   return unlikely(!*preempt_count_ptr());
> +}

Argh, indeed!

> 
> So,
> 
> Fixes: bdb438065890 ("sched: Extract the basic add/sub preempt_count
> modifiers")

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/3] sched/preempt: fix cond_resched_lock() and cond_resched_softirq()

2015-07-15 Thread Konstantin Khlebnikov

On 15.07.2015 15:16, Eric Dumazet wrote:

On Wed, 2015-07-15 at 12:52 +0300, Konstantin Khlebnikov wrote:

These functions check should_resched() before unlocking spinlock/bh-enable:
preempt_count always non-zero => should_resched() always returns false.
cond_resched_lock() worked iff spin_needbreak is set.


Interesting, this definitely used to work (linux-3.11)

Any idea of which commit broke things ?



Searching... done

This one: bdb43806589096ac4272fe1307e789846ac08d7c in v3.13

before

-static inline int should_resched(void)
-{
-   return need_resched() && !(preempt_count() & PREEMPT_ACTIVE);
-}

after

+static __always_inline bool should_resched(void)
+{
+   return unlikely(!*preempt_count_ptr());
+}


So,

Fixes: bdb438065890 ("sched: Extract the basic add/sub preempt_count 
modifiers")


--
Konstantin
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/3] sched/preempt: fix cond_resched_lock() and cond_resched_softirq()

2015-07-15 Thread Eric Dumazet
On Wed, 2015-07-15 at 12:52 +0300, Konstantin Khlebnikov wrote:
> These functions check should_resched() before unlocking spinlock/bh-enable:
> preempt_count always non-zero => should_resched() always returns false.
> cond_resched_lock() worked iff spin_needbreak is set.

Interesting, this definitely used to work (linux-3.11)

Any idea of which commit broke things ?



--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/3] drivers/xen/preempt: use need_resched() instead of should_resched()

2015-07-15 Thread Konstantin Khlebnikov
This code is used only when CONFIG_PREEMPT=n and only in non-atomic context:
xen_in_preemptible_hcall is set only in privcmd_ioctl_hypercall().
Thus preempt_count is zero and should_resched() is equal to need_resched().

Signed-off-by: Konstantin Khlebnikov 
---
 drivers/xen/preempt.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/preempt.c b/drivers/xen/preempt.c
index a1800c150839..08cb419eb4e6 100644
--- a/drivers/xen/preempt.c
+++ b/drivers/xen/preempt.c
@@ -31,7 +31,7 @@ EXPORT_SYMBOL_GPL(xen_in_preemptible_hcall);
 asmlinkage __visible void xen_maybe_preempt_hcall(void)
 {
if (unlikely(__this_cpu_read(xen_in_preemptible_hcall)
-&& should_resched())) {
+&& need_resched())) {
/*
 * Clear flag as we may be rescheduled on a different
 * cpu.

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/3] sched/preempt: fix cond_resched_lock() and cond_resched_softirq()

2015-07-15 Thread Konstantin Khlebnikov
These functions check should_resched() before unlocking spinlock/bh-enable:
preempt_count always non-zero => should_resched() always returns false.
cond_resched_lock() worked iff spin_needbreak is set.

This patch adds argument "preempt_offset" to should_resched().

preempt_count offset constants for that:

PREEMPT_DISABLE_OFFSET  - offset after preempt_disable()
PREEMPT_LOCK_OFFSET - offset after spin_lock()
SOFTIRQ_DISABLE_OFFSET  - offset after local_bh_distable()
SOFTIRQ_LOCK_OFFSET - offset after spin_lock_bh()

Signed-off-by: Konstantin Khlebnikov 
---
 arch/x86/include/asm/preempt.h |4 ++--
 include/asm-generic/preempt.h  |5 +++--
 include/linux/preempt.h|   19 ++-
 include/linux/sched.h  |6 --
 kernel/sched/core.c|6 +++---
 5 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
index dca71714f860..b12f81022a6b 100644
--- a/arch/x86/include/asm/preempt.h
+++ b/arch/x86/include/asm/preempt.h
@@ -90,9 +90,9 @@ static __always_inline bool __preempt_count_dec_and_test(void)
 /*
  * Returns true when we need to resched and can (barring IRQ state).
  */
-static __always_inline bool should_resched(void)
+static __always_inline bool should_resched(int preempt_offset)
 {
-   return unlikely(!raw_cpu_read_4(__preempt_count));
+   return unlikely(raw_cpu_read_4(__preempt_count) == preempt_offset);
 }
 
 #ifdef CONFIG_PREEMPT
diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
index d0a7a4753db2..0bec580a4885 100644
--- a/include/asm-generic/preempt.h
+++ b/include/asm-generic/preempt.h
@@ -71,9 +71,10 @@ static __always_inline bool 
__preempt_count_dec_and_test(void)
 /*
  * Returns true when we need to resched and can (barring IRQ state).
  */
-static __always_inline bool should_resched(void)
+static __always_inline bool should_resched(int preempt_offset)
 {
-   return unlikely(!preempt_count() && tif_need_resched());
+   return unlikely(preempt_count() == preempt_offset &&
+   tif_need_resched());
 }
 
 #ifdef CONFIG_PREEMPT
diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index 84991f185173..bea8dd8ff5e0 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -84,13 +84,21 @@
  */
 #define in_nmi()   (preempt_count() & NMI_MASK)
 
+/*
+ * The preempt_count offset after preempt_disable();
+ */
 #if defined(CONFIG_PREEMPT_COUNT)
-# define PREEMPT_DISABLE_OFFSET 1
+# define PREEMPT_DISABLE_OFFSETPREEMPT_OFFSET
 #else
-# define PREEMPT_DISABLE_OFFSET 0
+# define PREEMPT_DISABLE_OFFSET0
 #endif
 
 /*
+ * The preempt_count offset after spin_lock()
+ */
+#define PREEMPT_LOCK_OFFSETPREEMPT_DISABLE_OFFSET
+
+/*
  * The preempt_count offset needed for things like:
  *
  *  spin_lock_bh()
@@ -103,7 +111,7 @@
  *
  * Work as expected.
  */
-#define SOFTIRQ_LOCK_OFFSET (SOFTIRQ_DISABLE_OFFSET + PREEMPT_DISABLE_OFFSET)
+#define SOFTIRQ_LOCK_OFFSET (SOFTIRQ_DISABLE_OFFSET + PREEMPT_LOCK_OFFSET)
 
 /*
  * Are we running in atomic context?  WARNING: this macro cannot
@@ -124,7 +132,8 @@
 #if defined(CONFIG_DEBUG_PREEMPT) || defined(CONFIG_PREEMPT_TRACER)
 extern void preempt_count_add(int val);
 extern void preempt_count_sub(int val);
-#define preempt_count_dec_and_test() ({ preempt_count_sub(1); 
should_resched(); })
+#define preempt_count_dec_and_test() \
+   ({ preempt_count_sub(1); should_resched(0); })
 #else
 #define preempt_count_add(val) __preempt_count_add(val)
 #define preempt_count_sub(val) __preempt_count_sub(val)
@@ -184,7 +193,7 @@ do { \
 
 #define preempt_check_resched() \
 do { \
-   if (should_resched()) \
+   if (should_resched(0)) \
__preempt_schedule(); \
 } while (0)
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ae21f1591615..a8e9b17acdee 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2885,12 +2885,6 @@ extern int _cond_resched(void);
 
 extern int __cond_resched_lock(spinlock_t *lock);
 
-#ifdef CONFIG_PREEMPT_COUNT
-#define PREEMPT_LOCK_OFFSETPREEMPT_OFFSET
-#else
-#define PREEMPT_LOCK_OFFSET0
-#endif
-
 #define cond_resched_lock(lock) ({ \
___might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);\
__cond_resched_lock(lock);  \
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78b4bad10081..d9a4d93dc879 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4492,7 +4492,7 @@ SYSCALL_DEFINE0(sched_yield)
 
 int __sched _cond_resched(void)
 {
-   if (should_resched()) {
+   if (should_resched(0)) {
preempt_schedule_common();
return 1;
}
@@ -4510,7 +4510,7 @@ EXPORT_SYMBOL(_cond_resched);
  */
 int __cond_resched_lock(spinlock_t *lock)
 {
-   int resched = should_resched();
+   int resched = should_resched(PREEMPT_LOCK_OFFSET);

[PATCH v2 2/3] KVM: PPC: Book3S HV: Use need_resched() instead of should_resched()

2015-07-15 Thread Konstantin Khlebnikov
Function should_resched() is equal to (!preempt_count() && need_resched()).
In preemptive kernel preempt_count here is non-zero because of vc->lock.

Signed-off-by: Konstantin Khlebnikov 
---
 arch/powerpc/kvm/book3s_hv.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 68d067ad4222..a9f753fb73a8 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2178,7 +2178,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, 
struct kvm_vcpu *vcpu)
vc->runner = vcpu;
if (n_ceded == vc->n_runnable) {
kvmppc_vcore_blocked(vc);
-   } else if (should_resched()) {
+   } else if (need_resched()) {
vc->vcore_state = VCORE_PREEMPT;
/* Let something else run */
cond_resched_lock(&vc->lock);

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html