Re: BUG: sleeping function called from ras_epow_interrupt context
On 07/15/2015 09:58 PM, Nathan Fontenot wrote: > On 07/15/2015 09:35 AM, Thomas Huth wrote: >> On 07/14/2015 11:22 PM, Benjamin Herrenschmidt wrote: >>> On Tue, 2015-07-14 at 20:43 +0200, Thomas Huth wrote: Any suggestions how to fix this? Simply revert 587f83e8dd50d? Use mdelay() instead of msleep() in rtas_busy_delay()? Something more fancy? >>> >>> A proper fix would be more fancy, the get_sensor should happen in a >>> kernel thread instead. >> >> I'm not very familiar with this stuff, but isn't the EPOW interrupt >> something that is very time-critical? Moving parts of the handler into a >> kernel thread then does not sound like a very good idea to me... >> >> Another question: Can it happen at all that this get-sensor call results >> in a sleep condition? Looking at commit ID >> 81b73dd92b97423b8f5324a59044da478c04f4c4 ("Fix might-sleep warning on >> removing cpus"), which apparently fixed a similar issue for CPU >> hot-plugging, indicates that at least some of the rtas calls are never >> returning the busy code? In that case we could fix this by introducing a >> similar rtas_get_sensor_fast() function? (or simply revert 587f83e8dd50d >> which would be quite similar, I think) >> > > Looking at the PAPR, the get-sensor-state rtas call for the EPOW sensor > is listed as a fast call and should not return a busy indication. Great, good to know, thanks for looking that up! So IMHO we should either introduce a rtas_get_sensor_fast() function or revert 587f83e8dd50d ... any preferences? Shall I come up with a patch? > I'm curious as to why we're getting a busy return indication when > making this call. Looking at the code again, rtas_busy_delay() likely never slept ... it's likely just the "might_sleep()" annotation in that function that causes the BUG. Thomas signature.asc Description: OpenPGP digital signature
Re: BUG: sleeping function called from ras_epow_interrupt context
On 07/15/2015 09:35 AM, Thomas Huth wrote: > On 07/14/2015 11:22 PM, Benjamin Herrenschmidt wrote: >> On Tue, 2015-07-14 at 20:43 +0200, Thomas Huth wrote: >>> Any suggestions how to fix this? Simply revert 587f83e8dd50d? Use >>> mdelay() instead of msleep() in rtas_busy_delay()? Something more >>> fancy? >> >> A proper fix would be more fancy, the get_sensor should happen in a >> kernel thread instead. > > I'm not very familiar with this stuff, but isn't the EPOW interrupt > something that is very time-critical? Moving parts of the handler into a > kernel thread then does not sound like a very good idea to me... > > Another question: Can it happen at all that this get-sensor call results > in a sleep condition? Looking at commit ID > 81b73dd92b97423b8f5324a59044da478c04f4c4 ("Fix might-sleep warning on > removing cpus"), which apparently fixed a similar issue for CPU > hot-plugging, indicates that at least some of the rtas calls are never > returning the busy code? In that case we could fix this by introducing a > similar rtas_get_sensor_fast() function? (or simply revert 587f83e8dd50d > which would be quite similar, I think) > Looking at the PAPR, the get-sensor-state rtas call for the EPOW sensor is listed as a fast call and should not return a busy indication. I'm curious as to why we're getting a busy return indication when making this call. -Nathan -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BUG: sleeping function called from ras_epow_interrupt context
On 07/14/2015 11:22 PM, Benjamin Herrenschmidt wrote: > On Tue, 2015-07-14 at 20:43 +0200, Thomas Huth wrote: >> Any suggestions how to fix this? Simply revert 587f83e8dd50d? Use >> mdelay() instead of msleep() in rtas_busy_delay()? Something more >> fancy? > > A proper fix would be more fancy, the get_sensor should happen in a > kernel thread instead. I'm not very familiar with this stuff, but isn't the EPOW interrupt something that is very time-critical? Moving parts of the handler into a kernel thread then does not sound like a very good idea to me... Another question: Can it happen at all that this get-sensor call results in a sleep condition? Looking at commit ID 81b73dd92b97423b8f5324a59044da478c04f4c4 ("Fix might-sleep warning on removing cpus"), which apparently fixed a similar issue for CPU hot-plugging, indicates that at least some of the rtas calls are never returning the busy code? In that case we could fix this by introducing a similar rtas_get_sensor_fast() function? (or simply revert 587f83e8dd50d which would be quite similar, I think) Thomas signature.asc Description: OpenPGP digital signature
Re: [PATCH v2 3/3] sched/preempt: fix cond_resched_lock() and cond_resched_softirq()
On Wed, Jul 15, 2015 at 03:52:34PM +0300, Konstantin Khlebnikov wrote: > On 15.07.2015 15:16, Eric Dumazet wrote: > >On Wed, 2015-07-15 at 12:52 +0300, Konstantin Khlebnikov wrote: > >>These functions check should_resched() before unlocking spinlock/bh-enable: > >>preempt_count always non-zero => should_resched() always returns false. > >>cond_resched_lock() worked iff spin_needbreak is set. > > > >Interesting, this definitely used to work (linux-3.11) > > > >Any idea of which commit broke things ? > > > > Searching... done > > This one: bdb43806589096ac4272fe1307e789846ac08d7c in v3.13 > > before > > -static inline int should_resched(void) > -{ > - return need_resched() && !(preempt_count() & PREEMPT_ACTIVE); > -} > > after > > +static __always_inline bool should_resched(void) > +{ > + return unlikely(!*preempt_count_ptr()); > +} Argh, indeed! > > So, > > Fixes: bdb438065890 ("sched: Extract the basic add/sub preempt_count > modifiers") Thanks! -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 3/3] sched/preempt: fix cond_resched_lock() and cond_resched_softirq()
On 15.07.2015 15:16, Eric Dumazet wrote: On Wed, 2015-07-15 at 12:52 +0300, Konstantin Khlebnikov wrote: These functions check should_resched() before unlocking spinlock/bh-enable: preempt_count always non-zero => should_resched() always returns false. cond_resched_lock() worked iff spin_needbreak is set. Interesting, this definitely used to work (linux-3.11) Any idea of which commit broke things ? Searching... done This one: bdb43806589096ac4272fe1307e789846ac08d7c in v3.13 before -static inline int should_resched(void) -{ - return need_resched() && !(preempt_count() & PREEMPT_ACTIVE); -} after +static __always_inline bool should_resched(void) +{ + return unlikely(!*preempt_count_ptr()); +} So, Fixes: bdb438065890 ("sched: Extract the basic add/sub preempt_count modifiers") -- Konstantin -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 3/3] sched/preempt: fix cond_resched_lock() and cond_resched_softirq()
On Wed, 2015-07-15 at 12:52 +0300, Konstantin Khlebnikov wrote: > These functions check should_resched() before unlocking spinlock/bh-enable: > preempt_count always non-zero => should_resched() always returns false. > cond_resched_lock() worked iff spin_needbreak is set. Interesting, this definitely used to work (linux-3.11) Any idea of which commit broke things ? -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/3] drivers/xen/preempt: use need_resched() instead of should_resched()
This code is used only when CONFIG_PREEMPT=n and only in non-atomic context: xen_in_preemptible_hcall is set only in privcmd_ioctl_hypercall(). Thus preempt_count is zero and should_resched() is equal to need_resched(). Signed-off-by: Konstantin Khlebnikov --- drivers/xen/preempt.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/xen/preempt.c b/drivers/xen/preempt.c index a1800c150839..08cb419eb4e6 100644 --- a/drivers/xen/preempt.c +++ b/drivers/xen/preempt.c @@ -31,7 +31,7 @@ EXPORT_SYMBOL_GPL(xen_in_preemptible_hcall); asmlinkage __visible void xen_maybe_preempt_hcall(void) { if (unlikely(__this_cpu_read(xen_in_preemptible_hcall) -&& should_resched())) { +&& need_resched())) { /* * Clear flag as we may be rescheduled on a different * cpu. -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/3] sched/preempt: fix cond_resched_lock() and cond_resched_softirq()
These functions check should_resched() before unlocking spinlock/bh-enable: preempt_count always non-zero => should_resched() always returns false. cond_resched_lock() worked iff spin_needbreak is set. This patch adds argument "preempt_offset" to should_resched(). preempt_count offset constants for that: PREEMPT_DISABLE_OFFSET - offset after preempt_disable() PREEMPT_LOCK_OFFSET - offset after spin_lock() SOFTIRQ_DISABLE_OFFSET - offset after local_bh_distable() SOFTIRQ_LOCK_OFFSET - offset after spin_lock_bh() Signed-off-by: Konstantin Khlebnikov --- arch/x86/include/asm/preempt.h |4 ++-- include/asm-generic/preempt.h |5 +++-- include/linux/preempt.h| 19 ++- include/linux/sched.h |6 -- kernel/sched/core.c|6 +++--- 5 files changed, 22 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h index dca71714f860..b12f81022a6b 100644 --- a/arch/x86/include/asm/preempt.h +++ b/arch/x86/include/asm/preempt.h @@ -90,9 +90,9 @@ static __always_inline bool __preempt_count_dec_and_test(void) /* * Returns true when we need to resched and can (barring IRQ state). */ -static __always_inline bool should_resched(void) +static __always_inline bool should_resched(int preempt_offset) { - return unlikely(!raw_cpu_read_4(__preempt_count)); + return unlikely(raw_cpu_read_4(__preempt_count) == preempt_offset); } #ifdef CONFIG_PREEMPT diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h index d0a7a4753db2..0bec580a4885 100644 --- a/include/asm-generic/preempt.h +++ b/include/asm-generic/preempt.h @@ -71,9 +71,10 @@ static __always_inline bool __preempt_count_dec_and_test(void) /* * Returns true when we need to resched and can (barring IRQ state). */ -static __always_inline bool should_resched(void) +static __always_inline bool should_resched(int preempt_offset) { - return unlikely(!preempt_count() && tif_need_resched()); + return unlikely(preempt_count() == preempt_offset && + tif_need_resched()); } #ifdef CONFIG_PREEMPT diff --git a/include/linux/preempt.h b/include/linux/preempt.h index 84991f185173..bea8dd8ff5e0 100644 --- a/include/linux/preempt.h +++ b/include/linux/preempt.h @@ -84,13 +84,21 @@ */ #define in_nmi() (preempt_count() & NMI_MASK) +/* + * The preempt_count offset after preempt_disable(); + */ #if defined(CONFIG_PREEMPT_COUNT) -# define PREEMPT_DISABLE_OFFSET 1 +# define PREEMPT_DISABLE_OFFSETPREEMPT_OFFSET #else -# define PREEMPT_DISABLE_OFFSET 0 +# define PREEMPT_DISABLE_OFFSET0 #endif /* + * The preempt_count offset after spin_lock() + */ +#define PREEMPT_LOCK_OFFSETPREEMPT_DISABLE_OFFSET + +/* * The preempt_count offset needed for things like: * * spin_lock_bh() @@ -103,7 +111,7 @@ * * Work as expected. */ -#define SOFTIRQ_LOCK_OFFSET (SOFTIRQ_DISABLE_OFFSET + PREEMPT_DISABLE_OFFSET) +#define SOFTIRQ_LOCK_OFFSET (SOFTIRQ_DISABLE_OFFSET + PREEMPT_LOCK_OFFSET) /* * Are we running in atomic context? WARNING: this macro cannot @@ -124,7 +132,8 @@ #if defined(CONFIG_DEBUG_PREEMPT) || defined(CONFIG_PREEMPT_TRACER) extern void preempt_count_add(int val); extern void preempt_count_sub(int val); -#define preempt_count_dec_and_test() ({ preempt_count_sub(1); should_resched(); }) +#define preempt_count_dec_and_test() \ + ({ preempt_count_sub(1); should_resched(0); }) #else #define preempt_count_add(val) __preempt_count_add(val) #define preempt_count_sub(val) __preempt_count_sub(val) @@ -184,7 +193,7 @@ do { \ #define preempt_check_resched() \ do { \ - if (should_resched()) \ + if (should_resched(0)) \ __preempt_schedule(); \ } while (0) diff --git a/include/linux/sched.h b/include/linux/sched.h index ae21f1591615..a8e9b17acdee 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -2885,12 +2885,6 @@ extern int _cond_resched(void); extern int __cond_resched_lock(spinlock_t *lock); -#ifdef CONFIG_PREEMPT_COUNT -#define PREEMPT_LOCK_OFFSETPREEMPT_OFFSET -#else -#define PREEMPT_LOCK_OFFSET0 -#endif - #define cond_resched_lock(lock) ({ \ ___might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);\ __cond_resched_lock(lock); \ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 78b4bad10081..d9a4d93dc879 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4492,7 +4492,7 @@ SYSCALL_DEFINE0(sched_yield) int __sched _cond_resched(void) { - if (should_resched()) { + if (should_resched(0)) { preempt_schedule_common(); return 1; } @@ -4510,7 +4510,7 @@ EXPORT_SYMBOL(_cond_resched); */ int __cond_resched_lock(spinlock_t *lock) { - int resched = should_resched(); + int resched = should_resched(PREEMPT_LOCK_OFFSET);
[PATCH v2 2/3] KVM: PPC: Book3S HV: Use need_resched() instead of should_resched()
Function should_resched() is equal to (!preempt_count() && need_resched()). In preemptive kernel preempt_count here is non-zero because of vc->lock. Signed-off-by: Konstantin Khlebnikov --- arch/powerpc/kvm/book3s_hv.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 68d067ad4222..a9f753fb73a8 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -2178,7 +2178,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) vc->runner = vcpu; if (n_ceded == vc->n_runnable) { kvmppc_vcore_blocked(vc); - } else if (should_resched()) { + } else if (need_resched()) { vc->vcore_state = VCORE_PREEMPT; /* Let something else run */ cond_resched_lock(&vc->lock); -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html