Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra
On Wed, Aug 06, 2014 at 02:16:03PM -0700, David Miller wrote: > From: Peter Zijlstra > Date: Wed, 6 Aug 2014 10:31:34 +0200 > > > On Wed, Aug 06, 2014 at 11:51:29AM +0400, Ilya Dryomov wrote: > > > >> OK, this one is a bit different. > >> > >> WARNING: CPU: 1 PID: 1744 at kernel/sched/core.c:7104 > >> __might_sleep+0x58/0x90() > >> do not call blocking ops when !TASK_RUNNING; state=1 set at > >> [] prepare_to_wait+0x50 /0xa0 > > > >> [] __might_sleep+0x58/0x90 > >> [] lock_sock_nested+0x31/0xb0 > >> [] sk_stream_wait_memory+0x18a/0x2d0 > > > > Urgh, tedious. Its not an actual bug as is. Due to the condition check > > in sk_wait_event() we can call lock_sock() with ->state != TASK_RUNNING. > > > > I'm not entirely sure what the cleanest way is to make this go away. > > Possibly something like so: > > If you submit this formally to netdev with a signoff I'm willing to apply > this if it helps the debug infrastructure. Thanks, for now I'm just collecting things to see how far I can take this. But I'll certainly include you and netdev on a next posting. pgpkr5GUvDpWQ.pgp Description: PGP signature
Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra
From: Peter Zijlstra Date: Wed, 6 Aug 2014 10:31:34 +0200 > On Wed, Aug 06, 2014 at 11:51:29AM +0400, Ilya Dryomov wrote: > >> OK, this one is a bit different. >> >> WARNING: CPU: 1 PID: 1744 at kernel/sched/core.c:7104 >> __might_sleep+0x58/0x90() >> do not call blocking ops when !TASK_RUNNING; state=1 set at >> [] prepare_to_wait+0x50 /0xa0 > >> [] __might_sleep+0x58/0x90 >> [] lock_sock_nested+0x31/0xb0 >> [] sk_stream_wait_memory+0x18a/0x2d0 > > Urgh, tedious. Its not an actual bug as is. Due to the condition check > in sk_wait_event() we can call lock_sock() with ->state != TASK_RUNNING. > > I'm not entirely sure what the cleanest way is to make this go away. > Possibly something like so: If you submit this formally to netdev with a signoff I'm willing to apply this if it helps the debug infrastructure. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra
On Wed, Aug 06, 2014 at 11:51:29AM +0400, Ilya Dryomov wrote: > OK, this one is a bit different. > > WARNING: CPU: 1 PID: 1744 at kernel/sched/core.c:7104 > __might_sleep+0x58/0x90() > do not call blocking ops when !TASK_RUNNING; state=1 set at > [] prepare_to_wait+0x50 /0xa0 > [] __might_sleep+0x58/0x90 > [] lock_sock_nested+0x31/0xb0 > [] sk_stream_wait_memory+0x18a/0x2d0 Urgh, tedious. Its not an actual bug as is. Due to the condition check in sk_wait_event() we can call lock_sock() with ->state != TASK_RUNNING. I'm not entirely sure what the cleanest way is to make this go away. Possibly something like so: --- include/net/sock.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/net/sock.h b/include/net/sock.h index 156350745700..37902176c5ab 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -886,6 +886,7 @@ static inline void sock_rps_reset_rxhash(struct sock *sk) if (!__rc) {\ *(__timeo) = schedule_timeout(*(__timeo)); \ } \ + __set_current_state(TASK_RUNNING); \ lock_sock(__sk);\ __rc = __condition; \ __rc; \ pgpbNe_DUVw5_.pgp Description: PGP signature
Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra
On Tue, Aug 5, 2014 at 5:06 PM, Peter Zijlstra wrote: > On Tue, Aug 05, 2014 at 12:33:16PM +0400, Ilya Dryomov wrote: >> On Mon, Aug 4, 2014 at 2:30 PM, Peter Zijlstra wrote: >> > Hi, >> > >> > Ilya recently tripped over a nested sleep which made Ingo suggest we should >> > have debug checks for that. So I did some, see patch 7. Of course that >> > triggered a whole bunch of fail the instant I tried to boot my machine. >> > >> > With this series I can boot my test box and build a kernel on it, I'm >> > fairly >> > sure that's far too limited a test to have found all, but its a start. >> >> FWIW, I'm getting a lot of these during light rbd testing. CC'ed >> netdev and linux-mm. > > Both are cond_resched() calls, and that's not blocking as such, just a > preemption point, so lets exclude those. OK, this one is a bit different. WARNING: CPU: 1 PID: 1744 at kernel/sched/core.c:7104 __might_sleep+0x58/0x90() do not call blocking ops when !TASK_RUNNING; state=1 set at [] prepare_to_wait+0x50 /0xa0 Modules linked in: CPU: 1 PID: 1744 Comm: lt-ceph_test_li Not tainted 3.16.0-vm+ #113 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 1bc0 88006c4479d8 8156f455 88006c447a28 88006c447a18 81033357 0001 0950 817ee48a 88006dba6120 Call Trace: [] dump_stack+0x4f/0x7c [] warn_slowpath_common+0x87/0xb0 [] warn_slowpath_fmt+0x41/0x50 [] ? trace_hardirqs_on_caller+0x182/0x1f0 [] ? prepare_to_wait+0x50/0xa0 [] ? prepare_to_wait+0x50/0xa0 [] __might_sleep+0x58/0x90 [] lock_sock_nested+0x31/0xb0 [] ? release_sock+0x1bb/0x200 [] sk_stream_wait_memory+0x18a/0x2d0 [] ? woken_wake_function+0x10/0x10 [] tcp_sendmsg+0xb6f/0xd70 [] inet_sendmsg+0xdf/0x100 [] ? inet_recvmsg+0x100/0x100 [] sock_sendmsg+0x67/0x90 [] ? might_fault+0x51/0xb0 [] ___sys_sendmsg+0x2d2/0x2e0 [] ? do_dup2+0xd0/0xd0 [] ? do_dup2+0xd0/0xd0 [] ? finish_task_switch+0x50/0x100 [] ? __fget_light+0x45/0x60 [] ? __fdget+0xe/0x10 [] __sys_sendmsg+0x44/0x70 [] SyS_sendmsg+0x9/0x10 [] system_call_fastpath+0x16/0x1b Thanks, Ilya -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra
On Tue, Aug 05, 2014 at 12:33:16PM +0400, Ilya Dryomov wrote: > On Mon, Aug 4, 2014 at 2:30 PM, Peter Zijlstra wrote: > > Hi, > > > > Ilya recently tripped over a nested sleep which made Ingo suggest we should > > have debug checks for that. So I did some, see patch 7. Of course that > > triggered a whole bunch of fail the instant I tried to boot my machine. > > > > With this series I can boot my test box and build a kernel on it, I'm fairly > > sure that's far too limited a test to have found all, but its a start. > > FWIW, I'm getting a lot of these during light rbd testing. CC'ed > netdev and linux-mm. Both are cond_resched() calls, and that's not blocking as such, just a preemption point, so lets exclude those. From the school of '_' are free: --- include/linux/kernel.h |3 +++ include/linux/sched.h |6 +++--- kernel/sched/core.c| 12 +--- 3 files changed, 15 insertions(+), 6 deletions(-) --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -162,6 +162,7 @@ extern int _cond_resched(void); #endif #ifdef CONFIG_DEBUG_ATOMIC_SLEEP + void ___might_sleep(const char *file, int line, int preempt_offset); void __might_sleep(const char *file, int line, int preempt_offset); /** * might_sleep - annotation for functions that can sleep @@ -176,6 +177,8 @@ extern int _cond_resched(void); # define might_sleep() \ do { __might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0) #else + static inline void ___might_sleep(const char *file, int line, + int preempt_offset) { } static inline void __might_sleep(const char *file, int line, int preempt_offset) { } # define might_sleep() do { might_resched(); } while (0) --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -2754,7 +2754,7 @@ static inline int signal_pending_state(l extern int _cond_resched(void); #define cond_resched() ({ \ - __might_sleep(__FILE__, __LINE__, 0); \ + ___might_sleep(__FILE__, __LINE__, 0); \ _cond_resched();\ }) @@ -2767,14 +2767,14 @@ extern int __cond_resched_lock(spinlock_ #endif #define cond_resched_lock(lock) ({ \ - __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \ + ___might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);\ __cond_resched_lock(lock); \ }) extern int __cond_resched_softirq(void); #define cond_resched_softirq() ({ \ - __might_sleep(__FILE__, __LINE__, SOFTIRQ_DISABLE_OFFSET); \ + ___might_sleep(__FILE__, __LINE__, SOFTIRQ_DISABLE_OFFSET); \ __cond_resched_softirq(); \ }) --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -7078,8 +7078,6 @@ static inline int preempt_count_equals(i void __might_sleep(const char *file, int line, int preempt_offset) { - static unsigned long prev_jiffy;/* ratelimiting */ - /* * Blocking primitives will set (and therefore destroy) current->state, * since we will exit with TASK_RUNNING make sure we enter with it, @@ -7093,6 +7091,14 @@ void __might_sleep(const char *file, int (void *)current->task_state_change)) __set_current_state(TASK_RUNNING); + ___might_sleep(file, line, preempt_offset); +} +EXPORT_SYMBOL(__might_sleep); + +void ___might_sleep(const char *file, int line, int preempt_offset) +{ + static unsigned long prev_jiffy;/* ratelimiting */ + rcu_sleep_check(); /* WARN_ON_ONCE() by default, no rate limit reqd. */ if ((preempt_count_equals(preempt_offset) && !irqs_disabled() && !is_idle_task(current)) || @@ -7122,7 +7128,7 @@ void __might_sleep(const char *file, int #endif dump_stack(); } -EXPORT_SYMBOL(__might_sleep); +EXPORT_SYMBOL(___might_sleep); #endif #ifdef CONFIG_MAGIC_SYSRQ pgpDOrSu0ZLAM.pgp Description: PGP signature
Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra
On Mon, Aug 4, 2014 at 2:30 PM, Peter Zijlstra wrote: > Hi, > > Ilya recently tripped over a nested sleep which made Ingo suggest we should > have debug checks for that. So I did some, see patch 7. Of course that > triggered a whole bunch of fail the instant I tried to boot my machine. > > With this series I can boot my test box and build a kernel on it, I'm fairly > sure that's far too limited a test to have found all, but its a start. FWIW, I'm getting a lot of these during light rbd testing. CC'ed netdev and linux-mm. WARNING: CPU: 2 PID: 1978 at kernel/sched/core.c:7094 __might_sleep+0x5b/0x1e0() do not call blocking ops when !TASK_RUNNING; state=1 set at [] prepare_to_wait+0x50/0xa0 Modules linked in: CPU: 2 PID: 1978 Comm: ceph-osd Not tainted 3.16.0-vm+ #109 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 1bb6 8800126739e8 8156ec1d 880012673a38 880012673a28 81032c27 880012673a58 0200 8800150fa060 07ad 817ed352 Call Trace: [] dump_stack+0x4f/0x7c [] warn_slowpath_common+0x87/0xb0 [] warn_slowpath_fmt+0x41/0x50 [] ? tcp_v4_do_rcv+0x10f/0x4a0 [] ? prepare_to_wait+0x50/0xa0 [] ? prepare_to_wait+0x50/0xa0 [] __might_sleep+0x5b/0x1e0 [] release_sock+0x13d/0x200 [] sk_stream_wait_memory+0x133/0x2d0 [] ? woken_wake_function+0x10/0x10 [] tcp_sendmsg+0xb6f/0xd70 [] inet_sendmsg+0xdf/0x100 [] ? inet_recvmsg+0x100/0x100 [] sock_sendmsg+0x67/0x90 [] ? might_fault+0x51/0xb0 [] ___sys_sendmsg+0x2d2/0x2e0 [] ? futex_wake+0x128/0x140 [] ? futex_wake+0x1/0x140 [] ? do_dup2+0xd0/0xd0 [] ? get_parent_ip+0x11/0x50 [] ? debug_smp_processor_id+0x17/0x20 [] ? delay_tsc+0x85/0xb0 [] ? __fget+0xdd/0xf0 [] ? do_dup2+0xd0/0xd0 [] ? __fget_light+0x45/0x60 [] ? __fdget+0xe/0x10 [] __sys_sendmsg+0x44/0x70 [] SyS_sendmsg+0x9/0x10 [] system_call_fastpath+0x16/0x1b WARNING: CPU: 0 PID: 380 at kernel/sched/core.c:7094 __might_sleep+0x5b/0x1e0() do not call blocking ops when !TASK_RUNNING; state=1 set at [] prepare_to_wait+0x50/0xa0 Modules linked in: CPU: 0 PID: 380 Comm: kswapd0 Tainted: GW 3.16.0-vm+ #109 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 1bb6 88007b64bc68 8156ec1d 88007b64bcb8 88007b64bca8 81032c27 88007c062060 0065 8179ca1f Call Trace: [] dump_stack+0x4f/0x7c [] warn_slowpath_common+0x87/0xb0 [] warn_slowpath_fmt+0x41/0x50 [] ? prepare_to_wait+0x50/0xa0 [] ? prepare_to_wait+0x50/0xa0 [] __might_sleep+0x5b/0x1e0 [] __reset_isolation_suitable+0x83/0x140 [] reset_isolation_suitable+0x33/0x50 [] kswapd+0x2e7/0x4d0 [] ? woken_wake_function+0x10/0x10 [] ? balance_pgdat+0x5b0/0x5b0 [] kthread+0xfb/0x110 [] ? flush_kthread_worker+0x130/0x130 [] ret_from_fork+0x7c/0xb0 [] ? flush_kthread_worker+0x130/0x130 Thanks, Ilya -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/