On Tue, Jan 6, 2015 at 5:49 AM, Sedat Dilek <sedat.di...@gmail.com> wrote:
> [ Please CC me I am not subscribed to LKML ]
>
> [ QUOTE ]
>
> On Mon, Jan 05, 2015 at 05:46:15PM -0800, Linus Torvalds wrote:
>  > It's a day delayed - not because of any particular development issues,
>  > but simply because I was tiling a bathroom yesterday. But rc3 is out
>  > there now, and things have stayed reasonably calm. I really hope that
>  > implies that 3.19 is looking good, but it's equally likely that it's
>  > just that people are still recovering from the holiday season.
>  >
>  > A bit over three quarters of the changes here are drivers - mostly
>  > networking, thermal, input layer, sound, power management. The rest is
>  > misc - filesystems, core networking, some arch fixes, etc. But all of
>  > it is pretty small.
>  >
>  > So go out and test,
>
> This has been there since just before rc1. Is there a fix for this
> stalled in someones git tree maybe ?
>
> [    7.952588] WARNING: CPU: 0 PID: 299 at kernel/sched/core.c:7303 __might_sleep+0x8d/0xa0()
> [    7.952592] do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff910a0f7a>] prepare_to_wait+0x2a/0x90
> [    7.952595] CPU: 0 PID: 299 Comm: systemd-readahe Not tainted 3.19.0-rc3+ #100
> [    7.952597]  0000000000001c87 00000000720a2c76 ffff8800b2513c88 ffffffff915b47c7
> [    7.952598]  ffffffff910a3648 ffff8800b2513ce0 ffff8800b2513cc8 ffffffff91062c30
> [    7.952599]  0000000000000000 ffffffff91796fb2 000000000000026d 0000000000000000
> [    7.952600] Call Trace:
> [    7.952603]  [<ffffffff915b47c7>] dump_stack+0x4c/0x65
> [    7.952604]  [<ffffffff910a3648>] ? down_trylock+0x28/0x40
> [    7.952606]  [<ffffffff91062c30>] warn_slowpath_common+0x80/0xc0
> [    7.952607]  [<ffffffff91062cc0>] warn_slowpath_fmt+0x50/0x70
> [    7.952608]  [<ffffffff910a0f7a>] ? prepare_to_wait+0x2a/0x90
> [    7.952610]  [<ffffffff910a0f7a>] ? prepare_to_wait+0x2a/0x90
> [    7.952611]  [<ffffffff910867ed>] __might_sleep+0x8d/0xa0
> [    7.952614]  [<ffffffff915b8ea9>] mutex_lock_nested+0x39/0x3e0
> [    7.952616]  [<ffffffff910a77ad>] ? trace_hardirqs_on+0xd/0x10
> [    7.952617]  [<ffffffff910a0fac>] ? prepare_to_wait+0x5c/0x90
> [    7.952620]  [<ffffffff911a63e0>] fanotify_read+0xe0/0x5b0
> [    7.952622]  [<ffffffff91090801>] ? sched_clock_cpu+0xc1/0xd0
> [    7.952624]  [<ffffffff91242459>] ? selinux_file_permission+0xb9/0x130
> [    7.952626]  [<ffffffff910a14d0>] ? prepare_to_wait_event+0xf0/0xf0
> [    7.952628]  [<ffffffff91162513>] __vfs_read+0x13/0x50
> [    7.952629]  [<ffffffff911625d8>] vfs_read+0x88/0x140
> [    7.952631]  [<ffffffff911626e7>] SyS_read+0x57/0xd0
> [    7.952633]  [<ffffffff915bd952>] system_call_fastpath+0x12/0x17
>
> [ /QUOTE ]
>
> I am seeing a similar call-trace/warning.
> It is reproducible when running fio (latest: v2.2.4) during my loop-mq
> tests (see block.git#for-next).
>
> Some people tend to say it's coming from the linux-aio area [1], but I
> am not sure.
> At first I thought this was a linux-next problem, but I am also seeing
> it with my rc-kernels.
> For parts of aio there is a patch discussed in [2].
> The experimental patchset of Ken from [3] made the "aio" call-trace go
> away here.
>
> I tried also a patch pending in peterz/queue.git#sched/core from Eric Sandeen.
> It's "check for stack overflow in ___might_sleep".
> Unfortunately, it did not help in case of my loop-mq tests.
> ( BTW, this is touching ___might_sleep() (note: triple-underscore VS.
> affected __might_sleep() <--- double-underscore). )
>
> Let me hear your feedback.
>
> Have more fun!
>
> - Sedat -
>
> [1] http://marc.info/?l=linux-aio&m=142033318411355&w=2
> [2] http://marc.info/?l=linux-aio&m=142035799514685&w=2
> [3] http://evilpiepirate.org/git/linux-bcache.git/log/?h=aio_ring_fix
> [4] http://git.kernel.org/cgit/linux/kernel/git/peterz/queue.git/patch/?id=48e615e4c3ebed488fecb6bfb40b372151f62db2

[ CC Takashi ]

From [1]:
...

Just "me too" (but overlooked until recently).

The cause is a mutex_lock() call right after prepare_to_wait() with
TASK_INTERRUPTIBLE in fanotify_read().

static ssize_t fanotify_read(struct file *file, char __user *buf,
			     size_t count, loff_t *pos)
{
....
	while (1) {
		prepare_to_wait(&group->notification_waitq, &wait, TASK_INTERRUPTIBLE);
		mutex_lock(&group->notification_mutex);

I saw Peter already fixed similar code in inotify_user.c by commit
e23738a7300a (but interestingly for a different reason, "Deal with
nested sleeps").  Supposedly a similar fix would be needed for
fanotify_user.c.
...

Can you explain why you think the problem is in sched-fanotify?

I tried a similar (quick) fix, analogous to the mentioned
"sched, inotify: Deal with nested sleeps" patch from Peter (see the patch below).
If I did it correctly, it does not make the call-trace go away here.
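
To make the intended ordering explicit, here is a minimal userspace sketch of
the check that fires in the trace. It is a toy model, not kernel code: every
model_* name is a hypothetical stand-in that only echoes the kernel helper it
is named after, and it assumes nothing beyond "a blocking op warns when the
task state is not running". It models the ordering only and says nothing about
why the warning still shows up in my loop-mq tests.

```c
#include <assert.h>
#include <stdio.h>

/* Toy userspace model, NOT kernel code. Every name below is a
 * hypothetical stand-in chosen to echo a kernel helper. */
enum model_state { MODEL_TASK_RUNNING = 0, MODEL_TASK_INTERRUPTIBLE = 1 };

static enum model_state model_current_state = MODEL_TASK_RUNNING;
static int model_warnings;

/* Stand-in for the __might_sleep() check: warn when a blocking
 * operation starts while the task state is not "running". */
static void model_might_sleep(const char *caller)
{
	if (model_current_state != MODEL_TASK_RUNNING) {
		model_warnings++;
		printf("WARNING: %s: blocking op when !TASK_RUNNING; state=%d\n",
		       caller, (int)model_current_state);
	}
}

/* Acquiring a mutex may block, so it runs the check first. */
static void model_mutex_lock(void)
{
	model_might_sleep("model_mutex_lock");
}

/* prepare_to_wait()-like: changes the task state immediately. */
static void model_prepare_to_wait(void)
{
	model_current_state = MODEL_TASK_INTERRUPTIBLE;
}

/* finish_wait()-like: restores the running state. */
static void model_finish_wait(void)
{
	model_current_state = MODEL_TASK_RUNNING;
}

/* wait_woken()-like: the state change is confined to the wait itself. */
static void model_wait_woken(void)
{
	model_current_state = MODEL_TASK_INTERRUPTIBLE;
	/* ... the real helper would sleep here until woken ... */
	model_current_state = MODEL_TASK_RUNNING;
}

/* Old ordering: the state is changed first, then the mutex is taken. */
static int model_old_pattern(void)
{
	model_warnings = 0;
	model_prepare_to_wait();
	model_mutex_lock();
	model_finish_wait();
	return model_warnings;
}

/* New ordering: the mutex is taken while still running; the state
 * only changes inside the wait helper. */
static int model_new_pattern(void)
{
	assert(model_current_state == MODEL_TASK_RUNNING);
	model_warnings = 0;
	model_mutex_lock();
	model_wait_woken();
	return model_warnings;
}
```

In this model, model_old_pattern() counts one warning and model_new_pattern()
counts none, which is the effect the attached patch is aiming for in
fanotify_read().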

- Sedat -

[1] http://marc.info/?l=linux-kernel&m=142053231023575&w=2
From 5445404e768653771faca9770755340200fe8b6c Mon Sep 17 00:00:00 2001
From: Sedat Dilek <sedat.di...@gmail.com>
Date: Tue, 6 Jan 2015 09:51:54 +0100
Subject: [PATCH] sched: fanotify: Deal with nested sleeps

---
 fs/notify/fanotify/fanotify_user.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index c991616..65e96e2 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -14,6 +14,7 @@
 #include <linux/types.h>
 #include <linux/uaccess.h>
 #include <linux/compat.h>
+#include <linux/wait.h>
 
 #include <asm/ioctls.h>
 
@@ -259,15 +260,15 @@ static ssize_t fanotify_read(struct file *file, char __user *buf,
 	struct fsnotify_event *kevent;
 	char __user *start;
 	int ret;
-	DEFINE_WAIT(wait);
+	DEFINE_WAIT_FUNC(wait, woken_wake_function);
 
 	start = buf;
 	group = file->private_data;
 
 	pr_debug("%s: group=%p\n", __func__, group);
 
+	add_wait_queue(&group->notification_waitq, &wait);
 	while (1) {
-		prepare_to_wait(&group->notification_waitq, &wait, TASK_INTERRUPTIBLE);
 
 		mutex_lock(&group->notification_mutex);
 		kevent = get_one_event(group, count);
@@ -289,7 +290,7 @@ static ssize_t fanotify_read(struct file *file, char __user *buf,
 
 			if (start != buf)
 				break;
-			schedule();
+			wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
 			continue;
 		}
 
@@ -318,8 +319,8 @@ static ssize_t fanotify_read(struct file *file, char __user *buf,
 		buf += ret;
 		count -= ret;
 	}
+	remove_wait_queue(&group->notification_waitq, &wait);
 
-	finish_wait(&group->notification_waitq, &wait);
 	if (start != buf && ret != -EFAULT)
 		ret = buf - start;
 	return ret;
-- 
2.2.1
