Hello,

If a daemon using FANOTIFY needs to open a file on a watched filesystem and it has asked for OPEN_PERM events, we get a deadlock. (This could happen because a library the daemon is using suddenly decides it needs to look in a new file.) Even though the man page says that the daemon should approve its own access requests, it really can't. If it's in the middle of working on a request and that in turn generates another request, the second request is going to sit in the queue inside the kernel and not be read, because the daemon is blocked in a library call that will never finish. We also have no idea how many requests are stacked up ahead of it in the queue. So, it really can't approve its own access requests.
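For reference, the self-approval that the man page asks for looks roughly like the sketch below in user space. The helper names is_perm_event() and build_response() are hypothetical and only for illustration; the struct and FAN_* names are the ones from <sys/fanotify.h>. Note that this check can only run once the daemon gets back to read() on the fanotify fd, which is exactly the step that is never reached in the case described above.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/fanotify.h>

/* True if the event is one the listener has to answer with allow/deny. */
static bool is_perm_event(const struct fanotify_event_metadata *md)
{
	return (md->mask & (FAN_OPEN_PERM | FAN_ACCESS_PERM)) != 0;
}

/*
 * Hypothetical helper: build the reply for one permission event.
 * An event caused by the listener itself (md->pid == self) is always
 * allowed; for any other process the caller's policy verdict is used.
 */
static struct fanotify_response
build_response(const struct fanotify_event_metadata *md, pid_t self,
	       bool policy_allows)
{
	struct fanotify_response resp = {
		.fd = md->fd,
		.response = (md->pid == self || policy_allows) ?
			    FAN_ALLOW : FAN_DENY,
	};

	return resp;
}
```

A daemon would call build_response(md, getpid(), verdict) for each permission event it reads and write() the result back to the fanotify fd. That only helps if the daemon is still reading events when its own open() is queued.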

The solution is to assume that the daemon is going to approve its own file access requests. So, any requested access that matches the pid of the program receiving fanotify events should be pre-approved in the kernel and not sent to user space for approval. This should prevent deadlock.

Signed-off-by: sgrubb <sgr...@redhat.com>
---
 fs/notify/fanotify/fanotify.c      | 9 +++++++++
 fs/notify/fanotify/fanotify_user.c | 3 +++
 include/linux/fsnotify_backend.h   | 2 ++
 3 files changed, 14 insertions(+)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index d2f97ec..469ce6d 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -105,6 +105,7 @@ static bool fanotify_should_send_event(struct fsnotify_mark *inode_mark,
 {
 	__u32 marks_mask, marks_ignored_mask;
 	struct path *path = data;
+	struct pid *cur_pid;
 
 	pr_debug("%s: inode_mark=%p vfsmnt_mark=%p mask=%x data=%p"
 		 " data_type=%d\n", __func__, inode_mark, vfsmnt_mark,
@@ -139,6 +140,14 @@ static bool fanotify_should_send_event(struct fsnotify_mark *inode_mark,
 		BUG();
 	}
 
+	/* Assume the listening process will approve its own requests */
+	cur_pid = get_pid(task_tgid(current));
+	if (pid_nr(vfsmnt_mark->group->fanotify_data.pid) == pid_nr(cur_pid)) {
+		put_pid(cur_pid);
+		return false;
+	}
+	put_pid(cur_pid);
+
 	if (d_is_dir(path->dentry) &&
 	    !(marks_mask & FS_ISDIR & ~marks_ignored_mask))
 		return false;
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 8e8e6bc..510e3bc 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -387,6 +387,8 @@ static int fanotify_release(struct inode *ignored, struct file *file)
 	 */
 	wake_up(&group->fanotify_data.access_waitq);
 #endif
+	/* Get rid of reference held since fanotify_init */
+	put_pid(group->fanotify_data.pid);
 
 	/* matches the fanotify_init->fsnotify_alloc_group */
 	fsnotify_destroy_group(group);
@@ -740,6 +742,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 
 	group->fanotify_data.user = user;
 	atomic_inc(&user->fanotify_listeners);
+	group->fanotify_data.pid = get_pid(task_tgid(current));
 
 	oevent = fanotify_alloc_event(NULL, FS_Q_OVERFLOW, NULL);
 	if (unlikely(!oevent)) {
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 533c440..48938ad 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -16,6 +16,7 @@
 #include <linux/spinlock.h>
 #include <linux/types.h>
 #include <linux/atomic.h>
+#include <linux/pid.h>
 
 /*
  * IN_* from inotfy.h lines up EXACTLY with FS_*, this is so we can easily
@@ -184,6 +185,7 @@ struct fsnotify_group {
 			int f_flags;
 			unsigned int max_marks;
 			struct user_struct *user;
+			struct pid *pid;
 		} fanotify_data;
 #endif /* CONFIG_FANOTIFY */
 	};
--
2.4.3