On Wed, Dec 13, 2017 at 05:03:00PM -0800, Andrew Morton wrote:
> >     sched/wait: assert the wait_queue_head lock is held in __wake_up_common
> >     
> >     Better ensure we actually hold the lock using lockdep than just
> >     commenting on it.  Due to the various exported _locked interfaces
> >     it is far too easy to get the locking wrong.
> 
> I'm probably sitting on an older version.  I've dropped
> 
> epoll: use the waitqueue lock to protect ep->wq
> sched/wait: assert the wait_queue_head lock is held in __wake_up_common

Looks pretty clear to me that userfaultfd is also abusing the wake_up_locked
interfaces:

        spin_lock(&ctx->fault_pending_wqh.lock);
        __wake_up_locked_key(&ctx->fault_pending_wqh, TASK_NORMAL, &range);
        __wake_up_locked_key(&ctx->fault_wqh, TASK_NORMAL, &range);
        spin_unlock(&ctx->fault_pending_wqh.lock);

Sure, it's locked, but not by the lock you thought it was going to be.

There doesn't actually appear to be a bug here; fault_wqh is always serialised
by fault_pending_wqh.lock, but lockdep can't know that.  I think this patch
will solve the problem.

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index ac9a4e65ca49..a39bc3237b68 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -879,7 +879,7 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
         */
        spin_lock(&ctx->fault_pending_wqh.lock);
        __wake_up_locked_key(&ctx->fault_pending_wqh, TASK_NORMAL, &range);
-       __wake_up_locked_key(&ctx->fault_wqh, TASK_NORMAL, &range);
+       __wake_up(&ctx->fault_wqh, TASK_NORMAL, 1, &range);
        spin_unlock(&ctx->fault_pending_wqh.lock);
 
        /* Flush pending events that may still wait on event_wqh */
@@ -1045,7 +1045,7 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
                         * anyway.
                         */
                        list_del(&uwq->wq.entry);
-                       __add_wait_queue(&ctx->fault_wqh, &uwq->wq);
+                       add_wait_queue(&ctx->fault_wqh, &uwq->wq);
 
                        write_seqcount_end(&ctx->refile_seq);
 
@@ -1194,7 +1194,7 @@ static void __wake_userfault(struct userfaultfd_ctx *ctx,
                __wake_up_locked_key(&ctx->fault_pending_wqh, TASK_NORMAL,
                                     range);
        if (waitqueue_active(&ctx->fault_wqh))
-               __wake_up_locked_key(&ctx->fault_wqh, TASK_NORMAL, range);
+               __wake_up(&ctx->fault_wqh, TASK_NORMAL, 1, range);
        spin_unlock(&ctx->fault_pending_wqh.lock);
 }
 
