> 3.11-rc6's commit 58ad436fcf49 ("genetlink: fix family dump race")
> gives me the lockdep trace below at startup.

Hmm. Yes, I see now how this happens, not sure why I didn't run into it.

The problem is that genl_family_rcv_msg() is called with the genl_lock
held, and then calls netlink_dump_start() with it held, creating a
genl_lock->cb_mutex dependency, but obviously the dump continuation is
the other way around.

We could use the semaphore instead, I believe, but I don't really
understand the mutex vs. semaphore well enough to be sure that's
correct.

johannes


diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index f85f8a2..6cfa646 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -792,7 +792,7 @@ static int ctrl_dumpfamily(struct sk_buff *skb, struct 
netlink_callback *cb)
        bool need_locking = chains_to_skip || fams_to_skip;
 
        if (need_locking)
-               genl_lock();
+               down_read(&cb_lock);
 
        for (i = chains_to_skip; i < GENL_FAM_TAB_SIZE; i++) {
                n = 0;
@@ -815,7 +815,7 @@ errout:
        cb->args[1] = n;
 
        if (need_locking)
-               genl_unlock();
+               up_read(&cb_lock);
 
        return skb->len;
 }


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to