On 22-12-2006 15:28, Eric Sesterhenn wrote:
> hi,
> 
> while running my usual stuff on 2.6.20-rc1-git5, sfuzz 
> (http://www.digitaldwarf.be/products/sfuzz.c)
> did the following, to produce the lockdep warning below:
...
> Here is the stacktrace:
> 
> [  313.239556] =======================================================
> [  313.239718] [ INFO: possible circular locking dependency detected ]
> [  313.239795] 2.6.20-rc1-git5 #26
> [  313.239858] -------------------------------------------------------
> [  313.239929] sfuzz/4133 is trying to acquire lock:
> [  313.239996]  (ref_table_lock){-+..}, at: [<c04cd0a9>] 
> tipc_ref_discard+0x29/0xe0
> [  313.241101] 
> [  313.241105] but task is already holding lock:
> [  313.241225]  (&table[i].lock){-+..}, at: [<c04cb7c0>] 
> tipc_deleteport+0x40/0x1a0
> [  313.241524] 
> [  313.241528] which lock already depends on the new lock.
> [  313.241535] 
> [  313.241709] 
> [  313.241713] the existing dependency chain (in reverse order) is:
> [  313.241837] 
> [  313.241841] -> #1 (&table[i].lock){-+..}:
> [  313.242096]        [<c01366c5>] __lock_acquire+0xd05/0xde0
> [  313.242562]        [<c0136809>] lock_acquire+0x69/0xa0
> [  313.243013]        [<c04d4040>] _spin_lock_bh+0x40/0x60
> [  313.243476]        [<c04cd1cb>] tipc_ref_acquire+0x6b/0xe0
> [  313.244115]        [<c04cac73>] tipc_createport_raw+0x33/0x260
> [  313.244562]        [<c04cbe21>] tipc_createport+0x41/0x120
> [  313.245007]        [<c04c57ec>] tipc_subscr_start+0xcc/0x120
> [  313.245458]        [<c04bdb56>] process_signal_queue+0x56/0xa0
> [  313.245906]        [<c011ea18>] tasklet_action+0x38/0x80
> [  313.246361]        [<c011ecbb>] __do_softirq+0x5b/0xc0
> [  313.246817]        [<c01060e8>] do_softirq+0x88/0xe0
> [  313.247450]        [<ffffffff>] 0xffffffff
> [  313.247894] 
> [  313.247898] -> #0 (ref_table_lock){-+..}:
> [  313.248155]        [<c0136415>] __lock_acquire+0xa55/0xde0
> [  313.248601]        [<c0136809>] lock_acquire+0x69/0xa0
> [  313.249037]        [<c04d4100>] _write_lock_bh+0x40/0x60
> [  313.249486]        [<c04cd0a9>] tipc_ref_discard+0x29/0xe0
> [  313.249922]        [<c04cb7da>] tipc_deleteport+0x5a/0x1a0
> [  313.250543]        [<c04cd4f8>] tipc_create+0x58/0x160
> [  313.250980]        [<c0431cb2>] __sock_create+0x112/0x280
> [  313.251422]        [<c0431e5a>] sock_create+0x1a/0x20
> [  313.251863]        [<c04320fb>] sys_socket+0x1b/0x40
> [  313.252301]        [<c0432a72>] sys_socketcall+0x92/0x260
> [  313.252738]        [<c0102fd0>] syscall_call+0x7/0xb
> [  313.253175]        [<ffffffff>] 0xffffffff
> [  313.253778] 
> [  313.253782] other info that might help us debug this:
> [  313.253790] 
> [  313.253956] 1 lock held by sfuzz/4133:
> [  313.254019]  #0:  (&table[i].lock){-+..}, at: [<c04cb7c0>] 
> tipc_deleteport+0x40/0x1a0
> [  313.254346] 
> [  313.254351] stack backtrace:
> [  313.254470]  [<c01045da>] show_trace_log_lvl+0x1a/0x40
> [  313.254594]  [<c0104d72>] show_trace+0x12/0x20
> [  313.254711]  [<c0104e79>] dump_stack+0x19/0x20
> [  313.254829]  [<c013480f>] print_circular_bug_tail+0x6f/0x80
> [  313.254952]  [<c0136415>] __lock_acquire+0xa55/0xde0
> [  313.255070]  [<c0136809>] lock_acquire+0x69/0xa0
> [  313.255188]  [<c04d4100>] _write_lock_bh+0x40/0x60
> [  313.255315]  [<c04cd0a9>] tipc_ref_discard+0x29/0xe0
> [  313.255435]  [<c04cb7da>] tipc_deleteport+0x5a/0x1a0
> [  313.255565]  [<c04cd4f8>] tipc_create+0x58/0x160
> [  313.255687]  [<c0431cb2>] __sock_create+0x112/0x280
> [  313.255811]  [<c0431e5a>] sock_create+0x1a/0x20
> [  313.255942]  [<c04320fb>] sys_socket+0x1b/0x40
> [  313.256059]  [<c0432a72>] sys_socketcall+0x92/0x260
> [  313.256179]  [<c0102fd0>] syscall_call+0x7/0xb
> [  313.256300]  =======================
> 
> Greetings, Eric

Hello,

Maybe I misinterpret this but, IMHO lockdep
complains about locks acquired in different
order: tipc_ref_acquire() gets ref_table_lock 
and then tipc_ret_table.entries[index]->lock,
but tipc_deleteport() inversely (with:
tipc_port_lock() and tipc_ref_discard()).
I hope maintainers will decide the correct
order.

Btw. there is a problem with tipc_ref_discard():
it should be called with tipc_port_lock, but
how to discard a ref if this lock can't be
acquired? Is it OK to call it without the lock
like in subscr_named_msg_event()?

Btw. #2: during this checking I've found
two places where return values from
tipc_ref_lock() and tipc_port_lock() are not 
checked, so I attach a patch proposal for
this (compiled but not tested):

Regards,
Jarek P.
---

[PATCH] tipc: checking returns from locking functions

Checking of return values from tipc_ref_lock()
and tipc_port_lock() added in 2 places. 

Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>
---

diff -Nurp linux-2.6.20-rc2-/net/tipc/port.c linux-2.6.20-rc2/net/tipc/port.c
--- linux-2.6.20-rc2-/net/tipc/port.c   2006-11-29 22:57:37.000000000 +0100
+++ linux-2.6.20-rc2/net/tipc/port.c    2006-12-28 11:05:17.000000000 +0100
@@ -238,7 +238,12 @@ u32 tipc_createport_raw(void *usr_handle
                return 0;
        }
 
-       tipc_port_lock(ref);
+       if (!tipc_port_lock(ref)) {
+               tipc_ref_discard(ref);
+               warn("Port creation failed, reference table invalid\n");
+               kfree(p_ptr);
+               return 0;
+       }
        p_ptr->publ.ref = ref;
        msg = &p_ptr->publ.phdr;
        msg_init(msg, DATA_LOW, TIPC_NAMED_MSG, TIPC_OK, LONG_H_SIZE, 0);
diff -Nurp linux-2.6.20-rc2-/net/tipc/subscr.c 
linux-2.6.20-rc2/net/tipc/subscr.c
--- linux-2.6.20-rc2-/net/tipc/subscr.c 2006-12-18 09:01:04.000000000 +0100
+++ linux-2.6.20-rc2/net/tipc/subscr.c  2006-12-28 11:31:27.000000000 +0100
@@ -499,7 +499,12 @@ static void subscr_named_msg_event(void 
 
        /* Add subscriber to topology server's subscriber list */
 
-       tipc_ref_lock(subscriber->ref);
+       if (!tipc_ref_lock(subscriber->ref)) {
+               warn("Subscriber rejected, unable to find port\n");
+               tipc_ref_discard(subscriber->ref);
+               kfree(subscriber);
+               return;
+       }
        spin_lock_bh(&topsrv.lock);
        list_add(&subscriber->subscriber_list, &topsrv.subscriber_list);
        spin_unlock_bh(&topsrv.lock);
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to