Re: [PATCH] tipc: checking returns and Re: Possible Circular Locking in TIPC
Jarek Poplawski wrote: If you are sure there is no circular locking possible between these two functions and this entry->lock here isn't endangered by other functions, you could try to make lockdep "silent" like this: write_lock_bh(&ref_table_lock); if (tipc_ref_table.first_free) { index = tipc_ref_table.first_free; entry = &(tipc_ref_table.entries[index]); index_mask = tipc_ref_table.index_mask; /* take lock in case a previous user of entry still holds it */ -spin_lock_bh(&entry->lock, ); + local_bh_disable(); + spin_lock_nested(&entry->lock, SINGLE_DEPTH_NESTING); next_plus_upper = entry->data.next_plus_upper; tipc_ref_table.first_free = next_plus_upper & index_mask; reference = (next_plus_upper & ~index_mask) + index; entry->data.reference = reference; entry->object = object; if (lock != 0) *lock = &entry->lock; /* may stay as is or: */ -spin_unlock_bh(&entry->lock); + spin_unlock(&entry->lock); + local_bh_enable(); } write_unlock_bh(&ref_table_lock); Looks like an acceptable solution. I will try this. Thanks ///Jon - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] tipc: checking returns and Re: Possible Circular Locking in TIPC
On Thu, Jan 04, 2007 at 04:16:20PM +, Jon Maloy wrote: > Regards > ///jon > > Jarek Poplawski wrote: > > > > >I know lockdep is sometimes > >too careful but nevertheless some change is needed > >to fix a real bug or give additional information > >to lockdep. > > > > > I don't know lockdep well enough yet, but I will try to find out if that > is possible. If you are sure there is no circular locking possible between these two functions and this entry->lock here isn't endangered by other functions, you could try to make lockdep "silent" like this: write_lock_bh(&ref_table_lock); if (tipc_ref_table.first_free) { index = tipc_ref_table.first_free; entry = &(tipc_ref_table.entries[index]); index_mask = tipc_ref_table.index_mask; /* take lock in case a previous user of entry still holds it */ -spin_lock_bh(&entry->lock, ); + local_bh_disable(); + spin_lock_nested(&entry->lock, SINGLE_DEPTH_NESTING); next_plus_upper = entry->data.next_plus_upper; tipc_ref_table.first_free = next_plus_upper & index_mask; reference = (next_plus_upper & ~index_mask) + index; entry->data.reference = reference; entry->object = object; if (lock != 0) *lock = &entry->lock; /* may stay as is or: */ -spin_unlock_bh(&entry->lock); + spin_unlock(&entry->lock); + local_bh_enable(); } write_unlock_bh(&ref_table_lock); Cheers, Jarek P. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] tipc: checking returns and Re: Possible Circular Locking in TIPC
Regards ///jon Jarek Poplawski wrote: I know lockdep is sometimes too careful but nevertheless some change is needed to fix a real bug or give additional information to lockdep. I don't know lockdep well enough yet, but I will try to find out if that is possible. Btw. there is a problem with tipc_ref_discard(): it should be called with tipc_port_lock, but how to discard a ref if this lock can't be acquired? Is it OK to call it without the lock like in subscr_named_msg_event()? I suspect you are mixing up things here. We are handling two different reference entries and two different locks in this function. One reference entry points to a subscription instance, and its reference (index) is obtainable from subscriber->ref. So, we could easily lock the entry if needed. However, in this particular case it is unnecessary, since there is no chance that anybody else could have obtained the new reference, and hence no risk for race conditions. The other reference entry was intended to point to a new port, but, since we didn't obtain any reference in the first place, there is no port to delete and no reference to discard. I admit I don't know this program and I hope I didn't mislead anybody with my message. I only tried to point at some doubts and maybe this function could be better commented about when the lock is needed. Agreed. Thanks for explanations & best regards, Jarek P. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] tipc: checking returns and Re: Possible Circular Locking in TIPC
On Wed, Jan 03, 2007 at 11:16:59PM +, Jon Maloy wrote: > See my comments below. > Regards > ///jon > > Jarek Poplawski wrote: > > >.. > > > >Maybe I misinterpret this but, IMHO lockdep > >complains about locks acquired in different > >order: tipc_ref_acquire() gets ref_table_lock > >and then tipc_ret_table.entries[index]->lock, > >but tipc_deleteport() inversely (with: > >tipc_port_lock() and tipc_ref_discard()). > >I hope maintainers will decide the correct > >order. > > > > > This order is correct. There can never be parallel access to the > same _instance_ of tipc_ret_table.entries[index]->lock from > the two functions you mention. > Note that tipc_deleteport() takes as argument the reference (=index) > returned from tipc_ref_acquire(), so it can not be (and is not) called > until and unless the latter function has returned a valid reference. > As a parallel, you can't do free() on a memory chunk until > malloc() has given you a pointer to it. I'm happy the order is correct! But the warning probably will be back. I know lockdep is sometimes too careful but nevertheless some change is needed to fix a real bug or give additional information to lockdep. > >Btw. there is a problem with tipc_ref_discard(): > >it should be called with tipc_port_lock, but > >how to discard a ref if this lock can't be > >acquired? Is it OK to call it without the lock > >like in subscr_named_msg_event()? > > > > > I suspect you are mixing up things here. > We are handling two different reference entries and two > different locks in this function. > One reference entry points to a subscription instance, and its > reference (index) is obtainable from subscriber->ref. So, we > could easily lock the entry if needed. However, in this > particular case it is unnecessary, since there is no chance that > anybody else could have obtained the new reference, and > hence no risk for race conditions. > The other reference entry was intended to point to a new port, > but, since we didn't obtain any reference in the first place, > there is no port to delete and no reference to discard. I admit I don't know this program and I hope I didn't mislead anybody with my message. I only tried to point at some doubts and maybe this function could be better commented about when the lock is needed. Thanks for explanations & best regards, Jarek P. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] tipc: checking returns and Re: Possible Circular Locking in TIPC
See my comments below. Regards ///jon Jarek Poplawski wrote: .. Maybe I misinterpret this but, IMHO lockdep complains about locks acquired in different order: tipc_ref_acquire() gets ref_table_lock and then tipc_ret_table.entries[index]->lock, but tipc_deleteport() inversely (with: tipc_port_lock() and tipc_ref_discard()). I hope maintainers will decide the correct order. This order is correct. There can never be parallel access to the same _instance_ of tipc_ret_table.entries[index]->lock from the two functions you mention. Note that tipc_deleteport() takes as argument the reference (=index) returned from tipc_ref_acquire(), so it can not be (and is not) called until and unless the latter function has returned a valid reference. As a parallel, you can't do free() on a memory chunk until malloc() has given you a pointer to it. Btw. there is a problem with tipc_ref_discard(): it should be called with tipc_port_lock, but how to discard a ref if this lock can't be acquired? Is it OK to call it without the lock like in subscr_named_msg_event()? I suspect you are mixing up things here. We are handling two different reference entries and two different locks in this function. One reference entry points to a subscription instance, and its reference (index) is obtainable from subscriber->ref. So, we could easily lock the entry if needed. However, in this particular case it is unnecessary, since there is no chance that anybody else could have obtained the new reference, and hence no risk for race conditions. The other reference entry was intended to point to a new port, but, since we didn't obtain any reference in the first place, there is no port to delete and no reference to discard. Btw. #2: during this checking I've found two places where return values from tipc_ref_lock() and tipc_port_lock() are not checked, so I attach a patch proposal for this (compiled but not tested): Thanks. Regards, Jarek P. --- [PATCH] tipc: checking returns from locking functions Checking of return values from tipc_ref_lock() and tipc_port_lock() added in 2 places. Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]> --- diff -Nurp linux-2.6.20-rc2-/net/tipc/port.c linux-2.6.20-rc2/net/tipc/port.c --- linux-2.6.20-rc2-/net/tipc/port.c 2006-11-29 22:57:37.0 +0100 +++ linux-2.6.20-rc2/net/tipc/port.c2006-12-28 11:05:17.0 +0100 @@ -238,7 +238,12 @@ u32 tipc_createport_raw(void *usr_handle return 0; } - tipc_port_lock(ref); + if (!tipc_port_lock(ref)) { + tipc_ref_discard(ref); + warn("Port creation failed, reference table invalid\n"); + kfree(p_ptr); + return 0; + } p_ptr->publ.ref = ref; msg = &p_ptr->publ.phdr; msg_init(msg, DATA_LOW, TIPC_NAMED_MSG, TIPC_OK, LONG_H_SIZE, 0); diff -Nurp linux-2.6.20-rc2-/net/tipc/subscr.c linux-2.6.20-rc2/net/tipc/subscr.c --- linux-2.6.20-rc2-/net/tipc/subscr.c 2006-12-18 09:01:04.0 +0100 +++ linux-2.6.20-rc2/net/tipc/subscr.c 2006-12-28 11:31:27.0 +0100 @@ -499,7 +499,12 @@ static void subscr_named_msg_event(void /* Add subscriber to topology server's subscriber list */ - tipc_ref_lock(subscriber->ref); + if (!tipc_ref_lock(subscriber->ref)) { + warn("Subscriber rejected, unable to find port\n"); + tipc_ref_discard(subscriber->ref); + kfree(subscriber); + return; + } spin_lock_bh(&topsrv.lock); list_add(&subscriber->subscriber_list, &topsrv.subscriber_list); spin_unlock_bh(&topsrv.lock); - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] tipc: checking returns and Re: Possible Circular Locking in TIPC
On 22-12-2006 15:28, Eric Sesterhenn wrote: > hi, > > while running my usual stuff on 2.6.20-rc1-git5, sfuzz > (http://www.digitaldwarf.be/products/sfuzz.c) > did the following, to produce the lockdep warning below: ... > Here is the stacktrace: > > [ 313.239556] === > [ 313.239718] [ INFO: possible circular locking dependency detected ] > [ 313.239795] 2.6.20-rc1-git5 #26 > [ 313.239858] --- > [ 313.239929] sfuzz/4133 is trying to acquire lock: > [ 313.239996] (ref_table_lock){-+..}, at: [] > tipc_ref_discard+0x29/0xe0 > [ 313.241101] > [ 313.241105] but task is already holding lock: > [ 313.241225] (&table[i].lock){-+..}, at: [] > tipc_deleteport+0x40/0x1a0 > [ 313.241524] > [ 313.241528] which lock already depends on the new lock. > [ 313.241535] > [ 313.241709] > [ 313.241713] the existing dependency chain (in reverse order) is: > [ 313.241837] > [ 313.241841] -> #1 (&table[i].lock){-+..}: > [ 313.242096][] __lock_acquire+0xd05/0xde0 > [ 313.242562][] lock_acquire+0x69/0xa0 > [ 313.243013][] _spin_lock_bh+0x40/0x60 > [ 313.243476][] tipc_ref_acquire+0x6b/0xe0 > [ 313.244115][] tipc_createport_raw+0x33/0x260 > [ 313.244562][] tipc_createport+0x41/0x120 > [ 313.245007][] tipc_subscr_start+0xcc/0x120 > [ 313.245458][] process_signal_queue+0x56/0xa0 > [ 313.245906][] tasklet_action+0x38/0x80 > [ 313.246361][] __do_softirq+0x5b/0xc0 > [ 313.246817][] do_softirq+0x88/0xe0 > [ 313.247450][] 0x > [ 313.247894] > [ 313.247898] -> #0 (ref_table_lock){-+..}: > [ 313.248155][] __lock_acquire+0xa55/0xde0 > [ 313.248601][] lock_acquire+0x69/0xa0 > [ 313.249037][] _write_lock_bh+0x40/0x60 > [ 313.249486][] tipc_ref_discard+0x29/0xe0 > [ 313.249922][] tipc_deleteport+0x5a/0x1a0 > [ 313.250543][] tipc_create+0x58/0x160 > [ 313.250980][] __sock_create+0x112/0x280 > [ 313.251422][] sock_create+0x1a/0x20 > [ 313.251863][] sys_socket+0x1b/0x40 > [ 313.252301][] sys_socketcall+0x92/0x260 > [ 313.252738][] syscall_call+0x7/0xb > [ 313.253175][] 0x > [ 313.253778] > [ 313.253782] other info that might help us debug this: > [ 313.253790] > [ 313.253956] 1 lock held by sfuzz/4133: > [ 313.254019] #0: (&table[i].lock){-+..}, at: [] > tipc_deleteport+0x40/0x1a0 > [ 313.254346] > [ 313.254351] stack backtrace: > [ 313.254470] [] show_trace_log_lvl+0x1a/0x40 > [ 313.254594] [] show_trace+0x12/0x20 > [ 313.254711] [] dump_stack+0x19/0x20 > [ 313.254829] [] print_circular_bug_tail+0x6f/0x80 > [ 313.254952] [] __lock_acquire+0xa55/0xde0 > [ 313.255070] [] lock_acquire+0x69/0xa0 > [ 313.255188] [] _write_lock_bh+0x40/0x60 > [ 313.255315] [] tipc_ref_discard+0x29/0xe0 > [ 313.255435] [] tipc_deleteport+0x5a/0x1a0 > [ 313.255565] [] tipc_create+0x58/0x160 > [ 313.255687] [] __sock_create+0x112/0x280 > [ 313.255811] [] sock_create+0x1a/0x20 > [ 313.255942] [] sys_socket+0x1b/0x40 > [ 313.256059] [] sys_socketcall+0x92/0x260 > [ 313.256179] [] syscall_call+0x7/0xb > [ 313.256300] === > > Greetings, Eric Hello, Maybe I misinterpret this but, IMHO lockdep complains about locks acquired in different order: tipc_ref_acquire() gets ref_table_lock and then tipc_ret_table.entries[index]->lock, but tipc_deleteport() inversely (with: tipc_port_lock() and tipc_ref_discard()). I hope maintainers will decide the correct order. Btw. there is a problem with tipc_ref_discard(): it should be called with tipc_port_lock, but how to discard a ref if this lock can't be acquired? Is it OK to call it without the lock like in subscr_named_msg_event()? Btw. #2: during this checking I've found two places where return values from tipc_ref_lock() and tipc_port_lock() are not checked, so I attach a patch proposal for this (compiled but not tested): Regards, Jarek P. --- [PATCH] tipc: checking returns from locking functions Checking of return values from tipc_ref_lock() and tipc_port_lock() added in 2 places. Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]> --- diff -Nurp linux-2.6.20-rc2-/net/tipc/port.c linux-2.6.20-rc2/net/tipc/port.c --- linux-2.6.20-rc2-/net/tipc/port.c 2006-11-29 22:57:37.0 +0100 +++ linux-2.6.20-rc2/net/tipc/port.c2006-12-28 11:05:17.0 +0100 @@ -238,7 +238,12 @@ u32 tipc_createport_raw(void *usr_handle return 0; } - tipc_port_lock(ref); + if (!tipc_port_lock(ref)) { + tipc_ref_discard(ref); + warn("Port creation failed, reference table invalid\n"); + kfree(p_ptr); + return 0; + } p_ptr->publ.ref = ref; msg = &p_ptr->publ.phdr; msg_init(msg, DATA_LOW, TIPC_NAMED_M