On Fri, May 30, 2014 at 01:58:33PM -0700, Michael Chan wrote:
> On Fri, 2014-05-30 at 16:38 -0400, Neil Horman wrote:
> > On Fri, May 30, 2014 at 01:13:40PM -0700, Michael Chan wrote:
> > > On Fri, 2014-05-30 at 16:03 -0400, Neil Horman wrote:
> > > > On Fri, May 30, 2014 at 10:58:11AM -0700, Michael Chan wrote:
> > > > > On Fri, 2014-05-30 at 11:00 -0400, Neil Horman wrote:
> > > > > > The Cnic driver handles lots of ulp operations in its netdevice
> > > > > > event handler. To do this, it accesses the ulp_ops array, which
> > > > > > is an rcu protected array. However, some ulp operations (like
> > > > > > bnx2fc_indicate_netevent) try to lock mutexes, which might sleep
> > > > > > (something that you can't do while holding rcu read side locks
> > > > > > if you've configured non-preemptive rcu).
> > > > > >
> > > > > > Fix this by changing the dereference method. All accesses to the
> > > > > > ulp_ops array for a cnic dev are modified under the protection of
> > > > > > the rtnl lock, and so we can safely just use rcu_dereference_rtnl,
> > > > > > and remove the rcu_read_lock here
> > > > >
> > > > > Because the bnx2fc function can sleep, we need a more complete fix
> > > > > to prevent the ulp_ops from going away when the device is
> > > > > unregistered. synchronize_rcu() won't be able to protect it. I'll
> > > > > post the patch later today. Thanks.
> > > > >
> > > > The device can't be unregistered while we hold rtnl, can it? Since we
> > > > hold it in this path it seems safe to me, even if we sleep, or am I
> > > > missing something?
> > > > Neil
> > > >
> > > The netdev cannot be unregistered of course, but I am talking about
> > > bnx2fc unregistering the cnic device. For example if someone does
> > > fcoeadm -d or bnx2fc gets unloaded.
> > >
> > I don't think the latter can happen, as creating an fcoe transport places
> > a hold on the bnx2fc module (see bnx2fc_create), and the former operation
> > (fcoeadm -d) will block in bnx2fc_destroy as it requires holding the
> > rtnl_lock, which will already be held by the netevent notifier, and
> > confirmed by the rcu_dereference_rtnl in my patch.
> >
> > I really think we're safe here
>
> Take a look at bnx2fc_mod_exit(). It doesn't look safe to me as it goes
> through the adapter_list unregistering all cnic devices not under
> rtnl_lock.
>
Right, but you can't get into the module removal code at all until all
transports are unregistered. I suppose if you have no registered transports
and remove the bnx2fc module while a netdevice event is being handled, there
might be a problem. But I think that problem is bigger than what we're
talking about here: you don't want to remove the module at all while a
netdevice notifier is running, as you could wind up executing garbage.
Neil
