Re: bcm43xx regression 2.6.19rc3 -> rc5, rtnl_lock trouble?

Michael Buesch Sun, 19 Nov 2006 08:03:47 -0800

On Saturday 18 November 2006 20:02, Larry Finger wrote:
> Ray Lee wrote:
> > Larry Finger wrote:
> >> Johannes Berg wrote:
> >>> Hah, that's a lot more plausible than bcm43xx's drain patch actually
> >>> causing this. So maybe somehow interrupts for bcm43xx aren't routed
> >>> properly or something...
> >>>
> >>> Ray, please check /proc/interrupts when this happens.
> > 
> > When it happens, I can't. The keyboard is entirely dead (I'm in X, perhaps at
> > a console it would be okay). The only thing that works is magic SysRq. even
> > ctrl-alt-f1 to get to a console doesn't work.
> > 
> > That said, /proc/interrupts doesn't show MSI routed things on my AMD64
> > laptop.
> > 
> >>> I am convinced that the patch in question (drain tx status) is not
> >>> causing this -- the patch should be a no-op in most cases anyway, and in
> >>> those cases where it isn't a no-op it'll run only once at card init and
> >>> remove some things from a hardware-internal FIFO.
> > 
> > Okay, I can buy that.
> > 
> >> I agree that drain tx status should not cause the problem.
> >>
> >> Ray, does -rc6 solve your problem as it did for Joseph?
> > 
> > I can't get it to repeat other than the first two times. However, I
> > accidentally stopped NetworkManager from handling my wireless a few days ago,
> > and haven't restarted it, so that may play into this.
> > 
> > Humor me one last time, I beg. Did you look at the messages file I posted? (Or
> > maybe I didn't include this second bit... Damn, I need to be more careful with
> > cutting and pasting...)
> 
> The locking stuff wasn't in any of the messages that I received.
> 
> > The second sysrq-t shows locking stuff going on, can you tell me if it looks
> > reasonable? It still seems to me that something acquiring and not releasing
> > rtnl_lock explains what I was seeing (rtnl lock is implicated in both sysrq-t
> > backtraces). I don't know if that thing is bcm43xx, though.
> > 
> > Is this part reasonable?:
> >  1 lock held by events/0/4:
> >   #0:  (&bcm->mutex){--..}, at: [mutex_lock+9/16] mutex_lock+0x9/0x10
> >  2 locks held by NetworkManager/4837:
> >   #0:  (rtnl_mutex){--..}, at: [mutex_lock+9/16] mutex_lock+0x9/0x10
> >   #1:  (&bcm->mutex){--..}, at: [mutex_lock+9/16] mutex_lock+0x9/0x10
> >  1 lock held by wpa_supplicant/5953:
> >   #0:  (rtnl_mutex){--..}, at: [mutex_lock+9/16] mutex_lock+0x9/0x10
> 
> I'm not an expert on locking, but it certainly looks as if bcm43xx and
> wpa_supplicant are OK by themselves, but that NetworkManager interferes.
> This behavior matches what I see - I don't have NetworkManager on my
> system, but I do use wpa_supplicant, with no lockups. Of course, I have
> i386 architecture.
> 
> Although NetworkManager may be the catalyst to trigger the bug, I doubt
> that it is the cause. Strictly as a guess, I would suspect that the
> locking problem is in SoftMAC, where we know there can be locking
> difficulties, but no one is fixing them because EOL is near for that
> component.


Well, this is different. The (known) locking problems are of type
"missing locking".
But this seems to be a deadlock or something, so clearly a bug that
must be fixed. Even in softmac.
I don't know yet how this can happen, though.
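
To make the deadlock suspicion concrete, here is a small wait-for graph
sketch of the lock dump quoted above. It is only an illustration, not a
diagnosis. Assumptions: the dump lists a task under a lock both when it
owns the lock and when it is blocked acquiring it, so the owner/waiter
split below is a guess; and the edge "events/0 waits for rtnl_mutex" is
purely hypothetical, nothing in the dump shows it. The helper name
find_cycle is made up for this example.

```python
# Wait-for graph sketch of the sysrq-t lock dump quoted above.
# Who owns a lock and who is merely blocked on it is an assumption here.

def find_cycle(owner, waits_for):
    """Return the list of tasks forming a wait cycle, or None.

    owner:     lock name -> task assumed to own it
    waits_for: task name -> lock the task is assumed to be blocked on
    """
    for start in waits_for:
        path = []
        task = start
        # Follow "task waits for lock owned by next task" until the chain
        # ends (no cycle from here) or revisits a task (cycle).
        while task is not None and task not in path:
            path.append(task)
            lock = waits_for.get(task)
            task = owner.get(lock) if lock is not None else None
        if task is not None:
            return path[path.index(task):]
    return None


# Assumed ownership, read off the dump.
owner = {"bcm->mutex": "events/0", "rtnl_mutex": "NetworkManager"}

# Waiters consistent with the dump alone: a chain, but not a cycle.
waits_seen = {"NetworkManager": "bcm->mutex", "wpa_supplicant": "rtnl_mutex"}

# HYPOTHETICAL extra edge: the work item on events/0 also wants rtnl_mutex.
waits_hyp = dict(waits_seen)
waits_hyp["events/0"] = "rtnl_mutex"

print(find_cycle(owner, waits_seen))
print(find_cycle(owner, waits_hyp))
```

With only the edges taken from the dump there is a chain of waiters but
no cycle, so by itself the dump would not explain a permanent hang. A
cycle appears only once the holder of bcm->mutex also needs rtnl_mutex,
which is the kind of path that would have to be found in the code.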

-- 
Greetings Michael.
