On Thu, Sep 29, 2016 at 12:09:43PM +0300, Lauri Tirkkonen wrote:
> On Thu, Sep 29 2016 11:04:09 +0200, Stefan Sperling wrote:
> > On Thu, Sep 29, 2016 at 07:18:09AM +0000, Lauri Tirkkonen wrote:
> > > >Synopsis:	panic in ieee80211_node_leave_11g: bogus long slot station count 0
> > > >Category:	kernel
> > > >Environment:
> > > 	System      : OpenBSD 6.0
> > > 	Details     : OpenBSD 6.0-current (GENERIC.MP) #2463: Sat Sep 17 09:52:10 MDT 2016
> > > 			 [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > >
> > > 	Architecture: OpenBSD.amd64
> > > 	Machine     : amd64
> > > >Description:
> > >
> > > I have had three panics with similar stacks with this kernel in my router
> > > (Soekris net6501), a few days apart from each other (Sep23, 26 and 28). I see
> > > that there have been further changes to sys/net80211, so I will try to reproduce
> > > with a more recent snapshot, but since I don't have a good way to actually repro
> > > this and the diffs don't seem to me to be directly related, I'm reporting this
> > > now.
> >
> > Please show your /etc/hostname.athn0 file.
>
> nwid airman chan 2 wpaciphers ccmp wpaprotos wpa2 wpagroupcipher ccmp wpakey <omitted>
> media autoselect mode 11g mediaopt hostap
> chan 12
> up
> inet 192.168.33.1/24
> inet6 up

OK, you're not doing anything insane.

These panics are probably fallout from this commit:

-----
CVSROOT:	/cvs
Module name:	src
Changes by:	[email protected]	2016/05/18 02:15:28

Modified files:
	sys/net80211   : ieee80211_input.c ieee80211_node.c
	                 ieee80211_proto.c

Log message:
In hostap mode, don't re-use association IDs (AIDs) of nodes which are
still lingering in the node cache. This could cause an AID to be assigned
twice, once to a newly associated node and once to a different node in
COLLECT cache state (i.e. marked for future eviction from the node cache).

Drivers (e.g. rt2860) may use AIDs to keep track of nodes in firmware
tables and get confused when AIDs aren't unique across the node cache.
The symptom observed with rt2860 were nodes stuck at 1 Mbps Tx rate since
the duplicate AID made the driver perform Tx rate (AMRR) accounting on
the wrong node object.

To find out if a node is associated we now check the node's cache state,
rather than comparing the node's AID against zero. An AID is assigned when
a node associates and it lasts until the node is eventually purged from the
node cache (previously, the AID was made available for re-use when the node
was placed in COLLECT state). There is no need to be stingy with AIDs since
the number of possible AIDs exceeds the maximum number of nodes in the cache.

Problem found by Nathanael Rensen.
Fix written by Nathanael and myself. Tested by Nathanael.
Comitting now to get this change tested across as many drivers as possible.
-----

You've found another code path where a check against AID zero is used
to determine whether a node is in associated state. Tsk tsk.

Does this fix it?

Index: ieee80211_node.c
===================================================================
RCS file: /cvs/src/sys/net80211/ieee80211_node.c,v
retrieving revision 1.105
diff -u -p -r1.105 ieee80211_node.c
--- ieee80211_node.c	15 Sep 2016 03:32:48 -0000	1.105
+++ ieee80211_node.c	29 Sep 2016 09:20:53 -0000
@@ -1678,11 +1678,14 @@ ieee80211_node_leave(struct ieee80211com
 {
 	if (ic->ic_opmode != IEEE80211_M_HOSTAP)
 		panic("not in ap mode, mode %u", ic->ic_opmode);
+
+	if (ni->ni_state == IEEE80211_STA_COLLECT)
+		return;
 	/*
 	 * If node wasn't previously associated all we need to do is
 	 * reclaim the reference.
 	 */
-	if (ni->ni_associd == 0) {
+	if (ni->ni_associd == 0 || ni->ni_state < IEEE80211_STA_ASSOC) {
 		ieee80211_node_newstate(ni, IEEE80211_STA_COLLECT);
 		return;
 	}
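
To illustrate why the AID check alone is no longer enough, here is a small userland sketch. It is not the net80211 code: every name in it (struct node, struct ap, node_leave_old, node_leave_new, and so on) is made up for illustration. It assumes that COLLECT sorts after ASSOC in the state enum, that the station in question was counted as a long slot station, and that the panic comes from the leave path running twice on the same node. I have not confirmed that last point from the stack traces, so treat it as one plausible sequence only.

```c
#include <assert.h>

/* Hypothetical stand-ins for the node cache states (ordering assumed). */
enum sta_state { STA_CACHE, STA_BSS, STA_AUTH, STA_ASSOC, STA_COLLECT };

struct node {
	enum sta_state	state;
	unsigned	associd;	/* kept until the node is purged */
};

struct ap {
	int	longslotsta;	/* stations counted as "long slot" */
	int	bogus;		/* set where the kernel would panic */
};

/* Simplified model of the 11g accounting done when a station leaves. */
static void
leave_11g(struct ap *ap)
{
	assert(ap->longslotsta >= 0);
	if (ap->longslotsta == 0) {
		ap->bogus = 1;	/* kernel: "bogus long slot station count" */
		return;
	}
	ap->longslotsta--;
}

/* Old logic: only the AID decides whether the node was associated. */
static void
node_leave_old(struct ap *ap, struct node *ni)
{
	if (ni->associd == 0) {
		ni->state = STA_COLLECT;
		return;
	}
	leave_11g(ap);
	ni->state = STA_COLLECT;	/* AID stays assigned */
}

/* Patched logic: consult the cache state as well. */
static void
node_leave_new(struct ap *ap, struct node *ni)
{
	if (ni->state == STA_COLLECT)
		return;
	if (ni->associd == 0 || ni->state < STA_ASSOC) {
		ni->state = STA_COLLECT;
		return;
	}
	leave_11g(ap);
	ni->state = STA_COLLECT;
}

/* Run the leave path twice on one associated node; return the bogus flag. */
static int
double_leave_is_bogus(void (*leave)(struct ap *, struct node *))
{
	struct ap ap = { 1, 0 };
	struct node ni = { STA_ASSOC, 1 };

	leave(&ap, &ni);
	leave(&ap, &ni);
	return ap.bogus;
}
```

Calling double_leave_is_bogus(node_leave_old) trips the bogus counter on the second pass because the AID is still non-zero, while double_leave_is_bogus(node_leave_new) returns early for the node already in COLLECT state.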
