Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
On Sat, 2008-07-12 at 21:26 -0700, Sam Leffler wrote: > Alexandre "Sunny" Kovalenko wrote: > > On Sat, 2008-07-12 at 09:57 -0700, Sam Leffler wrote: > > > >> Alexandre "Sunny" Kovalenko wrote: > >> > >>> On Fri, 2008-07-11 at 20:29 -0700, Sam Leffler wrote: > >>> > >>> > Alexandre "Sunny" Kovalenko wrote: > > > > On Fri, 2008-05-16 at 12:23 -0400, Sam Leffler wrote: > > > > > >> Alexandre "Sunny" Kovalenko wrote: > >> > >> > >>> On Mon, 2008-05-12 at 19:33 -0700, Sam Leffler wrote: > >>> > >>> > Alexandre "Sunny" Kovalenko wrote: > > > > I seem to be able to lock my machine by going into wpa_cli and > > asking it > > to 'reassoc'. The reason for question mark after "hard" is that > > debug > > information (caused by wlandebug and athdebug) is being printed on > > the > > console. The only way to get machine's attention is to hold power > > button > > for 8 seconds. > > > > > So this is just livelock due to console debug msgs. > > > >>> I am not sure, I have parsed this well enough, so I will try to > >>> clarify: > >>> machine becomes unresponsive *without* any debugging turned on, to an > >>> extent that pressing the power button twice *does not* generate ACPI > >>> console message (something to the tune of "going into S5 already -- > >>> gimme a break"). If I turn ath debugging on, I do see those messages, > >>> and only them, scrolling on the console. > >>> > >>> > >> Guess I misunderstood you. > >> > >> > > I have finally got enough time and equipment to investigate this > > further. Here are some conclusions: > > > > -- at this point (RELENG_7 as of July 9th around 15:30 EST) it is indeed > > a livelock. > > > > -- all system does, is executing ath_intr (if_ath.c) in the tight loop > > with the same status -- 0x1000 AKA HAL_INT_MIB. In order to eliminate > > possibility that I have caused livelock with the debug messages, I have > > put conditional panic() into ath_intr, as soon as sc->sc_stats.ast_mib > > reaches 10,000. Without any kind of the debug messages, it will be > > triggered within 40-60 seconds after starting of wpa_supplicant. > > > > -- I suspect that comment below, might not hold true on my equipment > > if (status & HAL_INT_MIB) { > >sc->sc_stats.ast_mib++; > >/* > > * Disable interrupts until we service the MIB > > * interrupt; otherwise it will continue to fire. > > */ > >ath_hal_intrset(ah, 0); > >/* > > * Let the hal handle the event. We assume it will < > > * clear whatever condition caused the interrupt. < > > */ > > ath_hal_mibevent(ah, &sc->sc_halstats); > > ath_hal_intrset(ah, sc->sc_imask); > > } > > > > My hardware is: > > ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, > > RF5413) > > ath0: mem 0xedf0-0xedf0 irq 17 at device 0.0 on > > pci3 > > ath0: [ITHREAD] > > ath0: using obsoleted if_watchdog interface > > ath0: Ethernet address: 00:16:cf:26:2f:3f > > ath0: mac 10.3 phy 6.1 radio 10.2 > > > > My wpa_supplicant.conf is: > > ctrl_interface=/var/run/wpa_supplicant > > ctrl_interface_group=wheel > > eapol_version=2 > > > > network={ > > ssid="XXX" > > scan_ssid=1 > > priority=1 > > proto=WPA > > pairwise=TKIP > > group=TKIP > > key_mgmt=WPA-PSK > > psk="xx" > > } > > > > Access point is Netgear WNDR3300-1B with 11N and 11G SSID set up to > > different values. Only 11G SSID is configured in wpa_supplicant.conf. In > > the test setup, AP is with 10' (3m) from the laptop. > > > > AP is successfully used by handful of Windows clients (including this > > same laptop) and iBook G4. > > > > Neither wpa_supplicant with '-d -d' nor wlandebug 0x show > > anything but normal scan. > > > > athdebug 0x loops with ath_intr: status 0x1000 until I power > > down my laptop. > > > > I would appreciate any suggestion on what I can investigate further -- > > at this point I have comfortable console setup and should be able to > > field requests for further information much better. > > > > > > > Are you running powerd? > > > >>> I do. And I just tried disabling it, and I could not reproduce the > >>> problem any more.
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
Alexandre "Sunny" Kovalenko wrote: On Sun, 2008-07-13 at 11:21 +0200, Martin wrote: Hi Sam, do you know if there is anything done about cbb(4)? I have many wireless adapters with ath(4), but only the one based on PCMCIA is making problems on FreeBSD. I cannot boot my notebook with the device inserted into the port, or it will render the system unusable (100% load on cbb(4)). And all I can see is the following: Jul 12 14:58:39 link kernel: ath0: ath_chan_set: unable to reset channel 1 (2412 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:40 link kernel: ath0: ath_chan_set: unable to reset channel 6 (2437 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:58:41 link kernel: ath0: ath_chan_set: unable to reset channel 7 (2442 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:58:42 link kernel: ath0: ath_chan_set: unable to reset channel 2 (2417 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:43 link kernel: ath0: ath_chan_set: unable to reset channel 3 (2422 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:44 link kernel: ath0: ath_chan_set: unable to reset channel 4 (2427 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:58:45 link kernel: ath0: ath_chan_set: unable to reset channel 5 (2432 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:45 link kernel: ath0: ath_chan_set: unable to reset channel 8 (2447 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:46 link kernel: ath0: ath_chan_set: unable to reset channel 9 (2452 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:47 link kernel: ath0: ath_chan_set: unable to reset channel 10 (2457 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:52 link kernel: ath0: device timeout Jul 12 14:58:52 link kernel: ath0: ath_reset: unable to reset hardware; hal status 3231908395 Jul 12 14:58:58 link kernel: ath0: ath_chan_set: unable to reset channel 1 (2412 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:58:59 link kernel: ath0: ath_chan_set: unable to reset channel 6 (2437 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:00 link kernel: ath0: ath_chan_set: unable to reset channel 7 (2442 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:01 link kernel: ath0: ath_chan_set: unable to reset channel 2 (2417 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:01 link kernel: ath0: ath_chan_set: unable to reset channel 3 (2422 Mhz, flags 0x480 hal flags 0xc0), hal status 3289233408 Jul 12 14:59:02 link kernel: ath0: ath_chan_set: unable to reset channel 4 (2427 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:03 link kernel: ath0: ath_chan_set: unable to reset channel 5 (2432 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:04 link kernel: ath0: ath_chan_set: unable to reset channel 8 (2447 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:05 link kernel: ath0: ath_chan_set: unable to reset channel 9 (2452 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:06 link kernel: ath0: ath_chan_set: unable to reset channel 10 (2457 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:10 link kernel: ath0: device timeout Jul 12 14:59:10 link kernel: ath0: ath_reset: unable to reset hardware; hal status 3231908395 Jul 12 14:59:11 link kernel: ath0: ath_chan_set: unable to reset channel 1 (2412 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:59:12 link kernel: ath0: ath_chan_set: unable to reset channel 6 (2437 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:13 link kernel: ath0: ath_chan_set: unable to reset channel 7 (2442 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:59:14 link kernel: ath0: ath_chan_set: unable to reset channel 2 (2417 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:15 link kernel: ath0: ath_chan_set: unable to reset channel 3 (2422 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:16 link kernel: ath0: unable to reset hardware; hal status 3868687084 Jul 12 14:59:16 link kernel: ath0: ath_chan_set: unable to reset channel 4 (2427 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:17 link kernel: ath0: ath_chan_set: unable to reset channel 11 (2462 Mhz, flags 0x480 hal flags 0xc0), hal status 3828804624 Jul 12 14:59:22 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 14:59:32 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 14:59:42 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 15:00:22 link last message repeated 4 times Jul 12 15:00:32 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 15:00:42 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 15:00:52 link kernel: ath0: unable to reset hardware; hal status 3868539700 wpa_supplicant is trying to access the card and switching channels, I suppose. A workarou
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
Martin wrote: Hi Sam, do you know if there is anything done about cbb(4)? I have many wireless adapters with ath(4), but only the one based on PCMCIA is making problems on FreeBSD. I cannot boot my notebook with the device inserted into the port, or it will render the system unusable (100% load on cbb(4)). And all I can see is the following: Jul 12 14:58:39 link kernel: ath0: ath_chan_set: unable to reset channel 1 (2412 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:40 link kernel: ath0: ath_chan_set: unable to reset channel 6 (2437 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:58:41 link kernel: ath0: ath_chan_set: unable to reset channel 7 (2442 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:58:42 link kernel: ath0: ath_chan_set: unable to reset channel 2 (2417 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:43 link kernel: ath0: ath_chan_set: unable to reset channel 3 (2422 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:44 link kernel: ath0: ath_chan_set: unable to reset channel 4 (2427 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:58:45 link kernel: ath0: ath_chan_set: unable to reset channel 5 (2432 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:45 link kernel: ath0: ath_chan_set: unable to reset channel 8 (2447 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:46 link kernel: ath0: ath_chan_set: unable to reset channel 9 (2452 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:47 link kernel: ath0: ath_chan_set: unable to reset channel 10 (2457 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:52 link kernel: ath0: device timeout Jul 12 14:58:52 link kernel: ath0: ath_reset: unable to reset hardware; hal status 3231908395 Jul 12 14:58:58 link kernel: ath0: ath_chan_set: unable to reset channel 1 (2412 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:58:59 link kernel: ath0: ath_chan_set: unable to reset channel 6 (2437 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:00 link kernel: ath0: ath_chan_set: unable to reset channel 7 (2442 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:01 link kernel: ath0: ath_chan_set: unable to reset channel 2 (2417 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:01 link kernel: ath0: ath_chan_set: unable to reset channel 3 (2422 Mhz, flags 0x480 hal flags 0xc0), hal status 3289233408 Jul 12 14:59:02 link kernel: ath0: ath_chan_set: unable to reset channel 4 (2427 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:03 link kernel: ath0: ath_chan_set: unable to reset channel 5 (2432 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:04 link kernel: ath0: ath_chan_set: unable to reset channel 8 (2447 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:05 link kernel: ath0: ath_chan_set: unable to reset channel 9 (2452 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:06 link kernel: ath0: ath_chan_set: unable to reset channel 10 (2457 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:10 link kernel: ath0: device timeout Jul 12 14:59:10 link kernel: ath0: ath_reset: unable to reset hardware; hal status 3231908395 Jul 12 14:59:11 link kernel: ath0: ath_chan_set: unable to reset channel 1 (2412 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:59:12 link kernel: ath0: ath_chan_set: unable to reset channel 6 (2437 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:13 link kernel: ath0: ath_chan_set: unable to reset channel 7 (2442 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:59:14 link kernel: ath0: ath_chan_set: unable to reset channel 2 (2417 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:15 link kernel: ath0: ath_chan_set: unable to reset channel 3 (2422 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:16 link kernel: ath0: unable to reset hardware; hal status 3868687084 Jul 12 14:59:16 link kernel: ath0: ath_chan_set: unable to reset channel 4 (2427 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:17 link kernel: ath0: ath_chan_set: unable to reset channel 11 (2462 Mhz, flags 0x480 hal flags 0xc0), hal status 3828804624 Jul 12 14:59:22 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 14:59:32 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 14:59:42 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 15:00:22 link last message repeated 4 times Jul 12 15:00:32 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 15:00:42 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 15:00:52 link kernel: ath0: unable to reset hardware; hal status 3868539700 wpa_supplicant is trying to access the card and switching channels, I suppose. A workaround that I'm using all the time is: 1) boot the notebook without the PCM
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
Hi Alexandre, > You can panic the box by unloading if_ath.ko while wpa_supplicant is > running on any form factor, so this one is not specific to the removable > devices. I am not talking about unloading kernel modules (I wouldn't unload if_ath by force and complain about it). The panic is happening while ejecting the PCMCIA card and wpa_supplicant is running. The cbb(4) problem with the high load is much more annoying for me, because the only solution is to reboot the system to get it stable again. One thing I also noticed is that if I have set a high resolution with vidcontrol (1600x1200) and the text terminal display becomes really slow, ath(4) has difficulties to load the device when it is inserted into the PCMCIA-slot. -- Martin Sugioarto ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
if_ral lockups [was: Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7]
Am Sonntag, 13. Juli 2008 14:37:14 schrieb Alexandre "Sunny" Kovalenko: > Just to chime into the discussion, I'm seeing something similar for if_ral since my latest update of 7-STABLE from about two weeks ago (which I did not see with a 7-STABLE from some time in march, so something must have changed in between, but I can't narrow it down exactly). I've since updated the kernel several times (the latest update being just last friday), but the problem hasn't disappeared so far. I get an oops, followed by a reboot when wpa_supplicant has associated an if_ral interface (which is a removable PCMCIA Ralink based WLAN card) when powerd is running. As soon as I disable powerd for my laptop (and it runs continuously at 1800 Mhz), the oopses disappear. As I'm getting a kernel oops, if anybody is interested, I'd more than happily reenable powerd to produce a kernel dump and make that available to anybody who is interested in debugging it, along with the kernel images, of course. I'm running a self-compiled GENERIC (with nothing changed except for the self-compiled part). I currently don't have the time to track this down any further myself, though, and the workaround works for me. Device info: ... cbb0: at device 9.0 on pci0 cardbus0: on cbb0 pccard0: <16-bit PCCard bus> on cbb0 cbb0: [ITHREAD] cbb1: at device 9.1 on pci0 cardbus1: on cbb1 pccard1: <16-bit PCCard bus> on cbb1 cbb1: [ITHREAD] ... ral0: mem 0x8800-0x88007fff irq 17 at device 0.0 on cardbus0 ral0: MAC/BBP RT2561C, RF RT2527 ral0: Ethernet address: 00:80:5a:51:23:53 ral0: [ITHREAD] ... uname: FreeBSD phoenix.modelnine.org 7.0-STABLE FreeBSD 7.0-STABLE #0: Fri Jul 11 12:28:25 CEST 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC i386 -- Heiko Wundram ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
On Sun, 2008-07-13 at 11:21 +0200, Martin wrote: > Hi Sam, > > do you know if there is anything done about cbb(4)? I have many > wireless adapters with ath(4), but only the one based on PCMCIA is > making problems on FreeBSD. > > I cannot boot my notebook with the device inserted into the port, or it > will render the system unusable (100% load on cbb(4)). > > And all I can see is the following: > Jul 12 14:58:39 link kernel: ath0: ath_chan_set: unable to reset channel 1 > (2412 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:58:40 link kernel: ath0: ath_chan_set: unable to reset channel 6 > (2437 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 > Jul 12 14:58:41 link kernel: ath0: ath_chan_set: unable to reset channel 7 > (2442 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 > Jul 12 14:58:42 link kernel: ath0: ath_chan_set: unable to reset channel 2 > (2417 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:58:43 link kernel: ath0: ath_chan_set: unable to reset channel 3 > (2422 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:58:44 link kernel: ath0: ath_chan_set: unable to reset channel 4 > (2427 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 > Jul 12 14:58:45 link kernel: ath0: ath_chan_set: unable to reset channel 5 > (2432 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:58:45 link kernel: ath0: ath_chan_set: unable to reset channel 8 > (2447 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:58:46 link kernel: ath0: ath_chan_set: unable to reset channel 9 > (2452 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:58:47 link kernel: ath0: ath_chan_set: unable to reset channel 10 > (2457 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:58:52 link kernel: ath0: device timeout > Jul 12 14:58:52 link kernel: ath0: ath_reset: unable to reset hardware; hal > status 3231908395 > Jul 12 14:58:58 link kernel: ath0: ath_chan_set: unable to reset channel 1 > (2412 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 > Jul 12 14:58:59 link kernel: ath0: ath_chan_set: unable to reset channel 6 > (2437 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:59:00 link kernel: ath0: ath_chan_set: unable to reset channel 7 > (2442 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:59:01 link kernel: ath0: ath_chan_set: unable to reset channel 2 > (2417 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:59:01 link kernel: ath0: ath_chan_set: unable to reset channel 3 > (2422 Mhz, flags 0x480 hal flags 0xc0), hal status 3289233408 > Jul 12 14:59:02 link kernel: ath0: ath_chan_set: unable to reset channel 4 > (2427 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:59:03 link kernel: ath0: ath_chan_set: unable to reset channel 5 > (2432 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:59:04 link kernel: ath0: ath_chan_set: unable to reset channel 8 > (2447 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:59:05 link kernel: ath0: ath_chan_set: unable to reset channel 9 > (2452 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:59:06 link kernel: ath0: ath_chan_set: unable to reset channel 10 > (2457 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:59:10 link kernel: ath0: device timeout > Jul 12 14:59:10 link kernel: ath0: ath_reset: unable to reset hardware; hal > status 3231908395 > Jul 12 14:59:11 link kernel: ath0: ath_chan_set: unable to reset channel 1 > (2412 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 > Jul 12 14:59:12 link kernel: ath0: ath_chan_set: unable to reset channel 6 > (2437 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:59:13 link kernel: ath0: ath_chan_set: unable to reset channel 7 > (2442 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 > Jul 12 14:59:14 link kernel: ath0: ath_chan_set: unable to reset channel 2 > (2417 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:59:15 link kernel: ath0: ath_chan_set: unable to reset channel 3 > (2422 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:59:16 link kernel: ath0: unable to reset hardware; hal status > 3868687084 > Jul 12 14:59:16 link kernel: ath0: ath_chan_set: unable to reset channel 4 > (2427 Mhz, flags 0x480 hal flags 0xc0), hal status 0 > Jul 12 14:59:17 link kernel: ath0: ath_chan_set: unable to reset channel 11 > (2462 Mhz, flags 0x480 hal flags 0xc0), hal status 3828804624 > Jul 12 14:59:22 link kernel: ath0: unable to reset hardware; hal status > 3868539700 > Jul 12 14:59:32 link kernel: ath0: unable to reset hardware; hal status > 3868539700 > Jul 12 14:59:42 link kernel: ath0: unable to reset hardware; hal status > 3868539700 > Jul 12 15:00:22 link last message repeated 4 times > Jul 12 15:00:32 link kernel: ath0: unable to reset hardware; hal status > 3868539700 > Jul 12 15:00:42 link kernel: ath0: unable to reset hardware; hal status > 3868539700 > Jul 12 15:00:52 link kernel: ath0: unable to r
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
Hi Sam, do you know if there is anything done about cbb(4)? I have many wireless adapters with ath(4), but only the one based on PCMCIA is making problems on FreeBSD. I cannot boot my notebook with the device inserted into the port, or it will render the system unusable (100% load on cbb(4)). And all I can see is the following: Jul 12 14:58:39 link kernel: ath0: ath_chan_set: unable to reset channel 1 (2412 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:40 link kernel: ath0: ath_chan_set: unable to reset channel 6 (2437 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:58:41 link kernel: ath0: ath_chan_set: unable to reset channel 7 (2442 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:58:42 link kernel: ath0: ath_chan_set: unable to reset channel 2 (2417 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:43 link kernel: ath0: ath_chan_set: unable to reset channel 3 (2422 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:44 link kernel: ath0: ath_chan_set: unable to reset channel 4 (2427 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:58:45 link kernel: ath0: ath_chan_set: unable to reset channel 5 (2432 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:45 link kernel: ath0: ath_chan_set: unable to reset channel 8 (2447 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:46 link kernel: ath0: ath_chan_set: unable to reset channel 9 (2452 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:47 link kernel: ath0: ath_chan_set: unable to reset channel 10 (2457 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:58:52 link kernel: ath0: device timeout Jul 12 14:58:52 link kernel: ath0: ath_reset: unable to reset hardware; hal status 3231908395 Jul 12 14:58:58 link kernel: ath0: ath_chan_set: unable to reset channel 1 (2412 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:58:59 link kernel: ath0: ath_chan_set: unable to reset channel 6 (2437 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:00 link kernel: ath0: ath_chan_set: unable to reset channel 7 (2442 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:01 link kernel: ath0: ath_chan_set: unable to reset channel 2 (2417 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:01 link kernel: ath0: ath_chan_set: unable to reset channel 3 (2422 Mhz, flags 0x480 hal flags 0xc0), hal status 3289233408 Jul 12 14:59:02 link kernel: ath0: ath_chan_set: unable to reset channel 4 (2427 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:03 link kernel: ath0: ath_chan_set: unable to reset channel 5 (2432 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:04 link kernel: ath0: ath_chan_set: unable to reset channel 8 (2447 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:05 link kernel: ath0: ath_chan_set: unable to reset channel 9 (2452 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:06 link kernel: ath0: ath_chan_set: unable to reset channel 10 (2457 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:10 link kernel: ath0: device timeout Jul 12 14:59:10 link kernel: ath0: ath_reset: unable to reset hardware; hal status 3231908395 Jul 12 14:59:11 link kernel: ath0: ath_chan_set: unable to reset channel 1 (2412 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:59:12 link kernel: ath0: ath_chan_set: unable to reset channel 6 (2437 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:13 link kernel: ath0: ath_chan_set: unable to reset channel 7 (2442 Mhz, flags 0x480 hal flags 0xc0), hal status 3233070656 Jul 12 14:59:14 link kernel: ath0: ath_chan_set: unable to reset channel 2 (2417 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:15 link kernel: ath0: ath_chan_set: unable to reset channel 3 (2422 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:16 link kernel: ath0: unable to reset hardware; hal status 3868687084 Jul 12 14:59:16 link kernel: ath0: ath_chan_set: unable to reset channel 4 (2427 Mhz, flags 0x480 hal flags 0xc0), hal status 0 Jul 12 14:59:17 link kernel: ath0: ath_chan_set: unable to reset channel 11 (2462 Mhz, flags 0x480 hal flags 0xc0), hal status 3828804624 Jul 12 14:59:22 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 14:59:32 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 14:59:42 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 15:00:22 link last message repeated 4 times Jul 12 15:00:32 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 15:00:42 link kernel: ath0: unable to reset hardware; hal status 3868539700 Jul 12 15:00:52 link kernel: ath0: unable to reset hardware; hal status 3868539700 wpa_supplicant is trying to access the card and switching channels, I suppose. A workaround that I'm using all the time is: 1) boot the notebook without the PCMCIA card 2) aft
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
Alexandre "Sunny" Kovalenko wrote: On Sat, 2008-07-12 at 09:57 -0700, Sam Leffler wrote: Alexandre "Sunny" Kovalenko wrote: On Fri, 2008-07-11 at 20:29 -0700, Sam Leffler wrote: Alexandre "Sunny" Kovalenko wrote: On Fri, 2008-05-16 at 12:23 -0400, Sam Leffler wrote: Alexandre "Sunny" Kovalenko wrote: On Mon, 2008-05-12 at 19:33 -0700, Sam Leffler wrote: Alexandre "Sunny" Kovalenko wrote: I seem to be able to lock my machine by going into wpa_cli and asking it to 'reassoc'. The reason for question mark after "hard" is that debug information (caused by wlandebug and athdebug) is being printed on the console. The only way to get machine's attention is to hold power button for 8 seconds. So this is just livelock due to console debug msgs. I am not sure, I have parsed this well enough, so I will try to clarify: machine becomes unresponsive *without* any debugging turned on, to an extent that pressing the power button twice *does not* generate ACPI console message (something to the tune of "going into S5 already -- gimme a break"). If I turn ath debugging on, I do see those messages, and only them, scrolling on the console. Guess I misunderstood you. I have finally got enough time and equipment to investigate this further. Here are some conclusions: -- at this point (RELENG_7 as of July 9th around 15:30 EST) it is indeed a livelock. -- all system does, is executing ath_intr (if_ath.c) in the tight loop with the same status -- 0x1000 AKA HAL_INT_MIB. In order to eliminate possibility that I have caused livelock with the debug messages, I have put conditional panic() into ath_intr, as soon as sc->sc_stats.ast_mib reaches 10,000. Without any kind of the debug messages, it will be triggered within 40-60 seconds after starting of wpa_supplicant. -- I suspect that comment below, might not hold true on my equipment if (status & HAL_INT_MIB) { sc->sc_stats.ast_mib++; /* * Disable interrupts until we service the MIB * interrupt; otherwise it will continue to fire. */ ath_hal_intrset(ah, 0); /* * Let the hal handle the event. We assume it will < * clear whatever condition caused the interrupt. < */ ath_hal_mibevent(ah, &sc->sc_halstats); ath_hal_intrset(ah, sc->sc_imask); } My hardware is: ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) ath0: mem 0xedf0-0xedf0 irq 17 at device 0.0 on pci3 ath0: [ITHREAD] ath0: using obsoleted if_watchdog interface ath0: Ethernet address: 00:16:cf:26:2f:3f ath0: mac 10.3 phy 6.1 radio 10.2 My wpa_supplicant.conf is: ctrl_interface=/var/run/wpa_supplicant ctrl_interface_group=wheel eapol_version=2 network={ ssid="XXX" scan_ssid=1 priority=1 proto=WPA pairwise=TKIP group=TKIP key_mgmt=WPA-PSK psk="xx" } Access point is Netgear WNDR3300-1B with 11N and 11G SSID set up to different values. Only 11G SSID is configured in wpa_supplicant.conf. In the test setup, AP is with 10' (3m) from the laptop. AP is successfully used by handful of Windows clients (including this same laptop) and iBook G4. Neither wpa_supplicant with '-d -d' nor wlandebug 0x show anything but normal scan. athdebug 0x loops with ath_intr: status 0x1000 until I power down my laptop. I would appreciate any suggestion on what I can investigate further -- at this point I have comfortable console setup and should be able to field requests for further information much better. Are you running powerd? I do. And I just tried disabling it, and I could not reproduce the problem any more. Is there any way to reconcile if_ath with powerd? Don't know. There appear to be two issues. When the MIB interrupts arrive the kernel may service them w/ the cpu at a reduced clock frequency. Since powered is currently the only mechanism for increasing the frequency and it runs in user space it can take a while to bump the clock rate leading to livelock (because the logic to reduce the _cause_ of the MIB interrupt takes a long time to run). John Baldwin suggested raising the clock frequency when handling interrupts in the kernel but nothing has been done to make that happen. Separately there is a question as to why the MIB interrupts are happening at all. This is possibly due to misprogramming of the baseband h/w in the ath card. Unfortunately I've been trying to get Atheros to help understand/resolve this question for a very long time (as their code also exhibits this behaviour) but they've been unresponsive. I have some experimental code to address this in new hal versions (such as 0.10.5.6 available in http://www.freebsd.org/~sam) but apparently it does not
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
On Sat, 2008-07-12 at 09:57 -0700, Sam Leffler wrote: > Alexandre "Sunny" Kovalenko wrote: > > On Fri, 2008-07-11 at 20:29 -0700, Sam Leffler wrote: > > > >> Alexandre "Sunny" Kovalenko wrote: > >> > >>> On Fri, 2008-05-16 at 12:23 -0400, Sam Leffler wrote: > >>> > Alexandre "Sunny" Kovalenko wrote: > > > On Mon, 2008-05-12 at 19:33 -0700, Sam Leffler wrote: > > > >> Alexandre "Sunny" Kovalenko wrote: > >> > >>> I seem to be able to lock my machine by going into wpa_cli and asking > >>> it > >>> to 'reassoc'. The reason for question mark after "hard" is that debug > >>> information (caused by wlandebug and athdebug) is being printed on the > >>> console. The only way to get machine's attention is to hold power > >>> button > >>> for 8 seconds. > >>> > >> So this is just livelock due to console debug msgs. > >> > > I am not sure, I have parsed this well enough, so I will try to clarify: > > machine becomes unresponsive *without* any debugging turned on, to an > > extent that pressing the power button twice *does not* generate ACPI > > console message (something to the tune of "going into S5 already -- > > gimme a break"). If I turn ath debugging on, I do see those messages, > > and only them, scrolling on the console. > > > Guess I misunderstood you. > > >>> I have finally got enough time and equipment to investigate this > >>> further. Here are some conclusions: > >>> > >>> -- at this point (RELENG_7 as of July 9th around 15:30 EST) it is indeed > >>> a livelock. > >>> > >>> -- all system does, is executing ath_intr (if_ath.c) in the tight loop > >>> with the same status -- 0x1000 AKA HAL_INT_MIB. In order to eliminate > >>> possibility that I have caused livelock with the debug messages, I have > >>> put conditional panic() into ath_intr, as soon as sc->sc_stats.ast_mib > >>> reaches 10,000. Without any kind of the debug messages, it will be > >>> triggered within 40-60 seconds after starting of wpa_supplicant. > >>> > >>> -- I suspect that comment below, might not hold true on my equipment > >>> if (status & HAL_INT_MIB) { > >>>sc->sc_stats.ast_mib++; > >>>/* > >>> * Disable interrupts until we service the MIB > >>> * interrupt; otherwise it will continue to fire. > >>> */ > >>>ath_hal_intrset(ah, 0); > >>>/* > >>> * Let the hal handle the event. We assume it will < > >>> * clear whatever condition caused the interrupt. < > >>> */ > >>> ath_hal_mibevent(ah, &sc->sc_halstats); > >>> ath_hal_intrset(ah, sc->sc_imask); > >>> } > >>> > >>> My hardware is: > >>> ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, > >>> RF5413) > >>> ath0: mem 0xedf0-0xedf0 irq 17 at device 0.0 on > >>> pci3 > >>> ath0: [ITHREAD] > >>> ath0: using obsoleted if_watchdog interface > >>> ath0: Ethernet address: 00:16:cf:26:2f:3f > >>> ath0: mac 10.3 phy 6.1 radio 10.2 > >>> > >>> My wpa_supplicant.conf is: > >>> ctrl_interface=/var/run/wpa_supplicant > >>> ctrl_interface_group=wheel > >>> eapol_version=2 > >>> > >>> network={ > >>> ssid="XXX" > >>> scan_ssid=1 > >>> priority=1 > >>> proto=WPA > >>> pairwise=TKIP > >>> group=TKIP > >>> key_mgmt=WPA-PSK > >>> psk="xx" > >>> } > >>> > >>> Access point is Netgear WNDR3300-1B with 11N and 11G SSID set up to > >>> different values. Only 11G SSID is configured in wpa_supplicant.conf. In > >>> the test setup, AP is with 10' (3m) from the laptop. > >>> > >>> AP is successfully used by handful of Windows clients (including this > >>> same laptop) and iBook G4. > >>> > >>> Neither wpa_supplicant with '-d -d' nor wlandebug 0x show > >>> anything but normal scan. > >>> > >>> athdebug 0x loops with ath_intr: status 0x1000 until I power > >>> down my laptop. > >>> > >>> I would appreciate any suggestion on what I can investigate further -- > >>> at this point I have comfortable console setup and should be able to > >>> field requests for further information much better. > >>> > >>> > >> Are you running powerd? > >> > > I do. And I just tried disabling it, and I could not reproduce the > > problem any more. Is there any way to reconcile if_ath with powerd? > > > > > > Don't know. There appear to be two issues. When the MIB interrupts > arrive the kernel may service them w/ the cpu at a reduced clock > frequency. Since powered is currently the only mechanism for increasing > the frequency and it runs in user space it can take a while to bump the > clock rate leading to livelock (because the logic to reduce the _cause_ > of the MIB interrupt takes a long time to run). John Baldwin suggested > raising the clock frequency when handling interrupts in the kernel but > nothing has been done to make that happen. > > Sep
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
Alexandre "Sunny" Kovalenko wrote: On Fri, 2008-07-11 at 20:29 -0700, Sam Leffler wrote: Alexandre "Sunny" Kovalenko wrote: On Fri, 2008-05-16 at 12:23 -0400, Sam Leffler wrote: Alexandre "Sunny" Kovalenko wrote: On Mon, 2008-05-12 at 19:33 -0700, Sam Leffler wrote: Alexandre "Sunny" Kovalenko wrote: I seem to be able to lock my machine by going into wpa_cli and asking it to 'reassoc'. The reason for question mark after "hard" is that debug information (caused by wlandebug and athdebug) is being printed on the console. The only way to get machine's attention is to hold power button for 8 seconds. So this is just livelock due to console debug msgs. I am not sure, I have parsed this well enough, so I will try to clarify: machine becomes unresponsive *without* any debugging turned on, to an extent that pressing the power button twice *does not* generate ACPI console message (something to the tune of "going into S5 already -- gimme a break"). If I turn ath debugging on, I do see those messages, and only them, scrolling on the console. Guess I misunderstood you. I have finally got enough time and equipment to investigate this further. Here are some conclusions: -- at this point (RELENG_7 as of July 9th around 15:30 EST) it is indeed a livelock. -- all system does, is executing ath_intr (if_ath.c) in the tight loop with the same status -- 0x1000 AKA HAL_INT_MIB. In order to eliminate possibility that I have caused livelock with the debug messages, I have put conditional panic() into ath_intr, as soon as sc->sc_stats.ast_mib reaches 10,000. Without any kind of the debug messages, it will be triggered within 40-60 seconds after starting of wpa_supplicant. -- I suspect that comment below, might not hold true on my equipment if (status & HAL_INT_MIB) { sc->sc_stats.ast_mib++; /* * Disable interrupts until we service the MIB * interrupt; otherwise it will continue to fire. */ ath_hal_intrset(ah, 0); /* * Let the hal handle the event. We assume it will < * clear whatever condition caused the interrupt. < */ ath_hal_mibevent(ah, &sc->sc_halstats); ath_hal_intrset(ah, sc->sc_imask); } My hardware is: ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) ath0: mem 0xedf0-0xedf0 irq 17 at device 0.0 on pci3 ath0: [ITHREAD] ath0: using obsoleted if_watchdog interface ath0: Ethernet address: 00:16:cf:26:2f:3f ath0: mac 10.3 phy 6.1 radio 10.2 My wpa_supplicant.conf is: ctrl_interface=/var/run/wpa_supplicant ctrl_interface_group=wheel eapol_version=2 network={ ssid="XXX" scan_ssid=1 priority=1 proto=WPA pairwise=TKIP group=TKIP key_mgmt=WPA-PSK psk="xx" } Access point is Netgear WNDR3300-1B with 11N and 11G SSID set up to different values. Only 11G SSID is configured in wpa_supplicant.conf. In the test setup, AP is with 10' (3m) from the laptop. AP is successfully used by handful of Windows clients (including this same laptop) and iBook G4. Neither wpa_supplicant with '-d -d' nor wlandebug 0x show anything but normal scan. athdebug 0x loops with ath_intr: status 0x1000 until I power down my laptop. I would appreciate any suggestion on what I can investigate further -- at this point I have comfortable console setup and should be able to field requests for further information much better. Are you running powerd? I do. And I just tried disabling it, and I could not reproduce the problem any more. Is there any way to reconcile if_ath with powerd? Don't know. There appear to be two issues. When the MIB interrupts arrive the kernel may service them w/ the cpu at a reduced clock frequency. Since powered is currently the only mechanism for increasing the frequency and it runs in user space it can take a while to bump the clock rate leading to livelock (because the logic to reduce the _cause_ of the MIB interrupt takes a long time to run). John Baldwin suggested raising the clock frequency when handling interrupts in the kernel but nothing has been done to make that happen. Separately there is a question as to why the MIB interrupts are happening at all. This is possibly due to misprogramming of the baseband h/w in the ath card. Unfortunately I've been trying to get Atheros to help understand/resolve this question for a very long time (as their code also exhibits this behaviour) but they've been unresponsive. I have some experimental code to address this in new hal versions (such as 0.10.5.6 available in http://www.freebsd.org/~sam) but apparently it does not entirely fix the problem. Sam ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
On Fri, 2008-07-11 at 20:29 -0700, Sam Leffler wrote: > Alexandre "Sunny" Kovalenko wrote: > > On Fri, 2008-05-16 at 12:23 -0400, Sam Leffler wrote: > >> Alexandre "Sunny" Kovalenko wrote: > >>> On Mon, 2008-05-12 at 19:33 -0700, Sam Leffler wrote: > Alexandre "Sunny" Kovalenko wrote: > > I seem to be able to lock my machine by going into wpa_cli and asking it > > to 'reassoc'. The reason for question mark after "hard" is that debug > > information (caused by wlandebug and athdebug) is being printed on the > > console. The only way to get machine's attention is to hold power button > > for 8 seconds. > So this is just livelock due to console debug msgs. > >>> I am not sure, I have parsed this well enough, so I will try to clarify: > >>> machine becomes unresponsive *without* any debugging turned on, to an > >>> extent that pressing the power button twice *does not* generate ACPI > >>> console message (something to the tune of "going into S5 already -- > >>> gimme a break"). If I turn ath debugging on, I do see those messages, > >>> and only them, scrolling on the console. > >> Guess I misunderstood you. > > > > I have finally got enough time and equipment to investigate this > > further. Here are some conclusions: > > > > -- at this point (RELENG_7 as of July 9th around 15:30 EST) it is indeed > > a livelock. > > > > -- all system does, is executing ath_intr (if_ath.c) in the tight loop > > with the same status -- 0x1000 AKA HAL_INT_MIB. In order to eliminate > > possibility that I have caused livelock with the debug messages, I have > > put conditional panic() into ath_intr, as soon as sc->sc_stats.ast_mib > > reaches 10,000. Without any kind of the debug messages, it will be > > triggered within 40-60 seconds after starting of wpa_supplicant. > > > > -- I suspect that comment below, might not hold true on my equipment > > if (status & HAL_INT_MIB) { > >sc->sc_stats.ast_mib++; > >/* > > * Disable interrupts until we service the MIB > > * interrupt; otherwise it will continue to fire. > > */ > >ath_hal_intrset(ah, 0); > >/* > > * Let the hal handle the event. We assume it will < > > * clear whatever condition caused the interrupt. < > > */ > > ath_hal_mibevent(ah, &sc->sc_halstats); > > ath_hal_intrset(ah, sc->sc_imask); > > } > > > > My hardware is: > > ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, > > RF5413) > > ath0: mem 0xedf0-0xedf0 irq 17 at device 0.0 on > > pci3 > > ath0: [ITHREAD] > > ath0: using obsoleted if_watchdog interface > > ath0: Ethernet address: 00:16:cf:26:2f:3f > > ath0: mac 10.3 phy 6.1 radio 10.2 > > > > My wpa_supplicant.conf is: > > ctrl_interface=/var/run/wpa_supplicant > > ctrl_interface_group=wheel > > eapol_version=2 > > > > network={ > > ssid="XXX" > > scan_ssid=1 > > priority=1 > > proto=WPA > > pairwise=TKIP > > group=TKIP > > key_mgmt=WPA-PSK > > psk="xx" > > } > > > > Access point is Netgear WNDR3300-1B with 11N and 11G SSID set up to > > different values. Only 11G SSID is configured in wpa_supplicant.conf. In > > the test setup, AP is with 10' (3m) from the laptop. > > > > AP is successfully used by handful of Windows clients (including this > > same laptop) and iBook G4. > > > > Neither wpa_supplicant with '-d -d' nor wlandebug 0x show > > anything but normal scan. > > > > athdebug 0x loops with ath_intr: status 0x1000 until I power > > down my laptop. > > > > I would appreciate any suggestion on what I can investigate further -- > > at this point I have comfortable console setup and should be able to > > field requests for further information much better. > > > > Are you running powerd? I do. And I just tried disabling it, and I could not reproduce the problem any more. Is there any way to reconcile if_ath with powerd? Thank you, -- Alexandre "Sunny" Kovalenko (Олександр Коваленко) ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
Alexandre "Sunny" Kovalenko wrote: On Fri, 2008-05-16 at 12:23 -0400, Sam Leffler wrote: Alexandre "Sunny" Kovalenko wrote: On Mon, 2008-05-12 at 19:33 -0700, Sam Leffler wrote: Alexandre "Sunny" Kovalenko wrote: I seem to be able to lock my machine by going into wpa_cli and asking it to 'reassoc'. The reason for question mark after "hard" is that debug information (caused by wlandebug and athdebug) is being printed on the console. The only way to get machine's attention is to hold power button for 8 seconds. So this is just livelock due to console debug msgs. I am not sure, I have parsed this well enough, so I will try to clarify: machine becomes unresponsive *without* any debugging turned on, to an extent that pressing the power button twice *does not* generate ACPI console message (something to the tune of "going into S5 already -- gimme a break"). If I turn ath debugging on, I do see those messages, and only them, scrolling on the console. Guess I misunderstood you. I have finally got enough time and equipment to investigate this further. Here are some conclusions: -- at this point (RELENG_7 as of July 9th around 15:30 EST) it is indeed a livelock. -- all system does, is executing ath_intr (if_ath.c) in the tight loop with the same status -- 0x1000 AKA HAL_INT_MIB. In order to eliminate possibility that I have caused livelock with the debug messages, I have put conditional panic() into ath_intr, as soon as sc->sc_stats.ast_mib reaches 10,000. Without any kind of the debug messages, it will be triggered within 40-60 seconds after starting of wpa_supplicant. -- I suspect that comment below, might not hold true on my equipment if (status & HAL_INT_MIB) { sc->sc_stats.ast_mib++; /* * Disable interrupts until we service the MIB * interrupt; otherwise it will continue to fire. */ ath_hal_intrset(ah, 0); /* * Let the hal handle the event. We assume it will < * clear whatever condition caused the interrupt. < */ ath_hal_mibevent(ah, &sc->sc_halstats); ath_hal_intrset(ah, sc->sc_imask); } My hardware is: ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) ath0: mem 0xedf0-0xedf0 irq 17 at device 0.0 on pci3 ath0: [ITHREAD] ath0: using obsoleted if_watchdog interface ath0: Ethernet address: 00:16:cf:26:2f:3f ath0: mac 10.3 phy 6.1 radio 10.2 My wpa_supplicant.conf is: ctrl_interface=/var/run/wpa_supplicant ctrl_interface_group=wheel eapol_version=2 network={ ssid="XXX" scan_ssid=1 priority=1 proto=WPA pairwise=TKIP group=TKIP key_mgmt=WPA-PSK psk="xx" } Access point is Netgear WNDR3300-1B with 11N and 11G SSID set up to different values. Only 11G SSID is configured in wpa_supplicant.conf. In the test setup, AP is with 10' (3m) from the laptop. AP is successfully used by handful of Windows clients (including this same laptop) and iBook G4. Neither wpa_supplicant with '-d -d' nor wlandebug 0x show anything but normal scan. athdebug 0x loops with ath_intr: status 0x1000 until I power down my laptop. I would appreciate any suggestion on what I can investigate further -- at this point I have comfortable console setup and should be able to field requests for further information much better. Are you running powerd? Sam ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
On Fri, 2008-05-16 at 12:23 -0400, Sam Leffler wrote: > Alexandre "Sunny" Kovalenko wrote: > > On Mon, 2008-05-12 at 19:33 -0700, Sam Leffler wrote: > >> Alexandre "Sunny" Kovalenko wrote: > >>> I seem to be able to lock my machine by going into wpa_cli and asking it > >>> to 'reassoc'. The reason for question mark after "hard" is that debug > >>> information (caused by wlandebug and athdebug) is being printed on the > >>> console. The only way to get machine's attention is to hold power button > >>> for 8 seconds. > >> So this is just livelock due to console debug msgs. > > I am not sure, I have parsed this well enough, so I will try to clarify: > > machine becomes unresponsive *without* any debugging turned on, to an > > extent that pressing the power button twice *does not* generate ACPI > > console message (something to the tune of "going into S5 already -- > > gimme a break"). If I turn ath debugging on, I do see those messages, > > and only them, scrolling on the console. > > Guess I misunderstood you. I have finally got enough time and equipment to investigate this further. Here are some conclusions: -- at this point (RELENG_7 as of July 9th around 15:30 EST) it is indeed a livelock. -- all system does, is executing ath_intr (if_ath.c) in the tight loop with the same status -- 0x1000 AKA HAL_INT_MIB. In order to eliminate possibility that I have caused livelock with the debug messages, I have put conditional panic() into ath_intr, as soon as sc->sc_stats.ast_mib reaches 10,000. Without any kind of the debug messages, it will be triggered within 40-60 seconds after starting of wpa_supplicant. -- I suspect that comment below, might not hold true on my equipment if (status & HAL_INT_MIB) { sc->sc_stats.ast_mib++; /* * Disable interrupts until we service the MIB * interrupt; otherwise it will continue to fire. */ ath_hal_intrset(ah, 0); /* * Let the hal handle the event. We assume it will < * clear whatever condition caused the interrupt. < */ ath_hal_mibevent(ah, &sc->sc_halstats); ath_hal_intrset(ah, sc->sc_imask); } My hardware is: ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) ath0: mem 0xedf0-0xedf0 irq 17 at device 0.0 on pci3 ath0: [ITHREAD] ath0: using obsoleted if_watchdog interface ath0: Ethernet address: 00:16:cf:26:2f:3f ath0: mac 10.3 phy 6.1 radio 10.2 My wpa_supplicant.conf is: ctrl_interface=/var/run/wpa_supplicant ctrl_interface_group=wheel eapol_version=2 network={ ssid="XXX" scan_ssid=1 priority=1 proto=WPA pairwise=TKIP group=TKIP key_mgmt=WPA-PSK psk="xx" } Access point is Netgear WNDR3300-1B with 11N and 11G SSID set up to different values. Only 11G SSID is configured in wpa_supplicant.conf. In the test setup, AP is with 10' (3m) from the laptop. AP is successfully used by handful of Windows clients (including this same laptop) and iBook G4. Neither wpa_supplicant with '-d -d' nor wlandebug 0x show anything but normal scan. athdebug 0x loops with ath_intr: status 0x1000 until I power down my laptop. I would appreciate any suggestion on what I can investigate further -- at this point I have comfortable console setup and should be able to field requests for further information much better. -- Alexandre "Sunny" Kovalenko (Олександр Коваленко) ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
Alexandre "Sunny" Kovalenko wrote: On Mon, 2008-05-12 at 19:33 -0700, Sam Leffler wrote: Alexandre "Sunny" Kovalenko wrote: I seem to be able to lock my machine by going into wpa_cli and asking it to 'reassoc'. The reason for question mark after "hard" is that debug information (caused by wlandebug and athdebug) is being printed on the console. The only way to get machine's attention is to hold power button for 8 seconds. So this is just livelock due to console debug msgs. I am not sure, I have parsed this well enough, so I will try to clarify: machine becomes unresponsive *without* any debugging turned on, to an extent that pressing the power button twice *does not* generate ACPI console message (something to the tune of "going into S5 already -- gimme a break"). If I turn ath debugging on, I do see those messages, and only them, scrolling on the console. Guess I misunderstood you. Note: manual reassociation is just the handy way to reproduce the problem -- I have had machine locking up on me the whole day long completely on its own. Below are, what I think, relevant pieces of information. If anything is missing, please, chastise me appropriately and will do my best to provide. I have rigged firewire console, but am unable to break into the debugger locally or remotely. I see no log msgs. I am sorry -- mailman must have eaten it up -- I have posted them here now: http://members.verizon.net/~akovalenko/Misc/reassoc.log.gz All I see in the log is normal scanning. While I am on the subject, I would appreciate couple of the troubleshooting suggestions: * is there any way to get sysctl dev.ath.0.debug to appear, other then defining ATH_DEBUG in something like /usr/src/sys/dev/ath/ah_osdep.h? options ATH_DEBUG That does not seem to work for if_ath built as the module, sorry for not being clear in that respect. If you are build a module then you must edit the Makefile to add the option. Feel free to provide a patch to improve this situation. * is there minimal, but still usable mask for athdebug and wlandebug? I have started with 0x and kept trimming likely high-volume settings until output slowed down to the reasonable pace. Why do you want debug msgs from ath? The debug msgs from wlandebug depend on what you're trying to debug. Because neither wpa_supplicant (quoted below), nor wlandebug (in the URL above) gave me the answer -- it looks like we are going into the scan with the specific SSID in mind and never come back, so I went for the next level. However, could you, please, clarify that I understood you correctly -- you *do not* want to see mix of wlandebug and athdebug messages in the report, and I should turn wlandebug off before turning athdebug on, right? wlandebug msgs are typically low duty and can be left enabled when you add driver-level debug msgs. But I can't see from the log you sent what is going on. Try reducing the noise by eliminating all the ath debug msgs; maybe provide one log w/ only wlan msgs and one w/ wlan+ath. I suggest that when debugging you start from the highest layer and move downward. If you can't find what you need in a wpa_supplicant log then turn on msgs in net80211 with wlandebug. If that doesn't tell you what you need then move to the driver. Blindly turning everything on can easily livelock your system. That's what I did -- what I have posted is the end result of the walking down that chain, and I assumed, possibly incorrectly, that you would want result of all three to put things in context. I do apologize for the misunderstanding. For high volume msgs I often do something like: athdebug +intr; sleep 1; athdebug -intr or athdebug +intr; read x; athdebug -intr so a carriage return will disable msgs. Thank you for the suggestion. * what facility does wpa_supplicant use, when forced to syslog by -s switch? trouble% cd /data/freebsd/head/contrib/wpa_supplicant/ trouble% grep openlog *.c common.c: openlog("wpa_supplicant", LOG_PID | LOG_NDELAY, LOG_DAEMON); Thank you, should have done this myself, sorry. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
On Mon, 2008-05-12 at 19:33 -0700, Sam Leffler wrote: > Alexandre "Sunny" Kovalenko wrote: > > I seem to be able to lock my machine by going into wpa_cli and asking it > > to 'reassoc'. The reason for question mark after "hard" is that debug > > information (caused by wlandebug and athdebug) is being printed on the > > console. The only way to get machine's attention is to hold power button > > for 8 seconds. > > So this is just livelock due to console debug msgs. I am not sure, I have parsed this well enough, so I will try to clarify: machine becomes unresponsive *without* any debugging turned on, to an extent that pressing the power button twice *does not* generate ACPI console message (something to the tune of "going into S5 already -- gimme a break"). If I turn ath debugging on, I do see those messages, and only them, scrolling on the console. > > > > > Note: manual reassociation is just the handy way to reproduce the > > problem -- I have had machine locking up on me the whole day long > > completely on its own. > > > > Below are, what I think, relevant pieces of information. If anything is > > missing, please, chastise me appropriately and will do my best to > > provide. I have rigged firewire console, but am unable to break into the > > debugger locally or remotely. > > I see no log msgs. I am sorry -- mailman must have eaten it up -- I have posted them here now: http://members.verizon.net/~akovalenko/Misc/reassoc.log.gz > > > > > While I am on the subject, I would appreciate couple of the > > troubleshooting suggestions: > > * is there any way to get sysctl dev.ath.0.debug to appear, other then > > defining ATH_DEBUG in something like /usr/src/sys/dev/ath/ah_osdep.h? > > options ATH_DEBUG That does not seem to work for if_ath built as the module, sorry for not being clear in that respect. > > > * is there minimal, but still usable mask for athdebug and wlandebug? I > > have started with 0x and kept trimming likely high-volume > > settings until output slowed down to the reasonable pace. > > Why do you want debug msgs from ath? The debug msgs from wlandebug > depend on what you're trying to debug. Because neither wpa_supplicant (quoted below), nor wlandebug (in the URL above) gave me the answer -- it looks like we are going into the scan with the specific SSID in mind and never come back, so I went for the next level. However, could you, please, clarify that I understood you correctly -- you *do not* want to see mix of wlandebug and athdebug messages in the report, and I should turn wlandebug off before turning athdebug on, right? > > I suggest that when debugging you start from the highest layer and move > downward. If you can't find what you need in a wpa_supplicant log then > turn on msgs in net80211 with wlandebug. If that doesn't tell you what > you need then move to the driver. Blindly turning everything on can > easily livelock your system. That's what I did -- what I have posted is the end result of the walking down that chain, and I assumed, possibly incorrectly, that you would want result of all three to put things in context. I do apologize for the misunderstanding. > For high volume msgs I often do something > like: > > athdebug +intr; sleep 1; athdebug -intr > > or > > athdebug +intr; read x; athdebug -intr > > so a carriage return will disable msgs. Thank you for the suggestion. > > > > * what facility does wpa_supplicant use, when forced to syslog by -s > > switch? > > trouble% cd /data/freebsd/head/contrib/wpa_supplicant/ > trouble% grep openlog *.c > common.c: openlog("wpa_supplicant", LOG_PID | LOG_NDELAY, LOG_DAEMON); Thank you, should have done this myself, sorry. -- Alexandre "Sunny" Kovalenko (Олександр Коваленко) ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
Alexandre "Sunny" Kovalenko wrote: I seem to be able to lock my machine by going into wpa_cli and asking it to 'reassoc'. The reason for question mark after "hard" is that debug information (caused by wlandebug and athdebug) is being printed on the console. The only way to get machine's attention is to hold power button for 8 seconds. So this is just livelock due to console debug msgs. Note: manual reassociation is just the handy way to reproduce the problem -- I have had machine locking up on me the whole day long completely on its own. Below are, what I think, relevant pieces of information. If anything is missing, please, chastise me appropriately and will do my best to provide. I have rigged firewire console, but am unable to break into the debugger locally or remotely. I see no log msgs. While I am on the subject, I would appreciate couple of the troubleshooting suggestions: * is there any way to get sysctl dev.ath.0.debug to appear, other then defining ATH_DEBUG in something like /usr/src/sys/dev/ath/ah_osdep.h? options ATH_DEBUG * is there minimal, but still usable mask for athdebug and wlandebug? I have started with 0x and kept trimming likely high-volume settings until output slowed down to the reasonable pace. Why do you want debug msgs from ath? The debug msgs from wlandebug depend on what you're trying to debug. I suggest that when debugging you start from the highest layer and move downward. If you can't find what you need in a wpa_supplicant log then turn on msgs in net80211 with wlandebug. If that doesn't tell you what you need then move to the driver. Blindly turning everything on can easily livelock your system. For high volume msgs I often do something like: athdebug +intr; sleep 1; athdebug -intr or athdebug +intr; read x; athdebug -intr so a carriage return will disable msgs. * what facility does wpa_supplicant use, when forced to syslog by -s switch? trouble% cd /data/freebsd/head/contrib/wpa_supplicant/ trouble% grep openlog *.c common.c: openlog("wpa_supplicant", LOG_PID | LOG_NDELAY, LOG_DAEMON); === * OS: RELENG_7 cvsup'ed on May 11th @ 15:30EST, if_ath is loaded as the module * HW: [EMAIL PROTECTED]:3:0:0: class=0x02 card=0x058a1014 chip=0x1014168c rev=0x01 hdr=0x00 vendor = 'Atheros Communications Inc.' device = 'AR5212 Atheros AR5212 802.11abg wireless' class = network subclass = ethernet ath0: mem 0xedf0-0xedf0 irq 17 at device 0.0 on pci3 ath0: [ITHREAD] ath0: using obsoleted if_watchdog interface ath0: Ethernet address: xx:xx:xx:xx:xx:xx ath0: mac 10.3 phy 6.1 radio 10.2 * Debug settings: net.wlan.0.debug: 0xdfff<11n,debug,crypto,input,xrate,elemid,node,assoc,auth,scan,output,state,power,dot1x,dot1xsm,radius,raddump,radkeys,wpa,acl,wme,superg,doth,inact,roam,rate> dev.ath.0.debug: 0xffef8ff0 * Console output (somewhat trimmed) is attached -- last group of messages keeps repeating until I power down the machine. * Output of 'wpa_supplicant -Dbsd -iath0 -d -d -K -c/etc/wpa_supplicant.conf' (hand transcribed) is: Starting AP scan (specific SSID) Scan SSID = hexdump_ascii (len = NN) xx xx xx xx xx xx * /etc/wpa_supplicant.conf is attached * Setup: two stations with the identical SSID: Linksys WRT54Gv8 (running dd-wrt micro) at about 50' and Zyxel P-330W at about 20'. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]" ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Hard(?) lock when reassociating ath with wpa_supplicant on RELENG_7
I seem to be able to lock my machine by going into wpa_cli and asking it to 'reassoc'. The reason for question mark after "hard" is that debug information (caused by wlandebug and athdebug) is being printed on the console. The only way to get machine's attention is to hold power button for 8 seconds. Note: manual reassociation is just the handy way to reproduce the problem -- I have had machine locking up on me the whole day long completely on its own. Below are, what I think, relevant pieces of information. If anything is missing, please, chastise me appropriately and will do my best to provide. I have rigged firewire console, but am unable to break into the debugger locally or remotely. While I am on the subject, I would appreciate couple of the troubleshooting suggestions: * is there any way to get sysctl dev.ath.0.debug to appear, other then defining ATH_DEBUG in something like /usr/src/sys/dev/ath/ah_osdep.h? * is there minimal, but still usable mask for athdebug and wlandebug? I have started with 0x and kept trimming likely high-volume settings until output slowed down to the reasonable pace. * what facility does wpa_supplicant use, when forced to syslog by -s switch? === * OS: RELENG_7 cvsup'ed on May 11th @ 15:30EST, if_ath is loaded as the module * HW: [EMAIL PROTECTED]:3:0:0:class=0x02 card=0x058a1014 chip=0x1014168c rev=0x01 hdr=0x00 vendor = 'Atheros Communications Inc.' device = 'AR5212 Atheros AR5212 802.11abg wireless' class = network subclass = ethernet ath0: mem 0xedf0-0xedf0 irq 17 at device 0.0 on pci3 ath0: [ITHREAD] ath0: using obsoleted if_watchdog interface ath0: Ethernet address: xx:xx:xx:xx:xx:xx ath0: mac 10.3 phy 6.1 radio 10.2 * Debug settings: net.wlan.0.debug: 0xdfff<11n,debug,crypto,input,xrate,elemid,node,assoc,auth,scan,output,state,power,dot1x,dot1xsm,radius,raddump,radkeys,wpa,acl,wme,superg,doth,inact,roam,rate> dev.ath.0.debug: 0xffef8ff0 * Console output (somewhat trimmed) is attached -- last group of messages keeps repeating until I power down the machine. * Output of 'wpa_supplicant -Dbsd -iath0 -d -d -K -c/etc/wpa_supplicant.conf' (hand transcribed) is: Starting AP scan (specific SSID) Scan SSID = hexdump_ascii (len = NN) xx xx xx xx xx xx * /etc/wpa_supplicant.conf is attached * Setup: two stations with the identical SSID: Linksys WRT54Gv8 (running dd-wrt micro) at about 50' and Zyxel P-330W at about 20'. -- Alexandre "Sunny" Kovalenko (Олександр Коваленко) # # $Log$ # ctrl_interface=/var/run/wpa_supplicant ctrl_interface_group=wheel network={ ssid="xxx" scan_ssid=1 priority=1 key_mgmt=WPA-PSK psk="xx" } ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"