Re: iwn firmware instability with an up-to-date stable kernel

2010-04-25 Thread Bernhard Schmidt
On Sat, Apr 24, 2010 at 06:24:42PM -0700, Garrett Cooper wrote:
 On Sat, Apr 24, 2010 at 12:50 AM, Bernhard Schmidt
 bschm...@techwires.net wrote:
  On Sat, Apr 24, 2010 at 12:45:14AM -0700, Garrett Cooper wrote:
  On Sat, Apr 24, 2010 at 12:34 AM, Bernhard Schmidt
  bschm...@techwires.net wrote:
  
   How did you do that? Reloading the module, or with ifconfig?
 
  /etc/rc.d/netif restart , which does the ifconfig operations (no
  module change occurred AFAIK, but wlan0 did of course do some
  device_printf's when it was associating itself with iwn(4)).
 
  Can you do ps xa | grep wpa? Just wondering if wpa_supplicant gets
  started twice.
 
 Some more interesting data.
 
 Open authentication at home works out of the box via wpa_supplicant
 with ifconfig_wlan0="WPA DHCP", whereas it flaked out and died at work.
 
 There are two instances of wpa_supplicant started up on the laptop.
 Here's a snippet from pstree that shows that both processes are
 standalone:
 
 -+= 1 root /sbin/init --
  |--= 00121 root adjkerntz -i
  |--= 00559 root /sbin/devd
  |--= 00711 root /usr/sbin/syslogd -s
  |--= 00735 root /usr/sbin/rpcbind
  |--= 00879 root /usr/sbin/moused -p /dev/psm0 -t auto
  |--= 00903 messagebus /usr/local/bin/dbus-daemon --system
  |--= 01073 root /usr/sbin/sshd
  |--= 01081 root sendmail: accepting connections (sendmail)
  |--= 01085 smmsp sendmail: Queue run...@00:30:00 for /var/spool/clientmqueue (sendmail)
  |--= 01093 root /usr/sbin/cron -s
  |-+= 01176 haldaemon /usr/local/sbin/hald
  | \-+- 01180 root hald-runner
  |   |--- 01185 root hald-addon-mouse-sysmouse: /dev/psm0 (hald-addon-mouse-sy)
  |   \--- 01205 root hald-addon-storage: /dev/acd0 (hald-addon-storage)
  |--= 01179 root /usr/local/sbin/console-kit-daemon
  |--= 01727 root /usr/sbin/wpa_supplicant -s -B -i wlan0 -c /etc/wpa_supplicant.conf -D bsd -P /var/run/wpa_supplicant/wlan0.pid
  |--= 01783 root /usr/sbin/wpa_supplicant -s -B -i wlan0 -c /etc/wpa_supplicant.conf -D bsd -P /var/run/wpa_supplicant/wlan0.pid
  |--= 01866 root dhclient: wlan0 [priv] (dhclient)
  |--= 01902 _dhcp dhclient: wlan0 (dhclient)

Indeed, devd is responsible for that. Removing

notify 0 {
	match "system"		"IFNET";
	match "type"		"ATTACH";
	action "/etc/pccard_ether $subsystem start";
};

from devd.conf prevents a second call to rc.d/netif and therefore to
rc.d/wpa_supplicant. This breaks the intended purpose of the rule, though.

Can we somehow prevent this by checking the pidfile in
rc.d/wpa_supplicant?
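
One way to approach the pidfile check asked about above is sketched below. It is
a minimal sketch, not the actual rc.d/wpa_supplicant code: the function name
pidfile_is_live and the way it would be wired into the start path are
assumptions for illustration. rc.subr(8) also documents a check_pidfile helper
that additionally compares the process name, which may be the better fit inside
a real rc.d script.

```sh
#!/bin/sh
# Hypothetical helper (not part of FreeBSD's rc.d/wpa_supplicant).
# Succeeds only if the pidfile exists, holds a numeric PID, and that PID
# can be signalled. Intended to run as root, as rc.d scripts do; for an
# unprivileged caller, kill -0 also fails on processes owned by other users.
pidfile_is_live() {
    _pidfile=$1
    [ -r "$_pidfile" ] || return 1
    _pid=$(cat "$_pidfile") || return 1
    case $_pid in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric content: treat as stale
    esac
    kill -0 "$_pid" 2>/dev/null    # signal 0 only tests whether the PID exists
}
```

A start path could then skip the second launch with something like
`pidfile_is_live /var/run/wpa_supplicant/wlan0.pid && exit 0`. Note that this
is a check-then-act test: it does not help if both starts run before either
pidfile has been written, and a recycled PID would be mistaken for a live
supplicant.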

-- 
Bernhard
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: iwn firmware instability with an up-to-date stable kernel

2010-04-24 Thread Garrett Cooper
On Fri, Apr 23, 2010 at 10:08 PM, Brandon Gooch
jamesbrandongo...@gmail.com wrote:
 On Sat, Apr 24, 2010 at 4:59 AM, Garrett Cooper yanef...@gmail.com wrote:
 On Fri, Apr 23, 2010 at 9:42 PM, Garrett Cooper yanef...@gmail.com wrote:
 On Fri, Apr 23, 2010 at 8:05 PM, Brandon Gooch
 jamesbrandongo...@gmail.com wrote:
 2010/4/23 Garrett Cooper yanef...@gmail.com:
 2010/4/23 Garrett Cooper yanef...@gmail.com:
 2010/4/18 Olivier Cochard-Labbé oliv...@cochard.me:
 2010/4/18 Bernhard Schmidt bschm...@techwires.net:
 Are you able to reproduce this on demand? As in type a few commands and
 the firmware error occurs?


 No, I'm not able to reproduce on demand this problem.

 I'm seeing similar issues on occasion with my Lenovo as well:

 Apr 23 19:25:24 garrcoop-fbsd kernel: firmware error log:
 Apr 23 19:25:24 garrcoop-fbsd kernel: error type      = NMI_INTERRUPT_WDG (0x0004)
 Apr 23 19:25:24 garrcoop-fbsd kernel: program counter = 0x046C
 Apr 23 19:25:24 garrcoop-fbsd kernel: source line     = 0x00D0
 Apr 23 19:25:24 garrcoop-fbsd kernel: error data      = 0x00020703
 Apr 23 19:25:24 garrcoop-fbsd kernel: branch link     = 0x837004C2
 Apr 23 19:25:24 garrcoop-fbsd kernel: interrupt link  = 0x06DA18B8
 Apr 23 19:25:24 garrcoop-fbsd kernel: time            = 4287402440
 Apr 23 19:25:24 garrcoop-fbsd kernel: driver status:
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  0: qid=0  cur=1   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  1: qid=1  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  2: qid=2  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  3: qid=3  cur=36  queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  4: qid=4  cur=123 queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  5: qid=5  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  6: qid=6  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  7: qid=7  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  8: qid=8  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  9: qid=9  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 10: qid=10 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 11: qid=11 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 12: qid=12 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 13: qid=13 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 14: qid=14 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 15: qid=15 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: rx ring: cur=8

 This may be because the system was under load (I was installing a port
 shortly before the connection dropped). I'll try poking at this
 further because it's going to be an annoying productivity loss :/.

    Sorry... should have included more helpful details.
 Thanks,
 -Garrett

 dmesg:

 iwn0: Intel(R) PRO/Wireless 4965BGN mem 0xdf2fe000-0xdf2f irq 17
 at device 0.0 on pci3
 iwn0: MIMO 2T3R, MoW1, address 00:1d:e0:7d:9f:c7
 iwn0: [ITHREAD]
 iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
 iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
 iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps
 24Mbps 36Mbps 48Mbps 54Mbps

 pciconf -lv snippet:

 i...@pci0:3:0:0:        class=0x028000 card=0x11108086 chip=0x42308086
 rev=0x61 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel Wireless WiFi Link 4965AGN (Intel 4965AGN)'
    class      = network
 c...@pci0:21:0:0:       class=0x060700 card=0x20c617aa chip=0x04761180
 rev=0xba hdr=0x02

 uname -a:

 $ uname -a
 FreeBSD garrcoop-fbsd.cisco.com 8.0-STABLE FreeBSD 8.0-STABLE #0
 r207006: Wed Apr 21 13:18:44 PDT 2010
 r...@garrcoop-fbsd.cisco.com:/usr/obj/usr/src/sys/LAPPY_X86  i386

 I'm actually looking at this right now. For me, it's actually
 happening when my machine stays on overnight (or for long periods of
 time, idle).

 Also, it seems to be causing the kernel to panic, although I'm now
 wondering if the Machine Check Architecture is somehow catching this
 device error and causing an exception (hw.mca.enabled=1)(?) -- not
 possible, right ???

 Whatever the case, I can't seem to get the firmware error to occur
 with iwn(4) debugging or wlandebug options enabled, so who knows
 exactly what leads to this.

 I know Bernhard has worked hard on this driver, it's a shame that this
 freaky bug has bit us all now, without leaving many clues :(

 I've attached a textdump for posterity if nothing else :)

    Connectivity appears to be shoddy in my neck of the woods (kind of
 ironic... but meh). Just running buildworld, buildkernel, then doing a
 tcpdump in parallel causes the pseudo device to go up and down a lot.
 I assume this isn't standard behavior?
    Just for reference buildworld was started shortly after 19:39:05,
 and it finished at 21:29. The interface has also gone up and down once
 since then while the system's 

Re: iwn firmware instability with an up-to-date stable kernel

2010-04-24 Thread Bernhard Schmidt
On Fri, Apr 23, 2010 at 11:27:32PM -0700, Garrett Cooper wrote:
 On Fri, Apr 23, 2010 at 10:08 PM, Brandon Gooch
 jamesbrandongo...@gmail.com wrote:
  On Sat, Apr 24, 2010 at 4:59 AM, Garrett Cooper yanef...@gmail.com wrote:
  On Fri, Apr 23, 2010 at 9:42 PM, Garrett Cooper yanef...@gmail.com wrote:
  On Fri, Apr 23, 2010 at 8:05 PM, Brandon Gooch
  jamesbrandongo...@gmail.com wrote:
  2010/4/23 Garrett Cooper yanef...@gmail.com:
  2010/4/23 Garrett Cooper yanef...@gmail.com:
  2010/4/18 Olivier Cochard-Labbé oliv...@cochard.me:
  2010/4/18 Bernhard Schmidt bschm...@techwires.net:
  Are you able to reproduce this on demand? As in type a few commands 
  and
  the firmware error occurs?
 
 
  No, I'm not able to reproduce on demand this problem.
 
  I'm seeing similar issues on occasion with my Lenovo as well:
 
  Apr 23 19:25:24 garrcoop-fbsd kernel: firmware error log:
  Apr 23 19:25:24 garrcoop-fbsd kernel: error type      = NMI_INTERRUPT_WDG (0x0004)
  Apr 23 19:25:24 garrcoop-fbsd kernel: program counter = 0x046C
  Apr 23 19:25:24 garrcoop-fbsd kernel: source line     = 0x00D0
  Apr 23 19:25:24 garrcoop-fbsd kernel: error data      = 0x00020703
  Apr 23 19:25:24 garrcoop-fbsd kernel: branch link     = 0x837004C2
  Apr 23 19:25:24 garrcoop-fbsd kernel: interrupt link  = 0x06DA18B8
  Apr 23 19:25:24 garrcoop-fbsd kernel: time            = 4287402440
  Apr 23 19:25:24 garrcoop-fbsd kernel: driver status:
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  0: qid=0  cur=1   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  1: qid=1  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  2: qid=2  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  3: qid=3  cur=36  queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  4: qid=4  cur=123 queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  5: qid=5  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  6: qid=6  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  7: qid=7  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  8: qid=8  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  9: qid=9  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 10: qid=10 cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 11: qid=11 cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 12: qid=12 cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 13: qid=13 cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 14: qid=14 cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 15: qid=15 cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: rx ring: cur=8
 
  This may be because the system was under load (I was installing a port
  shortly before the connection dropped). I'll try poking at this
  further because it's going to be an annoying productivity loss :/.
 
     Sorry... should have included more helpful details.
  Thanks,
  -Garrett
 
  dmesg:
 
  iwn0: Intel(R) PRO/Wireless 4965BGN mem 0xdf2fe000-0xdf2f irq 17
  at device 0.0 on pci3
  iwn0: MIMO 2T3R, MoW1, address 00:1d:e0:7d:9f:c7
  iwn0: [ITHREAD]
  iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
  iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
  iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps
  24Mbps 36Mbps 48Mbps 54Mbps
 
  pciconf -lv snippet:
 
  i...@pci0:3:0:0:        class=0x028000 card=0x11108086 chip=0x42308086
  rev=0x61 hdr=0x00
     vendor     = 'Intel Corporation'
     device     = 'Intel Wireless WiFi Link 4965AGN (Intel 4965AGN)'
     class      = network
  c...@pci0:21:0:0:       class=0x060700 card=0x20c617aa chip=0x04761180
  rev=0xba hdr=0x02
 
  uname -a:
 
  $ uname -a
  FreeBSD garrcoop-fbsd.cisco.com 8.0-STABLE FreeBSD 8.0-STABLE #0
  r207006: Wed Apr 21 13:18:44 PDT 2010
  r...@garrcoop-fbsd.cisco.com:/usr/obj/usr/src/sys/LAPPY_X86  i386
 
  I'm actually looking at this right now. For me, it's actually
  happening when my machine stays on overnight (or for long periods of
  time, idle).
 
  Also, it seems to be causing the kernel to panic, although I'm now
  wondering if the Machine Check Architecture is somehow catching this
  device error and causing an exception (hw.mca.enabled=1)(?) -- not
  possible, right ???
 
  Whatever the case, I can't seem to get the firmware error to occur
  with iwn(4) debugging or wlandebug options enabled, so who knows
  exactly what leads to this.
 
  I know Bernhard has worked hard on this driver, it's a shame that this
  freaky bug has bit us all now, without leaving many clues :(
 
  I've attached a textdump for posterity if nothing else :)
 
     Connectivity appears to be shoddy in my neck of the woods (kind of
  ironic... but meh). Just running buildworld, buildkernel, then doing a
  tcpdump in parallel causes the pseudo device to go up and down a 

Re: iwn firmware instability with an up-to-date stable kernel

2010-04-24 Thread Garrett Cooper
On Sat, Apr 24, 2010 at 12:34 AM, Bernhard Schmidt
bschm...@techwires.net wrote:
 On Fri, Apr 23, 2010 at 11:27:32PM -0700, Garrett Cooper wrote:
 On Fri, Apr 23, 2010 at 10:08 PM, Brandon Gooch
 jamesbrandongo...@gmail.com wrote:
  On Sat, Apr 24, 2010 at 4:59 AM, Garrett Cooper yanef...@gmail.com wrote:
  On Fri, Apr 23, 2010 at 9:42 PM, Garrett Cooper yanef...@gmail.com 
  wrote:
  On Fri, Apr 23, 2010 at 8:05 PM, Brandon Gooch
  jamesbrandongo...@gmail.com wrote:
  2010/4/23 Garrett Cooper yanef...@gmail.com:
  2010/4/23 Garrett Cooper yanef...@gmail.com:
  2010/4/18 Olivier Cochard-Labbé oliv...@cochard.me:
  2010/4/18 Bernhard Schmidt bschm...@techwires.net:
  Are you able to reproduce this on demand? As in type a few commands 
  and
  the firmware error occurs?
 
 
  No, I'm not able to reproduce on demand this problem.
 
  I'm seeing similar issues on occasion with my Lenovo as well:
 
  Apr 23 19:25:24 garrcoop-fbsd kernel: firmware error log:
  Apr 23 19:25:24 garrcoop-fbsd kernel: error type      = NMI_INTERRUPT_WDG (0x0004)
  Apr 23 19:25:24 garrcoop-fbsd kernel: program counter = 0x046C
  Apr 23 19:25:24 garrcoop-fbsd kernel: source line     = 0x00D0
  Apr 23 19:25:24 garrcoop-fbsd kernel: error data      = 0x00020703
  Apr 23 19:25:24 garrcoop-fbsd kernel: branch link     = 0x837004C2
  Apr 23 19:25:24 garrcoop-fbsd kernel: interrupt link  = 0x06DA18B8
  Apr 23 19:25:24 garrcoop-fbsd kernel: time            = 4287402440
  Apr 23 19:25:24 garrcoop-fbsd kernel: driver status:
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  0: qid=0  cur=1   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  1: qid=1  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  2: qid=2  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  3: qid=3  cur=36  queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  4: qid=4  cur=123 queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  5: qid=5  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  6: qid=6  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  7: qid=7  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  8: qid=8  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  9: qid=9  cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 10: qid=10 cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 11: qid=11 cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 12: qid=12 cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 13: qid=13 cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 14: qid=14 cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 15: qid=15 cur=0   queued=0
  Apr 23 19:25:24 garrcoop-fbsd kernel: rx ring: cur=8
 
  This may be because the system was under load (I was installing a port
  shortly before the connection dropped). I'll try poking at this
  further because it's going to be an annoying productivity loss :/.
 
     Sorry... should have included more helpful details.
  Thanks,
  -Garrett
 
  dmesg:
 
  iwn0: Intel(R) PRO/Wireless 4965BGN mem 0xdf2fe000-0xdf2f irq 17
  at device 0.0 on pci3
  iwn0: MIMO 2T3R, MoW1, address 00:1d:e0:7d:9f:c7
  iwn0: [ITHREAD]
  iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
  iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
  iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps
  24Mbps 36Mbps 48Mbps 54Mbps
 
  pciconf -lv snippet:
 
  i...@pci0:3:0:0:        class=0x028000 card=0x11108086 chip=0x42308086
  rev=0x61 hdr=0x00
     vendor     = 'Intel Corporation'
     device     = 'Intel Wireless WiFi Link 4965AGN (Intel 4965AGN)'
     class      = network
  c...@pci0:21:0:0:       class=0x060700 card=0x20c617aa chip=0x04761180
  rev=0xba hdr=0x02
 
  uname -a:
 
  $ uname -a
  FreeBSD garrcoop-fbsd.cisco.com 8.0-STABLE FreeBSD 8.0-STABLE #0
  r207006: Wed Apr 21 13:18:44 PDT 2010
  r...@garrcoop-fbsd.cisco.com:/usr/obj/usr/src/sys/LAPPY_X86  i386
 
  I'm actually looking at this right now. For me, it's actually
  happening when my machine stays on overnight (or for long periods of
  time, idle).
 
  Also, it seems to be causing the kernel to panic, although I'm now
  wondering if the Machine Check Architecture is somehow catching this
  device error and causing an exception (hw.mca.enabled=1)(?) -- not
  possible, right ???
 
  Whatever the case, I can't seem to get the firmware error to occur
  with iwn(4) debugging or wlandebug options enabled, so who knows
  exactly what leads to this.
 
  I know Bernhard has worked hard on this driver, it's a shame that this
  freaky bug has bit us all now, without leaving many clues :(
 
  I've attached a textdump for posterity if nothing else :)
 
     Connectivity appears to be shoddy in my neck of the woods (kind of
  ironic... but meh). Just running buildworld, 

Re: iwn firmware instability with an up-to-date stable kernel

2010-04-24 Thread Bernhard Schmidt
On Sat, Apr 24, 2010 at 12:45:14AM -0700, Garrett Cooper wrote:
 On Sat, Apr 24, 2010 at 12:34 AM, Bernhard Schmidt
 bschm...@techwires.net wrote:
  On Fri, Apr 23, 2010 at 11:27:32PM -0700, Garrett Cooper wrote:
  On Fri, Apr 23, 2010 at 10:08 PM, Brandon Gooch
  jamesbrandongo...@gmail.com wrote:
   On Sat, Apr 24, 2010 at 4:59 AM, Garrett Cooper yanef...@gmail.com 
   wrote:
   On Fri, Apr 23, 2010 at 9:42 PM, Garrett Cooper yanef...@gmail.com 
   wrote:
   On Fri, Apr 23, 2010 at 8:05 PM, Brandon Gooch
   jamesbrandongo...@gmail.com wrote:
   2010/4/23 Garrett Cooper yanef...@gmail.com:
   2010/4/23 Garrett Cooper yanef...@gmail.com:
   2010/4/18 Olivier Cochard-Labbé oliv...@cochard.me:
   2010/4/18 Bernhard Schmidt bschm...@techwires.net:
   Are you able to reproduce this on demand? As in type a few 
   commands and
   the firmware error occurs?
  
  
   No, I'm not able to reproduce on demand this problem.
  
   I'm seeing similar issues on occasion with my Lenovo as well:
  
   Apr 23 19:25:24 garrcoop-fbsd kernel: firmware error log:
   Apr 23 19:25:24 garrcoop-fbsd kernel: error type      = NMI_INTERRUPT_WDG (0x0004)
   Apr 23 19:25:24 garrcoop-fbsd kernel: program counter = 0x046C
   Apr 23 19:25:24 garrcoop-fbsd kernel: source line     = 0x00D0
   Apr 23 19:25:24 garrcoop-fbsd kernel: error data      = 0x00020703
   Apr 23 19:25:24 garrcoop-fbsd kernel: branch link     = 0x837004C2
   Apr 23 19:25:24 garrcoop-fbsd kernel: interrupt link  = 0x06DA18B8
   Apr 23 19:25:24 garrcoop-fbsd kernel: time            = 4287402440
   Apr 23 19:25:24 garrcoop-fbsd kernel: driver status:
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  0: qid=0  cur=1   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  1: qid=1  cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  2: qid=2  cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  3: qid=3  cur=36  queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  4: qid=4  cur=123 queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  5: qid=5  cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  6: qid=6  cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  7: qid=7  cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  8: qid=8  cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  9: qid=9  cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 10: qid=10 cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 11: qid=11 cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 12: qid=12 cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 13: qid=13 cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 14: qid=14 cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 15: qid=15 cur=0   queued=0
   Apr 23 19:25:24 garrcoop-fbsd kernel: rx ring: cur=8
  
   This may be because the system was under load (I was installing a 
   port
   shortly before the connection dropped). I'll try poking at this
   further because it's going to be an annoying productivity loss :/.
  
      Sorry... should have included more helpful details.
   Thanks,
   -Garrett
  
   dmesg:
  
   iwn0: Intel(R) PRO/Wireless 4965BGN mem 0xdf2fe000-0xdf2f irq 
   17
   at device 0.0 on pci3
   iwn0: MIMO 2T3R, MoW1, address 00:1d:e0:7d:9f:c7
   iwn0: [ITHREAD]
   iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 
   54Mbps
   iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
   iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps
   24Mbps 36Mbps 48Mbps 54Mbps
  
   pciconf -lv snippet:
  
   i...@pci0:3:0:0:        class=0x028000 card=0x11108086 
   chip=0x42308086
   rev=0x61 hdr=0x00
      vendor     = 'Intel Corporation'
      device     = 'Intel Wireless WiFi Link 4965AGN (Intel 4965AGN)'
      class      = network
   c...@pci0:21:0:0:       class=0x060700 card=0x20c617aa 
   chip=0x04761180
   rev=0xba hdr=0x02
  
   uname -a:
  
   $ uname -a
   FreeBSD garrcoop-fbsd.cisco.com 8.0-STABLE FreeBSD 8.0-STABLE #0
   r207006: Wed Apr 21 13:18:44 PDT 2010
   r...@garrcoop-fbsd.cisco.com:/usr/obj/usr/src/sys/LAPPY_X86  i386
  
   I'm actually looking at this right now. For me, it's actually
   happening when my machine stays on overnight (or for long periods of
   time, idle).
  
   Also, it seems to be causing the kernel to panic, although I'm now
   wondering if the Machine Check Architecture is somehow catching this
   device error and causing an exception (hw.mca.enabled=1)(?) -- not
   possible, right ???
  
   Whatever the case, I can't seem to get the firmware error to occur
   with iwn(4) debugging or wlandebug options enabled, so who knows
   exactly what leads to this.
  
   I know Bernhard has worked hard on this driver, it's a shame that this
   freaky bug has bit us all now, without 

Re: iwn firmware instability with an up-to-date stable kernel

2010-04-24 Thread Garrett Cooper
On Sat, Apr 24, 2010 at 12:50 AM, Bernhard Schmidt
bschm...@techwires.net wrote:
 On Sat, Apr 24, 2010 at 12:45:14AM -0700, Garrett Cooper wrote:
 On Sat, Apr 24, 2010 at 12:34 AM, Bernhard Schmidt
 bschm...@techwires.net wrote:
 
  How did you do that? Reloading the module, or with ifconfig?

 /etc/rc.d/netif restart , which does the ifconfig operations (no
 module change occurred AFAIK, but wlan0 did of course do some
 device_printf's when it was associating itself with iwn(4)).

 Can you do ps xa | grep wpa? Just wondering if wpa_supplicant gets
 started twice.

Some more interesting data.

Open authentication at home works out of the box via wpa_supplicant
with ifconfig_wlan0="WPA DHCP", whereas it flaked out and died at work.

There are two instances of wpa_supplicant started up on the laptop.
Here's a snippet from pstree that shows that both processes are
standalone:

-+= 1 root /sbin/init --
 |--= 00121 root adjkerntz -i
 |--= 00559 root /sbin/devd
 |--= 00711 root /usr/sbin/syslogd -s
 |--= 00735 root /usr/sbin/rpcbind
 |--= 00879 root /usr/sbin/moused -p /dev/psm0 -t auto
 |--= 00903 messagebus /usr/local/bin/dbus-daemon --system
 |--= 01073 root /usr/sbin/sshd
 |--= 01081 root sendmail: accepting connections (sendmail)
 |--= 01085 smmsp sendmail: Queue run...@00:30:00 for /var/spool/clientmqueue (sendmail)
 |--= 01093 root /usr/sbin/cron -s
 |-+= 01176 haldaemon /usr/local/sbin/hald
 | \-+- 01180 root hald-runner
 |   |--- 01185 root hald-addon-mouse-sysmouse: /dev/psm0 (hald-addon-mouse-sy)
 |   \--- 01205 root hald-addon-storage: /dev/acd0 (hald-addon-storage)
 |--= 01179 root /usr/local/sbin/console-kit-daemon
 |--= 01727 root /usr/sbin/wpa_supplicant -s -B -i wlan0 -c /etc/wpa_supplicant.conf -D bsd -P /var/run/wpa_supplicant/wlan0.pid
 |--= 01783 root /usr/sbin/wpa_supplicant -s -B -i wlan0 -c /etc/wpa_supplicant.conf -D bsd -P /var/run/wpa_supplicant/wlan0.pid
 |--= 01866 root dhclient: wlan0 [priv] (dhclient)
 |--= 01902 _dhcp dhclient: wlan0 (dhclient)
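
The duplicate can also be confirmed without pstree by counting wpa_supplicant
command lines per interface. This is a minimal sketch: count_supplicants is a
hypothetical helper name, and it assumes ps-style input with the PID in the
first column and the full command line after it.

```sh
#!/bin/sh
# Hypothetical helper: count wpa_supplicant processes started with "-i <ifn>".
# Reads "pid command..." lines on stdin, e.g. from: ps -axwwo pid,command
count_supplicants() {
    _ifn=$1
    awk -v ifn="$_ifn" '
        # Only consider lines whose command is wpa_supplicant itself,
        # then look for the "-i <ifn>" argument pair.
        $2 ~ /(^|\/)wpa_supplicant$/ {
            for (i = 3; i < NF; i++)
                if ($i == "-i" && $(i + 1) == ifn) { n++; break }
        }
        END { print n + 0 }
    '
}
```

Usage would be along the lines of `ps -axwwo pid,command | count_supplicants wlan0`;
any result greater than 1 means more than one supplicant is bound to that interface.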

Thanks,
-Garrett


Re: iwn firmware instability with an up-to-date stable kernel

2010-04-23 Thread Garrett Cooper
2010/4/18 Olivier Cochard-Labbé oliv...@cochard.me:
 2010/4/18 Bernhard Schmidt bschm...@techwires.net:
 Are you able to reproduce this on demand? As in type a few commands and
 the firmware error occurs?


 No, I'm not able to reproduce on demand this problem.

I'm seeing similar issues on occasion with my Lenovo as well:

Apr 23 19:25:24 garrcoop-fbsd kernel: firmware error log:
Apr 23 19:25:24 garrcoop-fbsd kernel: error type      = NMI_INTERRUPT_WDG (0x0004)
Apr 23 19:25:24 garrcoop-fbsd kernel: program counter = 0x046C
Apr 23 19:25:24 garrcoop-fbsd kernel: source line     = 0x00D0
Apr 23 19:25:24 garrcoop-fbsd kernel: error data      = 0x00020703
Apr 23 19:25:24 garrcoop-fbsd kernel: branch link     = 0x837004C2
Apr 23 19:25:24 garrcoop-fbsd kernel: interrupt link  = 0x06DA18B8
Apr 23 19:25:24 garrcoop-fbsd kernel: time            = 4287402440
Apr 23 19:25:24 garrcoop-fbsd kernel: driver status:
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  0: qid=0  cur=1   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  1: qid=1  cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  2: qid=2  cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  3: qid=3  cur=36  queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  4: qid=4  cur=123 queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  5: qid=5  cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  6: qid=6  cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  7: qid=7  cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  8: qid=8  cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  9: qid=9  cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 10: qid=10 cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 11: qid=11 cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 12: qid=12 cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 13: qid=13 cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 14: qid=14 cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 15: qid=15 cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: rx ring: cur=8

This may be because the system was under load (I was installing a port
shortly before the connection dropped). I'll try poking at this
further because it's going to be an annoying productivity loss :/.
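
Since the error is hard to reproduce on demand, it may help to pull the
timestamp of each firmware error out of the system log and line it up against
what the machine was doing at the time (port builds, idle periods). This is a
minimal sketch, assuming syslog lines in the format shown above; the function
name firmware_error_times is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: print the syslog timestamp (month, day, time) of every
# "firmware error log:" kernel line read from stdin.
firmware_error_times() {
    awk '/kernel: firmware error log:/ { print $1, $2, $3 }'
}
```

For example: `firmware_error_times < /var/log/messages` (the log path is
FreeBSD's default and may differ depending on syslog.conf).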

Thanks,
-Garrett


Re: iwn firmware instability with an up-to-date stable kernel

2010-04-23 Thread Garrett Cooper
2010/4/23 Garrett Cooper yanef...@gmail.com:
 2010/4/18 Olivier Cochard-Labbé oliv...@cochard.me:
 2010/4/18 Bernhard Schmidt bschm...@techwires.net:
 Are you able to reproduce this on demand? As in type a few commands and
 the firmware error occurs?


 No, I'm not able to reproduce on demand this problem.

 I'm seeing similar issues on occasion with my Lenovo as well:

 Apr 23 19:25:24 garrcoop-fbsd kernel: firmware error log:
 Apr 23 19:25:24 garrcoop-fbsd kernel: error type      = NMI_INTERRUPT_WDG (0x0004)
 Apr 23 19:25:24 garrcoop-fbsd kernel: program counter = 0x046C
 Apr 23 19:25:24 garrcoop-fbsd kernel: source line     = 0x00D0
 Apr 23 19:25:24 garrcoop-fbsd kernel: error data      = 0x00020703
 Apr 23 19:25:24 garrcoop-fbsd kernel: branch link     = 0x837004C2
 Apr 23 19:25:24 garrcoop-fbsd kernel: interrupt link  = 0x06DA18B8
 Apr 23 19:25:24 garrcoop-fbsd kernel: time            = 4287402440
 Apr 23 19:25:24 garrcoop-fbsd kernel: driver status:
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  0: qid=0  cur=1   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  1: qid=1  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  2: qid=2  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  3: qid=3  cur=36  queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  4: qid=4  cur=123 queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  5: qid=5  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  6: qid=6  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  7: qid=7  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  8: qid=8  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  9: qid=9  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 10: qid=10 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 11: qid=11 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 12: qid=12 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 13: qid=13 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 14: qid=14 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 15: qid=15 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: rx ring: cur=8

 This may be because the system was under load (I was installing a port
 shortly before the connection dropped). I'll try poking at this
 further because it's going to be an annoying productivity loss :/.

Sorry... should have included more helpful details.
Thanks,
-Garrett

dmesg:

iwn0: Intel(R) PRO/Wireless 4965BGN mem 0xdf2fe000-0xdf2f irq 17 at device 0.0 on pci3
iwn0: MIMO 2T3R, MoW1, address 00:1d:e0:7d:9f:c7
iwn0: [ITHREAD]
iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps

pciconf -lv snippet:

i...@pci0:3:0:0:        class=0x028000 card=0x11108086 chip=0x42308086 rev=0x61 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel Wireless WiFi Link 4965AGN (Intel 4965AGN)'
    class      = network
c...@pci0:21:0:0:       class=0x060700 card=0x20c617aa chip=0x04761180 rev=0xba hdr=0x02

uname -a:

$ uname -a
FreeBSD garrcoop-fbsd.cisco.com 8.0-STABLE FreeBSD 8.0-STABLE #0 r207006: Wed Apr 21 13:18:44 PDT 2010 r...@garrcoop-fbsd.cisco.com:/usr/obj/usr/src/sys/LAPPY_X86  i386


Re: iwn firmware instability with an up-to-date stable kernel

2010-04-23 Thread Garrett Cooper
On Fri, Apr 23, 2010 at 8:05 PM, Brandon Gooch
jamesbrandongo...@gmail.com wrote:
 2010/4/23 Garrett Cooper yanef...@gmail.com:
 2010/4/23 Garrett Cooper yanef...@gmail.com:
 2010/4/18 Olivier Cochard-Labbé oliv...@cochard.me:
 2010/4/18 Bernhard Schmidt bschm...@techwires.net:
 Are you able to reproduce this on demand? As in type a few commands and
 the firmware error occurs?


 No, I'm not able to reproduce on demand this problem.

 I'm seeing similar issues on occasion with my Lenovo as well:

 Apr 23 19:25:24 garrcoop-fbsd kernel: firmware error log:
 Apr 23 19:25:24 garrcoop-fbsd kernel: error type      =
 NMI_INTERRUPT_WDG (0x0004)
 Apr 23 19:25:24 garrcoop-fbsd kernel: program counter = 0x046C
 Apr 23 19:25:24 garrcoop-fbsd kernel: source line     = 0x00D0
 Apr 23 19:25:24 garrcoop-fbsd kernel: error data      = 0x00020703
 Apr 23 19:25:24 garrcoop-fbsd kernel: branch link     = 0x837004C2
 Apr 23 19:25:24 garrcoop-fbsd kernel: interrupt link  = 0x06DA18B8
 Apr 23 19:25:24 garrcoop-fbsd kernel: time            = 4287402440
 Apr 23 19:25:24 garrcoop-fbsd kernel: driver status:
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  0: qid=0  cur=1   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  1: qid=1  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  2: qid=2  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  3: qid=3  cur=36  queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  4: qid=4  cur=123 queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  5: qid=5  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  6: qid=6  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  7: qid=7  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  8: qid=8  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  9: qid=9  cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 10: qid=10 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 11: qid=11 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 12: qid=12 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 13: qid=13 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 14: qid=14 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 15: qid=15 cur=0   queued=0
 Apr 23 19:25:24 garrcoop-fbsd kernel: rx ring: cur=8

 This may be because the system was under load (I was installing a port
 shortly before the connection dropped). I'll try poking at this
 further because it's going to be an annoying productivity loss :/.
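
For anyone comparing dumps across several occurrences, here is a minimal sketch that pulls the error type and the per-ring counters out of a dump like the one quoted above. The function names are made up for illustration and are not part of iwn(4) or any FreeBSD tool. Reading a non-zero cur as "this ring has been used" is an assumption about what the driver prints.

```python
import re

# One "tx ring" line of the driver status dump, after whitespace is collapsed,
# e.g. "tx ring 4: qid=4 cur=123 queued=0".
TX_RING = re.compile(r"tx ring (\d+): qid=(\d+) cur=(\d+) queued=(\d+)")
# The error type line. In the mail the value was wrapped onto the next line,
# which is why the text is flattened before matching.
ERROR_TYPE = re.compile(r"error type = (\w+) \((0x[0-9A-Fa-f]+)\)")


def parse_firmware_dump(text):
    """Return the error type, its numeric code, and the per-ring counters."""
    flat = " ".join(text.split())  # collapse line wraps and column padding
    error = ERROR_TYPE.search(flat)
    rings = [
        {"ring": int(r), "qid": int(q), "cur": int(c), "queued": int(n)}
        for r, q, c, n in TX_RING.findall(flat)
    ]
    return {
        "error_type": error.group(1) if error else None,
        "error_code": int(error.group(2), 16) if error else None,
        "rings": rings,
    }


def rings_with_activity(dump):
    """Ring numbers with a non-zero cur or queued value (assumed to mean 'used')."""
    return [r["ring"] for r in dump["rings"] if r["cur"] or r["queued"]]


# Excerpt of the dump quoted in this thread (most idle rings left out).
SAMPLE = """\
Apr 23 19:25:24 garrcoop-fbsd kernel: firmware error log:
Apr 23 19:25:24 garrcoop-fbsd kernel: error type      =
NMI_INTERRUPT_WDG (0x0004)
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  0: qid=0  cur=1   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  1: qid=1  cur=0   queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  3: qid=3  cur=36  queued=0
Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  4: qid=4  cur=123 queued=0
"""

if __name__ == "__main__":
    dump = parse_firmware_dump(SAMPLE)
    print(dump["error_type"], hex(dump["error_code"]))
    print("rings with activity:", rings_with_activity(dump))
```

Running the same parser over each dump would make it easier to see whether the same rings are involved every time the firmware falls over.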

    Sorry... should have included more helpful details.
 Thanks,
 -Garrett

 dmesg:

 iwn0: Intel(R) PRO/Wireless 4965BGN mem 0xdf2fe000-0xdf2f irq 17
 at device 0.0 on pci3
 iwn0: MIMO 2T3R, MoW1, address 00:1d:e0:7d:9f:c7
 iwn0: [ITHREAD]
 iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
 iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
 iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps
 24Mbps 36Mbps 48Mbps 54Mbps

 pciconf -lv snippet:

 i...@pci0:3:0:0:        class=0x028000 card=0x11108086 chip=0x42308086
 rev=0x61 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel Wireless WiFi Link 4965AGN (Intel 4965AGN)'
    class      = network
 c...@pci0:21:0:0:       class=0x060700 card=0x20c617aa chip=0x04761180
 rev=0xba hdr=0x02

 uname -a:

 $ uname -a
 FreeBSD garrcoop-fbsd.cisco.com 8.0-STABLE FreeBSD 8.0-STABLE #0
 r207006: Wed Apr 21 13:18:44 PDT 2010
 r...@garrcoop-fbsd.cisco.com:/usr/obj/usr/src/sys/LAPPY_X86  i386

 I'm actually looking at this right now. For me, it's actually
 happening when my machine stays on overnight (or for long periods of
 time, idle).

 Also, it seems to be causing the kernel to panic, although I'm now
 wondering if the Machine Check Architecture is somehow catching this
 device error and causing an exception (hw.mca.enabled=1)(?) -- not
 possible, right ???

 Whatever the case, I can't seem to get the firmware error to occur
 with iwn(4) debugging or wlandebug options enabled, so who knows
 exactly what leads to this.

 I know Bernhard has worked hard on this driver, it's a shame that this
 freaky bug has bit us all now, without leaving many clues :(

 I've attached a textdump for posterity if nothing else :)

Connectivity appears to be shoddy in my neck of the woods (kind of
ironic... but meh). Just running buildworld and buildkernel, then doing a
tcpdump in parallel, causes the wlan0 pseudo device to go up and down a lot.
I assume this isn't standard behavior?

Just for reference, buildworld was started shortly after 19:39:05
and finished at 21:29. The interface has also gone up and down once
since then while the system's been basically idle.
Thanks,
-Garrett
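
To put a number on "up and down a lot", a rough sketch like the one below could count the wpa_supplicant events in syslog. It assumes the syslog line format shown in this mail. The function name is hypothetical, and the sample uses only lines from this message.

```python
import re
from collections import Counter

# Matches the event name in syslog lines written by wpa_supplicant, e.g.
#   "... wpa_supplicant[17226]: CTRL-EVENT-SCAN-RESULTS"
EVENT = re.compile(r"wpa_supplicant\[\d+\]: (CTRL-EVENT-[A-Z-]+)")


def tally_events(lines):
    """Count how often each CTRL-EVENT-* name appears in the given lines."""
    counts = Counter()
    for line in lines:
        match = EVENT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Lines taken from this mail; the last one is truncated in the archive.
SAMPLE = [
    "Apr 23 19:39:05 garrcoop-fbsd kernel: wlan0: promiscuous mode enabled",
    "Apr 23 19:41:04 garrcoop-fbsd wpa_supplicant[17226]: CTRL-EVENT-SCAN-RESULTS",
    "Apr 23 19:41:04 garrcoop-fbsd wpa_supplicant[17226]: Trying to",
]

if __name__ == "__main__":
    for event, count in tally_events(SAMPLE).most_common():
        print(f"{count:4d}  {event}")
```

Fed the full log (for example the lines of /var/log/messages, assuming that is where syslog writes on this machine), the CTRL-EVENT-CONNECTED and CTRL-EVENT-DISCONNECTED counts should give an approximate figure for how often the link flapped during the build.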

Apr 23 19:39:05 garrcoop-fbsd kernel: wlan0: promiscuous mode enabled
Apr 23 19:41:04 garrcoop-fbsd wpa_supplicant[17226]: CTRL-EVENT-SCAN-RESULTS
Apr 23 19:41:04 garrcoop-fbsd wpa_supplicant[17226]: Trying to

Re: iwn firmware instability with an up-to-date stable kernel

2010-04-23 Thread Garrett Cooper
On Fri, Apr 23, 2010 at 9:42 PM, Garrett Cooper yanef...@gmail.com wrote:
 On Fri, Apr 23, 2010 at 8:05 PM, Brandon Gooch
 jamesbrandongo...@gmail.com wrote:
 2010/4/23 Garrett Cooper yanef...@gmail.com:
 2010/4/23 Garrett Cooper yanef...@gmail.com:
 2010/4/18 Olivier Cochard-Labbé oliv...@cochard.me:
 2010/4/18 Bernhard Schmidt bschm...@techwires.net:
 Are you able to reproduce this on demand? As in type a few commands and
 the firmware error occurs?


 No, I'm not able to reproduce this problem on demand.

 I'm seeing similar issues on occasion with my Lenovo as well:

 [..]

 I'm actually looking at this right now. For me, it's actually
 happening when my machine stays on overnight (or for long periods of
 time, idle).

 Also, it seems to be causing the kernel to panic, although I'm now
 wondering if the Machine Check Architecture is somehow catching this
 device error and causing an exception (hw.mca.enabled=1)(?) -- not
 possible, right ???

 Whatever the case, I can't seem to get the firmware error to occur
 with iwn(4) debugging or wlandebug options enabled, so who knows
 exactly what leads to this.

 I know Bernhard has worked hard on this driver, it's a shame that this
 freaky bug has bit us all now, without leaving many clues :(

 I've attached a textdump for posterity if nothing else :)

    Connectivity appears to be shoddy in my neck of the woods (kind of
 ironic... but meh). Just running buildworld, buildkernel, then doing a
 tcpdump in parallel causes the pseudo device to go up and down a lot.
 I assume this isn't standard behavior?
    Just for reference buildworld was started shortly after 19:39:05,
 and it finished at 21:29. The interface has also gone up and down once
 since then while the system's been basically idle.

Hmmm... I seem to be in an excellent position to reproduce this
issue. I've reproduced it twice by merely bringing the interface up
and 

Re: iwn firmware instability with an up-to-date stable kernel

2010-04-23 Thread Brandon Gooch
On Sat, Apr 24, 2010 at 4:59 AM, Garrett Cooper yanef...@gmail.com wrote:
 On Fri, Apr 23, 2010 at 9:42 PM, Garrett Cooper yanef...@gmail.com wrote:
 On Fri, Apr 23, 2010 at 8:05 PM, Brandon Gooch
 jamesbrandongo...@gmail.com wrote:
 2010/4/23 Garrett Cooper yanef...@gmail.com:
 2010/4/23 Garrett Cooper yanef...@gmail.com:
 2010/4/18 Olivier Cochard-Labbé oliv...@cochard.me:
 2010/4/18 Bernhard Schmidt bschm...@techwires.net:
 Are you able to reproduce this on demand? As in type a few commands and
 the firmware error occurs?


 No, I'm not able to reproduce this problem on demand.

 I'm seeing similar issues on occasion with my Lenovo as well:

 [..]

 I'm actually looking at this right now. For me, it's actually
 happening when my machine stays on overnight (or for long periods of
 time, idle).

 Also, it seems to be causing the kernel to panic, although I'm now
 wondering if the Machine Check Architecture is somehow catching this
 device error and causing an exception (hw.mca.enabled=1)(?) -- not
 possible, right ???

 Whatever the case, I can't seem to get the firmware error to occur
 with iwn(4) debugging or wlandebug options enabled, so who knows
 exactly what leads to this.

 I know Bernhard has worked hard on this driver, it's a shame that this
 freaky bug has bit us all now, without leaving many clues :(

 I've attached a textdump for posterity if nothing else :)

    Connectivity appears to be shoddy in my neck of the woods (kind of
 ironic... but meh). Just running buildworld, buildkernel, then doing a
 tcpdump in parallel causes the pseudo device to go up and down a lot.
 I assume this isn't standard behavior?
    Just for reference buildworld was started shortly after 19:39:05,
 and it finished at 21:29. The interface has also gone up and down once
 since then while the system's been basically idle.

    Hmmm... I seem to be in an excellent position to reproduce this
 

Re: iwn firmware instability with an up-to-date stable kernel

2010-04-18 Thread Bernhard Schmidt
On Sun, Apr 18, 2010 at 03:49:14AM +0200, Olivier Cochard-Labbé wrote:
 Hi,
 
 I'm seeing instability with an up-to-date stable 8 kernel and the iwn driver.
 About twice a day, my wireless connection hangs and I get this error
 message in dmesg:
 
 firmware error log:
   error type  = NMI_INTERRUPT_WDG (0x0004)
   program counter = 0x046C
   source line = 0x00D0
[..]
 
 Has anyone else run into the same problem?

I've seen this error a few times; even Intel knows about it:
http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=1965
The issue is that there is no known workaround.

Are you able to reproduce this on demand? As in type a few commands and
the firmware error occurs?

-- 
Bernhard


Re: iwn firmware instability with an up-to-date stable kernel

2010-04-18 Thread Olivier Cochard-Labbé
2010/4/18 Bernhard Schmidt bschm...@techwires.net:
 Are you able to reproduce this on demand? As in type a few commands and
 the firmware error occurs?


No, I'm not able to reproduce this problem on demand.

Regards,

Olivier