Re: OpenBSD 3.7 on Soekris rebooting at random
On Wed, 31 Aug 2005 12:47:03 +0200 Olivier Mehani [EMAIL PROTECTED] wrote:

I've just finished upgrading my router to 3.8-beta (GENERIC#119).

OK, the machine has now been running for almost three days without problems or unwanted reboots. It never lasted that long before the upgrade, so I think the problem is fixed. Ath works correctly in hostap mode with said kernel ;). Thank you for the advice!

-- Olivier Mehani [EMAIL PROTECTED] PGP fingerprint: 3720 A1F7 1367 9FA3 C654 6DFB 6845 4071 E346 2FD1
Re: OpenBSD 3.7 on Soekris rebooting at random
On Tue, 23 Aug 2005 19:49:46 +0200 [EMAIL PROTECTED] wrote:

I haven't time in the next 10 days to play with it, but maybe Olivier can give some feedback in case he tries the latest snapshot?

I've just finished upgrading my router to 3.8-beta (GENERIC#119). I'm going to stress the machine a little now ;) I'll keep you informed.

-- Olivier Mehani [EMAIL PROTECTED] PGP fingerprint: 3720 A1F7 1367 9FA3 C654 6DFB 6845 4071 E346 2FD1
OpenBSD 3.7 on Soekris rebooting at random
Hi,

I'm facing a strange problem (started a week or so ago): my OpenBSD 3.7 running on a Soekris net4511 reboots for no obvious reason. I've started monitoring the memory usage, load average and pf states, but these do not seem to be related to the problem. I'm also using the hardware watchdog, which I will disable to see if it is involved, but everything had been working well with it for more than two months before this.

Do you have any suggestions for other things I should monitor?

Thanks

-- Olivier Mehani [EMAIL PROTECTED]
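For anyone following along, the kind of monitoring described above could be sketched roughly as below. This is only an illustration, not the poster's actual setup: the log path, the interval and the function names are assumptions, and pfctl is only called where it exists. The idea is simply to append timestamped snapshots to a file so that the last entries written before a reboot can be inspected afterwards.

```shell
#!/bin/sh
# Sketch only: LOGFILE, INTERVAL and the function names are assumptions,
# not values taken from the thread.
LOGFILE="${LOGFILE:-/var/log/reboot-hunt.log}"
INTERVAL="${INTERVAL:-60}"

run_if_present() {
    # Run a command only if it exists on this system; ignore its failure.
    if command -v "$1" >/dev/null 2>&1; then "$@" 2>/dev/null || true; fi
}

snapshot() {
    # One timestamped record: load average, memory summary, pf counters.
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
    run_if_present uptime
    run_if_present vmstat
    run_if_present pfctl -s info
}

monitor_loop() {
    # Append a snapshot every INTERVAL seconds until interrupted.
    while : ; do
        snapshot >> "$LOGFILE"
        sleep "$INTERVAL"
    done
}

# To start logging, call: monitor_loop
```

On a CompactFlash-based box a long interval is probably wise, since every snapshot is a write to the flash.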
Re: OpenBSD 3.7 on Soekris rebooting at random
Olivier Mehani wrote:

Hi, I'm facing a strange problem (started a week or so ago): My OpenBSD 3.7 running on a Soekris net4511 reboots with no obvious reason. I've started monitoring the memory usage, load average and pf states, but these do not seem to be related to the problem. I'm also using the hardware watchdog which I will disable to see if it is involved in the problem, but everything has been working well for more than two months with it before. Do you have any suggestion of other things I should monitor ? Thanks -- Olivier Mehani [EMAIL PROTECTED]

a dmesg may be helpful...
Re: OpenBSD 3.7 on Soekris rebooting at random
Olivier Mehani wrote:

My OpenBSD 3.7 running on a Soekris net4511 reboots with no obvious reason. I've started monitoring the memory usage, load average and pf states, but these do not seem to be related to the problem. [...] Do you have any suggestion of other things I should monitor ?

Input voltage and current? I've seen reports of similar behaviour from Soekris boxes with underspec or flaky power supplies.

-- Darren Tucker (dtucker at zip.com.au) GPG key 8FF4FA69 / D9A3 86E9 7EEE AF4B B2D4 37C9 C982 80C7 8FF4 FA69 Good judgement comes with experience. Unfortunately, the experience usually comes from bad judgement.
Re: OpenBSD 3.7 on Soekris rebooting at random
On Tue, 23 Aug 2005 15:21:53 +0200 Dimitri Georganas [EMAIL PROTECTED] wrote:

I'm facing a strange problem (started a week or so ago):

a dmesg may be helpful...

Yes, I realised I forgot to include it just after posting, sorry... Anyway, it confirms that it is the watchdog which triggered the reset, but I still don't know why...

Full dmesg follows:

OpenBSD 3.7 (GENERIC) #50: Sun Mar 20 00:01:57 MST 2005
    [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: AMD Am486DX4 W/B or Am5x86 W/B 150 (AuthenticAMD 486-class)
cpu0: FPU
real mem = 66691072 (65128K)
avail mem = 53448704 (52196K)
using 839 buffers containing 3436544 bytes (3356K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(00) BIOS, date 20/41/22, BIOS32 rev. 0 @ 0xf7840
pcibios0 at bios0: rev 2.0 @ 0xf/0x1
pcibios0: pcibios_get_intr_routing - function not supported
pcibios0: PCI IRQ Routing information unavailable.
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc8000/0x9000
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
elansc0 at pci0 dev 0 function 0 AMD ElanSC520 PCI rev 0x00: product 0 stepping 1.1, CPU clock 100MHz, reset 8WDT
elansc0: WARNING: LAST RESET DUE TO WATCHDOG EXPIRATION!
gpio0 at elansc0: 32 pins
cbb0 at pci0 dev 9 function 0 Texas Instruments PCI1410 CardBus rev 0x02: irq 10
ath0 at pci0 dev 16 function 0 Atheros AR5212 rev 0x01: irq 11
ath0: mac 80.9 phy 4.3 radio 3.6, 802.11a/b/g, FCC1A, address 00:02:6f:21:ea:79
gpio at ath0 not configured
sis0 at pci0 dev 18 function 0 NS DP83815 10/100 rev 0x00: DP83816A, irq 5, address 00:00:24:c4:22:5c
nsphyter0 at sis0 phy 0: DP83815 10/100 PHY, rev. 1
sis1 at pci0 dev 19 function 0 NS DP83815 10/100 rev 0x00: DP83816A, irq 9, address 00:00:24:c4:22:5d
nsphyter1 at sis1 phy 0: DP83815 10/100 PHY, rev. 1
cardslot0 at cbb0 slot 0 flags 0
cardbus0 at cardslot0: bus 1 device 0 cacheline 0x10, lattimer 0x3f
pcmcia0 at cardslot0
isa0 at mainbus0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0 (mux 1 ignored for console): console keyboard
wdc0 at isa0 port 0x1f0/8 irq 14
wd0 at wdc0 channel 0 drive 0: Hitachi XX.V.3.4.0.0
wd0: 1-sector PIO, LBA, 488MB, 1000944 sectors
wd0(wdc0:0:0): using BIOS timings
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom0: console
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
biomask f5c5 netmask ffe5 ttymask ffe7
pctr: no performance counters in CPU
rtw0 at cardbus0 dev 0 function 0 irq 10
rtw0: ver F, radio SA2400A, amp SA2411, address 00:0f:3d:cf:cb:e8
dkcsum: wd0 matched BIOS disk 80
root on wd0a
rootdev=0x0 rrootdev=0x300 rawdev=0x302
WARNING: / was not properly unmounted

And, as it may help too, my watchdog script:

#!/bin/sh
echo starting watchdog...
sysctl kern.watchdog.auto=0 > /dev/null
while : ; do
    sysctl kern.watchdog.period=10 > /dev/null
    sleep 8
done

-- Olivier Mehani [EMAIL PROTECTED]
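The elansc0 warning in the dmesg above is what identifies the watchdog as the source of the reset. A check for it could be scripted along these lines; this is a minimal sketch, the function name is made up for illustration, and the pattern is just the warning text as quoted in this dmesg, so other watchdog drivers may word it differently.

```shell
#!/bin/sh
# Sketch: succeed if a dmesg read from stdin reports that the last reset
# was caused by the hardware watchdog. The function name is hypothetical;
# the pattern is the elansc0 warning quoted in the dmesg above.
last_reset_was_watchdog() {
    grep -q 'LAST RESET DUE TO WATCHDOG EXPIRATION'
}

# Usage on the router itself, for example from a boot-time script:
#   dmesg | last_reset_was_watchdog && echo "last reset: watchdog"
```

Recording the result with a timestamp at each boot would give a history of how often the watchdog fired, which helps when the interval between reboots is irregular.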
Re: OpenBSD 3.7 on Soekris rebooting at random
Did you put this atheros card in one week ago? :)

ath0 at pci0 dev 16 function 0 Atheros AR5212 rev 0x01: irq 11
ath0: mac 80.9 phy 4.3 radio 3.6, 802.11a/b/g, FCC1A, address

3.7 would crash on ath0 with AR5212 in hostap mode every six hours or so. Your watchdog does an excellent job, otherwise your board would just freeze. Not sure if this is an ath problem or a card-specific problem, but later snapshots (june) made it worse: the system would freeze when you'd do ifconfig up - probably due to the gpio support that, as you can see, isn't here yet in this dmesg.

Your best chance is to check out the latest snapshots to see if the problem is fixed. I reported it a while ago, but didn't check back. I just replaced the atheros card by an old prism card and that one works 24/7.

Olivier Mehani wrote:

On Tue, 23 Aug 2005 15:21:53 +0200 Dimitri Georganas [EMAIL PROTECTED] wrote: I'm facing a strange problem (started a week or so ago): a dmesg may be helpful...

Yes, I realised I forgot to include it just after posting, sorry... Anyway, it confirms that this is the watchdog which triggered the reset, but I still don't know why... Full dmesg follows: [...]
Re: OpenBSD 3.7 on Soekris rebooting at random
My OpenBSD 3.7 running on a Soekris net4511 reboots with no obvious reason. I've started monitoring the memory usage, load average and pf states, but these do not seem to be related to the problem. I'm also using the hardware watchdog which I will disable to see if it is involved in the problem, but everything has been working well for more than two months with it before.

If the couple of thousand soekris machines of various developers and users started rebooting just like that, we would have heard of it by now. Very hard to diagnose when one rare one somewhere, without *any debugging information* does so.
Re: OpenBSD 3.7 on Soekris rebooting at random
On Tue, Aug 23, 2005 at 06:15:45PM +0200, Olivier Mehani wrote:

The reboot is not that periodic (6 hours as you say): it can be as short as 2 hours to 2 days with, once again, no obvious reason.

try a ping -f against your accesspoint over the wireless interface and it will probably reboot/lock very quickly.

reyk
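The flood-ping test suggested above might be wrapped as follows so the command can be reviewed before it is run. This is a sketch under assumptions: the address and packet count are placeholders (192.0.2.1 is a documentation address), the function name is invented here, and ping -f normally requires root.

```shell
#!/bin/sh
# Sketch: build the flood-ping command for a stress test against the
# access point. AP_ADDR and COUNT are placeholders, not values from the
# thread; flood_command is a hypothetical helper name.
AP_ADDR="${AP_ADDR:-192.0.2.1}"
COUNT="${COUNT:-100000}"

flood_command() {
    # $1 = target address, $2 = number of packets to send
    echo "ping -f -c $2 $1"
}

# Print the command for review instead of running it blindly.
flood_command "$AP_ADDR" "$COUNT"
```

Run the printed command as root from a wireless client and note the start time, so a reboot or lockup of the access point can be matched against the test.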
Re: OpenBSD 3.7 on Soekris rebooting at random
A common cause is an insufficient power supply. This has been discussed on the Soekris technical mailing list many times.

On Tuesday 23 August 2005 10:04 am, Theo de Raadt wrote:

My OpenBSD 3.7 running on a Soekris net4511 reboots with no obvious reason. I've started monitoring the memory usage, load average and pf states, but these do not seem to be related to the problem. I'm also using the hardware watchdog which I will disable to see if it is involved in the problem, but everything has been working well for more than two months with it before.

If the couple of thousand soekris machines of various developers and users started rebooting just like that, we would have heard of it by now. Very hard to diagnose when one rare one somewhere, without *any debugging information* does so.

-- John R. Shannon [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: OpenBSD 3.7 on Soekris rebooting at random
On Tue, Aug 23, 2005 at 05:10:08PM +0200, Dimitri Georganas wrote:

Did you put this atheros card in one week ago? :)

ath0 at pci0 dev 16 function 0 Atheros AR5212 rev 0x01: irq 11
ath0: mac 80.9 phy 4.3 radio 3.6, 802.11a/b/g, FCC1A, address

btw.: could you also give us the exact product name (on the minipci card)?

3.7 would crash on ath0 with AR5212 in hostap mode every six hours or so. Your watchdog does an excellent job, otherwise your board would just freeze. Not sure if this is an ath problem or a card-specific problem, but later snapshots (june) made it worse: the system would freeze when you'd do ifconfig up - probably due to the gpio support that, as you can see, isn't here yet in this dmesg.

there were two fixes for hostap mode and it works for me without problems in the driver. one issue got fixed on 2005/05/28 (ath.c 1.29). the problem was a hardware counter overflow after some traffic, plus an uncaught interrupt that signalled the overflow and required the counter registers to be cleared. the second problem was related to receive overruns in the rx descriptor chain, which has been fixed as well (ath.c 1.31 2005/07/19).

And, as it may help too, my watchdog script:

#!/bin/sh
echo starting watchdog...
sysctl kern.watchdog.auto=0 > /dev/null
while : ; do
    sysctl kern.watchdog.period=10 > /dev/null
    sleep 8
done

also have a look at mbalmer@'s watchdogd(8), which was imported some weeks ago. this has some timing advantages over traditional watchdog scripts.

reyk
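One way to read the timing remark: the quoted script re-arms a 10-second watchdog after sleeping 8 seconds, so each pass has to finish within the remaining slack. A rough calculation, with a helper name invented for illustration, could look like this. Whether this slack played any part in the reboots discussed here is not established in the thread.

```shell
#!/bin/sh
# Sketch: slack left by a re-arm loop before the hardware watchdog fires.
# watchdog_margin is a hypothetical helper.
# $1 = watchdog period in seconds, $2 = sleep between re-arms in seconds.
watchdog_margin() {
    echo $(( $1 - $2 ))
}

# Values from the script quoted above: period 10, sleep 8.
watchdog_margin 10 8    # prints 2
```

With those values, a loop iteration delayed by more than about two seconds (for instance while the box is under heavy load) would let the timer expire. On releases that ship watchdogd(8), its manual page describes `-p period` and `-i interval` options for the same two settings; check the page on your release before relying on them.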
Re: OpenBSD 3.7 on Soekris rebooting at random
On Tue, 23 Aug 2005 17:10:08 +0200 Dimitri Georganas [EMAIL PROTECTED] wrote:

Did you put this atheros card in one week ago? :)

No, it's been in it for more than a month now and everything has been working smoothly until last week.

ath0 at pci0 dev 16 function 0 Atheros AR5212 rev 0x01: irq 11
ath0: mac 80.9 phy 4.3 radio 3.6, 802.11a/b/g, FCC1A, address

3.7 would crash on ath0 with AR5212 in hostap mode every six hours or so. Your watchdog does an excellent job, otherwise your board would just freeze.

This is why it gets paid ;) The reboot is not that periodic (6 hours as you say): it can be as short as 2 hours to 2 days with, once again, no obvious reason.

Your best chance is to check out the latest snapshots to see if the problem is fixed. I reported it a while ago, but didn't check back. I just replaced the atheros card by an old prism card and that one works 24/7.

Thanks for your advice, I'll check that.

-- Olivier Mehani [EMAIL PROTECTED]
Re: OpenBSD 3.7 on Soekris rebooting at random
On Tue, 23 Aug 2005 19:13:40 +0200 Reyk Floeter [EMAIL PROTECTED] wrote:

ath0 at pci0 dev 16 function 0 Atheros AR5212 rev 0x01: irq 11
ath0: mac 80.9 phy 4.3 radio 3.6, 802.11a/b/g, FCC1A, address

btw.: could you also give us the exact product name (on the minipci card)?

It is an Atheros 5354MP ARIES 200mW Mini PCI card. The card is marked NL-5354MP+ARIES2, and the chipset is marked AR5213A-00 A19911C 1804.

there were two fixes for hostap mode and it works for me without problems in the driver.

I'll upgrade my system and see if it's better.

also have a look at mbalmer@'s watchdogd(8) which had been imported some weeks ago. this has some timing advantages over traditional watchdog scripts.

Thanks for the advice, I'll look at it.

-- Olivier Mehani [EMAIL PROTECTED]
Re: OpenBSD 3.7 on Soekris rebooting at random
On Tue, Aug 23, 2005 at 07:13:40PM +0200, Reyk Floeter wrote:

On Tue, Aug 23, 2005 at 05:10:08PM +0200, Dimitri Georganas wrote:

Did you put this atheros card in one week ago? :)

ath0 at pci0 dev 16 function 0 Atheros AR5212 rev 0x01: irq 11
ath0: mac 80.9 phy 4.3 radio 3.6, 802.11a/b/g, FCC1A, address

btw.: could you also give us the exact product name (on the minipci card)?

It's a Senao 54g - but I will have to check for more detailed info tomorrow.

3.7 would crash on ath0 with AR5212 in hostap mode every six hours or so. [...] here yet in this dmesg.

there were two fixes for hostap mode and it works for me without problems in the driver. one issue got fixed on 2005/05/28 (ath.c 1.29). the problem was a hardware counter overflow after some traffic and an uncatched interrupt which indicated the overflow and required to clear the counter registers. the second problem was related to receive overruns in the rx descriptor chain, which had been fixed as well (ath.c 1.31 2005/07/19).

I checked 3.7 and found that the card froze every several hours. Then I loaded an early June (maybe late May) snapshot and it froze on ifconfig up. The main difference I could see was gpio being supported in the snapshot, something that wasn't there in 3.7. Maybe I got caught in between the two fixes.

I haven't time in the next 10 days to play with it, but maybe Olivier can give some feedback in case he tries the latest snapshot? In case the problem remains and the card remains unemployed here, I can send it to Germany to pursue a new career as developer-assisting hardware?