Re: OpenBSD 3.7 on Soekris rebooting at random

2005-09-03 Thread Olivier Mehani
On Wed, 31 Aug 2005 12:47:03 +0200
Olivier Mehani [EMAIL PROTECTED] wrote:

 I've just finished upgrading my router to 3.8-beta (GENERIC#119).

OK, the machine has been running without problems or unwanted reboots
for almost three days. It was never able to last that long before the
upgrade, so I think the problem is fixed. Ath works correctly in
hostap mode with said kernel ;).

Thank you for the advice!

-- 
Olivier Mehani [EMAIL PROTECTED]
PGP fingerprint: 3720 A1F7 1367 9FA3 C654 6DFB 6845 4071 E346 2FD1



Re: OpenBSD 3.7 on Soekris rebooting at random

2005-08-31 Thread Olivier Mehani
On Tue, 23 Aug 2005 19:49:46 +0200
[EMAIL PROTECTED] wrote:

 I haven't time in the next 10 days to play with it, but maybe Olivier
 can give some feedback in case he tries the latest snapshot?

I've just finished upgrading my router to 3.8-beta (GENERIC#119).
I'm going to stress the machine a little now ;)

I'll keep you informed.

-- 
Olivier Mehani [EMAIL PROTECTED]
PGP fingerprint: 3720 A1F7 1367 9FA3 C654 6DFB 6845 4071 E346 2FD1



OpenBSD 3.7 on Soekris rebooting at random

2005-08-23 Thread Olivier Mehani
Hi,

I'm facing a strange problem (it started a week or so ago):

My OpenBSD 3.7 system running on a Soekris net4511 reboots for no obvious
reason. I've started monitoring memory usage, load average and pf
states, but these do not seem to be related to the problem.

I'm also using the hardware watchdog, which I will disable to see if it
is involved in the problem, but everything had been working well with it
for more than two months before this.

Do you have any suggestions for other things I should monitor?

Thanks

--
Olivier Mehani [EMAIL PROTECTED]
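
A minimal sketch of the kind of monitoring described above, assuming samples are appended to a log file so that the last entry before a reboot can be inspected afterwards. The function name log_sample and any log path passed to it are hypothetical, not something from the original setup.

```shell
#!/bin/sh
# Sketch only: log_sample is a hypothetical helper, not from the thread.

# Append one timestamped sample of a command's output to a log file.
# Usage: log_sample LOGFILE COMMAND [ARGS...]
log_sample() {
	logfile=$1
	shift
	{
		# Header line: timestamp plus the command that was run.
		echo "=== $(date '+%Y-%m-%d %H:%M:%S') $*"
		# Run the command, capturing both stdout and stderr.
		"$@" 2>&1
	} >> "$logfile"
}
```

On the router this might be called periodically with commands such as uptime, vmstat and pfctl -si (memory, load average and pf state counters), so that the final sample before a reset survives on disk.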




Re: OpenBSD 3.7 on Soekris rebooting at random

2005-08-23 Thread Dimitri Georganas

a dmesg may be helpful...



Re: OpenBSD 3.7 on Soekris rebooting at random

2005-08-23 Thread Darren Tucker
Olivier Mehani wrote:
 My OpenBSD 3.7 running on a Soekris net4511 reboots with no obvious
 reason. I've started monitoring the memory usage, load average and pf
 states, but these do not seem to be related to the problem.
[...]
 Do you have any suggestion of other things I should monitor ?

Input voltage and current?  I've seen reports of similar behaviour from
Soekris boxes with underspec or flaky power supplies.

-- 
Darren Tucker (dtucker at zip.com.au)
GPG key 8FF4FA69 / D9A3 86E9 7EEE AF4B B2D4  37C9 C982 80C7 8FF4 FA69
Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.



Re: OpenBSD 3.7 on Soekris rebooting at random

2005-08-23 Thread Olivier Mehani
On Tue, 23 Aug 2005 15:21:53 +0200
Dimitri Georganas [EMAIL PROTECTED] wrote:

 I'm facing a strange problem (started a week or so ago):
 a dmesg may be helpful...

Yes, I realised just after posting that I forgot to include it, sorry...

Anyway, it confirms that it was the watchdog that triggered the
reset, but I still don't know why...

Full dmesg follows:

OpenBSD 3.7 (GENERIC) #50: Sun Mar 20 00:01:57 MST 2005
[EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: AMD Am486DX4 W/B or Am5x86 W/B 150 (AuthenticAMD 486-class)
cpu0: FPU
real mem  = 66691072 (65128K)
avail mem = 53448704 (52196K)
using 839 buffers containing 3436544 bytes (3356K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(00) BIOS, date 20/41/22, BIOS32 rev. 0 @ 0xf7840
pcibios0 at bios0: rev 2.0 @ 0xf/0x1
pcibios0: pcibios_get_intr_routing - function not supported
pcibios0: PCI IRQ Routing information unavailable.
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc8000/0x9000
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
elansc0 at pci0 dev 0 function 0 AMD ElanSC520 PCI rev 0x00: product 0 stepping 1.1, CPU clock 100MHz, reset 8WDT
elansc0: WARNING: LAST RESET DUE TO WATCHDOG EXPIRATION!
gpio0 at elansc0: 32 pins
cbb0 at pci0 dev 9 function 0 Texas Instruments PCI1410 CardBus rev 0x02: irq 10
ath0 at pci0 dev 16 function 0 Atheros AR5212 rev 0x01: irq 11
ath0: mac 80.9 phy 4.3 radio 3.6, 802.11a/b/g, FCC1A, address 00:02:6f:21:ea:79
gpio at ath0 not configured
sis0 at pci0 dev 18 function 0 NS DP83815 10/100 rev 0x00: DP83816A, irq 5, address 00:00:24:c4:22:5c
nsphyter0 at sis0 phy 0: DP83815 10/100 PHY, rev. 1
sis1 at pci0 dev 19 function 0 NS DP83815 10/100 rev 0x00: DP83816A, irq 9, address 00:00:24:c4:22:5d
nsphyter1 at sis1 phy 0: DP83815 10/100 PHY, rev. 1
cardslot0 at cbb0 slot 0 flags 0
cardbus0 at cardslot0: bus 1 device 0 cacheline 0x10, lattimer 0x3f
pcmcia0 at cardslot0
isa0 at mainbus0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0 (mux 1 ignored for console): console keyboard
wdc0 at isa0 port 0x1f0/8 irq 14
wd0 at wdc0 channel 0 drive 0: Hitachi XX.V.3.4.0.0
wd0: 1-sector PIO, LBA, 488MB, 1000944 sectors
wd0(wdc0:0:0): using BIOS timings
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom0: console
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
biomask f5c5 netmask ffe5 ttymask ffe7
pctr: no performance counters in CPU
rtw0 at cardbus0 dev 0 function 0 irq 10
rtw0: ver F, radio SA2400A, amp SA2411, address 00:0f:3d:cf:cb:e8
dkcsum: wd0 matched BIOS disk 80
root on wd0a
rootdev=0x0 rrootdev=0x300 rawdev=0x302
WARNING: / was not properly unmounted

And, as it may help too, my watchdog script:

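#!/bin/sh
echo "starting watchdog..."

# Turn off automatic retriggering; this script re-arms the timer itself.
sysctl kern.watchdog.auto=0 > /dev/null

# Re-arm the 10-second timer every 8 seconds.
while : ; do
	sysctl kern.watchdog.period=10 > /dev/null
	sleep 8
done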

-- 
Olivier Mehani [EMAIL PROTECTED]
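
As a hedged aside on the dmesg above: the elansc0 warning line is what identifies a watchdog-triggered reset, and a check for it could be scripted roughly as follows. The function name is hypothetical; the matched string is copied from the dmesg in this message.

```shell
#!/bin/sh
# Sketch only: last_reset_was_watchdog is a hypothetical helper name.

# Read dmesg text on stdin and succeed (exit status 0) if it contains
# the elansc warning quoted in this thread.
last_reset_was_watchdog() {
	grep -q 'LAST RESET DUE TO WATCHDOG EXPIRATION'
}
```

On the router the function would be fed from dmesg through a pipe, for example at boot time, to record that the previous reset came from the watchdog rather than from a power problem.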



Re: OpenBSD 3.7 on Soekris rebooting at random

2005-08-23 Thread Dimitri Georganas

Did you put this atheros card in one week ago? :)


ath0 at pci0 dev 16 function 0 Atheros AR5212 rev 0x01: irq 11
ath0: mac 80.9 phy 4.3 radio 3.6, 802.11a/b/g, FCC1A, address


3.7 would crash on ath0 with an AR5212 in hostap mode every six hours or so.
Your watchdog does an excellent job; otherwise your board would just freeze.

I'm not sure whether this is an ath problem or a card-specific problem, but
later snapshots (June) made it worse: the system would freeze on ifconfig up,
probably due to the gpio support that, as you can see, isn't there yet in
this dmesg.

Your best chance is to check out the latest snapshots to see if the problem is
fixed. I reported it a while ago, but didn't check back. I just replaced the
Atheros card with an old Prism card, and that one works 24/7.










Re: OpenBSD 3.7 on Soekris rebooting at random

2005-08-23 Thread Theo de Raadt
 My OpenBSD 3.7 running on a Soekris net4511 reboots with no obvious
 reason. I've started monitoring the memory usage, load average and pf
 states, but these do not seem to be related to the problem.
 
 I'm also using the hardware watchdog which I will disable to see if it
 is involved in the problem, but everything has been working well for
 more than two months with it before.

If the couple of thousand soekris machines of various developers and
users started rebooting just like that, we would have heard of it by
now.

Very hard to diagnose when one rare one somewhere, without *any debugging
information* does so.



Re: OpenBSD 3.7 on Soekris rebooting at random

2005-08-23 Thread Reyk Floeter
On Tue, Aug 23, 2005 at 06:15:45PM +0200, Olivier Mehani wrote:
 The reboot is not that periodic (6 hours as you say): it can be as
 short as 2 hours to 2 days with, once again, no obvious reason.
  

try a ping -f against your access point over the wireless interface and
it will probably reboot/lock up very quickly.

reyk



Re: OpenBSD 3.7 on Soekris rebooting at random

2005-08-23 Thread John R. Shannon
A common cause is an insufficient power supply. This has been discussed on the 
Soekris technical mailing list many times.

On Tuesday 23 August 2005 10:04 am, Theo de Raadt wrote:
  My OpenBSD 3.7 running on a Soekris net4511 reboots with no obvious
  reason. I've started monitoring the memory usage, load average and pf
  states, but these do not seem to be related to the problem.
 
  I'm also using the hardware watchdog which I will disable to see if it
  is involved in the problem, but everything has been working well for
  more than two months with it before.

 If the couple of thousand soekris machines of various developers and
 users started rebooting just like that, we would have heard of it by
 now.

 Very hard to diagnose when one rare one somewhere, without *any debugging
 information* does so.

-- 
John R. Shannon
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]



Re: OpenBSD 3.7 on Soekris rebooting at random

2005-08-23 Thread Reyk Floeter
On Tue, Aug 23, 2005 at 05:10:08PM +0200, Dimitri Georganas wrote:
 Did you put this atheros card in one week ago? :)
 
 ath0 at pci0 dev 16 function 0 Atheros AR5212 rev 0x01: irq 11
 ath0: mac 80.9 phy 4.3 radio 3.6, 802.11a/b/g, FCC1A, address
 

btw.: could you also give us the exact product name (on the minipci card)?

 3.7 would crash on ath0 with AR5212 in hostap mode every six hours or so. 
 Your watchdog does an excellent job, otherwise your board would just freeze.
 
 Not sure if this is an ath problem or a card-specific problem, but later 
 snapshots (june) made it worse: the system would freeze when you'd do 
 ifconfig up - probably due to the gpio support that, as you can see, isn't 
 here yet in this dmesg.
 

there were two fixes for hostap mode, and the driver now works for me
without problems.

one issue got fixed on 2005/05/28 (ath.c 1.29). the problem was a
hardware counter overflow after some traffic, together with an uncaught
interrupt which indicated the overflow and required clearing the
counter registers. the second problem was related to receive overruns
in the rx descriptor chain, which has been fixed as well (ath.c 1.31,
2005/07/19).

 And, as it may help too, my watchdog script:
 
 #!/bin/sh
 echo starting watchdog...
  
 sysctl kern.watchdog.auto=0 > /dev/null
  
 while : ; do
     sysctl kern.watchdog.period=10 > /dev/null
     sleep 8
 done

also have a look at mbalmer@'s watchdogd(8), which was imported a few
weeks ago. it has some timing advantages over traditional
watchdog scripts.

reyk
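
To make the timing point concrete: the script quoted above re-arms a 10-second timer every 8 seconds, so it has roughly 2 seconds of slack if sysctl or sleep is delayed under load. A minimal sketch of that arithmetic follows. The function names are hypothetical and are not part of watchdogd(8) or of the original script; only the sysctl name comes from the thread.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not from the thread.

# Print the slack, in seconds, between re-arming the timer and its expiry.
# Usage: watchdog_margin PERIOD INTERVAL
watchdog_margin() {
	period=$1
	interval=$2
	echo $((period - interval))
}

# Succeed only if the refresh interval leaves at least MIN_MARGIN seconds.
# Usage: watchdog_timing_ok PERIOD INTERVAL MIN_MARGIN
watchdog_timing_ok() {
	[ "$(watchdog_margin "$1" "$2")" -ge "$3" ]
}

# Re-arm the timer once, using the sysctl from the quoted script.
# This needs root and a supported watchdog device on a real system.
watchdog_kick() {
	sysctl kern.watchdog.period="$1" > /dev/null
}
```

With the values from the quoted script, watchdog_margin 10 8 yields 2, so any stall longer than about two seconds between kicks would let the hardware reset the board even though the system is otherwise healthy.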



Re: OpenBSD 3.7 on Soekris rebooting at random

2005-08-23 Thread Olivier Mehani
On Tue, 23 Aug 2005 17:10:08 +0200
Dimitri Georganas [EMAIL PROTECTED] wrote:

 Did you put this atheros card in one week ago? :)

No, it has been in there for more than a month now, and everything
worked smoothly until last week.

 ath0 at pci0 dev 16 function 0 Atheros AR5212 rev 0x01: irq 11
 ath0: mac 80.9 phy 4.3 radio 3.6, 802.11a/b/g, FCC1A, address

 3.7 would crash on ath0 with AR5212 in hostap mode every six hours or
 so. Your watchdog does an excellent job, otherwise your board would
 just freeze.

This is why it gets paid ;)

The reboots are not that periodic (every 6 hours, as you say): the interval
can be anywhere from 2 hours to 2 days with, once again, no obvious reason.
 
 Your best chance is to check out the latest snapshots to see if the
 problem is fixed. I reported it a while ago, but didn't check back. I
 just replaced the atheros card by an old prism card and that one
 works 24/7.

Thanks for your advice, I'll check that.

-- 
Olivier Mehani [EMAIL PROTECTED]



Re: OpenBSD 3.7 on Soekris rebooting at random

2005-08-23 Thread Olivier Mehani
On Tue, 23 Aug 2005 19:13:40 +0200
Reyk Floeter [EMAIL PROTECTED] wrote:

  ath0 at pci0 dev 16 function 0 Atheros AR5212 rev 0x01: irq 11
  ath0: mac 80.9 phy 4.3 radio 3.6, 802.11a/b/g, FCC1A, address
 btw.: could you also give us the exact product name (on the minipci
 card)?

It is an Atheros 5354MP ARIES 200mW Mini PCI card.

The card is labelled NL-5354MP+ARIES2, and the chipset is marked
AR5213A-00 A19911C 1804.

 there were two fixes for hostap mode and it works for me without
 problems in the driver.

I'll upgrade my system and see if it's better.

 also have a look at mbalmer@'s watchdogd(8) which had been imported
 some weeks ago. this has some timing advantages over traditional
 watchdog scripts.

Thanks for the advice, I'll look at it

-- 
Olivier Mehani [EMAIL PROTECTED]



Re: OpenBSD 3.7 on Soekris rebooting at random

2005-08-23 Thread dg
On Tue, Aug 23, 2005 at 07:13:40PM +0200, Reyk Floeter wrote:
 On Tue, Aug 23, 2005 at 05:10:08PM +0200, Dimitri Georganas wrote:
  Did you put this atheros card in one week ago? :)
  
  ath0 at pci0 dev 16 function 0 Atheros AR5212 rev 0x01: irq 11
  ath0: mac 80.9 phy 4.3 radio 3.6, 802.11a/b/g, FCC1A, address
  
 
 btw.: could you also give us the exact product name (on the minipci card)?

It's a Senao 54g - but I will have to check for more detailed info tomorrow.

 
  3.7 would crash on ath0 with AR5212 in hostap mode every six hours or so. 
snap
  here yet in this dmesg.
  
 
 there were two fixes for hostap mode and it works for me without
 problems in the driver.
 
 one issue got fixed on 2005/05/28 (ath.c 1.29). the problem was a
 hardware counter overflow after some traffic and an uncatched
 interrupt which indicated the overflow and required to clear the
 counter registers. the second problem was related to receive overruns
 in the rx descriptor chain, which had been fixed as well (ath.c 1.31
 2005/07/19).
 
I checked 3.7 and found the card to freeze every few hours. Then I loaded an
early June (maybe late May) snapshot and it froze on ifconfig up. The main
difference I could see was gpio being supported in the snapshot, something
that wasn't there in 3.7. Maybe I got caught in between two fixes.

I haven't time in the next 10 days to play with it, but maybe Olivier can give
some feedback in case he tries the latest snapshot?

In case the problem remains and the card stays unemployed here, I can send it
to Germany to pursue a new career as developer-assisting hardware?
