Hi guys. I've been running Debian 1.2 for quite a while now and it's been solid as a rock. The system is an ISA 486 DX4/100 with 32 MB of RAM. It's used as one of my dialup servers for clients, most of whom run PPP. It has 16 serial ports: one 8-port PC Com 16550 card, one 4-port 16550 card and a Stallion Brumby 4-port card.
Last Monday I came into work and found that none of the serial ports on the machine were working. The Brumby card was giving 'not responding' errors and the normal 16550s were all giving INIT respawning errors because they couldn't get to the modems. Looking closer, it seemed that each stuffed line had an LED lit on the modem that isn't usually lit when mgetty isn't running on it - either RTS or CTS, I can't remember which now. The machine itself was fine - the only thing wrong was the serial ports. Keep in mind that I had previously been running 2.0.29 with over 30 days of uptime and no problems at all, and the only reason it was just 30 days is because I upgrade kernels occasionally.

I tried shutting down and rebooting a few times to no avail - the cards were detected fine, but I couldn't access the modems. Instead of trying to fault-find the problem, I took the easy way out by 'pretending' all those serial cards were dead. I took them out, inserted a Stallion EasyConnect 8/32 card and plugged in a couple of 8-port modules to make 16 serial ports. I then compiled 2.0.30 with the Stallion driver as a module, rebooted, and presto, I had instant serial ports - and they worked. I figured I'd spend some time putting the 'faulty' cards into another machine to see if I could make them fail.

However, this was only the start of the problem. The next day I got errors like this in /var/log/messages:

    May 22 20:17:01 orion kernel: STALLION: bad RX interrupt ack value=f9

and

    May 25 07:52:12 orion kernel: STALLION: cd1400 device not responding, port=3 panel=1 brd=0

As soon as these messages popped up I knew that the card was dead and I had no access to the serial ports again - just like with the other cards! Also, the errors aren't always exactly these; sometimes it might be a different 'port' or 'ack value'. I shut down the system and rebooted (all done remotely - no power off) and the card came up fine and worked for around 24 hours or so.
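Since the STALLION messages quoted above vary in port and ack value, it may help to tally them by kind when going through /var/log/messages. What follows is only a minimal Python sketch, assuming the syslog line format stays exactly as in the two lines quoted above. The function names and the sample string are made up for illustration and are not part of any Stallion tooling.

```python
import re
from collections import Counter

# Sample lines copied from the two messages quoted above.
SAMPLE_LOG = """\
May 22 20:17:01 orion kernel: STALLION: bad RX interrupt ack value=f9
May 25 07:52:12 orion kernel: STALLION: cd1400 device not responding, port=3 panel=1 brd=0
"""

# Syslog prefix (month, day, time, host) followed by the driver message.
STALLION_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+STALLION:\s+(?P<msg>.*)$"
)
# key=value fields such as port=3 or value=f9.
FIELD_RE = re.compile(r"(\w+)=(\w+)")


def parse_stallion_lines(text):
    """Return (timestamp, message, fields) tuples for STALLION kernel lines."""
    events = []
    for line in text.splitlines():
        match = STALLION_RE.match(line.strip())
        if match:
            msg = match.group("msg")
            fields = dict(FIELD_RE.findall(msg))
            events.append((match.group("stamp"), msg, fields))
    return events


def tally_by_kind(events):
    """Count events by message text with the key=value fields stripped off."""
    counts = Counter()
    for _stamp, msg, _fields in events:
        kind = FIELD_RE.sub("", msg).strip(" ,")
        counts[kind] += 1
    return counts


if __name__ == "__main__":
    events = parse_stallion_lines(SAMPLE_LOG)
    for stamp, msg, fields in events:
        print(stamp, "|", msg, "|", fields)
    print(tally_by_kind(events))
```

Feeding it the real log contents instead of the sample string would show whether one port or panel turns up more often than the others, which might hint at a single bad module rather than the whole card.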
This same problem has happened nearly every day for the past week now. Each time it happens, I reboot and it's fine again for another day or so. I've been trying different kernels too - I've gone from 2.0.30 to 2.0.29 and now 2.0.28, which it just happened with 30 minutes ago as well. Each time, a reboot fixes it. I've checked all interrupts and I/O addresses - all appear fine with no conflicts.

The machine _was_ working fine for the past few months and the only things I've done were update Debian packages (stable only) and Linux kernels. I log all the system updates and changes I make so I can review them in case anything goes wrong at a later date. The last change I made was on the 4th of May, that being:

    Preparing to replace quota 1.55-4 (using .../admin/quota_1.55-8.deb) ...
    Preparing to replace at 2.9b-1 (using .../admin/at_3.1.4-2.deb) ...
    Preparing to replace util-linux 2.5-9 (using .../base/util-linux_2.5-11.deb) ...
    Preparing to replace kbd 0.92-3 (using .../base/kbd_0.92-3.1.deb) ...
    Preparing to replace e2fsprogs 1.09-1 (using .../base/e2fsprogs_1.10-2.deb) ...
    Preparing to replace qpopper 2.2-3 (using .../mail/qpopper_2.2-4.deb) ...

But keep in mind that there's a good 2 weeks between the last updates and the problem at hand, so I doubt they could have had anything to do with it. I'm about to try compiling the Stallion driver directly into the kernel instead of as a module to see what happens, but I doubt this'll have any effect. I also think it could be a hardware problem, but if so, what could it possibly be?
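For anyone wanting to repeat the interrupt and I/O address check mentioned above, it can be done mechanically. This is a rough Python sketch only: the IRQ and port values are copied by hand from the lsdev listing below, with the two ranges listed under 'serial' assumed to belong to the Stallion card as the listing's annotation suggests. The table layout and function names are made up for illustration.

```python
from itertools import combinations

# Values copied by hand from the lsdev listing; only the add-in cards
# are included here. Port ranges are inclusive (low, high) pairs.
DEVICES = {
    "SMC": {"irq": 15, "ports": [(0x0300, 0x031F)]},
    # Assumption: the 'serial' ranges in the listing are the Stallion's.
    "Stallion": {"irq": 10, "ports": [(0x0360, 0x0361), (0x0380, 0x039F)]},
    "aha1542": {"irq": 11, "ports": [(0x0330, 0x0333)]},
}


def ranges_overlap(a, b):
    """True if inclusive ranges a=(lo, hi) and b=(lo, hi) share an address."""
    return a[0] <= b[1] and b[0] <= a[1]


def find_conflicts(devices):
    """Return human-readable descriptions of shared IRQs or overlapping ports."""
    conflicts = []
    for (name_a, dev_a), (name_b, dev_b) in combinations(devices.items(), 2):
        if dev_a["irq"] == dev_b["irq"]:
            conflicts.append(f"{name_a} and {name_b} share IRQ {dev_a['irq']}")
        for ra in dev_a["ports"]:
            for rb in dev_b["ports"]:
                if ranges_overlap(ra, rb):
                    conflicts.append(
                        f"{name_a} {ra[0]:04x}-{ra[1]:04x} overlaps "
                        f"{name_b} {rb[0]:04x}-{rb[1]:04x}"
                    )
    return conflicts


if __name__ == "__main__":
    found = find_conflicts(DEVICES)
    print(found if found else "no conflicts among the listed devices")
```

Note that a check like this only covers resources the kernel has registered. A card jumpered to an address or IRQ that no driver has claimed would not show up in the listing at all.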
Here's the complete system configuration (for those who are still reading and haven't lost interest):

    [EMAIL PROTECTED]:p0:~] lsdev
    Device            DMA   IRQ  Ports
    ------------------------------------------------
    SMC                      15  0300-031f            (isa SMC Elite 16C Ultra network card)
    Stallion                 10                       (Stallion EC8/32 serial card)
    aha1542             5    11  0330-0333            (Adaptec 1542CF scsi card)
    cascade             4     2
    dma                          0080-009f
    dma1                         0000-001f
    dma2                         00c0-00df
    floppy                       03f0-03f5 03f7-03f7
    keyboard                  1  0060-006f
    math                     13
    npu                          00f0-00ff
    pic1                         0020-003f
    pic2                         00a0-00bf
    serial                       0360-0361 0380-039f  (Stallion serial port locations)
    timer                     0  0040-005f
    vga+                         03c0-03df

Do any of you have any ideas on how to stop me going bald? (Don't take that literally :)

--
Karl Ferguson        Tower Networking Pty Ltd     Tel: +61-8-9456-0000
[EMAIL PROTECTED]    t/a STAR Online Services     Fax: +61-8-9455-2776
[EMAIL PROTECTED]