Hello,

I'm running NetBSD-current/amd64 on a Dell PowerEdge 2850, and have been experiencing mysterious hangs. With help and guidance from Christos Zoulas, I believe I now know what's going on. What to do about it is a more difficult question, and Christos suggested we take the issue to tech-kern.

The behaviour is as follows: when the machine is busy with disk and network I/O, using the integrated Dell PERC (AMI MegaRAID) "amr" RAID controller and Intel i8254x "wm" network interfaces, there will be sudden hangs, where it is completely unresponsive, including not responding to keypresses on the console, or to ICMP echo packets on the network.

In single CPU mode, these hangs will last for anything from just noticeable to almost half a minute at the most. Reducing the activity on the machine keeps them from occurring; piling it on again reintroduces them. In SMP mode, they will typically be much longer -- I then tend to get only one or two of, say, ten or twenty seconds, and then what appears to be a permanent one (although it looks as if I've had hangs, while I've been away from the machine, which have had it resume operation after about a half hour).

Christos and I went through a long sequence of tests and modifications, during which I, among other things, modified the amr driver to use mutexes and condvars instead of splbio()/splx(), and added a couple of bug fixes gleaned from FreeBSD. In the end, however, it turned out that the problem is interrupt storms from the integrated USB controller.

Here are some interrupt mappings (the devices that are actually in use are amr0, wm0, and uhci2 (the latter running a 1200 bps serial line over a ucom device, talking to my UPS, and normally generating about 1200 or so interrupts per second to do this)):

  amr0: interrupting at ioapic1 pin 14
  wm0: interrupting at ioapic2 pin 0
  wm1: interrupting at ioapic2 pin 1
  uhci0: interrupting at ioapic0 pin 16
  uhci1: interrupting at ioapic0 pin 19
  uhci2: interrupting at ioapic0 pin 18
  ehci0: interrupting at ioapic0 pin 23
  cmdide0: using ioapic0 pin 23 for native-PCI interrupt
  piixide0: primary channel interrupting at ioapic0 pin 14
  piixide0: secondary channel interrupting at ioapic0 pin 15
  radeon0: interrupting at ioapic0 pin 18 (radeon)

Now, here're some counters during a hang.
It was a three second hang, and a "vmstat -i 10" that was running jumped the count by about 14000 on uhci2, and 13000 on uhci0, during the period where the hang was exhibited. In the next 10 second interval, they were back to normal.

  interrupt           total    rate
  cpu0 timer          89770     100
  ioapic1 pin 14      26260      29
  ioapic2 pin 0       70060      78
  ioapic0 pin 16       1558       1
  ioapic0 pin 18      61740      68
  ioapic0 pin 23      20928      23
  ioapic0 pin 14          6       0
  ioapic0 pin 4         371       0
  Total              270693     302

  interrupt           total    rate
  cpu0 timer          90771     100
  ioapic1 pin 14      28832      31
  ioapic2 pin 0       70768      78
  ioapic0 pin 16      14920      16
  ioapic0 pin 18      76075      84
  ioapic0 pin 23      21079      23
  ioapic0 pin 14          6       0
  ioapic0 pin 4         377       0
  Total              302828     334

  interrupt           total    rate
  cpu0 timer          91772     100
  ioapic1 pin 14      30682      33
  ioapic2 pin 0       71440      78
  ioapic0 pin 16      14964      16
  ioapic0 pin 18      77350      84
  ioapic0 pin 23      21313      23
  ioapic0 pin 14          6       0
  ioapic0 pin 4         377       0
  Total              307904     336

It turns out to be a known problem with the particular Intel chip set Dell used in all its servers at the time this machine was built. See, for instance, these references:

  https://lists.freebsd.org/pipermail/freebsd-hardware/2005-June/002601.html
  http://freebsd.1045724.n5.nabble.com/em-interrupt-storm-td3877379.html

I think what's needed here is interrupt storm mitigation, maybe in a similar way to what FreeBSD does, in ithread_execute_handlers() in this source file:

  https://svnweb.freebsd.org/base/stable/10/sys/kern/kern_intr.c?view=co

However, I've been unable to figure out where to trap and throttle a storm in NetBSD. I had fun barking up the wrong tree when I discovered the softint subsystem, and instrumented that to (successfully) keep track of back-to-back invocations of the same handler, but all that really gave me was the realization that my hunch that hardware interrupt handling passes through that layer was wrong. :)

What do people think? Am I on the right track when I think interrupt storm mitigation is the way to go? If so, where would be the right place to do it?
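
To make the idea concrete, here is a rough userland sketch of the kind of bookkeeping I have in mind. As I read the FreeBSD code, it counts back-to-back interrupts per source and, once the count reaches a threshold, pauses for a clock tick before letting the source fire again. The sketch below only models the counting and the throttle decision. All names in it (storm_state, storm_init, storm_note_interrupt, storm_note_idle) are made up for illustration; none of this is FreeBSD or NetBSD code, and where such a check would hook into NetBSD's interrupt path is exactly the open question.

```c
/*
 * Userland sketch of back-to-back interrupt storm detection.
 * Every name here is hypothetical; this is not FreeBSD or NetBSD code.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct storm_state {
	unsigned int count;     /* interrupts seen since the source last went quiet */
	unsigned int threshold; /* 0 disables storm detection */
};

static void
storm_init(struct storm_state *s, unsigned int threshold)
{
	assert(s != NULL);
	s->count = 0;
	s->threshold = threshold;
}

/*
 * Call once per interrupt from the source being tracked.
 * Returns true if the caller should throttle (for instance, leave the
 * source masked until the next clock tick) rather than re-enable it
 * immediately.  While throttling, the count is left where it is, so
 * the source stays throttled until it is reported idle.
 */
static bool
storm_note_interrupt(struct storm_state *s)
{
	assert(s != NULL);
	if (s->threshold != 0 && s->count >= s->threshold)
		return true;
	s->count++;
	return false;
}

/* Call when the source has no more pending work, i.e. it went quiet. */
static void
storm_note_idle(struct storm_state *s)
{
	assert(s != NULL);
	s->count = 0;
}
```

With a threshold of 3, the first three calls to storm_note_interrupt() return false and the fourth returns true, until storm_note_idle() resets the count. What should count as "idle" for a hardware interrupt in NetBSD, and what a sensible threshold would be, are assumptions I have not tested.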

I'm happy to do the work, but will need some guidance along the way.

-tih
--
Popularity is the hallmark of mediocrity. --Niles Crane, "Frasier"