Hello. Knowing that a warm reset doesn't come up clean is useful. Looking at mfii.c, there appear to be two possible sources of the problem. The first is the one I mentioned earlier: somehow interrupt handling gets mangled during operation, and interrupts stop being received from the Perc controller. The second is that the Perc controller itself gets into a weird state that causes its firmware to stop completing requests.

I'm not sure which source to look at first, so here are some suggestions.

1. Before the problem occurs, can you capture some dmesg output showing how the mfii devices attach and what interrupts they're using?
2. What does the output of vmstat -i look like when things are working?
3. Have you brought up the Perc's RAID configuration menu to confirm the RAID sets are healthy and that you're not getting any disk errors which might be masked from NetBSD itself?

I've seen this sort of behavior when a disk is throwing errors: the Perc firmware is so busy dealing with the problem disk that it stops responding to the mfii(4) driver. Unfortunately, these kinds of errors don't get reported well under NetBSD; I'm not sure whether that's a problem with the mfii(4) driver or with the firmware on the Perc itself. Because the errors happen at random intervals after the machine boots, it's possible the issue is a good old-fashioned failing disk. I do realize you see the errors on two separate controllers, which is why I'm leaning toward an interrupt issue, but it would be good to verify your disks are good.

Hope that helps.
-Brian
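
P.S. In case it saves some typing, here is a rough sketch of how items 1 and 2 could be captured in one go. The function name, the output directory, and the file names are just placeholders I made up; only dmesg and vmstat -i are the real commands. It assumes the driver's boot messages contain the string "mfii".

```sh
# Hypothetical helper: save the mfii attach messages and the current
# interrupt counters into a directory (defaults to /tmp).
capture_mfii_state() {
    out=${1:-/tmp}
    mkdir -p "$out" || return 1
    # How the mfii devices attached and which interrupts they were given.
    dmesg | grep -i mfii > "$out/mfii-dmesg.txt"
    # Per-device interrupt counts, timestamped so that snapshots taken
    # at different times can be compared with diff.
    vmstat -i > "$out/vmstat-i.$(date +%s).txt"
}
```

Running something like `capture_mfii_state /root/mfii-good` while things are working, and again later if the machine is still usable, would give two vmstat -i snapshots to diff. If the counts for the mfii interrupts have stopped advancing while other devices keep counting, that would point toward the interrupt theory.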