Hello,
I'll try to answer your questions. The server previously ran Linux, so I
can't say how it behaves with older NetBSD versions. I only know that there
were no problems under Linux, so I assume there are no hardware errors.
The PERC H330 is the internal controller; the H830 is installed as an
additional card.
Since the problem occurs randomly, the question is: Does it help to kill
the init process if the problem occurred several hours ago? And can a dump
even be written if the local disk is blocked?
Thank you for your efforts
Regards
Uwe
On Fri, 20 Jun 2025, Brian Buhrow wrote:
Hello. This sounds like an issue with interrupt handling. Specifically, it
seems like interrupts from the controller are not getting routed properly
after a while.
Also, another way to reboot without having to power cycle might be to have a
root shell open before the problem begins after a reboot. Then, when the
problem occurs, use the shell's built-in kill command to kill init with
signal 9. This should cause the system to panic and reboot. It would be
interesting to know whether things work again after this warm reboot or if
you need to power cycle things to get the interrupts going again.
Some further questions:
1. Are the two controllers on the same PCI interrupt?
2. I'm assuming this machine ran fine under older versions of NetBSD? If so,
were those older versions running with MSI/MSIX interrupts or the older
style of interrupt? If you were previously running with the old style
interrupts, make sure your motherboard and BIOS support MSI/MSIX interrupts.
You may need a BIOS update to get this working, assuming newer BIOS updates
are available for your hardware.
I'll note that MSI/MSIX interrupts weren't used on this driver until
NetBSD-10, so if you've been running release versions of NetBSD on this
hardware, you've not been using MSI/MSIX interrupts on the mfii cards until
now.
If no BIOS updates are available and this is the first version of NetBSD
you've been using with MSI/MSIX interrupts, try disabling MSI/MSIX and
seeing if that works better.
Hope that helps.
-thanks
-Brian