On Thu, Dec 30, 2021 at 8:02 PM Diggy Mc <d...@bornfree.org> wrote:

>
> I have oVirt Node v4.4.8.3 running on several HP ProLiant Gen8 servers.  I
> receive the following error under certain circumstances:
> "An Unrecoverable System Error (NMI) has occurred (iLO application
> watchdog timeout NMI, Service Information: 0x0000002B, 0x00000000)"
>
> When a host starts taking a load (but nowhere near a threshold), I
> encounter the above iLO-logged error and the host locks up.  I have had to
> grossly under-utilize my hosts to avoid this problem.  I'm hoping for a
> better fix or work-around.
>
> I've had the same problem beginning with my oVirt 4.3.x hosts, so it isn't
> oVirt version specific.
>
> The little information I could find on the error wasn't helpful.  Red Hat
> acknowledges the issue, but limited to shutdown/reboot operations; not
> during "normal" operations.
>
> Anyone else experienced this problem?  How did you fix it or work around
> it?  I'd like to better utilize my servers if possible.
>
> In advance, thank you to anyone and everyone who offers help.
>
NMI errors are usually hardware related or kernel / system related (e.g.
memory failure, hardware health check watchdog, etc.).
They are not oVirt related per se.

That said, there is an HPE community report with the same NMI service code:
https://community.hpe.com/t5/ProLiant-Servers-ML-DL-SL/Proliant-dl360p-gen8An-Unrecoverable-SystemError-NMI-has/td-p/7043891#.YdHHOduxUik
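If it helps when comparing your iLO log entries against reports like the one
above, here is a minimal sketch that pulls the two "Service Information"
values out of the message text. The helper name parse_service_info and the
regular expression are assumptions inferred from the single message quoted in
this thread, not an HPE-documented format.

```python
import re

# Message text copied from the iLO error quoted earlier in this thread.
ILO_MESSAGE = (
    "An Unrecoverable System Error (NMI) has occurred (iLO application "
    "watchdog timeout NMI, Service Information: 0x0000002B, 0x00000000)"
)

# Assumed pattern: two comma-separated hex values after "Service Information:".
# This is inferred from one sample message and may not cover other iLO entries.
SERVICE_INFO = re.compile(
    r"Service Information:\s*(0x[0-9A-Fa-f]+),\s*(0x[0-9A-Fa-f]+)"
)


def parse_service_info(message):
    """Return the two service information codes as integers, or None."""
    match = SERVICE_INFO.search(message)
    if match is None:
        return None
    return tuple(int(code, 16) for code in match.groups())


if __name__ == "__main__":
    print(parse_service_info(ILO_MESSAGE))
```

Two log entries whose parsed tuples are equal carry the same service code,
which is all this sketch is meant to check.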

- Gilboa
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MXADE3ZVXA3VNQISODECP5XQEBEUYA4Y/
