Re: [Openipmi-developer] [TEST PATCH] ipmi:si: Delay when an error is discovered in error recovery

2025-08-18 Thread Mark Bannister via Openipmi-developer
> Perfect, I'll queue it for the next kernel release. I can get it into
> the current release if it's urgent.
Thanks, that's great. Not urgent, we can apply the patch manually anyway.
> The change that caused this was c608966f3f9c "ipmi: fix msg stack when
> IPMI is disconnected" and it came in

[Openipmi-developer] [BUG] ipmi_si: watchdog: hard LOCKUP in smi_event_handler/kcs_event

2025-08-15 Thread Mark Bannister via Openipmi-developer
Hi Corey,
I crashed a machine on 1st August after issuing 'ipmitool mc reset cold' to reset a BMC. I got a crash dump from this event which I have been analyzing. The crash occurred when the NMI watchdog detected a hard LOCKUP in an interrupt handler:
[144482.968722] CPU: 1 PID: 96220 Comm: proc

Re: [Openipmi-developer] [TEST PATCH] ipmi:si: Delay when an error is discovered in error recovery

2025-08-15 Thread Mark Bannister via Openipmi-developer
> Thanks for the bug report and debugging info. I think I know what is
> going on, I've attached a patch that should hopefully fix it.
> Basically, it looks like the BMC is alive enough that it sort of
> responds to the host, but not alive enough to actually complete a
> transaction. The driver n

Re: [Openipmi-developer] [TEST PATCH] ipmi:si: Delay when an error is discovered in error recovery

2025-08-15 Thread Mark Bannister via Openipmi-developer
> > Thanks for the bug report and debugging info. I think I know what is
> > going on, I've attached a patch that should hopefully fix it.
> > Basically, it looks like the BMC is alive enough that it sort of
> > responds to the host, but not alive enough to actually complete a
> > transaction. Th