On 2022-11-14 13:26:07 Mon, Ganesh Goudar wrote:
> machine_check_log_err() is not getting called for all
> unrecoverable errors, And we are missing to log the error.
>
> Raise irq work in save_mce_event() for unrecoverable errors,
> So that we log the error from MCE event handling block in
> timer handler.
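
For anyone following the thread, here is a minimal user-space sketch of the flow described above. It is not the kernel code: `save_event()`, `queue_work()`, `run_deferred_work()`, `struct event` and the `DISP_*` values are hypothetical stand-ins, and the "deferred handler" is just a function called later. It only illustrates the idea that the save path raises the work for any unrecoverable event, so the deferred logger gets a chance to run.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's disposition values. */
enum disposition { DISP_RECOVERED, DISP_NOT_RECOVERED };

struct event {
	enum disposition disposition;
	bool has_addr;	/* models the "if (!addr) return;" early exit */
};

static bool work_pending;	/* models a raised irq work */
static int logged_events;	/* counts runs of the deferred logger */

/* Models mce_irq_work_queue(): only marks work as pending. */
static void queue_work(void)
{
	work_pending = true;
}

/* Models save_mce_event() with the patch applied. */
static void save_event(const struct event *evt)
{
	/* Queued ahead of the early return, mirroring the placement in the diff. */
	if (evt->disposition == DISP_NOT_RECOVERED)
		queue_work();

	if (!evt->has_addr)
		return;

	/* Address-specific bookkeeping would go here. */
}

/* Models the deferred handler that ends up logging the error. */
static void run_deferred_work(void)
{
	if (!work_pending)
		return;

	work_pending = false;
	logged_events++;
	printf("logged unrecoverable event\n");
}
```

In this model, calling `save_event()` on an unrecoverable event and then `run_deferred_work()` logs the event once, while a recovered event leaves nothing pending.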

Thanks for fixing this.

Reviewed-by: Mahesh Salgaonkar <mah...@linux.ibm.com>

>
> Signed-off-by: Ganesh Goudar <ganes...@linux.ibm.com>
> ---
>  arch/powerpc/kernel/mce.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index 6c5d30fba766..a1cb2172eb7b 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -131,6 +131,13 @@ void save_mce_event(struct pt_regs *regs, long handled,
>  	if (mce->error_type == MCE_ERROR_TYPE_UE)
>  		mce->u.ue_error.ignore_event = mce_err->ignore_event;
>
> +	/*
> +	 * Raise irq work, So that we don't miss to log the error for
> +	 * unrecoverable errors.
> +	 */
> +	if (mce->disposition == MCE_DISPOSITION_NOT_RECOVERED)
> +		mce_irq_work_queue();
> +
>  	if (!addr)
>  		return;
>
> @@ -235,7 +242,6 @@ static void machine_check_ue_event(struct machine_check_event *evt)
>  		       evt, sizeof(*evt));
>
>  	/* Queue work to process this event later. */
> -	mce_irq_work_queue();
>  }

With your patch, we can now see the RTAS event logged for other
unrecoverable errors as well.

[  573.006337] Disabling lock debugging due to kernel taint
[  573.006357] MCE: CPU27: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered]
[  573.006362] MCE: CPU27: PID: 10580 Comm: inject-ra-err NIP: [0000000010000df4]
[  573.006366] MCE: CPU27: Initiator CPU
[  573.006369] MCE: CPU27: Unknown
[  573.006426] RTAS: event: 1, Type: Platform Error (224), Severity: 3

Tested-by: Mahesh Salgaonkar <mah...@linux.ibm.com>

Thanks,
-Mahesh.