On Wed, 21 Feb 2018 12:01:11 +1100
Balbir Singh <bsinghar...@gmail.com> wrote:

> On MCE the current code will restart the machine with
> ppc_md.restart(). This case was extremely unlikely, since
> a skiboot call made prior to that resulted in a checkstop
> for analysis.
> 
> With newer skiboots, on P9 we don't checkstop the box by
> default; instead we return to the kernel to extract
> useful information at the time of the MCE. While we still
> get this information, this patch converts the restart to
> a panic(), so that, if configured, a dump can be taken and
> we can track and probably debug the potential issue causing
> the MCE.
> 
> Signed-off-by: Balbir Singh <bsinghar...@gmail.com>

Seems like something we should be doing.

Reviewed-by: Nicholas Piggin <npig...@gmail.com>

> ---
>  arch/powerpc/platforms/powernv/opal.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
> index 69b5263fc9e3..b510a6f41b00 100644
> --- a/arch/powerpc/platforms/powernv/opal.c
> +++ b/arch/powerpc/platforms/powernv/opal.c
> @@ -500,9 +500,12 @@ void pnv_platform_error_reboot(struct pt_regs *regs, const char *msg)
>        *    opal to trigger checkstop explicitly for error analysis.
>        *    The FSP PRD component would have already got notified
>        *    about this error through other channels.
> +      * 4. We are running on a newer skiboot that by default does
> +      *    not cause a checkstop, drops us back to the kernel to
> +      *    extract context and state at the time of the error.
>        */
>  
> -     ppc_md.restart(NULL);
> +     panic("PowerNV Unrecovered Machine Check");
>  }
>  
>  int opal_machine_check(struct pt_regs *regs)
