On Tue, 2019-01-15 at 19:30 -0700, Jerry Hoemann wrote:
> On Mon, Jan 14, 2019 at 07:36:17AM +0500, Ivan Mironov wrote:
> > This adds an option to not panic on NMI even if it was caused by iLO.
> > 
> > Signed-off-by: Ivan Mironov <mironov.i...@gmail.com>
> > ---
> >  drivers/watchdog/hpwdt.c | 19 ++++++++++++++++---
> >  1 file changed, 16 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
> > index 95d002b5b120..b12858491189 100644
> > --- a/drivers/watchdog/hpwdt.c
> > +++ b/drivers/watchdog/hpwdt.c
> > @@ -37,6 +37,10 @@ static unsigned int soft_margin = DEFAULT_MARGIN;        
> > /* in seconds */
> >  static bool nowayout = WATCHDOG_NOWAYOUT;
> >  static bool pretimeout = IS_ENABLED(CONFIG_HPWDT_NMI_DECODING);
> >  
> > +#ifdef CONFIG_HPWDT_NMI_DECODING
> > +static bool panic_on_nmi = true;
> > +#endif /* CONFIG_HPWDT_NMI_DECODING */
> > +
> >  static void __iomem *pci_mem_addr;         /* the PCI-memory address */
> >  static unsigned long __iomem *hpwdt_nmistat;
> >  static unsigned long __iomem *hpwdt_timer_reg;
> > @@ -146,7 +150,10 @@ static int hpwdt_set_pretimeout(struct watchdog_device 
> > *wdd, unsigned int req)
> >  
> >  static int hpwdt_my_nmi(void)
> >  {
> > -   return ioread8(hpwdt_nmistat) & 0x6;
> > +   int nmistat = ioread8(hpwdt_nmistat);
> > +
> > +   iowrite8(nmistat & ~0x6, hpwdt_nmistat);
> > +   return nmistat & 0x6;
> 
> This is a read only register.
> 

Oops... At least on my system this code has the desired effect:
subsequent function call returns zero.

Probably it would be better to use additional variable to determine
whether NMI from iLO already happened or not.

> 
> >  }
> >  
> >  /*
> > @@ -168,7 +175,10 @@ static int hpwdt_pretimeout(unsigned int ulReason, 
> > struct pt_regs *regs)
> >              "4. iLO Event Log\n",
> >              mynmi, ulReason, smp_processor_id());
> >  
> > -   nmi_panic(regs, "hpwdt: NMI: Not continuing");
> > +   if (panic_on_nmi)
> > +           nmi_panic(regs, "hpwdt: NMI: Not continuing");
> > +
> > +   pr_emerg("Dazed and confused, but trying to continue\n");
> >  
> 
> The watchdog core provides a way to enable/disable the NMI pretimeout 
> dynamically
> via ioctl.  The pretimeout NMI can also be disabled via module parameter to 
> hpwdt.
> This adds complexity without really adding functionality.
> 

It looks like disabling pretimout will disable panics only for iLO 5
(mine system has iLO 2). Or am I missing something?

> 
> BTW, don't reuse error messages.  Makes it difficult to tell where the error
> originated from.
> 

Would it be better to just return NMI_DONE and thus reuse normal NMI
handling from kernel, with logging and panic/don't panic logic?

> 
> >     return NMI_HANDLED;
> >  }
> > @@ -376,6 +386,9 @@ MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped 
> > once started (default="
> >  #ifdef CONFIG_HPWDT_NMI_DECODING
> >  module_param(pretimeout, bool, 0);
> >  MODULE_PARM_DESC(pretimeout, "Watchdog pretimeout enabled");
> > -#endif
> > +
> > +module_param(panic_on_nmi, bool, 0);
> > +MODULE_PARM_DESC(panic_on_nmi, "Cause panic on NMI induced by iLO 
> > (default=1)");
> > +#endif /* CONFIG_HPWDT_NMI_DECODING */
> >  
> >  module_pci_driver(hpwdt_driver);
> > -- 
> > 2.20.1

Reply via email to