Re: ipmi patch for review

2014-05-30 Thread Alfred Perlstein


On 5/30/14, 10:44 AM, Doug Ambrisko wrote:

On Thu, Sep 19, 2013 at 03:04:46PM -0400, John Baldwin wrote:
| On Tuesday, September 17, 2013 6:21:10 am Gleb Smirnoff wrote:
| >   Hi!
| >
| >   When system is writing a kernel core dump, it issues watchdog
| > pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers
| > ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is
| > called in dumping context.
| >
| > The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(),
| > that calls into ipmi_alloc_request(), which uses M_WAITOK and
| > thus sleeps. This is a smaller problem, since can be converted to
| > M_NOWAIT. But ipmi_set_watchdog() then calls into
| > ipmi_submit_driver_request(), which calls msleep() any time.
| >
| >   The attached patch allows me to successfully write cores in
| > presence of IPMI.
|
| Of course, the watchdog might go off during your dump. :)
|
| The real fix is more complicated, which is that we should not use
| a worker thread for at least SMIC and KCS.

I haven't looked at this patch but I have local code that switches
KCS into polling direct mode when the kernel goes into panic mode.
I use this to write Linux compatible back traces into the system
event logs.  This could allow the watchdogd to continue to work.
This should be easily extended to SMIC mode.  SMBUS would be
harder but at a prior company I made the SMBIO driver work in polled
mode.

If someone wants to look at this I can post the changes for KCS and
the kernel backtrace to the system event log.  We find this useful
when looking at customer machines.

IPMI gets upset if things get intermixed/interrupted so there needs
to be serialization and cancellation if being interrupted.

These patches would be really nice to have in base.  I noticed this 
problem too, you can't really touch watchdogs some of the time when in a 
panic(9) situation and it leaves you in a bad state to stop them or 
reset them while you are dumping.


Thank you for looking at this.

-Alfred
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ipmi patch for review

2014-05-30 Thread Doug Ambrisko
On Thu, Sep 19, 2013 at 03:04:46PM -0400, John Baldwin wrote:
| On Tuesday, September 17, 2013 6:21:10 am Gleb Smirnoff wrote:
| >   Hi!
| > 
| >   When system is writing a kernel core dump, it issues watchdog
| > pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers
| > ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is
| > called in dumping context.
| > 
| > The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(),
| > that calls into ipmi_alloc_request(), which uses M_WAITOK and
| > thus sleeps. This is a smaller problem, since can be converted to
| > M_NOWAIT. But ipmi_set_watchdog() then calls into
| > ipmi_submit_driver_request(), which calls msleep() any time.
| > 
| >   The attached patch allows me to successfully write cores in
| > presence of IPMI.
| 
| Of course, the watchdog might go off during your dump. :)
| 
| The real fix is more complicated, which is that we should not use
| a worker thread for at least SMIC and KCS.

I haven't looked at this patch but I have local code that switches
KCS into polling direct mode when the kernel goes into panic mode.
I use this to write Linux compatible back traces into the system
event logs.  This could allow the watchdogd to continue to work.
This should be easily extended to SMIC mode.  SMBUS would be 
harder but at a prior company I made the SMBIO driver work in polled
mode.

If someone wants to look at this I can post the changes for KCS and
the kernel backtrace to the system event log.  We find this useful
when looking at customer machines.

IPMI gets upset if things get intermixed/interrupted so there needs
to be serialization and cancellation if being interrupted.

Thanks,

Doug A.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ipmi patch for review

2013-09-20 Thread John Baldwin
On Friday, September 20, 2013 1:44:52 am Gleb Smirnoff wrote:
>   John,
> 
> On Thu, Sep 19, 2013 at 03:04:46PM -0400, John Baldwin wrote:
> J> >   When system is writing a kernel core dump, it issues watchdog
> J> > pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers
> J> > ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is
> J> > called in dumping context.
> J> > 
> J> > The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(),
> J> > that calls into ipmi_alloc_request(), which uses M_WAITOK and
> J> > thus sleeps. This is a smaller problem, since can be converted to
> J> > M_NOWAIT. But ipmi_set_watchdog() then calls into
> J> > ipmi_submit_driver_request(), which calls msleep() any time.
> J> > 
> J> >   The attached patch allows me to successfully write cores in
> J> > presence of IPMI.
> J> 
> J> Of course, the watchdog might go off during your dump. :)
> 
> Yes, I understand that :(
> 
> But, imho patch improves situation, although is ugly.

Yes, I think a temporary workaround is fine for now.

> J> The real fix is more complicated, which is that we should not use
> J> a worker thread for at least SMIC and KCS.
> 
> -- 
> Totus tuus, Glebius.
> 

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ipmi patch for review

2013-09-19 Thread Gleb Smirnoff
  John,

On Thu, Sep 19, 2013 at 03:04:46PM -0400, John Baldwin wrote:
J> >   When system is writing a kernel core dump, it issues watchdog
J> > pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers
J> > ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is
J> > called in dumping context.
J> > 
J> > The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(),
J> > that calls into ipmi_alloc_request(), which uses M_WAITOK and
J> > thus sleeps. This is a smaller problem, since can be converted to
J> > M_NOWAIT. But ipmi_set_watchdog() then calls into
J> > ipmi_submit_driver_request(), which calls msleep() any time.
J> > 
J> >   The attached patch allows me to successfully write cores in
J> > presence of IPMI.
J> 
J> Of course, the watchdog might go off during your dump. :)

Yes, I understand that :(

But, imho patch improves situation, although is ugly.

J> The real fix is more complicated, which is that we should not use
J> a worker thread for at least SMIC and KCS.

-- 
Totus tuus, Glebius.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ipmi patch for review

2013-09-19 Thread John Baldwin
On Tuesday, September 17, 2013 6:21:10 am Gleb Smirnoff wrote:
>   Hi!
> 
>   When system is writing a kernel core dump, it issues watchdog
> pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers
> ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is
> called in dumping context.
> 
> The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(),
> that calls into ipmi_alloc_request(), which uses M_WAITOK and
> thus sleeps. This is a smaller problem, since can be converted to
> M_NOWAIT. But ipmi_set_watchdog() then calls into
> ipmi_submit_driver_request(), which calls msleep() any time.
> 
>   The attached patch allows me to successfully write cores in
> presence of IPMI.

Of course, the watchdog might go off during your dump. :)

The real fix is more complicated, which is that we should not use
a worker thread for at least SMIC and KCS.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ipmi patch for review

2013-09-17 Thread Alan Somers
I ran into a very similar problem on stable/9.  During a reboot, the
system paniced because the watchdog slept when called from the syncer.
 I solved it by dropping the sync mutex in the syncer before patting
the watchdog.  It might be a more general solution that would also
solve your problem.  My patch and stack trace below.

Syncing disks, vnodes remaining...7 7 7 5 Sleeping thread (tid 100145,
pid 23) owns a non-sleepable lock
KDB: stack backtrace of thread 100145:
sched_switch() at 0x8091086d = sched_switch+0x13d
mi_switch() at 0x808ee276 = mi_switch+0x196
sleepq_wait() at 0x80929672 = sleepq_wait+0x42
_sleep() at 0x808eeae8 = _sleep+0x3a8
ipmi_submit_driver_request() at 0x80c0df6f =
ipmi_submit_driver_request+0xbf
ipmi_set_watchdog() at 0x80c0e571 = ipmi_set_watchdog+0xb1
ipmi_wd_event() at 0x80c0e6bf = ipmi_wd_event+0x8f
kern_do_pat() at 0x807d763f = kern_do_pat+0x9f
sched_sync() at 0x8098779f = sched_sync+0x1df
fork_exit() at 0x808b21af = fork_exit+0x11f
fork_trampoline() at 0x80bd9ebe = fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xff88ce6a4bb0, rbp = 0 ---
panic: sleeping thread

 //SpectraBSD/stable/sys/kern/vfs_subr.c#3 (text) 

@@ -1869,8 +1869,15 @@
 continue;
 }

-if (first_printf == 0)
+if (first_printf == 0) {
+/*
+ * Drop the sync mutex, because some watchdog
+ * drivers need to sleep while patting
+ */
+mtx_unlock(&sync_mtx);
 wdog_kern_pat(WD_LASTVAL);
+mtx_lock(&sync_mtx);
+}

 }
 if (!LIST_EMPTY(gslp)) {

On Tue, Sep 17, 2013 at 4:21 AM, Gleb Smirnoff  wrote:
>   Hi!
>
>   When system is writing a kernel core dump, it issues watchdog
> pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers
> ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is
> called in dumping context.
>
> The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(),
> that calls into ipmi_alloc_request(), which uses M_WAITOK and
> thus sleeps. This is a smaller problem, since can be converted to
> M_NOWAIT. But ipmi_set_watchdog() then calls into
> ipmi_submit_driver_request(), which calls msleep() any time.
>
>   The attached patch allows me to successfully write cores in
> presence of IPMI.
>
> --
> Totus tuus, Glebius.
>
> ___
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


ipmi patch for review

2013-09-17 Thread Gleb Smirnoff
  Hi!

  When system is writing a kernel core dump, it issues watchdog
pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers
ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is
called in dumping context.

The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(),
that calls into ipmi_alloc_request(), which uses M_WAITOK and
thus sleeps. This is a smaller problem, since can be converted to
M_NOWAIT. But ipmi_set_watchdog() then calls into
ipmi_submit_driver_request(), which calls msleep() any time.

  The attached patch allows me to successfully write cores in
presence of IPMI.

-- 
Totus tuus, Glebius.
Index: dev/ipmi/ipmi.c
===
--- dev/ipmi/ipmi.c	(revision 255625)
+++ dev/ipmi/ipmi.c	(working copy)
@@ -647,6 +647,9 @@ ipmi_wd_event(void *arg, unsigned int cmd, int *er
 	unsigned int timeout;
 	int e;
 
+	if (dumping)
+		return;
+
 	cmd &= WD_INTERVAL;
 	if (cmd > 0 && cmd <= 63) {
 		timeout = ((uint64_t)1 << cmd) / 10;
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"