Re: ipmi patch for review
On 5/30/14, 10:44 AM, Doug Ambrisko wrote: On Thu, Sep 19, 2013 at 03:04:46PM -0400, John Baldwin wrote: | On Tuesday, September 17, 2013 6:21:10 am Gleb Smirnoff wrote: | > Hi! | > | > When system is writing a kernel core dump, it issues watchdog | > pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers | > ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is | > called in dumping context. | > | > The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(), | > that calls into ipmi_alloc_request(), which uses M_WAITOK and | > thus sleeps. This is a smaller problem, since can be converted to | > M_NOWAIT. But ipmi_set_watchdog() then calls into | > ipmi_submit_driver_request(), which calls msleep() any time. | > | > The attached patch allows me to successfully write cores in | > presence of IPMI. | | Of course, the watchdog might go off during your dump. :) | | The real fix is more complicated, which is that we should not use | a worker thread for at least SMIC and KCS. I haven't looked at this patch but I have local code that switches KCS into polling direct mode when the kernel goes into panic mode. I use this to write Linux compatible back traces into the system event logs. This could allow the watchdogd to continue to work. This should be easily extended to SMIC mode. SMBUS would be harder but at a prior company I made the SMBIO driver work in polled mode. If someone wants to look at this I can post the changes for KCS and the kernel backtrace to the system event log. We find this useful when looking at customer machines. IPMI gets upset if things get intermixed/interrupted so there needs to be serialization and cancellation if being interrupted. These patches would be really nice to have in base. I noticed this problem too, you can't really touch watchdogs some of the time when in a panic(9) situation and it leaves you in a bad state to stop them or reset them while you are dumping. Thank you for looking at this. -Alfred ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: ipmi patch for review
On Thu, Sep 19, 2013 at 03:04:46PM -0400, John Baldwin wrote: | On Tuesday, September 17, 2013 6:21:10 am Gleb Smirnoff wrote: | > Hi! | > | > When system is writing a kernel core dump, it issues watchdog | > pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers | > ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is | > called in dumping context. | > | > The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(), | > that calls into ipmi_alloc_request(), which uses M_WAITOK and | > thus sleeps. This is a smaller problem, since can be converted to | > M_NOWAIT. But ipmi_set_watchdog() then calls into | > ipmi_submit_driver_request(), which calls msleep() any time. | > | > The attached patch allows me to successfully write cores in | > presence of IPMI. | | Of course, the watchdog might go off during your dump. :) | | The real fix is more complicated, which is that we should not use | a worker thread for at least SMIC and KCS. I haven't looked at this patch but I have local code that switches KCS into polling direct mode when the kernel goes into panic mode. I use this to write Linux compatible back traces into the system event logs. This could allow the watchdogd to continue to work. This should be easily extended to SMIC mode. SMBUS would be harder but at a prior company I made the SMBIO driver work in polled mode. If someone wants to look at this I can post the changes for KCS and the kernel backtrace to the system event log. We find this useful when looking at customer machines. IPMI gets upset if things get intermixed/interrupted so there needs to be serialization and cancellation if being interrupted. Thanks, Doug A. ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: ipmi patch for review
On Friday, September 20, 2013 1:44:52 am Gleb Smirnoff wrote: > John, > > On Thu, Sep 19, 2013 at 03:04:46PM -0400, John Baldwin wrote: > J> > When system is writing a kernel core dump, it issues watchdog > J> > pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers > J> > ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is > J> > called in dumping context. > J> > > J> > The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(), > J> > that calls into ipmi_alloc_request(), which uses M_WAITOK and > J> > thus sleeps. This is a smaller problem, since can be converted to > J> > M_NOWAIT. But ipmi_set_watchdog() then calls into > J> > ipmi_submit_driver_request(), which calls msleep() any time. > J> > > J> > The attached patch allows me to successfully write cores in > J> > presence of IPMI. > J> > J> Of course, the watchdog might go off during your dump. :) > > Yes, I understand that :( > > But, imho patch improves situation, although is ugly. Yes, I think a temporary workaround is fine for now. > J> The real fix is more complicated, which is that we should not use > J> a worker thread for at least SMIC and KCS. > > -- > Totus tuus, Glebius. > -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: ipmi patch for review
John, On Thu, Sep 19, 2013 at 03:04:46PM -0400, John Baldwin wrote: J> > When system is writing a kernel core dump, it issues watchdog J> > pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers J> > ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is J> > called in dumping context. J> > J> > The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(), J> > that calls into ipmi_alloc_request(), which uses M_WAITOK and J> > thus sleeps. This is a smaller problem, since can be converted to J> > M_NOWAIT. But ipmi_set_watchdog() then calls into J> > ipmi_submit_driver_request(), which calls msleep() any time. J> > J> > The attached patch allows me to successfully write cores in J> > presence of IPMI. J> J> Of course, the watchdog might go off during your dump. :) Yes, I understand that :( But, imho patch improves situation, although is ugly. J> The real fix is more complicated, which is that we should not use J> a worker thread for at least SMIC and KCS. -- Totus tuus, Glebius. ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: ipmi patch for review
On Tuesday, September 17, 2013 6:21:10 am Gleb Smirnoff wrote: > Hi! > > When system is writing a kernel core dump, it issues watchdog > pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers > ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is > called in dumping context. > > The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(), > that calls into ipmi_alloc_request(), which uses M_WAITOK and > thus sleeps. This is a smaller problem, since can be converted to > M_NOWAIT. But ipmi_set_watchdog() then calls into > ipmi_submit_driver_request(), which calls msleep() any time. > > The attached patch allows me to successfully write cores in > presence of IPMI. Of course, the watchdog might go off during your dump. :) The real fix is more complicated, which is that we should not use a worker thread for at least SMIC and KCS. -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: ipmi patch for review
I ran into a very similar problem on stable/9. During a reboot, the system paniced because the watchdog slept when called from the syncer. I solved it by dropping the sync mutex in the syncer before patting the watchdog. It might be a more general solution that would also solve your problem. My patch and stack trace below. Syncing disks, vnodes remaining...7 7 7 5 Sleeping thread (tid 100145, pid 23) owns a non-sleepable lock KDB: stack backtrace of thread 100145: sched_switch() at 0x8091086d = sched_switch+0x13d mi_switch() at 0x808ee276 = mi_switch+0x196 sleepq_wait() at 0x80929672 = sleepq_wait+0x42 _sleep() at 0x808eeae8 = _sleep+0x3a8 ipmi_submit_driver_request() at 0x80c0df6f = ipmi_submit_driver_request+0xbf ipmi_set_watchdog() at 0x80c0e571 = ipmi_set_watchdog+0xb1 ipmi_wd_event() at 0x80c0e6bf = ipmi_wd_event+0x8f kern_do_pat() at 0x807d763f = kern_do_pat+0x9f sched_sync() at 0x8098779f = sched_sync+0x1df fork_exit() at 0x808b21af = fork_exit+0x11f fork_trampoline() at 0x80bd9ebe = fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xff88ce6a4bb0, rbp = 0 --- panic: sleeping thread //SpectraBSD/stable/sys/kern/vfs_subr.c#3 (text) @@ -1869,8 +1869,15 @@ continue; } -if (first_printf == 0) +if (first_printf == 0) { +/* + * Drop the sync mutex, because some watchdog + * drivers need to sleep while patting + */ +mtx_unlock(&sync_mtx); wdog_kern_pat(WD_LASTVAL); +mtx_lock(&sync_mtx); +} } if (!LIST_EMPTY(gslp)) { On Tue, Sep 17, 2013 at 4:21 AM, Gleb Smirnoff wrote: > Hi! > > When system is writing a kernel core dump, it issues watchdog > pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers > ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is > called in dumping context. > > The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(), > that calls into ipmi_alloc_request(), which uses M_WAITOK and > thus sleeps. This is a smaller problem, since can be converted to > M_NOWAIT. But ipmi_set_watchdog() then calls into > ipmi_submit_driver_request(), which calls msleep() any time. > > The attached patch allows me to successfully write cores in > presence of IPMI. > > -- > Totus tuus, Glebius. > > ___ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org" ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
ipmi patch for review
Hi! When system is writing a kernel core dump, it issues watchdog pat wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it registers ipmi_wd_event() as event for watchdog. Thus ipmi_wd_event() is called in dumping context. The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(), that calls into ipmi_alloc_request(), which uses M_WAITOK and thus sleeps. This is a smaller problem, since can be converted to M_NOWAIT. But ipmi_set_watchdog() then calls into ipmi_submit_driver_request(), which calls msleep() any time. The attached patch allows me to successfully write cores in presence of IPMI. -- Totus tuus, Glebius. Index: dev/ipmi/ipmi.c === --- dev/ipmi/ipmi.c (revision 255625) +++ dev/ipmi/ipmi.c (working copy) @@ -647,6 +647,9 @@ ipmi_wd_event(void *arg, unsigned int cmd, int *er unsigned int timeout; int e; + if (dumping) + return; + cmd &= WD_INTERVAL; if (cmd > 0 && cmd <= 63) { timeout = ((uint64_t)1 << cmd) / 10; ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"