Hello, last October I upgraded to kernel 2.6.35 on a Sun Fire X4100 and
found that starting the watchdog daemon no longer worked.  It produces this
output when started:

Oct 21 15:50:14 stephen watchdog[4725]: starting daemon (5.6):
Oct 21 15:50:14 stephen watchdog[4725]: int=30s realtime=yes sync=no soft=no mla=0 mem=0
Oct 21 15:50:14 stephen watchdog[4725]: ping: no machine to check
Oct 21 15:50:14 stephen watchdog[4725]: file: no file to check
Oct 21 15:50:14 stephen watchdog[4725]: pidfile: no server process to check
Oct 21 15:50:14 stephen watchdog[4725]: interface: no interface to check
Oct 21 15:50:14 stephen watchdog[4725]: test=none(0) repair=none alive=/dev/watchdog heartbeat=none temp=none to=root no_act=no
Oct 21 15:50:14 stephen kernel: IPMI message handler: BMC returned incorrect response, expected netfn 7 cmd 22, got netfn 7 cmd 24
Oct 21 15:50:14 stephen kernel: IPMI Watchdog: response: Error ff on cmd 22
Oct 21 15:50:14 stephen watchdog[4725]: write watchdog device gave error 22 = 'Invalid argument'!
Oct 21 15:51:15 stephen kernel: IPMI message handler: BMC returned incorrect response, expected netfn 7 cmd 35, got netfn 7 cmd 22
Oct 21 15:51:15 stephen kernel: IPMI message handler: BMC returned incorrect response, expected netfn 7 cmd 22, got netfn 7 cmd 35


After some bisecting, I found that the commit that causes this is a
patch to reduce IPMI polling:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=3326f4f2276791561af1fd5f2020be0186459813

Unfortunately, the system is unstable if I revert this patch.  It
crashes with "kernel BUG at kernel/timer.c:851!" (I can provide the
full output on request).


I originally sent this directly to Matthew Garrett, but he hasn't been
responsive for the last month or two, and I would like to eventually be
able to upgrade to a new kernel without losing functionality.  Matthew
provided a workaround patch, but with it applied the error output still
appeared, though infrequently.  He said it wasn't clean enough for
upstream, but hopefully it gives some indication of what he found the
problem to be:

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index e829053..3f1e856 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -316,6 +316,7 @@ static int unload_when_empty = 1;
 static int add_smi(struct smi_info *smi);
 static int try_smi_init(struct smi_info *smi);
 static void cleanup_one_si(struct smi_info *to_clean);
+static void smi_timeout(unsigned long data);
 
 static ATOMIC_NOTIFIER_HEAD(xaction_notifier_list);
 static int register_xaction_notifier(struct notifier_block *nb)
@@ -896,6 +897,7 @@ static void sender(void                *send_info,
 #endif
 
        mod_timer(&smi_info->si_timer, jiffies + SI_TIMEOUT_JIFFIES);
+       smi_timeout((unsigned long)smi_info);
 
        if (smi_info->thread)
                wake_up_process(smi_info->thread);

_______________________________________________
Openipmi-developer mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openipmi-developer
