I have determined the root cause of the problem I reported yesterday,
and I have tested the fix below and found it to be effective. The
conditions for encountering this problem are rare. I am providing the fix
here for the benefit of others still using the ipmi_smb_intf driver on
Linux-2.4.

PROBLEM:

On platforms where the BMC controls the reset line AND where the
interface to the BMC is SMBus (I2C) AND where the kernel configuration
includes "CONFIG_IPMI_SMB=y" (embedded, not modular) AND where the
kernel version is 2.4.x, there is a 1 in 300 chance that the system will
hang after displaying the message "Restarting system" on the console.

The Linux-2.4 kernel does not call functions registered with the
"module_exit" macro when MODULE is not defined (the driver is embedded).
Even if it did, the original ipmi_smb_intf driver did not set up the
call to clean up the driver.

SOLUTION:

Use the Linux-2.4 hack, register_reboot_notifier, modeled after the IDE
driver, so that the driver cleanup runs at reboot even when the driver is
embedded. The cleanup kills the polling thread, which can cause
collisions at the I2C adapter for 10 ms out of every 3000 ms. With this
thread gone, there is no chance of a collision preventing the BIOS reboot
request message from getting to the BMC.
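
For readers unfamiliar with notifier chains, here is a small userspace
model of the idea. It is only a sketch and not kernel code: every model_
name, the polling_active flag, and the event values are stand-ins I made
up for illustration. The real driver stops a kernel thread rather than
clearing a flag.

```c
/* Userspace model of a reboot notifier chain. Nothing here is kernel code. */
#include <assert.h>
#include <stddef.h>

/* Event codes for this model only (hypothetical values). */
enum { SYS_RESTART = 1, SYS_HALT = 2, SYS_POWER_OFF = 3 };
#define NOTIFY_DONE 0

struct notifier_block {
	int (*notifier_call)(struct notifier_block *self,
			     unsigned long event, void *data);
	struct notifier_block *next;
	int priority;
};

struct notifier_block *reboot_chain;	/* kept sorted, highest priority first */
int polling_active = 1;			/* stands in for the polling thread */

/* Insert a notifier, keeping the chain ordered by descending priority. */
void model_register_reboot_notifier(struct notifier_block *nb)
{
	struct notifier_block **p = &reboot_chain;

	assert(nb != NULL && nb->notifier_call != NULL);
	while (*p && (*p)->priority >= nb->priority)
		p = &(*p)->next;
	nb->next = *p;
	*p = nb;
}

/* Walk the chain, as the reboot path would before handing off to firmware. */
void model_reboot(unsigned long event)
{
	struct notifier_block *nb;

	for (nb = reboot_chain; nb; nb = nb->next)
		nb->notifier_call(nb, event, NULL);
}

/* Stop polling on any shutdown-type event; ignore everything else. */
int model_ipmi_notify_reboot(struct notifier_block *self,
			     unsigned long event, void *data)
{
	(void)self;
	(void)data;

	switch (event) {
	case SYS_HALT:
	case SYS_POWER_OFF:
	case SYS_RESTART:
		polling_active = 0;	/* the real driver runs its cleanup here */
		break;
	default:
		break;
	}
	return NOTIFY_DONE;
}

struct notifier_block model_ipmi_notifier = {
	.notifier_call = model_ipmi_notify_reboot,
	.next = NULL,
	.priority = 5,
};
```

Registering model_ipmi_notifier once at init time and then calling
model_reboot(SYS_RESTART) leaves polling_active at 0 before the point
where the firmware would take over, which is the property the patch
relies on.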



DIFF:

diff -rU3 linux-2.4.31/drivers/char/ipmi/ipmi_smb_intf.c linux-2.4.31.new/drivers/char/ipmi/ipmi_smb_intf.c
--- linux-2.4.31/drivers/char/ipmi/ipmi_smb_intf.c      Tue Jun 27 17:21:29 2006
+++ linux-2.4.31.new/drivers/char/ipmi/ipmi_smb_intf.c  Tue Jun 27 17:31:45 2006
@@ -54,7 +54,9 @@
 #include <linux/i2c.h>
 #include <linux/ipmi_smi.h>
 #include <linux/init.h>
-
+#ifndef MODULE
+#include <linux/reboot.h>
+#endif

 #define IPMI_SMB_VERSION "37"

@@ -1251,6 +1253,48 @@
        .dec_use = NULL,
 };

+static void ipmi_cleanup_common(void)
+{
+       int i;
+       int rv;
+
+       if (!initialized)
+               return;
+
+       for (i=0; i<MAX_SMB_BMCS; i++) {
+               cleanup_one_smb(smb_infos[i]);
+       }
+
+       initialized = 0;
+
+       rv = i2c_del_driver(&smb_i2c_driver);
+       if (!rv)
+               initialized = 0;
+}
+
+#ifndef MODULE
+static int ipmi_notify_reboot (struct notifier_block *this, unsigned long event, void *x)
+{
+       switch (event) {
+               case SYS_HALT:
+               case SYS_POWER_OFF:
+               case SYS_RESTART:
+                       break;
+               default:
+                       return NOTIFY_DONE;
+       }
+
+       ipmi_cleanup_common();
+       return NOTIFY_DONE;
+}
+
+static struct notifier_block ipmi_notifier = {
+       ipmi_notify_reboot,
+       NULL,
+       5
+};
+#endif
+
 static __init int init_ipmi_smb(void)
 {
        int i;
@@ -1281,6 +1325,9 @@
        if (!rv)
                initialized = 1;

+#ifndef MODULE
+       register_reboot_notifier(&ipmi_notifier);
+#endif
        return rv;
 }
 module_init(init_ipmi_smb);
@@ -1288,21 +1335,7 @@
 #ifdef MODULE
 static __exit void cleanup_ipmi_smb(void)
 {
-       int i;
-       int rv;
-
-       if (!initialized)
-               return;
-
-       for (i=0; i<MAX_SMB_BMCS; i++) {
-               cleanup_one_smb(smb_infos[i]);
-       }
-
-       initialized = 0;
-
-       rv = i2c_del_driver(&smb_i2c_driver);
-       if (!rv)
-               initialized = 0;
+       ipmi_cleanup_common();
 }
 module_exit(cleanup_ipmi_smb);
 #else


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Brian Hamon (brhamon)
Sent: Monday, June 26, 2006 2:49 PM
To: [email protected]
Subject: [Openipmi-developer] IPMI polling every 3 seconds not ending
when drivers integrated

I am seeing a problem that is largely a product of the constraints of my
environment. I am working to resolve it, but need a little help.

First, my constraints. I am using an old kernel (2.4.30). The 2.4.31-v37
patch files are working well for me. However, I've already run across
problems that were resolved in later versions of the IPMI drivers in the
2.6 kernel. This may be one of those issues, and I'm sorry for wasting
anyone's time if so.

On my platforms, I am starting inside an initrd and pivot_root to the
mass storage device. It works much better for me if the drivers that I
need before and after the pivot_root are embedded into the kernel rather
than built as loadable modules. IPMI is one of the drivers I need early
in system initialization, so it is embedded in my kernel.

Finally, the BMC on my platform is one with which I interface across the
I2C bus (ipmi_smb_intf.o). 

One problem I encountered and solved is that the embedded drivers' init
calls go into a linker section and run in the order in which the linker
sees them. By patching the Makefiles, I was able to get the linker to see
drivers/i2c before drivers/char/ipmi. This makes the I2C driver
initialize before IPMI tries to use it to connect to the BMC over the
I2C bus.
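
The ordering dependency can be shown with a small userspace model. This
is only a sketch under my own assumptions: the model_ functions and the
two flags are invented for illustration, and an ordinary array stands in
for the linker-built table of init calls.

```c
/* Userspace model of init calls that run in table (link) order. */
#include <assert.h>
#include <stddef.h>

typedef int (*initcall_t)(void);

int i2c_ready;		/* set once the model I2C layer has initialized */
int ipmi_found_bus;	/* set if the model IPMI driver saw a ready bus */

int model_i2c_init(void)
{
	i2c_ready = 1;
	return 0;
}

/* Fails when it runs before the I2C layer is available. */
int model_ipmi_init(void)
{
	ipmi_found_bus = i2c_ready;
	return ipmi_found_bus ? 0 : -1;
}

/* Run each call in array order and return how many of them failed. */
int run_initcalls(const initcall_t *calls, size_t n)
{
	int failures = 0;
	size_t i;

	assert(calls != NULL);
	for (i = 0; i < n; i++)
		if (calls[i]() != 0)
			failures++;
	return failures;
}
```

With the table { model_ipmi_init, model_i2c_init } the IPMI call fails
because the bus is not ready yet; swapping the two entries, which is what
reordering the Makefiles achieves, lets both succeed.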

Another problem had to do with the probing for an I2C BMC. My platform
has SMBIOS tables identifying the BMC's I2C slave address. A bit of code
walking through these tables provided the behavior I required. I later
learned that this was something that had already been put into the 2.6
kernel drivers.

=====

Now for my problem. I have seen that a small percentage of the time, the
system will not reset itself. The platform is an Intel SE7520JR2
mainboard with a National Semiconductor (Winbond) PC87431M mBMC. On this
platform, the mBMC controls the system reset line. The AMIBIOS on this
platform sends a reboot request to the mBMC (presumably an IPMI
command).

What I am seeing on the scope is a short message on the I2C bus sent
every three seconds to the mBMC from the IPMI drivers. I am fairly
certain this is ipmi_smb_intf polling for events from the mBMC (from
smb_event_handler). What is odd is that this polling does not stop even
after the kernel descends through the shutdown sequence to the
sys_reboot service (INT 80h, service 88). 

The message takes about 10 ms and repeats every three seconds. What we
have observed is that something on the order of 1 out of 300 reboots
fails, so we think the IPMI driver is in the middle of manipulating the
I2C control registers on the ICH5R when the BIOS tries to send the reboot
request.
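
The arithmetic behind that guess can be written out as a short sketch. It
assumes the reboot request arrives at a uniformly random point in the
polling period and that any overlap with the poll causes a failure; both
are simplifications, and the function names are made up for illustration.

```c
/* Back-of-the-envelope collision estimate for a periodic bus poller. */
#include <assert.h>

/* Fraction of each period during which the poller occupies the bus. */
double collision_probability(double busy_ms, double period_ms)
{
	assert(period_ms > 0.0 && busy_ms >= 0.0 && busy_ms <= period_ms);
	return busy_ms / period_ms;
}

/* Expected number of reboots per failure under the same assumptions. */
double reboots_per_failure(double busy_ms, double period_ms)
{
	assert(busy_ms > 0.0 && busy_ms <= period_ms);
	return period_ms / busy_ms;
}
```

For a 10 ms poll every 3000 ms this gives a probability of 10/3000, or
one failure in about 300 reboots, which is consistent with the rate we
observed.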

I'm trying to figure out for sure if this polling is happening in the
IPMI driver. If it is, why doesn't it stop when the driver is unloaded?
If I can better understand the code sending the polling message, I
should be able to fix it, but I need some guidance.

It looks like stop_kthread in ipmi_smb_intf.c should stop the polling,
but on my system the scope is indicating this is not happening. 

_______________________________________________
Openipmi-developer mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openipmi-developer
