RE: [PATCH] ipmi: add new kernel options to prevent automatic ipmi init
I have made a few observations in the past few days:

1) Unique IPMI device IDs did not seem to make a difference. Stratus still could not hot remove one of the KCS interfaces.

2) From what I see in IPMI spec section 20.1, having unique device IDs is not required:

    Controllers that implement identical sets of applications commands can have the same Device ID in a given system. Thus, a 'standardized' controller could be produced where multiple instances of the controller are used in a system, and all have the same Device ID value. [The controllers would still be differentiable by their address, location, and associated information for the controllers in the Sensor Data Records.]

3) Stratus can get by without a change to kernel 2.6.32.

Stratus could not hot-remove all interfaces automatically discovered by ipmi_si, but it is possible to hot-remove all hardcoded interfaces. One of the Stratus KCS interfaces will always be online at boot time. Thus if Stratus hardcodes both of the KCS interfaces, at least one of those will be detected when ipmi_si initializes; this will prevent ipmi_si from trying to auto-detect any interfaces. Later during system startup, a Stratus script can run to hot-remove all interfaces from use by ipmi_si.

Using this technique Stratus has a method to dedicate all the KCS interfaces exclusively for use by the Stratus driver without needing any kernel changes. I have verified this technique with the most recent 2.6.32-348 kernel released in a Red Hat Enterprise Linux 6.4 beta snapshot.

As long as the same behavior is present in the upstream kernel, we do not need a change to the kernel to support Stratus servers.

On 12/17/2012 4:14 PM, Evans, Robert wrote:
> On 12/14/2012 12:02 PM, Corey Minyard wrote:
>> On 12/14/2012 10:25 AM, Evans, Robert wrote:
>>> Corey,
>>>
>>> Thanks for the thoughtful reply. Below I respond in detail to these three points.
>>>
>>> 1) Why building a variant kernel with ipmi_si as a module is not feasible.
>>> 2) User mode access to IPMI on Stratus systems (e.g. ipmitool).
>>> 3) ipmi_si hot removal seems to not work as needed.
>>>
>>> Stratus might be able to use the hot removal option instead of the proposed patch if hot removal can remove all interfaces from usage by ipmi_si. Our testing of this option was not successful as shown below.
>>>
>>> - - -
>>>
>>> 1) Why building a variant kernel with ipmi_si as a module is not feasible:
>>>
>>> Stratus sells servers based upon Red Hat Enterprise Linux (RHEL). In the next release of RHEL, ipmi_si will be built into the kernel so that access to ACPI opregion is available early in kernel startup. Stratus systems run the Red Hat kernel so that the system is certified and supported by Red Hat. For this reason using a custom kernel configured to build ipmi_si as a module is not an option.
>>
>> Yes, the RHEL engineer explained this to me, and it makes sense now. Thanks.
>>
>>> 2) User mode access to IPMI on Stratus systems:
>>>
>>> Although Stratus provides a replacement for ipmi_si, we depend on ipmi_msghandler and ipmi_devintf. The device /dev/ipmi0 is present and this device is utilized by the user-mode system management software Stratus supplies. Therefore other programs like ipmitool can send IPMI commands and get responses on Stratus systems.
>>
>> Ah, ok. That's good.
>>
>>> 3) Hot removal of the KCS interfaces discovered by ipmi_si seems to not do enough... One KCS cannot successfully be removed:
>>>
>>> Based upon your suggestion, we tried to use hot removal. With RHEL 6.4 Beta (kernel-2.6.32-343.el6), Stratus attempted to hot remove the IPMI interfaces. This was booted with ipmi_si.trydefaults=0 although we expect that kernel option to have no effect since a BMC is found before the defaults would be tried. This is logged when ipmi_si initializes indicating that both BMCs were discovered:
>>>
>>> ipmi message handler version 39.2
>>> IPMI System Interface driver.
>>> ipmi_si: Trying ACPI-specified kcs state machine at i/o address 0xca2, slave address 0x0, irq 0
>>> ipmi: Found new BMC (man_id: 0x77, prod_id: 0x05c6, dev_id: 0x41)
>>> IPMI kcs interface initialized
>>> ipmi_si: Adding SMBIOS-specified kcs state machine
>>> ipmi_si: Trying SMBIOS-specified kcs state machine at i/o address 0xda2, slave address 0x20, irq 0
>>> ipmi: interfacing existing BMC (man_id: 0x77, prod_id: 0x05c6, dev_id: 0x41)
>>> IPMI kcs interface initialized
>>>
>>> Although there are two different BMCs, because it says "interfacing existing BMC" it appears that ipmi_si assumes they are the same BMC.
>>
>> That's happening in the message handler and it happens because the manufacturer, product, and device id all match. From the spec:
>>
>>     The Device ID is typically used in combination with the Product ID field such that the Device IDs for different controllers are unique under a given Product ID. A controller can optionally use the Device ID as an 'instance' identifier if more than one controller of that kind is used in the system.
Re: [PATCH] ipmi: add new kernel options to prevent automatic ipmi init
Well, the built-in driver works on systems that have more than one interface and more than one BMC, and multiple IPMBs (and all of the other channel types for that matter, and the driver handles all the multiplexing and nasty addressing). There is, in fact, no arbitrary limit, and IBM tested this fairly extensively with some of their systems.

I'm not sure why you would need a custom driver, and if there are some custom things that need to be done for your servers, I'd be happy to add that. I've worked with a number of other vendors to get changes like this in. And then ipmitool, freeipmi, openipmi, etc. will work with the device.

I don't have a big problem with this patch. I wonder why you would want to compile the standard driver into your kernel if you expected to load a module with a custom driver later, though. None of the distros I know of compile it in, it's always a module.

You can also dynamically remove the device from the driver using the hot add/remove capability. To remove it, you can do:

    echo "remove,`cat /proc/ipmi/0/params`" > /sys/module/ipmi_si/parameters/hotmod

and it will disassociate that device (IPMI interface 0 in this case) from the driver. So you can iterate through all the devices in /proc/ipmi and remove them all at startup.

If none of the above options work for you, we can go ahead with this patch. Just wanted to let you know that current options exist, and see if you wanted to take a different direction.

-corey

On 12/13/2012 12:40 PM, Tony Camuso wrote:
> The configuration change building ipmi_si into the kernel precludes the use of a custom driver that can utilize more than one KCS interface, multiple IPMBs, and more than one BMC. This capability is important for fault-tolerant systems.
>
> Even if the kernel option ipmi_si.trydefaults=0 is specified, ipmi_si discovers and claims one of the KCS interfaces on a Stratus server. The inability to now prevent the kernel from managing this device is a regression from previous kernels. The regression breaks a capability fault-tolerant vendors have relied upon.
>
> To support both ACPI opregion access and the need to avoid activation of ipmi_si on some platforms, we've added two new kernel options, ipmi_si.tryacpi and ipmi_si.trydmi, to prevent ipmi_si from initializing when these options are set to 0 on the kernel command line. With these options at the default value of 1, ipmi_si init proceeds according to the kernel default.
>
> Tested-by: Jim Paradis jpara...@redhat.com
> Signed-off-by: Robert Evans robert.ev...@stratus.com
> Signed-off-by: Jim Paradis jpara...@redhat.com
> Signed-off-by: Tony Camuso tcam...@redhat.com
> ---
>  drivers/char/ipmi/ipmi_si_intf.c | 28 ++++++++++++++++++++++++----
>  1 file changed, 24 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> index 20ab5b3..0a441cf 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -1208,6 +1208,12 @@ static int smi_num; /* Used to sequence the SMIs */
>  #define DEFAULT_REGSPACING 1
>  #define DEFAULT_REGSIZE 1
>
> +#ifdef CONFIG_ACPI
> +static int si_tryacpi = 1;
> +#endif
> +#ifdef CONFIG_DMI
> +static int si_trydmi = 1;
> +#endif
>  static bool si_trydefaults = 1;
>  static char *si_type[SI_MAX_PARMS];
>  #define MAX_SI_TYPE_STR 30
> @@ -1238,6 +1244,16 @@ MODULE_PARM_DESC(hotmod, "Add and remove interfaces.  See"
>  		 " Documentation/IPMI.txt in the kernel sources for the"
>  		 " gory details.");
>
> +#ifdef CONFIG_ACPI
> +module_param_named(tryacpi, si_tryacpi, bool, 0);
> +MODULE_PARM_DESC(tryacpi, "Setting this to zero will disable the"
> +		 " default scan of the interfaces identified via ACPI");
> +#endif
> +#ifdef CONFIG_DMI
> +module_param_named(trydmi, si_trydmi, bool, 0);
> +MODULE_PARM_DESC(trydmi, "Setting this to zero will disable the"
> +		 " default scan of the interfaces identified via DMI");
> +#endif
>  module_param_named(trydefaults, si_trydefaults, bool, 0);
>  MODULE_PARM_DESC(trydefaults, "Setting this to 'false' will disable the"
>  		 " default scan of the KCS and SMIC interface at the standard"
> @@ -3408,16 +3424,20 @@ static int init_ipmi_si(void)
>  #endif
>
>  #ifdef CONFIG_ACPI
> -	pnp_register_driver(&ipmi_pnp_driver);
> -	pnp_registered = 1;
> +	if (si_tryacpi) {
> +		pnp_register_driver(&ipmi_pnp_driver);
> +		pnp_registered = 1;
> +	}
>  #endif
>
>  #ifdef CONFIG_DMI
> -	dmi_find_bmc();
> +	if (si_trydmi)
> +		dmi_find_bmc();
>  #endif
>
>  #ifdef CONFIG_ACPI
> -	spmi_find_bmc();
> +	if (si_tryacpi)
> +		spmi_find_bmc();
>  #endif
>
>  	/* We prefer devices with interrupts, but in the case of a machine

___
devicetree-discuss mailing list
devicetree-discuss@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/devicetree-discuss
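For readers following the patch, the proposed options would be given on the kernel command line in the same ipmi_si.<name>=<value> form as the existing ipmi_si.trydefaults=0, defaulting to 1 when absent. The sketch below only illustrates that form by reading values out of a command-line string. The helper name ipmi_si_opt and the sample command line are assumptions made for illustration; nothing here is part of the patch or of any kernel tooling.

```shell
#!/bin/sh
# Sketch only: report the value of an ipmi_si.<name> option found in a
# kernel command-line string. ipmi_si_opt is a hypothetical helper.
#   $1 = option name (for example tryacpi)
#   $2 = command-line string
# Prints 1 when the option is absent, matching the default in the patch.
ipmi_si_opt() (
    set -f                          # do not glob-expand words of the command line
    name=$1
    value=1
    for word in $2; do              # split on whitespace
        case $word in
            "ipmi_si.${name}="*) value=${word#*=} ;;
        esac
    done
    printf '%s\n' "$value"
)

# Example with a made-up command line; on a live system one could pass
# "$(cat /proc/cmdline)" instead.
cmdline='ro quiet ipmi_si.tryacpi=0 ipmi_si.trydmi=0'
echo "tryacpi=$(ipmi_si_opt tryacpi "$cmdline")"
echo "trydmi=$(ipmi_si_opt trydmi "$cmdline")"
echo "trydefaults=$(ipmi_si_opt trydefaults "$cmdline")"
```

The helper reports the raw text after the equals sign and makes no attempt to interpret other spellings that a bool parameter may accept.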
Re: [PATCH] ipmi: add new kernel options to prevent automatic ipmi init
On 12/14/2012 10:25 AM, Evans, Robert wrote:
> Corey,
>
> Thanks for the thoughtful reply. Below I respond in detail to these three points.
>
> 1) Why building a variant kernel with ipmi_si as a module is not feasible.
> 2) User mode access to IPMI on Stratus systems (e.g. ipmitool).
> 3) ipmi_si hot removal seems to not work as needed.
>
> Stratus might be able to use the hot removal option instead of the proposed patch if hot removal can remove all interfaces from usage by ipmi_si. Our testing of this option was not successful as shown below.
>
> - - -
>
> 1) Why building a variant kernel with ipmi_si as a module is not feasible:
>
> Stratus sells servers based upon Red Hat Enterprise Linux (RHEL). In the next release of RHEL, ipmi_si will be built into the kernel so that access to ACPI opregion is available early in kernel startup. Stratus systems run the Red Hat kernel so that the system is certified and supported by Red Hat. For this reason using a custom kernel configured to build ipmi_si as a module is not an option.

Yes, the RHEL engineer explained this to me, and it makes sense now. Thanks.

> 2) User mode access to IPMI on Stratus systems:
>
> Although Stratus provides a replacement for ipmi_si, we depend on ipmi_msghandler and ipmi_devintf. The device /dev/ipmi0 is present and this device is utilized by the user-mode system management software Stratus supplies. Therefore other programs like ipmitool can send IPMI commands and get responses on Stratus systems.

Ah, ok. That's good.

> 3) Hot removal of the KCS interfaces discovered by ipmi_si seems to not do enough... One KCS cannot successfully be removed:
>
> Based upon your suggestion, we tried to use hot removal. With RHEL 6.4 Beta (kernel-2.6.32-343.el6), Stratus attempted to hot remove the IPMI interfaces. This was booted with ipmi_si.trydefaults=0 although we expect that kernel option to have no effect since a BMC is found before the defaults would be tried. This is logged when ipmi_si initializes indicating that both BMCs were discovered:
>
> ipmi message handler version 39.2
> IPMI System Interface driver.
> ipmi_si: Trying ACPI-specified kcs state machine at i/o address 0xca2, slave address 0x0, irq 0
> ipmi: Found new BMC (man_id: 0x77, prod_id: 0x05c6, dev_id: 0x41)
> IPMI kcs interface initialized
> ipmi_si: Adding SMBIOS-specified kcs state machine
> ipmi_si: Trying SMBIOS-specified kcs state machine at i/o address 0xda2, slave address 0x20, irq 0
> ipmi: interfacing existing BMC (man_id: 0x77, prod_id: 0x05c6, dev_id: 0x41)
> IPMI kcs interface initialized
>
> Although there are two different BMCs, because it says "interfacing existing BMC" it appears that ipmi_si assumes they are the same BMC.

That's happening in the message handler and it happens because the manufacturer, product, and device id all match. From the spec:

    The Device ID is typically used in combination with the Product ID field such that the Device IDs for different controllers are unique under a given Product ID. A controller can optionally use the Device ID as an ‘instance’ identifier if more than one controller of that kind is used in the system. (Section 20.1)

Different controllers in the same system are supposed to have different device IDs.

> Also, I notice the slave address for the first KCS (port CA2) seems wrong. Maybe these findings are relevant to what happens next.

Probably not relevant. It's not correct because, for some bizarre reason, the slave address is not present in the ACPI information. The slave address is only used by the message handler for the IPMB return address on messages routed over IPMB.

It is odd that one interface is specified in ACPI and the other in DMI. You can specify all of them in both tables.

> After ipmi_si has been initialized, a script runs to load ftmod, the module that contains the Stratus IPMI driver. This code was added to hot remove the interfaces discovered by ipmi_si before loading ftmod:
>
> for i in $(cd /proc/ipmi; ls)
> do
>     dev="IPMI${i}"
>     params=$(cat /proc/ipmi/${i}/params)
>     msg="Considering removal of dev: ${dev} ${params}"
>     logger -p kern.info -t `basename ${0}` "${msg}"
>     echo "${msg}" > /dev/console
>     [ -n "${params}" ] && echo "remove,`cat /proc/ipmi/${i}/params`" \
>         > /sys/module/ipmi_si/parameters/hotmod
> done
>
> In the console log we can see this script run prior to loading the Stratus ftmod.ko and we also see that ftmod exposes a BMC:
>
> Considering removal of dev: IPMI0 kcs,i/o,0xca2,rsp=1,rsi=1,rsh=0,irq=0,ipmb=0
> Considering removal of dev: IPMI1 kcs,i/o,0xda2,rsp=1,rsi=1,rsh=0,irq=0,ipmb=32
> ftmod: module license 'LGPL' taints kernel.
> Disabling lock debugging due to kernel taint
> FTMOD version lsb-ft-ftmod-9.0.4-209
> ftmod: GLOBAL_SIZE=4194304
> ftmod: global_cc_memory 0x88003740
> ipmi: Found new BMC (man_id: 0x00, prod_id: 0x, dev_id: 0x00)
> ipmi device interface
>
> The KCS at port DA2 is removed from use by ipmi_si. However, the other KCS is still in
Re: [PATCH] ipmi: add new kernel options to prevent automatic ipmi init
RHEL builds ipmi_si into the kernel by default, rather than as a module, because it is required early in order to be available for ACPI opregion access.

However, it appears that some of our customers have custom IPMI drivers, and this gets in their way.

Stratus is currently evaluating your suggestions, and we are expecting a response from them sometime today or early next week.

On 12/13/2012 02:51 PM, Corey Minyard wrote:
> Well, the built-in driver works on systems that have more than one interface and more than one BMC, and multiple IPMBs (and all of the other channel types for that matter, and the driver handles all the multiplexing and nasty addressing). There is, in fact, no arbitrary limit, and IBM tested this fairly extensively with some of their systems.
>
> I'm not sure why you would need a custom driver, and if there are some custom things that need to be done for your servers, I'd be happy to add that. I've worked with a number of other vendors to get changes like this in. And then ipmitool, freeipmi, openipmi, etc. will work with the device.
>
> I don't have a big problem with this patch. I wonder why you would want to compile the standard driver into your kernel if you expected to load a module with a custom driver later, though. None of the distros I know of compile it in, it's always a module.
>
> You can also dynamically remove the device from the driver using the hot add/remove capability. To remove it, you can do:
>
>     echo "remove,`cat /proc/ipmi/0/params`" > /sys/module/ipmi_si/parameters/hotmod
>
> and it will disassociate that device (IPMI interface 0 in this case) from the driver. So you can iterate through all the devices in /proc/ipmi and remove them all at startup.
>
> If none of the above options work for you, we can go ahead with this patch. Just wanted to let you know that current options exist, and see if you wanted to take a different direction.
>
> -corey
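The suggestion quoted above, iterating over /proc/ipmi and removing every interface, can be tried as a dry run first. The sketch below only prints the strings that would be written to the hotmod parameter and writes nothing itself. The helper name ipmi_remove_strings and its optional directory argument are assumptions made for illustration, and the /proc/ipmi/<n>/params layout is taken from the messages in this thread.

```shell
#!/bin/sh
# Sketch only: print one "remove,<params>" line per IPMI interface found
# under a proc-style directory. ipmi_remove_strings is a hypothetical helper.
#   $1 = directory to scan (defaults to /proc/ipmi)
ipmi_remove_strings() {
    dir=${1:-/proc/ipmi}
    for f in "$dir"/*/params; do
        [ -r "$f" ] || continue     # also skips the literal glob when nothing matches
        params=$(cat "$f")
        [ -n "$params" ] && printf 'remove,%s\n' "$params"
    done
    return 0
}

# Dry run: list what would be removed. Prints nothing if there are no interfaces.
ipmi_remove_strings

# To apply the result (as root), each line would be written to hotmod:
#   ipmi_remove_strings | while read -r line; do
#       echo "$line" > /sys/module/ipmi_si/parameters/hotmod
#   done
```

Taking the directory as an argument keeps the helper testable on a machine without IPMI hardware.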
Re: [PATCH] ipmi: add new kernel options to prevent automatic ipmi init
Corey,

Thanks for the thoughtful reply. Below I respond in detail to these three points.

1) Why building a variant kernel with ipmi_si as a module is not feasible.
2) User mode access to IPMI on Stratus systems (e.g. ipmitool).
3) ipmi_si hot removal seems to not work as needed.

Stratus might be able to use the hot removal option instead of the proposed patch if hot removal can remove all interfaces from usage by ipmi_si. Our testing of this option was not successful as shown below.

- - -

1) Why building a variant kernel with ipmi_si as a module is not feasible:

Stratus sells servers based upon Red Hat Enterprise Linux (RHEL). In the next release of RHEL, ipmi_si will be built into the kernel so that access to ACPI opregion is available early in kernel startup. Stratus systems run the Red Hat kernel so that the system is certified and supported by Red Hat. For this reason using a custom kernel configured to build ipmi_si as a module is not an option.

2) User mode access to IPMI on Stratus systems:

Although Stratus provides a replacement for ipmi_si, we depend on ipmi_msghandler and ipmi_devintf. The device /dev/ipmi0 is present and this device is utilized by the user-mode system management software Stratus supplies. Therefore other programs like ipmitool can send IPMI commands and get responses on Stratus systems.

3) Hot removal of the KCS interfaces discovered by ipmi_si seems to not do enough... One KCS cannot successfully be removed:

Based upon your suggestion, we tried to use hot removal. With RHEL 6.4 Beta (kernel-2.6.32-343.el6), Stratus attempted to hot remove the IPMI interfaces. This was booted with ipmi_si.trydefaults=0 although we expect that kernel option to have no effect since a BMC is found before the defaults would be tried. This is logged when ipmi_si initializes indicating that both BMCs were discovered:

ipmi message handler version 39.2
IPMI System Interface driver.
ipmi_si: Trying ACPI-specified kcs state machine at i/o address 0xca2, slave address 0x0, irq 0
ipmi: Found new BMC (man_id: 0x77, prod_id: 0x05c6, dev_id: 0x41)
IPMI kcs interface initialized
ipmi_si: Adding SMBIOS-specified kcs state machine
ipmi_si: Trying SMBIOS-specified kcs state machine at i/o address 0xda2, slave address 0x20, irq 0
ipmi: interfacing existing BMC (man_id: 0x77, prod_id: 0x05c6, dev_id: 0x41)
IPMI kcs interface initialized

Although there are two different BMCs, because it says "interfacing existing BMC" it appears that ipmi_si assumes they are the same BMC.

Also, I notice the slave address for the first KCS (port CA2) seems wrong. Maybe these findings are relevant to what happens next.

After ipmi_si has been initialized, a script runs to load ftmod, the module that contains the Stratus IPMI driver. This code was added to hot remove the interfaces discovered by ipmi_si before loading ftmod:

for i in $(cd /proc/ipmi; ls)
do
    dev="IPMI${i}"
    params=$(cat /proc/ipmi/${i}/params)
    msg="Considering removal of dev: ${dev} ${params}"
    logger -p kern.info -t `basename ${0}` "${msg}"
    echo "${msg}" > /dev/console
    [ -n "${params}" ] && echo "remove,`cat /proc/ipmi/${i}/params`" \
        > /sys/module/ipmi_si/parameters/hotmod
done

In the console log we can see this script run prior to loading the Stratus ftmod.ko and we also see that ftmod exposes a BMC:

Considering removal of dev: IPMI0 kcs,i/o,0xca2,rsp=1,rsi=1,rsh=0,irq=0,ipmb=0
Considering removal of dev: IPMI1 kcs,i/o,0xda2,rsp=1,rsi=1,rsh=0,irq=0,ipmb=32
ftmod: module license 'LGPL' taints kernel.
Disabling lock debugging due to kernel taint
FTMOD version lsb-ft-ftmod-9.0.4-209
ftmod: GLOBAL_SIZE=4194304
ftmod: global_cc_memory 0x88003740
ipmi: Found new BMC (man_id: 0x00, prod_id: 0x, dev_id: 0x00)
ipmi device interface

The KCS at port DA2 is removed from use by ipmi_si. However, the other KCS is still in use by ipmi_si. Like ipmi_si, the Stratus IPMI driver uses ipmi_msghandler. With two interfaces sending commands to the same BMC, responses seem to be misdirected. The Stratus management software cannot successfully communicate with that BMC and many errors like this are logged by ipmi_msghandler:

IPMI message handler: BMC returned incorrect response, expected netfn 3d cmd 75, got netfn 3d cmd 71
IPMI message handler: BMC returned incorrect response, expected netfn 3d cmd 71, got netfn 19 cmd 20
IPMI message handler: BMC returned incorrect response, expected netfn b cmd 40, got netfn 3d cmd 71
IPMI message handler: BMC returned incorrect response, expected netfn 3d cmd 71, got netfn d cmd 2

I tried a few variations on the remove string, but never got ipmi_si to stop using the KCS at port CA2.

Robert N. Evans
Software Engineer
S T R A T U S   T E C H N O L O G I E S

Original Message
Subject: Re: [PATCH] ipmi: add new kernel options to prevent automatic ipmi init
Date: Thu, 13 Dec 2012 13:51:23 -0600
From: Corey Minyard tcminy...@gmail.com
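When many "incorrect response" errors like those in the message above are logged, a tally by expected and received netfn/cmd pair can make the misdirection pattern easier to see. The sketch below is one possible way to do that with awk. The helper name count_mismatches is an assumption made for illustration, and the sample input is copied from the log lines quoted in this thread.

```shell
#!/bin/sh
# Sketch only: count "BMC returned incorrect response" messages, grouped by
# expected and received netfn/cmd. count_mismatches is a hypothetical helper
# that reads log text on stdin and prints "<count> <expected> -> <got>".
count_mismatches() {
    awk '
    /BMC returned incorrect response, expected netfn/ {
        sub(/.*expected netfn /, "")          # leaves "3d cmd 75, got netfn 3d cmd 71"
        split($0, part, ", got netfn ")       # part[1] = expected, part[2] = received
        gsub(/ cmd /, "/", part[1])
        gsub(/ cmd /, "/", part[2])
        count[part[1] " -> " part[2]]++
    }
    END { for (k in count) print count[k], k }' | sort
}

# Example using lines from the message above; on a live system the input
# could come from dmesg or a saved console log instead.
count_mismatches <<'EOF'
IPMI message handler: BMC returned incorrect response, expected netfn 3d cmd 75, got netfn 3d cmd 71
IPMI message handler: BMC returned incorrect response, expected netfn 3d cmd 71, got netfn 19 cmd 20
IPMI message handler: BMC returned incorrect response, expected netfn b cmd 40, got netfn 3d cmd 71
IPMI message handler: BMC returned incorrect response, expected netfn 3d cmd 71, got netfn d cmd 2
EOF
```

Lines that do not match the message format are ignored, so unrelated kernel output can be piped in unchanged.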
RE: [PATCH] ipmi: add new kernel options to prevent automatic ipmi init
On 12/14/2012 12:02 PM, Corey Minyard wrote:
> On 12/14/2012 10:25 AM, Evans, Robert wrote:
>> Corey,
>>
>> Thanks for the thoughtful reply. Below I respond in detail to these three points.
>>
>> 1) Why building a variant kernel with ipmi_si as a module is not feasible.
>> 2) User mode access to IPMI on Stratus systems (e.g. ipmitool).
>> 3) ipmi_si hot removal seems to not work as needed.
>>
>> Stratus might be able to use the hot removal option instead of the proposed patch if hot removal can remove all interfaces from usage by ipmi_si. Our testing of this option was not successful as shown below.
>>
>> - - -
>>
>> 1) Why building a variant kernel with ipmi_si as a module is not feasible:
>>
>> Stratus sells servers based upon Red Hat Enterprise Linux (RHEL). In the next release of RHEL, ipmi_si will be built into the kernel so that access to ACPI opregion is available early in kernel startup. Stratus systems run the Red Hat kernel so that the system is certified and supported by Red Hat. For this reason using a custom kernel configured to build ipmi_si as a module is not an option.
>
> Yes, the RHEL engineer explained this to me, and it makes sense now. Thanks.
>
>> 2) User mode access to IPMI on Stratus systems:
>>
>> Although Stratus provides a replacement for ipmi_si, we depend on ipmi_msghandler and ipmi_devintf. The device /dev/ipmi0 is present and this device is utilized by the user-mode system management software Stratus supplies. Therefore other programs like ipmitool can send IPMI commands and get responses on Stratus systems.
>
> Ah, ok. That's good.
>
>> 3) Hot removal of the KCS interfaces discovered by ipmi_si seems to not do enough... One KCS cannot successfully be removed:
>>
>> Based upon your suggestion, we tried to use hot removal. With RHEL 6.4 Beta (kernel-2.6.32-343.el6), Stratus attempted to hot remove the IPMI interfaces. This was booted with ipmi_si.trydefaults=0 although we expect that kernel option to have no effect since a BMC is found before the defaults would be tried. This is logged when ipmi_si initializes indicating that both BMCs were discovered:
>>
>> ipmi message handler version 39.2
>> IPMI System Interface driver.
>> ipmi_si: Trying ACPI-specified kcs state machine at i/o address 0xca2, slave address 0x0, irq 0
>> ipmi: Found new BMC (man_id: 0x77, prod_id: 0x05c6, dev_id: 0x41)
>> IPMI kcs interface initialized
>> ipmi_si: Adding SMBIOS-specified kcs state machine
>> ipmi_si: Trying SMBIOS-specified kcs state machine at i/o address 0xda2, slave address 0x20, irq 0
>> ipmi: interfacing existing BMC (man_id: 0x77, prod_id: 0x05c6, dev_id: 0x41)
>> IPMI kcs interface initialized
>>
>> Although there are two different BMCs, because it says "interfacing existing BMC" it appears that ipmi_si assumes they are the same BMC.
>
> That's happening in the message handler and it happens because the manufacturer, product, and device id all match. From the spec:
>
>     The Device ID is typically used in combination with the Product ID field such that the Device IDs for different controllers are unique under a given Product ID. A controller can optionally use the Device ID as an 'instance' identifier if more than one controller of that kind is used in the system. (Section 20.1)
>
> Different controllers in the same system are supposed to have different device IDs.

I have made an inquiry to Stratus Hardware Engineering asking why our product is not compliant with the specification. I will pursue a change to future products to comply. However, Stratus has several generations of systems in the field for which this change will be very difficult.

>> Also, I notice the slave address for the first KCS (port CA2) seems wrong. Maybe these findings are relevant to what happens next.
>
> Probably not relevant. It's not correct because, for some bizarre reason, the slave address is not present in the ACPI information. The slave address is only used by the message handler for the IPMB return address on messages routed over IPMB.
>
> It is odd that one interface is specified in ACPI and the other in DMI. You can specify all of them in both tables.

The Stratus server is actually two complete servers that operate in lockstep to provide reliable operation regardless of any single failed component. One of the two I/O subsystems is active during BIOS POST. Only information about the active subsystem is placed in the SMBIOS data structure. Thus dmidecode shows this info for either port CA2 or DA2 depending upon which I/O CRU was active:

Handle 0x0048, DMI type 38, 18 bytes
IPMI Device Information
    Interface Type: KCS (Keyboard Control Style)
    Specification Version: 2.0
    I2C Slave Address: 0x10
    NV Storage Device: Not Present
    Base Address: 0x0DA2 (I/O)
    Register Spacing: Successive Byte Boundaries
    Interrupt Polarity: Active High
    Interrupt Trigger Mode: Edge

I believe the ACPI data provides information to locate