The driver is now doing exactly what it is supposed to do, but that
may not be what we want. I'm not sure of the configuration of this
system, but the information below gives me some clues. Here's my guess
at the system:
This is a NUMA system with hot-plug CPU boards. Each board has an IPMI
controller on it. The BIOS maps the I/O address and SMBIOS tables for
the IPMI controller to different I/O locations based upon the slot the
board is in. There are a number of problems beyond this one for a
configuration of this nature. I'll address those later.
In response to your question, I believe this is exactly what the Device
ID in IPMI is intended for. Each board in the system should have a
unique device id based upon the slot it is in. Say you have an
application that monitors the CPU temperature of all the CPUs. If a
temperature goes out of range, you want to know which board that CPU is
on. The Device ID can tell you that. The IPMI device numbers that
you suggest using are arbitrary, especially in a hot-plug system where
devices can come and go dynamically.
In addition, you would probably want to be able to do udev mappings so
that the same slots appear as the same device names (slot 1 is
/dev/ipmi1, slot 2 is /dev/ipmi2, etc.). The driver needs to be able to
give udev information about the devices, and the Product ID/Device ID is
really all it's got.
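To make the naming point concrete, here is a minimal userspace sketch,
not the driver's actual code. The struct, the helper names, and the
"ipmi_bmc.<product>.<device>" format are all assumptions made up for
illustration. It shows that two boards reporting the same Product ID and
Device ID end up with the same name, while a per-slot Device ID keeps
the names apart.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the ID fields read from a BMC. */
struct bmc_ids {
    uint16_t product_id;
    uint8_t  device_id;
};

/*
 * Build a name from the Product ID and Device ID (assumed format).
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int bmc_name(char *buf, size_t len, const struct bmc_ids *id)
{
    int n;

    assert(buf != NULL && id != NULL);
    n = snprintf(buf, len, "ipmi_bmc.%4.4x.%2.2x",
                 (unsigned)id->product_id, (unsigned)id->device_id);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Returns 1 if the two BMCs would get the same name, 0 otherwise. */
static int bmc_names_collide(const struct bmc_ids *a, const struct bmc_ids *b)
{
    char na[32], nb[32];

    if (bmc_name(na, sizeof(na), a) < 0 || bmc_name(nb, sizeof(nb), b) < 0)
        return 0;
    return strcmp(na, nb) == 0;
}
```

With the values from the trace below (prod_id 0x0007, dev_id 0x20 on
both nodes) the two names collide; giving the second board a different
Device ID is enough to separate them.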
Now for the other problems:
1. The IPMI driver doesn't currently support an arbitrary number of
devices. It has a fixed table of four. I can fix this fairly
easily, though. I wasn't really expecting a system to be designed
like this.
2. The IPMI driver has no way to handle dynamic node plugging. I
don't know of a standard way to tell the IPMI driver: "Hey, you
have a new controller here". The driver should support adding new
devices dynamically, but I need some way to know the device is
there, or that it is going away.
3. I don't think the IPMI driver provides a way for sysfs to report
the information that udev needs to do the udev mappings properly.
As always with sysfs, this is probably easy once you spend 2 days
figuring out what to do.
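For items 1 and 2, one possible shape is a list that grows and shrinks
as boards come and go, instead of a fixed table of four. The following
is only a userspace sketch under that assumption; the struct and
function names are hypothetical, and the notification that would call
the add and remove functions is exactly the piece that is still missing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-controller record; a real driver keeps far more state. */
struct ipmi_ctrl {
    unsigned long io_addr;      /* I/O address reported for the controller */
    struct ipmi_ctrl *next;
};

/* List head; stands in for the fixed-size table. */
struct ipmi_ctrl_list {
    struct ipmi_ctrl *head;
    unsigned int count;
};

/* Add a controller when a board appears. Returns 0, or -1 on duplicate/failure. */
static int ctrl_add(struct ipmi_ctrl_list *l, unsigned long io_addr)
{
    struct ipmi_ctrl *c;

    assert(l != NULL);
    for (c = l->head; c != NULL; c = c->next)
        if (c->io_addr == io_addr)
            return -1;          /* already registered */

    c = malloc(sizeof(*c));
    if (c == NULL)
        return -1;
    c->io_addr = io_addr;
    c->next = l->head;
    l->head = c;
    l->count++;
    return 0;
}

/* Remove a controller when its board goes away. Returns 0 if it was found. */
static int ctrl_remove(struct ipmi_ctrl_list *l, unsigned long io_addr)
{
    struct ipmi_ctrl **pp;

    assert(l != NULL);
    for (pp = &l->head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->io_addr == io_addr) {
            struct ipmi_ctrl *dead = *pp;

            *pp = dead->next;   /* unlink before freeing */
            free(dead);
            l->count--;
            return 0;
        }
    }
    return -1;
}
```

Keying the list on the I/O address is an arbitrary choice for the
sketch; any identifier that stays stable for a given slot would do.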
Am I on the right track here?
-Corey
Carol Hebert wrote:
> Hi Corey,
>
> I'm still having problems with the new patches due to the device ID and
> the Product ID being the same on each of the nodes (still have
> segfault/oops). The dual node system is really two separate nodes that
> are joined at will (via RSA setup). Since each began life (and can
> resume life at any time) as a standalone system, isn't it reasonable
> that they could have the same BMC Product and Device IDs? If not, do
> you think this is something that could/should be changed/set in the BIOS
> for each BMC on multi-node systems?
>
> Alternately, would it be possible to differentiate between the two BMCs
> for sysfs file naming purposes by using the value of intf->intf_num in
> ipmi_bmc_register()? I believe that's pretty similar to what's
> currently done to differentiate between the ipmi.0 and ipmi.1
> interfaces. As an example, I tacked the intf_num onto the product id in
> ipmi_bmc_register() (your and Jeff's patched version of the
> ipmi_msghandler.c file):
>
>  	} else {
> -		char name[14];
> +		char name[16];
>  		snprintf(name, sizeof(name),
> -			 "ipmi_bmc.%4.4x", bmc->id.product_id);
> +			 "ipmi_bmc.%4.4x%d", bmc->id.product_id,
> +			 intf->intf_num);
>
> and the modules loaded fine. The file names become: ipmi_bmc.00070.32
> and ipmi_bmc.00071.32 (see debug trace below). I suspect I may be
> grossly oversimplifying the feasibility/usability/implementation of this
> solution but at first glance/touch test, it appears to work so I thought
> it might be good to discuss it.
>
> Anyway, thanks again for your help. Please let me know what you'd like
> me to try next. Also, I can probably get some time on a 4-node and/or
> an 8-node system so we can really stress the solution once we've settled
> on a fix.
>
> Thanks much,
>
> Carol Hebert
>
> -----
>
> kobject ipmi_msghandler: registering. parent: <NULL>, set: module
> kobject_uevent
> fill_kobj_path: path = '/module/ipmi_msghandler'
> kobject ipmi: registering. parent: <NULL>, set: drivers
> kobject_uevent
> fill_kobj_path: path = '/bus/platform/drivers/ipmi'
> ipmi message handler version 39.0
> kobject ipmi_devintf: registering. parent: <NULL>, set: module
> kobject_uevent
> fill_kobj_path: path = '/module/ipmi_devintf'
> ipmi device interface
> subsystem ipmi: registering
> kobject ipmi: registering. parent: <NULL>, set: class
> kobject ipmi_si: registering. parent: <NULL>, set: module
> kobject_uevent
> fill_kobj_path: path = '/module/ipmi_si'
> kobject ipmi_si: registering. parent: <NULL>, set: drivers
> kobject_uevent
> fill_kobj_path: path = '/bus/platform/drivers/ipmi_si'
> IPMI System Interface driver.
> ipmi_si: Trying SMBIOS-specified KCS state machine at I/O address
> 0x90a8, slave address 0x20, irq 0
> kobject ipmi_si.0: registering. parent: platform, set: devices
> PM: Adding info for platform:ipmi_si.0
> kobject_uevent
> fill_kobj_path: path = '/devices/platform/ipmi_si.0'
> CAH: ipmi: NEW BMC: name = ipmi_bmc.00070; intf_num = 0
> kobject ipmi_bmc.00070.32: registering. parent: platform, set: devices
> PM: Adding info for platform:ipmi_bmc.00070.32
> kobject_uevent
> fill_kobj_path: path = '/devices/platform/ipmi_bmc.00070.32'
> ipmi: Found new BMC (man_id: 0x000002, prod_id: 0x0007, dev_id: 0x20)
> kobject ipmi0: registering. parent: ipmi, set: class_obj
> kobject_uevent
> fill_kobj_path: path = '/class/ipmi/ipmi0'
> fill_kobj_path: path = '/devices/platform/ipmi_si.0'
> IPMI KCS interface initialized
> ipmi_si: Trying SMBIOS-specified KCS state machine at I/O address 0xca8,
> slave address 0x20, irq 0
> kobject ipmi_si.1: registering. parent: platform, set: devices
> PM: Adding info for platform:ipmi_si.1
> kobject_uevent
> fill_kobj_path: path = '/devices/platform/ipmi_si.1'
> CAH: ipmi: NEW BMC: name = ipmi_bmc.00071; intf_num = 1
> kobject ipmi_bmc.00071.32: registering. parent: platform, set: devices
> PM: Adding info for platform:ipmi_bmc.00071.32
> kobject_uevent
> fill_kobj_path: path = '/devices/platform/ipmi_bmc.00071.32'
> ipmi: Found new BMC (man_id: 0x000002, prod_id: 0x0007, dev_id: 0x20)
> kobject ipmi1: registering. parent: ipmi, set: class_obj
> kobject_uevent
> fill_kobj_path: path = '/class/ipmi/ipmi1'
> fill_kobj_path: path = '/devices/platform/ipmi_si.1'
> IPMI KCS interface initialized
> kobject ipmi_si: registering. parent: <NULL>, set: drivers
> kobject_uevent
> fill_kobj_path: path = '/bus/pci/drivers/ipmi_si'
>
>
> On Tue, 2006-10-10 at 10:49 -0500, Corey Minyard wrote:
>
>> Sorry, I messed up the error recovery in the previous patch. This one
>> should fix it; I've simulated this and it works fine. I've also
>> included a patch from Jeff Garzik that does some more cleanup. Jeff's
>> patch must be applied first; it is named "ipmi-handle-sysfs-errors.patch".
>>
>> I'm still not sure what to do about the naming problem, though. I am
>> assuming the two devices have different GUIDs; otherwise they would
>> show up as the same BMC. I'd prefer to not use the GUID, as it is
>> huge and meaningless to humans and applications.
>>
>> I re-read the section in the spec again, and I really believe it is the
>> intent that different BMCs on the same system have different Device
>> IDs. That way system software can identify the BMCs without having to
>> worry about operating system or SMBIOS order. So I believe you want to
>> do this for reasons beyond the IPMI driver. There is an odd habit of
>> using the IPMB address of the controller as the Device ID, but I have no
>> idea where that comes from. It doesn't seem to be the intent of the
>> spec, at least in the section I was reading.
>>
>> -Corey
>>
>> Carol Hebert wrote:
>>
>>> Hi Corey,
>>>
>>> Thanks very much for the patch. :-) I built it and ran it on my system
>>> and it works a bit better than the original but it still has some
>>> problems. I'm attaching the dmesg output below (with a bit of debug
>>> turned on in it).
>>>
>>> With the patch, the modprobe appears to create one of the two ipmi
>>> device nodes (ipmi0) expected for the dual-node system although modprobe
>>> of ipmi_si appears to hang. Could you please take a look at the error
>>> messages below and see if you can spot the problem?
>>>
>>> Thanks much again,
>>>
>>> Carol Hebert
>>>
>>>
>>>