Carol Hebert wrote:
> Hi,
>
> I believe your assessment of my x460 dual-node system configuration is
> correct with the exception of maybe changing the word "slot" to "system"
> since the nodes are joined by scalability cables rather than being
> connected via a common backplane.
>
> Regarding the uniqueness of the Device ID, I think you mentioned in an
> earlier email that the spec was a bit contradictory on the topic.  I
> took a look at the spec and agree that it is not at all clear whether
> the Device ID should be unique for all controllers or only for ones that
> support a different set of application commands/OEM fields.  In one
> paragraph, it states that:
>
> "Controllers that implement identical sets of applications (sic)
> commands can have the same Device ID in a given system. Thus a
> 'standardized' controller could be produced where multiple instances of
> the controller are used in a system, and all have the same Device ID
> value.  The controllers would still be differentiable by their
> address..."  
>
> and in the *immediately following* paragraph, it states 
>
> "A controller can optionally use the Device ID as an 'instance'
> identifier if more than one controller of that kind is used in the
> system."   (It then goes on to say that the GUID, however, is the
> preferred method of uniquely identifying controllers.)
>
> Sheesh.  :-}  
>
> In checking out the dmidecode data, I verified that the addresses of the
> controllers on the multi-node system are unique and available there.  So
> both the GUID and the address are unique for the controllers on the
> multi-node system whereas the Device ID is not.  Can we use the GUID
> (maybe in some more easily digestible form) or the address instead of
> the Device ID?   It seems like the only thing that's clear from the spec
> is that the Device ID's uniqueness is something we can't count on.
>   
The "easily digestible form" part is the problem here.  You need some
method to correlate a GUID to something a human being can use to
identify a system, and it would be nice if it wasn't custom for every
installed system out there.  The address is perhaps better, but I'm
going to have to have some way to translate the addresses to system
numbers, and it's going to have to be OEM for this type of hardware, and
the addresses are not available at the level this is happening, this
code is generic for all interface types.

So what we can do, in my order of preference :-) :

   1. Modify the IPMI firmware to set the device id to a unique number
      for every BMC in the system.  It would be really nice if this
      were done in a way that the device ids could be correlated with
      physical systems.  This will work with the IPMI driver as-is,
      and I checked: udev translations can be done as-is, too, I
      believe.
   2. Use some OEM IPMI command that could query the physical system
      number, if something like this exists.
   3. Create an OEM handler to use the GUID to map to physical systems.
      I'm going to need some help with this, since I have no idea how
      to do it.  Looking at the GUID format (Table 20-10 in the IPMI
      2.0 spec), I don't see any obvious way to derive a physical
      system number from it (a rough decoding sketch follows this
      list).  The node field, BTW, is supposed to be the 802.x MAC
      address.
   4. Use the I/O address.  This introduces a lot of headaches into the
      structure of the IPMI driver as the address has to be propagated
      from the interface-specific handler to the generic code, and it
      introduces an OEM handler.  And I'll need some way to map the I/O
      addresses to physical systems.

Any more ideas?

-Corey
> Regarding the ipmi device support currently being fixed at a max of 4,
> the largest multi-node configuration we currently have is 8 so we would
> need to have the table size bumped up to at least 8.  However, for
> future support, it might be useful to increase it even more (12, 16?).
>   
I'll probably just make it a list and get rid of the table so it can
handle arbitrary counts.
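
Something like this, roughly (plain C with made-up names, not the
driver's real structures), with the fixed table replaced by a list:

#include <stdlib.h>
#include <stdio.h>

struct smi_info {
    int intf_num;                /* made-up field for illustration */
    struct smi_info *next;
};

/* A list head instead of a fixed four-entry table. */
static struct smi_info *smi_list;

static struct smi_info *add_smi(int num)
{
    struct smi_info *info = calloc(1, sizeof(*info));

    if (!info)
        return NULL;
    info->intf_num = num;
    info->next = smi_list;
    smi_list = info;
    return info;
}

int main(void)
{
    struct smi_info *p;
    int i;

    for (i = 0; i < 8; i++)      /* eight nodes, no table limit */
        add_smi(i);
    for (p = smi_list; p; p = p->next)
        printf("interface %d\n", p->intf_num);
    return 0;
}

In the driver itself this would use the kernel's list machinery rather
than a hand-rolled list, but the idea is the same.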
> Finally, I don't believe dynamic node plugging will generally be an
> issue for my system since the nodes are merged at boot time rather than
> being dynamically added and/or removed.
>   
So the time is not here yet, but I'm sure it's coming someday :)  I can
wait on this one, then, but I decided it would be pretty easy to do
through the hotplug subsystem.

-Corey
> Thanks very much,
>
> Carol Hebert
>
> On Wed, 2006-10-11 at 10:25 -0500, Corey Minyard wrote:
>   
>> Now the driver is doing exactly what it is supposed to do, but that
>> may not be what we want.  I'm not sure of the configuration of this
>> system, but the information below gives me some clues.  Here's my guess
>> on the system:
>>
>> This is a NUMA system with hot-plug CPU boards.  Each board has an IPMI
>> controller on it.  The BIOS maps the I/O address and SMBIOS tables for
>> the IPMI controller to different I/O locations based upon the slot the
>> board is in.  There are a number of problems beyond this one for a
>> configuration of this nature.  I'll address those later.
>>
>> In response to your question, I believe this is exactly what the Device
>> ID in IPMI is intended for.  Each board in the system should have a
>> unique device id based upon the slot it is in.  Say you have an
>> application that monitors the CPU temperature of all the CPUs.  If a
>> temperature goes out of range, you want to know which board that CPU is
>> on.  And the Device ID can tell you that.  The IPMI device numbers that
>> you suggest using are arbitrary, especially in a hot-plug system where
>> devices can come and go dynamically.
>>
>> In addition, you would probably want to be able to do udev mappings so
>> that the same slots appear as the same device names (slot 1 is
>> /dev/ipmi1, slot 2 is /dev/ipmi2, etc.).  The driver needs to be able to
>> give udev information about the devices, and the Product ID/Device ID is
>> really all it's got.
>>
>> Now for the other problems:
>>
>>    1. The IPMI driver doesn't currently support an arbitrary number of
>>       devices.  It has a fixed table of four.  I can fix this fairly
>>       easily, though.  I wasn't really expecting a system to be designed
>>       like this.
>>    2. The IPMI driver has no way to handle dynamic node plugging.  I
>>       don't know of a standard way to tell the IPMI driver: "Hey, you
>>       have a new controller here".  The driver should support adding new
>>       devices dynamically, but I need some way to know the device is
>>       there, or that it is going away.
>>    3. I don't think the IPMI driver provides a way for sysfs to report
>>       the information that udev needs to do the udev mappings properly.
>>       As always with sysfs, this is probably easy once you spend 2 days
>>       figuring out what to do.
>>
>> Am I on the right track here?
>>     
>
>   

