Carol Hebert wrote:
> Hi Corey,
>
> Sorry I wasn't able to reply sooner.  I wanted to discuss this with a
> couple of other folks first.
>
> As per your first solution listed below, I'm going to propose asap that
> we modify the f/w to ensure that the device ID is unique for every BMC.
> I don't know yet if the proposal will be accepted, but assuming it is,
> it would solve the problem for the long term.  However, maybe we ought
> to supplement that solution with one of the other solutions you listed
> (or some combination thereof) to solve the problem in the interim until
> the new f/w is released and also in general for users running the older
> (current) f/w?
>   
Ok, that's probably good for short term.  Do you know if devices with
different firmware might mix?  That might end up being messy.
> Creating an internal mapping table of the different BMCs found at probe
> time (and maybe setting the device ID to be an index into the table)
> might be useful.  Using the GUID (as you imply) would be messy; however,
> not saving the GUID/address in some form for later use might make it
> difficult to know for sure where each BMC is physically located unless
> it's guaranteed that the BMCs are probed in order.  Do you know if
> there's a guarantee that the BMCs are probed sequentially in a multi-BMC
> system? 
>   
BMCs are probed in the order that they appear in the SMBIOS/ACPI tables.
Note that I have a new patch to allow hot adding/removing BMCs.  You
might be able to move the problem to userspace and handle it there.
> Please let me know what I can do to help.  In the meantime, I'll take a
> look at the current code and try to figure out why it's still oopsing.
>   
I thought the oops was fixed.  If not, can you send one?

As far as things you can do, I'm not really sure.  I don't have enough
details on how this hardware works to design a solution.  This is really
nitty-gritty detail information, like how the nodes map their BMC
addresses and how the SMBIOS table is populated.  If the BMCs appeared
in the SMBIOS tables in node order, then the solution is very easy: just
detect duplicate device IDs and add 1 for each.  I could also print a
warning at startup when the driver detects this, and it would probably
cover a multitude of future evils :-).
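
To make the detect-and-increment idea above concrete, here is a rough
userspace sketch in C.  It is not driver code: the function name
dedup_device_ids and the plain array of one-byte IDs are made up for
illustration, and it only gives sensible slot numbers under the assumption
that the BMCs really do show up in node order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, NOT part of the real IPMI driver.
 * ids[] holds the Device ID reported by each BMC, in probe order
 * (assumed to be SMBIOS/ACPI table order).  A duplicate ID is bumped
 * by one until it no longer collides with any earlier BMC, and a
 * warning is printed for each one that had to change.
 * Returns the number of IDs that were modified.
 */
static size_t dedup_device_ids(unsigned char *ids, size_t count)
{
    size_t changed = 0;

    /* The Device ID is a single byte, so at most 256 can be unique. */
    assert(count <= 256);

    for (size_t i = 0; i < count; i++) {
        unsigned char candidate = ids[i];
        size_t j = 0;

        /* Rescan the earlier entries every time the candidate changes. */
        while (j < i) {
            if (ids[j] == candidate) {
                candidate++;    /* wraps from 0xff to 0x00 */
                j = 0;
            } else {
                j++;
            }
        }

        if (candidate != ids[i]) {
            fprintf(stderr,
                    "warning: BMC %zu reported duplicate device id 0x%02x, "
                    "using 0x%02x instead\n",
                    i, (unsigned)ids[i], (unsigned)candidate);
            ids[i] = candidate;
            changed++;
        }
    }
    return changed;
}
```

With four BMCs that all report 0x20, this would leave the first one alone
and renumber the others to 0x21, 0x22 and 0x23, with one warning each.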

-Corey
> Thanks for your help,
>
> Carol Hebert
>
>   
>>>   
>>>       
>> The "easily digestible form" part is the problem here.  You need some
>> method to correlate a GUID to something a human being can use to
>> identify a system, and it would be nice if it wasn't custom for every
>> installed system out there.  The address is perhaps better, but I'm
>> going to have to have some way to translate the addresses to system
>> numbers, and it's going to have to be OEM for this type of hardware, and
>> the addresses are not available at the level this is happening; this
>> code is generic for all interface types.
>>
>> So what we can do, in my order of preference :-) :
>>
>>    1. Modify the IPMI firmware to set the device id to a unique number
>>       for every BMC in the system.  It would be really nice if this was
>>       done in a way that the device ids could be correlated with
>>       physical systems.  This will work with the IPMI driver as-is, and
>>       I checked and udev translations can be done as-is, too, I believe.
>>    2. Use some OEM IPMI command that could query the physical system
>>       number, if something like this exists.
>>    3. Create an OEM handler to use the GUID to map to physical systems. 
>>       I'm going to need some help with this, I have no idea how to do
>>       this.  Looking at the GUID format (Table 20-10 in the IPMI 2.0
>>       spec), I don't see any way to do this.  The node field, BTW, is
>>       supposed to be the 802.x MAC address.
>>    4. Use the I/O address.  This introduces a lot of headaches into the
>>       structure of the IPMI driver as the address has to be propagated
>>       from the interface-specific handler to the generic code, and it
>>       introduces an OEM handler.  And I'll need some way to map the I/O
>>       addresses to physical systems.
>>
>> Any more ideas?
>>
>> -Corey
>>     
>>> Regarding the ipmi device support currently being fixed at a max of 4,
>>> the largest multi-node configuration we currently have is 8 so we would
>>> need to have the table size bumped up to at least 8.  However, for
>>> future support, it might be useful to increase it even more (12, 16?).
>>>   
>>>       
>> I'll probably just make it a list and get rid of the table so it can be
>> arbitrary counts.
>>     
>>> Finally, I don't believe dynamic node plugging will generally be an
>>> issue for my system since the nodes are merged at boot time rather than
>>> being dynamically added and/or removed.
>>>   
>>>       
>> So the time is not here yet, but I'm sure it's coming someday :)  I can
>> wait on this one, then, but I decided it would be pretty easy to do
>> through the hotplug subsystem.
>>
>> -Corey
>>     
>>> Thanks very much,
>>>
>>> Carol Hebert
>>>
>>> On Wed, 2006-10-11 at 10:25 -0500, Corey Minyard wrote:
>>>   
>>>       
>>>> Now the driver is doing exactly what it is supposed to do, but now that
>>>> may not be what we want.  I'm not sure of the configuration of this
>>>> system, but the information below gives me some clues.  Here's my guess
>>>> on the system:
>>>>
>>>> This is a NUMA system with hot-plug CPU boards.  Each board has an IPMI
>>>> controller on it.  The BIOS maps the I/O address and SMBIOS tables for
>>>> the IPMI controller to different I/O locations based upon the slot the
>>>> board is in.  There are a number of problems beyond this one for a
>>>> configuration of this nature.  I'll address those later.
>>>>
>>>> In response to your question, I believe this is exactly what the Device
>>>> ID in IPMI is intended for.  Each board in the system should have a
>>>> unique device id based upon the slot it is in.  Say you have an
>>>> application that monitors the CPU temperature of all the CPUs.  If a
>>>> temperature goes out of range, you want to know which board that CPU is
>>>> on.  And the Device ID can tell you that.  The IPMI device numbers that
>>>> you suggest using are arbitrary, especially in a hot-plug system where
>>>> devices can come and go dynamically.
>>>>
>>>> In addition, you would probably want to be able to do udev mappings so
>>>> that the same slots appear as the same device names (slot 1 is
>>>> /dev/ipmi1, slot 2 is /dev/ipmi2, etc.).  The driver needs to be able to
>>>> give udev information about the devices, and the Product ID/Device ID is
>>>> really all it's got.
>>>>
>>>> Now for the other problems:
>>>>
>>>>    1. The IPMI driver doesn't currently support an arbitrary number of
>>>>       devices.  It has a fixed table of four.  I can fix this fairly
>>>>       easily, though.  I wasn't really expecting a system to be designed
>>>>       like this.
>>>>    2. The IPMI driver has no way to handle dynamic node plugging.  I
>>>>       don't know of a standard way to tell the IPMI driver: "Hey, you
>>>>       have a new controller here".  The driver should support adding new
>>>>       devices dynamically, but I need some way to know the device is
>>>>       there, or that it is going away.
>>>>    3. I don't think the IPMI driver provides a way for sysfs to report
>>>>       the information that udev needs to do the udev mappings properly.
>>>>       As always with sysfs, this is probably easy once you spend 2 days
>>>>       figuring out what to do.
>>>>
>>>> Am I on the right track here?
>>>>     
>>>>         
>>>   
>>>       


_______________________________________________
Openipmi-developer mailing list
Openipmi-developer@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openipmi-developer