Corey Minyard wrote:
Corey>>> I don't have the I2C spec, but I was assured by the patch author
Corey>>> that this failure was a temporary failure.
Corey>>>
Corey>>> What is bizarre, though, is that this is only used when going out
Corey>>> on the IPMB; it shouldn't have any effect on local BMC messages.
Corey>>> I'd be surprised if this sensor was on the IPMB, or even if this
Corey>>> box had an IPMB at all.
Corey>>>
Corey>>> I suspect that there is some incorrect SDR information or
Corey>>> information that ipmitool is misinterpreting.
Corey>>>
Corey>>> I don't think you can dump the raw SDRs via ipmitool. You can with
Corey>>> openipmi (with the "mc sdr" command). You could pull the smi
Corey>>> connection up in the GUI, dump the SDRs on mc (0.20), and find the
Corey>>> sensor in the tree and get the information about it there.
Bela>> `ipmitool sdr dump out-file-name` does what it describes as "Dumps
Bela>> raw SDR data to a file". Here is such a file (in oldfangled uuencode
Bela>> format...)
Corey> I can analyze it, but it's rather raw data.
I thought that was what you were asking for ;-}
Bela>> But anyway, I don't see how `ipmitool` can be complicit here. It's
Bela>> just sending ioctls to the driver, it's the driver that's going off
Bela>> into lalaland.
Corey> Well, the driver is not exactly going into lalaland, it's really
Corey> doing exactly what it was designed to do. Whether that is correct or
Corey> not is a different story, but I got the information from someone who
Corey> knows better than me. My understanding is that the NAK error means:
Corey> "Something is physically wrong with the message". If everything else
Corey> was correct, it probably means a bit got flipped in the message and a
Corey> checksum failed. It seems reasonable to try a few times. But in this
Corey> case, ipmitool is asking the driver to do something invalid. In my
Corey> experience with this, interpreting "valid" and "invalid" is difficult
Corey> because it changes all the time and varies with opinion, especially
Corey> with a specification as loosely worded as the IPMI spec. As much as
Corey> possible, the driver basically passed the buck to the BMC to
Corey> interpret the data.
Corey>
Corey> I was able to analyze the raw data and figure out what is going on,
Corey> and as I said, ipmitool is wrong. And I believe the BMC is issuing an
Corey> incorrect response. The "Sensor Owner ID" in the SDR for that sensor
Corey> is 0xb1. The '1' in the lowest bit means that this is a "system
Corey> software ID". The IPMI spec is not terribly clear what "system
Corey> software ID" means or what you are supposed to do with a sensor of
Corey> that type, but it is clearly not an IPMB address that you can send a
Corey> message to. ipmitool should ignore sensors like that (except for
Corey> interpreting events). Instead, it is trying to send an IPMB message
Corey> to it. (It seems, BTW, that these are "event-only" type sensors that
Corey> allow you to interpret what certain events mean.)
Would it be out of line for the driver to reject attempts to talk to
IPMB addresses with the low bit set (which are apparently "system
software ID"s)?
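To make the question concrete, the check would amount to something like
the sketch below. The helper names are made up for illustration and do not
exist in the driver or in ipmitool; the only fact relied on is the one
stated above, that a '1' in the lowest bit of the owner ID marks a system
software ID rather than an IPMB address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the Sensor Owner ID: 0 = IPMB slave address,
 * 1 = system software ID (per the discussion above). */
#define OWNER_ID_SWID_BIT 0x01

/* Hypothetical helper: true if owner_id is usable as an IPMB destination. */
static bool owner_id_is_ipmb_addr(uint8_t owner_id)
{
    return (owner_id & OWNER_ID_SWID_BIT) == 0;
}

/* Hypothetical helper: the 7-bit ID carried in bits 7:1 of the owner ID. */
static uint8_t owner_id_7bit(uint8_t owner_id)
{
    return (uint8_t)(owner_id >> 1);
}
```

With the 0xb1 from this SDR, owner_id_is_ipmb_addr() returns false, so a
caller applying this check would skip the sensor instead of sending to it.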
Corey> The BMC, on the other hand, is not checking this bit and rejecting
Corey> the message as invalid. It looks like it is trying to send to that
Corey> address. A '1' in that bit has special (and invalid in this case)
Corey> meaning on the I2C bus and the message gets rejected, but in a way
Corey> that looks like the message should be retried. However, the machines
Corey> I have look like they do the same thing :(.
So it looks like this could be corrected in up to 3 places:
- ipmitool should recognize that as not a communicable device, not try
  to talk to it;
- ipmi driver should reject the attempt out of hand;
- BMC firmware should also reject the attempt, returning an error code
  such as:
       0xC2 "Command invalid for given LUN"
       0xCB "Requested Sensor, data, or record not present"
  ?--> 0xCD "Command illegal for specified sensor or record type"
       0xD3 "Destination unavailable. Cannot deliver request to
             selected destination"
BTW I was moved to create the following, which you might find useful to
round out ipmi_msgdefs.h.
>Bela<
/*
 * From "Intelligent Platform Management Interface Specification, Second
 * Generation, v2.0, Document Revision 1.0, February 12, 2004; February
 * 15, 2006 Markup"; "Table 5-2, Completion Codes".
 *
 * http://www.intel.com/design/servers/ipmi/pdf/IPMIv2_0_rev1_0_E3_markup.pdf
 *
 * Note: completion codes 0x01-0x7F are device/OEM-specific.
 *       completion codes 0x80-0xBF are command-specific.
 */
#define IPMI_CC_NO_ERROR        0x00 /* Command Completed Normally. */
#define IPMI_CC_NODE_BUSY       0xC0 /* Node Busy. Command could not be
                                      * processed because command processing
                                      * resources are temporarily
                                      * unavailable. */
#define IPMI_CC_INVALID_CMD     0xC1 /* Invalid Command. Used to indicate an
                                      * unrecognized or unsupported command.
                                      */
#define IPMI_CC_INVALID_LUN     0xC2 /* Command invalid for given LUN. */
#define IPMI_CC_TIMEOUT         0xC3 /* Timeout while processing command.
                                      * Response unavailable. */
#define IPMI_CC_OUT_OF_SPACE    0xC4 /* Out of space. Command could not be
                                      * completed because of a lack of
                                      * storage space required to execute
                                      * the given command operation. */
#define IPMI_CC_INVAL_RESV      0xC5 /* Reservation Canceled or Invalid
                                      * Reservation ID. */
#define IPMI_CC_REQ_TRUNC       0xC6 /* Request data truncated. */
#define IPMI_CC_REQ_LEN_INVAL   0xC7 /* Request data length invalid. */
#define IPMI_CC_REQ_TOO_LONG    0xC8 /* Request data field length limit
                                      * exceeded. */
#define IPMI_CC_PARAM_RANGE     0xC9 /* Parameter out of range. One or more
                                      * parameters in the data field of the
                                      * Request are out of range. This is
                                      * different from `Invalid data field'
                                      * (CCh) code in that it indicates that
                                      * the erroneous field(s) has a
                                      * contiguous range of possible values.
                                      */
#define IPMI_CC_DATA_LEN        0xCA /* Cannot return number of requested
                                      * data bytes. */
#define IPMI_CC_NOT_PRESENT     0xCB /* Requested Sensor, data, or record not
                                      * present. */
#define IPMI_CC_INVAL_FIELD     0xCC /* Invalid data field in Request */
#define IPMI_CC_CMD_INCOMPAT    0xCD /* Command illegal for specified sensor
                                      * or record type. */
#define IPMI_CC_CANT_RESPOND    0xCE /* Command response could not be
                                      * provided. */
#define IPMI_CC_DUP_REQ         0xCF /* Cannot execute duplicated request.
                                      * This completion code is for devices
                                      * which cannot return the response that
                                      * was returned for the original
                                      * instance of the request. Such devices
                                      * should provide separate commands that
                                      * allow the completion status of the
                                      * original request to be determined. An
                                      * Event Receiver does not use this
                                      * completion code, but returns the 00h
                                      * completion code in the response to
                                      * (valid) duplicated requests. */
#define IPMI_CC_SDR_UPDATING    0xD0 /* Command response could not be
                                      * provided. SDR Repository in update
                                      * mode. */
#define IPMI_CC_FW_UPDATING     0xD1 /* Command response could not be
                                      * provided. Device in firmware update
                                      * mode. */
#define IPMI_CC_BMC_INITING     0xD2 /* Command response could not be
                                      * provided. BMC initialization or
                                      * initialization agent in progress. */
#define IPMI_CC_DEST_UNAVAIL    0xD3 /* Destination unavailable. Cannot
                                      * deliver request to selected
                                      * destination. E.g. this code can be
                                      * returned if a request message is
                                      * targeted to SMS, but receive message
                                      * queue reception is disabled for the
                                      * particular channel. */
#define IPMI_CC_PERM_DENIED     0xD4 /* Cannot execute command due to
                                      * insufficient privilege level or other
                                      * security-based restriction (e.g.
                                      * disabled for `firmware firewall'). */
#define IPMI_CC_NOT_SUPP        0xD5 /* Cannot execute command. Command, or
                                      * request parameter(s), not supported
                                      * in present state. */
#define IPMI_CC_PARAM_ILLEGAL   0xD6 /* Cannot execute command. Parameter is
                                      * illegal because command sub-function
                                      * has been disabled or is unavailable
                                      * (e.g. disabled for `firmware
                                      * firewall'). */
#define IPMI_CC_UNSPECIFIED     0xFF /* Unspecified error. */
/* Code compatibility */
#define IPMI_NODE_BUSY_ERR IPMI_CC_NODE_BUSY
#define IPMI_INVALID_COMMAND_ERR IPMI_CC_INVALID_CMD
#define IPMI_ERR_UNSPECIFIED IPMI_CC_UNSPECIFIED
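
As a rough illustration of why the choice of completion code matters for
the retry problem discussed above, here is a minimal sketch of a helper
that decides whether a code looks worth retrying. The function name is
hypothetical, and which codes count as "transient" is an assumption made
for this example based only on their descriptions; it is not taken from
the spec or from the driver.

```c
#include <stdbool.h>

/* Repeated from the definitions above so the sketch compiles on its own. */
#ifndef IPMI_CC_NODE_BUSY
#define IPMI_CC_NODE_BUSY     0xC0
#define IPMI_CC_TIMEOUT       0xC3
#define IPMI_CC_SDR_UPDATING  0xD0
#define IPMI_CC_FW_UPDATING   0xD1
#define IPMI_CC_BMC_INITING   0xD2
#endif

/* Hypothetical helper: true for codes whose description suggests the
 * condition may clear by itself (assumption, see the text above). */
static bool ipmi_cc_is_transient(unsigned char cc)
{
    switch (cc) {
    case IPMI_CC_NODE_BUSY:
    case IPMI_CC_TIMEOUT:
    case IPMI_CC_SDR_UPDATING:
    case IPMI_CC_FW_UPDATING:
    case IPMI_CC_BMC_INITING:
        return true;
    default:
        return false;
    }
}
```

Under this classification a BMC answering with 0xCD or 0xD3 would not be
retried, which is the behaviour wanted for the sensor in question.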
_______________________________________________
Openipmi-developer mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openipmi-developer