Corey Minyard wrote:

Corey>>> I don't have the I2C spec, but I was assured by the patch author
Corey>>> that this failure was a temporary failure.
Corey>>>
Corey>>> What is bizarre, though, is that this is only used when going out on
Corey>>> the IPMB; it shouldn't have any effect on local BMC messages.  I'd be
Corey>>> surprised if this sensor was on the IPMB, or even if this box had an
Corey>>> IPMB at all.
Corey>>>
Corey>>> I suspect that there is some incorrect SDR information or
Corey>>> information that ipmitool is misinterpreting.
Corey>>>
Corey>>> I don't think you can dump the raw SDRs via ipmitool.  You can with
Corey>>> openipmi (with the "mc sdr" command).  You could pull the smi
Corey>>> connection up in the GUI, dump the SDRs on mc (0.20), and find the
Corey>>> sensor in the tree and get the information about it there.

Bela>> `ipmitool sdr dump out-file-name` does what it describes as "Dumps raw
Bela>> SDR data to a file".  Here is such a file (in oldfangled uuencode
Bela>> format...)

Corey> I can analyze it, but it's rather raw data.

I thought that was what you were asking for ;-}

Bela>> But anyway, I don't see how `ipmitool` can be complicit here.  It's
Bela>> just sending ioctls to the driver, it's the driver that's going off
Bela>> into lalaland.

Corey> Well, the driver is not exactly going into lalaland, it's really doing
Corey> exactly what it was designed to do.  Whether that is correct or not is
Corey> a different story, but I got the information from someone who knows
Corey> better than me.  My understanding is that the NAK error means:
Corey> "Something is physically wrong with the message".  If everything else
Corey> was correct, it probably means a bit got flipped in the message and a
Corey> checksum failed.  It seems reasonable to try a few times.  But in this
Corey> case, ipmitool is asking the driver to do something invalid.  In my
Corey> experience with this, interpreting "valid" and "invalid" is difficult
Corey> because it changes all the time and varies with opinion, especially
Corey> with a specification as loosely worded as the IPMI spec.  As much as
Corey> possible, the driver basically passed the buck to the BMC to interpret
Corey> the data.
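
To make the distinction between "retry" and "reject" concrete, here is a minimal sketch of a bounded retry that treats only a NAK as transient and gives up immediately on anything else. The enum, the callback type, and the function names are all made up for illustration; this is not the driver's actual code.

```c
/* Hypothetical result codes for a single send attempt. */
enum send_result { SEND_OK, SEND_NAK, SEND_INVALID };

/* Hypothetical callback that performs one send attempt. */
typedef enum send_result (*send_fn)(void *ctx);

/*
 * Retry only while the attempt is NAKed (assumed transient), up to
 * max_tries attempts.  Any other result is returned at once.  With
 * max_tries <= 0 nothing is sent and SEND_INVALID is returned.
 */
static enum send_result send_with_retry(send_fn send, void *ctx,
                                        int max_tries)
{
    enum send_result r = SEND_INVALID;
    int i;

    for (i = 0; i < max_tries; i++) {
        r = send(ctx);
        if (r != SEND_NAK)
            break;
    }
    return r;
}

/* Example callback: NAKs until *ctx counts down to zero, then succeeds. */
static enum send_result nak_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return SEND_NAK;
    }
    return SEND_OK;
}
```

The catch in the case under discussion is that a request which can never succeed comes back looking like SEND_NAK, so a loop like this retries it anyway.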
Corey> 
Corey> I was able to analyze the raw data and figure out what is going on,
Corey> and as I said, ipmitool is wrong.  And I believe the BMC is issuing an
Corey> incorrect response.  The "Sensor Owner ID" in the SDR for that sensor
Corey> is 0xb1.  The '1' in the lowest bit means that this is a "system
Corey> software ID".  The IPMI spec is not terribly clear what "system
Corey> software ID" means or what you are supposed to do with a sensor of
Corey> that type, but it is clearly not an IPMB address that you can send a
Corey> message to.  ipmitool should ignore sensors like that (except for
Corey> interpreting events).  Instead, it is trying to send an IPMB message
Corey> to it.  (It seems, BTW, that these are "event-only" type sensors that
Corey> allow you to interpret what certain events mean.)

Would it be out of line for the driver to reject attempts to talk to
IPMB addresses with the low bit set (which are apparently "system
software ID"s)?
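
For illustration, the check in question could be sketched as below. The helper names are hypothetical (not taken from ipmitool or the driver); the only thing assumed is the "Sensor Owner ID" layout described above, where bit 0 selects between an IPMB slave address (0) and a system software ID (1).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers.  Bit 0 of the Sensor Owner ID byte:
 *   0 = bits 7:1 are an IPMB slave address
 *   1 = bits 7:1 are a system software ID
 */
static bool owner_is_system_software(uint8_t sensor_owner_id)
{
    return (sensor_owner_id & 0x01) != 0;
}

/* True only if the owner ID can be used as an IPMB destination. */
static bool owner_is_ipmb_addressable(uint8_t sensor_owner_id)
{
    return !owner_is_system_software(sensor_owner_id);
}
```

With the 0xb1 owner ID from the SDR in question, owner_is_ipmb_addressable() is false, so a caller would skip the sensor rather than send to it.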

Corey> The BMC, on the other hand, is not checking this bit and rejecting the
Corey> message as invalid.  It looks like it is trying to send to that
Corey> address.  A '1' in that bit has special (and invalid in this case)
Corey> meaning on the I2C bus and the message gets rejected, but in a way
Corey> that looks like the message should be retried.  However, the machines
Corey> I have look like they do the same thing :(.

So it looks like this could be corrected in up to 3 places:

  - ipmitool should recognize that as not a communicable device, and not
    try to talk to it;

  - ipmi driver should reject the attempt out of hand;

  - BMC firmware should also reject the attempt, returning an error code
    such as:

          0xC2 "Command invalid for given LUN"
          0xCB "Requested Sensor, data, or record not present"
     ?--> 0xCD "Command illegal for specified sensor or record type"
          0xD3 "Destination unavailable. Cannot deliver request to
                selected destination"

BTW I was moved to create the following, which you might find useful to
round out ipmi_msgdefs.h.

>Bela<

/*
 * From "Intelligent Platform Management Interface Specification, Second
 * Generation, v2.0, Document Revision 1.0, February 12, 2004; February
 * 15, 2006 Markup"; "Table 5-2, Completion Codes".
 *
 * http://www.intel.com/design/servers/ipmi/pdf/IPMIv2_0_rev1_0_E3_markup.pdf
 *
 * Note: completion codes 0x01-0x7F are device/OEM-specific.
 *       completion codes 0x80-0xBF are command-specific.
 */

#define IPMI_CC_NO_ERROR        0x00 /* Command Completed Normally. */
#define IPMI_CC_NODE_BUSY       0xC0 /* Node Busy. Command could not be
                                      * processed because command processing
                                      * resources are temporarily
                                      * unavailable. */
#define IPMI_CC_INVALID_CMD     0xC1 /* Invalid Command. Used to indicate an
                                      * unrecognized or unsupported
                                      * command. */
#define IPMI_CC_INVALID_LUN     0xC2 /* Command invalid for given LUN. */
#define IPMI_CC_TIMEOUT         0xC3 /* Timeout while processing command.
                                      * Response unavailable. */
#define IPMI_CC_OUT_OF_SPACE    0xC4 /* Out of space. Command could not be
                                      * completed because of a lack of
                                      * storage space required to execute the
                                      * given command operation. */
#define IPMI_CC_INVAL_RESV      0xC5 /* Reservation Canceled or Invalid
                                      * Reservation ID. */
#define IPMI_CC_REQ_TRUNC       0xC6 /* Request data truncated. */
#define IPMI_CC_REQ_LEN_INVAL   0xC7 /* Request data length invalid. */
#define IPMI_CC_REQ_TOO_LONG    0xC8 /* Request data field length limit
                                      * exceeded. */
#define IPMI_CC_PARAM_RANGE     0xC9 /* Parameter out of range. One or more
                                      * parameters in the data field of the
                                      * Request are out of range. This is
                                      * different from `Invalid data field'
                                      * (CCh) code in that it indicates that
                                      * the erroneous field(s) has a
                                      * contiguous range of possible
                                      * values. */
#define IPMI_CC_DATA_LEN        0xCA /* Cannot return number of requested
                                      * data bytes. */
#define IPMI_CC_NOT_PRESENT     0xCB /* Requested Sensor, data, or record not
                                      * present. */
#define IPMI_CC_INVAL_FIELD     0xCC /* Invalid data field in Request */
#define IPMI_CC_CMD_INCOMPAT    0xCD /* Command illegal for specified sensor
                                      * or record type. */
#define IPMI_CC_CANT_RESPOND    0xCE /* Command response could not be
                                      * provided. */
#define IPMI_CC_DUP_REQ         0xCF /* Cannot execute duplicated request.
                                      * This completion code is for devices
                                      * which cannot return the response that
                                      * was returned for the original
                                      * instance of the request. Such devices
                                      * should provide separate commands that
                                      * allow the completion status of the
                                      * original request to be determined. An
                                      * Event Receiver does not use this
                                      * completion code, but returns the 00h
                                      * completion code in the response to
                                      * (valid) duplicated requests. */
#define IPMI_CC_SDR_UPDATING    0xD0 /* Command response could not be
                                      * provided. SDR Repository in update
                                      * mode. */
#define IPMI_CC_FW_UPDATING     0xD1 /* Command response could not be
                                      * provided. Device in firmware update
                                      * mode. */
#define IPMI_CC_BMC_INITING     0xD2 /* Command response could not be
                                      * provided. BMC initialization or
                                      * initialization agent in progress. */
#define IPMI_CC_DEST_UNAVAIL    0xD3 /* Destination unavailable. Cannot
                                      * deliver request to selected
                                      * destination. E.g. this code can be
                                      * returned if a request message is
                                      * targeted to SMS, but receive message
                                      * queue reception is disabled for the
                                      * particular channel. */
#define IPMI_CC_PERM_DENIED     0xD4 /* Cannot execute command due to
                                      * insufficient privilege level or other
                                      * security-based restriction (e.g.
                                      * disabled for `firmware firewall'). */
#define IPMI_CC_NOT_SUPP        0xD5 /* Cannot execute command. Command, or
                                      * request parameter(s), not supported
                                      * in present state. */
#define IPMI_CC_PARAM_ILLEGAL   0xD6 /* Cannot execute command. Parameter is
                                      * illegal because command sub-function
                                      * has been disabled or is unavailable
                                      * (e.g. disabled for `firmware
                                      * firewall'). */
#define IPMI_CC_UNSPECIFIED     0xFF /* Unspecified error. */

/* Code compatibility */

#define IPMI_NODE_BUSY_ERR              IPMI_CC_NODE_BUSY
#define IPMI_INVALID_COMMAND_ERR        IPMI_CC_INVALID_CMD
#define IPMI_ERR_UNSPECIFIED            IPMI_CC_UNSPECIFIED
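
As a usage sketch, definitions like these make it easy to turn a completion code into readable text when logging. The function name ipmi_cc_str() is hypothetical, and only a handful of codes are covered; the #ifndef guards repeat a few of the values above so that the fragment compiles on its own.

```c
/* Repeated from the list above so this fragment is self-contained. */
#ifndef IPMI_CC_NO_ERROR
#define IPMI_CC_NO_ERROR        0x00
#define IPMI_CC_NODE_BUSY       0xC0
#define IPMI_CC_INVALID_LUN     0xC2
#define IPMI_CC_NOT_PRESENT     0xCB
#define IPMI_CC_CMD_INCOMPAT    0xCD
#define IPMI_CC_DEST_UNAVAIL    0xD3
#define IPMI_CC_UNSPECIFIED     0xFF
#endif

/* Hypothetical helper: short description for a few completion codes. */
static const char *ipmi_cc_str(unsigned char cc)
{
    switch (cc) {
    case IPMI_CC_NO_ERROR:
        return "Command completed normally";
    case IPMI_CC_NODE_BUSY:
        return "Node busy";
    case IPMI_CC_INVALID_LUN:
        return "Command invalid for given LUN";
    case IPMI_CC_NOT_PRESENT:
        return "Requested sensor, data, or record not present";
    case IPMI_CC_CMD_INCOMPAT:
        return "Command illegal for specified sensor or record type";
    case IPMI_CC_DEST_UNAVAIL:
        return "Destination unavailable";
    case IPMI_CC_UNSPECIFIED:
        return "Unspecified error";
    default:
        /* Not in this sketch's table: OEM, command-specific, or other. */
        return "Unlisted completion code";
    }
}
```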
