Bela Lubkin wrote:
> Corey Minyard wrote:
>
> Corey> I was able to analyze the raw data and figure out what is going on, and
> Corey> as I said, ipmitool is wrong. And I believe the BMC is issuing an
> Corey> incorrect response. The "Sensor Owner ID" in the SDR for that sensor is
> Corey> 0xb1. The '1' in the lowest bit means that this is a "system software
> Corey> ID". The IPMI spec is not terribly clear what "system software ID"
> Corey> means or what you are supposed to do with a sensor of that type, but it
> Corey> is clearly not an IPMB address that you can send a message to. ipmitool
> Corey> should ignore sensors like that (except for interpreting events).
> Corey> Instead, it is trying to send an IPMB message to it. (It seems, BTW,
> Corey> that these are "event-only" type sensors that allow you to interpret
> Corey> what certain events mean.)
>
> Would it be out of line for the driver to reject attempts to talk to
> IPMB addresses with the low bit set (which are apparently "system
> software ID"s)?
>

That is a possibility, but we are back to that "what is valid" question. If I do that, someone will come along some day and expect to be able to message those things. So I'd prefer to leave that to the BMC, if possible.

> Corey> The BMC, on the other hand, is not checking this bit and rejecting the
> Corey> message as invalid. It looks like it is trying to send to that
> Corey> address. A '1' in that bit has special (and invalid in this case)
> Corey> meaning on the I2C bus and the message gets rejected, but in a way that
> Corey> looks like the message should be retried. However, the machines I have
> Corey> look like they do the same thing :(.
>
> So it looks like this could be corrected in up to 3 places:
>
>   - ipmitool should recognize that as not a communicable device, not try
>     to talk to it;
>
>   - ipmi driver should reject the attempt out of hand
>
>   - BMC firmware should also reject the attempt, returning an error code
>     such as:
>
>          0xC2 "Command invalid for given LUN"
>          0xCB "Requested Sensor, data, or record not present"
>     ?--> 0xCD "Command illegal for specified sensor or record type"
>          0xD3 "Destination unavailable. Cannot deliver request to
>                selected destination"
>

It could be fixed in all three places, certainly. It definitely needs to be fixed in ipmitool; ipmitool is doing the wrong thing with that sensor. It could be fixed in the BMC, but that is a difficult place to fix it, and there are already a lot of BMCs out there that don't handle this the way I would expect. I've already talked about fixing it in the driver; I'd prefer to keep the driver out of policy if possible.
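For the ipmitool side, the check would look something like the sketch below. The function names are made up for illustration and are not taken from ipmitool or the driver; the only thing assumed from the spec is that bit 0 of the Sensor Owner ID distinguishes a system software ID (1) from an IPMB slave address (0).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only.
 *
 * Bit 0 of the SDR "Sensor Owner ID" byte:
 *   0 = the upper seven bits are an IPMB slave address
 *   1 = the upper seven bits are a system software ID
 */
static bool owner_is_system_software(uint8_t owner_id)
{
    return (owner_id & 0x01) != 0;
}

/*
 * A tool should only try to send an IPMB request to the sensor owner
 * when this returns true. Sensors owned by system software would be
 * kept only for interpreting events.
 */
static bool owner_is_ipmb_addressable(uint8_t owner_id)
{
    return !owner_is_system_software(owner_id);
}
```

With the owner ID from the SDR above, owner_is_ipmb_addressable(0xb1) returns false, so the tool would skip reading that sensor instead of sending a message to it.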
I'm copying the ipmitool list.

> BTW I was moved to create the following, which you might find useful to
> round out ipmi_msgdefs.h.
>

Could you submit a standard patch for this, with a description and Signed-off-by line? I'll need that to get it to kernel.org.

Thanks,

-Corey

> >Bela<
>
> /*
>  * From "Intelligent Platform Management Interface Specification, Second
>  * Generation, v2.0, Document Revision 1.0, February 12, 2004; February
>  * 15, 2006 Markup"; "Table 5-2, Completion Codes".
>  *
>  * http://www.intel.com/design/servers/ipmi/pdf/IPMIv2_0_rev1_0_E3_markup.pdf
>  *
>  * Note: completion codes 0x01-0x7F are device/OEM-specific.
>  *       completion codes 0x80-0xBF are command-specific.
>  */
>
> #define IPMI_CC_NO_ERROR      0x00 /* Command Completed Normally. */
> #define IPMI_CC_NODE_BUSY     0xC0 /* Node Busy. Command could not be
>                                     * processed because command processing
>                                     * resources are temporarily
>                                     * unavailable. */
> #define IPMI_CC_INVALID_CMD   0xC1 /* Invalid Command. Used to indicate an
>                                     * unrecognized or unsupported command. */
> #define IPMI_CC_INVALID_LUN   0xC2 /* Command invalid for given LUN. */
> #define IPMI_CC_TIMEOUT       0xC3 /* Timeout while processing command.
>                                     * Response unavailable. */
> #define IPMI_CC_OUT_OF_SPACE  0xC4 /* Out of space. Command could not be
>                                     * completed because of a lack of storage
>                                     * space required to execute the given
>                                     * command operation. */
> #define IPMI_CC_INVAL_RESV    0xC5 /* Reservation Canceled or Invalid
>                                     * Reservation ID. */
> #define IPMI_CC_REQ_TRUNC     0xC6 /* Request data truncated. */
> #define IPMI_CC_REQ_LEN_INVAL 0xC7 /* Request data length invalid. */
> #define IPMI_CC_REQ_TOO_LONG  0xC8 /* Request data field length limit
>                                     * exceeded. */
> #define IPMI_CC_PARAM_RANGE   0xC9 /* Parameter out of range. One or more
>                                     * parameters in the data field of the
>                                     * Request are out of range. This is
>                                     * different from `Invalid data field'
>                                     * (CCh) code in that it indicates that
>                                     * the erroneous field(s) has a contiguous
>                                     * range of possible values. */
> #define IPMI_CC_DATA_LEN      0xCA /* Cannot return number of requested data
>                                     * bytes. */
> #define IPMI_CC_NOT_PRESENT   0xCB /* Requested Sensor, data, or record not
>                                     * present. */
> #define IPMI_CC_INVAL_FIELD   0xCC /* Invalid data field in Request */
> #define IPMI_CC_CMD_INCOMPAT  0xCD /* Command illegal for specified sensor or
>                                     * record type. */
> #define IPMI_CC_CANT_RESPOND  0xCE /* Command response could not be
>                                     * provided. */
> #define IPMI_CC_DUP_REQ       0xCF /* Cannot execute duplicated request. This
>                                     * completion code is for devices which
>                                     * cannot return the response that was
>                                     * returned for the original instance of
>                                     * the request. Such devices should
>                                     * provide separate commands that allow
>                                     * the completion status of the original
>                                     * request to be determined. An Event
>                                     * Receiver does not use this completion
>                                     * code, but returns the 00h completion
>                                     * code in the response to (valid)
>                                     * duplicated requests. */
> #define IPMI_CC_SDR_UPDATING  0xD0 /* Command response could not be
>                                     * provided. SDR Repository in update
>                                     * mode. */
> #define IPMI_CC_FW_UPDATING   0xD1 /* Command response could not be provided.
>                                     * Device in firmware update mode. */
> #define IPMI_CC_BMC_INITING   0xD2 /* Command response could not be provided.
>                                     * BMC initialization or initialization
>                                     * agent in progress. */
> #define IPMI_CC_DEST_UNAVAIL  0xD3 /* Destination unavailable. Cannot deliver
>                                     * request to selected destination. E.g.
>                                     * this code can be returned if a request
>                                     * message is targeted to SMS, but receive
>                                     * message queue reception is disabled for
>                                     * the particular channel. */
> #define IPMI_CC_PERM_DENIED   0xD4 /* Cannot execute command due to
>                                     * insufficient privilege level or other
>                                     * security-based restriction (e.g.
>                                     * disabled for `firmware firewall'). */
> #define IPMI_CC_NOT_SUPP      0xD5 /* Cannot execute command. Command, or
>                                     * request parameter(s), not supported in
>                                     * present state. */
> #define IPMI_CC_PARAM_ILLEGAL 0xD6 /* Cannot execute command. Parameter is
>                                     * illegal because command sub-function
>                                     * has been disabled or is unavailable
>                                     * (e.g. disabled for `firmware
>                                     * firewall'). */
> #define IPMI_CC_UNSPECIFIED   0xFF /* Unspecified error. */
>
> /* Code compatibility */
>
> #define IPMI_NODE_BUSY_ERR       IPMI_CC_NODE_BUSY
> #define IPMI_INVALID_COMMAND_ERR IPMI_CC_INVALID_CMD
> #define IPMI_ERR_UNSPECIFIED     IPMI_CC_UNSPECIFIED
>
> _______________________________________________
> Openipmi-developer mailing list
> https://lists.sourceforge.net/lists/listinfo/openipmi-developer
