Bela Lubkin wrote:
> Corey Minyard wrote:
>
>> I don't have the I2C spec, but I was assured by the patch author that
>> this failure was a temporary failure.
>>
>> What is bizarre, though, is that this is only used when going out on the
>> IPMB; it shouldn't have any effect on local BMC messages. I'd be
>> surprised if this sensor was on the IPMB, or even if this box had an
>> IPMB at all.
>>
>> I suspect that there is some incorrect SDR information or information
>> that ipmitool is misinterpreting.
>>
>> I don't think you can dump the raw SDRs via ipmitool. You can with
>> openipmi (with the "mc sdr" command). You could pull the smi connection
>> up in the GUI, dump the SDRs on mc (0.20), and find the sensor in the
>> tree and get the information about it there.
>>
>
> I don't have any of the userland stuff installed -- just OpenIPMI driver &
> ipmitool. And sometimes Dell OpenManage, HP Insight Manager, or IBM
> Director.
>
> `ipmitool sdr dump out-file-name` does what it describes as "Dumps raw SDR
> data to a file". Here is such a file (in oldfangled uuencode format...)
> I can analyze it, but it's rather raw data.
>
> But anyway, I don't see how `ipmitool` can be complicit here. It's
> just sending ioctls to the driver, it's the driver that's going off
> into lalaland.
>

Well, the driver is not exactly going into lalaland, it's really doing exactly what it was designed to do. Whether that is correct or not is a different story, but I got the information from someone who knows better than me.

My understanding is that the NAK error means: "Something is physically wrong with the message". If everything else was correct, it probably means a bit got flipped in the message and a checksum failed. It seems reasonable to try a few times.

But in this case, ipmitool is asking the driver to do something invalid.
In my experience with this, interpreting "valid" and "invalid" is difficult because it changes all the time and varies with opinion, especially with a specification as loosely worded as the IPMI spec. As much as possible, the driver basically passes the buck to the BMC to interpret the data.
I was able to analyze the raw data and figure out what is going on, and as I said, ipmitool is wrong. I also believe the BMC is issuing an incorrect response.

The "Sensor Owner ID" in the SDR for that sensor is 0xb1. The '1' in the lowest bit means that this is a "system software ID". The IPMI spec is not terribly clear about what "system software ID" means or what you are supposed to do with a sensor of that type, but it is clearly not an IPMB address that you can send a message to. ipmitool should ignore sensors like that (except for interpreting events). Instead, it is trying to send an IPMB message to it. (It seems, BTW, that these are "event-only" type sensors that allow you to interpret what certain events mean.)

The BMC, on the other hand, should be checking this bit and rejecting the message as invalid, but it is not. It looks like it is trying to send to that address. A '1' in that bit has special (and invalid in this case) meaning on the I2C bus and the message gets rejected, but in a way that looks like the message should be retried.

However, the machines I have look like they do the same thing :(.

-Corey
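
For illustration, the Sensor Owner ID check described above can be sketched in Python. This is only a minimal sketch, not code from ipmitool or the OpenIPMI driver: the function names are hypothetical, and it assumes the usual reading of the IPMI SDR layout, where bit 0 of the Sensor Owner ID selects the ID type (0 = IPMB slave address, 1 = system software ID) and bits 7:1 carry the 7-bit ID.

```python
# Illustrative sketch only. The function names are hypothetical and are
# not part of ipmitool or the OpenIPMI driver.

def decode_sensor_owner_id(owner_id: int) -> dict:
    """Split a Sensor Owner ID byte into its two fields.

    Assumed layout: bit 0 selects the ID type (0 = IPMB slave address,
    1 = system software ID); bits 7:1 hold the 7-bit ID itself.
    """
    if not 0 <= owner_id <= 0xFF:
        raise ValueError("Sensor Owner ID must fit in one byte")
    return {
        "is_system_software": bool(owner_id & 0x01),
        "id": owner_id >> 1,
    }


def can_send_ipmb_request(owner_id: int) -> bool:
    """Return True only if the owner ID is an IPMB slave address.

    A sensor owned by a system software ID has no IPMB address, so a
    tool should skip it rather than try to send it a request.
    """
    return not decode_sensor_owner_id(owner_id)["is_system_software"]


if __name__ == "__main__":
    for value in (0xB1, 0x20):
        fields = decode_sensor_owner_id(value)
        print(hex(value), fields, "sendable:", can_send_ipmb_request(value))
```

With the value from this SDR, `can_send_ipmb_request(0xb1)` is false, so a tool applying this check would skip the sensor instead of sending to it, while an owner ID of 0x20 (the BMC address used above) passes.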
