Chris,
See answers inline. I don't have any concrete answers as to how to deal
with some of the questions you brought up, but I do have some more detail
that may be useful to further the discussion.

On 10/17/2014 11:03 AM, Chris Dent wrote:
On Thu, 16 Oct 2014, Jim Mankovich wrote:

What I would like to propose is dropping the ipmi string from the name altogether and appending the Sensor ID to the name instead of to the Resource ID. So, transforming the above to the new naming would result in the following:

| Name | Type | Unit | Resource ID
| hardware.current.power_meter_(0x16) | gauge | W | edafe6f4-5996-4df8-bc84-7d92439e15c0
| hardware.temperature.system_board_(0x15) | gauge | C | edafe6f4-5996-4df8-bc84-7d92439e15c0

[plus sensor_provider in resource_metadata]

If this makes sense for the kinds of queries that need to happen then
we may as well do it, but I'm not sure it is. When I was writing the
consumer code for the notifications, the naming of the meters was a big
open question that was hard to resolve because of insufficient data
and input on what people really need to do with the samples.

The scenario you've listed is getting all sensors on a given single
platform.

What about the scenario where you want to create an alarm that says
"If temperature gets over X on any system board on any of my hardware,
notify the authorities"? Will having the "_(0x##)" qualifier allow that
to work? I don't actually know: are those qualifiers standard in some
way or are they specific to different equipment? If they are different,
having them in the meter name makes creating a useful alarm in a
heterogeneous environment a bit more of a struggle, doesn't it?

The "_(0x##)" is an ipmitool display artifact that is tacked onto the end of the Sensor ID
in order to provide more information beyond what Sensor ID has in it.
The ## is the sensor record ID which is specific to IPMI. Whether or
not a Sensor ID (sans _(0x##)) is unique is up to the vendor, but in general I believe all vendors will likely name their sensors uniquely; otherwise, how can a person differentiate textually what component in a platform the sensor represents?

Personally, I would like to see the _(0x##) removed from the Sensor ID string (by the ipmitool driver) before it returns sensors to the Ironic conductor. I just don't see any value in this extra info. This 0x## addition only helps if a vendor used the exact same Sensor ID string for multiple sensors of the same sensor type, i.e., multiple sensors of type "Temperature", each with the exact same Sensor ID string of "CPU", instead of giving each Sensor ID string a unique name like "CPU 1", "CPU 2", ...
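
To make that concrete, here is a minimal sketch of stripping the suffix. The function name strip_record_id and the regular expression are my own illustration, not code from the ipmitool driver, and I'm assuming the suffix always looks like "(0x##)" at the very end of the string, optionally preceded by a space or underscore.

```python
import re

# Trailing "(0x##)" record-ID suffix, plus any space/underscore separator
# in front of it. Assumption: the suffix only ever appears at the end.
_RECORD_ID_SUFFIX = re.compile(r"[\s_]*\(0x[0-9a-fA-F]+\)\s*$")


def strip_record_id(sensor_id: str) -> str:
    """Return the Sensor ID without the trailing "(0x##)" display artifact."""
    return _RECORD_ID_SUFFIX.sub("", sensor_id).strip()


if __name__ == "__main__":
    for raw in ("power_meter_(0x16)", "system_board_(0x15)", "CPU 1"):
        print(repr(raw), "->", repr(strip_record_id(raw)))
```

A Sensor ID with no suffix passes through unchanged, so this would be safe to apply to every sensor regardless of vendor.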

Now, if you want to get deeper into the IPMI realm (which I don't really want to advocate), the Entity ID code actually tells you the component. From the IPMI spec, section 43.14, Entity IDs:

"The Entity ID field is used for identifying the physical entity that a sensor or device is associated with. If multiple sensors refer to the same entity, they will have the same Entity ID field value. For example, if a voltage sensor and a temperature sensor are both for a ‘Power Supply 1’ entity the Entity ID in their sensor data records would both be 10 (0Ah), per the Entity ID table." FYI: Entity 10 (0Ah) means "power supply".
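
Purely as an illustration of what the Entity ID would buy you, here is a rough sketch that groups sensors by entity. The record layout (dicts with "sensor_id" and "entity_id" keys) and the function name group_by_entity are hypothetical, and the lookup table holds only the one entry quoted above; the rest would have to come from the Entity ID table in the spec.

```python
from collections import defaultdict

# Only the entry quoted from the spec above; other codes are left unnamed.
ENTITY_NAMES = {0x0A: "power supply"}


def group_by_entity(sensors):
    """Group Sensor ID strings by the entity their Entity ID refers to."""
    groups = defaultdict(list)
    for sensor in sensors:
        entity_id = sensor["entity_id"]
        # Fall back to the raw code when the entity has no name in our table.
        label = ENTITY_NAMES.get(entity_id, "entity 0x%02X" % entity_id)
        groups[label].append(sensor["sensor_id"])
    return dict(groups)


if __name__ == "__main__":
    # Made-up sensor records, for illustration only.
    example = [
        {"sensor_id": "PS1 Voltage", "entity_id": 0x0A},
        {"sensor_id": "PS1 Temp", "entity_id": 0x0A},
    ]
    print(group_by_entity(example))
```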

In a heterogeneous platform environment, the Sensor ID string is likely going to be different per vendor, so your question "If temperature...on any system board...on any hardware, notify the authorities" is going to be tough because each vendor may name their "system board" differently. But I bet that vendors use similar strings, so in the worst case your alarm creation could require one alarm definition per vendor.


Perhaps (if they are not standard) this would work:

| hardware.current.power_meter | gauge | W | edafe6f4-5996-4df8-bc84-7d92439e15c0

with both sensor_provider and whatever that qualifier is called in the
metadata?

I see generic naming as somewhat problematic. If you lump all the temperature sensors for a platform under hardware.temperature, the consumer will always need to query for the specific temperature sensor that it is interested in, like "system board". The notion of having different samples from multiple sensors under a single generic name seems harder to deal with to me. If you have multiple temperature samples under the same generic meter name, how do you figure out which temperature sensors actually exist?
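
Here is roughly what I mean, as a sketch: with generic names, a consumer would first have to scan the samples just to learn which sensors are there. The sample layout and the "sensor_id" metadata key are assumptions for illustration, not the actual Ceilometer schema, and sensors_by_meter is a hypothetical helper.

```python
from collections import defaultdict


def sensors_by_meter(samples):
    """Map each generic meter name to the sorted Sensor IDs seen under it."""
    found = defaultdict(set)
    for sample in samples:
        # Assumed key: the sensor qualifier lives in resource_metadata.
        sensor_id = sample.get("resource_metadata", {}).get("sensor_id")
        if sensor_id is not None:
            found[sample["name"]].add(sensor_id)
    return {name: sorted(ids) for name, ids in found.items()}


if __name__ == "__main__":
    # Made-up samples, for illustration only.
    example = [
        {"name": "hardware.temperature",
         "resource_metadata": {"sensor_id": "system_board"}},
        {"name": "hardware.temperature",
         "resource_metadata": {"sensor_id": "cpu_1"}},
        {"name": "hardware.current",
         "resource_metadata": {"sensor_id": "power_meter"}},
    ]
    print(sensors_by_meter(example))
```

With the Sensor ID in the meter name, that discovery step is just a listing of meter names.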


Then the name remains sufficiently generic to allow aggregates across
multiple systems, while still having the necessary info to narrow to
different sensors of the same type.

I understand that this proposed change is not backward compatible with the existing naming, but I don't really see a good solution that would retain backward compatibility.

I think we should strive to worry less about such things, especially
when it's just names in data fields. Not always possible, or even a
good idea, but sometimes it's a win.

I'm always good with less worry.

Thanks for the feedback,
Jim

