On Wed, Jul 11, 2018 at 04:01:31PM -0500, Eddie James wrote:
> Document the hwmon interface for the OCC.
> 
> Signed-off-by: Eddie James <eaja...@linux.vnet.ibm.com>
> ---
>  Documentation/hwmon/occ | 73 
> +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 73 insertions(+)
>  create mode 100644 Documentation/hwmon/occ
> 
> diff --git a/Documentation/hwmon/occ b/Documentation/hwmon/occ
> new file mode 100644
> index 0000000..465fa1a
> --- /dev/null
> +++ b/Documentation/hwmon/occ
> @@ -0,0 +1,73 @@
> +Kernel driver occ-hwmon
> +=======================
> +
> +Supported chips:
> +  * POWER8
> +  * POWER9
> +
> +Author: Eddie James <eaja...@linux.vnet.ibm.com>
> +
> +Description
> +-----------
> +
> +This driver supports hardware monitoring for the On-Chip Controller (OCC)
> +embedded on POWER processors. The OCC is a device that collects and 
> aggregates
> +sensor data from the processor and the system. The OCC can provide the raw
> +sensor data as well as perform thermal and power management on the system.
> +
> +The P8 version of this driver is a client driver of I2C. It may be probed
> +manually if an "ibm,p8-occ-hwmon" compatible device is found under the
> +appropriate I2C bus node in the device-tree.
> +
> +The P9 version of this driver is a client driver of the FSI-based OCC driver.
> +It will be probed automatically by the FSI-based OCC driver.
> +
> +Sysfs entries
> +-------------
> +
> +The following attributes are supported. All attributes are read-only unless
> +specified.
> +
> +temp[1-n]_label              OCC sensor id.
> +temp[1-n]_input              Measured temperature in millidegrees C.
> +[with temperature sensor version 2+]
> +    temp[1-n]_fru_type               Given FRU (Field Replaceable Unit) type.

What is this ? An integer ? A string ?

> +    temp[1-n]_fault          Temperature sensor fault.
> +
> +freq[1-n]_label              OCC sensor id.
> +freq[1-n]_input              Measured frequency.

What does that have to do with hardware monitoring, and what exactly does it
measure ? AC voltage frequency ? Frequency of rainstorms in the surrounding
area ?

> +
> +power[1-n]_label     OCC sensor id.
> +power[1-n]_input     Measured power in microwatts.
> +power[1-n]_update_tag        Number of 250us samples represented in 
> accumulator.

update_tag to represent number of samples ? Odd choice for
an attribute name. Why not "_samples" ? Also, if each sample
represents a specific amount of time, why not report a time ?

> +power[1-n]_accumulator       Accumulation of 250us power readings.

There is no explanation of "accumulation". Is this the energy ?
If so, why not use energy attributes ? And what is the unit of
this measurement ?

> +[with power sensor version 2+]
> +    power[1-n]_function_id   Identifies what the power reading is for.

String ? Number ? Slot index ?  Bitmap ? And why isn't that reported
in the label ? After all, that is what the label is supposed to be
used for.

> +    power[1-n]_apss_channel  Indicates APSS channel.
> +

Does that provide any value to the user ?

> +[power version 0xa0 only]
> +power1_id                    OCC sensor id.

This is inconsistent with the other attributes and even with itself.

> +power[1-n]_label             Sensor type, "system", "proc", "vdd", or "vdn".
> +power[1-n]_input             Most recent power reading in microwatts.

Overall I am left with no idea what
        _id
        _label
        _function_id
        _apps_channel
are and how they relate to each other, except that it all looks quite
inconsistent. You might want to consider merging all those attributes into
the label in some consistent way.

> +power[1-n]_update_tag                Number of samples in the accumulator.
> +power[1-n]_accumulator               Accumulation of power readings.

Same as above.

> +[with sensor type "system" and "proc" only]
> +    power[1-n]_update_time   Time in us that the power value is read.
> +
> +caps1_current                Current OCC power cap in watts.
> +caps1_reading                Current system output power in watts.
> +caps1_norm           Power cap without redundant power.
> +caps1_max            Maximum power cap.

Why do those have to be non-standard attributes ? Please explain why you can not
use power[1-n]_cap attributes.

> +[caps version 1 and 2 only]
> +    caps1_min                Minimum power cap.
> +[caps version 3+]
> +    caps1_min_hard           Hard minimum cap that can be set and held.
> +    caps1_min_soft           Soft minimum cap below hard, not guaranteed.
> +caps1_user           The powercap specified by the user. Will be 0 if no
> +                     user powercap exists. This attribute is read-write.
> +[caps version 1+]
> +    caps1_user_source        Indicates how the user power limit was set.
> +
> +extn[1-n]_label              ASCII id or sensor id.
> +extn[1-n]_flags              Indicates type of label attribute.
> +extn[1-n]_input              Data.

Great non-explanation.

Not reviewing the series further. I am sure I asked that each non-standard
attribute is explained. There is neither an explanation why the attributes
are needed nor, in many cases, why non-standard attributes were chosen
instead of standard ones. On top of that, the non-standard attributes are
not even documented properly, leaving the reader wondering not only why
they are needed, but what they are used for in the first place.

Guenter

Reply via email to