On Fri, Jun 9, 2017 at 11:56 AM, Linda Knippers <[email protected]> wrote: > On 6/9/2017 2:10 PM, Dan Williams wrote: > >>>> This format for health_detail makes the json harder to consume for >>>> upper-layer applications. Every field should be associated with a >>>> key-id even if it's just a flag. >>> >>> >>> >>> Ok. I thought we had discussed this format with my RFC patch a >>> while back. The only change I made was adding underscores instead >>> of spaces as requested. >> >> >> Yeah, I thought I had brought up the array of json-boolean objects vs >> json-string objects before. > > > Hmm...maybe I missed that, but ok. > > >>> The items in the array are all associated with a single validation flag. >>> There are some additional groups of fields that I plan to add as well, >>> like a group of error counters, so do you want everything flat? >> >> >> Yes, after writing some scripts to parse json I think flatter is >> better unless the data itself is actually hierarchical. >> >>> >>>> I think I'd like to keep the json >>>> topology flatter. We already have support for the NFIT health state >>>> flags in this form: >>>> >>>> { >>>> "provider":"nfit_test.1", >>>> "dev":"ndbus3", >>>> "dimms":[ >>>> { >>>> "dev":"nmem5", >>>> "id":"cdab-0a-07e0-fffffeff", >>>> "flag_failed_save":true, >>>> "flag_failed_arm":true, >>>> "flag_failed_restore":true, >>>> "flag_smart_event":true, >>>> "flag_failed_flush":true >>>> } >>>> ] >>>> } >>>> >>>> So, let's just add additional flag names and omit the ones that are >>>> duplicates of the NFIT-level health state flags. >>> >>> >>> >>> None of them are duplicates because the NFIT state flags are static from >>> boot >>> time while the health status detail changes at runtime. >> >> >> Yes, but it's a pain for applications that now have to deal with two >> different flag names for similar indicators. So for example if >> "flag_failed_arm" is false at startup, but "arm_error" triggers later >> I would think that would just be reflected in "flag_failed_arm". In >> other words override the static flag_failed_arm value with the dynamic >> result so that monitor applications need only learn one json key for >> the same data. > > > But they're different things. One represents boot time state and one > represents > runtime. One is an NFIT flag, not even reported as health today, and one is > runtime health. Even your commit message says they're distinct from SMART > health data. Knowing that my NVDIMM came up healthy and then had an > issue, > or vice versa, is important.
I agree, but I don't think it's ndctl's job to maintain that distinction. The default case is "I want to know the present state of my dimm", if an environment wants to know "has my state changed over time" then it should be logging the health state snapshots and comparing them. > Maybe the state flags should be renamed to reflect what they really are. > The "true" value for those is probably redundant too because I don't think Yes, the 'true' value is redundant. The goal is to get the flag name into a json-key rather than a json-value. > you > output anything if they're fault, although now I'm not sure those NFIT > flags are being displayed properly at all. On my system, /sys../nfit/flags > shows "smart_notify", which I assume means I should see > "flag_smart_event":true, > but I don't. I'll look at that separately. The 'smart_notify' flag is whether the DIMM supports health state notifications, the 'flag_smart_event' indicates if a smart event was pending. So your system supports notifications and the NFIT says no SMART event was pending. _______________________________________________ Linux-nvdimm mailing list [email protected] https://lists.01.org/mailman/listinfo/linux-nvdimm
