Re: [PATCH v5 1/4] powerpc/papr_scm: Fetch nvdimm health information from PHYP
On Wed, Apr 1, 2020 at 8:08 PM Dan Williams wrote:
[..]
> > * "locked" : Indicating that nvdimm contents cant be modified
> >   until next power cycle.
>
> There is the generic NDD_LOCKED flag, can you use that? ...and in
> general I wonder if we should try to unify all the common papr_scm and
> nfit health flags in a generic location. It will already be the case
> that ndctl needs to look somewhere papr specific for this data; maybe
> it all should have been generic from the beginning.

The more I think about this, the more I think this would be a good time
to introduce a common "health/" attribute group under the generic nmemX
sysfs, and then have one flag per file / attribute. Not only does that
match the recommended sysfs ABI better, but it also allows ndctl to
enumerate which flags are supported in addition to their state.

> In any event, can you also add this content to a new
> Documentation/ABI/testing/sysfs-bus-papr? See sysfs-bus-nfit for
> comparison.
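
To make the "one flag per file" idea concrete, here is a rough userspace
sketch, not kernel code and not a proposal for the actual ABI. The table
`demo_health_attrs`, the helper `demo_health_show()`, the attribute names
and the bit positions are all hypothetical. Each entry stands in for one
file under a "health/" group; a name that is not in the table plays the
role of a file that does not exist, which is how a tool could tell
"unsupported" apart from "supported but clear".

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One entry per would-be attribute file (names and bits are made up). */
struct demo_health_attr {
	const char *name;
	uint64_t mask;
};

static const struct demo_health_attr demo_health_attrs[] = {
	{ "not_armed",    UINT64_C(1) << 0 },
	{ "flush_fail",   UINT64_C(1) << 1 },
	{ "restore_fail", UINT64_C(1) << 2 },
};

#define DEMO_NR_ATTRS \
	(sizeof(demo_health_attrs) / sizeof(demo_health_attrs[0]))

/*
 * Format the value of a single flag as "0\n" or "1\n".
 * Returns the number of characters written, or -1 if the flag is not
 * supported (the analog of the attribute file being absent).
 */
static int demo_health_show(const char *name, uint64_t health,
			    char *buf, size_t size)
{
	assert(buf && size > 0);

	for (size_t i = 0; i < DEMO_NR_ATTRS; i++) {
		if (strcmp(demo_health_attrs[i].name, name) == 0)
			return snprintf(buf, size, "%d\n",
					(health & demo_health_attrs[i].mask) ? 1 : 0);
	}
	return -1;
}
```

With one value per file, adding a flag later means adding a table entry
rather than changing the format of an existing string attribute.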
Re: [PATCH v5 1/4] powerpc/papr_scm: Fetch nvdimm health information from PHYP
Vaibhav Jain writes:
> Thanks for reviewing this patch Mpe,
> Michael Ellerman writes:
>> Vaibhav Jain writes:
...
>>
>>> +	/* Check for various masks in bitmap and set the buffer */
>>> +	if (health & PAPR_SCM_DIMM_UNARMED_MASK)
>>> +		rc += sprintf(buf, "not_armed ");
>>
>> I know buf is "big enough" but using sprintf() in 2020 is a bit ... :)
>>
>> seq_buf is a pretty thin wrapper over a buffer you can use to make this
>> cleaner and also handles overflow for you.
>>
>> See eg. show_user_instructions() for an example.
>
> Unfortunatly seq_buf_printf() is still not an exported symbol hence not
> usable in external modules.

Send a patch? :)

cheers
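
For readers who have not used that pattern, the following is a small
userspace analog of a bounded, overflow-aware buffer. It is only a sketch
of the idea and is not the kernel's seq_buf API: `struct flag_buf`,
`flag_buf_init()`, `flag_buf_puts()`, `demo_format_flags()` and the
`DEMO_*` mask values are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mask values, for illustration only (not the PAPR layout). */
#define DEMO_UNARMED_MASK	(UINT64_C(1) << 0)
#define DEMO_BAD_SHUTDOWN_MASK	(UINT64_C(1) << 1)
#define DEMO_BAD_RESTORE_MASK	(UINT64_C(1) << 2)

/* Minimal bounded string buffer: tracks length and remembers overflow. */
struct flag_buf {
	char *data;
	size_t size;	/* total capacity, including the trailing NUL */
	size_t len;	/* bytes used, excluding the trailing NUL */
	int overflowed;	/* set once an append did not fit */
};

static void flag_buf_init(struct flag_buf *b, char *data, size_t size)
{
	assert(size > 0);
	b->data = data;
	b->size = size;
	b->len = 0;
	b->overflowed = 0;
	data[0] = '\0';
}

/* Append a string only if it fits entirely; otherwise record overflow. */
static void flag_buf_puts(struct flag_buf *b, const char *s)
{
	size_t n = strlen(s);

	if (b->overflowed || n >= b->size - b->len) {
		b->overflowed = 1;
		return;
	}
	memcpy(b->data + b->len, s, n + 1);	/* copies the NUL as well */
	b->len += n;
}

/* Emit one space-terminated token per set flag, then a newline. */
static size_t demo_format_flags(uint64_t health, char *out, size_t size)
{
	struct flag_buf b;

	flag_buf_init(&b, out, size);
	if (health & DEMO_UNARMED_MASK)
		flag_buf_puts(&b, "not_armed ");
	if (health & DEMO_BAD_SHUTDOWN_MASK)
		flag_buf_puts(&b, "save_fail ");
	if (health & DEMO_BAD_RESTORE_MASK)
		flag_buf_puts(&b, "restore_fail ");
	flag_buf_puts(&b, "\n");
	return b.len;
}
```

The point of the wrapper is that the caller never computes offsets by
hand: every append goes through one bounds check, and an overflow is
recorded instead of writing past the end of the buffer.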
Re: [PATCH v5 1/4] powerpc/papr_scm: Fetch nvdimm health information from PHYP
Vaibhav Jain writes:
> Implement support for fetching nvdimm health information via
> H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
> of 64-bit big-endian integers which are then stored in 'struct
> papr_scm_priv' and subsequently partially exposed to user-space via
> newly introduced dimm specific attribute 'papr_flags'. Also a new asm
> header named 'papr-scm.h' is added that describes the interface
> between PHYP and guest kernel.
>
> Following flags are reported via 'papr_flags' sysfs attribute contents
> of which are space separated string flags indicating various nvdimm
> states:
>
>  * "not_armed"   : Indicating that nvdimm contents wont survive a power
>                    cycle.
>  * "save_fail"   : Indicating that nvdimm contents couldn't be flushed
>                    during last shutdown event.
>  * "restore_fail": Indicating that nvdimm contents couldn't be restored
>                    during dimm initialization.
>  * "encrypted"   : Dimm contents are encrypted.
>  * "smart_notify": There is health event for the nvdimm.
>  * "scrubbed"    : Indicating that contents of the nvdimm have been
>                    scrubbed.
>  * "locked"      : Indicating that nvdimm contents cant be modified
>                    until next power cycle.
>
> [1]: commit 58b278f568f0 ("powerpc: Provide initial documentation for
> PAPR hcalls")
>
> Signed-off-by: Vaibhav Jain
> ---
> Changelog:
>
> v4..v5 : None
>
> v3..v4 : None
>
> v2..v3 : Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
>          NVDIMM unarmed [Aneesh]
>
> v1..v2 : New patch in the series.
> ---
>  arch/powerpc/include/asm/papr_scm.h       |  48 ++
>  arch/powerpc/platforms/pseries/papr_scm.c | 105 +-
>  2 files changed, 151 insertions(+), 2 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/papr_scm.h
>
> diff --git a/arch/powerpc/include/asm/papr_scm.h
> b/arch/powerpc/include/asm/papr_scm.h
> new file mode 100644
> index ..868d3360f56a
> --- /dev/null
> +++ b/arch/powerpc/include/asm/papr_scm.h
> @@ -0,0 +1,48 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * Structures and defines needed to manage nvdimms for spapr guests.
> + */
> +#ifndef _ASM_POWERPC_PAPR_SCM_H_
> +#define _ASM_POWERPC_PAPR_SCM_H_
> +
> +#include
> +#include
> +
> +/* DIMM health bitmap bitmap indicators */
> +/* SCM device is unable to persist memory contents */
> +#define PAPR_SCM_DIMM_UNARMED PPC_BIT(0)

Please don't use PPC_BIT, it's just unnecessary obfuscation for folks
who are reading the code without access to the docs (ie. more or less
everyone other than you :)

> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c
> b/arch/powerpc/platforms/pseries/papr_scm.c
> index 0b4467e378e5..aaf2e4ab1f75 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -14,6 +14,7 @@
>  #include
>
>  #include
> +#include
>
>  #define BIND_ANY_ADDR (~0ul)
>
> @@ -39,6 +40,13 @@ struct papr_scm_priv {
>  	struct resource res;
>  	struct nd_region *region;
>  	struct nd_interleave_set nd_set;
> +
> +	/* Protect dimm data from concurrent access */
> +	struct mutex dimm_mutex;
> +
> +	/* Health information for the dimm */
> +	__be64 health_bitmap;
> +	__be64 health_bitmap_valid;

It's much less error prone to store the data in CPU endian and do the
endian conversion only at the point where the data either comes from or
goes to firmware.

That would also mean you can define flags above without needing PPC_BIT
because they'll be in CPU endian too.

> @@ -144,6 +152,35 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
>  	return drc_pmem_bind(p);
>  }
>
> +static int drc_pmem_query_health(struct papr_scm_priv *p)
> +{
> +	unsigned long ret[PLPAR_HCALL_BUFSIZE];
> +	int64_t rc;

Use kernel types please, ie. s64, or just long.

> +	rc = plpar_hcall(H_SCM_HEALTH, ret, p->drc_index);
> +	if (rc != H_SUCCESS) {
> +		dev_err(&p->pdev->dev,
> +			"Failed to query health information, Err:%lld\n", rc);
> +		return -ENXIO;
> +	}
> +
> +	/* Protect modifications to papr_scm_priv with the mutex */
> +	rc = mutex_lock_interruptible(&p->dimm_mutex);
> +	if (rc)
> +		return rc;
> +
> +	/* Store the retrieved health information in dimm platform data */
> +	p->health_bitmap = ret[0];
> +	p->health_bitmap_valid = ret[1];
> +
> +	dev_dbg(&p->pdev->dev,
> +		"Queried dimm health info. Bitmap:0x%016llx Mask:0x%016llx\n",
> +		be64_to_cpu(p->health_bitmap),
> +		be64_to_cpu(p->health_bitmap_valid));
> +
> +	mutex_unlock(&p->dimm_mutex);
> +	return 0;
> +}
>
>  static int papr_scm_meta_get(struct papr_scm_priv *p,
> 			      struct nd_cmd_get_config_data_hdr *hdr)
> @@ -304,6 +341,67 @@ static inli
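
As an illustration of the two review points above (convert once at the
firmware boundary, then use plain CPU-endian masks), here is a
self-contained userspace sketch. It is not the patch's code: the
`DEMO_*` names, `struct demo_health` and the helper functions are
hypothetical, and combining the bitmap with the valid mask in
`demo_is_unarmed()` is an assumption about how the two words are meant
to be used. What is not an assumption is the numbering: PPC_BIT(n)
counts from the most significant bit, so on a 64-bit value it
corresponds to a shift of (63 - n).

```c
#include <assert.h>
#include <stdint.h>

/* The same bits written as ordinary shifts on a CPU-endian value. */
#define DEMO_DIMM_UNARMED		(UINT64_C(1) << 63)	/* PPC_BIT(0) */
#define DEMO_DIMM_SHUTDOWN_DIRTY	(UINT64_C(1) << 62)	/* PPC_BIT(1) */
#define DEMO_DIMM_HEALTH_UNHEALTHY	(UINT64_C(1) << 57)	/* PPC_BIT(6) */

/* Health state kept in CPU endian after the one-time conversion. */
struct demo_health {
	uint64_t bitmap;
	uint64_t valid;
};

/* Decode eight big-endian bytes into a host-order 64-bit integer. */
static uint64_t demo_be64_to_host(const unsigned char raw[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | raw[i];
	return v;
}

/* Boundary function: the only place that knows about big-endian data. */
static void demo_store_health(struct demo_health *h,
			      const unsigned char bitmap_be[8],
			      const unsigned char valid_be[8])
{
	assert(h);
	h->bitmap = demo_be64_to_host(bitmap_be);
	h->valid = demo_be64_to_host(valid_be);
}

/* Consumers test plain masks; no byte swapping and no PPC_BIT needed. */
static int demo_is_unarmed(const struct demo_health *h)
{
	uint64_t mask = DEMO_DIMM_UNARMED | DEMO_DIMM_HEALTH_UNHEALTHY;

	return (h->bitmap & h->valid & mask) != 0;
}
```

Every reader of the stored fields then sees ordinary integers, and a
missed or doubled byte swap can only happen in one function.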
Re: [PATCH v5 1/4] powerpc/papr_scm: Fetch nvdimm health information from PHYP
On Tue, Mar 31, 2020 at 7:33 AM Vaibhav Jain wrote:
>
> Implement support for fetching nvdimm health information via
> H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
> of 64-bit big-endian integers which are then stored in 'struct
> papr_scm_priv' and subsequently partially exposed to user-space via
> newly introduced dimm specific attribute 'papr_flags'. Also a new asm
> header named 'papr-scm.h' is added that describes the interface
> between PHYP and guest kernel.
>
> Following flags are reported via 'papr_flags' sysfs attribute contents
> of which are space separated string flags indicating various nvdimm
> states:
>
>  * "not_armed"   : Indicating that nvdimm contents wont survive a power
>                    cycle.

s/wont/will not/

>  * "save_fail"   : Indicating that nvdimm contents couldn't be flushed
>                    during last shutdown event.

In the nfit definition this description is "flush_fail". The "save_fail"
flag was specific to hybrid devices that don't have persistent media and
instead scuttle away data from DRAM to flash on power-failure.

>  * "restore_fail": Indicating that nvdimm contents couldn't be restored
>                    during dimm initialization.
>  * "encrypted"   : Dimm contents are encrypted.

This does not seem like a health flag to me. Have you considered the
libnvdimm security interface for this indicator?

>  * "smart_notify": There is health event for the nvdimm.

Are you also going to signal the sysfs attribute when this event
happens?

>  * "scrubbed"    : Indicating that contents of the nvdimm have been
>                    scrubbed.

This one seems odd to me: what does it mean if it is not set? What does
it mean if a new scrub has been launched? Basically, is there value in
exposing this state?

>  * "locked"      : Indicating that nvdimm contents cant be modified
>                    until next power cycle.

There is the generic NDD_LOCKED flag, can you use that? ...and in
general I wonder if we should try to unify all the common papr_scm and
nfit health flags in a generic location. It will already be the case
that ndctl needs to look somewhere papr specific for this data; maybe
it all should have been generic from the beginning.

In any event, can you also add this content to a new
Documentation/ABI/testing/sysfs-bus-papr? See sysfs-bus-nfit for
comparison.

>
> [1]: commit 58b278f568f0 ("powerpc: Provide initial documentation for
> PAPR hcalls")
>
> Signed-off-by: Vaibhav Jain
> ---
> Changelog:
>
> v4..v5 : None
>
> v3..v4 : None
>
> v2..v3 : Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
>          NVDIMM unarmed [Aneesh]
>
> v1..v2 : New patch in the series.
> ---
>  arch/powerpc/include/asm/papr_scm.h       |  48 ++
>  arch/powerpc/platforms/pseries/papr_scm.c | 105 +-
>  2 files changed, 151 insertions(+), 2 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/papr_scm.h
>
> diff --git a/arch/powerpc/include/asm/papr_scm.h
> b/arch/powerpc/include/asm/papr_scm.h
> new file mode 100644
> index ..868d3360f56a
> --- /dev/null
> +++ b/arch/powerpc/include/asm/papr_scm.h
> @@ -0,0 +1,48 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * Structures and defines needed to manage nvdimms for spapr guests.
> + */
> +#ifndef _ASM_POWERPC_PAPR_SCM_H_
> +#define _ASM_POWERPC_PAPR_SCM_H_
> +
> +#include
> +#include
> +
> +/* DIMM health bitmap bitmap indicators */
> +/* SCM device is unable to persist memory contents */
> +#define PAPR_SCM_DIMM_UNARMED PPC_BIT(0)
> +/* SCM device failed to persist memory contents */
> +#define PAPR_SCM_DIMM_SHUTDOWN_DIRTY PPC_BIT(1)
> +/* SCM device contents are persisted from previous IPL */
> +#define PAPR_SCM_DIMM_SHUTDOWN_CLEAN PPC_BIT(2)
> +/* SCM device contents are not persisted from previous IPL */
> +#define PAPR_SCM_DIMM_EMPTY PPC_BIT(3)
> +/* SCM device memory life remaining is critically low */
> +#define PAPR_SCM_DIMM_HEALTH_CRITICAL PPC_BIT(4)
> +/* SCM device will be garded off next IPL due to failure */
> +#define PAPR_SCM_DIMM_HEALTH_FATAL PPC_BIT(5)
> +/* SCM contents cannot persist due to current platform health status */
> +#define PAPR_SCM_DIMM_HEALTH_UNHEALTHY PPC_BIT(6)
> +/* SCM device is unable to persist memory contents in certain conditions */
> +#define PAPR_SCM_DIMM_HEALTH_NON_CRITICAL PPC_BIT(7)
> +/* SCM device is encrypted */
> +#define PAPR_SCM_DIMM_ENCRYPTED PPC_BIT(8)
> +/* SCM device has been scrubbed and locked */
> +#define PAPR_SCM_DIMM_SCRUBBED_AND_LOCKED PPC_BIT(9)
> +
> +/* Bits status indicators for health bitmap indicating unarmed dimm */
> +#define PAPR_SCM_DIMM_UNARMED_MASK (PAPR_SCM_DIMM_UNARMED | \
> +				     PAPR_SCM_DIMM_HEALTH_UNHEALTHY)
> +
> +/* Bits status indicators for health bitmap indicating unflushed dimm */
> +#define PAPR_S
Re: [PATCH v5 1/4] powerpc/papr_scm: Fetch nvdimm health information from PHYP
Vaibhav Jain writes:
> Implement support for fetching nvdimm health information via
> H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
> of 64-bit big-endian integers which are then stored in 'struct
> papr_scm_priv' and subsequently partially exposed to user-space via
> newly introduced dimm specific attribute 'papr_flags'. Also a new asm
> header named 'papr-scm.h' is added that describes the interface
> between PHYP and guest kernel.
>
> Following flags are reported via 'papr_flags' sysfs attribute contents
> of which are space separated string flags indicating various nvdimm
> states:
>
>  * "not_armed"   : Indicating that nvdimm contents wont survive a power
>                    cycle.
>  * "save_fail"   : Indicating that nvdimm contents couldn't be flushed
>                    during last shutdown event.
>  * "restore_fail": Indicating that nvdimm contents couldn't be restored
>                    during dimm initialization.
>  * "encrypted"   : Dimm contents are encrypted.
>  * "smart_notify": There is health event for the nvdimm.
>  * "scrubbed"    : Indicating that contents of the nvdimm have been
>                    scrubbed.
>  * "locked"      : Indicating that nvdimm contents cant be modified
>                    until next power cycle.
>
> [1]: commit 58b278f568f0 ("powerpc: Provide initial documentation for
> PAPR hcalls")
>

Reviewed-by: Aneesh Kumar K.V

> Signed-off-by: Vaibhav Jain
> ---
> Changelog:
>
> v4..v5 : None
>
> v3..v4 : None
>
> v2..v3 : Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
>          NVDIMM unarmed [Aneesh]
>
> v1..v2 : New patch in the series.
> ---
>  arch/powerpc/include/asm/papr_scm.h       |  48 ++
>  arch/powerpc/platforms/pseries/papr_scm.c | 105 +-
>  2 files changed, 151 insertions(+), 2 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/papr_scm.h
>
> diff --git a/arch/powerpc/include/asm/papr_scm.h
> b/arch/powerpc/include/asm/papr_scm.h
> new file mode 100644
> index ..868d3360f56a
> --- /dev/null
> +++ b/arch/powerpc/include/asm/papr_scm.h
> @@ -0,0 +1,48 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * Structures and defines needed to manage nvdimms for spapr guests.
> + */
> +#ifndef _ASM_POWERPC_PAPR_SCM_H_
> +#define _ASM_POWERPC_PAPR_SCM_H_
> +
> +#include
> +#include
> +
> +/* DIMM health bitmap bitmap indicators */
> +/* SCM device is unable to persist memory contents */
> +#define PAPR_SCM_DIMM_UNARMED PPC_BIT(0)
> +/* SCM device failed to persist memory contents */
> +#define PAPR_SCM_DIMM_SHUTDOWN_DIRTY PPC_BIT(1)
> +/* SCM device contents are persisted from previous IPL */
> +#define PAPR_SCM_DIMM_SHUTDOWN_CLEAN PPC_BIT(2)
> +/* SCM device contents are not persisted from previous IPL */
> +#define PAPR_SCM_DIMM_EMPTY PPC_BIT(3)
> +/* SCM device memory life remaining is critically low */
> +#define PAPR_SCM_DIMM_HEALTH_CRITICAL PPC_BIT(4)
> +/* SCM device will be garded off next IPL due to failure */
> +#define PAPR_SCM_DIMM_HEALTH_FATAL PPC_BIT(5)
> +/* SCM contents cannot persist due to current platform health status */
> +#define PAPR_SCM_DIMM_HEALTH_UNHEALTHY PPC_BIT(6)
> +/* SCM device is unable to persist memory contents in certain conditions */
> +#define PAPR_SCM_DIMM_HEALTH_NON_CRITICAL PPC_BIT(7)
> +/* SCM device is encrypted */
> +#define PAPR_SCM_DIMM_ENCRYPTED PPC_BIT(8)
> +/* SCM device has been scrubbed and locked */
> +#define PAPR_SCM_DIMM_SCRUBBED_AND_LOCKED PPC_BIT(9)
> +
> +/* Bits status indicators for health bitmap indicating unarmed dimm */
> +#define PAPR_SCM_DIMM_UNARMED_MASK (PAPR_SCM_DIMM_UNARMED | \
> +				     PAPR_SCM_DIMM_HEALTH_UNHEALTHY)
> +
> +/* Bits status indicators for health bitmap indicating unflushed dimm */
> +#define PAPR_SCM_DIMM_BAD_SHUTDOWN_MASK (PAPR_SCM_DIMM_SHUTDOWN_DIRTY)
> +
> +/* Bits status indicators for health bitmap indicating unrestored dimm */
> +#define PAPR_SCM_DIMM_BAD_RESTORE_MASK (PAPR_SCM_DIMM_EMPTY)
> +
> +/* Bit status indicators for smart event notification */
> +#define PAPR_SCM_DIMM_SMART_EVENT_MASK (PAPR_SCM_DIMM_HEALTH_CRITICAL | \
> +					 PAPR_SCM_DIMM_HEALTH_FATAL | \
> +					 PAPR_SCM_DIMM_HEALTH_UNHEALTHY)
> +
> +#endif
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c
> b/arch/powerpc/platforms/pseries/papr_scm.c
> index 0b4467e378e5..aaf2e4ab1f75 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -14,6 +14,7 @@
>  #include
>
>  #include
> +#include
>
>  #define BIND_ANY_ADDR (~0ul)
>
> @@ -39,6 +40,13 @@ struct papr_scm_priv {
>  	struct resource res;
>  	struct nd_region *region;
>  	struct nd_interleave_set nd_set;
> +
> +	/* Protect dimm
[PATCH v5 1/4] powerpc/papr_scm: Fetch nvdimm health information from PHYP
Implement support for fetching nvdimm health information via
H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
of 64-bit big-endian integers which are then stored in 'struct
papr_scm_priv' and subsequently partially exposed to user-space via
newly introduced dimm specific attribute 'papr_flags'. Also a new asm
header named 'papr-scm.h' is added that describes the interface
between PHYP and guest kernel.

Following flags are reported via 'papr_flags' sysfs attribute contents
of which are space separated string flags indicating various nvdimm
states:

 * "not_armed"   : Indicating that nvdimm contents wont survive a power
                   cycle.
 * "save_fail"   : Indicating that nvdimm contents couldn't be flushed
                   during last shutdown event.
 * "restore_fail": Indicating that nvdimm contents couldn't be restored
                   during dimm initialization.
 * "encrypted"   : Dimm contents are encrypted.
 * "smart_notify": There is health event for the nvdimm.
 * "scrubbed"    : Indicating that contents of the nvdimm have been
                   scrubbed.
 * "locked"      : Indicating that nvdimm contents cant be modified
                   until next power cycle.

[1]: commit 58b278f568f0 ("powerpc: Provide initial documentation for
PAPR hcalls")

Signed-off-by: Vaibhav Jain
---
Changelog:

v4..v5 : None

v3..v4 : None

v2..v3 : Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
         NVDIMM unarmed [Aneesh]

v1..v2 : New patch in the series.
---
 arch/powerpc/include/asm/papr_scm.h       |  48 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 105 +-
 2 files changed, 151 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/include/asm/papr_scm.h

diff --git a/arch/powerpc/include/asm/papr_scm.h b/arch/powerpc/include/asm/papr_scm.h
new file mode 100644
index ..868d3360f56a
--- /dev/null
+++ b/arch/powerpc/include/asm/papr_scm.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Structures and defines needed to manage nvdimms for spapr guests.
+ */
+#ifndef _ASM_POWERPC_PAPR_SCM_H_
+#define _ASM_POWERPC_PAPR_SCM_H_
+
+#include
+#include
+
+/* DIMM health bitmap bitmap indicators */
+/* SCM device is unable to persist memory contents */
+#define PAPR_SCM_DIMM_UNARMED PPC_BIT(0)
+/* SCM device failed to persist memory contents */
+#define PAPR_SCM_DIMM_SHUTDOWN_DIRTY PPC_BIT(1)
+/* SCM device contents are persisted from previous IPL */
+#define PAPR_SCM_DIMM_SHUTDOWN_CLEAN PPC_BIT(2)
+/* SCM device contents are not persisted from previous IPL */
+#define PAPR_SCM_DIMM_EMPTY PPC_BIT(3)
+/* SCM device memory life remaining is critically low */
+#define PAPR_SCM_DIMM_HEALTH_CRITICAL PPC_BIT(4)
+/* SCM device will be garded off next IPL due to failure */
+#define PAPR_SCM_DIMM_HEALTH_FATAL PPC_BIT(5)
+/* SCM contents cannot persist due to current platform health status */
+#define PAPR_SCM_DIMM_HEALTH_UNHEALTHY PPC_BIT(6)
+/* SCM device is unable to persist memory contents in certain conditions */
+#define PAPR_SCM_DIMM_HEALTH_NON_CRITICAL PPC_BIT(7)
+/* SCM device is encrypted */
+#define PAPR_SCM_DIMM_ENCRYPTED PPC_BIT(8)
+/* SCM device has been scrubbed and locked */
+#define PAPR_SCM_DIMM_SCRUBBED_AND_LOCKED PPC_BIT(9)
+
+/* Bits status indicators for health bitmap indicating unarmed dimm */
+#define PAPR_SCM_DIMM_UNARMED_MASK (PAPR_SCM_DIMM_UNARMED | \
+				     PAPR_SCM_DIMM_HEALTH_UNHEALTHY)
+
+/* Bits status indicators for health bitmap indicating unflushed dimm */
+#define PAPR_SCM_DIMM_BAD_SHUTDOWN_MASK (PAPR_SCM_DIMM_SHUTDOWN_DIRTY)
+
+/* Bits status indicators for health bitmap indicating unrestored dimm */
+#define PAPR_SCM_DIMM_BAD_RESTORE_MASK (PAPR_SCM_DIMM_EMPTY)
+
+/* Bit status indicators for smart event notification */
+#define PAPR_SCM_DIMM_SMART_EVENT_MASK (PAPR_SCM_DIMM_HEALTH_CRITICAL | \
+					 PAPR_SCM_DIMM_HEALTH_FATAL | \
+					 PAPR_SCM_DIMM_HEALTH_UNHEALTHY)
+
+#endif
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 0b4467e378e5..aaf2e4ab1f75 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -14,6 +14,7 @@
 #include

 #include
+#include

 #define BIND_ANY_ADDR (~0ul)

@@ -39,6 +40,13 @@ struct papr_scm_priv {
 	struct resource res;
 	struct nd_region *region;
 	struct nd_interleave_set nd_set;
+
+	/* Protect dimm data from concurrent access */
+	struct mutex dimm_mutex;
+
+	/* Health information for the dimm */
+	__be64 health_bitmap;
+	__be64 health_bitmap_valid;
 };

 static int drc_pmem_bind(struct papr_scm_priv *p)
@@ -144,6 +152,35 @@ static int drc_pmem_query_n_bind(stru