Ira Weiny <ira.we...@intel.com> writes:
> On Wed, Jun 03, 2020 at 11:41:42PM +0530, Vaibhav Jain wrote: >> Hi Ira, >> >> Thanks for reviewing this patch. My responses below: >> >> Ira Weiny <ira.we...@intel.com> writes: >> > > ... > >> >> + * >> >> + * Payload Version: >> >> + * >> >> + * A 'payload_version' field is present in PDSM header that indicates a >> >> specific >> >> + * version of the structure present in PDSM Payload for a given PDSM >> >> command. >> >> + * This provides backward compatibility in case the PDSM Payload >> >> structure >> >> + * evolves and different structures are supported by 'papr_scm' and >> >> 'libndctl'. >> >> + * >> >> + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send the >> >> version >> >> + * of the payload struct it supports via 'payload_version' field. The >> >> 'papr_scm' >> >> + * module when servicing the PDSM envelope checks the 'payload_version' >> >> and then >> >> + * uses 'payload struct version' == MIN('payload_version field', >> >> + * 'max payload-struct-version supported by papr_scm') to service the >> >> PDSM. >> >> + * After servicing the PDSM, 'papr_scm' put the negotiated version of >> >> payload >> >> + * struct in returned 'payload_version' field. >> >> + * >> >> + * Libndctl on receiving the envelope back from papr_scm again checks the >> >> + * 'payload_version' field and based on it use the appropriate version >> >> dsm >> >> + * struct to parse the results. >> >> + * >> >> + * Backward Compatibility: >> >> + * >> >> + * Above scheme of exchanging different versioned PDSM struct between >> >> libndctl >> >> + * and papr_scm should provide backward compatibility until following two >> >> + * assumptions/conditions when defining new PDSM structs hold: >> >> + * >> >> + * Let T(X) = { set of attributes in PDSM struct 'T' versioned X } >> >> + * >> >> + * 1. T(X) is a proper subset of T(Y) if Y > X. >> >> + * i.e Each new version of PDSM struct should retain existing struct >> >> + * attributes from previous version >> >> + * >> >> + * 2. If an entity (libndctl or papr_scm) supports a PDSM struct T(X) >> >> then >> >> + * it should also support T(1), T(2)...T(X - 1). >> >> + * i.e When adding support for new version of a PDSM struct, libndctl >> >> + * and papr_scm should retain support of the existing PDSM struct >> >> + * version they support. >> > >> > Please see this thread for an example why versions are a bad idea in UAPIs: >> > >> > https://lkml.org/lkml/2020/3/26/213 >> > >> >> > While the use of version is different in that thread the fundamental >> > issues are >> > the same. You end up with some weird matrix of supported features and >> > structure definitions. For example, you are opening up the possibility of >> > changing structures with a different version for no good reason. >> >> Not really sure I understand the statement correctly "you are opening up >> the possibility of changing structures with a different version for no >> good reason." > [..] > What I mean is: > > struct v1 { > u32 x; > u32 y; > }; > > struct v2 { > u32 y; > u32 x; > }; > > x and y are the same data but you have now redefined the order of the struct. > You don't need that flexibility/complexity. > > Generally I think you are defining: > > struct v1 { > u32 x; > u32 y; > }; > > struct v2 { > u32 x; > u32 y; > u32 z; > u32 a; > }; > > Which becomes 2 structures... There is no need. > > The easiest thing to do is: > > struct user_data { > u32 x; > u32 y; > }; > > And later you modify user_data to: > > struct user_data { > u32 x; > u32 y; > u32 z; > u32 a; > }; > > libndctl always passes sizeof(struct user_data) to the call. [Do ensure > structures are 64bit aligned for this to work.] > > The kernel sees the size and returns the amount of data up to that size. > > Therefore, older kernels automatically fill in x and y, newer kernels fill in > z/a if the buffer was big enough. libndctl only uses the fields it knows > about. > > It is _much_ easier this way. Almost nothing needs to get changed as versions > roll forward. The only big issue is if libndctl _needs_ z then it has to > check > if z is returned. > > In that case add a cap_mask with bit fields which the kernel can fill in for > which fields are valid. > > struct user_data { > u64 cap_mask; /* where bits define extra future capabilities */ > u32 x; > u32 y; > }; > > IFF you need to add data within fields which are reserved you can use > capability flags to indicate which fields are requested and which are returned > by the kernel. > > But I _think_ for what you want libndctl must survive if z/a are not available > right? So just adding to the structure should be fine. Agreed. But as I mentioned in my response to Dan's review comments [1], we will be removing the version field altogether and instead will introduce new psdm requests bound to new struct definitions in conjuntion to nvdimm-flags. I have a patchset ready which I will be sending out soon. [1] https://lore.kernel.org/linux-nvdimm/87h7vrgpzx....@linux.ibm.com/ > >> We want to return more data in the struct in future iterations. So >> 'changing structure with different version' is something we are >> expecting. >> >> With the backward compatibility constraints 1 & 2 above, it will ensure >> that support matrix looks like a lower traingular matrix with each >> successive version supporting previous version attributes. So supporting >> future versions is relatively simplified. > > But you end up with weird switch/if's etc to deal with the multiple > structures. > > With the size method the kernel simply returns the same size data as the user > requested and everything is done. No logic required at all. Literally it can > just copy the data it has (truncating if necessary). > Agreed. But with version field gone now we will instead use new psdm requests bound to new struct definitions in conjuntion to nvdimm-flags to retrive extended data from papr_scm. >> >> > >> > Also having the user query with version Z and get back version X (older) is >> > odd. Generally if the kernel does not know about a feature (ie version Z >> > of >> > the structure) it should return -EINVAL and let the user figure out what >> > to do. >> > The user may just give up or they could try a different query. >> > >> Considering the flow of ndctl/libndctl this is needed. libndctl will >> usually issues only one CMD_CALL ioctl to kernel and if that fails then >> an error is reported and ndctl will exit loosing state. >> >> Adding mechanism in libndctl to reissue CMD_CALL ioctl to fetch a >> appropriate version of pdsm struct is going to be considerably more >> work. >> >> This version fall-back mechanism, ensures that libndctl will receive >> usable data without having to reissue a more CMD_CALL ioctls. > > Define usable? > > What happens if libndctl does not get 'z' in my example above? What does it > do? If I understand correctly it does not _need_ z. So why have a check on > the version from the kernel? > > What if we change to: > > struct v3 { > u32 x; > u32 y; > u32 z; > u32 a; > u32 b; > u32 c; > }; > > Now it has to > > if(version 2) > z/a valid do something() > > if(version 3) > b/c valid do something else() > > if z, a, b, c are all 0 does it matter? > > If not, the logic above disappears. > > If so, then you need a cap mask. Then the kernel can say c and a are valid > (but c is 0) or other flexible stuff like that. > >> >> >> + */ >> >> + >> >> +/* PDSM-header + payload expected with ND_CMD_CALL ioctl from libnvdimm >> >> */ >> >> +struct nd_pdsm_cmd_pkg { >> >> + struct nd_cmd_pkg hdr; /* Package header containing sub-cmd */ >> >> + __s32 cmd_status; /* Out: Sub-cmd status returned back */ >> >> + __u16 reserved[5]; /* Ignored and to be used in future */ >> > >> > How do you know when reserved is used for something else in the future? Is >> > reserved guaranteed (and checked by the code) to be 0? >> >> For current set of pdsm requests ignore these reserved fields. However a >> future pdsm request can leverage these reserved fields. So papr_scm >> just bind the usage of these fields with the value of >> 'nd_cmd_pkg.nd_command' that indicates the pdsm request. >> >> That being said checking if the reserved fields are set to 0 will be a >> good measure. Will add this check in next iteration. > > Exactly, if you don't check them now you will end up with an older libndctl > which passes in garbage and breaks future users... Basically rendering the > reserved fields useless. I have addressed this in my new patch-series which adds checks for reserved fields to be '0' > >> >> > >> >> + __u16 payload_version; /* In/Out: version of the payload */ >> > >> > Why is payload_version after reserved? >> Want to place the payload version field just before the payload data so >> that it can be accessed with simple pointer arithmetic. > > I did not see that in the patch. I thought you were using > nd_pdsm_cmd_pkg->payload_version? Thats right, but it just provided another simple way to retrive payload_version without resorting to 'container_of' macro. Anyways the version field is now gone hence 'payload' follows the reserved fields. > >> >> > >> >> + __u8 payload[]; /* In/Out: Sub-cmd data buffer */ >> >> +} __packed; >> >> + >> >> +/* >> >> + * Methods to be embedded in ND_CMD_CALL request. These are sent to the >> >> kernel >> >> + * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct >> >> + */ >> >> +enum papr_pdsm { >> >> + PAPR_PDSM_MIN = 0x0, >> >> + PAPR_PDSM_MAX, >> >> +}; >> >> + >> >> +/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */ >> >> +static inline struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct >> >> nd_cmd_pkg *cmd) >> >> +{ >> >> + return (struct nd_pdsm_cmd_pkg *) cmd; >> >> +} >> >> + >> >> +/* Return the payload pointer for a given pcmd */ >> >> +static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd) >> >> +{ >> >> + if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0) >> >> + return NULL; >> >> + else >> >> + return (void *)(pcmd->payload); >> >> +} >> >> + >> >> +#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */ >> >> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c >> >> b/arch/powerpc/platforms/pseries/papr_scm.c >> >> index 149431594839..5e2237e7ec08 100644 >> >> --- a/arch/powerpc/platforms/pseries/papr_scm.c >> >> +++ b/arch/powerpc/platforms/pseries/papr_scm.c >> >> @@ -15,13 +15,15 @@ >> >> #include <linux/seq_buf.h> >> >> >> >> #include <asm/plpar_wrappers.h> >> >> +#include <asm/papr_pdsm.h> >> >> >> >> #define BIND_ANY_ADDR (~0ul) >> >> >> >> #define PAPR_SCM_DIMM_CMD_MASK \ >> >> ((1ul << ND_CMD_GET_CONFIG_SIZE) | \ >> >> (1ul << ND_CMD_GET_CONFIG_DATA) | \ >> >> - (1ul << ND_CMD_SET_CONFIG_DATA)) >> >> + (1ul << ND_CMD_SET_CONFIG_DATA) | \ >> >> + (1ul << ND_CMD_CALL)) >> >> >> >> /* DIMM health bitmap bitmap indicators */ >> >> /* SCM device is unable to persist memory contents */ >> >> @@ -350,16 +352,97 @@ static int papr_scm_meta_set(struct papr_scm_priv >> >> *p, >> >> return 0; >> >> } >> >> >> >> +/* >> >> + * Validate the inputs args to dimm-control function and return '0' if >> >> valid. >> >> + * This also does initial sanity validation to ND_CMD_CALL sub-command >> >> packages. >> >> + */ >> >> +static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void >> >> *buf, >> >> + unsigned int buf_len) >> >> +{ >> >> + unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK; >> >> + struct nd_pdsm_cmd_pkg *pkg = nd_to_pdsm_cmd_pkg(buf); >> >> + struct papr_scm_priv *p; >> >> + >> >> + /* Only dimm-specific calls are supported atm */ >> >> + if (!nvdimm) >> >> + return -EINVAL; >> >> + >> >> + /* get the provider date from struct nvdimm */ >> > >> > s/date/data >> Thanks for point this out. Will fix this in next iteration. >> >> > >> >> + p = nvdimm_provider_data(nvdimm); >> >> + >> >> + if (!test_bit(cmd, &cmd_mask)) { >> >> + dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd); >> >> + return -EINVAL; >> >> + } else if (cmd == ND_CMD_CALL) { >> >> + >> >> + /* Verify the envelope package */ >> >> + if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) { >> >> + dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n", >> >> + buf_len); >> >> + return -EINVAL; >> >> + } >> >> + >> >> + /* Verify that the PDSM family is valid */ >> >> + if (pkg->hdr.nd_family != NVDIMM_FAMILY_PAPR) { >> >> + dev_dbg(&p->pdev->dev, "Invalid pkg family=0x%llx\n", >> >> + pkg->hdr.nd_family); >> >> + return -EINVAL; >> >> + >> >> + } >> >> + >> >> + /* We except a payload with all PDSM commands */ >> >> + if (pdsm_cmd_to_payload(pkg) == NULL) { >> >> + dev_dbg(&p->pdev->dev, >> >> + "Empty payload for sub-command=0x%llx\n", >> >> + pkg->hdr.nd_command); >> >> + return -EINVAL; >> >> + } >> >> + } >> >> + >> >> + /* Command looks valid */ >> > >> > I assume the first command to be implemented also checks the { nd_command, >> > payload_version, payload length } for correctness? >> Yes the pdsm service functions do check the payload_version and >> payload_length. Please see the papr_pdsm_health() that services the >> PAPR_PDSM_HEALTH pdsm in Patch-5 >> > > cool. > >> > >> >> + return 0; >> >> +} >> >> + >> >> +static int papr_scm_service_pdsm(struct papr_scm_priv *p, >> >> + struct nd_pdsm_cmd_pkg *call_pkg) >> >> +{ >> >> + /* unknown subcommands return error in packages */ >> >> + if (call_pkg->hdr.nd_command <= PAPR_PDSM_MIN || >> >> + call_pkg->hdr.nd_command >= PAPR_PDSM_MAX) { >> >> + dev_dbg(&p->pdev->dev, "Invalid PDSM request 0x%llx\n", >> >> + call_pkg->hdr.nd_command); >> >> + call_pkg->cmd_status = -EINVAL; >> >> + return 0; >> >> + } >> >> + >> >> + /* Depending on the DSM command call appropriate service routine */ >> >> + switch (call_pkg->hdr.nd_command) { >> >> + default: >> >> + dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n", >> >> + call_pkg->hdr.nd_command); >> >> + call_pkg->cmd_status = -ENOENT; >> >> + return 0; >> >> + } >> >> +} >> >> + >> >> static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc, >> >> struct nvdimm *nvdimm, unsigned int cmd, void *buf, >> >> unsigned int buf_len, int *cmd_rc) >> >> { >> >> struct nd_cmd_get_config_size *get_size_hdr; >> >> struct papr_scm_priv *p; >> >> + struct nd_pdsm_cmd_pkg *call_pkg = NULL; >> >> + int rc; >> >> >> >> - /* Only dimm-specific calls are supported atm */ >> >> - if (!nvdimm) >> >> - return -EINVAL; >> >> + /* Use a local variable in case cmd_rc pointer is NULL */ >> >> + if (cmd_rc == NULL) >> >> + cmd_rc = &rc; >> > >> > Why is this needed? AFAICT The caller of papr_scm_ndctl does not specify >> > null >> > and you did not change it. >> This pointer is coming from outside the papr_scm code hence need to be >> defensive here. Also as per[1] cmd_rc is "translation of firmware status" >> and not every caller would need it hence making this pointer optional. >> >> This is evident in acpi_nfit_blk_get_flags() where the 'nd_desc->ndctl' >> is called with 'cmd_rc == NULL'. >> >> [1] >> https://lore.kernel.org/linux-nvdimm/CAPcyv4hE_FG0YZXJVA1G=cbq8b9e0k54jxk5sq5uku-dnwt...@mail.gmail.com/ > > Ah... Ok. So this is a bug fix which needs to happen regardless of the status > of this patch... > >> >> > >> >> + >> >> + *cmd_rc = is_cmd_valid(nvdimm, cmd, buf, buf_len); >> >> + if (*cmd_rc) { >> >> + pr_debug("Invalid cmd=0x%x. Err=%d\n", cmd, *cmd_rc); >> >> + return *cmd_rc; >> >> + } >> >> >> >> p = nvdimm_provider_data(nvdimm); >> >> >> >> @@ -381,13 +464,19 @@ static int papr_scm_ndctl(struct >> >> nvdimm_bus_descriptor *nd_desc, >> >> *cmd_rc = papr_scm_meta_set(p, buf); > > ... Because this will break here. even without this new code... right? > > Lets get this fix in as a prelim-patch. Yes, I have moved the changes proposed in this hunk to a separate prelim patch in the v10 of the patch series. > >> >> break; >> >> >> >> + case ND_CMD_CALL: >> >> + call_pkg = nd_to_pdsm_cmd_pkg(buf); >> >> + *cmd_rc = papr_scm_service_pdsm(p, call_pkg); >> >> + break; >> >> + >> >> default: >> >> - return -EINVAL; >> >> + dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd); >> >> + *cmd_rc = -EINVAL; >> > >> > Is this change related? If there is a bug where there is a caller of >> > papr_scm_ndctl() with cmd_rc == NULL this should be a separate patch to fix >> > that issue. >> This simplifies a bit debugging of errors reported in >> papr_scm_ndctl() as it ensures that subsequest dev_dbg "Returned with >> cmd_rc" is always logged. >> >> I think, this is a too small change to be carved out as an independent >> patch. Also this doesnt change the behaviour of the code except logging >> some more error info. >> >> However, If you feel too strongly about it I will spin a separate patch >> in this patch series for this. [..] > > This can go in as part of a 'protect against cmd_rc == NULL' > preliminary patch. Yes, have coalesced this change with changes I reffered to in my previous comment into a single prelim patch. [..] > > I flagged this because at first I could not figure out what this had to do > with > the ND_CMD_CALL... > > For reviewers you want to make your patches concise to what you are > fixing/adding... Sure. Will be more careful of this in future patches. > > Also, based on acpi_nfit_blk_get_flags() using cmd_rc == NULL it looks like we > have a bug which needs to get fixed regardless of the this patch. And if that > bug exists in earlier kernels you will need a separate patch to backport as a > fix. > > So lets get that in first and separate... :-D Sure, will send out a separate independent patch fixing the cmd_rc == NULL issue in acpi_nfit_blk_get_flags addressing it to stable tree. > > Ira > >> >> > >> > Ira >> > >> >> } >> >> >> >> dev_dbg(&p->pdev->dev, "returned with cmd_rc = %d\n", *cmd_rc); >> >> >> >> - return 0; >> >> + return *cmd_rc; >> >> } >> >> >> >> static ssize_t flags_show(struct device *dev, >> >> diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h >> >> index de5d90212409..0e09dc5cec19 100644 >> >> --- a/include/uapi/linux/ndctl.h >> >> +++ b/include/uapi/linux/ndctl.h >> >> @@ -244,6 +244,7 @@ struct nd_cmd_pkg { >> >> #define NVDIMM_FAMILY_HPE2 2 >> >> #define NVDIMM_FAMILY_MSFT 3 >> >> #define NVDIMM_FAMILY_HYPERV 4 >> >> +#define NVDIMM_FAMILY_PAPR 5 >> >> >> >> #define ND_IOCTL_CALL _IOWR(ND_IOCTL, ND_CMD_CALL,\ >> >> struct nd_cmd_pkg) >> >> -- >> >> 2.26.2 >> >> >> >> -- >> Cheers >> ~ Vaibhav -- Cheers ~ Vaibhav