On Tue, 2020-05-26 at 19:23 +0000, mario.limoncie...@dell.com wrote: > > On Tue, 2020-05-26 at 13:32 -0500, Mario Limonciello wrote: > > > This reverts commit d23d12484307b40eea549b8a858f5fffad913897. > > > > > > This commit has caused regressions for the XPS 9560 containing > > > a Nuvoton TPM. > > > > Presumably this is using the tis driver? > > Correct. > > > > > > As mentioned by the reporter all TPM2 commands are failing with: > > > ERROR:tcti:src/tss2-tcti/tcti- > > > device.c:290:tcti_device_receive() > > > Failed to read response from fd 3, got errno 1: Operation not > > > permitted > > > > > > The reporter bisected this issue back to this commit which was > > > backported to stable as commit 4d6ebc4. > > > > I think the problem is request_locality ... for some inexplicable > > reason a failure there returns -1, which is EPERM to user space. > > > > That seems to be a bug in the async code since everything else > > gives a ESPIPE error if tpm_try_get_ops fails ... at least no-one > > assumes it gives back a sensible return code. > > > > What I think is happening is that with the patch the TPM goes > > through a quick sequence of request, relinquish, request, > > relinquish and it's the third request which is failing (likely > > timing out). Without the patch, the patch there's only one > > request,relinquish cycle because the ops are held while the async > > work is executed. I have a vague recollection that there is a > > problem with too many locality request in quick succession, but > > I'll defer to Jason, who I think understands the intricacies of > > localities better than I do. > > Thanks, I don't pretend to understand the nuances of this particular > code, but I was hoping that the request to revert got some attention > since Alex's kernel Bugzilla and message a few months ago to linux > integrity weren't. > > > > > If that's the problem, the solution looks simple enough: just move > > the ops get down because the priv state is already protected by the > > buffer mutex > > Yeah, if that works for Alex's situation it certainly sounds like a > better solution than reverting this patch as this patch actually does > fix a problem reported by Jeffrin originally. > > Could you propose a specific patch that Alex and Jeffrin can perhaps > both try?
Um, what's wrong with the one I originally attached and which you quote below? It's only compile tested, but I think it will work, if the theory is correct. James > > > > James > > > > --- > > > > diff --git a/drivers/char/tpm/tpm-dev-common.c > > b/drivers/char/tpm/tpm-dev- > > common.c > > index 87f449340202..1784530b8387 100644 > > --- a/drivers/char/tpm/tpm-dev-common.c > > +++ b/drivers/char/tpm/tpm-dev-common.c > > @@ -189,15 +189,6 @@ ssize_t tpm_common_write(struct file *file, > > const char > > __user *buf, > > goto out; > > } > > > > - /* atomic tpm command send and result receive. We only > > hold the ops > > - * lock during this period so that the tpm can be > > unregistered even if > > - * the char dev is held open. > > - */ > > - if (tpm_try_get_ops(priv->chip)) { > > - ret = -EPIPE; > > - goto out; > > - } > > - > > priv->response_length = 0; > > priv->response_read = false; > > *off = 0; > > @@ -211,11 +202,19 @@ ssize_t tpm_common_write(struct file *file, > > const char > > __user *buf, > > if (file->f_flags & O_NONBLOCK) { > > priv->command_enqueued = true; > > queue_work(tpm_dev_wq, &priv->async_work); > > - tpm_put_ops(priv->chip); > > mutex_unlock(&priv->buffer_mutex); > > return size; > > } > > > > + /* atomic tpm command send and result receive. We only > > hold the ops > > + * lock during this period so that the tpm can be > > unregistered even if > > + * the char dev is held open. > > + */ > > + if (tpm_try_get_ops(priv->chip)) { > > + ret = -EPIPE; > > + goto out; > > + } > > + > > ret = tpm_dev_transmit(priv->chip, priv->space, priv- > > > data_buffer, > > > > sizeof(priv->data_buffer)); > > tpm_put_ops(priv->chip);